Unicode Text Fixer

Fix broken Unicode, detect confusables, normalize text, and remove invisible characters.

Unicode Normalization

Choose a normalization form to apply. NFC is recommended for general use.

NFC: Canonical decomposition, then canonical composition. Best for general text.
NFD: Canonical decomposition. Splits characters into base + combining marks.
NFKC: Compatibility decomposition, then canonical composition. Normalizes ligatures and width variants.
NFKD: Compatibility decomposition. Maximum decomposition for search/comparison.
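
To make the difference between the four forms concrete, here is a minimal Python sketch using the standard-library `unicodedata` module. It is not the tool's own implementation (which runs in the browser); the helper name `normalize_all` and the sample string are assumptions chosen for illustration.

```python
import unicodedata

def normalize_all(text: str) -> dict[str, str]:
    """Return the text in each of the four Unicode normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }

# 'e' followed by a combining acute accent, then the single-codepoint 'fi' ligature
sample = "e\u0301\ufb01"
forms = normalize_all(sample)

for form, value in forms.items():
    # Print the codepoints so the structural differences are visible
    print(form, [f"U+{ord(ch):04X}" for ch in value])
```

NFC and NFD leave the ligature alone and differ only in whether the accent is composed; NFKC and NFKD additionally replace the ligature with the plain letters "f" and "i".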

Character Inspector

Examine each character in your text: codepoint, script, category, and byte size.

Columns: Char, Codepoint, Name, Script, Category, UTF-8 bytes.
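
A rough equivalent of this per-character view can be sketched in Python. The function name `inspect_text` is hypothetical, and the Script column is left out because the standard-library `unicodedata` module does not expose the Unicode Script property.

```python
import unicodedata

def inspect_text(text: str) -> list[dict]:
    """Describe each character: codepoint, name, general category, UTF-8 size."""
    rows = []
    for ch in text:
        rows.append({
            "char": ch,
            "codepoint": f"U+{ord(ch):04X}",
            # Some codepoints (e.g. control characters) have no name
            "name": unicodedata.name(ch, "<unnamed>"),
            "category": unicodedata.category(ch),
            # surrogatepass keeps lone surrogates from raising an error
            "utf8_bytes": len(ch.encode("utf-8", errors="surrogatepass")),
        })
    return rows

rows = inspect_text("a\u00e9\u20ac")  # 'a', 'é', '€'
for row in rows:
    print(row)
```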

Confusable / Homoglyph Detection

Detect visually similar characters from different scripts (e.g., Cyrillic 'а' (U+0430) vs Latin 'a' (U+0061)).
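
One simple way to surface likely homoglyphs is to flag words that mix scripts. The sketch below is a heuristic under stated assumptions, not the tool's actual method: it approximates a letter's script from the first word of its Unicode name, and the helper names `rough_script` and `mixed_script_words` are hypothetical. A production detector would more likely rely on the confusables data published with Unicode Technical Standard #39.

```python
import unicodedata

def rough_script(ch: str) -> str:
    """Approximate a letter's script from the first word of its Unicode name."""
    if not ch.isalpha():
        return "COMMON"  # digits, punctuation, etc. are shared across scripts
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

def mixed_script_words(text: str) -> list[tuple[str, set[str]]]:
    """Return whitespace-separated words whose letters span more than one script."""
    flagged = []
    for word in text.split():
        scripts = {rough_script(ch) for ch in word} - {"COMMON"}
        if len(scripts) > 1:
            flagged.append((word, scripts))
    return flagged

# The second letter is Cyrillic 'а' (U+0430), not Latin 'a'
flagged = mixed_script_words("p\u0430ypal login")
print(flagged)
```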

Invisible Character Cleanup

Select which invisible or special characters to remove from your text.
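
A selective cleanup of this kind can be sketched as follows. The set of characters in `INVISIBLE` and the function name `strip_invisible` are assumptions for illustration; the tool's actual list of removable characters is not given here.

```python
# Hypothetical selection of commonly hidden characters
INVISIBLE = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE (BOM)",
    "\u00ad": "SOFT HYPHEN",
}

def strip_invisible(text, targets=INVISIBLE):
    """Remove the selected characters and count how many of each were removed."""
    counts = {}
    kept = []
    for ch in text:
        if ch in targets:
            counts[ch] = counts.get(ch, 0) + 1
        else:
            kept.append(ch)
    return "".join(kept), counts

cleaned, counts = strip_invisible("he\u200bllo\ufeff wor\u00adld")
print(repr(cleaned), counts)
```

Keeping the selection configurable matters because some of these characters are meaningful in context: the zero-width joiner and non-joiner affect emoji sequences and shaping in scripts such as Persian, so removing them blindly can change how text renders.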

Combining Mark Analysis

Detect combining diacritical marks and compose them into precomposed characters (NFC).
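
As a minimal sketch of this analysis, the Python snippet below lists the combining marks in a string and then composes them with NFC. The function names are hypothetical, and treating every character whose general category starts with "M" as a combining mark is an assumption about what the tool counts.

```python
import unicodedata

def combining_marks(text: str) -> list[tuple[int, str, str]]:
    """List (index, codepoint, name) for every mark character (category M*)."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch).startswith("M")
    ]

def compose(text: str) -> str:
    """Merge base letters and combining marks into precomposed characters where they exist."""
    return unicodedata.normalize("NFC", text)

text = "Cafe\u0301 nin\u0303o"  # decomposed 'é' and 'ñ'
marks = combining_marks(text)
composed = compose(text)
print(marks)
print(composed, len(text), "->", len(composed))
```

Not every base-plus-mark sequence has a precomposed form, so some combining marks can remain after NFC.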

Tip: Press Ctrl+Enter to analyze.

What is Unicode Text Fixer?

Unicode Text Fixer is a free online tool for common Unicode text problems: broken characters after copying text between applications, garbled diacritical marks from encoding mismatches, visually deceptive homoglyphs that mix Cyrillic and Latin scripts, and invisible zero-width characters hidden in your content. The tool detects these issues and helps you repair them.

This tool is designed for developers debugging encoding issues, translators working across multiple writing systems, content writers cleaning up text from various sources, security researchers investigating homoglyph-based phishing attacks, and anyone who has ever pasted text from a PDF or website only to find strange characters.

Key Features

Unicode Text Fixer runs entirely in your browser. No text is sent to any server, so sensitive content stays on your device. The tool supports 12 languages and works with all Unicode scripts, including Latin, Cyrillic, Arabic, and CJK.