Question 1

What is a code point vs a code unit?

Accepted Answer

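A code point is the integer Unicode assigns to a character, for example U+1F389 for 🎉. A code unit is the fixed-size storage unit of a particular encoding: 8 bits in UTF-8, 16 bits in UTF-16, 32 bits in UTF-32. One code point can take several code units; UTF-16 stores 🎉 as the 2 surrogate code units `D83C DF89`.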
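To make the distinction concrete, here is a minimal Python sketch that counts how many code units one character needs in each encoding. The helper name `code_units` is made up for illustration and is not part of the tool.

```python
def code_units(ch: str) -> dict:
    """Count the code units needed to store one character in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit code units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit code units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit code units
    }


party = "\U0001F389"  # 🎉, a single code point
print(f"U+{ord(party):04X}", code_units(party))
# prints: U+1F389 {'utf-8': 4, 'utf-16': 2, 'utf-32': 1}
```

The `-be` codec variants are used so that Python does not prepend a byte order mark, which would inflate the counts.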

Question 2

How are emoji handled?

Accepted Answer

Each emoji code point above U+FFFF gets a single entry that shows both the full code point and its UTF-16 surrogate pair.
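The surrogate pair follows from the standard UTF-16 arithmetic. The sketch below shows that calculation; the `describe` helper and its output layout are assumptions for illustration, not the tool's actual rendering.

```python
def surrogate_pair(cp: int) -> tuple:
    """Split a supplementary-plane code point into its UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF have a surrogate pair")
    offset = cp - 0x10000               # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


def describe(ch: str) -> str:
    """Hypothetical one-line entry for a single character."""
    cp = ord(ch)
    if cp > 0xFFFF:
        high, low = surrogate_pair(cp)
        return f"U+{cp:04X} (UTF-16: {high:04X} {low:04X})"
    return f"U+{cp:04X}"


print(describe("\U0001F389"))  # prints: U+1F389 (UTF-16: D83C DF89)
```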

Question 3

Why does "café" show 4 entries?

Accepted Answer

It is 4 code points: c, a, f, é. If the text is NFD-decomposed, the "é" is stored as 2 code points (e + U+0301 combining acute accent), so the word shows 5 entries instead. Paste both forms to see the difference.
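If you do not have both forms at hand, you can produce them with Python's standard `unicodedata` module. This is a small sketch; `code_points` is an illustrative helper name.

```python
import unicodedata


def code_points(text: str) -> list:
    """List each code point of the text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


word = "caf\u00e9"  # "café" with a precomposed é
nfc = unicodedata.normalize("NFC", word)  # composed form
nfd = unicodedata.normalize("NFD", word)  # decomposed form

print(len(nfc), code_points(nfc))
# prints: 4 ['U+0063', 'U+0061', 'U+0066', 'U+00E9']
print(len(nfd), code_points(nfd))
# prints: 5 ['U+0063', 'U+0061', 'U+0066', 'U+0065', 'U+0301']
```

Both strings render identically, which is why the difference only becomes visible when the code points are listed.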

Question 4

What is the U+XXXX format?

Accepted Answer

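Standard Unicode notation: "U+" followed by the code point in hexadecimal, zero-padded to at least 4 digits. Code points in the Basic Multilingual Plane (≤U+FFFF) use 4 hex digits; supplementary planes use 5 or 6 (e.g., U+1F389, U+10FFFF).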
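A minimal sketch of converting between integers and this notation follows. The function names are made up for illustration; the `:04X` format pads to at least four digits and grows as needed for larger code points.

```python
import string


def format_code_point(cp: int) -> str:
    """Render an integer code point in U+XXXX notation."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("outside the Unicode range U+0000..U+10FFFF")
    return f"U+{cp:04X}"


def parse_code_point(label: str) -> int:
    """Parse U+XXXX notation (4 to 6 hex digits) back to an integer."""
    digits = label[2:]
    if (
        label[:2].upper() != "U+"
        or not 4 <= len(digits) <= 6
        or any(c not in string.hexdigits for c in digits)
    ):
        raise ValueError(f"not in U+XXXX notation: {label!r}")
    cp = int(digits, 16)
    if cp > 0x10FFFF:
        raise ValueError("outside the Unicode range U+0000..U+10FFFF")
    return cp


print(format_code_point(ord("A")))   # prints: U+0041
print(format_code_point(0x1F389))    # prints: U+1F389
print(parse_code_point("U+1F389"))   # prints: 127881
```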

Question 5

Is there a character limit?

Accepted Answer

Up to 1000 characters per request, which keeps output rendering fast. The 1 MB platform limit also applies.

Unicode Code Point Lookup
