Unicode

Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world’s writing systems. Developed in conjunction with the Universal Character Set standard and published in book form as The Unicode Standard, the latest version of Unicode contains a repertoire of more than 110,000 characters covering 100 scripts. The standard consists of a set of code charts for visual reference, an encoding methodology and a set of standard character encodings, a set of reference data files, and a number of related items, such as character properties and rules for normalization, decomposition, collation, rendering, and bidirectional display order (for the correct display of text containing both right-to-left scripts, such as Arabic and Hebrew, and left-to-right scripts).
Unicode’s success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, the Java programming language, and the Microsoft .NET Framework.
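
As a rough illustration of the character properties, normalization rules, and standard encodings mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `describe` and the sample characters are chosen only for this example and are not part of the standard; the property values come from whichever Unicode version ships with the Python interpreter in use.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str, str]:
    """Return (code point, name, general category, bidirectional class)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


# Character properties for a Latin letter, a Hebrew letter, and a CJK ideograph.
for sample in ("A", "\u05d0", "\u4e2d"):
    print(describe(sample))

# Normalization: the precomposed "é" and "e" + combining acute accent
# are canonically equivalent, so NFD and NFC convert between them.
composed = "\u00e9"
decomposed = unicodedata.normalize("NFD", composed)
recomposed = unicodedata.normalize("NFC", decomposed)

# Encodings: the same code point serialized as UTF-8 and UTF-16 (big-endian).
utf8_bytes = composed.encode("utf-8")
utf16_bytes = composed.encode("utf-16-be")
print(decomposed == "e\u0301", recomposed == composed, utf8_bytes, utf16_bytes)
```

The Hebrew letter reports the bidirectional class `R` (right-to-left), while the Latin and CJK characters report `L`; this per-character property is what the bidirectional display rules build on.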

2 thoughts on “Unicode”

  1. shinichi Post author

    (sk) More than two-thirds of Unicode characters (74,617 out of roughly 110,000) are CJK Unified Ideographs. In other words, more than two-thirds of the characters encoded in the standard are used to write Chinese, Japanese, or Korean (all originally from China).