DIUnicode: Version History

DIUnicode provides Unicode text reader and writer classes with automatic conversion from and to 144 character sets and encodings for Delphi (Embarcadero, CodeGear, Borland).

DIUnicode 7.3.0 – 22 Nov 2023

Support Delphi 12 Athens Win32 and Win64.
Add TDICsvParser.SkipBlankRows property.
TDICsvParser.ReadNextData improvements:
- Include leading white space into non-delimited data, as per CSV specification RFC 4180.
- Raise exception if separator not found.
Update DIUtils.pas Unicode functions to Unicode 15.1.0.
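
The RFC 4180 rule behind the ReadNextData change (spaces are part of a field and are not trimmed) can be illustrated independently of DIUnicode. The following is a minimal sketch using Python's standard csv module, not TDICsvParser; the sample row is made up for illustration.

```python
import csv

# Hypothetical sample row: the second field is unquoted and starts with a space.
sample = ["a, b,c"]

# With the default dialect (skipinitialspace=False), the leading space
# stays part of the non-delimited field.
row = next(csv.reader(sample))
print(row)
```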

DIUnicode 7.2.0 – 16 Sep 2021

Support Delphi 11 Alexandria Win32 and Win64.
Update DIUtils.pas Unicode functions to Unicode 14.0.0.

DIUnicode 7.1.0 – 5 Jun 2020

Support Delphi 10.4 Sydney Win32 and Win64.

DIUnicode 7.0.0 – 8 Oct 2019

Extend character support to the full range of Unicode Code Points from $000000 to $10FFFF.

Up to now, DIUnicode stored code points as WideChars. This limited Unicode support to the Basic Multilingual Plane (BMP) from $0000 to $FFFF. Code points from the Supplementary Planes were converted to the $FFFD replacement character. This worked well for a great number of languages, but less common scripts were not supported, nor were the increasingly popular emojis from the Symbols and Pictographs Unicode blocks.

DIUnicode 7.0.0 overcomes these limitations and now covers the complete Unicode range. Changes are almost entirely internal and maintain backwards compatibility as much as possible. Existing applications should compile with no changes or only minor ones. WideChar routines are marked as deprecated and hint at their new complementary UCP routines.

TDIUnicodeReader.Data is still a WideChar buffer. However, its contents are now fully UTF-16 encoded. This means that it may contain code points > $FFFF which take up two WideChars (surrogate pairs). As a result, indexed access to the buffer is no longer guaranteed to return a complete code point. TDIUnicodeReader.Data related methods, like TDIUnicodeReader.DataAsStrTrimW, are adjusted accordingly.
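
To show why a 16-bit index and a code point index can differ in a UTF-16 buffer, here is a minimal sketch in Python. It is not DIUnicode code: the function names utf16_units and code_points are hypothetical, and a list of integers stands in for the WideChar buffer.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def code_points(units):
    """Combine surrogate pairs in a list of UTF-16 code units into code points."""
    result = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # High surrogate followed by low surrogate: one supplementary code point.
            result.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP code unit (or an unpaired surrogate, passed through unchanged).
            result.append(unit)
            i += 1
    return result


units = utf16_units("A\U0001F600B")  # 'A', U+1F600 (an emoji), 'B'
print([hex(u) for u in units])               # four 16-bit units
print([hex(c) for c in code_points(units)])  # three code points
```

In this sample, the unit at index 2 is the second half of a surrogate pair, so indexing by unit does not land on the third character.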

UnicodeString utility routines are rewritten to handle full UTF-16, including surrogate pairs. Most of them are in DIUtils.pas. YuUtf.pas also contains new utility routines for UTF-16 testing, encoding, and decoding. Where possible, string handling routines now take NativeInt type parameters for the buffer length.

Other noteworthy changes:

TDIUnicodeReader.UCP complements TDIUnicodeReader.Char.
Removed conditional compilation directives DI_No_Classes and DI_No_Unicode_Component. TDIUnicodeReader always descends from TComponent and the Classes unit is always used. Source code only.
Improve DIUtils.pas Unicode processing to support Unicode Code Points from $000000 to $10FFFF. Adjust remaining source code accordingly.
Update DIUtils.pas Unicode functions to Unicode 12.1.0.
Delphi 4 and Delphi 5 crash when compiling DIUtils.pas. There is no error message, so it is not possible to work around the problem. Support for these compilers is therefore removed. At least Delphi 6 is required to compile DIUnicode.
Remove DI.inc include file. Directly link in DICompilers.inc instead. Source code only.

DIUnicode 6.10.0 – 7 Mar 2019

Fix potential TDIUnicodeWriter memory leak if TDIUnicodeWriteMethods.Init allocates its own memory.
TDIUnicodeWriter.Clear calls TDIUnicodeWriteMethods.Flush to reset encoder state.
KOI8-U converter now maps 0xB4 to U+0404 instead of U+0403.
Update DIUtils.pas Unicode functions to Unicode 12.
Compatibility update with DIConverters 1.18.0. These changes only affect projects using DIConverters:
- Add ISO-2022-JP-MS encoding: Read_iso_2022_jp_ms read methods and Write_iso_2022_jp_ms write methods.
- DIConverters converter functions now use the native unsigned integer type for the length of a string and support strings longer than 2 GB.
- UTF-8 converter functions reject surrogates and out-of-range code points, namely those in the ranges 0xD800..0xDFFF and >= 0x110000.
- Fix error handling in UCS-2, UCS-4, and UTF-32 decoder functions.
- Tweak the GB18030 converter functions to map 0x8135F437 to U+E7C7.
- Update the CP1255 converter functions to map 0xCA to U+05BA.
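
The UTF-8 rule listed above (reject surrogates 0xD800..0xDFFF and values >= 0x110000) can be sketched as follows. This is an illustration in Python, not the DIConverters implementation; is_unicode_scalar and strict_utf8_accepts are hypothetical helper names, and the second helper relies on Python's built-in strict UTF-8 decoder.

```python
def is_unicode_scalar(cp):
    """True if cp is a code point that valid UTF-8 is allowed to encode."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def strict_utf8_accepts(data):
    """True if Python's strict UTF-8 decoder accepts the byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ED A0 80 would decode to the surrogate U+D800; F4 90 80 80 to 0x110000.
print(strict_utf8_accepts(b"\xed\xa0\x80"))      # rejected: surrogate
print(strict_utf8_accepts(b"\xf4\x90\x80\x80"))  # rejected: out of range
print(strict_utf8_accepts(b"\xf4\x8f\xbf\xbf"))  # accepted: U+10FFFF
```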

DIUnicode 6.9.0 – 24 Dec 2018

Support Delphi 10.3 Rio Win32 and Win64.

DIUnicode 6.8.0 – 3 Apr 2017

Support Delphi 10.2 Tokyo Win32 and Win64.

DIUnicode 6.7.0 – 7 May 2016

Support Delphi 10.1 Berlin Win32 and Win64.

DIUnicode 6.6.2 – 15 Sep 2015

Support Delphi 10 Seattle Win32 and Win64.

DIUnicode 6.6.1 – 25 Apr 2015

Add support for Delphi XE8 Win32 and Win64.

DIUnicode 6.6.0 – 3 Oct 2014

Support Delphi XE7 Win32 and Win64.
Improved documentation shows inherited class members.

DIUnicode 6.5.0 – 28 Apr 2014

Support Delphi XE6 Win32 and Win64.

DIUnicode 6.0.1 – 17 Feb 2014

Compatibility update with other Yunqa products.

DIUnicode 6.0.0 – 25 Sep 2013

Support Delphi XE5 Win32 and Win64.

DIUnicode 5.6.0 – 14 Jun 2013

Support Delphi XE4 Win32 and Win64.

DIUnicode 5.5.0 – 4 Oct 2012

Support Delphi XE3 Win32 and Win64.

DIUnicode 5.2.0 – 14 Apr 2012

Fix: When reading from TDIUnicodeReader.SourceStream, the size of the internal source buffer was not correctly calculated. Depending on the decoding, this slowed down reading or even stopped it before the end of the stream was reached.
Fix: TDIUnicodeReader.SkipEmptyLines consumed additional chars after the line break.
Work around a compiler warning in TDIUnicodeReader.FillSourceBuffer (source code edition only).
Add unit scope to demo projects.

DIUnicode 5.1.0 – 9 Nov 2011

Support Delphi XE2 Win64.

DIUnicode 5.0.0 – 15 Oct 2011

Support Delphi XE2 Win32 (binary editions) and Win64 (source code edition only right now).

DIUnicode 4.2.1 – 20 Feb 2011

Library source code compiles with FreePascal (Win32).

DIUnicode 4.2.0 – 28 Sep 2010

Delphi XE support.
New TDIUnicodeWriter.WriteStr8 and WriteBuf8 methods.
New TDIUnicodeReader.DataAsStrTrim8 method.
Improved help layout for better navigation and readability. Send your comments!

DIUnicode 4.1.1 – 17 Dec 2009

Additions and bug fixes to DIUtils.pas.

DIUnicode 4.1.0 – 14 Sep 2009

Delphi 2010 support.
Update character properties, case folding, and case mapping to Unicode 5.1.0.

DIUnicode 4.0.1 – 31 Jan 2009

Work around an unexpected Delphi 2009 automatic numeric AnsiChar Unicode conversion in DIUtils.pas which caused a compiler error on Eastern European and Asian language settings.

DIUnicode 4.0.0 – 24 Nov 2008

Delphi 2009 support.

DIUnicode 3.2.1 – 30 Jul 2008

Improve compatibility for parallel installation with other DI packages.
Some code cleanup.

DIUnicode 3.2 – 13 May 2007

Delphi 2007 support.
Compatibility with DIContainers 1.11.
Add XP Themes to demo projects.

DIUnicode 3.1 – 28 Dec 2005

Added compatibility with Delphi 2006 Win32.

DIUnicode 3.0.1 – 14 Oct 2005

Fixed an error which could prematurely stop TDIUnicodeReader when a pushed source was popped at the end of a nested document.
Added Delphi 3 compatibility to the utility units.
Included the missing DIUnicodeCodePages.pas unit which is required to compile the FontCharSet example project.
Resolved dependency issues when DIUnicode is used in parallel with other DI products.

DIUnicode 3.0 – 14 Apr 2005

Added support for Delphi 2005 Win32.
Added TDIUnicodeReader.ReadBOM function which returns the Byte Order Mark (BOM) found at the current position and advances the position accordingly.
Added TDIUnicodeReader.SourceFile property as a simple means to read from a file.
Added optional WriteByteOrderMark parameter to TDIUnicodeReader.SaveDataToFile and TDIUnicodeReader.SaveDataToStream which controls if a UTF-16/UCS-2 little endian byte order mark is being written in front of the data.
Other, smaller improvements and bug fixes.
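
The general idea behind a ReadBOM-style function (recognize a byte order mark at the current position and advance past it) can be sketched as below. This is a hedged illustration in Python, not the TDIUnicodeReader implementation: read_bom, the BOMS table, and the encoding labels are assumptions made for this example. The last lines show writing a UTF-16 little endian BOM in front of the data, analogous to the WriteByteOrderMark option.

```python
import codecs

# Longer BOMs come first: the UTF-32 LE mark (FF FE 00 00) begins with
# the UTF-16 LE mark (FF FE), so order matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def read_bom(data, pos=0):
    """Return (encoding label or None, position after the BOM)."""
    for bom, name in BOMS:
        if data.startswith(bom, pos):
            return name, pos + len(bom)
    return None, pos


# Write a UTF-16 LE BOM in front of the encoded text, then detect it again.
payload = codecs.BOM_UTF16_LE + "Hi".encode("utf-16-le")
name, start = read_bom(payload)
print(name, payload[start:].decode("utf-16-le"))
```

One known limit of any such check: UTF-16 LE data whose first character after the BOM is U+0000 is indistinguishable from a UTF-32 LE BOM.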

DIUnicode 2.00 – 2. February 2004

Added the possibility to link DIUnicode against DIConverters, which gives access to 130+ character encodings.
Added Pascal implementation for reading / decoding and writing / encoding the following character sets:
- Mac Arabic, Mac Dingbats, Mac Central Europe, Mac Croatian, Mac Cyrillic, Mac Farsi, Mac Greek, Mac Hebrew, Mac Iceland, Mac Roman, Mac Romanian, Mac Thai, Mac Turkish
- UCS-2 LE, UCS-2 BE
- UCS-4 LE, UCS-4 BE
- UTF-32 LE, UTF-32 BE
- UTF-7 (Write_UTF_7 / Read_UTF_7)
- UTF-7 Optional Direct Characters (Write_UTF_7_ODC / reads as Read_UTF_7)
- JIS X0201, NextStep, TIS 620
Bug fixes.

DIUnicode 1.50 – 13. December 2003

Added support for reading and writing UTF-7 according to RFC 2152. Writing UTF-7 comes in two flavors, with (Write_UTF_7) or without (Write_UTF_7_ODC) encoding optional direct characters. UTF-7 reading (Read_UTF_7) works equally well for both writing methods.
Implementation of UTF-7 made it necessary to change the reading and writing implementation of TDIUnicodeReader and TDIUnicodeWriter to allow data buffering between consecutive reads and writes.
Added TDIUnicodeReader.PushSource and TDIUnicodeReader.PopSource methods, which allow inserting one source into another, as with the Pascal {$INCLUDE …} directive.
TDIUnicodeReader can optionally free its source stream when reading reaches the end of the stream. This is especially useful when reading nested files using the TDIUnicodeReader.PushSource and TDIUnicodeReader.PopSource methods. The protected property TDIUnicodeReader.AutoFreeSourceStreams may be used by descendant classes which implement specialized reading / parsing.
Added methods to TDIUnicodeReader for reading digits and hexadecimal characters, as well as for retrieving data as trimmed strings.
Various other improvements, code clean-ups and minor bug fixes.

DIUnicode 1.00 – 5. May 2003

Initial release.
