Re: Parsing a UTF-16LE file line by line, BUG?

Patrick Schluter via Digitalmars-d-learn Sun, 29 Jan 2017 06:51:14 -0800

On Saturday, 28 January 2017 at 15:40:24 UTC, Nestor wrote:

On Friday, 27 January 2017 at 04:26:31 UTC, Era Scarecrow wrote:
Skipping the BOM is just a matter of skipping the first two bytes identifying it...
AFAIK in some cases the BOM takes up to 4 bytes (FOR UTF-32), so when input encoding is unknown one must perform some kind of detection in order to apply the correct transcoding later. I thought by now dmd had this functionality built-in and exposed, since the compiler itself seems to do it for source code units.


In UTF-8 files the BOM is 3 bytes long (EF BB BF).
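
To make the 2, 3 and 4 byte cases above concrete, here is a minimal sketch of the kind of detection being discussed. It is an illustration only: `bomLength` is a hypothetical helper written for this example, not something from Phobos, and it only recognises the UTF-8, UTF-16 and UTF-32 marks.

```d
import std.stdio : writeln;

/// Hypothetical helper (not part of Phobos): returns the length in bytes
/// of the BOM at the start of `head`, or 0 if no known BOM is found.
size_t bomLength(const(ubyte)[] head)
{
    // True if `head` begins with the given byte sequence.
    bool hasPrefix(const(ubyte)[] prefix)
    {
        return head.length >= prefix.length
            && head[0 .. prefix.length] == prefix;
    }

    // UTF-32LE must be tested before UTF-16LE: FF FE 00 00 starts with FF FE.
    if (hasPrefix([0xFF, 0xFE, 0x00, 0x00])) return 4; // UTF-32LE
    if (hasPrefix([0x00, 0x00, 0xFE, 0xFF])) return 4; // UTF-32BE
    if (hasPrefix([0xEF, 0xBB, 0xBF]))       return 3; // UTF-8
    if (hasPrefix([0xFF, 0xFE]))             return 2; // UTF-16LE
    if (hasPrefix([0xFE, 0xFF]))             return 2; // UTF-16BE
    return 0; // no BOM: the encoding has to be guessed some other way
}

void main()
{
    ubyte[] utf16le = [0xFF, 0xFE, 0x41, 0x00]; // BOM followed by "A"
    ubyte[] utf8    = [0xEF, 0xBB, 0xBF, 0x41]; // BOM followed by "A"
    writeln(bomLength(utf16le)); // prints 2
    writeln(bomLength(utf8));    // prints 3
}
```

One caveat with this approach: a UTF-16LE file whose first character after the BOM is U+0000 is byte-for-byte indistinguishable from a UTF-32LE BOM, so the sketch would report 4 for it. That is rare in text files, but it means the result is a best guess rather than a guarantee.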
