http://d.puremagic.com/issues/show_bug.cgi?id=8754
Summary: Function commonPrefix returns invalid string when passing two cyrillic utf-8 strings
Product: D
Version: D2
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Phobos
AssignedTo: nob...@puremagic.com
ReportedBy: lxyd.dl...@lxyd.net

--- Comment #0 from Alexey Dubinin <lxyd.dl...@lxyd.net> 2012-10-04 07:53:02 PDT ---
Run this demo:
--------
import std.algorithm, std.stdio, std.encoding;

void main()
{
    // The Cyrillic letters 'б' and 'в' each take two bytes in UTF-8;
    // the first byte is the same in both.
    auto p = commonPrefix("б", "в");
    writeln(p.length); // prints 1 (one code unit); should be 0
    assert(isValid(p)); // fails: incomplete code point
}
--------

I'm just learning D, so I'm not sure this is a real bug, but commonPrefix seems
to be designed to treat strings in a special way, and that way seems to be wrong
for strings :)

Let me suggest this separate implementation of commonPrefix for strings (I tried
to mimic the original code):
--------
import std.functional, std.traits, std.range;

auto commonPrefix(alias pred = "a == b", R1, R2)(R1 r1, R2 r2)
    if (isSomeString!R1 && isSomeString!R2)
{
    auto result = r1.save;
    // front/popFront step through a string one code point at a time
    for (; !r1.empty && !r2.empty && binaryFun!pred(r1.front, r2.front);
         r1.popFront(), r2.popFront())
    {}
    // r1.length is what remains, counted in code units
    return result[0 .. $ - r1.length];
}
--------

Once again, I'm just learning D and I'm not sure this code is fully correct, but
it seems to work fine with strings (I'm also not sure whether this separate
implementation should be trusted and pure).

BTW: the documentation has a mistake too: "The type of the result is the same as
$(D takeExactly(r1, n))". But takeExactly always returns takeExactly.Result, and
commonPrefix can return a slice.
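To illustrate the difference between comparing code units and comparing code points, here is a minimal sketch. It assumes a reasonably recent Phobos in which `std.utf.validate` throws `UTFException` on malformed input; `codePointPrefix` is a hypothetical helper name used only for this illustration, not a Phobos function, and it is not meant as the eventual fix.

```d
import std.exception : assertThrown;
import std.range;   // empty, front, popFront for strings
import std.utf : validate, UTFException;

// Hypothetical helper (not part of Phobos): common prefix of two
// UTF-8 strings, compared one code point at a time.
string codePointPrefix(string a, string b)
{
    auto rest = a;
    // On a string, front decodes a whole code point (dchar) and
    // popFront skips all of its code units.
    while (!rest.empty && !b.empty && rest.front == b.front)
    {
        rest.popFront();
        b.popFront();
    }
    // rest.length counts code units, so the slice ends on a
    // code point boundary.
    return a[0 .. $ - rest.length];
}

void main()
{
    // Both letters are two code units long and share the first one.
    assert("б".length == 2 && "в".length == 2);
    assert("б"[0] == "в"[0]);

    // Cutting after the shared byte leaves an incomplete sequence.
    string bytePrefix = "б"[0 .. 1];
    assertThrown!UTFException(validate(bytePrefix));

    // Comparing code points never splits a character.
    assert(codePointPrefix("б", "в") == "");
    assert(codePointPrefix("абв", "абг") == "аб");
}
```

The sketch compares with `==` only; the `pred` parameter and mixed string widths from the suggested implementation are left out to keep it short.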