On 16-Aug-2015 03:50, Walter Bright wrote:
> On 8/15/2015 3:18 AM, Sönke Ludwig wrote:
>>> There is no reason to validate UTF-8 input. The only place where
>>> non-ASCII code units can even legally appear is inside strings, and
>>> there they can just be copied verbatim while looking for the end of
>>> the string.
>> The idea is to assume that any char based input is already valid UTF
>> (as D defines it), while integer based input comes from an unverified
>> source, so that it still has to be validated before being cast/copied
>> into a 'string'. I think this is a sensible approach, both semantically
>> and performance-wise.
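That split might look something like the following sketch (hypothetical `toJsonText` helpers, not actual Phobos or vibe.d code — only `std.utf.validate` is real):

```d
import std.utf : validate;

// Hypothetical sketch of the proposed policy: char-based input is assumed
// to already be valid UTF-8 (as the D spec requires for `string`), while
// ubyte-based input is treated as untrusted and validated before use.
string toJsonText(const(char)[] input)
{
    return input.idup; // trusted: D chars are defined to be valid UTF-8
}

string toJsonText(const(ubyte)[] input)
{
    auto text = cast(const(char)[]) input;
    validate(text);    // throws UTFException on malformed sequences
    return text.idup;
}
```

The overload set keeps the fast path (char input) validation-free while still rejecting malformed bytes from untrusted integer-based sources.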

> The json parser will work fine without doing any validation at all. I've
> been implementing string handling code in Phobos with the idea of doing
> validation only if the algorithm requires it, and only for those parts
> that require it.


Aye.
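The verbatim-copy approach for strings can be sketched roughly like this (hypothetical `scanString` helper, just to illustrate — it returns the raw payload without unescaping):

```d
// Hypothetical sketch: find the end of a JSON string without decoding
// UTF-8. Non-ASCII code units (>= 0x80) can only occur inside the string
// payload, so they are stepped over verbatim; only '"' and '\\' matter.
string scanString(const(char)[] input, ref size_t pos)
{
    assert(input[pos] == '"');
    immutable start = ++pos;
    while (pos < input.length)
    {
        immutable c = input[pos];
        if (c == '"')
            return input[start .. pos++].idup; // raw payload, unescaping omitted
        if (c == '\\')
            pos += 2; // skip '\' and the escaped character
        else
            ++pos;    // ASCII or multi-byte UTF-8 code unit alike
    }
    assert(0, "unterminated string");
}
```

Note that the scanner never produces a single code point; every byte is either a structural ASCII character or opaque payload.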

> There are many validation algorithms in Phobos one can tack on - having
> two implementations of every algorithm, one with an embedded reinvented
> validation and one without - is too much.

Actually, there are next to none. `validate`, which throws on failed validation, is a misnomer.

> The general idea with algorithms is that they do not combine things, but
> they enable composition.


At the lower level, such as in tokenizers, combining a couple of simple steps makes sense because it makes things run faster: it usually eliminates the need for a temporary result that must be digestible by the next range.

For instance, by "combining" decoding and character classification, one may side-step generating the code point value itself (because it no longer has to be produced for the top-level algorithm).
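A concrete case of such fusion: skipping whitespace in a JSON tokenizer never needs a decoded `dchar` at all, because all significant JSON whitespace is ASCII. A minimal sketch (hypothetical `skipWhitespace` helper):

```d
// Hypothetical sketch of fusing decoding with classification: JSON
// whitespace is pure ASCII, so the scanner classifies raw code units
// directly and never materialises a code point for the multi-byte
// sequences elsewhere in the input.
size_t skipWhitespace(const(char)[] input, size_t pos)
{
    while (pos < input.length)
    {
        switch (input[pos])
        {
            case ' ', '\t', '\r', '\n':
                ++pos;
                break;
            default:
                return pos; // first non-whitespace code unit
        }
    }
    return pos;
}
```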


--
Dmitry Olshansky
