On 22.08.2014 20:08, Walter Bright wrote:
On 8/21/2014 3:35 PM, Sönke Ludwig wrote:
Destroy away! ;)

Thanks for taking this on! This is valuable work. On to destruction!

I'm looking at:

http://s-ludwig.github.io/std_data_json/stdx/data/json/lexer/lexJSON.html

I anticipate this will be used a LOT, and in applications demanding very
high speed. With that in mind,


1. There's no mention of what will happen if it is passed malformed JSON
strings. I presume an exception is thrown. Exceptions are both slow and
consume GC memory. I suggest an alternative would be to emit an "Error"
token instead; this would be much like how the UTF decoding algorithms
emit a "replacement char" for invalid UTF sequences.

The latest version now features a LexOptions.noThrow option, which causes an error token to be emitted instead of throwing. After the error token has been popped, the range is always empty.
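To illustrate, a minimal consumer sketch (lexJSON and LexOptions.noThrow are from the API under discussion; the exact shape of the token-kind check is an assumption here):

```d
import stdx.data.json.lexer;

void process(string json)
{
    // With noThrow set, malformed input yields an error token
    // rather than an exception.
    auto tokens = lexJSON!(LexOptions.noThrow)(json);
    foreach (token; tokens)
    {
        // Hypothetical kind check; the real enum member name may differ.
        if (token.kind == JSONToken.Kind.error)
        {
            // After the error token is popped, the range is empty,
            // so lexing simply stops here.
            break;
        }
        // ... handle regular tokens ...
    }
}
```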


2. The escape sequenced strings presumably consume GC memory. This will
be a problem for high performance code. I suggest either leaving them
undecoded in the token stream, and letting higher level code decide what
to do about them, or provide a hook that the user can override with his
own allocation scheme.

The problem is that which approach is more efficient really depends on the use case and on the type of input stream: storing the escaped version of a string might require *two* allocations if the input range cannot be sliced and the decoded string is later requested by the parser. My current idea is therefore to simply make this configurable, too.
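A rough sketch of what "configurable" could look like, assuming a bit-flag LexOptions enum (the rawStrings flag name is purely hypothetical):

```d
// Hypothetical sketch of lexer options as bit flags.
enum LexOptions
{
    none       = 0,
    noThrow    = 1 << 0, // emit an error token instead of throwing
    rawStrings = 1 << 1, // keep escape sequences undecoded (assumed name)
}

// With rawStrings set, a sliceable input needs no allocation during
// lexing; decoding would happen lazily only when the caller asks for
// the unescaped value. Without it, the lexer decodes eagerly, which is
// cheaper when the decoded form is always needed.
```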

Enabling the use of custom allocators should be easy to add later on. My suggestion would be to hold off on this until we have a finished std.allocator module.
