On Tuesday, 7 May 2013 at 20:14:20 UTC, w0rp wrote:
On Tuesday, 7 May 2013 at 18:36:20 UTC, Sean Kelly wrote:

$ main
n = 1
Milliseconds to call stdJson() n times: 73054
Milliseconds to call newJson() n times: 44022
Milliseconds to call jepJson() n times: 839
newJson() is faster than stdJson() 1.66x times
jepJson() is faster than stdJson() 87.1x times

This is very interesting. This jepJson library seems to be pretty fast. I imagine it works much like SAX, so you can save quite a bit simply by not having to allocate.

Yes, the jep parser does no allocation at all--all callbacks simply receive a slice of the value. It does full validation according to the spec, but there's no interpretation of the values beyond that either, so if you want the integer string you were passed converted to an int, for example, you'd do the conversion yourself. The same goes for unescaping of string data, and in practice I often end up unescaping the strings in-place since I typically never need to re-parse the input buffer.
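To illustrate the in-place unescaping mentioned above, here is a hedged sketch (not the actual jep code): because callbacks receive slices of the original buffer, escape sequences can be collapsed by writing back into the same slice. The helper name `unescapeInPlace` is illustrative, and only a few escapes are handled; the full set (`\/`, `\b`, `\f`, `\r`, `\t`, `\uXXXX`) follows the same pattern.

```d
// Collapse JSON escape sequences inside the slice itself -- no allocation.
// Handles only \" \\ \n here; unknown escapes are passed through as-is.
char[] unescapeInPlace(char[] s)
{
    size_t w = 0;                       // write cursor, always <= read cursor
    for (size_t r = 0; r < s.length; ++r)
    {
        char c = s[r];
        if (c == '\\' && r + 1 < s.length)
        {
            ++r;                        // consume the escaped character
            switch (s[r])
            {
                case '"':  c = '"';  break;
                case '\\': c = '\\'; break;
                case 'n':  c = '\n'; break;
                default:   c = s[r]; break;
            }
        }
        s[w++] = c;
    }
    return s[0 .. w];                   // shortened view of the same buffer
}

void main()
{
    import std.stdio : writeln;

    // A slice as a callback would hand it back: contains the literal
    // characters '\' and 'n', not a real newline.
    char[] raw = `say \"hi\"\nbye`.dup;
    writeln(unescapeInPlace(raw));
}
```

Since the result is a prefix of the input buffer, this only works when you never need to re-parse the original text, as noted above.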

In practice, it's kind of a pain to use the jep parser for arbitrary processing, so I have some functions layered on top of it that iterate across array values and object keys:

int foreachArrayElem(char[] buf, scope int delegate(char[] value));
int foreachObjectField(char[] buf, scope int delegate(char[] name, char[] value));


This works basically the same as opApply, so having the delegate return a nonzero value causes parsing to abort and return that value from the foreach routine. The parser is sufficiently fast that I generally just nest calls to these foreach routines to parse complex types, even though this results in multiple passes across the same data.
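A sketch of how such a helper might be used, with a deliberately simplistic stand-in implementation of the `foreachObjectField` signature quoted above (it handles only a flat object with scalar values and no escaped strings -- the real jep-layered version does full validation):

```d
import std.string : indexOf;

// Simplistic stand-in: slice out each "name":value pair of a flat object
// and hand the slices to the delegate, opApply-style.
int foreachObjectField(char[] buf,
                       scope int delegate(char[] name, char[] value) dg)
{
    size_t i = buf.indexOf('{') + 1;
    while (i < buf.length && buf[i] != '}')
    {
        // Field name: the text between the next pair of quotes.
        size_t nameStart = buf.indexOf('"', i) + 1;
        size_t nameEnd   = buf.indexOf('"', nameStart);
        // Value: everything after the colon up to ',' or '}'.
        size_t valStart = buf.indexOf(':', nameEnd) + 1;
        size_t valEnd   = valStart;
        while (valEnd < buf.length && buf[valEnd] != ',' && buf[valEnd] != '}')
            ++valEnd;
        if (auto r = dg(buf[nameStart .. nameEnd], buf[valStart .. valEnd]))
            return r;   // nonzero aborts and propagates, as with opApply
        i = valEnd + (valEnd < buf.length && buf[valEnd] == ',' ? 1 : 0);
    }
    return 0;
}

void main()
{
    import std.conv : to;
    import std.stdio : writeln;

    char[] buf = `{"x":3,"y":4}`.dup;
    int x, y;
    foreachObjectField(buf, (char[] name, char[] value) {
        // The parser hands back raw slices; conversion is the caller's job.
        if (name == "x") x = value.to!int;
        else if (name == "y") y = value.to!int;
        return 0;       // returning nonzero here would stop the scan
    });
    writeln("x = ", x, ", y = ", y);
}
```

For nested types, the value slice handed to the delegate would itself be fed to another `foreachObjectField`/`foreachArrayElem` call, which is the multi-pass nesting described above.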

The only other thing I was careful to do is design the library in such a way that each parser callback could call a corresponding writer routine to simply pass through the input to an output buffer. This makes auto-reformatting a breeze because you just set a "format output" flag on the writer and implement a few one-line functions.
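A hedged sketch of that pass-through idea -- the names here (`JsonWriter`, `objectStart`, `field`, `objectEnd`) are illustrative, not the actual library API. Each parser callback forwards its input slice to the matching writer call, and flipping the `formatted` flag turns raw pass-through into pretty-printing with no other code changes:

```d
import std.array : Appender, appender;

struct JsonWriter
{
    Appender!(char[]) sink;
    bool formatted;     // the "format output" flag
    int  depth;
    bool needComma;

    // Emit a newline plus indentation, but only in formatted mode.
    private void newline()
    {
        if (!formatted) return;
        sink.put('\n');
        foreach (_; 0 .. depth) sink.put("  ");
    }

    void objectStart()
    {
        if (needComma) sink.put(',');
        sink.put('{');
        ++depth;
        needComma = false;
    }

    // Pass the name and value slices straight through to the output.
    void field(const(char)[] name, const(char)[] value)
    {
        if (needComma) sink.put(',');
        newline();
        sink.put('"'); sink.put(name); sink.put(`":`);
        sink.put(value);
        needComma = true;
    }

    void objectEnd()
    {
        --depth;
        newline();
        sink.put('}');
        needComma = true;
    }
}

void main()
{
    import std.stdio : writeln;

    // Same event sequence, two renderings: toggle `formatted` only.
    auto w = JsonWriter(appender!(char[]), true);
    w.objectStart();
    w.field("x", "3");
    w.field("y", "4");
    w.objectEnd();
    writeln(w.sink.data);
}
```

Hooking each parser callback up to the corresponding writer call then makes reformatting a single parse-and-re-emit pass, as described above.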


Before I read this, I went about creating my own benchmark. Here is a .zip containing the source and some nice looking bar charts comparing std.json, vibe.d's json library, and my own against various arrays of objects held in memory as a string:

http://www.mediafire.com/download.php?gabsvk8ta711q4u

For those less interested in downloading and looking at the .ods file, here are the results for the largest input size. (Array of 100,000 small objects)

std.json - 2689375370 ms
vibe.data.json - 2835431576 ms
dson - 3705095251 ms

These results don't seem correct.  Is this really milliseconds?
