On Monday, 2 June 2014 at 00:18:19 UTC, David Soria Parra wrote:
Hi,
I have recently had to deal with large amounts of JSON data in
D. While doing that I've found that std.json is remarkably slow
in comparison to other languages' standard JSON implementations.
I've created a small and simple benchmark that takes a local copy
of a GitHub API call
"https://api.github.com/repos/D-Programming-Language/dmd/pulls",
parses it 100 times, and writes the title to stdout.
My results are as follows:
./d-test > /dev/null 3.54s user 0.02s system 99% cpu 3.560 total
./hs-test > /dev/null 0.02s user 0.00s system 93% cpu 0.023 total
python test.py > /dev/null 0.77s user 0.02s system 99% cpu 0.792 total
The concrete implementations (sorry for my terrible Haskell
implementation) can be found here:
https://github.com/dsp/D-Json-Tests/
This compares D's std.json with Haskell's Data.Aeson and
Python's standard library json. I am a bit concerned about the
current state of our JSON parser, given that a lot of
applications these days use JSON. I personally consider a
high-speed JSON implementation a critical part of a standard
library.
Would it make sense to start thinking about using ujson4c as an
external library, or maybe come up with a better
implementation? I know Orvid has something and might add some
analysis as to why std.json is slow. Any ideas or pointers as
to how to start with that?
BTW, an acquaintance of mine points out that your Haskell code
differs from the other samples.
Your Haskell code parses the JSON array only once, which is why
it is so fast.
He has uploaded a version that behaves like the other samples and
parses the JSON array on every loop iteration. Please check it.
https://gist.github.com/maoe/e5f72c3cf3687610fe5c
Results in my environment:
% time ./new_test > /dev/null
./new_test > /dev/null 1.13s user 0.02s system 99% cpu 1.144 total
% time ./test > /dev/null
./test > /dev/null 0.02s user 0.00s system 91% cpu 0.023 total
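
To make the difference concrete, here is a minimal Python sketch of the two loop shapes. It is not the code from either repository: the sample data and the function names are made up for illustration, and it only counts how often the parser runs rather than timing anything.

```python
import json

# Hypothetical stand-in for the local copy of the GitHub API response.
SAMPLE = json.dumps([{"title": "pull request %d" % i} for i in range(3)])


def run_parse_every_iteration(raw, iterations):
    """Parse the JSON text again on every loop iteration."""
    titles, parses = [], 0
    for _ in range(iterations):
        pulls = json.loads(raw)  # parser runs once per iteration
        parses += 1
        titles.extend(pull["title"] for pull in pulls)
    return titles, parses


def run_parse_once(raw, iterations):
    """Parse the JSON text once, then only re-read the parsed result."""
    pulls = json.loads(raw)  # parser runs a single time
    parses = 1
    titles = []
    for _ in range(iterations):
        titles.extend(pull["title"] for pull in pulls)
    return titles, parses


if __name__ == "__main__":
    every_titles, every_parses = run_parse_every_iteration(SAMPLE, 100)
    once_titles, once_parses = run_parse_once(SAMPLE, 100)
    # Both variants write identical output, so the difference is easy to miss.
    print("same output:", every_titles == once_titles)
    print("parser calls:", every_parses, "vs", once_parses)
```

Both functions produce the same titles, but the second one does the parsing work only once, so comparing its timing against a per-iteration benchmark mostly measures the loop and the output, not the parser.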