On Wednesday, 21 October 2015 at 22:24:30 UTC, Marco Leise wrote:
On Wed, 21 Oct 2015 04:17:16 +0000,
Laeeth Isharc <laeeth.nos...@nospam-laeeth.com> wrote:
Very impressive.
Is this not quite interesting? Such a basic web back-end
operation, and yet it's a very different picture from those
who say that one is I/O or network bound. I already have JSON
files of a couple of gig, and they're only going to be bigger
over time, and this is a more generally interesting question.
Seems like you now get 2.1 gigabytes/sec sequential read from a
cheap consumer SSD today...
You have this huge amount of Reddit API JSON, right?
I wonder if your processing could benefit from the fast
skipping routines or even reading it as "trusted JSON".
The couple of gig were just Quandl metadata for one provider, but
you're right I have that Reddit data too. And that's just a
beginning. What some have been doing for a while, I'm beginning
to do now, and many others will be doing in the next few years -
just as soon as they have finished having meetings about what to
do... I don't suppose they'll be using python, at least not for
long.
I am sure it could benefit - I kind of need to get some other
parts going first. (For once it truly is a case of Knuth's 97%).
But I'll be coming back to look at the best way, for JSON but also
for text files more generally.
Have you thought about writing up your experience with writing
fast JSON? A bit like Walter's Dr. Dobb's article on wielding a
profiler to speed up dmd.
And actually if you have time, would you mind dropping me an
email? laeeth at
....
kaledicassociates.com
Thanks.
Laeeth.