Re: Lazily parse a JSON text file using stdx.data.json?

2018-05-22 Thread Steven Schveighoffer via Digitalmars-d

On 5/22/18 3:58 PM, Dr.No wrote:

Does this cause an infinite loop?
https://github.com/schveiguy/jsoniopipe/blob/master/source/jsoniopipe/dom.d#L134 



Possibly. Bug reports are welcome :) I think this line ensures that it makes 
progress: 
https://github.com/schveiguy/jsoniopipe/blob/master/source/jsoniopipe/dom.d#L148 
but I'm not confident enough to say so for sure.


Of course, as you can probably see, I've spent almost no time working on 
that code base so far. I need to get back to it. The DOM parser has seen very 
little real usage; I just got it working with the given unittests and 
then checked it in.


I've changed iopipe a bit since then as well, but I think I got it 
compiling just before my "lightning talk" at the Munich D meetup during 
dconf. Didn't have time to demonstrate it though.


-Steve


Re: Lazily parse a JSON text file using stdx.data.json?

2018-05-22 Thread Dr.No via Digitalmars-d
On Sunday, 17 December 2017 at 16:51:21 UTC, Steven Schveighoffer 
wrote:

On 12/17/17 4:44 AM, Jonathan M Davis wrote:


[...]


There is an even more work-in-progress library built on that, 
but it's not yet in dub (this was the library I wrote for my 
dconf talk this year): https://github.com/schveiguy/jsoniopipe


This kind of demonstrates how to parse json data lazily with 
pretty high performance.


It really depends on what you are trying to do, though.


[...]


I think there eventually will have to be a day of reckoning for 
auto-decoding. But it probably will take a monumental effort to 
show how it can be done without being too painful for existing 
code. I still believe it can be done.


-Steve


Does this cause an infinite loop?
https://github.com/schveiguy/jsoniopipe/blob/master/source/jsoniopipe/dom.d#L134


Re: Lazily parse a JSON text file using stdx.data.json?

2018-01-01 Thread David Gileadi via Digitalmars-d

On 12/30/17 8:16 PM, Marco Leise wrote:

There is also the JSON parser from
https://github.com/mleise/fast
if you need to parse 2x faster than RapidJSON ;)


Nice, I'll take a look.

My original post was mainly to express how surprised I was that one of 
D's front-page features was, for me, impossible to get working in this 
context. I posted in hopes that more experienced folks might consider 
making fixes to help smooth future attempts by others.


I realize that compile-time ranges are not runtime interfaces like many 
languages provide for iteration, but right now ranges seem too hard to 
get right when it feels like they should just work.


Re: Lazily parse a JSON text file using stdx.data.json?

2017-12-30 Thread Marco Leise via Digitalmars-d
On Sun, 17 Dec 2017 10:21:33 -0700, David Gileadi wrote:

> On 12/17/17 3:28 AM, WebFreak001 wrote:
> > On Sunday, 17 December 2017 at 04:34:22 UTC, David Gileadi wrote:
> > Uh, I don't know about stdx.data.json, but if you haven't managed to 
> > succeed yet, I know that asdf[1] works really well with streaming JSON. 
> > There is also an example of how it works.
> > 
> > [1]: http://asdf.dub.pm  
> 
> Thanks, reading the whole file into memory worked fine. However, asdf 
> looks really cool. I'll definitely look into it next time I need to deal 
> with JSON.

There is also the JSON parser from
https://github.com/mleise/fast
if you need to parse 2x faster than RapidJSON ;)

-- 
Marco



Re: Lazily parse a JSON text file using stdx.data.json?

2017-12-17 Thread David Gileadi via Digitalmars-d

On 12/17/17 3:28 AM, WebFreak001 wrote:

On Sunday, 17 December 2017 at 04:34:22 UTC, David Gileadi wrote:
Uh, I don't know about stdx.data.json, but if you haven't managed to succeed 
yet, I know that asdf[1] works really well with streaming JSON. There is 
also an example of how it works.


[1]: http://asdf.dub.pm


Thanks, reading the whole file into memory worked fine. However, asdf 
looks really cool. I'll definitely look into it next time I need to deal 
with JSON.
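
For anyone landing here later, the read-everything approach can be sketched 
roughly as below. This is only an illustration, not the code from the 
original attempt: it assumes Phobos' std.json in place of stdx.data.json, 
and loadJson is a hypothetical helper name.

```d
import std.file : readText;
import std.json : JSONValue, parseJSON;

/// Hypothetical helper: reads the whole file into memory, then parses it.
JSONValue loadJson(string path)
{
    // readText loads the complete file and validates it as UTF-8, so
    // memory use grows with the file size; nothing is parsed lazily here.
    return parseJSON(readText(path));
}
```

Values are then reached with calls such as loadJson("data.json")["key"], 
where both the file name and the key are placeholders.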


Re: Lazily parse a JSON text file using stdx.data.json?

2017-12-17 Thread Steven Schveighoffer via Digitalmars-d

On 12/17/17 4:44 AM, Jonathan M Davis wrote:


If I were seriously looking at
reading in a file lazily as a forward range, I'd look at
http://code.dlang.org/packages/iopipe, though as I understand it, it's very
much a work in progress.


There is an even more work-in-progress library built on that, but it's 
not yet in dub (this was the library I wrote for my dconf talk this 
year): https://github.com/schveiguy/jsoniopipe


This kind of demonstrates how to parse json data lazily with pretty high 
performance.


It really depends on what you are trying to do, though.


As for auto-decoding, yeah, it sucks. You can work around it with stuff like
std.utf.byCodeUnit, but auto-decoding is a problem all around, and it's one
that we're likely stuck with, because unfortunately, we haven't found a way
to remove it without breaking everything.


I think there eventually will have to be a day of reckoning for 
auto-decoding. But it probably will take a monumental effort to show how 
it can be done without being too painful for existing code. I still 
believe it can be done.


-Steve


Re: Lazily parse a JSON text file using stdx.data.json?

2017-12-17 Thread WebFreak001 via Digitalmars-d

On Sunday, 17 December 2017 at 04:34:22 UTC, David Gileadi wrote:
I'm a longtime fan of dlang, but haven't had a chance to do 
much in-depth dlang programming, and especially not range 
programming. Today I thought I'd use stdx.data.json to read 
from a text file. Since it's a somewhat large file, I thought 
I'd create a text range from the file and parse it that way. 
stdx.data.json has a great interface for lazily parsing text 
into JSON values, so all I had to do was turn a text file into 
a lazy range of UTF-8 chars that stdx.data.json's lexer could 
use. (In my best Clarkson voice:) How hard could it be?


[...]


Uh, I don't know about stdx.data.json, but if you haven't managed to 
succeed yet, I know that asdf[1] works really well with streaming 
JSON. There is also an example of how it works.


[1]: http://asdf.dub.pm


Re: Lazily parse a JSON text file using stdx.data.json?

2017-12-17 Thread Jonathan M Davis via Digitalmars-d
On Saturday, December 16, 2017 21:34:22 David Gileadi via Digitalmars-d 
wrote:
> I'm a longtime fan of dlang, but haven't had a chance to do much
> in-depth dlang programming, and especially not range programming. Today
> I thought I'd use stdx.data.json to read from a text file. Since it's a
> somewhat large file, I thought I'd create a text range from the file and
> parse it that way. stdx.data.json has a great interface for lazily
> parsing text into JSON values, so all I had to do was turn a text file
> into a lazy range of UTF-8 chars that stdx.data.json's lexer could use.
> (In my best Clarkson voice:) How hard could it be?
>
> Several hours later, I've finally given up and am just reading the whole
> file into a string. There may be a magic incantation I could use to make
> it work, but I can't find it, and frankly I can't see why I should need
> an incantation in the first place. It really ought to just be a method
> of std.stdio.File.
>
> Apparently some of the complexity is caused by autodecoding (e.g. joiner
> returns a range of dchar from char ranges), and some of the fault may be
> in stdx.data.json, but either way I'm surprised that I couldn't do it.
> This is the kind of thing I expected to be ground level stuff.

I don't know what problems specifically you were hitting, but a lot of
range-based stuff (especially parsing) requires forward ranges so that there
can be some amount of lookahead (having just a basic input range can be
incredibly restrictive), and forward ranges and lazily reading from a file
don't tend to go together very well, because it tends to require allocating
buffers that then have to be copied on save. It gets to be rather difficult
to do it efficiently. std.stdio.File does support lazily reading in a file,
which works well with foreach, but if you're trying to process the entire
file as a range, it's usually just way easier to read in the entire file at
once and operate on it as a dynamic array. The option halfway in between is
to use std.mmfile so that the file gets treated as a dynamic array but the
OS is reading it in piecemeal for you. If I were seriously looking at
reading in a file lazily as a forward range, I'd look at
http://code.dlang.org/packages/iopipe, though as I understand it, it's very
much a work in progress.
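
A rough sketch of two of the options mentioned above, assuming the goal is
simply to scan the raw bytes of a file: std.stdio.File.byChunk with foreach,
and std.mmfile.MmFile treated as one array. The function names and the
brace-counting task are made up for illustration, and neither function is a
forward range over the file.

```d
import std.mmfile : MmFile;
import std.stdio : File;

/// Counts '{' bytes while reading the file lazily in fixed-size chunks.
size_t countBracesChunked(string path, size_t chunkSize = 4096)
{
    size_t n;
    // byChunk reuses a single buffer, so a chunk is only valid inside its
    // own loop iteration; copy it if it has to outlive the iteration.
    foreach (ubyte[] chunk; File(path, "rb").byChunk(chunkSize))
        foreach (b; chunk)
            if (b == '{')
                ++n;
    return n;
}

/// Counts '{' bytes through a read-only memory mapping of the file.
size_t countBracesMapped(string path)
{
    auto mmf = new MmFile(path);
    scope (exit) destroy(mmf); // unmap promptly instead of waiting for the GC
    // The mapping looks like one big array; the OS pages it in on demand.
    auto bytes = cast(const(ubyte)[]) mmf[];
    size_t n;
    foreach (b; bytes)
        if (b == '{')
            ++n;
    return n;
}
```

Both functions should return the same count for the same file. The mapped
version assumes a non-empty file, since mapping a zero-length file may fail
on some platforms.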

As for auto-decoding, yeah, it sucks. You can work around it with stuff like
std.utf.byCodeUnit, but auto-decoding is a problem all around, and it's one
that we're likely stuck with, because unfortunately, we haven't found a way
to remove it without breaking everything.
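
To illustrate the byCodeUnit workaround, here is a small sketch. The two
helper functions are hypothetical names for this example only: the range
primitives present a string as a range of dchar, while byCodeUnit wraps it
as a range of its raw UTF-8 code units.

```d
import std.range.primitives : ElementType, walkLength;
import std.utf : byCodeUnit;

// The range primitives auto-decode narrow strings: the element type is dchar.
static assert(is(ElementType!string == dchar));
// byCodeUnit exposes the underlying UTF-8 code units instead.
static assert(is(ElementType!(typeof("".byCodeUnit)) == immutable(char)));

/// Walks the string as a decoded range, so this counts code points.
size_t countCodePoints(string s)
{
    return s.walkLength;
}

/// Uses the code-unit view, so this counts UTF-8 code units without decoding.
size_t countCodeUnits(string s)
{
    return s.byCodeUnit.length;
}
```

For pure ASCII input the two counts agree; they differ as soon as the text
contains a multi-byte character.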

- Jonathan M Davis