On Friday, 13 October 2017 at 21:53:12 UTC, Steven Schveighoffer wrote:
On 10/13/17 4:27 PM, Andrew Edwards wrote:
On Friday, 13 October 2017 at 19:17:54 UTC, Steven Schveighoffer wrote:
On 10/13/17 2:47 PM, Andrew Edwards wrote:
A bit of advice, please. I'm trying to parse a gzipped JSON file retrieved from the internet. The following naive implementation accomplishes the task:

     auto url = "http://api.syosetu.com/novelapi/api/?out=json&lim=500&gzip=5";
     getContent(url)
         .data
         .unzip
         .runEncoded!((input) {
             ubyte[] content;
             foreach (line; input.byLineRange!true) {
                 content ~= cast(ubyte[])line;
             }
             auto json = (cast(string)content).parseJSON;

input is an iopipe of char, wchar, or dchar. There is no need to cast it around.

In this particular case, all three types (char[], wchar[], and dchar[]) are being returned at different points in the loop. I don't know of any other way to generate a unified buffer than casting it to ubyte[].

This has to be a misunderstanding. The point of runEncoded is to figure out the correct type (based on the BOM), and run your lambda function with the correct type for the whole thing.

Maybe I'm just not finding the correct words to express my thoughts. This is what I mean:

// ===========

void main()
{
        auto url = "http://api.syosetu.com/novelapi/api/?out=json&lim=500&gzip=5";
        getContent(url)
                .data
                .unzip
                .runEncoded!((input) {
                        char[] content; // Line 20
                        foreach (line; input.byLineRange!true) {
                                content ~= line;
                        }
                });
}

output:
source/app.d(20,13): Error: cannot append type wchar[] to type char[]

Changing line 20 to wchar yields:
source/app.d(20,13): Error: cannot append type char[] to type wchar[]

And changing it to dchar[] yields:
source/app.d(20,13): Error: cannot append type char[] to type dchar[]

I'm not actually sure this is even needed, as the data could be coming through without a BOM. Without a BOM, it assumes UTF-8.

Note also that getContent returns a complete body, but unzip may not be so forgiving. But there definitely isn't a reason to create your own buffer here.

this should work (something like this really should be in iopipe):

while(input.extend(0) != 0) {} // get data until EOF

This!!! This is what I was looking for. Thank you. I incorrectly assumed that if I didn't process the content of input.window, it would be overwritten on each .extend(), so my implementation was:

ubyte[] json;
while(input.extend(0) != 0) {
     json ~= input.window;
}

This didn't work because it invalidated the Unicode data, so I ended up splitting by line instead.

Sure enough, this is trivial once one knows how to use it correctly, but I think it would be better to put this in the library as extendAll().
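
For illustration only, a minimal sketch of what such an extendAll() helper could look like. MockPipe is a hypothetical stand-in that only mimics the window/extend calls used in this thread, so the snippet compiles without iopipe. extendAll is likewise a hypothetical name here, not an existing iopipe function, and the real pipe types may behave differently.

```d
import std.algorithm.comparison : min;

// Hypothetical stand-in for an iopipe: it exposes only the two members
// used in this thread (window and extend). Not the real iopipe types.
struct MockPipe
{
    const(char)[] source; // all the data that will ever arrive
    size_t visible;       // how much of it is currently in the window
    size_t chunk = 4;     // elements made available per extend call

    // The data buffered so far.
    const(char)[] window() { return source[0 .. visible]; }

    // Makes up to `chunk` more elements visible (the request size is
    // ignored in this mock). Returns the number added, 0 at EOF.
    size_t extend(size_t elements)
    {
        auto n = min(chunk, source.length - visible);
        visible += n;
        return n;
    }
}

// Hypothetical helper: keep extending until EOF and report how many
// elements ended up in the window.
size_t extendAll(Pipe)(ref Pipe pipe)
{
    while (pipe.extend(0) != 0) {}
    return pipe.window.length;
}

void main()
{
    auto pipe = MockPipe(`[{"allcount":1}]`);
    // Nothing is buffered until the pipe is extended.
    assert(pipe.window.length == 0);
    // After extendAll the whole input is visible in a single window.
    assert(pipe.extendAll() == pipe.source.length);
    assert(pipe.window == `[{"allcount":1}]`);
}
```

The point of the sketch is that the window grows in place on every extend, so nothing needs to be copied into a separate buffer before handing it to the parser.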

ensureElems(size_t.max) should be equivalent, though I see you responded cryptically with something about JSON there :)

:) I'll have to blame it on my Security+ training. Replacing the while loop with ensureElems() in the following results in an error:

void main()
{
        auto url = "http://api.syosetu.com/novelapi/api/?out=json&lim=500&gzip=5";
        getContent(url)
                .data
                .unzip
                .runEncoded!((input) {
                        // while(input.extend(0) != 0){} // this works
                        input.ensureElems(size_t.max); // this doesn't
                        auto json = input.window.parseJSON;
                        foreach (size_t ndx, _; json) {
                                if (ndx == 0) continue;
                                auto title = json[ndx]["title"].str;
                                auto author = json[ndx]["writer"].str;
                                writefln("title: %s", title);
                                writefln("author: %s\n", author);
                        }
                });
}

output:

Running ./uhost
std.json.JSONException@std/json.d(1400): Unexpected end of data. (Line 1:8192)
----------------
4 uhost 0x000000010b671112 pure @safe void std.json.parseJSON!(char[]).parseJSON(char[], int, std.json.JSONOptions).error(immutable(char)[]) + 86

[etc]

