Hi Andy,

Nice use of syntax (though you have to lose the semicolon, of course).
Visually I like the arrow operator a lot. It looks like a visual pipeline:

"https://wiki.mozilla.org/images/f/ff/Example.json.gz"
=> fetch:binary()
=> archive:extract-text()

I also think that this could be a bug, or at least a good improvement to
make, as the docs say gzip archives can be created.

Christian, do you think we should file an issue for this?

--Marc

On Tue, Jan 26, 2016 at 9:51 PM, Andy Bunce <bunce.a...@gmail.com> wrote:
> Hi Marco,
>
> I get the same. This works:
>
> ````
> "https://wiki.mozilla.org/images/f/ff/Example.json.gz"
> !fetch:binary(.)
> !archive:extract-text(.)
> ````
>
> But this returns empty:
>
> ````
> "https://wiki.mozilla.org/images/f/ff/Example.json.gz"
> !fetch:binary(.)
> !archive:entries(.)
>
> <archive:entry xmlns:archive="http://basex.org/modules/archive"/>
> ````
>
> Expecting to see "example.json"
>
> Could this be a bug?
>
> /Andy
>
>
>
> On 26 January 2016 at 18:51, Maximilian Gärber <mgaer...@arcor.de> wrote:
>>
>> Hi,
>>
>> I think this should work, I use it for OData requests from IIS.
>>
>> Need to dig through the source...but I used one of the extract-binary
>> functions
>>
>> Regards, Max
>>
>> On 26.01.2016 at 16:04, "Marc van Grootel"
>> <marc.van.groo...@gmail.com> wrote:
>>>
>>> Well, shelling out wasn't so hard even on Windows with cygwin tools it's
>>> simply
>>>
>>> proc:execute('gunzip', $path-to-gzipped-file)
>>>
>>> Worked quite transparently as it extracts the files and removes the
>>> .gz file. Would be nice if there's a pure XQuery solution but for now
>>> I'm okay.
>>>
>>> Cheers,
>>>
>>> On Tue, Jan 26, 2016 at 3:13 PM, Marc van Grootel
>>> <marc.van.groo...@gmail.com> wrote:
>>> > Hi,
>>> >
>>> > I hoped that I could use archive module to also extract gzipped files.
>>> > I need to fetch/sync large XML from a web service that has the option
>>> > of getting files with gzip encoding (to be nice to the web server).
>>> >
>>> > First attempt was to explicitly get the gz file via the URL and then
>>> > treat it like an archive binary (extracting it with the recipe from
>>> > the archive module page). The entries XML I get is empty so I suppose
>>> > that I cannot read .gz
>>> >
>>> > Second attempt was to specify Accept-Encoding = gzip which indeed
>>> > delivers the XML as a binary. But I probably run into the same issue
>>> > when trying to extract.
>>> >
>>> > Is there a way to do the extraction of .gz encoded files without
>>> > having to shell out to some kind of unzipper?
>>> >
>>> > Cheers,
>>> > --Marc
>>>
>>>
>>>
>>> --
>>> --Marc
>
>
--
--Marc
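
For reference, the shell-out workaround mentioned in this thread could be put together roughly as below. This is only a sketch, not a tested recipe: it assumes a `gunzip` executable on the PATH (cygwin on Windows, as above), it assumes BaseX evaluates the sequence items in order, and the temp-file prefix and the error message are made up for illustration.

```xquery
(: Sketch only: fetch a .gz file, unpack it with an external gunzip, read the text. :)
let $url := "https://wiki.mozilla.org/images/f/ff/Example.json.gz"
(: hypothetical temp-file name; the .gz suffix matters because gunzip strips it :)
let $gz  := file:create-temp-file("example-", ".json.gz")
return (
  (: store the fetched bytes on disk so gunzip has a file to work on :)
  file:write-binary($gz, fetch:binary($url)),
  (: gunzip replaces the .gz file with the unpacked file next to it :)
  let $result := proc:execute("gunzip", $gz)
  return
    if ($result/code = "0")
    then file:read-text(replace($gz, "\.gz$", ""))
    else error((), "gunzip failed: " || $result/error)
)
```

The unpacked temp file is left on disk here; a real script would probably want to remove it with `file:delete` once the text has been read.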