Option one is definitely cleaner (and works much faster than my "dirty" version). Thanks so much, Christian. You're a star :)
Noam

On Thu, May 28, 2015 at 12:09 AM, Christian Grün <christian.gr...@gmail.com> wrote:
> Attached are two more solutions that might show you how it could work as
> well.
>
> On Wed, May 27, 2015 at 10:51 PM, Noam Green <green.n...@gmail.com> wrote:
> > OK. Solved it.
> >
> > Below is the final query:
> >
> > declare variable $in external;
> > declare variable $out external;
> > declare variable $vendor external;
> >
> > file:write-text-lines($out, 'Name,Host,Path,Count,Time'),
> > let $text := file:read-text($in)
> > let $xml := csv:parse($text, map { 'header': true() })
> >
> > for $x in $xml//record[contains(vendors,string($vendor))]
> >
> > let $count := string($x/ATTACKCOUNT)
> > let $name := $x/attackname
> > let $time := string($x/TIME_STAMP)
> > let $path := string($x/path)
> > let $host := string($x/host)
> > let $result := concat ($name,',',$host,',',$path,',',$count,',',$time)
> >
> > return file:append-text-lines($out, $result)
> >
> > I played with it so long, I'm not even sure what I finally did. But it
> > works, so I'm not touching it :)
> >
> > Thanks again Christian for all your help!
> > Noam
> >
> > On Wed, May 27, 2015 at 11:21 PM, Noam Green <green.n...@gmail.com> wrote:
> >>
> >> Hi Christian,
> >>
> >> The input file is quite large, so I tried to edit it and leave only 4
> >> lines. The strangest thing happened: when trying to run the shortened
> >> file in the editor, I now get the same error "Content is not allowed
> >> in prolog.".
> >>
> >> Below is my query code:
> >> declare variable $in external;
> >> declare variable $out external;
> >> declare variable $vendor external;
> >>
> >> (:
> >> Trying to work with CSV
> >> let $text := file:read-text($in)
> >> let $xml := csv:parse($text, map { 'header': true() })
> >> let $inputcsv := csv:serialize($xml, map { 'header': true() } )
> >>
> >> :)
> >>
> >> file:write-text-lines($out, 'Name,Host,Path,Count,Time'),
> >> for $x in doc($in)//record[contains(vendors,string($vendor))]
> >>
> >> let $count := string($x/ATTACKCOUNT)
> >> let $name := $x/attackname
> >> let $time := string($x/TIME_STAMP)
> >> let $path := string($x/path)
> >> let $host := string($x/host)
> >> let $result := concat ($name,',',$host,',',$path,',',$count,',',$time)
> >> let $nothing := string('')
> >>
> >> return file:append-text-lines($out, $result)
> >>
> >> I tried adding some CSV manipulation lines (see the commented-out block
> >> above), but once I remove the comment markers, the editor reports:
> >> "Unexpected end of query: 'with CSV let $..."
> >>
> >> It doesn't help even if I add a comma after the serialize command.
> >>
> >> I know I'm missing something stupid, but can't seem to get around it.
> >>
> >> Cheers,
> >> Noam
> >>
> >> On Wed, May 27, 2015 at 10:42 PM, Christian Grün
> >> <christian.gr...@gmail.com> wrote:
> >>>
> >>> > Why does it work in the editor [...]
> >>>
> >>> This surprises me. Could you send me your input file and the query?
> >>>
> >>> In general, the input of doc() must always be an XML document.
> >>> However, you can use csv:parse to convert CSV input to XML (once
> >>> again, please check out our Wiki for an example).
> >>>
> >>> >
> >>> > On Wed, May 27, 2015 at 10:33 PM, Christian Grün
> >>> > <christian.gr...@gmail.com> wrote:
> >>> >>
> >>> >> > C:\Temp>basex -b in=export_new.csv -b out=output.csv -b vendor=IID
> >>> >> > CSV_Query.xq
> >>> >>
> >>> >> It looks as if you are trying to parse a CSV file (export_new.csv)
> >>> >> as XML. Is this really what you wanna do?
> >>> >>
> >>> >> Christian
> >>> >>
> >>> >> >
> >>> >> > I get the following error:
> >>> >> > Stopped at C:/Temp/CSV_Query.xq, 6/15:
> >>> >> > [FODC0002] "C:/Temp/export_new.csv" (Line 1): Content is not allowed in prolog.
> >>> >> >
> >>> >> > This seems to be failing on the basic for $x in doc($in) command.
> >>> >> >
> >>> >> > What am I doing wrong?
> >>> >> >
> >>> >> > Thanks,
> >>> >> > Noam
> >>> >> >
> >>> >> > On Wed, May 27, 2015 at 10:02 PM, Christian Grün
> >>> >> > <christian.gr...@gmail.com> wrote:
> >>> >> >>
> >>> >> >> Hi Noam,
> >>> >> >>
> >>> >> >> > I missed the option of adding a comma after the initial
> >>> >> >> > file:write command (the editor was constantly asking for a
> >>> >> >> > return command).
> >>> >> >>
> >>> >> >> In XQuery, multiple expressions can always be separated with
> >>> >> >> commas. For example, the following XQuery expression returns
> >>> >> >> 4 items as results:
> >>> >> >>
> >>> >> >> 1, "string", <xml/>, file:read-text('abc.xml')
> >>> >> >>
> >>> >> >> Because XQuery, as a functional language, gives no guarantee
> >>> >> >> that the four expressions of the above query will be evaluated
> >>> >> >> one after another, the XQuery Scripting Extension [1] was
> >>> >> >> proposed. It offers a semicolon to separate expressions with
> >>> >> >> side effects (such as file functions). Due to its complexity,
> >>> >> >> however, it was not adopted by many XQuery implementations.
> >>> >> >>
> >>> >> >> At least in BaseX (and I think in Saxon, eXist-db and Zorba as
> >>> >> >> well), you can be assured that expressions separated with
> >>> >> >> commas will always be evaluated one after another.
> >>> >> >>
> >>> >> >> Well, this was probably more than you were asking for, but maybe
> >>> >> >> it's of some interest anyway ;)
> >>> >> >>
> >>> >> >> Christian
> >>> >> >>
> >>> >> >> [1] http://www.w3.org/TR/xquery-sx-10/
> >>> >> >>
> >>> >> >> >
> >>> >> >> > Thanks again. It worked perfectly (although I must admit I
> >>> >> >> > used the dirty option, as the CSV examples are mainly on
> >>> >> >> > adapting CSV into XML, while I need the other direction).
> >>> >> >> >
> >>> >> >> > Noam
> >>> >> >> >
> >>> >> >> > On Wed, May 27, 2015 at 9:40 PM, Christian Grün
> >>> >> >> > <christian.gr...@gmail.com> wrote:
> >>> >> >> >>
> >>> >> >> >> Hi Noam,
> >>> >> >> >>
> >>> >> >> >> > let $csv := csv:serialize($result)
> >>> >> >> >> > return file:write-text($out, $csv)
> >>> >> >> >> >
> >>> >> >> >> > The CSV that comes out only includes one line [...]
> >>> >> >> >>
> >>> >> >> >> As there are unlimited ways to represent XML nodes as CSV,
> >>> >> >> >> there is no way to automatically choose a representation
> >>> >> >> >> that always works best. For more information on creating an
> >>> >> >> >> XML representation that will yield good results as CSV,
> >>> >> >> >> please check out the documentation on our CSV Module [1].
> >>> >> >> >>
> >>> >> >> >> > Now this works, but I can't seem to find a way to add the
> >>> >> >> >> > headers to the first line of the file.
> >>> >> >> >>
> >>> >> >> >> I would recommend using the existing CSV features, because
> >>> >> >> >> they take care of all the usual nifty details. However, here
> >>> >> >> >> is one simple way to let your file start with a header line:
> >>> >> >> >>
> >>> >> >> >> file:write-text-lines($out, 'Name,Host,Path,Count,Time'),
> >>> >> >> >> let $result := concat($name,',',$host,',',$path,',',$count,',',$time)
> >>> >> >> >> return file:append-text-lines($out, $result)
> >>> >> >> >>
> >>> >> >> >> Hope this helps,
> >>> >> >> >> Christian
> >>> >> >> >>
> >>> >> >> >> [1] http://docs.basex.org/wiki/CSV_Module
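
Note for readers of the archive: the two attached solutions are not included in the text of this thread. As a rough sketch only, and not a reconstruction of either attachment, a CSV-module version of the final query might build a csv/record structure and serialize it in one go, so the header line and the separators are handled by csv:serialize. The output element names (Name, Host, Path, Count, Time) are assumptions taken from the header line used in the thread; the input column names are the ones from Noam's query.

```xquery
(: Sketch under assumptions: BaseX with its CSV and File modules;
   input columns vendors, attackname, host, path, ATTACKCOUNT, TIME_STAMP
   as in the query quoted in this thread. :)
declare variable $in external;
declare variable $out external;
declare variable $vendor external;

(: Parse the CSV text into XML; the first line supplies the element names. :)
let $xml := csv:parse(file:read-text($in), map { 'header': true() })

(: Build one <record> per matching row; the child element names
   become the header line when serializing. :)
let $result :=
  <csv>{
    for $x in $xml//record[contains(vendors, string($vendor))]
    return
      <record>
        <Name>{ string($x/attackname) }</Name>
        <Host>{ string($x/host) }</Host>
        <Path>{ string($x/path) }</Path>
        <Count>{ string($x/ATTACKCOUNT) }</Count>
        <Time>{ string($x/TIME_STAMP) }</Time>
      </record>
  }</csv>

(: Serialize everything at once, including the header, and write the file. :)
return file:write-text($out, csv:serialize($result, map { 'header': true() }))
```

It would be called the same way as in the thread, e.g. basex -b in=export_new.csv -b out=output.csv -b vendor=IID CSV_Query.xq. Writing the file once, instead of appending line by line, is probably also why a version along these lines runs faster.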