Re: [basex-talk] Finalizing Query-Objects
It makes no difference for the BaseX server if you close the session and have open query objects (query objects exclusively reside in the client). It can make a difference in client implementations, though. If you have a chance to always close queries after the execution, I think you should do so.

I assume you are caching the query results before iterating over them, as it’s done in the other client implementations?

Ben Engbers wrote on Mon, 3 Feb 2020, 11:01:

> Hi,
>
> The people from CRAN strongly suggested to add tests (comparable to
> Unit-tests) to my package (RBaseX). Their request led me to take another
> critical look at my code.
> So far the tests do not give an error message. But after completing the
> last test, 'testthat' reports 1 failure without further explanation.
> After changing the order in which the tests are executed, the failure is
> always caused by the last test. Therefore I think that it is not the
> tests that cause an error, but the finalize-process.
>
> At this moment, my code is based upon 3 classes: 'RBaseXClient' creates
> a new client-session. This session uses 'SocketClass' to communicate with
> basexserver. When used in query-mode, the session uses 'QueryClass' to
> create new query-objects. Due to this architecture, it is easy to
> explicitly close a regular query-object, but (at least in R) it is
> difficult to close query-objects when finalizing the session-object.
>
> How does the basexserver respond to closing the session without first
> explicitly closing all open queries? Does this result in an error?
>
> Ben
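
The advice above (close queries when you can, and let the client clean up the rest) can be sketched in Java as follows. This is only an illustration under stated assumptions: `Session`, `Query`, `query()`, `forget()` and `openCount()` are hypothetical names, not part of RBaseX or of any BaseX client API, and no network traffic is involved. The session simply remembers the query objects it created and closes the ones still open when it is closed itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side classes; NOT the RBaseX or BaseX API. */
public class SessionSketch {

    /** A query handle that lives only in the client. */
    static class Query implements AutoCloseable {
        private final Session owner;
        private boolean open = true;

        Query(Session owner) { this.owner = owner; }

        boolean isOpen() { return open; }

        @Override
        public void close() {
            if (open) {              // closing twice is harmless
                open = false;
                owner.forget(this);  // deregister from the session
            }
        }
    }

    /** A session that tracks the queries it has handed out. */
    static class Session implements AutoCloseable {
        private final List<Query> openQueries = new ArrayList<>();

        Query query() {
            Query q = new Query(this);
            openQueries.add(q);
            return q;
        }

        void forget(Query q) { openQueries.remove(q); }

        int openCount() { return openQueries.size(); }

        @Override
        public void close() {
            // Iterate over a copy: each close() removes the query from the list.
            for (Query q : new ArrayList<>(openQueries)) q.close();
        }
    }

    public static void main(String[] args) {
        try (Session session = new Session()) {
            Query first = session.query();
            session.query();   // deliberately left open
            first.close();     // explicit close, as recommended
            System.out.println(session.openCount()); // 1
        } // the session closes the remaining query here
    }
}
```

The same bookkeeping idea should carry over to a finalizer in another language: the session, not the garbage collector, decides the order in which queries and the socket are released.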
Re: [basex-talk] No difference for output from 'FULL' or 'RESULTS'
Hi Ben,

The client API code hasn’t changed since BaseX 8. Maybe you need to revise your code. If you believe something is going wrong in the API, I’d need some more information on what exactly you believe has changed.

Best,
Christian

Ben Engbers wrote on Mon, 3 Feb 2020, 15:11:

> Hi,
>
> As far as I can remember when using early versions from my
> client-software, the main difference in output after sending \04 or \1F
> to the database, was that in the latter case the output was preceded
> with XDM Meta data.
>
> # Full
> query_txt <- "for $i in 1 to 2 return Text { $i }"
> query_obj <- Query(Session, query_txt)
> result <- Full(query_obj)
>
> resulted in:
> "0b" "Text 1" "0b" "Text 2"
>
> # Iterate over query
> query2 <- "for $i in 3 to 4 return Iter { $i }"
> query_iterate <- Query(Session, query2) # <== Alternative call to query-object
> while (More(query_iterate)) {
>   cat(Next(query_iterate), "\n")
> }
>
> resulted in:
> Iter 3
> Iter 4
>
> Now, iterating over the same query gives:
> 0b
> Iter 3
> 0b
> Iter 4
>
> Did something change in the client/server protocol or did I introduce an
> error somewhere?
>
> Ben
Re: [basex-talk] how to count and remove "entities"
You could use REPLACE instead of ADD (or db:replace instead of db:add) and name your tweet by the JSON id. For more details, have a look at our documentation [1].

Deleting duplicates after the insertion would be another approach, but it surely is too slow if your plan is to store thousands or millions of tweets.

[1] http://docs.basex.org/wiki/Database_Module#db:replace

thufir wrote on Tue, 4 Feb 2020, 07:41:

> Not sure of the correct lingo, but I'm building a database of tweets.
> As I run it, duplicate tweets are added to the database. I can see the
> duplicates with:
>
> for $tweets in db:open("twitter")
> return {$tweets/json/id__str}
>
> Firstly, how would I select the json node for a duplicate entity. But,
> before even selecting that node, recursively look to see if there's more
> than one result for that id__str value.
>
> How would I even generate a count of each occurrence for the data of a
> specific id__str?
>
> thanks,
>
> Thufir
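
The two ideas in this exchange, counting how often an id occurs and avoiding duplicates by storing each tweet under its id, can be sketched in plain Java without any database. This is a minimal sketch under stated assumptions: the `Tweet` class, `countById`, `storeById` and the sample data are hypothetical and only stand in for the real tweets; an in-memory map plays the role that a REPLACE keyed on the JSON id would play in the database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the tweet database; not BaseX code. */
public class TweetDedupSketch {

    static final class Tweet {
        final String id;
        final String text;
        Tweet(String id, String text) { this.id = id; this.text = text; }
    }

    /** Counts how often each id occurs (insertion order is kept). */
    static Map<String, Integer> countById(List<Tweet> tweets) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Tweet t : tweets) counts.merge(t.id, 1, Integer::sum);
        return counts;
    }

    /** Stores tweets keyed by id: a later tweet replaces an earlier one with the same id. */
    static Map<String, Tweet> storeById(List<Tweet> tweets) {
        Map<String, Tweet> store = new LinkedHashMap<>();
        for (Tweet t : tweets) store.put(t.id, t);
        return store;
    }

    /** Made-up sample data with one duplicated id. */
    static List<Tweet> sample() {
        List<Tweet> tweets = new ArrayList<>();
        tweets.add(new Tweet("101", "first"));
        tweets.add(new Tweet("102", "second"));
        tweets.add(new Tweet("101", "first, fetched again"));
        return tweets;
    }

    public static void main(String[] args) {
        List<Tweet> tweets = sample();
        System.out.println(countById(tweets));          // {101=2, 102=1}
        System.out.println(storeById(tweets).keySet()); // [101, 102]
    }
}
```

Keying on the id at insertion time means duplicates never accumulate, which is why it tends to scale better than counting and deleting afterwards.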
Re: [basex-talk] how to count and remove "entities"
I think distinct-result is helpful here:

https://stackoverflow.com/q/60051384/262852

as is count. How would I pipe the result from the set of distinct-result to a count? If the count is greater than 1, I could delete that tweet.

Just thinking out loud. Is that reasonable? Or might I be re-inventing the wheel here?

On 2020-02-03 10:41 p.m., thufir wrote:
> Not sure of the correct lingo, but I'm building a database of tweets.
> As I run it, duplicate tweets are added to the database. I can see the
> duplicates with:
>
> for $tweets in db:open("twitter")
> return {$tweets/json/id__str}
>
> Firstly, how would I select the json node for a duplicate entity. But,
> before even selecting that node, recursively look to see if there's more
> than one result for that id__str value.
>
> How would I even generate a count of each occurrence for the data of a
> specific id__str?
>
> thanks,
>
> Thufir
[basex-talk] how to count and remove "entities"
Not sure of the correct lingo, but I'm building a database of tweets. As I run it, duplicate tweets are added to the database. I can see the duplicates with:

  for $tweets in db:open("twitter")
  return {$tweets/json/id__str}

Firstly, how would I select the json node for a duplicate entity. But, before even selecting that node, recursively look to see if there's more than one result for that id__str value.

How would I even generate a count of each occurrence for the data of a specific id__str?

thanks,

Thufir
Re: [basex-talk] Add command: name of the input will be set as path?
I got it to work in a very kludgy way:

  new Open(databaseName).execute(context);
  for (int i = 0; i < tweets.length(); i++) {
    jsonStringTweet = tweets.get(i).toString();
    jsonObjectTweet = new org.json.JSONObject(jsonStringTweet);
    stringXml = XML.toString(jsonObjectTweet);
    stringXml = wrap(stringXml);
    write(stringXml, fileName);
    String stringFromFile = read(fileName);
    log.fine(stringFromFile);
    new Add(fileName, stringXml).execute(context);
  }

but there I'm passing the fileName -- certainly I can just pass stringXml by itself somehow?

see also:

https://stackoverflow.com/a/60047738/262852

thanks,

Thufir

On 2020-02-03 1:42 p.m., Christian Grün wrote:
>> In this case there's no path argument, but there is an input argument of
>> stringXml. Is that how to pass a String to Add()?
>
> There are various ways; one is as follows:
>
>   String json = "{ \"A\": 123 }";
>   Context ctx = new Context();
>   new CreateDB("test").execute(ctx);
>   new Set("parser", "json").execute(ctx);
>   Command add = new Add("json.xml");
>   add.setInput(new ArrayInput(json));
>   add.execute(ctx);
>   System.out.println(new XQuery(".").execute(ctx));
>
> On Mon, Feb 3, 2020 at 10:16 PM thufir wrote:
>>
>> On 2020-02-03 6:46 a.m., Christian Grün wrote:
>>>> What does it mean that "if null, the name of input will be set as the
>>>> path"?
>>>
>>> If your path argument points to a directory or a single file, and if
>>> you specify no argument for the input variable, the filenames
>>> resulting from your first argument will be adopted as database paths.
>>>
>>> If you run the command "ADD myfile.xml", the input argument will be
>>> null. If you run "ADD TO /db/path myfile.xml", input will be
>>> "/db/path".
>>
>> Right, but I'm not looking to run the command "ADD myfile.xml" from the
>> console but rather:
>>
>> new Add(null, stringXml).execute(context);
>>
>> In this case there's no path argument, but there is an input argument of
>> stringXml. Is that how to pass a String to Add()?
>>
>> thanks,
>>
>> Thufir
Re: [basex-talk] Add command: name of the input will be set as path?
> In this case there's no path argument, but there is an input argument of stringXml. Is that how to pass a String to Add()?

There are various ways; one is as follows:

  String json = "{ \"A\": 123 }";
  Context ctx = new Context();
  new CreateDB("test").execute(ctx);
  new Set("parser", "json").execute(ctx);
  Command add = new Add("json.xml");
  add.setInput(new ArrayInput(json));
  add.execute(ctx);
  System.out.println(new XQuery(".").execute(ctx));

On Mon, Feb 3, 2020 at 10:16 PM thufir wrote:
>
> On 2020-02-03 6:46 a.m., Christian Grün wrote:
> >> What does it mean that "if null, the name of input will be set as the
> >> path"?
> >
> > If your path argument points to a directory or a single file, and if
> > you specify no argument for the input variable, the filenames
> > resulting from your first argument will be adopted as database paths.
> >
> > If you run the command "ADD myfile.xml", the input argument will be
> > null. If you run "ADD TO /db/path myfile.xml", input will be
> > "/db/path".
>
> Right, but I'm not looking to run the command "ADD myfile.xml" from the
> console but rather:
>
> new Add(null, stringXml).execute(context);
>
> In this case there's no path argument, but there is an input argument of
> stringXml. Is that how to pass a String to Add()?
>
> thanks,
>
> Thufir
Re: [basex-talk] Add command: name of the input will be set as path?
On 2020-02-03 6:46 a.m., Christian Grün wrote:
>> What does it mean that "if null, the name of input will be set as the
>> path"?
>
> If your path argument points to a directory or a single file, and if
> you specify no argument for the input variable, the filenames
> resulting from your first argument will be adopted as database paths.
>
> If you run the command "ADD myfile.xml", the input argument will be
> null. If you run "ADD TO /db/path myfile.xml", input will be
> "/db/path".

Right, but I'm not looking to run the command "ADD myfile.xml" from the console but rather:

  new Add(null, stringXml).execute(context);

In this case there's no path argument, but there is an input argument of stringXml. Is that how to pass a String to Add()?

thanks,

Thufir
Re: [basex-talk] convert JSON to XML to add to database
Is this what you're referring to?

  Command: SET PARSER json
  Command: CREATE DB tweet /home/thufir/json/tweet.json
  Result: Database 'tweet' created in 166.11 ms.

Which, yes, is exactly the sequence which I'm looking to capture or replicate -- but not from a file as above. It's more the usage of "Add" to add a string. I've converted the JSON to XML, so that rather than tweet.json I have tweet.xml for convenience.

Using either ADD or CREATE is my goal -- but not with files. Trying to use Strings.

thanks,

Thufir

On 2020-02-03 6:40 a.m., Christian Grün wrote:
>> How is JSON converted to XML in order to ADD to a database?
>>
>> JSONObject jsonTweet = tweets.getJSONObject(Long.toString(id));
>> xmlStringTweet = XML.toString(jsonTweet);
>
> Do you know how to create a database and add documents as JSON via the
> BaseX GUI? If yes, you can enable the InfoView panel, and you will see
> the commands that are called in the background. In the next step, you
> can call these commands with Java.
>
> See [1] for the available BaseX options, and see [2] for an example
> that assigns an option via the SET command.
>
> [1] http://docs.basex.org/wiki/Options
> [2] https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/org/basex/examples/local/CreateCollection.java
Re: [basex-talk] filtering NaN from a sequence
On Mon, Feb 03, 2020 at 03:24:48PM +0100, Christian Grün wrote:
> > > for $value in $xmlReport/csv/record/Payment_Amount
> > > where $value castable as xs:double
> > > return xs:double($value)
> >
> > That errors out!
> > [XPTY0004] Cannot convert element()* to xs:double+:
> > $xmlReport_1/element(csv)/element(record)/element(Payment_Amount)[.
> > castable as xs:double].
>
> Did you get this error message for the suggested "for" clause, or a let
> clause?

The type is on a let clause that derives its value from a for:

  let $made as xs:double+ := for $value in $xmlReport/csv/record/Payment_Amount
    where $value castable as xs:double
    return $value

> The XQuery Pandora's box provides a lot of type conversions that all
> work slightly differently: If you specify a type after the let
> clause, it is (close to) identical to the "treat as" expression.
> Treating values as other values won’t trigger explicit casts; that
> is, your element nodes won’t be converted to doubles.

I have learned something! Thank you, that makes it make sense.

> However, if you specify types in functions, …
>
> declare function local:bla($made as xs:double+) { ... }
>
> …the values will be "promoted" to the specific type (and this is
> similar to casts).

And now I have learned something else. :)

That's very helpful; much appreciated.

-- Graydon
Re: [basex-talk] Add command: name of the input will be set as path?
> What does it mean that "if null, the name of input will be set as the path"?

If your path argument points to a directory or a single file, and if you specify no argument for the input variable, the filenames resulting from your first argument will be adopted as database paths.

If you run the command "ADD myfile.xml", the input argument will be null. If you run "ADD TO /db/path myfile.xml", input will be "/db/path".
Re: [basex-talk] convert JSON to XML to add to database
> How is JSON converted to XML in order to ADD to a database?
>
> JSONObject jsonTweet = tweets.getJSONObject(Long.toString(id));
> xmlStringTweet = XML.toString(jsonTweet);

Do you know how to create a database and add documents as JSON via the BaseX GUI? If yes, you can enable the InfoView panel, and you will see the commands that are called in the background. In the next step, you can call these commands with Java.

See [1] for the available BaseX options, and see [2] for an example that assigns an option via the SET command.

[1] http://docs.basex.org/wiki/Options
[2] https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/org/basex/examples/local/CreateCollection.java
Re: [basex-talk] JSON to XML conversion
> public void transform(String fileName) throws IOException {
>     String content = new String(Files.readAllBytes(Paths.get(fileName)), StandardCharsets.UTF_8);
>     org.json.JSONObject json = new org.json.JSONObject(content);
>     log.info(org.json.XML.toString(json));
> }

What you seem to want to achieve is:

1. Open a JSON file as a string;
2. Convert this string to a JSON object;
3. Write this JSON object as XML to a log output (?)

This would be the XQuery way to do it:

  let $content := file:read-text('x.json')
  let $json := json:parse($content)
  return admin:write-log($json)

If you address the BaseX Java code, you can work with different abstraction levels. Maybe it’s already sufficient if you evaluate the XQuery string above as a command:

  Context ctx = new Context();
  String query = "let $content...";
  XQuery cmd = new XQuery(query);
  System.out.println(cmd.execute(ctx));
Re: [basex-talk] filtering NaN from a sequence
> > for $value in $xmlReport/csv/record/Payment_Amount
> > where $value castable as xs:double
> > return xs:double($value)
>
> That errors out!
> [XPTY0004] Cannot convert element()* to xs:double+:
> $xmlReport_1/element(csv)/element(record)/element(Payment_Amount)[. castable
> as xs:double].

Did you get this error message for the suggested "for" clause, or a let clause?

> I conclude from this that NaN is castable as xs:double which surprised
> me when I first tried something like this, but which does make sense in
> as much as NaN has to be pseudo-numeric.

Exactly: NaN is a valid double value (as are INF and -INF).

> let $made as xs:double+ := for $value in $xmlReport/csv/record/Payment_Amount
> where $value castable as xs:double
> return $value
>
> doesn't strike me as obviously wrongly typed on $made. I'd expect that
> to fail without the where clause but to be OK with it.

The XQuery Pandora's box provides a lot of type conversions that all work slightly differently: If you specify a type after the let clause, it is (close to) identical to the "treat as" expression. Treating values as other values won’t trigger explicit casts; that is, your element nodes won’t be converted to doubles.

However, if you specify types in functions, …

  declare function local:bla($made as xs:double+) { ... }

…the values will be "promoted" to the specific type (and this is similar to casts).
[basex-talk] No difference for output from 'FULL' or 'RESULTS'
Hi,

As far as I can remember when using early versions from my client-software, the main difference in output after sending \04 or \1F to the database, was that in the latter case the output was preceded with XDM Meta data.

  # Full
  query_txt <- "for $i in 1 to 2 return Text { $i }"
  query_obj <- Query(Session, query_txt)
  result <- Full(query_obj)

resulted in:

  "0b" "Text 1" "0b" "Text 2"

  # Iterate over query
  query2 <- "for $i in 3 to 4 return Iter { $i }"
  query_iterate <- Query(Session, query2) # <== Alternative call to query-object
  while (More(query_iterate)) {
    cat(Next(query_iterate), "\n")
  }

resulted in:

  Iter 3
  Iter 4

Now, iterating over the same query gives:

  0b
  Iter 3
  0b
  Iter 4

Did something change in the client/server protocol or did I introduce an error somewhere?

Ben
Re: [basex-talk] filtering NaN from a sequence
On Mon, Feb 03, 2020 at 02:09:03PM +0100, Christian Grün wrote:
> Martin’s suggestion is indeed the cleanest solution I can see.

Thank you!

> A curious side note regarding your approach:
>
> > where not($value = number('NaN'))
>
> Comparisons with NaN doubles always yield false, no matter if you use
> XQuery, Java or other languages:
>
> let $d := xs:double('NaN')
> return $d = $d

Well, then I've learned at least one new thing today!

Thank you!

-- Graydon
Re: [basex-talk] filtering NaN from a sequence
On Mon, Feb 03, 2020 at 08:27:09AM +0100, Martin Honnen wrote:
> On 03.02.2020 at 01:22, Graydon Saunders wrote:
> > for $value in $xmlReport/csv/record/Payment_Amount/number()
> > where ???
> > return $value
>
> Can you live with
>
> for $value in $xmlReport/csv/record/Payment_Amount
> where $value castable as xs:double
> return xs:double($value)

That errors out!

  [XPTY0004] Cannot convert element()* to xs:double+:
  $xmlReport_1/element(csv)/element(record)/element(Payment_Amount)[. castable
  as xs:double].

If I do that with /number() at the end of the XPath

  for $value in $xmlReport/csv/record/Payment_Amount/number()

I get "NaN" as the overall result.

I conclude from this that NaN is castable as xs:double which surprised me when I first tried something like this, but which does make sense in as much as NaN has to be pseudo-numeric.

If I take the type off the variable:

  let $made := for $value in $xmlReport/csv/record/Payment_Amount

instead of

  let $made as xs:double+ := for $value in $xmlReport/csv/record/Payment_Amount

then it works. Which really surprised me because the whole statement should return a sequence of doubles:

  let $made as xs:double+ := for $value in $xmlReport/csv/record/Payment_Amount
    where $value castable as xs:double
    return $value

doesn't strike me as obviously wrongly typed on $made. I'd expect that to fail without the where clause but to be OK with it.

Thanks!
Graydon
Re: [basex-talk] filtering NaN from a sequence
Martin’s suggestion is indeed the cleanest solution I can see.

A curious side note regarding your approach:

> where not($value = number('NaN'))

Comparisons with NaN doubles always yield false, no matter if you use XQuery, Java or other languages:

  let $d := xs:double('NaN')
  return $d = $d

Best,
Christian

On Mon, Feb 3, 2020 at 2:14 AM Graydon Saunders wrote:
>
> Hi Bridger
>
> functx:is-a-number does indeed work, but its guts are
>
> string(number($value)) != 'NaN'
>
> Which seems improper somehow; it's relying on knowing the string that
> corresponds to the conceptual NaN result.
>
> I may be looking for more elegance than I can plausibly expect, here. :)
>
> Thanks!
> Graydon
>
> On Sun, Feb 2, 2020 at 8:07 PM Bridger Dyson-Smith
> wrote:
>>
>> Hi Graydon,
>> I'm mobile at the moment, so please excuse the abbreviated reply. Would
>> functx:is-a-number() [#1] work in your where clause?
>>
>> I'm completely unable to test... apologies.
>>
>> Best,
>> Bridger
>>
>> #1 http://www.xqueryfunctions.com/xq/functx_is-a-number.html
>>
>> On Sun, Feb 2, 2020, 7:22 PM Graydon Saunders wrote:
>>>
>>> Hello all --
>>>
>>> So I have a CSV file, and I can pull that into BaseX in the hopes of
>>> writing a query to extract a report. I'm using 9.3.1 for the purpose.
>>>
>>> Not all of the Payment_Amount fields have a value, so any report-extracting
>>> query has to filter those out of any calculations or the whole thing gets
>>> infested with NaN.
>>>
>>> This works:
>>> let $xmlReport as document-node(element(csv)) :=
>>> file:read-text('report.csv') => csv:parse( map { 'header': true(),
>>> 'separator' : 'tab' })
>>>
>>> let $made as xs:double+ := for $value in
>>> $xmlReport/csv/record/Payment_Amount[text() castable as xs:double]/number()
>>> return $value
>>>
>>> return sum($made) => round(2)
>>>
>>> If I wanted to use a where clause,
>>>
>>> let $xmlReport as document-node(element(csv)) :=
>>> file:read-text('report.csv') => csv:parse( map { 'header': true(),
>>> 'separator' : 'tab' })
>>>
>>> let $made as xs:double+ := for $value in
>>> $xmlReport/csv/record/Payment_Amount/number()
>>> where ???
>>> return $value
>>>
>>> return sum($made) => round(2)
>>>
>>> What do I put in the where clause? I tried
>>> where not($value = NaN)
>>> and that was not successful:
>>> "Stopped at /home/graydon/git/writing/transform/urk.xq, 6/25:
>>> [XPTY0020] element(NaN): node expected, xs:double found: 3.38."
>>>
>>> where not($value = number('NaN'))
>>>
>>> didn't give an error but the query returns NaN so I know I didn't filter
>>> any of the empty records from the sum.
>>>
>>> How ought that where clause be written?
>>>
>>> Thanks!
>>> Graydon
>>>
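
Since the reply above mentions that the NaN comparison behaves the same way in Java, here is a small Java sketch of the same pitfall and of a filter that does work. It is an analogy to the XQuery discussion, not a translation of it: the class `NaNFilterSketch`, the method names and the sample field values are made up for illustration. The point is that an equality test against NaN is always false, so a dedicated check (`Double.isNaN` in Java) is needed, and that unparsable fields are best dropped before any arithmetic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mirrors the NaN-filtering problem with made-up field values. */
public class NaNFilterSketch {

    /** Keeps only fields that parse as a double and are not NaN. */
    static List<Double> parseNumbers(List<String> fields) {
        List<Double> out = new ArrayList<>();
        for (String field : fields) {
            double d;
            try {
                d = Double.parseDouble(field.trim());
            } catch (NumberFormatException e) {
                continue; // empty or non-numeric field: skip it
            }
            // "d == Double.NaN" would never be true, so test with isNaN instead.
            if (!Double.isNaN(d)) out.add(d);
        }
        return out;
    }

    static double sum(List<Double> values) {
        double total = 0.0;
        for (double v : values) total += v;
        return total;
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        System.out.println(nan == nan);        // false
        System.out.println(Double.isNaN(nan)); // true

        List<String> fields = Arrays.asList("3.38", "", "NaN", "1.5");
        System.out.println(parseNumbers(fields).size()); // 2
    }
}
```

Note that the string "NaN" parses successfully in Java, much as NaN is castable as xs:double in XQuery, which is why the explicit `isNaN` check is still needed after parsing.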
[basex-talk] Finalizing Query-Objects
Hi,

The people from CRAN strongly suggested to add tests (comparable to Unit-tests) to my package (RBaseX). Their request led me to take another critical look at my code.

So far the tests do not give an error message. But after completing the last test, 'testthat' reports 1 failure without further explanation. After changing the order in which the tests are executed, the failure is always caused by the last test. Therefore I think that it is not the tests that cause an error, but the finalize-process.

At this moment, my code is based upon 3 classes: 'RBaseXClient' creates a new client-session. This session uses 'SocketClass' to communicate with basexserver. When used in query-mode, the session uses 'QueryClass' to create new query-objects. Due to this architecture, it is easy to explicitly close a regular query-object, but (at least in R) it is difficult to close query-objects when finalizing the session-object.

How does the basexserver respond to closing the session without first explicitly closing all open queries? Does this result in an error?

Ben
[basex-talk] Add command: name of the input will be set as path?
What does it mean that "if null, the name of input will be set as the path"?

Javadoc:

  Add

  public Add(java.lang.String path, java.lang.String input)

  Constructor, specifying a target path and an input.

  Parameters:
    path - target path, optionally terminated by a new file name. If null, the name of the input will be set as path.
    input - input file or XML string

I'm looking to add an xml file, so am using "null" for the path:

https://stackoverflow.com/q/60035605/262852

but what are the implications? the "name of the input" will be "set as path"? Where is the "name of the input"? What is "path" in relation to a String which exists only in memory?

Just pass a string like:

  new Add(null, stringXml).execute(context);

and that should add to the currently open database?

thanks,

Thufir