Re: [basex-talk] Setting multiple CSV EXPORTER options is not documented on docs.basex.org

2024-10-03 Thread Christian Grün
Hi Naval,

> BaseX XQuery CSV module is for dealing with CSV files which we don't have.

With this module, you can serialize XML data as CSV (and parse CSV back to
XML). But that’s certainly just one way to do it.
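
A minimal sketch of the serialization direction, assuming the CSV Module's
csv:serialize function and its header option. The sample records and the
element names Name and Year are hypothetical:

```xquery
(: Hypothetical sample data; with 'header', the element names of the
   entries are used for the header row :)
let $plans :=
  <csv>
    <record><Name>Plan A</Name><Year>2024</Year></record>
    <record><Name>Plan B</Name><Year>2025</Year></record>
  </csv>
return csv:serialize($plans, map { 'header': true() })
```

The input has to follow the record/entry structure the module expects, so
StratML documents would first need to be mapped to such records.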

Hi Owen,

> However, I wonder how BaseX might deal with more than a million records.

Feel free to get back to us once you encounter any limits.

Best,
Christian



Naval Sarda wrote on Thu, Oct 3, 2024, 14:45:

> Hi Owen, Christian,
>
> We are working on StratML files, which are in XML format. The BaseX XQuery
> CSV module is for dealing with CSV files, which we don't have.
>
> We can scale up BaseX by adding more servers and splitting the data
> between them if the record count gets high.
>
> Naval
>
>
> On 02/10/24 10:54 pm, Owen Ambur wrote:
>
> Naval, can you answer Christian's question?
>
> My sense is that a lot of features have been built into BaseX that we
> could use but I don't have the knowledge or expertise to understand how
> best to do so.  At least, I hope we're not reinventing capabilities that
> are already available to address our technical objectives
> <https://aboutthem.info/SQS.xml> as well as our broader conceptual
> objectives <https://aboutthem.info/ATI.xml>.
>
> Christian, this morning we had a very encouraging Zoom meeting with folks
> at GSA who are finally getting around to figuring out how to help Uncle
> Sam's agencies comply with section 10
> <https://www.linkedin.com/pulse/trustworthy-institutions-owen-ambur/> of
> the GPRA Modernization Act.  I encouraged them not to reinvent the StratML
> schemas and it would be great if the BaseX community could help us
> demonstrate the benefits of using them, not only by U.S. federal agencies
> but agencies at all levels of government, worldwide, as well as tax-favored
> organizations and others whose plans and reports should be matters of
> public record.
>
> There are >5.8K plans in the StratML collection
> <https://stratml.us/drybridge/index.htm> and indexed in the query
> service.  Their URLs are listed in sitemap format at
> https://stratml.us/docs/sitemap.xml
>
> I am contemplating whether to hire someone to convert the relevant
> elements of the IRS Form 990 database
> <https://www.irs.gov/charities-non-profits/tax-exempt-organization-search-bulk-data-downloads>
> to StratML format, using the templates at
> https://stratml.us/drybridge/index.htm#MPR4CO  However, I wonder how
> BaseX might deal with more than a million records.
>
> Owen Ambur
> https://www.linkedin.com/in/owenambur/
>
>
> On Wednesday, October 2, 2024 at 12:18:16 PM EDT, Christian Grün
>   wrote:
>
>
> Hi Owen,
>
> Is it based on the BaseX XQuery CSV Module?
>
> Best,
> Christian
>
>
>
>
> Owen Ambur wrote on Wed, Oct 2, 2024, 18:12:
>
> I'm not sure how this exchange might relate but my developer has provided
> CSV export capabilities for query results listings at StratML
> <https://search.aboutthem.info/>
>
> Any comments or suggestions on how we might enhance the functionality of
> our StratML-enabled query service would be most welcome.
>
> Owen Ambur
> https://www.linkedin.com/in/owenambur/
>
>
> On Tuesday, October 1, 2024 at 04:42:12 AM EDT, Christian Grün <
> christian.gr...@gmail.com> wrote:
>
>
> Hi Omar,
>
> Thanks for the observation. For the SERIALIZER option, there was a
> corresponding note which I have added to the EXPORTER option, along with
> a little example for exporting CSV [1].
>
> Best,
> Christian
>
> [1] https://docs.basex.org/main/Options#exporter
>
>
>
> On Mon, Sep 30, 2024 at 4:04 PM Omar Siam  wrote:
>
> Hi,
>
> I just tried after some time to script a CSV export with a BXS file. I
> forgot how to set multiple options for the csv serialization. In the end
> this worked:
> <set option="EXPORTER">method=csv,csv=header=true,,lax=false,,quotes=true</set>
> I cannot find it on docs.basex.org. Perhaps this should be added and
> explained.
>
> Best regards
>
> --
> Mag. Ing. Omar Siam
> Austrian Center for Digital Humanities and Cultural Heritage
> Österreichische Akademie der Wissenschaften | Austrian Academy of Sciences
> Stellvertretende Behindertenvertrauensperson | Deputy representative for
> disabled persons
> Bäckerstraße 13, 1010 Wien, Österreich | Vienna, Austria
> T: +43 1 51581-7295
> omar.s...@oeaw.ac.at | www.oeaw.ac.at/acdh
>
>


Re: [basex-talk] Setting multiple CSV EXPORTER options is not documented on docs.basex.org

2024-10-02 Thread Christian Grün
Hi Owen,

Is it based on the BaseX XQuery CSV Module?

Best,
Christian




Owen Ambur wrote on Wed, Oct 2, 2024, 18:12:

> I'm not sure how this exchange might relate but my developer has provided
> CSV export capabilities for query results listings at
> https://search.aboutthem.info/
>
> Any comments or suggestions on how we might enhance the functionality of
> our StratML-enabled query service would be most welcome.
>
> Owen Ambur
> https://www.linkedin.com/in/owenambur/
>
>
> On Tuesday, October 1, 2024 at 04:42:12 AM EDT, Christian Grün <
> christian.gr...@gmail.com> wrote:
>
>
> Hi Omar,
>
> Thanks for the observation. For the SERIALIZER option, there was a
> corresponding note which I have added to the EXPORTER option, along with
> a little example for exporting CSV [1].
>
> Best,
> Christian
>
> [1] https://docs.basex.org/main/Options#exporter
>
>
>
> On Mon, Sep 30, 2024 at 4:04 PM Omar Siam  wrote:
>
> Hi,
>
> I just tried after some time to script a CSV export with a BXS file. I
> forgot how to set multiple options for the csv serialization. In the end
> this worked:
> <set option="EXPORTER">method=csv,csv=header=true,,lax=false,,quotes=true</set>
> I cannot find it on docs.basex.org. Perhaps this should be added and
> explained.
>
> Best regards
>
> --
> Mag. Ing. Omar Siam
> Austrian Center for Digital Humanities and Cultural Heritage
> Österreichische Akademie der Wissenschaften | Austrian Academy of Sciences
> Stellvertretende Behindertenvertrauensperson | Deputy representative for
> disabled persons
> Bäckerstraße 13, 1010 Wien, Österreich | Vienna, Austria
> T: +43 1 51581-7295
> omar.s...@oeaw.ac.at | www.oeaw.ac.at/acdh
>
>


Re: [basex-talk] Setting multiple CSV EXPORTER options is not documented on docs.basex.org

2024-10-01 Thread Christian Grün
Hi Omar,

Thanks for the observation. For the SERIALIZER option, there was a
corresponding note which I have added to the EXPORTER option, along with
a little example for exporting CSV [1].
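
For comparison, a sketch of the same CSV settings declared in a query prolog
instead of the EXPORTER option. It assumes the output:method and output:csv
serialization declarations and uses yes/no for the boolean values; the sample
records are hypothetical:

```xquery
(: Serialization parameters declared for this query only :)
declare option output:method 'csv';
(: The CSV sub-options are not nested in another option value here,
   so single commas are used :)
declare option output:csv 'header=yes,lax=no,quotes=yes';

<csv>
  <record><Name>Plan A</Name><Year>2024</Year></record>
  <record><Name>Plan B</Name><Year>2025</Year></record>
</csv>
```

In the EXPORTER value, the whole list is one option string, so the commas
between the CSV sub-options are doubled to tell them apart from the commas
that separate the serialization parameters.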

Best,
Christian

[1] https://docs.basex.org/main/Options#exporter



On Mon, Sep 30, 2024 at 4:04 PM Omar Siam  wrote:

> Hi,
>
> I just tried after some time to script a CSV export with a BXS file. I
> forgot how to set multiple options for the csv serialization. In the end
> this worked:
> <set option="EXPORTER">method=csv,csv=header=true,,lax=false,,quotes=true</set>
> I cannot find it on docs.basex.org. Perhaps this should be added and
> explained.
>
> Best regards
>
> --
> Mag. Ing. Omar Siam
> Austrian Center for Digital Humanities and Cultural Heritage
> Österreichische Akademie der Wissenschaften | Austrian Academy of Sciences
> Stellvertretende Behindertenvertrauensperson | Deputy representative for
> disabled persons
> Bäckerstraße 13, 1010 Wien, Österreich | Vienna, Austria
> T: +43 1 51581-7295
> omar.s...@oeaw.ac.at | www.oeaw.ac.at/acdh
>
>


Re: [basex-talk] Equivalent of eXist's util:wait in BaseX?

2024-09-27 Thread Christian Grün
Hi Joe,

prof:sleep should do the job [1].
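
A minimal sketch of a 5-second pause between requests, assuming prof:sleep
takes the delay in milliseconds. The URLs are placeholders:

```xquery
(: Hypothetical URLs; replace them with the real list :)
let $urls := ('https://example.org/a.xml', 'https://example.org/b.xml')
for $url at $pos in $urls
return (
  (: wait 5000 ms before every request except the first one :)
  if ($pos > 1) then prof:sleep(5000) else (),
  http:send-request(<http:request method='get'/>, $url)
)
```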

Hope this helps,
Christian

[1] https://docs.basex.org/main/Profiling_Functions#prof:sleep



Joe Wicentowski wrote on Fri, Sep 27, 2024, 16:38:

> Hi all,
>
> I'm retrieving content from a list of URLs with BaseX (via the
> http:send-request function) and appear to be hitting rate limits on the
> remote server. I'd like to pause for 5 seconds between requests. In eXist,
> I'd use util:wait [1] to handle this pause. Is there an equivalent in
> BaseX? Or is there a different approach in BaseX?
>
> Thanks!
> Joe
>
> [1]
> https://exist-db.org/exist/apps/fundocs/view.html?uri=http://exist-db.org/xquery/util#wait.1
>


[basex-talk] BaseX 11.3: Performance and Patch Release

2024-09-19 Thread Christian Grün
The next BaseX 11 performance and patch release is available. This is what
you get:

 - [ADD] New XQuery 4 features
 - [ADD] Options: LOG, support for stdout/stderr/slf4j targets
 - [ADD] XQuery: focus expression (A -> B; experimental)
 - [MOD] XQuery, processing single chars: reduced memory consumption
 - [MOD] XQuery, db:node-id, db:node-pre: faster retrieval
 - [FIX] XQuery: self-recursiveness of functions and variables revised
 - [FIX] XQuery: inspect:function has become more robust

By the way, we had not particularly announced the last patch release:

https://basex.org/2024/08/15/basex-11.2/

For a preview of the upcoming 12 features, visit the following page:

https://docs.basex.org/12

Visit our homepage for general information and downloads:

https://basex.org

Have fun,
Your BaseX Team



Re: [basex-talk] Error with function-lookup and static variable

2024-09-19 Thread Christian Grün
> If I use $x * $factorial($x - 1)

True, it should have been multiplication; thanks.

> I get an out-of-range error as of $factorial(21).

Integers in BaseX are limited to 64 bits (that’s something the spec allows).
You can use decimals to work with greater numbers:

declare variable $factorial := fn($x) {
  if($x > 1) then $x * $factorial($x - 1) else $x
};
$factorial(1000.0)

If you need even larger results, you can use a tail-call-optimized variant…

declare variable $factorial := fn($x, $result) {
  if($x > 1) then $factorial($x - 1, $x * $result) else $result
};
$factorial(10.0, 1)

…or fold-left:

fold-left(1 to 10, 1.0, op('*'))
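
As a sketch of where the 64-bit limit sits: 20! is the largest factorial
that still fits into an integer, and multiplying by an integer 21 would be
expected to raise the out-of-range error. Switching the last factor to a
decimal avoids it:

```xquery
(: 20! = 2432902008176640000 still fits into a 64-bit integer :)
let $f20 := fold-left(1 to 20, 1, op('*'))
return (
  $f20,
  (: integer * decimal yields xs:decimal, so 21! can be represented :)
  $f20 * 21.0
)
```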

Hope this helps,
Christian


Re: [basex-talk] Error with function-lookup and static variable

2024-09-19 Thread Christian Grün
Hi Liam,

> Hmm, what if there's a $factorial in scope already? (this is a
> technique I teach for taking advantage of closures to hide data)


Variables that are declared more than once in XQuery will continue to be
rejected with the error code XQST0049 (duplicate declaration of static
variable):

declare variable $a := 123;
declare variable $a := 'whatever';

Best,
Christian


Re: [basex-talk] Error with function-lookup and static variable

2024-09-18 Thread Christian Grün
Hi Amanda,

The last query you reported back to us is now evaluated successfully [1].

In addition, XQuery 4 will allow all of us to write self-referencing
variable declarations (it’s already supported by the latest snapshot):

declare variable $factorial := fn($x) {
  if($x > 1) then $x + $factorial($x - 1) else $x
};
$factorial(5)

All the best,
Christian

[1] https://github.com/BaseXdb/basex/issues/2324#issuecomment-2288790599



On Wed, Aug 14, 2024 at 3:43 PM Christian Grün 
wrote:

> …thanks. I already guessed it wasn’t that easy ;) I’ve added it to [1].
>
> In general, I hope we can get rid of self-dependency checks completely.
> They were only defined for variables, not for functions, and I cannot see
> why we still need them today. We are currently discussing this topic for
> version 4.0 [2].
>
> Best,
> Christian
>
> [1] https://github.com/BaseXdb/basex/issues/2324
> [2] https://github.com/qt4cg/qtspecs/issues/1379
>
>
>
> On Wed, Aug 14, 2024 at 1:02 PM Amanda Galtman  wrote:
>
>> Christian, thanks very much.
>>
>> I returned to my actual code, and it works with the latest snapshot dated
>> today.
>>
>> By the way, I also retried the variations I had created when trying to
>> explore workarounds, and one of them still doesn't work with the latest
>> snapshot. It's not blocking me, but in case it is helpful, I reduced it to
>> another small query that reproduces the error with today's snapshot. I know
>> you said you fixed part of the problem, so you might not be surprised that
>> the following code still triggers the error.
>>
>> xquery version "3.1";
>>
>> declare variable $variant :=
>>   if (exists(function-lookup(QName('nonexistent','nonexistent'), 0)))
>>   then
>>     ( (: not relevant :) )
>>   else
>>     function-lookup(QName('http://www.w3.org/2005/xpath-functions','string'), 1);
>>
>> declare variable $variant-fcns := $variant;
>>
>> declare function local:fcn() {
>>   $variant-fcns('abc')
>> };
>>
>> local:fcn()
>>
>>
>> The levels of indirection (variable, variable, function, function call)
>> seem to be relevant for the problem. If I make the code more direct, it
>> works.
>>
>> Regards,
>> Amanda
>>
>> On Monday, August 12th, 2024 at 5:16 AM, Christian Grün <
>> christian.gr...@gmail.com> wrote:
>>
>> I managed to fix a part of the dependency problem. Your query should now
>> be executable with the latest snapshot [1].
>> – Christian
>>
>> [1] https://files.basex.org/releases/latest/
>>
>>
>>
>> On Fri, Aug 9, 2024 at 5:51 PM Amanda Galtman  wrote:
>>
>>> Hi, all.
>>>
>>> I'm seeing an error in BaseX when I use function-lookup in both a global
>>> variable and a function, where the function relies on the variable. I
>>> reduced the situation to the following small query:
>>>
>>> xquery version "3.1";
>>> declare variable $local:lookup :=
>>>   function-lookup(QName("nonexistent", "nonexistent"), 1);
>>> declare function local:myfcn() {
>>>   let $f := (
>>>     $local:lookup,
>>>     function-lookup(QName('http://www.w3.org/2005/xpath-functions','string'), 1)
>>>   )[1]
>>>   return $f('a')
>>> };
>>> local:myfcn()
>>>
>>> When I run it with BaseX 11.1, I get
>>> [XQDY0054] Static variable depends on itself: $local:lookup
>>>
>>> When I run it with Saxon-HE 12, I don't get this error.
>>>
>>> Is there anything I can do in my code to avoid this error?
>>>
>>> Thanks,
>>> Amanda
>>>
>>
>>


Re: [basex-talk] Unknown function db:open

2024-09-17 Thread Christian Grün
Dear Kendall,

with BaseX 10, db:open was replaced with db:get [1]. With version 11, the
deprecated function was removed.
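
Applied to the query from the original message, the replacement would look
as follows. This sketch assumes that the db prefix is bound by default, so no
module import should be needed:

```xquery
(: db:get replaces the removed db:open;
   'some' is the database name used in the question :)
count(db:get('some')/*)
```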

I cannot tell, though, why the prefix is said to be unknown. Could you
please share the exact error message with us?

Best,
Christian

[1] https://docs.basex.org/main/BaseX_10#functions



Kendall Shaw wrote on Tue, Sep 17, 2024, 22:26:

> I'm using BaseX 11.2 under WSL Ubuntu 24.04. When I use the command
> xquery count(db:open('some')/*), I see an error message about the db prefix
> being undefined. The same thing happens from an editor tab.
>
> import module namespace db = 'http://basex.org/modules/db';
>
> count(db:open('some')/*)
>
> I get the same error message.
>
> Is this to be expected? How can I supply what is needed?
>
> Thanks,
>
> Kendall Shaw
>
>
>


Re: [basex-talk] BaseX-Talk Digest, Vol 177, Issue 2

2024-09-12 Thread Christian Grün
> The app produces an HTML file of 600 MB which still has to be interpreted
> by the browser. Ouch, I probably need to reconsider my solution tactics :-))
> Thanks for your suggestion. If you have any others, I am fully open to them.

Good to know. My suggestion: Reduce your result size ;)


Re: [basex-talk] terrible performance of webserver under BaseX11

2024-09-10 Thread Christian Grün
(cc to the list)

For a more helpful response, we would certainly need to look at your code
and run it ourselves.

I recommend that you run your function…

$SWM:dialog.new

…outside your web application, e.g. in the BaseX GUI. This may help you to
isolate the problematic part of the code.


Re: [basex-talk] terrible performance of webserver under BaseX11

2024-09-10 Thread Christian Grün
Hi Rob,

You’ll probably need to give us more hints:

• How do you launch your web app? Do you use basexhttp, Jetty 11,
Tomcat, …?
• Do you use RESTXQ, the basic REST API or something else?
• Is it a specific endpoint that is slow, or is it your web app environment
in general?
• Can you reproduce the memory issue when directly calling your endpoint
code?

Christian


On Tue, Sep 10, 2024 at 3:31 PM Rob Stapper  wrote:

> Hi Christian,
>
> My web server app, which performed reasonably well under BaseX 10, performs
> terribly under BaseX 11. At the end of the process, Chrome even gives me an
> 'out of memory' message. So it is not only performance but also
> resource usage. I have absolutely no clue how to address this, so all
> suggestions are very welcome.
>
> Kind regards,
>
> Rob Stapper
>
>


Re: [basex-talk] Unexpected error: Improper use? Potential bug? Your feedback is welcome:

2024-09-03 Thread Christian Grün
Hi Rob,

Could you please provide us with some (ideally minimized) code to reproduce
the issue?

Thanks in advance,
Christian


On Tue, Sep 3, 2024 at 10:11 AM Rob Stapper  wrote:

> Hi Christian,
>
> After the holiday break I found the courage to try to port my application
> to BaseX version 11.2.
> I get the error dump shown below. Any idea what is going wrong here?
>
> error-dump:
> =
> Unexpected error: Improper use? Potential bug? Your feedback is welcome:
> Contact: basex-talk@mailman.uni-konstanz.de
> Version: BaseX 11.2
> Java: Eclipse Adoptium, 17.0.8
> OS: Windows 10, amd64
> Stack Trace:
> java.lang.ArrayIndexOutOfBoundsException: Index 3 out of bounds for length 3
> at org.basex.query.func.Closure.setSignature(Closure.java:418)
> at org.basex.query.func.StaticFuncs$FuncCache.init(StaticFuncs.java:268)
> at org.basex.query.func.StaticFuncs.check(StaticFuncs.java:83)
> at org.basex.query.QueryParser.check(QueryParser.java:270)
> at org.basex.query.QueryParser.parseLibrary(QueryParser.java:198)
> at org.basex.query.QueryContext.lambda$parseLibrary$1(QueryContext.java:236)
> at org.basex.query.QueryContext.run(QueryContext.java:763)
> at org.basex.query.QueryContext.parseLibrary(QueryContext.java:234)
> at org.basex.query.QueryContext.parse(QueryContext.java:178)
> at org.basex.http.web.WebModule.qc(WebModule.java:100)
> at org.basex.http.web.WebModule.parse(WebModule.java:57)
> at org.basex.http.web.WebModules.parse(WebModules.java:376)
> at org.basex.http.web.WebModules.parse(WebModules.java:367)
> at org.basex.http.web.WebModules.parse(WebModules.java:367)
> at org.basex.http.web.WebModules.cache(WebModules.java:336)
> at org.basex.http.web.WebModules.find(WebModules.java:149)
> at org.basex.http.web.WebModules.restxq(WebModules.java:116)
> at org.basex.http.restxq.RestXqServlet.run(RestXqServlet.java:44)
> at org.basex.http.BaseXServlet.service(BaseXServlet.java:69)
> at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
> at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1410)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:529)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553)
> at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
> at org.eclipse.jetty.server.Server.handle(Server.java:563)
> at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
> at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
> at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
> at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
> at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
> at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
> at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
> at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPoo

Re: [basex-talk] Alternatives to JTattoo in combination with BaseX GUI?

2024-08-31 Thread Christian Grün
Hi Daniel,

In principle, JTattoo still works with the latest version of BaseX and Java
11. However, we started assigning specific colors to some GUI components a
while ago, as it turned out to be too tedious to achieve a unified look and
feel (LAF) across various operating systems. As a result, only parts of the
dark HiFi theme are still applied. I imagine that the same problem
may occur with alternative LAFs. We could certainly tweak that (e.g. by
only using specific colors if the default LAF is used), but… Well, pull
requests are welcome ;)

Personally, I also work with a dark background on Windows. I ended up with
a lightweight tool that applies a color transformation to all pixels on the
screen. It’s called Negative Screen [1], and it allows you to freely define
the color transformation matrix.

Best,
Christian

[1] https://github.com/mlaily/NegativeScreen




On Thu, Aug 29, 2024 at 6:02 PM Zimmel, Daniel 
wrote:

> Hi,
>
> I noticed that JTattoo is no longer working properly with BaseX 11 (which
> is not a surprise because it looks like an abandoned project).
> I liked using it to force a dark theme on the GUI.
>
> Does anybody know of alternatives? Are there any?
>
> Thanks, Daniel
>


Re: [basex-talk] Basex java.lang.ArrayIndexOutOfBoundsException

2024-08-21 Thread Christian Grün
Hi Robert,

In fact, out-of-memory errors are always critical in Java. You may
need (well, may have needed) to adjust the memory that is used by the JVM
with the -Xmx flag [1]. What value is currently assigned?

Best,
Christian

[1] https://docs.basex.org/main/Start_Scripts


On Mon, Aug 19, 2024 at 3:37 PM Chavez, Robert 
wrote:

> Thank you for the suggestion, Christian.
>
> In the logs everything seems to be working just fine until this one call,
> which returns the error “Out of Main Memory”. However, looking at the
> system logs themselves, the system reports ample memory at that time. I’m
> not exactly sure why BaseX reported that error at that time.
>
> After that initial “Out of Main Memory” error, all requests (PUT requests,
> simple GUI requests) fail with either:
>
> java.lang.ArrayIndexOutOfBoundsException
>
> or
>
> java.lang.NegativeArraySizeException
>
> The only thing I can think of is that the system was reaching maximum
> hard disk capacity and maybe ran out of wiggle room at that point.
>
> 13:59:40.282  10.112.72.182:36826  basex-rest  REQUEST [PUT]
> /basex/rest/psc/jqa/jqadiaries-v31-1819-11-p194.xml
>
> 13:59:43.889  10.112.72.182:36826  basex-rest  500 Out of
> Main Memory.  3607.46 ms
>
> 13:59:45.358  10.112.72.182:36828  basex-rest  REQUEST [PUT]
> /basex/rest/psc/jqa/jqadiaries-v31-1819-09-p165.xml
>
> 13:59:45.746  10.112.72.182:36828  basex-rest  500 Improper
> use? Potential bug? Your feedback is welcome: Contact:
> basex-talk@mailman.uni-konstanz.de Version: BaseX 10.4 Java: Red Hat,
> Inc., 11.0.16 OS: Linux, amd64 Stack Trace: java.lang.RuntimeException:
> Free slot exceeds file offset: 49357094248 + 110461 > 49357197312 at
> org.basex.util.Uti
>
> 14:08:00.444  10.112.72.182:37070  basex-rest  500 Improper
> use? Potential bug? Your feedback is welcome: Contact:
> basex-talk@mailman.uni-konstanz.de Version: BaseX 10.4 Java: Red Hat,
> Inc., 11.0.16 OS: Linux, amd64 Stack Trace:
> java.lang.ArrayIndexOutOfBoundsException  9.96 ms
>
> 14:08:05.549  10.112.72.182:37074  basex-rest  REQUEST [PUT]
> /basex/rest/psc/jqa/jqadiaries-v51-1803-08-p001.xml
>
> 14:08:05.556  10.112.72.182:37074  basex-rest  500 Improper
> use? Potential bug? Your feedback is welcome: Contact:
> basex-talk@mailman.uni-konstanz.de Version: BaseX 10.4 Java: Red Hat,
> Inc., 11.0.16 OS: Linux, amd64 Stack Trace:
> java.lang.NegativeArraySizeException: -1236248751 at
> org.basex.io.random.DataAccess.readBytes(DataAccess.java:196) at
> org.basex.io.random.DataAccess.readToken(DataAccess.java:175) at
> org.basex.io.random.DataAccess.readToken(DataAccess.java:166) at
> org.basex.data.DiskData.txt(DiskData.java:306) at
> org.basex.data.DiskData.text(DiskData.java:273) at
> org.basex.index.value.DiskValues.key(DiskValues.java:430) at
> org.basex.index.value.DiskValues.indexEntry(DiskValues.java:330) at
> org.basex.index.value.DiskValues.get(DiskValues.java:206) at
> org.basex.index.value.UpdatableDiskValues.delete(UpdatableDiskValues.java:98)
> at org.basex.data.Data.indexDelete(Data.java:1083) at
> org.basex.data.Data.delete(Data.java:683) at
> org.basex.query.up.atomic.Replace.apply(Replace.java:57) at
> org.basex.query.up.atomic.AtomicUpdateCache.ap... 7.06 ms
>
> 14:08:10.590  10.112.72.182:370
>
>
> *From: *Christian Grün 
> *Date: *Friday, August 9, 2024 at 5:06 AM
> *To: *Chavez, Robert 
> *Cc: *basex-talk@mailman.uni-konstanz.de <
> basex-talk@mailman.uni-konstanz.de>
> *Subject: *Re: [basex-talk] Basex java.lang.ArrayIndexOutOfBoundsException
>
> Hi Robert,
>
>
>
> The size in the screenshot indicates that your input may have exceeded
> some limits for a single database instance [1]. Usually, an insert
> operation that would cause an overflow is rejected. Based on your database
> logs, can you trace back what was the last seemingly successful insert, and
> the first operation that raised the error message?
>
>
>
> With regard to the database optimization, does it make a difference if you
> choose "Full optimization"?
>
>
>
> Best,
>
> Christian
>
>
>
> [1] https://docs.basex.org/main/Statistics
>
>
>
>
>
> On Wed, Aug 7, 2024 at 11:07 PM Chavez, Robert 
> wrote:
>
> Greetings,
>
>
>
> We are suddenly seeing the following error message when using the REST API
> and also when using the Web GUI when trying to list the contents of a
> database by clicking on the database name and also when clicking on
> “optimize” (screenshot of the GUI attached).

Re: [basex-talk] Error with function-lookup and static variable

2024-08-14 Thread Christian Grün
…thanks. I already guessed it wasn’t that easy ;) I’ve added it to [1].

In general, I hope we can get rid of self-dependency checks completely.
They were only defined for variables, not for functions, and I cannot see
why we still need them today. We are currently discussing this topic for
version 4.0 [2].

Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/2324
[2] https://github.com/qt4cg/qtspecs/issues/1379



On Wed, Aug 14, 2024 at 1:02 PM Amanda Galtman  wrote:

> Christian, thanks very much.
>
> I returned to my actual code, and it works with the latest snapshot dated
> today.
>
> By the way, I also retried the variations I had created when trying to
> explore workarounds, and one of them still doesn't work with the latest
> snapshot. It's not blocking me, but in case it is helpful, I reduced it to
> another small query that reproduces the error with today's snapshot. I know
> you said you fixed part of the problem, so you might not be surprised that
> the following code still triggers the error.
>
> xquery version "3.1";
>
> declare variable $variant :=
>   if (exists(function-lookup(QName('nonexistent','nonexistent'), 0)))
>   then
>     ( (: not relevant :) )
>   else
>     function-lookup(QName('http://www.w3.org/2005/xpath-functions','string'), 1);
>
> declare variable $variant-fcns := $variant;
>
> declare function local:fcn() {
>   $variant-fcns('abc')
> };
>
> local:fcn()
>
>
> The levels of indirection (variable, variable, function, function call)
> seem to be relevant for the problem. If I make the code more direct, it
> works.
>
> Regards,
> Amanda
>
> On Monday, August 12th, 2024 at 5:16 AM, Christian Grün <
> christian.gr...@gmail.com> wrote:
>
> I managed to fix a part of the dependency problem. Your query should now
> be executable with the latest snapshot [1].
> – Christian
>
> [1] https://files.basex.org/releases/latest/
>
>
>
> On Fri, Aug 9, 2024 at 5:51 PM Amanda Galtman  wrote:
>
>> Hi, all.
>>
>> I'm seeing an error in BaseX when I use function-lookup in both a global
>> variable and a function, where the function relies on the variable. I
>> reduced the situation to the following small query:
>>
>> xquery version "3.1";
>> declare variable $local:lookup :=
>>   function-lookup(QName("nonexistent", "nonexistent"), 1);
>> declare function local:myfcn() {
>>   let $f := (
>>     $local:lookup,
>>     function-lookup(QName('http://www.w3.org/2005/xpath-functions','string'), 1)
>>   )[1]
>>   return $f('a')
>> };
>> local:myfcn()
>>
>> When I run it with BaseX 11.1, I get
>> [XQDY0054] Static variable depends on itself: $local:lookup
>>
>> When I run it with Saxon-HE 12, I don't get this error.
>>
>> Is there anything I can do in my code to avoid this error?
>>
>> Thanks,
>> Amanda
>>
>
>


Re: [basex-talk] recursively used variables

2024-08-12 Thread Christian Grün
Hi Rob, hi all,

We’ve recently addressed some ancient bugs regarding self-dependencies in
variable declarations [1,2]. BaseX 11.2 will be out soon.

Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/1095
[2] https://files.basex.org/releases/latest/



On Thu, Oct 8, 2020 at 2:17 PM Rob Stapper  wrote:

> Hi,
>
>
>
> The code [1] below, also sent as an attachment, generates an error message:
> “Static variable depends on itself:
> $Q{http://www.w3.org/2005/xquery-local-functions}test”.
>
> I use these variables to refer to my private functions in my modules so I
> can easily refer to them in an inheritance situation.
>
> It’s not a big problem for me, but I was wondering whether the error
> is justified or whether the code should work.
>
>
>
> [1]===
>
> declare variable $local:test := local:test#1 ;
>
> declare %private function local:test( $i) { if ( $i > 0)  then
> $local:test( $i - 1) } ;
>
>
>
> $local:test( 10)
>
> ===
>
>
>
> Kind regards,
>
>
>
> Rob Stapper


Re: [basex-talk] Bug report

2024-08-12 Thread Christian Grün
Thanks. The database doesn’t look that huge.

I was surprised that the error is not raised with BaseX 10.7, as I cannot
imagine substantial changes in the way we add documents. Does the error
persist when you run OPTIMIZE or OPTIMIZE ALL?
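
If running the command is inconvenient, the same step can be sketched in
XQuery, assuming the db:optimize function with its second argument set to
true() corresponds to OPTIMIZE ALL:

```xquery
(: 'mydb' is the database name from this thread;
   true() requests a full optimization :)
db:optimize('mydb', true())
```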



On Mon, Aug 12, 2024 at 3:36 PM Csaba Fekete  wrote:

> Hi
> Yeah it might be something to do with size rather than BaseX version.
> Not sure I'm allowed to share all the db but here's the DB INFO:
>
> Database 'mydb' was opened in 42.39 ms.
>
> Database Properties
>  NAME: mydb
>  SIZE: 1301 MB
>  NODES: 79778050
>  DOCUMENTS: 3075
>  BINARIES: 0
>  VALUES: 0
>  TIMESTAMP: 2024-08-10T18:30:28.699Z
>  UPTODATE: false
>
> Resource Properties
>  INPUTPATH:
>  INPUTSIZE: 0 b
>  INPUTDATE: 2024-08-10T18:30:28.687Z
>
> Indexes
>  TEXTINDEX: false
>  ATTRINDEX: false
>  TOKENINDEX: false
>  FTINDEX: false
>  TEXTINCLUDE:
>  ATTRINCLUDE:
>  TOKENINCLUDE:
>  FTINCLUDE:
>  LANGUAGE: English
>  STEMMING: false
>  CASESENS: false
>  DIACRITICS: false
>  STOPWORDS:
>  UPDINDEX: false
>  AUTOOPTIMIZE: false
>  MAXCATS: 100
>  MAXLEN: 96
>  SPLITSIZE: 0
>
> On Mon, 12 Aug 2024 at 10:49, Christian Grün 
> wrote:
>
>> Hi Csaba,
>>
>> It does not seem to be possible to reproduce the bug with an empty 'mydb'
>> database. I assume it’s too large to be shared (either publicly or
>> confidentially)? Is it possible to get this reproduced with a small
>> instance?
>>
>> What is output by the following command?
>>
>> > basex -v -c"OPEN mydb" -c"INFO DB"
>>
>> Thanks,
>> Christian
>>
>>
>> On Sat, Aug 10, 2024 at 8:31 PM Csaba Fekete 
>> wrote:
>>
>>> Hello
>>>
>>> The following works on BaseX 10.7 but fails on BaseX 11.1.
>>>
>>> The command I'm trying to run is:
>>>
>>> /opt/basex/bin/basex -v -c'OPEN mydb; ADD "/tmp/ZTH4ZRO.xml"'
>>>
>>> I've attached the XML file. The output I'm getting:
>>>
>>> Database 'mydb' was opened in 54.52 ms.
>>> Improper use? Potential bug? Your feedback is welcome:
>>> Contact: basex-talk@mailman.uni-konstanz.de
>>> Version: BaseX 11.1
>>> Java: Ubuntu, 11.0.19
>>> OS: Linux, amd64
>>> Stack Trace:
>>> java.lang.ArrayIndexOutOfBoundsException: arraycopy: length -1 is negative
>>> at java.base/java.lang.System.arraycopy(Native Method)
>>> at org.basex.util.Array.insert(Array.java:99)
>>> at org.basex.util.list.ObjectList.insert(ObjectList.java:153)
>>> at org.basex.index.resource.Docs.insert(Docs.java:166)
>>> at org.basex.index.resource.Resources.insert(Resources.java:71)
>>> at org.basex.data.Data.indexAdd(Data.java:1098)
>>> at org.basex.data.Data.insert(Data.java:836)
>>> at org.basex.query.up.atomic.Insert.apply(Insert.java:44)
>>> at org.basex.query.up.atomic.AtomicUpdateCache.applyUpdates(AtomicUpdateCache.java:291)
>>> at org.basex.query.up.atomic.AtomicUpdateCache.execute(AtomicUpdateCache.java:275)
>>> at org.basex.core.cmd.Add.lambda$run$0(Add.java:62)
>>> at org.basex.core.cmd.ACreate.update(ACreate.java:90)
>>> at org.basex.core.cmd.Add.run(Add.java:56)
>>> at org.basex.core.Command.run(Command.java:233)
>>> at org.basex.core.Command.execute(Command.java:93)
>>> at org.basex.api.client.LocalSession.execute(LocalSession.java:131)
>>> at org.basex.api.client.Session.execute(Session.java:36)
>>> at org.basex.core.CLI.execute(CLI.java:94)
>>> at org.basex.core.CLI.execute(CLI.java:78)
>>> at org.basex.core.CLI.execute(CLI.java:65)
>>> at org.basex.BaseX.<init>(BaseX.java:82)
>>> at org.basex.BaseX.main(BaseX.java:44)
>>>
>>


Re: [basex-talk] Error with function-lookup and static variable

2024-08-12 Thread Christian Grün
I managed to fix a part of the dependency problem. Your query should now be
executable with the latest snapshot [1].
– Christian

[1] https://files.basex.org/releases/latest/
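
Until the snapshot or BaseX 11.2 is installed, one way to sidestep the error is to avoid the static variable altogether and perform both lookups inside the function. A minimal sketch, assuming the lookup result does not need to be shared between several functions; $lookup is a local name introduced here for illustration:

```xquery
xquery version "3.1";

declare function local:myfcn() {
  (: The lookup that used to live in $local:lookup is now a local binding. :)
  let $lookup := function-lookup(QName("nonexistent", "nonexistent"), 1)
  let $f := (
    $lookup,
    function-lookup(QName('http://www.w3.org/2005/xpath-functions', 'string'), 1)
  )[1]
  return $f('a')
};

local:myfcn()
```

Without a static variable in the prolog, there is nothing left that could depend on itself; the cost is that the first lookup is repeated on every call.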



On Fri, Aug 9, 2024 at 5:51 PM Amanda Galtman  wrote:

> Hi, all.
>
> I'm seeing an error in BaseX when I use function-lookup in both a global
> variable and a function, where the function relies on the variable. I
> reduced the situation to the following small query:
>
> xquery version "3.1";
> declare variable $local:lookup := function-lookup(QName("nonexistent",
> "nonexistent"), 1);
> declare function local:myfcn() {
>   let $f := ($local:lookup, function-lookup(QName('
> http://www.w3.org/2005/xpath-functions','string'), 1))[1]
>   return  $f('a')
> };
> local:myfcn()
>
> When I run it with BaseX 11.1, I get
> [XQDY0054] Static variable depends on itself: $local:lookup
>
> When I run it with Saxon-HE 12, I don't get this error.
>
> Is there anything I can do in my code to avoid this error?
>
> Thanks,
> Amanda
>


Re: [basex-talk] Bug report

2024-08-12 Thread Christian Grün
Hi Csaba,

It does not seem to be possible to reproduce the bug with an empty 'mydb'
database. I assume it’s too large to be shared (either publicly or
confidentially)? Is it possible to get this reproduced with a small
instance?

What is output by the following command?

> basex -v -c"OPEN mydb" -c"INFO DB"

Thanks,
Christian


On Sat, Aug 10, 2024 at 8:31 PM Csaba Fekete  wrote:

> Hello
>
> The following works on BaseX 10.7 but fails on BaseX 11.1.
>
> The command I'm trying to run is:
>
> /opt/basex/bin/basex -v -c'OPEN mydb; ADD "/tmp/ZTH4ZRO.xml"'
>
> I've attached the XML file. The output I'm getting:
>
> Database 'mydb' was opened in 54.52 ms.
> Improper use? Potential bug? Your feedback is welcome:
> Contact: basex-talk@mailman.uni-konstanz.de
> Version: BaseX 11.1
> Java: Ubuntu, 11.0.19
> OS: Linux, amd64
> Stack Trace:
> java.lang.ArrayIndexOutOfBoundsException: arraycopy: length -1 is negative
> at java.base/java.lang.System.arraycopy(Native Method)
> at org.basex.util.Array.insert(Array.java:99)
> at org.basex.util.list.ObjectList.insert(ObjectList.java:153)
> at org.basex.index.resource.Docs.insert(Docs.java:166)
> at org.basex.index.resource.Resources.insert(Resources.java:71)
> at org.basex.data.Data.indexAdd(Data.java:1098)
> at org.basex.data.Data.insert(Data.java:836)
> at org.basex.query.up.atomic.Insert.apply(Insert.java:44)
> at org.basex.query.up.atomic.AtomicUpdateCache.applyUpdates(AtomicUpdateCache.java:291)
> at org.basex.query.up.atomic.AtomicUpdateCache.execute(AtomicUpdateCache.java:275)
> at org.basex.core.cmd.Add.lambda$run$0(Add.java:62)
> at org.basex.core.cmd.ACreate.update(ACreate.java:90)
> at org.basex.core.cmd.Add.run(Add.java:56)
> at org.basex.core.Command.run(Command.java:233)
> at org.basex.core.Command.execute(Command.java:93)
> at org.basex.api.client.LocalSession.execute(LocalSession.java:131)
> at org.basex.api.client.Session.execute(Session.java:36)
> at org.basex.core.CLI.execute(CLI.java:94)
> at org.basex.core.CLI.execute(CLI.java:78)
> at org.basex.core.CLI.execute(CLI.java:65)
> at org.basex.BaseX.<init>(BaseX.java:82)
> at org.basex.BaseX.main(BaseX.java:44)
>


Re: [basex-talk] Error with function-lookup and static variable

2024-08-12 Thread Christian Grün
Hi Amanda,

It is high time to tackle an existing GitHub issue that most likely relates
to the same bug [1].

Thanks for your observation,
Christian

[1] https://github.com/BaseXdb/basex/issues/1095


On Fri, Aug 9, 2024 at 5:51 PM Amanda Galtman  wrote:

> Hi, all.
>
> I'm seeing an error in BaseX when I use function-lookup in both a global
> variable and a function, where the function relies on the variable. I
> reduced the situation to the following small query:
>
> xquery version "3.1";
> declare variable $local:lookup := function-lookup(QName("nonexistent",
> "nonexistent"), 1);
> declare function local:myfcn() {
>   let $f := ($local:lookup, function-lookup(QName('
> http://www.w3.org/2005/xpath-functions','string'), 1))[1]
>   return  $f('a')
> };
> local:myfcn()
>
> When I run it with BaseX 11.1, I get
> [XQDY0054] Static variable depends on itself: $local:lookup
>
> When I run it with Saxon-HE 12, I don't get this error.
>
> Is there anything I can do in my code to avoid this error?
>
> Thanks,
> Amanda
>


Re: [basex-talk] Typos in BaseX documentation

2024-08-09 Thread Christian Grün
Dear Amanda,

Thanks for your offer. I’ve already adopted your changes, and I’ll send you
login data right after this mail.

Best,
Christian


On Thu, Aug 8, 2024 at 3:30 PM Amanda Galtman  wrote:

> Hi, BaseX team.
>
> I found a couple of typos in the BaseX documentation. I wouldn't mind
> fixing them myself if I knew how to do so in the new RESTXQ format. Is the
> source in GitHub, and are BaseX users encouraged to contribute corrections
> to the documentation?
>
> Here are the details, in case you'd rather fix them yourself.
>
> Typo 1: The page https://docs.basex.org/main/Standard_Functions starts
> with a stray "f" ("fThis page presents")
>
> Typo 2: On the page https://docs.basex.org/main/File_Functions, the
> following example is missing a closing curly brace before the final
> parenthesis: file:write('data.bin', xs:hexBinary('414243'), { 'method':
> 'basex')
>
> Thanks,
> Amanda
>


Re: [basex-talk] Basex java.lang.ArrayIndexOutOfBoundsException

2024-08-09 Thread Christian Grün
Hi Robert,

The size in the screenshot indicates that your input may have exceeded some
limits for a single database instance [1]. Usually, an insert operation
that would cause an overflow is rejected. Based on your database logs, can
you trace back what was the last seemingly successful insert, and the first
operation that raised the error message?

With regard to the database optimization, does it make a difference if you
choose "Full optimization"?

Best,
Christian

[1] https://docs.basex.org/main/Statistics


On Wed, Aug 7, 2024 at 11:07 PM Chavez, Robert 
wrote:

> Greetings,
>
>
>
> We are suddenly seeing the following error message when using the REST API
> and also when using the Web GUI, both when trying to list the contents of a
> database by clicking on the database name and when clicking on
> “optimize” (screenshot of the GUI attached).
>
>
>
> Does anyone have experience with how to handle this?
>
>
>
> Version: BaseX 10.4
>
> Java: Amazon.com Inc., 17.0.12
>
> OS: Linux, amd64
>
>
>
> Thank you,
>
> Robert
>
>
>
>
>
> Unexpected error: Improper use? Potential bug? Your feedback is welcome:
>
> Contact: basex-talk@mailman.uni-konstanz.de
>
> Version: BaseX 10.4
>
> Java: Amazon.com Inc., 17.0.12
>
> OS: Linux, amd64
>
> Stack Trace:
>
> java.lang.ArrayIndexOutOfBoundsException: Index 2147483647 out of bounds
> for length 17322
>
> at
> org.basex.io.random.TableDiskAccess.fpre(TableDiskAccess.java:514)
>
> at
> org.basex.io.random.TableDiskAccess.cursor(TableDiskAccess.java:474)
>
> at
> org.basex.io.random.TableDiskAccess.read5(TableDiskAccess.java:180)
>
> at org.basex.data.Data.textRef(Data.java:467)
>
> at org.basex.data.DiskData.text(DiskData.java:272)
>
> at
> org.basex.query.func.db.DbListDetails$2.get(DbListDetails.java:96)
>
> at
> org.basex.query.func.db.DbListDetails$2.get(DbListDetails.java:1)
>
> at
> org.basex.query.func.fn.FnSubsequence.value(FnSubsequence.java:108)
>
> at org.basex.query.expr.IterMap.value(IterMap.java:106)
>
> at org.basex.query.expr.gflwor.Let$LetEval.next(Let.java:146)
>
> at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:78)
>
> at org.basex.query.QueryContext.next(QueryContext.java:375)
>
> at org.basex.query.expr.constr.Constr.add(Constr.java:73)
>
> at org.basex.query.expr.constr.CElem.item(CElem.java:149)
>
> at org.basex.query.expr.constr.CElem.item(CElem.java:1)
>
> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:51)
>
> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:46)
>
> at org.basex.query.expr.constr.Constr.add(Constr.java:72)
>
> at org.basex.query.expr.constr.CElem.item(CElem.java:149)
>
> at org.basex.query.expr.constr.CElem.item(CElem.java:1)
>
> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:51)
>
> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:46)
>
> at org.basex.query.expr.List$1.iter(List.java:213)
>
> at org.basex.query.expr.List$1.next(List.java:181)
>
> at org.basex.query.QueryContext.next(QueryContext.java:375)
>
> at org.basex.query.expr.constr.Constr.add(Constr.java:73)
>
> at org.basex.query.expr.constr.CElem.item(CElem.java:149)
>
> at org.basex.query.expr.constr.CElem.item(CElem.java:1)
>
> at org.basex.query.expr.ParseExpr.value(ParseExpr.java:51)
>
> at
> org.basex.query.func.StaticFuncCall.evalArgs(StaticFuncCall.java:147)
>
> at org.basex.query.func.FuncCall.value(FuncCall.java:53)
>
> at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:46)
>
> at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:82)
>
> at org.basex.query.scope.MainModule$1.next(MainModule.java:55)
>
> at
> org.basex.http.restxq.RestXqResponse.serialize(RestXqResponse.java:79)
>
> at org.basex.http.web.WebResponse.create(WebResponse.java:58)
>
> at org.basex.http.restxq.RestXqServlet.run(RestXqServlet.java:72)
>
> at org.basex.http.BaseXServlet.service(BaseXServlet.java:69)
>
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:733)
>
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
>
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
>
> at
> org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
>
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
>
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
>
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
>
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
>
> at
> org.apache.catalina.authenticator.Authen

Re: [basex-talk] Duplicate IDs and function id()

2024-07-26 Thread Christian Grün
Hi Andrew,

A valuable observation; it appears there isn’t a single test case for that
in the W3C test suite. We’ll fix it with the next release; until then, you
can use head(id(...)).

Thank you,
Christian
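
Applied to a small document, the head() workaround could look like this. The document below is a made-up stand-in with two elements that share the ID value a1; the element names first and second are hypothetical:

```xquery
(: Hypothetical sample document with a duplicate ID value 'a1'. :)
let $doc := document {
  <root>
    <first xml:id="a1"/>
    <second xml:id="a1"/>
  </root>
}
(: id() returns its matches in document order; head() keeps at most the first one. :)
return head($doc/id('a1'))
```

Because head() never returns more than one item, the query yields at most one element both before and after the fix.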


On Wed, Jul 24, 2024 at 6:14 PM Andrew Sales  wrote:

> Hello,
>
> If I understand the spec for this function[1] correctly, at most one
> element should be returned for a given ID value:
>
> "If several elements have the same ID value, then E is the one that is
> first in document order."
> ("Rules", item 3)
>
> BaseX behaves differently from e.g. Saxon in this respect. The query:
>
> let $doc := document{}
>
> return (
>   'Saxon-HE 11.6:',
>   xslt:transform($doc, http://www.w3.org/1999/XSL/Transform";
> xmlns:xs="http://www.w3.org/2001/XMLSchema";
> exclude-result-prefixes="xs"
> version="3.0">
>
> 
> 
> 
>
> ),
>   'BaseX 10.7:',
>   $doc/id('a1')
> )
>
> returns:
>
> Saxon-HE 11.6:
> 
> BaseX 10.7:
> 
> 
>
> i.e. Saxon returns only the first in document order, whereas BaseX returns
> both of them.
>
> Grateful for your thoughts,
> Andrew
>
> [1] https://www.w3.org/TR/xpath-functions-31/#func-id
>


Re: [basex-talk] Query optimization: What can I check or measure?

2024-07-16 Thread Christian Grün
I tried to create some code that we can use for joint testing. With the
following snippet, you can create a database (sized around 400 MB) with 1
million value attributes and around 300. distinct values:

db:create(
  'test',
  {
for $i in 1 to 100
let $value := codepoints-to-string(
  random:seeded-integer($i mod 30, 256, 26) ! (. + 97)
)
return 
  },
  'test.xml',
  map { 'maxlen': 256, 'tokenindex': true() }
)

The following query chooses a random entry and returns all elements that
contain the attribute with this value:

let $count := count(index:tokens('test'))
let $pos := (abs(random:integer()) mod $count) + 1
return db:token('test', index:tokens('test')[$pos])

The second query takes around 3 ms in my tests. Do you get similar
performance?


Re: [basex-talk] Query optimization: What can I check or measure?

2024-07-16 Thread Christian Grün
Hi Eliot,

It’s difficult to give a general response on that without having a complete
look at the architecture, but I’ll try:

> I’m measuring a consistent 0.3 seconds for this query:

How much time is spent if you omit the parent step?

  db:token($analyticsmgmt:analyticsDatabase, $docPath, 'topicpath')

Next, how many results do you get for a single request? Is it always a
single result, or can it be a vast number? How are the values distributed
(index:tokens may help to assess this)?

You can attach "=> prof:time()" to an expression to do some isolated
performance measurements.
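
A minimal sketch of such an isolated measurement, using an arbitrary computation instead of the db:token call so that it runs without a database; the label 'sum: ' is an assumption chosen for the example:

```xquery
(: prof:time is BaseX-specific: it returns the value of the expression it
   wraps and reports the elapsed time, prefixed with the given label. :)
let $numbers := (1 to 100000) ! (. * .)
return sum($numbers) => prof:time('sum: ')
```

The same arrow call can be appended to the db:token expression to see how much of the 0.3 seconds is spent in the index lookup itself.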

In principle, it makes no difference if the data is stored in one huge
document or in millions of documents.

Best,
Christian


Re: [basex-talk] How can I reset my password for the basex website?

2024-07-16 Thread Christian Grün
Thanks for the clarification. I’ll be happy to send you –and everyone else
– an invitation to the edit features of our new documentation.


On Tue, Jul 16, 2024 at 11:40 AM Ben Engbers 
wrote:

> Hi Christian,
>
> I remember that in the past I was able to edit web-pages (such as
> https://docs.basex.org/main/Server_Protocol). Somewhere in the
> top-section of that/the pages there was a button for login.
> That button has gone.
> The bottom line has a new login link but that link directs me to a page
> I can't remember having seen before.
>
> Ben
>
> On 16-07-2024 at 11:07, Christian Grün wrote:
> > I’m not sure if I understand you correctly (which basex website are you
> > referring to?).
> > If you have upgraded from BaseX 9 or earlier, you will have to set your
> > admin password explicitly. See [1] for some hints.
> >
> > [1] https://docs.basex.org/12/BaseX_10
> >
> > On Mon, Jul 15, 2024 at 11:17 AM Ben Engbers wrote:
> >
> > I wanted to add a link in the server clients page for my C++ shared lib
> > but it seems that my password has changed. When trying to login I only
> > see a message that I should check my login data.
> > How can I reset that password?
> >
> > Ben
> >
> > Ben Engbers
>


Re: [basex-talk] feature request: control environment of external processes

2024-07-16 Thread Christian Grün
Hi Hauke,

Thanks for the feature request, which was straightforward to implement. It
has been added to the latest snapshot and will be officially supported with
BaseX 12 [1, 2].

Best,
Christian

[1] https://files.basex.org/releases/latest/
[2] https://docs.basex.org/12/Process_Functions



On Tue, Jul 16, 2024 at 8:41 AM Hauke Brandes 
wrote:

> Hello list,
>
> I would like to suggest an enhancement for the proc:system/execute/fork
> family of functions: it should be possible to control the environment
> variables of the external process. At the moment, the environment is
> always inherited from the calling basex process, as far as I can tell.
>
> There are several reasons why it is desirable to control the environment
> of external processes:
>
> * set variables to values that are not known statically when starting basex
> * avoid information leaking (restrict environment to the minimal
> required subset)
> * control the PATH of the external process
>
> It would be great to have an additional, optional entry in the $options
> map (maybe named "env" or "environment") that contains a map of
> environment variables for the external process. If absent, the current
> behavior (inherited environment) should be used. If the option is given,
> **only** the environment variables in the map should be used for the
> external process.
>
> Effectively, the "env"/"environment" option would behave as if it had a
> default value of `map:merge( available-environment-variables() !
> map:entry(., environment-variable(.)) )`
>
> The ProcessBuilder in Java should allow for a straightforward
> implementation.
>
> What do you think?
> Cheers,
> Hauke
>
>
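
The default value described in the request above relies on standard XQuery 3.1 functions only, so it can be tried out independently of the new option. The sketch below builds that map and then narrows it down to a minimal subset; the restriction to PATH is just an example, and how the resulting map is passed to proc:system is left out because the thread does not state the final option name:

```xquery
(: Builds a map from every visible environment variable name to its value. :)
let $env := map:merge(
  available-environment-variables() ! map:entry(., environment-variable(.))
)
(: Hypothetical restriction to a minimal subset, here only PATH. :)
return map:merge(
  map:keys($env)[. = 'PATH'] ! map:entry(., $env(.))
)
```

If no variable with that exact name exists, for example because the platform spells it differently, the result is simply an empty map.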


Re: [basex-talk] How can I reset my password for the basex website?

2024-07-16 Thread Christian Grün
I’m not sure if I understand you correctly (which basex website are you
referring to?).
If you have upgraded from BaseX 9 or earlier, you will have to set your
admin password explicitly. See [1] for some hints.

[1] https://docs.basex.org/12/BaseX_10

On Mon, Jul 15, 2024 at 11:17 AM Ben Engbers 
wrote:

> I wanted to add a link in the server clients page for my C++ shared lib
> but it seems that my password has changed. When trying to login I only
> see a message that I should check my login data.
> How can I reset that password?
>
> Ben
>
> Ben Engbers
>


Re: [basex-talk] A new C++ client

2024-07-16 Thread Christian Grün
Hi Ben,

Thanks for your new client, much appreciated. I’ve added it to our
documentation [1].

Out of interest: How does your solution differ from the other two clients
that we have listed?

Best,
Christian

[1] https://docs.basex.org/main/Clients



On Sun, Jul 14, 2024 at 10:43 PM Ben Engbers 
wrote:

> Before I started developing the R client for BaseX several years ago, I
> used to program regularly in SWI-Prolog. And I always wondered
> afterwards about the possible benefits of using Prolog in combination
> with XQuery. So after I finished developing RbaseX, I started developing
> ProBaseX.
>
> I did not manage to set up a connection in Swipl. Therefore, my initial
> plan to develop ProBasex entirely in SWI-prolog was changed to a plan to
> first set up a connection in C++ and then use that connection. This work
> eventually resulted in a linux shared library that implements the full
> server protocol. I will now use that library for further development work.
>
> LibBasexCpp and the test program libBasexTest.cpp are now on the
> Internet: https://github.com/BenEngbers/libBasexCpp
>
> I am curious to see if this library fills a need and if it makes sense
> to continue working on a dual platform version.
>
>
> Ben Engbers
>


Re: [basex-talk] Puzzled by timing difference of two similar queries

2024-07-15 Thread Christian Grün
A new snapshot version of BaseX is available [1]:

• In a trivial implementation, the repeated evaluation of
subsequence($seq/item, $i, 3) results in a repeated evaluation of
$seq/item, which is expensive.
• In the current BaseX release, the result of $seq/item is already cached
when it’s evaluated multiple times, but caching is disabled if the context
changes. This explains why multiple input sequences slow down everything.
• With the latest snapshot, the cache is updated with each context switch.
• If the context changes too often, caching is disabled (it’s always a
tradeoff to cache data or process it iteratively).

Hope this helps,
Christian

[1] https://files.basex.org/releases/latest/
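
For processors that only support XQuery 3.1, the window clause from the reply quoted below can be spelled out with an explicit start condition. In this sketch, "only end" is an addition that discards the incomplete windows at the end of the input, which matches the loop from the original question that stops at count - 2:

```xquery
for sliding window $w in 1 to 5
  start at $s when true()
  only end at $e when $e - $s + 1 = 3
return array { $w }
```

With the input 1 to 5, this yields the three arrays [1, 2, 3], [2, 3, 4] and [3, 4, 5]. Without "only", the two shorter windows starting at positions 4 and 5 would be returned as well.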



On Mon, Jul 15, 2024 at 11:52 AM Christian Grün 
wrote:

> Hi Bernhard,
>
> The short answer is that BaseX exploits the fact that your document
> contains a single seq element, and evaluates it faster than how it would be
> evaluated trivially. With multiple sequences, the optimization does not come
> into play. I’ll check if we can improve that; thank you [1].
>
> To reduce the runtime complexity by yourself, you can bind the items to an
> extra variable…
>
>   let $items := $seq/item
>   for $i in 1 to count($items) - 2
>   let $window := subsequence($items, $i, 3)
>
> …or use the window clause [2], as suggested by Martin:
>
>   for sliding window $w in 1 to 5
> start at $s
> end   at $e when $e - $s + 1 = 3
>   return array { $w }
>
> Best,
> Christian
>
> [1] https://github.com/BaseXdb/basex/issues/2316
> [2] https://docs.basex.org/main/XQuery_4.0#window_clause
>
>
>
>
> On Mon, Jul 15, 2024 at 8:24 AM Bernhard Liebl <
> li...@informatik.uni-leipzig.de> wrote:
>
>> Hello,
>>
>> as a rather inexperienced xquery and basex user, I'm puzzled by the
>> timings of running the same xquery on two very similar xml files.
>>
>> For an academic paper, I'm trying to benchmark finding certain successive
>> (windowed) <item> tags inside a <seq> tag. This is the full query:
>>
>> declare function local:check-sequence($items as element(item)*) as
>> element(item)* {
>> if ($items[1]/@x = $items[2]/@x and $items[2]/@x = $items[3]/@x) then
>> if (sum($items/@y) < 0.5) then $items else ()
>> else ()
>> };
>>
>> for $seq in doc("data.xml")//seq
>> return
>>  {
>>  for $i in 1 to (count($seq/item) - 2)
>>  let $window := subsequence($seq/item, $i, 3)
>>  return {local:check-sequence($window)}
>>  }
>>
>> Here's the puzzling thing. When I run this on an xml file that has one
>>  with 1 s, i.e.  [*1] , I
>> get the following reasonable timings:
>>
>> - Parsing: 0.63 ms
>> - Compiling: 0.95 ms
>> - Optimizing: 30.59 ms
>> - Evaluating: 11.48 ms
>> - Printing: 0.61 ms
>> - Total Time: 44.27 ms
>>
>> However, once I run the same query on an xml file with two (!)  of
>> 1 items each, the runtime does not double, but is now more than 50
>> times of that of the first query:
>>
>> - Parsing: 0.6 ms
>> - Compiling: 1.3 ms
>> - Optimizing: 48.13 ms
>> - Evaluating: 2843.67 ms
>> - Printing: 1.12 ms
>> - Total Time: 2894.82 ms
>>
>> I don't see how I induce any exponential runtimes, and esp.
>> check-sequence should be constant. Am I missing something fundamental here?
>>
>> The result size of the second version is about twice the size as the
>> result size of the first version.
>>
>> Thanks in advance for any pointers.
>>
>> Bernhard Liebl
>>
>>


Re: [basex-talk] Puzzled by timing difference of two similar queries

2024-07-15 Thread Christian Grün
Hi Bernhard,

The short answer is that BaseX exploits the fact that your document
contains a single seq element, and evaluates it faster than how it would be
evaluated trivially. With multiple sequences, the optimization does not come
into play. I’ll check if we can improve that; thank you [1].

To reduce the runtime complexity by yourself, you can bind the items to an
extra variable…

  let $items := $seq/item
  for $i in 1 to count($items) - 2
  let $window := subsequence($items, $i, 3)

…or use the window clause [2], as suggested by Martin:

  for sliding window $w in 1 to 5
start at $s
end   at $e when $e - $s + 1 = 3
  return array { $w }

Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/2316
[2] https://docs.basex.org/main/XQuery_4.0#window_clause
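
Put together, the first suggestion could look as follows. This is only a sketch: the inline document is made-up sample data standing in for data.xml, the two nested conditions of the original function are merged into one, and the result wrappers of the original query are omitted:

```xquery
declare function local:check-sequence(
  $items as element(item)*
) as element(item)* {
  if ($items[1]/@x = $items[2]/@x and $items[2]/@x = $items[3]/@x
      and sum($items/@y) < 0.5)
  then $items
  else ()
};

(: Hypothetical inline data; the thread reads from data.xml instead. :)
let $doc := document {
  <data>
    <seq>
      <item x="1" y="0.1"/>
      <item x="1" y="0.1"/>
      <item x="1" y="0.1"/>
      <item x="2" y="0.1"/>
    </seq>
  </data>
}
for $seq in $doc//seq
(: Bind the items once per seq instead of re-evaluating $seq/item per window. :)
let $items := $seq/item
for $i in 1 to count($items) - 2
let $window := subsequence($items, $i, 3)
return local:check-sequence($window)
```

For the sample data, only the first window passes the check, so the first three item elements are returned. The point of the extra variable is that the path $seq/item is evaluated once per seq element, no matter how the optimizer treats the surrounding loop.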




On Mon, Jul 15, 2024 at 8:24 AM Bernhard Liebl <
li...@informatik.uni-leipzig.de> wrote:

> Hello,
>
> as a rather inexperienced xquery and basex user, I'm puzzled by the
> timings of running the same xquery on two very similar xml files.
>
> For an academic paper, I'm trying to benchmark finding certain successive
> (windowed) <item> tags inside a <seq> tag. This is the full query:
>
> declare function local:check-sequence($items as element(item)*) as
> element(item)* {
> if ($items[1]/@x = $items[2]/@x and $items[2]/@x = $items[3]/@x) then
> if (sum($items/@y) < 0.5) then $items else ()
> else ()
> };
>
> for $seq in doc("data.xml")//seq
> return
>  {
>  for $i in 1 to (count($seq/item) - 2)
>  let $window := subsequence($seq/item, $i, 3)
>  return {local:check-sequence($window)}
>  }
>
> Here's the puzzling thing. When I run this on an xml file that has one
>  with 1 s, i.e.  [*1] , I
> get the following reasonable timings:
>
> - Parsing: 0.63 ms
> - Compiling: 0.95 ms
> - Optimizing: 30.59 ms
> - Evaluating: 11.48 ms
> - Printing: 0.61 ms
> - Total Time: 44.27 ms
>
> However, once I run the same query on an xml file with two (!)  of
> 1 items each, the runtime does not double, but is now more than 50
> times of that of the first query:
>
> - Parsing: 0.6 ms
> - Compiling: 1.3 ms
> - Optimizing: 48.13 ms
> - Evaluating: 2843.67 ms
> - Printing: 1.12 ms
> - Total Time: 2894.82 ms
>
> I don't see how I induce any exponential runtimes, and esp. check-sequence
> should be constant. Am I missing something fundamental here?
>
> The result size of the second version is about twice the size as the
> result size of the first version.
>
> Thanks in advance for any pointers.
>
> Bernhard Liebl
>
>


Re: [basex-talk] BaseX 11.1: Fixes, Tweaks

2024-07-11 Thread Christian Grün
> IMHO, it is a good idea to release BaseX 12.2 final version, not beta,
> with the fix asap.

Let’s go for BaseX 11.2 first ;)


Re: [basex-talk] BaseX 11.1: Fixes, Tweaks

2024-07-11 Thread Christian Grün
Thanks. It could be due to the bug that crept into the jetty.xml file.
Does it work if you replace 80 with 8080?


On Thu, Jul 11, 2024 at 5:18 PM  wrote:

> Mr.Grun,
>
>
>
> Here is the exact error:
>
>
>
> BaseX 11.2 beta 42dcb98 [HTTP Server]
>
> [main] INFO org.eclipse.jetty.server.Server - jetty-11.0.22; built:
> 2024-06-27T16:27:26.756Z; git: e711d4c7040cb1e61aa68cb248fa7280b734a3bb;
> jvm 21.0.3+9-LTS
>
> Failed to bind to /0.0.0.0:80
>
>
>
>
>
>
> *Regards,Yitzhak Khabinsky*
>
> *From:* ykhab...@bellsouth.net 
> *Sent:* Thursday, July 11, 2024 11:08 AM
> *To:* 'Christian Grün' 
> *Cc:* 'BaseX' ; 'Clark, Ash' <
> as.cl...@northeastern.edu>
> *Subject:* RE: [basex-talk] BaseX 11.1: Fixes, Tweaks
>
>
>
> Mr. Grun,
>
>
>
> I installed the BaseX 11.1
>
>
>
> Trying to launch basexhttp.bat
>
> The window flashes, stops at the first row (?!), and closes.
>
>
>
> After that I installed BaseX 11.2 beta 42dcb89
>
> The same outcome.
>
> Impossible to start the HTTP listener.
>
>
>
> Additionally, I compared BaseX directory with the BaseX.zip file content.
>
> They are identical, no extra jar files.
>
>
>
>
>
>
> *Regards,Yitzhak Khabinsky*
>
>
>
> *From:* Christian Grün 
> *Sent:* Wednesday, July 10, 2024 12:00 PM
> *To:* Clark, Ash 
> *Cc:* BaseX 
> *Subject:* Re: [basex-talk] BaseX 11.1: Fixes, Tweaks
>
>
>
> PSA for others using the Web Application server, the port has been changed
> from 8080 to 80.
>
>
>
> …which shouldn't have happened. We should have a closer look at our
> distribution workflow. Thanks for the observation, Ash!
>
>
>


Re: [basex-talk] BaseX 11.1: Fixes, Tweaks

2024-07-10 Thread Christian Grün
>
> PSA for others using the Web Application server, the port has been changed
> from 8080 to 80.
>

…which shouldn't have happened. We should have a closer look at our
distribution workflow. Thanks for the observation, Ash!


[basex-talk] BaseX 11.1: Fixes, Tweaks

2024-07-10 Thread Christian Grün
Dear all,

A first BaseX 11 patch release is available. It contains minor bug fixes
and tweaks:

- [FIX] Duplicate libraries removed from distribution packages
- [FIX] XQuery, inspect:module and inspect:xqdoc
- [FIX] WebSocket, ws:broadcast, ws:emit: send maps and arrays
- [FIX] HTTP Payloads: handling multipart messages
- [ADD] XQuery: support for union node tests
- [MOD] XQuery: Better coercion for arrays and maps
- [MIN] Updated to Jetty 11.0.22 and Markup Blitz 1.4

Have fun,
Your BaseX Team


Re: [basex-talk] XSLT with multiple outputs in BaseX 11.0

2024-07-02 Thread Christian Grün
> as a workaround, it seems you can do e.g.
   
  
  
   

This would have been my suggestion, too. Thanks.


Re: [basex-talk] file:write(...) is not respecting indent parameter correctly

2024-07-02 Thread Christian Grün
…that’s it. Another solution is to use fetch:doc:

   fetch:doc($input, { 'stripws': true() })

With XQuery 4, the standard fn:doc function may be extended with an options
argument. If so, the fetch namespace prefix can be omitted.
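
Applied to the query quoted below, the fetch:doc variant could look like this. It is a sketch that keeps the external variables and serialization parameters of the original and only swaps the global db:stripws option for the function option; $path is a helper name introduced here:

```xquery
declare variable $input as xs:string external;
declare variable $output_dir as xs:string external;

(: Whitespace is stripped while parsing, so no global option is required. :)
for $x at $i in fetch:doc($input, map { 'stripws': true() })/Events/Event
let $path := $output_dir || file:dir-separator() || 'event_' || $i || '.xml'
return file:write(
  $path,
  $x,
  map { 'method': 'xml', 'indent': 'yes', 'omit-xml-declaration': 'no' }
)
```

Setting the option at the call site limits the whitespace stripping to this one document, whereas the option declaration affects every document the query parses.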



 wrote on Tue, 2 Jul 2024, 17:36:

> The solution was to add a stripws option.
> After that, all output XML files are indented properly because the
> whitespace causing the issue was removed.
> Please see the working XQuery below.
>
>
> (: https://docs.basex.org/main/Options#stripws :)
> declare option db:stripws "true";
>
> declare variable $input as xs:string external;
> declare variable $output_dir as xs:string external;
>
> for $x at $i in doc($input)/Events/Event
> return file:write($output_dir || file:dir-separator() || "event_" || $i ||
> ".xml", $x
>   , map {'method': 'xml', 'indent': 'yes', 'omit-xml-declaration': 'no'})
>
>
>
> Regards,
> Yitzhak Khabinsky
>
>
>


Re: [basex-talk] missing invisible-xml library in BaseX 11.0?

2024-07-02 Thread Christian Grün
…thanks for the observations. A BaseX 11.1 patch release will be published
this or next week.


On Mon, Jul 1, 2024 at 10:03 PM Gunther Rademacher  wrote:

> Dear Michael,
>
> thank you for reporting this.
>
> The cited documentation page suggests that the Markup Blitz jar should be
> contained in the full distribution of BaseX 11.0, but it turns out that it
> is not. Sorry about that.
>
> What you can do is get the jar of Markup Blitz 1.3 from Maven Central [1]
> and put it into the lib folder of the BaseX installation manually.
>
> Alternatively, if you need the fixes for [2] or [3], you might want to
> build the jar from source as explained on [4] and use it instead.
>
> Best regards
> Gunther
>
> [1]
> https://repo1.maven.org/maven2/de/bottlecaps/markup-blitz/1.3/markup-blitz-1.3.jar
> [2] https://github.com/GuntherRademacher/markup-blitz/issues/9
> [3] https://github.com/GuntherRademacher/markup-blitz/issues/10
> [4]
> https://github.com/GuntherRademacher/markup-blitz#building-markup-blitz
>
> *Sent:* Monday, 1 July 2024 at 21:09
> *From:* "C. M. Sperberg-McQueen" 
> *To:* basex-talk@mailman.uni-konstanz.de
> *Subject:* [basex-talk] missing invisible-xml library in BaseX 11.0?
> Some days ago I downloaded BaseX110.zip and installed it.
>
> When I try to invoke the invisible-xml function documented at [1],
> however, I get the error message:
>
> [basex:function] Function invisible-xml requires missing class:
> de.bottlecaps.markup.Blitz.
>
> At first, I thought this might be interference from an earlier version
> of BaseX (9.0.1, which is what my package manager offered me), but it
> persists now even after I have uninstalled that earlier version.
>
> Did I do something wrong in the installation? Or was there an oversight
> putting the zip file together?
>
> best,
>
> Michael Sperberg-McQueen
>
> [1] https://docs.basex.org/main/Invisible_XML
>
> --
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> http://blackmesatech.com
>
>
>


Re: [basex-talk] Command line "-i" directory does not update files when file extension not "xml"

2024-07-01 Thread Christian Grün
Hi Daniel,

You can include .xsl files via the CREATEFILTER option [1]:

java -jar BaseX.jar -u -c"set createfilter *.xml,*.xsl" -ilocalpath
update.xquery

Hope this helps,
Christian

[1] https://docs.basex.org/main/Options#createfilter



On Fri, Jun 28, 2024 at 3:36 PM Zimmel, Daniel 
wrote:

> Hi,
>
> is the following behaviour a bug or does it need better documentation?
>
> When in a directory of XML files
> one file extension is not .xml but .xsl
> then with
> java -cp BaseX.jar org.basex.BaseX -u -w -i"localpath/"
> update.xquery
> only .xml-files get updates (not .xsl)
>
> Only with
> java -cp BaseX.jar org.basex.BaseX -u -w -i"localpath/file.xsl"
> update.xquery
> the .xsl-file does get an update.
>
> BaseX 11.0
>
> The documentation only says: " Opens the specified XML file, directory
> with XML files, or database", but not anything about file extensions.
> https://docs.basex.org/main/Command-Line_Options
>
> Thanks, Daniel
>
>
>


Re: [basex-talk] HTTP headers are all lowercase in responses

2024-07-01 Thread Christian Grün
Hi Marco,

As Michael has already confirmed, BaseX 10 and later is based on the new
Java HTTP Client, which returns all response header field names in
lowercase as required by HTTP/2 [1].
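
Code that inspects response headers can sidestep the difference by comparing names case-insensitively. A minimal sketch, assuming the response element carries http:header children with name and value attributes, as defined by the EXPath HTTP Client; the helper name local:header is made up for illustration:

```xquery
declare namespace http = 'http://expath.org/ns/http-client';

(: Hypothetical helper: returns the values of all headers whose name
   matches $name, ignoring letter case. :)
declare function local:header(
  $response as element(http:response),
  $name     as xs:string
) as xs:string* {
  $response/http:header[lower-case(@name) = lower-case($name)]/@value ! string()
};

(: The first item returned by http:send-request is the response element. :)
let $response := http:send-request(
  <http:request method='get' href='http://www.google.com'/>
)[1]
return local:header($response, 'Content-Type')
```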

Best,
Christian

[1] https://www.rfc-editor.org/rfc/rfc7540#section-8.1.2




On Thu, Jun 27, 2024 at 3:29 PM Michael Seiferle  wrote:

> Hi Marco,
>
>
>
> I am not 100% sure whether BaseX or the Java HTTP Client actually does
> the conversion, but it might be because we have used the new Java HTTP
> Client since BaseX 10.
>
> By default, it uses HTTP/2, where header names must be lowercase, while in
> earlier versions of HTTP they are defined to be case-insensitive.
>
> Hence, my guess is that the conversion happens because it is required in
> HTTP/2 and still valid in HTTP/1.0 and HTTP/1.1.
>
> I hope this doesn’t cause too much trouble on your end.
>
>
>
> Best
>
> Michael
>
> *From: *BaseX-Talk  on behalf of
> Marco Lettere 
> *Date: *Wednesday, 26 June 2024 at 19:14
> *To: *basex-talk@mailman.uni-konstanz.de <
> basex-talk@mailman.uni-konstanz.de>
> *Subject: *[basex-talk] HTTP headers are all lowercase in responses
>
> Dear all,
>
> just by trying the example of the documentation [1] (at least since
> 10.4+)  you should be able to verify that the response contains all
> headers with lower-case names.
>
> I think a client should not take the freedom to alter HTTP responses if
> this is not strictly required. What is the concrete reason for this
> behaviour?
>
> Thanks,
>
> Marco.
>
> [1] http:send-request(<http:request method='get' href='http://www.google.com' timeout='10'/>)
>


Re: [basex-talk] Problem with inspect:xqdoc() in BaseX 11

2024-06-30 Thread Christian Grün
Hi Ash,

Thanks for the observation. This was a bug indeed, which is now fixed [1].

Best,
Christian

[1] https://files.basex.org/releases/latest/


On Tue, Jun 25, 2024 at 5:07 PM Clark, Ash 
wrote:

> Hi!
>
> I'm working on an upgrade to BaseX 11 — which I love already, kudos to the
> team! Unfortunately, I found that my self-documenting API isn't working
> anymore. The function inspect:xqdoc() returns only numbers as the content
> of the description elements.
>
> Here's an XQuery snippet that should reproduce the problem:
>
> inspect:xqdoc('
> https://raw.githubusercontent.com/NEU-DSG/tapas-xq/develop/modules/tapas-api.xql')//*:description
>
> Is this a bug?
>
> Warmly,
> Ash Clark
>
>


Re: [basex-talk] BaseX 11: The XMLPrague 2024 Edition

2024-06-07 Thread Christian Grün
It’s an interesting time! To improve and speed up our work, I invite you
and everybody else to give feedback on the latest developments around
lookups and updates. Some links to start with:

https://qt4cg.org/pr/832/xquery-40/xquery-40-autodiff.html
https://github.com/qt4cg/qtspecs/issues/1225
https://github.com/qt4cg/qtspecs/issues/854

I’ll be offline pretty often until end of June, so @all please bear with me
if I won’t be that responsive.



Martin Honnen  wrote on Fri, 7 June 2024, 13:50:

>
> On 07/06/2024 13:45, Christian Grün wrote:
>
> > But it looks as if BaseX 11 doesn't support ??
> > as a (deep) lookup operator so far. Is that right?
>
> Correct. The exact syntax (including key specifiers, which will be similar
> to XPath axis specifiers) is still subject to discussion, so we haven’t
> included it yet in our official release.
>
> Thanks, as Michael Kay was playing at XMLPrague this morning with Saxon 13
> and ?? on XDM maps and arrays I had hoped I could try some of that stuff
> out with BaseX but I understand that is all work in progress and therefore
> hard to include in some snapshot state in an official release.
>
> So I will have to wait to use it until some later release.
>


Re: [basex-talk] BaseX 11: The XMLPrague 2024 Edition

2024-06-07 Thread Christian Grün
> But it looks as if BaseX 11 doesn't support ??
> as a (deep) lookup operator so far. Is that right?

Correct. The exact syntax (including key specifiers, which will be similar
to XPath axis specifiers) is still subject to discussion, so we haven’t
included it yet in our official release.


Re: [basex-talk] BaseX 11: The XMLPrague 2024 Edition

2024-06-07 Thread Christian Grün
Thanks, Martin, for the hint.

The changelog RESTXQ endpoint lacked an annotation to make it visible for
users that are not logged in (which was the reason why we didn’t notice it
during development). It should now be visible to everyone.

Best,
Christian



Martin Honnen  wrote on Fri, 7 June 2024, 13:02:

> Congrats on the release and many thanks for it.
>
> I have a question about the documentation: the page
> https://docs.basex.org/main/Main_Page has a section Announcements whose
> first item links to the Changelog at
> https://docs.basex.org/main/Changelog, but when I follow the link I
> always end up on the main page again.
>
>
> On 06/06/2024 13:09, Christian Grün wrote:
>
> Dear all,
>
> We’ve been hard at work finalizing version 11 of BaseX, our open source
> XML framework, database system, and XQuery processor.
>
> First, we have revised our documentation, which is now generated with
> RESTXQ:
>
> https://docs.basex.org/
>
>
>
>


[basex-talk] BaseX 11: The XMLPrague 2024 Edition

2024-06-06 Thread Christian Grün
Dear all,

We’ve been hard at work finalizing version 11 of BaseX, our open source XML
framework, database system, and XQuery processor.

First, we have revised our documentation, which is now generated with
RESTXQ:

https://docs.basex.org/

Next, you can now visit BXFiddle to launch XQuery and Invisible XML online:

https://fiddle.basex.org/

Finally, we have added and updated valuable features to help you tackle
your daily data challenges:

STORAGE
• Key/value Store: Better compactification when storing values
• CSV/JSON/XML parsing: reduced memory consumption

XQUERY
• First release with new 4.0 features
• Numerous new built-in functions (standard, maps, arrays, math)
• Updates: Multiple targets in rename/replace/insert expressions
• New custom functions: archive:refresh, validate:xsd-init, xslt:init
• Full support for Invisible XML.

GENERAL
• GUI: Full Unicode character support
• Web Server: Upgrade to Jetty 11
• Command-Line: New options -C, -Q, -W
• Options: WRITESTORE, writes store to disk at shutdown time

We took the opportunity of the version jump to drop various XQuery
functions in favor of new 4.0 standard functions. Database storage has not
changed.

A big thank you to Gunther Rademacher, who is now helping us to make BaseX
better every day.

Thanks to all of you for your continuous support,
Have fun with the new release,
Your BaseX Team


Re: [basex-talk] validate:xsd-report() is not emitting warnings

2024-06-05 Thread Christian Grün
…fixed.

On Wed, Jun 5, 2024 at 4:28 PM  wrote:

> Mr. Grun,
>
> I tested the latest BaseX 11.0 beta build 9182be0.
>
> It emits the following result for the provided test case:
>
> <report>
>   <status>valid</status>
>   <message … url="file:/C:/Users/i300179/Downloads/Warning.xsd">FacetsContradict: For
> simpleType definition '#AnonType_ADDR_TYPE', the enumeration value ''
> contradicts with value of 'length' facet.</message>
> </report>
>
> We are almost there. There is one more wrinkle to smooth out.
> Messages' level attribute value became UPPER-CASED:
> level="ERROR"
> level="WARNING"
>
> That level attribute values shall be kept as before:
> level="Error"
> level="Warning"
>
> Otherwise, it breaks existing XQuery analysis of the XSD validation.
>
> Regards,
> Yitzhak Khabinsky
>
>
>


Re: [basex-talk] validate:xsd-report() is not emitting warnings

2024-06-05 Thread Christian Grün
…makes sense. A new snapshot is online, which only assigns the 'invalid'
status when errors or fatal errors are found.
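
A query that post-processes the report could then separate the overall status from the individual messages. This is a sketch only: 'input.xml' is a hypothetical document name, and the level attribute is compared case-insensitively because its casing changed between snapshots:

```xquery
(: 'input.xml' is a placeholder; Warning.xsd is the schema from the test case. :)
let $report := validate:xsd-report('input.xml', 'Warning.xsd')
return (
  (: 'valid' unless errors or fatal errors were found :)
  $report/status/string(),
  (: warnings are still listed, but no longer affect the status :)
  $report/message[lower-case(@level) = 'warning']/string()
)
```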


On Wed, Jun 5, 2024 at 3:18 PM  wrote:

> Mr. Grun,
>
> Thanks a lot for such a quick turnaround.
>
> I tested the latest BaseX 11.0 beta build 88ebdb9.
>
> It emits the following result for the provided test case:
> <report>
>   <status>invalid</status>
>   <message … url="file:/C:/Users/i300179/Downloads/Warning.xsd">FacetsContradict: For
> simpleType definition '#AnonType_ADDR_TYPE', the enumeration value ''
> contradicts with value of 'length' facet.</message>
> </report>
>
> There is one more outstanding issue to resolve.
> A logic for the final XSD validation result needs to be adjusted.
>
> 1) Messages with the attribute level="Warning" shall NOT be counted as
> invalid!
> As end result: valid
> 2) Messages with the attribute level="Error" shall be counted as a failed
> validation.
> As end result: invalid
>
> Such behavior would be consistent with the Oxygen XML IDE validation
> outcome.
>
>
>
> Regards,
> Yitzhak Khabinsky
>
>
>


Re: [basex-talk] validate:xsd-report() is not emitting warnings

2024-06-05 Thread Christian Grün
Hi Yitzhak,

> Though Oxygen XML IDE shows the following upon validation:
> For simpleType definition '#AnonType_ADDR_TYPE', the enumeration value ''
> contradicts with value of 'length' facet.


The warning is not shown as it occurs before the actual validation (i.e.,
while the schema is compiled). I’ve changed this behavior: With the latest
snapshot, schema warnings are considered as well [1].

Best,
Christian

[1] https://files.basex.org/releases/latest/


Re: [basex-talk] storing xml-elements

2024-05-27 Thread Christian Grün
Hi Rob,

I think there are various ways to achieve that. One way is to store both the
full document in the session as well as the path to the requested node:

let $xml := <bla><blu/></bla>  (: sample fragment; any XML with a blu element works :)
let $result := $xml//blu
let $path := path($result)
return (
  session:set('xml', $xml),
  session:set('path', $path)
)

You can run a dynamic query on the XML snippet to get back your original
result:

let $xml := session:get('xml')
let $path := session:get('path')
let $result := xquery:eval($path, { '': $xml })
return $result

Best,
Christian



On Sat, May 25, 2024 at 8:45 AM Rob Stapper  wrote:

> Hi Christian,
>
> I'm looking for a way to store an XML subelement during a session as a
> subelement and not as a root element. The store and session:set
> functions save XML elements as root elements, losing the original root.
>
> My case is that I have a batch web service that I want to turn into an
> online/interactive web service. The application makes use of the root and
> ancestor functions/axes on XML subelements. In the batch solution I have
> access to the original XML subelement. In the online solution, however, I
> do not, because the code is broken up by an HTTP request/response
> interaction with the user. I would like the XML subelement to survive this
> HTTP excursion in such a way that its root element stays available and the
> root and ancestor functions give the same result as in the batch version.
>
> Do you have any suggestions on this?
>
>
> mvgr.
>
> Rob Stapper
>
>


Re: [basex-talk] Session problems with two BaseX installations

2024-05-27 Thread Christian Grün
Hi Jack,

I’m no servlet expert either, but it seems you can change the default name
(JSESSIONID) of the session cookie in your web.xml file to something else.
For example:

<session-config>
  <cookie-config>
    <name>MYBELOVEDAPP</name>
  </cookie-config>
</session-config>

If you open the Development → Application → Cookies panel of your browser,
you should be able to see if your custom session key was adopted.

If that works, there should be no need anymore to change the DBA session
key, as suggested in my previous reply.

Hope this helps,
Christian



On Wed, May 22, 2024 at 4:10 PM Jack Steyn  wrote:

> Hi Christian,
>
> I think you're right, this is a browser issue relating to ports being
> ignored. Unfortunately changing the session key doesn't resolve the issue.
> It sounds like the DBA webapps need to be contacting different hosts to
> prevent session data from being overwritten. Do you have any advice on how
> to go about accomplishing this? I'm afraid networking is not one of my
> strengths.
>
> Many thanks for your assistance.
>
> Best,
>
> Jack
>
> On Wed, 22 May 2024, 8:47 pm Christian Grün, 
> wrote:
>
>> Hi Jack,
>>
>> It seems that the browser ignores different ports when handling session
>> data (just a guess). Does it work if you change the session key of your
>> second DBA instance [1]?
>>
>> Best,
>> Christian
>>
>> [1]
>> https://github.com/BaseXdb/basex/blob/df83d80238a27f3a168e3d5e88f984c819a37fb8/basex-api/src/main/webapp/dba/lib/config.xqm#L8-L9
>>
>>
>> On Wed, May 22, 2024 at 5:44 AM Jack Steyn  wrote:
>>
>>> Hi,
>>>
>>> I have two separate BaseX installations on the same machine. Each has
>>> its own set of databases and users. I run their HTTP servers
>>> simultaneously, configured to their own sets of ports.
>>>
>>> When I authenticate to the DBA webapp of one, I am logged out of the DBA
>>> webapp of the other.
>>>
>>> What might I do to prevent this?
>>>
>>> Best regards,
>>>
>>> Jack
>>>
>>


Re: [basex-talk] Weird behaviour with sequence and random:integer

2024-05-22 Thread Christian Grün
Hi Marco,

> ("a", "b", "c")[trace(1 + random:integer(3))]

The filter expression is defined in such a way that the predicate is
evaluated anew for every item of the sequence. If you want
random:integer(3) to be evaluated only once, you can either bind it to a
variable…

let $r := random:integer(3) + 1
return ("a", "b", "c")[$r]

…or use functions like fn:subsequence:

subsequence(("a", "b", "c"), random:integer(3) + 1, 1)
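
To see why the number of results varies, the original filter can be spelled out as a FLWOR expression. This is only a sketch of the semantics, not of how BaseX evaluates the filter internally; it assumes random:integer(3) yields 0, 1 or 2:

```xquery
(: Rough equivalent of ("a", "b", "c")[1 + random:integer(3)]:
   a fresh random number is drawn for every item, and an item is kept
   only if that number happens to equal the item's position. :)
for $item at $pos in ("a", "b", "c")
where $pos = 1 + random:integer(3)
return $item
```

Since each item is tested against its own random number, anything from zero to three items can pass the test.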

Hope this helps,
Christian



Marco Lettere  wrote on Wed, 22 May 2024, 15:34:

> Dear all,
>
> I have this strange behavior when running in BaseX 10.4:
>
> ("a","b","c")[1 + random:integer(3)]
>
> I sometimes get an empty result, sometimes (correctly) one out of the
> three, sometimes two results...
>
> I report the results and the trace in [1] and [2].
>
> What is the reason for this?
>
> Thank you.
>
> Marco.
>
> [1] Two results
> a
> c
>
> Evaluating:
> 1
> 3
> 3
> Compiling:
> - rewrite list to xs:string sequence: ("a", "b", "c")
> - swap operands: (1 + random:integer(3))
> Optimized Query:
> ("a", "b", "c")[trace((random:integer(3) + 1))]
> Query:
> ("a", "b", "c")[trace(1 + random:integer(3))]
> Result:
> - Hit(s): 2 Items
> - Updated: 0 Items
> - Printed: 3 b
> - Read Locking: (none)
> - Write Locking: (none)
> Timing:
> - Parsing: 0.21 ms
> - Compiling: 0.4 ms
> - Optimizing: 0.1 ms
> - Evaluating: 0.57 ms
> - Printing: 0.02 ms
> - Total Time: 1.31 ms
>
>
> [2] Empty result
> ()
>
> Evaluating:
> 3
> 3
> 1
> Compiling:
> - rewrite list to xs:string sequence: ("a", "b", "c")
> - swap operands: (1 + random:integer(3))
> Optimized Query:
> ("a", "b", "c")[trace((random:integer(3) + 1))]
> Query:
> ("a", "b", "c")[trace(1 + random:integer(3))]
> Result:
> - Hit(s): 0 Items
> - Updated: 0 Items
> - Printed: 0 b
> - Read Locking: (none)
> - Write Locking: (none)
> Timing:
> - Parsing: 0.16 ms
> - Compiling: 0.41 ms
> - Optimizing: 0.15 ms
> - Evaluating: 0.56 ms
> - Printing: 0.02 ms
> - Total Time: 1.3 ms
> Query Plan:
> [query plan markup not preserved; surviving values: a, b, c, 3, 1]
>
>
>
>


Re: [basex-talk] Making store reactive

2024-05-22 Thread Christian Grün
Hi Marco,

Thanks for your suggestion. Some thoughts:

• Function items can depend on the currently evaluated code and its static
and dynamic context, but we could possibly design something similar as for
the Job Module, in which the query is passed on as a string or a URI
reference and evaluated completely independently.
• Registered observers could be handled similarly as „services”, i.e., made
persistent, end up in the same query pool, discarded by user requests,
gracefully shut down when a server stops, etc. [1].
• The feature request reminds me of triggers that we envisioned for
databases (but that were eventually discarded [2]).

Having said this, it could take a while to make this happen as it’s a
non-trivial request :) I’d like to hear about suggestions of other readers.

Apart from that, we are always interested in feedback on the Store Module;
it’s still fairly new, but more and more people seem to discover it.

Ciao,
Christian

[1] https://docs.basex.org/wiki/Job_Module#Services
[2] https://github.com/BaseXdb/basex/issues/1082



On Wed, May 22, 2024 at 9:49 AM Marco Lettere  wrote:

> Dear Christian and BaseX developers,
>
> just wondering if adding something like the following would be hard to
> implement.
>
> *store:observe($key as xs:string, $observers as function(*)*) *
>
> with $observers being something like
>
> *function($key as xs:string).*
>
> The semantics is to call the registered observers whenever a value
> associated with the key in the store changes (put, remove, clear, ..).
>
> This would allow for nicely decoupled observer - notification pattern.
>
> Does it make sense?
>
> Regards,
>
> Marco.
>


Re: [basex-talk] Session problems with two BaseX installations

2024-05-22 Thread Christian Grün
Hi Jack,

It seems that the browser ignores different ports when handling session
data (just a guess). Does it work if you change the session key of your
second DBA instance [1]?

Best,
Christian

[1]
https://github.com/BaseXdb/basex/blob/df83d80238a27f3a168e3d5e88f984c819a37fb8/basex-api/src/main/webapp/dba/lib/config.xqm#L8-L9


On Wed, May 22, 2024 at 5:44 AM Jack Steyn  wrote:

> Hi,
>
> I have two separate BaseX installations on the same machine. Each has its
> own set of databases and users. I run their HTTP servers simultaneously,
> configured to their own sets of ports.
>
> When I authenticate to the DBA webapp of one, I am logged out of the DBA
> webapp of the other.
>
> What might I do to prevent this?
>
> Best regards,
>
> Jack
>


Re: [basex-talk] string:levenshtein producing incorrect results?

2024-05-12 Thread Christian Grün
Thanks for the hint, the DBA code has been updated (the function is being
replaced with the new fn:char('\n') function). A new snapshot is online.


On Mon, May 13, 2024 at 8:12 AM Jack Steyn  wrote:

> Thanks, Christian.
>
> When I download and unzip the latest version, start the HTTP server and
> navigate to localhost:8080, I'm given the following error:
>
> Stopped at [...]/basex/webapp/dba/jobs/job-result.xqm, 34/49:
> [XPST0017] Unknown function: string:nl.
>
> It's easy enough to work around by editing job-result.xqm (and jobs.xqm in
> which string:nl also appears), but wanted to bring it to your attention in
> case you weren't aware.
>
> Cheers,
>
> Jack
>
>
> On Fri, 10 May 2024, 7:37 pm Christian Grün, 
> wrote:
>
>> Hi Jack,
>>
>> That’s been helpful, thanks. We’ve aligned our Damerau/Levenshtein
>> algorithms; the latest version should behave as expected [1, 2].
>>
>> Best,
>> Christian
>>
>> [1] https://files.basex.org/releases/latest/
>> [2]
>> https://github.com/BaseXdb/basex/commit/6889ac108c6b32d448d640d53ec098bbb8938f06
>>
>>
>> On Thu, May 9, 2024 at 8:29 AM Jack Steyn  wrote:
>>
>>> Hi,
>>>
>>> According to my copy of BaseX 10.7,
>>>
>>> string:levenshtein('oil field', 'oilfield')
>>>
>>> and
>>>
>>> string:levenshtein('oil field', 'coalfield')
>>>
>>> both return the same value, 0.7778.
>>>
>>> My understanding is that the Levenshtein-Damerau distance between 'oil
>>> field' and 'oilfield' is 1 and between 'oil field' and 'coalfield' is 3, so
>>> following the formula from
>>> https://docs.basex.org/wiki/String_Module#string:levenshtein
>>>
>>> 1.0 – distance / max(length of strings)
>>>
>>> should give 0.888... and 0.666... respectively.
>>>
>>> Am I off-base here or is there something awry with string:levenshtein?
>>>
>>> Cheers,
>>>
>>> Jack
>>>
>>


Re: [basex-talk] string:levenshtein producing incorrect results?

2024-05-10 Thread Christian Grün
Hi Jack,

That’s been helpful, thanks. We’ve aligned our Damerau/Levenshtein
algorithms; the latest version should behave as expected [1, 2].
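
For reference, the expected values from the report can be checked by plugging the distances 1 and 3 quoted there into the documented formula. The sketch below does not call string:levenshtein itself; the distances are passed in by hand:

```xquery
(: similarity = 1.0 - distance / max(length of strings) :)
let $similarity := function(
  $a as xs:string, $b as xs:string, $distance as xs:integer
) as xs:double {
  1.0e0 - $distance div max((string-length($a), string-length($b)))
}
return (
  $similarity('oil field', 'oilfield', 1),   (: 1 - 1/9 :)
  $similarity('oil field', 'coalfield', 3)   (: 1 - 3/9 :)
)
```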

Best,
Christian

[1] https://files.basex.org/releases/latest/
[2]
https://github.com/BaseXdb/basex/commit/6889ac108c6b32d448d640d53ec098bbb8938f06


On Thu, May 9, 2024 at 8:29 AM Jack Steyn  wrote:

> Hi,
>
> According to my copy of BaseX 10.7,
>
> string:levenshtein('oil field', 'oilfield')
>
> and
>
> string:levenshtein('oil field', 'coalfield')
>
> both return the same value, 0.7778.
>
> My understanding is that the Levenshtein-Damerau distance between 'oil
> field' and 'oilfield' is 1 and between 'oil field' and 'coalfield' is 3, so
> following the formula from
> https://docs.basex.org/wiki/String_Module#string:levenshtein
>
> 1.0 – distance / max(length of strings)
>
> should give 0.888... and 0.666... respectively.
>
> Am I off-base here or is there something awry with string:levenshtein?
>
> Cheers,
>
> Jack
>


Re: [basex-talk] BaseX installation cannot see Java

2024-05-10 Thread Christian Grün
Thanks for the insight. We’ll need to ensure that the check runs with all
versions of Windows that we support, but if a similar problem is reported
back to us again, we’ll think about a multi-step version check.

Best,
Christian



On Thu, May 9, 2024 at 11:09 PM  wrote:

> Mr. Grun,
>
> It seems that the Windows Management Instrumentation Command-line (WMIC)
> utility gives a reliable result on Windows.
>
> https://www.techtarget.com/searchenterprisedesktop/definition/Windows-Management-Instrumentation-Command-line-WMIC
>
> It is deprecated, but working on any Windows OS, including Windows Server
> 2022.
>
> On my home machine with Windows 10 OS the following command
> c:\>wmic product where "Name like '%JRE%' or name Like'%Java%' or name
> Like'%JDK%' " get name,Vendor,Version
>
> emits the following 4 (four) Java installations:
>
> Name                                              Vendor              Version
> Java(TM) SE Development Kit 18.0.2 (64-bit)       Oracle Corporation  18.0.2.0
> Java 8 Update 391 (64-bit)                        Oracle Corporation  8.0.3910.13
> Eclipse Temurin JDK with Hotspot 21.0.2+13 (x64)  Eclipse Adoptium    21.0.2.13
> Java Auto Updater                                 Oracle Corporation  2.8.391.13
>
> To see all available entries:
> c:\>wmic product where "Name like '%JRE%' or name Like'%Java%' or name
> Like'%JDK%' " get * /format:textvaluelist
>
> So, you can try to integrate wmic based method in the BaseX Windows
> installer.
>
> Regards,
> Yitzhak Khabinsky
>
>
>


Re: [basex-talk] BaseX installation cannot see Java

2024-05-08 Thread Christian Grün
>
> Where and how did the BaseX installer find the old Java v1.6 and miss the
> legitimate Eclipse Temurin JRE with Hotspot 17.0.8.1+1?
>

Good question. We would need to better understand the NSIS installer code
[1]. The installer provided dedicated Java checks, but those did not work
with newer JDK versions anymore, which is why we switched to the basic
'java -version' call.

If someone has worked with NSIS in the past, feel free to share your
knowledge.

[1] https://nsis.sourceforge.io/Main_Page



>
> c:\>java -version shows the following
> openjdk version "17.0.8.1" 2023-08-24
> OpenJDK Runtime Environment Temurin-17.0.8.1+1 (build 17.0.8.1+1)
> OpenJDK 64-Bit Server VM Temurin-17.0.8.1+1 (build 17.0.8.1+1, mixed mode,
> sharing)
>
> c:\>path shows the following:
> PATH=C:\Program Files\Eclipse Adoptium\jre-17.0.8.101-hotspot\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\WINDOWS\System32\OpenSSH\;C:\Windows\CCM;C:\Program Files\Eclypsium\;C:\Program Files (x86)\Microsoft SQL Server\160\DTS\Binn\;C:\Program Files\Azure Data Studio\bin;C:\Users\i300179\AppData\Local\Microsoft\WindowsApps;
>
> Overall, thanks a lot for such great support of the product.
>
> Regards,
> Yitzhak Khabinsky
>
>
>


Re: [basex-talk] BaseX installation cannot see Java

2024-05-07 Thread Christian Grün
Hi Yitzhak,

We have revised the Windows installation script and improved the debugging
output. If it fails, it gives more information on the analyzed version
string. Please check it out and tell us if it works for you [1].

Hope this helps,
Christian

[1] https://files.basex.org/releases/latest/


On Tue, May 7, 2024 at 10:40 PM Christian Grün 
wrote:

> Thanks for reporting this, Yitzhak.
>
> It seems something is going wrong in the Java version check of the
> installer:
>
> https://github.com/BaseXdb/basex-dist/blob/20c565a4b8b79c27d276a191b35cb224ce537ef3/win/BaseX.nsi#L50
>
>
>
>
>  wrote on Tue, 7 May 2024, 19:27:
>
>> C:\>java -version
>> openjdk version "17.0.8.1" 2023-08-24
>> OpenJDK Runtime Environment Temurin-17.0.8.1+1 (build 17.0.8.1+1)
>> OpenJDK 64-Bit Server VM Temurin-17.0.8.1+1 (build 17.0.8.1+1, mixed mode,
>> sharing)
>>
>>
>>
>> Regards,
>> Yitzhak Khabinsky
>>
>>
>>


Re: [basex-talk] BaseX installation cannot see Java

2024-05-07 Thread Christian Grün
Thanks for reporting this, Yitzhak.

It seems something is going wrong in the Java version check of the
installer:

https://github.com/BaseXdb/basex-dist/blob/20c565a4b8b79c27d276a191b35cb224ce537ef3/win/BaseX.nsi#L50




wrote on Tue, 7 May 2024, 19:27:

> C:\>java -version
> openjdk version "17.0.8.1" 2023-08-24
> OpenJDK Runtime Environment Temurin-17.0.8.1+1 (build 17.0.8.1+1)
> OpenJDK 64-Bit Server VM Temurin-17.0.8.1+1 (build 17.0.8.1+1, mixed mode,
> sharing)
>
>
>
> Regards,
> Yitzhak Khabinsky
>
>
>


Re: [basex-talk] minor issue in documentation

2024-05-03 Thread Christian Grün
Hi Rob,

Thanks. The current documentation will soon be replaced by a new version
(generated with BaseX) [1].

Best,
Christian

[1] https://help.basex.org/main/Database_Functions#db:type



On Fri, May 3, 2024 at 11:16 AM Rob Stapper  wrote:

> Hi Christian,
>
> I stumbled on this minor issue in the documentation:
>
> @ docs.basex.org/wiki/Database_Module#db:type
>
> *Examples*
>
>- *db:type("DB", "factbook.xml")* returns *true* if the specified
>resource is an XML document.
>
>
> *true* should be 'xml'.
>
>
>
> mvgr.
>
> Rob Stapper
>
>


Re: [basex-talk] Request and JSON parsing

2024-04-26 Thread Christian Grün
Thanks. Here’s one way to do it:

let $body := document {
  

  <_ type="object">
ba03177
0.83175087

  ba
  1810-40
  Heinrich von Kleist
  Die heilige Cäcilie
  https://kleist-digital.de/...
  Die Aebtissinn, ...

  

  
}
return 
  {
for $result in $body/json/result/_
return {
  for $field in ('Score', 'Blatt', 'Autor', 'Titel', 'Text', 'Link')
  return 
{ $field }: 
{ data($result//*[name() = lower-case($field)]) }
  
}
  }


Another one is…

return 
  {
for $result in $body/json/result/_
return {
  for $text in $result//text()
  let $field := name($text/..)
  return 
{ $field ! (upper-case(substring(., 1, 1)) || substring(.,
2))  }: 
{ $text }
  
}
  }


…but of course you can also output only the relevant fields.

Hope this helps,
Christian


Re: [basex-talk] Request and JSON parsing

2024-04-25 Thread Christian Grün
Hi Günter,

> 2. I'm getting a full object (but I don't know if it's the raw JSON object
> like above), but I am not able to parse it to get a list in HTML.
>

Could you share the result of your response with us (possibly shortened)?

It would be interesting to learn something about the type of the response
body. What do you get if you call inspect:type($response[2])?

> let $data := json:parse($input, map { 'format': 'xquery' })
> return map:for-each($data, function($k, $v) {
>   $k || ': ' || string-join($v, ', ')
> })
>

If you use json:parse($input) without a specific format, you’ll get an XML
representation of the JSON data, which is usually simpler to postprocess.
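
As a rough illustration, the XML representation can be walked with ordinary path expressions. The input string below is made up and only stands in for the real response body; the json, result and _ element names follow the default conversion of an object that contains an array of objects:

```xquery
(: Made-up input; the actual response body would take its place. :)
let $input := '{ "result": [ { "autor": "Heinrich von Kleist", "titel": "Die heilige Cäcilie" } ] }'
let $xml := json:parse($input)
(: Each array entry becomes an element named _ :)
for $entry in $xml/json/result/_
return $entry/autor || ': ' || $entry/titel
```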

Hope this helps
Christian


Re: [basex-talk] Performance issue with BaseX CLI

2024-04-22 Thread Christian Grün
Hi again,

I had a quick look into the monitoring code, and I noticed two things:

1. It looks to me (correct me if I’m wrong) as if the code of the project
was initially written for Saxon and then ported to BaseX. If you are
interested in using BaseX, you could focus on the slow functions, try
alternative formulations and (if you want to run the code with both processors
in the future) ensure that Saxon still delivers good performance.

2. Some functions can be noticeably sped up (for both BaseX and Saxon) if
you use XQuery 3.1 features such as maps or group by. For example, the
runtime of #131014 could possibly be reduced with something similar to…

  for $ms in $Monitoring/*:MonitoringSite
  let $emsc := $ms/*:euMonitoringSiteCode
  for $ceqm in $ms/*:ChemicalEcologicalQuantitativeMonitoring
  let $V_rech := $ceqm/*:parameterCode || '/' || $ceqm/*:parameterOther ||
'/' || $ceqm/*:chemicalMatrix
  group by $group := $emsc || ': ' || $V_rech
  where count($ceqm) > 1
  return $V_rech

If BaseX turns out to be the way to go, it’s definitely worth taking
advantage of the database aspect. In BaseX, databases are fairly
light-weight, which means you can simply create them before running the
queries (e.g., with a single 'CREATE DB poc
/path/to/poc_rapportage_controle-main/xml' command) and use db:get('poc',
'your-doc.xml') in the queries to access a document (or even stick with
doc('your-doc.xml') if you enable DEFAULTDB [1]).
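
A query against such a database might start like this. It is a sketch that assumes the 'poc' database from the command above already exists and contains a document named 'your-doc.xml':

```xquery
(: The document is read from the database instead of being parsed
   from the file system on every run. :)
let $doc := db:get('poc', 'your-doc.xml')
return count($doc//*:MonitoringSite)
```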

Hope this helps,
Christian

[1] https://docs.basex.org/wiki/Options#DEFAULTDB


On Mon, Apr 22, 2024 at 9:32 AM Christian Grün 
wrote:

> Hi Antonio,
>
> As Liam indicated, you may get better performance when adding your
> documents to a database.
>
> In general, though, the runtimes of BaseX and Saxon have aligned pretty
> much over the years, and I assume there’ll be a trivial reason behind the
> drastic difference in the runtime.
>
> Your test setup is probably too complex for us readers to spend more time
> with it. Could you possibly share a more basic example with us, ideally
> with a single document and query file?
>
> Thanks in advance,
> Christian
>
>
>
> On Mon, Apr 22, 2024 at 8:54 AM ANDRADE Antonio <
> antonio.andr...@ofb.gouv.fr> wrote:
>
>> @Liam R. E. Quin  : Thanks for your feedback. The
>> processing time is between 2 minutes and more than 11 hours (see table
>> below). Thus, the loading time of the Java virtual machine has little
>> impact. The main XQuery script loads the XML document once at the start of
>> processing. It is then requested several times as part of more or less
>> complex quality controls. At this moment, the XML document is not intended
>> to be stored. This is why it is not loaded into a database before
>> processing.
>>
>>
>> Check                        Engine  Start     Stop      Elapsed time
>> Check Monitoring 2022 FRH    Saxon   06:16:54  06:19:30  00:02:36
>> Check Monitoring 2022 FRH    BaseX   06:44:06  10:05:21  03:21:15
>> Check Multi schéma 2022 FRH  Saxon   06:25:46  06:31:47  00:06:01
>> Check Multi schéma 2022 FRH  BaseX   10:05:55  11:39:07  01:33:12
>>
>> *From:* Liam R. E. Quin
>> *Sent:* Saturday, April 20, 2024 05:00
>> *To:* ANDRADE Antonio;
>> basex-talk@mailman.uni-konstanz.de
>> *Subject:* Re: [basex-talk] Performance issue with BaseX CLI
>>
>>
>>
>> On Fri, 2024-04-19 at 10:45 +0200, ANDRADE Antonio wrote:
>>
>> Hi,
>>
>> For the purposes of European Water Framework Directive reporting, I
>> compared the performance of the Saxon and BaseX XQuery engines.
>>
>> First, you should consider (as I think Martin said) the Java runtime
>> startup time, typically a second or so.
>>
>>
>>
>> Second, BaseX is a database. If you will process the same document many
>> times, first load it into a database and then use the Python BaseX client.
>> This will avoid startup time, and, more importantly, will allow BaseX to
>> make use of database indexes.
>>
>>
>>
>> If you will only process any given document once, then Saxon may well be
>> the appropriate tool.
>>
>>
>>
>> liam
>>
>>
>>
>>
>>
>> --
>>
>> Liam Quin, https://www.delightfulcomputing.com/

Re: [basex-talk] file:path-to-native() throws an error if its argument does not exist

2024-04-22 Thread Christian Grün
Hi Gerrit,

If you don’t need the canonical path to a file resource on the file system,
file:resolve-path may be the better choice. It can be used for both file
URIs and local (relative or absolute) paths.
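
As a minimal sketch of this suggestion (not part of the original message; the
URI below is hypothetical and the returned strings are for illustration
only), resolving a file URI whose target does not exist yet might look like
this:

```xquery
(: Hypothetical URI of a directory that has not been created yet :)
let $uri := 'file:///tmp/svn-checkout/new-dir/'
(: file:resolve-path is used here because, per the advice above, it does not
   require the canonical path of an existing resource :)
let $path := file:resolve-path($uri)
return
  if (file:exists($path))
  then $path || ' already exists'
  else $path || ' still needs to be created'
```

The resulting native path depends on the operating system, so no output is
shown here.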

Hope this helps,
Christian


On Mon, Apr 22, 2024 at 8:00 AM Imsieke, Gerrit, le-tex <
gerrit.imsi...@le-tex.de> wrote:

> I have a file:// URI that corresponds to a directory that I need to create
> (using svn mkdir, therefore file:create-dir() is not an option here) if it
> doesn’t exist. Calling file:path-to-native() on it results in a
> file:not-found error. Is there a fundamental reason why the file needs to
> exist before transforming its URI into the OS-native representation? Using
> BaseX 10.7.
>
> Gerrit
>


Re: [basex-talk] Performance issue with BaseX CLI

2024-04-22 Thread Christian Grün
Hi Antonio,

As Liam indicated, you may get better performance when adding your
documents to a database.

In general, though, the runtimes of BaseX and Saxon have aligned pretty
much over the years, and I assume there’ll be a trivial reason behind the
drastic difference in the runtime.

Your test setup is probably too complex for us readers to spend more time
with it. Could you possibly share a more basic example with us, ideally
with a single document and query file?

Thanks in advance,
Christian



On Mon, Apr 22, 2024 at 8:54 AM ANDRADE Antonio 
wrote:

> @Liam R. E. Quin  : Thanks for your feedback. The
> processing time is between 2 minutes and more than 11 hours (see table
> below). Thus, the loading time of the Java virtual machine has little
> impact. The main XQuery script loads the XML document once at the start of
> processing. It is then requested several times as part of more or less
> complex quality controls. At this moment, the XML document is not intended
> to be stored. This is why it is not loaded into a database before
> processing.
>
>
> Check                        Engine  Start     Stop      Elapsed time
> Check Monitoring 2022 FRH    Saxon   06:16:54  06:19:30  00:02:36
> Check Monitoring 2022 FRH    BaseX   06:44:06  10:05:21  03:21:15
> Check Multi schéma 2022 FRH  Saxon   06:25:46  06:31:47  00:06:01
> Check Multi schéma 2022 FRH  BaseX   10:05:55  11:39:07  01:33:12
>
> *From:* Liam R. E. Quin
> *Sent:* Saturday, April 20, 2024 05:00
> *To:* ANDRADE Antonio;
> basex-talk@mailman.uni-konstanz.de
> *Subject:* Re: [basex-talk] Performance issue with BaseX CLI
>
>
>
> On Fri, 2024-04-19 at 10:45 +0200, ANDRADE Antonio wrote:
>
> Hi,
>
> For the purposes of European Water Framework Directive reporting, I
> compared the performance of the Saxon and BaseX XQuery engines.
>
> First, you should consider (as I think Martin said) the Java runtime
> startup time, typically a second or so.
>
>
>
> Second, BaseX is a database. If you will process the same document many
> times, first load it into a database and then use the Python BaseX client.
> This will avoid startup time, and, more importantly, will allow BaseX to
> make use of database indexes.
>
>
>
> If you will only process any given document once, then Saxon may well be
> the appropriate tool.
>
>
>
> liam
>
>
>
>
>
> --
>
> Liam Quin, https://www.delightfulcomputing.com/
> 
>
> Available for XML/Document/Information Architecture/XSLT/
>
> XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
>
> Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org
> 
>


Re: [basex-talk] .xar deployment

2024-04-10 Thread Christian Grün
…I agree with Andy: This looks like a Java issue. You could try enabling
IGNORECERT in your .basex configuration [1] and executing REPO INSTALL to
see if that changes anything.

[1] https://docs.basex.org/wiki/Options#IGNORECERT



On Wed, Apr 10, 2024 at 1:57 PM Andy Bunce  wrote:

> Hi Régis,
>
> I am pretty sure your issue comes from the SSL certificates available to
> your Java VM, and is not a problem with your xar package.
> Running BaseX with -d shows more information [2].
>
> Something like [1] may fix it for one machine, but it is a slow process
> and far from ideal.
>
> /Andy
>
> [1]
> https://stackoverflow.com/questions/21076179/pkix-path-building-failed-and-unable-to-find-valid-certification-path-to-requ/36427118#36427118
> [2] basex -d -c ""
> ...
> javax.net.ssl.SSLHandshakeException: PKIX path building failed:
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find
> valid certification path to requested target
> at
> java.net.http/jdk.internal.net.http.HttpClientImpl.send(HttpClientImpl.java:578)
> at
> java.net.http/jdk.internal.net.http.HttpClientFacade.send(HttpClientFacade.java:123)
> at org.basex.io.IOUrl.response(IOUrl.java:145)
> at org.basex.io.IOUrl.inputStream(IOUrl.java:127)
> at org.basex.io.in.BufferInput.get(BufferInput.java:49)
> at org.basex.io.IOUrl.read(IOUrl.java:112)
> at
> org.basex.query.util.pkg.RepoManager.install(RepoManager.java:64)
> at org.basex.core.cmd.RepoInstall.run(RepoInstall.java:36)
> at org.basex.core.Command.run(Command.java:233)
> at org.basex.core.Command.execute(Command.java:93)
> at org.basex.api.client.LocalSession.execute(LocalSession.java:131)
> at org.basex.api.client.Session.execute(Session.java:36)
> at org.basex.core.CLI.execute(CLI.java:94)
> at org.basex.core.CLI.execute(CLI.java:78)
> at org.basex.core.CLI.execute(CLI.java:65)
> at org.basex.BaseX.(BaseX.java:83)
> at org.basex.BaseX.main(BaseX.java:45)
> Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed:
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find
> valid certification path to requested target
>
>
> On Wed, 10 Apr 2024 at 11:16, Régis WITZ  wrote:
>
>> Hi all,
>> I'm kind of an XQuery noob, but I'm guilty of the creation of a little
>> BaseX module, Heimdall (10.5281/zenodo.10638084
>> ).
>> It's automatically built, unit tested and documented using GitLab CI.
>> All build results, including the .xar file, are hosted using GitLab Pages.
>> Thus, the module .xar archive is available here :
>> |https://datasphere.gitpages.huma-num.fr/heimdall/xquery/heimdall.xar|
>> .
>> As you can see, all this is hosted in Huma-Num's self-hosted GitLab instance
>> (Huma-Num is a French research infrastructure).
>>
>> However, this mail is /not/ (just) a shameless plug.
>> My issue is that I cannot install my .xar directly with |REPO INSTALL| :
>>
>>  1. I cannot install it with the CLI:
>>
>> basex -c"> path='
>> https://datasphere.gitpages.huma-num.fr/heimdall/xquery/heimdall.xar'/>"
>> [repo:not-found] Package
>> 'https://datasphere.gitpages.huma-num.fr/heimdall/xquery/heimdall.xar
>> '
>> not found.
>>
>> Things are perfectly fine with for example the functx package :
>> |basex -c"> path='https://files.basex.org/modules/expath/functx-1.0.xar'/>"|
>>  2. For gigs, I tried to change the protocol from HTTPS to HTTP, and
>> here is the error that I get :
>>
>> basex -c"> path='
>> http://datasphere.gitpages.huma-num.fr/heimdall/xquery/heimdall.xar'/>"
>> [repo:parse] heimdall.xar: Resource "expath-pkg.xml" not found..
>>
>> So, maybe there is a problem with my expath-pkg.xml file ?
>> I didn't find any -but, as I said, I'm no pro. If needed, here is
>> the file that is wrapped in the .xar :
>>
>> https://gitlab.huma-num.fr/datasphere/heimdall/xquery/-/raw/main/expath-pkg.xml?ref_type=heads
>> <
>> https://gitlab.huma-num.fr/datasphere/heimdall/xquery/-/raw/main/expath-pkg.xml?ref_type=heads
>> >
>>  3. However, when I download my .xar file from the URL, and then REPO
>> INSTALL it using the local path, stuff works fine :
>> wget --no-check-certificate
>> https://datasphere.gitpages.huma-num.fr/heimdall/xquery/heimdall.xar
>> basex -c""
>> basex -c""
>>
>> Name Version  Type Path
>>
>> ---
>> http://heimdall.huma-num.fr  2.1  EXPath
>> http-heimdall.huma-num.fr-2.1
>>
>> 1 package(s).
>>
>> So I suppose the problem /might/ not be within the .xar itself, but the
>> way it is hosted ?
>> Maybe REPO INSTALL doesn't like my disturbing lack of SSL certificate ?
>> I tried to search for some clue, but didn't

Re: [basex-talk] some eq versus =

2024-04-10 Thread Christian Grün
Hi Leo,

> I came across this question because I needed to know whether there are city
> elements twice in the file. For that I wrote version 2 and the result was
> wrong. Then I wrote version 1 with
> … satisfies . => deep-equal($city)
> that gave the correct answer. I noticed that I do not fully understand the
> cast behavior of the = operator…
>

Thanks. deep-equal() is probably what you want. If you use general
comparisons (=, !=, etc.), or if you use “data(.)” or “string(.)”, the
descendant text nodes of the referenced node will be concatenated and
returned as a single string. As a result, queries like the following one…

  X = X

…will return “true” because the atomized value of both operands is “X”.
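
To make the difference concrete, here is a small sketch (the sample elements
are hypothetical and not taken from the original thread) that compares two
differently structured elements, once with a general comparison and once
with deep-equal():

```xquery
(: Hypothetical sample data: same text content, different structure :)
let $a := <city><name>X</name></city>
let $b := <city>X</city>
return (
  (: general comparison: both operands are atomized to "X" :)
  $a = $b,
  (: deep-equal also compares the element structure :)
  deep-equal($a, $b)
)
(: returns: true, false :)
```

The general comparison only sees the concatenated text, whereas deep-equal()
notices that one element has a child element and the other does not.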

Sibling node traversals are often slow, as the same nodes are repeatedly
processed. Grouping the elements by their serialized form avoids this:

  for $city-group in //city
  group by $string := serialize($city-group)
  where count($city-group) > 1
  return head($city-group)

With the latest BaseX 11 snapshot and the upcoming XQuery 4 features, it
could be:

  for value $v in map:build(//city, serialize#1)
  where count($v) > 1
  return head($v)

Hope this helps,
Christian


Re: [basex-talk] some eq versus =

2024-04-09 Thread Christian Grün
Hi Leo,

Yes, they are equivalent. Version 2 is a bit faster because "." will only
need to be atomized once.

We’ll examine if BaseX can automatically rewrite version 1 to version 2.

Best,
Christian


On Mon, Apr 8, 2024 at 11:59 PM Leo Studer  wrote:

> Hello
>
> are the following queries equivalent?
>
> 1. //*city*[*some* *$city* *in* following::*city* *satisfies* *string*(.)
> eq *string*(*$city*)]
>
> 2. //*city*[. = following::*city* ]
>
>
>
> Thanks in advance,
> Leo
>
>
>


Re: [basex-talk] Single command expected

2024-04-08 Thread Christian Grün
Hi Vladimir,

There’s currently no such function available. Even if we had a user:kill
function, we could not prevent a user from logging in again a millisecond
later, just before user:drop is executed. We could enrich user:drop et al.
with an “enforce” option to kill users, but the challenge is that a user may
currently be executing a long-running update query that needs to be finalized
before the user can be dropped, thus delaying the execution of the
user:drop operation and possibly other subsequent requests.

Have you already observed the presented pattern?

Best,
Christian


On Fri, Apr 5, 2024 at 10:36 AM Ветошкин Владимир 
wrote:

> Hi,
>
> How can I kill a session and then call user:grant or user:drop in a
> single command?
> When I do it in two separate commands, the user logs in between them and
> the second command fails with an error saying that the user is currently
> logged in.
>
> --
> С уважением,
> Ветошкин Владимир Владимирович
>
>


Re: [basex-talk] hof:until is gone?

2024-03-28 Thread Christian Grün
Hi Graydon,

Folks tell us it’s time to stop delaying BaseX 11… We’re trying hard.

The good news: The only difference between hof:until [1] and fn:do-until
[2] is the order of parameters. The following queries will do the same
thing:

hof:until(function($a) { $a > 16 }, function($a) { $a * 2 }, 1)
do-until(1, function($a) { $a * 2 }, function($a) { $a > 16 })

With fn:do-until, the input has moved to the first position, the action
comes second and the predicate comes last.
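
For code that has to run on versions with neither function, the loop can
also be written as a plain recursive XQuery 3.1 function. The following is
only a sketch: local:until is a hypothetical helper, and it assumes the
behavior described in the BaseX 10 documentation, namely that hof:until
tests the predicate before the first application of the function. Under that
assumption, hof:until and fn:do-until could differ when the start value
already satisfies the predicate, because fn:do-until applies the action at
least once.

```xquery
(: Hypothetical helper that mimics hof:until($pred, $func, $start) :)
declare function local:until(
  $pred  as function(item()*) as xs:boolean,
  $func  as function(item()*) as item()*,
  $start as item()*
) as item()* {
  (: test first; apply $func only while the predicate is false :)
  if ($pred($start))
  then $start
  else local:until($pred, $func, $func($start))
};

(: same arguments as the hof:until call above; returns 32 :)
local:until(function($a) { $a > 16 }, function($a) { $a * 2 }, 1)
```

The value is doubled from 1 to 2, 4, 8, 16 and 32; 32 is the first value
greater than 16, so it is returned.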

Hope this helps,
Christian

[1] https://files.basex.org/releases/10.0/BaseX100.pdf
[2]
https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-do-until



On Thu, Mar 28, 2024 at 5:41 PM Graydon  wrote:

> Hi Andy --
>
> On Thu, Mar 28, 2024 at 01:06:10PM +, Andy Bunce scripsit:
> > hof:until is not gone, it is just hiding  [1]
> > - In 10.7 it is there but undocumented in Wiki
>
> Which is useful to know -- thank you! -- but makes me think a bunch of
> my production code is going to break hard when 11 is released. I would
> really like to avoid that. (In particular, the conversation with
> management about how much refactoring work will be required.)
>
> > - In BaseX 11 you need to use XQuery 4  fn:while-do⁺, fn:do-until⁺ [2]
>
> It is very probably my brain, but I'm having trouble transposing
> hof:until to those functions. I can implement what I want to do with
> hof:until like so:
>
> declare function xc:dropTableLinesFancy($in as node()*) as node()* {
>   let $result as map(*) :=
> hof:until(
>   (: are we done? (= we've run off the end of our list of nodes ) :)
>   function($m) {
>  empty($m?oldNodes)
> },
>   (: create new list of line elements :)
>   function($m) {
>   map {
> 'oldNodes': tail($m?oldNodes),
> 'newNodes': ($m?newNodes,
>   switch (true())
> case starts-with(head($m?oldNodes),':stab') return 
> case starts-with(head($m?oldNodes),':rtab') return 
> case $m?toggle return 
> default return head($m?oldNodes)
>   ),
> 'toggle': (
>   switch (true())
> case starts-with(head($m?oldNodes),':stab') return true()
> case starts-with(head($m?oldNodes),':rtab') return false()
> default return $m?toggle
>   )
>   }
>   },
>   (: initial input = we start with no new nodes :)
>   map { 'oldNodes': $in, 'newNodes': (), 'toggle': false() }
> )
>   return $result?newNodes
> };
>
> Anyone willing to provide an example of what that would look like in
> fn:while-do or fn:do-until?
>
> Thanks!
> Graydon
>
>
> --
> Graydon Saunders  | graydon...@fastmail.com
> Þæs oferéode, ðisses swá mæg.
> -- Deor  ("That passed, so may this.")
>


Re: [basex-talk] BaseX and NFS

2024-03-20 Thread Christian Grün
Hi Marco,

I’m sorry, I can’t give any explanation of why BaseX performs worse with
NFS. In principle, we rely on standard Java IO/NIO.

Does it make a noticeable difference if you copy many small files a)
locally and b) to the NFS exported disk?

Ciao,
Christian



On Thu, Mar 14, 2024 at 5:28 PM Marco Lettere  wrote:

> Dear all,
>
> we have an instance of BaseX running on top of an NFS exported disk. Yes
> yes I know it's not the best possible scenario thus I was expecting a
> slight performance decrease.
>
> Anyway, when comparing local disk to NFS disk, for a tiny operation like
> storing a very small document into a database (without optimize or
> indexes) we get orders of magnitude of difference. And I'm saying from
> few hundreds of ms to several seconds in some cases.
>
> Does anyone have experience with cases like these? Or a solid motivation
> that explains this degradation?
>
> Thank you very much as usual.
>
> Marco.
>
>


Re: [basex-talk] Out of Main Memory

2024-03-15 Thread Christian Grün
Hi Greg,

I would have guessed that 12 GB is enough for 4.7 GB; but it sometimes
depends on the input. If you like, you can share a single typical document
with us, and we can have a look at it. 61 GB will be too large for a
complete full-text index, though. However, it’s always possible to
distribute documents across multiple databases and access them with a single
query [1].
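
As a rough sketch of what such a query could look like (the database naming
scheme 'books-…' and the search term are assumptions made for illustration;
each database is assumed to have its own full-text index):

```xquery
(: Hypothetical naming scheme: databases 'books-1', 'books-2', ... :)
for $name in db:list()[starts-with(., 'books-')]
(: ft:search returns the text nodes that match the search term :)
for $hit in ft:search($name, 'grace')
return <hit db="{ $name }" path="{ db:path($hit) }"/>
```

Each database is searched via its own index, so the individual indexes stay
smaller than one index over the complete collection would be.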

The full-text index is not incremental (in contrast to the other index
structures), which means it must be re-created after updates. However,
it’s possible to re-index updated database instances and query fully
indexed databases at the same time.

Hope this helps,
Christian

[1] https://docs.basex.org/wiki/Databases


On Thu, Mar 14, 2024 at 10:58 PM Murray, Gregory 
wrote:

> Thanks, Christian. I don’t think selective indexing is applicable in my
> use case, because I need to perform full-text searches on the entirety of
> each document. Each XML document represents a physical book that was
> digitized, and the structure of each document is essentially a header with
> metadata and a body with the OCR text of the book. The OCR text is split
> into pages, where one  element contains all the words from one
> corresponding printed page from the physical book. Obviously the number of
> words in each  varies widely based on the physical dimensions of the
> book and the typeface.
>
>
>
> So far, I have loaded 12,331 documents, containing a total of 2,196,771
> pages. The total size of those XML documents on disk is 4.7GB. But that is
> only a fraction of the total number of documents I want to load into BaseX.
> The total number is more like 160,000 documents. Assuming that the
> documents I’ve loaded so far are a representative sample, and I believe
> that’s true, then the total size of the XML documents on disk, prior to
> loading them into BaseX, would be about 4.7GB * 13 = 61.1GB.
>
>
>
> Normally the OCR text, once loaded, almost never changes. But the metadata
> fields do change as corrections are made. Also we add more XML documents
> routinely as we digitize more books over time. Therefore updates and
> additions are commonplace, such that keeping indexes up to date is
> important, to allow full-text searches to stay performant. I’m wondering if
> there are techniques for optimizing such quantities of text.
>
>
>
> Thanks,
>
> Greg
>
>
>
> *From: *Christian Grün 
> *Date: *Thursday, March 14, 2024 at 8:48 AM
> *To: *Murray, Gregory 
> *Cc: *basex-talk@mailman.uni-konstanz.de <
> basex-talk@mailman.uni-konstanz.de>
> *Subject: *Re: [basex-talk] Out of Main Memory
>
> Hi Greg,
>
>
>
> A quick reply: If only parts of your documents are relevant for full-text
> queries, you can restrict the selection with the FTINDEX option (see [1]
> for more information).
>
>
>
> How large is the total size of your input documents?
>
>
>
> Best,
>
> Christian
>
>
>
> [1] https://docs.basex.org/wiki/Indexes#Selective_Indexing
>
>
>
>
>
>
>
> On Tue, Mar 12, 2024 at 8:34 PM Murray, Gregory 
> wrote:
>
> Hello,
>
>
>
> I’m working with a database that has a full-text index. I have found that
> if I iteratively add XML documents, then optimize, add more documents,
> optimize again, and so on, eventually the “optimize” command will fail with
> “Out of Main Memory.” I edited the basex startup script to change the
> memory allocation from -Xmx2g to -Xmx12g. My computer has 16 GB of memory,
> but of course the OS uses up some of it. I have found that if I exit
> memory-hungry programs (web browser, Oxygen), start basex, and then run the
> “optimize” command, I still get “Out of Main Memory.” I’m wondering if
> there are any known workarounds or strategies for this situation. If I
> understand the documentation about indexes correctly, index data is
> periodically written to disk during optimization. Does this mean that
> running optimize again will pick up where the previous attempt left off,
> such that running optimize repeatedly will eventually succeed?
>
>
>
> Thanks,
>
> Greg
>
>
>
>
>
> Gregory Murray
>
> Director of Digital Initiatives
>
> Wright Library
>
> Princeton Theological Seminary
>
>
>
>
>
>


Re: [basex-talk] Out of Main Memory

2024-03-14 Thread Christian Grün
Hi Greg,

A quick reply: If only parts of your documents are relevant for full-text
queries, you can restrict the selection with the FTINDEX option (see [1]
for more information).
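
A minimal sketch of selective full-text indexing, under stated assumptions:
the database name, the input path and the element name 'body' are
hypothetical, and the sketch uses the FTINCLUDE option described in the
selective-indexing section of [1] together with FTINDEX:

```xquery
(: Creates a database whose full-text index only covers 'body' elements.
   'books', '/path/to/xml' and 'body' are placeholders. :)
db:create(
  'books',
  '/path/to/xml',
  (),
  map { 'ftindex': true(), 'ftinclude': 'body' }
)
```

Text outside the included elements remains queryable, but full-text queries
on it cannot be answered from the index.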

How large is the total size of your input documents?

Best,
Christian

[1] https://docs.basex.org/wiki/Indexes#Selective_Indexing



On Tue, Mar 12, 2024 at 8:34 PM Murray, Gregory 
wrote:

> Hello,
>
>
>
> I’m working with a database that has a full-text index. I have found that
> if I iteratively add XML documents, then optimize, add more documents,
> optimize again, and so on, eventually the “optimize” command will fail with
> “Out of Main Memory.” I edited the basex startup script to change the
> memory allocation from -Xmx2g to -Xmx12g. My computer has 16 GB of memory,
> but of course the OS uses up some of it. I have found that if I exit
> memory-hungry programs (web browser, Oxygen), start basex, and then run the
> “optimize” command, I still get “Out of Main Memory.” I’m wondering if
> there are any known workarounds or strategies for this situation. If I
> understand the documentation about indexes correctly, index data is
> periodically written to disk during optimization. Does this mean that
> running optimize again will pick up where the previous attempt left off,
> such that running optimize repeatedly will eventually succeed?
>
>
>
> Thanks,
>
> Greg
>
>
>
>
>
> Gregory Murray
>
> Director of Digital Initiatives
>
> Wright Library
>
> Princeton Theological Seminary
>
>
>
>
>


Re: [basex-talk] validate:xsd() questions

2024-03-11 Thread Christian Grün
Hi Greg,

A helpful observation. With XQuery 4, parameters can directly be addressed
by their name, and optional arguments can be omitted:

  validate:xsd($doc, options := $options)


The changes that were required to make this work led to a bug that is now
fixed in the latest snapshot [1].

Thanks,
Christian

[1] https://files.basex.org/releases/latest/



On Mon, Mar 11, 2024 at 2:42 PM Murray, Gregory 
wrote:

> Hello,
>
>
>
> What is the default value of the “cache” option for the validate:xsd()
> function? The default doesn’t seem to be indicated in the documentation for
> that function.
>
>
>
> Also, the following code raises an error message saying “Item expected,
> empty sequence found” but that seems odd since the second argument to
> validate:xsd() is supposed to be optional, thereby telling the validation
> process to fall back to using the schema indicated in the document itself.
>
>
>
> let $options := map {'cache' : true()}
>
> for $doc in collection()
>
> return validate:xsd($doc, (), $options)
>
>
>
> Is this a bug or have I misunderstood something?
>
>
>
> Thanks,
>
> Greg
>
>
>
>
>
> Gregory Murray
>
> Director of Digital Initiatives
>
> Wright Library
>
> Princeton Theological Seminary
>
>
>
>
>


Re: [basex-talk] xslt:transform-report result

2024-03-11 Thread Christian Grün
…much appreciated, Greg! Your edits look completely fine. Thanks.


On Mon, Mar 11, 2024 at 5:49 PM Murray, Gregory 
wrote:

> Hi Christian,
>
>
>
> I made two tweaks in the documentation for xslt:transform-report. First,
> the function returns a map, but the documentation indicated xs:string as
> the return type, so I changed it to map(*).  Second, the function seems to
> work fine as long as the stylesheet doesn’t rely on , so
> indicating “Requires Saxon 10” at the beginning of the “Summary” for the
> entire function seems misleading. Based on what you’ve said below, only the
> “messages” value of the map doesn’t work without Saxon 10. So, I moved
> “Requires Saxon 10” to the “messages” description. (Using Requires Saxon
> 10 causes a display oddity, so I omitted the  wrapper.)
>
>
>
> Please revert or fix my changes as you deem best.
>
>
>
> Thanks,
>
> Greg
>
>
>
> *From: *BaseX-Talk  on behalf
> of Christian Grün 
> *Date: *Monday, March 11, 2024 at 3:12 AM
> *To: *Andy Bunce 
> *Cc: *BaseX 
> *Subject: *Re: [basex-talk] xslt:transform-report result
>
> Hi Andy,
>
>
>
> Thanks for your edits, which I’ve just revised: It turns out that Saxon 10
> is required for xslt:transform-report for both BaseX 10 and 11 beta. As
> you’ve indicated, the Saxon API seems to change with new versions. We
> haven’t checked yet what exactly has changed, and whether the API has
> changed again from version 11 to 12.
>
>
>
> Best,
>
> Christian
>
>
>
>
>
> On Fri, Mar 8, 2024 at 6:57 PM Andy Bunce  wrote:
>
> Hi,
>
>
>
> I have recently been using xslt:transform-report to get xsl:message
> reports.
> I am using BaseX 10.7 with various versions of Saxon-he. I have run
> example 4 from[1]
> with
>
>- saxon-he-10.9.jar
>- saxon-he-11.6.jar
>- saxon-he-12.4.jar
>
> Only saxon-he-10 captures the messages for me. (although when using BaseX
> 11 I seem to recall only Saxon 12 captured them)
>
> Is this correct and expected?
>
>
>
> I understand the Saxon API may be a moving target.
>
> Perhaps the documentation could indicate a recommended Saxon version for a
> given BaseX version, rather than "For the moment, messages can only be
> returned with recent versions of Saxon."
>
> I have made minor updates to [2] to this effect. Please edit or revert if
> my understanding is incorrect.
>
>
>
> /Andy
>
> .
>
> [1] https://docs.basex.org/wiki/XSLT_Module#Examples
>
> [2] https://help.basex.org/main/XSLT_Module
>
>
>
> On Wed, 4 May 2022 at 22:16, Christian Grün 
> wrote:
>
> I agree it's somewhat unexpected. As we are working on the string result
> that is returned by Saxon, it's currently not that easy indeed to decide
> how to interpret the character stream.
>
>
>
>
>
>
>
> Andy Bunce  schrieb am Mi., 4. Mai 2022, 23:11:
>
> Ok thanks. I thought this might have been unintended behavior.
>
> if($report?result instance of document-node()+) then
> document{$report?result} else $report?result
>
> seems to give me what I expected here.
>
> /Andy
>
>
>
>
>
> On Wed, 4 May 2022 at 16:58, Christian Grün 
> wrote:
>
> Thanks, Andy, I’ve updated the documentation.
>
>
>
> On Wed, May 4, 2022 at 3:12 PM Andy Bunce  wrote:
>
> Hi,
>
>
>
> Using BaseX 9.7.1 and saxon9he-9.9.1.jar
>
> The documentation suggests the ?result from xslt:transform-report should
> be *a* document-node where possible [1]
>
> This seems not quite to be the case when there are processing instructions
> or comments at the top level. In these cases a sequence of document-nodes
> is returned.
>
>
>
> /Andy
>
> [1] https://docs.basex.org/wiki/XSLT_Module#xslt:transform-report
>
>
>
> let $xslt:=http://www.w3.org/1999/XSL/Transform";
> version="3.0">
> 
> 
>
>
>
> let $xml:=document{ , }
> return xslt:transform-report($xml,$xslt)
>
>
>
> Returns
>
> map {
>   "messages": (),
>   "result": (, )
> }
>
>
>
>
>
>
>
>
>
>


Re: [basex-talk] xslt:transform-report result

2024-03-11 Thread Christian Grün
Hi Andy,

Thanks for your edits, which I’ve just revised: It turns out that Saxon 10
is required for xslt:transform-report for both BaseX 10 and 11 beta. As
you’ve indicated, the Saxon API seems to change with new versions. We
haven’t checked yet what exactly has changed, and whether the API has
changed again from version 11 to 12.

Best,
Christian


On Fri, Mar 8, 2024 at 6:57 PM Andy Bunce  wrote:

> Hi,
>
> I have recently been using xslt:transform-report to get xsl:message
> reports.
> I am using BaseX 10.7 with various versions of Saxon-he. I have run
> example 4 from[1]
> with
>
>- saxon-he-10.9.jar
>- saxon-he-11.6.jar
>- saxon-he-12.4.jar
>
> Only saxon-he-10 captures the messages for me. (although when using BaseX
> 11 I seem to recall only Saxon 12 captured them)
> Is this correct and expected?
>
> I understand the Saxon API may be a moving target.
> Perhaps the documentation could indicate a recommended Saxon version for a
> given BaseX version, rather than "For the moment, messages can only be
> returned with recent versions of Saxon."
> I have made minor updates to [2] to this effect. Please edit or revert if
> my understanding is incorrect.
>
> /Andy
> .
> [1] https://docs.basex.org/wiki/XSLT_Module#Examples
> [2] https://help.basex.org/main/XSLT_Module
>
>
> On Wed, 4 May 2022 at 22:16, Christian Grün 
> wrote:
>
>> I agree it's somewhat unexpected. As we are working on the string result
>> that is returned by Saxon, it's currently not that easy indeed to decide
>> how to interpret the character stream.
>>
>>
>>
>> Andy Bunce  schrieb am Mi., 4. Mai 2022, 23:11:
>>
>>> Ok thanks. I thought this might have been unintended behavior.
>>> if($report?result instance of document-node()+) then
>>> document{$report?result} else $report?result
>>> seems to give me what I expected here.
>>> /Andy
>>>
>>>
>>> On Wed, 4 May 2022 at 16:58, Christian Grün 
>>> wrote:
>>>
>>>> Thanks, Andy, I’ve updated the documentation.
>>>>
>>>> On Wed, May 4, 2022 at 3:12 PM Andy Bunce  wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Using BaseX 9.7.1 and saxon9he-9.9.1.jar
>>>>> The documentation suggests the ?result from xslt:transform-report
>>>>> should be *a* document-node where possible [1]
>>>>> This seems not quite to be the case when there are processing
>>>>> instructions or comments at the top level. In these cases a sequence of
>>>>> document-nodes is returned.
>>>>>
>>>>> /Andy
>>>>> [1] https://docs.basex.org/wiki/XSLT_Module#xslt:transform-report
>>>>>
>>>>> let $xslt:=http://www.w3.org/1999/XSL/Transform"; version="3.0">
>>>>> 
>>>>> 
>>>>>
>>>>> let $xml:=document{ , }
>>>>> return xslt:transform-report($xml,$xslt)
>>>>>
>>>>> Returns
>>>>> map {
>>>>>   "messages": (),
>>>>>   "result": (, )
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>


Re: [basex-talk] Troubles with ft:mark

2024-03-07 Thread Christian Grün
Hi Jack,

In general, it should always be possible to get ft:mark working somehow;
it’s just difficult to give general advice how to do it ;)

If you like, you can provide us with a stripped-down version of your code
that you can’t get to work.

Best,
Christian


On Wed, Mar 6, 2024 at 5:25 AM Jack Steyn  wrote:

> Hi Christian,
>
> Thank you very much for your explanation and variant example.
>
> In my use case, the local:search function is itself being called (as a
> named function reference) from within another function that is the endpoint
> of a RESTXQ API. This containing function handles a number of things e.g.
> pagination, deduplication, transformation of XML into HTML.
>
> Even when I rewrite this local:search to match your variant example,
> incorrect results are still returned. But when I then add %rest:GET
> annotations to turn the local:search function into its own endpoint, the
> correct results are returned only when I use that endpoint directly.
>
> Thus I assume the containing function makes things again too complicated
> for metadata to be propagated.
>
> Does that sound plausible to you? And can you suggest any simple ways
> around it? I'm afraid applying %basex:inline hasn't helped.
>
> Very best,
>
> Jack
>
> On Fri, 1 Mar 2024, 8:12 pm Christian Grün, 
> wrote:
>
>> Hi Jack,
>>
>> > When you say you can't reproduce it, do you mean you get 14 results
>> from running this script?
>>
>> Yes, that’s what I meant.
>>
>> The following information is very technical and specific. Feel free to
>> focus on the examples.
>>
>> Your updated example was helpful, and I noticed it’s a bunch of issues
>> that lead to the unexpected results. The core challenge is that ft:mark and
>> ft:extract only yield expected results if the internally collected
>> full-text metadata is not lost at some stage during the internal processing
>> – which can happen at many places hidden to the writer of the query.
>>
>> In your specific example, the full-text information gets lost because the
>> local:search function is too complex to be inlined by the compiler (which
>> enables further optimizations that eventually allow metadata propagation).
>> You can tackle this by forcing the compiler to inline your function:
>>
>>   declare %basex:inline function local:search(...)
>>
>> Using '(ethnicgroups, languages)' instead of 'name() = (...)' is another
>> piece of practical advice; it helps the optimizer to detect at compile time
>> that metadata will be available at runtime. Another solution is to use
>> 'local-name()' instead of 'name()' (local-name does not rely on namespaces
>> that may occur in a database, which also affects how full-text queries are
>> evaluated).
>>
>> Here’s a variant that should work:
>>
>> declare function local:search(
>>   $database  as xs:string,
>>   $query as xs:string
>> ) {
>>   let $country := ft:search($database, $query)/ancestor::country
>>   let $search := function($node) { $node/text() contains text { $query } }
>>   return (
>>     ft:mark($country[.//name[$search(.)]]),
>>     ft:mark($country[.//city[$search(.)]]),
>>     ft:mark($country[.//(ethnicgroups, languages)[$search(.)]])
>>   )
>> };
>> local:search('factbook', 'German')
>>
>> …or…
>>
>>   let $search := function($nodes) { $nodes[text() contains text { $query }] }
>>   return (ft:mark($country[$search(.//name)]), ...
>>
>> From today’s perspective, we would certainly design ft:mark and
>> ft:extract in a way that the results are always correct. The consequences,
>> however, would be a much more restricted syntax.
>>
>> Hope this helps,
>> Christian
>>
>>
>> On Thu, Feb 29, 2024 at 12:13 AM Jack Steyn  wrote:
>>
>>> Hi Christian,
>>>
>>> When I run your script, I do get 14 elements.
>>>
>>> When I run the following script I just get 12.
>>>
>>> 
>>>   true
>>>   https://files.basex.org/xml/factbook.xml
>>> 
>>>   
>>> 
>>>
>>> When you say you can't reproduce it, do you mean you get 14 results from
>>> running this script?
>>>
>>> Cheers,
>>>
>>> Jack
>>>
>>> On Thu, 29 Feb 2024, 1:02 am Christian Grün, 
>>> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> Thanks for your observation.
>>>>
>>>>
>>>>> The first result of this query is the entry for Austria. I would
>>>>> expect both of the instances of the word 'German' in that entry to be
>>>>> surrounded by  tags. However only the first instance is.
>>>>>
>>>>
>>>> I couldn’t reproduce this yet. Here’s a command script that returns 14
>>>> German elements:
>>>>
>>>> 
>>>>   true
>>>>   https://files.basex.org/xml/factbook.xml
>>>> 
>>>>   >>> )
>>>> return $marked//*[text() = 'German']
>>>>   ]]>
>>>> 
>>>>
>>>> Could you check if you get the same result?
>>>>
>>>> Thanks in advance
>>>> Christian
>>>>
>>>>


Re: [basex-talk] Inconsistent behavior for root context in BaseX GUI

2024-03-06 Thread Christian Grün
Hi Tim,

Thanks for spending so much time on this, and the visual “proof”.

A mysterious one! The query seems to be correctly evaluated on my system,
whatever I do.


>- There’s no error that’s thrown, so I’m not certain about how to
>share a stack trace?
>
>
Good to know. If no stack trace is output on the command line, it’s at least
not an obvious bug.


>- I’ve tried to reproduce the issue with other XML input and queries,
>but nothing else seems to trigger it.
>
>
Good to know, too ;)

We’ll keep on trying. If someone else in our group manages to trigger the
bug, I’ll give you an update. Feel free to share more information with us
if you should manage to further isolate the bug.

Best,
Christian


Re: [basex-talk] DBA Editor behavior in v11 not as expected

2024-03-04 Thread Christian Grün
…thanks; we’ve added some usability tweaks in the editor.


On Mon, Mar 4, 2024 at 11:15 AM Eliot Kimber 
wrote:

> Now that I’m using an expected extension, a bit more feedback on the
> Editor UI.
>
>
>
> Having saved a file and then not modified what’s in the editor, opening a
> new file should not prompt for confirmation as the editor isn’t “dirty” and
> there can be no data loss (the contents of the editor were just saved).
>
>
>
> In my case I’m switching between two little test scripts: one to store
> docs and one to operate on them, so having to respond to an unnecessary
> confirmation every time I switch is annoying.
>
>
>
> Cheers,
>
>
>
> E.
>
> _
>
> *Eliot Kimber*
>
> Sr Staff Content Engineer
>
> O: 512 554 9368
>
> M: 512 554 9368
>
> servicenow.com <https://www.servicenow.com>
>
> LinkedIn <https://www.linkedin.com/company/servicenow> | Twitter
> <https://twitter.com/servicenow> | YouTube
> <https://www.youtube.com/user/servicenowinc> | Facebook
> <https://www.facebook.com/servicenow>
>
>
>
> *From: *Christian Grün 
> *Date: *Friday, March 1, 2024 at 2:58 PM
> *To: *Eliot Kimber 
> *Cc: *BaseX 
> *Subject: *Re: [basex-talk] DBA Editor behavior in v11 not as expected
> *[External Email]*
>
>
> --
>
> Good to know, it'll be easy to fix that [1]; we should do so anyway [2].
>
>
>
> [1]
> https://github.com/BaseXdb/basex/blob/main/basex-api/src/main/webapp/dba/static/editor.js#L56
>
> [2] https://help.basex.org/main/XQuery_Extensions#suffixes
>
>
>
> Eliot Kimber  schrieb am Fr., 1. März 2024,
> 21:41:
>
> I use “.xqy” for full XQuery scripts and “.xqm” for XQuery modules.
>
>
>
> Cheers,
>
>
>
> E.
>
>
>
>
>
>
> *From: *Christian Grün 
> *Date: *Friday, March 1, 2024 at 10:57 AM
> *To: *Eliot Kimber 
> *Cc: *BaseX 
> *Subject: *Re: [basex-talk] DBA Editor behavior in v11 not as expected
> *[External Email]*
>
>
> --
>
> Hi Eliot,
>
>
>
> We have aligned the behavior of the DBA editor with the BaseX GUI. Saved
> files are only executable if they are detected as XQuery files.
>
>
>
> But maybe we need to support some more file suffixes (.xq is the default
> extension). How do you name your files?
>
>
>
> Best,
>
> Christian
>
>
>
>
>
> Eliot Kimber  schrieb am Fr., 1. März 2024,
> 17:52:
>
> Using the 27-02-2024 build of v11, if I create a query in the Editor panel
> and then save it, the Run button is disabled, which is unexpected.
>
>
>
> If I then open the file, the Run button remains disabled.
>
>
>
> If I cut the contents of the editing panel, select “close” to clear the
> panel, then paste back into the panel, the Run button is enabled.
>
>
>
> I think this behavior is incorrect—as long as there is some content in the
> editor the Run button should be enabled.
>
>
>
> Cheers,
>
>
>
> E.
>
>
>
>
>


Re: [basex-talk] Inconsistency in base-uri()

2024-03-04 Thread Christian Grün
…just a quick reply: That’s probably related to [1], an ancient issue, in
which I tended to recommend the usage of db:path. I wish we’d finally find
time and resources to tackle this.


[1] https://github.com/BaseXdb/basex/issues/1172
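
A minimal sketch of the db:path approach mentioned above, assuming a database
named 'temp' as in the quoted report below (the name is taken from that
message and is only illustrative). Rather than relying on base-uri(), it reads
the stored path of each document directly:

```xquery
(: Sketch: list the database path of every document in 'temp'.
   Assumes the database exists. db:path returns the path as it is
   stored, so a space in the name appears unescaped. :)
for $doc in db:get('temp')
return db:path($doc)
```

If a valid URI is needed afterwards, the returned string could be passed
through iri-to-uri(), as the quoted script does for its xml:base values.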

On Mon, Mar 4, 2024 at 11:49 AM Eliot Kimber 
wrote:

> Using BaseX 11 (but I think the code is the same in BaseX 10).
>
>
>
> I’m trying to understand how base-uri() behaves relative to how it should
> behave when the database path of a document is not a valid URI, i.e., it
> has a space in it.
>
>
>
> First I have this test:
>
>
>
> let $doc as document-node() := document {  xml:base="temp/child-uri%20with%20space.xml">child }
>
> return $doc/*/child ! base-uri(.)
>
>
>
> Which produces:
>
> file:///data/basex/data/.dba/temp/child-uri%20with%20space.xml
>
>
>
> Which is the correct result: it’s the value of @xml:base and the escaped
> spaces make it a valid URI.
>
>
>
> Replacing %20 with “ “ in the @xml:base value results in this error:
>
> * Invalid URI: Illegal character in path at index 14: temp/child-uri with
> space.xml.*
>
>
>
> Also correct as the spaces have to be escaped.
>
>
>
> This verifies that base-uri() applied to nodes with explicit @xml:base
> attributes works per the spec. But this test does not involve database paths.
>
>
>
> To try to test things with database paths I then created this pair of test
> scripts:
>
> Script to put docs in a database:
>
> let $db := 'temp'
>
> let $filename as xs:string := 'with space.xml'
>
> let $doc1 as document-node() := document {No
> xml:base}
>
> let $doc2 as document-node() := document {  xml:base="{'/temp/xmlbase/doc2_' || $filename}">With xml:base
> unescaped }
>
> let $doc3 as document-node() := document {  xml:base="{iri-to-uri( '/temp/xmlbase/doc3_' || $filename)}">With xml:base
> escaped }
>
> return (()
>
> ,db:put($db, $doc1, 'doc1_' || $filename)
>
> ,db:put($db, $doc2, 'doc2_' || $filename)
>
> ,db:put($db, $doc3, 'doc3_' || $filename)
>
> )
>
>
>
> Script to report on them:
>
> let $db := 'temp'
> let $filenameBase as xs:string := 'with space.xml'
> return
> for $i in 1 to 3
>   let $filename := 'doc' || $i || '_' || $filenameBase
>   let $doc := db:get($db, $filename)
>   let $child as element() := $doc/*/child
>   let $dbPath := db:path($doc)
>   let $baseUriDoc := base-uri($doc)
>   let $baseUriChild :=
>     try {
>       base-uri($child)
>     } catch * {
>       $err:description
>     }
>   return (()
>     ,``[
> Doc "`{$dbPath}`":]``
>     ,$doc
>     ,``[xml:base att:  "`{$child/@xml:base}`"]``
>     ,``[base URI of doc:  "`{$baseUriDoc}`"]``
>     ,``[base URI of child: "`{$baseUriChild}`"]``
>   )
>
>
>
> Which returns this result:
>
> Doc "doc1_with space.xml":
>
> 
>
>   No xml:base
>
> 
>
> xml:base att:  ""
>
> base URI of doc:  "/temp/doc1_with space.xml"
>
> base URI of child: "/temp/doc1_with space.xml"
>
>
>
> Doc "doc2_with space.xml":
>
> 
>
>   With xml:base
> unescaped
>
> 
>
> xml:base att:  "/temp/xmlbase/doc2_with space.xml"
>
> base URI of doc:  "/temp/doc2_with space.xml"
>
> base URI of child: "Invalid URI: Illegal character in path at index 23:
> /temp/xmlbase/doc2_with space.xml."
>
>
>
> Doc "doc3_with space.xml":
>
> 
>
>   With xml:base
> escaped
>
> 
>
> xml:base att:  "/temp/xmlbase/doc3_with%20space.xml"
>
> base URI of doc:  "/temp/doc3_with space.xml"
>
> base URI of child: "Invalid URI: Illegal character in path at index 15:
> /temp/doc3_with space.xml."
>
>
>
> Note the result for doc3: It’s reporting the base URI of the document
> (/temp/doc3_with space.xml), not the base URI of the child
> (/temp/xmlbase/doc3_with%20space.xml). Why? I think the answer is that under
> the covers it’s doing resolve-uri(), which also checks the validity of both
> the base and relative parts.
>
>
>
> One observation is that base-uri() is treating the db-provided base URI
> differently from an xml:base-provided base URI, but only when there is no
> @xml:base attribute.
>
>
>
> In doc 1, the database path has a space but base-uri() does not fail when
> returning it even though it’s not a valid URI. Why not?
>
>
>
> In doc 2, the xml:base-supplied base URI is correctly reported as invalid,
> but the database-supplied base URI of the root is not reported as invalid.
>
>
>
> My expectation would be that the behavior is consistent: Either all URIs
> must be valid, including those coming from database paths or all are
> automatically escaped (as though iri-to-uri() had been applied).
>
>
>
> Finally, why do I get the result for doc 3, where it’s reporting the
> database path as the base URI of the child rather than the
> @xml:base-defined base URI (which is correctly escaped)?
>
>
>
> In my code, which depends on the use of @xml:base to do DITA link
> resolution for “resolved” DITA maps, I’ve adjusted my code to escape URIs
> in @xml:base values and as far as I can tell everything works as it should.
> But I’m still concerned about the incons

Re: [basex-talk] Inconsistent behavior for root context in BaseX GUI

2024-03-03 Thread Christian Grün
Hi Tim,

Thanks for your observation. We need more help to reproduce this: Could you
describe in detail, and step by step, how to trigger this issue?

In addition, you could…
• tell us more about your OS and JDK version
• start BaseX on the command line and share a possible stack trace with us
• verify if the error also occurs with other XML input (such as ) and
other queries (such as .)

The simpler the use case, the better…

Thanks
Christian



On Fri, Mar 1, 2024 at 6:11 PM Thompson, Timothy 
wrote:

> I’ve been noticing an inconsistent behavior in the GUI with BaseX 11 beta
> (build 17d8426).
>
>
>
> I have a simple query on a database created from a CSV file. Sometimes it
> works, but sometimes the root context is evaluated as an empty sequence:
>
>
>
> - rewrite context value: . -> ()
>
> - rewrite util:root(nodes): util:root(()) -> ()
>
>
>
> I notice this when I open the database as context and try to execute the
> query. After the query fails, if I try to click on the “home” icon in the
> result panel, it also returns an empty sequence, even though the database
> is open.
>
>
>
> If I reopen the database and run the query, it works again.
>
>
>
> Sample data:
>
>
>
> 
>
>   
>
> http://id.loc.gov/authorities/names/n0121
>
> http://id.loc.gov/rwo/agents/n0121
>
> 0.11861849
>
>   
>
>   
>
> http://id.loc.gov/authorities/names/n0122
>
> http://id.loc.gov/rwo/agents/n0122
>
> 0.11699477
>
>   
>
>   
>
> http://id.loc.gov/authorities/names/n0267
>
> http://id.loc.gov/rwo/agents/n0267
>
> 0.10811427
>
>   
>
> 
>
>
>
> Query (with DB open as context):
>
>
>
> count(distinct-values(
>   for $rec in /csv/record
>   where contains($rec/entry[1], "id.loc.gov")
>   return $rec/entry[1]
> ))
>
>
>
> Thanks in advance,
>
> Tim
>
>
>


Re: [basex-talk] DBA Editor behavior in v11 not as expected

2024-03-01 Thread Christian Grün
Good to know, it'll be easy to fix that [1]; we should do so anyway [2].

[1]
https://github.com/BaseXdb/basex/blob/main/basex-api/src/main/webapp/dba/static/editor.js#L56
[2] https://help.basex.org/main/XQuery_Extensions#suffixes


Eliot Kimber  schrieb am Fr., 1. März 2024,
21:41:

> I use “.xqy” for full XQuery scripts and “.xqm” for XQuery modules.
>
>
>
> Cheers,
>
>
>
> E.
>
>
>
>
>
>
> *From: *Christian Grün 
> *Date: *Friday, March 1, 2024 at 10:57 AM
> *To: *Eliot Kimber 
> *Cc: *BaseX 
> *Subject: *Re: [basex-talk] DBA Editor behavior in v11 not as expected
> *[External Email]*
>
>
> --
>
> Hi Eliot,
>
>
>
> We have aligned the behavior of the DBA editor with the BaseX GUI. Saved
> files are only executable if they are detected as XQuery files.
>
>
>
> But maybe we need to support some more file suffixes (.xq is the default
> extension). How do you name your files?
>
>
>
> Best,
>
> Christian
>
>
>
>
>
> Eliot Kimber  schrieb am Fr., 1. März 2024,
> 17:52:
>
> Using the 27-02-2024 build of v11, if I create a query in the Editor panel
> and then save it, the Run button is disabled, which is unexpected.
>
>
>
> If I then open the file, the Run button remains disabled.
>
>
>
> If I cut the contents of the editing panel, select “close” to clear the
> panel, then paste back into the panel, the Run button is enabled.
>
>
>
> I think this behavior is incorrect—as long as there is some content in the
> editor the Run button should be enabled.
>
>
>
> Cheers,
>
>
>
> E.
>
>
>
>
>


Re: [basex-talk] DBA Editor behavior in v11 not as expected

2024-03-01 Thread Christian Grün
Hi Eliot,

We have aligned the behavior of the DBA editor with the BaseX GUI. Saved
files are only executable if they are detected as XQuery files.

But maybe we need to support some more file suffixes (.xq is the default
extension). How do you name your files?

Best,
Christian



Eliot Kimber  schrieb am Fr., 1. März 2024,
17:52:

> Using the 27-02-2024 build of v11, if I create a query in the Editor panel
> and then save it, the Run button is disabled, which is unexpected.
>
>
>
> If I then open the file, the Run button remains disabled.
>
>
>
> If I cut the contents of the editing panel, select “close” to clear the
> panel, then paste back into the panel, the Run button is enabled.
>
>
>
> I think this behavior is incorrect—as long as there is some content in the
> editor the Run button should be enabled.
>
>
>
> Cheers,
>
>
>
> E.
>
>
>
>


Re: [basex-talk] Optimize database never returns, leaves database in "opened by another process" state

2024-03-01 Thread Christian Grün
Glad to hear it, thanks Eliot.


Eliot Kimber  schrieb am Fr., 1. März 2024,
16:21:

> Using the 27-02-2024 build I have confirmed that the optimize database
> deadlock seems to be resolved.
>
>
>
> I was able to easily upgrade my code to replace prof:dump() with message()
> and db:open() with db:get() and everything else seems to be working as it
> should.
>
>
>
> I like the new Editor replacement for the old Query feature in the DBA app.
>
>
>
> Cheers,
>
>
>
> E.
>
>
>
>
>
>
> *From: *Christian Grün 
> *Date: *Wednesday, February 28, 2024 at 9:36 AM
> *To: *Eliot Kimber 
> *Cc: *basex-talk@mailman.uni-konstanz.de <
> basex-talk@mailman.uni-konstanz.de>
> *Subject: *Re: [basex-talk] Optimize database never returns, leaves
> database in "opened by another process" state
> *[External Email]*
>
>
> --
>
> …this one could be related to a bug that was recently fixed in the latest
> snapshot [1]. About time to get BaseX 11 finished…
>
>
>
> [1]
> https://github.com/BaseXdb/basex/commit/45d97f8065615fb734b712bc4c77c39899e9d496
>
>
>
>
>
>
>
> On Mon, Feb 26, 2024 at 5:25 PM Eliot Kimber 
> wrote:
>
> Using BaseX 10.7 on Linux.
>
>
> I’m running a sequence of jobs to update and optimize a set of databases
> following loading a number of documents created dynamically (as opposed to
> being read from disk).
>
>
>
> I’m seeing a new behavior, which is that the optimization step never
> completes but also doesn’t show any error in the log. The database shows no
> items and is in the locked by another process state if I try to drop it.
> This behavior seems to be consistently repeatable with my current code base
> (I’m working on some code updates, so it’s possible I’ve introduced
> something that would cause this behavior but I haven’t changed the code
> that leads up to the failing optimize). The server has plenty of disk
> space, etc.
>
>
>
> Optimization code is:
>
>   try {
>     if (db:exists($database))
>     then (
>       util:logToConsole('dbadmin:optimizeDatabase', ``[Optimizing database `{$database}`]``),
>       db:optimize($database, true(), $dbadmin:dbOptimizeOptions)
>     )
>     else util:logToConsole('dbadmin:optimizeDatabase', ``[Database '`{$database}`' does not exist. Nothing to optimize.]``)
>   } catch * {
>     util:logToConsole(
>       'dbadmin:optimizeDatabase',
>       ``[Exception optimizing database '`{$database}`': `{$err:code}` - `{$err:description}`]``,
>       'error')
>   }
>
>
>
> And the optimization options are:
>
> declare variable $dbadmin:dbOptimizeOptions as map(*) :=
>   (: Turn on all the indexes :)
>   map {
>     'attrindex' : true(),
>     'tokenindex' : true(),
>     'textindex' : true(),
>     'ftindex' : true()
>   };
>
> This code has been working fine for a long time and I’ve been running 10.7
> for at least a couple of months, so I’m wondering:
>
> A) What would cause this behavior?
> B) How can I diagnose it short of debugging the Java code (which I can do
> but it’s non-trivial for me to set up).
>
>
>
> Thanks,
>
>
>
> Eliot
>
>
>
>
>


Re: [basex-talk] Optimize database never returns, leaves database in "opened by another process" state

2024-03-01 Thread Christian Grün
…exactly; thanks, Andy. Time works against us, so we’ve added a link to the
new documentation on the start page of docs.basex.org.


On Thu, Feb 29, 2024 at 11:15 AM Andy Bunce  wrote:

> Maybe: https://help.basex.org/main/Profiling_Module
>
> /Andy
>
> On Thu, 29 Feb 2024 at 09:40, Eliot Kimber 
> wrote:
>
>> Using the latest build, 11.0 beta 17d8426, the prof:dump() function is
>> reported as an unknown function.
>>
>>
>>
>> What replaces it (or where can I find the V11 docs)?
>>
>>
>>
>> Thanks,
>>
>>
>>
>> E.
>>
>>
>>
>>
>>
>>
>>
>>
>> *From: *Eliot Kimber 
>> *Date: *Thursday, February 29, 2024 at 3:15 AM
>> *To: *Christian Grün 
>> *Cc: *basex-talk@mailman.uni-konstanz.de <
>> basex-talk@mailman.uni-konstanz.de>
>> *Subject: *Re: [basex-talk] Optimize database never returns, leaves
>> database in "opened by another process" state
>>
>> Found the latest build at https://files.basex.org/releases/latest/
>>
>>
>>
>> Cheers,
>>
>>
>>
>> E.
>>
>>
>>
>>
>>
>>
>> *From: *Eliot Kimber 
>> *Date: *Thursday, February 29, 2024 at 2:45 AM
>> *To: *Christian Grün 
>> *Cc: *basex-talk@mailman.uni-konstanz.de <
>> basex-talk@mailman.uni-konstanz.de>
>> *Subject: *Re: [basex-talk] Optimize database never returns, leaves
>> database in "opened by another process" state
>>
>> I’m trying to compile the latest code but “mvn clean install” fails on
>> failure to download some dependencies:
>>
>> [*INFO*] *--< *org.basex:basex*
>> >---*
>>
>> [*INFO*] *Building BaseX Core 11.0-SNAPSHOT*
>>
>> [*INFO*]   from pom.xml
>>
>> [*INFO*] *[ jar
>> ]-*
>>
>> Downloading from devsnc-mirror:
>> http://nexus.proxy.devsnc.com/content/groups/public/jp/sourceforge/igo/igo/0.4.3/igo-0.4.3.pom
>>
>> [*WARNING*] The POM for jp.sourceforge.igo:igo:jar:0.4.3 is missing, no
>> dependency information available
>>
>> Downloading from devsnc-mirror:
>> http://nexus.proxy.devsnc.com/content/groups/public/org/apache/lucene-stemmers/3.4.0/lucene-stemmers-3.4.0.pom
>>
>> [*WARNING*] The POM for org.apache:lucene-stemmers:jar:3.4.0 is missing,
>> no dependency information available
>>
>> Downloading from devsnc-mirror:
>> http://nexus.proxy.devsnc.com/content/groups/public/jp/sourceforge/igo/igo/0.4.3/igo-0.4.3.jar
>>
>> Downloading from devsnc-mirror:
>> http://nexus.proxy.devsnc.com/content/groups/public/org/apache/lucene-stemmers/3.4.0/lucene-stemmers-3.4.0.jar
>>
>> [*INFO*]
>> **
>>
>> [*INFO*] *BUILD FAILURE*
>>
>> [*INFO*]
>> **
>>
>> [*INFO*] Total time:  5.589 s
>>
>> [*INFO*] Finished at: 2024-02-29T02:34:11-06:00
>>
>> [*INFO*]
>> **
>>
>> [*ERROR*] Failed to execute goal on project basex: *Could not resolve
>> dependencies for project org.basex:basex:jar:11.0-SNAPSHOT: The following
>> artifacts could not be resolved: jp.sourceforge.igo:igo:jar:0.4.3 (absent),
>> org.apache:lucene-stemmers:jar:3.4.0 (absent): Could not find artifact
>> jp.sourceforge.igo:igo:jar:0.4.3 in devsnc-mirror

Re: [basex-talk] Troubles with ft:mark

2024-03-01 Thread Christian Grün
Hi Jack,

> When you say you can't reproduce it, do you mean you get 14 results from
running this script?

Yes, that’s what I meant.

The upcoming information will be very technical and specific. You are
welcome to focus on the examples.

Your updated example was helpful, and I noticed it’s a bunch of issues that
lead to the unexpected results. The core challenge is that ft:mark and
ft:extract only yield expected results if the internally collected
full-text metadata is not lost at some stage during the internal processing
– which can happen at many places hidden to the writer of the query.

In your specific example, the full-text information gets lost because the
local:search function is too complex to be inlined by the compiler (which
enables further optimizations that eventually allow metadata propagation).
You can tackle this by forcing the compiler to inline your function:

  declare %basex:inline function local:search(...)

Using '(ethnicgroups, languages)' instead of 'name() = (...)' is another
piece of practical advice; it helps the optimizer to detect at compile time
that metadata will be available at runtime. Another solution is to use
'local-name()' instead of 'name()' (local-name does not rely on namespaces
that may occur in a database, which also affects how full-text queries are
evaluated).

Here’s a variant that should work:

declare function local:search(
  $database  as xs:string,
  $query as xs:string
) {
  let $country := ft:search($database, $query)/ancestor::country
  let $search := function($node) { $node/text() contains text { $query } }
  return (
    ft:mark($country[.//name[$search(.)]]),
    ft:mark($country[.//city[$search(.)]]),
    ft:mark($country[.//(ethnicgroups, languages)[$search(.)]])
  )
};
local:search('factbook', 'German')

…or…

  let $search := function($nodes) { $nodes[text() contains text { $query }] }
  return (ft:mark($country[$search(.//name)]), ...

From today’s perspective, we would certainly design ft:mark and ft:extract
in a way that the results are always correct. The consequences, however,
would be a much more restricted syntax.

Hope this helps,
Christian


On Thu, Feb 29, 2024 at 12:13 AM Jack Steyn  wrote:

> Hi Christian,
>
> When I run your script, I do get 14 elements.
>
> When I run the following script I just get 12.
>
> 
>   true
>   https://files.basex.org/xml/factbook.xml
> 
>   
> 
>
> When you say you can't reproduce it, do you mean you get 14 results from
> running this script?
>
> Cheers,
>
> Jack
>
> On Thu, 29 Feb 2024, 1:02 am Christian Grün, 
> wrote:
>
>> Hi Jack,
>>
>> Thanks for your observation.
>>
>>
>>> The first result of this query is the entry for Austria. I would expect
>>> both of the instances of the word 'German' in that entry to be surrounded
>>> by  tags. However only the first instance is.
>>>
>>
>> I couldn’t reproduce this yet. Here’s a command script that returns 14
>> German elements:
>>
>> 
>>   true
>>   https://files.basex.org/xml/factbook.xml
>> 
>>   > )
>> return $marked//*[text() = 'German']
>>   ]]>
>> 
>>
>> Could you check if you get the same result?
>>
>> Thanks in advance
>> Christian
>>
>>


Re: [basex-talk] Optimize database never returns, leaves database in "opened by another process" state

2024-02-28 Thread Christian Grün
…this one could be related to a bug that was recently fixed in the latest
snapshot [1]. About time to get BaseX 11 finished…

[1]
https://github.com/BaseXdb/basex/commit/45d97f8065615fb734b712bc4c77c39899e9d496



On Mon, Feb 26, 2024 at 5:25 PM Eliot Kimber 
wrote:

> Using BaseX 10.7 on Linux.
>
>
> I’m running a sequence of jobs to update and optimize a set of databases
> following loading a number of documents created dynamically (as opposed to
> being read from disk).
>
>
>
> I’m seeing a new behavior, which is that the optimization step never
> completes but also doesn’t show any error in the log. The database shows no
> items and is in the locked by another process state if I try to drop it.
> This behavior seems to be consistently repeatable with my current code base
> (I’m working on some code updates, so it’s possible I’ve introduced
> something that would cause this behavior but I haven’t changed the code
> that leads up to the failing optimize). The server has plenty of disk
> space, etc.
>
>
>
> Optimization code is:
>
>   try {
>     if (db:exists($database))
>     then (
>       util:logToConsole('dbadmin:optimizeDatabase', ``[Optimizing database `{$database}`]``),
>       db:optimize($database, true(), $dbadmin:dbOptimizeOptions)
>     )
>     else util:logToConsole('dbadmin:optimizeDatabase', ``[Database '`{$database}`' does not exist. Nothing to optimize.]``)
>   } catch * {
>     util:logToConsole(
>       'dbadmin:optimizeDatabase',
>       ``[Exception optimizing database '`{$database}`': `{$err:code}` - `{$err:description}`]``,
>       'error')
>   }
>
>
>
> And the optimization options are:
>
> declare variable $dbadmin:dbOptimizeOptions as map(*) :=
>   (: Turn on all the indexes :)
>   map {
>     'attrindex' : true(),
>     'tokenindex' : true(),
>     'textindex' : true(),
>     'ftindex' : true()
>   };
>
> This code has been working fine for a long time and I’ve been running 10.7
> for at least a couple of months, so I’m wondering:
>
> A) What would cause this behavior?
> B) How can I diagnose it short of debugging the Java code (which I can do
> but it’s non-trivial for me to set up).
>
>
>
> Thanks,
>
>
>
> Eliot
>
>
>
>


Re: [basex-talk] Odd "Access denied" messages in the log

2024-02-28 Thread Christian Grün
Hi Eliot,

It’s difficult to tell which requests are sent to BaseX. Maybe you can use
a network monitoring tool such as Wireshark to get more hints?

Best,
Christian


On Sun, Feb 25, 2024 at 11:59 PM Eliot Kimber 
wrote:

> I have an application (our Mirabel system) running on a server inside our
> firewall (so not visible to the open Internet).
>
>
>
> I’ve recently started seeing messages like this in the log:
>
> Access denied: .
>
> Access denied: PRI * HTTP/2.0 SM .
>
>
>
> Where the value reported can be quite varied, but is often unrenderable
> characters or other stuff. (In this case the characters are all \uFFFD).
>
> The log messages all report the same IP address.
>
>
>
> This server does not use named users, so there’s no authentication
> required to access it.
>
>
>
> The IP address is not one of my own servers, so I don’t think it’s
> something generated by my own code.
>
>
>
> Any idea what this might be? It’s started relatively recently, which makes
> me think it might be some sort of penetration test.
>
>
>
> Cheers,
>
>
>
> E.
>
>


Re: [basex-talk] Troubles with ft:mark

2024-02-28 Thread Christian Grün
Hi Jack,

Thanks for your observation.


> The first result of this query is the entry for Austria. I would expect
> both of the instances of the word 'German' in that entry to be surrounded
> by  tags. However only the first instance is.
>

I couldn’t reproduce this yet. Here’s a command script that returns 14
German elements:


  true
  https://files.basex.org/xml/factbook.xml

  


Could you check if you get the same result?

Thanks in advance
Christian


Re: [basex-talk] basex failed to cast large numbers as xs:integer

2024-02-22 Thread Christian Grün
Hi Max,

Thanks for your report.

The behavior is correct, but I agree it’s surprising:

• //* returns two results:  and 
• the atomized value of  is '14588204311438466813'. This value
exceeds the BaseX limit for integers (64-bit signed, at most 2^63 - 1),
which is why [not(. castable as xs:integer)] returns true for this element.
• the atomized value of  is 1458820431. This value can be
represented as xs:integer, so – as expected – [not(. castable as
xs:integer)] returns false.

If you want to check for numbers that exceed the 64-bit limit, you
can use xs:decimal tests. If you want to check if the single text nodes are
integers, you can use:

//text()[not(. castable as xs:integer)]
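The range check behind this can be sketched outside BaseX. The following Python snippet is only an illustration, assuming that integers are limited to the signed 64-bit range; the helper name fits_signed_64 is made up for this example, and Python's int() parsing is only an approximation of the xs:integer lexical rules.

```python
# Sketch only: mimics a "castable as a 64-bit signed integer" test.
# Assumption: the integer range is -2^63 .. 2^63 - 1.
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def fits_signed_64(text):
    """Return True if text parses as an integer within the 64-bit signed range."""
    try:
        value = int(text.strip())
    except ValueError:
        # Not an integer literal at all
        return False
    return INT64_MIN <= value <= INT64_MAX


# The concatenated value from the report is too large, the single one is fine.
print(fits_signed_64("14588204311438466813"))  # False
print(fits_signed_64("1458820431"))            # True
```

Note that the large value is still smaller than 2^64, so it is the signed upper bound of 2^63 - 1 that it exceeds.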

Hope this helps,
Christian


Re: [basex-talk] Whitespace

2024-02-20 Thread Christian Grün
Hi Owen,

Do you have specific problems with whitespace in your query service? If
yes, which version of BaseX are you using?

Best,
Christian


On Wed, Feb 14, 2024 at 6:22 PM Owen Ambur  wrote:

> Lack of capability to deal appropriately with whitespaces (and
> punctuation) results in false positives in our StratML-enabled query
> service at https://search.aboutthem.info/
>
> Will look forward to learning if anything can be done about it.
>
> Owen Ambur
> https://www.linkedin.com/in/owenambur/
>
>


Re: [basex-talk] Found problems with map:for-each

2024-02-19 Thread Christian Grün
Hi Vincenzo,

Thanks for your observation and the easily reproducible test case; we’ve
uploaded a new stable snapshot with a bug fix [1].

Please note that it’s generally risky to explicitly return empty sequences,
as the optimizer will always try to get rid of code that may not contribute
to the final result (we try hard, though, to keep code alive that has side
effects, such as file:write-binary). In the given case, you could simply
rewrite your code as follows:

let $files := map {"hello1.txt" : ... }
return map:for-each($files, function($filename, $content) {
  file:write-binary($filename, $content, 0)
})

Ciao,
Christian

[1] https://files.basex.org/releases/latest/



On Mon, Feb 19, 2024 at 4:51 PM Vincenzo Cestone 
wrote:

> Hi all,
>
> With BaseX 10.7 (and also with version 9.6, where I have the same
> issue), I found that the following code does not work as expected:
> (: It wont work :)
> let $files := map {"hello1.txt" : xs:base64Binary('SGVsbG8gd29ybGQ='),
> "hello2.txt" : xs:base64Binary('SGVsbG8gd29ybGQ=')}
> let $w := map:for-each($files, function($filename, $content) {
>   file:write-binary($filename, $content, 0)
> })
> return ()
>
> that is, it will not write the two files hello1.txt and hello2.txt in the
> basex_home/bin folder.
>
> But, if you return $w instead:
> (: It works :)
> let $files := map {"hello1.txt" : xs:base64Binary('SGVsbG8gd29ybGQ='),
> "hello2.txt" : xs:base64Binary('SGVsbG8gd29ybGQ=')}
> let $w := map:for-each($files, function($filename, $content) {
>   file:write-binary($filename, $content, 0)
> })
> return $w
>
> With a similar implementation, with the classic FLWOR, the issue does not
> arise, even if I return the empty sequence:
> (: It works :)
> let $files := map {"hello1.txt" : xs:base64Binary('SGVsbG8gd29ybGQ='),
> "hello2.txt" : xs:base64Binary('SGVsbG8gd29ybGQ=')}
> let $w := for $filename in map:keys($files)
>   return file:write-binary($filename, map:get($files, $filename), 0)
> return ()
>
> that is it will write two files hello1.txt and hello2.txt in
> basex_home/bin folder.
>
> Note that the files in the examples above are for demonstration purposes,
> however in my code they are actually binary files.
> My Java version is Oracle JDK 17.
>
> Does this problem also occur for others?
>
> Thanks,
> Vincenzo
>


Re: [basex-talk] Slow full-text querying

2024-02-18 Thread Christian Grün
Dear Greg,

In BaseX, it’s the text nodes that are indexed. Here are some queries that
take advantage of the full-text index:

db:get("theocom")[.//text() contains text 'apple']
db:get("theocom")//page[text() contains text 'apple']
db:get("theocom")//text()[. contains text 'apple']
...

If you check out the output of the GUI’s info panel, you’ll see whether the
full-text index is applied.

There are several ways to compute scores for documents; here are two
variants:

let $db := 'theocom'
let $keywords := 'apple'
let $search := function($doc) { $doc//text() contains text { $keywords } }
for $doc in db:get($db)[$search(.)]
order by ft:score($search($doc)) descending
return db:path($doc)

let $keywords := "apple"
for $texts score $score in db:get("theocom")//text()[. contains text {
$keywords }]
group by $uri := db:path($texts)
let $scores := sum($score)
order by $scores descending
return ($scores, $uri, $texts)

By evaluating specific score values for text nodes, you have more freedom
to decide how to interpret the scores. For example, you can rank scores of
title elements higher than those of paragraphs.
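The aggregation idea (sum the scores of the matching text nodes per document, optionally weighting some elements higher) can be sketched independently of BaseX. In the Python snippet below, the paths, element names, scores and the weight of 2.0 for titles are all invented for illustration; it is not how BaseX computes or stores scores.

```python
from collections import defaultdict

# Hypothetical hits: (document path, parent element name, score of the text node)
hits = [
    ("books/a.xml", "title", 0.4),
    ("books/a.xml", "page", 0.2),
    ("books/b.xml", "page", 0.5),
    ("books/b.xml", "page", 0.3),
]


def rank(hits, weights=None):
    """Sum weighted scores per document and sort descending by total."""
    weights = weights or {}
    totals = defaultdict(float)
    for path, element, score in hits:
        # Elements without an explicit weight count once
        totals[path] += score * weights.get(element, 1.0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Plain sums put b.xml first; boosting titles moves a.xml to the top.
print([path for path, _ in rank(hits)])
print([path for path, _ in rank(hits, {"title": 2.0})])
```

The same grouping is what the second query above does with group by, sum and order by; the weighting step is where a custom interpretation of the scores would go.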

I invite you to have a look at our documentation for more information and
examples [1,2].

Hope this helps,
Christian

[1] https://docs.basex.org/wiki/Full-Text
[2] https://docs.basex.org/wiki/Indexes#Full-Text_Index



On Sat, Feb 17, 2024 at 1:23 PM Murray, Gregory 
wrote:

> Hello,
>
>
>
> I have a database with several thousand XML documents, although I have
> tens of thousands I’d like to add. Each XML document contains a book — both
> the bibliographic metadata such as title, author, etc. (each in its own
> element) and the complete OCR text of all pages of the book. Each page of
> text from each book is in a <page> element with a single text node
> containing all words from that page in the book, resulting in large blocks
> of text.
>
>
>
> I’ve added a full-text index and optimized it. I am finding that full-text
> searching is very slow. The query shown below consistently takes about 20
> seconds to run, even though there are only about 7400 documents. Obviously
> that’s far too slow to use the query in a web application, where the user
> expects a quick response.
>
>
>
> My first thought is whether the query is actually using the full-text
> index. Is there a way for me to determine that?
>
>
>
> I’m also wondering if my query is crude or is missing something. I don’t
> need the text nodes containing the search words; I only need to know which
> documents contain the words.
>
>
>
> let $keywords := "apple"
>
> for $doc in collection("theocom")
>
> let score $score := $doc contains text {$keywords}
>
> order by $score descending
>
> where $score > 0
>
> return concat($score, " ", base-uri($doc))
>
>
>
> As you can see, I’m searching all text in the entirety of each book. Is
> there a way to rewrite such a query for faster performance?
>
>
>
> Also, I’m wondering if the structure of the XML documents is such that the
> documents themselves need to have smaller blocks of text. For example, if
> the OCR text were contained in  elements, each containing only a
> single line of text, as printed in the original physical book, would
> full-text searching be noticeably faster, since each text node is much
> smaller?
>
>
>
> Thanks,
>
> Greg
>
>
>


Re: [basex-talk] Help with loading of 9 million documents

2024-02-14 Thread Christian Grün
Thanks for the addition, Liam; I should have mentioned that.

If your input has mixed content, and if the relevant sections have
xml:space='preserve' attributes…

The very tc34q.

…whitespace stripping will be safe.

Similarly, it may be helpful to know that the whitespace gets lost if XML
strings…

The very tc34q.

…are evaluated as XQuery. To prevent that, you can add a statement to the
prolog of the query:

declare boundary-space preserve;
The very tc34q.

Whitespace handling is generally a tricky issue in XML.
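To make the mixed-content risk concrete, here is a small Python sketch of a schema-unaware whitespace stripper that honors xml:space='preserve'. It is an assumption-laden illustration, not the BaseX implementation: the element names p, i and b are made up (the markup of the original example did not survive in the archive), and strip_ws is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Expanded name of the xml:space attribute as ElementTree reports it
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def strip_ws(elem, preserve=False):
    """Drop whitespace-only text nodes unless xml:space='preserve' is in effect."""
    space = elem.get(XML_SPACE)
    if space is not None:
        preserve = (space == "preserve")
    if not preserve and elem.text is not None and not elem.text.strip():
        elem.text = None
    for child in elem:
        strip_ws(child, preserve)
        # A child's tail is text content of the current element
        if not preserve and child.tail is not None and not child.tail.strip():
            child.tail = None


def text_after_strip(xml):
    """Parse, strip, and return the concatenated text content."""
    root = ET.fromstring(xml)
    strip_ws(root)
    return "".join(root.itertext())


plain = "<p>The <i>very</i> <b>tc34q</b>.</p>"
kept = "<p xml:space='preserve'>The <i>very</i> <b>tc34q</b>.</p>"
print(text_after_strip(plain))  # The verytc34q.
print(text_after_strip(kept))   # The very tc34q.
```

The single space between the two inline elements is a whitespace-only text node, which is why it disappears in the first case and survives in the second.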

Best,
Christian


On Wed, Feb 14, 2024 at 10:38 AM Liam R. E. Quin 
wrote:

> On Tue, 2024-02-13 at 20:29 +0100, Christian Grün wrote:
>
>
> If your XML input has been properly indented to improve readability, you
> can reduce the size of your database by dropping superfluous whitespace
> during the import:
>
> SET STRIPWS ON; CREATE DB ...
> db:create('db', '/path/to/documents', (), map { 'stripws': true() })
>
>
> Beware that this is not schema-based, and can remove whitespace nodes in
> mixed content -
> The very tc34q.
> may become (as i understand it)
> The verytc34q.
> (i have seen this, with different software, cause potentially catastrophic
> problems in aircraft manuals!)
>
> liam
>
> --
>
> Liam Quin, https://www.delightfulcomputing.com/
> Available for XML/Document/Information Architecture/XSLT/
> XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
> Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org
>


Re: [basex-talk] Help with loading of 9 million documents

2024-02-13 Thread Christian Grün
Hi Dietmar,

> Or is there a way to speed this up?
The fastest solution is to import all documents during database creation,
either with the CREATE DB command or the corresponding XQuery function:

CREATE DB name-of-db /path/to/documents
db:create('db', '/path/to/documents')

The database command ADD, or db:add, can be used as well to import more
than one document at a time.

If your XML input has been properly indented to improve readability, you
can reduce the size of your database by dropping superfluous whitespace
during the import:

SET STRIPWS ON; CREATE DB ...
db:create('db', '/path/to/documents', (), map { 'stripws': true() })

> Have you ever loaded 9 million documents into a basex database?

What’s the approximate size of the 32,000 documents?

In principle, it’s no problem to add 10 million documents or more to a
database as long as the input doesn’t exceed specific limits [1]. If you
exceed the limits, you can create multiple databases and access them with a
single query [2].

> I load the documents using the BaseXClient and the ADD method.

Are you using the Java implementation of the client? Feel free to share
some code with us.

Hope this helps,
Christian

[1] https://docs.basex.org/wiki/Statistics
[2] https://docs.basex.org/wiki/Databases


Re: [basex-talk] Help with usage of Regex while deleting resources

2024-02-12 Thread Christian Grün
Hi Deepak,

For deletions, you can write:

let $db := 'db'
for $path in db:list($db, '2023')[matches(., '/\d\d')]
return db:delete($db, $path)

When accessing documents, it’s faster to iterate over the resources:

for $doc in db:get('db', '2023')
where matches(db:path($doc), '/\d\d')
return ...
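The filter logic itself is plain regular-expression matching on resource paths. The Python sketch below illustrates it with made-up paths in the {year}{month}/doc.xml layout from the question; the patterns are assumptions chosen for that layout, and re.search is used because, like matches() in XQuery, it looks for the pattern anywhere in the string unless it is anchored.

```python
import re

# Hypothetical resource paths in the {year}{month}/doc.xml layout
paths = [
    "202301/abc.xml",
    "202302/def.xml",
    "202310/abc.xml",
    "202212/abc.xml",
]


def select(paths, pattern):
    """Return the paths in which the pattern is found (unanchored search)."""
    regex = re.compile(pattern)
    return [path for path in paths if regex.search(path)]


# First nine months of 2023
first_nine_months = select(paths, r"^20230[1-9]/")
# Every abc.xml, whatever the collection
named_abc = select(paths, r"/abc\.xml$")

print(first_nine_months)
print(named_abc)
```

In a query, the equivalent patterns would go into the matches() call that filters the listed or iterated paths.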

Hope this helps,
Christian



Deepak Dinakara  schrieb am Mo., 12. Feb. 2024,
18:04:

> Hi,
>
> I wanted to know if it's possible to give a regex while deleting a
> resource.
> I have documents stored in a hierarchy of collections like
> {year}{month}/doc.xml.
> Eg: 202301/abc.xml, 202302/def.xml.
> If I want to delete a resource "abc.xml", Is it possible to issue commands
> like "*db:delete("db-name", '/*/abc.xml')*" ? Right now, I can do a
> XQuery with db:list and endsWith and get the complete path of "abc.xml".
> But regex would have been very handy.
>
> Similarly I also want to execute queries against a list of collections
> using regex.
> Something like "*for $document in collection('db-name/20230*')*" (First 9
> months of 2023)
> Right now, I am doing something like
> "for $i in ('01', '02', '03', '04', ... '09')
> for $document in collection('test-collection/2023' || $i)"
> But if there are better ways, kindly let me know.
>
> Thank you,
> Deepak
>
>


Re: [basex-talk] basexhttp out of memory where basexgui suceeds

2024-02-07 Thread Christian Grün
…great to hear, thanks.

On Thu, Feb 8, 2024 at 7:30 AM Jack Steyn  wrote:

> Hi Christian,
>
> Sorry, I should have provided a self-contained example to begin with.
>
> In any case, I was running BaseX 10.0; after noticing that 10.6 boasts
> 'Much more memory-efficient representation of XML fragments', I upgraded to
> 10.7 and the problem appears to be resolved (and wow, there is a big
> difference in performance – kudos!).
>
> Many thanks,
>
> Jack
>
> On Wed, 7 Feb 2024, 6:09 pm Christian Grün, 
> wrote:
>
>> Hi Jack,
>>
>> If you run the query via basexhttp, how do you retrieve the results,
>> i.e., which client do you use?
>>
>> Can you possibly provide us with a self-contained example, something like…
>>
>> for $i in 1 to 50
>> return  update {
>>   insert node  into .
>> }
>>
>> …and some steps to reproduce the behavior?
>>
>> Thanks in advance,
>> Christian
>>
>>
>> On Wed, Feb 7, 2024 at 6:04 AM Jack Steyn  wrote:
>>
>>> Hi,
>>>
>>> I have a database about 200 MB in size made up of approximately 150 000
>>> documents of similar size and structure as children of the root node.
>>>
>>> When I run the following script in basexgui a significant amount of
>>> memory is consumed (over 1 GB if I'm reading the display correctly), but I
>>> do get a result:
>>>
>>> for $doc in db:get('docs')
>>> return $doc update {
>>> delete node .//*[local-name() = ('A', 'B', 'C', 'D')]
>>> }
>>>
>>> When I run it over basexhttp I get a java.lang.OutOfMemoryError: Java
>>> heap space. I have increased the memory available to the JVM to 4 GB but
>>> this has not affected the failure of the script.
>>>
>>> How can I resolve this? Is there some rewriting of the script that would
>>> help, or is it more specific to basexhttp?
>>>
>>> Many thanks,
>>>
>>> Jack
>>>
>>>


Re: [basex-talk] basexhttp out of memory where basexgui suceeds

2024-02-06 Thread Christian Grün
Hi Jack,

If you run the query via basexhttp, how do you retrieve the results, i.e.,
which client do you use?

Can you possibly provide us with a self-contained example, something like…

for $i in 1 to 50
return  update {
  insert node  into .
}

…and some steps to reproduce the behavior?

Thanks in advance,
Christian


On Wed, Feb 7, 2024 at 6:04 AM Jack Steyn  wrote:

> Hi,
>
> I have a database about 200 MB in size made up of approximately 150 000
> documents of similar size and structure as children of the root node.
>
> When I run the following script in basexgui a significant amount of memory
> is consumed (over 1 GB if I'm reading the display correctly), but I do get
> a result:
>
> for $doc in db:get('docs')
> return $doc update {
> delete node .//*[local-name() = ('A', 'B', 'C', 'D')]
> }
>
> When I run it over basexhttp I get a java.lang.OutOfMemoryError: Java heap
> space. I have increased the memory available to the JVM to 4 GB but this
> has not affected the failure of the script.
>
> How can I resolve this? Is there some rewriting of the script that would
> help, or is it more specific to basexhttp?
>
> Many thanks,
>
> Jack
>
>


Re: [basex-talk] Query regarding delete documents

2024-02-05 Thread Christian Grün
>
> I avoided "OPTIMIZE" since it caused OOM issues and I was fine without the
> index which looks like a double edged sword : ).
>

If you don’t need the indexes, you can disable them when running
db:optimize by setting textindex and attrindex to false.

> QQ, Is there some stat on how much RAM is needed for maintaining basex DB
> of size 'X'GB (with regular inserts and delete)  so that "optimize" could
> be called without worrying about OOM?
>

Hardly possible to say in general; it depends a lot on the “regular inserts
and deletes” ;) If you cannot solve the optimization problem, feel free to
share the OOM stack trace with us.

