Hi Erol,

I am not volunteering :-) but if somebody wants to take this route, this
code might give some pointers [1].
It uses Apache Spark to run Saxon-HE; there is also an XQuery example [2]
and more info [3].

/Andy

[1] https://github.com/elsevierlabs/spark-xml-utils
[2] https://github.com/elsevierlabs/spark-xml-utils/wiki/xquery
[3] http://mail-archives.apache.org/mod_mbox/spark-user/201408.mbox/%3c1407936616.34624.yahoomail...@web141003.mail.bf1.yahoo.com%3E

On 22 April 2015 at 10:05, Erol Akarsu <eaka...@gmail.com> wrote:

> Christian,
>
> I think we should be able to attach BaseX to Apache Spark, but the
> integration code needs to be written.
> Everybody is able to read from Hadoop, SOLR, ElasticSearch etc. into Spark
> and process the data there.
> Why not from BaseX?
>
> Erol Akarsu
>
> On Wed, Apr 22, 2015 at 4:28 AM, Christian Grün <christian.gr...@gmail.com> wrote:
>
>> Hi Götz,
>>
>> > it would make perfect sense to parallelize the query. Is there a way
>> > to achieve this using XQuery?
>>
>> Our initial attempts to integrate low-level support for
>> parallelization in XQuery turned out not to be as successful as we
>> hoped they would be. One reason for that is that you can basically do
>> everything with XQuery, and it's pretty hard to detect patterns in the
>> code that are simple enough to be parallelized. Next to that, Java
>> does not give us enough facilities to control CPU caching behavior.
>>
>> As you already indicated, you can simply run multiple queries in
>> parallel by e.g. using Java threads or the BaseX client/server
>> architecture (which by default allows 8 transactions in parallel [1]).
>> If your queries do a lot of I/O, you will often get better performance
>> by only allowing one transaction at a time, though. This is due to the
>> random access patterns on your external drives (and in my experience,
>> it also applies to SSDs). However, if you work with main-memory
>> instances of databases, parallelization might give you some
>> performance gains (albeit not as big as you might expect).
>>
>> Hope this helps,
>> Christian
>>
>> [1] http://docs.basex.org/wiki/Options#PARALLEL
>>
>
>
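
For readers who want to try the "multiple queries in parallel via Java
threads" approach Christian mentions in the quoted message, here is a
minimal sketch, not a definitive implementation. The class name
ParallelQueries and the evaluate() method are hypothetical placeholders:
evaluate() only echoes the query string and would have to be replaced by a
real call into BaseX (embedded or via a client session). The maxParallel
parameter is also an assumption of this sketch; it is there so that the
I/O-bound case from the quoted message can be tried with a value of 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelQueries {

    // Hypothetical stand-in for real query evaluation. In a real setup this
    // is where a query would be sent to BaseX; here it only echoes the input
    // so that the sketch runs without any external library.
    static String evaluate(String query) {
        return "result of: " + query;
    }

    // Runs all queries on a fixed-size thread pool and returns the results
    // in the same order as the input list.
    static List<String> runAll(List<String> queries, int maxParallel) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String q : queries) {
                Callable<String> task = () -> evaluate(q);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until this query has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> queries = List.of("count(//a)", "count(//b)", "count(//c)");
        // Compare maxParallel = 1 with larger values on your own workload.
        for (String result : runAll(queries, 1)) {
            System.out.println(result);
        }
    }
}
```

Whether a pool size above 1 helps depends on the workload: as noted in the
quoted message, I/O-heavy queries may run faster one at a time.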
