Re: [basex-talk] State of replication and clustering

Andreas Jung Tue, 01 Aug 2017 11:06:12 -0700

Hi Dirk

in our case we have about 1 GB of product catalog data for 30 languages,
spread across 30 XML files, so not much data.

One or more instances of a web service will perform only queries (only
reads) on the data: standard XPath queries and a bunch of full-text queries
(in particular queries related to "find as you type"). On my machine
(8 cores, 32 GB RAM) I could reach up to 50 XPath queries per second. We have
no numbers for the expected workload (new system, new application), so we
must be prepared to scale, and a single BaseX node might not be enough at
some point.

Currently I am thinking about bundling one BaseX instance with all the data
plus one web service instance into one container. Every container is then
self-contained, and we should be able to scale up by starting as many
containers as needed. Not the perfect solution, but one that should work
smoothly.

I also looked into eXist-db and replication, but their replication mechanism
has too many moving parts and scares me a bit. Do I have a free wish? A
configuration-less replication mechanism (multi-master) as we have it in
Elasticsearch. But that is only a dream :-)
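
A minimal sketch of how clients could spread read-only requests across such
identical containers, assuming simple round-robin selection. The class name
ReplicaPool and the host names are hypothetical and only serve as
illustration; nothing here is part of BaseX, and in practice a load balancer
in front of the containers would likely take this role.

```python
from itertools import cycle
from typing import Iterable, Iterator


class ReplicaPool:
    """Round-robin selection over identical, read-only replicas."""

    def __init__(self, endpoints: Iterable[str]) -> None:
        self._endpoints = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one replica endpoint is required")
        # cycle() repeats the endpoint list forever, in order.
        self._ring: Iterator[str] = cycle(self._endpoints)

    def next_endpoint(self) -> str:
        """Return the endpoint that should serve the next read request."""
        return next(self._ring)


# Hypothetical container addresses; each container is assumed to hold
# the full data set plus one web service instance.
REPLICAS = [
    "http://catalog-1.example:8080",
    "http://catalog-2.example:8080",
    "http://catalog-3.example:8080",
]

pool = ReplicaPool(REPLICAS)
picked = [pool.next_endpoint() for _ in range(4)]
```

Because every container holds the same data and only serves reads, any
replica can answer any request, so scaling out only means adding another
address to the list.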

Andreas

On 1 Aug 2017, at 18:51, Kirsten, Dirk wrote:

> Hi Andreas,
>
> I am not quite sure which presentation at XML Prague 2013 you are referring
> to, but I would guess it was mine, given that I was working on this topic at
> that time, and I think I would remember someone else giving a talk about
> it...
>
> Unfortunately, this was a research project (my master's thesis; it should be
> somewhere on basex.org, but it really is a thesis and hardly of any use if
> you want to "just use it"). It was never really continued after 2014 and was
> far, far from being able to go upstream. So I guess for now it is simply not
> there, and it is quite a project, so it would require serious effort.
>
> However, if you only have to read, you might be able to partition your data
> in a way that is appropriate for your application and put the different
> parts on different servers/file systems. But this depends heavily on your
> use case. Also, it would be interesting to know what you think the limit
> will be that requires you to scale out for reads. Do you simply have so much
> data that you can't store it on one file system? Or do you have so many
> parallel users that you want to gain some performance?
>
> Cheers
> Dirk
>
> Senacor Technologies Aktiengesellschaft - Registered office: Eschborn -
> District Court Frankfurt am Main - Reg. No.: HRB 105546
> Executive Board: Matthias Tomann, Marcus Purzer - Chairman of the
> Supervisory Board: Daniel Grözinger
>
> -----Original Message-----
> From: basex-talk-boun...@mailman.uni-konstanz.de
> [mailto:basex-talk-boun...@mailman.uni-konstanz.de] On Behalf Of Andreas
> Jung
> Sent: Tuesday, 1 August 2017 15:12
> To: BaseX
> Subject: [basex-talk] State of replication and clustering
>
> Hi there,
>
> what is the state of replication and clustering of BaseX?
>
> I found an XML Prague 2013 presentation but almost no documentation on these 
> topics on the website.
>
> In our case we need to scale out horizontally with a growing number of reads 
> (no writes involved).
>
> Andreas
