I’m rethinking the implementation of my Mirabel system and trying to understand 
how best to manage concurrency for the web app (satisfying multiple read 
requests from web clients) while also managing database updates in a way that 
avoids blocking reads or delaying updates.

I’ve done a read-through of the current documentation at 
https://docs.basex.org/ and also reviewed what I could find online. The 
documentation makes a number of references to the “client/server” 
architecture, but I’m not finding any particularly deep discussion of it, 
either in the docs or by searching on, e.g., “basex client server”.

When I started my Mirabel project I understood that the way to get concurrency 
was to use multiple BaseX HTTP instances, which can make concurrent read 
requests on a single set of databases. But now I can’t find the source that led 
me to that conclusion—I know the product docs have been reworked significantly 
since then, so maybe something got lost in the update?

Based on my latest reading, it seems clear that for BaseX to best manage 
concurrency of reads and writes, the requests need to be handled within a 
single server instance running in a single JVM. If I’m understanding it 
correctly, for FAIRLOCK to enable interleaving of read and update operations it 
has to operate within a single JVM, which means that having different JVMs 
reading from a database being updated by yet another JVM is going to be 
problematic, because the only lock mechanism in that case is the global lock.
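(For reference, the locking behavior in question is the FAIRLOCK option, which, as I understand it, is a per-instance setting configured in the server’s .basex options file or with -c at startup; the excerpt below is just to pin down what I mean:)

```
# excerpt from a .basex options file
FAIRLOCK = true
```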

Given that, it’s not clear how you implement a multi-user web application that 
must not block while waiting for longer-running queries to complete, yet still 
uses a single server instance to satisfy the queries.

So I feel like I’m missing something.

In my current solution I run multiple basexhttp servers, where the first server 
is the web app server. It uses a REST handler that inspects the load on each 
server and sends requests to the lowest-load server. This works, but from my 
reading I’m starting to think this level of complexity is not required (or, 
more accurately, that surely others would have needed the same mechanism).
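(To make the dispatch idea concrete, here’s a minimal Python sketch of what my REST handler does, stripped of the BaseX specifics. The server URLs and the load probe are placeholders for whatever metric the handler actually inspects, not a real BaseX API:)

```python
# Sketch: given a way to probe the current load of each backend
# basexhttp server, forward each request to the least-loaded one.

def pick_lowest_load(servers, probe_load):
    """Return the server URL whose probed load is currently lowest."""
    return min(servers, key=probe_load)

def dispatch(request_path, servers, probe_load):
    """Build the backend URL the incoming request should be forwarded to."""
    target = pick_lowest_load(servers, probe_load)
    return target.rstrip("/") + "/" + request_path.lstrip("/")
```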

In my web app I’ve implemented a general asynchronous query mechanism where I 
can serve web pages with elements that then trigger async requests back to the 
server to fetch elements (for example, I cache HTML renderings of DITA tables 
that are then fetched asynchronously to populate report tables that include the 
rendered tables). I don’t see a way to avoid this level of optimization and 
it’s not too complicated. This seems to work well and there are more 
opportunities for caching things.
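(The caching side of this is nothing exotic; roughly, in Python-flavored pseudocode, with render_table standing in for the actual DITA-to-HTML rendering step:)

```python
# Sketch of the fragment-caching idea: render a table to HTML once,
# store it, and let subsequent async requests serve the cached rendering.

CACHE = {}

def rendered_table(table_id, render_table):
    """Return the cached HTML for a table, rendering it on first request."""
    if table_id not in CACHE:
        CACHE[table_id] = render_table(table_id)
    return CACHE[table_id]
```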

I’m also continuing to think about Tamara’s suggestion to use polling or web 
sockets to perform long-running queries asynchronously, store the results in a 
results database, then trigger fetching in the client (in response to an 
earlier question of mine about how to avoid keeping HTTP request sessions open 
for a long time).
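(As I understand that suggestion, the shape is submit/poll/fetch. The sketch below uses an in-memory dict to stand in for the results database, and the function names are illustrative, not a real API; in BaseX this would presumably map onto an asynchronously executed job plus a results database:)

```python
import uuid

# Sketch of the submit/poll/fetch pattern: start a long-running query,
# return a job id immediately, store the result when done, and let the
# client poll and then fetch it in a separate, short-lived request.

RESULTS = {}

def submit(run_query, *args):
    """Start a long-running query; return a job id immediately."""
    job_id = str(uuid.uuid4())
    RESULTS[job_id] = None  # placeholder until the query finishes
    # In a real deployment the query would run in the background;
    # here it runs inline to keep the sketch self-contained.
    RESULTS[job_id] = run_query(*args)
    return job_id

def poll(job_id):
    """True once the result has been stored."""
    return RESULTS.get(job_id) is not None

def fetch(job_id):
    """Retrieve (and discard) the stored result."""
    return RESULTS.pop(job_id)
```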

So my question:

What is the general architecture for a web application that needs to support 
large numbers of concurrent users making large numbers of requests and manage 
updates to databases?

Thanks,

Eliot

_____________________________________________
Eliot Kimber
Sr. Staff Content Engineer
O: 512 554 9368


servicenow.com<https://www.servicenow.com>
