Hi James, hi Andy,

When adding database locks to BaseX, I remember we considered whether we should add explicit lock features to BaseX, exactly for those cases in which the optimizer is not powerful enough to detect the required locks. One of the reasons why we discarded this idea was that we feared that people would run into deadlocks or inconsistent states.

Indeed, I like your idea of raising an error once a database is accessed that was not manually locked; this would prevent us from running into deadlocks or creating inconsistent states.

One solution could be to introduce a new MANUALLOCK option and to extend the existing basex:read-lock and basex:write-lock pragmas to databases. If manual locking is enabled, an error will be raised if a database is accessed that has not been specified in a lock pragma. If it’s disabled, databases specified by query locks will simply be added to the list of databases to be locked (or ignored if they have already been detected automatically, or discarded if a global lock needs to be set).

Manual locks could either be assigned globally or within a query:

  (# db:manuallock #) {
    (# basex:write-lock BEP-Staging #) {
      (# basex:read-lock BEP #) {
        let $d := db:open('BEP')
        return db:create('BEP-staging', $d, $d ! base-uri(.))
      }
    }
  }

I wouldn’t call this syntax particularly appealing (it’s surely something that should only be used for special cases in a code base), but if database locking is enhanced in a future version, such pragmas could simply be removed from a query.

Any thoughts on that?
Christian

On Mon, Feb 11, 2019 at 3:22 PM James Ball <basex-t...@jamesball.co.uk> wrote:
>
> I think I’m probably doing something similar on a current project, but
> with some careful query writes and use of jobs I’ve been able to keep
> everything running within acceptable time.
>
> However while optimising my code it did get me thinking about possible
> improvements. Clearly it’s difficult (or even impossible) to determine all
> the databases that will be used ahead of running the query. But I wonder how
> many times the calling function already knows?
>
> Then I thought of the (# db:enforceindex #) that was introduced for cases
> where the query writer knows that the databases will have indexes. I wondered
> if something similar might be possible for databases.
>
> A pragma or a function wrapper that would allow the name of a database (or
> databases) to be supplied and that would restrict access only to that
> database for the rest of the query. Returning an error if the query tries to
> address another.
>
> I’m sure this wouldn’t be simple - but might be easier and more reliable than
> trying to find more optimisations to the locking algorithm.
>
> Just thinking aloud…
>
> Kindest regards, James
>
> From: Christian Grün <christian.gr...@gmail.com>
> Subject: Re: [basex-talk] Global locks
> Date: 11 February 2019 at 12:27:37 GMT
> To: Andy Bunce <bunce.a...@gmail.com>
> Cc: "basex-talk@mailman.uni-konstanz.de" <basex-talk@mailman.uni-konstanz.de>
>
> Hi Andy,
>
> The current behavior is correct indeed – but it might not be what one
> expects. Currently, we are…
>
> a) collecting all static database references in the query and
> b) assigning either read or write locks to these databases, depending
> if the overall query is updating or not.
>
> The reason is that it’s often tricky to determine statically (i.e.,
> while parsing the query and before compiling and optimizing it) which
> databases will be accessed for read or write operations without
> analyzing the query in more detail. An arbitrary example:
>
> let $db := db:open('db1')
> return insert node <new/> into $db/*
>
> We would need to follow the variable reference in order to find out if
> db1 will be updated. In simple queries such as yours, however, this
> might be possible; I’ll have some more thoughts on that.
>
> Cheers
> Christian
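
A small illustration of the static rule quoted above. This is a sketch only: the database names db1 and db2 are made up, and the locking outcome described in the comment is a reading of steps a) and b), not verified output.

```xquery
(: Both db1 and db2 are static database references, and the query as a
   whole is updating. Under rule b), both would therefore be assigned
   write locks, although db2 is only read. :)
let $source := db:open('db2')/*
return insert node $source into db:open('db1')/*
```

A read lock on db2 would suffice here, which is the kind of case the proposed manual lock pragmas are meant to cover.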