Re: Understanding DatasetGraph getLock() (DatasetGraphInMem throwing a curve ball)...
Thanks for the background. I'll map to TDB and Mem and throw a UOE if "another" DG is encountered.

Same here, I drew a blank on a Jena optimistic lock and try lock. So I've created a LockMRAndMW (effectively lazy) which is used to control the DatasetGraphDistributed, i.e. there is no blocking via begin(ReadWrite). Then, when the streams are called (e.g. find(...)), the actual DGs have the read transaction started. There is also a LockMRSWTry and a LockMRPlusSWTry, which wrap the TDB and Mem lock semantics.

It was REALLY important for us that we don't block on the begin(ReadWrite) call, as we are currently aggregating 18 separate JVM TDB/Mem instances into one DG (via a Thrift DG implementation). Specifically, when we perform an ETL we try each remote DG until we acquire a write lock, then the quads are loaded. This way we can support multiple writes, as we effectively shard the TDB. This reduced bulk ETL load times from the sum of all load times to, simplistically, the longest load time (assuming we have enough shards...).

Internally the sharded DGs are only locked when they are touched. The majority of DGs are TDB backed, but the system recognises certain "things" and will spin up a Mem backed DG in another JVM to perform ad hoc work, then tear it down.

On 24 March 2017 at 11:41, A. Soroka wrote:

> The lock from getLock has the same semantics for every impl: currently
> MRSW, with no expectation of changing. It's a kind of "system lock" to
> keep the internal state of that class consistent. That's distinct from
> the transactional semantics of a given impl. In some cases, the
> semantics happen to coincide, when the actual transactional semantics
> are also MRSW. But sometimes they don't (actually, I think
> DatasetGraphInMem is the only example where they don't right now, but I
> am myself tinkering with another example and I am confident that we
> will have more). When they don't, you need to rely on the impl to
> manage its own transactionality, via the methods for that purpose.
>
> I'm not actually sure we have a good non-blocking method for your use
> right now. We have isInTransaction(), but that's not too helpful here.
>
> But someone else can hopefully point to a technique that I am missing.
>
> ---
> A. Soroka
> The University of Virginia Library
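The "try each remote DG until we acquire a write lock" step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Shard and ShardRouter are hypothetical names, not the classes used in the real system, and a binary Semaphore stands in for whatever write lock each remote DG actually exposes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Semaphore;

/** Hypothetical shard: one permit models "at most one writer at a time". */
final class Shard {
    final String name;
    private final Semaphore writePermit = new Semaphore(1);

    Shard(String name) { this.name = name; }

    /** Non-blocking: returns immediately with true only if the write permit was free. */
    boolean tryWrite() { return writePermit.tryAcquire(); }

    /** Must be called by whoever got true from tryWrite(), once loading is done. */
    void endWrite() { writePermit.release(); }
}

/** Hypothetical router: walks the shards round robin and never blocks. */
final class ShardRouter {
    private final List<Shard> shards;
    private int next = 0; // where the next scan starts

    ShardRouter(List<Shard> shards) { this.shards = new ArrayList<>(shards); }

    /** One pass over all shards; empty if every shard is currently being written. */
    synchronized Optional<Shard> tryAcquireForWrite() {
        for (int i = 0; i < shards.size(); i++) {
            Shard candidate = shards.get((next + i) % shards.size());
            if (candidate.tryWrite()) {
                next = (next + i + 1) % shards.size();
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}

final class ShardRouterDemo {
    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(Arrays.asList(new Shard("a"), new Shard("b")));
        Optional<Shard> first = router.tryAcquireForWrite();
        Optional<Shard> second = router.tryAcquireForWrite();
        Optional<Shard> third = router.tryAcquireForWrite();
        System.out.println("first:  " + first.map(s -> s.name).orElse("none"));
        System.out.println("second: " + second.map(s -> s.name).orElse("none"));
        System.out.println("third:  " + third.map(s -> s.name).orElse("none"));
        first.ifPresent(Shard::endWrite);
        second.ifPresent(Shard::endWrite);
    }
}
```

A caller that gets an empty result can back off and retry rather than sit blocked on one shard, which is the property that lets several loads proceed in parallel.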
Re: Understanding DatasetGraph getLock() (DatasetGraphInMem throwing a curve ball)...
The lock from getLock has the same semantics for every impl: currently MRSW, with no expectation of changing. It's a kind of "system lock" to keep the internal state of that class consistent. That's distinct from the transactional semantics of a given impl.

In some cases, the semantics happen to coincide, when the actual transactional semantics are also MRSW. But sometimes they don't (actually, I think DatasetGraphInMem is the only example where they don't right now, but I am myself tinkering with another example and I am confident that we will have more). When they don't, you need to rely on the impl to manage its own transactionality, via the methods for that purpose.

I'm not actually sure we have a good non-blocking method for your use right now. We have isInTransaction(), but that's not too helpful here.

But someone else can hopefully point to a technique that I am missing.

---
A. Soroka
The University of Virginia Library

> On Mar 24, 2017, at 6:51 AM, Dick Murray wrote:
>
> Hi.
>
> Is there a way to get what Transactional a DatasetGraph is using and
> specifically what Lock semantics are in force?
>
> As part of a distributed DatasetGraph implementation I have a
> DatasetGraphTry wrapper which adds Boolean tryBegin(ReadWrite) and, as
> the name suggests, it will try to lock the given DatasetGraph and return
> immediately, i.e. not block. Internally, if it acquires the lock it will
> call the wrapped void begin(ReadWrite), which "should" not block. This is
> useful because I can round robin the DatasetGraphs which constitute the
> distribution without blocking. Especially useful as some of the
> DatasetGraphs are running in other JVMs.
>
> Currently I've reverted the mapping to the DatasetGraph class (requires
> I manually check the Jena code) but I'd like to understand why and
> possibly make the code neater...
>
> To automate the wrapping I pulled the Lock via getLock() and used the
> class to look up the appropriate wrapper. But after digging I noticed
> that the Lock from getLock() doesn't always match the Transactional
> locking semantics.
>
> DatasetGraphInMem getLock() returns org.apache.jena.shared.LockMRSW but
> internally its Transactional implementation is using
> org.apache.jena.shared.LockMRPlusSW, which is subtly different. This is
> noticeable because getLock() isn't overridden but is inherited from
> DatasetGraphBase, which declares LockMRSW.
>
> A TDB backed DatasetGraph masquerades as a:
>
>   DatasetGraphTransaction
>   DatasetGraphTrackActive
>   DatasetGraphWrapper
>
> which wraps the DatasetGraphTDB:
>
>   DatasetGraphTripleQuads
>   DatasetGraphBaseFind
>   DatasetGraphBase
>
> where getLock() returns:
>
> INFO Thread[main,5,main] [class org.apache.jena.sparql.core.mem.DatasetGraphInMemory]
> INFO Thread[main,5,main] [class org.apache.jena.shared.LockMRSW]
>
> INFO Thread[main,5,main] [class org.apache.jena.tdb.transaction.DatasetGraphTransaction]
> INFO Thread[main,5,main] [class org.apache.jena.shared.LockMRSW]
> INFO Thread[main,5,main] [class org.apache.jena.tdb.store.DatasetGraphTDB]
> INFO Thread[main,5,main] [class org.apache.jena.shared.LockMRSW]
>
> Regards Dick.
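To make the MRSW versus MR+SW distinction concrete, here is a small sketch using only java.util.concurrent. It is not Jena code: TryLock, MrswTry and MrPlusSwTry are hypothetical names, and the MR+SW behaviour shown (readers are never blocked, writers exclude only each other) is my reading of those semantics rather than a copy of LockMRPlusSW.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical non-blocking lock interface used only for this illustration. */
interface TryLock {
    boolean tryEnter(boolean write);
    void leave(boolean write);
}

/** MRSW: many readers OR one writer; a writer is refused while any reader is active. */
final class MrswTry implements TryLock {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    public boolean tryEnter(boolean write) {
        return write ? rw.writeLock().tryLock() : rw.readLock().tryLock();
    }

    public void leave(boolean write) {
        if (write) rw.writeLock().unlock(); else rw.readLock().unlock();
    }
}

/** MR+SW (assumed semantics): readers always enter; writers only exclude other writers. */
final class MrPlusSwTry implements TryLock {
    private final ReentrantLock writer = new ReentrantLock();

    public boolean tryEnter(boolean write) {
        // Readers take no lock at all; a writer needs the single writer lock.
        return !write || writer.tryLock();
    }

    public void leave(boolean write) {
        if (write) writer.unlock();
    }
}

final class LockSemanticsDemo {
    /** Starts a read, then reports whether a write is admitted while the read is still open. */
    static boolean writerAdmittedWhileReading(TryLock lock) {
        if (!lock.tryEnter(false)) throw new IllegalStateException("reader was not admitted");
        boolean admitted = lock.tryEnter(true);
        if (admitted) lock.leave(true);
        lock.leave(false);
        return admitted;
    }

    public static void main(String[] args) {
        System.out.println("MRSW admits writer during read:  " + writerAdmittedWhileReading(new MrswTry()));
        System.out.println("MR+SW admits writer during read: " + writerAdmittedWhileReading(new MrPlusSwTry()));
    }
}
```

A wrapper chosen from the class returned by getLock() would pick the first behaviour for every dataset, which is why it cannot be used to infer the transactional semantics of an impl whose transactions behave like the second.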
Understanding DatasetGraph getLock() (DatasetGraphInMem throwing a curve ball)...
Hi.

Is there a way to get what Transactional a DatasetGraph is using and specifically what Lock semantics are in force?

As part of a distributed DatasetGraph implementation I have a DatasetGraphTry wrapper which adds Boolean tryBegin(ReadWrite) and, as the name suggests, it will try to lock the given DatasetGraph and return immediately, i.e. not block. Internally, if it acquires the lock it will call the wrapped void begin(ReadWrite), which "should" not block. This is useful because I can round robin the DatasetGraphs which constitute the distribution without blocking. Especially useful as some of the DatasetGraphs are running in other JVMs.

Currently I've reverted the mapping to the DatasetGraph class (requires I manually check the Jena code) but I'd like to understand why and possibly make the code neater...

To automate the wrapping I pulled the Lock via getLock() and used the class to look up the appropriate wrapper. But after digging I noticed that the Lock from getLock() doesn't always match the Transactional locking semantics.

DatasetGraphInMem getLock() returns org.apache.jena.shared.LockMRSW but internally its Transactional implementation is using org.apache.jena.shared.LockMRPlusSW, which is subtly different. This is noticeable because getLock() isn't overridden but is inherited from DatasetGraphBase, which declares LockMRSW.

A TDB backed DatasetGraph masquerades as a:

  DatasetGraphTransaction
  DatasetGraphTrackActive
  DatasetGraphWrapper

which wraps the DatasetGraphTDB:

  DatasetGraphTripleQuads
  DatasetGraphBaseFind
  DatasetGraphBase

where getLock() returns:

INFO Thread[main,5,main] [class org.apache.jena.sparql.core.mem.DatasetGraphInMemory]
INFO Thread[main,5,main] [class org.apache.jena.shared.LockMRSW]

INFO Thread[main,5,main] [class org.apache.jena.tdb.transaction.DatasetGraphTransaction]
INFO Thread[main,5,main] [class org.apache.jena.shared.LockMRSW]
INFO Thread[main,5,main] [class org.apache.jena.tdb.store.DatasetGraphTDB]
INFO Thread[main,5,main] [class org.apache.jena.shared.LockMRSW]

Regards Dick.
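For readers who want to picture the tryBegin idea described in this message, a minimal sketch follows. It does not use Jena types: SimpleTransactional is a hypothetical stand-in for the begin/end part of a dataset, TryBeginWrapper is an invented name, and the wrapper's own ReentrantReadWriteLock is assumed to match the wrapped implementation's real transactional semantics (which is exactly the thing getLock() turns out not to guarantee).

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the begin/end half of a transactional dataset (not a Jena type). */
interface SimpleTransactional {
    void begin(boolean write);
    void end();
}

/** Guards begin() with tryLock() so callers get an immediate yes/no instead of blocking. */
final class TryBeginWrapper {
    private final SimpleTransactional wrapped;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Remembers, per thread, which mode was acquired; nested transactions are not handled.
    private final ThreadLocal<Boolean> heldWrite = new ThreadLocal<>();

    TryBeginWrapper(SimpleTransactional wrapped) { this.wrapped = wrapped; }

    /** Returns false at once if the lock is unavailable; otherwise begins the wrapped transaction. */
    boolean tryBegin(boolean write) {
        boolean acquired = write ? lock.writeLock().tryLock() : lock.readLock().tryLock();
        if (!acquired) return false;
        try {
            wrapped.begin(write); // "should" not block, because we already hold the matching lock
        } catch (RuntimeException e) {
            unlock(write);
            throw e;
        }
        heldWrite.set(write);
        return true;
    }

    /** Ends the wrapped transaction and releases the lock taken by tryBegin on this thread. */
    void end() {
        Boolean write = heldWrite.get();
        if (write == null) throw new IllegalStateException("no transaction on this thread");
        heldWrite.remove();
        try {
            wrapped.end();
        } finally {
            unlock(write);
        }
    }

    private void unlock(boolean write) {
        if (write) lock.writeLock().unlock(); else lock.readLock().unlock();
    }
}

final class TryBeginDemo {
    /** Returns {first writer acquired, second writer on another thread acquired}. */
    static boolean[] run() {
        TryBeginWrapper w = new TryBeginWrapper(new SimpleTransactional() {
            public void begin(boolean write) { }
            public void end() { }
        });
        boolean first = w.tryBegin(true);
        AtomicBoolean second = new AtomicBoolean(true);
        Thread other = new Thread(() -> second.set(w.tryBegin(true)));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        w.end();
        return new boolean[] { first, second.get() };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("first writer acquired:  " + r[0]);
        System.out.println("second writer acquired: " + r[1]);
    }
}
```

The second writer runs on a separate thread because ReentrantReadWriteLock is reentrant: a second tryBegin(true) from the thread that already holds the write lock would succeed.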