[ https://issues.apache.org/jira/browse/JENA-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989698#comment-16989698 ]
Andy Seaborne commented on JENA-1785:
-------------------------------------

Here is my understanding of the problem: this is mainly to make sure I understand the details here!

----

The NodeTable cache should reflect the node table.

The NodeTable is append-only and it does not matter if nodes are added to it by later transactions; what is in the data is determined by the triples/quads, not by presence in the node table.

Inside a write-transaction things are different because the W-txn may abort. Nodes created should not be visible outside the W-txn until it commits (JENA-1746). The fact that nodes are cleared from storage after an abort was the problem in JENA-1746.

The underlying disk storage, a TransBinaryDataFile, reflects transactions. In a W-txn, there is the visible part (seen by all still-running read-transactions) and the additional nodes of the W-txn.

NodeTableCache has buffering caches that attempt to reflect this. These are ThreadBufferingCache, which is a pair of normal LRU caches, one "local" for the W-txn nodes and one "base" for the globally visible cache of the node table. The same structure is used for the not-present cache.

Problem:

A node is initially recorded as "not present". It is now in the base-not-present cache.

A W-txn adds this node, putting it in the node<->nodeId caches as well as writing through to the TransBinaryDataFile. It is removed from the not-present-local cache if it is in there.

Up to this point lookups give the correct answer because the node<->nodeId caching is checked before not-present.

However, problems arise if the node falls out of the local node/nodeId cache. Now a lookup will not find it in a local cache, and will get the wrong answer when it falls back to the base caches, because the node is still in the base-not-present cache.

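The sequence described above can be sketched as a small stand-alone Java model. This is only an illustration under stated assumptions, not Jena's NodeTableCache: the class StaleNotPresentSketch, its fields and its methods are all hypothetical, the not-present caches are plain sets rather than LRU caches, and the local node cache has capacity 1 purely to force the eviction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the problem; not Jena's NodeTableCache. */
class StaleNotPresentSketch {

    /** Tiny LRU map: drops the least recently used entry once capacity is exceeded. */
    static final class Lru<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;
        Lru(int capacity) { super(16, 0.75f, true); this.capacity = capacity; }
        @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    // Stand-in for the node table storage (the TransBinaryDataFile in the comment).
    final Map<String, Long> storage = new HashMap<>();

    // "base": globally visible caches.
    final Map<String, Long> baseNodeToId = new Lru<>(100);
    final Set<String> baseNotPresent = new HashSet<>();

    // "local": caches of the current W-txn. Capacity 1 is an assumption to force eviction.
    final Map<String, Long> localNodeToId = new Lru<>(1);
    final Set<String> localNotPresent = new HashSet<>();

    private long nextId = 0;

    /** Lookup outside the W-txn: a miss is recorded in the base not-present cache. */
    Long readLookup(String node) {
        Long id = baseNodeToId.get(node);
        if (id != null) return id;
        if (baseNotPresent.contains(node)) return null;
        id = storage.get(node);
        if (id == null) baseNotPresent.add(node);
        else baseNodeToId.put(node, id);
        return id;
    }

    /** The W-txn adds a node: write through to storage, update local caches only. */
    long writeTxnAdd(String node) {
        long id = nextId++;
        storage.put(node, id);
        localNodeToId.put(node, id);
        localNotPresent.remove(node);
        return id;
    }

    /** Lookup inside the W-txn: local caches first, then base caches, then storage. */
    Long writeTxnLookup(String node) {
        Long id = localNodeToId.get(node);
        if (id != null) return id;
        if (localNotPresent.contains(node)) return null;
        id = baseNodeToId.get(node);
        if (id != null) return id;
        if (baseNotPresent.contains(node)) return null; // stale entry wins here
        return storage.get(node);
    }

    public static void main(String[] args) {
        StaleNotPresentSketch s = new StaleNotPresentSketch();
        s.readLookup("a");                           // "a" lands in base not-present
        s.writeTxnAdd("a");                          // W-txn creates "a"
        System.out.println(s.writeTxnLookup("a"));   // 0: found in the local cache
        s.writeTxnAdd("b");                          // evicts "a" from the local cache
        System.out.println(s.writeTxnLookup("a"));   // null, although storage holds "a"
    }
}
```

In this model the second lookup returns null even though the storage map contains the node, which is the wrong answer described in the comment: the lookup falls through to the base not-present set before it ever consults storage.
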
> A newly created node can remain invisible after commit
> ------------------------------------------------------
>
>                 Key: JENA-1785
>                 URL: https://issues.apache.org/jira/browse/JENA-1785
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 3.13.0, Jena 3.13.1
>            Reporter: Pavel Mikhailovskii
>            Assignee: Andy Seaborne
>            Priority: Critical
>         Attachments: TestVisibilityOfChanges.java
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> A node once marked as non-present (_NodeTableCache.nonPresent_) can remain
> invisible even after it's created and the transaction is committed. That
> might happen because there's no guarantee that *all* newly created nodes will
> be eventually added to the "base" version _ThreadBufferingCache.baseCache_ of
> the _node2id_Cache_ (as the _localCache_ has limited capacity) or removed
> from the "base" version of the _nonPresent_ cache (even if they were, there
> would still be a chance of re-adding them by some read transaction).
> The simplest fix is to get rid of the _nonPresent_ cache which seems to be of
> limited use anyway. A more sophisticated fix would involve keeping track of
> all newly allocated nodes and their removal from the base version of
> _nonPresent_ cache on transaction commit.
> To reproduce: see the attached test.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)