[ 
https://issues.apache.org/jira/browse/JENA-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989698#comment-16989698
 ] 

Andy Seaborne commented on JENA-1785:
-------------------------------------

Here is my understanding of the problem: this is mainly to make sure I 
understand the details here!
----
The NodeTable Cache should reflect the node table. The NodeTable is append-only 
and it does not matter if nodes are added to it by later transactions; what is 
in the data is determined by the triples/quads, not by presence in the node 
table.

Inside a write-transaction, things are different because the W-txn may abort. 
Nodes it creates should not be visible outside the W-txn until it commits 
(JENA-1746). The fact that nodes are cleared from storage after an abort was 
the problem in JENA-1746.

The underlying disk storage, a TransBinaryDataFile, reflects transactions. In a 
W-txn, there is the part visible to all still-running read-transactions, plus 
the additional nodes of the W-txn.

NodeTableCache has buffering caches that attempt to reflect this.

Each of these is a ThreadBufferingCache, which is a pair of normal LRU caches: 
one "local" for the W-txn nodes and one "base" for the globally visible cache 
of the node table.

The same structure is used for the not-present cache.
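
To make the local/base arrangement concrete, here is a minimal sketch. It is 
not the Jena ThreadBufferingCache code: the class and method names 
(LruSketch, BufferingCacheSketch, putLocal, getBase, commit, abort) are 
hypothetical, and it assumes that commit copies whatever is still in "local" 
into "base" while abort simply discards "local".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a small LRU cache; not a Jena class.
class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruSketch(int capacity) {
        super(16, 0.75f, true); // access order, so the eldest entry is the least recently used
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}

// Hypothetical two-level cache: "local" buffers W-txn entries,
// "base" holds the globally visible entries.
class BufferingCacheSketch<K, V> {
    private final LruSketch<K, V> base;
    private final LruSketch<K, V> local;

    BufferingCacheSketch(int baseSize, int localSize) {
        this.base = new LruSketch<>(baseSize);
        this.local = new LruSketch<>(localSize);
    }

    // Inside the W-txn: new entries go to "local" only.
    void putLocal(K key, V value) {
        local.put(key, value);
    }

    // Lookup as seen by the W-txn: "local" first, then "base".
    V get(K key) {
        V v = local.get(key);
        return v != null ? v : base.get(key);
    }

    // Lookup as seen outside the W-txn: "base" only.
    V getBase(K key) {
        return base.get(key);
    }

    // Assumption: commit publishes whatever is still in "local".
    void commit() {
        base.putAll(local);
        local.clear();
    }

    // Abort: nothing from the W-txn becomes visible.
    void abort() {
        local.clear();
    }
}
```

Note that with a bounded "local" cache, an entry evicted before commit never 
reaches "base" in this sketch.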

Problem:

A node is initially recorded as "not present". It is now in the 
base-not-present cache.

A W-txn adds this node, putting it in the node<->nodeId caches as well as 
writing through to the TransBinaryDataFile. It is removed from the 
not-present-local cache if in there. The code correctly checks to this point 
because the node<->nodeId caching is checked before not-present.

However, problems arise if it falls out of the local node/nodeId cache.

Now, a lookup will not find it in a local cache, and will get the wrong answer 
when it consults the base caches, because the node is still in the 
base-not-present cache.
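
The sequence above can be replayed with a small simulation. This is only a 
sketch of the scenario as I understand it, not the NodeTableCache 
implementation: StaleNotPresentDemo and its methods are hypothetical, a 
HashMap stands in for the TransBinaryDataFile, and the not-present caches are 
plain sets.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation of the stale "not present" problem; not Jena code.
class StaleNotPresentDemo {
    // Small LRU map built on LinkedHashMap in access order.
    static <K, V> Map<K, V> lru(final int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    final Map<String, Long> storage = new HashMap<>();   // stands in for the node table on disk
    final Map<String, Long> baseIds = lru(100);          // globally visible node -> id cache
    final Map<String, Long> localIds;                    // W-txn node -> id cache
    final Set<String> baseNotPresent = new HashSet<>();  // globally visible not-present cache
    final Set<String> localNotPresent = new HashSet<>(); // W-txn not-present cache
    private long nextId = 0;

    StaleNotPresentDemo(int localCapacity) {
        this.localIds = lru(localCapacity);
    }

    // Lookup outside the W-txn; a miss is recorded in the base not-present cache.
    Long readerLookup(String node) {
        Long id = baseIds.get(node);
        if (id != null) return id;
        if (baseNotPresent.contains(node)) return null;
        id = storage.get(node);
        if (id == null) baseNotPresent.add(node);
        return id;
    }

    // The W-txn adds a node: write through to storage and cache it locally.
    long writerAllocate(String node) {
        long id = nextId++;
        storage.put(node, id);
        localIds.put(node, id);
        localNotPresent.remove(node); // only the local not-present cache is cleaned
        return id;
    }

    // Lookup inside the W-txn: node -> id caches first, then not-present, then storage.
    Long writerLookup(String node) {
        Long id = localIds.get(node);
        if (id == null) id = baseIds.get(node);
        if (id != null) return id;
        if (localNotPresent.contains(node) || baseNotPresent.contains(node)) return null;
        return storage.get(node);
    }

    public static void main(String[] args) {
        StaleNotPresentDemo demo = new StaleNotPresentDemo(1); // local cache holds one entry
        demo.readerLookup("n1");                     // miss: "n1" goes into base-not-present
        demo.writerAllocate("n1");
        System.out.println(demo.writerLookup("n1")); // prints 0: found in the local cache
        demo.writerAllocate("n2");                   // evicts "n1" from the local cache
        System.out.println(demo.writerLookup("n1")); // prints null, although storage has "n1"
    }
}
```

The second lookup never reaches storage: the stale base-not-present entry 
answers first.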




> A newly created node can remain invisible after commit
> ------------------------------------------------------
>
>                 Key: JENA-1785
>                 URL: https://issues.apache.org/jira/browse/JENA-1785
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 3.13.0, Jena 3.13.1
>            Reporter: Pavel Mikhailovskii
>            Assignee: Andy Seaborne
>            Priority: Critical
>         Attachments: TestVisibilityOfChanges.java
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> A node once marked as non-present (_NodeTableCache.nonPresent_) can remain 
> invisible even after it's created and the transaction is committed. That 
> might happen because there's no guarantee that *all* newly created nodes will 
> be eventually added to the "base" version _ThreadBufferingCache.baseCache_ of 
> the _node2id_Cache_ (as the _localCache_ has limited capacity) or removed 
> from the "base" version of the _nonPresent_ cache (even if they were, there 
> would still be a chance of re-adding them by some read transaction). 
> The simplest fix is to get rid of the _nonPresent_ cache which seems to be of 
> limited use anyway. A more sophisticated fix would involve keeping track of 
> all newly allocated nodes and removing them from the base version of the 
> _nonPresent_ cache on transaction commit.
> To reproduce: see the attached test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)