[jira] [Commented] (JENA-1746) TDB2 rollback method clashes with nodetable cache

Jira Fri, 30 Aug 2019 07:32:06 -0700


    [ 
https://issues.apache.org/jira/browse/JENA-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919598#comment-16919598
 ]


Miklós Győrfi commented on JENA-1746:
-------------------------------------

Hi Andy!

I've attached a Gradle test project that reproduces the error reliably. It
contains some comments to make it easier to follow. It creates a new database
in the temp directory, but the location does not matter; the error is
consistent and reproducible.

To run the test, use this command in the project directory:

./gradlew test

 

> TDB2 rollback method clashes with nodetable cache
> -------------------------------------------------
>
>                 Key: JENA-1746
>                 URL: https://issues.apache.org/jira/browse/JENA-1746
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 3.11.0, Jena 3.12.0
>         Environment: Linux  3.16.0-9-amd64 #1 SMP Debian 3.16.68-2 
> (2019-06-17) x86_64 GNU/Linux
> java version "1.8.0_05"
> Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
>            Reporter: Miklós Győrfi
>            Priority: Critical
>         Attachments: jena-test.tgz
>
>
> *Issue:* Inserting triples, rolling back the TDB2 dataset, and then loading 
> nodes again, including some nodes with the same content as before, produces 
> inconsistent results: some nodes disappear and some are replaced. Moreover, 
> it unrecoverably *corrupts* the database files: accessing triples afterwards 
> may throw a RiotThriftException:
> org.apache.jena.riot.thrift.RiotThriftException: No conversion to a
> Node: <RDF_Term >
> *Reproduction:* Insert some quads into a non-empty dataset, roll it back, 
> and then insert the same triples again in a different order, using anonymous 
> (blank) nodes and URI nodes together. This procedure does not guarantee that 
> the issue appears, but the probability is high.
> *Cause:* My investigation shows that the culprit is the {{NodeTableCache}}. 
> It caches the node to node ID mapping of the backing table 
> ({{NodeTableTRDF}}), but the cache does not react to the rollback (abort) 
> operation. During rollback, the backing table invalidates the node IDs. A 
> node ID is closely tied to the position of the node data in the node data 
> file, so new inserts can reuse these invalidated node IDs, or IDs close to 
> them, for other nodes. Because the nodes that remain in the cache but were 
> never written then overlap with the newly inserted ones, reading them back 
> causes Thrift errors, or later leaves nodes missing from the index. The data 
> of the cached nodes disappears once they are evicted from the cache or the 
> dataset is reopened.
> *Possible fix:* None of the NodeTables registers for or reacts to the 
> rollback; only the backing file and index are restored. The best solution 
> would be to _give these components a way to react to the restoration_. The 
> cache could then evict all cached data, or track the changes made in each 
> transaction and evict only those. In any case, evicting all the caches on 
> rollback is easy to justify.
> TransactionCoordinator has collections for shutdownHooks and for 
> transactionsComponents. This is a good pattern to follow: add another 
> collection of notification interfaces and call them back on transactional 
> events. NodeTableCache (and other objects) can then listen for these events 
> and evict the cache if necessary.
> Another possibility is to add a callback to the NodeTable that reacts to the 
> invalidation and propagates it back up the NodeTable hierarchy.
> Another, simpler fix is to propagate the thread-safe storage "version" down 
> into the NodeTables, check it in the cache, and evict when it has changed.
> *Workaround:* Skipping the cache (setting nodeToIdCacheSize and 
> idToNodeCacheSize to -1 in StoreParams) works as a workaround for now, but 
> it hurts performance.
>  
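
To illustrate the cause and the listener-based fix described in the quoted
report, here is a minimal sketch in plain Java. It is a toy model, not Jena
code: ToyNodeTable, ToyNodeCache, RollbackCacheDemo and the abort-listener
hook are hypothetical names invented for this example, and the "file" is just
an in-memory list where a node's ID is its position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only node table: a node's ID is its position in the "file". */
class ToyNodeTable {
    private final List<String> file = new ArrayList<>();
    private final List<Runnable> abortListeners = new ArrayList<>();
    private int committedSize = 0;

    /** Hypothetical notification hook, called back on every abort. */
    void addAbortListener(Runnable listener) { abortListeners.add(listener); }

    int allocate(String node) { file.add(node); return file.size() - 1; }
    int find(String node) { return file.indexOf(node); }
    String read(int id) { return id < file.size() ? file.get(id) : null; }
    void commit() { committedSize = file.size(); }

    /** Truncates the file to the last commit, then notifies the listeners. */
    void abort() {
        file.subList(committedSize, file.size()).clear();
        abortListeners.forEach(Runnable::run);
    }
}

/** Toy node-to-ID cache in front of the table. */
class ToyNodeCache {
    private final ToyNodeTable base;
    private final Map<String, Integer> nodeToId = new HashMap<>();

    ToyNodeCache(ToyNodeTable base, boolean evictOnAbort) {
        this.base = base;
        // The fix: register for abort events and drop every cached entry.
        if (evictOnAbort) base.addAbortListener(nodeToId::clear);
    }

    int getOrAllocate(String node) {
        Integer cached = nodeToId.get(node);
        if (cached != null) return cached;      // may be stale after an abort
        int id = base.find(node);
        if (id < 0) id = base.allocate(node);
        nodeToId.put(node, id);
        return id;
    }
}

class RollbackCacheDemo {
    /** Returns the node stored under the ID that the cache reports for "B". */
    static String storedUnderIdOfB(boolean evictOnAbort) {
        ToyNodeTable table = new ToyNodeTable();
        ToyNodeCache cache = new ToyNodeCache(table, evictOnAbort);
        cache.getOrAllocate("A");
        table.commit();                         // non-empty dataset
        cache.getOrAllocate("B");               // gets ID 1, never committed
        table.abort();                          // ID 1 is free again
        cache.getOrAllocate("D");               // new node reuses ID 1
        int idB = cache.getOrAllocate("B");     // stale hit unless evicted
        return table.read(idB);
    }

    public static void main(String[] args) {
        System.out.println("without eviction: " + storedUnderIdOfB(false)); // D
        System.out.println("with eviction:    " + storedUnderIdOfB(true));  // B
    }
}
```

Without eviction the cache still maps "B" to ID 1, which now holds "D", so
the node appears to have been replaced. With the listener registered, "B" is
allocated again and reads back correctly.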

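The simpler "version" idea from the quoted report can be sketched the same
way. Again this is only a toy model under stated assumptions: VersionedStore
and VersionCheckingCache are hypothetical classes, and the version counter is
assumed to be bumped by the store on every abort.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Toy store whose version changes whenever a transaction is aborted. */
class VersionedStore {
    private final List<String> file = new ArrayList<>();
    private final AtomicLong version = new AtomicLong();
    private int committedSize = 0;

    long version() { return version.get(); }
    int allocate(String node) { file.add(node); return file.size() - 1; }
    int find(String node) { return file.indexOf(node); }
    String read(int id) { return id < file.size() ? file.get(id) : null; }
    void commit() { committedSize = file.size(); }

    void abort() {
        file.subList(committedSize, file.size()).clear();
        version.incrementAndGet();              // signals "IDs were invalidated"
    }
}

/** Cache that compares the store version on each access and evicts on change. */
class VersionCheckingCache {
    private final VersionedStore store;
    private final Map<String, Integer> nodeToId = new HashMap<>();
    private long seenVersion;

    VersionCheckingCache(VersionedStore store) {
        this.store = store;
        this.seenVersion = store.version();
    }

    int getOrAllocate(String node) {
        long current = store.version();
        if (current != seenVersion) {           // an abort happened since last use
            nodeToId.clear();
            seenVersion = current;
        }
        Integer cached = nodeToId.get(node);
        if (cached != null) return cached;
        int id = store.find(node);
        if (id < 0) id = store.allocate(node);
        nodeToId.put(node, id);
        return id;
    }
}
```

This variant needs no registration step, at the cost of one version
comparison per cache access.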


--
This message was sent by Atlassian Jira
(v8.3.2#803003)
