[ https://issues.apache.org/jira/browse/JENA-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643660#comment-16643660 ]
Andy Seaborne commented on JENA-1615:
-------------------------------------
Hi there - thanks for the clear analysis and for the fix.
{{node-data.bdf}} and {{prefix-data.bdf}} should be closed as well. From code
inspection, {{TransBinaryDataFile}} is not closing its state file:
{noformat}
--- a/jena-db/jena-dboe-trans-data/src/main/java/org/apache/jena/dboe/trans/data/TransBinaryDataFile.java
+++ b/jena-db/jena-dboe-trans-data/src/main/java/org/apache/jena/dboe/trans/data/TransBinaryDataFile.java
@@ -217,6 +217,7 @@
     @Override
     public void close() {
+        stateMgr.close();
         binFile.close() ;
     }
{noformat}
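
As a rough illustration of the pattern behind that one-line change (this is not the actual Jena code): an object that owns two closeable resources has to close both of them. The class names {{Tracked}} and {{TransFileSketch}} below are hypothetical stand-ins; only the field names {{stateMgr}} and {{binFile}} come from the patch. The try/finally is an extra precaution that the patch itself does not include.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseBothSketch {

    /** Hypothetical stand-in for a file-backed resource; records its name when closed. */
    public static class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        public Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    /** Hypothetical owner of a state file and a data file, loosely mirroring the patch. */
    public static class TransFileSketch implements AutoCloseable {
        private final Tracked stateMgr;
        private final Tracked binFile;

        public TransFileSketch(Tracked stateMgr, Tracked binFile) {
            this.stateMgr = stateMgr;
            this.binFile = binFile;
        }

        @Override
        public void close() {
            // Close the state file first; finally ensures the data file
            // is still closed if closing the state file throws.
            try {
                stateMgr.close();
            } finally {
                binFile.close();
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        TransFileSketch file =
            new TransFileSketch(new Tracked("state", log), new Tracked("data", log));
        file.close();
        System.out.println(log); // prints [state, data]
    }
}
```

Without the {{stateMgr.close()}} call, only the data file would appear in the log, which corresponds to the state file's descriptor staying open.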
> Compaction leaks file descriptors
> ---------------------------------
>
> Key: JENA-1615
> URL: https://issues.apache.org/jira/browse/JENA-1615
> Project: Apache Jena
> Issue Type: Bug
> Components: Core, TDB2
> Affects Versions: Jena 3.8.0
> Environment: I reproduced the issue on the following environments:
> * OS / Java:
> ** MacOS 10.13.5
> Java 1.8.0_161 (Oracle)
> ** Debian 9.5
> Java 1.8.0_181 (OpenJDK)
> * Jena version 3.8.0
> * TDB2 mode: mapped
> Reporter: Damien Obrist
> Priority: Major
> Attachments: open_files_after_compaction_after_gc.png,
> open_files_after_compaction_after_gc_with_fix.png,
> open_files_after_compaction_before_gc.png, open_files_before_compaction.png
>
>
> h3. Context
> I'm using a TDB2 dataset in a long-running Scala application, in which the
> dataset gets compacted regularly. After compactions, the application removes
> the {{Data-xxxx}} folder of the previous generation. However, the
> corresponding disk space isn't returned to the OS and is still
> reported as used by {{df}}. Indeed, {{lsof}} shows that the application
> keeps open file descriptors that point to the old generation's files. Only
> stopping / restarting the JVM frees the disk space for good.
> h3. Reproduction steps
> * Connect to an existing TDB2 dataset
> {code}
> val dataset = TDB2Factory.connectDataset("sample"){code}
> * Check open files
> [^open_files_before_compaction.png]
> * Compact the dataset
> {code}DatabaseMgr.compact(dataset.asDatasetGraph){code}
> * Check open files (before garbage collection)
> [^open_files_after_compaction_before_gc.png]
> * Check open files (after garbage collection)
> [^open_files_after_compaction_after_gc.png]
> The last screenshot shows that, even after garbage collection, there are still
> open file descriptors pointing to the old generation {{Data-0001}}.
> h3. Impact
> Depending on how disk usage is being reported, this can be quite problematic.
> In our case, we're running on an OpenShift infrastructure with limited
> storage. After only a handful of compactions, the storage is considered full
> and cannot be used anymore.