[
https://issues.apache.org/jira/browse/JENA-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113186#comment-15113186
]
ASF GitHub Bot commented on JENA-1122:
--------------------------------------
Github user afs commented on the pull request:
https://github.com/apache/jena/pull/123#issuecomment-174066342
**Design**
Protecting the text index this way sort of works for TDB specifically
because of an internal feature of TDB (it manages storage to stop duplication)
which is not a guaranteed feature. Other dataset implementations will not work
out so nicely. It will be like two separate datasets and one index, which will
probably lead to corruption or inconsistent reads (cf. email
["transactions and docProducers"](http://mail-archives.apache.org/mod_mbox/jena-users/201601.mbox/%3C568FD70B.8060301%40epimorphics.com%3E)).
On [JENA-1122](https://issues.apache.org/jira/browse/JENA-1122) I
summarized discussions up to here as two options suggested:
1. Internal static state in `TextDatasetFactory` so that the same dataset
object is returned each time, cf. TDB's `StoreConnection`. This extends sharing
of text datasets to work with Java/API use, but not with "any dataset" in Fuseki
configurations.
2. Fuseki (or maybe `DatasetAssembler`), when assembling datasets, deals with
sharing using the graph structure. This copes with any dataset but not API use.
The first one looks hard because the key has to include the dataset, and
choosing such a key is hard in the general case.
The second one is easier to do because there is a natural key: the
resource (URI, bnode) for the dataset. A bonus would be a similar per-text-index
assembler check on reuse
([JENA-1104](https://issues.apache.org/jira/browse/JENA-1104)).
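As a rough illustration of option (2), the sketch below caches built datasets by the identifier of the assembler resource that describes them. It is not Jena code: `SharedDatasetRegistry`, `getOrBuild`, and the string form of the key are hypothetical names chosen for this example, and the dataset type is left generic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of Jena or Fuseki: remembers which dataset
// object was built for which assembler resource, so two services that point
// at the same resource get the same object.
final class SharedDatasetRegistry<D> {
    // Key: URI or blank node label of the dataset description (an assumption
    // about how the resource would be identified).
    private final Map<String, D> byResource = new HashMap<>();

    // Returns the dataset already built for resourceKey, or builds and
    // records it. The method is synchronized (and therefore reentrant), so a
    // builder may itself call getOrBuild for a nested dataset.
    synchronized D getOrBuild(String resourceKey, Supplier<D> builder) {
        D existing = byResource.get(resourceKey);
        if (existing != null) {
            return existing;
        }
        D built = builder.get();
        byResource.put(resourceKey, built);
        return built;
    }
}
```

With something like this, a second service naming the same dataset resource would reuse the first service's object rather than constructing a second text index over the same Lucene directory. It does nothing for datasets built directly through the API, which is the limitation noted for option (2).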
There is one minor point: Fuseki can have multiple assembler files with
badly chosen, clashing dataset URIs. (Solution: keep a list of all URIs across
assembler configs, which is a useful check anyway.)
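The cross-file check could look something like the following sketch. The class and method names (`DatasetUriClashCheck`, `declare`) are made up for illustration, and it assumes each dataset URI and configuration file name is available as a plain string while the files are being loaded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Fuseki: tracks which configuration file
// first declared each dataset URI and rejects a second file reusing it.
final class DatasetUriClashCheck {
    private final Map<String, String> declaredIn = new HashMap<>();

    // Records that configFile declares datasetUri. Repeated declarations from
    // the same file are allowed; a declaration from a different file throws.
    void declare(String datasetUri, String configFile) {
        String previous = declaredIn.putIfAbsent(datasetUri, configFile);
        if (previous != null && !previous.equals(configFile)) {
            throw new IllegalStateException(
                "Dataset URI " + datasetUri + " declared in both "
                + previous + " and " + configFile);
        }
    }
}
```

Only URIs need this treatment: blank node labels are scoped to the file they appear in, so they cannot clash across files.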
The ideal for [JENA-1122](https://issues.apache.org/jira/browse/JENA-1122)
is this PR (simplified?) to protect text indexes and (2) above to allow complex
configurations.
> Fuseki fails to start if configured with two services that share the same
> dataset with a lucene index.
> ------------------------------------------------------------------------------------------------------
>
> Key: JENA-1122
> URL: https://issues.apache.org/jira/browse/JENA-1122
> Project: Apache Jena
> Issue Type: Bug
> Components: Text
> Affects Versions: Jena 3.0.0, Fuseki 2.3.0
> Reporter: Brian McBride
>
> This problem arises when the assemblers for the two services run. For each
> service, a separate TextIndexLucene object is created. Both of those objects
> try to lock the same Lucene index directory and one fails.
> A proposed fix is to modify the TextDatasetFactory to only create one
> TextIndexLucene object per on-disk directory.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)