[
https://issues.apache.org/jira/browse/JENA-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124975#comment-15124975
]
ASF GitHub Bot commented on JENA-1122:
--------------------------------------
Github user afs commented on the pull request:
https://github.com/apache/jena/pull/123#issuecomment-177246334
Starting from the JENA-1122 description:
> Two Fuseki services, linking to the same dataset description.
Fuseki only calls assemblers once. No other system is (legitimately)
calling Fuseki service building. The configuration file processing puts
service access points into the server-wide state. There is no service
assembler (it could be done but it isn't, as it would serve no purpose);
services are built by custom processing while walking the data structure,
which is happening anyway.
In the Fuseki case, we want shared dataset descriptions, that is,
descriptions with the same name, to yield the same dataset. Processing
dataset descriptions is driven by assemblers, and they have names to use as
keys: the root resource. A general "dataset sharing" outside assemblers is
hard because of the lack of a key. In other cases, I can imagine that a
shared description alone does not always imply a shared object - in-memory
datasets for example. The more general area is not clearly defined.
The solution I see is that Fuseki handles the processing step for the link:
----
```
fuseki:dataset <#dataset>
<#dataset> rdf:type ja:RDFDataset(OrSubType)
```
----
This happens in `Builder.buildDataService` as it calls
`Assembler.general.open(datasetDesc)`.
It looks to me that if sharing is provided here, the problem statement of
JENA-1122 is addressed.
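As a rough illustration of what "sharing provided here" could look like, the
sketch below caches built datasets by the name (root resource URI) of their
description. It is not Fuseki code: `SharedDatasetCache`, `getOrBuild` and the
example URIs are hypothetical, `Object` stands in for a real dataset, and the
`opener` function stands in for the call to
`Assembler.general.open(datasetDesc)`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: builds each dataset description at most once, keyed by its root resource URI. */
final class SharedDatasetCache<D> {
    private final Map<String, D> built = new HashMap<>();
    private final Function<String, D> opener;

    /** The opener is assumed to wrap whatever actually assembles the dataset. */
    SharedDatasetCache(Function<String, D> opener) {
        this.opener = opener;
    }

    /** Returns the dataset already built for this description, or builds it on first use. */
    synchronized D getOrBuild(String datasetDescUri) {
        return built.computeIfAbsent(datasetDescUri, opener);
    }
}

final class SharedDatasetDemo {
    public static void main(String[] args) {
        // Each call to the opener creates a distinct object, like assembling a fresh dataset.
        SharedDatasetCache<Object> cache = new SharedDatasetCache<>(uri -> new Object());

        Object a = cache.getOrBuild("http://example/config#dataset");
        Object b = cache.getOrBuild("http://example/config#dataset");
        Object c = cache.getOrBuild("http://example/config#other");

        System.out.println(a == b); // true: same description name, same dataset
        System.out.println(a == c); // false: different name, different dataset
    }
}
```

With something like this in place, two services that link to the same
description would get the same dataset object, so only one index would be
opened for it.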
One matter arising:
Service descriptions can be in multiple files (it is the preferred pattern
to use `configuration/`). The template system behind the UI uses relative URIs
so names of descriptions are unique across the server.
If a user manually writes two configuration files that use the same
absolute URI but means them to be different datasets, we have a problem.
This could be made to cause an error (the safe choice is to force shared
datasets to be in the server `config.ttl`).
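A minimal sketch of how such an error could be raised, assuming the loader
knows which file each dataset description came from. `DescriptionNameRegistry`
and `register` are hypothetical names, and the file names in the demo are only
examples; none of this is existing Fuseki code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical check: a dataset description name may be defined by only one configuration file. */
final class DescriptionNameRegistry {
    // Maps a description URI to the file that first defined it.
    private final Map<String, String> definedIn = new HashMap<>();

    /** Records that the file defines the URI; throws if a different file already defined it. */
    void register(String descUri, String file) {
        String previous = definedIn.putIfAbsent(descUri, file);
        if (previous != null && !previous.equals(file)) {
            throw new IllegalStateException(
                "Dataset description <" + descUri + "> is defined in both "
                + previous + " and " + file);
        }
    }
}

final class DescriptionNameRegistryDemo {
    public static void main(String[] args) {
        DescriptionNameRegistry registry = new DescriptionNameRegistry();
        registry.register("http://example/ds", "configuration/a.ttl");
        // Repeating the name within one file is allowed: that is the sharing case.
        registry.register("http://example/ds", "configuration/a.ttl");
        try {
            registry.register("http://example/ds", "configuration/b.ttl");
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Sharing within one file stays legal here; only a second file reusing the same
absolute URI is rejected.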
`FusekiConfig.initializeDataAccessPoints` is the driver; it calls
`readConfigurationDirectory` and the other places where service descriptions
can be found, and so needs checking.
For now, just solving this for two services in the server configuration
file, with entries in the `fuseki:services` list, is a good start.
> Fuseki fails to start if configured with two services that share the same
> dataset with a lucene index.
> ------------------------------------------------------------------------------------------------------
>
> Key: JENA-1122
> URL: https://issues.apache.org/jira/browse/JENA-1122
> Project: Apache Jena
> Issue Type: Bug
> Components: Text
> Affects Versions: Jena 3.0.0, Fuseki 2.3.0
> Reporter: Brian McBride
>
> This problem arises when the assemblers for the two services run. For each
> service, a separate TextIndexLucene object is created. Both of those objects
> try to lock the same Lucene index directory and one fails.
> A proposed fix is to modify the TextDatasetFactory to only create one
> TextIndexLucene object per on-disk directory.