[jira] [Commented] (JENA-1122) Fuseki fails to start if configured with two services that share the same dataset with a lucene index.

ASF GitHub Bot (JIRA) Sat, 30 Jan 2016 09:06:05 -0800

    [ 
https://issues.apache.org/jira/browse/JENA-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124975#comment-15124975
 ]


ASF GitHub Bot commented on JENA-1122:
--------------------------------------

Github user afs commented on the pull request:

    https://github.com/apache/jena/pull/123#issuecomment-177246334
  
    Starting from the JENA-1122 description:
    
    > Two Fuseki services, linking to the same dataset description.
    
    Fuseki only calls assemblers once. No other system is (legitimately) 
calling Fuseki service building.  The configuration file processing puts 
service access points into the server-wide state.  There is no service 
assembler (it could be done but it isn't; it would serve no purpose). Services 
are built by custom processing while walking the data structure, which is 
happening anyway.
    
    In the Fuseki case, we want shared dataset descriptions, that is, ones 
with the same name, to yield the same dataset.  Processing dataset descriptions 
is driven by assemblers, and they use the name of the root resource as the key.  
A general "dataset sharing" outside assemblers is hard because of the lack of a 
key. In other cases, I can imagine that a shared description alone does not 
always imply a shared object - in-memory datasets for example.  The more 
general area is not clearly defined.
    
    The solution I see is that Fuseki handles the process step for the link:
    
    ```
        fuseki:dataset   <#dataset>
       
    <#dataset> rdf:type ja:RDFDataset(OrSubType)
    ```
    
    This happens in `Builder.buildDataService` as it calls 
`Assembler.general.open(datasetDesc)`.
    
    It looks to me that if sharing is provided  here, the problem statement of 
JENA-1122 is addressed.
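    
    As a rough illustration of what "sharing provided here" could mean, the 
sketch below caches built objects by the URI of the description's root 
resource, so a second request with the same name returns the same object. The 
class `SharedDatasetRegistry` and its method `getOrBuild` are hypothetical 
names for illustration, not part of Jena or Fuseki; the builder function 
stands in for the `Assembler.general.open(datasetDesc)` call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration only; not Jena or Fuseki code.
final class SharedDatasetRegistry<D> {
    // Key: URI of the root resource of the dataset description.
    private final Map<String, D> built = new HashMap<>();

    // Build the dataset the first time a description name is seen and
    // return the same object for every later request with that name.
    synchronized D getOrBuild(String descriptionUri, Function<String, D> builder) {
        return built.computeIfAbsent(descriptionUri, builder);
    }

    public static void main(String[] args) {
        SharedDatasetRegistry<Object> registry = new SharedDatasetRegistry<>();
        // Stand-in for opening the dataset description with an assembler.
        Function<String, Object> builder = uri -> new Object();
        Object first = registry.getOrBuild("http://example/config#dataset", builder);
        Object second = registry.getOrBuild("http://example/config#dataset", builder);
        System.out.println("same object: " + (first == second));
    }
}
```

    With something along these lines, two services that link to the same 
description would only trigger one build, so only one object would try to 
lock the Lucene index directory.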
    
    One matter arising: 
    
    Service descriptions can be in multiple files (it is the preferred pattern 
to use `configuration/`). The template system behind the UI uses relative URIs 
so names of descriptions are unique across the server.  
    
    If a user manually writes two configuration files that use the same 
absolute URI but means them to be different datasets, we have a problem. This 
could be made to cause an error (the safe choice is to force shared datasets 
to be in the server `config.ttl`).
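    
    One way the error case could be detected is to remember which file first 
defined each description name and reject the same name from a different file. 
This is only a sketch under that assumption; `DescriptionNameCheck` and 
`record` are hypothetical names, not existing Fuseki code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical check for illustration only; not Fuseki code.
final class DescriptionNameCheck {
    // Description URI -> configuration file where it was first seen.
    private final Map<String, String> firstSeenIn = new HashMap<>();

    // Record that `file` contains a dataset description named `uri`.
    // Throws if a different file has already used the same name.
    void record(String uri, String file) {
        String previous = firstSeenIn.putIfAbsent(uri, file);
        if (previous != null && !previous.equals(file)) {
            throw new IllegalStateException(
                "Dataset description <" + uri + "> appears in both "
                    + previous + " and " + file);
        }
    }
}
```

    Repeated use of a name within one file is allowed here, which matches the 
idea of keeping shared datasets in the server `config.ttl`.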
    
    `FusekiConfig.initializeDataAccessPoints` is the driver; it calls 
`readConfigurationDirectory` and handles the other places where service 
descriptions can be, so it needs checking.
    
    For now, just solving this for two services in the server configuration 
file, with entries in the `fuseki:services` list, is a good start.
    
    



> Fuseki fails to start if configured with two services that share the same 
> dataset with a lucene index.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: JENA-1122
>                 URL: https://issues.apache.org/jira/browse/JENA-1122
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Text
>    Affects Versions: Jena 3.0.0, Fuseki 2.3.0
>            Reporter: Brian McBride
>
> This problem arises when the assemblers for the two services run.  For each 
> service, a separate TextIndexLucene object is created.  Both of those objects 
> try to lock the same Lucene index directory and one fails.
> A proposed fix is to modify the TextDatasetFactory to only create one 
> TextIndexLucene object per on disk directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
