[2015-06-03 12:13:10] Builder INFO Service: :memory
[2015-06-03 12:13:11] Config INFO Register: /memory
[2015-06-03 12:13:11] Server INFO Started 2015/06/03 12:13:11 CEST on port 3030
I tested it in two versions: the official release 2.0.0 and the
latest snapshot 2.0.1-SNAPSHOT 2015-05-05T12:48:09+0000. The
observed behaviour is as follows:
In 2.0.0: If I load triples that do not contain "rdfs:label",
everything works properly, although in that case the index engine
is not exercised. As soon as I add a single "rdfs:label" triple
to the file I am loading into Fuseki, an error occurs:
[2015-06-03 12:10:47] Fuseki INFO [7] Filename: licenties.ttl, Content-Type=application/octet-stream, Charset=null => Turtle : Count=40 Triples=40 Quads=0
[2015-06-03 12:10:47] HttpAction WARN Exception during abort (operation attempts to continue): Can't abort a write lock-transaction
[2015-06-03 12:10:47] Fuseki INFO [7] 500 Server Error (523 ms)
I remember that a few months ago, when 2.0.0 was first released,
I discovered this issue and reported it to you. At that time I
didn't realize that the root cause was indexing. You fixed it in
a later snapshot, but my test wasn't done properly, so I thought
the problem was solved and gave you wrong feedback. My sincere
apologies.
In 2.0.1 SNAPSHOT: The latest snapshot contains the patch I
mentioned above, so the triples can be loaded successfully.
However, they are not indexed at all: queries with keyword search
do not return any results. Following your advice, I tested
loading and querying from both the Web UI and the s-post/s-query
tools; unfortunately (or fortunately?) the results are the same.
TDB: Meanwhile, I ran a similar experiment on Fuseki with TDB in
2.0.0 and 2.0.1 SNAPSHOT, and both work properly. Loading
succeeds and queries return search results. The only difference
is that, in the configuration file, the in-memory dataset is
replaced with TDB.
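To make the difference concrete, here is a minimal sketch of the swap. It assumes the in-memory dataset is declared the same way as in the April configuration quoted further down in this thread (a ja:RDFDataset with a ja:MemoryModel default graph); the node names follow the full TDB configuration, and nothing else is meant to change.

```turtle
@prefix ja:   <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix tdb:  <http://jena.hpl.hp.com/2008/tdb#> .
@prefix text: <http://jena.apache.org/text#> .

# In-memory variant (assumed shape, based on the earlier configuration
# quoted later in this thread; uncomment to use it instead of TDB):
# <#dataset> a ja:RDFDataset ;
#     ja:defaultGraph [ a ja:MemoryModel ] .

# TDB variant (as in the full configuration):
<#dataset> a tdb:DatasetTDB ;
    tdb:location "DB" .

# Unchanged in both cases: the text dataset wraps <#dataset>.
<#text_dataset> a text:TextDataset ;
    text:dataset <#dataset> ;
    text:index   <#indexLucene> .
```

The complete TDB configuration used in the test follows.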
@prefix :       <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:    <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:   <http://jena.apache.org/text#> .

[] rdf:type fuseki:Server ;
   fuseki:services ( <#service_text_tdb> ) .

# TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB   rdfs:subClassOf ja:Model .

# Text
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
text:TextDataset     rdfs:subClassOf ja:RDFDataset .
text:TextIndexLucene rdfs:subClassOf text:TextIndex .

<#service_text_tdb> a fuseki:Service ;
    rdfs:label "TDB/text service" ;
    fuseki:name "tdb" ;
    fuseki:serviceQuery "query" ;
    fuseki:serviceQuery "sparql" ;
    fuseki:serviceUpdate "update" ;
    fuseki:serviceUpload "upload" ;
    fuseki:serviceReadGraphStore "get" ;
    fuseki:serviceReadWriteGraphStore "data" ;
    fuseki:dataset <#text_dataset> ;
    .

<#text_dataset> a text:TextDataset ;
    text:dataset <#dataset> ;
    text:index <#indexLucene> ;
    .

<#dataset> a tdb:DatasetTDB ;
    tdb:location "DB" ;
    ##tdb:unionDefaultGraph true ;
    .

<#indexLucene> a text:TextIndexLucene ;
    text:directory <file:Lucene> ;
    ##text:directory "mem" ;
    text:entityMap <#entMap> ;
    .

<#entMap> a text:EntityMap ;
    text:entityField "uri" ;
    text:defaultField "text" ;
    text:map (
        [ text:field "text" ; text:predicate rdfs:label ]
    ) .
Do you have any advice on this now? Thank you very much in
advance for your efforts.
Regards, Yang
PS: I discovered that there is a SNAPSHOT for 2.3.0. I planned to
test it as well, but I wasn't able to run it.
On 04/17/2015 05:29 PM, Yang Yuanzhe wrote:
Hi Andy,
Thank you very much for your reply.
In fact the problem is unrelated to the preloaded triples. It
fails whether we start with an empty dataset or a preloaded one.
Moreover, it takes around 1 minute to load 38k triples, while
TDB needs only 6 seconds. If we turn off text search for the
in-memory dataset, loading takes only 1 second. That's why I
thought the problem is on the Fuseki side.
As for TDB with reasoning, I don't agree with your opinion that
the dataset is not attached to a text index. We have defined
the dataset:
<#tdb_inf_ds> a ja:RDFDataset ;
    ja:defaultGraph <#tdb_inf> ;
    .
We tell Lucene to index it:
:text_dataset a text:TextDataset ;
    text:dataset <#tdb_inf_ds> ;
    text:index <#textIndexLucene> ;
    .
And we assert that the dataset includes an RDFS inference
model:
<#tdb_inf> a ja:InfModel ;
    rdfs:label "RDFS Inference Model" ;
    ja:baseModel <#tdb_graph> ;
    ja:reasoner [
        ja:reasonerURL <http://jena.hpl.hp.com/2003/RDFSExptRuleReasoner>
    ] .
Then both text search and RDFS reasoning should work. This
configuration works properly in Fuseki 1.1.1. However, things
changed in 1.1.2 and 2.0.x, and I don't know what I should
change to adapt to the new system.
Thank you very much for your efforts again and have a nice
day.
Regards, Yang
On 04/17/2015 02:53 PM, Andy Seaborne wrote:
On 14/04/15 18:51, Yang Yuanzhe wrote:
Hi there,
Sorry to trouble you again. Last month I wrote to you to
figure out the bug in text search for TDB. Given the
following configuration, text search works with TDB:
...
Comments inline:
Now we want to use text search for in-memory datasets, but
we failed after several attempts. The configuration file we
use is as follows:
@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix spatial: <http://jena.apache.org/spatial#> .

[] a fuseki:Server ;
   fuseki:services ( <#memory> ) .

<#memory> a fuseki:Service ;
    fuseki:name "memory" ;
    fuseki:serviceQuery "sparql" ;
    fuseki:serviceQuery "query" ;
    fuseki:serviceUpdate "update" ;         # SPARQL query service -- /memory/update
    fuseki:serviceUpload "upload" ;         # Non-SPARQL upload service
    fuseki:serviceReadWriteGraphStore "data" ;
    fuseki:serviceReadGraphStore "get" ;    # Graph store protocol (read only) -- /memory/get
    fuseki:dataset :text_dataset ;
    .

<#dataset> rdf:type ja:RDFDataset ;
    ja:defaultGraph [
        a ja:MemoryModel ;
        ja:content [ ja:externalContent <file:dcat-vl.ttl> ] ;
    ] .
That is going to load the data each time the server starts
but does not attach it in any way to the text index.
Is it the same data as is loaded (separately) into the text
index?
Similarly for the inference setup (which is in a different
Lucene index file:Text) ...
Andy
# Text
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
text:TextDataset     rdfs:subClassOf ja:RDFDataset .
text:TextIndexLucene rdfs:subClassOf text:TextIndex .

:text_dataset a text:TextDataset ;
    text:dataset <#dataset> ;
    text:index <#textIndexLucene> ;
    .

# Text index description
<#textIndexLucene> a text:TextIndexLucene ;
    text:directory <file:Lucene> ;
    ##text:directory "mem" ;
    text:entityMap <#entMap> ;
    .

<#entMap> a text:EntityMap ;
    text:entityField "uri" ;
    text:defaultField "text" ;
    text:map (
        [ text:field "text" ; text:predicate rdfs:label ]
    ) .
...
All the tests are based on the 2.0.1 SNAPSHOT built on
April 8th. Any clue or any suggestion for this issue? Thank
you very much and have a nice day.
Regards, Yang