Andy Seaborne wrote:
On 12/09/11 11:24, Paolo Castagna wrote:
Hi Jérôme,
you are lucky: I have had exactly the same need as you and I've done
something about it recently.
Unfortunately, the new LARQ (as a separate module) has still not made
it into Fuseki on trunk.
We have an open JIRA for it which you can watch|vote|contribute to:
https://issues.apache.org/jira/browse/JENA-63
Should we change the title of JENA-63? It's not about Fuseki, which
just supplies the SPARQL protocol and routes requests to the right
dataset. It's the dataset that must do the LARQ coordination: the initial
indexing, then incremental updates later, across restarts.
Hi Andy,
I am not sure the title of JENA-63 is going to make much difference.
Users (@ Talis as well) want to easily have SPARQL endpoints and they also
want to easily run free-text searches on those SPARQL endpoints.
Fuseki currently provides a very good user experience in terms of quickly
setting up a SPARQL endpoint; however, it does not include free-text search
capabilities.
The patch in JENA-63 does not contain any code change to Fuseki source code,
it only adds LARQ jar (and transitively Lucene v3.1.0) to its dependencies.
All the other necessary code changes have been done already elsewhere (i.e.
ARQ and LARQ).
What would be a more appropriate title?
The overall goal is to make it as easy as possible for users to perform free-text
searches over their RDF data if they want to. Notice: this feature is not
"standard" and it is not enabled by default.
Once LARQ is properly released, do you see problems in adding it (and Lucene
v3.1.0) to the Fuseki dependencies?
LARQ is ~46KB.
Lucene v3.1.0 is (unfortunately) much bigger: ~1.2MB.
Is the size of Lucene's jar a concern?
Paolo
It is possible to get Fuseki to automatically run initialization code -
the configuration file supports ja:loadClass (a bit misnamed - it loads
and runs a static) but I don't think that is anything other than a
stop-gap.
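To make the "loads and runs a static" behaviour concrete, here is a minimal,
self-contained Java sketch. It assumes that ja:loadClass behaves like
Class.forName, i.e. that the static code it runs is the class's static
initializer. LoadClassDemo, LarqStartup and EVENTS are made-up names, and the
static block only records a marker where the real index-building code would go.

```java
import java.util.ArrayList;
import java.util.List;

public class LoadClassDemo {
    // Records what the initialization code did, so the effect is observable.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical startup class: its static block stands in for the
    // LARQ index-building code that should run once at server start.
    static final class LarqStartup {
        static {
            EVENTS.add("index built");
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Naming the class via .class does not initialize it...
        String name = LarqStartup.class.getName();
        System.out.println(EVENTS.isEmpty());
        // ...but Class.forName(name) loads and initializes it,
        // which runs the static block exactly once per JVM.
        Class.forName(name);
        System.out.println(EVENTS);
    }
}
```

Under that assumption, naming such a class in ja:loadClass would run its
static block once when the configuration is loaded, which is also why it is
hard to control ordering or to report errors from it.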
Andy
In the meantime, if you want to use LARQ with Fuseki this is what you
need to do:
cd /tmp
svn co https://svn.apache.org/repos/asf/incubator/jena/Jena2/Fuseki/trunk/ fuseki
cd /tmp/fuseki
wget https://issues.apache.org/jira/secure/attachment/12482758/JENA-63_Fuseki_r1136050.patch
patch -p0 < JENA-63_Fuseki_r1136050.patch
mvn package
Now, you can simply use the Fuseki config.ttl file as explained here:
http://openjena.org/wiki/Fuseki#Fuseki_Configuration_File
and use the ja:textIndex property on a dataset to specify a non-existent
directory.
When you point LARQ at a non-existent directory, it will perform the
indexing for you.
This is particularly useful when you have multiple datasets configured
in Fuseki.
WARNING: it might take a while to index large datasets, so be patient.
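As a rough illustration, a config.ttl fragment using ja:textIndex might look
like the sketch below. The service and dataset names, the data file and the
index path are all made up for the example, and the form of the ja:textIndex
value (a plain string holding a directory path) is an assumption. A complete
file also needs the fuseki:Server declaration that lists the services.

```turtle
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .

# Hypothetical service exposing one dataset for SPARQL queries.
<#service_books> rdf:type fuseki:Service ;
    fuseki:name         "books" ;
    fuseki:serviceQuery "query" ;
    fuseki:dataset      <#dataset_books> .

<#dataset_books> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#model_books> ;
    # Assumed value form: a directory that does not exist yet,
    # so that LARQ builds the index there on first use.
    ja:textIndex "/tmp/lucene/books" .

<#model_books> rdf:type ja:MemoryModel ;
    ja:content [ ja:externalContent <file:Data/books.ttl> ] .
```

With several datasets, each one would get its own ja:textIndex directory.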
See also: http://markmail.org/thread/tmptip55ru5wxrrj
LARQ snapshots are here:
https://repository.apache.org/content/repositories/snapshots/org/apache/jena/larq/0.2.2-incubating-SNAPSHOT/
and I can quickly fix/improve things if you have problems or good
suggestions.
I hope this helps, let me know how it goes.
Paolo
Jérôme wrote:
Hi,
I'm trying to use LARQ with my Fuseki server.
I would like to programmatically index documents (with Lucene) when the
server starts.
Something like this:
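Model model = ModelFactory.createDefaultModel();
IndexBuilderString larqBuilder = new IndexBuilderString();
// Index statements as they are added to the model.
model.register(larqBuilder);
FileManager.get().readModel(model, "Data/books.ttl");
// Close the Lucene writer and stop listening for changes.
larqBuilder.closeWriter();
model.unregister(larqBuilder);
// Make the index the default one for LARQ queries.
IndexLARQ index = larqBuilder.getIndex();
LARQ.setDefaultIndex(index);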
Is it possible? In which class would it be best to do this?
Thanks
Jerome