Hi Venkat,
In addition to what Rob said:

LARQ has not yet been released at Apache.

We discussed the idea of having LARQ included in Fuseki:
https://issues.apache.org/jira/browse/JENA-63

When LARQ is released, as I said, "users who want to package/include LARQ with
Fuseki will need to checkout Fuseki, add the LARQ dependency to the Fuseki
pom.xml and (re)package Fuseki themselves (i.e. mvn package)".

1)

Yes, LARQ can be configured via the Fuseki configuration. Once you have LARQ on
your classpath, use:

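@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:  <http://jena.hpl.hp.com/2005/11/Assembler#> .

<#dataset> rdf:type tdb:DatasetTDB ;
  tdb:location "/path/to/your/tdb/indexes/" ;
  ja:textIndex "/path/to/lucene/index/" ;
  .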

2)

It's true that TDB supports transactions. However, Lucene and other free text
indexes such as Solr or ElasticSearch do not have transactions. So we have two
systems with indexes here, one transactional and one not. My suggestion is to
treat TDB, which supports transactions, as the source of truth and do the best
we can to keep the indexes in sync. Even so, the indexes might go out of sync,
so users should be aware of that and consider rebuilding the free text indexes
regularly (nightly, for example), if possible.
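
To make the "source of truth plus rebuild" idea concrete, here is a minimal
sketch in plain Java. It does not use the Jena, TDB, LARQ or Lucene APIs: the
class SourceOfTruthSync, its two maps and the indexAvailable flag are all
hypothetical stand-ins, chosen only to illustrate the pattern. The store plays
the role of TDB, the token map plays the role of the free text index.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustration only: hypothetical types, not Jena/TDB/LARQ/Lucene APIs. */
class SourceOfTruthSync {
    // Stand-in for the transactional store: subject -> literal text.
    private final Map<String, String> store = new HashMap<>();
    // Stand-in for the non-transactional text index: token -> subjects.
    private final Map<String, Set<String>> textIndex = new HashMap<>();

    /**
     * Writes to the store first, then updates the index on a best-effort
     * basis. Returns false if the index update was skipped, meaning the
     * index is stale until the next rebuild. Tokens left over from an
     * overwritten value are also only cleaned up by rebuildIndex().
     */
    boolean commit(String subject, String text, boolean indexAvailable) {
        store.put(subject, text); // the authoritative write
        if (!indexAvailable) {
            return false;
        }
        addToIndex(textIndex, subject, text);
        return true;
    }

    /** Rebuilds the whole index from the store (the "nightly" job). */
    void rebuildIndex() {
        Map<String, Set<String>> fresh = new HashMap<>();
        for (Map.Entry<String, String> e : store.entrySet()) {
            addToIndex(fresh, e.getKey(), e.getValue());
        }
        textIndex.clear();
        textIndex.putAll(fresh);
    }

    /** Returns the subjects whose text contains the given token. */
    Set<String> search(String token) {
        return textIndex.getOrDefault(token.toLowerCase(), Collections.emptySet());
    }

    // Lower-cases the text, splits on non-word characters, indexes each token.
    private static void addToIndex(Map<String, Set<String>> idx,
                                   String subject, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                idx.computeIfAbsent(token, k -> new TreeSet<>()).add(subject);
            }
        }
    }
}
```

If commit(...) returns false, a search misses that subject until
rebuildIndex() runs. That is exactly the drift users need to be aware of.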

Do you have an idea or suggestion on how to keep two indexes in sync when one
does not support transactions? Something like two-phase commit (2PC) might be a
possibility, but not without modifying Lucene (or Solr or ElasticSearch), which
is something I am not keen on.

As Rob said, we also still have an open issue for SPARQL Update requests:
https://issues.apache.org/jira/browse/JENA-164
Apologies, I have had no time to look at this recently, and I am still trying
to work out the best way to catch all possible routes of updates: APIs, SPARQL
Update and the Graph Store HTTP protocol, bulk loading, ... others?

Are your updates coming directly into Fuseki, or do you have a central place
elsewhere that receives your updates?

Thanks,
Paolo

Venkat Krishnamurthy wrote:
> Based on trawling the list, I understand that a batch indexing process is
> possible with LARQ. My requirement is to have a 'live' text index that
> works with a running Fuseki instance: when updates come in,  indexed as
> specified in the fuseki configuration.
> 
> Some questions:
> 
> 1) How do I set up/configure LARQ with fuseki to enable live text indexing?
> Can it be done purely via fuseki configuration alone?
> 
> 2) Now that TDB supports transactions, can/should text indexing be done
> when the actual update happens on the underlying dataset within a
> transaction so that the index stays in sync with the dataset? Any other
> suggestions?
> 
> VK
> 
