Thanks Rob, Paolo.

Actually, the main use case I have is to support text indexing as part of
SPARQL updates, hence the Fuseki question. It looks like configuring Fuseki
to read the index is supported, based on Paolo's response to #1 below.

Paolo, I believe IndexWriters in Lucene are transactional (see
https://issues.apache.org/jira/browse/LUCENE-3131), though they don't have
general-purpose XA support. I'll explore further.
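
One way the ordering could work is sketched below, in Python for brevity
rather than Java. Every class and function name here is a hypothetical
stand-in, not a Jena, TDB or Lucene API. The only assumption is that the
index writer buffers changes until commit() and can discard them with
rollback(). The store is treated as the source of truth, and the index is
flagged for a rebuild if its own commit fails.

```python
class TripleStore:
    """Toy stand-in for a transactional store (the source of truth)."""

    def __init__(self):
        self.triples = set()
        self.fail_next_commit = False  # hook to simulate a failed commit

    def commit(self, new_triples):
        if self.fail_next_commit:
            self.fail_next_commit = False
            raise RuntimeError("store commit failed")
        self.triples |= set(new_triples)


class TextIndex:
    """Toy stand-in for an index writer with buffered commit/rollback."""

    def __init__(self):
        self.visible = set()  # what searches can see
        self.pending = set()  # staged changes, not yet visible
        self.stale = False    # True when a rebuild from the store is needed

    def add(self, literal):
        self.pending.add(literal)

    def commit(self):
        self.visible |= self.pending
        self.pending.clear()

    def rollback(self):
        self.pending.clear()

    def rebuild(self, store):
        # Re-derive the whole index from the source of truth.
        self.visible = {obj for (_, _, obj) in store.triples}
        self.pending.clear()
        self.stale = False


def apply_update(store, index, new_triples):
    # 1. Stage the text of each triple's object in the index (not yet visible).
    for (_, _, obj) in new_triples:
        index.add(obj)
    # 2. Commit to the store first; if that fails, discard the staged entries.
    try:
        store.commit(new_triples)
    except Exception:
        index.rollback()
        raise
    # 3. Make the index changes visible; on failure, flag the index for rebuild.
    try:
        index.commit()
    except Exception:
        index.stale = True
```

This is not two-phase commit: a crash between steps 2 and 3 still leaves the
index behind the store, which is why a periodic rebuild stays useful.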

On Thu, Apr 12, 2012 at 8:47 AM, Paolo Castagna <
[email protected]> wrote:

> Hi Venkat,
> in addition to what Rob said.
>
> LARQ has not been released yet in Apache.
>
> We discussed the idea of having LARQ included in Fuseki:
> https://issues.apache.org/jira/browse/JENA-63
>
> When LARQ is released, as I said, "users who want to package/include LARQ with
> Fuseki will need to checkout Fuseki, add the LARQ dependency to the Fuseki
> pom.xml and (re)package Fuseki themselves (i.e. mvn package)".
>
> 1)
>
> Yes, LARQ can be configured via the Fuseki configuration (once you have LARQ
> on your classpath). Use:
>
> <#dataset> rdf:type tdb:DatasetTDB ;
>  tdb:location "/path/to/your/tdb/indexes/" ;
>  ja:textIndex "/path/to/lucene/index/" ;
>  .
>
> 2)
>
> It's true that TDB supports transactions. However, Lucene and other free-text
> indexes such as Solr or ElasticSearch do not have transactions. Here, we have
> two systems with indexes, one transactional and one not. My suggestion is to
> consider TDB, which supports transactions, as the source of truth and do the
> best we can to keep the indexes in sync. But the indexes might go out of sync,
> so users should be aware of that and consider rebuilding the free-text indexes
> regularly (e.g. nightly), if possible.
>
> Do you have an idea or suggestion on how to keep two indexes in sync where one
> does not support transactions? Something like the two-phase commit (2PC)
> protocol might be a possibility, but not without modifying Lucene (or Solr or
> ElasticSearch), which is something I am not keen on.
>
> As Rob said, we also still have an open issue for SPARQL Update requests:
> https://issues.apache.org/jira/browse/JENA-164 ... apologies, I have had no
> time to look at this recently, and I am still trying to find out the best way
> to catch all possible routes of updates: APIs, SPARQL Update & Graph Store HTTP
> protocol, bulk loading, ... others?
>
> Are your updates coming directly into Fuseki, or do you have a central place
> elsewhere which receives your updates?
>
> Thanks,
> Paolo
>
> Venkat Krishnamurthy wrote:
> > Based on trawling the list, I understand that a batch indexing process is
> > possible with LARQ. My requirement is to have a 'live' text index that
> > works with a running Fuseki instance: when updates come in, they are
> > indexed as specified in the Fuseki configuration.
> >
> > Some questions:
> >
> > 1) How do I set up/configure LARQ with Fuseki to enable live text indexing?
> > Can it be done via Fuseki configuration alone?
> >
> > 2) Now that TDB supports transactions, can/should text indexing be done
> > when the actual update happens on the underlying dataset within a
> > transaction so that the index stays in sync with the dataset? Any other
> > suggestions?
> >
> > VK
> >
>
>
