Hi Venkat

Venkat Krishnamurthy wrote:
> Thanks Rob, Paolo.
> 
> Actually the big use case I have is to support text indexing as part of
> SPARQL updates, hence the fuseki question. It looks like configuring Fuseki
> to read the index is supported based on Paolo's response to #1 below.

We've just got the third necessary +1 to release LARQ.
Once that is done, you will be able to use LARQ with Fuseki,
but without updates via SPARQL Update until JENA-164 gets fixed.
Do you want to help? ;-)

This is something I need to learn myself, i.e. how to intercept SPARQL Update
requests and be notified as triples/quads get added/removed so that the index
can be updated accordingly.
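
To make the idea concrete, here is a minimal, self-contained sketch of the
"be notified on add/remove and update the index" pattern. Everything in it
(ChangeListener, NotifyingStore, TextIndexListener) is hypothetical and made
up for illustration; none of it is Jena, Fuseki or LARQ API, and the toy
in-memory index stands in for Lucene.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical types for illustration only; not Jena, Fuseki or LARQ classes.

/** Callback invoked whenever a triple is really added to or removed from the store. */
interface ChangeListener {
    void added(String s, String p, String o);
    void removed(String s, String p, String o);
}

/** Toy triple store that notifies listeners about effective changes. */
class NotifyingStore {
    private final Set<List<String>> triples = new HashSet<>();
    private final List<ChangeListener> listeners = new ArrayList<>();

    void register(ChangeListener listener) { listeners.add(listener); }

    void add(String s, String p, String o) {
        // Notify only if the triple was not already present.
        if (triples.add(List.of(s, p, o))) {
            for (ChangeListener l : listeners) l.added(s, p, o);
        }
    }

    void remove(String s, String p, String o) {
        // Notify only if the triple was actually present.
        if (triples.remove(List.of(s, p, o))) {
            for (ChangeListener l : listeners) l.removed(s, p, o);
        }
    }
}

/** Toy text index: token -> (subject -> number of occurrences). */
class TextIndexListener implements ChangeListener {
    private final Map<String, Map<String, Integer>> postings = new HashMap<>();

    // Simplification: every object value is indexed; a real index would
    // only look at literals.
    public void added(String s, String p, String o) {
        for (String token : tokens(o)) {
            postings.computeIfAbsent(token, k -> new HashMap<>()).merge(s, 1, Integer::sum);
        }
    }

    public void removed(String s, String p, String o) {
        for (String token : tokens(o)) {
            Map<String, Integer> subjects = postings.get(token);
            if (subjects == null) continue;
            // Decrement the count; returning null drops the subject entry.
            subjects.computeIfPresent(s, (k, count) -> count > 1 ? count - 1 : null);
            if (subjects.isEmpty()) postings.remove(token);
        }
    }

    Set<String> search(String token) {
        return postings.getOrDefault(token.toLowerCase(), Map.of()).keySet();
    }

    private static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) result.add(t);
        }
        return result;
    }
}
```

The hard part, and what JENA-164 is about, is not the listener itself but
making sure every update route actually goes through something like
NotifyingStore.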

> Paolo, I believe IndexWriters in Lucene are transactional - see
> https://issues.apache.org/jira/browse/LUCENE-3131, though they don't have
> general purpose XA support. I'll explore further.

Please do, and let me know.
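
For reference, the 2PC idea mentioned further down would look roughly like the
sketch below, assuming both sides can offer separate prepare, commit and
rollback steps. Lucene's IndexWriter does have prepareCommit(), commit() and
rollback(); whether TDB can expose a separate prepare step is an assumption I
have not checked. The Participant interface, the TwoPhase helper and the
RecordingParticipant test double are hypothetical names, not Jena or Lucene
API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not Jena or Lucene classes.

/** One side of the commit, e.g. the triple store or the text index. */
interface Participant {
    void prepare() throws Exception; // phase 1: make changes durable but not visible
    void commit();                   // phase 2: assumed not to fail after prepare
    void rollback();                 // discard pending changes
}

final class TwoPhase {
    private TwoPhase() {}

    /**
     * Commits all participants only if every prepare succeeds.
     * If any prepare fails, every participant is rolled back.
     */
    static boolean commitAll(List<? extends Participant> participants) {
        try {
            for (Participant p : participants) {
                p.prepare();
            }
        } catch (Exception e) {
            for (Participant p : participants) {
                p.rollback();
            }
            return false;
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}

/** Test double that records which calls it received. */
class RecordingParticipant implements Participant {
    final List<String> log = new ArrayList<>();
    private final boolean failOnPrepare;

    RecordingParticipant(boolean failOnPrepare) {
        this.failOnPrepare = failOnPrepare;
    }

    public void prepare() throws Exception {
        if (failOnPrepare) throw new Exception("prepare failed");
        log.add("prepare");
    }

    public void commit() { log.add("commit"); }

    public void rollback() { log.add("rollback"); }
}
```

This does not cover a crash between the two commit calls, which is exactly
where a coordinator log or periodic index rebuilds would still be needed.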

Paolo

> 
> On Thu, Apr 12, 2012 at 8:47 AM, Paolo Castagna <
> [email protected]> wrote:
> 
>> Hi Venkat,
>> in addition to what Rob said.
>>
>> LARQ has not been released yet in Apache.
>>
>> We discussed the idea of having LARQ included in Fuseki:
>> https://issues.apache.org/jira/browse/JENA-63
>>
>> When LARQ is released, as I said, "users who want to package/include LARQ
>> with Fuseki will need to checkout Fuseki, add the LARQ dependency to the
>> Fuseki pom.xml and (re)package Fuseki themselves (i.e. mvn package)".
>>
>> 1)
>>
>> Yes, LARQ can be configured via the Fuseki configuration. Once you have LARQ
>> on your classpath, use:
>>
>> <#dataset> rdf:type tdb:DatasetTDB ;
>>  tdb:location "/path/to/your/tdb/indexes/" ;
>>  ja:textIndex "/path/to/lucene/index/" ;
>>  .
>>
>> 2)
>>
>> It's true that TDB supports transactions. However, Lucene and other free-text
>> indexes such as Solr or ElasticSearch do not have transactions. Here, we have
>> two systems with indexes, one transactional and one not. My suggestion is to
>> consider TDB, which supports transactions, as the source of truth and do the
>> best we can to keep the indexes in sync. But the indexes might go out of
>> sync, so users should be aware of that and consider rebuilding the free-text
>> indexes regularly (e.g. nightly), if possible.
>>
>> Do you have an idea or suggestion on how to keep two indexes in sync where
>> one does not support transactions? Things such as the 2PC protocol might be a
>> possibility, but not without modifying Lucene (or Solr or ElasticSearch),
>> which is something I am not keen on.
>>
>> As Rob said, we also still have an open issue for SPARQL Update requests:
>> https://issues.apache.org/jira/browse/JENA-164 ... apologies, I have had no
>> time to look at this recently and I am still trying to find out the best way
>> to catch all possible routes of updates: APIs, SPARQL Update & Graph Store
>> HTTP protocol, bulk loading, ... others?
>>
>> Are your updates coming directly into Fuseki, or do you have a central place
>> elsewhere which receives your updates?
>>
>> Thanks,
>> Paolo
>>
>> Venkat Krishnamurthy wrote:
>>> Based on trawling the list, I understand that a batch indexing process is
>>> possible with LARQ. My requirement is to have a 'live' text index that
>>> works with a running Fuseki instance: when updates come in, they are
>>> indexed as specified in the Fuseki configuration.
>>>
>>> Some questions:
>>>
>>> 1) How do I set up/configure LARQ with Fuseki to enable live text indexing?
>>> Can it be done via Fuseki configuration alone?
>>>
>>> 2) Now that TDB supports transactions, can/should text indexing be done
>>> when the actual update happens on the underlying dataset within a
>>> transaction so that the index stays in sync with the dataset? Any other
>>> suggestions?
>>>
>>> VK
>>>
>>
> 
