[
https://issues.apache.org/jira/browse/JENA-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340105#comment-14340105
]
ASF GitHub Bot commented on JENA-686:
-------------------------------------
GitHub user ehedgehog opened a pull request:
https://github.com/apache/jena/pull/39
Updated text indexing
This change addresses JENA-686, support for cross field conjunctive queries
in jena-text, by allowing TextDocProducers to be specified in a
TextDatasetAssembler assembly and existing index entities to be updated as
well as added.
DatasetTextGraph - in commit, the monitor's finish() call has moved to
the top of the method. A batching TextDocProducer (as we have in our
external app) may have quads buffered up awaiting end-of-batch, and
finish() must flush them before the commit runs; otherwise they are
auto-flushed after the commit has run and an exception is thrown.
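The ordering problem described for DatasetTextGraph can be sketched in isolation. The following is a minimal model, not the jena-text code: SketchIndex, BatchingProducer, change(), commitFinishFirst() and commitFinishLast() are hypothetical names, and the rule that the index rejects writes after commit is an assumption standing in for the exception mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

class CommitOrderSketch {
    /** Hypothetical index (an assumption): rejects writes once committed. */
    static class SketchIndex {
        private final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();
        private boolean inTransaction = true;

        void add(String entry) {
            if (!inTransaction) {
                throw new IllegalStateException("write after commit: " + entry);
            }
            pending.add(entry);
        }

        void commit() {
            committed.addAll(pending);
            pending.clear();
            inTransaction = false;
        }
    }

    /** Hypothetical batching producer: buffers quads until finish(). */
    static class BatchingProducer {
        private final SketchIndex index;
        private final List<String> buffer = new ArrayList<>();

        BatchingProducer(SketchIndex index) {
            this.index = index;
        }

        void change(String quad) {
            buffer.add(quad);
        }

        void finish() {
            for (String quad : buffer) {
                index.add(quad);
            }
            buffer.clear();
        }
    }

    /** finish() first: buffered quads reach the index before the commit. */
    static List<String> commitFinishFirst(BatchingProducer producer, SketchIndex index) {
        producer.finish();
        index.commit();
        return index.committed;
    }

    /** finish() last: the late flush hits an already committed index. */
    static void commitFinishLast(BatchingProducer producer, SketchIndex index) {
        index.commit();
        producer.finish(); // throws IllegalStateException if anything was buffered
    }

    public static void main(String[] args) {
        SketchIndex index = new SketchIndex();
        BatchingProducer producer = new BatchingProducer(index);
        producer.change("<s> <street> \"green lane\"");
        System.out.println(commitFinishFirst(producer, index));
    }
}
```

In this model, calling commitFinishLast with a non-empty buffer fails, which is the situation the reordering is meant to avoid.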
TextDatasetFactory - needed methods to create datasets with
doc producers and close-index-on-close flags
TextIndex - added new operation updateEntity to allow update
of (possibly) existing entities
TextIndexLucene - added implementation of updateEntity. Added
deleteDocuments(Term) for deletion of documents (as is used
in ppd-text-index text doc producer batch).
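As an illustration of the difference between adding and updating an entity, here is a small in-memory sketch. It is not the TextIndexLucene implementation: the Doc class, the method signatures and countFor() are hypothetical, and the "delete any documents for the entity, then add the new one" behaviour is an assumption about what an update should mean. (Lucene's own IndexWriter.updateDocument(Term, ...) has that delete-then-add behaviour, but whether the pull request uses it is not stated here.)

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UpdateEntitySketch {
    /** Hypothetical document: an entity URI plus field name -> text. */
    static class Doc {
        final String uri;
        final Map<String, String> fields;

        Doc(String uri, Map<String, String> fields) {
            this.uri = uri;
            this.fields = new LinkedHashMap<>(fields);
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    /** Always appends a document, even if the entity is already indexed. */
    void addEntity(String uri, Map<String, String> fields) {
        docs.add(new Doc(uri, fields));
    }

    /** Removes every document stored for the given entity URI. */
    void deleteDocuments(String uri) {
        docs.removeIf(d -> d.uri.equals(uri));
    }

    /** Replaces any existing documents for the entity with one new document. */
    void updateEntity(String uri, Map<String, String> fields) {
        deleteDocuments(uri);
        addEntity(uri, fields);
    }

    /** Number of documents currently stored for the entity. */
    int countFor(String uri) {
        int n = 0;
        for (Doc d : docs) {
            if (d.uri.equals(uri)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        UpdateEntitySketch index = new UpdateEntitySketch();
        index.addEntity("ex:a1", Map.of("street", "green lane"));
        index.updateEntity("ex:a1", Map.of("street", "green lane", "city", "liverpool"));
        System.out.println(index.countFor("ex:a1"));
    }
}
```

Calling addEntity twice for the same URI leaves two documents in this model, whereas updateEntity leaves exactly one.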
TextDatasetAssembler - updated to allow specification of doc
producer (since done by mainline Jena) and close-index-on-close.
AbstractTestDatasetWithLuceneIndex, DummyDocProducer,
TestTextDatasetAssembler - fixed some issues in the test
framework (use of statics vs use of instance variables,
remembering to close datasets in @After once done, etc).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/epimorphics/jena-config-doc-producer updated-text-indexing
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/jena/pull/39.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #39
----
commit 1a919696739e60d46f92b1e58d0fbefb50c14dee
Author: Chris Dollin <[email protected]>
Date: 2015-02-19T15:44:20Z
Integrated changes to jena-text
from work on JENA-686.
commit 5c4e91981f3564252cd74d638285497d149229b2
Author: Chris Dollin <[email protected]>
Date: 2015-02-20T11:31:27Z
Merged in the changes since the apache-jena fork (over 400 commits) and
fixed up the jena-text changes that got conflicts (mostly due to the
automatic merge not handling all the cases well).
commit c702ba48388304dfaf88e84788aed59d3566d685
Author: Chris Dollin <[email protected]>
Date: 2015-02-26T09:40:33Z
Fixed conflicts following merge with latest jena master.
commit d7040a9955a62eeef2e16d38eefb101420a07c06
Author: Chris Dollin <[email protected]>
Date: 2015-02-26T12:02:52Z
Updated dataset assembler so that it can handle document
producers that need a dataset as well as a text index, viz. the dependent
text indexer.
----
> Add support for cross field conjunctive queries in jena-text
> ------------------------------------------------------------
>
> Key: JENA-686
> URL: https://issues.apache.org/jira/browse/JENA-686
> Project: Apache Jena
> Issue Type: Improvement
> Components: Text
> Affects Versions: Jena 2.11.1, Fuseki 1.0.1
> Reporter: Brian McBride
> Assignee: christopher james dollin
> Attachments: TestDatasetWithBatchProducer.java
>
>
> We have a project where we are doing text search on addresses and wish to do
> jena text queries like "city:liverpool AND street:green". These queries
> return no results, whilst queries like "street:green AND street:lane" work
> fine.
> The reason is that jena text indexes each property in a separate Lucene
> document, so no single Lucene document matches city:liverpool AND
> street:green; there are two documents, one for each property.
> Given the scale of our data, we really want to do the conjunctive query in
> Lucene and not two separate queries and then a filter in SPARQL.
> I will attach a test case from an attempt to solve this for us to illustrate.
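The behaviour reported in the issue can be reproduced with a small in-memory model. This is only a sketch under stated assumptions: documents are plain maps from field name to text, matching is whole-word and case-insensitive, and matches(), countHits(), perProperty() and perEntity() are hypothetical helpers rather than jena-text or Lucene APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CrossFieldSketch {
    /** True if the document's field contains the term as a whole word. */
    static boolean matches(Map<String, String> doc, String field, String term) {
        String text = doc.get(field);
        if (text == null) {
            return false;
        }
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.equals(term)) {
                return true;
            }
        }
        return false;
    }

    /** Conjunctive query: every {field, term} clause must match in the same document. */
    static int countHits(List<Map<String, String>> docs, String[][] clauses) {
        int hits = 0;
        for (Map<String, String> doc : docs) {
            boolean all = true;
            for (String[] clause : clauses) {
                all &= matches(doc, clause[0], clause[1]);
            }
            if (all) {
                hits++;
            }
        }
        return hits;
    }

    /** One document per property, as the issue describes the current indexing. */
    static List<Map<String, String>> perProperty(Map<String, String> entity) {
        List<Map<String, String>> docs = new ArrayList<>();
        for (Map.Entry<String, String> e : entity.entrySet()) {
            Map<String, String> doc = new HashMap<>();
            doc.put(e.getKey(), e.getValue());
            docs.add(doc);
        }
        return docs;
    }

    /** One document per entity, holding all of its properties as fields. */
    static List<Map<String, String>> perEntity(Map<String, String> entity) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(new HashMap<>(entity));
        return docs;
    }

    public static void main(String[] args) {
        Map<String, String> address = new LinkedHashMap<>();
        address.put("city", "Liverpool");
        address.put("street", "Green Lane");

        String[][] crossField = {{"city", "liverpool"}, {"street", "green"}};
        String[][] sameField = {{"street", "green"}, {"street", "lane"}};

        System.out.println(countHits(perProperty(address), crossField));
        System.out.println(countHits(perProperty(address), sameField));
        System.out.println(countHits(perEntity(address), crossField));
    }
}
```

With one document per property, the cross-field conjunction finds nothing while the same-field conjunction still matches; with one document per entity, the cross-field conjunction matches as well.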