Index: trunk/content/documentation/query/text-query.mdtext
===================================================================
--- trunk/content/documentation/query/text-query.mdtext	(revision 1655891)
+++ trunk/content/documentation/query/text-query.mdtext	(working copy)
@@ -9,7 +9,7 @@
 accessing the RDF graph.
 
 The text index can be either [Apache Lucene](http://lucene.apache.org/core) for a
-same-machine text index, or [Apache Solr](http://lucene.apache.org/solr/)
+same-machine text index, or [ElasticSearch](https://www.elastic.co/)
 for a large scale enterprise search application.
 
 Some example code is [available here](https://github.com/apache/jena/tree/master/jena-text/src/main/java/examples/).
@@ -54,7 +54,7 @@
 The text index uses the native query language of the index:
 [Lucene query format](http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description)
 or
-[Solr query format](http://wiki.apache.org/solr/SolrQuerySyntax).
+[Elasticsearch query format](https://www.elastic.co/guide/en/elasticsearch/reference/5.2/query-dsl.html).
 
 A text-supporting dataset is configured with a description of which properties
 work with. When data is added, any properties matching the
@@ -83,7 +83,7 @@
 
 ### External applications
 
-By using Solr, in either pattern A (RDF data indexed) or pattern B
+By using ElasticSearch, in either pattern A (RDF data indexed) or pattern B
 (external content indexed), other applications can share the text index
 with SPARQL search.
 
@@ -156,11 +156,11 @@
 [Jena assembler description](../assembler/index.html). Configurations can also
 be built with code. The assembler describes a 'text dataset' which has an
 underlying RDF dataset and a text index. The text
-index describes the text index technology (Lucene or Solr) and the details
+index describes the text index technology (Lucene or ElasticSearch) and the details
 needed for for each.
 
 A text index has an "entity map" which defines the properties to
-index, the name of the lucene/solr field and field used for storing the URI
+index, the name of the lucene/elasticsearch field and field used for storing the URI
 itself.
 
 For common RDF use, there will be one field, mapping a property to a text
@@ -193,8 +193,8 @@
     text:TextDataset rdfs:subClassOf ja:RDFDataset .
     # Lucene index
     text:TextIndexLucene rdfs:subClassOf text:TextIndex .
-    # Solr index
-    text:TextIndexSolr rdfs:subClassOf text:TextIndex .
+    # ElasticSearch index
+    text:TextIndexES rdfs:subClassOf text:TextIndex .
 
     ## ---------------------------------------------------------------
     ## This URI must be fixed - it's used to assemble the text dataset.
@@ -241,9 +241,8 @@
 ### Configuring an Analyzer
 
 Text to be indexed is passed through a text analyzer that divides it into tokens
-and may perform other transformations such as eliminating stop words. If a Lucene
-text index is used then, by default a `StandardAnalyzer` is used. If a Solr text
-index is used, the analyzer used is determined by the Solr configuration.
+and may perform other transformations such as eliminating stop words. If a Lucene or ElasticSearch
+text index is used then, by default a `StandardAnalyzer` is used.
 
 It is possible to configure an alternative analyzer for each field indexed in a
 Lucene index. For example:
@@ -270,6 +269,8 @@
 In addition, Jena provides `LowerCaseKeywordAnalyzer`,
 which is a case-insensitive version of `KeywordAnalyzer`.
 
+ElasticSearch currently doesn't support Analyzers beyond Standard Analyzer.
+
 ### Configuration by Code
 
 A text dataset can also be constructed in code as might be done for a
@@ -417,4 +418,14 @@
     </dependency>
 
 adjusting the version <code>X.Y.Z</code> as necessary. This will automatically
-include a compatible version of Lucene and the Solr java client, but not Solr server.
\ No newline at end of file
+include a compatible version of Lucene.
+
+For ElasticSearch implementation, you can include the following Maven Dependency:
+
+    <dependency>
+        <groupId>org.apache.jena</groupId>
+        <artifactId>jena-text-es</artifactId>
+        <version>X.Y.Z</version>
+    </dependency>
+
+adjusting the version <code>X.Y.Z</code> as necessary.
\ No newline at end of file
