FILTER EXISTS evaluation of different triple stores

2018-01-28 Thread Lorenz B.
Hi all,

in case anybody is interested in reading about how differently triple
stores behave when evaluating FILTER EXISTS, see [1].

Not sure how often this feature is used nowadays, but it is interesting
to know anyway (though I'd never write the SPARQL query the way it is
given in the paper's example; it seems odd).


Cheers,

Lorenz

[1] https://arxiv.org/pdf/1801.04387.pdf



Re: CMS diff: Jena Full Text Search

2018-01-28 Thread Andy Seaborne

Chris,

There are 3 diffs a few minutes apart.

Is this 3rd CMS diff the right one to apply, and does it supersede
the previous ones?


Andy

On 22/01/18 03:15, Chris Tomlinson wrote:

Clone URL (Committers only):
https://cms.apache.org/redirect?new=anonymous;action=diff;uri=http://jena.apache.org/documentation%2Fquery%2Ftext-query.mdtext

Chris Tomlinson

Index: trunk/content/documentation/query/text-query.mdtext
===
--- trunk/content/documentation/query/text-query.mdtext (revision 1821823)
+++ trunk/content/documentation/query/text-query.mdtext (working copy)
@@ -1,5 +1,7 @@
  Title: Jena Full Text Search
  
+Title: Jena Full Text Search

+
  This extension to ARQ combines SPARQL and full text search via
  [Lucene](https://lucene.apache.org) 6.4.1 or
  [ElasticSearch](https://www.elastic.co) 5.2.1 (which is built on
@@ -231,7 +233,7 @@
  
  The most general form is:
 
- (?s ?score ?literal ?g) text:query (property 'query string' limit 'lang:xx')
+ ( ?s ?score ?literal ?g ) text:query ( property 'query string' limit 'lang:xx' 'highlight:yy' )
  
   Input arguments:
  
@@ -241,13 +243,13 @@

  | query string  | Lucene query string fragment   |
  | limit | (optional) `int` limit on the number of results   |
  | lang:xx   | (optional) language tag spec   |
-| highlight:xx  | (optional) highlighting options|
+| highlight:yy  | (optional) highlighting options|
  
  The `property` URI is only necessary if multiple properties have been

  indexed and the property being searched over is not the [default field
  of the index](#entity-map-definition).
  
-The `query string` syntax conforms the underlying index [Lucene](http://lucene.apache.org/core/6_4_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description)
+The `query string` syntax conforms to the underlying index [Lucene](http://lucene.apache.org/core/6_4_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description)
  or
  [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/5.2/query-dsl.html).
  In the case of Lucene the syntax is restricted to `Terms`, `Term modifiers`, `Boolean Operators` applied to `Terms`, and `Grouping` of terms. _No use of `Fields` within the `query string` is supported._
  
@@ -258,9 +260,9 @@

  indexed with the tag _xx_. Searches may be restricted to field values with no
  language tag via `"lang:none"`.
  
-The `highlight:xx` specification is an optional string where _xx_ are options that control the highlighting of search result literals. See [below](#highlighting) for details.
+The `highlight:yy` specification is an optional string where _yy_ are options that control the highlighting of search result literals. See [below](#highlighting) for details.
  
-If both `limit` and one or more of `lang:xx` or `highlight:xx` are present, then `limit` must precede these arguments.
+If both `limit` and one or more of `lang:xx` or `highlight:yy` are present, then `limit` must precede these arguments.
  
  If only the query string is required, the surrounding `( )` _may be_ omitted.
  
@@ -499,7 +501,7 @@
  
   Highlighting
  
-The highlighting option uses the Lucene `Highlighter` and `SimpleHTMLFormatter` to insert highlighting markup into the literals returned from search results (hence the text dataset must be configured to store the literals). The highlighted results are returned via the _literal_ output argument.
+The highlighting option uses the Lucene `Highlighter` and `SimpleHTMLFormatter` to insert highlighting markup into the literals returned from search results (hence the text dataset must be configured to store the literals). The highlighted results are returned via the _literal_ output argument. This highlighting feature, introduced in version 3.7.0, does not require re-indexing by Lucene.
  
  The simplest way to request highlighting is via `'highlight:'`. This will apply all the defaults:
  
@@ -521,7 +523,7 @@
  
  "the quick ↦brown fox↤ jumped over the lazy baboon"
  
-The `RIGHT_ARROW` is Unicode \u21a6 and the `LEFT_ARROW` is Unicode \u21a4. These are chosen to be single characters that in most situations will be very unlikely to occur in resulting literals. The `fragSize` of 128 is chosen to be large enough that in many situations the matches will result in single fragments. If the literal is larger than 128 characters and there are several matches in the literal then there may be additional fragments separated by the `DIVIDES`, Unicode \u2223.

+The `RIGHT_ARROW` is Unicode, \u21a6, and the `LEFT_ARROW` is Unicode, \u21a4. 
These are chosen to be single characters that in most situations will be very 
unlikely to occur in resulting literals. The `fragSize` of 128 is chosen to be 
large enough that in many situations the matches will result in single 
fragments. If the l
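
A note for readers of this thread: the default highlighting markers described in the diff above can be post-processed on the client side. The following is only a minimal sketch in Python, assuming the defaults quoted in the diff (`RIGHT_ARROW` \u21a6 to open a match, `LEFT_ARROW` \u21a4 to close it, and `DIVIDES` \u2223 between fragments). The helper name `parse_highlights` is hypothetical and is not part of Jena.

```python
import re

# Default markers as described in the documentation diff above.
START = "\u21a6"     # RIGHT_ARROW: opens a highlighted match
END = "\u21a4"       # LEFT_ARROW: closes a highlighted match
FRAG_SEP = "\u2223"  # DIVIDES: separates fragments within one literal


def parse_highlights(literal):
    """Split a highlighted literal into (plain_text, matches) pairs.

    Hypothetical helper: one pair is returned per fragment, where
    plain_text is the fragment with the markers removed and matches
    is the list of highlighted substrings found in that fragment.
    """
    pattern = re.compile(re.escape(START) + "(.*?)" + re.escape(END))
    results = []
    for fragment in literal.split(FRAG_SEP):
        matches = pattern.findall(fragment)
        plain = fragment.replace(START, "").replace(END, "")
        results.append((plain, matches))
    return results


# The example literal from the documentation diff.
example = "the quick \u21a6brown fox\u21a4 jumped over the lazy baboon"
print(parse_highlights(example))
```

If a dataset's literals could themselves contain these characters, other markers would need to be chosen through the highlighting options; this sketch does not handle that case.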

Re: CMS diff: Jena Full Text Search

2018-01-28 Thread Chris Tomlinson
Andy,

Yes, I was just wordsmithing and tweaking punctuation.

Thanks,
Chris


> On Jan 28, 2018, at 10:30 AM, Andy Seaborne wrote:
> 
> Chris,
> 
> There are 3 diffs a few minutes apart.
> 
> Is this 3rd CMS diff the right one to apply, and does it supersede the
> previous ones?
> 
> Andy
> 
> On 22/01/18 03:15, Chris Tomlinson wrote:
>> Clone URL (Committers only):
>> https://cms.apache.org/redirect?new=anonymous;action=diff;uri=http://jena.apache.org/documentation%2Fquery%2Ftext-query.mdtext
>> Chris Tomlinson
>> Index: trunk/content/documentation/query/text-query.mdtext
>> ===
>> --- trunk/content/documentation/query/text-query.mdtext  (revision 
>> 1821823)
>> +++ trunk/content/documentation/query/text-query.mdtext  (working copy)
>> @@ -1,5 +1,7 @@
>>  Title: Jena Full Text Search
>>  +Title: Jena Full Text Search
>> +
>>  This extension to ARQ combines SPARQL and full text search via
>>  [Lucene](https://lucene.apache.org) 6.4.1 or
>>  [ElasticSearch](https://www.elastic.co) 5.2.1 (which is built on
>> @@ -231,7 +233,7 @@
>>The most general form is:
>> - (?s ?score ?literal ?g) text:query (property 'query string' limit 
>> 'lang:xx')
>> + ( ?s ?score ?literal ?g ) text:query ( property 'query string' limit 
>> 'lang:xx' 'highlight:yy' )
>> Input arguments:
>>  @@ -241,13 +243,13 @@
>>  | query string  | Lucene query string fragment   |
>>  | limit | (optional) `int` limit on the number of results   
>> |
>>  | lang:xx   | (optional) language tag spec   |
>> -| highlight:xx  | (optional) highlighting options|
>> +| highlight:yy  | (optional) highlighting options|
>>The `property` URI is only necessary if multiple properties have been
>>  indexed and the property being searched over is not the [default field
>>  of the index](#entity-map-definition).
>>  -The `query string` syntax conforms the underlying index 
>> [Lucene](http://lucene.apache.org/core/6_4_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description)
>> +The `query string` syntax conforms to the underlying index 
>> [Lucene](http://lucene.apache.org/core/6_4_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description)
>>  or
>>  
>> [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/5.2/query-dsl.html).
>>  In the case of Lucene the syntax is restricted to `Terms`, `Term 
>> modifiers`, `Boolean Operators` applied to `Terms`, and `Grouping` of terms. 
>> _No use of `Fields` within the `query string` is supported._
>>  @@ -258,9 +260,9 @@
>>  indexed with the tag _xx_. Searches may be restricted to field values with 
>> no
>>  language tag via `"lang:none"`.
>>  -The `highlight:xx` specification is an optional string where _xx_ are 
>> options that control the highlighting of search result literals. See 
>> [below](#highlighting) for details.
>> +The `highlight:yy` specification is an optional string where _yy_ are 
>> options that control the highlighting of search result literals. See 
>> [below](#highlighting) for details.
>>  -If both `limit` and one or more of `lang:xx` or `highlight:xx` are 
>> present, then `limit` must precede these arguments.
>> +If both `limit` and one or more of `lang:xx` or `highlight:yy` are present, 
>> then `limit` must precede these arguments.
>>If only the query string is required, the surrounding `( )` _may be_ 
>> omitted.
>>  @@ -499,7 +501,7 @@
>> Highlighting
>>  -The highlighting option uses the Lucene `Highlighter` and 
>> `SimpleHTMLFormatter` to insert highlighting markup into the literals 
>> returned from search results (hence the text dataset must be configured to 
>> store the literals). The highlighted results are returned via the _literal_ 
>> output argument.
>> +The highlighting option uses the Lucene `Highlighter` and 
>> `SimpleHTMLFormatter` to insert highlighting markup into the literals 
>> returned from search results (hence the text dataset must be configured to 
>> store the literals). The highlighted results are returned via the _literal_ 
>> output argument. This highlighting feature, introduced in version 3.7.0, 
>> does not require re-indexing by Lucene.
>>The simplest way to request highlighting is via `'highlight:'`. This will 
>> apply all the defaults:
>>  @@ -521,7 +523,7 @@
>>"the quick ↦brown fox↤ jumped over the lazy baboon"
>>  -The `RIGHT_ARROW` is Unicode \u21a6 and the `LEFT_ARROW` is Unicode 
>> \u21a4. These are chosen to be single characters that in most situations 
>> will be very unlikely to occur in resulting literals. The `fragSize` of 128 
>> is chosen to be large enough that in many situations the matches will result 
>> in single fragments. If the literal is larger than 128 characters and there 
>> are several matches in the literal then there may be additional fragments 
>> separate