Re: Jena 3.6.0?

2017-11-27 Thread Rob Vesse
This week is pretty nuts and next week does not look much better, so I'm 
unlikely to have any bandwidth in the immediate future.

Rob

On 25/11/2017, 23:02, "Andy Seaborne"  wrote:

The bug in Fuseki that causes UI uploads to fail, and some other UI 
issues, is a bit annoying.

Is there the energy and time to vote on a 3.6.0 release if I build one?
Please respond if you'll be able to vote in the next few weeks.

If there is - from our experience last time, we can test the latest 
development builds now, before a formal VOTE, which will shorten the time 
in case there are any problems to address.

 Andy

The build is complaining about a Shiro issue - this is harmless and is a 
problem somewhere in the Fuseki tests. Some state is getting initialized 
twice.  It does not happen when Fuseki is run, nor does it cause any 
tests to fail.  It is a side effect of the Shiro upgrade from 1.2.4 to 1.4.0; 
the behaviour first appears between 1.2.6 and 1.3.0. Solution: ship with 1.2.6.

"""
[...] IniRealm   WARN  Users or Roles are already populated.  Configured 
Ini instance will be ignored.
"""

 Andy








Re: Jena 3.6.0?

2017-11-27 Thread Osma Suominen

Andy Seaborne wrote on 26.11.2017 at 01:02:

Is there the energy and time to vote on a 3.6.0 release if I build one?
Please respond if you'll be able to vote in the next few weeks.


I should be able to test and vote after Dec 7th or so.

-Osma

--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suomi...@helsinki.fi
http://www.nationallibrary.fi


[jira] [Closed] (JENA-1415) ConversionException for individuals

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne closed JENA-1415.
---

> ConversionException for individuals
> ---
>
> Key: JENA-1415
> URL: https://issues.apache.org/jira/browse/JENA-1415
> Project: Apache Jena
>  Issue Type: Bug
>  Components: ARQ, Jena
>Affects Versions: Jena 3.4.0
> Environment: Linux, Maven, Openllet Reasoner
>Reporter: Christian Bay
>Priority: Minor
>
> Hey there,
> I'm getting a _ConversionException_ when converting a resource to an 
> individual by using _as_.
> [EDIT] I edited this report because I could reproduce the error in a small 
> example.
> Jena throws a _ConversionException_ when trying to cast a Resource to an 
> Individual with _as_. The resource is obtained from a query, as suggested here: 
> [https://github.com/Galigator/openllet/blob/integration/examples/src/main/java/openllet/examples/SPARQLDLExample.java]
> This is the test code:
> {code:java}
> package bug;
>
> import java.util.ArrayList;
> import java.util.List;
>
> import openllet.jena.PelletReasonerFactory;
> import openllet.query.sparqldl.jena.SparqlDLExecutionFactory;
> import org.apache.jena.ontology.Individual;
> import org.apache.jena.ontology.OntClass;
> import org.apache.jena.ontology.OntModel;
> import org.apache.jena.query.Query;
> import org.apache.jena.query.QueryExecution;
> import org.apache.jena.query.QueryFactory;
> import org.apache.jena.query.QuerySolution;
> import org.apache.jena.query.ResultSet;
> import org.apache.jena.rdf.model.ModelFactory;
> import org.apache.jena.util.iterator.ExtendedIterator;
>
> public class SPARQLDLBug
> {
>   // The ontology loaded as dataset
>   private static final String ontology = "ontologies/simple.owl";
>   private static final String query = "query.sparql";
>
>   public void run()
>   {
>     // First create a Jena ontology model backed by the Pellet reasoner
>     // (note, the Pellet reasoner is required)
>     final OntModel m = ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC);
>     // Then read the data from the file into the ontology model
>     m.read(ontology);
>     // Now read the query file into a query object
>     final Query q = QueryFactory.read(query);
>     // Create a SPARQL-DL query execution for the given query and
>     // ontology model
>     final QueryExecution qe = SparqlDLExecutionFactory.create(q, m);
>     // We want to execute a SELECT query, do it, and return the result set
>     final ResultSet rs = qe.execSelect();
>     final List<String> result = new ArrayList<>();
>     while (rs.hasNext()) {
>       QuerySolution qs = rs.next();
>       // The bug occurs in the next line
>       Individual in = qs.getResource("x").as(Individual.class);
>       ExtendedIterator<OntClass> it = in.listOntClasses(true);
>       String className = "";
>       while (it.hasNext()) {
>         className = it.next().toString();
>       }
>       result.add(className);
>     }
>     qe.close();
>   }
>
>   public static void main(final String[] args)
>   {
>     final SPARQLDLBug app = new SPARQLDLBug();
>     app.run();
>   }
> }
> {code}
> The error message is:
> {code}
> Exception in thread "main" org.apache.jena.ontology.ConversionException: 
> Cannot convert node 
> http://www8.cs.fau.de/research:cgm/schizophrenia#R_AcuteSchizophrenia to 
> Individual
> at 
> org.apache.jena.ontology.impl.IndividualImpl$1.wrap(IndividualImpl.java:61)
> at org.apache.jena.enhanced.EnhNode.convertTo(EnhNode.java:152)
> at org.apache.jena.enhanced.EnhNode.convertTo(EnhNode.java:31)
> at 
> org.apache.jena.enhanced.Polymorphic.asInternal(Polymorphic.java:62)
> at org.apache.jena.enhanced.EnhNode.as(EnhNode.java:107)
> at bug.SPARQLDLBug.run(SPARQLDLBug.java:58)
> at bug.SPARQLDLBug.main(SPARQLDLBug.java:75)
> {code}
> This Query is:
> {code:xml}
>  PREFIX rdf: 
>  PREFIX owl: 
>  PREFIX xsd: 
>  PREFIX rdfs: 
>  PREFIX bio: 
>  SELECT ?x
>  WHERE { ?x rdf:type bio:AcuteSchizo

[jira] [Closed] (JENA-1429) Error with # comments in SPARQL

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne closed JENA-1429.
---

> Error with # comments in SPARQL
> ---
>
> Key: JENA-1429
> URL: https://issues.apache.org/jira/browse/JENA-1429
> Project: Apache Jena
>  Issue Type: Bug
>  Components: Fuseki
> Environment:  Fuseki 3.4.0 
>Reporter: Karima Rafes
>Priority: Trivial
>
> Comments in SPARQL queries take the form of '#', outside an IRI or string, 
> and continue to the end of the line [1], but Fuseki returns a parse error (Fuseki 
> 3.4.0 (Build date: 2017-07-17T11:43:07+)).
> [1] https://www.w3.org/TR/rdf-sparql-query/#grammarComments
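
The rule cited in [1] can be illustrated with a small standalone sketch. This is not Fuseki or ARQ code: the class name is hypothetical, and the tokenisation is deliberately simplified (an IRI is assumed to be `<...>` with no whitespace inside, strings are assumed to be single-line with backslash escapes, and triple-quoted long strings are not handled).

```java
final class SparqlCommentSketch {

    /** Removes '#' comments that start outside an IRI reference or a quoted string. */
    static String stripComments(String query) {
        StringBuilder out = new StringBuilder();
        int n = query.length();
        int i = 0;
        while (i < n) {
            char c = query.charAt(i);
            if (c == '"' || c == '\'') {
                // Copy a quoted string verbatim; a '#' inside it is not a comment.
                int end = endOfString(query, i);
                out.append(query, i, end);
                i = end;
            } else if (c == '<') {
                // Copy an IRI verbatim (or a lone '<' if this is a comparison operator).
                int end = endOfIri(query, i);
                out.append(query, i, end);
                i = end;
            } else if (c == '#') {
                // Comment: skip to the end of the line, keeping the line break itself.
                while (i < n && query.charAt(i) != '\n' && query.charAt(i) != '\r') {
                    i++;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    /** Index just past the closing quote, or the end of the line/input if unterminated. */
    private static int endOfString(String s, int start) {
        char quote = s.charAt(start);
        int i = start + 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                i += 2; // skip the escaped character
                continue;
            }
            if (c == quote) {
                return i + 1;
            }
            if (c == '\n') {
                return i;
            }
            i++;
        }
        return s.length();
    }

    /** Index just past '>' when the text looks like <iri>; otherwise start + 1. */
    private static int endOfIri(String s, int start) {
        for (int i = start + 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '>') {
                return i + 1;
            }
            if (Character.isWhitespace(c) || c == '<' || c == '"') {
                return start + 1; // not an IRI, treat '<' as an operator
            }
        }
        return start + 1;
    }
}
```

Under these assumptions the '#' in `<http://example/a#b>` and the '#' in `"x # y"` survive, while a trailing `# note` is dropped up to the end of its line.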



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (JENA-1388) Lucene text search across multiple fields ("AND") yields no results

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne closed JENA-1388.
---

> Lucene text search across multiple fields ("AND") yields no results
> ---
>
> Key: JENA-1388
> URL: https://issues.apache.org/jira/browse/JENA-1388
> Project: Apache Jena
>  Issue Type: Bug
>  Components: Text
>Affects Versions: Jena 3.4.0
> Environment: CentOS 7.3, OpenJDK 64-Bit, v1.8.0_141-b16
>Reporter: Vilnis Termanis (Iotic Labs)
>Assignee: Osma Suominen
>  Labels: index, lucene, search
> Attachments: config-fields.ttl, multi_field.ttl, multi_index.sparql
>
>
> Searching across two Lucene text indexed fields produces potentially 
> unexpected results. (The following assumes that the string supplied to each 
> field does match and is tied to the same uid/subject.)
> # A query across two fields with *OR* produces two equal rows
> # The same query but with *AND* produces no rows
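
One way both symptoms could arise, assuming (as the jena-text documentation describes) that each indexed triple becomes its own Lucene document with a single searchable field plus the subject URI. The model below is a toy in-memory stand-in, not Lucene; `Doc`, the field names, and the sample data are illustrative assumptions only.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiFieldSketch {

    /** Toy stand-in for one indexed document: subject URI plus ONE searchable field. */
    static final class Doc {
        final String uri;
        final String field;
        final String text;

        Doc(String uri, String field, String text) {
            this.uri = uri;
            this.field = field;
            this.text = text;
        }

        boolean matches(String queryField, String term) {
            return field.equals(queryField) && text.contains(term);
        }
    }

    /** OR: a document is a hit if either field clause matches it. */
    static List<String> searchOr(List<Doc> index, String f1, String t1, String f2, String t2) {
        List<String> hits = new ArrayList<>();
        for (Doc d : index) {
            if (d.matches(f1, t1) || d.matches(f2, t2)) {
                hits.add(d.uri);
            }
        }
        return hits;
    }

    /** AND: both clauses must match the SAME document, which holds only one field. */
    static List<String> searchAnd(List<Doc> index, String f1, String t1, String f2, String t2) {
        List<String> hits = new ArrayList<>();
        for (Doc d : index) {
            if (d.matches(f1, t1) && d.matches(f2, t2)) {
                hits.add(d.uri);
            }
        }
        return hits;
    }
}
```

With two documents for the same subject, one per field, the OR search returns the subject twice and the AND search returns nothing, which mirrors the two observations in the report.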





[jira] [Updated] (JENA-1406) Fix TDB2 TestDatabaseOps.compact_prefixes_3 test

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne updated JENA-1406:

Fix Version/s: (was: Jena 3.4.0)
   Jena 3.6.0

> Fix TDB2 TestDatabaseOps.compact_prefixes_3 test
> 
>
> Key: JENA-1406
> URL: https://issues.apache.org/jira/browse/JENA-1406
> Project: Apache Jena
>  Issue Type: Bug
>  Components: TDB
>Reporter: Bruno P. Kinoshita
>Assignee: Bruno P. Kinoshita
>Priority: Minor
>  Labels: unit-test
> Fix For: Jena 3.6.0
>
>
> In the thread for 3.5.0 RC1 vote, I found that on
> {noformat}
> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 
> 2015-11-11T05:41:47+13:00)
> Maven home: /opt/maven
> Java version: 1.8.0_151, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/java-8-oracle/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "4.4.0-97-generic", arch: "amd64", family: "unix"
> {noformat}
> The TestDatabaseOps.compact_prefixes_3 test always failed with a 
> NullPointerException and no stack trace.





[jira] [Closed] (JENA-1409) Issues in DBOE testing

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne closed JENA-1409.
---

> Issues in DBOE testing
> --
>
> Key: JENA-1409
> URL: https://issues.apache.org/jira/browse/JENA-1409
> Project: Apache Jena
>  Issue Type: Bug
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
> Fix For: Jena 3.5.0
>
>
> Arising from 3.5.0 RC1 release testing on different operating systems:
> # TestProcessFileLock
> # TestDatabaseOps.compact_prefixes_3
> # Inconsistent use of {{ConfigTestDBOE}}, {{ConfigTest}} (in TDB2), and 
> existing {{ConfigTest}} (TDB1).
> Items 1 and 2 are caused by some degree of parallel test runs, timing 
> issues deleting test files on disk, or differences in OS behaviour when 
> deleting/reusing files. 
> Complete isolation of tests would fail to test for applications that delete 
> and reuse directories.
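
The trade-off in the last sentence can be made concrete. The sketch below, using only `java.nio.file`, is a helper that deliberately deletes and re-creates the same directory path between uses instead of handing every test a fresh temporary directory. The class and method names are hypothetical and are not taken from the Jena test suite.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

final class ReusableTestDir {

    /** Deletes dir and everything under it (if present), then re-creates it empty. */
    static Path resetDirectory(Path dir) throws IOException {
        if (Files.exists(dir)) {
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order puts children before parents, so each directory
                // is already empty by the time it is deleted.
                walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
            }
        }
        return Files.createDirectories(dir);
    }
}
```

A test that calls `resetDirectory` on one fixed path exercises the delete-then-reuse behaviour that differs between operating systems; a per-test temporary directory would avoid the flakiness but would no longer cover that case.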





[jira] [Closed] (JENA-1408) Improvements to development build times

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne closed JENA-1408.
---

> Improvements to development build times
> ---
>
> Key: JENA-1408
> URL: https://issues.apache.org/jira/browse/JENA-1408
> Project: Apache Jena
>  Issue Type: Improvement
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
> Fix For: Jena 3.5.0
>
>
> We have {{-Pdev}} for a build that tests the main parts of Jena up to 
> jena-fuseki2 to speed up testing for changes.  It omits jena-iri and 
> jena-shaded-guava and can fail if run when the build picks up snapshots built 
> by Jenkins. We also have {{-Pbootstrap}} which includes these.
> Proposal:
> # Have one profile, which builds everything up to jena-fuseki2.
> # Don't build the javadoc (for speed reasons).





[jira] [Updated] (JENA-1409) Issues in DBOE testing

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne updated JENA-1409:

Fix Version/s: (was: Jena 3.4.0)
   Jena 3.5.0

> Issues in DBOE testing
> --
>
> Key: JENA-1409
> URL: https://issues.apache.org/jira/browse/JENA-1409
> Project: Apache Jena
>  Issue Type: Bug
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
> Fix For: Jena 3.5.0
>
>
> Arising from 3.5.0 RC1 release testing on different operating systems:
> # TestProcessFileLock
> # TestDatabaseOps.compact_prefixes_3
> # Inconsistent use of {{ConfigTestDBOE}}, {{ConfigTest}} (in TDB2), and 
> existing {{ConfigTest}} (TDB1).
> Items 1 and 2 are caused by some degree of parallel test runs, timing 
> issues deleting test files on disk, or differences in OS behaviour when 
> deleting/reusing files. 
> Complete isolation of tests would fail to test for applications that delete 
> and reuse directories.





Re: Jena 3.6.0?

2017-11-27 Thread Andy Seaborne
Testing of the Fuseki fixes has been done - Laura Morales has used the 
development build to check the fixes and confirmed that the issues are resolved.


The Jenkins job, other than the Shiro issue, is building cleanly. There 
are no changes around OS issues except TDB2 fixes (and TDB2 is 
"experimental").


More testing is always good but it takes time. The minimum is some 
testing and 3 +1 votes on process and legal; after that more is better, 
and I'd say the criterion is "is it better than 3.5.0?", not some notion 
of "perfect".


If we have this week before a release can start, some final things for 
3.6.0: (this is 3.6.0 so a few actual changes can happen, not a 3.5.1)


1/ The jena-text documentation improvements
2/ Downgrade shiro to 1.2.6
3/ riot: status code on warnings (#315)
4/ Ideally, dataset assembler (#314) [might be too tight for time].

Anything else?

Rob - I can merge #315 and we can sort out the implementation stuff later.

Andy

On 25/11/17 23:45, ajs6f wrote:
> Ditto, except for me it's the 8th.
>
> ajs6f
>
>> On Nov 25, 2017, at 6:12 PM, Bruno P. Kinoshita wrote:
>>
>> I can run the build and verify signatures any day in the next few weeks.
>> I just won't have much time to properly test Fuseki and review changes
>> until after Dec 3rd.
>> Cheers
>> Bruno








Re: Jena 3.6.0?

2017-11-27 Thread Rob Vesse
Yes, I think #315 can be merged as-is and we can improve the implementation 
later to make it more flexible

Rob








Re: Jena 3.6.0?

2017-11-27 Thread ajs6f
Comments inline...

ajs6f

> On Nov 27, 2017, at 8:10 AM, Andy Seaborne  wrote:
> 
> ...
> 1/ The jena-text documentation improvements

Is this required for or by a release? Can we not do this independently?

> 2/ Downgrade shiro to 1.2.6
> 3/ riot: status code on warnings (#315)

+1 to merging; I would ideally like to confirm the fix with Ian Dickinson 
before closing the ticket.

> 4/ Ideally, dataset assembler (#314) [might be too tight for time].

Waiting on feedback from Andy (and anyone else who might be interested).

> Anything else?

1391 is still hanging, but with a release this close I don't think I can write 
enough tests before then to feel comfortable sending a PR, so let's leave it be.

> 
> Rob - I can merge #315 and we can sort out the implementation stuff later.
> 
>Andy
> 



Re: Jena 3.6.0?

2017-11-27 Thread Andy Seaborne



On 27/11/17 14:30, ajs6f wrote:
> Comments inline...
>
> ajs6f
>
>> On Nov 27, 2017, at 8:10 AM, Andy Seaborne wrote:
>>
>> ...
>> 1/ The jena-text documentation improvements
>
> Is this required for or by a release? Can we not do this independently?

Required? No.

It needs doing and the website gets updated on release.

Andy


Re: Jena 3.6.0?

2017-11-27 Thread ajs6f
Right, I just wouldn't want to make 3.6.0 wait on it if the other stuff gets 
done.

ajs6f

> On Nov 27, 2017, at 9:51 AM, Andy Seaborne  wrote:
> 
> 
> 
> On 27/11/17 14:30, ajs6f wrote:
>> Comments inline...
>> ajs6f
>>> On Nov 27, 2017, at 8:10 AM, Andy Seaborne  wrote:
>>> 
>>> ...
>>> 1/ The jena-text documentation improvements
>> Is this required for or by a release? Can we not do this independently?
> 
> Required? No.
> 
> It needs doing and the website gets updated on release.
> 
>Andy



[jira] [Updated] (JENA-1043) Running Fuseki in Tomcat does not respect Tomcat manager stop/start.

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne updated JENA-1043:

Component/s: (was: eki)

> Running Fuseki in Tomcat does not respect Tomcat manager stop/start.
> 
>
> Key: JENA-1043
> URL: https://issues.apache.org/jira/browse/JENA-1043
> Project: Apache Jena
>  Issue Type: Bug
>  Components: Fuseki
>Affects Versions: Fuseki 2.3.0
>Reporter: Andy Seaborne
>
> Fuseki needs a dataset start/stop lifecycle, and this needs to be coupled into 
> {{FusekiServerListener}}, which is the {{ServletContextListener}} for Fuseki.
> See [thread on 
> users@|http://mail-archives.apache.org/mod_mbox/jena-users/201509.mbox/%3C55F849D7.3010403%40u.washington.edu%3E].
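
A rough sketch of the shape such a lifecycle might take, in plain Java with no servlet dependency. Everything here is hypothetical: `DatasetLifecycle`, `DatasetLifecycleSketch`, and the two `onContext...` methods merely stand in for the callbacks a `ServletContextListener` receives, and none of it is existing Fuseki code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical start/stop contract for a dataset. */
interface DatasetLifecycle {
    void start();
    void stop();
}

/** Hypothetical holder that a context listener could delegate to. */
final class DatasetLifecycleSketch {

    // Insertion-ordered so datasets start in registration order.
    private final Map<String, DatasetLifecycle> datasets = new LinkedHashMap<>();

    void register(String name, DatasetLifecycle dataset) {
        datasets.put(name, dataset);
    }

    /** Would be called when the container starts the web application. */
    void onContextInitialized() {
        for (DatasetLifecycle d : datasets.values()) {
            d.start();
        }
    }

    /** Would be called when the container stops the web application. */
    void onContextDestroyed() {
        // Stop in reverse order of start.
        List<DatasetLifecycle> reversed = new ArrayList<>(datasets.values());
        Collections.reverse(reversed);
        for (DatasetLifecycle d : reversed) {
            d.stop();
        }
    }
}
```

The point of the sketch is only that a Tomcat manager stop followed by a start would go through `stop()` and then `start()` on each dataset, rather than leaving state from the previous run in place.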





CMS diff: Jena Full Text Search

2017-11-27 Thread Chris Tomlinson
Clone URL (Committers only):
https://cms.apache.org/redirect?new=anonymous;action=diff;uri=http://jena.apache.org/documentation%2Fquery%2Ftext-query.mdtext

Chris Tomlinson

Index: trunk/content/documentation/query/text-query.mdtext
===
--- trunk/content/documentation/query/text-query.mdtext (revision 1816402)
+++ trunk/content/documentation/query/text-query.mdtext (working copy)
@@ -1,5 +1,7 @@
 Title: Jena Full Text Search
 
+Title: Jena Full Text Search
+
 This extension to ARQ combines SPARQL and full text search via
 [Lucene](https://lucene.apache.org) 6.4.1 or
 [ElasticSearch](https://www.elastic.co) 5.2.1 (which is built on
@@ -64,7 +66,21 @@
 ## Table of Contents
 
 -   [Architecture](#architecture)
+-   [External content](#external-content)
+-   [External applications](#external-applications)
+-   [Document structure](#document-structure)
 -   [Query with SPARQL](#query-with-sparql)
+-   [Syntax](#syntax)
+-   [Input arguments](#input-arguments)
+-   [Output arguments](#output-arguments)
+-   [Query strings](#query-strings)
+-   [Simple queries](#simple-queries)
+-   [Queries with language tags](#queries-with-language-tags)
+-   [Queries that retrieve literals](#queries-that-retrieve-literals)
+-   [Queries with graphs](#queries-with-graphs)
+-   [Queries across multiple `Fields`](#queries-across-multiple-fields)
+-   [Queries with _Boolean Operators_ and _Term Modifiers_](#queries-with-boolean-operators-and-term-modifiers)
+-   [Good practice](#good-practice)
 -   [Configuration](#configuration)
 -   [Text Dataset Assembler](#text-dataset-assembler)
 -   [Configuring an analyzer](#configuring-an-analyzer)
@@ -108,6 +124,7 @@
 
 The text index uses the native query language of the index:
 [Lucene query language](http://lucene.apache.org/core/6_4_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description)
+(with [restrictions](#input-arguments))
 or
 [Elasticsearch query language](https://www.elastic.co/guide/en/elasticsearch/reference/5.2/query-dsl.html).
 
@@ -134,6 +151,64 @@
 By using Elasticsearch, other applications can share the text index with
 SPARQL search.
 
+### Document structure
+
+As mentioned above, text indexing of a triple involves associating a Lucene
+document with the triple. How is this done?
+
+Lucene documents are composed of `Field`s. Indexing and searching are performed
+over the contents of these `Field`s. For an RDF triple to be indexed in Lucene the
+_property_ of the triple must be
+[configured in the entity map of a TextIndex](#entity-map-definition).
+This associates a Lucene analyzer with the _`property`_ which will be used
+for indexing and search. The _`property`_ becomes the _searchable_ Lucene
+`Field` in the resulting document.
+
+A Lucene index includes a _default_ `Field`, which is specified in the configuration,
+that is the field to search if not otherwise named in the query. In jena-text
+this field is configured via the `text:defaultField` property which is then mapped
+to a specific RDF property via `text:predicate` (see [entity map](#entity-map-definition)
+below).
+
+There are several additional `Field`s that will be included in the
+document that is passed to the Lucene `IndexWriter` depending on the
+configuration options that are used. These additional fields are used to
+manage the interface between Jena and Lucene and are not generally 
+searchable per se.
+
+The most important of these additional `Field`s is the `text:entityField`.
+This configuration property defines the name of the `Field` that will contain
+the _URI_ or _blank node id_ of the _subject_ of the triple being indexed. This property does
+not have a default and must be specified for most uses of `jena-text`. This
+`Field` is often given the name, `uri`, in examples. It is via this `Field`
+that `?s` is bound in a typical use such as:
+
+select ?s
+where {
+?s text:query "some text"
+}
+
+Other `Field`s that may be configured: `text:uidField`, `text:graphField`,
+and so on are discussed below.
+
+Given the triple:
+
+ex:SomeOne skos:prefLabel "zorn protégé a prés"@fr ;
+
+The following is an abbreviated illustration of a Lucene document that Jena will create and
+request Lucene to index:
+
+Document<
+http://example.org/SomeOne> 
+ 
+ 
+ 
+ 
+>
+
+It may be instructive to refer back to this example when considering the various
+points below.
+
 ## Query with SPARQL
 
 The URI of the text extension property function is
@@ -143,63 +218,298 @@
 
 ...   text:query ...
 
+### Syntax
 
 The following forms are all legal:
 
-?s text:query 'word'   # query
-?s text:query (rdfs:label 'word')  # query specific property if 
multiple
-?s text:query ('word' 10)  # wit

[jira] [Created] (JENA-1440) Map ByteBuffer direct to Nodeis, avoiding a Record object (TupleIndexRecord)

2017-11-27 Thread Andy Seaborne (JIRA)
Andy Seaborne created JENA-1440:
---

 Summary: Map ByteBuffer direct to Nodeis, avoiding a Record object 
(TupleIndexRecord)
 Key: JENA-1440
 URL: https://issues.apache.org/jira/browse/JENA-1440
 Project: Apache Jena
  Issue Type: Improvement
  Components: TDB2
Affects Versions: Jena 3.5.0, Jena 3.6.0
Reporter: Andy Seaborne
Assignee: Andy Seaborne
Priority: Minor


Avoiding going bytes-> Record->Tuple can save about 25% of time in a 
large index scan.
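
To illustrate the idea only: the sketch below contrasts copying each fixed-width index entry into an intermediate byte[] "record" before decoding it with decoding the ids straight from the ByteBuffer. It assumes entries of three 8-byte ids in big-endian order (the ByteBuffer default). The class name, method names, and layout are illustrative assumptions, not the TDB2 classes, and the sketch makes no attempt to reproduce the timing figure above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class DirectMappingSketch {

    static final int IDS_PER_ENTRY = 3;
    static final int ENTRY_BYTES = IDS_PER_ENTRY * Long.BYTES;

    /** Two-step path: copy each entry into a byte[] record, then decode the record. */
    static List<long[]> viaRecord(ByteBuffer index) {
        List<long[]> tuples = new ArrayList<>();
        ByteBuffer view = index.duplicate(); // own position, shared content
        while (view.remaining() >= ENTRY_BYTES) {
            byte[] record = new byte[ENTRY_BYTES]; // intermediate object per entry
            view.get(record);
            ByteBuffer rec = ByteBuffer.wrap(record);
            long[] ids = new long[IDS_PER_ENTRY];
            for (int i = 0; i < IDS_PER_ENTRY; i++) {
                ids[i] = rec.getLong();
            }
            tuples.add(ids);
        }
        return tuples;
    }

    /** Direct path: read the ids from the buffer with absolute gets, no record copy. */
    static List<long[]> direct(ByteBuffer index) {
        List<long[]> tuples = new ArrayList<>();
        for (int offset = index.position();
             offset + ENTRY_BYTES <= index.limit();
             offset += ENTRY_BYTES) {
            long[] ids = new long[IDS_PER_ENTRY];
            for (int i = 0; i < IDS_PER_ENTRY; i++) {
                ids[i] = index.getLong(offset + i * Long.BYTES);
            }
            tuples.add(ids);
        }
        return tuples;
    }
}
```

Both methods return the same tuples; the second simply skips the per-entry allocation and copy, which is the kind of saving the issue describes.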





[jira] [Updated] (JENA-1440) Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne updated JENA-1440:

Summary: Map ByteBuffer direct to NodeIds, avoiding a Record object 
(TupleIndexRecord)  (was: Map ByteBuffer direct to Nodeis, avoiding a Record 
object (TupleIndexRecord))

> Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)
> -
>
> Key: JENA-1440
> URL: https://issues.apache.org/jira/browse/JENA-1440
> Project: Apache Jena
>  Issue Type: Improvement
>  Components: TDB2
>Affects Versions: Jena 3.5.0, Jena 3.6.0
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
>Priority: Minor
>
> Avoiding going bytes-> Record->Tuple can save about 25% of time in a 
> large index scan.





[jira] [Updated] (JENA-1440) Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne updated JENA-1440:

Description: Avoiding going {{bytes-> Record->Tuple}} can save 
about 25% of time in a large index scan (but there are lots of other things 
going on in a typical query and this change would be negligible).  (was: 
Avoiding going bytes-> Record->Tuple can save about 25% of time in a 
large index scan.)

> Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)
> -
>
> Key: JENA-1440
> URL: https://issues.apache.org/jira/browse/JENA-1440
> Project: Apache Jena
>  Issue Type: Improvement
>  Components: TDB2
>Affects Versions: Jena 3.5.0, Jena 3.6.0
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
>Priority: Minor
> Fix For: Jena 3.7.0
>
>
> Avoiding going {{bytes-> Record->Tuple}} can save about 25% of time 
> in a large index scan (but there are lots of other things going on in a 
> typical query and this change would be negligible).





[jira] [Updated] (JENA-1440) Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)

2017-11-27 Thread Andy Seaborne (JIRA)

 [ 
https://issues.apache.org/jira/browse/JENA-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne updated JENA-1440:

Fix Version/s: Jena 3.7.0

> Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)
> -
>
> Key: JENA-1440
> URL: https://issues.apache.org/jira/browse/JENA-1440
> Project: Apache Jena
>  Issue Type: Improvement
>  Components: TDB2
>Affects Versions: Jena 3.5.0, Jena 3.6.0
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
>Priority: Minor
> Fix For: Jena 3.7.0
>
>
> Avoiding going {{bytes-> Record->Tuple}} can save about 25% of time 
> in a large index scan (but there are lots of other things going on in a 
> typical query and this change would be negligible).





[jira] [Commented] (JENA-1440) Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)

2017-11-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/JENA-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16267216#comment-16267216
 ] 

ASF GitHub Bot commented on JENA-1440:
--

GitHub user afs opened a pull request:

https://github.com/apache/jena/pull/317

JENA-1440: TDB2 - transform bytes to NodeIds directly.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/afs/jena tdb-mapper

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/jena/pull/317.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #317






> Map ByteBuffer direct to NodeIds, avoiding a Record object (TupleIndexRecord)
> -
>
> Key: JENA-1440
> URL: https://issues.apache.org/jira/browse/JENA-1440
> Project: Apache Jena
>  Issue Type: Improvement
>  Components: TDB2
>Affects Versions: Jena 3.5.0, Jena 3.6.0
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
>Priority: Minor
> Fix For: Jena 3.7.0
>
>
> Avoiding going {{bytes-> Record->Tuple}} can save about 25% of time 
> in a large index scan (but there are lots of other things going on in a 
> typical query and this change would be negligible).





[GitHub] jena issue #317: JENA-1440: TDB2 - transform bytes to NodeIds directly.

2017-11-27 Thread afs
Github user afs commented on the issue:

https://github.com/apache/jena/pull/317
  
Not for Jena 3.6.0.


---


[GitHub] jena pull request #317: JENA-1440: TDB2 - transform bytes to NodeIds directl...

2017-11-27 Thread ajs6f
Github user ajs6f commented on a diff in the pull request:

https://github.com/apache/jena/pull/317#discussion_r153284608
  
--- Diff: jena-db/jena-dboe-base/src/main/java/org/apache/jena/dboe/base/buffer/RecordBufferIteratorMapper.java ---
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.jena.dboe.base.buffer;
+
+import static org.apache.jena.atlas.lib.Alg.decodeIndex ;
+
+import java.util.Iterator;
+import java.util.NoSuchElementException;
+
+import org.apache.jena.atlas.lib.Bytes;
+import org.apache.jena.dboe.base.record.Record;
+import org.apache.jena.dboe.base.record.RecordMapper;
+
+// Iterate over one RecordBuffer
+public class RecordBufferIteratorMapper<X> implements Iterator<X>
+{
+    private RecordBuffer rBuff ;
+    private int nextIdx ;
+    private X slot = null ;
+    private final byte[] keySlot ;
+    private final Record maxRec ;
+    private final Record minRec ;
+    private final RecordMapper<X> mapper;
+
+    //RecordBufferIteratorMapper(RecordBuffer rBuff)
+    //{ this(rBuff, null, null); }
+
+    RecordBufferIteratorMapper(RecordBuffer rBuff, Record minRecord, Record maxRecord, int keyLen, RecordMapper<X> mapper)
+    {
+        this.rBuff = rBuff ;
+        this.mapper = mapper ;
+        this.keySlot = (maxRecord==null) ? null : new byte[keyLen];
+        nextIdx = 0 ;
+        minRec = minRecord ;
+        if ( minRec != null )
+        {
+            nextIdx = rBuff.find(minRec) ;
+            if ( nextIdx < 0 )
+                nextIdx = decodeIndex(nextIdx) ;
+        }
+
+        maxRec = maxRecord ; 
+    }
+
+    private void finish()
+    {
+        rBuff = null ;
+        nextIdx = -99 ;
--- End diff --

Might be nice to call this out as a constant, like `NO_NEXT_INDEX` or the 
like.
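
The suggestion above can be sketched as follows. This is an illustrative toy iterator, not the Jena class: `SentinelIterator` and its `int[]` backing store are hypothetical, and only the name `NO_NEXT_INDEX` and the value `-99` come from the review comment and the diff.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the iterator in the diff; not Jena code.
public class SentinelIterator implements Iterator<Integer> {
    // Named sentinel replacing the bare -99 literal.
    static final int NO_NEXT_INDEX = -99;

    private int[] items;
    private int nextIdx = 0;

    public SentinelIterator(int[] items) { this.items = items; }

    // Release the backing data and mark the iterator as exhausted.
    private void finish() {
        items = null;
        nextIdx = NO_NEXT_INDEX;
    }

    @Override
    public boolean hasNext() {
        if (nextIdx == NO_NEXT_INDEX) return false;   // already finished
        if (nextIdx >= items.length) { finish(); return false; }
        return true;
    }

    @Override
    public Integer next() {
        if (!hasNext()) throw new NoSuchElementException();
        return items[nextIdx++];
    }
}
```

With the constant in place, every check for the finished state reads `nextIdx == NO_NEXT_INDEX`, so the meaning of the magic value is stated once and cannot drift between the places that test it.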


---


Re: CMS diff: Jena Full Text Search

2017-11-27 Thread Chris Tomlinson
Hi All,

I’ve completed my proposed updates to the Jena text-query documentation. The 
documentation corresponds to 3.6.0-SNAPSHOT. I’ve noted several instances where 
the current behavior may be considered an issue that will be corrected in a 
future release. I’ve separately created issues for these: JENA-1437, JENA-1438, 
and JENA-1439.

Thank you,
Chris


> On Nov 22, 2017, at 8:25 AM, Chris Tomlinson wrote:
> 
> Hi Andy and Osma,
> 
> I posted JENA-1426 since 
> the “improve this page” facility didn’t seem to offer any way to add a commit 
> message or more extensive explanation of the reasons for the proposed edits 
> and they were somewhat extensive. So raising an issue seemed a way to 
> proceed; however, after several days with no comments I thought perhaps I 
> should follow the published protocol and I made the update as guest on the 
> CMS.
> 
> I had several motivations regarding updating the documentation: 1) I wanted 
> to present how the current implementation functions in a way that might be 
> more useful to users - for example clarifying what can be expected to work 
> and what not in terms of using the native Lucene query language, e.g., 
> JENA-1388; 2) identify 
> areas that might indicate perhaps unintended aspects of the current 
> implementation; and 3) understand the code in preparation for developing a 
> proposal for adding jena-text highlighting support.
> 
> Based on Osma’s feedback I will be opening a few issues on JIRA and making 
> corrections to the original submission. I assume that updates should just be 
> made as further commits.
> 
> Thanks,
> Chris
> 
> 
> 
>> On Nov 22, 2017, at 6:41 AM, Andy Seaborne wrote:
>> 
>> How is this related to JENA-1426?
>> 
>>Andy
>> 
>> On 21/11/17 14:48, Osma Suominen wrote:
>>> ajs6f kirjoitti 20.11.2017 klo 18:36:
>>>> Osma (or anyone else who knows text indexing better than do I, which 
>>>> wouldn't take much)-- could you review this? It's got some great useful 
>>>> detail about how the indexing works and can be used.
>>> Sure, will do.
>>> Comments about specific sections below. Generally this is a very good 
>>> contribution to the jena-text documentation, which has stagnated a bit.
> +The following illustrates a Lucene document that Jena will create and
> +request Lucene to index:
> +
> +Document<
> +stored, indexed, indexOptions=DOCS 
> http://example.org/SomeOne >
> +indexed, omitNorms, indexOptions=DOCS 
> 
> +stored, indexed, tokenized 
> +stored, indexed, omitNorms, indexOptions=DOCS 
> +stored, indexed, tokenized 
> +stored, indexed, omitNorms, indexOptions=DOCS 
> 
> +stored, indexed, tokenized 
> +stored, indexed, omitNorms, indexOptions=DOCS 
> +stored, indexed, tokenized 
> +stored, indexed, omitNorms, indexOptions=DOCS 
> 
> +>
> +
> +It may be instructive to refer back to this example when considering the various
> +points below.
>>> Not sure if this is a perfect illustration. The level of detail is rather 
>>> excessive. I know Lucene quite well and I still struggle to understand 
>>> what's going on here. Is there another way of presenting this information, 
>>> for example just a key-value list that shows the field values that get 
>>> stored in the document? I think the field options stored, indexed, 
>>> tokenized, omitNorms etc. are unnecessary here or at least should not be so 
>>> prominent.
> +The `lang:xx` specification is an optional string, where _xx_ is
> +a BCP-47 language tag. This restricts searches to field values that were originally
> +indexed with the tag _xx_. Searches may be restricted to field values with no
> +language tag via `"lang:none"`. The use of the `lang:xx` is only effective if
> +[multilingual support](#linguistic-support-with-lucene-index) has been configured.
>>> The last sentence is not true. You can restrict by language even without 
>>> enabling multilingual support, as long as langField has been set.
> +Further, if the `lang:xx` is used then the `property` URI must be supplied
> +in order for searches to work.
>>> Not true. The default property should be used if no property was specified.
> +When working with `rdf:langString`s It may be tempting to write:
> +
> +?s text:query "protégé"@fr
> +
> +However, the above will silently fail to return results since the
> +`query string` must be a simple `xsd:string`