Re: CMS diff: Jena Full Text Search

2019-02-22 Thread Chris Tomlinson
I have (finally!) updated the jena text query documentation for the 
improvements that Vincent Ventresque submitted.

Thank you Vincent for the contribution and your patience.

Regards,
Chris


> On Jan 23, 2019, at 12:01 PM, vincent.ventres...@ens-lyon.fr 
>  wrote:
> 
> Clone URL (Committers only):
> https://cms.apache.org/redirect?new=anonymous;action=diff;uri=http://jena.apache.org/documentation%2Fquery%2Ftext-query.mdtext
> 
> vincent.ventres...@ens-lyon.fr
> 
> Index: trunk/content/documentation/query/text-query.mdtext
> ===
> --- trunk/content/documentation/query/text-query.mdtext   (revision 1851871)
> +++ trunk/content/documentation/query/text-query.mdtext   (working copy)
> @@ -609,21 +609,47 @@
> index field. More complex setups, with multiple properties per entity
> (URI) are possible.
> 
> +The assembler file can be either default configuration file (.../run/config.ttl)
> +or a custom file in ...run/configuration folder. Note that you can use several files
> +simultaneously.
> +
> +You have to edit the file (see comments in the assembler code below):
> +
> +1. provide values for paths and a fixed URI for tdb:DatasetTDB
> +2. modify the entity map : add the fields you want to index and desired options (filters, tokenizers...)
> +
> +If your assembler file is run/config.ttl, you can index the dataset with this command :
> +
> +java -cp ./fuseki-server.jar jena.textindexer --desc=run/config.ttl
> +
> Once configured, any data added to the text dataset is automatically
> -indexed as well.
> +indexed as well : https://jena.apache.org/documentation/query/text-query.html#building-a-text-index
> 
> +When you change the jena-text in significant ways, such as changing what analyzer
> +is used for a given property and so on, then you’ll need to rebuild the Lucene index
> +via reloading the dataset or using the textIndexer.
> +
> ### Text Dataset Assembler
> 
> The following is an example of a TDB dataset with a text index.
> 
> + Example of a TDB dataset and text index#
> +# The main doc sources are:
> +#  - https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
> +#  - https://jena.apache.org/documentation/assembler/assembler-howto.html
> +#  - https://jena.apache.org/documentation/assembler/assembler.ttl
> +# See https://jena.apache.org/documentation/fuseki2/fuseki-layout.html for the destination of this file.
> +#
> +
> @prefix : .
> @prefix rdf:  .
> @prefix rdfs: .
> @prefix tdb:  .
> @prefix ja:   .
> @prefix text: .
> +@prefix skos: 
> +@prefix fuseki:   .
> 
> -## Example of a TDB dataset and text index
> ## Initialize TDB
> [] ja:loadClass "org.apache.jena.tdb.TDB" .
> tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
> @@ -631,39 +657,64 @@
> 
> ## Initialize text query
> [] ja:loadClass   "org.apache.jena.query.text.TextQuery" .
> +
> # A TextDataset is a regular dataset with a text index.
> text:TextDataset  rdfs:subClassOf   ja:RDFDataset .
> +
> # Lucene index
> text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
> -# Elasticsearch index
> -text:TextIndexESrdfs:subClassOf   text:TextIndex .
> 
> +
> ## ---
> -## This URI must be fixed - it's used to assemble the text dataset.
> 
> :text_dataset rdf:type text:TextDataset ;
> -text:dataset   <#dataset> ;
> +text:dataset   :my_dataset ; # <-- replace `:my_dataset` with the desired URI
> text:index <#indexLucene> ;
> -.
> +.
> 
> # A TDB dataset used for RDF storage
> -<#dataset> rdf:type  tdb:DatasetTDB ;
> -tdb:location "DB" ;
> -tdb:unionDefaultGraph true ; # Optional
> -.
> 
> -# Text index description
> +:my_dataset rdf:type  tdb:DatasetTDB ;   # <-- replace `:my_dataset` with the desired URI
> +tdb:location "/tmp/tdb-dataset/" ;   # <-- replace `/tmp/tdb-dataset/` with your path (`.../fuseki/run/databases/MY_DATASET`)
> +#tdb:unionDefaultGraph true ; # Optional
> +.
> +
> +# Text index description (see documentation for other options)
> +
> <#indexLucene> a text:TextIndexLucene ;
> -text:directory  ;
> +text:directory  ;# <-- 
> replace ` with your path` 
> (``)
> 

[GitHub] rvesse commented on issue #536: Add support SurroundQueryParser to jena-text

2019-02-22 Thread GitBox
rvesse commented on issue #536: Add support SurroundQueryParser to jena-text
URL: https://github.com/apache/jena/pull/536#issuecomment-466525608
 
 
   Pinging @osma @xristy 
   
   Not that it's a blocker for this PR but anytime I see this kind of code 
pattern in Jena (and Java in general) I think that we really should be using 
the `ServiceLoader` pattern.  This would make stuff like this dynamically 
extensible so we can have some basic defaults out of the box with an easy 
drop-in mechanism to discover user provided extensions.  And text search does 
seem to be an area where end users want to do a lot of customisation
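
For readers unfamiliar with the `ServiceLoader` pattern mentioned above, here is a minimal sketch of how defaults plus drop-in extensions could be combined. It uses only `java.util.ServiceLoader` from the JDK. The `QueryParserProvider` interface, the `QueryParserRegistry` class, and the provider names are hypothetical, invented for illustration; they are not part of jena-text or of PR #536.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

public class QueryParserRegistry {

    /** Hypothetical extension point; NOT an actual jena-text interface. */
    public interface QueryParserProvider {
        String name();      // key a configuration file would refer to
        String describe();  // stand-in for "create a parser"
    }

    /** A built-in default that is always available out of the box. */
    static final class BuiltinProvider implements QueryParserProvider {
        @Override public String name() { return "QueryParser"; }
        @Override public String describe() { return "built-in default"; }
    }

    /** Start from the defaults, then add whatever the classpath provides. */
    public static Map<String, QueryParserProvider> discover() {
        Map<String, QueryParserProvider> providers = new LinkedHashMap<>();
        QueryParserProvider builtin = new BuiltinProvider();
        providers.put(builtin.name(), builtin);
        // Finds implementations listed under META-INF/services on the classpath.
        // With no such file present, the loop body simply never runs.
        for (QueryParserProvider p : ServiceLoader.load(QueryParserProvider.class)) {
            providers.put(p.name(), p);
        }
        return providers;
    }

    public static void main(String[] args) {
        discover().forEach((name, p) ->
            System.out.println(name + " -> " + p.describe()));
    }
}
```

In this sketch a user-provided extension would ship a jar containing an implementation class and a `META-INF/services/QueryParserRegistry$QueryParserProvider` file naming it; no change to the registry code would be needed.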


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (JENA-1674) Mishandling negative xsd:floats in TDB2

2019-02-22 Thread Andy Seaborne (JIRA)


 [ 
https://issues.apache.org/jira/browse/JENA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne resolved JENA-1674.
-
Resolution: Fixed

> Mishandling negative xsd:floats in TDB2
> ---
>
> Key: JENA-1674
> URL: https://issues.apache.org/jira/browse/JENA-1674
> Project: Apache Jena
>  Issue Type: Improvement
>  Components: TDB2
>Affects Versions: Jena 3.10.0
>Reporter: Andy Seaborne
>Assignee: Andy Seaborne
>Priority: Major
> Fix For: Jena 3.11.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Reported on users@:
> https://lists.apache.org/thread.html/063ad1651c559752080001c7faa40d24d0bf29d78636f3a3e222c1ab@%3Cusers.jena.apache.org%3E
> Example:
> {code}
> public static void main(String[] args) {
>     Location loc = Location.mem();
>     DatasetGraph dsg = DatabaseMgr.connectDatasetGraph(loc);
>     Txn.execute(dsg, ()->dsg.add(SSE.parseQuad("(_ :s :p '-1'^^xsd:float)")));
>     Txn.execute(dsg, ()->RDFDataMgr.write(System.out, dsg, Lang.NQ));
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (JENA-1674) Mishandling negative xsd:floats in TDB2

2019-02-22 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774918#comment-16774918
 ] 

ASF subversion and git services commented on JENA-1674:
---

Commit d5545fea581937db8dce470fec2f4db51141da55 in jena's branch 
refs/heads/master from Andy Seaborne
[ https://gitbox.apache.org/repos/asf?p=jena.git;h=d5545fe ]

Merge pull request #535 from afs/tdb2-xsd_float

JENA-1674: Don't sign extend Float.floatToIntBits.
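
The commit title can be illustrated with plain JDK calls. The sketch below is not the TDB2 code; it assumes only that the 32 float bits are stored in a wider `long` alongside other bits, and the `TAG` value and `withTag` helper are hypothetical. `Float.floatToIntBits` returns an `int`, so for a negative float the implicit widening to `long` fills the upper 32 bits with ones unless the value is masked first.

```java
public class FloatBitsWidening {

    /** Hypothetical type tag kept in the top byte of the packed value. */
    static final long TAG = 0x0AL;

    /** Buggy variant: implicit int -> long widening sign-extends. */
    static long packSignExtended(float f) {
        return Float.floatToIntBits(f);
    }

    /** Fixed variant: keep only the low 32 bits. */
    static long packMasked(float f) {
        return Float.floatToIntBits(f) & 0xFFFFFFFFL;
    }

    /** Hypothetical helper: combine the tag with the packed bits. */
    static long withTag(long bits) {
        return (TAG << 56) | bits;
    }

    /** Read the tag back from the top byte. */
    static long tagOf(long packed) {
        return packed >>> 56;
    }

    /** Recover the float from the low 32 bits. */
    static float unpack(long packed) {
        return Float.intBitsToFloat((int) packed);
    }

    public static void main(String[] args) {
        float f = -1.0f;
        // prints ffffffffbf800000
        System.out.println(Long.toHexString(packSignExtended(f)));
        // prints bf800000
        System.out.println(Long.toHexString(packMasked(f)));
        // The sign-extended bits overwrite the tag; the masked bits do not.
        System.out.println(Long.toHexString(tagOf(withTag(packSignExtended(f)))));
        System.out.println(Long.toHexString(tagOf(withTag(packMasked(f)))));
        System.out.println(unpack(withTag(packMasked(f))));
    }
}
```

Positive floats are unaffected because their sign bit is zero, which is consistent with the report covering only negative values.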



[GitHub] afs merged pull request #535: JENA-1674: Don't sign extend Float.floatToIntBits.

2019-02-22 Thread GitBox
afs merged pull request #535: JENA-1674: Don't sign extend Float.floatToIntBits.
URL: https://github.com/apache/jena/pull/535
 
 
   




[jira] [Commented] (JENA-1674) Mishandling negative xsd:floats in TDB2

2019-02-22 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/JENA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774917#comment-16774917
 ] 

ASF subversion and git services commented on JENA-1674:
---

Commit efa2a7ac19ec148812f29634225a79f40e82f864 in jena's branch 
refs/heads/master from Andy Seaborne
[ https://gitbox.apache.org/repos/asf?p=jena.git;h=efa2a7a ]

JENA-1674: Don't sign extend Float.floatToIntBits.

Add TestFloatNode for value-based testing of FloatNode.
Use canonical NaN for packing Double.NaN.
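
As background for the canonical NaN remark, the JDK offers two conversions that differ exactly on NaN handling. The sketch below shows one way to obtain a canonical bit pattern; whether the commit uses this call is an assumption, and the `CanonicalNaN` class and `pack` method are hypothetical names.

```java
public class CanonicalNaN {

    /**
     * Double.doubleToLongBits collapses every NaN to the single canonical
     * pattern 0x7ff8000000000000L, whereas Double.doubleToRawLongBits
     * attempts to preserve whatever NaN payload the value carries.
     */
    static long pack(double d) {
        return Double.doubleToLongBits(d);
    }

    public static void main(String[] args) {
        // A quiet NaN with a non-zero payload, built from raw bits.
        double oddNaN = Double.longBitsToDouble(0x7ff8000000000001L);
        System.out.println(Double.isNaN(oddNaN));
        System.out.println(Long.toHexString(pack(oddNaN)));
        System.out.println(Long.toHexString(pack(Double.NaN)));
    }
}
```

Packing through a canonical form means two NaN values always produce the same stored bits, so they compare equal at the bit level.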

