Re: CMS diff: Jena Full Text Search

2019-01-28 Thread vincent ventresque

Hi Ajs6f

Thanks for including me in the conversation, but I have to confess I've 
never looked at the Java classes (I only use the command-line tools).


On 28/01/2019 at 21:05, ajs6f wrote:

On Jan 28, 2019, at 2:57 PM, Chris Tomlinson wrote:

Hi Adam,

I haven’t seen that error. What I’ve done in the past is replace the 
jena-text doc file with the new contents in Eclipse, in an SVN checkout of the 
jena-doc-site, and then commit.

I can definitely do that (and will when we're happy with the patch), but see 
below.


Out of curiosity, when is it necessary to use the

[] ja:loadClass "org.apache.jena.tdb.TDB" .

and

[] ja:loadClass "org.apache.jena.query.text.TextQuery" .

? I do not use them in the config when running the Fuseki WAR in Tomcat.

I have no idea whatsoever! :grin: I wouldn't have thought them needed either.

Vincent-- any comment?

ajs6f



Regards,
Chris




On Jan 28, 2019, at 11:11 AM, ajs6f  wrote:

Recently Vincent offered a nice patch to our text indexing documentation, as shown below. 
Oddly, when I now go to merge it (a bit late, sorry!), I get an error: "Can't locate 
anonymous's tree to clone". Is anyone familiar with that? I know very little about 
the SVN-based CMS, so I'm not even sure where to start looking...

ajs6f


CMS diff: Jena Full Text Search

2019-01-23 Thread vincent . ventresque
Clone URL (Committers only):
https://cms.apache.org/redirect?new=anonymous;action=diff;uri=http://jena.apache.org/documentation%2Fquery%2Ftext-query.mdtext

vincent.ventres...@ens-lyon.fr

Index: trunk/content/documentation/query/text-query.mdtext
===
--- trunk/content/documentation/query/text-query.mdtext (revision 1851871)
+++ trunk/content/documentation/query/text-query.mdtext (working copy)
@@ -609,21 +609,47 @@
 index field. More complex setups, with multiple properties per entity
 (URI) are possible.
 
+The assembler file can be either default configuration file (.../run/config.ttl)
+or a custom file in ...run/configuration folder. Note that you can use several files
+simultaneously.
+
+You have to edit the file (see comments in the assembler code below):
+
+1. provide values for paths and a fixed URI for tdb:DatasetTDB
+2. modify the entity map : add the fields you want to index and desired options (filters, tokenizers...)
+
+If your assembler file is run/config.ttl, you can index the dataset with this command :
+
+java -cp ./fuseki-server.jar jena.textindexer --desc=run/config.ttl
+
 Once configured, any data added to the text dataset is automatically
-indexed as well.
+indexed as well : https://jena.apache.org/documentation/query/text-query.html#building-a-text-index
 
+When you change the jena-text in significant ways, such as changing what analyzer 
+is used for a given property and so on, then you’ll need to rebuild the Lucene index 
+via reloading the dataset or using the textIndexer.
+
 ### Text Dataset Assembler
 
 The following is an example of a TDB dataset with a text index.
 
+ Example of a TDB dataset and text index#
+# The main doc sources are:
+#  - https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
+#  - https://jena.apache.org/documentation/assembler/assembler-howto.html
+#  - https://jena.apache.org/documentation/assembler/assembler.ttl
+# See https://jena.apache.org/documentation/fuseki2/fuseki-layout.html for the destination of this file.
+#
+
 @prefix : .
 @prefix rdf:  .
 @prefix rdfs: .
 @prefix tdb:  .
 @prefix ja:   .
 @prefix text: .
+@prefix skos: 
+@prefix fuseki:   .
 
-## Example of a TDB dataset and text index
 ## Initialize TDB
 [] ja:loadClass "org.apache.jena.tdb.TDB" .
 tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
@@ -631,39 +657,64 @@
 
 ## Initialize text query
 [] ja:loadClass   "org.apache.jena.query.text.TextQuery" .
+
 # A TextDataset is a regular dataset with a text index.
 text:TextDataset  rdfs:subClassOf   ja:RDFDataset .
+
 # Lucene index
 text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
-# Elasticsearch index
-text:TextIndexESrdfs:subClassOf   text:TextIndex .
 
+
 ## ---
-## This URI must be fixed - it's used to assemble the text dataset.
 
 :text_dataset rdf:type text:TextDataset ;
-text:dataset   <#dataset> ;
+text:dataset   :my_dataset ; # <-- replace `:my_dataset` with the desired URI
 text:index <#indexLucene> ;
-.
+.
 
 # A TDB dataset used for RDF storage
-<#dataset> rdf:type  tdb:DatasetTDB ;
-tdb:location "DB" ;
-tdb:unionDefaultGraph true ; # Optional
-.
 
-# Text index description
+:my_dataset rdf:type  tdb:DatasetTDB ;   # <-- replace `:my_dataset` with the desired URI
+tdb:location "/tmp/tdb-dataset/" ;   # <-- replace `/tmp/tdb-dataset/` with your path (`.../fuseki/run/databases/MY_DATASET`)
+#tdb:unionDefaultGraph true ; # Optional
+.
+
+# Text index description (see documentation for other options)
+
 <#indexLucene> a text:TextIndexLucene ;
-text:directory  ;
+text:directory  ;# <-- replace ` with your path` (``)
 text:entityMap <#entMap> ;
-text:storeValues true ; 
+text:storeValues true ;
 text:analyzer [ a text:StandardAnalyzer ] ;
 text:queryAnalyzer [ a text:KeywordAnalyzer ] ;
 text:queryParser text:AnalyzingQueryParser ;
-text:defineAnalyzers [ . . . ] ;
 text:multilingualSupport true ;
- .
+.
 
+# Entity map (see documentation for other options)
+
+<#entMap> a text:EntityMap ;
+text:defaultField