Thanks, I'll change that to something better once I have a working setup.

On 22/01/2019 17:28, vincent ventresque wrote:
This may be unrelated to your problem, but I noticed that your index is in a /tmp directory. I don't know whether your /tmp directory is erased on reboot, but if it is, you may have trouble when restarting Fuseki:

text:directory <file:/tmp/tdb-lucene-index>
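
A minimal sketch of that suggestion: point the index description at a directory that survives reboots. The path below is borrowed from Chris's sample configuration elsewhere in this thread, and it is only an assumption that it suits this installation. After changing the directory, the index would need to be rebuilt in the new location.

```turtle
@prefix text: <http://jena.apache.org/text#> .

# Text index kept outside /tmp so it is still there after a reboot.
# The directory is an assumption taken from Chris's sample config.
<#indexLucene> a text:TextIndexLucene ;
     text:directory <file:/home/text/tools/apache-jena-fuseki-3.9.0/run/lucene_index> ;
     text:entityMap <#entMap> ;
     .
```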

On 22/01/2019 at 16:21, Mikael Pesonen wrote:

Sorry, I had issues with permissions. Now I can load the configuration from the command line:

/usr/bin/java -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties -Xmx5600M -jar fuseki-server.jar --update --port 3030 --config=run/config.ttl

and Indexer works
...
INFO  2060054 (32699 per second) properties indexed

and SPARQL text search works. The only remaining issue is that the Skosmos application (and possibly some others) has now stopped working. The Fuseki log says:

Fuseki     INFO  [2] 404 Not found: dataset='ds' service='sparql' (1 ms)

Here is the config where I updated tdb:location:

@prefix :       <http://localhost/jena_example/#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:    <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:   <http://jena.apache.org/text#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .

## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "org.apache.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .


## ---------------------------------------------------------------


:text_dataset rdf:type     text:TextDataset ;
     text:dataset   :my_dataset ;
     text:index     <#indexLucene> ;
     .

# A TDB dataset used for RDF storage
:my_dataset rdf:type      tdb:DatasetTDB ;
     tdb:location "/home/text/tools/jena_data_test/" ;
#    tdb:unionDefaultGraph true ; # Optional
     .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
     text:directory <file:/tmp/tdb-lucene-index> ;
     text:entityMap <#entMap> ;
     text:storeValues true ;
     text:analyzer [ a text:StandardAnalyzer ] ;
     text:queryAnalyzer [ a text:KeywordAnalyzer ] ;
     text:queryParser text:AnalyzingQueryParser ;
     text:multilingualSupport true ;
  .

<#entMap> a text:EntityMap ;
     text:defaultField     "label" ;
     text:entityField      "uri" ;
     text:uidField         "uid" ;
     text:langField        "lang" ;
     text:graphField       "graph" ;
     text:map (
          [ text:field "label" ;
            text:predicate skos:prefLabel ]
          ) .

<#service> rdf:type fuseki:Service ;
     fuseki:name                     "/ds" ;      # http://host:port/ds
     fuseki:serviceQuery             "query" ;    # SPARQL query service
     fuseki:serviceReadGraphStore    "data" ;     # SPARQL Graph store protocol (read only)
     fuseki:dataset           :text_dataset ;
     .
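
The 404 above names service='sparql', while this service block only declares "query". A minimal sketch, assuming Skosmos (or the other clients) sends its queries to /ds/sparql: declare both endpoint names on the same service. Everything else stays as in the config above.

```turtle
@prefix :       <http://localhost/jena_example/#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .

<#service> rdf:type fuseki:Service ;
     fuseki:name                     "/ds" ;
     fuseki:serviceQuery             "sparql" ;   # answers /ds/sparql (assumed to be what the client calls)
     fuseki:serviceQuery             "query" ;    # keeps /ds/query working
     fuseki:serviceReadGraphStore    "data" ;     # SPARQL Graph store protocol (read only)
     fuseki:dataset                  :text_dataset ;
     .
```

If the clients also use other endpoint names, the same approach applies: add one line per name they actually request, as reported in the Fuseki log.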



On 22/01/2019 17:01, Lorenz B. wrote:
Can't reproduce this. With the config file I shared with you, it works
for me as expected with Fuseki 3.10.0

Which config file do you use now?

On 22/01/2019 16:17, vincent ventresque wrote:
Hello

If the configuration folder is empty, the run/config.ttl (default
file) should be the one that Fuseki uses : you should try to replace
the content of run/config.ttl with the one Chris has sent in previous
msg

That worked!

So now with the example Lorenz posted I get

  java -cp ./fuseki-server.jar jena.textindexer --desc=run/config.ttl
org.apache.jena.sparql.ARQException: No such type: <http://jena.apache.org/text#TextDataset>
         at org.apache.jena.sparql.core.assembler.AssemblerUtils.build(AssemblerUtils.java:122)
         at org.apache.jena.query.text.TextDatasetFactory.create(TextDatasetFactory.java:38)
         at jena.textindexer.processModulesAndArgs(textindexer.java:90)
         at jena.cmd.CmdArgModule.process(CmdArgModule.java:52)
         at jena.cmd.CmdMain.mainMethod(CmdMain.java:92)
         at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
         at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
         at jena.textindexer.main(textindexer.java:52)


@prefix :       <http://localhost/jena_example/#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:    <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:   <http://jena.apache.org/text#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .

## Example of a TDB dataset and text index
## Initialize TDB
[] ja:loadClass "org.apache.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .


## ---------------------------------------------------------------


:text_dataset rdf:type     text:TextDataset ;
     text:dataset   :my_dataset ;
     text:index     <#indexLucene> ;
     .

# A TDB dataset used for RDF storage
:my_dataset rdf:type      tdb:DatasetTDB ;
     tdb:location "/tmp/tdb-dataset/" ;
#    tdb:unionDefaultGraph true ; # Optional
     .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
     text:directory <file:/tmp/tdb-lucene-index> ;
     text:entityMap <#entMap> ;
     text:storeValues true ;
     text:analyzer [ a text:StandardAnalyzer ] ;
     text:queryAnalyzer [ a text:KeywordAnalyzer ] ;
     text:queryParser text:AnalyzingQueryParser ;
     text:multilingualSupport true ;
  .

<#entMap> a text:EntityMap ;
     text:defaultField     "label" ;
     text:entityField      "uri" ;
     text:uidField         "uid" ;
     text:langField        "lang" ;
     text:graphField       "graph" ;
     text:map (
          [ text:field "label" ;
            text:predicate skos:prefLabel ]
          ) .

<#service> rdf:type fuseki:Service ;
     fuseki:name                     "/ds" ;      # http://host:port/ds
     fuseki:serviceQuery             "query" ;    # SPARQL query service
     fuseki:serviceReadGraphStore    "data" ;     # SPARQL Graph store protocol (read only)
     fuseki:dataset           :text_dataset ;
     .

On 22/01/2019 at 14:58, Mikael Pesonen wrote:
Hi,

We haven't made any configuration files, so the configuration folder
is empty. Everything is working fine except text search, so are there
any advantages to doing a custom config? I understand now that text
search needs the config, and that is what I asked for help with.

On 21/01/2019 19:24, Chris Tomlinson wrote:
Hi,

You’ve presented your system service configuration. We need to see
the configuration file
<http://jena.apache.org/documentation/fuseki2/fuseki-configuration.html>
in:

      $FUSEKI_BASE/configuration/text_svc.ttl

It is a ttl file that is interpreted by the assembler system to
build the endpoints and dataset components. This is where the
jena-text configuration info will go. Here is a sample
configuration based on info in your service defn:

# Fuseki configuration for BDRC, configures two endpoints:
#   - /bdrc is read-only
#   - /bdrcrw is read-write
#
# This was painful to come up with but the web interface basically allows no option
# and there is no subclass inference by default so such a configuration file is necessary.
#
# The main doc sources are:
#  - https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
#  - https://jena.apache.org/documentation/assembler/assembler-howto.html
#  - https://jena.apache.org/documentation/assembler/assembler.ttl
#
# See https://jena.apache.org/documentation/fuseki2/fuseki-layout.html for the destination of this file.

@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text: <http://jena.apache.org/text#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix :        <http://base/#> .

[] rdf:type fuseki:Server ;
     fuseki:services (
       :text_svc
     ) .

:text_svc rdf:type fuseki:Service ;
      fuseki:name                       "text_svc" ;  # name of the dataset in the url
      fuseki:serviceQuery               "query" ;     # SPARQL query service
      fuseki:serviceUpdate              "update" ;    # SPARQL update service
      fuseki:serviceUpload              "upload" ;    # Non-SPARQL upload service
      fuseki:serviceReadWriteGraphStore "data" ;      # SPARQL Graph store protocol (read and write)
      fuseki:dataset :text_dataset ;
      .

:text_dataset rdf:type     text:TextDataset ;
      text:dataset   :tdb_dataset ;
      text:index     :lucene_index ;
      .

# using TDB
:tdb_dataset rdf:type tdb:DatasetTDB ;
       tdb:location "/home/text/tools/apache-jena-fuseki-3.9.0/run/tdb_dataset" ;
       tdb:unionDefaultGraph true ;
       .

# Text index description
:lucene_index a text:TextIndexLucene ;
      text:directory <file:/home/text/tools/apache-jena-fuseki-3.9.0/run/lucene_index> ;
      text:storeValues true ;
      text:entityMap :entitymap ;
      .

# Index mappings
:entitymap a text:EntityMap ;
      text:entityField      "uri" ;
      text:uidField         "uid" ;
      text:defaultField     "label" ;
      text:langField        "lang" ;
      text:graphField       "graph" ; ## enable graph-specific indexing
      text:map (
           [ text:field "label" ;
             text:predicate skos:prefLabel ]
           ) ;
      .
hth,
Chris




On Jan 21, 2019, at 5:21 AM, Mikael Pesonen
<mikael.peso...@lingsoft.fi> wrote:


Hi,


On 18/01/2019 18:13, Chris Tomlinson wrote:
Hi,

1) If you’re using a default config, it does not have a working
jena-text configuration. The config will need to include
skos:prefLabel in the entity map.

2) when you change the jena-text in significant ways, such as
changing what analyzer is used for a given property and so on,
then you’ll need to rebuild the Lucene index via reloading the
dataset or using the textIndexer
<https://jena.apache.org/documentation/query/text-query.html#building-a-text-index>.
I don’t recall this being mentioned as part of your testing

3) Please indicate exactly which item you’re using
jena-fuseki-war-3.9.0.war or jena-fuseki-webapp-3.9.0.jar etc,
and the config file itself. The error you’ve mentioned previously:
We are running Fuseki as service

-----
[Unit]
Description=Apache Jena Fuseki

[Service]
Type=simple
User=fuseki
#Environment=JAVA_HOME=/usr/lib/jvm/java-8-oracle/
Environment=FUSEKI_HOME=/home/text/tools/apache-jena-fuseki-3.9.0
Environment=FUSEKI_BASE=/home/text/tools/apache-jena-fuseki-3.9.0/run
ExecStart=/usr/bin/java -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties -Xmx5600M -jar /home/text/tools/apache-jena-fuseki-3.9.0/fuseki-server.jar --update --port 3030 --loc=/home/text/tools/jena_data_test/ /ds

[Install]
WantedBy=multi-user.target
-----

All settings are default otherwise, we haven't changed any config
file.

Are there some minimal settings to this example config so that I
could get skos:prefLabel working?

https://jena.apache.org/documentation/query/text-query.html#configuration


So when we have a working configuration/assembler file, is building
the index all that is needed?

java -cp $FUSEKI_HOME/fuseki-server.jar jena.textindexer --desc=assembler_file


Thanks, everyone, for the help.
Jan 17 17:00:28 semantic-dev java[16800]: [2019-01-17 17:00:28] Config     INFO  Load configuration: file:///home/text/tools/apache-jena-fuseki-3.9.0/run/configuration/text_index.ttl

Jan 17 17:00:28 semantic-dev java[16800]: [2019-01-17 17:00:28] WebAppContext WARN  Failed startup of context o.e.j.w.WebAppContext@4159e81b{Apache Jena Fuseki Server,/,file:///home/text/tools/apache-jena-fuseki-3.9.0/webapp/,UNAVAILABLE}

Jan 17 17:00:28 semantic-dev java[16800]: at org.apache.jena.fuseki.build.FusekiConfig.readAssemblerFile(FusekiConfig.java:148)
suggests to me that something in the config file is confusing
readAssemblerFile. It doesn’t look like it’s failing while reading
the jena-text portion of the config.

If http://api.finto.fi/download/mesh/mesh-skos.ttl is the dataset,
then can you cut it down to just a small test case with some
concepts with “medi” and a few without? That along with the other
information should help move this further along.

4) Your query:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX text: <http://jena.apache.org/text#>
SELECT *
WHERE
{
    GRAPH <http://www.yso.fi/onto/mesh/>
    {
      ?concept text:query (skos:prefLabel "medi") .
      ?concept skos:prefLabel ?prefLabel .

      # FILTER (  REGEX(?prefLabel, "\\bmedi", "i"))
    }
}
limit 10
might effectively just be executing:

?concept skos:prefLabel ?prefLabel .
if there is actually no jena-text config. I haven’t checked what
happens when there is no TextIndex configured and text:query is
invoked, but it may be a no-op.

Thanks,
Chris


On Jan 18, 2019, at 8:08 AM, Mikael Pesonen
<mikael.peso...@lingsoft.fi> wrote:



On 18/01/2019 13:40, Andy Seaborne wrote:
On 17/01/2019 15:45, Mikael Pesonen wrote:
On 17/01/2019 17:38, Andy Seaborne wrote:
On 17/01/2019 12:51, Mikael Pesonen wrote:
On 17/01/2019 13:58, Andy Seaborne wrote:
On 16/01/2019 12:50, Mikael Pesonen wrote:
Hi,

I'm trying to get text search to work. SPARQL REGEX takes a few
seconds to finish, so I'm hoping this will be faster.
The application is term search using a SKOS ontology.

    First tested if it's enabled by default

    ?concept text:query (skos:prefLabel "medi") .
     ?concept skos:prefLabel ?prefLabel

That returns all concepts so I guess it's not enabled.
If it returns all concepts, the first line matched
(otherwise you get none). If so, there is a text index and
"medi" (case insensitive) matches Lucene rules, everything.
What does this mean then, why is it matching everything?
If zero matches, you don't get to ?concept skos:prefLabel
?prefLabel (if the text index is correct)

The query above, if the index is setup correctly, gets all
concepts where any skos:prefLabel matches "medi" (not just at
the start), then gets all skos:prefLabel for those concepts.
That does not mean ?prefLabel only matches "medi"

:c skos:prefLabel "medi" ;
     skos:prefLabel "Other" .

will return 2 matches including ?prefLabel="Other"
Yes that is how I understood it. But ?concept text:query
(skos:prefLabel "medi")  returns all concepts, also those that
don't have any label having "medi".
Then I don't understand what is going on.

Do you have a complete, minimal example that someone can use to
recreate the situation?

This is the query:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX text: <http://jena.apache.org/text#>
SELECT *
WHERE
{
    GRAPH <http://www.yso.fi/onto/mesh/>
    {
      ?concept text:query (skos:prefLabel "medi") .
      ?concept skos:prefLabel ?prefLabel .

      # FILTER (  REGEX(?prefLabel, "\\bmedi", "i"))
    }
}
limit 10

and graph is dump copied from here: https://finto.fi/mesh/en/
end of page "Download this vocabulary"

So to make clear, we have made zero configuration on
jena/fuseki, all is default from 3.9.0 package.
Andy
--
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation -
Reader's and Writer's Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND
