Re: Fwd: Re: Error during text index

Sorin Gheorghiu Thu, 12 May 2016 11:46:17 -0700

Hi,

The attempt to perform a SPARQL insert using *s-update* failed with the error:

# /opt/apache-jena-fuseki-2.3.1/bin/s-update --service=http://localhost:3030/<dataset>/update --file update.ru

/usr/lib/ruby/1.9.1/net/protocol.rb:141:in `read_nonblock': end of file reached (EOFError)
        from /usr/lib/ruby/1.9.1/net/protocol.rb:141:in `rbuf_fill'
        from /usr/lib/ruby/1.9.1/net/protocol.rb:122:in `readuntil'
        from /usr/lib/ruby/1.9.1/net/protocol.rb:132:in `readline'
        from /usr/lib/ruby/1.9.1/net/http.rb:2563:in `read_status_line'
        from /usr/lib/ruby/1.9.1/net/http.rb:2552:in `read_new'
        from /usr/lib/ruby/1.9.1/net/http.rb:1320:in `block in transport_request'
        from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `catch'
        from /usr/lib/ruby/1.9.1/net/http.rb:1317:in `transport_request'
        from /usr/lib/ruby/1.9.1/net/http.rb:1294:in `request'
        from /opt/apache-jena-fuseki-2.3.1/bin/s-update:221:in `response_no_body'
        from /opt/apache-jena-fuseki-2.3.1/bin/s-update:614:in `SPARQL_update'
        from /opt/apache-jena-fuseki-2.3.1/bin/s-update:681:in `cmd_sparql_update'
        from /opt/apache-jena-fuseki-2.3.1/bin/s-update:708:in `<main>'

The same error occurs with Ruby > 2.0 (but no backtrace is printed):

/opt/apache-jena-fuseki-2.3.1/bin/s-update: end of file reached (EOFError)

Do you have any hint, please?

Thanks
Sorin
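
For comparison, the same update can be sent without s-update as a direct POST, as defined by the SPARQL 1.1 Protocol. The following is only a minimal sketch: the function names, the dataset name "ds" in the usage note and the one-hour default timeout are assumptions for illustration, not anything taken from the Fuseki scripts. If the server closes the connection before answering, this client will fail as well, which would suggest the problem is not specific to the Ruby script.

```python
import urllib.request


def build_update_request(endpoint: str, update_text: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol update request (update sent directly as the POST body)."""
    return urllib.request.Request(
        endpoint,
        data=update_text.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )


def send_update(endpoint: str, update_text: str, timeout_s: float = 3600.0) -> int:
    """Send the update and return the HTTP status code.

    timeout_s is an assumed, deliberately generous client-side socket timeout,
    chosen because the update discussed here can run for hours.
    """
    request = build_update_request(endpoint, update_text)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.status
```

Usage would look like send_update("http://localhost:3030/ds/update", open("update.ru").read()), where "ds" stands in for the real dataset name.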

On 04.05.2016 at 14:54, Andy Seaborne wrote:

Hi there,

This looks like something to do with the Solr setup. I'm not very familiar with Solr; is there some configuration that affects timeouts on connections? I don't think Jena does any timeouts itself.


    Andy

On 03/05/16 08:50, Sorin Gheorghiu wrote:

After Solr server restart, it looks like the indexes aren't corrupted.
Thus, it seems the error isn't critical and I may ignore it.

But my expectation was that the insert command will add the new
parameter to Jena TDB and not to Solr.


-------- Forwarded Message --------
Subject:     Re: Error during text index
Date:     Mon, 2 May 2016 20:05:37 +0200
From:     Sorin Gheorghiu <sorin.gheorg...@uni-konstanz.de>
To:     users@jena.apache.org



Hi Andy,

after 2 attempts to insert the new SKOS variable, I got the following
error:

org.apache.jena.query.text.TextIndexException:
org.apache.solr.client.solrj.SolrServerException: IOException occured
when talking to server at: http://localhost:8983/solr/GND100316_550

...............................................................................................................................



[2016-05-02 19:23:40] Fuseki     INFO  [4] 500
org.apache.solr.client.solrj.SolrServerException: IOException occured
when talking to server at: http://localhost:8983/solr/GND100316_550
(30,147.934 s)

This occurred after more than 8 hours; the update failed before completion.


No related Solr error was printed in the logs at that moment, but
when I refreshed the Solr page http://localhost:8983/solr/#/~cores,
I got:

30852656 INFO  (qtp1013423070-18) [   ] o.a.s.s.HttpSolrCall [admin]
webapp=null path=/admin/info/system params={wt=json&_=1462210386319}
status=0 QTime=1758
30854518 ERROR (qtp1013423070-20) [   ] o.a.s.h.RequestHandlerBase
org.apache.solr.common.SolrException: Error handling 'status' action

...............................................................................................................................


Caused by: java.nio.file.NoSuchFileException:
/opt/solr-5.5.0/server/solr/GND100316_550/data/index/segments_1

Indeed, there is no *segments_1* file in ../data/index/ but a different
one:

# ls -lrt /opt/solr-5.5.0/server/solr/GND100316_550/data/index/segments*
-rw-r--r-- 1 root root 937 May  2 17:42
/opt/solr-5.5.0/server/solr/GND100316_550/data/index/segments_10r

I could provide the backtrace if needed. Could you help me to understand
the root cause please?

Thank you
Sorin


On 29.04.2016 at 12:20, Andy Seaborne wrote:

The use of rdf:type seems to mix being a displayable label and a class
type.

Maybe adding skos:prefLabel to keep the display label is worth doing.

You can extract the fragment from a URI with:

    STRAFTER(STR(<http://example/foo#bar>), "#")

(untested):

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

INSERT { ?s skos:prefLabel ?label }
WHERE {
   ?s a ?T .
   BIND ( STRAFTER(STR(?T), "#") AS ?label )
}
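
To make the effect of STRAFTER concrete, here is a small Python sketch that mimics it for plain strings. The helper name strafter is made up for illustration. Note that a type URI without a "#" yields an empty string, so the update above would presumably need a FILTER if such types exist in the data.

```python
def strafter(text: str, separator: str) -> str:
    """Mimic SPARQL STRAFTER for plain strings.

    Returns the part of text after the first occurrence of separator,
    the whole text if separator is empty, and "" if separator is absent.
    """
    if separator == "":
        return text
    _, found, tail = text.partition(separator)
    return tail if found else ""


# The class URI from this thread gives the label discussed above.
label = strafter(
    "http://d-nb.info/standards/elementset/gnd#SeriesOfConferenceOrEvent", "#"
)
```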


On 29/04/16 09:25, Sorin Gheorghiu wrote:

Hi Osma,

I do need the type in the text index to get faster results than using
sparql queries.

I found an analyzer which could replace the URI with the string type,
but I cannot use it as long as non-literals are skipped.

     <fieldType name="text_type_gnd" class="solr.TextField" >
       <analyzer>
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.PatternReplaceFilterFactory"
                 pattern="http://d-nb.info/standards/elementset/gnd#"
                 replacement="" replace="all" />
       </analyzer>
     </fieldType>

I am still looking for a workaround for this case.

Thanks,
Sorin

On 29.04.2016 at 08:43, Osma Suominen wrote:

Hi Sorin!

Why do you need the type in the text index? The text index is designed
to store literals. It does not know how to handle URIs at all.

Generally what you would do to combine text search with a restriction
on rdf:type is to use separate query patterns, e.g.

{
   ?s text:query 'nuclear' .
   ?s a gndo:SeriesOfConferenceOrEvent .
}

-Osma


On 28/04/16 18:30, Sorin Gheorghiu wrote:

Hi Andy,

I need just the type of the entry; from the example, just the last part
'SeriesOfConferenceOrEvent'.
If possible I would set an analyser which would trim the first part, but
I don't know how.

Thanks
Sorin



On 28.04.2016 at 17:25, Andy Seaborne wrote:

Hi Sorin,

I'm curious as to why you are indexing a URI and what you see the
benefit of that.  You might at least want to set the analyser
carefully.

    Andy

PS I fixed the cause of the "UnsupportedOperationException" but only in the sense that it now issues a warning and skips the non-literal.

The test for being a literal or not was there ... but after calling
getLiteral.


On 28/04/16 15:47, Sorin Gheorghiu wrote:

Hello,

Jena text index returned the following error:

# java -cp /opt/apache-jena-fuseki-2.3.1/fuseki-server.jar
jena.textindexer --desc=/etc/default/fuseki/jena-text-config.ttl
java.lang.UnsupportedOperationException: http://d-nb.info/standards/elementset/gnd#SeriesOfConferenceOrEvent is not a literal node
         at org.apache.jena.graph.Node.getLiteral(Node.java:100)
         at org.apache.jena.query.text.TextQueryFuncs.entityFromQuad(TextQueryFuncs.java:80)
         at org.apache.jena.query.text.TextQueryFuncs.entityFromQuad(TextQueryFuncs.java:67)
         at jena.textindexer.exec(textindexer.java:122)
         at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
         at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
         at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
         at jena.textindexer.main(textindexer.java:51)

when attempting to index entries like:

@prefix gndo: <http://d-nb.info/standards/elementset/gnd#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://d-nb.info/gnd/1-2> gndo:gndIdentifier "1-2" ;
         gndo:variantNameForTheConferenceOrEvent "Conferentie van Niet-Kernwapenstaten" , "Conference on Non-Nuclear Weapon States" ;
         gndo:preferredNameForTheConferenceOrEvent "Conference of Non-Nuclear Weapon States" ;
         a gndo:SeriesOfConferenceOrEvent .

Here is the EntityMap assembler setup:

<#entMap> a text:EntityMap ;
     text:entityField      "gndUri" ;
     text:defaultField     "prefName" ; ## Must be defined in the text:map
     text:map (
          [ text:field "prefName";
            text:predicate gndo:preferredNameForTheSubjectHeading
          ]
          [ text:field "type";
            text:predicate rdf:type
          ]
          ...

'type' contains a URI, but a literal node is expected instead.

It makes no difference whether 'type' is defined as 'text' or 'string' in
the Solr schema.xml.

How can this be fixed?

Thank you in advance,
Sorin


--
Sorin Gheorghiu             Tel: +49 7531 88-3198
Universität Konstanz        Raum: B703
78464 Konstanz              sorin.gheorg...@uni-konstanz.de

- KIM: Abteilung Contentdienste -
