Jena has a text index?

On Wed, 8 Dec 2021 at 10:07, Lorenz Buehmann <
buehm...@informatik.uni-leipzig.de> wrote:

> Even if it's not the string handling that is causing the performance
> issues, using the Jena text index would very likely be more efficient.
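>
> As a rough sketch of how that might look, assuming the dataset has been
> configured with a jena-text (Lucene) index covering the county and parish
> properties. That configuration is an assumption; nothing in this thread
> confirms it. The he: prefix is only shorthand introduced here. Note that
> text:query matches indexed tokens rather than arbitrary substrings, so it
> is not an exact replacement for CONTAINS. Lucene's standard analyzer
> lowercases terms, which is what would make the match case-insensitive.
>
> ```sparql
> PREFIX text: <http://jena.apache.org/text#>
> # Hypothetical shorthand for the schema namespace used in the query.
> PREFIX he:   <http://www.historicengland.org.uk/data/schema/>
>
> SELECT DISTINCT ?s ?name
> WHERE {
>   # Each branch asks the text index for subjects whose value mentions "lewes".
>   { ?s text:query (he:county 'lewes') }
>   UNION
>   { ?s text:query (he:parish 'lewes') }
>   # district would follow the same pattern, assuming it is indexed too.
>   ?s <http://www.historicengland.org.uk/data/schema/simplename/name> ?name .
> }
> LIMIT 10
> ```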
>
> On 08.12.21 10:38, Matt Whitby wrote:
> > Fuseki. No inference. TDB2.
> >
> > M
> >
> > On Wed, 8 Dec 2021 at 09:25, Andy Seaborne <a...@apache.org> wrote:
> >
> >> Lots of questions! Details matter!!
> >>
> >> On 08/12/2021 09:05, Matt Whitby wrote:
> >>> It's hosted in a container in Azure.
> >> (Jena storage layer)
> >>
> >> Using TDB1? TDB2?
> >>
> >>> I test it via Postman (though we're writing a RESTful API to sit on top).
> >> So this is Fuseki? Is there any inference being used?
> >>
> >>       Andy
> >>
> >>> On Wed, 8 Dec 2021 at 09:00, Andy Seaborne <a...@apache.org> wrote:
> >>>
> >>>> Hi Matt,
> >>>>
> >>>> That query does not look couple-of-minutes expensive.
> >>>>
> >>>> Could you run it removing parts to see what happens? e.g. Remove one
> >>>> OPTIONAL and its associated part of the filter.
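> >>>>
> >>>> As an illustration of that kind of reduction, here is a sketch of the
> >>>> query cut down to the county property alone. Which parts to drop first
> >>>> is an arbitrary choice here; the other OPTIONALs and their filter
> >>>> clauses can be added back one at a time while timing each run.
> >>>>
> >>>> ```sparql
> >>>> SELECT ?s ?name
> >>>> WHERE {
> >>>>   ?s <http://www.historicengland.org.uk/data/schema/simplename/name> ?name .
> >>>>   # Only one OPTIONAL kept; district and parish are removed for this run.
> >>>>   OPTIONAL { ?s <http://www.historicengland.org.uk/data/schema/county> ?county } .
> >>>>   # The filter keeps only the clause that matches the remaining OPTIONAL.
> >>>>   FILTER (CONTAINS(LCASE(?county), "lewes"))
> >>>> }
> >>>> LIMIT 10
> >>>> ```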
> >>>>
> >>>> Which storage layer are you using?
> >>>>
> >>>>        Andy
> >>>>
> >>>> On 07/12/2021 20:18, aj...@apache.org wrote:
> >>>>> On Tue, Dec 7, 2021, 1:55 PM Matt Whitby <matt.whi...@gmail.com> wrote:
> >>>>> I dare say running an lcase against each field doesn't help matters, but
> >>>>> with no other way of doing a case-insensitive search (well, Regex - but
> >>>>> who likes Regex?) I'm not sure.
> >>>>>
> >>>>>
> >>>>> On this point alone, if it does turn out that string processing is what
> >>>>> is costing you time, you might adjust your data to include a convenience
> >>>>> property with county, district, and parish in lowercase. Then you could
> >>>>> do a more direct (and cheaper) match.
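> >>>>>
> >>>>> A minimal sketch of how such a property might be materialised with a
> >>>>> one-off SPARQL Update, shown for county only. The he: prefix and the
> >>>>> property name he:countyLower are hypothetical, made up for this
> >>>>> example, and not part of the actual schema.
> >>>>>
> >>>>> ```sparql
> >>>>> # Hypothetical shorthand for the schema namespace.
> >>>>> PREFIX he: <http://www.historicengland.org.uk/data/schema/>
> >>>>>
> >>>>> # Store a lowercased plain-string copy of every county value.
> >>>>> INSERT { ?s he:countyLower ?lc }
> >>>>> WHERE  {
> >>>>>   ?s he:county ?county .
> >>>>>   # STR() drops any language tag before lowercasing.
> >>>>>   BIND (LCASE(STR(?county)) AS ?lc)
> >>>>> }
> >>>>> ```
> >>>>>
> >>>>> A query could then test the stored value directly, for instance with
> >>>>> CONTAINS(?countyLower, "lewes"), without calling lcase on every row.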
> >>>>>
> >>>>> That having been said, it seems unlikely to me that timed-out queries are
> >>>>> due to something as cheap as lowercasing. Have you tried peeling off some
> >>>>> of those OPTIONALs to see how much they cost?
> >>>>>
> >>>>> Adam
> >>>>>
> >>>>>
> >>>>> On Tue, Dec 7, 2021, 1:55 PM Matt Whitby <matt.whi...@gmail.com> wrote:
> >>>>>> I have a SPARQL question if that's okay.
> >>>>>>
> >>>>>> There are only around 8m triples in our test data, so pretty small.
> >>>>>>
> >>>>>> The query takes a good couple of minutes to run (and sometimes just times
> >>>>>> out).
> >>>>>>
> >>>>>> I dare say running an lcase against each field doesn't help matters, but
> >>>>>> with no other way of doing a case-insensitive search (well, Regex - but
> >>>>>> who likes Regex?) I'm not sure.
> >>>>>>
> >>>>>> Any obvious ways to make it less bad?
> >>>>>>
> >>>>>> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> >>>>>> select ?s ?name
> >>>>>> where {
> >>>>>>
> >>>>>> ?s <http://www.historicengland.org.uk/data/schema/simplename/name> ?name .
> >>>>>> OPTIONAL {?s <http://www.historicengland.org.uk/data/schema/county> ?county}.
> >>>>>> OPTIONAL {?s <http://www.historicengland.org.uk/data/schema/district> ?district}.
> >>>>>> OPTIONAL {?s <http://www.historicengland.org.uk/data/schema/parish> ?parish}.
> >>>>>>
> >>>>>> FILTER (CONTAINS(lcase(?county),"lewes") || CONTAINS(lcase(?district),"lewes")
> >>>>>>         || CONTAINS(lcase(?parish),"lewes"))
> >>>>>>
> >>>>>> }
> >>>>>> limit 10
> >>>>>>
> >>>
> >
>


-- 
Matt
Southend. Essex, England
