Jena has a text index?

On Wed, 8 Dec 2021 at 10:07, Lorenz Buehmann <buehm...@informatik.uni-leipzig.de> wrote:
> Even if it's not the strings leading to performance issues, using the
> Jena text index might definitely be more efficient
>
> On 08.12.21 10:38, Matt Whitby wrote:
> > Fuseki. No inference. TDB2.
> >
> > M
> >
> > On Wed, 8 Dec 2021 at 09:25, Andy Seaborne <a...@apache.org> wrote:
> >
> >> Lots of questions! Details matter!!
> >>
> >> On 08/12/2021 09:05, Matt Whitby wrote:
> >>> It's hosted in a container in Azure.
> >> (Jena storage layer)
> >>
> >> Using TDB1? TDB2?
> >>
> >>> I test it via Postman (though we're writing a RESTful API to sit on top).
> >> So this is Fuseki? Is there any inference being used?
> >>
> >> Andy
> >>
> >>> On Wed, 8 Dec 2021 at 09:00, Andy Seaborne <a...@apache.org> wrote:
> >>>
> >>>> Hi Matt,
> >>>>
> >>>> That query does not look couple-of-minutes expensive.
> >>>>
> >>>> Could you run it removing parts to see what happens? e.g. Remove one
> >>>> OPTIONAL and its associated part of the filter.
> >>>>
> >>>> Which storage layer are you using?
> >>>>
> >>>> Andy
> >>>>
> >>>> On 07/12/2021 20:18, aj...@apache.org wrote:
> >>>>> On Tue, Dec 7, 2021, 1:55 PM Matt Whitby <matt.whi...@gmail.com> wrote:
> >>>>> I dare say running an lcase against each field doesn't help matters, but with
> >>>>> no other way of doing a case-insensitive search (well, Regex - but who likes
> >>>>> Regex?) I'm not sure.
> >>>>>
> >>>>>
> >>>>> On this point alone, if it does turn out that string processing is what is
> >>>>> costing you time, you might adjust your data to include a convenience
> >>>>> property with county, district, and parish in lowercase. Then you could do
> >>>>> a more direct (and cheaper) match.
> >>>>>
> >>>>> That having been said, it seems unlikely to me that timed-out queries are
> >>>>> due to something as cheap as lowercasing. Have you tried peeling off some
> >>>>> of those OPTIONALs to see how much they cost?
> >>>>>
> >>>>> Adam
> >>>>>
> >>>>>
> >>>>> On Tue, Dec 7, 2021, 1:55 PM Matt Whitby <matt.whi...@gmail.com> wrote:
> >>>>>> I have a Sparql question if that's okay.
> >>>>>>
> >>>>>> There are only around 8m triples in our test data, so pretty small.
> >>>>>>
> >>>>>> The query takes a good couple of minutes to run (and sometimes just times
> >>>>>> out).
> >>>>>>
> >>>>>> I dare say running an lcase against each field doesn't help matters, but
> >>>>>> with no other way of doing a case-insensitive search (well, Regex - but who
> >>>>>> likes Regex?) I'm not sure.
> >>>>>>
> >>>>>> Any obvious ways to make it less bad?
> >>>>>>
> >>>>>> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> >>>>>> select ?s ?name
> >>>>>> where {
> >>>>>>
> >>>>>> ?s <http://www.historicengland.org.uk/data/schema/simplename/name> ?name .
> >>>>>> OPTIONAL {?s <http://www.historicengland.org.uk/data/schema/county> ?county}.
> >>>>>> OPTIONAL {?s <http://www.historicengland.org.uk/data/schema/district/> ?district}.
> >>>>>> OPTIONAL {?s <http://www.historicengland.org.uk/data/schema/parish> ?parish}.
> >>>>>>
> >>>>>> FILTER (CONTAINS(lcase(?county),"lewes") || CONTAINS(lcase(?district),"lewes") || CONTAINS(lcase(?parish),"lewes"))
> >>>>>>
> >>>>>> }
> >>>>>> limit 10
> >>>>>>
> >>>
> >

--
Matt

Southend. Essex, England

Guff follows....

Me: http://www.about.me/matt.whitby
Photography: http://www.whitbyphoto.com
Travels: http://www.whitbyadventures.com
Music: http://www.last.fm/user/MattWhitby
Reading: https://www.goodreads.com/user_challenges/19398505
Development: https://www.hackerrank.com/matt_whitby
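
For reference, a minimal sketch of how the query might look with the Jena text index that Lorenz mentions. It assumes the Fuseki dataset is wrapped in a text dataset whose entity map indexes the county and parish properties, which nothing in the thread confirms. The he: prefix and the added DISTINCT are introduced here for illustration only. Note that text:query matches analyzed tokens rather than arbitrary substrings, so it is not an exact replacement for CONTAINS(lcase(...), "lewes").

```sparql
PREFIX text: <http://jena.apache.org/text#>
# Assumption: shorthand prefix for the schema namespace used in the original query.
PREFIX he:   <http://www.historicengland.org.uk/data/schema/>

SELECT DISTINCT ?s ?name
WHERE {
  # Each branch asks the Lucene index for subjects whose indexed value
  # contains the token "lewes". With the standard analyzer the match is
  # case-insensitive, so no lcase() call is needed.
  { ?s text:query (he:county "lewes") }
  UNION
  { ?s text:query (he:parish "lewes") }
  # A third branch for the district property would follow the same pattern.

  ?s <http://www.historicengland.org.uk/data/schema/simplename/name> ?name .
}
LIMIT 10
```

The UNION replaces the three OPTIONALs plus the FILTER, so only subjects returned by the index are joined against the name triple. DISTINCT is there because a subject matching in both fields would otherwise appear twice.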