Re: How to set a timeout for QueryExecution correctly?

Cristóbal Miranda Tue, 03 Aug 2021 14:08:11 -0700

Hi Andy,

Q1 seems to have the same problem, I created a script to generate data for
it:
https://github.com/CristobalM/jena-query-timeout/blob/main/scripts/generate_q1_optional.py


It gets stuck in getPlan() when there are no matches for the OPTIONAL
pattern
having ?var5. I tried first with data having that and the timeout worked,
for example,
at 2 seconds.

I'll send you the Visualvm snapshot,


On Mon, 2 Aug 2021 at 06:47, Andy Seaborne <[email protected]> wrote:

> Hi there,
>
> Good catch - I can make that happen on my system.
>
> Recorded as JENA-2139
>
> The fix looks quite simple - delay making the index table until first
> use - which works from testing so far. PR soon.
>
> That explains Q2 and Q3 but not Q1 (no MINUS!).
>
> The same work-during-plan building may be happening - not in the group
> (that is a delayed iterator).
>
> If Q1 returns zero results, then the problem isn't in the sorting code
> code itself (that can spend a long time during which it is not reading
> query iterators but sorting zero rows is quite quick) but could be in
> the need to evaluate the pattern. e.g.  FILTER ( ! bound(?var5) ) is
> always false.
>
> Removing parts of the query and seeing what happens may point to the
> culprit.
>
>  > and got the result in the attachment.
>
> Sorry - no attachments on the list. If you still have it, could you send
> it directly to me please?
>
>      Andy
>
> https://issues.apache.org/jira/browse/JENA-2139
> <https://issues.apache.org/jira/browse/JENA-2139>
>
> On 02/08/2021 03:56, Cristóbal Miranda wrote:
> > Hi Andy,
> >
> > Thanks for the clarifications.
> >
> > I have a program to run a query with timeout from a tdb dataset
> > and a script to generate an nt file for the second query. Because it is
> > the simplest among the three, I decided to start with that one. The code
> > is here https://github.com/CristobalM/jena-query-timeout
> > <https://github.com/CristobalM/jena-query-timeout>.
> >
> > I think I have spotted a problem using that java code. I put that into a
> > new module in the jena project and modified QueryExecutionBase with
> > some prints to see that everything is being called. My initial guess
> > to do
> > this was that there is a concurrency problem with the TimeoutCallback
> > and cancel() was never being called on the iterator and from what I saw
> > that seems to be the case.
> >
> > So what happens is that queryIterator = getPlan().iterator()
> > at QueryExecutionBase:558 is taking some time to finish and until then,
> > queryIterator will remain null. Then, if the timeout occurs before that,
> > queryIterator.cancel() won't be called because to do that there is
> > a check that queryIterator is not null (QueryExecutionBase:189).
> >
> > Using `python3 generate_minus_1.py dataset.nt 10000000 9000000`
> > for the dataset (and loaded with tdb.tdbloader2) and setting a timeout
> > of 30 seconds, the iterator is not cancelled, but with 60 seconds it is
> > cancelled.
> >
> > I tried using 120 seconds of timeout for the full wikidata dataset and
> > the Q1
> > won't stop after that time, so I can't be sure if the same is happening
> > there. I will continue with that query and see if it is the same and
> > let you know what happens.
> >
> > Also, I did a sampling with VisualVM, using the Q2 and the dataset
> > mentioned above and got the result in the attachment.
> >
> >
> >
> >
> > On Fri, 30 Jul 2021 at 16:33, Andy Seaborne <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> >
> >
> >     On 30/07/2021 18:04, Cristóbal Miranda wrote:
> >     >> For Q3, does "LIMIT 1" make any difference?
> >     >
> >     >
> >     > No, I'm trying it now and it is taking several minutes at least.
> >     >
> >     >
> >     >> except different cardinality. Because ?var5 isn't used
> >     eslewhere, is
> >     >> that waht was meant here?
> >     >
> >     >
> >     > I'm also not too sure about the meaning of the queries in
> >     general, I'm just
> >     > using
> >     > a large dataset of anonymized queries. I took one of the
> >     > files from here
> >     https://iccl.inf.tu-dresden.de/web/Wikidata_SPARQL_Logs/en
> >     <https://iccl.inf.tu-dresden.de/web/Wikidata_SPARQL_Logs/en>
> >     > (they say those are from Wikidata's users) and executed a script
> >     to filter
> >     > out some types of queries, the result of that is a file with 27
> >     million
> >     > queries.
> >     > But to answer your question about this specific query, since this
> is
> >     > Wikidata,
> >     > what I'd do is to run the query on https://query.wikidata.org/
> >     <https://query.wikidata.org/>, and infer
> >     > the meaning
> >     > of the variables, ?var1 is some person and P570 date of death,
> >     etc... and
> >     > from the filters it seems
> >     > that they want to find those who died in the last 30 days and
> >     were born
> >     > after 1954. ?var5
> >     > usage seems to indicate that they want to find those whose cause
> >     of death
> >     > is not
> >     > known (because it's an OPTIONAL and !BOUND would filter out
> >     those where
> >     > there is a cause of death).
> >     > In this case it is reasonable that I get no results because the
> >     dataset I'm
> >     > using is older than 30 days.
> >     > So yes, you are right, they are equivalent for this query and
> >     with NOT
> >     > EXISTS would
> >     > probably be faster. But still, what I'd like to have is a way to
> >     stop the
> >     > processing
> >     > after a few seconds consistently, independent of the query.
> >     >
> >     > I just did quick peek at the code and found that, correct me if
> >     I'm wrong,
> >     > the cancellation of
> >     > a query is propagated in a top down manner for the
> >     QueryIterators, in a way
> >     > that to notice
> >     > that the query was aborted you have to get the next result (I'm
> >     looking
> >     > specifically at QueryIterRepeatApply.java:65),
> >     > meanwhile the query might be stuck at some heavy processing in a
> >     lower
> >     > level, which might be the case for
> >     > nested OPTIONALs.
> >
> >     Yes - that the mechanism but nested OPTIONALs are optimized to
> >     "(condition)" which incrementally reads iterators.
> >
> >     In a query plan, ("qparse --print=opt") it shows (condition) not
> >     (leftjoin).
> >
> >     The (minus) operator may be reading all one side and then missing the
> >     timeout but it is still advancing one side (the left)
> >
> >     So trying the FILTER NOT EXISTS version of one of the problem queries
> >     will give us some help in identifying if the problem is MINUS or not.
> >
> >     > In the case of QueryIterMinus I see that in the constructor you
> >     have to
> >     > create the index for the right iterator, but
> >     > that index creation is not stoppable. (On a side note, did you
> >     try doing
> >     > QueryIterMinus similar to QueryIterOptionalIndex,
> >     > using QueryIterRepeatApply? I just have the impression that
> >     running that
> >     > subquery with left values bound and checking
> >     > for only one result could be faster, but this is just a
> >     suggestion in case
> >     > you haven't tried, don't take this
> >     > too seriously).
> >
> >     The right-hand-side index building is once per operator so gets
> >     reuses
> >     across the whole of traversing the left.
> >
> >     IndexFactory.create is reading a query iterator.
> >
> >     So on the surface, query iterators are being read and the time
> >     should be
> >     seen.
> >
> >     Unless it hits the LinearIndex case which looks expensive.
> >
> >     >
> >     > If you consider that those are things to be fixed and don't have
> >     it in high
> >     > priority I could give it a try and send a PR,
> >     > if it's ok with you.
> >
> >     That would be excellent but let's establish the problem first to
> >     know it
> >     is going to improve things.
> >
> >     If you can point VisualVM/YourKit/... at a slow query and identify
> >     the
> >     code hot spots, that would be perfect.
> >
> >     A small example dataset would also be good so we have a minimal,
> >     executable example.
> >
> >     >
> >     > Sorry if I seem too intrusive here, but I'm really interested in
> >     having the
> >     > timeout work well for any query!
> >
> >     Not at all - it is part of the point of open source.
> >
> >          Andy
> >
> >     >
> >     >
> >     >
> >     > On Fri, 30 Jul 2021 at 04:47, Andy Seaborne <[email protected]
> >     <mailto:[email protected]>> wrote:
> >     >
> >     >> Hi Cristóbal,
> >     >>
> >     >> I can't tell from just the queries.
> >     >>
> >     >> You say the first query has zero results so it is not the ORDER
> BY.
> >     >>
> >     >> The queries themselves may be inefficient and changing the
> >     order may
> >     >> have a significant effect (all data dependent).
> >     >>
> >     >> All three do not use FILTER NOT EXISTS where they could. That
> >     may well
> >     >> be faster.
> >     >>
> >     >> Q1 uses OPTIONAL !BOUND (see inline), Q2 and Q3 use MINUS.
> >     >>
> >     >> For Q3, does "LIMIT 1" make any difference?
> >     >>
> >     >>       Andy
> >     >>
> >     >> On 29/07/2021 15:57, Cristóbal Miranda wrote:
> >     >>> Hi Andy,
> >     >>>
> >     >>> I have three queries in which this happens:
> >     >>>
> >     >>> 1.
> >     >>>
> >     >>> SELECT ?var1  ?var2  ?var3  ?var4
> >     >>>> WHERE {
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P570
> >     <http://www.wikidata.org/prop/direct/P570>> ?var2 .
> >     >>>>    FILTER (  ( (  ?var2  > "2016-07-30T00:00:00Z"^^<
> >     >>>> http://www.w3.org/2001/XMLSchema#dateTime
> >     <http://www.w3.org/2001/XMLSchema#dateTime>> ) )
> >     >>>> ) .
> >     >>>>    FILTER (  ( (  ?var2  >  (  NOW (  ) - "P32D"^^<
> >     >>>> http://www.w3.org/2001/XMLSchema#duration
> >     <http://www.w3.org/2001/XMLSchema#duration>> )  ) && (  ?var2  <
> NOW
> >     >> (  )
> >     >>>>    ) )
> >     >>>> ) .
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P31
> >     <http://www.wikidata.org/prop/direct/P31>> <
> >     >>>> http://www.wikidata.org/entity/Q5
> >     <http://www.wikidata.org/entity/Q5>> .
> >     >>
> >     >> ---------
> >     >>>>    OPTIONAL {
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P1196
> >     <http://www.wikidata.org/prop/direct/P1196>> ?var5 .
> >     >>>>    }
> >     >>>>    FILTER (  ( !( BOUND (  ?var5  ) ) )
> >     >>
> >     >> same as
> >     >>
> >     >> FILTER
> >     >> NOT EXISTS { ?var1 <http://www.wikidata.org/prop/direct/P1196
> >     <http://www.wikidata.org/prop/direct/P1196>> ?var5 }
> >     >>
> >     >> except different cardinality. Because ?var5 isn't used
> >     eslewhere, is
> >     >> that waht was meant here?
> >     >>
> >     >>>> ) .
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P569
> >     <http://www.wikidata.org/prop/direct/P569>> ?var6 .
> >     >>>>    FILTER (  ( (  ?var6  > "1954-12-31T00:00:00Z"^^<
> >     >>>> http://www.w3.org/2001/XMLSchema#dateTime
> >     <http://www.w3.org/2001/XMLSchema#dateTime>> ) )
> >     >>>> ) .
> >     >>>>    OPTIONAL {
> >     >>>>     ?var1  <http://wikiba.se/ontology#statements
> >     <http://wikiba.se/ontology#statements>> ?var3 .
> >     >>>>    }
> >     >>>>    OPTIONAL {
> >     >>>>     ?var1  <http://wikiba.se/ontology#sitelinks
> >     <http://wikiba.se/ontology#sitelinks>> ?var4 .
> >     >>>>    }
> >     >>>> }
> >     >>>> ORDER BY  DESC( ?var4 ) DESC( ?var2 )ASC( ?var1 )
> >     >>>>
> >     >>>>
> >     >>> 2.
> >     >>>
> >     >>>> SELECT ?var1  ?var2
> >     >>>> WHERE {
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P1889
> >     <http://www.wikidata.org/prop/direct/P1889>> ?var2 .
> >     >>>>     MINUS   {
> >     >>>>       ?var1  <http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> >     <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>> <
> >     >>>> http://wikiba.se/ontology#Property
> >     <http://wikiba.se/ontology#Property>> .
> >     >>>>     }
> >     >>>> }
> >     >>>>
> >     >>>>
> >     >>> 3.
> >     >>>
> >     >>>> SELECT DISTINCT ?var1  ?var2  ?var3  ?var4
> >     >>>> WHERE {
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P569
> >     <http://www.wikidata.org/prop/direct/P569>> ?var3 .
> >     >>>>    FILTER (  ( (  ?var3  > "1956-01-01T00:00:00Z"^^<
> >     >>>> http://www.w3.org/2001/XMLSchema#dateTime
> >     <http://www.w3.org/2001/XMLSchema#dateTime>> ) && (  ?var3  <
> >     >>>>
> >     "1957-01-01T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime
> >     <http://www.w3.org/2001/XMLSchema#dateTime>>
> >     >> ) )
> >     >>>> ) .
> >     >>>>    OPTIONAL {
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P570
> >     <http://www.wikidata.org/prop/direct/P570>> ?var5 .
> >     >>>>    }
> >     >>>>    FILTER (  ( !( BOUND (  ?var5  ) ) )
> >     >>>> ) .
> >     >>>>     ?var2  <http://www.wikidata.org/prop/direct/P569
> >     <http://www.wikidata.org/prop/direct/P569>> ?var3 .
> >     >>>>    OPTIONAL {
> >     >>>>     ?var2  <http://www.wikidata.org/prop/direct/P570
> >     <http://www.wikidata.org/prop/direct/P570>> ?var6 .
> >     >>>>    }
> >     >>>>    FILTER (  ( !( BOUND (  ?var6  ) ) )
> >     >>>> ) .
> >     >>>>     ?var1  <http://www.w3.org/2000/01/rdf-schema#label
> >     <http://www.w3.org/2000/01/rdf-schema#label>> ?var4 .
> >     >>>>     ?var2  <http://www.w3.org/2000/01/rdf-schema#label
> >     <http://www.w3.org/2000/01/rdf-schema#label>> ?var4 .
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P31
> >     <http://www.wikidata.org/prop/direct/P31>> <
> >     >>>> http://www.wikidata.org/entity/Q5
> >     <http://www.wikidata.org/entity/Q5>> .
> >     >>>>     ?var2  <http://www.wikidata.org/prop/direct/P31
> >     <http://www.wikidata.org/prop/direct/P31>> <
> >     >>>> http://www.wikidata.org/entity/Q5
> >     <http://www.wikidata.org/entity/Q5>> .
> >     >>>>     ?var1  <http://www.wikidata.org/prop/direct/P21
> >     <http://www.wikidata.org/prop/direct/P21>> ?var7 .
> >     >>>>     ?var2  <http://www.wikidata.org/prop/direct/P21
> >     <http://www.wikidata.org/prop/direct/P21>> ?var7 .
> >     >>>>    FILTER (  ( (  STR (  ?var1  )  <  STR (  ?var2  )  ) )
> >     >>>> ) .
> >     >>>>     MINUS   {
> >     >>>>       ?var1  <http://www.wikidata.org/prop/direct/P7
> >     <http://www.wikidata.org/prop/direct/P7>> ?var2 .
> >     >>>>     }
> >     >>>>     MINUS   {
> >     >>>>       ?var1  <http://www.wikidata.org/prop/direct/P9
> >     <http://www.wikidata.org/prop/direct/P9>> ?var2 .
> >     >>>>     }
> >     >>>>     MINUS   {
> >     >>>>       ?var1  <http://www.wikidata.org/prop/direct/P1889
> >     <http://www.wikidata.org/prop/direct/P1889>> ?var2 .
> >     >>>>     }
> >     >>>>     MINUS   {
> >     >>>>       ?var1  <http://www.wikidata.org/prop/direct/P460
> >     <http://www.wikidata.org/prop/direct/P460>> ?var2 .
> >     >>>>     }
> >     >>>> }
> >     >>>> LIMIT 500
> >     >>>
> >     >>>
> >     >>>
> >     >>> The first one is taking 141 minutes,  the second 32 minutes
> >     and the third
> >     >>> is still running.
> >     >>> I have run about 1200 queries, where 38 times the exception
> >     was thrown,
> >     >> but
> >     >>> interestingly not
> >     >>> in the first two queries, where I got 0 and 575925 results
> >     respectively.
> >     >>>
> >     >>> I'm using jena 4.1.0.
> >     >>>
> >     >>>
> >     >>>
> >     >>> On Thu, 29 Jul 2021 at 05:27, Andy Seaborne <[email protected]
> >     <mailto:[email protected]>> wrote:
> >     >>>
> >     >>>> Hi Cristóbal,
> >     >>>>
> >     >>>> What's the query and which version of jena is this?
> >     >>>>
> >     >>>>        Andy
> >     >>>>
> >     >>>> On 28/07/2021 19:39, Cristóbal Miranda wrote:
> >     >>>>> Hello everyone,
> >     >>>>>
> >     >>>>> I'm trying to run a sequence of queries with TDB, using a
> >     >>>>> locally loaded dataset. I don't want to wait more than a few
> >     >>>>> seconds for each query to finish. My attempt to do this
> >     looks like
> >     >>>>> the following:
> >     >>>>>
> >     >>>>>>
> >     >>>>>> try {
> >     >>>>>>      RDFConnection rdfConnection =
> >     >> RDFConnectionFactory.connect(dataset);
> >     >>>>>>      QueryExecution queryExecution =
> >     rdfConnection.query(query);
> >     >>>>>> queryExecution.setTimeout(timeoutMilliseconds);
> >     >>>>>>      ResultSet resultSet = queryExecution.execSelect();
> >     >>>>>>      while (resultSet.hasNext()) {
> >     >>>>>>        QuerySolution querySolution = resultSet.next();
> >     >>>>>>        ...
> >     >>>>>>      }
> >     >>>>>> } catch (QueryCancelledException e) {
> >     >>>>>>       ...
> >     >>>>>> }
> >     >>>>>
> >     >>>>>
> >     >>>>> The problem is that this is not working. With htop I see that
> >     >>>>> the process gets stuck in disk operations. One of the queries
> >     >>>>> took about 2 hours with the code above. An idea would
> >     >>>>> be trying to run this in a new thread and stopping the
> >     thread outside
> >     >>>> once
> >     >>>>> the timeout is reached, but I'm almost sure this wouldn't be
> >     a safe
> >     >>>>> way to stop the processing, even if it worked.
> >     >>>>>
> >     >>>>> Is there a better way to do this?
> >     >>>>>
> >     >>>>> Cristobal
> >     >>>>>
> >     >>>>
> >     >>>
> >     >>
> >     >
> >
>

Re: How to set a timeout for QueryExecution correctly?

Reply via email to