[ https://issues.apache.org/jira/browse/JENA-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424507#comment-17424507 ]
Andy Seaborne edited comment on JENA-2176 at 10/5/21, 8:26 PM:
---------------------------------------------------------------

Finding out that it is a literal is a (potential) disk access as well. The common case is that the subject is valid, so you have the test and the lookup on the performance path.

You can simulate this by inserting a {{FILTER(!isLiteral(?o))}}. Any FILTER use will force the TDB NodeId to be resolved to a concrete RDF term.

Rearranging the BGP probably found a better order for your data - it evaluated with less intermediate work. Without seeing the before and after BGPs and knowing the data distribution, I can't say more.


was (Author: andy.seaborne):
Finding out that it is a literal is a (potential) disk access as well.

You can simulate this by inserting a {{FILTER(!isLiteral(?o))}}. Any FILTER use will force the TDB NodeId to be resolved to a concrete RDF term.

Rearranging the BGP probably found a better order for your data - it evaluated with less intermediate work. Without seeing the before and after BGPs and knowing the data distribution, I can't say more.

> TDB2 queries can execute quadpatterns with a literal in the subject position
> ----------------------------------------------------------------------------
>
> Key: JENA-2176
> URL: https://issues.apache.org/jira/browse/JENA-2176
> Project: Apache Jena
> Issue Type: Question
> Components: TDB2
> Affects Versions: Jena 4.2.0
> Reporter: Justin
> Priority: Major
> Attachments: z.rq, z.ttl
>
>
> Hello,
> If you try to put a triple into a TDB2 with a literal in the subject position
> you get the following:
> {noformat}
> ERROR riot :: [line: 6, col: 18] Subject is not a URI or blank node
> {noformat}
> So far so good.
> But since literals can not be in the subject position of a triple a query
> against a TDB2 should never attempt to find a literal in the subject position
> of a triple, right? It would be a waste of time.
> But if I am reading the logs correctly that is what appears to happen:
> {noformat}
> root@ec6206bb523f:/mnt/tdb_42# cat /mnt/z.ttl
> @prefix ex: <http://example.com/> .
> ex:apple ex:hasPart ex:skin .
> ex:skin ex:hasName "Skin" .
> ex:file ex:hasPart "lala" .
> root@ec6206bb523f:/mnt/tdb_42#
> root@ec6206bb523f:/mnt/tdb_42# cat /mnt/z.rq
> prefix ex: <http://example.com/>
> select * where
> { ?s ex:hasPart ?o . optional { ?o ?p ?o1 . }
> }
>
> root@ec6206bb523f:/mnt/tdb_42# /mnt/apache-jena-4.2.0/bin/tdb2.tdbloader
> --loc=`pwd` /mnt/z.ttl
> 00:31:49 INFO loader :: Loader = LoaderPhased
> 00:31:49 INFO loader :: Start: /mnt/z.ttl
> 00:31:49 INFO loader :: Finished: /mnt/z.ttl: 3 tuples in 0.07s (Avg: 40)
> 00:31:49 INFO loader :: Finish - index SPO
> 00:31:49 INFO loader :: Start replay index SPO
> 00:31:49 INFO loader :: Index set: SPO => SPO->POS, SPO->OSP
> 00:31:49 INFO loader :: Index set: SPO => SPO->POS, SPO->OSP [3 items, 0.0
> seconds]
> 00:31:49 INFO loader :: Finish - index OSP
> 00:31:49 INFO loader :: Finish - index POS
> root@ec6206bb523f:/mnt/tdb_42# /mnt/apache-jena-4.2.0/bin/tdb2.tdbquery -v
> --loc=`pwd` --query=/mnt/z.rq
> 1 PREFIX ex: <http://example.com/>
> 2
> 3 SELECT *
> 4 WHERE
> 5 { ?s ex:hasPart ?o
> 6 OPTIONAL
> 7 { ?o ?p ?o1 }
> 8 }
>
> 00:31:59 INFO exec :: QUERY
> PREFIX ex: <http://example.com/>
>
> SELECT *
> WHERE
>
> { ?s ex:hasPart ?o OPTIONAL { ?o ?p ?o1 }
> }
> 00:31:59 INFO exec :: ALGEBRA
> (conditional
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?s
> <http://example.com/hasPart> ?o))
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?o ?p ?o1)))
> 00:32:00 INFO exec :: TDB
> (conditional
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?s
> <http://example.com/hasPart> ?o))
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?o ?p ?o1)))
> 00:32:00 INFO exec :: Execute :: ?s <http://example.com/hasPart> ?o
> 00:32:00 INFO exec :: TDB
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> <http://example.com/skin>
> ?p ?o1))
> 00:32:00 INFO exec :: Execute :: <http://example.com/skin> ?p ?o1
> 00:32:00 INFO exec :: TDB
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> "lala" ?p ?o1))
> 00:32:00 INFO exec :: Execute :: "lala" ?p ?o1
> --------------------------------------------
> | s        | o       | p          | o1     |
> ============================================
> | ex:apple | ex:skin | ex:hasName | "Skin" |
> | ex:file  | "lala"  |            |        |
> --------------------------------------------
> {noformat}
> Doesn't this:
> {noformat}
> 00:32:00 INFO exec :: TDB
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> "lala" ?p ?o1))
> 00:32:00 INFO exec :: Execute :: "lala" ?p ?o1
> {noformat}
> mean a lookup was done in the TDB2 for a triple with the literal "lala" in
> the subject position? If so, shouldn't lookups like that be ignored as they
> will never find matching triples in the TDB2?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
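
For anyone wanting to try the {{FILTER(!isLiteral(?o))}} suggestion from the comment against the reporter's z.rq, a minimal sketch follows. Putting the filter inside the OPTIONAL group is an assumption made here, not something stated in the comment: a filter in that position becomes the condition of the left join, so the {{ex:file}} / "lala" row should still be returned with ?p and ?o1 unbound. This is only a way to observe the cost of the literal test. It does not claim to stop the "lala" ?p ?o1 lookup from being executed.

{noformat}
PREFIX ex: <http://example.com/>

SELECT *
WHERE
  { ?s ex:hasPart ?o
    OPTIONAL
      { # Assumed placement: the filter tests ?o bound by the outer pattern.
        FILTER(!isLiteral(?o))
        ?o ?p ?o1
      }
  }
{noformat}

Running it with the same {{tdb2.tdbquery -v --loc=... --query=...}} invocation as in the issue and comparing the timing and the logged plan against the original query is one way to see the overhead described above.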