Re: How to remove consistently a triple pattern given a SPARQL query?

Paul Houle Tue, 02 Feb 2016 11:30:12 -0800

Carlo,  Andy,

   I like the Iterator<> interfaces in the Jena framework for getting data
out, but I make a habit of always copying results into a List or Queue or
something similar before putting them back into the same Jena model, because
I get less BS per mile that way in terms of Exceptions and other exceptional
events.


   Does Jena have an official policy on being re-entrant in that way, that
is, on modifying a model while an iterator over it is still open?
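
A minimal sketch of that habit, using plain java.util collections as a stand-in for a Jena model (the class and method names are made up for illustration, and no Jena API is involved):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SnapshotThenMutate {
    // Adds a "-copy" entry for every item currently in the store.
    static void addCopies(List<String> store) {
        // Step 1: drain the iteration into a separate list while nothing is modified.
        List<String> pending = new ArrayList<>();
        for (String item : store) {
            pending.add(item + "-copy");
        }
        // Step 2: write back only after the loop (and its iterator) has finished.
        store.addAll(pending);
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>(Arrays.asList("a", "b"));
        addCopies(store);
        System.out.println(store); // prints [a, b, a-copy, b-copy]
    }
}
```

Calling store.add(...) inside the for loop instead makes the java.util.ArrayList iterator throw ConcurrentModificationException on its next step. Whether a Jena model iterator behaves the same way is exactly the open question.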

On Tue, Feb 2, 2016 at 2:13 PM, Carlo.Allocca <carlo.allo...@open.ac.uk>
wrote:

>
> Dear Andy and All,
>
> while extending and testing the code that I have written so far for
> removing a triple pattern from a given SPARQL query,
> I realised that I get different outputs depending on how I start the
> implementation of public Element transform(ElementGroup arg0,
> List<Element> arg1).
> In particular, if I start with (1) I obtain one set of results, and if I
> start with (2) I obtain something different (the details are below).
>
> I have also applied ElementTransformCleanGroupsOfOne when the ElementGroup
> is empty:
>         ElementTransform transform = new ElementTransformCleanGroupsOfOne();
>         Element el2 = ElementTransformer.transform(eg, transform);
>         return el2;
>
> but it made no difference to the results. I am sure I am doing something
> wrong. My questions are: what is the main difference between the two
> approaches, and when should I use ElementGroup arg0 and when List<Element>
> arg1?
>
>
> (1) public Element transform(ElementGroup arg0, List<Element> arg1) {
>         List<Element> elemList = arg0.getElements();
>         Iterator<Element> itr = elemList.iterator();
>         while (itr.hasNext()) {
>
>
>         }
>         …
>         …
>     }
>
>
> (2) public Element transform(ElementGroup arg0, List<Element> arg1) {
>         Iterator<Element> itr = arg1.iterator();
>         while (itr.hasNext()) {
>
>
>         }
>         …
>         …
>     }
>
> I know that this may be down to my limited knowledge of Jena.
> Many thanks in advance for any clarification on the above.
>
> Best Regards,
> Carlo
>
>
> =======
>
> Below I report the two scenarios with their test cases and results, and
> the code used (at the very bottom). In practice, you can notice the
> following:
>
>
>
> ==== TESTING:
>
> Scenario A:
>
>     public Element transform(ElementGroup arg0, List<Element> arg1) {
>         List<Element> elemList = arg0.getElements();
>         Iterator<Element> itr = elemList.iterator();
>         while (itr.hasNext()) {
>
>
>         }
>         …
>         …
>     }
>
>
> Test 1:
>
> The triple to remove is (?x  foaf:mbox  ?mbox ) using the below query Q1:
>
> =========== BEFORE Q1
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT  ?name ?mbox
> WHERE
>   { ?x  foaf:name  ?name
>     OPTIONAL
>       { ?x  foaf:mbox  ?mbox }
>   }
>
>
> ============= AFTER Q1
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT  ?name ?mbox
> WHERE
>   { ?x  foaf:name  ?name }
>
>
> Test 2:
>
> The triple to remove is (?boss1  ex:isBossOf1  ?ind ) using the below
> query Q2:
>
>
>
> =========== BEFORE Q2
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   {   { ?ind  rdf:type  ?z
>         OPTIONAL
>           { ?boss1  ex:isBossOf1  ?ind }
>       }
>     UNION
>       {   { ?boss  ex:isBossOf1  ?ind }
>         UNION
>           { ?boss  ex:isBossOf  ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
> ============= AFTER Q2: it does not remove the triple.
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   {   { ?ind  rdf:type  ?z }
>     UNION
>       {   { ?boss  ex:isBossOf1  ?ind }
>         UNION
>           { ?boss  ex:isBossOf  ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
>
> Test 3: The triple to remove is (?ind  rdf:type  ?z) using the below query
> Q3:
>
> =========== BEFORE Q3:
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   { ?ind  rdf:type  ?z
>     FILTER ( ?ind = "mathieu" )
>   }
>
> ============= AFTER Q3: There is still an empty BGP present.
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   { # Empty BGP
>
>
>
>   }
>
>
>
>
>
>
>
>
> Scenario B:
>
>     public Element transform(ElementGroup arg0, List<Element> arg1) {
>         Iterator<Element> itr = arg1.iterator();
>         while (itr.hasNext()) {
>
>
>         }
>         …
>         …
>     }
>
>
> Test 1:
>
> The triple to remove is (?x  foaf:mbox  ?mbox ) using the below query Q1:
>
> =========== BEFORE Q1
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT  ?name ?mbox
> WHERE
>   { ?x  foaf:name  ?name
>     OPTIONAL
>       { ?x  foaf:mbox  ?mbox }
>   }
>
>
> ============= AFTER Q1: the OPTIONAL clause is still there (with an empty
> ElementGroup).
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX  foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT  ?name ?mbox
> WHERE
>   { ?x  foaf:name  ?name
>     OPTIONAL
>       { # Empty BGP
>
>
>
>       }
>   }
>
>
>
>
> Test 2:
>
> The triple to remove is (?boss1  ex:isBossOf1  ?ind ) using the below
> query Q2:
>
>
>
> =========== BEFORE Q2
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   {   { ?ind  rdf:type  ?z
>         OPTIONAL
>           { ?boss1  ex:isBossOf1  ?ind }
>       }
>     UNION
>       {   { ?boss  ex:isBossOf1  ?ind }
>         UNION
>           { ?boss  ex:isBossOf  ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
> ============= AFTER Q2: it does not remove the OPTIONAL and it leaves an
> empty BGP.
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   {   { ?ind  rdf:type  ?z
>         OPTIONAL
>           { # Empty BGP
>
>
>
>           }
>       }
>     UNION
>       {   { ?boss  ex:isBossOf1  ?ind }
>         UNION
>           { ?boss  ex:isBossOf  ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
> Test 3: The triple to remove is (?ind  rdf:type  ?z) using the below query
> Q3:
>
> =========== BEFORE Q3
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   { ?ind  rdf:type  ?z
>     FILTER ( ?ind = "mathieu" )
>   }
>
> ============= AFTER Q3: It does not remove the FILTER, but just the triple.
>
> PREFIX  rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX  ex:   <http://www.semanticweb.org/dataset1/>
> PREFIX  rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>   {   { ?ind  rdf:type  ?z }
>     UNION
>       { # Empty BGP
>
>
>
>         FILTER ( ?boss = "mathieu" )
>       }
>   }
>
>
>
>
>
>
>
> === FULL CODE used with public Element transform(ElementPathBlock eltPB)
>
>     @Override
>     public Element transform(ElementPathBlock eltPB) {
>         if (eltPB.isEmpty()) {
>             //System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock IS EMPTY:: " + eltPB.toString());
>             return eltPB;
>         }
>         System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: "
>                 + eltPB.toString());
>         Iterator<TriplePath> l = eltPB.patternElts();
>         while (l.hasNext()) {
>             TriplePath tp = l.next();
>             // asTriple() is null for property paths, so check isTriple() first
>             if (tp.isTriple() && tp.asTriple().matches(this.triple)) {
>                 l.remove();
>                 System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: "
>                         + tp.toString() + " TRIPLE JUST REMOVED!!!");
>                 //System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] TRIPLE JUST REMOVED!!! ");
>                 System.out.println("");
>                 // start again with a fresh iterator to catch any further matches
>                 return this.transform(eltPB);//eltPB;
>             }
>         }
>         return eltPB;
>     }
>
>
> === FULL CODE used with public Element transform(ElementGroup arg0,
> List<Element> arg1)
>
>     @Override
>     public Element transform(ElementGroup arg0, List<Element> arg1) {
>         List<Element> elemList = arg0.getElements();
>         Iterator<Element> itr = elemList.iterator();
>         //Iterator<Element> itr = arg1.iterator();
>         while (itr.hasNext()) {
>             Element elem = itr.next();
>             if (elem instanceof ElementOptional) {
>                 boolean isElementOptionalEmpty = isElementOptionalEmpty((ElementOptional) elem);
>                 if (isElementOptionalEmpty) {
>                     itr.remove();
>                 }
>             } else if (elem instanceof ElementGroup) {
>                 boolean isElementGroupEmpty = isElementGroupEmpty((ElementGroup) elem);
>                 if (isElementGroupEmpty) {
>                     itr.remove();
>                 }
>             } else if (elem instanceof ElementFilter) {
>                 //... check if this filter is the one that we should remove
>                 //... get the variables of the triple pattern that we want to delete
>                 Set<Var> tpVars = new HashSet<>();
>                 Node subj = this.triple.getSubject();
>                 if (subj.isVariable()) {
>                     tpVars.add((Var) subj);
>                 }
>                 Node pred = this.triple.getPredicate();
>                 if (pred.isVariable()) {
>                     tpVars.add((Var) pred);
>                 }
>                 Node obj = this.triple.getObject();
>                 if (obj.isVariable()) {
>                     tpVars.add((Var) obj);
>                 }
>                 //... get the variables of the FILTER expression
>                 Set<Var> expVars = ((ElementFilter) elem).getExpr().getVarsMentioned();
>                 //... check whether the FILTER expression contains any of the triple pattern variables
>                 for (Var var : expVars) {
>                     //... if it does then we have to delete the entire FILTER expression
>                     if (tpVars.contains(var)) {
>                         itr.remove();
>                         break; // remove() may only be called once per next()
>                     }
>                 }
>             } else if (elem instanceof ElementUnion) {
>                 boolean isUnionBothSidesEmpty = isUnionBothSidesEmpty1((ElementUnion) elem);
>                 if (isUnionBothSidesEmpty) {
>                     itr.remove();
>                 }
>             }
>         }
>         return arg0;
>     }
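
A note on removing from inside a loop over variables: java.util.Iterator.remove() may be called only once per call to next(), so a loop that checks several variables has to stop after the first match. A minimal sketch with plain collections, where each "filter" is modelled simply as the set of variable names it mentions (the class and method names are hypothetical, not Jena API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class RemoveOncePerElement {
    // Removes every entry that mentions at least one banned variable.
    static void dropMentioning(List<Set<String>> filters, Set<String> banned) {
        Iterator<Set<String>> itr = filters.iterator();
        while (itr.hasNext()) {
            Set<String> vars = itr.next();
            for (String v : vars) {
                if (banned.contains(v)) {
                    itr.remove(); // legal only once per next()
                    break;        // a second remove() would throw IllegalStateException
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Set<String>> filters = new ArrayList<>();
        filters.add(new HashSet<>(Arrays.asList("ind", "z"))); // mentions two banned variables
        filters.add(new HashSet<>(Arrays.asList("boss")));
        dropMentioning(filters, new HashSet<>(Arrays.asList("ind", "z")));
        System.out.println(filters); // prints [[boss]]
    }
}
```

The inner loop could also be replaced by a single test, if (!Collections.disjoint(vars, banned)) itr.remove();, which avoids the problem altogether.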
>
>
>
>
>
>
>
>
> On 2 Feb 2016, at 10:54, Carlo.Allocca <carlo.allo...@open.ac.uk<mailto:
> carlo.allo...@open.ac.uk>> wrote:
>
> Dear Andy,
>
> Thank you for your time. Much appreciated.
> Some comments follow inline.
>
> On 2 Feb 2016, at 09:36, Andy Seaborne <a...@apache.org<mailto:
> a...@apache.org>> wrote:
>
>
> when removing the triple (?boss ex:isBossOf ?ind), I get
>
> SELECT DISTINCT  ?ind ?boss ?g
> WHERE
>  {   { ?ind  rdf:type  ?z }
>    UNION
>      {   { ?boss  ex:isBossOf1  ?ind }
>        UNION
>          { # Empty BGP
>
>          }
>      }
>  }
>
> which is OK.
> I just need to find out how to remove an ElementGroup whose only element
> is an empty one.
> Of course, I need to do the same for the other cases, e.g. OPTIONAL,
> subqueries, etc.
>
> Do note that evaluating {} (empty syntax group) yields one row of zero
> columns - it contributes to the overall results (it's the join identity).
>
> I see. To avoid this I am going to apply an
> ElementTransformCleanGroupsOfOne as you suggested.
>
>
> Now you have to look at all the elements that contain a group:
> ElementUnion, ElementOptional, ElementMinus, …
>
> Yes, I need to cover the whole SPARQL language from the “public Element
> transform(ElementGroup arg0, List<Element> arg1)” call.
> At least this is my understanding so far.
>
>
>
> That is what ElementTransformCleanGroupsOfOne does, except it looks for
> "groups of one"
>
> ..  UNION { { stuff } }
>
> and isn't too fussy about finding them all (it's an optimization, more a
> tidying of the tree, not a change in the effect of a query, which is what
> removing triple patterns is).
>
> And of course changes from the bottom could potentially cause changes all
> the way up to the top of the syntax tree.
>
> Also: there may be original, legal empty groups in the tree.
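
A minimal sketch of that bottom-up effect on a toy tree. The Elt class and the leaf, group and prune helpers are made up for illustration and are not Jena's Element API. Each level is cleaned only after its children, so a group emptied at the bottom can empty its parent in turn:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PruneEmptyGroups {
    // Toy stand-in for a query syntax tree: a leaf pattern has a label,
    // a group has a null label and a list of children.
    static final class Elt {
        final String label;
        final List<Elt> children = new ArrayList<>();

        Elt(String label) { this.label = label; }

        boolean isEmptyGroup() { return label == null && children.isEmpty(); }
    }

    static Elt leaf(String label) { return new Elt(label); }

    static Elt group(Elt... kids) {
        Elt g = new Elt(null);
        g.children.addAll(Arrays.asList(kids));
        return g;
    }

    // Post-order walk: prune the children first, then drop every child that
    // ended up as an empty group. The caller then does the same one level up.
    static void prune(Elt node) {
        for (Elt child : node.children) {
            prune(child);
        }
        node.children.removeIf(Elt::isEmptyGroup);
    }

    public static void main(String[] args) {
        // { ?ind rdf:type ?z  { { } } } : the inner empty group empties its parent too
        Elt root = group(leaf("?ind rdf:type ?z"), group(group()));
        prune(root);
        System.out.println(root.children.size()); // prints 1
    }
}
```

This sketch drops every empty group, including any that were in the query from the start, so it does not address the last caveat above. Since {} is the join identity, whether dropping one is acceptable depends on where it sits.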
>
> Thanks for the detailed clarifications. Indeed, I will consider them.
>
> Many Thanks,
> Best Regards,
> Carlo
>
>
>
>   Andy
>
>
>
>
>
>
> -- The Open University is incorporated by Royal Charter (RC 000391), an
> exempt charity in England & Wales and a charity registered in Scotland (SC
> 038302). The Open University is authorised and regulated by the Financial
> Conduct Authority.
>
>


-- 
Paul Houle

*Applying Schemas for Natural Language Processing, Distributed Systems,
Classification and Text Mining and Data Lakes*

(607) 539 6254    paul.houle on Skype   ontolo...@gmail.com

:BaseKB -- Query Freebase Data With SPARQL
http://basekb.com/gold/

Legal Entity Identifier Lookup
https://legalentityidentifier.info/lei/lookup/
<http://legalentityidentifier.info/lei/lookup/>

Join our Data Lakes group on LinkedIn
https://www.linkedin.com/grp/home?gid=8267275
