Carlo, Andy,

I like the Iterator<> interfaces in the Jena framework for getting data out, but I make a habit of always copying the results into a List or Queue or something before putting them back into the same Jena model, because I get less BS per mile that way in terms of Exceptions and other exceptional events.
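A minimal sketch of what I mean, in plain Java rather than Jena. The "model" here is just an ArrayList of strings (my stand-in, not a Jena Model), and the class and method names are made up for illustration. I would expect the same habit to carry over to results pulled from a Jena iterator, but I have not checked what each Jena iterator actually promises.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical stand-in: "model" is a plain list of strings, not a Jena Model.
class SnapshotBeforeModify {

    // Writes back into the same list while still iterating over it.
    // Returns true if the collection rejected the concurrent change.
    static boolean modifyWhileIterating(List<String> model) {
        try {
            for (String s : model) {
                model.add(s + "-derived");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Copies the results into a separate list first, then writes back.
    static void modifyFromSnapshot(List<String> model) {
        List<String> snapshot = new ArrayList<>(model);
        for (String s : snapshot) {
            model.add(s + "-derived");
        }
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("x", "y"));
        System.out.println("rejected while iterating: " + modifyWhileIterating(a));

        List<String> b = new ArrayList<>(Arrays.asList("x", "y"));
        modifyFromSnapshot(b);
        System.out.println("after snapshot write-back: " + b);
    }
}
```

The copy costs some memory, but the read phase and the write phase no longer share an iterator.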
Does Jena have an official policy on being reentrant in that way?

On Tue, Feb 2, 2016 at 2:13 PM, Carlo.Allocca <carlo.allo...@open.ac.uk> wrote:

>
> Dear Andy and All,
>
> while I was extending and testing the code that I wrote so far concerning
> the removing a triple from a given SPARQL query,
> I realised that I get different outputs depending on how I start the
> implementation of the public Element transform(ElementGroup arg0,
> List<Element> arg1).
> In particular, if I start with (1) I obtain some results, if I start with
> (2) I obtain something different (you can see below the details).
>
> I have also used ElementTransformCleanGroupsOfOne when ElementGroup is
> empty
>     ElementTransform transform = new ElementTransformCleanGroupsOfOne();
>     Element el2 = ElementTransformer.transform(eg, transform);
>     return el2;
>
> but no difference in results. I am sure I am doing something wrong.
> Moreover, my questions are: what is the main difference between the two
> approaches? and when I should use ElementGroup arg0 and when List<Element>
> arg1?
>
> (1) public Element transform(ElementGroup arg0, List<Element> arg1) {
>         List<Element> elemList = arg0.getElements();
>         Iterator<Element> itr = elemList.iterator();
>         while (itr.hasNext()) {
>
>         }
>         …
>         …
>     }
>
> (2) public Element transform(ElementGroup arg0, List<Element> arg1) {
>         Iterator<Element> itr = arg1.iterator();
>         while (itr.hasNext()) {
>
>         }
>         …
>         …
>     }
>
> I know that it may be related to the little knowledge about Jena.
> Many Thanks in advance for your clarification on the above.
>
> Best Regards,
> Carlo
>
> =======
>
> Below, I reported the used code (at very bottom), the two used scenario
> with test-cases and results.
> In practice, you can notice that:
>
> ==== TESTING:
>
> Scenario A:
>
> public Element transform(ElementGroup arg0, List<Element> arg1) {
>     List<Element> elemList = arg0.getElements();
>     Iterator<Element> itr = elemList.iterator();
>     while (itr.hasNext()) {
>
>     }
>     …
>     …
> }
>
> Test 1:
>
> The triple to remove is (?x foaf:mbox ?mbox ) using the below query Q1:
>
> =========== BEFORE Q1
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT ?name ?mbox
> WHERE
>   { ?x foaf:name ?name
>     OPTIONAL
>       { ?x foaf:mbox ?mbox }
>   }
>
> ============= AFTER Q1
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT ?name ?mbox
> WHERE
>   { ?x foaf:name ?name }
>
> Test 2:
>
> The triple to remove is (?boss1 ex:isBossOf1 ?ind ) using the below
> query Q2:
>
> =========== BEFORE Q2
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   {   { ?ind rdf:type ?z
>         OPTIONAL
>           { ?boss1 ex:isBossOf1 ?ind }
>       }
>     UNION
>       {   { ?boss ex:isBossOf1 ?ind }
>         UNION
>           { ?boss ex:isBossOf ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
> ============= AFTER Q2: it does not remove the triple.
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   {   { ?ind rdf:type ?z }
>     UNION
>       {   { ?boss ex:isBossOf1 ?ind }
>         UNION
>           { ?boss ex:isBossOf ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
> Test 3: The triple to remove is (?ind rdf:type ?z) using the below query
> Q3:
>
> =========== BEFORE Q3:
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   { ?ind rdf:type ?z
>     FILTER ( ?ind = "mathieu" )
>   }
>
> ============= AFTER Q3: There is still an empty BGP present.
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   { # Empty BGP
>
>   }
>
> Scenario B:
>
> public Element transform(ElementGroup arg0, List<Element> arg1) {
>     Iterator<Element> itr = arg1.iterator();
>     while (itr.hasNext()) {
>
>     }
>     …
>     …
> }
>
> Test 1:
>
> The triple to remove is (?x foaf:mbox ?mbox ) using the below query Q1:
>
> =========== BEFORE Q1
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT ?name ?mbox
> WHERE
>   { ?x foaf:name ?name
>     OPTIONAL
>       { ?x foaf:mbox ?mbox }
>   }
>
> ============= AFTER Q1: there is still the OPTIONAL clause (with an empty
> ElementGroup).
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT DISTINCT ?name ?mbox
> WHERE
>   { ?x foaf:name ?name
>     OPTIONAL
>       { # Empty BGP
>
>       }
>   }
>
> Test 2:
>
> The triple to remove is (?boss1 ex:isBossOf1 ?ind ) using the below
> query Q2:
>
> =========== BEFORE Q2
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   {   { ?ind rdf:type ?z
>         OPTIONAL
>           { ?boss1 ex:isBossOf1 ?ind }
>       }
>     UNION
>       {   { ?boss ex:isBossOf1 ?ind }
>         UNION
>           { ?boss ex:isBossOf ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
> ============= AFTER Q2: it does not remove the OPTIONAL and it leaves an
> empty BGP.
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   {   { ?ind rdf:type ?z
>         OPTIONAL
>           { # Empty BGP
>
>           }
>       }
>     UNION
>       {   { ?boss ex:isBossOf1 ?ind }
>         UNION
>           { ?boss ex:isBossOf ?ind
>             FILTER ( ?boss = "mathieu" )
>           }
>       }
>   }
>
> Test 3: The triple to remove is (?ind rdf:type ?z) using the below query
> Q3:
>
> =========== BEFORE Q3
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   { ?ind rdf:type ?z
>     FILTER ( ?ind = "mathieu" )
>   }
>
> ============= AFTER Q3: It does not remove the FILTER, but just the triple.
>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX ex: <http://www.semanticweb.org/dataset1/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   {   { ?ind rdf:type ?z }
>     UNION
>       { # Empty BGP
>
>         FILTER ( ?boss = "mathieu" )
>       }
>   }
>
> === FULL CODE used with public Element transform(ElementPathBlock eltPB)
>
> @Override
> public Element transform(ElementPathBlock eltPB) {
>     if (eltPB.isEmpty()) {
>         //System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock IS EMPTY:: " + eltPB.toString());
>         return eltPB;
>     }
>     System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: " + eltPB.toString());
>     Iterator<TriplePath> l = eltPB.patternElts();
>     while (l.hasNext()) {
>         TriplePath tp = l.next();
>         if (tp.asTriple().matches(this.triple)) {
>             l.remove();
>             System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] ElementPathBlock:: " + tp.toString() + " TRIPLE JUST REMOVED!!!");
>             //System.out.println("[RemoveOpTransform::transform(ElementPathBlock arg0)] TRIPLE JUST REMOVED!!! ");
>             System.out.println("");
>             return this.transform(eltPB);//eltPB;
>         }
>     }
>     return eltPB;
> }
>
> === FULL CODE used with public Element transform(ElementGroup arg0, List<Element> arg1)
>
> @Override
> public Element transform(ElementGroup arg0, List<Element> arg1) {
>     List<Element> elemList = arg0.getElements();
>     Iterator<Element> itr = elemList.iterator();
>     //Iterator<Element> itr = arg1.iterator();
>     while (itr.hasNext()) {
>         Element elem = itr.next();
>         if (elem instanceof ElementOptional) {
>             boolean isElementOptionalEmpty = isElementOptionalEmpty((ElementOptional) elem);
>             if (isElementOptionalEmpty) {
>                 itr.remove();
>             }
>         }
>
>         else if (elem instanceof ElementGroup) {
>             boolean isElementGroupEmpty = isElementGroupEmpty((ElementGroup) elem);
>             if (isElementGroupEmpty) {
>                 itr.remove();
>             }
>         }
>         else if (elem instanceof ElementFilter) {
>             //... check if this filter is the one that we should remove
>             //...get the variables of the triple pattern that we want to delete
>             Set<Var> tpVars = new HashSet();
>             Node subj = this.triple.getSubject();
>             if (subj.isVariable()) {
>                 tpVars.add((Var) subj);
>             }
>             Node pred = this.triple.getPredicate();
>             if (pred.isVariable()) {
>                 tpVars.add((Var) pred);
>             }
>             Node obj = this.triple.getObject();
>             if (obj.isVariable()) {
>                 tpVars.add((Var) obj);
>             }
>             //...get the variables of the FILTER expression
>             Set<Var> expVars = ((ElementFilter) elem).getExpr().getVarsMentioned();
>             //...check whether the FILTER expression contains any of the triple pattern variable
>             for (Var var : expVars) {
>                 //..if it does then we have to delete the entire FILTER expression
>                 if (tpVars.contains(var)) {
>                     itr.remove();
>                 }
>             }
>         }
>         else if (elem instanceof ElementUnion) {
>             boolean isUnionBothSidesEmpty = isUnionBothSidesEmpty1((ElementUnion) elem);
>             if (isUnionBothSidesEmpty) {
>                 itr.remove();
>             }
>         }
>
>     }
>     return arg0;
> }
>
> On 2 Feb 2016, at 10:54, Carlo.Allocca
> <carlo.allo...@open.ac.uk> wrote:
>
> Dear Andy,
>
> Thank you for your time. Much appreciated.
> Some comments follow inline.
>
> On 2 Feb 2016, at 09:36, Andy Seaborne <a...@apache.org> wrote:
>
> when removing the triple (?boss ex:isBossOf ?ind .), I get
>
> SELECT DISTINCT ?ind ?boss ?g
> WHERE
>   {   { ?ind rdf:type ?z }
>     UNION
>       {   { ?boss ex:isBossOf1 ?ind }
>         UNION
>           { # Empty BGP
>
>           }
>       }
>   }
>
> which is OK.
> I just need to find out how to remove an ElementGroup which contains only
> one element which is the EMPTY one.
> Of course, I need to do the same for the other cases, e.g. OPTIONAL,
> subquery, etc.
>
> Do note that evaluating {} (empty syntax group) yields one row of zero
> columns - it contributes to the overall results (it's the join identity).
>
> I see. To avoid this I am going to apply an
> ElementTransformCleanGroupsOfOne as you suggested.
>
> Now you have to look at all the elements that have a group in
> ElementUnion, ElementOptional, ElementMinus, …
> Yes, I need to cover all the SPARQL language from the “public Element
> transform(ElementGroup arg0, List<Element> arg1)” call.
> At least this is my understanding so far.
>
> That is what ElementTransformCleanGroupsOfOne does, except it looks for
> "groups of one"
>
> .. UNION { { stuff } }
>
> and isn't too fussy about finding them all (it's an optimization, more a
> tidying of the tree, not a change in the effect of a query which is what
> removing triple patterns is).
>
> And of course changes from the bottom could potentially cause change all
> the way up to the top of the syntax tree.
>
> also: there may be original, legal empty groups in the tree.
>
> Thanks for the detailed clarifications. Indeed, I will consider them.
>
> Many Thanks,
> Best Regards,
> Carlo
>
> Andy

--
Paul Houle
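
P.S. One thing in the quoted ElementFilter branch looks risky to me, independent of the arg0/arg1 question: itr.remove() sits inside the for-loop over expVars, so a FILTER that mentions two of the triple's variables would call remove() twice for the same next(). java.util.Iterator documents a second remove() after one next() as an IllegalStateException. Here is a plain-Java sketch, with sets of strings standing in for the Var sets. The class and method names are hypothetical and nothing here touches Jena, so treat it as an illustration of the iterator contract, not a diagnosis of the results above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in: each "filter" is just the set of variable names it mentions.
class RemoveOncePerElement {

    // Mirrors the shape of the quoted loop: remove() is called once per matching variable.
    // Returns true if the iterator rejected a repeated remove().
    static boolean removeInsideInnerLoop(List<Set<String>> filters, Set<String> tpVars) {
        try {
            Iterator<Set<String>> itr = filters.iterator();
            while (itr.hasNext()) {
                Set<String> expVars = itr.next();
                for (String v : expVars) {
                    if (tpVars.contains(v)) {
                        itr.remove();
                    }
                }
            }
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Decides first, then removes at most once per element.
    static void removeOnce(List<Set<String>> filters, Set<String> tpVars) {
        Iterator<Set<String>> itr = filters.iterator();
        while (itr.hasNext()) {
            if (!Collections.disjoint(itr.next(), tpVars)) {
                itr.remove();
            }
        }
    }

    static List<Set<String>> sample() {
        List<Set<String>> filters = new ArrayList<>();
        filters.add(new HashSet<>(Arrays.asList("ind", "z")));
        filters.add(new HashSet<>(Arrays.asList("boss")));
        return filters;
    }

    public static void main(String[] args) {
        Set<String> tpVars = new HashSet<>(Arrays.asList("ind", "z"));
        System.out.println("repeated remove rejected: " + removeInsideInnerLoop(sample(), tpVars));

        List<Set<String>> filters = sample();
        removeOnce(filters, tpVars);
        System.out.println("filters left: " + filters);
    }
}
```

A break after the first itr.remove() in the inner loop would have the same effect as the disjoint check.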