This relates to my own question about how to do binding. I'm working with
a remote SPARQL HTTP endpoint, and I want to create code examples that are
not vulnerable to "SPARQL injection". That requires a bit more than the
str.replace() approach you mention.

My thought was to construct a graph as follows:

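from rdflib import Graph
from rdflib.plugins.sparql import prepareQuery
from rdflib.plugins.stores.sparqlstore import SPARQLStore

sparqlstore = SPARQLStore(endpoint_url, sparql11=True)
g = Graph(store=sparqlstore)
q = prepareQuery(querystring, initNs=prefixes_dict)
g.query(q, initBindings=bindings_dict)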

For your case, an approach like this may work because prepareQuery returns
a Query object with an algebra attribute (query.algebra), so you can walk
that structure to do your binding.

And if you do this, it may finally let me avoid the (admittedly somewhat
theoretical) dangers of SPARQL injection.
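
To make the injection concern concrete: each bound value has to be
serialized as a SPARQL term with proper escaping before it reaches the
endpoint. The following is a minimal sketch using only the Python standard
library. The names IRI, to_sparql_term and append_values are hypothetical
helpers, not part of rdflib, and only a few value types are covered. It
appends a trailing VALUES clause, so it assumes a SPARQL 1.1 endpoint.

```python
import re

# Characters the SPARQL grammar forbids inside an <...> IRI reference.
_BAD_IRI_CHARS = re.compile(r'[\x00-\x20<>"{}|^`\\]')
_VAR_NAME = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')


class IRI(str):
    """Hypothetical marker type: a str to be emitted as <iri>."""


def to_sparql_term(value):
    """Serialize one Python value as a SPARQL term, escaping as needed."""
    if isinstance(value, IRI):
        if _BAD_IRI_CHARS.search(value):
            raise ValueError("illegal character in IRI: %r" % (value,))
        return "<%s>" % value
    if isinstance(value, bool):  # test before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                        .replace("\n", "\\n").replace("\r", "\\r"))
        return '"%s"' % escaped
    raise TypeError("unsupported type: %s" % type(value).__name__)


def append_values(query, bindings):
    """Append a VALUES clause that binds each variable to one term."""
    names = list(bindings)
    for name in names:
        if not _VAR_NAME.match(name):
            raise ValueError("bad variable name: %r" % (name,))
    variables = " ".join("?" + n for n in names)
    terms = " ".join(to_sparql_term(bindings[n]) for n in names)
    return "%s\nVALUES ( %s ) { ( %s ) }" % (query, variables, terms)


# A value that tries to break out of its string literal stays inside it.
safe_query = append_values("SELECT ?s { ?s ?p ?o }", {"o": 'x" . } #'})
```

Because the query text itself is never edited, a hostile value can only
ever appear as one escaped term inside the VALUES block.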


On Monday, October 9, 2017 at 11:49:37 AM UTC-4, Paul A Houle wrote:
>
> I brought this up about a month ago on another thread but I am ready to 
> revisit it.
>
> I am working on this library
>
> https://github.com/paulhoule/gastrodon
>
> and one of the goals for the library is that it should know more about 
> SPARQL than most users.
>
> Here are two bits of SPARQL intelligence that the library already has:
>
> (1) It uses the SPARQL parser in rdflib to look for a GROUP BY clause
> in a query; if the group variables are used in the SELECT clause, they
> are automatically made the indexes of a pandas DataFrame built from the
> SELECT output.
>
> (2) The library also substitutes binding variables into SPARQL queries in 
> order to use binding variables with queries sent to remote SPARQL endpoints.
>
> I've been going at (2) with a rather stupid approach based on
> str.replace(), which worked for a while, but then I found an easy way to
> break it. I can see a hack that will get me through the day, but there is
> a way to break that too. I think I see a way to make one that is
> unbreakable, but it is enough work that I might instead write something
> that can turn SPARQL parse trees back into text, because that, in
> principle, is the way to complete SPARQL intelligence.
>
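
One way a str.replace()-based substitution can break (an assumed case; the
message above does not say which one was hit) is that the search text also
occurs inside longer variable names, IRIs and string literals. The sketch
below contrasts that with a token-aware replacement. It uses only the
standard library, substitute is a hypothetical helper, and the regular
expression covers only short string literals and IRI references, not
comments or triple-quoted strings.

```python
import re

# Literals and IRIs are matched first so they are skipped as a whole;
# only a bare ?name / $name token populates the 'var' group.
_TOKEN = re.compile(
    r'"(?:[^"\\\n\r]|\\.)*"'                  # double-quoted literal
    r"|'(?:[^'\\\n\r]|\\.)*'"                 # single-quoted literal
    r"|<[^<>\s]*>"                            # IRI reference
    r"|[?$](?P<var>[A-Za-z_][A-Za-z0-9_]*)"   # variable
)


def substitute(query, bindings):
    """Replace whole variable tokens with already-serialized terms."""
    def swap(match):
        name = match.group("var")
        if name is not None and name in bindings:
            return bindings[name]
        return match.group(0)  # literal, IRI or unbound variable
    return _TOKEN.sub(swap, query)


query = ('SELECT ?subject { ?s <http://example.org/p?s> "is ?s here" ; '
         '?p ?subject }')

# Naive replacement also rewrites ?subject, the IRI and the literal.
naive = query.replace("?s", "<http://example.org/x>")

# Token-aware replacement touches only the standalone ?s.
careful = substitute(query, {"s": "<http://example.org/x>"})
```
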
> So I am thinking about the right way to do it. For a while I was puzzled
> by the large parse trees created by simple expressions; for instance,
>
> parseQuery("SELECT (5 as ?o) {}")
>
> parses to
>
> [([], {}), SelectQuery_{'projection': [vars_{'expr': 
> ConditionalOrExpression_{'expr': ConditionalAndExpression_{'expr': 
> RelationalExpression_{'expr': AdditiveExpression_{'expr': 
> MultiplicativeExpression_{'expr': rdflib.term.Literal('5', 
> datatype=rdflib.term.URIRef('http://www.w3.org/2001/XMLSchema#integer'))}}}}},
> 'evar': rdflib.term.Variable('o')}], 'where': GroupGraphPatternSub_{}}], {})
>
> But the more I think about it, this is just how parse trees produced by
> something like pyparsing work: the precedence structure is embedded in
> the sequence of nested nodes. I can imagine this might make query
> understanding a little harder than it has to be, but it shouldn't be a
> barrier to unparsing, because if a ConditionalAndExpression has only one
> leg, the string form is just the string form of that single leg.
>
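
The single-leg observation can be sketched on a toy tree. Node below is a
hypothetical stand-in, not rdflib's own parse-result class: it has a first
operand (expr) and an optional tuple of (operator, operand) pairs (other).
A node with no extra legs unparses to exactly the text of its one child,
so the deep precedence chain collapses on its own.

```python
from collections import namedtuple

# Hypothetical stand-in for a parse-tree node.
Node = namedtuple("Node", ["name", "expr", "other"])


def unparse(node):
    """Turn a toy expression tree back into text."""
    if not isinstance(node, Node):
        return str(node)       # leaf: an already-serialized term
    text = unparse(node.expr)
    if not node.other:
        return text            # single leg: the node adds nothing
    for op, operand in node.other:
        text += " %s %s" % (op, unparse(operand))
    return "(%s)" % text       # parenthesize so precedence survives


# The chain of single-leg nodes from the parse of the literal 5.
five = Node("ConditionalOrExpression",
            Node("ConditionalAndExpression",
                 Node("RelationalExpression",
                      Node("AdditiveExpression",
                           Node("MultiplicativeExpression", "5", ()),
                           ()),
                      ()),
                 ()),
            ())

# 1 + 2 * 3, where only two nodes have more than one leg.
product = Node("MultiplicativeExpression", "2", (("*", "3"),))
total = Node("AdditiveExpression",
             Node("MultiplicativeExpression", "1", ()),
             (("+", product),))
```

Always parenthesizing multi-leg nodes is a deliberately blunt choice: the
output is noisier than the input, but it cannot change the meaning.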
> Another thing I am thinking about is that I'd like to have some way to 
> substitute list variables from Python into the ExpressionList in an IN 
> clause.  It seems I could add an XVAR term which is ?? + VARNAME,  then add 
> the appropriate production,  but I hate the idea of having to copy that 
> whole parse tree when I only want to make a small change.
>
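
An alternative to adding a grammar production is to expand the list
placeholder textually before the query is parsed or sent. The sketch below
swaps in that technique; it does not touch the rdflib grammar. The ??name
syntax follows the XVAR idea above, expand_lists and _term are
hypothetical helpers, and only int and str items are handled.

```python
import re

# String literals are matched first so a "??" inside them is left alone;
# only a bare ??name token populates the 'name' group.
_TOKEN = re.compile(
    r'"(?:[^"\\\n\r]|\\.)*"'
    r"|'(?:[^'\\\n\r]|\\.)*'"
    r"|\?\?(?P<name>[A-Za-z_][A-Za-z0-9_]*)"
)


def _term(value):
    """Serialize an int or str as a SPARQL literal; reject other types."""
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    if isinstance(value, str):
        escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                        .replace("\n", "\\n").replace("\r", "\\r"))
        return '"%s"' % escaped
    raise TypeError("unsupported list item: %r" % (value,))


def expand_lists(query, lists):
    """Replace each ??name with a comma-separated list of literals."""
    def swap(match):
        name = match.group("name")
        if name is None:
            return match.group(0)  # a string literal: pass through
        return ", ".join(_term(v) for v in lists[name])
    return _TOKEN.sub(swap, query)


query = 'SELECT ?s { ?s ?p ?o FILTER(?o IN (??wanted)) }'
expanded = expand_lists(query, {"wanted": [1, 2, "three"]})
```

An empty Python list expands to nothing, leaving IN (), which the SPARQL
grammar accepts as an empty expression list.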
> Any ideas?
>
>
