Ooh that is a nasty one, you have an evil mind Andy! Will try and figure out a fix but I'm not sure that we can defend against every possible variant on this e.g. <?var> and "?var"
It may be that we have to resort to just stating in the documentation that ParameterizedSparqlString does its best to prevent SQL injection but that users should use their own validation of inputs in addition to the validation that it provides. Rob On 3/28/13 1:28 PM, "Andy Seaborne" <[email protected]> wrote: >There's another injection attack although it depends on a degree of >cooperation (mistake) in the thing under attack. > >String str = "PREFIX : <http://example/>\nINSERT { <s> <p> '?var' }" ; > >pss.setLiteral("var", "hello'} ; DROP ALL ; INSERT DATA { <s> <p> >'goodbye") ; > >I'll have a think about ?var in comments but the escaping of \n shoudl >block that. > > Andy > >On 28/03/13 19:02, Rob Vesse wrote: >> Comments inline: >> >> >> On 3/24/13 11:37 AM, "Andy Seaborne" <[email protected]> wrote: >> >>> Looking back at the thread with Joshua Taylor on initial bindings and >>> update, I wonder if we could do with "proper" templates. >>> >>> >>> Manipulation of the algebra for query building does not work so well >>> remotely as the query is sent in syntax form. Having query templates >>>in >>> SPARQL syntax with template parameters seems more natural. >>> >>> (read "query or update" for "query" throughout). >>> >>> ParameterizedSparqlString seems to do two things - correct me if I'm >>> wrong Rob - it's a sort of builder of queries (the .append(..) methods) >>> and also a bit like JDBC prepared statements (.setXYZ(...)). But it >>> does not know the syntax of the query or update. They are open to >>> injection [1] although that's fixable. >> >> No it is specifically designed to not be open to injection, any value >>that >> you set is converted into a Node and formatted as such thus preventing >> such injection. >> >> However you have come up with a valid example that defeats this >> protection, this should now be fixed as of my latest commits. >> >> Btw should Jena/ARQ not have been rejecting that as an invalid URI/IRI >>to >> start with? >> >>> >>> My suggestion is to have template queries, which are like, but not >>> identical, to JDBC prepared statements. A template query is a superset >>> of SPARQL. Template parameterization is via a new parse item, not a >>> legal SPARQL variable (e.g. ?{abc}). They must be replaced by a node >>> (which could be a real variable) to use them. There would >>> template.asQuery(..substitutions...). >>> >>> An alternative was using SPARQL variables, requiring a query/update >>> template to declare the template variables and checking when converting >>> to a query or update. But, as below, there are a couple of points >>>where >>> it is desirable to parameterize a template that in SPARQL do not allow >>> variables. If we're tweaking the syntax anyway, we might as well have >>> template variable syntax. >>> >>> The reason for some checking is so you can't do "SELECT * { ?s ?p ?o }" >>> forgetting to replace ?s with a URI, for example. >> >> A few weeks back I added positional parameter support to the class so >>you >> can now do the following: >> >> ParameterizedSparqlString pss = new ParameterizedSparqlString("SELECT * >>{ >> ? ?p ?o }"); >> pss.setIri(0, "http://example/subject"); >> >> If you forget to set a positional parameter then you will have a syntax >> error. >> >> See the javadoc for what is treated as a positional parameter. >> >>> >>> == Query >>> >>> LIMIT and OFFSET take fixed integers by syntax and ideally they would >>> parameters. >>> >>> == Update >>> >>> INSERT DATA / DELETE DATA restrict >>> >>> It would be good to have template data updates. But the data part of >>> INSERT DATA explicitly forbids variables. >>> >>> >>> To this end, I have got the machinery for transforms of Element objects >>> (c.f. transforms on Ops) working. By working on the AST, injection is >>> harder because the template parameter must be a Node to go in the right >>> place in the AST. >>> >>> Comments and thoughts? >> >> Bar the bug you identified the ParameterizedSparqlString is already >> forcing any parameter to be a Node and formatting it appropriately. Of >> course as you point out it is ignorant of the syntax so it isn't a >>perfect >> solution for all scenarios. >> >> Doing a more algebra based template perhaps makes sense in a more >>prepared >> query type environment where you want to prepare a query plan and fill >>in >> the constants later. My only concern would be that the constants you >>fill >> in might lead to a different query plan depending on the optimizer in >>use >> (e.g. stats based) and so injecting them that late could harm >>performance. >> Injecting at the pre-parsing stage allows the normal machinery to >>work. >> >>> >>> Rob - does this relate to jena-jdbc in some way? >> >> jena-jdbc is something experimental I am working on and is designed to >> provide a JDBC to SPARQL bridge (I know William Greenly has jdbc4sparql >> available) but I needed something that was designed to be extensible and >> not tied to remote SPARQL endpoints only. I am intending on using >> ParameterizedSparqlString to do prepared statements for this. I will >> likely start doing some more work on this in the coming weeks as my time >> is freed up from other current work commitments, jena-jdbc is intended >>to >> be an enabling framework for ourselves and others to provide SPARQL >>based >> JDBC drivers. >> >> This may touch on some of the ideas Claude has been proposing about >>lower >> level network protocols though personally I am quite happy to stick with >> HTTP. >> >> Rob >> >>> >>> Andy >>> >>> [1] >>> >>> public static void injection() { >>> String str = >>> "PREFIX : <http://example/>\nINSERT DATA { <s> <p> ?var2 . >>>}" ; >>> ParameterizedSparqlString pss = new >>>ParameterizedSparqlString(str) ; >>> pss.setIri("var2", >>> "hello> } ; DROP ALL ; INSERT DATA { <s> <p> <goodbye" >>> ) ; >>> String x = pss.toString() ; >>> System.out.println(x) ; >>> } >> >
