Re: [sqlalchemy] Re: Security Implications of Directly Executing Queries Compiled to Strings

2019-03-08 Thread 'Van Klaveren, Brian N.' via sqlalchemy
As has been said, if you are generating the SQL, you will be fine so long as 
you use parameters and no blind string interpolation.

This isn't really any different that any other API in that regard - obviously 
you don't want to allow a non-substituted first name field of the form `'; DROP 
TABLE USERS;`

If you are worried about security in terms of accepting SQL (which sounds like 
it's a different concern than yours), you will really want a full-fledged 
parser.

In astronomy, we have a long history of actually letting users execute SQL 
directly via a REST API (there's even a specification called the "Table Access 
Protocol"). There's some SQL parsers out there in Python, but not much that's 
easily reusable or full features. I had started porting the Presto parser to 
Python (github.com/slaclab/lacquer)  but I 
ended up with something kind of buggy (someday I'll try to get back to it... 
Antlr4 has good Python support now) and we had switched that service to Java 
because we needed better threading and I switched to directly using the Presto 
SQL parser. You could easily write a query validator, for example as a CLI or 
as a REST API, to verify there's only one statement with the presto framework 
if you want good SQL support and to verify the statements and operations are 
okay. This is probably out of scope for your needs, but it may be something to 
bring up to your team if you are worried about security more generally for this 
service.

Brian


On Mar 8, 2019, at 9:56 AM, Walt 
mailto:waltas...@gmail.com>> wrote:



On Friday, March 8, 2019 at 11:32:01 AM UTC-6, Jonathan Vanasco wrote:
Do you control the HTTP API or is this someone else's system?

It's someone else's. I'm living in a world where folks have generated their SQL 
with regular old string processing & interpolation in Python because there's no 
database connection to work with. I'm hoping to replace this with sqlalchemy + 
.compile()-ing.

Does the API just execute the raw sql it is provided, and you're looking to 
generate that?  What you want to do is pipe a SqlAlchemy query into a function 
that can compile it into the right statement for your database.  Below is an 
example of PostgreSQL that worked on SqlAlchemy 1 (probably 1.3. too but I 
haven't tested):

Yep, just executes it raw. I've got the basic examples working, I just wanted 
to understand the limitations of using SQLAlchemy vs. DBAPI literal binding, 
particularly any security implications, before I start advocating for more 
people in my organization to start using this workflow.

The security concerns you brought up deal with how/what SqlAlchemy treats as 
trusted user input or not.  Most functions in SqlAlchemy will escape the values 
by default, very few will not and are documented with a dragon in the database. 
 If you are using values for those items in Sql you need to filter them 
yourself.

Thanks, so the limitations the documentation is bringing up are more that 
SQLAlchemy doesn't know how to bind the same variety of types as does the DBAPI?

For 99.9% of use cases though, you can just compile your sql to the database's 
dialect and just send it without worry.

Hooray!

Thank you for your response!

--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to 
sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] Re: Security Implications of Directly Executing Queries Compiled to Strings

2019-03-08 Thread Jonathan Vanasco


On Friday, March 8, 2019 at 12:56:26 PM UTC-5, Walt wrote:
>
> Thanks, so the limitations the documentation is bringing up are more that 
> SQLAlchemy doesn't know how to bind the same variety of types as does the 
> DBAPI?
>

Hopefully I will get this write so Mike won't have to correct me...

It's not about the "type" but the "context".  The library is large and 
supports many dialects and dbapi drivers.  Over time, that has meant many 
things have had to support 'textual arguments' and supplying literal SQL to 
certain commands.  (see 
https://docs.sqlalchemy.org/en/latest/orm/tutorial.html#orm-tutorial-literal-sql
 
) 

I think there are some odd issues with binding different types, but they're 
largely for the database specific columns/types and few people will 
encounter them.

For example, in the bugreport i mentioned above... some people discovered 
one could submit unsafe input to a group_by or order_by clause.  That 
action is an anti-pattern, as it the functions are designed and documented 
for safe developer input, but some person on the internet generated a CVE 
report... so it became a vulnerability that had to be addressed.

In your example above though, ensuring you have an INT will catch an error 
before it hits the database. Otherwise, if you submit text, SqlAlchemy will 
properly escape it , submit it, and you will get a database error that 
needs to be parsed to figure things out.  (if sqlalchemy were talking to 
the database directly, it would wrap the error into a standardized 
sqlalchemy error that handles the various dialect-specific errors for that 
type, but you'd still need to parse it to figure out what happened on the 
db layer)

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] Re: Security Implications of Directly Executing Queries Compiled to Strings

2019-03-08 Thread Walt


On Friday, March 8, 2019 at 11:32:01 AM UTC-6, Jonathan Vanasco wrote:
>
> Do you control the HTTP API or is this someone else's system?
>

It's someone else's. I'm living in a world where folks have generated their 
SQL with regular old string processing & interpolation in Python because 
there's no database connection to work with. I'm hoping to replace this 
with sqlalchemy + .compile()-ing.
 

> Does the API just execute the raw sql it is provided, and you're looking 
> to generate that?  What you want to do is pipe a SqlAlchemy query into a 
> function that can compile it into the right statement for your database.  
> Below is an example of PostgreSQL that worked on SqlAlchemy 1 (probably 
> 1.3. too but I haven't tested):
>

Yep, just executes it raw. I've got the basic examples working, I just 
wanted to understand the limitations of using SQLAlchemy vs. DBAPI literal 
binding, particularly any security implications, before I start advocating 
for more people in my organization to start using this workflow.
 

> The security concerns you brought up deal with how/what SqlAlchemy treats 
> as trusted user input or not.  Most functions in SqlAlchemy will escape the 
> values by default, very few will not and are documented with a dragon in 
> the database.  If you are using values for those items in Sql you need to 
> filter them yourself.  
>

Thanks, so the limitations the documentation is bringing up are more that 
SQLAlchemy doesn't know how to bind the same variety of types as does the 
DBAPI?
 

> For 99.9% of use cases though, you can just compile your sql to the 
> database's dialect and just send it without worry.
>

Hooray!

Thank you for your response!

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] Re: Security Implications of Directly Executing Queries Compiled to Strings

2019-03-08 Thread Jonathan Vanasco
forgot to add:

in your example: NEVER TRUST USER INPUT.  you'll get a much better error 
and can fix problems when you don't trust them and try to sanitize stuff

- (x > user_provided_value)
+ (x > int(user_provided_value))  # or float or whatever

The stuff 'compile' generates won't necessarily work on your database 
though; you need to pass in a dialect.  (hence the function that just 
compiles it with a set predetermined options)

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] Re: Security Implications of Directly Executing Queries Compiled to Strings

2019-03-08 Thread Jonathan Vanasco
Do you control the HTTP API or is this someone else's system?

Does the API just execute the raw sql it is provided, and you're looking to 
generate that?  What you want to do is pipe a SqlAlchemy query into a 
function that can compile it into the right statement for your database.  
Below is an example of PostgreSQL that worked on SqlAlchemy 1 (probably 
1.3. too but I haven't tested):

The security concerns you brought up deal with how/what SqlAlchemy treats 
as trusted user input or not.  Most functions in SqlAlchemy will escape the 
values by default, very few will not and are documented with a dragon in 
the database.  If you are using values for those items in Sql you need to 
filter them yourself.  

Until recently, group_by and order_by were "vulnerable" to the anti-pattern 
of submitting raw untrusted user input to them.  see 
https://github.com/sqlalchemy/sqlalchemy/issues/4481 

For 99.9% of use cases though, you can just compile your sql to the 
database's dialect and just send it without worry.

-



# pypi
import sqlparse
from sqlalchemy.dialects import postgresql as dialect_postgresql


# 
==


def print_query(q):
"""
prints a sqlalchemy query
"""
print("-" * 30)
if hasattr(q, 'statement'):
print("[q.statement.compile %s]" % type(q))
statement = 
str(q.statement.compile(dialect=dialect_postgresql.dialect(), 
compile_kwargs={"literal_binds": True}))
elif hasattr(q, 'compile'):
print("[q.compile %s]" % type(q))
statement = str(q.compile(dialect=dialect_postgresql.dialect(), 
compile_kwargs={"literal_binds": True}))
else:
print("[q %s]" % type(q))
statement = str(q)
print(sqlparse.format(statement, reindent=True, 
keyword_case='upper'))
print("-" * 30)

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.