Re: SQL and CSV

2009-05-09 Thread Lawrence D'Oliveiro
In message , andrew 
cooke wrote:

> even if you're not open to injection attacks, you're still less likely to
> get escaping correct than a puprose written, widely used library.

Escaping isn't really that hard to do. For example, the rules for MySQL 
literals are clearly spelled out here 
. You can easily 
check that against my quoting routines here 
.

--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-08 Thread Lawrence D'Oliveiro
In message , andrew 
cooke wrote:

> Lawrence D'Oliveiro wrote:
>> In message , Peter Otten wrote:
>>
>>> While it may not matter here using placeholders instead of manually
>>> escaping user-provided values is a good habit to get into.
>>
>> Until you hit things it can't deal with.
> 
> Can you expand on your comment?  I assume you are thinking of how the
> library might handle some strange class.

Consider something as simple as a "like" clause.

--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-08 Thread andrew cooke

even if you're not open to injection attacks, you're still less likely to
get escaping correct than a puprose written, widely used library.

my request for more information was directed to lawrence, who said "until
you hit things it can't deal with" which seemed to be some kind of cryptic
argument against parameters.

andrew


Nick wrote:
> On May 8, 1:49 pm, "andrew cooke"  wrote:
>> Lawrence D'Oliveiro wrote:
>> > In message , Peter Otten wrote:
>>
>> >> While it may not matter here using placeholders instead of manually
>> >> escaping user-provided values is a good habit to get into.
>>
>> > Until you hit things it can't deal with.
>>
>> The post you are replying to was talking about using the SQL library's
>> "?"
>> syntax that automatically escapes values.  The usual reason this is
>> recommended (if I have understood correctly) is that the library code is
>> much more likely to foil injection attacks.  I have seen this mentioned
>> often and assume it is good advice.
>>
>> Can you expand on your comment?  I assume you are thinking of how the
>> library might handle some strange class.  But aren't the number of types
>> limited by SQL?  In which case a "thing that can't be handled" could
>> presumably be managed by adding an appropriate __str__ or __float__ or
>> whatever?  And you would still use the library to give safety with other
>> values.
>>
>> Maybe you could give an example of the kind of problem you're thinking
>> of?
>>
>> Thanks,
>> Andrew
>
> Injection attacks aren't an issue, its a local app.
>
> It's part of a reconciliation system, where sometimes data is in csv
> files. If you want the whole csv file, you can use csv module without
> a problem.
>
> In some cases, I need to manipulate the data.
>
> The choices are hard code the manipulation, or load the data from a
> config file.
>
> So what I've got is the query in the config and I can process it.
>
> Nick
> --
> http://mail.python.org/mailman/listinfo/python-list
>
>


--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-08 Thread Nick
On May 8, 1:49 pm, "andrew cooke"  wrote:
> Lawrence D'Oliveiro wrote:
> > In message , Peter Otten wrote:
>
> >> While it may not matter here using placeholders instead of manually
> >> escaping user-provided values is a good habit to get into.
>
> > Until you hit things it can't deal with.
>
> The post you are replying to was talking about using the SQL library's "?"
> syntax that automatically escapes values.  The usual reason this is
> recommended (if I have understood correctly) is that the library code is
> much more likely to foil injection attacks.  I have seen this mentioned
> often and assume it is good advice.
>
> Can you expand on your comment?  I assume you are thinking of how the
> library might handle some strange class.  But aren't the number of types
> limited by SQL?  In which case a "thing that can't be handled" could
> presumably be managed by adding an appropriate __str__ or __float__ or
> whatever?  And you would still use the library to give safety with other
> values.
>
> Maybe you could give an example of the kind of problem you're thinking of?
>
> Thanks,
> Andrew

Injection attacks aren't an issue, its a local app.

It's part of a reconciliation system, where sometimes data is in csv
files. If you want the whole csv file, you can use csv module without
a problem.

In some cases, I need to manipulate the data.

The choices are hard code the manipulation, or load the data from a
config file.

So what I've got is the query in the config and I can process it.

Nick
--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-08 Thread andrew cooke
Lawrence D'Oliveiro wrote:
> In message , Peter Otten wrote:
>
>> While it may not matter here using placeholders instead of manually
>> escaping user-provided values is a good habit to get into.
>
> Until you hit things it can't deal with.

The post you are replying to was talking about using the SQL library's "?"
syntax that automatically escapes values.  The usual reason this is
recommended (if I have understood correctly) is that the library code is
much more likely to foil injection attacks.  I have seen this mentioned
often and assume it is good advice.

Can you expand on your comment?  I assume you are thinking of how the
library might handle some strange class.  But aren't the number of types
limited by SQL?  In which case a "thing that can't be handled" could
presumably be managed by adding an appropriate __str__ or __float__ or
whatever?  And you would still use the library to give safety with other
values.

Maybe you could give an example of the kind of problem you're thinking of?

Thanks,
Andrew



--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-08 Thread Lawrence D'Oliveiro
In message , Peter Otten wrote:

> While it may not matter here using placeholders instead of manually
> escaping user-provided values is a good habit to get into.

Until you hit things it can't deal with.

--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-08 Thread Peter Otten
Nick wrote:

> self.cursor = self.connection.cursor()
> first = True
> for row in reader:
> if first:
> headers = []
> for r in row:
> n = r.strip().replace (' ', '_').replace ('-','_')
> headers.append (n)
> command = 'create table csv (%s)' % ','.join (headers)
> self.cursor.execute (command)
> first = False
> else:
> command = 'insert into csv values ("%s")' % '","'.join
> (row)
> self.cursor.execute (command)
> 

You can simplify that a bit:

cursor = self.cursor = self.connection.cursor()

first_row = next(reader)
headers = [column.strip().replace(" ", "_").replace("-", "_") for column in 
first_row]
cursor.execute("create table csv (%s)" % ", ".join(headers))

placeholders = ", ".join("?"*len(headers))
command = "insert into csv values (%s)" % placeholders
cursor.executemany(command, reader)

While it may not matter here using placeholders instead of manually escaping 
user-provided values is a good habit to get into.

> self.cursor.execute (self.query)
> rows = self.cursor.fetchall()
  
rows = self.cursor.execute(self.query)

doesn't build an intermediate list.

> i = 0
> for row in rows:
> results.add (row, i)
> i = i + 1
 
This is written

for i, row in enumerate(rows):
results.add(row, i)

in idiomatic Python.

Peter

--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-07 Thread John Machin
On May 8, 1:45 am, Nick  wrote:
> On May 5, 8:27 pm, Tim Golden  wrote:
>
>
>
> > Nick wrote:
> > > Part of the problem is that the 'selection' needs to be in a config
> > > file. I can put the if row['status'] != 'Cancelled': return True into
> > > a config, read it and eval it, but its not quite as clean as ansql
> > > route.
>
> > Still not clear what the restriction is. If you were writingSQLyou'd have 
> > to read *something* from your config file,
> > unless you're suggesting that the "config file" is in fact
> > aSQLfile. Which is one way of doing it, but then you might
> > just as well have your config file as a Python file and
> > import it.
>
> > Have I missed the point somewhere here? Can you give an
> > example -- even a fictional one -- of what you couldn't
> > do using, say, the example I gave earlier?
>
> > TJG
>
> Solution found. In the end I used SQLite to read from a csv file, and
> now I can query the CSV file. The file is read using the csv module
>
> First create a function
>
> def fraction(p, denom):
>     num, frac = p.split ('-')
>     return float (num) + float (frac) / denom
>
> for use within queries.
>
> Now build the class.
>
>         self.filename  = filename
>         self.dialect   = dialect
>         self.query     = query
>         reader = csv.reader (open (filename, 'r'))
>         self.connection = sqlite.connect(":memory:")
>         self.connection.create_function("fraction", 2, fraction) #
> Adds in function
>         self.cursor = self.connection.cursor()
>         first = True
>         for row in reader:
>             if first:
>                 headers = []
>                 for r in row:
>                     n = r.strip().replace (' ', '_').replace ('-','_')
>                     headers.append (n)
>                 command = 'create table csv (%s)' % ','.join (headers)
>                 self.cursor.execute (command)
>                 first = False
>             else:
>                 command = 'insert into csv values ("%s")' % '","'.join
> (row)
>                 self.cursor.execute (command)
>
> and then I can use this
>
>         self.cursor.execute (self.query)
>         rows = self.cursor.fetchall()
>         headers = []
>         for r in self.cursor.description:
>             headers.append (r[0])
>         results = Results.Results (headers, self.name, {})
>         i = 0
>         for row in rows:
>             results.add (row, i)
>             i = i + 1
>         return results
>
> to query the results.
>
> Results.Results is one of my classes that's reused in lots of places.
>
> The query then looks somethign like this
>
>                 select
>                     Client_Reference_Number as TrdNbr,
>                     Asset_Number as ISIN,
>                     Quantity as Qty,
>                     status
>                 from
>                     csv

The remaining lines of your SELECT statement are incredibly redundant
AFAICT. It seems you have pushed the contents of your csv file into a
data base and pulled them ALL out again ... not what I'd call a
"query". What's the point?


>                 where status in ("CANCELLED")
>
>                 union
>
>                 select
>                     Client_Reference_Number as TrdNbr,
>                     Asset_Number as ISIN,
>                     Quantity as Qty,
>                     status
>                 from
>                     csv
>                 where status not in ("CANCELLED")
>
> All incredibly neat and the first time I've used SQLite.
>
> nick

--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-07 Thread Nick
On May 5, 8:27 pm, Tim Golden  wrote:
> Nick wrote:
> > Part of the problem is that the 'selection' needs to be in a config
> > file. I can put the if row['status'] != 'Cancelled': return True into
> > a config, read it and eval it, but its not quite as clean as ansql
> > route.
>
> Still not clear what the restriction is. If you were writingSQLyou'd have to 
> read *something* from your config file,
> unless you're suggesting that the "config file" is in fact
> aSQLfile. Which is one way of doing it, but then you might
> just as well have your config file as a Python file and
> import it.
>
> Have I missed the point somewhere here? Can you give an
> example -- even a fictional one -- of what you couldn't
> do using, say, the example I gave earlier?
>
> TJG

Solution found. In the end I used SQLite to read from a csv file, and
now I can query the CSV file. The file is read using the csv module

First create a function

def fraction(p, denom):
num, frac = p.split ('-')
return float (num) + float (frac) / denom

for use within queries.

Now build the class.

self.filename  = filename
self.dialect   = dialect
self.query = query
reader = csv.reader (open (filename, 'r'))
self.connection = sqlite.connect(":memory:")
self.connection.create_function("fraction", 2, fraction) #
Adds in function
self.cursor = self.connection.cursor()
first = True
for row in reader:
if first:
headers = []
for r in row:
n = r.strip().replace (' ', '_').replace ('-','_')
headers.append (n)
command = 'create table csv (%s)' % ','.join (headers)
self.cursor.execute (command)
first = False
else:
command = 'insert into csv values ("%s")' % '","'.join
(row)
self.cursor.execute (command)

and then I can use this

self.cursor.execute (self.query)
rows = self.cursor.fetchall()
headers = []
for r in self.cursor.description:
headers.append (r[0])
results = Results.Results (headers, self.name, {})
i = 0
for row in rows:
results.add (row, i)
i = i + 1
return results

to query the results.

Results.Results is one of my classes that's reused in lots of places.

The query then looks somethign like this

select
Client_Reference_Number as TrdNbr,
Asset_Number as ISIN,
Quantity as Qty,
status
from
csv
where status in ("CANCELLED")

union

select
Client_Reference_Number as TrdNbr,
Asset_Number as ISIN,
Quantity as Qty,
status
from
csv
where status not in ("CANCELLED")


All incredibly neat and the first time I've used SQLite.

nick
--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-05 Thread Tim Golden

Nick wrote:

Part of the problem is that the 'selection' needs to be in a config
file. I can put the if row['status'] != 'Cancelled': return True into
a config, read it and eval it, but its not quite as clean as an sql
route.



Still not clear what the restriction is. If you were writing
SQL you'd have to read *something* from your config file,
unless you're suggesting that the "config file" is in fact
a SQL file. Which is one way of doing it, but then you might
just as well have your config file as a Python file and
import it.

Have I missed the point somewhere here? Can you give an
example -- even a fictional one -- of what you couldn't
do using, say, the example I gave earlier?

TJG
--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-05 Thread Matimus
On May 5, 9:25 am, Nick  wrote:
> On May 5, 5:19 pm, Tim Golden  wrote:
>
>
>
> > Nick wrote:
> > > I have a requirement to read a CSV file. Normally, no problem, just
> > > import CSV and slurp the file up.
>
> > > However, in this case I want to filter out lines that have fields set
> > > to particular values.
>
> > > It would be neat to be able to do something like this.
>
> > > select * from test.csv where status <> "Canceled"
>
> > > Using adodb I can do this, so long as I don't have the where clause. :-
> > > (
>
> > > Is there a reasonable lightweight way of doing this in Python?
>
> > > I could write some python code that is used to filter rows, and inport
> > > that from config, but it's not quite as elegant as an SQL route.
>
> > Not entirely clear what you are and aren't prepared to try here, but...
> > the most obvious Python-based way to do this is treating the csv reader
> > as an iterator and filtering there. Your last line suggests that's not
> > what you want but just in case I've misunderstood:
>
> > 
> > id,code,status
> > 1,"ONE","Active"
> > 2,"TWO","Cancelled"
> > 3,"THREE","Active"
> > 
>
> > 
> > import csv
>
> > for row in csv.DictReader (open ("c:/temp/test.csv", "rb")):
> >   if row['status'] != 'Cancelled':
> >     print row
>
> > 
>
> > Doesn't seem too onerous, and could obviously be wrapped in
> > some useful class/module.
>
> > But if you really want to go the SQL route, I believe there are
> > ODBC adapters for CSV which, combined with PyODBC or CeODBC,
> > would probably take you where you want to go.
>
> > TJG
>
> Part of the problem is that the 'selection' needs to be in a config
> file. I can put the if row['status'] != 'Cancelled': return True into
> a config, read it and eval it, but its not quite as clean as an sql
> route.
>
> Nick

Well, if you are using 2.5.x you could always stuff it into a sqlite
in-memory database, and then execute a SQL query. Heck, you don't even
_need_ 2.5, but in 2.5 sqlite is part of the distribution.

Matt
--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-05 Thread Nick
On May 5, 5:19 pm, Tim Golden  wrote:
> Nick wrote:
> > I have a requirement to read a CSV file. Normally, no problem, just
> > import CSV and slurp the file up.
>
> > However, in this case I want to filter out lines that have fields set
> > to particular values.
>
> > It would be neat to be able to do something like this.
>
> > select * from test.csv where status <> "Canceled"
>
> > Using adodb I can do this, so long as I don't have the where clause. :-
> > (
>
> > Is there a reasonable lightweight way of doing this in Python?
>
> > I could write some python code that is used to filter rows, and inport
> > that from config, but it's not quite as elegant as an SQL route.
>
> Not entirely clear what you are and aren't prepared to try here, but...
> the most obvious Python-based way to do this is treating the csv reader
> as an iterator and filtering there. Your last line suggests that's not
> what you want but just in case I've misunderstood:
>
> 
> id,code,status
> 1,"ONE","Active"
> 2,"TWO","Cancelled"
> 3,"THREE","Active"
> 
>
> 
> import csv
>
> for row in csv.DictReader (open ("c:/temp/test.csv", "rb")):
>   if row['status'] != 'Cancelled':
>     print row
>
> 
>
> Doesn't seem too onerous, and could obviously be wrapped in
> some useful class/module.
>
> But if you really want to go the SQL route, I believe there are
> ODBC adapters for CSV which, combined with PyODBC or CeODBC,
> would probably take you where you want to go.
>
> TJG

Part of the problem is that the 'selection' needs to be in a config
file. I can put the if row['status'] != 'Cancelled': return True into
a config, read it and eval it, but its not quite as clean as an sql
route.

Nick
--
http://mail.python.org/mailman/listinfo/python-list


Re: SQL and CSV

2009-05-05 Thread Tim Golden

Nick wrote:

I have a requirement to read a CSV file. Normally, no problem, just
import CSV and slurp the file up.

However, in this case I want to filter out lines that have fields set
to particular values.

It would be neat to be able to do something like this.

select * from test.csv where status <> "Canceled"

Using adodb I can do this, so long as I don't have the where clause. :-
(

Is there a reasonable lightweight way of doing this in Python?

I could write some python code that is used to filter rows, and inport
that from config, but it's not quite as elegant as an SQL route.



Not entirely clear what you are and aren't prepared to try here, but...
the most obvious Python-based way to do this is treating the csv reader
as an iterator and filtering there. Your last line suggests that's not
what you want but just in case I've misunderstood:


id,code,status
1,"ONE","Active"
2,"TWO","Cancelled"
3,"THREE","Active"



import csv

for row in csv.DictReader (open ("c:/temp/test.csv", "rb")):
 if row['status'] != 'Cancelled':
   print row



Doesn't seem too onerous, and could obviously be wrapped in
some useful class/module.

But if you really want to go the SQL route, I believe there are
ODBC adapters for CSV which, combined with PyODBC or CeODBC,
would probably take you where you want to go.

TJG
--
http://mail.python.org/mailman/listinfo/python-list


SQL and CSV

2009-05-05 Thread Nick
I have a requirement to read a CSV file. Normally, no problem, just
import CSV and slurp the file up.

However, in this case I want to filter out lines that have fields set
to particular values.

It would be neat to be able to do something like this.

select * from test.csv where status <> "Canceled"

Using adodb I can do this, so long as I don't have the where clause. :-
(

Is there a reasonable lightweight way of doing this in Python?

I could write some python code that is used to filter rows, and inport
that from config, but it's not quite as elegant as an SQL route.

Thanks

Nick
--
http://mail.python.org/mailman/listinfo/python-list