Re: [Tutor] Building an SQL query

2005-06-03 Thread Gabriel Farrell
On Thu, Jun 02, 2005 at 10:41:20PM +0100, Alan G wrote:
 Why not convert the list to a tuple before applying str():
 
 str(tuple(ids_to_process))

I'm just getting started with Python and PostgreSQL but I found that
str(tuple(valueList)) wouldn't work for me because I had a few values
with apostrophes.  PostgreSQL needed 'Lion''s Mane' but Python was
sending it Lion's Mane, so I ended up writing this little function:

def sqlNice(valueList):
    count = 1
    y = '('
    for x in valueList:
        x = x.replace("'", "''")
        y = y + "'" + x + "'"
        if count < len(valueList):
            y = y + ', '
        count = count + 1
    y = y + ')'
    y = y.replace("'NULL'", 'NULL')
    return y

Does anyone see any major stumbling blocks in that?  

On a side note, I've struggled with PyGreSQL.  At first I was using
the pg module, but I switched to pgdb when insert() wasn't working for
me and I thought I would have less trouble using something that's
DB-API compliant.  There seemed to be more documentation there, and I
figured it's a good idea to go with the standard.  However, it does
seem like I'm covering ground I'm sure someone else has already
crossed when I create these re functions to manipulate queries.  For
inserts, at least, it seems a Python dictionary should be able to do
the job nicely.
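For what it's worth, a dict really can drive the whole INSERT.  A rough
sketch (insert_sql is a made-up helper here, not part of any module)
that builds a pyformat statement from a dictionary and leaves all the
quoting to the driver:

```python
def insert_sql(table, row):
    # Build an INSERT with pyformat placeholders from a dict; the dict
    # itself goes to cursor.execute() as the second argument, so the
    # driver handles all of the quoting.
    cols = sorted(row)
    placeholders = ', '.join('%%(%s)s' % c for c in cols)
    return 'INSERT INTO %s (%s) VALUES (%s)' % (
        table, ', '.join(cols), placeholders)

sql = insert_sql('notes', {'noteType': None, 'note': "Lion's Mane"})
# sql == "INSERT INTO notes (note, noteType) VALUES (%(note)s, %(noteType)s)"
```

Then cursor.execute(sql, row) never interpolates the values itself.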

gabe
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Building an SQL query

2005-06-03 Thread Danny Yoo


On Fri, 3 Jun 2005, Gabriel Farrell wrote:

 def sqlNice(valueList):
     count = 1
     y = '('
     for x in valueList:
         x = x.replace("'", "''")
         y = y + "'" + x + "'"
         if count < len(valueList):
             y = y + ', '
         count = count + 1
     y = y + ')'
     y = y.replace("'NULL'", 'NULL')
     return y

 Does anyone see any major stumbling blocks in that?


Hi Gabriel,

Hmmm... there's got to be something already in PyGreSQL that does most of
that, and if there is, let's avoid reinventing the wheel.  Let me check...
ok, there is an undocumented function in pgdb.py that does the individual
value quotation: pgdb._quote().


It looks like you might be able to get away with:

##
def sqlNice(valueList):
    quotedValues = [str(pgdb._quote(x)) for x in valueList]
    return '(' + ','.join(quotedValues) + ')'
##


That being said, I feel nervous about advocating using an underscored
function, since that's a hint that it's not a part of the interface to the
pgdb module.  Maybe you can contact the author and ask if '_quote()' could
be renamed to 'quote()', or at least to provide some kind of official
access to the quote function.
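If relying on an underscored function feels too fragile, the core of the
escaping is small enough to sketch yourself.  This is a simplified
stand-in only -- it skips the backslash, date, and unicode handling that
pgdb's real _quote() does:

```python
def quote_value(x):
    # Double embedded apostrophes, pass numbers through, map None to NULL.
    if x is None:
        return 'NULL'
    if isinstance(x, (int, float)):
        return str(x)
    return "'" + str(x).replace("'", "''") + "'"

# quote_value("Lion's Mane") == "'Lion''s Mane'"
```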


... Wait a minute.  According to the content of the _quote() function, it
handles tuples and lists properly:

### pgdb.py, from PyGreSQL-3.6.2 ###
def _quote(x):
    if isinstance(x, DateTime.DateTimeType):
        x = str(x)
    elif isinstance(x, unicode):
        x = x.encode( 'utf-8' )
    if isinstance(x, types.StringType):
        x = "'" + string.replace(
            string.replace(str(x), '\\', '\\\\'), "'", "''") + "'"
    elif isinstance(x, (types.IntType, types.LongType, types.FloatType)):
        pass
    elif x is None:
        x = 'NULL'
    elif isinstance(x, (types.ListType, types.TupleType)):
        x = '(%s)' % string.join(map(lambda x: str(_quote(x)), x), ',')
    elif hasattr(x, '__pg_repr__'):
        x = x.__pg_repr__()
    else:
        raise InterfaceError, 'do not know how to handle type %s' % type(x)
    return x
##


So you should be able to pass lists without any problems.  Can you show us
what some of your SQL execution statements have looked like?  You should
be able to do something like:

##
cursor.execute("select name from foo where id in %s",
               ([1, 2, 3, 4, 5],))
##



Best of wishes!



Re: [Tutor] Building an SQL query (Gabriel Farrell)

2005-06-03 Thread Lloyd Kvam
The code to update the database should look something like:
the_cursor.execute( sql_cmd, data)

I am not using postgresql so I do not know the place-holder mark for
your module.  You can get that from the module documentation.

So sql_cmd could be something like
"UPDATE my_table SET title=PLACE_HOLDER where record_id=PLACE_HOLDER"
If data gets passed as a tuple (some modules support dicts), data is
("Lion's Mane", 12345)

For mysql the place-holder is %s, and the code would be
the_cursor.execute("UPDATE my_table SET title=%s where record_id=%s",
    ("Lion's Mane", 12345))
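Conceptually the driver does something like the following (a toy
illustration only -- real drivers escape far more than apostrophes, and
you should never build the string yourself):

```python
def fake_escape(v):
    # Toy stand-in for the driver's escaping -- real drivers do more.
    if isinstance(v, str):
        return "'" + v.replace("'", "''") + "'"
    return str(v)

sql_cmd = "UPDATE my_table SET title=%s where record_id=%s"
filled = sql_cmd % tuple(fake_escape(v) for v in ("Lion's Mane", 12345))
# filled == "UPDATE my_table SET title='Lion''s Mane' where record_id=12345"
```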

 I'm just getting started with Python and PostgreSQL but I found that
 str(tuple(valueList)) wouldn't work for me because I had a few values
 with apostrophes.  PostgreSQL needed 'Lion''s Mane' but Python was
 sending it Lion's Mane, so I ended up writing this little function:
 
 def sqlNice(valueList):
     count = 1
     y = '('
     for x in valueList:
         x = x.replace("'", "''")
         y = y + "'" + x + "'"
         if count < len(valueList):
             y = y + ', '
         count = count + 1
     y = y + ')'
     y = y.replace("'NULL'", 'NULL')
     return y
 
 Does anyone see any major stumbling blocks in that?  
 
 On a side note, I've struggled with PyGreSQL.  At first I was using
 the pg module, but I switched to pgdb when insert() wasn't working for
 me and I thought I would have less trouble using something that's
 DB-API compliant.  There seemed to be more documentation there, and I
 figured it's a good idea to go with the standard.  However, it does
 seem like I'm covering ground I'm sure someone else has already
 crossed when I create these re functions to manipulate queries.  For
 inserts, at least, it seems a Python dictionary should be able to do
 the job nicely.
 
 gabe

-- 
Lloyd Kvam
Venix Corp



Re: [Tutor] Building an SQL query

2005-06-03 Thread Lee Harr
data = {}
data['ids_to_process'] = ['1','2','3','5','7','11']

query = '''
UPDATE my_table
   SET state = 'processed'
 WHERE id IN ARRAY%(ids_to_process)s
'''
db.execute(query, data)



Sorry. It should look like ...

query = '''
   UPDATE my_table
  SET state = 'processed'
WHERE id = ANY(ARRAY%(ids_to_process)s)
'''
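At the string level, before any driver-side quoting, that substitution
is just Python dict formatting, with str() of the list supplying the
square brackets:

```python
data = {'ids_to_process': ['1', '2', '3', '5', '7', '11']}
query = """UPDATE my_table
   SET state = 'processed'
 WHERE id = ANY(ARRAY%(ids_to_process)s)"""
rendered = query % data
# rendered ends with: WHERE id = ANY(ARRAY['1', '2', '3', '5', '7', '11'])
```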



Re: [Tutor] Building an SQL query (Gabriel Farrell)

2005-06-03 Thread Gabriel Farrell
On Fri, Jun 03, 2005 at 03:50:09PM -0400, Lloyd Kvam wrote:
 The code to update the database should look something like:
 the_cursor.execute( sql_cmd, data)
 
In PyGreSQL/pgdb it's cursor.execute(query[, params]) but it means
more or less the same thing because pgdb's paramstyle (I knew from the
DB-API[1] to look in help(pgdb) for paramstyle) is pyformat.  I
googled that and found some explanation of pyformat in a message[2] on
the DB-SIG mailing list.  To quickly summarize that message, pyformat
means the string fed to cursor.execute() should follow all the usual
rules of Python string formatting.
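In other words, a pyformat query is an ordinary Python format string
with named placeholders; conceptually the driver quotes each value and
then applies plain %-formatting (the quoting below is a simplified
sketch of that idea):

```python
params = {'note': "Lion's Mane"}
query = "SELECT * FROM notes WHERE note = %(note)s"
# What the driver does, conceptually: quote each value, then format.
quoted = dict((k, "'" + v.replace("'", "''") + "'")
              for k, v in params.items())
filled = query % quoted
# filled == "SELECT * FROM notes WHERE note = 'Lion''s Mane'"
```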

Knowing this, I can now execute my query thusly:

>>> import pgdb
>>> db = pgdb.connect(database='asdc')
>>> cursor = db.cursor()
>>> data = {
... 'noteType': None,
... 'note': "Lion's Mane",
... 'recordIdentifier': 'gsf136'
... }
>>> cursor.execute("INSERT INTO notes (notetype, note, recordidentifier) \
... VALUES (%(noteType)s, %(note)s, %(recordIdentifier)s)", data)
>>> db.commit()

Note that the re matching I had to do before is now taken care of by
pgdb (in the _quote() function Danny Yoo was kind enough to track
down).  Before the query gets to PostgreSQL, the None value turns into
a NULL and Lion's Mane transforms into 'Lion''s Mane'.  No re
incantations necessary!

gabe

[1] http://www.python.org/peps/pep-0249.html
[2] http://aspn.activestate.com/ASPN/Mail/Message/db-sig/1632007


Re: [Tutor] Building an SQL query

2005-06-02 Thread Bob Gailer


At 06:48 AM 6/2/2005, Greg Lindstrom wrote:
Hello-
I am building a query to hit a Postgres (8.0.1) database from Python
(4.2.1) on Linux. Here's how I've been doing it for the past year
or so:
data = {}
data['start_date'] = '2005-6-2'
data['last_name'] = 'Johnson'
query = '''
 SELECT * 
 FROM my_table
 WHERE date = '%(start_date)s'
 AND last_name = '%(last_name)s'
''' % data
results = my_database.Execute(query)
and life has been good. What I would like to do now is use the
Postgres IN operator. For example:
ids_to_process = ('1','2','3','5','7','11')
I would like to get something akin to:
query = '''
 UPDATE my_table 
 SET state = 'processed' 
 WHERE id IN ids_to_process
'''
Do you want, in this case, the where clause to be : WHERE id IN
('1','2','3','5','7','11')?
If so, how about:
query = '''
 UPDATE my_table
 SET state = 'processed'
 WHERE id IN ''' + str(ids_to_process) 
This would, of course, set the
'state' column to 'processed' for all of the ids in the list, but can not
figure out how to get this into a query to pass to the database.
Have any of you smart cookies out there dealt with this? There are
other ways to get the job done, worst case being writing a loop and
issuing an UPDATE for each id, but that is not too elegant,
IMHO.
Any help or pointers would be greatly appreciated,
--greg

Bob Gailer
mailto:[EMAIL PROTECTED]
510 558 3275 home
720 938 2625 cell 



Re: [Tutor] Building an SQL query

2005-06-02 Thread Alan G
 I am building a query to hit a Postgres (8.0.1) database
 from Python (4.2.1) on Linux. Here's how I've been doing
 it for the past year or so:
 ...
 query = '''
 SELECT *
 FROM my_table
 


It's a really bad idea to use SELECT * FROM in production code.
There are two main reasons:
1) If the database structure changes your code is likely to break
since SELECT * does not normally guarantee anything about the order
of fields returned, so if the table gets an extra field added you
might find the order changing. At the very least there will be an
extra item in your tuple of values returned which may well break
your code.
2) Using * also prevents the database from precompiling your query
and caching it, thus you will slow down the processing by forcing a
SQL compile step each time. (This is true on Oracle, DB2 and
Interbase; I don't know about Postgres but assume it is similar.)

Of course if you are the only user and the database is small these
are not likely to be major issues, but if two or more apps use the
same database or if many users are hitting it, it could be.

SELECT * is great for experimenting but in production code it's
much safer to explicitly list the fields that you want back.

 query = '''
 UPDATE my_table
 SET state = 'processed'
 WHERE id IN ids_to_process
 '''

 This would, of course, set the 'state' column to 'processed'
 for all of the ids in the list, but can not figure out how
 to get this into a query to pass to the database.

What have you tried? What happened? It should just be a case
of using variable interpolation as you did for the Select.

Alan G.



Re: [Tutor] Building an SQL query

2005-06-02 Thread Lee Harr
data = {}
data['start_date'] = '2005-6-2'
data['last_name'] = 'Johnson'

query = '''
SELECT *
  FROM my_table
 WHERE date = '%(start_date)s'
   AND last_name = '%(last_name)s'
''' % data
results = my_database.Execute(query)


First up. This is a bad idea.

It may be ok now, as long as you have absolute control
over what start_date and last_name are, but what about
next week when you decide ... let's allow the user to put
in the dates for start_date and they make start_date
'6-2-05'; DELETE FROM my_table; SELECT * FROM my_table
WHERE date='6-2-05' 
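To make the danger concrete, here is the string that interpolation
would actually produce from such an input (hypothetical table,
obviously -- don't run it):

```python
start_date = "6-2-05'; DELETE FROM my_table; --"
query = ("SELECT * FROM my_table WHERE date = '%(start_date)s'"
         % {'start_date': start_date})
# query now contains a second, destructive statement:
# SELECT * FROM my_table WHERE date = '6-2-05'; DELETE FROM my_table; --'
```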

Instead, use the arg quoting mechanism from the db
interface you are using. You don't say which one that
is, but it should look something like ...

data = {}
data['start_date'] = '2005-6-2'
data['last_name'] = 'Johnson'

query = '''
SELECT *
  FROM my_table
 WHERE date = '%(start_date)s'
   AND last_name = '%(last_name)s'
'''
results = my_database.execute(query, data)



ids_to_process = ('1','2','3','5','7','11')

I would like to get something akin to:

query = '''
   UPDATE my_table
  SET state = 'processed'
WHERE id IN ids_to_process
'''

You can use an array type in postgres. I use something
like this:

data = {}
data['ids_to_process'] = ['1','2','3','5','7','11']

query = '''
   UPDATE my_table
  SET state = 'processed'
WHERE id IN ARRAY%(ids_to_process)s
'''
db.execute(query, data)


Notice that I changed the data tuple to a list. The postgres
syntax for an array is ARRAY[1, 2, 3, 5, 7, 11] so you need
the square brackets that a list will give you.
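The bracket difference is easy to see at the prompt: str() of a list
gives square brackets, str() of a tuple gives parentheses:

```python
ids = ['1', '2', '3', '5', '7', '11']
as_list = str(ids)          # "['1', '2', '3', '5', '7', '11']"
as_tuple = str(tuple(ids))  # "('1', '2', '3', '5', '7', '11')"
```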

It's kind of a hack, but as far as I know none of the other
databases have arrays, so the db-api people are not into
creating a special array access method.



Re: [Tutor] Building an SQL query

2005-06-02 Thread Greg Lindstrom
On 6/2/05, Alan G [EMAIL PROTECTED] wrote:
 It's a really bad idea to use SELECT * FROM in production code.
 There are two main reasons:
 1) If the database structure changes your code is likely to break
 since SELECT * does not normally guarantee anything about the order
 of fields returned, so if the table gets an extra field added you
 might find the order changing. At the very least there will be an
 extra item in your tuple of values returned which may well break
 your code.

Alan, 

I'm using SELECT * specifically for this reason! I have the
query and customer specific data layouts stored in a database and am
using ADOpy to associate the field names to locations in a data
segment. Doing it this way allows us to modify the query and/or
the layout in the database without touching the code that does all of
the heavy lifting. Using this strategy, we are able to perform
all of our extractions to customer specific layouts with one rather
small Python routine (this was the topic I spoke on at PyCon 2005) and
with the web pages we've created our non-technical personnel can
create/modify customer layouts. It started off as a project to do
our daily data pulls but has grown to handle virtually every report we
generate.

 2) Using * also prevents the database from precompiling your query
 and caching it, thus you will slow down the processing by forcing a
 SQL compile step each time. (This is true on Oracle, DB2 and
 Interbase; don't know about Postgres but assume it is similar.)

You are correct and in the future this may be a problem but currently
we will accept the time penalty to gain the flexibility described above.

 query = ''' UPDATE my_table SET state = 'processed' WHERE id IN ids_to_process '''

 What have you tried? What happened? It should just be a case
 of using variable interpolation as you did for the Select.
Here's what works for me (and a tip-o-the-hat to Bob Gailer for his help)

query = '''UPDATE my_table SET state = 'processed' WHERE id IN %s''' % str(ids_to_process)
query = query.replace('[', '(')
query = query.replace(']', ')')
results = adocursor.Execute(query)

Notice that I have to replace [ with ( and ] with ). A small
bother but the results are allowing me to perform much more complex
queries out of the database. I'm always open to ways to improve
this (in particular I would like to avoid recompiling the query every
time this is hit). It just hit me that we could store the field
names to select in the query right along with everything else...I think
I'll try it to see what sort of increase we get because we plan on
growing our business.

Thanks for responding,
--greg




Re: [Tutor] Building an SQL query

2005-06-02 Thread Alan G
  SELECT * does not normally guarantee anything about the order of
  fields returned, so if the table gets an extra field added you
  might find the order changing.

 I'm using SELECT * specifically for this reason! I have the query
 and customer specific data layouts stored in a database and am using
 ADOpy to associate the field names to locations in a data segment.

Hmm, I dunno ADOpy but assume it somehow miraculously turns your data
set into a dictionary of some sort? How it guesses which order the
SELECT will return the fields is a mystery to me, but maybe it has
knowledge of the Postgres hashing function or somesuch.

 Here's what works for me (and a tip-o-the-hat to Bob Gailer for his
 help)

 query = '''UPDATE my_table SET state = 'processed' WHERE id IN %s''' %
 str(ids_to_process)
 query = query.replace('[', '(')

Why not convert the list to a tuple before applying str():

str(tuple(ids_to_process))
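One caveat with str(tuple(...)): repr() does not double embedded
apostrophes -- for a value like Lion's Mane it switches to double
quotes instead, which PostgreSQL won't accept as a string literal:

```python
plain = str(('1', '2', '3'))   # "('1', '2', '3')" -- fine as-is
risky = str(("Lion's Mane",))
# repr() wraps the value in double quotes and leaves the apostrophe
# single, so the result is not a valid SQL string literal.
```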


 like to avoid recompiling the query every time this is hit).
 It just hit me that we could store the field names to select
 in the query right along with everything else...

That's what we usually do, so that if we do need to change the data
retrieved for multiple queries we only change it in one place, but
still keep the predictability of defined field names.

 I think I'll try it to see what sort of increase we get
 because we plan on growing our business.

Unless ADOpy is very slow I wouldn't expect a huge performance
increase since it will only be the compile phase, but if you
hit the query a lot then maybe 5-10%. You are more likely to
see the benefit in a drop in CPU loading on the server.

Alan G.



Re: [Tutor] Building an SQL query

2005-06-02 Thread Roger Merchberger
Rumor has it that Alan G may have mentioned these words:

Hmm, I dunno ADOpy but assume it somehow miraculously turns your data
set into a dictionary of some sort?

I dunno ADOpy, but the pg module for PostgreSQL can return a list of 
dictionaries from a query.

  import pg
  pg.set_defuser('example')
  pg.set_defpasswd('example')
  pg.set_defbase('spam')
  pgpasswd = pg.DB()

  pgpasswd.query('select dtg, classc, helo from ips limit 3;').dictresult()

[{'helo': 'veda.cz', 'dtg': '2005-03-30', 'classc': '85.39.122'},
  {'helo': 'ck336290-a.dokku1.fr.home.nl', 'dtg': '2005-03-30', 'classc': 
'217.123.211'},
  {'helo': 'keymaster.com', 'dtg': '2005-03-30', 'classc': '220.73.88'}]

*** output edited slightly with carriage returns ***

  How it guesses which order the
SELECT will return the fields is a mystery to me,

It's a mystery to some RDBs as well, IIRC with SQL there's no 'default 
behavior' -- if it's not defined, it can be spit out in any order it 
chooses; prolly depends on the implementation.

It might also matter how the indices & primary keys are set up as to what
comes out first...

  but maybe it has
knowledge of the Postgres hashing function or somesuch.

With dictionaries, it doesn't matter nearly so much. ;-)

The ease with which data can be I/O'd thru PostgreSQL with Python is one of 
the main factors of my dumping Perl for it; I can spend more time diddling 
with the data than I need to do making the proggies run.
;-)

HTH,
Roger Merch Merchberger

--
Roger Merch Merchberger   | Bugs of a feather flock together.
sysadmin, Iceberg Computers |   Russell Nelson
[EMAIL PROTECTED]  |



Re: [Tutor] Building an SQL query

2005-06-02 Thread Greg Lindstrom

 Hmm, I dunno ADOpy but assume it somehow miraculously turns your data
 set into a dictionary of some sort? How it guesses which order the
 SELECT will return the fields is a mystery to me, but maybe it has
 knowledge of the Postgres hashing function or somesuch.

Yeah. I used to do it by hand by looking at the description of the
cursor object, which holds the name of each field as defined in the
database along with other stuff. You can then get the field value by
getting the index of the field name, then hitting the row data. It's
not a lot of fun, but I did it because I didn't want to depend on the
order being returned in case we wanted to drop a field for some reason.

 Here's what works for me (and a tip-o-the-hat to Bob Gailer for his
 help)

 query = '''UPDATE my_table SET state = 'processed' WHERE id IN %s''' %
 str(ids_to_process)
 query = query.replace('[', '(')

 Why not convert the list to a tuple before applying str():

 str(tuple(ids_to_process))

Didn't think of it...thanks :-)

 Unless ADOpy is very slow I wouldn't expect a huge performance
 increase since it will only be the compile phase, but if you
 hit the query a lot then maybe 5-10%. You are more likely to
 see the benefit in a drop in CPU loading on the server.

Which is good, too.