Re: [Tutor] Building an SQL query
On Thu, Jun 02, 2005 at 10:41:20PM +0100, Alan G wrote:
> Why not convert the list to a tuple before applying str():
> str(tuple(ids_to_process))

I'm just getting started with Python and PostgreSQL, but I found that
str(tuple(valueList)) wouldn't work for me because I had a few values with
apostrophes. PostgreSQL needed 'Lion''s Mane' but Python was sending it
'Lion's Mane', so I ended up writing this little function:

    def sqlNice(valueList):
        count = 1
        y = '('
        for x in valueList:
            x = x.replace("'", "''")
            y = y + "'" + x + "'"
            if count < len(valueList):
                y = y + ', '
            count = count + 1
        y = y + ')'
        y = y.replace("'NULL'", 'NULL')
        return y

Does anyone see any major stumbling blocks in that?

On a side note, I've struggled with PyGreSQL. At first I was using the pg
module, but I switched to pgdb when insert() wasn't working for me and I
thought I would have less trouble using something that's DB-API compliant.
There seemed to be more documentation there, and I figured it's a good idea
to go with the standard. However, it does seem like I'm covering ground I'm
sure someone else has already crossed when I create these re functions to
manipulate queries. For inserts, at least, it seems a Python dictionary
should be able to do the job nicely.

gabe

___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
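[Editor's note: the quoting idea in sqlNice() can be sketched in modern Python 3 like this. The function name and the sample values are illustrative, not from the thread; hand-rolled quoting like this is fragile, and the rest of the thread converges on letting the driver do it instead.]

```python
def sql_nice(values):
    """Render a Python list as a parenthesized SQL value list,
    doubling apostrophes the way PostgreSQL expects."""
    parts = []
    for v in values:
        if v is None:
            parts.append('NULL')  # None maps to an unquoted SQL NULL
        else:
            parts.append("'" + str(v).replace("'", "''") + "'")
    return '(' + ', '.join(parts) + ')'

print(sql_nice(["Lion's Mane", 'Oyster', None]))
# → ('Lion''s Mane', 'Oyster', NULL)
```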
Re: [Tutor] Building an SQL query
On Fri, 3 Jun 2005, Gabriel Farrell wrote:

> def sqlNice(valueList):
>     count = 1
>     y = '('
>     for x in valueList:
>         x = x.replace("'", "''")
>         y = y + "'" + x + "'"
>         if count < len(valueList):
>             y = y + ', '
>         count = count + 1
>     y = y + ')'
>     y = y.replace("'NULL'", 'NULL')
>     return y
>
> Does anyone see any major stumbling blocks in that?

Hi Gabriel,

Hmmm... there's got to be something already in PyGreSQL that does most of
that, and if there is, let's avoid reinventing the wheel. Let me check...
ok, there is an undocumented function in pgdb.py that does the individual
value quotation: pgdb._quote(). It looks like you might be able to get away
with:

##
def sqlNice(valueList):
    quotedValues = [str(pgdb._quote(x)) for x in valueList]
    return '(' + ','.join(quotedValues) + ')'
##

That being said, I feel nervous about advocating the use of an underscored
function, since the underscore is a hint that it's not part of the public
interface of the pgdb module. Maybe you can contact the author and ask if
_quote() could be renamed to quote(), or at least ask for some kind of
official access to the quoting function.

... Wait a minute. According to the body of the _quote() function, it
already handles tuples and lists properly:

### pgdb.py, from PyGreSQL-3.6.2 ###
def _quote(x):
    if isinstance(x, DateTime.DateTimeType):
        x = str(x)
    elif isinstance(x, unicode):
        x = x.encode('utf-8')

    if isinstance(x, types.StringType):
        x = "'" + string.replace(
            string.replace(str(x), '\\', '\\\\'), "'", "''") + "'"
    elif isinstance(x, (types.IntType, types.LongType, types.FloatType)):
        pass
    elif x is None:
        x = 'NULL'
    elif isinstance(x, (types.ListType, types.TupleType)):
        x = '(%s)' % string.join(map(lambda x: str(_quote(x)), x), ',')
    elif hasattr(x, '__pg_repr__'):
        x = x.__pg_repr__()
    else:
        raise InterfaceError, 'do not know how to handle type %s' % type(x)
    return x
###

So you should be able to pass lists without any problems. Can you show us
what some of your SQL execution statements have looked like?
You should be able to do something like:

##
cursor.execute("select name from foo where id in %s",
               ([1, 2, 3, 4, 5],))
##

Best of wishes!
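[Editor's note: for DB-API modules that do not accept a whole list as one parameter, a common alternative is to generate one placeholder per value and let execute() bind them. This sketch (table and column names are illustrative) only builds the query string; the actual call would be `cursor.execute(in_query(ids), tuple(ids))`.]

```python
def in_query(ids):
    # one %s placeholder per value; execute() fills them in safely
    placeholders = ', '.join(['%s'] * len(ids))
    return 'select name from foo where id in (%s)' % placeholders

print(in_query([1, 2, 3, 4, 5]))
# → select name from foo where id in (%s, %s, %s, %s, %s)
```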
Re: [Tutor] Building an SQL query (Gabriel Farrell)
The code to update the database should look something like:

    the_cursor.execute(sql_cmd, data)

I am not using PostgreSQL, so I do not know the place-holder mark for your
module. You can get that from the module documentation. sql_cmd could be
something like:

    UPDATE my_table SET title=PLACE_HOLDER WHERE record_id=PLACE_HOLDER

If data gets passed as a tuple (some modules support dicts), data is:

    ("Lion's Mane", 12345)

For MySQL the place-holder is %s, and the code would be:

    the_cursor.execute("UPDATE my_table SET title=%s WHERE record_id=%s",
                       ("Lion's Mane", 12345))

> I'm just getting started with Python and PostgreSQL but I found that
> str(tuple(valueList)) wouldn't work for me because I had a few values
> with apostrophes. PostgreSQL needed 'Lion''s Mane' but Python was
> sending it 'Lion's Mane', so I ended up writing this little function:
> [sqlNice() snipped]
>
> Does anyone see any major stumbling blocks in that?

--
Lloyd Kvam
Venix Corp
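[Editor's note: a runnable stand-in for the placeholder mechanism described above. sqlite3 is used here purely because it ships with Python; its paramstyle is qmark, so `?` plays the role of `%s`. Table and sample data follow the thread's example.]

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE my_table (record_id INTEGER, title TEXT)')
cur.execute('INSERT INTO my_table VALUES (?, ?)', (12345, 'untitled'))

# the driver quotes the apostrophe in "Lion's Mane" for us
cur.execute('UPDATE my_table SET title=? WHERE record_id=?',
            ("Lion's Mane", 12345))

cur.execute('SELECT title FROM my_table WHERE record_id=?', (12345,))
print(cur.fetchone()[0])
# → Lion's Mane
```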
Re: [Tutor] Building an SQL query
> data = {}
> data['ids_to_process'] = ['1','2','3','5','7','11']
> query = '''
>     UPDATE my_table
>     SET state = 'processed'
>     WHERE id IN ARRAY%(ids_to_process)s
> '''
> db.execute(query, data)

Sorry. It should look like:

    query = '''
        UPDATE my_table
        SET state = 'processed'
        WHERE id = ANY(ARRAY%(ids_to_process)s)
    '''
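[Editor's note: at the string level, the substitution above relies on str() of a Python list rendering with square brackets. This sketch shows only the plain %-interpolation step, with no database involved; whether the driver actually passes the ARRAY literal through like this depends on the module and version, and plain interpolation is only reasonable here because the ids are trusted values.]

```python
data = {'ids_to_process': [1, 2, 3, 5, 7, 11]}
query = ("UPDATE my_table SET state = 'processed' "
         "WHERE id = ANY(ARRAY%(ids_to_process)s)" % data)
print(query)
# → UPDATE my_table SET state = 'processed' WHERE id = ANY(ARRAY[1, 2, 3, 5, 7, 11])
```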
Re: [Tutor] Building an SQL query (Gabriel Farrell)
On Fri, Jun 03, 2005 at 03:50:09PM -0400, Lloyd Kvam wrote:
> The code to update the database should look something like:
>     the_cursor.execute(sql_cmd, data)

In PyGreSQL/pgdb it's cursor.execute(query[, params]), but it means more or
less the same thing, because pgdb's paramstyle (I knew from the DB-API[1]
to look in help(pgdb) for paramstyle) is pyformat. I googled that and found
some explanation of pyformat in a message[2] on the DB-SIG mailing list. To
quickly summarize that message, pyformat means the string fed to
cursor.execute() should follow all the usual rules of Python string
formatting. Knowing this, I can now execute my query thusly:

    >>> import pgdb
    >>> db = pgdb.connect(database='asdc')
    >>> cursor = db.cursor()
    >>> data = {
    ...     'noteType': None,
    ...     'note': "Lion's Mane",
    ...     'recordIdentifier': 'gsf136'
    ... }
    >>> cursor.execute("INSERT INTO notes (notetype, note, recordidentifier) \
    ...     VALUES (%(noteType)s, %(note)s, %(recordIdentifier)s)", data)
    >>> db.commit()

Note that the re matching I had to do before is now taken care of by pgdb
(in the _quote() function Danny Yoo was kind enough to track down). Before
the query gets to PostgreSQL, the None value turns into a NULL and
"Lion's Mane" transforms into 'Lion''s Mane'. No re incantations necessary!

gabe

[1] http://www.python.org/peps/pep-0249.html
[2] http://aspn.activestate.com/ASPN/Mail/Message/db-sig/1632007
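[Editor's note: the None-to-NULL and apostrophe handling that pgdb performs can be demonstrated with sqlite3, chosen only because it is in the standard library; it uses `:name` placeholders instead of `%(name)s`, but the driver-side quoting behaviour is the same idea.]

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE notes '
            '(notetype TEXT, note TEXT, recordidentifier TEXT)')

data = {'noteType': None, 'note': "Lion's Mane", 'recordIdentifier': 'gsf136'}
# the driver turns None into NULL and escapes the apostrophe itself
cur.execute('INSERT INTO notes VALUES (:noteType, :note, :recordIdentifier)',
            data)

cur.execute('SELECT notetype, note FROM notes')
print(cur.fetchone())
# → (None, "Lion's Mane")
```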
Re: [Tutor] Building an SQL query
At 06:48 AM 6/2/2005, Greg Lindstrom wrote:
> Hello-
>
> I am building a query to hit a Postgres (8.0.1) database from Python
> (2.4.1) on Linux. Here's how I've been doing it for the past year or so:
>
>     data = {}
>     data['start_date'] = '2005-6-2'
>     data['last_name'] = 'Johnson'
>     query = '''
>         SELECT * FROM my_table
>         WHERE date = '%(start_date)s'
>         AND last_name = '%(last_name)s'
>     ''' % data
>     results = my_database.Execute(query)
>
> and life has been good. What I would like to do now is use the Postgres
> IN operator. For example:
>
>     ids_to_process = ('1','2','3','5','7','11')
>
> I would like to get something akin to:
>
>     query = '''
>         UPDATE my_table
>         SET state = 'processed'
>         WHERE id IN ids_to_process
>     '''

Do you want, in this case, the WHERE clause to be:

    WHERE id IN ('1','2','3','5','7','11')

If so, how about:

    query = '''
        UPDATE my_table
        SET state = 'processed'
        WHERE id IN ''' + str(ids_to_process)

> This would, of course, set the 'state' column to 'processed' for all of
> the ids in the list, but I can not figure out how to get this into a
> query to pass to the database. Have any of you smart cookies out there
> dealt with this? There are other ways to get the job done, worst case
> being writing a loop and issuing an UPDATE for each id, but that is not
> too elegant, IMHO. Any help or pointers would be greatly appreciated,
>
> --greg

Bob Gailer
mailto:[EMAIL PROTECTED]
510 558 3275 home
720 938 2625 cell
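[Editor's note: Bob's str(ids_to_process) suggestion works because str() of a tuple already prints the parentheses and per-item quotes. The single-element caveat below is an editorial addition, not raised in the thread.]

```python
ids_to_process = ('1', '2', '3', '5', '7', '11')
clause = str(ids_to_process)
print('WHERE id IN ' + clause)
# → WHERE id IN ('1', '2', '3', '5', '7', '11')

# caveat: a one-element tuple keeps Python's trailing comma,
# which is not valid SQL
print(str(('1',)))
# → ('1',)
```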
Re: [Tutor] Building an SQL query
> I am building a query to hit a Postgres (8.0.1) database from Python
> (2.4.1) on Linux. Here's how I've been doing it for the past year or so:
> ...
> query = ''' SELECT * FROM my_table

It's a really bad idea to use SELECT * FROM in production code. There are
two main reasons:

1) If the database structure changes, your code is likely to break, since
SELECT * does not normally guarantee anything about the order of fields
returned. If the table gets an extra field added you might find the order
changing; at the very least there will be an extra item in your tuple of
values returned, which may well break your code.

2) Using * also prevents the database from precompiling your query and
caching it, so you will slow down the processing by forcing a SQL compile
step each time. (This is true on Oracle, DB2 and Interbase; I don't know
about Postgres but assume it is similar.)

Of course, if you are the only user and the database is small, these are
not likely to be major issues, but if two or more apps use the same
database, or if many users are hitting it, they could be. SELECT * is
great for experimenting, but in production code it's much safer to
explicitly list the fields that you want back.

> query = '''
>     UPDATE my_table
>     SET state = 'processed'
>     WHERE id IN ids_to_process
> '''
> This would, of course, set the 'state' column to 'processed' for all of
> the ids in the list, but I can not figure out how to get this into a
> query to pass to the database.

What have you tried? What happened? It should just be a case of using
variable interpolation as you did for the SELECT.

Alan G.
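[Editor's note: Alan's first point, positional breakage, can be made concrete with sqlite3. The schema is illustrative; the point is that an explicit column list pins each position, while SELECT * silently grows when the table does.]

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE my_table (last_name TEXT, date TEXT)')
cur.execute("INSERT INTO my_table VALUES ('Johnson', '2005-6-2')")

# explicit column list: each position in the row tuple is guaranteed
cur.execute('SELECT last_name, date FROM my_table')
last_name, date = cur.fetchone()
print(last_name, date)
# → Johnson 2005-6-2

# after an ALTER TABLE ... ADD COLUMN, SELECT * returns an extra item,
# so the same two-name unpacking would raise ValueError
cur.execute('ALTER TABLE my_table ADD COLUMN state TEXT')
cur.execute('SELECT * FROM my_table')
print(len(cur.fetchone()))
# → 3
```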
Re: [Tutor] Building an SQL query
> data = {}
> data['start_date'] = '2005-6-2'
> data['last_name'] = 'Johnson'
> query = '''
>     SELECT * FROM my_table
>     WHERE date = '%(start_date)s'
>     AND last_name = '%(last_name)s'
> ''' % data
> results = my_database.Execute(query)

First up: this is a bad idea. It may be OK now, as long as you have
absolute control over what start_date and last_name are, but what about
next week, when you decide to let the user put in the dates for
start_date, and they make start_date

    6-2-05'; DELETE FROM my_table; SELECT * FROM my_table WHERE date='6-2-05

Instead, use the argument quoting mechanism from the db interface you are
using. You don't say which one that is, but it should look something like:

    data = {}
    data['start_date'] = '2005-6-2'
    data['last_name'] = 'Johnson'
    query = '''
        SELECT * FROM my_table
        WHERE date = %(start_date)s
        AND last_name = %(last_name)s
    '''
    results = my_database.execute(query, data)

> ids_to_process = ('1','2','3','5','7','11')
> I would like to get something akin to:
> query = '''
>     UPDATE my_table
>     SET state = 'processed'
>     WHERE id IN ids_to_process
> '''

You can use an array type in postgres. I use something like this:

    data = {}
    data['ids_to_process'] = ['1','2','3','5','7','11']
    query = '''
        UPDATE my_table
        SET state = 'processed'
        WHERE id IN ARRAY%(ids_to_process)s
    '''
    db.execute(query, data)

Notice that I changed the data tuple to a list. The postgres syntax for an
array is ARRAY[1, 2, 3, 5, 7, 11], so you need the square brackets that a
list will give you. It's kind of a hack, but as far as I know none of the
other databases have arrays, so the db-api people are not into creating a
special array access method.
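[Editor's note: the hostile-input scenario above can be demonstrated end to end with sqlite3 (a standard-library stand-in for the Postgres interface). Bound as a parameter, the injected string is just data and the attack fails.]

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE my_table (last_name TEXT, date TEXT)')
cur.execute("INSERT INTO my_table VALUES ('Johnson', '2005-6-2')")

hostile = "6-2-05'; DELETE FROM my_table; --"
# bound as a parameter, the hostile string is just a date that matches nothing
cur.execute('SELECT * FROM my_table WHERE date = ?', (hostile,))
print(cur.fetchall())
# → []

cur.execute('SELECT count(*) FROM my_table')
print(cur.fetchone()[0])   # the table survives intact
# → 1
```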
Re: [Tutor] Building an SQL query
On 6/2/05, Alan G [EMAIL PROTECTED] wrote:
> It's a really bad idea to use SELECT * FROM in production code. There
> are two main reasons:
> 1) If the database structure changes your code is likely to break,
> since SELECT * does not normally guarantee anything about the order of
> fields returned, so if the table gets an extra field added you might
> find the order changing. At the very least there will be an extra item
> in your tuple of values returned, which may well break your code.

Alan,

I'm using SELECT * specifically for this reason! I have the query and
customer-specific data layouts stored in a database and am using ADOpy to
associate the field names to locations in a data segment. Doing it this way
allows us to modify the query and/or the layout in the database without
touching the code that does all of the heavy lifting. Using this strategy,
we are able to perform all of our extractions to customer-specific layouts
with one rather small Python routine (this was the topic I spoke on at
PyCon 2005), and with the web pages we've created our non-technical
personnel can create/modify customer layouts. It started off as a project
to do our daily data pulls but has grown to handle virtually every report
we generate.

> 2) Using * also prevents the database from precompiling your query and
> caching it, thus you will slow down the processing by forcing a SQL
> compile step each time. (This is true on Oracle, DB2 and Interbase; I
> don't know about Postgres but assume it is similar.)

You are correct, and in the future this may be a problem, but currently we
will accept the time penalty to gain the flexibility described above.

> > query = '''
> >     UPDATE my_table
> >     SET state = 'processed'
> >     WHERE id IN ids_to_process
> > '''
> What have you tried? What happened? It should just be a case of using
> variable interpolation as you did for the Select.
Here's what works for me (and a tip-o-the-hat to Bob Gailer for his help):

    query = '''UPDATE my_table SET state = 'processed'
               WHERE id IN %s''' % str(ids_to_process)
    query = query.replace('[', '(')
    query = query.replace(']', ')')
    results = adocursor.Execute(query)

Notice that I have to replace '[' with '(' and ']' with ')'. A small
bother, but the results are allowing me to perform much more complex
queries out of the database. I'm always open to ways to improve this (in
particular I would like to avoid recompiling the query every time this is
hit). It just hit me that we could store the field names to select in the
query right along with everything else... I think I'll try it to see what
sort of increase we get, because we plan on growing our business.

Thanks for responding,
--greg
Re: [Tutor] Building an SQL query
> > SELECT * does not normally guarantee anything about the order of
> > fields returned, so if the table gets an extra field added you might
> > find the order changing.
> I'm using SELECT * specifically for this reason! I have the query and
> customer-specific data layouts stored in a database and am using ADOpy
> to associate the field names to locations in a data segment.

Hmm, I dunno ADOpy, but I assume it somehow miraculously turns your data
set into a dictionary of some sort? How it guesses which order the SELECT
will return the fields is a mystery to me, but maybe it has knowledge of
the Postgres hashing function or somesuch.

> Here's what works for me (and a tip-o-the-hat to Bob Gailer for his
> help):
>     query = '''UPDATE my_table SET state = 'processed'
>                WHERE id IN %s''' % str(ids_to_process)
>     query = query.replace('[', '(')

Why not convert the list to a tuple before applying str():

    str(tuple(ids_to_process))

> like to avoid recompiling the query every time this is hit). It just
> hit me that we could store the field names to select in the query right
> along with everything else...

That's what we usually do, so that if we do need to change the data
retrieved for multiple queries we only change it in one place, but still
keep the predictability of defined field names.

> I think I'll try it to see what sort of increase we get because we plan
> on growing our business.

Unless ADOpy is very slow I wouldn't expect a huge performance increase,
since it will only be the compile phase, but if you hit the query a lot
then maybe 5-10%. You are more likely to see the benefit in a drop in CPU
loading on the server.

Alan G.
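[Editor's note: Alan's str(tuple(...)) shortcut and Greg's bracket-replace trick produce identical text for multi-element lists; the single-element difference noted below is an editorial observation.]

```python
ids = ['1', '2', '3', '5', '7', '11']

via_replace = str(ids).replace('[', '(').replace(']', ')')
via_tuple = str(tuple(ids))
print(via_replace)
# → ('1', '2', '3', '5', '7', '11')
print(via_tuple == via_replace)
# → True

# they differ for a single id: the tuple keeps Python's trailing comma
print(str(tuple(['1'])))
# → ('1',)
```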
Re: [Tutor] Building an SQL query
Rumor has it that Alan G may have mentioned these words:
> Hmm, I dunno ADOpy but assume it somehow miraculously turns your data
> set into a dictionary of some sort?

I dunno ADOpy, but the pg module for PostgreSQL can return a list of
dictionaries from a query:

    >>> import pg
    >>> pg.set_defuser('example')
    >>> pg.set_defpasswd('example')
    >>> pg.set_defbase('spam')
    >>> pgpasswd = pg.DB()
    >>> pgpasswd.query('select dtg, classc, helo from ips limit 3;').dictresult()
    [{'helo': 'veda.cz', 'dtg': '2005-03-30', 'classc': '85.39.122'},
     {'helo': 'ck336290-a.dokku1.fr.home.nl', 'dtg': '2005-03-30',
      'classc': '217.123.211'},
     {'helo': 'keymaster.com', 'dtg': '2005-03-30', 'classc': '220.73.88'}]

*** output edited slightly with carriage returns ***

> How it guesses which order the SELECT will return the fields is a
> mystery to me,

It's a mystery to some RDBs as well. IIRC, with SQL there's no 'default
behavior' -- if it's not defined, it can be spit out in any order it
chooses; prolly depends on the implementation. It might also matter how
the indices and primary keys are set up as to what comes out first...

> but maybe it has knowledge of the Postgres hashing function or
> somesuch.

With dictionaries, it doesn't matter nearly so much. ;-) The ease with
which data can be I/O'd thru PostgreSQL with Python is one of the main
factors in my dumping Perl for it; I can spend more time diddling with the
data than I need to spend making the proggies run. ;-)

HTH,
Roger "Merch" Merchberger
--
Roger "Merch" Merchberger | "Bugs of a feather flock together."
sysadmin, Iceberg Computers |   -- Russell Nelson
[EMAIL PROTECTED]
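[Editor's note: for DB-API modules without a dictresult() convenience, the same list-of-dicts shape can be built from cursor.description. Sketched with sqlite3 as a stand-in and the thread's sample column names; only one sample row is inserted here.]

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE ips (dtg TEXT, classc TEXT, helo TEXT)')
cur.execute("INSERT INTO ips VALUES ('2005-03-30', '85.39.122', 'veda.cz')")

cur.execute('SELECT dtg, classc, helo FROM ips LIMIT 3')
cols = [d[0] for d in cur.description]   # column name is item 0 of each 7-tuple
rows = [dict(zip(cols, row)) for row in cur.fetchall()]
print(rows)
# → [{'dtg': '2005-03-30', 'classc': '85.39.122', 'helo': 'veda.cz'}]
```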
Re: [Tutor] Building an SQL query
> Hmm, I dunno ADOpy but assume it somehow miraculously turns your data
> set into a dictionary of some sort? How it guesses which order the
> SELECT will return the fields is a mystery to me, but maybe it has
> knowledge of the Postgres hashing function or somesuch.

Yeah. I used to do it by hand by looking at the description of the cursor
object, which holds the name of each field as defined in the database,
along with other stuff. You can then get the field value by getting the
index of the field name and hitting the row data. It's not a lot of fun,
but I did it because I didn't want to depend on the order being returned,
in case we wanted to drop a field for some reason.

> > Here's what works for me (and a tip-o-the-hat to Bob Gailer for his
> > help):
> >     query = '''UPDATE my_table SET state = 'processed'
> >     WHERE id IN %s''' % str(ids_to_process)
> >     query = query.replace('[', '(')
> Why not convert the list to a tuple before applying str():
> str(tuple(ids_to_process))

Didn't think of it... thanks :-)

> Unless ADOpy is very slow I wouldn't expect a huge performance increase
> since it will only be the compile phase, but if you hit the query a lot
> then maybe 5-10%. You are more likely to see the benefit in a drop in
> CPU loading on the server.

Which is good, too.