Re: [HACKERS] Confusion over Python drivers

2010-02-15 Thread Jeff Davis
On Sun, 2010-02-14 at 20:43 +0100, Florian Weimer wrote:
 The downside is that passing strings up to the application may have
 distinctly worse performance characteristics than passing a number.

Yes, that is a good point. I tried to clarify this in the doc.

I think this would fall under the optional type conversion convenience
functions. As long as it's explicit that the conversion is happening, I
think it's OK.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-14 Thread Florian Weimer
* Jeff Davis:

 Agreed. Ultimately, the conversion has to be done somewhere, but I don't
 believe the driver is the place for it. Type conversions are always
 going to be imperfect, and this has some important consequences:
  * The type conversion system will be endlessly tweaked to improve it
  * Developers will always run into problems with it in any complex
 application, so we need to allow them to circumvent the system and do it
 themselves when necessary.

The downside is that passing strings up to the application may have
distinctly worse performance characteristics than passing a number.

 In ruby-pg, you can just do:

   conn.exec(INSERT INTO foo VALUES($1), [Jeff])

 And I think that's appropriate. What I'm saying is that there should
 still exist some way to pass explicit types or formats (although that
 should still be easier than it is in C ;). Here's the long form:

   conn.exec(INSERT INTO foo VALUES($1),
 [{:value = Jeff, :format = 0, :type = 0}])

Okay, this isn't too bad an API.  I will use the same approach.  In my
case, it means no transparent support for arrays, but per your own
guidelines, this is okay.

 That copies value so that foo and bar have the same contents: a 4 byte
 value \000. What would happen though, if val was transparently
 decoded? It would decode it once in ruby, and again inside of postgres
 (in byteain), leaving you with a one byte value in bar, even though foo
 has a four-byte value.

I've never viewed it from this angle, and I agree that it makes sense.

Thanks for your observations and explanations, they were helpful.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-13 Thread Greg Smith

Jeff Davis wrote:

Keep in mind that backwards compatibility is not the only issue here;
forwards compatibility matters as well*. A lot of the encoding issues I
wrote up ( http://wiki.postgresql.org/wiki/Driver_development ) will
probably be real bugs in a python3 application using a driver that
doesn't understand encoding properly.
  


Sure, but the benefit of being the de-facto standard to some problem, 
because you're already supported by everybody, is that it's easier to 
attract the most help to move forward too.  The disclaimers on 
investments always read past performance is not indicative of future 
results.  I don't think that's really true in the open-source project 
case.  Generally, I'd rather trust a project that's been keeping up with 
things for a long time already to continue to do so, particularly one 
that's built up a community around it already, rather than risk 
switching to one without a known good track record in that record. 

(Exercise for the reader: consider how what I just said applies in both 
a positive and negative way toward MySQL and its associated development 
model)


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-12 Thread Federico Di Gregorio
On 12/02/2010 01:00, Jeff Davis wrote:
 * I tried installing psycopg2-2.0.13 and the build system apparently
 doesn't support python3.1, so I assume that psycopg2 doesn't support
 python3 at all.

python3 was almost completely supported some months ago but then I had
to fix some bugs and almost 99% of psycopg users are still with python2
so the python2 branch is in a better shape. But most of the work is done
so, after the next release, I'll start porting changes to the python3
branch (master on git) and finish the work.

federico

-- 
Federico Di Gregorio   f...@initd.org
 Viva il pollo! Viva il pollo! -- sisterconfusion
 Continuo a preferire il coniglio -- vodka



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Confusion over Python drivers

2010-02-12 Thread Tom Lane
Andrew McNamara andr...@object-craft.com.au writes:
 The solution is to write the query in an unambiguous way:
 SELECT $1::date + 1;

 You are missing the point: this is not about what types the SQL execution
 sees. It is about making sure the correct recv function is applied to
 the binary parameter data.

Indeed, and the above locution does set that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-12 Thread Jeff Davis
On Fri, 2010-02-12 at 10:38 +0100, Federico Di Gregorio wrote:
 On 12/02/2010 01:00, Jeff Davis wrote:
  * I tried installing psycopg2-2.0.13 and the build system apparently
  doesn't support python3.1, so I assume that psycopg2 doesn't support
  python3 at all.
 
 python3 was almost completely supported some months ago but then I had
 to fix some bugs and almost 99% of psycopg users are still with python2
 so the python2 branch is in a better shape. But most of the work is done
 so, after the next release, I'll start porting changes to the python3
 branch (master on git) and finish the work.

Ok, sounds great.

Do you attempt to handle the encoding issues when working with python3?
I'd be interested to see your approach, because I didn't get much
feedback on the doc I wrote.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-12 Thread Andrew McNamara
Andrew McNamara andr...@object-craft.com.au writes:
 The solution is to write the query in an unambiguous way:
 SELECT $1::date + 1;

 You are missing the point: this is not about what types the SQL execution
 sees. It is about making sure the correct recv function is applied to
 the binary parameter data.

Indeed, and the above locution does set that.

Sure, but it requires the driver to modify the query - that isn't
reasonable or practical.  Expecting the user to the driver to know
and correct set the type the driver will ultimately see is a recipe
for disaster.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-11 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


The solution is to write the query in an unambiguous way:

  SELECT $1::date + 1;

which is good practice, anyway. If it's not obvious to the type
inference system, it's probably not obvious to you, and will probably
surprise you ;)

 That address this specific case, but it's ugly and not general. The right
 thing is to set the correct type when you're marshalling the parameters...

Well, ugly is in the eye of the beholder, and it certainly is a general 
solution. Any query with ambiguity in its parameters should explicitly 
declare the types, inside the query itself. Having the driver indicate 
the type should be the exception, not the rule.

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201002091811
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-11 Thread Jeff Davis
On Wed, 2010-02-10 at 23:13 -0500, Greg Smith wrote:
 Until then, working apps have to 
 be the primary motivation for what to work on here, unless there's a 
 really terrible problem with the driver.  The existing psycopg license 
 and the web site issues were in combination enough to reach that level 
 of total issues for a number of people.  With the news from Federico 
 that a license improvement is approaching and signs of major 
 documentation improvements, that problem seems like it will soon be 
 solved as far as I'm concerned.  When someone manages to make their 
 alternative driver a prerequisite for an app I need, only at that point 
 will that driver show back up on my radar.

Keep in mind that backwards compatibility is not the only issue here;
forwards compatibility matters as well*. A lot of the encoding issues I
wrote up ( http://wiki.postgresql.org/wiki/Driver_development ) will
probably be real bugs in a python3 application using a driver that
doesn't understand encoding properly.

Obviously this is less urgent than having a driver that works now, but
it's still important. I think we would attract some goodwill from the
python community if we were helping them move to python3, rather than
sitting around waiting 'til they've already moved and decided that they
can't use postgresql.

Regards,
Jeff Davis

* I tried installing psycopg2-2.0.13 and the build system apparently
doesn't support python3.1, so I assume that psycopg2 doesn't support
python3 at all.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-11 Thread Andrew McNamara
The solution is to write the query in an unambiguous way:

  SELECT $1::date + 1;

which is good practice, anyway. If it's not obvious to the type
inference system, it's probably not obvious to you, and will probably
surprise you ;)

 That address this specific case, but it's ugly and not general. The right
 thing is to set the correct type when you're marshalling the parameters...

Well, ugly is in the eye of the beholder, and it certainly is a general 
solution. Any query with ambiguity in its parameters should explicitly 
declare the types, inside the query itself. Having the driver indicate 
the type should be the exception, not the rule.

You are missing the point: this is not about what types the SQL execution
sees. It is about making sure the correct recv function is applied to
the binary parameter data.  The server cannot reliably infer which recv
function to use based in query context (although it tries). A wrong
guess can lead to silent data corruption, which is utterly unacceptable.

If the client (driver) sets the type OID to match the format of the
binary parameter it sends, the server can unambiguously decode the data
(and cast the type, if need be).

I would go as far as to suggest that postgres should not accept binary
parameters with an unknown OID - it's dangerous, unreliable and serves
no purpose.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-11 Thread Andrew McNamara
Obviously this is less urgent than having a driver that works now, but
it's still important. I think we would attract some goodwill from the
python community if we were helping them move to python3, rather than
sitting around waiting 'til they've already moved and decided that they
can't use postgresql.

It's very, very difficult (but not impossible) to support both python 2
and 3 simultaneously, particularly if you have non-trivial C extension
code. Even the python gods will admit that it's still early days.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-10 Thread Kevin Ar18

I hope people don't mind my asking about this on the list... as I hinted at 
before, I don't really follow the development of PostgreSQL, I was just 
interested in the Python driver project that I heard about.

Anyways, as I understand it, the current goal is to use psycopg and get it 
changed to LGPL (assuming all the contributors of psycopg agree and confirm 
they did not use GPL code from any other location).  Is this correct?


When I first heard about the endeavor, I thought the goal was to take one or 
several of the non-copyleft projects, which were rather unfocused, and work 
with those teams to produce a really good implementation for Python.  However, 
as I understand it (based on what Greg told me) the license is not really an 
issue as long as it is not GPL; instead, the PostgreSQL team would mostly 
prefer something that is nearly done, so as to have to do much more work.  Is 
this a correct assessment?


Based on that, I guess my question is what would it have taken to have picked 
one of BSD/MIT projects and working with those people instead?  In other words, 
what key things affected the decision for psycopg?  What areas is it so far 
ahead in or that would have just been too much work to fix in the other 
implementations?


Anyways, I hope this message doesn't come across as bad form.  It's unfortunate 
for me that there was not a good enough BSD/MIT project; but I can live without 
right? :)  Still, I just thought I might ask and find out a little more about 
what the team was looking for in a PostgreSQL implementation, and maybe do a 
little research myself (to see if anything was missed).
  
_
Hotmail: Free, trusted and rich email service.
http://clk.atdmt.com/GBL/go/201469228/direct/01/

Re: [HACKERS] Confusion over Python drivers {license}

2010-02-10 Thread Tom Lane
Kevin Ar18 kevina...@hotmail.com writes:
 When I first heard about the endeavor, I thought the goal was to take
 one or several of the non-copyleft projects, which were rather
 unfocused, and work with those teams to produce a really good
 implementation for Python.  However, as I understand it (based on what
 Greg told me) the license is not really an issue as long as it is not
 GPL; instead, the PostgreSQL team would mostly prefer something that
 is nearly done, so as to have to do much more work.  Is this a correct
 assessment?

Well, all else being equal we'd certainly prefer a library that was
licensed more like the core Postgres database.  However, we don't have
infinite resources, and an LGPL license is not a showstopper (at least
not to the people who seem to be willing to work on this problem).
The attractiveness of the license has to be balanced against how much
work we'd have to put in and how long it will take to get results.

Not being a python user myself, I wasn't paying all that close attention
to the discussion, but that's my sense of how the decision went.

If you feel that a BSD/MIT license is a must-have for your purposes,
you're certainly free to push development of one of the other driver
projects instead, and to try to organize some other people to help.
I don't believe anyone is trying to funnel all development effort into
psycopg2.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-10 Thread Kevin Ar18

 Well, all else being equal we'd certainly prefer a library that was
 licensed more like the core Postgres database.  However, we don't have
 infinite resources, and an LGPL license is not a showstopper (at least
 not to the people who seem to be willing to work on this problem).
 The attractiveness of the license has to be balanced against how much
 work we'd have to put in and how long it will take to get results.
 
 Not being a python user myself, I wasn't paying all that close attention
 to the discussion, but that's my sense of how the decision went.
 
 If you feel that a BSD/MIT license is a must-have for your purposes,
 you're certainly free to push development of one of the other driver
 projects instead, and to try to organize some other people to help.
 I don't believe anyone is trying to funnel all development effort into
 psycopg2.
Thanks for the reply.

I guess that's good advice; I suppose I should just do that and talk to some of 
the teams about it.  It would probably help a lot to focus on just one 
implementation instead of several, even if it's not the same one as what the 
PostgreSQL team works on. :)
  
_
Hotmail: Trusted email with Microsoft’s powerful SPAM protection.
http://clk.atdmt.com/GBL/go/201469226/direct/01/

Re: [HACKERS] Confusion over Python drivers {license}

2010-02-10 Thread Greg Smith

Kevin Ar18 wrote:


Based on that, I guess my question is what would it have taken to have 
picked one of BSD/MIT projects and working with those people instead?  
In other words, what key things affected the decision for psycopg?  
What areas is it so far ahead in or that would have just been too much 
work to fix in the other implementations?


A lot of it as I see it is plain old fashioned popularity resulting from 
long sustained development.  psycopg has been around since 2001, the 
current version is already compatible with SQLAlchemy, Django, Zope, 
PyDO, and it's integral to the large Python/PostgreSQL toolchain from 
Skype including tools like Londiste.  When something is supported in 
that many places, and the main complaints are its license and 
documentation rather than its code, I know I'm not going to start by 
assuming I should just hack on somebody else's code to try and replace 
it just because their license is a little better.  And I'd consider 
BSD/MIT only a little better than the LGPL.


That sort of bad decision making is how we got to so many abandoned 
Python drivers in the first place.  If everybody who decided they were 
going to write their own from scratch had decided to work on carefully 
and painfully refactoring and improving PyGreSQL instead, in an 
incremental way that grew its existing community along the way, we might 
have a BSD driver with enough features and users to be a valid 
competitor to psycopg2.  But writing something shiny new from scratch is 
fun, while worrying about backward compatibility and implementing all 
the messy parts you need to really be complete on a project like this 
isn't, so instead we have a stack of not quite right drivers without any 
broad application support.


(Aside:  I have a bit of vocabulary I regularly use for this now.  
Open-source projects that just ignore working on an existing stack of 
working implementations, to instead start from scratch to build 
something of questionably improved fanciness instead regardless of its 
impact on backward compatibility and the existing userbase, have in my 
terminology PulseAudioed the older projects).


The gauntlet I would throw down for anyone who thinks there's a better 
approach here is to demonstrate a driver that has working support going 
back to Python 2.4 for any two of the apps on my list above.  Get even 
that much of a foothold, and maybe the fact that yours is more Pythonic, 
cleaner, or better licensed matters.  Until then, working apps have to 
be the primary motivation for what to work on here, unless there's a 
really terrible problem with the driver.  The existing psycopg license 
and the web site issues were in combination enough to reach that level 
of total issues for a number of people.  With the news from Federico 
that a license improvement is approaching and signs of major 
documentation improvements, that problem seems like it will soon be 
solved as far as I'm concerned.  When someone manages to make their 
alternative driver a prerequisite for an app I need, only at that point 
will that driver show back up on my radar.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-10 Thread Andrew McNamara
That's just a matter of prioritizing the issues.  Put the big ones at 
the top, the trivia at the bottom, [...]

I'd like to see a requirement for the use of PQexecParams() over PQexec() - 
even when using libpq's PQescapeStringConn(), PQexec() makes me uneasy.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-10 Thread Tom Lane
Andrew McNamara andr...@object-craft.com.au writes:
 That's just a matter of prioritizing the issues.  Put the big ones at 
 the top, the trivia at the bottom, [...]

 I'd like to see a requirement for the use of PQexecParams() over PQexec() - 
 even when using libpq's PQescapeStringConn(), PQexec() makes me uneasy.

Such a rule seems pretty entirely pointless, unless you have a way to
enforce that the query string passed to the function hasn't been
assembled from parts somewhere along the way.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers {license}

2010-02-10 Thread Greg Smith

Tom Lane wrote:

If you feel that a BSD/MIT license is a must-have for your purposes,
you're certainly free to push development of one of the other driver
projects instead, and to try to organize some other people to help.
I don't believe anyone is trying to funnel all development effort into
psycopg2.
  


Certainly not, and I hope no one has gotten the impression that there's 
anything official being recognized about psycopg happening here 
because it hasn't.  Anointing a one true driver (a phrase that seems 
to keep popping up in external discussions of this topic) isn't the sort 
of thing the PostgreSQL core does.  And all that happens to people who 
ignore what I tell them to do is that they receive a steady stream of 
long, sarcastic e-mails for a while afterwards.


I just updated 
http://wiki.postgresql.org/wiki/Python_PostgreSQL_Driver_TODO to reflect 
a few corrections I noted while researching (everybody else is welcome 
to edit that page too you know), and to include some of the recent 
feedback showing up on this list.


I know I was just looking for a Python driver that is compatible with 
the apps I most often run into, documented well enough that I can write 
my own if I feel like it, fast enough, and free enough that I can deploy 
the result wherever I want.  That seemed similar to the priorities other 
people who had an opinion here suggested too.  Pragmatically, psycopg2 
just seemed to have the shortest path toward being something useful to 
the largest userbase in that sort of context, and we've unofficially 
rolled down that path a bit.


This rabble-rousing seems to have nudged both development communities 
toward being more closely aligned in the future in a couple of ways, 
which is great, but I wouldn't read much more into things than that.  
Other projects will continue to grow and shrink, and the pure Python 
ones in particular continue to be quite valuable because they fill a 
niche that psycopg2 doesn't target at all.  I'd sure like to see all 
three of those projects merge into one big one though.  My bet has to be 
on pg8000, since it has the perfect license, supports all the necessary 
Python versions, and it's been around long enough (almost two years) 
that support for it is already in the latest SQLAlchemy beta.


We seem to have revitalized discussion around modernizing PyGreSQL too, 
so I wouldn't discount that one completely yet either.  For those who 
feel a true BSD license is vital, I direct you toward 
http://mailman.vex.net/pipermail/pygresql/2010-February/002315.html to 
learn more about what direction they could use some help in going.  But 
whatever you do, don't start another project instead.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-10 Thread Andrew McNamara
 I'd like to see a requirement for the use of PQexecParams() over PQexec() - 
 even when using libpq's PQescapeStringConn(), PQexec() makes me uneasy.

Such a rule seems pretty entirely pointless, unless you have a way to
enforce that the query string passed to the function hasn't been
assembled from parts somewhere along the way.

The point is that if the driver is doing the right thing, the user of
the driver at least has to choice to do things safely.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-09 Thread Chris Browne
kevina...@hotmail.com (Kevin Ar18) writes:
 Of course all of this is from the perspective of Python users.  Of
 course, you have your own features that you want from your end (from
 PostgreSQL's perspective).  Perhaps this info would help you to know
 which avenue to pursue.

No, those seem like fine ways of getting a good perspective on the
issue.  

I happen not to use Python much, so there's a certain aspect of don't
care on my part... but that doesn't imply that my PostgreSQL
perspective would tend to override yours.  Instead, I think that the
Python users' perspective *is* a mighty important thing.

The interface needs aspects of cleanness on both sides of the
interface...

 - On the Python side, it needs to play well in a variety of ways
   that you seem to have described nicely, some technical, some
   licensing oriented.  Some relating to interfacing to further bits
   of Python and to applications and frameworks written in Python.

 - On the PostgreSQL side, there's certainly a preference for 
   licensing simplicity.

Note that most of the issues there really lie on the Python side, which
underlines the importance of Python users' perspective.

Further, the ideal and issues/problems that you point out all seem
reasonable.  The good seems good and the bad seems like things that do
indeed need to be accepted as consequences of the good.

It will doubtless help guide assistance.
-- 
output = reverse(moc.liamg @ enworbbc)
...as a  robotics designer once told  me, you don't really appreciate
how smart a moron is until you try to design a robot...
-- Jerry Pournelle

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Massa, Harald Armin
The pg8000 / bpgsql seem to be toy projects, and anyway you dont
want to use pure-Python drivers in high-performance environments.

I agree that there are some performance-challenges with pure-Python drivers.

And we should not forget to look for the reasons for the incubation of that
many pure-Python drivers:

a) Python is no longer one-language, one-implementation. There are (at
least) cPython (the original), Jython (on JVM), IronPython (from Microsoft
on CLR), PyPy (Python on Python), Unladen Swallow (from Google on LLVM). In
addition the nearly-Pythons as in Cython, RPython and ShedSkin

Everything apart from cPython (and possible UnladenSwallow) has its
challenges dealing with non-Python extension modules. From a developer
standpoint it can be tempting to be able to rely on the same database
adapter across more then one implementation

b) the stabilization of an Python Application Binary Interface is in early
discussion stage; meaning: it will take some time untill a non-Python
extension can be usable across Python versions. c-Extensions are allways a
major stumbling block on Python-n to Python-(n+1) versions

c) Stability. Python code is same-on-same more robust then C-Code, as some
of the crash-friendly-problems are eliminated (you cannot allocate memory
wrongly within Python, you cannot errorly access memory cross array
boundaries...)

especially a) is a point to consider when standardizing on a PostgreSQL
blessed Python-Postgresql-driver. How will the blessing extend to Jython /
Ironpython / PyPy?

Harald
-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Greg Smith

Massa, Harald Armin wrote:
I agree that there are some performance-challenges with pure-Python 
drivers.
And we should not forget to look for the reasons for the incubation of 
that many pure-Python drivers:
a) Python is no longer one-language, one-implementation. There are (at 
least) cPython (the original), Jython (on JVM), IronPython (from 
Microsoft on CLR), PyPy (Python on Python), Unladen Swallow (from 
Google on LLVM). In addition the nearly-Pythons as in Cython, RPython 
and ShedSkin...
especially a) is a point to consider when standard, it's getting one 
driver that satisfies the needs of the people most like
izing on a PostgreSQL blessed Python-Postgresql-driver. How will the 
blessing extend to Jython / Ironpython / PyPy?


The point isn't so much standardizing.  Having a low performance 
Python driver turns into a PostgreSQL PR issue.  Last thing we need is 
the old PostgreSQL is slow meme to crop back up again via the Python 
community, if the driver suggested by the community isn't written with 
performance as a goal so that, say, PostgreSQL+Python looks really slow 
compared to MySQL+Python.  And if you're writing a database driver with 
performance as a goal, native Python is simply not an option.


Now, once *that* problem is under control, and there's a nicely 
licensed, well documented, major feature complete, and good performing 
driver, at that point it would be completely appropriate to ask what 
about people who want support for other Python platforms and don't care 
if it's slower?.  And as you say, nurturing the incubation of such 
drivers is completely worthwhile.  I just fear that losing focus by 
wandering too far in that direction, before resolving the main problem 
here, is just going to extend resolving the parts of the Python driver 
situation I feel people are most displeased with.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Gabriele Bartolini

Hi there,

Greg Smith ha scritto:
Looks like the first action item is to talk with the Psycopg people 
about their license.


Oh:  and I'm going to take care of this.  License changes can be a 
very sensitive topic and I'm told that discussion probably needs to 
happy in Italian too; I can arrange that.


I can try and help with this issue, given my role with the Italian 
PostgreSQL community and PostgreSQL business with 2ndQuadrant Italia. I 
have met Psycopg's developer a couple of times at open-source 
conferences. I have great respect for his work and his contribution in 
the open-source community, and I will be very happy to try and explain 
the situation to him.


I will keep you posted.

Ciao,
Gabriele

--
Gabriele Bartolini - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Massa, Harald Armin
Greg,


 The point isn't so much standardizing.  Having a low performance Python
 driver turns into a PostgreSQL PR issue.


I totally agree.

And if you're writing a database driver with performance as a goal, native
Python is simply not an option.

yes. Additionally: performance is not the only challenge. A native Python
implementation, without using libpq, will have to reimplement much of libpq
- just let me isolate proper escaping, and will have its own bugs.

Now, once *that* problem is under control, and there's a nicely licensed,
 well documented, major feature complete, and good performing driver, at that
 point it would be completely appropriate to ask what about people who want
 support for other Python platforms and don't care if it's slower?.


Pure Pythondrivers do exist now; and they are allready discussed in the
summaries - which is a good thing. With my remarks I just want to recommend
that we at least should document a position for them; and a way ahead. And I
need a place to point out that Python grew a FFI with ctypes. Maybe someone
will think of a DBAPI2.0 compatible ctypes libpq wrapper ...

Harald



-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 I have written up a set of guidelines for driver development 
 based on what I learned working on ruby-pg:

 http://wiki.postgresql.org/wiki/Driver_development
...
 I would appreciate comments by anyone (Greg Sabino Mullane: I included
 you in the CC because I thought you may have some input).

Good page. I looked it over but have nothing to add at the moment. I may 
do so later once my head is in dbdpg mode (working on other project 
at the moment :)

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201002080931
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAktwIEYACgkQvJuQZxSWSsjczQCgkU5b6iHPREJYMtAdWlFRDkYI
cS4An3AMyc+O06HzN8MYkfq8HG62371y
=+WCV
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Florian Weimer
* Jeff Davis:

 I have written up a set of guidelines for driver development based on
 what I learned working on ruby-pg:

 http://wiki.postgresql.org/wiki/Driver_development

Interesting, thanks.

I'm contemplating to create a new language binding for libpq (or, to
be more precise, turn an existing language binding into something that
can be published).  I've been agonizing a bit over how to create a
bridge between the host language type system and the PostgreSQL type
system.  If I understand you correctly, you suggest to leave
everything as strings.  This solution has the appeal of being
implemented easily.  It also sidesteps a lot of issues revolving
around different representation choices for numbers.

Do you really suggest to preserve the PQexecParams API verbatim, that
is, passing in three arrays containing type, value, and format?  That
seems to be a bit problematic.  I suspect the common case will be to
use unknown types, text format, and the default conversion from values
to strings.  Only for BYTEA values, something else is required, and
I'm wondering how to encode that (the host language doesn't offer a
distinction between text and binary strings).

Conversely, for result sets, I'm tempted to transparently decode
escaped BYTEA columns.

 Note that the ruby-pg driver doesn't 100% adhere to those standards
 (encoding is the primary problem, and that will be fixed).

Lack of Unicode support means that I can punt that to application
authors, I guess.

By the way, the downside of using strings everywhere is that your
binding API will most likely not work with SQLite (or any other
SQL-like database which lacks column type information).

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Florian Weimer
* Andrew McNamara:

Any other suggestions before I turn the above into a roadmap page on the 
wiki?

 I got sick of the constant stream of escaping bugs impacting on psycopg
 and pyPgSQL, and wrote my own DB-API driver, using the more modern
 libpq/binary/protocol 3 APIs where ever possible. The result is BSD
 licensed:

 http://code.google.com/p/ocpgdb/

I saw your note that you have to specify the types for date values
etc.  Is this really desirable or even necessary?  Can't you specify
the type as unknown (OID 705, I believe)?

At work, we recently used to typelessness of Perl's DBD::Pg with great
effect, introducing a more compact, type-safe representation for a few
columns, without having to change all the existing Perl scripts
accessing the database.  That's why I'm wondering...

(And we might be using Python instead of Perl today.  Lack of a decent
PostgreSQL module for Python meant it was very hard to argue against
using Perl ...)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis
On Mon, 2010-02-08 at 20:29 +0100, Florian Weimer wrote:
 I'm contemplating to create a new language binding for libpq (or, to
 be more precise, turn an existing language binding into something that
 can be published).  I've been agonizing a bit over how to create a
 bridge between the host language type system and the PostgreSQL type
 system.  If I understand you correctly, you suggest to leave
 everything as strings.  This solution has the appeal of being
 implemented easily.  It also sidesteps a lot of issues revolving
 around different representation choices for numbers.

Agreed. Ultimately, the conversion has to be done somewhere, but I don't
believe the driver is the place for it. Type conversions are always
going to be imperfect, and this has some important consequences:
 * The type conversion system will be endlessly tweaked to improve it
 * Developers will always run into problems with it in any complex
application, so we need to allow them to circumvent the system and do it
themselves when necessary.

Both of these things point to another layer on top of the driver itself.
It could be some extra convenience functions that come with the driver,
or an entirely separate layer (like ActiveRecord). But if we always let
the developer have access to the full power of libpq, it limits the
damage that can be done by a slightly-too-creative API on top of it.

 Do you really suggest to preserve the PQexecParams API verbatim, that
 is, passing in three arrays containing type, value, and format?  That
 seems to be a bit problematic.  I suspect the common case will be to
 use unknown types, text format, and the default conversion from values
 to strings.

I tried to address this specifically in the document:

For example: it should be easy to pass parameters so that PQexecParams
(and others) can be used, avoiding SQL injection risks. The important
thing is to maintain close to a one-to-one mapping between libpq and the
driver's API, and to provide all of the functionality of libpq.

In ruby-pg, you can just do:

  conn.exec(INSERT INTO foo VALUES($1), [Jeff])

And I think that's appropriate. What I'm saying is that there should
still exist some way to pass explicit types or formats (although that
should still be easier than it is in C ;). Here's the long form:

  conn.exec(INSERT INTO foo VALUES($1),
[{:value = Jeff, :format = 0, :type = 0}])

The nice thing about that format is that you can do the easy thing for
most of the parameters in a query, but then choose binary format for
that one BYTEA parameter. That's because, in ruby, you can mix strings
and hashes in the same array.

So I'm not saying we should make everyone code ruby that looks like C.
I'm saying that the job of the driver is to provide full access to
libpq, and anything beyond that should be an optional convenience
routine, and should be free of magic and cleverness (that's the job of a
higher layer).

 Conversely, for result sets, I'm tempted to transparently decode
 escaped BYTEA columns.

Consider the following ruby-pg program, where you have two empty tables
foo and bar, each with a single BYTEA column b:

  conn = PGconn.connect(...)
  conn.exec(INSERT INTO foo VALUES($1), [000])

  # copy the single value in foo into bar
  val = conn.exec(SELECT b FROM foo LIMIT 1)[0][b]
  conn.exec(INSERT INTO bar VALUES($1), [val])

That copies value so that foo and bar have the same contents: a 4 byte
value \000. What would happen though, if val was transparently
decoded? It would decode it once in ruby, and again inside of postgres
(in byteain), leaving you with a one byte value in bar, even though foo
has a four-byte value.

I really think that only higher layers should implement that kind of
magic, no matter how obvious it may seem that the user wants something
extra.

  Note that the ruby-pg driver doesn't 100% adhere to those standards
  (encoding is the primary problem, and that will be fixed).
 
 Lack of Unicode support means that I can punt that to application
 authors, I guess.

Ruby 1.9+ and Python 3.0+ both have string encoding models that can't
just be ignored. We could punt it by always returning byte sequences
rather than strings, but I think that's a particularly extreme version
of my philosophy of not trying to convert between types.

 By the way, the downside of using strings everywhere is that your
 binding API will most likely not work with SQLite (or any other
 SQL-like database which lacks column type information).

I am trying to develop standards suitable for PostgreSQL drivers based
on libpq. These are not meant to be standards for a database-agnostic
API, standards for a high-level database adapter, or even standards for
a driver written against something other than libpq (like the JDBC
driver).

Thank you for your comments. I will try to integrate these thoughts into
the document.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara
 http://code.google.com/p/ocpgdb/

I saw your note that you have to specify the types for date values
etc.  Is this really desirable or even necessary?  Can't you specify
the type as unknown (OID 705, I believe)?

At work, we recently used to typelessness of Perl's DBD::Pg with great
effect, introducing a more compact, type-safe representation for a few
columns, without having to change all the existing Perl scripts
accessing the database.  That's why I'm wondering...

I can't see how this would work with binary query parameters - the server
will see a blob of binary data and have no way to know what it represents.

I presume DBD::Pg is using text parameters, rather than binary.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis
On Tue, 2010-02-09 at 09:15 +1100, Andrew McNamara wrote:
 I can't see how this would work with binary query parameters - the server
 will see a blob of binary data and have no way to know what it represents.

Unknown is unknown, whether in binary or text format. As far as I know,
PostgreSQL never looks inside a literal of unknown type to try to
determine its type -- it only looks at the context (to what function is
it an argument?).

For instance:

  SELECT '5'; -- has no idea what type it is

  SELECT '5' + 1; -- it's an int

  SELECT 'a' + 1; -- it's still an int
  ERROR:  invalid input syntax for integer: a
  LINE 1: SELECT 'a' + 1;

  SELECT '5.0' + 1; -- still an int, bad input format
  ERROR:  invalid input syntax for integer: 5.0
  LINE 1: SELECT '5.0' + 1;

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara
On Tue, 2010-02-09 at 09:15 +1100, Andrew McNamara wrote:
 I can't see how this would work with binary query parameters - the server
 will see a blob of binary data and have no way to know what it represents.

Unknown is unknown, whether in binary or text format. As far as I know,
PostgreSQL never looks inside a literal of unknown type to try to
determine its type -- it only looks at the context (to what function is
it an argument?).

For instance:

  SELECT '5'; -- has no idea what type it is

  SELECT '5' + 1; -- it's an int

  SELECT 'a' + 1; -- it's still an int
  ERROR:  invalid input syntax for integer: a
  LINE 1: SELECT 'a' + 1;

  SELECT '5.0' + 1; -- still an int, bad input format
  ERROR:  invalid input syntax for integer: 5.0
  LINE 1: SELECT '5.0' + 1;

The problem is deeper than that - when query parameters use the binary
option, the server has no way to decode the binary parameter without an
appropriate type OID.

As you say, postgres will cast types depending on context, however this
is stricter when binary parameters are used (because they only have one
valid interpretation, whereas a text parameter may have several).

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis
On Mon, 2010-02-08 at 09:14 +0100, Massa, Harald Armin wrote:
 And we should not forget to look for the reasons for the incubation of
 that many pure-Python drivers:

All very good points. That's why the doc I wrote:

http://wiki.postgresql.org/wiki/Driver_development

is specifically targeted at libpq-based drivers (which is repeated
several times).

I think it would be valuable to have a complete, pure-python driver
available (like the JDBC driver). That's a large project, however.

People who use a different python implementation understand that
libraries might not be as plentiful. It will be a while before there are
as many pure-python libraries as there are pure-java libraries.

Right now what we need is a driver toward which we can confidently
direct cPython users. It's faster to wrap libpq than to write a complete
driver. And if we don't have such a driver, we risk alienating an
important community for postgresql growth. So, the cost is lower and the
benefits are higher for wrapping libpq for the cPython users. At least,
that seems to be the case right now; things may change in the future.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis
On Tue, 2010-02-09 at 10:46 +1100, Andrew McNamara wrote:
 The problem is deeper than that - when query parameters use the binary
 option, the server has no way to decode the binary parameter without an
 appropriate type OID.

Postgres does not attempt to decode anything (text or binary format)
until it figures out what type it is.

 As you say, postgres will cast types depending on context, however this
 is stricter when binary parameters are used (because they only have one
 valid interpretation, whereas a text parameter may have several).

Type casts are a different matter; they are only done after the unknown
literals' types have been determined:

  create table n(i int);

  -- insert numeric literal, which is cast to int (assignment cast)
  insert into n values(5.0); -- succeeds

  -- insert unknown literal, which is inferred to be of type int
  insert into n values('5.0'); -- fails on integer type input function
  ERROR:  invalid input syntax for integer: 5.0
  LINE 1: insert into n values('5.0');


Can you provide a concrete example in which the text versus binary
format changes the type inference behavior?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara
On Tue, 2010-02-09 at 10:46 +1100, Andrew McNamara wrote:
 The problem is deeper than that - when query parameters use the binary
 option, the server has no way to decode the binary parameter without an
 appropriate type OID.

Postgres does not attempt to decode anything (text or binary format)
until it figures out what type it is.

How does it figure out what type it is? Either by the type oid passed by
the caller, or by the context if the type oid is unknown.  

Now, with the text format parameters, the parser usually does the right
thing, since text formats have plenty of hints for us humans.

However, with the binary format, unless the caller tells us, there's no way
to tell whether we're correctly parsing the data. If the context implies
one type, but the user passes another, we'll either get an ugly error or,
worse, silently misparse their data.

Generally this isn't a big problem with python, as we have good type
information available. It's only an issue because people have gotten
used to the text parameter parsing being so forgiving.

Using my ocpgdb module, and interacting directly with the libpq wrapping
code, you can see how postgres reacts to various inputs:

 from oclibpq import *
 from ocpgdb import pgoid
 db=PgConnection('')

No parameters:

 r=db.execute('select 1', ())
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x01' 
at 0xb7514200,)]

Int4 parameter, type specified:

 r=db.execute('select $1', [(pgoid.int4, '\x00\x00\x00\x01')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x01' 
at 0xb75141c0,)]

Int4 parameter, type unknown, can't be determined from context:

 r=db.execute('select $1', [(pgoid.unknown, '\x00\x00\x00\x01')])
 r.status
PGRES_FATAL_ERROR
 r.errorMessage
'ERROR:  could not determine data type of parameter $1\n'

Int4 parameter, type unknown, can be determined from context:

 r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x00\x01')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x02' 
at 0xb7514200,)]

Text parameter, type unknown, mismatching context - surprising:

 r=db.execute('select $1 + 1', [(pgoid.unknown, '')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '1112' at 
0xb7514360,)]

Date parameter, type unknown, int context, the value gets misinterpreted:

 r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x01n')])
 r.status
PGRES_TUPLES_OK
 list(r)
[(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x01o' at 
0xb75144a0,)]

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis
On Tue, 2010-02-09 at 12:51 +1100, Andrew McNamara wrote:
 Now, with the text format parameters, the parser usually does the right
 thing, since text formats have plenty of hints for us humans.

The parser doesn't care whether it's text format or binary format when
determining the type.

 However, with the binary format, unless the caller tells us, there's no way
 to tell whether we're correctly parsing the data. If the context implies
 one type, but the user passes another, we'll either get an ugly error or,
 worse, silently misparse their data.

The difference between text and binary format is this: after it has
already determined the type of the parameter,
  (a) if the format is text, it passes it to the type input function
  to construct the value; or
  (b) if the format is binary, it passes it to the type recv function
  to construct the value.

The argument to the input or recv functions may:
  (a) be valid input; or
  (b) be invalid input, and be detected as an error by the input or
  recv function; or
  (c) be invalid input, and not be detected as an error by the input
  or recv function.

For a given type, the input function may be more likely to catch an
input error than the recv function; or the reverse. Either way, it is
very type-specific, and the only difference is the whether the input is
misinterpreted (type error not caught; bad) or an error is thrown (type
error caught; better). 

 Using my ocpgdb module, and interacting directly with the libpq wrapping
 code, you can see how postgres reacts to various inputs:

None of the examples show a difference in the inferred type of a text
versus binary parameter for the same query.

 No parameters:
 
  r=db.execute('select 1', ())
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value 
 '\x00\x00\x00\x01' at 0xb7514200,)]

Expected, because the literal 1 (without quotes) is an integer literal,
not an unknown literal.

 Int4 parameter, type specified:
 
  r=db.execute('select $1', [(pgoid.int4, '\x00\x00\x00\x01')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value 
 '\x00\x00\x00\x01' at 0xb75141c0,)]

Expected, because you specified the type, and sent the binary data to
the integer recv function, and it was valid input.

 Int4 parameter, type unknown, can't be determined from context:
 
  r=db.execute('select $1', [(pgoid.unknown, '\x00\x00\x00\x01')])
  r.status
 PGRES_FATAL_ERROR
  r.errorMessage
 'ERROR:  could not determine data type of parameter $1\n'

Expected -- there is no context to determine the type.

Why do you call it an int4 parameter? It's just bytes, and you never
told postgres what they are (as you did in the previous example).

 Int4 parameter, type unknown, can be determined from context:
 
  r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x00\x01')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value 
 '\x00\x00\x00\x02' at 0xb7514200,)]

Expected: the function + provides the context that allows the server to
interpret the left argument as an integer.

(Again, not an int4 parameter, it's unknown)

 Text parameter, type unknown, mismatching context - surprising:
 
  r=db.execute('select $1 + 1', [(pgoid.unknown, '')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value '1112' at 
 0xb7514360,)]

Expected, because this is exactly the same as the previous one except
for the data you pass in. Notice that the same type is inferred (23).

Why do you call this mismatching context when the context is exactly
the same as above? The only difference is which 4 bytes you provide. You
never told postgres that the bytes were text bytes anywhere.

You may think that it's doing  + 1, but it's actually doing addition
on the bytes. That is apparent in the next example:

 Date parameter, type unknown, int context, the value gets misinterpreted:
 
  r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x01n')])
  r.status
 PGRES_TUPLES_OK
  list(r)
 [(PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x01o' 
 at 0xb75144a0,)]
 

Expected, because the only thing that could possibly detect the error is
the int4recv function, which happens to accept any 4-byte input (so it
will never fail on any 4 bytes of data).

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara
For a given type, the input function may be more likely to catch an
input error than the recv function; or the reverse. Either way, it is
very type-specific, and the only difference is the whether the input is
misinterpreted (type error not caught; bad) or an error is thrown (type
error caught; better). 

This is the crux of the matter: the type input functions are universally
more forgiving since, by their nature, text formats are designed for us
fuzzy humans, and users of adapters have come to expect this. 

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis
On Tue, 2010-02-09 at 13:56 +1100, Andrew McNamara wrote:
 For a given type, the input function may be more likely to catch an
 input error than the recv function; or the reverse. Either way, it is
 very type-specific, and the only difference is the whether the input is
 misinterpreted (type error not caught; bad) or an error is thrown (type
 error caught; better). 
 
 This is the crux of the matter: the type input functions are universally
 more forgiving since, by their nature, text formats are designed for us
 fuzzy humans, and users of adapters have come to expect this. 

Except that it's exactly the opposite with integers. Pass any 4 bytes to
in4recv(), and it will accept it. However, try passing '4.0' to
int4in(), and you get an error.

If I had to make an educated guess about the forgiveness of various type
input and type recv functions, I would say that the recv functions are
more forgiving. After all, you would expect the binary format to be less
redundant, and therefore less likely to catch inconsistencies. I don't
see much of a universal truth there, however.

This is getting pretty far off-topic, so let's just leave it at that.
The drivers should support both formats; the type inference logic
doesn't care at all about the contents of the unknown literals (text or
binary); and queries should be written in such a way that the types are
unambiguous and unsurprising.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Jeff Davis
On Mon, 2010-02-08 at 20:50 +0100, Florian Weimer wrote:
 I saw your note that you have to specify the types for date values
 etc.  Is this really desirable or even necessary?  Can't you specify
 the type as unknown (OID 705, I believe)?

I believe the problem that Andrew is describing is that:

  SELECT $1 + 1;

will infer that $1 is of type int4. But if you really intended $1 to be
a date (which is also valid), it will cause a problem.

If the date is passed in text format, it will cause an error in
int4in(), because the text representation of a date isn't a valid text
representation for an integer.

If the date is passed in binary format, it will pass it to int4recv() --
but because the date is 4 bytes, and int4recv is defined for any 4-byte
input, it won't cause an error; it will produce a wrong result. In other
words, the binary representation for a date _is_ a valid binary
representation for an integer. The type inference has found the wrong
type, but the recv function still accepts it, which causes a problem.

The solution is to write the query in an unambiguous way:

  SELECT $1::date + 1;

which is good practice, anyway. If it's not obvious to the type
inference system, it's probably not obvious to you, and will probably
surprise you ;)

Or, as Andrew suggests, you can pass the type oid along with the
parameter so that postgresql knows the right type.

Either way, relying on a type input or a recv function to cause a type
error is much more fragile.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-08 Thread Andrew McNamara
If the date is passed in binary format, it will pass it to int4recv() --
but because the date is 4 bytes, and int4recv is defined for any 4-byte
input, it won't cause an error; it will produce a wrong result. In other
words, the binary representation for a date _is_ a valid binary
representation for an integer. The type inference has found the wrong
type, but the recv function still accepts it, which causes a problem.

Yes - of the worst kind: silent data corruption.

The solution is to write the query in an unambiguous way:

  SELECT $1::date + 1;

which is good practice, anyway. If it's not obvious to the type
inference system, it's probably not obvious to you, and will probably
surprise you ;)

That address this specific case, but it's ugly and not general. The right
thing is to set the correct type when you're marshalling the parameters...

Or, as Andrew suggests, you can pass the type oid along with the
parameter so that postgresql knows the right type.

That's right - if using the binary parameters, you *must* pass an
appropriate type oid for the data you send to the server. If you use the
unknown oid, bad things will happen (sooner or later).

While this is strictly true of both binary and text parameters, text
parameters have enough redundancy built into the format that it's rarely
a problem. Users have come to expect this leniency.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Greg Smith

Marko Kreen wrote:

I think we should concentrate on the PR problem and technical issues
related to that, keep the other low-level and non-user-visible
issues out.  Or at least separate.  (PsycopgTodo wiki page?)
  


That's just a matter of prioritizing the issues.  Put the big ones at 
the top, the trivia at the bottom, and if you knock stuff of there 
somewhere along the way you discover you've made enough progress that 
the PR stuff starts going away, because people are able to get their 
work done with less drama.


Here's a full TODO page that includes everything mentioned here as best 
I could summarize it:  
http://wiki.postgresql.org/wiki/Python_PostgreSQL_Driver_TODO


Looks like the first action item is to talk with the Psycopg people 
about their license.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Marko Kreen
On 2/7/10, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Feb 6, 2010 at 7:48 PM, Marko Kreen mark...@gmail.com wrote:
   This is long-term todo item for psycopg, seems offtopic
   to the driver situation.

 [...]

  This is routine bug in either app or psycopg, we have no reason
   to touch it.  The guy should report to appropriate lists.

 [...]

  Long-term todo item for psycopg2, offtopic for driver situation.

 [...]

  psycopg2 has array support, I'd like to have tuple/record also.
  
   Minor todo item for psycopg, mostly but not completely offtopic
   for driver situation.


  I'm not a Python user myself, but I have trouble understanding how you
  can describe bugs in one of the Python drivers as off-topic to the
  Python driver situation.

I thought the topic was Confusion over Python drivers?

The only bug there was likely app one, or at least its not widespread
so off-topic.   Rest are more like non-essential cool features,
so again off-topic.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Massa, Harald Armin
Marko,


 I thought the topic was Confusion over Python drivers?

 The only bug there was likely app one, or at least its not widespread
 so off-topic.   Rest are more like non-essential cool features,
 so again off-topic.


Those lack of non-essential cool  features is right on topic - because
what one developer may see as non-essential  is the most important feature
for the next developer.

Exactly these kind of issues are the source of the confusion we now phase:
developer B needing feature X, which was ignored by driver A.

So I concur to put them on the discussion agenda; if they drop out of
priority, at least it is documented WHY and will save other developers these
thoughts.

Harald


-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Josh Berkus

  I'm not a Python user myself, but I have trouble understanding how you
  can describe bugs in one of the Python drivers as off-topic to the
  Python driver situation.
 
 I thought the topic was Confusion over Python drivers?
 
 The only bug there was likely app one, or at least its not widespread
 so off-topic.   Rest are more like non-essential cool features,
 so again off-topic.

Not at all, except that maybe this discussion belongs on -advocacy
rather than here.  Driver quality is partly performance and stability,
but also partly what features it implements and how well it implements
them.  *particularly* for PostgreSQL, which is a database with lots of
cool features; if the driver doesn't support our cool features, then
they don't matter to Python developers.

Anyway, I don't yet have a full diagnosis on the transaction control
issue or I'd already have posted it to psycopg -- it may be a toxic
interaction between Django and Psycopg2 rather than psycopg2 alone.  I'd
not have brought it up except for this discussion.

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Greg Smith

Greg Smith wrote:
Here's a full TODO page that includes everything mentioned here as 
best I could summarize it:  
http://wiki.postgresql.org/wiki/Python_PostgreSQL_Driver_TODO


Looks like the first action item is to talk with the Psycopg people 
about their license.


Oh:  and I'm going to take care of this.  License changes can be a very 
sensitive topic and I'm told that discussion probably needs to happy in 
Italian too; I can arrange that.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Greg Smith

Josh Berkus wrote:

Anyway, I don't yet have a full diagnosis on the transaction control
issue or I'd already have posted it to psycopg -- it may be a toxic
interaction between Django and Psycopg2 rather than psycopg2 alone.  I'd
not have brought it up except for this discussion.
  


I'm going to remove it from the list on the wiki then for now.  I don't 
want to annoy the developers by adding a more speculative bug that might 
not even be in their software.  If you get to where it's confirmed and 
info posted to their list, please add a link back into the page once 
it's reported, i.e. link to their mailing list archives or something 
like that.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Andrew McNamara
Any other suggestions before I turn the above into a roadmap page on the 
wiki?

I got sick of the constant stream of escaping bugs impacting on psycopg
and pyPgSQL, and wrote my own DB-API driver, using the more modern
libpq/binary/protocol 3 APIs where ever possible. The result is BSD
licensed:

http://code.google.com/p/ocpgdb/

As well as using the newer APIs, I have attempted to keep the code as
simple as possible, eschewing things like threading as adding too much
complexity for too little gain (particularly true of Python threading),
and I kept to just the code DB-API functionality.

The C code exists mainly to present a pythonic view of libpq. I found
that type conversion and marshalling could generally be done from python
with more than acceptable performance (via the C-coded struct module
in the standard library for common types). In my tests, ocpgdb has
performed at least as well as pyPgSQL and psycopg, often a lot better,
primarily due to the use of the libpq binary protocols, I think.

I'm not proposing my module as your canonical implementation, although
you're welcome to it if you like. Rather, it demonstrates another viable
approach, minimal, and using newer libpq APIs.

BTW, with respect to the discussion of the Python DB-API - I see it as
specifying a lowest-common-denominator, or the subset of functionality
that should be available from most databases without requiring
contortions. Like eating at McDonalds, it does the job, but it's never
going to delight or surprise. A PostGreSQL blessed adapter really should
provide access to all the features in libpq, and I'm not sure this is
directly compatible with DBAPI. Instead, the DBAPI-compliance should be
layered on top.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Greg Smith

Andrew McNamara wrote:

I got sick of the constant stream of escaping bugs impacting on psycopg
and pyPgSQL, and wrote my own DB-API driver, using the more modern
libpq/binary/protocol 3 APIs where ever possible. The result is BSD
licensed:
http://code.google.com/p/ocpgdb/
  


I added you into the list at http://wiki.postgresql.org/wiki/Python

Can you check what I put in there, confirm Windows compatibility, and 
comment on Python 3.X support?


I'd be curious to hear more about the escaping bugs you ran into as 
well.  We already have some notes on the TODO that pushing more of this 
work toward the standard libpq routines would seem appropriate for 
things passing between the driver and libpq.  Were the issues you ran 
into on that side, or more on the Python side of how things were being 
formatted?


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Andrew McNamara
I added you into the list at http://wiki.postgresql.org/wiki/Python

Thanks.

Can you check what I put in there, confirm Windows compatibility, and 
comment on Python 3.X support?

I haven't tried it under Windows and I haven't had any feedback either
way from Windows users. 

For now, ocpgdb has no Python 3 support (I don't foresee any real
problems, however).

I'd be curious to hear more about the escaping bugs you ran into as well.
We already have some notes on the TODO that pushing more of this work
toward the standard libpq routines would seem appropriate for things
passing between the driver and libpq.  Were the issues you ran into on
that side, or more on the Python side of how things were being formatted?

It was a while ago now and I can't remember the specific details - it was
more a general feeling that the existing offerings were going about it
the wrong way (with respect to parameter passing and escaping). I suspect
this was a historical artifact (presumably libpq didn't provide escaping
facilities or parameterised queries when the adapters were written).

Essentially, I just wanted a pyPgSQL with a more modern implementation.
Psycopg was (is?) also using Protocol 2. I felt that the way forward was
to switch to the Protocol 3 API features, in particular, parameterised
queries, and none of the existing Python adapters had done that (I got
the impression while writing my module that nobody was exercising the
new features).

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-07 Thread Jeff Davis
On Mon, 2010-02-08 at 12:25 +1100, Andrew McNamara wrote:
 For now, ocpgdb has no Python 3 support (I don't foresee any real
 problems, however).

Encoding issues are the big one. There are a couple gotchas, and I
provided the details here:

http://wiki.postgresql.org/wiki/Driver_development#Text_Encoding

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Jeff Davis
On Fri, 2010-02-05 at 10:35 -0800, Josh Berkus wrote:
 I'm not as concerned about confusion as the fact that *all* of the
 various Python drivers suck in different, and crippling, ways.  I don't
 care how many drivers we have, as long as we have at least one 1st-class
 driver.

Absolutely.

And I would prefer that it can provide all (or nearly all) of the
capabilities of libpq. PyGreSQL apparently doesn't even offer
parameterized queries!

http://pygresql.org/pg.html

That was based on a quick glance at the document; I hope I'm mistaken.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Greg Smith

Marko Kreen wrote:

The pg8000 / bpgsql seem to be toy projects, and anyway you dont
want to use pure-Python drivers in high-performance environments.
We are not talking about C#/java here.
  


Right, and the comments from James reinforce this general idea:  there 
is little value to the people who most want Python+PostgreSQL support in 
working on any of the pure Python implementations, because best case 
performance is still half or less of the ones that more directly wrap 
libpq.  Even the best case with psycopg is enough of a performance hit 
as it is.



py-postgresql seems to be more serious, but as it's python3 only
which makes it irrelevant today.
  


Also true.


Psycopg was the leader, especially in web-environments,
but it has non-obvious license and with dead website it does not
seem that attractive.  Although it is well-maintained still.
Best path forward would be to talk with Psycopg guys about
license clarification/change.
  


Agreed.  A relicensed Psycopg, with a more professional looking 
introduction/documentation page (like the budding page on our Wiki) than 
what the initd web site has, seems like the best platform to hack on top 
of to me as well.  The fact that they've moved to git recently makes it 
that much easier for another branch to exist even outside of their 
somewhat troubled infrastructure.


To summarize what I saw on this thread, the primary wishlist of changes 
to it are:


-License change
-Consider refactoring to better follow standard driver practices, such 
as using PQExecParams
-Improvement in transaction control to resolve issues that cause idle 
transactions
-Possible simplifications in how it's implemented async operations, to 
improve robustness/reduce code complexity

-Confirm/add multi-threaded support
-Confirm/add support for the most common standard types (such as array)


PyGreSQL is the oldest, older than DB-API, and so it's DB-API
interface seems an afterthought and is untested/underused - eg.
it does not support bytea.
  
And if Psycopg can't be relicensed happily and/or improved as above, as 
the only other native Python driver PyGreSQL looks like the next 
candidate to build on top of.  Its major issues are:


-Test/complete/refactor for full DB-API 2.0 support
-Add bytea support
-Add extension support, perhaps modeled on what Psycopg.
-Build a COPY extension
-Confirm/add multi-threaded support
-Confirm/add support for the most common standard types (such as array)

Any other suggestions before I turn the above into a roadmap page on the 
wiki?


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Bruce Momjian
Greg Smith wrote:
 To summarize what I saw on this thread, the primary wishlist of changes 
 to it are:
 
 -License change
 -Consider refactoring to better follow standard driver practices, such 
 as using PQExecParams
 -Improvement in transaction control to resolve issues that cause idle 
 transactions
 -Possible simplifications in how it's implemented async operations, to 
 improve robustness/reduce code complexity
 -Confirm/add multi-threaded support
 -Confirm/add support for the most common standard types (such as array)
 
  PyGreSQL is the oldest, older than DB-API, and so it's DB-API
  interface seems an afterthought and is untested/underused - eg.
  it does not support bytea.

 And if Psycopg can't be relicensed happily and/or improved as above, as 
 the only other native Python driver PyGreSQL looks like the next 
 candidate to build on top of.  Its major issues are:
 
 -Test/complete/refactor for full DB-API 2.0 support
 -Add bytea support
 -Add extension support, perhaps modeled on what Psycopg.
 -Build a COPY extension
 -Confirm/add multi-threaded support
 -Confirm/add support for the most common standard types (such as array)
 
 Any other suggestions before I turn the above into a roadmap page on the 
 wiki?

Great summary!  This is what I was hoping for.  One suggestion on the
license issue is that the LGPL seems like the type of license intended
by the author, and I think it is an acceptable license to most client
application programmers.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Marko Kreen
On 2/7/10, Greg Smith g...@2ndquadrant.com wrote:
 Marko Kreen wrote:
  Psycopg was the leader, especially in web-environments,
  but it has non-obvious license and with dead website it does not
  seem that attractive.  Although it is well-maintained still.
  Best path forward would be to talk with Psycopg guys about
  license clarification/change.

  Agreed.  A relicensed Psycopg, with a more professional looking
 introduction/documentation page (like the budding page on our Wiki) than
 what the initd web site has, seems like the best platform to hack on top of
 to me as well.  The fact that they've moved to git recently makes it that
 much easier for another branch to exist even outside of their somewhat
 troubled infrastructure.

  To summarize what I saw on this thread, the primary wishlist of changes to
 it are:

  -License change

Note that as long they use non-BSD licence, they need the exception
for the horror that is the OpenSSL licence.

So you cannot 100% satisfy the one-click-lawyers.

  -Consider refactoring to better follow standard driver practices, such as
 using PQExecParams

This is long-term todo item for psycopg, seems offtopic
to the driver situation.

  -Improvement in transaction control to resolve issues that cause idle
 transactions

This is routine bug in either app or psycopg, we have no reason
to touch it.  The guy should report to appropriate lists.

  -Possible simplifications in how it's implemented async operations, to
 improve robustness/reduce code complexity

Long-term todo item for psycopg2, offtopic for driver situation.

  -Confirm/add multi-threaded support.

Seems psycopg2 already has good enough threading.

  -Confirm/add support for the most common standard types (such as array)

psycopg2 has array support, I'd like to have tuple/record also.

Minor todo item for psycopg, mostly but not completely offtopic
for driver situation.

  PyGreSQL is the oldest, older than DB-API, and so it's DB-API
  interface seems an afterthought and is untested/underused - eg.
  it does not support bytea.
 
 
  And if Psycopg can't be relicensed happily and/or improved as above, as the
 only other native Python driver PyGreSQL looks like the next candidate to
 build on top of.  Its major issues are:

  -Test/complete/refactor for full DB-API 2.0 support
  -Add bytea support

timestamp/timestamptz also...

  -Add extension support, perhaps modeled on what Psycopg.
  -Build a COPY extension

They seems to have the low-level parts, just they are missing
from DB-API interface.

  -Confirm/add multi-threaded support
  -Confirm/add support for the most common standard types (such as array)

  Any other suggestions before I turn the above into a roadmap page on the
 wiki?

I think we should concentrate on the PR problem and technical issues
related to that, keep the other low-level and non-user-visible
issues out.  Or at least separate.  (PsycopgTodo wiki page?)

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Josh Berkus

 Finally, I just don't see the existing (often PG specific) goals that I have 
 in mind for it appealing to the majority of [web framework/abstraction] users.

What are those goals?

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Robert Haas
On Sat, Feb 6, 2010 at 7:48 PM, Marko Kreen mark...@gmail.com wrote:
 This is long-term todo item for psycopg, seems offtopic
 to the driver situation.
[...]
 This is routine bug in either app or psycopg, we have no reason
 to touch it.  The guy should report to appropriate lists.
[...]
 Long-term todo item for psycopg2, offtopic for driver situation.
[...]
 psycopg2 has array support, I'd like to have tuple/record also.

 Minor todo item for psycopg, mostly but not completely offtopic
 for driver situation.

I'm not a Python user myself, but I have trouble understanding how you
can describe bugs in one of the Python drivers as off-topic to the
Python driver situation.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Confusion over Python drivers

2010-02-06 Thread Kevin Ar18

I saw this on reddit and thought I might drop a line.

I went through this same issue, trying to find a postgresql driver.  Mostly, I 
had no intention of using psycopg because of its copyleft licensing.  (Of 
course, I don't need to go into why.)

Anyways, here's some info that might help on three alternatives:
Py-postgresql, PyGreSQL, and PG3000

Py-postgresql is active (but I only see 1 person committing)
MIT/BSD style license

It uses some some C for speed.

It has DB-API and PG-API bindings.


Problem: is it Python 3 only!



PyGreSQL (looking to not be very active in the last few months)
MIT/BSD style license
I'm vague on it's features, whether it has C optimizations, or if it is a 
version limited to Python 3 or 2.x


PG8000 is active at the moment
MIT/BSD style license
It works on 2.x and Python 3

Problem: it is pure Python, so it is slower


If you plan on pursuing this, I'd recommend talking to people on on 3 projects. 
 Here's what I would describe as ideal in a project:

* MIT/BSD style license
* works on Python 2.x and 3
* C optimizations for speed (or whatever for speed)

--
That's that part.  Now for some problems that would need solving in order for 
the db api to be extremely useful/userfriendly in the Python world:
--
* SQLAlchemy bindings/support -- as of right now only psycopyg works in 
SQLAlchemy without problems.  PG8000 will work, but you need an 
experimental version of boht PG8000 and SQLAlchemy.  The other two do 
not work.  SQLAlchemy support is probably most needed of anything.  SQLAlchemy 
support should not only be possible but default.
* Django bindings (these are unique to Django and nobody but psycopg has
 them, which is not good).  Django support should not only be possible but 
default.

* Needs 2.x and Python 3 compatability
* It is inevitable that if the library uses C optimizations, then that means 
they need to provide lots of different binaries -- think 32bit vs 64bit and 
then 2.4, 2.5, 3.0 etc  Some combinations will often be left out making it 
hard for someone to use/install it.



Of course all of this is from the perspective of Python users.  Of course, you 
have your own features that you want from your end (from PostgreSQL's 
perspective).  Perhaps this info would help you to know which avenue to pursue.
  
_
Your E-mail and More On-the-Go. Get Windows Live Hotmail Free.
http://clk.atdmt.com/GBL/go/201469229/direct/01/

Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Josh Berkus
Kevin,

 Of course all of this is from the perspective of Python users.  Of
 course, you have your own features that you want from your end (from
 PostgreSQL's perspective).  Perhaps this info would help you to know
 which avenue to pursue.

That's invaluable.  Thanks for chiming in!

--Josh Berkus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread James William Pye
On Feb 6, 2010, at 5:51 PM, Josh Berkus wrote:
 Finally, I just don't see the existing (often PG specific) goals that I have 
 in mind for it appealing to the majority of [web framework/abstraction] 
 users.
 
 What are those goals?

I think the most interesting one that has yet to be implemented is fast, 
multiple destination COPY support. Currently, COPY is supported, but a bytes() 
object is allocated for each row, so it's currently not efficient for moving 
data(pg-to-pg ETL sans the T? =). While some C is still needed to make it 
properly efficient, it's primarily to keep track of the COPY's state and to 
update stats. This is pretty useless to a django user... Well, I suppose it 
might be interesting if COPY OUT could target [or transform into] JSON, but 
idk...

The general, ongoing goal is to implement and document *convenient* Python 
interfaces to PostgreSQL features. A, perhaps uninteresting, case being 
supporting advisory locks. I was thinking a context manager, but it might 
just be something as trivial as an additional method on the connection(very 
simple/direct binding).

Some fuzzy goals: twisted support, some asynchronous interfaces, and greater 
user control over type I/O. The first, twisted, mostly interests me as an 
exercise. The second, async interfaces, scares me as it took me some time just 
to feel not unhappy with the blocking APIs. The third will probably happen, 
but it's going to be a while.

I also have some goals not directly related to a driver. postgresql.unittest is 
currently only used internally, but I hope to document it some day soon so that 
people can write Python unittest.TestCase's that auto-magically build out a 
target cluster(~pg_regress/pgTap for Python?). Well, it works, but it's not 
documented and the APIs haven't been given much thought. Generally, basic 
cluster management tools for Python. (At one point I tried to write a 
programmer's HBA editor, but I think I hurt myself trying to figure out rule 
reduction.. That is, it was trying to be smarter than insert/delete rule at 
position x.)

Well, these are the ones that come to mind, atm, but I don't think there's much 
beyond them.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-06 Thread Jeff Davis
On Fri, 2010-02-05 at 09:19 -0500, Bruce Momjian wrote:
 My son has brought to my attention that our current crop of Python
 client libraries is inadequate/confusing.  I took a look myself, and
 asked on our IRC channel, and am now convinced this area needs
 attention.

I have written up a set of guidelines for driver development based on
what I learned working on ruby-pg:

http://wiki.postgresql.org/wiki/Driver_development

Whether we take one of the existing projects and improve upon it, or
start a complete rewrite, I hope these guidelines will be a useful
destination.

Note that the ruby-pg driver doesn't 100% adhere to those standards
(encoding is the primary problem, and that will be fixed).

I would appreciate comments by anyone (Greg Sabino Mullane: I included
you in the CC because I thought you may have some input). Even if the
python driver doesn't go in that direction, it will help me improve
ruby-pg.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
My son has brought to my attention that our current crop of Python
client libraries is inadequate/confusing.  I took a look myself, and
asked on our IRC channel, and am now convinced this area needs
attention.

First, the sheer number of libraries is confusing.  This is the page
from the Postgres wiki:

http://wiki.postgresql.org/wiki/Python

The first one listed, Psycopg, is noted as preferred libpq-based
driver, but the license is GPL.  Isn't that a problem for many client
applications?

The next one, PyGreSQL, is BSD licensed, but only has documentation for
the classic interface.  The DB-API module says about documentation:

http://www.pygresql.org/pgdb.html

This section of the documentation still needs to be written.

The other three are pure Python drivers, which I guess can be good, but
why three, and then there are three more listed as obsolete/stalled.

Clearly something is wrong here.

The Python-hosted PostgreSQL page has similar problems:

http://wiki.python.org/moin/PostgreSQL

Does Perl have a similar mess?

While I realize experienced people can easily navigate this confusion, I
am concerned about new Postgres adopters being very confused by this and
it is hurting our database adoption in general.  

What is really needed is for someone to take charge of one of these
projects and make a best-of-breed Python driver that can gain general
acceptance as our preferred driver.  I feel Python is too important a
language to be left in this state.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Massa, Harald Armin
Bruce,

   http://wiki.postgresql.org/wiki/Python

 The first one listed, Psycopg, is noted as preferred libpq-based
 driver, but the license is GPL.  Isn't that a problem for many client
 applications?

 The licence of psycopg2 is a little more complicated; the GPL in that
summary just tries to sum it. The actual licence is: [from the LICENCE in
the source distribution]

-
psycopg and the GPL
===

psycopg is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version. See file COPYING for details.

As a special exception, specific permission is granted for the GPLed
code in this distribition to be linked to OpenSSL and PostgreSQL libpq
without invoking GPL clause 2(b).

Note that the GPL was chosen to avoid proprietary adapters based on
psycopg code. Using psycopg in a proprietary product (even bundling
psycopg with the proprietary product) is fine as long as:

 1. psycopg is called from Python only using only the provided API
(i.e., no linking with C code and no C modules based on it); and

 2. all the other points of the GPL are respected (you offer a copy
of psycopg's source code, and so on.)

Alternative licenses


If you prefer you can use the Zope Database Adapter ZPsycopgDA (i.e.,
every file inside the ZPsycopgDA directory) user the ZPL license as
published on the Zope web site, http://www.zope.org/Resources/ZPL.

Also, the following BSD-like license applies (at your option) to the
files following the pattern psycopg/adapter*.{h,c} and
psycopg/microprotocol*.{h,c}:

 Permission is granted to anyone to use this software for any purpose,
 including commercial applications, and to alter it and redistribute it
 freely, subject to the following restrictions:

 1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product documentation
would be appreciated but is not required.

 2. Altered source versions must be plainly marked as such, and must not
be misrepresented as being the original software.

 3. This notice may not be removed or altered from any source distribution.

psycopg is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Proprietary licenses


A non-exclusive license is available for companies that want to include
psycopg in their proprietary products without respecting the spirit of the
GPL. The price of the license is one day of development done by the author,
at the consulting fee he applies to his usual customers at the day of the
request.

-

Harald

-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pigeon
-
%s is too gigantic of an industry to bend to the whims of reality


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Massa, Harald Armin wrote:
 Bruce,
 
http://wiki.postgresql.org/wiki/Python
 
  The first one listed, Psycopg, is noted as preferred libpq-based
  driver, but the license is GPL.  Isn't that a problem for many client
  applications?
 
  The licence of psycopg2 is a little more complicated; the GPL in that
 summary just tries to sum it. The actual licence is: [from the LICENCE in
 the source distribution]

Wow, that is super-confusing.  I am dealing with an issue now that some
companies are concerned that individual names appear with our own BSD
copyrights in small files.   That pales in comparison to this copyright.
I would never be able to recommend software with that complex a
copyright.

I think the community needs to look at other alternatives.

(I am amazed that not only are there a confusing number of drivers, but
there are a confusing number of copyright options for this single driver.)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Peter Eisentraut
On fre, 2010-02-05 at 09:19 -0500, Bruce Momjian wrote:
 While I realize experienced people can easily navigate this confusion,
 I
 am concerned about new Postgres adopters being very confused by this
 and
 it is hurting our database adoption in general.  
 
 What is really needed is for someone to take charge of one of these
 projects and make a best-of-breed Python driver that can gain general
 acceptance as our preferred driver.  I feel Python is too important a
 language to be left in this state.

The situation is unfortunate, but you might as well argue that too many
Linux desktops or Linux distributions confuse new users and hurt
adoption.  These alternatives all exist for a reason, and it will be
difficult to get some of them to abandon their work with the aim of
reducing the overall confusion.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Tim Bunce
On Fri, Feb 05, 2010 at 09:19:26AM -0500, Bruce Momjian wrote:
 My son has brought to my attention that our current crop of Python
 client libraries is inadequate/confusing.  I took a look myself, and
 asked on our IRC channel, and am now convinced this area needs
 attention.
 
   http://wiki.postgresql.org/wiki/Python
 The Python-hosted PostgreSQL page has similar problems:
   http://wiki.python.org/moin/PostgreSQL
 
 Does Perl have a similar mess?

I don't think so.

The primary database interface is DBI and as far as I can see there's
only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/

The only non-DBI interfaces I could find (by skimming the 384 results
for postgresql on search.cpan.org) were:
Postgres: http://search.cpan.org/dist/Postgres/last updated in 1998.
Pg:   http://search.cpan.org/dist/pgsql_perl5/ last updated in 2000.

Tim.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Peter Eisentraut wrote:
 On fre, 2010-02-05 at 09:19 -0500, Bruce Momjian wrote:
  While I realize experienced people can easily navigate this confusion,
  I
  am concerned about new Postgres adopters being very confused by this
  and
  it is hurting our database adoption in general.  
  
  What is really needed is for someone to take charge of one of these
  projects and make a best-of-breed Python driver that can gain general
  acceptance as our preferred driver.  I feel Python is too important a
  language to be left in this state.
 
 The situation is unfortunate, but you might as well argue that too many
 Linux desktops or Linux distributions confuse new users and hurt
 adoption.  These alternatives all exist for a reason, and it will be
 difficult to get some of them to abandon their work with the aim of
 reducing the overall confusion.

Agreed, but can't we do a better job of steering new users and improving
the novice experience for the most popular of them, i.e. think Ubuntu.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Tim Bunce wrote:
 On Fri, Feb 05, 2010 at 09:19:26AM -0500, Bruce Momjian wrote:
  My son has brought to my attention that our current crop of Python
  client libraries is inadequate/confusing.  I took a look myself, and
  asked on our IRC channel, and am now convinced this area needs
  attention.
  
  http://wiki.postgresql.org/wiki/Python
  The Python-hosted PostgreSQL page has similar problems:
  http://wiki.python.org/moin/PostgreSQL
  
  Does Perl have a similar mess?
 
 I don't think so.
 
 The primary database interface is DBI and as far as I can see there's
 only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/
 
 The only non-DBI interfaces I could find (by skimming the 384 results
 for postgresql on search.cpan.org) were:
 Postgres: http://search.cpan.org/dist/Postgres/last updated in 1998.
 Pg:   http://search.cpan.org/dist/pgsql_perl5/ last updated in 2000.

Yes, that's what I thought, and am glad to here it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Peter Eisentraut
On fre, 2010-02-05 at 14:45 +, Tim Bunce wrote:
  Does Perl have a similar mess?
 
 I don't think so.
 
 The primary database interface is DBI and as far as I can see there's
 only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/

I think another difference is that the Perl DBI interface is very rich,
whereas the Python DB-API is quite minimal and almost forces people to
write (incompatible) extensions.  The DB-SIG at Python that ought to
drive all this is also quite dead, possibly because everyone has moved
on to SQLAlchemy.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Peter Eisentraut wrote:
 On fre, 2010-02-05 at 14:45 +, Tim Bunce wrote:
   Does Perl have a similar mess?
  
  I don't think so.
  
  The primary database interface is DBI and as far as I can see there's
  only one DBI PostgreSQL driver: http://search.cpan.org/dist/DBD-Pg/
 
 I think another difference is that the Perl DBI interface is very rich,
 whereas the Python DB-API is quite minimal and almost forces people to
 write (incompatible) extensions.  The DB-SIG at Python that ought to
 drive all this is also quite dead, possibly because everyone has moved
 on to SQLAlchemy.

I assumed it would be Perl that had confusion because of the Perl 6, but
obviously it is Python.  The second PL/Python server-side language that
was proposed recently is taking on a whole new appearance for me.  :-(

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Josh Berkus

 I think another difference is that the Perl DBI interface is very rich,
 whereas the Python DB-API is quite minimal and almost forces people to
 write (incompatible) extensions.  The DB-SIG at Python that ought to
 drive all this is also quite dead, possibly because everyone has moved
 on to SQLAlchemy.

For people who use Python a lot, could I have a list of the deficiencies
in DBAPI?  I've got my horse and lance ready.

Given that SQLAlchemy isn't for everyone, of course ... it couldn't be,
or Django would use it, no?

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Josh Berkus

 The situation is unfortunate, but you might as well argue that too many
 Linux desktops or Linux distributions confuse new users and hurt
 adoption.  These alternatives all exist for a reason, and it will be
 difficult to get some of them to abandon their work with the aim of
 reducing the overall confusion.

I'm not as concerned about confusion as the fact that *all* of the
various Python drivers suck in different, and crippling, ways.  I don't
care how many drivers we have, as long as we have at least one 1st-class
driver.

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Josh Berkus wrote:
 
  The situation is unfortunate, but you might as well argue that too many
  Linux desktops or Linux distributions confuse new users and hurt
  adoption.  These alternatives all exist for a reason, and it will be
  difficult to get some of them to abandon their work with the aim of
  reducing the overall confusion.
 
 I'm not as concerned about confusion as the fact that *all* of the
 various Python drivers suck in different, and crippling, ways.  I don't
 care how many drivers we have, as long as we have at least one 1st-class
 driver.

Agreed, and we can point newbies to that one great driver.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Marko Kreen wrote:
 Psycopg was the leader, especially in web-environments,
 but it has non-obvious license and with dead website it does not
 seem that attractive.  Although it is well-maintained still.
 
 Best path forward would be to talk with Psycopg guys about
 license clarification/change.
 
 I suspect GPL does not extend anyway to Python code that imports
 it dynamically, and it does not seem to be their intention - they
 even try to tell it in their current clarification, but it
 is not as clear as it could be.

Yes, this is a good analysis and has some good suggestions for moving
forward.  In summary, I don't think the current PG/Python driver
situation is helping PG or Python, and I believe there are enough people
who care about those two to get a group together to sort out a viable
direction and start working toward the goal of improving things.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Greg Smith

Bruce Momjian wrote:

While I realize experienced people can easily navigate this confusion...


No, that's the worst part--the more you know and the deeper you dig into 
it, the more broken you realize the whole thing is.  When one of the 
best drivers (in some respects) has a web page that looks like this:  
http://initd.org/ that doesn't seem so bad at first--but if you're 
experienced, you know that the page has been in that disturbing state 
since late 2006.  You start digging into the driver mess, figure you 
just need to learn how things fit together, but the hole keeps getting 
bigger as you dig.


The issues here have already been identified:  the Perl DBI is an 
excellent spec, while the Python one is so weak everybody ends up 
needing their own extensions to it.  And then portability *even among 
Python PostgreSQL drivers* goes out the window.  If somebody built a 
BSD/MIT licensed driver that replaces every useful feature of all the 
forks, with no major problems, and a couple of major projects switched 
over to it (think Skype level big), maybe this mess could get 
resolved.  I think it would take someone already familiar with the major 
issues involved a couple of months of regular work to make any serious 
progress on it.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Bruce Momjian
Greg Smith wrote:
 Bruce Momjian wrote:
  While I realize experienced people can easily navigate this confusion...
 
 No, that's the worst part--the more you know and the deeper you dig into 
 it, the more broken you realize the whole thing is.  When one of the 
 best drivers (in some respects) has a web page that looks like this:  
 http://initd.org/ that doesn't seem so bad at first--but if you're 
 experienced, you know that the page has been in that disturbing state 
 since late 2006.  You start digging into the driver mess, figure you 
 just need to learn how things fit together, but the hole keeps getting 
 bigger as you dig.
 
 The issues here have already been identified:  the Perl DBI is an 
 excellent spec, while the Python one is so weak everybody ends up 
 needing their own extensions to it.  And then portability *even among 
 Python PostgreSQL drivers* goes out the window.  If somebody built a 
 BSD/MIT licensed driver that replaces every useful feature of all the 
 forks, with no major problems, and a couple of major projects switched 
 over to it (think Skype level big), maybe this mess could get 
 resolved.  I think it would take someone already familiar with the major 
 issues involved a couple of months of regular work to make any serious 
 progress on it.

Yes, it is going to take serious time, and it is going to take more than
one person to be involved, but I think there are enough people who care
that something serious can be done to improve its current state.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Jeff Davis
On Fri, 2010-02-05 at 09:38 -0500, Bruce Momjian wrote:
 Wow, that is super-confusing.

Agreed. Standardization among licenses is useful, and I think it's
important to have a driver with a license that people already
understand.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Marko Kreen
On 2/5/10, Josh Berkus j...@agliodbs.com wrote:
   I think another difference is that the Perl DBI interface is very rich,
   whereas the Python DB-API is quite minimal and almost forces people to
   write (incompatible) extensions.  The DB-SIG at Python that ought to
   drive all this is also quite dead, possibly because everyone has moved
   on to SQLAlchemy.


 For people who use Python a lot, could I have a list of the deficiencies
  in DBAPI?  I've got my horse and lance ready.

I took quick glance on DBI and compared to that, DB-API does
not define specific api for:
- resultset cursors
- prepared plans
- db structure examination

OTOH, my guess is that DB-API authors assumed the first two
are used transparently by the driver, and I see no reason why
they cannot be, with current DB-API.

Last item is rich indeed, but seems slightly overengineered..

Now, none of the differences between drivers and current confusion
seem to be related to above points.  Hysterical Raisins is only
good explanation about current situation.

The pg8000 / bpgsql seem to be toy projects, and anyway you dont
want to use pure-Python drivers in high-performance environments.
We are not talking about C#/java here.

py-postgresql seems to be more serious, but as it's python3 only
which makes it irrelevant today.

PyGreSQL is the oldest, older than DB-API, and so it's DB-API
interface seems an afterthought and is untested/underused - eg.
it does not support bytea.

Psycopg was the leader, especially in web-environments,
but it has non-obvious license and with dead website it does not
seem that attractive.  Although it is well-maintained still.

Best path forward would be to talk with Psycopg guys about
license clarification/change.

I suspect GPL does not extend anyway to Python code that imports
it dynamically, and it does not seem to be their intention - they
even try to tell it in their current clarification, but it
is not as clear as it could be.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Josh Berkus

 Imho a big problem is that it does way too much itself - i.e. it does not use 
 things like PQExecParams but does escaping/parsing itself...
 Other people may think thats a good idea - I definitely do not think so.

It also has issues with transaction control which cause idle
transactions if the Django front-end times out due to load.

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Andres Freund
On Friday 05 February 2010 21:34:53 Marko Kreen wrote:
 On 2/5/10, Josh Berkus j...@agliodbs.com wrote:
I think another difference is that the Perl DBI interface is very
rich, whereas the Python DB-API is quite minimal and almost forces
people to write (incompatible) extensions.  The DB-SIG at Python that
ought to drive all this is also quite dead, possibly because everyone
has moved on to SQLAlchemy.
  
  For people who use Python a lot, could I have a list of the deficiencies
  
   in DBAPI?  I've got my horse and lance ready.
 Psycopg was the leader, especially in web-environments,
 but it has non-obvious license and with dead website it does not
 seem that attractive.  Although it is well-maintained still.
Imho a big problem is that it does way too much itself - i.e. it does not use 
things like PQExecParams but does escaping/parsing itself...
Other people may think thats a good idea - I definitely do not think so.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread Marko Kreen
On 2/5/10, Greg Smith g...@2ndquadrant.com wrote:
  The issues here have already been identified:  the Perl DBI is an excellent
 spec, while the Python one is so weak everybody ends up needing their own
 extensions to it.  And then portability *even among Python PostgreSQL
 drivers* goes out the window.

Well, no.   You are overplaying on the DBI angle.  If the driver can
offer via the minimal DB-API the same features the DBI driver
can offer, then the DBI API is bloated, not the minimal API weak...

Not that DB-API is perfect, a way to toggle prepare and db cursors
on/off in per-query basis would be good to have, but thats mostly it.
Rest of extra features (COPY) need to be done as extensions anyway.

But all that is off-topic to the current driver suckiness.

The problems with Python drivers are that they fail to implement
even the minimal DB-API 2.0:

- correctly, across all interesting data types.
  PygreSQL (bytea).
  pg8000 (does rollback after query fails - WTF).
  None of the drivers seem to support both array and tuple.

- robustly.  Psycopg2 - their misguided attempt to turn DB-API
  calls optionally async cost their stability big-time as it
  complexified their codebase.  It got stable around 2.0.[678].
  I'm not holding my breath on the other ones.
  Fear the fancy API-s.

- with threadsafety level  1.  (I don't care, but seems in some
  circles it's necessary.)

Note how many ways a driver can suck without stepping above
basic DB-API 2.0.

 If somebody built a BSD/MIT licensed driver
 that replaces every useful feature of all the forks, with no major problems,
 and a couple of major projects switched over to it (think Skype level
 big), maybe this mess could get resolved.  I think it would take someone
 already familiar with the major issues involved a couple of months of
 regular work to make any serious progress on it.

Yeah, but the problem here is that there actually is not a problem.
Psycopg license is not bad, just confusing on first
sight.  And website problem can be fix by advertising
 http://wiki.postgresql.org/wiki/Psycopg as new website...

If your requirement is BSD license, then probably getting PyGreSQL
into shape would be faster than writing from scratch.
Main dev effort would be writing Postgres-specific DB-API 2.0
testsuite anyway.  It's likely that actual improvements needed
would be small.

Also I would suggest to match Psycopg extensions API, that would
make it usable to much broader audience.  (Eg: PygreSql does
not offer COPY extensions via the DB-API 2.0 API.  It would be
good to match Psycopg here, instead inventing new API.)

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread James William Pye
On Feb 5, 2010, at 1:34 PM, Marko Kreen wrote:
 py-postgresql seems to be more serious, but as it's python3 only
 which makes it irrelevant today.

Furthermore, if it did work on python2, it's *not* something that's going to 
appeal to mainstream users (Python heavy web frameworks) as it *partially* 
suffers from the same problem that pg8000 does. It's mostly pure-Python, but it 
has some C optimizations(notably, PQ message buffer). I have done some 
profiling, and *with a few tweaks* it's about 2x-3x *slower than psycopg2* for 
the retrieval of a single int column. I think it could go faster, but I don't 
think it's worth the work.

ISTM that the target audience are folk who are married to PG, and are generally 
unhappy with DB-API, but do not want to buy into a big abstraction layer like 
SQLAlchemy. Sure, it supports DB-API like other drivers so it *would be* usable 
with frameworks, but why take the 3x *or greater* hit over a properly 
implemented libpq version?

Finally, I just don't see the existing (often PG specific) goals that I have in 
mind for it appealing to the majority of [web framework/abstraction] users.

 Psycopg was the leader, especially in web-environments,
 but it has non-obvious license and with dead website it does not
 seem that attractive.  Although it is well-maintained still.
 
 Best path forward would be to talk with Psycopg guys about
 license clarification/change.

Yep.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread James William Pye
On Feb 5, 2010, at 11:34 AM, Josh Berkus wrote:
 For people who use Python a lot, could I have a list of the deficiencies
 in DBAPI?  I've got my horse and lance ready.
 
 Given that SQLAlchemy isn't for everyone, of course ... it couldn't be,
 or Django would use it, no?


Here are some to start with:

 - paramstyle
 - No direct support of prepared statements
   [they *tried* to compensate for this with cached statements, but it's 
inadequate]
 - Too many *optional* extensions (Cursor Method .__iter__() being rather 
notable)

http://www.python.org/dev/peps/pep-0249/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusion over Python drivers

2010-02-05 Thread James William Pye
On Feb 5, 2010, at 8:00 AM, Peter Eisentraut wrote:
 I think another difference is that the Perl DBI interface is very rich,
 whereas the Python DB-API is quite minimal and almost forces people to
 write (incompatible) extensions.  

Yep.

 The DB-SIG at Python that ought to drive all this is also quite dead,

From reading messages that come into that list(not-so-lately), I've gotten the 
impression that they are content with DB-API 2. Aside from the TPC interfaces, 
the last round of activity that I know of was dbapi3[1 (which was a long while 
ago)].

 possibly because everyone has moved on to SQLAlchemy.

Yeah. AFAICT, that is the general direction encouraged by DB-SIG. Write an 
abstraction on top of DB-API. SQLAlchemy, SQLObject, anysql, and I'm sure there 
are others.


[1] http://wiki.python.org/moin/Aug2001DbApi3Strawman
http://wiki.python.org/moin/DbApi3
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers