Re: [GENERAL] plpython function problem workaround
The only ?issue? that I have found with it is similar to an issue I posted about multiline in general, which does not seem to be considered a bug. I would say if it is documented that any newlines in a python function, including embedded newlines, are *NIX newlines no matter what operating system the function is created on, that would suffice. As an example - Windows PGAdmin client. Linux Server: create table test1(f1 varchar(50)); insert into test1 values('this is a multi line string line2 line3 ') select * from test1 where f1='this is a multi line string line2 line3 ' --returns 1 row create or replace function testnewlines() returns int as $$ x=plpy.execute(select f1 from test1 where f1='this is a multi line string\r\nline2\r\nline3\r\n') return x.nrows() $$ language 'plpythonu' --returns 1 create or replace function testnewlines() returns int as $$ x=plpy.execute(select f1 from test1 where f1='this is a multi line string line2 line3 ') return x.nrows() $$ language 'plpythonu' --returns 0 Thank You Sim Zacks On Fri, Mar 18, 2005 at 10:12:05PM -0700, Michael Fuhr wrote: I just submitted a small patch to convert CRLF = LF, CR = LF. This patch is in 8.0.2beta1, so PL/Python users might want to test it before 8.0.2 is released. See the recent 8.0.2 Beta Available announcement: http://archives.postgresql.org/pgsql-general/2005-03/msg01311.php -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
On Tue, 29 Mar 2005, Sim Zacks wrote: The only ?issue? that I have found with it is similar to an issue I posted about multiline in general, which does not seem to be considered a bug. I've posted similar concerns in the past. The whole point is that there are two possible approaches: 1) treat text as binary - as we do now; 2) do on the wire conversion - like FTP ASCII mode. Both have disadvantages, and both lead to unexpected results. As I wrote before, 2) is more problematic. You'll have to reject any file with a bare \n from a Windows, otherwise you won't be able to process it correclty. I think if you do: insert into test (sometext) values ('Line one\nLine two\r\n'); -- with the literal chars, not the escape sequences you're expecting exaclty the same on output. If the server converts it in the Unix form: 'Line one\nLine two\n' for storing and the converts back to the Windows form, when you do: select sometext from test; -- from a Windows client you get: Line one\r\nLine two\r\n which is not the same you entered. I doubt FTP ASCII mode handles this correctly. As for the examples you made (the python functions), it's a problem with python string literals (just don't use them). Let's try this: -- CUT HERE 8 #!/usr/bin/env python import pgdb db = pgdb.connect() curs = db.cursor() # this is only to emulate PGAdmin under Windows (I don't have it) # (I ran the script with these uncommented on Linux) #q = create temp table test1(f1 varchar(50)); #curs.execute(q) #q = insert into test1 values('this is a multi line string\r\nline2\r\nline3\r\n'); #curs.execute(q) real test # embedded in string literal q1 = select count(f1) from test1 where f1 = 'this is a multi line string line2 line3 ' # escapes (expanded by python) q2 = select count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n' # escapes (expanded by PostgreSQL) q3 = rselect count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n' curs.execute(q3) # stating the obvious print Comparisons: print %-10s%-10s%-10s % (q1 == q2, q1 == q3, q2 == q3) print %-10s%-10s%-10s % (q1 == q2,q1 == q3, q2 == q3) print \nRunning tests... curs.execute(q1) print Test 1 (string literal):, curs.fetchone()[0] curs.execute(q2) print Test 2 (Python escapes):, curs.fetchone()[0] curs.execute(q3) print Test 3 (PG escapes):, curs.fetchone()[0] # in case someone wonders, let's try using query parameters astring = this is a multi line string line2 line3 q = select count(f1) from test1 where f1 = %(mystr)s curs.execute(q, { mystr: astring }) print Test 4 (parameters):, curs.fetchone()[0] -- 8 CUT HERE This is the output (on Linux): Comparisons: q1 == q2 q1 == q3 q2 == q3 False False False Running tests... Test 1 (string literal): 0 Test 2 (Python escapes): 1 Test 3 (PG escapes): 1 Test 4 (parameters): 0 Which is consistent with your examples, that is, it works _only_ with explicit escapes (either at Python level or at PostgreSQL level). If client-side python works this way, why are you expecting server-side python to work differently? .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] plpython function problem workaround
On Tue, 29 Mar 2005, Marco Colombo wrote: # escapes (expanded by PostgreSQL) q3 = rselect count(f1) from test1 where f1 = 'this is a multi line string\r\nline2\r\nline3\r\n' curs.execute(q3) ^^ This line (no. 28) is useless (but harmless), please ignore it (just a cutpaste error). .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] plpython function problem workaround
On Fri, Mar 18, 2005 at 10:12:05PM -0700, Michael Fuhr wrote: I just submitted a small patch to convert CRLF = LF, CR = LF. This patch is in 8.0.2beta1, so PL/Python users might want to test it before 8.0.2 is released. See the recent 8.0.2 Beta Available announcement: http://archives.postgresql.org/pgsql-general/2005-03/msg01311.php -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] plpython function problem workaround
Sorry it took me so long to respond. I've been out for a couple days. While certain things may be permissible in a language, I think it is also important to look at the context at which the language is applied and make a determination if it will practically turn up in relevant code. If the answer is no, then I think it would be acceptable to add a note that the postgresql python implementation does not accept implicit newlines and any newlines must be explicit (the text \r\n) That being said, the only place I can see wanting to use a newline is for a situation where you would want to either do a select statement with a multiline where such as results = plpy.execute(...where textfield= 'multi line string'...) or if you returned the results and wanted to do the same type of thing. if results[0][textfield]==multi line string: ... However, this example assumes that the client writing the functions and the client inserting the data are always using the same OS, because if a linux client inserted the data and a windows client had written the function, it wouldn't receive any results, because postgresql sees the newline as valid characters. hat being the case I would say that while it is entirely possible to put multiline quoted text in python code, it would be inappropriate in a postgresql environment. Sim Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Sim Zacks [EMAIL PROTECTED] writes: I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. Does python actually disallow newlines in string literals? That is x = 'foo bar' Whether you think this is good style is not the question --- is it allowed by the language? regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Fri, 18 Mar 2005, Tom Lane wrote: Marco Colombo [EMAIL PROTECTED] writes: Right now I'm parsing the string first, changing the resulting parse tree adding missing nodes (def, INDENT, DEINDENT) and then compiling it. Hmmm ... is this really going to be simpler or more robust than lexing the string carefully enough to insert tabs at the right places? The impression I had so far was that you'd only need to understand about Python's string-literal conventions to get that right ... and that's something that's not likely to change. I'm not so sure that parse trees can be regarded as an immutable API. regards, tom lane I've completed a proof of concept, I think I can answer: - simpler? not at all. It requires understanding of how the parser works. The whole thing is about 50 lines long, but quite a bit of parser magic is going on. And I'm far from the point I can be confident about it doing always the right thing. I still have to handle (de)allocations correctly. - more robust - yes. The only way to make sure we're lexing the string the same way python does is to use its lexer. Every single difference however subtle would call for a bug. And it's re-invening the wheel. But there's no way you can work at lexer level that I'm aware of. That is, to add tokens before sending them to the parser. So you have to work on the parser output. - I have no idea if the node API is immutable at all. For sure, the interface I'm using is one or two levels below the current one, and yes, it's more likely to change. I share your concerns here. If our problem is only string literals, maybe we can handle them with a dedicated lexer. Python string literals are quite complex (compared to other languages): http://docs.python.org/ref/strings.html but not that hard. Still, my first concern is that one day we find another corner case in python syntax that makes our blind tab adding fail. And we're back to square one. BTW, I'm not preparing a patch for now, I'm working with a test program. As soon as I finish it, either I'll post it or prepare a patch against plpython.c, for consideration. I won't say it is ready for inclusion until someone else more knowledgeable than me on both PostgreSQL and python embedding looks at it, anyway. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] plpython function problem workaround
On Thu, 17 Mar 2005, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: On Thu, Mar 17, 2005 at 01:03:36PM +0100, Marco Colombo wrote: OMG! It's indenting the funtion body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I undestand that once you do it you'll have to shift the code by an indentation level. Presumbly because it wants to create a function, which can later be called. Since python is sensetive to whitespace it has to indent the code to make it work. Seems like we have to upgrade that thing to have a complete understanding of Python lexical rules --- at least enough to know where the line boundaries are. Which is pretty much exactly the same as knowing which CRs to strip out. So I guess we have a candidate place for a solution. Anyone want to code it up? I don't know enough Python to do it ... I'm no expert but I'll look into it. Unless someone else already tried it, I want to investigate first if it's possible to create a callable object w/o using 'def', which alters the name space and (the thing we're interested to) needs an extra identation level. At first sight, what we do now (at function creation time) is: 1) execute a function definition; 2) compile a function call, and save the resulting code object for later use. I'm wondering if we can save one step, and use a python callable object. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] plpython function problem workaround
On Thu, 17 Mar 2005, Michael Fuhr wrote: On Thu, Mar 17, 2005 at 09:48:51PM -0500, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: Line-ending CRs stripped, even inside quotes; mid-line CRs converted to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder what Python on Windows would do. Unfortunately, I don't think that proves anything, because according to earlier discussion Python will do newline-munging when it reads a file (including a script file). The question that we have to deal with is what are the rules for a string fed to PyRun_String ... and it seems those rules are not the same. Marco, you've stated that you're against munging the code because it's not our job to 'fix' data coming from the client. But I'm suggesting that we think about the code in a different way than the current implementation does: not as a literal that we pass untouched to the Python interpreter, but rather as code that Python would munge anyway if it had read that code from a file. We could still store the code exactly as received and have the language handler munge it on the fly, as we've discovered it's already doing. Comments? Have I overlooked anything? Could munging CRs have effects that a Python programmer wouldn't expect if the same code had been read from a file? Since it mimics Python's own behavior with code read from a file, can anybody justify not doing it? If you put it that way, I'm 100% with you. Just let me suggest not to mimic its behaviour, but to use the Real Thing if we manage to. That is, directly use the Universal Line Support code provided by python itself, so that we don't even have to think about being compatible. Unfortunately, I'm new to python embedding. I think I'll ask on python lists about the function definition code. Actually, we are kind of a corner case: we ask the user to provide the function body w/o the initial def declaration. We're treating partial, incomplete python code and not a well-formed program, so we have to munge it anyway. I have no idea if and how the python C API lets you control such low level details. I think what we really want is to create a callable (anonymous) object from the source of its body. I'm experimenting a bit, trying to write a PLy_procedure_compile() that does not require source munging. I'm aiming at removing the need for extra indentation. The \r\n thing is another beast, and I'm not sure it belongs to the same place in our code. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
On Fri, Mar 18, 2005 at 11:34:46AM +0100, Marco Colombo wrote: Just let me suggest not to mimic its behaviour, but to use the Real Thing if we manage to. That is, directly use the Universal Line Support code provided by python itself, so that we don't even have to think about being compatible. Sounds good if the Python API provides the hooks for doing so. I had started looking into that but didn't spent much time on it. I'm experimenting a bit, trying to write a PLy_procedure_compile() that does not require source munging. I'm aiming at removing the need for extra indentation. Sounds good too, if that'll work. Looking forward to seeing what you find out. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
Michael Fuhr [EMAIL PROTECTED] writes: On Fri, Mar 18, 2005 at 12:35:07AM -0500, Tom Lane wrote: So that part of it can be solved fairly easily. Should I submit a patch? It should be only a few additional lines in PLy_procedure_munge_source(). Would you apply it only to HEAD, or would it be considered a bug fix that REL8_0_STABLE could get as well? It might be nice to have it in 8.0.2, whenever that comes out. I think it would be reasonable to back-patch a small fix to convert CRLF. The sort of rewrite Marco is considering, I wouldn't back-patch. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] plpython function problem workaround
On Fri, 18 Mar 2005, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: On Fri, Mar 18, 2005 at 12:35:07AM -0500, Tom Lane wrote: So that part of it can be solved fairly easily. Should I submit a patch? It should be only a few additional lines in PLy_procedure_munge_source(). Would you apply it only to HEAD, or would it be considered a bug fix that REL8_0_STABLE could get as well? It might be nice to have it in 8.0.2, whenever that comes out. I think it would be reasonable to back-patch a small fix to convert CRLF. The sort of rewrite Marco is considering, I wouldn't back-patch. regards, tom lane Short update: it isn't possible to get a callable object directly from the source provided by the user, since 'return' is not valid outside function definitions in Python. Right now I'm parsing the string first, changing the resulting parse tree adding missing nodes (def, INDENT, DEINDENT) and then compiling it. Grammar definitions for a python function is: funcdef: [decorators] 'def' NAME parameters ':' suite suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT What we get from the users is stmt+ (a sequence of stmt). The INDENT and DEDENT tokens are generated by the parser only when indentation level _changes_. My plan is to generate two parse trees, one from this code (with the right fname of course): def fname(): pass and one from the function definition. Then, we attach the root of the second tree where the pass node is in the first tree. We should get a parse tree ready for compilation. I wish I could push the right tokens in the right places, but it seems it's not possible. Stay tuned. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
Marco Colombo [EMAIL PROTECTED] writes: Right now I'm parsing the string first, changing the resulting parse tree adding missing nodes (def, INDENT, DEINDENT) and then compiling it. Hmmm ... is this really going to be simpler or more robust than lexing the string carefully enough to insert tabs at the right places? The impression I had so far was that you'd only need to understand about Python's string-literal conventions to get that right ... and that's something that's not likely to change. I'm not so sure that parse trees can be regarded as an immutable API. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] plpython function problem workaround
On Fri, Mar 18, 2005 at 10:33:01AM -0500, Tom Lane wrote: I think it would be reasonable to back-patch a small fix to convert CRLF. The sort of rewrite Marco is considering, I wouldn't back-patch. I just submitted a small patch to convert CRLF = LF, CR = LF. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] plpython function problem workaround
On Wed, 16 Mar 2005, Michael Fuhr wrote: [I've changed the Subject back to the thread that started this discussion.] On Wed, Mar 16, 2005 at 05:52:02PM +0100, Marco Colombo wrote: I'm against to any on-the-fly conversion, now. I don't like the idea of PostgreSQL accepting input in one form (\r\n) and providing output in a different form (\n). Also think of a function definition with mixed \r\n and \n lines: we'd have no way to reconstruct the original input. Yeah, that's a reasonable argument against modifying the function source code before storing it in pg_proc. But I expect this problem will come up again, and some people might not care about being able to reconstruct the original input if it's just a matter of stripped carriage returns, especially if the function logic doesn't use literal carriage return characters that would be missed. For those people, the validator hack might be an acceptable way to deal with a client interface that inserts carriage returns that the programmer didn't intend anyway. Not necessarily as part of the core PostgreSQL code or even distributed with PostgreSQL, but as something they could install if they wanted to. Agreed. I think we should just state that text used for function definitions is \n-delimited. Some languages may accept \r\n as well, but that's undocumented side effect, and bad practice. Whether it's an undocumented side effect depends on the language, and whether it's bad practice is a matter of opinion. Sure. I mean, we may just state that, per spec. Program data should be \n-delimeted, full stop. It sounds sensible to me. Just put it somewhere in the docs, problem solved. We're loosing nothing. I'm just proposing to add that to the docs/specs. In any case, that's the language's concern and not something PostgreSQL should judge or enforce. PostgreSQL shouldn't have to know or care about a procedural language's syntax -- a function's source code should be an opaque object that PostgreSQL stores and passes to the language's handler without caring about its contents. Syntax enforcement should be in the language's validator or handler according to the language's own rules. That's what we do now. My point being it's not our job to fix data coming from the client. If a client app creates a plpython function the wrong way, fix it. Why should we place a paperbag on a client bug? Speaking of code munging and syntax enforcement, have a look at this: CREATE FUNCTION foo() RETURNS text AS $$ return line 1 line 2 line 3 $$ LANGUAGE plpythonu; SELECT foo(); foo -- line 1 line 2 line 3 (1 row) Eh? Where'd those leading tabs come from? Why, they came from PLy_procedure_munge_source() in src/pl/plpython/plpython.c: mrc = PLy_malloc(mlen); plen = snprintf(mrc, mlen, def %s():\n\t, name); Assert(plen = 0 plen mlen); sp = src; mp = mrc + plen; while (*sp != '\0') { if (*sp == '\n') { *mp++ = *sp++; *mp++ = '\t'; } else *mp++ = *sp++; } *mp++ = '\n'; *mp++ = '\n'; *mp = '\0'; How about them apples? The PL/Python handler is already doing some fixup behind the scenes (and potentially causing problems, as the example illustrates). OMG! It's indenting the funtion body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I undestand that once you do it you'll have to shift the code by an indentation level. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Thu, Mar 17, 2005 at 01:03:36PM +0100, Marco Colombo wrote: OMG! It's indenting the funtion body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I undestand that once you do it you'll have to shift the code by an indentation level. Presumbly because it wants to create a function, which can later be called. Since python is sensetive to whitespace it has to indent the code to make it work. There was an example on the web somewhere (the link has been posted to this list) of a peice of python which you can load into the interpreter which will allow it to accept \r\n terminated lines. I don't recall if anyone actually tried it out or not... Won't fix the indenting problem though... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. pgpMaQLR9wCfh.pgp Description: PGP signature
Re: [GENERAL] plpython function problem workaround
Martijn van Oosterhout kleptog@svana.org writes: On Thu, Mar 17, 2005 at 01:03:36PM +0100, Marco Colombo wrote: OMG! It's indenting the funtion body. I think you can't do that w/o being syntax-aware. I'm not familiar with the code, why is it adding a 'def' in front of it at all? I undestand that once you do it you'll have to shift the code by an indentation level. Presumbly because it wants to create a function, which can later be called. Since python is sensetive to whitespace it has to indent the code to make it work. Seems like we have to upgrade that thing to have a complete understanding of Python lexical rules --- at least enough to know where the line boundaries are. Which is pretty much exactly the same as knowing which CRs to strip out. So I guess we have a candidate place for a solution. Anyone want to code it up? I don't know enough Python to do it ... regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] plpython function problem workaround
On Thu, Mar 17, 2005 at 10:49:24AM -0500, Tom Lane wrote: Seems like we have to upgrade that thing to have a complete understanding of Python lexical rules --- at least enough to know where the line boundaries are. Which is pretty much exactly the same as knowing which CRs to strip out. So I guess we have a candidate place for a solution. Anyone want to code it up? I don't know enough Python to do it ... [Sound of crickets] More pabulum for pondering: % cat -v foo.py print '''line 1^M line^M2^M line 3^M '''^M % python foo.py | cat -v line 1 line 2 line 3 % cat -v bar.py print 'line 1^M' % python bar.py File bar.py, line 1 print 'line 1 ^ SyntaxError: EOL while scanning single-quoted string Line-ending CRs stripped, even inside quotes; mid-line CRs converted to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder what Python on Windows would do. If it behaves the same way, then a munging algorithm might be CRLF = LF, otherwise CR = LF. Or we could take Marco's suggestion and do nothing, putting the burden on the client to send the right thing. That doesn't address the indentation munging, though. That appears to be a matter of knowing whether you're inside a quote or not when a LF appears. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
Michael Fuhr [EMAIL PROTECTED] writes: Line-ending CRs stripped, even inside quotes; mid-line CRs converted to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder what Python on Windows would do. Unfortunately, I don't think that proves anything, because according to earlier discussion Python will do newline-munging when it reads a file (including a script file). The question that we have to deal with is what are the rules for a string fed to PyRun_String ... and it seems those rules are not the same. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] plpython function problem workaround
On Thu, Mar 17, 2005 at 09:48:51PM -0500, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: Line-ending CRs stripped, even inside quotes; mid-line CRs converted to LF. Tests done with Python 2.4 on FreeBSD 4.11-STABLE; I wonder what Python on Windows would do. Unfortunately, I don't think that proves anything, because according to earlier discussion Python will do newline-munging when it reads a file (including a script file). The question that we have to deal with is what are the rules for a string fed to PyRun_String ... and it seems those rules are not the same. I was curious about how Python's munging works with quotes that span lines, i.e., when the CRs and LFs might be considered part of a quoted string. Apparently any CR or LF is considered a line ending in an ordinary Python script, with CR and CRLF normalized to LF before being passed to the interpreter, so I'm thinking that a Python programmer wouldn't expect to be able to embed CRs in a string literal and have them remain unchanged. If that's the case, then concerns about CR conversions potentially messing up a user's strings might be unfounded. PL/Python currently treats the function source as a string that's passed unchanged (except for the added def and indentation) to PyRun_String. But that's an implementation detail that the user shouldn't have to care about: I'm wondering if, instead, PL/Python should treat the function source as Python would treat a file and do the same conversions that Python would, namely CRLF = LF and lone CR = LF. That should solve the complaints, and it should be justifiable as more than just a hack: PL/Python would simply be doing the same thing that Python would do if it had read the source from a file. That might even be less surprising than the current behavior. Marco, you've stated that you're against munging the code because it's not our job to 'fix' data coming from the client. But I'm suggesting that we think about the code in a different way than the current implementation does: not as a literal that we pass untouched to the Python interpreter, but rather as code that Python would munge anyway if it had read that code from a file. We could still store the code exactly as received and have the language handler munge it on the fly, as we've discovered it's already doing. Comments? Have I overlooked anything? Could munging CRs have effects that a Python programmer wouldn't expect if the same code had been read from a file? Since it mimics Python's own behavior with code read from a file, can anybody justify not doing it? -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] plpython function problem workaround
Michael Fuhr [EMAIL PROTECTED] writes: Apparently any CR or LF is considered a line ending in an ordinary Python script, with CR and CRLF normalized to LF before being passed to the interpreter, so I'm thinking that a Python programmer wouldn't expect to be able to embed CRs in a string literal and have them remain unchanged. If that's the case, then concerns about CR conversions potentially messing up a user's strings might be unfounded. Yeah, it looks like you are right: http://cvs.sourceforge.net/viewcvs.py/python/python/nondist/peps/pep-0278.txt?rev=1.2 So that part of it can be solved fairly easily. Unfortunately we are still left with the issue of having to lex Python programs enough to know how to indent them ... regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
On Fri, Mar 18, 2005 at 12:35:07AM -0500, Tom Lane wrote: Michael Fuhr [EMAIL PROTECTED] writes: then concerns about CR conversions potentially messing up a user's strings might be unfounded. Yeah, it looks like you are right: http://cvs.sourceforge.net/viewcvs.py/python/python/nondist/peps/pep-0278.txt?rev=1.2 So that part of it can be solved fairly easily. Should I submit a patch? It should be only a few additional lines in PLy_procedure_munge_source(). Would you apply it only to HEAD, or would it be considered a bug fix that REL8_0_STABLE could get as well? It might be nice to have it in 8.0.2, whenever that comes out. Unfortunately we are still left with the issue of having to lex Python programs enough to know how to indent them ... Yeah, that one's a little extra work, but I don't think anybody's complained about it yet. So far I'm thinking it only affects multi-line strings as in the example I posted, and even then it might not matter unless you use that string as a column value or return it to the caller. If you're doing something else like building a query statement, then the bogus indentation is probably irrelevant. It ought to be fixed, but it doesn't seem to be squeaking very loudly. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 02:50:06PM -0700, Michael Fuhr wrote: On Tue, Mar 15, 2005 at 03:41:37PM +, Ragnar Hafstað wrote: actually, perl scripts with \r\n line endings will run just fine in unix/linux. Indeed, and PL/Perl doesn't care. I just tested several PLs with PostgreSQL 8.0.1 on Solaris 9 and here are the results: PL/pgSQL CRLF ok PL/PerlCRLF ok PL/RubyCRLF ok PL/Tcl CRLF ok PL/Python CRLF fails PL/R CRLF fails Details: Thanks for the detailed test you ran. This is something I should have done before spouting off. So, really, AIUI, one of the selling points for each of the above-mentioned languages is their portability. It would appear to me that this newline incompatibility ought to be considered a major bug in python. (Maybe I'm spouting off half-cocked again, though). ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 03:41:37PM +, Ragnar Hafstað wrote: On Tue, 2005-03-15 at 07:33 -0600, David wrote: [about the line-termination problem in plpython] I'd like to insert one note here. While I'm not particularly familiar with either perl or python, when I encountered the referred-to thread regarding Windows/Unix newline incompatibilities in Python, I ran some tests on my Linux system. I wrote some very simple scripts in Perl and Python and tested them. They ran OK when I saved them in native Linux newlines, they ran, but if I saved them in DOS mode (vim), neither the Python nor the Perl scripts would run. actually, perl scripts with \r\n line endings will run just fine in unix/linux. Yes, I believe Michael Fuhr demonstrated that quite admirably in another reply to this message. what you might have been experiencing, is the fact that the unix shell is expecting \n lineendings, and the #! line could have been failing to run the perl executable. Yes, I'm now certain you are correct. On retesting, I caught something I didn't when I first ran them - the error message was identical for each language. The message was : bad interpreter: No such file or directory If the error had been internal to the language, I'm sure the messages from each would have been different. I guess this shows what happens when you shoot off your mouth when you don't know what you're talking about. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
[I've changed the Subject back to the thread that started this discussion.] On Wed, Mar 16, 2005 at 05:52:02PM +0100, Marco Colombo wrote: I'm against to any on-the-fly conversion, now. I don't like the idea of PostgreSQL accepting input in one form (\r\n) and providing output in a different form (\n). Also think of a function definition with mixed \r\n and \n lines: we'd have no way to reconstruct the original input. Yeah, that's a reasonable argument against modifying the function source code before storing it in pg_proc. But I expect this problem will come up again, and some people might not care about being able to reconstruct the original input if it's just a matter of stripped carriage returns, especially if the function logic doesn't use literal carriage return characters that would be missed. For those people, the validator hack might be an acceptable way to deal with a client interface that inserts carriage returns that the programmer didn't intend anyway. Not necessarily as part of the core PostgreSQL code or even distributed with PostgreSQL, but as something they could install if they wanted to. I think we should just state that text used for function definitions is \n-delimited. Some languages may accept \r\n as well, but that's undocumented side effect, and bad practice. Whether it's an undocumented side effect depends on the language, and whether it's bad practice is a matter of opinion. In any case, that's the language's concern and not something PostgreSQL should judge or enforce. PostgreSQL shouldn't have to know or care about a procedural language's syntax -- a function's source code should be an opaque object that PostgreSQL stores and passes to the language's handler without caring about its contents. Syntax enforcement should be in the language's validator or handler according to the language's own rules. Speaking of code munging and syntax enforcement, have a look at this: CREATE FUNCTION foo() RETURNS text AS $$ return line 1 line 2 line 3 $$ LANGUAGE plpythonu; SELECT foo(); foo -- line 1 line 2 line 3 (1 row) Eh? Where'd those leading tabs come from? Why, they came from PLy_procedure_munge_source() in src/pl/plpython/plpython.c: mrc = PLy_malloc(mlen); plen = snprintf(mrc, mlen, def %s():\n\t, name); Assert(plen = 0 plen mlen); sp = src; mp = mrc + plen; while (*sp != '\0') { if (*sp == '\n') { *mp++ = *sp++; *mp++ = '\t'; } else *mp++ = *sp++; } *mp++ = '\n'; *mp++ = '\n'; *mp = '\0'; How about them apples? The PL/Python handler is already doing some fixup behind the scenes (and potentially causing problems, as the example illustrates). -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] plpython function problem workaround
I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. I tried this function, as an example: create or replace function crtest() returns int as $$ x=plpy.execute(select * from pg_proc where prosrc like '%\r%') return x.nrows() $$ language 'plpythonu' and it considered the \r to be text and not a CR. Therefore the update function removed the CR at the end of the line but ignored the \r that was specifically mentioned. Is there any case that someone can think of where an actual CR, not a textual representation of it, would purposefully be put into the code? Sim Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Sim Zacks [EMAIL PROTECTED] writes: I don't think that this type of solution should be discussed as an official patch. If it was, I would recommend solving the problem in source code when the function is passed to the translator. Indeed, but first we need to work out what the necessary translation is. A hack using a VALIDATOR function isn't an unreasonable way to prototype the conversion logic. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] plpython function problem workaround
Now that I think about it, FTP programs currently do this when they transfer ASCII files. If you have a planned CR in your program and FTP from mac to windows the CR becomes a CRLF and if you FTP from mac to unix the CR changes to an LF. Sim Zacks [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. I tried this function, as an example: create or replace function crtest() returns int as $$ x=plpy.execute(select * from pg_proc where prosrc like '%\r%') return x.nrows() $$ language 'plpythonu' and it considered the \r to be text and not a CR. Therefore the update function removed the CR at the end of the line but ignored the \r that was specifically mentioned. Is there any case that someone can think of where an actual CR, not a textual representation of it, would purposefully be put into the code? Sim Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Sim Zacks [EMAIL PROTECTED] writes: I don't think that this type of solution should be discussed as an official patch. If it was, I would recommend solving the problem in source code when the function is passed to the translator. Indeed, but first we need to work out what the necessary translation is. A hack using a VALIDATOR function isn't an unreasonable way to prototype the conversion logic. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] plpython function problem workaround
The following is how I understand it, but please let's delay further discussion until someone tests the program under Windows. On Mon, 14 Mar 2005, Michael Fuhr wrote: Hmmm...I think that would be inconsistent with previous reports. For example, in the following message, the poster said that everything (PostgreSQL, pgAdmin) was running on Windows 2003: http://archives.postgresql.org/pgsql-interfaces/2005-02/msg00066.php I'm sorry, he's wrong. The initial report was by Hong Yuan: http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php later he clarified: http://archives.postgresql.org/pgsql-general/2005-01/msg00858.php I am using pgAdmin III Version 1.2.0 under Chinese Windows XP, while the database is 7.4.6 under Linux. BTW I just noticed someone else provided a simpler example: http://archives.postgresql.org/pgsql-general/2005-01/msg00876.php someone should try and complile those under windows. I suggested that he strip the CRs from pg_proc.prosrc and he said it worked: http://archives.postgresql.org/pgsql-interfaces/2005-03/msg00014.php It's not clear that the test in the second message was run on a Windows server (apparently pgAdmin was run on a Windows client), but I think the beginning of the message is saying that he didn't reply for so long because he didn't have access to a Windows server. From that I infer that he tried my suggestion and posted the results when he finally did get access to a customer's Windows server. I could be misreading that, however. I have no idea of where Michele Bendazzoli ran that code. He's not the original poster, tho. A couple of months ago Stuart Bishop brought this issue up in python-dev. Most of the thread is along the lines of strip the carriage returns: http://mail.python.org/pipermail/python-dev/2005-January/051203.html Sorry again, but he's assuming Unix on the server side, and Windows or Mac on the client side. If anyone manages to compile the following code on Windows... ... I bet on windows the first program fails and the second is ok. Hopefully somebody will do a Windows test of the code you posted. I'd find it bizarre that the Python code embedded in a C program had to care whether it was running on *nix or Windows. I find it perfectly consistent! Face it, _any_ C program that's handling data of type _text_ has to know which platform it is running on. If you don't like the behaviour of Python functions, think of printf(). C programmers under Windows are used to write: printf(Hello World!\r\n); as much (old) Mac programmers write: printf(Hello World!\r); and Unix programmers write: printf(Hello World!\n); _ANY_ C program that processes multiline text input has to know which platform it is running on, otherwise it reads or proceduces garbage. Python just requires the input being text, which seems reasonable to me, since, by design, a python program is more that just a sequence of ';'- separated statements (ala C or Perl), with _optional_ intermixing whitespaces. White spaces ('eol' included) do have a meaning in Python syntax. BTW, this attitude reminds me of PostgreSQL strict validation of input, compared to more relaxed checking made by MySQL. I really don't feel the need to enumerate the merits of input validation on this list. Python functions want 'valid text' and the definition is platform dependent, unfortunately. Why should it be relaxed, and accept invalid text as input? If you are to compile any unix text processing utility, such as grep, on a mac, you'll have to change the source and have the program split lines at \r (under windows, splitting lines at \n still works, but leaves a spurious \r around). Python is the same, since it expects programs as _text_. The universal newline support is totally another matter. And it applies to files only (it's a wrapper around file functions). It's a tool for lazy programmers. It is possible that in the feature the Python lexxer is changed to recognize different line endings (since there's no valid case for a \r at the end of a line that I can think of). But requiring the input being text is not 'bizarre' at all. The issue about text representation affects _any_ application. Treating text as binary data is plain wrong, IMHO, and will always lead to problems. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 08:09:47AM +0200, Sim Zacks wrote: I don't think that this type of solution should be discussed as an official patch. If it was, I would recommend solving the problem in source code when the function is passed to the translator. That way each platform could fix the code to work with as is needed and the code would be portable. I ran into this problem about half a year ago and it didn't go anywhere, then I saw a long discussion about it that also didn't go anywhere. I had given up on using plpython until now because I actually need it. I'd like to insert one note here. While I'm not particularly familiar with either perl or python, when I encountered the referred-to thread regarding Windows/Unix newline incompatibilities in Python, I ran some tests on my Linux system. I wrote some very simple scripts in Perl and Python and tested them. They ran OK when I saved them in native Linux newlines, they ran, but if I saved them in DOS mode (vim), neither the Python nor the Perl scripts would run. I noted that some of the writers in that thread were critical of Python for not taking care of the conversions, and it may be that the Perl interfaces in postgresql may take care of this, I have not tried this, but it seems that my tests, if valid, indicate that this problem is not restricted to Python alone as far as the interpreters are concerned. This observation may have nothing to do with the pl/Perl and pl/Python interfaces, but I thought I'd pass this along FWIW. So I figured out how to make it work and thought that it would be helpful to others, but I couldn't figure out how to automate the fix. Tom Lane [EMAIL PROTECTED] wrote in message I don't know Python at all, so I don't know how complicated its lexical structure is, but ISTM you'd at least need enough smarts to distinguish literals from unquoted whitespace. The other small fly in the ointment is that when the server is running on Windows, I suppose we would have to *put in* rather than remove CRs. Sim probably doesn't care about that case, but we couldn't accept an official patch that doesn't handle it. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] plpython function problem workaround
Sim Zacks [EMAIL PROTECTED] writes: I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. Does python actually disallow newlines in string literals? That is x = 'foo bar' Whether you think this is good style is not the question --- is it allowed by the language? regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 09:46:54AM -0500, Tom Lane wrote: Sim Zacks [EMAIL PROTECTED] writes: I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. Does python actually disallow newlines in string literals? That is x = 'foo bar' Whether you think this is good style is not the question --- is it allowed by the language? You can do x = this has newlines embedded -- Alvaro Herrera ([EMAIL PROTECTED]) Linux transformó mi computadora, de una `máquina para hacer cosas', en un aparato realmente entretenido, sobre el cual cada día aprendo algo nuevo (Jaime Salinas) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] plpython function problem workaround
On Tue, 2005-03-15 at 07:33 -0600, David wrote: [about the line-termination problem in plpython] I'd like to insert one note here. While I'm not particularly familiar with either perl or python, when I encountered the referred-to thread regarding Windows/Unix newline incompatibilities in Python, I ran some tests on my Linux system. I wrote some very simple scripts in Perl and Python and tested them. They ran OK when I saved them in native Linux newlines, they ran, but if I saved them in DOS mode (vim), neither the Python nor the Perl scripts would run. actually, perl scripts with \r\n line endings will run just fine in unix/linux. what you might have been experiencing, is the fact that the unix shell is expecting \n lineendings, and the #! line could have been failing to run the perl executable. gnari ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] plpython function problem workaround
On Tue, 15 Mar 2005, Tom Lane wrote: Sim Zacks [EMAIL PROTECTED] writes: I've been looking at the possibility of having a planned CR in the source code and I don't see a case where it would happen. Does python actually disallow newlines in string literals? That is x = 'foo bar' Whether you think this is good style is not the question --- is it allowed by the language? You can with triple-quoting and by escaping it with backslash. The following code, admitedly ugly, is valid python: a = 'a\ bc' print a b = '''a bc''' print b and produces: abc a bc as output. \newline in any non raw literal is allowed and ignored, while a bare newline in a triple-quoted string literal is allowed and retained. Moreover, this is not an execise of bad style only. It's customary to write docstrings as multiline triple-quoted string literals: def afunction(a, b, c): This is a function. Its arguments are: a - first argument b - second argument c - third argument. It does ans returns nothing. pass It's more or less the recommended way to document a function (or class or module or whatever). See PEP 257 for more examples: http://www.python.org/peps/pep-0257.html So, to answer to your question, newlines are more than allowed in string literals. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 01:40:23PM +0100, Marco Colombo wrote: On Mon, 14 Mar 2005, Michael Fuhr wrote: Hmmm...I think that would be inconsistent with previous reports. For example, in the following message, the poster said that everything (PostgreSQL, pgAdmin) was running on Windows 2003: http://archives.postgresql.org/pgsql-interfaces/2005-02/msg00066.php I'm sorry, he's wrong. Wrong about what? He reported that he was having the same problem and that both client and server were running on Windows 2003. Here's his first message: http://archives.postgresql.org/pgsql-interfaces/2005-02/msg00063.php The initial report was by Hong Yuan: http://archives.postgresql.org/pgsql-general/2005-01/msg00792.php That was a different thread. Same problem, but an earlier thread that Michele apparently didn't know about until I mentioned it. later he clarified: http://archives.postgresql.org/pgsql-general/2005-01/msg00858.php I am using pgAdmin III Version 1.2.0 under Chinese Windows XP, while the database is 7.4.6 under Linux. A problem with Windows = Linux doesn't preclude the same problem from happening with Windows = Windows. At issue is that pgAdmin on Windows apparently adds carriage returns, and whether Python on any platform doesn't like that (that's what we're still trying to determine). BTW I just noticed someone else provided a simpler example: http://archives.postgresql.org/pgsql-general/2005-01/msg00876.php That somebody was me. I have no idea of where Michele Bendazzoli ran that code. He's not the original poster, tho. He was the original poster of *his* thread, the one with a subject of plpythonu strange syntax error. He wasn't the first person to report the problem, but his first message didn't reference any previous messages. http://archives.postgresql.org/pgsql-interfaces/2005-02/msg00063.php I'll postpone commenting on the rest until we find out how the example programs run on Windows. If nobody follows up here then maybe I'll wander over to comp.lang.python. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
On Tue, 15 Mar 2005, Michael Fuhr wrote: [...] That somebody was me. Ok, sorry. I'll postpone commenting on the rest until we find out how the example programs run on Windows. If nobody follows up here then maybe I'll wander over to comp.lang.python. Yeah, there's no point in discussing until we have some real world data. I can't compile on windows, so I'll have to wait someone else to do that. I'm basing my opinions on Python documentation only. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 06:03:01PM +0100, Marco Colombo wrote: On Tue, 15 Mar 2005, Michael Fuhr wrote: I'll postpone commenting on the rest until we find out how the example programs run on Windows. If nobody follows up here then maybe I'll wander over to comp.lang.python. Yeah, there's no point in discussing until we have some real world data. I can't compile on windows, so I'll have to wait someone else to do that. I'm basing my opinions on Python documentation only. I've been looking through the Python source code (2.4.1c1) and I've found several places that use only \n in embedded code. One is Modules/main.c, which says it's the Python interpreter main program. The Py_Main() function is processing command-line options and does the following: if (c == 'c') { /* -c is the last option; following arguments that look like options are left for the command to interpret. */ command = malloc(strlen(_PyOS_optarg) + 2); if (command == NULL) Py_FatalError( not enough memory to copy -c argument); strcpy(command, _PyOS_optarg); strcat(command, \n); break; } Later, without further changes to the command variable, it does this: if (command) { sts = PyRun_SimpleStringFlags(command, cf) != 0; free(command); Modules/cPickle.c has additional examples: if (!( r=PyRun_String( def __init__(self, *args): self.args=args\n\n def __str__(self):\n return self.args and ('%s' % self.args[0]) or '(what)'\n, Py_file_input, module_dict, t) )) return -1; and if (!( r=PyRun_String( def __init__(self, *args): self.args=args\n\n def __str__(self):\n a=self.args\n a=a and type(a[0]) or '(what)'\n return 'Cannot pickle %s objects' % a\n , Py_file_input, module_dict, t) )) return -1; The code in Demo/embed/demo.c uses only \n to terminate its lines: PyRun_SimpleString(import sys\n); PyRun_SimpleString(print sys.builtin_module_names\n); PyRun_SimpleString(print sys.modules.keys()\n); PyRun_SimpleString(print sys.executable\n); PyRun_SimpleString(print sys.argv\n); If these examples are intended to run on Windows, then presumably using \n works on that platform. That doesn't necessarily preclude \r\n from working as well, but apparently a platform-specific sequence isn't required. I'd still be interested in having somebody run the test programs we've both posted to find out for sure. Is anybody who's following this thread running a PostgreSQL server with PL/Python on Windows? We could use a test platform to answer some questions, since the PostgreSQL aspect of this discussion is about problems with PL/Python. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 03:41:37PM +, Ragnar Hafstað wrote: actually, perl scripts with \r\n line endings will run just fine in unix/linux. Indeed, and PL/Perl doesn't care. I just tested several PLs with PostgreSQL 8.0.1 on Solaris 9 and here are the results: PL/pgSQL CRLF ok PL/PerlCRLF ok PL/RubyCRLF ok PL/Tcl CRLF ok PL/Python CRLF fails PL/R CRLF fails Details: CREATE FUNCTION test_pgsql() RETURNS integer AS 'DECLARE x integer;\r\nBEGIN\r\nx := 123;\r\nRETURN x;\r\nEND;\r\n' LANGUAGE plpgsql; CREATE FUNCTION test_perl() RETURNS integer AS '$x = 123;\r\nreturn $x;\r\n' LANGUAGE plperl; CREATE FUNCTION test_ruby() RETURNS integer AS 'x = 123\r\nreturn x\r\n' LANGUAGE plruby; CREATE FUNCTION test_tcl() RETURNS integer AS 'set x 123\r\nreturn $x\r\n' LANGUAGE pltcl; CREATE FUNCTION test_python() RETURNS integer AS 'x = 123\r\nreturn x\r\n' LANGUAGE plpythonu; CREATE FUNCTION test_r() RETURNS integer AS 'x - 123\r\nreturn(x)\r\n' LANGUAGE plr; SELECT test_pgsql(); test_pgsql 123 (1 row) SELECT test_perl(); test_perl --- 123 (1 row) SELECT test_ruby(); test_ruby --- 123 (1 row) SELECT test_tcl(); test_tcl -- 123 (1 row) SELECT test_python(); ERROR: plpython: could not compile function test_python DETAIL: exceptions.SyntaxError: invalid syntax (line 2) SELECT test_r(); ERROR: R interpreter parse error DETAIL: R parse error caught in PLR36865 - function() {x - 123 return(x) }. CONTEXT: In PL/R function test_r If I remove the CRs from the Python and R functions then they work: CREATE OR REPLACE FUNCTION test_python() RETURNS integer AS 'x = 123\nreturn x\n' LANGUAGE plpythonu; CREATE OR REPLACE FUNCTION test_r() RETURNS integer AS 'x - 123\nreturn(x)\n' LANGUAGE plr; SELECT test_python(); test_python - 123 (1 row) SELECT test_r(); test_r 123 (1 row) -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] plpython function problem workaround
On Tue, Mar 15, 2005 at 06:03:01PM +0100, Marco Colombo wrote: On Tue, 15 Mar 2005, Michael Fuhr wrote: I'll postpone commenting on the rest until we find out how the example programs run on Windows. If nobody follows up here then maybe I'll wander over to comp.lang.python. Yeah, there's no point in discussing until we have some real world data. I can't compile on windows, so I'll have to wait someone else to do that. I'm basing my opinions on Python documentation only. Paul Moore has run some tests on a Windows server. I don't see his message in the archives yet, but when it shows up it should be a followup to my request for his help (posted in another thread): http://archives.postgresql.org/pgsql-general/2005-03/msg00717.php Paul's tests on Windows show the same results as tests on *nix, viz., that lines in embedded Python need to end in LF, not CRLF. Thanks to Paul for running the tests. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] plpython function problem workaround
On Mon, Mar 14, 2005 at 02:37:00PM +0200, Sim Zacks wrote: I worked around the plpython problem that doesn't allow scripts created on Windows to be run on the *nix server with the following statement. update pg_proc set prosrc=replace(prosrc,chr(13),'') where prolang=87238 --plpythonu's oid in my setup is 87238. I don't know if that is a standard or just on mine. The oid is arbitrary, so you should get it via a (sub)query instead of hardcoding it. Is there a way to automate that script every time a plpythonu function is created? I tried writing a trigger on the pg_proc table but it wouldn't let me: Hmmm...plpythonu doesn't install a VALIDATOR function. I wonder if you could exploit that? This is just a brainstorm, but the following worked for me in trivial tests: CREATE FUNCTION fixpython(funcoid oid) RETURNS void AS $$ BEGIN UPDATE pg_proc SET prosrc = replace(prosrc, chr(13), '') WHERE oid = funcoid; RETURN; END; $$ LANGUAGE plpgsql VOLATILE STRICT; UPDATE pg_language SET lanvalidator = 'fixpython'::regproc WHERE lanname = 'plpythonu'; Are there any problems with doing this? Is a VALIDATOR function permitted to modify the function it's validating? This wouldn't work if plpythonu ever installs a VALIDATOR, but you might be able to use it until such time (barring objections about why it's a Bad Idea, that is). -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] plpython function problem workaround
Thank You Michael, It worked when I tried it on the 3 functions that I have. I checked the archives for any discussion on the Validator and there wasn't anything specifically discussing it. I really didn't find a lot of information about the Validators either, for example where did you see that a validator function gets the language oid passed in? The only place I could find it was after I looked at the other languages installed on my system and saw that they all had the same argument type passed in. Sim Michael Fuhr [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] On Mon, Mar 14, 2005 at 02:37:00PM +0200, Sim Zacks wrote: I worked around the plpython problem that doesn't allow scripts created on Windows to be run on the *nix server with the following statement. update pg_proc set prosrc=replace(prosrc,chr(13),'') where prolang=87238 --plpythonu's oid in my setup is 87238. I don't know if that is a standard or just on mine. The oid is arbitrary, so you should get it via a (sub)query instead of hardcoding it. Is there a way to automate that script every time a plpythonu function is created? I tried writing a trigger on the pg_proc table but it wouldn't let me: Hmmm...plpythonu doesn't install a VALIDATOR function. I wonder if you could exploit that? This is just a brainstorm, but the following worked for me in trivial tests: CREATE FUNCTION fixpython(funcoid oid) RETURNS void AS $$ BEGIN UPDATE pg_proc SET prosrc = replace(prosrc, chr(13), '') WHERE oid = funcoid; RETURN; END; $$ LANGUAGE plpgsql VOLATILE STRICT; UPDATE pg_language SET lanvalidator = 'fixpython'::regproc WHERE lanname = 'plpythonu'; Are there any problems with doing this? Is a VALIDATOR function permitted to modify the function it's validating? This wouldn't work if plpythonu ever installs a VALIDATOR, but you might be able to use it until such time (barring objections about why it's a Bad Idea, that is). -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
On Mon, Mar 14, 2005 at 04:08:09PM +0200, Sim Zacks wrote: I checked the archives for any discussion on the Validator and there wasn't anything specifically discussing it. I really didn't find a lot of information about the Validators either, for example where did you see that a validator function gets the language oid passed in? The validator is passed the function's OID, not the language's OID. See the documentation for CREATE LANGUAGE, as well as the Procedural Languages chapter: http://www.postgresql.org/docs/8.0/interactive/sql-createlanguage.html http://www.postgresql.org/docs/8.0/interactive/xplang.html Note that my idea to use the validator function was just a brainstorm, not thoroughly tested or thought out. Maybe one of the developers will comment about the wisdom of (ab)using it the way I suggested. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] plpython function problem workaround
Michael Fuhr [EMAIL PROTECTED] writes: Are there any problems with doing this? Is a VALIDATOR function permitted to modify the function it's validating? Well, it's not *supposed* to, but the tuple is already stored so I guess an UPDATE on it will work. At the moment. This wouldn't work if plpythonu ever installs a VALIDATOR, but you might be able to use it until such time (barring objections about why it's a Bad Idea, that is). The proposed hack seems far too simplistic to me ... what of CRs that are deliberately included in string literals? I don't know Python at all, so I don't know how complicated its lexical structure is, but ISTM you'd at least need enough smarts to distinguish literals from unquoted whitespace. The other small fly in the ointment is that when the server is running on Windows, I suppose we would have to *put in* rather than remove CRs. Sim probably doesn't care about that case, but we couldn't accept an official patch that doesn't handle it. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
On Mon, Mar 14, 2005 at 10:54:22AM -0500, Tom Lane wrote: The proposed hack seems far too simplistic to me ... what of CRs that are deliberately included in string literals? Yeah, I meant to mention that; I think it's been brought up before in related threads. The programmer would need to be aware of that issue and allow for it. It doesn't seem to be a problem if you use \r escape sequences in a dollar-quoted function body or if you use \\r in single quotes -- is there any case where those sequences might get converted to literal CRs? Third-party software like user interfaces might be problematic, but what about PostgreSQL itself and its official tools (psql, pg_dump, etc.)? I don't know Python at all, so I don't know how complicated its lexical structure is, but ISTM you'd at least need enough smarts to distinguish literals from unquoted whitespace. The example I posted was merely that: an example. I was more interested in whether abusing the validator mechanism would work or if it might have subtle problems. A function that understands the Python grammar is left as an exercise for the reader. The other small fly in the ointment is that when the server is running on Windows, I suppose we would have to *put in* rather than remove CRs. Would we? My understanding is that code passed to PyRun_String() and friends must be free of line-ending CRs on all platforms, and that the code that reads a normal Python script takes care of that (i.e., normalizes line endings to be LF only). Can anybody confirm or deny? Sim probably doesn't care about that case, but we couldn't accept an official patch that doesn't handle it. I wasn't proposing a patch, at least not yet. Just throwing out an idea that somebody might be able to build on. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] plpython function problem workaround
On Mon, 14 Mar 2005, Michael Fuhr wrote: Would we? My understanding is that code passed to PyRun_String() and friends must be free of line-ending CRs on all platforms, and that the code that reads a normal Python script takes care of that (i.e., normalizes line endings to be LF only). Can anybody confirm or deny? I'm not sure of that. I suspect you'll need to pass CRs on windows. If anyone manages to compile the following code on Windows... #include Python.h void run_program(const char *program) { PyObject *ret, *globals, *locals; printf( running:\n%s\n, program); globals = PyDict_New(); locals = PyDict_New(); ret = PyRun_String(program, Py_file_input, globals, locals); if (ret) { Py_DECREF(ret); printf(\n); } else { PyErr_Print(); } Py_DECREF(locals); Py_DECREF(globals); printf( end\n\n); } int main(int argc, char *argv[]) { const char *program1 = print 1\nprint 2\n; const char *program2 = print 1\r\nprint 2\r\n; Py_Initialize(); printf( Initialized.\n); printf( Python %s\n, Py_GetVersion()); run_program(program1); run_program(program2); Py_Finalize(); printf( Finalized.\n); } On my Fedora Core 2, I need to complile it with the following command: gcc -I/usr/include/python2.3 -L/usr/lib/python2.3/config py-test.c -o py-test\ -lpython2.3 -ldl -lm -lpthread -lutil This is my first attempt to embed python, so I may be missing something... On Linux, you get: $ ./py-test 21 | cat -v Initialized. Python 2.3.3 (#1, May 7 2004, 10:31:40) [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] running: print 1 print 2 1 2 end running: print 1^M print 2^M File string, line 1 print 1^M ^ SyntaxError: invalid syntax end Finalized. I bet on windows the first program fails and the second is ok. .TM. -- / / / / / / Marco Colombo ___/ ___ / / Technical Manager / / / ESI s.r.l. _/ _/ _/ [EMAIL PROTECTED] ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [GENERAL] plpython function problem workaround
On Mon, Mar 14, 2005 at 08:14:42PM +0100, Marco Colombo wrote: On Mon, 14 Mar 2005, Michael Fuhr wrote: Would we? My understanding is that code passed to PyRun_String() and friends must be free of line-ending CRs on all platforms, and that the code that reads a normal Python script takes care of that (i.e., normalizes line endings to be LF only). Can anybody confirm or deny? I'm not sure of that. I suspect you'll need to pass CRs on windows. Hmmm...I think that would be inconsistent with previous reports. For example, in the following message, the poster said that everything (PostgreSQL, pgAdmin) was running on Windows 2003: http://archives.postgresql.org/pgsql-interfaces/2005-02/msg00066.php I suggested that he strip the CRs from pg_proc.prosrc and he said it worked: http://archives.postgresql.org/pgsql-interfaces/2005-03/msg00014.php It's not clear that the test in the second message was run on a Windows server (apparently pgAdmin was run on a Windows client), but I think the beginning of the message is saying that he didn't reply for so long because he didn't have access to a Windows server. From that I infer that he tried my suggestion and posted the results when he finally did get access to a customer's Windows server. I could be misreading that, however. A couple of months ago Stuart Bishop brought this issue up in python-dev. Most of the thread is along the lines of strip the carriage returns: http://mail.python.org/pipermail/python-dev/2005-January/051203.html If anyone manages to compile the following code on Windows... ... I bet on windows the first program fails and the second is ok. Hopefully somebody will do a Windows test of the code you posted. I'd find it bizarre that the Python code embedded in a C program had to care whether it was running on *nix or Windows. -- Michael Fuhr http://www.fuhr.org/~mfuhr/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] plpython function problem workaround
I don't think that this type of solution should be discussed as an official patch. If it was, I would recommend solving the problem in source code when the function is passed to the translator. That way each platform could fix the code to work with as is needed and the code would be portable. I ran into this problem about half a year ago and it didn't go anywhere, then I saw a long discussion about it that also didn't go anywhere. I had given up on using plpython until now because I actually need it. So I figured out how to make it work and thought that it would be helpful to others, but I couldn't figure out how to automate the fix. Hopefully, aside from the issue of having CRs inside the actual program, using the validator function/updating pg_proc directly won't cause any harmful ramifications. I don't have access to any other platforms right now, aside from Windows client connected to Linux server, so I really couldn't reliably test any other situation. Sim Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Michael Fuhr [EMAIL PROTECTED] writes: Are there any problems with doing this? Is a VALIDATOR function permitted to modify the function it's validating? Well, it's not *supposed* to, but the tuple is already stored so I guess an UPDATE on it will work. At the moment. This wouldn't work if plpythonu ever installs a VALIDATOR, but you might be able to use it until such time (barring objections about why it's a Bad Idea, that is). The proposed hack seems far too simplistic to me ... what of CRs that are deliberately included in string literals? I don't know Python at all, so I don't know how complicated its lexical structure is, but ISTM you'd at least need enough smarts to distinguish literals from unquoted whitespace. The other small fly in the ointment is that when the server is running on Windows, I suppose we would have to *put in* rather than remove CRs. Sim probably doesn't care about that case, but we couldn't accept an official patch that doesn't handle it. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] plpython function problem workaround
Sim Zacks [EMAIL PROTECTED] writes: I don't think that this type of solution should be discussed as an official patch. If it was, I would recommend solving the problem in source code when the function is passed to the translator. Indeed, but first we need to work out what the necessary translation is. A hack using a VALIDATOR function isn't an unreasonable way to prototype the conversion logic. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]