Re: CGI Tutorial

2006-10-05 Thread and-google
Clodoaldo Pinto Neto wrote:

> print 'The submited name was "' + name + '"'

Bzzt! Script injection security hole. See cgi.escape and use it (or a
similar function) for *all* text -> HTML output.
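
A minimal sketch of the fix, assuming a current Python 3 where html.escape has taken over the job of cgi.escape (the function name render_name is made up for illustration):

```python
import html

def render_name(name):
    # Escape &, <, > and both quote characters so user input cannot inject markup.
    return 'The submitted name was "' + html.escape(name, quote=True) + '"'

print(render_name('<script>alert(1)</script>'))
```

The same call belongs on every value that travels from user input into HTML, not only on the ones that look dangerous.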

> open('files/' + fileitem.filename, 'w')

BZZT. filesystem overwriting security hole, possibly escalatable to
code execution. clue: fileitem.filename= '../../something.py'
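
To see why, here is a rough sketch of what that filename does to the path, plus one possible containment check. The helper is_inside is hypothetical, and it assumes Unix-style paths and ignores symlinks:

```python
import posixpath

def is_inside(base_dir, filename):
    # Resolve '..' segments, then check the result is still under base_dir.
    target = posixpath.normpath(posixpath.join(base_dir, filename))
    return target.startswith(base_dir.rstrip('/') + '/')

# The submitted name climbs out of the upload directory entirely.
print(posixpath.normpath(posixpath.join('files', '../../something.py')))
print(is_inside('files', '../../something.py'))
print(is_inside('files', 'photo.jpg'))
```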

> sid = cookie['sid'].value
> session = shelve.open('/tmp/.session/sess_' + sid

Bad filename use allows choice of non-session files, opening with
shelve allows all sorts of pickle weirdnesses. Just use strings.
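
One way to close the filename half of this, sketched under the assumption that session IDs are 32 lowercase hex digits (the pattern and the helper name session_path are mine, not the tutorial's):

```python
import re

# Assumed token shape: exactly 32 lowercase hex digits.
SESSION_ID = re.compile(r'[0-9a-f]{32}')

def session_path(sid, directory='/tmp/.session'):
    # Reject anything that is not exactly the expected token shape,
    # so the cookie can never name a file outside the session store.
    if not SESSION_ID.fullmatch(sid):
        raise ValueError('invalid session id')
    return directory + '/sess_' + sid
```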

> p = sub.Popen(str_command,

o.O

Sure this stuff may not matter for Hello World on a test server, but if
you're writing a tutorial you should ensure newbies know the Right Way
to do it from the start. The proliferation of security-oblivious PHP
tutorials is directly responsible for the disastrous amount of
script-injection- and SQL-injection-vulnerable webapps out there -
let's not have the same for Python.

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A critique of cgi.escape

2006-09-25 Thread and-google
Jon Ribbens wrote:

> I'm sorry, that's not good enough. How, precisely, would it break
> "existing code"?

('owdo Mr. Ribbens!)

It's possible there could be software that relies on ' not being
escaped, for example:

# Auto-markup links to O'Reilly, everyone's favourite
# example name with an apostrophe in it
#
URI= 'http://www.oreilly.com/'
html= cgi.escape(text)
html= html.replace('O\'Reilly', '<a href="%s">O\'Reilly</a>' % URI)

Sure this may be rare, but it's what the documentation says, and
changing it may not only fix things but also subtly break things in
ways that are hard to detect.
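
A sketch of the breakage, using Python 3's html.escape as a stand-in: its quote flag switches apostrophe escaping on and off, which is the behaviour change under discussion. The sample text and the autolink helper are invented, and the replacement is assumed to wrap the name in an anchor tag:

```python
import html

URI = 'http://www.oreilly.com/'
LINK = '<a href="%s">O\'Reilly</a>' % URI
text = "Books by O'Reilly & friends"

def autolink(escaped):
    # Relies on the apostrophe surviving the escaping step untouched.
    return escaped.replace("O'Reilly", LINK)

unquoted = autolink(html.escape(text, quote=False))  # apostrophe left alone
quoted = autolink(html.escape(text, quote=True))     # apostrophe becomes &#x27;

print('<a href' in unquoted)
print('<a href' in quoted)
```

No exception is raised in the second case; the link simply stops appearing, which is the hard-to-detect kind of failure.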

A similar change to str.encode('unicode-escape') in Python 2.5 caused a
number of similar subtle problems. (In this case the old documentation
was a bit woolly so didn't prescribe the exact older behaviour.)

I'm not saying that the cgi.escape interface is *good*, just that it's
too late to change it.

I personally think the entire function should be deprecated, firstly
because it's insufficient in some corner cases (apostrophes as you
pointed out, and XHTML CDATA), and secondly because it's in the wrong
place: HTML-escaping is nothing to do with the CGI interface. A good
template library should deal with escaping more smoothly and correctly
than cgi.escape. (It may be able to deal with escape-or-not-bother and
character encoding issues automatically, for example.)

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: getting POST vars from BaseHTTPRequestHandler

2006-06-27 Thread and-google
Christopher J. Bottaro wrote:

> When I make a post, it just hangs (in self.rfile.read()).

I don't know about BaseHTTPRequestHandler in particular, but in general
you don't want to call an unlimited read() on an HTTP request - it will
try to read the entire incoming stream, up until the stream is ended by
the client dropping the connection (by which point it's too late to
send a response).

Instead you'll normally want to read the request's Content-Length
header (int(os.environ['CONTENT_LENGTH']) under CGI) and read(that
many) bytes.
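
Roughly like this, as a sketch: read_body is a made-up helper, and under real CGI you would pass it os.environ and sys.stdin.buffer.

```python
import io

def read_body(environ, stream):
    # Content-Length may be missing or empty; treat that as "no body".
    length = int(environ.get('CONTENT_LENGTH') or 0)
    # Read exactly that many bytes, never an open-ended read().
    return stream.read(length) if length > 0 else b''

print(read_body({'CONTENT_LENGTH': '5'}, io.BytesIO(b'hello world')))
```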

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Having problems with strings in HTML

2006-06-27 Thread and-google
Sion Arrowsmith wrote:

> I've never encountred a browser getting tripped up by it. I suppose you
> might need it if you've got parameters called quot or nbsp

There are many more entities than you can comfortably remember, and
browsers can interpret anything starting with one as being an entity
reference, hence all the problems with parameters like 'section'
(&section -> §ion). Plus of course there's nothing stopping future browsers
supporting more entities, breaking your apps.

Just write &amp;. There's no reason not to (except ignorance). The fact
that so much of the web is written with broken HTML is not an argument
for doing the same.
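
If the links are generated from Python, the escaping costs one call. A small sketch using Python 3's html.escape (the link helper and the example URL are invented):

```python
import html

def link(url, label):
    # Escape the URL as attribute text and the label as element text.
    return '<a href="%s">%s</a>' % (html.escape(url, quote=True), html.escape(label))

print(link('/doc?chapter=2&section=3', 'Section 3'))
```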

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: new python icons for windows

2006-06-21 Thread and-google
Istvan Albert wrote:

> But these new icons are too large, too blocky and too pastel.

Hooray! Glad to see *someone* doesn't like 'em, I'll expect a few more
when b1 hits. :-)

Although I can't really see 'large', 'blocky' or 'pastel'... they're
the same size and shape as other Windows document icons, and I
personally find the Python logo colours quite striking. If it's the
new-fangled shadey gradienty kind of nonsense you don't like, you could
also try the low-colour versions. eg. ICOs compiled with only 16-colour
and 16/32 sizes:

  http://doxdesk.com/file/software/py/pyicons-tiny.zip

> For example it resembles the icon for text files.

This is intentional: to make it obvious that .py files are the
readable, editable scripts, contrasting with .pyc's binary gunk -
something that wasn't 100% clear before. With the obviousness of the
Python-plus and the strong difference between the white and black base
document icons, squinting shouldn't really be necessary IMO.

> can someone point me to a page/link that contains the old icons?

Sure,

  http://svn.python.org/view/python/branches/release24-maint/PC/py.ico
  http://svn.python.org/view/python/branches/release24-maint/PC/pyc.ico
  http://svn.python.org/view/python/branches/release24-maint/PC/pycon.ico

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Dr. Dobb's Python-URL! - weekly Python news and links (Jun 12)

2006-06-13 Thread and-google
John Salerno wrote:

> I love the new 'folder' icon, but how can I access it as an icon?

I've just given these a proper home, so here:

  http://doxdesk.com/software/py/pyicons.html

cheers!

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: CGI redirection: let us discuss it further

2006-03-28 Thread and-google
Sullivan WxPyQtKinter wrote:

> 1. Are there any method (in python of course) to redirect to a web page
> without causing a "Back" button trap... rather than the redirection page
> with a "Location: url" head

What's wrong with the redirection page?

If there's really a necessary reason for not using an HTTP redirect
(for example, needing to set a cookie, which doesn't work cross-browser
on redirects), the best bet is a page containing a plain link and

Re: Uploading files from IE

2006-03-23 Thread and-google
AB wrote:

> I tried the following with the same result:
> myName = ulImage.filename
> newFile = file (os.path.join(upload_dir, os.path.basename(myName)), 'wb')

os.path is different on your system to the uploader's system. You are
using Unix pathnames, with a '/' separator - they are using Windows
ones, with '\', so os.path.basename won't recognise them as separators.
Old-school-Macintosh and RISC OS machines have different path
separators again.
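
You can see the difference without leaving your own machine, because the standard library ships both flavours of os.path under their real names (the example path is invented):

```python
import ntpath
import posixpath

uploaded = 'C:\\Documents\\photo.jpg'

# Unix rules: backslash is an ordinary character, so nothing is stripped.
print(posixpath.basename(uploaded))

# Windows rules: both '\' and '/' count as separators.
print(ntpath.basename(uploaded))
```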

The Content-Disposition filename parameter can be set by the user-agent
to *anything at all*. Using it without some serious sanitising
beforehand is a recipe for security holes. In your original code an
attacker could have arbitrarily written to any file the web user had
access to. The code with os.path.basename is better but could still be
confused by things like an empty string, '.', '..' or invalid
characters.

It's best not to use any user-submitted data as the basis for
filenames. If you absolutely *must* use Content-Disposition as a local
filename you must send it through some strict checking first, whether
the browser sends full paths to you or not.
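
As a sketch of what "strict checking" might look like: the allow-list pattern, the 64-character limit and the name safe_filename are all assumptions of mine, and you may well want tighter rules (for instance a fixed set of extensions).

```python
import re

# Allow-list: starts with a letter or digit, then letters, digits, '.', '_' or '-'.
SAFE_NAME = re.compile(r'[A-Za-z0-9][A-Za-z0-9._-]{0,63}')

def safe_filename(submitted):
    # Take the last component whichever separator the client used.
    name = re.split(r'[\\/]', submitted)[-1]
    # Empty strings, '.', '..' and odd characters all fail the allow-list.
    if not SAFE_NAME.fullmatch(name):
        raise ValueError('unacceptable filename')
    return name

print(safe_filename('C:\\pics\\photo.jpg'))
```

Rejecting outright rather than trying to repair a bad name keeps the rules easy to reason about.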

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: New-style Python icons

2006-03-21 Thread and-google
Fredrik Lundh wrote:

> could you perhaps add an SVG version ?

Yes. I'll look at converting when I've used them a bit and am happy
with them. I think some of the higher-level Xara effects may not
convert easily to SVG but I'm sure there'll be workarounds of some
sort.

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: New-style Python icons

2006-03-21 Thread and-google
Luis M. González wrote:

> This is strange... I've been trying to access this site since
> yesterday, but I couldn't

Might it be possible you have malware installed? Since I do a bunch of
anti-spyware work, there are a few different bits of malware that try
to block doxdesk.com, usually using a Hosts file hijack.

Try it with the IP address 64.251.25.168 instead - if that works you
should probably investigate your Hosts file and/or look at spyware
removers.

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: New-style Python icons

2006-03-20 Thread and-google
Michael Tobis wrote:

> Besides the pleasant colors what do you like about it?

I like that whilst being solid and easily recognisable, it isn't
clever-clever.

I had personally been idly doodling some kind of swooshy thing before,
with a snake's head forming a P and its forked tongue a Y coming out of
it, but in retrospect it was just trying too hard. The plus-tadpoles'
simplicity appeals to me.

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: why isn't Unicode the default encoding?

2006-03-20 Thread and-google
John Salerno wrote:

> So as it turns out, Unicode and UTF-8 are not the same thing?

Well yes. UTF-8 is one scheme in which the whole Unicode character
repertoire can be represented as bytes.

Confusion arises because Windows uses the name 'Unicode' in character
encoding lists, to mean UTF-16_LE, which is another encoding that can
store the whole Unicode character repertoire as bytes. However
UTF-16_LE is not any more definitively 'Unicode' than UTF-8 is.

Further confusion arises because the encoding 'UTF-16' can actually
mean two things that are deceptively different:

  - Unicode characters stored natively in 16-bit units (using two
    UTF-16 code units, a surrogate pair, to represent characters
    outside of the Basic Multilingual Plane)

  - Either of the 8-bit encodings UTF-16_LE and UTF-16_BE, detected
    automatically using a Byte Order Mark when loaded, or chosen
    arbitrarily when saving

Yet more confusion arises because UTF-32 (which can reference any
Unicode character directly) has the same problem. And though
wide-unicode builds of Python understand the first meaning (unicode()
strings are stored natively as UTF-32), they don't support the 8-bit
encodings UTF-32_LE and UTF-32_BE. Phew!
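
The byte-level difference between the UTF-16 flavours can be poked at directly. This sketch assumes a current Python 3, where str is always full Unicode and the narrow/wide build distinction no longer exists:

```python
import codecs

text = 'A\U00010000'  # one BMP character plus one outside the BMP

le = text.encode('utf-16-le')  # explicit byte order, no BOM
be = text.encode('utf-16-be')
print(len(le))  # 'A' takes 2 bytes, the non-BMP character a 4-byte surrogate pair

# Plain 'utf-16' prepends a Byte Order Mark and picks the platform's byte order.
with_bom = text.encode('utf-16')
print(with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
print(with_bom.decode('utf-16') == text)
```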

To summarise: confusion.

> Am I right to say that UTF-8 stores the first 128 Unicode code points
> in a single byte, and then stores higher code points in however many
> bytes they may need?

That is correct.
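
A quick way to check this for yourself, sketched for Python 3 (the sample characters and the helper utf8_len are my own choice):

```python
def utf8_len(ch):
    # Number of bytes UTF-8 needs for a single character.
    return len(ch.encode('utf-8'))

# ASCII letter, Latin-1 letter, euro sign, first character beyond the BMP.
for ch in ('A', '\u00fc', '\u20ac', '\U00010000'):
    print('U+%04X' % ord(ch), utf8_len(ch))
```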

To answer the original question, we're always going to need byte
strings. They're a fundamental part of computing and the need to
process them isn't going to go away. However as Unicode text
manipulation becomes a more common event than byte string processing,
it makes sense to change the default kind of string you get when you
type a literal.

Personally I would like to see byte strings available under an easy
syntax like b'...' and UTF-32 strings available as w'...', or something
like that - currently having u'...' mean either UTF-16 or UTF-32
depending on compile-time options is very very annoying to the few
kinds of programs that really do need to know the difference. But
whatever is chosen, it's all tasty Python 3000 future-soup and not
worth worrying about for the moment.

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: New-style Python icons

2006-03-20 Thread and-google
Scott David Daniels wrote:

> Maybe you could change the ink color to better distinguish
> the pycon and pyc icons.

Yeah, might do that... I'm thinking I might flip the pycon icon so that
the Windows shortcut badge doesn't obscure the Python logo, too. Maybe.

I'll let them stew on my desktop for a bit first though...

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



New-style Python icons

2006-03-20 Thread and-google
Personally, I *like* the new website look, and I'm glad to see Python
having a proper logo at last!

I've taken the opportunity to knock up some icons using it, finally
banishing the poor old standard-VGA-palette snake from my desktop. If
you like, you can grab them from:

  http://www.doxdesk.com/img/software/py/icons.zip

in .ICO format for Windows - containing all resolutions/depths up to
and including Windows Vista's crazy new enormo-icons. Also contains the
vector graphics source file in Xara format. You can also see a preview
here:

  http://www.doxdesk.com/img/software/py/icons.png

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Pure python implementation of string-like class

2006-02-25 Thread and-google
Akihiro KAYAMA wrote:
> As the character set is wider than UTF-16(U+10FFFF), I can't use
> Python's native unicode string class.

Have you tried using Python compiled in Wide Unicode mode
(--enable-unicode=ucs4)? You get native UTF-32/UCS-4 strings then,
which should be enough for most purposes.

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Retrieve a GIF's palette entries using Python Imaging Library (PIL)

2006-01-19 Thread and-google
Stuart wrote:

> I see that the 'Image' class has a 'palette' attribute which returns an
> object of type 'ImagePalette'.  However, the documentation is a bit
> lacking regarding how to maniuplate the ImagePalette class to retrieve
> the palette entries' RGB values.

ImagePalette.getdata() should do it.

There seems to be some kind of bug, however, where Images lose their
ImagePalettes after being convert()ed to paletted images (eg. using
Image.ADAPTIVE). For this reason I personally use the getpalette()
method from the wrapped image object, which seems to contain the proper
raw palette data. For example to get a list of [r,g,b] colour lists:

  def chunk(seq, size):
    return [seq[i:i+size] for i in range(0, len(seq), size)]

  palette= image.im.getpalette()
  colours= [map(ord, bytes) for bytes in chunk(palette, 3)]

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: XML and namespaces

2005-12-19 Thread and-google
Uche Ogbuji <[EMAIL PROTECTED]> wrote:

> Andrew Clover also suggested an overly-legalistic argument that current
> minidom behavior is not a bug.

I stick by my language-law interpretation of spec. DOM 2 Core
specifically disclaims any responsibility for namespace fixup and
advises the application writer to do it themselves if they want to be
sure of the right output. W3C knew they weren't going to get all that
standardised by Level 2 so they left it open for future work - if
minidom claimed to support DOM 3 LS it would be a different matter.

> '\n'

> (i.e. "ferh" rather than "href"), would you not consider that a minidom
> bug?

It's not a *spec* bug, as no spec that minidom claims to conform to
says anything about serialisation. It's a *minidom* bug in that it
fails to conform to the minimal documentation of the method toxml()
which claims to "Return the XML that the DOM represents as a string" -
the DOM does not represent that XML.

However that doc for toxml() says nothing about being namespace-aware.
XML and XML-with-namespaces both still exist, and for the former class
of document the minidom behaviour is correct.

> The reality is that once the poor user has done:

> element = document.createElementNS("DAV:", "href")

> They are following DOM specification that they have created an element
> in a namespace

It's possible that a namespaced node could also be imported/parsed into
a non-namespace document and then serialised; it's particularly likely
this could happen for scripts processing XHTML.

We shouldn't change the existing behaviour for toxml/writexml because
people may be relying on it. One of the reasons I ended up writing a
replacement was that the behaviour of minidom was not only wrong, but
kept changing under my feet with each version.

However, adding the ability to do fixup on serialisation would indeed
be very welcome - toxmlns() maybe, or toxml(namespaces= True)?

> I'll be sure to emphasize heavily to users that minidom is broken
> with respect to Namespaces and serialization, and that they
> abandon it in favor of third-party tools.

Well yes... there are in any case more fundamental bugs than just
serialisation problems.

Frederik wrote:

> can anyone perhaps dig up a DOM L2 implementation that's not written
> by anyone involved in this thread



-- 
And Clover
mailto:[EMAIL PROTECTED]
http://doxdesk.com/



Re: XML and namespaces

2005-12-03 Thread and-google
Uche <[EMAIL PROTECTED]> wrote:

> Of course.  Minidom implements level 2 (thus the "NS" at the end of the
> method name), which means that its APIs should all be namespace aware.
> The bug is that writexml() and thus toxml() are not so.

Not exactly a bug - DOM Level 2 Core 1.1.8p2 explicitly leaves
namespace fixup at the mercy of the application. It's only standardised
as a DOM feature in Level 3, which minidom does not yet claim to
support. It would be a nice feature to add, but it's not entirely
trivial to implement, especially when you can serialize a partial DOM
tree.

Additionally, it might have some compatibility problems with apps that
don't expect namespace declarations to automagically appear. For
example, perhaps, an app dealing with HTML that doesn't want spare
xmlns="http://www.w3.org/1999/xhtml" declarations appearing in every
snippet of serialized output.

So it should probably be optional. In DOM Level 3 (and pxdom) there's a
DOMConfiguration parameter 'namespaces' to control it; perhaps for
minidom an argument to toxml() might be best?

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: PIL: retreive image resolution (dpi)

2005-08-22 Thread and-google
[EMAIL PROTECTED] wrote:

> I looked at the PIL Image class but cannot see a posibility to retreive
> the image resolution dots per inch (or pixels per inch)

Not all formats provide a DPI value; since PIL doesn't do anything with
DPI it's not part of the main interface.

For PNG and JPEG at least the value may be retrievable from the extra
info dictionary (image.info['dpi']) when loaded from a file that sets
it. Expect an (x, y) tuple (not necessarily square-pixel).

-- 
And Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Importing User-defined Modules

2005-07-25 Thread and-google
Walter Brunswick <[EMAIL PROTECTED]> wrote:

> I need to import modules with user-defined file extensions
> that differ from '.py', and also (if possible) redirect the
> bytecode output of the file to a file of a user-defined
> extension.

You shouldn't really need a PEP for that; you can take control of the
compile and import processes manually. See the py_compile and imp
modules.
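
On a current Python 3 the imp module is gone and importlib does the same job, so a sketch of the idea looks like this. The helper names load_module and compile_to are made up; the importlib and py_compile calls are the standard ones:

```python
import importlib.machinery
import importlib.util
import py_compile

def load_module(name, path):
    # SourceFileLoader reads Python source from any path, whatever its extension.
    loader = importlib.machinery.SourceFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module

def compile_to(path, bytecode_path):
    # cfile chooses where the bytecode is written, extension included.
    return py_compile.compile(path, cfile=bytecode_path, doraise=True)
```

For example, load_module('plugin', 'plugin.myext') would execute that file as a module called plugin, and compile_to('plugin.myext', 'plugin.mybc') would write its bytecode next to it.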

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Yet Another Python Web Programming Question

2005-07-09 Thread and-google
Daniel Bickett wrote:

> Python using CGI, for example, was enough for him until he started
> getting 500 errors that he wasn't sure how to fix.

Every time you mention web applications on this list, there will
necessarily be a flood of My Favourite Framework Is X posts.

But you* sound like you don't want a framework to take over the
architecture of your app and tell you what to do. And, indeed, you
don't need to do that. There are plenty of standalone modules you can
use - even ones that are masquerading as part of a framework.

I personally use my own input-stage and templating modules, along with
many others, over standard CGI, and only bother moving to a faster
server interface which can support DB connection pooling (such as
mod_python) if it's actually necessary - which is, surprisingly, not
that often. Hopefully if WSGI catches on we will have a better
interface available as standard in the future.

Not quite sure what 500 Errors you're getting, but usually 500s are
caused by unhandled exceptions, which Apache doesn't display the
traceback from (for security reasons). Bang the cgitb module in there
and you should be able to diagnose problems more easily.

> He is also interested in some opinions on the best/most carefree way
> of interfacing with MySQL databases.

MySQLdb works fine for me:

  http://sourceforge.net/projects/mysql-python/

(* - er, I mean, Hypothetical. But Hypothetical is a girl's name!)

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: why UnboundLocalError?

2005-07-09 Thread and-google
Alex Gittens wrote:

> I'm getting an UnboundLocalError

> def fieldprint(widths,align,fields): [...]
> def cutbits(): [...]
> fields = fields[widths[i]:]

There's your problem. You are assigning 'fields' a completely new
value. Python doesn't allow you to rebind a variable from an outer
scope in an inner scope (except for the special case where you
explicitly use the 'global' directive, which is no use for the nested
scopes you are using here).

So when you assign an identifier in a function Python assumes that you
want that identifier to be a completely new local variable, *not* a
reference to the variable in the outer scope. By writing 'fields= ...'
in cutbits you are telling Python that fields is now a local variable
to cutbits. So when the function is entered, fields is a new variable
with no value yet, and when you first try to read it without writing to
it first you'll get an error.

What you probably want to do is keep 'fields' pointing to the same
list, but just change the contents of the list. So replace the assign
operation with a slicing one:

  del fields[:widths[i]]
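
A cut-down sketch of both versions side by side. The function bodies are simplified stand-ins for the original fieldprint, not a reconstruction of it:

```python
def broken(fields, width):
    def cutbits():
        # Assignment makes 'fields' local to cutbits, so the read fails.
        fields = fields[width:]
    cutbits()
    return fields

def fixed(fields, width):
    def cutbits():
        # Mutates the outer list in place; nothing is rebound.
        del fields[:width]
    cutbits()
    return fields

try:
    broken(['a', 'b', 'c'], 1)
except UnboundLocalError as exc:
    print('broken:', type(exc).__name__)

print('fixed:', fixed(['a', 'b', 'c'], 1))
```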

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Problem with sha.new

2005-07-09 Thread and-google
Florian Lindner wrote:

> sha = sha.new(f.read())

> this generates a traceback when sha.new() is called for the second time

You have reassigned the variable 'sha'.

First time around, sha is the sha module object as obtained by 'import
sha'. Second time around, sha is the SHA hashing object you used the
first time around. This does not have a 'new' method.

Python does not have separate namespaces for packages and variables.
Modules are stored in variables just like any other object.
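
The cure is just to pick a different name for the hash object. A sketch using hashlib, which replaced the sha module in later Pythons (digest_all and shadowed are illustrative names):

```python
import hashlib

def digest_all(chunks):
    digests = []
    for chunk in chunks:
        # A distinct name, so 'hashlib' keeps referring to the module.
        hasher = hashlib.sha1(chunk)
        digests.append(hasher.hexdigest())
    return digests

def shadowed():
    sha = hashlib         # stands in for the old 'sha' module
    sha = sha.sha1(b'a')  # 'sha' is now a hash object, not a module
    return hasattr(sha, 'sha1')

print(digest_all([b'abc']))
print(shadowed())
```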

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Python as CGI on IIS and Windows 2003 Server

2005-06-15 Thread and-google
Lothat <[EMAIL PROTECTED]> wrote:

> No test with or without any " let the IIS execute python scrits as cgi.
> Http Error code is 404 (but i'm sure that the file exists in the
> requested path).

Have you checked the security restrictions? IIS6 has a new feature
whereby script mappings are disabled by default even if they are listed
in the configuration list.

To turn CGI on, go to the IIS Manager snap-in and select the 'Web
Service Extensions' folder. Select 'All Unknown CGI Extensions' and
click 'Allow'.

Incidentally, the string I am using is:

  "C:\Program Files\Python\2.4\python.exe" -u "%s" "%s"

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Elementtree and CDATA handling

2005-06-01 Thread and-google
Alain <[EMAIL PROTECTED]> wrote:

> I would expect a piece of XML to be read, parsed and written back
> without corruption [...]. It isn't however the case when it comes
> to CDATA handling.

This is not corruption, exactly. For most intents and purposes, CDATA
sections should behave identically to normal character data. In a real
XML-based browser (such as Mozilla in application/xhtml+xml mode), this
line of script would actually work fine:

> if (a < b && a > 0) {

The problem is you're (presumably) producing output that you want to be
understood by things that are not XML parsers, namely legacy-HTML web
browsers, which have special exceptions-to-the-rule like "

Re: Python 2.4.1 install broke RedHat 9 printconf-backend

2005-04-11 Thread and-google
BrianS wrote:

>   File "/usr/share/printconf/util/printconf_conf.py", line 83, in ?
> from xml.utils import qp_xml
> ImportError: No module named utils

> It seems that the xml package have been changed.

Not exactly. xml.utils is part of the XML processing package PyXML -
you don't get it in the cut-down XML stuff available in the standard
library.

You could try downloading and installing from http://pyxml.sf.net/.
Though I can't guarantee there won't be other problems as RedHat can be
very annoying like this. You might have to keep Python 2.2 around in
addition to 2.4 for RH's benefit; in any case trying to remove 2.2 will
probably lead you into an RPM dependency nightmare.

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: File Uploads

2005-03-27 Thread and-google
Doug Helm wrote:

> form = cgi.FieldStorage()
>   if lobjUp.Save('filename', 'SomeFile.jpg'):

> class BLOB(staticobject.StaticObject):
>   def Save(self, pstrFormFieldName, pstrFilePathAndName):
> form = cgi.FieldStorage()

You are instantiating cgi.FieldStorage twice. This won't work for POST
requests, because instantiating a FieldStorage reads the form data from
the standard input stream (the HTTP request).

Try to create a second one and cgi will try to read all the form data
again; this will hang, waiting for the socket to send it a load more
data which will not be forthcoming.

When using CGI, parse the input only once, then pass the results (a
FieldStorage object if you are using the cgi module) in to any other
functions that need to read it.
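
The shape of that, as a sketch: this version only handles simple urlencoded forms via urllib.parse, not multipart file uploads, and parse_form and save_field are hypothetical names. The point is only that the stream is consumed once and the parsed result is handed around.

```python
import io
from urllib.parse import parse_qs

def parse_form(environ, stream):
    # Consume the request body exactly once.
    length = int(environ.get('CONTENT_LENGTH') or 0)
    return parse_qs(stream.read(length).decode('ascii'))

def save_field(form, field_name):
    # Works from the already-parsed data; never touches the input stream.
    return form[field_name][0]

body = b'filename=SomeFile.jpg&x=1'
stream = io.BytesIO(body)
form = parse_form({'CONTENT_LENGTH': str(len(body))}, stream)
print(save_field(form, 'filename'))
```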

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: injecting "set" into 2.3's builtins?

2005-03-11 Thread and-google
Skip Montanaro wrote:

> I use sets a lot in my Python 2.3 code at work and have been using
> this hideous import to make the future move to 2.4's set type
> transparent:

> try:
> x = set

(Surely just 'set' on its own is sufficient? This avoids the ugly else
clause.)

> __builtin__.set = sets.Set

> I'm wondering if others have tried it. If so, did it cause any
> problems?

I don't know of any specific case where it would cause problems but I'd
be very wary of this; certainly doing the same with True and False has
caused problems in the past. A module might sniff for 'set' and assume
it is running on 2.4 if it sees it, with unpredictable results if it
relies on any other 2.4 behaviour.

I'd personally put this at the top of local scripts:

  from siteglobals import *

Then put compatibility hacks like set and bool in siteglobals.py. Then
any modules or other non-site scripts could continue without the
polluted builtin scope.
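
A sketch of what such a siteglobals.py could contain (the module name comes from the suggestion above; the body is my guess at a reasonable version):

```python
# siteglobals.py: compatibility names for local scripts only.
try:
    # Rebinding at module level makes 'set' an attribute of this module,
    # so 'from siteglobals import *' exports it on every Python version.
    set = set
except NameError:
    # Only reached on Python 2.3, where the type lived in the sets module.
    from sets import Set as set
```

Because nothing is written into __builtin__, third-party modules that sniff for 'set' still see whatever the interpreter really provides.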

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: function with a state

2005-03-06 Thread and-google
Xah Lee <[EMAIL PROTECTED]> wrote:

> is it possible in Python to create a function that maintains a
> variable value?

Yes. There's no concept of a 'static' function variable as such, but
there are many other ways to achieve the same thing.

> globe=0;
> def myFun():
>   globe=globe+1
>   return globe

This would work except that you have to tell it explicitly that you're
working with a global, otherwise Python sees the "globe=" and decides
you want 'globe' to be a local variable.

  globe= 0
  def myFun():
    global globe
    globe= globe+1
    return globe

Alternatively, wrap the value in a mutable type so you don't have to do
an assignment (and can use it in nested scopes):

  globe= [ 0 ]
  def myFun():
    globe[0]+= 1
    return globe[0]

A hack you can use to hide statics from code outside the function is to
abuse the fact that default parameters are calculated at define-time:

  def myFun(globe= [ 0 ]):
    globe[0]+= 1
    return globe[0]

For more complicated cases, it might be better to be explicit and use
objects:

  class Counter:
    def __init__(self):
      self.globe= 0
    def count(self):
      self.globe+= 1
      return self.globe

  myFun= Counter().count

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: convert gb18030 to utf16

2005-03-06 Thread and-google
Xah Lee <[EMAIL PROTECTED]> wrote:

> i have a bunch of files encoded in GB18030. Is there a way to convert
> them to utf16 with python?

You will need CJKCodecs (http://cjkpython.i18n.org/), or Python 2.4,
which has them built in. Then just use them like any other codec. eg.

  f= open(path, 'rb')
  content= unicode(f.read(), 'gb18030')
  f.close()
  f= open(path, 'wb')
  f.write(content.encode('utf-16'))
  f.close()
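
For reference, a rough Python 3 equivalent, where open() takes the encoding directly and the codecs are always built in. The function name convert is mine, and unlike the snippet above it writes to a separate output path rather than overwriting the input:

```python
def convert(src_path, dst_path):
    # newline='' keeps line endings exactly as they appear in the source file.
    with open(src_path, encoding='gb18030', newline='') as src:
        content = src.read()
    # The 'utf-16' codec writes a Byte Order Mark at the start of the file.
    with open(dst_path, 'w', encoding='utf-16', newline='') as dst:
        dst.write(content)
```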

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: get textual content of a Xml element using 4DOM

2005-03-06 Thread and-google
Frank Abel Cancio Bello <[EMAIL PROTECTED]> wrote:

> PrettyPrint or Print return the value to the console, and i need
> keep this  value in a string variable to work with it, how can i
> do this?

The second parameter to either of these functions can be a stream
object, so you can use a StringIO to get string output:

  from StringIO import StringIO
  from xml.dom.ext import Print

  buf= StringIO()
  Print(doc, buf)
  xml= buf.getvalue()

-- 
Andrew Clover
http://www.doxdesk.com/
mailto:[EMAIL PROTECTED]



Re: Problem with minidom and special chars in HTML

2005-02-24 Thread and-google
Horst Gutmann wrote:

> I currently have quite a big problem with minidom and special chars
> (for example &uuml;)  in HTML.

Yes. Ignoring the issue of the wrong doctype, minidom is a pure XML
parser and knows nothing of XHTML and its doctype's entities 'uuml' and
the like. Only the built-in entities (&amp; etc.) will work.

Unfortunately the parser minidom uses won't read external entities -
including the external subset of the DTD (which is where all the stuff
about what "&uuml;" means is stored). And because minidom does not
support EntityReference nodes, the information that there was an entity
reference there at all gets thrown away as it is replaced with the
empty string. Which is kind of bad.

Possible workarounds:

1. pass minidom a different parser to use, one which supports external
entities and which will parse all the DTD stuff. I don't know if there
is anything suitable available, though...

2. use a DOM implementation with the option to support external
entities. For example, with pxdom, one can use DOM Level 3 LS methods,
or pxdom.parse(f, {'pxdom-external-entities': True}).

However note that reading and parsing an external entity will introduce
significant slowdown, especially in the case of the rather complex
multi-file XHTML DTD. Other possibilities:

3. hack the content on the way into the parser to replace the DOCTYPE
declaration with one including entity definitions in the internal
subset:

  <!DOCTYPE html [
    <!ENTITY uuml "&#252;">
    ...
  ]>
  <html>...

4. hack the content on the way into the parser to replace entity
references with character references, eg. &uuml; -> &#252;. This is
'safe' for simple documents without an internal subset; charrefs and
entrefs can be used in the same places with the same meaning, except
for some issues in the internal subset.

5. use a DOM implementation that supports EntityReference nodes, such
as pxdom. Entity references with no replacement text (or all entity
references if the DOM Level 3 LS parameter 'entities' is set) will
exist as EntityReference DOM objects instead of being flattened to
text. They can safely be reserialized as &uuml; without the
implementation having to know what text they represent.
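
Workaround 4 might be sketched roughly as follows. This is only an illustration: the function name entrefs_to_charrefs is made up, it takes the HTML entity table from Python 3's html.entities, and the regex is naive, so it would also rewrite anything that looks like an entity reference inside comments or CDATA sections.

```python
import re
from html.entities import name2codepoint
from xml.dom.minidom import parseString

# The five entities every XML parser already knows; leave these alone.
XML_BUILTINS = {"amp", "lt", "gt", "quot", "apos"}

def entrefs_to_charrefs(markup):
    """Replace known HTML entity references with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_BUILTINS or name not in name2codepoint:
            return match.group(0)  # built-in or unknown: pass through untouched
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)

# Made-up sample input; minidom alone cannot resolve &uuml; or &szlig; here.
source = "<p>Gr&uuml;&szlig;e &amp; more</p>"
doc = parseString(entrefs_to_charrefs(source))

# Join all text children in case the parser splits the character data.
text = "".join(node.data for node in doc.documentElement.childNodes)
```

Unknown names are passed through unchanged, so a document using entities outside the HTML set would still need one of the other workarounds.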

Entities are a big source of complication and confusion, which I wish
had not made it into XML!

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: CGI POST problem was: How to read POSTed data

2005-02-05 Thread and-google
Dan Perl wrote:

> how is a multipart POST request parsed by CGIHTTPServer?

It isn't; the input stream containing the multipart/form-data content
is passed to the CGI script, which can choose to parse it or not using
any code it has to hand - which could be the 'cgi' module, but not
necessarily.
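
As a rough illustration of a script parsing that stream with "any code it has to hand", here is a sketch that uses the standard email package rather than the cgi module (which was removed from the standard library in Python 3.13). The function name parse_multipart_body and the sample request are assumptions for the example; a real CGI script would read CONTENT_LENGTH bytes from standard input and take the content type from the CONTENT_TYPE environment variable.

```python
from email.parser import BytesParser
from email.policy import default

def parse_multipart_body(body, content_type):
    """Return {field name: raw bytes} for a multipart/form-data body."""
    # Prepend the Content-Type header so the parser sees a complete MIME message.
    head = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = BytesParser(policy=default).parsebytes(head + body)
    fields = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        fields[name] = part.get_payload(decode=True)
    return fields

# Made-up request body, as it would arrive on the script's standard input.
sample_body = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="title"\r\n'
    b"\r\n"
    b"hello\r\n"
    b"--XyZ--\r\n"
)
fields = parse_multipart_body(sample_body, "multipart/form-data; boundary=XyZ")
```

This keeps only the last value for a repeated field name and ignores filenames, which a real form handler would need to deal with.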

> Where is the parsing done for the POST data following the header?

If you are using the 'cgi' module, then cgi.parse_multipart.

> As a side note, I found other old reports of problems with cgi
> handling POST requests, reports that don't seem to have had a
> resolution.

(in particular?)

FWIW, for interface-style and multipart-POST-file-upload-speed reasons
I wrote an alternative to cgi, form.py
(http://www.doxdesk.com/software/py/form.html). But I haven't found
cgi's POST reading to be buggy in general.

> There is even a bug reported just a few days ago (1112856) that is
> exactly about multipart post requests. If I understand the bug
> report correctly though, it is only on the latest version in CVS
> and it states that what is in the 2.4 release works.

That's correct.

> All this tells me that it could be a "fragile" part in the standard
> library.

I don't really think so; it's really an old stable part of the library
that is a bit crufty in places due to age. The patch that caused
1112856 was an attempt to rip out and replace the parser stuff, which
as a big change to old code is bound to cause trouble. But that's what
the dev cycle is for.

CGIHTTPServer, on the other hand, I have never really trusted. I would
suspect that fella.

-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Octal notation: severe deprecation

2005-01-12 Thread and-google
John Machin wrote:

> I regard continued usage of octal as a pox and a pestilence.

Quite agree. I was disappointed that it ever made it into Python.

Octal's only use is:

a) umasks
b) confusing the hell out of normal non-programmers for whom a
leading zero is in no way magic

(a) does not outweigh (b).

In Mythical Future Python I would like to be able to use any base in
integer literals, which would be better. Example random syntax:

flags= 2x00011010101001
umask= 8x664
answer= 10x42
addr= 16x0E84  # 16x == 0x
gunk= 36x8H6Z9A0X

But either way, I want rid of 0->octal.
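
Until literals like that exist, the built-in int() already accepts an explicit base from 2 to 36 when given a string, so the same values can be written out as below. This is a sketch of existing behaviour, not of the proposed syntax; the variable names simply mirror the example above.

```python
# int(text, base) parses text in any base from 2 to 36.
flags = int("00011010101001", 2)
umask = int("664", 8)
answer = int("42", 10)
addr = int("0E84", 16)
gunk = int("8H6Z9A0X", 36)  # digits beyond 9 are the letters A-Z, case-insensitive
```

The cost is that the number is parsed at run time from a string rather than being checked as a literal when the module is compiled.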

> Is it not regretted?

Maybe the problem just doesn't occur to people who have used C too
long.

OT: Also, if Google doesn't stop lstrip()ing my posts I may have to get
a proper news feed. What use is that on a Python newsgroup? Grr.
-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Looking for source preservation features in XML libs

2004-12-28 Thread and-google
Grzegorz Adam Hankiewicz <[EMAIL PROTECTED]> wrote:

> I have looked at xml.minidom, elementtree and gnosis and haven't
> found any such features. Are there libs providing these?

pxdom (http://www.doxdesk.com/software/py/pxdom.html) has some of this,
but I think it's still way off what you're envisaging.

> One is to be able to tell which source file line a tag starts
> and ends.

You can get the file and line/column where a node begins in pxdom using
the non-standard property Node.pxdomLocation, which returns a DOM Level
3 DOMLocator object, eg.:

uri= node.pxdomLocation.uri
line= node.pxdomLocation.lineNumber
col= node.pxdomLocation.columnNumber

There is no way to get the location of an Element's end-tag, however.
Except guessing by looking at the positions of adjacent nodes, which is
kind of cheating and probably not reliable.

SAX processors can in theory use Locator information too, but AFAIK (?)
this isn't currently implemented.
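
For what it's worth, in current CPython the default expat-based SAX reader does seem to hand a locator to the content handler. A minimal sketch, assuming that default reader; the LineRecorder class and the sample document are made up for illustration:

```python
import xml.sax

class LineRecorder(xml.sax.ContentHandler):
    """Record the line on which each start-tag begins."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.starts = []

    def setDocumentLocator(self, locator):
        # Called by the reader before parsing begins, if it supports locators.
        self.locator = locator

    def startElement(self, name, attrs):
        line = self.locator.getLineNumber() if self.locator else None
        self.starts.append((name, line))

handler = LineRecorder()
xml.sax.parseString(b"<a>\n  <b/>\n</a>", handler)
# handler.starts now holds (element name, line number) pairs.
```

Like pxdomLocation, this only gives the position where an element starts; the same locator could be queried in endElement, but what position it reports there would need checking.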

> Another feature is to be able to save the processed XML code in a way
> that unmodified tags preserve the original indentation.

Do you mean whitespace *inside* the start-tag? I don't know of any XML
processor that will do anything but ignore whitespace here; in XML
terms it is utterly insignificant and there is no place to store the
information in the infoset or DOM properties.

pxdom will preserve the *order* of the attributes, but even that is not
required by any XML standard.

> Or in the worst case, all indentation is lost, but I can control to
> some degree the outlook of the final XML output.

The DOM Level 3 LS feature format-pretty-print (and PyXML's
PrettyPrint) influence whitespace in content. However if you do want
control of whitespace inside the tags themselves I don't know of any
XML tools that will do it. You might have to write your own serializer,
or hack it into a DOM implementation of your choice.
-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/



Re: Web forum (made by python)

2004-12-20 Thread and-google
Choe, Cheng-Dae wrote:

> example site is http://bbs.pythonworld.net:9080/pybbs.py

Since this seems quite happy to accept posted 

Re: regex syntax

2004-12-06 Thread and-google
Andreas Volz <[EMAIL PROTECTED]> wrote:

> I had already thought about simply comparing the last four characters
> of the string against the character sequence "by hand" and so
> avoiding regex. But I will have to use it at some point anyway

"Have to"? I disagree! Regexps are a good solution for limited
purposes, but they are not a basic building block of programming. For
this example:

>>> filename.endswith('.jpg')

would be much better than the equivalent regexp:

>>> re.match(r'.*\.jpg$', filename)
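
A small side-by-side sketch of the two checks, with made-up helper names. Besides being harder to read, the regexp is not quite equivalent: '$' also matches just before a trailing newline, so the two can disagree on such input.

```python
import re

def is_jpg_by_suffix(filename):
    # Plain string method: true only if the name literally ends in '.jpg'.
    return filename.endswith('.jpg')

def is_jpg_by_regex(filename):
    # The regexp from above; '$' also matches just before a final newline.
    return re.match(r'.*\.jpg$', filename) is not None

# The one input here on which the two checks disagree.
name_with_newline = 'photo.jpg\n'
suffix_says = is_jpg_by_suffix(name_with_newline)
regex_says = is_jpg_by_regex(name_with_newline)
```

endswith() also accepts a tuple of suffixes, e.g. filename.endswith(('.jpg', '.jpeg')), should more than one extension need to be accepted.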
-- 
Andrew Clover
mailto:[EMAIL PROTECTED]
http://www.doxdesk.com/
