Re: Re: Web page data and urllib2.urlopen

2009-08-06 Thread Kushal Kumaran
On Fri, Aug 7, 2009 at 3:47 AM, Dave Angel wrote:
>
>
> Piet van Oostrum wrote:
>>
>> 
>>>
>>> DA> All I can guess is that it has something to do with "browser type"
>>> DA> or cookies.  And that would make lots of sense if this was a cgi
>>> DA> page.  But the URL doesn't look like that, as it doesn't end in pl,
>>> DA> py, asp, or any of another dozen special suffixes.
>>>

Note that the URL does not have to have any special suffix for it to
be dynamically generated.  See any page at wikipedia, for example.
Mediawiki, the software running the site, is a php application.

>>
>>
>>>
>>> DA> Any hints, anybody???
>>>
>>
>> If you look into the HTML that Firefox gets, there is a lot of
>> javascript in it.
>>
>
> But the raw page didn't have any javascript.  So what about that original
> raw page triggered additional stuff to be loaded?

FWIW, I'm getting a ton of javascript in the page downloaded using
your code fragment.

> Is it "user agent", as someone else brought out?  And is there somewhere I
> can read more about that aspect of things?  I've mostly built very static
> html pages, where the server yields the same page to everybody.  And some
> form stuff, where the user clicks on a "submit" button to trigger a script
> that's not shown on the URL line.
>

-- 
kushal
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators through the C API

2009-08-06 Thread Stefan Behnel
Duncan Booth wrote:
> Lucas P Melo  wrote:
> 
>> Hello, I'm a total noob about the C API. Is there any way to create a 
>> generator function using the C API? I couldn't find anything like the 
>> 'yield' keyword in it.
>>
>> Thanks in advance.
> 
> You define a new type with tp_flags including Py_TPFLAGS_HAVE_ITER. 
> Anything that would be a local variable in your generator needs to become 
> an attribute in the type.
> 
> The tp_init initialization function should contain all the code up to the 
> first yield, tp_iter should return self and tp_iternext should execute code 
> up to the next yield.

This is pretty easy to do in Cython (or Pyrex), BTW. Just write a class
with an __iter__ and __next__ method, and Cython will generate the C-API
code as expected.

http://docs.cython.org/docs/special_methods.html#iterators
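
For illustration, here is a minimal pure-Python sketch of the structure
described above: state that would be a local variable in a generator
lives on the instance, __iter__ returns self, and __next__ runs up to the
next would-be yield. The Countdown class is a made-up example, written
for Python 3 (Python 2 spells the method next).

```python
class Countdown:
    """Hand-written iterator equivalent to a generator that yields n, n-1, ..., 1."""

    def __init__(self, start):
        # In a generator this would be a local variable; here it must be
        # an attribute so it survives between calls to __next__.
        self.remaining = start

    def __iter__(self):
        # An iterator returns itself (the role of tp_iter in the C API).
        return self

    def __next__(self):
        # Runs "up to the next yield" (the role of tp_iternext).
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value
```

Usage is the same as for a generator, e.g. "for n in Countdown(3): ...".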

Note that Cython doesn't currently support the "yield" statement, but
that's certainly on the ToDo list.

http://trac.cython.org/cython_trac/ticket/83

Stefan


Re: help with threads

2009-08-06 Thread Michael Mossey
Ah yes, that explains it. Some of these long computations are done in
pure C, so I'm sure the GIL is not being released.
Thanks.


Re: How to force SAX parser to ignore encoding problems

2009-08-06 Thread Stefan Behnel
Łukasz wrote:
> I have a problem with my XML parser (created with libraries from the
> xml.sax package). When the parser finds an invalid character (in a CDATA
> section), for example �, it throws a SAXParseException.
> 
> Is there any way to just ignore this kind of problem. Maybe there is a
> way to set up parser in less strict mode?
> 
> I know that I can catch this exception and determine if this is this
> kind of problem and then ignore this, but I am asking about any global
> setting.

The parser from libxml2 that lxml provides has a recovery option, i.e. it
can keep parsing regardless of errors and will drop the broken content.

However, it is *always* better to fix the input, if you can get your hands on it.
Broken XML is *not* XML at all. If you can't fix the source, you can never
be sure that the data you received is in any way complete or even usable.
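
If the source really cannot be fixed, one stopgap that stays within the
standard library is to clean the bytes before they reach the SAX parser.
The sketch below is a workaround under stated assumptions, not a parser
setting and not lxml's recovery mode: it assumes the document is meant to
be UTF-8 and carries no conflicting encoding declaration, replaces
undecodable bytes with U+FFFD, and drops control characters that XML 1.0
forbids. The helper names are invented for this example.

```python
import re
import xml.sax

# Control characters that XML 1.0 never allows (tab, LF and CR are legal).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def sanitize_xml_bytes(raw, encoding="utf-8"):
    """Decode leniently, then strip characters XML 1.0 forbids."""
    text = raw.decode(encoding, errors="replace")
    return _ILLEGAL_XML_CHARS.sub("", text)


class TextCollector(xml.sax.ContentHandler):
    """Collects all character data, just to show that parsing succeeds."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(raw):
    """Sanitize the raw bytes and return the character data SAX reports."""
    handler = TextCollector()
    cleaned = sanitize_xml_bytes(raw).encode("utf-8")
    xml.sax.parseString(cleaned, handler)
    return "".join(handler.chunks)
```

The caveat above still applies: anything replaced or dropped here is data
that was lost, so the result should be treated as incomplete.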

Stefan


Re: How to fetch an XML file using an HTTPS query

2009-08-06 Thread Stefan Behnel
Tycho Andersen wrote:
> Blah, forgot to include the list. When is python-list going to get Reply-To?

Hopefully never.

Stefan


Re: SDMX format

2009-08-06 Thread Stefan Behnel
xamdam wrote:
> Does anyone know of python libs for writing SDMX XML format?
> http://www.SDMX.org/resources/SDMXML/schemas/v2_0/common

Looks like the page behind that link is broken, but in general, working
with XML formats in Python isn't hard at all when you use ElementTree or
lxml. The latter also has support for XML-Schema validation, and you might
be interested in lxml.objectify for handling data centric XML formats
(assuming that's the case here).
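
As a rough sketch of what writing a namespaced XML document with
ElementTree looks like (Python 3 syntax): the namespace URI and the
element and attribute names below are placeholders invented for the
example, not the real SDMX schema.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace; a real SDMX writer would use the schema's URI.
NS = "http://example.org/hypothetical/ns"


def build_dataset(observations):
    """Serialize (period, value) pairs into a small namespaced XML string."""
    ET.register_namespace("ex", NS)
    root = ET.Element("{%s}DataSet" % NS)
    for period, value in observations:
        obs = ET.SubElement(root, "{%s}Obs" % NS)
        obs.set("period", period)
        obs.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

The same Element/SubElement calls work with lxml.etree, which is what
makes switching to lxml for schema validation fairly painless.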

Stefan


Re: how to kill subprocess when Python process is killed?

2009-08-06 Thread alex23
On Aug 7, 3:42 pm, "mark.v.we...@gmail.com" 
wrote:
> When I kill the main process (cntl-C) the subprocess keeps running.
> How do I kill the subprocess too? The subprocess is likely to run a
> long time.

You can register functions to run when the Python process ends by
using the atexit[1] module.

The following has been tested & works under Python 2.6 on Windows XP:

import atexit

def cleanup():
    print 'stop the subprocess in here'

atexit.register(cleanup)

while True:
    pass


[1]: http://docs.python.org/library/atexit.html
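
A slightly fuller sketch of the same idea, written for Python 3 and
assuming the child was started with subprocess.Popen. The start_child
helper is hypothetical. Note that atexit handlers run on normal
interpreter exit (including after an unhandled KeyboardInterrupt from
Ctrl-C), but not when the process is killed by a signal Python does not
handle or when os._exit() is called.

```python
import atexit
import subprocess
import sys


def start_child(args):
    """Start a subprocess and arrange for it to be stopped at interpreter exit."""
    proc = subprocess.Popen(args)

    def cleanup():
        # poll() returns None while the child is still running.
        if proc.poll() is None:
            proc.terminate()
            proc.wait()

    atexit.register(cleanup)
    return proc, cleanup
```

Returning the cleanup function as well lets the caller stop the child
early; calling it a second time at exit is harmless.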


Re: one method of COM object needs a variant type variable

2009-08-06 Thread Gabriel Genellina
On Thu, 06 Aug 2009 09:37:55 -0300, MICHÁLEK Jan Mgr. wrote:


How can I use this type in win32com? One method of a COM object (GeoMedia
storage service) needs this variable to store the geometry of a geometry
object (this variable will be written into a blob in the DB). Is it
possible to create this variable in Python?


Any compatible object may be used on the Python side; the pywin32 library
manages the conversion automatically.


See  
http://docs.activestate.com/activepython/2.4/pywin32/html/com/win32com/HTML/PythonCOM.html


--
Gabriel Genellina



Re: Help with regex

2009-08-06 Thread Nobody
On Thu, 06 Aug 2009 14:23:47 -0700, Ethan Furman wrote:

>> [0-9]+ allows any number of leading zeros, which is sometimes undesirable.
>> Using:
>> 
>>  (0|[1-9][0-9]*)
>> 
>> is more robust.
> 
> You make a good point about possibly being undesirable, but I question 
> the assertion that your solution is /more robust/.  If the OP 
> wants/needs to match numbers even with leading zeroes your /more robust/ 
> version fails.

Well, the OP did say:

> The regex should only match the exact above.

I suppose that it depends upon the definition of "exact" ;)

More seriously: failing to produce an error when one is called for is also
a bug.

Personally, unless I knew for certain that the rest of the program would
handle leading zeros correctly (e.g. *not* interpreting the number as
octal), I would try to reject it in the parser. It's usually much easier
to determine the cause of an error raised by the parser than if you allow
bogus data to propagate deep into the program.
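
A small sketch of the difference between the two patterns discussed in
this thread, using re.fullmatch (Python 3.4+). The names LOOSE, STRICT
and accepts are made up for the example.

```python
import re

LOOSE = re.compile(r"[0-9]+")            # accepts any run of digits
STRICT = re.compile(r"0|[1-9][0-9]*")    # rejects leading zeros


def accepts(pattern, text):
    """True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None
```

With these, accepts(LOOSE, "007") is true while accepts(STRICT, "007") is
false, so a value like "010" is rejected by the parser instead of being
read as 8 by some later octal-aware conversion.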



how to kill subprocess when Python process is killed?

2009-08-06 Thread mark.v.we...@gmail.com
I am writing a Python program that launches a subprocess (using
Popen). I am reading stdout of the subprocess, doing some filtering,
and writing to stdout of the main process.

When I kill the main process (cntl-C) the subprocess keeps running.
How do I kill the subprocess too? The subprocess is likely to run a
long time.

Context:
I'm launching only one subprocess at a time, I'm filtering its stdout.
The user might decide to interrupt to try something else; the user
wants the process and all subprocesses to go away in response
to a cntl-C

I'm new to python; solution must be for Python 2.5 (windows) to help
me.

Any help and/or pointers appreciated.




Re: Python docs disappointing - group effort to hire writers?

2009-08-06 Thread alex23
Paul Rubin  wrote:
> The PHP docs as I remember are sort of regular (non-publically
> editable) doc pages, each of which has a public discussion thread
> where people can post questions and answers about the topic of that
> doc page.  I thought it worked really well.  The main thing is that
> the good stuff from the comment section gets folded into the actual
> doc now and then.

I'd still like to see this kept out of the official docs as much as
possible, mostly for reasons of brevity & clarity. I think the
official docs should be considered definitive and not require a
hermeneutic evaluation against user comments to ensure they're still
correct...

How about a secondary site that embeds the docs and provides
commenting functionality around it? That's certainly a finitely scoped
project that those with issues about the docs could establish and
contribute to, with the possibility of it gaining official support
later once it gains traction.





Re: Python configuration question when python scripts are executed using Appweb as web server.

2009-08-06 Thread Gabriel Genellina
On Thu, 06 Aug 2009 12:49:30 -0300, IronyOfLife wrote:

On Aug 5, 4:18 pm, "Gabriel Genellina"  wrote:

On Tue, 04 Aug 2009 10:15:24 -0300, IronyOfLife wrote:
> On Aug 3, 8:42 pm, "Gabriel Genellina"  wrote:
>> On Mon, 03 Aug 2009 11:04:07 -0300, IronyOfLife wrote:

>> > I have installed python 2.6.2 in windows xp professional machine. I
>> > have set the following environment variables -- PYTHONPATH. It points
>> > to following windows folders: python root folder, the lib folder and
>> > lib-tk folder.

>> Why? Did you read it somewhere? Usually there is no need to set the
>> PYTHONPATH variable at all; remove it.

> Setting PYTHONPATH environment variables is mentioned in Python docs.

Could you provide a link please? Setting PYTHONPATH should not be


Here are a couple of links that discuss setting the PYTHONPATH
environment variable.
http://docs.python.org/using/windows.html


Ouch, that document should be reworked and updated!


http://www.daimi.au.dk/~chili/PBI/pythonpath.html


That's mostly obsolete for current versions. Since PEP370 [0] was  
implemented, a per-user private directory is already in sys.path now, so  
there is no need to play with PYTHONPATH.



I understand your concerns regarding setting of PYTHONPATH while
multiple versions of Python are installed on the same machine. My fix
however does not use PYTHONPATH. The GNUTLS wrapper module for PYTHON
loads the GNUTLS dll's and it was not able to find them. Using FileMon
(win tool) I found out the paths that are scanned and I copied the
dlls to one of such paths. I still do not like this fix. This is a
temporary solution.


Note that this cannot be fixed from inside Python. When you import a  
module, the interpreter scans the directories in ITS search path  
(sys.path) looking for a matching module. Once the module is found:

- if it is a Python file, it's executed
- if it is an extension module (a .pyd file, in fact a .dll renamed) it's  
loaded using LoadLibraryEx (a Windows function), and finally its  
initialization routine is executed.


For LoadLibrary to successfully load the module, all its external  
references must be resolved. That is, Windows must locate and load all  
other DLLs that this module depends on, using its own search strategy [1],  
taking into account the PATH environment variable and many other places.


It is at this stage that you get the error: when the gnutls wrapper  
attempts to load the gnutls DLL. That search is done by Windows, not by  
Python, and PYTHONPATH has nothing to do with it.


Why different results in IIS and appweb? Note that there exist several  
search strategies, they depend on the application home directory, the  
location of the .exe, whether SetDllDirectory was called or not, whether  
the application has a manifest or not, a .local file or not... Hard to  
tell with so many variables.



Can you explain maybe with some sample how to set .pth files? Maybe
this will resolve my issue.


Yes, but I don't think this will actually help with your issue.

pth files are described here [2]. Suppose you want to add c:\foo\bar to  
sys.path, then write a file whatever.pth containing this single line:


c:\foo\bar

and place it on c:\your_python_installation\lib\site-packages
When the interpreter starts, it will find the .pth file, read it, and add  
any directory it finds (that actually exists) to sys.path


Note that another Python installation will use a different site-packages  
directory and won't find this particular .pth file, so different Python  
versions don't interfere.
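
The .pth mechanism can be tried out without touching the real
site-packages directory. This is a minimal sketch (Python 3 syntax) that
assumes a throwaway temporary directory stands in for site-packages;
site.addsitedir() is the standard-library call that processes the .pth
files in a directory, and demo_pth is a made-up helper name.

```python
import os
import site
import sys
import tempfile


def demo_pth():
    """Create a .pth file in a scratch directory and let `site` process it."""
    site_dir = tempfile.mkdtemp()
    existing = os.path.join(site_dir, "bar")
    os.mkdir(existing)
    missing = os.path.join(site_dir, "does_not_exist")

    # One directory per line, exactly like whatever.pth above.
    with open(os.path.join(site_dir, "whatever.pth"), "w") as f:
        f.write(existing + "\n" + missing + "\n")

    # Adds site_dir itself to sys.path and reads every .pth file in it.
    site.addsitedir(site_dir)
    return existing, missing
```

After the call, the directory that exists is on sys.path and the one that
does not exist has been skipped, matching the behaviour described above.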


---

[0] http://python.org/dev/peps/pep-0370/
[1] http://msdn.microsoft.com/en-us/library/ms682586(VS.85).aspx
[2] http://docs.python.org/library/site.html

--
Gabriel Genellina



Re: Generate a new object each time a name is imported

2009-08-06 Thread Michele Simionato
On Aug 2, 1:18 pm, Paul Rubin  wrote:
> Peter Otten <__pete...@web.de> writes:
> > >>> import sys
> > >>> sys.modules["yadda"] = A()
>
> OMG wow.  I bow to you.  But I'm not sure whether that's bowing in
> awe or in terror.

I had to play this kind of trick on our production code, not so long
ago. Not that I am proud of it, but it was the lesser evil to cope
with a wrong design. The scenario: a large legacy code base
based on the idea of using a Python module to keep configuration
parameters. The idea is fine as long as the parameters are
immutable, but in our case the parameters could be changed.
In theory the parameters should have been set only once,
however in practice this was not guaranteed: every piece
of code could change the parameters at some moment, and things
could get "interesting" to debug.
Throwing away the configuration system was not an option, because
it would have involved changing hundreds of modules, so I set out
for a compromise: the parameters are still mutable, but they
can be changed only once. This was implemented by replacing
the module with a configuration object using custom
descriptors. Here is the code:

$ cat config.py
import sys

class WriteOnce(object):
    "WriteOnce descriptor"
    def __init__(self, default):
        self.default = self.value = default
        self.already_set = False
    def __get__(self, obj, objcls):
        return self.value
    def __set__(self, obj, value):
        if value != self.value and self.already_set:
            raise TypeError('You cannot set twice a write-once attribute!')
        self.value = value
        self.already_set = True

class Config(object):
    "A singleton to be used as a module"
    parameter = WriteOnce(0)

sys.modules['config'] = Config()

The usage is

>>> import config
>>> config.parameter = 42
>>> config.parameter = 43
Traceback (most recent call last):
   ...
TypeError: You cannot set twice a write-once attribute!

Just to say that there are use cases where replacing modules with
objects may make sense.
Still, a better design would just have passed immutable configuration
objects around (changing the configuration would mean creating a new
object).
Unfortunately there are still a lot of people without a functional
mindset :-(
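
For comparison, a minimal sketch of the "immutable configuration object"
alternative, using collections.namedtuple. The field names and the
with_parameter helper are invented for the example; this is one possible
shape for such a design, not the code that was used in production.

```python
from collections import namedtuple

# Field names are illustrative only.
Config = namedtuple("Config", ["parameter", "verbose"])

DEFAULT = Config(parameter=0, verbose=False)


def with_parameter(config, value):
    """Return a new Config; the original object is never modified."""
    return config._replace(parameter=value)
```

Code that needs a different setting builds a new object with
with_parameter(DEFAULT, 42) and passes it along explicitly, so nothing
can change the configuration behind another module's back.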


Re: Python3: Using sorted(key=...)

2009-08-06 Thread Paul Rubin
Johannes Bauer  writes:
> def myorder(x):
>     if type(x[0]) == int:
>         return x[0]
>     else:
>         return x[0][0]

I used to write code like that pretty regularly, but over time I found
that it's better to stay consistent and use the same container format
(in this case, tuples) for everything, rather than having special
cases for the singleton and non-singleton case.
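
One way to apply that advice to the sort in this thread, sketched for
Python 3: wrap bare ints in one-element tuples inside the key function,
so every key handed to sorted() is a tuple and tuple-versus-int
comparisons never happen. The names as_tuple and sort_items are made up
for the example, and it assumes the dictionary keys are either ints or
tuples of ints.

```python
def as_tuple(key):
    """Normalize a key so that ints and tuples become comparable tuples."""
    return key if isinstance(key, tuple) else (key,)


def sort_items(mapping):
    """Sort (key, value) pairs by the normalized form of the key."""
    return sorted(mapping.items(), key=lambda item: as_tuple(item[0]))
```

Unlike picking out the first element, this keeps the whole tuple in the
comparison, so (1, 2) and (1, 3) are still ordered sensibly.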


Re: SMTP

2009-08-06 Thread Gabriel Genellina
On Fri, 07 Aug 2009 00:21:34 -0300, Sarmad George wrote:



Good day all
I am new to the Python list


Welcome!


My question is here not Python as much as it is in servers

I have a code to send an email through my SMTP server - comcast Boston
fromaddrs = "**"
toaddrs = "sgeo...@coe.neu.edu"
msg = "Hello World"

server = smtplib.SMTP ('76.96.30.117', '25') #COMCAST BOSTON
server.set_debuglevel(1)
try:
    server.sendmail(fromaddrs, toaddrs, msg)
finally:
    server.quit()

Which basically sends a hello world message
Debugging gives me an indication of acceptance as seen below (and I as
googled it)
I tried the same from the University server, and I got the same results
almost!
However, no emails in my Inbox when I log in - PLS can you explain the
trouble?
Is it a firewall blocker?


It's probably a spam filter in action: you claim to be someone at Yahoo  
but you're not using a Yahoo server to send mail. Try using the Yahoo smtp  
mail servers to deliver your message, or use your ISP address as the  
sender if you use their SMTP. In any case, you should log in first with  
the appropriate username+password.
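
A hedged sketch of what an authenticated submission might look like with
the Python 3 standard library. The host, port, account and addresses are
placeholders to be replaced with the provider's real settings, and the
helper names are made up. The message is built with proper From, To and
Subject headers, which the bare "Hello World" string above lacks.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a message with the basic headers mail servers expect."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_authenticated(host, port, username, password, msg):
    """Log in over TLS and send; all connection details come from the caller."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()                  # upgrade to an encrypted session
        server.login(username, password)   # authenticate before sending
        server.send_message(msg)
```

The sender address passed to build_message should belong to the account
used for login; otherwise the spam-filter problem described above is
likely to remain.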


--
Gabriel Genellina



Re: help with threads

2009-08-06 Thread Loïc Domaigné
Hallo Michael,

> I have a simple application that needs one thread to manage networking
> in addition to the main "thread" that does the main job. It's not
> working right. I know hardly anything about threads, so I was hoping
> someone could point me in the right direction to research this.
>
> Basically, I have a program that does some computational work, and
> also conveys its status to a monitor program elsewhere on the network
> via sockets. I wanted to use a thread to manage the networking so that
> the main program can run without regard to networking (i.e. they would
> be asynchronous). So the network thread loops and calls select.
>
> My problem is that in some cases, the network thread appears to stop,
> while the main thread is doing a long computation.
>
> I'm hoping someone can give me a general idea what to read about. For
> example, under what conditions does a thread stop running? Can other
> threads "take priority"? Are there certain operations that block other
> threads (such as disk access)?

May I suggest:
http://www.dabeaz.com/python/GIL.pdf

HTH,
Loïc
--
My Blog: http://www.domaigne.com/blog

"There is only one problem with common sense; it’s not very common."
-- Milt Bryce


SMTP

2009-08-06 Thread Sarmad George
Good day all
I am new to the Python list
My question is here not Python as much as it is in servers

I have a code to send an email through my SMTP server - comcast Boston
fromaddrs = "**"
toaddrs = "sgeo...@coe.neu.edu"
msg = "Hello World"

server = smtplib.SMTP ('76.96.30.117', '25') #COMCAST BOSTON
server.set_debuglevel(1)
try:
    server.sendmail(fromaddrs, toaddrs, msg)
finally:
    server.quit()

Which basically sends a hello world message
Debugging gives me an indication of acceptance as seen below (and I as googled
it)
I tried the same from the University server, and I got the same results almost!
However, no emails in my Inbox when I log in - PLS can you explain the trouble?
Is it a firewall blocker?
Thanks a lot


send: 'ehlo [127.0.1.1]\r\n'
reply: '250-OMTA03.emeryville.ca.mail.comcast.net hello [98.216.11.39], pleased
to meet you\r\n'
reply: '250-HELP\r\n'
reply: '250-AUTH LOGIN PLAIN CRAM-MD5\r\n'
reply: '250-SIZE 15728640\r\n'
reply: '250-ENHANCEDSTATUSCODES\r\n'
reply: '250-8BITMIME\r\n'
reply: '250-STARTTLS\r\n'
reply: '250 OK\r\n'
reply: retcode (250); Msg: OMTA03.emeryville.ca.mail.comcast.net hello
[98.216.11.39], pleased to meet you
HELP
AUTH LOGIN PLAIN CRAM-MD5
SIZE 15728640
ENHANCEDSTATUSCODES
8BITMIME
STARTTLS
OK
send: 'mail FROM: size=11\r\n'
reply: '250 2.1.0  sender ok\r\n'
reply: retcode (250); Msg: 2.1.0  sender ok
send: 'rcpt TO:\r\n'
reply: '250 2.1.5  recipient ok\r\n'
reply: retcode (250); Msg: 2.1.5  recipient ok
send: 'data\r\n'
reply: '354 enter mail, end with "." on a line by itself\r\n'
reply: retcode (354); Msg: enter mail, end with "." on a line by itself
data: (354, 'enter mail, end with "." on a line by itself')
send: 'Hello World\r\n.\r\n'
reply: '250 2.0.0 REQK1c0070qYooa8PEQKQA mail accepted for delivery\r\n'
reply: retcode (250); Msg: 2.0.0 REQK1c0070qYooa8PEQKQA mail accepted for
delivery
data: (250, '2.0.0 REQK1c0070qYooa8PEQKQA mail accepted for delivery')
send: 'quit\r\n'
reply: '221 2.0.0 OMTA03.emeryville.ca.mail.comcast.net comcast closing
connection\r\n'
reply: retcode (221); Msg: 2.0.0 OMTA03.emeryville.ca.mail.comcast.net comcast
closing connection

Sarmad Edward George
Electrical Computer Eng
Northeastern Uni
Boston MA - USA



Re: Custom namespaces

2009-08-06 Thread Michele Simionato
On Aug 2, 6:46 am, Chris Rebert  wrote:
> > Is there any way to install a custom type as a namespace?
>
> For classes/objects, yes, using metaclasses.
> See the __prepare__() method in PEP 3115:
> http://www.python.org/dev/peps/pep-3115/

Here is an example of usage:

http://www.artima.com/weblogs/viewpost.jsp?thread=236234
http://www.artima.com/weblogs/viewpost.jsp?thread=236260
(yes, it only works for Python 3+)
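
For readers who want to see the mechanism without following the links,
here is a small self-contained sketch (Python 3 only). The metaclass
returns a custom dict subclass from __prepare__, and that object is then
used as the namespace while the class body executes. The class names
NoRedefinition, Strict and Settings are invented for the example.

```python
class NoRedefinition(dict):
    """Class-body namespace that refuses to bind the same name twice."""

    def __setitem__(self, key, value):
        if key in self and not key.startswith("__"):
            raise TypeError("name %r defined twice" % key)
        super().__setitem__(key, value)


class Strict(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Whatever is returned here becomes the namespace of the class body.
        return NoRedefinition()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Hand a plain dict to type.__new__ once the body has run.
        return super().__new__(mcls, name, bases, dict(namespace))


class Settings(metaclass=Strict):
    host = "localhost"
    port = 8080
```

A class body that assigns the same name twice under this metaclass fails
at definition time with a TypeError.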


Re: pylucene installation problem on Ubuntu 9.04

2009-08-06 Thread KK
On Aug 7, 12:38 am, Jon Clements  wrote:
> On 6 Aug, 19:49, KK  wrote:
>
>
>
> > hi all,
> > I've been trying to install pylucene on my linux box for the last 2
> > days but have not been able to do so. First I tried to install it
> > using apt-get like this:
> > kk-laptop$ sudo apt-get install pylucene
> > and it did install python2.5, python2.5-minimal and pylucene. I must
> > mention that I already had python2.6 on my box as the default python,
> > i.e. /usr/bin/python is linked to python2.6. Anyway, I then started
> > the python interpreter using the "python" command from the cli, and to
> > make sure pylucene had been installed I tried to import the module. To
> > my surprise it said "module pylucene not found".
> > I thought I should enter the python2.6 env and do the same, so I
> > tried starting the python2.6 interpreter using "python2.6" as the
> > command and tried importing the same module, and again it failed,
> > giving the same irritating message.
> > As a final try I pulled the source code of pylucene and, as per the
> > comments in the README file, copied the mentioned files to the
> > site-packages directory of python2.6 and then tried importing the
> > module, and got the same error message saying no module named
> > pylucene is present. I'm sick of this error!
> > Can someone point out what the issue is? If it is due to multiple
> > versions of python on the box, can someone tell me which one to
> > remove, or how to get the whole thing running? I'll be very
> > thankful to you guys.
>
> > Thanks,
> > KK.
>
> If you installed using apt, have you a pylucene directory under /usr/
> local/lib/python2.6/dist-packages/?
>
> Also, if you run python, and import sys; print sys.path
> whats it show?
>
> Jon.

Yes I've a directory called dist-packages under python2.6 but that
doesn't contain anything on pylucene [it has lupyne, which i installed
day before yesterday and importing lupyne doesn't give any error msg,
but again it is dependent on pylucene]
# for python the output is :
--
kk-laptop$ python
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print sys.path
['', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2',
 '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old',
 '/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/dist-packages',
 '/usr/lib/python2.6/dist-packages/PIL',
 '/usr/lib/python2.6/dist-packages/gst-0.10',
 '/var/lib/python-support/python2.6',
 '/usr/lib/python2.6/dist-packages/gtk-2.0',
 '/var/lib/python-support/python2.6/gtk-2.0',
 '/usr/local/lib/python2.6/dist-packages']
>>>
---

for python2.5 this is the output:
kk-laptop$ python2.5
Python 2.5.4 (r254:67916, Apr  4 2009, 17:55:16)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print sys.path
['', '/usr/lib/python2.5', '/usr/lib/python2.5/plat-linux2',
 '/usr/lib/python2.5/lib-tk', '/usr/lib/python2.5/lib-dynload',
 '/usr/local/lib/python2.5/site-packages',
 '/usr/lib/python2.5/site-packages',
 '/usr/lib/python2.5/site-packages/PIL',
 '/usr/lib/python2.5/site-packages/gst-0.10',
 '/var/lib/python-support/python2.5',
 '/usr/lib/python2.5/site-packages/gtk-2.0',
 '/var/lib/python-support/python2.5/gtk-2.0']
>>>


and for python2.6 the output is this:
kk-laptop$ python2.6
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print sys.path
['', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2',
 '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old',
 '/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/dist-packages',
 '/usr/lib/python2.6/dist-packages/PIL',
 '/usr/lib/python2.6/dist-packages/gst-0.10',
 '/var/lib/python-support/python2.6',
 '/usr/lib/python2.6/dist-packages/gtk-2.0',
 '/var/lib/python-support/python2.6/gtk-2.0',
 '/usr/local/lib/python2.6/dist-packages']


From all of the above what I can see is we don't have any directory
named dist-packages under python2.5 but we've one under 2.6, then
where did all those pylucene files get installed to after I installed
it using apt-get? Any ideas?

Thanks
KK


Re: Using Python to automate builds

2009-08-06 Thread Gabriel Genellina
On Thu, 06 Aug 2009 14:27:56 -0300, Kosta wrote:

On Aug 6, 3:57 am, David Cournapeau  wrote:

On Thu, Aug 6, 2009 at 12:39 AM, Kosta wrote:

> Setenv.bat sets up the path and other environment variables build.exe
> needs to compile and link (and even binplace) its utilities.  So
> building itself is not the issue.  The problem is that if I call
> setenv.bat from Python and then build.exe, but the modifications to
> the path (and other environment settings) are not seen by Python, so
> the attempt to build without a specified path fails.

It sounds like you do not propagate the environment when calling
setenv.bat from python. There is an option to do so in
subprocess.Popen init method, or you can define your own environment


My interpretation of the above (and your email) is that using Popen
allows one to pass the Python environment to a child process (in my
case, setenv.bat).   I need the reverse, to propagate from the child
to the parent.


In addition to just calling setenv, dump the modified environment. Parse  
the output into a dictionary that you can use as the env argument to later  
subprocess calls:


py> import subprocess
py> cmdline = '(call setenv >nul 2>&1 & set)'
py> p = subprocess.Popen(cmdline, stdout=subprocess.PIPE, shell=True)
py> env = dict(line.rstrip().split("=",1) for line in p.stdout)
py> p.wait()
0
py> env
{'TMP': 'D:\\USERDATA\\Gabriel\\CONFIG~1\\Temp', 'COMPUTERNAME': 'LEPTON',  
'HOMEDRIVE': 'D:', ...

py> env['FOO']='Salta Violeta!'
py> subprocess.call("echo %FOO%", shell=True, env=env)
Salta Violeta!
0
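
The parsing step in the session above can be wrapped in a small reusable
helper. This is a sketch in Python 3 syntax with invented function names;
it assumes the command ends by printing the environment as NAME=value
lines (as "set" does in cmd.exe, or "env" in a POSIX shell) and that no
value spans several lines.

```python
import subprocess


def parse_env_dump(text):
    """Turn NAME=value lines into a dict, skipping lines that do not fit."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip blank lines and entries with an empty name, such as the
        # "=C:=C:\..." pseudo-variables that cmd.exe can print.
        if sep and name:
            env[name] = value
    return env


def capture_env(command):
    """Run a shell command that dumps its environment and parse the output."""
    output = subprocess.check_output(command, shell=True,
                                     universal_newlines=True)
    return parse_env_dump(output)
```

On Windows the command string from the session above,
'(call setenv >nul 2>&1 & set)', could be passed to capture_env, and the
resulting dict used as the env argument for the later build.exe call.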

--
Gabriel Genellina



Pure python implementation of string.format?

2009-08-06 Thread nathan binkert
I was wondering if there was a pure python version of string.format
anywhere that could be used with python 2.4 and python 2.5.  Searching
has only turned up an early implementation done for pep 3101, but it
seems that it didn't get that far.

Thanks,

  Nate


need help with an egg

2009-08-06 Thread jo
Hi,

I am very new to python

I created an egg on a machine.  The Python version on that  is 2.5.
Copied that egg to a machine which has Python 2.6.

unzip -t Myproj-0.1-py2.5.egg
The above command shows all the files I need

When I run easy_install, I get the following error.  Is it because of
the version?  Or am I doing something wrong?  Or the way I understand
the egg works is wrong.  Can anyone please help?
If it's the version issue, does that mean I cannot use that egg on my
machine with 2.6 version?

Installed /usr/local/lib/python2.6/dist-packages/Myproj-0.1-py2.5.egg
Processing dependencies for Myproj==0.1
Searching for Myproj==0.1
Reading http://pypi.python.org/simple/Myproj/
Couldn't find index page for 'Myproj' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading http://pypi.python.org/simple/
No local packages or download links found for Myproj==0.1
error: Could not find suitable distribution for Requirement.parse
('Myproj==0.1')

Thank you
Jo


help with threads

2009-08-06 Thread Michael Mossey
Hello,

I have a simple application that needs one thread to manage networking
in addition to the main "thread" that does the main job. It's not
working right. I know hardly anything about threads, so I was hoping
someone could point me in the right direction to research this.

Basically, I have a program that does some computational work, and
also conveys its status to a monitor program elsewhere on the network
via sockets. I wanted to use a thread to manage the networking so that
the main program can run without regard to networking (i.e. they would
be asynchronous). So the network thread loops and calls select.

My problem is that in some cases, the network thread appears to stop,
while the main thread is doing a long computation.

I'm hoping someone can give me a general idea what to read about. For
example, under what conditions does a thread stop running? Can other
threads "take priority"? Are there certain operations that block other
threads (such as disk access)?

Thanks,
Mike


Re: Python3: Using sorted(key=...)

2009-08-06 Thread Johannes Bauer
MRAB wrote:
> Johannes Bauer wrote:
>> Hello list,
>>
>> I'm having trouble with an incredibly simple sort of a list containing
>> ints and tuples:
>>
>> def myorder(x):
>>     if type(x) == int:
>>         return x
>>     else:
>>         return x[0]
>>
>> odata = sorted([ (a, b) for (a, b) in data["description"].items() ],
>> key=myorder)
>>
> You're sorting a list of tuples (key/value pairs), so 'myorder' is
> always given a tuple.

Oh good lord! You're right... I meant

def myorder(x):
    if type(x[0]) == int:
        return x[0]
    else:
        return x[0][0]

Thanks for your help,
Kind regards,
Johannes

-- 
"Du bist einfach nur lächerlich! Mit solchen albernen und hohlen Sätzen
kannst du mir nicht imprägnieren."
-- Hobbycholeriker Jens Fittig aka Wolfgang Gerber in de.sci.electronics
 <4a6f44d0$0$12481$9b622...@news.freenet.de>


Re: unicode() vs. s.decode()

2009-08-06 Thread John Machin
Jason Tackaberry writes:

> On Thu, 2009-08-06 at 01:31 +, John Machin wrote:

> > Suggested further avenues of investigation:
> > 
> > (1) Try the timing again with "cp1252" and "utf8" and "utf_8"
> > 
> > (2) grep "utf-8" /Objects/unicodeobject.c
> 
> Very pedagogical of you. :)  Indeed, it looks like the bigger player in the
> performance difference is the fact that the code path for unicode(s,
> enc) short-circuits the codec registry for common encodings (which
> includes 'utf-8' specifically), whereas s.decode('utf-8') necessarily
> consults the codec registry.

So the next question (the answer to which may benefit all users
of .encode() and .decode()) is:

Why does consulting the codec registry take so long,
and can this be improved?
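
As background for that question, this sketch (Python 3 syntax) shows what
a registry lookup returns: every spelling of an encoding name is resolved
to one CodecInfo object with a canonical name. It only illustrates the
lookup step and makes no claim about where the time goes; the helper
name is invented.

```python
import codecs


def canonical_codec_name(spelling):
    """Resolve any accepted spelling of an encoding to its registry name."""
    return codecs.lookup(spelling).name
```

Wrapping a call such as canonical_codec_name("utf8") in timeit.Timer, in
the same way as the decode calls elsewhere in this thread, gives a rough
measure of the lookup cost on a given build.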





Re: Python3: Using sorted(key=...)

2009-08-06 Thread MRAB

Johannes Bauer wrote:

Hello list,

I'm having trouble with an incredibly simple sort of a list containing
ints and tuples:

def myorder(x):
    if type(x) == int:
        return x
    else:
        return x[0]

odata = sorted([ (a, b) for (a, b) in data["description"].items() ],
key=myorder)


You're sorting a list of tuples (key/value pairs), so 'myorder' is
always given a tuple.


still says:

Traceback (most recent call last):
  File "./genproto.py", line 81, in 
odata = sorted([ (a, b) for (a, b) in data["description"].items() ],
key=myorder)
TypeError: unorderable types: tuple() < int()

Why is that? Am I missing something very obvious?


Are some keys 'int' and others 'tuple'? In Python 3.x you can't compare
them except for equality:


Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit 
(Intel)] on win32

Type "help", "copyright", "credits" or "license" for more information.
>>> (1, ) < 1
False


Python 3.1 (r31:73574, Jun 26 2009, 20:21:35) [MSC v.1500 32 bit 
(Intel)] on win32

Type "help", "copyright", "credits" or "license" for more information.
>>> (1, ) < 1
Traceback (most recent call last):
  File "", line 1, in 
TypeError: unorderable types: tuple() < int()
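
One possible way around this in Python 3, as a minimal sketch: normalise every key to a tuple inside the key function, so that only tuples are ever compared. The sample `data` below is made up for illustration, and the sketch assumes the dictionary keys are either ints or tuples of ints.

```python
# Hypothetical sample data: keys are either ints or tuples of ints.
data = {"description": {3: "c", (1, 2): "a", 2: "b", (2, 5): "bb"}}

def myorder(item):
    """Sort key for (key, value) pairs: always return a tuple."""
    key = item[0]
    # Wrap plain ints in a 1-tuple so tuple and int are never compared directly.
    return key if isinstance(key, tuple) else (key,)

odata = sorted(data["description"].items(), key=myorder)
print(odata)
```

With this key function an int key `n` sorts as if it were `(n,)`, which places it before any longer tuple starting with the same number.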
--
http://mail.python.org/mailman/listinfo/python-list


Re: unicode() vs. s.decode()

2009-08-06 Thread Michael Ströder
Thorsten Kampe wrote:
> * Michael Ströder (Thu, 06 Aug 2009 18:26:09 +0200)
> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(1000)
>> 17.23644495010376
> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(1000)
>> 72.087096929550171
>>
>> That is significant! So the winner is:
>>
>> unicode('äöüÄÖÜß','utf-8')
> 
> Unless you are planning to write a loop that decodes "äöüÄÖÜß" one 
> million times, these benchmarks are meaningless.

Well, I can tell you I would not have posted and checked this here if it
were meaningless to me. You don't have to read and answer this thread if
it's meaningless to you.
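
For anyone who wants to repeat the measurement, here is a rough sketch of the same comparison. It is written for Python 3, where `unicode()` no longer exists, so `str(raw, "utf-8")` stands in for it; that substitution and the iteration count are assumptions, and the timings will differ from the Python 2 numbers quoted above.

```python
import timeit

# The same sample text as in the thread, as UTF-8 encoded bytes.
raw = "äöüÄÖÜß".encode("utf-8")

# Python 3 spellings of the two calls being compared.
t_constructor = timeit.timeit(lambda: str(raw, "utf-8"), number=100000)
t_method = timeit.timeit(lambda: raw.decode("utf-8"), number=100000)

print("str(raw, 'utf-8'):   %.4f s" % t_constructor)
print("raw.decode('utf-8'): %.4f s" % t_method)
```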

Ciao, Michael.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help with regex

2009-08-06 Thread John Machin
On Aug 7, 7:23 am, Ethan Furman  wrote:
> Nobody wrote:
> > On Thu, 06 Aug 2009 08:35:57 -0700, Robert Dailey wrote:
>
> >>I'm creating a python script that is going to try to search a text
> >>file for any text that matches my regular expression. The thing it is
> >>looking for is:
>
> >>FILEVERSION #,#,#,#
>
> >>The # symbol represents any number that can be any length 1 or
> >>greater. Example:
>
> >>FILEVERSION 1,45,10082,3
>
> >>The regex should only match the exact above. So far here's what I have
> >>come up with:
>
> >>re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )
>
> > [0-9]+ allows any number of leading zeros, which is sometimes undesirable.
> > Using:
>
> >    (0|[1-9][0-9]*)
>
> > is more robust.
>
> You make a good point about possibly being undesirable, but I question
> the assertion that your solution is /more robust/.  If the OP
> wants/needs to match numbers even with leading zeroes your /more robust/
> version fails.

I'd go further: the OP would probably be better off matching anything
that looked vaguely like an attempt to produce what he wanted e.g.
r"FILEVERSION\s*[0-9,]{3,}" and then taking appropriate action based
on whether that matched a "strictly correct" regex.
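
That two-step idea could look roughly like this. It is only a sketch: the loose pattern is the one suggested above, the strict pattern is the OP's regex, and the function name and return values are made up for illustration.

```python
import re

# Loose pattern: anything that looks like an attempt at a FILEVERSION line.
LOOSE = re.compile(r"FILEVERSION\s*[0-9,]{3,}")
# Strict pattern: the OP's regex, applied with fullmatch() below.
STRICT = re.compile(r"FILEVERSION (?:[0-9]+,){3}[0-9]+")

def check_line(line):
    """Return 'ok', 'malformed', or None if the line is unrelated."""
    candidate = LOOSE.search(line)
    if candidate is None:
        return None
    return "ok" if STRICT.fullmatch(candidate.group()) else "malformed"

for sample in ["FILEVERSION 1,45,10082,3", "FILEVERSION 1,45,,3", "PRODUCTNAME foo"]:
    print(repr(sample), "->", check_line(sample))
```

The caller can then warn about "malformed" lines instead of silently skipping them.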

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Subclassing Python's dict

2009-08-06 Thread Gabriel Genellina

En Thu, 06 Aug 2009 05:26:22 -0300, Xavier Ho 
escribió:

On Thu, Aug 6, 2009 at 1:19 PM, alex23  wrote:

Xavier Ho wrote:
> You should subclass collections.UserDict, and not the default dict class.


Xavier, why do you think that is the correct approach?


I'll be honest first and say that I do not completely understand how dict is
implemented in the underlying C structure. But as Bruno had already
mentioned, dict has a slightly different behaviour than we'd expect. For
example, the __getitem__() function isn't actually used by the interpreter
(which, you know, *can* be a problem.)


Things have evolved...
Before Python 2.2, builtin types were not subclassable. You could not
inherit from dict. In order to write another mapping class, you had to
write the complete interface - or inherit from UserDict, that was a
concrete class implementing the mapping protocol.

Later, DictMixin was added (in 2.3) and it made it easier to write other
mapping classes: one had to write the most basic methods (__getitem__ /
__setitem__ / __delitem__ / keys) and the DictMixin provided the remaining
functionality (e.g. values() is built from keys() plus __getitem__). Later
releases allowed an even more modular approach, and until 2.5 DictMixin
was the recommended approach.

Then came 3.0/2.6 and PEP3119 defining a rich hierarchy of abstract base
classes; a normal dictionary implements the MutableMapping ABC and this is
the preferred approach now (the MutableMapping implementation is very
similar to the original DictMixin, but builds on the other base classes
like Sized, Iterable...)
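
A minimal sketch of the MutableMapping approach, written for current Python 3 where the ABC lives in `collections.abc` (in 2.6/3.0 it was importable from `collections`). The class name and its lower-casing behaviour are invented for illustration, and keys are assumed to be strings.

```python
from collections.abc import MutableMapping

class LowerKeyDict(MutableMapping):
    """Hypothetical mapping that stores string keys lower-cased."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # update() is supplied by MutableMapping and goes through __setitem__.
        self.update(*args, **kwargs)

    # The five abstract methods; everything else is derived from them.
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

d = LowerKeyDict(Spam=1)
d.update({"EGGS": 2})
print(d["SPAM"], d.get("eggs"), "Spam" in d, sorted(d))
```

Methods such as get(), update(), __contains__ and items() come from the ABC and all route through the overridden __getitem__ / __setitem__.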


I didn't realise they took UserDict out in later versions (2.6, for
example), and put it back in Python 3.0. Does anyone know why?


UserDict still exists on both releases (collections.UserDict on 3.x), but
it's not the preferred approach to implement a new mapping class anymore.

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: Help making this script better

2009-08-06 Thread Gabriel Genellina
En Thu, 06 Aug 2009 11:50:07 -0300, jakecjacobson  
 escribió:



After much Google searching and trial & error, I was able to write a
Python script that posts XML files to a REST API using HTTPS and
passing PEM cert & key file.  It seems to be working but would like
some pointers on how to handle errors.  I am using Python 2.4, I don't
have the capability to upgrade even though I would like to.  I am very
new to Python so help will be greatly appreciated and I hope others
can use this script.


Just a few remarks, mostly on style rather than functionality:


#!/usr/bin/python
#
# catalog_feeder.py
#
# This script will process a directory of XML files and push them to
# the Enterprise Catalog.
#  You configure this script by using a configuration file that
#  describes the required variables.
#  The path to this file is either passed into the script as a command
#  line argument or hard coded
#  in the script.  The script will terminate with an error if it can't
#  process the XML file.
#


Note that Python has "docstrings" - the __doc__ attribute attached to  
every module, class and function. The very first string in the  
module/function/class becomes its docstring. The interactive help system  
-and other tools like pydoc- can inspect and show such info.
The above comment could serve perfectly as this module's docstring - just
remove all the #'s and enclose the whole text in """triple quotes"""
(required as it spans many lines).
For example, in that case you could print the text in your usage() function
like this:

print __doc__


try:
    # Process the XML conf file
    xmldoc = minidom.parse(c)
    catalog_host = readConfFile(xmldoc, 'catalog_host')
    catalog_port = int(readConfFile(xmldoc, 'catalog_port'))
    catalog_path = readConfFile(xmldoc, 'catalog_path')
    collection_name = readConfFile(xmldoc, 'collection_name')
    cert_file = readConfFile(xmldoc, 'cert_file')
    key_file = readConfFile(xmldoc, 'key_file')
    log_file = readConfFile(xmldoc, 'log_file')
    input_dir = readConfFile(xmldoc, 'input_dir')
    archive_dir = readConfFile(xmldoc, 'archive_dir')
    hold_dir = readConfFile(xmldoc, 'hold_dir')
except Exception, inst:
    # I had an error so report it and exit script
    print "Unexpected error opening %s: %s" % (c, inst)
    sys.exit(1)


Ok, an unexpected error: but *which* one? doing exactly *what*? You're  
hiding important information (the error type, the full traceback, the  
source line that raised the error...) that's very valuable when something  
goes wrong and you have to determine what happened.
In this case, you're adding a bit of information: the name of the file  
being processed. That's good. But at the same time, hiding all the other  
info, and that's not so good. What about this:


except Exception:
    print >>sys.stderr, "Unexpected error opening %s" % c
    raise

The final raise will propagate the exception; by default, Python will  
print the exception type, the exception value, the full traceback  
including source lines, and exit the script with a status code of 1. The  
same effect that you intended, but more complete.
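
The same pattern as a self-contained sketch, written in Python 3 syntax so it can be run as-is on a current interpreter (the original script targets 2.4). The function name `load_config` and the file path are hypothetical.

```python
import sys
from xml.dom import minidom

def load_config(path):
    """Parse the XML configuration file, adding context if it fails."""
    try:
        return minidom.parse(path)
    except Exception:
        # Add the one thing we know here (the file name), then re-raise so
        # the exception type and full traceback are preserved.
        sys.stderr.write("Unexpected error opening %s\n" % path)
        raise

# Demonstration only: a real script would let the exception propagate.
try:
    load_config("no_such_dir/no_such_file.xml")
except OSError as exc:
    print("caller still sees:", type(exc).__name__)
```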


In other cases, where you don't have anything to add to the default  
exception handling, the best thing to do is: nothing. That is, don't catch  
an exception unless you have something good to do with it. (Exceptions  
aren't Pokémon: you don't have to "catch 'em all!")



# Log Starting
logOut = verifyLogging(log_file)
if logOut:
    log(logOut, "Processing Started ...")


I would move the decision into the log function (that is, just write  
log("something") and make the log function decide whether to write to file  
or not). For your next script, look at the logging module:  
http://docs.python.org/library/logging.html
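
A minimal sketch of what that could look like with the logging module. The function name `setup_logging`, the logger name and the format string are assumptions; `log_file=None` is used here to mean "log to stderr".

```python
import logging

def setup_logging(log_file=None):
    """Configure logging once; with log_file=None messages go to stderr."""
    logging.basicConfig(
        filename=log_file,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )

setup_logging()
log = logging.getLogger("catalog_feeder")

# Callers just log; where the message ends up is decided in one place.
log.info("Processing Started ...")
```

The rest of the script then never needs to check whether a log file is available before writing a message.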



# Get an array of files from the input_dir
def getFiles2Post(d):
    return (os.listdir(d))


Note that return isn't a function but a statement. You don't need the  
outer (). Also, using a docstring instead of a comment:


def getFiles2Post(input_dir):
    "Return the list of files in input_dir to process"
    return os.listdir(input_dir)


# Read out the conf file and set the needed global variable
def readConfFile(xmldoc, tag):
    return (xmldoc.getElementsByTagName(tag)[0].firstChild.data)


Same as above. Which "needed global variable"?


def cleanup(logOut):
  [...] sys.exit(0)


Exiting the script from everywhere makes it harder to reuse some of its  
functions later. Just return the desired status code to the caller, which  
in turn will ret

Re: heapq "key" arguments

2009-08-06 Thread Joshua Bronson
On Aug 3, 1:36 pm, Raymond Hettinger  wrote:
> [Joshua Bronson]:
>
> > According to http://docs.python.org/library/heapq.html, Python 2.5
> > added an optional "key" argument to heapq.nsmallest and
> > heapq.nlargest. I could never understand why they didn't also add a
> > "key" argument to the other relevant functions (heapify, heappush,
> > etc).
>
> The problem is that heapq acts on regular lists, so it does not
> have exclusive access to the structure.  So, there is no reliable
> way for it to maintain a separate list of keys.  Since the keys
> can't be saved in the structure (without possibly breaking other
> code), the fine grained heapq functions (like heappop and heappush)
> would need to call key functions every time they are invoked.
> This is at odds with the implicit guarantee of the key function
> that it will be called no more than once per key.
>
> The overall problem is one of granularity.  A key function should
> be applied once in an initial pass, not on every call to a push/pop
> function.  The everyday solution that most people use is to operate
> on a list of (key, record) tuples and let tuple comparison do the
> work for you.  Another solution is to build a Heap class that does
> have exclusive access to the structure, but the API sugar often
> isn't worth the slightly weaker performance.
>
> One other thought.  Heaps are a lazy evaluation structure, so their
> fine-grained mutation functions only work well with just a single
> ordering function, so there is no need to have (and every reason
> to avoid) changing key functions in mid-stream.  IOW, the key
> function needs to be constant across all accesses.  Contrast this
> with other uses of key functions where it makes perfect sense
> to run minage=min(data, key=attrgetter('age')) and then running
> minsal=min(data, key=attrgetter('salary')).  The flexibility to
> change key functions just doesn't make sense in the context of
> the fine-grained heap functions.
>
> Accordingly, this is why I put key functions in nlargest() and
> nsmallest() but not in heappush() and friends.  The former can
> guarantee no more than one key function call per entry and they
> evaluate immediately instead of lazily.
>
> Raymond


I see, that makes sense. Thanks for the great explanation.
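
For the archives, the "(key, record) tuples" approach could be sketched like this. The sample records are made up, and the counter used as a tie-breaker is my own addition (so that equal keys never fall through to comparing the records themselves).

```python
import heapq
from itertools import count

# Hypothetical records, to be ordered by age.
records = [
    {"name": "ann", "age": 41},
    {"name": "bob", "age": 29},
    {"name": "cy", "age": 35},
]

heap = []
tiebreak = count()  # assumption: keeps dicts from being compared on equal keys
for rec in records:
    # The key is computed exactly once, when the record enters the heap.
    heapq.heappush(heap, (rec["age"], next(tiebreak), rec))

order = []
while heap:
    age, _, rec = heapq.heappop(heap)
    order.append(rec["name"])
    print(age, rec["name"])
```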

Josh
-- 
http://mail.python.org/mailman/listinfo/python-list


Python3: Using sorted(key=...)

2009-08-06 Thread Johannes Bauer
Hello list,

I'm having trouble with a incredibly simple sort of a list containing
ints and tuples:

def myorder(x):
    if type(x) == int:
        return x
    else:
        return x[0]

odata = sorted([ (a, b) for (a, b) in data["description"].items() ],
key=myorder)

still says:

Traceback (most recent call last):
  File "./genproto.py", line 81, in 
odata = sorted([ (a, b) for (a, b) in data["description"].items() ],
key=myorder)
TypeError: unorderable types: tuple() < int()

Why is that? Am I missing something very obvious?

Kind regards,
Johannes

-- 
"Du bist einfach nur lächerlich! Mit solchen albernen und hohlen Sätzen
kannst du mir nicht imprägnieren."
-- Hobbycholeriker Jens Fittig aka Wolfgang Gerber in de.sci.electronics
 <4a6f44d0$0$12481$9b622...@news.freenet.de>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Setuptools - help!

2009-08-06 Thread Robert Kern

On 2009-08-06 18:04, Peter Chant wrote:

Chaps,

any ideas, I'm floundering - I don't quite get it.  I have the following
files, setup.py and main.py in a directory pphoto:

# more setup.py
from setuptools import setup, find_packages
setup(
 name = "Pphoto",
 version = "0.1",
 packages = find_packages(),

 # other arguments here...
 entry_points = {'console_scripts': ['foo = pphoto.main:HelloWorld',]}


)

bash-3.1# more main.py


def HelloWorld():
 print "Hello World!"

print "Odd world"



From various websites that should produce a script foo that runs HelloWorld.

It does produce a script that simply crashes.

bash-3.1# foo
Traceback (most recent call last):
   File "/usr/bin/foo", line 8, in
 load_entry_point('Pphoto==0.1', 'console_scripts', 'foo')()
   File "build/bdist.linux-i686/egg/pkg_resources.py", line 277, in
load_entry_point
   File "build/bdist.linux-i686/egg/pkg_resources.py", line 2098, in
load_entry_point
   File "build/bdist.linux-i686/egg/pkg_resources.py", line 1831, in load
ImportError: No module named pphoto.main
bash-3.1#


Note, doing this as root as it seems not to do anything useful at all if I
run python setup develop as a user.

Any ideas?  I must be missing something fundamental?


You need to put main.py into the pphoto package.

$ mkdir pphoto/
$ mv main.py pphoto/
$ touch pphoto/__init__.py

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco

--
http://mail.python.org/mailman/listinfo/python-list


Setuptools - help!

2009-08-06 Thread Peter Chant
Chaps,

any ideas, I'm floundering - I don't quite get it.  I have the following
files, setup.py and main.py in a directory pphoto:

# more setup.py
from setuptools import setup, find_packages
setup(
name = "Pphoto",
version = "0.1",
packages = find_packages(),

# other arguments here...
entry_points = {'console_scripts': ['foo = pphoto.main:HelloWorld',]}


)

bash-3.1# more main.py


def HelloWorld():
    print "Hello World!"

print "Odd world"


From various websites that should produce a script foo that runs HelloWorld.
It does produce a script that simply crashes.

bash-3.1# foo
Traceback (most recent call last):
  File "/usr/bin/foo", line 8, in 
load_entry_point('Pphoto==0.1', 'console_scripts', 'foo')()
  File "build/bdist.linux-i686/egg/pkg_resources.py", line 277, in
load_entry_point
  File "build/bdist.linux-i686/egg/pkg_resources.py", line 2098, in
load_entry_point
  File "build/bdist.linux-i686/egg/pkg_resources.py", line 1831, in load
ImportError: No module named pphoto.main
bash-3.1#


Note, doing this as root as it seems not to do anything useful at all if I
run python setup develop as a user. 

Any ideas?  I must be missing something fundamental?

Pete


-- 
http://www.petezilla.co.uk
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: XML flaw

2009-08-06 Thread Mark Lawrence

MRAB wrote:

Hi all,

I've just read this article, which mentions Python:

XML flaw threatens millions of apps with DoS attacks
http://infoworld.com/print/86340

Something to worry about?

No.  Discussing letting Tom, Dick and Harriet loose on the Python
documentation is far more important than trivial issues like Denial of
Service attacks via XML.


Sorry, just couldn't resist.

--
Kindest regards.

Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list


Re: XML flaw

2009-08-06 Thread Chris Rebert
On Thu, Aug 6, 2009 at 3:05 PM, MRAB wrote:
> Hi all,
>
> I've just read this article, which mentions Python:
>
> XML flaw threatens millions of apps with DoS attacks
> http://infoworld.com/print/86340
>
> Something to worry about?

More detailed article:
http://blogs.zdnet.com/open-source/?p=4609

Quote:
"If you own any of the following libraries you need to be alert and
ready to patch:
* Python libexpat"
AKA xml.parsers.expat AKA pyexpat

The good news: "[a patch] is in process for Python."

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help with regex

2009-08-06 Thread Ethan Furman

Nobody wrote:

On Thu, 06 Aug 2009 08:35:57 -0700, Robert Dailey wrote:



I'm creating a python script that is going to try to search a text
file for any text that matches my regular expression. The thing it is
looking for is:

FILEVERSION #,#,#,#

The # symbol represents any number that can be any length 1 or
greater. Example:

FILEVERSION 1,45,10082,3

The regex should only match the exact above. So far here's what I have
come up with:

re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )



[0-9]+ allows any number of leading zeros, which is sometimes undesirable.
Using:

(0|[1-9][0-9]*)

is more robust.


You make a good point about possibly being undesirable, but I question 
the assertion that your solution is /more robust/.  If the OP 
wants/needs to match numbers even with leading zeroes your /more robust/ 
version fails.


~Ethan~

--
http://mail.python.org/mailman/listinfo/python-list


Re: Re: Web page data and urllib2.urlopen

2009-08-06 Thread Dave Angel



Piet van Oostrum wrote:

DA> If Mozilla had seen a page with this line in an appropriate place, it'd
DA> immediately begin loading the other page, at "someotherurl"  But there's no
DA> such line.

DA> Next, I looked for javascript.  The Mozilla page contains lots of
DA> javascript, but there's none in the raw page.  So I can't explain Mozilla's
DA> differences that way.

DA> I did notice the link to /m/Content/mobile2.css, but I don't know any way
DA> a CSS file could cause the content to change, just the display.

DA> All I can guess is that it has something to do with "browser type" or
DA> cookies.  And that would make lots of sense if this was a cgi page.  But
DA> the URL doesn't look like that, as it doesn't end in pl, py, asp, or any of
DA> another dozen special suffixes.

DA> Any hints, anybody???

If you look into the HTML that Firefox gets, there is a lot of
javascript in it.
  


But the raw page didn't have any javascript.  So what about that 
original raw page triggered additional stuff to be loaded?
Is it "user agent", as someone else brought out?  And is there somewhere 
I can read more about that aspect of things?  I've mostly built very 
static html pages, where the server yields the same page to everybody.  
And some form stuff, where the user clicks on a "submit" button to
trigger a script that's not shown on the URL line.
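
For reference, here is a small sketch of how a script can present a browser-like User-Agent, which is one way to test whether the server varies its response by client. It uses Python 3's `urllib.request` (the successor to urllib2); the URL and the User-Agent string are placeholders, not the values from this thread.

```python
from urllib.request import Request

url = "http://www.example.com/"  # placeholder, not the site discussed here
browser_like = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"  # made up

# Attach the header to the request object before opening it.
req = Request(url, headers={"User-Agent": browser_like})
print(req.get_header("User-agent"))

# To actually fetch the page (needs network access):
#     from urllib.request import urlopen
#     html = urlopen(req).read()
```

Comparing the page fetched with and without the header would show whether the server is choosing content based on it.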




--
http://mail.python.org/mailman/listinfo/python-list


Re: Unexpected side-effects of assigning to sys.modules[__name__]

2009-08-06 Thread Steven D'Aprano
On Thu, 06 Aug 2009 20:01:42 +0200, Jean-Michel Pichavant wrote:


> > I'm completely perplexed by this behaviour. sys.modules() seems to be
> > a regular dict, at least according to type(), and yet assigning to an
> > item of it seems to have unexpected, and rather weird, side-effects.
> >
> > What am I missing?
> >
> >
> >
> >
> Maybe when you assign 123 to sys.modules[__name__], you've removed the
> last reference on  and it is
> garbage collected. You are then losing all your initial namespace.

By Jove, I think you've got it! How obvious in hindsight. Thank you.





-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


XML flaw

2009-08-06 Thread MRAB

Hi all,

I've just read this article, which mentions Python:

XML flaw threatens millions of apps with DoS attacks
http://infoworld.com/print/86340

Something to worry about?
--
http://mail.python.org/mailman/listinfo/python-list


Re: os.path.exists() and Samba shares

2009-08-06 Thread BDZ
On Jul 31, 10:56 pm, "Gabriel Genellina" 
wrote:
> En Fri, 31 Jul 2009 13:33:45 -0300, BDZ  escribió:
>
>
>
> > On Jul 30, 4:41 pm, Loïc Domaigné 
> > wrote:
> >> > Hello. I have written a Python 3.1 script running on Windows that uses
> >> > os.path.exists() to connect to network shares. If the various network
> >> > shares require different user account and password combos than the
> >> > account the script is running under the routine returns false. I need
> >> > something like os.samba.path.exists(username,password,path). Does
> >> > anyone have a suggestion on how I can accomplish what I need to do in
> >> > Python?
>
> >> Could the Python Samba module PySamba be interesting for you?
> >> http://sourceforge.net/projects/pysamba/
>
> > Unfortunately, although it has the calls I'd want, pysamba appears to
> > be *nix only. I need something that will work under Windows. Is there
> > a set of Python Windows functions (official or contributed) that might
> > do what I need? (I'm new to Python)
>
> SAMBA is a Linux implementation of the SMB protocol, natively supported on
> Windows. You may use the pywin32 package (available on sourceforge.net) to
> call the WNetAddConnection2 Windows function:
> http://msdn.microsoft.com/en-us/library/aa385413(VS.85).aspx
>
> --
> Gabriel Genellina

The WNetAddConnection2 function under pywin32 seems to work. I am able
to make connections to various SMB network resources hosted by
Windows, Mac, and Linux boxes. It has the annoying side effect of
opening a connection.

There is a Win32 function called NetShareCheck. It sounds perfect. It
just checks that the share exists (no connection left open) and does
not require username or password. Unfortunately it fails for Mac and
Linux SMB resources. Just FYI.

I understand Samba and Windows SMB are not the same thing. What I was
hoping for when I investigated pySamba was to find a python module/
extension that supported a simple SMB interface and would run from any
host platform.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Seeding the rand() Generator

2009-08-06 Thread Nils Ruettershoff

Hi Fred,

I just saw your SQL Statement

An example would be: SELECT first, second, third, fourth,
fifth, sixth from sometable order by rand() limit 1

  
and I feel obliged to give you some advice. Don't use this SQL
statement to pick a random row; your users and maybe your DBA will much
appreciate it.
You are certainly asking why. Let's have a brief look at what you are asking
your MySQL DB to do:

Fetch all rows from 'sometable', but only the attributes 'first,
second, ...', sort them all in random order, and afterwards throw
away everything you did before except the first row.

If you have a table with 10 rows you would fetch and sort up to
10 rows, pick one row and discard up to 9 rows. That does not sound
very clever, right?

So please take a look at this site for a better alternative to
that approach:


http://jan.kneschke.de/projects/mysql/order-by-rand/

if you want to know more please check this article too:

http://jan.kneschke.de/2007/2/22/analyzing-complex-queries
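
To give a rough idea of the alternative, here is a sketch that picks the random position on the Python side and then fetches exactly one row. It is not the method from the linked article, and it uses an in-memory SQLite table in place of MySQL so that it runs on its own; the table contents are made up.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (id INTEGER PRIMARY KEY, first TEXT)")
conn.executemany(
    "INSERT INTO sometable (first) VALUES (?)",
    [("row %d" % i,) for i in range(1000)],
)

# Ask the database only for the row count, choose an offset in Python,
# then fetch a single row instead of sorting the whole table.
(total,) = conn.execute("SELECT COUNT(*) FROM sometable").fetchone()
offset = random.randrange(total)
row = conn.execute(
    "SELECT id, first FROM sometable LIMIT 1 OFFSET ?", (offset,)
).fetchone()
print(row)
```

Calling `random.seed()` with a fixed value before `randrange()` makes the choice repeatable, which can be handy for testing.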

regards,

Nils

--
http://mail.python.org/mailman/listinfo/python-list


Re: Subclassing Python's dict

2009-08-06 Thread Raymond Hettinger
> Xavier Ho wrote:
> > You should subclass collections.UserDict, and not the default dict class.
> > Refer to the collections module.
>
> Xavier, why do you think that is the correct approach? The docs say
> "The need for this class has been largely supplanted by the ability to
> subclass directly from dict (a feature that became available starting
> with Python version 2.2)."

UserDict can be a good choice because the pure python source makes
it clear exactly what needs to be overridden (you can see which
methods are implemented in terms of lower level methods and which
ones access the underlying dict directly).

Another choice is to use DictMixin or MutableMapping and fill-in just
the required abstract methods.  This approach is simple and flexible.
It allows you to wrap a mapping interface around many different classes
(a dbm for example).  The disadvantage is that it can be slow.

Subclassing a dict is typically done when the new class has to be
substitutable for real dicts (perhaps an API enforces a check
for isinstance(x, dict) or somesuch).  As the OP found out, the
dict methods all access the underlying structure directly, so
you will need to override *all* methods that need to have a new
behavior.  The remaining methods are inherited and run at C speed,
so performance may dictate this approach.

So, there you have three ways to do it.  In Py3.1, we used the latter
approach for collections.Counter() -- that gives a high speed on the
inherited methods.  For collections.OrderedDict, a hybrid approach
was used (subclassing from both dict and MutableMapping).  Most of
the work is done by MutableMapping and the dict is inherited so that
the OrderedDict objects would be substitutable anywhere regular
dicts are expected.  And IIRC, there are still some cases of UserDict
being used in the python source (situations where subclassing from
dict wouldn't work as well).
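
To illustrate the point about dict methods accessing the underlying structure directly, here is a small sketch (written for Python 3 and describing CPython's behaviour). The class name and the `seen` attribute are invented for the example.

```python
class LoudDict(dict):
    """Hypothetical dict subclass recording keys set via __setitem__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def __setitem__(self, key, value):
        self.seen.append(key)
        super().__setitem__(key, value)

d = LoudDict()
d["a"] = 1             # goes through the override
d.update(b=2)          # in CPython, dict.update() bypasses __setitem__
d.setdefault("c", 3)   # so does setdefault()
print(d.seen, sorted(d), isinstance(d, dict))
```

Only the key "a" is recorded, so update() and setdefault() would each need their own override to get the new behaviour.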


Raymond

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to write replace string for object which will be substituted? [regexp]

2009-08-06 Thread Chris Rebert
On Thu, Aug 6, 2009 at 2:03 PM, Ryniek90 wrote:
>> 2) If you really want to learn regexes, get a copy of _Mastering Regular
>> Expressions_ by Friedl (either 2nd or 3rd edition)
>>
>
> I made preview of that book, but some pages are disabled from preview. Has
> that book topics about Python regexp's?
> If so, i must buy it.

Yes, Python is among the language APIs covered.

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: os.walk()

2009-08-06 Thread Chris Rebert
> On Tue, Aug 4, 2009 at 10:48 PM, Chris Rebert  wrote:
>> On Tue, Aug 4, 2009 at 7:06 PM, Michael Savarese wrote:
>> > Greetings
>> > Python newbie here, and thanks to all who have helped me previously.
>> > Is there a way of grabbing file attributes while traversing with os.walk()?
>> > It would be advantageous to have date modified and file size along with the
>> > file name.
>> > If anyone can point me in the right direction, I'd appreciate it.
>>
>> Feed the path to os.stat(), and then use the `stat` module on the result:
>> http://docs.python.org/library/os.html#os.stat
>> http://docs.python.org/library/stat.html#module-stat

2009/8/5 Michael Savarese :
> Chris, thanks for the info.
> I'm a bit stuck here.
>
> am i close?

Yes, you just need to plug some more bricks together (as it were).

> import os, sys
> import os.path
>
> for root, dirs, files in os.walk('c:/Temp'):
>     for name in files:
>         statinfo=os.stat(name)

        #see http://docs.python.org/library/os.path.html#os.path.join
        filepath = os.path.join(root, name)
        statinfo = os.stat(filepath)

>         print root,dirs,name,statinfo.st_size ; it gets stuck here, i guess
> it needs the full path.
> is this where i use the join function to bring root, dirs, and filename
> together? I kinda suck at that too, can you point me in the right direction?
>
> also:
> statinfo.st_mtime
> 1247778166.6563497  can i have a hint on how to convert this?

That is the time represented in seconds since the (UNIX) epoch.
Use the functions in the `time` module to convert it to something more
palatable:
http://docs.python.org/library/time.html
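
Putting the pieces together, a sketch might look like the following. It is written for Python 3; the function name `list_files`, the date format and the throwaway demo directory are my own choices for illustration.

```python
import os
import tempfile
import time

def list_files(top):
    """Yield (path, size_in_bytes, modified_time_string) for files under top."""
    for root, dirs, files in os.walk(top):
        for name in files:
            filepath = os.path.join(root, name)  # os.stat needs the full path
            statinfo = os.stat(filepath)
            modified = time.strftime(
                "%Y-%m-%d %H:%M:%S", time.localtime(statinfo.st_mtime)
            )
            yield filepath, statinfo.st_size, modified

# Demo on a throwaway directory (hypothetical file name and contents).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "example.txt"), "w") as handle:
    handle.write("hello")

results = list(list_files(demo_dir))
for filepath, size, modified in results:
    print(filepath, size, modified)
```

Replace `demo_dir` with 'c:/Temp' to walk the directory from the question.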

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


How to write replace string for object which will be substituted? [regexp]

2009-08-06 Thread Ryniek90



2) If you really want to learn regexes, get a copy of _Mastering Regular
Expressions_ by Friedl (either 2nd or 3rd edition)
  


I made preview of that book, but some pages are disabled from preview. 
Has that book topics about Python regexp's?

If so, i must buy it.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Subclassing Python's dict

2009-08-06 Thread Raymond Hettinger
> Are you referring to Python 3.0?  Python 2.6 does not have
> collections.UserDict

In Python2.6, it is in its own module.

>>> from UserDict import UserDict


Raymond
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Character encoding & the copyright symbol

2009-08-06 Thread Dave Angel

Robert Dailey wrote:

Hello,

I'm loading a file via open() in Python 3.1 and I'm getting the
following error when I try to print the contents of the file that I
obtained through a call to read():

UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
position 1650: character maps to <undefined>

The file is defined as ASCII and the copyright symbol shows up just
fine in Notepad++. However, Python will not print this symbol. How can
I get this to work? And no, I won't replace it with "(c)". Thanks!

  
I see others have alerted you to changes needed in stdout, which is 
ASCII coded by default.
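
A small sketch of what is going on, for Python 3. It assumes the console uses a code page such as cp437 that has no copyright sign; the sample text is made up.

```python
import sys

text = "Copyright \xa9 2009"

# cp1252 can represent the symbol as the single byte 0xA9 ...
print(text.encode("cp1252"))

# ... but cp437, a common Windows console code page, cannot.
try:
    text.encode("cp437")
except UnicodeEncodeError as exc:
    print("cp437 fails:", exc.reason)

# One workaround: degrade gracefully for whatever encoding stdout uses.
encoding = getattr(sys.stdout, "encoding", None) or "ascii"
printable = text.encode(encoding, errors="replace").decode(encoding)
print(printable)
```

With `errors="replace"` any character the console cannot show comes out as "?" instead of raising an exception.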


But I wanted to comment on the (c) remark.  If you're in the US, that's 
the wrong abbreviation for copyright.  The only recognized abbreviation 
is (copr).


DaveA
--
http://mail.python.org/mailman/listinfo/python-list


Re: Using Python to automate builds

2009-08-06 Thread Kosta
On Aug 6, 11:58 am, Piet van Oostrum  wrote:
> > Kosta  (K) wrote:
> >K> My interpretation of the above (and your email) is that using Popen
> >K> allows one to pass the Python environment to a child processs (in my
> >K> case, setenv.bat).   I need the reverse, to propagate from the child
> >K> to the parent.
>
> I don't think there is any modern OS that allows that. Unless you use
> your own protocol of course, like letting the child write the
> environment to its stdout and reading and interpreting it in the parent.
> --
> Piet van Oostrum 
> URL:http://pietvanoostrum.com[PGP 8DAE142BE17999C4]
> Private email: p...@vanoostrum.org

Piet,

Yes you are correct.  Thinking more about parent/child processes and
what I am doing: opening a cmd window (its own process), starting up
Python (a child process), and then attempting to run setenv.bat (a
child process to Python), and yes I'm out of luck.
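
For the record, the "own protocol" idea Piet describes could be sketched as below (Python 3). The child here is a stand-in Python one-liner, not setenv.bat, and BUILD_FLAVOR is a made-up variable; on Windows one would presumably run the batch file followed by `set` through cmd and parse that output the same way. Values containing newlines are not handled.

```python
import os
import subprocess
import sys

# Stand-in for setenv.bat: a child that changes its own environment and
# then dumps it to stdout as NAME=value lines.
child_code = (
    "import os\n"
    "os.environ['BUILD_FLAVOR'] = 'checked'\n"
    "for k, v in os.environ.items():\n"
    "    print('%s=%s' % (k, v))\n"
)
output = subprocess.check_output(
    [sys.executable, "-c", child_code], universal_newlines=True
)

# Parse the dump and copy it into the parent's environment.
child_env = {}
for line in output.splitlines():
    name, sep, value = line.partition("=")
    if sep and name:
        child_env[name] = value

os.environ.update(child_env)
print(os.environ["BUILD_FLAVOR"])
```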
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to write replace string for object which will be substituted? [regexp]

2009-08-06 Thread Aahz
In article ,
ryniek90   wrote:
>
>I started learning regexp, and some things goes well, but most of them 
>still not.

1) 'Some people, when confronted with a problem, think "I know, I'll use
regular expressions."  Now they have two problems.'
--Jamie Zawinski, comp.emacs.xemacs, 8/1997

2) If you really want to learn regexes, get a copy of _Mastering Regular
Expressions_ by Friedl (either 2nd or 3rd edition)
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

"...string iteration isn't about treating strings as sequences of strings, 
it's about treating strings as sequences of characters.  The fact that
characters are also strings is the reason we have problems, but characters 
are strings for other good reasons."  --Aahz
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using Python to automate builds

2009-08-06 Thread Aahz
In article ,
Hendrik van Rooyen   wrote:
>
>Bit slow - but hey, nobody's perfect.

YM "pobody's nerfect" HTH HAND
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

"...string iteration isn't about treating strings as sequences of strings, 
it's about treating strings as sequences of characters.  The fact that
characters are also strings is the reason we have problems, but characters 
are strings for other good reasons."  --Aahz


Re: Character encoding & the copyright symbol

2009-08-06 Thread Benjamin Kaplan
On Thu, Aug 6, 2009 at 12:41 PM, Robert Dailey wrote:
> On Aug 6, 11:31 am, "Richard Brodie"  wrote:
>> "Robert Dailey"  wrote in message
>>
>> news:29ab0981-b95d-4435-91bd-a7a520419...@b15g2000yqd.googlegroups.com...
>>
>> > UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
>> > position 1650: character maps to <undefined>
>>
>> > The file is defined as ASCII.
>>
>> That's the problem: ASCII is a seven bit code. What you have is
>> actually ISO-8859-1 (or possibly Windows-1252).
>>
>> The different ISO-8859-n variants assign various characters
>> to '\xa9'. Rather than being Western-European centric and assuming
>> ISO-8859-1 by default, Python throws an error when you stray
>> outside of strict ASCII.
>
> Thanks for the help guys. Sorry I left out code, I wasn't sure at the
> time if it would be helpful. Below is my code:
>
>
> #
> def GetFileContentsAsString( file ):
>   f = open( file, mode='r', encoding='cp1252' )
>   contents = f.read()
>   f.close()
>   return contents
>
> #
> def ReplaceVersion( file, version, regExps ):
>   #match = regExps[0].search( 'FILEVERSION 1,45332,2100,32,' )
>   #print( match.group() )
>   text = GetFileContentsAsString( file )
>   print( text )
>
>
> As you can see, I am trying to load the file with encoding 'cp1252'
> which, according to the python 3.1 docs, translates to windows-1252. I
> also tried 'latin_1', which translates to ISO-8859-1, but this did not
> work either. Am I doing something else wrong?

This is why we need code and full tracebacks. There's a good chance
that your error is on the print(text) line. That's because sys.stdout
is probably a byte stream without an encoding defined. When you try to
print your unicode string, Python has to convert it to a stream of
bytes. Python refuses to guess on the console encoding and just falls
back to ascii, the conversion fails, and you get your error. Try using
print( text.encode( 'cp1252' ) ) instead.


Re: pylucene installation problem on Ubuntu 9.04

2009-08-06 Thread Benjamin Kaplan
On Thu, Aug 6, 2009 at 2:49 PM, KK wrote:
>
> kk-laptop$ sudo apt-get install pylucene
> and it did install python2.5, python2.5-minimal and pylucene. I must
> mention one thing that I already had python2.6 on my box as the
> default python i.e /usr/bin/python is linked to python2.6. Anyway s,
> now i started the python interpreter using "python" command from cli
> and then to make sure pylucene has been installed i tried to import
> the module and to my surprise it said "module pylucene not found".
> I thought I should enter the python2.6 env and do the same , so i
> tried starting the python2.6 interpreter using "python2.6" as the
> command and tried importing the same module and again it failed giving
> the same irritating message.

Pylucene seems to have installed under Python 2.5. Python extensions
are only installed for one version of python and extensions that use C
may only work under one particular version. If you want to use that
package, run python2.5 from the command line and try importing it.


Re: pylucene installation problem on Ubuntu 9.04

2009-08-06 Thread Jon Clements
On 6 Aug, 19:49, KK  wrote:
> hi all,
> I've trying to install pylucene on my linux box from last 2 days but
> not able to do so. first i tried to install it using apt-get like
> this,
> kk-laptop$ sudo apt-get install pylucene
> and it did install python2.5, python2.5-minimal and pylucene. I must
> mention one thing that I already had python2.6 on my box as the
> default python i.e /usr/bin/python is linked to python2.6. Anyway s,
> now i started the python interpreter using "python" command from cli
> and then to make sure pylucene has been installed i tried to import
> the module and to my surprise it said "module pylucene not found".
> I thought I should enter the python2.6 env and do the same , so i
> tried starting the python2.6 interpreter using "python2.6" as the
> command and tried importing the same module and again it failed giving
> the same irritating message.
>  As a final try i pulled the source code of pylucene and as per the
> comments given there in the README file, copied the mentioned files to
> site-packages directory of python2.6 and then tried importing the
> module and then got the same error message saying no module name
> pylucene is present. I'm sick of this error !
> Can someone point me what is the issue? If it is due to multiple
> version of python running on box, can someone tell me which one to
> remove or someone tell me how to get the whole thing running? I'll
> very much thankful to you guys.
>
> Thanks,
> KK.

If you installed using apt, have you a pylucene directory under /usr/
local/lib/python2.6/dist-packages/?

Also, if you run python, and import sys; print sys.path
whats it show?
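
A rough sketch of that check, written for Python 3 (under 2.5/2.6 the same idea would need imp.find_module instead). The helper name describe_module is made up for this example, and "json" is only a stand-in for whatever top-level name the package really installs. Run it once under each interpreter to see which one can find the module:

```python
import importlib.util
import sys

def describe_module(name):
    """Return where `name` would be imported from, or None if this
    interpreter cannot see it on its search path."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    for entry in sys.path:
        print("  path entry:", entry)
    # "json" is a stdlib stand-in; use the package's real module name here.
    print("json ->", describe_module("json"))
```

If the module shows up under one interpreter and not the other, the package was installed for that one version only.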

Jon.


Re: Character encoding & the copyright symbol

2009-08-06 Thread Philip Semanchuk


On Aug 6, 2009, at 3:14 PM, Martin v. Löwis wrote:

>> As a side note, you should probably use something other than "file" for
>> the parameter name in GetFileContentsAsString() since file() is a Python
>> function.
>
> Python 3.1.1a0 (py3k:74094, Jul 19 2009, 13:39:42)
> [GCC 4.3.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> py> file
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'file' is not defined

Whooops, didn't know about that change from 2.x to 3.x. Thanks.



Re: unicode() vs. s.decode()

2009-08-06 Thread Steven D'Aprano
On Thu, 06 Aug 2009 20:05:52 +0200, Thorsten Kampe wrote:

> > That is significant! So the winner is:
> > 
> > unicode('äöüÄÖÜß','utf-8')
> 
> Unless you are planning to write a loop that decodes "äöüÄÖÜß" one
> million times, these benchmarks are meaningless.

What if you're writing a loop which takes one million different lines of 
text and decodes them once each?


>>> setup = 'L = ["abc"*(n%100) for n in xrange(100)]'
>>> t1 = timeit.Timer('for line in L: line.decode("utf-8")', setup)
>>> t2 = timeit.Timer('for line in L: unicode(line, "utf-8")', setup)
>>> t1.timeit(number=1)
5.6751680374145508
>>> t2.timeit(number=1)
2.682251165771


Seems like a pretty meaningful difference to me.
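
For anyone repeating this on Python 3, where unicode() no longer exists, a rough equivalent compares bytes.decode() with the str(bytes, encoding) constructor. This is only a sketch: the sample data and repeat count are arbitrary, and no timings are claimed here since they depend on the machine and the interpreter version.

```python
import timeit

# Sample data: 100 byte strings of varying length (sizes chosen arbitrarily).
lines = [b"abc" * (n % 100) for n in range(100)]

def decode_with_method():
    # Python 3 counterpart of line.decode("utf-8")
    return [line.decode("utf-8") for line in lines]

def decode_with_constructor():
    # Python 3 counterpart of unicode(line, "utf-8")
    return [str(line, "utf-8") for line in lines]

if __name__ == "__main__":
    for func in (decode_with_method, decode_with_constructor):
        seconds = timeit.timeit(func, number=1000)
        print(func.__name__, round(seconds, 4))
```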



-- 
Steven


Re: Web page data and urllib2.urlopen

2009-08-06 Thread Piet van Oostrum
> Dave Angel  (DA) wrote:

>DA> Massi wrote:
>>> Hi everyone, I'm using the urllib2 library to get the html source code
>>> of web pages. In general it works great, but I'm having to do with a
>>> financial web site which does not provide the source code I expect. As
>>> a matter of fact if you try:
>>> 
>>> import urllib2
>>> res = urllib2.urlopen("http://www.marketwatch.com/story/mondays-biggest-gaining-and-declining-stocks-2009-07-27")
>>> page = res.read()
>>> print page
>>> 
>>> you will see that the printed code is very different from the one
>>> given, for example, by mozilla. Since I have really little knowledge
>>> in html I can't even understand if this is a python or html problem.
>>> Can anyone give me some help?
>>> Thanks in advance.
>>> 
>>> 
>DA> I don't think this is a Python issue, but a "raw read" versus an
>DA> interactive interpretation of a page.  The browser does lots more than a
>DA> single roundtrip defined by urlopen/read.

>DA> I also would love to see some explanation of what happens here, or a
>DA> pointer to a reference that would help me understand it.

>DA> I took the output of the read(), and formatted it, roughly, as html.  I
>DA> expected to find a refresh, which is the simplest way that one page can
>DA> cause a very different one to be loaded.
>DA>  

>DA> If Mozilla had seen a page with this line in an appropriate place, it'd
>DA> immediately begin loading the other page, at "someotherurl"  But there's no
>DA> such line.

>DA> Next, I looked for javascript.  The Mozilla page contains lots of
>DA> javascript, but there's none in the raw page.  So I can't explain Mozilla's
>DA> differences that way.

>DA> I did notice the link to /m/Content/mobile2.css, but I don't know any way
>DA> a CSS file could cause the content to change, just the display.

>DA> All I can guess is that it has something to do with "browser type" or
>DA> cookies.  And that would make lots of sense if this was a cgi page.  But
>DA> the URL doesn't look like that, as it doesn't end in pl, py, asp, or any of
>DA> another dozen special suffixes.

>DA> Any hints, anybody???

If you look into the HTML that Firefox gets, there is a lot of
javascript in it.
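
One way to test the "browser type" guess from Python is to send a browser-like User-Agent header and compare the result with the plain fetch. A minimal sketch using Python 3's urllib.request (the successor of urllib2). The URL and the User-Agent string are placeholders, build_request is a name invented here, and whether this particular site actually keys on the header is only a guess:

```python
import urllib.request

def build_request(url, user_agent):
    """Build a request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# Placeholder values for illustration only.
req = build_request("http://www.example.com/", "Mozilla/5.0 (X11; Linux x86_64)")

# Uncomment to actually fetch the page:
# with urllib.request.urlopen(req) as res:
#     page = res.read().decode("utf-8", errors="replace")
```

If the two downloads differ, the server is choosing its output from the request headers. If they are the same, the extra content is more likely added by javascript after the page loads.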
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: Character encoding & the copyright symbol

2009-08-06 Thread Martin v. Löwis
> As a side note, you should probably use something other than "file" for
> the parameter name in GetFileContentsAsString() since file() is a Python
> function.

Python 3.1.1a0 (py3k:74094, Jul 19 2009, 13:39:42)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
py> file
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'file' is not defined

Regards,
Martin



Re: Python docs disappointing

2009-08-06 Thread Wesley Chun
On Jul 31, 1:10 pm, kj  wrote:
> I'm pretty new to Python, and I like a lot overall, but I find the
> documentation for Python rather poor, overall.
>
> I'm sure that Python experts don't have this problem:


kj,

welcome to Python! i'm sorry that you find the documentation lacking.
the one thing about the docs is that they're just pointers to get you
started and aren't very comprehensive. there are plenty of good online
tutorials out there as well as books. in fact, one of my main
motivations for writing "Core Python Programming" was because when i
learned Python 13 years ago, the online docs were enough to get me
started but did not have enough info to help me become an intermediate
Python programmer. there were only *2*(!) Python books out there, and
they were special-topic oriented, not ones to learn the language from.
it took almost a year and a half to write, but from what i hear from
readers and what has been said in reviews, it's pretty comprehensive,
and is a good book to learn Python from. i only wish that *i* had it
when i was learning!

Most "Python experts" do not have the entire language memorized, so
everyone has to look at the docs from time-to-time, not just
beginners. i'll either hit up http://docs.python.org/library/MODULE.html
or flip open my Nutshell or PER references, and finally, i'll google
if i have to (rare). the Python docs are the language manuals and not
necessarily full reference texts, so you have to just take them for
what they are.

hope this helps!
-- wesley
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2007,2001
"Python Fundamentals", Prentice Hall, (c)2009
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
python training and technical consulting
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com


Python-URL! - weekly Python news and links (Aug 6)

2009-08-06 Thread Gabriel Genellina
QOTW:  "The economy rises and falls, money comes and goes, but a great
conference has permanent good effects.  Well, a lot more permanent than
government fiscal policy, anyway." - Python Software Foundation Director
"bitter-in-victory-gracious-in-defeat-ly y'rs" timbot


Is python free of "buffer overflow" errors?
http://groups.google.com/group/comp.lang.python/t/a31faac6feced289/

Lessons learned in implementation of a write-once dict:
http://groups.google.com/group/comp.lang.python/t/bc8b91669257e246/

Methods, attributes, iterators, lambdas, and how Ruby handles them:
http://groups.google.com/group/comp.lang.python/t/6e4fc61946513405/

Python 3 allows for custom types to be used as a class namespace
(not just dicts):
http://groups.google.com/group/comp.lang.python/t/50caadd10d2cca16/

Could Python be used to write a device driver?
http://groups.google.com/group/comp.lang.python/t/4efc28f9fe45b69e/

The various meanings of the underscore character '_' in identifiers:
http://groups.google.com/group/comp.lang.python/t/e32d577ad3d5a208/

Generate a new object each time a name is imported:
http://groups.google.com/group/comp.lang.python/t/b7112f74e2efa8bd/

heapq.nlargest takes a "key" argument - why not the other functions in
the same module?
http://groups.google.com/group/comp.lang.python/t/a5095d3f4b54f79b/

Immutable objects and how they could improve concurrency [old thread,
still alive]:

http://groups.google.com/group/comp.lang.python/t/cb0cf56c52321ccc/5c82cd09767ba85a?#5c82cd09767ba85a

How to modify a variable in an outer (non global) scope:
http://groups.google.com/group/comp.lang.python/t/e0e64250bd82825f/

Best way to add "private" directories to sys.path:
http://groups.google.com/group/comp.lang.python/t/cb43cf90d72f6833/

Interval arithmetic:
http://groups.google.com/group/comp.lang.python/t/71f050d8f5987244/

Ensure that no more than three instances of the same program are
running at the same time:
http://groups.google.com/group/comp.lang.python/t/af7ae6429c2bda1e/

Some people don't like the way Python documentation is managed/presented:
http://groups.google.com/group/comp.lang.python/t/a52b22cd90b15ef8/



Everything Python-related you want is probably one or two clicks away in
these pages:

Python.org's Python Language Website is the traditional
center of Pythonia
http://www.python.org
Notice especially the master FAQ
http://www.python.org/doc/FAQ.html

PythonWare complements the digest you're reading with the
marvelous daily python url
 http://www.pythonware.com/daily

Just beginning with Python?  This page is a great place to start:
http://wiki.python.org/moin/BeginnersGuide/Programmers

The Python Papers aims to publish "the efforts of Python enthusiasts":
http://pythonpapers.org/
The Python Magazine is a technical monthly devoted to Python:
http://pythonmagazine.com

Readers have recommended the "Planet" sites:
http://planetpython.org
http://planet.python.org

comp.lang.python.announce announces new Python software.  Be
sure to scan this newsgroup weekly.
http://groups.google.com/group/comp.lang.python.announce/topics

Python411 indexes "podcasts ... to help people learn Python ..."
Updates appear more-than-weekly:
http://www.awaretek.com/python/index.html

The Python Package Index catalogues packages.
http://www.python.org/pypi/

Much of Python's real work takes place on Special-Interest Group
mailing lists
http://www.python.org/sigs/

Python Success Stories--from air-traffic control to on-line
match-making--can inspire you or decision-makers to whom you're
subject with a vision of what the language makes practical.
http://www.pythonology.com/success

The Python Software Foundation (PSF) has replaced the Python
Consortium as an independent nexus of activity.  It has official
responsibility for Python's development and maintenance.
http://www.python.org/psf/
Among the ways you can support PSF is with a donation.
http://www.python.org/psf/donations/

The Summary of Python Tracker Issues is an automatically generated
report summarizing new bugs, closed ones, and patch submissions. 

http://search.gmane.org/?author=status%40bugs.python.org&group=gmane.comp.python.devel&sort=date

Although unmaintained since 2002, the Cetus collection of Python
hyperlinks retains a few gems.
http://www.cetus-links.org/oo_python.html

Python FAQTS
http://python.faqts.com/

The Cookbook is a collaborative effort to capture useful and
interesting recipes.
http://code.activestate.

Re: Using Python to automate builds

2009-08-06 Thread Piet van Oostrum
> Kosta  (K) wrote:

>K> My interpretation of the above (and your email) is that using Popen
>K> allows one to pass the Python environment to a child process (in my
>K> case, setenv.bat).   I need the reverse, to propagate from the child
>K> to the parent.

I don't think there is any modern OS that allows that. Unless you use
your own protocol of course, like letting the child write the
environment to its stdout and reading and interpreting it in the parent.
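
A minimal sketch of such a protocol, in Python 3 syntax. The helper names are invented for this example, and the cmd.exe invocation is an assumption (it relies on the batch file and "set" running in the same cmd.exe process), so treat it as a starting point rather than a tested recipe:

```python
import subprocess

def parse_env_dump(text):
    """Turn NAME=value lines (as printed by `set` or `env`) into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:          # skip blank lines and lines without '='
            env[name] = value
    return env

def env_after_batch(batch_file):
    """Run a batch file, then dump the environment it left behind (Windows)."""
    # shell=True hands the string to cmd.exe on Windows; `set` with no
    # arguments prints every variable of that same cmd.exe process.
    output = subprocess.check_output(
        '"{0}" && set'.format(batch_file), shell=True, universal_newlines=True
    )
    return parse_env_dump(output)
```

The parent could then call os.environ.update(env_after_batch("setenv.bat")) so that later child processes, such as build.exe, inherit the new settings.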
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: Parsing Binary Structures; Is there a better way / What is your way?

2009-08-06 Thread Martin P. Hellwig

Thanks all for your insights and suggestions.
It seems to me that there are a couple of ways to do this bit manipulation 
and a couple of foreign modules to assist you with that.


Would it be worth the while to do a PEP on this?
Personally I think that it would be nice to have a standard module in 
Python for this but perhaps there is a good reason that there isn't 
already one.
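
For byte-aligned fields the standard library's struct module already covers part of this. Fields smaller than a byte still need shifting and masking by hand. A small sketch with an invented header layout (the field names and sizes are made up for illustration, not taken from any real format):

```python
import struct

# Hypothetical record layout: 2-byte magic, 1-byte version, 1-byte flags,
# 4-byte unsigned length, all big-endian ('>').
HEADER = struct.Struct(">2sBBI")

def parse_header(data):
    magic, version, flags, length = HEADER.unpack_from(data)
    return {
        "magic": magic,
        "version": version,
        # Split the flags byte into a 3-bit type (high bits)
        # and a 5-bit code (low bits).
        "type": flags >> 5,
        "code": flags & 0x1F,
        "length": length,
    }

sample = HEADER.pack(b"PK", 2, 0b10100011, 1024)
```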


--
MPH
http://blog.dcuktec.com
'If consumed, best digested with added seasoning to own preference.'


pylucene installation problem on Ubuntu 9.04

2009-08-06 Thread KK
hi all,
I've been trying to install pylucene on my linux box for the last 2 days
but have not been able to do so. first i tried to install it using apt-get
like this,
kk-laptop$ sudo apt-get install pylucene
and it did install python2.5, python2.5-minimal and pylucene. I must
mention one thing: I already had python2.6 on my box as the
default python, i.e. /usr/bin/python is linked to python2.6. Anyways,
now i started the python interpreter using the "python" command from the cli
and then to make sure pylucene had been installed i tried to import
the module and to my surprise it said "module pylucene not found".
I thought I should enter the python2.6 env and do the same, so i
tried starting the python2.6 interpreter using "python2.6" as the
command and tried importing the same module and again it failed giving
the same irritating message.
 As a final try i pulled the source code of pylucene and, as per the
comments given in the README file, copied the mentioned files to the
site-packages directory of python2.6 and then tried importing the
module and then got the same error message saying no module named
pylucene is present. I'm sick of this error!
Can someone point out what the issue is? If it is due to multiple
versions of python running on the box, can someone tell me which one to
remove, or tell me how to get the whole thing running? I'll be
very thankful to you guys.

Thanks,
KK.


Re: Python docs disappointing - group effort to hire writers?

2009-08-06 Thread r
On Aug 6, 11:20 am, Paul Rubin  wrote:
...(snip)
> There is something similar with the PostgreSQL docs.  There is also
> Real World Haskell (http://book.realworld.haskell.org) which has a lot
> of interspersed user comments.  It would be cool if Python's doc site
> did something like it too.

hear! hear!


Re: Character encoding & the copyright symbol

2009-08-06 Thread Nobody
On Thu, 06 Aug 2009 09:14:08 -0700, Robert Dailey wrote:

> I'm loading a file via open() in Python 3.1 and I'm getting the
> following error when I try to print the contents of the file that I
> obtained through a call to read():
> 
> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> position 1650: character maps to <undefined>
> 
> The file is defined as ASCII and the copyright symbol shows up just
> fine in Notepad++. However, Python will not print this symbol. How can
> I get this to work? And no, I won't replace it with "(c)". Thanks!

1. As others have said, your file *isn't* ASCII, but that isn't the
problem.

2. The problem is that the encoding which your standard output
stream uses doesn't have the copyright symbol. You need to use something
like:

sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding = 'iso-8859-1')
sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding = 'iso-8859-1')

to fix the encoding of the stdout and stderr streams.
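
The effect can be reproduced without touching the real console by wrapping an in-memory buffer. In this sketch the cp437 code page is only an assumption about what the failing console might be using (the traceback just says 'charmap'), and render is a helper name invented for the example:

```python
import io

def render(text, encoding, errors="strict"):
    """Encode text the way a stream with the given encoding would,
    returning the bytes that were written."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)
    stream.flush()
    return raw.getvalue()

text = "Copyright \xa9 2009"

latin1_bytes = render(text, "iso-8859-1")   # this encoding has the symbol
try:
    render(text, "cp437")                   # cp437 has no copyright sign
    cp437_failed = False
except UnicodeEncodeError:
    cp437_failed = True
# errors="replace" substitutes '?' instead of raising.
fallback_bytes = render(text, "cp437", errors="replace")
```

Passing errors="replace" when rewrapping sys.stdout is a possible fallback when the console really cannot show the character and a '?' is acceptable.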



Re: unicode() vs. s.decode()

2009-08-06 Thread Thorsten Kampe
* Michael Ströder (Thu, 06 Aug 2009 18:26:09 +0200)
> Thorsten Kampe wrote:
> > * Michael Ströder (Wed, 05 Aug 2009 16:43:09 +0200)
> > I don't think any measurable speed increase will be noticeable
> > between those two.
> 
> Well, seems not to be true. Try yourself. I did (my console has UTF-8 as 
> charset):
> 
> Python 2.6 (r26:66714, Feb  3 2009, 20:52:03)
> [GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import timeit
> >>> timeit.Timer("'äöüÄÖÜß'.decode('utf-8')").timeit(100)
> 7.2721178531646729
> >>> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(100)
> 7.1302499771118164
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf8')").timeit(100)
> 8.3726329803466797
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(100)
> 1.8622009754180908
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf8')").timeit(100)
> 8.651669979095459
> >>>
> 
> Comparing again the two best combinations:
> 
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(1000)
> 17.23644495010376
> >>> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(1000)
> 72.087096929550171
> 
> That is significant! So the winner is:
> 
> unicode('äöüÄÖÜß','utf-8')

Unless you are planning to write a loop that decodes "äöüÄÖÜß" one 
million times, these benchmarks are meaningless.

Thorsten


Re: Unexpected side-effects of assigning to sys.modules[__name__]

2009-08-06 Thread Jean-Michel Pichavant

Steven D'Aprano wrote:

> Given this module:
>
> #funny.py
> import sys
> print "Before:"
> print "  __name__ =", __name__
> print "  sys.modules[__name__] =", sys.modules[__name__]
> sys.modules[__name__] = 123
> print "After:"
> print "  __name__ =", __name__
> print "  sys =", sys
>
> when I run it I get these results:
>
> [st...@sylar python]$ python2.6 funny.py
> Before:
>   __name__ = __main__
>   sys.modules[__name__] = 
> After:
>   __name__ = None
>   sys = None
>
> I'm completely perplexed by this behaviour. sys.modules seems to be a
> regular dict, at least according to type(), and yet assigning to an item
> of it seems to have unexpected, and rather weird, side-effects.
>
> What am I missing?

Maybe when you assign 123 to sys.modules[__name__], you've removed the
last reference to the module object and it is garbage collected. You are
then losing all your initial namespace.



try this one:
#funny.py
import sys
print "Before:"
print "  __name__ =", __name__
print "  sys.modules[__name__] =", sys.modules[__name__]
foo = sys.modules[__name__] # backup ref for the garbage collector
sys.modules[__name__] = 123
print "After:"
print "  __name__ =", __name__
print "  sys =", sys

Jean-Michel




Re: Help with regex

2009-08-06 Thread Nobody
On Thu, 06 Aug 2009 08:35:57 -0700, Robert Dailey wrote:

> I'm creating a python script that is going to try to search a text
> file for any text that matches my regular expression. The thing it is
> looking for is:
> 
> FILEVERSION #,#,#,#
> 
> The # symbol represents any number that can be any length 1 or
> greater. Example:
> 
> FILEVERSION 1,45,10082,3
> 
> The regex should only match the exact above. So far here's what I have
> come up with:
> 
> re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )

[0-9]+ allows any number of leading zeros, which is sometimes undesirable.
Using:

(0|[1-9][0-9]*)

is more robust.
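
Dropped into the original pattern, that might look like the sketch below. The trailing lookahead is an addition of mine, so that a last field such as "04" is not accepted by matching only its leading "0"; the names NUMBER, VERSION_RE and find_version are invented for the example.

```python
import re

NUMBER = r"(?:0|[1-9][0-9]*)"
# Three "number," groups followed by a final number; the lookahead stops a
# match from ending in the middle of a longer run of digits.
VERSION_RE = re.compile(r"FILEVERSION (?:%s,){3}%s(?![0-9])" % (NUMBER, NUMBER))

def find_version(text):
    match = VERSION_RE.search(text)
    return match.group() if match else None
```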



Unexpected side-effects of assigning to sys.modules[__name__]

2009-08-06 Thread Steven D'Aprano
Given this module:

#funny.py
import sys
print "Before:"
print "  __name__ =", __name__
print "  sys.modules[__name__] =", sys.modules[__name__]
sys.modules[__name__] = 123
print "After:"
print "  __name__ =", __name__
print "  sys =", sys


when I run it I get these results:


[st...@sylar python]$ python2.6 funny.py
Before:
  __name__ = __main__
  sys.modules[__name__] = 
After:
  __name__ = None
  sys = None



I'm completely perplexed by this behaviour. sys.modules seems to be a 
regular dict, at least according to type(), and yet assigning to an item 
of it seems to have unexpected, and rather weird, side-effects.

What am I missing?



-- 
Steven


Re: Overlap in python

2009-08-06 Thread John Ladasky
On Aug 4, 3:21 pm, Jay Bird  wrote:
> Hi everyone,
>
> I wanted to thank you all for your help and *excellent* discussion.  I
> was able to utilize and embed the script by Grigor Lingl in the 6th
> post of this discussion to get my program to work very quickly (I had
> to do about 20 comparisons per data bin, with over 40K bins in
> total).  I am involved in genomic analysis research and this problem
> comes up a lot and I was surprised to not have been able to find a
> clear way to solve it.  I will also look through all the tips in this
> thread, I have a feeling they may come in handy for future use!
>
> Thank you again,
> Jay

Hi Jay,

I know this is a bit off-topic, but how does this pertain to genomic
analysis?  Are you counting the lengths of microsatellite repeats or
something?


Re: Using Python to automate builds

2009-08-06 Thread Kosta
On Aug 6, 3:57 am, David Cournapeau  wrote:
> On Thu, Aug 6, 2009 at 12:39 AM, Kosta wrote:
>
> > Setenv.bat sets up the path and other environment variables build.exe
> > needs to compile and link (and even binplace) its utilities.  So
> > building itself is not the issue.  The problem is that if I call
> > setenv.bat from Python and then build.exe, but the modifications to
> > the path (and other environment settings) are not seen by Python, so
> > the attempt to build without a specified path fails.
>
> It sounds like you do not propagate the environment when calling
> setenv.bat from python. There is an option to do so in
> subprocess.Popen init method, or you can define your own environment
> if you do not want to propagate the whole environment (but this is
> often difficult to avoid for build environment in my experience,
> expecially if you don't have access to the sources of the whole system
> to check which variables are necessary).
>
> David

David,

Thank you.  I looked up the docs on Popen
(http://docs.python.org/library/subprocess.html) where I read:

On Windows: the Popen class uses CreateProcess() to execute the child
program, which operates on strings. If args is a sequence, it will be
converted to a string using the list2cmdline() method. Please note
that not all MS Windows applications interpret the command line the
same way: list2cmdline() is designed for applications using the same
rules as the MS C runtime.

My interpretation of the above (and your email) is that using Popen
allows one to pass the Python environment to a child process (in my
case, setenv.bat).   I need the reverse, to propagate from the child
to the parent.
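
For reference, the parent-to-child direction looks roughly like the sketch below (Python 3 syntax). The variable name DEMO_BUILD_FLAVOR and the helper run_with_extra_env are invented for the example. The child is just another Python process that reports what it sees:

```python
import os
import subprocess
import sys

def run_with_extra_env(extra):
    """Run a child Python that reports one variable, with `extra` added
    to a copy of the parent's environment."""
    child_env = dict(os.environ)   # start from the parent's environment
    child_env.update(extra)        # then add or override variables
    code = "import os; print(os.environ.get('DEMO_BUILD_FLAVOR', ''))"
    output = subprocess.check_output(
        [sys.executable, "-c", code], env=child_env, universal_newlines=True
    )
    return output.strip()

seen_by_child = run_with_extra_env({"DEMO_BUILD_FLAVOR": "checked"})
```

The child sees the new variable, while os.environ in the parent is untouched. Nothing the child changes in its own environment flows back either, which is the limitation described above.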

Thanks,
Kosta



Re: Character encoding & the copyright symbol

2009-08-06 Thread Richard Brodie

"Robert Dailey"  wrote in message 
news:f64f9830-c416-41b1-a510-c1e486271...@g19g2000vbi.googlegroups.com...

> As you can see, I am trying to load the file with encoding 'cp1252'
> which, according to the python 3.1 docs, translates to windows-1252. I
> also tried 'latin_1', which translates to ISO-8859-1, but this did not
> work either. Am I doing something else wrong?

Probably it's just the debugging print that has a problem, and if you
opened an output file with an encoding specified it would be fine.
When you get a UnicodeEncodeError, it's conversion _from_
Unicode that has failed. 




Re: Character encoding & the copyright symbol

2009-08-06 Thread Philip Semanchuk


On Aug 6, 2009, at 12:41 PM, Robert Dailey wrote:

> On Aug 6, 11:31 am, "Richard Brodie"  wrote:
>> "Robert Dailey"  wrote in message
>>
>> news:29ab0981-b95d-4435-91bd-a7a520419...@b15g2000yqd.googlegroups.com...
>>
>>> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
>>> position 1650: character maps to <undefined>
>>>
>>> The file is defined as ASCII.
>>
>> That's the problem: ASCII is a seven bit code. What you have is
>> actually ISO-8859-1 (or possibly Windows-1252).
>>
>> The different ISO-8859-n variants assign various characters
>> to '\xa9'. Rather than being Western-European centric and assuming
>> ISO-8859-1 by default, Python throws an error when you stray
>> outside of strict ASCII.
>
> Thanks for the help guys. Sorry I left out code, I wasn't sure at the
> time if it would be helpful. Below is my code:
>
> #
> def GetFileContentsAsString( file ):
>   f = open( file, mode='r', encoding='cp1252' )
>   contents = f.read()
>   f.close()
>   return contents
>
> #
> def ReplaceVersion( file, version, regExps ):
>   #match = regExps[0].search( 'FILEVERSION 1,45332,2100,32,' )
>   #print( match.group() )
>   text = GetFileContentsAsString( file )
>   print( text )
>
> As you can see, I am trying to load the file with encoding 'cp1252'
> which, according to the python 3.1 docs, translates to windows-1252. I
> also tried 'latin_1', which translates to ISO-8859-1, but this did not
> work either. Am I doing something else wrong?

Are you getting the error when you read the file or when you print(text)?

As a side note, you should probably use something other than "file" for
the parameter name in GetFileContentsAsString() since file() is a Python
function.







Re: Character encoding & the copyright symbol

2009-08-06 Thread Albert Hopkins
On Thu, 2009-08-06 at 09:14 -0700, Robert Dailey wrote:
> Hello,
> 
> I'm loading a file via open() in Python 3.1 and I'm getting the
> following error when I try to print the contents of the file that I
> obtained through a call to read():
> 
> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> position 1650: character maps to <undefined>
> 
> The file is defined as ASCII and the copyright symbol shows up just
> fine in Notepad++. However, Python will not print this symbol. How can
> I get this to work? And no, I won't replace it with "(c)". Thanks!

It's not actually ASCII but Windows-1252 extended ASCII-like.  So with
that information you can do either of 2 things: You can open it in text
mode and specify the encoding:

>>> fp = open(filename, 'r', encoding='windows-1252')
>>> s = fp.read()
>>> print(s)

or you can open it in binary mode and decode it later:

>>> fp = open(filename, 'rb')
>>> b = fp.read()
>>> print(str(b, encoding='windows-1252'))

Or you may be able to set the default encoding to windows-1252 but I
don't know how to do that (in Windows).

p.s.

Next time it might be helpful to paste a code snippet else we have to
make assumptions about what you are actually doing.



Re: Help with regex

2009-08-06 Thread Robert Dailey
On Aug 6, 11:12 am, Roman  wrote:
> On 06/08/09 08:35, Robert Dailey wrote:
>
>
>
>
>
> > Hey guys,
>
> > I'm creating a python script that is going to try to search a text
> > file for any text that matches my regular expression. The thing it is
> > looking for is:
>
> > FILEVERSION #,#,#,#
>
> > The # symbol represents any number that can be any length 1 or
> > greater. Example:
>
> > FILEVERSION 1,45,10082,3
>
> > The regex should only match the exact above. So far here's what I have
> > come up with:
>
> > re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )
>
> > This works, but I was hoping for something a bit cleaner. I'm having
> > to create a special case portion of the regex for the last of the 4
> > numbers simply because it doesn't end with a comma like the first 3.
> > Is there a better, more compact, way to write this regex?
> > --
> >http://mail.python.org/mailman/listinfo/python-list
>
> Since there cannot be more than one "end of string" you can try this
> expression:
> re.compile( r'FILEVERSION (?:[0-9]+(,|$)){4}' )

I had thought of this but I can't use that either. I have to assume
that someone was silly and put text at the end somewhere, perhaps a
comment. Like so:

FILEVERSION 1,2,3,4 // This is the file version

It would be nice if there was a type of counter for regex. So you
could say 'match only 1 [^,]' or something like that...
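
One possible way to get that effect, sketched under the assumption that a
version is exactly four numbers and that anything other than a digit or a
comma may follow it: a negative lookahead after the last number. The names
FILEVERSION_RE and find_version are illustrative only.

```python
import re

# One number followed by exactly three ",number" groups. The lookahead
# rejects a fifth number or a dangling comma, while trailing text such
# as a comment is simply not part of the match.
FILEVERSION_RE = re.compile(r"FILEVERSION (\d+(?:,\d+){3})(?![\d,])")


def find_version(line):
    """Return the version as a tuple of ints, or None if nothing matches."""
    match = FILEVERSION_RE.search(line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split(","))
```

This still treats the first number separately from the other three, so it
is not shorter than the original pattern; it only adds the "nothing
version-like may follow" check.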


Re: Character encoding & the copyright symbol

2009-08-06 Thread Robert Dailey
On Aug 6, 11:31 am, "Richard Brodie"  wrote:
> "Robert Dailey"  wrote in message
>
> news:29ab0981-b95d-4435-91bd-a7a520419...@b15g2000yqd.googlegroups.com...
>
> > UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> > position 1650: character maps to <undefined>
>
> > The file is defined as ASCII.
>
> That's the problem: ASCII is a seven bit code. What you have is
> actually ISO-8859-1 (or possibly Windows-1252).
>
> The different ISO-8859-n variants assign various characters to
> '\xa9'. Rather than being Western-European centric and assuming
> ISO-8859-1 by default, Python throws an error when you stray
> outside of strict ASCII.

Thanks for the help guys. Sorry I left out code, I wasn't sure at the
time if it would be helpful. Below is my code:


#
def GetFileContentsAsString( file ):
   f = open( file, mode='r', encoding='cp1252' )
   contents = f.read()
   f.close()
   return contents

#
def ReplaceVersion( file, version, regExps ):
   #match = regExps[0].search( 'FILEVERSION 1,45332,2100,32,' )
   #print( match.group() )
   text = GetFileContentsAsString( file )
   print( text )


As you can see, I am trying to load the file with encoding 'cp1252'
which, according to the python 3.1 docs, translates to windows-1252. I
also tried 'latin_1', which translates to ISO-8859-1, but this did not
work either. Am I doing something else wrong?
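
A note on the traceback: it is a UnicodeEncodeError ("can't encode"), which
suggests that reading and decoding the file worked and that the failure
happens in print(), when the text is converted to the console's encoding.
The sketch below assumes that is the case; printable() is a hypothetical
helper, not something from the code above.

```python
import sys


def printable(text, encoding=None):
    """Return text with characters the target encoding lacks shown as escapes."""
    target = encoding or sys.stdout.encoding or "ascii"
    # backslashreplace keeps the output lossless and never raises.
    return text.encode(target, errors="backslashreplace").decode(target)
```

With this, print(printable(text)) shows the copyright sign where the
console can display it and the escape \xa9 where it cannot, instead of
raising.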


Re: Using easy_install, reduncant?

2009-08-06 Thread Adam N
On Jul 27, 7:53 pm, David Lyon  wrote:
> On Mon, 27 Jul 2009 09:42:06 -0700 (PDT), ray 
> wrote:
>
> > I am working on a Trac installation.  I am new to Python.  To install
> > packages, it is suggested to use setuptools.  I have not understood
> > the directions.
>
> > I execute ez_install.py.
>
> > Then I attempt to execute easy_install.py setuptools-0.6c9-py2.6.egg.
> > There response that setuptools is already the active version in easy-
> > install.pth.  Then:
> > Installing easy_install.exe script to C:\Python26\Scripts error:  C:
> > \Python26\Scripts\Easy_install.exe: Permission denied.
>
> > I have compared the file entries before and after this attempt and
> > there are no new files.  Is there any problems here?  What did I miss?
>
> Try using python package manager:
> http://sourceforge.net/projects/pythonpkgmgr/
>
> You might find it a lot simpler. It will download and install setuptools
> for you if you are still having problems.
>
> David

Is there any solution within the easy_install world?  I'm trying to
run a script (pinax) that calls it specifically so I'd have to do all
sorts of hacking to use pythonpkgmgr.


Re: Two Dimensional Array + ctypes

2009-08-06 Thread Sparky
On Aug 5, 11:19 pm, "Gabriel Genellina" 
wrote:
> En Wed, 05 Aug 2009 20:12:09 -0300, Sparky  escribió:
>
>
>
>
>
> > Hello! I am trying to call this method:
>
> > long _stdcall AIBurst(long *idnum, [...]
> >                     long timeout,
> >                     float (*voltages)[4],
> >                     long *stateIOout,
> >                     long *overVoltage,
> >                     long transferMode);
>
> > I am having some problems with that  float (*voltages)[4].
> >         pointerArray = (ctypes.c_void_p * 4)
> >         voltages = pointerArray(ctypes.cast(ctypes.pointer
> > ((ctypes.c_long * 4096)()), ctypes.c_void_p), ctypes.cast
> > (ctypes.pointer((ctypes.c_long * 4096)()), ctypes.c_void_p),
> > ctypes.cast(ctypes.pointer((ctypes.c_long * 4096)()),
> > ctypes.c_void_p), ctypes.cast(ctypes.pointer((ctypes.c_long * 4096)
>
> Why c_long and not c_float?
> Anyway, this way looks much more clear to me (and doesn't require a cast):
>
> arr4096_type = ctypes.c_float * 4096
> voltages_type = arr4096_type * 4
> voltages = voltages_type()
>
> > The program runs but the values that come back in the array are not
> > right.
>
> That might be due to the long/float confusion.
>
> --
> Gabriel Genellina

Brilliant! Your code is much cleaner and the problem must have been
float vs long.

Thanks,
Sam
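
A side note on the nesting order, offered as a sketch rather than a
correction: in C, float (*voltages)[4] points at rows of 4 floats, so a
buffer of 4096 such rows corresponds to (c_float * 4) * 4096 in ctypes.
The type built as (c_float * 4096) * 4 has the same size but indexes the
memory differently. Which one is right depends on how the DLL fills the
buffer; the code below assumes 4096 rows of 4 readings. NUM_SCANS and the
type names are illustrative.

```python
import ctypes

NUM_SCANS = 4096     # assumed number of rows, taken from the original post
NUM_CHANNELS = 4     # inner dimension of float (*voltages)[4]

ScanRow = ctypes.c_float * NUM_CHANNELS    # float[4]
VoltageBuffer = ScanRow * NUM_SCANS        # float[4096][4]

voltages = VoltageBuffer()

# A flat view over the same memory, to check where [scan][channel] lives.
flat = (ctypes.c_float * (NUM_SCANS * NUM_CHANNELS)).from_buffer(voltages)
voltages[2][3] = 1.5
```

With this layout, voltages[scan][channel] sits at flat offset
scan * 4 + channel, which is what C code indexing a float[...][4] array
would write to.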


Re: Help with regex

2009-08-06 Thread Roman
On 06/08/09 08:35, Robert Dailey wrote:
> Hey guys,
> 
> I'm creating a python script that is going to try to search a text
> file for any text that matches my regular expression. The thing it is
> looking for is:
> 
> FILEVERSION #,#,#,#
> 
> The # symbol represents any number that can be any length 1 or
> greater. Example:
> 
> FILEVERSION 1,45,10082,3
> 
> The regex should only match the exact above. So far here's what I have
> come up with:
> 
> re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )
> 
> This works, but I was hoping for something a bit cleaner. I'm having
> to create a special case portion of the regex for the last of the 4
> numbers simply because it doesn't end with a comma like the first 3.
> Is there a better, more compact, way to write this regex?
> -- 
> http://mail.python.org/mailman/listinfo/python-list
> 

Since there cannot be more than one "end of string" you can try this
expression:
re.compile( r'FILEVERSION (?:[0-9]+(,|$)){4}' )
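
A quick check of how this pattern behaves, using made-up sample lines:

```python
import re

pattern = re.compile(r"FILEVERSION (?:[0-9]+(,|$)){4}")

samples = [
    "FILEVERSION 1,45,10082,3",
    "FILEVERSION 1,2,3,4,",
    "FILEVERSION 1,2,3,4 // This is the file version",
]

# True where the pattern is found somewhere in the sample line.
results = {line: bool(pattern.search(line)) for line in samples}
```

The first two samples match and the third does not: each number must be
followed by a comma or by the end of the string, so a trailing comma is
accepted while any other trailing text is rejected.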


Re: Character encoding & the copyright symbol

2009-08-06 Thread Richard Brodie

"Robert Dailey"  wrote in message 
news:29ab0981-b95d-4435-91bd-a7a520419...@b15g2000yqd.googlegroups.com...

> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> position 1650: character maps to <undefined>
>
> The file is defined as ASCII.

That's the problem: ASCII is a seven bit code. What you have is
actually ISO-8859-1 (or possibly Windows-1252).

The different ISO-8859-n variants assign various characters to
'\xa9'. Rather than being Western-European centric and assuming
ISO-8859-1 by default, Python throws an error when you stray
outside of strict ASCII. 




Re: Help with regex

2009-08-06 Thread MRAB

Robert Dailey wrote:

On Aug 6, 11:02 am, MRAB  wrote:

Robert Dailey wrote:

Hey guys,
I'm creating a python script that is going to try to search a text
file for any text that matches my regular expression. The thing it is
looking for is:
FILEVERSION #,#,#,#
The # symbol represents any number that can be any length 1 or
greater. Example:
FILEVERSION 1,45,10082,3
The regex should only match the exact above. So far here's what I have
come up with:
re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )
This works, but I was hoping for something a bit cleaner. I'm having
to create a special case portion of the regex for the last of the 4
numbers simply because it doesn't end with a comma like the first 3.
Is there a better, more compact, way to write this regex?

The character class \d is equivalent to [0-9], and ',' isn't a special
character so it doesn't need to be escaped:

 re.compile(r'FILEVERSION (?:\d+,){3}\d+')


But ',' is a special symbol. It's used in this way:
{0,3}

This will match the previous regex 0-3 times. Are you sure commas need
not be escaped?

In any case, your suggestions help to clean it up a bit!


By 'special' I mean ones like '?', '*', '(', etc. ',' isn't special in
that sense.

In fact, the {...} quantifier is special only if it's syntactically
correct, otherwise it's just a literal, eg "a{," and a{} are just
literals.
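
The points above can be checked directly with Python 3's re module (the
sample strings are arbitrary):

```python
import re

# A comma outside {...} is an ordinary character; escaping it changes nothing.
assert re.fullmatch(r"1\,2", "1,2")
assert re.fullmatch(r"1,2", "1,2")

# Inside a well-formed quantifier the comma separates the two bounds.
assert re.fullmatch(r"a{0,3}", "aaa")
assert re.fullmatch(r"a{0,3}", "aaaa") is None

# Malformed braces are matched as literal text.
assert re.fullmatch(r"a{,", "a{,")
assert re.fullmatch(r"a{}", "a{}")
```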


Re: Character encoding & the copyright symbol

2009-08-06 Thread Philip Semanchuk


On Aug 6, 2009, at 12:14 PM, Robert Dailey wrote:


Hello,

I'm loading a file via open() in Python 3.1 and I'm getting the
following error when I try to print the contents of the file that I
obtained through a call to read():

UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
position 1650: character maps to <undefined>

The file is defined as ASCII and the copyright symbol shows up just
fine in Notepad++. However, Python will not print this symbol. How can
I get this to work? And no, I won't replace it with "(c)". Thanks!


If the file is defined as ASCII and it contains 0xa9, then the file  
was written incorrectly or you were told the wrong encoding. There is  
no such character in ASCII which runs from 0x00 - 0x7f.


The copyright symbol == 0xa9 if the encoding is ISO-8859-1 or  
windows-1252, and since you're on Windows the latter is a likely bet.


http://en.wikipedia.org/wiki/Ascii
http://en.wikipedia.org/wiki/Iso-8859-1
http://en.wikipedia.org/wiki/Windows-1252


Bottom line is that your file is not in ASCII. Try specifying  
windows-1252 as the encoding. Without seeing your code I can't tell  
you where you need to specify the encoding, but the Python docs should  
help you out.



HTH
Philip



Re: unicode() vs. s.decode()

2009-08-06 Thread Michael Ströder
Thorsten Kampe wrote:
> * Michael Ströder (Wed, 05 Aug 2009 16:43:09 +0200)
>> These both expressions are equivalent but which is faster or should be
>> used for any reason?
>>
>> u = unicode(s,'utf-8')
>>
>> u = s.decode('utf-8') # looks nicer
> 
> "decode" was added in Python 2.2 for the sake of symmetry to encode(). 

Yes, and I like the style. But...

> It's essentially the same as unicode() and I wouldn't be surprised if it 
> is exactly the same.

Did you try?

> I don't think any measurable speed increase will be noticeable between
> those two.

Well, seems not to be true. Try yourself. I did (my console has UTF-8 as 
charset):

Python 2.6 (r26:66714, Feb  3 2009, 20:52:03)
[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.Timer("'äöüÄÖÜß'.decode('utf-8')").timeit(100)
7.2721178531646729
>>> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(100)
7.1302499771118164
>>> timeit.Timer("unicode('äöüÄÖÜß','utf8')").timeit(100)
8.3726329803466797
>>> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(100)
1.8622009754180908
>>> timeit.Timer("unicode('äöüÄÖÜß','utf8')").timeit(100)
8.651669979095459
>>>

Comparing again the two best combinations:

>>> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(1000)
17.23644495010376
>>> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(1000)
72.087096929550171

That is significant! So the winner is:

unicode('äöüÄÖÜß','utf-8')

Ciao, Michael.
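
The runs above mix codec spellings and repetition counts, so a like-for-like
rerun may be useful. The sketch below is for Python 3, where unicode() no
longer exists and str(b, enc) and b.decode(enc) are the closest analogues;
it is not the Python 2.6 measurement from this message, and no particular
result is implied. The names SAMPLE, STATEMENTS and compare are
illustrative.

```python
import timeit

SAMPLE = "äöüÄÖÜß".encode("utf-8")

# Each call style is timed with both spellings of the codec name.
STATEMENTS = {
    "str(b, 'utf-8')": "str(SAMPLE, 'utf-8')",
    "str(b, 'utf8')": "str(SAMPLE, 'utf8')",
    "b.decode('utf-8')": "SAMPLE.decode('utf-8')",
    "b.decode('utf8')": "SAMPLE.decode('utf8')",
}


def compare(number=100000, repeat=3):
    """Return the best time in seconds for each statement, same counts for all."""
    results = {}
    for label, stmt in STATEMENTS.items():
        timer = timeit.Timer(stmt, globals={"SAMPLE": SAMPLE})
        results[label] = min(timer.repeat(repeat=repeat, number=number))
    return results
```

Taking the minimum of several repeats, with the same number of iterations
for every statement, keeps one noisy run from deciding the outcome.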


Re: Python docs disappointing - group effort to hire writers?

2009-08-06 Thread Paul Rubin
alex23  writes:
> No offence, but the last thing the official documentation needs is
> example code written by people learning how to code. Suggest changes,
> request clarifications, submit samples for review, sure, but direct
> modification by users? I've seen the PHP docs; thanks but no thanks.

The PHP docs as I remember are sort of regular (non-publically
editable) doc pages, each of which has a public discussion thread
where people can post questions and answers about the topic of that
doc page.  I thought it worked really well.  The main thing is that
the good stuff from the comment section gets folded into the actual
doc now and then.

There is something similar with the PostgreSQL docs.  There is also
Real World Haskell (http://book.realworld.haskell.org) which has a lot
of interspersed user comments.  It would be cool if Python's doc site
did something like it too.


Re: Web page data and urllib2.urlopen

2009-08-06 Thread ryles
On Aug 5, 4:30 pm, Massi  wrote:
> Hi everyone, I'm using the urllib2 library to get the html source code
> of web pages. In general it works great, but I'm having to do with a
> financial web site which does not provide the source code I expect. As
> a matter of fact if you try:
>
> import urllib2
> res = urllib2.urlopen("http://www.marketwatch.com/story/mondays-
> biggest-gaining-and-declining-stocks-2009-07-27")
> page = res.read()
> print page
>
> you will see that the printed code is very different from the one
> given, for example, by mozilla. Since I have really little knowledge
> in html I can't even understand if this is a python or html problem.
> Can anyone give me some help?
> Thanks in advance.

Check if setting your user agent to Mozilla results in a different
page:

http://diveintopython.org/http_web_services/user_agent.html
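
A minimal sketch of that idea for Python 3, where urllib2 became
urllib.request. The User-Agent string is a made-up browser-like value, and
whether the site actually serves a different page for it is untested here:

```python
import urllib.request

# Hypothetical browser-like identifier; any string can be sent.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent=BROWSER_UA):
    """Create a request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA):
    """Download the page body as bytes."""
    with urllib.request.urlopen(build_request(url, user_agent)) as res:
        return res.read()
```

Fetching the same URL once with the default agent and once with
fetch(url) and comparing the two bodies would show whether the server
varies its output by user agent.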


Re: Help with regex

2009-08-06 Thread Robert Dailey
On Aug 6, 11:02 am, MRAB  wrote:
> Robert Dailey wrote:
> > Hey guys,
>
> > I'm creating a python script that is going to try to search a text
> > file for any text that matches my regular expression. The thing it is
> > looking for is:
>
> > FILEVERSION #,#,#,#
>
> > The # symbol represents any number that can be any length 1 or
> > greater. Example:
>
> > FILEVERSION 1,45,10082,3
>
> > The regex should only match the exact above. So far here's what I have
> > come up with:
>
> > re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )
>
> > This works, but I was hoping for something a bit cleaner. I'm having
> > to create a special case portion of the regex for the last of the 4
> > numbers simply because it doesn't end with a comma like the first 3.
> > Is there a better, more compact, way to write this regex?
>
> The character class \d is equivalent to [0-9], and ',' isn't a special
> character so it doesn't need to be escaped:
>
>      re.compile(r'FILEVERSION (?:\d+,){3}\d+')

But ',' is a special symbol. It's used in this way:
{0,3}

This will match the previous regex 0-3 times. Are you sure commas need
not be escaped?

In any case, your suggestions help to clean it up a bit!


Re: How to comment on a Python PEP?

2009-08-06 Thread Robert Kern

On 2009-08-06 03:42, "Martin v. Löwis" wrote:

Is there a mechanism for submitting comments on a Python PEP?


You post to python-dev or comp.lang.python, and you CC the author.


And be sure to put the PEP number in the Subject: line.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



Character encoding & the copyright symbol

2009-08-06 Thread Robert Dailey
Hello,

I'm loading a file via open() in Python 3.1 and I'm getting the
following error when I try to print the contents of the file that I
obtained through a call to read():

UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
position 1650: character maps to <undefined>

The file is defined as ASCII and the copyright symbol shows up just
fine in Notepad++. However, Python will not print this symbol. How can
I get this to work? And no, I won't replace it with "(c)". Thanks!


Re: Help with regex

2009-08-06 Thread alex23
On Aug 7, 1:35 am, Robert Dailey  wrote:
> I'm creating a python script that is going to try to search a text
> file for any text that matches my regular expression. The thing it is
> looking for is:
>
> FILEVERSION 1,45,10082,3

Would it be easier to do it without regex? The following is untested
but I would probably do it more like this:

TOKEN = 'FILEVERSION '
for line in file:
    if line.startswith(TOKEN):
        version = line[len(TOKEN):]
        maj, min, rev, other = version.split(',')
        break # if there's only one occurrence, otherwise do stuff here
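
A runnable variant of the same idea, sketched for Python 3. It assumes the
version is the first whitespace-delimited word after the token, so a
trailing newline or comment is dropped, and it returns integers;
parse_fileversion is an illustrative name.

```python
TOKEN = "FILEVERSION "


def parse_fileversion(lines):
    """Return the first FILEVERSION as a tuple of four ints, or None."""
    for line in lines:
        if not line.startswith(TOKEN):
            continue
        # Keep only the first word after the token, which drops a
        # trailing newline or a comment such as "// file version".
        words = line[len(TOKEN):].split()
        if not words:
            continue
        parts = words[0].split(",")
        if len(parts) == 4 and all(p.isdecimal() for p in parts):
            return tuple(int(p) for p in parts)
    return None
```

Passing an open file object as lines works, since iterating over a file
yields its lines.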


Re: Help with regex

2009-08-06 Thread MRAB

Robert Dailey wrote:

Hey guys,

I'm creating a python script that is going to try to search a text
file for any text that matches my regular expression. The thing it is
looking for is:

FILEVERSION #,#,#,#

The # symbol represents any number that can be any length 1 or
greater. Example:

FILEVERSION 1,45,10082,3

The regex should only match the exact above. So far here's what I have
come up with:

re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )

This works, but I was hoping for something a bit cleaner. I'm having
to create a special case portion of the regex for the last of the 4
numbers simply because it doesn't end with a comma like the first 3.
Is there a better, more compact, way to write this regex?


The character class \d is equivalent to [0-9], and ',' isn't a special
character so it doesn't need to be escaped:

re.compile(r'FILEVERSION (?:\d+,){3}\d+')


Re: Python docs disappointing - group effort to hire writers?

2009-08-06 Thread alex23
Kee Nethery wrote:
> As I struggle through trying to figure out how to make python do  
> simple stuff for me, I frequently generate samples. If some volunteer  
> here would point me towards the documentation that would tell me how I  
> can alter the existing Python docs to include sample code, I'd be more  
> than happy to do so.

No offence, but the last thing the official documentation needs is
example code written by people learning how to code. Suggest changes,
request clarifications, submit samples for review, sure, but direct
modification by users? I've seen the PHP docs; thanks but no thanks.

> I would like to "do it". Please point me to the docs that tell me how  
> to "do it" so that we people with newbie questions and a need for  
> examples can get out of your way and "do it" ourselves.

You start by reading this: http://docs.python.org/documenting/index.html
And this: http://www.python.org/dev/contributing/
And this: http://wiki.python.org/moin/WikiGuidelines

The first link, which directly answers your question, is clearly
listed on the doc contents page as "Documenting Python". I'm uncertain
how the docs could be made any _more_ helpful if people aren't
prepared to put effort into reading them. We're a long way away from
direct upload to the brain, unfortunately.

If you're learning the language, you should also consider using more
appropriate resources:
http://mail.python.org/mailman/listinfo/tutor
http://www.doughellmann.com/PyMOTW/
http://diveintopython.org/

The documentation cannot be all things to all people, and it most
certainly can't be a guide to general programming, which is what often
seems to be the issue with novice users. Python's a great language to
learn how to program in, sure, but I would hate to see that become the
focus of the docs.


Re: Python configuration question when python scripts are executed using Appweb as web server.

2009-08-06 Thread IronyOfLife
Hi Gabriel

On Aug 5, 4:18 pm, "Gabriel Genellina"  wrote:
> En Tue, 04 Aug 2009 10:15:24 -0300, IronyOfLife 
> escribió:
>
> > On Aug 3, 8:42 pm, "Gabriel Genellina"  wrote:
> >> En Mon, 03 Aug 2009 11:04:07 -0300, IronyOfLife   
> >> escribió:
>
> >> > I have installed python 2.6.2 in windows xp professional machine. I
> >> > have set the following environment variables -- PYTHONPATH. It points
> >> > to following windows folders: python root folder, the lib folder and
> >> > lib-tk folder.
>
> >> Why? Did you read it somewhere? Usually there is no need to set the  
> >> PYTHONPATH variable at all; remove it.
>
> > Setting PYTHONPATH environment variables is mentioned in Python docs.
>
> Could you provide a link please? Setting PYTHONPATH should not be

Here are a couple of links that discuss setting the PYTHONPATH environment
variable.
http://docs.python.org/using/windows.html
http://www.daimi.au.dk/~chili/PBI/pythonpath.html

> necessary, and in fact it's a very bad idea. Environment variables are
> global, but Python modules may depend on the Python version, architecture,
> install location... By example, you may install a 64 bits Python 3.1
> version *and* a 32 bits Python 2.5 version and they both can coexist
> peacefully - but an extension module compiled for the former cannot be
> used in the later version. You must build a separate library for each
> version, and install them in two separate directories. But since the
> PYTHONPATH variable is shared by all installations, which directory should
> it contain?
> It's best not to use PYTHONPATH at all and rely on other alternatives
> (like .pth files, that are searched relative to the current Python
> executable, so different versions use different configuration files)

I understand your concerns regarding setting of PYTHONPATH while
multiple versions of Python are installed on the same machine. My fix
however does not use PYTHONPATH. The GNUTLS wrapper module for PYTHON
loads the GNUTLS dll's and it was not able to find them. Using FileMon
(win tool) I found out the paths that are scanned and I copied the
dlls to one of such paths. I still do not like this fix. This is a
temporary solution.

Can you explain maybe with some sample how to set .pth files? Maybe
this will resolve my issue.
>
> > I solved the issue temporarily by copying the gnutls related dlls to
> > the path searched by python.exe
>
> Glad to see you could finally fix it!
>
> --
> Gabriel Genellina

Thanks very much for your reply.
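
On the .pth question, a small sketch of how such a file works: it is a
plain text file in a site directory (normally site-packages) that lists
one directory per line, and each existing directory is appended to
sys.path. The temporary directories and the helper names below are
stand-ins for illustration. Note that a .pth file only affects where
Python looks for modules; it does not change where Windows looks for DLLs,
so it may not help with the GNUTLS libraries themselves.

```python
import os
import site
import sys
import tempfile


def write_pth(site_dir, name, directories):
    """Create site_dir/name.pth listing one directory per line."""
    pth_path = os.path.join(site_dir, name + ".pth")
    with open(pth_path, "w") as fp:
        for directory in directories:
            fp.write(directory + "\n")
    return pth_path


def demo():
    # Throwaway directories standing in for site-packages and a library folder.
    site_dir = tempfile.mkdtemp()
    extra_dir = tempfile.mkdtemp()
    write_pth(site_dir, "extra", [extra_dir])
    # addsitedir() adds site_dir and processes the .pth files inside it,
    # the same way site-packages is handled at interpreter startup.
    site.addsitedir(site_dir)
    return extra_dir
```

In a real installation the file would simply be saved by hand into that
installation's own site-packages directory, which is what keeps different
Python versions from sharing the setting.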


Help with regex

2009-08-06 Thread Robert Dailey
Hey guys,

I'm creating a python script that is going to try to search a text
file for any text that matches my regular expression. The thing it is
looking for is:

FILEVERSION #,#,#,#

The # symbol represents any number that can be any length 1 or
greater. Example:

FILEVERSION 1,45,10082,3

The regex should only match the exact above. So far here's what I have
come up with:

re.compile( r'FILEVERSION (?:[0-9]+\,){3}[0-9]+' )

This works, but I was hoping for something a bit cleaner. I'm having
to create a special case portion of the regex for the last of the 4
numbers simply because it doesn't end with a comma like the first 3.
Is there a better, more compact, way to write this regex?


Re: ANN: Python for Bioinformatics book

2009-08-06 Thread Sebastian Bassi
On Thu, Aug 6, 2009 at 11:52 AM, Bearophile wrote:
> The book looks interesting, but that doesn't look like a good way to
> show/offer the code. I suggest to also put it into a zip that can be
> downloaded.

Code is also in a directory in the DVD and also inside the virtual
machine. Anyway I think it wouldn't hurt to make a zip and put it
online, so i will do it. Thanks.
Best,
SB.


Re: ANN: Python for Bioinformatics book

2009-08-06 Thread Bearophile
Sebastian Bassi:

> All code is also available at the http://py3.us/## where ## is the code number,
> for example: http://py3.us/57

The book looks interesting, but that doesn't look like a good way to
show/offer the code. I suggest to also put it into a zip that can be
downloaded.

Bye,
bearophile


Help making this script better

2009-08-06 Thread jakecjacobson
Hi,

After much Google searching and trial & error, I was able to write a
Python script that posts XML files to a REST API using HTTPS and
passing PEM cert & key file.  It seems to be working but would like
some pointers on how to handle errors.  I am using Python 2.4, I don't
have the capability to upgrade even though I would like to.  I am very
new to Python so help will be greatly appreciated and I hope others
can use this script.

#!/usr/bin/python
#
# catalog_feeder.py
#
# This script will process a directory of XML files and push them to
# the Enterprise Catalog.
#  You configure this script by using a configuration file that
#  describes the required variables.
#  The path to this file is either passed into the script as a command
#  line argument or hard coded
#  in the script.  The script will terminate with an error if it can't
#  process the XML file.
#

# IMPORT STATEMENTS
import httplib
import mimetypes
import os
import sys
import shutil
import time
from urllib import *
from time import strftime
from xml.dom import minidom

def main(c):
    start_time = time.time()
    # Set configuration parameters
    try:
        # Process the XML conf file
        xmldoc = minidom.parse(c)
        catalog_host = readConfFile(xmldoc, 'catalog_host')
        catalog_port = int(readConfFile(xmldoc, 'catalog_port'))
        catalog_path = readConfFile(xmldoc, 'catalog_path')
        collection_name = readConfFile(xmldoc, 'collection_name')
        cert_file = readConfFile(xmldoc, 'cert_file')
        key_file = readConfFile(xmldoc, 'key_file')
        log_file = readConfFile(xmldoc, 'log_file')
        input_dir = readConfFile(xmldoc, 'input_dir')
        archive_dir = readConfFile(xmldoc, 'archive_dir')
        hold_dir = readConfFile(xmldoc, 'hold_dir')
    except Exception, inst:
        # I had an error so report it and exit script
        print "Unexpected error opening %s: %s" % (c, inst)
        sys.exit(1)
    # Log Starting
    logOut = verifyLogging(log_file)
    if logOut:
        log(logOut, "Processing Started ...")
    # Get list of XML files to process
    if os.path.exists(input_dir):
        files = getFiles2Post(input_dir)
    else:
        if logOut:
            log(logOut, "WARNING!!! Couldn't find input directory: " + input_dir)
            cleanup(logOut)
        else:
            print "Dir doen't exist: " + input_dir
            sys.exit(1)
    try:
        # Process each file to the catalog
        connection = httplib.HTTPSConnection(catalog_host, catalog_port,
                                             key_file, cert_file)
        for file in files:
            log(logOut, "Processing " + file + " ...")
            try:
                response = post2Catalog(connection, catalog_path,
                                        os.path.join(input_dir, file),
                                        collection_name)
                if response.status == 200:
                    msg = "Succesfully posted " +  file + " to cataloge ..."
                    print msg
                    log(logOut, msg)
                    # Move file to done directory
                    shutil.move(os.path.join(input_dir, file),
                                os.path.join(archive_dir, file))
                else:
                    msg = "Error posting " +  file + " to cataloge [" + response.read() + "] ..."
                    print msg
                    log(logOut, response.read())
                    # Move file to error dir
                    shutil.move(os.path.join(input_dir, file),
                                os.path.join(hold_dir, file))
            except IOError, (errno):
                print "%s" % (errno)

    except httplib.HTTPException, (e):
        print "Unexpected error %s " % (e)

    run_time = time.time() - start_time
    print 'Run time: %f seconds' % run_time

    # Clean up
    connection.close()
    cleanup(logOut)

# Get an array of files from the input_dir
def getFiles2Post(d):
    return (os.listdir(d))

# Read out the conf file and set the needed global variable
def readConfFile(xmldoc, tag):
    return (xmldoc.getElementsByTagName(tag)[0].firstChild.data)

# Write out the message to log file
def log(f, m):
    f.write(strftime("%Y-%m-%d %H:%M:%S") + " : " + m + '\n')

# Clean up and exit
def cleanup(logOut):
    if logOut:
        log(logOut, "Proce
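
On the error-handling question, one small pointer as a sketch: the
readConfFile() helper above fails with an IndexError or AttributeError
when a tag is missing or empty, which the broad "except Exception" then
reports without saying which setting was at fault. A variant that names
the problem might look like this. ConfigError and read_setting are
hypothetical names, and the code avoids newer syntax so it should also run
on old interpreters.

```python
from xml.dom import minidom


class ConfigError(Exception):
    """Raised when a required setting is missing or empty."""


def read_setting(xmldoc, tag):
    """Return the text of the first <tag> element, or raise ConfigError."""
    nodes = xmldoc.getElementsByTagName(tag)
    if not nodes:
        raise ConfigError("missing <%s> element" % tag)
    first = nodes[0].firstChild
    # An empty element has no child; a nested element is not a text node.
    if first is None or first.nodeType != first.TEXT_NODE:
        raise ConfigError("<%s> element has no text" % tag)
    value = first.data.strip()
    if not value:
        raise ConfigError("<%s> element is empty" % tag)
    return value
```

The caller can then catch ConfigError separately from parse errors and
print a message that points at the exact setting.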

ANN: Python for Bioinformatics book

2009-08-06 Thread Sebastian Bassi
"Python for Bioinformatics"
ISBN 1584889292
Amazon: http://www.tinyurl.com/biopython
Publisher: http://www.crcpress.com/product/isbn/9781584889298

This book introduces programming concepts to life science researchers,
bioinformaticians, support staff, students, and everyone who is
interested in applying programming to solve biologically-related
problems. Python is the chosen programming language for this task
because it is both powerful and easy-to-use.

It begins with the basic aspects of the language (like data types and
control structures) up to essential skills on today's bioinformatics
tasks like building web applications, using relational database
management systems, XML and version control. There is a chapter
devoted to Biopython (www.biopython.org) since it can be used for most
of the tasks related to bioinformatics data processing.

There is a section with applications with source code, featuring
sequence manipulation, filtering vector contamination, calculating DNA
melting temperature, parsing a genbank file, inferring splicing sites,
and more.

There are questions at the end of every chapter and odd numbered
questions are answered in an appendix making this text suitable for
classroom use.

This book can also be used as reference material as it includes
Richard Gruet's Python Quick Reference, and the Python Style Guide.

DVD: The included DVD features a virtual machine with a special
edition of DNALinux, with all the programs and complementary files
required to run the scripts commented in the book. All scripts can be
tweaked to fit a particular configuration. By using a pre-configured
virtual machine the reader has access to the same development
environment as the author, so he can focus on learning Python. All
code is also available at http://py3.us/## where ## is the code
number, for example: http://py3.us/57

I've been working on this book for more than two years testing the
examples under different setups and working to make the code
compatible for most versions of Python, Biopython and operating
systems. Where there is code that only works with a particular
dependency, this is clearly noted.

Finally, I want to highlight that non-bioinformaticians out there can
use this book as an introduction to bioinformatics by starting with
the included "Diving into the Gene Pool with BioPython" (by Zachary
Voase and published originally in Python Magazine).


-- 
Sebastián Bassi. Diplomado en Ciencia y Tecnología.

Non standard disclaimer: READ CAREFULLY. By reading this email,
you agree, on behalf of your employer, to release me from all
obligations and waivers arising from any and all NON-NEGOTIATED
agreements, licenses, terms-of-service, shrinkwrap, clickwrap,
browsewrap, confidentiality, non-disclosure, non-compete and
acceptable use policies ("BOGUS AGREEMENTS") that I have
entered into with your employer, its partners, licensors, agents and
assigns, in perpetuity, without prejudice to my ongoing rights and
privileges. You further represent that you have the authority to release
me from any BOGUS AGREEMENTS on behalf of your employer.


Re: Python docs disappointing - group effort to hire writers?

2009-08-06 Thread Kee Nethery


On Aug 6, 2009, at 6:55 AM, Terry Reedy wrote:


RayS wrote:

At 08:35 PM 8/5/2009 -0700, r wrote:

"""... Any real sense of community is undermined -- or
even destroyed -- to be replaced by virtual equivalents that strive,
unsuccessfully, to synthesize a sense of community."""
I've brought up the idea of the quasi-community doc that PHP uses  
to good effect.


And what have you done about setting up such a project?

http://www.php.net/manual/en/language.types.array.php is a prime  
example where 2/3 of the "doc" is user-contributed comments and code.


I consider this to be an unreadable mishmash. If you and
others want something like that, do it.  And quit bitching about
the work of those of us who have done something compact and
readable. We are all volunteers here.


As I struggle through trying to figure out how to make python do  
simple stuff for me, I frequently generate samples. If some volunteer  
here would point me towards the documentation that would tell me how I  
can alter the existing Python docs to include sample code, I'd be more  
than happy to do so.


I would like to "do it". Please point me to the docs that tell me how  
to "do it" so that we people with newbie questions and a need for  
examples can get out of your way and "do it" ourselves.


Thanks,
Kee Nethery



--
http://mail.python.org/mailman/listinfo/python-list


Re: unicode() vs. s.decode()

2009-08-06 Thread Thorsten Kampe
* Michael Ströder (Wed, 05 Aug 2009 16:43:09 +0200)
> Both of these expressions are equivalent, but which is faster, or is
> there any other reason to prefer one?
> 
> u = unicode(s,'utf-8')
> 
> u = s.decode('utf-8') # looks nicer

"decode" was added in Python 2.2 for the sake of symmetry with encode(). 
It's essentially the same as unicode() and I wouldn't be surprised if it 
is exactly the same. I don't expect any measurable speed difference 
between the two.
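
One way to check both points (same result, similar speed) is a small
script like the one below. It is only a sketch: the sample string, the
helper names and the iteration count are made up for illustration, and
on Python 3 the name unicode no longer exists, so str(bytes, encoding)
is used there as the nearest equivalent.

```python
import timeit

try:
    text_type = unicode   # Python 2, as discussed in this thread
except NameError:
    text_type = str       # Python 3: str(bytes, encoding) plays the same role

# A UTF-8 encoded byte string containing one non-ASCII character.
s = u"Str\xf6der".encode("utf-8")

# The two spellings being compared.
via_constructor = text_type(s, "utf-8")
via_method = s.decode("utf-8")


def time_constructor(number=100000):
    """Seconds taken by `number` calls of text_type(s, 'utf-8')."""
    return timeit.timeit(lambda: text_type(s, "utf-8"), number=number)


def time_method(number=100000):
    """Seconds taken by `number` calls of s.decode('utf-8')."""
    return timeit.timeit(lambda: s.decode("utf-8"), number=number)


if __name__ == "__main__":
    print("same result: %r" % (via_constructor == via_method))
    print("constructor: %.4f s" % time_constructor())
    print("method:      %.4f s" % time_method())
```

The absolute numbers depend on the interpreter version and the machine,
so it is worth running this a few times before drawing conclusions.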

Thorsten
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: unicode() vs. s.decode()

2009-08-06 Thread Jason Tackaberry
On Thu, 2009-08-06 at 01:31 +, John Machin wrote:
> Faster by an enormous margin; attributing this to the cost of attribute lookup
> seems implausible.

Ok, fair point.  I don't think the time difference fully registered when
I composed that message.

Testing a global access (LOAD_GLOBAL) versus an attribute access on a
global object (LOAD_GLOBAL + LOAD_ATTR) shows that the latter is about
40% slower than the former.  So that certainly doesn't account for the
difference.
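
For anyone who wants to repeat that comparison, something along these
lines should do. It is a rough sketch, not the exact test described
above: Holder, value and the two reader functions are hypothetical names
invented for the example, and the function-call overhead that timeit
adds will blur the ratio somewhat.

```python
import timeit


class Holder(object):
    """Hypothetical container; it exists only to give us an attribute to read."""
    value = 1


value = 1           # read through a plain global lookup
holder = Holder()   # read through a global lookup plus an attribute lookup


def read_global():
    # Compiles to a single global lookup of `value`.
    return value


def read_attribute():
    # Compiles to a global lookup of `holder` followed by an attribute lookup.
    return holder.value


def compare(number=1000000):
    """Return (seconds for global reads, seconds for attribute reads)."""
    t_global = timeit.timeit(read_global, number=number)
    t_attr = timeit.timeit(read_attribute, number=number)
    return t_global, t_attr


if __name__ == "__main__":
    t_global, t_attr = compare()
    print("global:    %.4f s" % t_global)
    print("attribute: %.4f s" % t_attr)
```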


> Suggested further avenues of investigation:
> 
> (1) Try the timing again with "cp1252" and "utf8" and "utf_8"
> 
> (2) grep "utf-8" /Objects/unicodeobject.c

Very pedagogical of you. :)  Indeed, it looks like the bigger player in
the performance difference is the fact that the code path for unicode(s,
enc) short-circuits the codec registry for common encodings (which
include 'utf-8' specifically), whereas s.decode('utf-8') necessarily
consults the codec registry.
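
Suggestion (1) can be scripted roughly as follows. Treat it as a sketch
under assumptions: the list of spellings, the ASCII-only sample bytes
and the helper names are chosen for illustration, and str stands in for
unicode on Python 3. codecs.lookup() is used only to show which codec
each spelling resolves to in the registry; whether a given spelling hits
a fast path is something the timings have to show on your own
interpreter.

```python
import codecs
import timeit

try:
    text_type = unicode   # Python 2
except NameError:
    text_type = str       # Python 3 stand-in

# ASCII-only input, so every encoding tried below decodes it the same way.
s = b"hello world"

# Spellings suggested in the thread, plus the original 'utf-8'.
ENCODINGS = ["utf-8", "utf8", "utf_8", "cp1252"]


def canonical_names():
    """Map each spelling to the name of the codec the registry returns."""
    return dict((enc, codecs.lookup(enc).name) for enc in ENCODINGS)


def time_all(number=100000):
    """Map each spelling to (constructor seconds, method seconds)."""
    results = {}
    for enc in ENCODINGS:
        t_ctor = timeit.timeit(lambda: text_type(s, enc), number=number)
        t_meth = timeit.timeit(lambda: s.decode(enc), number=number)
        results[enc] = (t_ctor, t_meth)
    return results


if __name__ == "__main__":
    names = canonical_names()
    for enc, (t_ctor, t_meth) in sorted(time_all().items()):
        print("%-7s -> %-7s constructor %.4f s, method %.4f s"
              % (enc, names[enc], t_ctor, t_meth))
```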

Cheers,
Jason.

-- 
http://mail.python.org/mailman/listinfo/python-list

