Re: Pickled objects over the network

2007-07-22 Thread Hendrik van Rooyen
"Steve Holden" <[EMAIL PROTECTED]> wrote:

> Yes.

Why?

- Hendrik

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: importing a module from a specific directory

2007-07-22 Thread Gabriel Genellina
En Sun, 22 Jul 2007 09:03:43 -0300, O.R.Senthil Kumaran  
<[EMAIL PROTECTED]> escribió:

>>  I would like to organize them into directory structure in
>>  which there is a 'main' directory, and under it directories for
>>  specific sub-tasks, or sub-experiments, I'm running (let's call them
>>  'A', 'B', 'C').
>>  Is there a neat clean way of achieving the code organization?
>>
>
> This is a kind of a frequently asked question at c.l.p and every  
> programmer I
> guess has to go through this problem.
> If you look around c.l.p you will find that one of the good ways to  
> solve this
> problem with the python interpretor <2.5 is:
>
 import sys
 sys.path.append(os.path.abspath(os.pardir))

I would write it as  
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__  
- or certainly split it into two lines, because the current directory may  
not be the directory containing the script.

> But, if you are using Python 2.5, you are saved.
>
> 
> Starting with Python 2.5, in addition to the implicit relative imports
> described above, you can write explicit relative imports with the from  
> module
> import name form of import statement. These explicit relative imports use
> leading dots to indicate the current and parent packages involved in the
> relative import. From the surround module for example, you might use:

Note that this only applies to *packages*, not alone modules. And if you  
already have a package, it's better to put the driver/test/demo code out  
of the package itself, this way it must import the package the same way as  
any other client code. I usually place them in the directory containing  
the package (so "import package" just works). Other people prefer a  
different layout; search for some recent posts on this subject.

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: URL parsing for the hard cases

2007-07-22 Thread Miles
On 7/23/07, John Nagle wrote:
> Here's another hard case.  This one might be a bug in urlparse:
>
> import urlparse
>
> s = 'ftp://administrator:[EMAIL PROTECTED]/originals/6 june
> 07/ebay/login/ebayisapi.html'
>
> urlparse.urlparse(s)
>
> yields:
>
> (u'ftp', u'administrator:[EMAIL PROTECTED]', u'/originals/6 june
> 07/ebay/login/ebayisapi.html', '', '', '')
>
> That second field is supposed to be the "hostport" (per the RFC usage
> of the term; Python uses the term "netloc"), and the username/password
> should have been parsed and moved to the "username" and "password" fields
> of the object. So it looks like urlparse doesn't really understand FTP URLs.

Those values aren't "moved" to the fields; they're extracted on the
fly from the netloc.  Use the .hostname property of the result tuple
to get just the hostname.

-Miles
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advice on sending images to clients over network

2007-07-22 Thread Frank Millman

Frank Millman wrote:
> Hi all
>
> This is not strictly a Python question, but as the system to which
> relates is written in Python, hopefully it is not too off-topic.
>
[...]
> I now want to add the capability of displaying images on the client.
> For example, if the application deals with properties, I want to
> display various photographs of the property on the client. wxPython is
> perfectly capable of displaying the image. My question is, what is the
> best way to get the image to the client?
>

Thanks for all the responses.

The verdict seems unanimous - use http.

Thanks for pointing me in the right direction.

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: URL parsing for the hard cases

2007-07-22 Thread John Nagle
Here's another hard case.  This one might be a bug in urlparse:

import urlparse

s = 'ftp://administrator:[EMAIL PROTECTED]/originals/6 june 
07/ebay/login/ebayisapi.html'

urlparse.urlparse(s)

yields:

(u'ftp', u'administrator:[EMAIL PROTECTED]', u'/originals/6 june 
07/ebay/login/ebayisapi.html', '', '', '')

That second field is supposed to be the "hostport" (per the RFC usage
of the term; Python uses the term "netloc"), and the username/password
should have been parsed and moved to the "username" and "password" fields
of the object. So it looks like urlparse doesn't really understand FTP URLs.

That's a real URL, from a search for phishing sites.  There are lots
of hostile URLs out there.  Some of which can fool some parsers.

John Nagle

John Nagle wrote:
> [EMAIL PROTECTED] wrote:
> 
>> Once you eliminate IPv6 addresses, parsing is simple. Is there a
>> colon? Then there is a port number. Does the left over have any
>> characters not in [0123456789.]? Then it is a name, not an IPv4
>> address.
>>
>> --Michael Dillon
>>
> 
>   You wish.  Hex input of IP addresses is allowed:
> 
> http://0x525eedda
> 
> and
> 
> http://0x52.0x5e.0xed.0xda
> 
> are both "Python.org".  Or just put
> 
> 0x52.0x5e.0xed.0xda
> 
> into the address bar of a browser.  All these work in Firefox on Windows 
> and
> are recognized as valid IP addresses.
> 
> On the other hand,
> 
> 0x52.com
> 
> is a valid domain name, in use by PairNIC.
> 
> But
> 
> http://test.0xda
> 
> is handled by Firefox on Windows as a domain name.  It doesn't resolve, 
> but it's
> sent to DNS.
> 
> So I think the question is whether every term between dots can be parsed as
> a decimal or hex number.  If all terms can be parsed as a number, and 
> there are
> no more than four of them, it's an IP address.  Otherwise it's a domain 
> name.
> 
> There are phishing sites that pull stuff like this, and I'm parsing a 
> long list
> of such sites.  So I really do need to get the hard cases right.
> 
> Is there any library function that correctly tests for an IP address vs. a
> domain name based on syntax, i.e. without looking it up in DNS?
> 
> John Nagle
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: custom plugin architecture: how to see parent namespace?

2007-07-22 Thread escalation746
Jorge Godoy wrote:

> escalation746 wrote:
> > I have updated documentation for this on my blog, diagrammes modernes.
> > Surf:
> >http://diagrammes-modernes.blogspot.com/
>
> Your motivation looks a lot like what is solved by setuptools, eggs and
> entry points.

Though that problem domain looks different (installation, app
dependencies), you are no doubt correct that there is functionality
overlap. However, that is a maximal solution while mine is minimal and
hence (maybe) of use to people who prefer a couple dozen lines of code
to an entire package.

-- robin

-- 
http://mail.python.org/mailman/listinfo/python-list


Configure apache to run python scripts

2007-07-22 Thread joe jacob
I need to configure apache to run python scripts. I followed the steps
mentioned in this site (http://www.thesitewizard.com/archive/
addcgitoapache.shtml). But I am not able to run python scripts from
Firefox, I  got a forbidden error "you do not have permission to
access the file in the server" when I try to run the script form
Firefox browser. Somebody please help me.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Ben Finney
Steve Holden <[EMAIL PROTECTED]> writes:

> Well, if you're going to start answering questions with FACTS, how
> can questioners reply on their prejudices to guide them any more?

You clearly underestimate the capacity for such people to choose only
the particular facts that support those prejudices.

-- 
 \ "Are you pondering what I'm pondering?" "I think so, Brain, but |
  `\ I don't think Kay Ballard's in the union."  -- _Pinky and The |
_o__)   Brain_ |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: custom plugin architecture: how to see parent namespace?

2007-07-22 Thread Jorge Godoy
escalation746 wrote:

> I have updated documentation for this on my blog, diagrammes modernes.
> Surf:
> http://diagrammes-modernes.blogspot.com/

Your motivation looks a lot like what is solved by setuptools, eggs and
entry points.

http://peak.telecommunity.com/DevCenter/PkgResources
http://docs.pythonweb.org/display/pylonscookbook/Using+Entry+Points+to+Write+Plugins

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: idiom for RE matching

2007-07-22 Thread mik3l3374
On Jul 19, 12:52 pm, Gordon Airporte <[EMAIL PROTECTED]> wrote:
> I have some code which relies on running each line of a file through a
> large number of regexes which may or may not apply. For each pattern I
> want to match I've been writing
>
> gotit = mypattern.findall(line)
> if gotit:
> gotit = gotit[0]
> ...do whatever else...
>
> This seems kind of clunky. Is there a prettier way to handle this?
> I've also been assuming that using the re functions that create match
> objects is slower/heavier than dealing with the simple list returned by
> findall(). I've profiled it and these matches are the biggest part of
> the running time of the program, so I really would rather not use
> anything slower.

if your search is not overly complicated, i think regexp is not
needed. if you want, you can post a sample what you want to search,
and some sample input.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Re-running unittest

2007-07-22 Thread Gabriel Genellina
En Sun, 22 Jul 2007 17:43:03 -0300, Israel Fernández Cabrera  
<[EMAIL PROTECTED]> escribió:

> I'm writing some code that automatically execute some registered unit
> test in a way to automate the process. A sample code follows to
> illustrate what I'm doing:
>
> 
> class PruebasDePrueba(unittest.TestCase):
>def testUnTest(self):
>a = 2
>b = 1
>self.assertEquals(a, b)
>
> def runTests():
>loader = unittest.TestLoader()
>result = unittest.TestResult()
>suite = loader.loadTestsFromName("import_tests.PruebasDePrueba")
>suite.run(result)
>print "Errores: ", len(result.errors)
>print "Fallos: ", len(result.failures)
>
> if __name__ == "__main__":
>runTests()
>raw_input("Modify [fix] the test and press ENTER to continue")
> 
>
> The code executes the tests from the class PruebasDePrueba, as the
> user to "fix" the failing test and then executes the tests again after
> ENTER is pressed.
> The test's initial state is "fail" so, equaling the values of a or b
> in the second execution I wait the test does not fails again, but it
> does.
> I've changed the original code in very different ways trying to figure
> out what is wrong with it but no success
> The problem is not reproduced if instead of loading the test from the
> TestCase (import_tests.PruebasDePrueba) they are loaded referring the
> container module and this behaves like this because I wrote a class
> that inherits from unittest.TestLoader abd re-defines the
> loadTestsFromModule(module) then every time this method is called, the
> module is reloaded via "reload" python's function. I would like to do
> the same with TestCases.

I'm a bit confused. Perhaps the description above does not match the code.  
The code does not use reload, and has no loops, so it is executed only  
once. The way I interpret it: I run the script, the test fails; I "fix"  
testUnTest; I run the script again, it passes.
When I say "I run the script", I mean typing "python import_tests.py" at  
the console, or similar.
Perhaps you are running it from inside another environment, like IDLE, and  
you keep objects created which won't notice the changed code, even if you  
use reload(). The answer would be: don't do that. Read the last comments  
in the reload documentation  


Another thing I don't understand, is that usually one fixes the CODE until  
it passes the tests; fixing the TESTS so the code passes looks a bit  
strange.

> I have written this problem to several other python lists but I have
> not received any answer, hope this time is different,
> I'd like to thaks in advance, regards

Perhaps you have to explain it further. If your problem is the usage of  
reload, as I think, this has nothing to do with unit tests, and worse,  
your code does not even show how you use that function.

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: URL parsing for the hard cases

2007-07-22 Thread John Nagle
[EMAIL PROTECTED] wrote:

> Once you eliminate IPv6 addresses, parsing is simple. Is there a
> colon? Then there is a port number. Does the left over have any
> characters not in [0123456789.]? Then it is a name, not an IPv4
> address.
> 
> --Michael Dillon
> 

   You wish.  Hex input of IP addresses is allowed:

http://0x525eedda

and

http://0x52.0x5e.0xed.0xda

are both "Python.org".  Or just put

0x52.0x5e.0xed.0xda

into the address bar of a browser.  All these work in Firefox on Windows and
are recognized as valid IP addresses.

On the other hand,

0x52.com

is a valid domain name, in use by PairNIC.

But

http://test.0xda

is handled by Firefox on Windows as a domain name.  It doesn't resolve, but it's
sent to DNS.

So I think the question is whether every term between dots can be parsed as
a decimal or hex number.  If all terms can be parsed as a number, and there are
no more than four of them, it's an IP address.  Otherwise it's a domain name.

There are phishing sites that pull stuff like this, and I'm parsing a long list
of such sites.  So I really do need to get the hard cases right.

Is there any library function that correctly tests for an IP address vs. a
domain name based on syntax, i.e. without looking it up in DNS?

John Nagle
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: custom plugin architecture: how to see parent namespace?

2007-07-22 Thread escalation746
I have updated documentation for this on my blog, diagrammes modernes.
Surf:
http://diagrammes-modernes.blogspot.com/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Where is the collections module?

2007-07-22 Thread Jerry Hill
On 7/22/07, Gordon Airporte <[EMAIL PROTECTED]> wrote:
> Gordon Airporte wrote:
> > I was going to try tweaking defaultdict, but I can't for the life of me
> > find where the collections module or its structures are defined. Python
> > 2.5.
>
> Thanks all. I was expecting it in Python. Time to dust off my C :-P

If you'd rather work with a pure python implementation, Jason Kirtland
has written one on the Python Cookbook:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/523034

-- 
Jerry
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: recursively expanding $references in dictionaries

2007-07-22 Thread Alia Khouri
Ok. I've reached a nice little conclusion here. Time to go to bed, but
before that I thought I'd share the results (-;

I can now read a yaml file which natively produces a dict tree and
convert it into an object tree with attribute read/write access, dump
that back into a readable yaml string, and then expand references
within that using cheetah in a very nifty self-referential way.*

(see: http://pyyaml.org/wiki/PyYAML) and cheetah templates (http://
www.cheetahtemplate.org/)

Enjoy!


AK




from Cheetah.Template import Template
from pprint import pprint
import yaml

class Object(dict):
def __getattr__(self, name):
if name in self: return self[name]
#~ if name in self.__dict__: return getattr(self, name)
def __setattr__(self, name, value):
self[name] = value

def getTree(tree, to_dict=False):
_tree = Object() if not to_dict else dict()
def recurse(targetDict, sourceDict):
for key, value in sourceDict.items():
if isinstance(value, dict):
value = Object(value) if not to_dict else dict(value)
new_target = targetDict.setdefault(key, value)
recurse(new_target, value)
else:
targetDict[key] = value
recurse(_tree, tree)
return _tree


config  = '''
app:
  name: appname
  copyright: me.org 2007

dir:
  src: /src/$app.name
'''
# from yml dict tree to obj tree
root = getTree(yaml.load(config))
print root
print
assert root.app.name == root.app['name']
root.app.name = "the_monster"

# from obj tree to dict tree
root = getTree(root, to_dict=True)

# from dict tree to yaml string
s = yaml.dump(root)
print s

# use cheetah templates to expand references
print str(Template(s, searchList=[root]))

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Where is the collections module?

2007-07-22 Thread Gordon Airporte
Gordon Airporte wrote:
> I was going to try tweaking defaultdict, but I can't for the life of me 
> find where the collections module or its structures are defined. Python 
> 2.5.

Thanks all. I was expecting it in Python. Time to dust off my C :-P
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: idiom for RE matching

2007-07-22 Thread Gordon Airporte
[EMAIL PROTECTED] wrote:

> Have you read and understood what MULTILINE means in the manual
> section on re syntax?
> 
> Essentially, you can make a single pattern which tests a match against
> each line.
> 
> -- Michael Dillon

No, I have not looked into this - thank you. RE's are hard enough to get 
into that I didn't want the added complication of the flags. Now that 
I'm comfortable writing patterns I guess I never got around to the rest 
of the options.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: custom plugin architecture: how to see parent namespace?

2007-07-22 Thread escalation746
Wojciech Mu a wrote:

> These names don't match.  I replaced Valuable() with proper name,
> and everything work fine.

That was a result of a transcription error when posting to the
newsgroup. My actual test code did not have this error but
nevertheless did not work.

However, copying the code I *did* post to the newsgroup and making
that change you pointed out... the code indeed worked as you claimed!

Two wrongs making a right?

I am sure when I look at this tomorrow it will not work again. :-)

-- robin

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: recursively expanding $references in dictionaries

2007-07-22 Thread Alia Khouri
Oops, I left some redundant cruft in the function... here it is
slightly cleaner:

def expand(dikt):
names = {}
output = {}
def _search(_, sourceDict):
for key, value in sourceDict.items():
if isinstance(value, dict):
_search({}, value)
if not '$' in value:
names[key] = value
_search({}, dikt)
def _substitute(targetDict, sourceDict):
for key, value in sourceDict.items():
if isinstance(value, dict):
new_target = targetDict.setdefault(key, {})
_substitute(new_target, value)
else:
targetDict[key] =
Template(value).substitute(names)
_substitute(output, dikt)
return output

print expand(d2)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sort lines in a text file

2007-07-22 Thread leegold
...snip...
>
> Do your own homework.

Hush troll.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Lazy "for line in f" ?

2007-07-22 Thread Steve Holden
Alexandre Ferrieux wrote:
> On Jul 22, 7:21 pm, Miles <[EMAIL PROTECTED]> wrote:
>> On 7/22/07, Alexandre Ferrieux  wrote:
>>
>>> The Tutorial says about the "for line in f" idiom that it is "space-
>>> efficient".
>>> Short of further explanation, I interpret this as "doesn't read the
>>> whole file before spitting out lines".
>>> In other words, I would say "lazy". Which would be a Good Thing, a
>>> much nicer idiom than the usual while loop calling readline()...
>>> But when I use it on the standard input, be it the tty or a pipe, it
>>> seems to wait for EOF before yielding the first line.
>> It doesn't read the entire file, but it does use internal buffering
>> for performance.  On my system, it waits until it gets about 8K of
>> input before it yields anything.  If you need each line as it's
>> entered at a terminal, you're back to the while/readline (or
>> raw_input) loop.
> 
> How frustrating ! Such a nice syntax for such a crippled semantics...
> 
> Of course, I guess it is trivial to write another iterator doing
> exactly what I want.
> But nonetheless, it is disappointing not to have it with the standard
> file handles.
> And speaking about optimization, I doubt blocking on a full buffer
> gains much.
> For decades, libc's fgets() has been doing it properly (block-
> buffering when data come swiftly, but yielding lines as soon as they
> are complete)... Why is the Python library doing this ?
> 
What makes you think Python doesn't use the platform fgets()? As a 
matter of policy the Python library offers as thin as possbile a shim 
over the C standard library when this is practical - as it is with "for 
line in f:". But in  the case of file.next() (the file method called to 
iterate over the contents) it will actually use getc_unlocked() on 
platforms that offer it, though you can override that configuration 
feature by setting USE_FGETS_IN_GETLINE,

It's probably more to do with the buffering. If whatever is driving the 
file is using buffering itself, then it really doesn't matter what the 
Python library does, it will still have to wait until the sending buffer 
fills before it can get any data at all.

Try running stdin unbuffered (use python -u) and see if that makes any 
difference. It should, in the shell-driven case, for example.

regards
  Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd   http://www.holdenweb.com
Skype: holdenweb  http://del.icio.us/steve.holden
--- Asciimercial --
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
--- Thank You for Reading -

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sort lines in a text file

2007-07-22 Thread leegold
...snip...
> To save anybody who's tempted to write the whole shebang for you,
> please specify which part(s) of the exercise you are having problems
> with:
> (a) reading lines from a file
> (b) extracting a sort key from a line [presuming "number" means
> "positive integer"; what do you want to do if multiple lines have the
> same number? what if no number at all"]
> (c) creating a list of tuples, where each tuple is (key,
> line_contents)
> (d) sorting the list
> (e) creating the output file from the sorted list.

Thanks, you've done more than enough right here!

(a) through (e) gives me a good start.

Again, thank you for this.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread I V
On Sun, 22 Jul 2007 21:13:02 +0200, Peter Kleiweg wrote:
> Here is another "space":
> 
>   >>> u'\uFEFF'.isspace()
>   False
> 
> isspace() is inconsistent

Well, U+00A0 is in the category "Separator, Space" while U+FEFF is in the
category "Other, Format", so it doesn't seem unreasonable that one is
treated as a space and the other isn't.
-- 
http://mail.python.org/mailman/listinfo/python-list


recursively expanding $references in dictionaries

2007-07-22 Thread Alia Khouri
I was kind of wondering what ways are out there to elegantly expand
'$name' identifiers in nested dictionary value. The problem arose when
I wanted to include that kind of functionality to dicts read from yaml
files such that:

def func(input):
# do something
return output

where:

input = {'firstname': 'John', 'lastname': 'Smith', 'src': 'c:/tmp/
file',
'dir':  {'path': '$src', 'fullname': '$firstname $lastname'}}

output = {'firstname': 'John', 'lastname': 'Smith', 'src': 'c:/tmp/
file',
'dir':  {'path': 'c:/tmp/file', 'fullname': 'John Smith'}}


Doing this substitution easy done when you have a flat dict, but when
they got nested, I had to resort to an undoubtedly ugly function with
two recursive passes and obvious deficiencies.

Is there a better way to do this?

Thanks for any help...

AK




# test_recurse.py

from string import Template
from pprint import pprint

def expand(dikt):
'''
>>> d = expand2({'firstname': 'John', 'lastname': 'Smith',
'fullname': '$firstname $lastname'})
>>> d == {'lastname': 'Smith', 'fullname': 'John Smith',
'firstname': 'John'}
True
'''
subs = {}
for key, value in dikt.items():
if '$' in value:
subs[key] = Template(value).substitute(dikt)
dikt.update(subs)
return dikt


dikt = {'firstname': 'John', 'lastname': 'Smith',
'fullname': '$firstname $lastname'}

#~ print expand(dikt)

d1 = {'firstname': 'John', 'lastname': 'Smith',
'dir': {'fullname': '$firstname $lastname'}
}
d2 = {'firstname': 'John', 'lastname': 'Smith', 'src': 'c:/tmp/file',
'dir': {'fullname': '$firstname $lastname', 'path':
'$src'}
}

def rexpand(dikt):
subs = {}
names = {}
# pass 1
def recurse(_, sourceDict):
for key, value in sourceDict.items():
if isinstance(value, dict):
recurse({}, value)
elif '$' in value:
subs[key] = value
else:
names[key] = value
recurse({}, dikt)
print 'subs', subs
print 'names', names
print 'dikt (before):', dikt
for key, value in subs.items():
subs[key] = Template(value).substitute(names)
# -
# pass 2
output = {}
def substitute(targetDict, sourceDict):
for key, value in sourceDict.items():
if isinstance(value, dict):
new_target = targetDict.setdefault(key, {})
substitute(new_target, value)
else:
targetDict[key] =
Template(value).substitute(names)
substitute(output, dikt)
print 'output:', output
return output

rexpand(d2)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Lazy "for line in f" ?

2007-07-22 Thread Alexandre Ferrieux
On Jul 22, 7:21 pm, Miles <[EMAIL PROTECTED]> wrote:
> On 7/22/07, Alexandre Ferrieux  wrote:
>
> > The Tutorial says about the "for line in f" idiom that it is "space-
> > efficient".
> > Short of further explanation, I interpret this as "doesn't read the
> > whole file before spitting out lines".
> > In other words, I would say "lazy". Which would be a Good Thing, a
> > much nicer idiom than the usual while loop calling readline()...
>
> > But when I use it on the standard input, be it the tty or a pipe, it
> > seems to wait for EOF before yielding the first line.
>
> It doesn't read the entire file, but it does use internal buffering
> for performance.  On my system, it waits until it gets about 8K of
> input before it yields anything.  If you need each line as it's
> entered at a terminal, you're back to the while/readline (or
> raw_input) loop.

How frustrating ! Such a nice syntax for such a crippled semantics...

Of course, I guess it is trivial to write another iterator doing
exactly what I want.
But nonetheless, it is disappointing not to have it with the standard
file handles.
And speaking about optimization, I doubt blocking on a full buffer
gains much.
For decades, libc's fgets() has been doing it properly (block-
buffering when data come swiftly, but yielding lines as soon as they
are complete)... Why is the Python library doing this ?

-Alex

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Steve Holden
Jean-Paul Calderone wrote:
> On Sun, 22 Jul 2007 21:13:02 +0200, Peter Kleiweg 
<[EMAIL PROTECTED]> wrote:
>> Carsten Haese schreef op de 22e dag van de hooimaand van het jaar 2007:
>>
>>> On Sun, 2007-07-22 at 17:44 +0200, Peter Kleiweg wrote:
> It's a feature. See help(str.split): "If sep is not specified or is
> None, any whitespace string is a separator."
 Define "any whitespace".
>>> Any string for which isspace returns True.
>> Define white space to isspace()
>>
 Why is it different in  and ?
>> '\xa0'.isspace()
>>> False
>> u'\xa0'.isspace()
>>> True
>> Here is another "space":
>>
>>  >>> u'\uFEFF'.isspace()
>>  False
>>
>> isspace() is inconsistent
> 
> It's only inconsistent if you think it should behave based on the
> name of a unicode code point.  It doesn't use the name, though. It
> uses the category.  NO-BREAK SPACE is in the Zs category (Separator, Space).
> ZERO WIDTH NO-BREAK SPACE is in the Cf category (Other, Format).
> 
> Maybe that makes unicode inconsistent (I won't try to argue either way),
> but it's pretty clear that isspace is being consistent based on the data
> it has to work with.
> 
Well, if you're going to start answering questions with FACTS, how can 
questioners reply on their prejudices to guide them any more?

regards
  Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd   http://www.holdenweb.com
Skype: holdenweb  http://del.icio.us/steve.holden
--- Asciimercial --
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
--- Thank You for Reading -

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: web page text extractor

2007-07-22 Thread Thomas Dickey
Miki <[EMAIL PROTECTED]> wrote:
> (You can find lynx at http://lynx.browser.org/)

not exactly -

The current version of lynx is 2.8.6

It's available at
http://lynx.isc.org/lynx2.8.6/
2.8.7 Development & patches:
http://lynx.isc.org/current/index.html

-- 
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: space / nonspace

2007-07-22 Thread marduk
On Sun, 2007-07-22 at 22:33 +0200, Peter Kleiweg wrote:
> >>> import re
> >>> s = u'a b\u00A0c d'
> >>> s.split()
> [u'a', u'b', u'c', u'd']
> >>> re.findall(r'\S+', s)
> [u'a', u'b\xa0c', u'd']  
> 

If you want the Unicode interpretation of \S+, etc, you pass the
re.UNICODE flag:

>>> re.findall(r'\S+', s,re.UNICODE)
[u'a', u'b', u'c', u'd']

See http://docs.python.org/lib/node46.html

> 
> This isn't documented either:
> 
> >>> s = ' b c '
> >>> s.split()
> ['b', 'c']
> >>> s.split(' ')
> ['', 'b', 'c', '']

I believe the following documents it accurately:
http://docs.python.org/lib/string-methods.html

If sep is not specified or is None, a different splitting
algorithm is applied. First, whitespace characters (spaces,
tabs, newlines, returns, and formfeeds) are stripped from both
ends. Then, words are separated by arbitrary length strings of
whitespace characters. Consecutive whitespace delimiters are
treated as a single delimiter ("'1 2 3'.split()" returns "['1',
'2', '3']"). Splitting an empty string or a string consisting of
just whitespace returns an empty list.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: custom plugin architecture: how to see parent namespace?

2007-07-22 Thread Wojciech Muła
escalation746 wrote:
> def ViewValuable():
  
[...]
> code = """
> Hello()
> Plus()
> Valuable()
  
> """

These names don't match.  I replaced Valuable() with proper name,
and everything work fine.

w.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: space / nonspace

2007-07-22 Thread Carsten Haese
On Sun, 2007-07-22 at 22:33 +0200, Peter Kleiweg wrote:
> >>> import re
> >>> s = u'a b\u00A0c d'
> >>> s.split()
> [u'a', u'b', u'c', u'd']
> >>> re.findall(r'\S+', s)
> [u'a', u'b\xa0c', u'd']  

And your question is...?

> This isn't documented either:
> 
> >>> s = ' b c '
> >>> s.split()
> ['b', 'c']
> >>> s.split(' ')
> ['', 'b', 'c', '']

See http://docs.python.org/lib/string-methods.html

-- 
Carsten Haese
http://informixdb.sourceforge.net


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: URL parsing for the hard cases

2007-07-22 Thread memracom
On 22 Jul, 18:56, John Nagle <[EMAIL PROTECTED]> wrote:
> Is there something available that will parse the "netloc" field as
> returned by URLparse, including all the hard cases?  The "netloc" field
> can potentially contain a port number and a numeric IP address.  The
> IP address may take many forms, including an IPv6 address.
>
> I'm parsing URLs used by hostile sites, and the wierd cases come up
> all too frequently.

I assume that when you say "netloc" you are referring to the second
field returned by the urlparse module. If this netloc contains an IPv6
address then it will also contain square brackets. The colons inside
the [] belong to the IPv6 address and the single possible colon
outside the brackets belongs to the port number. Of course, you might
want to try to help people who do not follow the RFCs and failed to
wrap the IPv6 address in square brackets. In that case, try...expect
comes in handy. You can try to parse an IPv6 address and if it fails
because of too many segments, then fallback to some other behaviour.

The worst case is a URL like http://2001::123:4567:abcd:8080/something.
Does the 8080 refer to a port number or part of the IPv6 address. If I
had to support non-bracketed IPv6 addresses, then I would interpret
this as http://[2001::123:4567:abcd]:8080/something.

RFC3986 is the reference for correct URL formats.

Once you eliminate IPv6 addresses, parsing is simple. Is there a
colon? Then there is a port number. Does the left over have any
characters not in [0123456789.]? Then it is a name, not an IPv4
address.

--Michael Dillon

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: custom plugin architecture: how to see parent namespace?

2007-07-22 Thread escalation746
faulkner wrote:

> sys._getframe(1).f_locals

Brilliant. That one's pretty well hidden and labeled "should be used
for internal and specialized purposes only". Guess I'm officially
special. :-)

To implement this with minimal requirements on the author of the
plugin, I created a function in master.py:
def GetKey():
ns = sys._getframe(2).f_locals
return ns['VALUABLE']

Now the plugin gets the useful info simply:
import master
print master.GetKey()

Though it would be nice if this info could somehow be "injected" into
the namespace of plugin.py without this, I can live with two lines.

Thanks!

-- robin

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CSV without first line?

2007-07-22 Thread memracom
On 15 Jul, 04:30, "Sebastian Bassi" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> In my CSV file, the first line has the name of the variables. So the
> data I want to parse resides from line 2 up to the end. Here is what I
> do:
>
> import csv
> lines=csv.reader(open("MYFILE"))
> lines.next() #this is just to avoid the first line
> for line in lines:
> DATA PARSING
>
> This works fine. But I don't like to do "lines.next()" just to get rid
> of the first line. So I wonder if the reader function on the csv
> module has something that could let me parse the file from the second
> line (w/o doing that lines.next()).

There is nothing better than the way you are doing it now. Your code
is explicit so that everyone reading it can see that you skip exactly
one line at the beginning.

Imagine that one day someone presents you with a CSV file containing
two header rows? You can easily handle this by writing:

lines.next() # and skip the second line too

but any other magic method that you can find, might not work.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re-running unittest

2007-07-22 Thread Israel Fernández Cabrera
Hi
I'm writing some code that automatically execute some registered unit
test in a way to automate the process. A sample code follows to
illustrate what I'm doing:


class PruebasDePrueba(unittest.TestCase):
   def testUnTest(self):
   a = 2
   b = 1
   self.assertEquals(a, b)

def runTests():
   loader = unittest.TestLoader()
   result = unittest.TestResult()
   suite = loader.loadTestsFromName("import_tests.PruebasDePrueba")
   suite.run(result)
   print "Errores: ", len(result.errors)
   print "Fallos: ", len(result.failures)

if __name__ == "__main__":
   runTests()
   raw_input("Modify [fix] the test and press ENTER to continue")


The code executes the tests from the class PruebasDePrueba, as the
user to "fix" the failing test and then executes the tests again after
ENTER is pressed.
The test's initial state is "fail" so, equaling the values of a or b
in the second execution I wait the test does not fails again, but it
does.
I've changed the original code in very different ways trying to figure
out what is wrong with it but no success
The problem is not reproduced if instead of loading the test from the
TestCase (import_tests.PruebasDePrueba) they are loaded referring the
container module and this behaves like this because I wrote a class
that inherits from unittest.TestLoader abd re-defines the
loadTestsFromModule(module) then every time this method is called, the
module is reloaded via "reload" python's function. I would like to do
the same with TestCases.

I have written this problem to several other python lists but I have
not received any answer, hope this time is different,
I'd like to thaks in advance, regards


-- 

Israel Fdez. Cabrera
[EMAIL PROTECTED]

 . 0 .
 . . 0
 0 0 0
-- 
http://mail.python.org/mailman/listinfo/python-list


space / nonspace

2007-07-22 Thread Peter Kleiweg

>>> import re
>>> s = u'a b\u00A0c d'
>>> s.split()
[u'a', u'b', u'c', u'd']
>>> re.findall(r'\S+', s)
[u'a', u'b\xa0c', u'd']  


This isn't documented either:

>>> s = ' b c '
>>> s.split()
['b', 'c']
>>> s.split(' ')
['', 'b', 'c', '']

   

-- 
Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: best SOAP module

2007-07-22 Thread memracom
On 18 Jul, 14:02, "Sells, Fred" <[EMAIL PROTECTED]> wrote:
> I need to talk to a vendor side via SOAP,  Googling is overwhelming and many
> hits seem to point to older attempts.
>
> Can someone tell me which SOAP module is recommended.  I'm using Python 2.4.

If you are doing this inside the enterprise then you should probably
be using ZSI from the Python Web Services project. I've used this one
successfully with my company's Web21C sdk to send SMS messages.

http://pywebsvcs.sourceforge.net/

Here's the Web21C site if you are interested http://sdk.bt.com/

--Michael Dillon

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: idiom for RE matching

2007-07-22 Thread memracom
On 19 Jul, 05:52, Gordon Airporte <[EMAIL PROTECTED]> wrote:
> I have some code which relies on running each line of a file through a
> large number of regexes which may or may not apply.

Have you read and understood what MULTILINE means in the manual
section on re syntax?

Essentially, you can make a single pattern which tests a match against
each line.

-- Michael Dillon

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python MAPI

2007-07-22 Thread memracom
> Well, I ran Process Monitor with some filters enabled to only watch
> Thunderbird and MS Word. Unfortunately, that didn't give me any of the
> registry edits, so I disabled my filters and ran it without. Now I
> have a log file with 28,000 entries. It's amazing to see all the stuff
> that happens in just a few moments, but how am I supposed to parse
> this mess?

I expect you will find it easier figuring out how to install your app
in the SendTo menu rather than making your app callable via MAPI. This
probably involves ShellExtensions but I believe there are utilities
that can be used to add any arbitrary application to the SendTo menu.
That may be enough for your application.

You might want to have a look at SpamBayes for an example of an
Outlook extension written in Python to get an idea of how you can
interface with Outlook.



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Compiling PythonD using DJGPP

2007-07-22 Thread memracom
On 22 Jul, 18:29, "John Simeon" <[EMAIL PROTECTED]> wrote:
> Hi there. I had an old computer at my disposal and decided to put it to use
> by setting up a nostalgia project with DOS and Windows for Workgroups 3.11.
> gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. 
> -I./Include
>   -DPy_BUILD_CORE -o Python/compile.o Python/compile.c

> Python/compile.c: In function 'optimize_code':
> Python/compile.c:512: warning: pointer targets in assignment differ in
> signedness

This sounds like you are running into problems with C library memory
models on DOS. I.e. LARGE, MEDIUM, SMALL, TINY. Different memory
models use different pointer lengths and presumably, this might result
in the wrong bit being interpreted as a sign bit.

However, before you dig into that, try turning off the optimizations
( -O3 ) because this can be the cause of wierd errors. If this does
work, turn on optimization one level at a time to see how far you can
go.

And if this leads nowhere, then you probably are dealing with a DOS or
DJGPP specific issue. Ask people who work with DJGPP for advice.

Good Luck,

--Michael Dillon

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: custom plugin architecture: how to see parent namespace?

2007-07-22 Thread faulkner
On Jul 22, 10:06 am, escalation746 <[EMAIL PROTECTED]> wrote:
> I've got a namespace query that amounts to this: How can an imported
> function see data in the parent custom namespace? I have read through
> numerous posts which skirt this issue without answering it.
>
> To illustrate, create plugin.py with a couple of functions. The second
> will obviously fail.
>
> 
> def Hello():
> print 'hello'
>
> def ViewValuable():
> print VALUABLE
> 
>
> Then create master.py which loads the plugin at runtime, later running
> various code fragments against it.
>
> 
> # location of plugin module
> filespec = '/path/to/plugins/plugin.py'
> filepath, filename = os.path.split(filespec)
> filename = os.path.splitext(filename)[0]
>
> # add to system path
> if filepath not in sys.path:
> sys.path.append(filepath)
>
> # import into our namespace
> space = __import__(filename, globals(), locals(), [])
> namespace = space.__dict__
>
> # sometime later in the code... define a new function
> def _plus():
> print 'plus'
>
> # add that to our namespace
> namespace.update({'Plus': _plus, 'VALUABLE': 'gold'})
>
> # run custom code
> code = """
> Hello()
> Plus()
> Valuable()
> """
> exec code in namespace
> 
>
> This code will echo the lines:
> hello
> plus
>
> Followed by a traceback for:
> NameError: global name 'VALUABLE' is not defined
>
> The question is: How do I get a function in plugin.py to see VALUABLE?
> Using external storage of some sort is not viable since many different
> instances of plugin.py, all with different values of VALUABLE, might
> be running at once. (In fact VALUABLE is to be a key into a whole
> whack of data stored in a separate module space.)
>
> Extensive modifications to plugin.py is also not a viable approach,
> since that module will be created by users. Rather, I need to be able
> to pass something at execution time to make this happen. Or create an
> access function along the lines of _plus() that I can inject into the
> namespace.
>
> Any help, please? I've been losing sleep over this one.
>
> -- robin

sys._getframe(1).f_locals

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Jean-Paul Calderone
On Sun, 22 Jul 2007 21:13:02 +0200, Peter Kleiweg <[EMAIL PROTECTED]> wrote:
>Carsten Haese schreef op de 22e dag van de hooimaand van het jaar 2007:
>
>> On Sun, 2007-07-22 at 17:44 +0200, Peter Kleiweg wrote:
>> > > It's a feature. See help(str.split): "If sep is not specified or is
>> > > None, any whitespace string is a separator."
>> >
>> > Define "any whitespace".
>>
>> Any string for which isspace returns True.
>
>Define white space to isspace()
>
>> > Why is it different in  and ?
>>
>> >>> '\xa0'.isspace()
>> False
>> >>> u'\xa0'.isspace()
>> True
>
>Here is another "space":
>
>  >>> u'\uFEFF'.isspace()
>  False
>
>isspace() is inconsistent

It's only inconsistent if you think it should behave based on the
name of a unicode code point.  It doesn't use the name, though. It
uses the category.  NO-BREAK SPACE is in the Zs category (Separator, Space).
ZERO WIDTH NO-BREAK SPACE is in the Cf category (Other, Format).

Maybe that makes unicode inconsistent (I won't try to argue either way),
but it's pretty clear that isspace is being consistent based on the data
it has to work with.

Jean-Paul
-- 
http://mail.python.org/mailman/listinfo/python-list


[ANN] ftputil 2.2.3 released

2007-07-22 Thread Stefan Schwarzer
ftputil 2.2.3 is now available from
http://ftputil.sschwarzer.net/download .

Changes since version 2.2.2
---

This release fixes a bug in the ``makedirs`` call (report and fix by
Julian, whose last name I don't know ;-) ). Upgrading is recommended.

What is ftputil?


ftputil is a high-level FTP client library for the Python programming
language. ftputil implements a virtual file system for accessing FTP
servers, that is, it can generate file-like objects for remote files.
The library supports many functions similar to those in the os,
os.path and shutil modules. ftputil has convenience functions for
conditional uploads and downloads, and handles FTP clients and servers
in different timezones.

Read the documentation at
http://ftputil.sschwarzer.net/documentation .

License
---

ftputil is Open Source software, released under the revised BSD
license (see http://www.opensource.org/licenses/bsd-license.php ).

Stefan

-- 
Dr.-Ing. Stefan Schwarzer
SSchwarzer.com - Softwareentwicklung für Technik und Wissenschaft
http://sschwarzer.com
-- 
http://mail.python.org/mailman/listinfo/python-list


ANN: pyparsing1.4.7

2007-07-22 Thread Paul McGuire
I just uploaded the latest release (v1.4.7) of pyparsing, and I'm
happy to say, it is not a very big release - this module is getting
to be quite stable.  A few bug-fixes, and one significant notation
enhancement: setResultsNames gains a big shortcut in this release
(see below).  No new examples in this release, sorry.

Here are the notes:

- NEW NOTATION SHORTCUT: ParserElement now accepts results names
  using a notational shortcut, following the expression with the
  results name in parentheses.  So this:

stats = "AVE:" + realNum.setResultsName("average") + \
"MIN:" + realNum.setResultsName("min") + \
"MAX:" + realNum.setResultsName("max")

  can now be written as this:

stats = "AVE:" + realNum("average") + \
"MIN:" + realNum("min") + \
"MAX:" + realNum("max")

  The intent behind this change is to make it simpler to define
  results names for significant fields within the expression, while
  keeping the grammar syntax clean and uncluttered.

- Fixed bug when packrat parsing is enabled, with cached ParseResults
  being updated by subsequent parsing.  Reported on the pyparsing
  wiki by Kambiz, thanks!

- Fixed bug in operatorPrecedence for unary operators with left
  associativity, if multiple operators were given for the same term.

- Fixed bug in example simpleBool.py, corrected precedence of "and"
  vs. "or" operations.

- Fixed bug in Dict class, in which keys were converted to strings
  whether they needed to be or not.  Have narrowed this logic to
  convert keys to strings only if the keys are ints (which would
  confuse __getitem__ behavior for list indexing vs. key lookup).

- Added ParserElement method setBreak(), which will invoke the pdb
  module's set_trace() function when this expression is about to be
  parsed.

- Fixed bug in StringEnd in which reading off the end of the input
  string raises an exception - should match.  Resolved while
  answering a question for Shawn on the pyparsing wiki.

Download pyparsing 1.4.7 at http://sourceforge.net/projects/pyparsing/.
The pyparsing Wiki is at http://pyparsing.wikispaces.com

-- Paul


Pyparsing is a pure-Python class library for quickly developing
recursive-descent parsers.  Parser grammars are assembled directly in
the calling Python code, using classes such as Literal, Word,
OneOrMore, Optional, etc., combined with operators '+', '|', and '^'
for And, MatchFirst, and Or.  No separate code-generation or external
files are required.  Pyparsing can be used in many cases in place of
regular expressions, with shorter learning curve and greater
readability and maintainability.  Pyparsing comes with a number of
parsing examples, including:
- "Hello, World!" (English, Korean, Greek, and Spanish(new))
- chemical formulas
- configuration file parser
- web page URL extractor
- 5-function arithmetic expression parser
- subset of CORBA IDL
- chess portable game notation
- simple SQL parser
- Mozilla calendar file parser
- EBNF parser/compiler
- Python value string parser (lists, dicts, tuples, with nesting)
  (safe alternative to eval)
- HTML tag stripper
- S-expression parser
- macro substitution preprocessor

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Wildemar Wildenburger
Peter Kleiweg wrote:
>
> Define white space to isspace()
>  
>   
Explain that phrase.

>
> Here is another "space":
>
>   >>> u'\uFEFF'.isspace()
>   False
>
> isspace() is inconsistent
>   
I don't really know much about unicode, but google tells me that \uFEFF 
is a byte order mark. I thought we we're implicitly in unison that 
"whitespace" (whatever the formal definition) means "the stuff we put 
into text to visually separate words".
So what is *your* definition of whitespace?


>>> Why does split() split when it says NO-BREAK?
>>>   
>> Precisely. It says NO-BREAK. It doesn't say NO-SPLIT.
>> 
>
> That is a stupid answer.
>
>   
I fail to see why you deem it a good idea to become insulting at this point.
It is a very valid answer: NO-BREAK means "when wrapping characters into 
paragraphs do not break at this space".
split() however does not wrap text, it /splits/ it (at whitespace 
characters, as it happens). The NO-BREAK semantic has no meaning here.


/W
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Peter Kleiweg
Carsten Haese schreef op de 22e dag van de hooimaand van het jaar 2007:

> On Sun, 2007-07-22 at 17:44 +0200, Peter Kleiweg wrote: 
> > > It's a feature. See help(str.split): "If sep is not specified or is
> > > None, any whitespace string is a separator."
> > 
> > Define "any whitespace".
> 
> Any string for which isspace returns True.

Define white space to isspace()
 
> > Why is it different in  and ?
> 
> >>> '\xa0'.isspace()
> False
> >>> u'\xa0'.isspace()
> True

Here is another "space":

  >>> u'\uFEFF'.isspace()
  False

isspace() is inconsistent
 
> For byte strings, Python doesn't know whether 0xA0 is a whitespace
> because it depends on the encoding whether the number 160 corresponds to
> a whitespace character. For unicode strings, code point 160 is
> unquestionably a whitespace, because it is a no-break SPACE.

I question it. And so does the sre module:

  \s Matches any whitespace character; equivalent to [ \t\n\r\f\v]

Where is the NO-BREAK SPACE in there?

 
> > Why does split() split when it says NO-BREAK?
> 
> Precisely. It says NO-BREAK. It doesn't say NO-SPLIT.

That is a stupid answer.


-- 
Peter Kleiweg  L:NL,af,da,de,en,ia,nds,no,sv,(fr,it)  S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/kleiweg/ls.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: UnicodeDecodeError

2007-07-22 Thread Peter Otten
Ken Seehart wrote:

>> I am wondering if anyone knows where I can find a mapping from this
>> particular extended ascii code (where \xd1 is Ñ), to the corresponding
>> unicode characters.

> Um, never mind.  The recent unicode conversation gave me my answer  :-)
> unicode(s, 'Windows-1252')

Run the following script to get a few more candidates:

import encodings
import os
import glob

def encodings_from_modulenames():
ef = os.path.dirname(encodings.__file__)
for fn in glob.glob(os.path.join(ef, "*.py")):
fn = os.path.basename(fn)
yield os.path.splitext(fn)[0]

def find_encodings(unistr, bytestr, encodings=None):
if encodings is None:
encodings = encodings_from_modulenames()
for encoding in encodings:
try:
encoded = unistr.encode(encoding)
except Exception:
pass
else:
if encoded == bytestr:
yield encoding

for encoding in find_encodings(u"\N{LATIN CAPITAL LETTER N WITH TILDE}", 
"\xd1"):
print encoding

Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Where is the collections module?

2007-07-22 Thread Carsten Haese
On Sun, 2007-07-22 at 14:24 -0400, Gordon Airporte wrote:
> I was going to try tweaking defaultdict, but I can't for the life of me 
> find where the collections module or its structures are defined. Python 2.5.

It's written in C. You'll find it in the Python2.5 source code
at /path/to/source/Modules/collectionsmodule.c

-- 
Carsten Haese
http://informixdb.sourceforge.net


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advice on sending images to clients over network

2007-07-22 Thread Paul McNett
Calvin Spealman wrote:
> On 7/22/07, Paul McNett <[EMAIL PROTECTED]> wrote:
>> Paul Rubin wrote:
>> > Frank Millman <[EMAIL PROTECTED]> writes:
>> >> Any suggestions will be much appreciated.
>> >
>> > Why on earth don't you write the whole thing as a web app instead of
>> > a special protocol?  Then just use normal html tags to put images
>> > into the relevant pages.
>>
>> I believe he has a full desktop client app, not a web app. Believe it or
>> not, there's still a solid place for desktop applications even in this
>> ever-increasing webified world.
>>
>> Use the right tool for the job...
> 
> There is no reason that something being a "desktop app" means they
> can't use HTTP instead of reinventing the protocol wheel all over
> again.

Absolutely! Which is why I recommended setting up an httpd to serve the 
images...

I interpreted Paul Rubin's response to say "rewrite the whole thing 
(client, server, everything) as a web app".

Cheers!


-- 
pkm ~ http://paulmcnett.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Where is the collections module?

2007-07-22 Thread Calvin Spealman
Look in Modules/_collectionsmodule.c

Pretty much any built-in module will be named thusly.

On 7/22/07, Gordon Airporte <[EMAIL PROTECTED]> wrote:
> I was going to try tweaking defaultdict, but I can't for the life of me
> find where the collections module or its structures are defined. Python 2.5.
> --
> http://mail.python.org/mailman/listinfo/python-list
>


-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
-- 
http://mail.python.org/mailman/listinfo/python-list


Where is the collections module?

2007-07-22 Thread Gordon Airporte
I was going to try tweaking defaultdict, but I can't for the life of me 
find where the collections module or its structures are defined. Python 2.5.
-- 
http://mail.python.org/mailman/listinfo/python-list


http://www.tbn.org/films/videos/To_Hell_And_Back.ram << GREAT VIDEO!

2007-07-22 Thread David Manti
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Advice on sending images to clients over network

2007-07-22 Thread Calvin Spealman
On 7/22/07, Paul McNett <[EMAIL PROTECTED]> wrote:
> Paul Rubin wrote:
> > Frank Millman <[EMAIL PROTECTED]> writes:
> >> Any suggestions will be much appreciated.
> >
> > Why on earth don't you write the whole thing as a web app instead of
> > a special protocol?  Then just use normal html tags to put images
> > into the relevant pages.
>
> I believe he has a full desktop client app, not a web app. Believe it or
> not, there's still a solid place for desktop applications even in this
> ever-increasing webified world.
>
> Use the right tool for the job...

There is no reason that something being a "desktop app" means they
can't use HTTP instead of reinventing the protocol wheel all over
again.

> --
> pkm ~ http://paulmcnett.com
> --
> http://mail.python.org/mailman/listinfo/python-list
>


-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advice on sending images to clients over network

2007-07-22 Thread Jorge Godoy
Paul McNett wrote:

> Paul Rubin wrote:
>> Frank Millman <[EMAIL PROTECTED]> writes:
>>> Any suggestions will be much appreciated.
>> 
>> Why on earth don't you write the whole thing as a web app instead of
>> a special protocol?  Then just use normal html tags to put images
>> into the relevant pages.
> 
> I believe he has a full desktop client app, not a web app. Believe it or
> not, there's still a solid place for desktop applications even in this
> ever-increasing webified world.

He's using wxPython and already has network connectivity to access the
database server.

> Use the right tool for the job...

Yep...  I also believe that a HTTP server is the right tool. :-)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: URL parsing for the hard cases

2007-07-22 Thread Miles
On 7/22/07, John Nagle wrote:
> Is there something available that will parse the "netloc" field as
> returned by URLparse, including all the hard cases?  The "netloc" field
> can potentially contain a port number and a numeric IP address.  The
> IP address may take many forms, including an IPv6 address.

What do you mean by "parse" the field?  What do you want to get back
from the parser function?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advice on sending images to clients over network

2007-07-22 Thread Paul McNett
Paul Rubin wrote:
> Frank Millman <[EMAIL PROTECTED]> writes:
>> Any suggestions will be much appreciated.
> 
> Why on earth don't you write the whole thing as a web app instead of
> a special protocol?  Then just use normal html tags to put images
> into the relevant pages.

I believe he has a full desktop client app, not a web app. Believe it or 
not, there's still a solid place for desktop applications even in this 
ever-increasing webified world.

Use the right tool for the job...


-- 
pkm ~ http://paulmcnett.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Compiling PythonD using DJGPP

2007-07-22 Thread John Simeon
Hi there. I had an old computer at my disposal and decided to put it to use 
by setting up a nostalgia project with DOS and Windows for Workgroups 3.11.

Now that all of you are back from laughing about the archaicness of the 
software involved ;-) here is my problem.

PythonD is a port of python to DOS. The release used is 2.4.2. I am trying 
to compile PythonD using DJGPP, which is a port of GCC to DOS.

My problem is that I am getting a compiler error that I do not understand:

gcc -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. 
-I./Include 
  -DPy_BUILD_CORE -o Python/compile.o Python/compile.c
Python/compile.c: In function 'optimize_code':
Python/compile.c:512: warning: pointer targets in assignment differ in 
signedness
Python/compile.c: At top level:
Python/compile.c:1038: error: two or more data types in declaration 
specifiers
Python/compile.c:1232: error: two or more data types in declaration 
specifiers
Python/compile.c: In function 'com_addbyte':
Python/compile.c:1232: error: parameter name omitted
Python/compile.c:1241: error: expected expression before 'unsigned'
make.exe: *** [Python/compile.o] Error 1


Does anyone have any ideas that could help shed some light on this and help 
me get back on track?

Thanks! 



-- 
Posted via a free Usenet account from http://www.teranews.com

-- 
http://mail.python.org/mailman/listinfo/python-list


URL parsing for the hard cases

2007-07-22 Thread John Nagle
Is there something available that will parse the "netloc" field as
returned by URLparse, including all the hard cases?  The "netloc" field
can potentially contain a port number and a numeric IP address.  The
IP address may take many forms, including an IPv6 address.

I'm parsing URLs used by hostile sites, and the wierd cases come up
all too frequently.

John Nagle
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advice on sending images to clients over network

2007-07-22 Thread Paul McNett
Frank Millman wrote:
> I guess the point of all this rambling is that my thought process is
> leading me towards my third option, but this would be a bit of work to
> set up, so I would appreciate any comments from anyone who has been
> down this road before - do I make sense, or are there better ways to
> handle this?
> 
> Any suggestions will be much appreciated.

I would put the images into a static web directory, either on the same 
or different server. Then your main server just sends the url (or 
relevant portion of the url, or list of all urls to download), and then 
the client grabs the images from your image server using urllib.

Let Apache do what it's good at, instead of reinventing that particular 
wheel.

-- 
pkm ~ http://paulmcnett.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Lazy "for line in f" ?

2007-07-22 Thread Miles
On 7/22/07, Alexandre Ferrieux  wrote:
> The Tutorial says about the "for line in f" idiom that it is "space-
> efficient".
> Short of further explanation, I interpret this as "doesn't read the
> whole file before spitting out lines".
> In other words, I would say "lazy". Which would be a Good Thing, a
> much nicer idiom than the usual while loop calling readline()...
>
> But when I use it on the standard input, be it the tty or a pipe, it
> seems to wait for EOF before yielding the first line.

It doesn't read the entire file, but it does use internal buffering
for performance.  On my system, it waits until it gets about 8K of
input before it yields anything.  If you need each line as it's
entered at a terminal, you're back to the while/readline (or
raw_input) loop.

-Miles
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with exec

2007-07-22 Thread Steve Holden
Steven D'Aprano wrote:
> On Sun, 22 Jul 2007 09:12:21 -0400, Steve Holden wrote:
> 
> 
>>> Steve Holden was playing silly games. You can't use { } for indentation.
>>> You have to use indentation.
>>>
>> I wasn't playing silly games at all, and I did prefix that part ofmy 
>> answer with "I'm afraid I don't understand this question". The OP is 
>> writing a program to "translate" a Python-like language that uses 
>> non-English keywords into Python. Since the application is transforming 
>> its input, it could transform braces into indentation. Of course 
>> *Python* doesn't use braces, but the question was how to write 
>> "pseudo-Python" without using indentation to indicate grouping.
> 
> Then I have misunderstood you, and I apologize.
> 
Thanks. I was hoping that was the case.

regards
  Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd   http://www.holdenweb.com
Skype: holdenweb  http://del.icio.us/steve.holden
--- Asciimercial --
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
--- Thank You for Reading -

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advice on sending images to clients over network

2007-07-22 Thread Paul Rubin
Frank Millman <[EMAIL PROTECTED]> writes:
> Any suggestions will be much appreciated.

Why on earth don't you write the whole thing as a web app instead of
a special protocol?  Then just use normal html tags to put images
into the relevant pages.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with exec

2007-07-22 Thread Steven D'Aprano
On Sun, 22 Jul 2007 09:12:21 -0400, Steve Holden wrote:


>> Steve Holden was playing silly games. You can't use { } for indentation.
>> You have to use indentation.
>> 
> I wasn't playing silly games at all, and I did prefix that part ofmy 
> answer with "I'm afraid I don't understand this question". The OP is 
> writing a program to "translate" a Python-like language that uses 
> non-English keywords into Python. Since the application is transforming 
> its input, it could transform braces into indentation. Of course 
> *Python* doesn't use braces, but the question was how to write 
> "pseudo-Python" without using indentation to indicate grouping.

Then I have misunderstood you, and I apologize.



-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Lazy "for line in f" ?

2007-07-22 Thread Christoph Haas
On Sun, Jul 22, 2007 at 09:10:50AM -0700, Alexandre Ferrieux wrote:
> I'm a total newbie in Python, but did give quite a try to the
> documentation before coming here.
> Sorry if I missed the obvious.
> 
> The Tutorial says about the "for line in f" idiom that it is "space-
> efficient".
> Short of further explanation, I interpret this as "doesn't read the
> whole file before spitting out lines".

Correct. It reads one line at a time (as an "iterator") and returns it.

> In other words, I would say "lazy". Which would be a Good Thing, a
> much nicer idiom than the usual while loop calling readline()...

The space-efficiency is similar. The faux pas would rather to read the
whole file with readlines().

> But when I use it on the standard input, be it the tty or a pipe, it
> seems to wait for EOF before yielding the first line.

Standard input is a weird thing in Python. Try sending two EOFs
(Ctrl-D). There is some internal magic with two loops checking for EOF.
It's submitted as a bug report bug the developers denied a solution.
Otherwise it's fine. In a pipe you shouldn't even notice.

 Christoph

-- 
http://mail.python.org/mailman/listinfo/python-list


Lazy "for line in f" ?

2007-07-22 Thread Alexandre Ferrieux
Hi,

I'm a total newbie in Python, but did give quite a try to the
documentation before coming here.
Sorry if I missed the obvious.

The Tutorial says about the "for line in f" idiom that it is "space-
efficient".
Short of further explanation, I interpret this as "doesn't read the
whole file before spitting out lines".
In other words, I would say "lazy". Which would be a Good Thing, a
much nicer idiom than the usual while loop calling readline()...

But when I use it on the standard input, be it the tty or a pipe, it
seems to wait for EOF before yielding the first line.

So, is it lazy or not ? Is there some external condition that may
trigger one behavior or the other ? If not, why is it said "space
efficient" ?

TIA,

-Alex

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Carsten Haese
On Sun, 2007-07-22 at 17:44 +0200, Peter Kleiweg wrote: 
> > It's a feature. See help(str.split): "If sep is not specified or is
> > None, any whitespace string is a separator."
> 
> Define "any whitespace".

Any string for which isspace returns True.

> Why is it different in  and ?

>>> '\xa0'.isspace()
False
>>> u'\xa0'.isspace()
True

For byte strings, Python doesn't know whether 0xA0 is a whitespace
because it depends on the encoding whether the number 160 corresponds to
a whitespace character. For unicode strings, code point 160 is
unquestionably a whitespace, because it is a no-break SPACE.

> Why does split() split when it says NO-BREAK?

Precisely. It says NO-BREAK. It doesn't say NO-SPLIT.

-- 
Carsten Haese
http://informixdb.sourceforge.net


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Peter Kleiweg
Carsten Haese schreef op de 22e dag van de hooimaand van het jaar 2007:

> On Sun, 2007-07-22 at 17:15 +0200, Peter Kleiweg wrote:
> > Is this a bug or a feature?
> > 
> > 
> > Python 2.4.4 (#1, Oct 19 2006, 11:55:22) 
> > [GCC 2.95.3 20010315 (SuSE)] on linux2
> > 
> > >>> a = 'a b c\240d e'
> > >>> a
> > 'a b c\xa0d e'
> > >>> a.split()
> > ['a', 'b', 'c\xa0d', 'e']
> > >>> a = a.decode('latin-1')
> > >>> a
> > u'a b c\xa0d e'
> > >>> a.split()
> > [u'a', u'b', u'c', u'd', u'e']
> 
> It's a feature. See help(str.split): "If sep is not specified or is
> None, any whitespace string is a separator."

Define "any whitespace".
Why is it different in  and ?
Why does split() split when it says NO-BREAK?

-- 
Peter Kleiweg  L:NL,af,da,de,en,ia,nds,no,sv,(fr,it)  S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/kleiweg/ls.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: split on NO-BREAK SPACE

2007-07-22 Thread Carsten Haese
On Sun, 2007-07-22 at 17:15 +0200, Peter Kleiweg wrote:
> Is this a bug or a feature?
> 
> 
> Python 2.4.4 (#1, Oct 19 2006, 11:55:22) 
> [GCC 2.95.3 20010315 (SuSE)] on linux2
> 
> >>> a = 'a b c\240d e'
> >>> a
> 'a b c\xa0d e'
> >>> a.split()
> ['a', 'b', 'c\xa0d', 'e']
> >>> a = a.decode('latin-1')
> >>> a
> u'a b c\xa0d e'
> >>> a.split()
> [u'a', u'b', u'c', u'd', u'e']

It's a feature. See help(str.split): "If sep is not specified or is
None, any whitespace string is a separator."

-- 
Carsten Haese
http://informixdb.sourceforge.net


-- 
http://mail.python.org/mailman/listinfo/python-list


split on NO-BREAK SPACE

2007-07-22 Thread Peter Kleiweg

Is this a bug or a feature?


Python 2.4.4 (#1, Oct 19 2006, 11:55:22) 
[GCC 2.95.3 20010315 (SuSE)] on linux2

>>> a = 'a b c\240d e'
>>> a
'a b c\xa0d e'
>>> a.split()
['a', 'b', 'c\xa0d', 'e']
>>> a = a.decode('latin-1')
>>> a
u'a b c\xa0d e'
>>> a.split()
[u'a', u'b', u'c', u'd', u'e']



-- 
Peter Kleiweg  L:NL,af,da,de,en,ia,nds,no,sv,(fr,it)  S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/kleiweg/ls.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can a low-level programmer learn OOP?

2007-07-22 Thread Wolfgang Strobl
[EMAIL PROTECTED] (Eddie Corns):

>I don't believe you can get the benefit of SNOBOL matching without direct
>language support.  

That's my opinion, too.

>There's only so much a library can do. However a valiant
>and interesting effort:
>
>http://www.wilmott.ca/python/patternmatching.html

This is newer than http://sourceforge.net/projects/snopy/ which adapts a
ADA implemenation, which follows the SNOBOL model quite closely. Didn't
knew that. Thanks for pointing it out!  

Well, unfortunately, it somehow demonstrates your point. This may be
missing familiarity with the changed idiom, though. Perhaps rewriting a
few of James Gimple's snippets from "Algorithms in SNOBOL4"
(->http://www.snobol4.org/) as an exercise using that library might help
to get a better appreciation. Perhaps I'll try, eventually ...


-- 
Wir danken für die Beachtung aller Sicherheitsbestimmungen
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can a low-level programmer learn OOP?

2007-07-22 Thread Wolfgang Strobl
[EMAIL PROTECTED] (Aahz):

>In article <[EMAIL PROTECTED]>,
>Wolfgang Strobl  <[EMAIL PROTECTED]> wrote:
>>
>>SNOBOLs powerfull patterns still shine, compared to Pythons clumsy
>>regular expressions. 
>
>Keep in mind that Python regular expressions are modeled on the
>grep/sed/awk/Perl model so as to be familiar to any sysadmin 

Sure, I don't dispute that. There is room for both regular expressions
and SNOBOL type patterns, IMHO, because the concepts are different
enough.

>-- but
>there's a reason why Python makes it a *library* unlike Perl.  So adding
>SNOBOL patterns to another library would be a wonderful gift to the
>Python community...

Like Eddie Corns if find it hard to do in an elegant way, without
integrating it into the language. I haven't looked into it for a long
time, though.

-- 
Wir danken für die Beachtung aller Sicherheitsbestimmungen
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Can a low-level programmer learn OOP?

2007-07-22 Thread Wolfgang Strobl
Paul Rubin :

>[EMAIL PROTECTED] (Aahz) writes:
>> .So adding SNOBOL patterns to another library would be a wonderful
>> gift to the Python community...
>
>Snobol patterns were invented at a time when nobody knew anything
>about parsing.  

But Snobol patterns aren't mainly about building parsers. 

>They were extremely powerful (recursive with arbitrary
>amounts of backtracking) but could use exponential time and maybe even
>exponential space.

Sure. Like any Turing complete language feature. 
>
>These days, it makes more sense to use something like pyparsing.  

Probably, yes.



-- 
Wir danken für die Beachtung aller Sicherheitsbestimmungen
-- 
http://mail.python.org/mailman/listinfo/python-list


custom plugin architecture: how to see parent namespace?

2007-07-22 Thread escalation746
I've got a namespace query that amounts to this: How can an imported
function see data in the parent custom namespace? I have read through
numerous posts which skirt this issue without answering it.

To illustrate, create plugin.py with a couple of functions. The second
will obviously fail.


def Hello():
print 'hello'

def ViewValuable():
print VALUABLE


Then create master.py which loads the plugin at runtime, later running
various code fragments against it.


# location of plugin module
filespec = '/path/to/plugins/plugin.py'
filepath, filename = os.path.split(filespec)
filename = os.path.splitext(filename)[0]

# add to system path
if filepath not in sys.path:
sys.path.append(filepath)

# import into our namespace
space = __import__(filename, globals(), locals(), [])
namespace = space.__dict__

# sometime later in the code... define a new function
def _plus():
print 'plus'

# add that to our namespace
namespace.update({'Plus': _plus, 'VALUABLE': 'gold'})

# run custom code
code = """
Hello()
Plus()
Valuable()
"""
exec code in namespace


This code will echo the lines:
hello
plus

Followed by a traceback for:
NameError: global name 'VALUABLE' is not defined

The question is: How do I get a function in plugin.py to see VALUABLE?
Using external storage of some sort is not viable since many different
instances of plugin.py, all with different values of VALUABLE, might
be running at once. (In fact VALUABLE is to be a key into a whole
whack of data stored in a separate module space.)

Extensive modifications to plugin.py is also not a viable approach,
since that module will be created by users. Rather, I need to be able
to pass something at execution time to make this happen. Or create an
access function along the lines of _plus() that I can inject into the
namespace.

Any help, please? I've been losing sleep over this one.

-- robin

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sort lines in a text file

2007-07-22 Thread Daniel
On Sun, 22 Jul 2007 06:03:17 +0300, leegold <[EMAIL PROTECTED]> wrote:
> say I have a text file:
>
> zz3 uaa4a ss 7 uu
>   zz 3 66 ppazz9
> a0zz0
>
> I want to sort the text file. I want the key to be the number after
> the two "zz".  Or I guess a string of two zz then a numberSo
> that's 3, 9, 0
>
> I'm trying to say that I want to sort lines in a file based on a
> regular expression. How could I do that in Python? I'm limited to
> Python 2.1, I can't add any 2nd party newer tools.
>
> Thanks
> Lee G.
>

Shouldn't it be 3, 6, 9, 0
-- 
http://mail.python.org/mailman/listinfo/python-list


ANN: Snobol 1.0

2007-07-22 Thread greg
Aahz wrote:
> So adding
> SNOBOL patterns to another library would be a wonderful gift to the
> Python community...

I wrote a module for Snobol-style pattern matching a
while back, but didn't get around to releasing it.
I've just put it on my web page:

http://www.cosc.canterbury.ac.nz/greg.ewing/python/Snobol.tar.gz

There's no manual yet, but there's a fairly complete
set of docstrings and some test cases to figure it
out from.

--
Greg
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sort lines in a text file

2007-07-22 Thread Daniel
On Sun, 22 Jul 2007 06:03:17 +0300, leegold <[EMAIL PROTECTED]> wrote:
> say I have a text file:
>
> zz3 uaa4a ss 7 uu
>   zz 3 66 ppazz9
> a0zz0
>
> I want to sort the text file. I want the key to be the number after
> the two "zz".  Or I guess a string of two zz then a numberSo
> that's 3, 9, 0
>
> I'm trying to say that I want to sort lines in a file based on a
> regular expression. How could I do that in Python? I'm limited to
> Python 2.1, I can't add any 2nd party newer tools.

Do your own homework.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with exec

2007-07-22 Thread vedrandekovic
> I wasn't playing silly games at all, and I did prefix that part ofmy
> answer with "I'm afraid I don't understand this question". The OP is
> writing a program to "translate" a Python-like language that uses
> non-English keywords into Python. Since the application is transforming
> its input, it could transform braces into indentation. Of course *Python*
> doesn't use braces, but the question was how to write "pseudo-Python"
> without using indentation to indicate grouping.
>
> regards
>  Steve


Hi,

This previously is exactly what I need can you help me somehow about
this
code
indentation, on any way you know.Plese help I will really appreciate
this!!



 Regards,

 Vedran


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with exec

2007-07-22 Thread Steve Holden
Steven D'Aprano wrote:
> On Sun, 22 Jul 2007 03:23:30 -0700, vedrandekovic wrote:
> 
>> Thanks for everything previously, but just to I ask about code
>> indentation,this with { and } doesn't
>> employed, here is my example how can I solve this about code
>> indentation:
>>
> n=90
> if n==90:
>> {print "bok kjai ma'}
>>   File "", line 2
>> {print "bok kjai ma'}
>>  ^
>> SyntaxError: invalid syntax
> 
> 
> Steve Holden was playing silly games. You can't use { } for indentation.
> You have to use indentation.
> 
I wasn't playing silly games at all, and I did prefix that part ofmy 
answer with "I'm afraid I don't understand this question". The OP is 
writing a program to "translate" a Python-like language that uses 
non-English keywords into Python. Since the application is transforming 
its input, it could transform braces into indentation. Of course 
*Python* doesn't use braces, but the question was how to write 
"pseudo-Python" without using indentation to indicate grouping.

regards
  Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd   http://www.holdenweb.com
Skype: holdenweb  http://del.icio.us/steve.holden
--- Asciimercial --
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
--- Thank You for Reading -

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pickled objects over the network

2007-07-22 Thread Steve Holden
Hendrik van Rooyen wrote:
> "Steve Holden" <[EMAIL PROTECTED]> wrote:
> 
>> I think someone has already pointed out netstrings, which will allow you 
>> to send arbitrary strings over network connections deterministically. 
> 
> Yes I brought it up
> 
>> I'm afraid for the rest it's just a matter of encoding your information 
>> in a way that you can decode without allowing a malicious sender to 
>> cause arbitrary code to be called.
> 
> Yes - and in general you do this by having both the sender and the 
> transmitter conform to some pre agreed format - a netstring is one 
> of the simplest of such things - another way is to "frame" records 
> between some kind of delimeter and to "escape" the occurences of the
> delimiter in the data.  Another way is to use simple "self defining fields"
> that work by giving fields a "tag" number from a list of pre defined
> things, as well as a length, followed by the data - some financial 
> protocols work as a variant of this concept, where the presence or 
> absence of a bit signify the presence or absence of a field in the record.
> 
> The problem with all of these schemes is that they are all a PITA to
> implement, compared to the ease with which you can pickle and 
> unpickle something like a simple dict of parameters.
> 
> And if that is all you want to pass over to some remote thing, then
> having to download and import Pyro is an equal PITA and overkill.
> - It adresses a far more sophisticated problem than just getting 
> some small things across the network.
> 
> Now if Pyro were to make it into the standard library, it would be
> my method of choice for even this silly level of functionality, 
> because I happen to think it rocks.
> 
>> The issue with pickle is that it's way too general a mechanism to be 
>> secure in open network applications, so a suggestion to beef up its 
>> security was misguided. Trying to "beef up pickle's security" is like 
>> trying to make a shotgun that can't kill anything.
>>
> 
> Is it really that impossible to add something like a "noeval" flag, or to
> force it to only give you a string or a dict if you ask for one or the other, 
> given that someone has already mentioned that the built in types are 
> decoded by separate routines?
> 
> Or more generally - as it already has different protocols - to define a
> protocol that won't pass executable stuff over, or one that will only 
> pass and accept the built in types?
> 
Yes.

regards
  Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd   http://www.holdenweb.com
Skype: holdenweb  http://del.icio.us/steve.holden
--- Asciimercial --
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
--- Thank You for Reading -

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: importing a module from a specific directory

2007-07-22 Thread O.R.Senthil Kumaran
>  I would like to organize them into directory structure in
>  which there is a 'main' directory, and under it directories for
>  specific sub-tasks, or sub-experiments, I'm running (let's call them
>  'A', 'B', 'C').
>  Is there a neat clean way of achieving the code organization?
> 

This is a kind of a frequently asked question at c.l.p and every programmer I
guess has to go through this problem.
If you look around c.l.p you will find that one of the good ways to solve this
problem with the python interpretor <2.5 is:

>>> import sys
>>> sys.path.append(os.path.abspath(os.pardir))

But, if you are using Python 2.5, you are saved. 

Straight from the documentation:


Starting with Python 2.5, in addition to the implicit relative imports
described above, you can write explicit relative imports with the from module
import name form of import statement. These explicit relative imports use
leading dots to indicate the current and parent packages involved in the
relative import. From the surround module for example, you might use:

from . import echo
from .. import Formats
from ..Filters import equalizer


HTH.




-- 
O.R.Senthil Kumaran
http://uthcode.sarovar.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: how to find available classes in a file ?

2007-07-22 Thread John J. Lee
Alex Popescu <[EMAIL PROTECTED]> writes:
[...]
> I may be wrong but I think I've found a difference between my
> dir(module) approach
> and the inspect.getmembers(module, inspect.isclass): the first one
> returns the
> classes defined in the module, while the later also lists the imported
> available
> classes.

FWIW, see doctest.DocTestFinder._from_module() for a way to tell if an
object is from a module.  This can be fooled if you've managed to get
hold of two copies of a module, though, which unfortunately is
possible.

I hope the import system is much cleaner in Python 3 :-/ (seems there
are efforts in that direction, though I'm not up-to-date with it).


John
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: code packaging

2007-07-22 Thread Alex Popescu
On 7/22/07, Ryan Ginstrom <> wrote:
> Hi Alex:
> 
> Do you develop for Windows? Are you looking to automate a build
> process? 
> 
> The standard library's build module is distutils:
> http://docs.python.org/lib/module-distutils.html
> 
> As I mentioned in my post, I use a variety of third-party modules
> (epydoc, py2exe, Inno Setup, AutoIt), which I glue together with my
> own brew of python scripts. Since I develop mostly for Windows (except
> for Web stuff, which I use Linux for), my build process is tailored to
> that platform. 
> 
> Regards,
> Ryan Ginstrom

Thanks for following up on this. I am mostly used with a world where
platform dependent builds are not very used (Java). 

My current attempt to learn Python is by creating a Python equivalent of
the advanced Java testing framework TestNG (http://testng.org) -- but I
will give more details about this when I'll be starting to have
something more solid :-). 

However, due to my background where builds are required and there are
some de facto build standards (Ant, Maven, etc.) I am starting to think
I will be needing something similar while working on the tool I've
mentioned. At this point I think that what I am interested in are how
distros are build in Python world (no platform specific distros, but
rather generic ones) and ways to automate the process of creating the
distros (and probably running the framework internal tests etc.) 

My research lead me to distutils and/or setuptools, respectively SCon
and/or buildutils. So I am wondering if I am looking in the right
direction or do I need to do some more research :-). 

tia,
./alex
--
.w( the_mindstorm )p.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pythonic way for missing dict keys

2007-07-22 Thread Alex Popescu
Zentrader <[EMAIL PROTECTED]> wrote in news:1185041243.323915.161230
@x40g2000prg.googlegroups.com:

> On Jul 21, 7:48 am, Duncan Booth <[EMAIL PROTECTED]> wrote:
>
> [snip...]
>
> 
>>From the 2.6 PEP #361 (looks like dict.has_key is deprecated)
> Python 3.0 compatability: ['compatibility'-->someone should use a
> spell-checker for 'official' releases]
> - warnings were added for the following builtins which no
> longer exist in 3.0:
>  apply, callable, coerce, dict.has_key, execfile, reduce,
> reload
> 

I see... what that document doesn't describe is the alternatives to be 
used. And I see in that list a couple of functions that are probably used a 
lot nowadays (callable, reduce, etc.).

bests,
./alex
--
.w the_mindstorm )p.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with exec

2007-07-22 Thread Steven D'Aprano
On Sun, 22 Jul 2007 03:23:30 -0700, vedrandekovic wrote:

> Thanks for everything previously, but just to I ask about code
> indentation,this with { and } doesn't
> employed, here is my example how can I solve this about code
> indentation:
> 
 n=90
 if n==90:
> {print "bok kjai ma'}
>   File "", line 2
> {print "bok kjai ma'}
>  ^
> SyntaxError: invalid syntax


Steve Holden was playing silly games. You can't use { } for indentation.
You have to use indentation.

-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Advice on sending images to clients over network

2007-07-22 Thread Bjoern Schliessmann
Frank Millman wrote:

> My question is, what is the best way to get the image to the
> client? 

IMHO, HTTP would be most painless. Either incorporate a little HTTP
server into your server application, or use a seperate daemon and
let the server only output HTTP links.

> My third thought was to set up a separate 'image server'. It would
> be another server program written in Python, listening on its own
> port number, waiting for a client request for a particular image.
> It would know where to find it, read it in, and send it to the
> client. Then all the client needs to know is the ip address and
> port number.

See above -- you could also write your own HTTP server. Best using
Twisted or something of similar high level. Why make yourself work
developing such a system when somebody already did it for you?

> I guess the point of all this rambling is that my thought process
> is leading me towards my third option, but this would be a bit of
> work to set up, so I would appreciate any comments from anyone who
> has been down this road before - do I make sense, or are there
> better ways to handle this?

For minimum additional work, I'd use some lightweight http daemon,
like lighttpd ot thttpd. Then your server can pass links like
 and the clients
can use well-tested standard library routines to retrieve the
image.

Regards,


Björn

-- 
BOFH excuse #174:

Backbone adjustment

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with exec

2007-07-22 Thread vedrandekovic
On 21 srp, 22:31, Steve Holden <[EMAIL PROTECTED]> wrote:
> ...:::JA:::... wrote:
> > Hello,
>
> > After my program read and translate this code:
>
> > koristi os,sys;
> > ispisi 'bok kaj ima';
>
> > into the:
>
> > import os,sys;
> > print 'bok kaj ima';
>
> > and when it run this code with "exec", I always get error like this, but I
> > still dont't know what is a problem:
>
> > Traceback (most recent call last):
> >   File "C:\Python24\Lib\site-packages\VL\__init__.py", line 188, in
> > kompajlati
> > kompajlati_proces()
> >   File "C:\Python24\Lib\site-packages\VL\__init__.py", line 183, in
> > kompajlati_proces
> > h2=Konzola()
> >   File "C:\Python24\Lib\site-packages\VL\__init__.py", line 158, in __init__
> > k=kod(ZTextCtrl.GetLabel())
> >   File "C:\Python24\Lib\site-packages\VL\__init__.py", line 83, in kod
> > exec(str_ngh)
> >   File "", line 1
> > import os ,sys ;
> > ^
> > SyntaxError: invalid syntax
>
> This is almost certainly because the code contains embedded carriage
> returns:
>
>  >>> code = """import os,sys;\nprint 'bok kaj ima';"""
>  >>> exec code
> bok kaj ima
>  >>> code = """import os,sys;\r\nprint 'bok kaj ima';"""
>  >>> exec code
> Traceback (most recent call last):
>File "", line 1, in 
>File "", line 1
>  import os,sys;
>^
> SyntaxError: invalid syntax
>  >>>
>
> > PS: How can I change when user write script with my program to he don't need
> >   aspirate the lines of his source code
> > e.g.
> >  import os,sys
> >  n=90
> >  if n==90:print "OK"
> >  else:print "No"
>
> I'm afraid I don't understand this question. If you are talking about
> the indentation of the code, if you don't want indentation you will have
> to use braces - { and } - to indicate the nesting structure of your program.
>
> regards
>   Steve
> --
> Steve Holden+1 571 484 6266   +1 800 494 3119
> Holden Web LLC/Ltd  http://www.holdenweb.com
> Skype: holdenweb  http://del.icio.us/steve.holden
> --- Asciimercial --
> Get on the web: Blog, lens and tag the Internet
> Many services currently offer free registration
> --- Thank You for Reading -

Hello,

Thanks for everything previously, but just to I ask about code
indentation,this with { and } doesn't
employed, here is my example how can I solve this about code
indentation:

>>> n=90
>>> if n==90:
{print "bok kjai ma'}
  File "", line 2
{print "bok kjai ma'}
 ^
SyntaxError: invalid syntax

 
Thanks!!!
 
Regards,Vedran

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ignoring a part of returned tuples

2007-07-22 Thread noamtm
> Pylint also "allows" the name `dummy` without complaining.  That makes it
> even clearer and doesn't clash with the meaning of `_` when `gettext` is
> used.

Thanks, that's even better!

Noam.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pythonic way for missing dict keys

2007-07-22 Thread Marc 'BlackJack' Rintsch
On Sat, 21 Jul 2007 16:20:37 -0700, genro wrote:

> On Jul 19, 6:29 am, Bruno Desthuilliers
> <[EMAIL PROTECTED]> wrote:
>> No "surprise" here, but it can indeed be suboptimal if instanciating
>> myobject is costly.
> 
> What about this way ?
> 
> my_obj = my_dict.get(key) or my_dict.setdefault(key,myobject())

Reduces the unnecessary instantiation of `myobject` to "false" objects. 
May be not good enough.

Ciao,
Marc 'BlackJack' Rintsch
-- 
http://mail.python.org/mailman/listinfo/python-list


Advice on sending images to clients over network

2007-07-22 Thread Frank Millman
Hi all

This is not strictly a Python question, but as the system to which
relates is written in Python, hopefully it is not too off-topic.

I have an accounting/business application, written in client/server
mode. The server makes a connection to a database, and then runs a
continuous loop waiting for client connections, establishes a session
with the client, and responds to messages received from the client
until the client closes the connection. The client uses wxPython as
the gui. All the business logic is on the server - there is none at
all on the client side. It seems to be working quite well.

I now want to add the capability of displaying images on the client.
For example, if the application deals with properties, I want to
display various photographs of the property on the client. wxPython is
perfectly capable of displaying the image. My question is, what is the
best way to get the image to the client?

Assume that the images are stored in a directory on the server, or at
least accessible from the server, and that the database has a table
which stores the full path to each image, with the property id as a
reference.

My first thought was that the server would simply retrieve the path,
read the image, and send it to the client over the network, along with
all the other information. The problem with this is performance - I
may set up a page with 20 images to be displayed on the client, which
may take some time. I need the server to respond to client messages as
quickly as possible, so that it is ready to handle the next request.
The server is multi-threaded, so it will not block other threads, but
it may still result in sluggish performance.

My second thought was to send the path to the image down to the
client, and get the client to read the image directly. The problem
with this is that each client needs to be able to resolve the path to
the image directory. At present, all that the client requires is a
pointer to the client program directory, and a parameter giving it the
ip address and port number required to make a connection to the
server. It seems an extra administrative burden to ensure that each
client can access the image directory, especially if at a later date
it is decided to move the directory.

My third thought was to set up a separate 'image server'. It would be
another server program written in Python, listening on its own port
number, waiting for a client request for a particular image. It would
know where to find it, read it in, and send it to the client. Then all
the client needs to know is the ip address and port number.

It seems likely that a typical setup would start by storing the images
on the same machine as the database and the server program. If so,
there would be little difference between any of the above, as no
matter with method is used, the same machine ultimately has to read
the record from its own hard drive and send it down the network over
its own nic.

If at a later date it was decided that the volume of image handling
was slowing down the server process, one might well decide to move the
images to a separate server. In this case, I think that my third
option would make it easiest to facilitate this. You would have to
change all the client parameters to connect to a different server. Or
maybe not - thinking aloud, I could pass the 'image server connection
parameters' to the client from the main server once a connection has
been established. A second implication is that you would have to
change all the paths in the database table. Again, maybe not - just
store the file names in the table, and the path to the directory as a
separate parameter. Then you only have to change the parameter.

I guess the point of all this rambling is that my thought process is
leading me towards my third option, but this would be a bit of work to
set up, so I would appreciate any comments from anyone who has been
down this road before - do I make sense, or are there better ways to
handle this?

Any suggestions will be much appreciated.

Thanks

Frank Millman

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: simpleJSON pack binary data

2007-07-22 Thread Marc 'BlackJack' Rintsch
On Sat, 21 Jul 2007 19:13:22 -0700, Andrey wrote:

> My question is, anyone will suggest a workaround to this error?
> i really like to pack my raw image data into the JSON, so my other 
> programming script can read the array easily

JSON is a text format so you have to encode the binary data somehow.  I'd
use base64.  It's available as codec for `str.encode()`/`str.decode()`.

In [10]: '\x00\xff\xaa'
Out[10]: '\x00\xff\xaa'

In [11]: '\x00\xff\xaa'.encode('base64')
Out[11]: 'AP+q\n'

In [12]: _.decode('base64')
Out[12]: '\x00\xff\xaa'

Ciao,
Marc 'BlackJack' Rintsch
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sorting dict keys

2007-07-22 Thread Martin v. Löwis

> I'd like to do it in one line because what I am trying to do is, after
> all, a single, simple enough action. I find the suggested
> b = sorted(a.keys()) much more readable than breaking it up in two
> lines. 

I think you have demonstrated that a single-line statements with
multiple functions and methods is *not* more readable for you,
contrary to your own beliefs.

You were aware that sort is "in-place", and recognized that
b = d.keys().sort() does not "work". What you here failed to
recognize is that b is assigned the result of .sort(), not
the result of .keys(). You then made the same mistake again
in thinking that b=copy.copy(d.keys()).sort() should work better,
because it sorts a copy - still failing to see that it is
again the result of .sort() that gets assigned to b.

So ISTM that you got puzzled by the order in which multiple
things happen when written into a single line. My guess would
be that methods are particularly puzzling, more so than
functions (which make it somewhat more obvious that they
entirely wrap their arguments, and are entitled to return
whatever they want to).

Regards,
Martin
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: simpleJSON pack binary data

2007-07-22 Thread Gabriel Genellina
En Sat, 21 Jul 2007 23:13:22 -0300, Andrey <[EMAIL PROTECTED]>  
escribió:

> Is it possible to pack binary data into simplejson?

json does not provide any direct "binary" type; strings are Unicode  
strings. Try encoding your data using Base64 for example, or transform it  
into an array of numbers.

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [2.5] Regex doesn't support MULTILINE?

2007-07-22 Thread Gabriel Genellina
En Sun, 22 Jul 2007 01:56:32 -0300, Gilles Ganault <[EMAIL PROTECTED]>  
escribió:

> Incidently, as far as using Re alone is concerned, it appears that
> re.MULTILINE isn't enough to get Re to include newlines: re.DOTLINE
> must be added.
>
> Problem is, when I add re.DOTLINE, the search takes less than a second
> for a 500KB file... and about 1mn30 for a file that's 1MB, with both
> files holding similar contents.
>
> Why such a huge difference in performance?
>
> pattern = "(\d+:\d+).*?"

Try to avoid using ".*" and ".+" (even the non greedy forms); in this  
case, I think you want the scan to stop when it reaches the ending   
or any other tag, so use: [^<]* instead.

BTW, better to use a raw string to represent the pattern: pattern =  
r"...\d+..."

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PEP 3107 and stronger typing (note: probably a newbie question)

2007-07-22 Thread Hendrik van Rooyen

 "Steve Holden" <[EMAIL PROTECTED],..eb.com> wrote:

> The trouble there, though, is that although COBOL was comprehensible (to 
> a degree) relatively few people have the rigor of thought necessary to 
> construct, or even understand, an algorithm of any kind.

This is true - and in my experience the competent people come in two classes -
those that are good at mathematics, and those that are good at languages.

If you find someone good at both, employ her.

- Hendrik



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Pickled objects over the network

2007-07-22 Thread Hendrik van Rooyen
"Steve Holden" <[EMAIL PROTECTED]> wrote:

> I think someone has already pointed out netstrings, which will allow you 
> to send arbitrary strings over network connections deterministically. 

Yes I brought it up

> I'm afraid for the rest it's just a matter of encoding your information 
> in a way that you can decode without allowing a malicious sender to 
> cause arbitrary code to be called.

Yes - and in general you do this by having both the sender and the 
transmitter conform to some pre agreed format - a netstring is one 
of the simplest of such things - another way is to "frame" records 
between some kind of delimeter and to "escape" the occurences of the
delimiter in the data.  Another way is to use simple "self defining fields"
that work by giving fields a "tag" number from a list of pre defined
things, as well as a length, followed by the data - some financial 
protocols work as a variant of this concept, where the presence or 
absence of a bit signify the presence or absence of a field in the record.

The problem with all of these schemes is that they are all a PITA to
implement, compared to the ease with which you can pickle and 
unpickle something like a simple dict of parameters.

And if that is all you want to pass over to some remote thing, then
having to download and import Pyro is an equal PITA and overkill.
- It adresses a far more sophisticated problem than just getting 
some small things across the network.

Now if Pyro were to make it into the standard library, it would be
my method of choice for even this silly level of functionality, 
because I happen to think it rocks.

> 
> The issue with pickle is that it's way too general a mechanism to be 
> secure in open network applications, so a suggestion to beef up its 
> security was misguided. Trying to "beef up pickle's security" is like 
> trying to make a shotgun that can't kill anything.
> 

Is it really that impossible to add something like a "noeval" flag, or to
force it to only give you a string or a dict if you ask for one or the other, 
given that someone has already mentioned that the built in types are 
decoded by separate routines?

Or more generally - as it already has different protocols - to define a
protocol that won't pass executable stuff over, or one that will only 
pass and accept the built in types?

- Hendrik


-- 
http://mail.python.org/mailman/listinfo/python-list


ANN: parley 0.3

2007-07-22 Thread Jacob Lee
Release Announcement: PARLEY version 0.3

PARLEY is a library for writing Python programs that implement the Actor
model of distributed systems, in which lightweight concurrent processes
communicate through asynchronous message-passing. Actor systems typically
are easier to write and debug than traditional concurrent programs that
use locks and shared memory.

With version 0.3, PARLEY now supports the Greenlet execution model
(http://codespeak.net/py/dist/greenlet.html). Greenlets are lightweight
threads, similar to the tasklets supported by Stackless Python; unlike
tasklets, they do not require a special version of Python. PARLEY also
supports tasklets and traditional native threads, and it provides the
ability to switch between these modes of execution without substantial
code changes.

Version 0.3 also includes various bug fixes, additional features, and
documentation improvements.

Code samples, documentation, and source code can be found at the PARLEY
home page: http://osl.cs.uiuc.edu/parley/

PARLEY is licensed under the LGPL.

-- 
Jacob Lee
<[EMAIL PROTECTED]>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python version changes, sys.executable does not

2007-07-22 Thread Jeffrey Froman
Jim Langston wrote:

> I think it's because your python directory is in the path before your
> python2.5 directory.

Thanks for the tip. In fact, /usr/local/bin/python (2.5) is on my PATH
before /usr/bin/python (2.3).

I did find the problem however -- it turns out that caching the executable
path is a feature of the bash shell, possibly a buggy one. After installing
the new executable in /usr/local/bin, bash claimed to be running that
executable, but was actually invoking the cached "python"
at /usr/bin/python.

What sorted out the confusion for me was when someone demonstrated to me how
sys.executable could be fooled:

$ exec -a /usr/bin/foobar python
Python 2.5.1 (r251:54863, May  4 2007, 16:52:23)
[GCC 4.1.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.executable
'/usr/bin/foobar'

To remove the cached version, I ran:
$ hash -d python

After which, running "python" invoked a properly named /usr/local/bin/python
as expected.


Thanks,
Jeffrey
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regex doesn't support MULTILINE?

2007-07-22 Thread irstas
On Jul 22, 7:56 am, Gilles Ganault <[EMAIL PROTECTED]> wrote:
> On Sat, 21 Jul 2007 22:18:56 -0400, Carsten Haese
>
> <[EMAIL PROTECTED]> wrote:
> >That's your problem right there. RE is not the right tool for that job.
> >Use an actual HTML parser such as BeautifulSoup
>
> Thanks a lot for the tip. I tried it, and it does look interesting,
> although I've been unsuccessful using a regex with BS to find all
> occurences of the pattern.
>
> Incidently, as far as using Re alone is concerned, it appears that
> re.MULTILINE isn't enough to get Re to include newlines: re.DOTLINE
> must be added.
>
> Problem is, when I add re.DOTLINE, the search takes less than a second
> for a 500KB file... and about 1mn30 for a file that's 1MB, with both
> files holding similar contents.
>
> Why such a huge difference in performance?
>
> pattern = "(\d+:\d+).*?"

That .*? can really slow it down if the following pattern
can't be found. It may end up looking until the end of the file for
proper continuation of the pattern and fail, and then start again.
Without DOTALL it would only look until the end of the line so
performance would stay bearable. Your 1.5MB file might have for
example
'13:34< /span>'*1 as its contents. Because
the < /span> doesn't match , it would end up looking till
the end of the file for  and not finding it. And then move
on to the next occurence of '),
you could maybe use negated char range:

"(\d+:\d+)[^<]*"

This pattern should be very fast for all inputs because the [^<]*
can't
match stuff indefinitely until the end of the file - only until the
next HTML element comes around. Or if you don't care about anything
but
those numbers, you should just match this:

"(\d+:\d+)"

-- 
http://mail.python.org/mailman/listinfo/python-list