Re: bags? 2.5.x?

2008-01-22 Thread Christian Heimes
Dan Stromberg wrote:
> Is there a particular reason why bags didn't go into 2.5.x or 3000?
> 
> I keep wanting something like them - especially bags with something akin 
> to set union, intersection and difference.

Ask yourself the following questions:

* Is the feature useful to a broad audience?
* Has the feature been implemented and contributed for Python?
* Is the code well written, tested and documented?
* Is the code mature and used by lots of people?

Can you answer every question with yes?
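
For readers finding this thread later: a bag type with set-like operations did eventually reach the standard library as collections.Counter (Python 2.7 and 3.1, so after this discussion). A minimal sketch of the union, intersection and difference Dan asks about, written for Python 3; the sample strings are made up for illustration:

```python
from collections import Counter

a = Counter("aabbbc")    # bag with a: 2, b: 3, c: 1
b = Counter("abbd")      # bag with a: 1, b: 2, d: 1

union = a | b            # per-element maximum of the two counts
intersection = a & b     # per-element minimum of the two counts
difference = a - b       # subtract counts, dropping results that are not positive
total = a + b            # add the counts together

print(sorted(union.elements()))
print(sorted(intersection.elements()))
print(sorted(difference.elements()))
```

Whether Counter covers every use people have in mind for "bags" is a separate question; it only shows that the operations themselves are available.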

Christian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Removing objects

2008-01-22 Thread Asun Friere
On Jan 23, 6:16 pm, Asun Friere <[EMAIL PROTECTED]> wrote:

> >>> x.pop(x.index(c))

Umm, of course you would simply use x.remove(c) ... force of (bad)
habit. %/





Re: translating Python to Assembler

2008-01-22 Thread Christian Heimes
Wim Vander Schelden wrote:
> Python modules and scripts are normally not even compiled; if they have
> been, it's probably just the Python interpreter packaged with the scripts
> and resources.

No, that is not correct. Python code is compiled to Python byte code and
executed inside a virtual machine, just like Java or C#. It's even
possible to write code in Python assembly and compile the Python
assembly into byte code.

You most certainly meant: Python code is not compiled into machine code.
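
A small sketch of that compile step, using only the built-ins compile() and exec() (Python 3 syntax; the source string and variable names are made up for illustration):

```python
import types

source = "total = 1 + 2\n"
code = compile(source, "<example>", "exec")   # source text -> code object

print(type(code))             # the compiled unit the interpreter runs
print(len(code.co_code))      # size of the raw byte code, in bytes

namespace = {}
exec(code, namespace)         # the virtual machine executes the byte code
print(namespace["total"])
```

The code object holds byte code for the Python virtual machine, not machine code for the CPU.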

Christian



Re: Extract value from a attribute in a string

2008-01-22 Thread inFocus
On Wed, 23 Jan 2008 01:13:31 -0200, "Gabriel Genellina"
<[EMAIL PROTECTED]> wrote:

>En Tue, 22 Jan 2008 23:45:22 -0200, <[EMAIL PROTECTED]> escribió:
>
>> I am looking for some help in reading a large text file and extracting
>> a value from an attribute? so I would need to find name=foo and
>> extract just the value foo which can be at any location in the string.
>> The attribute name will be in almost each line.
>
>In this case a regular expression may be the right tool. See  
>http://docs.python.org/lib/module-re.html
>
>py> import re
>py> text = """ok name=foo
>... in this line name=bar but
>... here you get name = another thing
>... is this what you want?"""
>py> for match in re.finditer(r"name\s*=\s*(\S+)", text):
>...   print match.group(1)
>...
>foo
>bar
>another

Thank you very much.
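
Since the question was about a large file, here is a sketch of the same pattern applied one line at a time, so the whole file never has to sit in memory. It is written for Python 3; extract_names is a made-up helper name, and io.StringIO stands in for a real file object:

```python
import io
import re

# Same pattern as in the quoted reply: "name", optional spaces, "=",
# optional spaces, then a run of non-space characters.
NAME_RE = re.compile(r"name\s*=\s*(\S+)")

def extract_names(lines):
    """Yield every name=... value found, scanning one line at a time."""
    for line in lines:
        for match in NAME_RE.finditer(line):
            yield match.group(1)

# With a real file you would pass the object returned by open(path) instead.
sample = io.StringIO(
    "ok name=foo\n"
    "in this line name=bar but\n"
    "here you get name = another thing\n"
)
values = list(extract_names(sample))
print(values)
```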


Re: pairs from a list

2008-01-22 Thread George Sakkis
On Jan 23, 1:39 am, Steven D'Aprano
<[EMAIL PROTECTED]> wrote:

> Given the human psychology involved, in the absence of
> definitive evidence one way or another it is a far safer bet to assume
> that people are unnecessarily asking for "the fastest" out of a misguided
> and often ignorant belief that they need it, rather than the opposite.
> People who actually need a faster solution usually know enough to preface
> their comments with an explanation of why their existing solution is too
> slow rather than just a context-free demand for "the fastest" solution.

As I mentioned already, I consider the seeking of the most efficient
solution a legitimate question, regardless of whether a "dumb"
solution is fast enough for an application. Call it a "don't be
sloppy" principle if you wish. It's the same reason I always use
xrange() instead of range() for a loop, although in practice the
difference is rarely measurable.

> Fast code is like fast cars. There *are* people who really genuinely need
> to have the fastest car available, but that number is dwarfed by the vast
> legions of tossers trying to make up for their lack of self-esteem by
> buying a car with a spoiler. Yeah, you're going to be traveling SO FAST
> on the way to the mall that the car is at risk of getting airborne, sure,
> we believe you.
>
> (The above sarcasm naturally doesn't apply to those who actually do need
> to travel at 200mph in a school zone, like police, taxi drivers and stock
> brokers.)

Good example; it shows that there's more than the utilitarian point of
view. People don't buy these cars because of an actual need but rather
because of the brand, the (perceived) social value and other reasons.

And since you like metaphors, here's another one: caring about
efficient code only when you need it is like keeping notes for a
course only for the material to be included in the final exams,
skipping the more encyclopedic, general knowledge lectures. Sure, you
may pass the class, even with a good grade, but for some people a
class is more than a final grade.

George


Re: Removing objects

2008-01-22 Thread Robert Kern
[EMAIL PROTECTED] wrote:
> I am writing a game, and it must keep a list of objects. I've been
> representing this as a list, but I need an object to be able to remove
> itself. It doesn't know its own index. If I tried to make each object
> keep track of its own index, it would be invalidated when any object
> with a lower index was deleted.  The error was that when I called
> list.remove(self), it just removed the first thing in the list with
> the same type as what I wanted, rather than the object I wanted. The
> objects have no identifying characteristics, other than their
> location in memory.

By default, classes that do not implement the special methods __eq__ or __cmp__
get compared by identity; i.e. "(x == y) == (x is y)". Double-check your classes
and their super-classes for implementations of one of these methods.
mylist.remove(x) will check "x is mylist[i]" first and only check
"x == mylist[i]" if that is False.


In [1]: class A(object):
   ...:     def __eq__(self, other):
   ...:         print '%r == %r' % (self, other)
   ...:         return self is other
   ...:     def __ne__(self, other):
   ...:         print '%r != %r' % (self, other)
   ...:         return self is not other
   ...:
   ...:

In [2]: As = [A() for i in range(10)]

In [3]: As
Out[3]:
[<__main__.A object at 0xf47f70>,
  <__main__.A object at 0xf47d90>,
  <__main__.A object at 0xf47db0>,
  <__main__.A object at 0xf47cb0>,
  <__main__.A object at 0xf47eb0>,
  <__main__.A object at 0xf47e70>,
  <__main__.A object at 0xf47cd0>,
  <__main__.A object at 0xf47e10>,
  <__main__.A object at 0xf47dd0>,
  <__main__.A object at 0xf47e90>]

In [4]: A0 = As[0]

In [5]: A0
Out[5]: <__main__.A object at 0xf47f70>

In [6]: As.remove(A0)

In [7]: As
Out[7]:
[<__main__.A object at 0xf47d90>,
  <__main__.A object at 0xf47db0>,
  <__main__.A object at 0xf47cb0>,
  <__main__.A object at 0xf47eb0>,
  <__main__.A object at 0xf47e70>,
  <__main__.A object at 0xf47cd0>,
  <__main__.A object at 0xf47e10>,
  <__main__.A object at 0xf47dd0>,
  <__main__.A object at 0xf47e90>]

In [8]: A0
Out[8]: <__main__.A object at 0xf47f70>

In [9]: A9 = As[-1]

In [10]: As.remove(A9)
<__main__.A object at 0xf47d90> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47db0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47cb0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47eb0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47e70> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47cd0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47e10> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47dd0> == <__main__.A object at 0xf47e90>

In [11]: As
Out[11]:
[<__main__.A object at 0xf47d90>,
  <__main__.A object at 0xf47db0>,
  <__main__.A object at 0xf47cb0>,
  <__main__.A object at 0xf47eb0>,
  <__main__.A object at 0xf47e70>,
  <__main__.A object at 0xf47cd0>,
  <__main__.A object at 0xf47e10>,
  <__main__.A object at 0xf47dd0>]

In [12]: A9
Out[12]: <__main__.A object at 0xf47e90>


If you cannot find an implementation of __eq__ or __cmp__ anywhere in your code,
please try to make a small, self-contained example like the one above but which
demonstrates your problem.

> So my question: How do I look something up in a list by its location
> in memory? Does Python even support pointers?

If you need to keep an __eq__ that works by equality of value instead of
identity, then you could keep a dictionary keyed by the id() of the object.
In CPython, that will correspond to its C pointer value in memory.

In [13]: id(A9)
Out[13]: 16023184

In [14]: hex(_)
Out[14]: '0xf47e90'
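
A minimal sketch of such an id()-keyed dictionary, in Python 3 syntax. Piece is a hypothetical game class invented for the example; note that an id() is only guaranteed unique while the object is alive, which holds here because the dictionary itself keeps each object alive:

```python
class Piece(object):
    """Hypothetical game object that compares by value, not identity."""

    def __init__(self, kind):
        self.kind = kind

    def __eq__(self, other):
        return isinstance(other, Piece) and self.kind == other.kind

    def __hash__(self):
        return hash(self.kind)

first, second = Piece("pawn"), Piece("pawn")   # equal, but distinct objects

registry = {}
for piece in (first, second):
    registry[id(piece)] = piece     # key on identity rather than on value

del registry[id(second)]            # removes exactly `second`, never `first`
print(len(registry), registry[id(first)] is first)
```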

> Is there a better way?

Possibly. It looks like you are implementing a cache of some kind. Depending on 
exactly how you are using it, you might want to consider a "weak" dictionary 
instead. A weak dictionary, specifically a WeakValueDictionary, acts like a 
normal dictionary, but only holds a weak reference to the object. A weak 
reference does not increment the object's reference count like a normal 
("strong") reference would. Consequently, once all of the "strong" references 
disappear, the object will be removed from the WeakValueDictionary without your 
having to do anything explicit. If this corresponds with when you want the 
object to be removed from the cache, then you might want to try this approach. 
Use "id(x)" as the key if there is no more meaningful key that fits your 
application.

   http://docs.python.org/lib/module-weakref.html
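
A short sketch of that behaviour, assuming Python 3 and a made-up Sprite class. The gc.collect() call is not needed on CPython, where reference counting frees the object at once, but it keeps the example honest on other implementations:

```python
import gc
import weakref

class Sprite(object):
    """Hypothetical game object; plain class instances can be weakly referenced."""
    pass

cache = weakref.WeakValueDictionary()

sprite = Sprite()
key = id(sprite)
cache[key] = sprite           # the cache holds only a weak reference
alive_before = key in cache

del sprite                    # drop the last strong reference
gc.collect()                  # make sure the object has really been collected
alive_after = key in cache

print(alive_before, alive_after)
```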

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



Re: Just for fun: Countdown numbers game solver

2008-01-22 Thread Arnaud Delobelle
On Jan 22, 10:56 pm, [EMAIL PROTECTED] wrote:
> Arnaud and Terry,
>
> Great solutions both of you! Much nicer than mine. I particularly like
> Arnaud's latest one based on folding because it's so neat and
> conceptually simple. For me, it's the closest so far to my goal of the
> most elegant solution.

Thanks!  It's a great little problem to think of and it helps bring
more fun to this list.  Sadly work takes over fun during the week, but
I will try to improve it at the weekend.

> So anyone got an answer to which set of numbers gives the most targets
> from 100 onwards say (or from 0 onwards)? Is Python up to the task?

I bet it is :)

> A thought on that last one. Two ways to improve speed. First of all,
> you don't need to rerun from scratch for each target

Yes, I've been doing this by writing an 'action' (see my code) that
takes note of all reached results.

> Secondly, you
> can try multiple different sets of numbers at the same time by passing
> numpy arrays instead of single values (although you have to give up
> the commutativity and division by zero optimisations).

Have to think about this.

--
Arnaud



Re: Removing objects

2008-01-22 Thread Asun Friere
On Jan 23, 5:59 pm, [EMAIL PROTECTED] wrote:
> I am writing a game, and it must keep a list of objects. I've been
> representing this as a list, but I need an object to be able to remove
> itself. It doesn't know its own index. If I tried to make each object
> keep track of its own index, it would be invalidated when any object
> with a lower index was deleted.  The error was that when I called
> list.remove(self), it just removed the first thing in the list with
> the same type as what I wanted, rather than the object I wanted. The
> objects have no identifying characteristics, other than their
> location in memory.
>
> So my question: How do I look something up in a list by its location
> in memory? Does Python even support pointers?
>
> Is there a better way?


How about adding an id attribute to your objects, which will contain a
unique identifier, override __eq__ to use that id to compare itself to
others and then simply pop off the object using
object_list.pop(object_list.index(self)).  Something like this:

>>> class Spam(object):
        def __init__(self, id):
            self.id = id
        def __eq__(self, other):
            try:
                return self.id == other.id
            except AttributeError:
                return False


>>>
>>> a,b,c = Spam(1), Spam(2), Spam(3)
>>> x = [a,b,c]
>>> x.pop(x.index(c))
<__main__.Spam object at 0x885e5ac>

Except your object would be telling the list to pop self of course,
and you'd need some way of ensuring the uniqueness of your IDs.
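
One possible way to get unique IDs, sketched for Python 3: a class-level itertools.count hands each new instance the next number. The remove_from method and the _ids attribute are names invented for this example, not part of the code above:

```python
import itertools

class Spam(object):
    _ids = itertools.count(1)        # shared counter, one number per instance

    def __init__(self):
        self.id = next(Spam._ids)

    def __eq__(self, other):
        return isinstance(other, Spam) and self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def remove_from(self, container):
        container.remove(self)       # list.remove locates self through __eq__

a, b, c = Spam(), Spam(), Spam()
x = [a, b, c]
b.remove_from(x)
print([s.id for s in x])
```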


Is there an HTML parser that can reconstruct the original HTML EXACTLY?

2008-01-22 Thread ioscas
Hi, I am looking for an HTML parser that can parse a given page into
a DOM tree and can reconstruct the exact original HTML source.
Strictly speaking, I should be able to retrieve the original
source at each internal node of the DOM tree.
I have tried Beautiful Soup, which is really nice when dealing with
those god damned ill-formed documents, but it's a pity that
it cannot retrieve the original source because of the great tidying
job it does.
Since Beautiful Soup, like most of the other HTML parsers in
Python, is to some extent a subclass of sgmllib.SGMLParser, I have
investigated the source code of sgmllib.SGMLParser to see if there is
anything I can do to tell Beautiful Soup where it can find every tag
segment in the HTML source, but this will be a time-consuming job.
So... any ideas?
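
One possible direction, sketched with html.parser from the Python 3 standard library rather than sgmllib or Beautiful Soup: HTMLParser.get_starttag_text() returns a start tag exactly as it was written, and getpos() gives the line and column reached. SourceKeepingParser is a made-up name, and this only records start tags; it does not build a DOM tree:

```python
from html.parser import HTMLParser

class SourceKeepingParser(HTMLParser):
    """Record each start tag together with its untouched source text."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()    # exact text as written in the input
        self.start_tags.append((tag, raw, self.getpos()))

parser = SourceKeepingParser()
parser.feed("<P CLASS=x>Hi <B>there</B></P>")
parser.close()

names = [tag for tag, _raw, _pos in parser.start_tags]
raw_tags = [raw for _tag, raw, _pos in parser.start_tags]
print(names)
print(raw_tags)
```

The parser normalises tag names to lower case, while the raw text keeps the original capitalisation and attribute spelling.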


cheers
kai liu


Re: Just for fun: Countdown numbers game solver

2008-01-22 Thread Arnaud Delobelle
On Jan 22, 9:05 am, Terry Jones <[EMAIL PROTECTED]> wrote:
> Hi Arnaud
>
> > I've tried a completely different approach, that I imagine as 'folding'.  I
> > thought it would improve performance over my previous effort but extremely
> > limited and crude benchmarking seems to indicate disappointingly comparable
> > performance...
>
> I wrote a stack-based version yesterday and it's also slow. It keeps track
> of the stack computation and allows you to backtrack. I'll post it
> sometime, but it's too slow for my liking and I need to put in some more
> optimizations. I'm trying not to think about this problem.
>
> What was wrong with the very fast(?) code you sent earlier?

I thought it was a bit convoluted, wanted to try something I thought
had more potential.  I think the problem with the second one is that I
repeat the same 'fold' too many times.  I'll take a closer look at the
weekend.

--
Arnaud



python24 symbol file...python24.pdb

2008-01-22 Thread over
I've seen a few references on the net to a python24.pdb file. I assume
it's a symbol file along the lines of the pdb files issued by
microsoft for their products. Maybe I'm wrong.

Has anyone seen such an animal?

Also, is there source code available for python24 for Windoze? I have
seen reference to source code but not in a package for Windows.

thanks 




Re: subprocess and & (ampersand)

2008-01-22 Thread Steven Bethard
Steven D'Aprano wrote:
> On Tue, 22 Jan 2008 22:53:20 -0700, Steven Bethard wrote:
> 
>> I'm having trouble using the subprocess module on Windows when my
>> command line includes special characters like "&" (ampersand)::
>>
>>  >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1&y=2'
>>  >>> kwargs = dict(stdin=subprocess.PIPE,
>> ...   stdout=subprocess.PIPE,
>> ...   stderr=subprocess.PIPE)
>>  >>> proc = subprocess.Popen(command, **kwargs)
>>  >>> proc.stderr.read()
>> "'y' is not recognized as an internal or external command,\r\noperable
>> program or batch file.\r\n"
>>
>> As you can see, Windows is interpreting that "&" as separating two
>> commands, instead of being part of the single argument as I intend it to
>> be above.  Is there any workaround for this?  How do I get "&" treated
>> like a regular character using the subprocess module?
> 
> 
> That's nothing to do with the subprocess module. As you say, it is 
> Windows interpreting the ampersand as a special character, so you need to 
> escape the character to the Windows shell.
> 
> Under Windows, the escape character is ^, or you can put the string in 
> double quotes:
> 
> # untested
> command = 'lynx.bat -dump http://www.example.com/?x=1^&y=2'
> command = 'lynx.bat -dump "http://www.example.com/?x=1&y=2"'

Sorry, I should have mentioned that I already tried that. You get the 
same result::

   >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1^&y=2'
   >>> proc = subprocess.Popen(command,
   ... stdin=subprocess.PIPE,
   ... stdout=subprocess.PIPE,
   ... stderr=subprocess.PIPE)
   >>> proc.stderr.read()
   "'y' is not recognized as an internal or external command,\r\noperable
   program or batch file.\r\n"

In fact, the "^" doesn't seem to work at the command line either::

   >lynx.bat -dump http://www.example.com/?x=1^&y=2

   Can't Access `file://localhost/C:/PROGRA~1/lynx/1'
   Alert!: Unable to access document.

   lynx: Can't access startfile
   'y' is not recognized as an internal or external command,
   operable program or batch file.

Using quotes does work at the command line::

   C:\PROGRA~1\lynx>lynx.bat -dump "http://www.example.com/?x=1&y=2"
  You have reached this web page by typing "example.com",
  "example.net", or "example.org" into your web browser.

  These  domain  names are reserved for use in documentation and are
  not available for registration. See [1]RFC 2606, Section 3.

   References

  1. http://www.rfc-editor.org/rfc/rfc2606.txt

But I get no output at all when using quotes with subprocess::

   >>> command= 'lynx.bat', '-dump', '"http://www.example.com/?x=1^&y=2"'
   >>> proc = subprocess.Popen(command,
   ... stdin=subprocess.PIPE,
   ... stdout=subprocess.PIPE,
   ... stderr=subprocess.PIPE)
   >>> proc.stderr.read()
   ''

Any other ideas?
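
For anyone debugging something similar: on Windows, subprocess joins an argument sequence into a single command-line string before starting the process, and subprocess.list2cmdline (a helper that is not part of the module's public __all__) shows the result. The sketch below, in Python 3 syntax, only prints those strings. That they explain the behaviour above is an assumption, since cmd.exe applies its own parsing when it runs a .bat file:

```python
import subprocess

url = 'http://www.example.com/?x=1&y=2'

# No argument contains a space, so nothing gets quoted: "&" is left bare.
plain = subprocess.list2cmdline(['lynx.bat', '-dump', url])

# Quote characters that are part of an argument are backslash-escaped.
quoted = subprocess.list2cmdline(['lynx.bat', '-dump', '"%s"' % url])

print(plain)
print(quoted)
```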

STeVe


Removing objects

2008-01-22 Thread bladedpenguin
I am writing a game, and it must keep a list of objects. I've been
representing this as a list, but I need an object to be able to remove
itself. It doesn't know its own index. If I tried to make each object
keep track of its own index, it would be invalidated when any object
with a lower index was deleted.  The error was that when I called
list.remove(self), it just removed the first thing in the list with
the same type as what I wanted, rather than the object I wanted. The
objects have no identifying characteristics, other than their
location in memory.

So my question: How do I look something up in a list by its location
in memory? Does Python even support pointers?

Is there a better way?


UNSUBSCRIBE

2008-01-22 Thread TezZ Da [EMAIL PROTECTED] MaN


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Diez B. Roggisch
Sent: 22 January 2008 20:22
To: python-list@python.org
Subject: Re: isgenerator(...) - anywhere to be found?

Jean-Paul Calderone wrote:

> On Tue, 22 Jan 2008 15:15:43 +0100, "Diez B. Roggisch"
> <[EMAIL PROTECTED]> wrote:
>>Jean-Paul Calderone wrote:
>>
>>> On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch"
>>> <[EMAIL PROTECTED]> wrote:
>>>> For a simple greenlet/tasklet/microthreading experiment I found myself
>>>> in the need to ask the question
>>>>
>>>> [snip]
>>>
>>> Why do you need a special case for generators?  If you just pass the
>>> object in question to iter(), instead, then you'll either get back
>>> something that you can iterate over, or you'll get an exception for
>>> things that aren't iterable.
>>
>>Because - as I said - I'm working on a micro-thread thingy, where the
>>scheduler needs to push returned generators to a stack and execute them.
>>Using send(), which rules out iter() anyway.
> 
> Sorry, I still don't understand.  Why is a generator different from any
> other iterator?

Because you can use send(value) on it, for example, which you can't with
every other iterator. And you can utilize that to create a little
framework of co-routines, or however you like to call it, that will yield
values when they want, or generators if they have nested co-routines the
scheduler needs to keep track of and invoke one after another.

I'm currently at work and can't show you the code - I don't claim that my
current approach is the shizzle, but so far it serves my purposes - and I
need an isgenerator()
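
For later readers: Python 2.6 and newer ship inspect.isgenerator(), which was not yet available when this was written. A sketch in Python 3 syntax of the distinction described above; worker is a made-up co-routine, not Diez's code:

```python
import inspect

def worker():
    """Tiny co-routine: receives values through send() and yields them doubled."""
    received = yield "ready"
    while True:
        received = yield received * 2

gen = worker()
plain_iter = iter([1, 2, 3])

first = next(gen)         # run up to the first yield
reply = gen.send(21)      # resume the generator, handing it a value

print(first, reply)
print(inspect.isgenerator(gen), inspect.isgenerator(plain_iter))
print(hasattr(plain_iter, "send"))
```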

Diez


Re: pairs from a list

2008-01-22 Thread Steven D'Aprano
On Tue, 22 Jan 2008 18:32:22 -0800, George Sakkis wrote:

> The OP didn't mention anything about the context; for all we know, this
> might be a homework problem or the body of a tight inner loop. There is
> this tendency on c.l.py to assume that every optimization question is
> about a tiny subproblem of a 100 KLOC application. Without further
> context, we just don't know.

Funny. As far as I can tell, the usual assumption on c.l.py is that every 
tiny two-line piece of code is the absolute most critically important 
heart of an application which gets called billions of times on petabytes 
of data daily.

Given the human psychology involved, in the absence of 
definitive evidence one way or another it is a far safer bet to assume 
that people are unnecessarily asking for "the fastest" out of a misguided 
and often ignorant belief that they need it, rather than the opposite. 
People who actually need a faster solution usually know enough to preface 
their comments with an explanation of why their existing solution is too 
slow rather than just a context-free demand for "the fastest" solution.

Fast code is like fast cars. There *are* people who really genuinely need 
to have the fastest car available, but that number is dwarfed by the vast 
legions of tossers trying to make up for their lack of self-esteem by 
buying a car with a spoiler. Yeah, you're going to be traveling SO FAST 
on the way to the mall that the car is at risk of getting airborne, sure, 
we believe you.

(The above sarcasm naturally doesn't apply to those who actually do need 
to travel at 200mph in a school zone, like police, taxi drivers and stock 
brokers.)



-- 
Steven


Re: subprocess and & (ampersand)

2008-01-22 Thread Steven D'Aprano
On Tue, 22 Jan 2008 22:53:20 -0700, Steven Bethard wrote:

> I'm having trouble using the subprocess module on Windows when my
> command line includes special characters like "&" (ampersand)::
> 
>  >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1&y=2'
>  >>> kwargs = dict(stdin=subprocess.PIPE,
> ...   stdout=subprocess.PIPE,
> ...   stderr=subprocess.PIPE)
>  >>> proc = subprocess.Popen(command, **kwargs)
>  >>> proc.stderr.read()
> "'y' is not recognized as an internal or external command,\r\noperable
> program or batch file.\r\n"
> 
> As you can see, Windows is interpreting that "&" as separating two
> commands, instead of being part of the single argument as I intend it to
> be above.  Is there any workaround for this?  How do I get "&" treated
> like a regular character using the subprocess module?


That's nothing to do with the subprocess module. As you say, it is 
Windows interpreting the ampersand as a special character, so you need to 
escape the character to the Windows shell.

Under Windows, the escape character is ^, or you can put the string in 
double quotes:

# untested
command = 'lynx.bat -dump http://www.example.com/?x=1^&y=2'
command = 'lynx.bat -dump "http://www.example.com/?x=1&y=2"'

In Linux land, you would use a backslash or quotes.
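
On the POSIX side the quoting can be left to the standard library. A sketch assuming Python 3, where shlex.quote() is available; it targets POSIX shells only and does not apply to the Windows shell:

```python
import shlex

url = "http://www.example.com/?x=1&y=2"

safe = shlex.quote(url)                  # adds single quotes when they are needed
command_line = "lynx -dump " + safe      # suitable for a POSIX shell, not cmd.exe

print(command_line)
print(shlex.split(command_line))         # shell-style parsing recovers the URL intact
```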

To find the answer to this question, I googled for "windows how to escape 
special characters shell" and found these two pages:


http://www.microsoft.com/technet/archive/winntas/deploy/prodspecs/shellscr.mspx

http://technet2.microsoft.com/WindowsServer/en/library/44500063-fdaf-4e4f-8dac-476c497a166f1033.mspx


Hope this helps,



-- 
Steven


subprocess and & (ampersand)

2008-01-22 Thread Steven Bethard
I'm having trouble using the subprocess module on Windows when my 
command line includes special characters like "&" (ampersand)::

 >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1&y=2'
 >>> kwargs = dict(stdin=subprocess.PIPE,
...   stdout=subprocess.PIPE,
...   stderr=subprocess.PIPE)
 >>> proc = subprocess.Popen(command, **kwargs)
 >>> proc.stderr.read()
"'y' is not recognized as an internal or external command,\r\noperable 
program or batch file.\r\n"

As you can see, Windows is interpreting that "&" as separating two 
commands, instead of being part of the single argument as I intend it to 
be above.  Is there any workaround for this?  How do I get "&" treated 
like a regular character using the subprocess module?

Thanks,

STeVe


Re: translating Python to Assembler

2008-01-22 Thread Steven D'Aprano
On Wed, 23 Jan 2008 04:58:02 +, Grant Edwards wrote:

> On 2008-01-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> 
>> My expertise, if any, is in assembler. I'm trying to understand Python
>> scripts and modules by examining them after they have been disassembled
>> in a Windows environment.
> 
> You can't disassemble them, since they aren't ever converted to
> assembler and assembled.  Python is compiled into bytecode for a virtual
> machine (either the Java VM or the Python VM or the .NET VM).


There is the Python disassembler, dis, which disassembles the bytecode 
into something which might as well be "assembler" *cough* for the virtual 
machine.
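
A quick sketch of dis in use, in Python 3 syntax (dis.get_instructions needs 3.4 or later). The add function is a throwaway example, and the exact opcode names differ between Python versions, so no listing is reproduced here:

```python
import dis

def add(a, b):
    return a + b

# Each instruction carries an opcode name; collect them for inspection.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

dis.dis(add)     # human-readable listing, written to stdout
```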



-- 
Steven


Re: Cleanup when a object dies

2008-01-22 Thread Raymond Hettinger
On Jan 22, 7:54 pm, Benjamin <[EMAIL PROTECTED]> wrote:
> I am writing a class to allow settings (options, preferences) to be
> written to a file in a cross-platform manner. I'm unsure how to go about
> syncing the data to disk. Of course, it's horribly inefficient to
> write the data every time something changes a value; however, I don't
> see how I can do it on deletion. I've read that __del__ methods should
> be avoided. So am I just going to have to force the client of my
> object to call sync when they're done?

Lots of ways
1. Try the atexit module
2. Use a weakref callback
3. Embed a client callback in a try/finally.
4. Or, like you said, have the client call a sync() method -- this is
explicit and gives the client control over when data is written.
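
A rough sketch of options 3 and 4 combined, in Python 3 syntax. The Settings class, its dirty flag and the JSON file format are all assumptions made for the example, not Benjamin's code:

```python
import json
import os
import tempfile

class Settings(object):
    """Hypothetical settings store: changes stay in memory until sync()."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        self.dirty = False

    def set(self, key, value):
        self.values[key] = value
        self.dirty = True               # remember that the file is stale

    def sync(self):
        if self.dirty:                  # skip the write when nothing changed
            with open(self.path, "w") as handle:
                json.dump(self.values, handle)
            self.dirty = False

path = os.path.join(tempfile.mkdtemp(), "settings.json")
settings = Settings(path)
try:
    settings.set("volume", 7)
    settings.set("theme", "dark")
finally:
    settings.sync()                     # one write, even if an error occurred above

with open(path) as handle:
    saved = json.load(handle)
print(saved)
```

For option 1, atexit.register(settings.sync) would schedule the same call at normal interpreter exit.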

Raymond


Re: translating Python to Assembler

2008-01-22 Thread Grant Edwards
On 2008-01-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> My expertise, if any, is in assembler. I'm trying to
> understand Python scripts and modules by examining them after
> they have been disassembled in a Windows environment.

You can't disassemble them, since they aren't ever converted
to assembler and assembled.  Python is compiled into bytecode
for a virtual machine (either the Java VM or the Python VM or
the .NET VM).

> I'm wondering if a Python symbols file is available.

You're way off track.

> In the Windows environment, a symbol file normally has a PDB
> extension. It's a little unfortunate that Python also uses PDB
> for its debugger. Google, for whatever reason, won't accept
> queries with dots, hyphens, etc., in the query line. For
> example a Google for "python.pdb" returns +python +pdb, so I
> get a ridiculous number of returns referring to the python
> debugger. I have mentioned this to Google several times, but I
> guess logic isn't one of their strong points.  :-)

Trying to find assembly language stuff to look at is futile.
Python doesn't get compiled into assembly language.

If you want to learn Python, then read a book on Python.

-- 
Grant Edwards   grante at visi.com   Yow!  I am NOT a nut


monitoring device status with python ...

2008-01-22 Thread Ajay Deshpande

hi everyone:

i am writing a program, which needs to keep monitoring whether a certain usb 
hard drive is connected/hot-plugged in or not. instead of repeatedly checking 
if its path exists or not, can i have the os let my program know that the 
device has been connected? i have read about the minihallib module but have not 
come across an elaborate example. can any of you point me to any examples (or 
alternatives)?

i'd appreciate any help.

regards,
-ajay

Cleanup when a object dies

2008-01-22 Thread Benjamin
I'm writing a class to allow settings (options, preferences) to be
written to a file in a cross-platform manner. I'm unsure how to go about
syncing the data to disk. Of course, it's horribly inefficient to
write the data every time something changes a value, but I don't
see how I can do it on deletion, since I've read that __del__ methods
should be avoided. So am I just going to have to force the client of my
object to call sync when they're done?
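
One possible shape for this, as a rough sketch rather than a recommendation: keep changes in memory, expose an explicit sync(), let clients use the object in a "with" block, and register sync with atexit as a safety net. The class name Settings, the JSON file format, and the method names are all assumptions made for illustration, and the code is written for current Python 3.

```python
import atexit
import json
import os
import tempfile


class Settings:
    """Hypothetical settings store: changes stay in memory until sync()."""

    def __init__(self, path):
        self.path = path
        self._data = {}
        self._dirty = False
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self._data = json.load(f)
        # Safety net: flush at normal interpreter exit if the client forgot.
        # Note that this registration keeps the object alive until exit.
        atexit.register(self.sync)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value
        self._dirty = True

    def sync(self):
        """Write to disk only if something changed since the last sync."""
        if not self._dirty:
            return
        # Write to a temporary file, then rename it over the real one, so a
        # crash in the middle of a write cannot leave a half-written file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self._data, f)
        os.replace(tmp, self.path)
        self._dirty = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.sync()
        return False  # do not swallow exceptions
```

A client would then write "with Settings(path) as s: s.set(...)" and the data is written once, when the block ends, without relying on __del__.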
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: HTML parsing confusion

2008-01-22 Thread Alnilam
On Jan 22, 7:29 pm, "Gabriel Genellina" <[EMAIL PROTECTED]>
wrote:
>
> > I was asking this community if there was a simple way to use only the
> > tools included with Python to parse a bit of html.
>
> If you *know* that your document is valid HTML, you can use the HTMLParser  
> module in the standard Python library. Or even the parser in the htmllib  
> module. But a lot of HTML pages out there are invalid, some are grossly  
> invalid, and those parsers are just unable to handle them. This is why  
> modules like BeautifulSoup exist: they contain a lot of heuristics and  
> trial-and-error and personal experience from the developers, in order to  
> guess more or less what the page author intended to write and make some  
> sense of that "tag soup".
> A guesswork like that is not suitable for the std lib ("Errors should  
> never pass silently" and "In the face of ambiguity, refuse the temptation  
> to guess.") but makes a perfect 3rd party module.
>
> If you want to use regular expressions, and that works OK for the  
> documents you are handling now, fine. But don't complain when your RE's  
> match too much or too little or don't match at all because of unclosed  
> tags, improperly nested tags, nonsense markup, or just a valid combination  
> that you didn't take into account.
>
> --
> Gabriel Genellina

Thanks, Gabriel. That does make sense, both what the benefits of
BeautifulSoup are and why it probably won't become std lib anytime
soon.

The pages I'm trying to write this code to run against aren't in the
wild, though. They are static html files on my company's lan, are very
consistent in format, and are (I believe) valid html. They just have
specific paragraphs of useful information, located in the same place
in each file, that I want to 'harvest' and put to better use. I used
diveintopython.org as an example only (and in part because it had good
clean html formatting). I am pretty sure that I could craft some
regular expressions to do the work -- which of course would not be the
case if I was screen scraping web pages in the 'wild' -- but I was
trying to find a way to do that using one of those std libs you
mentioned.

I'm not sure if HTMLParser or htmllib would work better to achieve the
same effect as the regex example I gave above, or how to get them to
do that. I thought I'd come close, but as someone pointed out early
on, I'd accidentally tapped into PyXML which is installed where I was
testing code, but not necessarily where I need it. It may turn out
that the regex way works faster, but falling back on methods I'm
comfortable with doesn't help expand my Python knowledge.

So if anyone can tell me how to get HTMLParser or htmllib to grab a
specific paragraph, and then provide the text in that paragraph in a
clean, markup-free format, I'd appreciate it.
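
A minimal sketch of how that could look with the standard-library parser, assuming the wanted paragraph can be identified by its position among the <p> elements. The names ParagraphGrabber and paragraph_text are made up for this example, and it uses the Python 3 module name html.parser (in the Python 2.x releases discussed in this thread the module is called HTMLParser).

```python
from html.parser import HTMLParser


class ParagraphGrabber(HTMLParser):
    """Collect the text of the n-th <p> element (0-based), markup removed."""

    def __init__(self, index):
        super().__init__(convert_charrefs=True)
        self.index = index       # which paragraph to keep
        self.count = -1          # index of the paragraph currently being read
        self.capturing = False   # True while inside the wanted paragraph
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.count += 1
            self.capturing = self.count == self.index

    def handle_endtag(self, tag):
        if tag == "p":
            self.capturing = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> arrives here as well.
        if self.capturing:
            self.chunks.append(data)


def paragraph_text(html, index):
    parser = ParagraphGrabber(index)
    parser.feed(html)
    parser.close()
    # Join the pieces and collapse runs of whitespace.
    return " ".join("".join(parser.chunks).split())
```

Calling paragraph_text(page_source, 1) would return the text of the second paragraph. If the paragraph is better identified by an attribute, the attrs list passed to handle_starttag can be inspected instead of counting.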
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: possible to overide setattr in local scope?

2008-01-22 Thread Terry Reedy

"glomde" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| In a class it is possible to override setattr, so that you can decide
| how you should
| handle setting of variables.
|
| Is this possible to do outside of a class, at module level?
|
| def mysetattr(obj, var, value):
|     print "Hello"
|
| So that
|
| test = 5
|
|
| would print
| Hello

An assignment at module level amounts to setting an attribute of an 
instance of the builtin (C coded) module type, which you cannot change. 
Even if you can subclass that type (I don't know), there is no way to get 
the (stock) interpreter to use instances of your module subclass instead. 
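
A small experiment along these lines, run on current CPython 3 (an assumption; the question was asked about the 2.5-era interpreter). The class name LoudModule and the list name assigned are made up for the demonstration. It suggests that subclassing the module type is possible, but that a hook on the subclass is only consulted for attribute assignment from outside, not for a plain assignment executed as module code, which is stored straight into the namespace dictionary.

```python
import types

assigned = []  # names seen by the hook


class LoudModule(types.ModuleType):
    """Module subclass that records every attribute assignment it sees."""

    def __setattr__(self, name, value):
        assigned.append(name)
        super().__setattr__(name, value)


mod = LoudModule("demo")

# Attribute assignment from outside the module: the hook runs.
mod.test = 5

# The same kind of statement executed as module-level code: the name is
# written directly into the module's dictionary and the hook is skipped.
exec("other = 6", mod.__dict__)

print(assigned)
print(mod.other)
```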



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: translating Python to Assembler...sorry if this is duplicated...it's unintentional

2008-01-22 Thread Mike Driscoll
On Jan 22, 4:45 pm, [EMAIL PROTECTED] wrote:
> My expertise, if any, is in assembler. I'm trying to understand Python
> scripts and modules by examining them after they have been
> disassembled in a Windows environment.
>
> I'm wondering if a Python symbols file is available. In the Windows
> environment, a symbol file normally has a PDB extension. It's a little
> unfortunate that Python also uses PDB for its debugger. Google, for
> whatever reason, won't accept queries with dots, hyphens, etc., in the
> query line. For example a Google for "python.pdb" returns +python
> +pdb, so I get a ridiculous number of returns referring to the python
> debugger. I have mentioned this to Google several times, but I guess
> logic isn't one of their strong points.  :-)
>
> If there are duplicates of this post, it's because it wouldn't send for
> some reason.

I'm not sure what you're talking about...mainly because I'm not sure
what you mean by a "symbols file". But I did some google-fu myself and
found this CookBook entry:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/200638

And this thread seems to be talking about symbol resolution, I think:

http://www.python.org/search/hypermail/python-1994q2/0605.html

And here's some weird site that claims to have a list of inseparable
symbols, whatever that means:

voicecode.iit.nrc.ca/VCodeWiki/public/wiki.cgi?obj=ListOfUnseparablePythonSymbols

I can't get it to load unless I use Google's cached version though.

Hope that helps and that I'm not too far off the mark!

Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: HTML parsing confusion

2008-01-22 Thread [EMAIL PROTECTED]
On Jan 22, 7:29 pm, "Gabriel Genellina" <[EMAIL PROTECTED]>
wrote:

>
> > I was asking this community if there was a simple way to use only the
> > tools included with Python to parse a bit of html.
>
> If you *know* that your document is valid HTML, you can use the HTMLParser
> module in the standard Python library. Or even the parser in the htmllib
> module. But a lot of HTML pages out there are invalid, some are grossly
> invalid, and those parsers are just unable to handle them. This is why
> modules like BeautifulSoup exist: they contain a lot of heuristics and
> trial-and-error and personal experience from the developers, in order to
> guess more or less what the page author intended to write and make some
> sense of that "tag soup".
> A guesswork like that is not suitable for the std lib ("Errors should
> never pass silently" and "In the face of ambiguity, refuse the temptation
> to guess.") but makes a perfect 3rd party module.
>
> If you want to use regular expressions, and that works OK for the
> documents you are handling now, fine. But don't complain when your RE's
> match too much or too little or don't match at all because of unclosed
> tags, improperly nested tags, nonsense markup, or just a valid combination
> that you didn't take into account.
>
> --
> Gabriel Genellina

Thank you. That does make perfect sense, and is a good clear position
on the up and down side of what I'm trying to do, as well as a good
explanation for why BeautifulSoup will probably remain outside the std
lib. I'm sure that I will get plenty of use out of it.

If, however, I am sure that the html code in the target documents is
good, and the framework html doesn't change, just the data on page
after page of static html, would it be better to just go with regex or
with one of the std lib items you mentioned? I thought the latter, but
I'm stuck on how to make them generate results similar to the code I
put above as an example. I'm not trying to code this to go against
html in the wild, but to try to strip specific, consistently located
data from the markup and turn it into something more useful.

I may have confused folks by using the www.diveintopython.org page as
an example, but its html seemed to be valid strict tags.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Extract value from a attribute in a string

2008-01-22 Thread Gabriel Genellina
On Tue, 22 Jan 2008 23:45:22 -0200, <[EMAIL PROTECTED]> wrote:

> I am looking for some help in reading a large text file and extracting
> a value from an attribute. I would need to find name=foo and
> extract just the value foo, which can be at any location in the string.
> The attribute name will be in almost every line.

In this case a regular expression may be the right tool. See  
http://docs.python.org/lib/module-re.html

py> import re
py> text = """ok name=foo
... in this line name=bar but
... here you get name = another thing
... is this what you want?"""
py> for match in re.finditer(r"name\s*=\s*(\S+)", text):
...   print match.group(1)
...
foo
bar
another
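
For a large file, the same pattern can be applied line by line so the whole text never has to sit in memory. This is only a sketch in Python 3 syntax; the helper name iter_names is made up for illustration.

```python
import re

# Same pattern as in the session above, compiled once.
NAME_RE = re.compile(r"name\s*=\s*(\S+)")


def iter_names(lines):
    """Yield every value that follows 'name=' in an iterable of lines."""
    for line in lines:
        for match in NAME_RE.finditer(line):
            yield match.group(1)
```

Since an open file is itself an iterable of lines, "for value in iter_names(open_file): ..." processes the file one line at a time. Note that the pattern also matches inside longer words such as "surname=..."; prefixing it with \b would rule that out.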

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Don't want child process inheriting open sockets

2008-01-22 Thread Gabriel Genellina
On Tue, 22 Jan 2008 13:02:35 -0200, Steven Watanabe
<[EMAIL PROTECTED]> wrote:

> I'm using subprocess.Popen() to create a child process. The child  
> process is inheriting the parent process' open sockets, but I don't want  
> that. I believe that on Unix systems I could use the FD_CLOEXEC flag,  
> but I'm running Windows. Any suggestions?

You could use the DuplicateHandle Windows API function with  
bInheritHandle=False to create a non inheritable socket handle, then close  
the original one. This should be done for every socket you don't want to  
be inherited.
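
For readers on Python 3.4 or later (an assumption; the question concerns an older release), the interpreter creates sockets as non-inheritable by default and socket objects carry set_inheritable() and get_inheritable() methods, so the flag can be set and checked without calling the Windows API directly. A minimal sketch, with make_private_socket as a made-up helper name:

```python
import socket


def make_private_socket():
    """Create a TCP socket that child processes should not inherit."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Already the default on Python 3.4+, stated explicitly here for clarity.
    sock.set_inheritable(False)
    return sock
```

On older interpreters this method does not exist, and the DuplicateHandle approach described above remains the route to take.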

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread George Sakkis
On Jan 22, 1:34 pm, Paddy <[EMAIL PROTECTED]> wrote:
> On Jan 22, 5:34 am, George Sakkis <[EMAIL PROTECTED]> wrote:
>
>
>
> > On Jan 22, 12:15 am, Paddy <[EMAIL PROTECTED]> wrote:
>
> > > On Jan 22, 3:20 am, Alan Isaac <[EMAIL PROTECTED]> wrote:
> > > > I want to generate sequential pairs from a list.
> > > <>
> > > > What is the fastest way? (Ignore the import time.)
>
> > > 1) How fast is the method you have?
> > > 2) How much faster does it need to be for your application?
> > > 3) Are there any other bottlenecks in your application?
> > > 4) Is this the routine whose smallest % speed-up would give the
> > > largest overall speed up of your application?
>
> > I believe the "what is the fastest way" question for such small well-
> > defined tasks is worth asking on its own, regardless of whether it
> > makes a difference in the application (or even if there is no
> > application to begin with).
>
> Hi George,
> You need to 'get it right' first.

For such trivial problems, getting it right alone isn't a particularly
high expectation.

> Micro optimizations for speed
> without thought of the wider context is a bad habit to form and a time
> waster.

The OP didn't mention anything about the context; for all we know,
this might be a homework problem or the body of a tight inner loop.
There is this tendency on c.l.py to assume that every optimization
question is about a tiny subproblem of a 100 KLOC application. Without
further context, we just don't know.

> If the routine is all that needs to be delivered and it does not
> perform at an acceptable speed then find out what is acceptable
> and optimise towards that goal. My questions were set to get
> posters to think more about the need for speed optimizations and
> where they should be applied, (if at all).

I don't agree with this logic in general. Just because one can solve a
problem by throwing a quick and dirty hack with quadratic complexity
that happens to do well enough on current typical input, it doesn't
mean he shouldn't spend ten or thirty minutes more to write a proper
linear time solution, all else being equal or at least comparable
(elegance, conciseness, readability, etc.). Of course it's a tradeoff;
spending a week to save a few milliseconds on average is usually a
waste for most applications, but being a lazy keyboard banger writing
the first thing that pops into mind is not that good either.
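
To make the tradeoff concrete, here is a sketch that compares three ways of pairing up a list, assuming "sequential pairs" means non-overlapping pairs (x[0], x[1]), (x[2], x[3]), and so on, with a trailing odd element dropped. The function names are made up, the code is Python 3, and the timings it prints will vary by machine.

```python
import timeit


def pairs_index(seq):
    """Linear time: index arithmetic."""
    return [(seq[i], seq[i + 1]) for i in range(0, len(seq) - 1, 2)]


def pairs_zip(seq):
    """Linear time: zip one iterator with itself."""
    it = iter(seq)
    return list(zip(it, it))


def pairs_quadratic(seq):
    """Deliberately poor: each list.pop(0) shifts the whole list."""
    work = list(seq)
    out = []
    while len(work) >= 2:
        out.append((work.pop(0), work.pop(0)))
    return out


if __name__ == "__main__":
    data = list(range(10000))
    for func in (pairs_index, pairs_zip, pairs_quadratic):
        seconds = timeit.timeit(lambda: func(data), number=5)
        print(func.__name__, round(seconds, 4))
```

All three return the same result; the difference only shows up as the input grows.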

George
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Submitting with PAMIE

2008-01-22 Thread romo20350
On Jan 22, 7:49 pm, "Gabriel Genellina" <[EMAIL PROTECTED]>
wrote:
> On Tue, 22 Jan 2008 15:39:33 -0200, <[EMAIL PROTECTED]> wrote:
>
> > Hi I really need help. I've been looking around for an answer forever.
> > I need to submit a form with no name and also the submit button has no
> > name or value. How might I go about doing either of these? Thanks
>
> I think you'll have more luck in a specific forum like the PAMIE User
> Group at http://tech.groups.yahoo.com/group/Pamie_UsersGroup/
>
> --
> Gabriel Genellina

Thanks, I signed up and am awaiting approval. Hopefully I can get help
soon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Extract value from a attribute in a string

2008-01-22 Thread inFocus

Hello,

I am looking for some help in reading a large text file and extracting
a value from an attribute. I would need to find name=foo and
extract just the value foo, which can be at any location in the string.
The attribute name will be in almost every line.

Thank you for any suggestions. 
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using utidylib, empty string returned in some cases

2008-01-22 Thread Gabriel Genellina
On Tue, 22 Jan 2008 15:35:16 -0200, Boris <[EMAIL PROTECTED]>
wrote:

> I'm using debian linux, Python 2.4.4, and utidylib
> (http://utidylib.berlios.de/). I wrote simple functions to get a web page,
> convert it from windows-1251 to utf8 and then I'd like to clean html
> with it.

Why the intermediate conversion? I don't know utidylib, but can't you feed  
it with the original page, in the original encoding? If the page itself  
contains a "meta http-equiv" tag stating its content-type and charset, it  
won't be valid anymore if you reencode the page.
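
A small illustration of that mismatch, using only the built-in codecs and Python 3 string semantics (the sample page is made up): after re-encoding to UTF-8 the bytes still carry the old charset declaration, and anything that trusts the declaration decodes them wrongly.

```python
page = (
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=windows-1251">'
    "<p>Привет</p>"
)

original = page.encode("cp1251")            # the page as served
text = original.decode("cp1251")            # decoded to text
reencoded = text.encode("utf-8")            # the intermediate conversion

# The declaration inside the page was not touched by the conversion.
still_declared = b"charset=windows-1251" in reencoded

# A consumer that believes the declaration now gets mojibake.
misread = reencoded.decode("cp1251", errors="replace")

# One way out: rewrite the declaration together with the bytes.
fixed = text.replace("windows-1251", "utf-8").encode("utf-8")
```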

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Core Python Programming . . .

2008-01-22 Thread wesley chun
> > 6-11 Conversion.
> >   (a) Create a program that will convert from an integer to an
> > Internet Protocol (IP) address in the four-octet format of WWW.XXX.YYY.ZZZ
> >   (b) Update your program to be able to do the vice versa of the above.
>
> I think it's asking to convert a 32-bit int to the dotted form.
>
> It's a little known fact, but IP addresses are valid in non-dotted
> long-int form.  Spammers commonly use this trick to disguise their IP
> addresses in emails from scanners.


that is correct.  don't read too much into it.  i'm not trying to
validate anything or any format, use old or new technology.  it is
simply to exercise your skills with numbers (specifically 32-bit/4-
byte integers), string manipulation, and bitwise operations.  if you
wish to use different sizes of numbers, forms of addressing, IPv6,
etc., that's up to you. don't forget about part (b), which is to take
an IP address and turn it into a 32-bit integer.
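
For anyone who wants to check their own attempt afterwards, one possible approach to both parts using shifts and masks is sketched below. The function names int_to_ip and ip_to_int are made up here and are not taken from the book.

```python
def int_to_ip(value):
    """Part (a): 32-bit integer to dotted-quad string."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit unsigned integer: %r" % (value,))
    # Take the four bytes from most significant to least significant.
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def ip_to_int(address):
    """Part (b): dotted-quad string to 32-bit integer."""
    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError("expected four octets: %r" % (address,))
    value = 0
    for octet in octets:
        number = int(octet)
        if not 0 <= number <= 255:
            raise ValueError("octet out of range: %r" % (octet,))
        # Make room for the next byte, then add it.
        value = (value << 8) | number
    return value
```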

enjoy!
-- wesley

ps. since you're on p. 248, there is also a typo in the piece of code
right above this exercise, Example 6.4, which is tied to exercise
6-7.  "'fac_list'" should really be "`fac_list`", or even better,
"repr(fac_list)".  see the Errata at the book's website http://corepython.com
for more details.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"Core Python Programming", Prentice Hall, (c)2007,2001
http://corepython.com

wesley.j.chun :: wescpy-at-gmail.com
python training and technical consulting
cyberweb.consulting : silicon valley, ca
http://cyberwebconsulting.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: translating Python to Assembler

2008-01-22 Thread Luis Zarrabeitia

I second Wim's opinion. Learn Python as a high-level language; you won't regret
it.

About Google, I'll give you a little tip:

> > For example a Google for "python.pdb" returns +python
> > +pdb, so I get a ridiculous number of returns referring to the python
> > debugger. I have mentioned this to Google several times, but I guess
> > logic isn't one of their strong points.  :-)

Instead of searching for 'python.pdb', try the query "filetype:pdb python", or even
"python pdb" (quoted). The first one should give you files with a pdb extension
and python in the name or contents, and the second one (quoted) should return
pages with both words adjacent, apart from commas, spaces, dots, slashes, etc.

However... one of the second query results is this thread in google groups...
not a good sign.

-- 
Luis Zarrabeitia
Facultad de Matemática y Computación, UH
http://profesores.matcom.uh.cu/~kyrie


Quoting Wim Vander Schelden <[EMAIL PROTECTED]>:

> Python modules and scripts are normally not even compiled; if they have
> been, it's probably just the Python interpreter packaged with the scripts
> and resources.
>
> My advice is that if you want to learn Python, you just read a book about
> it or read online resources. Learning Python from assembler is kind of...
> strange.
>
> Not only are you skipping several generations of programming languages,
> spanning a period of 40 years, but the approach to programming in Python is
> so fundamentally different from assembler programming that there is simply
> no reason to start looking at it from this perspective.
>
> I truly hope you enjoy the world of high-level programming languages, but
> treat them as such. Looking at them in a low-level representation or from a
> low-level perspective doesn't bear much fruit.
>
> Kind regards,
>
> Wim
> 
> On 1/22/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> >
> > My expertise, if any, is in assembler. I'm trying to understand Python
> > scripts and modules by examining them after they have been
> > disassembled in a Windows environment.
> >
> > I'm wondering if a Python symbols file is available. In the Windows
> > environment, a symbol file normally has a PDB extension. It's a little
> > unfortunate that Python also uses PDB for its debugger. Google, for
> > whatever reason, won't accept queries with dots, hyphens, etc., in the
> > query line. For example a Google for "python.pdb" returns +python
> > +pdb, so I get a ridiculous number of returns referring to the python
> > debugger. I have mentioned this to Google several times, but I guess
> > logic isn't one of their strong points.  :-)

--
"The new world calls for a new University"
UNIVERSIDAD DE LA HABANA
280th anniversary
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Submitting with PAMIE

2008-01-22 Thread Gabriel Genellina
On Tue, 22 Jan 2008 15:39:33 -0200, <[EMAIL PROTECTED]> wrote:

> Hi I really need help. I've been looking around for an answer forever.
> I need to submit a form with no name and also the submit button has no
> name or value. How might I go about doing either of these. Thanks

I think you'll have more luck in a specific forum like the PAMIE User  
Group at
http://tech.groups.yahoo.com/group/Pamie_UsersGroup/


-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: translating Python to Assembler

2008-01-22 Thread James Matthews
You kept finding the Python debugger when looking for PDB files
because pdb *is* the Python debugger! Also, why would you look
for a PDB file when you can read the C source?

On Jan 22, 2008 11:55 PM, Wim Vander Schelden <[EMAIL PROTECTED]> wrote:
> Python modules and scripts are normally not even compiled; if they have
> been, it's probably just the Python interpreter packaged with the scripts
> and resources.
>
> My advice is that if you want to learn Python, you just read a book about
> it or read online resources. Learning Python from assembler is kind of...
> strange.
>
> Not only are you skipping several generations of programming languages,
> spanning a period of 40 years, but the approach to programming in Python is
> so fundamentally different from assembler programming that there is simply
> no reason to start looking at it from this perspective.
>
> I truly hope you enjoy the world of high-level programming languages, but
> treat them as such. Looking at them in a low-level representation or from a
> low-level perspective doesn't bear much fruit.
>
> Kind regards,
>
> Wim
>
>
>
> On 1/22/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > My expertise, if any, is in assembler. I'm trying to understand Python
> > scripts and modules by examining them after they have been
> > disassembled in a Windows environment.
> >
> > I'm wondering if a Python symbols file is available. In the Windows
> > environment, a symbol file normally has a PDB extension. It's a little
> > unfortunate that Python also uses PDB for its debugger. Google, for
> > whatever reason, won't accept queries with dots, hyphens, etc., in the
> > query line. For example a Google for "python.pdb" returns +python
> > +pdb, so I get a ridiculous number of returns referring to the python
> > debugger. I have mentioned this to Google several times, but I guess
> > logic isn't one of their strong points.  :-)



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: HTML parsing confusion

2008-01-22 Thread Gabriel Genellina
On Tue, 22 Jan 2008 19:20:32 -0200, Alnilam <[EMAIL PROTECTED]> wrote:

> On Jan 22, 11:39 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
>> Alnilam wrote:
>> > On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote:
>> >> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2,
>> >> > -1)) doesn't have an xml.dom.ext ... you must have the  
>> mega-monstrous
>> >> > 200-modules PyXML package installed. And you don't want the 75Kb
>> >> > BeautifulSoup?
>> > Ugh. Found it. Sorry about that, but I still don't understand why
>> > there isn't a simple way to do this without using PyXML, BeautifulSoup
>> > or libxml2dom. What's the point in having sgmllib, htmllib,
>> > HTMLParser, and formatter all built in if I have to use someone
>> > else's modules to write a couple of lines of code that achieve the
>> > simple thing I want. I get the feeling that this would be easier if I
>> > just broke down and wrote a couple of regular expressions, but it
>> > hardly seems a 'pythonic' way of going about things.
>>
>> This is simply a gross misunderstanding of what BeautifulSoup or lxml
>> accomplish. Dealing with mal-formatted HTML whilst trying to make _some_
>> sense is by no means trivial. And just because you can come up with a
>> few lines of code using rexes that work for your current use-case doesn't
>> mean that they serve as general html-fixing-routine. Or do you think the
>> rather long history and 75Kb of code for BS are because its creator
>> wasn't aware of rexes?
>
> I am, by no means, trying to trivialize the work that goes into
> creating the numerous modules out there. However as a relatively
> novice programmer trying to figure out something, the fact that these
> modules are pushed on people with such zealous devotion that you take
> offense at my desire to not use them gives me a bit of pause. I use
> non-included modules for tasks that require them, when the capability
> to do something clearly can't be done easily another way (eg.
> MySQLdb). I am sure that there will be plenty of times where I will
> use BeautifulSoup. In this instance, however, I was trying to solve a
> specific problem which I attempted to lay out clearly from the
> outset.
>
> I was asking this community if there was a simple way to use only the
> tools included with Python to parse a bit of html.

If you *know* that your document is valid HTML, you can use the HTMLParser  
module in the standard Python library. Or even the parser in the htmllib  
module. But a lot of HTML pages out there are invalid, some are grossly  
invalid, and those parsers are just unable to handle them. This is why  
modules like BeautifulSoup exist: they contain a lot of heuristics and  
trial-and-error and personal experience from the developers, in order to  
guess more or less what the page author intended to write and make some  
sense of that "tag soup".
A guesswork like that is not suitable for the std lib ("Errors should  
never pass silently" and "In the face of ambiguity, refuse the temptation  
to guess.") but makes a perfect 3rd party module.

If you want to use regular expressions, and that works OK for the  
documents you are handling now, fine. But don't complain when your RE's  
match too much or too little or don't match at all because of unclosed  
tags, improperly nested tags, nonsense markup, or just a valid combination  
that you didn't take into account.
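
A tiny example of both failure modes, in Python 3 syntax. The pattern and the sample markup are made up for illustration: the pattern misses a paragraph that carries an attribute (too little) and swallows a second paragraph when a closing tag is omitted (too much).

```python
import re

# A "good enough for my files" pattern for paragraph contents.
PARAGRAPH_RE = re.compile(r"<p>(.*?)</p>", re.DOTALL)

html = (
    "<p>first</p>"
    '<p class="note">second</p>'   # has an attribute, so "<p>" never matches
    "<p>third<p>fourth</p>"        # closing tag omitted after "third"
)

found = PARAGRAPH_RE.findall(html)
print(found)
```

The result contains "first" and a single merged entry for the last two paragraphs, while "second" is missing altogether.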

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: UDP Client/Server

2008-01-22 Thread Guilherme Polo
2008/1/22, Martin Marcher <[EMAIL PROTECTED]>:
> Hello,
>
> I created a really simple udp server and protocol but I only get every 2nd
> request (and thus answer just every second request).
>
> Maybe someone could shed some light, I'm lost in the dark(tm), sorry if this
> is a bit oververbose but to me everything that happens here is black magic,
> and I have no clue where the packages go. I can't think of a simpler
> protocol than to just receive a fixed max UDP packet size and answer
> immediately (read an "echo" server).
>
> thanks
> martin
>
>
> ### server
> >>> from socket import *
> >>> import SocketServer
> >>> from SocketServer import BaseRequestHandler, UDPServer
> >>> class FooReceiveServer(SocketServer.UDPServer):
> ...     def __init__(self):
> ...         SocketServer.UDPServer.__init__(self, ("localhost", 4321),
> ...                                         FooRequestHandler)
> ...
> >>> class FooRequestHandler(BaseRequestHandler):
> ...     def handle(self):
> ...         data, addr_info = self.request[1].recvfrom(65534)

Your FooReceiveServer subclasses UDPServer, it already handled the
recvfrom for you, so, this is wrong.
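
To spell that out: for a UDP server the datagram has already been read when handle() is called, self.request is a (data, socket) pair, and self.client_address is the sender. Calling recvfrom again inside the handler waits for the next datagram, which would explain seeing only every second request. A sketch of a corrected echo-style handler follows, written for Python 3 (where the module is named socketserver); the names EchoHandler and demo, and the use of an ephemeral port, are assumptions for this example.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # The server has already received the datagram for us.
        data, sock = self.request
        sock.sendto(b"response to " + data, self.client_address)


def demo(count=3):
    """Start a server on a free local port, send `count` requests to it."""
    with socketserver.UDPServer(("127.0.0.1", 0), EchoHandler) as server:
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        client.settimeout(2.0)  # never hang forever on a lost packet
        replies = []
        try:
            for i in range(count):
                client.sendto(b"request %d" % i, server.server_address)
                replies.append(client.recv(65535))
        finally:
            client.close()
            server.shutdown()
    return replies
```

The client-side timeout is there because UDP gives no delivery guarantee; without it a single lost packet blocks recv() indefinitely.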

> ...         print data
> ...         print addr_info
> ...         self.request[1].sendto("response", addr_info)
> ...
> >>> f = FooReceiveServer()
> >>> f.serve_forever()
> request 0
> ('127.0.0.1', 32884)
> request 1
> ('127.0.0.1', 32884)
> request 2
> ('127.0.0.1', 32884)
> request 2
> ('127.0.0.1', 32884)
> request 2
> ('127.0.0.1', 32884)
>
>
>
> ### client
> >>> target = ('127.0.0.1', 4321)
> >>> from socket import *
> >>> s = socket(AF_INET, SOCK_DGRAM)
> >>> for i in range(10):
> ...     s.sendto("request " + str(i), target)
> ...     s.recv(65534)
> ...
> 9
> Traceback (most recent call last):
>   File "", line 3, in 
> KeyboardInterrupt
> >>> s.sendto("request " + str(i), target)
> 9
> >>> str(i)
> '0'
> >>> for i in range(10):
> ...     s.sendto("request " + str(i), target)
> ...     s.recv(65534)
> ...
> 9
> 'response'
> 9
> 'response'
> 9
> Traceback (most recent call last):
>   File "", line 3, in 
> KeyboardInterrupt
> >>> #this was hanging, why?
> ...
> >>> s.sendto("request " + str(i), target)
> 9
> >>> s.recv(65534)
> 'response'
> >>> s.sendto("request " + str(i), target)
> 9
> >>> s.recv(65534)
> Traceback (most recent call last):
>   File "", line 1, in 
> KeyboardInterrupt
> >>> s.sendto("request " + str(i), target)
> 9
> >>> s.sendto("request " + str(i), target)
> 9
> >>> s.recv(65534)
> 'response'
> >>> s.recv(65534)
> Traceback (most recent call last):
>   File "", line 1, in 
> KeyboardInterrupt
> >>> s.sendto("request " + str(i), target)
> 9
> >>>
>
> --
> http://noneisyours.marcher.name
> http://feeds.feedburner.com/NoneIsYours
>
> You are not free to read this message,
> by doing so, you have violated my licence
> and are required to urinate publicly. Thank you.
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>


-- 
-- Guilherme H. Polo Goncalves
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Processing XML that's embedded in HTML

2008-01-22 Thread Paul McGuire
On Jan 22, 10:57 am, Mike Driscoll <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I need to parse a fairly complex HTML page that has XML embedded in
> it. I've done parsing before with the xml.dom.minidom module on just
> plain XML, but I cannot get it to work with this HTML page.
>
> The XML looks like this:
>
...

Once again (this IS HTML Day!), instead of parsing the HTML, pyparsing
can help lift the interesting bits and leave the rest alone.  Try this
program out:

from pyparsing import (makeXMLTags, Word, nums, Combine, oneOf, SkipTo,
                       withAttribute)

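htmlWithEmbeddedXml = """



Hey! this is really bold!

<Row>
<Relationship>Owner</Relationship>
<Priority>1</Priority>
<StartDate>07/16/2007</StartDate>
<StopsExist>No</StopsExist>
<Name>Doe, John</Name>
<Address>1905 S 3rd Ave , Hicksville IA 9</Address>
</Row>

<Row>
<Relationship>Owner</Relationship>
<Priority>2</Priority>
<StartDate>07/16/2007</StartDate>
<StopsExist>No</StopsExist>
<Name>Doe, Jane</Name>
<Address>1905 S 3rd Ave , Hicksville IA 9</Address>
</Row>


this is in a table, woo-hoo!
more HTML
blah blah blah...
"""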

# define pyparsing expressions for XML tags
rowStart,rowEnd   = makeXMLTags("Row")
relationshipStart,relationshipEnd = makeXMLTags("Relationship")
priorityStart,priorityEnd = makeXMLTags("Priority")
startDateStart,startDateEnd   = makeXMLTags("StartDate")
stopsExistStart,stopsExistEnd = makeXMLTags("StopsExist")
nameStart,nameEnd = makeXMLTags("Name")
addressStart,addressEnd   = makeXMLTags("Address")

# define some useful expressions for data of specific types
integer = Word(nums)
date = Combine(Word(nums,exact=2)+"/"+
Word(nums,exact=2)+"/"+Word(nums,exact=4))
yesOrNo = oneOf("Yes No")

# conversion parse actions
integer.setParseAction(lambda t: int(t[0]))
yesOrNo.setParseAction(lambda t: t[0]=='Yes')
# could also define a conversion for date if you really wanted to

# define format of a <Row>, plus assign results names for each data field
rowRec = rowStart + \
    relationshipStart + SkipTo(relationshipEnd)("relationship") + \
    relationshipEnd + \
    priorityStart + integer("priority") + priorityEnd + \
    startDateStart + date("startdate") + startDateEnd + \
    stopsExistStart + yesOrNo("stopsexist") + stopsExistEnd + \
    nameStart + SkipTo(nameEnd)("name") + nameEnd + \
    addressStart + SkipTo(addressEnd)("address") + addressEnd + \
    rowEnd

# set filtering parse action
rowRec.setParseAction(withAttribute(relationship="Owner",priority=1))

# find all matching rows, matching grammar and filtering parse action
rows = rowRec.searchString(htmlWithEmbeddedXml)

# print the results (uncomment r.dump() statement to see full
# result for each row)
for r in rows:
    # print r.dump()
    print r.relationship
    print r.priority
    print r.startdate
    print r.stopsexist
    print r.name
    print r.address


This prints:
Owner
1
07/16/2007
False
Doe, John
1905 S 3rd Ave , Hicksville IA 9

In addition to parsing this data, some conversions were done at parse
time, too - "1" was converted to the value 1, and "No" was converted
to False.  These were done by the conversion parse actions.  The
filtering just for Row's containing Relationship="Owner" and
Priority=1 was done in a more global parse action, called
withAttribute.  If you comment this line out, you will see that both
rows get retrieved.

-- Paul
(Find out more about pyparsing at http://pyparsing.wikispaces.com.)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Just for fun: Countdown numbers game solver

2008-01-22 Thread dg . google . groups
Arnaud and Terry,

Great solutions both of you! Much nicer than mine. I particularly like
Arnaud's latest one based on folding because it's so neat and
conceptually simple. For me, it's the closest so far to my goal of the
most elegant solution.

So anyone got an answer to which set of numbers gives the most targets
from 100 onwards say (or from 0 onwards)? Is Python up to the task?

A thought on that last one. Two ways to improve speed. First of all,
you don't need to rerun from scratch for each target. Secondly, you
can try multiple different sets of numbers at the same time by passing
numpy arrays instead of single values (although you have to give up
the commutativity and division by zero optimisations).
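
The first speed-up above (not rerunning from scratch for each target) could be sketched roughly as follows. This is only an illustration in present-day Python 3, not the solver posted by Arnaud or Terry; the name `reachable` and the sample numbers are made up for the example, and nothing here is optimised. It collects, in one pass, every value that some subset of the numbers can produce, so counting targets from 100 onwards becomes a simple filter over the result.

```python
from itertools import combinations


def reachable(numbers):
    """Return every positive integer some subset of `numbers` can produce."""
    seen_pools = set()   # pools already expanded, to avoid repeated work
    results = set()

    def explore(pool):
        # pool is a sorted tuple of the values still available
        if pool in seen_pools:
            return
        seen_pools.add(pool)
        results.update(pool)
        for i, j in combinations(range(len(pool)), 2):
            a, b = pool[i], pool[j]
            rest = pool[:i] + pool[i + 1:j] + pool[j + 1:]
            high, low = max(a, b), min(a, b)
            candidates = {a + b, a * b}
            if high > low:
                candidates.add(high - low)      # keep values positive
            if high % low == 0:
                candidates.add(high // low)     # exact division only
            for value in candidates:
                explore(tuple(sorted(rest + (value,))))

    explore(tuple(sorted(numbers)))
    return results


# One pass answers every target question for this set of numbers.
targets_from_100 = sorted(t for t in reachable((25, 3, 4)) if t >= 100)
```

Because the result is a plain set, comparing different sets of numbers is just a matter of comparing `len()` of the filtered results.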

Dan Goodman
-- 
http://mail.python.org/mailman/listinfo/python-list


UDP Client/Server

2008-01-22 Thread Martin Marcher
Hello,

I created a really simple udp server and protocol but I only get every 2nd
request (and thus answer just every second request).

Maybe someone could shed some light, I'm lost in the dark(tm). Sorry if this
is a bit overly verbose, but to me everything that happens here is black magic,
and I have no clue where the packets go. I can't think of a simpler
protocol than to just receive a fixed max UDP packet size and answer
immediately (read an "echo" server).

thanks
martin


### server
>>> from socket import *
>>> import SocketServer
>>> from SocketServer import BaseRequestHandler, UDPServer
>>> class FooReceiveServer(SocketServer.UDPServer):
...     def __init__(self):
...         SocketServer.UDPServer.__init__(self, ("localhost", 4321),
...                                         FooRequestHandler)
...
>>> class FooRequestHandler(BaseRequestHandler):
...     def handle(self):
...         data, addr_info = self.request[1].recvfrom(65534)
...         print data
...         print addr_info
...         self.request[1].sendto("response", addr_info)
...
>>> f = FooReceiveServer()
>>> f.serve_forever()
request 0
('127.0.0.1', 32884)
request 1
('127.0.0.1', 32884)
request 2
('127.0.0.1', 32884)
request 2
('127.0.0.1', 32884)
request 2
('127.0.0.1', 32884)



### client
>>> target = ('127.0.0.1', 4321)
>>> from socket import *
>>> s = socket(AF_INET, SOCK_DGRAM)
>>> for i in range(10):
...     s.sendto("request " + str(i), target)
...     s.recv(65534)
...
9
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
KeyboardInterrupt
>>> s.sendto("request " + str(i), target)
9
>>> str(i)
'0'
>>> for i in range(10):
...     s.sendto("request " + str(i), target)
...     s.recv(65534)
...
9
'response'
9
'response'
9
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
KeyboardInterrupt
>>> #this was hanging, why?
...
>>> s.sendto("request " + str(i), target)
9
>>> s.recv(65534)
'response'
>>> s.sendto("request " + str(i), target)
9
>>> s.recv(65534)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt
>>> s.sendto("request " + str(i), target)
9
>>> s.sendto("request " + str(i), target)
9
>>> s.recv(65534)
'response'
>>> s.recv(65534)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt
>>> s.sendto("request " + str(i), target)
9
>>>
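
One likely reading of the transcripts above, offered as a sketch rather than a confirmed diagnosis: for a UDP server, the SocketServer module hands the handler `self.request` as a `(data, socket)` pair, so the datagram has already been read by the time `handle()` runs. Calling `recvfrom()` again inside `handle()` waits for, and then consumes, the next datagram, which would explain why only every second request is answered. The version below uses the Python 3 module name `socketserver`; the names `EchoHandler` and `roundtrip`, and the use of port 0 to get a free port, are assumptions made for the example.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP, self.request is (data, socket): the datagram is already read,
        # so there is no second recvfrom() call here.
        data, sock = self.request
        sock.sendto(b"response to " + data, self.client_address)


def roundtrip(messages):
    """Send each message to a throwaway local server and collect the replies."""
    with socketserver.UDPServer(("127.0.0.1", 0), EchoHandler) as server:
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        replies = []
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
            client.settimeout(2.0)  # fail instead of hanging if a reply is lost
            for message in messages:
                client.sendto(message, server.server_address)
                replies.append(client.recv(65535))
        server.shutdown()
    return replies
```

With a handler of this shape, each `sendto()` from the client should be matched by exactly one reply, so the send/receive loop no longer stalls on every other iteration.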

-- 
http://noneisyours.marcher.name
http://feeds.feedburner.com/NoneIsYours

You are not free to read this message,
by doing so, you have violated my licence
and are required to urinate publicly. Thank you.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: translating Python to Assembler

2008-01-22 Thread Wim Vander Schelden
Python modules and scripts are normally not even compiled; if they have
been, it's probably just the Python interpreter packaged with the scripts
and resources.

My advice, if you want to learn Python, is that you just read a book about
it or read online resources. Learning Python from assembler is kind of...
strange.

Not only are you skipping several generations of programming languages,
spanning a period of 40 years, but the approach to programming in Python is
so fundamentally different from assembler programming that there is simply
no reason to start looking at it from this perspective.

I truly hope you enjoy the world of high-end programming languages, but
treat them as such. Looking at them in a low-level representation or from a
low-level perspective doesn't bear much fruit.

Kind regards,

Wim

On 1/22/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> My expertise, if any, is in assembler. I'm trying to understand Python
> scripts and modules by examining them after they have been
> disassembled in a Windows environment.
>
> I'm wondering if a Python symbols file is available. In the Windows
> environment, a symbol file normally has a PDB extension. It's a little
> unfortunate that Python also uses PDB for its debugger. Google, for
> whatever reason, wont accept queries with dots, hyphens, etc., in the
> query line. For example a Google for "python.pdb" returns +python
> +pdb, so I get a ridiculous number of returns referring to the python
> debugger. I have mentioned this to Google several times, but I guess
> logic isn't one of their strong points.  :-)
> --
> http://mail.python.org/mailman/listinfo/python-list
>
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: translating Python to Assembler

2008-01-22 Thread John Machin
On Jan 23, 9:24 am, [EMAIL PROTECTED] wrote:
> My expertise, if any, is in assembler. I'm trying to understand Python
> scripts and modules by examining them after they have been
> disassembled in a Windows environment.
>

DB "Wrong way. Go back. Read the tutorials."
RET
-- 
http://mail.python.org/mailman/listinfo/python-list


translating Python to Assembler...sorry if this is duplicated...it's unintentional

2008-01-22 Thread over
My expertise, if any, is in assembler. I'm trying to understand Python
scripts and modules by examining them after they have been
disassembled in a Windows environment.

I'm wondering if a Python symbols file is available. In the Windows
environment, a symbol file normally has a PDB extension. It's a little
unfortunate that Python also uses PDB for its debugger. Google, for
whatever reason, wont accept queries with dots, hyphens, etc., in the
query line. For example a Google for "python.pdb" returns +python
+pdb, so I get a ridiculous number of returns referring to the python
debugger. I have mentioned this to Google several times, but I guess
logic isn't one of their strong points.  :-)

If there are duplicates of this post, it's because it wouldn't send for
some reason.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Processing XML that's embedded in HTML

2008-01-22 Thread Paul Boddie
On 22 Jan, 21:48, Mike Driscoll <[EMAIL PROTECTED]> wrote:
> On Jan 22, 11:32 am, Paul Boddie <[EMAIL PROTECTED]> wrote:
>
> > [1]http://www.python.org/pypi/libxml2dom
>
> I must have tried this module quite a while ago since I already have
> it installed. I see you're the author of the module, so you can
> probably tell me what's what. When I do the above, I get an empty list
> either way. See my code below:
>
> import libxml2dom
> d = libxml2dom.parse(filename, html=1)
> rows = d.xpath('//[EMAIL 
> PROTECTED]"grdRegistrationInquiryCustomers"]/BoundData/
> Row')
> # rows = d.xpath("//XML/BoundData/Row")
> print rows

It may be namespace-related, although parsing as HTML shouldn't impose
namespaces on the document, unlike parsing XHTML, say. One thing you
can try is to start with a simpler query and to expand it. Start with
the expression "//XML" and add things to make the results more
specific. Generally, namespaces can make XPath queries awkward because
you have to qualify the element names and define the namespaces for
each of the prefixes used.
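
To illustrate the namespace point, here is a minimal sketch using the standard library's `xml.etree.ElementTree` rather than libxml2dom, on a made-up, well-formed document. The element names follow the query in this thread, but the sample document and the prefix `x` are assumptions for the example. The unqualified query finds nothing because every element inherits the default namespace, while the qualified one matches.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML-style document: the default namespace applies to
# every element, including the embedded XML island.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body><XML><BoundData><Row>first</Row><Row>second</Row></BoundData></XML></body>
</html>"""

root = ET.fromstring(DOC)

# Without a namespace, the element names do not match anything.
unqualified = root.findall(".//Row")

# With a prefix bound to the namespace URI, the same path matches.
namespaces = {"x": "http://www.w3.org/1999/xhtml"}
qualified = root.findall(".//x:BoundData/x:Row", namespaces)
row_texts = [row.text for row in qualified]
```

Starting from a short expression and adding one step at a time, as suggested above, makes it easier to see at which step the result list becomes empty.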

Let me know how you get on!

Paul
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: difflib confusion

2008-01-22 Thread Paul Hankin
On Jan 22, 6:57 pm, "krishnakant Mane" <[EMAIL PROTECTED]> wrote:
> hello all,
> I have a bit of a confusing question.
> firstly I wanted a library which can do an svn like diff with two files.
> let's say I have file1 and file2 where file2 contains some thing which
> file1 does not have.  now if I do readlines() on both the files, I
> have a list of all the lines.
> I now want to do a diff and find out which word is added or deleted or 
> changed.
> and that too on which character, if not at least want to know the word
> that has the change.
> any ideas please?

Have a look at difflib in the standard library.
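
As a rough sketch of what that could look like for word-level changes (written for Python 3; the function name `word_changes` and the sample sentences are made up for the example), `difflib.SequenceMatcher` can compare two lists of words and report which runs were replaced, inserted or deleted:

```python
import difflib


def word_changes(old_line, new_line):
    """Return (tag, old_words, new_words) for every non-matching run of words."""
    old_words = old_line.split()
    new_words = new_line.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is one of 'replace', 'delete' or 'insert'
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


changes = word_changes("the cat sat on the mat", "the cat sat on a mat")
```

The same matcher works on plain strings instead of word lists if character positions are needed, and on the lists returned by readlines() for a line-level diff.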

--
Paul Hankin
-- 
http://mail.python.org/mailman/listinfo/python-list


translating Python to Assembler

2008-01-22 Thread over
My expertise, if any, is in assembler. I'm trying to understand Python
scripts and modules by examining them after they have been
disassembled in a Windows environment.

I'm wondering if a Python symbols file is available. In the Windows
environment, a symbol file normally has a PDB extension. It's a little
unfortunate that Python also uses PDB for its debugger. Google, for
whatever reason, wont accept queries with dots, hyphens, etc., in the
query line. For example a Google for "python.pdb" returns +python
+pdb, so I get a ridiculous number of returns referring to the python
debugger. I have mentioned this to Google several times, but I guess
logic isn't one of their strong points.  :-)
-- 
http://mail.python.org/mailman/listinfo/python-list


Anyone into Paimei?? Need some help.

2008-01-22 Thread over
Hi. I have been trying to get Paimei running on Windoze but find it
very inconsistent. It works on certain apps really well, like Notepad,
but fails on other apps, especially those written in languages like
Delphi. There isn't a lot out there on Paimei and the author's site is
very terse on the app.
-- 
http://mail.python.org/mailman/listinfo/python-list


translating Python to Assembler

2008-01-22 Thread over
My expertise, if any, is in assembler. I'm trying to understand Python
scripts and modules by examining them after they have been
disassembled in a Windows environment.

I'm wondering if a Python symbols file is available. In the Windows
environment, a symbol file normally has a PDB extension. It's a little
unfortunate that Python also uses PDB for its debugger. Google, for
whatever reason, wont accept queries with dots, hyphens, etc., in the
query line. For example a Google for "python.pdb" returns +python
+pdb, so I get a ridiculous number of returns referring to the python
debugger. I have mentioned this to Google several times, but I guess
logic isn't one of their strong points.  :-)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: get the size of a dynamically changing file fast ?

2008-01-22 Thread Stef Mientki
Mike Driscoll wrote:
> On Jan 22, 3:35 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
>   
>> Mike Driscoll wrote:
>> 
>>> On Jan 17, 3:56 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
>>>   
>>>> hello,
>>>>
>>>> I've a program (not written in Python) that generates a few thousands
>>>> bytes per second,
>>>> these files are dumped in 2 buffers (files), at in interval time of 50
>>>> msec,
>>>> the files can be read by another program, to do further processing.
>>>>
>>>> A program written in VB or delphi can handle the data in the 2 buffers
>>>> perfectly.
>>>> Sometimes Python is also able to process the data correctly,
>>>> but often it can't :-(
>>>>
>>>> I keep one of the files open en test the size of the open datafile each
>>>> 50 msec.
>>>> I have tried
>>>> os.stat ( ) [ ST_SIZE]
>>>> os.path.getsize ( ... )
>>>> but they both have the same behaviour, sometimes it works, and the data
>>>> is collected each 50 .. 100 msec,
>>>> sometimes 1 .. 1.5 seconds is needed to detect a change in filesize.
>>>>
>>>> I'm using python 2.4 on winXP.
>>>>
>>>> Is there a solution for this problem ?
>>>>
>>>> thanks,
>>>> Stef Mientki
>>>>
>>> Tim Golden has a method to watch for changes in a directory on his
>>> website:
>>>   
>>> http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_fo...
>>>   
>>> This old post also mentions something similar:
>>>   
>>> http://mail.python.org/pipermail/python-list/2007-October/463065.html
>>>   
>>> And here's a cookbook recipe that claims to do it as well using
>>> decorators:
>>>   
>>> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/426620
>>>   
>>> Hopefully that will get you going.
>>>   
>>> Mike
>>>   
>> thanks Mike,
>> sorry for the late reaction.
>> I've it working perfect now.
>> After all, os.stat works perfectly well,
>> the problem was in the program that generated the file with increasing
>> size,
>> by truncating it after each block write, it apperently garantees that
>> the file is flushed to disk and all problems are solved.
>>
>> cheers,
>> Stef Mientki
>> 
>
> I almost asked if you were making sure you had flushed the data to the
> file...oh well.
>   
Yes, that's a small disadvantage of using a "high-level" language,
where there's no flush available, and you assume it'll be done
automatically ;-)

cheers,
Stef

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Hebrew in idle ans eclipse (Windows)

2008-01-22 Thread iu2
On Jan 17, 10:35 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> ...
>   print lines[0].decode("").encode("")
> ...
> Regards,
> Martin

Ok, I've got the solution, but I still have a question.

Recall:
When I read data using sql I got a sequence like this:
\x88\x89\x85
But when I entered Hebrew words directly in the print statement (or as
a dictionary key)
I got this:
\xe8\xe9\xe5

Now, scanning the encoding module I discovered that cp1255 maps
'\u05d9' to \xe9
while cp856 maps '\u05d9' to \x89,
so transforming \x88\x89\x85 to \xe8\xe9\xe5 is done by

s.decode('cp856').encode('cp1255')

ending up with the pattern you suggested.
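
For reference, the same transformation as a small self-contained sketch in Python 3, where byte strings and text are separate types (the helper name `recode` is made up for the example):

```python
def recode(raw, source="cp856", target="cp1255"):
    """Re-encode bytes from one code page to another by going through Unicode."""
    return raw.decode(source).encode(target)


from_sql = b"\x88\x89\x85"           # the bytes that came back from the query
as_text = from_sql.decode("cp856")   # three Hebrew letters as Unicode text
converted = recode(from_sql)         # the same letters encoded as cp1255
```

This only restates the decode/encode pattern; it does not guess which code pages are involved, so both names still have to be known in advance.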

My question is, is there a way I can deduce cp856 and cp1255 from the
string itself? Is there a function doing it? (making the
transformation more robust)

I don't know how IDLE guessed cp856, but it must have done it.
(perhaps because it uses tcl, and maybe tcl guesses the encoding
automatically?)

thanks
iu2



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Processing XML that's embedded in HTML

2008-01-22 Thread John Machin
On Jan 23, 7:48 am, Mike Driscoll <[EMAIL PROTECTED]> wrote:
[snip]

> I'm not sure what is wrong here...but I got lxml to create a tree from
> by doing the following:
>
> 
> from lxml import etree
> from StringIO import StringIO
>
> parser = etree.HTMLParser()
> tree = etree.parse(filename, parser)
> xml_string = etree.tostring(tree)
> context = etree.iterparse(StringIO(xml_string))
> 
>
> However, when I iterate over the contents of "context", I can't figure
> out how to nab the row's contents:
>
> for action, elem in context:
>     if action == 'end' and elem.tag == 'relationship':
>         # do something...but what!?
>         # this if statement probably isn't even right
>

lxml allegedly supports the ElementTree interface so I would expect
elem.text to refer to the contents. Sure enough:
http://codespeak.net/lxml/tutorial.html#elements-contain-text

Why do you want/need to use the iterparse technique on the 2nd pass
instead of creating another tree and then using getiterator?
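
A minimal sketch of the tree-based approach, using the standard library's `xml.etree.ElementTree` on a made-up, well-formed fragment (lxml is not used here; the element names are taken from earlier in this thread, and the sample data and the function name `owner_rows` are assumptions for the example):

```python
import xml.etree.ElementTree as ET

SAMPLE = """<BoundData>
  <Row><Relationship>Owner</Relationship><Priority>1</Priority><Name>Doe, John</Name></Row>
  <Row><Relationship>Owner</Relationship><Priority>2</Priority><Name>Doe, Jane</Name></Row>
</BoundData>"""


def owner_rows(xml_text):
    """Return the fields of each Row with Relationship 'Owner' and Priority '1'."""
    root = ET.fromstring(xml_text)
    matches = []
    for row in root.iter("Row"):
        # findtext() returns the .text of the first matching child element
        if row.findtext("Relationship") == "Owner" and row.findtext("Priority") == "1":
            matches.append({child.tag: child.text for child in row})
    return matches


rows = owner_rows(SAMPLE)
```

Note that tag names are case-sensitive in this interface, so a comparison against 'relationship' would not match an element named Relationship in a document parsed as XML.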
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: get the size of a dynamically changing file fast ?

2008-01-22 Thread Mike Driscoll
On Jan 22, 3:35 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
> Mike Driscoll wrote:
> > On Jan 17, 3:56 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
>
> >> hello,
>
> >> I've a program (not written in Python) that generates a few thousands
> >> bytes per second,
> >> these files are dumped in 2 buffers (files), at in interval time of 50 
> >> msec,
> >> the files can be read by another program, to do further processing.
>
> >> A program written in VB or delphi can handle the data in the 2 buffers
> >> perfectly.
> >> Sometimes Python is also able to process the data correctly,
> >> but often it can't :-(
>
> >> I keep one of the files open en test the size of the open datafile each
> >> 50 msec.
> >> I have tried
> >> os.stat ( ) [ ST_SIZE]
> >> os.path.getsize ( ... )
> >> but they both have the same behaviour, sometimes it works, and the data
> >> is collected each 50 .. 100 msec,
> >> sometimes 1 .. 1.5 seconds is needed to detect a change in filesize.
>
> >> I'm using python 2.4 on winXP.
>
> >> Is there a solution for this problem ?
>
> >> thanks,
> >> Stef Mientki
>
> > Tim Golden has a method to watch for changes in a directory on his
> > website:
>
> >http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_fo...
>
> > This old post also mentions something similar:
>
> >http://mail.python.org/pipermail/python-list/2007-October/463065.html
>
> > And here's a cookbook recipe that claims to do it as well using
> > decorators:
>
> >http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/426620
>
> > Hopefully that will get you going.
>
> > Mike
>
> thanks Mike,
> sorry for the late reaction.
> I've it working perfect now.
> After all, os.stat works perfectly well,
> the problem was in the program that generated the file with increasing
> size,
> by truncating it after each block write, it apperently garantees that
> the file is flushed to disk and all problems are solved.
>
> cheers,
> Stef Mientki

I almost asked if you were making sure you had flushed the data to the
file...oh well.

Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: get the size of a dynamically changing file fast ?

2008-01-22 Thread Stef Mientki
Mike Driscoll wrote:
> On Jan 17, 3:56 pm, Stef Mientki <[EMAIL PROTECTED]> wrote:
>   
>> hello,
>>
>> I've a program (not written in Python) that generates a few thousands
>> bytes per second,
>> these files are dumped in 2 buffers (files), at in interval time of 50 msec,
>> the files can be read by another program, to do further processing.
>>
>> A program written in VB or delphi can handle the data in the 2 buffers
>> perfectly.
>> Sometimes Python is also able to process the data correctly,
>> but often it can't :-(
>>
>> I keep one of the files open en test the size of the open datafile each
>> 50 msec.
>> I have tried
>> os.stat ( ) [ ST_SIZE]
>> os.path.getsize ( ... )
>> but they both have the same behaviour, sometimes it works, and the data
>> is collected each 50 .. 100 msec,
>> sometimes 1 .. 1.5 seconds is needed to detect a change in filesize.
>>
>> I'm using python 2.4 on winXP.
>>
>> Is there a solution for this problem ?
>>
>> thanks,
>> Stef Mientki
>> 
>
> Tim Golden has a method to watch for changes in a directory on his
> website:
>
> http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_for_changes.html
>
> This old post also mentions something similar:
>
> http://mail.python.org/pipermail/python-list/2007-October/463065.html
>
> And here's a cookbook recipe that claims to do it as well using
> decorators:
>
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/426620
>
> Hopefully that will get you going.
>
> Mike
>   
thanks Mike,
sorry for the late reaction.
I've it working perfectly now.
After all, os.stat works perfectly well;
the problem was in the program that generated the file with increasing
size.
Truncating it after each block write apparently guarantees that
the file is flushed to disk, and all problems are solved.
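
The generating program here is not written in Python, so the following is only a sketch of what the writer side could look like if it were: flushing after each block makes the new size visible to a reader that polls with os.stat or os.path.getsize. The function name `append_block` and the temporary file are made up for the example.

```python
import os
import tempfile


def append_block(handle, block):
    """Write one block, force it out to disk, and return the file's size."""
    handle.write(block)
    handle.flush()              # push Python's own buffer to the operating system
    os.fsync(handle.fileno())   # ask the OS to write its buffers to disk
    return os.path.getsize(handle.name)


with tempfile.NamedTemporaryFile() as writer:
    sizes = [append_block(writer, b"x" * 100), append_block(writer, b"y" * 50)]
```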

cheers,
Stef Mientki


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: HTML parsing confusion

2008-01-22 Thread Alnilam
On Jan 22, 11:39 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> Alnilam wrote:
> > On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote:
> >> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2,
> >> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous
> >> > 200-modules PyXML package installed. And you don't want the 75Kb
> >> > BeautifulSoup?
>
> >> I wasn't aware that I had PyXML installed, and can't find a reference
> >> to having it installed in pydocs. ...
>
> > Ugh. Found it. Sorry about that, but I still don't understand why
> > there isn't a simple way to do this without using PyXML, BeautifulSoup
> > or libxml2dom. What's the point in having sgmllib, htmllib,
> > HTMLParser, and formatter all built in if I have to use use someone
> > else's modules to write a couple of lines of code that achieve the
> > simple thing I want. I get the feeling that this would be easier if I
> > just broke down and wrote a couple of regular expressions, but it
> > hardly seems a 'pythonic' way of going about things.
>
> This is simply a gross misunderstanding of what BeautifulSoup or lxml
> accomplish. Dealing with mal-formatted HTML whilst trying to make _some_
> sense is by no means trivial. And just because you can come up with a few
> lines of code using rexes that work for your current use-case doesn't mean
> that they serve as general html-fixing-routine. Or do you think the rather
> long history and 75Kb of code for BS are because it's creator wasn't aware
> of rexes?
>
> And it also makes no sense stuffing everything remotely useful into the
> standard lib. This would force to align development and release cycles,
> resulting in much less features and stability as it can be wished.
>
> And to be honest: I fail to see where your problem is. BeatifulSoup is a
> single Python file. So whatever you carry with you from machine to machine,
> if it's capable of holding a file of your own code, you can simply put
> BeautifulSoup beside it - even if it was a floppy  disk.
>
> Diez


I am, by no means, trying to trivialize the work that goes into
creating the numerous modules out there. However as a relatively
novice programmer trying to figure out something, the fact that these
modules are pushed on people with such zealous devotion that you take
offense at my desire to not use them gives me a bit of pause. I use
non-included modules for tasks that require them, when the capability
to do something clearly can't be done easily another way (eg.
MySQLdb). I am sure that there will be plenty of times where I will
use BeautifulSoup. In this instance, however, I was trying to solve a
specific problem which I attempted to lay out clearly from the
outset.

I was asking this community if there was a simple way to use only the
tools included with Python to parse a bit of html.

If the answer is no, that's fine. Confusing, but fine. If the answer
is yes, great. I look forward to learning from someone's example. If
you don't have an answer, or a positive contribution, then please
don't interject your angst into this thread.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bug in __init__?

2008-01-22 Thread Bart Ogryczak
On 2008-01-22, citizen Bruno Desthuilliers testified:
>>  from copy import copy
>>  ### see also deepcopy
>>  self.lst = copy(val)
>
> What makes you think the OP wants a copy ?

I'm guessing he doesn't want to mutate original list, while changing
contents of self.lst.

bart
-- 
"chłopcy dali z siebie wszystko, z czego tv pokazała głównie bebechy"
http://candajon.azorragarse.info/ http://azorragarse.candajon.info/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: question

2008-01-22 Thread Wildemar Wildenburger
Hi there :)

A little tip upfront: In the future you might want to come up with a 
more descriptive subject line. This will help readers decide early if 
they can possibly help or not.

[EMAIL PROTECTED] wrote:
> def albumInfo(theBand):
>     def Rush():
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
>                 'A Farewell to Kings', 'Hemispheres']
>
>     def Enchant():
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>
>     ...
>
Yuck! ;)


> The only problem with the code above though is that I don't know how to call 
> it, especially since if the user is entering a string, how would I convert 
> that string into a function name?
While this is relatively easy, it is *waaayyy* too complicated an 
approach here, because . . .


> def albumInfo(theBand):
>     if theBand == 'Rush':
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
>                 'A Farewell to Kings', 'Hemispheres']
>     elif theBand == 'Enchant':
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>     ...
>
. . . this is a lot more fitting for this problem.

You could also have used a dictionary here, but the above is better if 
you have a lot of lists, because only the one you use is created (I 
think . . .).

You might also want to consider preparing a textfile and reading it into 
a list (via lines = open("somefile.txt").readlines()) and then work with 
that so you don't have to hardcode it into the program. This however is 
somewhat advanced (if you're just starting out), so don't sweat it.
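
A rough sketch of that idea in Python 3, assuming a made-up file format with one band per line written as "Band: album; album; ..." (the format, the function name `load_albums` and the in-memory sample standing in for a real file are all assumptions for the example). It builds a dictionary keyed by band name rather than a flat list, so the lookup stays a one-liner:

```python
import io

# Stand-in for the contents of a text file such as somefile.txt
SAMPLE = """\
Rush: Rush; Fly By Night; Caress of Steel
Enchant: A Blueprint of the World; Wounded; Time Lost
"""


def load_albums(lines):
    """Build {band: [albums]} from lines of the form 'Band: album; album'."""
    albums = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        band, _, titles = line.partition(":")
        albums[band.strip()] = [t.strip() for t in titles.split(";") if t.strip()]
    return albums


album_info = load_albums(io.StringIO(SAMPLE))
```

Passing an open file object instead of the `io.StringIO` stand-in works the same way, since both yield one line at a time.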



> I'm not familiar with how 'classes' work yet (still reading through my 'Core 
> Python' book) but was curious if using a 'class' would be better suited for 
> something like this?  Since the user could possibly choose from 100 or more 
> choices, I'd like to come up with something that's efficient as well as easy 
> to read in the code.  If anyone has time I'd love to hear your thoughts.
> 
Think of classes as "models of things and their behavior" (like an 
animal, a car or a robot). What you want is a simple "request->answer" 
style functionality, hence a function.


Hope that helps.
Happy coding :)

/W
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Processing XML that's embedded in HTML

2008-01-22 Thread Mike Driscoll
On Jan 22, 11:32 am, Paul Boddie <[EMAIL PROTECTED]> wrote:

> > The rest of the document is html, javascript div tags, etc. I need the
> > information only from the row where the Relationship tag = Owner and
> > the Priority tag = 1. The rest I can ignore. When I tried parsing it
> > with minidom, I get an ExpatError: mismatched tag: line 1, column 357
> > so I think the HTML is probably malformed.
>
> Or that it isn't well-formed XML, at least.

I probably should have posted that I got the error on the first line
of the file, which is why I think it's the HTML. But I wouldn't be
surprised if it was the XML that's behaving badly.

>
> > I looked at BeautifulSoup, but it seems to separate its HTML
> > processing from its XML processing. Can someone give me some pointers?
>
> With libxml2dom [1] I'd do something like this:
>
>   import libxml2dom
>   d = libxml2dom.parse(filename, html=1)
>   # or: d = parseURI(uri, html=1)
>   rows = d.xpath("//XML/BoundData/Row")
>   # or: rows = d.xpath("//[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/
> BoundData/Row")
>
> Even though the document is interpreted as HTML, you should get a DOM
> containing the elements as libxml2 interprets them.
>
> > I am currently using Python 2.5 on Windows XP. I will be using
> > Internet Explorer 6 since the document will not display correctly in
> > Firefox.
>
> That shouldn't be much of a surprise, it must be said: it isn't XHTML,
> where you might be able to extend the document via XML, so the whole
> document has to be "proper" HTML.
>
> Paul
>
> [1]http://www.python.org/pypi/libxml2dom


I must have tried this module quite a while ago since I already have
it installed. I see you're the author of the module, so you can
probably tell me what's what. When I do the above, I get an empty list
either way. See my code below:

import libxml2dom
d = libxml2dom.parse(filename, html=1)
rows = d.xpath('//[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/BoundData/
Row')
# rows = d.xpath("//XML/BoundData/Row")
print rows

I'm not sure what is wrong here...but I got lxml to create a tree by
doing the following:


from lxml import etree
from StringIO import StringIO

parser = etree.HTMLParser()
tree = etree.parse(filename, parser)
xml_string = etree.tostring(tree)
context = etree.iterparse(StringIO(xml_string))


However, when I iterate over the contents of "context", I can't figure
out how to nab the row's contents:

for action, elem in context:
    if action == 'end' and elem.tag == 'relationship':
        # do something...but what!?
        # this if statement probably isn't even right


Thanks for the quick response, though! Any other ideas?

Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: bags? 2.5.x?

2008-01-22 Thread MRAB
On Jan 21, 11:13 pm, Dan Stromberg <[EMAIL PROTECTED]> wrote:
> On Thu, 17 Jan 2008 18:18:53 -0800, Raymond Hettinger wrote:
> >> >> I keep wanting something like them - especially bags with something
> >> >> akin to set union, intersection and difference.
>
> >> > How about this recepie
> >> > http://www.ubookcase.com/book/Oreilly/
>
> >> The author of the bag class said that he was planning to submit bags
> >> for inclusion in 2.5 - is there a particular reason why they didn't go
> >> in?
>
> > Three reasons:
>
> > 1. b=collections.defaultdict(int) went a long way towards meeting the
> > need to for a fast counter.
>
> > 2. It's still not clear what the best API would be. What should list(b)
> > return for b.dict = {'a':3, 'b':0, 'c':-3}? Perhaps, [('a', 3), ('b',
> > 0), ('c', -3)] or ['a', 'a', 'a']
> > or ['a']
> > or ['a', 'b', 'c']
> > or raise an Error for the negative entry.
>
> I'd suggest that .keys() return the unique list, and that list() return
> the list of tuples.  Then people can use list comprehensions or map() to
> get what they really need.

I think that a bag is a cross between a dict (but the values are
always positive integers) and a set (but duplicates are permitted).

I agree that .keys() should return the unique list, but that .items() should
return the tuples and list() should return the list of keys including
duplicates. bag() should accept an iterable and count the duplicates.

For example:

>>> sentence = "the cat sat on the mat"
>>> my_words = sentence.split()
>>> print my_words
['the', 'cat', 'sat', 'on', 'the', 'mat']
>>> my_bag = bag(my_words)
>>> print my_bag
bag({'on': 1, 'the': 2, 'sat': 1, 'mat': 1, 'cat': 1})
>>> my_list = list(my_bag)
>>> print my_list
['on', 'the', 'the', 'sat', 'mat', 'cat']

It should be easy to convert a bag to a dict and also a dict to a bag,
raising ValueError if it sees a value that's not a non-negative
integer (a value of zero just means "there isn't one of these in the
bag"!).
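
For comparison, later Python versions (2.7 and 3.x) ship `collections.Counter`, which behaves much like the bag described here. The sketch below uses it in Python 3 on the same sentence; the second bag `other` is made up for the example. Note that Counter's `+` adds counts (like the `union` in the diff quoted below), while `|` takes the maximum (like `maxunion`).

```python
from collections import Counter

words = "the cat sat on the mat".split()
my_bag = Counter(words)                  # counts the duplicates
other = Counter(["the", "dog", "sat"])   # hypothetical second bag

unique_keys = sorted(my_bag.keys())      # each word once
with_duplicates = sorted(my_bag.elements())  # each word repeated by its count

added = my_bag + other          # counts are summed
common = my_bag & other         # minimum of the two counts
max_union = my_bag | other      # maximum of the two counts
remaining = my_bag - other      # subtraction, dropping results below one
```

Converting in either direction is direct: `dict(my_bag)` gives a plain dictionary and `Counter(some_dict)` turns a mapping of counts back into a bag.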

>
> It might not be a bad thing to have an optional parameter on __init__
> that would allow the user to specify if they need negative counts or not;
> so far, I've not needed them though.
>
> > 3. I'm still working on it and am not done yet.
>
> Decent reasons.  :)
>
> Thanks!
>
> Here's a diff to bag.py that helped me.  I'd like to think these meanings
> are common, but not necessarily!
>
> $ diff -b /tmp/bag.py.original /usr/local/lib/bag.py
> 18a19,58
>
> >     def _difference(lst):
> >         left = lst[0]
> >         right = lst[1]
> >         return max(left - right, 0)
> >     _difference = staticmethod(_difference)
>
> >     def _relationship(self, other, operator):
> >         if isinstance(other, bag):
> >             self_keys = set(self._data.keys())
> >             other_keys = set(other._data.keys())
> >             union_keys = self_keys | other_keys
> >             #print 'union_keys is',union_keys
> >             result = bag()
> >             for element in list(union_keys):
> >                 temp = operator([ self[element], other[element] ])
> >                 #print 'operator is', operator
> >                 #print 'temp is', temp
> >                 result[element] += temp
> >             return result
> >         else:
> >             return NotImplemented
>
> >     def union(self, other):
> >         return self._relationship(other, sum)
>
> >     __or__ = union
>
> >     def intersection(self, other):
> >         return self._relationship(other, min)
>
> >     __and__ = intersection
>
> >     def maxunion(self, other):
> >         return self._relationship(other, max)
>
> >     def difference(self, other):
> >         return self._relationship(other, self._difference)
>
> >     __sub__ = difference
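
The four meanings in the diff above (sum, min, max, and a difference that stops at zero) can be tried out with collections.Counter, which is an assumption here since it postdates this thread. A rough sketch, with made-up sample data; note that Counter binds | to the max-style union, whereas the diff binds __or__ to the summing union.

```python
from collections import Counter

left = Counter({"a": 3, "b": 1})
right = Counter({"a": 1, "c": 2})

summed = left + right       # like union() in the diff: counts are added
largest = left | right      # like maxunion(): the larger count wins
smallest = left & right     # like intersection(): the smaller count wins
remaining = left - right    # like difference(): counts never drop below zero
```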
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: question

2008-01-22 Thread Tim Chase
> def albumInfo(theBand):
>     def Rush():
>         return ['Rush', 'Fly By Night', 'Caress of Steel',
>                 '2112', 'A Farewell to Kings', 'Hemispheres']
>
>     def Enchant():
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>
> The only problem with the code above though is that I
> don't know how to call it, especially since if the user is
> entering a string, how would I convert that string into a
> function name? For example, if the user entered 'Rush',
> how would I call the appropriate function -->
> albumInfo(Rush())

It looks like you're reaching for a dictionary idiom:

  album_info = {
    'Rush': [
      'Rush',
      'Fly By Night',
      'Caress of Steel',
      '2112',
      'A Farewell to Kings',
      'Hemispheres',
      ],
    'Enchant': [
      'A Blueprint of the World',
      'Wounded',
      'Time Lost',
      ],
    }

You can then reference the bits:

  who = "Rush" #get this from the user?
  print "Albums by %s" % who
  for album_name in album_info[who]:
    print ' *', album_name

This is much more flexible when it comes to adding groups
and albums because you can load the contents of album_info
dynamically from your favorite source (a file, DB, or teh
intarweb) rather than editing & restarting your app every time.
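
As a rough sketch of the "load it dynamically" idea, assuming the data is kept as JSON (the ALBUMS_JSON text and the albums_by() helper are invented for this example; in practice the text would be read from a file or a database):

```python
import json

# Hypothetical contents of an external data file.
ALBUMS_JSON = """
{
  "Rush": ["Rush", "Fly By Night", "Caress of Steel"],
  "Enchant": ["A Blueprint of the World", "Wounded", "Time Lost"]
}
"""

album_info = json.loads(ALBUMS_JSON)

def albums_by(who):
    """Return the album list for a band, or an empty list if the band is unknown."""
    return album_info.get(who, [])

for album_name in albums_by("Enchant"):
    print(" *", album_name)
```

Adding a band then means editing the data, not the program.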

-tkc

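PS: to answer your original question, if Rush() and Enchant() were
attributes of an object (say, methods of a class or functions in a
module named albumInfo) rather than nested inside a function, you
could use the getattr() function, such as

  results = getattr(albumInfo, who)()

but that's an ugly solution for the example you gave.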
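
A minimal sketch of the getattr() route, assuming the per-band functions live on a class (the AlbumInfo class and lookup() helper are hypothetical) rather than nested inside a function, since getattr() can only see attributes of an object:

```python
class AlbumInfo(object):
    """Hypothetical namespace holding one static method per band."""

    @staticmethod
    def Rush():
        return ['Rush', 'Fly By Night', 'Caress of Steel',
                '2112', 'A Farewell to Kings', 'Hemispheres']

    @staticmethod
    def Enchant():
        return ['A Blueprint of the World', 'Wounded', 'Time Lost']

def lookup(who):
    """Map a user-entered string to a method; unknown or private names give []."""
    if who.startswith('_'):
        return []                       # never expose __doc__, __class__, etc.
    func = getattr(AlbumInfo, who, None)
    if not callable(func):
        return []
    return func()

results = lookup('Enchant')
```

The extra checks needed to keep user input away from special attributes are part of what makes this uglier than a plain dictionary.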




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Raymond Hettinger
[Peter Otten]
> You can be bolder here as the izip() docs explicitly state
>
> """
> Note, the left-to-right evaluation order of the iterables is
> guaranteed. This makes possible an idiom for clustering a data series into
> n-length groups using "izip(*[iter(s)]*n)".
> """
 . . .
> is about zip(), not izip().

FWIW, I just added a similar guarantee for zip().


Raymond

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: question

2008-01-22 Thread Arnaud Delobelle
On Jan 22, 7:58 pm, <[EMAIL PROTECTED]> wrote:
> I'm still learning Python and was wanting to get some thoughts on this.  I 
> apologize if this sounds ridiculous...  I'm mainly asking it to gain some 
> knowledge of what works better.  The main question I have is if I had a lot 
> of lists to choose from, what's the best way to write the code so I'm not 
> wasting a lot of memory?  I've attempted to list a few examples below to 
> hopefully be a little clearer about my question.
>
> Lets say I was going to be pulling different data, depending on what the user 
> entered.  I was thinking I could create a function which contained various 
> functions inside:
>
> def albumInfo(theBand):
>     def Rush():
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
>                 'A Farewell to Kings', 'Hemispheres']
>
>     def Enchant():
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>
>     ...
>
> The only problem with the code above though is that I don't know how to call 
> it, especially since if the user is entering a string, how would I convert 
> that string into a function name?  For example, if the user entered 'Rush', 
> how would I call the appropriate function -->  albumInfo(Rush())
>
> But if I could somehow make that code work, is it a good way to do it?  I'm 
> assuming if the user entered 'Rush' that only the list in the Rush() function 
> would be stored, ignoring the other functions inside the albumInfo() function?
>
> I then thought maybe just using a simple if/else statement might work like so:
>
> def albumInfo(theBand):
>     if theBand == 'Rush':
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
>                 'A Farewell to Kings', 'Hemispheres']
>     elif theBand == 'Enchant':
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>     ...
>
> Does anyone think this would be more efficient?
>
> I'm not familiar with how 'classes' work yet (still reading through my 'Core 
> Python' book) but was curious if using a 'class' would be better suited for 
> something like this?  Since the user could possibly choose from 100 or more 
> choices, I'd like to come up with something that's efficient as well as easy 
> to read in the code.  If anyone has time I'd love to hear your thoughts.
>
> Thanks.
>
> Jay

What you want is a dictionary:

albumInfo = {
    'Rush': ['Rush', 'Fly By Night', 'Caress of Steel',
             '2112', 'A Farewell to Kings', 'Hemispheres'],
    'Enchant': ['A Blueprint of the World',
                'Wounded', 'Time Lost'],
    ...
}

then to find the info just do:

>>> albumInfo['Enchant']
['A Blueprint of the World', 'Wounded', 'Time Lost']

It also makes it easy to add a new album on the fly:

>>> albumInfo["Lark's tongue in Aspic"] = [ ... ]

Hope that helps.

--
Arnaud

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PyGTK, Glade, and ComboBoxEntry.append_text()

2008-01-22 Thread Yann Leboulanger
Greg Johnston wrote:
> Hey all,
> 
> I'm a relative newbie to Python (switched over from Scheme fairly
> recently) but I've been using PyGTK and Glade to create an interface,
> which is a combo I'm very impressed with.
> 
> There is, however, one thing I've been wondering about. It doesn't
> seem possible to modify ComboBoxEntry choice options on the fly--at
> least with append_text(), etc--because they were not created with
> gtk.combo_box_entry_new_text(). Basically, I'm wondering if there's
> any way around this.
> 
> Thank you,
> Greg Johnston

PyGTK mailing list:
http://pygtk.org/feedback.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: question

2008-01-22 Thread James Matthews
Since you aren't familiar with classes, I will keep this within the
scope of functions... If you have code like this

def a():
    def b():
        pass

then you can only call function b when you are within function a.

James

On Jan 22, 2008 8:58 PM,  <[EMAIL PROTECTED]> wrote:
> I'm still learning Python and was wanting to get some thoughts on this.  I 
> apologize if this sounds ridiculous...  I'm mainly asking it to gain some 
> knowledge of what works better.  The main question I have is if I had a lot 
> of lists to choose from, what's the best way to write the code so I'm not 
> wasting a lot of memory?  I've attempted to list a few examples below to 
> hopefully be a little clearer about my question.
>
> Lets say I was going to be pulling different data, depending on what the user 
> entered.  I was thinking I could create a function which contained various 
> functions inside:
>
> def albumInfo(theBand):
>     def Rush():
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
>                 'A Farewell to Kings', 'Hemispheres']
>
>     def Enchant():
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
>
> ...
>
> The only problem with the code above though is that I don't know how to call 
> it, especially since if the user is entering a string, how would I convert 
> that string into a function name?  For example, if the user entered 'Rush', 
> how would I call the appropriate function -->  albumInfo(Rush())
>
> But if I could somehow make that code work, is it a good way to do it?  I'm 
> assuming if the user entered 'Rush' that only the list in the Rush() function 
> would be stored, ignoring the other functions inside the albumInfo() function?
>
> I then thought maybe just using a simple if/else statement might work like so:
>
> def albumInfo(theBand):
>     if theBand == 'Rush':
>         return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
>                 'A Farewell to Kings', 'Hemispheres']
>     elif theBand == 'Enchant':
>         return ['A Blueprint of the World', 'Wounded', 'Time Lost']
> ...
>
> Does anyone think this would be more efficient?
>
> I'm not familiar with how 'classes' work yet (still reading through my 'Core 
> Python' book) but was curious if using a 'class' would be better suited for 
> something like this?  Since the user could possibly choose from 100 or more 
> choices, I'd like to come up with something that's efficient as well as easy 
> to read in the code.  If anyone has time I'd love to hear your thoughts.
>
> Thanks.
>
> Jay
> --
> http://mail.python.org/mailman/listinfo/python-list
>



-- 
http://search.goldwatches.com/?Search=Movado+Watches
http://www.jewelerslounge.com
http://www.goldwatches.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Peter Otten
Arnaud Delobelle wrote:

> On Jan 22, 4:10 pm, Alan Isaac <[EMAIL PROTECTED]> wrote:
> 
>> <http://bugs.python.org/issue1121416>
>>
>> fwiw,
>> Alan Isaac
> 
> Thanks.  So I guess I shouldn't take the code snippet I quoted as a
> specification of izip but rather as an illustration.

You can be bolder here as the izip() docs explicitly state

"""
Note, the left-to-right evaluation order of the iterables is
guaranteed. This makes possible an idiom for clustering a data series into
n-length groups using "izip(*[iter(s)]*n)".
"""

and the bug report with Raymond Hettinger saying

"""
Left the evaluation order as an unspecified, implementation
specific detail.
"""

is about zip(), not izip().
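
A small sketch of the clustering idiom from the quoted docs, written for Python 3, where the built-in zip() is lazy in the same way izip() was. The function name cluster() is made up for this illustration.

```python
def cluster(series, n):
    """Group series into n-length tuples; leftover items at the end are dropped."""
    # The *same* iterator object appears n times in the list, so each output
    # tuple pulls n consecutive items from it, strictly left to right.
    return list(zip(*[iter(series)] * n))

pairs = cluster(range(6), 2)
triples = cluster("abcdefg", 3)
```

The idiom only works because the arguments are consumed left to right, which is exactly the guarantee under discussion.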

Peter
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: A global or module-level variable?

2008-01-22 Thread Paul Rubin
Bret <[EMAIL PROTECTED]> writes:
> nextport=42000
> 
> def getNextPort():
>     nextport += 1
>     return nextport

If you have to do it that way, use:

def getNextPort():
    global nextport
    nextport += 1
    return nextport

The global declaration stops the compiler from treating nextport as a
local variable, which is what makes the increment fail as a read of an
uninitialized variable.
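
Since the original question mentions thread synchronization as the next step, here is a rough sketch of the same function guarded by a lock. The name _port_lock is an assumption made for this example and does not appear in the thread.

```python
import threading

nextport = 42000
_port_lock = threading.Lock()   # hypothetical name, introduced for this sketch

def getNextPort():
    """Return the next port number; safe to call from several threads."""
    global nextport             # rebind the module-level name, not a new local
    with _port_lock:            # only one thread at a time may increment
        nextport += 1
        return nextport

first = getNextPort()
second = getNextPort()
```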
-- 
http://mail.python.org/mailman/listinfo/python-list


question

2008-01-22 Thread jyoung79
I'm still learning Python and was wanting to get some thoughts on this.  I 
apologize if this sounds ridiculous...  I'm mainly asking it to gain some 
knowledge of what works better.  The main question I have is if I had a lot of 
lists to choose from, what's the best way to write the code so I'm not wasting 
a lot of memory?  I've attempted to list a few examples below to hopefully be a 
little clearer about my question.

Lets say I was going to be pulling different data, depending on what the user 
entered.  I was thinking I could create a function which contained various 
functions inside: 

def albumInfo(theBand):
    def Rush():
        return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
                'A Farewell to Kings', 'Hemispheres']

    def Enchant():
        return ['A Blueprint of the World', 'Wounded', 'Time Lost']

...

The only problem with the code above though is that I don't know how to call 
it, especially since if the user is entering a string, how would I convert that 
string into a function name?  For example, if the user entered 'Rush', how 
would I call the appropriate function -->  albumInfo(Rush())

But if I could somehow make that code work, is it a good way to do it?  I'm 
assuming if the user entered 'Rush' that only the list in the Rush() function 
would be stored, ignoring the other functions inside the albumInfo() function?

I then thought maybe just using a simple if/else statement might work like so:

def albumInfo(theBand):
    if theBand == 'Rush':
        return ['Rush', 'Fly By Night', 'Caress of Steel', '2112',
                'A Farewell to Kings', 'Hemispheres']
    elif theBand == 'Enchant':
        return ['A Blueprint of the World', 'Wounded', 'Time Lost']
...

Does anyone think this would be more efficient?

I'm not familiar with how 'classes' work yet (still reading through my 'Core 
Python' book) but was curious if using a 'class' would be better suited for 
something like this?  Since the user could possibly choose from 100 or more 
choices, I'd like to come up with something that's efficient as well as easy to 
read in the code.  If anyone has time I'd love to hear your thoughts.

Thanks.

Jay
-- 
http://mail.python.org/mailman/listinfo/python-list


A global or module-level variable?

2008-01-22 Thread Bret
This has to be easier than I'm making it.

I've got a module, remote.py, which contains a number of classes, all
of which open a port for communication.  I'd like to have a way to
coordinate these port numbers akin to this:

So I have this in the __init__.py file for a package called cstore:

nextport=42000

def getNextPort():
    nextport += 1
    return nextport

:
Then, in the class where I wish to use this (in cstore.remote.py):
:


class Spam():

    def __init__(self, **kwargs):
        self._port = cstore.getNextPort()

I can't seem to make this work, though.  As given here, I get an
"UnboundLocalError: local variable 'nextport' referenced before
assignment".  When I try prefixing the names inside __init__.py with
"cstore.", I get an error that the global name "cstore" is not
defined.

I've been looking at this long enough that my eyes are blurring.  Any
ideas?

BTW, the driving force here is that I'm going to need to wrap this in
some thread synchronization.  For now, though, I'm just trying to get
the basics working.

Thanks!


Bret
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem with processing XML

2008-01-22 Thread John Carlyle-Clarke
Paul McGuire wrote:

> 
> Here is a pyparsing hack for your problem.

Thanks Paul!  This looks like an interesting approach, and once I get my 
head around the syntax, I'll give it a proper whirl.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: printing escape character

2008-01-22 Thread hrochonwo
On Jan 22, 7:58 pm, "Jerry Hill" <[EMAIL PROTECTED]> wrote:
> On Jan 22, 2008 1:38 PM, hrochonwo <[EMAIL PROTECTED]> wrote:
>
> > Hi,
>
> > I want to print string without "decoding" escaped characters to
> > newline etc.
> > like print "a\nb" -> a\nb
> > is there a simple way to do it in python or should i somehow use
> > string.replace(..) function ?
> >>> print 'a\nb'.encode('string_escape')
>
> a\nb
>
> --
> Jerry


thank you, jerry
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Arnaud Delobelle
On Jan 22, 6:34 pm, Paddy <[EMAIL PROTECTED]> wrote:
[...]
> Hi George,
> You need to 'get it right' first. Micro-optimizing for speed
> without thought of the wider context is a bad habit to form and a time
> waster.
> If the routine is all that needs to be delivered and it does not
> perform at an acceptable speed then find out what is acceptable and
> optimise towards that goal. My questions were set to get posters to
> think more about the need for speed optimizations and where they
> should be applied, (if at all).
>
> A bit of forethought might justify leaving the routine alone, or
> optimising for readability instead.

But it's fun!

Some-of-us-can't-help-it'ly yours
--
Arnaud

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: printing escape character

2008-01-22 Thread Jerry Hill
On Jan 22, 2008 1:38 PM, hrochonwo <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I want to print string without "decoding" escaped characters to
> newline etc.
> like print "a\nb" -> a\nb
> is there a simple way to do it in python or should i somehow use
> string.replace(..) function ?

>>> print 'a\nb'.encode('string_escape')
a\nb
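
The string_escape codec exists only in Python 2. A minimal sketch of a close equivalent for Python 3, assuming the text is an ordinary str:

```python
text = "a\nb"

# unicode_escape produces ASCII bytes with the escapes spelled out;
# decoding them gives back a str that shows the backslash sequence.
escaped = text.encode("unicode_escape").decode("ascii")
print(escaped)

# repr() gives a similar result but keeps the surrounding quotes.
quoted = repr(text)
```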

-- 
Jerry
-- 
http://mail.python.org/mailman/listinfo/python-list


difflib confusion

2008-01-22 Thread krishnakant Mane
hello all,
I have a bit of a confusing question.
firstly I wanted a library which can do an svn like diff with two files.
let's say I have file1 and file2 where file2 contains some thing which
file1 does not have.  now if I do readlines() on both the files, I
have a list of all the lines.
I now want to do a diff and find out which word was added, deleted or
changed, and ideally at which character; failing that, I at least want
to know the word that has the change.
any ideas please?
kk
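
One possible starting point, offered as a sketch only: difflib.SequenceMatcher can compare two lists of words and report what was replaced, inserted or deleted. The two sample lines are invented; for whole files, difflib.ndiff() or difflib.unified_diff() on the readlines() output would first narrow things down to the changed lines.

```python
import difflib

# Invented sample lines standing in for one line from each file.
old_line = "the quick brown fox"
new_line = "the quick red fox jumps"

old_words = old_line.split()
new_words = new_line.split()

matcher = difflib.SequenceMatcher(None, old_words, new_words)
changes = []
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != "equal":
        # tag is 'replace', 'delete' or 'insert'; the slices are the words involved
        changes.append((tag, old_words[i1:i2], new_words[j1:j2]))

for change in changes:
    print(change)
```

Running the same matcher on two individual words (strings instead of lists) gives character-level positions.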
-- 
http://mail.python.org/mailman/listinfo/python-list


rpy registry

2008-01-22 Thread [EMAIL PROTECTED]
Howdy, I've been using rpy (1.0.1) and python (2.5.1) on my office
computer with great success.  When I went to put rpy on my laptop,
however, I get an error trying to load rpy.

"Unable to determine R version from the registry. Trying another
method."

followed by a few lines of the usual error message style (ending with
"NameError: global name 'RuntimeExecError' is not defined.").  I have
reinstalled R (now 2.6.1), rpy, and python without any luck (being
sure to check the "include in registry" on the installation of R).
Everything else I have used thus far works perfectly.  Any thoughts on
what might be causing problems?

Thanks,

-Hans
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Beginners question about debugging (import)

2008-01-22 Thread Diez B. Roggisch
Albert van der Horst schrieb:
> I'm starting with Python. First with some interactive things,
> working through the tutorial,
> then with definitions in a file called sudoku.py.
> Of course I make lots of mistakes, so I have to include that file
> time and again.
> 
> I discovered (the hard way) that the second time you invoke
> from sudoku import *
> nothing happens.
> 
> There is reload. But it only seems to work with
> import sudoku
> 
> Now I find myself typing ``sudoku.'' all the time:
> 
> x=sudoku.sudoku()
> y=sudoku.create_set_of_sets()
> sudoku.symbols
> 
> Is there a more convenient way?
> 
> (This is a howto question, rather difficult to get answered
> from the documentation.)

import sudoku as s

However, I find it easier to just create a test.py and run that from the 
shell. For the exact reason that reload has its caveats and in the end, 
more complex testing-code isn't really feasible anyway. If you need to, 
drop into the interactive prompt using

python -i test.py
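
A minimal sketch of the alias-plus-reload workflow. The standard json module stands in for sudoku here so the example runs anywhere, and importlib.reload() is the Python 3 spelling of the reload() builtin mentioned in the question.

```python
import importlib
import json as j          # stand-in for "import sudoku as s"

before = j.dumps([1, 2])

# reload() re-executes the module's source and returns the module object;
# the short alias keeps working afterwards.
j = importlib.reload(j)

after = j.dumps([1, 2])
```

One of the caveats: names copied earlier with "from module import *" still point at the old objects after a reload.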

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


printing escape character

2008-01-22 Thread hrochonwo
Hi,

I want to print string without "decoding" escaped characters to
newline etc.
like print "a\nb" -> a\nb
is there a simple way to do it in python or should i somehow use
string.replace(..) function ?


thanks for any reply

hrocho
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Paddy
On Jan 22, 5:34 am, George Sakkis <[EMAIL PROTECTED]> wrote:
> On Jan 22, 12:15 am, Paddy <[EMAIL PROTECTED]> wrote:
>
> > On Jan 22, 3:20 am, Alan Isaac <[EMAIL PROTECTED]> wrote:
> > > I want to generate sequential pairs from a list.
> > <>
> > > What is the fastest way? (Ignore the import time.)
>
> > 1) How fast is the method you have?
> > 2) How much faster does it need to be for your application?
> > 3) Are there any other bottlenecks in your application?
> > 4) Is this the routine whose smallest % speed-up would give the
> > largest overall speed up of your application?
>
> I believe the "what is the fastest way" question for such small well-
> defined tasks is worth asking on its own, regardless of whether it
> makes a difference in the application (or even if there is no
> application to begin with).

Hi George,
You need to 'get it right' first. Micro-optimizing for speed
without thought of the wider context is a bad habit to form and a time
waster.
If the routine is all that needs to be delivered and it does not
perform at an acceptable speed then find out what is acceptable and
optimise towards that goal. My questions were set to get posters to
think more about the need for speed optimizations and where they
should be applied, (if at all).

A bit of forethought might justify leaving the routine alone, or
optimising for readability instead.

- Paddy.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: isgenerator(...) - anywhere to be found?

2008-01-22 Thread Steven Bethard
Diez B. Roggisch wrote:
> Jean-Paul Calderone wrote:
> 
>> On Tue, 22 Jan 2008 15:15:43 +0100, "Diez B. Roggisch"
>> <[EMAIL PROTECTED]> wrote:
>>> Jean-Paul Calderone wrote:
>>>
>>>> On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch"
>>>> <[EMAIL PROTECTED]> wrote:
>>>>> For a simple greenlet/tasklet/microthreading experiment I found myself
>>>>> in the need to ask the question
>>>>>
>>>>> [snip]
>>>> Why do you need a special case for generators?  If you just pass the
>>>> object in question to iter(), instead, then you'll either get back
>>>> something that you can iterate over, or you'll get an exception for
>>>> things that aren't iterable.
>>> Because - as I said - I'm working on a micro-thread thingy, where the
>>> scheduler needs to push returned generators to a stack and execute them.
>>> Using send(), which rules out iter() anyway.
>> Sorry, I still don't understand.  Why is a generator different from any
>> other iterator?
> 
> Because you can use send(value) on it for example. Which you can't with
> every other iterator. And that you can utilize to create a little
> framework of co-routines or however you like to call it that will yield
> values when they want, or generators if they have nested co-routines the
> scheduler needs to keep track of and invoke after another.

So if you need the send() method, why not just check for that::

    try:
        obj.send
    except AttributeError:
        pass  # not a generator-like object
    else:
        pass  # is a generator-like object

Then anyone who wants to make an extended iterator and return it can 
expect it to work just like a real generator would.
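
The same check can be wrapped in a helper. This is a sketch only; is_generator_like(), countdown() and the Echo class are names invented for the illustration.

```python
def is_generator_like(obj):
    """True if obj offers a callable send(), as real generators do."""
    return callable(getattr(obj, "send", None))

def countdown(n):
    """A real generator that accepts values through send()."""
    while n > 0:
        received = yield n
        n = received if received is not None else n - 1

class Echo(object):
    """Hypothetical extended iterator: not a generator, but it has send()."""
    def __iter__(self):
        return self
    def __next__(self):
        raise StopIteration
    next = __next__          # Python 2 spelling of the same method
    def send(self, value):
        return value

checks = [is_generator_like(countdown(3)),
          is_generator_like(iter([1, 2])),
          is_generator_like(Echo())]
```

A plain list iterator fails the check, while both the real generator and the look-alike pass it.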

STeVe
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Alan Isaac
Arnaud Delobelle wrote:
> pairs4 wins.


Oops. I see a smaller difference,
but yes, pairs4 wins.

Alan Isaac

import time
from itertools import islice, izip

x = range(51)

def pairs1(x):
    return izip(islice(x,0,None,2),islice(x,1,None,2))

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    return izip(xiter,xiter)

t = time.clock()
for x1, x2 in pairs1(x):
    pass
t1 = time.clock() - t

t = time.clock()
for x1, x2 in pairs2(x):
    pass
t2 = time.clock() - t

t = time.clock()
for x1, x2 in pairs3(x):
    pass
t3 = time.clock() - t

t = time.clock()
for x1, x2 in pairs4(x):
    pass
t4 = time.clock() - t

print t1, t2, t3, t4

Output:
0.317524154606 1.13436847421 1.07100930426 0.262926712753
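
For anyone repeating the measurement on Python 3, where izip() is simply zip() and time.clock() is no longer available, a rough sketch using the timeit module. Only pairs1 and pairs4 are carried over, and the data size and repeat counts are arbitrary choices for this example, so no particular timings are claimed.

```python
import timeit
from itertools import islice

def pairs1(x):
    return zip(islice(x, 0, None, 2), islice(x, 1, None, 2))

def pairs4(x):
    xiter = iter(x)
    return zip(xiter, xiter)

data = list(range(1000))   # arbitrary size chosen for the sketch

def consume(pair_func):
    """Run through every pair once, as the timing loops above do."""
    for x1, x2 in pair_func(data):
        pass

results = {}
for func in (pairs1, pairs4):
    # Take the best of three runs of 100 passes each.
    results[func.__name__] = min(
        timeit.repeat(lambda: consume(func), number=100, repeat=3))

print(results)
```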
-- 
http://mail.python.org/mailman/listinfo/python-list


Beginners question about debugging (import)

2008-01-22 Thread Albert van der Horst
I'm starting with Python. First with some interactive things,
working through the tutorial,
then with definitions in a file called sudoku.py.
Of course I make lots of mistakes, so I have to include that file
time and again.

I discovered (the hard way) that the second time you invoke
from sudoku import *
nothing happens.

There is reload. But it only seems to work with
import sudoku

Now I find myself typing ``sudoku.'' all the time:

x=sudoku.sudoku()
y=sudoku.create_set_of_sets()
sudoku.symbols

Is there a more convenient way?

(This is a howto question, rather difficult to get answered
from the documentation.)

Groetjes Albert



~

--
-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
[EMAIL PROTECTED]&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
-- 
http://mail.python.org/mailman/listinfo/python-list


Submitting with PAMIE

2008-01-22 Thread romo20350
Hi I really need help. I've been looking around for an answer forever.
I need to submit a form with no name and also the submit button has no
name or value. How might I go about doing either of these? Thanks
-- 
http://mail.python.org/mailman/listinfo/python-list


Using utidylib, empty string returned in some cases

2008-01-22 Thread Boris
Hello

I'm using Debian Linux, Python 2.4.4, and utidylib
(http://utidylib.berlios.de/). I wrote simple functions to get a web page
and convert it from windows-1251 to utf8, and then I'd like to clean the
html with it.

Here are two pages I use to check my program:
http://www.ya.ru/ (in this case everything works ok)
http://www.yellow-pages.ru/rus/nd2/qu5/ru15632 (in this case tidy did
not return anything, just an empty string)


code:

--

# coding: utf-8
import urllib, urllib2, tidy

def get_page(url):
  user_agent = ('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; '
                '.NET CLR 1.1.4322; .NET CLR 2.0.50727)')
  headers = { 'User-Agent' : user_agent }
  data= {}

  req = urllib2.Request(url, data, headers)
  response = urllib2.urlopen(req)
  page = response.read()

  return page

def convert_1251(page):
  p = page.decode('windows-1251')
  u = p.encode('utf-8')
  return u

def clean_html(page):
  tidy_options = { 'output_xhtml' : 1,
   'add_xml_decl' : 1,
   'indent' : 1,
   'input-encoding' : 'utf8',
   'output-encoding' : 'utf8',
   'tidy_mark' : 1,
 }
  cleaned_page = tidy.parseString(page, **tidy_options)
  return cleaned_page

test_url = 'http://www.yellow-pages.ru/rus/nd2/qu5/ru15632'
#test_url = 'http://www.ya.ru/'

#f = open('yp.html', 'r')
#p = f.read()

print clean_html(convert_1251(get_page(test_url)))

--

What am I doing wrong? Can anyone help, please?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Processing XML that's embedded in HTML

2008-01-22 Thread Paul Boddie
On 22 Jan, 17:57, Mike Driscoll <[EMAIL PROTECTED]> wrote:
>
> I need to parse a fairly complex HTML page that has XML embedded in
> it. I've done parsing before with the xml.dom.minidom module on just
> plain XML, but I cannot get it to work with this HTML page.

It's HTML day on comp.lang.python today! ;-)

> The XML looks like this:
>
> 
>
> Owner
>
> 1
>
> 07/16/2007
>
> No
>
> Doe, John
>
> 1905 S 3rd Ave , Hicksville IA 9
>
>   
>
>   
>
> Owner
>
> 2
>
> 07/16/2007
>
> No
>
> Doe, Jane
>
> 1905 S 3rd Ave , Hicksville IA 9
>
>   
>
> It appears to be enclosed with <XML id="grdRegistrationInquiryCustomers">

You could probably find the Row elements with the following XPath
expression:

  //XML/BoundData/Row

More specific would be this:

  //XML[@id="grdRegistrationInquiryCustomers"]/BoundData/Row

See below for the relevance of this. You could also try using
getElementById on the document, specifying the id attribute's value
given above, then descending to find the Row elements.

> The rest of the document is html, javascript div tags, etc. I need the
> information only from the row where the Relationship tag = Owner and
> the Priority tag = 1. The rest I can ignore. When I tried parsing it
> with minidom, I get an ExpatError: mismatched tag: line 1, column 357
> so I think the HTML is probably malformed.

Or that it isn't well-formed XML, at least.

> I looked at BeautifulSoup, but it seems to separate its HTML
> processing from its XML processing. Can someone give me some pointers?

With libxml2dom [1] I'd do something like this:

  import libxml2dom
  d = libxml2dom.parse(filename, html=1)
  # or: d = parseURI(uri, html=1)
  rows = d.xpath("//XML/BoundData/Row")
  # or: rows = d.xpath(
  #   '//XML[@id="grdRegistrationInquiryCustomers"]/BoundData/Row')

Even though the document is interpreted as HTML, you should get a DOM
containing the elements as libxml2 interprets them.
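
If installing libxml2dom is not an option, a standard-library sketch along these lines might do, assuming the XML island itself is well-formed even though the surrounding HTML is not. The sample page is invented: only the XML, BoundData, Row, Relationship and Priority names come from this thread, and the Name element is a hypothetical placeholder.

```python
import re
import xml.etree.ElementTree as ET

# Invented page: sloppy HTML wrapped around a well-formed XML island.
page = """<html><body><p>unclosed paragraph
<XML id="grdRegistrationInquiryCustomers"><BoundData>
<Row><Relationship>Owner</Relationship><Priority>1</Priority><Name>Doe, John</Name></Row>
<Row><Relationship>Owner</Relationship><Priority>2</Priority><Name>Doe, Jane</Name></Row>
</BoundData></XML>
</body></html>"""

# Cut the island out of the page, then hand only that part to the XML parser.
match = re.search(r'<XML id="grdRegistrationInquiryCustomers">.*?</XML>',
                  page, re.DOTALL)
island = ET.fromstring(match.group(0))

# Keep only the rows where Relationship is Owner and Priority is 1.
wanted = [row for row in island.findall("BoundData/Row")
          if row.findtext("Relationship") == "Owner"
          and row.findtext("Priority") == "1"]
names = [row.findtext("Name") for row in wanted]
print(names)
```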

> I am currently using Python 2.5 on Windows XP. I will be using
> Internet Explorer 6 since the document will not display correctly in
> Firefox.

That shouldn't be much of a surprise, it must be said: it isn't XHTML,
where you might be able to extend the document via XML, so the whole
document has to be "proper" HTML.

Paul

[1] http://www.python.org/pypi/libxml2dom
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Curses and Threading

2008-01-22 Thread Ian Clark
On 2008-01-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>> In fact you have *two* threads: the main thread, and the one you create
>> explicitly.
>
>> After you start the clock thread, the main thread continues executing,
>> immediately entering the finally clause.
>> If you want to wait for the other thread to finish, use the join() method.
>> But I'm unsure if this is the right way to mix threads and curses.
>
> This is what the python documentation says:
>
> join([timeout])
> Wait until the thread terminates. This blocks the calling thread
> until the thread whose join() method is called terminates.
>
> So according to this since I need to block the main thread until the
> clock thread ends I would need the main thread to call
> "cadtime().join()", correct? I'm not sure how to do this because I
> don't have a class or anything for the main thread that I know of. I
> tried putting that after cadtime().start() but that doesn't work. I
> guess what I'm trying to say is how can I tell the main thread what to
> do when it doesn't exist in my code?
>
> Thanks for the help
> -Brett

join() is a method on Thread objects. So you'll need a reference to the
Thread you create, then call join() on that.

thread = cadtime()
thread.start()
thread.join()
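
A self-contained sketch of the same pattern. ClockThread is a hypothetical stand-in for the cadtime class, which is not shown in this thread. Note that writing cadtime().join() would create a second, unstarted thread object, which is why a reference to the started one is kept.

```python
import threading
import time

class ClockThread(threading.Thread):
    """Hypothetical stand-in for the cadtime thread class."""

    def __init__(self, ticks):
        threading.Thread.__init__(self)
        self.ticks = ticks
        self.done = 0

    def run(self):
        for _ in range(self.ticks):
            time.sleep(0.01)     # pretend to redraw a clock
            self.done += 1

thread = ClockThread(3)
thread.start()
thread.join()                    # the main thread blocks here until run() returns
finished = not thread.is_alive()
```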

Ian

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Boa constructor debugging - exec some code at breakpoint?

2008-01-22 Thread Mike Driscoll
On Jan 22, 1:23 am, Joel <[EMAIL PROTECTED]> wrote:
> Can you please tell me how this can be done..
> are there any other IDEs for the same purpose if Boa can't do it?
>
> Joel
>
> On Jan 6, 11:01 am, Joel <[EMAIL PROTECTED]> wrote:
>
> > Hey there..
> > I'm using boa constructor to debug a python application. For my
> > application, I need to insert break points and execute some piece of
> > code interactively through shell or some other window when the
> > breakpoint has been reached. Unfortunately the shell I think is a
> > separate process so whatever variables are set while executing in
> > debugger don't appear in the shell when I try to print using print
> > statement.
>
> > Can anyone tell me how can I do this?
>
> > Really appreciate any support, Thanks
>
> > Joel
> > P.S. Please CC a copy of reply to my email ID if possible.

IDLE does breakpoints...you might find the ActiveState distro more to
your liking too. It's a little bit more fleshed out as an IDE than
IDLE is. Or you could go full blown and use Eclipse with the Python
plug-in.

Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Processing XML that's embedded in HTML

2008-01-22 Thread Mike Driscoll
Hi,

I need to parse a fairly complex HTML page that has XML embedded in
it. I've done parsing before with the xml.dom.minidom module on just
plain XML, but I cannot get it to work with this HTML page.

The XML looks like this:



Owner

1

07/16/2007

No

Doe, John

1905 S 3rd Ave , Hicksville IA 9

  

  

Owner

2

07/16/2007

No

Doe, Jane

1905 S 3rd Ave , Hicksville IA 9

  

It appears to be enclosed with <XML id="grdRegistrationInquiryCustomers">

The rest of the document is html, javascript div tags, etc. I need the
information only from the row where the Relationship tag = Owner and
the Priority tag = 1. The rest I can ignore. When I tried parsing it
with minidom, I get an ExpatError: mismatched tag: line 1, column 357
so I think the HTML is probably malformed.

I looked at BeautifulSoup, but it seems to separate its HTML
processing from its XML processing. Can someone give me some pointers?

I am currently using Python 2.5 on Windows XP. I will be using
Internet Explorer 6 since the document will not display correctly in
Firefox.

Thank you very much!

Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem with processing XML

2008-01-22 Thread Paul Boddie
On 22 Jan, 15:11, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote:
>
> I wrote some code that works on my Linux box using xml.dom.minidom, but
> it will not run on the windows box that I really need it on.  Python
> 2.5.1 on both.
>
> On the windows machine, it's a clean install of the Python .msi from
> python.org.  The linux box is Ubuntu 7.10, which has some Python XML
> packages installed which can't easily be removed (namely  python-libxml2
> and python-xml).

I don't think you're straying into libxml2 or PyXML territory here...

> I have boiled the code down to its simplest form which shows the problem:-
>
> import xml.dom.minidom
> import sys
>
> input_file = sys.argv[1];
> output_file = sys.argv[2];
>
> doc = xml.dom.minidom.parse(input_file)
> file = open(output_file, "w")

On Windows, shouldn't this be the following...?

  file = open(output_file, "wb")

> doc.writexml(file)
>
> The error is:-
>
> $ python test2.py input2.xml output.xml
> Traceback (most recent call last):
>File "test2.py", line 9, in <module>
>  doc.writexml(file)
>File "c:\Python25\lib\xml\dom\minidom.py", line 1744, in writexml
>  node.writexml(writer, indent, addindent, newl)
>File "c:\Python25\lib\xml\dom\minidom.py", line 814, in writexml
>  node.writexml(writer,indent+addindent,addindent,newl)
>File "c:\Python25\lib\xml\dom\minidom.py", line 809, in writexml
>  _write_data(writer, attrs[a_name].value)
>File "c:\Python25\lib\xml\dom\minidom.py", line 299, in _write_data
>  data = data.replace("&", "&amp;").replace("<", "&lt;")
> AttributeError: 'NoneType' object has no attribute 'replace'
>
> As I said, this code runs fine on the Ubuntu box.  If I could work out
> why the code runs on this box, that would help because then I could set
> up the windows box the same way.

If I encountered the same issue, I'd have to inspect the goings-on
inside minidom, possibly using judicious trace statements in the
minidom.py file. Either way, the above looks like an attribute node
produces a value of None rather than any kind of character string.
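As a less invasive alternative to editing minidom.py, a small walker over the parsed tree can report which attribute nodes carry a value of None before writexml() is called. This is only a diagnostic sketch in Python 3 syntax; the function name `find_none_attributes` and the sample document are made up for illustration.

```python
from xml.dom import minidom


def find_none_attributes(node, path=""):
    """Return (element path, attribute name) pairs whose value is None."""
    problems = []
    if node.nodeType == node.ELEMENT_NODE:
        path = path + "/" + node.tagName
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            # An empty string is a legal value; only None breaks _write_data.
            if attr.value is None:
                problems.append((path, attr.name))
    for child in node.childNodes:
        problems.extend(find_none_attributes(child, path))
    return problems


# Hypothetical, well-formed sample: nothing should be reported for it.
doc = minidom.parseString('<root a="1"><child b=""/></root>')
problems = find_none_attributes(doc)
```

Running the same function on the document parsed from the real input file should point at the element whose attribute triggers the AttributeError.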

> The input file contains an  block which is what actually
> causes the problem.  If you remove that node and subnodes, it works
> fine.  For a while at least, you can view the input file at
> http://rafb.net/p/5R1JlW12.html

The horror! ;-)

> Someone suggested that I should try xml.etree.ElementTree, however
> writing the same type of simple code to import and then write the file
> mangles the xsd:schema stuff because ElementTree does not understand
> namespaces.

I'll leave this to others: I don't use ElementTree.

> By the way, is pyxml a live project or not?  Should it still be used?
> It's odd that if you go to http://www.python.org/ and click the link
> "Using python for..." XML, it leads you to 
> http://pyxml.sourceforge.net/topics/
>
> If you then follow the download links to
> http://sourceforge.net/project/showfiles.php?group_id=6473 you see that
> the latest file is 2004, and there are no versions for newer pythons.
> It also says "PyXML is no longer maintained".  Shouldn't the link be
> removed from python.org?

The XML situation in Python's standard library is controversial and
can be probably inaccurately summarised by the following chronology:

 1. XML is born, various efforts start up (see the qp_xml and xmllib
modules).
 2. Various people organise themselves, contributing software to the
PyXML project (4Suite, xmlproc).
 3. The XML backlash begins: we should all apparently be using stuff
like YAML (but don't worry if you haven't heard of it).
 4. ElementTree is released, people tell you that you shouldn't be
using SAX or DOM any more, "pull" parsers are all the rage
(although proponents overlook the presence of xml.dom.pulldom in
the Python standard library).
 5. ElementTree enters the standard library as xml.etree; PyXML falls
into apparent disuse (see remarks about SAX and DOM above).

I think I looked seriously at wrapping libxml2 (with libxml2dom [1])
when I experienced issues with both PyXML and 4Suite when used
together with mod_python, since each project used its own Expat
libraries and the resulting mis-linked software produced very bizarre
results. Moreover, only cDomlette from 4Suite seemed remotely fast,
and yet did not seem to be an adequate replacement for the usual PyXML
functionality.

People will, of course, tell you that you shouldn't use a DOM for
anything and that the "consensus" is to use ElementTree or lxml (see
above), but I can't help feeling that this has a damaging effect on
the XML situation for Python: some newcomers would actually benefit
from the traditional APIs, may already be familiar with them from
other contexts, and may consider Python lacking if the support for
them is in apparent decay. It requires a degree of motivation to
actually attempt to maintain software providing such APIs (which was
my solution to the problem), but if someone isn't totally bound to
Python then they might easily start

Re: stdin, stdout, redmon

2008-01-22 Thread Thynnus
On 1/21/2008 9:02 AM, Bernard Desnoues wrote:
> Hi,
> 
> I've got a problem with the use of Redmon (redirection port monitor). I 
> intend to develop a virtual printer so that I can modify data sent to 
> the printer.

FWIW: there is a nice update to RedMon (v1.7) called RedMon EE (v1.81)
available at http://www.is-foehr.com/ that I have used and like a lot.

From the developer's website:
Fixed issues and features [with respect to the original RedMon]
 * On Windows Terminal Server or Windows XP with fast user switching, the
   "Prompt for filename" dialog will appear on the current session.
 * "SaveAs" now shows XP style dialogs if running under XP
 * Support for PDF Security added - experimental -.
 * Support for setting the task priority - experimental -
 * Use of file-shares as output
 * Environment variables are passed to the AfterWorks Process now.
 * Environment variables are replaced in the program arguments. No
   workaround is needed.
 * RedMon EE comes with an RPC communication feature which could transfer
   output-files back to the client starting the print job on a print server.
   Error messages will be sent to the client.
 * RedMon EE may start a process after the print job has finished (After
   works process).
   e.g. starting a presentation program to show the pdf generated by
   GhostScript.
 * additional debug messages may be written for error analysis.
   No special debug version is needed.
 * user interface has been rewritten. Maybe it's more friendly.
   Added some basic system information which may help if running into
   failures.
 * new feature: running on a print server.
 * cleanup of documentnames "Microsoft -"
 * define templates for output-file names with full environment variable
   substitution
   e.g. %homedrive%\%homedir%\%redmon-user%-%date%-%time%-%n.pdf
 * RedMon EE does not support NT 3.5 and Windows 95/98 !

-Thynnus
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Arnaud Delobelle
On Jan 22, 4:10 pm, Alan Isaac <[EMAIL PROTECTED]> wrote:

> http://bugs.python.org/issue1121416
>
> fwiw,
> Alan Isaac

Thanks.  So I guess I shouldn't take the code snippet I quoted as a
specification of izip but rather as an illustration.

--
Arnaud

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: HTML parsing confusion

2008-01-22 Thread Diez B. Roggisch
Alnilam wrote:

> On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote:
>> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2,
>> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous
>> > 200-modules PyXML package installed. And you don't want the 75Kb
>> > BeautifulSoup?
>>
>> I wasn't aware that I had PyXML installed, and can't find a reference
>> to having it installed in pydocs. ...
> 
> Ugh. Found it. Sorry about that, but I still don't understand why
> there isn't a simple way to do this without using PyXML, BeautifulSoup
> or libxml2dom. What's the point in having sgmllib, htmllib,
> HTMLParser, and formatter all built in if I have to use someone
> else's modules to write a couple of lines of code that achieve the
> simple thing I want. I get the feeling that this would be easier if I
> just broke down and wrote a couple of regular expressions, but it
> hardly seems a 'pythonic' way of going about things.

This is simply a gross misunderstanding of what BeautifulSoup or lxml
accomplish. Dealing with malformed HTML whilst trying to make _some_
sense of it is by no means trivial. And just because you can come up with a
few lines of code using rexes that work for your current use-case doesn't
mean that they serve as a general html-fixing routine. Or do you think the
rather long history and 75Kb of code for BS are because its creator wasn't
aware of rexes?

And it also makes no sense to stuff everything remotely useful into the
standard lib. This would force development and release cycles to be aligned,
resulting in far fewer features and less stability than one could wish for.

And to be honest: I fail to see where your problem is. BeautifulSoup is a
single Python file. So whatever you carry with you from machine to machine,
if it's capable of holding a file of your own code, you can simply put
BeautifulSoup beside it - even if it was a floppy disk.

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: stdin, stdout, redmon

2008-01-22 Thread Thynnus
On 1/22/2008 8:54 AM, Konstantin Shaposhnikov wrote:
> Hi,
> 
> This is Windows bug that is described here: 
> http://support.microsoft.com/default.aspx?kbid=321788
> 
> This article also contains solution: you need to add registry value:
> 
> HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Policies
> \Explorer
> InheritConsoleHandles = 1 (REG_DWORD type)
> 
> Do not forget to launch new console (cmd.exe) after editing registry.
> 
> Alternatively you can use the following command
> 
>   cat file | python script.py
> 
> instead of
> 
>   cat file | script.py
> 
> Regards,
> Konstantin

Nice one, Konstantin!

I can confirm that adding the registry key solves the problem on XPsp2:

-After adding InheritConsoleHandles DWORD 1 key-
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

D:\temp>type test3.py | test3.py
['import sys\n', '\n', 'print sys.stdin.readlines ()\n']

D:\temp>

The KB article is quite poorly written. It seems to state that the issue was
'solved for win2k with sp4, for XP with sp1', and gives no indication that
the key is still needed after the service packs are applied, even though it
is in fact necessary to the solution.

Questions:
 -Any side effects to look out for?
 -If the change is relatively benign, should it be part of the install?
 -Is this worth a documentation patch? If yes, where? I'll give it a shot.

-Thynnus
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Curses and Threading

2008-01-22 Thread Brett . Friermood
> In fact you have *two* threads: the main thread, and the one you create
> explicitly.

> After you start the clock thread, the main thread continues executing,
> immediately entering the finally clause.
> If you want to wait for the other thread to finish, use the join() method.
> But I'm unsure if this is the right way to mix threads and curses.

This is what the python documentation says:

join([timeout])
Wait until the thread terminates. This blocks the calling thread
until the thread whose join() method is called terminates.

So according to this since I need to block the main thread until the
clock thread ends I would need the main thread to call
"cadtime().join()", correct? I'm not sure how to do this because I
don't have a class or anything for the main thread that I know of. I
tried putting that after cadtime().start() but that doesn't work. I
guess what I'm trying to say is how can I tell the main thread what to
do when it doesn't exist in my code?
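A minimal sketch of the usual pattern, assuming a Thread subclass along the lines of cadtime (the `Clock` class below is a hypothetical stand-in, and curses is left out entirely). The main thread is simply whatever code runs at module level or inside the function called from there, so it needs no class of its own. The key point is to keep a reference to the thread object: each call to `cadtime()` constructs a new, unstarted thread, so `cadtime().join()` does not wait for the one started earlier. Python 3 syntax.

```python
import threading
import time


class Clock(threading.Thread):
    """Hypothetical stand-in for the cadtime thread class."""

    def __init__(self, ticks):
        threading.Thread.__init__(self)
        self.ticks = ticks
        self.seen = []

    def run(self):
        # Pretend to update a clock display a few times, then finish.
        for i in range(self.ticks):
            self.seen.append(i)
            time.sleep(0.01)


def main():
    clock = Clock(3)   # keep a reference to the thread object
    clock.start()      # run() now executes in the second thread
    clock.join()       # the main thread blocks here until run() returns
    return clock


clock = main()
```

With curses, the join() call would go inside the try block, so that the cleanup in the finally clause runs only after the clock thread has ended.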

Thanks for the help
-Brett
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: isgenerator(...) - anywhere to be found?

2008-01-22 Thread james . pye
On Jan 22, 6:20 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> For a simple greenlet/tasklet/microthreading experiment I found myself in
> the need to ask the question
>
> isgenerator(v)
>
> but didn't find any implementation in the usual suspects - builtins or
> inspect.

types.GeneratorType exists in newer Pythons, but I'd suggest just
checking for a send method. ;)
That way, you can use something that emulates the interface without
being forced to use a generator.

hasattr(ob, 'send')..
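To make the difference between the two checks concrete, here is a small sketch in Python 3 syntax. The helper names and the `FakeCoroutine` class are made up for illustration; `types.GeneratorType` is the type mentioned above, and `inspect.isgenerator` is a helper that Python 2.6 and later provide.

```python
import inspect
import types


def countdown(n):
    """A plain generator function used as the test subject."""
    while n > 0:
        yield n
        n -= 1


class FakeCoroutine(object):
    """Hypothetical object that emulates the generator interface."""

    def __iter__(self):
        return self

    def __next__(self):
        raise StopIteration

    def send(self, value):
        raise StopIteration


def is_real_generator(ob):
    """Strict check: only objects created by generator functions pass."""
    return isinstance(ob, types.GeneratorType)


def quacks_like_generator(ob):
    """Duck-typed check: anything with a send method passes."""
    return hasattr(ob, 'send')


gen = countdown(3)
```

The strict check rejects `FakeCoroutine()` while the duck-typed one accepts it, which is the trade-off described above.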
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Alan Isaac
Arnaud Delobelle wrote:
> According to the docs [1], izip is defined to be equivalent to:
> 
>  def izip(*iterables):
>      iterables = map(iter, iterables)
>      while iterables:
>          result = [it.next() for it in iterables]
>          yield tuple(result)
> 
> This guarantees that it.next() will be performed from left to right,
> so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1),
> (4, 3)].
> 
> Is there anything else that I am overlooking?
> 
> [1] http://docs.python.org/lib/itertools-functions.html


http://bugs.python.org/issue1121416

fwiw,
Alan Isaac
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread Arnaud Delobelle
On Jan 22, 1:19 pm, Alan Isaac <[EMAIL PROTECTED]> wrote:
> I suppose my question should have been,
> is there an obviously faster way?
> Anyway, of the four ways below, the
> first is substantially fastest.  Is
> there an obvious reason why?

Can you post your results?

I get different ones (pairs1 and pairs2 rewritten slightly to avoid
unnecessary indirection).

== pairs.py ===
from itertools import *

def pairs1(x):
    return izip(islice(x,0,None,2),islice(x,1,None,2))

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    return izip(xiter,xiter)

def compare():
    import timeit
    for i in '1234':
        t = timeit.Timer('list(pairs.pairs%s(l))' % i,
                         'import pairs; l=range(1000)')
        print 'pairs%s: %s' % (i, t.timeit(1))

if __name__ == '__main__':
    compare()
=

marigold:python arno$ python pairs.py
pairs1: 0.789824962616
pairs2: 4.08462786674
pairs3: 2.90438890457
pairs4: 0.536775827408

pairs4 wins.

--
Arnaud

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem with processing XML

2008-01-22 Thread Alnilam
On Jan 22, 9:11 am, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote:

> By the way, is pyxml a live project or not?  Should it still be used?
> It's odd that if you go to http://www.python.org/ and click the link
> "Using python for..." XML, it leads you to http://pyxml.sourceforge.net/topics/
>
> If you then follow the download links
> to http://sourceforge.net/project/showfiles.php?group_id=6473 you see that
> the latest file is 2004, and there are no versions for newer pythons.
> It also says "PyXML is no longer maintained".  Shouldn't the link be
> removed from python.org?

I was wondering that myself. Any answer yet?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: HTML parsing confusion

2008-01-22 Thread Alnilam
On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote:
> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2,
> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous
> > 200-modules PyXML package installed. And you don't want the 75Kb
> > BeautifulSoup?
>
> I wasn't aware that I had PyXML installed, and can't find a reference
> to having it installed in pydocs. ...

Ugh. Found it. Sorry about that, but I still don't understand why
there isn't a simple way to do this without using PyXML, BeautifulSoup
or libxml2dom. What's the point in having sgmllib, htmllib,
HTMLParser, and formatter all built in if I have to use someone
else's modules to write a couple of lines of code that achieve the
simple thing I want. I get the feeling that this would be easier if I
just broke down and wrote a couple of regular expressions, but it
hardly seems a 'pythonic' way of going about things.

# get the source (assuming you don't have it locally and have an
# internet connection)
>>> import urllib
>>> page = urllib.urlopen("http://diveintopython.org/")
>>> source = page.read()
>>> page.close()

# set up some regex to find tags, strip them out, and correct some
# formatting oddities
>>> import re
>>> p = re.compile(r'(.*?)',re.DOTALL)
>>> tag_strip = re.compile(r'>(.*?)<',re.DOTALL)
>>> fix_format = re.compile(r'\n +',re.MULTILINE)

# achieve clean results.
>>> paragraphs = re.findall(p,source)
>>> text_list = re.findall(tag_strip,paragraphs[5])
>>> text = "".join(text_list)
>>> clean_text = re.sub(fix_format," ",text)

This works, and is small and easily reproduced, but seems like it
would break easily and seems a waste of other *ML specific parsers.
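For comparison, a sketch of the same task using only the standard library's HTMLParser. It is written in Python 3 syntax, where the module is `html.parser` (in Python 2.5 it is `HTMLParser`). The `ParagraphText` class and the sample markup are made up for illustration, and this is not a general fix for broken HTML: it only collects the text found inside `<p>` elements.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text of each <p> element, ignoring nested tags."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.in_paragraph = False
        self.paragraphs = []
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == 'p':
            # A new <p> implicitly closes one that was left open.
            if self.in_paragraph:
                self._flush()
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == 'p' and self.in_paragraph:
            self._flush()

    def handle_data(self, data):
        if self.in_paragraph:
            self._chunks.append(data)

    def _flush(self):
        # Join the pieces and collapse runs of whitespace and newlines.
        text = " ".join("".join(self._chunks).split())
        self.paragraphs.append(text)
        self._chunks = []
        self.in_paragraph = False


# Hypothetical sample markup standing in for the downloaded page.
SAMPLE = ("<html><body><p>Dive <b>into</b>\n   Python</p>"
          "<p>Second <a href='#'>one</a></p></body></html>")

parser = ParagraphText()
parser.feed(SAMPLE)
parser.close()
paragraphs = parser.paragraphs
```

The parser handles tag stripping and nesting, so no tag-stripping regex is needed; only the whitespace cleanup remains.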
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: isgenerator(...) - anywhere to be found?

2008-01-22 Thread Jean-Paul Calderone
On Tue, 22 Jan 2008 15:52:02 +0100, "Diez B. Roggisch" <[EMAIL PROTECTED]> 
wrote:
>Jean-Paul Calderone wrote:
>
> [snip]
>>
>> Sorry, I still don't understand.  Why is a generator different from any
>> other iterator?
>
>Because you can use send(value) on it for example. Which you can't with
>every other iterator. And that you can utilize to create a little
>framework of co-routines or however you like to call it that will yield
>values when they want, or generators if they have nested co-routines the
>scheduler needs to keep track of and invoke after another.

Ah.  Thanks for clarifying.

Jean-Paul
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pairs from a list

2008-01-22 Thread bearophileHUGS
Alan Isaac>What is the fastest way? (Ignore the import time.)<

Maybe someday someone will realize such stuff belongs to the python
STD lib...

If you need a lazy generator without padding, that splits starting
from the start, then this is the fastest for me if n is close to 2:

def xpartition(seq, n=2):
    return izip( *(iter(seq),)*n )
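The trick is that `(iter(seq),)*n` repeats the same iterator object n times, so each output tuple is built from n consecutive items of the sequence. A small sketch of the same idiom in Python 3 syntax, where the built-in `zip` is lazy and plays the role of `izip`:

```python
def xpartition(seq, n=2):
    """Lazily yield consecutive n-tuples from seq, dropping any remainder."""
    it = iter(seq)
    # The tuple holds n references to one iterator, so zip pulls
    # n successive items from it for every tuple it produces.
    return zip(*(it,) * n)


pairs = list(xpartition([1, 2, 3, 4, 5]))
triples = list(xpartition(range(7), 3))
```

Leftover items that do not fill a complete tuple are silently dropped, which is the "without padding" behaviour mentioned above.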

If you need the fastest greedy version without padding then there are
two answers, one for Psyco and one for Python without... :-)
If you need padding or to start from the end then there are more
answers...

Bye,
bearophile
-- 
http://mail.python.org/mailman/listinfo/python-list

