Re: bags? 2.5.x?
Dan Stromberg wrote: > Is there a particular reason why bags didn't go into 2.5.x or 3000? > > I keep wanting something like them - especially bags with something akin > to set union, intersection and difference. Ask yourself the following questions: * Is the feature useful for the broad mass? * Has the feature been implemented and contributed for Python? * Is the code well written, tested and documented? * Is the code mature and used by lots of people? Can you answer every question with yes? Christian -- http://mail.python.org/mailman/listinfo/python-list
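For anyone reading this thread later: a minimal sketch of the bag operations asked about above, using collections.Counter. Note that Counter entered the standard library after this thread (Python 2.7 and 3.1), so it was not an option for 2.5.x; the sample data is made up for illustration.

```python
from collections import Counter

a = Counter("aabbbc")   # bag with a:2, b:3, c:1
b = Counter("abbd")     # bag with a:1, b:2, d:1

union = a | b           # per-element maximum of the counts
intersection = a & b    # per-element minimum of the counts
difference = a - b      # subtract counts, dropping results that are not positive
total = a + b           # add counts
```

The operators mirror the set operators, with counts taking the place of simple membership.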
Re: Removing objects
On Jan 23, 6:16 pm, Asun Friere <[EMAIL PROTECTED]> wrote: > >>> x.pop(x.index(c)) Umm, of course you would simply use x.remove(c) ... force of (bad) habit. %/ -- http://mail.python.org/mailman/listinfo/python-list
Re: translating Python to Assembler
Wim Vander Schelden wrote:
> Python modules and scripts are normally not even compiled, if they have
> been, it's probably just the Python interpreter packaged with the scripts
> and resources.

No, that is not correct. Python code is compiled to Python byte code and executed inside a virtual machine, just like Java or C#. It's even possible to write code in Python assembly and compile the Python assembly into byte code.

You most certainly meant: Python code is not compiled into machine code.

Christian
Re: Extract value from a attribute in a string
On Wed, 23 Jan 2008 01:13:31 -0200, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote:
>On Tue, 22 Jan 2008 23:45:22 -0200, <[EMAIL PROTECTED]> wrote:
>
>> I am looking for some help in reading a large text file and extracting
>> a value from an attribute? so I would need to find name=foo and
>> extract just the value foo which can be at any location in the string.
>> The attribute name will be in almost each line.
>
>In this case a regular expression may be the right tool. See
>http://docs.python.org/lib/module-re.html
>
>py> import re
>py> text = """ok name=foo
>... in this line name=bar but
>... here you get name = another thing
>... is this what you want?"""
>py> for match in re.finditer(r"name\s*=\s*(\S+)", text):
>...     print match.group(1)
>...
>foo
>bar
>another

Thank you very much.
Re: pairs from a list
On Jan 23, 1:39 am, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > Given the human psychology displayed involved, in the absence of > definitive evidence one way or another it is a far safer bet to assume > that people are unnecessarily asking for "the fastest" out of a misguided > and often ignorant belief that they need it, rather than the opposite. > People who actually need a faster solution usually know enough to preface > their comments with an explanation of why their existing solution is too > slow rather than just a context-free demand for "the fastest" solution. As I mentioned already, I consider the seeking of the most efficient solution a legitimate question, regardless of whether a "dumb" solution is fast enough for an application. Call it a "don't be sloppy" principle if you wish. It's the same reason I always use xrange() instead of range() for a loop, although in practice the difference is rarely measurable. > Fast code is like fast cars. There *are* people who really genuinely need > to have the fastest car available, but that number is dwarfed by the vast > legions of tossers trying to make up for their lack of self-esteem by > buying a car with a spoiler. Yeah, you're going to be traveling SO FAST > on the way to the mall that the car is at risk of getting airborne, sure, > we believe you. > > (The above sarcasm naturally doesn't apply to those who actually do need > to travel at 200mph in a school zone, like police, taxi drivers and stock > brokers.) Good example; it shows that there's more than the utilitarian point of view. People don't buy these cars because of an actual need but rather because of the brand, the (perceived) social value and other reasons. And since you like metaphors, here's another one: caring about efficient code only when you need it is like keeping notes for a course only for the material to be included in the final exams, skipping the more encyclopedic, general knowledge lectures. 
Sure, you may pass the class, even with a good grade, but for some people a class is more than a final grade. George -- http://mail.python.org/mailman/listinfo/python-list
Re: Removing objects
[EMAIL PROTECTED] wrote:
> I am writing a game, and it must keep a list of objects. I've been
> representing this as a list, but I need an object to be able to remove
> itself. It doesn't know its own index. If I tried to make each object
> keep track of its own index, it would be invalidated when any object
> with a lower index was deleted. The error was that when I called
> list.remove(self), it just removed the first thing in the list with
> the same type as what I wanted, rather than the object I wanted. The
> objects have no identifying characteristics, other than their
> location in memory

By default, classes that do not implement the special methods __eq__ or __cmp__ get compared by identity; i.e. "(x == y) == (x is y)". Double-check your classes and their super-classes for implementations of one of these methods. mylist.remove(x) will check "x is mylist[i]" first and only check "x == mylist[i]" if that is False.

In [1]: class A(object):
   ...:     def __eq__(self, other):
   ...:         print '%r == %r' % (self, other)
   ...:         return self is other
   ...:     def __ne__(self, other):
   ...:         print '%r != %r' % (self, other)
   ...:         return self is not other
   ...:
   ...:

In [2]: As = [A() for i in range(10)]

In [3]: As
Out[3]:
[<__main__.A object at 0xf47f70>,
 <__main__.A object at 0xf47d90>,
 <__main__.A object at 0xf47db0>,
 <__main__.A object at 0xf47cb0>,
 <__main__.A object at 0xf47eb0>,
 <__main__.A object at 0xf47e70>,
 <__main__.A object at 0xf47cd0>,
 <__main__.A object at 0xf47e10>,
 <__main__.A object at 0xf47dd0>,
 <__main__.A object at 0xf47e90>]

In [4]: A0 = As[0]

In [5]: A0
Out[5]: <__main__.A object at 0xf47f70>

In [6]: As.remove(A0)

In [7]: As
Out[7]:
[<__main__.A object at 0xf47d90>,
 <__main__.A object at 0xf47db0>,
 <__main__.A object at 0xf47cb0>,
 <__main__.A object at 0xf47eb0>,
 <__main__.A object at 0xf47e70>,
 <__main__.A object at 0xf47cd0>,
 <__main__.A object at 0xf47e10>,
 <__main__.A object at 0xf47dd0>,
 <__main__.A object at 0xf47e90>]

In [8]: A0
Out[8]: <__main__.A object at 0xf47f70>

In [9]: A9 = As[-1]

In [10]: As.remove(A9)
<__main__.A object at 0xf47d90> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47db0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47cb0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47eb0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47e70> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47cd0> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47e10> == <__main__.A object at 0xf47e90>
<__main__.A object at 0xf47dd0> == <__main__.A object at 0xf47e90>

In [11]: As
Out[11]:
[<__main__.A object at 0xf47d90>,
 <__main__.A object at 0xf47db0>,
 <__main__.A object at 0xf47cb0>,
 <__main__.A object at 0xf47eb0>,
 <__main__.A object at 0xf47e70>,
 <__main__.A object at 0xf47cd0>,
 <__main__.A object at 0xf47e10>,
 <__main__.A object at 0xf47dd0>]

In [12]: A9
Out[12]: <__main__.A object at 0xf47e90>

If you cannot find an implementation of __eq__ or __cmp__ anywhere in your code, please try to make a small, self-contained example like the one above but which demonstrates your problem.

> So my question: How do I look something up in a list by its location
> in memory? does python even support pointers?

If you need to keep an __eq__ that works by equality of value instead of identity, then you could keep a dictionary keyed by the id() of the object. That will correspond to its C pointer value in memory.

In [13]: id(A9)
Out[13]: 16023184

In [14]: hex(_)
Out[14]: '0xf47e90'

> Is there a better way?

Possibly. It looks like you are implementing a cache of some kind. Depending on exactly how you are using it, you might want to consider a "weak" dictionary instead. A weak dictionary, specifically a WeakValueDictionary, acts like a normal dictionary, but only holds a weak reference to the object. A weak reference does not increment the object's reference count like a normal ("strong") reference would. Consequently, once all of the "strong" references disappear, the object will be removed from the WeakValueDictionary without your having to do anything explicit. If this corresponds with when you want the object to be removed from the cache, then you might want to try this approach. Use "id(x)" as the key if there is no more meaningful key that fits your application.

http://docs.python.org/lib/module-weakref.html

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
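The WeakValueDictionary idea described above can be sketched roughly as follows. This is only an illustration under assumptions: the Sprite class and the registry name are hypothetical, and the immediate removal shown relies on CPython's reference counting.

```python
import gc
import weakref

class Sprite(object):
    """Hypothetical game object; any weak-referenceable class would do."""
    def __init__(self, name):
        self.name = name

# Values are held weakly, so the registry never keeps an object alive.
registry = weakref.WeakValueDictionary()

player = Sprite("player")
enemy = Sprite("enemy")
registry[id(player)] = player
registry[id(enemy)] = enemy

enemy_key = id(enemy)
del enemy       # drop the last strong reference to the enemy
gc.collect()    # not needed on CPython; other implementations may need it

still_there = enemy_key in registry   # the entry vanished with the object
```

One caveat with id() keys: an id can be reused once its object is gone, so a key should not be kept around after the object it named has died.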
Re: Just for fun: Countdown numbers game solver
On Jan 22, 10:56 pm, [EMAIL PROTECTED] wrote: > Arnaud and Terry, > > Great solutions both of you! Much nicer than mine. I particularly like > Arnaud's latest one based on folding because it's so neat and > conceptually simple. For me, it's the closest so far to my goal of the > most elegant solution. Thanks! It's a great little problem to think of and it helps bring more fun to this list. Sadly work takes over fun during the week, but I will try to improve it at the weekend. > So anyone got an answer to which set of numbers gives the most targets > from 100 onwards say (or from 0 onwards)? Is Python up to the task? I bet it is :) > A thought on that last one. Two ways to improve speed. First of all, > you don't need to rerun from scratch for each target Yes, I've been doing this by writing an 'action' (see my code) that takes note of all reached results. > Secondly, you > can try multiple different sets of numbers at the same time by passing > numpy arrays instead of single values (although you have to give up > the commutativity and division by zero optimisations). Have to think about this. -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: Removing objects
On Jan 23, 5:59 pm, [EMAIL PROTECTED] wrote:
> I am writing a game, and it must keep a list of objects. I've been
> representing this as a list, but I need an object to be able to remove
> itself. It doesn't know its own index. If I tried to make each object
> keep track of its own index, it would be invalidated when any object
> with a lower index was deleted. The error was that when I called
> list.remove(self), it just removed the first thing in the list with
> the same type as what I wanted, rather than the object I wanted. The
> objects have no identifying characteristics, other than their
> location in memory
>
> So my question: How do I look something up in a list by its location
> in memory? does python even support pointers?
>
> Is there a better way?

How about adding an id attribute to your objects, which will contain a unique identifier, overriding __eq__ to use that id to compare itself to others, and then simply popping the object off using object_list.pop(object_list.index(self))?

Something like this:

>>> class Spam(object):
...     def __init__(self, id):
...         self.id = id
...     def __eq__(self, other):
...         try:
...             return self.id == other.id
...         except AttributeError:
...             return False
...
>>> a, b, c = Spam(1), Spam(2), Spam(3)
>>> x = [a, b, c]
>>> x.pop(x.index(c))
<__main__.Spam object at 0x885e5ac>

Except your object would be telling the list to pop self, of course, and you'd need some way of ensuring the uniqueness of your IDs.
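If changing __eq__ is not an option, another possibility is to search the list by identity directly. The following is a minimal sketch, not something proposed in the thread: remove_by_identity and Token are hypothetical names, and Token's deliberately sloppy __eq__ only stands in for whatever comparison the real game objects have.

```python
def remove_by_identity(items, target):
    """Delete the first element that *is* target; raise ValueError if absent."""
    for index, item in enumerate(items):
        if item is target:
            del items[index]
            return
    raise ValueError("object not in list")

class Token(object):
    """Hypothetical class whose __eq__ treats every instance as equal."""
    def __eq__(self, other):
        return isinstance(other, Token)

a, b, c = Token(), Token(), Token()
objects = [a, b, c]

# objects.remove(c) would delete a here, because a == c is True.
remove_by_identity(objects, c)
```

The scan is linear, just like list.remove(), so for large collections a dict keyed by id() may be the better fit.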
Is there an HTML parser that can reconstruct the original HTML EXACTLY?
Hi,

I am looking for an HTML parser that can parse a given page into a DOM tree and can reconstruct the exact original HTML source. Strictly speaking, I need to be able to retrieve the original source at each internal node of the DOM tree.

I have tried Beautiful Soup, which is really nice when dealing with those god-damned ill-formed documents, but unfortunately it cannot return the original source because of the tidying it does.

Since Beautiful Soup, like most of the other HTML parsers in Python, builds on sgmllib.SGMLParser to some extent, I have looked through the source code of sgmllib.SGMLParser to see whether I can make Beautiful Soup record where each tag segment sits in the HTML source, but this will be a time-consuming job.

So... any ideas?

cheers
kai liu
Re: Just for fun: Countdown numbers game solver
On Jan 22, 9:05 am, Terry Jones <[EMAIL PROTECTED]> wrote: > Hi Arnaud > > > I've tried a completely different approach, that I imagine as 'folding'. I > > thought it would improve performance over my previous effort but extremely > > limited and crude benchmarking seems to indicate disappointingly comparable > > performance... > > I wrote a stack-based version yesterday and it's also slow. It keeps track > of the stack computation and allows you to backtrack. I'll post it > sometime, but it's too slow for my liking and I need to put in some more > optimizations. I'm trying not to think about this problem. > > What was wrong with the very fast(?) code you sent earlier? I thought it was a bit convoluted, wanted to try something I thought had more potential. I think the problem with the second one is that I repeat the same 'fold' too many times. I'll take a closer look at the weekend. -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
python24 symbol file...python24.pdb
I've seen a few references on the net to a python24.pdb file. I assume it's a symbol file along the lines of the pdb files issued by microsoft for their products. Maybe I'm wrong. Has anyone seen such an animal? Also, is there source code available for python24 for Windoze? I have seen reference to source code but not in a package for Windows. thanks -- http://mail.python.org/mailman/listinfo/python-list
Re: subprocess and & (ampersand)
Steven D'Aprano wrote:
> On Tue, 22 Jan 2008 22:53:20 -0700, Steven Bethard wrote:
>
>> I'm having trouble using the subprocess module on Windows when my
>> command line includes special characters like "&" (ampersand)::
>>
>>     >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1&y=2'
>>     >>> kwargs = dict(stdin=subprocess.PIPE,
>>     ...               stdout=subprocess.PIPE,
>>     ...               stderr=subprocess.PIPE)
>>     >>> proc = subprocess.Popen(command, **kwargs)
>>     >>> proc.stderr.read()
>>     "'y' is not recognized as an internal or external command,\r\noperable program or batch file.\r\n"
>>
>> As you can see, Windows is interpreting that "&" as separating two
>> commands, instead of being part of the single argument as I intend it to
>> be above. Is there any workaround for this? How do I get "&" treated
>> like a regular character using the subprocess module?
>
> That's nothing to do with the subprocess module. As you say, it is
> Windows interpreting the ampersand as a special character, so you need to
> escape the character to the Windows shell.
>
> Under Windows, the escape character is ^, or you can put the string in
> double quotes:
>
> # untested
> command = 'lynx.bat -dump http://www.example.com/?x=1^&y=2'
> command = 'lynx.bat -dump "http://www.example.com/?x=1&y=2"'

Sorry, I should have mentioned that I already tried that. You get the same result::

    >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1^&y=2'
    >>> proc = subprocess.Popen(command,
    ...                         stdin=subprocess.PIPE,
    ...                         stdout=subprocess.PIPE,
    ...                         stderr=subprocess.PIPE)
    >>> proc.stderr.read()
    "'y' is not recognized as an internal or external command,\r\noperable program or batch file.\r\n"

In fact, the "^" doesn't seem to work at the command line either::

    >lynx.bat -dump http://www.example.com/?x=1^&y=2

    Can't Access `file://localhost/C:/PROGRA~1/lynx/1'
    Alert!: Unable to access document.

    lynx: Can't access startfile
    'y' is not recognized as an internal or external command,
    operable program or batch file.

Using quotes does work at the command line::

    C:\PROGRA~1\lynx>lynx.bat -dump "http://www.example.com/?x=1&y=2"

       You have reached this web page by typing "example.com", "example.net",
       or "example.org" into your web browser.

       These domain names are reserved for use in documentation and are not
       available for registration. See [1]RFC 2606, Section 3.

    References

       1. http://www.rfc-editor.org/rfc/rfc2606.txt

But I get no output at all when using quotes with subprocess::

    >>> command = 'lynx.bat', '-dump', '"http://www.example.com/?x=1^&y=2"'
    >>> proc = subprocess.Popen(command,
    ...                         stdin=subprocess.PIPE,
    ...                         stdout=subprocess.PIPE,
    ...                         stderr=subprocess.PIPE)
    >>> proc.stderr.read()
    ''

Any other ideas?

STeVe
Removing objects
I am writing a game, and it must keep a list of objects. I've been representing this as a list, but I need an object to be able to remove itself. It doesn't know its own index. If I tried to make each object keep track of its own index, it would be invalidated when any object with a lower index was deleted. The error was that when I called list.remove(self), it just removed the first thing in the list with the same type as what I wanted, rather than the object I wanted. The objects have no identifying characteristics, other than their location in memory.

So my question: How do I look something up in a list by its location in memory? Does Python even support pointers?

Is there a better way?
UNSUBSCRIBE
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Diez B. Roggisch
Sent: 22 January 2008 20:22
To: python-list@python.org
Subject: Re: isgenerator(...) - anywhere to be found?

Jean-Paul Calderone wrote:
> On Tue, 22 Jan 2008 15:15:43 +0100, "Diez B. Roggisch"
> <[EMAIL PROTECTED]> wrote:
>>Jean-Paul Calderone wrote:
>>
>>> On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch"
>>> <[EMAIL PROTECTED]> wrote:
>>>>For a simple greenlet/tasklet/microthreading experiment I found myself
>>>>in the need to ask the question
>>>>
>>>> [snip]
>>>
>>> Why do you need a special case for generators? If you just pass the
>>> object in question to iter(), instead, then you'll either get back
>>> something that you can iterate over, or you'll get an exception for
>>> things that aren't iterable.
>>
>>Because - as I said - I'm working on a micro-thread thingy, where the
>>scheduler needs to push returned generators to a stack and execute them.
>>Using send(), which rules out iter() anyway.
>
> Sorry, I still don't understand. Why is a generator different from any
> other iterator?

Because you can use send(value) on it, for example, which you can't with every other iterator. And that you can utilize to create a little framework of co-routines, or however you like to call it, that will yield values when they want, or generators if they have nested co-routines the scheduler needs to keep track of and invoke one after another.

I'm currently at work and can't show you the code - I don't claim that my current approach is the shizzle, but so far it serves my purposes - and I need an isgenerator()

Diez
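A minimal sketch of the check discussed above, assuming a plain isinstance test against types.GeneratorType is good enough for the scheduler. inspect.isgenerator(), which performs the same test, only arrived in Python 2.6, after this thread. The echo generator is a made-up example, and the code uses Python 3 syntax.

```python
import inspect
import types

def echo():
    """Made-up generator that yields back whatever was sent to it."""
    received = None
    while True:
        received = yield received

gen = echo()
plain_iter = iter([1, 2, 3])

gen_is_generator = isinstance(gen, types.GeneratorType)
iter_is_generator = isinstance(plain_iter, types.GeneratorType)
same_answer = inspect.isgenerator(gen)   # equivalent check, Python 2.6+

next(gen)                # prime the generator up to its first yield
reply = gen.send("hi")   # generators accept send(); list iterators do not
```

This is the distinction Diez describes: both objects are iterators, but only the generator has send().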
Re: pairs from a list
On Tue, 22 Jan 2008 18:32:22 -0800, George Sakkis wrote:

> The OP didn't mention anything about the context; for all we know, this
> might be a homework problem or the body of a tight inner loop. There is
> this tendency on c.l.py to assume that every optimization question is
> about a tiny subproblem of a 100 KLOC application. Without further
> context, we just don't know.

Funny. As far as I can tell, the usual assumption on c.l.py is that every tiny two-line piece of code is the absolute most critically important heart of an application which gets called billions of times on petabytes of data daily.

Given the human psychology involved, in the absence of definitive evidence one way or another it is a far safer bet to assume that people are unnecessarily asking for "the fastest" out of a misguided and often ignorant belief that they need it, rather than the opposite. People who actually need a faster solution usually know enough to preface their comments with an explanation of why their existing solution is too slow rather than just a context-free demand for "the fastest" solution.

Fast code is like fast cars. There *are* people who really genuinely need to have the fastest car available, but that number is dwarfed by the vast legions of tossers trying to make up for their lack of self-esteem by buying a car with a spoiler. Yeah, you're going to be traveling SO FAST on the way to the mall that the car is at risk of getting airborne, sure, we believe you.

(The above sarcasm naturally doesn't apply to those who actually do need to travel at 200mph in a school zone, like police, taxi drivers and stock brokers.)

--
Steven
Re: subprocess and & (ampersand)
On Tue, 22 Jan 2008 22:53:20 -0700, Steven Bethard wrote:

> I'm having trouble using the subprocess module on Windows when my
> command line includes special characters like "&" (ampersand)::
>
>     >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1&y=2'
>     >>> kwargs = dict(stdin=subprocess.PIPE,
>     ...               stdout=subprocess.PIPE,
>     ...               stderr=subprocess.PIPE)
>     >>> proc = subprocess.Popen(command, **kwargs)
>     >>> proc.stderr.read()
>     "'y' is not recognized as an internal or external command,\r\noperable program or batch file.\r\n"
>
> As you can see, Windows is interpreting that "&" as separating two
> commands, instead of being part of the single argument as I intend it to
> be above. Is there any workaround for this? How do I get "&" treated
> like a regular character using the subprocess module?

That's nothing to do with the subprocess module. As you say, it is Windows interpreting the ampersand as a special character, so you need to escape the character to the Windows shell.

Under Windows, the escape character is ^, or you can put the string in double quotes:

# untested
command = 'lynx.bat -dump http://www.example.com/?x=1^&y=2'
command = 'lynx.bat -dump "http://www.example.com/?x=1&y=2"'

In Linux land, you would use a backslash or quotes.

To find the answer to this question, I googled for "windows how to escape special characters shell" and found these two pages:

http://www.microsoft.com/technet/archive/winntas/deploy/prodspecs/shellscr.mspx
http://technet2.microsoft.com/WindowsServer/en/library/44500063-fdaf-4e4f-8dac-476c497a166f1033.mspx

Hope this helps,

--
Steven
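For comparison, here is a small sketch of what happens when no shell is involved at all. It assumes the child is a real executable rather than a batch file: the Python interpreter itself stands in for the target program here, which is not the lynx.bat case from this thread, since Windows runs batch files through its command interpreter. The code uses Python 3 syntax.

```python
import subprocess
import sys

url = "http://www.example.com/?x=1&y=2"

# The child just prints the first argument it receives, so we can see
# whether the ampersand survived the trip.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])", url]

# A list of arguments and no shell=True: nothing interprets the "&".
proc = subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()
received = out.decode().strip()
```

If received matches url, the argument went through intact, which suggests that the escaping trouble belongs to the shell layer and not to the argument list handed to Popen.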
subprocess and & (ampersand)
I'm having trouble using the subprocess module on Windows when my command line includes special characters like "&" (ampersand)::

    >>> command = 'lynx.bat', '-dump', 'http://www.example.com/?x=1&y=2'
    >>> kwargs = dict(stdin=subprocess.PIPE,
    ...               stdout=subprocess.PIPE,
    ...               stderr=subprocess.PIPE)
    >>> proc = subprocess.Popen(command, **kwargs)
    >>> proc.stderr.read()
    "'y' is not recognized as an internal or external command,\r\noperable program or batch file.\r\n"

As you can see, Windows is interpreting that "&" as separating two commands, instead of being part of the single argument as I intend it to be above. Is there any workaround for this? How do I get "&" treated like a regular character using the subprocess module?

Thanks,

STeVe
Re: translating Python to Assembler
On Wed, 23 Jan 2008 04:58:02 +, Grant Edwards wrote:

> On 2008-01-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
>> My expertise, if any, is in assembler. I'm trying to understand Python
>> scripts and modules by examining them after they have been disassembled
>> in a Windows environment.
>
> You can't disassemble them, since they aren't ever converted to
> assembler and assembled. Python is compiled into bytecode for a virtual
> machine (either the Java VM or the Python VM or the .NET VM).

There is the Python disassembler, dis, which disassembles the bytecode into something which might as well be "assembler" *cough* for the virtual machine.

--
Steven
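A short sketch of the dis module mentioned above. The add function is a made-up example, and dis.get_instructions() is a later addition (Python 3.4 and up); at the time of this thread you would simply call dis.dis(add) and read the printed listing.

```python
import dis

def add(a, b):
    return a + b

# The compiled bytecode lives on the function's code object.
raw_bytecode = add.__code__.co_code

# Collect the opcode names instead of printing them; dis.dis(add) prints
# the same information as a human-readable listing.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
```

The exact opcode names differ between Python versions, so treat any particular listing as version-specific.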
Re: Cleanup when a object dies
On Jan 22, 7:54 pm, Benjamin <[EMAIL PROTECTED]> wrote:
> I'm writing a class to allow settings (options, preferences) to be
> written to a file in a cross-platform manner. I'm unsure how to go about
> syncing the data to disk. Of course, it's horribly inefficient to
> write the data every time something changes a value, however I don't
> see how I can do it on deletion. I've read that __del__ methods should
> be avoided. So am I just going to have to force the client of my
> object to call sync when they're done?

Lots of ways:

1. Try the atexit module.
2. Use a weakref callback.
3. Embed a client callback in a try/finally.
4. Or, like you said, have the client call a sync() method -- this is explicit and gives the client control over when data is written.

Raymond
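Options 1, 3 and 4 from the list above could be combined along these lines. This is a rough sketch under assumptions, not Benjamin's actual class: the Settings class, its dirty flag, the JSON file format and the temporary path are all hypothetical choices made for illustration.

```python
import atexit
import json
import os
import tempfile

class Settings(object):
    """Hypothetical settings store that touches the disk only in sync()."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        self.dirty = False

    def set(self, key, value):
        self.values[key] = value
        self.dirty = True        # remember that a write is pending

    def sync(self):
        if self.dirty:           # skip the write when nothing has changed
            with open(self.path, "w") as handle:
                json.dump(self.values, handle)
            self.dirty = False

path = os.path.join(tempfile.mkdtemp(), "settings.json")
settings = Settings(path)
atexit.register(settings.sync)   # option 1: safety net at interpreter exit

try:                             # option 3: flush even if the client code fails
    settings.set("theme", "dark")
finally:
    settings.sync()              # option 4: the explicit call
```

Because sync() does nothing when no change is pending, registering it with atexit costs nothing for clients that already call it themselves.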
Re: translating Python to Assembler
On 2008-01-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> My expertise, if any, is in assembler. I'm trying to
> understand Python scripts and modules by examining them after
> they have been disassembled in a Windows environment.

You can't disassemble them, since they aren't ever converted to assembler and assembled. Python is compiled into bytecode for a virtual machine (either the Java VM or the Python VM or the .NET VM).

> I'm wondering if a Python symbols file is available.

You're way off track.

> In the Windows environment, a symbol file normally has a PDB
> extension. It's a little unfortunate that Python also uses PDB
> for its debugger. Google, for whatever reason, won't accept
> queries with dots, hyphens, etc., in the query line. For
> example a Google for "python.pdb" returns +python +pdb, so I
> get a ridiculous number of returns referring to the python
> debugger. I have mentioned this to Google several times, but I
> guess logic isn't one of their strong points. :-)

Trying to find assembly language stuff to look at is futile. Python doesn't get compiled into assembly language.

If you want to learn Python, then read a book on Python.

--
Grant Edwards (grante at visi.com)
Yow! I am NOT a nut
monitoring device status with python ...
hi everyone: i am writing a program, which needs to keep monitoring whether a certain usb hard drive is connected/hot-plugged in or not. instead of repeatedly checking if its path exists or not, can i have the os let my program know that the device has been connected? i have read about the minihallib module but have not come across an elaborate example. can any of you point me to any examples (or alternatives)? id appreciate any help. regards, -ajay -- http://mail.python.org/mailman/listinfo/python-list
Cleanup when a object dies
I'm writing a class that allows settings (options, preferences) to be written to a file in a cross-platform manner. I'm unsure how to go about syncing the data to disk. Of course, it's horribly inefficient to write the data every time something changes a value; however, I don't see how I can do it on deletion. I've read that __del__ methods should be avoided. So am I just going to have to force the client of my object to call sync when they're done? -- http://mail.python.org/mailman/listinfo/python-list
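One possible way to approach the question above, offered as a sketch and not as an answer taken from the thread: keep an explicit sync(), but also make the object a context manager and register sync() with atexit as a safety net, so __del__ is not needed. The class name Settings, the JSON file format and the _dirty flag are all assumptions made for this illustration, and the code is written for Python 3.

```python
import atexit
import json
import os
import tempfile


class Settings:
    """Hypothetical settings store that writes to disk only when asked."""

    def __init__(self, path):
        self.path = path
        self._data = {}
        self._dirty = False
        if os.path.exists(path):
            with open(path) as f:
                self._data = json.load(f)
        # Safety net: flush unsaved changes at normal interpreter exit.
        atexit.register(self.sync)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self._dirty = True  # remember that the file is out of date

    def sync(self):
        """Write to disk only if something changed since the last sync."""
        if not self._dirty:
            return
        with open(self.path, "w") as f:
            json.dump(self._data, f)
        self._dirty = False

    # Context-manager support, so "with Settings(...) as s:" syncs on exit.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.sync()
        return False  # do not swallow exceptions


def demo():
    """Write one value inside a with-block, then read it back from disk."""
    path = os.path.join(tempfile.mkdtemp(), "settings.json")
    with Settings(path) as s:
        s["colour"] = "blue"
    return Settings(path)["colour"]
```

With this shape the client can still call sync() by hand, but a with-block or a normal interpreter exit covers the case where they forget. A crash or os._exit() would still lose unsaved changes.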
Re: HTML parsing confusion
On Jan 22, 7:29 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote: > > > I was asking this community if there was a simple way to use only the > > tools included with Python to parse a bit of html. > > If you *know* that your document is valid HTML, you can use the HTMLParser > module in the standard Python library. Or even the parser in the htmllib > module. But a lot of HTML pages out there are invalid, some are grossly > invalid, and those parsers are just unable to handle them. This is why > modules like BeautifulSoup exist: they contain a lot of heuristics and > trial-and-error and personal experience from the developers, in order to > guess more or less what the page author intended to write and make some > sense of that "tag soup". > A guesswork like that is not suitable for the std lib ("Errors should > never pass silently" and "In the face of ambiguity, refuse the temptation > to guess.") but makes a perfect 3rd party module. > > If you want to use regular expressions, and that works OK for the > documents you are handling now, fine. But don't complain when your RE's > match too much or too little or don't match at all because of unclosed > tags, improperly nested tags, nonsense markup, or just a valid combination > that you didn't take into account. > > -- > Gabriel Genellina Thanks, Gabriel. That does make sense, both what the benefits of BeautifulSoup are and why it probably won't become std lib anytime soon. The pages I'm trying to write this code to run against aren't in the wild, though. They are static html files on my company's lan, are very consistent in format, and are (I believe) valid html. They just have specific paragraphs of useful information, located in the same place in each file, that I want to 'harvest' and put to better use. I used diveintopython.org as an example only (and in part because it had good clean html formatting). 
I am pretty sure that I could craft some regular expressions to do the work -- which of course would not be the case if I were screen scraping web pages in the 'wild' -- but I was trying to find a way to do that using one of those std libs you mentioned. I'm not sure if HTMLParser or htmllib would work better to achieve the same effect as the regex example I gave above, or how to get them to do that. I thought I'd come close, but as someone pointed out early on, I'd accidentally tapped into PyXML, which is installed where I was testing code, but not necessarily where I need it. It may turn out that the regex way works faster, but falling back on methods I'm comfortable with doesn't help expand my Python knowledge. So if anyone can tell me how to get HTMLParser or htmllib to grab a specific paragraph, and then provide the text in that paragraph in a clean, markup-free format, I'd appreciate it. -- http://mail.python.org/mailman/listinfo/python-list
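A minimal sketch of what the request above could look like with only the standard library. It is written for Python 3, where the HTMLParser module discussed in this thread lives at html.parser. The class name ParagraphGrabber and the helper nth_paragraph are made up for this example, and it assumes well-formed pages with explicitly closed, non-nested <p> elements, as the poster describes.

```python
from html.parser import HTMLParser  # the "HTMLParser" module in Python 2


class ParagraphGrabber(HTMLParser):
    """Collect the plain text of every <p> element, in document order."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraphs, markup removed
        self._in_p = False     # True while between <p> and </p>
        self._chunks = []      # text pieces of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._chunks).split())
            self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        # Text inside nested inline tags (<b>, <a>, ...) arrives here too.
        if self._in_p:
            self._chunks.append(data)


def nth_paragraph(html, n):
    """Return the markup-free text of the n-th <p> (counting from 0)."""
    parser = ParagraphGrabber()
    parser.feed(html)
    parser.close()
    return parser.paragraphs[n]
```

Because the target files are said to keep the useful paragraph in the same place, picking it by position is probably enough. Selecting by an id or class attribute would mean inspecting attrs in handle_starttag.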
Re: possible to overide setattr in local scope?
"glomde" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] | In a class it is poosible to override setattr, so that you can decide | how you should | handle setting of variables. | | Is this possible to do outside of an class on module level. | | mysetattr(obj, var, value): | print "Hello" | | So that | | test = 5 | | | would print | Hello An assignment at module level amounts to setting an attribute of an instance of the builtin (C coded) module type, which you cannot change. Even if you can subclass that type (I don't know), there is no way to get the (stock) interpreter to use instances of your module subclass instead. -- http://mail.python.org/mailman/listinfo/python-list
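A small illustration of the point made in the reply, written for Python 3 and meant only as a sketch. The module type can in fact be subclassed (types.ModuleType), but a hook on the subclass only sees attribute assignments made from outside. An ordinary assignment executed inside the module's namespace writes straight into the namespace dictionary. The names LoggingModule and calls are invented for this example.

```python
import types

calls = []  # names seen by the hook


class LoggingModule(types.ModuleType):
    """Hypothetical module subclass that records attribute assignments."""

    def __setattr__(self, name, value):
        calls.append(name)
        super().__setattr__(name, value)


mod = LoggingModule("demo")

# Attribute assignment from outside goes through __setattr__ ...
mod.test = 5

# ... but a plain assignment executed in the module's namespace writes
# straight into the namespace dict and never calls the hook.
exec("other = 6", mod.__dict__)
```

After running this, calls contains "test" but not "other", even though mod.other exists. So a line such as "test = 5" at module level cannot be intercepted this way.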
Re: translating Python to Assembler...sorry if this is duplicated...it's unintentional
On Jan 22, 4:45 pm, [EMAIL PROTECTED] wrote: > My expertise, if any, is in assembler. I'm trying to understand Python > scripts and modules by examining them after they have been > disassembled in a Windows environment. > > I'm wondering if a Python symbols file is available. In the Windows > environment, a symbol file normally has a PDB extension. It's a little > unfortunate that Python also uses PDB for its debugger. Google, for > whatever reason, wont accept queries with dots, hyphens, etc., in the > query line. For example a Google for "python.pdb" returns +python > +pdb, so I get a ridiculous number of returns referring to the python > debugger. I have mentioned this to Google several times, but I guess > logic isn't one of their strong points. :-) > > If there's dupicates of this post it's because it wouldn't send for > some reason. I'm not sure what you're talking about...mainly because I'm not sure what you mean by a "symbols file". But I did some google-fu myself and found this CookBook entry: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/200638 And this thread seems to be talking about symbol resolution, I think: http://www.python.org/search/hypermail/python-1994q2/0605.html And here's some weird site that claims to have a list of inseparable symbols, whatever that means: voicecode.iit.nrc.ca/VCodeWiki/public/wiki.cgi? obj=ListOfUnseparablePythonSymbols I can't get it to load unless I use Google's cached version though. Hope that helps and that I'm not too far off the mark! Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: HTML parsing confusion
On Jan 22, 7:29 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote: > > > I was asking this community if there was a simple way to use only the > > tools included with Python to parse a bit of html. > > If you *know* that your document is valid HTML, you can use the HTMLParser > module in the standard Python library. Or even the parser in the htmllib > module. But a lot of HTML pages out there are invalid, some are grossly > invalid, and those parsers are just unable to handle them. This is why > modules like BeautifulSoup exist: they contain a lot of heuristics and > trial-and-error and personal experience from the developers, in order to > guess more or less what the page author intended to write and make some > sense of that "tag soup". > A guesswork like that is not suitable for the std lib ("Errors should > never pass silently" and "In the face of ambiguity, refuse the temptation > to guess.") but makes a perfect 3rd party module. > > If you want to use regular expressions, and that works OK for the > documents you are handling now, fine. But don't complain when your RE's > match too much or too little or don't match at all because of unclosed > tags, improperly nested tags, nonsense markup, or just a valid combination > that you didn't take into account. > > -- > Gabriel Genellina Thank you. That does make perfect sense, and is a good clear position on the up and down side of what I'm trying to do, as well as a good explanation for why BeautifulSoup will probably remain outside the std lib. I'm sure that I will get plenty of use out of it. If, however, I am sure that the html code in target documents is good, and the framework html doesn't change, just the data on page after page of static html, would it be better to just go with regex or with one of the std lib items you mentioned. I thought the latter, but I'm stuck on how to make them generate results similar to the code I put above as an example. 
I'm not trying to code this to go against html in the wild, but to try to strip specific, consistently located data from the markup and turn it into something more useful. I may have confused folks by using the www.diveintopython.org page as an example, but its html seemed to be valid strict tags. -- http://mail.python.org/mailman/listinfo/python-list
Re: Extract value from a attribute in a string
En Tue, 22 Jan 2008 23:45:22 -0200, <[EMAIL PROTECTED]> escribió:

> I am looking for some help in reading a large text tile and extracting
> a value from an attribute? so I would need to find name=foo and
> extract just the value foo which can be at any location in the string.
> The attribute name will be in almost each line.

In this case a regular expression may be the right tool. See
http://docs.python.org/lib/module-re.html

py> import re
py> text = """ok name=foo
... in this line name=bar but
... here you get name = another thing
... is this what you want?"""
py> for match in re.finditer(r"name\s*=\s*(\S+)", text):
...     print match.group(1)
...
foo
bar
another

-- 
Gabriel Genellina
-- http://mail.python.org/mailman/listinfo/python-list
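The same idea as the session above, restated for Python 3 and wrapped in a function so it can be fed a large file one line at a time. The pattern is the one from the reply. The function name extract_names and the sample variable are assumptions made for this sketch.

```python
import re

# Same pattern as in the reply: "name", optional spaces around "=",
# then one run of non-whitespace characters.
NAME_RE = re.compile(r"name\s*=\s*(\S+)")


def extract_names(lines):
    """Yield every value found after 'name=' in an iterable of lines."""
    for line in lines:
        for match in NAME_RE.finditer(line):
            yield match.group(1)


sample = """ok name=foo
in this line name=bar but
here you get name = another thing
is this what you want?"""

values = list(extract_names(sample.splitlines()))
```

Since an open file object is itself an iterable of lines, something like extract_names(open(path)) would avoid reading the whole file into memory. Note that the pattern would also match inside a longer word such as "filename=x". Adding \b in front of "name" is one way to tighten it if that matters.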
Re: Don't want child process inheriting open sockets
En Tue, 22 Jan 2008 13:02:35 -0200, Steven Watanabe <[EMAIL PROTECTED]> escribió: > I'm using subprocess.Popen() to create a child process. The child > process is inheriting the parent process' open sockets, but I don't want > that. I believe that on Unix systems I could use the FD_CLOEXEC flag, > but I'm running Windows. Any suggestions? You could use the DuplicateHandle Windows API function with bInheritHandle=False to create a non inheritable socket handle, then close the original one. This should be done for every socket you don't want to be inherited. -- Gabriel Genellina -- http://mail.python.org/mailman/listinfo/python-list
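For readers on a much newer interpreter than the one in this thread, there is a simpler route than calling DuplicateHandle yourself. Since Python 3.4 (PEP 446) newly created sockets are non-inheritable by default, and socket objects have get_inheritable() and set_inheritable() methods. The snippet below is a sketch of that alternative under the assumption of Python 3.4 or later. It is not the approach described in the reply.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Since Python 3.4 new sockets are non-inheritable by default.
default_flag = sock.get_inheritable()

# The flag can be toggled explicitly; make sure it is off before spawning
# a child with subprocess.Popen so the child does not receive the handle.
sock.set_inheritable(True)
sock.set_inheritable(False)
final_flag = sock.get_inheritable()

sock.close()
```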
Re: pairs from a list
On Jan 22, 1:34 pm, Paddy <[EMAIL PROTECTED]> wrote: > On Jan 22, 5:34 am, George Sakkis <[EMAIL PROTECTED]> wrote: > > > > > On Jan 22, 12:15 am, Paddy <[EMAIL PROTECTED]> wrote: > > > > On Jan 22, 3:20 am, Alan Isaac <[EMAIL PROTECTED]> wrote:> I want to > > > generate sequential pairs from a list. > > > <> > > > > What is the fastest way? (Ignore the import time.) > > > > 1) How fast is the method you have? > > > 2) How much faster does it need to be for your application? > > > 3) Are their any other bottlenecks in your application? > > > 4) Is this the routine whose smallest % speed-up would give the > > > largest overall speed up of your application? > > > I believe the "what is the fastest way" question for such small well- > > defined tasks is worth asking on its own, regardless of whether it > > makes a difference in the application (or even if there is no > > application to begin with). > > Hi George, > You need to 'get it right' first. For such trivial problems, getting it right alone isn't a particularly high expectation. > Micro optimizations for speed > without thought of the wider context is a bad habit to form and a time > waster. The OP didn't mention anything about the context; for all we know, this might be a homework problem or the body of a tight inner loop. There is this tendency on c.l.py to assume that every optimization question is about a tiny subproblem of a 100 KLOC application. Without further context, we just don't know. > If the routine is all that needs to be delivered and it does not > perform at an acceptable speed then find out what is acceptable > and optimise towards that goal. My questions were set to get > posters to think more about the need for speed optimizations and > where they should be applied, (if at all). I don't agree with this logic in general. 
Just because one can solve a problem by throwing a quick and dirty hack with quadratic complexity that happens to do well enough on current typical input, it doesn't mean he shouldn't spend ten or thirty minutes more to write a proper linear time solution, all else being equal or at least comparable (elegance, conciseness, readability, etc.). Of course it's a tradeoff; spending a week to save a few milliseconds on average is usually a waste for most applications, but being a lazy keyboard banger writing the first thing that pops into mind is not that good either. George -- http://mail.python.org/mailman/listinfo/python-list
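The original poster's code is elided in the quotes above, so the following is only a guess at the task. It assumes "sequential pairs" means non-overlapping consecutive pairs (items 0 and 1, then 2 and 3, and so on). Both function names are made up. The sketch shows two linear-time candidates and how timeit could be used to compare them, which is the sort of measurement either side of this argument would need.

```python
import timeit


def pairs_slice(seq):
    """Pair items 0-1, 2-3, ... using two strided slices (copies the list)."""
    return list(zip(seq[0::2], seq[1::2]))


def pairs_iter(seq):
    """Same result using one shared iterator, without copying the input."""
    it = iter(seq)
    return list(zip(it, it))


data = list(range(10))
result_slice = pairs_slice(data)
result_iter = pairs_iter(data)

if __name__ == "__main__":
    # Timings vary by machine and Python version; no numbers are claimed here.
    for fn in (pairs_slice, pairs_iter):
        print(fn.__name__, timeit.timeit(lambda: fn(data), number=10000))
```

Both versions silently drop a trailing unpaired item when the input has odd length. Whether that is acceptable depends on context the thread does not give.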
Re: Submitting with PAMIE
On Jan 22, 7:49 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote: > En Tue, 22 Jan 2008 15:39:33 -0200, <[EMAIL PROTECTED]> escribió: > > > Hi I really need help. I've been looking around for an answer forever. > > I need to submit a form with no name and also the submit button has no > > name or value. How might I go about doing either of these. Thanks > > I think you'll have more luck in a specific forum like the PAMIE User > Group at http://tech.groups.yahoo.com/group/Pamie_UsersGroup/ > > -- > Gabriel Genellina Thanks, I signed up and am awaiting approval. Hopefully I can get help soon. -- http://mail.python.org/mailman/listinfo/python-list
Extract value from a attribute in a string
Hello, I am looking for some help in reading a large text file and extracting a value from an attribute. I need to find name=foo and extract just the value foo, which can be at any location in the string. The attribute name will be in almost every line. Thank you for any suggestions. -- http://mail.python.org/mailman/listinfo/python-list
Re: Using utidylib, empty string returned in some cases
En Tue, 22 Jan 2008 15:35:16 -0200, Boris <[EMAIL PROTECTED]> escribió: > I'm using debian linux, Python 2.4.4, and utidylib > (http://utidylib.berlios.de/). I wrote simple functions to get a web page, > convert it from windows-1251 to utf8 and then I'd like to clean html > with it. Why the intermediate conversion? I don't know utidylib, but can't you feed it with the original page, in the original encoding? If the page itself contains a "meta http-equiv" tag stating its content-type and charset, it won't be valid anymore if you reencode the page. -- Gabriel Genellina -- http://mail.python.org/mailman/listinfo/python-list
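The concern about the meta tag can be shown with a few lines of Python 3. This is only an illustration with made-up markup and does not use utidylib. It re-encodes a windows-1251 page to UTF-8 and then checks what a consumer that trusts the now stale charset declaration would see.

```python
# A page as it might arrive from the server: windows-1251 bytes whose
# <meta> tag correctly declares that encoding. The markup is made up.
original = (
    '<meta http-equiv="Content-Type" '
    'content="text/html; charset=windows-1251">'
    "<p>Привет</p>"
).encode("cp1251")

# Re-encoding changes the bytes but not the declaration inside them.
reencoded = original.decode("cp1251").encode("utf-8")
still_declares_1251 = b"charset=windows-1251" in reencoded

# A consumer that trusts the stale declaration now reads garbage.
misread = reencoded.decode("cp1251", errors="replace")
text_survives = "Привет" in misread
```

So if the conversion is kept, the declaration would have to be rewritten as well. Otherwise it seems safer to hand the original bytes to the cleaner and tell it the input encoding, if it offers such an option.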
Re: Core Python Programming . . .
> > 6-11 Conversion. > > (a) Create a program that will convert from an integer to an > > Internet Protocol (IP) address in the four-octet format of WWW.XXX.YYY.ZZZ > > (b) Update your program to be able to do the vice verse of the above. > > I think it's is asking to convert a 32-bit int to the dotted form. > > It's a little known fact, but IP addresses are valid in non-dotted > long-int form. Spammers commonly use this trick to disguise their IP > addresses in emails from scanners. that is correct. don't read too much into it. i'm not trying to validate anything or any format, use old or new technology. it is simply to exercise your skills with numbers (specifically 32-bit/4- byte integers), string manipulation, and bitwise operations. if you wish to use different sizes of numbers, forms of addressing, IPv6, etc., that's up to you. don't forget about part (b), which is to take an IP address and turn it into a 32-bit integer. enjoy! -- wesley ps. since you're on p. 248, there is also a typo in the piece of code right above this exercise, Example 6.4, which is tied to exercise 6-7. "'fac_list'" should really be "`fac_list`", or even better, "repr(fac_list)". see the Errata at the book's website http://corepython.com for more details. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - "Core Python Programming", Prentice Hall, (c)2007,2001 http://corepython.com wesley.j.chun :: wescpy-at-gmail.com python training and technical consulting cyberweb.consulting : silicon valley, ca http://cyberwebconsulting.com -- http://mail.python.org/mailman/listinfo/python-list
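One way the exercise could be approached, given here as a sketch and not as the book's solution. It is written for Python 3, covers IPv4 only, and uses shifts and masks as the author suggests. The function names int_to_ip and ip_to_int are my own.

```python
def int_to_ip(n):
    """Convert a 32-bit integer to dotted-quad form using shifts and masks."""
    if not 0 <= n <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit unsigned integer: %r" % (n,))
    # Take the four bytes from most significant to least significant.
    octets = [(n >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(o) for o in octets)


def ip_to_int(address):
    """Part (b): convert a dotted-quad string back to a 32-bit integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four octets: %r" % (address,))
    n = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("octet out of range: %r" % (part,))
        n = (n << 8) | octet  # make room for the next byte, then add it
    return n
```

For example, int_to_ip(3232235777) gives "192.168.1.1", and ip_to_int reverses it.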
Re: translating Python to Assembler
I second Wim's opinion. Learn Python as a high-level language; you won't regret it. About Google, I'll give you a little Google tip: > > For example a Google for "python.pdb" returns +python > > +pdb, so I get a ridiculous number of returns referring to the python > > debugger. I have mentioned this to Google several times, but I guess > > logic isn't one of their strong points. :-) Instead of searching 'python.pdb' try the query "filetype:pdb python", or even "python pdb" (quoted). The first one would give you files with the pdb extension and python in the name or contents, and the second one (quoted) should return pages with both words together, except for commas, spaces, dots, slashes, etc. However... one of the second query's results is this thread in Google Groups... not a good sign. -- Luis Zarrabeitia Facultad de Matemática y Computación, UH http://profesores.matcom.uh.cu/~kyrie Quoting Wim Vander Schelden <[EMAIL PROTECTED]>: > Python modules and scripts are normally not even compiled, if they have > been, > its probably just the Python interpreter packaged with the scripts and > resources. > > My advice is that if you want to learn Python, is that you just read a book > about > it or read only resources. Learning Python from assembler is kind of... > strange. > > Not only are you skipping several generations of programming languages, > spanned > over a period of 40 years, but the approach to programming in Python is so > fundamentally different from assembler programming that there is simply no > reason > to start looking at if from this perspective. > > I truly hope you enjoy the world of high end programming languages, but > treat them > as such. Looking at them in a low-level representation or for a low-level > perspective > doesn't bear much fruits. > > Kind regards, > > Wim > > On 1/22/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > > > My expertise, if any, is in assembler.
I'm trying to understand Python > > scripts and modules by examining them after they have been > > disassembled in a Windows environment. > > > > I'm wondering if a Python symbols file is available. In the Windows > > environment, a symbol file normally has a PDB extension. It's a little > > unfortunate that Python also uses PDB for its debugger. Google, for > > whatever reason, wont accept queries with dots, hyphens, etc., in the > > query line. For example a Google for "python.pdb" returns +python > > +pdb, so I get a ridiculous number of returns referring to the python > > debugger. I have mentioned this to Google several times, but I guess > > logic isn't one of their strong points. :-) > > -- > > http://mail.python.org/mailman/listinfo/python-list > > > -- "Al mundo nuevo corresponde la Universidad nueva" UNIVERSIDAD DE LA HABANA 280 aniversario -- http://mail.python.org/mailman/listinfo/python-list
Re: Submitting with PAMIE
En Tue, 22 Jan 2008 15:39:33 -0200, <[EMAIL PROTECTED]> escribió: > Hi I really need help. I've been looking around for an answer forever. > I need to submit a form with no name and also the submit button has no > name or value. How might I go about doing either of these. Thanks I think you'll have more luck in a specific forum like the PAMIE User Group at http://tech.groups.yahoo.com/group/Pamie_UsersGroup/ -- Gabriel Genellina -- http://mail.python.org/mailman/listinfo/python-list
Re: translating Python to Assembler
The reason you were finding a Python Debugger when looking for the PDB files is because PDB is Python DeBugger! Also why would you be looking for a PDB file if you can read the C source! On Jan 22, 2008 11:55 PM, Wim Vander Schelden <[EMAIL PROTECTED]> wrote: > Python modules and scripts are normally not even compiled, if they have > been, > its probably just the Python interpreter packaged with the scripts and > resources. > > My advice is that if you want to learn Python, is that you just read a book > about > it or read only resources. Learning Python from assembler is kind of... > strange. > > Not only are you skipping several generations of programming languages, > spanned > over a period of 40 years, but the approach to programming in Python is so > fundamentally different from assembler programming that there is simply no > reason > to start looking at if from this perspective. > > I truly hope you enjoy the world of high end programming languages, but > treat them > as such. Looking at them in a low-level representation or for a low-level > perspective > doesn't bear much fruits. > > Kind regards, > > Wim > > > > On 1/22/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > My expertise, if any, is in assembler. I'm trying to understand Python > > scripts and modules by examining them after they have been > > disassembled in a Windows environment. > > > > I'm wondering if a Python symbols file is available. In the Windows > > environment, a symbol file normally has a PDB extension. It's a little > > unfortunate that Python also uses PDB for its debugger. Google, for > > whatever reason, wont accept queries with dots, hyphens, etc., in the > > query line. For example a Google for "python.pdb" returns +python > > +pdb, so I get a ridiculous number of returns referring to the python > > debugger. I have mentioned this to Google several times, but I guess > > logic isn't one of their strong points. 
:-) > > -- > > http://mail.python.org/mailman/listinfo/python-list > > -- > http://mail.python.org/mailman/listinfo/python-list -- http://mail.python.org/mailman/listinfo/python-list
Re: HTML parsing confusion
En Tue, 22 Jan 2008 19:20:32 -0200, Alnilam <[EMAIL PROTECTED]> escribió: > On Jan 22, 11:39 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >> Alnilam wrote: >> > On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote: >> >> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2, >> >> > -1)) doesn't have an xml.dom.ext ... you must have the >> mega-monstrous >> >> > 200-modules PyXML package installed. And you don't want the 75Kb >> >> > BeautifulSoup? >> > Ugh. Found it. Sorry about that, but I still don't understand why >> > there isn't a simple way to do this without using PyXML, BeautifulSoup >> > or libxml2dom. What's the point in having sgmllib, htmllib, >> > HTMLParser, and formatter all built in if I have to use use someone >> > else's modules to write a couple of lines of code that achieve the >> > simple thing I want. I get the feeling that this would be easier if I >> > just broke down and wrote a couple of regular expressions, but it >> > hardly seems a 'pythonic' way of going about things. >> >> This is simply a gross misunderstanding of what BeautifulSoup or lxml >> accomplish. Dealing with mal-formatted HTML whilst trying to make _some_ >> sense is by no means trivial. And just because you can come up with a >> few >> lines of code using rexes that work for your current use-case doesn't >> mean >> that they serve as general html-fixing-routine. Or do you think the >> rather >> long history and 75Kb of code for BS are because it's creator wasn't >> aware >> of rexes? > > I am, by no means, trying to trivialize the work that goes into > creating the numerous modules out there. However as a relatively > novice programmer trying to figure out something, the fact that these > modules are pushed on people with such zealous devotion that you take > offense at my desire to not use them gives me a bit of pause. 
I use > non-included modules for tasks that require them, when the capability > to do something clearly can't be done easily another way (eg. > MySQLdb). I am sure that there will be plenty of times where I will > use BeautifulSoup. In this instance, however, I was trying to solve a > specific problem which I attempted to lay out clearly from the > outset. > > I was asking this community if there was a simple way to use only the > tools included with Python to parse a bit of html. If you *know* that your document is valid HTML, you can use the HTMLParser module in the standard Python library. Or even the parser in the htmllib module. But a lot of HTML pages out there are invalid, some are grossly invalid, and those parsers are just unable to handle them. This is why modules like BeautifulSoup exist: they contain a lot of heuristics and trial-and-error and personal experience from the developers, in order to guess more or less what the page author intended to write and make some sense of that "tag soup". A guesswork like that is not suitable for the std lib ("Errors should never pass silently" and "In the face of ambiguity, refuse the temptation to guess.") but makes a perfect 3rd party module. If you want to use regular expressions, and that works OK for the documents you are handling now, fine. But don't complain when your RE's match too much or too little or don't match at all because of unclosed tags, improperly nested tags, nonsense markup, or just a valid combination that you didn't take into account. -- Gabriel Genellina -- http://mail.python.org/mailman/listinfo/python-list
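The warning about regular expressions can be made concrete with a short Python 3 sketch. The pattern and the sample strings are invented for illustration. The point is only that each variation is still legal HTML, yet the naive pattern returns nothing or the wrong text.

```python
import re

# A naive pattern of the kind that works on tidy, hand-checked pages.
PARA_RE = re.compile(r"<p>(.*?)</p>")

tidy = "<p>first</p><p>second</p>"
tidy_hits = PARA_RE.findall(tidy)

# Still valid HTML, but each variation defeats the pattern:
with_attr = '<p class="intro">first</p>'   # attribute on the tag
multiline = "<p>first\nline</p>"           # "." does not match newline
unclosed = "<p>first<p>second</p>"         # </p> is optional in HTML

attr_hits = PARA_RE.findall(with_attr)
multiline_hits = PARA_RE.findall(multiline)
unclosed_hits = PARA_RE.findall(unclosed)
```

Here tidy_hits holds both paragraphs, attr_hits and multiline_hits are empty, and unclosed_hits contains a single string with markup still inside it. Each case can be patched (re.DOTALL, a looser tag pattern, and so on), which is how such expressions tend to grow over time.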
Re: UDP Client/Server
2008/1/22, Martin Marcher <[EMAIL PROTECTED]>: > Hello, > > I created a really simple udp server and protocol but I only get every 2nd > request (and thus answer just every second request). > > Maybe someone could shed some light, I'm lost in the dark(tm), sorry if this > is a bit oververbose but to me everything that happens here is black magic, > and I have no clue where the packages go. I can't think of a simpler > protocol than to just receive a fixed max UDP packet size and answer > immediately (read an "echo" server). > > thanks > martin > > > ### server > >>> from socket import * > >>> import SocketServer > >>> from SocketServer import BaseRequestHandler, UDPServer > >>> class FooReceiveServer(SocketServer.UDPServer): > ... def __init__(self): > ... SocketServer.UDPServer.__init__(self, ("localhost", 4321), > FooRequestHandler) > ... > >>> class FooRequestHandler(BaseRequestHandler): > ... def handle(self): > ... data, addr_info = self.request[1].recvfrom(65534) Your FooReceiveServer subclasses UDPServer, it already handled the recvfrom for you, so, this is wrong. > ... print data > ... print addr_info > ... self.request[1].sendto("response", addr_info) > ... > >>> f = FooReceiveServer() > >>> f.serve_forever() > request 0 > ('127.0.0.1', 32884) > request 1 > ('127.0.0.1', 32884) > request 2 > ('127.0.0.1', 32884) > request 2 > ('127.0.0.1', 32884) > request 2 > ('127.0.0.1', 32884) > > > > ### client > >>> target = ('127.0.0.1', 4321) > >>> from socket import * > >>> s = socket(AF_INET, SOCK_DGRAM) > >>> for i in range(10): > ... s.sendto("request " + str(i), target) > ... s.recv(65534) > ... > 9 > Traceback (most recent call last): > File "", line 3, in > KeyboardInterrupt > >>> s.sendto("request " + str(i), target) > 9 > >>> str(i) > '0' > >>> for i in range(10): > ... s.sendto("request " + str(i), target) > ... s.recv(65534) > ... 
> 9 > 'response' > 9 > 'response' > 9 > Traceback (most recent call last): > File "", line 3, in > KeyboardInterrupt > >>> #this was hanging, why? > ... > >>> s.sendto("request " + str(i), target) > 9 > >>> s.recv(65534) > 'response' > >>> s.sendto("request " + str(i), target) > 9 > >>> s.recv(65534) > Traceback (most recent call last): > File "", line 1, in > KeyboardInterrupt > >>> s.sendto("request " + str(i), target) > 9 > >>> s.sendto("request " + str(i), target) > 9 > >>> s.recv(65534) > 'response' > >>> s.recv(65534) > Traceback (most recent call last): > File "", line 1, in > KeyboardInterrupt > >>> s.sendto("request " + str(i), target) > 9 > >>> > > -- > http://noneisyours.marcher.name > http://feeds.feedburner.com/NoneIsYours > > You are not free to read this message, > by doing so, you have violated my licence > and are required to urinate publicly. Thank you. > > -- > http://mail.python.org/mailman/listinfo/python-list > -- -- Guilherme H. Polo Goncalves -- http://mail.python.org/mailman/listinfo/python-list
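To spell out the correction made in this reply: with a UDPServer the datagram has already been read when handle() runs, and self.request is a (data, socket) pair. Calling recvfrom() again inside the handler waits for the next datagram and consumes it, which would explain why only every second request got an answer. The following is a sketch of a corrected handler in Python 3, where the module is named socketserver. The names EchoHandler and echo_roundtrip, the reply text and the use of port 0 are choices made for this example.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair: the
        # datagram has already been read, so recvfrom() is not called here.
        data, sock = self.request
        sock.sendto(b"response to " + data, self.client_address)


def echo_roundtrip(count):
    """Send `count` datagrams to a throwaway local server; return replies."""
    # Port 0 asks the OS for any free port (a choice made for this demo).
    server = socketserver.UDPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    replies = []
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(5.0)  # fail instead of hanging if a reply is lost
    try:
        for i in range(count):
            client.sendto(b"request %d" % i, server.server_address)
            replies.append(client.recv(65535))
    finally:
        client.close()
        server.shutdown()
        server.server_close()
    return replies
```

The client timeout is there because UDP gives no delivery guarantee. Without it a single lost datagram would leave recv() blocked, much like the hanging calls in the session quoted above.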
Re: Processing XML that's embedded in HTML
On Jan 22, 10:57 am, Mike Driscoll <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I need to parse a fairly complex HTML page that has XML embedded in
> it. I've done parsing before with the xml.dom.minidom module on just
> plain XML, but I cannot get it to work with this HTML page.
>
> The XML looks like this:
> ...

Once again (this IS HTML Day!), instead of parsing the HTML, pyparsing
can help lift the interesting bits and leave the rest alone. Try this
program out:

from pyparsing import makeXMLTags,Word,nums,Combine,oneOf,SkipTo,withAttribute

htmlWithEmbeddedXml = """
Hey! this is really bold!
Owner 1 07/16/2007 No Doe, John 1905 S 3rd Ave , Hicksville IA 9
Owner 2 07/16/2007 No Doe, Jane 1905 S 3rd Ave , Hicksville IA 9
this is in a table, woo-hoo! more HTML blah blah blah...
"""

# define pyparsing expressions for XML tags
rowStart,rowEnd = makeXMLTags("Row")
relationshipStart,relationshipEnd = makeXMLTags("Relationship")
priorityStart,priorityEnd = makeXMLTags("Priority")
startDateStart,startDateEnd = makeXMLTags("StartDate")
stopsExistStart,stopsExistEnd = makeXMLTags("StopsExist")
nameStart,nameEnd = makeXMLTags("Name")
addressStart,addressEnd = makeXMLTags("Address")

# define some useful expressions for data of specific types
integer = Word(nums)
date = Combine(Word(nums,exact=2)+"/"+
               Word(nums,exact=2)+"/"+Word(nums,exact=4))
yesOrNo = oneOf("Yes No")

# conversion parse actions
integer.setParseAction(lambda t: int(t[0]))
yesOrNo.setParseAction(lambda t: t[0]=='Yes')
# could also define a conversion for date if you really wanted to

# define format of a <Row>, plus assign results names for each data field
rowRec = rowStart + \
    relationshipStart + SkipTo(relationshipEnd)("relationship") + relationshipEnd + \
    priorityStart + integer("priority") + priorityEnd + \
    startDateStart + date("startdate") + startDateEnd + \
    stopsExistStart + yesOrNo("stopsexist") + stopsExistEnd + \
    nameStart + SkipTo(nameEnd)("name") + nameEnd + \
    addressStart + SkipTo(addressEnd)("address") + addressEnd + \
    rowEnd

# set filtering parse action
rowRec.setParseAction(withAttribute(relationship="Owner",priority=1))

# find all matching rows, matching grammar and filtering parse action
rows = rowRec.searchString(htmlWithEmbeddedXml)

# print the results (uncomment r.dump() statement to see full
# result for each row)
for r in rows:
    # print r.dump()
    print r.relationship
    print r.priority
    print r.startdate
    print r.stopsexist
    print r.name
    print r.address

This prints:

Owner
1
07/16/2007
False
Doe, John
1905 S 3rd Ave , Hicksville IA 9

In addition to parsing this data, some conversions were done at parse
time, too - "1" was converted to the value 1, and "No" was converted to
False. These were done by the conversion parse actions. The filtering
just for Row's containing Relationship="Owner" and Priority=1 was done
in a more global parse action, called withAttribute. If you comment this
line out, you will see that both rows get retrieved.

-- Paul
(Find out more about pyparsing at http://pyparsing.wikispaces.com.)
-- http://mail.python.org/mailman/listinfo/python-list
Re: Just for fun: Countdown numbers game solver
Arnaud and Terry, Great solutions both of you! Much nicer than mine. I particularly like Arnaud's latest one based on folding because it's so neat and conceptually simple. For me, it's the closest so far to my goal of the most elegant solution. So anyone got an answer to which set of numbers gives the most targets from 100 onwards say (or from 0 onwards)? Is Python up to the task? A thought on that last one. Two ways to improve speed. First of all, you don't need to rerun from scratch for each target. Secondly, you can try multiple different sets of numbers at the same time by passing numpy arrays instead of single values (although you have to give up the commutativity and division by zero optimisations). Dan Goodman -- http://mail.python.org/mailman/listinfo/python-list
UDP Client/Server
Hello, I created a really simple udp server and protocol but I only get every 2nd request (and thus answer just every second request). Maybe someone could shed some light, I'm lost in the dark(tm), sorry if this is a bit oververbose but to me everything that happens here is black magic, and I have no clue where the packages go. I can't think of a simpler protocol than to just receive a fixed max UDP packet size and answer immediately (read an "echo" server). thanks martin ### server >>> from socket import * >>> import SocketServer >>> from SocketServer import BaseRequestHandler, UDPServer >>> class FooReceiveServer(SocketServer.UDPServer): ... def __init__(self): ... SocketServer.UDPServer.__init__(self, ("localhost", 4321), FooRequestHandler) ... >>> class FooRequestHandler(BaseRequestHandler): ... def handle(self): ... data, addr_info = self.request[1].recvfrom(65534) ... print data ... print addr_info ... self.request[1].sendto("response", addr_info) ... >>> f = FooReceiveServer() >>> f.serve_forever() request 0 ('127.0.0.1', 32884) request 1 ('127.0.0.1', 32884) request 2 ('127.0.0.1', 32884) request 2 ('127.0.0.1', 32884) request 2 ('127.0.0.1', 32884) ### client >>> target = ('127.0.0.1', 4321) >>> from socket import * >>> s = socket(AF_INET, SOCK_DGRAM) >>> for i in range(10): ... s.sendto("request " + str(i), target) ... s.recv(65534) ... 9 Traceback (most recent call last): File "", line 3, in KeyboardInterrupt >>> s.sendto("request " + str(i), target) 9 >>> str(i) '0' >>> for i in range(10): ... s.sendto("request " + str(i), target) ... s.recv(65534) ... 9 'response' 9 'response' 9 Traceback (most recent call last): File "", line 3, in KeyboardInterrupt >>> #this was hanging, why? ... 
>>> s.sendto("request " + str(i), target) 9 >>> s.recv(65534) 'response' >>> s.sendto("request " + str(i), target) 9 >>> s.recv(65534) Traceback (most recent call last): File "", line 1, in KeyboardInterrupt >>> s.sendto("request " + str(i), target) 9 >>> s.sendto("request " + str(i), target) 9 >>> s.recv(65534) 'response' >>> s.recv(65534) Traceback (most recent call last): File "", line 1, in KeyboardInterrupt >>> s.sendto("request " + str(i), target) 9 >>> -- http://noneisyours.marcher.name http://feeds.feedburner.com/NoneIsYours You are not free to read this message, by doing so, you have violated my licence and are required to urinate publicly. Thank you. -- http://mail.python.org/mailman/listinfo/python-list
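A minimal sketch of the echo exchange described above, written for Python 3 (where the SocketServer module is named socketserver). It relies on the documented behaviour that, for datagram servers, the handler's self.request is already a (data, socket) pair, so the handler does not have to call recvfrom() itself; a recvfrom() inside handle() would wait for a further packet. EchoHandler and run_demo are illustrative names, and port 0 simply asks the OS for any free port.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is (datagram_bytes, server_socket).
        data, sock = self.request
        # Reply to the sender; client_address was filled in by the server.
        sock.sendto(b"response to " + data, self.client_address)


def run_demo(n=3):
    """Send n requests to a throwaway local server and collect the replies."""
    server = socketserver.UDPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)  # fail instead of hanging if a reply is lost
    replies = []
    try:
        for i in range(n):
            client.sendto(b"request %d" % i, server.server_address)
            replies.append(client.recv(65535))
    finally:
        client.close()
        server.shutdown()
        server.server_close()
    return replies


if __name__ == "__main__":
    for reply in run_demo():
        print(reply)
```

With this shape every request gets exactly one reply, because each datagram is read once by the server and handed to the handler.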
Re: translating Python to Assembler
Python modules and scripts are normally not compiled to machine code; if they appear to have been, it's probably just the Python interpreter packaged with the scripts and resources. My advice, if you want to learn Python, is to read a book about it or read online resources. Learning Python from assembler is kind of... strange. Not only are you skipping several generations of programming languages, spanning a period of 40 years, but the approach to programming in Python is so fundamentally different from assembler programming that there is simply no reason to start looking at it from this perspective. I truly hope you enjoy the world of high-level programming languages, but treat them as such. Looking at them in a low-level representation or from a low-level perspective doesn't bear much fruit. Kind regards, Wim On 1/22/08, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > My expertise, if any, is in assembler. I'm trying to understand Python > scripts and modules by examining them after they have been > disassembled in a Windows environment. > > I'm wondering if a Python symbols file is available. In the Windows > environment, a symbol file normally has a PDB extension. It's a little > unfortunate that Python also uses PDB for its debugger. Google, for > whatever reason, wont accept queries with dots, hyphens, etc., in the > query line. For example a Google for "python.pdb" returns +python > +pdb, so I get a ridiculous number of returns referring to the python > debugger. I have mentioned this to Google several times, but I guess > logic isn't one of their strong points. :-) > -- > http://mail.python.org/mailman/listinfo/python-list > -- http://mail.python.org/mailman/listinfo/python-list
Re: translating Python to Assembler
On Jan 23, 9:24 am, [EMAIL PROTECTED] wrote: > My expertise, if any, is in assembler. I'm trying to understand Python > scripts and modules by examining them after they have been > disassembled in a Windows environment. > DB "Wrong way. Go back. Read the tutorials." RET -- http://mail.python.org/mailman/listinfo/python-list
translating Python to Assembler...sorry if this is duplicated...it's unintentional
My expertise, if any, is in assembler. I'm trying to understand Python scripts and modules by examining them after they have been disassembled in a Windows environment. I'm wondering if a Python symbols file is available. In the Windows environment, a symbol file normally has a PDB extension. It's a little unfortunate that Python also uses PDB for its debugger. Google, for whatever reason, won't accept queries with dots, hyphens, etc., in the query line. For example a Google for "python.pdb" returns +python +pdb, so I get a ridiculous number of returns referring to the python debugger. I have mentioned this to Google several times, but I guess logic isn't one of their strong points. :-) If there are duplicates of this post it's because it wouldn't send for some reason. -- http://mail.python.org/mailman/listinfo/python-list
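For anyone who wants to see what Python itself produces from a script, a small sketch using the standard library's dis module. It disassembles CPython bytecode, not machine code, so it is a different thing from a Windows PDB symbol file. The function add is a made-up example, and the exact opcode names vary between CPython versions.

```python
import dis


def add(a, b):
    return a + b


# compile() turns source text into a code object holding bytecode.
code = compile("x = 1 + 2", "<example>", "exec")

# get_instructions() yields one Instruction per bytecode operation.
opnames = [ins.opname for ins in dis.get_instructions(add)]

if __name__ == "__main__":
    dis.dis(add)    # human-readable listing of the function's bytecode
    dis.dis(code)   # the same for the compiled snippet
```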
Re: Processing XML that's embedded in HTML
On 22 Jan, 21:48, Mike Driscoll <[EMAIL PROTECTED]> wrote: > On Jan 22, 11:32 am, Paul Boddie <[EMAIL PROTECTED]> wrote: > > > [1]http://www.python.org/pypi/libxml2dom > > I must have tried this module quite a while ago since I already have > it installed. I see you're the author of the module, so you can > probably tell me what's what. When I do the above, I get an empty list > either way. See my code below: > > import libxml2dom > d = libxml2dom.parse(filename, html=1) > rows = d.xpath('//[EMAIL > PROTECTED]"grdRegistrationInquiryCustomers"]/BoundData/ > Row') > # rows = d.xpath("//XML/BoundData/Row") > print rows It may be namespace-related, although parsing as HTML shouldn't impose namespaces on the document, unlike parsing XHTML, say. One thing you can try is to start with a simpler query and to expand it. Start with the expression "//XML" and add things to make the results more specific. Generally, namespaces can make XPath queries awkward because you have to qualify the element names and define the namespaces for each of the prefixes used. Let me know how you get on! Paul -- http://mail.python.org/mailman/listinfo/python-list
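The "start simple and expand" approach can be sketched like this. It uses the standard library's xml.etree.ElementTree as a stand-in for libxml2dom, and the sample document, the id value "customers" and the Name element are invented for illustration; only the XML/BoundData/Row/Relationship/Priority names come from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the real page.
SAMPLE = """<html><body><XML id="customers"><BoundData>
<Row><Relationship>Owner</Relationship><Priority>1</Priority><Name>A</Name></Row>
<Row><Relationship>Tenant</Relationship><Priority>2</Priority><Name>B</Name></Row>
</BoundData></XML></body></html>"""

root = ET.fromstring(SAMPLE)

# Widen or narrow the path one step at a time and check each result.
step1 = root.findall(".//XML")
step2 = root.findall(".//XML[@id='customers']")
step3 = root.findall(".//XML[@id='customers']/BoundData/Row")

# Final filter done in Python rather than in the path expression.
owners = [
    row for row in step3
    if row.findtext("Relationship") == "Owner"
    and row.findtext("Priority") == "1"
]

if __name__ == "__main__":
    print(len(step1), len(step2), len(step3), len(owners))
```

If the first, broadest step already returns nothing, the problem lies in how the parser named the elements, not in the later parts of the expression.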
Re: difflib confusion
On Jan 22, 6:57 pm, "krishnakant Mane" <[EMAIL PROTECTED]> wrote: > hello all, > I have a bit of a confusing question. > firstly I wanted a library which can do an svn like diff with two files. > let's say I have file1 and file2 where file2 contains some thing which > file1 does not have. now if I do readlines() on both the files, I > have a list of all the lines. > I now want to do a diff and find out which word is added or deleted or > changed. > and that too on which character, if not at least want to know the word > that has the change. > any ideas please? Have a look at difflib in the standard library. -- Paul Hankin -- http://mail.python.org/mailman/listinfo/python-list
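As a rough illustration of the difflib suggestion, here is a sketch that compares two lines word by word with difflib.SequenceMatcher and reports only the differing runs. The helper name word_changes and the sample sentences are made up for the example.

```python
import difflib


def word_changes(old_line, new_line):
    """Return (tag, old_words, new_words) for every non-equal region."""
    old = old_line.split()
    new = new_line.split()
    matcher = difflib.SequenceMatcher(None, old, new)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    for change in word_changes("the quick brown fox", "the quick red fox jumps"):
        print(change)
```

The same matcher can be run on the characters of a changed word if a character position is needed; for whole files, difflib.unified_diff on the readlines() output gives an svn-like diff.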
Anyone into Paimei?? Need some help.
Hi. I have been trying to get Paimei running on Windoze but find it very inconsistent. It works on certain apps really well, like Notepad, but fails on other apps, especially those written in languages like Delphi. There isn't a lot out there on Paimei and the author's site is very terse on the app. -- http://mail.python.org/mailman/listinfo/python-list
Re: get the size of a dynamically changing file fast ?
Mike Driscoll wrote: > On Jan 22, 3:35 pm, Stef Mientki <[EMAIL PROTECTED]> wrote: > >> Mike Driscoll wrote: >> >>> On Jan 17, 3:56 pm, Stef Mientki <[EMAIL PROTECTED]> wrote: >>> hello, I've a program (not written in Python) that generates a few thousands bytes per second, these files are dumped in 2 buffers (files), at in interval time of 50 msec, the files can be read by another program, to do further processing. A program written in VB or delphi can handle the data in the 2 buffers perfectly. Sometimes Python is also able to process the data correctly, but often it can't :-( I keep one of the files open en test the size of the open datafile each 50 msec. I have tried os.stat ( ) [ ST_SIZE] os.path.getsize ( ... ) but they both have the same behaviour, sometimes it works, and the data is collected each 50 .. 100 msec, sometimes 1 .. 1.5 seconds is needed to detect a change in filesize. I'm using python 2.4 on winXP. Is there a solution for this problem ? thanks, Stef Mientki >>> Tim Golden has a method to watch for changes in a directory on his >>> website: >>> >>> http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_fo... >>> >>> This old post also mentions something similar: >>> >>> http://mail.python.org/pipermail/python-list/2007-October/463065.html >>> >>> And here's a cookbook recipe that claims to do it as well using >>> decorators: >>> >>> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/426620 >>> >>> Hopefully that will get you going. >>> >>> Mike >>> >> thanks Mike, >> sorry for the late reaction. >> I've it working perfect now. >> After all, os.stat works perfectly well, >> the problem was in the program that generated the file with increasing >> size, >> by truncating it after each block write, it apperently garantees that >> the file is flushed to disk and all problems are solved. >> >> cheers, >> Stef Mientki >> > > I almost asked if you were making sure you had flushed the data to the > file...oh well. 
> Yes, that's a small disadvantage of using a "high-level" language, where there's no flush available, and you assume it'll be done automatically ;-) cheers, Stef -- http://mail.python.org/mailman/listinfo/python-list
Re: Hebrew in idle ans eclipse (Windows)
On Jan 17, 10:35 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > ... > print lines[0].decode("").encode("") > ... > Regards, > Martin Ok, I've got the solution, but I still have a question. Recall: When I read data using sql I got a sequence like this: \x88\x89\x85 But when I entered Hebrew words directly in the print statement (or as a dictionary key) I got this: \xe8\xe9\xe5 Now, scanning the encodings module I discovered that cp1255 maps '\u05d9' to \xe9 while cp856 maps '\u05d9' to \x89, so transforming \x88\x89\x85 to \xe8\xe9\xe5 is done by s.decode('cp856').encode('cp1255') ending up with the pattern you suggested. My question is, is there a way I can deduce cp856 and cp1255 from the string itself? Is there a function doing it? (making the transformation more robust) I don't know how IDLE guessed cp856, but it must have done it. (perhaps because it uses tcl, and maybe tcl guesses the encoding automatically?) thanks iu2 -- http://mail.python.org/mailman/listinfo/python-list
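The transformation above can be written out as follows, in Python 3 syntax with bytes literals. As far as I know there is no standard-library function that detects an encoding from the bytes alone, so guess_codec below is only a heuristic I am assuming for illustration: it tries each candidate codec and accepts the first one whose decoded text consists solely of Hebrew letters. All three function names are made up.

```python
def transcode(raw, source="cp856", target="cp1255"):
    """Re-encode bytes from one code page to another via Unicode."""
    return raw.decode(source).encode(target)


def looks_hebrew(raw, codec):
    """True if raw decodes with codec to nothing but Hebrew letters."""
    try:
        text = raw.decode(codec)
    except UnicodeDecodeError:
        return False
    letters = [ch for ch in text if not ch.isspace()]
    # U+05D0 (alef) through U+05EA (tav) is the Hebrew alphabet range.
    return bool(letters) and all("\u05d0" <= ch <= "\u05ea" for ch in letters)


def guess_codec(raw, candidates=("cp856", "cp1255")):
    """Heuristic only: first candidate that yields pure Hebrew text."""
    for codec in candidates:
        if looks_hebrew(raw, codec):
            return codec
    return None


if __name__ == "__main__":
    print(transcode(b"\x88\x89\x85"))
    print(guess_codec(b"\x88\x89\x85"), guess_codec(b"\xe8\xe9\xe5"))
```

A guess like this breaks down as soon as the text mixes Hebrew with digits, punctuation or Latin letters, so knowing the source encoding up front remains the robust option.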
Re: Processing XML that's embedded in HTML
On Jan 23, 7:48 am, Mike Driscoll <[EMAIL PROTECTED]> wrote: [snip] > I'm not sure what is wrong here...but I got lxml to create a tree from > by doing the following: > > > from lxml import etree > from StringIO import StringIO > > parser = etree.HTMLParser() > tree = etree.parse(filename, parser) > xml_string = etree.tostring(tree) > context = etree.iterparse(StringIO(xml_string)) > > > However, when I iterate over the contents of "context", I can't figure > out how to nab the row's contents: > > for action, elem in context: > if action == 'end' and elem.tag == 'relationship': > # do something...but what!? > # this if statement probably isn't even right > lxml allegedly supports the ElementTree interface so I would expect elem.text to refer to the contents. Sure enough: http://codespeak.net/lxml/tutorial.html#elements-contain-text Why do you want/need to use the iterparse technique on the 2nd pass instead of creating another tree and then using getiterator? -- http://mail.python.org/mailman/listinfo/python-list
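To make the elem.text point concrete, a sketch with the standard library's ElementTree, which offers the same iterparse interface that lxml follows. The XML snippet and the lower-case tag names row, relationship and priority are assumptions for the example, not the real document.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the serialized tree.
XML_BYTES = (
    b"<rows>"
    b"<row><relationship>Owner</relationship><priority>1</priority></row>"
    b"<row><relationship>Tenant</relationship><priority>2</priority></row>"
    b"</rows>"
)

owners = []
for event, elem in ET.iterparse(io.BytesIO(XML_BYTES), events=("end",)):
    # At the 'end' event of a row, all of its children are complete.
    if elem.tag == "row":
        if (elem.findtext("relationship") == "Owner"
                and elem.findtext("priority") == "1"):
            # child.text holds the character data inside each child element.
            owners.append({child.tag: child.text for child in elem})
        elem.clear()  # free memory for rows already handled

if __name__ == "__main__":
    print(owners)
```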
Re: get the size of a dynamically changing file fast ?
On Jan 22, 3:35 pm, Stef Mientki <[EMAIL PROTECTED]> wrote: > Mike Driscoll wrote: > > On Jan 17, 3:56 pm, Stef Mientki <[EMAIL PROTECTED]> wrote: > > >> hello, > > >> I've a program (not written in Python) that generates a few thousands > >> bytes per second, > >> these files are dumped in 2 buffers (files), at in interval time of 50 > >> msec, > >> the files can be read by another program, to do further processing. > > >> A program written in VB or delphi can handle the data in the 2 buffers > >> perfectly. > >> Sometimes Python is also able to process the data correctly, > >> but often it can't :-( > > >> I keep one of the files open en test the size of the open datafile each > >> 50 msec. > >> I have tried > >> os.stat ( ) [ ST_SIZE] > >> os.path.getsize ( ... ) > >> but they both have the same behaviour, sometimes it works, and the data > >> is collected each 50 .. 100 msec, > >> sometimes 1 .. 1.5 seconds is needed to detect a change in filesize. > > >> I'm using python 2.4 on winXP. > > >> Is there a solution for this problem ? > > >> thanks, > >> Stef Mientki > > > Tim Golden has a method to watch for changes in a directory on his > > website: > > >http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_fo... > > > This old post also mentions something similar: > > >http://mail.python.org/pipermail/python-list/2007-October/463065.html > > > And here's a cookbook recipe that claims to do it as well using > > decorators: > > >http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/426620 > > > Hopefully that will get you going. > > > Mike > > thanks Mike, > sorry for the late reaction. > I've it working perfect now. > After all, os.stat works perfectly well, > the problem was in the program that generated the file with increasing > size, > by truncating it after each block write, it apperently garantees that > the file is flushed to disk and all problems are solved. 
> > cheers, > Stef Mientki I almost asked if you were making sure you had flushed the data to the file...oh well. Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: get the size of a dynamically changing file fast ?
Mike Driscoll wrote: > On Jan 17, 3:56 pm, Stef Mientki <[EMAIL PROTECTED]> wrote: > >> hello, >> >> I've a program (not written in Python) that generates a few thousands >> bytes per second, >> these files are dumped in 2 buffers (files), at in interval time of 50 msec, >> the files can be read by another program, to do further processing. >> >> A program written in VB or delphi can handle the data in the 2 buffers >> perfectly. >> Sometimes Python is also able to process the data correctly, >> but often it can't :-( >> >> I keep one of the files open en test the size of the open datafile each >> 50 msec. >> I have tried >> os.stat ( ) [ ST_SIZE] >> os.path.getsize ( ... ) >> but they both have the same behaviour, sometimes it works, and the data >> is collected each 50 .. 100 msec, >> sometimes 1 .. 1.5 seconds is needed to detect a change in filesize. >> >> I'm using python 2.4 on winXP. >> >> Is there a solution for this problem ? >> >> thanks, >> Stef Mientki >> > > Tim Golden has a method to watch for changes in a directory on his > website: > > http://tgolden.sc.sabren.com/python/win32_how_do_i/watch_directory_for_changes.html > > This old post also mentions something similar: > > http://mail.python.org/pipermail/python-list/2007-October/463065.html > > And here's a cookbook recipe that claims to do it as well using > decorators: > > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/426620 > > Hopefully that will get you going. > > Mike > thanks Mike, sorry for the late reaction. I've got it working perfectly now. After all, os.stat works perfectly well; the problem was in the program that generated the file with increasing size. Truncating it after each block write apparently guarantees that the file is flushed to disk, and all problems are solved. cheers, Stef Mientki -- http://mail.python.org/mailman/listinfo/python-list
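The writer in this thread was not a Python program, so the following is only a Python sketch of the same idea: if the producer pushes each block out of its buffers, a reader polling os.stat sees the new size right away. The helper names append_block and demo, the file name buffer.dat and the 100-byte blocks are all invented for the example.

```python
import os
import tempfile


def append_block(handle, block):
    """Write one block and push it out of the process and OS buffers."""
    handle.write(block)
    handle.flush()              # empty Python's own buffer
    os.fsync(handle.fileno())   # ask the OS to commit it to disk


def demo(blocks=3, block_size=100):
    """Return the file size observed after each flushed block."""
    sizes = []
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "buffer.dat")
        with open(path, "wb") as writer:
            for _ in range(blocks):
                append_block(writer, b"x" * block_size)
                sizes.append(os.stat(path).st_size)
    return sizes


if __name__ == "__main__":
    print(demo())
```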
Re: HTML parsing confusion
On Jan 22, 11:39 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > Alnilam wrote: > > On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote: > >> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2, > >> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous > >> > 200-modules PyXML package installed. And you don't want the 75Kb > >> > BeautifulSoup? > > >> I wasn't aware that I had PyXML installed, and can't find a reference > >> to having it installed in pydocs. ... > > > Ugh. Found it. Sorry about that, but I still don't understand why > > there isn't a simple way to do this without using PyXML, BeautifulSoup > > or libxml2dom. What's the point in having sgmllib, htmllib, > > HTMLParser, and formatter all built in if I have to use use someone > > else's modules to write a couple of lines of code that achieve the > > simple thing I want. I get the feeling that this would be easier if I > > just broke down and wrote a couple of regular expressions, but it > > hardly seems a 'pythonic' way of going about things. > > This is simply a gross misunderstanding of what BeautifulSoup or lxml > accomplish. Dealing with mal-formatted HTML whilst trying to make _some_ > sense is by no means trivial. And just because you can come up with a few > lines of code using rexes that work for your current use-case doesn't mean > that they serve as general html-fixing-routine. Or do you think the rather > long history and 75Kb of code for BS are because it's creator wasn't aware > of rexes? > > And it also makes no sense stuffing everything remotely useful into the > standard lib. This would force to align development and release cycles, > resulting in much less features and stability as it can be wished. > > And to be honest: I fail to see where your problem is. BeatifulSoup is a > single Python file. 
So whatever you carry with you from machine to machine, > if it's capable of holding a file of your own code, you can simply put > BeautifulSoup beside it - even if it was a floppy disk. > > Diez I am, by no means, trying to trivialize the work that goes into creating the numerous modules out there. However as a relatively novice programmer trying to figure out something, the fact that these modules are pushed on people with such zealous devotion that you take offense at my desire to not use them gives me a bit of pause. I use non-included modules for tasks that require them, when the capability to do something clearly can't be done easily another way (eg. MySQLdb). I am sure that there will be plenty of times where I will use BeautifulSoup. In this instance, however, I was trying to solve a specific problem which I attempted to lay out clearly from the outset. I was asking this community if there was a simple way to use only the tools included with Python to parse a bit of html. If the answer is no, that's fine. Confusing, but fine. If the answer is yes, great. I look forward to learning from someone's example. If you don't have an answer, or a positive contribution, then please don't interject your angst into this thread. -- http://mail.python.org/mailman/listinfo/python-list
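For reasonably clean markup, something along these lines is possible with only the standard library. This is a sketch in Python 3, where the HTMLParser module lives at html.parser; the class name TagTextCollector, the choice of the p tag and the sample markup are assumptions, since the original task is not restated here. It does not attempt the repair work BeautifulSoup does for badly broken HTML.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text found inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # >0 while we are inside the wanted tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self.chunks.append("")  # start a new text chunk
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks[-1] += data


def text_of(tag, markup):
    parser = TagTextCollector(tag)
    parser.feed(markup)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    page = "<html><body><p>Hello <b>world</b></p><div>skip</div><p>again</p></body></html>"
    print(text_of("p", page))
```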
Re: Bug in __init__?
On 2008-01-22, citizen Bruno Desthuilliers testified: >> from copy import copy >> ### see also deepcopy >> self.lst = copy(val) > > What makes you think the OP wants a copy ? I'm guessing he doesn't want to mutate original list, while changing contents of self.lst. bart -- "chłopcy dali z siebie wszystko, z czego tv pokazała głównie bebechy" http://candajon.azorragarse.info/ http://azorragarse.candajon.info/ -- http://mail.python.org/mailman/listinfo/python-list
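A short sketch of what the copy buys, and where it stops. The class names Holder and DeepHolder are made up; the point is that copy() protects the caller's outer list from appends and removals, while the inner objects are still shared unless deepcopy() is used.

```python
from copy import copy, deepcopy


class Holder:
    def __init__(self, val):
        self.lst = copy(val)        # shallow: new outer list, same items


class DeepHolder:
    def __init__(self, val):
        self.lst = deepcopy(val)    # new outer list and new inner lists


original = [[1], [2]]

shallow = Holder(original)
shallow.lst.append([3])        # does not touch the caller's list
shallow.lst[0].append(99)      # but the inner list is shared

deep = DeepHolder(original)
deep.lst[1].append(42)         # fully independent of the caller's data

if __name__ == "__main__":
    print(original, shallow.lst, deep.lst)
```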
Re: question
Hi there :) A little tip upfront: In the future you might want to come up with a more descriptive subject line. This will help readers decide early if they can possibly help or not. [EMAIL PROTECTED] wrote: > def albumInfo(theBand): > def Rush(): > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > > def Enchant(): > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > > ... > Yuck! ;) > The only problem with the code above though is that I don't know how to call > it, especially since if the user is entering a string, how would I convert > that string into a function name? While this is relatively easy, it is *waaayyy* too complicated an approach here, because . . . > def albumInfo(theBand): > if theBand == 'Rush': > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > elif theBand == 'Enchant': > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > ... > . . . this is a lot more fitting for this problem. You could also have used a dictionary here, but the above is better if you have a lot of lists, because only the one you use is created (I think . . .). You might also want to consider preparing a textfile and reading it into a list (via lines = open("somefile.txt").readlines()) and then work with that so you don't have to hardcode it into the program. This however is somewhat advanced (if you're just starting out), so don't sweat it. > I'm not familiar with how 'classes' work yet (still reading through my 'Core > Python' book) but was curious if using a 'class' would be better suited for > something like this? Since the user could possibly choose from 100 or more > choices, I'd like to come up with something that's efficient as well as easy > to read in the code. If anyone has time I'd love to hear your thoughts. > Think of classes as "models of things and their behavior" (like an animal, a car or a robot). 
What you want is a simple "request->answer" style functionality, hence a function. Hope that helps. Happy coding :) /W -- http://mail.python.org/mailman/listinfo/python-list
Re: Processing XML that's embedded in HTML
On Jan 22, 11:32 am, Paul Boddie <[EMAIL PROTECTED]> wrote: > > The rest of the document is html, javascript div tags, etc. I need the > > information only from the row where the Relationship tag = Owner and > > the Priority tag = 1. The rest I can ignore. When I tried parsing it > > with minidom, I get an ExpatError: mismatched tag: line 1, column 357 > > so I think the HTML is probably malformed. > > Or that it isn't well-formed XML, at least. I probably should have posted that I got the error on the first line of the file, which is why I think it's the HTML. But I wouldn't be surprised if it was the XML that's behaving badly. > > > I looked at BeautifulSoup, but it seems to separate its HTML > > processing from its XML processing. Can someone give me some pointers? > > With libxml2dom [1] I'd do something like this: > > import libxml2dom > d = libxml2dom.parse(filename, html=1) > # or: d = parseURI(uri, html=1) > rows = d.xpath("//XML/BoundData/Row") > # or: rows = d.xpath("//[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/ > BoundData/Row") > > Even though the document is interpreted as HTML, you should get a DOM > containing the elements as libxml2 interprets them. > > > I am currently using Python 2.5 on Windows XP. I will be using > > Internet Explorer 6 since the document will not display correctly in > > Firefox. > > That shouldn't be much of a surprise, it must be said: it isn't XHTML, > where you might be able to extend the document via XML, so the whole > document has to be "proper" HTML. > > Paul > > [1]http://www.python.org/pypi/libxml2dom I must have tried this module quite a while ago since I already have it installed. I see you're the author of the module, so you can probably tell me what's what. When I do the above, I get an empty list either way. 
See my code below: import libxml2dom d = libxml2dom.parse(filename, html=1) rows = d.xpath('//[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/BoundData/ Row') # rows = d.xpath("//XML/BoundData/Row") print rows I'm not sure what is wrong here...but I got lxml to create a tree from by doing the following: from lxml import etree from StringIO import StringIO parser = etree.HTMLParser() tree = etree.parse(filename, parser) xml_string = etree.tostring(tree) context = etree.iterparse(StringIO(xml_string)) However, when I iterate over the contents of "context", I can't figure out how to nab the row's contents: for action, elem in context: if action == 'end' and elem.tag == 'relationship': # do something...but what!? # this if statement probably isn't even right Thanks for the quick response, though! Any other ideas? Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: bags? 2.5.x?
On Jan 21, 11:13 pm, Dan Stromberg <[EMAIL PROTECTED]> wrote: > On Thu, 17 Jan 2008 18:18:53 -0800, Raymond Hettinger wrote: > >> >> I keep wanting something like them - especially bags with something > >> >> akin to set union, intersection and difference. > > >> > How about this recepie > >> > http://www.ubookcase.com/book/Oreilly/ > > >> The author of the bag class said that he was planning to submit bags > >> for inclusion in 2.5 - is there a particular reason why they didn't go > >> in? > > > Three reasons: > > > 1. b=collections.defaultdict(int) went a long way towards meeting the > > need to for a fast counter. > > > 2. It's still not clear what the best API would be. What should list(b) > > return for b.dict = {'a':3, 'b':0, 'c':-3}? Perhaps, [('a', 3), ('b', > > 0), ('c', -3)] or ['a', 'a', 'a'] > > or ['a'] > > or ['a', 'b', 'c'] > > or raise an Error for the negative entry. > > I'd suggest that .keys() return the unique list, and that list() return > the list of tuples. Then people can use list comprehensions or map() to > get what they really need. I think that a bag is a cross between a dict (but the values are always positive integers) and a set (but duplicates are permitted). I agree that .keys() should return the unique list, but that .items() should return the tuples and list() should return the list of keys including duplicates. bag() should accept an iterable and count the duplicates. For example: >>> sentence = "the cat sat on the mat" >>> my_words = sentence.split() >>> print my_words ['the', 'cat', 'sat', 'on', 'the', 'mat'] >>> my_bag = bag(my_words) >>> print my_bag bag({'on': 1, 'the': 2, 'sat': 1, 'mat': 1, 'cat': 1}) >>> my_list = list(my_bag) >>> print my_list ['on', 'the', 'the', 'sat', 'mat', 'cat'] It should be easy to convert a bag to a dict and also a dict to a bag, raising ValueError if it sees a value that's not a non-negative integer (a value of zero just means "there isn't one of these in the bag"!).
> > It might not be a bad thing to have an optional parameter on __init__ > that would allow the user to specify if they need negative counts or not; > so far, I've not needed them though. > > > 3. I'm still working on it and am not done yet. > > Decent reasons. :) > > Thanks! > > Here's a diff to bag.py that helped me. I'd like to think these meanings > are common, but not necessarily! > > $ diff -b /tmp/bag.py.original /usr/local/lib/bag.py > 18a19,58 > > > def _difference(lst): > > left = lst[0] > > right = lst[1] > > return max(left - right, 0) > > _difference = staticmethod(_difference) > > > def _relationship(self, other, operator): > > if isinstance(other, bag): > > self_keys = set(self._data.keys()) > > other_keys = set(other._data.keys()) > > union_keys = self_keys | other_keys > > #print 'union_keys is',union_keys > > result = bag() > > for element in list(union_keys): > > temp = operator([ self[element], other > [element] ]) > > #print 'operator is', operator > > #print 'temp is', temp > > result[element] += temp > > return result > > else: > > raise NotImplemented > > > def union(self, other): > > return self._relationship(other, sum) > > > __or__ = union > > > def intersection(self, other): > > return self._relationship(other, min) > > > __and__ = intersection > > > def maxunion(self, other): > > return self._relationship(other, max) > > > def difference(self, other): > > return self._relationship(other, self._difference) > > > __sub__ = difference -- http://mail.python.org/mailman/listinfo/python-list
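For comparison, later Python releases added collections.Counter to the standard library, which covers much of what is discussed here. It is not the bag.py being patched above; the sketch just lines its operators up with the diff's meanings: + adds counts (the diff's union), & takes the minimum (intersection), | takes the maximum (maxunion), and - subtracts while dropping anything that falls to zero or below (difference). The sample words follow the example in the thread, and the second bag is invented.

```python
from collections import Counter

words = "the cat sat on the mat".split()
bag = Counter(words)                 # counts duplicates from any iterable
other = Counter(["the", "dog"])

summed = bag + other                 # counts added together
common = bag & other                 # minimum of each count
largest = bag | other                # maximum of each count
left_over = bag - other              # subtraction, non-positive dropped

unique_keys = sorted(bag.keys())     # each element once
with_repeats = sorted(bag.elements())  # each element repeated by its count

if __name__ == "__main__":
    print(bag)
    print(summed, common, largest, left_over, sep="\n")
```

Note that iterating a Counter directly yields the unique keys, so elements() is the call that corresponds to the "list of keys including duplicates" proposed above.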
Re: question
> def albumInfo(theBand): > def Rush(): > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres'] > > def Enchant(): > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > > The only problem with the code above though is that I > don't know how to call it, especially since if the user is > entering a string, how would I convert that string into a > function name? For example, if the user entered 'Rush', > how would I call the appropriate function --> > albumInfo(Rush()) > It looks like you're reaching for a dictionary idiom: album_info = { 'Rush': [ 'Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres', ], 'Enchant': [ 'A Blueprint of the World', 'Wounded', 'Time Lost', ], } You can then reference the bits: who = "Rush" #get this from the user? print "Albums by %s" % who for album_name in album_info[who]: print ' *', album_name This is much more flexible when it comes to adding groups and albums because you can load the contents of album_info dynamically from your favorite source (a file, DB, or teh intarweb) rather than editing & restarting your app every time. -tkc PS: to answer your original question, functions nested inside albumInfo() can't be reached by name from outside it; if Rush() and Enchant() were instead attributes of a module or class, you could use the getattr() function, such as results = getattr(some_module_or_class, who)() but that's an ugly solution for the example you gave. -- http://mail.python.org/mailman/listinfo/python-list
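Both routes can be sketched side by side. The names ALBUM_INFO, albums_for, Bands and albums_via_getattr are made up for illustration. getattr() looks up an attribute on an object, so the name-based route only works when the functions hang off a class or module; functions nested inside another function are not reachable that way.

```python
ALBUM_INFO = {
    "Rush": ["Rush", "Fly By Night", "Caress of Steel", "2112",
             "A Farewell to Kings", "Hemispheres"],
    "Enchant": ["A Blueprint of the World", "Wounded", "Time Lost"],
}


def albums_for(band):
    """Dictionary route: unknown bands give an empty list, not an error."""
    return ALBUM_INFO.get(band, [])


class Bands:
    """Name-based route: one function per band, looked up with getattr()."""

    @staticmethod
    def Rush():
        return ALBUM_INFO["Rush"]

    @staticmethod
    def Enchant():
        return ALBUM_INFO["Enchant"]


def albums_via_getattr(band):
    func = getattr(Bands, band, None)   # None when no such attribute exists
    return func() if callable(func) else []


if __name__ == "__main__":
    who = "Enchant"   # would come from the user
    for album_name in albums_for(who):
        print(" *", album_name)
```

The dictionary version is also the safer one with user input, since getattr() will happily return any attribute of the class, not just the band functions.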
Re: pairs from a list
[Peter Otten] > You can be bolder here as the izip() docs explicitly state > > """ > Note, the left-to-right evaluation order of the iterables is > guaranteed. This makes possible an idiom for clustering a data series into > n-length groups using "izip(*[iter(s)]*n)". > """ . . . > is about zip(), not izip(). FWIW, I just added a similar guarantee for zip(). Raymond -- http://mail.python.org/mailman/listinfo/python-list
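The clustering idiom quoted from the docs looks like this in Python 3, where the built-in zip() is lazy like izip() was. The helper name grouped and the sample data are made up. The trick is that the list holds n references to one and the same iterator, so each output tuple pulls n consecutive items from it, left to right.

```python
def grouped(series, n):
    """Cluster a series into n-length tuples; a short tail is dropped."""
    iterator = iter(series)
    # [iterator] * n repeats the SAME iterator object n times.
    return list(zip(*[iterator] * n))


if __name__ == "__main__":
    print(grouped([1, 2, 3, 4, 5, 6], 2))
    print(grouped("abcdefg", 3))
```

Because zip() stops at the shortest input, leftover items that do not fill a whole group are silently discarded; itertools.zip_longest is the usual alternative when the tail matters.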
Re: question
On Jan 22, 7:58 pm, <[EMAIL PROTECTED]> wrote: > I'm still learning Python and was wanting to get some thoughts on this. I > apologize if this sounds ridiculous... I'm mainly asking it to gain some > knowledge of what works better. The main question I have is if I had a lot > of lists to choose from, what's the best way to write the code so I'm not > wasting a lot of memory? I've attempted to list a few examples below to > hopefully be a little clearer about my question. > > Lets say I was going to be pulling different data, depending on what the user > entered. I was thinking I could create a function which contained various > functions inside: > > def albumInfo(theBand): > def Rush(): > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > > def Enchant(): > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > > ... > > The only problem with the code above though is that I don't know how to call > it, especially since if the user is entering a string, how would I convert > that string into a function name? For example, if the user entered 'Rush', > how would I call the appropriate function --> albumInfo(Rush()) > > But if I could somehow make that code work, is it a good way to do it? I'm > assuming if the user entered 'Rush' that only the list in the Rush() function > would be stored, ignoring the other functions inside the albumInfo() function? > > I then thought maybe just using a simple if/else statement might work like so: > > def albumInfo(theBand): > if theBand == 'Rush': > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > elif theBand == 'Enchant': > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > ... > > Does anyone think this would be more efficient? > > I'm not familiar with how 'classes' work yet (still reading through my 'Core > Python' book) but was curious if using a 'class' would be better suited for > something like this? 
Since the user could possibly choose from 100 or more > choices, I'd like to come up with something that's efficient as well as easy > to read in the code. If anyone has time I'd love to hear your thoughts. > > Thanks. > > Jay What you want is a dictionary: albumInfo = { 'Rush': ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres'], 'Enchant': ['A Blueprint of the World', 'Wounded', 'Time Lost'], ... } then to find the info just do: >>> albumInfo['Enchant'] ['A Blueprint of the World', 'Wounded', 'Time Lost'] It also makes it easy to add a new album on the fly: >>> albumInfo["Lark's tongue in Aspic"] = [ ... ] Hope that helps. -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: PyGTK, Glade, and ComboBoxEntry.append_text()
Greg Johnston wrote: > Hey all, > > I'm a relative newbie to Python (switched over from Scheme fairly > recently) but I've been using PyGTK and Glade to create an interface, > which is a combo I'm very impressed with. > > There is, however, one thing I've been wondering about. It doesn't > seem possible to modify ComboBoxEntry choice options on the fly--at > least with append_text(), etc--because they were not created with > gtk.combo_box_entry_new_text(). Basically, I'm wondering if there's > any way around this. > > Thank you, > Greg Johnston PyGTK mailing list: http://pygtk.org/feedback.html -- http://mail.python.org/mailman/listinfo/python-list
Re: question
Since you aren't familiar with classes I will keep this within the scope of functions... If you have code like this

    def a():
        def b():
            a+=1

then you can only call function b when you are within function a.

James

On Jan 22, 2008 8:58 PM, <[EMAIL PROTECTED]> wrote: > I'm still learning Python and was wanting to get some thoughts on this. I > apologize if this sounds ridiculous... I'm mainly asking it to gain some > knowledge of what works better. The main question I have is if I had a lot > of lists to choose from, what's the best way to write the code so I'm not > wasting a lot of memory? I've attempted to list a few examples below to > hopefully be a little clearer about my question. > > Lets say I was going to be pulling different data, depending on what the user > entered. I was thinking I could create a function which contained various > functions inside: > > def albumInfo(theBand): > def Rush(): > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > > def Enchant(): > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > > ... > > The only problem with the code above though is that I don't know how to call > it, especially since if the user is entering a string, how would I convert > that string into a function name? For example, if the user entered 'Rush', > how would I call the appropriate function --> albumInfo(Rush()) > > But if I could somehow make that code work, is it a good way to do it? I'm > assuming if the user entered 'Rush' that only the list in the Rush() function > would be stored, ignoring the other functions inside the albumInfo() function? > > I then thought maybe just using a simple if/else statement might work like so: > > def albumInfo(theBand): > if theBand == 'Rush': > return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A > Farewell to Kings', 'Hemispheres'] > elif theBand == 'Enchant': > return ['A Blueprint of the World', 'Wounded', 'Time Lost'] > ...
> > Does anyone think this would be more efficient? > > I'm not familiar with how 'classes' work yet (still reading through my 'Core > Python' book) but was curious if using a 'class' would be better suited for > something like this? Since the user could possibly choose from 100 or more > choices, I'd like to come up with something that's efficient as well as easy > to read in the code. If anyone has time I'd love to hear your thoughts. > > Thanks. > > Jay > -- > http://mail.python.org/mailman/listinfo/python-list > -- http://search.goldwatches.com/?Search=Movado+Watches http://www.jewelerslounge.com http://www.goldwatches.com -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
Arnaud Delobelle wrote: > On Jan 22, 4:10 pm, Alan Isaac <[EMAIL PROTECTED]> wrote: > >> http://bugs.python.org/issue1121416> >> >> fwiw, >> Alan Isaac > > Thanks. So I guess I shouldn't take the code snippet I quoted as a > specification of izip but rather as an illustration. You can be bolder here as the izip() docs explicitly state """ Note, the left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using "izip(*[iter(s)]*n)". """ and the bug report with Raymond Hettinger saying """ Left the evaluation order as an unspecified, implementation specific detail. """ is about zip(), not izip(). Peter -- http://mail.python.org/mailman/listinfo/python-list
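The clustering idiom quoted from the izip() docs can be sketched as follows. This version assumes a current Python, where the built-in `zip` is lazy like `izip`; on the Python 2.5 discussed in the thread you would use `itertools.izip` instead. The function name `clusters` is made up for the example.

```python
def clusters(series, n):
    """Group series into consecutive n-length tuples, dropping any remainder."""
    # The *same* iterator object is repeated n times, so each output tuple
    # pulls n consecutive items from it, in left-to-right order.
    return zip(*[iter(series)] * n)
```

With `n=2` this is the same technique as the `pairs4` function discussed elsewhere in the thread. It only works because the arguments are consumed left to right, which is why the documented guarantee matters.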
Re: A global or module-level variable?
Bret <[EMAIL PROTECTED]> writes: > nextport=42000 > > def getNextPort(): > nextport += 1 > return nextport If you have to do it that way, use: def getNextPort(): global nextport nextport += 1 return nextport the global declaration stops the compiler from treating nextport as local and then trapping the increment as to an uninitialized variable. -- http://mail.python.org/mailman/listinfo/python-list
question
I'm still learning Python and was wanting to get some thoughts on this. I apologize if this sounds ridiculous... I'm mainly asking it to gain some knowledge of what works better. The main question I have is if I had a lot of lists to choose from, what's the best way to write the code so I'm not wasting a lot of memory? I've attempted to list a few examples below to hopefully be a little clearer about my question. Lets say I was going to be pulling different data, depending on what the user entered. I was thinking I could create a function which contained various functions inside: def albumInfo(theBand): def Rush(): return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres'] def Enchant(): return ['A Blueprint of the World', 'Wounded', 'Time Lost'] ... The only problem with the code above though is that I don't know how to call it, especially since if the user is entering a string, how would I convert that string into a function name? For example, if the user entered 'Rush', how would I call the appropriate function --> albumInfo(Rush()) But if I could somehow make that code work, is it a good way to do it? I'm assuming if the user entered 'Rush' that only the list in the Rush() function would be stored, ignoring the other functions inside the albumInfo() function? I then thought maybe just using a simple if/else statement might work like so: def albumInfo(theBand): if theBand == 'Rush': return ['Rush', 'Fly By Night', 'Caress of Steel', '2112', 'A Farewell to Kings', 'Hemispheres'] elif theBand == 'Enchant': return ['A Blueprint of the World', 'Wounded', 'Time Lost'] ... Does anyone think this would be more efficient? I'm not familiar with how 'classes' work yet (still reading through my 'Core Python' book) but was curious if using a 'class' would be better suited for something like this? 
Since the user could possibly choose from 100 or more choices, I'd like to come up with something that's efficient as well as easy to read in the code. If anyone has time I'd love to hear your thoughts. Thanks. Jay -- http://mail.python.org/mailman/listinfo/python-list
A global or module-level variable?
This has to be easier than I'm making it I've got a module, remote.py, which contains a number of classes, all of whom open a port for communication. I'd like to have a way to coordinate these port numbers akin to this: So I have this in the __init__.py file for a package called cstore: nextport=42000 def getNextPort(): nextport += 1 return nextport : Then, in the class where I wish to use this (in cstore.remote.py): : class Spam(): def __init__(self, **kwargs): self._port = cstore.getNextPort() I can't seem to make this work, though. As given here, I get an "UnboundLocalError:local variable 'nextport' referenced before assignment". When I try prefixing the names inside __init__.py with "cstore.", I get an error that the global name "cstore" is not defined. I've been looking at this long enough that my eyes are blurring. Any ideas? BTW, the driving force here is that I'm going to need to wrap this in some thread synchronization. For now, though, I'm just trying to get the basics working. Thanks! Bret -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem with processing XML
Paul McGuire wrote: > > Here is a pyparsing hack for your problem. Thanks Paul! This looks like an interesting approach, and once I get my head around the syntax, I'll give it a proper whirl. -- http://mail.python.org/mailman/listinfo/python-list
Re: printing escape character
On Jan 22, 7:58 pm, "Jerry Hill" <[EMAIL PROTECTED]> wrote: > On Jan 22, 2008 1:38 PM, hrochonwo <[EMAIL PROTECTED]> wrote: > > > Hi, > > > I want to print string without "decoding" escaped characters to > > newline etc. > > like print "a\nb" -> a\nb > > is there a simple way to do it in python or should i somehow use > > string.replace(..) function ? > >>> print 'a\nb'.encode('string_escape') > > a\nb > > -- > Jerry thank you, jerry -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 6:34 pm, Paddy <[EMAIL PROTECTED]> wrote: [...] > Hi George, > You need to 'get it right' first. Micro optimizations for speed > without thought of the wider context is a bad habit to form and a time > waster. > If the routine is all that needs to be delivered and it does not > perform at an acceptable speed then find out what is acceptable and > optimise towards that goal. My questions were set to get posters to > think more about the need for speed optimizations and where they > should be applied, (if at all). > > A bit of forethought might justify leaving the routine alone, or > optimising for readability instead. But it's fun! Some-of-us-can't-help-it'ly yours -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: printing escape character
On Jan 22, 2008 1:38 PM, hrochonwo <[EMAIL PROTECTED]> wrote: > Hi, > > I want to print string without "decoding" escaped characters to > newline etc. > like print "a\nb" -> a\nb > is there a simple way to do it in python or should i somehow use > string.replace(..) function ? >>> print 'a\nb'.encode('string_escape') a\nb -- Jerry -- http://mail.python.org/mailman/listinfo/python-list
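The `string_escape` codec used above exists only in Python 2. As a hedged sketch for current Python, the `unicode_escape` codec gives a similar result for ASCII text; the helper name `show_escapes` is made up for this example, and non-ASCII characters will come out as `\xNN` or `\uNNNN` escapes, which may or may not be what you want.

```python
def show_escapes(text):
    """Return text with newlines, tabs, etc. shown as backslash escapes."""
    # encode() produces bytes such as b'a\\nb'; decoding as ASCII turns
    # them back into a printable str.
    return text.encode('unicode_escape').decode('ascii')
```

Calling `print(show_escapes('a\nb'))` then displays the two-character sequence backslash-n instead of starting a new line.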
difflib confusion
hello all, I have a bit of a confusing question. firstly I wanted a library which can do an svn like diff with two files. let's say I have file1 and file2 where file2 contains some thing which file1 does not have. now if I do readlines() on both the files, I have a list of all the lines. I now want to do a diff and find out which word is added or deleted or changed. and that too on which character, if not at least want to know the word that has the change. any ideas please? kk -- http://mail.python.org/mailman/listinfo/python-list
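One possible starting point for the question above is `difflib.SequenceMatcher` from the standard library, applied to the words of two lines. This is only a sketch under the assumption that splitting on whitespace is an acceptable definition of a "word"; the function name `word_changes` is invented for the example.

```python
import difflib


def word_changes(old_line, new_line):
    """List (tag, old_words, new_words) for every non-matching region."""
    old_words = old_line.split()
    new_words = new_line.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    changes = []
    # get_opcodes() yields 'equal', 'replace', 'delete' and 'insert' regions
    # as index ranges into the two word lists.
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes
```

To compare whole files, the same matcher could first be run on the two `readlines()` lists to find the changed lines, and `word_changes` then applied to each changed pair. Passing the raw strings instead of word lists gives character-level positions.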
rpy registry
Howdy, I've been using rpy (1.0.1) and python (2.5.1) on my office computer with great success. When I went to put rpy on my laptop, however, I get an error trying to load rpy. "Unable to determine R version from the registry. Trying another method." followed by a few lines of the usual error message style (ending with "NameError: global name 'RuntimeExecError' is not defined." I have reinstalled R (now 2.6.1), rpy, and python without any luck (being sure to check the "include in registry" on the installation of R). Everything else I have used thus far works perfectly. Any thoughts on what might be causing problems? Thanks, -Hans -- http://mail.python.org/mailman/listinfo/python-list
Re: Beginners question about debugging (import)
Albert van der Horst wrote: > I'm starting with Python. First with some interactive things, > working through the tutorial, > then with definitions in a file called sudoku.py. > Of course I make lots of mistakes, so I have to include that file > time and again. > > I discovered (the hard way) that the second time you invoke > from sudoku.py import * > nothing happens. > > There is reload. But it only seems to work with > import sudoku > > Now I find myself typing ``sudoku.'' all the time: > > x=sudoku.sudoku() > y=sudoku.create_set_of_sets() > sudoku.symbols > > Is there a more convenient way? > > (This is a howto question, rather difficult to get answered > from the documentation.)

    import sudoku as s

However, I find it easier to just create a test.py and run that from the shell, for the exact reason that reload has its caveats and, in the end, more complex testing code isn't really feasible anyway. If you need to, drop into the interactive prompt using

    python -i test.py

Diez
-- http://mail.python.org/mailman/listinfo/python-list
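A self-contained sketch of the alias-plus-reload workflow mentioned above. It assumes a current Python, where `reload` lives in `importlib` (on Python 2 it is a builtin). The module name `sudoku_demo` and the temporary directory are stand-ins so the example can run on its own; in practice you would simply `import sudoku as s` and edit the real file.

```python
import importlib
import os
import sys
import tempfile

# Demo setup: write a throwaway module so the example is self-contained.
sys.dont_write_bytecode = True      # keep stale .pyc files out of the demo
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
module_path = os.path.join(workdir, 'sudoku_demo.py')

with open(module_path, 'w') as f:
    f.write("symbols = '123'\n")

importlib.invalidate_caches()
import sudoku_demo as s             # short alias, as suggested above
before = s.symbols

# Edit the file, then reload through the alias instead of restarting.
with open(module_path, 'w') as f:
    f.write("symbols = '123456789'\n")

s = importlib.reload(s)
after = s.symbols
```

One of the caveats of reload is visible here: names previously copied with `from module import *` are not updated, and existing instances keep their old classes, which is why going through the alias (or just re-running a test script) tends to be less surprising.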
printing escape character
Hi, I want to print string without "decoding" escaped characters to newline etc. like print "a\nb" -> a\nb is there a simple way to do it in python or should i somehow use string.replace(..) function ? thanks for any reply hrocho -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 5:34 am, George Sakkis <[EMAIL PROTECTED]> wrote: > On Jan 22, 12:15 am, Paddy <[EMAIL PROTECTED]> wrote: > > > On Jan 22, 3:20 am, Alan Isaac <[EMAIL PROTECTED]> wrote:> I want to > > generate sequential pairs from a list. > > <> > > > What is the fastest way? (Ignore the import time.) > > > 1) How fast is the method you have? > > 2) How much faster does it need to be for your application? > > 3) Are their any other bottlenecks in your application? > > 4) Is this the routine whose smallest % speed-up would give the > > largest overall speed up of your application? > > I believe the "what is the fastest way" question for such small well- > defined tasks is worth asking on its own, regardless of whether it > makes a difference in the application (or even if there is no > application to begin with). Hi George, You need to 'get it right' first. Micro optimizations for speed without thought of the wider context is a bad habit to form and a time waster. If the routine is all that needs to be delivered and it does not perform at an acceptable speed then find out what is acceptable and optimise towards that goal. My questions were set to get posters to think more about the need for speed optimizations and where they should be applied, (if at all). A bit of forethought might justify leaving the routine alone, or optimising for readability instead. - Paddy. -- http://mail.python.org/mailman/listinfo/python-list
Re: isgenerator(...) - anywhere to be found?
Diez B. Roggisch wrote: > Jean-Paul Calderone wrote: > >> On Tue, 22 Jan 2008 15:15:43 +0100, "Diez B. Roggisch" >> <[EMAIL PROTECTED]> wrote: >>> Jean-Paul Calderone wrote: >>> On Tue, 22 Jan 2008 14:20:35 +0100, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > For a simple greenlet/tasklet/microthreading experiment I found myself > in the need to ask the question > > [snip] Why do you need a special case for generators? If you just pass the object in question to iter(), instead, then you'll either get back something that you can iterate over, or you'll get an exception for things that aren't iterable. >>> Because - as I said - I'm working on a micro-thread thingy, where the >>> scheduler needs to push returned generators to a stack and execute them. >>> Using send(), which rules out iter() anyway. >> Sorry, I still don't understand. Why is a generator different from any >> other iterator? > > Because you can use send(value) on it for example. Which you can't with > every other iterator. And that you can utilize to create a little > framework of co-routines or however you like to call it that will yield > values when they want, or generators if they have nested co-routines the > scheduler needs to keep track of and invoke after another.

So if you need the send() method, why not just check for that::

    try:
        obj.send
    except AttributeError:
        # not a generator-like object
        pass
    else:
        # is a generator-like object
        pass

Then anyone who wants to make an extended iterator and return it can expect it to work just like a real generator would.

STeVe
-- http://mail.python.org/mailman/listinfo/python-list
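The duck-typing check suggested above can be wrapped in a small helper, sketched here for current Python. The names `is_generator_like` and `countdown` are invented for the example. For a strict "is this a real generator" test, later Python versions also provide `inspect.isgenerator`, which is used below only for comparison.

```python
import inspect


def is_generator_like(obj):
    """True if obj offers a callable send(), real generator or not."""
    return callable(getattr(obj, 'send', None))


def countdown(n):
    """A tiny generator used to exercise the check."""
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
plain_iter = iter([1, 2, 3])

gen_is_like = is_generator_like(gen)            # generators have send()
iter_is_like = is_generator_like(plain_iter)    # list iterators do not
gen_is_strict = inspect.isgenerator(gen)        # strict type-based check
```

The attribute check accepts any object that implements `send()`, which is the point of the suggestion: a scheduler written this way also works with hand-written coroutine-like classes.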
Re: pairs from a list
Arnaud Delobelle wrote: > pairs4 wins. Oops. I see a smaller difference, but yes, pairs4 wins. Alan Isaac import time from itertools import islice, izip x = range(51) def pairs1(x): return izip(islice(x,0,None,2),islice(x,1,None,2)) def pairs2(x): xiter = iter(x) while True: yield xiter.next(), xiter.next() def pairs3(x): for i in range( len(x)//2 ): yield x[2*i], x[2*i+1], def pairs4(x): xiter = iter(x) return izip(xiter,xiter) t = time.clock() for x1, x2 in pairs1(x): pass t1 = time.clock() - t t = time.clock() for x1, x2 in pairs2(x): pass t2 = time.clock() - t t = time.clock() for x1, x2 in pairs3(x): pass t3 = time.clock() - t t = time.clock() for x1, x2 in pairs4(x): pass t4 = time.clock() - t print t1, t2, t3, t4 Output: 0.317524154606 1.13436847421 1.07100930426 0.262926712753 -- http://mail.python.org/mailman/listinfo/python-list
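For anyone repeating the comparison above, the standard `timeit` module is usually a more reliable harness than hand-rolled `time.clock()` calls (and `time.clock` no longer exists in recent Python versions). The following is a sketch for current Python, where `zip` plays the role of `izip`; the function names and the `repeat` default are assumptions, and no timings are claimed here.

```python
import timeit
from itertools import islice


def pairs_islice(seq):
    """pairs1 from the post: two strided slices zipped together."""
    return zip(islice(seq, 0, None, 2), islice(seq, 1, None, 2))


def pairs_shared_iter(seq):
    """pairs4 from the post: one iterator consumed twice per pair."""
    it = iter(seq)
    return zip(it, it)


def time_pairs(func, seq, repeat=1000):
    """Total seconds to exhaust func(seq) `repeat` times."""
    return timeit.timeit(lambda: list(func(seq)), number=repeat)
```

Usage would be along the lines of `time_pairs(pairs_shared_iter, range(51))`; results depend on the machine and Python version, so compare the numbers only against each other.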
Beginners question about debugging (import)
I'm starting with Python. First with some interactive things, working through the tutorial, then with definitions in a file called sudoku.py. Of course I make lots of mistakes, so I have to include that file time and again. I discovered (the hard way) that the second time you invoke from sudoku.py import * nothing happens. There is reload. But it only seems to work with import sudoku Now I find myself typing ``sudoku.'' all the time: x=sudoku.sudoku() y=sudoku.create_set_of_sets() sudoku.symbols Is there a more convenient way? (This is a howto question, rather difficult to get answered from the documentation.) Groetjes Albert ~ -- -- Albert van der Horst, UTRECHT,THE NETHERLANDS Economic growth -- like all pyramid schemes -- ultimately falters. [EMAIL PROTECTED]&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst -- http://mail.python.org/mailman/listinfo/python-list
Submitting with PAMIE
Hi I really need help. I've been looking around for an answer forever. I need to submit a form with no name and also the submit button has no name or value. How might I go about doing either of these. Thanks -- http://mail.python.org/mailman/listinfo/python-list
Using utidylib, empty string returned in some cases
Hello I'm using debian linux, Python 2.4.4, and utidylib (http:// utidylib.berlios.de/). I wrote simple functions to get a web page, convert it from windows-1251 to utf8 and then I'd like to clean html with it. Here is two pages I use to check my program: http://www.ya.ru/ (in this case everything works ok) http://www.yellow-pages.ru/rus/nd2/qu5/ru15632 (in this case tidy did not return me anything just empty string) code: -- # coding: utf-8 import urllib, urllib2, tidy def get_page(url): user_agent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)' headers = { 'User-Agent' : user_agent } data= {} req = urllib2.Request(url, data, headers) responce = urllib2.urlopen(req) page = responce.read() return page def convert_1251(page): p = page.decode('windows-1251') u = p.encode('utf-8') return u def clean_html(page): tidy_options = { 'output_xhtml' : 1, 'add_xml_decl' : 1, 'indent' : 1, 'input-encoding' : 'utf8', 'output-encoding' : 'utf8', 'tidy_mark' : 1, } cleaned_page = tidy.parseString(page, **tidy_options) return cleaned_page test_url = 'http://www.yellow-pages.ru/rus/nd2/qu5/ru15632' #test_url = 'http://www.ya.ru/' #f = open('yp.html', 'r') #p = f.read() print clean_html(convert_1251(get_page(test_url))) -- What am I doing wrong? Can anyone help, please? -- http://mail.python.org/mailman/listinfo/python-list
Re: Processing XML that's embedded in HTML
On 22 Jan, 17:57, Mike Driscoll <[EMAIL PROTECTED]> wrote: > > I need to parse a fairly complex HTML page that has XML embedded in > it. I've done parsing before with the xml.dom.minidom module on just > plain XML, but I cannot get it to work with this HTML page. It's HTML day on comp.lang.python today! ;-) > The XML looks like this: > > > > Owner > > 1 > > 07/16/2007 > > No > > Doe, John > > 1905 S 3rd Ave , Hicksville IA 9 > > > > > > Owner > > 2 > > 07/16/2007 > > No > > Doe, Jane > > 1905 S 3rd Ave , Hicksville IA 9 > > > > It appears to be enclosed with id="grdRegistrationInquiryCustomers"> You could probably find the Row elements with the following XPath expression: //XML/BoundData/Row More specific would be this: //[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/BoundData/Row See below for the relevance of this. You could also try using getElementById on the document, specifying the id attribute's value given above, then descending to find the Row elements. > The rest of the document is html, javascript div tags, etc. I need the > information only from the row where the Relationship tag = Owner and > the Priority tag = 1. The rest I can ignore. When I tried parsing it > with minidom, I get an ExpatError: mismatched tag: line 1, column 357 > so I think the HTML is probably malformed. Or that it isn't well-formed XML, at least. > I looked at BeautifulSoup, but it seems to separate its HTML > processing from its XML processing. Can someone give me some pointers? With libxml2dom [1] I'd do something like this: import libxml2dom d = libxml2dom.parse(filename, html=1) # or: d = parseURI(uri, html=1) rows = d.xpath("//XML/BoundData/Row") # or: rows = d.xpath("//[EMAIL PROTECTED]"grdRegistrationInquiryCustomers"]/ BoundData/Row") Even though the document is interpreted as HTML, you should get a DOM containing the elements as libxml2 interprets them. > I am currently using Python 2.5 on Windows XP. 
I will be using > Internet Explorer 6 since the document will not display correctly in > Firefox. That shouldn't be much of a surprise, it must be said: it isn't XHTML, where you might be able to extend the document via XML, so the whole document has to be "proper" HTML. Paul [1] http://www.python.org/pypi/libxml2dom -- http://mail.python.org/mailman/listinfo/python-list
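Once the embedded block has been isolated as well-formed XML, the row selection described in the question (Relationship = Owner and Priority = 1) could be sketched with the standard library's ElementTree. The sample document below is hypothetical: the element names `XML`, `BoundData`, `Row`, `Relationship` and `Priority` and the `id` value come from the thread, but `Name` and the overall layout are assumptions, since the original markup was lost from the post. This does not address the malformed surrounding HTML, which is what libxml2dom's HTML mode is for.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the embedded data island.
SAMPLE = """<XML id="grdRegistrationInquiryCustomers"><BoundData>
<Row><Relationship>Owner</Relationship><Priority>1</Priority><Name>Doe, John</Name></Row>
<Row><Relationship>Owner</Relationship><Priority>2</Priority><Name>Doe, Jane</Name></Row>
</BoundData></XML>"""


def primary_owner_rows(xml_text):
    """Return Row elements with Relationship 'Owner' and Priority '1'."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.findall('./BoundData/Row'):
        if (row.findtext('Relationship') == 'Owner'
                and row.findtext('Priority') == '1'):
            rows.append(row)
    return rows
```

The same filter could be applied to the node list returned by the `d.xpath(...)` call shown above, using that library's own accessors.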
Re: Curses and Threading
On 2008-01-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: >> In fact you have *two* threads: the main thread, and the one you create >> explicitly. > >> After you start the clock thread, the main thread continues executing, >> immediately entering the finally clause. >> If you want to wait for the other thread to finish, use the join() method. >> But I'm unsure if this is the right way to mix threads and curses. > > This is what the python documentation says: > > join([timeout]) > Wait until the thread terminates. This blocks the calling thread > until the thread whose join() method is called terminates. > > So according to this since I need to block the main thread until the > clock thread ends I would need the main thread to call > "cadtime().join()", correct? I'm not sure how to do this because I > don't have a class or anything for the main thread that I know of. I > tried putting that after cadtime().start() but that doesn't work. I > guess what I'm trying to say is how can I tell the main thread what to > do when it doesn't exist in my code? > > Thanks for the help > -Brett join() is a method on Thread objects. So you'll need a reference to the Thread you create, then call join() on that. thread = cadtime() thread.start() thread.join() Ian -- http://mail.python.org/mailman/listinfo/python-list
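A runnable sketch of the start-then-join pattern described above. The `Worker` class is a stand-in for the poster's `cadtime` thread, whose code is not shown in this message; the curses part is left out entirely.

```python
import threading


class Worker(threading.Thread):
    """Stand-in for the clock thread; counts a few ticks and exits."""

    def __init__(self):
        threading.Thread.__init__(self)
        self.ticks = 0

    def run(self):
        for _ in range(3):
            self.ticks += 1


worker = Worker()   # keep a reference so join() can be called on it
worker.start()
worker.join()       # the main thread blocks here until run() returns
finished = not worker.is_alive()
```

The key point is that `cadtime().join()` would create a second, unstarted thread object; `join()` has to be called on the same object that `start()` was called on.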
Re: Boa constructor debugging - exec some code at breakpoint?
On Jan 22, 1:23 am, Joel <[EMAIL PROTECTED]> wrote: > Can you please tell me how this can be done.. > are there any other IDEs for the same purpose if Boa can't do it? > > Joel > > On Jan 6, 11:01 am, Joel <[EMAIL PROTECTED]> wrote: > > > Hey there.. > > I'm using boa constructor to debug a python application. For my > > application, I need to insert break points and execute some piece of > > code interactively through shell or someother window when the > > breakpoint has been reached. Unfortunately the shell I think is a > > seperate process so whatever variables are set while executing in > > debugger dont appear in the shell when I try to print using print > > statement. > > > Can anyone tell me how can I do this? > > > Really appreciate any support, Thanks > > > Joel > > P.S. Please CC a copy of reply to my email ID if possible.

IDLE does breakpoints... you might find the ActiveState distro more to your liking too. It's a little bit more fleshed out as an IDE than IDLE is. Or you could go full blown and use Eclipse with the Python plug-in.

Mike
-- http://mail.python.org/mailman/listinfo/python-list
Processing XML that's embedded in HTML
Hi, I need to parse a fairly complex HTML page that has XML embedded in it. I've done parsing before with the xml.dom.minidom module on just plain XML, but I cannot get it to work with this HTML page. The XML looks like this: Owner 1 07/16/2007 No Doe, John 1905 S 3rd Ave , Hicksville IA 9 Owner 2 07/16/2007 No Doe, Jane 1905 S 3rd Ave , Hicksville IA 9 It appears to be enclosed with The rest of the document is html, javascript div tags, etc. I need the information only from the row where the Relationship tag = Owner and the Priority tag = 1. The rest I can ignore. When I tried parsing it with minidom, I get an ExpatError: mismatched tag: line 1, column 357 so I think the HTML is probably malformed. I looked at BeautifulSoup, but it seems to separate its HTML processing from its XML processing. Can someone give me some pointers? I am currently using Python 2.5 on Windows XP. I will be using Internet Explorer 6 since the document will not display correctly in Firefox. Thank you very much! Mike -- http://mail.python.org/mailman/listinfo/python-list
Re: Problem with processing XML
On 22 Jan, 15:11, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote: > > I wrote some code that works on my Linux box using xml.dom.minidom, but > it will not run on the windows box that I really need it on. Python > 2.5.1 on both. > > On the windows machine, it's a clean install of the Python .msi from > python.org. The linux box is Ubuntu 7.10, which has some Python XML > packages installed which can't easily be removed (namely python-libxml2 > and python-xml). I don't think you're straying into libxml2 or PyXML territory here... > I have boiled the code down to its simplest form which shows the problem:- > > import xml.dom.minidom > import sys > > input_file = sys.argv[1]; > output_file = sys.argv[2]; > > doc = xml.dom.minidom.parse(input_file) > file = open(output_file, "w") On Windows, shouldn't this be the following...? file = open(output_file, "wb") > doc.writexml(file) > > The error is:- > > $ python test2.py input2.xml output.xml > Traceback (most recent call last): >File "test2.py", line 9, in > doc.writexml(file) >File "c:\Python25\lib\xml\dom\minidom.py", line 1744, in writexml > node.writexml(writer, indent, addindent, newl) >File "c:\Python25\lib\xml\dom\minidom.py", line 814, in writexml > node.writexml(writer,indent+addindent,addindent,newl) >File "c:\Python25\lib\xml\dom\minidom.py", line 809, in writexml > _write_data(writer, attrs[a_name].value) >File "c:\Python25\lib\xml\dom\minidom.py", line 299, in _write_data > data = data.replace("&", "&amp;").replace("<", "&lt;") > AttributeError: 'NoneType' object has no attribute 'replace' > > As I said, this code runs fine on the Ubuntu box. If I could work out > why the code runs on this box, that would help because then I call set > up the windows box the same way. If I encountered the same issue, I'd have to inspect the goings-on inside minidom, possibly using judicious trace statements in the minidom.py file.
Either way, the above looks like an attribute node produces a value of None rather than any kind of character string. > The input file contains an block which is what actually > causes the problem. If you remove that node and subnodes, it works > fine. For a while at least, you can view the input file at > http://rafb.net/p/5R1JlW12.html The horror! ;-) > Someone suggested that I should try xml.etree.ElementTree, however > writing the same type of simple code to import and then write the file > mangles the xsd:schema stuff because ElementTree does not understand > namespaces. I'll leave this to others: I don't use ElementTree. > By the way, is pyxml a live project or not? Should it still be used? > It's odd that if you go to http://www.python.org/and click the link > "Using python for..." XML, it leads you to > http://pyxml.sourceforge.net/topics/ > > If you then follow the download links to > http://sourceforge.net/project/showfiles.php?group_id=6473 you see that > the latest file is 2004, and there are no versions for newer pythons. > It also says "PyXML is no longer maintained". Shouldn't the link be > removed from python.org? The XML situation in Python's standard library is controversial and can be probably inaccurately summarised by the following chronology: 1. XML is born, various efforts start up (see the qp_xml and xmllib modules). 2. Various people organise themselves, contributing software to the PyXML project (4Suite, xmlproc). 3. The XML backlash begins: we should all apparently be using stuff like YAML (but don't worry if you haven't heard of it). 4. ElementTree is released, people tell you that you shouldn't be using SAX or DOM any more, "pull" parsers are all the rage (although proponents overlook the presence of xml.dom.pulldom in the Python standard library). 5. ElementTree enters the standard library as xml.etree; PyXML falls into apparent disuse (see remarks about SAX and DOM above). 
I think I looked seriously at wrapping libxml2 (with libxml2dom [1]) when I experienced issues with both PyXML and 4Suite when used together with mod_python, since each project used its own Expat libraries and the resulting mis-linked software produced very bizarre results. Moreover, only cDomlette from 4Suite seemed remotely fast, and yet did not seem to be an adequate replacement for the usual PyXML functionality. People will, of course, tell you that you shouldn't use a DOM for anything and that the "consensus" is to use ElementTree or lxml (see above), but I can't help feeling that this has a damaging effect on the XML situation for Python: some newcomers would actually benefit from the traditional APIs, may already be familiar with them from other contexts, and may consider Python lacking if the support for them is in apparent decay. It requires a degree of motivation to actually attempt to maintain software providing such APIs (which was my solution to the problem), but if someone isn't totally bound to Python then they might easily start
Re: stdin, stdout, redmon
On 1/21/2008 9:02 AM, Bernard Desnoues wrote: > Hi, > > I've got a problem with the use of Redmon (redirection port monitor). I > intend to develop a virtual printer so that I can modify data sent to > the printer.

FWIW: there is a nice update to RedMon (v1.7) called RedMon EE (v1.81) available at http://www.is-foehr.com/ that I have used and like a lot. From the developer's website:

Fixed issues and features [with respect to the original RedMon]

* On Windows Terminal Server or Windows XP with fast user switching, the "Prompt for filename" dialog will appear on the current session.
* "SaveAs" now shows XP style dialogs if running under XP
* Support for PDF Security added - experimental -.
* Support for setting the task priority - experimental -
* Use of file-shares as output
* Environment variables are passed to the AfterWorks Process now.
* Environment variables are replaced in the program arguments. No workaround is needed.
* RedMon EE comes with an RPC communication feature which could transfer output-files back to the client starting the print job on a print server. Error messages will be send to the client.
* Redmon EE may start a process after the print job has finished (After works process). e.g. starting a presentation program to show the pdf generated by GhostScript.
* additional debug messages may be written for error analysis. No special debug version is needed.
* user interface has been rewritten. May be it's more friendly. Added some basic system information which may help if running in failures.
* new feature: running on a print server.
* cleanup of documentnames "Microsoft -"
* define templates for output-file names with full environment variable substitution e.g. %homedrive%\%homedir%\%redmon-user%-%date%-%time%-%n.pdf
* RedMon EE does not support for NT 3.5 and Windows 95/98 !

-Thynnus
-- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 4:10 pm, Alan Isaac <[EMAIL PROTECTED]> wrote: > http://bugs.python.org/issue1121416> > > fwiw, > Alan Isaac Thanks. So I guess I shouldn't take the code snippet I quoted as a specification of izip but rather as an illustration. -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list
Re: HTML parsing confusion
Alnilam wrote: > On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote: >> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2, >> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous >> > 200-modules PyXML package installed. And you don't want the 75Kb >> > BeautifulSoup? >> >> I wasn't aware that I had PyXML installed, and can't find a reference >> to having it installed in pydocs. ... > > Ugh. Found it. Sorry about that, but I still don't understand why > there isn't a simple way to do this without using PyXML, BeautifulSoup > or libxml2dom. What's the point in having sgmllib, htmllib, > HTMLParser, and formatter all built in if I have to use someone > else's modules to write a couple of lines of code that achieve the > simple thing I want. I get the feeling that this would be easier if I > just broke down and wrote a couple of regular expressions, but it > hardly seems a 'pythonic' way of going about things. This is simply a gross misunderstanding of what BeautifulSoup or lxml accomplish. Dealing with mal-formatted HTML whilst trying to make _some_ sense of it is by no means trivial. And just because you can come up with a few lines of code using rexes that work for your current use-case doesn't mean that they serve as a general HTML-fixing routine. Or do you think the rather long history and 75Kb of code for BS are because its creator wasn't aware of rexes? And it also makes no sense stuffing everything remotely useful into the standard lib. This would force development and release cycles to be aligned, resulting in far fewer features and less stability than one would wish. And to be honest: I fail to see where your problem is. BeautifulSoup is a single Python file. So whatever you carry with you from machine to machine, if it's capable of holding a file of your own code, you can simply put BeautifulSoup beside it - even if it was a floppy disk. Diez -- http://mail.python.org/mailman/listinfo/python-list
Re: stdin, stdout, redmon
On 1/22/2008 8:54 AM, Konstantin Shaposhnikov wrote:
> Hi,
>
> This is Windows bug that is described here:
> http://support.microsoft.com/default.aspx?kbid=321788
>
> This article also contains solution: you need to add registry value:
>
> HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer
> InheritConsoleHandles = 1 (REG_DWORD type)
>
> Do not forget to launch new console (cmd.exe) after editing registry.
>
> Alternatively you can use following command
>
> cat file | python script.py
>
> instead of
>
> cat file | script.py
>
> Regards,
> Konstantin

Nice one, Konstantin! I can confirm that adding the registry key solves the problem on XPsp2:

-After adding InheritConsoleHandles DWORD 1 key-
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

D:\temp>type test3.py | test3.py
['import sys\n', '\n', 'print sys.stdin.readlines ()\n']

D:\temp>

The KB article is quite poorly written. It seems to state that the issue was 'solved for win2k with sp4, for XP with sp1', and it gives no indication that the key is still needed after the service packs are applied, *even though* it is in fact necessary to the solution.

Questions:
-Any side effects to look out for?
-If the change is relatively benign, should it be part of the install?
-Is this worth a documentation patch? If yes to where, and I'll give it a shot.

-Thynnus
--
http://mail.python.org/mailman/listinfo/python-list
Re: Curses and Threading
> In fact you have *two* threads: the main thread, and the one you create > explicitly. > After you start the clock thread, the main thread continues executing, > immediately entering the finally clause. > If you want to wait for the other thread to finish, use the join() method. > But I'm unsure if this is the right way to mix threads and curses. This is what the python documentation says: join([timeout]) Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates. So according to this since I need to block the main thread until the clock thread ends I would need the main thread to call "cadtime().join()", correct? I'm not sure how to do this because I don't have a class or anything for the main thread that I know of. I tried putting that after cadtime().start() but that doesn't work. I guess what I'm trying to say is how can I tell the main thread what to do when it doesn't exist in my code? Thanks for the help -Brett -- http://mail.python.org/mailman/listinfo/python-list
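A minimal sketch of the join() suggestion quoted above, written for Python 3 and leaving curses out entirely. The class name cadtime comes from the post; its body here (a counter with a short sleep) is a hypothetical stand-in, as are the names main, ticks and seen. The point it tries to show: join() has to be called on the thread object that was actually started, so that object must be kept in a variable. Writing cadtime().join() builds a second thread that was never started, and joining an unstarted thread raises RuntimeError.

```python
import threading
import time


class cadtime(threading.Thread):
    """Hypothetical stand-in for the clock thread from the post."""

    def __init__(self, ticks=3):
        threading.Thread.__init__(self)
        self.ticks = ticks
        self.seen = []

    def run(self):
        for n in range(self.ticks):
            self.seen.append(n)   # a real clock would redraw the time here
            time.sleep(0.01)


def main():
    clock = cadtime()      # keep a reference to the thread object
    clock.start()
    try:
        clock.join()       # the main thread blocks here until run() returns
    finally:
        # cleanup (for example curses.endwin()) would go here; it now runs
        # only after the clock thread has stopped
        pass
    return clock


if __name__ == "__main__":
    finished = main()
    print(finished.seen)
```

The "main thread" needs no class of its own: it is simply whatever code runs at module level or inside main(), and it is that code which calls clock.join().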
Re: isgenerator(...) - anywhere to be found?
On Jan 22, 6:20 am, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: > For a simple greenlet/tasklet/microthreading experiment I found myself in > the need to ask the question > > isgenerator(v) > > but didn't find any implementation in the usual suspects - builtins or > inspect. types.GeneratorType exists in newer Pythons, but I'd suggest just checking for a send method. ;) That way, you can use something that emulates the interface without being forced to use a generator. hasattr(ob, 'send').. -- http://mail.python.org/mailman/listinfo/python-list
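The two checks mentioned above can be compared side by side. This is only a sketch in Python 3 syntax: countdown, FakeCoroutine, is_generator and can_send are hypothetical names made up for the illustration. types.GeneratorType is the strict test; inspect.isgenerator (added to the standard library in Python 2.6) performs the same isinstance check; hasattr(ob, 'send') is the duck-typed variant from the post.

```python
import inspect
import types


def countdown(n):
    # an ordinary generator function
    while n > 0:
        yield n
        n -= 1


class FakeCoroutine(object):
    """Hypothetical object that emulates send() without being a generator."""

    def send(self, value):
        return value


def is_generator(ob):
    # strict check: only real generator objects pass
    return isinstance(ob, types.GeneratorType)


def can_send(ob):
    # duck-typed check: anything with a send attribute passes
    return hasattr(ob, 'send')


if __name__ == "__main__":
    gen = countdown(3)
    print(is_generator(gen), inspect.isgenerator(gen), can_send(gen))
    print(is_generator(FakeCoroutine()), can_send(FakeCoroutine()))
    print(is_generator(iter([1, 2])), can_send(iter([1, 2])))
```

The duck-typed check accepts the emulating object while the strict check rejects it, which is the trade-off the post is pointing at.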
Re: pairs from a list
Arnaud Delobelle wrote:
> According to the docs [1], izip is defined to be equivalent to:
>
>     def izip(*iterables):
>         iterables = map(iter, iterables)
>         while iterables:
>             result = [it.next() for it in iterables]
>             yield tuple(result)
>
> This guarantees that it.next() will be performed from left to right,
> so there is no risk that e.g. pairs4([1, 2, 3, 4]) returns [(2, 1),
> (4, 3)].
>
> Is there anything else that I am overlooking?
>
> [1] http://docs.python.org/lib/itertools-functions.html

http://bugs.python.org/issue1121416

fwiw,
Alan Isaac
--
http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
On Jan 22, 1:19 pm, Alan Isaac <[EMAIL PROTECTED]> wrote:
> I suppose my question should have been,
> is there an obviously faster way?
> Anyway, of the four ways below, the
> first is substantially fastest. Is
> there an obvious reason why?

Can you post your results? I get different ones (pairs1 and pairs2 rewritten slightly to avoid unnecessary indirection).

== pairs.py ===
from itertools import *

def pairs1(x):
    return izip(islice(x,0,None,2),islice(x,1,None,2))

def pairs2(x):
    xiter = iter(x)
    while True:
        yield xiter.next(), xiter.next()

def pairs3(x):
    for i in range( len(x)//2 ):
        yield x[2*i], x[2*i+1],

def pairs4(x):
    xiter = iter(x)
    return izip(xiter,xiter)

def compare():
    import timeit
    for i in '1234':
        t = timeit.Timer('list(pairs.pairs%s(l))' % i, 'import pairs; l=range(1000)')
        print 'pairs%s: %s' % (i, t.timeit(1))

if __name__ == '__main__':
    compare()
=

marigold:python arno$ python pairs.py
pairs1: 0.789824962616
pairs2: 4.08462786674
pairs3: 2.90438890457
pairs4: 0.536775827408

pairs4 wins.

--
Arnaud
--
http://mail.python.org/mailman/listinfo/python-list
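For readers wondering why pairs4 produces pairs at all, here is a small sketch in Python 3 syntax, where the built-in zip is lazy and plays the role of izip. Both arguments to zip are the same iterator object, so building one output tuple consumes two consecutive items. This depends on zip pulling from its arguments left to right, which the current documentation of the built-in zip describes as guaranteed; the example inputs are made up.

```python
def pairs4(x):
    xiter = iter(x)
    # the same iterator is passed twice, so each output tuple
    # takes two consecutive items from it
    return zip(xiter, xiter)


if __name__ == "__main__":
    print(list(pairs4([1, 2, 3, 4])))   # [(1, 2), (3, 4)]
    print(list(pairs4(range(5))))       # [(0, 1), (2, 3)]
```

With an odd number of items the last one is silently dropped, because zip stops as soon as one of its arguments is exhausted.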
Re: Problem with processing XML
On Jan 22, 9:11 am, John Carlyle-Clarke <[EMAIL PROTECTED]> wrote: > By the way, is pyxml a live project or not? Should it still be used? > It's odd that if you go to http://www.python.org/ and click the link > "Using python for..." XML, it leads you to http://pyxml.sourceforge.net/topics/ > > If you then follow the download links > to http://sourceforge.net/project/showfiles.php?group_id=6473 you see that > the latest file is 2004, and there are no versions for newer pythons. > It also says "PyXML is no longer maintained". Shouldn't the link be > removed from python.org? I was wondering that myself. Any answer yet? -- http://mail.python.org/mailman/listinfo/python-list
Re: HTML parsing confusion
On Jan 22, 8:44 am, Alnilam <[EMAIL PROTECTED]> wrote:
> > Pardon me, but the standard issue Python 2.n (for n in range(5, 2,
> > -1)) doesn't have an xml.dom.ext ... you must have the mega-monstrous
> > 200-modules PyXML package installed. And you don't want the 75Kb
> > BeautifulSoup?
>
> I wasn't aware that I had PyXML installed, and can't find a reference
> to having it installed in pydocs. ...

Ugh. Found it. Sorry about that, but I still don't understand why there isn't a simple way to do this without using PyXML, BeautifulSoup or libxml2dom. What's the point in having sgmllib, htmllib, HTMLParser, and formatter all built in if I have to use someone else's modules to write a couple of lines of code that achieve the simple thing I want. I get the feeling that this would be easier if I just broke down and wrote a couple of regular expressions, but it hardly seems a 'pythonic' way of going about things.

# get the source (assuming you don't have it locally and have an internet connection)
>>> import urllib
>>> page = urllib.urlopen("http://diveintopython.org/")
>>> source = page.read()
>>> page.close()

# set up some regex to find tags, strip them out, and correct some formatting oddities
>>> import re
>>> p = re.compile(r'(.*?)',re.DOTALL)
>>> tag_strip = re.compile(r'>(.*?)<',re.DOTALL)
>>> fix_format = re.compile(r'\n +',re.MULTILINE)

# achieve clean results.
>>> paragraphs = re.findall(p,source)
>>> text_list = re.findall(tag_strip,paragraphs[5])
>>> text = "".join(text_list)
>>> clean_text = re.sub(fix_format," ",text)

This works, and is small and easily reproduced, but seems like it would break easily and seems a waste of other *ML specific parsers.
--
http://mail.python.org/mailman/listinfo/python-list
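For comparison with the regex approach above, the same job can be sketched with the event-driven parser from the standard library alone. This is written for Python 3, where the module is html.parser (in the Python 2 of this thread it was named HTMLParser). ParagraphText, paragraph_texts and the sample string are hypothetical names and data for the illustration, and the sketch assumes that only the text inside p elements is wanted; it makes no attempt at the broken-markup recovery that BeautifulSoup provides.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text found inside each <p> element."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.paragraphs = []
        self._pieces = None    # list of text pieces while inside a <p>, else None

    def _end_paragraph(self):
        if self._pieces is not None:
            # join the pieces and collapse runs of whitespace and newlines
            text = ' '.join(''.join(self._pieces).split())
            self.paragraphs.append(text)
            self._pieces = None

    def handle_starttag(self, tag, attrs):
        if tag == 'p':
            self._end_paragraph()   # an unclosed <p> ends where the next begins
            self._pieces = []

    def handle_endtag(self, tag):
        if tag == 'p':
            self._end_paragraph()

    def handle_data(self, data):
        if self._pieces is not None:
            self._pieces.append(data)

    def close(self):
        HTMLParser.close(self)
        self._end_paragraph()       # flush a paragraph left open at end of input


def paragraph_texts(source):
    parser = ParagraphText()
    parser.feed(source)
    parser.close()
    return parser.paragraphs


if __name__ == "__main__":
    sample = "<html><body><p>Dive <b>Into</b>\n   Python</p><p>second</p></body></html>"
    print(paragraph_texts(sample))   # ['Dive Into Python', 'second']
```

Nested inline tags such as b are skipped without any tag-stripping regex, since only handle_data contributes text.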
Re: isgenerator(...) - anywhere to be found?
On Tue, 22 Jan 2008 15:52:02 +0100, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote: >Jean-Paul Calderone wrote: > > [snip] >> >> Sorry, I still don't understand. Why is a generator different from any >> other iterator? > >Because you can use send(value) on it for example. Which you can't with >every other iterator. And that you can utilize to create a little >framework of co-routines or however you like to call it that will yield >values when they want, or generators if they have nested co-routines the >scheduler needs to keep track of and invoke after another. Ah. Thanks for clarifying. Jean-Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: pairs from a list
Alan Isaac>What is the fastest way? (Ignore the import time.)<

Maybe someday someone will realize such stuff belongs to the python STD lib...

If you need a lazy generator without padding, that splits starting from the start, then this is the fastest for me if n is close to 2:

def xpartition(seq, n=2):
    return izip( *(iter(seq),)*n )

If you need the fastest greedy version without padding then there are two answers, one for Psyco and one for Python without... :-)
If you need padding or to start from the end then there are more answers...

Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list
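One of the "more answers" for padding can be sketched as follows, in Python 3 syntax (zip instead of izip, itertools.zip_longest instead of the old izip_longest). xpartition is the function from the post; xpartition_padded and its fill parameter are hypothetical additions for the illustration. No timing claim is made here.

```python
from itertools import zip_longest


def xpartition(seq, n=2):
    # lazy, no padding: leftover items that do not fill a group are dropped
    return zip(*(iter(seq),) * n)


def xpartition_padded(seq, n=2, fill=None):
    # lazy, padded: the last group is filled up with `fill`
    return zip_longest(*(iter(seq),) * n, fillvalue=fill)


if __name__ == "__main__":
    print(list(xpartition(range(7), 3)))          # [(0, 1, 2), (3, 4, 5)]
    print(list(xpartition_padded(range(7), 3)))   # [(0, 1, 2), (3, 4, 5), (6, None, None)]
```

Both versions pass the same iterator object n times, so each output tuple is built from n consecutive items of seq.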