Re: % sign in python?
korean_dave wrote: What does this operator do? Specifically in this context test.log( "[[Log level %d: %s]]" % ( level, msg ), description ) (Tried googling and searching, but the "%" gets interpreted as an operation and distorts the search results) It's the string formatting operator: http://docs.python.org/lib/typesseq-strings.html Btw, a good place to start searching would be: http://docs.python.org/lib/lib.html especially: http://docs.python.org/lib/genindex.html Cheers RB -- http://mail.python.org/mailman/listinfo/python-list
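For illustration, a small sketch of what the % operator does with a format string and a tuple. The level and msg values are made up for the example; the syntax itself is the same in Python 2 and Python 3.

```python
level = 3
msg = "disk almost full"

# %d formats an integer, %s formats any object via str()
line = "[[Log level %d: %s]]" % (level, msg)
print(line)

# With a single value, the tuple is optional
short = "level=%d" % level
print(short)
```

The right-hand operand is a tuple with one value per placeholder, which is why the original call wraps ( level, msg ) in parentheses.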
Re: decorator to prevent adding attributes to class?
Michele Simionato wrote:
> This article could give you some idea (it is doing the opposite,
> warning you if an attribute is overridden):
> http://stacktrace.it/articoli/2008/06/i-pericoli-della-programmazione-con-i-mixin1/
>
> There is also a recipe that does exactly what you want by means of a
> metaclass:
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252158
> It is so short I can write it down here:
>
> # requires Python 2.2+
>
> def frozen(set):
>     "Raise an error when trying to set an undeclared name."
>     def set_attr(self,name,value):
>         if hasattr(self,name):
>             set(self,name,value)
>         else:
>             raise AttributeError("You cannot add attributes to %s" % self)
>     return set_attr
>
> class Frozen(object):
>     """Subclasses of Frozen are frozen, i.e. it is impossible to add
>     new attributes to them and their instances."""
>     __setattr__=frozen(object.__setattr__)
>     class __metaclass__(type):
>         __setattr__=frozen(type.__setattr__)

I don't get it. Why use a metaclass? Wouldn't the following be the same, but easier to grasp:

class Frozen(object):
    def __setattr__(self, name, value):
        if not hasattr(self, name):
            raise AttributeError, "cannot add attributes to %s" % self
        object.__setattr__(self, name, value)

Btw, the main drawback with Frozen is that it does not allow setting any new attribute, even inside __init__.

Some people would advise using __slots__:
http://docs.python.org/ref/slots.html#l2h-222
Some other people would advise NOT using __slots__:
http://groups.google.com/group/comp.lang.python/msg/0f2e859b9c002b28

Personally, if I absolutely must, I'd go for explicitly freezing the object at the end of __init__:

class Freezeable(object):
    def freeze(self):
        self._frozen = None

    def __setattr__(self, name, value):
        if hasattr(self, '_frozen') and not hasattr(self, name):
            raise AttributeError
        object.__setattr__(self, name, value)

class Foo(Freezeable):
    def __init__(self):
        self.bar = 42
        self.freeze() # ok, we set all variables, no more from here

x = Foo()
print x.bar
x.bar = -42
print x.bar
x.baz = "OMG! A typo!"
Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Looking for lots of words in lots of files
I forgot to mention another way: put one thousand monkeys to work on it. ;)

RB
Re: Looking for lots of words in lots of files
brad wrote:
> Just wondering if anyone has ever solved this efficiently... not
> looking for specific solutions tho... just ideas.
>
> I have one thousand words and one thousand files. I need to read the
> files to see if some of the words are in the files. I can stop reading
> a file once I find 10 of the words in it. It's easy for me to do this
> with a few dozen words, but a thousand words is too large for an RE
> and too inefficient to loop, etc. Any suggestions?

The quick answer would be:

grep -F -f WORDLIST FILE1 FILE2 ... FILE1000

where WORDLIST is a file containing the thousand words, one per line.

The more interesting answers would be to use either a suffix tree or an Aho-Corasick graph.

- The suffix tree is a representation of the target string (your files) that allows searching quickly for a word. Your problem would then be solved by 1) building a suffix tree for your files, and 2) searching for each word sequentially in the suffix tree.

- The Aho-Corasick graph is a representation of the query word list that allows fast scanning of a target string for the words. Your problem would then be solved by 1) building an Aho-Corasick graph for the list of words, and 2) scanning each file sequentially.

The preference for one or the other depends on some details of your problem: the expected size of the target files, the rate of overlap between the words in your list (are there common prefixes?), whether you will repeat the operation with another word list or another set of files, etc. Personally, I'd lean towards Aho-Corasick, though it is a matter of taste; the kind of applications that come to my mind make it more practical.

Btw, the `grep -F -f` combo builds an Aho-Corasick graph. Also, you can find modules for building both data structures in the Python package index.

Cheers,
RB
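To make the Aho-Corasick suggestion a bit more concrete, here is a minimal pure-Python sketch (Python 3). It is only an illustration under simple assumptions: the function names, the table layout and the limit parameter are made up for this example, it matches substrings rather than whole words (like grep -F without -w), and a real module from the package index would be the better choice in practice.

```python
from collections import deque

def build_automaton(words):
    """Build the goto, fail and output tables for a list of words."""
    goto = [{}]     # goto[state][char] -> next state
    fail = [0]      # fail[state] -> state to fall back to on a mismatch
    out = [set()]   # out[state] -> words recognized when reaching state
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    # Breadth-first pass: compute failure links level by level
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out

def find_words(text, automaton, limit=10):
    """Scan text once; stop as soon as `limit` distinct words are found."""
    goto, fail, out = automaton
    found = set()
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= out[state]
        if len(found) >= limit:
            break
    return found

automaton = build_automaton(["he", "she", "his", "hers"])
print(find_words("ushers", automaton))
```

The automaton is built once from the word list and then reused for every file, so each file is read only once regardless of how many words are in the list.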
Re: dict order
Peter Otten wrote:
> Robert Bossy wrote:
>> I wish to know how two dict objects are compared. By browsing the
>> archives I gathered that the number of items are first compared, but if
>> the two dict objects have the same number of items, then the comparison
>> algorithm was not mentioned.
>
> If I interpret the comments in
>
> http://svn.python.org/view/python/trunk/Objects/dictobject.c?rev=64048&view=markup
>
> correctly it's roughly
>
> def characterize(d, e):
>     return min(((k, v) for k, v in d.iteritems() if k not in e or e[k] != v),
>                key=lambda (k, v): k)
>
> def dict_compare(d, e):
>     result = cmp(len(d), len(e))
>     if result:
>         return result
>     try:
>         ka, va = characterize(d, e)
>     except ValueError:
>         return 0
>     kb, vb = characterize(e, d)
>     return cmp(ka, kb) or cmp(va, vb)

Thanks, Peter! That was exactly what I was looking for. Quite clever, I might add.

RB
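Since cmp() and tuple-unpacking lambdas no longer exist in Python 3, here is a sketch of the same pseudo-implementation ported to Python 3. The local cmp() helper and the use of None instead of ValueError are additions for this example; it mirrors the code quoted above and makes no claim about what dictobject.c does in every detail.

```python
def cmp(a, b):
    # Local replacement for the Python 2 builtin: -1, 0 or 1
    return (a > b) - (a < b)

def characterize(d, e):
    """Smallest (key, value) of d that is missing from e or differs in e."""
    diffs = [(k, v) for k, v in d.items() if k not in e or e[k] != v]
    return min(diffs, key=lambda kv: kv[0]) if diffs else None

def dict_compare(d, e):
    result = cmp(len(d), len(e))
    if result:
        return result
    a = characterize(d, e)
    if a is None:
        return 0            # same length and no difference: equal
    b = characterize(e, d)  # cannot be None when a is not None
    return cmp(a[0], b[0]) or cmp(a[1], b[1])

print(dict_compare({'a': 1}, {'a': 2}))
```

Like the original, this assumes the keys (and, when needed, the values) can be ordered against each other.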
Re: dict order
Lie wrote:
> Whoops, I think I misunderstood the question. If what you're asking is
> whether two dictionaries are equal (equality comparison, rather than
> sorting comparison), you could do something like this:

Testing for equality and finding differences are trivial tasks indeed. It is the sort order I'm interested in. The meaning of the order is not really an issue; I'm rather looking for a consistent comparison function (in the __cmp__ sense), such that:

if d1 > d2 and d2 > d3, then d1 > d3

I'm not sure the hashing method suggested by Albert guarantees that.

Cheers
dict order
Hi,

I wish to know how two dict objects are compared. By browsing the archives I gathered that the numbers of items are compared first, but the comparison algorithm for two dict objects with the same number of items was not mentioned.

Note that I'm not trying to rely on this order. I'm building a domain-specific language where there's a data structure similar to the Python dict, and I need a source of inspiration for implementing comparisons.

Thanks
RB
Re: sed to python: replace Q
Raymond wrote: For some reason I'm unable to grok Python's string.replace() function. Just trying to parse a simple IP address, wrapped in square brackets, from Postfix logs. In sed this is straightforward given: line = "date process text [ip] more text" sed -e 's/^.*\[//' -e 's/].*$//' alternatively: sed -e 's/.*\[\(.*\)].*/\1/' yet the following Python code does nothing: line = line.replace('^.*\[', '', 1) line = line.replace('].*$', '') Is there a decent description of string.replace() somewhere? In python shell: help(str.replace) Online: http://docs.python.org/lib/string-methods.html#l2h-255 But what you are probably looking for is re.sub(): http://docs.python.org/lib/node46.html#l2h-405 RB -- http://mail.python.org/mailman/listinfo/python-list
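A short sketch of the re.sub() route, written for Python 3. The sample line and the IP address in it are made up; the key point is that str.replace() only substitutes literal text, so the sed-style patterns were being searched for character by character and never found.

```python
import re

line = "date process text [192.0.2.1] more text"

# Same idea as: sed -e 's/.*\[\(.*\)].*/\1/'
ip = re.sub(r'.*\[(.*)\].*', r'\1', line)
print(ip)

# Often simpler: search for the bracketed part directly
match = re.search(r'\[([^\]]*)\]', line)
ip2 = match.group(1) if match else None
print(ip2)
```

The re.search() form also makes it easy to notice lines that contain no bracketed address at all, since match is then None.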
Re: Issue with regular expressions
Julien wrote:
> Hi,
>
> I'm fairly new in Python and I haven't used the regular expressions
> enough to be able to achieve what I want.
> I'd like to select terms in a string, so I can then do a search in my
> database.
>
> query = '   "  some words"  with and "without    quotes   "  '
> p = re.compile(magic_regular_expression)   # <--- the magic happens
> m = p.match(query)
>
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
>
> Is that achievable with a single regular expression, and if so, what
> would it be?
>
> Any help would be much appreciated.

Hi,

I think re is not the best tool for you. Maybe there's a regular expression that does what you want but it will be quite complex and hard to maintain.

I suggest you split the query on the double quotes and process the inside/outside chunks alternately. Something like:

import re

def spulit(s):
    inq = False
    for term in s.split('"'):
        if inq:
            yield re.sub('\s+', ' ', term.strip())
        else:
            for word in term.split():
                yield word
        inq = not inq

for token in spulit('   "  some words"  with and "without    quotes   "  '):
    print token

Cheers,
RB
bisect intersection
Hi,

I stumbled upon a sorted list intersection algorithm by Baeza-Yates which I found quite elegant. For those lucky enough to have SpringerLink access, here's the citation:
http://dblp.uni-trier.de/rec/bibtex/conf/cpm/Baeza-Yates04

I implemented this algorithm in Python and I thought I could share it. I've done some tests and, of course, it can't compete against dict/set intersection, but it performs pretty well. Computing union and differences is left as an exercise...

from bisect import bisect_left

def bisect_intersect(L1, L2):
    inter = []
    def rec(lo1, hi1, lo2, hi2):
        if hi1 <= lo1: return
        if hi2 <= lo2: return
        mid1 = (lo1 + hi1) // 2
        x1 = L1[mid1]
        mid2 = bisect_left(L2, x1, lo=lo2, hi=hi2)
        rec(lo1, mid1, lo2, mid2)
        if mid2 < hi2 and x1 == L2[mid2]:
            inter.append(x1)
            rec(mid1+1, hi1, mid2+1, hi2)
        else:
            rec(mid1+1, hi1, mid2, hi2)
    rec(0, len(L1), 0, len(L2))
    inter.sort()
    return inter

Cheers
RB
Re: Given a string - execute a function by the same name
[EMAIL PROTECTED] wrote: I'm parsing a simple file and given a line's keyword, would like to call the equivalently named function. There are 3 ways I can think to do this (other than a long if/elif construct): 1. eval() 2. Convert my functions to methods and use getattr( myClass, "method" ) 3. Place all my functions in dictionary and lookup the function to be called Any suggestions on the "best" way to do this? (3) is the securest way since the input file cannot induce unexpected behaviour. With this respect (1) is a folly and (2) is a good compromise since you still can write a condition before passing "method" to getattr(). Btw, if you look into the guts, you'll realize that (2) is nearly the same as (3)... RB -- http://mail.python.org/mailman/listinfo/python-list
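A minimal sketch of option (3), written for Python 3. The keywords, the handler functions and the one-keyword-per-line format are invented for the example; only names listed in the dictionary can ever be called, which is what makes this approach safe against unexpected input.

```python
def do_start(arg):
    return "starting %s" % arg

def do_stop(arg):
    return "stopping %s" % arg

# Explicit table of allowed keywords: nothing else is reachable
HANDLERS = {"start": do_start, "stop": do_stop}

def dispatch(line):
    keyword, _, arg = line.strip().partition(" ")
    handler = HANDLERS.get(keyword)
    if handler is None:
        raise ValueError("unknown keyword: %r" % keyword)
    return handler(arg)

print(dispatch("start engine"))
```

With getattr() the lookup table is the object's attribute dictionary instead of HANDLERS, which is why options (2) and (3) end up so close.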
Re: Little novice program written in Python
Marc 'BlackJack' Rintsch wrote: Indeed. Would it be a sensible proposal that sequence slices should return an iterator instead of a list? I don't think so as that would break tons of code that relies on the current behavior. Take a look at `itertools.islice()` if you want/need an iterator. A pity, imvho. Though I can live with islice() even if it is not as powerful as the [:] notation. RB -- http://mail.python.org/mailman/listinfo/python-list
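For reference, a small sketch of the islice() alternative, written for Python 3 with a made-up sieve-like list. One concrete way in which islice() is less powerful than the [:] notation is that it rejects negative indices.

```python
from itertools import islice

a = [0, 0, 2, 3, 0, 5, 0, 7]

# islice(a, 2, None) walks the same items as a[2:] without copying the list
primes = [p for p in islice(a, 2, None) if p]
print(primes)

# Negative start/stop values work with slices but not with islice()
try:
    islice(a, -2, None)
    negative_ok = True
except ValueError:
    negative_ok = False
```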
Re: multiple pattern regular expression
Arnaud Delobelle wrote:
> micron_make <[EMAIL PROTECTED]> writes:
>> I am trying to parse a file whose contents are :
>>
>> parameter=current
>> max=5A
>> min=2A
>>
>> for a single line I used
>>
>> for line in file:
>>     print re.search("parameter\s*=\s*(.*)",line).groups()
>>
>> is there a way to match multiple patterns using regex and return a
>> dictionary. What I am looking for is (pseudo code)
>>
>> for line in file:
>>     re.search("pattern1" OR "pattern2" OR ..,line)
>>
>> and the result should be {pattern1:match, pattern2:match...}
>>
>> Also should I be using regex at all here ?
>
> If every line of the file is of the form name=value, then regexps are
> indeed not needed. You could do something like that.
>
> params = {}
> for line in file:
>     # maxsplit=1: split on the first '=' only, so values may contain '='
>     name, value = line.strip().split('=', 1)
>     params[name] = value
>
> (untested)

I might add, before you stumble upon the consequences of whitespace around the '=' sign:

    params[name.rstrip()] = value.lstrip()

Cheers,
RB
Re: Little novice program written in Python
John Machin wrote:
> On Apr 25, 5:44 pm, Robert Bossy <[EMAIL PROTECTED]> wrote:
>> Peter Otten wrote:
>>> Rogério Brito wrote:
>>>> i = 2
>>>> while i <= n:
>>>>     if a[i] != 0:
>>>>         print a[i]
>>>>     i += 1
>>>
>>> You can spell this as a for-loop:
>>>
>>> for p in a:
>>>     if p:
>>>         print p
>>>
>>> It isn't exactly equivalent, but gives the same output as we know that
>>> a[0] and a[1] are also 0.
>>
>> If the OP insists on not examining a[0] and a[1], this will do exactly
>> the same as the while version:
>>
>> for p in a[2:]:
>>     if p:
>>         print p
>
> ... at the cost of almost doubling the amount of memory required.

Indeed. Would it be a sensible proposal that sequence slices should return an iterator instead of a list?

RB
Re: Little novice program written in Python
Peter Otten wrote:
> Rogério Brito wrote:
>> i = 2
>> while i <= n:
>>     if a[i] != 0:
>>         print a[i]
>>     i += 1
>
> You can spell this as a for-loop:
>
> for p in a:
>     if p:
>         print p
>
> It isn't exactly equivalent, but gives the same output as we know that
> a[0] and a[1] are also 0.

If the OP insists on not examining a[0] and a[1], this will do exactly the same as the while version:

for p in a[2:]:
    if p:
        print p

Cheers,
RB
Re: annoying dictionary problem, non-existing keys
bvidinli wrote: i use dictionaries to hold some config data, such as: conf={'key1':'value1','key2':'value2'} and so on... when i try to process conf, i have to code every time like: if conf.has_key('key1'): if conf['key1']<>'': other commands this is very annoying. in php, i was able to code only like: if conf['key1']=='someth' in python, this fails, because, if key1 does not exists, it raises an exception. MY question: is there a way to directly get value of an array/tuple/dict item by key, as in php above, even if key may not exist, i should not check if key exist, i should only use it, if it does not exist, it may return only empty, just as in php i hope you understand my question... If I understand correctly you want default values for non-existing keys. There are two ways for achieving this: Way 1: use the get() method of the dict object: conf.get(key, default) which is the same as: conf[key] if key in conf else default Way 2: make conf a defaultdict instead of a dict, the documentation is there: http://docs.python.org/lib/defaultdict-objects.html Hope this helps, RB -- http://mail.python.org/mailman/listinfo/python-list
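Both ways side by side, as a small sketch for Python 3 (the conf contents are taken from the question, 'key3' is an invented missing key). Note one difference between the two: get() leaves the dict untouched, while looking up a missing key in a defaultdict inserts it.

```python
from collections import defaultdict

conf = {'key1': 'value1', 'key2': 'value2'}

# Way 1: get() returns the default instead of raising KeyError
v1 = conf.get('key1', '')
v3 = conf.get('key3', '')

# Way 2: a defaultdict calls its factory (here str, giving '') for
# missing keys
dconf = defaultdict(str, conf)
d3 = dconf['key3']   # this lookup also inserts 'key3' into dconf

if conf.get('key1') == 'value1':
    print('key1 is set as expected')
```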
Re: Profiling, recursive func slower than imperative, normal?
Gabriel Genellina wrote:
> On Wed, 16 Apr 2008 17:53:16 -0300, <[EMAIL PROTECTED]> wrote:
>
>> On Apr 16, 3:27 pm, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
>>
>>> Any function can be implemented without recursion, although it isn't
>>> always easy or fun.
>>
>> Really? I'm curious about that, I can't figure out how that would
>> work. Could you give an example? Say, for example, the typical: walking
>> through the file system hierarchy (without using os.walk(), which uses
>> recursion anyway!).
>
> Use a queue of pending directories to visit:
>
> start with empty queue
> queue.put(starting dir)
> while queue is not empty:
>     dir = queue.get()
>     list names in dir
>     for each name:
>         if is subdirectory: queue.put(name)
>         else: process file

Hi,

In that case, I'm not sure you get any performance gain, since the queue has basically the same role as the call stack in the recursive version. A definitive answer calls for an actual test, though.

Anyway, if you want to process the tree depth-first, the queue version falls into the "not fun" category.

Cheers,
RB
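The pseudocode above can be turned into a short runnable sketch (Python 3). This version uses a plain list as a stack rather than a queue, so the visiting order differs from a breadth-first queue; the function name and the choice not to follow symbolic links are assumptions made for the example.

```python
import os

def walk_files(top):
    """Yield the paths of all files under top, without recursion."""
    pending = [top]                 # explicit stack of directories to visit
    while pending:
        directory = pending.pop()
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if os.path.isdir(path) and not os.path.islink(path):
                pending.append(path)    # visit this subdirectory later
            else:
                yield path              # "process file" happens in the caller

for path in walk_files(os.curdir):
    pass  # replace with the actual processing
```

The pending list plays exactly the role the call stack plays in a recursive version, which is the point made in the reply above.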
Re: use object method without initializing object
Reckoner wrote:
> would it be possible to use one of an object's methods without
> initializing the object?
>
> In other words, if I have:
>
> class Test:
>     def __init__(self):
>         print 'init'
>     def foo(self):
>         print 'foo'
>
> and I want to use the foo function without hitting the
> initialize constructor function.
>
> Is this possible?

Hi,

Yes. It is possible and it is called a "class method". That is to say, it is a method bound to the class, and not to the class instances.
In pragmatic terms, class methods differ from instance methods in three ways:
1) You have to declare a classmethod as such with the classmethod() function, or the @classmethod decorator.
2) The first argument is not the instance but the class: to mark this clearly, it is usually named cls instead of self.
3) Classmethods are called on class objects, which looks like this: ClassName.class_method_name(...).

In your example, this becomes:

class Test(object):
    def __init__(self):
        print 'init'
    @classmethod
    def foo(cls):
        print 'foo'

Now call foo without instantiating a Test:
Test.foo()

RB
Re: Process multiple files
Doran, Harold wrote:
> Say I have multiple text files in a single directory, for illustration
> they are called "spam.txt" and "eggs.txt". All of these text files are
> organized in exactly the same way. I have written a program that parses
> each file one at a time. In other words, I need to run my program each
> time I want to process one of these files.
>
> However, because I have hundreds of these files I would like to be able
> to process them all in one fell swoop. The current program is something
> like this:
>
> sample.py
> new_file = open('filename.txt', 'w')
> params = open('eggs.txt', 'r')
> do all the python stuff here
> new_file.close()
>
> If these files followed a naming convention such as 1.txt and 2.txt I
> can easily see how these could be parsed consecutively in a loop.
> However, they are not and so is it possible to modify this code such
> that I can tell python to parse all .txt files in a certain directory
> and then to save them as separate files? For instance, using the example
> above, python would parse both spam.txt and eggs.txt and then save 2
> different files, say as spam_parsed.txt and eggs_parsed.txt.

Hi,

It seems that you need glob.glob():
http://docs.python.org/lib/module-glob.html#l2h-2284

import glob

for txt_filename in glob.glob('/path/to/the/dir/containing/your/files/*.txt'):
    print txt_filename # or do your stuff with txt_filename

RB
Re: Reply: Java or C++?
Penny Y. wrote: > Perl is a functional language I guess you mean functional in the sense it works. RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Java or C++?
[EMAIL PROTECTED] wrote:
> Hello, I was hoping to get some opinions on a subject. I've been
> programming Python for almost two years now. Recently I learned Perl,
> but frankly I'm not very comfortable with it. Now I want to move on
> to either Java or C++, but I'm not sure which. Which one do you think
> is a softer transition for a Python programmer? Which one do you think
> will educate me the best?

Hi,

I vote for Java: the transition will be relatively smoother if you come from Python. Java adds a bit of type-checking, which is a good thing to learn to code with. Also, with Java you'll learn to dig into API documentation.

Brian suggests C++; personally, I'd rather advise C for learning about computers themselves and non-GC memory management. C++ is just too nasty.

If your goal is exclusively education, I suggest a functional language (choose Haskell or any ML dialect) or even a predicate-based language (Prolog or Mercury, but the latter is pretty hardcore). These languages have quite unusual ways of looking at algorithm implementations and they will certainly expand your programming culture.

Cheers,
RB
Re: PROBLEMS WITH PYTHON IN SOME VARIABLE,FUNCTIONS,ETC.
Hi,

First thing, I'd appreciate it (and I'm positive we all would) if you DIDN'T YELL AT ME.

[EMAIL PROTECTED] wrote:
> I am using Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.
> 1310 32 bit (Intel)] on win32 with IDLE 1.2.1
> My O/S is Windows XP SP2 I use 512 MB RAM.
> I am encountering the following problems:
> (i) a1=1
> a2=2
> a3=a1+a2
> print a3
> # The result is coming sometimes as 3 sometimes as vague numbers.

On all computers I work with (two at work and one at home), it always gives me 3.

> (ii) x1="Bangalore is called the Silicon Valley of India"
> x2="NewYork"
> x3=x1.find(x2)
> print x3
> # The result of x3 is coming as -1 as well as +ve numbers.

On my computer, this always gives me -1, which is what I expected since x2 is not in x1. Are you sure you posted what you wanted to show us?

> (iii) I have been designing one crawler using "urllib". For crawling
> one web page it is perfect. But when I am giving around 100 URLs by
> and their links and sublinks the IDLE is not responding. Presently I
> have been running with 10 URLs but can't it be ported?

Maybe you've implemented quadratic algorithms, or even exponential ones. Sorry, I cannot tell without more specifics...

> (iv) I have designed a program with more than 500 if elif else but
> sometimes it is running fine sometimes it is giving hugely erroneous
> results, one view of the code:
> elif a4==3:
>     print "YOU HAVE NOW ENTERED THREE WORDS"
>     if a3[0] not in a6:
>         if a3[1] not in a6:
>             if a3[2] not in a6:
>                 print "a3[0] not in a6, a3[1] not in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] not in a6, a3[1] not in a6, a3[2] in a6"
>             else:
>                 print "NONE3.1"
>         elif a3[1] in a6:
>             if a3[2] not in a6:
>                 print "a3[0] not in a6, a3[1] in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] not in a6,a3[1] in a6, a3[2] in a6"
>             else:
>                 print "NONE3.2"
>         else:
>             print "NONE3.3"
>     elif a3[0] in a6:
>         if a3[1] not in a6:
>             if a3[2] not in a6:
>                 print "a3[0] in a6, a3[1] not in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] in a6, a3[1] not in a6, a3[2] in a6"
>             else:
>                 print "NONE3.4"
>         elif a3[1] in a6:
>             if a3[2] not in a6:
>                 print "a3[0] in a6, a3[1] in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] in a6, a3[1] in a6, a3[2] in a6"
>             else:
>                 print "NONE3.5"
>         else:
>             print "NONE3.6"
>     else:
>         print "NONE3.7"

I guess you're looking for one or several of three strings inside a longer string. The algorithm is quadratic, so no wonder your software doesn't respond for larger datasets. Someone spoke about Aho-Corasick recently on this list; you should definitely consider it.

Moreover, the least we can say is that it doesn't look pythonic. Do you think the following does the same thing as your snippet?

L = []
for i, x in enumerate(a3):
    if x in a6:
        L.append('a3[%d] in a6' % i)
    else:
        L.append('a3[%d] not in a6' % i)
print ', '.join(L)

RB
Re: reading dictionary's (key,value) from file
[EMAIL PROTECTED] wrote:
> Folks,
> Is it possible to read hash values from txt file.
> I have script which sets options. Hash table has key set to option,
> and values are option values.
>
> Way we have it, we set options in a different file (*.txt), and we
> read from that file.
> Is there easy way for just reading file and setting options instead of
> parsing it.
>
> so this is what my option files look like:
>
> 1opt.txt
> { '-cc': '12',
>   '-I': r'/my/path/work/'}
>
> 2opt.txt
> { '-I': r'/my/path/work2/'}
>
> so my script now has dictionary
> options = { '-cc': '12',
>             '-I': r'/my/path/work/:/my/path/work2/'}
>
> I am trying to avoid parsing

For this particular case, you can use the optparse module:
http://docs.python.org/lib/module-optparse.html

Since you're obviously running commands with different sets of options, I suggest you listen to Diez.

Cheers,
RB
Re: finding euclidean distance,better code?
Gabriel Genellina wrote:
> That's what I said in another paragraph. "sum of coordinates" is using a
> different distance definition; it's the way you measure distance in a city
> with square blocks. I don't know if the distance itself has a name, but

I think it is called Manhattan distance, in reference to the walking distance from one point to another in that city.

RB
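To make the two definitions concrete, a minimal sketch in Python 3 (the function names and the sample points are chosen for the example): the Euclidean distance is the square root of the sum of squared coordinate differences, the Manhattan distance is the sum of their absolute values.

```python
import math

def euclidean(p, q):
    # straight-line distance
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # "city with square blocks" distance
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
```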
Re: Inheritance question
Hi,

I'm not sure what you're actually trying to achieve, but it seems that you want an identifier for classes, not for instances. In this case, setting the id should be kept out of __init__ since it is an instance initializer: make id a class attribute and thus getid() a classmethod.
Furthermore, if you have several Foo subclasses and subsubclasses, etc. and still want to use the same identifier scheme, the getid() method would better be defined once and for all in Foo. I propose the following:

class Foo(object):
    id = 1

    def getid(cls):
        if cls == Foo: return str(cls.id)
        return '%s.%d' % (cls.__bases__[0].getid(), cls.id) # get the parent id and append its own id
    getid = classmethod(getid)

class FooSon(Foo):
    id = 2

class Bar(Foo):
    id = 3

class Toto(Bar):
    id = 1

# Show me that this works
for cls in [Foo, FooSon, Bar, Toto]:
    inst = cls()
    print '%s id: %s\nalso can getid from an instance: %s\n' % (cls.__name__, cls.getid(), inst.getid())

One advantage of this approach is that you don't have to redefine the getid() method for each Foo child and descendant. Unfortunately, the "cls.__bases__[0]" part makes getid() work if and only if the first base class is Foo or a subclass of Foo. You're not using multiple inheritance, are you?

RB
Re: Creating dynamic objects with dynamic constructor args
[EMAIL PROTECTED] wrote:
> I'd like to create objects on the fly from a pointer to the class
> using: instance = klass() But I need to be able to pass in variables
> to the __init__ method. I can recover the arguments using
> inspect.getargspec, but how do I call __init__ with a list of arguments
> and have them unpacked to the argument list rather than passed as a
> single object?
>
> ie. class T:
>         def __init__(self, foo, bar):
>             self.foo = foo
>             self.bar = bar
>
> argspec = inspect.getargspec(T.__init__)
> args = (1, 2)
>
> ??? how do you call T(args)?

The star operator allows you to do this:

T(*args)

You can also use a dict for keyword arguments, with the double-star operator:

class T(object):
    def __init__(self, foo=None, bar=None):
        self.foo = foo
        self.bar = bar

kwargs = {'bar': 1, 'foo': 2}
T(**kwargs)

RB
Re: dynamically created names / simple problem
Robert Bossy wrote:
> Jules Stevenson wrote:
>
>> Hello all,
>>
>> I'm fairly green to python and programming, so please go gently. The
>> following code
>>
>> for display in secondary:
>>
>> self.("so_active_"+display) = wx.CheckBox(self.so_panel, -1, "checkbox_2")
>>
>> Errors, because of the apparent nastyness at the beginning. What I'm
>> trying to do is loop through a list and create uniquely named wx
>> widgets based on the list values. Obviously the above doesn't work,
>> and is probably naughty - what's a good approach for achieving this?
>
> Hi,
>
> What you're looking for is the builtin function setattr:
> http://docs.python.org/lib/built-in-funcs.html#l2h-66
>
> Your snippet would be written (not tested):
>
> for display in secondary:
>
> setattr(self, "so_active_"+display, wx.CheckBox(self.so_panel, -1,
> "checkbox_2"))

Damn! The indentation didn't come out right, it should be:

for display in secondary:
    setattr(self, "so_active_"+display, wx.CheckBox(self.so_panel, -1, "checkbox_2"))

RB
Re: dynamically created names / simple problem
Jules Stevenson wrote:
>
> Hello all,
>
> I'm fairly green to python and programming, so please go gently. The
> following code
>
> for display in secondary:
>
> self.("so_active_"+display) = wx.CheckBox(self.so_panel, -1, "checkbox_2")
>
> Errors, because of the apparent nastyness at the beginning. What I'm
> trying to do is loop through a list and create uniquely named wx
> widgets based on the list values. Obviously the above doesn't work,
> and is probably naughty - what's a good approach for achieving this?

Hi,

What you're looking for is the builtin function setattr:
http://docs.python.org/lib/built-in-funcs.html#l2h-66

Your snippet would be written (not tested):

for display in secondary:
    setattr(self, "so_active_"+display, wx.CheckBox(self.so_panel, -1, "checkbox_2"))

RB
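For readers without wx at hand, here is a toolkit-free sketch of the same setattr() idea in Python 3. The Panel class, the names list and the string stand-ins for widgets are all invented for the example; in the real code the value would be the wx.CheckBox instance.

```python
class Panel(object):
    def __init__(self, names):
        for name in names:
            # stand-in value; the real code would create a wx.CheckBox here
            setattr(self, "so_active_" + name, "checkbox for %s" % name)

panel = Panel(["left", "right"])

# The attributes can be read back by computed name with getattr()...
value = getattr(panel, "so_active_left")
# ...or accessed directly, like any other attribute
other = panel.so_active_right
print(value, other)
```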
Re: script won't run using cron.d or crontab
Bjorn Meyer wrote:
> I apologize if this has been discussed previously. If so, just point me
> to that information.
>
> I have done a fair bit of digging, but I haven't found a description
> of what to actually do.
>
> I have a fairly lengthy script that I am able to run without any
> problems from a shell. My problem is, now I am wanting to get it
> running using crontab or cron.d. It seems that running it this way
> there is a problem with some of the commands that I am using. For
> instance "commands.getoutput" or "os.access". I am assuming that there
> is something missing within the environment that cron runs that fails
> to allow these commands to run.
> If anyone has any information that would help, it would be greatly
> appreciated.

Hi,

From a shell, type:

    man 5 crontab

and read carefully. You'll realize that a script run by cron does not inherit the environment of the user's shell.

Cheers,
RB
Re: xml sax
Timothy Wu wrote:
> Hi,
>
> I am using xml.sax.handler.ContentHandler to parse some simple xml.
>
> I want to be able to parse the content of this tag embedded in
> the XML.
> <Id>174</Id>
>
> Is the proper way of doing so involving finding the "Id" tag
> from startElement(), setting flag when seeing one, and in characters(),
> when seeing that flag set, save the content?
>
> What if multiple tags of the same name are nested at different levels
> and I want to differentiate them? I would be setting a flag for each level.
> I can imagine things get pretty messy when flags are all around.

Hi,

You could keep a list of all open elements, from the root to the innermost. To maintain such a list, you append the name of the element to this stack at the end of startElement() and pop it off at the end of endElement().

In this way you have access to the path of the current parser position. In order to differentiate between character data in Id and in Id/Id, you just have to look at the last elements of the list.

Cheers,
RB
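A minimal sketch of the stack-of-open-elements idea, written for Python 3. The sample document, the Root and Item element names and the PathHandler class are invented for the example; only the "Id" tag and the value 174 come from the question. Note that a SAX parser is allowed to deliver character data in several chunks, so real code should accumulate text until endElement().

```python
import xml.sax

class PathHandler(xml.sax.ContentHandler):
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.path = []     # names of the currently open elements
        self.found = []    # (path, text) pairs for every Id element

    def startElement(self, name, attrs):
        self.path.append(name)

    def endElement(self, name):
        self.path.pop()

    def characters(self, content):
        # The end of the path tells us which Id we are in
        if self.path and self.path[-1] == "Id" and content.strip():
            self.found.append(("/".join(self.path), content.strip()))

doc = b"<Root><Id>174</Id><Item><Id>42</Id></Item></Root>"
handler = PathHandler()
xml.sax.parseString(doc, handler)
print(handler.found)
```

Comparing self.path[-2:] (or the whole path) against the expected sequence of names replaces the per-level flags from the question.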
Re: lists v. tuples
[EMAIL PROTECTED] wrote: > On Mar 17, 6:49 am, [EMAIL PROTECTED] wrote: > >> What are the considerations in choosing between: >> >>return [a, b, c] >> >> and >> >> return (a, b, c) # or return a, b, c >> >> Why is the immutable form the default? >> > > Using a house definition from some weeks ago, a tuple is a data > structure such which cannot contain a refrence to itself. Can a > single expression refer to itself ever? > In some way, I think this answer will be more confusing than enlightening to the original poster... The difference is that lists are mutable, tuples are not. That means you can do the following with a list: - add element(s) - remove element(s) - re-assign element(s) These operations are impossible on tuples. So, by default, I use lists because they offer more functionality. But if I want to make sure the sequence is not messed up with later, I use tuples. The most frequent case is when a function (or method) returns a sequence whose fate is to be unpacked, things like: def connect(self, server): # try to connect to server return (handler, message,) It is pretty obvious that the returned value will (almost) never be used as is, the caller will most probably want to unpack the pair. Hence the tuple instead of list. There's a little caveat for beginners: the tuple is immutable, which doesn't mean that each element of the tuple is necessarily immutable. Also, I read several times tuples are more efficient than lists, however I wasn't able to actually notice that yet. Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
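The caveat about immutability can be shown in a few lines (Python 3). The values are placeholders chosen for illustration: the tuple itself refuses re-assignment, but a mutable object stored inside it can still change.

```python
# A tuple holding a mutable list and an immutable string.
pair = ([1, 2], "message")

# Re-assigning an element of the tuple is refused...
try:
    pair[0] = [3]
    error = None
except TypeError as exc:
    error = str(exc)

# ...but the list stored inside the tuple is still mutable.
pair[0].append(3)

# Typical use of a returned tuple: unpack it right away.
numbers, message = pair
```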
Re: merging intervals repeatedly
Magdoll wrote: >> One question you should ask yourself is: do you want all solutions? or >> just one? >> If you want just one, there's another question: which one? the one with >> the most intervals? any one? >> > > I actually don't know which solution I want, and that's why I keep > trying different solutions :P > You should think about what is your data and what is probably the "best" solution. >> If you want all of them, then I suggest using prolog rather than python >> (I hope I won't be flamed for advocating another language here). >> > > Will I be able to switch between using prolog & python back and forth > though? Cuz the bulk of my code will still be written in python and > this is just a very small part of it. > You'll have to popen a prolog interpreter and parse its output. Not very sexy. Moreover if you've never done prolog, well, you should be warned it's a "different" language (but still beautiful) with an important learning curve. Maybe not worth it for just one single problem. >> If you have a reasonable number of intervals, you're algorithm seems >> fine. But it is O(n**2), so in the case you read a lot of intervals and >> you observe unsatisfying performances, you will have to store the >> intervals in a cleverer data structure, see one of >> these:http://en.wikipedia.org/wiki/Interval_treehttp://en.wikipedia.org/wiki/Segment_tree >> > > Thanks! Both of these look interesting and potentially useful :) > Indeed. However these structures are clearly heavyweight if the number of intervals is moderate. I would consider them only if I expected more than several thousands of intervals. Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Creating a file with $SIZE
Bryan Olson wrote: > Robert Bossy wrote: > >> [EMAIL PROTECTED] wrote: >> >>> Robert Bossy wrote: >>> >>>> Indeed! Maybe the best choice for chunksize would be the file's buffer >>>> size... >>>> > > That bit strikes me as silly. > The size of the chunk must be as little as possible in order to minimize memory consumption. However below the buffer-size, you'll end up filling the buffer anyway before actually writing on disk. >> Though, as Marco Mariani mentioned, this may create a fragmented file. >> It may or may not be an hindrance depending on what you want to do with >> it, but the circumstances in which this is a problem are quite rare. >> > > Writing zeros might also create a fragmented and/or compressed file. > Using random data, which is contrary to the stated requirement but > usually better for stated application, will prevent compression but > not prevent fragmentation. > > I'm not entirely clear on what the OP is doing. If he's testing > network throughput just by creating this file on a remote server, > the seek-way-past-end-then-write trick won't serve his purpose. > Even if the filesystem has to write all the zeros, the protocols > don't actually send those zeros. Amen. Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: "Attribute Doesnt Exist" ... but.... it does :-s
Robert Rawlins wrote: > Hi Guys, > > Well thanks for the response, I followed your advice and chopped out all the > crap from my class, right down to the bare __init__ and the setter method, > however, the problem continued to persist. > > However, Robert mentioned something about unindented lines which got me > thinking so I deleted my tab indents on that method and replaces them with > standard space-bar indents and it appears to have cured the problem. > Aha! Killed the bug at the first guess! You owe me a beer, mate. RB -- http://mail.python.org/mailman/listinfo/python-list
Re: "Attribute Doesnt Exist" ... but.... it does :-s
Robert Rawlins wrote:
> Hello Guys,
>
> I’ve got an awfully aggravating problem which is causing some
> substantial hair loss this afternoon :-) I want to get your ideas on
> this. I am trying to invoke a particular method in one of my classes,
> and I’m getting a runtime error which is telling me the attribute does
> not exist.
>
> I’m calling the method from within __init__ yet it still seems to
> think it doesn’t exist.
>
> Code:
>
> # Define the RemoteDevice class.
> class remote_device:
>
>     # I'm the class constructor method.
>     def __init__(self, message_list=""):
>         self.set_pending_list(message_list)
>
>     def set_pending_list(self, pending_list):
>         # Set the message list property.
>         self.pending_list = message_list
>
> And the error message which I receive during the instantiation of the
> class:
>
> File: "/path/to/my/files/remote_device.py", line 22, in __init__
>     self.set_pending_list(message_list)
> AttributeError: remote_device instance has no attribute 'set_pending_list'
>
> Does anyone have the slightest idea why this might be happening? I can
> see that the code DOES have that method in it, I also know that I
> don’t get any compile time errors so that should be fine. I know it
> mentions line 22 in the error, but I’ve chopped out a load of non
> relevant code for the sake of posting here.

Hi,

I don't get this error if I run your code. Maybe the code you chopped out causes the error: my guess is that there's a parenthesis mismatch or an unindented line.

Btw, calls to set_pending_list will fail, since the name "message_list" is not defined in its scope (the parameter is named "pending_list").

Please follow Chris Mellon's advice.

Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Creating a file with $SIZE
[EMAIL PROTECTED] wrote: > On Mar 12, 2:44 pm, Robert Bossy <[EMAIL PROTECTED]> wrote: > >> Matt Nordhoff wrote: >> >>> Robert Bossy wrote: >>> >>>> k.i.n.g. wrote: >>>> >>>>> I think I am not clear with my question, I am sorry. Here goes the >>>>> exact requirement. >>>>> >>>>> We use dd command in Linux to create a file with of required size. In >>>>> similar way, on windows I would like to use python to take the size of >>>>> the file( 50MB, 1GB ) as input from user and create a uncompressed >>>>> file of the size given by the user. >>>>> >>>>> ex: If user input is 50M, script should create 50Mb of blank or empty >>>>> file >>>>> >>>> def make_blank_file(path, size): >>>> f = open(path, 'w') >>>> f.seek(size - 1) >>>> f.write('\0') >>>> f.close() >>>> >>>> I'm not sure the f.seek() trick will work on all platforms, so you can: >>>> >>>> def make_blank_file(path, size): >>>> f = open(path, 'w') >>>> f.write('\0' * size) >>>> f.close() >>>> >>> I point out that a 1 GB string is probably not a good idea. >>> >>> def make_blank_file(path, size): >>> chunksize = 10485760 # 10 MB >>> chunk = '\0' * chunksize >>> left = size >>> fh = open(path, 'wb') >>> while left > chunksize: >>> fh.write(chunk) >>> left -= chunksize >>> if left > 0: >>> fh.write('\0' * left) >>> fh.close() >>> >> Indeed! Maybe the best choice for chunksize would be the file's buffer >> size... I won't search the doc how to get the file's buffer size because >> I'm too cool to use that function and prefer the seek() option since >> it's lighning fast regardless the size of the file and it takes near to >> zero memory. >> >> Cheers, >> RB >> > > But what platforms does it work on / not work on? > Posix. It's been ages since I touched Windows, so I don't know if XP and Vista are posix or not. Though, as Marco Mariani mentioned, this may create a fragmented file. It may or may not be an hindrance depending on what you want to do with it, but the circumstances in which this is a problem are quite rare. 
RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Creating a file with $SIZE
Matt Nordhoff wrote: > Robert Bossy wrote: > >> k.i.n.g. wrote: >> >>> I think I am not clear with my question, I am sorry. Here goes the >>> exact requirement. >>> >>> We use dd command in Linux to create a file with of required size. In >>> similar way, on windows I would like to use python to take the size of >>> the file( 50MB, 1GB ) as input from user and create a uncompressed >>> file of the size given by the user. >>> >>> ex: If user input is 50M, script should create 50Mb of blank or empty >>> file >>> >>> >> def make_blank_file(path, size): >> f = open(path, 'w') >> f.seek(size - 1) >> f.write('\0') >> f.close() >> >> I'm not sure the f.seek() trick will work on all platforms, so you can: >> >> def make_blank_file(path, size): >> f = open(path, 'w') >> f.write('\0' * size) >> f.close() >> > > I point out that a 1 GB string is probably not a good idea. > > def make_blank_file(path, size): > chunksize = 10485760 # 10 MB > chunk = '\0' * chunksize > left = size > fh = open(path, 'wb') > while left > chunksize: > fh.write(chunk) > left -= chunksize > if left > 0: > fh.write('\0' * left) > fh.close() > Indeed! Maybe the best choice for chunksize would be the file's buffer size... I won't search the doc how to get the file's buffer size because I'm too cool to use that function and prefer the seek() option since it's lighning fast regardless the size of the file and it takes near to zero memory. Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Creating a file with $SIZE
k.i.n.g. wrote: > I think I am not clear with my question, I am sorry. Here goes the > exact requirement. > > We use dd command in Linux to create a file with of required size. In > similar way, on windows I would like to use python to take the size of > the file( 50MB, 1GB ) as input from user and create a uncompressed > file of the size given by the user. > > ex: If user input is 50M, script should create 50Mb of blank or empty > file > def make_blank_file(path, size): f = open(path, 'w') f.seek(size - 1) f.write('\0') f.close() I'm not sure the f.seek() trick will work on all platforms, so you can: def make_blank_file(path, size): f = open(path, 'w') f.write('\0' * size) f.close() Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
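For reference, here is a sketch of the seek() variant for Python 3. It opens the file in binary mode ('wb'), which the version above does not, and it treats a size of zero as a special case; both are choices of this sketch. Whether the resulting file is sparse or fully allocated depends on the operating system and the filesystem.

```python
import os
import tempfile


def make_blank_file(path, size):
    """Create a file of exactly `size` bytes filled with zero bytes."""
    with open(path, "wb") as f:
        if size > 0:
            # Jump to the last byte and write it; the gap before it
            # reads back as zero bytes.
            f.seek(size - 1)
            f.write(b"\0")


# Usage: a 1 MiB file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "blank.bin")
    make_blank_file(target, 1024 * 1024)
    created_size = os.path.getsize(target)
```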
Re: merging intervals repeatedly
Magdoll wrote: > Hi, > > I have to read through a file that will give me a bunch of intervals. > My ultimate goal is to produce a final set of intervals such that not > two intervals overlap by more than N, where N is a predetermined > length. > > For example, I could read through this input: > (1,10), (3,15), (20,30),(29,40),(51,65),(62,100),(50,66) > > btw, the input is not guaranteed to be in any sorted order. > > say N = 5, so the final set should be > (1,15), (20, 30), (29, 40), (50, 100) > Hi, The problem, as stated here, may have several solutions. For instance the following set of intervals also satisfies the constraint: (1,15), (20,40), (50,100) One question you should ask yourself is: do you want all solutions? or just one? If you want just one, there's another question: which one? the one with the most intervals? any one? If you want all of them, then I suggest using prolog rather than python (I hope I won't be flamed for advocating another language here). > Is there already some existing code in Python that I can easily take > advantage of to produce this? Right now I've written my own simple > solution, which is just to maintain a list of the intervals. I can use > the Interval module, but it doesn't really affect much. I read one > interval from the input file at a time, and use bisect to insert it in > order. The problem comes with merging, which sometimes can be > cascading. > > ex: > read (51,65) ==> put (51,65) in list > read (62,100) ==> put (62,100) in list (overlap only be 4 <= N) > read (50,66) ==> merge with (51,65) to become (50,66) ==> now can > merge with (62,100) The way this algorithm is presented suggests an additional constraint: you cannot merge two intervals if their overlap <= N. In that case, there is a single solution indeed... Nitpick: you cannot merge (50,66) and (62,100) since their overlap is still <= 5. If you have a reasonable number of intervals, you're algorithm seems fine. 
But it is O(n**2), so in the case you read a lot of intervals and you observe unsatisfying performances, you will have to store the intervals in a cleverer data structure, see one of these: http://en.wikipedia.org/wiki/Interval_tree http://en.wikipedia.org/wiki/Segment_tree Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
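Under the stricter reading discussed above (two intervals are merged only when they overlap by more than N), the cascading merge can be sketched as follows in Python 3. The function names are made up, the overlap is taken as the length of the intersection, and the loop is the simple quadratic approach, not an interval tree. I have not checked that the result is independent of the input order in general.

```python
def overlap(a, b):
    """Length of the intersection of two (start, end) intervals; negative if disjoint."""
    return min(a[1], b[1]) - max(a[0], b[0])


def merge_intervals(intervals, n):
    """Merge intervals that overlap by more than n, cascading as needed."""
    result = []
    for current in intervals:
        merged = True
        while merged:
            merged = False
            for other in result:
                if overlap(current, other) > n:
                    # Absorb `other` into `current`, then rescan: the grown
                    # interval may now overlap something else enough to merge.
                    result.remove(other)
                    current = (min(current[0], other[0]),
                               max(current[1], other[1]))
                    merged = True
                    break
        result.append(current)
    return sorted(result)


data = [(1, 10), (3, 15), (20, 30), (29, 40), (51, 65), (62, 100), (50, 66)]
merged = merge_intervals(data, 5)
# merged == [(1, 15), (20, 30), (29, 40), (50, 66), (62, 100)]
```

With this rule, (50,66) and (62,100) stay separate because they overlap by only 4, as noted in the nitpick above.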
Re: difference b/t dictionary{} and anydbm - they seem the same
davidj411 wrote: > anydbm and dictionary{} seem like they both have a single key and key > value. > Can't you put more information into a DBM file or link tables? I just > don't see the benefit except for the persistent storage. Except for the persistent storage, that insignificant feature... ;) Well I guess that persistent storage must be the reason some people use anydbm sometimes. If you want keys and values of any type (not just strings) and persistent storage, you can use builtin dicts then pickle them. Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
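A minimal sketch of the dict-plus-pickle approach, written for Python 3. The helper names and the sample data are invented for the example. Keep in mind that pickle should only be used to load files you trust, and that the standard shelve module offers a dict-like persistent store with string keys if you do not want to rewrite the whole dict each time.

```python
import os
import pickle
import tempfile


def save_dict(d, path):
    """Write the whole dictionary to `path`."""
    with open(path, "wb") as f:
        pickle.dump(d, f)


def load_dict(path):
    """Read back a dictionary written by save_dict()."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Keys and values need not be strings, unlike with a DBM file.
inventory = {("shelf", 3): [1, 2, 3], 42: {"nested": True}}

with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "inventory.pickle")
    save_dict(inventory, filename)
    restored = load_dict(filename)
```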
Re: parsing directory for certain filetypes
jay graves wrote:
> On Mar 10, 9:28 am, Robert Bossy <[EMAIL PROTECTED]> wrote:
>
>> Personally, I'd use glob.glob:
>>
>> import os.path
>> import glob
>>
>> def parsefolder(folder):
>>     path = os.path.normpath(os.path.join(folder, '*.py'))
>>     lst = [ fn for fn in glob.glob(path) ]
>>     lst.sort()
>>     return lst
>
> Why the 'no-op' list comprehension? Typo?

My mistake, it is:

    import os.path
    import glob

    def parsefolder(folder):
        path = os.path.normpath(os.path.join(folder, '*.py'))
        lst = glob.glob(path)
        lst.sort()
        return lst

-- http://mail.python.org/mailman/listinfo/python-list
Re: parsing directory for certain filetypes
royG wrote: > hi > i wrote a function to parse a given directory and make a sorted list > of files with .txt,.doc extensions .it works,but i want to know if it > is too bloated..can this be rewritten in more efficient manner? > > here it is... > > from string import split > from os.path import isdir,join,normpath > from os import listdir > > def parsefolder(dirname): > filenms=[] > folder=dirname > isadr=isdir(folder) > if (isadr): > dirlist=listdir(folder) > filenm="" > This las line is unnecessary: variable scope rules in python are a bit different from what we're used to. You're not required to declare/initialize a variable, you're only required to assign a value before it is referenced. > for x in dirlist: > filenm=x >if(filenm.endswith(("txt","doc"))): > nmparts=[] >nmparts=split(filenm,'.' ) > if((nmparts[1]=='txt') or (nmparts[1]=='doc')): > I don't get it. You've already checked that filenm ends with "txt" or "doc"... What is the purpose of these three lines? Btw, again, nmparts=[] is unnecessary. > filenms.append(filenm) > filenms.sort() > filenameslist=[] > Unnecessary initialization. > filenameslist=[normpath(join(folder,y)) for y in filenms] > numifiles=len(filenameslist) > numifiles is not used so I guess this line is too much. > print filenameslist > return filenameslist > Personally, I'd use glob.glob: import os.path import glob def parsefolder(folder): path = os.path.normpath(os.path.join(folder, '*.py')) lst = [ fn for fn in glob.glob(path) ] lst.sort() return lst I leave you the exercice to add .doc files. But I must say (whoever's listening) that I was a bit disappointed that glob('*.{txt,doc}') didn't work. Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
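One possible answer to the exercise, as a Python 3 sketch: since glob has no brace expansion, run one pattern per extension and combine the results. The extensions parameter and its default are assumptions of this example.

```python
import glob
import os
import tempfile


def parsefolder(folder, extensions=("txt", "doc")):
    """Return the sorted, normalized paths of files with the given extensions."""
    names = []
    for ext in extensions:
        # One glob call per extension.
        names.extend(glob.glob(os.path.join(folder, "*." + ext)))
    return sorted(os.path.normpath(name) for name in names)


# Usage on a throwaway directory with three empty files.
with tempfile.TemporaryDirectory() as tmp:
    for filename in ("a.txt", "b.doc", "c.py"):
        open(os.path.join(tmp, filename), "w").close()
    found = [os.path.basename(p) for p in parsefolder(tmp)]
```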
Re: problem with join
nodrogbrown wrote: > hi > i am using python on WinXP..i have a string 'folder ' that i want to > join to a set of imagefile names to create complete qualified names so > that i can create objects out of them > > folder='F:/brown/code/python/fgrp1' > filenms=['amber1.jpg', 'amber3.jpg', 'amy1.jpg', 'amy2.jpg'] > filenameslist=[] > for x in filenms: > myfile=join(folder,x) > filenameslist.append(myfile) > > now when i print the filenameslist i find that it looks like > > ['F:/brown/code/python/fgrp1\\amber1.jpg', > 'F:/brown/code/python/fgrp1\\amber3.jpg', 'F:/brown/code/python/fgrp1\ > \amy1.jpg', 'F:/brown/code/python/fgrp1\\amy2.jpg'] > > is there some problem with the way i use join? why do i get \\ infront > of the basename? > i would prefer it like 'F:/brown/code/python/fgrp1/basename.jpg', > os.path.join() http://docs.python.org/lib/module-os.path.html#l2h-2185 vs. string.join() http://docs.python.org/lib/node42.html#l2h-379 RB -- http://mail.python.org/mailman/listinfo/python-list
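To make the behaviour in the question concrete, here is a small Python 3 sketch. It uses ntpath and posixpath directly (os.path is ntpath on Windows) so that it gives the same result on any platform. The doubled backslash in the printed list is only how repr() displays a single backslash; the joined names are valid Windows paths either way.

```python
import ntpath
import posixpath

folder = "F:/brown/code/python/fgrp1"
filenms = ["amber1.jpg", "amber3.jpg", "amy1.jpg", "amy2.jpg"]

# What os.path.join does on Windows: it inserts the native separator.
windows_style = [ntpath.join(folder, x) for x in filenms]

# Joining with forward slashes only, if that spelling is preferred.
forward_style = [posixpath.join(folder, x) for x in filenms]

# A single backslash character, shown doubled by repr().
separator_length = len("\\")
```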
Re: why not bisect options?
Aaron Watters wrote: > On Feb 29, 9:31 am, Robert Bossy <[EMAIL PROTECTED]> wrote: > >> Hi all, >> >> I thought it would be useful if insort and consorts* could accept the >> same options than list.sort, especially key and cmp. >> > > Wouldn't this make them slower and less space efficient? It would > be fine to add something like this as an additional elaboration, but > I want bisect to scream as fast as possible in the default streamlined > usage. Yes it is slower and bigger, so I agree that the canonical implementation for default values should be kept. Also because the original bisect functions are actually written in C, the speed difference is even more noticeable. Though, I needed custom ordering bisects since I was implementing interval trees (storing intervals by startpoint/endpoint). Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Eurosymbol in xml document
Diez B. Roggisch wrote:
> Hellmut Weber wrote:
>
>> Hi,
>> i'm new here in this list.
>>
>> i'm developing a little program using an xml document. So far it's easy
>> going, but when parsing an xml document which contains the EURO symbol
>> ('€') then I get an error:
>>
>> UnicodeEncodeError: 'charmap' codec can't encode character u'\xa4' in
>> position 11834: character maps to <undefined>
>>
>> the relevant piece of code is:
>>
>> from xml.dom.minidom import Document, parse, parseString
>> ...
>> doc = parse(inFIleName)
>
> The contents of the file must be encoded with the proper encoding which is
> given in the XML-header, or has to be utf-8 if no header is given.
>
> From the above I think you have a latin1-based document. Does the encoding
> header match?

If the file is declared as latin-1 and contains a euro symbol, then the file is actually invalid, since the euro is not defined in iso-8859-1. If there is no encoding declaration, as Diez already said, the file should be encoded as utf-8.

Try replacing or adding the encoding declaration with iso-8859-15 (also known as latin-9), which is the same as latin-1 with a few changes, including the euro symbol:

<?xml version="1.0" encoding="iso-8859-15"?>

If your file has a lot of unusual diacritics, you might take a look at the small differences between latin-1 and iso-8859-15 in order to make sure that your file won't be broken: http://en.wikipedia.org/wiki/ISO_8859-15

Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
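The difference between the two encodings can be checked directly with Python's codecs (Python 3 syntax). This only illustrates the encoding issue; it does not reproduce the original program.

```python
euro = "\u20ac"  # the euro sign

# ISO-8859-15 has a slot for the euro: byte 0xA4.
as_iso_8859_15 = euro.encode("iso-8859-15")

# ISO-8859-1 (latin-1) has no euro sign at all.
try:
    euro.encode("iso-8859-1")
    latin1_can_encode = True
except UnicodeEncodeError:
    latin1_can_encode = False

# Read as latin-1, the same byte is the generic currency sign U+00A4,
# which is the u'\xa4' that shows up in the error message above.
misread = as_iso_8859_15.decode("iso-8859-1")
```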
Re: Problem round-tripping with xml.dom.minidom pretty-printer
Ben Butler-Cole wrote: >> An additional thing to keep in mind is that toprettyxml does not print >> an XML identical to the original DOM tree: it adds newlines and tabs. >> When parsed again these blank characters are inserted in the DOM tree as >> character nodes. If you toprettyxml an XML document twice in a row, then >> the second one will also add newlines and tabs around the newlines and >> tabs added by the first. Since you call toprettyxml an infinite number >> of times, it is expected that lots of blank characters appear. >> > > Right. That's the behaviour I'm asking about, which I consider to be > problematic. I would expect a module providing a parser and pretty- > printer (not just for XML parsers) to be able to conservatively round- > trip. > > As far as I can see (and your comments back this up) minidom doesn't > have this property. Unless anyone knows how to get it to behave that > way... > minidom --any DOM parser, btw-- has no means to know which blank character is a pretty print artefact or actual blank content from the original XML. You could write a function that strips all-blank nodes recursively down the elements tree, before doing so I suggest you take a look at section 2.10 of http://www.w3.org/TR/REC-xml/. RB -- http://mail.python.org/mailman/listinfo/python-list
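The stripping function suggested above could look like this (Python 3 sketch; the function names are invented). It removes every text node that contains only whitespace, so it is only safe for documents where such whitespace carries no meaning, which is precisely the caveat section 2.10 of the XML recommendation is about.

```python
import xml.dom.minidom as dom


def strip_blank_text(node):
    """Recursively remove text nodes made of whitespace only."""
    for child in list(node.childNodes):  # copy: we mutate while iterating
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_blank_text(child)


def pretty(text):
    """Parse, drop blank text nodes, and pretty-print."""
    document = dom.parseString(text)
    strip_blank_text(document)
    return document.toprettyxml(indent="  ")


# Round trip on a made-up document: the second pass changes nothing.
once = pretty("<a><b><c/></b></a>")
twice = pretty(once)
```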
Re: Problem round-tripping with xml.dom.minidom pretty-printer
Ben Butler-Cole wrote: > Hello > > I have run into a problem using minidom. I have an HTML file that I > want to make occasional, automated changes to (adding new links). My > strategy is to parse it with minidom, add a node, pretty print it and > write it back to disk. > > However I find that every time I do a round trip minidom's pretty > printer puts extra blank lines around every element, so my file grows > without limit. I have found that normalizing the document doesn't make > any difference. Obviously I can fix the problem by doing without the > pretty-printing, but I don't really like producing non-human readable > HTML. > > Here is some code that shows the behaviour: > > import xml.dom.minidom as dom > def p(t): > d = dom.parseString(t) > d.normalize() > t2 = d.toprettyxml() > print t2 > p(t2) > p('') > > Does anyone know how to fix this behaviour? If not, can anyone > recommend an alternative XML tool for simple tasks like this? Hi, The last line of p() calls itself: it is an unconditional recursive call so, no matter what it does, it will never stop. And since p() also prints something, calling it will print endlessly. By removing this line, you get something like: That seems sensible, imo. Was that what you wanted? An additional thing to keep in mind is that toprettyxml does not print an XML identical to the original DOM tree: it adds newlines and tabs. When parsed again these blank characters are inserted in the DOM tree as character nodes. If you toprettyxml an XML document twice in a row, then the second one will also add newlines and tabs around the newlines and tabs added by the first. Since you call toprettyxml an infinite number of times, it is expected that lots of blank characters appear. Finally, normalize() is supposed to merge consecutive sibling character nodes, however it will never remove character contents even if they are blank. 
That means that several character nodes will be replaced by a single one whose content is the concatenation of the respective content of the original nodes. Clear enough? Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: joining strings question
[EMAIL PROTECTED] wrote: > Hi all, > > I have some data with some categories, titles, subtitles, and a link > to their pdf and I need to join the title and the subtitle for every > file and divide them into their separate groups. > > So the data comes in like this: > > data = ['RULES', 'title','subtitle','pdf', > 'title1','subtitle1','pdf1','NOTICES','title2','subtitle2','pdf','title3','subtitle3','pdf'] > > What I'd like to see is this: > > [RULES', 'title subtitle','pdf', 'title1 subtitle1','pdf1'], > ['NOTICES','title2 subtitle2','pdf','title3 subtitle3','pdf'], etc... > > I've racked my brain for a while about this and I can't seem to figure > it out. Any ideas would be much appreciated. > As others already said, the data structure is quite unfit. Therefore I give you one of the ugliest piece of code I've produced in years: r = [] for i in xrange(0, len(data), 7): r.append([data[i], ' '.join((data[i+1], data[i+2],)), data[i+3], ' '.join((data[i+4], data[i+5],)), data[i+6]]) print r Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
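A less rigid variant is sketched below in Python 3, under an assumption the original post does not state: that the set of category names is known in advance. Everything between two categories is then read as (title, subtitle, pdf) triples. The function name is invented, and malformed input (no leading category, incomplete triple) raises an exception.

```python
def group_records(data, categories):
    """Split a flat list into one list per category, joining title and subtitle."""
    groups = []
    i = 0
    while i < len(data):
        if data[i] in categories:
            # Start a new group with the category name.
            groups.append([data[i]])
            i += 1
        else:
            title, subtitle, pdf = data[i:i + 3]
            groups[-1].extend([title + " " + subtitle, pdf])
            i += 3
    return groups


data = ['RULES', 'title', 'subtitle', 'pdf',
        'title1', 'subtitle1', 'pdf1',
        'NOTICES', 'title2', 'subtitle2', 'pdf',
        'title3', 'subtitle3', 'pdf']
grouped = group_records(data, {'RULES', 'NOTICES'})
```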
why not bisect options?
Hi all,

I thought it would be useful if insort and consorts* could accept the same options as list.sort, especially key and cmp.

The only catch I can think of is that nothing prevents a crazy developer from insorting elements into the same list using different options. I foresee two courses of action:
1) let the developer be responsible for the homogeneity of successive insort calls on the same list (remember that the developer is already responsible for giving a sorted list), or
2) make bisect a class which keeps the key and cmp options internally and always uses them for comparison, something like:

    class Bisect:
        def __init__(self, lst=None, key=None, cmp=None):
            self.key = key
            self.cmp = cmp
            # a fresh list per instance: a mutable default would be shared
            self.lst = lst if lst is not None else []
            self.lst.sort(key=key, cmp=cmp)

        def compare_elements(self, a, b):
            # same semantics as list.sort: cmp is applied to the keys
            if self.key is not None:
                a, b = self.key(a), self.key(b)
            if self.cmp is not None:
                return self.cmp(a, b)
            return cmp(a, b)

        def insort_right(self, elt, lo=0, hi=None):
            """Inspired from bisect in the python standard library"""
            if hi is None:
                hi = len(self.lst)
            while lo < hi:
                mid = (lo + hi) // 2
                if self.compare_elements(elt, self.lst[mid]) < 0:
                    hi = mid
                else:
                    lo = mid + 1
            self.lst.insert(lo, elt)
        ...

Any thoughts about this?

RB

* at this point you should smile...
-- http://mail.python.org/mailman/listinfo/python-list
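For readers on Python 3, where cmp is gone, the same idea can be sketched with the C-implemented bisect functions by keeping a parallel list of precomputed keys. The class name KeyedSortedList is invented for this example. Recent Python versions (3.10 and later) also accept a key argument on the bisect functions themselves, which may make such a wrapper unnecessary.

```python
import bisect


class KeyedSortedList:
    """A sorted list that remembers its key function."""

    def __init__(self, items=(), key=None):
        self.key = key if key is not None else (lambda item: item)
        self.items = sorted(items, key=self.key)
        # Parallel list of keys, so bisect compares plain keys.
        self.keys = [self.key(item) for item in self.items]

    def insort_right(self, item):
        """Insert item after any existing items with an equal key."""
        k = self.key(item)
        position = bisect.bisect_right(self.keys, k)
        self.keys.insert(position, k)
        self.items.insert(position, item)


# Usage: intervals ordered by their end point.
by_end = KeyedSortedList([(20, 30), (1, 10)], key=lambda interval: interval[1])
by_end.insort_right((3, 15))
by_end.insort_right((0, 10))
```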
Re: executing a python program by specifying only its name in terminal or command line
Steve Holden wrote: > bharath venkatesh wrote: > >> hi, >>i wanna run a python program by specifying only its name ex prog >> with the arguments in the terminal or command line instead of specifying >> python prog in the terminal to run the program not even specifying the >> it with .py extension .. >> for example i want to run the python program named prog by sepcifying >> $prog -arguments >> instead of >> $python prog -arguments >> or >> $prog.py -arguments >> can anyone tell me how to do it >> >> > reseach pathext for Windows. > > For Unix-like systems use the shebang (#!) line, and don't put a .py at > the end of the filename. Besides being ugly and a bit unsettling for the user, the final .py won't prevent the execution of your program. Though the file must have the executable attribute set, so you have to chmod +x it. Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: Parent instance attribute access
Gabriel Rossetti wrote: > Hello, > > I have something weird going on, I have the following (simplified) : > > class MyFactory(..., ...): > > def __init__(self, *args, **kwargs): > self.args = args > self.kwargs = kwargs > ... > > class MyXmlFactory(MyFactory): > > def __init__(self, *args, **kwargs): > MyFactory.__init__(self, *args, **kwargs) > #self.args = args > #self.kwargs = kwargs >... > > def build(self, addr): > p = self.toto(*self.args, **self.kwargs) > > when build is called I get this : > > exceptions.AttributeError: MyXmlFactory instance has no attribute 'args' > > If I uncomment "self.args = args" and "self.kwargs = kwargs" in > __init__(...) > it works. I find this strange, since in OO MyXmlFactory is a MyFactory > and thus has > "self.args" and "self.kargs", and I explicitly called the paret > __init__(...) method, so I tried this small example : > > >>> class A(object): > ... def __init__(self, *args, **kargs): > ... self.args = args > ... self.kargs = kargs > ... self.toto = 3 > ... > >>> class B(A): > ... def __init__(self, *args, **kargs): > ... A.__init__(self, *args, **kargs) > ... def getToto(self): > ... print str(self.toto) > ... > >>> b = B() > >>> b.getToto() > 3 > > so what I though is correct, so why does it not work with args and > kargs? BTW, If I build a MyFactory and call build, it works as expected. > If you add the following lines ot getToto, it works as expected: print self.args print self.kargs The bug must lay somewhere else in your code. RB -- http://mail.python.org/mailman/listinfo/python-list
Re: check a directory
Raj kumar wrote: > Hi all, > I'm using following code... > > for x in listdir(path) > some code. > > but how i can check whether x is a directory or not? > Because listdir() is giving all the files present in that path Take a look at the module os.path, especially the functions named isdir and walk. RB -- http://mail.python.org/mailman/listinfo/python-list
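A short Python 3 sketch of the isdir check. The important detail is that listdir() returns bare names, so each one has to be joined with the directory path before testing it. The function name and the sample layout are made up.

```python
import os
import tempfile


def subdirectories(path):
    """Return the sorted names of the entries of `path` that are directories."""
    return sorted(
        name for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


# Usage on a throwaway directory with one sub-directory and one file.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "images"))
    open(os.path.join(tmp, "notes.txt"), "w").close()
    dirs_found = subdirectories(tmp)
```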
Re: integrating python with owl
Noorhan Abbas wrote: > Hello, > I have developed an ontology using protege owl and I wonder if you > can help me get any documentation on how to integrate it with python. > Hi, It depends on what you mean by integrating. If you mean reading OWL files generated by Protégé, there are some Python libraries out there though I never tested any: http://eulersharp.sourceforge.net/2004/02swap/OWLLogic/owllogic.html http://seth-scripting.sourceforge.net/ I must warn you, the OWL written by Protégé isn't quite straightforward to parse. Anyway RDFLib seems to be the canonical library for parsing and processing RDF/RDFS. If your goal is to develop plugins in Python. Well... I expect that any solution is based on Jython. A quick googling gave me JOT which is more like a scripting console for Protégé: http://protege.cim3.net/file/work/files/ProtegeScriptConsole/jot-tutorial/ Cheers, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: is this data structure build-in or I'll have to write my own class?
mkPyVS wrote:
> This isn't so optimal but I think accomplishes what you desire to some
> extent... I *think* there is some hidden gem in inheriting from dict
> or an mapping type that is cleaner than what I've shown below though.
>
> class dum_struct:
>     def __init__(self,keyList,valList):
>         self.__orderedKeys = keyList
>         self.__orderedValList = valList
>     def __getattr__(self,name):
>         return self.__orderedValList[self.__orderedKeys.index(name)]
>
> keys = ['foo','baz']
> vals = ['bar','bal']
>
> m = dum_struct(keys,vals)
>
> print m.foo

Let's add this method to the class:

    def __getitem__(self, key):
        return self.__orderedValList[key]

in order to have:

    m.foo == m[0]

RB -- http://mail.python.org/mailman/listinfo/python-list
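Put together, the class with both access styles might look like this in Python 3. The class and attribute names are changed for the sketch, and a guard is added so that a missing name raises AttributeError instead of ValueError. For fixed field names, collections.namedtuple in the standard library gives the same attribute-plus-index access.

```python
class OrderedStruct:
    """Values reachable both by attribute name and by position."""

    def __init__(self, keys, values):
        self._keys = list(keys)
        self._values = list(values)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._values[self._keys.index(name)]
        except ValueError:
            raise AttributeError(name)

    def __getitem__(self, index):
        return self._values[index]


m = OrderedStruct(["foo", "baz"], ["bar", "bal"])
same = (m.foo == m[0])
```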
Re: distutils and data files
Sam Peterson wrote: > I've been googling for a while now and cannot find a good way to deal > with this. > > I have a slightly messy python program I wrote that I've historically > just run from the extracted source folder. I have pictures and sound > files in this folder that this program uses. I've always just used > the relative path names of these files in my program. > > Lately, I had the idea of cleaning up my program and packaging it with > distutils, but I've been stuck on a good way to deal with these > resource files. The package_data keyword seems to be the way to go, > but how can I locate and open my files once they've been moved? In > other words, what should I do about changing the relative path names? > I need something that could work from both the extracted source > folder, AND when the program gets installed via the python setup.py > install command. > This seems to be a classic distutils question: how a python module can access to data files *after* being installed? The following thread addresses this issue: http://www.gossamer-threads.com/lists/python/python/163159 Carl Banks' solution seems to overcome the problem: his trick is to generate an additional configuration module with the relevant informations from the distutil data structure. However it is quite an old thread (2003) and I don't know if there has been progress made since then, maybe the distutils module now incorporates a similar mechanism. Hope it helps, RB -- http://mail.python.org/mailman/listinfo/python-list
Re: dictionary of operators
A.T.Hofkamp wrote:
> On 2008-02-14, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> In the standard library module "operator", it would be nice to have a
>> dictionary mapping operator strings to their respective functions.
>> Something like:
>>
>> {
>>     '+': add,
>>     '-': sub,
>>     'in': contains,
>>     'and': and_,
>>     'or': or_,
>>     ...
>> }
>>
>> Does such a dictionary already exist? Is it really a good and useful idea?
>
> How would you handle changes in operator syntax?
> - I have 'add' instead of '+'
> - I have U+2208 instead of 'in'

Originally I meant only the Python syntax, which shouldn't change that much. For some operators (arithmetic, comparison) the toy language had the same syntax as Python.
Btw, U+2208 would be a wonderful token... if only it were on standard keyboards.

> I don't think this is generally applicable.

Thinking about it, I think it is not really applicable, mainly because my examples were exclusively binary operators. What about unary operators? Or enclosing operators (getitem)?

> Why don't you attach the function to the +/-/in/... token instead? Then you
> don't need the above table at all.

Could be. But I prefer to keep the semantic parts as far as possible from the lexer. Not that I have strong arguments for that, it's religious.

Anyway, thanks for answering,
RB
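For what it's worth, such a table is easy to build by hand for the binary case. The following is only a sketch; the names `BINARY_OPS` and `apply_binary` are invented for the example and nothing like them exists in the `operator` module. Two details need care: `operator.contains` takes its arguments in the opposite order to `in`, and `operator.and_` / `operator.or_` are the bitwise `&` and `|`, not the boolean `and` / `or`.

```python
import operator

# Hypothetical table for a toy language: token string -> two-argument function.
BINARY_OPS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '<': operator.lt,
    '==': operator.eq,
    # operator.contains(container, item) means "item in container",
    # so the operands are swapped to match the infix order.
    'in': lambda item, container: operator.contains(container, item),
    # Boolean and/or have no function in operator; these lambdas return
    # the same value but cannot short-circuit, since both operands are
    # already evaluated when the function is called.
    'and': lambda a, b: a and b,
    'or': lambda a, b: a or b,
}


def apply_binary(token, left, right):
    """Look up the token and apply its function to the two operands."""
    return BINARY_OPS[token](left, right)


print(apply_binary('+', 2, 3))           # 5
print(apply_binary('in', 2, [1, 2, 3]))  # True
```

Unary and enclosing operators would need separate tables or a different calling convention, which is the limitation raised in the post above.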
Re: OT: Speed of light [was Re: Why not a Python compiler?]
Jeff Schwab wrote:
> Erik Max Francis wrote:
>> Jeff Schwab wrote:
>>> Erik Max Francis wrote:
>>>> Robert Bossy wrote:
>>>>> I'm pretty sure we can still hear educated people say that free fall
>>>>> speed depends on the weight of the object without realizing it's a
>>>>> double mistake.
>>>>
>>>> Well, you have to qualify it better than this, because what you've
>>>> stated is actually correct ... in a viscous fluid.
>>>
>>> By definition, that's not free fall.
>>
>> In a technical physics context. But he's talking about posing the
>> question to generally educated people, not physicists (since physicists
>> wouldn't make that error). In popular parlance, "free fall" just means
>> falling freely without restraint (hence "free fall rides," "free
>> falling," etc.). And in that context, in the Earth's atmosphere, you
>> _will_ reach a terminal speed that is dependent on your mass (among
>> other things).
>>
>> So you made precisely my point: The average person would not follow
>> that the question being asked was about an abstract (for people
>> stuck on the surface of the Earth) physics principle, but rather would
>> understand the question to be in a context where the supposedly-wrong
>> statement is _actually true_.
>
> So what's the "double mistake?" My understanding was (1) the misuse
> (ok, vernacular use) of the term "free fall," and (2) the association of
> weight with free-fall velocity ("If I tie an elephant's tail to a
> mouse's, and drop them both into free fall, will the mouse slow the
> elephant down?")

In my mind, the second mistake was the confusion between weight and mass.

Cheers
RB
Re: OT: Speed of light [was Re: Why not a Python compiler?]
Grant Edwards wrote:
> On 2008-02-11, Steve Holden <[EMAIL PROTECTED]> wrote:
>
>> Well the history of physics for at least two hundred years has
>> been a migration away from the intuitive.
>
> Starting at least as far back as Newtonian mechanics. I once
> read a very interesting article about some experiments that
> showed that even simple Newtonian physics is counter-intuitive.
> Two of the experiments I remember vividly. One of them showed
> that the human brain expects objects constrained to travel in a
> curved path will continue to travel in a curved path when
> released. The other showed that the human brain expects that
> when an object is dropped it will land on a spot immediately
> below the drop point -- regardless of whether or not the object
> was in motion horizontally when released.
>
> After repeated attempts at the tasks set for them in the
> experiments, the subjects would learn strategies that would
> work in a Newtonian world, but the initial intuitive reactions
> were very non-Newtonian (regardless of how educated they were
> in physics).

I'm pretty sure we can still hear educated people say that free fall speed depends on the weight of the object without realizing it's a double mistake.

Cheers,
RB