Re: % sign in python?

2008-07-17 Thread Robert Bossy

korean_dave wrote:

What does this operator do? Specifically in this context

test.log( "[[Log level %d: %s]]" % ( level, msg ), description )

(Tried googling and searching, but the "%" gets interpreted as an
operation and distorts the search results)
  

It's the string formatting operator:
   http://docs.python.org/lib/typesseq-strings.html
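A minimal sketch of what the operator does with the format string from the question. The values of level and msg are made up for illustration, and test.log() itself is left out since its API is not shown in the thread:

```python
# %d is replaced by an integer and %s by a string, taken in order
# from the tuple on the right-hand side of the % operator.
level = 2                    # hypothetical value
msg = "disk almost full"     # hypothetical value

formatted = "[[Log level %d: %s]]" % (level, msg)
print(formatted)
```

The tuple must supply exactly as many values as there are % placeholders in the string.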


Btw, a good place to start searching would be:
   http://docs.python.org/lib/lib.html
especially:
   http://docs.python.org/lib/genindex.html

Cheers
RB
--
http://mail.python.org/mailman/listinfo/python-list


Re: decorator to prevent adding attributes to class?

2008-07-11 Thread Robert Bossy

Michele Simionato wrote:

This article could give you some ideas (it is doing the opposite,
warning you if an attribute is overridden):
http://stacktrace.it/articoli/2008/06/i-pericoli-della-programmazione-con-i-mixin1/

There is also a recipe that does exactly what you want by means of a
metaclass:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252158
It is so short I can write it down here:
# requires Python 2.2+

def frozen(set):
    "Raise an error when trying to set an undeclared name."
    def set_attr(self,name,value):
        if hasattr(self,name):
            set(self,name,value)
        else:
            raise AttributeError("You cannot add attributes to %s" % self)
    return set_attr

class Frozen(object):
    """Subclasses of Frozen are frozen, i.e. it is impossible to add
    new attributes to them and their instances."""
    __setattr__=frozen(object.__setattr__)
    class __metaclass__(type):
        __setattr__=frozen(type.__setattr__)
  
I don't get it. Why use a metaclass? Wouldn't the following be the same, 
but easier to grasp:


class Frozen(object):
    def __setattr__(self, name, value):
        if not hasattr(self, name):
            raise AttributeError("cannot add attributes to %s" % self)
        object.__setattr__(self, name, value)

Btw, the main drawback of Frozen is that it does not allow setting any 
new attributes, even inside __init__.



Some people would advise to use __slots__:
   http://docs.python.org/ref/slots.html#l2h-222
Some other people would advise NOT to use __slots__:
   http://groups.google.com/group/comp.lang.python/msg/0f2e859b9c002b28
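For reference, a minimal sketch of the __slots__ approach. The Point class and its attribute names are made up for illustration, and whether this is a good use of __slots__ is exactly what the two links above disagree on:

```python
class Point(object):
    # Only these attribute names may be set on instances.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
p.x = 10                 # fine: 'x' is declared in __slots__

try:
    p.z = 3              # not declared: raises AttributeError
    added = True
except AttributeError:
    added = False
```

Note that a subclass which does not define its own __slots__ gets a __dict__ again, so the restriction is not inherited automatically.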



Personally, if I absolutely had to, I'd explicitly freeze the 
object at the end of __init__:


class Freezeable(object):
    def freeze(self):
        self._frozen = None

    def __setattr__(self, name, value):
        if hasattr(self, '_frozen') and not hasattr(self, name):
            raise AttributeError
        object.__setattr__(self, name, value)


class Foo(Freezeable):
    def __init__(self):
        self.bar = 42
        self.freeze() # ok, we set all variables, no more from here


x = Foo()
print x.bar
x.bar = -42
print x.bar
x.baz = "OMG! A typo!"


Cheers,
RB


Re: Looking for lots of words in lots of files

2008-06-18 Thread Robert Bossy

I forgot to mention another way: put one thousand monkeys to work on it. ;)

RB



Re: Looking for lots of words in lots of files

2008-06-18 Thread Robert Bossy

brad wrote:
Just wondering if anyone has ever solved this efficiently... not 
looking for specific solutions tho... just ideas.


I have one thousand words and one thousand files. I need to read the 
files to see if some of the words are in the files. I can stop reading 
a file once I find 10 of the words in it. It's easy for me to do this 
with a few dozen words, but a thousand words is too large for an RE 
and too inefficient to loop, etc. Any suggestions?

The quick answer would be:
   grep -F -f WORDLIST FILE1 FILE2 ... FILE1000
where WORDLIST is a file containing the thousand words, one per line.

The more interesting answers would be to use either a suffix tree or an 
Aho-Corasick graph.


- The suffix tree is a representation of the target string (your files) 
that allows searching quickly for a word. Your problem would then be 
solved by 1) building a suffix tree for your files, and 2) searching for 
each word sequentially in the suffix tree.


- The Aho-Corasick graph is a representation of the query word list that 
allows fast scanning for the words in a target string. Your problem would 
then be solved by 1) building an Aho-Corasick graph for the list of 
words, and 2) scanning each file sequentially.


The preference for one or the other depends on some details 
of your problem: the expected size of the target files, the rate of 
overlap between words in your list (are there common prefixes?), whether 
you will repeat the operation with another word list or another set of 
files, etc. Personally, I'd lean towards Aho-Corasick, though it is a 
matter of taste; the kinds of applications that come to my mind make it 
more practical.


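Btw, the `grep -F -f` combo relies on the same family of multi-pattern 
matching algorithms (Aho-Corasick, or the closely related Commentz-Walter, 
depending on the grep implementation). Also you can find modules for 
building both data structures in the Python Package Index.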
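To make the Aho-Corasick idea concrete, here is a minimal pure-Python sketch, not a tuned implementation. The function names build_automaton and find_words are made up for this example, and it matches substrings rather than whole words, like grep -F without -w. The limit argument mirrors the "stop after 10 words" requirement:

```python
from collections import deque

def build_automaton(words):
    """Build the trie (goto), failure links (fail) and outputs (out)."""
    goto, fail, out = [{}], [0], [[]]
    for word in words:
        node = 0
        for ch in word:
            nxt = goto[node].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = nxt
            node = nxt
        out[node].append(word)
    # Breadth-first traversal: a node's failure link points to the longest
    # proper suffix of its path that is also a path in the trie.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out

def find_words(text, automaton, limit=10):
    """Scan text once; stop as soon as `limit` distinct words were seen."""
    goto, fail, out = automaton
    found = set()
    node = 0
    for ch in text:
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        found.update(out[node])
        if len(found) >= limit:
            break
    return found

automaton = build_automaton(["he", "she", "his", "hers"])
matches = find_words("ushers", automaton)
```

The automaton is built once for the word list and reused for every file, which is why this approach suits the "one thousand words, one thousand files" case.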


Cheers,
RB


Re: dict order

2008-06-18 Thread Robert Bossy

Peter Otten wrote:

Robert Bossy wrote:

  

I wish to know how two dict objects are compared. By browsing the
archives I gathered that the number of items are first compared, but if
the two dict objects have the same number of items, then the comparison
algorithm was not mentioned.



If I interpret the comments in 


http://svn.python.org/view/python/trunk/Objects/dictobject.c?rev=64048&view=markup

correctly it's roughly

def characterize(d, e):
    return min(((k, v) for k, v in d.iteritems() if k not in e or e[k] != v),
               key=lambda (k, v): k)

def dict_compare(d, e):
    result = cmp(len(d), len(e))
    if result:
        return result
    try:
        ka, va = characterize(d, e)
    except ValueError:
        return 0
    kb, vb = characterize(e, d)
    return cmp(ka, kb) or cmp(va, vb)

Thanks, Peter! That was exactly what I was looking for. Quite clever, I 
might add.
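For readers on a Python version without cmp(), iteritems() or tuple-unpacking lambdas, the same idea can be sketched as follows. This is an adaptation of Peter's approximation, not the actual C implementation; the cmp() helper is defined locally as an assumption, and keys and values are assumed to be mutually orderable:

```python
def cmp(a, b):
    # Local stand-in for the old builtin: -1, 0 or 1.
    return (a > b) - (a < b)

def characterize(d, e):
    """Return the (key, value) of d with the smallest key that is
    missing from e or mapped to a different value, or None."""
    diffs = [(k, v) for k, v in d.items() if k not in e or e[k] != v]
    if not diffs:
        return None
    return min(diffs, key=lambda kv: kv[0])

def dict_compare(d, e):
    result = cmp(len(d), len(e))
    if result:
        return result          # the shorter dict sorts first
    a = characterize(d, e)
    if a is None:
        return 0               # same length, no difference: equal
    b = characterize(e, d)
    return cmp(a[0], b[0]) or cmp(a[1], b[1])

shorter = dict_compare({1: 1}, {1: 1, 2: 2})
same = dict_compare({'a': 1, 'b': 2}, {'a': 1, 'b': 2})
by_value = dict_compare({'a': 1, 'b': 2}, {'a': 1, 'b': 3})
by_key = dict_compare({'a': 1, 'c': 2}, {'a': 1, 'b': 3})
```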


RB


Re: dict order

2008-06-18 Thread Robert Bossy

Lie wrote:

Whoops, I think I misunderstood the question. If what you're asking is
whether two dictionaries are equal (equality comparison, rather than
sorting comparison), you could do something like this:

Testing for equality and finding differences are trivial tasks indeed. 
It is the sort order I'm interested in. The meaning of the order is not 
really an issue, I'm rather looking for a consistent comparison function 
(in the __cmp__ sense) such as:

   if d1 > d2 and d2 > d3,
   then d1 > d3

I'm not sure the hashing method suggested by Albert guarantees that.

Cheers


dict order

2008-06-18 Thread Robert Bossy

Hi,

I wish to know how two dict objects are compared. By browsing the 
archives I gathered that the numbers of items are compared first, but 
the comparison algorithm used when the two dict objects have the same 
number of items was not mentioned.


Note that I'm not trying to rely on this order. I'm building a 
domain-specific language where there's a data structure similar to 
the Python dict, and I need a source of inspiration for implementing 
comparisons.


Thanks
RB


Re: sed to python: replace Q

2008-04-30 Thread Robert Bossy

Raymond wrote:

For some reason I'm unable to grok Python's string.replace() function.
Just trying to parse a simple IP address, wrapped in square brackets,
from Postfix logs. In sed this is straightforward given:

line = "date process text [ip] more text"

  sed -e 's/^.*\[//' -e 's/].*$//'
  

alternatively:
 sed -e 's/.*\[\(.*\)].*/\1/'


yet the following Python code does nothing:

  line = line.replace('^.*\[', '', 1)
  line = line.replace('].*$', '')

Is there a decent description of string.replace() somewhere?
  

In python shell:
   help(str.replace)

Online:
   http://docs.python.org/lib/string-methods.html#l2h-255

But what you are probably looking for is re.sub():
   http://docs.python.org/lib/node46.html#l2h-405
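A minimal sketch of the difference, using a made-up log line in the shape given in the question (the IP address is a documentation placeholder). str.replace() substitutes literal text only, while re.sub() understands patterns like the sed expressions above:

```python
import re

line = "date process text [192.0.2.1] more text"

# str.replace() looks for the literal characters '^.*\[' and finds none,
# so the line comes back unchanged.
unchanged = line.replace('^.*\\[', '', 1)

# Equivalent of: sed -e 's/.*\[\(.*\)].*/\1/'
ip = re.sub(r'^.*\[(.*)\].*$', r'\1', line)
```

Note that Python's re syntax uses plain parentheses for groups, where basic sed syntax needs backslashes.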


RB


Re: Issue with regular expressions

2008-04-29 Thread Robert Bossy

Julien wrote:

Hi,

I'm fairly new to Python and I haven't used regular expressions
enough to be able to achieve what I want.
I'd like to select terms in a string, so I can then do a search in my
database.

query = '   "  some words"  with and "without quotes   "  '
p = re.compile(magic_regular_expression)   # <--- the magic happens
m = p.match(query)

I'd like m.groups() to return:
('some words', 'with', 'and', 'without quotes')

Is that achievable with a single regular expression, and if so, what
would it be?

Any help would be much appreciated.
  

Hi,

I think re is not the best tool for you. Maybe there's a regular 
expression that does what you want but it will be quite complex and hard 
to maintain.


I suggest you split the query with the double quotes and process 
alternate inside/outside chunks. Something like:


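import re

def spulit(s):
    inq = False   # True while we are inside a pair of double quotes
    for term in s.split('"'):
        if inq:
            # quoted chunk: one term, with inner whitespace normalized
            yield re.sub(r'\s+', ' ', term.strip())
        else:
            # unquoted chunk: one term per word
            for word in term.split():
                yield word
        inq = not inq

for token in spulit('   "  some words"  with and "without quotes   "  '):
    print token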
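For comparison, here is one way it could be done with a single pattern, as a sketch only. The function name tokens() is made up, the pattern assumes quotes are balanced and never escaped, and the whitespace inside quoted terms still has to be normalized outside the regular expression:

```python
import re

# Either a double-quoted chunk (group 1) or a run of
# non-space, non-quote characters (group 2).
TERM = re.compile(r'"([^"]*)"|([^\s"]+)')

def tokens(query):
    result = []
    for quoted, bare in TERM.findall(query):
        if bare:
            result.append(bare)
        else:
            term = ' '.join(quoted.split())   # collapse inner whitespace
            if term:
                result.append(term)
    return result

terms = tokens('   "  some words"  with and "without   quotes   "  ')
```

Whether this is easier to maintain than the split-based generator is a matter of taste.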
  
 
Cheers,

RB


bisect intersection

2008-04-28 Thread Robert Bossy

Hi,

I stumbled upon a sorted list intersection algorithm by Baeza-Yates 
which I found quite elegant. For those lucky enough to have SpringerLink 
access, here's the citation:

http://dblp.uni-trier.de/rec/bibtex/conf/cpm/Baeza-Yates04

I implemented this algorithm in python and I thought I could share it. 
I've done some tests and, of course, it can't compete against dict/set 
intersection, but it performs pretty well. Computing union and 
difference is left as an exercise...


from bisect import bisect_left

def bisect_intersect(L1, L2):
    inter = []
    def rec(lo1, hi1, lo2, hi2):
        if hi1 <= lo1: return
        if hi2 <= lo2: return
        mid1 = (lo1 + hi1) // 2
        x1 = L1[mid1]
        mid2 = bisect_left(L2, x1, lo=lo2, hi=hi2)
        rec(lo1, mid1, lo2, mid2)
        if mid2 < hi2 and x1 == L2[mid2]:
            inter.append(x1)
            rec(mid1+1, hi1, mid2+1, hi2)
        else:
            rec(mid1+1, hi1, mid2, hi2)
    rec(0, len(L1), 0, len(L2))
    inter.sort()
    return inter
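A quick usage sketch, repeating the function so that the example is self-contained. The input lists are made up, and both are assumed to be sorted and free of duplicates:

```python
from bisect import bisect_left

def bisect_intersect(L1, L2):
    inter = []
    def rec(lo1, hi1, lo2, hi2):
        if hi1 <= lo1 or hi2 <= lo2:
            return
        mid1 = (lo1 + hi1) // 2              # median of the L1 range
        x1 = L1[mid1]
        mid2 = bisect_left(L2, x1, lo2, hi2) # where x1 would sit in L2
        rec(lo1, mid1, lo2, mid2)            # everything smaller than x1
        if mid2 < hi2 and x1 == L2[mid2]:
            inter.append(x1)
            rec(mid1 + 1, hi1, mid2 + 1, hi2)
        else:
            rec(mid1 + 1, hi1, mid2, hi2)
    rec(0, len(L1), 0, len(L2))
    return inter

evens = list(range(0, 40, 2))
threes = list(range(0, 40, 3))
common = bisect_intersect(evens, threes)
```

Because the left half is always handled before the median and the right half after it, the result comes out already sorted in this variant.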


Cheers
RB


Re: Given a string - execute a function by the same name

2008-04-28 Thread Robert Bossy

[EMAIL PROTECTED] wrote:

I'm parsing a simple file and given a line's keyword, would like to call
the equivalently named function.

There are 3 ways I can think to do this (other than a long if/elif
construct):

1. eval() 


2. Convert my functions to methods and use getattr( myClass, "method" )

3. Place all my functions in dictionary and lookup the function to be
called

Any suggestions on the "best" way to do this?
(3) is the most secure way since the input file cannot induce unexpected 
behaviour.
In this respect, (1) is folly and (2) is a good compromise since you 
can still check a condition before passing "method" to getattr(). Btw, 
if you look into the guts, you'll realize that (2) is nearly the same as 
(3)...
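A minimal sketch of option (3). The keywords, the handler functions and the "keyword followed by arguments" line format are all made up for illustration:

```python
def do_start(arg):
    return "starting " + arg

def do_stop(arg):
    return "stopping " + arg

# Only the functions listed here can ever be triggered by the input file.
HANDLERS = {
    'start': do_start,
    'stop': do_stop,
}

def dispatch(line):
    keyword, _, rest = line.partition(' ')
    handler = HANDLERS.get(keyword)
    if handler is None:
        raise ValueError("unknown keyword: %r" % keyword)
    return handler(rest)

result = dispatch("start engine")
```

With getattr() on a class, the instance's attribute lookup plays the role of HANDLERS, which is why (2) and (3) are so close.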


RB


Re: Little novice program written in Python

2008-04-25 Thread Robert Bossy

Marc 'BlackJack' Rintsch wrote:
Indeed. Would it be a sensible proposal that sequence slices should 
return an iterator instead of a list?



I don't think so as that would break tons of code that relies on the
current behavior.  Take a look at `itertools.islice()` if you want/need
an iterator.

A pity, imvho. Though I can live with islice() even if it is not as 
powerful as the [:] notation.
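For the record, a small sketch of what islice() looks like in the sieve example from this thread. The contents of the list a are made up:

```python
from itertools import islice

a = [0, 0, 2, 3, 0, 5, 0, 7]   # hypothetical sieve result

# Same elements as a[2:], but produced lazily, without copying the list.
primes = [p for p in islice(a, 2, None) if p]
```

Unlike the [:] notation, islice() does not accept negative indices or steps.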


RB


Re: multiple pattern regular expression

2008-04-25 Thread Robert Bossy

Arnaud Delobelle wrote:

micron_make <[EMAIL PROTECTED]> writes:

  

I am trying to parse a file whose contents are :

parameter=current
max=5A
min=2A

for a single line I used

for line in file:
    print re.search("parameter\s*=\s*(.*)",line).groups()

is there a way to match multiple patterns using regex and return a
dictionary. What I am looking for is (pseudo code)

for line in file:
   re.search("pattern1" OR "pattern2" OR ..,line)

and the result should be {pattern1:match, pattern2:match...}

Also should I be using regex at all here ?



If every line of the file is of the form name=value, then regexps are
indeed not needed.  You could do something like that.

params = {}
for line in file:
    name, value = line.strip().split('=', 1)
    params[name] = value


(untested)

I might add before you stumble upon the consequences:
   params[name.rstrip()] = value.lstrip()
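Putting the two pieces together, as a sketch. The function name parse_params() and the decision to skip lines without an equals sign are assumptions made for this example:

```python
def parse_params(lines):
    params = {}
    for line in lines:
        line = line.strip()
        if '=' not in line:
            continue                       # skip blank or malformed lines
        # Split on the first '=' only, so values may contain '=' themselves.
        name, value = line.split('=', 1)
        params[name.rstrip()] = value.lstrip()
    return params

params = parse_params(["parameter = current\n", "max=5A\n", "\n", "min = 2A\n"])
```

Any iterable of lines works here, including an open file object.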

Cheers,
RB


Re: Little novice program written in Python

2008-04-25 Thread Robert Bossy

John Machin wrote:

On Apr 25, 5:44 pm, Robert Bossy <[EMAIL PROTECTED]> wrote:
  

Peter Otten wrote:


Rogério Brito wrote:
  

i = 2
while i <= n:
    if a[i] != 0:
        print a[i]
    i += 1


You can spell this as a for-loop:

for p in a:
    if p:
        print p

It isn't exactly equivalent, but gives the same output as we know that a[0]
and a[1] are also 0.


If the OP insists on not examining a[0] and a[1], this will do exactly
the same as the while version:

for p in a[2:]:
    if p:
        print p




... at the cost of almost doubling the amount of memory required.

Indeed. Would it be a sensible proposal that sequence slices should 
return an iterator instead of a list?


RB


Re: Little novice program written in Python

2008-04-25 Thread Robert Bossy

Peter Otten wrote:

Rogério Brito wrote:

  

i = 2
while i <= n:
    if a[i] != 0:
        print a[i]
    i += 1



You can spell this as a for-loop:

for p in a:
    if p:
        print p

It isn't exactly equivalent, but gives the same output as we know that a[0]
and a[1] are also 0.

If the OP insists on not examining a[0] and a[1], this will do exactly 
the same as the while version:


for p in a[2:]:
    if p:
        print p


Cheers,
RB

Re: annoying dictionary problem, non-existing keys

2008-04-24 Thread Robert Bossy

bvidinli wrote:

i use dictionaries to hold some config data,
such as:

conf={'key1':'value1','key2':'value2'}
and so on...

when i try to process conf, i have to code every time like:
if conf.has_key('key1'):
 if conf['key1']<>'':
 other commands


this is very annoying.
in php, i was able to code only like:
if conf['key1']=='someth'

in python, this fails, because, if key1 does not exists, it raises an exception.

MY question:
is there a way to directly get value of an array/tuple/dict  item by key,
as in php above, even if key may not exist,  i should not check if key exist,
i should only use it, if it does not exist, it may return only empty,
just as in php

i hope you understand my question...
  
If I understand correctly, you want default values for non-existing keys. 
There are two ways of achieving this:


Way 1: use the get() method of the dict object:
   conf.get(key, default)

which is the same as:
   conf[key] if key in conf else default


Way 2: make conf a defaultdict instead of a dict; the documentation is 
here:

http://docs.python.org/lib/defaultdict-objects.html
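Both ways side by side, as a sketch using the conf dict from the question. The key name 'key3' and the empty-string default are chosen for illustration:

```python
from collections import defaultdict

conf = {'key1': 'value1', 'key2': 'value2'}

# Way 1: get() returns the default and leaves conf untouched.
missing = conf.get('key3', '')
matches = conf.get('key1', '') == 'value1'

# Way 2: a defaultdict calls str() to produce '' for unknown keys...
dconf = defaultdict(str, conf)
also_missing = dconf['key3']
# ...but the lookup also inserts the key into the dict as a side effect.
inserted = 'key3' in dconf
```

That side effect is the main practical difference between the two ways.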

Hope this helps,
RB


Re: Profiling, recursive func slower than imperative, normal?

2008-04-17 Thread Robert Bossy
Gabriel Genellina wrote:
> En Wed, 16 Apr 2008 17:53:16 -0300, <[EMAIL PROTECTED]> escribió:
>
>   
>> On Apr 16, 3:27 pm, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
>>
>> 
>>> Any function can be implemented without recursion, although it isn't
>>> always easy or fun.
>>>
>>>   
>> Really? I'm curious about that, I can't figure out how that would
>> work. Could give an example? Say, for example, the typical: walking
>> through the file system hierarchy (without using os.walk(), which uses
>> recursion anyway!).
>> 
>
> Use a queue of pending directories to visit:
>
> start with empty queue
> queue.put(starting dir)
> while queue is not empty:
>dir = queue.get()
>list names in dir
>for each name:
>  if is subdirectory: queue.put(name)
>  else: process file
>   
Hi,

In that case, I'm not sure you get any performance gain since the queue 
has basically the same role as the stack in the recursive version. A 
definitive answer calls for an actual test, though.

Anyway if you want to process the tree depth-first, the queue version 
falls in the "not fun" category.
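The pseudocode quoted above can be sketched in real Python as follows. The function name walk_files() is made up, the temporary directory tree exists only to make the example runnable, and symbolic links to directories are followed naively:

```python
import os
import tempfile
from collections import deque

def walk_files(start):
    """List all files below start without recursion (breadth-first)."""
    pending = deque([start])           # directories still to visit
    files = []
    while pending:
        current = pending.popleft()
        for name in sorted(os.listdir(current)):
            path = os.path.join(current, name)
            if os.path.isdir(path):
                pending.append(path)   # visit later
            else:
                files.append(path)     # "process file"
    return files

# Build a small throwaway tree to try it on.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
for rel in ("top.txt", os.path.join("a", "mid.txt"),
            os.path.join("a", "b", "deep.txt")):
    open(os.path.join(root, rel), "w").close()

found = [os.path.relpath(p, root) for p in walk_files(root)]
```

Replacing popleft() with pop() turns the queue into a stack and the traversal into a depth-first one, although the visiting order then differs from the usual recursive version.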

Cheers,
RB


Re: use object method without initializing object

2008-04-15 Thread Robert Bossy
Reckoner wrote:
> would it be possible to use one of an object's methods without
> initializing the object?
>
> In other words, if I have:
>
> class Test:
>   def __init__(self):
>   print 'init'
>   def foo(self):
>   print 'foo'
>
> and I want to use the foo function without hitting the
> initialize constructor function.
>
> Is this possible?
>   
Hi,

Yes, it is possible: this is called a "class method". That is to say, it 
is a method bound to the class, and not to the class instances.
In pragmatic terms, class methods differ from instance methods in three 
ways:
   1) You have to declare a class method as such with the 
classmethod() function or the @classmethod decorator.
   2) The first argument is not the instance but the class: to mark this 
clearly, it is usually named cls instead of self.
   3) Class methods are called on class objects, which looks like this: 
ClassName.class_method_name(...).

In your example, this becomes:

class Test(object):
    def __init__(self):
        print 'init'

    @classmethod
    def foo(cls):
        print 'foo'


Now call foo without instantiating a Test:
Test.foo()

RB


Re: Process multiple files

2008-04-14 Thread Robert Bossy
Doran, Harold wrote:
> Say I have multiple text files in a single directory, for illustration
> they are called "spam.txt" and "eggs.txt". All of these text files are
> organized in exactly the same way. I have written a program that parses
> each file one at a time. In other words, I need to run my program each
> time I want to process one of these files.
>
> However, because I have hundreds of these files I would like to be able
> to process them all in one fell swoop. The current program is something
> like this:
>
> sample.py
> new_file = open('filename.txt', 'w')
> params = open('eggs.txt', 'r')
>   do all the python stuff here
> new_file.close()
>
> If these files followed a naming convention such as 1.txt and 2.txt I
> can easily see how these could be parsed consecutively in a loop.
> However, they are not and so is it possible to modify this code such
> that I can tell python to parse all .txt files in a certain directory
> and then to save them as separate files? For instance, using the example
> above, python would parse both spam.txt and eggs.txt and then save 2
> different files, say as spam_parsed.txt and eggs_parsed.txt.
>   
Hi,

It seems that you need glob.glob():
http://docs.python.org/lib/module-glob.html#l2h-2284

import glob
for txt_filename in glob.glob('/path/to/the/dir/containing/your/files/*.txt'):
    print txt_filename # or do your stuff with txt_filename
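A slightly fuller sketch of the loop, including the spam_parsed.txt naming from the question. The function names are made up, the upper-casing stands in for "all the python stuff", and the temporary directory exists only to make the example runnable:

```python
import glob
import os
import tempfile

def parsed_name(txt_filename):
    # spam.txt -> spam_parsed.txt
    base, ext = os.path.splitext(txt_filename)
    return base + '_parsed' + ext

def process_all(directory):
    written = []
    for txt_filename in sorted(glob.glob(os.path.join(directory, '*.txt'))):
        if txt_filename.endswith('_parsed.txt'):
            continue                   # do not re-parse earlier output
        with open(txt_filename) as params:
            with open(parsed_name(txt_filename), 'w') as new_file:
                # Placeholder for the real parsing work.
                new_file.write(params.read().upper())
        written.append(parsed_name(txt_filename))
    return written

# Throwaway directory with two input files.
directory = tempfile.mkdtemp()
for name in ('spam.txt', 'eggs.txt'):
    with open(os.path.join(directory, name), 'w') as f:
        f.write('some %s\n' % name)

outputs = [os.path.basename(p) for p in process_all(directory)]
```

The check on '_parsed.txt' matters only when the script is run twice on the same directory.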

RB


Re: Reply: Java or C++?

2008-04-14 Thread Robert Bossy
Penny Y. wrote:
> Perl is a functional language
I guess you mean functional in the sense that it works.

RB


Re: Java or C++?

2008-04-14 Thread Robert Bossy
[EMAIL PROTECTED] wrote:
> Hello, I was hoping to get some opinions on a subject. I've been
> programming Python for almost two years now. Recently I learned Perl,
> but frankly I'm not very comfortable with it. Now I want to move on
> to either Java or C++, but I'm not sure which. Which one do you think
> is a softer transition for a Python programmer? Which one do you think
> will educate me the best?
>   
Hi,

I vote for Java: the transition will be relatively smoother if you come 
from Python. Java adds a bit of type-checking, which is a good thing to 
learn to code with. Also, with Java you'll learn to dig into API 
documentation.

Brian suggests C++; personally, I'd rather advise C for learning about 
computers themselves and non-GC memory management. C++ is just too nasty.

If your goal is exclusively education, I suggest a functional language 
(choose Haskell or any ML dialect) or even a predicate-based language 
(Prolog or Mercury, but the latter is pretty hardcore). These languages 
have quite unusual ways of looking at algorithm implementations and they 
will certainly expand your programming culture.

Cheers,
RB



Re: PROBLEMS WITH PYTHON IN SOME VARIABLE,FUNCTIONS,ETC.

2008-04-08 Thread Robert Bossy
Hi,

First things first: I'd appreciate it (and I'm positive we all would) if 
you DIDN'T YELL AT ME.

[EMAIL PROTECTED] wrote:
> I am using Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.
> 1310 32 bit (Intel)] on win32 with IDLE 1.2.1
> My O/S is Windows XP SP2 I use 512 MB RAM.
> I am encountering the following problems:
> (i) a1=1
> a2=2
> a3=a1+a2
> print a3
> # The result is coming sometimes as 3 sometimes as vague numbers.
>   
On all computers I work with (two at work and one at home), it always 
gives me 3.

> (ii) x1="Bangalore is called the Silicon Valley of India"
> x2="NewYork"
> x3=x1.find(x2)
> print x3
> # The result of x3 is coming as -1 as well as +ve numbers.
>   
On my computer, this always gives me -1 which is what I expected since 
x2 not in x1.
Are you sure you posted what you wanted to show us?
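For reference, a small sketch of how find() behaves, using the strings from the post plus one extra lookup added for illustration:

```python
x1 = "Bangalore is called the Silicon Valley of India"
x2 = "NewYork"

position = x1.find(x2)       # -1 means "not found", it is not an error
present = x2 in x1           # clearer when only a yes/no answer is needed
valley = x1.find("Silicon")  # index of the first character of the match
```

A non-negative result is the offset of the match, so different search strings naturally give different positive numbers.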

> (iii) I have been designing one crawler using "urllib". For crawling
> one web page it is perfect. But when I am giving around 100 URLs by
> and their links and sublinks the IDLE is not responding. Presently I
> have been running with 10 URLs but can't it be ported?
>   
Maybe you've implemented quadratic algorithms, or even exponential. 
Sorry, I cannot see without more specifics...

> (iv) I have designed a program with more than 500 if elif else but
> sometimes it is running fine sometimes it is giving hugely erroneous
> results, one view of the code:
> elif a4==3:
>     print "YOU HAVE NOW ENTERED THREE WORDS"
>     if a3[0] not in a6:
>         if a3[1] not in a6:
>             if a3[2] not in a6:
>                 print "a3[0] not in a6, a3[1] not in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] not in a6, a3[1] not in a6, a3[2] in a6"
>             else:
>                 print "NONE3.1"
>         elif a3[1] in a6:
>             if a3[2] not in a6:
>                 print "a3[0] not in a6, a3[1] in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] not in a6,a3[1] in a6, a3[2] in a6"
>             else:
>                 print "NONE3.2"
>         else:
>             print "NONE3.3"
>     elif a3[0] in a6:
>         if a3[1] not in a6:
>             if a3[2] not in a6:
>                 print "a3[0] in a6, a3[1] not in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] in a6, a3[1] not in a6, a3[2] in a6"
>             else:
>                 print "NONE3.4"
>         elif a3[1] in a6:
>             if a3[2] not in a6:
>                 print "a3[0] in a6, a3[1] in a6, a3[2] not in a6"
>             elif a3[2] in a6:
>                 print "a3[0] in a6, a3[1] in a6, a3[2] in a6"
>             else:
>                 print "NONE3.5"
>         else:
>             print "NONE3.6"
>     else:
>         print "NONE3.7"
>   
I guess you're looking for one or several of three strings inside a 
longer string. The algorithm is quadratic; no wonder your software 
doesn't respond for larger datasets. Someone spoke about Aho-Corasick 
recently on this list; you should definitely consider it.

Moreover, the least we could say is that it doesn't look pythonic. Do 
you think the following does the same thing as your snippet?

L = []
for i, x in enumerate(a3):
    if x in a6:
        L.append('a3[%d] in a6' % i)
    else:
        L.append('a3[%d] not in a6' % i)
print ', '.join(L)


RB


Re: reading dictionary's (key,value) from file

2008-04-07 Thread Robert Bossy
[EMAIL PROTECTED] wrote:
> Folks,
> Is it possible to read hash values from txt file.
> I have script which sets options. Hash table has key set to option,
> and values are option values.
>
> Way we have it, we set options in a different file (*.txt), and we
> read from that file.
> Is there easy way for just reading file and setting options instead of
> parsing it.
>
> so this is what my option files look like:
>
> 1opt.txt
> { '-cc': '12',
>   '-I': r'/my/path/work/'}
>
> 2opt.txt
> {  '-I': r'/my/path/work2/'}
>
> so my script now has dictionary
> options = { '-cc' :'12',
> '-I': r'/my/path/work/:/my/path/work2/'}
>
> I am trying to avoid parsing
>   
For this particular case, you can use the optparse module:
http://docs.python.org/lib/module-optparse.html

Since you're obviously running commands with different sets of options, I 
suggest you listen to Diez.
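A minimal optparse sketch, under some assumptions: optparse only accepts single-letter short options, so the '-cc' option from the post is spelled '--cc' here, and the argument list is written out by hand where a real script would rely on sys.argv:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option('--cc', dest='cc')
# action='append' collects every occurrence of -I into a list.
parser.add_option('-I', dest='include', action='append')

argv = ['--cc', '12', '-I', '/my/path/work/', '-I', '/my/path/work2/']
options, args = parser.parse_args(argv)

# Same colon-separated form as in the options dictionary of the post.
include_path = ':'.join(options.include or [])
```

This replaces the hand-written dictionary files with ordinary command-line arguments, which may or may not fit the way the options are produced.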

Cheers,
RB




Re: finding euclidean distance,better code?

2008-03-28 Thread Robert Bossy
Gabriel Genellina wrote:
> That's what I said in another paragraph. "sum of coordinates" is using a  
> different distance definition; it's the way you measure distance in a city  
> with square blocks. I don't know if the distance itself has a name, but 
I think it is called Manhattan distance, in reference to the walking 
distance from one point to another in that city.
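The two definitions side by side, as a sketch for points given as coordinate tuples. The function names are made up for this example:

```python
import math

def euclidean(p, q):
    # Straight-line distance: square root of the sum of squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # City-block distance: sum of the absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(p, q))

straight = euclidean((0, 0), (3, 4))
blocks = manhattan((0, 0), (3, 4))
```

The Manhattan distance is never smaller than the Euclidean one, which matches the intuition of walking around the blocks instead of through them.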

RB


Re: Inheritance question

2008-03-25 Thread Robert Bossy
Hi,

I'm not sure what you're actually trying to achieve, but it seems that 
you want an identifier for classes, not for instances. In this case, 
setting the id should be kept out of __init__ since it is an instance 
initializer: make id a class attribute and thus getid() a classmethod.
Furthermore, if you have several Foo subclasses and subsubclasses, etc. 
and still want to use the same identifier scheme, the getid() method 
would better be defined once and for all in Foo. I propose the following:


class Foo(object):
    id = 1

    def getid(cls):
        if cls == Foo: return str(cls.id)
        # get the parent id and append its own id
        return '%s.%d' % (cls.__bases__[0].getid(), cls.id)
    getid = classmethod(getid)

class FooSon(Foo):
    id = 2

class Bar(Foo):
    id = 3

class Toto(Bar):
    id = 1

# Show me that this works
for cls in [Foo, FooSon, Bar, Toto]:
    inst = cls()
    print '%s id: %s\nalso can getid from an instance: %s\n' % (
        cls.__name__, cls.getid(), inst.getid())


One advantage of this approach is that you don't have to redefine the 
getid() method for each Foo child and descendant. Unfortunately, the 
"cls.__bases__[0]" part means that getid() works only if the first 
base class is Foo or a subclass of Foo. You're not using multiple 
inheritance, are you?

RB


Re: Creating dynamic objects with dynamic constructor args

2008-03-25 Thread Robert Bossy
[EMAIL PROTECTED] wrote:
> I'd like to create objects on the fly from a pointer to the class 
> using:  instance = klass()  But I need to be able to pass in variables 
> to the __init__ method.  I can recover the arguments using the 
> inspect.argspec, but how do I call __init__ with a list of arguments 
> and have them unpacked to the argument list rather than passed as a 
> single object?
>
> ie. class T:
>   def __init__(self, foo, bar):
>   self.foo = foo
>   self.bar = bar
>
> argspec = inspect.argspec(T.__init__)
> args = (1, 2)
>
> ??? how do you call T(args)?
>   
The star operator allows you to do this:
T(*args)


You can also use a dict for keyword arguments, with the double-star operator:

class T(object):
    def __init__(self, foo=None, bar=None):
        self.foo = foo
        self.bar = bar

kwargs = {'bar': 1, 'foo': 2}
T(**kwargs)
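Both forms in one runnable sketch, with the class passed around as an object the way the question describes. The helper name make() is made up for this example:

```python
class T(object):
    def __init__(self, foo=None, bar=None):
        self.foo = foo
        self.bar = bar

def make(klass, args=(), kwargs=None):
    # Unpack a sequence into positional arguments and a dict into
    # keyword arguments when calling the class.
    return klass(*args, **(kwargs or {}))

t1 = make(T, args=(1, 2))
t2 = make(T, kwargs={'bar': 1, 'foo': 2})
```

Since the class is just an object, klass(...) works for any class whose __init__ accepts the given arguments.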


RB



Re: dynamically created names / simple problem

2008-03-25 Thread Robert Bossy
Robert Bossy wrote:
> Jules Stevenson wrote:
>   
>> Hello all,
>>
>> I'm fairly green to python and programming, so please go gently. The 
>> following code
>>
>> for display in secondary:
>>
>> self.("so_active_"+display) = wx.CheckBox(self.so_panel, -1, "checkbox_2")
>>
>> Errors, because of the apparent nastyness at the beginning. What I’m 
>> trying to do is loop through a list and create uniquely named wx 
>> widgets based on the list values. Obviously the above doesn’t work, 
>> and is probably naughty – what’s a good approach for achieving this?
>>
>> 
> Hi,
>
> What you're looking for is the builtin function setattr:
> http://docs.python.org/lib/built-in-funcs.html#l2h-66
>
> Your snippet would be written (not tested):
>
> for display in secondary:
>
> setattr(self, "so_active_"+display, wx.CheckBox(self.so_panel, -1, 
> "checkbox_2"))
Damn! The indentation didn't come out right, it should be:

for display in secondary:
    setattr(self, "so_active_"+display,
            wx.CheckBox(self.so_panel, -1, "checkbox_2"))
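The same pattern without wx, so that it can be tried in a plain interpreter. The Panel class, the list values and the strings standing in for the checkbox widgets are all made up for illustration:

```python
class Panel(object):
    def __init__(self, secondary):
        for display in secondary:
            # A string stands in for the wx.CheckBox instance here.
            setattr(self, "so_active_" + display, "checkbox for " + display)

panel = Panel(["left", "right"])

direct = panel.so_active_left                   # normal attribute access
dynamic = getattr(panel, "so_active_" + "right")  # lookup by computed name
```

If the widgets are mostly accessed by computed name, a dict keyed by display name may be simpler than dynamically named attributes.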

RB


Re: dynamically created names / simple problem

2008-03-25 Thread Robert Bossy
Jules Stevenson wrote:
>
> Hello all,
>
> I'm fairly green to python and programming, so please go gently. The 
> following code
>
> for display in secondary:
>
> self.("so_active_"+display) = wx.CheckBox(self.so_panel, -1, "checkbox_2")
>
> Errors, because of the apparent nastyness at the beginning. What I’m 
> trying to do is loop through a list and create uniquely named wx 
> widgets based on the list values. Obviously the above doesn’t work, 
> and is probably naughty – what’s a good approach for achieving this?
>
Hi,

What you're looking for is the builtin function setattr:
http://docs.python.org/lib/built-in-funcs.html#l2h-66

Your snippet would be written (not tested):

for display in secondary:

setattr(self, "so_active_"+display, wx.CheckBox(self.so_panel, -1, 
"checkbox_2"))




RB


Re: script won't run using cron.d or crontab

2008-03-21 Thread Robert Bossy
Bjorn Meyer wrote:
> I apologize if this has been discussed previously. If so, just point me 
> to that information.
>
> I have done a fair bit of digging, but I haven't found a description 
> of what to actually do.
>
> I have a fairly lengthy script that I am able to run without any 
> problems from a shell. My problem is, now I am wanting to get it 
> running using crontab or cron.d. It seems that running it this way 
> there is a problem with some of the commands that I am using. For 
> instance "commands.getoutput" or "os.access". I am assuming that there 
> is something missing within the environment that cron runs that fails 
> to allow these commands to run.
> If anyone has any information that would help, it would be greatly 
> appreciated.
Hi,

From a shell, type:
man 5 crontab
and read carefully. You'll realize that a script run by cron does not 
inherit the environment of the user's shell.
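
One way to see the difference for yourself is to dump the environment from both places and compare. This is only a sketch: the function name describe_environment is invented here, and where you redirect the output in your crontab entry is up to you.

```python
import os


def describe_environment(environ=None):
    """Return 'NAME=value' lines, sorted by variable name."""
    if environ is None:
        environ = os.environ
    return ["%s=%s" % (name, environ[name]) for name in sorted(environ)]


if __name__ == "__main__":
    # Run this once from a shell and once from cron, then compare the
    # two outputs (PATH is a likely place to find a difference).
    print("\n".join(describe_environment()))
```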

Cheers,
RB

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: xml sax

2008-03-19 Thread Robert Bossy
Timothy Wu wrote:
> Hi,
>
> I am using  xml.sax.handler.ContentHandler to parse some simple xml.
>
> I want to detect be able to parse the content of this tag embedded in 
> the XML.
> <Id>174</Id>
>
>
> Is the proper way of doing so involving finding the "Id" tag 
> from startElement(), setting flag when seeing one, and in characters(),
> when seeing that flag set, save the content?
>
> What if multiple tags of the same name are nested at different levels
>
> and I want to differentiate them? I would be setting a flag for each level.
> I can imagine things get pretty messy when flags are all around.
>   
Hi,

You could keep a list of all open elements, from the root to the 
innermost. To maintain such a list, append the name of the element to 
this stack at the end of startElement() and pop it off at the end of 
endElement().

In this way you have access to the path of the current parser position. 
In order to differentiate between character data in Id and in Id/Id, you 
just have to look at the last elements of the list.
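
The stack idea could be sketched as follows. The class name PathHandler, the text dictionary and the sample document are assumptions made for the example; only the ContentHandler methods are from xml.sax itself.

```python
import xml.sax
import xml.sax.handler


class PathHandler(xml.sax.handler.ContentHandler):
    """Collect the text of every Id element, keyed by its full path."""

    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.path = []   # names of the open elements, root first
        self.text = {}   # "root/Id" -> accumulated character data

    def startElement(self, name, attrs):
        self.path.append(name)

    def endElement(self, name):
        self.path.pop()

    def characters(self, content):
        # characters() may be called several times for one text run,
        # so the content is accumulated rather than assigned.
        if self.path and self.path[-1] == "Id":
            key = "/".join(self.path)
            self.text[key] = self.text.get(key, "") + content


handler = PathHandler()
xml.sax.parseString(b"<root><Id>174<Id>nested</Id></Id></root>", handler)
```

No flags are needed: the path itself tells Id apart from Id/Id. Note that in this sketch, elements sharing the same path have their text concatenated.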

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: lists v. tuples

2008-03-17 Thread Robert Bossy
[EMAIL PROTECTED] wrote:
> On Mar 17, 6:49 am, [EMAIL PROTECTED] wrote:
>   
>> What are the considerations in choosing between:
>>
>>return [a, b, c]
>>
>> and
>>
>> return (a, b, c) # or return a, b, c
>>
>> Why is the immutable form the default?
>> 
>
> Using a house definition from some weeks ago, a tuple is a data
> structure such which cannot contain a refrence to itself.  Can a
> single expression refer to itself ever?
>   
In some way, I think this answer will be more confusing than 
enlightening to the original poster...

The difference is that lists are mutable, tuples are not. That means you 
can do the following with a list:
  - add element(s)
  - remove element(s)
  - re-assign element(s)
These operations are impossible on tuples. So, by default, I use lists 
because they offer more functionality.
But if I want to make sure the sequence is not tampered with later, I 
use tuples. The most frequent case is when a function (or method) 
returns a sequence whose fate is to be unpacked, things like:

def connect(self, server):
    # try to connect to server
    return (handler, message,)

It is pretty obvious that the returned value will (almost) never be used 
as is, the caller will most probably want to unpack the pair. Hence the 
tuple instead of list.

There's a little caveat for beginners: the tuple is immutable, which 
doesn't mean that each element of the tuple is necessarily immutable.
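
A tiny illustration of that caveat (the variable names are arbitrary):

```python
pair = ([1, 2], "message")

try:
    pair[0] = [3, 4]       # re-assigning an element of the tuple fails
except TypeError:
    reassigned = False
else:
    reassigned = True

pair[0].append(3)          # but the list inside the tuple can still change
```

The tuple still holds the same two objects afterwards; it is the list object itself that was modified.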

Also, I have read several times that tuples are more efficient than 
lists, but I have not actually noticed any difference myself.

Cheers,
RB

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: merging intervals repeatedly

2008-03-14 Thread Robert Bossy
Magdoll wrote:
>> One question you should ask yourself is: do you want all solutions? or
>> just one?
>> If you want just one, there's another question: which one? the one with
>> the most intervals? any one?
>> 
>
> I actually don't know which solution I want, and that's why I keep
> trying different solutions :P
>   
You should think about what your data is and which solution is probably 
the "best" one for it.


>> If you want all of them, then I suggest using prolog rather than python
>> (I hope I won't be flamed for advocating another language here).
>> 
>
> Will I be able to switch between using prolog & python back and forth
> though? Cuz the bulk of my code will still be written in python and
> this is just a very small part of it.
>   
You'll have to popen a prolog interpreter and parse its output. Not very 
sexy.
Moreover, if you've never done prolog, well, you should be warned it's a 
"different" language (but still beautiful) with a steep learning 
curve. Maybe not worth it for just one single problem.


>> If you have a reasonable number of intervals, you're algorithm seems
>> fine. But it is O(n**2), so in the case you read a lot of intervals and
>> you observe unsatisfying performances, you will have to store the
>> intervals in a cleverer data structure, see one of these:
>> http://en.wikipedia.org/wiki/Interval_tree
>> http://en.wikipedia.org/wiki/Segment_tree
>> 
>
> Thanks! Both of these look interesting and potentially useful :)
>   
Indeed. However, these structures are clearly heavyweight if the number 
of intervals is moderate. I would consider them only if I expected more 
than a few thousand intervals.

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Creating a file with $SIZE

2008-03-14 Thread Robert Bossy
Bryan Olson wrote:
> Robert Bossy wrote:
>   
>> [EMAIL PROTECTED] wrote:
>> 
>>> Robert Bossy wrote:  
>>>   
>>>> Indeed! Maybe the best choice for chunksize would be the file's buffer
>>>> size... 
>>>> 
>
> That bit strikes me as silly.
>   
The size of the chunk must be as small as possible in order to minimize 
memory consumption. However, below the buffer size, you'll end up filling 
the buffer anyway before actually writing to disk.


>> Though, as Marco Mariani mentioned, this may create a fragmented file. 
>> It may or may not be an hindrance depending on what you want to do with 
>> it, but the circumstances in which this is a problem are quite rare.
>> 
>
> Writing zeros might also create a fragmented and/or compressed file.
> Using random data, which is contrary to the stated requirement but
> usually better for stated application, will prevent compression but
> not prevent fragmentation.
>
> I'm not entirely clear on what the OP is doing. If he's testing
> network throughput just by creating this file on a remote server,
> the seek-way-past-end-then-write trick won't serve his purpose.
> Even if the filesystem has to write all the zeros, the protocols
> don't actually send those zeros.
Amen.

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "Attribute Doesnt Exist" ... but.... it does :-s

2008-03-13 Thread Robert Bossy
Robert Rawlins wrote:
> Hi Guys,
>
> Well thanks for the response, I followed your advice and chopped out all the
> crap from my class, right down to the bare __init__ and the setter method,
> however, the problem continued to persist.
>
> However, Robert mentioned something about unindented lines which got me
> thinking so I deleted my tab indents on that method and replaces them with
> standard space-bar indents and it appears to have cured the problem.
>   
Aha! Killed the bug at the first guess! You owe me a beer, mate.

RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "Attribute Doesnt Exist" ... but.... it does :-s

2008-03-13 Thread Robert Bossy
Robert Rawlins wrote:
>
> Hello Guys,
>
> I’ve got an awfully aggravating problem which is causing some 
> substantial hair loss this afternoon :) I want to get your ideas on 
> this. I am trying to invoke a particular method in one of my classes, 
> and I’m getting a runtime error which is telling me the attribute does 
> not exist.
>
> I’m calling the method from within __init__ yet it still seems to 
> think it doesn’t exist.
>
> Code:
>
> # Define the RemoteDevice class.
> class remote_device:
>
>     # I'm the class constructor method.
>     def __init__(self, message_list=""):
>         self.set_pending_list(message_list)
>
>     def set_pending_list(self, pending_list):
>         # Set the message list property.
>         self.pending_list = message_list
>
> And the error message which I receive during the instantiation of the 
> class:
>
> File: "/path/to/my/files/remote_device.py", line 22, in __init__
>
> self.set_pending_list(message_list)
>
> AttributeError: remote_device instance has no attribute 'set_pending_list'
>
> Does anyone have the slightest idea why this might be happening? I can 
> see that the code DOES have that method in it, I also know that I 
> don’t get any compile time errors so that should be fine. I know it 
> mentions line 22 in the error, but I’ve chopped out a load of non 
> relevant code for the sake of posting here.
>
Hi,
I don't get this error if I run your code. Maybe the irrelevant code 
causes the error: my guess is that there's a parenthesis mismatch or an 
unindented line.

Btw, calls to set_pending_list will fail since the name "message_list" 
is not defined in its scope. Please follow Chris Mellon's advice.
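
For reference, a version of the posted class without the scope problem might look like this (the CamelCase class name is my choice; the rest follows your code):

```python
class RemoteDevice(object):
    def __init__(self, message_list=""):
        self.set_pending_list(message_list)

    def set_pending_list(self, pending_list):
        # Use the parameter's own name: "message_list" only exists
        # inside __init__, so referring to it here raises NameError.
        self.pending_list = pending_list


device = RemoteDevice(["hello"])
```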

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Creating a file with $SIZE

2008-03-13 Thread Robert Bossy
[EMAIL PROTECTED] wrote:
> On Mar 12, 2:44 pm, Robert Bossy <[EMAIL PROTECTED]> wrote:
>   
>> Matt Nordhoff wrote:
>> 
>>> Robert Bossy wrote:
>>>   
>>>> k.i.n.g. wrote:
>>>> 
>>>>> I think I am not clear with my question, I am sorry. Here goes the
>>>>> exact requirement.
>>>>>   
>>>>> We use dd command in Linux to create a file with of required size. In
>>>>> similar way, on windows I would like to use python to take the size of
>>>>> the file( 50MB, 1GB ) as input from user and create a uncompressed
>>>>> file of the size given by the user.
>>>>>   
>>>>> ex: If user input is 50M, script should create 50Mb of blank or empty
>>>>> file
>>>>>   
>>>> def make_blank_file(path, size):
>>>>     f = open(path, 'w')
>>>>     f.seek(size - 1)
>>>>     f.write('\0')
>>>>     f.close()
>>>> 
>>>> I'm not sure the f.seek() trick will work on all platforms, so you can:
>>>> 
>>>> def make_blank_file(path, size):
>>>>     f = open(path, 'w')
>>>>     f.write('\0' * size)
>>>>     f.close()
>>>> 
>>> I point out that a 1 GB string is probably not a good idea.
>>>   
>>> def make_blank_file(path, size):
>>>     chunksize = 10485760 # 10 MB
>>>     chunk = '\0' * chunksize
>>>     left = size
>>>     fh = open(path, 'wb')
>>>     while left > chunksize:
>>>         fh.write(chunk)
>>>         left -= chunksize
>>>     if left > 0:
>>>         fh.write('\0' * left)
>>>     fh.close()
>>>   
>> Indeed! Maybe the best choice for chunksize would be the file's buffer
>> size... I won't search the doc how to get the file's buffer size because
>> I'm too cool to use that function and prefer the seek() option since
>> it's lightning fast regardless of the size of the file and it takes next
>> to zero memory.
>>
>> Cheers,
>> RB
>> 
>
> But what platforms does it work on / not work on?
>   
Posix. It's been ages since I touched Windows, so I don't know if XP and 
Vista are posix or not.
Though, as Marco Mariani mentioned, this may create a fragmented file. 
It may or may not be a hindrance depending on what you want to do with 
it, but the circumstances in which this is a problem are quite rare.

RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Creating a file with $SIZE

2008-03-12 Thread Robert Bossy
Matt Nordhoff wrote:
> Robert Bossy wrote:
>   
>> k.i.n.g. wrote:
>> 
>>> I think I am not clear with my question, I am sorry. Here goes the
>>> exact requirement.
>>>
>>> We use dd command in Linux to create a file with of required size. In
>>> similar way, on windows I would like to use python to take the size of
>>> the file( 50MB, 1GB ) as input from user and create a uncompressed
>>> file of the size given by the user.
>>>
>>> ex: If user input is 50M, script should create 50Mb of blank or empty
>>> file
>>>   
>>>   
>> def make_blank_file(path, size):
>>     f = open(path, 'w')
>>     f.seek(size - 1)
>>     f.write('\0')
>>     f.close()
>>
>> I'm not sure the f.seek() trick will work on all platforms, so you can:
>>
>> def make_blank_file(path, size):
>>     f = open(path, 'w')
>>     f.write('\0' * size)
>>     f.close()
>> 
>
> I point out that a 1 GB string is probably not a good idea.
>
> def make_blank_file(path, size):
>     chunksize = 10485760 # 10 MB
>     chunk = '\0' * chunksize
>     left = size
>     fh = open(path, 'wb')
>     while left > chunksize:
>         fh.write(chunk)
>         left -= chunksize
>     if left > 0:
>         fh.write('\0' * left)
>     fh.close()
>   
Indeed! Maybe the best choice for chunksize would be the file's buffer 
size... I won't search the doc how to get the file's buffer size because 
I'm too cool to use that function and prefer the seek() option since 
it's lightning fast regardless of the size of the file and it takes next 
to zero memory.

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Creating a file with $SIZE

2008-03-12 Thread Robert Bossy
k.i.n.g. wrote:
> I think I am not clear with my question, I am sorry. Here goes the
> exact requirement.
>
> We use dd command in Linux to create a file with of required size. In
> similar way, on windows I would like to use python to take the size of
> the file( 50MB, 1GB ) as input from user and create a uncompressed
> file of the size given by the user.
>
> ex: If user input is 50M, script should create 50Mb of blank or empty
> file
>   
def make_blank_file(path, size):
    f = open(path, 'w')
    f.seek(size - 1)
    f.write('\0')
    f.close()

I'm not sure the f.seek() trick will work on all platforms, so you can:

def make_blank_file(path, size):
    f = open(path, 'w')
    f.write('\0' * size)
    f.close()
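
For readers on a current Python, both ideas could be sketched as below. This is a Python 3 rendering, not the code as posted: the chunksize default, the second function's name, and the switch to binary mode with bytes are my assumptions. Binary mode avoids any newline translation on Windows.

```python
import os


def make_blank_file(path, size, chunksize=1024 * 1024):
    """Create a file of `size` zero bytes, written chunk by chunk."""
    chunk = b"\0" * chunksize
    left = size
    with open(path, "wb") as f:
        while left > 0:
            n = min(left, chunksize)
            f.write(chunk[:n])
            left -= n


def make_blank_file_seek(path, size):
    """Same size using the seek() trick; the file may end up sparse."""
    with open(path, "wb") as f:
        if size > 0:
            f.seek(size - 1)   # jump to the last byte...
            f.write(b"\0")     # ...and write only that one
```

The first version keeps memory use bounded by chunksize whatever the requested size; the second writes a single byte and leaves the rest to the filesystem.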

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: merging intervals repeatedly

2008-03-12 Thread Robert Bossy
Magdoll wrote:
> Hi,
>
> I have to read through a file that will give me a bunch of intervals.
> My ultimate goal is to produce a final set of intervals such that not
> two intervals overlap by more than N, where N is a predetermined
> length.
>
> For example, I could read through this input:
> (1,10), (3,15), (20,30),(29,40),(51,65),(62,100),(50,66)
>
> btw, the input is not guaranteed to be in any sorted order.
>
> say N = 5, so the final set should be
> (1,15), (20, 30), (29, 40), (50, 100)
>   
Hi,

The problem, as stated here, may have several solutions. For instance 
the following set of intervals also satisfies the constraint:
(1,15), (20,40), (50,100)

One question you should ask yourself is: do you want all solutions? or 
just one?
If you want just one, there's another question: which one? the one with 
the most intervals? any one?
If you want all of them, then I suggest using prolog rather than python 
(I hope I won't be flamed for advocating another language here).


> Is there already some existing code in Python that I can easily take
> advantage of to produce this? Right now I've written my own simple
> solution, which is just to maintain a list of the intervals. I can use
> the Interval module, but it doesn't really affect much. I read one
> interval from the input file at a time, and use bisect to insert it in
> order. The problem comes with merging, which sometimes can be
> cascading.
>
> ex:
> read (51,65) ==> put (51,65) in list
> read (62,100) ==> put (62,100) in list (overlap only be 4 <= N)
> read (50,66) ==> merge with (51,65) to become (50,66) ==> now can
> merge with (62,100)
The way this algorithm is presented suggests an additional constraint: 
you cannot merge two intervals if their overlap <= N. In that case, 
there is a single solution indeed...
Nitpick: you cannot merge (50,66) and (62,100) since their overlap is 
still <= 5.
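
Under that reading (merge two intervals only when their overlap is strictly greater than N), a deliberately naive sketch could look like this. The function names are made up, and the overlap is computed as min(ends) - max(starts), which matches the "overlap only by 4" figure in your example; it rescans all pairs after each merge, so it is only meant for small inputs.

```python
def overlap(a, b):
    """Length shared by intervals a and b (negative if they are apart)."""
    return min(a[1], b[1]) - max(a[0], b[0])


def find_mergeable_pair(intervals, n):
    """Return indices (i, j), i < j, of two intervals overlapping by > n."""
    for i in range(len(intervals)):
        for j in range(i + 1, len(intervals)):
            if overlap(intervals[i], intervals[j]) > n:
                return i, j
    return None


def merge_intervals(intervals, n):
    result = sorted(intervals)
    pair = find_mergeable_pair(result, n)
    while pair is not None:
        i, j = pair
        a, b = result[i], result[j]
        del result[j]          # j > i, so delete the later index first
        del result[i]
        result.append((min(a[0], b[0]), max(a[1], b[1])))
        result.sort()
        pair = find_mergeable_pair(result, n)   # handles cascading merges
    return result


data = [(1, 10), (3, 15), (20, 30), (29, 40), (51, 65), (62, 100), (50, 66)]
merged = merge_intervals(data, 5)
# merged == [(1, 15), (20, 30), (29, 40), (50, 66), (62, 100)]
```

Note that (50,66) and (62,100) stay separate, as in the nitpick above.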

If you have a reasonable number of intervals, your algorithm seems 
fine. But it is O(n**2), so if you read a lot of intervals and 
observe unsatisfactory performance, you will have to store the 
intervals in a cleverer data structure, see one of these:
http://en.wikipedia.org/wiki/Interval_tree
http://en.wikipedia.org/wiki/Segment_tree


Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: difference b/t dictionary{} and anydbm - they seem the same

2008-03-11 Thread Robert Bossy
davidj411 wrote:
> anydbm and dictionary{} seem like they both have a single key and key
> value.
> Can't you put more information into a DBM file or link tables? I just
> don't see the benefit except for the persistent storage.
Except for the persistent storage, that insignificant feature... ;) Well 
I guess that persistent storage must be the reason some people use 
anydbm sometimes.

If you want keys and values of any type (not just strings) and 
persistent storage, you can use builtin dicts then pickle them.
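
A minimal sketch of the dict-plus-pickle approach; the file name and the sample dict are invented for the example:

```python
import os
import pickle
import tempfile

# Keys and values of several types, not just strings.
settings = {("host", 1): [1.5, None], 42: {"nested": True}}

path = os.path.join(tempfile.mkdtemp(), "settings.pickle")
with open(path, "wb") as f:
    pickle.dump(settings, f)       # write the whole dict to disk

with open(path, "rb") as f:
    restored = pickle.load(f)      # read it back, e.g. in a later run
```

Unlike a dbm file, the whole dict is read and written in one go, so this suits data that fits comfortably in memory.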

Cheers,
RB

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: parsing directory for certain filetypes

2008-03-10 Thread Robert Bossy
jay graves wrote:
> On Mar 10, 9:28 am, Robert Bossy <[EMAIL PROTECTED]> wrote:
>   
>> Personally, I'd use glob.glob:
>>
>> import os.path
>> import glob
>>
>> def parsefolder(folder):
>> path = os.path.normpath(os.path.join(folder, '*.py'))
>> lst = [ fn for fn in glob.glob(path) ]
>> lst.sort()
>> return lst
>>
>> 
>
> Why the 'no-op' list comprehension?  Typo?
>   
My mistake, it is:

import os.path
import glob

def parsefolder(folder):
    path = os.path.normpath(os.path.join(folder, '*.py'))
    lst = glob.glob(path)
    lst.sort()
    return lst


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: parsing directory for certain filetypes

2008-03-10 Thread Robert Bossy
royG wrote:
> hi
> i wrote a function to parse a given directory and make a sorted list
> of  files with .txt,.doc extensions .it works,but i want to know if it
> is too bloated..can this be rewritten in more efficient manner?
>
> here it is...
>
> from string import split
> from os.path import isdir,join,normpath
> from os import listdir
>
> def parsefolder(dirname):
> filenms=[]
> folder=dirname
> isadr=isdir(folder)
> if (isadr):
> dirlist=listdir(folder)
> filenm=""
>   
This last line is unnecessary: variable scope rules in python are a bit 
different from what we're used to. You're not required to 
declare/initialize a variable, you're only required to assign a value 
before it is referenced.


> for x in dirlist:
>  filenm=x
>if(filenm.endswith(("txt","doc"))):
>  nmparts=[]
>nmparts=split(filenm,'.' )
>  if((nmparts[1]=='txt') or (nmparts[1]=='doc')):
>   
I don't get it. You've already checked that filenm ends with "txt" or 
"doc"... What is the purpose of these three lines?
Btw, again, nmparts=[] is unnecessary.

>   filenms.append(filenm)
> filenms.sort()
> filenameslist=[]
>   
Unnecessary initialization.

> filenameslist=[normpath(join(folder,y)) for y in filenms]
>   numifiles=len(filenameslist)
>   
numifiles is not used so I guess this line is too much.

>   print filenameslist
>   return filenameslist
>   

Personally, I'd use glob.glob:


import os.path
import glob

def parsefolder(folder):
    path = os.path.normpath(os.path.join(folder, '*.py'))
    lst = [ fn for fn in glob.glob(path) ]
    lst.sort()
    return lst


I leave you the exercise of adding .doc files. But I must say (whoever's 
listening) that I was a bit disappointed that glob('*.{txt,doc}') didn't 
work.
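
One possible solution to that exercise, calling glob once per extension since brace patterns are not supported. The extensions parameter is my addition, not part of the function above.

```python
import glob
import os.path


def parsefolder(folder, extensions=("txt", "doc")):
    """Return the sorted paths in `folder` that have one of `extensions`."""
    found = []
    for ext in extensions:
        pattern = os.path.join(folder, "*." + ext)
        found.extend(glob.glob(pattern))
    return sorted(os.path.normpath(fn) for fn in found)
```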

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problem with join

2008-03-07 Thread Robert Bossy
nodrogbrown wrote:
> hi
> i am using python on WinXP..i have a string 'folder ' that i want to
> join to a set of imagefile names to create complete qualified names so
> that i can create objects out of them
>
> folder='F:/brown/code/python/fgrp1'
> filenms=['amber1.jpg', 'amber3.jpg', 'amy1.jpg', 'amy2.jpg']
> filenameslist=[]
> for x in filenms:
>   myfile=join(folder,x)
>   filenameslist.append(myfile)
>
> now when i print the filenameslist  i find that it looks like
>
> ['F:/brown/code/python/fgrp1\\amber1.jpg',
> 'F:/brown/code/python/fgrp1\\amber3.jpg', 'F:/brown/code/python/fgrp1\
> \amy1.jpg', 'F:/brown/code/python/fgrp1\\amy2.jpg']
>
> is there some problem with the way i use join? why do i get \\ infront
> of  the basename?
> i would prefer it like 'F:/brown/code/python/fgrp1/basename.jpg',
>   
os.path.join()
http://docs.python.org/lib/module-os.path.html#l2h-2185

vs.

string.join()
http://docs.python.org/lib/node42.html#l2h-379
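
To see the difference without a Windows box at hand, the sketch below uses ntpath and posixpath, the two implementations that os.path picks from depending on the platform:

```python
import ntpath      # what os.path is on Windows
import posixpath   # what os.path is on POSIX systems

folder = 'F:/brown/code/python/fgrp1'

windows_style = ntpath.join(folder, 'amber1.jpg')     # joins with a backslash
posix_style = posixpath.join(folder, 'amber1.jpg')    # joins with a slash
with_str_join = '/'.join([folder, 'amber1.jpg'])      # plain string join
```

The "\\" in your printed list is a single backslash: printing a list shows each string's repr, which escapes the backslash.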

RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: why not bisect options?

2008-03-04 Thread Robert Bossy
Aaron Watters wrote:
> On Feb 29, 9:31 am, Robert Bossy <[EMAIL PROTECTED]> wrote:
>   
>> Hi all,
>>
>> I thought it would be useful if insort and consorts* could accept the
>> same options than list.sort, especially key and cmp.
>> 
>
> Wouldn't this make them slower and less space efficient?  It would
> be fine to add something like this as an additional elaboration, but
> I want bisect to scream as fast as possible in the default streamlined
> usage.
Yes it is slower and bigger, so I agree that the canonical 
implementation for default values should be kept. Also because the 
original bisect functions are actually written in C, the speed 
difference is even more noticeable.

Though, I needed custom ordering bisects since I was implementing 
interval trees (storing intervals by startpoint/endpoint).

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Eurosymbol in xml document

2008-03-04 Thread Robert Bossy
Diez B. Roggisch wrote:
> Hellmut Weber wrote:
>
>   
>> Hi,
>> i'm new here in this list.
>>
>> i'm developing a little program using an xml document. So far it's easy
>> going, but when parsing an xml document which contains the EURO symbol
>> ('€') then I get an error:
>>
>> UnicodeEncodeError: 'charmap' codec can't encode character u'\xa4' in
>> position 11834: character maps to <undefined>
>>
>> the relevant piece of code is:
>>
>> from xml.dom.minidom import Document, parse, parseString
>> ...
>> doc = parse(inFIleName)
>> 
>
> The contents of the file must be encoded with the proper encoding which is
> given in the XML-header, or has to be utf-8 if no header is given.
>
> From the above I think you have a latin1-based document. Does the encoding
> header match?
If the file is declared as latin-1 and contains a euro symbol, then the 
file is actually invalid since the euro is not defined in iso-8859-1. If 
there is no encoding declaration, as Diez already said, the file should 
be encoded as utf-8.

Try changing or adding the encoding declaration so that it says 
iso-8859-15 (also known as latin-9), which is the same as latin-1 with a 
few changes, including the euro symbol:

<?xml version="1.0" encoding="iso-8859-15"?>

If your file has a lot of unusual diacritics, you might take a look at the 
small differences between latin-1 and latin-9 in order to make sure 
that your file won't be broken:
http://en.wikipedia.org/wiki/ISO_8859-15
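
The byte in your error message shows the difference nicely. A small check, with variable names chosen for the example:

```python
raw = b"\xa4"   # the byte behind u'\xa4' in the error message

as_latin1 = raw.decode("iso-8859-1")    # U+00A4, the generic currency sign
as_latin9 = raw.decode("iso-8859-15")   # U+20AC, the euro sign

try:
    u"\u20ac".encode("iso-8859-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False            # latin-1 has no euro at all
```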

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Problem round-tripping with xml.dom.minidom pretty-printer

2008-02-29 Thread Robert Bossy
Ben Butler-Cole wrote:
>> An additional thing to keep in mind is that toprettyxml does not print
>> an XML identical to the original DOM tree: it adds newlines and tabs.
>> When parsed again these blank characters are inserted in the DOM tree as
>> character nodes. If you toprettyxml an XML document twice in a row, then
>> the second one will also add newlines and tabs around the newlines and
>> tabs added by the first. Since you call toprettyxml an infinite number
>> of times, it is expected that lots of blank characters appear.
>> 
>
> Right. That's the behaviour I'm asking about, which I consider to be
> problematic. I would expect a module providing a parser and pretty-
> printer (not just for XML parsers) to be able to conservatively round-
> trip.
>
> As far as I can see (and your comments back this up) minidom doesn't
> have this property. Unless anyone knows how to get it to behave that
> way...
>   
minidom --any DOM parser, btw-- has no means to know whether a blank 
character is a pretty print artefact or actual blank content from the 
original XML.

You could write a function that strips all-blank nodes recursively down 
the element tree. Before doing so, I suggest you take a look at section 
2.10 of http://www.w3.org/TR/REC-xml/.
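
Such a function might look like the sketch below. The names strip_blank_text_nodes and pretty are invented, and the sketch ignores xml:space="preserve" (the subject of section 2.10), so it is only safe when whitespace-only text carries no meaning in your documents.

```python
import xml.dom.minidom as dom


def strip_blank_text_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    for child in list(node.childNodes):    # copy: we modify while iterating
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
            child.unlink()
        else:
            strip_blank_text_nodes(child)


def pretty(text):
    """Parse, drop pretty-printing leftovers, and pretty-print again."""
    doc = dom.parseString(text)
    strip_blank_text_nodes(doc)
    return doc.toprettyxml()


once = pretty("<a><b/><c/></a>")
twice = pretty(once)    # a second round trip should not add blank lines
```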

RB

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem round-tripping with xml.dom.minidom pretty-printer

2008-02-29 Thread Robert Bossy
Ben Butler-Cole wrote:
> Hello
>
> I have run into a problem using minidom. I have an HTML file that I
> want to make occasional, automated changes to (adding new links). My
> strategy is to parse it with minidom, add a node, pretty print it and
> write it back to disk.
>
> However I find that every time I do a round trip minidom's pretty
> printer puts extra blank lines around every element, so my file grows
> without limit. I have found that normalizing the document doesn't make
> any difference. Obviously I can fix the problem by doing without the
> pretty-printing, but I don't really like producing non-human readable
> HTML.
>
> Here is some code that shows the behaviour:
>
> import xml.dom.minidom as dom
> def p(t):
> d = dom.parseString(t)
> d.normalize()
> t2 = d.toprettyxml()
> print t2
> p(t2)
> p('')
>
> Does anyone know how to fix this behaviour? If not, can anyone
> recommend an alternative XML tool for simple tasks like this?
Hi,

The last line of p() calls itself: it is an unconditional recursive call 
so, no matter what it does, it will never stop. And since p() also 
prints something, calling it will print endlessly. By removing this 
line, you get something like:








That seems sensible, imo. Was that what you wanted?

An additional thing to keep in mind is that toprettyxml does not print 
an XML identical to the original DOM tree: it adds newlines and tabs. 
When parsed again these blank characters are inserted in the DOM tree as 
character nodes. If you toprettyxml an XML document twice in a row, then 
the second one will also add newlines and tabs around the newlines and 
tabs added by the first. Since you call toprettyxml an infinite number 
of times, it is expected that lots of blank characters appear.

Finally, normalize() is supposed to merge consecutive sibling character 
nodes, however it will never remove character contents even if they are 
blank. That means that several character
nodes will be replaced by a single one whose content is the 
concatenation of the respective content of the original nodes. Clear enough?
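
A small demonstration of what normalize() does and does not do; the element name and text values are arbitrary:

```python
import xml.dom.minidom as dom

doc = dom.parseString("<a/>")
root = doc.documentElement
root.appendChild(doc.createTextNode("foo"))
root.appendChild(doc.createTextNode(" \n"))   # a blank text node
root.appendChild(doc.createTextNode("bar"))
before = len(root.childNodes)                 # three sibling text nodes

root.normalize()
after = len(root.childNodes)                  # merged into a single node
merged_text = root.firstChild.data            # blanks are kept in the middle
```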

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: joining strings question

2008-02-29 Thread Robert Bossy
[EMAIL PROTECTED] wrote:
> Hi all,
>
> I have some data with some categories, titles, subtitles, and a link
> to their pdf and I need to join the title and the subtitle for every
> file and divide them into their separate groups.
>
> So the data comes in like this:
>
> data = ['RULES', 'title','subtitle','pdf',
> 'title1','subtitle1','pdf1','NOTICES','title2','subtitle2','pdf','title3','subtitle3','pdf']
>
> What I'd like to see is this:
>
> [RULES', 'title subtitle','pdf', 'title1 subtitle1','pdf1'],
> ['NOTICES','title2 subtitle2','pdf','title3 subtitle3','pdf'], etc...
>
> I've racked my brain for a while about this and I can't seem to figure
> it out.  Any ideas would be much appreciated.
>   
As others already said, the data structure is quite unfit. Therefore I 
give you one of the ugliest pieces of code I've produced in years:

r = []
for i in xrange(0, len(data), 7):
    r.append([data[i],
              ' '.join((data[i+1], data[i+2],)), data[i+3],
              ' '.join((data[i+4], data[i+5],)), data[i+6]])
print r
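
If the number of entries per category is not always two, a slightly less rigid sketch is possible, under one big assumption that your sample data happens to satisfy: category names are the all-uppercase strings. The function name is made up.

```python
def group_records(data):
    """Split a flat list into [category, 'title subtitle', pdf, ...] lists."""
    groups = []
    i = 0
    while i < len(data):
        if data[i].isupper():          # assumption: categories are uppercase
            groups.append([data[i]])
            i += 1
        else:
            title, subtitle, pdf = data[i:i + 3]
            groups[-1].extend([title + ' ' + subtitle, pdf])
            i += 3
    return groups


data = ['RULES', 'title', 'subtitle', 'pdf',
        'title1', 'subtitle1', 'pdf1',
        'NOTICES', 'title2', 'subtitle2', 'pdf',
        'title3', 'subtitle3', 'pdf']
grouped = group_records(data)
```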

Cheers,
RB
-- 
http://mail.python.org/mailman/listinfo/python-list


why not bisect options?

2008-02-29 Thread Robert Bossy
Hi all,

I thought it would be useful if insort and consorts* could accept the 
same options as list.sort, especially key and cmp.

The only catch I can think of is that nothing prevents a crazy developer 
from insorting elements into the same list using different options. I 
foresee two courses of action:
1) let the developer be responsible for the homogeneity of successive 
insort calls on the same list (remember that the developer is already 
responsible for giving a sorted list), or
2) make bisect a class which keeps the key and cmp options internally 
and always uses them for comparison, something like:


class Bisect:
    def __init__(self, lst = None, key = None, cmp = None):
        self.key = key
        self.cmp = cmp
        # a fresh list per instance instead of a shared mutable default
        if lst is None:
            lst = []
        self.lst = lst
        self.lst.sort(key = key, cmp = cmp)

    def compare_elements(self, a, b):
        # like list.sort: apply key first, then compare the keys
        if self.key is not None:
            a, b = self.key(a), self.key(b)
        if self.cmp is not None:
            return self.cmp(a, b)
        return cmp(a, b)

    def insort_right(self, elt, lo = 0, hi = None):
        """Inspired from bisect in the python standard library"""
        if hi is None:
            hi = len(self.lst)
        while lo < hi:
            mid = (lo + hi) // 2
            if self.compare_elements(elt, self.lst[mid]) < 0:
                hi = mid
            else:
                lo = mid + 1
        self.lst.insert(lo, elt)
    ...
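
A variation on option 2 that keeps the fast C search is to store the precomputed keys in a parallel list and bisect on that. This is only a sketch of the idea with a key function (no cmp); the class and attribute names are invented.

```python
import bisect


class KeyedSortedList(object):
    """Sorted list that remembers its key function (option 2 above)."""

    def __init__(self, items=(), key=None):
        self.key = key if key is not None else (lambda x: x)
        self.items = sorted(items, key=self.key)
        self.keys = [self.key(item) for item in self.items]

    def insort_right(self, item):
        k = self.key(item)
        # The standard bisect does the search, on the list of keys.
        pos = bisect.bisect_right(self.keys, k)
        self.keys.insert(pos, k)
        self.items.insert(pos, item)


# Example: intervals ordered by endpoint.
by_end = KeyedSortedList([(5, 9), (1, 3)], key=lambda iv: iv[1])
by_end.insort_right((2, 4))
```

The price is a second list of the same length, which is the space cost such an extension would have to accept one way or another.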


Any thoughts about this?

RB

* at this point you should smile...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: executing a python program by specifying only its name in terminal or command line

2008-02-26 Thread Robert Bossy
Steve Holden wrote:
> bharath venkatesh wrote:
>   
>> hi,
>>i wanna run a python program by specifying  only its  name ex prog 
>> with the arguments in the terminal or command line instead of specifying 
>> python prog in the terminal to run the program   not even specifying the 
>> it with .py extension ..
>> for example i want to run the python program named prog by sepcifying
>> $prog -arguments
>> instead of
>> $python prog -arguments
>> or
>> $prog.py -arguments
>> can anyone tell me how to do it
>>
>> 
> reseach pathext for Windows.
>
> For Unix-like systems use the shebang (#!) line, and don't put a .py at 
> the end of the filename.
Besides being ugly and a bit unsettling for the user, the final .py 
won't prevent the execution of your program.
However, the file must have the executable attribute set, so you have to 
chmod +x it.
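
If you'd rather set the executable bit from Python than from the shell, something like the sketch below should do on Unix-like systems. The function name is made up; it sets the bit for user, group and others, roughly what chmod a+x does. The script itself still needs a first line such as #!/usr/bin/env python.

```python
import os
import stat


def make_executable(path):
    """Add the executable bits to `path`, keeping the other permissions."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```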

Cheers,
RB

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Parent instance attribute access

2008-02-25 Thread Robert Bossy
Gabriel Rossetti wrote:
> Hello,
>
> I have something weird going on, I have the following (simplified) :
>
> class MyFactory(..., ...):
>
> def __init__(self, *args, **kwargs):
> self.args = args
> self.kwargs = kwargs
> ...
>
> class MyXmlFactory(MyFactory):
>
> def __init__(self, *args, **kwargs):
> MyFactory.__init__(self, *args, **kwargs)
> #self.args = args
> #self.kwargs = kwargs
>...
>
> def build(self, addr):
> p = self.toto(*self.args, **self.kwargs)
>
> when build is called I get this :
>
> exceptions.AttributeError: MyXmlFactory instance has no attribute 'args'
>
> If I uncomment "self.args = args" and "self.kwargs = kwargs" in 
> __init__(...)
> it works. I find this strange, since in OO MyXmlFactory is a MyFactory 
> and thus has
> "self.args" and "self.kargs", and I explicitly called the paret 
> __init__(...) method, so I tried this small example :
>
> >>> class A(object):
> ... def __init__(self, *args, **kargs):
> ... self.args = args
> ... self.kargs = kargs
> ... self.toto = 3
> ...
> >>> class B(A):
> ...   def __init__(self, *args, **kargs):
> ...   A.__init__(self, *args, **kargs)
> ...   def getToto(self):
> ...  print str(self.toto)
> ...
> >>> b = B()
> >>> b.getToto()
> 3
>
> so what I though is correct, so why does it not work with args and 
> kargs? BTW, If I build a MyFactory and call build, it works as expected.
>   
If you add the following lines to getToto, it works as expected:
print self.args
print self.kargs

The bug must lie somewhere else in your code.
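
Put together as a self-contained check (the describe method and the sample arguments are made up for the example):

```python
class A(object):
    def __init__(self, *args, **kargs):
        self.args = args
        self.kargs = kargs
        self.toto = 3


class B(A):
    def __init__(self, *args, **kargs):
        # Explicit call to the parent initializer, as in your code.
        A.__init__(self, *args, **kargs)

    def describe(self):
        # All three attributes set by A.__init__ are visible here.
        return (self.toto, self.args, self.kargs)


b = B(1, 2, name="x")
```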

RB
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: check a directory

2008-02-25 Thread Robert Bossy
Raj kumar wrote:
> Hi all,
> I'm using following code...
>
> for x in listdir(path):
>     some code.
>
> but how can I check whether x is a directory or not?
> Because listdir() is giving all the files present in that path
Take a look at the module os.path, especially the functions named isdir 
and walk.
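
A minimal sketch of both, assuming you only want the names of the subdirectories. The function name list_subdirectories and the temporary directory layout are made up for the example. Note that listdir returns bare names, so each one has to be joined with the parent path before testing it. Also, os.path.walk no longer exists in Python 3; os.walk (in the os module itself) is used here instead:

```python
import os
import shutil
import tempfile


def list_subdirectories(path):
    """Return the sorted names of the entries of `path` that are directories."""
    return sorted(
        name for name in os.listdir(path)
        # listdir gives bare names: join them with `path` before testing.
        if os.path.isdir(os.path.join(path, name))
    )


# Self-contained demonstration in a throwaway directory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "subdir"))
open(os.path.join(root, "file.txt"), "w").close()

found = list_subdirectories(root)

# os.walk visits the whole tree and already separates directories from files.
all_dirs = [dirpath for dirpath, dirnames, filenames in os.walk(root)]

shutil.rmtree(root)  # clean up the demonstration directory
print(found)
```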

RB


Re: integrating python with owl

2008-02-25 Thread Robert Bossy
Noorhan Abbas wrote:
> Hello,
> I have developed an ontology using protege owl and  I wonder if you 
> can help me get any documentation on how to integrate it with python.
>  
Hi,

It depends on what you mean by integrating.

If you mean reading OWL files generated by Protégé, there are some 
Python libraries out there though I never tested any:
http://eulersharp.sourceforge.net/2004/02swap/OWLLogic/owllogic.html
http://seth-scripting.sourceforge.net/

I must warn you, the OWL written by Protégé isn't quite straightforward 
to parse. Anyway RDFLib seems to be the canonical library for parsing 
and processing RDF/RDFS.


If your goal is to develop plugins in Python, well... I expect that any 
solution would be based on Jython. A quick googling gave me JOT, which is 
more like a scripting console for Protégé:
http://protege.cim3.net/file/work/files/ProtegeScriptConsole/jot-tutorial/


Cheers,
RB

Re: is this data structure build-in or I'll have to write my own class?

2008-02-21 Thread Robert Bossy
mkPyVS wrote:
> This isn't so optimal but I think accomplishes what you desire to some
> extent... I *think* there is some hidden gem in inheriting from dict
> or an mapping type that is cleaner than what I've shown below though.
>
> class dum_struct:
>def __init__(self,keyList,valList):
>   self.__orderedKeys = keyList
>   self.__orderedValList = valList
>def __getattr__(self,name):
>   return self.__orderedValList[self.__orderedKeys.index(name)]
>
>
> keys = ['foo','baz']
> vals = ['bar','bal']
>
> m = dum_struct(keys,vals)
>
> print m.foo
>   

Let's add:
    def __getitem__(self, key):
        return self.__orderedValList[key]


in order to have: m.foo == m[0]
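
Put together, the idea could look like the sketch below. It is a variation on the class above, not a drop-in copy: the name DumStruct and the single-underscore attribute names are my own choice, and I made a missing name raise AttributeError (instead of the ValueError coming from list.index) so that hasattr and friends behave as usual.

```python
class DumStruct(object):
    """Ordered record: values reachable by attribute name or by position."""

    def __init__(self, key_list, val_list):
        self._keys = list(key_list)
        self._vals = list(val_list)

    def __getattr__(self, name):
        # Only called when the normal attribute lookup has failed.
        if name.startswith('_'):
            # Avoid infinite recursion if _keys/_vals are not set yet.
            raise AttributeError(name)
        try:
            return self._vals[self._keys.index(name)]
        except ValueError:
            raise AttributeError(name)

    def __getitem__(self, index):
        # Positional access, in insertion order.
        return self._vals[index]


m = DumStruct(['foo', 'baz'], ['bar', 'bal'])
print(m.foo == m[0])
print(m.baz == m[1])
```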

RB


Re: distutils and data files

2008-02-20 Thread Robert Bossy
Sam Peterson wrote:
> I've been googling for a while now and cannot find a good way to deal
> with this.
>
> I have a slightly messy python program I wrote that I've historically
> just run from the extracted source folder.  I have pictures and sound
> files in this folder that this program uses.  I've always just used
> the relative path names of these files in my program.
>
> Lately, I had the idea of cleaning up my program and packaging it with
> distutils, but I've been stuck on a good way to deal with these
> resource files.  The package_data keyword seems to be the way to go,
> but how can I locate and open my files once they've been moved?  In
> other words, what should I do about changing the relative path names?
> I need something that could work from both the extracted source
> folder, AND when the program gets installed via the python setup.py
> install command.
>   
This seems to be a classic distutils question: how can a Python module 
access its data files *after* being installed?

The following thread addresses this issue:
http://www.gossamer-threads.com/lists/python/python/163159

Carl Banks' solution seems to overcome the problem: his trick is to 
generate an additional configuration module with the relevant 
information from the distutils data structures. However, it is quite an 
old thread (2003) and I don't know whether progress has been made since 
then; maybe the distutils module now incorporates a similar mechanism.
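
For what it's worth, a simpler alternative (not Carl Banks' solution) is to compute paths relative to the module's own __file__. This is only a sketch, and it assumes that the data files end up next to the package's modules, which is where package_data puts them. The helper name resource_path, the package name mygame and the file logo.png are made up for the example:

```python
import os


def resource_path(module_file, *parts):
    """Return an absolute path to a data file stored next to module_file."""
    base_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base_dir, *parts)


# Inside a module of the package one would typically write:
#     resource_path(__file__, "data", "logo.png")
# A literal module path is used here only to keep the example standalone.
example = resource_path(os.path.join("mygame", "__init__.py"), "data", "logo.png")
print(example)
```

Since the path is derived from wherever the module actually lives, the same call should work from the extracted source folder and from the installed location.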

Hope it helps,
RB


Re: dictionary of operators

2008-02-15 Thread Robert Bossy
A.T.Hofkamp wrote:
> On 2008-02-14, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>   
>> Hi,
>>
>> In the standard library module "operator", it would be nice to have a 
>> dictionary
>> mapping operator strings to their respective functions. Something like:
>>
>>   {
>> '+': add,
>> '-': sub,
>> 'in': contains,
>> 'and': and_,
>> 'or': or_,
>> ...
>>   }
>>
>> Does such a dictionary already exist? Is it really a good and useful idea?
>> 
>
> How would you handle changes in operator syntax?
> - I have 'add' instead of '+'
> - I have U+2208 instead of 'in'
>   
Originally I meant only the Python syntax which shouldn't change that 
much. For some operators (arith, comparison) the toy language had the 
same syntax as Python.
Btw,  U+2208 would be a wonderful token... if only it was on standard 
keyboards.

> I don't think this is generally applicable.
>   
Thinking about it, I think it is not really applicable, mainly because 
my examples were exclusively binary operators. What would it be for unary 
operators? Or enclosing operators (getitem)?
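
To make the difficulty concrete, here is a rough sketch of what such tables could look like when built by hand. The names BINARY_OPS, UNARY_OPS and apply_binary are made up, and nothing like them exists in the operator module as far as I know. Two details show why a single flat dictionary is awkward: '-' needs two entries (binary and unary), and operator.contains takes its arguments in the reverse order of 'in'. Also note that operator.and_ and operator.or_ are the bitwise & and |, not the short-circuiting 'and' and 'or'.

```python
import operator

# Binary operators: token -> function of (left, right).
BINARY_OPS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '<': operator.lt,
    '==': operator.eq,
    # operator.contains(container, item) is "item in container",
    # so the arguments have to be swapped for the 'in' token.
    'in': lambda left, right: operator.contains(right, left),
}

# Unary operators need their own table: '-' appears in both.
UNARY_OPS = {
    '-': operator.neg,
    'not': operator.not_,
}


def apply_binary(symbol, left, right):
    """Look up a binary operator by its token and apply it."""
    return BINARY_OPS[symbol](left, right)


print(apply_binary('+', 2, 3))
print(apply_binary('in', 2, [1, 2, 3]))
print(UNARY_OPS['-'](4))
```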

> Why don't you attach the function to the +/-/in/... token instead? Then you
> don't need the above table at all.
>   
Could be. But I prefer keeping the semantic parts as far as possible 
from the lexer. Not that I have strong arguments for that, it's religious.

Anyway, thanks for answering,
RB


Re: OT: Speed of light [was Re: Why not a Python compiler?]

2008-02-12 Thread Robert Bossy
Jeff Schwab wrote:
> Erik Max Francis wrote:
>   
>> Jeff Schwab wrote:
>>
>> 
>>> Erik Max Francis wrote:
>>>   
>>>> Robert Bossy wrote:
>>>> 
>>>>> I'm pretty sure we can still hear educated people say that free fall 
>>>>> speed depends on the weight of the object without realizing it's a 
>>>>> double mistake.
>>>>>   
>>>> Well, you have to qualify it better than this, because what you've 
>>>> stated is actually correct ... in a viscous fluid.
>>>> 
>>> By definition, that's not free fall.
>>>   
>> In a technical physics context.  But he's talking about posing the 
>> question to generally educated people, not physicists (since physicists 
>> wouldn't make that error).  In popular parlance, "free fall" just means 
>> falling freely without restraint (hence "free fall rides," "free 
>> falling," etc.).  And in that context, in the Earth's atmosphere, you 
>> _will_ reach a terminal speed that is dependent on your mass (among 
>> other things).
>>
>> So you made precisely my point:  The average person would not follow 
>> that the question being asked was about an abstract (for people 
>> stuck on the surface of the Earth) physics principle, but rather would 
>> understand the question to be in a context where the supposedly-wrong 
>> statement is _actually true_.
>> 
>
> So what's the "double mistake?"  My understanding was (1) the misuse 
> (ok, vernacular use) of the term "free fall," and (2) the association of 
> weight with free-fall velocity ("If I tie an elephant's tail to a 
> mouse's, and drop them both into free fall, will the mouse slow the 
> elephant down?")
>   
In my mind, the second mistake was the confusion between weight and mass.

Cheers
RB



Re: OT: Speed of light [was Re: Why not a Python compiler?]

2008-02-11 Thread Robert Bossy
Grant Edwards wrote:
> On 2008-02-11, Steve Holden <[EMAIL PROTECTED]> wrote:
>
>   
>> Well the history of physics for at least two hundred years has
>> been a migration away from the intuitive.
>> 
>
> Starting at least as far back as Newtonian mechanics.  I once
> read a very interesting article about some experiments that
> showed that even simple Newtonian physics is counter-intuitive.
> Two of the experiments I remember vividly. One of them showed
> that the human brain expects objects constrained to travel in a
> curved path will continue to travel in a curved path when
> released.  The other showed that the human brain expects that
> when an object is dropped it will land on a spot immediately
> below the drop point -- regardless of whether or not the object
> was in motion horizontally when released.
>
> After repeated attempts at the tasks set for them in the
> experiments, the subjects would learn strategies that would
> work in a Newtonian world, but the initial intuitive reactions
> were very non-Newtonian (regardless of how educated they were
> in physics).
>   
I'm pretty sure we can still hear educated people say that free fall 
speed depends on the weight of the object without realizing it's a 
double mistake.

Cheers,
RB