Strings

2005-04-21 Thread Dan
I'm having trouble coming to grips with Python strings.

I need to send binary data over a socket.  I'm taking the data from a
database.  When I extract it, non-printable characters come out as a
backslash followed by three numeric characters representing the
numeric value of the data.  I guess this is what you would call a raw
Python string.  I want to convert those four characters ( in C-think,
say "\\012" ) into a single character and put it in a new string.

There's probably a simple way to do it, but I haven't figured it out.
What I've done so far is to step through the string, character by
character.  Normal characters are appended onto a new string.  If I
come across a '\' character, I look for the next three numeric
characters.  But I don't know how to convert this code into a single
character and append it onto the new string.

I'm sure what I'm doing is long and convoluted.  Any suggestions would
be appreciated.

Dan

-- 
http://mail.python.org/mailman/listinfo/python-list


Indexing strings

2005-03-04 Thread Fred
Hi everybody

I am looking for a way to find out what the index of a
certain letter in a string is.
My example:

for x in text:
    if x == ' ':
        list = text[:  # Here I need the index of the space the
                       # program found during the loop...

Is there any possibility to find the index of the space???
Thanks for any help!
Fred


Re: Strings

2005-04-21 Thread keirr
I'd use the int and chr casts.  e.g.,

new_string = ""
a = '012'
new_string += chr(int(a))

Just in case the 012 is an octal code I'll mention that to cast to int
in general you can pass the base, as in int('034',8) or int('AF',16)

Cheers,

 Keir.
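
A minimal sketch of the int/chr approach, written for Python 3 and assuming the three digits after the backslash are an octal code (the thread does not confirm which base the database uses). The names `code`, `as_octal` and `ch` are made up for illustration.

```python
# Assumption: the three digits that follow the backslash are an octal code.
code = "012"

as_decimal = int(code)     # base 10 is the default, so this is twelve
as_octal = int(code, 8)    # the same digits read as octal give ten
as_hex = int("AF", 16)     # other bases work the same way

ch = chr(as_octal)         # chr(10) is the newline character

new_string = ""
new_string += ch
print(repr(new_string))
```

Which base is right depends on how the database wrote the digits, so it is worth checking one known value before converting everything.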



Re: Strings

2005-04-21 Thread Peter Hansen
Dan wrote:
> I'm having trouble coming to grips with Python strings.
>
> I need to send binary data over a socket.  I'm taking the data from a
> database.  When I extract it, non-printable characters come out as a
> backslash followed by three numeric characters representing the
> numeric value of the data.  I guess this is what you would call a raw
> Python string.  I want to convert those four characters ( in C-think,
> say "\\012" ) into a single character and put it in a new string.

Does this help?

>>> s = 'foo \\012 bar'
>>>
>>> s.decode('string-escape')
'foo \n bar'
>>> print s.decode('string-escape')
foo
 bar
>>>
Note that the \n in the first one is because I didn't
*print* the result, but merely allowed the interpreter
to call repr() on it.  repr() for a newline is of course
backslash-n, so that's what you see (inside quotation marks)
but the string itself has only 9 characters in it, as
you wished.
-Peter
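
The 'string-escape' codec used above exists only in Python 2. A rough Python 3 counterpart, offered as a sketch and assuming the text fits in Latin-1 (that assumption is not from the thread), goes through the 'unicode_escape' codec instead:

```python
s = "foo \\012 bar"          # the backslash here is a literal character

# Encode to bytes first, then let the codec interpret the escape sequences.
decoded = s.encode("latin-1").decode("unicode_escape")

print(repr(decoded))
print(len(s), len(decoded))
```

The length drops by three because the four characters of the escape collapse into one.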


Re: Strings

2005-04-21 Thread Terry Reedy

"Dan" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> I've having trouble coming to grip with Python strings.
>
> I need to send binary data over a socket.  I'm taking the data from a
> database.  When I extract it, non-printable characters come out as a
> backslash followed by a three numeric characters representing the
> numeric value of the data.

Are you sure that the printable expansion is actually in the string itself, 
and not just occurring when you 'look' at the string by printing it -- as 
in...

>>> s='1\001b'
>>> s
'1\x01b'
>>> len(s)
3
?
>  I guess this is what you would call a raw Python string.

No such thing.  There are only strings (and unicode strings).  'raw' only 
applies to a mode of interpreting string literals in the process of turning 
them into bytes or unicode.

Terry J. Reedy





Re: Strings

2005-04-21 Thread John Machin
On Thu, 21 Apr 2005 21:16:43 +0800, Dan
<[EMAIL PROTECTED]> wrote:

>I've having trouble coming to grip with Python strings.
>
>I need to send binary data over a socket.  I'm taking the data from a
>database.  When I extract it, non-printable characters come out as a
>backslash followed by a three numeric characters representing the
>numeric value of the data.  

It would be very strange, but not beyond belief, for a DBMS to be
storing strings like that. What you are seeing is more likely an
artifact of how you are extracting it.  If this is so, it would be
better to avoid the complication and error-proneness of converting to
an octal[yuk!]-based representation and back again.

However if the DBMS *IS* storing strings like that, then it would
require a look in the DBMS docs PLUS a look at the empirical evidence
to produce a reliable transcoding.

If you were to tell us which DBMS, and supply a copy&paste snippet
(*NOT* a re-typing) of the *actual* extraction code that you are
using, then we should be able to help you further.

Cheers,

John




Re: Strings

2005-04-21 Thread Dan
On Thu, 21 Apr 2005 10:09:59 -0400, Peter Hansen <[EMAIL PROTECTED]>
wrote:

Thanks, that's exactly what I wanted.  Easy when you know how.

Dan

>Dan wrote:
>> I've having trouble coming to grip with Python strings.
>> 
>> I need to send binary data over a socket.  I'm taking the data from a
>> database.  When I extract it, non-printable characters come out as a
>> backslash followed by a three numeric characters representing the
>> numeric value of the data.  I guess this is what you would call a raw
>> Python string.  I want to convert those four characters ( in C-think,
>> say "\\012" ) into a single character and put it in a new string.
>
>Does this help?
>
> >>> s = 'foo \\012 bar'
> >>>
> >>> s.decode('string-escape')
>'foo \n bar'
> >>> print s.decode('string-escape')
>foo
>  bar
> >>>
>
>Note that the \n in the first one is because I didn't
>*print* the result, but merely allowed the interpreter
>to call repr() on it.  repr() for a newline is of course
>backslash-n, so that's what you see (inside quotation marks)
>but the string itself has only 9 characters in it, as
>you wished.
>
>-Peter



Re: Strings

2005-04-21 Thread Bengt Richter
On Thu, 21 Apr 2005 10:09:59 -0400, Peter Hansen <[EMAIL PROTECTED]> wrote:

>Dan wrote:
>> I've having trouble coming to grip with Python strings.
>> 
>> I need to send binary data over a socket.  I'm taking the data from a
>> database.  When I extract it, non-printable characters come out as a
>> backslash followed by a three numeric characters representing the
>> numeric value of the data.  I guess this is what you would call a raw
>> Python string.  I want to convert those four characters ( in C-think,
>> say "\\012" ) into a single character and put it in a new string.
>
>Does this help?
>
> >>> s = 'foo \\012 bar'
> >>>
> >>> s.decode('string-escape')
>'foo \n bar'
> >>> print s.decode('string-escape')
>foo
>  bar
> >>>
>
>Note that the \n in the first one is because I didn't
>*print* the result, but merely allowed the interpreter
>to call repr() on it.  repr() for a newline is of course
>backslash-n, so that's what you see (inside quotation marks)
>but the string itself has only 9 characters in it, as
>you wished.

When I wonder how many characters are actually in a_string, I find
list(a_string) helpful, which BTW also re-represents equivalent escapes
in a consistent way, e.g., note \n's at the end:

 >>> s= 'escapes \\n \n \x0a \012'
 >>> list(s)
 ['e', 's', 'c', 'a', 'p', 'e', 's', ' ', '\\', 'n', ' ', '\n', ' ', '\n', ' ', 
'\n']

OTOH, don't try that with '\a':

 >>> list('\a \x07 \07')
 ['\x07', ' ', '\x07', ' ', '\x07']

Why not like \n above or like \t

 >>> list('\t \x09 \011')
 ['\t', ' ', '\t', ' ', '\t']

Is this fixed by now? It's not news ;-)

 >>> '\a'
 '\x07'

Regards,
Bengt Richter


Interned Strings

2006-01-10 Thread Dave
Hello All,

I'm trying to clarify how Python avoids byte by byte
string comparisons most of the time. As I understand,
dictionaries keep strings, their keys (hash values),
and caches of their keys. Caching keys helps to avoid
recalculation of a string's hash value. So, when two
strings need to be compared, only their cached keys
are compared, which improves performance as there is
no need for byte by byte comparison.

Also, there is a global interning dictionary that
keeps interned strings. What I don't understand is why
strings are interned. How does it help with string
comparisons? 

Thank you.



quoting strings

2006-01-31 Thread Michal Duda
Hi,

Is there any equivalent of Perl's "quotemeta" function in Python?


I know that I can use this on a string literal: r'\\test',
but I need to do this on a variable.

Thanks in advance

Michal


python strings

2006-05-03 Thread mike7411
Is it possible for python strings to contain a zero byte?



concatenating strings

2006-12-15 Thread EHC
hello!

since i am a py noob, please bear with me ; )

how is it possible to concat a string and an integer in a
print-command? i've tried

print "This robot is named %s. The current speed setting is %d, and %s
has a lifetime of %d" % (self.name , self.speed , self.name)

as well as

print "This robot is named %s. The current speed setting is %d, and %s
has a lifetime of %d" & self.name % self.speed % self.name

though nothing works out...

background is a class named Robot with members speed, name, etc...

tia,
Erich



Matching Strings

2007-02-09 Thread rshepard
  I'm not sure how to change a string so that it matches another one.

  My application (using wxPython and SQLite3 via pysqlite2) needs to compare
a string selected from the database into a list of tuples with another
string selected in a display widget.

  An extract of the relevant code is:

    selName = self.polTree.GetItemText(selID)
    ...
    for item in self.appData.polNat:
        print 'Item: ', item, '\n', 'selName: ', selName, '\n'
        if item == selName:
            print '* ', self.appData.polNat[1]

  The last comparison and print statement never work because the strings are
presented this way:

Item:  (u'ground water',) 
selName:  ground water

  What do I need to do to 'Item' to strip the parentheses, unicode symbol,
single quotes, and comma? Do I want 'raw' output? If so, how do I specify
that in the line 'if item == selName:'?

TIA,

Rich


Concatenating strings

2006-06-30 Thread John Henry
Sorry if this is a dumb question.

I have a list of strings (some 10,000+) and I need to concatenate them
together into one very long string.  The obvious method would be, for
example:

alist=["ab","cd","ef",.,"zzz"]
blist = ""
for x in alist:
   blist += x

But is there a cleaner and faster way of doing this?

Thanks,



weird strings question

2005-02-25 Thread Lucas Raab
Is it possible to assign a string a numerical value?? For example, in 
the string "test" can I assign a number to each letter as in "t" = 45, 
"e" =  89, "s" = 54, and so on and so forth??

TIA


Re: Indexing strings

2005-03-04 Thread Patrick Useldinger
Fred wrote:
> I am looking for a way to find out what the index of a
> certain letter in a string is.
> My example:
>
> for x in text:
>     if x == ' ':
>         list = text[:  # Here I need the index of the space the
>                        # program found during the loop...
>
> Is there any possibility to find the index of the space???
> Thanks for any help!
> Fred

Use the index method, e.g.: text.index(' ').
What exactly do you want to do?
-pu


Re: Indexing strings

2005-03-04 Thread Steve Holden
Fred wrote:
> Hi everybody
>
> I am looking for a way to find out what the index of a
> certain letter in a string is.
> My example:
>
> for x in text:
>     if x == ' ':
>         list = text[:  # Here I need the index of the space the
>                        # program found during the loop...
>
> Is there any possibility to find the index of the space???
> Thanks for any help!
> Fred

Perhaps you need something at a higher level (though you could use 
text.find(" ") for the first occurrence). I suspect you might want 
split(). Fred, meet split(). split(), meet Fred.

 >>> s = "The quick brown python swallows the lazy mongoose"
 >>> s.split()
['The', 'quick', 'brown', 'python', 'swallows', 'the', 'lazy', 'mongoose']
 >>> s.split(None)
['The', 'quick', 'brown', 'python', 'swallows', 'the', 'lazy', 'mongoose']
 >>> s.split(None, 3)
['The', 'quick', 'brown', 'python swallows the lazy mongoose']
 >>> s.split(None, 1)
['The', 'quick brown python swallows the lazy mongoose']
 >>>
regards
 Steve
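
A short sketch, in Python 3, of the options mentioned in this thread side by side. The sample sentence and the variable names are made up for illustration.

```python
text = "The quick brown python"

first_space = text.find(" ")      # returns -1 when nothing is found
same_index = text.index(" ")      # raises ValueError when nothing is found

# Every space position, in case the indexes themselves are needed.
all_spaces = [i for i, ch in enumerate(text) if ch == " "]

# If the goal is just the list of words, split() does the whole job.
words = text.split()

print(first_space, all_spaces, words)
```

For chopping a sentence into words, split() alone is enough; the indexes only matter when the positions are needed for something else.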


Re: Indexing strings

2005-03-04 Thread Fred
> Use the index method, e.g.: text.index(' ').
> What exactly do you want to do?

That was exactly what I was searching for. I needed a program that
chopped up a string into its words and then saved them into a list. I
think I got this done...
Thanks for the help


Re: Indexing strings

2005-03-05 Thread Patrick Useldinger
Fred wrote:
> That was exactly what I was searching for. I needed a program that
> chopped up a string into its words and then saved them into a list. I
> think I got this done...

There's a function for that: text.split().
You should really have a look at the Python docs. Also, 
http://diveintopython.org/ and http://www.gnosis.cx/TPiP/ are great 
tutorials.

-pu


Strings and Lists

2005-04-18 Thread Tom Longridge
My current Python project involves lots of repeating code blocks,
mainly centred around a binary string of data. It's a genetic
algorithm in which there are lots of strings (the chromosomes) which
get mixed, mutated and compared a lot.

Given Python's great list processing abilities and the relative
inefficiency of some string operations, I was considering using a list
of True and False values rather than a binary string.

I somehow doubt there would be a clear-cut answer to this, but from
this description, does anyone have any reason to think that one way
would be much more efficient than the other? (I realise the best way
would be to do both and `timeit` to see which is faster, but it's a
sizeable program and if anybody considers it a no-brainer I'd much
rather know now!)

Any advice would be gladly received.


Enumerating formatting strings

2005-04-18 Thread Steve Holden
I was messing about with formatting and realized that the right kind of 
object could quite easily tell me exactly what accesses are made to the 
mapping in a string % mapping operation. This is a fairly well-known 
technique, modified to tell me what keys would need to be present in any 
mapping used with the format.

class Everything:
    def __init__(self, format="%s", discover=False):
        self.names = {}
        self.values = []
        self.format=format
        self.discover = discover
    def __getitem__(self, key):
        x = self.format % key
        if self.discover:
            self.names[key] = self.names.get(key, 0) + 1
        return x
    def nameList(self):
        if self.names:
            return ["%-20s %d" % i for i in self.names.items()]
        else:
            return self.values
    def __getattr__(self, name):
        print "Attribute", name, "requested"
        return None
    def __repr__(self):
        return ""  % id(self)

def nameCount(template):
    et = Everything(discover=True)
    p = template % et
    nlst = et.nameList()
    nlst.sort()
    return nlst

for s in nameCount("%(name)s %(value)s %(name)s"):
    print s

The result of this effort is:

name                 2
value                1

I've been wondering whether it's possible to perform a similar analysis 
on non-mapping-type format strings, so as to know how long a tuple to 
provide, or whether I'd be forced to lexical analysis of the form string.

regards
 Steve


splitting delimited strings

2005-06-15 Thread Mark Harrison
What is the best way to process a text file of delimited strings?
I've got a file where strings are quoted with at-signs, @like [EMAIL PROTECTED]
At-signs in the string are represented as doubled @@.

What's the most efficient way to process this?  Failing all
else I will split the string into characters and use a FSM,
but it seems that's not very pythonesque.

@rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44
@pv@ 0 @db.changex@ 44 44 @mh@ @mh@ 1118875308 0 @ :@@: :: @

(this is from a perforce journal file, btw)

Many TIA!
Mark

-- 
Mark Harrison
Pixar Animation Studios


Re: Interned Strings

2006-01-10 Thread Martin v. Löwis
Dave wrote:
> I'm trying to clarify how Python avoids byte by byte
> string comparisons most of the time. As I understand,
> dictionaries keep strings, their keys (hash values),
> and caches of their keys. 

If you are talking about dictionaries with string keys,
you also have the dictionary values, of course.

> Caching keys helps to avoid
> recalculation of a string's hash value. 

Correct (s/keys/string hash values/).

> So, when two
> strings need to be compared, only their cached keys
> are compared, which improves performance as there is
> no need for byte by byte comparison.

No. When comparing two strings s1 and s2, the following
operations occur:
1. Are s1 and s2 the very same string? If yes, they
   are equal.
2. Else, do they have the same size, the same first
   byte (which might be a null byte), and do they
   compare equal, byte-by-byte?
   If yes, they are equal; if not, they are not equal.
3. Is it perhaps some other compare operation (<,
   <=, >, >=) that we want to perform? Do the
   slow algorithm.

As you can see, the string hash is never consulted when
comparing string objects. It is only consulted to
find the potential dictionary slot in the first place.

> Also, there is a global interning dictionary that
> keeps interned strings. What I don't understand is why
> strings are interned. How does it help with string
> comparisons? 

When you look up a dictionary entry, this happens:
1. compute the key hash.
2. find the corresponding dictionary slot
   If the slot is empty, KeyError.
3. compare the slot key with the search key.
   If they are equal: value found.
   If they are different: collision, go to the
   next key.

Interned strings speed up step 1 and step 3. If
you only have interned strings throughout, you always
also have the hash value. Of course, you had to
compute the hash value when adding the string
to the interning dictionary.

The real speedup is in step 3, and based on
the assumption that collisions won't happen:
if you lookup the key (e.g. to find the value
of a global variable), you find the slot using
the computed hash. Then:
1. if the slot is empty, it's a KeyError.
2. if the slot is not empty, you first compare
   for pointer equality. As collisions are
   supposedly unlikely, this will be an equal
   string most of the time. Then, if you have
   interning, it even will be the *same* string,
   so you only need to compare pointers to find
   out they are the same string.

So assuming all strings are interned, this is how
you do the dictionary lookup.
1. fetch the hash value from the key (no need to
   compute it - it's already cached)
2. go to the slot in the dict.
3. if the slot is empty (==NULL): KeyError
4. otherwise: success.
As you can see, in that case, there is no need to
compare the string values. If there are collisions,
and if not all strings are interned, the algorithm
gets more complicated, but the four items above are
assumed to be the common case.

Regards,
Martin
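
The pointer-equality shortcut described above can be observed from Python code. This is only a sketch for Python 3, where interning is exposed as sys.intern (Python 2 had a builtin intern()); the strings and variable names are made up, and what the interpreter does internally is as described in the post, not shown here.

```python
import sys

# Build equal strings at run time so the compiler cannot share one constant.
parts = ["global", "_", "name"]
c = "".join(parts)
d = "".join(parts)

# Interning maps every equal string to one shared object.
a = sys.intern(c)
b = sys.intern(d)

print(c == d)                # equal contents
print(a is b)                # same object, so an identity check is enough
print(hash(a) == hash(b))    # equal strings always hash equal
```

Without interning, c and d are typically two distinct objects that compare equal only after their contents are examined.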


Re: quoting strings

2006-01-31 Thread Fredrik Lundh
Michal Duda wrote:

> is any equivalent to perl "quotemeta" function in python?

s = re.escape(s)

> I know that I can use this on string: r'\\test'
> but I need to do this on the variable

your example is a string literal, which isn't really the same thing as
a RE pattern in Python.
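
A small sketch of re.escape applied to a variable, in Python 3. The sample text is hypothetical and chosen because it is full of RE metacharacters.

```python
import re

literal = "1+1=2 (really?)"      # hypothetical text with RE metacharacters
haystack = "proof that 1+1=2 (really?)"

# Escaped: every metacharacter is matched literally.
pattern = re.escape(literal)
match = re.search(pattern, haystack)

# Unescaped: "+", "(", ")" and "?" keep their RE meaning, so no match here.
unescaped = re.search(literal, haystack)

print(match.group(0))
print(unescaped)
```

The exact escaped form differs between Python versions, so it is safer to rely on what the pattern matches than on how it prints.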

 





List of strings

2005-08-17 Thread Mohammed Altaj
Hi All

Thanks for your reply. What I am doing is reading from a file
using readlines(). I would like to check whether any of these lines
is contained in another one, and if it is, I would like to
delete it.

['0132442\n', '13\n', '24\n'] 

'13' is already in '0132442'
'24' is already in '0132442' 

Thanks 





Construct raw strings?

2005-09-07 Thread Thomas W
I got a stupid problem; on my WinXP-box I want to scan the filesystem
and enter a  path to scan like this :

path_to_scan = 'd:\test_images'

This is used in a larger context and joined like

real_path_after_scanning = os.path.join(path_to_scan, somepart, 'foo',
'bar', filename)

Using os.path.exists(real_path_after_scanning) returns False. The
problem is that some of the parts being joined contain escape
characters, like \. If I take the separate parts and join them using
the interpreter, like :
>>> f = r'd:\test_images\something\foo\bar\test.jpg'

it works ok and os.path.exists(f) returns True, but I cannot put that
r' in front using the os.path.join-method in my code.

I don't know if this makes any sense at all, but I'm lost. Damn those
stupid windows-paths !! 

Thanks in advance,
Thomas



Lines of Strings

2005-09-16 Thread Reem Mohammed
Hi

Suppose we have data file like this one (Consider all lines as strings )

1 2 3 3 4 4 4 4 5 6
2 2 2 5 5 5 6
3 2 1 1 1 3 3 3 4 6

I would like to remove a line if it belongs to another one, and to be able 
to do this even when the longer line comes after the shorter one.

Thanks



Re: python strings

2006-05-03 Thread Grant Edwards
On 2006-05-03, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:

> Is it possible for python strings to contain a zero byte?

Yes.



Re: python strings

2006-05-03 Thread Gerhard Häring

[EMAIL PROTECTED] wrote:
> Is it possible for python strings to contain a zero byte?

Yes. Here's how to produce one:

[EMAIL PROTECTED]:~$ python
Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "\x000"
>>> s[0] == chr(0)
True
>>>

-- Gerhard


Re: python strings

2006-05-03 Thread Bryan
Gerhard Häring wrote:
> Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
> [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = "\x000"
> >>> s[0] == chr(0)
> True
> 
> - -- Gerhard

this works too :)

 >>> s = '\x001'
 >>> s[0] == chr(0)
True
 >>> s = '\x00abc'
 >>> s[0] == chr(0)
True


I think it would be clearer not to use 3 digits for this example, since \x 
only consumes the next two hex digits, not 3.

 >>> s = '\x00'
 >>> s[0] == chr(0)
True


bryan
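
The point about digit counts can be checked directly. A sketch in Python 3, with variable names made up for illustration:

```python
two_chars = "\x000"     # \x00 followed by the ordinary digit "0"
one_char = "\x00"       # \x always consumes exactly two hex digits
octal_form = "\0"       # octal escapes take one to three digits

print(len(two_chars), len(one_char), len(octal_form))
print(repr(two_chars[1]))
```

So the three-digit version in the earlier example holds a zero byte plus a visible "0", which is why s[0] == chr(0) still came out True.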



Re: python strings

2006-05-04 Thread Sion Arrowsmith
Bryan  <[EMAIL PROTECTED]> wrote:
> >>> s = '\x00'
> >>> s[0] == chr(0)
>True

That's a little excessive when:

>>> s = '\0'
>>> s[0] == chr(0)
True

Oh, and to reassure the OP that that null really is *in* the string:

>>> len(s)
1


raw strings and \

2006-03-05 Thread plahey
I thought I understood raw strings, then I got bitten by this:

x=r'c:\blah\'

which is illegal!  I thought that \ had no special meaning in a raw
string, so why can't it be the last character?

making me do:

x=r'c:\blah' '\\'

is just silly...



parsing combination strings

2007-03-21 Thread PKKR
I need a fast and efficient way to parse a combination string(digits +
chars)

ex: s = "12ABA" or "1ACD" or "123CSD" etc

I want to parse the above string such that I can grab only the
leading digits and ignore the rest of the characters,

so if I have s = "12ABA", parser(s) should give me "12" (or "1" or
"123" for the other examples).

I can think of a quick and dirty way by checking each element in the
string with 'str.isdigit()' and stopping once it's not a digit, but
I would appreciate a more elegant way.



Clean "Durty" strings

2007-04-01 Thread Ulysse
Hello,

I need to clean the string like this :

string =
"""
bonne mentalité mec!:) \nbon pour
info moi je suis un serial posteur arceleur dictateur ^^*
\nmais pour avoir des resultats probant il
faut pas faire les mariolles, comme le "fondateur" de bvs
krew \n
mais pour avoir des resultats probant il faut pas faire les mariolles,
comme le "fondateur" de bvs krew \n
"""

into :
bonne mentalité mec!:) bon pour info moi je suis un serial posteur
arceleur dictateur ^^* mais pour avoir des resultats probant il faut
pas faire les mariolles, comme le "fondateur" de bvs krew
mais pour avoir des resultats probant il faut pas faire les mariolles,
comme le "fondateur" de bvs krew

To do this I would like to use only standard libraries.

Thanks



Re: concatenating strings

2006-12-15 Thread Laurent Pointal
EHC a écrit :
> hello!
> 
> since i am a py noob, please bear with me ; )
> 
> how is it possible to concat a string and an integer in a
> print-command? i've tried
> 
> print "This robot is named %s. The current speed setting is %d, and %s
> has a lifetime of %d" % (self.name , self.speed , self.name)

Four % format specifiers with only three arguments to format. It cannot work...

print "This robot is named %s. The current speed setting is %d, and %s
has a lifetime of %d" % (self.name , self.speed , self.name, self.lifetime)

...

> background is a class named Robot with members speed, name, etc...

May try this too:
print "This robot is named %(name)s. The current speed setting is
%(speed)d, and %(name)s has a lifetime of %(lifetime)d" % self.__dict__

[note: lifetime may be a dynamically calculated value, and should then be
provided via an accessor attribute]
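
A self-contained sketch of the mapping-based form in Python 3. The Robot class below is hypothetical: the thread only says it has members such as speed and name, so the constructor, the describe method and the sample values are assumptions.

```python
class Robot:
    """Hypothetical stand-in for the class mentioned in the thread."""

    def __init__(self, name, speed, lifetime):
        self.name = name
        self.speed = speed
        self.lifetime = lifetime

    def describe(self):
        # Each %(key)s looks its value up in the mapping on the right,
        # so a name may be used twice without being passed twice.
        return ("This robot is named %(name)s. The current speed setting is "
                "%(speed)d, and %(name)s has a lifetime of %(lifetime)d"
                % self.__dict__)


message = Robot("Marvin", 3, 42).describe()
print(message)
```

With the tuple form the number of arguments has to match the number of specifiers exactly; with the mapping form only the keys have to exist.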


Re: concatenating strings

2006-12-15 Thread Erich Pul
thank you, i just plainly overlooked it ; )

now it works



Re: concatenating strings

2006-12-15 Thread Caleb Hattingh
Hi Erich

If you're going to be doing a lot of string substitution, you should
look at the Templating support in the library:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/304005

and (a little bit fancier):

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/335308

Regards
Caleb


On Dec 15, 12:18 pm, "Erich Pul" <[EMAIL PROTECTED]> wrote:
> thank you, i just plainly overlooked it ; )
> 
> now it works
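
For comparison, the standard library also ships string.Template (available since Python 2.4). The sketch below is written for Python 3 and is not taken from the linked recipes, which may work differently; the template text and values are made up.

```python
from string import Template

# $name style placeholders instead of % specifiers.
template = Template(
    "This robot is named $name. The current speed setting is $speed."
)

text = template.substitute(name="Marvin", speed=3)
print(text)

# safe_substitute leaves unknown placeholders in place instead of raising.
partial = template.safe_substitute(name="Marvin")
print(partial)
```

substitute raises KeyError when a placeholder has no value, which catches the kind of missing argument seen earlier in this thread.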



Encoding / decoding strings

2007-01-05 Thread [EMAIL PROTECTED]
Hey Everyone,

Was just wondering if anyone here could help me. I want to encode (and
subsequently decode) email addresses to use in URLs. I believe that
this can be done using MD5.

I can find documentation for encoding the strings, but not decoding
them. What should I do to encode =and= decode strings with MD5?

Many Thanks in Advance,
Oliver Beattie



sum and strings

2006-08-17 Thread Paddy
I was browsing the Voidspace blog item on "Flattening Lists", and
followed up on the use of sum to do the flattening.
A solution was:

>>> nestedList = [[1, 2], [3, 4], [5, 6]]
>>> sum(nestedList,[])
[1, 2, 3, 4, 5, 6]

I would not have thought of using sum in this way. When I did help(sum)
the docstring was very number-centric: It went further, and precluded
its use on strings:

>>> help(sum)
Help on built-in function sum in module __builtin__:

sum(...)
    sum(sequence, start=0) -> value

    Returns the sum of a sequence of numbers (NOT strings) plus the value
    of parameter 'start'.  When the sequence is empty, returns start.

The string preclusion would not help with duck-typing (in general), so
I decided to consult the ref doc on sum:

sum( sequence[, start])

Sums start and the items of a sequence, from left to right, and returns
the total. start defaults to 0. The sequence's items are normally
numbers, and are not allowed to be strings. The fast, correct way to
concatenate sequence of strings is by calling ''.join(sequence). Note
that sum(range(n), m) is equivalent to reduce(operator.add, range(n),
m) New in version 2.3.


The above was a lot better description of sum for me, and with an
inquisitive mind, I like to think that I might have come up with using
sum to flatten nestedList :-)
But there was still that warning about using strings.

I therefore  tried sum versus their reduce "equivalent" for strings:

>>> import operator
>>> reduce(operator.add, "ABCD".split(), '')
'ABCD'
>>> sum("ABCD".split(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: sum() can't sum strings [use ''.join(seq) instead]
>>>

Well, after all the above, there is a question:

  Why not make sum work for strings too?

It would remove what seems like an arbitrary restriction and aid
duck-typing. If the answer is that the sum optimisations don't work for
the string datatype, then wouldn't it be better to put a trap in the
sum code diverting strings to the reduce equivalent?

Just a thought,

- Paddy.
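
The three approaches from this post, gathered into one Python 3 sketch (reduce now lives in functools). The list of parts is made up so that there is more than one item to concatenate.

```python
import operator
from functools import reduce

nested = [[1, 2], [3, 4], [5, 6]]
flattened = sum(nested, [])              # works: lists support +

parts = "A B C D".split()
joined = "".join(parts)                  # the documented way for strings
reduced = reduce(operator.add, parts, "")

try:
    sum(parts, "")                       # refused on purpose for strings
    sum_error = None
except TypeError as exc:
    sum_error = str(exc)

print(flattened, joined, reduced)
print(sum_error)
```

The reduce form gives the same result as join, so the restriction in sum is a deliberate check and not a limitation of addition on strings.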



Defining constant strings

2006-08-27 Thread Hans



Hi,
 
I want to define a couple of constant strings, like 
in C:
#define mystring "This is my string"
or using a const char construction.
 
Is this really not possible in Python?
 
Hans

Strings in Python

2007-02-08 Thread Johny
Playing a little more with strings, I found out that the string.find
function gives the position of
the first occurrence of the substring in the string.
Is there a way to find the positions of all occurrences of the substring?
To explain more,
let's suppose

mystring='12341'
import string

>>> string.find(mystring ,'1')
0

But I need to find the position of the other '1' in mystring too.
Is it possible?
Or must I use a regex?
Thanks for the help
L
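
One possible approach, sketched here for Python 3, is to call the
find method repeatedly and pass the position to resume from as its
second argument. The helper name find_all is hypothetical, not a
standard library function.

```python
def find_all(haystack, needle):
    """Return the start index of every occurrence of needle in haystack."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are found too.
        start = haystack.find(needle, start + 1)
    return positions


mystring = "12341"
ones = find_all(mystring, "1")
```

No regex is needed for a fixed substring; re.finditer would be the
alternative when a pattern is involved.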

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread James Stroud
[EMAIL PROTECTED] wrote:
>   I'm not sure how to change a string so that it matches another one.
> 
>   My application (using wxPython and SQLite3 via pysqlite2) needs to compare
> a string selected from the database into a list of tuples with another
> string selected in a display widget.
> 
>   An extract of the relevant code is:
> 
> selName = self.polTree.GetItemText(selID)
> ...  
> for item in self.appData.polNat:
>   print 'Item: ', item, '\n', 'selName: ', selName, '\n'
>   if item == selName:
>     print '* ', self.appData.polNat[1]
> 
>   The last comparison and print statement never work because the strings are
> presented this way:
> 
>   Item:  (u'ground water',) 
>   selName:  ground water
> 
>   What do I need to do to 'Item' to strip the parentheses, unicode symbol,
> single quotes, and comma? Do I want 'raw' output? If so, how do I specify
> that in the line 'if item == selName:'?
> 
> TIA,
> 
> Rich

Assuming item is "(u'ground water',)"

import re
item = re.compile(r"\(u'([^']*)',\)").search(item).group(1)

James
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread rshepard-at-appl-ecosys . com
On 2007-02-10, [EMAIL PROTECTED] wrote:

>   if item == selName:

  Slicing doesn't seem to do anything -- if I've done it correctly. I
changed the above to read,

if item[2:-2] == selName:

but the output's the same.

Rich
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread Paul McGuire
On Feb 9, 6:03 pm, [EMAIL PROTECTED] wrote:
>   I'm not sure how to change a string so that it matches another one.
>
>   My application (using wxPython and SQLite3 via pysqlite2) needs to compare
> a string selected from the database into a list of tuples with another
> string selected in a display widget.
>
>   An extract of the relevant code is:
>
> selName = self.polTree.GetItemText(selID)
> ...  
> for item in self.appData.polNat:
>   print 'Item: ', item, '\n', 'selName: ', selName, '\n'
>   if item == selName:
>     print '* ', self.appData.polNat[1]
>
>   The last comparison and print statement never work because the strings are
> presented this way:
>
> Item:  (u'ground water',)
> selName:  ground water
>
>   What do I need to do to 'Item' to strip the parentheses, unicode symbol,
> single quotes, and comma? Do I want 'raw' output? If so, how do I specify
> that in the line 'if item == selName:'?
>
> TIA,
>
> Rich

I suspect that the variable item is *not* a string, but a tuple whose
zero'th element is a unicode string with the value u'ground water'.
Try comparing item[0] with selName.

From my command prompt:
>>> u'a' == 'a'
True

-- Paul

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread Larry Bates
rshepard-at-appl-ecosys.com wrote:
> On 2007-02-10, [EMAIL PROTECTED] wrote:
> 
>>   if item == selName:
> 
>   Slicing doesn't seem to do anything -- if I've done it correctly. I
> changed the above to read,
> 
>   if item[2:-2] == selName:
> 
> but the output's the same.
> 
> Rich

Use the interpreter to test things:

a="(u'ground water',)"

a[2:-2]
"'ground water'"

a[3:-3]
'ground water'

That is what you are looking for.

-Larry
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread Gabriel Genellina
En Fri, 09 Feb 2007 21:03:32 -0300, <[EMAIL PROTECTED]>  
escribió:

>   I'm not sure how to change a string so that it matches another one.
>
>   My application (using wxPython and SQLite3 via pysqlite2) needs to  
> compare
> a string selected from the database into a list of tuples with another
> string selected in a display widget.
>
>   An extract of the relevant code is:
>
> selName = self.polTree.GetItemText(selID)
> ...
> for item in self.appData.polNat:
>   print 'Item: ', item, '\n', 'selName: ', selName, '\n'
>   if item == selName:
>     print '* ', self.appData.polNat[1]
>
>   The last comparison and print statement never work because the strings  
> are
> presented this way:
>
>   Item:  (u'ground water',)
>   selName:  ground water

Forget about re and slicing and blind guessing...
item appears to be a tuple; in these cases repr is your friend. See what  
happens with:
print repr(item), type(item)

If it is in fact a tuple, you should ask *why* is it a tuple (maybe could  
have many items?). And if it's just an artifact and actually it always  
will be a single:

assert len(item)==1
item = item[0]
if item...

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread rshepard-at-appl-ecosys . com
On 2007-02-10, James Stroud <[EMAIL PROTECTED]> wrote:

> Assuming item is "(u'ground water',)"
>
> import re
> item = re.compile(r"\(u'([^']*)',\)").search(item).group(1)

James,

  I solved the problem when some experimentation reminded me that 'item' is
a list index and not a string variable. By changing the line to,

if item[0] == selName:

I get the matches correctly.

  Now I need to extract the proper matching strings from the list of tuples,
and I'm working on that.

Many thanks,

Rich
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread John Machin
On Feb 10, 11:03 am, [EMAIL PROTECTED] wrote:
>   I'm not sure how to change a string so that it matches another one.
>
>   My application (using wxPython and SQLite3 via pysqlite2) needs to compare
> a string selected from the database into a list of tuples with another
> string selected in a display widget.

Tuple? Doesn't that give you a clue?

>
>   An extract of the relevant code is:
>
> selName = self.polTree.GetItemText(selID)
> ...
> for item in self.appData.polNat:
>   print 'Item: ', item, '\n', 'selName: ', selName, '\n'
>   if item == selName:
>     print '* ', self.appData.polNat[1]
>
>   The last comparison and print statement never work because the strings are
> presented this way:

What you mean is: The way you have presented the strings is confusing
you, and consequently you have written a comparison that will not
work :-)

>
> Item:  (u'ground water',)

Hmmm. That comma in there is interesting. I wonder where the
parentheses came from. What did the man mutter about a list of tuples?

> selName:  ground water
>
>   What do I need to do to 'Item' to strip the parentheses, unicode symbol,
> single quotes, and comma?

Nothing. They're not there. It's all in your mind.

> Do I want 'raw' output?

What is "raw output"?

> If so, how do I specify
> that in the line 'if item == selName:'?

That's a comparison, not output.

Step 1: Find out what you've *really* got there:

Instead of
print 'Item: ', item, '\n', 'selName: ', selName, '\n'
do this:
print 'item', repr(item), type(item)
print 'selName', repr(selName), type(selName)

Step 2:
Act accordingly.

Uncle John's Crystal Balls (TM) predict that you probably need this:
if item[0] == selName:

HTH,
John
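
The two steps above can be sketched as follows in Python 3. The rows
list only imitates the shape of what a database cursor typically
returns (one tuple per row); the data and the names rows, sel_name and
matched are made up for illustration.

```python
# Each row comes back as a tuple, even when only one column is selected.
rows = [("ground water",), ("surface water",)]
sel_name = "ground water"

matched = None
for item in rows:
    # Step 1: look at what is really there.
    print(repr(item), type(item))
    # Step 2: compare the first element of the tuple, not the tuple itself.
    if item[0] == sel_name:
        matched = item[0]

# Unpacking in a comprehension gives a plain list of the names.
names = [name for (name,) in rows]
```

Comparing the tuple itself to the string is always False, which is
why the original test never fired.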

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread John Machin
On Feb 10, 12:01 pm, rshepard-at-appl-ecosys.com wrote:
> On 2007-02-10, James Stroud <[EMAIL PROTECTED]> wrote:
>
> > Assuming item is "(u'ground water',)"
>
> > import re
> > item = re.compile(r"\(u'([^']*)',\)").search(item).group(1)
>
> James,
>
>   I solved the problem when some experimentation reminded me that 'item' is
> a list index

AArrgghh it's not a list index, it's a ferschlugginer tuple containing
1 element.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread John Machin
On Feb 10, 11:58 am, Larry Bates <[EMAIL PROTECTED]> wrote:
> rshepard-at-appl-ecosys.com wrote:
> > On 2007-02-10, [EMAIL PROTECTED] wrote:
>
> >>   if item == selName:
>
> >   Slicing doesn't seem to do anything -- if I've done it correctly. I
> > changed the above to read,
>
> >if item[2:-2] == selName:
>
> > but the output's the same.
>
> > Rich
>
> Use the interpreter to test things:
>
> a="(u'ground water',)"
>
> a[2:-2]
> "'ground water'"
>
> a[3:-3]
> 'ground water'
>
> That is what you are looking for.

True. Unfortunately he was looking for the wrong thing.
He has:
a = (u'ground water', )

a[2:-2] -> ()
a[3:-3] -> ()
() == 'ground water' -> False

HTH,
John

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Matching Strings

2007-02-09 Thread Steven D'Aprano
On Fri, 09 Feb 2007 16:17:31 -0800, James Stroud wrote:

> Assuming item is "(u'ground water',)"
> 
> import re
> item = re.compile(r"\(u'([^']*)',\)").search(item).group(1)

Using a regex is a lot of overhead for a very simple operation.

If item is the string "(u'ground water',)"

then item[3:-3] will give "ground water".

>>> import re, timeit
>>> item = "(u'ground water',)"

>>> timeit.Timer('item[3:-3]', 'from __main__ import item').repeat()
[0.56174778938293457, 0.53341794013977051, 0.53485989570617676]

>>> timeit.Timer( \
... '''re.compile(r"\(u'([^']*)',\)").search(item).group(1)''', \
... 'from __main__ import item; import re').repeat() 
[9.2723720073699951, 9.2299859523773193, 9.2523660659790039]


However, as many others have pointed out, the Original Poster's problem
isn't that item has leading brackets around the substring he wants, but
that it is a tuple.



-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


locating strings approximately

2006-06-28 Thread BBands
I'd like to see if a string exists, even approximately, in another. For
example if "black" exists in "blakbird" or if "beatles" exists in
"beatlemania". The application is to look through a long list of songs
and return any approximate matches along with a confidence factor. I
have looked at edit distance, but that isn't a good choice for finding
a short string in a longer one. I have also explored
difflib.SequenceMatcher and .get_close_matches, but what I'd really
like is something like:

a = FindApprox("beatles", "beatlemania")
print a
0.857

Any ideas?

jab
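
One way this might be approached with the standard library, sketched
for Python 3: slide a window the length of the short string across
the long one and keep the best difflib.SequenceMatcher ratio. The
function name find_approx is hypothetical, and the score is simply
SequenceMatcher's ratio, not a calibrated confidence value.

```python
from difflib import SequenceMatcher


def find_approx(needle, haystack):
    """Best similarity ratio between needle and any same-length slice of haystack."""
    n = len(needle)
    if n == 0 or len(haystack) < n:
        # Nothing to slide over; compare the two strings directly.
        return SequenceMatcher(None, needle, haystack).ratio()
    best = 0.0
    for i in range(len(haystack) - n + 1):
        window = haystack[i:i + n]
        score = SequenceMatcher(None, needle, window).ratio()
        best = max(best, score)
    return best


score = find_approx("beatles", "beatlemania")
```

For "beatles" against "beatlemania" the best window is "beatlem",
which shares six of seven characters, so the score comes out at about
0.857. The cost grows with the product of both lengths, so a long
song list may need a cheaper pre-filter.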

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Concatenating strings

2006-06-30 Thread Robert Kern
John Henry wrote:
> Sorry if this is a dumb question.
> 
> I have a list of strings (some 10,000+) and I need to concatenate them
> together into one very long string.  The obvious method would be, for
> example:
> 
> alist=["ab","cd","ef",...,"zzz"]
> blist = ""
> for x in alist:
>     blist += x
> 
> But is there a cleaner and faster way of doing this?

blist = ''.join(alist)
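
A short sketch of the same call with a few variations, assuming
Python 3; the sample data is made up.

```python
alist = ["ab", "cd", "ef", "zzz"]

# The string the method is called on is the separator; '' means none.
blist = "".join(alist)

# Any separator works the same way.
csv_line = ",".join(alist)

# join only accepts strings, so other items are converted first.
digits = "".join(str(n) for n in [1, 2, 3])
```

join builds the result in one pass, whereas repeated += may copy the
growing string on each step.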

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Concatenating strings

2006-06-30 Thread Steven Bethard
John Henry wrote:
> I have a list of strings (some 10,000+) and I need to concatenate them
> together into one very long string.  The obvious method would be, for
> example:
> 
> alist=["ab","cd","ef",...,"zzz"]
> blist = ""
> for x in alist:
>     blist += x
> 
> But is there a cleaner and faster way of doing this?

blist = ''.join(alist)

STeVe
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 20050119: quoting strings

2005-01-10 Thread Steve Holden
Xah Lee wrote:
> #strings are enclosed in double quotes. e.g.
> a="this and that"
> print a
> #multiple lines must have an escape backslash at the end:
> b="this\n\
> and that"
> print b
> #One can use r"" for raw string.
> c=r"this\n\
> and that"
> print c
> #To avoid the backslash escape, one can use triple double quotes to
> print as it is:
> d="""this
> and
> that"""
> print d
> ---
> # in Perl, strings in double quotes act like Python's triple """.
> # A string in single quotes is like Python's raw r"".
> # Alternatively, they can be done as qq() or q() respectively,
> #and the bracket can be just about any character,
> # matching or not. (so that escapes can be easily avoided)
> $a=q(here, everthing is literal, $what or \n or what not.);
> $b=qq[this is
> what ever including variables $a that will be
> evaluated, and "quotes" needn't be quoted.];
> print "$a\n$b";
> #to see more about perl strings, do on shell prompt
> #perldoc -tf qq
> Xah
>  [EMAIL PROTECTED]
>  http://xahlee.org/PageTwo_dir/more.html

Well, that gets that sorted out, then.

Tomorrow: using single quotes. Using single quotes. The larch.

regards
 Steve
--
Steve Holden   http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/
Holden Web LLC  +1 703 861 4237  +1 800 494 3119
--
http://mail.python.org/mailman/listinfo/python-list


unicode and data strings

2005-01-28 Thread Laszlo Zsolt Nagy
Hello
I have a program where I would like to calculate a checksum. It looks
like this:

n = self.__calc_n(seed1,seed2,pwd)

This is required for an OTP (One Time Password) algorithm. My code was
working before without problems.
Now I installed Python 2.3.4 and wxPython 2.5.3 (with unicode support).
I'm getting this exception:

exceptions.UnicodeDecodeError:'ascii' codec can't decode byte 0x91 in
position 0: ordinal not in range(128)

Then I tried this:
>>>'\x91'.decode('utf8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0x91 in position 0:
unexpected code byte

Here is the question: I would like to use simple binary data strings
(like an array of bytes).
I do not care about encodings. How do I do that?
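
As a hedged illustration of "an array of bytes" in Python 3 terms
(the post itself concerns Python 2.3, where a plain str already is a
byte string and the error most likely comes from it being mixed with
a unicode object): the bytes type holds raw data with no encoding
attached. The sample values and the modulo-256 checksum are made up
and are not the OTP algorithm from the post.

```python
data = bytes([0x91, 0x00, 0xFF])  # raw bytes, no encoding involved

# Iterating over bytes yields integers, which suits checksum arithmetic.
checksum = sum(data) % 256

# If a text string is unavoidable, latin-1 maps every byte value 0-255
# to the code point with the same number, so the round trip is lossless.
as_text = data.decode("latin-1")
round_trip = as_text.encode("latin-1")

# Decoding as UTF-8 fails, as in the post: 0x91 cannot start a character.
try:
    data.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```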

Thanks,
  Laci 2.0
--
http://mail.python.org/mailman/listinfo/python-list


Converting strings to dates

2005-02-04 Thread Chermside, Michael
I'm trying to convert a string back into a datetime.date.

First I'll create the string:


Python 2.4 (#60, Nov 30 2004, 11:49:19) [MSC v.1310 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import time, datetime
>>> a_date = datetime.date.today()
>>> s = str(a_date)
>>> print s
2005-02-04

Now I convert it back:

>>> new_date = datetime.date.fromtimestamp(
... time.mktime(time.strptime(s, '%Y-%m-%d')))
>>> new_date
datetime.date(2005, 2, 4)


WOW, that's ugly. Is there a more concise way to do this?
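
For comparison, a sketch of two shorter spellings available in later
Python versions than the 2.4 used above: datetime.datetime.strptime
(added in Python 2.5) and datetime.date.fromisoformat (added in
Python 3.7). The code below assumes Python 3.7 or newer.

```python
import datetime

s = "2005-02-04"

# Parse with an explicit format, then drop the time part.
new_date = datetime.datetime.strptime(s, "%Y-%m-%d").date()

# For ISO-formatted dates, which is what str(a_date) produces.
same_date = datetime.date.fromisoformat(s)
```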

-- Michael Chermside


--
http://mail.python.org/mailman/listinfo/python-list


Splitting strings - by iterators?

2005-02-25 Thread Jeremy Sanders
I have a large string containing lines of text separated by '\n'. I'm
currently using text.splitlines(True) to break the text into lines, and
I'm iterating over the resulting list.

This is very slow (when using 40 lines!). Other than dumping the
string to a file, and reading it back using the file iterator, is there a
way to quickly iterate over the lines?

I tried using newpos=text.find('\n', pos), and returning the chopped text
text[pos:newpos+1], but this is much slower than splitlines.

Any ideas?
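
One option that avoids writing a real file, sketched here for
Python 3, is to wrap the string in io.StringIO and use ordinary file
iteration (the Python 2 counterpart at the time would have been the
StringIO or cStringIO module). The helper name iter_lines is
hypothetical, and whether this beats splitlines for a given input
would need to be measured.

```python
import io


def iter_lines(text):
    # io.StringIO exposes the string through the file interface, so the
    # usual line iteration applies without building a list of all lines.
    for line in io.StringIO(text):
        yield line


text = "first\nsecond\n\nlast"
line_count = 0
for line in iter_lines(text):
    # Each line keeps its trailing "\n", except possibly the last one.
    line_count += 1
```

Unlike splitlines, this splits on "\n" only, which matches the
description of the data in the post.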

Thanks

Jeremy

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weird strings question

2005-02-25 Thread John Machin

Lucas Raab wrote:
> Is it possible to assign a string a numerical value?? For example, in
> the string "test" can I assign a number to each letter as in "t" = 45,
> "e" =  89, "s" = 54, and so on and so forth??
>
> TIA

>>> for c in 'abcd':
...     print c, ord(c)
...
a 97
b 98
c 99
d 100

If that isn't what you mean, you may need to rephrase your question.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weird strings question

2005-02-25 Thread Robert Kern
Lucas Raab wrote:
> Is it possible to assign a string a numerical value?? For example, in
> the string "test" can I assign a number to each letter as in "t" = 45,
> "e" =  89, "s" = 54, and so on and so forth??

Use a dictionary with the strings as keys.

string2num = {}
string2num['t'] = 45
string2num['e'] = 89
etc.
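
A slightly fuller sketch of the dictionary idea in Python 3, using
the letter values from the question; the names values and total are
only illustrative.

```python
# Letter values taken from the question.
string2num = {"t": 45, "e": 89, "s": 54}

# Look up each letter of the word in turn.
values = [string2num[ch] for ch in "test"]
total = sum(values)

# dict.get supplies a default for letters that have no assigned value.
unknown = string2num.get("x", 0)
```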
--
Robert Kern
[EMAIL PROTECTED]
"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter
--
http://mail.python.org/mailman/listinfo/python-list


remove strings from source

2005-02-26 Thread qwweeeit
For some Python code I am writing I need to remove all string
definitions from the source and substitute them with a placeholder.

To make it clearer:
line 45  sVar="this is the string assigned to sVar"
must be converted to:
line 45 sVar=s1

Such a substitution is recorded in a file as:
s0001[line 45]="this is the string assigned to sVar"

For the curious:
I am trying to implement a cross variable reference tool and the
variability (in length) of the string definitions (especially if
multi-line) can cause display problems.

I need your help in correctly identifying the strings (including the
r'xx..' or u'yy...' prefix as part of the string definition). The
problem is mainly with the multi-line definitions or with cached
strings (embedding chr() definitions or escape sequences).
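
One way to identify the string literals reliably, sketched here for
Python 3, is to let the standard tokenize module do the lexing: it
reports each literal, prefix and all, together with the line where it
starts, and it copes with multi-line strings. The helper name
find_strings and the sample source are made up for illustration;
f-strings on recent Python versions are tokenized differently and are
not handled here.

```python
import io
import tokenize


def find_strings(source):
    """Return (line_number, literal_text) for each string literal in source."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING:
            # tok.start is (row, column); tok.string keeps quotes and prefix.
            found.append((tok.start[0], tok.string))
    return found


sample = (
    'sVar = "this is the string assigned to sVar"\n'
    "pattern = r'\\d+'\n"
    'doc = """two\nlines"""\n'
)
found = find_strings(sample)
```

The token positions (tok.start and tok.end) could then drive the
actual substitution with placeholders.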
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: weird strings question

2005-02-26 Thread Lucas Raab
Robert Kern wrote:
> Lucas Raab wrote:
>> Is it possible to assign a string a numerical value?? For example, in
>> the string "test" can I assign a number to each letter as in "t" = 45,
>> "e" =  89, "s" = 54, and so on and so forth??
>
> Use a dictionary with the strings as keys.
>
> string2num = {}
> string2num['t'] = 45
> string2num['e'] = 89
>
> etc.

Thanks. That's what I was looking for, but was just unsure exactly how
to proceed.
--
http://mail.python.org/mailman/listinfo/python-list



binutils "strings" like functionality?

2005-03-03 Thread cjl
Hey all:

I am working on a little script that needs to pull the strings out of a
binary file, and then manipulate them with python.

The command line utility "strings" (part of binutils) has exactly the
functionality I need, but I was thinking about trying to implement this
in pure python.

I did some reading on opening and reading binary files, etc., and was
just wondering if people think this is possible, or worth my time (as a
learning exercise), or if something like this already exists.
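
As a rough idea of how small a pure-Python version can be, here is a
sketch for Python 3 that scans a bytes object for runs of printable
ASCII with a regular expression. The function name extract_strings
and the sample data are hypothetical, and the real utility has more
options (other encodings, offsets) than this covers.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of at least min_length printable ASCII bytes as text."""
    # 0x20-0x7e is the printable ASCII range, space through tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [run.decode("ascii") for run in pattern.findall(data)]


# In practice the data would come from open(path, "rb").read().
blob = b"\x00\x01hello world\x00ab\xff\xfeMZ header\x00"
found = extract_strings(blob)
```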

-cjl

-- 
http://mail.python.org/mailman/listinfo/python-list


Format strings that contain '%'

2005-03-08 Thread [EMAIL PROTECTED]
I'm trying to do something along the lines of

>>> print '%temp %d' % 1
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: unsupported format character 't' (0x74) at index 1

although, obviously I can't do this, since python thinks that the '%t'
is a format string.

I've tried obvious permutations like

>>> print '\%temp %d' % 1

which also doesn't work.

For (template) reasons, I can't do

>>> print '%s %d' % ('%temp',1)

Will anything else work here?
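
For reference, the % operator treats a doubled percent sign as a
literal one, so a backslash is not needed. A minimal sketch, written
for Python 3 (the same rule applies to the Python 2 print statement
in the post):

```python
# '%%' in the format string produces a single literal '%'.
line = '%%temp %d' % 1

# The same holds with a mapping on the right-hand side.
mapped = '%%temp %(value)d' % {'value': 1}
```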

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Strings and Lists

2005-04-18 Thread [EMAIL PROTECTED]
Hello Tom,

I think it is more efficient to use a list (with True and False
members) for genetic algorithms. Of course, there is a lot of work to
do to change from a binary string to a boolean list.

I program genetic algorithms in C#, and I guess I will have to modify
my program too, because my old program uses binary string manipulation.

Sincerely Yours,
Pujo

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Strings and Lists

2005-04-18 Thread Sidharth Kuruvila
Hi,
   I am not sure what sorts of operations you plan to do, but if you
intend to use fixed-length arrays or carry out repetitive
operations, you should probably look at numeric:
http://numeric.scipy.org/


On 18 Apr 2005 04:42:17 -0700, Tom Longridge <[EMAIL PROTECTED]> wrote:
> My current Python project involves lots repeatating code blocks,
> mainly centred around a binary string of data. It's a genetic
> algorithm in which there are lots of strings (the chromosomes) which
> get mixed, mutated and compared a lot.
> 
> Given Python's great list processing abilities and the relative
> inefficiencies some string operations, I was considering using a list
> of True and False values rather than a binary string.
> 
> I somehow doubt there would be a clear-cut answer to this, but from
> this description, does anyone have any reason to think that one way
> would be much more efficient than the other? (I realise the best way
> would be to do both and `timeit` to see which is faster, but it's a
> sizeable program and if anybody considers it a no-brainer I'd much
> rather know now!)
> 
> Any advice would be gladly recieved.
> --
> http://mail.python.org/mailman/listinfo/python-list
> 


-- 
http://blogs.applibase.net/sidharth
--
http://mail.python.org/mailman/listinfo/python-list


Re: Strings and Lists

2005-04-18 Thread Bill Mill
On 18 Apr 2005 04:42:17 -0700, Tom Longridge <[EMAIL PROTECTED]> wrote:
> My current Python project involves lots repeatating code blocks,
> mainly centred around a binary string of data. It's a genetic
> algorithm in which there are lots of strings (the chromosomes) which
> get mixed, mutated and compared a lot.
> 
> Given Python's great list processing abilities and the relative
> inefficiencies some string operations, I was considering using a list
> of True and False values rather than a binary string.
> 
> I somehow doubt there would be a clear-cut answer to this, but from
> this description, does anyone have any reason to think that one way
> would be much more efficient than the other? (I realise the best way
> would be to do both and `timeit` to see which is faster, but it's a
> sizeable program and if anybody considers it a no-brainer I'd much
> rather know now!)
> 
> Any advice would be gladly recieved.
> --
> http://mail.python.org/mailman/listinfo/python-list
> 

Tom,

it seems to me that, if efficiency is your main goal, you should store
your data as a list of integers, and use the bit-twiddling operators
to get at your data. These should be *very* fast, as well as memory
efficient.
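
A minimal sketch of what the bit-twiddling might look like, assuming
each chromosome fits in one Python int (a list of such ints would
then be the population). The function names crossover, mutate and
hamming are illustrative, not from any GA library.

```python
def crossover(a, b, point):
    """Take the low `point` bits from a and the remaining high bits from b."""
    low_mask = (1 << point) - 1
    return (a & low_mask) | (b & ~low_mask)


def mutate(chromosome, bit):
    """Flip a single bit with XOR."""
    return chromosome ^ (1 << bit)


def hamming(a, b):
    """Count the bit positions in which two chromosomes differ."""
    return bin(a ^ b).count("1")


child = crossover(0b10101010, 0b01010101, 4)
```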

Peace
Bill Mill
bill.mill at gmail.com
--
http://mail.python.org/mailman/listinfo/python-list


Re: Strings and Lists

2005-04-18 Thread Peter Hansen
Tom Longridge wrote:
> My current Python project involves lots repeatating code blocks,
> mainly centred around a binary string of data. It's a genetic
> algorithm in which there are lots of strings (the chromosomes) which
> get mixed, mutated and compared a lot.
>
> Given Python's great list processing abilities and the relative
> inefficiencies some string operations, I was considering using a list
> of True and False values rather than a binary string.
>
> I somehow doubt there would be a clear-cut answer to this, but from
> this description, does anyone have any reason to think that one way
> would be much more efficient than the other? (I realise the best way
> would be to do both and `timeit` to see which is faster, but it's a
> sizeable program and if anybody considers it a no-brainer I'd much
> rather know now!)
>
> Any advice would be gladly recieved.

It depends more on the operations you are performing, and
more importantly *which of those you have measured and
found to be slow*, than on anything else.

If, for example, you've got a particular complex set of
slicing, dicing, and mutating operations going on, then
that might say use one type of data structure.

If, on the other hand, the evaluation of the fitness
function is what is taking most of the time, then you
should focus on what that algorithm does and needs
and, after profiling (to get real data rather than
shots-in-the-dark), you can pick an appropriate data
structure for optimizing those operations.

Strings seem to be what people pick, by default, without
much thought, but I doubt they're the right thing
for the majority of GA work...

-Peter
--
http://mail.python.org/mailman/listinfo/python-list


Re: Strings and Lists

2005-04-18 Thread Donn Cave
In article <[EMAIL PROTECTED]>,
 [EMAIL PROTECTED] (Tom Longridge) wrote:

> My current Python project involves lots repeatating code blocks,
> mainly centred around a binary string of data. It's a genetic
> algorithm in which there are lots of strings (the chromosomes) which
> get mixed, mutated and compared a lot.
> 
> Given Python's great list processing abilities and the relative
> inefficiencies some string operations, I was considering using a list
> of True and False values rather than a binary string.
> 
> I somehow doubt there would be a clear-cut answer to this, but from
> this description, does anyone have any reason to think that one way
> would be much more efficient than the other? (I realise the best way
> would be to do both and `timeit` to see which is faster, but it's a
> sizeable program and if anybody considers it a no-brainer I'd much
> rather know now!)

I make no representations about how much mileage you would
get out of it, but if character data suits your purposes and
you just need a mutable array - in principle, I think it would
be more efficient to use an array.  Like,
   import array
   a = array.array('c', strdata)

As I understand it, this object simply contains character data,
not a list of 1 character string objects, so it's much more
economical to store and examine.
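
In Python 3 the 'c' typecode no longer exists in the array module, so
a comparable sketch there would use the built-in bytearray, which is
likewise a compact, mutable sequence of single bytes. The chromosome
value is made up for illustration.

```python
chromosome = bytearray(b"0110100")

# Single positions hold integers, so characters go in via ord().
chromosome[2] = ord("0")

# Slice assignment takes any bytes-like object.
chromosome[0:2] = b"11"

as_text = chromosome.decode("ascii")
```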

   Donn Cave, [EMAIL PROTECTED]
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-18 Thread Michael Spencer
Steve Holden wrote:
> I've been wondering whether it's possible to perform a similar analysis
> on non-mapping-type format strings, so as to know how long a tuple to
> provide, or whether I'd be forced to lexical analysis of the form string.
>
> regards
>  Steve

I do not know if it is possible to do that.
But if you are forced to parse the string, the following might help:

import re
parse_format = re.compile(r'''
    \%                          # placeholder
    (\([\w]+\))?                # 0 or 1 named groups
    ([\#0\-\+]?)                # 0 or 1 conversion flags
    ([\d]* | \*)                # optional minimum conversion width
    (?:\.([\d]+ | \*))?         # optional precision
    ([hlL]?)                    # optional length modifier
    ([diouxXeEfFgGcrs]{1})      # conversion type - note %% omitted
    ''',
    re.VERBOSE
    )
 >>> parse_format.findall("%(name)s, %-4.2f, %d, %%")
 [('(name)', '', '', '', '', 's'), ('', '-', '4', '2', '', 'f'), ('', '', '', '', '', 'd')]
 >>>

I used this successfully in a few simple cases, but I haven't really tested it.
cheers
Michael
--
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-18 Thread Bengt Richter
On Mon, 18 Apr 2005 16:24:39 -0400, Steve Holden <[EMAIL PROTECTED]> wrote:

>I was messing about with formatting and realized that the right kind of 
>object could quite easily tell me exactly what accesses are made to the 
>mapping in a string % mapping operation. This is a fairly well-known 
>technique, modified to tell me what keys would need to be present in any 
>mapping used with the format.
>

>I've been wondering whether it's possible to perform a similar analysis 
>on non-mapping-type format strings, so as to know how long a tuple to 
>provide, or whether I'd be forced to lexical analysis of the form string.
>
When I was playing with formatstring % mapping I thought it could
be useful if you could get the full format specifier info and do your own
complete formatting, even for invented format specifiers. This could be
done without breaking backwards compatibility if str.__mod__ looked for
a __format__ method on the otherwise-mapping-or-tuple object. If found,
it would call the method, which would expect

def __format__(self,
        ix,     # index from 0 counting every %... format
        name,   # from %(name) or ''
        width,  # from %width.prec
        prec,   # ditto
        fc,     # the format character F in %(x)F
        all     # just a copy of whatever is between % and including F
    ): ...

This would obviously let you handle non-mapping as you want, and more.

The most popular use would probably be intercepting width in  %(name)s
and doing custom formatting (e.g. centering in available space) for the object
and returning the right size string.

Since ix is an integer and doesn't help find the right object without the normal
tuple, you could give your formatting object's __init__ method keyword arguments
to specify arguments for anonymous slots in the format string, conventionally
naming them a0, a1, a2 etc. Then later when you get an ix with no name, you could
write self.kw.get('a%s'%ix) to get the value, as in use like
 '%(name)s %s' % Formatter(a1=thevalue) # Formatter as base class knows how to do name lookup

Or is this just idearrhea?
  
Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-19 Thread Peter Otten
Steve Holden wrote:

> I was messing about with formatting and realized that the right kind of
> object could quite easily tell me exactly what accesses are made to the
> mapping in a string % mapping operation. This is a fairly well-known
> technique, modified to tell me what keys would need to be present in any
> mapping used with the format.

...

> I've been wondering whether it's possible to perform a similar analysis
> on non-mapping-type format strings, so as to know how long a tuple to
> provide, or whether I'd be forced to lexical analysis of the form string.

PyString_Format() in stringobject.c determines the tuple length, then starts
the formatting process and finally checks whether all items were used -- so
no, it's not possible to feed it a tweaked (auto-growing) tuple like you
did with the dictionary.

Here's a brute-force equivalent to nameCount(), inspired by a post by Hans
Nowak (http://mail.python.org/pipermail/python-list/2004-July/230392.html).

def countArgs(format):
    args = (1,) * (format.count("%") - 2*format.count("%%"))
    while True:
        try:
            format % args
        except TypeError, e:
            args += (1,)
        else:
            return len(args)

samples = [
    ("", 0),
    ("%%", 0),
    ("%s", 1),
    ("%%%s", 1),
    ("%%%*.*d", 3),
    ("%*s", 2),
    ("%s %*s %*d %*f", 7)]
for f, n in samples:
    f % ((1,)*n)
    assert countArgs(f) == n

Not tested beyond what you see.

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Strings and Lists

2005-04-19 Thread Tom Longridge
Thank you all very much for your responses. It's especially reassuring
to hear about other Python GA's as I have had some scepticism about
Python's speed (or lack of it) being too big a problem for such an
application.

With regard to using numeric, arrays or integer lists -- I didn't
mention that these strings can also contain wild cards (so I suppose
it's not really binary -- sorry). This is traditionally done using a
'#' symbol, but I was imagining using a value of None in a boolean list
to represent this. Also there is currently a fair bit of research going
into other representations (floating-point values, paired values etc)
so I was hoping to be able to keep my framework extensible for the
future.

Many thanks again for your help. I will ``take the plunge'' and give
the boolean list a go I think!
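
A small sketch of how the None-as-wildcard idea might look in
Python 3. The function name matches and the sample rule are made up;
this is only one possible reading of the representation described
above.

```python
def matches(rule, bits):
    """True if every non-wildcard position of rule equals the same position in bits."""
    if len(rule) != len(bits):
        return False
    # None plays the role of the traditional '#' wildcard.
    return all(r is None or r == b for r, b in zip(rule, bits))


rule = [True, None, False]
```

Using "is None" rather than a truth test matters here, because False
is a legitimate value in the list.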

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-19 Thread Greg Ewing
Steve Holden wrote:
> I've been wondering whether it's possible to perform a similar analysis
> on non-mapping-type format strings, so as to know how long a tuple to
> provide,

I just tried an experiment, and it doesn't seem to be possible.

The problem seems to be that it expects the arguments to be
in the form of a tuple, and if you give it something else,
it wraps it up in a 1-element tuple and uses that instead.

This seems to happen even with a custom subclass of tuple,
so it must be doing an exact type check.

So it looks like you'll have to parse the format string.
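
A minimal sketch of such a parse for Python 3, assuming only
positional (non-mapping) format strings: a regular expression walks
the conversions, treats "%%" as consuming nothing, and counts one
extra argument for each "*" width or precision. The names SPEC and
count_args are hypothetical, and this has only been checked against
simple cases.

```python
import re

SPEC = re.compile(r"""
    %                         # start of a conversion
    [#0\- +]*                 # optional flags
    (?P<width>\*|\d+)?        # width; '*' takes it from the tuple
    (?:\.(?P<prec>\*|\d*))?   # precision; '*' takes it from the tuple
    [hlL]?                    # length modifier, accepted and ignored
    (?P<conv>[diouxXeEfFgGcrsa%])
    """, re.VERBOSE)


def count_args(fmt):
    """Count the tuple items a positional %-format string consumes."""
    total = 0
    for match in SPEC.finditer(fmt):
        if match.group("conv") == "%":
            continue  # '%%' is a literal percent sign
        total += 1
        if match.group("width") == "*":
            total += 1
        if match.group("prec") == "*":
            total += 1
    return total


needed = count_args("%s %*s %*d %*f")
```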
--
Greg Ewing, Computer Science Dept,
University of Canterbury,   
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
--
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Peter Otten
Greg Ewing wrote:

> Steve Holden wrote:
> 
>> I've been wondering whether it's possible to perform a similar analysis
>> on non-mapping-type format strings, so as to know how long a tuple to
>> provide,
> 
> I just tried an experiment, and it doesn't seem to be possible.
> 
> The problem seems to be that it expects the arguments to be
> in the form of a tuple, and if you give it something else,
> it wraps it up in a 1-element tuple and uses that instead.
> 
> This seems to happen even with a custom subclass of tuple,
> so it must be doing an exact type check.

No, it doesn't do an exact type check, but always calls the tuple method:

>>> class Tuple(tuple):
...     def __getitem__(self, index):
...         return 42
...
>>> "%r %r" % Tuple("ab") # would raise an exception if wrapped
"'a' 'b'"

> So it looks like you'll have to parse the format string.
 
Indeed.
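
The same experiment can be repeated on a current interpreter. This is a sketch for CPython 3, and it assumes (as the session above suggests) that '%' reads the underlying tuple storage directly; the `__len__` override is an extra added here for illustration.

```python
class Tuple(tuple):
    # Both overrides are visible to ordinary code but appear to be
    # bypassed by the '%' operator, which uses the real tuple contents.
    def __getitem__(self, index):
        return 42

    def __len__(self):
        return 99


t = Tuple("ab")
print(t[0], len(t))     # the overrides in action
print("%r %r" % t)      # formatting still sees 'a' and 'b'

# Anything that is not a tuple is treated as one single argument.
print("%r" % [1, 2])
try:
    "%r %r" % [1, 2]
except TypeError as exc:
    print("TypeError:", exc)
```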

Peter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Bengt Richter
On Wed, 20 Apr 2005 09:14:40 +0200, Peter Otten <[EMAIL PROTECTED]> wrote:

>Greg Ewing wrote:
>
>> Steve Holden wrote:
>> 
>>> I've been wondering whether it's possible to perform a similar analysis
>>> on non-mapping-type format strings, so as to know how long a tuple to
>>> provide,
>> 
>> I just tried an experiment, and it doesn't seem to be possible.
>> 
>> The problem seems to be that it expects the arguments to be
>> in the form of a tuple, and if you give it something else,
>> it wraps it up in a 1-element tuple and uses that instead.
>> 
>> This seems to happen even with a custom subclass of tuple,
>> so it must be doing an exact type check.
>
>No, it doesn't do an exact type check, but always calls the tuple method:
>
>>>> class Tuple(tuple):
>... def __getitem__(self, index):
>... return 42
>...
>>>> "%r %r" % Tuple("ab") # would raise an exception if wrapped
>"'a' 'b'"
>
>> So it looks like you'll have to parse the format string.
> 
>Indeed.
>
Parse might be a big word for

 >> def tupreq(fmt): return sum(map(lambda s:list(s).count('%'), 
 >> fmt.split('%%')))
 ..
 >> tupreq('%s this %(x)s not %% but %s')

(if it works in general ;-)

Or maybe clearer and faster:

 >>> def tupreq(fmt): return sum(1 for c in fmt.replace('%%','') if c=='%')
 ...
 >>> tupreq('%s this %(x)s not %% but %s')
 3

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Peter Otten
Bengt Richter wrote:

> Parse might be a big word for
> 
>  >> def tupreq(fmt): return sum(map(lambda s:list(s).count('%'),
>  >> fmt.split('%%')))
>  ..
>  >> tupreq('%s this %(x)s not %% but %s')
> 
> (if it works in general ;-)

Which it doesn't:

>>> def tupreq(fmt): return sum(map(lambda s:list(s).count('%'),
fmt.split('%%')))
...
>>> fmt = "%*d"
>>> fmt % ((1,) * tupreq(fmt))
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: not enough arguments for format string

> Or maybe clearer and faster:
> 
>  >>> def tupreq(fmt): return sum(1 for c in fmt.replace('%%','') if
>  >>> c=='%')
>  ...
>  >>> tupreq('%s this %(x)s not %% but %s')
>  3

Mixed formats show some "interesting" behaviour:

>>> "%s %(x)s" % (1,2)
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: format requires a mapping
>>> class D:
...     def __getitem__(self, key):
...         return "D[%s]" % key
...
>>> "%s %(x)s" % D()
'<__main__.D instance at 0x402aaf2c> D[x]'
>>> "%s %(x)s %s" % D()
Traceback (most recent call last):
  File "", line 1, in ?
TypeError: not enough arguments for format string
>>> "%s %(x)s %(y)s" % D()
'<__main__.D instance at 0x402aad8c> D[x] D[y]'

That is as far as I got. So under what circumstances is 
'%s this %(x)s not %% but %s' a valid format string?

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Bengt Richter
On Wed, 20 Apr 2005 11:01:28 +0200, Peter Otten <[EMAIL PROTECTED]> wrote:

>Bengt Richter wrote:
>
>> Parse might be a big word for
>> 
>>  >> def tupreq(fmt): return sum(map(lambda s:list(s).count('%'),
>>  >> fmt.split('%%')))
>>  ..
>>  >> tupreq('%s this %(x)s not %% but %s')
>> 
>> (if it works in general ;-)
>
>Which it doesn't:
   D'oh. (My subconscious knew that one, and prompted the "if" ;-)
>
>>>> def tupreq(fmt): return sum(map(lambda s:list(s).count('%'),
>fmt.split('%%')))
>...
>>>> fmt = "%*d"
>>>> fmt % ((1,) * tupreq(fmt))
>Traceback (most recent call last):
>  File "", line 1, in ?
>TypeError: not enough arguments for format string
>
But that one it totally spaced on ;-/
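
For what it's worth, here is a rough sketch (Python 3) of a counter that also charges one argument for each '*' width or precision. The regular expression is an approximation written for this example, not the interpreter's own grammar, and it only handles purely positional formats: named %(x)s conversions are simply not matched.

```python
import re

# One positional conversion. '%%' is matched as its own alternative so
# that it is skipped instead of being counted.
_SPEC = re.compile(r"""
    %(?:
        %                               # literal percent sign
      | [#0\- +]*                       # flags
        (?P<width>\*|\d+)?              # width; '*' consumes an argument
        (?:\.(?P<precision>\*|\d+)?)?   # precision; '*' consumes an argument
        [hlL]?                          # length modifier
        (?P<type>[diouxXeEfFgGcrsa])    # conversion type
    )
""", re.VERBOSE)


def tupreq(fmt):
    """Count the tuple items a purely positional format string consumes."""
    count = 0
    for m in _SPEC.finditer(fmt):
        if m.group("type") is None:     # this match was '%%'
            continue
        count += 1
        count += m.group("width") == "*"
        count += m.group("precision") == "*"
    return count


for fmt in ("%*d", "%s %*.*d %*s", "100%% of %s"):
    n = tupreq(fmt)
    print(repr(fmt), n, repr(fmt % ((1,) * n)))
```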

>> Or maybe clearer and faster:
>> 
>>  >>> def tupreq(fmt): return sum(1 for c in fmt.replace('%%','') if
>>  >>> c=='%')
>>  ...
>>  >>> tupreq('%s this %(x)s not %% but %s')
>>  3
>

>Mixed formats show some "interesting" behaviour:
>
>>>> "%s %(x)s" % (1,2)
>Traceback (most recent call last):
>  File "", line 1, in ?
>TypeError: format requires a mapping
>>>> class D:
>...     def __getitem__(self, key):
>...         return "D[%s]" % key
>...
>>>> "%s %(x)s" % D()
>'<__main__.D instance at 0x402aaf2c> D[x]'
>>>> "%s %(x)s %s" % D()
>Traceback (most recent call last):
>  File "", line 1, in ?
>TypeError: not enough arguments for format string
>>>> "%s %(x)s %(y)s" % D()
>'<__main__.D instance at 0x402aad8c> D[x] D[y]'
>
>That is as far as I got. So under what circumstances is 
>'%s this %(x)s not %% but %s' a valid format string?
>
Yeah, I got that far too, some time ago playing with % mapping, and
I thought they just didn't allow for mixed formats. My thought then
was that they could pass integer positional keys to another method
(say __format__) on a mapping object that wants to handle mixed formats.
If you wanted the normal str or repr representation of a mapping
object that had a __format__ method, you'd have to do it on the args
side with str(theobject), but you'd have a way. And normal mapping objects
would need no special handling for '%s' in a mixed format context.
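
The mixed-format behaviour discussed above can be reproduced with a plain dict. This is a sketch for CPython 3; I'm assuming the behaviour carried over unchanged from the 2.x sessions quoted in this thread, and the dict contents are arbitrary.

```python
data = {"x": 1, "y": 2}

# A single unnamed conversion receives the mapping object itself.
print("%s %(x)s" % data)

# Named conversions on their own behave as usual.
print("%(x)s %(y)s" % data)

# A second unnamed conversion has nothing left to consume.
try:
    "%s %(x)s %s" % data
except TypeError as exc:
    print("TypeError:", exc)
```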

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Michael Spencer
Bengt Richter wrote:
> On Wed, 20 Apr 2005 11:01:28 +0200, Peter Otten <[EMAIL PROTECTED]> wrote:
...
>> "%s %(x)s %(y)s" % D()

My experiments suggest that you can have a maximum of one unnamed argument
in a mapping template - this unnamed value evaluates to the map itself

...
>> So under what circumstances is
>> '%s this %(x)s not %% but %s' a valid format string?

Based on the above experiments, never.

I have wrapped up my current understanding in the following class:
 >>> s = StringFormatInfo('%s %*.*d %*s')
 >>> s
 POSITIONAL Template: %s %*.*d %*s
 Arguments: ('s', 'width', 'precision', 'd', 'width', 's')
 >>> s = StringFormatInfo('%(arg1)s %% %(arg2).*f %()s %s')
 >>> s
 MAPPING Template: %(arg1)s %% %(arg2).*f %()s %s
 Arguments: {'': 's', 'arg1': 's', 'arg2': 'f', None: 's'}
 >>>
import re

class StringFormatInfo(object):
    """Wraps a template string and provides information about the number and
       kinds of arguments that must be supplied.  Call with % to apply the
       template to data"""

    parse_format = re.compile(r'''
        \%                                  # placeholder
        (?:\((?P<name>[\w]*)\))?            # 0 or 1 named groups
        (?P<conversion>[\#0\-\+]?)          # 0 or 1 conversion flags
        (?P<width>[\d]* | \*)               # optional minimum conversion width
        (?:\.(?P<precision>[\d]+ | \*))?    # optional precision
        (?P<lengthmodifier>[hlL]?)          # optional length modifier
        (?P<type>[diouxXeEfFgGcrs]{1})      # conversion type - note %% omitted
        ''',
        re.VERBOSE
        )

    def __init__(self, template):
        self.template = template
        self.formats = formats = [m.groupdict() for m in
                                  self.parse_format.finditer(template)]

        for format in formats:
            if format['name']:
                self.format_type = "MAPPING"
                self.format_names = dict((format['name'], format['type'])
                                         for format in formats)
                break
        else:
            # no named conversion found: purely positional template
            self.format_type = "POSITIONAL"
            format_names = []
            for format in formats:
                if format['width'] == '*':
                    format_names.append('width')
                if format['precision'] == '*':
                    format_names.append('precision')
                format_names.append(format['type'])
            self.format_names = tuple(format_names)

    def __mod__(self, values):
        return self.template % values

    def __repr__(self):
        return "%s Template: %s\nArguments: %s" % \
            (self.format_type, self.template, self.format_names)

Michael
--
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Andrew Dalke
Michael Spencer wrote:
> I have wrapped up my current understanding in the following class:

I see you assume that only \w+ can fit inside of a %()
in a format string.  The actual Python code allows anything
up to the balanced closed parens.

>>> class Show:
...     def __getitem__(self, text):
...         print "Want", repr(text)
... 
>>> "%(this(is)a.--test!)s" % Show()
Want 'this(is)a.--test!'
'None'
>>> 

I found this useful for a templating library I once wrote
that allowed operations through a simple pipeline, like

   %(doc.text|reformat(68)|indent(4))s
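
A small sketch (Python 3) of the same trick with a mapping-like object that records the keys it is asked for. The class body and the second key are made up for illustration; nothing here is taken from the templating library mentioned above.

```python
class Show:
    """Mapping-like object that records every key it is asked for."""

    def __init__(self):
        self.keys = []

    def __getitem__(self, key):
        self.keys.append(key)
        return "<%s>" % key


show = Show()
print("%(this(is)a.--test!)s and %(doc.text|indent(4))s" % show)
print(show.keys)
```

Each key arrives as one string, balanced parentheses included, so a pipeline such as the one above could be split on '|' inside `__getitem__`.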

Andrew
[EMAIL PROTECTED]

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Michael Spencer
Andrew Dalke wrote:

> I see you assume that only \w+ can fit inside of a %()
> in a format string.  The actual Python code allows anything
> up to the balanced closed parens.

Gah! I guess that torpedoes the regexp approach, then.

Thanks for looking at this

Michael
--
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-20 Thread Greg Ewing
Peter Otten wrote:
> Greg Ewing wrote:
>
>> This seems to happen even with a custom subclass of tuple,
>> so it must be doing an exact type check.
>
> No, it doesn't do an exact type check, but always calls the tuple method:

I guess you mean len(). On further investigation,
this seems to be right, except that it doesn't
invoke a __len__ defined in a custom subclass.
So there's something in there hard-coded to
expect a built-in tuple.

In any case, the original idea isn't possible.
--
Greg Ewing, Computer Science Dept,
University of Canterbury,   
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
--
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-21 Thread Steve Holden
Michael Spencer wrote:
> Andrew Dalke wrote:
>
>> I see you assume that only \w+ can fit inside of a %()
>> in a format string.  The actual Python code allows anything
>> up to the balanced closed parens.
>
> Gah! I guess that torpedoes the regexp approach, then.
>
> Thanks for looking at this
>
> Michael

While Andrew may have found the "fatal flaw" in your scheme, it's worth 
pointing out that it works just fine for my original use case.

regards
 Steve
--
Steve Holden+1 703 861 4237  +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/
--
http://mail.python.org/mailman/listinfo/python-list


Re: Enumerating formatting strings

2005-04-21 Thread Bengt Richter
On Wed, 20 Apr 2005 17:09:16 -0700, Michael Spencer <[EMAIL PROTECTED]> wrote:

>Andrew Dalke wrote:
>
>> I see you assume that only \w+ can fit inside of a %()
>> in a format string.  The actual Python code allows anything
>> up to the balanced closed parens.
>> 
>Gah! I guess that torpedoes the regexp approach, then.
>
>Thanks for looking at this
>
I brute-forced a str subclass that will call a mapping object's __getitem__
for both kinds of format spec and '*' specs. Just to see what it would take.
I didn't go the whole way looking for a __format__ method on the mapping
object, along the lines I suggested in a previous post. Someone else's turn
again ;-)
This has not been tested thoroughly...

The approach is to scan the original format string and put pieces into an
out list and then ''.join that for final output. The pieces are the
non-format parts and the strings from doing the formatting as formats are
found. %(name) format args are retrieved from the mapping object by name as
usual, and saved as the arg for a rewritten plain format made from the tail
after %(name), which is the same tail as %tail, except that the value is
already retrieved. Next '*' or decimal strings are packed into the rewritten
format, etc. The '*' values are retrieved by integer values passed to
mapobj[i] and incremented each time. If the arg value was not retrieved by
name, that's another mapobj[i]. Then the conversion is done with the plain
format. The tests have
MixFmt(fmt, verbose=True) % MapObj(position_params, namedict)
and the verbose prints each rewritten format and arg and result as it
appends them to out.


< mixfmt.py 
>
# mixfmt.py -- a string subclass with __mod__ permitting mixed '%(name)s %s' 
formatting
import re
class MixFmtError(Exception): pass

class MixFmt(str):
def __new__(cls, s, **kw):
return str.__new__(cls, s)
def __init__(self, *a, **kw):
self._verbose = kw.get('verbose')

# Michael Spencer's regex, slightly modded, but only for reference, since 
XXX note
parse_format = re.compile(r'''
(
\% # placeholder
(?:\(\w*\))?   # 0 or 1 "named" groups XXX "%( (any)(balanced) 
parens )s" is legal!
[\#0\-\+]? # 0 or 1 conversion flags
(?:\* | \d+)?  # optional minimum conversion width
(?:\.\* | \.\d+)?  # optional precision
[hlL]? # optional length modifier
[diouxXeEfFgGcrs]  # conversion type - note %% omitted
)
''',
re.VERBOSE)

def __mod__(self, mapobj):
"""
The '%' MixFmt string operation allowing both %(whatever)fmt and %fmt
by calling mapobj[whatever] for named args, and mapobj[i] sequentially
counting i for each '*' width or precision spec, and unnamed args.
It is up to the mapobj to handle this. See MapObj example used in tests.
"""
out = []
iarg = 0
pos, end = 0, len(self)
sentinel = object()
while pos=0 and self[pos:pos+2] == '%%':
pos+=2
pos = self.find('%', pos)
if pos<0: out.append(self[last:].replace('%%','%')); break
# here we have start of fmt with % at pos
out.append(self[last:pos].replace('%%','%'))
last = pos
plain_arg = sentinel
pos = pos+1
if self[pos]=='(':
# scan for balanced matching ')'
brk = 1; pos+=1
while brk>0:
nextrp = self.find(')',pos)
if nextrp<0: raise MixFmtError, 'no match for "(" at 
%s'%(pos+1)
nextlp = self.find('(', pos)
if nextlp>=0:
if nextlp %r %% %r => %r' % (plain_fmt, (plain_arg,), 
result)
out.append(result)
return ''.join(out)
 
class MapObj(object):
"""
Example for test.
Handles both named and positional (integer) keys 
for MixFmt(fmtstring) % MapObj(posargs, namedict)
"""
def __init__(self, *args, **kw):
self.args = args
self.kw = kw
def __getitem__(self, i):
if isinstance(i, int): return self.args[i]
else: 
try: return self.kw[i]
except KeyError: return ''%i

def test(fmt, *args, **namedict):
print '\n test with:\n  %r\n  %s\n  %s' %(fmt, args, namedict)
print MixFmt(fmt, verbose=True) % Ma

Re: Enumerating formatting strings

2005-04-21 Thread Michael Spencer
Steve Holden wrote:
> Michael Spencer wrote:
>
>> Andrew Dalke wrote:
>>
>>> I see you assume that only \w+ can fit inside of a %()
>>> in a format string.  The actual Python code allows anything
>>> up to the balanced closed parens.
>>
>> Gah! I guess that torpedoes the regexp approach, then.
>>
>> Thanks for looking at this
>>
>> Michael
>
> While Andrew may have found the "fatal flaw" in your scheme, it's worth
> pointing out that it works just fine for my original use case.
>
> regards
>  Steve

Thanks.  Here's a version that overcomes the 'fatal' flaw.
class StringFormatInfo(object):
    def __init__(self, template):
        self.template = template
        self.parse()

    def tokenizer(self):
        lexer = TinyLexer(self.template)
        self.format_type = "POSITIONAL"
        while lexer.search("\%"):
            if lexer.match("\%"):
                continue
            format = {}
            name = lexer.takeparens()
            if name is not None:
                self.format_type = "MAPPING"
            format['name'] = name
            format['conversion'] = lexer.match("[\#0\-\+]")
            format['width'] = lexer.match("\d+|\*")
            format['precision'] = lexer.match("\.") and \
                lexer.match("\d+|\*") or None
            format['lengthmodifier'] = lexer.match("[hlL]")
            ftype = lexer.match('[diouxXeEfFgGcrs]')
            if not ftype:
                raise ValueError
            else:
                format['type'] = ftype
            yield format

    def parse(self):
        self.formats = formats = list(self.tokenizer())
        if self.format_type == "MAPPING":
            self.format_names = dict((format['name'], format['type'])
                                     for format in formats)
        else:
            format_names = []
            for format in formats:
                if format['width'] == '*':
                    format_names.append('width')
                if format['precision'] == '*':
                    format_names.append('precision')
                format_names.append(format['type'])
            self.format_names = tuple(format_names)

    def __mod__(self, values):
        return self.template % values

    def __repr__(self):
        return "%s Template: %s\nArguments: %s" % \
            (self.format_type, self.template, self.format_names)
    __str__ = __repr__

SFI = StringFormatInfo

def tests():
    print SFI('%(arg1)s %% %(arg2).*f %()s %s')
    print SFI('%s %*.*d %*s')
    print SFI('%(this(is)a.--test!)s')

import re

class TinyLexer(object):
    def __init__(self, text):
        self.text = text
        self.ptr = 0
        self.len = len(text)
        self.re_cache = {}

    def match(self, regexp, consume = True, anchor = True):
        if isinstance(regexp, basestring):
            cache = self.re_cache
            if regexp not in cache:
                cache[regexp] = re.compile(regexp)
            regexp = cache[regexp]
        matcher = anchor and regexp.match or regexp.search
        match = matcher(self.text, self.ptr)
        if not match:
            return None
        if consume:
            self.ptr = match.end()
        return match.group()

    def search(self, regexp, consume = True):
        return self.match(regexp, consume=True, anchor=False)

    def takeparens(self):
        start = self.ptr
        if self.text[start] != '(':
            return None
        out = ''
        level = 1
        self.ptr += 1
        while self.ptr < self.len:
            nextchar = self.text[self.ptr]
            level += (nextchar == '(') - (nextchar == ')')
            self.ptr += 1
            if level == 0:
                return out
            out += nextchar
        raise ValueError, "Unmatched parentheses"


--
http://mail.python.org/mailman/listinfo/python-list


text strings to binary...

2005-05-03 Thread oriana . falco
Hi,

I've been working with Python for a bit now but I just came across a
problem and I'm not sure how to approach it. I'd like to convert a text
string into binary data format. I thought I'd use the struct module and
the pack function but I can't get it to work. I'm using Python 2.3.3 in
Windows XP.
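
The question does not say which binary layout is wanted, so the following is only a sketch of one common choice: a 32-bit little-endian length followed by the UTF-8 bytes. It is written for Python 3, where text must be encoded to bytes before struct can pack it; the sample text and the layout are assumptions.

```python
import struct

text = "hello"
payload = text.encode("utf-8")      # struct packs bytes, not text

# "<I" is a little-endian unsigned 32-bit length, "%ds" the raw bytes.
packed = struct.pack("<I%ds" % len(payload), len(payload), payload)
print(packed)

# Reading it back: first the length, then that many bytes.
(length,) = struct.unpack_from("<I", packed, 0)
(raw,) = struct.unpack_from("%ds" % length, packed, 4)
print(raw.decode("utf-8"))
```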

Thanks in advance for the help, 
Oriana

-- 
http://mail.python.org/mailman/listinfo/python-list


Escape spaces in strings

2005-05-12 Thread Florian Lindner
Hello,
is there a function to escape spaces and other characters in a string so
that it can be used as an argument to a Unix command? In this case rsync
(http://samba.anu.edu.au/rsync/FAQ.html#10)
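
One possible answer on a current interpreter: Python 3 has shlex.quote() for this, which is newer than the Python of this thread. The path and the rsync command line below are invented for illustration, and nothing is actually executed.

```python
import shlex

path = "my backup dir/file name.txt"
quoted = shlex.quote(path)
print(quoted)

# The quoted form survives a round trip through shell-style splitting.
print(shlex.split("rsync -av " + quoted + " host:dest/"))
```

When the command is started from Python with an argument list instead of a single shell string, the local shell never sees the argument, so quoting is then only needed for whatever the remote side parses.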

Thx,

Florian
-- 
http://mail.python.org/mailman/listinfo/python-list


Comparing 2 similar strings?

2005-05-18 Thread William Park
How do you compare 2 strings, and determine how "close" they are to
each other?  Eg.
aqwerty
qwertyb
are similar to each other, except for first/last char.  But, how do I
quantify that?

I guess you can say for the above 2 strings that
- at max, 6 chars out of 7 are same sequence --> 85% max

But, for
qawerty
qwerbty
max correlation is
- 3 chars out of 7 are the same sequence --> 42% max
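
The measure described above (longest shared run divided by string length) can be sketched with difflib from the standard library. This is Python 3; the helper name `longest_run_fraction` is made up here, and SequenceMatcher.ratio() is printed alongside only as a second, different measure.

```python
from difflib import SequenceMatcher


def longest_run_fraction(a, b):
    """Longest common contiguous run, relative to the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0                  # two empty strings are identical
    sm = SequenceMatcher(None, a, b)
    match = sm.find_longest_match(0, len(a), 0, len(b))
    return match.size / longest


for a, b in (("aqwerty", "qwertyb"), ("qawerty", "qwerbty")):
    ratio = SequenceMatcher(None, a, b).ratio()
    print(a, b, round(longest_run_fraction(a, b), 2), round(ratio, 2))
```

For the two pairs in the post this gives 6/7 and 3/7, which is where the 85% and 42% figures come from.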

(Crossposted to 3 of my favourite newsgroup.)

-- 
William Park <[EMAIL PROTECTED]>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
   http://home.eol.ca/~parkw/thinflash.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Strings for a newbie

2005-05-27 Thread Malcolm Wooden
I'm trying to get my head around Python but seem to be failing miserably. I 
use RealBasic on a Mac and find it an absolute dream! But PythonUGH!

I want to put a sentence of words into an array, eg "This is a sentence of 
words"

In RB it would be simple:

Dim s as string
Dim a(-1) as string
Dim i as integer

s = "This is a sentence of words"
For i = 1 to CountFields(s," ")
  a.append NthField(s," ",i)
next

That's it: an array a() containing the words of the sentence.

Now can I see how this is done in Python? - nope!
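
For comparison, a sketch of the usual Python way (shown in Python 3 syntax): strings have a split() method that returns a list, so no loop is needed.

```python
s = "This is a sentence of words"
a = s.split()       # with no argument, splits on runs of whitespace
print(a)
print(len(a), a[0], a[-1])
```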

UGH!

Malcolm
(a disillusioned Python newbie) 


-- 
http://mail.python.org/mailman/listinfo/python-list


Double decoding of strings??

2005-12-05 Thread manuzhai
Hi all,

I have a bit of a problem. I'm trying to use Python to work with some
data which turns out to be garbage. Ultimately, I think the solution
will be to .decode('utf-8') a string twice, but Python doesn't like
doing this the second time. That could possibly be understandable, but
then why does the unicode object have a .decode() method at all?

I get 'WVL Algemeen Altru\xc3\x83\xc2\xafsme genormeerd Afbeelden' at
first.
I .decode('utf-8') this to u'WVL Algemeen Altru\xc3\xafsme genormeerd
Afbeelden'.
I then try to .decode('utf-8') this again, but that gives an error:

Traceback (most recent call last):
  File "", line 1, in ?
  File "C:\Program Files\Python\lib\encodings\utf_8.py", line 16, in
decode
return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position
18-19: ordinal not in range(128)

If I copy/paste 'WVL Algemeen Altru\xc3\xafsme genormeerd Afbeelden'
and try to .decode('utf-8') it, that works fine, and it gets me the
result I want, which is u'WVL Algemeen Altru\xefsme genormeerd
Afbeelden'.

Why does it work this way? How can I make it work?
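
A sketch of one way out, in Python 3 syntax. In the Python 2 of this post, calling .decode() on a unicode object first encodes it implicitly with the ASCII codec, which is the UnicodeEncodeError in the traceback. Encoding explicitly with latin-1 maps each character back to the byte with the same number, after which the second UTF-8 decode works. That latin-1 is the right intermediate codec is an assumption based on the sample data.

```python
raw = b'WVL Algemeen Altru\xc3\x83\xc2\xafsme genormeerd Afbeelden'

once = raw.decode('utf-8')      # text now, but still doubly encoded
# Turn the characters back into bytes one-for-one, then decode again.
twice = once.encode('latin-1').decode('utf-8')

print(ascii(once))
print(ascii(twice))
```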

Regards,

Manuzhai

-- 
http://mail.python.org/mailman/listinfo/python-list


splitting strings with python

2005-06-09 Thread [EMAIL PROTECTED]
im trying to split a string with this form (the string is from a
japanese dictionary file with multiple definitions in english for each
japanese word)


str1 [str2] / (def1, ...) (1) def2 / def3 /  (2) def4/ def5 ... /


the variables i need are str*, def*.

sometimes the (1) and (2) are not included - they are included only if
the word has two different meanings


"..." means that there are sometimes more than two definitions per
meaning.


im trying to use the re.split() function but with no luck.

Is this possible with python, or am i dreamin!?
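
It is possible. Below is a rough sketch in Python 3 that assumes the layout described above: a headword, an optional bracketed second string, then '/'-separated definitions with optional (1), (2) sense markers. The function name, the two regular expressions and the sample line are all made up for this example; a real dictionary file may need more cases.

```python
import re

HEAD_RE = re.compile(r'^\s*(\S+)\s*(?:\[([^\]]*)\])?\s*$')
SENSE_RE = re.compile(r'\((\d+)\)\s*')


def parse_entry(line):
    """Return (str1, str2 or None, list of senses), each sense a list of defs."""
    head, _, rest = line.partition('/')
    m = HEAD_RE.match(head)
    if m is None:
        raise ValueError('unrecognised headword: %r' % head)
    senses = [[]]
    for field in rest.split('/'):
        field = field.strip()
        if not field:
            continue
        if SENSE_RE.search(field):
            # a numbered marker starts a new sense, unless nothing came before
            if senses[-1]:
                senses.append([])
            field = SENSE_RE.sub('', field, count=1).strip()
        senses[-1].append(field)
    return m.group(1), m.group(2), senses


print(parse_entry('str1 [str2] /(a, b) (1) def2/def3/(2) def4/def5/'))
print(parse_entry('word /x/y/'))
```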

All the best,

.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread Paul McNett
Mark Harrison wrote:
> What is the best way to process a text file of delimited strings?
> I've got a file where strings are quoted with at-signs, @like [EMAIL 
> PROTECTED]
> At-signs in the string are represented as doubled @@.

Have you taken a look at the csv module yet? No guarantees, but it may 
just work. You'd have to set delimiter to ' ' and quotechar to '@'. You 
may need to manually handle the double-@ thing, but why don't you see 
how close you can get with csv?

> @rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44
> @pv@ 0 @db.changex@ 44 44 @mh@ @mh@ 1118875308 0 @ :@@: :: @
> 
> (this is from a perforce journal file, btw)


-- 
Paul McNett
http://paulmcnett.com

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread Christoph Rackwitz
You could use regular expressions... it's an FSM of some kind but it's
faster *g*
check this snippet out:

def mysplit(s):
    pattern = '((?:"[^"]*")|(?:[^ ]+))'
    tmp = re.split(pattern, s)
    res = [ifelse(i[0] in ('"',"'"), lambda:i[1:-1], lambda:i) for i in
           tmp if i.strip()]
    return res

>>> mysplit('foo bar "baz foo" bar "baz"')
['foo', 'bar', 'baz foo', 'bar', 'baz']

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread Nicola Mingotti
On Wed, 15 Jun 2005 23:03:55 +, Mark Harrison wrote:
 
> What's the most efficient way to process this?  Failing all
> else I will split the string into characters and use a FSM,
> but it seems that's not very pythonesqe.

like this ?

>>> s = "@[EMAIL PROTECTED]@@[EMAIL PROTECTED]"
>>> s.split("@")
['', 'hello', 'world', '', 'foo', 'bar']
>>> s2 = "[EMAIL PROTECTED]@@[EMAIL PROTECTED]"
>>> s2
'[EMAIL PROTECTED]@@[EMAIL PROTECTED]'
>>> s2.split("@")
['hello', 'world', '', 'foo', 'bar']
>>> 

bye
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread John Machin
Mark Harrison wrote:
> What is the best way to process a text file of delimited strings?
> I've got a file where strings are quoted with at-signs, @like [EMAIL 
> PROTECTED]
> At-signs in the string are represented as doubled @@.
> 
> What's the most efficient way to process this?  Failing all
> else I will split the string into characters and use a FSM,
> but it seems that's not very pythonesqe.
> 
> @rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44
> @pv@ 0 @db.changex@ 44 44 @mh@ @mh@ 1118875308 0 @ :@@: :: @
> 

 >>> import csv
 >>> list(csv.reader(file('at_quotes.txt', 'rb'), delimiter=' ', 
quotechar='@'))
[['rv', '2', 'db.locks', '//depot/hello.txt', 'mh', 'mh', '1', '1', 
'44'], ['pv'
, '0', 'db.changex', '44', '44', 'mh', 'mh', '1118875308', '0', ' :@: 
:@@: ']]
 >>>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread John Machin
Nicola Mingotti wrote:
> On Wed, 15 Jun 2005 23:03:55 +, Mark Harrison wrote:
>  
> 
>>What's the most efficient way to process this?  Failing all
>>else I will split the string into characters and use a FSM,
>>but it seems that's not very pythonesqe.
> 
> 
> like this ?

No, not like that. The OP said that an embedded @ was doubled.

> 
> 
s = "@[EMAIL PROTECTED]@@[EMAIL PROTECTED]"
s.split("@")
> 
> ['', 'hello', 'world', '', 'foo', 'bar']
> 
s2 = "[EMAIL PROTECTED]@@[EMAIL PROTECTED]"
s2
> 
> '[EMAIL PROTECTED]@@[EMAIL PROTECTED]'
> 
s2.split("@")
> 
> ['hello', 'world', '', 'foo', 'bar']
> 
> 
> bye
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread Mark Harrison
Paul McNett <[EMAIL PROTECTED]> wrote:
> Mark Harrison wrote:
> > What is the best way to process a text file of delimited strings?
> > I've got a file where strings are quoted with at-signs, @like [EMAIL 
> > PROTECTED]
> > At-signs in the string are represented as doubled @@.
> 
> Have you taken a look at the csv module yet? No guarantees, but it may 
> just work. You'd have to set delimiter to ' ' and quotechar to '@'. You 
> may need to manually handle the double-@ thing, but why don't you see 
> how close you can get with csv?

This is great!  Everything works perfectly.  Even the double-@ thing
is handled by the default quotechar  handling.
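
A self-contained sketch of the csv approach for Python 3, reading from an in-memory string instead of a journal file. The two sample lines are shortened and partly invented so that the doubled @@ case is visible.

```python
import csv
import io

journal = io.StringIO(
    "@rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ 1 44\n"
    "@pv@ 0 @a @@ b@ 7\n"
)

# Space-delimited fields with '@' as the quote character. doublequote=True
# is the csv default and turns '@@' inside a quoted field into one '@'.
rows = list(csv.reader(journal, delimiter=" ", quotechar="@",
                       doublequote=True))
for row in rows:
    print(row)
```

All fields come back as strings, so numeric columns still need an explicit int() where that matters.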

Thanks again,
Mark

-- 
Mark Harrison
Pixar Animation Studios
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread Leif K-Brooks
Mark Harrison wrote:
> What is the best way to process a text file of delimited strings?
> I've got a file where strings are quoted with at-signs, @like [EMAIL 
> PROTECTED]
> At-signs in the string are represented as doubled @@.

>>> import re
>>> _at_re = re.compile('(?<!@)@(?!@)')
>>> def split_at_line(line):
...     return [field.replace('@@', '@') for field in
...             _at_re.split(line)]
...
>>> split_at_line('[EMAIL PROTECTED]@@[EMAIL PROTECTED]')
['foo', '[EMAIL PROTECTED]', 'qux']
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: splitting delimited strings

2005-06-15 Thread Paul McGuire
Mark -

Let me weigh in with a pyparsing entry to your puzzle.  It won't be
blazingly fast, but at least it will give you another data point in
your comparison of approaches.  Note that the parser can do the
string-to-int conversion for you during the parsing pass.

If @rv@ and @pv@ are record type markers, then you can use pyparsing to
create more of a parser than just a simple tokenizer, and parse out the
individual record fields into result attributes.

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul

test1 = "@hello@@world@@[EMAIL PROTECTED]"
test2 = """@rv@ 2 @db.locks@ @//depot/hello.txt@ @mh@ @mh@ 1 1 44
@pv@ 0 @db.changex@ 44 44 @mh@ @mh@ 1118875308 0 @ :@@: :: @"""

from pyparsing import *

AT = Literal("@")
atQuotedString = AT.suppress() + Combine(OneOrMore((~AT + SkipTo(AT)) |
        (AT + AT).setParseAction(replaceWith("@")) )) + AT.suppress()

# extract any @-quoted strings
for test in (test1,test2):
    for toks,s,e in atQuotedString.scanString(test):
        print toks
    print

# parse all tokens (assume either a positive integer or @-quoted string)
def makeInt(s,l,toks):
    return int(toks[0])
entry = OneOrMore( Word(nums).setParseAction(makeInt) | atQuotedString )

for t in test2.split("\n"):
    print entry.parseString(t)

Prints out:

['[EMAIL PROTECTED]@foo']

['rv']
['db.locks']
['//depot/hello.txt']
['mh']
['mh']
['pv']
['db.changex']
['mh']
['mh']
[':@: :@@: ']

['rv', 2, 'db.locks', '//depot/hello.txt', 'mh', 'mh', 1, 1, 44]
['pv', 0, 'db.changex', 44, 44, 'mh', 'mh', 1118875308, 0, ':@: :@@: ']

-- 
http://mail.python.org/mailman/listinfo/python-list

