Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread Gabriel Genellina

On Sat, 21 Feb 2009 01:14:02 -0200, odeits ode...@gmail.com wrote:

On Feb 15, 11:31 pm, odeits ode...@gmail.com wrote:



It seems what you are actually testing for is if the intersection of
the two sets is not empty where the first set is the characters in
your word and the second set is the characters in your defined string.


To expand on what I was saying I thought i should provide a code
snippet:

WORD = 'g' * 100
WORD2 = 'g' * 50 + 'U'
VOWELS = 'aeiouAEIOU'
BIGWORD = 'g' * 10000 + 'U'

def set_test(vowels, word):

    vowels = set( iter(vowels))
    letters = set( iter(word) )

    if letters & vowels:
        return True
    else:
        return False

with python 2.5 I got 1.30 usec/pass against the BIGWORD


You could make it slightly faster by removing the iter() call:
letters = set(word)

And (if vowels are really constant) you could pre-build the vowels set.
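
Both suggestions can be sketched together as follows. The names VOWEL_SET and has_vowel are illustrative and do not come from the thread; the set is built once at import time on the assumption that the vowels never change.

```python
# Built once, since the vowels are assumed constant (hypothetical name).
VOWEL_SET = frozenset('aeiouAEIOU')

def has_vowel(word):
    # set(word) already iterates over the characters; iter() is not needed.
    return bool(set(word) & VOWEL_SET)
```

Note that set(word) still visits every character, so this version does not short-circuit on the first match.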

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread odeits
On Feb 21, 12:47 am, Gabriel Genellina gagsl-...@yahoo.com.ar
wrote:
 On Sat, 21 Feb 2009 01:14:02 -0200, odeits ode...@gmail.com wrote:



  On Feb 15, 11:31 pm, odeits ode...@gmail.com wrote:
  It seems what you are actually testing for is if the intersection of
  the two sets is not empty where the first set is the characters in
  your word and the second set is the characters in your defined string.

  To expand on what I was saying I thought i should provide a code
  snippet:

  WORD = 'g' * 100
  WORD2 = 'g' * 50 + 'U'
  VOWELS = 'aeiouAEIOU'
  BIGWORD = 'g' * 10000 + 'U'

  def set_test(vowels, word):

      vowels = set( iter(vowels))
      letters = set( iter(word) )

      if letters & vowels:
          return True
      else:
          return False

  with python 2.5 I got 1.30 usec/pass against the BIGWORD

 You could make it slightly faster by removing the iter() call:
 letters = set(word)
 And (if vowels are really constant) you could pre-build the vowels set.

 --
 Gabriel Genellina

set(word) = set{[word]} meaning a set with one element, the string
the call to iter makes it set of the letters making up the word.


Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread rdmurray
odeits ode...@gmail.com wrote:
 On Feb 21, 12:47 am, Gabriel Genellina gagsl-...@yahoo.com.ar
 wrote:
  On Sat, 21 Feb 2009 01:14:02 -0200, odeits ode...@gmail.com wrote:
 
   On Feb 15, 11:31 pm, odeits ode...@gmail.com wrote:
   It seems what you are actually testing for is if the intersection of
   the two sets is not empty where the first set is the characters in
   your word and the second set is the characters in your defined string.
 
   To expand on what I was saying I thought i should provide a code
   snippet:
 
   WORD = 'g' * 100
   WORD2 = 'g' * 50 + 'U'
   VOWELS = 'aeiouAEIOU'
   BIGWORD = 'g' * 10000 + 'U'
 
   def set_test(vowels, word):
 
       vowels = set( iter(vowels))
       letters = set( iter(word) )
 
       if letters & vowels:
           return True
       else:
           return False
 
   with python 2.5 I got 1.30 usec/pass against the BIGWORD
 
  You could make it slightly faster by removing the iter() call:
  letters = set(word)
  And (if vowels are really constant) you could pre-build the vowels set.
 
 set(word) = set{[word]} meaning a set with one element, the string
 the call to iter makes it set of the letters making up the word.

Did you try it?

Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> set('abcd')
set(['a', 'c', 'b', 'd'])
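
The same point as a small script, for anyone who prefers assertions to a session transcript. This is only a sketch; the variable names are made up for the illustration.

```python
word = 'abcd'

# A string is itself iterable, so set() collects its characters directly.
letters = set(word)
assert letters == set(iter(word))
assert len(letters) == 4

# A one-element set holding the whole string needs an explicit wrapper.
whole = set([word])
assert len(whole) == 1
```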

--RDM



Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread odeits
On Feb 21, 2:24 pm, rdmur...@bitdance.com wrote:
 odeits ode...@gmail.com wrote:
  On Feb 21, 12:47 am, Gabriel Genellina gagsl-...@yahoo.com.ar
  wrote:
   On Sat, 21 Feb 2009 01:14:02 -0200, odeits ode...@gmail.com wrote:

On Feb 15, 11:31 pm, odeits ode...@gmail.com wrote:
It seems what you are actually testing for is if the intersection of
the two sets is not empty where the first set is the characters in
your word and the second set is the characters in your defined string.

To expand on what I was saying I thought i should provide a code
snippet:

WORD = 'g' * 100
WORD2 = 'g' * 50 + 'U'
VOWELS = 'aeiouAEIOU'
BIGWORD = 'g' * 10000 + 'U'

def set_test(vowels, word):

    vowels = set( iter(vowels))
    letters = set( iter(word) )

    if letters & vowels:
        return True
    else:
        return False

with python 2.5 I got 1.30 usec/pass against the BIGWORD

   You could make it slightly faster by removing the iter() call:
   letters = set(word)
   And (if vowels are really constant) you could pre-build the vowels set.

  set(word) = set{[word]} meaning a set with one element, the string
  the call to iter makes it set of the letters making up the word.

 Did you try it?

 Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
 [GCC 4.3.2] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> set('abcd')
 set(['a', 'c', 'b', 'd'])

 --RDM

You are in fact correct. Thank you for pointing that out.


Re: Pythonic way to determine if one char of many in a string

2009-02-20 Thread odeits
On Feb 15, 11:31 pm, odeits ode...@gmail.com wrote:
 On Feb 15, 9:56 pm, Chris Rebert c...@rebertia.com wrote:



  On Sun, Feb 15, 2009 at 9:17 PM,  pyt...@bdurham.com wrote:
   I need to test strings to determine if one of a list of chars is in the
   string. A simple example would be to test strings to determine if they have
   a vowel (aeiouAEIOU) present.

   I was hopeful that there was a built-in method that operated similar to
   startswith where I could pass a tuple of chars to be tested, but I could not
   find such a method.

   Which of the following techniques is most Pythonic or are there better ways
   to perform this type of match?

   # long and hard coded but short circuits as soon as match found
   if 'a' in word or 'e' in word or 'i' in word or 'u' in word or ... :

   -OR-

   # flexible, but no short circuit on first match
   if [ char for char in word if char in 'aeiouAEIOU' ]:

  Just use the fairly new builtin function any() to make it short-circuit:

  if any(char.lower() in 'aeiou' for char in word):
      do_whatever()

  Cheers,
  Chris

  --
  Follow the path of the Iguana...http://rebertia.com

 If you want to generalize it you should look at sets
 http://docs.python.org/library/sets.html

 It seems what you are actually testing for is if the intersection of
 the two sets is not empty where the first set is the characters in
 your word and the second set is the characters in your defined string.

To expand on what I was saying, I thought I should provide a code
snippet:

WORD = 'g' * 100
WORD2 = 'g' * 50 + 'U'
VOWELS = 'aeiouAEIOU'
BIGWORD = 'g' * 10000 + 'U'

def set_test(vowels, word):

    vowels = set( iter(vowels))
    letters = set( iter(word) )

    if letters & vowels:
        return True
    else:
        return False

with python 2.5 I got 1.30 usec/pass against the BIGWORD




Re: Pythonic way to determine if one char of many in a string

2009-02-19 Thread John Machin
On Feb 19, 6:47 pm, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:
 On Wed, 18 Feb 2009 21:22:45 +0100, Peter Otten __pete...@web.de
 declaimed the following in comp.lang.python:

  Steve Holden wrote:

   Jervis Whitley wrote:
   What happens when you have hundreds of megabytes, I don't know.

   I hope I never have to test a word that is hundreds of megabytes long
   for a vowel :)

   I see you don't speak German ;-)

  I tried to come up with a funny way to point out that you're a fool.

  But because I'm German I failed.

         Yeah... the proper language to bemoan is Welsh <G>

 Where w is vowel <G>

Better bemoaned is Czech which can go on and on using only L and R as
sonorant consonants instead of vowels.



Re: Pythonic way to determine if one char of many in a string

2009-02-19 Thread MRAB

Dennis Lee Bieber wrote:

On Wed, 18 Feb 2009 21:22:45 +0100, Peter Otten __pete...@web.de
declaimed the following in comp.lang.python:


Steve Holden wrote:


Jervis Whitley wrote:

What happens when you have hundreds of megabytes, I don't know.



I hope I never have to test a word that is hundreds of megabytes long
for a vowel :)

I see you don't speak German ;-)

I tried to come up with a funny way to point out that you're a fool.

But because I'm German I failed.

	Yeah... the proper language to bemoan is Welsh <G>


Where w is vowel <G>


So is y, but then it is in English too, sometimes.


Re: Pythonic way to determine if one char of many in a string

2009-02-19 Thread Martin P. Hellwig

MRAB wrote:

Dennis Lee Bieber wrote:

On Wed, 18 Feb 2009 21:22:45 +0100, Peter Otten __pete...@web.de
declaimed the following in comp.lang.python:


Steve Holden wrote:


Jervis Whitley wrote:

What happens when you have hundreds of megabytes, I don't know.



I hope I never have to test a word that is hundreds of megabytes long
for a vowel :)

I see you don't speak German ;-)

I tried to come up with a funny way to point out that you're a fool.

But because I'm German I failed.


Yeah... the proper language to bemoan is Welsh <G>
Where w is vowel <G>


So is y, but then it is in English too, sometimes.


Heh, born and raised in Germany, moved to the Netherlands and now live 
in the UK, speak a bit of French too. No wonder the only language that 
actually makes sense to me is Python.


--
mph


Re: Pythonic way to determine if one char of many in a string

2009-02-18 Thread Steven D'Aprano
On Wed, 18 Feb 2009 07:08:04 +1100, Jervis Whitley wrote:


 This moves the for-loop out of slow Python into fast C and should be
 much, much faster for very large input.


 _Should_ be faster.

Yes, Python's timing results are often unintuitive.


 Here is my test on an XP system Python 2.5.4. I had similar results on
 python 2.7 trunk.
...
 **no vowels**
 any: [0.36063678618957751, 0.36116506191682773, 0.36212355395824081]
 for: [0.24044885376801672, 0.2417684017413404, 0.24084797257163482]

I get similar results.

...
 **BIG word vowel 'U' final char**
 any: [8.0007259193539895, 7.9797344140269644, 7.8901742633514012]
 for: [7.6664422372764101, 7.6784683633957584, 7.6683055766498001]

Well, I did say for very large input. 10000 chars isn't very large -- 
that's only 9K. Try this instead:

>>> BIGWORD = 'g' * 500000 + 'U'  # less than 500K of text

>>> Timer("for_test(BIGWORD)", setup).repeat(number=1000)
[4.7292280197143555, 4.633030891418457, 4.6327309608459473]
>>> Timer("any_test(BIGWORD)", setup).repeat(number=1000)
[4.7717428207397461, 4.6366970539093018, 4.6367099285125732]

The difference is not significant. What about bigger?


>>> BIGWORD = 'g' * 5000000 + 'U'  # less than 5MB

>>> Timer("for_test(BIGWORD)", setup).repeat(number=100)
[4.8875839710235596, 4.7698030471801758, 4.769787073135376]
>>> Timer("any_test(BIGWORD)", setup).repeat(number=100)
[4.8555209636688232, 4.8139419555664062, 4.7710208892822266]

It seems to me that I was mistaken -- for large enough input, the running 
time of each version converges to approximately the same speed.

What happens when you have hundreds of megabytes, I don't know.


-- 
Steven


Re: Pythonic way to determine if one char of many in a string

2009-02-18 Thread Jervis Whitley

 What happens when you have hundreds of megabytes, I don't know.


I hope I never have to test a word that is hundreds of megabytes long
for a vowel :)


Re: Pythonic way to determine if one char of many in a string

2009-02-18 Thread Steve Holden
Jervis Whitley wrote:
 What happens when you have hundreds of megabytes, I don't know.


 I hope I never have to test a word that is hundreds of megabytes long
 for a vowel :)

I see you don't speak German ;-)
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/



Re: Pythonic way to determine if one char of many in a string

2009-02-18 Thread Peter Otten
Steven D'Aprano wrote:

 On Wed, 18 Feb 2009 07:08:04 +1100, Jervis Whitley wrote:
 
 
 This moves the for-loop out of slow Python into fast C and should be
 much, much faster for very large input.


 _Should_ be faster.
 
 Yes, Python's timing results are often unintuitive.

Indeed.

 It seems to me that I was mistaken -- for large enough input, the running
 time of each version converges to approximately the same speed.

No, you were right. Both any_test() and for_test() use the improvement you 
suggested, i.e. they loop over the vowels, not the characters of the word.
Here's the benchmark as it should have been:

$ python -m timeit -s'word = "g"*10000' 'any(v in word for v in "aeiouAEIOU")'
1000 loops, best of 3: 314 usec per loop
$ python -m timeit -s'word = "g"*10000' 'any(c in "aeiouAEIOU" for c in word)'
100 loops, best of 3: 3.48 msec per loop

Of course this shows only the worst case behaviour. The results will vary 
depending on the actual word, e.g. "Ug..." or "g...a".
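
For trying other inputs, the two loop directions can be compared from a script. This is a rough sketch, not the benchmark used above: the function names are invented here, and the absolute numbers will depend on the machine and Python version.

```python
import timeit

VOWELS = 'aeiouAEIOU'

def scan_vowels(word):
    # Ten substring searches, each one running in C.
    return any(v in word for v in VOWELS)

def scan_chars(word):
    # One Python-level iteration per character of word.
    return any(c in VOWELS for c in word)

def compare(word, number=100):
    # Returns (seconds for scan_vowels, seconds for scan_chars).
    t_vowels = timeit.timeit(lambda: scan_vowels(word), number=number)
    t_chars = timeit.timeit(lambda: scan_chars(word), number=number)
    return t_vowels, t_chars
```

Calling compare('g' * 10000) exercises the worst case for both, while compare('U' + 'g' * 10000) lets the per-character scan stop after its first step.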

Peter 



Re: Pythonic way to determine if one char of many in a string

2009-02-18 Thread Peter Otten
Steve Holden wrote:

 Jervis Whitley wrote:
 What happens when you have hundreds of megabytes, I don't know.


 I hope I never have to test a word that is hundreds of megabytes long
 for a vowel :)
 
 I see you don't speak German ;-)

I tried to come up with a funny way to point out that you're a fool.

But because I'm German I failed.

Peter


Re: Pythonic way to determine if one char of many in a string

2009-02-17 Thread Steven D'Aprano
Nicolas Dandrimont wrote:

 I would go for something like:
 
 for char in word:
     if char in 'aeiouAEIUO':
         char_found = True
         break
 else:
     char_found = False
 
 (No, I did not forget to indent the else statement, see
 http://docs.python.org/reference/compound_stmts.html#for)

That might be better written as:

char_found = False
for char in word:
    if char in 'aeiouAEIUO':
        char_found = True
        break

or even:

char_found = False
for char in word:
    if char.lower() in 'aeiou':
        char_found = True
        break

but if word is potentially very large, it's probably better to reverse the
test: rather than compare every char of word to see if it is a vowel, just
search word for each vowel:

char_found = any( vowel in word for vowel in 'aeiouAEIOU' )

This moves the for-loop out of slow Python into fast C and should be much,
much faster for very large input.


-- 
Steven



Re: Pythonic way to determine if one char of many in a string

2009-02-17 Thread Jervis Whitley

 This moves the for-loop out of slow Python into fast C and should be much,
 much faster for very large input.


_Should_ be faster.

Here is my test on an XP system Python 2.5.4. I had similar results on
python 2.7 trunk.

WORD = 'g' * 100
WORD2 = 'g' * 50 + 'U'

BIGWORD = 'g' * 10000 + 'U'

def any_test(word):
    return any(vowel in word for vowel in 'aeiouAEIOU')

def for_test(word):
    for vowel in 'aeiouAEIOU':
        if vowel in word:
            return True
    else:
        return False

**no vowels**
any: [0.36063678618957751, 0.36116506191682773, 0.36212355395824081]
for: [0.24044885376801672, 0.2417684017413404, 0.24084797257163482]

**vowel 'U' final char**
any: [0.38218764069443112, 0.38431925474244588, 0.38238668882188831]
for: [0.16398578356553717, 0.16433223810347286, 0.1659337176385]

**BIG word vowel 'U' final char**
any: [8.0007259193539895, 7.9797344140269644, 7.8901742633514012]
for: [7.6664422372764101, 7.6784683633957584, 7.6683055766498001]

Cheers,


Re: Pythonic way to determine if one char of many in a string

2009-02-16 Thread Nicolas Dandrimont
* pyt...@bdurham.com pyt...@bdurham.com [2009-02-16 00:48:34 -0500]:

 Nicolas,
 
  I would go for something like:
  
  for char in word:
      if char in 'aeiouAEIUO':
          char_found = True
          break
  else:
      char_found = False
 
  It is clear (imo), and it seems to be the intended idiom for a 
  search loop that short-circuits as soon as a match is found.
 
 Thank you - that looks much better than my overly complicated attempts.
 
 Are there any reasons why I couldn't simplify your approach as follows?
 
 for char in word:
     if char in 'aeiouAEIUO':
         return True
 return False

If you want to put this in its own function, this seems to be the way to go.

Cheers,
-- 
Nicolas Dandrimont

The nice thing about Windows is - It does not just crash, it displays a
dialog box and lets you press 'OK' first.
(Arno Schaefer's .sig)




Re: Pythonic way to determine if one char of many in a string

2009-02-16 Thread J. Cliff Dyer

On Mon, 2009-02-16 at 00:28 -0500, Nicolas Dandrimont wrote:
 * pyt...@bdurham.com pyt...@bdurham.com [2009-02-16 00:17:37 -0500]:
 
  I need to test strings to determine if one of a list of chars is
  in the string. A simple example would be to test strings to
  determine if they have a vowel (aeiouAEIOU) present.
  I was hopeful that there was a built-in method that operated
  similar to startswith where I could pass a tuple of chars to be
  tested, but I could not find such a method.
  Which of the following techniques is most Pythonic or are there
  better ways to perform this type of match?
  # long and hard coded but short circuits as soon as match found
  if 'a' in word or 'e' in word or 'i' in word or 'u' in word or
  ... :
  -OR-
  # flexible, but no short circuit on first match
  if [ char for char in word if char in 'aeiouAEIOU' ]:
  -OR-
  # flexible, but no short circuit on first match
  if set( word ).intersection( 'aeiouAEIOU' ):
 
 I would go for something like:
 
 for char in word:
     if char in 'aeiouAEIUO':
         char_found = True
         break
 else:
     char_found = False
 
 (No, I did not forget to indent the else statement, see
 http://docs.python.org/reference/compound_stmts.html#for)
 
 It is clear (imo), and it seems to be the intended idiom for a search
 loop that short-circuits as soon as a match is found.
 

If performance becomes an issue, you can tune this very easily, so it
doesn't have to scan through the string 'aeiouAEIOU' every time, by
making a set out of that:

vowels = set('aeiouAEIOU')
for char in word:
    if char in vowels:
        return True
return False

Searching in a set runs in constant time.  


 Cheers,


Re: Pythonic way to determine if one char of many in a string

2009-02-16 Thread Stefaan Himpe

An entirely different approach would be to use a regular expression:

import re
if re.search("[abc]", "nothing expekted"):
    print("a, b or c occurs in the string 'nothing expekted'")

if re.search("[abc]", "something expected"):
    print("a, b or c occurs in the string 'something expected'")
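
If the characters to look for are not fixed in advance, the character class can be built at run time. The following is a sketch under that assumption; make_matcher and has_abc are hypothetical names, and chars is assumed to be non-empty (an empty class "[]" is not a valid pattern).

```python
import re

def make_matcher(chars):
    # re.escape guards characters such as ']', '^' and '-', which are
    # special inside a character class. chars must be non-empty.
    pattern = re.compile('[' + re.escape(chars) + ']')
    return lambda text: pattern.search(text) is not None

has_abc = make_matcher('abc')
```

Compiling once and reusing the matcher avoids rebuilding the pattern string for every word tested.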

Best regards,
Stefaan.


Re: Pythonic way to determine if one char of many in a string

2009-02-15 Thread python
Nicolas,

 I would go for something like:
 
 for char in word:
     if char in 'aeiouAEIUO':
         char_found = True
         break
 else:
     char_found = False

 It is clear (imo), and it seems to be the intended idiom for a 
 search loop that short-circuits as soon as a match is found.

Thank you - that looks much better than my overly complicated attempts.

Are there any reasons why I couldn't simplify your approach as follows?

for char in word:
    if char in 'aeiouAEIUO':
        return True
return False

Cheers,
Malcolm


Re: Pythonic way to determine if one char of many in a string

2009-02-15 Thread Nicolas Dandrimont
* pyt...@bdurham.com pyt...@bdurham.com [2009-02-16 00:17:37 -0500]:

 I need to test strings to determine if one of a list of chars is
 in the string. A simple example would be to test strings to
 determine if they have a vowel (aeiouAEIOU) present.
 I was hopeful that there was a built-in method that operated
 similar to startswith where I could pass a tuple of chars to be
 tested, but I could not find such a method.
 Which of the following techniques is most Pythonic or are there
 better ways to perform this type of match?
 # long and hard coded but short circuits as soon as match found
 if 'a' in word or 'e' in word or 'i' in word or 'u' in word or
 ... :
 -OR-
 # flexible, but no short circuit on first match
 if [ char for char in word if char in 'aeiouAEIOU' ]:
 -OR-
 # flexible, but no short circuit on first match
 if set( word ).intersection( 'aeiouAEIOU' ):

I would go for something like:

for char in word:
    if char in 'aeiouAEIUO':
        char_found = True
        break
else:
    char_found = False

(No, I did not forget to indent the else statement, see
http://docs.python.org/reference/compound_stmts.html#for)

It is clear (imo), and it seems to be the intended idiom for a search
loop that short-circuits as soon as a match is found.
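
To make the for/else behaviour concrete, here is a small sketch wrapped in a function. The helper name first_vowel_index is invented for the illustration; the else clause runs only when the loop finishes without hitting break, which includes the case of an empty word.

```python
def first_vowel_index(word):
    # Hypothetical helper illustrating for/else.
    for index, char in enumerate(word):
        if char in 'aeiouAEIOU':
            break        # leaving via break skips the else clause
    else:
        index = -1       # reached only when no break happened
    return index
```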

Cheers,
-- 
Nicolas Dandrimont

linux: the choice of a GNU generation
(k...@cis.ufl.edu put this on Tshirts in '93)




Re: Pythonic way to determine if one char of many in a string

2009-02-15 Thread Chris Rebert
On Sun, Feb 15, 2009 at 9:17 PM,  pyt...@bdurham.com wrote:
 I need to test strings to determine if one of a list of chars is in the
 string. A simple example would be to test strings to determine if they have
 a vowel (aeiouAEIOU) present.

 I was hopeful that there was a built-in method that operated similar to
 startswith where I could pass a tuple of chars to be tested, but I could not
 find such a method.

 Which of the following techniques is most Pythonic or are there better ways
 to perform this type of match?

 # long and hard coded but short circuits as soon as match found
 if 'a' in word or 'e' in word or 'i' in word or 'u' in word or ... :

 -OR-

 # flexible, but no short circuit on first match
 if [ char for char in word if char in 'aeiouAEIOU' ]:

Just use the fairly new builtin function any() to make it short-circuit:

if any(char.lower() in 'aeiou' for char in word):
    do_whatever()
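
The short-circuit can be observed directly by handing any() an explicit iterator and checking what is left over afterwards. This is only an illustrative sketch; has_vowel, chars, found and rest are names made up here.

```python
def has_vowel(word):
    return any(char.lower() in 'aeiou' for char in word)

# any() stops pulling from the iterator at the first match.
chars = iter('xyzabc')
found = any(c.lower() in 'aeiou' for c in chars)
rest = ''.join(chars)  # the characters any() never looked at
```

Here the scan stops at 'a', so the iterator still holds the characters after it.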

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


Re: Pythonic way to determine if one char of many in a string

2009-02-15 Thread odeits
On Feb 15, 9:56 pm, Chris Rebert c...@rebertia.com wrote:
 On Sun, Feb 15, 2009 at 9:17 PM,  pyt...@bdurham.com wrote:
  I need to test strings to determine if one of a list of chars is in the
  string. A simple example would be to test strings to determine if they have
  a vowel (aeiouAEIOU) present.

  I was hopeful that there was a built-in method that operated similar to
  startswith where I could pass a tuple of chars to be tested, but I could not
  find such a method.

  Which of the following techniques is most Pythonic or are there better ways
  to perform this type of match?

  # long and hard coded but short circuits as soon as match found
  if 'a' in word or 'e' in word or 'i' in word or 'u' in word or ... :

  -OR-

  # flexible, but no short circuit on first match
  if [ char for char in word if char in 'aeiouAEIOU' ]:

 Just use the fairly new builtin function any() to make it short-circuit:

 if any(char.lower() in 'aeiou' for char in word):
     do_whatever()

 Cheers,
 Chris

 --
 Follow the path of the Iguana...http://rebertia.com

If you want to generalize it you should look at sets
http://docs.python.org/library/sets.html

It seems what you are actually testing for is if the intersection of
the two sets is not empty where the first set is the characters in
your word and the second set is the characters in your defined string.
