Re: Pythonic way to determine if one char of many in a string
On Sat, 21 Feb 2009 01:14:02 -0200, odeits ode...@gmail.com wrote:
> On Feb 15, 11:31 pm, odeits ode...@gmail.com wrote:
>> It seems what you are actually testing for is if the intersection of
>> the two sets is not empty where the first set is the characters in your
>> word and the second set is the characters in your defined string.
>
> To expand on what I was saying I thought i should provide a code snippet:
>
> WORD = 'g' * 100
> WORD2 = 'g' * 50 + 'U'
> VOWELS = 'aeiouAEIOU'
> BIGWORD = 'g' * 1 + 'U'
>
> def set_test(vowels, word):
>     vowels = set( iter(vowels))
>     letters = set( iter(word) )
>     if letters & vowels:
>         return True
>     else:
>         return False
>
> with python 2.5 I got 1.30 usec/pass against the BIGWORD

You could make it slightly faster by removing the iter() call:

    letters = set(word)

And (if vowels are really constant) you could pre-build the vowels set.

-- 
Gabriel Genellina
-- 
http://mail.python.org/mailman/listinfo/python-list
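A minimal sketch of the two suggestions in this message (drop the iter() call, pre-build the vowel set), assuming Python 3 and that the vowels really are constant. The names VOWEL_SET and has_vowel are illustrative only and do not come from the thread; isdisjoint() is swapped in for the `&` test because it accepts any iterable.

```python
# Hypothetical names; a sketch of the set-based test with a pre-built set.
VOWEL_SET = frozenset('aeiouAEIOU')  # built once, not on every call


def has_vowel(word):
    """Return True if word shares at least one character with VOWEL_SET."""
    # isdisjoint() takes any iterable, so no set(word) has to be built first.
    return not VOWEL_SET.isdisjoint(word)


print(has_vowel('g' * 50 + 'U'))  # True
print(has_vowel('g' * 100))       # False
```

Whether this is actually faster than the original set_test() depends on the input and the interpreter, so it is worth timing before relying on it.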
Re: Pythonic way to determine if one char of many in a string
On Feb 21, 12:47 am, Gabriel Genellina gagsl-...@yahoo.com.ar wrote:
> On Sat, 21 Feb 2009 01:14:02 -0200, odeits ode...@gmail.com wrote:
>> [...]
>> def set_test(vowels, word):
>>     vowels = set( iter(vowels))
>>     letters = set( iter(word) )
>>     if letters & vowels:
>>         return True
>>     else:
>>         return False
>>
>> with python 2.5 I got 1.30 usec/pass against the BIGWORD
>
> You could make it slightly faster by removing the iter() call:
> letters = set(word)
> And (if vowels are really constant) you could pre-build the vowels set.
>
> -- 
> Gabriel Genellina

set(word) = set{[word]}, meaning a set with one element, the string. The call to iter makes it a set of the letters making up the word.
Re: Pythonic way to determine if one char of many in a string
odeits ode...@gmail.com wrote:
> On Feb 21, 12:47 am, Gabriel Genellina gagsl-...@yahoo.com.ar wrote:
>> On Sat, 21 Feb 2009 01:14:02 -0200, odeits ode...@gmail.com wrote:
>>> [...]
>>> def set_test(vowels, word):
>>>     vowels = set( iter(vowels))
>>>     letters = set( iter(word) )
>>> [...]
>>
>> You could make it slightly faster by removing the iter() call:
>> letters = set(word)
>> And (if vowels are really constant) you could pre-build the vowels set.
>
> set(word) = set{[word]} meaning a set with one element, the string
> the call to iter makes it set of the letters making up the word.

Did you try it?

Python 2.6.1 (r261:67515, Jan 7 2009, 17:09:13)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> set('abcd')
set(['a', 'c', 'b', 'd'])

--RDM
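The point of the interpreter session above can be checked with a few assertions. This is a small sketch, assuming Python 3 (where the set repr differs from the 2.6 output shown, but the behaviour is the same):

```python
word = 'abcd'

# A string is itself iterable, so set() consumes it character by character.
assert set(word) == {'a', 'b', 'c', 'd'}

# Wrapping the string in iter() first gives exactly the same set.
assert set(iter(word)) == set(word)

# A one-element set holding the whole string needs an explicit container.
assert set([word]) == {'abcd'}

print(sorted(set(word)))  # ['a', 'b', 'c', 'd']
```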
Re: Pythonic way to determine if one char of many in a string
On Feb 21, 2:24 pm, rdmur...@bitdance.com wrote:
> odeits ode...@gmail.com wrote:
>> On Feb 21, 12:47 am, Gabriel Genellina gagsl-...@yahoo.com.ar wrote:
>>> You could make it slightly faster by removing the iter() call:
>>> letters = set(word)
>>> [...]
>>
>> set(word) = set{[word]} meaning a set with one element, the string
>> the call to iter makes it set of the letters making up the word.
>
> Did you try it?
>
> Python 2.6.1 (r261:67515, Jan 7 2009, 17:09:13)
> [GCC 4.3.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> set('abcd')
> set(['a', 'c', 'b', 'd'])
>
> --RDM

You are in fact correct. Thank you for pointing that out.
Re: Pythonic way to determine if one char of many in a string
On Feb 15, 11:31 pm, odeits ode...@gmail.com wrote:
> On Feb 15, 9:56 pm, Chris Rebert c...@rebertia.com wrote:
>> On Sun, Feb 15, 2009 at 9:17 PM, pyt...@bdurham.com wrote:
>>> I need to test strings to determine if one of a list of chars is in
>>> the string.
>>> [...]
>>
>> Just use the fairly new builtin function any() to make it short-circuit:
>>
>> if any(char.lower() in 'aeiou' for char in word):
>>     do_whatever()
>>
>> Cheers,
>> Chris
>
> If you want to generalize it you should look at sets
> http://docs.python.org/library/sets.html
>
> It seems what you are actually testing for is if the intersection of
> the two sets is not empty where the first set is the characters in your
> word and the second set is the characters in your defined string.

To expand on what I was saying, I thought I should provide a code snippet:

    WORD = 'g' * 100
    WORD2 = 'g' * 50 + 'U'
    VOWELS = 'aeiouAEIOU'
    BIGWORD = 'g' * 1 + 'U'

    def set_test(vowels, word):
        vowels = set(iter(vowels))
        letters = set(iter(word))
        if letters & vowels:
            return True
        else:
            return False

With Python 2.5 I got 1.30 usec/pass against the BIGWORD.
Re: Pythonic way to determine if one char of many in a string
On Feb 19, 6:47 pm, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:
> On Wed, 18 Feb 2009 21:22:45 +0100, Peter Otten __pete...@web.de
> declaimed the following in comp.lang.python:
>> Steve Holden wrote:
>>> Jervis Whitley wrote:
>>>>> What happens when you have hundreds of megabytes, I don't know.
>>>>
>>>> I hope I never have to test a word that is hundreds of megabytes
>>>> long for a vowel :)
>>>
>>> I see you don't speak German ;-)
>>
>> I tried to come up with a funny way to point out that you're a fool.
>> But because I'm German I failed.
>
> Yeah... the proper language to bemoan is Welsh <G>
> Where "w" is vowel <G>

Better bemoaned is Czech, which can go on and on using only L and R as sonorant consonants instead of vowels.
Re: Pythonic way to determine if one char of many in a string
Dennis Lee Bieber wrote:
> On Wed, 18 Feb 2009 21:22:45 +0100, Peter Otten __pete...@web.de
> declaimed the following in comp.lang.python:
>> Steve Holden wrote:
>>> Jervis Whitley wrote:
>>>>> What happens when you have hundreds of megabytes, I don't know.
>>>>
>>>> I hope I never have to test a word that is hundreds of megabytes
>>>> long for a vowel :)
>>>
>>> I see you don't speak German ;-)
>>
>> I tried to come up with a funny way to point out that you're a fool.
>> But because I'm German I failed.
>
> Yeah... the proper language to bemoan is Welsh <G>
> Where "w" is vowel <G>

So is y, but then it is in English too, sometimes.
Re: Pythonic way to determine if one char of many in a string
MRAB wrote:
> Dennis Lee Bieber wrote:
>> On Wed, 18 Feb 2009 21:22:45 +0100, Peter Otten __pete...@web.de
>> declaimed the following in comp.lang.python:
>>> Steve Holden wrote:
>>>> I see you don't speak German ;-)
>>>
>>> I tried to come up with a funny way to point out that you're a fool.
>>> But because I'm German I failed.
>>
>> Yeah... the proper language to bemoan is Welsh <G>
>> Where "w" is vowel <G>
>
> So is y, but then it is in English too, sometimes.

Heh, born and raised in Germany, moved to the Netherlands and now live in the UK, speak a bit of French too. No wonder the only language that actually makes sense to me is Python.

-- 
mph
Re: Pythonic way to determine if one char of many in a string
On Wed, 18 Feb 2009 07:08:04 +1100, Jervis Whitley wrote:
>> This moves the for-loop out of slow Python into fast C and should be
>> much, much faster for very large input.
>
> _Should_ be faster.

Yes, Python's timing results are often unintuitive.

> Here is my test on an XP system Python 2.5.4. I had similar results on
> python 2.7 trunk.
> ...
> **no vowels**
> any: [0.36063678618957751, 0.36116506191682773, 0.36212355395824081]
> for: [0.24044885376801672, 0.2417684017413404, 0.24084797257163482]

I get similar results.

> ...
> **BIG word vowel 'U' final char**
> any: [8.0007259193539895, 7.9797344140269644, 7.8901742633514012]
> for: [7.6664422372764101, 7.6784683633957584, 7.6683055766498001]

Well, I did say for very large input. 1 chars isn't very large -- that's only 9K. Try this instead:

>>> BIGWORD = 'g' * 50 + 'U'  # less than 500K of text
>>> Timer("for_test(BIGWORD)", setup).repeat(number=1000)
[4.7292280197143555, 4.633030891418457, 4.6327309608459473]
>>> Timer("any_test(BIGWORD)", setup).repeat(number=1000)
[4.7717428207397461, 4.6366970539093018, 4.6367099285125732]

The difference is not significant. What about bigger?

>>> BIGWORD = 'g' * 500 + 'U'  # less than 5MB
>>> Timer("for_test(BIGWORD)", setup).repeat(number=100)
[4.8875839710235596, 4.7698030471801758, 4.769787073135376]
>>> Timer("any_test(BIGWORD)", setup).repeat(number=100)
[4.8555209636688232, 4.8139419555664062, 4.7710208892822266]

It seems to me that I was mistaken -- for large enough input, the running time of each version converges to approximately the same speed.

What happens when you have hundreds of megabytes, I don't know.

-- 
Steven
Re: Pythonic way to determine if one char of many in a string
> What happens when you have hundreds of megabytes, I don't know.

I hope I never have to test a word that is hundreds of megabytes long for a vowel :)
Re: Pythonic way to determine if one char of many in a string
Jervis Whitley wrote:
>> What happens when you have hundreds of megabytes, I don't know.
>
> I hope I never have to test a word that is hundreds of megabytes long
> for a vowel :)

I see you don't speak German ;-)

-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/
Re: Pythonic way to determine if one char of many in a string
Steven D'Aprano wrote:
> On Wed, 18 Feb 2009 07:08:04 +1100, Jervis Whitley wrote:
>>> This moves the for-loop out of slow Python into fast C and should be
>>> much, much faster for very large input.
>>
>> _Should_ be faster.
>
> Yes, Python's timing results are often unintuitive.

Indeed.

> It seems to me that I was mistaken -- for large enough input, the running
> time of each version converges to approximately the same speed.

No, you were right. Both any_test() and for_test() use the improvement you suggested, i.e. they loop over the vowels, not the characters of the word. Here's the benchmark as it should have been:

$ python -m timeit -s'word = "g"*1' 'any(v in word for v in "aeiouAEIOU")'
1000 loops, best of 3: 314 usec per loop
$ python -m timeit -s'word = "g"*1' 'any(c in "aeiouAEIOU" for c in word)'
100 loops, best of 3: 3.48 msec per loop

Of course this shows only the worst-case behaviour. The results will vary depending on the actual word, e.g. "Ug..." or "g...a".

Peter
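The same comparison can be scripted with the timeit module instead of the command line. This is only a sketch, assuming Python 3; the function names and the word length of 10000 are assumptions made for illustration, and no timings are claimed because they depend on the machine and interpreter.

```python
import timeit

VOWELS = 'aeiouAEIOU'


def vowels_outer(word):
    # Ten substring searches at most; each "v in word" scan runs in C.
    return any(v in word for v in VOWELS)


def chars_outer(word):
    # One Python-level iteration per character of word.
    return any(c in VOWELS for c in word)


word = 'g' * 10000  # worst case: no vowel at all (length is an assumption)
for func in (vowels_outer, chars_outer):
    # Take the best of three runs of 100 calls each.
    best = min(timeit.repeat(lambda: func(word), number=100, repeat=3))
    print('%s: %.6f s per 100 calls' % (func.__name__, best))
```

Both functions return the same answer for any word; only the direction of the loop differs, which is the distinction the message above draws.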
Re: Pythonic way to determine if one char of many in a string
Steve Holden wrote:
> Jervis Whitley wrote:
>>> What happens when you have hundreds of megabytes, I don't know.
>>
>> I hope I never have to test a word that is hundreds of megabytes long
>> for a vowel :)
>
> I see you don't speak German ;-)

I tried to come up with a funny way to point out that you're a fool. But because I'm German I failed.

Peter
Re: Pythonic way to determine if one char of many in a string
Nicolas Dandrimont wrote:
> I would go for something like:
>
> for char in word:
>     if char in 'aeiouAEIUO':
>         char_found = True
>         break
> else:
>     char_found = False
>
> (No, I did not forget to indent the else statement, see
> http://docs.python.org/reference/compound_stmts.html#for)

That might be better written as:

    char_found = False
    for char in word:
        if char in 'aeiouAEIUO':
            char_found = True
            break

or even:

    char_found = False
    for char in word:
        if char.lower() in 'aeiou':
            char_found = True
            break

but if word is potentially very large, it's probably better to reverse the test: rather than compare every char of word to see if it is a vowel, just search word for each vowel:

    char_found = any( vowel in word for vowel in 'aeiouAEIOU' )

This moves the for-loop out of slow Python into fast C and should be much, much faster for very large input.

-- 
Steven
Re: Pythonic way to determine if one char of many in a string
> This moves the for-loop out of slow Python into fast C and should be
> much, much faster for very large input.

_Should_ be faster.

Here is my test on an XP system, Python 2.5.4. I had similar results on Python 2.7 trunk.

    WORD = 'g' * 100
    WORD2 = 'g' * 50 + 'U'
    BIGWORD = 'g' * 1 + 'U'

    def any_test(word):
        return any(vowel in word for vowel in 'aeiouAEIOU')

    def for_test(word):
        for vowel in 'aeiouAEIOU':
            if vowel in word:
                return True
        else:
            return False

**no vowels**
any: [0.36063678618957751, 0.36116506191682773, 0.36212355395824081]
for: [0.24044885376801672, 0.2417684017413404, 0.24084797257163482]

**vowel 'U' final char**
any: [0.38218764069443112, 0.38431925474244588, 0.38238668882188831]
for: [0.16398578356553717, 0.16433223810347286, 0.1659337176385]

**BIG word vowel 'U' final char**
any: [8.0007259193539895, 7.9797344140269644, 7.8901742633514012]
for: [7.6664422372764101, 7.6784683633957584, 7.6683055766498001]

Cheers,
Re: Pythonic way to determine if one char of many in a string
* pyt...@bdurham.com pyt...@bdurham.com [2009-02-16 00:48:34 -0500]:
> Nicolas,
>
>> I would go for something like:
>>
>> for char in word:
>>     if char in 'aeiouAEIUO':
>>         char_found = True
>>         break
>> else:
>>     char_found = False
>>
>> It is clear (imo), and it is seems to be the intended idiom for a search
>> loop, that short-circuits as soon as a match is found.
>
> Thank you - that looks much better that my overly complicated attempts.
>
> Are there any reasons why I couldn't simplify your approach as follows?
>
> for char in word:
>     if char in 'aeiouAEIUO':
>         return True
> return False

If you want to put this in its own function, this seems to be the way to go.

Cheers,
-- 
Nicolas Dandrimont

The nice thing about Windows is - It does not just crash, it displays a
dialog box and lets you press 'OK' first.
(Arno Schaefer's .sig)
Re: Pythonic way to determine if one char of many in a string
On Mon, 2009-02-16 at 00:28 -0500, Nicolas Dandrimont wrote:
> * pyt...@bdurham.com pyt...@bdurham.com [2009-02-16 00:17:37 -0500]:
>> I need to test strings to determine if one of a list of chars is in
>> the string.
>> [...]
>
> I would go for something like:
>
> for char in word:
>     if char in 'aeiouAEIUO':
>         char_found = True
>         break
> else:
>     char_found = False
>
> (No, I did not forget to indent the else statement, see
> http://docs.python.org/reference/compound_stmts.html#for)
>
> It is clear (imo), and it is seems to be the intended idiom for a search
> loop, that short-circuits as soon as a match is found.

If performance becomes an issue, you can tune this very easily, so it doesn't have to scan through the string 'aeiouAEIOU' every time, by making a set out of that:

    vowels = set('aeiouAEIOU')
    for char in word:
        if char in vowels:
            return True
    return False

A membership test on a set runs in constant time on average.

Cheers,
Re: Pythonic way to determine if one char of many in a string
An entirely different approach would be to use a regular expression:

    import re

    if re.search("[abc]", "nothing expekted"):
        print("a, b or c occurs in the string 'nothing expekted'")

    if re.search("[abc]", "something expected"):
        print("a, b or c occurs in the string 'something expected'")

Best regards,
Stefaan.
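To apply the regular-expression idea to an arbitrary collection of characters, the character class can be built programmatically. This is a sketch, assuming Python 3 and a non-empty chars argument; make_any_char_pattern and VOWEL_RE are hypothetical names, not something proposed in the thread.

```python
import re


def make_any_char_pattern(chars):
    """Compile a pattern matching any single character from chars."""
    # re.escape() protects characters such as ']', '^' or '-' that would
    # otherwise be special inside a character class.
    # Assumption: chars is non-empty ('[]' alone is not a valid pattern).
    return re.compile('[' + re.escape(chars) + ']')


VOWEL_RE = make_any_char_pattern('aeiouAEIOU')

print(bool(VOWEL_RE.search('rhythm')))     # False
print(bool(VOWEL_RE.search('something')))  # True
```

Compiling the pattern once and reusing it avoids rebuilding the character class for every word tested.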
Re: Pythonic way to determine if one char of many in a string
Nicolas,

> I would go for something like:
>
> for char in word:
>     if char in 'aeiouAEIUO':
>         char_found = True
>         break
> else:
>     char_found = False
>
> It is clear (imo), and it is seems to be the intended idiom for a search
> loop, that short-circuits as soon as a match is found.

Thank you - that looks much better than my overly complicated attempts.

Are there any reasons why I couldn't simplify your approach as follows?

    for char in word:
        if char in 'aeiouAEIUO':
            return True
    return False

Cheers,
Malcolm
Re: Pythonic way to determine if one char of many in a string
* pyt...@bdurham.com pyt...@bdurham.com [2009-02-16 00:17:37 -0500]:
> I need to test strings to determine if one of a list of chars is in the
> string. A simple example would be to test strings to determine if they
> have a vowel (aeiouAEIOU) present.
>
> I was hopeful that there was a built-in method that operated similar to
> startswith where I could pass a tuple of chars to be tested, but I could
> not find such a method.
>
> Which of the following techniques is most Pythonic or are there better
> ways to perform this type of match?
>
> # long and hard coded but short circuits as soon as match found
> if 'a' in word or 'e' in word or 'i' in word or 'u' in word or ... :
>
> -OR-
>
> # flexible, but no short circuit on first match
> if [ char for char in word if char in 'aeiouAEIOU' ]:
>
> -OR-
>
> # flexible, but no short circuit on first match
> if set( word ).intersection( 'aeiouAEIOU' ):

I would go for something like:

    for char in word:
        if char in 'aeiouAEIUO':
            char_found = True
            break
    else:
        char_found = False

(No, I did not forget to indent the else statement, see http://docs.python.org/reference/compound_stmts.html#for)

It is clear (imo), and it seems to be the intended idiom for a search loop that short-circuits as soon as a match is found.

Cheers,
-- 
Nicolas Dandrimont

linux: the choice of a GNU generation
(k...@cis.ufl.edu put this on Tshirts in '93)
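For readers unfamiliar with the for/else idiom used in this message: the else clause runs only when the loop ends without hitting break. A small sketch, assuming Python 3; first_vowel_index is a hypothetical helper written for illustration.

```python
def first_vowel_index(word, vowels='aeiouAEIOU'):
    """Return the index of the first vowel in word, or -1 if there is none."""
    for index, char in enumerate(word):
        if char in vowels:
            break  # leaving via break skips the else clause
    else:
        # Runs only when the loop finished without break,
        # including when word is empty.
        index = -1
    return index


print(first_vowel_index('rhythmic'))  # 6
print(first_vowel_index('rhythm'))    # -1
print(first_vowel_index(''))          # -1
```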
Re: Pythonic way to determine if one char of many in a string
On Sun, Feb 15, 2009 at 9:17 PM, pyt...@bdurham.com wrote:
> I need to test strings to determine if one of a list of chars is in the
> string. A simple example would be to test strings to determine if they
> have a vowel (aeiouAEIOU) present.
>
> I was hopeful that there was a built-in method that operated similar to
> startswith where I could pass a tuple of chars to be tested, but I could
> not find such a method.
>
> Which of the following techniques is most Pythonic or are there better
> ways to perform this type of match?
>
> # long and hard coded but short circuits as soon as match found
> if 'a' in word or 'e' in word or 'i' in word or 'u' in word or ... :
>
> -OR-
>
> # flexible, but no short circuit on first match
> if [ char for char in word if char in 'aeiouAEIOU' ]:

Just use the fairly new builtin function any() to make it short-circuit:

    if any(char.lower() in 'aeiou' for char in word):
        do_whatever()

Cheers,
Chris
-- 
Follow the path of the Iguana...
http://rebertia.com
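The short-circuiting of any() can be observed by feeding it an explicit iterator and looking at what is left over afterwards. A minimal sketch, assuming Python 3; contains_vowel is an illustrative name only.

```python
def contains_vowel(word):
    """True if any character of word is a vowel, ignoring case."""
    return any(char.lower() in 'aeiou' for char in word)


print(contains_vowel('xyzabc'))  # True
print(contains_vowel('xyz'))     # False

# any() stops pulling items as soon as one test is true, so the iterator
# still holds the characters after the first vowel.
chars = iter('xyzabc')
found = any(char.lower() in 'aeiou' for char in chars)
remaining = ''.join(chars)
print(found, repr(remaining))  # True 'bc'
```

By contrast, the list-comprehension version in the question always walks the whole word before the truth test is applied.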
Re: Pythonic way to determine if one char of many in a string
On Feb 15, 9:56 pm, Chris Rebert c...@rebertia.com wrote:
> On Sun, Feb 15, 2009 at 9:17 PM, pyt...@bdurham.com wrote:
>> I need to test strings to determine if one of a list of chars is in
>> the string.
>> [...]
>
> Just use the fairly new builtin function any() to make it short-circuit:
>
> if any(char.lower() in 'aeiou' for char in word):
>     do_whatever()
>
> Cheers,
> Chris
> -- 
> Follow the path of the Iguana...
> http://rebertia.com

If you want to generalize it, you should look at sets:
http://docs.python.org/library/sets.html

It seems that what you are actually testing is whether the intersection of two sets is non-empty, where the first set is the characters in your word and the second set is the characters in your defined string.