Re: Newbie question - better way to do this?
Steve Howell wrote:
> def firstIsCapitalized(word):
>     return 'A' <= word[0] <= 'Z'

For someone who is worried about the impact of non-ASCII identifiers, you are making surprising assumptions about the contents of data.

Nis
--
http://mail.python.org/mailman/listinfo/python-list

Re: Newbie question - better way to do this?
--- Nis Jørgensen [EMAIL PROTECTED] wrote:
> Steve Howell wrote:
> > def firstIsCapitalized(word):
> >     return 'A' <= word[0] <= 'Z'
>
> For someone who is worried about the impact of non-ASCII identifiers, you are making surprising assumptions about the contents of data.

The function there, which I don't even remotely defend, had nothing to do with the main point of the thread. Others pointed out that the correct idiom here is word.istitle(), but the main point of the thread was how to prevent the newbie from essentially reinventing itertools.groupby.

If you want to point out holes in my logic about the impact of non-ASCII identifiers (the above code, though bad, does not suggest a hole in my logic; it perhaps even adds to my case), can you kindly do it in a thread where that's the main point of the discussion?

Re: Newbie question - better way to do this?
Steve Howell wrote:
> --- Nis Jørgensen [EMAIL PROTECTED] wrote:
> > Steve Howell wrote:
> > > def firstIsCapitalized(word):
> > >     return 'A' <= word[0] <= 'Z'
> >
> > For someone who is worried about the impact of non-ASCII identifiers, you are making surprising assumptions about the contents of data.
>
> The function there, which I don't even remotely defend, had nothing to do with the main point of the thread. Others pointed out that the correct idiom here is word.istitle(), but the main point of the thread was how to prevent the newbie from essentially reinventing itertools.groupby.

The subject line says "Newbie question - better way to do this". I was hinting at a better way to do what you did, which was supposedly a better way of doing what the newbie wanted.

I disagree that word.istitle is the correct idiom - from the naming of the function in the original example, I would guess word[0].isupper would do the trick.

> If you want to point out holes in my logic about the impact of non-ASCII identifiers (the above code, though bad, does not suggest a hole in my logic; it perhaps even adds to my case), can you kindly do it in a thread where that's the main point of the discussion?

Will do.

Nis

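The three capitalization checks mentioned so far (the ASCII range test, word[0].isupper() and word.istitle()) do not always agree. The following is a minimal sketch in Python 3 (the thread itself uses Python 2); the helper names ascii_range and first_upper and the sample words are made up here for illustration and do not come from any poster.

```python
def ascii_range(word):
    # The check from the thread: only ASCII 'A'..'Z' counts as capitalized.
    return 'A' <= word[0] <= 'Z'

def first_upper(word):
    # The word[0].isupper idea, written with the call parentheses.
    return word[0].isupper()

# Sample words are assumptions chosen to show where the checks differ.
samples = ["Python", "McDonald", "Ørsted", "perl"]

# For each word: (ASCII range test, first letter isupper, str.istitle)
results = {
    w: (ascii_range(w), first_upper(w), w.istitle())
    for w in samples
}

for word, checks in results.items():
    print(word, checks)
```

"McDonald" passes both first-letter tests but is not title-cased, and "Ørsted" fails only the ASCII range test. Both first-letter helpers index word[0], so they raise IndexError on an empty string, whereas "".istitle() simply returns False.
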
Re: Newbie question - better way to do this?
--- Nis Jørgensen [EMAIL PROTECTED] wrote:
> I disagree that word.istitle is the correct idiom - from the naming of the function in the original example, I would guess word[0].isupper would do the trick.

<nitpick>
That would return something like this:

    <built-in method isupper of str object at 0x13ade0>

You want to add parens:

    word[0].isupper()
</nitpick>

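The missing parentheses matter beyond the odd-looking output: a method object is always truthy, so using it in a condition silently accepts every word. A small Python 3 sketch, with a lowercase sample word chosen here for illustration:

```python
word = "python"  # lowercase sample word, an assumption for this example

without_parens = word[0].isupper   # a bound method object, not a bool
with_parens = word[0].isupper()    # the actual answer for this word

# A method object is always truthy, so this "check" passes
# even though the first letter is lowercase.
buggy_check = bool(without_parens)

print(without_parens)
print(with_parens, buggy_check)
```
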
Newbie question - better way to do this?
I have some working code, but I realized it is just the way I would write it in C, which means there is probably a better (more pythonic) way of doing it. Here's the section of code:

    accumulate = firstIsCaps = False
    accumStart = i = 0
    while i < len(words):
        firstIsCaps = firstIsCapitalized(words[i])
        if firstIsCaps and not accumulate:
            accumStart, accumulate = i, True
        elif accumulate and not firstIsCaps:
            doSomething(words[accumStart : i])
            accumulate = False
        i += 1

words is a big long array of strings. What I want to do is find consecutive sequences of words that have the first letter capitalized, and then call doSomething on them. (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).

Thanks,
Eric

Re: Newbie question - better way to do this?
On Sun, 27 May 2007 10:44:01 -0300, Eric [EMAIL PROTECTED] wrote:
> I have some working code, but I realized it is just the way I would write it in C, which means there is probably a better (more pythonic) way of doing it. Here's the section of code:
>
>     accumulate = firstIsCaps = False
>     accumStart = i = 0
>     while i < len(words):
>         firstIsCaps = firstIsCapitalized(words[i])
>         if firstIsCaps and not accumulate:
>             accumStart, accumulate = i, True
>         elif accumulate and not firstIsCaps:
>             doSomething(words[accumStart : i])
>             accumulate = False
>         i += 1
>
> words is a big long array of strings. What I want to do is find consecutive sequences of words that have the first letter capitalized, and then call doSomething on them. (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).

Using groupby:

    py> from itertools import groupby
    py>
    py> words = "Este es un Ejemplo. Los Ejemplos usualmente son tontos. Yo siempre escribo tonterias.".split()
    py>
    py> for upper, group in groupby(words, str.istitle):
    ...     if upper:
    ...         print list(group)  # doSomething(list(group))
    ...
    ['Este']
    ['Ejemplo.', 'Los', 'Ejemplos']
    ['Yo']

You could use your firstIsCapitalized function instead of the string method istitle(), but I think it's the same.

See http://docs.python.org/lib/itertools-functions.html

--
Gabriel Genellina

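The list(group) call in the session above is doing real work. Each group that groupby yields is a lazy iterator sharing the underlying input, and the itertools documentation notes that a previous group is no longer visible once groupby advances. A hedged Python 3 sketch of the difference, using a shortened word list and variable names (eager, stored, deferred) invented for this example:

```python
from itertools import groupby

# Shortened sample list, an assumption for this illustration.
words = ["Este", "es", "un", "Ejemplo.", "Los", "Ejemplos"]

# Materialize each group while it is still the current one.
eager = [(key, list(group)) for key, group in groupby(words, str.istitle)]

# Store the group iterators first and read them only afterwards:
# by then groupby has moved on, so the groups no longer yield
# their original items.
stored = list(groupby(words, str.istitle))
deferred = [(key, list(group)) for key, group in stored]

print(eager)
print(deferred)
```

The keys are the same in both results, but only the eager version keeps the words, which is why the posted loops call list() on the group before passing it on.
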
Re: Newbie question - better way to do this?
On Sun, 27 May 2007 06:44:01 -0700, Eric wrote:
> words is a big long array of strings. What I want to do is find consecutive sequences of words that have the first letter capitalized, and then call doSomething on them. (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).

Assuming the list of words will fit into memory, and you can probably expect to fit anything up to millions of words comfortably into memory, something like this might be suitable:

    list_of_words = "lots of words go here".split()
    accumulator = []
    for word in list_of_words:
        if word.istitle():
            accumulator.append(word)
        else:
            doSomething(accumulator)
            accumulator = []

--
Steven.

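One detail of the loop as posted is easy to miss: the else branch runs for every non-capitalized word, so doSomething also receives empty lists. A Python 3 sketch that records the calls instead of making them; the function name run_posted_loop, the calls list standing in for doSomething, and the sample sentence are all assumptions for this illustration.

```python
def run_posted_loop(words):
    """Run the loop as posted and record each would-be doSomething call."""
    calls = []          # stands in for doSomething (hypothetical recorder)
    accumulator = []
    for word in words:
        if word.istitle():
            accumulator.append(word)
        else:
            # Reached for every non-title word, even when nothing
            # has been accumulated yet.
            calls.append(accumulator)
            accumulator = []
    return calls

calls = run_posted_loop("lots of Words Go here".split())
print(calls)
```

Changing the else to elif accumulator: would skip the empty calls, if doSomething is not meant to see them.
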
Re: Newbie question - better way to do this?
--- Eric [EMAIL PROTECTED] wrote:
> I have some working code, but I realized it is just the way I would write it in C, which means there is probably a better (more pythonic) way of doing it. Here's the section of code:
>
>     accumulate = firstIsCaps = False
>     accumStart = i = 0
>     while i < len(words):
>         firstIsCaps = firstIsCapitalized(words[i])
>         if firstIsCaps and not accumulate:
>             accumStart, accumulate = i, True
>         elif accumulate and not firstIsCaps:
>             doSomething(words[accumStart : i])
>             accumulate = False
>         i += 1
>
> words is a big long array of strings. What I want to do is find consecutive sequences of words that have the first letter capitalized, and then call doSomething on them. (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).

Try out this program:

    def doSomething(stuff):
        print stuff

    def firstIsCapitalized(word):
        return 'A' <= word[0] <= 'Z'

    def orig_code(words):
        print 'C-style'
        accumulate = firstIsCaps = False
        accumStart = i = 0
        while i < len(words):
            firstIsCaps = firstIsCapitalized(words[i])
            if firstIsCaps and not accumulate:
                accumStart, accumulate = i, True
            elif accumulate and not firstIsCaps:
                doSomething(words[accumStart : i])
                accumulate = False
            i += 1

    def another_way(words):
        print 'more idiomatic, with minor bug fix'
        group = []
        for word in words:
            if firstIsCapitalized(word):
                group.append(word)
            elif group:
                doSomething(group)
                group = []
        if group:
            doSomething(group)

    orig_code(['foo', 'Python', 'Ruby', 'c', 'xxx', 'Perl'])
    another_way(['foo', 'Python', 'Ruby', 'c', 'xxx', 'Perl'])

See also the groupby function in itertools.

http://docs.python.org/lib/itertools-functions.html

Re: Newbie question - better way to do this?
On May 28, 12:46 am, Steven D'Aprano [EMAIL PROTECTED] wrote:
> On Sun, 27 May 2007 06:44:01 -0700, Eric wrote:
> > words is a big long array of strings. What I want to do is find consecutive sequences of words that have the first letter capitalized, and then call doSomething on them. (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).
>
> Assuming the list of words will fit into memory, and you can probably expect to fit anything up to millions of words comfortably into memory, something like this might be suitable:
>
>     list_of_words = "lots of words go here".split()
>     accumulator = []
>     for word in list_of_words:
>         if word.istitle():
>             accumulator.append(word)
>         else:
>             doSomething(accumulator)
>             accumulator = []

Bzzzt. Needs the following code at the end:

    if accumulator:
        doSomething(accumulator)

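To see what the extra two lines change, here is a minimal Python 3 sketch that runs the same accumulator loop with and without the final flush. The function name collect_runs and its flush_tail flag are hypothetical, the elif accumulator: guard is an addition relative to the quoted loop, and the sample list is the one from Steve Howell's test program.

```python
def collect_runs(words, flush_tail):
    """Collect runs of title-cased words; optionally flush the last run."""
    runs = []
    accumulator = []
    for word in words:
        if word.istitle():
            accumulator.append(word)
        elif accumulator:           # guard added here; not in the quoted loop
            runs.append(accumulator)
            accumulator = []
    # The suggested fix: hand over whatever is still pending at the end.
    if flush_tail and accumulator:
        runs.append(accumulator)
    return runs

sample = ['foo', 'Python', 'Ruby', 'c', 'xxx', 'Perl']
without_flush = collect_runs(sample, flush_tail=False)
with_flush = collect_runs(sample, flush_tail=True)

print(without_flush)
print(with_flush)
```

Without the flush the trailing run ['Perl'] is dropped, which is the behaviour the original poster said was acceptable for his purposes.
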
Re: Newbie question - better way to do this?
--- John Machin [EMAIL PROTECTED] wrote:
> > > (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).
> [...]
>
> Bzzzt. Needs the following code at the end:
>
>     if accumulator:
>         doSomething(accumulator)

FWIW the OP already conceded that bug, but you're right that it's a common anti-pattern, which is just a nice word for bug. :)

The itertools.groupby() function is a well-intended attempt to steer folks away from this anti-pattern, although I think it has usability issues, mostly related to the docs (see other thread).

Re: Newbie question - better way to do this?
On Sun, 27 May 2007 14:55:42 -0700, John Machin wrote:
> On May 28, 12:46 am, Steven D'Aprano [EMAIL PROTECTED] wrote:
> > On Sun, 27 May 2007 06:44:01 -0700, Eric wrote:
> > > words is a big long array of strings. What I want to do is find consecutive sequences of words that have the first letter capitalized, and then call doSomething on them. (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).
> >
> > Assuming the list of words will fit into memory, and you can probably expect to fit anything up to millions of words comfortably into memory, something like this might be suitable:
> >
> >     list_of_words = "lots of words go here".split()
> >     accumulator = []
> >     for word in list_of_words:
> >         if word.istitle():
> >             accumulator.append(word)
> >         else:
> >             doSomething(accumulator)
> >             accumulator = []
>
> Bzzzt. Needs the following code at the end:
>
>     if accumulator:
>         doSomething(accumulator)

Bzzzt! Somebody didn't read the Original Poster's comment "And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes."

Of course, for somebody whose requirements _aren't_ broken, you would be completely right. Besides, I'm under no obligation to write all the O.P.'s code for him, just point him in the right direction.

--
Steven.

Re: Newbie question - better way to do this?
--- Steven D'Aprano wrote:
> On Sun, 27 May 2007 14:55:42 -0700, John Machin wrote:
> > Bzzzt.
>
> Bzzzt!

Can we please refrain from buzzer sounds in this mostly civil forum, even if one beep deserves another?

Re: Newbie question - better way to do this?
Eric [EMAIL PROTECTED] writes:
> words is a big long array of strings. What I want to do is find consecutive sequences of words that have the first letter capitalized, and then call doSomething on them. (And you can ignore the fact that it won't find a sequence at the very end of words, that is fine for my purposes).

As another poster suggested, use itertools.groupby:

    for cap, g in groupby(words, firstIsCapitalized):
        if cap:
            doSomething(list(g))

This will handle sequences at the end of words just like other sequences.
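
The claim about trailing sequences can be checked with a short Python 3 sketch. The firstIsCapitalized below is only a stand-in for the original poster's helper (it uses word[:1] so that an empty string does not raise), the wrapper runs_with_groupby is a hypothetical name, and the sample list is the one from Steve Howell's test program.

```python
from itertools import groupby

def firstIsCapitalized(word):
    # Stand-in for the OP's helper; word[:1] avoids IndexError on "".
    return word[:1].isupper()

def runs_with_groupby(words):
    # Keep only the groups whose key is True, materializing each
    # group while it is still the current one.
    return [list(g) for cap, g in groupby(words, firstIsCapitalized) if cap]

runs = runs_with_groupby(['foo', 'Python', 'Ruby', 'c', 'xxx', 'Perl'])
print(runs)
```

The final run ['Perl'] comes out without any special end-of-input handling, because groupby closes the last group when the input is exhausted.
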