Re: Newbie question - better way to do this?

2007-05-28 Thread Nis Jørgensen
Steve Howell wrote:

 def firstIsCapitalized(word):
     return 'A' <= word[0] <= 'Z'

For someone who is worried about the impact of non-ascii identifiers,
you are making surprising assumptions about the contents of data.
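
A minimal sketch of the assumption in question, written for Python 3, where str is Unicode. The sample words and the helper names ascii_first_is_capitalized and first_is_upper are made up for illustration. The range test only recognizes A through Z, while str.isupper() also recognizes capitals outside ASCII:

```python
def ascii_first_is_capitalized(word):
    # The range check from the quoted function: only 'A'..'Z' count.
    return 'A' <= word[0] <= 'Z'

def first_is_upper(word):
    # Unicode-aware alternative: ask the character itself.
    return word[0].isupper()

samples = ["Python", "python", "Ørsted", "Émile"]
ascii_results = [ascii_first_is_capitalized(w) for w in samples]
unicode_results = [first_is_upper(w) for w in samples]
print(ascii_results)    # [True, False, False, False]
print(unicode_results)  # [True, False, True, True]
```

Both helpers raise IndexError on an empty string, so real data may need a guard.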

Nis
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Newbie question - better way to do this?

2007-05-28 Thread Steve Howell

--- Nis Jørgensen [EMAIL PROTECTED] wrote:

 Steve Howell wrote:
 
  def firstIsCapitalized(word):
      return 'A' <= word[0] <= 'Z'
 
 For someone who is worried about the impact of
 non-ascii identifiers,
 you are making surprising assumptions about the
 contents of data.
 

The function there, which I don't even remotely
defend, had nothing to do with the main point of the
thread.  Others pointed out that the correct idiom here is
word.istitle(), but the main point of the thread was
how to prevent the newbie from essentially reinventing
itertools.groupby.

If you want to point out holes in my logic about the
impact of non-ascii identifiers (the above code,
though bad, does not suggest a hole in my logic; it
perhaps even adds to my case), can you kindly do it in
a thread where that's the main point of the
discussion?




   


Re: Newbie question - better way to do this?

2007-05-28 Thread Nis Jørgensen
Steve Howell wrote:
 --- Nis Jørgensen [EMAIL PROTECTED] wrote:
 
 Steve Howell wrote:

 def firstIsCapitalized(word):
     return 'A' <= word[0] <= 'Z'
 For someone who is worried about the impact of
 non-ascii identifiers,
 you are making surprising assumptions about the
 contents of data.

 
 The function there, which I don't even remotely
 defend, had nothing to do with the main point of the
 thread.  Others pointed out that the correct idiom here is
 word.istitle(), but the main point of the thread was
 how to prevent the newbie from essentially reinventing
 itertools.groupby.

The subject line says "Newbie question - better way to do this". I was
hinting at a better way to do what you did, which was supposedly a
better way of doing what the newbie wanted.

I disagree that word.istitle is the correct idiom - from the naming of
the function in the original example, I would guess word[0].isupper
would do the trick.
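
The two candidates do not test the same thing, which a short sketch can make concrete. This is for Python 3, and the sample words are made up for illustration. str.istitle() requires the whole word to be title-cased, while checking the first character (called with parentheses) only looks at that one character:

```python
samples = ["Python", "McDonald", "USA", "python"]

# Whole-word test: an uppercase letter may only follow an uncased character.
title_results = [w.istitle() for w in samples]

# First-letter test: ignores everything after the first character.
first_letter_results = [w[0].isupper() for w in samples]

print(title_results)         # [True, False, False, False]
print(first_letter_results)  # [True, True, True, False]
```

Which one is right depends on whether words like "McDonald" and "USA" should count as capitalized for the task at hand.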

 If you want to point out holes in my logic about the
 impact of non-ascii identifiers (the above code,
 though bad, does not suggest a hole in my logic; it
 perhaps even adds to my case), can you kindly do it in
 a thread where that's the main point of the
 discussion?

Will do.

Nis


Re: Newbie question - better way to do this?

2007-05-28 Thread Steve Howell

--- Nis Jørgensen [EMAIL PROTECTED] wrote:

 
 I disagree that word.istitle is the correct idiom -
 from the naming of
 the function in the original example, I would guess
 word[0].isupper
 would do the trick.
 

<nitpick>
That would return something like this:

   <built-in method isupper of str object at 0x13ade0>

You want to add parens:

word[0].isupper()
</nitpick>
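
A short sketch of why the missing parentheses matter in practice (Python 3; the variable names are illustrative only). The method object itself is always truthy, so a test written without the call would pass for every word:

```python
word = "python"

method = word[0].isupper     # no call: a bound method object
result = word[0].isupper()   # call: an actual bool

print(callable(method))  # True
print(bool(method))      # True, even though 'p' is lowercase
print(result)            # False
```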





   


Newbie question - better way to do this?

2007-05-27 Thread Eric
I have some working code, but I realized it is just the way I would
write it in C, which means there is probably a better (more pythonic)
way of doing it.

Here's the section of code:

accumulate = firstIsCaps = False
accumStart = i = 0
while i < len(words):
    firstIsCaps = firstIsCapitalized(words[i])
    if firstIsCaps and not accumulate:
        accumStart, accumulate = i, True
    elif accumulate and not firstIsCaps:
        doSomething(words[accumStart : i])
        accumulate = False
    i += 1

words is a big long array of strings.  What I want to do is find
consecutive sequences of words that have the first letter capitalized,
and then call doSomething on them.  (And you can ignore the fact that
it won't find a sequence at the very end of words, that is fine for my
purposes).

Thanks,
Eric



Re: Newbie question - better way to do this?

2007-05-27 Thread Gabriel Genellina
On Sun, 27 May 2007 10:44:01 -0300, Eric [EMAIL PROTECTED] wrote:

 I have some working code, but I realized it is just the way I would
 write it in C, which means there is probably a better (more pythonic)
 way of doing it.

 Here's the section of code:

 accumulate = firstIsCaps = False
 accumStart = i = 0
 while i < len(words):
     firstIsCaps = firstIsCapitalized(words[i])
     if firstIsCaps and not accumulate:
         accumStart, accumulate = i, True
     elif accumulate and not firstIsCaps:
         doSomething(words[accumStart : i])
         accumulate = False
     i += 1

 words is a big long array of strings.  What I want to do is find
 consecutive sequences of words that have the first letter capitalized,
 and then call doSomething on them.  (And you can ignore the fact that
 it won't find a sequence at the very end of words, that is fine for my
 purposes).

Using groupby:

py> from itertools import groupby
py>
py> words = "Este es un Ejemplo. Los Ejemplos usualmente son tontos. Yo siempre escribo tonterias.".split()
py>
py> for upper, group in groupby(words, str.istitle):
...     if upper:
...         print(list(group)) # doSomething(list(group))
...
['Este']
['Ejemplo.', 'Los', 'Ejemplos']
['Yo']

You could use your firstIsCapitalized function instead of the string
method istitle(), but I think it's the same. See
http://docs.python.org/lib/itertools-functions.html
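
For readers on Python 3, the same idea can be sketched as a small generator. The name capitalized_runs and its is_capitalized parameter are hypothetical, introduced here only to show how the predicate can be swapped:

```python
from itertools import groupby

def capitalized_runs(words, is_capitalized=str.istitle):
    """Yield each run of consecutive words for which is_capitalized is true."""
    for flag, group in groupby(words, is_capitalized):
        if flag:
            # Materialize the group before groupby moves on to the next one.
            yield list(group)

words = ("Este es un Ejemplo. Los Ejemplos usualmente son tontos. "
         "Yo siempre escribo tonterias.").split()

runs = list(capitalized_runs(words))
print(runs)  # [['Este'], ['Ejemplo.', 'Los', 'Ejemplos'], ['Yo']]

# Swapping in a first-letter test gives the same runs for this input.
runs_first_letter = list(capitalized_runs(words, lambda w: w[:1].isupper()))
```

The two predicates agree on this sentence but can differ on other data, so the choice depends on what counts as capitalized.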

-- 
Gabriel Genellina



Re: Newbie question - better way to do this?

2007-05-27 Thread Steven D'Aprano
On Sun, 27 May 2007 06:44:01 -0700, Eric wrote:

 words is a big long array of strings.  What I want to do is find
 consecutive sequences of words that have the first letter capitalized,
 and then call doSomething on them.  (And you can ignore the fact that
 it won't find a sequence at the very end of words, that is fine for my
 purposes).

Assuming the list of words will fit into memory, and you can probably
expect to fit anything up to millions of words comfortably into memory,
something like this might be suitable:

list_of_words = "lots of words go here".split()

accumulator = []
for word in list_of_words:
    if word.istitle():
        accumulator.append(word)
    else:
        doSomething(accumulator)
        accumulator = []


-- 
Steven.



Re: Newbie question - better way to do this?

2007-05-27 Thread Steve Howell

--- Eric [EMAIL PROTECTED] wrote:

 I have some working code, but I realized it is just
 the way I would
 write it in C, which means there is probably a
 better (more pythonic)
 way of doing it.
 
 Here's the section of code:
 
 accumulate = firstIsCaps = False
 accumStart = i = 0
 while i < len(words):
     firstIsCaps = firstIsCapitalized(words[i])
     if firstIsCaps and not accumulate:
         accumStart, accumulate = i, True
     elif accumulate and not firstIsCaps:
         doSomething(words[accumStart : i])
         accumulate = False
     i += 1
 
 words is a big long array of strings.  What I want
 to do is find
 consecutive sequences of words that have the first
 letter capitalized,
 and then call doSomething on them.  (And you can
 ignore the fact that
 it won't find a sequence at the very end of words,
 that is fine for my
 purposes).
 

Try out this program:

def doSomething(stuff):
    print(stuff)

def firstIsCapitalized(word):
    return 'A' <= word[0] <= 'Z'

def orig_code(words):
    print('C-style')
    accumulate = firstIsCaps = False
    accumStart = i = 0
    while i < len(words):
        firstIsCaps = firstIsCapitalized(words[i])
        if firstIsCaps and not accumulate:
            accumStart, accumulate = i, True
        elif accumulate and not firstIsCaps:
            doSomething(words[accumStart : i])
            accumulate = False
        i += 1

def another_way(words):
    print('more idiomatic, with minor bug fix')
    group = []
    for word in words:
        if firstIsCapitalized(word):
            group.append(word)
        elif group:
            doSomething(group)
            group = []
    # Flush a run that reaches the end of the list.
    if group:
        doSomething(group)

orig_code(['foo', 'Python', 'Ruby', 'c', 'xxx',
           'Perl'])
another_way(['foo', 'Python', 'Ruby', 'c', 'xxx',
             'Perl'])

See also the groupby method in itertools.

http://docs.python.org/lib/itertools-functions.html
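
To make the difference between the two approaches easy to check, here is a sketch for Python 3 that returns the runs instead of printing them. The function names index_loop and grouping_loop are hypothetical, and the index loop is rewritten with enumerate as an assumption about equivalent behavior:

```python
def first_is_capitalized(word):
    return 'A' <= word[0] <= 'Z'

def index_loop(words):
    """Index-based approach: a run is only reported when a later word ends it."""
    found = []
    accumulate = False
    accum_start = 0
    for i, word in enumerate(words):
        caps = first_is_capitalized(word)
        if caps and not accumulate:
            accum_start, accumulate = i, True
        elif accumulate and not caps:
            found.append(words[accum_start:i])
            accumulate = False
    return found

def grouping_loop(words):
    """List-building approach with a final flush for a trailing run."""
    found = []
    group = []
    for word in words:
        if first_is_capitalized(word):
            group.append(word)
        elif group:
            found.append(group)
            group = []
    if group:
        found.append(group)
    return found

sample = ['foo', 'Python', 'Ruby', 'c', 'xxx', 'Perl']
print(index_loop(sample))     # [['Python', 'Ruby']]
print(grouping_loop(sample))  # [['Python', 'Ruby'], ['Perl']]
```

The only difference on this input is the trailing ['Perl'] run, which the index loop never reports.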




   


Re: Newbie question - better way to do this?

2007-05-27 Thread John Machin
On May 28, 12:46 am, Steven D'Aprano
[EMAIL PROTECTED] wrote:
 On Sun, 27 May 2007 06:44:01 -0700, Eric wrote:
  words is a big long array of strings.  What I want to do is find
  consecutive sequences of words that have the first letter capitalized,
  and then call doSomething on them.  (And you can ignore the fact that
  it won't find a sequence at the very end of words, that is fine for my
  purposes).

 Assuming the list of words will fit into memory, and you can probably
 expect to fit anything up to millions of words comfortably into memory,
 something like this might be suitable:

 list_of_words = "lots of words go here".split()

 accumulator = []
 for word in list_of_words:
     if word.istitle():
         accumulator.append(word)
     else:
         doSomething(accumulator)
         accumulator = []

Bzzzt. Needs the following code at the end:
if accumulator:
    doSomething(accumulator)
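
To see what the trailing check changes, here is a sketch for Python 3 that wraps the quoted loop in a function and records each list that would be passed to doSomething. The names accumulate_runs and flush_at_end, and the sample sentence, are hypothetical:

```python
def accumulate_runs(list_of_words, flush_at_end):
    """Return the lists the quoted loop would hand to doSomething."""
    calls = []
    accumulator = []
    for word in list_of_words:
        if word.istitle():
            accumulator.append(word)
        else:
            # As in the quoted loop, this fires even when nothing was collected.
            calls.append(accumulator)
            accumulator = []
    if flush_at_end and accumulator:
        calls.append(accumulator)
    return calls

words = "see Monty Python and Holy Grail".split()

without_flush = accumulate_runs(words, flush_at_end=False)
with_flush = accumulate_runs(words, flush_at_end=True)
print(without_flush)  # [[], ['Monty', 'Python']]
print(with_flush)     # [[], ['Monty', 'Python'], ['Holy', 'Grail']]
```

The empty list in both results comes from the leading non-capitalized word; changing the else branch to `elif accumulator:` would skip such calls.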



Re: Newbie question - better way to do this?

2007-05-27 Thread Steve Howell

--- John Machin [EMAIL PROTECTED] wrote:

 (And you can ignore the fact that it won't find a
 sequence at the very end of words, that is fine for
 my purposes).
 [...]

 Bzzzt. Needs the following code at the end:
 if accumulator:
     doSomething(accumulator)
 

FWIW the OP already conceded that bug, but you're
right that it's a common anti-pattern, which is just a
nice word for bug. :)

The itertools.groupby() function is a well-intended
attempt to steer folks away from this anti-pattern,
although I think it has usability issues, mostly
related to the docs (see other thread).
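
One groupby behavior that tends to surprise people, sketched here for Python 3 with a made-up word list: each group is a lazy view onto the same underlying iterator, so it has to be consumed before groupby advances to the next group.

```python
from itertools import groupby

words = ['foo', 'Python', 'Ruby', 'c', 'Perl']

# Pitfall: keep the group iterators and read them only after the loop
# has already moved on. The earlier groups come back empty.
lazy = [(flag, group) for flag, group in groupby(words, str.istitle)]
late = [list(group) for flag, group in lazy]

# Safer: turn each group into a list before asking for the next one.
eager = [(flag, list(group)) for flag, group in groupby(words, str.istitle)]

print(late[:2])  # [[], []]
print(eager)
# [(False, ['foo']), (True, ['Python', 'Ruby']), (False, ['c']), (True, ['Perl'])]
```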




   


Re: Newbie question - better way to do this?

2007-05-27 Thread Steven D'Aprano
On Sun, 27 May 2007 14:55:42 -0700, John Machin wrote:

 On May 28, 12:46 am, Steven D'Aprano
 [EMAIL PROTECTED] wrote:
 On Sun, 27 May 2007 06:44:01 -0700, Eric wrote:
  words is a big long array of strings.  What I want to do is find
  consecutive sequences of words that have the first letter capitalized,
  and then call doSomething on them.  (And you can ignore the fact that
  it won't find a sequence at the very end of words, that is fine for my
  purposes).

 Assuming the list of words will fit into memory, and you can probably
 expect to fit anything up to millions of words comfortably into memory,
 something like this might be suitable:

 list_of_words = "lots of words go here".split()

 accumulator = []
 for word in list_of_words:
     if word.istitle():
         accumulator.append(word)
     else:
         doSomething(accumulator)
         accumulator = []

 Bzzzt. Needs the following code at the end:
 if accumulator:
     doSomething(accumulator)


Bzzzt! Somebody didn't read the Original Poster's comment "And you can
ignore the fact that it won't find a sequence at the very end of words,
that is fine for my purposes."

Of course, for somebody whose requirements _aren't_ broken, you would be
completely right. Besides, I'm under no obligation to write all the O.P.'s
code for him, just point him in the right direction.


-- 
Steven.



Re: Newbie question - better way to do this?

2007-05-27 Thread Steve Howell

--- Steven D'Aprano  wrote:

 On Sun, 27 May 2007 14:55:42 -0700, John Machin
 wrote:
  Bzzzt. 
 Bzzzt!

Can we please refrain from buzzer sounds in this
mostly civil forum, even if one beep deserves another?




 



Re: Newbie question - better way to do this?

2007-05-27 Thread Paul Rubin
Eric [EMAIL PROTECTED] writes:
 words is a big long array of strings.  What I want to do is find
 consecutive sequences of words that have the first letter capitalized,
 and then call doSomething on them.  (And you can ignore the fact that
 it won't find a sequence at the very end of words, that is fine for my
 purposes).

As another poster suggested, use itertools.groupby:

for cap, g in groupby(words, firstIsCapitalized):
    if cap: doSomething(list(g))

This will handle sequences at the end of words just like other sequences.
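
A small sketch of that claim for Python 3, under a few assumptions: firstIsCapitalized is a stand-in for the original poster's helper, word_stream is a hypothetical source that yields words one at a time, and results are collected in a list instead of calling doSomething. It also shows that groupby needs no indexing, so the words do not have to be in a list:

```python
from itertools import groupby

def firstIsCapitalized(word):
    # Assumed stand-in for the original poster's helper.
    return word[:1].isupper()

def word_stream():
    # Hypothetical source that yields words one at a time.
    for word in "the Knights who say Ni Ni".split():
        yield word

handled = []
for cap, g in groupby(word_stream(), firstIsCapitalized):
    if cap:
        handled.append(list(g))

print(handled)  # [['Knights'], ['Ni', 'Ni']]
```

The final ['Ni', 'Ni'] run is reported even though no later word ends it.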