Stefan Ram wrote:
Mark Bourne <nntp.mbou...@spamgourmet.com> wrote or quoted:
I don't think there's a tuple being created.  If you mean:
     ( word for word in list_ if word[ 0 ]== 'e' )
...that's not creating a tuple.  It's a generator expression, which
generates the next value each time it's called for.  If you only ever
ask for the first item, it only generates that one.
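
A small sketch of that laziness, for illustration only. The helper `starts_with_e` and the list `words` are made up for this example and are not from the thread; the helper just records which words the generator actually inspects.

```python
inspected = []

def starts_with_e(word):
    # Hypothetical helper: record every word the generator looks at.
    inspected.append(word)
    return word[0] == 'e'

words = ['apple', 'eagle', 'echo', 'zebra']

# Parentheses around a comprehension build a generator, not a tuple.
gen = (word for word in words if starts_with_e(word))
print(type(gen).__name__)   # generator
print(inspected)            # [] (no word has been tested yet)

# Asking for one item runs the generator only up to the first match.
first = next(gen, '')
print(first)                # eagle
print(inspected)            # ['apple', 'eagle']
```

'echo' and 'zebra' are never tested, because nothing asks the generator for a second item.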

   Yes, that's also how I understand it!

   In the meantime, I wrote code for a microbenchmark, shown below.

   This code, when executed on my computer, shows that the
   next+generator approach is a bit faster than the procedural
   break approach. But when the order of the two approaches is
   swapped in the loop, it comes out a bit slower. So let's say
   the two take about the same time.

There could be some caching going on, meaning whichever is done second comes out a bit faster.

   However, I also tested code with an early return (not shown below),
   and this was shown to be faster than both code using break and
   code using next+generator by a factor of about 1.6, even though
   the code with return has the "function call overhead"!

To be honest, that's how I'd probably write it - not because of any thought that it might be faster, but just because it's clearer. And if there's a `do_something_else()` that needs to be called regardless of whether a word was found, split it into two functions:
```
def first_word_beginning_with_e(target, wordlist):
    for w in wordlist:
        if w.startswith(target):
            return w
    return ''

def find_word_and_do_something_else(target, wordlist):
    result = first_word_beginning_with_e(target, wordlist)
    do_something_else()
    return result
```

   But please be aware that such results depend on the Python
   implementation and version used for the benchmark, and also
   on the details of how exactly the benchmark is written.

import random
import string
import timeit

print( 'The following loop may need a few seconds or minutes, '
'so please bear with me.' )

time_using_break = 0
time_using_next = 0

for repetition in range( 100 ):
     for i in range( 100 ): # Yes, this nesting is redundant!

         list_ = \
         [ ''.join \
           ( random.choices \
             ( string.ascii_lowercase, k=random.randint( 1, 30 )))
           for i in range( random.randint( 0, 50 ))]

         start_time = timeit.default_timer()
         for word in list_:
             if word[ 0 ]== 'e':
                 word_using_break = word
                 break
         else:
             word_using_break = ''
         time_using_break += timeit.default_timer() - start_time

         start_time = timeit.default_timer()
         word_using_next = \
         next( ( word for word in list_ if word[ 0 ]== 'e' ), '' )
         time_using_next += timeit.default_timer() - start_time

         if word_using_next != word_using_break:
             raise Exception( 'word_using_next != word_using_break' )

print( f'{time_using_break = }' )
print( f'{time_using_next = }' )
print( f'{time_using_next / time_using_break = }' )
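
For comparison, here is one possible way to time all three variants (break, next+generator, early return) with `timeit.repeat`. This is only a sketch under assumptions, not the code used for the numbers quoted above: the function names, the fixed seed, and the `number`/`repeat` values are chosen for illustration. Unlike the loop above, every variant is wrapped in a function here, so all three pay the same call overhead and all use local variables. That changes what is being measured and may well shrink the factor of about 1.6 reported for early return.

```python
import random
import string
import timeit

def first_e_break(words):
    # Procedural loop with break and a for/else fallback.
    for word in words:
        if word[0] == 'e':
            result = word
            break
    else:
        result = ''
    return result

def first_e_next(words):
    # next() on a generator expression, with '' as the default.
    return next((word for word in words if word[0] == 'e'), '')

def first_e_return(words):
    # Early return as soon as a match is found.
    for word in words:
        if word[0] == 'e':
            return word
    return ''

random.seed(0)  # assumption: fixed seed so each run uses the same data
lists = [
    [''.join(random.choices(string.ascii_lowercase, k=random.randint(1, 30)))
     for _ in range(random.randint(0, 50))]
    for _ in range(1000)
]

def run(func):
    # Apply one variant to every test list.
    for words in lists:
        func(words)

timings = {}
for func in (first_e_break, first_e_next, first_e_return):
    # Take the minimum of several repeats to reduce noise from other
    # processes and from whichever variant happens to run first.
    timings[func.__name__] = min(
        timeit.repeat(lambda: run(func), number=10, repeat=5))

for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s')
```

Taking the minimum over several repeats is a common way to reduce the influence of run order and caching, but the caveat above still applies: the ranking can differ between Python versions and implementations.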

--
https://mail.python.org/mailman/listinfo/python-list
