Re: [Tutor] Class-based generator
Michael O'Leary wrote:

> I wrote some code to create tasks to be run in a queue based system last
> week. It consisted of a big monolithic function that consisted of two
> parts:
> 1) read data from a file and create dictionaries and lists to
> iterate through
> 2) iterate through the lists creating a job data file and a task for the
> queue one at a time until all of the data is dealt with
>
> My boss reviewed my code and said that it would be more reusable and
> Pythonic if I refactored it as a generator that created job data files and
> iterated by calling the generator and putting a task on the queue for each
> job data file that was obtained.
>
> This made sense to me, and since the code does a bunch of conversion of
> the data in the input file(s) to make it easier and faster to iterate
> through the data, I decided to create a class for the generator and put
> that conversion code into its __init__ function. So the class looked like
> this:
>
> class JobFileGenerator:
>     def __init__(self, filedata, output_file_prefix, job_size):
>
>
>     def next(self):
>         while :
>
>
> The problem is that the generator object is not created until you call
> next(), so the calling code has to look like this:
>
> gen = JobFileGenerator(data, "output_", 20).next()
> for datafile in gen.next():
>
>
> This code works OK, but I don't like that it needs to call next() once to
> get a generator and then call next() again repeatedly to get the data for
> the jobs. If I were to write this without a class as a single generator
> function, it would not have to do this, but it would have the monolithic
> structure that my boss objected to.
>
> Would it work to do this:
>
> for datafile in JobFileGenerator(data, "output_", 20).next():
>
>
> or would that cause the JobFileGenerator's __init__ function to be called
> more than once? Are there examples I could look at of generator functions
> defined on classes similar to this, or is it considered a bad idea to mix
> the two paradigms?
> Thanks,
> Mike

You are abusing the next() method: in your code it is called only once, to
build a generator, instead of once per item. The convention for a method
that builds a generator is to use either a descriptive name (jobs() or
some such) or __iter__():

    class JobFile:
        def __init__(self, filedata, output_file_prefix, job_size):

        def __iter__(self):
            while :

    for job in JobFile(data, "output_", 20):

Here the generator is created by the implicit call to JobFile.__iter__() at
the start of the for loop. Subsequent iterations call next() on the
generator returned by that call.

If you want the class itself to generate items you need a different
approach:

    class JobFileIter:
        def __init__(self, filedata, output_file_prefix, job_size):
            self._done = False

        def __iter__(self):
            return self

        def next(self):
            if self._done or :
                self._done = True
                raise StopIteration
            return

    for job in JobFileIter(data, "output_", 20):

Here __iter__() returns the JobFileIter instance, so for every iteration of
the for loop JobFileIter.next() will be called -- until a StopIteration is
raised.

That said, it is often sufficient to refactor complex code into a few
dedicated functions -- Python is not Java, after all.

PS I'm assuming Python 2 -- for Python 3 the next() method must be replaced
by __next__().

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
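For readers who want something they can run, here is a minimal Python 3 sketch of the two patterns described in the reply above: an iterable class whose __iter__() is a generator method, and an iterator class that returns itself from __iter__(). The thread does not show the real method bodies, so the chunking logic, the attribute names (records, prefix) and the shape of a "job" (a file name plus a slice of records) are all assumptions made purely for illustration.

```python
class JobFile:
    """Iterable: __iter__ is a generator method, so every for loop
    gets its own fresh generator."""

    def __init__(self, filedata, output_file_prefix, job_size):
        # One-time setup. The real conversion code is not shown in the
        # thread; copying the input into a list is a stand-in.
        self.records = list(filedata)
        self.prefix = output_file_prefix
        self.job_size = job_size

    def __iter__(self):
        # Hypothetical job: a (file name, chunk of records) pair.
        for start in range(0, len(self.records), self.job_size):
            name = "%s%d" % (self.prefix, start // self.job_size)
            yield name, self.records[start:start + self.job_size]


class JobFileIter:
    """Iterator: __iter__ returns self and __next__ hands out one job
    per call until the data runs out."""

    def __init__(self, filedata, output_file_prefix, job_size):
        self.records = list(filedata)
        self.prefix = output_file_prefix
        self.job_size = job_size
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):  # spelled next() in Python 2
        if self._pos >= len(self.records):
            raise StopIteration
        name = "%s%d" % (self.prefix, self._pos // self.job_size)
        chunk = self.records[self._pos:self._pos + self.job_size]
        self._pos += self.job_size
        return name, chunk


jobs = JobFile(range(5), "output_", 2)
first_pass = list(jobs)
second_pass = list(jobs)   # a new generator is created, so the jobs repeat

it = JobFileIter(range(5), "output_", 2)
iter_first = list(it)
iter_second = list(it)     # the iterator is already exhausted
```

One practical difference shows up in the last lines: the JobFile instance can be looped over repeatedly, while a JobFileIter instance is used up after a single pass.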
Re: [Tutor] Class-based generator
On 18 February 2013 07:36, Michael O'Leary wrote:
> I wrote some code to create tasks to be run in a queue based system last
> week. It consisted of a big monolithic function that consisted of two parts:
> 1) read data from a file and create dictionaries and lists to iterate
> through
> 2) iterate through the lists creating a job data file and a task for the
> queue one at a time until all of the data is dealt with
>
> My boss reviewed my code and said that it would be more reusable and
> Pythonic if I refactored it as a generator that created job data files and
> iterated by calling the generator and putting a task on the queue for each
> job data file that was obtained.
>
> This made sense to me, and since the code does a bunch of conversion of the
> data in the input file(s) to make it easier and faster to iterate through
> the data, I decided to create a class for the generator and put that
> conversion code into its __init__ function. So the class looked like this:

It's not a "generator" if you create a class for it. Your class is (trying
to be) an iterator.

> class JobFileGenerator:
>     def __init__(self, filedata, output_file_prefix, job_size):
>
>
>     def next(self):
>         while :
>

next() should return a single item, not a generator that yields items. If
you rename the next function to __iter__, the class becomes a proper
iterable: the for loop calls __iter__() once and then iterates over the
generator it returns.

I suspect however that your boss just wants you to write a single generator
function rather than an iterator class. For example:

    def generate_jobs():
        while :
            yield

    for job in generate_jobs():
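A generator function does not have to be monolithic. The sketch below (Python 3) keeps the one-time conversion in a separate helper and leaves only the iteration in the generator. The helper name prepare(), the parameters, and the idea that a job is a file name plus a slice of records are assumptions for illustration; the thread does not show the actual code.

```python
def prepare(filedata):
    # Hypothetical stand-in for the one-time conversion step
    # (building the dictionaries and lists mentioned in the question).
    return list(filedata)


def generate_jobs(filedata, output_file_prefix, job_size):
    # The body of a generator function does not run until the first
    # item is requested, so prepare() is called once, at that point.
    records = prepare(filedata)
    for start in range(0, len(records), job_size):
        name = "%s%d" % (output_file_prefix, start // job_size)
        yield name, records[start:start + job_size]


job_names = [name for name, chunk in generate_jobs("abcde", "output_", 2)]
```

The calling code stays a plain for loop over generate_jobs(...), and the setup logic can be tested separately from the iteration.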
[Tutor] Class-based generator
I wrote some code to create tasks to be run in a queue based system last
week. It consisted of a big monolithic function that consisted of two parts:
1) read data from a file and create dictionaries and lists to iterate
through
2) iterate through the lists creating a job data file and a task for the
queue one at a time until all of the data is dealt with

My boss reviewed my code and said that it would be more reusable and
Pythonic if I refactored it as a generator that created job data files and
iterated by calling the generator and putting a task on the queue for each
job data file that was obtained.

This made sense to me, and since the code does a bunch of conversion of the
data in the input file(s) to make it easier and faster to iterate through
the data, I decided to create a class for the generator and put that
conversion code into its __init__ function. So the class looked like this:

    class JobFileGenerator:
        def __init__(self, filedata, output_file_prefix, job_size):

        def next(self):
            while :

The problem is that the generator object is not created until you call
next(), so the calling code has to look like this:

    gen = JobFileGenerator(data, "output_", 20).next()
    for datafile in gen.next():

This code works OK, but I don't like that it needs to call next() once to
get a generator and then call next() again repeatedly to get the data for
the jobs. If I were to write this without a class as a single generator
function, it would not have to do this, but it would have the monolithic
structure that my boss objected to.

Would it work to do this:

    for datafile in JobFileGenerator(data, "output_", 20).next():

or would that cause the JobFileGenerator's __init__ function to be called
more than once? Are there examples I could look at of generator functions
defined on classes similar to this, or is it considered a bad idea to mix
the two paradigms?

Thanks,
Mike
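On the question of whether __init__ would run more than once: in a for statement the expression after "in" is evaluated a single time, before the first iteration, so the constructor runs once. The small Python 3 sketch below checks this with a counter. The class name CountingJobs, the method name jobs() and the counter itself are hypothetical and exist only for the demonstration.

```python
class CountingJobs:
    init_calls = 0  # class-level counter, for demonstration only

    def __init__(self, items):
        CountingJobs.init_calls += 1
        self.items = list(items)

    def jobs(self):
        # Generator method: calling it returns a generator object.
        for item in self.items:
            yield item


seen = []
# CountingJobs("abc").jobs() is evaluated once; the loop then pulls
# items from the resulting generator.
for job in CountingJobs("abc").jobs():
    seen.append(job)
```

After the loop, CountingJobs.init_calls is 1 and seen holds all three items. The same reasoning should apply to the one-line form in the question, as long as the method being called is a generator function.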