Re: [Tutor] Chunking list/array data?

2019-08-22 Thread Cameron Simpson

On 21Aug2019 21:26, Sarah Hembree  wrote:

> How do you chunk data? We came up with the below snippet. It works (with
> integer list data) for our needs, but it seems so clunky.
>
>    def _chunks(lst: list, size: int) -> list:
>        return [lst[x:x+size] for x in range(0, len(lst), size)]
>
> What do you do? Also, what about doing this lazily so as to keep memory
> drag at a minimum?


This looks pretty good to me. But as you say, it constructs the complete 
list of chunks and returns them all. For many chunks that is both slow 
and memory hungry.


If you want to conserve memory and return chunks in a lazy manner you 
can rewrite this as a generator. A first cut might look like this:


  from typing import Iterator

  def _chunks(lst: list, size: int) -> Iterator[list]:
      for x in range(0, len(lst), size):
          yield lst[x:x+size]

which makes _chunks() a generator function: calling it returns an iterator 
which yields each chunk one at a time. The body of the function is kept 
"running", but stalled. When you iterate over the return value of 
_chunks(), Python runs that stalled function until it yields a value, 
then stalls it again and hands you that value.
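
To watch the stalling happen, here is a small sketch. The name chunks_gen 
and the print() call are my own additions for illustration only; the 
print() just shows when the function body actually runs.

```python
from typing import Iterator


def chunks_gen(lst: list, size: int) -> Iterator[list]:
    """Yield successive slices of lst, each at most size items long."""
    for x in range(0, len(lst), size):
        # Illustration only: shows when the body resumes.
        print("producing chunk starting at index", x)
        yield lst[x:x + size]


it = chunks_gen([1, 2, 3, 4, 5], 2)  # nothing runs yet, nothing is printed
first = next(it)                     # body runs up to the first yield
rest = list(it)                      # resumes the body until it finishes
```

Calling chunks_gen() prints nothing; the first message appears only when 
next() asks for a chunk, and the remaining messages appear as list() 
drains the iterator.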


Python also has a thing called a "generator expression". Your original 
function uses a "list comprehension": it constructs a list of values and 
returns that list. As noted above, for very long lists that can be both 
slow and memory hungry. You can rewrite such a thing like this:


   from typing import Iterator

   def _chunks(lst: list, size: int) -> Iterator[list]:
       return (lst[x:x+size] for x in range(0, len(lst), size))

Replacing the square brackets with parentheses turns this into a 
generator expression. It returns an iterator instead of a list, which 
behaves like the generator function I sketched and generates the chunks 
lazily.
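
A short side-by-side sketch of the two forms may help. The variable names 
here are illustrative only. It also shows one thing to watch for: unlike a 
list, a generator can be consumed only once.

```python
import types

data = list(range(10))
size = 4

# List comprehension: every chunk is built immediately.
eager = [data[x:x + size] for x in range(0, len(data), size)]

# Generator expression: chunks are built one at a time, on demand.
lazy = (data[x:x + size] for x in range(0, len(data), size))

is_generator = isinstance(lazy, types.GeneratorType)
first_pass = list(lazy)   # consumes the generator
second_pass = list(lazy)  # already exhausted, so this is empty
```

If you need to walk over the chunks more than once, either keep the list 
version or call the function again to get a fresh iterator.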


Cheers,
Cameron Simpson 
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Chunking list/array data?

2019-08-22 Thread Peter Otten
Sarah Hembree wrote:

> How do you chunk data? We came up with the below snippet. It works (with
> integer list data) for our needs, but it seems so clunky.
> 
> def _chunks(lst: list, size: int) -> list:
>     return [lst[x:x+size] for x in range(0, len(lst), size)]
> 
> What do you do? Also, what about doing this lazily so as to keep memory
> drag at a minimum?

If you don't mind filling up the last chunk with dummy values this will 
generate tuples on demand from an arbitrary iterable:

>>> from itertools import zip_longest
>>> def chunks(items, n):
...     return zip_longest(*[iter(items)]*n)
... 
>>> chunked = chunks("abcdefgh", 3)
>>> next(chunked)
('a', 'b', 'c')
>>> list(chunked)
[('d', 'e', 'f'), ('g', 'h', None)]
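
If the padding is unwanted, one possible alternative (a different 
technique from the zip_longest one above, sketched here with the 
hypothetical name chunks_nopad) is to pull n items at a time with 
itertools.islice and stop when nothing is left. It assumes n is a 
positive integer.

```python
from itertools import islice
from typing import Iterable, Iterator


def chunks_nopad(items: Iterable, n: int) -> Iterator[tuple]:
    """Yield tuples of up to n items; the last tuple may be shorter."""
    it = iter(items)  # one shared iterator, so each islice continues on
    while True:
        chunk = tuple(islice(it, n))
        if not chunk:  # the iterable is exhausted
            return
        yield chunk


result = list(chunks_nopad("abcdefgh", 3))
```

Like the zip_longest version, this works on any iterable and produces 
tuples on demand, but the final tuple simply holds whatever items remain.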




[Tutor] Chunking list/array data?

2019-08-22 Thread Sarah Hembree
How do you chunk data? We came up with the below snippet. It works (with
integer list data) for our needs, but it seems so clunky.

def _chunks(lst: list, size: int) -> list:
    return [lst[x:x+size] for x in range(0, len(lst), size)]

What do you do? Also, what about doing this lazily so as to keep memory
drag at a minimum?


--- We not only inherit the Earth from our Ancestors, we borrow it from our
Children. Aspire to grace.