On 12/27/16 10:13 PM, j...@math.brown.edu wrote: > Applying this advice to the use cases above would require creating an > arbitrarily large tuple in memory before passing it to hash(), which > is then just thrown away. It would be preferable if there were a way > to pass multiple values to hash() in a streaming fashion, such that > the overall hash were computed incrementally, without building up a > large object in memory first. > > Should there be better support for this use case? Perhaps hash() could > support an alternative signature, allowing it to accept a stream of > values whose combined hash would be computed incrementally in > *constant* space and linear time, e.g. "hash(items=iter(self))". You can write a simple function to use hash iteratively to hash the entire stream in constant space and linear time:
def hash_stream(them): val = 0 for it in them: val = hash((val, it)) return val Although this creates N 2-tuples, they come and go, so the memory use won't grow. Adjust the code as needed to achieve canonicalization before iterating. Or maybe I am misunderstanding the requirements? --Ned.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/