On 12/27/16 10:13 PM, j...@math.brown.edu wrote:
> Applying this advice to the use cases above would require creating an
> arbitrarily large tuple in memory before passing it to hash(), which
> is then just thrown away. It would be preferable if there were a way
> to pass multiple values to hash() in a streaming fashion, such that
> the overall hash were computed incrementally, without building up a
> large object in memory first.
>
> Should there be better support for this use case? Perhaps hash() could
> support an alternative signature, allowing it to accept a stream of
> values whose combined hash would be computed incrementally in
> *constant* space and linear time, e.g. "hash(items=iter(self))".
You can write a simple function to use hash iteratively to hash the
entire stream in constant space and linear time:

    def hash_stream(them):
        val = 0
        for it in them:
            val = hash((val, it))
        return val

Although this creates N 2-tuples, they come and go, so the memory use
won't grow.  Adjust the code as needed to achieve canonicalization
before iterating.

Or maybe I am misunderstanding the requirements?

--Ned.
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to