On 2024-03-31 00:09, marc nicole via Python-list wrote:
I am creating a memoization example with a function that adds up / averages
the elements of an array and compares it with the cached ones to retrieve
them in case they are already stored.

In addition, I want to store only if the result of the function differs
considerably (passes a threshold e.g. 500000 below).

I created an example using a decorator to do so, the results using the
decorator is slightly faster than without the memoization which is OK, but
is the logic of the decorator correct ? anybody can tell me ?

My code is attached below:



import time


def memoize(f):
     cache = {}

     def g(*args):
         if args[1] == "avg":
             sum_key_arr = sum(list(args[0])) / len(list(args[0]))

'list' will iterate over args[0] to make a list, and 'sum' will iterate over that list.

It would be simpler to just let 'sum' iterate over args[0].

         elif args[1] == "sum":
             sum_key_arr = sum(list(args[0]))
         if sum_key_arr not in cache:
             for (
                 key,
                 value,
             ) in (
                 cache.items()
             ):  # key in dict cannot be an array so I use the sum of the
array as the key

You can't use a list as a key, but you can use a tuple as a key, provided that the elements of the tuple are also immutable.

                 if (
                     abs(sum_key_arr - key) <= 500000
                 ):  # threshold is great here so that all values are
approximated!
                     # print('approximated')
                     return cache[key]
             else:
                 # print('not approximated')
                 cache[sum_key_arr] = f(args[0], args[1])
         return cache[sum_key_arr]

     return g


@memoize
def aggregate(dict_list_arr, operation):
     if operation == "avg":
         return sum(list(dict_list_arr)) / len(list(dict_list_arr))
     if operation == "sum":
         return sum(list(dict_list_arr))
     return None


t = time.time()
for i in range(200, 15000):
     res = aggregate(list(range(i)), "avg")

elapsed = time.time() - t
print(res)
print(elapsed)


--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to