Re: Collecting list of errors across executors

2021-08-03 Thread Abdeali Kothari
You could create a custom accumulator backed by a list (or a similar collection). Some examples that could help:
https://towardsdatascience.com/custom-pyspark-accumulators-310f63ca3c8c
https://stackoverflow.com/questions/34798578/how-to-create-custom-list-accumulator-i-e-listint-int
On Tue, Aug 3, 2021 at 1:23
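To make the suggestion above concrete, here is a minimal sketch of a list-valued accumulator in PySpark. It is not taken from the thread or the linked posts: the names ListAccumulatorParam and collect_errors are hypothetical, the body of do_something is a stand-in for the poster's real function, and sc is assumed to be an already-created SparkContext. The ImportError fallback is only there so the merge logic can be tried without Spark installed.

```python
try:
    from pyspark.accumulators import AccumulatorParam
except ImportError:
    # Fallback so the merge logic below can be exercised without Spark.
    class AccumulatorParam:
        pass


class ListAccumulatorParam(AccumulatorParam):
    """Accumulator whose value is a list; updates are concatenated."""

    def zero(self, initial_value):
        # Every task starts from an empty list.
        return []

    def addInPlace(self, acc, update):
        # Used for `acc += [...]` inside tasks and for merging the
        # per-task results back on the driver.
        acc.extend(update)
        return acc


def do_something(x, errors):
    # Hypothetical stand-in for the do_something in the question.
    try:
        return 10 / x
    except Exception as exc:
        # Record the failure instead of letting the task die.
        errors += [f"{x!r}: {exc}"]


def collect_errors(sc, data):
    """Run do_something over data and return the collected error messages.

    sc is assumed to be an existing SparkContext.
    """
    errors = sc.accumulator([], ListAccumulatorParam())
    sc.parallelize(data).foreach(lambda x: do_something(x, errors))
    # The accumulated value is only readable on the driver.
    return errors.value
```

With a running context, collect_errors(sc, [1, 0, 2]) should return a one-element list describing the failure for input 0. The order of messages across partitions is not guaranteed, and if the list of errors can grow large, writing them out from the executors may be a better fit than an accumulator.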

Collecting list of errors across executors

2021-08-03 Thread Sachit Murarka
Hi Team, we are using rdd.foreach(lambda x: do_something(x)). Our use case requires collecting, in a list, the error messages raised in the exception block of do_something. Since this runs on the executors, a global list won't work here. As the state needs to be