It turns out you can easily use a Python set as the accumulator value, so I can send back a list of failed files. Thanks.
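
For the archives, here is a minimal sketch of how a set can be used this way. It assumes the set is wrapped in a custom AccumulatorParam, and it is not necessarily the exact code used. The names SetAccumulatorParam, check_file and run are hypothetical, and the ".csv" check is only a stand-in for whatever validation somefunction does.

```python
try:
    from pyspark.accumulators import AccumulatorParam
except ImportError:
    # Assumption for illustration: fall back to a plain base class so the
    # merge logic can be tried on a machine without Spark installed.
    AccumulatorParam = object


class SetAccumulatorParam(AccumulatorParam):
    """Accumulator whose value is a Python set; merging is set union."""

    def zero(self, initial_value):
        # Every task starts from an empty set.
        return set()

    def addInPlace(self, left, right):
        # Union the right-hand set into the left-hand one and return it.
        left |= right
        return left


def check_file(path, failed):
    """Hypothetical stand-in for somefunction: reports paths that fail."""
    if not path.endswith(".csv"):
        # Add a one-element set; Spark merges it into the total by union.
        failed.add({path})
        return None
    return path.upper()


def run(sc, paths):
    """Map over paths on the workers and collect failures on the driver."""
    failed = sc.accumulator(set(), SetAccumulatorParam())
    results = (sc.parallelize(paths)
                 .map(lambda p: check_file(p, failed))
                 .filter(lambda r: r is not None)
                 .collect())
    # The accumulator is only populated after an action (collect) has run.
    return results, failed.value
```

With a live SparkContext, run(sc, ["a.csv", "b.txt"]) should hand back the processed results together with the set of paths that failed. Because the merge is a set union, a task that is retried adds the same path again without creating duplicates.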

On Wed, Jun 15, 2016 at 4:28 PM Ted Yu <yuzhih...@gmail.com> wrote:

> Have you looked at:
>
> https://spark.apache.org/docs/latest/programming-guide.html#accumulators
>
> On Wed, Jun 15, 2016 at 1:24 PM, Mathieu Longtin <math...@closetwork.org>
> wrote:
>
>> Is there a way to report warnings from the workers back to the driver
>> process?
>>
>> Let's say I have an RDD and do this:
>>
>> newrdd = rdd.map(somefunction)
>>
>> In *somefunction*, I want to catch when there are invalid values in *rdd*
>> and either put them in another RDD or send some sort of message back.
>>
>> Is that possible?
>> --
>> Mathieu Longtin
>> 1-514-803-8977
>>
>

--
Mathieu Longtin
1-514-803-8977