Re: Using Spark Accumulators with Structured Streaming

2020-05-15 Thread ZHANG Wei
There is a restriction in the AccumulatorV2 API [1]: the OUT type should be atomic or thread safe. I'm wondering whether the implementation for `java.util.Map[T, Long]` can meet it. Is there any chance to replace CollectionLongAccumulator with CollectionAccumulator[2] or LongAccumulator[3] and
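The thread-safety concern raised here can be illustrated outside Spark. What follows is a minimal sketch in plain Python, not Spark's AccumulatorV2 API: a hypothetical `CountingMapAccumulator` guards a key-to-count map with a lock so that concurrent `add` calls and `merge` do not lose updates. Every class, method, and function name in it is an assumption made for illustration.

```python
import threading
from collections import Counter


class CountingMapAccumulator:
    """Hypothetical key -> count accumulator; illustrative only, not part of Spark."""

    def __init__(self):
        self._counts = Counter()
        self._lock = threading.Lock()

    def add(self, key, amount=1):
        # The lock makes the read-modify-write on the map atomic.
        with self._lock:
            self._counts[key] += amount

    def merge(self, other):
        # Take the other side's snapshot first, so two locks are never held at once.
        snapshot = other.value()
        with self._lock:
            self._counts.update(snapshot)

    def value(self):
        # Return a copy so callers cannot mutate internal state.
        with self._lock:
            return dict(self._counts)


def run_demo(num_threads=4, adds_per_thread=1000):
    """Increment one key from several threads and return the final map."""
    acc = CountingMapAccumulator()

    def worker():
        for _ in range(adds_per_thread):
            acc.add("events")

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acc.value()
```

Returning a copy from `value()` is a deliberate choice in this sketch: it keeps a reader from observing or altering the map while another thread is updating it.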

spark on k8s - can driver and executor have separate checkpoint location?

2020-05-15 Thread wzhan
Hi guys, I'm running Spark applications on Kubernetes. According to the Spark documentation https://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing Spark needs a distributed file system to store its checkpoint data so that, in case of failure, it can recover from checkpoint

unsubscribe

2020-05-15 Thread Basavaraj

Re: Calling HTTP Rest APIs from Spark Job

2020-05-15 Thread Chetan Khatri
Hi Sean, thanks for the great answer. What I am trying to do is use something like Scala Future (cats-effect IO) to make concurrent calls. I was trying to understand whether there are any limits or thresholds on making those calls. On Thu, May 14, 2020 at 7:28 PM Sean Owen wrote: > No, it means # HTTP calls = # executor
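One way to think about limits on concurrent calls is to cap the number of requests in flight per partition. The following is a minimal sketch in plain Python and is not taken from the thread: `fake_http_call`, `call_partition`, and `max_in_flight` are hypothetical names, and the fake call stands in for a real HTTP request so the example runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_http_call(record):
    # Stand-in for a real HTTP request (assumption: one call per record).
    return {"id": record, "status": 200}


def call_partition(records, max_in_flight=8, call=fake_http_call):
    """Process one partition's records with at most max_in_flight concurrent calls."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # pool.map returns results in input order, regardless of completion order.
        return list(pool.map(call, records))
```

If a function of this shape were applied once per partition, the total number of simultaneous calls would be roughly `max_in_flight` multiplied by the number of tasks running at the same time, which is the figure to compare against whatever rate limit the remote service imposes.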

Re: Using Spark Accumulators with Structured Streaming

2020-05-15 Thread Something Something
Can someone from the Spark development team tell me whether this functionality is supported and tested? I've spent a lot of time on this but can't get it to work. Just to add more context, we have our own Accumulator class that extends AccumulatorV2. In this class we keep track of one or more