> […] those results to what you're seeing from Spark. The results you posted
> from Spark didn't show any incoming messages at all.
>
> On Sat, Nov 19, 2016 at 11:12 AM, Hster Geguri
> <hster.investiga...@gmail.com> wrote:
> > Hi Cody,
> >
> > Thank you for te[…]
[…]k is indeed seeing offsets for each partition.
>
>
> The results you posted look to me like there aren't any messages going
> into the other partitions, which looks like a misbehaving producer.
>
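To make the "misbehaving producer" diagnosis concrete: if a producer sends every record with the same (or a constant) key, key-based partitioning routes every message to a single partition, so the other partitions legitimately receive nothing. The sketch below is a toy model of that behavior using an MD5 hash (Kafka's default partitioner actually uses murmur2, but the principle is identical):

```python
# Toy model of key-based partitioning (illustration only -- Kafka's default
# partitioner uses murmur2, not MD5): a producer that sends every record
# with the same key puts all messages into one partition.
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    # Stable hash of the key, modulo the partition count.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Constant key: every record lands in the same partition.
constant = {partition_for("same-key", 10) for _ in range(1000)}

# Varying keys: records spread across the partitions.
varied = {partition_for(f"key-{i}", 10) for i in range(1000)}
```

Checking the producer's keying strategy (or a null-key round-robin fallback) is usually the first step when only one partition shows traffic.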
> On Thu, Nov 17, 2016 at 5:58 PM, Hster Geguri
> <hster.investiga...@gmail.com>
Our team is trying to upgrade to Spark 2.0.2 / Kafka 0.10.1.0, and we have
been struggling with this show-stopper problem.
When we run our drivers with auto.offset.reset=latest, ingesting from a
single Kafka topic with 10 partitions, the driver reads correctly from all
10 partitions.
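For reference, these are the standard new-consumer settings such a driver would pass to the Kafka 0.10 direct stream; the broker address and group id below are placeholder assumptions, not values from this thread:

```python
# Sketch of consumer parameters for the Spark 2.0 / Kafka 0.10 direct
# stream. Broker address and group id are placeholders.
kafka_params = {
    "bootstrap.servers": "broker1:9092",       # placeholder assumption
    "group.id": "my-streaming-group",          # placeholder assumption
    "key.deserializer": "org.apache.kafka.common.serialization.StringDeserializer",
    "value.deserializer": "org.apache.kafka.common.serialization.StringDeserializer",
    # "latest" only controls where a group with no committed offsets starts
    # reading; it does not affect which partitions get assigned or receive data.
    "auto.offset.reset": "latest",
    "enable.auto.commit": "false",
}
```

Note that auto.offset.reset cannot cause partitions to go empty; it only picks the starting offset when no committed offset exists.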
However, when we […]
[…], Hster Geguri <hster.investiga...@gmail.com> wrote:
> Hello everyone,
>
> We are testing checkpointing against YARN 2.7.1 with Spark 1.5. We are
> trying to make sure checkpointing works with orderly shutdowns (i.e., yarn
> application -kill) and unexpected shutdowns, which we simu[…]
Is there any way to set the underlying AWS client connection socket timeout
for the Kinesis requests made in spark-streaming-kinesis-asl?
Currently we get socket timeouts, which appear to default to about 120
seconds, on driver restarts, causing all kinds of backup. We'd like to
shorten it to 10 seconds.

> In your case, it could be
> happening that because of your killing and restarting, the restarted KCL
> may be taking a while to get new lease and start getting data again.
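As a generic illustration of what the requested knob does (this is stdlib Python, not AWS SDK or KCL code; the Java SDK exposes the equivalent setting on its client configuration): a client-side read timeout bounds how long a blocked read waits before aborting, which is what a 120-second default would stretch out across a driver restart.

```python
# Generic illustration of a client-side socket read timeout (stdlib only;
# NOT AWS SDK code). The server accepts the connection but never replies,
# so the client's recv() aborts once the configured timeout elapses.
import socket

def read_times_out(timeout_s: float = 0.5) -> bool:
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))       # OS picks a free port
    srv.listen(1)
    port = srv.getsockname()[1]
    cli = socket.create_connection(("127.0.0.1", port), timeout=timeout_s)
    conn, _ = srv.accept()
    try:
        cli.recv(1)                  # blocks; no data ever arrives
        return False
    except socket.timeout:
        return True                  # read aborted after ~timeout_s
    finally:
        cli.close()
        conn.close()
        srv.close()
```

Shortening the timeout trades slower-failure detection for quicker recovery after a restart; whether spark-streaming-kinesis-asl exposes this setting is exactly the open question in this thread.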
>
> On Mon, Nov 2, 2015 at 11:26 AM, Hster Geguri <
> hster.investiga...@gmail.com> wrote:
>
>
Hello Wonderful Spark Peoples,
We are testing AWS Kinesis / Spark Streaming (1.5) failover behavior with
Hadoop/YARN 2.6 and 2.7.1 and want to understand the expected behavior.
When I manually kill a YARN application master/driver with a Linux kill -9,
YARN will automatically relaunch another master […]
We are using Kinesis with Spark Streaming 1.5 on a YARN cluster. When we
enable checkpointing in Spark, where in the Kinesis stream should a
restarted driver continue? I ran a simple experiment as follows:
1. In the first driver run, the Spark driver processes 1 million records
starting from […]
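The behavior the experiment is probing can be sketched with a toy, file-based checkpoint (this is NOT Spark's or the KCL's checkpoint mechanism, just the resume-from-checkpoint idea): the driver durably records the last processed position, and a restarted run continues from it rather than from the beginning or the tip of the stream.

```python
# Toy illustration of checkpoint-and-resume (NOT Spark's checkpointing):
# persist the last processed position so a restarted run continues where
# the previous one stopped.
import json
import os
import tempfile

def run(records, checkpoint_path, crash_after=None):
    """Process records from the checkpointed position; optionally 'crash'."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["position"]
    processed = []
    for i in range(start, len(records)):
        processed.append(records[i])
        with open(checkpoint_path, "w") as f:  # checkpoint after each record
            json.dump({"position": i + 1}, f)
        if crash_after is not None and len(processed) == crash_after:
            break                              # simulate a killed driver
    return processed

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
data = list(range(10))
first = run(data, ckpt, crash_after=4)   # "driver" dies after 4 records
second = run(data, ckpt)                 # restarted driver resumes at record 5
```

In the real system the analogous position is the Kinesis sequence number checkpointed by the KCL, so the expected answer is that a restarted driver resumes near the last checkpointed sequence number, reprocessing at most the records since the last checkpoint.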