Could you post the screenshot of the Streaming DAG and also the driver log?
It would be great if you have a simple producer for us to debug.
On Mon, Feb 29, 2016 at 1:39 AM, Abhishek Anand wrote:
Hi Ryan,
It's not working even after removing the reduceByKey.
So, basically I am doing the following:
- reading from kafka
- flatMap inside transform
- mapWithState
- rdd.count on the output of mapWithState
But to my surprise I still don't see checkpointing taking place.
Is there any restriction to the…
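For what it's worth, the reason the `rdd.count` step matters at all is laziness: nothing upstream runs until an output action consumes the stream. This is not Spark code, just a stdlib sketch of the same idea (class and method names are made up):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Plain java.util.stream, not Spark: intermediate operations are lazy and
// only run when a terminal operation consumes the pipeline, which is the
// same reason a DStream needs an output action such as rdd.count().
public class LazyDemo {
    static final AtomicInteger mapRuns = new AtomicInteger();

    static Stream<Integer> pipeline() {
        return Stream.of(1, 2, 3)
                .map(x -> { mapRuns.incrementAndGet(); return x * 2; });
    }

    public static void main(String[] args) {
        pipeline();                               // defined but never forced
        System.out.println("before: " + mapRuns.get());  // before: 0
        pipeline().collect(Collectors.toList());  // terminal op forces the map
        System.out.println("after: " + mapRuns.get());   // after: 3
    }
}
```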
Sorry, I forgot to tell you that you should also call `rdd.count()` on the "reduceByKey" output. Could you try it and see if it works?
On Sat, Feb 27, 2016 at 1:17 PM, Abhishek Anand wrote:
Hi Ryan,
I am using mapWithState after doing reduceByKey.
I am right now using mapWithState as you suggested and triggering the count manually.
But, I am still unable to see any checkpointing taking place. In the DAG I can see that the reduceByKey operations for the previous batches are also being computed.
Hey Abhi,
Using reduceByKeyAndWindow and mapWithState together will trigger the bug in SPARK-6847. Here is a workaround to trigger the checkpoint manually:

JavaMapWithStateDStream<...> stateDStream =
    myPairDstream.mapWithState(StateSpec.function(mappingFunc));
stateDStream.foreachRDD(new VoidFunction<JavaRDD<...>>() {
  @Override
  public void call(JavaRDD<...> rdd) {
    rdd.count();  // forces the RDD so the periodic checkpoint can run
  }
});
Hey Ted,
As the fix for SPARK-6847 changes the semantics of Streaming checkpointing, it doesn't go into branch 1.6.
A workaround is calling `count` to trigger the checkpoint manually. Such as:

val dstream = ... // dstream is an operator that needs to be checkpointed
dstream.foreachRDD(rdd => rdd.count())
The fix for SPARK-6847 is not in branch-1.6.
Should it be ported to branch-1.6?
Thanks
On Feb 22, 2016, at 11:55 AM, Shixiong(Ryan) Zhu wrote:
Hi Ryan,
Reposting the code.
Basically my use case is something like this: I am receiving the web impression logs and may get the notify (listening from kafka) for those impressions in the same interval (for me it's 1 min) or any later interval (up to 2 hours).
Now, when I receive the notify for a particular…
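Outside Spark, the per-key logic described here could be sketched roughly as follows. This is not the mapWithState API itself; the class, method, and constant names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the per-key state logic: impressions wait in
// state until the matching notify arrives, and entries older than the
// 2-hour window are expired.
public class ImpressionMatcher {
    static final long WINDOW_MS = 2 * 60 * 60 * 1000L;

    // state: impressionId -> arrival time in millis
    private final Map<String, Long> pending = new HashMap<>();

    // returns the impression id if this event completed a match
    public Optional<String> onEvent(String id, boolean isNotify, long nowMs) {
        pending.entrySet().removeIf(e -> nowMs - e.getValue() > WINDOW_MS);
        if (isNotify) {
            // a notify matches only if the impression is still in state
            return pending.remove(id) != null ? Optional.of(id) : Optional.empty();
        }
        pending.put(id, nowMs);  // store the impression and wait for its notify
        return Optional.empty();
    }
}
```

An impression stored at t=0 and notified inside the window matches; one notified after two hours has already been dropped from state.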
Hey Abhi,
Could you post how you use mapWithState? By default, it should do
checkpointing every 10 batches.
However, there is a known issue that prevents mapWithState from
checkpointing in some special cases:
https://issues.apache.org/jira/browse/SPARK-6847
On Mon, Feb 22, 2016 at 5:47 AM, Abhishek Anand wrote:
Any insights on this one?
Thanks!
Abhi
On Mon, Feb 15, 2016 at 11:08 PM, Abhishek Anand wrote:
I am now trying to use mapWithState in the following way, using some example code. But, looking at the DAG, it does not seem to checkpoint the state, and when restarting the application from the checkpoint, it re-partitions all the previous batches' data from kafka.

static Function3<..., ..., State<...>, Tuple2<..., ...>> mappingFunc = …
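In Spark 1.6's Java API, the mapping function passed to StateSpec.function has the shape Function3&lt;KeyType, Optional&lt;ValueType&gt;, State&lt;StateType&gt;, MappedType&gt;. Here is a self-contained sketch using minimal stand-ins for Spark's State and Function3 (not the real classes, and Spark 1.6 actually takes Guava's Optional rather than java.util.Optional), in the style of the docs' stateful word count:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Optional;

// Minimal stand-ins for Spark's State and Function3 so the shape of a
// mapWithState mapping function is visible without a Spark dependency.
// The real types are org.apache.spark.streaming.State and
// org.apache.spark.api.java.function.Function3.
public class MappingFuncSketch {
    static class State<S> {
        private S value;
        boolean exists() { return value != null; }
        S get() { return value; }
        void update(S s) { value = s; }
    }

    interface Function3<A, B, C, R> { R call(A a, B b, C c); }

    // running count per key, in the style of the docs' stateful word count
    static final Function3<String, Optional<Integer>, State<Integer>, Map.Entry<String, Integer>>
            mappingFunc = (key, value, state) -> {
                int sum = value.orElse(0) + (state.exists() ? state.get() : 0);
                state.update(sum);  // carry the running sum into the next batch
                return new SimpleEntry<>(key, sum);
            };

    public static void main(String[] args) {
        State<Integer> state = new State<>();
        System.out.println(mappingFunc.call("word", Optional.of(1), state)); // word=1
        System.out.println(mappingFunc.call("word", Optional.of(2), state)); // word=3
    }
}
```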
If you don't want to upgrade, then your only option will be updateStateByKey.
On 13 Feb 2016 8:48 p.m., "Ted Yu" wrote:
mapWithState supports checkpoint.
There have been some bug fixes since the release of 1.6.0, e.g.
SPARK-12591 (NullPointerException using checkpointed mapWithState with KryoSerializer),
which is in the upcoming 1.6.1.
Cheers
On Sat, Feb 13, 2016 at 12:05 PM, Abhishek Anand wrote:
Does mapWithState checkpoint the data?
When my application goes down and is restarted from the checkpoint, will mapWithState need to recompute the previous batches' data?
Also, to use mapWithState I will need to upgrade my application, as I am using version 1.4.0 and mapWithState isn't supported there…
Looks like mapWithState could help you?
On 11 Feb 2016 8:40 p.m., "Abhishek Anand" wrote:
Hi All,
I have a use case like the following in my production environment, where I am listening from kafka with a slideInterval of 1 min and a windowLength of 2 hours.
I have a JavaPairDStream where for each key I am getting the same key but with a different value, which might appear in the same batch or some…