Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Bin Wang
Hope it will be delivered soon.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-09-24 13:45 GMT+08:00 Bin Wang <wbi...@gmail.com>:
>
>> I've read the source code and it seems to be impossible, but I'd like to
>> confirm it.
>>
>> It is a very useful feature.

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Bin Wang
doesn't have a doc now...
>
>
> Best Regards,
> Shixiong Zhu
>
> 2015-09-24 17:26 GMT+08:00 Bin Wang <wbi...@gmail.com>:
>
>> Data that is not updated should have been saved earlier: when data is added
>> to the DStream for the first time, it should be considered a

Re: Get only updated RDDs from or after updateStateBykey

2015-09-24 Thread Bin Wang
.disconnect())
> }
>
>
> Best Regards,
> Shixiong Zhu
>
> 2015-09-24 17:42 GMT+08:00 Bin Wang <wbi...@gmail.com>:
>
>> It seems like a workaround. But I don't know how to get the database
>> connection from the worker nodes.
>>
>> Shixiong Zh

Checkpoint directory structure

2015-09-23 Thread Bin Wang
I find that the checkpoint directory structure looks like this:
-rw-r--r-- 1 root root 134820 2015-09-23 16:55 /user/root/checkpoint/checkpoint-144299850
-rw-r--r-- 1 root root 134768 2015-09-23 17:00 /user/root/checkpoint/checkpoint-144299880
-rw-r--r-- 1 root root 134895

Re: Checkpoint directory structure

2015-09-23 Thread Bin Wang
wrote at 9:45 AM:
> Could you provide the logs on when and how you are seeing this error?
>
> On Wed, Sep 23, 2015 at 6:32 PM, Bin Wang <wbi...@gmail.com> wrote:
>
>> BTW, I just killed the application and restarted it. Then the application
>> could not recover from the checkpoint becaus

Re: Checkpoint directory structure

2015-09-23 Thread Bin Wang
BTW, I just killed the application and restarted it. Then the application could not recover from the checkpoint because some RDDs were lost. So I wonder: if there is a failure in the application, is it possible that it will not be able to recover from the checkpoint?
Bin Wang <wbi...@gmail.com> wrote on Wed, Sep 23, 2015

Get only updated RDDs from or after updateStateBykey

2015-09-23 Thread Bin Wang
I've read the source code and it seems to be impossible, but I'd like to confirm it. It is a very useful feature. For example, I need to store the state of the DStream in my database in order to recover it after the next redeploy. But I only need to save the updated keys. Saving all keys into the database
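One way to approximate "only the updated keys" is to carry a flag inside the state itself. The following is a minimal sketch, not a confirmed answer from the thread: the names `update_with_flag` and `only_updated` and the choice of an integer running total as the state are assumptions made for illustration. The update function follows the `(new_values, last_state)` shape that PySpark's `updateStateByKey` expects, but it is plain Python and runs without Spark.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical state layout: (running_total, updated_in_this_batch).
State = Tuple[int, bool]


def update_with_flag(new_values: List[int],
                     last_state: Optional[State]) -> Optional[State]:
    """Update function that records whether the key changed in this batch."""
    previous_total = last_state[0] if last_state is not None else 0
    if not new_values:
        if last_state is None:
            # Nothing seen for this key at all: keep no state.
            return None
        # Key exists but received no data in this batch: clear the flag.
        return (previous_total, False)
    # Key received data: accumulate and mark it as updated.
    return (previous_total + sum(new_values), True)


def only_updated(keyed_state: Iterable[Tuple[str, State]]) -> List[Tuple[str, int]]:
    """Keep only the keys whose flag is set, dropping the flag itself."""
    return [(key, total) for key, (total, updated) in keyed_state if updated]


if __name__ == "__main__":
    # Simulate two batches for keys "a" and "b" without a Spark cluster.
    state = {}
    for batch in ({"a": [1, 2], "b": [5]}, {"a": [], "b": [1]}):
        for key, values in batch.items():
            state[key] = update_with_flag(values, state.get(key))
    print(only_updated(state.items()))
```

In a streaming job, the same function could presumably be passed to `updateStateByKey`, with a filter on the flag applied before writing to the database, so that unchanged keys are skipped on each batch. Whether this is cheaper than saving everything depends on how many keys change per batch.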

Re: Why there is no snapshots for 1.5 branch?

2015-09-22 Thread Bin Wang
stall…..’
>
>
> Azuryy Yu
>
>
>
> On Sep 22, 2015, at 13:36, Bin Wang <wbi...@gmail.com> wrote:
>
> However, I found some scripts in dev/audit-release. Can I use them?
>
> Bin Wang <wbi...@gmail.com> wrote on Tue, Sep 22, 2015 at 1:34 PM:
>
>> No, I mean push

Re: Why there is no snapshots for 1.5 branch?

2015-09-21 Thread Bin Wang
ad of branch-1.5 is 1.5.1-SNAPSHOT -- soon to be 1.5.1 release
> candidates and then the 1.5.1 release.
>
> On Mon, Sep 21, 2015 at 9:51 PM, Bin Wang <wbi...@gmail.com> wrote:
>
>> I'd like to use some important bug fixes in the 1.5 branch. I looked on the
>> Apache Maven

Why there is no snapshots for 1.5 branch?

2015-09-21 Thread Bin Wang
I'd like to use some important bug fixes in the 1.5 branch. I looked on the Apache Maven host but could not find any snapshot for the 1.5 branch: https://repository.apache.org/content/groups/snapshots/org/apache/spark/spark-core_2.10/1.5.0-SNAPSHOT/ I can find 1.4.X and 1.6.0 versions; why is there no

Re: Why there is no snapshots for 1.5 branch?

2015-09-21 Thread Bin Wang
ll git commit history of exactly what is in that build readily available,
> not just somewhat arbitrary JARs.
>
> On Mon, Sep 21, 2015 at 9:57 PM, Bin Wang <wbi...@gmail.com> wrote:
>
>> But I cannot find 1.5.1-SNAPSHOT either at
>> https://repository.apache.org/conten

Re: Why there is no snapshots for 1.5 branch?

2015-09-21 Thread Bin Wang
However, I found some scripts in dev/audit-release. Can I use them?
Bin Wang <wbi...@gmail.com> wrote on Tue, Sep 22, 2015 at 1:34 PM:
> No, I mean push Spark to my private repository. Spark doesn't have a
> build.sbt as far as I can see.
>
> Fengdong Yu <fengdo...@everstring.com> wrote on Sep 22, 2015

Re: Why there is no snapshots for 1.5 branch?

2015-09-21 Thread Bin Wang
leases" at nexus + "content/repositories/releases")
> }
>
>
>
> On Sep 22, 2015, at 13:26, Bin Wang <wbi...@gmail.com> wrote:
>
> My project uses sbt (or Maven), which needs to download dependencies from
> a Maven repo. I have m

Re: QueueStream doesn't support checkpoint makes it difficult to do unit test

2015-09-17 Thread Bin Wang
Never mind. I've found a PR for this and it has been merged: https://github.com/apache/spark/pull/8624/commits
Bin Wang <wbi...@gmail.com> wrote on Thu, Sep 17, 2015 at 4:50 PM:
> I'm using Spark Streaming with updateStateByKey, which forces me to use
> checkpointing. In my unit test, I create a queueStream to test.

Optimize the first map reduce of DStream

2015-03-24 Thread Bin Wang
in memory. Something like a pipeline should be OK. Is it difficult to implement on top of the current implementation? Thanks.
---
Bin Wang