the uniqueSource in StreamExecution, where is it changed?

2017-08-04 Thread ??????????
Hi all, these days I have been studying the code of StreamExecution. In the method constructNextBatch (around line 365), I saw that the value of latestOffsets changes, but I cannot find where the s.getOffset of each source in uniqueSources is changed. Here is the code link:
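For context, a paraphrased Scala sketch of the pattern the question is about (not the exact Spark source; see the linked code): s.getOffset is not mutated inside StreamExecution at all. Each Source implementation computes it fresh against the external system, which is why latestOffsets differs from batch to batch.

```scala
// Paraphrased shape of the loop in StreamExecution.constructNextBatch.
val latestOffsets: Map[Source, Option[Offset]] = uniqueSources.map { s =>
  // getOffset is implemented by each Source (e.g. KafkaSource): it asks the
  // external system for its newest available offset, so the returned value
  // moves between micro-batches without any field being assigned here.
  (s, s.getOffset)
}.toMap
```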

Re: SPARK Issue in Standalone cluster

2017-08-04 Thread Gourav Sengupta
Hi Marco, I am sincerely obliged for your kind time and response. Could you please try the solution that you have so kindly suggested? It would be a great help if you could execute the code that I have given; I don't think that anyone has yet. There are lots of fine responses to my question

Re: Spark Streaming: Async action scheduling inside foreachRDD

2017-08-04 Thread Sathish Kumaran Vairavelu
A ForkJoinPool with task support would help in this case: you can create a thread pool with a configured number of threads (make sure you have enough cores) and submit jobs, i.e. actions, to the pool. On Fri, Aug 4, 2017 at 8:54 AM Raghavendra Pandey < raghavendra.pan...@gmail.com> wrote: > Did you
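A minimal sketch of the pool-backed approach Sathish describes, assuming Scala 2.12 (earlier Scala versions take scala.concurrent.forkjoin.ForkJoinPool instead) and an existing SparkContext sc; the pool size of 4 is illustrative:

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport

val rdds = Seq(sc.parallelize(1 to 100), sc.parallelize(101 to 200)).par
// Back the parallel collection with a bounded pool so at most 4 actions
// run concurrently; size the pool to the cores you can actually spare.
rdds.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(4))
rdds.foreach(rdd => println(rdd.count()))  // each count() is its own Spark job
```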

Re: Quick one on evaluation

2017-08-04 Thread Daniel Darabos
On Fri, Aug 4, 2017 at 4:36 PM, Jean Georges Perrin wrote: > Thanks Daniel, > > I like your answer for #1. It makes sense. > > However, I don't get why you say that there are always pending > transformations... After you call an action, you should be "clean" from > pending

Re: Quick one on evaluation

2017-08-04 Thread Jean Georges Perrin
Thanks Daniel, I like your answer for #1. It makes sense. However, I don't get why you say that there are always pending transformations... After you call an action, you should be "clean" from pending transformations, no? > On Aug 3, 2017, at 5:53 AM, Daniel Darabos
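To make the point under discussion concrete, a small sketch assuming a live SparkContext sc (the file path is hypothetical): an action does not clear the pending transformations; the lineage stays attached to the RDD and is replayed by every later action unless you cache.

```scala
val parsed = sc.textFile("data.txt").map { line =>
  println("parsing")  // side effect that makes re-evaluation visible in local mode
  line.toUpperCase
}
parsed.count()    // runs the map
parsed.collect()  // runs the map again: the transformation is still "pending"
parsed.cache()    // caching is what actually short-circuits recomputation
```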

Re: Spark Streaming: Async action scheduling inside foreachRDD

2017-08-04 Thread Raghavendra Pandey
Did you try SparkContext.addSparkListener? On Aug 3, 2017 1:54 AM, "Andrii Biletskyi" wrote: > Hi all, > > What is the correct way to schedule multiple jobs inside the foreachRDD method > and, importantly, await the result to ensure those jobs have completed >
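A minimal sketch of the listener hook Raghavendra mentions, assuming an existing SparkContext sc; correlating job IDs back to a particular batch (e.g. via job groups) is left out:

```scala
import java.util.concurrent.ConcurrentHashMap
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

val running = new ConcurrentHashMap[Int, Boolean]()

sc.addSparkListener(new SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    running.put(jobStart.jobId, true)
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    running.remove(jobEnd.jobId)  // job finished; a latch or promise could fire here
})
```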

Re: SPARK Issue in Standalone cluster

2017-08-04 Thread Jean Georges Perrin
I use CIFS and it works reasonably well: easily cross-platform and well documented... > On Aug 4, 2017, at 6:50 AM, Steve Loughran wrote: > > >> On 3 Aug 2017, at 19:59, Marco Mistroni wrote: >> >> Hello >> my 2 cents here, hope it helps >> If

Re: SPARK Issue in Standalone cluster

2017-08-04 Thread Steve Loughran
> On 3 Aug 2017, at 19:59, Marco Mistroni wrote: > > Hello > my 2 cents here, hope it helps > If you want to just play around with Spark, I'd leave Hadoop out; it's an > unnecessary dependency that you don't need for just running a Python script. > Instead do the
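As a rough illustration of the quoted advice (the thread concerns a Python script, but the same idea from Scala, assuming Spark 2.x; the path and app name are hypothetical): a local[*] master runs Spark entirely in-process, with no Hadoop installation involved.

```scala
import org.apache.spark.sql.SparkSession

// local[*] runs everything inside this JVM, so no Hadoop/HDFS setup is needed
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("sandbox")  // illustrative name
  .getOrCreate()

val lines = spark.read.textFile("/tmp/sample.txt")  // hypothetical local path
println(lines.count())
```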

Spark Streaming failed jobs due to hardware issue

2017-08-04 Thread Alapati VenuGopal
Hi, we are running a Spark Streaming application with a Kafka Direct Stream on Spark 1.6. It ran for a few days without any errors or failed tasks, and then there was an error creating a directory on one machine, as follows: Job aborted due to stage failure: Task 1 in stage 158757.0