[Spark Streaming] Connect to Database only once at the start of Streaming job

2015-10-27 Thread Uthayan Suthakar
Hello all, what I want to do is configure the Spark Streaming job to read the database using JdbcRDD and cache the results. This should occur only once, at the start of the job; it should not make any further connections to the DB afterwards. Is it possible to do that?
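
In Spark itself this would be a JdbcRDD built once in the driver with `.cache()` called on it. As a language-neutral sketch (plain Python, all names hypothetical), the "connect once, then serve every batch from memory" behaviour the question asks for is just memoization:

```python
# Sketch only: models "query the DB once at job start, then serve every
# later access from cache", as rdd.cache() would for a JdbcRDD.
# The loader function and its return value are made up for illustration.
class ReferenceData:
    def __init__(self, loader):
        self._loader = loader
        self._cache = None
        self.db_calls = 0          # counts simulated JDBC round trips

    def get(self):
        if self._cache is None:    # first access: hit the database
            self.db_calls += 1
            self._cache = self._loader()
        return self._cache         # every later access: cached result only

def fake_db_query():
    # stand-in for the real JdbcRDD query
    return {1: "a", 2: "b"}

ref = ReferenceData(fake_db_query)
```

However many batches call `ref.get()`, `db_calls` stays at 1 — the same guarantee `.cache()` gives for the lifetime of the job (as long as the cached partitions are not evicted).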

Re: [Spark Streaming] How do we reset the updateStateByKey values.

2015-10-26 Thread Uthayan Suthakar
etState message. If not, continue summing the > others. > > I can provide scala samples, my java is beyond rusty :) > > -adrian > > From: Uthayan Suthakar > Date: Friday, October 23, 2015 at 2:10 PM > To: Sander van Dijk > Cc: user > Subject: Re: [Spark Strea

Re: [Spark Streaming] How do we reset the updateStateByKey values.

2015-10-23 Thread Uthayan Suthakar
hope this applies to your case and that it makes sense, my Java is a bit > rusty :) and perhaps others can suggest better spark streaming methods that > can be used, but hopefully the idea is clear. > > Sander > > On Thu, Oct 22, 2015 at 4:06 PM Uthayan Suthakar < > uthayan.sutha...@

Re: [Spark Streaming] How do we reset the updateStateByKey values.

2015-10-22 Thread Uthayan Suthakar
I need to take the value from an RDD and update the state of the other RDD. Is this possible? On 22 October 2015 at 16:06, Uthayan Suthakar <uthayan.sutha...@gmail.com> wrote: > Hello guys, > > I have a stream job that will carry out computations and update the state > (SUM t

[Spark Streaming] How do we reset the updateStateByKey values.

2015-10-22 Thread Uthayan Suthakar
Hello guys, I have a stream job that will carry out computations and update the state (SUM the value). At some point, I would like to reset the state. I could drop the state by setting it to 'None', but I don't want to drop it; I would like to keep the state but update its value. For example:
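
One way to get "reset without dropping the key", suggested by the replies in this thread, is to inject a reset marker into the stream and handle it inside the update function. A minimal sketch of that update function, in plain Python so the logic is testable outside Spark (the sentinel and the function name are illustrative; in PySpark such a function would be passed to `updateStateByKey`):

```python
# Sketch of an updateStateByKey-style update function with a reset
# marker. Assumption: the stream can carry a sentinel record (RESET)
# for a key whose running SUM should start over; all other values are
# summed into the existing state as usual.
RESET = object()   # sentinel; in a real stream this might be a magic value

def update_sum(new_values, state):
    if RESET in new_values:
        # keep the key alive, but restart the sum from this batch's values
        return sum(v for v in new_values if v is not RESET)
    # normal incremental SUM; state is None the first time a key appears
    return (state or 0) + sum(new_values)
```

Returning a number (rather than `None`) when the marker arrives is what keeps the state instead of dropping it, which is the distinction the question draws.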

Re: Why Spark Stream job stops producing outputs after a while?

2015-10-12 Thread Uthayan Suthakar
Any suggestions? Is there any way I could debug this issue? Cheers, Uthay On 11 October 2015 at 18:39, Uthayan Suthakar <uthayan.sutha...@gmail.com> wrote: > Hello all, > > I have a Spark Streaming job that run and produce results successfully. > However, after a few

Re: Why Spark Stream job stops producing outputs after a while?

2015-10-12 Thread Uthayan Suthakar
, Tathagata Das <t...@databricks.com> wrote: > Are you sure that there are not log4j errors in the driver logs? What if > you try enabling debug level? And what does the streaming UI say? > > > On Mon, Oct 12, 2015 at 12:50 PM, Uthayan Suthakar < > uthayan.sutha...

Why Spark Stream job stops producing outputs after a while?

2015-10-11 Thread Uthayan Suthakar
Hello all, I have a Spark Streaming job that runs and produces results successfully. However, after a few days the job stops producing any output. I can see the job is still running (polling data from Flume, completing jobs and their subtasks); however, it is failing to produce any output. I have to
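
One way to catch this condition early, instead of noticing it days later, is to track how many consecutive batches produced no output and alert past a threshold. In Spark this could hang off a StreamingListener; below is a plain-Python model of the bookkeeping (class and parameter names are hypothetical):

```python
# Detects "batches still completing, but no output produced": count
# consecutive empty batches and flag staleness once a limit is hit.
class OutputWatchdog:
    def __init__(self, max_silent_batches=5):
        self.max_silent = max_silent_batches
        self.silent = 0            # consecutive batches with zero records

    def on_batch(self, record_count):
        if record_count > 0:
            self.silent = 0        # healthy batch resets the counter
        else:
            self.silent += 1
        return self.silent >= self.max_silent   # True => raise an alert

w = OutputWatchdog(max_silent_batches=2)
```

Wired into batch-completion callbacks, this turns the silent failure described above into an explicit signal you can page on or use to restart the job.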

Re: Why Checkpoint is throwing "actor.OneForOneStrategy: NullPointerException"

2015-09-25 Thread Uthayan Suthakar
hanks! > - Terry > > On Fri, Sep 25, 2015 at 10:22 AM, Tathagata Das <t...@databricks.com> > wrote: > >> Are you by any chance setting DStream.remember() with null? >> >> On Thu, Sep 24, 2015 at 5:02 PM, Uthayan Suthakar < >> uthayan.sutha...@gmail.com&

Why Checkpoint is throwing "actor.OneForOneStrategy: NullPointerException"

2015-09-24 Thread Uthayan Suthakar
Hello all, my stream job is throwing the below exception at every interval. It is first deleting the checkpoint file and then trying to checkpoint; is this normal behaviour? I'm using Spark 1.3.0. Do you know what may cause this issue? 15/09/24 16:35:55 INFO scheduler.TaskSetManager:
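
The reply in this thread asks whether `DStream.remember()` is being passed null. The failure mode it hints at is worth spelling out: a null duration is accepted at the call site but only dereferenced much later, inside the checkpointing machinery, which is where the NullPointerException surfaces. A tiny Python model of that delayed failure and of a call-site guard (all names are stand-ins, not Spark API):

```python
# Models the hypothesis from the thread: a null/None duration handed to
# something like DStream.remember() only fails later, deep inside
# checkpoint bookkeeping. Validating at the call site turns the delayed
# NPE into an immediate, descriptive error.
def remember_window(duration_ms):
    # stands in for the bookkeeping that eventually dereferences the value
    return duration_ms * 2

def safe_remember(duration_ms):
    if duration_ms is None:
        raise ValueError("remember() must not be given a null duration")
    return remember_window(duration_ms)
```

The practical takeaway is to audit every duration and checkpoint-related value fed to the streaming context for nulls before suspecting the checkpointing code itself.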

Re: Why RDDs are being dropped by Executors?

2015-09-23 Thread Uthayan Suthakar
ng, as the partitions > will be read from disk -- better than recomputing in most cases. > > On Tue, Sep 22, 2015 at 4:20 AM, Uthayan Suthakar < > uthayan.sutha...@gmail.com> wrote: > >> >> Hello All, >> >> We have a Spark Streaming job that reads data from

Why RDDs are being dropped by Executors?

2015-09-22 Thread Uthayan Suthakar
Hello All, We have a Spark Streaming job that reads data from the DB (three tables) and caches it in memory ONLY at the start; it then happily carries out the incremental calculation with the new data. What we've noticed occasionally is that one of the RDDs caches only 90% of the data.
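
The reply in this thread points at the usual cause: with a memory-only storage level, once executor storage memory fills up the block manager evicts the least-recently-used blocks, so a fraction of the partitions silently disappears. A toy model of that eviction (capacity and block ids are made up; with a memory-and-disk storage level the evicted partitions would spill to disk instead of needing recomputation):

```python
# Toy LRU cache modelling why a memory-only cache can end up holding
# only ~90% of the partitions: inserts past capacity evict the least
# recently used block.
from collections import OrderedDict

class LruPartitionCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()           # insertion order = recency

    def put(self, block_id, data):
        self.blocks.pop(block_id, None)       # re-insert refreshes recency
        if len(self.blocks) >= self.capacity: # storage memory "full":
            self.blocks.popitem(last=False)   # evict least recently used
        self.blocks[block_id] = data

    def __contains__(self, block_id):
        return block_id in self.blocks
```

With three cached tables competing for the same storage memory, whichever RDD's partitions are touched least recently is the one that ends up partially evicted — matching the "one of the RDDs caches only 90%" symptom.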

How do we get the Spark Streaming logs while it is active?

2015-09-04 Thread Uthayan Suthakar
Hello all, I'm using yarn-cluster mode to run the Spark Streaming job, but I can only get the logs once the job is complete (with manual intervention). I would like to see the logs while it is running; is this possible?

Jackson-core-asl conflict with Spark

2015-03-12 Thread Uthayan Suthakar
Hello Guys, I'm running into the below error: Exception in thread main java.lang.NoClassDefFoundError: org/codehaus/jackson/annotate/JsonClass I have created an uber jar with Jackson-core-asl.1.9.13 and passed it with the --jars configuration, but am still getting errors. I searched on the net and found a
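
With "class not found despite shipping the jar" problems like this, a useful first step is to probe what is actually visible at runtime rather than what was packaged. On the JVM that probe is `Class.forName` on the class the error names; the Python analogue, sketched here for illustration (module names are arbitrary), is `importlib.util.find_spec`:

```python
# Diagnostic sketch: check whether a module/class is actually resolvable
# at runtime. This is the Python analogue of calling Class.forName on
# org/codehaus/jackson/annotate/JsonClass inside the Spark driver to see
# which jars really made it onto the classpath.
from importlib.util import find_spec

def on_path(module_name):
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False
```

If the probe fails inside the running job but the jar is in the uber jar, the conflict is usually classloader ordering (Spark's own Jackson winning over the shipped one), which points toward shading the dependency or a user-classpath-first style setting rather than adding more jars.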

Re: Jackson-core-asl conflict with Spark

2015-03-12 Thread Uthayan Suthakar
genuinely stuck with something ancient, then you need to include the JAR that contains the class, and 1.9.13 does not. Why do you think you need that particular version? — p...@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/ On Thu, Mar 12, 2015 at 9:58 AM, Uthayan Suthakar