Re: Read Avro Data using Spark Streaming

2018-11-14 Thread chandan prakash
In case you are using old Spark Streaming and processing Avro data from kafka, this might be useful: https://www.linkedin.com/pulse/avro-data-processing-spark-streaming-kafka-chandan-prakash/ Regards, Chandan On Sat, Nov 3, 2018 at 9:04 AM Divya Narayan wrote: > Hi, > > I produced

Re: Watermarking without aggregation with Structured Streaming

2018-09-30 Thread chandan prakash
correct? > > I am looking for something along the lines of > > ``` > df.withWatermark("ts", ...).filter(F.col("ts") ``` > > Is there any way to get the watermark value to achieve that? > > Thanks! > -- Chandan Prakash

Re: [Structured Streaming SPARK-23966] Why non-atomic rename is problem in State Store ?

2018-09-30 Thread chandan prakash
Can anyone clear up the doubts on the questions asked here? Regards, Chandan On Sat, Aug 11, 2018 at 10:03 PM chandan prakash wrote: > Hi All, > I was going through this pull request about new CheckpointFileManager > abstraction in structured streaming coming in 2.4

[Structured Streaming SPARK-23966] Why non-atomic rename is problem in State Store ?

2018-08-11 Thread chandan prakash
Hi All, I was going through this pull request about new CheckpointFileManager abstraction in structured streaming coming in 2.4 : https://issues.apache.org/jira/browse/SPARK-23966 https://github.com/apache/spark/pull/21048 I went through the code in detail and found it will introduce a very nice

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-12 Thread chandan prakash
cannot plugin existing sinks like Kafka and > you need to write the custom logic yourself and you cannot scale the > partitions for the sinks independently. > > [1] > https://spark.apache.org/docs/2.1.2/api/java/org/apache/spark/sql/ForeachWriter.html > > From: chandan prakash

Re: [Structured Streaming] Avoiding multiple streaming queries

2018-07-12 Thread chandan prakash
way to output the results of a structured streaming query >>> to multiple Kafka topics each with a different key column but without >>> having to execute multiple streaming queries? >>> >>> >>> 2. If not, would it be efficient to cascade the multiple queries such >>> that the first query does the complex aggregation and writes output >>> to Kafka and then the other queries just read the output of the first query >>> and write their topics to Kafka thus avoiding doing the complex aggregation >>> again? >>> >>> >>> Thanks in advance for any help. >>> >>> >>> Priyank >>> >>> >>> >> > -- Chandan Prakash

Re: Emit Custom metrics in Spark Structured Streaming job

2018-07-10 Thread chandan prakash
ke a look > > > > -- > Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/ > > - > To unsubscribe e-mail: user-unsubscr...@spark.apache.org > > -- Chandan Prakash

Re: Automatic Json Schema inference using Structured Streaming

2018-07-09 Thread chandan prakash
ds, Chandan On Thu, Jul 5, 2018 at 12:38 PM SRK wrote: > Hi, > > Is there a way that Automatic Json Schema inference can be done using > Structured Streaming? I do not want to supply a predefined schema and bind > it. > > With Spark Kafka Direct I could do spark.read.json(). I see that this is > not > supported in Structured Streaming. > > > Thanks! > > -- Chandan Prakash

Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-07-09 Thread chandan prakash
Thanks > Amiya > > -- Chandan Prakash

structured streaming: how to keep counter of error records in long running streaming application

2018-07-05 Thread chandan prakash
LongAccumulator but it seems like it is per batch and not for the life of the application. Any other way or workaround to achieve this, please share. Thanks in advance. Regards, -- Chandan Prakash

Re: How to branch a Stream / have multiple Sinks / do multiple Queries on one Stream

2018-07-05 Thread chandan prakash
n you keep posting once you have any solution. > > Thanks > Amiya > > -- Chandan Prakash

Spark Streaming logging on Yarn : issue with rolling in yarn-client mode for driver log

2018-03-07 Thread chandan prakash
to use yarn-client mode only 2. want to enable logging in yarn container only so that it is aggregated and backed up by yarn every hour to hdfs/s3 *How can I work around this to enable driver log rolling and aggregation as well?* Any pointers will be helpful. Thanks in advance. -- Chandan Prakash

Re: SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread chandan prakash
an earlier version with devtools? will > follow up for a fix. > > _ > From: Hyukjin Kwon <gurwls...@gmail.com> > Sent: Wednesday, February 14, 2018 6:49 PM > Subject: Re: SparkR test script issue: unable to run run-tests.h on spark > 2.2 > To: chand

SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread chandan prakash
like testthat as mentioned in doc 3. run run-tests.h Every time I am getting this error line: Error in get(name, envir = asNamespace(pkg), inherits = FALSE) : object 'run_tests' not found Calls: ::: -> get Execution halted Any Help? -- Chandan Prakash

Re: spark streaming Directkafka with checkpointing : changed parameters not considered

2016-08-19 Thread chandan prakash
be turned on in certain situations (e.g. > updateStateByKey), but you're certainly not required to rely on it for > fault tolerance. I try not to. > > On Fri, Aug 19, 2016 at 1:51 PM, chandan prakash < > chandanbaran...@gmail.com> wrote: > >> Thanks Cody for the pointer.

Re: spark streaming Directkafka with checkpointing : changed parameters not considered

2016-08-19 Thread chandan prakash
on't rely on > checkpoints > On Aug 18, 2016 8:53 AM, "chandan prakash" <chandanbaran...@gmail.com> > wrote: > >> Yes, >> i looked into the source code implementation. sparkConf is serialized >> and saved during checkpointing and re-created from the check

Re: spark streaming Directkafka with checkpointing : changed parameters not considered

2016-08-18 Thread chandan prakash
ming- > guide.html#upgrading-application-code > > > On Thu, Aug 18, 2016 at 4:39 AM, chandan prakash < > chandanbaran...@gmail.com> wrote: > >> Is it possible that i use checkpoint directory to restart streaming but >> with modified parameter value in con

Re: spark streaming Directkafka with checkpointing : changed parameters not considered

2016-08-18 Thread chandan prakash
Is it possible to use the checkpoint directory to restart streaming, but with a modified parameter value in the config file (e.g. username/password for the DB connection)? Thanks in advance. Regards, Chandan On Thu, Aug 18, 2016 at 1:10 PM, chandan prakash <chandanbaran...@gmail.com> wrote: >

spark streaming Directkafka with checkpointing : changed parameters not considered

2016-08-18 Thread chandan prakash
eamingContext.getOrCreate( checkpointDir, setupSsc(topics, kafkaParams, jdbcDriver, jdbcUrl, *jdbcUser*, *jdbcPassword*, checkpointDir) _ ) -- Chandan Prakash

Re: JavaStreamingContext.stop() hangs

2016-07-01 Thread chandan prakash
> Thanks. > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/JavaStreamingContext-stop-hangs-tp27257.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > ----

Re: spark streaming: issue with logging with separate log4j properties files for driver and executor

2016-05-24 Thread chandan prakash
utor.properties ") On Tue, May 24, 2016 at 10:24 PM, chandan prakash <chandanbaran...@gmail.com > wrote: > Any suggestion? > > On Mon, May 23, 2016 at 5:18 PM, chandan prakash < > chandanbaran...@gmail.com> wrote: > >> Hi, >> I am able to do logging for driv

Re: spark streaming: issue with logging with separate log4j properties files for driver and executor

2016-05-24 Thread chandan prakash
Any suggestion? On Mon, May 23, 2016 at 5:18 PM, chandan prakash <chandanbaran...@gmail.com> wrote: > Hi, > I am able to do logging for driver but not for executor. > > I am running spark streaming under mesos. > Want to do log4j logging separately for driver and executo

spark streaming: issue with logging with separate log4j properties files for driver and executor

2016-05-23 Thread chandan prakash
requestLogDriver.log) is happening fine. But for executor, there is no logging happening (should be at /tmp/requestLogExecutor.log as mentioned in log4j_RequestLogExecutor.properties on executor machines) *Any suggestions on how to get logging enabled for the executor?* TIA, Chandan -- Chandan Prakash

Re: Kafka partition increased while Spark Streaming is running

2016-05-13 Thread chandan prakash
> My answer to any question about "but what if checkpoints don't let me > do this" is always going to be "well, don't rely on checkpoints." > > If you want dynamic topicpartitions, > https://issues.apache.org/jira/browse/SPARK-12177 > > > On Fri, May 13,

Re: Kafka partition increased while Spark Streaming is running

2016-05-13 Thread chandan prakash
> Does this mean Spark Streaming's Kafka integration can't update its >> parallelism when Kafka's partition number is changed? >> > > -- Chandan Prakash

Re: Spark Streaming : is spark.streaming.receiver.maxRate valid for DirectKafkaApproach

2016-05-10 Thread chandan prakash
> > rate > > > > Sent from my iPhone > > > > On May 10, 2016, at 8:02 AM, chandan prakash <chandanbaran...@gmail.com> > > wrote: > > > > Hi, > > I am using Spark Streaming with Direct kafka approach. > >

Spark Streaming : is spark.streaming.receiver.maxRate valid for DirectKafkaApproach

2016-05-10 Thread chandan prakash
fast rate and some with very slow rate.* Regards, Chandan -- Chandan Prakash

Re: Is SPARK is the right choice for traditional OLAP query processing?

2015-11-08 Thread chandan prakash
Spark User List mailing list archive at Nabble.com. > > - > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > > -- Chandan Prakash