Re: Spark can't identify the event time column being supplied to withWatermark()

2018-06-08 Thread Tathagata Das
Glad that it worked out! It's unfortunate that there exist such pitfalls. And there is no easy way to get around it. If you can, let us know how your experience with mapGroupsWithState has been. TD On Fri, Jun 8, 2018 at 1:49 PM, frankdede wrote: > You are exactly right! A few hours ago, I

Re: Reset the offsets, Kafka 0.10 and Spark

2018-06-08 Thread Tathagata Das
Structured Streaming really makes this easy. You can simply specify the option of whether to start the query from earliest or latest. Check out - https://www.slideshare.net/databricks/a-deep-dive-into-structured-streaming -
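
A minimal sketch of the option mentioned above, assuming the Kafka source for Structured Streaming, which reads a `startingOffsets` option that can be set to `earliest` or `latest`. The helper name `kafka_source_options`, the broker address, and the topic are hypothetical; only the three option keys come from the Kafka source.

```python
def kafka_source_options(bootstrap_servers, topic, start_from="latest"):
    """Build the option map for a Structured Streaming Kafka source.

    This sketch only covers the two named positions, "earliest" and
    "latest". The setting matters when a query starts without an existing
    checkpoint; a restarted query resumes from its checkpointed offsets.
    """
    if start_from not in ("earliest", "latest"):
        raise ValueError("start_from must be 'earliest' or 'latest'")
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": start_from,
    }


# Hypothetical broker and topic, for illustration only.
opts = kafka_source_options("broker1:9092", "events", start_from="earliest")
print(opts["startingOffsets"])
```

With a live SparkSession named `spark`, the map would typically be passed as `spark.readStream.format("kafka").options(**opts).load()`.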

Re: Spark can't identify the event time column being supplied to withWatermark()

2018-06-08 Thread frankdede
You are exactly right! A few hours ago, I tried many things and finally got the example working by defining event timestamp column before groupByKey, just like what you suggested, but I wasn't able to figure out the reasoning behind my fix. val sessionUpdates = events

Re: Spark can't identify the event time column being supplied to withWatermark()

2018-06-08 Thread Tathagata Das
Try to define the watermark on the right column immediately before calling `groupByKey(...).mapGroupsWithState(...)`. You are applying the watermark and then doing a bunch of opaque transformations (user-defined flatMap that the planner has no visibility into). This prevents the planner from

Re: Spark 2.3 driver pod stuck in Running state — Kubernetes

2018-06-08 Thread Yinan Li
Yes, it looks like it is because there are not enough resources to run the executor pods. Have you seen pending executor pods? On Fri, Jun 8, 2018, 11:49 AM Thodoris Zois wrote: > As far as I know from Mesos with Spark, it is a running state and not a > pending one. What you see is normal, but if

Re: Spark 2.3 driver pod stuck in Running state — Kubernetes

2018-06-08 Thread Thodoris Zois
As far as I know from Mesos with Spark, it is a running state and not a pending one. What you see is normal, but if I am wrong, somebody correct me. The Spark driver starts normally (running state), but when it comes to starting executors, it cannot allocate resources for them and

Spark 2.3 driver pod stuck in Running state — Kubernetes

2018-06-08 Thread purna pradeep
Hello, When I run spark-submit on a k8s cluster, I'm seeing the driver pod stuck in Running state, and when I pulled the driver pod logs I was able to see the log below. I do understand that this warning might be because of a lack of CPU/memory, but I expect the driver pod to be in “Pending” state rather than “Running”

Spark can't identify the event time column being supplied to withWatermark()

2018-06-08 Thread frankdede
I was trying to find a way to resessionize features in different events based on the event timestamps using Spark and I found a code example that uses mapGroupsWithState to resessionize events using processing timestamps in their repo.

Change in configuration settings?

2018-06-08 Thread William Briggs
I recently upgraded a Structured Streaming application from Spark 2.2.1 -> Spark 2.3.0. This application runs in yarn-cluster mode, and it made use of the spark.yarn.{driver|executor}.memoryOverhead properties. I noticed the job started crashing unexpectedly, and after doing a bunch of digging, it
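
The snippet above is cut off before it says what the digging turned up, so the following is only a related sketch, not the poster's finding. It assumes the Spark 2.3 configuration names `spark.driver.memoryOverhead` and `spark.executor.memoryOverhead`, with the `spark.yarn.*` spellings treated as deprecated. The helper `migrate_conf` and the sample values are hypothetical.

```python
# Pre-2.3 YARN-specific keys mapped to the names documented from Spark 2.3.
RENAMED_KEYS = {
    "spark.yarn.driver.memoryOverhead": "spark.driver.memoryOverhead",
    "spark.yarn.executor.memoryOverhead": "spark.executor.memoryOverhead",
}


def migrate_conf(conf):
    """Return a copy of conf with the old overhead keys renamed.

    A value already set under the new name wins over the old one.
    """
    # Keep every key that is not one of the old spellings.
    migrated = {k: v for k, v in conf.items() if k not in RENAMED_KEYS}
    # Carry old values over only where the new key is not already set.
    for old_key, new_key in RENAMED_KEYS.items():
        if old_key in conf:
            migrated.setdefault(new_key, conf[old_key])
    return migrated


# Hypothetical settings, for illustration only.
old_conf = {
    "spark.yarn.executor.memoryOverhead": "2048",
    "spark.executor.memory": "8g",
}
print(migrate_conf(old_conf))
```

Checking the effective values in the Environment tab of the Spark UI after an upgrade is one way to confirm that the overhead settings were actually picked up.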

Re: [SparkLauncher] stateChanged event not received in standalone cluster mode

2018-06-08 Thread bsikander
Thanks.

Re: Spark YARN job submission error (code 13)

2018-06-08 Thread Aakash Basu
Fixed by adding 2 configurations in yarn-site.xml. Thanks all! On Fri, Jun 8, 2018 at 2:44 PM, Aakash Basu wrote: > Hi, > > I fixed that problem by putting all the Spark JARS in spark-archive.zip > and putting it in the HDFS (as that problem was happening for that reason) - > > But, I'm facing

Re: Spark YARN Error - triggering spark-shell

2018-06-08 Thread Aakash Basu
Fixed by adding 2 configurations in yarn-site.xml. Thanks all! On Fri, Jun 8, 2018 at 2:44 PM, Aakash Basu wrote: > Hi, > > I fixed that problem by putting all the Spark JARS in spark-archive.zip > and putting it in the HDFS (as that problem was happening for that reason) - > > But, I'm

Re: Spark YARN Error - triggering spark-shell

2018-06-08 Thread Aakash Basu
Hi, I fixed that problem by putting all the Spark JARS in spark-archive.zip and putting it in the HDFS (as that problem was happening for that reason) - But, I'm facing a new issue now, this is the new RPC error I get (Stack-Trace below) - 2018-06-08 14:26:43 WARN NativeCodeLoader:62 -

Re: Spark YARN job submission error (code 13)

2018-06-08 Thread Aakash Basu
Hi, I fixed that problem by putting all the Spark JARS in spark-archive.zip and putting it in the HDFS (as that problem was happening for that reason) - But, I'm facing a new issue now, this is the new RPC error I get (Stack-Trace below) - 2018-06-08 14:26:43 WARN NativeCodeLoader:62 -

Re: Spark YARN Error - triggering spark-shell

2018-06-08 Thread Sathishkumar Manimoorthy
It seems your spark-on-yarn application is not able to get its application master, org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master. Check once on yarn logs Thanks, Sathish- On Fri, Jun 8, 2018 at 2:22 PM,

Re: Spark YARN Error - triggering spark-shell

2018-06-08 Thread Jeff Zhang
Check the yarn AM log for details. On Fri, Jun 8, 2018 at 4:36 PM, Aakash Basu wrote: > Hi, > > Getting this error when trying to run Spark Shell using YARN - > > Command: *spark-shell --master yarn --deploy-mode client* > > 2018-06-08 13:39:09 WARN Client:66 - Neither spark.yarn.jars nor >

Re: Spark YARN job submission error (code 13)

2018-06-08 Thread Saisai Shao
In Spark on YARN, error code 13 means SparkContext doesn't initialize in time. You can check the yarn application log to get more information. BTW, did you just write a plain python script without creating SparkContext/SparkSession? On Fri, Jun 8, 2018 at 4:15 PM, Aakash Basu wrote: > Hi, > > I'm trying to

Spark YARN Error - triggering spark-shell

2018-06-08 Thread Aakash Basu
Hi, Getting this error when trying to run Spark Shell using YARN - Command: *spark-shell --master yarn --deploy-mode client* 2018-06-08 13:39:09 WARN Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 2018-06-08 13:39:25
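
The warning quoted above is logged when neither `spark.yarn.jars` nor `spark.yarn.archive` is set. As a minimal sketch, assuming an archive of the Spark jars has already been uploaded to HDFS, the same command can be built with `spark.yarn.archive` pointing at it. The helper `spark_shell_command` and the HDFS path are hypothetical.

```python
def spark_shell_command(archive_uri=None):
    """Build the spark-shell argument list for YARN client mode.

    When archive_uri is given, spark.yarn.archive points YARN at a
    pre-uploaded archive of Spark jars instead of uploading the libraries
    under SPARK_HOME on every launch.
    """
    cmd = ["spark-shell", "--master", "yarn", "--deploy-mode", "client"]
    if archive_uri:
        cmd += ["--conf", "spark.yarn.archive=" + archive_uri]
    return cmd


# Hypothetical HDFS location, for illustration only.
print(" ".join(spark_shell_command("hdfs:///apps/spark/spark-archive.zip")))
```

The same property can instead be set once in `spark-defaults.conf` so that it does not have to be repeated on each command line.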

Spark YARN job submission error (code 13)

2018-06-08 Thread Aakash Basu
Hi, I'm trying to run a program on a cluster using YARN. YARN is present there along with HADOOP. Problem I'm running into is as below - Container exited with a non-zero exit code 13 > Failing this attempt. Failing the application. > ApplicationMaster host: N/A > ApplicationMaster

Re: Strange codegen error for SortMergeJoin in Spark 2.2.1

2018-06-08 Thread Rico Bergmann
Hi! I finally found the problem. I was not aware that the program was run in Client mode. The client used version 2.2.0. This caused the problem. Best, Rico. On 07.06.2018 at 08:49, Kazuaki Ishizaki wrote: > Thank you for reporting a problem. > Would it be possible to create a JIRA entry