Glad that it worked out! It's unfortunate that such pitfalls exist,
and that there is no easy way around them.
If you can, let us know how your experience with mapGroupsWithState has
been.
TD
On Fri, Jun 8, 2018 at 1:49 PM, frankdede
wrote:
> You are exactly right! A few hours ago, I
Structured Streaming really makes this easy. You can simply specify the
option of whether to start the query from the earliest or latest offsets.
Check out
-
https://www.slideshare.net/databricks/a-deep-dive-into-structured-streaming
-
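For a Kafka source, the option TD is referring to is `startingOffsets`. A minimal sketch, assuming a running SparkSession named `spark` and placeholder broker/topic names:

```scala
// The `startingOffsets` option controls whether a brand-new query begins
// reading from the earliest or the latest available offsets.
val events = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092") // placeholder broker
  .option("subscribe", "events")                   // placeholder topic
  .option("startingOffsets", "earliest")           // or "latest" (the streaming default)
  .load()
```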
You are exactly right! A few hours ago, I tried many things and finally got
the example working by defining the event timestamp column before groupByKey,
just as you suggested, but I wasn't able to figure out the reasoning
behind my fix.
val sessionUpdates = events
Try to define the watermark on the right column immediately before calling
`groupByKey(...).mapGroupsWithState(...)`. You are applying the watermark
and then doing a bunch of opaque transformations (a user-defined flatMap that
the planner has no visibility into). This prevents the planner from
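A minimal sketch of the ordering described above, assuming a streaming Dataset `events` with an event-time column `eventTime`, a key field `sessionId`, and a user-defined update function `updateSessions` (all three names are placeholders, not from the original thread):

```scala
// Define the watermark on the event-time column, then go straight into
// groupByKey(...).mapGroupsWithState(...), with no opaque user-defined
// transformation in between, so the planner can track the watermark.
val sessionUpdates = events
  .withWatermark("eventTime", "10 minutes")
  .groupByKey(event => event.sessionId)
  .mapGroupsWithState(GroupStateTimeout.EventTimeTimeout())(updateSessions)
```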
Yes, it looks like it is because there aren't enough resources to run the
executor pods. Have you seen pending executor pods?
On Fri, Jun 8, 2018, 11:49 AM Thodoris Zois wrote:
> As far as I know from Mesos with Spark, it is a running state and not a
> pending one. What you see is normal, but if
As far as I know from Mesos with Spark, it is a running state and not a pending
one. What you see is normal, but if I am wrong somebody correct me.
Spark driver at start operates normally (running state) but when it comes to
start up executors, then it cannot allocate resources for them and
Hello,
When I run spark-submit on a k8s cluster, I'm seeing the driver pod stuck
in the Running state, and when I pulled the driver pod logs I was able to
see the below log.
I do understand that this warning might be because of lack of CPU/memory,
but I expect the driver pod to be in the “Pending” state rather than “Running”.
I was trying to find a way to resessionize features in different events based
on the event timestamps using Spark, and I found a code example in their repo
that uses mapGroupsWithState to resessionize events using processing
timestamps.
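The core of resessionization is just splitting a key's sorted timestamps into groups wherever the gap between consecutive events exceeds a threshold; this is what the per-key update function would compute. A minimal plain-Scala sketch of that logic (the function name and the 30-minute gap are illustrative, independent of any Spark API):

```scala
// Split a sequence of event timestamps (epoch seconds) into sessions:
// a new session starts whenever the gap to the previous event exceeds maxGapSec.
def sessionize(timestamps: Seq[Long], maxGapSec: Long = 1800): Seq[Seq[Long]] =
  timestamps.sorted.foldLeft(Vector.empty[Vector[Long]]) { (sessions, t) =>
    sessions.lastOption match {
      case Some(cur) if t - cur.last <= maxGapSec =>
        sessions.init :+ (cur :+ t) // within the gap: extend the current session
      case _ =>
        sessions :+ Vector(t)       // gap exceeded (or first event): new session
    }
  }

// e.g. sessionize(Seq(0L, 600L, 5000L)) yields Seq(Seq(0, 600), Seq(5000))
```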
I recently upgraded a Structured Streaming application from Spark 2.2.1 ->
Spark 2.3.0. This application runs in yarn-cluster mode, and it made use of
the spark.yarn.{driver|executor}.memoryOverhead properties. I noticed the
job started crashing unexpectedly, and after doing a bunch of digging, it
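If the crash traces back to those properties, note that Spark 2.3.0 deprecated the YARN-specific overhead settings in favor of cluster-manager-agnostic ones. A sketch of the renamed configuration (the values are illustrative):

```
# Spark 2.2.x (deprecated as of 2.3.0):
spark.yarn.driver.memoryOverhead    1g
spark.yarn.executor.memoryOverhead  2g

# Spark 2.3.0 and later:
spark.driver.memoryOverhead    1g
spark.executor.memoryOverhead  2g
```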
Thanks.
Fixed by adding 2 configurations in yarn-site.xml.
Thanks all!
On Fri, Jun 8, 2018 at 2:44 PM, Aakash Basu
wrote:
> Hi,
>
> I fixed that problem by putting all the Spark JARS in spark-archive.zip
> and putting it in the HDFS (as that problem was happening for that reason) -
>
> But, I'm facing
Hi,
I fixed that problem by putting all the Spark JARS in spark-archive.zip and
putting it in the HDFS (as that problem was happening for that reason) -
But, I'm facing a new issue now, this is the new RPC error I get
(Stack-Trace below) -
2018-06-08 14:26:43 WARN NativeCodeLoader:62 -
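The spark-archive fix described above can be sketched as follows (the archive name and HDFS path are illustrative):

```
# Zip the Spark jars, upload the archive to HDFS, and point
# spark.yarn.archive at it so YARN containers stop falling back
# to uploading the libraries under SPARK_HOME on every submit.
zip -j spark-archive.zip $SPARK_HOME/jars/*
hdfs dfs -put spark-archive.zip /user/spark/spark-archive.zip

# in spark-defaults.conf:
#   spark.yarn.archive  hdfs:///user/spark/spark-archive.zip
```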
It seems your spark-on-yarn application is not able to get its
application master,
org.apache.spark.SparkException: Yarn application has already ended!
It might have been killed or unable to launch application master.
Check once on yarn logs
Thanks,
Sathish-
On Fri, Jun 8, 2018 at 2:22 PM,
Check the yarn AM log for details.
Aakash Basu wrote on Fri, Jun 8, 2018 at 4:36 PM:
> Hi,
>
> Getting this error when trying to run Spark Shell using YARN -
>
> Command: *spark-shell --master yarn --deploy-mode client*
>
> 2018-06-08 13:39:09 WARN Client:66 - Neither spark.yarn.jars nor
>
In Spark on YARN, exit code 13 means the SparkContext doesn't initialize in
time. You can check the YARN application log for more information.
BTW, did you just write a plain python script without creating
SparkContext/SparkSession?
Aakash Basu wrote on Fri, Jun 8, 2018 at 4:15 PM:
> Hi,
>
> I'm trying to
Hi,
Getting this error when trying to run Spark Shell using YARN -
Command: *spark-shell --master yarn --deploy-mode client*
2018-06-08 13:39:09 WARN Client:66 - Neither spark.yarn.jars nor
spark.yarn.archive is set, falling back to uploading libraries under
SPARK_HOME.
2018-06-08 13:39:25
Hi,
I'm trying to run a program on a cluster using YARN.
YARN is present there along with HADOOP.
The problem I'm running into is as below -
Container exited with a non-zero exit code 13
> Failing this attempt. Failing the application.
> ApplicationMaster host: N/A
> ApplicationMaster
Hi!
I finally found the problem. I was not aware, that the program was run
in Client mode. The client used version 2.2.0. This caused the problem.
Best,
Rico.
On 07.06.2018 at 08:49, Kazuaki Ishizaki wrote:
> Thank you for reporting a problem.
> Would it be possible to create a JIRA entry