Re: [Spark Shell on AWS K8s Cluster]: Is there more documentation regarding how to run spark-shell on k8s cluster?

2018-10-31 Thread Zhang, Yuqi
Hi Li,

Thank you very much for your reply!

> Did you make the headless service that reflects the driver pod name?
I am not sure, but I used “app” as the selector in the headless service, which 
is the same app label used by the StatefulSet that creates the Spark driver pod.
For your reference, I have attached the YAML files for the headless service and 
the StatefulSet. Could you please take a look at them if you have time?

I appreciate your help & have a good day!

Best Regards,
--
Yuqi Zhang
Software Engineer
m: 090-6725-6573



2 Chome-2-23-1 Akasaka
Minato, Tokyo 107-0052
teradata.com<http://www.teradata.com>

This e-mail is from Teradata Corporation and may contain information that is 
confidential or proprietary. If you are not the intended recipient, do not 
read, copy or distribute the e-mail or any attachments. Instead, please notify 
the sender and delete the e-mail and any attachments. Thank you.

Please consider the environment before printing.



From: Li Gao 
Date: Thursday, November 1, 2018 4:56
To: "Zhang, Yuqi" 
Cc: Gourav Sengupta , "user@spark.apache.org" 
, "Nogami, Masatsugu" 
Subject: Re: [Spark Shell on AWS K8s Cluster]: Is there more documentation 
regarding how to run spark-shell on k8s cluster?

Hi Yuqi,

Yes, we are running Jupyter Gateway and kernels on k8s and using Spark 2.4's 
client mode to launch PySpark. In client mode, your driver runs on the same pod 
as your kernel.

I am planning to write a blog post on this at some future date. Did you create 
the headless service that reflects the driver pod name? That's one of the 
critical pieces we automated in our custom code to make client mode work.
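
The pairing Li describes can be sketched as follows. This is a minimal
illustration under stated assumptions, not the code Li's team runs; the pod
name, service name, namespace, and port below are hypothetical:

```python
def driver_host(pod: str, headless_svc: str, namespace: str) -> str:
    """Stable DNS name a StatefulSet pod gets from its headless service.

    Executors launched by Spark on k8s connect back to the driver at
    spark.driver.host, so this name must resolve inside the cluster.
    """
    return f"{pod}.{headless_svc}.{namespace}.svc.cluster.local"


def spark_shell_args(pod: str, headless_svc: str, namespace: str = "default"):
    """Sketch of a client-mode spark-shell invocation run from inside the pod."""
    host = driver_host(pod, headless_svc, namespace)
    return [
        "spark-shell",
        "--master", "k8s://https://kubernetes.default.svc",
        "--conf", f"spark.driver.host={host}",
        "--conf", "spark.driver.port=7078",
        "--conf", f"spark.kubernetes.namespace={namespace}",
    ]


if __name__ == "__main__":
    print(" ".join(spark_shell_args("spark-driver-0", "spark-driver-svc")))
```

The key point is that `spark.driver.host` must be a name executors can resolve,
which is exactly what the headless service provides for the driver pod.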

-Li



Re: [Spark Shell on AWS K8s Cluster]: Is there more documentation regarding how to run spark-shell on k8s cluster?

2018-10-31 Thread Zhang, Yuqi
Hi Li,

Thank you for your reply.
Do you mean running the Jupyter client on a k8s cluster to use Spark 2.4? I am 
also trying to set up JupyterHub on k8s to use Spark, which is why I would like 
to know how to run Spark in client mode on a k8s cluster. If there is any 
related documentation on setting up Jupyter on k8s with Spark, could you please 
share it with me?

Thank you for your help!

Best Regards,
--
Yuqi Zhang
Software Engineer
m: 090-6725-6573





From: Li Gao 
Date: Thursday, November 1, 2018 0:07
To: "Zhang, Yuqi" 
Cc: "gourav.sengu...@gmail.com" , 
"user@spark.apache.org" , "Nogami, Masatsugu" 

Subject: Re: [Spark Shell on AWS K8s Cluster]: Is there more documentation 
regarding how to run spark-shell on k8s cluster?

Yuqi,

Your error seems unrelated to the headless service config you need to enable. 
For client mode, you need to create a headless service that matches your driver 
pod name exactly in order for the Spark 2.4 RC to work. We have had this running 
for a while now, using a Jupyter kernel as the driver client.

-Li



Re: [Spark Shell on AWS K8s Cluster]: Is there more documentation regarding how to run spark-shell on k8s cluster?

2018-10-31 Thread Zhang, Yuqi
Hi Gourav,

Thank you for your reply.

I haven’t tried Glue or EMR, but I guess they integrate Kubernetes on AWS 
instances?
I can set up the k8s cluster on AWS, but my problem is that I don’t know how to 
run spark-shell on Kubernetes.
Since Spark only supports client mode on k8s from version 2.4, which is not 
officially released yet, I would like to ask whether there is more detailed 
documentation on how to run spark-shell on a k8s cluster.

Thank you in advance & best regards!

--
Yuqi Zhang
Software Engineer
m: 090-6725-6573





From: Gourav Sengupta 
Date: Wednesday, October 31, 2018 18:34
To: "Zhang, Yuqi" 
Cc: user , "Nogami, Masatsugu" 

Subject: Re: [Spark Shell on AWS K8s Cluster]: Is there more documentation 
regarding how to run spark-shell on k8s cluster?

[External Email]

Just out of curiosity, why would you not use Glue (which is Spark on 
Kubernetes) or EMR?

Regards,
Gourav Sengupta





[Spark Shell on AWS K8s Cluster]: Is there more documentation regarding how to run spark-shell on k8s cluster?

2018-10-28 Thread Zhang, Yuqi
Hello guys,

I am Yuqi from Teradata Tokyo. Sorry to disturb you, but I have a problem using 
the Spark 2.4 client-mode feature on a Kubernetes cluster, and I would like to 
ask whether there is a solution.

The problem is that when I try to run spark-shell on a Kubernetes v1.11.3 
cluster in an AWS environment, I cannot successfully run a StatefulSet using 
the Docker image built from Spark 2.4. The error message is shown below. The 
version I am using is Spark v2.4.0-rc3.

Also, I wonder if there is more documentation on how to use client mode or 
integrate spark-shell with a Kubernetes cluster. The documentation at 
https://github.com/apache/spark/blob/v2.4.0-rc3/docs/running-on-kubernetes.md 
gives only a brief description. I understand it is not an officially released 
version yet, but if there is more documentation, could you please share it 
with me?

Thank you very much for your help!


Error msg:
+ env
+ sed 's/[^=]*=\(.*\)/\1/g'
+ sort -t_ -k4 -n
+ grep SPARK_JAVA_OPT_
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -n '' ']'
+ PYSPARK_ARGS=
+ '[' -n '' ']'
+ R_ARGS=
+ '[' -n '' ']'
+ '[' '' == 2 ']'
+ '[' '' == 3 ']'
+ case "$SPARK_K8S_CMD" in
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf 
"spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /sbin/tini -s -- /opt/spark/bin/spark-submit --conf 
spark.driver.bindAddress= --deploy-mode client
Error: Missing application resource.
Usage: spark-submit [options]  [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]
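
Two details stand out in the trace above. First, `spark.driver.bindAddress=` is
empty, which suggests the container never received `SPARK_DRIVER_BIND_ADDRESS`;
second, "Missing application resource" suggests spark-submit was invoked with
no application jar or script. In the stock Spark 2.4 images, the bind address
is normally injected from the pod IP via the Kubernetes downward API, roughly
like this (a sketch; the surrounding container spec is assumed, not shown):

```yaml
# Container env sketch: feed the pod IP to the Spark entrypoint, which
# forwards it to spark-submit as spark.driver.bindAddress.
env:
  - name: SPARK_DRIVER_BIND_ADDRESS
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
```

A StatefulSet pod that omits this env var (or passes no application arguments
to the entrypoint) would plausibly produce exactly the output above.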


--
Yuqi Zhang
Software Engineer
m: 090-6725-6573

