Re: Service Account not being honored using pyspark on Kubernetes
On Wed, Jan 29, 2020 at 9:58 PM pisymbol . wrote:
>
> On Wed, Jan 29, 2020 at 5:02 PM pisymbol . wrote:
>>
>> The problem is that when spark initializes I see the following error:
>>
>> io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden:
>> User "system:serviceaccount:default:default" cannot watch resource "pods"
>> in API group "" in the namespace "spark"
>>
> If I deploy my "driver" notebook pod in the spark namespace then things
> improve slightly:
>
> "Forbidden! Configured service account doesn't have access. Service
> account may have been revoked. pods is forbidden: User
> "system:serviceaccount:spark:default" cannot list resource "pods""
>
> Again, I don't want spark:default, I want spark:spark for the service
> account. Why aren't my configuration parameters taking?

For the poor soul who reads this thread and runs into the same issue: the fix is to set the serviceAccountName in your deployment for the pod to "spark". I'm not sure why this has to be done, but it works.

-aps
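For reference, a minimal sketch of that fix as a manifest fragment, assuming the driver notebook pod is deployed into the spark namespace (the pod and container names here are illustrative; the image is the one from the original post):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: jupyter-driver            # illustrative name
  namespace: spark
spec:
  serviceAccountName: spark       # the fix: the driver pod runs as spark:spark, not spark:default
  containers:
    - name: notebook              # illustrative name
      image: pidocker-docker-registry:5000/my-spark-py:v2.4.4
```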
Re: Service Account not being honored using pyspark on Kubernetes
On Wed, Jan 29, 2020 at 5:02 PM pisymbol . wrote:
>
> The problem is that when spark initializes I see the following error:
>
> io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden:
> User "system:serviceaccount:default:default" cannot watch resource "pods"
> in API group "" in the namespace "spark"

If I deploy my "driver" notebook pod in the spark namespace then things improve slightly:

"Forbidden! Configured service account doesn't have access. Service account may have been revoked. pods is forbidden: User "system:serviceaccount:spark:default" cannot list resource "pods""

Again, I don't want spark:default, I want spark:spark for the service account. Why aren't my configuration parameters taking?

-aps
Re: union two pyspark dataframes from different SparkSessions
Dear Yeikel,

I checked my code and it uses getOrCreate to create a SparkSession. Therefore, I should be retrieving the same SparkSession instance every time I call that method. Thanks for the reminder.

Best regards

--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
Service Account not being honored using pyspark on Kubernetes
I am on k8s 1.17 in a small 4-node cluster. I am running Spark 2.4.4, but with updated kubernetes-client jars to work around the 403 CVE issue. I am running on a pod in the 'default' namespace of my cluster in a Jupyter notebook. I am trying to configure 'client mode' so I can use pyspark interactively and watch work done on the executors.

Here is my SparkConf:

sparkConf = SparkConf()
sparkConf.setMaster("k8s://https://192.168.0.100:6443")
sparkConf.setAppName("pispark")
sparkConf.set("spark.kubernetes.container.image", "pidocker-docker-registry:5000/my-spark-py:v2.4.4")
sparkConf.set("spark.kubernetes.namespace", "spark")
sparkConf.set("spark.executor.instances", "3")
sparkConf.set("spark.driver.memory", "512m")
sparkConf.set("spark.executor.memory", "512m")
sparkConf.set("spark.kubernetes.pyspark.pythonVersion", 3)
sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.pullSecrets", "pidocker-docker-registry-secret")

spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()
sc = spark.sparkContext

The problem is that when spark initializes I see the following error:

io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden: User "system:serviceaccount:default:default" cannot watch resource "pods" in API group "" in the namespace "spark"

But I am not using "default:default", I am using "spark:spark", which has "edit" access via a clusterrolebinding in that namespace:

$ k describe clusterrolebinding/spark-role -n spark
Name:         spark-role
Labels:
Annotations:
Role:
  Kind:  ClusterRole
  Name:  edit
Subjects:
  Kind            Name   Namespace
  ----            ----   ---------
  ServiceAccount  spark  spark

What am I doing wrong?

-aps
Re: union two pyspark dataframes from different SparkSessions
From what I understand, the session is a singleton, so even if you think you are creating new instances you are just reusing it.

On Wed, 29 Jan 2020 02:24:05 -1100 icbm0...@gmail.com wrote:

> Dear all
>
> I already have a Python function that queries data from HBase and HDFS with given parameters. This function returns a pyspark dataframe and the SparkContext it used. With the client's increasing demands, I need to merge data from multiple queries. I tested using the "union" function to merge the pyspark dataframes returned by different function calls directly, and it worked. It surprised me that a pyspark dataframe can actually union dataframes from different SparkSessions. I am using pyspark 2.3.1 and Python 3.5. I wonder if this is good practice, or whether I should use the same SparkSession for all the queries?
>
> Best regards
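The getOrCreate pattern described above can be sketched in plain Python. This is a toy stand-in for illustration, not the pyspark implementation:

```python
class Session:
    """Toy stand-in for SparkSession, illustrating getOrCreate semantics."""
    _instance = None

    @classmethod
    def get_or_create(cls):
        # Reuse the existing session if one exists; otherwise create it.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


s1 = Session.get_or_create()
s2 = Session.get_or_create()
assert s1 is s2  # both names refer to the one shared session
```

This is why two apparently independent builder chains in pyspark hand back the same session: the second call finds the already-created instance and returns it rather than constructing a new one.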
union two pyspark dataframes from different SparkSessions
Dear all,

I already have a Python function that queries data from HBase and HDFS with given parameters. This function returns a pyspark dataframe and the SparkContext it used. With the client's increasing demands, I need to merge data from multiple queries. I tested using the "union" function to merge the pyspark dataframes returned by different function calls directly, and it worked. It surprised me that a pyspark dataframe can actually union dataframes from different SparkSessions. I am using pyspark 2.3.1 and Python 3.5. I wonder if this is good practice, or whether I should use the same SparkSession for all the queries?

Best regards
Re: Problems during upgrade 2.2.2 -> 2.4.4
Anyone? This question is not about my application running on top of Spark; it is about the upgrade of Spark itself from 2.2 to 2.4. I expected at least that Spark would recover gracefully from upgrades and recover its own persisted objects.