Spark on k8s issues with s3a committer dependencies or config?

2022-03-19 Thread Prasad Paravatha
Hi all,

I am trying out Spark 3.2.1 on k8s with Hadoop 3.3.1 and am running into
issues writing to an S3 bucket using TemporaryAWSCredentialsProvider:
https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Using_Session_Credentials_with_TemporaryAWSCredentialsProvider

Reading from S3 works, but I get a 403 Access Denied error when
writing to the KMS-enabled bucket.

I am wondering whether I am missing some dependency jars or client
configuration properties.
I would appreciate any pointers on this.
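
For context, the client-side settings in question can be sketched as follows. This is a minimal sketch, not a confirmed fix for the 403: it assumes the Hadoop 3.3.1 S3A property names passed through Spark's `spark.hadoop.` prefix, and the helper names, credential values, and KMS key ARN are hypothetical placeholders.

```python
from typing import Dict, List, Optional

# Provider class named in the Hadoop S3A documentation linked above.
TEMP_PROVIDER = "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"


def s3a_session_conf(
    access_key: str,
    secret_key: str,
    session_token: str,
    kms_key_arn: Optional[str] = None,
) -> Dict[str, str]:
    """Build Spark conf entries for S3A session credentials (hypothetical helper)."""
    conf = {
        "spark.hadoop.fs.s3a.aws.credentials.provider": TEMP_PROVIDER,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        "spark.hadoop.fs.s3a.session.token": session_token,
    }
    if kms_key_arn is not None:
        # Assumption: Hadoop 3.3.1 property names for SSE-KMS on writes.
        conf["spark.hadoop.fs.s3a.server-side-encryption-algorithm"] = "SSE-KMS"
        conf["spark.hadoop.fs.s3a.server-side-encryption.key"] = kms_key_arn
    return conf


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the conf as spark-submit '--conf key=value' arguments."""
    args: List[str] = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Placeholder values only; real credentials come from an STS session.
    example = s3a_session_conf(
        "ACCESS_KEY", "SECRET_KEY", "SESSION_TOKEN",
        kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/example",
    )
    print(" ".join(to_submit_args(example)))
```

If settings like these are already in place, it may also be worth checking that the role behind the session credentials has `kms:GenerateDataKey` on the bucket's key, since SSE-KMS writes need it while reads only need `kms:Decrypt`.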

Regards,
Prasad Paravatha


Re: One click to run Spark on Kubernetes

2022-02-22 Thread Prasad Paravatha
Hi Bo Yang,
Would it be something along the lines of Apache Livy?

Thanks,
Prasad


On Tue, Feb 22, 2022 at 10:22 PM bo yang wrote:

> It is not a standalone Spark cluster. In more detail, it deploys a Spark
> Operator (https://github.com/GoogleCloudPlatform/spark-on-k8s-operator)
> and an extra REST Service. When people submit a Spark application to that
> REST Service, the REST Service will create a CRD inside the
> Kubernetes cluster. Then the Spark Operator will pick up the CRD and launch the
> Spark application. The one-click tool intends to hide these details, so
> people can just submit Spark applications without dealing with too many
> deployment details.
>
> On Tue, Feb 22, 2022 at 8:09 PM Bitfox wrote:
>
>> Can it be a cluster installation of Spark, or just a standalone node?
>>
>> Thanks
>>
>> On Wed, Feb 23, 2022 at 12:06 PM bo yang wrote:
>>
>>> Hi Spark Community,
>>>
>>> We built an open source tool to deploy and run Spark on Kubernetes with
>>> a one-click command. For example, on AWS, it can automatically create an
>>> EKS cluster, node group, NGINX ingress, and Spark Operator. You can then
>>> use curl or a CLI tool to submit a Spark application. After the
>>> deployment, you can also install Uber Remote Shuffle Service to enable
>>> Dynamic Allocation on Kubernetes.
>>>
>>> Anyone interested in using or working together on such a tool?
>>>
>>> Thanks,
>>> Bo
>>>
>>>
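
To illustrate the flow described in the quoted reply (a REST service creates a resource in the cluster, and the Spark Operator picks it up and launches the application), here is a hedged sketch of what such a resource might look like. It assumes the `SparkApplication` kind and the `sparkoperator.k8s.io/v1beta2` API version used by the spark-on-k8s-operator project. The function name, image, class, jar path, Spark version, and service account are hypothetical, and the tool's actual REST payload is not shown in this thread.

```python
import json
from typing import Any, Dict


def spark_application(
    name: str,
    image: str,
    main_class: str,
    jar: str,
    namespace: str = "default",
    spark_version: str = "3.2.1",
    executors: int = 2,
) -> Dict[str, Any]:
    """Build a SparkApplication custom resource as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            "image": image,
            "mainClass": main_class,
            "mainApplicationFile": jar,
            "sparkVersion": spark_version,
            # Assumption: a service account named "spark" exists in the namespace.
            "driver": {"cores": 1, "memory": "1g", "serviceAccount": "spark"},
            "executor": {"cores": 1, "instances": executors, "memory": "1g"},
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    manifest = spark_application(
        name="example-app",
        image="example.registry/spark:3.2.1",
        main_class="org.example.Main",
        jar="local:///opt/spark/jars/example.jar",
    )
    # JSON is valid YAML, so this output could be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

In this picture the REST service would build and submit an object like this on the user's behalf, which is what lets users avoid writing Kubernetes manifests themselves.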

-- 
Regards,
Prasad Paravatha


Re: [ANNOUNCE] Apache Spark 3.2.0

2021-10-19 Thread Prasad Paravatha
https://www.apache.org/dyn/closer.lua/spark/spark-3.2.0/spark-3.2.0-bin-hadoop3.3.tgz

FYI, I am unable to download from this location.
Also, I don’t see a Hadoop 3.3 version in the dist.


> On Oct 19, 2021, at 9:39 AM, Bode, Meikel, NMA-CFD wrote:
> 
> 
> Many thanks! 
>  
> From: Gengliang Wang  
> Sent: Tuesday, October 19, 2021 16:16
> To: dev ; user 
> Subject: [ANNOUNCE] Apache Spark 3.2.0
>  
> Hi all,
>  
> Apache Spark 3.2.0 is the third release of the 3.x line. With tremendous 
> contribution from the open-source community, this release managed to resolve 
> in excess of 1,700 Jira tickets.
>  
> We'd like to thank our contributors and users for their contributions and 
> early feedback to this release. This release would not have been possible 
> without you.
>  
> To download Spark 3.2.0, head over to the download page: 
> https://spark.apache.org/downloads.html
>  
> To view the release notes: 
> https://spark.apache.org/releases/spark-release-3-2-0.html