Scala command syntax shortcuts (aliases)

2023-04-14 Thread Ankit Singla
Hi there, I'm a Spark user with a Data Engineer profile doing daily analytical work. I write a few commands hundreds of times a day, and I have always wondered whether there is some way to alias Spark commands instead of rewriting the whole syntax every time. I checked and there seems to be no *eval
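
Since spark-shell is just a Scala REPL, one way to get such aliases is a file of helper functions pulled in with :load. A minimal sketch, assuming the standard spark-shell session; the names t, q and s are illustrative, not a Spark API:

// shortcuts.scala -- inside spark-shell, run:  :load shortcuts.scala
// `spark` is the SparkSession the shell already provides.
import org.apache.spark.sql.DataFrame

def t(name: String): DataFrame = spark.table(name)        // t("db.events")
def q(sqlText: String): DataFrame = spark.sql(sqlText)    // q("SELECT 1")
def s(df: DataFrame, n: Int = 20): Unit = df.show(n, truncate = false)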

Re: Spark Kubernetes Operator

2023-04-14 Thread Yuval Itzchakov
I'm not running on GKE. I am wondering what the long-term strategy around a Spark operator is. Operators are the de facto way to run complex deployments. The Flink community now has an official community-led operator, and I was wondering if there are any similar plans for Spark. On Fri, Apr 14,

Re: Spark Kubernetes Operator

2023-04-14 Thread Mich Talebzadeh
Hi, What exactly are you trying to achieve? Spark on GKE works fine, and you can now run Dataproc on GKE: https://www.linkedin.com/pulse/running-google-dataproc-kubernetes-engine-gke-spark-mich/?trackingId=lz12GC5dRFasLiaJm5qDSw%3D%3D Unless I misunderstood your point. HTH Mich Talebzadeh, Lead

Spark Kubernetes Operator

2023-04-14 Thread Yuval Itzchakov
Hi, At the moment the most used option I see for a Spark operator is the one provided by Google: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator Unfortunately, it doesn't seem actively maintained. Are there any plans to support an official, Apache Spark community-driven operator?

Re: Accessing python runner file in AWS EKS kubernetes cluster as in local://

2023-04-14 Thread Mich Talebzadeh
OK, I managed to load the zipped Python file and the runner .py file onto s3 for AWS EKS to work. It is a bit of a nightmare compared to the same on Google SDK, which is simpler. Anyhow, you will require additional jar files to be added to $SPARK_HOME/jars. These two files will be picked up after you build
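
The message is truncated, so the two jars are not named here; for the s3a path they are typically hadoop-aws plus a matching aws-java-sdk-bundle (an assumption, not stated in the thread). A minimal sketch of the session configuration such a job needs, with hypothetical names and paths:

import org.apache.spark.sql.SparkSession

// Sketch only: assumes hadoop-aws and aws-java-sdk-bundle already sit in $SPARK_HOME/jars.
val spark = SparkSession.builder()
  .appName("eks-s3a-sketch")  // hypothetical app name
  .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
  // On EKS, IAM Roles for Service Accounts (IRSA) commonly supply the credentials:
  .config("spark.hadoop.fs.s3a.aws.credentials.provider",
          "com.amazonaws.auth.WebIdentityTokenCredentialsProvider")
  .getOrCreate()

// Artifacts staged on s3 are then addressable as s3a:// URIs (hypothetical path):
// spark.sparkContext.addFile("s3a://my-bucket/app/pyspark_deps.zip")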

Re: How to determine the function of tasks on each stage in an Apache Spark application?

2023-04-14 Thread Jacek Laskowski
Hi, Start with intercepting stage completions using SparkListenerStageCompleted [1]. That's Spark Core (jobs, stages, and tasks). Go up the execution chain to Spark SQL with SparkListenerSQLExecutionStart [2] and SparkListenerSQLExecutionEnd [3], and correlate the information. You may want to look at how
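
A minimal sketch of such a listener, assuming Spark 3.x; the class name StageAndSqlListener is illustrative:

import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent, SparkListenerStageCompleted}
import org.apache.spark.sql.execution.ui.{SparkListenerSQLExecutionEnd, SparkListenerSQLExecutionStart}

class StageAndSqlListener extends SparkListener {
  // Spark Core side: fires once per completed stage.
  override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
    val info = event.stageInfo
    println(s"stage ${info.stageId} '${info.name}' completed with ${info.numTasks} tasks")
  }

  // Spark SQL side: SQL execution events arrive through onOtherEvent.
  override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
    case s: SparkListenerSQLExecutionStart =>
      println(s"SQL execution ${s.executionId} started: ${s.description}")
    case e: SparkListenerSQLExecutionEnd =>
      println(s"SQL execution ${e.executionId} ended")
    case _ => // ignore everything else
  }
}

// Attach to a live session: spark.sparkContext.addSparkListener(new StageAndSqlListener)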

Re: How to create spark udf use functioncatalog?

2023-04-14 Thread Jacek Laskowski
Hi, I'm not sure I understand the question, but if your question is how to register (plug in) your own custom FunctionCatalog, it's through the spark.sql.catalog configuration property, e.g. spark.sql.catalog.catalog-name=com.example.YourCatalogClass. spark.sql.catalog registers a CatalogPlugin that
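
Putting the two halves together, a minimal sketch modeled on DataSourceV2FunctionSuite, assuming Spark 3.2+; DemoCatalog, StrLen and the strlen name are illustrative, not shipped with Spark:

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.connector.catalog.{FunctionCatalog, Identifier}
import org.apache.spark.sql.connector.catalog.functions.{BoundFunction, ScalarFunction, UnboundFunction}
import org.apache.spark.sql.types.{DataType, DataTypes, StructType}
import org.apache.spark.sql.util.CaseInsensitiveStringMap

class DemoCatalog extends FunctionCatalog {
  private var catalogName: String = _

  // CatalogPlugin contract: called once with the name used in spark.sql.catalog.<name>.
  override def initialize(name: String, options: CaseInsensitiveStringMap): Unit =
    catalogName = name
  override def name(): String = catalogName

  override def listFunctions(namespace: Array[String]): Array[Identifier] =
    Array(Identifier.of(namespace, "strlen"))

  override def loadFunction(ident: Identifier): UnboundFunction = ident.name() match {
    case "strlen" => new StrLen
    case other    => throw new NoSuchElementException(s"function $other")
  }
}

// The UnboundFunction binds argument types; the bound ScalarFunction does the work.
class StrLen extends UnboundFunction {
  override def name(): String = "strlen"
  override def description(): String = "strlen(str) - number of characters in str"
  override def bind(inputType: StructType): BoundFunction = new ScalarFunction[Int] {
    override def inputTypes(): Array[DataType] = Array(DataTypes.StringType)
    override def resultType(): DataType = DataTypes.IntegerType
    override def name(): String = "strlen"
    override def produceResult(input: InternalRow): Int =
      input.getUTF8String(0).numChars()
  }
}

With spark.sql.catalog.demo=com.example.DemoCatalog set, the function is then callable as SELECT demo.ns.strlen('spark'); the ns namespace is arbitrary here because the sketch ignores namespaces.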

How to create spark udf use functioncatalog?

2023-04-14 Thread ??????
We are using Spark. Today I saw the FunctionCatalog, and I have read the source of spark/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2FunctionSuite.scala and have implemented the ScalarFunction. But I still do not know how to register it in SQL