Re: The performance difference when running Apache Spark on K8s and traditional server

2023-07-27 Thread Mich Talebzadeh
Spark on tin boxes like Google Dataproc or AWS EC2 often utilise YARN resource manager. YARN is the most widely used resource manager not just for Spark but for other artefacts as well. On-premise YARN is used extensively. In Cloud it is also used widely in Infrastructure as a Service such as

Unsubscribe

2023-07-27 Thread Kevin Wang
Unsubscribe please!

The performance difference when running Apache Spark on K8s and traditional server

2023-07-27 Thread Trường Trần Phan An
Hi all, I am learning about the performance difference of Spark when performing a JOIN problem on Serverless (K8S) and Serverful (Traditional server) environments. Through experiment, Spark on K8s tends to run slower than Serverful. Through understanding the architecture, I know that Spark runs

Unsubscribe

2023-07-27 Thread blaz stojanovic
Unsubscribe - To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Unsubscribe

2023-07-27 Thread Sherif Eid
Unsubscribe

Dynamic allocation does not deallocate executors

2023-07-27 Thread Sergei Zhgirovski
Hi everyone, I'm trying to use pyspark 3.3.2. I have these relevant options set: spark.dynamicAllocation.enabled=true, spark.dynamicAllocation.shuffleTracking.enabled=true, spark.dynamicAllocation.shuffleTracking.timeout=20s, spark.dynamicAllocation.executorIdleTimeout=30s
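
For reference, the four options quoted in this message could be passed on the command line as `--conf` pairs. The snippet below is only a sketch of that assembly step: the helper name `to_submit_args` and the script name `job.py` are hypothetical, and nothing here reproduces or diagnoses the deallocation behaviour the thread asks about.

```python
import shlex

# The four settings exactly as quoted in the message above.
DYNAMIC_ALLOCATION_CONF = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.shuffleTracking.enabled": "true",
    "spark.dynamicAllocation.shuffleTracking.timeout": "20s",
    "spark.dynamicAllocation.executorIdleTimeout": "30s",
}


def to_submit_args(conf):
    """Turn a {key: value} mapping into a flat list of --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # "job.py" is a placeholder script name, not something from the thread.
    command = ["spark-submit", *to_submit_args(DYNAMIC_ALLOCATION_CONF), "job.py"]
    print(shlex.join(command))
```

Keeping the settings in one mapping makes it easy to compare what was requested against what the running application reports.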

[ANNOUNCE] Apache Celeborn(incubating) 0.3.0 available

2023-07-27 Thread zhongqiang chen
Hi all, the Apache Celeborn (Incubating) community is glad to announce the new release of Apache Celeborn (Incubating) 0.3.0. Celeborn is dedicated to improving the efficiency and elasticity of different map-reduce engines and provides an elastic, highly efficient service for intermediate data including

Unsubscribe

2023-07-27 Thread Sherif Eid
Unsubscribe

Re: conver panda image column to spark dataframe

2023-07-27 Thread Adrian Pop-Tifrea
Hello, when you said your pandas DataFrame has 10 rows, does that mean it contains 10 images? Because if that's the case, then you'd want to use only 3 layers of ArrayType when you define the schema. Best regards, Adrian On Thu, Jul 27, 2023, 11:04 second_co...@yahoo.com.INVALID wrote: > i

conver panda image column to spark dataframe

2023-07-27 Thread second_co...@yahoo.com.INVALID
I have a pandas DataFrame with a column 'image' holding numpy.ndarray values. The shape is (500, 333, 3) per image. My pandas DataFrame has 10 rows, so the overall shape is (10, 500, 333, 3). When using spark.createDataFrame(panda_dataframe, schema), I need to specify the schema: schema = StructType([
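
The reply in this thread suggests three layers of ArrayType, one per dimension of a single image, since the 10 rows are the DataFrame's rows rather than a fourth array dimension. A minimal sketch of that counting rule follows. It builds a Spark DDL type string instead of importing pyspark, the helper name `array_ddl` is hypothetical, and the element type `int` is an assumption (the message does not state the pixel dtype).

```python
def array_ddl(per_row_shape, element_type="int"):
    """Nest one array<...> layer per dimension of a single row's value.

    per_row_shape: shape of ONE cell (one image), not of the whole column.
    element_type:  assumed Spark SQL element type; not given in the thread.
    """
    ddl = element_type
    for _ in per_row_shape:
        ddl = f"array<{ddl}>"
    return ddl


# One image is (500, 333, 3), so the column type needs three layers.
image_type = array_ddl((500, 333, 3))
print(image_type)             # array<array<array<int>>>
print(f"image {image_type}")  # a DDL-style schema string for the one column
```

The cell values would still need to be plain nested lists (for example via `ndarray.tolist()`) before Spark can match them against such a type; whether that is the cause of the poster's error cannot be told from the truncated message.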

Re: spark context list_packages()

2023-07-27 Thread Sean Owen
There is no such method in Spark. I think that's some EMR-specific modification. On Wed, Jul 26, 2023 at 11:06 PM second_co...@yahoo.com.INVALID wrote: > I ran the following code > > spark.sparkContext.list_packages() > > on spark 3.4.1 and i get below error > > An error was encountered: >
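
Since the reply notes that Spark itself has no such method, one platform-independent way to see which Python packages are installed for the driver's interpreter is the standard library's `importlib.metadata`. This is only a sketch under that assumption: the function name `installed_packages` is hypothetical, it is not a reimplementation of the EMR method, and it says nothing about what is installed on the executors.

```python
from importlib import metadata


def installed_packages():
    """Return sorted (name, version) pairs visible to the current interpreter."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            pairs.add((name, dist.metadata.get("Version")))
    return sorted(pairs, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print(f"{name}=={version}")
```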