How to read text files with GBK encoding in the spark core

2023-04-30 Thread lianyou1...@126.com
Hello all, Is there any way to use the PySpark core to read text files encoded in GBK? PySpark SQL has an option to set the encoding, but these text files are not in a structured format. Any advice is appreciated. Thank you, lianyou Li
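
One possible approach, sketched here under assumptions rather than taken from the thread: read each file as raw bytes with `SparkContext.binaryFiles` and decode the bytes in Python. The helper names `decode_gbk` and `read_gbk_lines` are hypothetical. Note that `binaryFiles` loads each file whole into memory, so this suits many moderately sized files better than a few very large ones.

```python
def decode_gbk(raw):
    """Decode a GBK-encoded byte string and split it into lines."""
    # Strict decoding: raises UnicodeDecodeError on bytes that are not valid GBK.
    return raw.decode("gbk").splitlines()


def read_gbk_lines(sc, path):
    """Return an RDD of decoded lines (hypothetical helper).

    `sc` is an existing SparkContext; `path` may be a directory or glob.
    binaryFiles yields (file_path, file_bytes) pairs, one per file.
    """
    return sc.binaryFiles(path).flatMap(lambda pair: decode_gbk(pair[1]))


if __name__ == "__main__":
    # Local check of the decoding step only; no Spark cluster is needed for it.
    sample = "你好\n世界".encode("gbk")
    print(decode_gbk(sample))
```

With a running context, `read_gbk_lines(sc, "/some/dir/*.txt")` would give an RDD of `str` lines; the path shown is only a placeholder.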

Re: Tensorflow on Spark CPU

2023-04-30 Thread Sean Owen
There is a large overhead to distributing this type of workload. I imagine that for a small problem, the overhead dominates. A problem of this size does not come close to needing distribution, so more workers are probably just worse. On Sun, Apr 30, 2023 at 1:46 AM second_co...@yahoo.com < second_co...@ya

Any experience with K8s Remote Shuffling Service at scale?

2023-04-30 Thread Andrey Gourine
Hi All, I am looking for people who have experience running an external shuffling service at scale with Spark 3 and K8s. I have already tried the internal shuffling service (available from Spark 3) and am trying to work with Uniffle (Incubating). Any other option

How to change column values using several when conditions ?

2023-04-30 Thread marc nicole
Hello to you Sparkling community :) I want to change values of a column in a dataset according to a mapping list that maps original values of that column to other new values. Each element of the list (colMappingValues) is a string that separates the original values from the new values using a ";".
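
A minimal PySpark sketch of one way this could be done, assuming each entry of the mapping list has the form `"original;new"` (the preview is cut off, so the exact format is a guess). The helper names `parse_mapping` and `apply_mapping` are hypothetical. The idea is to chain one `when` clause per mapping entry and fall back to the existing value with `otherwise`.

```python
def parse_mapping(entries, sep=";"):
    """Turn ["a;x", "b;y"] into {"a": "x", "b": "y"} (assumed entry format)."""
    mapping = {}
    for entry in entries:
        # Split on the first separator only, so new values may contain it.
        old, new = entry.split(sep, 1)
        mapping[old.strip()] = new.strip()
    return mapping


def apply_mapping(df, column, mapping):
    """Replace values of `column` according to `mapping` (hypothetical helper)."""
    # Imported here so that parse_mapping can be used without Spark installed.
    from pyspark.sql import functions as F

    expr = None
    for old, new in mapping.items():
        cond = F.col(column) == old
        # The first clause starts the chain; later clauses extend it.
        expr = F.when(cond, new) if expr is None else expr.when(cond, new)
    if expr is None:
        return df  # empty mapping: nothing to change
    # Values with no mapping entry keep their original value.
    return df.withColumn(column, expr.otherwise(F.col(column)))


if __name__ == "__main__":
    print(parse_mapping(["a;x", "b;y"]))
```

For a large mapping, a join against a small lookup DataFrame may be easier to maintain than a long `when` chain.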