Re:

2022-04-02 Thread Sungwoo Park
MR3 is a new execution engine, so there are quite a few differences on the backend side. Some of the differences are:
1. Easier to install and run (e.g., no need to upgrade Hadoop)
2. Faster (because Hive on MR3 supports LLAP mode and runs as fast as Hive-LLAP)
3. More efficient - unlike Tez, a

Re:

2022-04-02 Thread Bitfox
Nice reading. Can you give a comparison of Hive on MR3 and Hive on Tez? Thanks
On Sat, Apr 2, 2022 at 7:17 PM Sungwoo Park wrote:
> Hi Spark users,
>
> We have published an article where we evaluate the performance of Spark
> 2.3.8 and Spark 3.2.1 (along with Hive 3). If interested, please

[no subject]

2022-04-02 Thread Sungwoo Park
Hi Spark users, We have published an article where we evaluate the performance of Spark 2.3.8 and Spark 3.2.1 (along with Hive 3). If interested, please see: https://www.datamonad.com/post/2022-04-01-spark-hive-performance-1.4/ --- SW

Re: how to change data type for columns of dataframe

2022-04-02 Thread Bjørn Jørgensen
https://sparkbyexamples.com/pyspark/pyspark-cast-column-type/
On Sat, 2 Apr 2022 at 04:10, ayan guha wrote:
> Please use cast. Also, I would strongly recommend going through the Spark docs;
> they are pretty good.
>
> On Sat, 2 Apr 2022 at 12:43 pm, wrote:
>
>> Hi
>>
>> I got a dataframe object from
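For readers of this thread, the "use cast" advice can be sketched roughly as below. This is only an illustration, not the original poster's code: the question is truncated above, so the helper name `cast_columns`, the `demo` function, and the `fruit`/`amount` columns are all made-up assumptions. It relies on `DataFrame.withColumn` and `Column.cast`, which accept a type name such as "int" given as a string.

```python
def cast_columns(df, type_map):
    """Return a new DataFrame with each column named in type_map cast to the given type.

    type_map maps a column name to a Spark SQL type name, e.g. {"amount": "int"}.
    The helper name and signature are hypothetical, chosen for this sketch.
    """
    for name, type_name in type_map.items():
        # withColumn with an existing column name replaces that column;
        # the input DataFrame itself is left unchanged.
        df = df.withColumn(name, df[name].cast(type_name))
    return df


def demo():
    # Needs a local PySpark installation; imported here so that the helper
    # above can be defined without it.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("cast-demo").getOrCreate()
    # Hypothetical sample data: both columns start out as strings.
    df = spark.createDataFrame([("apple", "2"), ("orange", "3")], ["fruit", "amount"])
    casted = cast_columns(df, {"amount": "int"})
    casted.printSchema()
    spark.stop()
```

Calling `demo()` on a machine with PySpark installed prints the schema after the cast. Note that values that cannot be converted typically become null rather than raising an error, so it is worth checking the result.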