Question about Spark, Inner Join and Delegation to a Parquet Table

2018-07-02 Thread Mike Buck
I have a question about Spark and how it delegates filters to a Parquet-based table. I have two tables in Hive in Parquet format. Table1 has four columns of type double and table2 has two columns of type double. I am doing an INNER JOIN of the following: SELECT table1.name FROM table1
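Whether a filter is actually delegated (pushed down) to the Parquet scan can be inspected from the physical plan. A minimal sketch, assuming a SparkSession named `spark` with both Hive tables registered; the join key (`id`) and filter column (`col1`) are illustrative, since the original query is truncated:

```scala
// Sketch: check which predicates Spark delegates to the Parquet reader.
// Assumes a SparkSession `spark` with Hive support already created;
// `id` and `col1` are placeholder column names.
val joined = spark.sql(
  """SELECT table1.name
    |FROM table1
    |INNER JOIN table2 ON table1.id = table2.id
    |WHERE table1.col1 > 0.5""".stripMargin)

// In the physical plan, predicates delegated to Parquet appear on the
// scan node as "PushedFilters: [...]"; predicates evaluated by Spark
// itself appear in a separate Filter node.
joined.explain(true)
```

Note that the join condition itself is not pushed to Parquet; only single-table predicates (and, depending on version, partition pruning) are candidates for pushdown.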

Re: Spark Inner Join on pivoted datasets results empty dataset

2017-10-19 Thread Anil Langote
Is there any limit on the number of columns used in an inner join? Thank you Anil Langote Sent from my iPhone _ From: Anil Langote <anillangote0...@gmail.com> Sent: Thursday, October 19, 2017 5:01 PM Subject: Spark Inner Joi
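For what it's worth, there is no documented hard limit on the number of join columns: the `usingColumns` form of `Dataset.join` accepts an arbitrary sequence of names. A minimal sketch, assuming two existing DataFrames `left` and `right` with illustrative column names:

```scala
// Sketch: inner join on several columns at once (all names illustrative).
// Dataset.join(right, usingColumns, joinType) takes a Seq of any length,
// so the number of join columns is bounded by the schema, not the API.
val keys = Seq("col1", "col2", "col3", "col4", "col5")
val joined = left.join(right, keys, "inner")
```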

Spark Inner Join on pivoted datasets results empty dataset

2017-10-19 Thread Anil Langote
Hi All, I have a requirement to pivot multiple columns using a single column; the pivot API doesn't support doing that, hence I have been doing the pivot for two columns and then trying to merge the datasets, but the result is an empty dataset. Below is the pseudo code. Main dataset => 33 columns (30
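The workaround described (pivot each value column separately, then merge) can be sketched as follows, assuming a DataFrame `df` with a grouping column `key`, a pivot column `category`, and two value columns; all names are illustrative, since the pseudo code is truncated:

```scala
// Sketch of the described workaround: pivot each value column separately,
// then inner-join the results back together on the grouping key.
// All column names are illustrative.
val p1 = df.groupBy("key").pivot("category").sum("value1")
val p2 = df.groupBy("key").pivot("category").sum("value2")

// If this join comes back empty, compare the distinct `key` values of
// p1 and p2 first: nulls or differently formatted values in the grouping
// column are a common cause of an empty inner join.
val merged = p1.join(p2, Seq("key"), "inner")
```

One caveat: both pivots produce columns named after the `category` values, so the pivoted columns should be renamed (e.g. given a per-dataset prefix) before the join to avoid duplicate column names in the result.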

spark inner join

2015-10-24 Thread kali.tumm...@gmail.com
id.map(x => x._1)
val movienames = movienamesfile.map(x => x.split("\\|")).map(x => (x(0), x(1)))
val movienamejoined = moviesid.join(movienames).distinct()
Thanks Sri -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-inner-join-tp25193
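The snippet above is cut off at the start; a self-contained version of the same RDD inner-join pattern might look like the following. The file paths, delimiters, and field positions are assumptions, as they are not shown in the truncated message:

```scala
// Sketch of the full pattern (paths, delimiters, field positions assumed).
// `sc` is an existing SparkContext.
val moviesidfile   = sc.textFile("ratings.dat")  // e.g. "userId\tmovieId\trating"
val movienamesfile = sc.textFile("movies.dat")   // e.g. "movieId|title|..."

val moviesid   = moviesidfile.map(x => x.split("\t")).map(x => (x(1), x(0)))
val movienames = movienamesfile.map(x => x.split("\\|")).map(x => (x(0), x(1)))

// RDD.join is an inner join on the key (the movie id here): keys present
// in only one of the two RDDs are dropped from the result.
val movienamejoined = moviesid.join(movienames).distinct()
```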