I have a question about Spark and how it delegates filters to a Parquet-based
table. I have two tables in Hive in Parquet format. Table1 has four columns
of type double and table2 has two columns of type double. I am doing an
INNER JOIN with the following:
SELECT table1.name FROM table1
Is there any limit on the number of columns used in an inner join?
Thank you
Anil Langote
Sent from my iPhone
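(A minimal sketch of how such a join might be written and how to check which
filters Spark pushes down to the Parquet scan; the column names and the
predicate here are hypothetical, since the original query is truncated above.)

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("parquet-pushdown-sketch")
  .enableHiveSupport()          // read the Hive-managed Parquet tables
  .getOrCreate()

// Hypothetical multi-column equi-join; Spark does not document any fixed
// limit on the number of equality conditions in a join.
val joined = spark.sql(
  """SELECT t1.name
    |FROM table1 t1
    |INNER JOIN table2 t2
    |  ON t1.c1 = t2.c1 AND t1.c2 = t2.c2
    |WHERE t1.c3 > 0.5
    |""".stripMargin)

// explain(true) prints the physical plan; a "PushedFilters" entry on the
// Parquet scan node shows which predicates were delegated to Parquet.
joined.explain(true)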
________________________________
From: Anil Langote <anillangote0...@gmail.com<mailto:anillangote0...@gmail.com>>
Sent: Thursday, October 19, 2017 5:01 PM
Subject: Spark Inner Join
Hi All,
I have a requirement to pivot multiple columns using a single pivot column;
the pivot API doesn't support doing that, hence I have been pivoting two
columns separately and then trying to merge the datasets, but the result is
an empty dataset. Below is the pseudo code:
Main dataset => 33 columns (30
... id.map(x => x._1)  // truncated in the original; presumably builds the moviesid pair RDD
val movienames = movienamesfile.map(x => x.split("\\|")).map(x => (x(0), x(1)))
// join() pairs the two RDDs by key; distinct() drops duplicate results
val movienamejoined = moviesid.join(movienames).distinct()
Thanks
Sri
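(For the pivot question above, a minimal sketch of pivoting two value columns
separately and joining the results back on the grouping key; the column names
and sample data are hypothetical, since the actual dataset is not shown. An
empty result after the merge usually means the join keys did not line up, so
checking the key column on both sides is a good first step.)

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("pivot-merge-sketch").getOrCreate()
import spark.implicits._

// Hypothetical input: one key column, one pivot column, two value columns.
val df = Seq(
  ("a", "x", 1.0, 10.0),
  ("a", "y", 2.0, 20.0),
  ("b", "x", 3.0, 30.0)
).toDF("id", "category", "v1", "v2")

// pivot() takes a single column, so pivot each value column separately...
val p1 = df.groupBy("id").pivot("category").agg(sum("v1"))
val p2 = df.groupBy("id").pivot("category").agg(sum("v2"))

// ...rename the pivoted columns to keep them distinct, then join on the key.
val p2renamed = p2.columns.filter(_ != "id")
  .foldLeft(p2)((d, c) => d.withColumnRenamed(c, c + "_v2"))
val merged = p1.join(p2renamed, Seq("id"))
merged.show()

Note that passing both aggregations to a single agg() call after pivot(),
e.g. .agg(sum("v1"), sum("v2")), produces all the pivoted columns in one pass
and avoids the join entirely.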