Dima Zhiyanov created SPARK-6072:
Summary: Enable hash joins for nullable columns
Key: SPARK-6072
URL: https://issues.apache.org/jira/browse/SPARK-6072
Project: Spark
Issue Type: Improvement
Hello
Question regarding the new DataFrame API introduced here
https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html
I oftentimes use the zipWithUniqueId method of the SchemaRDD (as an RDD) to
replace string keys with more efficient long keys.
Hello
Has Spark implemented computing statistics for Parquet files? Or is there
any other way I can enable broadcast joins between parquet file RDDs in
Spark Sql?
Thanks
Dima
--
View this message in context:
Sent from my iPhone
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Hello
Has Spark implemented computing statistics for Parquet files? Or is there
any other way I can enable broadcast joins between parquet file RDDs in
Spark Sql?
Thanks
Dima
--
View this message in context:
Hello
Has Spark implemented computing statistics for Parquet files? Or is there
any other way I can enable broadcast joins between parquet file RDDs in
Spark Sql?
Thanks
Dima
--
View this message in context:
Hello
Has Spark implemented computing statistics for Parquet files? Or is there
any other way I can enable broadcast joins between parquet file RDDs in
Spark Sql?
Thanks
Dima
--
View this message in context:
://search-hadoop.com/m/JW1q5BZhf92
On Wed, Feb 11, 2015 at 3:04 PM, Dima Zhiyanov dimazhiya...@gmail.com
wrote:
Hello
Has Spark implemented computing statistics for Parquet files? Or is there
any other way I can enable broadcast joins between parquet file RDDs in
Spark Sql?
Thanks
Dima
I am also experiencing this kryo buffer problem. My join is left outer with
under 40mb on the right side. I would expect the broadcast join to succeed
in this case (hive did)
Another problem is that the optimizer
chose nested loop join for some reason
I would expect broadcast (map side) hash
Yes
Sent from my iPhone
On Aug 5, 2014, at 7:38 AM, Dima Zhiyanov [via Apache Spark User List]
ml-node+s1001560n11432...@n3.nabble.com wrote:
I am also experiencing this kryo buffer problem. My join is left outer with
under 40mb on the right side. I would expect the broadcast join
10 matches
Mail list logo