RE: Broadcasting a parquet file using spark and python

2015-12-07 Thread Shuai Zheng
ril 01, 2015 2:01 PM To: Jitesh chandra Mishra Cc: user Subject: Re: Broadcasting a parquet file using spark and python You will need to create a hive parquet table that points to the data and run "ANALYZE TABLE tableName noscan" so that we have statistics on the size. On Tue, Mar

Re: Broadcasting a parquet file using spark and python

2015-12-05 Thread Michael Armbrust
or roadmap for this feature? Regards, Shuai From: Michael Armbrust [mailto:mich...@databricks.com] Sent: Wednesday, April 01, 2015 2:01 PM To: Jitesh chandra Mishra Cc: user Subject: Re: Broadcasting a pa

RE: Broadcasting a parquet file using spark and python

2015-12-04 Thread Shuai Zheng
From: Michael Armbrust [mailto:mich...@databricks.com] Sent: Wednesday, April 01, 2015 2:01 PM To: Jitesh chandra Mishra Cc: user Subject: Re: Broadcasting a parquet file using spark and python You will need to create a hive parquet table that points to the data and run "ANALYZE

Re: Broadcasting a parquet file using spark and python

2015-04-01 Thread Michael Armbrust
You will need to create a hive parquet table that points to the data and run ANALYZE TABLE tableName noscan so that we have statistics on the size. On Tue, Mar 31, 2015 at 9:36 PM, Jitesh chandra Mishra jitesh...@gmail.com wrote: Hi Michael, Thanks for your response. I am running 1.2.1. Is
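The suggestion above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the table name, column list, and location are hypothetical, `STORED AS PARQUET` assumes a Hive version that supports that clause, and the ANALYZE statement is written in the longer form given in the Spark SQL documentation (`COMPUTE STATISTICS noscan`) rather than the shorthand quoted in the message. The helper only builds and submits SQL strings; `hive_context` is assumed to be a `pyspark.sql.HiveContext` or anything else with a `sql()` method.

```python
def hive_parquet_statements(table_name, columns, location):
    """Build the two statements: an external Hive table over existing
    parquet files, then a metadata-only statistics refresh.

    columns is a list of (name, hive_type) pairs; all names here are
    hypothetical and chosen for illustration only.
    """
    cols = ", ".join("{0} {1}".format(name, typ) for name, typ in columns)
    create = (
        "CREATE EXTERNAL TABLE IF NOT EXISTS {0} ({1}) "
        "STORED AS PARQUET LOCATION '{2}'"
    ).format(table_name, cols, location)
    # noscan asks for size statistics without reading the data files.
    analyze = "ANALYZE TABLE {0} COMPUTE STATISTICS noscan".format(table_name)
    return [create, analyze]


def register_parquet_table(hive_context, table_name, columns, location):
    """Run the statements through a context object exposing sql()."""
    for statement in hive_parquet_statements(table_name, columns, location):
        hive_context.sql(statement)


# Usage against a real cluster (not executed here):
#   from pyspark.sql import HiveContext
#   hc = HiveContext(sc)
#   register_parquet_table(hc, "small_dim",
#                          [("id", "INT"), ("name", "STRING")],
#                          "/data/small_dim")
```

Once the size statistics are in the metastore, the planner has what it needs to consider the table for a broadcast join.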

Re: Broadcasting a parquet file using spark and python

2015-03-31 Thread Jitesh chandra Mishra
Hi Michael, Thanks for your response. I am running 1.2.1. Is there any workaround to achieve the same with 1.2.1? Thanks, Jitesh On Wed, Apr 1, 2015 at 12:25 AM, Michael Armbrust mich...@databricks.com wrote: In Spark 1.3 I would expect this to happen automatically when the parquet table is

Re: Broadcasting a parquet file using spark and python

2015-03-31 Thread Michael Armbrust
In Spark 1.3 I would expect this to happen automatically when the parquet table is small (< 10mb, configurable with spark.sql.autoBroadcastJoinThreshold). If you are running 1.3 and not seeing this, can you show the code you are using to create the table? On Tue, Mar 31, 2015 at 3:25 AM, jitesh129
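As a rough sketch of the threshold mentioned above: the check below is a simplified model of the size comparison, not Spark's actual planner code, and the helper names are hypothetical. The only real API call is `setConf(key, value)`, which `SQLContext` exposes in PySpark 1.3 and later. The 10 MB default and the use of a negative value to turn broadcasting off follow the Spark SQL configuration documentation.

```python
BROADCAST_KEY = "spark.sql.autoBroadcastJoinThreshold"
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024  # documented default, 10 MB


def would_auto_broadcast(table_size_bytes,
                         threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Simplified model: broadcast only when the table's known size fits
    under the threshold; a negative threshold disables broadcasting."""
    if threshold_bytes < 0:
        return False
    return table_size_bytes <= threshold_bytes


def set_broadcast_threshold(sql_context, threshold_bytes):
    """Set the threshold on a context exposing setConf(key, value)."""
    sql_context.setConf(BROADCAST_KEY, str(threshold_bytes))


# Usage against a real cluster (not executed here):
#   from pyspark.sql import SQLContext
#   sqlContext = SQLContext(sc)
#   set_broadcast_threshold(sqlContext, 50 * 1024 * 1024)  # raise to 50 MB
```

The comparison only helps when Spark knows the table size, which is why the statistics matter for tables whose size cannot be inferred.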