Parquet is usually more efficient because of its columnar layout. If your
table has 10 columns but your join query touches only 3 of them, Parquet
reads just those 3 columns from disk, while Avro must deserialize every
full record.
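A toy sketch in plain Python (not real Parquet or Avro I/O; the table shape and column names col0..col9 are made up for illustration) showing why a columnar layout touches less data when a query needs only a few columns:

```python
# A table of 5 rows and 10 columns, as a list of records.
rows = [{f"col{i}": r * 10 + i for i in range(10)} for r in range(5)]

# Row-oriented (Avro-like): each record is read in full, so a query
# touching 3 columns still deserializes all 10 fields of every row.
row_values_read = sum(len(row) for row in rows)

# Column-oriented (Parquet-like): values are stored per column, so the
# reader can skip the 7 columns the query never references.
columns = {name: [row[name] for row in rows] for name in rows[0]}
needed = ["col0", "col3", "col7"]
col_values_read = sum(len(columns[name]) for name in needed)

print(row_values_read, col_values_read)  # 50 values vs 15 values read
```

The same ratio applies to disk I/O in a real Parquet scan, which is why column pruning helps most when queries reference a small subset of a wide schema.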
Cheng
On 6/5/15 3:00 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) wrote:
We currently have data in avro format and we do joins between avro and
sequence file data.
Will storing these datasets in Parquet make joins any faster ?
The dataset sizes are between 500 and 1000 GB.
--
Deepak