RE: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread Naveen Kumar Pokala
From: tridib [mailto:tridib.sama...@live.com]
Sent: Tuesday, November 25, 2014 9:54 AM
To: u...@spark.incubator.apache.org
Subject: Control number of parquet generated from JavaSchemaRDD

Hello, I am reading around 1000 input files from disk in an RDD and generating parquet. It always produces

Re: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread Michael Armbrust
().setInt(parquet.block.size, MB_128);

No luck. Is there a way to control the size/number of parquet files generated?

Thanks
Tridib

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Control-number-of-parquet-generated-from-JavaSchemaRDD-tp19717.html Sent

Re: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread tridib
());
    JavaSchemaRDD claimSchemaRdd = sqlCtx.applySchema(claimRdd, Claim.class);
    claimSchemaRdd.coalesce(1)
    claimSchemaRdd.saveAsParquetFile(parquetPath);
}

Re: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread tridib
repartition(1) too. claimSchemaRdd.saveAsParquetFile(parquetPath); }

Re: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread Michael Armbrust
repartition(1) too. claimSchemaRdd.saveAsParquetFile(parquetPath); }

Re: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread tridib
Ohh...how can I miss that. :(. Thanks!
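
The reply that prompted this message is truncated in this listing, so what follows is an assumption about what was missed. In the earlier snippet the result of coalesce(1) is never assigned, and Spark transformations return a new dataset rather than changing the one they are called on. The sketch below illustrates that behaviour with a hypothetical stand-in class, PartitionedData, which is not a Spark class; it only models the idea that a save writes roughly one part file per partition.

```java
public class CoalesceSketch {

    /** Hypothetical stand-in for an immutable, partitioned dataset (not a Spark class). */
    static final class PartitionedData {
        private final int numPartitions;

        PartitionedData(int numPartitions) {
            this.numPartitions = numPartitions;
        }

        /** Returns a NEW dataset with at most the requested partition count; this object is unchanged. */
        PartitionedData coalesce(int targetPartitions) {
            return new PartitionedData(Math.min(targetPartitions, numPartitions));
        }

        /** Assumption for illustration: one output file per partition when saved. */
        int filesWrittenOnSave() {
            return numPartitions;
        }
    }

    public static void main(String[] args) {
        PartitionedData claims = new PartitionedData(1000);

        // Result discarded: claims still has 1000 partitions.
        claims.coalesce(1);
        System.out.println(claims.filesWrittenOnSave()); // prints 1000

        // Result kept: save from the returned dataset instead.
        PartitionedData single = claims.coalesce(1);
        System.out.println(single.filesWrittenOnSave()); // prints 1
    }
}
```

Applied to the code in the thread, the same idea would mean calling saveAsParquetFile on the value returned by coalesce(1) or repartition(1) rather than on the original claimSchemaRdd. The exact method overloads available on JavaSchemaRDD depend on the Spark version and are not confirmed here.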

Re: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread tridib

Re: Control number of parquet generated from JavaSchemaRDD

2014-11-25 Thread Michael Armbrust

Control number of parquet generated from JavaSchemaRDD

2014-11-24 Thread tridib