Re: Static partitioning in partitionBy()

2019-05-08 Thread Gourav Sengupta
some data skew problem but might work for you

> From: Burak Yavuz
> Sent: Tuesday, May 7, 2019 9:35:10 AM
> To: Shubham Chaurasia
> Cc: dev; user@spark.apache.org
> Subject: Re: Static partitioning in partitionBy()

Re: Static partitioning in partitionBy()

2019-05-08 Thread Shubham Chaurasia
> Sent: Tuesday, May 7, 2019 9:35:10 AM
> To: Shubham Chaurasia
> Cc: dev; user@spark.apache.org
> Subject: Re: Static partitioning in partitionBy()
>
> It depends on the data source. Delta Lake (https://delta.io) allows you
> to do it with the .option("replaceWhere",

Re: Static partitioning in partitionBy()

2019-05-07 Thread Felix Cheung
partitioning in partitionBy()

It depends on the data source. Delta Lake (https://delta.io) allows you to do it with the .option("replaceWhere", "c = c1"). With other file formats, you can write directly into the partition directory (tablePath/c=c1), but you lose atomicity.

On Tu

Re: Static partitioning in partitionBy()

2019-05-07 Thread Burak Yavuz
It depends on the data source. Delta Lake (https://delta.io) allows you to do it with the .option("replaceWhere", "c = c1"). With other file formats, you can write directly into the partition directory (tablePath/c=c1), but you lose atomicity.

On Tue, May 7, 2019, 6:36 AM Shubham Chaurasia
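
The two options in this reply could look roughly like the sketch below. It is an illustration only: the object and method names are hypothetical, the Delta Lake library is assumed to be on the classpath for the first option, and Parquet is assumed as the "other file format" for the second. The predicate quotes the value as a string literal (c = 'c1'), which is an assumption about the column type; the message itself writes it as "c = c1".

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

object StaticPartitionWrite {
  // Hypothetical helper: builds the Hive-style partition directory,
  // e.g. partitionDir("tablePath", "c", "c1") gives "tablePath/c=c1".
  def partitionDir(tablePath: String, column: String, value: String): String =
    s"${tablePath.stripSuffix("/")}/$column=$value"

  // Option 1 (assumes Delta Lake is available): replace only the data
  // matching the predicate. df should still contain the partition column,
  // and its rows are expected to satisfy the predicate.
  def overwriteWithReplaceWhere(
      df: DataFrame, tablePath: String, column: String, value: String): Unit =
    df.write
      .format("delta")
      .mode(SaveMode.Overwrite)
      .option("replaceWhere", s"$column = '$value'")
      .save(tablePath)

  // Option 2 (assumes Parquet): write straight into the partition directory.
  // This is not atomic. The partition column is dropped from the data
  // because its value is already encoded in the directory name.
  def overwritePartitionDir(
      df: DataFrame, tablePath: String, column: String, value: String): Unit =
    df.drop(column)
      .write
      .format("parquet")
      .mode(SaveMode.Overwrite)
      .save(partitionDir(tablePath, column, value))
}
```

With the second option, readers that load tablePath as a whole would normally recover column c through partition discovery, but a failed or concurrent write can leave the directory in a partial state, which is the loss of atomicity mentioned above.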

Static partitioning in partitionBy()

2019-05-07 Thread Shubham Chaurasia
Hi All,

Is there a way I can provide static partitions in partitionBy()?

Like:
df.write.mode("overwrite").format("MyDataSource").partitionBy("c=c1").save

The above code gives the following error, as it tries to find a column `c=c1` in df.

org.apache.spark.sql.AnalysisException: Partition column `c=c1`
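
For context, partitionBy() takes column names rather than name=value pairs, which is why Spark tries to resolve a column literally called `c=c1`. A minimal sketch of one way to get every row into a single partition value is to add the value as a constant column and partition by its name. The object and method names here are hypothetical, and Parquet stands in for "MyDataSource", whose behavior is not known from the thread.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.lit

object ConstantPartitionColumn {
  // Hypothetical helper: sets `column` to the constant `value` on every row,
  // then partitions the output by that column name, so all files land
  // under <path>/<column>=<value>.
  def writeWithConstantPartition(
      df: DataFrame, path: String, column: String, value: String): Unit =
    df.withColumn(column, lit(value))
      .write
      .mode(SaveMode.Overwrite)
      .format("parquet") // assumption: stands in for "MyDataSource"
      .partitionBy(column)
      .save(path)
}
```

Note that with mode("overwrite") Spark's built-in file sources replace the whole output path by default, not just the one partition, so this alone is not equivalent to a static partition overwrite; how a custom data source treats it depends on that source.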