I tried the following to explicitly specify the partition columns in the SQL
statement, and also tried different cases (upper and lower) for the partition columns.
insert overwrite table $tableName PARTITION(P1, P2) select A, B, C, P1, P2
from updateTable.
Still getting:
Caused by:
org.apache.hadoop.hive.ql.meta
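
For reference, a minimal sketch of how a statement like the one above might be issued from PySpark. It assumes `spark` is a Hive-enabled SparkSession and that the two Hive dynamic-partition settings shown are what the table needs; that is an assumption, not something confirmed in this thread. The helper names `build_insert_overwrite` and `run_insert_overwrite` are hypothetical; the column and view names come from the message.

```python
def build_insert_overwrite(table, data_cols, partition_cols, source_view):
    """Build a Hive-style dynamic-partition INSERT OVERWRITE statement.

    The partition columns are placed last in the SELECT list, in the same
    order as in the PARTITION clause.
    """
    partition_clause = ", ".join(partition_cols)
    select_list = ", ".join(list(data_cols) + list(partition_cols))
    return (
        f"INSERT OVERWRITE TABLE {table} "
        f"PARTITION ({partition_clause}) "
        f"SELECT {select_list} FROM {source_view}"
    )


def run_insert_overwrite(spark, data_frame, table, data_cols, partition_cols):
    """Hypothetical helper; `spark` is assumed to be a Hive-enabled SparkSession."""
    # Assumption: these Hive settings are needed when no partition value
    # is given statically in the PARTITION clause.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    # Expose the DataFrame to SQL under the view name used in the message.
    data_frame.createOrReplaceTempView("updateTable")
    statement = build_insert_overwrite(
        table, data_cols, partition_cols, "updateTable"
    )
    spark.sql(statement)
    return statement
```

Usage would look like `run_insert_overwrite(spark, dataFrame, tableName, ["A", "B", "C"], ["P1", "P2"])`. Keeping the string builder separate makes it easy to print the statement and compare the column order against the table definition.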
Thanks Koert. I'll check that out when we can update to 2.3
Meanwhile, I am trying a Hive SQL (INSERT OVERWRITE) statement to
overwrite multiple partitions (without losing existing ones).
It's giving me issues around partition columns.
dataFrame.createOrReplaceTempView("updateTable") /
This works for DataFrames with Spark 2.3 by changing a global setting, and
will be configurable per write in 2.4.
see:
https://issues.apache.org/jira/browse/SPARK-20236
https://issues.apache.org/jira/browse/SPARK-24860
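
A minimal PySpark sketch of the two variants mentioned above. It assumes the global setting is `spark.sql.sources.partitionOverwriteMode` (Spark 2.3+) and the per-write option is `partitionOverwriteMode` (Spark 2.4+); the function names are hypothetical, and `spark` / `data_frame` are assumed to be an existing SparkSession and DataFrame.

```python
def overwrite_touched_partitions(spark, data_frame, path, partition_col="day"):
    """Spark 2.3+ variant: session-wide setting, then a normal overwrite."""
    # With "dynamic", overwrite mode is expected to replace only the
    # partitions present in data_frame rather than the whole dataset.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
    data_frame.write.mode("overwrite").partitionBy(partition_col).parquet(path)


def overwrite_touched_partitions_per_write(data_frame, path, partition_col="day"):
    """Spark 2.4+ variant: the same behaviour requested for this write only."""
    (
        data_frame.write
        .option("partitionOverwriteMode", "dynamic")
        .mode("overwrite")
        .partitionBy(partition_col)
        .parquet(path)
    )
```

The per-write form avoids changing behaviour for other writes in the same session, which may matter in shared notebooks or long-running jobs.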
On Wed, Aug 1, 2018 at 3:11 PM, Nirav Patel wrote:
> Hi Peay,
>
> Have you fin
Hi Peay,
Have you found a better solution yet? I am having the same issue.
The following says it works with Spark 2.1 onward, but only when you use
sqlContext and not the DataFrame API:
https://medium.com/@anuvrat/writing-into-dynamic-partitions-using-spark-2e2b818a007a
Thanks,
Nirav
On Mon, Oct 2, 2017 at 4:37 AM,
If your processing task inherently processes input data by month you
may want to "manually" partition the output data by month as well as
by day, that is to save it with a file name including the given month,
i.e. "dataset.parquet/month=01". Then you will be able to use the
overwrite mode with each
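
A minimal sketch of the per-month layout suggested above, assuming the month is known to the job as an integer and the data is written as Parquet. The helper names `month_output_path` and `write_month` are hypothetical.

```python
def month_output_path(base_path, month):
    """Return the per-month directory, e.g. dataset.parquet/month=01."""
    return f"{base_path}/month={month:02d}"


def write_month(data_frame, base_path, month, partition_col="day"):
    """Write one month of data under its own month=MM directory."""
    target = month_output_path(base_path, month)
    # Overwrite mode only affects `target`, so directories written for
    # other months are left alone.
    data_frame.write.mode("overwrite").partitionBy(partition_col).parquet(target)
    return target
```

Usage would be along the lines of `write_month(data_frame, "dataset.parquet", 1)`, once per monthly job.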
As an alternative: checkpoint the DataFrame, collect the days, and then delete
the corresponding directories using Hadoop FileUtils, then write the dataframe
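
A rough sketch of that delete-then-write approach. To keep it self-contained it uses Python's `shutil` on a local filesystem instead of the Hadoop file utilities mentioned above, and `persist()` instead of a checkpoint; on HDFS the deletion step would need the Hadoop API. All function names are hypothetical.

```python
import os
import shutil


def partition_dirs(base_path, partition_col, values):
    """Directories Spark would use for the given partition values."""
    return [os.path.join(base_path, f"{partition_col}={v}") for v in values]


def delete_partitions(base_path, partition_col, values):
    """Remove existing partition directories; return the ones removed."""
    removed = []
    for directory in partition_dirs(base_path, partition_col, values):
        if os.path.isdir(directory):
            shutil.rmtree(directory)
            removed.append(directory)
    return removed


def replace_partitions(data_frame, base_path, partition_col="day"):
    """Delete the partitions present in data_frame, then append it."""
    # Keep the data around so it is not recomputed after the deletion
    # (stand-in for the checkpoint step).
    data_frame = data_frame.persist()
    rows = data_frame.select(partition_col).distinct().collect()
    values = [row[0] for row in rows]
    delete_partitions(base_path, partition_col, values)
    # Append mode leaves all other partitions untouched.
    data_frame.write.mode("append").partitionBy(partition_col).parquet(base_path)
    return values
```

Note that this is not atomic: if the job fails between the deletion and the write, the affected days are missing until it is rerun.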
On Fri, Sep 29, 2017 at 10:31 AM, peay wrote:
> Hello,
>
> I am trying to use data_frame.write.partitionBy("day").save("dataset.parquet")
> to write
Hello,
I am trying to use data_frame.write.partitionBy("day").save("dataset.parquet")
to write a dataset while splitting by day.
I would like to run a Spark job to process, e.g., a month:
dataset.parquet/day=2017-01-01/...
...
and then run another Spark job to add another month using the same