Re: Unable to insert overwrite table with Spark 1.5.2

2016-02-15 Thread Ted Yu
Do you mind trying Spark 1.6.0?

As far as I can tell, in branch-1.6 the 'Cannot overwrite table' exception
can only occur for CreateTableUsingAsSelect, when the source and
destination relations refer to the same table.
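
If upgrading is not an option right away, one workaround that is often
suggested (a sketch only, untested here; the staging table name is made up,
and PRODUCT_TABLE / productDimDF / hiveContext come from your snippet
below) is to break the read-write cycle by materializing the joined result
into a staging table before overwriting the original:

val stagingTable = PRODUCT_TABLE + "_staging"  // hypothetical name
// Materialize the joined result so the final overwrite no longer
// reads from product_dim itself.
productDimDF.write.mode(SaveMode.Overwrite).saveAsTable(stagingTable)
// Re-read from the staging table and overwrite the original table.
hiveContext.table(stagingTable)
  .write.mode(SaveMode.Overwrite)
  .insertInto(PRODUCT_TABLE)
// Drop the staging table once the overwrite has succeeded.
hiveContext.sql(s"DROP TABLE IF EXISTS $stagingTable")

The extra write costs one materialization, but the source of the final
insert is then the staging table rather than product_dim, so the
PreWriteCheck rule no longer sees the same table being read and written
at once.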

Cheers

On Sun, Feb 14, 2016 at 9:29 PM, Ramanathan R wrote:

> Hi All,
>
> Spark 1.5.2 does not seem to be backward compatible with functionality
> that was available in earlier versions, at least in 1.3.1 and 1.4.1: it
> is no longer possible to insert overwrite into an existing table that was
> initially read as a DataFrame.
>
> Our existing code base has a few internal Hive tables that are
> overwritten after some join operations.
>
> For example:
> val PRODUCT_TABLE = "product_dim"
> val productDimDF = hiveContext.table(PRODUCT_TABLE)
> // Joins, filters ...
> productDimDF.write.mode(SaveMode.Overwrite).insertInto(PRODUCT_TABLE)
>
> This results in the exception:
> org.apache.spark.sql.AnalysisException: Cannot overwrite table
> `product_dim` that is also being read from.;
> at org.apache.spark.sql.execution.datasources.PreWriteCheck.failAnalysis(rules.scala:82)
> at org.apache.spark.sql.execution.datasources.PreWriteCheck$$anonfun$apply$2.apply(rules.scala:155)
> at org.apache.spark.sql.execution.datasources.PreWriteCheck$$anonfun$apply$2.apply(rules.scala:85)
>
> Is there any configuration to disable this particular rule? Any pointers
> on how to solve this would be very helpful.
>
> Thanks,
> Ram
>


Unable to insert overwrite table with Spark 1.5.2

2016-02-14 Thread Ramanathan R
Hi All,

Spark 1.5.2 does not seem to be backward compatible with functionality that
was available in earlier versions, at least in 1.3.1 and 1.4.1: it is no
longer possible to insert overwrite into an existing table that was
initially read as a DataFrame.

Our existing code base has a few internal Hive tables that are overwritten
after some join operations.

For example:
val PRODUCT_TABLE = "product_dim"
val productDimDF = hiveContext.table(PRODUCT_TABLE)
// Joins, filters ...
productDimDF.write.mode(SaveMode.Overwrite).insertInto(PRODUCT_TABLE)

This results in the exception:
org.apache.spark.sql.AnalysisException: Cannot overwrite table
`product_dim` that is also being read from.;
at org.apache.spark.sql.execution.datasources.PreWriteCheck.failAnalysis(rules.scala:82)
at org.apache.spark.sql.execution.datasources.PreWriteCheck$$anonfun$apply$2.apply(rules.scala:155)
at org.apache.spark.sql.execution.datasources.PreWriteCheck$$anonfun$apply$2.apply(rules.scala:85)

Is there any configuration to disable this particular rule? Any pointers on
how to solve this would be very helpful.

Thanks,
Ram