Re: Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Benjamin Kim
Mich, yes, it is already partitioned. In Hive, I can do this: INSERT OVERWRITE TABLE amo_bi_events PARTITION (dt) SELECT event_type, timestamp, …, concat(substring(timestamp, 1, 10), ' ', substring(timestamp, 12, 2), ':00:00') AS dt FROM amo_raw_events WHERE to_date(timestamp_iso) BETWEEN
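
To make the dt expression in the message above concrete, here is a minimal sketch in plain Scala (no Spark or Hive needed) of what the concat/substring combination computes. The object name DeriveDt, the helper hourBucket, and the sample timestamp are hypothetical, and the sketch assumes the timestamp column holds ISO-8601-style strings such as 2016-06-03T10:33:07Z, which the thread does not confirm.

```scala
object DeriveDt {
  // Mirrors concat(substring(ts, 1, 10), ' ', substring(ts, 12, 2), ':00:00').
  // SQL substring positions are 1-based; Scala String.substring is 0-based
  // with an exclusive end index, hence the shifted offsets below.
  def hourBucket(timestamp: String): String = {
    val date = timestamp.substring(0, 10)  // SQL: substring(timestamp, 1, 10)
    val hour = timestamp.substring(11, 13) // SQL: substring(timestamp, 12, 2)
    s"$date $hour:00:00"
  }

  def main(args: Array[String]): Unit = {
    // Assumed input format; prints "2016-06-03 10:00:00"
    println(hourBucket("2016-06-03T10:33:07Z"))
  }
}
```

Every event within the same hour maps to the same dt value, so each hour becomes one partition.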

Re: Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Mich Talebzadeh
OK, fine, but dt is the column used for partitioning the table. This is what I get in Hive itself: use test; set hive.exec.dynamic.partition=true; set hive.exec.dynamic.partition.mode=nonstrict; drop table if exists amo_bi_events; CREATE EXTERNAL TABLE `amo_bi_events`( `event_type` string

Re: Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Benjamin Kim
Mich, I am using .withColumn to add another column “dt” that is a reformatted version of an existing column “timestamp”. The partition column is “dt”. We are using Spark 1.6.0 in CDH 5.7.0. Thanks, Ben

Re: Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Mich Talebzadeh
What version of Spark are you using? Dr Mich Talebzadeh, LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw, http://talebzadehmich.wordpress.com

Re: Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Mich Talebzadeh
OK, what is the new column called? You are basically adding a new column to an already existing table.

Re: Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Benjamin Kim
The table already exists. CREATE EXTERNAL TABLE `amo_bi_events`( `event_type` string COMMENT '', `timestamp` string COMMENT '', `event_valid` int COMMENT

Re: Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Mich Talebzadeh
Hang on, are you saving this as a new table?

Save to a Partitioned Table using a Derived Column

2016-06-03 Thread Benjamin Kim
Does anyone know how to save data in a DataFrame to a table partitioned by a column that is derived by reformatting an existing column? val partitionedDf = df.withColumn("dt", concat(substring($"timestamp", 1, 10), lit(" "), substring($"timestamp", 12, 2), lit(":00")))
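
One possible way to approach the question above on Spark 1.6 is to register the DataFrame as a temporary table and run the same kind of dynamic-partition INSERT that works in Hive. The following is only a sketch under several assumptions that the thread does not confirm: the source DataFrame is read from amo_raw_events, its column names match those of amo_bi_events, Hive lists the partition column dt last in the target schema, and the object name and the staging table name amo_bi_events_staging are hypothetical. It needs a Spark build with Hive support and is not the method the thread participants settled on.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.functions.{col, concat, lit, substring}
import org.apache.spark.sql.hive.HiveContext

object SaveWithDerivedPartition {
  def main(args: Array[String]): Unit = {
    // Master URL and other settings are expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("SaveWithDerivedPartition"))
    val hiveContext = new HiveContext(sc)

    // Same dynamic-partition settings as in the Hive session quoted in the thread.
    hiveContext.setConf("hive.exec.dynamic.partition", "true")
    hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

    // Assumption: the input DataFrame comes from the raw events table.
    val df = hiveContext.table("amo_raw_events")

    // Derived partition column, exactly as in the original question.
    val partitionedDf = df.withColumn("dt",
      concat(substring(col("timestamp"), 1, 10), lit(" "),
             substring(col("timestamp"), 12, 2), lit(":00")))

    // Reorder columns to match the target table so that SELECT * lines up,
    // with the partition column last (assumed schema layout).
    val targetColumns = hiveContext.table("amo_bi_events").columns
    partitionedDf
      .select(targetColumns.map(c => col(c)): _*)
      .registerTempTable("amo_bi_events_staging") // hypothetical name

    // Dynamic-partition insert: Hive routes each row by its dt value.
    hiveContext.sql(
      "INSERT OVERWRITE TABLE amo_bi_events PARTITION (dt) " +
      "SELECT * FROM amo_bi_events_staging")

    sc.stop()
  }
}
```

Going through SQL keeps the behavior close to the Hive statement quoted elsewhere in the thread. Note that the question builds dt with a ":00" suffix while the Hive statement uses ':00:00', so the two would produce different partition values unless one is adjusted.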