RE: Want to Add New Column in Avro Schema

2016-03-23 Thread Lunagariya, Dhaval
Thanks guys.

Just updated the .avsc and it’s done. No need to recreate the table.

Regards,
Dhaval




Re: Want to Add New Column in Avro Schema

2016-03-23 Thread Maulik Gandhi
Create table DDL looks right to me.

How are you updating *avro.schema.url*?

Thanks.
- Maulik



Re: Want to Add New Column in Avro Schema

2016-03-23 Thread Maulik Gandhi
You can run "describe tableName;" and check whether the newly added column
appears in the Hive table.

Thanks.
- Maulik
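
When reading the describe output it helps to know which columns should be new.
The following is a minimal sketch, assuming the old and new versions of the
.avsc are both available as parsed JSON. The schemas and field names (id,
new_col, amount) are hypothetical and not taken from this thread. It lists the
fields that exist only in the new schema and flags any that lack a default,
since Avro's schema resolution rules need a default on the reader side to read
records written before the field existed.

```python
def added_fields(old_schema, new_schema):
    """Return the field definitions present in new_schema but not in old_schema."""
    old_names = {f["name"] for f in old_schema["fields"]}
    return [f for f in new_schema["fields"] if f["name"] not in old_names]


def fields_missing_default(old_schema, new_schema):
    """Names of newly added fields that have no default value."""
    return [f["name"] for f in added_fields(old_schema, new_schema)
            if "default" not in f]


# Hypothetical schemas for illustration only.
old_avsc = {
    "type": "record",
    "name": "Test",
    "fields": [{"name": "id", "type": "string"}],
}
new_avsc = {
    "type": "record",
    "name": "Test",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "new_col", "type": ["null", "string"], "default": None},
        {"name": "amount", "type": "long"},  # no default: old files cannot supply it
    ],
}

expected_new_columns = [f["name"] for f in added_fields(old_avsc, new_avsc)]
risky_columns = fields_missing_default(old_avsc, new_avsc)
print(expected_new_columns)  # ['new_col', 'amount']
print(risky_columns)         # ['amount']
```

The describe output for the table in this thread would also list the partition
columns COL1 and COL2, which come from the DDL rather than from the .avsc.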




Re: Want to Add New Column in Avro Schema

2016-03-23 Thread Aaron . Dossett
And what happens if you simply update the .avsc file on HDFS?  Does 'describe 
test' show the new columns?




RE: Want to Add New Column in Avro Schema

2016-03-23 Thread Lunagariya, Dhaval
Here is the DDL.

DROP TABLE IF EXISTS TEST;

CREATE EXTERNAL TABLE TEST
PARTITIONED BY (
COL1 STRING,
COL2 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///data/hive/TEST'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/Test.avsc');

Thanks,
Dhaval

From: Aaron.Dossett [mailto:aaron.doss...@target.com]
Sent: Wednesday, March 23, 2016 6:50 PM
To: user@avro.apache.org
Cc: 'er.dcpa...@gmail.com'
Subject: Re: Want to Add New Column in Avro Schema

You shouldn't have to drop the table; just update the .avsc. Can you share the
DDL you used to create the table?




RE: Want to Add New Column in Avro Schema

2016-03-23 Thread Lunagariya, Dhaval
Yes. I made the required changes in the .avsc file, then dropped the table and
re-created it using the updated .avsc. But in that case I am not getting the
existing data.

Where am I going wrong? Can you throw some light on this?

Thanks,
Dhaval

From: Aaron.Dossett [mailto:aaron.doss...@target.com]
Sent: Wednesday, March 23, 2016 6:36 PM
To: user@avro.apache.org
Cc: 'er.dcpa...@gmail.com'
Subject: Re: Want to Add New Column in Avro Schema

If you create the external table by reference to the .avsc file (TBLPROPERTIES
('avro.schema.url'='hdfs://foo.avsc')), then all you have to do is update that
.avsc file in a compatible way and Hive should reflect the new schema. I've
been running this pattern in my production system for several months now.

-Aaron
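
One common reading of "in a compatible way" is to add the new column as an
optional field with a default, so that files written under the old schema can
still be read. A minimal sketch of that edit, assuming the .avsc is a single
record schema. The helper name add_nullable_field and the field names id and
new_col are hypothetical and not from this thread.

```python
import json


def add_nullable_field(schema, name, avro_type="string"):
    """Return a copy of a record schema with one extra optional field.

    The field is a ["null", <type>] union with default null; "null" comes
    first so that the null default matches the first branch of the union.
    """
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    if any(f["name"] == name for f in schema["fields"]):
        raise ValueError(f"field {name!r} already exists")
    field = {"name": name, "type": ["null", avro_type], "default": None}
    updated = dict(schema)
    updated["fields"] = list(schema["fields"]) + [field]
    return updated


# Hypothetical starting schema; a real one would be loaded from the .avsc file.
original = {
    "type": "record",
    "name": "Test",
    "fields": [{"name": "id", "type": "string"}],
}

updated = add_nullable_field(original, "new_col")
print(json.dumps(updated, indent=2))  # text to write back to the .avsc file
```

The result would then replace the file that avro.schema.url points to. Rows
from older files should show NULL in the new column.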




Want to Add New Column in Avro Schema

2016-03-23 Thread Lunagariya, Dhaval
Hey folks,

I want to add a new column to an existing Hive table. We created an external
Hive table with the help of an .avsc file. Now I want to add a new column to
that table.

How can I do that without disturbing any data already present in the table?

Please help.

Regards,
Dhaval