Re: Review Request 56546: Allow user to update AVRO table schema via command even if table has external schema

2017-03-08 Thread Adam Szita

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56546/
---

(Updated March 8, 2017, 12:46 p.m.)


Review request for hive, Aihua Xu, Peter Vary, and Sergio Pena.


Changes
---

Removed automatic tblproperties removal and Hive conf key. User will have to 
take care of this manually if they want to alter the table.


Bugs: HIVE-13780
https://issues.apache.org/jira/browse/HIVE-13780


Repository: hive-git


Description
---

Allow user to update AVRO table schema via command even if table's definition 
was defined through schema file / literal in tblproperties


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
8801b4ef4686c654cbcc69e1241bbdf807afddbf 
  ql/src/test/queries/clientnegative/avro_add_column_extschema.q PRE-CREATION 
  ql/src/test/queries/clientpositive/avro_add_column_extschema.q PRE-CREATION 
  ql/src/test/results/clientnegative/avro_add_column_extschema.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/avro_add_column_extschema.q.out 
PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
f18585da1d108abdd500437362eb388b21030ec7 


Diff: https://reviews.apache.org/r/56546/diff/3/

Changes: https://reviews.apache.org/r/56546/diff/2-3/


Testing
---


Thanks,

Adam Szita



Re: Review Request 56546: Allow user to update AVRO table schema via command even if table has external schema

2017-03-03 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56546/#review167827
---




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 1207 (patched)


I think finally I understand the approach you are taking. :)

You are removing the avro literal/url so the avro schemas are inferred from 
the table structure.

For avro table definition, we can define hive schema and/or avro schema and 
we can have both or either of them. If avro schema exists, seems the hive 
schema will be ignored.

I feel the approach to drop literal/url will cause some issues since the 
user could create a table using avro schema. If the avro file generated from 
other tools changes, the users can just update the avro schema itself to 
reflect in hive table. With this patch, it will break such logic.

My suggestion to this fix is: just simply give an error to mention that you 
can't add columns if url/literal is used (such add column operation is 
supported with hive schema and without avro schema), just like replace columns. 

hive> alter table employees_avro replace columns (salary int);
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. Replace columns is not supported for 
table employees_avro. SerDe may be incompatible.

You need to use hive schema rather than avro schema to define the table in 
order to support alter table operations.


- Aihua Xu


On Feb. 22, 2017, 11:48 a.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56546/
> ---
> 
> (Updated Feb. 22, 2017, 11:48 a.m.)
> 
> 
> Review request for hive, Aihua Xu, Peter Vary, and Sergio Pena.
> 
> 
> Bugs: HIVE-13780
> https://issues.apache.org/jira/browse/HIVE-13780
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file / literal in tblproperties
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> 3777fa96ba8eac79e07e454ee437fb05583158c5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> adabe70fa8f0fe1b990c6ac578a14ff5af06fc93 
>   ql/src/test/queries/clientnegative/avro_add_column_extschema.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/avro_add_column_extschema.q PRE-CREATION 
>   ql/src/test/results/clientnegative/avro_add_column_extschema.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/avro_add_column_extschema.q.out 
> PRE-CREATION 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
> f18585da1d108abdd500437362eb388b21030ec7 
> 
> 
> Diff: https://reviews.apache.org/r/56546/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 56546: Allow user to update AVRO table schema via command even if table has external schema

2017-02-22 Thread Adam Szita


> On Feb. 21, 2017, 10:23 p.m., Aihua Xu wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1207
> > 
> >
> > Should the command like 'alter table add column' always alter the avro 
> > schema? Just wondering if it's necessary to add this configuration.
> > 
> > So what we will do is to update the schema file?

I think this has to be optional.
The goal of the patch is to enable alterting tables that were defined using 
avro schema url or literal. Currently this doesn't work in Hive.
If an avro table was created using a schema file, it will have to drop its 
avro.schema.url tbl property during alter table command. We cannot know if for 
example this file has any other use case, so we cannot touch it. "Detaching" 
the table schema from the schema file however might be unexpected to some 
users, hence the configurability.


> On Feb. 21, 2017, 10:23 p.m., Aihua Xu wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 3568
> > 
> >
> > This is avro specific call. I'm wondering if we can achieve this by 
> > something like, giving SerDe a function handleAddColumn(). Not sure if it 
> > makes sense. Can you take a look?

I see your point, however this a single case. I would rather make an abstract 
handleAddColumn() method if in the future we find that other serdes could make 
use of it.


- Adam


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56546/#review166239
---


On Feb. 22, 2017, 11:48 a.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56546/
> ---
> 
> (Updated Feb. 22, 2017, 11:48 a.m.)
> 
> 
> Review request for hive, Aihua Xu, Peter Vary, and Sergio Pena.
> 
> 
> Bugs: HIVE-13780
> https://issues.apache.org/jira/browse/HIVE-13780
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file / literal in tblproperties
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> 3777fa96ba8eac79e07e454ee437fb05583158c5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> adabe70fa8f0fe1b990c6ac578a14ff5af06fc93 
>   ql/src/test/queries/clientnegative/avro_add_column_extschema.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/avro_add_column_extschema.q PRE-CREATION 
>   ql/src/test/results/clientnegative/avro_add_column_extschema.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/avro_add_column_extschema.q.out 
> PRE-CREATION 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
> f18585da1d108abdd500437362eb388b21030ec7 
> 
> Diff: https://reviews.apache.org/r/56546/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 56546: Allow user to update AVRO table schema via command even if table has external schema

2017-02-22 Thread Adam Szita

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56546/
---

(Updated Feb. 22, 2017, 11:48 a.m.)


Review request for hive, Aihua Xu, Peter Vary, and Sergio Pena.


Changes
---

Addressed review comments: restored StorageDescription retrieval in DDLTask, 
extended Q test with avro.schema.url case


Bugs: HIVE-13780
https://issues.apache.org/jira/browse/HIVE-13780


Repository: hive-git


Description
---

Allow user to update AVRO table schema via command even if table's definition 
was defined through schema file / literal in tblproperties


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
3777fa96ba8eac79e07e454ee437fb05583158c5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
adabe70fa8f0fe1b990c6ac578a14ff5af06fc93 
  ql/src/test/queries/clientnegative/avro_add_column_extschema.q PRE-CREATION 
  ql/src/test/queries/clientpositive/avro_add_column_extschema.q PRE-CREATION 
  ql/src/test/results/clientnegative/avro_add_column_extschema.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/avro_add_column_extschema.q.out 
PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
f18585da1d108abdd500437362eb388b21030ec7 

Diff: https://reviews.apache.org/r/56546/diff/


Testing
---


Thanks,

Adam Szita



Re: Review Request 56546: Allow user to update AVRO table schema via command even if table has external schema

2017-02-21 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56546/#review166239
---




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (line 1207)


Should the command like 'alter table add column' always alter the avro 
schema? Just wondering if it's necessary to add this configuration.

So what we will do is to update the schema file?



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java (line 3561)


I guess it's for performance reason that we are not getting sd all the time 
since for some following cases, probably sd is not used. Can you double check?



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java (line 3568)


This is avro specific call. I'm wondering if we can achieve this by 
something like, giving SerDe a function handleAddColumn(). Not sure if it makes 
sense. Can you take a look?



ql/src/test/queries/clientpositive/avro_add_column_extschema.q (line 27)


Can you also print out show create table? I'm wondering if the schema 
literal gets updated? And also if we use avro.schema.url instead of literal, 
will it work as well?


- Aihua Xu


On Feb. 21, 2017, 9:52 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56546/
> ---
> 
> (Updated Feb. 21, 2017, 9:52 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Peter Vary, and Sergio Pena.
> 
> 
> Bugs: HIVE-13780
> https://issues.apache.org/jira/browse/HIVE-13780
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file / literal in tblproperties
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> b27b663b94f41a8250b79139ed9f7275b10cf9a3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> adabe70fa8f0fe1b990c6ac578a14ff5af06fc93 
>   ql/src/test/queries/clientnegative/avro_add_column_extschema.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/avro_add_column_extschema.q PRE-CREATION 
>   ql/src/test/results/clientnegative/avro_add_column_extschema.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/avro_add_column_extschema.q.out 
> PRE-CREATION 
>   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
> f18585da1d108abdd500437362eb388b21030ec7 
> 
> Diff: https://reviews.apache.org/r/56546/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Review Request 56546: Allow user to update AVRO table schema via command even if table has external schema

2017-02-10 Thread Adam Szita

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56546/
---

Review request for hive, Peter Vary and Sergio Pena.


Bugs: HIVE-13780
https://issues.apache.org/jira/browse/HIVE-13780


Repository: hive-git


Description
---

Allow user to update AVRO table schema via command even if table's definition 
was defined through schema file / literal in tblproperties


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
b27b663b94f41a8250b79139ed9f7275b10cf9a3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
adabe70fa8f0fe1b990c6ac578a14ff5af06fc93 
  ql/src/test/queries/clientnegative/avro_add_column_extschema.q PRE-CREATION 
  ql/src/test/queries/clientpositive/avro_add_column_extschema.q PRE-CREATION 
  ql/src/test/results/clientnegative/avro_add_column_extschema.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/avro_add_column_extschema.q.out 
PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
f18585da1d108abdd500437362eb388b21030ec7 

Diff: https://reviews.apache.org/r/56546/diff/


Testing
---


Thanks,

Adam Szita