[jira] [Updated] (HIVE-17845) insert fails if target table columns are not lowercase

2017-10-19 Thread Naresh P R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-17845:
--
Status: Patch Available  (was: In Progress)

> insert fails if target table columns are not lowercase
> --
>
> Key: HIVE-17845
> URL: https://issues.apache.org/jira/browse/HIVE-17845
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HIVE-17845.patch
>
>
> eg., 
> INSERT INTO TABLE EMP(ID,NAME) select * FROM SRC;
> FAILED: SemanticException 1:27 '[ID,NAME]' in insert schema specification are 
> not found among regular columns of default.EMP nor dynamic partition 
> columns.. Error encountered near token 'NAME'
> Whereas below insert is successful:
> INSERT INTO TABLE EMP(id,name) select * FROM SRC;
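[Editor's sketch] The failure above comes down to case-sensitive matching of the insert schema against the table's column metadata, which Hive stores in lower case. A minimal sketch of the case-insensitive resolution the fix needs (function and variable names are hypothetical, not Hive's actual code):

```python
def resolve_insert_columns(specified, table_columns):
    """Match user-specified insert columns to table columns, ignoring case."""
    by_lower = {c.lower(): c for c in table_columns}
    resolved = []
    for col in specified:
        match = by_lower.get(col.lower())
        if match is None:
            raise ValueError(
                f"'{col}' not found among regular columns {table_columns}")
        resolved.append(match)
    return resolved

# Both spellings now resolve to the same target columns:
assert resolve_insert_columns(["ID", "NAME"], ["id", "name"]) == ["id", "name"]
assert resolve_insert_columns(["id", "name"], ["id", "name"]) == ["id", "name"]
```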



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17845) insert fails if target table columns are not lowercase

2017-10-19 Thread Naresh P R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-17845:
--
Status: In Progress  (was: Patch Available)

> insert fails if target table columns are not lowercase
> --
>
> Key: HIVE-17845
> URL: https://issues.apache.org/jira/browse/HIVE-17845
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HIVE-17845.patch
>
>
> eg., 
> INSERT INTO TABLE EMP(ID,NAME) select * FROM SRC;
> FAILED: SemanticException 1:27 '[ID,NAME]' in insert schema specification are 
> not found among regular columns of default.EMP nor dynamic partition 
> columns.. Error encountered near token 'NAME'
> Whereas below insert is successful:
> INSERT INTO TABLE EMP(id,name) select * FROM SRC;





[jira] [Assigned] (HIVE-17863) Vectorization: Two Q files produce wrong PTF query results

2017-10-19 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-17863:
---


> Vectorization: Two Q files produce wrong PTF query results
> --
>
> Key: HIVE-17863
> URL: https://issues.apache.org/jira/browse/HIVE-17863
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> vector_windowing_multipartitioning.q
> vector_windowing_order_null.q





[jira] [Commented] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212089#comment-16212089
 ] 

Xuefu Zhang commented on HIVE-16601:


Thanks for the update. Personally, I like the way the app name is formatted. 
However, the job group portion is less readable. It would be great to format 
the job group in a similar way to the app name. (Instead of just "", maybe 
we can have "query_id="). Thoughts?
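[Editor's sketch] For illustration, the naming scheme under discussion could look like the following (the exact format strings are suggestions from this thread, not the committed behavior):

```python
def format_app_name(session_id):
    # App name carries the Hive session id, as in the attached screenshots.
    return f"Hive on Spark (sessionId = {session_id})"

def format_job_group(query_id):
    # Label the job group as "query_id=..." instead of the bare id.
    return f"query_id={query_id}"

assert format_job_group("hive_20171019_q1") == "query_id=hive_20171019_q1"
```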

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, 
> HIVE-16601.6.patch, Spark UI Applications List.png, Spark UI Jobs List.png
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Attachment: HIVE-17458.03.patch

patch 3 - checkpoint
It can fully vectorize reads of original files in the absence of delete events;
otherwise it falls back to VectorizedOrcAcidRowReader.

If the request is using LLAP and needs ROW__IDs projected, it will fail ungracefully.


> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()
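[Editor's sketch] The offset calculation described above can be sketched as follows: sum the row counts of the original files that precede the split's file within the same logical tranche/bucket (the function name and the (path, num_rows) shape are illustrative, not Hive's API):

```python
def row_offset_for_split(files_in_bucket, split_path):
    """files_in_bucket: ordered (path, num_rows) pairs for the original files.

    Returns the starting row number for the given file, i.e. the count of
    all rows in files that come before it in the bucket's ordering.
    """
    offset = 0
    for path, num_rows in files_in_bucket:
        if path == split_path:
            return offset
        offset += num_rows
    raise ValueError(f"{split_path} not found in bucket")

files = [("000000_0", 100), ("000000_0_copy_1", 50), ("000000_0_copy_2", 75)]
assert row_offset_for_split(files, "000000_0_copy_2") == 150
```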





[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Status: Patch Available  (was: Open)

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()





[jira] [Commented] (HIVE-17771) Implement commands to manage resource plan.

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212017#comment-16212017
 ] 

Sergey Shelukhin commented on HIVE-17771:
-

+1 pending tests.


> Implement commands to manage resource plan.
> ---
>
> Key: HIVE-17771
> URL: https://issues.apache.org/jira/browse/HIVE-17771
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17771.01.patch, HIVE-17771.02.patch, 
> HIVE-17771.03.patch
>
>
> Please see parent jira about llap workload management.
> This jira is to implement create and show resource plan commands in hive to 
> configure resource plans for llap workload. The following are the proposed 
> commands implemented as part of the jira:
> CREATE RESOURCE PLAN plan_name WITH QUERY_PARALLELISM parallelism;
> SHOW RESOURCE PLAN plan_name;
> SHOW RESOURCE PLANS;
> ALTER RESOURCE PLAN plan_name SET QUERY_PARALLELISM = parallelism;
> ALTER RESOURCE PLAN plan_name RENAME TO new_name;
> ALTER RESOURCE PLAN plan_name ACTIVATE;
> ALTER RESOURCE PLAN plan_name DISABLE;
> ALTER RESOURCE PLAN plan_name ENABLE;
> DROP RESOURCE PLAN;
> It will be followed up with more jiras to manage pools, triggers and copy 
> resource plans.
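[Editor's sketch] As a rough illustration of the command surface (the real syntax is defined in Hive's SQL grammar; the regexes below are only a sketch for recognizing the ALTER variants listed above):

```python
import re

# Hypothetical recognizer for a few of the proposed ALTER RESOURCE PLAN forms.
ALTER_PATTERNS = {
    "set_parallelism": re.compile(
        r"ALTER RESOURCE PLAN (\w+) SET QUERY_PARALLELISM = (\d+)", re.I),
    "rename": re.compile(r"ALTER RESOURCE PLAN (\w+) RENAME TO (\w+)", re.I),
    "activate": re.compile(r"ALTER RESOURCE PLAN (\w+) ACTIVATE", re.I),
}

def classify(stmt):
    """Return the name of the first pattern that matches the whole statement."""
    body = stmt.strip().rstrip(";")
    for name, pattern in ALTER_PATTERNS.items():
        if pattern.fullmatch(body):
            return name
    return None

assert classify("ALTER RESOURCE PLAN plan1 RENAME TO plan2;") == "rename"
```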





[jira] [Updated] (HIVE-17817) Stabilize crossproduct warning message output order

2017-10-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17817:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Zoltan!

> Stabilize crossproduct warning message output order
> ---
>
> Key: HIVE-17817
> URL: https://issues.apache.org/jira/browse/HIVE-17817
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-17817.01.patch, HIVE-17817.02.patch
>
>
> {{CrossProductCheck}} warning printout sometimes happens in reverse order, 
> which reduces people's confidence in the test's reliability.
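[Editor's sketch] A common way to stabilize this kind of nondeterministic output, sketched here (not necessarily the approach taken in the patch), is to collect the warnings and emit them in sorted order:

```python
def stable_warnings(warnings):
    """Deduplicate and order warning messages so q-file output is deterministic."""
    return sorted(set(warnings))

# Input (traversal) order no longer matters:
assert stable_warnings(["Warning: b", "Warning: a"]) == ["Warning: a", "Warning: b"]
```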





[jira] [Updated] (HIVE-17607) remove ColumnStatsDesc usage from columnstatsupdatetask

2017-10-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17607:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Gergely.

> remove ColumnStatsDesc usage from columnstatsupdatetask
> ---
>
> Key: HIVE-17607
> URL: https://issues.apache.org/jira/browse/HIVE-17607
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Gergely Hajós
> Fix For: 3.0.0
>
> Attachments: HIVE-17607.1.patch, HIVE-17607.2.patch, 
> HIVE-17607.3.patch
>
>
> it's not entirely connected to this task... it should either have its own 
> descriptor, or the work should take on the tablename/coltype/colname payload





[jira] [Updated] (HIVE-17578) Create a TableRef object for Table/Partition

2017-10-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17578:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Gergely.

> Create a TableRef object for Table/Partition
> 
>
> Key: HIVE-17578
> URL: https://issues.apache.org/jira/browse/HIVE-17578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Gergely Hajós
> Fix For: 3.0.0
>
> Attachments: HIVE-17578.1.patch
>
>
> a quick {{git grep DbName | grep -i TableName}} uncovers quite a lot of places 
> where the fully qualified {{dbname.tablename}} is being produced, 
> and most of the time the Table object is also present, which might as well 
> have a method to serve a tableref.
> There might be some hidden bugs because of this...because at some places the 
> fully qualified table name is produced earlier...
> example callsite:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/hooks/UpdateInputAccessTimeHook.java#L63
> and called method:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L620
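[Editor's sketch] The proposed TableRef could be as small as the following sketch (names are hypothetical): a single object owning the db/table pair, so call sites stop concatenating {{dbname.tablename}} by hand:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TableRef:
    """Immutable reference to a table; the single source of the qualified name."""
    db_name: str
    table_name: str

    def qualified_name(self) -> str:
        return f"{self.db_name}.{self.table_name}"

ref = TableRef("default", "emp")
assert ref.qualified_name() == "default.emp"
```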





[jira] [Updated] (HIVE-17473) implement workload management pools

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17473:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the reviews!

> implement workload management pools
> ---
>
> Key: HIVE-17473
> URL: https://issues.apache.org/jira/browse/HIVE-17473
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-17473.01.patch, HIVE-17473.03.patch, 
> HIVE-17473.04.patch, HIVE-17473.patch
>
>






[jira] [Commented] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-19 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211991#comment-16211991
 ] 

Sahil Takiar commented on HIVE-16601:
-

[~xuefuz], attached updated screenshots.

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, 
> HIVE-16601.6.patch, Spark UI Applications List.png, Spark UI Jobs List.png
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-19 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16601:

Attachment: Spark UI Applications List.png
Spark UI Jobs List.png

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, 
> HIVE-16601.6.patch, Spark UI Applications List.png, Spark UI Jobs List.png
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-19 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16601:

Attachment: (was: Spark UI Jobs List.png)

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, HIVE-16601.6.patch
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-19 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16601:

Attachment: (was: Spark UI Applications List.png)

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, HIVE-16601.6.patch
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Commented] (HIVE-10378) Hive Update statement set keyword work with lower case only and doesn't give any error if wrong column name specified in the set clause.

2017-10-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211976#comment-16211976
 ] 

Eugene Koifman commented on HIVE-10378:
---

[~osayankin] could you include a test with the fix?

> Hive Update statement set keyword work with lower case only and doesn't give 
> any error if wrong column name specified in the set clause.
> 
>
> Key: HIVE-10378
> URL: https://issues.apache.org/jira/browse/HIVE-10378
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0, 1.1.0
> Environment: Hadoop: 2.6.0
> Hive : 1.0.0/1.1.0
> OS:Linux
>Reporter: Vineet Kandpal
>Assignee: Oleksiy Sayankin
> Fix For: 2.3.2
>
> Attachments: HIVE-10378.1.patch
>
>
> Brief: The Hive UPDATE statement's SET keyword works with lower case only and 
> doesn't give any error if a wrong column name is specified in the SET clause.
> Steps to reproduce: 
> following are the steps performed for the same:
> 1. Create Table with transactional properties.
> create table customer(id int ,name string, email string) clustered by (id) 
> into 2 buckets stored as orc TBLPROPERTIES('transactional'='true')
> 2. Insert data into transactional table:
> insert into table customer values 
> (1,'user1','us...@user1.com'),(2,'user2','us...@user1.com'),(3,'user3','us...@gmail.com')
> 3. Search result:
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  | customer.email   |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.299 seconds)
> 4. Update a column with a SET clause. Below, the column name is used in 
> UPPER case (NAME) and the column value is not updated:
> 0: jdbc:hive2://localhost:1> update  customer set  NAME  = 
> 'notworking'   where id = 1;
> INFO  : Table default.customer stats: [numFiles=10, numRows=3, 
> totalSize=6937, rawDataSize=0]
> No rows affected (20.343 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  | customer.email   |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.321 seconds)
> 5. Update a column with a SET clause. Below, the column name is used in 
> LOWER case (name) and the column value is updated:
> 0: jdbc:hive2://localhost:1> update  customer set  name  = 'working'  
>  where id = 1;
> INFO  : Table default.customer stats: [numFiles=11, numRows=3, 
> totalSize=7699, rawDataSize=0]
> No rows affected (19.74 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  | customer.email   |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | working        | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.333 seconds)
> 0: jdbc:hive2://localhost:1>
> 6. We have also seen that if we put an incorrect column name in the SET clause 
> of the update statement, it accepts the query and executes the job. There should 
> be validation of the column name used in the SET clause.
> 0: jdbc:hive2://localhost:1> update  customer set  name_44  = 
> 'working'   where id = 1;
>  
>  
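[Editor's sketch] The missing validation in step 6, together with the case handling in steps 4 and 5, amounts to the following sketch (helper names are illustrative): resolve SET-clause columns case-insensitively against the table schema and reject unknown names instead of silently doing nothing:

```python
def validate_set_columns(set_columns, table_columns):
    """Lower-case SET-clause columns and fail fast on unknown names."""
    known = {c.lower() for c in table_columns}
    for col in set_columns:
        if col.lower() not in known:
            raise ValueError(f"Unknown column '{col}' in SET clause")
    return [c.lower() for c in set_columns]

# "NAME" and "name" both resolve; a bogus column raises instead of no-opping:
assert validate_set_columns(["NAME"], ["id", "name", "email"]) == ["name"]
```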





[jira] [Commented] (HIVE-17578) Create a TableRef object for Table/Partition

2017-10-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211975#comment-16211975
 ] 

Ashutosh Chauhan commented on HIVE-17578:
-

+1

> Create a TableRef object for Table/Partition
> 
>
> Key: HIVE-17578
> URL: https://issues.apache.org/jira/browse/HIVE-17578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Gergely Hajós
> Attachments: HIVE-17578.1.patch
>
>
> a quick {{git grep DbName | grep -i TableName}} uncovers quite a lot of places 
> where the fully qualified {{dbname.tablename}} is being produced, 
> and most of the time the Table object is also present, which might as well 
> have a method to serve a tableref.
> There might be some hidden bugs because of this...because at some places the 
> fully qualified table name is produced earlier...
> example callsite:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/hooks/UpdateInputAccessTimeHook.java#L63
> and called method:
> https://github.com/apache/hive/blob/266c50554b86462d8b5ac84d28a2b237b8dbfa7e/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L620





[jira] [Resolved] (HIVE-17862) Update copyright date in NOTICE

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez resolved HIVE-17862.

Resolution: Fixed

> Update copyright date in NOTICE
> ---
>
> Key: HIVE-17862
> URL: https://issues.apache.org/jira/browse/HIVE-17862
> Project: Hive
>  Issue Type: Task
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Trivial
> Fix For: 2.3.1
>
>






[jira] [Assigned] (HIVE-17862) Update copyright date in NOTICE

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-17862:
--


> Update copyright date in NOTICE
> ---
>
> Key: HIVE-17862
> URL: https://issues.apache.org/jira/browse/HIVE-17862
> Project: Hive
>  Issue Type: Task
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Trivial
> Fix For: 2.3.1
>
>






[jira] [Commented] (HIVE-17607) remove ColumnStatsDesc usage from columnstatsupdatetask

2017-10-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211946#comment-16211946
 ] 

Ashutosh Chauhan commented on HIVE-17607:
-

+1

> remove ColumnStatsDesc usage from columnstatsupdatetask
> ---
>
> Key: HIVE-17607
> URL: https://issues.apache.org/jira/browse/HIVE-17607
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Gergely Hajós
> Attachments: HIVE-17607.1.patch, HIVE-17607.2.patch, 
> HIVE-17607.3.patch
>
>
> it's not entirely connected to this task... it should either have its own 
> descriptor, or the work should take on the tablename/coltype/colname payload





[jira] [Commented] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed

2017-10-19 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211942#comment-16211942
 ] 

Andrew Sherman commented on HIVE-17826:
---

Thanks [~aihuaxu], I did think about that, but I am not sure how to do it in a 
simple and clean way. cleanupOperationLog() is called from Operation.close(), 
so adding a delay inline would prevent the session from terminating, which seems 
weird. And doing it asynchronously makes it more complicated.

But as we just discussed IRL, I will think about it some more.
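[Editor's sketch] One possible shape for the asynchronous delay being weighed above (threading.Timer stands in for whatever scheduler HS2 would actually use; this is a sketch, not the proposed patch): close the operation log on a timer so late appender writes can drain without blocking session teardown:

```python
import threading

def schedule_log_cleanup(close_fn, delay_seconds=5.0):
    """Run close_fn after delay_seconds on a daemon timer thread.

    The caller returns immediately, so session shutdown is not blocked.
    """
    timer = threading.Timer(delay_seconds, close_fn)
    timer.daemon = True
    timer.start()
    return timer

# Demonstration: the cleanup runs after the delay, off the calling thread.
closed = []
timer = schedule_log_cleanup(lambda: closed.append(True), delay_seconds=0.01)
timer.join()
assert closed == [True]
```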

> Error writing to RandomAccessFile after operation log is closed
> ---
>
> Key: HIVE-17826
> URL: https://issues.apache.org/jira/browse/HIVE-17826
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-17826.1.patch
>
>
> We are seeing the error from HS2 process stdout.
> {noformat}
> 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to 
> non-started appender query-file-appender
> 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to 
> non-started appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream 
> /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9
>  for appender query-file-appender
> 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing 
> Appender query-file-appender 
> org.apache.logging.log4j.core.appender.AppenderLoggingException: Error 
> writing to RandomAccessFile 
> /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114)
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103)
>   at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136)
>   at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105)
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
>   at 
> org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:112)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116)
>   at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
>   at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:390)
>   at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:378)
>   at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:362)
>   at 
> org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79)
>   at 
> org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:385)
>   at 
> org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:103)
>   at 
> org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:43)
>   at 
> org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:28)
>   at 
> com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Stream Closed
>   at java.io.RandomAccessFile.writeBytes(Native Method)
>   at java.io.RandomAccessFile.write(RandomAccessFile.java:525)
>   at 
> org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:111)
>   ... 25 more
> {noformat}



[jira] [Assigned] (HIVE-17856) MM tables - IOW is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17856:
---

Assignee: Steve Yeom

> MM tables - IOW is broken
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17858) MM - some union cases are broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17858:
---

Assignee: Sergey Shelukhin

> MM - some union cases are broken
> 
>
> Key: HIVE-17858
> URL: https://issues.apache.org/jira/browse/HIVE-17858
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: mm-gap-1
>
> mm_all test no longer runs on LLAP; if it's executed in LLAP, one can see 
> that some union cases no longer work.
> Queries on partunion_mm, skew_dp_union_mm produce no results.
> I'm not sure what part of "integration" broke it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-11266) count(*) wrong result based on table statistics for external tables

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11266:
---
Target Version/s: 3.0.0, 2.4.0  (was: 3.0.0, 2.4.0, 2.3.1)

> count(*) wrong result based on table statistics for external tables
> ---
>
> Key: HIVE-11266
> URL: https://issues.apache.org/jira/browse/HIVE-11266
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Simone Battaglia
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HIVE-11266.01.patch, HIVE-11266.patch
>
>
> Hive returns wrong count result on an external table with table statistics if 
> I change table data files.
> This is the scenario in details:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in 'my_location' directory;
> 4) select count(\*) from my_table;
> In this case the count query doesn't generate an MR job and returns the result 
> based on table statistics. This result is wrong because it is based on 
> statistics stored in the Hive metastore and doesn't take into account 
> modifications made to the data files.
> Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this 
> problem, but the default value of this property is TRUE.
> I think this post on Stack Overflow, which shows another type of bug in the 
> case of multiple inserts, is also related to the one I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table
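The scenario and workaround described above can be sketched as follows. This is an illustration only, not part of the original report; the table name is the placeholder from the steps above:

```sql
-- Sketch of the stale-statistics scenario; my_table/my_location are placeholders.
-- 1) Statistics are computed once:
ANALYZE TABLE my_table COMPUTE STATISTICS;

-- 2) After files under 'my_location' are changed outside Hive, this may be
--    answered from the (now stale) stored statistics, with no MR job:
SELECT COUNT(*) FROM my_table;

-- Workaround: force Hive to scan the data instead of trusting stored stats.
SET hive.compute.query.using.stats=false;
SELECT COUNT(*) FROM my_table;
```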



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17430) Add LOAD DATA test for blobstores

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17430:
---
Target Version/s:   (was: 2.3.1)

> Add LOAD DATA test for blobstores
> -
>
> Key: HIVE-17430
> URL: https://issues.apache.org/jira/browse/HIVE-17430
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Yuzhou Sun
>Assignee: Yuzhou Sun
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17430.patch
>
>
> This patch introduces load_data.q regression tests into the hive-blobstore 
> qtest module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17636) Add multiple_agg.q test for blobstores

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17636:
---
Target Version/s:   (was: 2.3.1)

> Add multiple_agg.q test for blobstores
> --
>
> Key: HIVE-17636
> URL: https://issues.apache.org/jira/browse/HIVE-17636
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
>Assignee: Ran Gu
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17636.patch
>
>
> This patch introduces multiple_agg.q regression tests into the hive-blobstore 
> qtest module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17729) Add Database & Explain related blobstore tests

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211936#comment-16211936
 ] 

Jesus Camacho Rodriguez commented on HIVE-17729:


Changing targeted fix version to 2.3.2 as this is not a blocker and we are 
releasing 2.3.1.

> Add Database & Explain related blobstore tests
> --
>
> Key: HIVE-17729
> URL: https://issues.apache.org/jira/browse/HIVE-17729
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Rentao Wu
>Assignee: Rentao Wu
> Attachments: HIVE-17729.patch
>
>
> This patch introduces the following regression tests into the hive-blobstore 
> qtest module:
> * create_database.q  -> tests tables with location inherited from database
> * multiple_db.q  -> tests query spanning multiple databases
> * explain.q -> tests EXPLAIN INSERT OVERWRITE command
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17729) Add Database & Explain related blobstore tests

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17729:
---
Target Version/s: 3.0.0, 2.4.0, 2.3.2  (was: 3.0.0, 2.4.0, 2.3.1)

> Add Database & Explain related blobstore tests
> --
>
> Key: HIVE-17729
> URL: https://issues.apache.org/jira/browse/HIVE-17729
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Rentao Wu
>Assignee: Rentao Wu
> Attachments: HIVE-17729.patch
>
>
> This patch introduces the following regression tests into the hive-blobstore 
> qtest module:
> * create_database.q  -> tests tables with location inherited from database
> * multiple_db.q  -> tests query spanning multiple databases
> * explain.q -> tests EXPLAIN INSERT OVERWRITE command
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-10378) Hive Update statement set keyword work with lower case only and doesn't give any error if wrong column name specified in the set clause.

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10378:
---
Fix Version/s: (was: 2.3.1)
   2.3.2

> Hive Update statement set keyword work with lower case only and doesn't give 
> any error if wrong column name specified in the set clause.
> 
>
> Key: HIVE-10378
> URL: https://issues.apache.org/jira/browse/HIVE-10378
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0, 1.1.0
> Environment: Hadoop: 2.6.0
> Hive : 1.0.0/1.1.0
> OS:Linux
>Reporter: Vineet Kandpal
>Assignee: Oleksiy Sayankin
> Fix For: 2.3.2
>
> Attachments: HIVE-10378.1.patch
>
>
> Brief: The Hive UPDATE statement's SET clause works with lowercase column 
> names only, and it doesn't give any error if a wrong column name is 
> specified in the SET clause.
> Steps to reproduce:
> 1. Create a table with transactional properties:
> create table customer(id int, name string, email string) clustered by (id) 
> into 2 buckets stored as orc TBLPROPERTIES('transactional'='true')
> 2. Insert data into the transactional table:
> insert into table customer values 
> (1,'user1','us...@user1.com'),(2,'user2','us...@user1.com'),(3,'user3','us...@gmail.com')
> 3. Query the result:
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  |  customer.email  |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.299 seconds)
> 4. Update a column, writing the column name in UPPER case (NAME); the 
> statement succeeds but the column value is not updated:
> 0: jdbc:hive2://localhost:1> update customer set NAME = 'notworking' 
> where id = 1;
> INFO  : Table default.customer stats: [numFiles=10, numRows=3, 
> totalSize=6937, rawDataSize=0]
> No rows affected (20.343 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  |  customer.email  |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.321 seconds)
> 5. Update the same column, writing the column name in lower case (name); now 
> the column value is updated:
> 0: jdbc:hive2://localhost:1> update customer set name = 'working' 
> where id = 1;
> INFO  : Table default.customer stats: [numFiles=11, numRows=3, 
> totalSize=7699, rawDataSize=0]
> No rows affected (19.74 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  |  customer.email  |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | working        | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.333 seconds)
> 0: jdbc:hive2://localhost:1>
> 6. We have also seen that if an incorrect column name is used in the SET 
> clause of the UPDATE statement, the query is accepted and a job is executed. 
> There should be validation of the column name used in the SET clause:
> 0: jdbc:hive2://localhost:1> update customer set name_44 = 'working' 
> where id = 1;
>  
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-10378) Hive Update statement set keyword work with lower case only and doesn't give any error if wrong column name specified in the set clause.

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211938#comment-16211938
 ] 

Jesus Camacho Rodriguez commented on HIVE-10378:


Changing targeted fix version to 2.3.2 as this is not a blocker and we are 
releasing 2.3.1.

> Hive Update statement set keyword work with lower case only and doesn't give 
> any error if wrong column name specified in the set clause.
> 
>
> Key: HIVE-10378
> URL: https://issues.apache.org/jira/browse/HIVE-10378
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0, 1.1.0
> Environment: Hadoop: 2.6.0
> Hive : 1.0.0/1.1.0
> OS:Linux
>Reporter: Vineet Kandpal
>Assignee: Oleksiy Sayankin
> Fix For: 2.3.2
>
> Attachments: HIVE-10378.1.patch
>
>
> Brief: The Hive UPDATE statement's SET clause works with lowercase column 
> names only, and it doesn't give any error if a wrong column name is 
> specified in the SET clause.
> Steps to reproduce:
> 1. Create a table with transactional properties:
> create table customer(id int, name string, email string) clustered by (id) 
> into 2 buckets stored as orc TBLPROPERTIES('transactional'='true')
> 2. Insert data into the transactional table:
> insert into table customer values 
> (1,'user1','us...@user1.com'),(2,'user2','us...@user1.com'),(3,'user3','us...@gmail.com')
> 3. Query the result:
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  |  customer.email  |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.299 seconds)
> 4. Update a column, writing the column name in UPPER case (NAME); the 
> statement succeeds but the column value is not updated:
> 0: jdbc:hive2://localhost:1> update customer set NAME = 'notworking' 
> where id = 1;
> INFO  : Table default.customer stats: [numFiles=10, numRows=3, 
> totalSize=6937, rawDataSize=0]
> No rows affected (20.343 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  |  customer.email  |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | user1          | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.321 seconds)
> 5. Update the same column, writing the column name in lower case (name); now 
> the column value is updated:
> 0: jdbc:hive2://localhost:1> update customer set name = 'working' 
> where id = 1;
> INFO  : Table default.customer stats: [numFiles=11, numRows=3, 
> totalSize=7699, rawDataSize=0]
> No rows affected (19.74 seconds)
> 0: jdbc:hive2://localhost:1> select * from customer;
> +--------------+----------------+------------------+
> | customer.id  | customer.name  |  customer.email  |
> +--------------+----------------+------------------+
> | 2            | user2          | us...@user1.com  |
> | 3            | user3          | us...@gmail.com  |
> | 1            | working        | us...@user1.com  |
> +--------------+----------------+------------------+
> 3 rows selected (0.333 seconds)
> 0: jdbc:hive2://localhost:1>
> 6. We have also seen that if an incorrect column name is used in the SET 
> clause of the UPDATE statement, the query is accepted and a job is executed. 
> There should be validation of the column name used in the SET clause:
> 0: jdbc:hive2://localhost:1> update customer set name_44 = 'working' 
> where id = 1;
>  
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17819) Add sampling.q test for blobstores

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211935#comment-16211935
 ] 

Jesus Camacho Rodriguez commented on HIVE-17819:


Changing targeted fix version to 2.3.2 as this is not a blocker and we are 
releasing 2.3.1.

> Add sampling.q test for blobstores
> --
>
> Key: HIVE-17819
> URL: https://issues.apache.org/jira/browse/HIVE-17819
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
> Attachments: HIVE-17819.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17819) Add sampling.q test for blobstores

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17819:
---
Target Version/s: 2.3.2  (was: 2.3.1)

> Add sampling.q test for blobstores
> --
>
> Key: HIVE-17819
> URL: https://issues.apache.org/jira/browse/HIVE-17819
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
> Attachments: HIVE-17819.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17820) Add buckets.q test for blobstores

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211934#comment-16211934
 ] 

Jesus Camacho Rodriguez commented on HIVE-17820:


Changing targeted fix version to 2.3.2 as this is not a blocker and we are 
releasing 2.3.1.

> Add buckets.q test for blobstores
> -
>
> Key: HIVE-17820
> URL: https://issues.apache.org/jira/browse/HIVE-17820
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
> Attachments: HIVE-17820.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17820) Add buckets.q test for blobstores

2017-10-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17820:
---
Target Version/s: 2.3.2  (was: 2.3.1)

> Add buckets.q test for blobstores
> -
>
> Key: HIVE-17820
> URL: https://issues.apache.org/jira/browse/HIVE-17820
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
> Attachments: HIVE-17820.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17856) MM tables - IOW is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17856:

Description: 
The following tests were removed from mm_all during "integration"... I should 
have never allowed such a manner of integration.
MM logic should have been kept intact until ACID logic could catch up. Alas, 
here we are.



{noformat}
drop table iow0_mm;
create table iow0_mm(key int) tblproperties("transactional"="true", 
"transactional_properties"="insert_only");
insert overwrite table iow0_mm select key from intermediate;
insert into table iow0_mm select key + 1 from intermediate;
select * from iow0_mm order by key;
insert overwrite table iow0_mm select key + 2 from intermediate;
select * from iow0_mm order by key;
drop table iow0_mm;


drop table iow1_mm; 
create table iow1_mm(key int) partitioned by (key2 int)  
tblproperties("transactional"="true", "transactional_properties"="insert_only");
insert overwrite table iow1_mm partition (key2)
select key as k1, key from intermediate union all select key as k1, key from 
intermediate;
insert into table iow1_mm partition (key2)
select key + 1 as k1, key from intermediate union all select key as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key from intermediate union all select key + 4 as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, 
key + 2 from intermediate;
select * from iow1_mm order by key, key2;
drop table iow1_mm;
{noformat}

{noformat}
drop table simple_mm;
create table simple_mm(key int) stored as orc tblproperties 
("transactional"="true", "transactional_properties"="insert_only");
insert into table simple_mm select key from intermediate;
-insert overwrite table simple_mm select key from intermediate;
{noformat}



  was:
The following tests were removed from mm_all during "integration"... I should 
have never allowed such a manner of integration.
MM logic should have been kept intact until ACID logic could catch up. Alas, 
here we are.

Additionally multi-IOW tests may produce incorrect results. They were/are 
commented out in mm_all.

{noformat}
drop table iow0_mm;
create table iow0_mm(key int) tblproperties("transactional"="true", 
"transactional_properties"="insert_only");
insert overwrite table iow0_mm select key from intermediate;
insert into table iow0_mm select key + 1 from intermediate;
select * from iow0_mm order by key;
insert overwrite table iow0_mm select key + 2 from intermediate;
select * from iow0_mm order by key;
drop table iow0_mm;


drop table iow1_mm; 
create table iow1_mm(key int) partitioned by (key2 int)  
tblproperties("transactional"="true", "transactional_properties"="insert_only");
insert overwrite table iow1_mm partition (key2)
select key as k1, key from intermediate union all select key as k1, key from 
intermediate;
insert into table iow1_mm partition (key2)
select key + 1 as k1, key from intermediate union all select key as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key from intermediate union all select key + 4 as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, 
key + 2 from intermediate;
select * from iow1_mm order by key, key2;
drop table iow1_mm;
{noformat}

{noformat}
drop table simple_mm;
create table simple_mm(key int) stored as orc tblproperties 
("transactional"="true", "transactional_properties"="insert_only");
insert into table simple_mm select key from intermediate;
-insert overwrite table simple_mm select key from intermediate;
{noformat}




> MM tables - IOW is broken
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>  Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such a manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop 

[jira] [Updated] (HIVE-17793) Parameterize Logging Messages

2017-10-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17793:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Beluga!

> Parameterize Logging Messages
> -
>
> Key: HIVE-17793
> URL: https://issues.apache.org/jira/browse/HIVE-17793
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HIVE-17793.1.patch, HIVE-17793.2.patch
>
>
> * Use SLF4J parameterized logging
> * Remove use of the archaic Util's "stringifyException" and simply allow the 
> logging framework to handle formatting of the output.  This also saves having 
> to create the error message and then throw it away when the logging level is 
> set higher than that of the logging message
> * Add some {{LOG.isDebugEnabled}} around complex debug messages



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17657:

Labels: mm-gap-2  (was: mm-gap-1)

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>  Labels: mm-gap-2
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17855) conversion to MM tables via alter may be broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17855:

Labels: mm-gap-2  (was: mm-gap-1)

> conversion to MM tables via alter may be broken
> ---
>
> Key: HIVE-17855
> URL: https://issues.apache.org/jira/browse/HIVE-17855
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>  Labels: mm-gap-2
>
> {noformat}
> git difftool 77511070dd^ 77511070dd -- */mm_conversions.q
> {noformat}
> Looks like during the ACID "integration", alter was quietly changed to 
> create+insert, because it's broken.
> I asked to keep feature parity with every change, but I should rather have 
> insisted on it and -1'd all the patches that didn't... This is just annoying. 
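For context, a minimal sketch of the kind of conversion being discussed is below. This is an illustration only (the table name is a placeholder, and per the description above it is precisely this alter path whose correctness is in question); the properties match the MM table properties used elsewhere in this thread:

```sql
-- Hypothetical example of converting an existing table to an
-- insert-only (MM) transactional table via ALTER; plain_table is a placeholder.
ALTER TABLE plain_table SET TBLPROPERTIES (
  'transactional'='true',
  'transactional_properties'='insert_only'
);
```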



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17817) Stabilize crossproduct warning message output order

2017-10-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211920#comment-16211920
 ] 

Hive QA commented on HIVE-17817:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893009/HIVE-17817.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 11309 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=221)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7388/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7388/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7388/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893009 - PreCommit-HIVE-Build

> Stabilize crossproduct warning message output order
> ---
>
> Key: HIVE-17817
> URL: https://issues.apache.org/jira/browse/HIVE-17817
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17817.01.patch, HIVE-17817.02.patch
>
>
> {{CrossProductCheck}} warning printouts sometimes happen in reverse order, 
> which reduces people's confidence in the test's reliability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17799) Add Ellipsis For Truncated Query In Hive Lock

2017-10-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211917#comment-16211917
 ] 

Ashutosh Chauhan commented on HIVE-17799:
-

+1

> Add Ellipsis For Truncated Query In Hive Lock
> -
>
> Key: HIVE-17799
> URL: https://issues.apache.org/jira/browse/HIVE-17799
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-17799.1.patch
>
>
> [HIVE-16334] introduced truncation for storing queries in ZK lock nodes.  
> This Jira adds an ellipsis to the truncated query to let the operator know 
> that truncation has occurred, and therefore that they will not find the 
> specific query in their logs; only a prefix match will work.
> {code:sql}
> -- Truncation of query may be confusing to operator
> -- Without truncation
> SELECT * FROM TABLE WHERE COL=1
> -- With truncation (operator will not find this query in workload)
> SELECT * FROM TABLE
> -- With truncation (operator will know this is only a prefix match)
> SELECT * FROM TABLE...
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17807) Execute maven commands in batch mode for ptests

2017-10-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17807:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks, Vijay!

> Execute maven commands in batch mode for ptests
> ---
>
> Key: HIVE-17807
> URL: https://issues.apache.org/jira/browse/HIVE-17807
> Project: Hive
>  Issue Type: Bug
>Reporter: Vijay Kumar
>Assignee: Vijay Kumar
> Fix For: 3.0.0
>
> Attachments: HIVE-17807.patch
>
>
> No need to run in interactive mode in CI environment.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)

2017-10-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211914#comment-16211914
 ] 

Eugene Koifman commented on HIVE-17645:
---

There is a single LM for the warehouse.  Having multiple TMs is like having 
multiple Sessions - they can lock the same resources as long as requested locks 
are compatible.

The issue is that each TM has its own transaction context, i.e. it is in a 
different transaction.  For Spark reads, each Query Fragment uses a separate TM (and 
ValidTxnList) so from the overall query perspective this creates Read Committed 
semantics.

(to be continued)

> MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> --
>
> Key: HIVE-17645
> URL: https://issues.apache.org/jira/browse/HIVE-17645
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>  Labels: mm-gap-2
>
> MM code introduces 
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr()
> {noformat}
> in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_).  
> HIVE-17482 adds a mode where a TransactionManager not associated with the 
> session should be used.  This will need to be addressed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-19 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17853:

Description: 
The {{RetryingMetaStoreClient}} is used to automatically reconnect to the Hive 
metastore, after client timeout, transparently to the user.

In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating a 
Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find that 
the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
metastore operations will be attempted as the login-user ({{oozie}}), as 
opposed to the effective user ({{mithun}}).

We should have a fix for this shortly.

  was:
The {{RetryingMetaStoreClient}} is used to automatically reconnect to the Hive 
metastore, after client timeout, transparently to the user.

In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating a 
Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find that 
the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
metastore operations will be attempted as the login-user ({{oozie}}), as 
opposed to the effective user ({{mithunr}}).

We should have a fix for this shortly.


> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.
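The fix the issue describes amounts to re-applying the saved caller context on every reconnect, not just on the first connect. A minimal self-contained sketch of that pattern (Context, Client, and RetryingClientSketch are hypothetical stand-ins for UGI/HiveMetaStoreClient, not Hive's actual classes):

```java
import java.util.concurrent.Callable;

// Sketch of the retry-reconnect pattern described above. This only
// illustrates re-wrapping reconnect in the saved caller's context, which is
// what the reported bug omits; it is not Hive's actual implementation.
public class RetryingClientSketch {
    static String currentUser = "login-user";   // process login user

    static final class Context {
        final String user;
        Context(String user) { this.user = user; }
        <T> T doAs(Callable<T> action) throws Exception {
            // Real code would be UserGroupInformation.doAs(); here we just
            // record which user the action runs as.
            currentUser = user;
            return action.call();
        }
    }

    static final class Client {
        String connectedAs;
        void connect() { connectedAs = currentUser; }  // captures caller context
        String whoAmI() { return connectedAs; }
    }

    static final class RetryingClient {
        private final Context ugi;   // effective-user context saved at creation
        private final Client delegate = new Client();
        RetryingClient(Context ugi) throws Exception {
            this.ugi = ugi;
            reconnect();
        }
        // The fix: every reconnect (not only the first connect) must run
        // inside the saved context, otherwise the login user leaks in.
        void reconnect() throws Exception {
            ugi.doAs(() -> { delegate.connect(); return null; });
        }
        String whoAmI() { return delegate.whoAmI(); }
    }

    public static void main(String[] args) throws Exception {
        RetryingClient c = new RetryingClient(new Context("mithun"));
        currentUser = "login-user";       // timeout happens outside doAs()
        c.reconnect();                    // must restore the effective user
        System.out.println(c.whoAmI());   // prints "mithun"
    }
}
```

Without the doAs() wrapper inside reconnect(), the second connect() would capture "login-user" - the symptom described above.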





[jira] [Commented] (HIVE-17857) Upgrade to orc 1.4

2017-10-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211907#comment-16211907
 ] 

Ashutosh Chauhan commented on HIVE-17857:
-

cc: [~sershe] [~prasanth_j]

> Upgrade to orc 1.4
> --
>
> Key: HIVE-17857
> URL: https://issues.apache.org/jira/browse/HIVE-17857
> Project: Hive
>  Issue Type: Task
>  Components: ORC
>Reporter: Ashutosh Chauhan
>






[jira] [Updated] (HIVE-17856) MM tables - IOW is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17856:

Description: 
The following tests were removed from mm_all during "integration"... I should 
never have allowed such a manner of integration.
MM logic should have been kept intact until the ACID logic could catch up. 
Alas, here we are.
{noformat}
drop table iow0_mm;
create table iow0_mm(key int) tblproperties("transactional"="true", 
"transactional_properties"="insert_only");
insert overwrite table iow0_mm select key from intermediate;
insert into table iow0_mm select key + 1 from intermediate;
select * from iow0_mm order by key;
insert overwrite table iow0_mm select key + 2 from intermediate;
select * from iow0_mm order by key;
drop table iow0_mm;


drop table iow1_mm; 
create table iow1_mm(key int) partitioned by (key2 int)  
tblproperties("transactional"="true", "transactional_properties"="insert_only");
insert overwrite table iow1_mm partition (key2)
select key as k1, key from intermediate union all select key as k1, key from 
intermediate;
insert into table iow1_mm partition (key2)
select key + 1 as k1, key from intermediate union all select key as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key from intermediate union all select key + 4 as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, 
key + 2 from intermediate;
select * from iow1_mm order by key, key2;
drop table iow1_mm;
{noformat}

{noformat}
drop table simple_mm;
create table simple_mm(key int) stored as orc tblproperties 
("transactional"="true", "transactional_properties"="insert_only");
insert into table simple_mm select key from intermediate;
-insert overwrite table simple_mm select key from intermediate;
{noformat}



  was:
The following tests were removed from mm_all 
{noformat}
drop table iow0_mm;
create table iow0_mm(key int) tblproperties("transactional"="true", 
"transactional_properties"="insert_only");
insert overwrite table iow0_mm select key from intermediate;
insert into table iow0_mm select key + 1 from intermediate;
select * from iow0_mm order by key;
insert overwrite table iow0_mm select key + 2 from intermediate;
select * from iow0_mm order by key;
drop table iow0_mm;


drop table iow1_mm; 
create table iow1_mm(key int) partitioned by (key2 int)  
tblproperties("transactional"="true", "transactional_properties"="insert_only");
insert overwrite table iow1_mm partition (key2)
select key as k1, key from intermediate union all select key as k1, key from 
intermediate;
insert into table iow1_mm partition (key2)
select key + 1 as k1, key from intermediate union all select key as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key from intermediate union all select key + 4 as k1, key 
from intermediate;
select * from iow1_mm order by key, key2;
insert overwrite table iow1_mm partition (key2)
select key + 3 as k1, key + 3 from intermediate union all select key + 2 as k1, 
key + 2 from intermediate;
select * from iow1_mm order by key, key2;
drop table iow1_mm;
{noformat}

{noformat}
drop table simple_mm;
create table simple_mm(key int) stored as orc tblproperties 
("transactional"="true", "transactional_properties"="insert_only");
insert into table simple_mm select key from intermediate;
-insert overwrite table simple_mm select key from intermediate;
{noformat}




> MM tables - IOW is broken
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>  Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should 
> never have allowed such a manner of integration.
> MM logic should have been kept intact until the ACID logic could catch up. 
> Alas, here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate 

[jira] [Updated] (HIVE-17855) conversion to MM tables via alter may be broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17855:

Description: 
{noformat}
git difftool 77511070dd^ 77511070dd -- */mm_conversions.q
{noformat}
Looks like during ACID "integration" alter was simply quietly changed to 
create+insert, because it's broken.
I asked to keep feature parity with every change but I should have rather 
insisted on it and -1d all the patches that didn't... This is just annoying. 

  was:
{noformat}
git difftool 77511070dd 77511070dd^ -- */mm_conversions.q
{noformat}
Looks like during ACID "integration" alter was simply quietly changed to 
create+insert, because it's broken.
I asked to keep feature parity with every change but I should have rather 
insisted on it and -1d all the patches that didn't... This is just annoying. 


> conversion to MM tables via alter may be broken
> ---
>
> Key: HIVE-17855
> URL: https://issues.apache.org/jira/browse/HIVE-17855
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>  Labels: mm-gap-1
>
> {noformat}
> git difftool 77511070dd^ 77511070dd -- */mm_conversions.q
> {noformat}
> Looks like during ACID "integration" alter was simply quietly changed to 
> create+insert, because it's broken.
> I asked to keep feature parity with every change but I should have rather 
> insisted on it and -1d all the patches that didn't... This is just annoying. 





[jira] [Resolved] (HIVE-16964) _orc_acid_version file is missing

2017-10-19 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom resolved HIVE-16964.
---
Resolution: Won't Fix

> _orc_acid_version file is missing
> -
>
> Key: HIVE-16964
> URL: https://issues.apache.org/jira/browse/HIVE-16964
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Steve Yeom
>
> OrcRecordUpdater creates OrcRecordUpdater.ACID_FORMAT in the dir that it 
> creates - but there is nothing in Hive.moveAcidFiles() that copies it to its 
> final location.
> It doesn't look like CompactorMR even attempts to create it.





[jira] [Commented] (HIVE-16964) _orc_acid_version file is missing

2017-10-19 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211889#comment-16211889
 ] 

Steve Yeom commented on HIVE-16964:
---

Talked with Eugene. 
Also checked against the current Hive master code, using the unit test 
"TestTxnCommands2#testNonAcidToAcidConversion1".
 
1. Currently Hive.moveAcidFiles() does not move an "_orc_acid_version" file. 
This static method is called by the MoveTask for the Hive session.
   I.e., the FileSinkOperator in the map-reduce task creates such a file, but 
the MoveTask does not move it to the final destination dir.

2. The intention of creating an "_orc_acid_version" file is to handle the case 
where we have multiple versions of ACID file formats.
   I.e., in that case, we need format version info somewhere, either in the 
Metastore or in the directory. 

   As Eugene indicated, currently for ACID tables, inserters/deleters create 
delta directories independently, and readers read the relevant dirs without 
conflicting with writers via snapshot isolation. So there can be cases with 
multiple versions of delta directories per partition or table directory, since 
compactors are not in sync with writers. In this case, one "_orc_acid_version" 
file may be needed per delta dir. 

3. Possibly, as in the case of micromanaged tables, we can remove the steps 
that create directories in a staging area and then perform a MoveTask to move 
the delta and base directories (along with the _orc_acid_version file(s)) to a 
final destination. 

Thus, based on the above, I think we can lower the priority of this jira, since 
its fix (moving such a file to the final destination) may not be used at all 
for HDP 3.0.
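The per-delta version marker from point 2 can be sketched with plain file I/O: write a small "_orc_acid_version" file into each delta dir so readers can tell which ACID file-format version produced it. This is purely illustrative (java.nio, not Hive's actual writer code); the marker file name follows the issue, everything else is an assumption:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of a per-delta-directory format-version marker, as
// discussed in point 2 above. Not Hive's implementation.
public class AcidVersionMarkerSketch {
    static final String MARKER = "_orc_acid_version";

    // Writer side: drop the marker into the delta dir it just created.
    static void writeMarker(Path deltaDir, int version) throws IOException {
        Files.createDirectories(deltaDir);
        Files.write(deltaDir.resolve(MARKER), String.valueOf(version).getBytes());
    }

    // Reader side: a missing marker means pre-versioning data, so fall back
    // to the original format (version 0 here, by assumption).
    static int readMarker(Path deltaDir) throws IOException {
        Path marker = deltaDir.resolve(MARKER);
        if (!Files.exists(marker)) {
            return 0;
        }
        return Integer.parseInt(new String(Files.readAllBytes(marker)).trim());
    }

    public static void main(String[] args) throws IOException {
        Path delta = Files.createTempDirectory("delta_0000005_0000005");
        writeMarker(delta, 2);
        System.out.println(readMarker(delta));   // prints 2
        Path old = Files.createTempDirectory("delta_0000001_0000001");
        System.out.println(readMarker(old));     // prints 0
    }
}
```

Because compactors are not in sync with writers, deltas of different versions can coexist under one partition, which is why the marker would live per delta dir rather than per table.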

> _orc_acid_version file is missing
> -
>
> Key: HIVE-16964
> URL: https://issues.apache.org/jira/browse/HIVE-16964
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Steve Yeom
>
> OrcRecordUpdater creates OrcRecordUpdater.ACID_FORMAT in the dir that it 
> creates - but there is nothing in Hive.moveAcidFiles() that copies it to its 
> final location.
> It doesn't look like CompactorMR even attempts to create it.





[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211886#comment-16211886
 ] 

Sergey Shelukhin commented on HIVE-17645:
-

Spark+ACID only uses the hacky mode for select queries, so it should be ok as 
long as we don't get it from the session for selects.
However, a larger concern I have is this... how does it work at all if a 
different TxnManager has non-shared state with the main one? They'd be able to 
take locks separately, in parallel, for the same things.
And if they don't have non-shared state (i.e. they rely on the same metastore 
DB, ZK/DB lock paths, etc.), then what's the problem with getting a different 
txn manager?


> MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> --
>
> Key: HIVE-17645
> URL: https://issues.apache.org/jira/browse/HIVE-17645
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>  Labels: mm-gap-2
>
> MM code introduces 
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr()
> {noformat}
> in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_).  
> HIVE-17482 adds a mode where a TransactionManager not associated with the 
> session should be used.  This will need to be addressed.





[jira] [Updated] (HIVE-17854) LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader

2017-10-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17854:
--
Description: 
getRowNumber() is required to read data in non-acid tables that were converted 
to acid.

Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader 
this functionality is not available, making it impossible to vectorize reads 
(that need ROW__ID)/updates of non-acid-to-acid tables with LLAP (until major 
compaction) in the presence of any deletes.

cc [~t3rmin4t0r], [~sershe], [~teddy.choi]

  was:
getRowNumber() is required to read data in non-acid tables that were converted 
to acid.

Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader 
this functionality is not available, making it impossible to vectorize 
reads/updates of non-acid-to-acid tables with LLAP (until major compaction) in 
the presence of any deletes.

cc [~t3rmin4t0r], [~sershe], [~teddy.choi]


> LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader
> 
>
> Key: HIVE-17854
> URL: https://issues.apache.org/jira/browse/HIVE-17854
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>
> getRowNumber() is required to read data in non-acid tables that were 
> converted to acid.
> Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader 
> this functionality is not available, making it impossible to vectorize reads 
> (that need ROW__ID)/updates of non-acid-to-acid tables with LLAP (until major 
> compaction) in the presence of any deletes.
> cc [~t3rmin4t0r], [~sershe], [~teddy.choi]





[jira] [Updated] (HIVE-17854) LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader

2017-10-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17854:
--
Description: 
getRowNumber() is required to read data in non-acid tables that were converted 
to acid.

Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader 
this functionality is not available, making it impossible to vectorize 
reads/updates of non-acid-to-acid tables with LLAP (until major compaction) in 
the presence of any deletes.

cc [~t3rmin4t0r], [~sershe], [~teddy.choi]

  was:
getRowNumber() is required to read data in non-acid tables that were converted 
to acid.

Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader 
this functionality is not available, making it impossible to vectorize 
reads/updates of non-acid-to-acid tables with LLAP (until major compaction) in 
the presence of any deletes.


> LlapRecordReader should have getRowNumber() like org.apache.orc.RecordReader
> 
>
> Key: HIVE-17854
> URL: https://issues.apache.org/jira/browse/HIVE-17854
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>
> getRowNumber() is required to read data in non-acid tables that were 
> converted to acid.
> Currently when VectorizedOrcAcidRowBatchReader is used from LlapRecordReader 
> this functionality is not available, making it impossible to vectorize 
> reads/updates of non-acid-to-acid tables with LLAP (until major compaction) 
> in the presence of any deletes.
> cc [~t3rmin4t0r], [~sershe], [~teddy.choi]





[jira] [Commented] (HIVE-17771) Implement commands to manage resource plan.

2017-10-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211854#comment-16211854
 ] 

Hive QA commented on HIVE-17771:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892992/HIVE-17771.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 11310 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=101)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=221)
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
 (batchId=284)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7387/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7387/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7387/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892992 - PreCommit-HIVE-Build

> Implement commands to manage resource plan.
> ---
>
> Key: HIVE-17771
> URL: https://issues.apache.org/jira/browse/HIVE-17771
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17771.01.patch, HIVE-17771.02.patch, 
> HIVE-17771.03.patch
>
>
> Please see parent jira about llap workload management.
> This jira is to implement create and show resource plan commands in hive to 
> configure resource plans for llap workload. The following are the proposed 
> commands implemented as part of the jira:
> CREATE RESOURCE PLAN plan_name WITH QUERY_PARALLELISM parallelism;
> SHOW RESOURCE PLAN plan_name;
> SHOW RESOURCE PLANS;
> ALTER RESOURCE PLAN plan_name SET QUERY_PARALLELISM = parallelism;
> ALTER RESOURCE PLAN plan_name RENAME TO new_name;
> ALTER RESOURCE PLAN plan_name ACTIVATE;
> ALTER RESOURCE PLAN plan_name DISABLE;
> ALTER RESOURCE PLAN plan_name ENABLE;
> DROP RESOURCE PLAN;
> It will be followed up with more jiras to manage pools, triggers and copy 
> resource plans.





[jira] [Assigned] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-19 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reassigned HIVE-17853:
---


> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithunr}}).
> We should have a fix for this shortly.





[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)

2017-10-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211844#comment-16211844
 ] 

Eugene Koifman commented on HIVE-17645:
---

Ask [~jdere]

> MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> --
>
> Key: HIVE-17645
> URL: https://issues.apache.org/jira/browse/HIVE-17645
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>  Labels: mm-gap-2
>
> MM code introduces 
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr()
> {noformat}
> in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_).  
> HIVE-17482 adds a mode where a TransactionManager not associated with the 
> session should be used.  This will need to be addressed.





[jira] [Comment Edited] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211834#comment-16211834
 ] 

Sergey Shelukhin edited comment on HIVE-17657 at 10/19/17 10:10 PM:


Well, parts were broken by invalid merge 42a38577bc including commit 1b0d8df58e 
that removed a bunch of code... the old code from around 77511070dd (not sure 
if this is valid, this was the final MM-ACID integration commit for the file) 
needs to be restored.


was (Author: sershe):
Well, parts were broken by invalid merge 42a38577bc that removed a bunch of 
code...

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>  Labels: mm-gap-1
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.
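The txn-id remapping step described above can be sketched as a pure rename of delta directory names: allocate one new txn id per archived delta and rewrite its name. This is illustrative only (plain string handling, hypothetical helper names, not Hive's import code), and it assumes the stipulated archive shape where each exported delta has start == end txn (delta_x_x):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of remapping archived delta dir names to new txn ids on
// import. Real import code would also handle base_x dirs and aborted txns.
public class DeltaRemapSketch {
    // Allocate one fresh txn id per archived delta and rewrite its name.
    static List<String> remap(List<String> archivedDeltas, long firstNewTxnId) {
        List<String> out = new ArrayList<>();
        long next = firstNewTxnId;
        for (String dir : archivedDeltas) {
            String[] parts = dir.split("_");      // ["delta", x, y]
            if (!parts[1].equals(parts[2])) {
                // The stipulation above: no delta_6_9-style ranges allowed
                // in the archive unless collapsed to a single txn first.
                throw new IllegalArgumentException("expected delta_x_x: " + dir);
            }
            out.add(String.format("delta_%07d_%07d", next, next));
            next++;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> remapped = remap(
            Arrays.asList("delta_0000006_0000006", "delta_0000009_0000009"), 101);
        System.out.println(remapped);
        // prints [delta_0000101_0000101, delta_0000102_0000102]
    }
}
```

Committing all the allocated txns at once (the proposed new TxnManager API) is what would make the multi-delta import behave as a single atomic operation.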





[jira] [Commented] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211834#comment-16211834
 ] 

Sergey Shelukhin commented on HIVE-17657:
-

Well, parts were broken by invalid merge 42a38577bc that removed a bunch of 
code...

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>  Labels: mm-gap-1
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Updated] (HIVE-17695) collapse union all produced directories into delta directory name suffix for MM

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17695:

Priority: Minor  (was: Major)

> collapse union all produced directories into delta directory name suffix for 
> MM
> ---
>
> Key: HIVE-17695
> URL: https://issues.apache.org/jira/browse/HIVE-17695
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Priority: Minor
>
> this has special handling for writes resulting from a Union All query
> In the full Acid case at least, these subdirs get collapsed in favor of 
> statementId-based dir names (delta_x_y_stmtId).  It would be cleaner/simpler 
> to make MM follow the same logic.  (full acid does it in Hive.moveFiles(), I 
> think)





[jira] [Updated] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17645:

Labels: mm-gap-2  (was: )

> MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> --
>
> Key: HIVE-17645
> URL: https://issues.apache.org/jira/browse/HIVE-17645
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>  Labels: mm-gap-2
>
> MM code introduces 
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr()
> {noformat}
> in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_).  
> HIVE-17482 adds a mode where a TransactionManager not associated with the 
> session should be used.  This will need to be addressed.





[jira] [Commented] (HIVE-17645) MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211822#comment-16211822
 ] 

Sergey Shelukhin commented on HIVE-17645:
-

Hmm.. how can it be valid to have multiple txn managers in a single HS2? 

> MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> --
>
> Key: HIVE-17645
> URL: https://issues.apache.org/jira/browse/HIVE-17645
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>
> MM code introduces 
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr()
> {noformat}
> in a number of places (e.g _DDLTask.generateAddMmTasks(Table tbl)_).  
> HIVE-17482 adds a mode where a TransactionManager not associated with the 
> session should be used.  This will need to be addressed.





[jira] [Resolved] (HIVE-17646) MetaStoreUtils.isToInsertOnlyTable(Map<String, String> props) is not needed

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-17646.
-
Resolution: Not A Bug

See the last comment.

> MetaStoreUtils.isToInsertOnlyTable(Map props) is not needed
> ---
>
> Key: HIVE-17646
> URL: https://issues.apache.org/jira/browse/HIVE-17646
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>
> TransactionValidationListener is where all the logic to verify
> "transactional" & "transactional_properties" should be





[jira] [Updated] (HIVE-17647) DDLTask.generateAddMmTasks(Table tbl) should not start transactions

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17647:

Labels: mm-gap-2  (was: )

> DDLTask.generateAddMmTasks(Table tbl) should not start transactions
> ---
>
> Key: HIVE-17647
> URL: https://issues.apache.org/jira/browse/HIVE-17647
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>  Labels: mm-gap-2
>
> This method has 
> {noformat}
>   if (txnManager.isTxnOpen()) {
> mmWriteId = txnManager.getCurrentTxnId();
>   } else {
> mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
> txnManager.commitTxn();
>   }
> {noformat}
> This should throw if there is no open transaction; it should never open one.
> In general the logic seems suspect.  Looks like the intent is to move all 
> existing files into a delta_x_x/ when a plain table is converted to an MM table. 
>  This seems like something that needs to be done under an Exclusive lock 
> to prevent concurrent Insert operations from writing data under the table/partition 
> root.  But this is too late to acquire locks, which should be done from 
> Driver.acquireLocks() (or else a deadlock detector is needed, since acquiring them 
> here would break the all-or-nothing lock acquisition semantics currently required 
> without a deadlock detector).
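
The fail-fast behavior the description asks for might look like the following sketch. TxnManager is a minimal stand-in for HiveTxnManager, and the method name is illustrative, not Hive's real code:

```java
// Hedged sketch: require an already-open transaction instead of silently
// opening (and immediately committing) one outside the Driver's control.
interface TxnManager {
    boolean isTxnOpen();
    long getCurrentTxnId();
}

public class MmWriteIdSketch {
    static long resolveMmWriteId(TxnManager txnManager) {
        if (!txnManager.isTxnOpen()) {
            // Throw rather than open a txn in an inappropriate place;
            // lock acquisition belongs in Driver.acquireLocks().
            throw new IllegalStateException(
                "generateAddMmTasks requires an open transaction");
        }
        return txnManager.getCurrentTxnId();
    }

    public static void main(String[] args) {
        TxnManager open = new TxnManager() {
            public boolean isTxnOpen() { return true; }
            public long getCurrentTxnId() { return 42L; }
        };
        System.out.println(resolveMmWriteId(open)); // prints 42
    }
}
```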





[jira] [Comment Edited] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211809#comment-16211809
 ] 

Sergey Shelukhin edited comment on HIVE-17657 at 10/19/17 9:56 PM:
---

This was broken even in tests after one of the master merges.
Also even before that, when it was "working" in tests, it looks like some 
commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to 
getValidMmDirectoriesFromTableOrPart from ExportSemanticAnalyzer.java, and I'm 
not sure it added anything in return. Need to take a look at that first, and 
then fix the runtime error (some file is missing).
After that, mm_exim test needs to be reenabled.


was (Author: sershe):
This was broken even in tests after one of the master merges.
Also even before that, when it was "working" in tests, it looks like some 
commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to 
getValidMmDirectoriesFromTableOrPart from ExportSemanticAnalyzer.java, and I'm 
not sure it added anything in return. Need to take a look at that first, and 
then fix the runtime error (some file is missing).

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>  Labels: mm-gap-1
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.
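
One way to picture the remapping step discussed above: each archived delta directory gets a fresh txn id on import. This is only an illustration of the naming scheme under the stipulation in the description, not Hive's import code:

```java
// Hedged sketch of remapping archived delta_x_y directory names to new
// txn ids on import. The one-new-txn-per-dir rule is an assumption taken
// from the description above, for illustration only.
import java.util.LinkedHashMap;
import java.util.Map;

public class DeltaRemapSketch {
    // Each archived dir (including a merged one like "delta_6_9" from
    // concatenate) maps to a single new delta_n_n, assuming no obsolete
    // deltas exist inside its range in the archive.
    static Map<String, String> remap(String[] archivedDirs, long firstNewTxnId) {
        Map<String, String> mapping = new LinkedHashMap<>();
        long next = firstNewTxnId;
        for (String dir : archivedDirs) {
            mapping.put(dir, "delta_" + next + "_" + next);
            next++;
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, String> m = remap(new String[] {"delta_6_9", "delta_10_10"}, 100L);
        System.out.println(m); // prints {delta_6_9=delta_100_100, delta_10_10=delta_101_101}
    }
}
```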





[jira] [Comment Edited] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211809#comment-16211809
 ] 

Sergey Shelukhin edited comment on HIVE-17657 at 10/19/17 9:55 PM:
---

This was broken even in tests after one of the master merges.
Also even before that, when it was "working" in tests, it looks like some 
commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to 
getValidMmDirectoriesFromTableOrPart from ExportSemanticAnalyzer.java, and I'm 
not sure it added anything in return. Need to take a look at that first, and 
then fix the runtime error (some file is missing).


was (Author: sershe):
This was broken even in tests after one of the master merges.
Also even before that, when it was "working" in tests, it looks like some 
commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to 
getValidMmDirectoriesFromTableOrPart, and I'm not sure it added anything in 
return. Need to take a look at that first, and then fix the runtime error (some 
file is missing).

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>  Labels: mm-gap-1
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Updated] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17657:

Labels: mm-gap-1  (was: )

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>  Labels: mm-gap-1
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Commented] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211809#comment-16211809
 ] 

Sergey Shelukhin commented on HIVE-17657:
-

This was broken even in tests after one of the master merges.
Also even before that, when it was "working" in tests, it looks like some 
commit since 2e602596f7af6c302fd23628d4337673ca38be86 has removed the call to 
getValidMmDirectoriesFromTableOrPart, and I'm not sure it added anything in 
return. Need to take a look at that first, and then fix the runtime error (some 
file is missing).

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Updated] (HIVE-17657) export/import for MM tables is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17657:

Summary: export/import for MM tables is broken  (was: Does ExIm for MM 
tables work?)

> export/import for MM tables is broken
> -
>
> Key: HIVE-17657
> URL: https://issues.apache.org/jira/browse/HIVE-17657
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>
> there is mm_exim.q but it's not clear from the tests what file structure it 
> creates 
> On import the txnids in the directory names would have to be remapped if 
> importing to a different cluster.  Perhaps export can be smart and export 
> highest base_x and accretive deltas (minus aborted ones).  Then import can 
> ...?  It would have to remap txn ids from the archive to new txn ids.  This 
> would then mean that import is made up of several transactions rather than 1 
> atomic op.  (all locks must belong to a transaction)
> One possibility is to open a new txn for each dir in the archive (where 
> start/end txn of file name is the same) and commit all of them at once (need 
> new TMgr API for that).  This assumes using a shared lock (if any!) and thus 
> allows other inserts (not related to import) to occur.
> What if you have delta_6_9, such as a result of concatenate?  If we stipulate 
> that this must mean that there is no delta_6_6 or any other "obsolete" delta 
> in the archive we can map it to a new single txn delta_x_x.
> Add read_only mode for tables (useful in general, may be needed for upgrade 
> etc) and use that to make the above atomic.





[jira] [Assigned] (HIVE-17660) Compaction for MM runs Cleaner - needs test once IOW is supported

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17660:
---

Assignee: Eugene Koifman

> Compaction for MM runs Cleaner - needs test once IOW is supported
> -
>
> Key: HIVE-17660
> URL: https://issues.apache.org/jira/browse/HIVE-17660
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Deletion of aborted deltas happens from CompactorMR.run(), i.e. from the Worker,
> but the Worker still sets the compaction_queue entry to READY_FOR_CLEANING.
> This is not needed if there are no base_N dirs, which can be created by Insert 
> Overwrite.
> In that case we can't delete deltas < N until we know no one is reading them, 
> i.e. in the Cleaner.





[jira] [Updated] (HIVE-17673) JavaUtils.extractTxnId() etc

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17673:

Status: Patch Available  (was: Open)

> JavaUtils.extractTxnId() etc
> 
>
> Key: HIVE-17673
> URL: https://issues.apache.org/jira/browse/HIVE-17673
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-17673.patch
>
>
> these should be in AcidUtils





[jira] [Updated] (HIVE-17673) JavaUtils.extractTxnId() etc

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17673:

Attachment: HIVE-17673.patch

[~ekoifman] can you take a look? I'll look at the history of 
getValidMmDirectoriesFromTableOrPart and either remove it or file another jira 
to fix whatever was broken by the removal of its callers.

> JavaUtils.extractTxnId() etc
> 
>
> Key: HIVE-17673
> URL: https://issues.apache.org/jira/browse/HIVE-17673
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-17673.patch
>
>
> these should be in AcidUtils





[jira] [Assigned] (HIVE-17673) JavaUtils.extractTxnId() etc

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17673:
---

Assignee: Sergey Shelukhin

> JavaUtils.extractTxnId() etc
> 
>
> Key: HIVE-17673
> URL: https://issues.apache.org/jira/browse/HIVE-17673
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Minor
>
> these should be in AcidUtils





[jira] [Updated] (HIVE-17839) Cannot generate thrift definitions in standalone-metastore.

2017-10-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17839:
--
Status: Patch Available  (was: Open)

Forgot to redefine the work for the antrun plugin in standalone metastore.

> Cannot generate thrift definitions in standalone-metastore.
> ---
>
> Key: HIVE-17839
> URL: https://issues.apache.org/jira/browse/HIVE-17839
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Jaiprakash
>Assignee: Alan Gates
> Attachments: HIVE-17839.patch
>
>
> mvn clean install -Pthriftif -Dthrift.home=... does not regenerate the thrift 
> sources. This is after the https://issues.apache.org/jira/browse/HIVE-17506 
> fix.





[jira] [Updated] (HIVE-17839) Cannot generate thrift definitions in standalone-metastore.

2017-10-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17839:
--
Attachment: HIVE-17839.patch

> Cannot generate thrift definitions in standalone-metastore.
> ---
>
> Key: HIVE-17839
> URL: https://issues.apache.org/jira/browse/HIVE-17839
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Jaiprakash
>Assignee: Alan Gates
> Attachments: HIVE-17839.patch
>
>
> mvn clean install -Pthriftif -Dthrift.home=... does not regenerate the thrift 
> sources. This is after the https://issues.apache.org/jira/browse/HIVE-17506 
> fix.





[jira] [Commented] (HIVE-17841) implement applying the resource plan

2017-10-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211748#comment-16211748
 ] 

Hive QA commented on HIVE-17841:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892986/HIVE-17841.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 11306 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testClusterFractions 
(batchId=279)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testDestroyAndReturn 
(batchId=279)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testQueueName 
(batchId=279)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testQueueing 
(batchId=279)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReopen (batchId=279)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuse (batchId=279)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithDifferentPool
 (batchId=279)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithQueueing 
(batchId=279)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=221)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=228)
org.apache.hive.jdbc.TestTriggersWorkloadManager.org.apache.hive.jdbc.TestTriggersWorkloadManager
 (batchId=228)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth 
(batchId=242)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth 
(batchId=242)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7386/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7386/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7386/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892986 - PreCommit-HIVE-Build

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.patch
>
>






[jira] [Resolved] (HIVE-17676) Enable JDBC + MiniLLAP tests in HIVE-17508 after HIVE-17566

2017-10-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-17676.
--
Resolution: Won't Fix

The tests are already enabled with HIVE-17508. Closing this as won't fix.

> Enable JDBC + MiniLLAP tests in HIVE-17508 after HIVE-17566
> ---
>
> Key: HIVE-17676
> URL: https://issues.apache.org/jira/browse/HIVE-17676
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>






[jira] [Commented] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default

2017-10-19 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211742#comment-16211742
 ] 

Matt McCline commented on HIVE-17471:
-

Yes.  +1 LGTM tests pending.

(Do we need more tests?)

> Vectorization: Enable hive.vectorized.row.identifier.enabled to true by 
> default
> ---
>
> Key: HIVE-17471
> URL: https://issues.apache.org/jira/browse/HIVE-17471
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17471.patch
>
>
> We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 
> "Vectorization: Add infrastructure for vectorization of ROW__ID struct"
> But we forgot to set it back to true by default in Teddy's ACID ROW__ID work... 





[jira] [Updated] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17471:

Status: Patch Available  (was: Open)

> Vectorization: Enable hive.vectorized.row.identifier.enabled to true by 
> default
> ---
>
> Key: HIVE-17471
> URL: https://issues.apache.org/jira/browse/HIVE-17471
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17471.patch
>
>
> We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 
> "Vectorization: Add infrastructure for vectorization of ROW__ID struct"
> But we forgot to set it back to true by default in Teddy's ACID ROW__ID work... 





[jira] [Updated] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17471:

Attachment: HIVE-17471.patch

[~mmccline] does this make sense?

> Vectorization: Enable hive.vectorized.row.identifier.enabled to true by 
> default
> ---
>
> Key: HIVE-17471
> URL: https://issues.apache.org/jira/browse/HIVE-17471
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17471.patch
>
>
> We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 
> "Vectorization: Add infrastructure for vectorization of ROW__ID struct"
> But we forgot to set it back to true by default in Teddy's ACID ROW__ID work... 





[jira] [Assigned] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17471:
---

Assignee: Sergey Shelukhin  (was: Teddy Choi)

> Vectorization: Enable hive.vectorized.row.identifier.enabled to true by 
> default
> ---
>
> Key: HIVE-17471
> URL: https://issues.apache.org/jira/browse/HIVE-17471
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Sergey Shelukhin
>
> We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 
> "Vectorization: Add infrastructure for vectorization of ROW__ID struct"
> But we forgot to set it back to true by default in Teddy's ACID ROW__ID work... 





[jira] [Updated] (HIVE-16850) Converting table to insert-only acid may open a txn in an inappropriate place

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16850:

Description: 
This would work for unit-testing, but would need to be fixed for production.

{noformat}
HiveTxnManager txnManager = SessionState.get().getTxnMgr();
  if (txnManager.isTxnOpen()) {
mmWriteId = txnManager.getCurrentTxnId();
  } else {
mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
txnManager.commitTxn();
  }
{noformat}

  was:
{noformat}
HiveTxnManager txnManager = SessionState.get().getTxnMgr();
  if (txnManager.isTxnOpen()) {
mmWriteId = txnManager.getCurrentTxnId();
  } else {
mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
txnManager.commitTxn();
  }
{noformat}


> Converting table to insert-only acid may open a txn in an inappropriate place
> -
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>  Labels: mm-gap-2
> Fix For: hive-14535
>
>
> This would work for unit-testing, but would need to be fixed for production.
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr();
>   if (txnManager.isTxnOpen()) {
> mmWriteId = txnManager.getCurrentTxnId();
>   } else {
> mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
> txnManager.commitTxn();
>   }
> {noformat}





[jira] [Updated] (HIVE-16850) Only open a new transaction when there's no currently opened transaction

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16850:

Description: 
{noformat}
HiveTxnManager txnManager = SessionState.get().getTxnMgr();
  if (txnManager.isTxnOpen()) {
mmWriteId = txnManager.getCurrentTxnId();
  } else {
mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
txnManager.commitTxn();
  }
{noformat}

> Only open a new transaction when there's no currently opened transaction
> 
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>  Labels: mm-gap-2
> Fix For: hive-14535
>
>
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr();
>   if (txnManager.isTxnOpen()) {
> mmWriteId = txnManager.getCurrentTxnId();
>   } else {
> mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
> txnManager.commitTxn();
>   }
> {noformat}





[jira] [Updated] (HIVE-16850) Converting table to insert-only acid may open a txn in an inappropriate place

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16850:

Labels: mm-gap-2  (was: )

> Converting table to insert-only acid may open a txn in an inappropriate place
> -
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>  Labels: mm-gap-2
> Fix For: hive-14535
>
>
> This would work for unit-testing, but would need to be fixed for production.
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr();
>   if (txnManager.isTxnOpen()) {
> mmWriteId = txnManager.getCurrentTxnId();
>   } else {
> mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
> txnManager.commitTxn();
>   }
> {noformat}





[jira] [Updated] (HIVE-16850) Converting table to insert-only acid may open a txn in an inappropriate place

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16850:

Summary: Converting table to insert-only acid may open a txn in an 
inappropriate place  (was: Only open a new transaction when there's no 
currently opened transaction)

> Converting table to insert-only acid may open a txn in an inappropriate place
> -
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>  Labels: mm-gap-2
> Fix For: hive-14535
>
>
> {noformat}
> HiveTxnManager txnManager = SessionState.get().getTxnMgr();
>   if (txnManager.isTxnOpen()) {
> mmWriteId = txnManager.getCurrentTxnId();
>   } else {
> mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
> txnManager.commitTxn();
>   }
> {noformat}





[jira] [Updated] (HIVE-16850) Only open a new transaction when there's no currently opened transaction

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16850:

Attachment: (was: HIVE-16850.patch)

> Only open a new transaction when there's no currently opened transaction
> 
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
> Fix For: hive-14535
>
>






[jira] [Commented] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0

2017-10-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211718#comment-16211718
 ] 

Eugene Koifman commented on HIVE-17698:
---

+1

> FileSinkDesk.getMergeInputDirName() uses stmtId=0
> -
>
> Key: HIVE-17698
> URL: https://issues.apache.org/jira/browse/HIVE-17698
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17698.patch
>
>
> this is certainly wrong for multi statement txn but may also affect writes 
> from Union All queries if these are made to follow full Acid convention
> _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_





[jira] [Commented] (HIVE-16850) Only open a new transaction when there's no currently opened transaction

2017-10-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211715#comment-16211715
 ] 

Eugene Koifman commented on HIVE-16850:
---

Yes, the point is that it should not. We can't be opening transactions in 
random places.

> Only open a new transaction when there's no currently opened transaction
> 
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
> Fix For: hive-14535
>
> Attachments: HIVE-16850.patch
>
>






[jira] [Commented] (HIVE-17828) Metastore: mysql upgrade scripts to 3.0.0 is broken

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211695#comment-16211695
 ] 

Sergey Shelukhin commented on HIVE-17828:
-

+1 pending tests

> Metastore: mysql upgrade scripts to 3.0.0 is broken
> ---
>
> Key: HIVE-17828
> URL: https://issues.apache.org/jira/browse/HIVE-17828
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17828.1.patch, HIVE-17828.2.patch
>
>
> {code}
> +-+
> | |
> +-+
> | Finished upgrading MetaStore schema from 2.2.0 to 2.3.0 |
> +-+
> 1 row in set, 1 warning (0.00 sec)
> mysql> source  upgrade-2.3.0-to-3.0.0.mysql.sql
> ++
> ||
> ++
> | Upgrading MetaStore schema from 2.3.0 to 3.0.0 |
> ++
> {code}
> {code}
> --
> CREATE TABLE WM_RESOURCEPLAN (
> `RP_ID` bigint(20) NOT NULL,
> `NAME` varchar(128) NOT NULL,
> `QUERY_PARALLELISM` int(11),
> `STATUS` varchar(20) NOT NULL,
> PRIMARY KEY (`RP_ID`),
> KEY `UNIQUE_WM_RESOURCEPLAN` (`NAME`),
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1064 (42000): You have an error in your SQL syntax; check the manual 
> that corresponds to your MySQL server version for the right syntax to use 
> near ') ENGINE=InnoDB DEFAULT CHARSET=latin1' at line 8
> --
> CREATE TABLE WM_POOL
> (
> `POOL_ID` bigint(20) NOT NULL,
> `RP_ID` bigint(20) NOT NULL,
> `PATH` varchar(1024) NOT NULL,
> `PARENT_POOL_ID` bigint(20),
> `ALLOC_FRACTION` DOUBLE,
> `QUERY_PARALLELISM` int(11),
> PRIMARY KEY (`POOL_ID`),
> KEY `UNIQUE_WM_POOL` (`RP_ID`, `PATH`),
> CONSTRAINT `WM_POOL_FK1` FOREIGN KEY (`RP_ID`) REFERENCES 
> `WM_RESOURCEPLAN` (`RP_ID`),
> CONSTRAINT `WM_POOL_FK2` FOREIGN KEY (`PARENT_POOL_ID`) REFERENCES 
> `WM_POOL` (`POOL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
> --
> CREATE TABLE WM_TRIGGER
> (   
> `TRIGGER_ID` bigint(20) NOT NULL,
> `RP_ID` bigint(20) NOT NULL,
> `NAME` varchar(128) NOT NULL,
> `TRIGGER_EXPRESSION` varchar(1024),
> `ACTION_EXPRESSION` varchar(1024),
> PRIMARY KEY (`TRIGGER_ID`),
> KEY `UNIQUE_WM_TRIGGER` (`RP_ID`, `NAME`),
> CONSTRAINT `WM_TRIGGER_FK1` FOREIGN KEY (`RP_ID`) REFERENCES 
> `WM_RESOURCEPLAN` (`RP_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1215 (HY000): Cannot add foreign key constraint
> --
> CREATE TABLE WM_POOL_TO_TRIGGER
> (   
> `POOL_ID` bigint(20) NOT NULL,
> `TRIGGER_ID` bigint(20) NOT NULL,
> PRIMARY KEY (`POOL_ID`, `TRIGGER_ID`),
> CONSTRAINT `WM_POOL_TO_TRIGGER_FK1` FOREIGN KEY (`POOL_ID`) REFERENCES 
> `WM_POOL` (`POOL_ID`),
> CONSTRAINT `WM_POOL_TO_TRIGGER_FK2` FOREIGN KEY (`TRIGGER_ID`) REFERENCES 
> `WM_TRIGGER` (`TRIGGER_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1215 (HY000): Cannot add foreign key constraint
> --
> CREATE TABLE WM_MAPPING
> (   
> `MAPPING_ID` bigint(20) NOT NULL,
> `RP_ID` bigint(20) NOT NULL,
> `ENTITY_TYPE` varchar(10) NOT NULL,
> `ENTITY_NAME` varchar(128) NOT NULL,
> `POOL_ID` bigint(20) NOT NULL,
> `ORDERING int,
> PRIMARY KEY (`MAPPING_ID`),
> KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`),
> CONSTRAINT `WM_MAPPING_FK1` FOREIGN KEY (`RP_ID`) REFERENCES 
> `WM_RESOURCEPLAN` (`RP_ID`),
> CONSTRAINT `WM_MAPPING_FK2` FOREIGN KEY (`POOL_ID`) REFERENCES `WM_POOL` 
> (`POOL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> --
> ERROR 1064 (42000): You have an error in your SQL syntax; check the manual 
> that corresponds to your MySQL server version for the right syntax to use 
> near 'MAPPING_ID`),
> KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`' at line 8
> --
> UPDATE VERSION SET SCHEMA_VERSION='3.0.0', VERSION_COMMENT='Hive release 
> version 3.0.0' where VER_ID=1
> {code}





[jira] [Updated] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17698:

Status: Patch Available  (was: Open)

> FileSinkDesk.getMergeInputDirName() uses stmtId=0
> -
>
> Key: HIVE-17698
> URL: https://issues.apache.org/jira/browse/HIVE-17698
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17698.patch
>
>
> this is certainly wrong for multi statement txn but may also affect writes 
> from Union All queries if these are made to follow full Acid convention
> _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_





[jira] [Updated] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17698:

Attachment: HIVE-17698.patch

[~ekoifman] can you take a look? thanks

> FileSinkDesk.getMergeInputDirName() uses stmtId=0
> -
>
> Key: HIVE-17698
> URL: https://issues.apache.org/jira/browse/HIVE-17698
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17698.patch
>
>
> this is certainly wrong for multi statement txn but may also affect writes 
> from Union All queries if these are made to follow full Acid convention
> _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_





[jira] [Assigned] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17698:
---

Assignee: Sergey Shelukhin

> FileSinkDesk.getMergeInputDirName() uses stmtId=0
> -
>
> Key: HIVE-17698
> URL: https://issues.apache.org/jira/browse/HIVE-17698
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>
> this is certainly wrong for multi statement txn but may also affect writes 
> from Union All queries if these are made to follow full Acid convention
> _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_





[jira] [Updated] (HIVE-17828) Metastore: mysql upgrade scripts to 3.0.0 is broken

2017-10-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17828:
-
Attachment: HIVE-17828.2.patch

I did not see the issue on MySQL 5.7.10, but when I downgraded to 5.6.38 I hit 
the 767-byte key length limit. Fixed it in the new patch. I don't see the other 
issues [~gopalv] mentioned, even with the 5.6 version.

> Metastore: mysql upgrade scripts to 3.0.0 is broken
> ---
>
> Key: HIVE-17828
> URL: https://issues.apache.org/jira/browse/HIVE-17828
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17828.1.patch, HIVE-17828.2.patch
>
>
> {code}
> +-+
> | |
> +-+
> | Finished upgrading MetaStore schema from 2.2.0 to 2.3.0 |
> +-+
> 1 row in set, 1 warning (0.00 sec)
> mysql> source  upgrade-2.3.0-to-3.0.0.mysql.sql
> ++
> ||
> ++
> | Upgrading MetaStore schema from 2.3.0 to 3.0.0 |
> ++
> {code}
> {code}
> --
> CREATE TABLE WM_RESOURCEPLAN (
> `RP_ID` bigint(20) NOT NULL,
> `NAME` varchar(128) NOT NULL,
> `QUERY_PARALLELISM` int(11),
> `STATUS` varchar(20) NOT NULL,
> PRIMARY KEY (`RP_ID`),
> KEY `UNIQUE_WM_RESOURCEPLAN` (`NAME`),
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1064 (42000): You have an error in your SQL syntax; check the manual 
> that corresponds to your MySQL server version for the right syntax to use 
> near ') ENGINE=InnoDB DEFAULT CHARSET=latin1' at line 8
> --
> CREATE TABLE WM_POOL
> (
> `POOL_ID` bigint(20) NOT NULL,
> `RP_ID` bigint(20) NOT NULL,
> `PATH` varchar(1024) NOT NULL,
> `PARENT_POOL_ID` bigint(20),
> `ALLOC_FRACTION` DOUBLE,
> `QUERY_PARALLELISM` int(11),
> PRIMARY KEY (`POOL_ID`),
> KEY `UNIQUE_WM_POOL` (`RP_ID`, `PATH`),
> CONSTRAINT `WM_POOL_FK1` FOREIGN KEY (`RP_ID`) REFERENCES 
> `WM_RESOURCEPLAN` (`RP_ID`),
> CONSTRAINT `WM_POOL_FK2` FOREIGN KEY (`PARENT_POOL_ID`) REFERENCES 
> `WM_POOL` (`POOL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
> --
> CREATE TABLE WM_TRIGGER
> (   
> `TRIGGER_ID` bigint(20) NOT NULL,
> `RP_ID` bigint(20) NOT NULL,
> `NAME` varchar(128) NOT NULL,
> `TRIGGER_EXPRESSION` varchar(1024),
> `ACTION_EXPRESSION` varchar(1024),
> PRIMARY KEY (`TRIGGER_ID`),
> KEY `UNIQUE_WM_TRIGGER` (`RP_ID`, `NAME`),
> CONSTRAINT `WM_TRIGGER_FK1` FOREIGN KEY (`RP_ID`) REFERENCES 
> `WM_RESOURCEPLAN` (`RP_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1215 (HY000): Cannot add foreign key constraint
> --
> CREATE TABLE WM_POOL_TO_TRIGGER
> (   
> `POOL_ID` bigint(20) NOT NULL,
> `TRIGGER_ID` bigint(20) NOT NULL,
> PRIMARY KEY (`POOL_ID`, `TRIGGER_ID`),
> CONSTRAINT `WM_POOL_TO_TRIGGER_FK1` FOREIGN KEY (`POOL_ID`) REFERENCES 
> `WM_POOL` (`POOL_ID`),
> CONSTRAINT `WM_POOL_TO_TRIGGER_FK2` FOREIGN KEY (`TRIGGER_ID`) REFERENCES 
> `WM_TRIGGER` (`TRIGGER_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> --
> ERROR 1215 (HY000): Cannot add foreign key constraint
> --
> CREATE TABLE WM_MAPPING
> (   
> `MAPPING_ID` bigint(20) NOT NULL,
> `RP_ID` bigint(20) NOT NULL,
> `ENTITY_TYPE` varchar(10) NOT NULL,
> `ENTITY_NAME` varchar(128) NOT NULL,
> `POOL_ID` bigint(20) NOT NULL,
> `ORDERING int,
> PRIMARY KEY (`MAPPING_ID`),
> KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`),
> CONSTRAINT `WM_MAPPING_FK1` FOREIGN KEY (`RP_ID`) REFERENCES 
> `WM_RESOURCEPLAN` (`RP_ID`),
> CONSTRAINT `WM_MAPPING_FK2` FOREIGN KEY (`POOL_ID`) REFERENCES `WM_POOL` 
> (`POOL_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> --
> ERROR 1064 (42000): You have an error in your SQL syntax; check the manual 
> that corresponds to your MySQL server version for the right syntax to use 
> near 'MAPPING_ID`),
> KEY `UNIQUE_WM_MAPPING` (`RP_ID`, `ENTITY_TYPE`, `ENTITY_NAME`' at line 8
> --
> UPDATE VERSION SET SCHEMA_VERSION='3.0.0', VERSION_COMMENT='Hive release 
> version 3.0.0' where VER_ID=1
> {code}





[jira] [Commented] (HIVE-16850) Only open a new transaction when there's no currently opened transaction

2017-10-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211674#comment-16211674
 ] 

Sergey Shelukhin commented on HIVE-16850:
-

Hmm... the attached patch is already applied to DDLTask.

> Only open a new transaction when there's no currently opened transaction
> 
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
> Fix For: hive-14535
>
> Attachments: HIVE-16850.patch
>
>






[jira] [Updated] (HIVE-17748) REplCopyTaks.execut(DriverContext)

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17748:

Attachment: HIVE-17748.patch

[~ekoifman] can you take a look? thnx

> REplCopyTaks.execut(DriverContext)
> --
>
> Key: HIVE-17748
> URL: https://issues.apache.org/jira/browse/HIVE-17748
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17748.patch
>
>
> has 
> {noformat}
>   Path fromPath = work.getFromPaths()[0];
>   toPath = work.getToPaths()[0];
> {noformat}
> should this throw if from/to paths have > 1 element?
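One option raised by the question above is to fail fast on a length mismatch and copy every pair rather than silently taking index 0. The sketch below is illustrative only: `copyAll`, `copyOne`, and the `String[]` paths stand in for the real `Path` arrays and copy logic in ReplCopyTask.

```java
public class MultiPathCopySketch {
    // Copy every (from, to) pair; throw instead of silently ignoring
    // extra elements, addressing the "should this throw?" question.
    static int copyAll(String[] fromPaths, String[] toPaths) {
        if (fromPaths.length != toPaths.length) {
            throw new IllegalArgumentException("from/to path counts differ: "
                + fromPaths.length + " vs " + toPaths.length);
        }
        int copied = 0;
        for (int i = 0; i < fromPaths.length; i++) {
            copyOne(fromPaths[i], toPaths[i]);
            copied++;
        }
        return copied;
    }

    // Placeholder for the actual file-copy step.
    static void copyOne(String from, String to) {
        System.out.println("copy " + from + " -> " + to);
    }

    public static void main(String[] args) {
        System.out.println(copyAll(new String[]{"/a", "/b"},
                                   new String[]{"/x", "/y"})); // 2
    }
}
```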





[jira] [Updated] (HIVE-17748) REplCopyTaks.execut(DriverContext)

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17748:

Status: Patch Available  (was: Open)

> REplCopyTaks.execut(DriverContext)
> --
>
> Key: HIVE-17748
> URL: https://issues.apache.org/jira/browse/HIVE-17748
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17748.patch
>
>
> has 
> {noformat}
>   Path fromPath = work.getFromPaths()[0];
>   toPath = work.getToPaths()[0];
> {noformat}
> should this throw if from/to paths have > 1 element?





[jira] [Updated] (HIVE-17748) ReplCopyTask doesn't support multi-file CopyWork

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17748:

Summary: ReplCopyTask doesn't support multi-file CopyWork  (was: 
REplCopyTaks.execut(DriverContext))

> ReplCopyTask doesn't support multi-file CopyWork
> 
>
> Key: HIVE-17748
> URL: https://issues.apache.org/jira/browse/HIVE-17748
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17748.patch
>
>
> has 
> {noformat}
>   Path fromPath = work.getFromPaths()[0];
>   toPath = work.getToPaths()[0];
> {noformat}
> should this throw if from/to paths have > 1 element?





[jira] [Reopened] (HIVE-16850) Only open a new transaction when there's no currently opened transaction

2017-10-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reopened HIVE-16850:
---

It's not fixed; this code is still there.

> Only open a new transaction when there's no currently opened transaction
> 
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
> Fix For: hive-14535
>
> Attachments: HIVE-16850.patch
>
>






[jira] [Assigned] (HIVE-17748) REplCopyTaks.execut(DriverContext)

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17748:
---

Assignee: Sergey Shelukhin

> REplCopyTaks.execut(DriverContext)
> --
>
> Key: HIVE-17748
> URL: https://issues.apache.org/jira/browse/HIVE-17748
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>
> has 
> {noformat}
>   Path fromPath = work.getFromPaths()[0];
>   toPath = work.getToPaths()[0];
> {noformat}
> should this throw if from/to paths have > 1 element?





[jira] [Work started] (HIVE-17848) Bucket Map Join : Implement an efficient way to minimize loading hash table

2017-10-19 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17848 started by Deepak Jaiswal.
-
> Bucket Map Join : Implement an efficient way to minimize loading hash table
> ---
>
> Key: HIVE-17848
> URL: https://issues.apache.org/jira/browse/HIVE-17848
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> In bucket map join, each task loads its own copy of the hash table, which is 
> inefficient: the load is IO-heavy, and with multiple copies of the same hash 
> table in memory, the tables may get GCed on a busy system.
> Implement a subcache holding a soft reference to each hash table, keyed by 
> its bucket ID, so that a table can be reused across tasks.
> This needs changes on the Tez side to push the bucket ID to TezProcessor.
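The proposed subcache can be sketched as below. This is a minimal, assumed shape only: `BucketSubcacheSketch` is not Hive code, and the generic parameter stands in for Hive's real map-join hash table type.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hash tables are held per bucket id behind SoftReferences, so the JVM may
// reclaim them under memory pressure instead of failing with OOM.
public class BucketSubcacheSketch<T> {
    private final Map<Integer, SoftReference<T>> cache = new HashMap<>();

    // Return the cached table for this bucket, or load (and cache) a fresh
    // one if it was never loaded or has been garbage-collected.
    public synchronized T get(int bucketId, Supplier<T> loader) {
        SoftReference<T> ref = cache.get(bucketId);
        T table = (ref == null) ? null : ref.get();
        if (table == null) {
            table = loader.get();
            cache.put(bucketId, new SoftReference<>(table));
        }
        return table;
    }

    public static void main(String[] args) {
        BucketSubcacheSketch<String> cache = new BucketSubcacheSketch<>();
        // First call loads; second call reuses the cached copy for bucket 3.
        System.out.println(cache.get(3, () -> "hashtable-for-bucket-3"));
        System.out.println(cache.get(3, () -> "would-reload"));
    }
}
```

A soft reference (rather than a strong cache entry) matches the issue's goal: reuse when memory allows, but let the collector drop tables on a busy system.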





[jira] [Assigned] (HIVE-17851) Bucket Map Join : Pick correct number of buckets

2017-10-19 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-17851:
-


> Bucket Map Join : Pick correct number of buckets
> 
>
> Key: HIVE-17851
> URL: https://issues.apache.org/jira/browse/HIVE-17851
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>
> CREATE TABLE tab_part (key int, value string) PARTITIONED BY(ds STRING) 
> CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> CREATE TABLE tab(key int, value string) PARTITIONED BY(ds STRING) CLUSTERED 
> BY (key) INTO 4 BUCKETS STORED AS TEXTFILE;
> select a.key, a.value, b.value
> from tab a join tab_part b on a.key = b.key;
> In the above case, if tab_part is bigger, it should be the streaming side and 
> the smaller side should create two hash tables. Currently, however, the code 
> blindly picks 4 as the number of buckets, since that is the maximum bucket 
> count among all the tables involved in the join, and creates 4 hash tables.





[jira] [Resolved] (HIVE-16850) Only open a new transaction when there's no currently opened transaction

2017-10-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-16850.
-
Resolution: Done
  Assignee: Eugene Koifman

Looks like this is already fixed.

> Only open a new transaction when there's no currently opened transaction
> 
>
> Key: HIVE-16850
> URL: https://issues.apache.org/jira/browse/HIVE-16850
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
> Fix For: hive-14535
>
> Attachments: HIVE-16850.patch
>
>






[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths

2017-10-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211650#comment-16211650
 ] 

Hive QA commented on HIVE-17696:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892984/HIVE-17696.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11309 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=76)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] 
(batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] 
(batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] 
(batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=204)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=221)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7385/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7385/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7385/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892984 - PreCommit-HIVE-Build

> Vectorized reader does not seem to be pushing down projection columns in 
> certain code paths
> ---
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
> Attachments: HIVE-17696.patch
>
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
> List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for(int i = 0; i < columnNamesList.size(); i++) {
> indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, 
> columnNamesList,
> indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, 
> columnNamesList,
> columnTypesList);
> }
> indexColumnsWanted = 
> ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && 
> !indexColumnsWanted.isEmpty()) {
>   requestedSchema =
> DataWritableReadSupport.getSchemaByIndex(tableSchema, 
> columnNamesList, indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
> this.reader = new ParquetFileReader(
>   configuration, footer.getFileMetaData(), file, blocks, 
> requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} 
> method.
> The else branch passes in fileSchema instead of using tableSchema as the 
> {{DataWritableReadSupport.init()}} method does. Does this cause projection 
> columns to be missed when we read Parquet files? We should probably just 
> reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} 
> method here.
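The projection step being discussed (building a requested schema from the column indexes actually read) can be illustrated in miniature. This toy sketch uses plain lists of column names in place of Parquet's MessageType; it is not the Hive/Parquet API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProjectionSketch {
    // Select only the wanted column indexes from the table schema, analogous
    // to DataWritableReadSupport.getSchemaByIndex on a MessageType.
    static List<String> projectByIndex(List<String> tableSchema, List<Integer> wanted) {
        List<String> out = new ArrayList<>();
        for (int i : wanted) {
            out.add(tableSchema.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("id", "name", "ts");
        // Reading only columns 0 and 2 should narrow the schema accordingly;
        // passing the full fileSchema instead (as in the else branch quoted
        // above) would skip this narrowing entirely.
        System.out.println(projectByIndex(schema, Arrays.asList(0, 2))); // [id, ts]
    }
}
```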



