[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-10-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Attachment: HIVE-17898.1.patch

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17898.1.patch
>
>
> We would like to enhance the explain plan output to display additional 
> information. For example, the TableScan operator should have the following 
> additional info:
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned
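
As a minimal sketch of what this would affect (the table, database and columns below are made up for illustration): an EXPLAIN like the following currently reports only the alias in the TableScan operator, whereas the enhancement would also surface the real table name, its database, and the scanned columns.

{code:sql}
-- Hypothetical example; assumes a table default.src(key string, value string).
-- Today the TableScan for alias "s" shows only the alias; with the proposed
-- change it would also list table src, database default, and columns key, value.
EXPLAIN
SELECT s.key, s.value
FROM default.src s
WHERE s.key > '10';
{code}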



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-10-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Status: Patch Available  (was: Open)

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17898.1.patch
>
>
> We would like to enhance the explain plan output to display additional 
> information. For example, the TableScan operator should have the following 
> additional info:
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-10-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Attachment: (was: HIVE-17898.1.patch)

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> We would like to enhance the explain plan output to display additional 
> information. For example, the TableScan operator should have the following 
> additional info:
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17898) Explain plan output enhancement

2017-10-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17898:
---
Status: Open  (was: Patch Available)

> Explain plan output enhancement
> ---
>
> Key: HIVE-17898
> URL: https://issues.apache.org/jira/browse/HIVE-17898
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> We would like to enhance the explain plan output to display additional 
> information. For example, the TableScan operator should have the following 
> additional info:
> * Actual table name (currently only alias name is displayed)
> * Database name
> * Column names being scanned



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17884) Implement create, alter and drop workload management triggers

2017-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233675#comment-16233675
 ] 

Lefty Leverenz commented on HIVE-17884:
---

Doc note:  Although this will be documented with the umbrella HIVE-17481, I've 
added a TODOC3.0 label to this issue to emphasize that CREATE TRIGGER, ALTER 
TRIGGER, and DROP TRIGGER need to be documented in the wiki.

> Implement create, alter and drop workload management triggers
> -
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
>Priority: Major
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17884.01.patch, HIVE-17884.02.patch, 
> HIVE-17884.03.patch
>
>
> Implement triggers for workload management:
> The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> condition is a boolean expression (variable operator value) with 'AND' 
> and 'OR' support.
> action is currently: KILL or MOVE TO pool;
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to information schema.
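
As a concrete illustration of the syntax above, here is a purely hypothetical sketch; the resource plan name, trigger name, counter and pool are made up and not part of this patch's description:

{code:sql}
-- Illustrative names only: "daily_plan", "slow_query", ELAPSED_TIME and
-- "etl_pool" are assumptions for the example.
CREATE TRIGGER `daily_plan`.`slow_query` WHEN ELAPSED_TIME > 60000 DO KILL;

ALTER TRIGGER `daily_plan`.`slow_query` WHEN ELAPSED_TIME > 30000 DO MOVE TO etl_pool;

DROP TRIGGER `daily_plan`.`slow_query`;
{code}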



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17884) Implement create, alter and drop workload management triggers

2017-10-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17884:
--
Labels: TODOC3.0  (was: )

> Implement create, alter and drop workload management triggers
> -
>
> Key: HIVE-17884
> URL: https://issues.apache.org/jira/browse/HIVE-17884
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
>Priority: Major
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17884.01.patch, HIVE-17884.02.patch, 
> HIVE-17884.03.patch
>
>
> Implement triggers for workload management:
> The commands to be implemented:
> CREATE TRIGGER `resourceplan_name`.`trigger_name` WHEN condition DO action;
> condition is a boolean expression (variable operator value) with 'AND' 
> and 'OR' support.
> action is currently: KILL or MOVE TO pool;
> ALTER TRIGGER `plan_name`.`trigger_name` WHEN condition DO action;
> DROP TRIGGER `plan_name`.`trigger_name`;
> Also add WM_TRIGGERS to information schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233670#comment-16233670
 ] 

Lefty Leverenz commented on HIVE-17433:
---

Doc note:  This adds *hive.vectorized.input.format.supports.enabled* and 
*hive.test.vectorized.execution.enabled.override* to HiveConf.java.

Only *hive.vectorized.input.format.supports.enabled* needs to be documented in 
the wiki, because *hive.test.vectorized.execution.enabled.override* is for 
internal use only.

Besides documenting the configuration parameter, perhaps this should also be 
mentioned in the Data Types doc.

* [Configuration Properties -- Vectorization | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Vectorization]
* [Hive Data Types -- Decimals | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-DecimalsdecimalDecimals]

Added a TODOC3.0 label.

([~mmccline], please update the fix version.)

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC3.0
> Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, 
> HIVE-17433.05.patch, HIVE-17433.06.patch, HIVE-17433.07.patch, 
> HIVE-17433.08.patch, HIVE-17433.09.patch, HIVE-17433.091.patch, 
> HIVE-17433.092.patch, HIVE-17433.093.patch, HIVE-17433.094.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> number have been identified as performance hotspots.
> Those are small-precision decimals with precision <= 18 that fit within a 
> 64-bit long, which we are calling Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add additional 
> Decimal64 optimizations where query benchmarks really show they help.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals in Hive for the vectorized text 
> input format and uses some new Decimal64 vectorized classes for comparison, 
> addition, and later perhaps a few GroupBy aggregations like sum, avg, min, 
> max.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  So, 
> in separate work those other formats such as ORC, PARQUET, etc can be done in 
> later JIRAs so they participate in the Decimal64 performance optimization.
> The idea is when you annotate your input format with:
> @VectorizedInputFormatSupports(supports = {DECIMAL_64})
> the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of 
> DecimalColumnVector.  Upon an input format seeing Decimal64ColumnVector being 
> used, the input format can fill that column vector with decimal64 longs 
> instead of HiveDecimalWritable objects of DecimalColumnVector.
> There will be a Hive configuration property 
> hive.vectorized.input.format.supports.enabled that holds a string list of 
> supported features.  The default will start as "decimal_64".  It can be 
> turned off to allow for performance comparisons and testing.
> The query SELECT * FROM DECIMAL_6_1_txt WHERE key - 100BD < 200BD ORDER BY 
> key, value
> will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>   className: VectorFilterOperator
>   native: true
>   predicateExpression: 
> FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: 
> Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, 
> outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>   predicate: ((key - 100) < 200) (type: boolean)
> ...
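
A sketch of how one could inspect this once the patch is in, assuming the DECIMAL_6_1_txt table from the description exists; EXPLAIN VECTORIZATION shows which vectorized classes were chosen for the predicate:

{code:sql}
-- "decimal_64" is the default listed in the description; clearing the property
-- allows before/after performance comparisons.
SET hive.vectorized.input.format.supports.enabled=decimal_64;

EXPLAIN VECTORIZATION DETAIL
SELECT * FROM DECIMAL_6_1_txt WHERE key - 100BD < 200BD ORDER BY key, value;
{code}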



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17433:
--
Labels: TODOC3.0  (was: )

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC3.0
> Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, 
> HIVE-17433.05.patch, HIVE-17433.06.patch, HIVE-17433.07.patch, 
> HIVE-17433.08.patch, HIVE-17433.09.patch, HIVE-17433.091.patch, 
> HIVE-17433.092.patch, HIVE-17433.093.patch, HIVE-17433.094.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more) but only a small 
> number have been identified as performance hotspots.
> Those are small-precision decimals with precision <= 18 that fit within a 
> 64-bit long, which we are calling Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add additional 
> Decimal64 optimizations where query benchmarks really show they help.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals in Hive for the vectorized text 
> input format and uses some new Decimal64 vectorized classes for comparison, 
> addition, and later perhaps a few GroupBy aggregations like sum, avg, min, 
> max.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  So, 
> in separate work those other formats such as ORC, PARQUET, etc can be done in 
> later JIRAs so they participate in the Decimal64 performance optimization.
> The idea is when you annotate your input format with:
> @VectorizedInputFormatSupports(supports = {DECIMAL_64})
> the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of 
> DecimalColumnVector.  Upon an input format seeing Decimal64ColumnVector being 
> used, the input format can fill that column vector with decimal64 longs 
> instead of HiveDecimalWritable objects of DecimalColumnVector.
> There will be a Hive configuration property 
> hive.vectorized.input.format.supports.enabled that holds a string list of 
> supported features.  The default will start as "decimal_64".  It can be 
> turned off to allow for performance comparisons and testing.
> The query SELECT * FROM DECIMAL_6_1_txt WHERE key - 100BD < 200BD ORDER BY 
> key, value
> will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>   className: VectorFilterOperator
>   native: true
>   predicateExpression: 
> FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: 
> Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, 
> outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>   predicate: ((key - 100) < 200) (type: boolean)
> ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16602) Implement shared scans with Tez

2017-10-31 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233668#comment-16233668
 ] 

liyunzhang commented on HIVE-16602:
---

[~jcamachorodriguez]:  I indeed found the performance improvement in a not very 
good hardware environment (less hard disk and less memory). I guess this is 
because the shared scans optimization can help reduce duplicated table scans, 
and a table scan may take a long time in a bad hardware environment. For 
example, for TPC-DS 
[query28.sql|https://github.com/apache/hive/blob/master/ql/src/test/queries/clientpositive/perf/query28.q], 
before the shared scan optimization the explain is:
{code}
Vertex dependency in root stage
Reducer 11 <- Map 10 (SIMPLE_EDGE)
Reducer 13 <- Map 12 (SIMPLE_EDGE)
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Reducer 3 <- Reducer 11 (CUSTOM_SIMPLE_EDGE), Reducer 13 (CUSTOM_SIMPLE_EDGE), 
Reducer 2 (CUSTOM_SIMPLE_EDGE), Reducer 5 (CUSTOM_SIMPLE_EDGE), Reducer 7 
(CUSTOM_SIMPLE_EDGE), Reducer 9 (CUSTOM_SIMPLE_EDGE)
Reducer 5 <- Map 4 (SIMPLE_EDGE)
Reducer 7 <- Map 6 (SIMPLE_EDGE)
Reducer 9 <- Map 8 (SIMPLE_EDGE)

Stage-0
  Fetch Operator
limit:100
Stage-1
  Reducer 3
  File Output Operator [FS_51]
Limit [LIM_50] (rows=1 width=2497)
  Number of rows:100
  Select Operator [SEL_49] (rows=1 width=2497)

Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13","_col14","_col15","_col16","_col17"]
Merge Join Operator [MERGEJOIN_58] (rows=1 width=2497)
  
Conds:(Inner),(Inner),(Inner),(Inner),(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13","_col14","_col15","_col16","_col17"]
<-Reducer 11 [CUSTOM_SIMPLE_EDGE]
  PARTITION_ONLY_SHUFFLE [RS_46]
Group By Operator [GBY_33] (rows=1 width=416)
  
Output:["_col0","_col1","_col2"],aggregations:["avg(VALUE._col0)","count(VALUE._col1)","count(DISTINCT
 KEY._col0:0._col0)"]
<-Map 10 [SIMPLE_EDGE]
  SHUFFLE [RS_32]
Group By Operator [GBY_31] (rows=21333171 width=88)
  
Output:["_col0","_col1","_col2","_col3"],aggregations:["avg(ss_list_price)","count(ss_list_price)","count(DISTINCT
 ss_list_price)"],keys:ss_list_price
  Select Operator [SEL_30] (rows=21333171 width=88)
Output:["ss_list_price"]
Filter Operator [FIL_56] (rows=21333171 width=88)
  predicate:(ss_quantity BETWEEN 11 AND 15 and 
(ss_list_price BETWEEN 66 AND 76 or ss_coupon_amt BETWEEN 920 AND 1920 or 
ss_wholesale_cost BETWEEN 4 AND 24))
  TableScan [TS_28] (rows=575995635 width=88)

default@store_sales,store_sales,Tbl:COMPLETE,Col:NONE,Output:["ss_quantity","ss_wholesale_cost","ss_list_price","ss_coupon_amt"]
<-Reducer 13 [CUSTOM_SIMPLE_EDGE]
  PARTITION_ONLY_SHUFFLE [RS_47]
Group By Operator [GBY_40] (rows=1 width=416)
  
Output:["_col0","_col1","_col2"],aggregations:["avg(VALUE._col0)","count(VALUE._col1)","count(DISTINCT
 KEY._col0:0._col0)"]
<-Map 12 [SIMPLE_EDGE]
  SHUFFLE [RS_39]
Group By Operator [GBY_38] (rows=21333171 width=88)
  
Output:["_col0","_col1","_col2","_col3"],aggregations:["avg(ss_list_price)","count(ss_list_price)","count(DISTINCT
 ss_list_price)"],keys:ss_list_price
  Select Operator [SEL_37] (rows=21333171 width=88)
Output:["ss_list_price"]
Filter Operator [FIL_57] (rows=21333171 width=88)
  predicate:(ss_quantity BETWEEN 6 AND 10 and 
(ss_list_price BETWEEN 91 AND 101 or ss_coupon_amt BETWEEN 1430 AND 2430 or 
ss_wholesale_cost BETWEEN 32 AND 52))
  TableScan [TS_35] (rows=575995635 width=88)

default@store_sales,store_sales,Tbl:COMPLETE,Col:NONE,Output:["ss_quantity","ss_wholesale_cost","ss_list_price","ss_coupon_amt"]
<-Reducer 2 [CUSTOM_SIMPLE_EDGE]
  PARTITION_ONLY_SHUFFLE [RS_42]
Group By Operator [GBY_5] (rows=1 width=416)
  
Output:["_col0","_col1","_col2"],aggregations:["avg(VALUE._col0)","count(VALUE._col1)","count(DISTINCT
 KEY._col0:0._col0)"]
<-Map 1 [SIMPLE_EDGE]
  SHUFFLE [RS_4]
Group By Operator [GBY_3] (rows=21333171 width=88)
  
Output:["_col0","_col1","_col2","_col3"],aggregations:["avg(ss_list_price)","count(ss_list_price)","count(DISTINCT
 ss_list_price)"],keys:ss_list_price
  Select Operator [SEL_2] (rows=21333171 width=88)
Output:["ss_list_price"]
 

[jira] [Commented] (HIVE-17938) Enable parallel query compilation in HS2

2017-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233644#comment-16233644
 ] 

Lefty Leverenz commented on HIVE-17938:
---

Nit:  Patch 2 has initial capital "Enable" but the description starts with 
"Whether to" on the previous line.

Please put "Whether to" on the same line as "enable" so it looks good in the 
generated template file.

> Enable parallel query compilation in HS2
> 
>
> Key: HIVE-17938
> URL: https://issues.apache.org/jira/browse/HIVE-17938
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Major
> Attachments: HIVE-17938.1.patch, HIVE-17938.2.patch
>
>
> This (hive.driver.parallel.compilation) has been enabled in many production 
> environments for a while (Hortonworks customers), and it has been stable.
> Just realized that this is not yet enabled in Apache by default.
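
For reference, the property under discussion can also be set per session while the default change is pending; a minimal sketch:

{code:sql}
-- Controls whether HiveServer2 compiles queries from different sessions in
-- parallel; this patch changes the default to true.
SET hive.driver.parallel.compilation=true;
{code}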



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17765) expose Hive keywords

2017-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233636#comment-16233636
 ] 

Lefty Leverenz commented on HIVE-17765:
---

Okay, that makes sense.  Thanks Sergey.

> expose Hive keywords 
> -
>
> Key: HIVE-17765
> URL: https://issues.apache.org/jira/browse/HIVE-17765
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, 
> HIVE-17765.03.patch, HIVE-17765.nogen.patch, HIVE-17765.patch
>
>
> This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on 
> SQL capabilities of Hive



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17955) Issue with the 'like' function in Hive

2017-10-31 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V resolved HIVE-17955.

Resolution: Not A Problem

LIKE isn't instr, it uses LIKE patterns.

https://www.w3schools.com/sql/sql_like.asp
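
For reference, a couple of statements that illustrate the point; LIKE is an operator that takes a pattern, not a substring-search function:

{code:sql}
-- '%' matches any sequence of characters, so this returns true:
SELECT 'Vishal Jaiswal' LIKE '%Jaiswal%';
-- Without wildcards the pattern must match the whole string, so this returns false:
SELECT 'Vishal Jaiswal' LIKE 'Jaiswal';
{code}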

> Issue with the 'like' function in Hive
> --
>
> Key: HIVE-17955
> URL: https://issues.apache.org/jira/browse/HIVE-17955
> Project: Hive
>  Issue Type: Bug
>Reporter: Vishal Jaiswal
>
> The following command should not fail as per the documentation: 
> select like("Vishal Jaiswal", "Jaiswal");
> Command: describe function like;
> Result: like(str, pattern) - Checks if str matches pattern
> Command: select like("Vishal Jaiswal", "Jaiswal");
> Result: Query fails in Hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17936) Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin branches

2017-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233621#comment-16233621
 ] 

Lefty Leverenz commented on HIVE-17936:
---

I left some comments on RB.

> Dynamic Semijoin Reduction : markSemiJoinForDPP marks unwanted semijoin 
> branches
> 
>
> Key: HIVE-17936
> URL: https://issues.apache.org/jira/browse/HIVE-17936
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-17936.1.patch, HIVE-17936.2.patch
>
>
> In the method markSemiJoinForDPP (HIVE-17399), the nDV comparison should not 
> allow equality, as there is a chance that the values are the same on both 
> sides and the branch is still marked as good when it shouldn't be.
> Add a configurable factor to gauge how useful this is when the nDV on the 
> smaller side is only slightly less than that on the TS side.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17955) Issue with the 'like' function in Hive

2017-10-31 Thread Vishal Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal Jaiswal updated HIVE-17955:
--
Description: 
The following command should not fail as per the documentation: 
select like("Vishal Jaiswal", "Jaiswal");

Command: describe function like;
Result: like(str, pattern) - Checks if str matches pattern

Command: select like("Vishal Jaiswal", "Jaiswal");
Result: Query fails in Hive.


  was:
The following command should not fail as per the documentation: 
select like("Vishal Jaiswal", "Jaiswal");

Command: describe function like;
Result: like(str, pattern) - Checks if str matches pattern

Command: select like("Vishal Jaiswal", "Jaiswal");
Result: Query fails in Hive, also checked via Spark SQL


Summary: Issue with the 'like' function in Hive  (was: Issue with the 
'like' function in Hive and Spark SQL. )

> Issue with the 'like' function in Hive
> --
>
> Key: HIVE-17955
> URL: https://issues.apache.org/jira/browse/HIVE-17955
> Project: Hive
>  Issue Type: Bug
>Reporter: Vishal Jaiswal
>
> The following command should not fail as per the documentation: 
> select like("Vishal Jaiswal", "Jaiswal");
> Command: describe function like;
> Result: like(str, pattern) - Checks if str matches pattern
> Command: select like("Vishal Jaiswal", "Jaiswal");
> Result: Query fails in Hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17955) Issue with the 'like' function in Hive and Spark SQL.

2017-10-31 Thread Vishal Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal Jaiswal updated HIVE-17955:
--
Description: 
The following command should not fail as per the documentation: 
select like("Vishal Jaiswal", "Jaiswal");

Command: describe function like;
Result: like(str, pattern) - Checks if str matches pattern

Command: select like("Vishal Jaiswal", "Jaiswal");
Result: Query fails in Hive, also checked via Spark SQL


  was:
Command: describe function like;
Result: like(str, pattern) - Checks if str matches pattern

Command: select like("Vishal Jaiswal", "Jaiswal");
Result: Query fails in Hive, also checked via Spark SQL


Summary: Issue with the 'like' function in Hive and Spark SQL.   (was: 
Issue with the 'like' function in Hive and Spark SQL. The following command 
should not fail as per the documentation: select like("Vishal Jaiswal", 
"Jaiswal");)

> Issue with the 'like' function in Hive and Spark SQL. 
> --
>
> Key: HIVE-17955
> URL: https://issues.apache.org/jira/browse/HIVE-17955
> Project: Hive
>  Issue Type: Bug
>Reporter: Vishal Jaiswal
>
> The following command should not fail as per the documentation: 
> select like("Vishal Jaiswal", "Jaiswal");
> Command: describe function like;
> Result: like(str, pattern) - Checks if str matches pattern
> Command: select like("Vishal Jaiswal", "Jaiswal");
> Result: Query fails in Hive, also checked via Spark SQL



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-31 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233602#comment-16233602
 ] 

Lefty Leverenz commented on HIVE-15104:
---

Good doc, thanks [~lirui].  I removed the TODOC3.0 label.

Here's a direct link to the doc:

* [hive.spark.optimize.shuffle.serde | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.spark.optimize.shuffle.serde]
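
For completeness, a minimal sketch of enabling the property (standard SET syntax; nothing beyond what the linked doc describes):

{code:sql}
-- When enabled, Hive on Spark registers a custom serializer for the shuffle
-- key so that less data is shuffled.
SET hive.spark.optimize.shuffle.serde=true;
{code}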

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, 
> HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, 
> HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, 
> HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
>
> The same SQL, running on the Spark and MR engines, will generate different 
> sizes of shuffle data.
> I think this is because Hive on MR serializes only part of the HiveKey, but 
> Hive on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-31 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15104:
--
Labels:   (was: TODOC3.0)

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, 
> HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, 
> HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, 
> HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
>
> The same SQL, running on the Spark and MR engines, will generate different 
> sizes of shuffle data.
> I think this is because Hive on MR serializes only part of the HiveKey, but 
> Hive on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17953) Metrics should move to destination atomically

2017-10-31 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233600#comment-16233600
 ] 

Sahil Takiar commented on HIVE-17953:
-

+1 pending Hive QA

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.01.patch
>
>
> HIVE-17563 reimplemented metrics using native nio interfaces. It relied on 
> the assumption that {{Files.move()}} is an atomic operation. It turns out 
> that by default it isn't, unless the {{ATOMIC_MOVE}} option is specified; 
> otherwise the destination file is unlinked and then the source file is copied.
> This may cause test failures since the file may be temporarily unavailable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16917) HiveServer2 guard rails - Limit concurrent connections from user

2017-10-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233573#comment-16233573
 ] 

Prasanth Jayachandran commented on HIVE-16917:
--

[~thejas]/[~sershe] can someone please review this patch?

> HiveServer2 guard rails - Limit concurrent connections from user
> 
>
> Key: HIVE-16917
> URL: https://issues.apache.org/jira/browse/HIVE-16917
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-16917.1.patch
>
>
> Rogue applications can make HS2 unusable for others by making too many 
> connections at a time.
> HS2 should start rejecting new connections from a user after the number of 
> connections from that user has reached a configurable threshold.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16917) HiveServer2 guard rails - Limit concurrent connections from user

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16917:
-
Status: Patch Available  (was: Open)

> HiveServer2 guard rails - Limit concurrent connections from user
> 
>
> Key: HIVE-16917
> URL: https://issues.apache.org/jira/browse/HIVE-16917
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-16917.1.patch
>
>
> Rogue applications can make HS2 unusable for others by making too many 
> connections at a time.
> HS2 should start rejecting new connections from a user after the number of 
> connections from that user has reached a configurable threshold.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17635) Add unit tests to CompactionTxnHandler and use PreparedStatements for queries

2017-10-31 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17635:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the contribution Andrew, pushed to master.

> Add unit tests to CompactionTxnHandler and use PreparedStatements for queries
> -
>
> Key: HIVE-17635
> URL: https://issues.apache.org/jira/browse/HIVE-17635
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-17635.1.patch, HIVE-17635.2.patch, 
> HIVE-17635.3.patch, HIVE-17635.4.patch, HIVE-17635.6.patch
>
>
> It is better for JDBC code that runs against the HMS database to use 
> PreparedStatements. Convert CompactionTxnHandler queries to use 
> PreparedStatement and add tests to TestCompactionTxnHandler to test these 
> queries, and improve code coverage.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16917) HiveServer2 guard rails - Limit concurrent connections from user

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16917:
-
Attachment: HIVE-16917.1.patch

> HiveServer2 guard rails - Limit concurrent connections from user
> 
>
> Key: HIVE-16917
> URL: https://issues.apache.org/jira/browse/HIVE-16917
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-16917.1.patch
>
>
> Rogue applications can make HS2 unusable for others by making too many 
> connections at a time.
> HS2 should start rejecting new connections from a user after the number of 
> connections from that user has reached a configurable threshold.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-10-31 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15157:
---
Status: Patch Available  (was: In Progress)

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
>
> Hello,
> I get the following error when I try to perform:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue:
> -- External Hive table with dynamic partitioning enabled on AWS S3 storage.
> -- Partitioned table with a timestamp type.
> When I perform "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I perform "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stacktrace of hive.log :
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths

2017-10-31 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17696:

Fix Version/s: 2.4.0

> Vectorized reader does not seem to be pushing down projection columns in 
> certain code paths
> ---
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17696-branch-2.patch, HIVE-17696.2.patch, 
> HIVE-17696.patch
>
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
>   List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for(int i = 0; i < columnNamesList.size(); i++) {
> indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, 
> columnNamesList,
> indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, 
> columnNamesList,
> columnTypesList);
> }
> indexColumnsWanted = 
> ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && 
> !indexColumnsWanted.isEmpty()) {
>   requestedSchema =
> DataWritableReadSupport.getSchemaByIndex(tableSchema, 
> columnNamesList, indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
> this.reader = new ParquetFileReader(
>   configuration, footer.getFileMetaData(), file, blocks, 
> requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} 
> method.
> The else branch passes in fileSchema instead of using tableSchema as we do 
> in the DataWritableReadSupport.init() method. Does this cause projection 
> columns to be missed when we read Parquet files? We should probably just 
> reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} 
> method here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths

2017-10-31 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233557#comment-16233557
 ] 

Ferdinand Xu commented on HIVE-17696:
-

Push to branch 2 as well.

> Vectorized reader does not seem to be pushing down projection columns in 
> certain code paths
> ---
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17696-branch-2.patch, HIVE-17696.2.patch, 
> HIVE-17696.patch
>
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
>   List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for(int i = 0; i < columnNamesList.size(); i++) {
> indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, 
> columnNamesList,
> indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, 
> columnNamesList,
> columnTypesList);
> }
> indexColumnsWanted = 
> ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && 
> !indexColumnsWanted.isEmpty()) {
>   requestedSchema =
> DataWritableReadSupport.getSchemaByIndex(tableSchema, 
> columnNamesList, indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
> this.reader = new ParquetFileReader(
>   configuration, footer.getFileMetaData(), file, blocks, 
> requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} 
> method.
> The else branch passes in fileSchema instead of using tableSchema as we do 
> in the DataWritableReadSupport.init() method. Does this cause projection 
> columns to be missed when we read Parquet files? We should probably just 
> reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} 
> method here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17552) Enable bucket map join by default

2017-10-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17552:
--
Attachment: HIVE-17552.1.patch

Enable bucket mapjoin by default.

> Enable bucket map join by default
> -
>
> Key: HIVE-17552
> URL: https://issues.apache.org/jira/browse/HIVE-17552
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-17552.1.patch
>
>
> Currently bucket map join is disabled by default; however, it is potentially 
> the most optimal join we have. We need to enable it by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17552) Enable bucket map join by default

2017-10-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17552:
--
Status: Patch Available  (was: In Progress)

> Enable bucket map join by default
> -
>
> Key: HIVE-17552
> URL: https://issues.apache.org/jira/browse/HIVE-17552
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-17552.1.patch
>
>
> Currently bucket map join is disabled by default; however, it is potentially 
> the most optimal join we have. We need to enable it by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-17552) Enable bucket map join by default

2017-10-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17552 started by Deepak Jaiswal.
-
> Enable bucket map join by default
> -
>
> Key: HIVE-17552
> URL: https://issues.apache.org/jira/browse/HIVE-17552
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> Currently bucket map join is disabled by default; however, it is potentially 
> the most optimal join we have. We need to enable it by default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17954) Implement create, alter and drop pool API's.

2017-10-31 Thread Harish Jaiprakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Jaiprakash reassigned HIVE-17954:


Assignee: Harish Jaiprakash

> Implement create, alter and drop pool API's.
> 
>
> Key: HIVE-17954
> URL: https://issues.apache.org/jira/browse/HIVE-17954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
>
> Implement pool management commands:
> CREATE POOL `resource_plan`.`pool_path` WITH
>   ALLOC_FRACTION `fraction`
>   QUERY_PARALLELISM `parallelism`
>   SCHEDULING_POLICY `policy`;
> ALTER POOL `resource_plan`.`pool_path` SET
>   PATH = `new_path`,
>   ALLOC_FRACTION = `fraction`,
>   QUERY_PARALLELISM = `parallelism`,
>   SCHEDULING_POLICY = `policy`;
> DROP POOL `resource_plan`.`pool_path`;
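
A purely illustrative instantiation of the skeleton above; the plan name, pool path and values are made up:

{code:sql}
CREATE POOL `my_plan`.`etl` WITH
  ALLOC_FRACTION 0.4
  QUERY_PARALLELISM 4
  SCHEDULING_POLICY 'fair';

ALTER POOL `my_plan`.`etl` SET
  ALLOC_FRACTION = 0.5,
  QUERY_PARALLELISM = 6;

DROP POOL `my_plan`.`etl`;
{code}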



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17954) Implement create, alter and drop pool API's.

2017-10-31 Thread Harish Jaiprakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Jaiprakash updated HIVE-17954:
-
Priority: Major  (was: Trivial)

> Implement create, alter and drop pool API's.
> 
>
> Key: HIVE-17954
> URL: https://issues.apache.org/jira/browse/HIVE-17954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
>Priority: Major
>
> Implement pool management commands:
> CREATE POOL `resource_plan`.`pool_path` WITH
>   ALLOC_FRACTION `fraction`
>   QUERY_PARALLELISM `parallelism`
>   SCHEDULING_POLICY `policy`;
> ALTER POOL `resource_plan`.`pool_path` SET
>   PATH = `new_path`,
>   ALLOC_FRACTION = `fraction`,
>   QUERY_PARALLELISM = `parallelism`,
>   SCHEDULING_POLICY = `policy`;
> DROP POOL `resource_plan`.`pool_path`;



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths

2017-10-31 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17696:

Attachment: HIVE-17696-branch-2.patch

> Vectorized reader does not seem to be pushing down projection columns in 
> certain code paths
> ---
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-17696-branch-2.patch, HIVE-17696.2.patch, 
> HIVE-17696.patch
>
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
>   List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for(int i = 0; i < columnNamesList.size(); i++) {
> indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, 
> columnNamesList,
> indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, 
> columnNamesList,
> columnTypesList);
> }
> indexColumnsWanted = 
> ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && 
> !indexColumnsWanted.isEmpty()) {
>   requestedSchema =
> DataWritableReadSupport.getSchemaByIndex(tableSchema, 
> columnNamesList, indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
> this.reader = new ParquetFileReader(
>   configuration, footer.getFileMetaData(), file, blocks, 
> requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} 
> method.
> The else branch passes in fileSchema instead of using tableSchema as we do 
> in the DataWritableReadSupport.init() method. Does this cause projection 
> columns to be missed when we read Parquet files? We should probably just 
> reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} 
> method here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-10-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Status: Patch Available  (was: Open)

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch
>
>
> Currently such queries are rewritten into group by + inner join with a value 
> generator, which is inefficient. The value generator consists of a join with 
> the outer query to fetch all correlated values. This value generator could be 
> completely eliminated if such queries were instead rewritten into a LEFT SEMI 
> JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with non-equi 
> conditions (HIVE-17766).
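
A sketch of the query shape being discussed, using hypothetical tables t1 and t2; today the correlated EXISTS below is planned with a group by + inner join value generator, while the proposal would plan it as a LEFT SEMI JOIN (the non-equi condition is why HIVE-17766 is a prerequisite):

{code:sql}
SELECT t1.id
FROM t1
WHERE EXISTS (
  SELECT 1 FROM t2
  WHERE t2.id = t1.id      -- correlated equality
    AND t2.val > t1.val    -- non-equi correlated condition (needs HIVE-17766)
);
{code}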



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-10-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Status: Open  (was: Patch Available)

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch
>
>
> Currently such queries are rewritten into group by + inner join with a value 
> generator, which is inefficient. The value generator consists of a join with 
> the outer query to fetch all correlated values. This value generator could be 
> completely eliminated if such queries were instead rewritten into a LEFT SEMI 
> JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with non-equi 
> conditions (HIVE-17766).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-10-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Attachment: HIVE-17767.2.patch

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch
>
>
> Currently such queries are rewritten into group by + inner join with a value 
> generator, which is inefficient. The value generator consists of a join with 
> the outer query to fetch all correlated values. This value generator could be 
> completely eliminated if such queries were instead rewritten into a LEFT SEMI 
> JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with non-equi 
> conditions (HIVE-17766).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1

2017-10-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17902:

Attachment: HIVE-17902.03.patch

Fixing the tests

> add a notions of default pool and unmanaged mapping part 1
> --
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, 
> HIVE-17902.02.patch, HIVE-17902.03.patch, HIVE-17902.patch
>
>
> This is needed to map queries between WM and non-WM execution



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233534#comment-16233534
 ] 

Mithun Radhakrishnan commented on HIVE-17853:
-

For the record, we've tested this out manually, using an Oozie setup, with 
user-impersonation. I suppose it might be possible to set up a unit-test using 
something based on {{MiniHiveKdc}}, but it looks non-trivial. Hmm...

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
> Attachments: HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17481) LLAP workload management (umbrella)

2017-10-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17481:

Attachment: Workload management design doc.pdf

Attaching the design doc

> LLAP workload management (umbrella)
> ---
>
> Key: HIVE-17481
> URL: https://issues.apache.org/jira/browse/HIVE-17481
> Project: Hive
>  Issue Type: New Feature
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: Workload management design doc.pdf
>
>
> This effort is intended to improve various aspects of cluster sharing for 
> LLAP. Some of these are applicable to non-LLAP queries and may later be 
> extended to all queries. Administrators will be able to specify and apply 
> policies for workload management ("resource plans") that apply to the entire 
> cluster, with only one resource plan being active at a time. The policies 
> will be created and modified using new Hive DDL statements. 
> The policies will cover:
> * Dividing the cluster into a set of (optionally, nested) query pools that 
> are each allocated a fraction of the cluster, a set query parallelism, 
> resource sharing policy between queries, and potentially others like 
> priority, etc.
> * Mapping the incoming queries into pools based on the query user, groups, 
> explicit configuration, etc.
> * Specifying rules that perform actions on queries based on counter values 
> (e.g. killing or moving queries).
> One would also be able to switch policies on a live cluster without (usually) 
> affecting running queries, including e.g. to change policies for daytime and 
> nighttime usage patterns, and other similar scenarios. The switches would be 
> safe and atomic; versioning may eventually be supported.
> Some implementation details:
> * WM will only be supported in HS2 (for obvious reasons).
> * All LLAP query AMs will run in "interactive" YARN queue and will be 
> fungible between Hive pools.
> * We will use the concept of "guaranteed tasks" (also known as ducks) to 
> enforce cluster allocation without a central scheduler and without 
> compromising throughput. Guaranteed tasks preempt other (speculative) tasks 
> and are distributed from HS2 to AMs, and from AMs to tasks, in accordance 
> with percentage allocations in the policy. Each "duck" corresponds to a CPU 
> resource on the cluster. The implementation will be isolated so as to allow 
> different ones later.
> * In future, we may consider improved task placement and late binding, 
> similar to the ones described in Sparrow paper, to work around potential 
> hotspots/etc. that are not avoided with the decentralized scheme.
> * Only one HS2 will initially be supported to avoid split-brain workload 
> management. We will also implement (in a tangential set of work items) 
> active-passive HS2 recovery. Eventually, we intend to switch to full 
> active-active HS2 configuration with shared WM and Tez session pool (unlike 
> the current case with 2 separate session pools). 
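
To make the "guaranteed tasks" bullet above concrete, here is a small illustrative sketch (not taken from the design doc) of how a fixed number of ducks, one per cluster CPU, could be split across pools according to their fractional allocations, with leftovers handed out by largest remainder:

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class DuckAllocationSketch {
  /**
   * Hypothetical example: split totalCpus "ducks" across pools in proportion
   * to their fractional allocations. Floors are assigned first; remaining
   * ducks go to the pools with the largest fractional remainders.
   */
  static Map<String, Integer> allocate(Map<String, Double> poolFractions, int totalCpus) {
    Map<String, Integer> ducks = new LinkedHashMap<>();
    Map<String, Double> remainders = new LinkedHashMap<>();
    int assigned = 0;
    for (Map.Entry<String, Double> e : poolFractions.entrySet()) {
      double exact = e.getValue() * totalCpus;
      int floor = (int) Math.floor(exact);
      ducks.put(e.getKey(), floor);
      remainders.put(e.getKey(), exact - floor);
      assigned += floor;
    }
    // Hand out the leftover ducks to the pools with the largest remainders.
    for (int i = assigned; i < totalCpus; i++) {
      if (remainders.isEmpty()) {
        break;
      }
      String best = null;
      for (Map.Entry<String, Double> e : remainders.entrySet()) {
        if (best == null || e.getValue() > remainders.get(best)) {
          best = e.getKey();
        }
      }
      ducks.put(best, ducks.get(best) + 1);
      remainders.put(best, -1.0); // don't give the same pool two leftovers
    }
    return ducks;
  }
}
{code}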



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17953) Metrics should move to destination atomically

2017-10-31 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17953:
--
Attachment: HIVE-17953.01.patch

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.01.patch
>
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It relied on the 
> assumption that {{Files.move()}} is an atomic operation. It turns out that by 
> default it isn't, unless the {{ATOMIC_MOVE}} option is specified; otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures, since the file may be temporarily unavailable.
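
For illustration, a minimal sketch of the intended NIO usage (the class name and the temp-file scheme are assumptions for this note, not the actual patch): write the snapshot to a temporary file in the same directory, then move it onto the destination with {{StandardCopyOption.ATOMIC_MOVE}}, so readers never observe a missing or half-written file.

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicMetricsDumpSketch {
  /** Illustrative only: publish a metrics snapshot atomically. */
  static void publish(String metricsJson, Path destination) throws IOException {
    // The temp file must live on the same file system as the destination,
    // otherwise an atomic move is not possible.
    Path tmp = Files.createTempFile(destination.getParent(), "metrics", ".tmp");
    Files.write(tmp, metricsJson.getBytes(StandardCharsets.UTF_8));
    // With ATOMIC_MOVE the operation either replaces the destination in one
    // step or fails; it never falls back to unlink-then-copy.
    Files.move(tmp, destination, StandardCopyOption.ATOMIC_MOVE);
  }
}
{code}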



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17953) Metrics should move to destination atomically

2017-10-31 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233531#comment-16233531
 ] 

Alexander Kolbasov commented on HIVE-17953:
---

[~asherman] [~stakiar] FYI

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.01.patch
>
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It relied on the 
> assumption that {{Files.move()}} is an atomic operation. It turns out that by 
> default it isn't, unless the {{ATOMIC_MOVE}} option is specified; otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures, since the file may be temporarily unavailable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17953) Metrics should move to destination atomically

2017-10-31 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17953:
--
Status: Patch Available  (was: Open)

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.01.patch
>
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It relied on the 
> assumption that {{Files.move()}} is an atomic operation. It turns out that by 
> default it isn't, unless the {{ATOMIC_MOVE}} option is specified; otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures, since the file may be temporarily unavailable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17953) Metrics should move to destination atomically

2017-10-31 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov reassigned HIVE-17953:
-


> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It relied on the 
> assumption that {{Files.move()}} is an atomic operation. It turns out that by 
> default it isn't, unless the {{ATOMIC_MOVE}} option is specified; otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures, since the file may be temporarily unavailable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17595) Correct DAG for updating the last.repl.id for a database during bootstrap load

2017-10-31 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227800#comment-16227800
 ] 

Daniel Dai commented on HIVE-17595:
---

bq. there is separate conditions that lead to execution of 
createEndReplLogTask, it is done after all the tasks are done as part of ...
It seems {{createEndReplLogTask}} is made dependent on the root tasks, not the leaves:
{code}
private Task createEndReplLogTask(...) {
  ..
  dependency(scope.rootTasks, replLogTask);
  ..
}
{code}

> Correct DAG for updating the last.repl.id for a database during bootstrap load
> --
>
> Key: HIVE-17595
> URL: https://issues.apache.org/jira/browse/HIVE-17595
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17595.0.patch, HIVE-17595.1.patch, 
> HIVE-17595.2.patch, HIVE-17595.3.patch
>
>
> We update the last.repl.id as a database property. This is done after all the 
> bootstrap tasks that load the relevant data are done, and it should be the last 
> task to run. However, we are currently not setting up the DAG correctly for 
> this task: it is added as a root task, whereas it should be the last task to 
> run in the DAG. This becomes more important after the inclusion of HIVE-17426, 
> since that will lead to parallel execution, and an incorrect DAG will lead to 
> incorrect results/state of the system. 
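
A hedged sketch of the kind of wiring the fix needs (class and helper names here are illustrative, not the actual patch): walk the DAG from the root tasks and make the task that updates last.repl.id a dependent of every leaf, so it only runs once all bootstrap-load tasks have finished.

{code:java}
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.hive.ql.exec.Task;

public class ReplDagWiringSketch {
  /** Illustrative only: attach lastTask after every leaf of the bootstrap DAG. */
  static void attachAfterLeaves(List<Task<? extends Serializable>> rootTasks,
      Task<? extends Serializable> lastTask) {
    Set<Task<? extends Serializable>> visited = new HashSet<>();
    Deque<Task<? extends Serializable>> stack = new ArrayDeque<>(rootTasks);
    while (!stack.isEmpty()) {
      Task<? extends Serializable> t = stack.pop();
      if (!visited.add(t)) {
        continue;
      }
      List<Task<? extends Serializable>> children = t.getChildTasks();
      if (children == null || children.isEmpty()) {
        t.addDependentTask(lastTask); // leaf: lastTask runs only after it
      } else {
        stack.addAll(children);
      }
    }
  }
}
{code}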



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17951) Clarify OrcSplit.hasBase() etc

2017-10-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17951:
-


> Clarify OrcSplit.hasBase() etc
> --
>
> Key: HIVE-17951
> URL: https://issues.apache.org/jira/browse/HIVE-17951
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> With HIVE-17089, the meanings of
> {code:java}
> OrcSplit.hasBase()
> OrcSplit.isOriginal()
> OrcSplit.isAcid()
> {code}
> have shifted somewhat.
> We need to clarify their definitions and uses.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-10-31 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227779#comment-16227779
 ] 

Vihang Karajgaonkar commented on HIVE-17812:


Thanks for the response, [~alangates]. Sorry if I am missing something obvious. 
Is HMSHandler a public API? I don't see it annotated with @Public. If it is not 
a public API, I don't see a reason why changing {{getHiveConf()}} to 
{{getConf()}} would be backwards incompatible. Are there other changes to the 
public API besides {{getConf()}}? I agree that if there are going to be 
backwards-incompatible changes coming later which would require recompiling all 
HMS clients, then there is no point in fixing this either. 

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.3.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need to be moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and its 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17467) HCatClient APIs for discovering partition key-values

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17467:

Attachment: HIVE-17467.2-branch-2.patch

Periodic rebase. :/

> HCatClient APIs for discovering partition key-values
> 
>
> Key: HIVE-17467
> URL: https://issues.apache.org/jira/browse/HIVE-17467
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog, Metastore
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17467.1-branch-2.patch, HIVE-17467.1.patch, 
> HIVE-17467.2-branch-2.patch, HIVE-17467.2.patch
>
>
> This is a follow-up to HIVE-17466, which adds the {{HiveMetaStore}}-level call 
> to retrieve unique combinations of part-key values that satisfy a specified 
> predicate.
> Attached herewith are the {{HCatClient}} APIs that will be used by Apache 
> Oozie before launching workflows.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17949) itests compile is busted on branch-1.2

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227758#comment-16227758
 ] 

Mithun Radhakrishnan commented on HIVE-17949:
-

Thanks for the review, [~sershe]. I'm just waiting for the tests to kick in 
before I check this in.

> itests compile is busted on branch-1.2
> --
>
> Key: HIVE-17949
> URL: https://issues.apache.org/jira/browse/HIVE-17949
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 1.2.3
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17949.01-branch-1.2.patch
>
>
> {{commit 18ddf46e0a8f092358725fc102235cbe6ba3e24d}} on {{branch-1.2}} was for 
> {{Preparing for 1.2.3 development}}. This should have also included 
> corresponding changes to all the pom-files under {{itests}}. As it stands 
> now, the build fails with the following:
> {noformat}
> [ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
>  no suitable method found for 
> updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
>  cannot find symbol
> [ERROR]   symbol:   class MiniZooKeeperCluster
> [ERROR]   location: class 
> org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :hive-it-unit
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-31 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Attachment: HIVE-15016.9.patch

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-31 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Status: Patch Available  (was: In Progress)

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-31 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Status: In Progress  (was: Patch Available)

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-31 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Attachment: (was: HIVE-15016.9.patch)

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-31 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227736#comment-16227736
 ] 

Aihua Xu commented on HIVE-15016:
-

Sure. Will do.

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-10-31 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17908:
--
Status: Patch Available  (was: Open)

> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch, 
> HIVE-17908.3.patch
>
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors; however, the error also occurs because 
> the client does not correctly handle the killTask notification when the 
> request has been accepted but is still waiting for the first task heartbeat. 
> In this situation the client should retry the request, similar to what the 
> LLAP AM does. The current logic ignores the killTask in this situation, which 
> results in a heartbeat timeout - no heartbeats are sent by LLAP because of the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, 
> cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received 
> reader event error: Timed out waiting for heartbeat for task ID 
> attempt_7739111832518812959_0005_0_00_10_0
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown
>  Source)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
> at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: 
> LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0):
>  Error while attempting to read chunk length
> at 
> org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
> ... 22 more
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}
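
A hedged, simplified sketch of the intended client-side handling (the state names and the resubmit hook are assumptions for illustration, not the actual Hive classes): a killTask that arrives before the first heartbeat triggers a resubmit of the fragment instead of being ignored.

{code:java}
public class PendingFragmentStateSketch {
  enum State { SUBMITTED_AWAITING_FIRST_HEARTBEAT, RUNNING, CLOSED }

  private State state = State.SUBMITTED_AWAITING_FIRST_HEARTBEAT;
  private final Runnable resubmit; // hypothetical hook that re-sends the fragment request

  PendingFragmentStateSketch(Runnable resubmit) {
    this.resubmit = resubmit;
  }

  synchronized void onFirstHeartbeat() {
    if (state == State.SUBMITTED_AWAITING_FIRST_HEARTBEAT) {
      state = State.RUNNING;
    }
  }

  synchronized void onKillTask() {
    if (state == State.SUBMITTED_AWAITING_FIRST_HEARTBEAT) {
      // Same behaviour as the LLAP AM: the fragment was preempted before it
      // started, so retry the request rather than waiting for heartbeats that
      // will never arrive.
      resubmit.run();
    } else if (state == State.RUNNING) {
      state = State.CLOSED; // fragment was genuinely killed
    }
  }
}
{code}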



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-10-31 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17908:
--
Attachment: HIVE-17908.3.patch

> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch, 
> HIVE-17908.3.patch
>
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors; however, the error also occurs because 
> the client does not correctly handle the killTask notification when the 
> request has been accepted but is still waiting for the first task heartbeat. 
> In this situation the client should retry the request, similar to what the 
> LLAP AM does. The current logic ignores the killTask in this situation, which 
> results in a heartbeat timeout - no heartbeats are sent by LLAP because of the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, 
> cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received 
> reader event error: Timed out waiting for heartbeat for task ID 
> attempt_7739111832518812959_0005_0_00_10_0
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown
>  Source)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
> at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: 
> LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0):
>  Error while attempting to read chunk length
> at 
> org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
> ... 22 more
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-10-31 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17908:
--
Status: Open  (was: Patch Available)

wtf is wrong with the precommit system

> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch
>
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors; however, the error also occurs because 
> the client does not correctly handle the killTask notification when the 
> request has been accepted but is still waiting for the first task heartbeat. 
> In this situation the client should retry the request, similar to what the 
> LLAP AM does. The current logic ignores the killTask in this situation, which 
> results in a heartbeat timeout - no heartbeats are sent by LLAP because of the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, 
> cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received 
> reader event error: Timed out waiting for heartbeat for task ID 
> attempt_7739111832518812959_0005_0_00_10_0
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown
>  Source)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
> at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: 
> LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0):
>  Error while attempting to read chunk length
> at 
> org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
> ... 22 more
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227718#comment-16227718
 ] 

Ashutosh Chauhan commented on HIVE-15016:
-

Figured out that the {{TestAcidOnTez}} failures are because the default values 
for a few configs changed between Hadoop 2.8 and 3.1. With the following change 
the test passes.
{code}
diff --git 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java
index dbfc23510c..65a1ed110e 100644
--- 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java
+++ 
b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java
@@ -49,6 +49,7 @@
 import org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2;
 import org.apache.hadoop.hive.ql.processors.CommandProcessorResponse;
 import org.apache.hadoop.hive.ql.session.SessionState;
+import org.apache.tez.mapreduce.hadoop.MRJobConfig;
 import org.junit.After;
 import org.junit.Assert;
 import org.junit.Before;
@@ -106,6 +107,8 @@ public void setUp() throws Exception {
 .setVar(HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER,
 
"org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory");
 TxnDbUtil.setConfValues(hiveConf);
+hiveConf.setInt(MRJobConfig.MAP_MEMORY_MB, 1024);
+hiveConf.setInt(MRJobConfig.REDUCE_MEMORY_MB, 1024);
 TxnDbUtil.prepDb(hiveConf);
 File f = new File(TEST_WAREHOUSE_DIR);
 if (f.exists()) {
{code}
[~aihuaxu] Can you incorporate the above in your patch as well?

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-10-31 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Attachment: HIVE-15016.9.patch

patch-9: address comments.

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back in Sep 2016 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16952) AcidUtils.parseBaseOrDeltaBucketFilename() end clause

2017-10-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16952:
--
Priority: Major  (was: Minor)

> AcidUtils.parseBaseOrDeltaBucketFilename() end clause
> -
>
> Key: HIVE-16952
> URL: https://issues.apache.org/jira/browse/HIVE-16952
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> The end of this method
> {noformat}
> } else {
>   result.setOldStyle(true).bucket(-1).minimumTransactionId(0)
>   .maximumTransactionId(0);
> }
> {noformat}
> Should this throw instead? bucket == -1 can't be handled by anything in 
> OrcRawRecordMerger or anywhere else.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-31 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227700#comment-16227700
 ] 

Eugene Koifman commented on HIVE-17458:
---

Patch 13 restores OrcSplit.canUseLlapIo() as before, which means reading 
"original" files will vectorize but not use LLAP IO.
There are various issues with that, which are reflected in the subtasks above.  
These can be handled at a later point.



> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, 
> HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch, 
> HIVE-17458.11.patch, HIVE-17458.12.patch, HIVE-17458.12.patch, 
> HIVE-17458.13.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from the file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies().
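
As a rough illustration of the offset computation described above (the file abstraction and row-count accessor are placeholders, not the actual reader code): sum the row counts of the "original" files that sort before the split's file within the logical bucket, and use that total as the starting ROW__ID offset for the split's batches.

{code:java}
import java.util.List;

public class OriginalFileRowOffsetSketch {
  /** Placeholder describing just the pieces the computation needs. */
  interface OriginalFile {
    String path();     // e.g. "000000_0", "000000_0_copy_1"
    long rowCount();   // number of rows in the file (e.g. from the ORC footer)
  }

  /**
   * Illustrative only: rows of the logical bucket are numbered across its files
   * in path-sorted order, so the split's first row id is the total row count of
   * all files that sort before it.
   */
  static long startingRowId(List<OriginalFile> filesSortedByPath, String splitPath) {
    long offset = 0;
    for (OriginalFile f : filesSortedByPath) {
      if (f.path().equals(splitPath)) {
        return offset;
      }
      offset += f.rowCount();
    }
    throw new IllegalArgumentException("Split file not found in bucket: " + splitPath);
  }
}
{code}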



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Attachment: HIVE-17458.13.patch

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, 
> HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch, 
> HIVE-17458.11.patch, HIVE-17458.12.patch, HIVE-17458.12.patch, 
> HIVE-17458.13.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from the file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies().



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16952) AcidUtils.parseBaseOrDeltaBucketFilename() end clause

2017-10-31 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227674#comment-16227674
 ] 

Eugene Koifman commented on HIVE-16952:
---

Note that Load Data can simply move files with arbitrary names into the table 
namespace, so non-acid to acid conversion (unbucketed) may see files with 
non-standard names. The "-1" may therefore be needed to send all such files to a 
single logical bucket so that the rows are numbered correctly when reading 
"original" files.

We could also hash the filename (that currently maps to -1) and mod by N to send 
files to different logical buckets, so that the first compaction doesn't end up 
with a lopsided split.
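
A tiny sketch of the hashing idea in the last paragraph (purely illustrative, not an existing Hive method): derive a stable logical bucket from the file name so that files with non-standard names spread across N buckets instead of all landing in bucket -1.

{code:java}
public class LogicalBucketSketch {
  /**
   * Illustrative only: map a non-standard file name to one of numBuckets
   * logical buckets. floorMod keeps the result non-negative even when
   * hashCode() is negative.
   */
  static int logicalBucket(String fileName, int numBuckets) {
    return Math.floorMod(fileName.hashCode(), numBuckets);
  }
}
{code}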

> AcidUtils.parseBaseOrDeltaBucketFilename() end clause
> -
>
> Key: HIVE-16952
> URL: https://issues.apache.org/jira/browse/HIVE-16952
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> The end of this method
> {noformat}
> } else {
>   result.setOldStyle(true).bucket(-1).minimumTransactionId(0)
>   .maximumTransactionId(0);
> }
> {noformat}
> Should this throw instead? bucket == -1 can't be handled by anything in 
> OrcRawRecordMerger or anywhere else.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-16917) HiveServer2 guard rails - Limit concurrent connections from user

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-16917:


Assignee: Prasanth Jayachandran

> HiveServer2 guard rails - Limit concurrent connections from user
> 
>
> Key: HIVE-16917
> URL: https://issues.apache.org/jira/browse/HIVE-16917
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Prasanth Jayachandran
>
> Rogue applications can make HS2 unusable for others by making too many 
> connections at a time.
> HS2 should start rejecting new connections from a user after that user's 
> number of open connections has reached a configurable threshold.
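
A hedged sketch of one way such a guard rail could work (names are illustrative, not the eventual HS2 implementation): keep a per-user counter and reject a new connection once the user's open-connection count reaches the configured limit.

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PerUserConnectionLimiterSketch {
  private final int maxPerUser; // configurable threshold
  private final ConcurrentMap<String, AtomicInteger> open = new ConcurrentHashMap<>();

  PerUserConnectionLimiterSketch(int maxPerUser) {
    this.maxPerUser = maxPerUser;
  }

  /** Returns true if the connection is admitted; false if it should be rejected. */
  boolean tryAcquire(String user) {
    AtomicInteger count = open.computeIfAbsent(user, u -> new AtomicInteger());
    while (true) {
      int current = count.get();
      if (current >= maxPerUser) {
        return false; // too many open connections for this user
      }
      if (count.compareAndSet(current, current + 1)) {
        return true;
      }
    }
  }

  /** Call when one of the user's connections closes. */
  void release(String user) {
    AtomicInteger count = open.get(user);
    if (count != null) {
      count.decrementAndGet();
    }
  }
}
{code}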



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17467) HCatClient APIs for discovering partition key-values

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17467:

Affects Version/s: 2.4.0
   3.0.0
   Status: Patch Available  (was: Open)

> HCatClient APIs for discovering partition key-values
> 
>
> Key: HIVE-17467
> URL: https://issues.apache.org/jira/browse/HIVE-17467
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog, Metastore
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17467.1-branch-2.patch, HIVE-17467.1.patch, 
> HIVE-17467.2.patch
>
>
> This is a follow-up to HIVE-17466, which adds the {{HiveMetaStore}}-level call 
> to retrieve unique combinations of part-key values that satisfy a specified 
> predicate.
> Attached herewith are the {{HCatClient}} APIs that will be used by Apache 
> Oozie before launching workflows.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17467) HCatClient APIs for discovering partition key-values

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17467:

Attachment: HIVE-17467.2.patch

> HCatClient APIs for discovering partition key-values
> 
>
> Key: HIVE-17467
> URL: https://issues.apache.org/jira/browse/HIVE-17467
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog, Metastore
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17467.1-branch-2.patch, HIVE-17467.1.patch, 
> HIVE-17467.2.patch
>
>
> This is a follow-up to HIVE-17466, which adds the {{HiveMetaStore}}-level call 
> to retrieve unique combinations of part-key values that satisfy a specified 
> predicate.
> Attached herewith are the {{HCatClient}} APIs that will be used by Apache 
> Oozie before launching workflows.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17950) Implement resource plan fetching from metastore

2017-10-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227664#comment-16227664
 ] 

Prasanth Jayachandran commented on HIVE-17950:
--

With HIVE-17907 this will change to a push model, so 
MetastoreResourcePlanTriggersFetcher will go away. 

> Implement resource plan fetching from metastore
> ---
>
> Key: HIVE-17950
> URL: https://issues.apache.org/jira/browse/HIVE-17950
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
>
> With HIVE-17884 committed, add real implementation for 
> MetastoreResourcePlanTriggersFetcher class and add qfile tests for triggers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17949) itests compile is busted on branch-1.2

2017-10-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227624#comment-16227624
 ] 

Sergey Shelukhin commented on HIVE-17949:
-

+1

> itests compile is busted on branch-1.2
> --
>
> Key: HIVE-17949
> URL: https://issues.apache.org/jira/browse/HIVE-17949
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 1.2.3
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17949.01-branch-1.2.patch
>
>
> {{commit 18ddf46e0a8f092358725fc102235cbe6ba3e24d}} on {{branch-1.2}} was for 
> {{Preparing for 1.2.3 development}}. This should have also included 
> corresponding changes to all the pom-files under {{itests}}. As it stands 
> now, the build fails with the following:
> {noformat}
> [ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
>  no suitable method found for 
> updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
>  cannot find symbol
> [ERROR]   symbol:   class MiniZooKeeperCluster
> [ERROR]   location: class 
> org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :hive-it-unit
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17841) implement applying the resource plan

2017-10-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17841:

Attachment: HIVE-17841.06.patch

Fixing the rest

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, 
> HIVE-17841.06.patch, HIVE-17841.06.patch, HIVE-17841.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17950) Implement resource plan fetching from metastore

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17950:
-
Description: With HIVE-17884 committed, add real implementation for 
MetastoreResourcePlanTriggersFetcher class and add qfile tests for triggers.

> Implement resource plan fetching from metastore
> ---
>
> Key: HIVE-17950
> URL: https://issues.apache.org/jira/browse/HIVE-17950
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
>
> With HIVE-17884 committed, add real implementation for 
> MetastoreResourcePlanTriggersFetcher class and add qfile tests for triggers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17950) Implement resource plan fetching from metastore

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-17950:


Assignee: Prasanth Jayachandran

> Implement resource plan fetching from metastore
> ---
>
> Key: HIVE-17950
> URL: https://issues.apache.org/jira/browse/HIVE-17950
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17950) Implement resource plan fetching from metastore

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17950:
-
Fix Version/s: 3.0.0

> Implement resource plan fetching from metastore
> ---
>
> Key: HIVE-17950
> URL: https://issues.apache.org/jira/browse/HIVE-17950
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17950) Implement resource plan fetching from metastore

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17950:
-
Affects Version/s: 3.0.0

> Implement resource plan fetching from metastore
> ---
>
> Key: HIVE-17950
> URL: https://issues.apache.org/jira/browse/HIVE-17950
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-10-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227571#comment-16227571
 ] 

Prasanth Jayachandran commented on HIVE-15157:
--

A Hadoop FS Path cannot contain ":".
Hive should provide a timestamp-to-path implementation that escapes/replaces ":" 
to support timestamp values in partition paths. 
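
For illustration, a minimal sketch of the escaping idea (the helper below is an assumption for this note, not an existing Hive API): percent-encode ":" when building the partition directory name, matching the {{%3A}} form visible in the SHOW PARTITIONS output quoted below.

{code:java}
public class TimestampPartitionPathSketch {
  /**
   * Illustrative only: escape the ':' characters that a Hadoop Path cannot
   * contain when building a partition directory name, producing e.g.
   * "tsbucket=2016-10-28 16%3A00%3A00".
   */
  static String toPartitionDir(String partColumn, String timestampValue) {
    String escaped = timestampValue.replace(":", "%3A");
    return partColumn + "=" + escaped;
  }
}
{code}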

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
>
> Hello,
> I get the error above when I try to perform:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue:
> -- External Hive table with dynamic partitioning enabled, on AWS S3 storage.
> -- Partitioned table with a timestamp-typed partition column.
> When I perform "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I perform "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stack trace from hive.log:
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17949) itests compile is busted on branch-1.2

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17949:

Status: Patch Available  (was: Open)

Submitting for tests. 

> itests compile is busted on branch-1.2
> --
>
> Key: HIVE-17949
> URL: https://issues.apache.org/jira/browse/HIVE-17949
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 1.2.3
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17949.01-branch-1.2.patch
>
>
> {{commit 18ddf46e0a8f092358725fc102235cbe6ba3e24d}} on {{branch-1.2}} was for 
> {{Preparing for 1.2.3 development}}. This should have also included 
> corresponding changes to all the pom-files under {{itests}}. As it stands 
> now, the build fails with the following:
> {noformat}
> [ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
>  no suitable method found for 
> updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
>  cannot find symbol
> [ERROR]   symbol:   class MiniZooKeeperCluster
> [ERROR]   location: class 
> org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :hive-it-unit
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17949) itests compile is busted on branch-1.2

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17949:

Attachment: HIVE-17949.01-branch-1.2.patch

Here's the fix. Could I please bother either of \[[~vgumashta], [~sershe]\] to 
review?

> itests compile is busted on branch-1.2
> --
>
> Key: HIVE-17949
> URL: https://issues.apache.org/jira/browse/HIVE-17949
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 1.2.3
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17949.01-branch-1.2.patch
>
>
> {{commit 18ddf46e0a8f092358725fc102235cbe6ba3e24d}} on {{branch-1.2}} was for 
> {{Preparing for 1.2.3 development}}. This should have also included 
> corresponding changes to all the pom-files under {{itests}}. As it stands 
> now, the build fails with the following:
> {noformat}
> [ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
>  no suitable method found for 
> updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
>  cannot find symbol
> [ERROR]   symbol:   class MiniZooKeeperCluster
> [ERROR]   location: class 
> org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :hive-it-unit
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17949) itests compile is busted on branch-1.2

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reassigned HIVE-17949:
---


> itests compile is busted on branch-1.2
> --
>
> Key: HIVE-17949
> URL: https://issues.apache.org/jira/browse/HIVE-17949
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 1.2.3
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>
> {{commit 18ddf46e0a8f092358725fc102235cbe6ba3e24d}} on {{branch-1.2}} was for 
> {{Preparing for 1.2.3 development}}. This should have also included 
> corresponding changes to all the pom-files under {{itests}}. As it stands 
> now, the build fails with the following:
> {noformat}
> [ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
>  no suitable method found for 
> updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
>  cannot find symbol
> [ERROR]   symbol:   class MiniZooKeeperCluster
> [ERROR]   location: class 
> org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :hive-it-unit
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1

2017-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227525#comment-16227525
 ] 

Hive QA commented on HIVE-17947:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12894991/HIVE-17947.1-branch-1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 176 failed/errored test(s), 7967 tests 
executed
*Failed tests:*
{noformat}
TestAcidOnTez - did not produce a TEST-*.xml file (likely timed out) 
(batchId=376)
TestAdminUser - did not produce a TEST-*.xml file (likely timed out) 
(batchId=358)
TestAuthorizationPreEventListener - did not produce a TEST-*.xml file (likely 
timed out) (batchId=391)
TestAuthzApiEmbedAuthorizerInEmbed - did not produce a TEST-*.xml file (likely 
timed out) (batchId=368)
TestAuthzApiEmbedAuthorizerInRemote - did not produce a TEST-*.xml file (likely 
timed out) (batchId=374)
TestBeeLineWithArgs - did not produce a TEST-*.xml file (likely timed out) 
(batchId=398)
TestCLIAuthzSessionContext - did not produce a TEST-*.xml file (likely timed 
out) (batchId=416)
TestClearDanglingScratchDir - did not produce a TEST-*.xml file (likely timed 
out) (batchId=383)
TestClientSideAuthorizationProvider - did not produce a TEST-*.xml file (likely 
timed out) (batchId=390)
TestCompactor - did not produce a TEST-*.xml file (likely timed out) 
(batchId=379)
TestCreateUdfEntities - did not produce a TEST-*.xml file (likely timed out) 
(batchId=378)
TestCustomAuthentication - did not produce a TEST-*.xml file (likely timed out) 
(batchId=399)
TestDBTokenStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=342)
TestDDLWithRemoteMetastoreSecondNamenode - did not produce a TEST-*.xml file 
(likely timed out) (batchId=377)
TestDynamicSerDe - did not produce a TEST-*.xml file (likely timed out) 
(batchId=345)
TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file (likely timed 
out) (batchId=355)
TestEmbeddedThriftBinaryCLIService - did not produce a TEST-*.xml file (likely 
timed out) (batchId=402)
TestEncryptedHDFSCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=437)
TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=350)
TestFolderPermissions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=385)
TestHCatLoaderEncryption - did not produce a TEST-*.xml file (likely timed out) 
(batchId=283)
TestHS2AuthzContext - did not produce a TEST-*.xml file (likely timed out) 
(batchId=419)
TestHS2AuthzSessionContext - did not produce a TEST-*.xml file (likely timed 
out) (batchId=420)
TestHS2ClearDanglingScratchDir - did not produce a TEST-*.xml file (likely 
timed out) (batchId=406)
TestHS2ImpersonationWithRemoteMS - did not produce a TEST-*.xml file (likely 
timed out) (batchId=407)
TestHiveAuthorizerCheckInvocation - did not produce a TEST-*.xml file (likely 
timed out) (batchId=394)
TestHiveAuthorizerShowFilters - did not produce a TEST-*.xml file (likely timed 
out) (batchId=393)
TestHiveHistory - did not produce a TEST-*.xml file (likely timed out) 
(batchId=396)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=370)
TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file 
(likely timed out) (batchId=360)
TestHiveMetaTool - did not produce a TEST-*.xml file (likely timed out) 
(batchId=373)
TestHiveServer2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=422)
TestHiveServer2SessionTimeout - did not produce a TEST-*.xml file (likely timed 
out) (batchId=423)
TestHiveSessionImpl - did not produce a TEST-*.xml file (likely timed out) 
(batchId=403)
TestHs2Hooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=375)
TestHs2HooksWithMiniKdc - did not produce a TEST-*.xml file (likely timed out) 
(batchId=451)
TestJdbcDriver2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=410)
TestJdbcMetadataApiAuth - did not produce a TEST-*.xml file (likely timed out) 
(batchId=421)
TestJdbcWithLocalClusterSpark - did not produce a TEST-*.xml file (likely timed 
out) (batchId=415)
TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=412)
TestJdbcWithMiniKdc - did not produce a TEST-*.xml file (likely timed out) 
(batchId=448)
TestJdbcWithMiniKdcCookie - did not produce a TEST-*.xml file (likely timed 
out) (batchId=447)
TestJdbcWithMiniKdcSQLAuthBinary - did not produce a TEST-*.xml file (likely 
timed out) (batchId=445)
TestJdbcWithMiniKdcSQLAuthHttp - did not produce a TEST-*.xml file (likely 
timed out) (batchId=450)
TestJdbcWithMiniMr - did not produce a TEST-*.xml file (likely timed out) 
(batchId=411)
TestJdbcWithSQLAuthUDFBlacklist - did not produce a TEST-*.xml file (likely 
timed out) (batchId=417)
TestJdbcWithSQLAuthorization - did not produce a TEST-*.xml file 

[jira] [Work started] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-10-31 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15157 started by Jesus Camacho Rodriguez.
--
> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
>
> Hello 
> I get the error above when I try to perform:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue:
> --External Hive table with dynamic partitioning enabled on AWS S3 storage.
> --Partitioned table with a timestamp partition column.
> When I perform "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I perform "describe FORMATTED table;" everything is fine.
> Is this a bug? 
> The stack trace from hive.log:
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17940) IllegalArgumentException when reading last row-group in an ORC stripe

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227419#comment-16227419
 ] 

Mithun Radhakrishnan commented on HIVE-17940:
-

bq. ... {{branch-1.2}} builds on my box.
I spoke too soon. Looks like {{branch-1.2}} is busted:

{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
(default-testCompile) on project hive-it-unit: Compilation failure: Compilation 
failure:
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[31,41]
 package org.apache.hadoop.hbase.zookeeper does not exist
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[42,11]
 cannot find symbol
[ERROR]   symbol:   class MiniZooKeeperCluster
[ERROR]   location: class org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestAdminUser.java:[41,12]
 cannot find symbol
[ERROR]   symbol:   method getPrivilege()
[ERROR]   location: class 
org.apache.hadoop.hive.metastore.api.HiveObjectPrivilege
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestAdminUser.java:[42,75]
 cannot find symbol
[ERROR]   symbol:   method getRole()
[ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
 no suitable method found for 
updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
[ERROR] method 
org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
 is not applicable
[ERROR]   (actual and formal argument lists differ in length)
[ERROR] method 
org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
 is not applicable
[ERROR]   (actual and formal argument lists differ in length)
[ERROR] method 
org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
 is not applicable
[ERROR]   (actual and formal argument lists differ in length)
[ERROR] method 
org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
 is not applicable
[ERROR]   (actual and formal argument lists differ in length)
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
 incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
cannot be converted to boolean
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
 incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
cannot be converted to boolean
[ERROR] 
/Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
 cannot find symbol
[ERROR]   symbol:   class MiniZooKeeperCluster
[ERROR]   location: class org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hive-it-unit
{noformat}

This is without HIVE-17940. I'll raise (yet) another JIRA to sort out the 
breakage. 

> IllegalArgumentException when reading last row-group in an ORC stripe
> -
>
> Key: HIVE-17940
>   

[jira] [Commented] (HIVE-17841) implement applying the resource plan

2017-10-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227412#comment-16227412
 ] 

Sergey Shelukhin commented on HIVE-17841:
-

Some tests still fail... looking into it.

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, 
> HIVE-17841.06.patch, HIVE-17841.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-31 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227401#comment-16227401
 ] 

Mithun Radhakrishnan commented on HIVE-17853:
-

[~vihangk1],

bq. All subsequent actions coming via HS2 should also do a doAs() using the 
sessionProxy. Is this happening in case of HCatalog...
Right. HS2 doesn't come into it, since this has more to do with {{HCatClient}}. 
The HCatalog APIs use {{HiveClientCache}} to amortize the cost of 
{{HiveMetaStoreClient}} construction and metastore connections.
Systems like Oozie/Falcon that use {{HCatClient}} to make metastore calls 
within a {{doAs()}} context might end up losing their {{UGI.doAs()}} contexts 
after a timeout, causing any retried actions to run as the privileged user 
rather than the impersonated user.
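
A hedged sketch of the failure mode being described, using assumed type names 
rather than the actual {{RetryingMetaStoreClient}}/{{HiveClientCache}} code: the 
UGI in effect when the client is first created has to be captured and reused, so 
that a transparent reconnect runs as the impersonated user instead of the 
login-user.

{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class ReconnectUgiSketch {

    // Hypothetical stand-in for the metastore connection; HiveMetaStoreClient /
    // HCatClient play this role in the real code paths discussed above.
    interface MetastoreConnection {
        void open() throws Exception;
        void close();
    }

    private final UserGroupInformation ownerUgi;   // effective user at construction time
    private final MetastoreConnection connection;

    ReconnectUgiSketch(MetastoreConnection connection) throws Exception {
        this.connection = connection;
        this.ownerUgi = UserGroupInformation.getCurrentUser();
    }

    // Reconnect as the original effective user, not as whatever login-user
    // happens to be current when the retry fires.
    void reconnect() throws Exception {
        connection.close();
        ownerUgi.doAs((PrivilegedExceptionAction<Void>) () -> {
            connection.open();
            return null;
        });
    }
}
{code}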

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
> Attachments: HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-10-31 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227371#comment-16227371
 ] 

Alan Gates commented on HIVE-17812:
---

Yes and no.  To do that I'll have to move HiveMetaStore along with this patch.  
That's fine, as I'll have to move it anyway.

But that patch will include changes to HMSHandler that will also be backward 
incompatible.  The getHiveConf method will be removed and replaced by getConf.  
This can't be avoided because HMSHandler will no longer have a HiveConf object 
to return and can't even reference HiveConf since standalone-metastore doesn't 
depend on common.  So, we're just kicking the can down the road.  I suspect all 
listeners will be broken without a shim layer whichever way we go.
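
For illustration only, with simplified names rather than the real HMSHandler 
surface: the kind of change a listener would have to make once only a Hadoop 
{{Configuration}} (rather than a {{HiveConf}}) is available from the handler.

{code}
import org.apache.hadoop.conf.Configuration;

public class ListenerConfAccessSketch {

    // Hypothetical stand-in for the post-move handler surface described above.
    interface HandlerLike {
        Configuration getConf();   // replaces the removed getHiveConf()
    }

    static String readWarehouseDir(HandlerLike handler) {
        // Before: HiveConf conf = handler.getHiveConf();
        // After:  only a plain Configuration is available from the handler.
        Configuration conf = handler.getConf();
        return conf.get("hive.metastore.warehouse.dir");
    }
}
{code}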

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.3.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17926) Support triggers for non-pool sessions

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17926:
-
Attachment: HIVE-17926.3.patch

The TestTriggersNoTezSessionPool test does not run the full triggers test 
suite, since launching a session for every query can be slow. 
TestTriggersNoTezSessionPool will now run only 2 tests. 

> Support triggers for non-pool sessions
> --
>
> Key: HIVE-17926
> URL: https://issues.apache.org/jira/browse/HIVE-17926
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, 
> HIVE-17926.2.patch, HIVE-17926.3.patch
>
>
> Current trigger implementation works only with tez session pools. In case 
> when tez sessions pools are not used, a new session gets created for every 
> query in which case trigger validation does not happen. It will be good to 
> support such one-off session case as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2017-10-31 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16826:
---
Comment: was deleted

(was: Interestingly, there seems to be an issue with the current code.  When I 
instruct beeline to disable quoting with {{disable.quoting.for.sv}}, my changes 
provide the same output as the current implementation.  However, when that 
option is not specified, there is a difference.
\\
\\
* theFileWhereToStoreTheData.csv = current implementation
* theFileWhereToStoreTheData.csv.mod = with my changes

{code}
[root@host ~]# md5sum theFileWhereToStoreTheData.csv*
6bfb928df7d2a7d778930bb972bc23c5  theFileWhereToStoreTheData.csv
fb3972fe583a4e1565a4fddb81dc8d62  theFileWhereToStoreTheData.csv.mod
{code}

For the first 20,000 output lines, we are good, but then it gets weird...

{code}
[root@host ~]# head -n 2 theFileWhereToStoreTheData.csv | xxd | md5sum
280b418c87ed701b509f4cbbdfe8fa29  -
[root@host ~]# head -n 2 theFileWhereToStoreTheData.csv.mod | xxd | md5sum
280b418c87ed701b509f4cbbdfe8fa29  -

[root@host ~]# head -n 21000 theFileWhereToStoreTheData.csv | xxd | md5sum
3b1eb5b7b63a5255c8e1539230d190a9  -
[root@host ~]# head -n 21000 theFileWhereToStoreTheData.csv.mod | xxd | md5sum
7de5ae6604e91a42a388c9826174ee30  -
{code}

Everything in the file starts fine...

{code}
[root@host ~]# head -n 4 theFileWhereToStoreTheData.csv | tail -n 2 | xxd
0000000: 3030 2d30 3030 302c 416c 6c20 4f63 6375  00-0000,All Occu
0000010: 7061 7469 6f6e 732c 3133 3433 3534 3235  pations,13435425
0000020: 302c 3430 3639 300a 3030 2d30 3030 302c  0,40690.00-0000,
0000030: 416c 6c20 4f63 6375 7061 7469 6f6e 732c  All Occupations,
0000040: 3133 3433 3534 3235 302c 3430 3639 300a  134354250,40690.
{code}

But then it changes behavior.  We see that strings are being quoted with NUL 
bytes "00":

{code}
[root@nightly513-unsecure-1 ~]# head -n 10 theFileWhereToStoreTheData.csv | 
tail -n 2 | xxd
0000000: 3135 2d31 3031 312c 0043 6f6d 7075 7465  15-1011,.Compute
0000010: 7220 616e 6420 696e 666f 726d 6174 696f  r and informatio
0000020: 6e20 7363 6965 6e74 6973 7473 2c20 7265  n scientists, re
0000030: 7365 6172 6368 002c 3238 3732 302c 3130  search.,28720,10
0000040: 3036 3430 0a31 352d 3130 3131 2c00 436f  0640.15-1011,.Co
0000050: 6d70 7574 6572 2061 6e64 2069 6e66 6f72  mputer and infor
0000060: 6d61 7469 6f6e 2073 6369 656e 7469 7374  mation scientist
0000070: 732c 2072 6573 6561 7263 6800 2c32 3837  s, research.,287
0000080: 3230 2c31 3030 3634 300a                 20,100640.
{code}

I can't figure out how these NUL bytes are being introduced in the current 
implementation, but my changes seem to address this issue and do not include 
these erroneous extra bytes.)

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2017-10-31 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227355#comment-16227355
 ] 

BELUGA BEHR commented on HIVE-16826:


Ah, I see the NUL values are already addressed in [HIVE-16625].  Ignore my 
previous comments.

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2017-10-31 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227352#comment-16227352
 ] 

BELUGA BEHR commented on HIVE-16826:


The changes also include a measurable speed improvement; it is probably even 
larger when there is a lot of GC activity going on:

{code}
-- current implementation
1,199,934 rows selected (25.695 seconds)
Closing: 0: jdbc:hive2://localhost:1

-- with these changes
1,199,934 rows selected (18.248 seconds)
Closing: 0: jdbc:hive2://localhost:1
{code}
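
A simplified sketch of the idea behind those numbers, not the actual beeline 
patch: build each output row into a single reusable, unsynchronized buffer 
instead of allocating a fresh writer (and its internal buffer) for every line 
printed. Quoting/escaping is omitted here.

{code}
import java.util.List;

public class SeparatedValuesSketch {

    private final char separator;
    private final StringBuilder lineBuffer = new StringBuilder();  // reused across rows

    SeparatedValuesSketch(char separator) {
        this.separator = separator;
    }

    String formatRow(List<String> fields) {
        lineBuffer.setLength(0);               // reset instead of reallocating
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                lineBuffer.append(separator);
            }
            lineBuffer.append(fields.get(i));  // no quoting/escaping in this sketch
        }
        return lineBuffer.toString();
    }
}
{code}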

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2017-10-31 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-16826:
---
Issue Type: Bug  (was: Improvement)

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2017-10-31 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227337#comment-16227337
 ] 

BELUGA BEHR commented on HIVE-16826:


Interestingly, there seems to be an issue with the current code.  When I 
instruct beeline to disable quoting with {{disable.quoting.for.sv}}, my changes 
provide the same output as the current implementation.  However, when that 
option is not specified, there is a difference.
\\
\\
* theFileWhereToStoreTheData.csv = current implementation
* theFileWhereToStoreTheData.csv.mod = with my changes

{code}
[root@host ~]# md5sum theFileWhereToStoreTheData.csv*
6bfb928df7d2a7d778930bb972bc23c5  theFileWhereToStoreTheData.csv
fb3972fe583a4e1565a4fddb81dc8d62  theFileWhereToStoreTheData.csv.mod
{code}

For the first 20,000 output lines, we are good, but then it gets weird...

{code}
[root@host ~]# head -n 2 theFileWhereToStoreTheData.csv | xxd | md5sum
280b418c87ed701b509f4cbbdfe8fa29  -
[root@host ~]# head -n 2 theFileWhereToStoreTheData.csv.mod | xxd | md5sum
280b418c87ed701b509f4cbbdfe8fa29  -

[root@host ~]# head -n 21000 theFileWhereToStoreTheData.csv | xxd | md5sum
3b1eb5b7b63a5255c8e1539230d190a9  -
[root@host ~]# head -n 21000 theFileWhereToStoreTheData.csv.mod | xxd | md5sum
7de5ae6604e91a42a388c9826174ee30  -
{code}

Everything in the file starts fine...

{code}
[root@host ~]# head -n 4 theFileWhereToStoreTheData.csv | tail -n 2 | xxd
0000000: 3030 2d30 3030 302c 416c 6c20 4f63 6375  00-0000,All Occu
0000010: 7061 7469 6f6e 732c 3133 3433 3534 3235  pations,13435425
0000020: 302c 3430 3639 300a 3030 2d30 3030 302c  0,40690.00-0000,
0000030: 416c 6c20 4f63 6375 7061 7469 6f6e 732c  All Occupations,
0000040: 3133 3433 3534 3235 302c 3430 3639 300a  134354250,40690.
{code}

But then it changes behavior.  We see that strings are being quoted with NUL 
bytes "00":

{code}
[root@nightly513-unsecure-1 ~]# head -n 10 theFileWhereToStoreTheData.csv | 
tail -n 2 | xxd
0000000: 3135 2d31 3031 312c 0043 6f6d 7075 7465  15-1011,.Compute
0000010: 7220 616e 6420 696e 666f 726d 6174 696f  r and informatio
0000020: 6e20 7363 6965 6e74 6973 7473 2c20 7265  n scientists, re
0000030: 7365 6172 6368 002c 3238 3732 302c 3130  search.,28720,10
0000040: 3036 3430 0a31 352d 3130 3131 2c00 436f  0640.15-1011,.Co
0000050: 6d70 7574 6572 2061 6e64 2069 6e66 6f72  mputer and infor
0000060: 6d61 7469 6f6e 2073 6369 656e 7469 7374  mation scientist
0000070: 732c 2072 6573 6561 7263 6800 2c32 3837  s, research.,287
0000080: 3230 2c31 3030 3634 300a                 20,100640.
{code}

I can't figure out how these NUL bytes are being introduced in the current 
implementation, but my changes seem to address this issue and do not include 
these erroneous extra bytes.

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-31 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-17853:
---
Attachment: HIVE-17853.01-branch-2.patch

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
> Attachments: HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-31 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-17853:
---
Attachment: (was: HIVE-17853.01-branch-2.2.patch)

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
> Attachments: HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-31 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-17853:
---
Attachment: (was: HIVE-17853.01-branch-2.patch)

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
> Attachments: HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2017-10-31 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227329#comment-16227329
 ] 

Thejas M Nair commented on HIVE-9350:
-

[~akolb] Please see my first comment in the jira.
bq. [~prasadm] FYI, this patch introduces a small change to metastore 
filterhook api. The signature of the apis allows for MetaException to be 
thrown, this helps in propagating error messages properly. 

Without that, it would need to throw unchecked exceptions. Having a clear 
exception in the signature informs users of the API about what to expect and 
handle.
Since this API is not widely used by end users, only by advanced consumers such 
as Sentry, I notified its original author. It should be easy to make the 
corresponding change in Sentry when it adds support for this new version of 
Hive (1.2.0).
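
A generic illustration of that point, using hypothetical names rather than 
Hive's actual filter-hook signatures: declaring the checked exception in the 
SPI tells implementors and callers what to handle, instead of forcing errors 
into unchecked exceptions.

{code}
import java.util.List;

public class FilterHookExceptionSketch {

    // Checked-exception stand-in for MetaException.
    static class FilterException extends Exception {
        FilterException(String msg) { super(msg); }
    }

    // Hypothetical hook: the throws clause documents what callers must handle.
    interface TableFilterHook {
        List<String> filterTableNames(String dbName, List<String> tableNames)
            throws FilterException;
    }

    static List<String> applyHook(TableFilterHook hook, String db, List<String> tables) {
        try {
            return hook.filterTableNames(db, tables);
        } catch (FilterException e) {
            // The original error message can be propagated to the client cleanly.
            throw new RuntimeException("Authorization filter failed: " + e.getMessage(), e);
        }
    }
}
{code}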


> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
> HIVE-9350.4.patch, HIVE-9350.5.patch
>
>
> It should be possible for HiveAuthorizer implementations to control if a user 
> is able to see a table or database in results of 'show tables' and 'show 
> databases' respectively.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17934) Merging Statistics are promoted to COMPLETE (most of the time)

2017-10-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227328#comment-16227328
 ] 

Hive QA commented on HIVE-17934:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12894989/HIVE-17934.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1008 failed/errored test(s), 11345 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=243)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[select_dummy_source] 
(batchId=243)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=243)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[allcolref_in_udf] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStatsPart] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ambiguous_col] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_part] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_9] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join0] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join10] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join11] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join12] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join13] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join14] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join15] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join16] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join17] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18_multi_distinct]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19] (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19_inclause] 
(batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join1] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join20] (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join21] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join22] (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join23] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join24] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join26] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join27] (batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join28] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join29] (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join2] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join31] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join33] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join3] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join4] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join5] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join6] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join7] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join8] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join9] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values]
 (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats2] 
(batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask]
 (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_10] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_11] 
(batchId=84)

[jira] [Updated] (HIVE-17841) implement applying the resource plan

2017-10-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17841:

Attachment: HIVE-17841.06.patch

Fixing the trigger tests (at least some of them; they take forever to run, so I 
will look at the remaining failures once the run finishes). For now it will 
only work for one pool. 
This also hides the internals, requiring the tests to go through the normal 
interface.
cc [~prasanth_j]



> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, 
> HIVE-17841.06.patch, HIVE-17841.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-10-31 Thread Chris Drome (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227311#comment-16227311
 ] 

Chris Drome commented on HIVE-17853:


[~vihangk1], as per the description, consider the case of Oozie {{oozie}} 
impersonating a different user {{mithun}}. The {{oozie}} user will create a 
client and open the connection to the metastore within the doAs clause, which 
means that all operations during this session are performed as {{mithun}}.

A retry/reconnect can occur if the read timeout for an operation is exceeded or 
the lifetime of the connection is exceeded. At this point, {{close}} is called 
explicitly, followed by a call to {{open}} to establish a new connection. 
However, the reconnect call is not being performed in a doAs context, so it 
will create a new connection to the metastore as {{oozie}}.

There is no specific stack trace to attach here as it depends on the operations 
executed after the reconnect, and typically manifests as a failure caused by 
insufficient privileges. Worst case, if {{oozie}} has more privileges than 
{{mithun}}, it will successfully perform operations that {{mithun}} is not 
allowed to perform.

According to the API, fetching the UserGroupInformation object can throw an 
IOException. I'm not familiar with the cases under which this would occur. 
However, I didn't want to fail immediately, because if the connection was 
initially established within a doAs, the calling code should have been able to 
establish a proper identity. So I let as much work as possible get done before 
the reconnect fails, which shouldn't be a problem, because most metastore 
sessions are not long-lived.
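
For anyone following along, here is a minimal, self-contained sketch of the 
pattern under discussion. The helper names ({{openMetaStoreConnection}}, 
{{reconnect}}) are hypothetical stand-ins for the client's open/close logic, not 
the actual RetryingMetaStoreClient API; the point is only that the initial open 
runs inside {{doAs}} (so the metastore sees the effective user), while a 
reconnect issued outside {{doAs}} silently falls back to the login user.

{code}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;

// Sketch only: openMetaStoreConnection()/reconnect() are placeholders for the
// client's connection logic, not real Hive APIs.
public class UgiReconnectSketch {

  public static void main(String[] args) throws Exception {
    // Login user is the service principal, e.g. "oozie".
    UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
    // Proxy user carries the effective identity, e.g. "mithun".
    UserGroupInformation proxyUser =
        UserGroupInformation.createProxyUser("mithun", loginUser);

    // Initial connection: performed inside doAs, so the metastore sees "mithun".
    proxyUser.doAs((PrivilegedExceptionAction<Void>) () -> {
      openMetaStoreConnection();
      return null;
    });

    // Problematic retry path: reconnect() runs outside any doAs, so the new
    // connection is authenticated as the login user ("oozie").
    reconnect();

    // Safer retry path: capture the UGI up front and wrap the reconnect in
    // doAs so the effective user is preserved across the new connection.
    proxyUser.doAs((PrivilegedExceptionAction<Void>) () -> {
      reconnect();
      return null;
    });
  }

  private static void openMetaStoreConnection() { /* open the Thrift connection */ }

  private static void reconnect() { /* close() followed by open() */ }
}
{code}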

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 2.4.0, 2.2.1
>Reporter: Mithun Radhakrishnan
>Assignee: Chris Drome
>Priority: Critical
> Attachments: HIVE-17853.01-branch-2.2.patch, 
> HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types

2017-10-31 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15552:
---
Attachment: HIVE-15552.01.patch

> unable to coalesce DATE and TIMESTAMP types
> ---
>
> Key: HIVE-15552
> URL: https://issues.apache.org/jira/browse/HIVE-15552
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15552.01.patch, HIVE-15552.patch
>
>
> The COALESCE expression does not accept a mix of DATE and TIMESTAMP types:
> select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from 
> certtext.tdt
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Argument type mismatch 'cdt': The expressions after COALESCE should all have 
> the same type: "date" is expected but "timestamp" is found
> SQLState:  42000
> ErrorCode: 4



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17948) Hive 2.3.2 Release Planning

2017-10-31 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17948:
---


> Hive 2.3.2 Release Planning
> ---
>
> Key: HIVE-17948
> URL: https://issues.apache.org/jira/browse/HIVE-17948
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.2
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.3.2
>
>
> Release planning for Hive 2.3.2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17926) Support triggers for non-pool sessions

2017-10-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227246#comment-16227246
 ] 

Prasanth Jayachandran commented on HIVE-17926:
--

test failures are unrelated

> Support triggers for non-pool sessions
> --
>
> Key: HIVE-17926
> URL: https://issues.apache.org/jira/browse/HIVE-17926
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, 
> HIVE-17926.2.patch
>
>
> The current trigger implementation works only with tez session pools. When 
> tez session pools are not used, a new session gets created for every query, 
> in which case trigger validation does not happen. It would be good to 
> support such one-off sessions as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17834) Fix flaky triggers test

2017-10-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17834:
-
Status: Patch Available  (was: Reopened)

> Fix flaky triggers test
> ---
>
> Key: HIVE-17834
> URL: https://issues.apache.org/jira/browse/HIVE-17834
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 3.0.0
>
> Attachments: HIVE-17834.1.patch, HIVE-17834.2.patch, 
> HIVE-17834.3.patch, HIVE-17834.4.patch, HIVE-17834.4.patch, HIVE-17834.5.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-12631?focusedCommentId=16209803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16209803



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1

2017-10-31 Thread Daniel Voros (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227233#comment-16227233
 ] 

Daniel Voros commented on HIVE-17947:
-

I agree, those other uses only run on a single partition if the table is 
partitioned. And it should be partitioned, right? (:

Yeah, something along those lines should work. I'll submit a new patch shortly.

> Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
> -
>
> Key: HIVE-17947
> URL: https://issues.apache.org/jira/browse/HIVE-17947
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-17947.1-branch-1.patch
>
>
> HIVE-17526 (only on branch-1) disabled conversion to ACID if there are 
> *_copy_N files under the table, but the filesystem checks introduced there 
> run for every insert, since the MoveTask at the end of the insert will 
> eventually call alterTable.
> The filename checking also recurses into staging directories created by other 
> inserts. If those are removed while listing the files, it leads to the 
> following exception and failing insert:
> {code}
> java.io.FileNotFoundException: File 
> hdfs://mycluster/apps/hive/warehouse/dvoros.db/concurrent_insert/.hive-staging_hive_2017-10-30_13-23-35_056_2844419018556002410-2/-ext-10001
>  does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1081)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1059)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1004)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1000)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1018)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1735) 
> ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$6.handleFileStat(FileSystem.java:1864) 
> ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at org.apache.hadoop.fs.FileSystem$6.hasNext(FileSystem.java:1841) 
> ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.containsCopyNFiles(TransactionalValidationListener.java:226)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.handleAlterTableTransactionalProp(TransactionalValidationListener.java:104)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.handle(TransactionalValidationListener.java:63)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.onEvent(TransactionalValidationListener.java:55)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(HiveMetaStore.java:2478)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4145)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4117)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at sun.reflect.GeneratedMethodAccessor107.invoke(Unknown Source) 
> ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_144]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source) 
> [?:?]
> at 
> 

[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-31 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227227#comment-16227227
 ] 

Eugene Koifman commented on HIVE-17458:
---

I'll make RB once I fix some of the issues here

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, 
> HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch, 
> HIVE-17458.11.patch, HIVE-17458.12.patch, HIVE-17458.12.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()
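
To make the offset arithmetic concrete, a small illustrative sketch follows. It 
is not the actual VectorizedOrcAcidRowBatchReader or OrcRawRecordMerger code; 
the file names and row counts are invented. The idea is simply that a split over 
an 'original' file can derive its starting row offset by summing the row counts 
of the files that sort before it in the same logical bucket:

{code}
import java.util.LinkedHashMap;
import java.util.Map;

public class OriginalFileOffsetSketch {

  /**
   * Returns the row offset at which the given file starts within its logical
   * bucket, i.e. the sum of the row counts of the files that sort before it.
   */
  static long startingRowOffset(Map<String, Long> rowCountsInBucketOrder, String file) {
    long offset = 0;
    for (Map.Entry<String, Long> e : rowCountsInBucketOrder.entrySet()) {
      if (e.getKey().equals(file)) {
        return offset;
      }
      offset += e.getValue();
    }
    throw new IllegalArgumentException("unknown file: " + file);
  }

  public static void main(String[] args) {
    // Hypothetical bucket contents before major compaction; insertion order
    // models the sorted order of the original files within the bucket.
    Map<String, Long> bucket = new LinkedHashMap<>();
    bucket.put("000000_0", 1000L);
    bucket.put("000000_0_copy_1", 500L);
    bucket.put("000000_0_copy_2", 250L);

    // A split over 000000_0_copy_2 would synthesize row ids starting at 1500.
    System.out.println(startingRowOffset(bucket, "000000_0_copy_2"));
  }
}
{code}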



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1

2017-10-31 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227198#comment-16227198
 ] 

Eugene Koifman commented on HIVE-17947:
---

My $0.02
The other uses of listStatusRecursively() look like they wouldn't be run on the 
whole table, but rather on 1 partition at a time, so they're much less likely to 
see a lot of files.
Couldn't something like this work?
{noformat}
  public static boolean listStatusRecursively(FileSystem fs, FileStatus fileStatus,
      PathFilter filter, List<FileStatus> results) throws IOException {
    if (fileStatus.isDir()) {
      for (FileStatus stat : fs.listStatus(fileStatus.getPath(), filter)) {
        // propagate the result so the scan stops at the first copy_N file
        if (listStatusRecursively(fs, stat, filter, results)) {
          return true;
        }
      }
      return false;
    } else {
      results.add(fileStatus);
      return isCopyFile(fileStatus);
    }
  }
{noformat}

> Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
> -
>
> Key: HIVE-17947
> URL: https://issues.apache.org/jira/browse/HIVE-17947
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Blocker
> Attachments: HIVE-17947.1-branch-1.patch
>
>
> HIVE-17526 (only on branch-1) disabled conversion to ACID if there are 
> *_copy_N files under the table, but the filesystem checks introduced there 
> run for every insert, since the MoveTask at the end of the insert will 
> eventually call alterTable.
> The filename checking also recurses into staging directories created by other 
> inserts. If those are removed while listing the files, it leads to the 
> following exception and failing insert:
> {code}
> java.io.FileNotFoundException: File 
> hdfs://mycluster/apps/hive/warehouse/dvoros.db/concurrent_insert/.hive-staging_hive_2017-10-30_13-23-35_056_2844419018556002410-2/-ext-10001
>  does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1081)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1059)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1004)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1000)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1018)
>  ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1735) 
> ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$6.handleFileStat(FileSystem.java:1864) 
> ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at org.apache.hadoop.fs.FileSystem$6.hasNext(FileSystem.java:1841) 
> ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.containsCopyNFiles(TransactionalValidationListener.java:226)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.handleAlterTableTransactionalProp(TransactionalValidationListener.java:104)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.handle(TransactionalValidationListener.java:63)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.TransactionalValidationListener.onEvent(TransactionalValidationListener.java:55)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(HiveMetaStore.java:2478)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4145)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4117)
>  [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> at sun.reflect.GeneratedMethodAccessor107.invoke(Unknown Source) 
> ~[?:?]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_144]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> at 
> 
