date:20180601

[jira] [Updated] (HIVE-19529) Vectorization: Date/Timestamp NULL issues

2018-06-01 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19529:

Fix Version/s: 3.1.0

> Vectorization: Date/Timestamp NULL issues
> -
>
> Key: HIVE-19529
> URL: https://issues.apache.org/jira/browse/HIVE-19529
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19529.06-branch-3.patch, HIVE-19529.06.patch
>
>
> Wrong results found for:
>  date_add/date_sub
> UT areas:
>  date_add/date_sub
> datediff
> to_date
> interval_year_month + interval_year_month
>  interval_day_time + interval_day_time
>  interval_day_time + timestamp
>  timestamp + interval_day_time
>  date + interval_day_time
>  interval_day_time + date
>  interval_year_month + date
>  date + interval_year_month
>  interval_year_month + interval_year_month
>  timestamp + interval_year_month
> date - date
>  interval_year_month - interval_year_month
>  interval_day_time - interval_day_time
>  timestamp - interval_day_time
>  timestamp - timestamp
>  date - timestamp
>  timestamp - date
>  date - interval_day_time
>  date - interval_year_month
>  timestamp - interval_year_month



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19529) Vectorization: Date/Timestamp NULL issues

2018-06-01 Thread Matt McCline (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498910#comment-16498910
 ] 

Matt McCline commented on HIVE-19529:
-

Committed to branch-3

> Vectorization: Date/Timestamp NULL issues
> -
>
> Key: HIVE-19529
> URL: https://issues.apache.org/jira/browse/HIVE-19529
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19529.06-branch-3.patch, HIVE-19529.06.patch
>
>
> Wrong results found for:
>  date_add/date_sub
> UT areas:
>  date_add/date_sub
> datediff
> to_date
> interval_year_month + interval_year_month
>  interval_day_time + interval_day_time
>  interval_day_time + timestamp
>  timestamp + interval_day_time
>  date + interval_day_time
>  interval_day_time + date
>  interval_year_month + date
>  date + interval_year_month
>  interval_year_month + interval_year_month
>  timestamp + interval_year_month
> date - date
>  interval_year_month - interval_year_month
>  interval_day_time - interval_day_time
>  timestamp - interval_day_time
>  timestamp - timestamp
>  date - timestamp
>  timestamp - date
>  date - interval_day_time
>  date - interval_year_month
>  timestamp - interval_year_month



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19529) Vectorization: Date/Timestamp NULL issues

2018-06-01 Thread Matt McCline (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498909#comment-16498909
 ] 

Matt McCline commented on HIVE-19529:
-

Test failures are unrelated.

> Vectorization: Date/Timestamp NULL issues
> -
>
> Key: HIVE-19529
> URL: https://issues.apache.org/jira/browse/HIVE-19529
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: HIVE-19529.06-branch-3.patch, HIVE-19529.06.patch
>
>
> Wrong results found for:
>  date_add/date_sub
> UT areas:
>  date_add/date_sub
> datediff
> to_date
> interval_year_month + interval_year_month
>  interval_day_time + interval_day_time
>  interval_day_time + timestamp
>  timestamp + interval_day_time
>  date + interval_day_time
>  interval_day_time + date
>  interval_year_month + date
>  date + interval_year_month
>  interval_year_month + interval_year_month
>  timestamp + interval_year_month
> date - date
>  interval_year_month - interval_year_month
>  interval_day_time - interval_day_time
>  timestamp - interval_day_time
>  timestamp - timestamp
>  date - timestamp
>  timestamp - date
>  date - interval_day_time
>  date - interval_year_month
>  timestamp - interval_year_month



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19646) Filesystem closed error in HiveProtoLoggingHook

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498907#comment-16498907
 ] 

Hive QA commented on HIVE-19646:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925698/HIVE-19646.06-branch-3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11430/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11430/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11430/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12925698/HIVE-19646.06-branch-3.patch
 was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925698 - PreCommit-HIVE-Build

> Filesystem closed error in HiveProtoLoggingHook
> ---
>
> Key: HIVE-19646
> URL: https://issues.apache.org/jira/browse/HIVE-19646
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HIVE-19646.01-branch-3.patch, HIVE-19646.01.patch, 
> HIVE-19646.02.patch, HIVE-19646.03-branch-3.patch, HIVE-19646.03.patch, 
> HIVE-19646.04.patch, HIVE-19646.05-branch-3.patch, HIVE-19646.05.patch, 
> HIVE-19646.06-branch-3.patch
>
>
> Exception in proto logging hook on secure cluster.
> {code}
> 2018-05-18T04:48:01,136 ERROR [Hive Hook Proto Log Writer 0]: 
> hooks.HiveProtoLoggingHook (:()) - Error writing proto message for query 
> hive_20180518043717_ca3ab4df-6cab-4920-aa44-2340ae246ad2, eventType: 
> QUERY_SUBMITTED:
> java.io.IOException: Filesystem closed
>  at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:472) 
> ~[hadoop-hdfs-client-3.0.0.3.0.0.0-1298.jar:?]
>  at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1652) 
> ~[hadoop-hdfs-client-3.0.0.3.0.0.0-1298.jar:?]
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1569)
>  ~[hadoop-hdfs-client-3.0.0.3.0.0.0-1298.jar:?]
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1566)
>  ~[hadoop-hdfs-client-3.0.0.3.0.0.0-1298.jar:?]
>  at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  ~[hadoop-common-3.0.0.3.0.0.0-1298.jar:?]
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1581)
>  ~[hadoop-hdfs-client-3.0.0.3.0.0.0-1298.jar:?]
>  at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1734) 
> ~[hadoop-common-3.0.0.3.0.0.0-1298.jar:?]
>  at 
> org.apache.hadoop.hive.ql.hooks.DatePartitionedLogger.getPathForDate(DatePartitionedLogger.java:89)
>  ~[hive-exec-3.0.0.3.0.0.0-1298.jar:3.0.0.3.0.0.0-1298]
>  at 
> org.apache.hadoop.hive.ql.hooks.DatePartitionedLogger.getWriter(DatePartitionedLogger.java:73)
>  ~[hive-exec-3.0.0.3.0.0.0-1298.jar:3.0.0.3.0.0.0-1298]
>  at 
> org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook$EventLogger.writeEvent(HiveProtoLoggingHook.java:283)
>  ~[hive-exec-3.0.0.3.0.0.0-1298.jar:3.0.0.3.0.0.0-1298]
>  at 
> org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook$EventLogger.lambda$generateEvent$1(HiveProtoLoggingHook.java:274)
>  ~[hive-exec-3.0.0.3.0.0.0-1298.jar:3.0.0.3.0.0.0-1298]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_161]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19529) Vectorization: Date/Timestamp NULL issues

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498906#comment-16498906
 ] 

Hive QA commented on HIVE-19529:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925937/HIVE-19529.06-branch-3.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 14379 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidkafkamini_basic]
 (batchId=253)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=305)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11429/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11429/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11429/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925937 - PreCommit-HIVE-Build

> Vectorization: Date/Timestamp NULL issues
> -
>
> Key: HIVE-19529
> URL: https://issues.apache.org/jira/browse/HIVE-19529
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: HIVE-19529.06-branch-3.patch, HIVE-19529.06.patch
>
>
> Wrong results found for:
>  date_add/date_sub
> UT areas:
>  date_add/date_sub
> datediff
> to_date
> interval_year_month + interval_year_month
>  interval_day_time + interval_day_time
>  interval_day_time + timestamp
>  timestamp + interval_day_time
>  date + interval_day_time
>  interval_day_time + date
>  interval_year_month + date
>  date + interval_year_month
>  interval_year_month + interval_year_month
>  timestamp + interval_year_month
> date - date
>  interval_year_month - interval_year_month
>  interval_day_time - interval_day_time
>  timestamp - interval_day_time
>  timestamp - timestamp
>  date - timestamp
>  timestamp - date
>  date - interval_day_time
>  date - interval_year_month
>  timestamp - interval_year_month



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-01 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Attachment: HIVE-19668.01.patch

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-01 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: Patch Available  (was: In Progress)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Work started] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-01 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-19668 started by Misha Dmitriev.
-
> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Issue Comment Deleted] (HIVE-19529) Vectorization: Date/Timestamp NULL issues

2018-06-01 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19529:

Comment: was deleted

(was: #11429 ?)

> Vectorization: Date/Timestamp NULL issues
> -
>
> Key: HIVE-19529
> URL: https://issues.apache.org/jira/browse/HIVE-19529
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: HIVE-19529.06-branch-3.patch, HIVE-19529.06.patch
>
>
> Wrong results found for:
>  date_add/date_sub
> UT areas:
>  date_add/date_sub
> datediff
> to_date
> interval_year_month + interval_year_month
>  interval_day_time + interval_day_time
>  interval_day_time + timestamp
>  timestamp + interval_day_time
>  date + interval_day_time
>  interval_day_time + date
>  interval_year_month + date
>  date + interval_year_month
>  interval_year_month + interval_year_month
>  timestamp + interval_year_month
> date - date
>  interval_year_month - interval_year_month
>  interval_day_time - interval_day_time
>  timestamp - interval_day_time
>  timestamp - timestamp
>  date - timestamp
>  timestamp - date
>  date - interval_day_time
>  date - interval_year_month
>  timestamp - interval_year_month



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19493) VectorUDFDateDiffColCol copySelected does not handle nulls correctly

2018-06-01 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19493:

Status: Patch Available  (was: In Progress)

> VectorUDFDateDiffColCol copySelected does not handle nulls correctly
> 
>
> Key: HIVE-19493
> URL: https://issues.apache.org/jira/browse/HIVE-19493
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Vihang Karajgaonkar
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-19493.01.patch, HIVE-19493.02.patch, 
> HIVE-19493.04.patch
>
>
> The {{copySelected}} method in {{VectorUDFDateDiffColCol}} class was missed 
> during HIVE-18622



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19493) VectorUDFDateDiffColCol copySelected does not handle nulls correctly

2018-06-01 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19493:

Status: In Progress  (was: Patch Available)

> VectorUDFDateDiffColCol copySelected does not handle nulls correctly
> 
>
> Key: HIVE-19493
> URL: https://issues.apache.org/jira/browse/HIVE-19493
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Vihang Karajgaonkar
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-19493.01.patch, HIVE-19493.02.patch, 
> HIVE-19493.04.patch
>
>
> The {{copySelected}} method in {{VectorUDFDateDiffColCol}} class was missed 
> during HIVE-18622



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19493) VectorUDFDateDiffColCol copySelected does not handle nulls correctly

2018-06-01 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19493:

Attachment: HIVE-19493.04.patch

> VectorUDFDateDiffColCol copySelected does not handle nulls correctly
> 
>
> Key: HIVE-19493
> URL: https://issues.apache.org/jira/browse/HIVE-19493
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Vihang Karajgaonkar
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-19493.01.patch, HIVE-19493.02.patch, 
> HIVE-19493.04.patch
>
>
> The {{copySelected}} method in {{VectorUDFDateDiffColCol}} class was missed 
> during HIVE-18622



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-19493) VectorUDFDateDiffColCol copySelected does not handle nulls correctly

2018-06-01 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-19493:
---

Assignee: Matt McCline  (was: Vihang Karajgaonkar)

> VectorUDFDateDiffColCol copySelected does not handle nulls correctly
> 
>
> Key: HIVE-19493
> URL: https://issues.apache.org/jira/browse/HIVE-19493
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Vihang Karajgaonkar
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-19493.01.patch, HIVE-19493.02.patch
>
>
> The {{copySelected}} method in {{VectorUDFDateDiffColCol}} class was missed 
> during HIVE-18622



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19493) VectorUDFDateDiffColCol copySelected does not handle nulls correctly

2018-06-01 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-19493:

Attachment: (was: HIVE-19493.03.patch)

> VectorUDFDateDiffColCol copySelected does not handle nulls correctly
> 
>
> Key: HIVE-19493
> URL: https://issues.apache.org/jira/browse/HIVE-19493
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19493.01.patch, HIVE-19493.02.patch
>
>
> The {{copySelected}} method in {{VectorUDFDateDiffColCol}} class was missed 
> during HIVE-18622



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19529) Vectorization: Date/Timestamp NULL issues

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498863#comment-16498863
 ] 

Hive QA commented on HIVE-19529:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-11429/patches/PreCommit-HIVE-Build-11429.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11429/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Vectorization: Date/Timestamp NULL issues
> -
>
> Key: HIVE-19529
> URL: https://issues.apache.org/jira/browse/HIVE-19529
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: HIVE-19529.06-branch-3.patch, HIVE-19529.06.patch
>
>
> Wrong results found for:
>  date_add/date_sub
> UT areas:
>  date_add/date_sub
> datediff
> to_date
> interval_year_month + interval_year_month
>  interval_day_time + interval_day_time
>  interval_day_time + timestamp
>  timestamp + interval_day_time
>  date + interval_day_time
>  interval_day_time + date
>  interval_year_month + date
>  date + interval_year_month
>  interval_year_month + interval_year_month
>  timestamp + interval_year_month
> date - date
>  interval_year_month - interval_year_month
>  interval_day_time - interval_day_time
>  timestamp - interval_day_time
>  timestamp - timestamp
>  date - timestamp
>  timestamp - date
>  date - interval_day_time
>  date - interval_year_month
>  timestamp - interval_year_month



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19738) Update committer-list

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498859#comment-16498859
 ] 

Hive QA commented on HIVE-19738:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925689/HIVE-19738.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11428/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11428/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11428/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-06-02 04:23:17.915
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-11428/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-06-02 04:23:17.918
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 4463c2b HIVE-19432 : GetTablesOperation is too slow if the hive 
has too many databases and tables (Rajkumar Singh via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 4463c2b HIVE-19432 : GetTablesOperation is too slow if the hive 
has too many databases and tables (Rajkumar Singh via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-06-02 04:23:19.210
+ rm -rf ../yetus_PreCommit-HIVE-Build-11428
+ mkdir ../yetus_PreCommit-HIVE-Build-11428
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-11428
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-11428/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: content/people.mdtext: does not exist in index
error: people.mdtext: does not exist in index
fatal: unable to find filename in patch at line 3
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-11428
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925689 - PreCommit-HIVE-Build

> Update committer-list
> -
>
> Key: HIVE-19738
> URL: https://issues.apache.org/jira/browse/HIVE-19738
> Project: Hive
>  Issue Type: Task
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Trivial
> Attachments: HIVE-19738.patch
>
>
> Adding new entry to committer-list:
> {noformat}
> +
> +tchoi 
> +Teddy Choi 
> + href="http://hortonworks.com/";>Hortonworks 
> +
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19727) Fix Signature matching of table aliases

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498858#comment-16498858
 ] 

Hive QA commented on HIVE-19727:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925867/HIVE-19727.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 14443 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union11] 
(batchId=139)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union15] 
(batchId=149)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union2] 
(batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union5] 
(batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union9] 
(batchId=127)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_ppr] 
(batchId=116)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11427/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11427/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11427/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925867 - PreCommit-HIVE-Build

> Fix Signature matching of table aliases
> ---
>
> Key: HIVE-19727
> URL: https://issues.apache.org/jira/browse/HIVE-19727
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-19727.01.patch, HIVE-19727.02.patch, 
> HIVE-19727.03.patch
>
>
> there is a probable problem with alias matching: "t1 as a" is matched to "t2 
> as a" 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19727) Fix Signature matching of table aliases

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498840#comment-16498840
 ] 

Hive QA commented on HIVE-19727:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
36s{color} | {color:blue} ql in master has 2278 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} ql: The patch generated 0 new + 10 unchanged - 7 
fixed = 10 total (was 17) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-11427/dev-support/hive-personality.sh
 |
| git revision | master / 4463c2b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11427/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Fix Signature matching of table aliases
> ---
>
> Key: HIVE-19727
> URL: https://issues.apache.org/jira/browse/HIVE-19727
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-19727.01.patch, HIVE-19727.02.patch, 
> HIVE-19727.03.patch
>
>
> there is a probable problem with alias matching: "t1 as a" is matched to "t2 
> as a" 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19685) OpenTracing support for HMS

2018-06-01 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498832#comment-16498832
 ] 

Prasanth Jayachandran commented on HIVE-19685:
--

looks like the patch needs rebase

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-19685.patch, hive-19685.patch, trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19597) TestWorkloadManager sometimes hangs

2018-06-01 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498831#comment-16498831
 ] 

Prasanth Jayachandran commented on HIVE-19597:
--

+1

> TestWorkloadManager sometimes hangs
> ---
>
> Key: HIVE-19597
> URL: https://issues.apache.org/jira/browse/HIVE-19597
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19597.patch
>
>
> Seems like the tests randomly get stuck after the lines like
> {noformat}
> 2018-05-17T01:54:27,111  INFO [Workload management master] 
> tez.WorkloadManager: Processing current events
> 2018-05-17T01:54:27,603  INFO [TriggerValidator] 
> tez.PerPoolTriggerValidatorRunnable: Creating trigger validator for pool: llap
> 2018-05-17T01:54:37,090 DEBUG [Thread-28] conf.HiveConf: Found metastore URI 
> of null
> {noformat}
> Then they get killed by timeout. Happened in the same manner, to random tests 
> in a few separate runs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19663) refactor LLAP IO report generation

2018-06-01 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498829#comment-16498829
 ] 

Prasanth Jayachandran commented on HIVE-19663:
--

+1

> refactor LLAP IO report generation
> --
>
> Key: HIVE-19663
> URL: https://issues.apache.org/jira/browse/HIVE-19663
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19663.patch
>
>
> Follow-up from HIVE-19642.
> Instead of each component calling some other component in a chain, all the 
> parts of the state dump should be called in one place to avoid weird 
> dependencies/sequences that need to be accounted for to generate the report.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19525) Spark task logs print PLAN PATH excessive number of times

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498830#comment-16498830
 ] 

Hive QA commented on HIVE-19525:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925680/HIVE-19525.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14443 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11426/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11426/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11426/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925680 - PreCommit-HIVE-Build

> Spark task logs print PLAN PATH excessive number of times
> -
>
> Key: HIVE-19525
> URL: https://issues.apache.org/jira/browse/HIVE-19525
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19525.1.patch
>
>
> A ton of logs with this {{Utilities - PLAN PATH = 
> hdfs://localhost:59527/.../apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/6ebceb49-7a76-4159-9082-5bba44391e30/hive_2018-05-14_07-28-44_672_8205774950452575544-1/-mr-10006/bf14c0b5-a014-4ee8-8ddf-fdb7453eb0f0/map.xml}}
> Seems it print multiple times per task exception, not sure where it is coming 
> from, but its too verbose. It should be changed to DEBUG level. Furthermore, 
> given that we are using {{Utilities#getBaseWork}} anytime we need to access a 
> {{MapWork}} or {{ReduceWork}} object, we should make the method slightly more 
> efficient. Right now it borrows a {{Kryo}} from a pool and does a bunch of 
> stuff to set the classloader, then it checks the cache to see if the work 
> object has already been created. It should check the cache before doing any 
> of that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19525) Spark task logs print PLAN PATH excessive number of times

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498803#comment-16498803
 ] 

Hive QA commented on HIVE-19525:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
29s{color} | {color:blue} ql in master has 2278 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
34s{color} | {color:red} ql: The patch generated 2 new + 119 unchanged - 2 
fixed = 121 total (was 121) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-11426/dev-support/hive-personality.sh
 |
| git revision | master / 4463c2b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11426/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11426/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Spark task logs print PLAN PATH excessive number of times
> -
>
> Key: HIVE-19525
> URL: https://issues.apache.org/jira/browse/HIVE-19525
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19525.1.patch
>
>
> A ton of logs with this {{Utilities - PLAN PATH = 
> hdfs://localhost:59527/.../apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/6ebceb49-7a76-4159-9082-5bba44391e30/hive_2018-05-14_07-28-44_672_8205774950452575544-1/-mr-10006/bf14c0b5-a014-4ee8-8ddf-fdb7453eb0f0/map.xml}}
> Seems it print multiple times per task exception, not sure where it is coming 
> from, but its too verbose. It should be changed to DEBUG level. Furthermore, 
> given that we are using {{Utilities#getBaseWork}} anytime we need to access a 
> {{MapWork}} or {{ReduceWork}} object, we should make the method slightly more 
> efficient. Right now it borrows a {{Kryo}} from a pool and does a bunch of 
> stuff to set the classloader, then it checks the cache to see if the work 
> object has already been created. It should check the cache before doing any 
> of that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-18875) Enable SMB Join by default in Tez

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498798#comment-16498798
 ] 

Hive QA commented on HIVE-18875:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12926011/HIVE-18875.11.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14443 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11425/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11425/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11425/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12926011 - PreCommit-HIVE-Build

> Enable SMB Join by default in Tez
> -
>
> Key: HIVE-18875
> URL: https://issues.apache.org/jira/browse/HIVE-18875
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18875.1.patch, HIVE-18875.10.patch, 
> HIVE-18875.11.patch, HIVE-18875.2.patch, HIVE-18875.3.patch, 
> HIVE-18875.4.patch, HIVE-18875.5.patch, HIVE-18875.6.patch, 
> HIVE-18875.7.patch, HIVE-18875.8.patch, HIVE-18875.9.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19690) multi-insert query with multiple GBY, and distinct in only some branches can produce incorrect results

2018-06-01 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498784#comment-16498784
 ] 

Sergey Shelukhin commented on HIVE-19690:
-

Updating test outputs. I will check input31, bit it appears to be flaky. I see 
if failed ~10 builds ago for an unrelated patch with the same result change.

> multi-insert query with multiple GBY, and distinct in only some branches can 
> produce incorrect results
> --
>
> Key: HIVE-19690
> URL: https://issues.apache.org/jira/browse/HIVE-19690
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19690.01.patch, HIVE-19690.02.patch, 
> HIVE-19690.03.patch, HIVE-19690.04.patch, HIVE-19690.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19690) multi-insert query with multiple GBY, and distinct in only some branches can produce incorrect results

2018-06-01 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19690:

Attachment: HIVE-19690.04.patch

> multi-insert query with multiple GBY, and distinct in only some branches can 
> produce incorrect results
> --
>
> Key: HIVE-19690
> URL: https://issues.apache.org/jira/browse/HIVE-19690
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19690.01.patch, HIVE-19690.02.patch, 
> HIVE-19690.03.patch, HIVE-19690.04.patch, HIVE-19690.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-18875) Enable SMB Join by default in Tez

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498778#comment-16498778
 ] 

Hive QA commented on HIVE-18875:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
29s{color} | {color:blue} common in master has 62 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
35s{color} | {color:blue} ql in master has 2278 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
35s{color} | {color:red} ql: The patch generated 2 new + 75 unchanged - 1 fixed 
= 77 total (was 76) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-11425/dev-support/hive-personality.sh
 |
| git revision | master / 4463c2b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11425/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11425/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Enable SMB Join by default in Tez
> -
>
> Key: HIVE-18875
> URL: https://issues.apache.org/jira/browse/HIVE-18875
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18875.1.patch, HIVE-18875.10.patch, 
> HIVE-18875.11.patch, HIVE-18875.2.patch, HIVE-18875.3.patch, 
> HIVE-18875.4.patch, HIVE-18875.5.patch, HIVE-18875.6.patch, 
> HIVE-18875.7.patch, HIVE-18875.8.patch, HIVE-18875.9.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19773) CBO exception while running queries with tables that are not present in materialized views

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498763#comment-16498763
 ] 

Jesus Camacho Rodriguez commented on HIVE-19773:


Cc [~ashutoshc]

> CBO exception while running queries with tables that are not present in 
> materialized views
> --
>
> Key: HIVE-19773
> URL: https://issues.apache.org/jira/browse/HIVE-19773
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-19773.patch
>
>
> When we obtain the valid list of write ids, some tables in the materialized 
> views may not be present in the list because they are not present in the 
> query, which leads to exceptions (hidden in logs) when we try to load the 
> materialized views in the planner, as we need to verify whether they are 
> outdated or not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19773) CBO exception while running queries with tables that are not present in materialized views

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19773:
---
Attachment: HIVE-19773.patch

> CBO exception while running queries with tables that are not present in 
> materialized views
> --
>
> Key: HIVE-19773
> URL: https://issues.apache.org/jira/browse/HIVE-19773
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-19773.patch
>
>
> When we obtain the valid list of write ids, some tables in the materialized 
> views may not be present in the list because they are not present in the 
> query, which leads to exceptions (hidden in logs) when we try to load the 
> materialized views in the planner, as we need to verify whether they are 
> outdated or not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-19773) CBO exception while running queries with tables that are not present in materialized views

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-19773:
--


> CBO exception while running queries with tables that are not present in 
> materialized views
> --
>
> Key: HIVE-19773
> URL: https://issues.apache.org/jira/browse/HIVE-19773
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>
> When we obtain the valid list of write ids, some tables in the materialized 
> views may not be present in the list because they are not present in the 
> query, which leads to exceptions (hidden in logs) when we try to load the 
> materialized views in the planner, as we need to verify whether they are 
> outdated or not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19773) CBO exception while running queries with tables that are not present in materialized views

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19773:
---
Status: Patch Available  (was: Open)

> CBO exception while running queries with tables that are not present in 
> materialized views
> --
>
> Key: HIVE-19773
> URL: https://issues.apache.org/jira/browse/HIVE-19773
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>
> When we obtain the valid list of write ids, some tables in the materialized 
> views may not be present in the list because they are not present in the 
> query, which leads to exceptions (hidden in logs) when we try to load the 
> materialized views in the planner, as we need to verify whether they are 
> outdated or not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19773) CBO exception while running queries with tables that are not present in materialized views

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19773:
---
Reporter: Aswathy Chellammal Sreekumar  (was: Jesus Camacho Rodriguez)

> CBO exception while running queries with tables that are not present in 
> materialized views
> --
>
> Key: HIVE-19773
> URL: https://issues.apache.org/jira/browse/HIVE-19773
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>
> When we obtain the valid list of write ids, some tables in the materialized 
> views may not be present in the list because they are not present in the 
> query, which leads to exceptions (hidden in logs) when we try to load the 
> materialized views in the planner, as we need to verify whether they are 
> outdated or not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19690) multi-insert query with multiple GBY, and distinct in only some branches can produce incorrect results

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498748#comment-16498748
 ] 

Hive QA commented on HIVE-19690:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925981/HIVE-19690.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11424/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11424/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11424/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12925981/HIVE-19690.03.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925981 - PreCommit-HIVE-Build

> multi-insert query with multiple GBY, and distinct in only some branches can 
> produce incorrect results
> --
>
> Key: HIVE-19690
> URL: https://issues.apache.org/jira/browse/HIVE-19690
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19690.01.patch, HIVE-19690.02.patch, 
> HIVE-19690.03.patch, HIVE-19690.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19690) multi-insert query with multiple GBY, and distinct in only some branches can produce incorrect results

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498747#comment-16498747
 ] 

Hive QA commented on HIVE-19690:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925981/HIVE-19690.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 1 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input31] (batchId=63)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert_gby3] 
(batchId=143)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData[4]
 (batchId=198)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11423/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11423/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11423/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925981 - PreCommit-HIVE-Build

> multi-insert query with multiple GBY, and distinct in only some branches can 
> produce incorrect results
> --
>
> Key: HIVE-19690
> URL: https://issues.apache.org/jira/browse/HIVE-19690
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19690.01.patch, HIVE-19690.02.patch, 
> HIVE-19690.03.patch, HIVE-19690.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19772) Streaming ingest V2 API can generate invalid orc file if interrupted

2018-06-01 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498740#comment-16498740
 ] 

Prasanth Jayachandran commented on HIVE-19772:
--

cc/ [~mattyb149]

> Streaming ingest V2 API can generate invalid orc file if interrupted
> 
>
> Key: HIVE-19772
> URL: https://issues.apache.org/jira/browse/HIVE-19772
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0, 3.0.1, 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-19772.1.patch
>
>
> Hive streaming ingest generated 0 length and 3 byte files which are invalid 
> orc files. This will throw the following exception during compaction
> {code}
> Error: org.apache.orc.FileFormatException: Not a valid ORC file 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/culvert/year=2018/month=7/delta_025_025/bucket_5
>  (maxFileLength= 3) at 
> org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:546) at 
> org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:60) at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:90) at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:1124)
>  at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:2373)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:1000)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:977)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-18990) Hive doesn't close Tez session properly

2018-06-01 Thread Jason Dere (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498729#comment-16498729
 ] 

Jason Dere commented on HIVE-18990:
---

Tried running Hive locally and ran into this error when I tried to run an 
insert statement, possible it might be related to this change:

{noformat}
java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.getSession(TezSessionState.java:711)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:646)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:353)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:467)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getUnmanagedSession(WorkloadManagerFederation.java:66)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getSession(WorkloadManagerFederation.java:38)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:184) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2497) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2149) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1826) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1569) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1563) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) 
~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) 
~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) 
~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) 
~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) 
~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) 
~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_121]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_121]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_121]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121]
at org.apache.hadoop.util.RunJar.run(RunJar.java:308) 
~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
at org.apache.hadoop.util.RunJar.main(RunJar.java:222) 
~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?]
{noformat}

> Hive doesn't close Tez session properly
> ---
>
> Key: HIVE-18990
> URL: https://issues.apache.org/jira/browse/HIVE-18990
> Project: Hive
>  Issue Type: Bug
>Reporter: Igor Kryvenko
>Assignee: Igor Kryvenko
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18990.01.patch, HIVE-18990.02.patch, 
> HIVE-18990.03.patch
>
>
> Hive doesn't close Tez session properly if AM isn't ready for accepting DAG.
> *STR*
> This can be easily reproduced using the following steps:
> *1) configure cluster on Tez;*
> *2) create file test.hql*
> cat ~/test.hql
> show databases;
> *3) run the job*
> $ hive --hiveconf hive.root.logger=DEBUG,console --hiveconf 
> hive.execution.engine=tez -f ~/test.hql
> If we login into Yarn UI,  we will see that jobs status is failed even it 
> finished successfully.
> It happens because hive creates tez session by defa

[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-01 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Summary: Over 30% of the heap wasted by duplicate 
org.antlr.runtime.CommonToken's and duplicate strings  (was: 11.8% of the heap 
wasted due to duplicate org.antlr.runtime.CommonToken's)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19668) 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's

2018-06-01 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Description: 
I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
spike during compilation of some big query. The analysis was done with jxray 
([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
the 20G heap was used by data structures associated with query parsing 
({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
opportunities for optimizations here. One of them is to stop the code from 
creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See a 
sample of these objects in the attached image:

!image-2018-05-22-17-41-39-572.png|width=879,height=399!

Looks like these particular {{CommonToken}} objects are constants, that don't 
change once created. I see some code, e.g. in 
{{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
apparently repeatedly created with e.g. {{new 
CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds are 
instead created once and reused, we will save more than 1/10th of the heap in 
this scenario. Plus, since these objects are small but very numerous, getting 
rid of them will remove a gread deal of pressure from the GC.

Another source of waste are duplicate strings, that collectively waste 26.1% of 
memory. Some of them come from CommonToken objects that have the same text 
(i.e. for multiple CommonToken objects the contents of their 'text' Strings are 
the same, but each has its own copy of that String). Other duplicate strings 
come from other sources, that are easy enough to fix by adding String.intern() 
calls.

  was:
I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
spike during compilation of some big query. The analysis was done with jxray 
([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
the 20G heap was used by data structures associated with query parsing 
({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
opportunities for optimizations here. One of them is to stop the code from 
creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See a 
sample of these objects in the attached image:

!image-2018-05-22-17-41-39-572.png|width=879,height=399!

Looks like these particular {{CommonToken}} objects are constants, that don't 
change once created. I see some code, e.g. in 
{{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
apparently repeatedly created with e.g. {{new 
CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds are 
instead created once and reused, we will save more than 1/10th of the heap in 
this scenario. Plus, since these objects are small but very numerous, getting 
rid of them will remove a gread deal of pressure from the GC.


> 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's
> -
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects

[jira] [Commented] (HIVE-19690) multi-insert query with multiple GBY, and distinct in only some branches can produce incorrect results

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498713#comment-16498713
 ] 

Hive QA commented on HIVE-19690:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
37s{color} | {color:blue} ql in master has 2278 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 15 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-11423/dev-support/hive-personality.sh
 |
| git revision | master / 4463c2b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11423/yetus/whitespace-eol.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11423/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> multi-insert query with multiple GBY, and distinct in only some branches can 
> produce incorrect results
> --
>
> Key: HIVE-19690
> URL: https://issues.apache.org/jira/browse/HIVE-19690
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19690.01.patch, HIVE-19690.02.patch, 
> HIVE-19690.03.patch, HIVE-19690.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19772) Streaming ingest V2 API can generate invalid orc file if interrupted

2018-06-01 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498697#comment-16498697
 ] 

Prasanth Jayachandran commented on HIVE-19772:
--

[~gopalv]/[~ekoifman] could someone please review?

> Streaming ingest V2 API can generate invalid orc file if interrupted
> 
>
> Key: HIVE-19772
> URL: https://issues.apache.org/jira/browse/HIVE-19772
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0, 3.0.1, 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-19772.1.patch
>
>
> Hive streaming ingest generated 0 length and 3 byte files which are invalid 
> orc files. This will throw the following exception during compaction
> {code}
> Error: org.apache.orc.FileFormatException: Not a valid ORC file 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/culvert/year=2018/month=7/delta_025_025/bucket_5
>  (maxFileLength= 3) at 
> org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:546) at 
> org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:60) at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:90) at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:1124)
>  at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:2373)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:1000)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:977)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19772) Streaming ingest V2 API can generate invalid orc file if interrupted

2018-06-01 Thread Prasanth Jayachandran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-19772:
-
Status: Patch Available  (was: Open)

> Streaming ingest V2 API can generate invalid orc file if interrupted
> 
>
> Key: HIVE-19772
> URL: https://issues.apache.org/jira/browse/HIVE-19772
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0, 3.0.1, 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-19772.1.patch
>
>
> Hive streaming ingest generated 0 length and 3 byte files which are invalid 
> orc files. This will throw the following exception during compaction
> {code}
> Error: org.apache.orc.FileFormatException: Not a valid ORC file 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/culvert/year=2018/month=7/delta_025_025/bucket_5
>  (maxFileLength= 3) at 
> org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:546) at 
> org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:60) at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:90) at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:1124)
>  at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:2373)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:1000)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:977)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19772) Streaming ingest V2 API can generate invalid orc file if interrupted

2018-06-01 Thread Prasanth Jayachandran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-19772:
-
Attachment: HIVE-19772.1.patch

> Streaming ingest V2 API can generate invalid orc file if interrupted
> 
>
> Key: HIVE-19772
> URL: https://issues.apache.org/jira/browse/HIVE-19772
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0, 3.0.1, 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-19772.1.patch
>
>
> Hive streaming ingest generated 0 length and 3 byte files which are invalid 
> orc files. This will throw the following exception during compaction
> {code}
> Error: org.apache.orc.FileFormatException: Not a valid ORC file 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/culvert/year=2018/month=7/delta_025_025/bucket_5
>  (maxFileLength= 3) at 
> org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:546) at 
> org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:60) at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:90) at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:1124)
>  at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:2373)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:1000)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:977)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-19772) Streaming ingest V2 API can generate invalid orc file if interrupted

2018-06-01 Thread Prasanth Jayachandran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-19772:



> Streaming ingest V2 API can generate invalid orc file if interrupted
> 
>
> Key: HIVE-19772
> URL: https://issues.apache.org/jira/browse/HIVE-19772
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0, 3.0.1, 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Critical
>
> Hive streaming ingest generated 0 length and 3 byte files which are invalid 
> orc files. This will throw the following exception during compaction
> {code}
> Error: org.apache.orc.FileFormatException: Not a valid ORC file 
> hdfs://cn105-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/culvert/year=2018/month=7/delta_025_025/bucket_5
>  (maxFileLength= 3) at 
> org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:546) at 
> org.apache.orc.impl.ReaderImpl.(ReaderImpl.java:370) at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:60) at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:90) at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:1124)
>  at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:2373)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:1000)
>  at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:977)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:344) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19323) Create metastore SQL install and upgrade scripts for 3.1

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498689#comment-16498689
 ] 

Hive QA commented on HIVE-19323:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925658/HIVE-19323.6-branch-3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11422/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11422/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11422/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12925658/HIVE-19323.6-branch-3.patch
 was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925658 - PreCommit-HIVE-Build

> Create metastore SQL install and upgrade scripts for 3.1
> 
>
> Key: HIVE-19323
> URL: https://issues.apache.org/jira/browse/HIVE-19323
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.1.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19323.2.patch, HIVE-19323.3.patch, 
> HIVE-19323.4.patch, HIVE-19323.5.patch, HIVE-19323.6-branch-3.patch, 
> HIVE-19323.6.patch, HIVE-19323.branch-3.1.patch, HIVE-19323.patch
>
>
> Now that we've branched for 3.0 we need to create SQL install and upgrade 
> scripts for 3.1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19378) "hive.lock.numretries" Is Misleading

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498688#comment-16498688
 ] 

Hive QA commented on HIVE-19378:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925997/HIVE-19378.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14443 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_into1] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_into2] 
(batchId=98)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_into3] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insert_into4] 
(batchId=96)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11421/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11421/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11421/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925997 - PreCommit-HIVE-Build

> "hive.lock.numretries" Is Misleading
> 
>
> Key: HIVE-19378
> URL: https://issues.apache.org/jira/browse/HIVE-19378
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19378.1.patch
>
>
> Configuration 'hive.lock.numretries' is confusing.  It's not actually a 
> 'retry' count, it's the total number of attempt to try:
>  
> {code:java|title=ZooKeeperHiveLockManager.java}
> do {
>   lastException = null;
>   tryNum++;
>   try {
> if (tryNum > 1) {
>   Thread.sleep(sleepTime);
>   prepareRetry();
> }
> ret = lockPrimitive(key, mode, keepAlive, parentCreated, 
> conflictingLocks);
> ...
> } while (tryNum < numRetriesForLock);
> {code}
> So, from this code you can see that on the first loop, {{tryNum}} is set to 
> 1, in which case, if the configuration num*retries* is set to 1, there will 
> be one attempt total.  With a *retry* value of 1, I would assume one initial 
> attempt and one additional retry.  Please change to:
> {code}
> while (tryNum <= numRetriesForLock);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498687#comment-16498687
 ] 

Jesus Camacho Rodriguez commented on HIVE-19771:


Cc [~ashutoshc] [~vgarg]

> allowNullColumnForMissingStats should not be false when column stats are 
> estimated
> --
>
> Key: HIVE-19771
> URL: https://issues.apache.org/jira/browse/HIVE-19771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-19771.patch
>
>
> Otherwise we may throw an Exception.
> {noformat}
> 2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
> stats.StatsUtils (:()) - Estimated average row size: 372
> 2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
> calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
> stored in cache
> 2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
> calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
> numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
> -9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
> isEstimated: true
> 2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
> calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
> stored in cache
> 2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
> calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
> countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
> isPrimaryKey: false isEstimated: true
> 2018-05-26T00:30:22,352 ERROR [HiveServer2-Background-Pool: Thread-631]: 
> calcite.RelOptHiveTable (:()) - No Stats for default@basetable_rebuild, 
> Columns: a, b
> java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: 
> a, b
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at 
> GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown 
> Source) ~[?:?]
> at 
> GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
>  ~[calcite-core-1.16.0.jar:1.16.0]
> at 
> org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
>  ~[calcite-core-1.16.0.jar:1.16.0]
> at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
> ~[?:?]
> at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) 
> ~[?:?]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
>  ~[calcite-core-1.16.0.jar:1.16.0]
> at 
> org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
>  ~[calcite-core-1.16.0.jar:1.16.0]
> at 
> org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
> ~[calcite-core-1.16.0.jar:1.16.0]
> at 
> org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
> ~[calcite-core-1.16.0.jar:1.16.0]
> at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
> ~[calcite-core-1.16.0.jar:1.16.0]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
>  ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
> ~[calcite-core-1.16.0.jar:1.16.0]
> at 
> org.apach

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
stats.StatsUtils (:()) - Estimated average row size: 372
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
-9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
isEstimated: true
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
isPrimaryKey: false isEstimated: true
2018-05-26T00:30:22,352 ERROR [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - No Stats for default@basetable_rebuild, 
Columns: a, b
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.16.0.jar:1.16.0]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.16.0.jar:1.16.0]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.16.0.jar:1.16.0]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.16.0.jar:1.16.0]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.16.0.jar:1.16.0]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1418)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genLogicalPlan(CalcitePlanner.java:369)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.parseQuery(HiveMaterializedViewsRegistry.java:416)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
stats.StatsUtils (:()) - Estimated average row size: 372
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
-9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
isEstimated: true
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
isPrimaryKey: false isEstimated: true
2018-05-26T00:30:22,352 ERROR [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - No Stats for default@basetable_rebuild, 
Columns: a, b
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.16.0..jar:1.16.0.]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.16.0..jar:1.16.0.]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.16.0..jar:1.16.0.]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.16.0..jar:1.16.0.]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.16.0..jar:1.16.0.]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.16.0..jar:1.16.0.]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.16.0..jar:1.16.0.]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.16.0..jar:1.16.0.]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.16.0..jar:1.16.0.]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.16.0..jar:1.16.0.]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.16.0..jar:1.16.0.]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1418)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genLogicalPlan(CalcitePlanner.java:369)
 ~[hive-exec-3.0.0..jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.parseQuery(HiveMaterializedViewsRegistry.java:416)
 ~[hive-exec-3.

[jira] [Resolved] (HIVE-19755) insertsel_fail.q.out needs to be updated on branch-3

2018-06-01 Thread Vineet Garg (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg resolved HIVE-19755.

   Resolution: Fixed
Fix Version/s: 3.1.0

> insertsel_fail.q.out needs to be updated on branch-3
> 
>
> Key: HIVE-19755
> URL: https://issues.apache.org/jira/browse/HIVE-19755
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19755.1-branch-3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
stats.StatsUtils (:()) - Estimated average row size: 372
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
-9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
isEstimated: true
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
isPrimaryKey: false isEstimated: true
2018-05-26T00:30:22,352 ERROR [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - No Stats for default@basetable_rebuild, 
Columns: a, b
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.16.0.jar:1.16.0.]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.16.0.jar:1.16.0.]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.16.0.jar:1.16.0.]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.16.0.jar:1.16.0.]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.16.0.jar:1.16.0.]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.16.0.jar:1.16.0.]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.16.0.jar:1.16.0.]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.16.0.jar:1.16.0.]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.16.0.jar:1.16.0.]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.16.0.jar:1.16.0.]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.16.0.jar:1.16.0.]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1418)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genLogicalPlan(CalcitePlanner.java:369)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.parseQuery(HiveMaterializedViewsRegistry.java:416)
 ~[hive-exec-3.0.0.jar:3.0.0-SNAPSHO

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
stats.StatsUtils (:()) - Estimated average row size: 372
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
-9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
isEstimated: true
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
isPrimaryKey: false isEstimated: true
2018-05-26T00:30:22,352 ERROR [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - No Stats for default@basetable_rebuild, 
Columns: a, b
...
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlan

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
stats.StatsUtils (:()) - Estimated average row size: 372
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
-9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
isEstimated: true
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
isPrimaryKey: false isEstimated: true
2018-05-26T00:30:22,352 ERROR [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - No Stats for default@basetable_rebuild, 
Columns: a, b
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
stats.StatsUtils (:()) - Estimated average row size: 372
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
-9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
isEstimated: true
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
isPrimaryKey: false isEstimated: true
...
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1418)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genLogic

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
2018-05-26T00:30:22,335 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
stats.StatsUtils (:()) - Estimated average row size: 372
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column a in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: a colType: int countDistincts: 4 
numNulls: 1 avgColLen: 4.0 numTrues: 0 numFalses: 0 Range: [ min: 
-9223372036854775808 max: 9223372036854775807 ] isPrimaryKey: false 
isEstimated: true
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - Stats for column b in table basetable_rebuild 
stored in cache
2018-05-26T00:30:22,352 DEBUG [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) -  colName: b colType: varchar(256) 
countDistincts: 4 numNulls: 1 avgColLen: 256.0 numTrues: 0 numFalses: 0 
isPrimaryKey: false isEstimated: true
2018-05-26T00:30:22,352 ERROR [HiveServer2-Background-Pool: Thread-631]: 
calcite.RelOptHiveTable (:()) - No Stats for default@basetable_rebuild, 
Columns: a, b
...
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlan

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Description: 
Otherwise we may throw an Exception.

{noformat}
java.lang.RuntimeException: No Stats for default@basetable_rebuild, Columns: a, 
b
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:586)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:606)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:592)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:155)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:78)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:65)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount_$(Unknown Source) 
~[?:?]
at 
GeneratedMetadataHandler_DistinctRowCount.getDistinctRowCount(Unknown Source) 
~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getDistinctRowCount(RelMetadataQuery.java:781)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:207)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) 
~[?:?]
at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:235)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:100)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1991) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1898)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
 ~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
~[calcite-core-1.14.0.3.0.0.0-1368.jar:1.14.0.3.0.0.0-1368]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1418)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genLogicalPlan(CalcitePlanner.java:369)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.parseQuery(HiveMaterializedViewsRegistry.java:416)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.addMaterializedView(HiveMaterializedViewsRegistry.java:225)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry.createMaterializedView(HiveMaterializedViewsRegistry.java:188)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.MaterializedViewTask.execute(MaterializedViewTask.java:61)
 ~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) 
~[hive-exec-3.0.0.3.0.0.0-1368.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
~[hive-exec-3.0

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Status: Patch Available  (was: In Progress)

> allowNullColumnForMissingStats should not be false when column stats are 
> estimated
> --
>
> Key: HIVE-19771
> URL: https://issues.apache.org/jira/browse/HIVE-19771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-19771.patch
>
>
> Otherwise we may throw an Exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19771:
---
Attachment: HIVE-19771.patch

> allowNullColumnForMissingStats should not be false when column stats are 
> estimated
> --
>
> Key: HIVE-19771
> URL: https://issues.apache.org/jira/browse/HIVE-19771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-19771.patch
>
>
> Otherwise we may throw an Exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-06-01 Thread Vineet Garg (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498684#comment-16498684
 ] 

Vineet Garg commented on HIVE-19770:


Attached is first patch which provide CBO support for queries with multiple 
same column in select. Following are the known changes/effects: 

* Change in row schema: queries with multiple same column in select now has 
slightly different row scheme .e.g select c, c .. will have c, c_1 instead of 
c, c. I think this is probably because we losses the information about 
duplicate columns once calcite plan is rewritten to AST. This info is displayed 
in post hook of qtest and as far as I can tell it is a safe change.
* One query has different (worse) plan (it now has cbo + vectorization) and new 
plan for some reason contains extra reducer. The root cause for this has yet to 
be determined but this should not introduce correctness issues.
* Bunch of queries are missing lineage information (displayed in posthook). 
This is due to constant folding happening in CBO. e.g. insert into  select 
a,b from t1 where b=1. column corresponding to B will have lineage info missing 
because reference to B is foled to 1. Not sure if this is acceptable/expected 
for lineage.


> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Work started] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-19771 started by Jesus Camacho Rodriguez.
--
> allowNullColumnForMissingStats should not be false when column stats are 
> estimated
> --
>
> Key: HIVE-19771
> URL: https://issues.apache.org/jira/browse/HIVE-19771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Otherwise we may throw an Exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-19771) allowNullColumnForMissingStats should not be false when column stats are estimated

2018-06-01 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-19771:
--


> allowNullColumnForMissingStats should not be false when column stats are 
> estimated
> --
>
> Key: HIVE-19771
> URL: https://issues.apache.org/jira/browse/HIVE-19771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Otherwise we may throw an Exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-06-01 Thread Vineet Garg (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19770:
---
Attachment: HIVE-19770.1.patch

> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-06-01 Thread Vineet Garg (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19770:
---
Status: Patch Available  (was: Open)

> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-06-01 Thread Vineet Garg (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-19770:
--


> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19378) "hive.lock.numretries" Is Misleading

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498647#comment-16498647
 ] 

Hive QA commented on HIVE-19378:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
36s{color} | {color:blue} ql in master has 2278 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-11421/dev-support/hive-personality.sh
 |
| git revision | master / 4463c2b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11421/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> "hive.lock.numretries" Is Misleading
> 
>
> Key: HIVE-19378
> URL: https://issues.apache.org/jira/browse/HIVE-19378
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19378.1.patch
>
>
> Configuration 'hive.lock.numretries' is confusing.  It's not actually a 
> 'retry' count, it's the total number of attempt to try:
>  
> {code:java|title=ZooKeeperHiveLockManager.java}
> do {
>   lastException = null;
>   tryNum++;
>   try {
> if (tryNum > 1) {
>   Thread.sleep(sleepTime);
>   prepareRetry();
> }
> ret = lockPrimitive(key, mode, keepAlive, parentCreated, 
> conflictingLocks);
> ...
> } while (tryNum < numRetriesForLock);
> {code}
> So, from this code you can see that on the first loop, {{tryNum}} is set to 
> 1, in which case, if the configuration num*retries* is set to 1, there will 
> be one attempt total.  With a *retry* value of 1, I would assume one initial 
> attempt and one additional retry.  Please change to:
> {code}
> while (tryNum <= numRetriesForLock);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19598) Add Acid V1 to V2 upgrade module

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498631#comment-16498631
 ] 

Hive QA commented on HIVE-19598:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12926034/HIVE-19598.01-branch-3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 14372 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidkafkamini_basic]
 (batchId=253)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] 
(batchId=95)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=304)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11420/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11420/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11420/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12926034 - PreCommit-HIVE-Build

> Add Acid V1 to V2 upgrade module
> 
>
> Key: HIVE-19598
> URL: https://issues.apache.org/jira/browse/HIVE-19598
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19598.01-branch-3.patch, HIVE-19598.02.patch, 
> HIVE-19598.05.patch, HIVE-19598.06.patch
>
>
> The on-disk layout for full acid (transactional) tables has changed 3.0.
> Any transactional table that has any update/delete events in any deltas that 
> have not been Major compacted, must go through a Major compaction before 
> upgrading to 3.0.  No more update/delete/merge should be run after/during 
> major compaction.
> Not doing so will result in data corruption/loss.
>  
> Need to create a utility tool to help with this process.  HIVE-19233 started 
> this but it needs more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19079) Add extended query string to Spark job description

2018-06-01 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498598#comment-16498598
 ] 

Aihua Xu commented on HIVE-19079:
-

The patch looks great. Minor comments. Since spark uses description rather than 
name, can we have hive.spark.jobdescription.length as the property name? And 
also, can we change DagUtils.getQueryName() changes to getQueryDescription()? 

> Add extended query string to Spark job description
> --
>
> Key: HIVE-19079
> URL: https://issues.apache.org/jira/browse/HIVE-19079
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19079.1.patch, HIVE-19079.2.patch, 
> HIVE-19079.3.patch, HIVE-19079.4.patch, Spark Collapse Truncated Query.png, 
> Spark Expanded Truncated Query.png
>
>
> As of HIVE-16601, we place a shortened version of the query into the Spark 
> job description. We should look into adding a longer version of the query. It 
> seems that the Spark Web UI has a nice feature where long job descriptions 
> will be truncated with a {{...}}, but when you double click on the {{...}} it 
> expands to show the rest of the string. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (HIVE-19723) Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"

2018-06-01 Thread Eric Wohlstadter (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498593#comment-16498593
 ] 

Eric Wohlstadter edited comment on HIVE-19723 at 6/1/18 9:20 PM:
-

[~teddy.choi]

Hive's Arrow serializer appears to truncate down to MILLISECONDS, but the Jira 
description calls for MICROSECONDS.

This is motivated by {{org.apache.spark.sql.execution.arrow.ArrowUtils.scala}}
{code:java}
case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => 
TimestampType{code}

My understanding is that since the primary use-case for {{ArrowUtils}} is 
Python integration, some of the conversions are currently somewhat particular 
for Python. Perhaps Python/Pandas only supports MICROSECOND timestamps. 

FYI: [~hyukjin.kwon] [~bryanc]




was (Author: ewohlstadter):
[~teddy.choi]

The Arrow serializer appears to truncate down to MILLISECONDS, but the Jira 
description calls for MICROSECONDS.

This is motivated by {{org.apache.spark.sql.execution.arrow.ArrowUtils.scala}}
{code:java}
case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => 
TimestampType{code}

My understanding is that since the primary use-case for {{ArrowUtils}} is 
Python integration, some of the conversions are currently somewhat particular 
for Python. Perhaps Python/Pandas only supports MICROSECOND timestamps. 

FYI: [~hyukjin.kwon] [~bryanc]



> Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
> -
>
> Key: HIVE-19723
> URL: https://issues.apache.org/jira/browse/HIVE-19723
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19723.1.patch, HIVE-19732.2.patch
>
>
> Spark's Arrow support only provides Timestamp at MICROSECOND granularity. 
> Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND.
> The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need 
> to change the assertion to test microsecond. And we'll need to add this to 
> documentation on supported datatypes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19723) Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"

2018-06-01 Thread Eric Wohlstadter (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498593#comment-16498593
 ] 

Eric Wohlstadter commented on HIVE-19723:
-

[~teddy.choi]

The Arrow serializer appears to truncate down to MILLISECONDS, but the Jira 
description calls for MICROSECONDS.

This is motivated by {{org.apache.spark.sql.execution.arrow.ArrowUtils.scala}}
{code:java}
case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => 
TimestampType{code}

My understanding is that since the primary use-case for {{ArrowUtils}} is 
Python integration, some of the conversions are currently somewhat particular 
for Python. Perhaps Python/Pandas only supports MICROSECOND timestamps. 

FYI: [~hyukjin.kwon] [~bryanc]



> Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
> -
>
> Key: HIVE-19723
> URL: https://issues.apache.org/jira/browse/HIVE-19723
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19723.1.patch, HIVE-19732.2.patch
>
>
> Spark's Arrow support only provides Timestamp at MICROSECOND granularity. 
> Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND.
> The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need 
> to change the assertion to test microsecond. And we'll need to add this to 
> documentation on supported datatypes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19418) add background stats updater similar to compactor

2018-06-01 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19418:

Attachment: HIVE-19418.06.patch

> add background stats updater similar to compactor
> -
>
> Key: HIVE-19418
> URL: https://issues.apache.org/jira/browse/HIVE-19418
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19418.01.patch, HIVE-19418.02.patch, 
> HIVE-19418.03.patch, HIVE-19418.04.patch, HIVE-19418.05.patch, 
> HIVE-19418.06.patch, HIVE-19418.06.patch, HIVE-19418.patch
>
>
> There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables 
> to make them usable in a transaction without breaking ACID (for metadata-only 
> optimization). However, stats for ACID tables can still become unusable if 
> e.g. two parallel inserts run - neither sees the data written by the other, 
> so after both finish, the snapshots on either set of stats won't match the 
> current snapshot and the stats will be unusable.
> Additionally, for ACID and non-ACID tables alike, a lot of the stats, with 
> some exceptions like numRows, cannot be aggregated (i.e. you cannot combine 
> ndvs from two inserts), and for ACID even less can be aggregated (you cannot 
> derive min/max if some rows are deleted but you don't scan the rest of the 
> dataset).
> Therefore we will add background logic to metastore (similar to, and 
> partially inside, the ACID compactor) to update stats.
> It will have 3 modes of operation.
> 1) Off.
> 2) Update only the stats that exist but are out of date (generating stats can 
> be expensive, so if the user is only analyzing a subset of tables it should 
> be able to only update that subset). We can simply look at existing stats and 
> only analyze for the relevant partitions and columns.
> 3) On: 2 + create stats for all tables and columns missing stats.
> There will also be a table parameter to skip stats update. 
> In phase 1, the process will operate outside of compactor, and run analyze 
> command on the table. The analyze command will automatically save the stats 
> with ACID snapshot information if needed, based on HIVE-19416, so we don't 
> need to do any special state management and this will work for all table 
> types. However it's also more expensive.
> In phase 2, we can explore adding stats collection during MM compaction that 
> uses a temp table. If we don't have open writers during major compaction (so 
> we overwrite all of the data), the temp table stats can simply be copied over 
> to the main table with correct snapshot information, saving us a table scan.
> In phase 3, we can add custom stats collection logic to full ACID compactor 
> that is not query based, the same way as we'd do for (2). Alternatively we 
> can wait for ACID compactor to become query based and just reuse (2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19418) add background stats updater similar to compactor

2018-06-01 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498571#comment-16498571
 ] 

Sergey Shelukhin commented on HIVE-19418:
-

Updated the test, extra null checks in the helper caused some exception types 
to change.

> add background stats updater similar to compactor
> -
>
> Key: HIVE-19418
> URL: https://issues.apache.org/jira/browse/HIVE-19418
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19418.01.patch, HIVE-19418.02.patch, 
> HIVE-19418.03.patch, HIVE-19418.04.patch, HIVE-19418.05.patch, 
> HIVE-19418.06.patch, HIVE-19418.patch
>
>
> There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables 
> to make them usable in a transaction without breaking ACID (for metadata-only 
> optimization). However, stats for ACID tables can still become unusable if 
> e.g. two parallel inserts run - neither sees the data written by the other, 
> so after both finish, the snapshots on either set of stats won't match the 
> current snapshot and the stats will be unusable.
> Additionally, for ACID and non-ACID tables alike, a lot of the stats, with 
> some exceptions like numRows, cannot be aggregated (i.e. you cannot combine 
> ndvs from two inserts), and for ACID even less can be aggregated (you cannot 
> derive min/max if some rows are deleted but you don't scan the rest of the 
> dataset).
> Therefore we will add background logic to metastore (similar to, and 
> partially inside, the ACID compactor) to update stats.
> It will have 3 modes of operation.
> 1) Off.
> 2) Update only the stats that exist but are out of date (generating stats can 
> be expensive, so if the user is only analyzing a subset of tables it should 
> be able to only update that subset). We can simply look at existing stats and 
> only analyze for the relevant partitions and columns.
> 3) On: 2 + create stats for all tables and columns missing stats.
> There will also be a table parameter to skip stats update. 
> In phase 1, the process will operate outside of compactor, and run analyze 
> command on the table. The analyze command will automatically save the stats 
> with ACID snapshot information if needed, based on HIVE-19416, so we don't 
> need to do any special state management and this will work for all table 
> types. However it's also more expensive.
> In phase 2, we can explore adding stats collection during MM compaction that 
> uses a temp table. If we don't have open writers during major compaction (so 
> we overwrite all of the data), the temp table stats can simply be copied over 
> to the main table with correct snapshot information, saving us a table scan.
> In phase 3, we can add custom stats collection logic to full ACID compactor 
> that is not query based, the same way as we'd do for (2). Alternatively we 
> can wait for ACID compactor to become query based and just reuse (2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19418) add background stats updater similar to compactor

2018-06-01 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19418:

Attachment: HIVE-19418.06.patch

> add background stats updater similar to compactor
> -
>
> Key: HIVE-19418
> URL: https://issues.apache.org/jira/browse/HIVE-19418
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19418.01.patch, HIVE-19418.02.patch, 
> HIVE-19418.03.patch, HIVE-19418.04.patch, HIVE-19418.05.patch, 
> HIVE-19418.06.patch, HIVE-19418.patch
>
>
> There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables 
> to make them usable in a transaction without breaking ACID (for metadata-only 
> optimization). However, stats for ACID tables can still become unusable if 
> e.g. two parallel inserts run - neither sees the data written by the other, 
> so after both finish, the snapshots on either set of stats won't match the 
> current snapshot and the stats will be unusable.
> Additionally, for ACID and non-ACID tables alike, a lot of the stats, with 
> some exceptions like numRows, cannot be aggregated (i.e. you cannot combine 
> ndvs from two inserts), and for ACID even less can be aggregated (you cannot 
> derive min/max if some rows are deleted but you don't scan the rest of the 
> dataset).
> Therefore we will add background logic to metastore (similar to, and 
> partially inside, the ACID compactor) to update stats.
> It will have 3 modes of operation.
> 1) Off.
> 2) Update only the stats that exist but are out of date (generating stats can 
> be expensive, so if the user is only analyzing a subset of tables it should 
> be able to only update that subset). We can simply look at existing stats and 
> only analyze for the relevant partitions and columns.
> 3) On: 2 + create stats for all tables and columns missing stats.
> There will also be a table parameter to skip stats update. 
> In phase 1, the process will operate outside of compactor, and run analyze 
> command on the table. The analyze command will automatically save the stats 
> with ACID snapshot information if needed, based on HIVE-19416, so we don't 
> need to do any special state management and this will work for all table 
> types. However it's also more expensive.
> In phase 2, we can explore adding stats collection during MM compaction that 
> uses a temp table. If we don't have open writers during major compaction (so 
> we overwrite all of the data), the temp table stats can simply be copied over 
> to the main table with correct snapshot information, saving us a table scan.
> In phase 3, we can add custom stats collection logic to full ACID compactor 
> that is not query based, the same way as we'd do for (2). Alternatively we 
> can wait for ACID compactor to become query based and just reuse (2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19529) Vectorization: Date/Timestamp NULL issues

2018-06-01 Thread Matt McCline (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498567#comment-16498567
 ] 

Matt McCline commented on HIVE-19529:
-

#11429 ?

> Vectorization: Date/Timestamp NULL issues
> -
>
> Key: HIVE-19529
> URL: https://issues.apache.org/jira/browse/HIVE-19529
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: HIVE-19529.06-branch-3.patch, HIVE-19529.06.patch
>
>
> Wrong results found for:
>  date_add/date_sub
> UT areas:
>  date_add/date_sub
> datediff
> to_date
> interval_year_month + interval_year_month
>  interval_day_time + interval_day_time
>  interval_day_time + timestamp
>  timestamp + interval_day_time
>  date + interval_day_time
>  interval_day_time + date
>  interval_year_month + date
>  date + interval_year_month
>  interval_year_month + interval_year_month
>  timestamp + interval_year_month
> date - date
>  interval_year_month - interval_year_month
>  interval_day_time - interval_day_time
>  timestamp - interval_day_time
>  timestamp - timestamp
>  date - timestamp
>  timestamp - date
>  date - interval_day_time
>  date - interval_year_month
>  timestamp - interval_year_month



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19598) Add Acid V1 to V2 upgrade module

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498566#comment-16498566
 ] 

Hive QA commented on HIVE-19598:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 19s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-11420/patches/PreCommit-HIVE-Build-11420.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11420/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Add Acid V1 to V2 upgrade module
> 
>
> Key: HIVE-19598
> URL: https://issues.apache.org/jira/browse/HIVE-19598
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19598.01-branch-3.patch, HIVE-19598.02.patch, 
> HIVE-19598.05.patch, HIVE-19598.06.patch
>
>
> The on-disk layout for full acid (transactional) tables has changed 3.0.
> Any transactional table that has any update/delete events in any deltas that 
> have not been Major compacted, must go through a Major compaction before 
> upgrading to 3.0.  No more update/delete/merge should be run after/during 
> major compaction.
> Not doing so will result in data corruption/loss.
>  
> Need to create a utility tool to help with this process.  HIVE-19233 started 
> this but it needs more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19323) Create metastore SQL install and upgrade scripts for 3.1

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498558#comment-16498558
 ] 

Hive QA commented on HIVE-19323:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925658/HIVE-19323.6-branch-3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 14370 tests 
executed
*Failed tests:*
{noformat}
TestUpgradeTool - did not produce a TEST-*.xml file (likely timed out) 
(batchId=309)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidkafkamini_basic]
 (batchId=253)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] 
(batchId=95)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testCancelRenewTokenFlow 
(batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testConnection 
(batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValid (batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testIsValidNeg 
(batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeProxyAuth 
(batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testNegativeTokenAuth 
(batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testProxyAuth 
(batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testRenewDelegationToken 
(batchId=254)
org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs.testTokenAuth 
(batchId=254)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=304)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11419/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11419/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11419/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925658 - PreCommit-HIVE-Build

> Create metastore SQL install and upgrade scripts for 3.1
> 
>
> Key: HIVE-19323
> URL: https://issues.apache.org/jira/browse/HIVE-19323
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.1.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19323.2.patch, HIVE-19323.3.patch, 
> HIVE-19323.4.patch, HIVE-19323.5.patch, HIVE-19323.6-branch-3.patch, 
> HIVE-19323.6.patch, HIVE-19323.branch-3.1.patch, HIVE-19323.patch
>
>
> Now that we've branched for 3.0 we need to create SQL install and upgrade 
> scripts for 3.1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables

2018-06-01 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19432:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, [~Rajkumar Singh]
Can you also please upload patch for branch-3 Current one doesn't apply cleanly.

> HIVE-7575: GetTablesOperation is too slow if the hive has too many databases 
> and tables
> ---
>
> Key: HIVE-19432
> URL: https://issues.apache.org/jira/browse/HIVE-19432
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, HiveServer2
>Affects Versions: 2.2.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19432.01.patch, HIVE-19432.01.patch, 
> HIVE-19432.patch
>
>
> GetTablesOperation is too slow since it does not check for the authorization 
> for databases and try pulling all the tables from all the databases using 
> getTableMeta. for operation like follows
> {code}
> con.getMetaData().getTables("", "", "%", new String[] \{ "TABLE", "VIEW" });
> {code}
> build the getTableMeta call with wildcard *
> {code}
>  metastore.HiveMetaStore: 8: get_table_metas : db=* tbl=*
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19598) Add Acid V1 to V2 upgrade module

2018-06-01 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19598:

Fix Version/s: (was: 4.0.0)
   3.1.0

> Add Acid V1 to V2 upgrade module
> 
>
> Key: HIVE-19598
> URL: https://issues.apache.org/jira/browse/HIVE-19598
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19598.01-branch-3.patch, HIVE-19598.02.patch, 
> HIVE-19598.05.patch, HIVE-19598.06.patch
>
>
> The on-disk layout for full acid (transactional) tables has changed 3.0.
> Any transactional table that has any update/delete events in any deltas that 
> have not been Major compacted, must go through a Major compaction before 
> upgrading to 3.0.  No more update/delete/merge should be run after/during 
> major compaction.
> Not doing so will result in data corruption/loss.
>  
> Need to create a utility tool to help with this process.  HIVE-19233 started 
> this but it needs more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19598) Add Acid V1 to V2 upgrade module

2018-06-01 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498521#comment-16498521
 ] 

Ashutosh Chauhan commented on HIVE-19598:
-

Pushed to branch-3 as well.

> Add Acid V1 to V2 upgrade module
> 
>
> Key: HIVE-19598
> URL: https://issues.apache.org/jira/browse/HIVE-19598
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19598.01-branch-3.patch, HIVE-19598.02.patch, 
> HIVE-19598.05.patch, HIVE-19598.06.patch
>
>
> The on-disk layout for full acid (transactional) tables has changed 3.0.
> Any transactional table that has any update/delete events in any deltas that 
> have not been Major compacted, must go through a Major compaction before 
> upgrading to 3.0.  No more update/delete/merge should be run after/during 
> major compaction.
> Not doing so will result in data corruption/loss.
>  
> Need to create a utility tool to help with this process.  HIVE-19233 started 
> this but it needs more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19720) backport multiple MM commits to branch-3

2018-06-01 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19720:

Attachment: HIVE-19720.04-branch-3.patch

> backport multiple MM commits to branch-3
> 
>
> Key: HIVE-19720
> URL: https://issues.apache.org/jira/browse/HIVE-19720
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19720.01-branch-3.patch, 
> HIVE-19720.02-branch-3.patch, HIVE-19720.03-branch-3.patch, 
> HIVE-19720.04-branch-3.patch
>
>
> To avoid chained test runs of branch-3 backporting one by one, I will run 
> HiveQA on an epic combined patch, then commit patches w/proper commit 
> separation via cherry-pick:
> 0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner)
> 99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
> 7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
> Shelukhin, reviewed by Eugene Koifman)
> 8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
> TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
> TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
> f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
> conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason 
> Dere)
> Need to add:
> 36d66f0cf27 HIVE-19643 : MM table conversion doesn't need full ACID structure 
> checks (Sergey Shelukhin, reviewed by Eugene Koifman)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-19769) Create dedicated objects for DB and Table names

2018-06-01 Thread Alan Gates (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates reassigned HIVE-19769:
-


> Create dedicated objects for DB and Table names
> ---
>
> Key: HIVE-19769
> URL: https://issues.apache.org/jira/browse/HIVE-19769
> Project: Hive
>  Issue Type: Sub-task
>  Components: storage-api
>Affects Versions: 3.0.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>
> Currently table names are always strings.  Sometimes that string is just 
> tablename, sometimes it is dbname.tablename.  Sometimes the code expects one 
> or the other, sometimes it handles either.  This is burdensome for developers 
> and error prone.  With the addition of catalog to the hierarchy, this becomes 
> even worse.
> I propose to add two objects, DatabaseName and TableName.  These will track 
> full names of each object.  They will handle inserting default catalog and 
> database names when those are not provided.  They will handle the conversions 
> to and from strings.
> These will need to be added to storage-api because ValidTxnList will use it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19768) Utility to convert tables to conform to Hive strict managed tables mode

2018-06-01 Thread Jason Dere (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498502#comment-16498502
 ] 

Jason Dere commented on HIVE-19768:
---

Attaching initial patch, which shares the same utility class from HIVE-19753.

> Utility to convert tables to conform to Hive strict managed tables mode
> ---
>
> Key: HIVE-19768
> URL: https://issues.apache.org/jira/browse/HIVE-19768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-19768.1.patch
>
>
> Create a utility that can check existing hive tables and convert them if 
> necessary to conform to strict managed tables mode.
> - Managed non-transactional ORC tables will be converted to full 
> transactional tables
> - Managed non-transactional tables of other types will be converted to 
> insert-only transactional tables
> - Tables with non-native storage/schema will be converted to external tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19768) Utility to convert tables to conform to Hive strict managed tables mode

2018-06-01 Thread Jason Dere (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-19768:
--
Attachment: HIVE-19768.1.patch

> Utility to convert tables to conform to Hive strict managed tables mode
> ---
>
> Key: HIVE-19768
> URL: https://issues.apache.org/jira/browse/HIVE-19768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-19768.1.patch
>
>
> Create a utility that can check existing hive tables and convert them if 
> necessary to conform to strict managed tables mode.
> - Managed non-transactional ORC tables will be converted to full 
> transactional tables
> - Managed non-transactional tables of other types will be converted to 
> insert-only transactional tables
> - Tables with non-native storage/schema will be converted to external tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19768) Utility to convert tables to conform to Hive strict managed tables mode

2018-06-01 Thread Jason Dere (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-19768:
--
Status: Patch Available  (was: Open)

> Utility to convert tables to conform to Hive strict managed tables mode
> ---
>
> Key: HIVE-19768
> URL: https://issues.apache.org/jira/browse/HIVE-19768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-19768.1.patch
>
>
> Create a utility that can check existing hive tables and convert them if 
> necessary to conform to strict managed tables mode.
> - Managed non-transactional ORC tables will be converted to full 
> transactional tables
> - Managed non-transactional tables of other types will be converted to 
> insert-only transactional tables
> - Tables with non-native storage/schema will be converted to external tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19768) Utility to convert tables to conform to Hive strict managed tables mode

2018-06-01 Thread Jason Dere (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498509#comment-16498509
 ] 

Jason Dere commented on HIVE-19768:
---

RB at https://reviews.apache.org/r/67418/

> Utility to convert tables to conform to Hive strict managed tables mode
> ---
>
> Key: HIVE-19768
> URL: https://issues.apache.org/jira/browse/HIVE-19768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-19768.1.patch
>
>
> Create a utility that can check existing hive tables and convert them if 
> necessary to conform to strict managed tables mode.
> - Managed non-transactional ORC tables will be converted to full 
> transactional tables
> - Managed non-transactional tables of other types will be converted to 
> insert-only transactional tables
> - Tables with non-native storage/schema will be converted to external tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19720) backport multiple MM commits to branch-3

2018-06-01 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498500#comment-16498500
 ] 

Sergey Shelukhin commented on HIVE-19720:
-

Actually it failed due to {noformat}
[INFO] Scanning for projects...
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 0.105s
[INFO] Finished at: Fri Jun 01 16:46:34 UTC 2018
[INFO] Final Memory: 11M/454M
[INFO] 
[ERROR] The goal you specified requires a project to execute but there is no 
POM in this directory 
(/home/hiveptest/35.188.100.152-hiveptest-0/apache-github-source-source/upgrade-acid).
 Please verify you invoked Maven from the correct directory. -> [Help 1]
...
{noformat} which is some BS failure.
I will add another recent commit and retry.

> backport multiple MM commits to branch-3
> 
>
> Key: HIVE-19720
> URL: https://issues.apache.org/jira/browse/HIVE-19720
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19720.01-branch-3.patch, 
> HIVE-19720.02-branch-3.patch, HIVE-19720.03-branch-3.patch
>
>
> To avoid chained test runs of branch-3 backporting one by one, I will run 
> HiveQA on an epic combined patch, then commit patches w/proper commit 
> separation via cherry-pick:
> 0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner)
> 99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
> 7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
> Shelukhin, reviewed by Eugene Koifman)
> 8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
> TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
> TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
> f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
> conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason 
> Dere)
> Need to add:
> 36d66f0cf27 HIVE-19643 : MM table conversion doesn't need full ACID structure 
> checks (Sergey Shelukhin, reviewed by Eugene Koifman)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19644) change WM syntax to avoid conflicts with identifiers starting with a number

2018-06-01 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498499#comment-16498499
 ] 

Sergey Shelukhin commented on HIVE-19644:
-

Looks like one file was left out of the patch...

> change WM syntax to avoid conflicts with identifiers starting with a number
> ---
>
> Key: HIVE-19644
> URL: https://issues.apache.org/jira/browse/HIVE-19644
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19644.01.patch, HIVE-19644.02.patch, 
> HIVE-19644.03.patch, HIVE-19644.04.patch, HIVE-19644.patch
>
>
> Time/etc literals conflict with non-ANSI query column names starting with a 
> number that were previously supported without quotes (e.g. 30days).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-14388) Add number of rows inserted message after insert command in Beeline

2018-06-01 Thread Bharathkrishna Guruvayoor Murali (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-14388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498494#comment-16498494
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-14388:
-

Thanks [~stakiar] and [~pvary] for the reviews.

Filed follow-up Jira for HOS. It works for multi-insert queries.

> Add number of rows inserted message after insert command in Beeline
> ---
>
> Key: HIVE-14388
> URL: https://issues.apache.org/jira/browse/HIVE-14388
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-14388-WIP.patch, HIVE-14388.02.patch, 
> HIVE-14388.03.patch, HIVE-14388.05.patch, HIVE-14388.06.patch, 
> HIVE-14388.07.patch, HIVE-14388.08.patch, HIVE-14388.09.patch, 
> HIVE-14388.10.patch, HIVE-14388.12.patch
>
>
> Currently, when you run insert command on beeline, it returns a message 
> saying "No rows affected .."
> A better and more intuitive msg would be "xxx rows inserted (26.068 seconds)"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19644) change WM syntax to avoid conflicts with identifiers starting with a number

2018-06-01 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19644:

Attachment: HIVE-19644.04.patch

> change WM syntax to avoid conflicts with identifiers starting with a number
> ---
>
> Key: HIVE-19644
> URL: https://issues.apache.org/jira/browse/HIVE-19644
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19644.01.patch, HIVE-19644.02.patch, 
> HIVE-19644.03.patch, HIVE-19644.04.patch, HIVE-19644.patch
>
>
> Time/etc literals conflict with non-ANSI query column names starting with a 
> number that were previously supported without quotes (e.g. 30days).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19643) MM table conversion doesn't need full ACID structure checks

2018-06-01 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498489#comment-16498489
 ] 

Sergey Shelukhin commented on HIVE-19643:
-

The backport will be handled in HIVE-19720

> MM table conversion doesn't need full ACID structure checks
> ---
>
> Key: HIVE-19643
> URL: https://issues.apache.org/jira/browse/HIVE-19643
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Jason Dere
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19643.01.patch, HIVE-19643.02.patch, 
> HIVE-19643.03.patch, HIVE-19643.04.patch, HIVE-19643.05.patch, 
> HIVE-19643.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19720) backport multiple MM commits to branch-3

2018-06-01 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19720:

Description: 
To avoid chained test runs of branch-3 backporting one by one, I will run 
HiveQA on an epic combined patch, then commit patches w/proper commit 
separation via cherry-pick:


0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
Shelukhin, reviewed by Gunther Hagleitner)
99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
Shelukhin, reviewed by Eugene Koifman)
8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason Dere)


Need to add:
36d66f0cf27 HIVE-19643 : MM table conversion doesn't need full ACID structure 
checks (Sergey Shelukhin, reviewed by Eugene Koifman)

  was:
To avoid chained test runs of branch-3 backporting one by one, I will run 
HiveQA on an epic combined patch, then commit patches w/proper commit 
separation via cherry-pick:


0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
Shelukhin, reviewed by Gunther Hagleitner)
99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
Shelukhin, reviewed by Eugene Koifman)
8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason Dere)



> backport multiple MM commits to branch-3
> 
>
> Key: HIVE-19720
> URL: https://issues.apache.org/jira/browse/HIVE-19720
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19720.01-branch-3.patch, 
> HIVE-19720.02-branch-3.patch, HIVE-19720.03-branch-3.patch
>
>
> To avoid chained test runs of branch-3 backporting one by one, I will run 
> HiveQA on an epic combined patch, then commit patches w/proper commit 
> separation via cherry-pick:
> 0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner)
> 99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
> 7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
> Shelukhin, reviewed by Eugene Koifman)
> 8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
> TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
> TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
> f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
> conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason 
> Dere)
> Need to add:
> 36d66f0cf27 HIVE-19643 : MM table conversion doesn't need full ACID structure 
> checks (Sergey Shelukhin, reviewed by Eugene Koifman)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19643) MM table conversion doesn't need full ACID structure checks

2018-06-01 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19643:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master for now.

> MM table conversion doesn't need full ACID structure checks
> ---
>
> Key: HIVE-19643
> URL: https://issues.apache.org/jira/browse/HIVE-19643
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Jason Dere
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19643.01.patch, HIVE-19643.02.patch, 
> HIVE-19643.03.patch, HIVE-19643.04.patch, HIVE-19643.05.patch, 
> HIVE-19643.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19720) backport multiple MM commits to branch-3

2018-06-01 Thread Vineet Garg (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498485#comment-16498485
 ] 

Vineet Garg commented on HIVE-19720:


Never mind about {{TestJdbcWithMiniLlapArrow}} failure. I actually reverted the 
commit from branch-3. So the only failure to look at for you is TestUpgradeTool.

> backport multiple MM commits to branch-3
> 
>
> Key: HIVE-19720
> URL: https://issues.apache.org/jira/browse/HIVE-19720
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19720.01-branch-3.patch, 
> HIVE-19720.02-branch-3.patch, HIVE-19720.03-branch-3.patch
>
>
> To avoid chained test runs of branch-3 backporting one by one, I will run 
> HiveQA on an epic combined patch, then commit patches w/proper commit 
> separation via cherry-pick:
> 0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner)
> 99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
> 7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
> Shelukhin, reviewed by Eugene Koifman)
> 8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
> TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
> TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
> f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
> conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason 
> Dere)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19720) backport multiple MM commits to branch-3

2018-06-01 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498483#comment-16498483
 ] 

Sergey Shelukhin commented on HIVE-19720:
-

I thought the arrow failure is the one you reverted a patch for just recently? 
The test report shows it failed for a 10 last runs

> backport multiple MM commits to branch-3
> 
>
> Key: HIVE-19720
> URL: https://issues.apache.org/jira/browse/HIVE-19720
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19720.01-branch-3.patch, 
> HIVE-19720.02-branch-3.patch, HIVE-19720.03-branch-3.patch
>
>
> To avoid chained test runs of branch-3 backporting one by one, I will run 
> HiveQA on an epic combined patch, then commit patches w/proper commit 
> separation via cherry-pick:
> 0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner)
> 99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
> 7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
> Shelukhin, reviewed by Eugene Koifman)
> 8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
> TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
> TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
> f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
> conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason 
> Dere)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19323) Create metastore SQL install and upgrade scripts for 3.1

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498482#comment-16498482
 ] 

Hive QA commented on HIVE-19323:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-11419/patches/PreCommit-HIVE-Build-11419.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-11419/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Create metastore SQL install and upgrade scripts for 3.1
> 
>
> Key: HIVE-19323
> URL: https://issues.apache.org/jira/browse/HIVE-19323
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.1.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19323.2.patch, HIVE-19323.3.patch, 
> HIVE-19323.4.patch, HIVE-19323.5.patch, HIVE-19323.6-branch-3.patch, 
> HIVE-19323.6.patch, HIVE-19323.branch-3.1.patch, HIVE-19323.patch
>
>
> Now that we've branched for 3.0 we need to create SQL install and upgrade 
> scripts for 3.1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19720) backport multiple MM commits to branch-3

2018-06-01 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498484#comment-16498484
 ] 

Sergey Shelukhin commented on HIVE-19720:
-

I'll try TestUpgradeTool

> backport multiple MM commits to branch-3
> 
>
> Key: HIVE-19720
> URL: https://issues.apache.org/jira/browse/HIVE-19720
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19720.01-branch-3.patch, 
> HIVE-19720.02-branch-3.patch, HIVE-19720.03-branch-3.patch
>
>
> To avoid chained test runs of branch-3 backporting one by one, I will run 
> HiveQA on an epic combined patch, then commit patches w/proper commit 
> separation via cherry-pick:
> 0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner)
> 99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
> 7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
> Shelukhin, reviewed by Eugene Koifman)
> 8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
> TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
> TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
> f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
> conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason 
> Dere)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19720) backport multiple MM commits to branch-3

2018-06-01 Thread Vineet Garg (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498462#comment-16498462
 ] 

Vineet Garg commented on HIVE-19720:


[~sershe] you can ignore {{druidkafkamini_basic, insertsel_fail and 
TestRpc.testServerPort}}. I hadn't seen {{TestJdbcWithMiniLlapArrow}} failure 
before. Is there a jira to track this?
{{TestUpgradeTool}} I have never seen before either.

> backport multiple MM commits to branch-3
> 
>
> Key: HIVE-19720
> URL: https://issues.apache.org/jira/browse/HIVE-19720
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19720.01-branch-3.patch, 
> HIVE-19720.02-branch-3.patch, HIVE-19720.03-branch-3.patch
>
>
> To avoid chained test runs of branch-3 backporting one by one, I will run 
> HiveQA on an epic combined patch, then commit patches w/proper commit 
> separation via cherry-pick:
> 0930aec69b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner)
> 99a2b8bd6b HIVE-19312 : MM tables don't work with BucketizedHIF (Sergey 
> Shelukhin, reviewed by Gunther Hagleitner) ADDENDUM
> 7ebcdeb951 HIVE-17657 : export/import for MM tables is broken (Sergey 
> Shelukhin, reviewed by Eugene Koifman)
> 8db979f1ff (part not previously backported) HIVE-19476: Fix failures in 
> TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and 
> TestCopyUtils (Sankar Hariappan, reviewed by Sergey Shelukhin)
> f4352e5339 HIVE-19258 : add originals support to MM tables (and make the 
> conversion a metadata only operation) (Sergey Shelukhin, reviewed by Jason 
> Dere)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables

2018-06-01 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498443#comment-16498443
 ] 

Hive QA commented on HIVE-19432:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12925652/HIVE-19432.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14443 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/11418/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/11418/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-11418/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12925652 - PreCommit-HIVE-Build

> HIVE-7575: GetTablesOperation is too slow if the hive has too many databases 
> and tables
> ---
>
> Key: HIVE-19432
> URL: https://issues.apache.org/jira/browse/HIVE-19432
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, HiveServer2
>Affects Versions: 2.2.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19432.01.patch, HIVE-19432.01.patch, 
> HIVE-19432.patch
>
>
> GetTablesOperation is too slow since it does not check for the authorization 
> for databases and try pulling all the tables from all the databases using 
> getTableMeta. for operation like follows
> {code}
> con.getMetaData().getTables("", "", "%", new String[] \{ "TABLE", "VIEW" });
> {code}
> build the getTableMeta call with wildcard *
> {code}
>  metastore.HiveMetaStore: 8: get_table_metas : db=* tbl=*
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries

2018-06-01 Thread Steve Yeom (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498341#comment-16498341
 ] 

Steve Yeom edited comment on HIVE-19416 at 6/1/18 7:05 PM:
---

Based on the talk with Ashutosh Chauhan, Sergey Shelukhin, Eugene Koifman, and 
I.
1. I will check and implement concurrent insert case detection and resolution 
based on txn_id (first version shall
  make the CSA state false, the next version may make the case  have the CSA 
state true). 
2. Hive aborted transaction case should turn the CSA state false.
3. I will explore (and implement if it  does not cause any issue) the 
possibility of using CSA state at TBLS/PARTITIONS 
  instead of keeping UPD_TXNS, which will simplify implementation. The basis of 
this is the invariant item that 
  Metastore TBLS/PARTITIONS keeps CSA updated for transactional stats in 
Metastore for both table and its columns.


was (Author: steveyeom2017):
Based on the talk with Ashutosh Chauhan, Sergey Shelukhin, Eugene Koifman, and 
I.
1. I will check and implement concurrent insert case detection and resolution 
based on txn_id (first version shall
  make the CSA state false, the next version may make the case  have the CSA 
state true). 
2. Hive aborted transaction case should turn the CSA state false.
3. I will explore (and implement if it  does not cause any issue) the 
possibility of using CSA state at TBLS/PARTITIONS 
  instead of keeping UPD_TXNS, which will simplify implementation. The basis of 
this is the invariant item that 
  Metastore TBLS/PARTITIONS keeps CSA updated for committed stats for both 
transactional table and its columns.

> Create single version transactional table metastore statistics for 
> aggregation queries
> --
>
> Key: HIVE-19416
> URL: https://issues.apache.org/jira/browse/HIVE-19416
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
>
> The system should use only statistics for aggregation queries like count on 
> transactional tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-19768) Utility to convert tables to conform to Hive strict managed tables mode

2018-06-01 Thread Jason Dere (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-19768:
-


> Utility to convert tables to conform to Hive strict managed tables mode
> ---
>
> Key: HIVE-19768
> URL: https://issues.apache.org/jira/browse/HIVE-19768
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
>
> Create a utility that can check existing hive tables and convert them if 
> necessary to conform to strict managed tables mode.
> - Managed non-transactional ORC tables will be converted to full 
> transactional tables
> - Managed non-transactional tables of other types will be converted to 
> insert-only transactional tables
> - Tables with non-native storage/schema will be converted to external tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19751) create submodule of hive-upgrade-acid for preUpgrade and postUpgrade

2018-06-01 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19751:
--
Attachment: HIVE-19751.02.patch

> create submodule of hive-upgrade-acid for preUpgrade and postUpgrade
> 
>
> Key: HIVE-19751
> URL: https://issues.apache.org/jira/browse/HIVE-19751
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-19751.02.patch
>
>
> Basically need to produce 2 separate jars: 1 for pre-upgrade step that can be 
> compiled/unit tested with 2.x jars and another can be compiled/tested with 
> 3.x jars. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19751) create submodule of hive-upgrade-acid for preUpgrade and postUpgrade

2018-06-01 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19751:
--
Status: Patch Available  (was: Open)

cc [~ashutoshc], [~jdere]

> create submodule of hive-upgrade-acid for preUpgrade and postUpgrade
> 
>
> Key: HIVE-19751
> URL: https://issues.apache.org/jira/browse/HIVE-19751
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-19751.02.patch
>
>
> Basically need to produce 2 separate jars: 1 for pre-upgrade step that can be 
> compiled/unit tested with 2.x jars and another can be compiled/tested with 
> 3.x jars. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

1 2 3 >

1 - 100 of 240 matches

Mail list logo