[jira] [Commented] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561541#comment-16561541
 ] 

Hive QA commented on HIVE-20166:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 41s{color} | {color:blue} serde in master has 195 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 54s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12939/dev-support/hive-personality.sh |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: serde U: serde |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12939/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " +
>       structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>   "only got " + fieldId + "! " +
>   "Last field end " + lastFieldByteEnd + " and serialize buffer end " 
> + structByteEnd + ". " +
>   "Ignoring similar problems.");
> }
> {code}
> The first log statement is a 'warn' level logging, the second is an 'info' 
> level logging.  Please change the second log to also be a 'warn'.  This seems 
> like it could be a problem that the user would like t

[jira] [Commented] (HIVE-20209) Metastore connection fails for first attempt in repl dump.

2018-07-29 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561539#comment-16561539
 ] 

mahesh kumar behera commented on HIVE-20209:


Code changes look fine to me.

> Metastore connection fails for first attempt in repl dump.
> --
>
> Key: HIVE-20209
> URL: https://issues.apache.org/jira/browse/HIVE-20209
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, repl
> Affects Versions: 3.0.0, 3.1.0, 4.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: DR, pull-request-available, replication
> Attachments: HIVE-20209.01.patch
>
>
> Run the following command:
> {code:java}
> repl dump `*` from 60758 with ('hive.repl.dump.metadata.only'='true', 
> 'hive.repl.dump.include.acid.tables'='true');
> {code}
> See this in hs2.log:
> {code:java}
> 2018-07-10T18:07:32,308 INFO [HiveServer2-Handler-Pool: Thread-14380]: 
> conf.HiveConf (HiveConf.java:getLogIdVar(5061)) - Using the default value 
> passed in for log id: f1e13736-3f10-4abf-a29b-683b534dfa4c
> 2018-07-10T18:07:32,309 INFO [HiveServer2-Handler-Pool: Thread-14380]: 
> session.SessionState (:()) - Updating thread name to 
> f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380
> 2018-07-10T18:07:32,311 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: operation.OperationManager (:()) - 
> Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
> getHandleIdentifier()=16eb1d07-e125-490c-8ab8-90192bfd459b]
> 2018-07-10T18:07:32,314 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Compiling 
> command(queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b): 
> repl dump `*` from 60758 with ('hive.repl.dump.metadata.only'='true', 
> 'hive.repl.dump.include.acid.tables'='true')
> 2018-07-10T18:07:32,317 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: metastore.HiveMetaStoreClient (:()) 
> - Trying to connect to metastore with URI 
> thrift://hwx-demo-2.field.hortonworks.com:9083
> 2018-07-10T18:07:32,317 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: metastore.HiveMetaStoreClient (:()) 
> - Opened a connection to metastore, current connections: 19
> 2018-07-10T18:07:32,319 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: metastore.HiveMetaStoreClient (:()) 
> - Connected to metastore.
> 2018-07-10T18:07:32,319 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: metastore.RetryingMetaStoreClient 
> (:()) - RetryingMetaStoreClient proxy=class 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive 
> (auth:SIMPLE) retries=24 delay=5 lifetime=0
> 2018-07-10T18:07:32,439 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Semantic Analysis 
> Completed (retrial = false)
> 2018-07-10T18:07:32,440 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Returning Hive 
> schema: Schema(fieldSchemas:[FieldSchema(name:dump_dir, type:string, 
> comment:from deserializer), FieldSchema(name:last_repl_id, type:string, 
> comment:from deserializer)], properties:null)
> 2018-07-10T18:07:32,443 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: exec.ListSinkOperator (:()) - 
> Initializing operator LIST_SINK[0]
> 2018-07-10T18:07:32,446 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Completed 
> compiling 
> command(queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b); 
> Time taken: 0.132 seconds
> 2018-07-10T18:07:32,447 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: conf.HiveConf 
> (HiveConf.java:getLogIdVar(5061)) - Using the default value passed in for log 
> id: f1e13736-3f10-4abf-a29b-683b534dfa4c
> 2018-07-10T18:07:32,448 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c 
> HiveServer2-Handler-Pool: Thread-14380]: session.SessionState (:()) - 
> Resetting thread name to HiveServer2-Handler-Pool: Thread-14380
> 2018-07-10T18:07:32,451 INFO [HiveServer2-Background-Pool: Thread-15161]: 
> reexec.ReExecDriver (:()) - Execution #1 of query
> 2018-07-10T18:07:32,452 INFO [HiveServer2-Background-Pool: Thread-15161]: 
> lockmgr.DbTxnManager (:()) - Setting lock request transaction to txnid:30327 
> for queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b
> 2018-07-10T18:07:32,454 INFO [HiveServer2-Background-Pool: Thread-15161]: 
> lockmgr.DbLockManager (:()) - Requesting: 
> queryId=hive

[jira] [Commented] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561469#comment-16561469
 ] 

Hive QA commented on HIVE-20220:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933540/HIVE-20220.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14817 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12938/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12938/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12938/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933540 - PreCommit-HIVE-Build

> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 3.0.0
> Reporter: Ganesha Shreedhara
> Assignee: Ganesha Shreedhara
> Priority: Major
> Attachments: HIVE-20220.patch
>
>
> hive.groupby.skewindata uses the rand UDF to randomly distribute group-by 
> keys to the reducers, which avoids overloading a single reducer when the 
> data is skewed. 
> This random distribution of keys is buggy when a reducer fails to fetch the 
> mapper output due to a faulty datanode or any other reason. When a reducer 
> finds that it can't fetch the mapper output, it signals the Application 
> Master to reattempt the corresponding map task. The reattempted map task 
> gets different random values from the rand function, so the keys it 
> distributes to the reducers are not the same as in the previous run. 
>  
> *Steps to reproduce:*
> create table test(id int);
> insert into test values 
> (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.reduce.reduces=2;
> //Add a debug port for reducer
> select count(1) from test group by id;
> //Remove the mapper's intermediate output file after the map stage has 
> completed and one of the 2 reduce tasks has finished, then continue the run. 
> This causes the 2nd reducer to send an event to the Application Master to 
> rerun the map task. 
> The following is the expected result. 
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9 
>  
> However, you may get a different result because the rand function returns 
> different values in the second run, which changes the distribution of keys.
> This needs to be fixed such that the mapper distributes the same keys even if 
> it is reattempted multiple times. 
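
The requirement stated above (a reattempted mapper must route rows the same way as the first attempt) can be illustrated with a minimal sketch. It is not Hive's code and not the attached patch; the class name, the `assign` helper, and the seed value are hypothetical, and it only assumes the documented behaviour of `java.util.Random`, where two instances built with the same seed produce the same sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration only; not Hive's implementation. */
public class SkewDistributionSketch {

  /** Picks a reducer for each row by drawing from the given random source. */
  public static List<Integer> assign(int rowCount, int numReducers, Random rand) {
    List<Integer> reducers = new ArrayList<>();
    for (int i = 0; i < rowCount; i++) {
      reducers.add(rand.nextInt(numReducers));
    }
    return reducers;
  }

  public static void main(String[] args) {
    int rows = 46;     // number of rows inserted in the reproduction steps
    int reducers = 2;
    long seed = 42L;   // assumption: any value that stays stable across task attempts

    // Unseeded: a second attempt will usually route rows differently.
    boolean unseededSame = assign(rows, reducers, new Random())
        .equals(assign(rows, reducers, new Random()));
    System.out.println("unseeded attempts identical: " + unseededSame);

    // Seeded: the second attempt reproduces the first attempt's routing.
    boolean seededSame = assign(rows, reducers, new Random(seed))
        .equals(assign(rows, reducers, new Random(seed)));
    System.out.println("seeded attempts identical: " + seededSame);
  }
}
```

How a stable seed would be derived per map task is left open here; the sketch only shows why seeding makes the distribution repeatable.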



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561455#comment-16561455
 ] 

Hive QA commented on HIVE-20220:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 51s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 31s{color} | {color:blue} common in master has 64 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 57s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 16s{color} | {color:red} common: The patch generated 2 new + 423 unchanged - 0 fixed = 425 total (was 423) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 1 new + 55 unchanged - 0 fixed = 56 total (was 55) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 44s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12938/dev-support/hive-personality.sh |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12938/yetus/diff-checkstyle-common.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12938/yetus/diff-checkstyle-ql.txt |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12938/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 3.0.0
> Reporter: Ganesha Shreedhara
> Assignee: Ganesha Shreedhara
> Priority: Major
> Attachments: HIVE-20220.patch
>
>
> hive.groupby.skewindata uses the rand UDF to randomly distribute group-by 
> keys to the reducers, which avoids overloading a single reducer when the 
> data is skewed. 
> This random distribution of keys is buggy when the reducer fails to fetch the 
> mapper output due to

[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561443#comment-16561443
 ] 

Hive QA commented on HIVE-20225:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933537/HIVE-20225.1.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14833 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz] (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_joins] (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_masking] (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1] (batchId=193)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12937/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12937/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12937/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933537 - PreCommit-HIVE-Build

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
> Issue Type: New Feature
> Components: Serializers/Deserializers
> Reporter: Lu Li
> Assignee: Lu Li
> Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export/import data from Teradata, Teradata will 
> generate/require binary files based on the schema.
> A customized SerDe is needed to read these files directly from Hive or to 
> write them so they can be loaded back into Teradata.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;
> {code}
> Problem Statement:
> Right now the fast way to export/import data from Teradata is to use TPT. 
> However, Hive cannot directly read or generate these binary files because it 
> does not have a SerDe for them.
> Result:
> Provided with the SerDe, Hive can transparently read and generate files in 
> the exported Teradata Binary Format.





[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-20225:
--
Status: Open  (was: Patch Available)

Hi [~luli], automated testing caught new checkstyle, findbugs, and missing 
license header problems with the patch (see results above). Please fix these 
issues and then resubmit the patch for review. Thanks!

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
> Issue Type: New Feature
> Components: Serializers/Deserializers
> Reporter: Lu Li
> Assignee: Lu Li
> Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export/import data from Teradata, Teradata will 
> generate/require binary files based on the schema.
> A customized SerDe is needed to read these files directly from Hive or to 
> write them so they can be loaded back into Teradata.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;
> {code}
> Problem Statement:
> Right now the fast way to export/import data from Teradata is to use TPT. 
> However, Hive cannot directly read or generate these binary files because it 
> does not have a SerDe for them.
> Result:
> Provided with the SerDe, Hive can transparently read and generate files in 
> the exported Teradata Binary Format.





[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20166:

Attachment: HIVE-20166.1.patch
Status: Patch Available  (was: Open)

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " +
>       structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>   "only got " + fieldId + "! " +
>   "Last field end " + lastFieldByteEnd + " and serialize buffer end " 
> + structByteEnd + ". " +
>   "Ignoring similar problems.");
> }
> {code}
> The first log statement is a 'warn' level logging, the second is an 'info' 
> level logging.  Please change the second log to also be a 'warn'.  This seems 
> like it could be a problem that the user would like to know about.
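
The change requested in the description above can be sketched roughly as follows. This is a self-contained approximation, not the contents of HIVE-20166.1.patch: the class name and the `check` method are hypothetical, and `java.util.logging` stands in for the logger Hive actually uses, so `LOG.warning` here plays the role of `LOG.warn` in the quoted snippet.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the end-of-struct checks quoted in the issue. */
public class StructEndCheckSketch {
  // Assumption: java.util.logging is used only to keep the sketch self-contained.
  private static final Logger LOG =
      Logger.getLogger(StructEndCheckSketch.class.getName());

  private boolean extraFieldWarned = false;
  private boolean missingFieldWarned = false;

  /** Returns the level that was logged at, or null when nothing was logged. */
  public Level check(int lastFieldByteEnd, int structByteEnd,
                     int expectedFields, int fieldId) {
    // Extra bytes at the end?
    if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
      extraFieldWarned = true;
      LOG.warning("Extra bytes detected at the end of the row! "
          + "Last field end " + lastFieldByteEnd + " and serialize buffer end "
          + structByteEnd + ". Ignoring similar problems.");
      return Level.WARNING;
    }
    // Missing fields? Logged at warn level too, as the issue requests.
    if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
      missingFieldWarned = true;
      LOG.warning("Missing fields! Expected " + expectedFields
          + " fields but only got " + fieldId + "! "
          + "Last field end " + lastFieldByteEnd + " and serialize buffer end "
          + structByteEnd + ". Ignoring similar problems.");
      return Level.WARNING;
    }
    return null;
  }

  public static void main(String[] args) {
    StructEndCheckSketch sketch = new StructEndCheckSketch();
    sketch.check(8, 10, 3, 3);   // extra bytes: warns once
    sketch.check(12, 10, 3, 2);  // missing fields: now also a warning
    sketch.check(12, 10, 3, 2);  // repeat is suppressed by the flag
  }
}
```

Both branches keep the warn-once flags from the original snippet, so only the first occurrence of each problem is reported.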





[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20166:

Attachment: (was: HIVE-20166.1.patch)

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " +
>       structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>   "only got " + fieldId + "! " +
>   "Last field end " + lastFieldByteEnd + " and serialize buffer end " 
> + structByteEnd + ". " +
>   "Ignoring similar problems.");
> }
> {code}
> The first log statement is a 'warn' level logging, the second is an 'info' 
> level logging.  Please change the second log to also be a 'warn'.  This seems 
> like it could be a problem that the user would like to know about.





[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561418#comment-16561418
 ] 

Hive QA commented on HIVE-20225:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 23s{color} | {color:blue} contrib in master has 13 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 10s{color} | {color:red} contrib: The patch generated 99 new + 0 unchanged - 0 fixed = 99 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 33s{color} | {color:red} contrib generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 12s{color} | {color:red} The patch generated 11 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 55s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:contrib |
|  |  Inconsistent synchronization of org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryRecordReader.pos; locked 81% of time  Unsynchronized access at TeradataBinaryRecordReader.java:[line 206] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12937/dev-support/hive-personality.sh |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus/diff-checkstyle-contrib.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus/new-findbugs-contrib.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus/patch-asflicense-problems.txt |
| modules | C: contrib U: contrib |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
> Issue Type: New Feature
> Components: Serializers/Deserializers
> Reporter: Lu Li
> Assignee: Lu Li
> Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export/import data from Teradata, Teradata will 
> generate/require binary files based on the schema.
> A customized SerDe is needed to read these files directly from Hive or to 
> write them so they can be loaded back into Teradata.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.T

[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20166:

Status: Open  (was: Patch Available)

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " +
>       structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " +
>       structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement is at 'warn' level, but the second is at 'info' 
> level.  Please change the second statement to 'warn' as well.  This seems 
> like a problem that the user would want to know about.
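
The requested change can be sketched as follows. This is a minimal, dependency-free illustration and not the Hive class or the attached patch: TailCheckSketch and its check method are hypothetical names, and java.util.logging stands in for whatever logger LazyBinaryStruct actually uses. It only shows both conditions being reported at the same level, once each.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the tail check quoted above; not the Hive class itself.
class TailCheckSketch {
  // java.util.logging is an assumption made to keep the sketch self-contained.
  private static final Logger LOG = Logger.getLogger(TailCheckSketch.class.getName());

  private boolean extraFieldWarned = false;
  private boolean missingFieldWarned = false;

  /**
   * Compares where the last parsed field ended with where the struct ended.
   * Returns the level that was logged, or null when nothing was logged.
   */
  Level check(int lastFieldByteEnd, int structByteEnd, int expectedFields, int fieldsRead) {
    // Extra bytes at the end: warn once, then stay quiet.
    if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
      extraFieldWarned = true;
      LOG.log(Level.WARNING, "Extra bytes detected at the end of the row! Last field end "
          + lastFieldByteEnd + " and serialize buffer end " + structByteEnd
          + ". Ignoring similar problems.");
      return Level.WARNING;
    }
    // Missing fields: also WARNING here, which is the change the issue asks for.
    if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
      missingFieldWarned = true;
      LOG.log(Level.WARNING, "Missing fields! Expected " + expectedFields
          + " fields but only got " + fieldsRead + "! Last field end " + lastFieldByteEnd
          + " and serialize buffer end " + structByteEnd + ". Ignoring similar problems.");
      return Level.WARNING;
    }
    return null;
  }
}
```

The two conditions are mutually exclusive, so returning early after the first one does not change which messages are emitted compared with the two independent if blocks in the quoted code.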



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561413#comment-16561413
 ] 

Hive QA commented on HIVE-20166:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933534/HIVE-20166.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14815 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.minikdc.TestJdbcWithMiniKdcCookie.testCookie (batchId=264)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12936/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12936/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12936/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933534 - PreCommit-HIVE-Build

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " +
>       structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " +
>       structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement is at 'warn' level, but the second is at 'info' 
> level.  Please change the second statement to 'warn' as well.  This seems 
> like a problem that the user would want to know about.





[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Ganesha Shreedhara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesha Shreedhara updated HIVE-20220:
--
Status: Patch Available  (was: In Progress)

Added qtests.

> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
> Attachments: HIVE-20220.patch
>
>
> hive.groupby.skewindata uses the rand UDF to distribute the group-by keys 
> randomly across the reducers, which avoids overloading a single reducer when 
> the data is skewed. 
> This random distribution breaks when a reducer fails to fetch the mapper 
> output because of a faulty datanode or for any other reason. When a reducer 
> finds that it can't fetch the mapper output, it signals the Application 
> Master to reattempt the corresponding map task. The reattempted map task 
> gets different random values from the rand function, so the keys it 
> distributes to the reducers are not the same as in the previous run. 
>  
> *Steps to reproduce:*
> create table test(id int);
> insert into test values 
> (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.reduce.reduces=2;
> //Add a debug port for reducer
> select count(1) from test group by id;
> //Remove mapper's intermediate output file when map stage is completed and 
> one out of 2 reduce tasks is completed and then continue the run. This causes 
> 2nd reducer to send event to Application Master to rerun the map task. 
> The following is the expected result. 
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9 
>  
> But you may get a different result, because the rand function returns 
> different values in the second run and so distributes the keys differently.
> This needs to be fixed so that the mapper distributes the same keys even if 
> it is reattempted multiple times. 
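
The failure mode described above can be illustrated with a small sketch. This is not Hive's code and not the attached patch: SkewDistributionSketch, assign, attemptsAgree and the taskSeed parameter are hypothetical names, and seeding the random source per task is only one possible way to make a reattempt reproduce the original distribution.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; this is not Hive's code or the attached patch.
class SkewDistributionSketch {
  /** Assigns each row to a reducer index using the supplied random source. */
  static int[] assign(int[] rows, int numReducers, Random rand) {
    int[] target = new int[rows.length];
    for (int i = 0; i < rows.length; i++) {
      target[i] = rand.nextInt(numReducers);
    }
    return target;
  }

  /** Two attempts of the same task agree when both start from the same seed. */
  static boolean attemptsAgree(int[] rows, int numReducers, long taskSeed) {
    int[] first = assign(rows, numReducers, new Random(taskSeed));
    int[] retry = assign(rows, numReducers, new Random(taskSeed));
    return Arrays.equals(first, retry);
  }
}
```

With an unseeded new Random() in place of new Random(taskSeed), the first attempt and the retry will generally send rows to different reducers. If one reducer has already consumed the first attempt's output, rows can then be counted twice or not at all, which matches the symptom in the report.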





[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Ganesha Shreedhara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesha Shreedhara updated HIVE-20220:
--
Status: In Progress  (was: Patch Available)

> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
> Attachments: HIVE-20220.patch
>
>
> hive.groupby.skewindata uses the rand UDF to distribute the group-by keys 
> randomly across the reducers, which avoids overloading a single reducer when 
> the data is skewed. 
> This random distribution breaks when a reducer fails to fetch the mapper 
> output because of a faulty datanode or for any other reason. When a reducer 
> finds that it can't fetch the mapper output, it signals the Application 
> Master to reattempt the corresponding map task. The reattempted map task 
> gets different random values from the rand function, so the keys it 
> distributes to the reducers are not the same as in the previous run. 
>  
> *Steps to reproduce:*
> create table test(id int);
> insert into test values 
> (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.reduce.reduces=2;
> //Add a debug port for reducer
> select count(1) from test group by id;
> //Remove mapper's intermediate output file when map stage is completed and 
> one out of 2 reduce tasks is completed and then continue the run. This causes 
> 2nd reducer to send event to Application Master to rerun the map task. 
> The following is the expected result. 
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9 
>  
> But you may get a different result, because the rand function returns 
> different values in the second run and so distributes the keys differently.
> This needs to be fixed so that the mapper distributes the same keys even if 
> it is reattempted multiple times. 





[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Ganesha Shreedhara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesha Shreedhara updated HIVE-20220:
--
Attachment: HIVE-20220.patch

> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
> Attachments: HIVE-20220.patch
>
>
> hive.groupby.skewindata uses the rand UDF to distribute the group-by keys 
> randomly across the reducers, which avoids overloading a single reducer when 
> the data is skewed. 
> This random distribution breaks when a reducer fails to fetch the mapper 
> output because of a faulty datanode or for any other reason. When a reducer 
> finds that it can't fetch the mapper output, it signals the Application 
> Master to reattempt the corresponding map task. The reattempted map task 
> gets different random values from the rand function, so the keys it 
> distributes to the reducers are not the same as in the previous run. 
>  
> *Steps to reproduce:*
> create table test(id int);
> insert into test values 
> (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.reduce.reduces=2;
> //Add a debug port for reducer
> select count(1) from test group by id;
> //Remove mapper's intermediate output file when map stage is completed and 
> one out of 2 reduce tasks is completed and then continue the run. This causes 
> 2nd reducer to send event to Application Master to rerun the map task. 
> The following is the expected result. 
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9 
>  
> But you may get a different result, because the rand function returns 
> different values in the second run and so distributes the keys differently.
> This needs to be fixed so that the mapper distributes the same keys even if 
> it is reattempted multiple times. 





[jira] [Issue Comment Deleted] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Ganesha Shreedhara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesha Shreedhara updated HIVE-20220:
--
Comment: was deleted

(was: Can someone review this patch please? Please let me know if there is a 
better way of fixing this. I can update the qtest files based on the fix. 

 )

> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
>
> hive.groupby.skewindata uses the rand UDF to distribute the group-by keys 
> randomly across the reducers, which avoids overloading a single reducer when 
> the data is skewed. 
> This random distribution breaks when a reducer fails to fetch the mapper 
> output because of a faulty datanode or for any other reason. When a reducer 
> finds that it can't fetch the mapper output, it signals the Application 
> Master to reattempt the corresponding map task. The reattempted map task 
> gets different random values from the rand function, so the keys it 
> distributes to the reducers are not the same as in the previous run. 
>  
> *Steps to reproduce:*
> create table test(id int);
> insert into test values 
> (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.reduce.reduces=2;
> //Add a debug port for reducer
> select count(1) from test group by id;
> //Remove mapper's intermediate output file when map stage is completed and 
> one out of 2 reduce tasks is completed and then continue the run. This causes 
> 2nd reducer to send event to Application Master to rerun the map task. 
> The following is the expected result. 
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9 
>  
> But you may get a different result, because the rand function returns 
> different values in the second run and so distributes the keys differently.
> This needs to be fixed so that the mapper distributes the same keys even if 
> it is reattempted multiple times. 





[jira] [Issue Comment Deleted] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Ganesha Shreedhara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesha Shreedhara updated HIVE-20220:
--
Comment: was deleted

(was: I'll correct the golden files if this fix is feasible. )

> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
>
> hive.groupby.skewindata uses the rand UDF to distribute the group-by keys 
> randomly across the reducers, which avoids overloading a single reducer when 
> the data is skewed. 
> This random distribution breaks when a reducer fails to fetch the mapper 
> output because of a faulty datanode or for any other reason. When a reducer 
> finds that it can't fetch the mapper output, it signals the Application 
> Master to reattempt the corresponding map task. The reattempted map task 
> gets different random values from the rand function, so the keys it 
> distributes to the reducers are not the same as in the previous run. 
>  
> *Steps to reproduce:*
> create table test(id int);
> insert into test values 
> (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.reduce.reduces=2;
> //Add a debug port for reducer
> select count(1) from test group by id;
> //Remove mapper's intermediate output file when map stage is completed and 
> one out of 2 reduce tasks is completed and then continue the run. This causes 
> 2nd reducer to send event to Application Master to rerun the map task. 
> The following is the expected result. 
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9 
>  
> But you may get a different result, because the rand function returns 
> different values in the second run and so distributes the keys differently.
> This needs to be fixed so that the mapper distributes the same keys even if 
> it is reattempted multiple times. 





[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled

2018-07-29 Thread Ganesha Shreedhara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesha Shreedhara updated HIVE-20220:
--
Attachment: (was: HIVE-20220.patch)

> Incorrect result when hive.groupby.skewindata is enabled
> 
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 3.0.0
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
>
> hive.groupby.skewindata uses the rand UDF to distribute the group-by keys 
> randomly across the reducers, which avoids overloading a single reducer when 
> the data is skewed. 
> This random distribution breaks when a reducer fails to fetch the mapper 
> output because of a faulty datanode or for any other reason. When a reducer 
> finds that it can't fetch the mapper output, it signals the Application 
> Master to reattempt the corresponding map task. The reattempted map task 
> gets different random values from the rand function, so the keys it 
> distributes to the reducers are not the same as in the previous run. 
>  
> *Steps to reproduce:*
> create table test(id int);
> insert into test values 
> (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.reduce.reduces=2;
> //Add a debug port for reducer
> select count(1) from test group by id;
> //Remove mapper's intermediate output file when map stage is completed and 
> one out of 2 reduce tasks is completed and then continue the run. This causes 
> 2nd reducer to send event to Application Master to rerun the map task. 
> The following is the expected result. 
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9 
>  
> But you may get a different result, because the rand function returns 
> different values in the second run and so distributes the keys differently.
> This needs to be fixed so that the mapper distributes the same keys even if 
> it is reattempted multiple times. 





[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Lu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Li updated HIVE-20225:
-
Description: 
When using TPT/BTEQ to export/import Data from Teradata, Teradata will 
generate/require binary files based on the schema.

A Customized SerDe is needed in order to directly read these files from Hive or 
write these files in order to load back to TD.
{code:java}
CREATE EXTERNAL TABLE `TABLE1`(
...)
PARTITIONED BY (
...)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
STORED AS INPUTFORMAT
 
'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
OUTPUTFORMAT
 
'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
LOCATION ...;

SELECT * FROM `TABLE1`;{code}
Problem Statement:

Right now, the fast way to export/import data from Teradata is to use TPT. 
However, Hive cannot directly use or generate this binary format 
because it doesn't have a SerDe for these files.

Result:

Provided with the SerDe, Hive can operate upon and generate Teradata 
Binary Format files transparently.

  was:
When using TPT/BTEQ to export Data from Teradata, Teradata will export binary 
files based on the schema.

A Customized SerDe is needed in order to directly read these files from Hive.
{code:java}
CREATE EXTERNAL TABLE `TABLE1`(
...)
PARTITIONED BY (
...)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
STORED AS INPUTFORMAT
 
'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
OUTPUTFORMAT
 
'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
LOCATION ...;

SELECT * FROM `TABLE1`;{code}
Problem Statement:

Right now, the fast way to export data from Teradata is to use TPT. However, 
Hive cannot directly use this exported binary format because it doesn't 
have a SerDe for these files.

Result:

Provided with the SerDe, Hive can operate upon the exported Teradata Binary 
Format file transparently.


> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export/import Data from Teradata, Teradata will 
> generate/require binary files based on the schema.
> A Customized SerDe is needed in order to directly read these files from Hive 
> or write these files in order to load back to TD.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export/import data from Teradata is to use TPT. 
> However, Hive cannot directly use or generate this binary format 
> because it doesn't have a SerDe for these files.
> Result:
> Provided with the SerDe, Hive can operate upon and generate Teradata 
> Binary Format files transparently.





[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Lu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561381#comment-16561381
 ] 

Lu Li commented on HIVE-20225:
--

Hi [~cwsteinbach]

Could you please review this and provide your comments?

Thanks,
Lu

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export Data from Teradata, Teradata will export binary 
> files based on the schema.
> A Customized SerDe is needed in order to directly read these files from Hive.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export data from Teradata is to use TPT. However, 
> Hive cannot directly use this exported binary format because it doesn't 
> have a SerDe for these files.
> Result:
> Provided with the SerDe, Hive can operate upon the exported Teradata Binary 
> Format file transparently.





[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Lu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Li updated HIVE-20225:
-
Status: Patch Available  (was: Open)

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export Data from Teradata, Teradata will export binary 
> files based on the schema.
> A Customized SerDe is needed in order to directly read these files from Hive.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export data from Teradata is to use TPT. However, 
> Hive cannot directly use this exported binary format because it doesn't 
> have a SerDe for these files.
> Result:
> Provided with the SerDe, Hive can operate upon the exported Teradata Binary 
> Format file transparently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Lu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561379#comment-16561379
 ] 

Lu Li commented on HIVE-20225:
--

Added the RB: https://reviews.apache.org/r/68099/

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export Data from Teradata, Teradata will export binary 
> files based on the schema.
> A Customized SerDe is needed in order to directly read these files from Hive.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export data from Teradata is to use TPT. However, 
> Hive cannot directly use this exported binary format because it doesn't 
> have a SerDe for these files.
> Result:
> Provided with the SerDe, Hive can operate upon the exported Teradata Binary 
> Format file transparently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Lu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Li updated HIVE-20225:
-
Attachment: HIVE-20225.1.patch

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export Data from Teradata, Teradata will export binary 
> files based on the schema.
> A Customized SerDe is needed in order to directly read these files from Hive.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export data from Teradata is to use TPT. However, 
> Hive cannot directly use this exported binary format because it doesn't 
> have a SerDe for these files.
> Result:
> Provided with the SerDe, Hive can operate upon the exported Teradata Binary 
> Format file transparently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Lu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Li updated HIVE-20225:
-
Attachment: (was: HIVE-20225.1.patch)

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export Data from Teradata, Teradata will export binary 
> files based on the schema.
> A Customized SerDe is needed in order to directly read these files from Hive.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export data from Teradata is to use TPT. However, 
> Hive cannot directly use this exported binary format because it doesn't 
> have a SerDe for these files.
> Result:
> Provided with the SerDe, Hive can operate upon the exported Teradata Binary 
> Format file transparently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-29 Thread Lu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Li updated HIVE-20225:
-
Attachment: HIVE-20225.1.patch

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
> Attachments: HIVE-20225.1.patch
>
>
> When using TPT/BTEQ to export Data from Teradata, Teradata will export binary 
> files based on the schema.
> A Customized SerDe is needed in order to directly read these files from Hive.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now, the fast way to export data from Teradata is to use TPT. However, 
> Hive cannot directly use this exported binary format because it doesn't 
> have a SerDe for these files.
> Result:
> With the SerDe, Hive can operate on exported Teradata Binary Format files 
> transparently.





[jira] [Commented] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561359#comment-16561359
 ] 

Hive QA commented on HIVE-20166:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} serde in master has 195 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12936/dev-support/hive-personality.sh
 |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: serde U: serde |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12936/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd +
>       " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd +
>       " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement logs at 'warn' level, but the second logs at 'info' 
> level.  Please change the second statement to log at 'warn' as well.  This 
> seems like a problem that the user would like to know about.

[jira] [Commented] (HIVE-19798) Number of distinct values column statistic accounts null as a distinct value

2018-07-29 Thread Anurag Mantripragada (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561335#comment-16561335
 ] 

Anurag Mantripragada commented on HIVE-19798:
-

[~arhimondr], can you please provide more info on this?

> Number of distinct values column statistic accounts null as a distinct value
> 
>
> Key: HIVE-19798
> URL: https://issues.apache.org/jira/browse/HIVE-19798
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Andy Rosa
>Priority: Minor
>






[jira] [Assigned] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada reassigned HIVE-20166:
---

Assignee: Anurag Mantripragada

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd +
>       " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd +
>       " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement logs at 'warn' level, but the second logs at 'info' 
> level.  Please change the second statement to log at 'warn' as well.  This 
> seems like a problem that the user would like to know about.
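
The requested change can be sketched outside Hive as follows. This is a minimal illustration under stated assumptions, not the attached patch: the class name RowLengthChecker and its check method are hypothetical, and java.util.logging stands in for the logger used by LazyBinaryStruct so that the example has no external dependencies. Each flag makes sure its message is emitted only once, and both messages are logged at warning level.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the length checks quoted above; not Hive's actual class. */
final class RowLengthChecker {
  private static final Logger LOG = Logger.getLogger(RowLengthChecker.class.getName());

  // Each flag ensures the corresponding message is logged only once per instance.
  private boolean extraFieldWarned = false;
  private boolean missingFieldWarned = false;

  /** Returns the level at which a message was logged, or null if nothing was logged. */
  Level check(int lastFieldByteEnd, int structByteEnd, int fieldId, int expectedFields) {
    // Extra bytes at the end?
    if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
      extraFieldWarned = true;
      LOG.warning("Extra bytes detected at the end of the row! Last field end "
          + lastFieldByteEnd + " and serialize buffer end " + structByteEnd
          + ". Ignoring similar problems.");
      return Level.WARNING;
    }
    // Missing fields? Logged at warning level here, as the issue requests.
    if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
      missingFieldWarned = true;
      LOG.warning("Missing fields! Expected " + expectedFields + " fields but only got "
          + fieldId + "! Last field end " + lastFieldByteEnd
          + " and serialize buffer end " + structByteEnd
          + ". Ignoring similar problems.");
      return Level.WARNING;
    }
    return null;
  }
}
```

The two conditions are mutually exclusive, so returning early after the first match behaves the same as the two independent if blocks in the quoted code.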





[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging

2018-07-29 Thread Anurag Mantripragada (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Mantripragada updated HIVE-20166:

Attachment: HIVE-20166.1.patch
Status: Patch Available  (was: Open)

Changed logging level to WARN.

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd +
>       " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd +
>       " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement logs at 'warn' level, but the second logs at 'info' 
> level.  Please change the second statement to log at 'warn' as well.  This 
> seems like a problem that the user would like to know about.





[jira] [Commented] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels

2018-07-29 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561300#comment-16561300
 ] 

Prasanth Jayachandran commented on HIVE-20267:
--

[~zchovan] Thanks for the patch! Very useful!

Two minor changes:
 * Could you auto-refresh the page after clicking the submit button?
 * Could you also add this servlet to LlapWebServices?

Looks good otherwise. 

> Expanding WebUI to include form to dynamically config log levels 
> -
>
> Key: HIVE-20267
> URL: https://issues.apache.org/jira/browse/HIVE-20267
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Minor
> Attachments: HIVE-20267.1.patch
>
>
> To make it easier to change log levels at runtime, the WebUI can be extended 
> to interact with the Log4j2ConfiguratorServlet. The servlet can then be used 
> directly, and users/admins don't need to execute curl commands from the 
> command line.
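
For context, the curl workflow that the form would replace can be sketched in Java. This is only an illustration under assumptions that should be checked against the servlet's own documentation: the request body shape ({"loggers":[{"logger":...,"level":...}]}) and the JSON content type are assumed here, and the class name LogLevelRequest, its methods, and whatever endpoint URL is passed to post are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side helper; the request format is an assumption of this sketch. */
final class LogLevelRequest {
  private LogLevelRequest() {}

  /** Builds a body of the assumed form {"loggers":[{"logger":"<name>","level":"<LEVEL>"}]}. */
  static String buildBody(String logger, String level) {
    return "{\"loggers\":[{\"logger\":\"" + escape(logger)
        + "\",\"level\":\"" + escape(level) + "\"}]}";
  }

  /** Escapes backslashes and double quotes; control characters are not handled here. */
  private static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  /** POSTs the body to the given endpoint URL and returns the HTTP status code. */
  static int post(String endpoint, String body) throws IOException {
    HttpURLConnection conn =
        (HttpURLConnection) URI.create(endpoint).toURL().openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    try {
      return conn.getResponseCode();
    } finally {
      conn.disconnect();
    }
  }
}
```

A form in the WebUI would presumably collect the same two values (logger name and level) and submit them on the user's behalf.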





[jira] [Commented] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels

2018-07-29 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561301#comment-16561301
 ] 

Prasanth Jayachandran commented on HIVE-20267:
--

Also, please set the ticket to "Patch Available" so that it triggers the 
pre-commit tests. 

> Expanding WebUI to include form to dynamically config log levels 
> -
>
> Key: HIVE-20267
> URL: https://issues.apache.org/jira/browse/HIVE-20267
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Minor
> Attachments: HIVE-20267.1.patch
>
>
> To make it easier to change log levels at runtime, the WebUI can be extended 
> to interact with the Log4j2ConfiguratorServlet. The servlet can then be used 
> directly, and users/admins don't need to execute curl commands from the 
> command line.





[jira] [Commented] (HIVE-20262) Implement stats annotation rule for the UDTFOperator

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561281#comment-16561281
 ] 

Hive QA commented on HIVE-20262:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933527/HIVE-20262.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14816 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12935/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12935/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12935/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933527 - PreCommit-HIVE-Build

> Implement stats annotation rule for the UDTFOperator
> 
>
> Key: HIVE-20262
> URL: https://issues.apache.org/jira/browse/HIVE-20262
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch
>
>
> User Defined Table Functions (UDTFs) change the number of rows of the output. 
> A common UDTF is the explode() function, which creates a row for each element 
> of each array in the input column.
>  
> Right now, the estimated number of output rows is equal to the number of input 
> rows. But if the average number of output rows per input row is bigger than 1, 
> the resulting number of rows is underestimated in the execution plan.
>  
> Implement a rule that takes a factor X as a parameter and, for each UDTF, 
> predicts that:
>  
> {code:java}
> number of output rows = X * number of input rows{code}
>  
>  
>  
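
The proposed estimate can be illustrated with a small, self-contained Java sketch. It is not the code from the attached patches: the class UdtfRowEstimate and the method estimateOutputRows are hypothetical names, and the choices to truncate the product and to saturate at Long.MAX_VALUE are assumptions of this example.

```java
/** Hypothetical helper illustrating the proposed estimate; not Hive's actual stats rule. */
final class UdtfRowEstimate {
  private UdtfRowEstimate() {}

  /**
   * Estimates the number of UDTF output rows as factor * inputRows.
   * The result is truncated to a whole number and saturates at Long.MAX_VALUE.
   */
  static long estimateOutputRows(long inputRows, double factor) {
    if (inputRows < 0) {
      throw new IllegalArgumentException("inputRows must be non-negative");
    }
    if (Double.isNaN(factor) || factor < 0) {
      throw new IllegalArgumentException("factor must be a non-negative number");
    }
    double estimate = inputRows * factor;
    // Guard against overflow when the product exceeds the long range.
    return estimate >= (double) Long.MAX_VALUE ? Long.MAX_VALUE : (long) estimate;
  }
}
```

With a factor of 1, the sketch reproduces the current behavior described above, where the output row count equals the input row count.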





[jira] [Commented] (HIVE-20262) Implement stats annotation rule for the UDTFOperator

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561276#comment-16561276
 ] 

Hive QA commented on HIVE-20262:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} common in master has 64 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
11s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12935/dev-support/hive-personality.sh
 |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12935/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Implement stats annotation rule for the UDTFOperator
> 
>
> Key: HIVE-20262
> URL: https://issues.apache.org/jira/browse/HIVE-20262
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch
>
>
> User Defined Table Functions (UDTFs) change the number of rows of the output. 
> A common UDTF is the explode() function, which creates a row for each element 
> of each array in the input column.
>  
> Right now, the estimated number of output rows is equal to the number of input 
> rows. But if the average number of output rows per input row is bigger than 1, 
> the resulting number of rows is underestimated in the execution plan.
>  
> Implement a rule that takes a factor X as a parameter and, for each UDTF, 
> predicts that:
>  
> {code:java}
> number of output rows = X * number of input rows{code}
>  
>  
>  





[jira] [Updated] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels

2018-07-29 Thread Zoltan Chovan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Chovan updated HIVE-20267:
-
Attachment: HIVE-20267.1.patch

> Expanding WebUI to include form to dynamically config log levels 
> -
>
> Key: HIVE-20267
> URL: https://issues.apache.org/jira/browse/HIVE-20267
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Minor
> Attachments: HIVE-20267.1.patch
>
>
> To make it easier to change log levels at runtime, the WebUI can be extended 
> to interact with the Log4j2ConfiguratorServlet. The servlet can then be used 
> directly, and users/admins don't need to execute curl commands from the 
> command line.





[jira] [Assigned] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels

2018-07-29 Thread Zoltan Chovan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Chovan reassigned HIVE-20267:



> Expanding WebUI to include form to dynamically config log levels 
> -
>
> Key: HIVE-20267
> URL: https://issues.apache.org/jira/browse/HIVE-20267
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Minor
>
> To make it easier to change log levels at runtime, the WebUI can be extended 
> to interact with the Log4j2ConfiguratorServlet. The servlet can then be used 
> directly, and users/admins don't need to execute curl commands from the 
> command line.





[jira] [Commented] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561271#comment-16561271
 ] 

Hive QA commented on HIVE-20263:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933526/HIVE-20263.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14815 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12934/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12934/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12934/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933526 - PreCommit-HIVE-Build

> Typo in HiveReduceExpressionsWithStatsRule variable
> ---
>
> Key: HIVE-20263
> URL: https://issues.apache.org/jira/browse/HIVE-20263
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20263.patch, HIVE-20263.patch
>
>






[jira] [Commented] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable

2018-07-29 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561269#comment-16561269
 ] 

Ashutosh Chauhan commented on HIVE-20263:
-

+1

> Typo in HiveReduceExpressionsWithStatsRule variable
> ---
>
> Key: HIVE-20263
> URL: https://issues.apache.org/jira/browse/HIVE-20263
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20263.patch, HIVE-20263.patch
>
>






[jira] [Commented] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561263#comment-16561263
 ] 

Hive QA commented on HIVE-20263:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
6s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12934/dev-support/hive-personality.sh
 |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12934/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Typo in HiveReduceExpressionsWithStatsRule variable
> ---
>
> Key: HIVE-20263
> URL: https://issues.apache.org/jira/browse/HIVE-20263
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20263.patch, HIVE-20263.patch
>
>






[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator

2018-07-29 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20262:
-
Attachment: HIVE-20262.2.patch
Status: Patch Available  (was: Open)

> Implement stats annotation rule for the UDTFOperator
> 
>
> Key: HIVE-20262
> URL: https://issues.apache.org/jira/browse/HIVE-20262
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch
>
>
> User Defined Table Functions (UDTFs) change the number of rows of the output. 
> A common UDTF is the explode() function, which creates a row for each element 
> of each array in the input column.
>  
> Right now, the estimated number of output rows is equal to the number of input 
> rows. But if the average number of output rows per input row is bigger than 1, 
> the resulting number of rows is underestimated in the execution plan.
>  
> Implement a rule that takes a factor X as a parameter and, for each UDTF, 
> predicts that:
>  
> {code:java}
> number of output rows = X * number of input rows{code}
>  
>  
>  





[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator

2018-07-29 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20262:
-
Status: Open  (was: Patch Available)

> Implement stats annotation rule for the UDTFOperator
> 
>
> Key: HIVE-20262
> URL: https://issues.apache.org/jira/browse/HIVE-20262
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch
>
>
> User Defined Table Functions (UDTFs) change the number of rows of the output. 
> A common UDTF is the explode() function, which creates a row for each element 
> of each array in the input column.
>  
> Right now, the estimated number of output rows is equal to the number of input 
> rows. But if the average number of output rows per input row is bigger than 1, 
> the resulting number of rows is underestimated in the execution plan.
>  
> Implement a rule that takes a factor X as a parameter and, for each UDTF, 
> predicts that:
>  
> {code:java}
> number of output rows = X * number of input rows{code}
>  
>  
>  





[jira] [Updated] (HIVE-20266) Extra column is being shuffled in cbo as compared to non-cbo

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20266:
---
Description: 
{code:sql}
CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING 
NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
{code}

{code:sql}
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, 
value, key as p1, 3 as p2 from src limit 10;
{code}

*Without CBO*
{noformat}
 Map 1
Map Operator Tree:
TableScan
  alias: src
  Statistics: Num rows: 2500 Data size: 26560 Basic stats: 
COMPLETE Column stats: NONE
  Select Operator
expressions: key (type: string), value (type: string), 
value (type: string), key (type: string), 3 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
Statistics: Num rows: 2500 Data size: 26560 Basic stats: 
COMPLETE Column stats: NONE
Limit
  Number of rows: 10
  Statistics: Num rows: 10 Data size: 100 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
sort order:
Statistics: Num rows: 10 Data size: 100 Basic stats: 
COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type: 
string), _col2 (type: string), _col3 (type: string), _col4 (type: int)
{noformat}

*With CBO*
{noformat}
Map 1
Map Operator Tree:
TableScan
  alias: src
  Statistics: Num rows: 2500 Data size: 26560 Basic stats: 
COMPLETE Column stats: NONE
  Select Operator
expressions: key (type: string), value (type: string), 
value (type: string), key (type: string)
outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 2500 Data size: 26560 Basic stats: 
COMPLETE Column stats: NONE
Limit
  Number of rows: 10
  Statistics: Num rows: 10 Data size: 100 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
sort order:
Statistics: Num rows: 10 Data size: 100 Basic stats: 
COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type: 
string), _col2 (type: string), _col3 (type: string)
{noformat}

CBO has 4 columns being shuffled as compared to 3 in non-cbo

  was:
{code:sql}
CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING 
NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
{code}

{code:sql}
explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, 
value, key as p1, 3 as p2 from src limit 10;
{code}


> Extra column is being shuffled in cbo as compared to non-cbo
> 
>
> Key: HIVE-20266
> URL: https://issues.apache.org/jira/browse/HIVE-20266
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> {code:sql}
> CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING 
> NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
> {code}
> {code:sql}
> explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, 
> value, key as p1, 3 as p2 from src limit 10;
> {code}
> *Without CBO*
> {noformat}
>  Map 1
> Map Operator Tree:
> TableScan
>   alias: src
>   Statistics: Num rows: 2500 Data size: 26560 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: key (type: string), value (type: string), 
> value (type: string), key (type: string), 3 (type: int)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4
> Statistics: Num rows: 2500 Data size: 26560 Basic stats: 
> COMPLETE Column stats: NONE
> Limit
>   Number of rows: 10
>   Statistics: Num rows: 10 Data size: 100 Basic stats: 
> COMPLETE Column stats: NONE
>   Reduce Output Operator
> sort order:
> Statistics: Num rows: 10 Data size: 100 Basic stats: 
> COMPLETE Column stats: NONE
> value expressions: _col0 (type: string), _col1 (type: 
> string), _col2 (type: string), _col3 (type: string), _col4 (type: int)
> {noformat}
> *With CBO*
> {noformat}
> Map 1
> Map Operator Tree:
>

[jira] [Assigned] (HIVE-20266) Extra column is being shuffled in cbo as compared to non-cbo

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-20266:
--


> Extra column is being shuffled in cbo as compared to non-cbo
> 
>
> Key: HIVE-20266
> URL: https://issues.apache.org/jira/browse/HIVE-20266
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> {code:sql}
> CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING 
> NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE);
> {code}
> {code:sql}
> explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, 
> value, key as p1, 3 as p2 from src limit 10;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20265) PTF operator has an extra reducer in CBO

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-20265:
--


> PTF operator has an extra reducer in CBO
> 
>
> Key: HIVE-20265
> URL: https://issues.apache.org/jira/browse/HIVE-20265
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> {code:sql}
> explain vectorization detail
> select p_mfgr, p_name, p_size, 
> min(p_retailprice),
> rank() over(distribute by p_mfgr sort by p_name)as r,
> dense_rank() over(distribute by p_mfgr sort by p_name) as dr,
> p_size, p_size - lag(p_size,1,p_size) over(distribute by p_mfgr sort by 
> p_name) as deltaSz
> from part
> group by p_mfgr, p_name, p_size
> {code}
> The above query generates an extra reducer when CBO is enabled.





[jira] [Updated] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable

2018-07-29 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20263:
---
Attachment: HIVE-20263.patch

> Typo in HiveReduceExpressionsWithStatsRule variable
> ---
>
> Key: HIVE-20263
> URL: https://issues.apache.org/jira/browse/HIVE-20263
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20263.patch, HIVE-20263.patch
>
>






[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19770:
---
  Resolution: Fixed
   Fix Version/s: 4.0.0
Target Version/s: 4.0.0  (was: 3.1.0)
  Status: Resolved  (was: Patch Available)

Pushed to master. Thanks for reviewing [~ashutoshc]

> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, 
> HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, 
> HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 





[jira] [Commented] (HIVE-20262) Implement stats annotation rule for the UDTFOperator

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561239#comment-16561239
 ] 

Hive QA commented on HIVE-20262:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933525/HIVE-20262.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12933/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12933/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12933/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-29 19:07:39.248
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12933/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-29 19:07:39.251
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2183424 HIVE-19181 : Remove BreakableService (unused class) 
(Anurag Mantripragada via Thejas Nair)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
Removing standalone-metastore/src/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2183424 HIVE-19181 : Remove BreakableService (unused class) 
(Anurag Mantripragada via Thejas Nair)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-29 19:07:40.248
+ rm -rf ../yetus_PreCommit-HIVE-Build-12933
+ mkdir ../yetus_PreCommit-HIVE-Build-12933
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12933
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12933/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: git apply -p0
/data/hiveptest/working/scratch/build.patch:137: new blank line at EOF.
+
/data/hiveptest/working/scratch/build.patch:399: new blank line at EOF.
+
warning: 2 lines add whitespace errors.
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: executing: [/tmp/protoc1788037483674454778.exe, --version]
libprotoc 2.5.0
protoc-jar: executing: [/tmp/protoc1788037483674454778.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/target/generated-sources,
 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process 
(process-resource-bundles) on project hive-upgrade-acid: Execution 
process-resource-bundles of goal 
org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process failed. 
ConcurrentModificationException -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hive-upgrade-acid
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-1

[jira] [Commented] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561238#comment-16561238
 ] 

Hive QA commented on HIVE-19770:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933524/HIVE-19770.8.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14815 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12932/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12932/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12932/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933524 - PreCommit-HIVE-Build

> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, 
> HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, 
> HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 





[jira] [Commented] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561223#comment-16561223
 ] 

Hive QA commented on HIVE-19770:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
14s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 13 new + 199 unchanged - 1 
fixed = 212 total (was 200) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12932/dev-support/hive-personality.sh
 |
| git revision | master / 2183424 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12932/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12932/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, 
> HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, 
> HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 





[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator

2018-07-29 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20262:
-
Attachment: HIVE-20262.1.patch
Status: Patch Available  (was: Open)

> Implement stats annotation rule for the UDTFOperator
> 
>
> Key: HIVE-20262
> URL: https://issues.apache.org/jira/browse/HIVE-20262
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20262.1.patch, HIVE-20262.patch
>
>
> User Defined Table Functions (UDTFs) change the number of rows of the output. 
> A common UDTF is the explode() function, which creates a row for each element of 
> each array in the input column.
>  
> Right now, the estimated number of output rows is equal to the number of input rows. 
> But if the average number of output rows per input row is greater than 1, the resulting 
> number of rows is underestimated in the execution plan.
>  
> Implement a rule that takes a factor X as a parameter and, for each UDTF, 
> predicts that:
>  
> {code:java}
> number of output rows = X * number of input rows{code}
>  
>  
>  
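The estimate proposed in the description can be sketched in a few lines of Java. This is only a toy model under stated assumptions: the class name UdtfRowEstimate and the method estimateOutputRows are hypothetical and are not part of Hive's stats-annotation code, and rounding up and saturating at Long.MAX_VALUE are choices made here for illustration.

```java
// Toy model of the proposed rule; names are hypothetical, not Hive APIs.
final class UdtfRowEstimate {
    private UdtfRowEstimate() {}

    /**
     * Estimates the number of rows a UDTF emits as factor * inputRows.
     * The result is rounded up and saturates at Long.MAX_VALUE.
     */
    static long estimateOutputRows(long inputRows, double factor) {
        if (inputRows < 0 || factor < 0 || Double.isNaN(factor)) {
            throw new IllegalArgumentException("inputRows and factor must be non-negative");
        }
        double estimate = Math.ceil(inputRows * factor);
        // A double above Long.MAX_VALUE narrows to Long.MAX_VALUE when cast to long.
        return (long) estimate;
    }

    public static void main(String[] args) {
        // A factor of 1.0 reproduces the current behaviour (output rows = input rows).
        System.out.println(estimateOutputRows(2500, 1.0)); // 2500
        // A factor of 4.0 models explode() over arrays averaging four elements.
        System.out.println(estimateOutputRows(2500, 4.0)); // 10000
    }
}
```

With a factor of 1.0 the sketch matches the behaviour the issue describes as current, so the factor could presumably default to 1.0 without changing existing plans.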





[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator

2018-07-29 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20262:
-
Status: Open  (was: Patch Available)

> Implement stats annotation rule for the UDTFOperator
> 
>
> Key: HIVE-20262
> URL: https://issues.apache.org/jira/browse/HIVE-20262
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20262.patch
>
>
> User Defined Table Functions (UDTFs) change the number of rows of the output. 
> A common UDTF is the explode() function, which creates a row for each element of 
> each array in the input column.
>  
> Right now, the estimated number of output rows is equal to the number of input rows. 
> But if the average number of output rows per input row is greater than 1, the resulting 
> number of rows is underestimated in the execution plan.
>  
> Implement a rule that takes a factor X as a parameter and, for each UDTF, 
> predicts that:
>  
> {code:java}
> number of output rows = X * number of input rows{code}
>  
>  
>  





[jira] [Commented] (HIVE-17683) Add explain locks command

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561214#comment-16561214
 ] 

Hive QA commented on HIVE-17683:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933520/HIVE-17683-branch-3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 14413 tests 
executed
*Failed tests:*
{noformat}
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=258)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=258)
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=258)
TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=258)
TestTezPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=258)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=70)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_with_masking]
 (batchId=174)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_semijoin]
 (batchId=187)
org.apache.hadoop.hive.ql.TestWarehouseExternalDir.testManagedPaths 
(batchId=235)
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testLockingOnInsertIntoNonNativeTables
 (batchId=306)
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation 
(batchId=243)
org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=310)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12931/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12931/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12931/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933520 - PreCommit-HIVE-Build

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683-branch-3.0.patch, HIVE-17683-branch-3.patch, 
> HIVE-17683.01.patch, HIVE-17683.02.patch, HIVE-17683.03.patch, 
> HIVE-17683.04.patch, HIVE-17683.05.patch, HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.





[jira] [Updated] (HIVE-20215) Hive unable to plan/compile query containing subquery with multiple same name columns

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20215:
---
Attachment: (was: HIVE-19770.8.patch)

> Hive unable to plan/compile query containing subquery with multiple same name 
> columns
> -
>
> Key: HIVE-20215
> URL: https://issues.apache.org/jira/browse/HIVE-20215
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> *Reproducer*
> ==
> {code:sql}
> >create table t1(c1 int)
> >explain select count(*) from (select c1, c1 from t1) subq
> {code}
> {noformat}
> FAILED: SemanticException [Error 10007]: Ambiguous column reference c1 in subq
> {noformat}
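The error in the reproducer arises because the subquery exposes two output columns with the same name. The following Java sketch illustrates that kind of duplicate-name check. It is not Hive's analyzer code: the class DuplicateColumnCheck and the method firstDuplicate are hypothetical, and case-insensitive comparison of column names is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper; illustrates the ambiguity, not Hive's implementation.
final class DuplicateColumnCheck {
    private DuplicateColumnCheck() {}

    /** Returns the first output column name that repeats an earlier one, if any. */
    static Optional<String> firstDuplicate(List<String> outputNames) {
        Set<String> seen = new HashSet<>();
        for (String name : outputNames) {
            // Assumption: column names are compared case-insensitively.
            String key = name.toLowerCase(Locale.ROOT);
            if (!seen.add(key)) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Output columns of "(select c1, c1 from t1) subq" from the reproducer.
        List<String> subq = Arrays.asList("c1", "c1");
        firstDuplicate(subq).ifPresent(
            name -> System.out.println("Ambiguous column reference " + name));
    }
}
```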





[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19770:
---
Status: Patch Available  (was: Open)

> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, 
> HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, 
> HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 





[jira] [Updated] (HIVE-20215) Hive unable to plan/compile query containing subquery with multiple same name columns

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20215:
---
Attachment: HIVE-19770.8.patch

> Hive unable to plan/compile query containing subquery with multiple same name 
> columns
> -
>
> Key: HIVE-20215
> URL: https://issues.apache.org/jira/browse/HIVE-20215
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> *Reproducer*
> ==
> {code:sql}
> >create table t1(c1 int)
> >explain select count(*) from (select c1, c1 from t1) subq
> {code}
> {noformat}
> FAILED: SemanticException [Error 10007]: Ambiguous column reference c1 in subq
> {noformat}





[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19770:
---
Attachment: HIVE-19770.8.patch

> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, 
> HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, 
> HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 





[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select

2018-07-29 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19770:
---
Status: Open  (was: Patch Available)

> Support for CBO for queries with multiple same columns in select
> 
>
> Key: HIVE-19770
> URL: https://issues.apache.org/jira/browse/HIVE-19770
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, 
> HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, 
> HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch
>
>
> Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} 
> are not supported for CBO. 





[jira] [Assigned] (HIVE-20264) Bootstrap repl dump with concurrent write and drop of ACID table makes target inconsistent.

2018-07-29 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan reassigned HIVE-20264:
---


> Bootstrap repl dump with concurrent write and drop of ACID table makes target 
> inconsistent.
> ---
>
> Key: HIVE-20264
> URL: https://issues.apache.org/jira/browse/HIVE-20264
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, replication
>
> During bootstrap dump of ACID tables, consider the following sequence:
>  - Get lastReplId = last event ID logged.
>  - Current session (Thread-1), REPL DUMP -> Open txn (Txn1) - Event-10
>  - Another session (Thread-2), Open txn (Txn2) - Event-11
>  - Thread-2 -> Insert data (T1.D1) into the ACID table. - Event-12
>  - Thread-2 -> Commit txn (Txn2) - Event-13
>  - Thread-2 -> Drop table (T1) - Event-14
>  - Thread-1 -> Dump ACID tables using the validTxnList based on Txn1. --> This 
> step skips all data written by txns > Txn1, so T1 will be missing.
>  - Thread-1 -> Commit txn (Txn1)
>  - REPL LOAD from the bootstrap dump will skip T1.
>  - Incremental REPL DUMP will start from Event-10 and hence allocate a write ID 
> for table T1, while drop table (T1) is idempotent. So entries for T1 remain 
> in the TXN_TO_WRITE_ID and NEXT_WRITE_ID metastore tables at the target.
>  - Now, if we create another table at the source with the same name T1 and 
> replicate it, readers at the target may see incorrect data for T1.
> A couple of proposals:
> 1. Make allocate write ID idempotent. This is not possible, because the table 
> doesn't exist and an MM table import may also allocate a write ID before 
> creating the table, so these two cases cannot be distinguished.
> 2. Make the drop table event drop entries from the TXN_TO_WRITE_ID and 
> NEXT_WRITE_ID tables irrespective of whether the table exists at the target.
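The second proposal can be illustrated with a small in-memory model. This is a sketch under heavy assumptions: the class TargetWriteIdState and its methods are hypothetical, a single map stands in for the NEXT_WRITE_ID metastore table, and write IDs are assumed to start at 1. It does not reflect the actual metastore schema or replication code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of target-side state; all names here are hypothetical.
final class TargetWriteIdState {
    // Stand-in for NEXT_WRITE_ID: table name -> next write ID to hand out.
    private final Map<String, Long> nextWriteId = new HashMap<>();
    // Tables that currently exist at the target.
    private final Set<String> tables = new HashSet<>();

    /** Allocates a write ID even if the table does not exist yet, as in the scenario. */
    long allocateWriteId(String table) {
        long id = nextWriteId.getOrDefault(table, 1L);
        nextWriteId.put(table, id + 1);
        return id;
    }

    void createTable(String table) {
        tables.add(table);
    }

    /** Proposal 2: clear write-ID entries whether or not the table exists. */
    void dropTable(String table) {
        tables.remove(table);      // no-op if the table is absent (idempotent drop)
        nextWriteId.remove(table); // cleanup happens unconditionally
    }

    boolean hasWriteIdEntries(String table) {
        return nextWriteId.containsKey(table);
    }

    public static void main(String[] args) {
        TargetWriteIdState target = new TargetWriteIdState();
        target.allocateWriteId("T1"); // replayed from the incremental dump
        target.dropTable("T1");       // T1 was never loaded at the target
        System.out.println(target.hasWriteIdEntries("T1")); // false
    }
}
```

In this model a table later recreated under the same name starts from a clean write-ID sequence, which is the property the second proposal aims for.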





[jira] [Commented] (HIVE-17683) Add explain locks command

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561151#comment-16561151
 ] 

Hive QA commented on HIVE-17683:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 17s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-12931/patches/PreCommit-HIVE-Build-12931.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12931/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683-branch-3.0.patch, HIVE-17683-branch-3.patch, 
> HIVE-17683.01.patch, HIVE-17683.02.patch, HIVE-17683.03.patch, 
> HIVE-17683.04.patch, HIVE-17683.05.patch, HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.





[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-29 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-17683:
-
Attachment: HIVE-17683-branch-3.patch

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683-branch-3.0.patch, HIVE-17683-branch-3.patch, 
> HIVE-17683.01.patch, HIVE-17683.02.patch, HIVE-17683.03.patch, 
> HIVE-17683.04.patch, HIVE-17683.05.patch, HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.





[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-29 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-17683:
-
Attachment: HIVE-17683-branch-3.0.patch

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683-branch-3.0.patch, HIVE-17683.01.patch, 
> HIVE-17683.02.patch, HIVE-17683.03.patch, HIVE-17683.04.patch, 
> HIVE-17683.05.patch, HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.





[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-29 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-17683:
-
Attachment: (was: HIVE-17683.01-branch-3.0.patch)

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.





[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-29 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-17683:
-
Attachment: (was: HIVE-17683-branch-3.patch)

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.





[jira] [Commented] (HIVE-17683) Add explain locks command

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561132#comment-16561132
 ] 

Hive QA commented on HIVE-17683:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933518/HIVE-17683-branch-3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12929/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12929/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12929/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-29 15:25:27.128
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12929/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z branch-3 ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-29 15:25:27.131
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2183424 HIVE-19181 : Remove BreakableService (unused class) 
(Anurag Mantripragada via Thejas Nair)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout branch-3
Switched to branch 'branch-3'
Your branch is behind 'origin/branch-3' by 5 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/branch-3
HEAD is now at 150ef3b HIVE-19829: Incremental replication load should create 
tasks in execution phase rather than semantic phase (Mahesh Kumar Behera, 
reviewed by Sankar Hariappan)
+ git merge --ff-only origin/branch-3
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-29 15:25:30.572
+ rm -rf ../yetus_PreCommit-HIVE-Build-12929
+ mkdir ../yetus_PreCommit-HIVE-Build-12929
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12929
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12929/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:46
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java' with 
conflicts.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:419
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java' with 
conflicts.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java:69
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java' 
cleanly.
Going to apply patch with: git apply -p0
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:46
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java' with 
conflicts.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:419
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java' with 
conflicts.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java:69
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java' 
cleanly.
U ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
U ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-12929
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933518 - PreCommit-HIVE-Build

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apac

[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-29 Thread Igor Kryvenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-17683:
-
Attachment: HIVE-17683-branch-3.patch

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683-branch-3.patch, 
> HIVE-17683.01-branch-3.0.patch, HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.





[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561115#comment-16561115
 ] 

Hive QA commented on HIVE-20245:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933511/HIVE-20245.04.patch

{color:green}SUCCESS:{color} +1 due to 14 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14831 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12928/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12928/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12928/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933511 - PreCommit-HIVE-Build

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, 
> HIVE-20245.03.patch, HIVE-20245.04.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.
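
The kind of test the description has in mind can be sketched as follows: generate random batches, some flagged isRepeating (where only entry 0 is meaningful), and compare a batch-at-a-time BETWEEN against a row-at-a-time reference that returns NULL for NULL input. This is a minimal self-contained sketch, not the patch's test code: BetweenRepeatingSketch and LongBatch are hypothetical toy types loosely modeled on a vectorized long column, and the batch size, value range, and NULL rate are arbitrary assumptions.

```java
import java.util.Objects;
import java.util.Random;

/** Hypothetical sketch; LongBatch is a toy type, not a Hive class. */
final class BetweenRepeatingSketch {

    /** Toy column batch with per-entry NULL flags and a repeating flag. */
    static final class LongBatch {
        final long[] vector;
        final boolean[] isNull;
        final int size;
        boolean isRepeating; // when true, only entry 0 is meaningful

        LongBatch(int size) {
            this.size = size;
            this.vector = new long[size];
            this.isNull = new boolean[size];
        }
    }

    /** Row-mode reference: BETWEEN on a NULL input yields NULL (Java null here). */
    static Boolean betweenRow(Long v, long lo, long hi) {
        return v == null ? null : (v >= lo && v <= hi);
    }

    /** Batch evaluation that honors isRepeating by reading only entry 0. */
    static Boolean[] betweenBatch(LongBatch b, long lo, long hi) {
        Boolean[] out = new Boolean[b.size];
        for (int i = 0; i < b.size; i++) {
            int src = b.isRepeating ? 0 : i;
            out[i] = b.isNull[src] ? null : (b.vector[src] >= lo && b.vector[src] <= hi);
        }
        return out;
    }

    /** Random batch; about one entry in four is NULL. Entries past 0 hold junk when repeating. */
    static LongBatch randomBatch(Random r, int size, boolean repeating) {
        LongBatch b = new LongBatch(size);
        b.isRepeating = repeating;
        for (int i = 0; i < size; i++) {
            b.isNull[i] = r.nextInt(4) == 0;
            b.vector[i] = r.nextInt(100);
        }
        return b;
    }

    /** True when the batch result equals the row-mode reference on every logical row. */
    static boolean matchesRowMode(LongBatch b, long lo, long hi) {
        Boolean[] got = betweenBatch(b, lo, hi);
        for (int i = 0; i < b.size; i++) {
            int src = b.isRepeating ? 0 : i;
            Long v = b.isNull[src] ? null : Long.valueOf(b.vector[src]);
            if (!Objects.equals(betweenRow(v, lo, hi), got[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Random r = new Random(12345L);
        for (int k = 0; k < 100; k++) {
            LongBatch b = randomBatch(r, 16, k % 2 == 0);
            if (!matchesRowMode(b, 10, 60)) {
                throw new AssertionError("mismatch in batch " + k);
            }
        }
        System.out.println("all batches matched");
    }
}
```

Filling the non-zero entries of a repeating batch with junk is deliberate: an implementation that forgets the isRepeating flag reads that junk and fails the comparison.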





[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561100#comment-16561100
 ] 

Hive QA commented on HIVE-20245:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
19s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
56s{color} | {color:red} ql in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
57s{color} | {color:red} ql: The patch generated 298 new + 1522 unchanged - 34 
fixed = 1820 total (was 1556) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} vector-code-gen: The patch generated 8 new + 322 
unchanged - 0 fixed = 330 total (was 322) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
33s{color} | {color:red} ql generated 15 new + 2292 unchanged - 5 fixed = 2307 
total (was 2297) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Redundant nullcheck of filterExpr, which is known to be non-null in 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, 
List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, 
DataTypePhysicalVariation)  Redundant null check at 
VectorizationContext.java:[line 1640] |
|  |  Redundant nullcheck of vectorExpression, which is known to be non-null in 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, 
List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, 
DataTypePhysicalVariation)  Redundant null check at 
VectorizationContext.java:[line 1687] |
|  |  Found reliance on default encoding in 
org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.create(int, 
Object, TypeInfo): String.getBytes()  At ConstantVectorExpression.java:[line 
210] |
|  |  Class 
org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColumnBetween 
defines non-transient non-se

[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20245:

Status: Patch Available  (was: In Progress)

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, 
> HIVE-20245.03.patch, HIVE-20245.04.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.





[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20245:

Attachment: HIVE-20245.04.patch

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, 
> HIVE-20245.03.patch, HIVE-20245.04.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.





[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20245:

Status: In Progress  (was: Patch Available)

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, 
> HIVE-20245.03.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.





[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561064#comment-16561064
 ] 

Hive QA commented on HIVE-20245:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933509/HIVE-20245.03.patch

{color:green}SUCCESS:{color} +1 due to 14 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 14831 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_10]
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_7] 
(batchId=89)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_8] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_10] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_7] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_8] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_casts] 
(batchId=87)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_annotate_stats_select]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf_inline]
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_10]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_7]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_8]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_short_regress]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_casts]
 (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=186)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_10]
 (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_7]
 (batchId=147)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_8]
 (batchId=114)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_10] 
(batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress]
 (batchId=131)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12927/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12927/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12927/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933509 - PreCommit-HIVE-Build

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, 
> HIVE-20245.03.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.





[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561055#comment-16561055
 ] 

Hive QA commented on HIVE-20245:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
12s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
56s{color} | {color:red} ql in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
58s{color} | {color:red} ql: The patch generated 297 new + 1522 unchanged - 34 
fixed = 1819 total (was 1556) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} vector-code-gen: The patch generated 8 new + 322 
unchanged - 0 fixed = 330 total (was 322) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
23s{color} | {color:red} ql generated 15 new + 2292 unchanged - 5 fixed = 2307 
total (was 2297) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Redundant nullcheck of filterExpr, which is known to be non-null in 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, 
List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, 
DataTypePhysicalVariation)  Redundant null check at 
VectorizationContext.java:[line 1640] |
|  |  Redundant nullcheck of vectorExpression, which is known to be non-null in 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, 
List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, 
DataTypePhysicalVariation)  Redundant null check at 
VectorizationContext.java:[line 1687] |
|  |  Found reliance on default encoding in 
org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.create(int, 
Object, TypeInfo): String.getBytes()  At ConstantVectorExpression.java:[line 
210] |
|  |  Class 
org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColumnBetween 
defines non-transient non-se

[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20245:

Attachment: HIVE-20245.03.patch

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, 
> HIVE-20245.03.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.





[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20245:

Status: Patch Available  (was: In Progress)

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, 
> HIVE-20245.03.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN

2018-07-29 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20245:

Status: In Progress  (was: Patch Available)

> Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
> --
>
> Key: HIVE-20245
> URL: https://issues.apache.org/jira/browse/HIVE-20245
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized BETWEEN and IN.
> Add missing vectorization classes for BETWEEN PROJECTION.


