[jira] [Commented] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561541#comment-16561541 ]

Hive QA commented on HIVE-20166:

| (/) *{color:green}+1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 41s{color} | {color:blue} serde in master has 195 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 54s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12939/dev-support/hive-personality.sh |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: serde U: serde |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12939/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement is a 'warn' level logging, the second is an 'info' level logging. Please change the second log to also be a 'warn'. This seems like it could be a problem that the user would like to know about.
[jira] [Commented] (HIVE-20209) Metastore connection fails for first attempt in repl dump.
[ https://issues.apache.org/jira/browse/HIVE-20209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561539#comment-16561539 ] mahesh kumar behera commented on HIVE-20209: code changes looks fine to me > Metastore connection fails for first attempt in repl dump. > -- > > Key: HIVE-20209 > URL: https://issues.apache.org/jira/browse/HIVE-20209 > Project: Hive > Issue Type: Bug > Components: HiveServer2, repl >Affects Versions: 3.0.0, 3.1.0, 4.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: DR, pull-request-available, replication > Attachments: HIVE-20209.01.patch > > > Run the following command: > {code:java} > repl dump `*` from 60758 with ('hive.repl.dump.metadata.only'='true', > 'hive.repl.dump.include.acid.tables'='true'); > {code} > See this in hs2.log: > {code:java} > 2018-07-10T18:07:32,308 INFO [HiveServer2-Handler-Pool: Thread-14380]: > conf.HiveConf (HiveConf.java:getLogIdVar(5061)) - Using the default value > passed in for log id: f1e13736-3f10-4abf-a29b-683b534dfa4c > 2018-07-10T18:07:32,309 INFO [HiveServer2-Handler-Pool: Thread-14380]: > session.SessionState (:()) - Updating thread name to > f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380 > 2018-07-10T18:07:32,311 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: operation.OperationManager (:()) - > Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, > getHandleIdentifier()=16eb1d07-e125-490c-8ab8-90192bfd459b] > 2018-07-10T18:07:32,314 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Compiling > command(queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b): > repl dump `*` from 60758 with ('hive.repl.dump.metadata.only'='true', > 'hive.repl.dump.include.acid.tables'='true') > 2018-07-10T18:07:32,317 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: 
metastore.HiveMetaStoreClient (:()) > - Trying to connect to metastore with URI > thrift://hwx-demo-2.field.hortonworks.com:9083 > 2018-07-10T18:07:32,317 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: metastore.HiveMetaStoreClient (:()) > - Opened a connection to metastore, current connections: 19 > 2018-07-10T18:07:32,319 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: metastore.HiveMetaStoreClient (:()) > - Connected to metastore. > 2018-07-10T18:07:32,319 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: metastore.RetryingMetaStoreClient > (:()) - RetryingMetaStoreClient proxy=class > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive > (auth:SIMPLE) retries=24 delay=5 lifetime=0 > 2018-07-10T18:07:32,439 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Semantic Analysis > Completed (retrial = false) > 2018-07-10T18:07:32,440 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Returning Hive > schema: Schema(fieldSchemas:[FieldSchema(name:dump_dir, type:string, > comment:from deserializer), FieldSchema(name:last_repl_id, type:string, > comment:from deserializer)], properties:null) > 2018-07-10T18:07:32,443 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: exec.ListSinkOperator (:()) - > Initializing operator LIST_SINK[0] > 2018-07-10T18:07:32,446 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Completed > compiling > command(queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b); > Time taken: 0.132 seconds > 2018-07-10T18:07:32,447 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: conf.HiveConf > (HiveConf.java:getLogIdVar(5061)) - Using the default value passed in for log > id: 
f1e13736-3f10-4abf-a29b-683b534dfa4c > 2018-07-10T18:07:32,448 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c > HiveServer2-Handler-Pool: Thread-14380]: session.SessionState (:()) - > Resetting thread name to HiveServer2-Handler-Pool: Thread-14380 > 2018-07-10T18:07:32,451 INFO [HiveServer2-Background-Pool: Thread-15161]: > reexec.ReExecDriver (:()) - Execution #1 of query > 2018-07-10T18:07:32,452 INFO [HiveServer2-Background-Pool: Thread-15161]: > lockmgr.DbTxnManager (:()) - Setting lock request transaction to txnid:30327 > for queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b > 2018-07-10T18:07:32,454 INFO [HiveServer2-Background-Pool: Thread-15161]: > lockmgr.DbLockManager (:()) - Requesting: > queryId=hive
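The RetryingMetaStoreClient line in the log above (retries=24 delay=5 lifetime=0) describes a plain bounded-retry wrapper around the connection attempt. A generic sketch of that pattern follows; it is illustrative only, not Hive's actual implementation, and the helper name is made up:

```java
import java.util.concurrent.Callable;

// Hypothetical bounded-retry helper illustrating the retries/delay behavior
// that RetryingMetaStoreClient logs; not the actual Hive code.
public class BoundedRetry {
    static <T> T call(Callable<T> op, int retries, long delayMillis) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            try {
                return op.call();        // success: return immediately
            } catch (Exception e) {
                last = e;                // remember the most recent failure
                if (attempt < retries) {
                    Thread.sleep(delayMillis); // back off before the next attempt
                }
            }
        }
        throw last; // every attempt failed: surface the last exception
    }
}
```

With retries=24 and a 5-second delay, a transient first-attempt connection failure like the one this ticket fixes would normally be absorbed by a later attempt rather than failing the repl dump outright.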
[jira] [Commented] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561469#comment-16561469 ]

Hive QA commented on HIVE-20220:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933540/HIVE-20220.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:green}SUCCESS:{color} +1 due to 14817 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12938/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12938/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12938/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933540 - PreCommit-HIVE-Build

> Incorrect result when hive.groupby.skewindata is enabled
> ---
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 3.0.0
> Reporter: Ganesha Shreedhara
> Assignee: Ganesha Shreedhara
> Priority: Major
> Attachments: HIVE-20220.patch
>
> hive.groupby.skewindata makes use of the rand UDF to randomly distribute grouped-by keys to the reducers and hence avoids overloading a single reducer when there is a skew in the data.
> This random distribution of keys is buggy when the reducer fails to fetch the mapper output due to a faulty datanode or any other reason. When the reducer finds that it can't fetch mapper output, it sends a signal to the Application Master to reattempt the corresponding map task. The reattempted map task will now get a different random value from the rand function, and hence the keys distributed to the reducers will not be the same as in the previous run.
>
> *Steps to reproduce:*
> create table test(id int);
> insert into test values (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9);
> SET hive.groupby.skewindata=true;
> SET mapreduce.job.reduces=2;
> // Add a debug port for the reducer
> select count(1) from test group by id;
> // Remove the mapper's intermediate output file once the map stage has completed and one of the 2 reduce tasks is complete, then continue the run. This causes the 2nd reducer to send an event to the Application Master to rerun the map task.
> The following is the expected result:
> 1
> 2
> 3
> 4
> 5
> 6
> 8
> 8
> 9
>
> But you may get a different result due to a different value returned by the rand function in the second run, causing a different distribution of keys.
> This needs to be fixed so that the mapper distributes the same keys even if it is reattempted multiple times.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
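The fix direction the ticket describes, producing the same key distribution on every reattempt, amounts to deriving the "random" spread from something that is stable across attempts. A hedged sketch of that idea (not the actual HIVE-20220 patch): seed the generator from the task id rather than the time-based default, so a re-executed map task reproduces the assignment of its first attempt:

```java
import java.util.Random;

// Illustrative only: choose each row's reducer from a seed that is identical
// across reattempts of the same map task (e.g. the task id without the attempt
// number), instead of relying on Random's default time-based seed.
public class StableShuffle {
    static int[] assignReducers(long taskId, int numRows, int numReducers) {
        Random rng = new Random(taskId); // fixed seed => reproducible sequence
        int[] reducers = new int[numRows];
        for (int i = 0; i < numRows; i++) {
            reducers[i] = rng.nextInt(numReducers);
        }
        return reducers;
    }
}
```

Because the seed ignores the attempt number, a reattempted task emits exactly the sequence its first attempt did, which is the determinism property the ticket asks for, while still spreading a skewed key across reducers.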
[jira] [Commented] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561455#comment-16561455 ]

Hive QA commented on HIVE-20220:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 31s{color} | {color:blue} common in master has 64 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 57s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s{color} | {color:red} common: The patch generated 2 new + 423 unchanged - 0 fixed = 425 total (was 423) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 1 new + 55 unchanged - 0 fixed = 56 total (was 55) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 44s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12938/dev-support/hive-personality.sh |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12938/yetus/diff-checkstyle-common.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12938/yetus/diff-checkstyle-ql.txt |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12938/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Incorrect result when hive.groupby.skewindata is enabled
> ---
>
> Key: HIVE-20220
> URL: https://issues.apache.org/jira/browse/HIVE-20220
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 3.0.0
> Reporter: Ganesha Shreedhara
> Assignee: Ganesha Shreedhara
> Priority: Major
> Attachments: HIVE-20220.patch
>
> hive.groupby.skewindata makes use of the rand UDF to randomly distribute grouped-by keys to the reducers and hence avoids overloading a single reducer when there is a skew in the data.
> This random distribution of keys is buggy when the reducer fails to fetch the mapper output due to a faulty datanode or any other reason.
[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561443#comment-16561443 ]

Hive QA commented on HIVE-20225:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12933537/HIVE-20225.1.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14833 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz] (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_joins] (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_masking] (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1] (batchId=193)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12937/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12937/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12937/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12933537 - PreCommit-HIVE-Build

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
> Issue Type: New Feature
> Components: Serializers/Deserializers
> Reporter: Lu Li
> Assignee: Lu Li
> Priority: Major
> Attachments: HIVE-20225.1.patch
>
> When using TPT/BTEQ to export/import data from Teradata, Teradata will generate/require binary files based on the schema.
> A customized SerDe is needed in order to directly read these files from Hive or write these files in order to load them back to TD.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
>   ...)
> PARTITIONED BY (
>   ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;
> {code}
> Problem Statement:
> Right now the fast way to export/import data from Teradata is using TPT. However, Hive cannot directly utilize/generate these binary files because it doesn't have a SerDe for them.
> Result:
> Provided with the SerDe, Hive can operate upon/generate the exported Teradata Binary Format files transparently.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-20225:
--
    Status: Open  (was: Patch Available)

Hi [~luli], automated testing caught new checkstyle, findbugs, and missing-license-header problems with the patch (see results above). Please fix these issues and then resubmit the patch for review. Thanks!

> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
> Issue Type: New Feature
> Components: Serializers/Deserializers
> Reporter: Lu Li
> Assignee: Lu Li
> Priority: Major
> Attachments: HIVE-20225.1.patch
>
> When using TPT/BTEQ to export/import data from Teradata, Teradata will generate/require binary files based on the schema.
> A customized SerDe is needed in order to directly read these files from Hive or write these files in order to load them back to TD.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
>   ...)
> PARTITIONED BY (
>   ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;
> {code}
> Problem Statement:
> Right now the fast way to export/import data from Teradata is using TPT. However, Hive cannot directly utilize/generate these binary files because it doesn't have a SerDe for them.
> Result:
> Provided with the SerDe, Hive can operate upon/generate the exported Teradata Binary Format files transparently.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anurag Mantripragada updated HIVE-20166:
--
    Attachment: HIVE-20166.1.patch
        Status: Patch Available  (was: Open)

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement is a 'warn' level logging, the second is an 'info' level logging. Please change the second log to also be a 'warn'. This seems like it could be a problem that the user would like to know about.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anurag Mantripragada updated HIVE-20166:
--
    Attachment: (was: HIVE-20166.1.patch)

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement is a 'warn' level logging, the second is an 'info' level logging. Please change the second log to also be a 'warn'. This seems like it could be a problem that the user would like to know about.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561418#comment-16561418 ]

Hive QA commented on HIVE-20225:

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 23s{color} | {color:blue} contrib in master has 13 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 10s{color} | {color:red} contrib: The patch generated 99 new + 0 unchanged - 0 fixed = 99 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 33s{color} | {color:red} contrib generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 12s{color} | {color:red} The patch generated 11 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 55s{color} | {color:black} {color} |

|| Reason || Tests ||
| FindBugs | module:contrib |
| | Inconsistent synchronization of org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryRecordReader.pos; locked 81% of time. Unsynchronized access at TeradataBinaryRecordReader.java:[line 206] |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12937/dev-support/hive-personality.sh |
| git revision | master / 83e5397 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus/diff-checkstyle-contrib.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus/new-findbugs-contrib.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus/patch-asflicense-problems.txt |
| modules | C: contrib U: contrib |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12937/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
> SerDe to support Teradata Binary Format > --- > > Key: HIVE-20225 > URL: https://issues.apache.org/jira/browse/HIVE-20225 > Project: Hive > Issue Type: New Feature > Components: Serializers/Deserializers >Reporter: Lu Li >Assignee: Lu Li >Priority: Major > Attachments: HIVE-20225.1.patch > > > When using TPT/BTEQ to export/import Data from Teradata, Teradata will > generate/require binary files based on the schema. > A Customized SerDe is needed in order to directly read these files from Hive > or write these files in order to load back to TD. > {code:java} > CREATE EXTERNAL TABLE `TABLE1`( > ...) > PARTITIONED BY ( > ...) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde' > STORED AS INPUTFORMAT > > 'org.apache.hadoop.hive.contrib.fileformat.teradata.T
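The FindBugs hit reported above ("Inconsistent synchronization of ...TeradataBinaryRecordReader.pos; locked 81% of time") flags a field that is usually, but not always, accessed while holding a lock. A generic sketch of the fix pattern follows; this is illustrative only, with a made-up class, and is not the actual TeradataBinaryRecordReader code:

```java
// Illustrative fix pattern for FindBugs' IS2_INCONSISTENT_SYNC: if writes to a
// field happen under a lock, reads of that field must take the same lock, or a
// reading thread may observe a stale or torn value.
public class PositionTracker {
    private long pos; // guarded by "this"

    synchronized void advance(long bytes) {
        pos += bytes;
    }

    // An unsynchronized variant, "long getPos() { return pos; }", is the kind
    // of access the checker flags; taking the same lock resolves the warning
    // and makes the read safe under the Java memory model.
    synchronized long getPos() {
        return pos;
    }
}
```

The alternative for a single primitive counter like this would be `AtomicLong` or a `volatile` field, trading the explicit lock for cheaper per-access guarantees.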
[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anurag Mantripragada updated HIVE-20166:
--
    Status: Open  (was: Patch Available)

> LazyBinaryStruct Warn Level Logging
> ---
>
> Key: HIVE-20166
> URL: https://issues.apache.org/jira/browse/HIVE-20166
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 4.0.0
> Reporter: BELUGA BEHR
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: newbie, noob
> Attachments: HIVE-20166.1.patch
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180
> {code}
> // Extra bytes at the end?
> if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) {
>   extraFieldWarned = true;
>   LOG.warn("Extra bytes detected at the end of the row! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> // Missing fields?
> if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
>   missingFieldWarned = true;
>   LOG.info("Missing fields! Expected " + fields.length + " fields but " +
>       "only got " + fieldId + "! " +
>       "Last field end " + lastFieldByteEnd + " and serialize buffer end " + structByteEnd + ". " +
>       "Ignoring similar problems.");
> }
> {code}
> The first log statement is a 'warn' level logging, the second is an 'info' level logging. Please change the second log to also be a 'warn'. This seems like it could be a problem that the user would like to know about.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561413#comment-16561413 ] Hive QA commented on HIVE-20166: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933534/HIVE-20166.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14815 tests executed *Failed tests:* {noformat} org.apache.hive.minikdc.TestJdbcWithMiniKdcCookie.testCookie (batchId=264) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12936/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12936/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12936/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12933534 - PreCommit-HIVE-Build -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ganesha Shreedhara updated HIVE-20220: -- Status: Patch Available (was: In Progress) Added qtests. > Incorrect result when hive.groupby.skewindata is enabled > > > Key: HIVE-20220 > URL: https://issues.apache.org/jira/browse/HIVE-20220 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Ganesha Shreedhara >Assignee: Ganesha Shreedhara >Priority: Major > Attachments: HIVE-20220.patch > > > hive.groupby.skewindata uses the rand UDF to randomly distribute grouped-by keys to the reducers, which avoids overloading a single reducer when the data is skewed. > This random distribution of keys breaks when a reducer fails to fetch the mapper output due to a faulty datanode or any other reason. When a reducer finds that it cannot fetch the mapper output, it signals the Application Master to reattempt the corresponding map task. The reattempted map task gets a different random value from the rand function, so the keys it distributes to the reducers are not the same as in the previous run. > > *Steps to reproduce:* > create table test(id int); > insert into test values > (1),(2),(2),(3),(3),(3),(4),(4),(4),(4),(5),(5),(5),(5),(5),(6),(6),(6),(6),(6),(6),(7),(7),(7),(7),(7),(7),(7),(7),(8),(8),(8),(8),(8),(8),(8),(8),(9),(9),(9),(9),(9),(9),(9),(9),(9); > SET hive.groupby.skewindata=true; > SET mapreduce.job.reduces=2; > //Add a debug port for the reducer > select count(1) from test group by id; > //Remove the mapper's intermediate output file once the map stage is completed and one of the 2 reduce tasks has finished, then continue the run. This causes the 2nd reducer to send an event to the Application Master to rerun the map task. > The following is the expected result. 
> 1 > 2 > 3 > 4 > 5 > 6 > 8 > 8 > 9 > > But you may get a different result because the rand function returns a different value in the second run, causing a different distribution of keys. > This needs to be fixed so that the mapper distributes the same keys even if it is reattempted multiple times. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
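For context on what a fix has to achieve: the spray of a skewed key across reducers must stay stable across map reattempts. A toy sketch of the general idea — deterministic seeding instead of a fresh `rand()` per attempt. This is an illustration, not the actual HIVE-20220 patch; `rowOffset` is a hypothetical stand-in for any stable per-row identifier, such as the record's byte offset within the input split:

```java
import java.util.Random;

// Sketch: spread rows of the same key across reducers, but deterministically.
// The seed depends only on the row's content and position, never on when the
// task attempt runs, so a reattempted mapper reproduces the original routing.
public class DeterministicSpray {
    static int sprayPartition(String groupKey, long rowOffset, int numReducers) {
        long seed = 31L * groupKey.hashCode() + rowOffset;
        return new Random(seed).nextInt(numReducers);
    }

    public static void main(String[] args) {
        int first = sprayPartition("id=7", 42L, 2);
        int rerun = sprayPartition("id=7", 42L, 2); // simulated map reattempt
        System.out.println(first == rerun); // prints true
    }
}
```
Rows with the same key but different offsets still land on different reducers, so the skew mitigation that hive.groupby.skewindata exists for is preserved while the nondeterminism is removed.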
[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ganesha Shreedhara updated HIVE-20220: -- Status: In Progress (was: Patch Available) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ganesha Shreedhara updated HIVE-20220: -- Attachment: HIVE-20220.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ganesha Shreedhara updated HIVE-20220: -- Comment: was deleted (was: Can someone review this patch please? Please let me know if there is a better way of fixing this. I can update the qtest files based on the fix. ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ganesha Shreedhara updated HIVE-20220: -- Comment: was deleted (was: I'll correct the golden files if this fix is feasible. ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20220) Incorrect result when hive.groupby.skewindata is enabled
[ https://issues.apache.org/jira/browse/HIVE-20220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ganesha Shreedhara updated HIVE-20220: -- Attachment: (was: HIVE-20220.patch) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Li updated HIVE-20225: - Description: When using TPT/BTEQ to export/import data from Teradata, Teradata will generate/require binary files based on the schema. A customized SerDe is needed in order to directly read these files from Hive, or to write these files for loading back into TD. {code:java} CREATE EXTERNAL TABLE `TABLE1`( ...) PARTITIONED BY ( ...) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat' LOCATION ...; SELECT * FROM `TABLE1`;{code} Problem Statement: Right now the fast way to export/import data from Teradata is using TPT. However, Hive cannot directly utilize/generate this binary format because it doesn't have a SerDe for these files. Result: Provided with the SerDe, Hive can operate upon/generate the exported Teradata Binary Format files transparently. was: When using TPT/BTEQ to export data from Teradata, Teradata will export binary files based on the schema. A customized SerDe is needed in order to directly read these files from Hive. {code:java} CREATE EXTERNAL TABLE `TABLE1`( ...) PARTITIONED BY ( ...) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat' LOCATION ...; SELECT * FROM `TABLE1`;{code} Problem Statement: Right now the fast way to export data from Teradata is using TPT. However, Hive cannot directly utilize this exported binary format because it doesn't have a SerDe for these files. 
Result: Provided with the SerDe, Hive can operate upon the exported Teradata Binary Format file transparently. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
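For readers unfamiliar with what such an InputFormat has to do: TPT binary exports are record-oriented, commonly described as a 2-byte little-endian length prefix followed by the record payload (a null-indicator bitmap plus the field bytes). The framing loop below is a hypothetical sketch under that assumption — the authoritative layout is whatever the attached patch implements, and per-field decoding is omitted entirely:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a length-prefixed binary stream into records.
// The 2-byte little-endian length prefix is an assumption about the TPT
// export layout, not taken from the HIVE-20225 patch itself.
public class LengthPrefixedReader {
    static List<byte[]> readRecords(byte[] stream) throws IOException {
        List<byte[]> records = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream));
        while (in.available() >= 2) {
            byte[] lenBytes = new byte[2];
            in.readFully(lenBytes);
            // Interpret the prefix as an unsigned 16-bit little-endian length.
            int len = ByteBuffer.wrap(lenBytes).order(ByteOrder.LITTLE_ENDIAN).getShort() & 0xFFFF;
            byte[] payload = new byte[len];
            in.readFully(payload);
            records.add(payload);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Two records: 3-byte "abc" and 2-byte "xy", each with a length prefix.
        byte[] stream = {3, 0, 'a', 'b', 'c', 2, 0, 'x', 'y'};
        List<byte[]> recs = readRecords(stream);
        System.out.println(recs.size() + " " + new String(recs.get(0))); // prints 2 abc
    }
}
```
This framing is why a plain TextInputFormat cannot read these files: record boundaries are binary length fields, not newlines, which is what the custom InputFormat/OutputFormat pair in the DDL above provides.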
[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561381#comment-16561381 ] Lu Li commented on HIVE-20225: -- Hi [~cwsteinbach] Could you please review this and provide your comments? Thanks, Lu -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Li updated HIVE-20225: - Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561379#comment-16561379 ] Lu Li commented on HIVE-20225: -- Added the RB: https://reviews.apache.org/r/68099/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Li updated HIVE-20225: - Attachment: HIVE-20225.1.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Li updated HIVE-20225: - Attachment: (was: HIVE-20225.1.patch) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20225) SerDe to support Teradata Binary Format
[ https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Li updated HIVE-20225: - Attachment: HIVE-20225.1.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561359#comment-16561359 ] Hive QA commented on HIVE-20166: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 4s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 38s{color} | {color:blue} serde in master has 195 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 11m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12936/dev-support/hive-personality.sh | | git revision | master / 83e5397 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: serde U: serde | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12936/yetus.txt | | Powered by | Apache Yetus http://yetus.apache.org | This message was automatically generated.
[jira] [Commented] (HIVE-19798) Number of distinct values column statistic accounts null as a distinct value
[ https://issues.apache.org/jira/browse/HIVE-19798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561335#comment-16561335 ] Anurag Mantripragada commented on HIVE-19798: - [~arhimondr], can you please provide more info on this? > Number of distinct values column statistic accounts null as a distinct value > > > Key: HIVE-19798 > URL: https://issues.apache.org/jira/browse/HIVE-19798 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.2.1 >Reporter: Andy Rosa >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anurag Mantripragada reassigned HIVE-20166: --- Assignee: Anurag Mantripragada > LazyBinaryStruct Warn Level Logging > --- > > Key: HIVE-20166 > URL: https://issues.apache.org/jira/browse/HIVE-20166 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 3.0.0, 4.0.0 >Reporter: BELUGA BEHR >Assignee: Anurag Mantripragada >Priority: Minor > Labels: newbie, noob > Attachments: HIVE-20166.1.patch > > > https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180 > {code} > // Extra bytes at the end? > if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) { > extraFieldWarned = true; > LOG.warn("Extra bytes detected at the end of the row! " + >"Last field end " + lastFieldByteEnd + " and serialize buffer end > " + structByteEnd + ". " + >"Ignoring similar problems."); > } > // Missing fields? > if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) { > missingFieldWarned = true; > LOG.info("Missing fields! Expected " + fields.length + " fields but " + > "only got " + fieldId + "! " + > "Last field end " + lastFieldByteEnd + " and serialize buffer end " > + structByteEnd + ". " + > "Ignoring similar problems."); > } > {code} > The first log statement is a 'warn' level logging, the second is an 'info' > level logging. Please change the second log to also be a 'warn'. This seems > like it could be a problem that the user would like to know about. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
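The change requested above can be sketched in a minimal, dependency-free form. Class, method, and field names here are illustrative only (the real LazyBinaryStruct uses SLF4J rather than java.util.logging): the missing-fields branch now reports at WARN level, matching the extra-bytes branch, and still fires at most once.

```java
import java.util.logging.Logger;

class MissingFieldCheck {
    private static final Logger LOG =
        Logger.getLogger(MissingFieldCheck.class.getName());
    private boolean missingFieldWarned = false;

    // Returns true the first time a short row is detected; later short rows
    // are suppressed, mirroring the "Ignoring similar problems" behavior.
    boolean check(int lastFieldByteEnd, int structByteEnd) {
        if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) {
            missingFieldWarned = true;
            // Raised from info-level to warning-level, per the patch.
            LOG.warning("Missing fields! Last field end " + lastFieldByteEnd
                + " and serialize buffer end " + structByteEnd
                + ". Ignoring similar problems.");
            return true;
        }
        return false;
    }
}
```

The one-shot flag means a malformed input file produces a single WARN entry rather than one per row, which is why the level change is cheap even on hot deserialization paths.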
[jira] [Updated] (HIVE-20166) LazyBinaryStruct Warn Level Logging
[ https://issues.apache.org/jira/browse/HIVE-20166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anurag Mantripragada updated HIVE-20166: Attachment: HIVE-20166.1.patch Status: Patch Available (was: Open) Changed logging level to WARN. > LazyBinaryStruct Warn Level Logging > --- > > Key: HIVE-20166 > URL: https://issues.apache.org/jira/browse/HIVE-20166 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 3.0.0, 4.0.0 >Reporter: BELUGA BEHR >Assignee: Anurag Mantripragada >Priority: Minor > Labels: newbie, noob > Attachments: HIVE-20166.1.patch > > > https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L177-L180 > {code} > // Extra bytes at the end? > if (!extraFieldWarned && lastFieldByteEnd < structByteEnd) { > extraFieldWarned = true; > LOG.warn("Extra bytes detected at the end of the row! " + >"Last field end " + lastFieldByteEnd + " and serialize buffer end > " + structByteEnd + ". " + >"Ignoring similar problems."); > } > // Missing fields? > if (!missingFieldWarned && lastFieldByteEnd > structByteEnd) { > missingFieldWarned = true; > LOG.info("Missing fields! Expected " + fields.length + " fields but " + > "only got " + fieldId + "! " + > "Last field end " + lastFieldByteEnd + " and serialize buffer end " > + structByteEnd + ". " + > "Ignoring similar problems."); > } > {code} > The first log statement is a 'warn' level logging, the second is an 'info' > level logging. Please change the second log to also be a 'warn'. This seems > like it could be a problem that the user would like to know about. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels
[ https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561300#comment-16561300 ] Prasanth Jayachandran commented on HIVE-20267: -- [~zchovan] Thanks for the patch! Very useful! 2 minor changes * Could you auto-refresh the page after clicking submit button? * Could you also add this servlet to LlapWebServices? Looks good otherwise. > Expanding WebUI to include form to dynamically config log levels > - > > Key: HIVE-20267 > URL: https://issues.apache.org/jira/browse/HIVE-20267 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Minor > Attachments: HIVE-20267.1.patch > > > Expanding the possibility to change the log levels during runtime, the webUI > can be extended to interact with the Log4j2ConfiguratorServlet, this way it > can be directly used and users/admins don't need to execute curl commands > from commandline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels
[ https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561301#comment-16561301 ] Prasanth Jayachandran commented on HIVE-20267: -- Also please make the ticket "Patch Available" for it to trigger pre-commit tests. > Expanding WebUI to include form to dynamically config log levels > - > > Key: HIVE-20267 > URL: https://issues.apache.org/jira/browse/HIVE-20267 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Minor > Attachments: HIVE-20267.1.patch > > > Expanding the possibility to change the log levels during runtime, the webUI > can be extended to interact with the Log4j2ConfiguratorServlet, this way it > can be directly used and users/admins don't need to execute curl commands > from commandline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20262) Implement stats annotation rule for the UDTFOperator
[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561281#comment-16561281 ] Hive QA commented on HIVE-20262: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933527/HIVE-20262.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14816 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12935/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12935/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12935/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12933527 - PreCommit-HIVE-Build > Implement stats annotation rule for the UDTFOperator > > > Key: HIVE-20262 > URL: https://issues.apache.org/jira/browse/HIVE-20262 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Reporter: George Pachitariu >Assignee: George Pachitariu >Priority: Minor > Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch > > > User Defined Table Functions (UDTFs) change the number of rows of the output. > A common UDTF is the explode() method that creates a row for each element for > each array in the input column. > > Right now, the number of output rows is equal to the number of input rows. > But if the average number of output rows is bigger than 1, the resulting > number of rows is underestimated in the execution plan. 
> > Implement a rule that can have a factor X as a parameter and for each UDTF > function predict that: > > {code:java} > number of output rows = X * number of input rows{code} > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
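The proposed rule reduces to a one-line estimate. A self-contained sketch, with illustrative (not Hive's actual) class and method names:

```java
class UdtfStatsRule {
    // Predicted output cardinality for a UDTF: X * input rows, where X is a
    // configurable expansion factor (e.g. the expected average array length
    // for explode()). Rounded up so the estimate stays a whole row count.
    static long estimateOutputRows(long inputRows, double factor) {
        return (long) Math.ceil(inputRows * factor);
    }
}
```

With a factor of 5 applied to the 2500-row scan shown in a typical plan, the annotated estimate becomes 12500 rows instead of 2500, avoiding the underestimation the issue describes.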
[jira] [Commented] (HIVE-20262) Implement stats annotation rule for the UDTFOperator
[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561276#comment-16561276 ] Hive QA commented on HIVE-20262: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 36s{color} | {color:blue} common in master has 64 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 11s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 20s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12935/dev-support/hive-personality.sh | | git revision | master / 83e5397 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: common ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12935/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Implement stats annotation rule for the UDTFOperator > > > Key: HIVE-20262 > URL: https://issues.apache.org/jira/browse/HIVE-20262 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Reporter: George Pachitariu >Assignee: George Pachitariu >Priority: Minor > Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch > > > User Defined Table Functions (UDTFs) change the number of rows of the output. > A common UDTF is the explode() method that creates a row for each element for > each array in the input column. > > Right now, the number of output rows is equal to the number of input rows. > But if the average number of output rows is bigger than 1, the resulting > number of rows is underestimated in the execution plan. > > Implement a rule that can have a factor X as a parameter and for each UDTF > function predict that: > > {code:java} > number of output rows = X * number of input rows{code} > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels
[ https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Chovan updated HIVE-20267: - Attachment: HIVE-20267.1.patch > Expanding WebUI to include form to dynamically config log levels > - > > Key: HIVE-20267 > URL: https://issues.apache.org/jira/browse/HIVE-20267 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Minor > Attachments: HIVE-20267.1.patch > > > Expanding the possibility to change the log levels during runtime, the webUI > can be extended to interact with the Log4j2ConfiguratorServlet, this way it > can be directly used and users/admins don't need to execute curl commands > from commandline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20267) Expanding WebUI to include form to dynamically config log levels
[ https://issues.apache.org/jira/browse/HIVE-20267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Chovan reassigned HIVE-20267: > Expanding WebUI to include form to dynamically config log levels > - > > Key: HIVE-20267 > URL: https://issues.apache.org/jira/browse/HIVE-20267 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Chovan >Assignee: Zoltan Chovan >Priority: Minor > > Expanding the possibility to change the log levels during runtime, the webUI > can be extended to interact with the Log4j2ConfiguratorServlet, this way it > can be directly used and users/admins don't need to execute curl commands > from commandline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable
[ https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561271#comment-16561271 ] Hive QA commented on HIVE-20263: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933526/HIVE-20263.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14815 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12934/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12934/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12934/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12933526 - PreCommit-HIVE-Build > Typo in HiveReduceExpressionsWithStatsRule variable > --- > > Key: HIVE-20263 > URL: https://issues.apache.org/jira/browse/HIVE-20263 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20263.patch, HIVE-20263.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable
[ https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561269#comment-16561269 ] Ashutosh Chauhan commented on HIVE-20263: - +1 > Typo in HiveReduceExpressionsWithStatsRule variable > --- > > Key: HIVE-20263 > URL: https://issues.apache.org/jira/browse/HIVE-20263 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20263.patch, HIVE-20263.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable
[ https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561263#comment-16561263 ] Hive QA commented on HIVE-20263: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 6s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12934/dev-support/hive-personality.sh | | git revision | master / 83e5397 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12934/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Typo in HiveReduceExpressionsWithStatsRule variable > --- > > Key: HIVE-20263 > URL: https://issues.apache.org/jira/browse/HIVE-20263 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20263.patch, HIVE-20263.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator
[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] George Pachitariu updated HIVE-20262: - Attachment: HIVE-20262.2.patch Status: Patch Available (was: Open) > Implement stats annotation rule for the UDTFOperator > > > Key: HIVE-20262 > URL: https://issues.apache.org/jira/browse/HIVE-20262 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Reporter: George Pachitariu >Assignee: George Pachitariu >Priority: Minor > Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch > > > User Defined Table Functions (UDTFs) change the number of rows of the output. > A common UDTF is the explode() method that creates a row for each element for > each array in the input column. > > Right now, the number of output rows is equal to the number of input rows. > But if the average number of output rows is bigger than 1, the resulting > number of rows is underestimated in the execution plan. > > Implement a rule that can have a factor X as a parameter and for each UDTF > function predict that: > > {code:java} > number of output rows = X * number of input rows{code} > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator
[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] George Pachitariu updated HIVE-20262: - Status: Open (was: Patch Available) > Implement stats annotation rule for the UDTFOperator > > > Key: HIVE-20262 > URL: https://issues.apache.org/jira/browse/HIVE-20262 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Reporter: George Pachitariu >Assignee: George Pachitariu >Priority: Minor > Attachments: HIVE-20262.1.patch, HIVE-20262.2.patch, HIVE-20262.patch > > > User Defined Table Functions (UDTFs) change the number of rows of the output. > A common UDTF is the explode() method that creates a row for each element for > each array in the input column. > > Right now, the number of output rows is equal to the number of input rows. > But if the average number of output rows is bigger than 1, the resulting > number of rows is underestimated in the execution plan. > > Implement a rule that can have a factor X as a parameter and for each UDTF > function predict that: > > {code:java} > number of output rows = X * number of input rows{code} > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20266) Extra column is being shuffled in cbo as compared to non-cbo
[ https://issues.apache.org/jira/browse/HIVE-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-20266: --- Description: {code:sql} CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE); {code} {code:sql} explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10; {code} *Without CBO* {noformat} Map 1 Map Operator Tree: TableScan alias: src Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string), value (type: string), value (type: string), key (type: string), 3 (type: int) outputColumnNames: _col0, _col1, _col2, _col3, _col4 Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE Limit Number of rows: 10 Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: int) {noformat} *With CBO* {noformat} Map 1 Map Operator Tree: TableScan alias: src Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string), value (type: string), value (type: string), key (type: string) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 2500 Data size: 26560 Basic stats: COMPLETE Column stats: NONE Limit Number of rows: 10 Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 10 Data size: 100 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string) {noformat} CBO has 4 
columns being shuffled as compared to 3 in non-cbo was: {code:sql} CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE); {code} {code:sql} explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, value, key as p1, 3 as p2 from src limit 10; {code} > Extra column is being shuffled in cbo as compared to non-cbo > > > Key: HIVE-20266 > URL: https://issues.apache.org/jira/browse/HIVE-20266 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > {code:sql} > CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING > NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE); > {code} > {code:sql} > explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, > value, key as p1, 3 as p2 from src limit 10; > {code} > *Without CBO* > {noformat} > Map 1 > Map Operator Tree: > TableScan > alias: src > Statistics: Num rows: 2500 Data size: 26560 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: key (type: string), value (type: string), > value (type: string), key (type: string), 3 (type: int) > outputColumnNames: _col0, _col1, _col2, _col3, _col4 > Statistics: Num rows: 2500 Data size: 26560 Basic stats: > COMPLETE Column stats: NONE > Limit > Number of rows: 10 > Statistics: Num rows: 10 Data size: 100 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 10 Data size: 100 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string), _col1 (type: > string), _col2 (type: string), _col3 (type: string), _col4 (type: int) > {noformat} > *With CBO* > {noformat} > Map 1 > Map Operator Tree: >
[jira] [Assigned] (HIVE-20266) Extra column is being shuffled in cbo as compared to non-cbo
[ https://issues.apache.org/jira/browse/HIVE-20266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-20266: -- > Extra column is being shuffled in cbo as compared to non-cbo > > > Key: HIVE-20266 > URL: https://issues.apache.org/jira/browse/HIVE-20266 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > {code:sql} > CREATE TABLE tablePartitioned (a STRING NOT NULL ENFORCED, b STRING, c STRING > NOT NULL ENFORCED) PARTITIONED BY (p1 STRING, p2 INT NOT NULL DISABLE); > {code} > {code:sql} > explain INSERT INTO tablePartitioned partition(p1, p2) select key, value, > value, key as p1, 3 as p2 from src limit 10; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20265) PTF operator has an extra reducer in CBO
[ https://issues.apache.org/jira/browse/HIVE-20265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-20265: -- > PTF operator has an extra reducer in CBO > > > Key: HIVE-20265 > URL: https://issues.apache.org/jira/browse/HIVE-20265 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > {code:sql} > explain vectorization detail > select p_mfgr, p_name, p_size, > min(p_retailprice), > rank() over(distribute by p_mfgr sort by p_name)as r, > dense_rank() over(distribute by p_mfgr sort by p_name) as dr, > p_size, p_size - lag(p_size,1,p_size) over(distribute by p_mfgr sort by > p_name) as deltaSz > from part > group by p_mfgr, p_name, p_size > {code} > Above query generates extra reducer with CBO on. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20263) Typo in HiveReduceExpressionsWithStatsRule variable
[ https://issues.apache.org/jira/browse/HIVE-20263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20263: --- Attachment: HIVE-20263.patch > Typo in HiveReduceExpressionsWithStatsRule variable > --- > > Key: HIVE-20263 > URL: https://issues.apache.org/jira/browse/HIVE-20263 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20263.patch, HIVE-20263.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select
[ https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-19770: --- Resolution: Fixed Fix Version/s: 4.0.0 Target Version/s: 4.0.0 (was: 3.1.0) Status: Resolved (was: Patch Available) Pushed to master. Thanks for reviewing [~ashutoshc] > Support for CBO for queries with multiple same columns in select > > > Key: HIVE-19770 > URL: https://issues.apache.org/jira/browse/HIVE-19770 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, > HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, > HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch > > > Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} > are not supported for CBO. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20262) Implement stats annotation rule for the UDTFOperator
[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561239#comment-16561239 ] Hive QA commented on HIVE-20262: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933525/HIVE-20262.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12933/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12933/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12933/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2018-07-29 19:07:39.248 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-12933/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2018-07-29 19:07:39.251 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 2183424 HIVE-19181 : Remove BreakableService (unused class) (Anurag Mantripragada via Thejas Nair) + git clean -f -d Removing standalone-metastore/metastore-server/src/gen/ Removing standalone-metastore/src/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 2183424 HIVE-19181 : Remove BreakableService (unused class) (Anurag Mantripragada via Thejas Nair) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2018-07-29 19:07:40.248 + rm -rf ../yetus_PreCommit-HIVE-Build-12933 + mkdir ../yetus_PreCommit-HIVE-Build-12933 + git gc + cp -R . ../yetus_PreCommit-HIVE-Build-12933 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12933/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: git apply -p0 /data/hiveptest/working/scratch/build.patch:137: new blank line at EOF. + /data/hiveptest/working/scratch/build.patch:399: new blank line at EOF. + warning: 2 lines add whitespace errors. 
+ [[ maven == \m\a\v\e\n ]] + rm -rf /data/hiveptest/working/maven/org/apache/hive + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven protoc-jar: executing: [/tmp/protoc1788037483674454778.exe, --version] libprotoc 2.5.0 protoc-jar: executing: [/tmp/protoc1788037483674454778.exe, -I/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore, --java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/target/generated-sources, /data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto] ANTLR Parser Generator Version 3.5.2 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (process-resource-bundles) on project hive-upgrade-acid: Execution process-resource-bundles of goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process failed. ConcurrentModificationException -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hive-upgrade-acid + result=1 + '[' 1 -ne 0 ']' + rm -rf yetus_PreCommit-HIVE-Build-1
[jira] [Commented] (HIVE-19770) Support for CBO for queries with multiple same columns in select
[ https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561238#comment-16561238 ] Hive QA commented on HIVE-19770: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933524/HIVE-19770.8.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14815 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12932/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12932/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12932/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12933524 - PreCommit-HIVE-Build > Support for CBO for queries with multiple same columns in select > > > Key: HIVE-19770 > URL: https://issues.apache.org/jira/browse/HIVE-19770 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, > HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, > HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch > > > Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} > are not supported for CBO. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19770) Support for CBO for queries with multiple same columns in select
[ https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561223#comment-16561223 ] Hive QA commented on HIVE-19770: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 23s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 14s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 13 new + 199 unchanged - 1 fixed = 212 total (was 200) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 23m 44s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12932/dev-support/hive-personality.sh | | git revision | master / 2183424 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12932/yetus/diff-checkstyle-ql.txt | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12932/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Support for CBO for queries with multiple same columns in select > > > Key: HIVE-19770 > URL: https://issues.apache.org/jira/browse/HIVE-19770 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-19770.1.patch, HIVE-19770.2.patch, > HIVE-19770.3.patch, HIVE-19770.4.patch, HIVE-19770.5.patch, > HIVE-19770.6.patch, HIVE-19770.7.patch, HIVE-19770.8.patch > > > Currently queries such as {code:sql} select a,a from t1 where b > 10 {code} > are not supported for CBO. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator
[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
George Pachitariu updated HIVE-20262:
-------------------------------------
    Attachment: HIVE-20262.1.patch
        Status: Patch Available  (was: Open)

> Implement stats annotation rule for the UDTFOperator
> ----------------------------------------------------
>
>                 Key: HIVE-20262
>                 URL: https://issues.apache.org/jira/browse/HIVE-20262
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer
>            Reporter: George Pachitariu
>            Assignee: George Pachitariu
>            Priority: Minor
>         Attachments: HIVE-20262.1.patch, HIVE-20262.patch
>
> User Defined Table Functions (UDTFs) change the number of rows of the output.
> A common UDTF is the explode() method, which creates one output row for each
> element of each array in the input column.
>
> Right now, the estimated number of output rows is assumed equal to the number
> of input rows. So whenever a UDTF emits more than one row per input row on
> average, the row count is underestimated in the execution plan.
>
> Implement a rule that takes a factor X as a parameter and, for each UDTF
> function, predicts that:
>
> {code:java}
> number of output rows = X * number of input rows{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
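The proposed rule reduces to a single multiplication of the operator's input row count by a configurable factor. A minimal sketch of that estimate follows; the class and method names are hypothetical, not Hive's actual stats-annotation API:

```java
// Illustrative sketch only: estimate UDTF output rows as X * input rows.
// UdtfStatsRuleSketch and estimateOutputRows are hypothetical names,
// not part of Hive's codebase.
public class UdtfStatsRuleSketch {

    /** Returns ceil(inputRows * factor), clamped to Long.MAX_VALUE. */
    public static long estimateOutputRows(long inputRows, double factor) {
        double estimate = Math.ceil(inputRows * factor);
        return estimate >= (double) Long.MAX_VALUE ? Long.MAX_VALUE : (long) estimate;
    }

    public static void main(String[] args) {
        // With factor X = 2.5, 1000 input rows are estimated as 2500 output rows.
        System.out.println(estimateOutputRows(1000L, 2.5)); // prints 2500
    }
}
```

With X = 1 this degenerates to the current behavior (output rows = input rows), which is exactly the underestimate the issue describes for exploding UDTFs.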
[jira] [Updated] (HIVE-20262) Implement stats annotation rule for the UDTFOperator
[ https://issues.apache.org/jira/browse/HIVE-20262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] George Pachitariu updated HIVE-20262: - Status: Open (was: Patch Available) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561214#comment-16561214 ] Hive QA commented on HIVE-17683: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933520/HIVE-17683-branch-3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 14413 tests executed *Failed tests:* {noformat} TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestTezPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=258) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=70) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_with_masking] (batchId=174) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_semijoin] (batchId=187) org.apache.hadoop.hive.ql.TestWarehouseExternalDir.testManagedPaths (batchId=235) org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testLockingOnInsertIntoNonNativeTables (batchId=306) org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation (batchId=243) org.apache.hive.spark.client.rpc.TestRpc.testServerPort (batchId=310) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12931/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12931/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12931/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12933520 - PreCommit-HIVE-Build > Add explain locks command > --- > > Key: HIVE-17683 > URL: https://issues.apache.org/jira/browse/HIVE-17683 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Igor Kryvenko >Priority: Critical > Attachments: HIVE-17683-branch-3.0.patch, HIVE-17683-branch-3.patch, > HIVE-17683.01.patch, HIVE-17683.02.patch, HIVE-17683.03.patch, > HIVE-17683.04.patch, HIVE-17683.05.patch, HIVE-17683.06.patch > > > Explore if it's possible to add info about what locks will be asked for to > the query plan. > Lock acquisition (for Acid Lock Manager) is done in > DbTxnManager.acquireLocks() which is called once the query starts running. > Would need to refactor that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20215) Hive unable to plan/compile query containing subquery with multiple same name columns
[ https://issues.apache.org/jira/browse/HIVE-20215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vineet Garg updated HIVE-20215:
-------------------------------
    Attachment: (was: HIVE-19770.8.patch)

> Hive unable to plan/compile query containing subquery with multiple same name
> columns
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-20215
>                 URL: https://issues.apache.org/jira/browse/HIVE-20215
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>            Priority: Major
>
> *Reproducer*
> {code:sql}
> >create table t1(c1 int)
> >explain select count(*) from (select c1, c1 from t1) subq
> {code}
> {noformat}
> FAILED: SemanticException [Error 10007]: Ambiguous column reference c1 in subq
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
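While support for duplicate column names in subqueries lands, one common way to sidestep the SemanticException is to give each duplicated column a distinct alias. A sketch of the same reproducer with aliases added (illustrative only; the alias names are arbitrary):

```sql
create table t1(c1 int);
-- Distinct aliases avoid the ambiguous column reference in subq:
explain select count(*) from (select c1 as c1_a, c1 as c1_b from t1) subq;
```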
[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select
[ https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-19770: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20215) Hive unable to plan/compile query containing subquery with multiple same name columns
[ https://issues.apache.org/jira/browse/HIVE-20215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-20215: --- Attachment: HIVE-19770.8.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select
[ https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-19770: --- Attachment: HIVE-19770.8.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19770) Support for CBO for queries with multiple same columns in select
[ https://issues.apache.org/jira/browse/HIVE-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-19770: --- Status: Open (was: Patch Available) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20264) Bootstrap repl dump with concurrent write and drop of ACID table makes target inconsistent.
[ https://issues.apache.org/jira/browse/HIVE-20264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sankar Hariappan reassigned HIVE-20264:
---------------------------------------

> Bootstrap repl dump with concurrent write and drop of ACID table makes target
> inconsistent.
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-20264
>                 URL: https://issues.apache.org/jira/browse/HIVE-20264
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, repl
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, replication
>
> During a bootstrap dump of ACID tables, consider the below sequence:
> - Get lastReplId = last event ID logged.
> - Current session (Thread-1), REPL DUMP -> Open txn (Txn1) - Event-10
> - Another session (Thread-2), Open txn (Txn2) - Event-11
> - Thread-2 -> Insert data (T1.D1) into an ACID table - Event-12
> - Thread-2 -> Commit txn (Txn2) - Event-13
> - Thread-2 -> Drop table (T1) - Event-14
> - Thread-1 -> Dump ACID tables based on the validTxnList of Txn1. This step
>   skips all the data written by txns > Txn1, so T1 will be missing.
> - Thread-1 -> Commit txn (Txn1)
> - REPL LOAD from the bootstrap dump will therefore skip T1.
> - Incremental REPL DUMP will start from Event-10, so it allocates a write ID
>   for table T1, and the drop table (T1) event is idempotent. As a result,
>   stale entries for T1 remain in the TXN_TO_WRITE_ID and NEXT_WRITE_ID
>   metastore tables at the target.
> - Now, if we create another table with the same name T1 at the source and
>   replicate it, readers at the target may see incorrect data for T1.
> A couple of proposals:
> 1. Make allocate write ID idempotent. This is not possible because the table
>    doesn't exist, and an MM table import may allocate a write ID before
>    creating the table, so these 2 cases cannot be differentiated.
> 2. Make the drop table event remove the entries from TXN_TO_WRITE_ID and
>    NEXT_WRITE_ID regardless of whether the table exists at the target.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
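The visibility problem above can be reduced to a toy model: the bootstrap dump materializes only writes from transactions admitted by its snapshot (here, txn IDs at or below the dump's own Txn1), so a table written entirely by a later transaction never reaches the dump. This is a self-contained sketch under that simplification, not Hive's actual snapshot logic; all names are illustrative:

```java
// Toy model of the bootstrap-dump race: a snapshot taken at Txn1 hides
// every write from transactions opened after it, so T1 (written by Txn2)
// is absent from the dump even though it existed at the source.
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BootstrapRaceSketch {

    /** A committed write performed by the given transaction on the given table. */
    record Write(long txnId, String table) {}

    /** Tables visible to a snapshot that only admits txns <= snapshotTxn. */
    static Set<String> dumpedTables(List<Write> writes, long snapshotTxn) {
        Set<String> visible = new HashSet<>();
        for (Write w : writes) {
            if (w.txnId() <= snapshotTxn) {
                visible.add(w.table());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long txn1 = 10;                                    // REPL DUMP opens Txn1
        List<Write> writes = List.of(new Write(11, "T1")); // Txn2 > Txn1 writes T1
        System.out.println(dumpedTables(writes, txn1));    // prints [] -- T1 is missed
    }
}
```

Proposal 2 in the issue then amounts to making the drop-table event clean up the write-ID bookkeeping unconditionally, so a table missed this way cannot leave stale TXN_TO_WRITE_ID / NEXT_WRITE_ID entries behind at the target.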
[jira] [Commented] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561151#comment-16561151 ] Hive QA commented on HIVE-17683: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 17s{color} | {color:red} /data/hiveptest/logs/PreCommit-HIVE-Build-12931/patches/PreCommit-HIVE-Build-12931.patch does not apply to master. Rebase required? Wrong Branch? See http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12931/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Add explain locks command > --- > > Key: HIVE-17683 > URL: https://issues.apache.org/jira/browse/HIVE-17683 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Igor Kryvenko >Priority: Critical > Attachments: HIVE-17683-branch-3.0.patch, HIVE-17683-branch-3.patch, > HIVE-17683.01.patch, HIVE-17683.02.patch, HIVE-17683.03.patch, > HIVE-17683.04.patch, HIVE-17683.05.patch, HIVE-17683.06.patch > > > Explore if it's possible to add info about what locks will be asked for to > the query plan. > Lock acquisition (for Acid Lock Manager) is done in > DbTxnManager.acquireLocks() which is called once the query starts running. > Would need to refactor that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Kryvenko updated HIVE-17683: - Attachment: HIVE-17683-branch-3.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Kryvenko updated HIVE-17683: - Attachment: HIVE-17683-branch-3.0.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Kryvenko updated HIVE-17683: - Attachment: (was: HIVE-17683.01-branch-3.0.patch) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Kryvenko updated HIVE-17683: - Attachment: (was: HIVE-17683-branch-3.patch) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561132#comment-16561132 ] Hive QA commented on HIVE-17683: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933518/HIVE-17683-branch-3.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12929/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12929/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12929/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2018-07-29 15:25:27.128 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-12929/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z branch-3 ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2018-07-29 15:25:27.131 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 2183424 HIVE-19181 : Remove BreakableService (unused class) (Anurag Mantripragada via Thejas Nair) + git clean -f -d Removing standalone-metastore/metastore-server/src/gen/ + git checkout branch-3 Switched to branch 'branch-3' Your branch is behind 'origin/branch-3' by 5 commits, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/branch-3 HEAD is now at 150ef3b HIVE-19829: Incremental replication load should create tasks in execution phase rather than semantic phase (Mahesh Kumar Behera, reviewed by Sankar Hariappan) + git merge --ff-only origin/branch-3 Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2018-07-29 15:25:30.572 + rm -rf ../yetus_PreCommit-HIVE-Build-12929 + mkdir ../yetus_PreCommit-HIVE-Build-12929 + git gc + cp -R . ../yetus_PreCommit-HIVE-Build-12929 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12929/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:46 Falling back to three-way merge... Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java' with conflicts. error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:419 Falling back to three-way merge... Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java' with conflicts. error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java:69 Falling back to three-way merge... 
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java' cleanly. Going to apply patch with: git apply -p0 error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:46 Falling back to three-way merge... Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java' with conflicts. error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:419 Falling back to three-way merge... Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java' with conflicts. error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java:69 Falling back to three-way merge... Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java' cleanly. U ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java U ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java + result=1 + '[' 1 -ne 0 ']' + rm -rf yetus_PreCommit-HIVE-Build-12929 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12933518 - PreCommit-HIVE-Build > Add explain locks command > --- > > Key: HIVE-17683 > URL: https://issues.apac
[jira] [Updated] (HIVE-17683) Add explain locks command
[ https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Kryvenko updated HIVE-17683: - Attachment: HIVE-17683-branch-3.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561115#comment-16561115 ] Hive QA commented on HIVE-20245: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933511/HIVE-20245.04.patch {color:green}SUCCESS:{color} +1 due to 14 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 14831 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12928/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12928/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12928/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12933511 - PreCommit-HIVE-Build > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, > HIVE-20245.03.patch, HIVE-20245.04.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561100#comment-16561100 ] Hive QA commented on HIVE-20245: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 56s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 19s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 56s{color} | {color:red} ql in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 57s{color} | {color:red} ql: The patch generated 298 new + 1522 unchanged - 34 fixed = 1820 total (was 1556) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} vector-code-gen: The patch generated 8 new + 322 unchanged - 0 fixed = 330 total (was 322) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 33s{color} | {color:red} ql generated 15 new + 2292 unchanged - 5 fixed = 2307 total (was 2297) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 30s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Redundant nullcheck of filterExpr, which is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:[line 1640] | | | Redundant nullcheck of vectorExpression, which is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:[line 1687] | | | Found reliance on default encoding in org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.create(int, Object, TypeInfo):in org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.create(int, Object, TypeInfo): String.getBytes() At ConstantVectorExpression.java:[line 210] | | | Class org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColumnBetween defines non-transient non-se
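One of the new FindBugs items above is "reliance on default encoding ... String.getBytes()". As a minimal illustrative sketch (a hypothetical class, not Hive's actual ConstantVectorExpression code), the flagged pattern and its usual fix look like this: a bare `String.getBytes()` uses the JVM's platform-default charset, so the bytes produced can differ between machines, while naming the charset explicitly makes the result deterministic.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical example of the FindBugs "default encoding" warning; not Hive code.
public class EncodingExample {

    // Flagged pattern: getBytes() with no charset depends on the platform default,
    // so the same string can serialize differently on different hosts.
    static byte[] toBytesDefault(String s) {
        return s.getBytes();
    }

    // Usual fix: state the charset explicitly for deterministic output.
    static byte[] toBytesUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 bytes of "abc" are fixed regardless of platform defaults.
        byte[] b = toBytesUtf8("abc");
        System.out.println(Arrays.equals(b, new byte[] {97, 98, 99})); // prints "true"
    }
}
```

The fix matters most for data that crosses machine boundaries (serialized rows, file formats), which is why static analyzers treat the default-charset overload as a bug rather than a style issue.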
[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20245: Status: Patch Available (was: In Progress) > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, > HIVE-20245.03.patch, HIVE-20245.04.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20245: Attachment: HIVE-20245.04.patch > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, > HIVE-20245.03.patch, HIVE-20245.04.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20245: Status: In Progress (was: Patch Available) > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, > HIVE-20245.03.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561064#comment-16561064 ] Hive QA commented on HIVE-20245: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12933509/HIVE-20245.03.patch {color:green}SUCCESS:{color} +1 due to 14 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 14831 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_10] (batchId=23) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_7] (batchId=89) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_8] (batchId=14) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_10] (batchId=26) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_7] (batchId=46) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_8] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_casts] (batchId=87) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_annotate_stats_select] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf_inline] (batchId=179) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_10] (batchId=162) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_7] (batchId=167) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_8] (batchId=168) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_short_regress] (batchId=169) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_casts] (batchId=178) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=186) 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_10] (batchId=118) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_7] (batchId=147) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_8] (batchId=114) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_10] (batchId=120) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress] (batchId=131) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12927/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12927/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12927/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12933509 - PreCommit-HIVE-Build > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, > HIVE-20245.03.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16561055#comment-16561055 ] Hive QA commented on HIVE-20245: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 16s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 12s{color} | {color:blue} ql in master has 2297 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 56s{color} | {color:red} ql in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 58s{color} | {color:red} ql: The patch generated 297 new + 1522 unchanged - 34 fixed = 1819 total (was 1556) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} vector-code-gen: The patch generated 8 new + 322 unchanged - 0 fixed = 330 total (was 322) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 23s{color} | {color:red} ql generated 15 new + 2292 unchanged - 5 fixed = 2307 total (was 2297) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 0s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Redundant nullcheck of filterExpr, which is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:[line 1640] | | | Redundant nullcheck of vectorExpression, which is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:is known to be non-null in org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createDecimal64VectorExpression(Class, List, VectorExpressionDescriptor$Mode, boolean, int, TypeInfo, DataTypePhysicalVariation) Redundant null check at VectorizationContext.java:[line 1687] | | | Found reliance on default encoding in org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.create(int, Object, TypeInfo):in org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.create(int, Object, TypeInfo): String.getBytes() At ConstantVectorExpression.java:[line 210] | | | Class org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColumnBetween defines non-transient non-se
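The other recurring FindBugs item above is "Redundant nullcheck of ... which is known to be non-null". As an illustrative sketch (a hypothetical method, not Hive's actual VectorizationContext code), the pattern the analyzer flags is a second null test on a reference that an earlier branch or dereference already proved non-null, making the check dead code:

```java
// Hypothetical example of a FindBugs "redundant null check" (RCN) warning; not Hive code.
public class RedundantNullCheckExample {

    static String describe(Object expr) {
        if (expr == null) {
            return "<null>";
        }
        String s = expr.toString(); // expr is provably non-null past the early return
        // Redundant check: at this point expr can never be null, so the
        // condition is always true and the else-path is unreachable.
        if (expr != null) {
            return s;
        }
        return "unreachable";
    }

    public static void main(String[] args) {
        System.out.println(describe(null)); // prints "<null>"
        System.out.println(describe(42));   // prints "42"
    }
}
```

Such checks are harmless at runtime, but analyzers flag them because they usually indicate control flow that drifted during refactoring; the fix is simply to delete the second check.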
[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20245: Attachment: HIVE-20245.03.patch > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, > HIVE-20245.03.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20245: Status: Patch Available (was: In Progress) > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch, > HIVE-20245.03.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20245) Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN
[ https://issues.apache.org/jira/browse/HIVE-20245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-20245: Status: In Progress (was: Patch Available) > Vectorization: Fix NULL / Wrong Results issues in BETWEEN / IN > -- > > Key: HIVE-20245 > URL: https://issues.apache.org/jira/browse/HIVE-20245 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-20245.01.patch, HIVE-20245.02.patch > > > Write new UT tests that use random data and intentional isRepeating batches > to check for NULL and Wrong Results for vectorized BETWEEN and IN. > Add missing vectorization classes for BETWEEN PROJECTION. -- This message was sent by Atlassian JIRA (v7.6.3#76005)