[jira] [Updated] (HIVE-20911) External Table Replication for Hive
[ https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

anishek updated HIVE-20911:
---------------------------
    Attachment: HIVE-20911.07.patch

> External Table Replication for Hive
> -----------------------------------
>
>                 Key: HIVE-20911
>                 URL: https://issues.apache.org/jira/browse/HIVE-20911
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 4.0.0
>            Reporter: anishek
>            Assignee: anishek
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, HIVE-20911.06.patch, HIVE-20911.07.patch
>
> External tables are currently not replicated as part of Hive replication. This jira enables that.
> Approach:
> * The target cluster will have a top-level base directory config that is used to copy all data relevant to external tables. It is provided via the *with* clause of the *repl load* command, and the base path is prefixed to the path of the same external table on the source cluster:
> {code}
> hive.repl.replica.external.table.base.dir=/
> {code}
> * Since the directories backing an external table can change without Hive knowing it, we cannot capture the relevant events whenever data is added or removed. We therefore have to copy the data from the source path to the target path for external tables every time incremental replication runs.
> ** This requires incremental *repl dump* to create an additional file *\_external\_tables\_info* with entries of the form:
> {code}
> tableName,base64Encoded(tableDataLocation)
> {code}
> If partitions of a table point to different locations, the file contains multiple entries for the same table name, one per partition location. Partitions created without the _set location_ command lie within the table's data location and therefore do not get separate entries.
> ** *repl load* reads *\_external\_tables\_info* to identify which locations are to be copied from source to target, and creates the corresponding tasks.
> * New external tables are created metadata-only; no data is copied as part of the regular tasks during incremental/bootstrap load.
> * Bootstrap dump also creates *\_external\_tables\_info*, which is used to copy data from source to target as part of bootstrap load.
> * Bootstrap load creates a DAG that can exploit parallelism in the execution phase; the HDFS copy tasks are created once the bootstrap phase is complete.
> * Incremental load results in a DAG with only sequential execution (events applied in sequence), so to make effective use of execution-phase parallelism we create the HDFS copy tasks alongside the incremental DAG. This requires a few basic calculations to approximately meet the configured value of "hive.repl.approx.max.load.tasks".

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
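For illustration, the `tableName,base64Encoded(tableDataLocation)` entry format described in HIVE-20911 can be sketched in a few lines of Java. This is a hypothetical sketch of the format only, not Hive's actual implementation; the class name, method names, and the sample HDFS path are all made up. Base64-encoding the location keeps commas or other special characters in the path from breaking the comma-separated layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative sketch of one _external_tables_info line:
//   tableName,base64Encoded(tableDataLocation)
// Names here are hypothetical, not Hive's actual classes.
public class ExternalTableInfoEntry {

    // Build one entry; the location is base64-encoded so characters in the
    // path (including commas) cannot collide with the field separator.
    static String encode(String tableName, String tableDataLocation) {
        String encodedLocation = Base64.getEncoder()
                .encodeToString(tableDataLocation.getBytes(StandardCharsets.UTF_8));
        return tableName + "," + encodedLocation;
    }

    // Parse one entry back into {tableName, tableDataLocation}.
    static String[] decode(String line) {
        int comma = line.indexOf(',');  // base64 output never contains ','
        String tableName = line.substring(0, comma);
        String location = new String(
                Base64.getDecoder().decode(line.substring(comma + 1)),
                StandardCharsets.UTF_8);
        return new String[] { tableName, location };
    }

    public static void main(String[] args) {
        String entry = encode("sales", "hdfs://source-nn:8020/data/ext/sales");
        String[] decoded = decode(entry);
        System.out.println(entry);
        System.out.println(decoded[0] + " -> " + decoded[1]);
    }
}
```

A table whose partitions point at several locations would simply contribute several such lines with the same table name, matching the description above.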
[jira] [Commented] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725629#comment-16725629 ]

Hive QA commented on HIVE-20989:
--------------------------------

| (/) *{color:green}+1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 33s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 23s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15397/dev-support/hive-personality.sh |
| git revision | master / 1020be0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: service U: service |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15397/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> JDBC - The GetOperationStatus + log can block query progress via sleep()
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20989
>                 URL: https://issues.apache.org/jira/browse/HIVE-20989
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 4.0.0
>            Reporter: Gopal V
>            Assignee: Sankar Hariappan
>            Priority: Major
>         Attachments: HIVE-20989.01.patch
>
> There is an exponential sleep operation inside the CLIService which can end up adding tens of seconds to a query which has already completed.
[jira] [Commented] (HIVE-20911) External Table Replication for Hive
[ https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725625#comment-16725625 ]

Hive QA commented on HIVE-20911:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12952449/HIVE-20911.07.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 15721 tests executed

*Failed tests:*
{noformat}
TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
TestReplAcidTablesWithJsonMessage - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=251)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15396/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15396/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15396/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12952449 - PreCommit-HIVE-Build

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20989:
------------------------------------
    Status: Open  (was: Patch Available)

> JDBC - The GetOperationStatus + log can block query progress via sleep()
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20989
>                 URL: https://issues.apache.org/jira/browse/HIVE-20989
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 4.0.0
>            Reporter: Gopal V
>            Assignee: Sankar Hariappan
>            Priority: Major
>         Attachments: HIVE-20989.01.patch
>
> There is an exponential sleep operation inside the CLIService which can end up adding tens of seconds to a query which has already completed.
> {code}
> "HiveServer2-Handler-Pool: Thread-9373" #9373 prio=5 os_prio=0 tid=0x7f4d5e72d800 nid=0xb634a waiting on condition [0x7f28d06a5000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hive.service.cli.CLIService.progressUpdateLog(CLIService.java:506)
>         at org.apache.hive.service.cli.CLIService.getOperationStatus(CLIService.java:480)
>         at org.apache.hive.service.cli.thrift.ThriftCLIService.GetOperationStatus(ThriftCLIService.java:695)
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1757)
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1742)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The sleep loop is on the server side:
> {code}
>   private static final long PROGRESS_MAX_WAIT_NS = 30 * 1000000000l;
>
>   private JobProgressUpdate progressUpdateLog(boolean isProgressLogRequested, Operation operation, HiveConf conf) {
>     ...
>     long startTime = System.nanoTime();
>     int timeOutMs = 8;
>     try {
>       while (sessionState.getProgressMonitor() == null && !operation.isDone()) {
>         long remainingMs = (PROGRESS_MAX_WAIT_NS - (System.nanoTime() - startTime)) / 1000000l;
>         if (remainingMs <= 0) {
>           LOG.debug("timed out and hence returning progress log as NULL");
>           return new JobProgressUpdate(ProgressMonitor.NULL);
>         }
>         Thread.sleep(Math.min(remainingMs, timeOutMs));
>         timeOutMs <<= 1;
>       }
> {code}
> After about 16 seconds of query execution, timeOutMs has doubled to 16384 ms, so the next sleep cycle lasts min(30 - 17, 16) = 13 seconds.
> If the query finishes on the 17th second, the JDBC server will only respond after the 30th second, when it next checks operation.isDone() and returns.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
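The doubling arithmetic in the sleep loop above can be traced with a small standalone sketch. This is an illustration only, not Hive code: the class and method names are made up, and the nanosecond budget from the snippet is expressed in milliseconds for simplicity.

```java
// Standalone trace of the exponential backoff in progressUpdateLog:
// the timeout starts at 8 ms and doubles each iteration, within a
// fixed 30-second overall budget (PROGRESS_MAX_WAIT_NS in ms).
public class BackoffTrace {

    // Returns the length of the final sleep (ms), i.e. the worst-case
    // delay before the server re-checks operation.isDone().
    static long finalSleepMs(long maxWaitMs, int initialTimeOutMs) {
        long elapsedMs = 0;
        int timeOutMs = initialTimeOutMs;
        long sleepMs = 0;
        while (elapsedMs < maxWaitMs) {
            long remainingMs = maxWaitMs - elapsedMs;
            sleepMs = Math.min(remainingMs, timeOutMs);  // as in the snippet
            elapsedMs += sleepMs;
            timeOutMs <<= 1;  // exponential doubling
        }
        return sleepMs;
    }

    public static void main(String[] args) {
        // Sleeps run 8, 16, 32, ..., 8192 ms (16376 ms total), then the
        // remaining 13624 ms is slept in one go: a query finishing early
        // in that window is not noticed for ~13.6 s, matching the
        // min(30 - 17, 16) = 13 discussion in the issue.
        System.out.println(finalSleepMs(30_000, 8));  // prints 13624
    }
}
```

The trace makes the failure mode concrete: because the loop only re-checks isDone() between sleeps, the last doubled interval is where a completed query sits unacknowledged.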
[jira] [Updated] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20989:
------------------------------------
    Attachment: (was: HIVE-20989.01.patch)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HIVE-20911) External Table Replication for Hive
[ https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725622#comment-16725622 ]

Hive QA commented on HIVE-20911:
--------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 30s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 29s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 47s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 38s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 28s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 39s{color} | {color:red} ql: The patch generated 13 new + 390 unchanged - 12 fixed = 403 total (was 402) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s{color} | {color:red} itests/hive-unit: The patch generated 24 new + 737 unchanged - 37 fixed = 761 total (was 774) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 48s{color} | {color:red} ql generated 2 new + 2309 unchanged - 1 fixed = 2311 total (was 2310) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 59s{color} | {color:black} {color} |

|| Reason || Tests ||
| FindBugs | module:ql |
|  | The field org.apache.hadoop.hive.ql.exec.repl.ReplLoadWork.pathsToCopyIterator is transient but isn't set by deserialization  In ReplLoadWork.java |
|  | Write to static field org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.numIteration from instance method org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.build(DriverContext, Hive, Logger, ReplLoadWork, TaskTracker)  At IncrementalLoadTasksBuilder.java:[line 100] |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15396/dev-support/hive-personality.sh |
| git revision | master / 1020be0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15396/yetus/diff-checkstyle-ql.txt |
| checkstyle |
[jira] [Updated] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20989:
------------------------------------
    Status: Patch Available  (was: Open)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sankar Hariappan updated HIVE-20989:
------------------------------------
    Attachment: HIVE-20989.01.patch

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HIVE-21059) Support external catalogs
[ https://issues.apache.org/jira/browse/HIVE-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-21059: - Description: Hive has ability to query data from external sources such as other RDBMS, Kafka, Druid, Hbase. For example, to be able to query data from external sources such as a mysql table, an external table has to be explicitly created in Hive for every table in mysql that needs to be made accessible. Moreover, for creating such a table, the schema and login credentials have to be specified. By supporting "external catalogs" in Hive, we can have references to all tables in an entire mysql database by just creating one external catalog. The schema of the tables would also get automatically detected from the underlying source. Where possible, additional information such as statistics of the tables can also be imported from the underlying datasource, to enable Hive cost based optimizer to create optimized query plans. To be able to support the use of external catalog, some of the work tracked under HIVE-18685 for catalog support (including catalog in SQL syntax of Hive) is also needed. was: Hive has ability to query data from external sources such as other RDBMS, Kafka, Druid, Hbase. For example, to be able to query data from external sources such as a mysql table, an external table has to be explicitly created in Hive for every table in mysql that needs to be made accessible. Moreover, for creating such a table, the schema and login credentials have to be specified. By supporting "external catalogs" in Hive, we can have references to all tables in an entire mysql database by just creating one external catalog. The schema of the tables would also get automatically detected from the underlying source. Where possible, additional information such as statistics of the tables can also be imported from the underlying datasource, to enable Hive cost based optimizer to create optimized query plans. 
> Support external catalogs > - > > Key: HIVE-21059 > URL: https://issues.apache.org/jira/browse/HIVE-21059 > Project: Hive > Issue Type: New Feature >Reporter: Thejas M Nair >Priority: Critical > > Hive has the ability to query data from external sources such as other RDBMSs, > Kafka, Druid, and HBase. > For example, to be able to query data from an external source such as a MySQL > table, an external table has to be explicitly created in Hive for every table > in MySQL that needs to be made accessible. > Moreover, for creating such a table, the schema and login credentials have to > be specified. > By supporting "external catalogs" in Hive, we can have references to all > tables in an entire MySQL database by creating just one external catalog. The > schema of the tables would also be detected automatically from the > underlying source. > Where possible, additional information such as table statistics can > also be imported from the underlying data source, to enable the Hive > cost-based optimizer to create optimized query plans. > To support the use of external catalogs, some of the work tracked > under HIVE-18685 for catalog support (including catalogs in Hive's SQL > syntax) is also needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725599#comment-16725599 ] Hive QA commented on HIVE-21055: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952446/HIVE-21055.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 15719 tests executed *Failed tests:* {noformat} TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) (batchId=251) TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=251) TestReplAcidTablesWithJsonMessage - did not produce a TEST-*.xml file (likely timed out) (batchId=251) TestReplicationScenariosIncrementalLoadAcidTables - did not produce a TEST-*.xml file (likely timed out) (batchId=249) TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=251) org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationManagedToAcid (batchId=244) org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationManagedToAcidAllOp (batchId=244) org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationManagedToAcidFailure (batchId=244) org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationManagedToAcidFailurePart (batchId=244) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15395/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15395/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15395/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase 
Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12952446 - PreCommit-HIVE-Build > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch, HIVE-21055.02.patch > > > For the repl load command, the user can specify the execution mode as part of > the "with" clause. But the config for executing tasks in parallel or serial is > not read from the command-specific config; it is read from the hive server > config. So even if the user specifies to run the tasks in parallel during the > repl load command, the tasks are executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
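The fix pattern the HIVE-21055 description implies can be sketched in plain Java. This is a hedged illustration, not Hive's actual classes or method names: the point is only that the command-scoped value from the *with* clause must be consulted before falling back to the server-wide config.

```java
// Illustrative sketch (hypothetical names, not Hive's API): resolve the
// parallel-execution setting from the REPL LOAD command's WITH-clause override
// first, and only fall back to the HiveServer2-wide configuration value.
public class CommandConfResolver {

    // withClauseValue is null when the user did not override the key in the
    // WITH clause; in that case the server config value applies.
    static boolean parallelEnabled(String withClauseValue, String serverConfValue) {
        String effective = (withClauseValue != null) ? withClauseValue : serverConfValue;
        return Boolean.parseBoolean(effective);
    }

    public static void main(String[] args) {
        // Server default says serial, but the user's WITH clause asked for parallel:
        System.out.println(parallelEnabled("true", "false")); // prints "true"
    }
}
```

The bug described above is equivalent to always reading `serverConfValue`; honoring the override first is the behavior the user expects from the *with* clause.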
[jira] [Commented] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725586#comment-16725586 ] Hive QA commented on HIVE-21055: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 30s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 39s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15395/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15395/yetus.txt | | Powered by | Apache Yetus http://yetus.apache.org | This message was automatically generated. > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch, HIVE-21055.02.patch > > > For the repl load command, the user can specify the execution mode as part of > the "with" clause. But the config for executing tasks in parallel or serial is > not read from the command-specific config; it is read from the hive server > config. So even if the user specifies to run the tasks in parallel during the > repl load command, the tasks are executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725579#comment-16725579 ] Hive QA commented on HIVE-21040: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952437/HIVE-21040.03.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15394/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15394/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15394/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12952437/HIVE-21040.03.patch was found in seen patch url's cache and a test was probably run already on it. Aborting... {noformat} This message is automatically generated. ATTACHMENT ID: 12952437 - PreCommit-HIVE-Build > msck does unnecessary file listing at last level of directory tree > -- > > Key: HIVE-21040 > URL: https://issues.apache.org/jira/browse/HIVE-21040 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, > HIVE-21040.03.patch > > > Here is the code snippet which is run by {{msck}} to list directories > {noformat} > final Path currentPath = pd.p; > final int currentDepth = pd.depth; > FileStatus[] fileStatuses = fs.listStatus(currentPath, > FileUtils.HIDDEN_FILES_PATH_FILTER); > // found no files under a sub-directory under table base path; it is > possible that the table > // is empty and hence there are no partition sub-directories created > under base path > if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < > partColNames.size()) { > // since maxDepth is not yet reached, we are missing partition > // columns in currentPath > 
logOrThrowExceptionWithMsg( > "MSCK is missing partition columns under " + > currentPath.toString()); > } else { > // found files under currentPath add them to the queue if it is a > directory > for (FileStatus fileStatus : fileStatuses) { > if (!fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a file at depth which is less than number of partition > keys > logOrThrowExceptionWithMsg( > "MSCK finds a file rather than a directory when it searches > for " > + fileStatus.getPath().toString()); > } else if (fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a sub-directory at a depth less than number of partition > keys > // validate if the partition directory name matches with the > corresponding > // partition colName at currentDepth > Path nextPath = fileStatus.getPath(); > String[] parts = nextPath.getName().split("="); > if (parts.length != 2) { > logOrThrowExceptionWithMsg("Invalid partition name " + > nextPath); > } else if > (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) { > logOrThrowExceptionWithMsg( > "Unexpected partition key " + parts[0] + " found at " + > nextPath); > } else { > // add sub-directory to the work queue if maxDepth is not yet > reached > pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1)); > } > } > } > if (currentDepth == partColNames.size()) { > return currentPath; > } > } > {noformat} > You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does > an unnecessary listing of the files. We can improve this call by checking the > currentDepth and bailing out early. > This can improve the performance of the msck command significantly, especially > when there are a lot of files in each partition on remote filesystems like S3 > or ADLS -- This message was sent by Atlassian JIRA (v7.6.3#76005)
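The early bailout suggested in HIVE-21040 can be sketched in self-contained Java. The filesystem listing is faked here so the example runs without Hadoop, and `PathDepthInfo`/`walk` are simplified stand-ins for the real traversal; the point is only the ordering: check `currentDepth` against the number of partition columns before issuing the `listStatus`-equivalent call.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Simplified sketch of the early-bailout idea (plain Java, no Hadoop): when
// currentDepth already equals the number of partition columns, skip the
// listing call entirely instead of listing leaf directories and discarding
// the result.
public class MsckEarlyBail {
    static int listCalls = 0;  // counts how many directory listings we issue

    static class PathDepthInfo {  // mirrors the helper in the snippet above
        final String p; final int depth;
        PathDepthInfo(String p, int depth) { this.p = p; this.depth = depth; }
    }

    // Stand-in for fs.listStatus(...); every directory has two sub-directories.
    static List<String> fakeListStatus(String path) {
        listCalls++;
        return Arrays.asList(path + "/k=0", path + "/k=1");
    }

    static void walk(String tableRoot, int numPartCols) {
        Deque<PathDepthInfo> pendingPaths = new ArrayDeque<>();
        pendingPaths.add(new PathDepthInfo(tableRoot, 0));
        while (!pendingPaths.isEmpty()) {
            PathDepthInfo pd = pendingPaths.poll();
            if (pd.depth == numPartCols) {
                continue;  // bail out BEFORE the expensive listing at max depth
            }
            for (String child : fakeListStatus(pd.p)) {
                pendingPaths.add(new PathDepthInfo(child, pd.depth + 1));
            }
        }
    }

    public static void main(String[] args) {
        walk("/warehouse/t", 2);
        // Only depths 0 and 1 are listed (1 + 2 directories); depth 2 never is.
        System.out.println(listCalls); // prints "3"
    }
}
```

Without the depth check before listing, the leaf partition directories would be listed too, which is exactly the wasted work on S3/ADLS the issue describes.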
[jira] [Commented] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725578#comment-16725578 ] Hive QA commented on HIVE-20936: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952438/HIVE-20936.13.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15737 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.TestTxnCommandsForMmTable.testSnapshotIsolationWithAbortedTxnOnMmTable (batchId=284) org.apache.hadoop.hive.ql.TestTxnCommandsForOrcMmTable.testSnapshotIsolationWithAbortedTxnOnMmTable (batchId=304) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15393/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15393/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15393/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12952438 - PreCommit-HIVE-Build > Allow the Worker thread in the metastore to run outside of it > - > > Key: HIVE-20936 > URL: https://issues.apache.org/jira/browse/HIVE-20936 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: HIVE-20936.1.patch, HIVE-20936.10.patch, > HIVE-20936.11.patch, HIVE-20936.12.patch, HIVE-20936.13.patch, > HIVE-20936.2.patch, HIVE-20936.3.patch, HIVE-20936.4.patch, > HIVE-20936.5.patch, HIVE-20936.6.patch, HIVE-20936.7.patch, > HIVE-20936.8.patch, HIVE-20936.8.patch > > > Currently the Worker thread in the metastore is bound to the metastore, > mainly because of the TxnHandler that it has. This thread runs some map > reduce jobs, which may not be an option wherever the metastore is > running. A solution for this can be to run this thread in HS2, depending on a > flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20911) External Table Replication for Hive
[ https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anishek updated HIVE-20911: --- Attachment: (was: HIVE-20911.07.patch) > External Table Replication for Hive > --- > > Key: HIVE-20911 > URL: https://issues.apache.org/jira/browse/HIVE-20911 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: anishek >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, > HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, > HIVE-20911.06.patch, HIVE-20911.07.patch > > > External tables are currently not replicated as part of Hive replication. As > part of this jira we want to enable that. > Approach: > * The target cluster will have a top-level base directory config that will be > used to copy all data relevant to external tables. This will be provided via > the *with* clause in the *repl load* command. This base path will be prefixed > to the path of the same external table on the source cluster. This can be > provided using the following configuration: > {code} > hive.repl.replica.external.table.base.dir=/ > {code} > * Since changes to the directories of an external table can happen without Hive > knowing about it, we can't capture the relevant events whenever new data is > added or removed; we will have to copy the data from the source path to the > target path for external tables every time we run incremental replication. > ** This will require incremental *repl dump* to now create an additional > file, *\_external\_tables\_info*, with data in the following form: > {code} > tableName,base64Encoded(tableDataLocation) > {code} > In case different partitions of the table point to different > locations, there will be multiple entries in the file for the same table name, > with the locations pointing to the different partition locations. 
Partitions > created in a table without the _set location_ command will be > within the same table data location, and hence there will not be separate > entries in the file above. > ** *repl load* will read *\_external\_tables\_info* to identify which > locations are to be copied from source to target, and will create corresponding > tasks for them. > * New external tables will be created with metadata only, with no data copied > as part of regular tasks during incremental/bootstrap load. > * Bootstrap dump will also create *\_external\_tables\_info*, which will be > used to copy data from source to target as part of bootstrap load. > * Bootstrap load will create a DAG that can use parallelism in the execution > phase; the hdfs copy related tasks are created once the bootstrap phase is > complete. > * Since incremental load results in a DAG with only sequential execution > (events applied in sequence), to effectively use the parallelism capability in > execution mode we create tasks for hdfs copy along with the incremental DAG. > This requires a few basic calculations to approximately meet the configured > value in "hive.repl.approx.max.load.tasks" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
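The `tableName,base64Encoded(tableDataLocation)` line format described above can be illustrated with a short, self-contained sketch. The class and helper names here are assumptions for illustration, not Hive's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative sketch of one line of the _external_tables_info file in the
// "tableName,base64Encoded(tableDataLocation)" form. Helper names are
// hypothetical, not Hive's code.
public class ExternalTablesInfoLine {

    // Builds one line for a table (or partition) data location.
    static String encode(String tableName, String dataLocation) {
        String b64 = Base64.getEncoder()
                .encodeToString(dataLocation.getBytes(StandardCharsets.UTF_8));
        return tableName + "," + b64;
    }

    // Splits a line back into { tableName, dataLocation }.
    static String[] decode(String line) {
        int comma = line.indexOf(',');
        byte[] raw = Base64.getDecoder().decode(line.substring(comma + 1));
        return new String[]{ line.substring(0, comma),
                             new String(raw, StandardCharsets.UTF_8) };
    }

    public static void main(String[] args) {
        String line = encode("sales_ext", "hdfs://src:8020/data/sales");
        String[] parsed = decode(line);
        System.out.println(parsed[0] + " -> " + parsed[1]);
    }
}
```

Base64-encoding the location means commas or other special characters in the path cannot break the comma-separated line format, which is presumably why the location is not stored as plain text.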
[jira] [Updated] (HIVE-20911) External Table Replication for Hive
[ https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anishek updated HIVE-20911: --- Attachment: HIVE-20911.07.patch > External Table Replication for Hive > --- > > Key: HIVE-20911 > URL: https://issues.apache.org/jira/browse/HIVE-20911 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: anishek >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, > HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, > HIVE-20911.06.patch, HIVE-20911.07.patch > > > External tables are currently not replicated as part of Hive replication. As > part of this jira we want to enable that. > Approach: > * The target cluster will have a top-level base directory config that will be > used to copy all data relevant to external tables. This will be provided via > the *with* clause in the *repl load* command. This base path will be prefixed > to the path of the same external table on the source cluster. This can be > provided using the following configuration: > {code} > hive.repl.replica.external.table.base.dir=/ > {code} > * Since changes to the directories of an external table can happen without Hive > knowing about it, we can't capture the relevant events whenever new data is > added or removed; we will have to copy the data from the source path to the > target path for external tables every time we run incremental replication. > ** This will require incremental *repl dump* to now create an additional > file, *\_external\_tables\_info*, with data in the following form: > {code} > tableName,base64Encoded(tableDataLocation) > {code} > In case different partitions of the table point to different > locations, there will be multiple entries in the file for the same table name, > with the locations pointing to the different partition locations. 
Partitions > created in a table without the _set location_ command will be > within the same table data location, and hence there will not be separate > entries in the file above. > ** *repl load* will read *\_external\_tables\_info* to identify which > locations are to be copied from source to target, and will create corresponding > tasks for them. > * New external tables will be created with metadata only, with no data copied > as part of regular tasks during incremental/bootstrap load. > * Bootstrap dump will also create *\_external\_tables\_info*, which will be > used to copy data from source to target as part of bootstrap load. > * Bootstrap load will create a DAG that can use parallelism in the execution > phase; the hdfs copy related tasks are created once the bootstrap phase is > complete. > * Since incremental load results in a DAG with only sequential execution > (events applied in sequence), to effectively use the parallelism capability in > execution mode we create tasks for hdfs copy along with the incremental DAG. > This requires a few basic calculations to approximately meet the configured > value in "hive.repl.approx.max.load.tasks" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
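The base-path prefixing rule above (the target's `hive.repl.replica.external.table.base.dir` prefixed to the source table's path) can be sketched as follows. How the source scheme and authority are handled is an assumption here; this sketch simply drops them and keeps the path portion.

```java
import java.net.URI;

// Hedged sketch of the path-mapping rule: prefix the target cluster's base
// directory to the source table's path. Dropping the source scheme/authority
// is an illustrative assumption, not necessarily what Hive does.
public class ExternalTableTargetPath {

    static String targetPath(String baseDir, String sourceLocation) {
        String srcPath = URI.create(sourceLocation).getPath(); // e.g. /warehouse/ext/t
        String base = baseDir.endsWith("/")
                ? baseDir.substring(0, baseDir.length() - 1) : baseDir;
        return base + srcPath;
    }

    public static void main(String[] args) {
        System.out.println(
            targetPath("/replica/base", "hdfs://source:8020/warehouse/ext/sales"));
        // prints "/replica/base/warehouse/ext/sales"
    }
}
```

Because the base directory is simply prepended, two source tables with distinct paths can never collide under the replica base directory.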
[jira] [Commented] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725570#comment-16725570 ] Hive QA commented on HIVE-20936: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 56s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 8s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 4s{color} | {color:blue} standalone-metastore/metastore-server in master has 189 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 45s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 35s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 27s{color} | {color:blue} jdbc in master has 17 extant Findbugs warnings. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 24s{color} | {color:blue} hcatalog/streaming in master has 11 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 25s{color} | {color:blue} streaming in master has 2 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 8s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 44s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 5 new + 641 unchanged - 6 fixed = 646 total (was 647) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 169 unchanged - 0 fixed = 170 total (was 169) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 107 line(s) that end in whitespace. Use git apply --whitespace=fix <>. 
Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 53s{color} | {color:red} ql generated 3 new + 2309 unchanged - 1 fixed = 2312 total (was 2310) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 47s{color} | {color:red} standalone-metastore_metastore-common generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 52m 40s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Field MetaStoreCompactorThread.threadId masks field in superclass org.apache.hadoop.hive.ql.txn.compactor.CompactorThread In MetaStoreCompactorThread.java:superclass org.apache.hadoop.hive.ql.txn.compactor.CompactorThread In MetaStoreCompactorThread.java | | | Field MetaStoreCompactorThread.rs masks field in superclass
[jira] [Commented] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications
[ https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725537#comment-16725537 ] Hive QA commented on HIVE-21027: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952433/HIVE-21027.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15737 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_inner_join] (batchId=192) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15390/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15390/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15390/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12952433 - PreCommit-HIVE-Build > Add a configuration to include entire thrift objects in HMS notifications > - > > Key: HIVE-21027 > URL: https://issues.apache.org/jira/browse/HIVE-21027 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 4.0.0 >Reporter: Bharathkrishna Guruvayoor Murali >Assignee: Bharathkrishna Guruvayoor Murali >Priority: Major > Attachments: HIVE-21027.1.patch > > > Currently, we add the full thrift objects of Table / Partition in the HMS > notification messages, starting from HIVE-15180. > We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this > under a flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
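The flag-gating proposed in HIVE-21027 can be sketched generically. All names below (class, method, and the JSON keys) are illustrative; they are not the actual HMS notification API, and the real change would serialize the thrift Table/Partition objects rather than take a pre-serialized string.

```java
// Hypothetical sketch: embed the serialized table object in the notification
// payload only when the configuration flag enables it.
public class NotificationMessageBuilder {

    static String build(String eventType, String serializedTableObj,
                        boolean addThriftObjects) {
        StringBuilder json = new StringBuilder()
                .append("{\"eventType\":\"").append(eventType).append('"');
        if (addThriftObjects) {  // the NOTIFICATIONS_ADD_THRIFT_OBJECTS-style switch
            json.append(",\"tableObjJson\":\"").append(serializedTableObj).append('"');
        }
        return json.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.println(build("ALTER_TABLE", "<thrift-json>", true));
    }
}
```

With the flag off, consumers see the same compact messages as before, which is the backward-compatibility argument for making this configurable.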
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Attachment: HIVE-21055.02.patch > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch, HIVE-21055.02.patch > > > For the repl load command, the user can specify the execution mode as part of > the "with" clause. But the config for executing tasks in parallel or serial is > not read from the command-specific config; it is read from the hive server > config. So even if the user specifies to run the tasks in parallel during the > repl load command, the tasks are executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Status: Patch Available (was: Open) > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch, HIVE-21055.02.patch > > > For the repl load command, the user can specify the execution mode as part of > the "with" clause. But the config for executing tasks in parallel or serial is > not read from the command-specific config; it is read from the hive server > config. So even if the user specifies to run the tasks in parallel during the > repl load command, the tasks are executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Status: Open (was: Patch Available) > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch, HIVE-21055.02.patch > > > For the repl load command, the user can specify the execution mode as part of > the "with" clause. But the config for executing tasks in parallel or serial is > not read from the command-specific config; it is read from the hive server > config. So even if the user specifies to run the tasks in parallel during the > repl load command, the tasks are executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725539#comment-16725539 ] Hive QA commented on HIVE-21040: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952437/HIVE-21040.03.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15391/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15391/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15391/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2018-12-20 02:58:09.927 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-15391/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2018-12-20 02:58:09.930 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 1020be0 HIVE-20943: Handle Compactor transaction abort properly (Eugene Koifman, reviewed by Vaibhav Gumashta) + git clean -f -d Removing standalone-metastore/metastore-server/src/gen/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 1020be0 HIVE-20943: Handle Compactor transaction abort properly (Eugene Koifman, reviewed by Vaibhav Gumashta) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2018-12-20 02:58:11.357 + rm -rf ../yetus_PreCommit-HIVE-Build-15391 + mkdir ../yetus_PreCommit-HIVE-Build-15391 + git gc + cp -R . ../yetus_PreCommit-HIVE-Build-15391 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-15391/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java: does not exist in index error: a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java: does not exist in index Going to apply patch with: git apply -p1 + [[ maven == \m\a\v\e\n ]] + rm -rf /data/hiveptest/working/maven/org/apache/hive + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven protoc-jar: executing: [/tmp/protoc3062516196038677332.exe, --version] libprotoc 2.5.0 protoc-jar: executing: [/tmp/protoc3062516196038677332.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore, --java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/target/generated-sources, /data/hiveptest/working/apache-github-source-source/standalone-metastore/metastore-common/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto] ANTLR Parser Generator Version 3.5.2 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (process-resource-bundles) on project hive-shims: Execution process-resource-bundles of goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process failed. ConcurrentModificationException -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hive-shims + result=1 + '[' 1 -ne 0 ']' + rm
[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725540#comment-16725540 ] Hive QA commented on HIVE-21040: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952437/HIVE-21040.03.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15392/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15392/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15392/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12952437/HIVE-21040.03.patch was found in seen patch url's cache and a test was probably run already on it. Aborting... {noformat} This message is automatically generated. ATTACHMENT ID: 12952437 - PreCommit-HIVE-Build > msck does unnecessary file listing at last level of directory tree > -- > > Key: HIVE-21040 > URL: https://issues.apache.org/jira/browse/HIVE-21040 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, > HIVE-21040.03.patch > > > Here is the code snippet which is run by {{msck}} to list directories > {noformat} > final Path currentPath = pd.p; > final int currentDepth = pd.depth; > FileStatus[] fileStatuses = fs.listStatus(currentPath, > FileUtils.HIDDEN_FILES_PATH_FILTER); > // found no files under a sub-directory under table base path; it is > possible that the table > // is empty and hence there are no partition sub-directories created > under base path > if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < > partColNames.size()) { > // since maxDepth is not yet reached, we are missing partition > // columns in currentPath > 
logOrThrowExceptionWithMsg( > "MSCK is missing partition columns under " + > currentPath.toString()); > } else { > // found files under currentPath add them to the queue if it is a > directory > for (FileStatus fileStatus : fileStatuses) { > if (!fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a file at depth which is less than number of partition > keys > logOrThrowExceptionWithMsg( > "MSCK finds a file rather than a directory when it searches > for " > + fileStatus.getPath().toString()); > } else if (fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a sub-directory at a depth less than number of partition > keys > // validate if the partition directory name matches with the > corresponding > // partition colName at currentDepth > Path nextPath = fileStatus.getPath(); > String[] parts = nextPath.getName().split("="); > if (parts.length != 2) { > logOrThrowExceptionWithMsg("Invalid partition name " + > nextPath); > } else if > (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) { > logOrThrowExceptionWithMsg( > "Unexpected partition key " + parts[0] + " found at " + > nextPath); > } else { > // add sub-directory to the work queue if maxDepth is not yet > reached > pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1)); > } > } > } > if (currentDepth == partColNames.size()) { > return currentPath; > } > } > {noformat} > You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does > an unnecessary listing of the files. We can improve this call by checking the > currentDepth and bailing out early. > This can improve the performance of the msck command significantly, especially > when there are a lot of files in each partition on remote filesystems like S3 > or ADLS -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications
[ https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725518#comment-16725518 ] Hive QA commented on HIVE-21027: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 30s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 10s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 10s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 6s{color} | {color:blue} standalone-metastore/metastore-server in master has 189 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 20m 38s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15390/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: standalone-metastore/metastore-common standalone-metastore/metastore-server U: standalone-metastore | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15390/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Add a configuration to include entire thrift objects in HMS notifications > - > > Key: HIVE-21027 > URL: https://issues.apache.org/jira/browse/HIVE-21027 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 4.0.0 >Reporter: Bharathkrishna Guruvayoor Murali >Assignee: Bharathkrishna Guruvayoor Murali >Priority: Major > Attachments: HIVE-21027.1.patch > > > Currently, we add the full thrift objects of Table / Partition in the HMS > notification messages, starting from HIVE-15180. > We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this > under a flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
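The behavior proposed in HIVE-21027 can be sketched as follows. This is an illustrative Python sketch, not the actual HMS Java code; the message fields and function name are hypothetical. The idea is simply that the full serialized object is attached to the notification message only when the new configuration flag is enabled.

```python
# Sketch of flag-gated notification payloads (HIVE-21027).
# "add_thrift_objects" stands in for the proposed
# NOTIFICATIONS_ADD_THRIFT_OBJECTS configuration.

def build_notification(event_type, table_name, table_obj, add_thrift_objects):
    msg = {"eventType": event_type, "table": table_name}
    if add_thrift_objects:
        # In Hive this would be the serialized Thrift Table object.
        msg["tableObjJson"] = table_obj
    return msg
```

Consumers that need the full object opt in via the flag; everyone else gets a smaller notification message.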
[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system
[ https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725503#comment-16725503 ] Hive QA commented on HIVE-21044: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952423/HIVE-21044.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15738 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15389/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15389/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15389/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12952423 - PreCommit-HIVE-Build > Add SLF4J reporter to the metastore metrics system > -- > > Key: HIVE-21044 > URL: https://issues.apache.org/jira/browse/HIVE-21044 > Project: Hive > Issue Type: New Feature > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Karthik Manamcheri >Priority: Minor > Labels: metrics > Attachments: HIVE-21044.1.patch > > > Lets add SLF4J reporter as an option in Metrics reporting system. Currently > we support JMX, JSON and Console reporting. > We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. > We can use the > {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}} > class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
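The reporter-selection scheme described in HIVE-21044 can be sketched as follows. This is an illustrative Python analogue; Hive's real implementation would use the Java com.codahale.metrics.Slf4jReporter class, and the reporter names below mirror the options mentioned in the issue rather than any Python API.

```python
# Sketch of choosing a metrics reporter by name, as with
# hive.service.metrics.reporter gaining a new SLF4J option.
import json
import logging

def make_reporter(name):
    reporters = {
        "CONSOLE": lambda metrics: print(json.dumps(metrics)),
        "JSON": lambda metrics: json.dumps(metrics),
        # The new option: route metrics through the logging framework,
        # analogous to dropwizard's Slf4jReporter.
        "SLF4J": lambda metrics: logging.getLogger("metrics").info(
            "metrics: %s", json.dumps(metrics)),
    }
    try:
        return reporters[name.upper()]
    except KeyError:
        raise ValueError("unknown metrics reporter: " + name)
```

An unrecognized reporter name fails fast with an error rather than silently falling back to a default.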
[jira] [Updated] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaume M updated HIVE-20936: --- Attachment: HIVE-20936.13.patch Status: Patch Available (was: Open) > Allow the Worker thread in the metastore to run outside of it > - > > Key: HIVE-20936 > URL: https://issues.apache.org/jira/browse/HIVE-20936 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: HIVE-20936.1.patch, HIVE-20936.10.patch, > HIVE-20936.11.patch, HIVE-20936.12.patch, HIVE-20936.13.patch, > HIVE-20936.2.patch, HIVE-20936.3.patch, HIVE-20936.4.patch, > HIVE-20936.5.patch, HIVE-20936.6.patch, HIVE-20936.7.patch, > HIVE-20936.8.patch, HIVE-20936.8.patch > > > Currently the Worker thread in the metastore in bounded to the metastore, > mainly because of the TxnHandler that it has. This thread runs some map > reduce jobs which may not only be an option wherever the metastore is > running. A solution for this can be to run this thread in HS2 depending on a > flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-21040: --- Attachment: HIVE-21040.03.patch > msck does unnecessary file listing at last level of directory tree > -- > > Key: HIVE-21040 > URL: https://issues.apache.org/jira/browse/HIVE-21040 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, > HIVE-21040.03.patch > > > Here is the code snippet which is run by {{msck}} to list directories > {noformat} > final Path currentPath = pd.p; > final int currentDepth = pd.depth; > FileStatus[] fileStatuses = fs.listStatus(currentPath, > FileUtils.HIDDEN_FILES_PATH_FILTER); > // found no files under a sub-directory under table base path; it is > possible that the table > // is empty and hence there are no partition sub-directories created > under base path > if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < > partColNames.size()) { > // since maxDepth is not yet reached, we are missing partition > // columns in currentPath > logOrThrowExceptionWithMsg( > "MSCK is missing partition columns under " + > currentPath.toString()); > } else { > // found files under currentPath add them to the queue if it is a > directory > for (FileStatus fileStatus : fileStatuses) { > if (!fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a file at depth which is less than number of partition > keys > logOrThrowExceptionWithMsg( > "MSCK finds a file rather than a directory when it searches > for " > + fileStatus.getPath().toString()); > } else if (fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a sub-directory at a depth less than number of partition > keys > // validate if the partition directory name matches with the > corresponding > // partition colName at currentDepth > Path nextPath = 
fileStatus.getPath(); > String[] parts = nextPath.getName().split("="); > if (parts.length != 2) { > logOrThrowExceptionWithMsg("Invalid partition name " + > nextPath); > } else if > (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) { > logOrThrowExceptionWithMsg( > "Unexpected partition key " + parts[0] + " found at " + > nextPath); > } else { > // add sub-directory to the work queue if maxDepth is not yet > reached > pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1)); > } > } > } > if (currentDepth == partColNames.size()) { > return currentPath; > } > } > {noformat} > You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does > an unnecessary listing of the files. We can improve this call by checking the > currentDepth and bailing out early. > This can improve the performance of the msck command significantly, especially > when there are a lot of files in each partition on remote filesystems like S3 > or ADLS -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725497#comment-16725497 ] Vihang Karajgaonkar edited comment on HIVE-21040 at 12/20/18 1:33 AM: -- v3 fixes a minor typo in the test {{TestHiveMetastoreChecker}} and simplify the else condition in {{HiveMetastoreChecker}} further was (Author: vihangk1): v3 fixes a minor typo in the test {{TestHiveMetastoreChecker}}. > msck does unnecessary file listing at last level of directory tree > -- > > Key: HIVE-21040 > URL: https://issues.apache.org/jira/browse/HIVE-21040 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, > HIVE-21040.03.patch > > > Here is the code snippet which is run by {{msck}} to list directories > {noformat} > final Path currentPath = pd.p; > final int currentDepth = pd.depth; > FileStatus[] fileStatuses = fs.listStatus(currentPath, > FileUtils.HIDDEN_FILES_PATH_FILTER); > // found no files under a sub-directory under table base path; it is > possible that the table > // is empty and hence there are no partition sub-directories created > under base path > if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < > partColNames.size()) { > // since maxDepth is not yet reached, we are missing partition > // columns in currentPath > logOrThrowExceptionWithMsg( > "MSCK is missing partition columns under " + > currentPath.toString()); > } else { > // found files under currentPath add them to the queue if it is a > directory > for (FileStatus fileStatus : fileStatuses) { > if (!fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a file at depth which is less than number of partition > keys > logOrThrowExceptionWithMsg( > "MSCK finds a file rather than a directory when it searches > for " > + fileStatus.getPath().toString()); > } else if (fileStatus.isDirectory() && 
currentDepth < > partColNames.size()) { > // found a sub-directory at a depth less than number of partition > keys > // validate if the partition directory name matches with the > corresponding > // partition colName at currentDepth > Path nextPath = fileStatus.getPath(); > String[] parts = nextPath.getName().split("="); > if (parts.length != 2) { > logOrThrowExceptionWithMsg("Invalid partition name " + > nextPath); > } else if > (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) { > logOrThrowExceptionWithMsg( > "Unexpected partition key " + parts[0] + " found at " + > nextPath); > } else { > // add sub-directory to the work queue if maxDepth is not yet > reached > pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1)); > } > } > } > if (currentDepth == partColNames.size()) { > return currentPath; > } > } > {noformat} > You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does > an unnecessary listing of the files. We can improve this call by checking the > currentDepth and bailing out early. > This can improve the performance of the msck command significantly, especially > when there are a lot of files in each partition on remote filesystems like S3 > or ADLS -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-21040: --- Attachment: HIVE-21040.03.patch > msck does unnecessary file listing at last level of directory tree > -- > > Key: HIVE-21040 > URL: https://issues.apache.org/jira/browse/HIVE-21040 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, > HIVE-21040.03.patch > > > Here is the code snippet which is run by {{msck}} to list directories > {noformat} > final Path currentPath = pd.p; > final int currentDepth = pd.depth; > FileStatus[] fileStatuses = fs.listStatus(currentPath, > FileUtils.HIDDEN_FILES_PATH_FILTER); > // found no files under a sub-directory under table base path; it is > possible that the table > // is empty and hence there are no partition sub-directories created > under base path > if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < > partColNames.size()) { > // since maxDepth is not yet reached, we are missing partition > // columns in currentPath > logOrThrowExceptionWithMsg( > "MSCK is missing partition columns under " + > currentPath.toString()); > } else { > // found files under currentPath add them to the queue if it is a > directory > for (FileStatus fileStatus : fileStatuses) { > if (!fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a file at depth which is less than number of partition > keys > logOrThrowExceptionWithMsg( > "MSCK finds a file rather than a directory when it searches > for " > + fileStatus.getPath().toString()); > } else if (fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a sub-directory at a depth less than number of partition > keys > // validate if the partition directory name matches with the > corresponding > // partition colName at currentDepth > Path nextPath = 
fileStatus.getPath(); > String[] parts = nextPath.getName().split("="); > if (parts.length != 2) { > logOrThrowExceptionWithMsg("Invalid partition name " + > nextPath); > } else if > (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) { > logOrThrowExceptionWithMsg( > "Unexpected partition key " + parts[0] + " found at " + > nextPath); > } else { > // add sub-directory to the work queue if maxDepth is not yet > reached > pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1)); > } > } > } > if (currentDepth == partColNames.size()) { > return currentPath; > } > } > {noformat} > You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does > an unnecessary listing of the files. We can improve this call by checking the > currentDepth and bailing out early. > This can improve the performance of the msck command significantly, especially > when there are a lot of files in each partition on remote filesystems like S3 > or ADLS -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-21040: --- Attachment: (was: HIVE-21040.03.patch) > msck does unnecessary file listing at last level of directory tree > -- > > Key: HIVE-21040 > URL: https://issues.apache.org/jira/browse/HIVE-21040 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, > HIVE-21040.03.patch > > > Here is the code snippet which is run by {{msck}} to list directories > {noformat} > final Path currentPath = pd.p; > final int currentDepth = pd.depth; > FileStatus[] fileStatuses = fs.listStatus(currentPath, > FileUtils.HIDDEN_FILES_PATH_FILTER); > // found no files under a sub-directory under table base path; it is > possible that the table > // is empty and hence there are no partition sub-directories created > under base path > if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < > partColNames.size()) { > // since maxDepth is not yet reached, we are missing partition > // columns in currentPath > logOrThrowExceptionWithMsg( > "MSCK is missing partition columns under " + > currentPath.toString()); > } else { > // found files under currentPath add them to the queue if it is a > directory > for (FileStatus fileStatus : fileStatuses) { > if (!fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a file at depth which is less than number of partition > keys > logOrThrowExceptionWithMsg( > "MSCK finds a file rather than a directory when it searches > for " > + fileStatus.getPath().toString()); > } else if (fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a sub-directory at a depth less than number of partition > keys > // validate if the partition directory name matches with the > corresponding > // partition colName at currentDepth > Path nextPath = 
fileStatus.getPath(); > String[] parts = nextPath.getName().split("="); > if (parts.length != 2) { > logOrThrowExceptionWithMsg("Invalid partition name " + > nextPath); > } else if > (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) { > logOrThrowExceptionWithMsg( > "Unexpected partition key " + parts[0] + " found at " + > nextPath); > } else { > // add sub-directory to the work queue if maxDepth is not yet > reached > pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1)); > } > } > } > if (currentDepth == partColNames.size()) { > return currentPath; > } > } > {noformat} > You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does > an unnecessary listing of the files. We can improve this call by checking the > currentDepth and bailing out early. > This can improve the performance of the msck command significantly, especially > when there are a lot of files in each partition on remote filesystems like S3 > or ADLS -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725497#comment-16725497 ] Vihang Karajgaonkar commented on HIVE-21040: v3 fixes a minor typo in the test {{TestHiveMetastoreChecker}}. > msck does unnecessary file listing at last level of directory tree > -- > > Key: HIVE-21040 > URL: https://issues.apache.org/jira/browse/HIVE-21040 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, > HIVE-21040.03.patch > > > Here is the code snippet which is run by {{msck}} to list directories > {noformat} > final Path currentPath = pd.p; > final int currentDepth = pd.depth; > FileStatus[] fileStatuses = fs.listStatus(currentPath, > FileUtils.HIDDEN_FILES_PATH_FILTER); > // found no files under a sub-directory under table base path; it is > possible that the table > // is empty and hence there are no partition sub-directories created > under base path > if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < > partColNames.size()) { > // since maxDepth is not yet reached, we are missing partition > // columns in currentPath > logOrThrowExceptionWithMsg( > "MSCK is missing partition columns under " + > currentPath.toString()); > } else { > // found files under currentPath add them to the queue if it is a > directory > for (FileStatus fileStatus : fileStatuses) { > if (!fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a file at depth which is less than number of partition > keys > logOrThrowExceptionWithMsg( > "MSCK finds a file rather than a directory when it searches > for " > + fileStatus.getPath().toString()); > } else if (fileStatus.isDirectory() && currentDepth < > partColNames.size()) { > // found a sub-directory at a depth less than number of partition > keys > // validate if the partition directory name matches with the > corresponding 
> // partition colName at currentDepth > Path nextPath = fileStatus.getPath(); > String[] parts = nextPath.getName().split("="); > if (parts.length != 2) { > logOrThrowExceptionWithMsg("Invalid partition name " + > nextPath); > } else if > (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) { > logOrThrowExceptionWithMsg( > "Unexpected partition key " + parts[0] + " found at " + > nextPath); > } else { > // add sub-directory to the work queue if maxDepth is not yet > reached > pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1)); > } > } > } > if (currentDepth == partColNames.size()) { > return currentPath; > } > } > {noformat} > You can see that when the {{currentDepth}} is at the {{maxDepth}} it still does > an unnecessary listing of the files. We can improve this call by checking the > currentDepth and bailing out early. > This can improve the performance of the msck command significantly, especially > when there are a lot of files in each partition on remote filesystems like S3 > or ADLS -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725484#comment-16725484 ] Prasanth Jayachandran commented on HIVE-21040: -- +1, pending tests.
> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch
>
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system
[ https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725479#comment-16725479 ] Hive QA commented on HIVE-21044: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 40s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 7s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 14s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 3s{color} | {color:blue} standalone-metastore/metastore-server in master has 189 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 20m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15389/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: standalone-metastore/metastore-common standalone-metastore/metastore-server U: standalone-metastore | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15389/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Add SLF4J reporter to the metastore metrics system > -- > > Key: HIVE-21044 > URL: https://issues.apache.org/jira/browse/HIVE-21044 > Project: Hive > Issue Type: New Feature > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Karthik Manamcheri >Priority: Minor > Labels: metrics > Attachments: HIVE-21044.1.patch > > > Lets add SLF4J reporter as an option in Metrics reporting system. Currently > we support JMX, JSON and Console reporting. > We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. > We can use the > {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}} > class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
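Per the issue description, the patch adds an SLF4J option next to the existing JMX, JSON, and Console reporters. Enabling it would presumably look like the following fragment; the {{SLF4J}} value is the one proposed in the description, so treat the exact spelling as an assumption until the patch is committed:

```
# Hypothetical: select the proposed SLF4J reporter for metastore metrics
hive.service.metrics.reporter=SLF4J
```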
[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725468#comment-16725468 ] Vihang Karajgaonkar commented on HIVE-21040: Thanks [~prasanth_j] I found an easier way to test it using mockito. Added a new test case, since {{TestHiveMetastoreChecker}} is in ql and I need package-level access to the method we need to test. I think we should move TestHiveMetastoreChecker to the metastore module as well, since most of the msck logic is now part of the metastore.
> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch
>
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21040) msck does unnecessary file listing at last level of directory tree
[ https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-21040: --- Attachment: HIVE-21040.02.patch
> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch
>
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications
[ https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharathkrishna Guruvayoor Murali updated HIVE-21027: Status: Patch Available (was: Open) > Add a configuration to include entire thrift objects in HMS notifications > - > > Key: HIVE-21027 > URL: https://issues.apache.org/jira/browse/HIVE-21027 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 4.0.0 >Reporter: Bharathkrishna Guruvayoor Murali >Assignee: Bharathkrishna Guruvayoor Murali >Priority: Major > Attachments: HIVE-21027.1.patch > > > Currently, we add the full thrift objects of Table / Partition in the HMS > notification messages, starting from HIVE-15180. > We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this > under a flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
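The description above proposes gating the full-thrift-object behavior behind a flag. A sketch of what such a metastore setting might look like; the property name is purely illustrative, derived from the NOTIFICATIONS_ADD_THRIFT_OBJECTS constant mentioned in the description, and is not confirmed by the patch:

```
# Illustrative only: include full thrift Table/Partition objects in HMS notification messages
hive.metastore.notifications.add.thrift.objects=true
```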
[jira] [Updated] (HIVE-21027) Add a configuration to include entire thrift objects in HMS notifications
[ https://issues.apache.org/jira/browse/HIVE-21027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharathkrishna Guruvayoor Murali updated HIVE-21027: Attachment: HIVE-21027.1.patch > Add a configuration to include entire thrift objects in HMS notifications > - > > Key: HIVE-21027 > URL: https://issues.apache.org/jira/browse/HIVE-21027 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 4.0.0 >Reporter: Bharathkrishna Guruvayoor Murali >Assignee: Bharathkrishna Guruvayoor Murali >Priority: Major > Attachments: HIVE-21027.1.patch > > > Currently, we add the full thrift objects of Table / Partition in the HMS > notification messages, starting from HIVE-15180. > We can have a configuration like NOTIFICATIONS_ADD_THRIFT_OBJECTS to do this > under a flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20776) Run HMS filterHooks on server-side in addition to client-side
[ https://issues.apache.org/jira/browse/HIVE-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725460#comment-16725460 ] Hive QA commented on HIVE-20776: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952425/HIVE-20776.005.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15738 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.metastore.TestFilterHooks.testDummyFilterForPartition (batchId=220) org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithHttp.org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithHttp (batchId=255) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15388/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15388/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15388/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12952425 - PreCommit-HIVE-Build > Run HMS filterHooks on server-side in addition to client-side > - > > Key: HIVE-20776 > URL: https://issues.apache.org/jira/browse/HIVE-20776 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Na Li >Priority: Major > Attachments: HIVE-20776.001.patch, HIVE-20776.003.patch, > HIVE-20776.004.patch, HIVE-20776.005.patch > > > In HMS, I noticed that all the filter hooks are applied on the client side > (in HiveMetaStoreClient.java). 
Is there any reason why we can't apply the > filters on the server-side? > Motivation: Some newer apache projects such as Kudu use HMS for metadata > storage. Kudu is not completely Java-based and there are interaction points > where they have C++ clients. In such cases, it would be ideal to have > consistent behavior from HMS side as far as filters, etc are concerned. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaume M updated HIVE-20936: --- Status: Open (was: Patch Available)
> Allow the Worker thread in the metastore to run outside of it
> -
>
> Key: HIVE-20936
> URL: https://issues.apache.org/jira/browse/HIVE-20936
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Reporter: Jaume M
> Assignee: Jaume M
> Priority: Major
> Attachments: HIVE-20936.1.patch, HIVE-20936.10.patch, HIVE-20936.11.patch, HIVE-20936.12.patch, HIVE-20936.2.patch, HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, HIVE-20936.6.patch, HIVE-20936.7.patch, HIVE-20936.8.patch, HIVE-20936.8.patch
>
> Currently the Worker thread in the metastore is bound to the metastore, mainly because of the TxnHandler that it has. This thread runs some map-reduce jobs, which may not always be an option wherever the metastore is running. A solution for this can be to run this thread in HS2 depending on a flag.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20776) Run HMS filterHooks on server-side in addition to client-side
[ https://issues.apache.org/jira/browse/HIVE-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725439#comment-16725439 ] Hive QA commented on HIVE-20776: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 12s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 6s{color} | {color:blue} standalone-metastore/metastore-server in master has 189 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 21s{color} | {color:red} standalone-metastore/metastore-common generated 1 new + 29 unchanged - 0 fixed = 30 total (was 29) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 21m 31s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:standalone-metastore/metastore-common | | | Write to static field org.apache.hadoop.hive.metastore.HiveMetaStoreClient.isClientFilterEnabled from instance method new org.apache.hadoop.hive.metastore.HiveMetaStoreClient(Configuration, HiveMetaHookLoader, Boolean) At HiveMetaStoreClient.java:from instance method new org.apache.hadoop.hive.metastore.HiveMetaStoreClient(Configuration, HiveMetaHookLoader, Boolean) At HiveMetaStoreClient.java:[line 170] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile xml | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15388/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-15388/yetus/new-findbugs-standalone-metastore_metastore-common.html | | modules | C: standalone-metastore/metastore-common standalone-metastore/metastore-server U: standalone-metastore | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15388/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Run HMS filterHooks on server-side in addition to client-side > - > > Key: HIVE-20776 > URL: https://issues.apache.org/jira/browse/HIVE-20776 >
[jira] [Updated] (HIVE-20960) remove CompactorMR.createCompactorMarker()
[ https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-20960: -- Description: Now that we have HIVE-20823, we know if a dir is produced by compactor from the name and {{CompactorMR.createCompactorMarker()}} can be removed. was: Now that we have HIVE-20941, we know if a dir is produced by compactor from the name and {{CompactorMR.createCompactorMarker()}} can be removed. > remove CompactorMR.createCompactorMarker() > -- > > Key: HIVE-20960 > URL: https://issues.apache.org/jira/browse/HIVE-20960 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Major > > Now that we have HIVE-20823, we know if a dir is produced by compactor from > the name and {{CompactorMR.createCompactorMarker()}} can be removed. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20960) remove CompactorMR.createCompactorMarker()
[ https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725428#comment-16725428 ] Eugene Koifman commented on HIVE-20960: --- [~ikryvenko], yes HIVE-20823. > remove CompactorMR.createCompactorMarker() > -- > > Key: HIVE-20960 > URL: https://issues.apache.org/jira/browse/HIVE-20960 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Major > > Now that we have HIVE-20823, we know if a dir is produced by compactor from > the name and {{CompactorMR.createCompactorMarker()}} can be removed. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20776) Run HMS filterHooks on server-side in addition to client-side
[ https://issues.apache.org/jira/browse/HIVE-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Na Li updated HIVE-20776: - Attachment: HIVE-20776.005.patch > Run HMS filterHooks on server-side in addition to client-side > - > > Key: HIVE-20776 > URL: https://issues.apache.org/jira/browse/HIVE-20776 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Na Li >Priority: Major > Attachments: HIVE-20776.001.patch, HIVE-20776.003.patch, > HIVE-20776.004.patch, HIVE-20776.005.patch > > > In HMS, I noticed that all the filter hooks are applied on the client side > (in HiveMetaStoreClient.java). Is there any reason why we can't apply the > filters on the server-side? > Motivation: Some newer apache projects such as Kudu use HMS for metadata > storage. Kudu is not completely Java-based and there are interaction points > where they have C++ clients. In such cases, it would be ideal to have > consistent behavior from HMS side as far as filters, etc are concerned. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21044) Add SLF4J reporter to the metastore metrics system
[ https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Manamcheri updated HIVE-21044: -- Component/s: (was: HiveServer2) > Add SLF4J reporter to the metastore metrics system > -- > > Key: HIVE-21044 > URL: https://issues.apache.org/jira/browse/HIVE-21044 > Project: Hive > Issue Type: New Feature > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Karthik Manamcheri >Priority: Minor > Labels: metrics > > Lets add SLF4J reporter as an option in Metrics reporting system. Currently > we support JMX, JSON and Console reporting. > We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. > We can use the > {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}} > class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21044) Add SLF4J reporter to the metastore metrics system
[ https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Manamcheri updated HIVE-21044: -- Attachment: HIVE-21044.1.patch > Add SLF4J reporter to the metastore metrics system > -- > > Key: HIVE-21044 > URL: https://issues.apache.org/jira/browse/HIVE-21044 > Project: Hive > Issue Type: New Feature > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Karthik Manamcheri >Priority: Minor > Labels: metrics > Attachments: HIVE-21044.1.patch > > > Lets add SLF4J reporter as an option in Metrics reporting system. Currently > we support JMX, JSON and Console reporting. > We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. > We can use the > {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}} > class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21044) Add SLF4J reporter to the metastore metrics system
[ https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Manamcheri updated HIVE-21044: -- Status: Patch Available (was: In Progress) > Add SLF4J reporter to the metastore metrics system > -- > > Key: HIVE-21044 > URL: https://issues.apache.org/jira/browse/HIVE-21044 > Project: Hive > Issue Type: New Feature > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Karthik Manamcheri >Priority: Minor > Labels: metrics > Attachments: HIVE-21044.1.patch > > > Lets add SLF4J reporter as an option in Metrics reporting system. Currently > we support JMX, JSON and Console reporting. > We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. > We can use the > {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}} > class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21044) Add SLF4J reporter to the metastore metrics system
[ https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Manamcheri updated HIVE-21044: -- Summary: Add SLF4J reporter to the metastore metrics system (was: Add SLF4J reporter to the metrics system) > Add SLF4J reporter to the metastore metrics system > -- > > Key: HIVE-21044 > URL: https://issues.apache.org/jira/browse/HIVE-21044 > Project: Hive > Issue Type: New Feature > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Karthik Manamcheri >Priority: Minor > Labels: metrics > > Lets add SLF4J reporter as an option in Metrics reporting system. Currently > we support JMX, JSON and Console reporting. > We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. > We can use the > {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}} > class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-21044) Add SLF4J reporter to the metrics system
[ https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-21044 started by Karthik Manamcheri. - > Add SLF4J reporter to the metrics system > > > Key: HIVE-21044 > URL: https://issues.apache.org/jira/browse/HIVE-21044 > Project: Hive > Issue Type: New Feature > Components: HiveServer2, Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Karthik Manamcheri >Priority: Minor > Labels: metrics > > Lets add SLF4J reporter as an option in Metrics reporting system. Currently > we support JMX, JSON and Console reporting. > We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. > We can use the > {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}} > class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725349#comment-16725349 ] Eugene Koifman commented on HIVE-20936: --- looks like it needs to be rebased > Allow the Worker thread in the metastore to run outside of it > - > > Key: HIVE-20936 > URL: https://issues.apache.org/jira/browse/HIVE-20936 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: HIVE-20936.1.patch, HIVE-20936.10.patch, > HIVE-20936.11.patch, HIVE-20936.12.patch, HIVE-20936.2.patch, > HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, > HIVE-20936.6.patch, HIVE-20936.7.patch, HIVE-20936.8.patch, HIVE-20936.8.patch > > > Currently the Worker thread in the metastore is bound to the metastore, > mainly because of the TxnHandler that it has. This thread runs some > map-reduce jobs, which may not be an option wherever the metastore is > running. A solution for this can be to run the thread in HS2, depending on a > flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HIVE-20993) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharathkrishna Guruvayoor Murali resolved HIVE-20993. - Resolution: Fixed > Update committer list > - > > Key: HIVE-20993 > URL: https://issues.apache.org/jira/browse/HIVE-20993 > Project: Hive > Issue Type: Task >Reporter: Bharathkrishna Guruvayoor Murali >Assignee: Bharathkrishna Guruvayoor Murali >Priority: Minor > Attachments: HIVE-20993.patch > > > Please update committer list: > Name: Bharath Krishna > Apache ID: bharos92 > Organization: Cloudera -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725320#comment-16725320 ] Hive QA commented on HIVE-20989: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952392/HIVE-20989.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15735 tests executed *Failed tests:* {noformat} TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=251) TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=251) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15387/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15387/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15387/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12952392 - PreCommit-HIVE-Build > JDBC - The GetOperationStatus + log can block query progress via sleep() > > > Key: HIVE-20989 > URL: https://issues.apache.org/jira/browse/HIVE-20989 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Sankar Hariappan >Priority: Major > Attachments: HIVE-20989.01.patch > > > There is an exponential sleep operation inside the CLIService which can end > up adding tens of seconds to a query which has already completed. 
> {code} > "HiveServer2-Handler-Pool: Thread-9373" #9373 prio=5 os_prio=0 > tid=0x7f4d5e72d800 nid=0xb634a waiting on condition [0x7f28d06a5000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.hive.service.cli.CLIService.progressUpdateLog(CLIService.java:506) > at > org.apache.hive.service.cli.CLIService.getOperationStatus(CLIService.java:480) > at > org.apache.hive.service.cli.thrift.ThriftCLIService.GetOperationStatus(ThriftCLIService.java:695) > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1757) > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1742) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > The sleep loop is on the server side. > {code} > private static final long PROGRESS_MAX_WAIT_NS = 30 * 1000000000l; > private JobProgressUpdate progressUpdateLog(boolean isProgressLogRequested, > Operation operation, HiveConf conf) { > ... 
> long startTime = System.nanoTime(); > int timeOutMs = 8; > try { > while (sessionState.getProgressMonitor() == null && > !operation.isDone()) { > long remainingMs = (PROGRESS_MAX_WAIT_NS - (System.nanoTime() - > startTime)) / 1000000l; > if (remainingMs <= 0) { > LOG.debug("timed out and hence returning progress log as NULL"); > return new JobProgressUpdate(ProgressMonitor.NULL); > } > Thread.sleep(Math.min(remainingMs, timeOutMs)); > timeOutMs <<= 1; > } > {code} > After about 16 seconds of execution of the query, the timeOutMs is 16384 ms, > which means the next sleep cycle is for min(30 - 17, 16) = 13 seconds. > If the query finishes on the 17th second, the JDBC server will only respond > after the 30th second when it will check for operation.isDone() and return. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
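The arithmetic in the quoted loop can be reproduced with a small self-contained simulation. This is an illustrative sketch, not Hive's actual CLIService: it ignores the progress-monitor branch and just models the doubling sleep against the 30-second cap.

```java
// Models the back-off schedule of CLIService#progressUpdateLog: sleep 8 ms,
// then double each iteration, capped so the total wait never passes 30 s.
public class BackoffDemo {
    static final long MAX_WAIT_MS = 30_000; // PROGRESS_MAX_WAIT_NS, in ms

    // Time (ms since the loop started) at which the loop next observes that
    // the operation is done, given the query actually completes at queryDoneMs.
    static long observedDoneMs(long queryDoneMs) {
        long now = 0;        // simulated clock instead of System.nanoTime()
        long timeOutMs = 8;  // initial sleep, doubled every iteration
        while (now < queryDoneMs) {            // operation.isDone() is false
            long remainingMs = MAX_WAIT_MS - now;
            if (remainingMs <= 0) {
                return now;  // hit the 30 s cap: returns a NULL progress update
            }
            now += Math.min(remainingMs, timeOutMs); // the Thread.sleep(...)
            timeOutMs <<= 1;
        }
        return now;          // first isDone() check after completion
    }

    public static void main(String[] args) {
        // A query finishing at t = 17 s is only observed at t = 30 s, because
        // the 12th sleep is min(30000 - 16376, 16384) = 13624 ms.
        System.out.println(observedDoneMs(17_000)); // prints 30000
    }
}
```

The cumulative wake-up times are 8, 24, 56, ..., 16376 ms; any query finishing between roughly 16.4 s and 30 s is not noticed until the 30-second mark, which is exactly the stall described above.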
[jira] [Commented] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725284#comment-16725284 ] Hive QA commented on HIVE-20989: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 44s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 35s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 11m 40s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15387/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: service U: service | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15387/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > JDBC - The GetOperationStatus + log can block query progress via sleep() > > > Key: HIVE-20989 > URL: https://issues.apache.org/jira/browse/HIVE-20989 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Sankar Hariappan >Priority: Major > Attachments: HIVE-20989.01.patch > > > There is an exponential sleep operation inside the CLIService which can end > up adding tens of seconds to a query which has already completed. 
[jira] [Updated] (HIVE-20989) JDBC - The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-20989: Summary: JDBC - The GetOperationStatus + log can block query progress via sleep() (was: JDBC: The GetOperationStatus + log can block query progress via sleep()) > JDBC - The GetOperationStatus + log can block query progress via sleep() > > > Key: HIVE-20989 > URL: https://issues.apache.org/jira/browse/HIVE-20989 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Sankar Hariappan >Priority: Major > Attachments: HIVE-20989.01.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20989) JDBC: The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-20989: Status: Patch Available (was: Open) > JDBC: The GetOperationStatus + log can block query progress via sleep() > --- > > Key: HIVE-20989 > URL: https://issues.apache.org/jira/browse/HIVE-20989 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Sankar Hariappan >Priority: Major > Attachments: HIVE-20989.01.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725265#comment-16725265 ] Hive QA commented on HIVE-21050: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952364/HIVE-21050.3.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 15787 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_analyze] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] (batchId=18) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_0] (batchId=118) org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSessionSparkSessionTimeout (batchId=252) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15386/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15386/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15386/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12952364 - PreCommit-HIVE-Build > Upgrade Parquet to 1.11.0 and use LogicalTypes > -- > > Key: HIVE-21050 > URL: https://issues.apache.org/jira/browse/HIVE-21050 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: Parquet, parquet > Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, > HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch > > > [WIP until Parquet community releases version 1.11.0] > The new Parquet version (1.11.0) uses > [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] > instead of OriginalTypes. > These are backwards-compatible with OriginalTypes. > Thanks to [~kuczoram] for her work on this patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20989) JDBC: The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-20989: Attachment: HIVE-20989.01.patch > JDBC: The GetOperationStatus + log can block query progress via sleep() > --- > > Key: HIVE-20989 > URL: https://issues.apache.org/jira/browse/HIVE-20989 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Sankar Hariappan >Priority: Major > Attachments: HIVE-20989.01.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20989) JDBC: The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan reassigned HIVE-20989: --- Assignee: Sankar Hariappan > JDBC: The GetOperationStatus + log can block query progress via sleep() > --- > > Key: HIVE-20989 > URL: https://issues.apache.org/jira/browse/HIVE-20989 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Sankar Hariappan >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20989) JDBC: The GetOperationStatus + log can block query progress via sleep()
[ https://issues.apache.org/jira/browse/HIVE-20989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-20989: Affects Version/s: 4.0.0 > JDBC: The GetOperationStatus + log can block query progress via sleep() > --- > > Key: HIVE-20989 > URL: https://issues.apache.org/jira/browse/HIVE-20989 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: Gopal V >Assignee: Sankar Hariappan >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725255#comment-16725255 ] Hive QA commented on HIVE-21050: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 45s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 20s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 41s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 50s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 44 new + 145 unchanged - 4 fixed = 189 total (was 149) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 20s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 11s{color} | {color:red} The patch generated 3 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 54m 43s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc xml compile findbugs checkstyle | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15386/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15386/yetus/diff-checkstyle-ql.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-15386/yetus/whitespace-eol.txt | | asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-15386/yetus/patch-asflicense-problems.txt | | modules | C: ql . U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15386/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Upgrade Parquet to 1.11.0 and use LogicalTypes > -- > > Key: HIVE-21050 > URL: https://issues.apache.org/jira/browse/HIVE-21050 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: Parquet, parquet > Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, > HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch > > > [WIP until Parquet community releases version 1.11.0] > The new Parquet version (1.12.0) uses > [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] > instead of OriginalTypes. > These are backwards-compatible with OriginalTypes.
[jira] [Commented] (HIVE-20748) Disable materialized view rewriting when plan pattern is not allowed
[ https://issues.apache.org/jira/browse/HIVE-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725241#comment-16725241 ] Ashutosh Chauhan commented on HIVE-20748: - some minor comments on RB. +1 modulo those comments. > Disable materialized view rewriting when plan pattern is not allowed > > > Key: HIVE-20748 > URL: https://issues.apache.org/jira/browse/HIVE-20748 > Project: Hive > Issue Type: Bug > Components: Materialized views >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20748.01.patch, HIVE-20748.01.patch, > HIVE-20748.02.patch, HIVE-20748.02.patch, HIVE-20748.03.patch, > HIVE-20748.04.patch, HIVE-20748.04.patch, HIVE-20748.04.patch, > HIVE-20748.patch > > > For instance, the rewriting algorithm currently does not support some operators, > and we cannot have a non-deterministic function in the MV definition. In those > cases, we should fail either when we try to create the MV with rewriting > enabled, or when we enable the rewriting for an MV already created. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote
[ https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725202#comment-16725202 ] Hive QA commented on HIVE-16907: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952354/HIVE-16907.03.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15739 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_memcheck] (batchId=45) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15385/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15385/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15385/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12952354 - PreCommit-HIVE-Build > "INSERT INTO" overwrite old data when destination table encapsulated by > backquote > > > Key: HIVE-16907 > URL: https://issues.apache.org/jira/browse/HIVE-16907 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 1.1.0, 2.1.1 >Reporter: Nemon Lou >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-16907.02.patch, HIVE-16907.03.patch, > HIVE-16907.03.patch, HIVE-16907.03.patch, HIVE-16907.1.patch > > > A way to reproduce: > {noformat} > create database tdb; > use tdb; > create table t1(id int); > create table t2(id int); > explain insert into `tdb.t1` select * from t2; > {noformat} > {noformat} > +---+ > | > Explain | > +---+ > | STAGE DEPENDENCIES: > | > | Stage-1 is a root stage > | > | Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, > Stage-4 | > | Stage-3 > | > | Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 > | > | Stage-2 > | > | Stage-4 > | > | Stage-5 depends on stages: Stage-4 > | > | > | > | STAGE PLANS: > | > | Stage: Stage-1 > | > | Map Reduce > | > | Map Operator Tree: > | > | TableScan > | > | alias: t2 > | > | Statistics: Num rows: 0
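The repro above turns on how the backquoted name `tdb.t1` is resolved. The toy sketch below (not Hive's parser, and not necessarily the actual root cause of this bug; all names are made up for illustration) shows the ambiguity such quoting introduces: the same string names two different tables depending on whether the dot inside backquotes is treated as a database separator or as part of a single quoted identifier.

```java
// Toy illustration of the backquoted-name ambiguity behind bugs like this
// one. NOT Hive's parser: a minimal resolver showing the two readings of
// `tdb.t1` -- a single quoted identifier vs. a database-qualified name.
public class QuotedNameDemo {
    /** Returns {database, table}; database is null when unqualified. */
    static String[] resolve(String raw) {
        if (raw.length() >= 2 && raw.startsWith("`") && raw.endsWith("`")) {
            // Reading 1: backquotes quote ONE identifier, so the dot is part
            // of the table name -- table "tdb.t1" in the current database.
            return new String[] { null, raw.substring(1, raw.length() - 1) };
        }
        int dot = raw.indexOf('.');
        if (dot >= 0) {
            // Reading 2: an unquoted dot separates database and table.
            return new String[] { raw.substring(0, dot), raw.substring(dot + 1) };
        }
        return new String[] { null, raw };
    }

    public static void main(String[] args) {
        String[] quoted = resolve("`tdb.t1`");
        String[] plain = resolve("tdb.t1");
        System.out.println(quoted[0] + " / " + quoted[1]); // null / tdb.t1
        System.out.println(plain[0] + " / " + plain[1]);   // tdb / t1
    }
}
```

If the two readings disagree, the INSERT can silently target (and clobber) a different table than the user intended, which matches the symptom reported here.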
[jira] [Commented] (HIVE-18884) Simplify Logging in Hive Metastore Client
[ https://issues.apache.org/jira/browse/HIVE-18884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725196#comment-16725196 ] Alan Gates commented on HIVE-18884: --- I don't have a problem with these changes, but I don't think they address the request of this bug. The request is to turn down the chattiness of the connection process in the logs. I would do this by changing some of the INFO messages to debug. The current patch adds more information to some of the log calls, which is fine, but doesn't do what this ticket asks. > Simplify Logging in Hive Metastore Client > - > > Key: HIVE-18884 > URL: https://issues.apache.org/jira/browse/HIVE-18884 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: Mani M >Priority: Minor > Labels: noob > Attachments: HIVE.18884.patch > > > https://github.com/apache/hive/blob/4047befe48c8f762c58d8854e058385c1df151c6/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java > The current logging is: > {code} > 2018-02-26 07:02:44,883 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Trying to connect to metastore with URI > thrift://host.company.com:9083 > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Connected to metastore. > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to metastore, current connections: 2 > {code} > Please simplify to something like: > {code} > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to the Metastore Server (URI > thrift://host.company.com:9083), current connections: 2 > ... or ... 
> 2018-02-26 07:02:44,892 ERROR hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Failed to connect to the Metastore Server (URI > thrift://host.company.com:9083) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote
[ https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725173#comment-16725173 ] Hive QA commented on HIVE-16907: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 56s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 38s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} ql: The patch generated 0 new + 862 unchanged - 2 fixed = 862 total (was 864) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 34s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15385/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15385/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
> "INSERT INTO" overwrite old data when destination table encapsulated by > backquote > > > Key: HIVE-16907 > URL: https://issues.apache.org/jira/browse/HIVE-16907 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 1.1.0, 2.1.1 >Reporter: Nemon Lou >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-16907.02.patch, HIVE-16907.03.patch, > HIVE-16907.03.patch, HIVE-16907.03.patch, HIVE-16907.1.patch > > > A way to reproduce: > {noformat} > create database tdb; > use tdb; > create table t1(id int); > create table t2(id int); > explain insert into `tdb.t1` select * from t2; > {noformat} > {noformat} > +---+ > | > Explain | > +---+ > | STAGE DEPENDENCIES: > | > | Stage-1 is a root stage > | > | Stage-6 depends on stages: Stage-1 , consists of
[jira] [Commented] (HIVE-19968) UDF exception is not throw out
[ https://issues.apache.org/jira/browse/HIVE-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725150#comment-16725150 ] Hive QA commented on HIVE-19968: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952352/HIVE-19968.05.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 15736 tests executed *Failed tests:* {noformat} TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=251) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] (batchId=86) org.apache.hive.jdbc.TestSSL.testMetastoreWithSSL (batchId=258) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15384/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15384/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15384/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12952352 - PreCommit-HIVE-Build > UDF exception is not throw out > -- > > Key: HIVE-19968 > URL: https://issues.apache.org/jira/browse/HIVE-19968 > Project: Hive > Issue Type: Bug >Reporter: sandflee >Assignee: Laszlo Bodor >Priority: Major > Attachments: HIVE-19968.01.patch, HIVE-19968.02.patch, > HIVE-19968.03.patch, HIVE-19968.04.patch, HIVE-19968.05.patch, hive-udf.png > > > UDF initialization failed and threw an exception, but Hive catches it and does > nothing, so the application appears to succeed while no data is generated. 
> {code} > GenericUDFReflect.java#evaluate() > try { > o = null; > o = ReflectionUtils.newInstance(c, null); > } catch (Exception e) { > // ignored > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
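The snippet quoted in the description is the heart of the problem: the empty catch block turns an initialization failure into a silent null. A minimal standalone sketch of the swallowing pattern and of the fix this ticket asks for (plain JDK reflection stands in for Hive's ReflectionUtils, and the class names are made up for illustration):

```java
// Sketch of the bug pattern in GenericUDFReflect#evaluate and its fix.
// NOT Hive's actual code: plain reflection replaces ReflectionUtils, and
// NoDefaultCtor is a made-up class used to force an instantiation failure.
public class SwallowDemo {
    // Mirrors the problematic pattern: any instantiation failure is ignored,
    // so the caller cannot tell a failure from a legitimate null result.
    static Object newInstanceSwallowing(Class<?> c) {
        Object o = null;
        try {
            o = c.getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            // ignored -- this is exactly what hides the failure
        }
        return o;
    }

    // The fix the ticket asks for: surface the failure to the caller
    // (in Hive this would be a HiveException rather than RuntimeException).
    static Object newInstanceThrowing(Class<?> c) {
        try {
            return c.getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new RuntimeException("failed to instantiate " + c.getName(), e);
        }
    }

    // No accessible no-arg constructor, so reflective instantiation fails.
    static class NoDefaultCtor {
        NoDefaultCtor(int x) { }
    }

    public static void main(String[] args) {
        // Swallowing: the failure is silent and the result is just null.
        System.out.println(newInstanceSwallowing(NoDefaultCtor.class)); // null
        // Throwing: the failure is visible to the caller.
        try {
            newInstanceThrowing(NoDefaultCtor.class);
        } catch (RuntimeException e) {
            System.out.println("caught: instantiation failed");
        }
    }
}
```

This is why the job "succeeds" with no output: the swallowed exception never propagates, so nothing upstream ever sees an error.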
[jira] [Commented] (HIVE-19968) UDF exception is not throw out
[ https://issues.apache.org/jira/browse/HIVE-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725103#comment-16725103 ] Hive QA commented on HIVE-19968: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 49s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 37s{color} | {color:red} ql: The patch generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 37s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15384/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15384/yetus/diff-checkstyle-ql.txt | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15384/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > UDF exception is not throw out > -- > > Key: HIVE-19968 > URL: https://issues.apache.org/jira/browse/HIVE-19968 > Project: Hive > Issue Type: Bug >Reporter: sandflee >Assignee: Laszlo Bodor >Priority: Major > Attachments: HIVE-19968.01.patch, HIVE-19968.02.patch, > HIVE-19968.03.patch, HIVE-19968.04.patch, HIVE-19968.05.patch, hive-udf.png > > > udf init failed, and throw a exception, but hive catch it and do nothing, > leading to app succ, but no data is generated. 
> {code} > GenericUDFReflect.java#evaluate() > try { > o = null; > o = ReflectionUtils.newInstance(c, null); > } catch (Exception e) { > // ignored > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.12.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21050: - Description: [WIP until Parquet community releases version 1.11.0] The new Parquet version (1.12.0) uses [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] instead of OriginalTypes. These are backwards-compatible with OriginalTypes. Thanks to [~kuczoram] for her work on this patch. was: [WIP; contains necessary jars until Parquet community releases version 1.12.0] The new Parquet version (1.12.0) uses [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] instead of OriginalTypes. These are backwards-compatible with OriginalTypes. Thanks to [~kuczoram] for her work on this patch. > Upgrade Parquet to 1.12.0 and use LogicalTypes > -- > > Key: HIVE-21050 > URL: https://issues.apache.org/jira/browse/HIVE-21050 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: Parquet, parquet > Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, > HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch > > > [WIP until Parquet community releases version 1.11.0] > The new Parquet version (1.12.0) uses > [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] > instead of OriginalTypes. > These are backwards-compatible with OriginalTypes. > Thanks to [~kuczoram] for her work on this patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21050: - Summary: Upgrade Parquet to 1.11.0 and use LogicalTypes (was: Upgrade Parquet to 1.12.0 and use LogicalTypes) > Upgrade Parquet to 1.11.0 and use LogicalTypes > -- > > Key: HIVE-21050 > URL: https://issues.apache.org/jira/browse/HIVE-21050 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: Parquet, parquet > Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, > HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch > > > [WIP until Parquet community releases version 1.11.0] > The new Parquet version (1.12.0) uses > [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] > instead of OriginalTypes. > These are backwards-compatible with OriginalTypes. > Thanks to [~kuczoram] for her work on this patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.12.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21050: - Attachment: HIVE-21050.3.patch Status: Patch Available (was: Open) > Upgrade Parquet to 1.12.0 and use LogicalTypes > -- > > Key: HIVE-21050 > URL: https://issues.apache.org/jira/browse/HIVE-21050 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: Parquet, parquet > Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, > HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch > > > [WIP; contains necessary jars until Parquet community releases version 1.12.0] > The new Parquet version (1.12.0) uses > [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] > instead of OriginalTypes. > These are backwards-compatible with OriginalTypes. > Thanks to [~kuczoram] for her work on this patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.12.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21050: - Status: Open (was: Patch Available) > Upgrade Parquet to 1.12.0 and use LogicalTypes > -- > > Key: HIVE-21050 > URL: https://issues.apache.org/jira/browse/HIVE-21050 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: Parquet, parquet > Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, > HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch > > > [WIP; contains necessary jars until Parquet community releases version 1.12.0] > The new Parquet version (1.12.0) uses > [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] > instead of OriginalTypes. > These are backwards-compatible with OriginalTypes. > Thanks to [~kuczoram] for her work on this patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18884) Simplify Logging in Hive Metastore Client
[ https://issues.apache.org/jira/browse/HIVE-18884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725073#comment-16725073 ] Hive QA commented on HIVE-18884: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952340/HIVE.18884.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15737 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15383/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15383/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15383/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12952340 - PreCommit-HIVE-Build > Simplify Logging in Hive Metastore Client > - > > Key: HIVE-18884 > URL: https://issues.apache.org/jira/browse/HIVE-18884 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: Mani M >Priority: Minor > Labels: noob > Attachments: HIVE.18884.patch > > > https://github.com/apache/hive/blob/4047befe48c8f762c58d8854e058385c1df151c6/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java > The current logging is: > {code} > 2018-02-26 07:02:44,883 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Trying to connect to metastore with URI > thrift://host.company.com:9083 > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Connected to metastore. 
> 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to metastore, current connections: 2 > {code} > Please simplify to something like: > {code} > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to the Metastore Server (URI > thrift://host.company.com:9083), current connections: 2 > ... or ... > 2018-02-26 07:02:44,892 ERROR hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Failed to connect to the Metastore Server (URI > thrift://host.company.com:9083) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
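The requested change amounts to folding the URI and the connection count into one message, so a single INFO line carries everything the three current lines do. A sketch of that consolidation; the class and method names below are illustrative, not Hive's actual HiveMetaStoreClient code:

```java
// Illustrative sketch (made-up class/method names, NOT HiveMetaStoreClient)
// of the log consolidation HIVE-18884 requests: one INFO-level message that
// includes both the metastore URI and the current connection count, instead
// of three separate lines per connection attempt.
public class MetastoreLogDemo {
    static String openConnectionMessage(String uri, int currentConnections) {
        // Single message replacing "Trying to connect", "Connected to
        // metastore.", and "Opened a connection": all detail in one line.
        return String.format(
            "Opened a connection to the Metastore Server (URI %s), current connections: %d",
            uri, currentConnections);
    }

    public static void main(String[] args) {
        System.out.println(openConnectionMessage("thrift://host.company.com:9083", 2));
    }
}
```

Per Alan Gates's comment above, the per-step messages would additionally be demoted from INFO to DEBUG so the connection process stops flooding the logs.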
[jira] [Updated] (HIVE-20911) External Table Replication for Hive
[ https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anishek updated HIVE-20911: --- Attachment: HIVE-20911.07.patch > External Table Replication for Hive > --- > > Key: HIVE-20911 > URL: https://issues.apache.org/jira/browse/HIVE-20911 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: anishek >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, > HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, > HIVE-20911.06.patch, HIVE-20911.07.patch > > > External tables are currently not replicated as part of Hive replication. As > part of this jira we want to enable that. > Approach: > * The target cluster will have a top-level base directory config that will be > used to copy all data relevant to external tables. This will be provided via > the *with* clause in the *repl load* command. This base path will be prefixed > to the path of the same external table on the source cluster. This can be > provided using the following configuration: > {code} > hive.repl.replica.external.table.base.dir=/ > {code} > * Since changes to the directories of an external table can happen without Hive > knowing about them, we cannot capture the relevant events whenever new data is > added or removed; we will have to copy the data from the source path to the > target path for external tables every time we run incremental replication. > ** This will require incremental *repl dump* to now create an additional > file *\_external\_tables\_info* with data in the following form: > {code} > tableName,base64Encoded(tableDataLocation) > {code} > In case different partitions in the table point to different > locations, there will be multiple entries in the file for the same table name, > with the location pointing to the different partition locations. 
Partitions > created in a table without the _set location_ command will be > within the same table data location, so there will not be separate > entries in the file above. > ** *repl load* will read *\_external\_tables\_info* to identify which > locations are to be copied from source to target and create corresponding > tasks for them. > * New external tables will be created with metadata only, with no data copied > as part of regular tasks during incremental/bootstrap load. > * Bootstrap dump will also create *\_external\_tables\_info*, which will be > used to copy data from source to target as part of bootstrap load. > * Bootstrap load will create a DAG that can use parallelism in the execution > phase; the HDFS copy related tasks are created once the bootstrap phase is > complete. > * Since incremental load results in a DAG with only sequential execution > (events applied in sequence), to effectively use the parallelism capability in > execution mode we create tasks for HDFS copy along with the incremental DAG. > This requires a few basic calculations to approximately meet the configured > value in "hive.repl.approx.max.load.tasks" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
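The `_external_tables_info` entry format described above (one `tableName,base64Encoded(tableDataLocation)` line per table or partition location) can be sketched as below. This is an illustration of the format, not Hive's actual reader/writer code. Base64 is a convenient encoding here because its alphabet contains no comma, so the first comma in a line always separates the table name from the encoded location, no matter what characters the HDFS path contains:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative encoder/decoder for _external_tables_info lines of the form
//   tableName,base64Encoded(tableDataLocation)
// as described in HIVE-20911. NOT Hive's actual implementation.
public class ExternalTableInfoDemo {
    // Encode one entry: the location is base64-encoded so arbitrary path
    // characters cannot collide with the comma separator.
    static String encodeEntry(String tableName, String location) {
        String encoded = Base64.getEncoder()
            .encodeToString(location.getBytes(StandardCharsets.UTF_8));
        return tableName + "," + encoded;
    }

    // Decode a line back to {tableName, location}. Splitting on the FIRST
    // comma is safe because base64 output never contains a comma.
    static String[] decodeEntry(String line) {
        int comma = line.indexOf(',');
        String table = line.substring(0, comma);
        String location = new String(
            Base64.getDecoder().decode(line.substring(comma + 1)),
            StandardCharsets.UTF_8);
        return new String[] { table, location };
    }

    public static void main(String[] args) {
        String line = encodeEntry("sales", "hdfs://src/warehouse/ext/sales");
        String[] back = decodeEntry(line);
        System.out.println(line);
        System.out.println(back[0] + " -> " + back[1]);
    }
}
```

A table whose partitions point at several locations would simply contribute several such lines with the same table name, matching the multi-entry behavior described above.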
[jira] [Commented] (HIVE-18884) Simplify Logging in Hive Metastore Client
[ https://issues.apache.org/jira/browse/HIVE-18884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725019#comment-16725019 ] Hive QA commented on HIVE-18884: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 42s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 6s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15383/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: standalone-metastore/metastore-common U: standalone-metastore/metastore-common | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15383/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
> Simplify Logging in Hive Metastore Client > - > > Key: HIVE-18884 > URL: https://issues.apache.org/jira/browse/HIVE-18884 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: Mani M >Priority: Minor > Labels: noob > Attachments: HIVE.18884.patch > > > https://github.com/apache/hive/blob/4047befe48c8f762c58d8854e058385c1df151c6/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java > The current logging is: > {code} > 2018-02-26 07:02:44,883 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Trying to connect to metastore with URI > thrift://host.company.com:9083 > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Connected to metastore. > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to metastore, current connections: 2 > {code} > Please simplify to something like: > {code} > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to the Metastore Server (URI > thrift://host.company.com:9083), current connections: 2 > ... or ... > 2018-02-26 07:02:44,892 ERROR hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Failed to connect to the Metastore Server (URI > thrift://host.company.com:9083) > {code} -- This message was sent by
[jira] [Updated] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote
[ https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-16907: Attachment: HIVE-16907.03.patch > "INSERT INTO" overwrite old data when destination table encapsulated by > backquote > > > Key: HIVE-16907 > URL: https://issues.apache.org/jira/browse/HIVE-16907 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 1.1.0, 2.1.1 >Reporter: Nemon Lou >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-16907.02.patch, HIVE-16907.03.patch, > HIVE-16907.03.patch, HIVE-16907.03.patch, HIVE-16907.1.patch > > > A way to reproduce: > {noformat} > create database tdb; > use tdb; > create table t1(id int); > create table t2(id int); > explain insert into `tdb.t1` select * from t2; > {noformat} > {noformat} > +---+ > | > Explain | > +---+ > | STAGE DEPENDENCIES: > | > | Stage-1 is a root stage > | > | Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, > Stage-4 | > | Stage-3 > | > | Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 > | > | Stage-2 > | > | Stage-4 > | > | Stage-5 depends on stages: Stage-4 > | > | > | > | STAGE PLANS: > | > | Stage: Stage-1 > | > | Map Reduce > | > | Map Operator Tree: > | > | TableScan > | > | alias: t2 > | > | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE | > | Select Operator > | > | expressions: id (type: int) > | > | outputColumnNames: _col0 > | > | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE | > | File Output Operator > | > | compressed: false > | > | Statistics: Num rows: 0 Data
[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote
[ https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725002#comment-16725002 ] Hive QA commented on HIVE-16907: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952341/HIVE-16907.03.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15738 tests executed *Failed tests:* {noformat} TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=251) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_3] (batchId=192) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15382/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15382/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15382/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12952341 - PreCommit-HIVE-Build > "INSERT INTO" overwrite old data when destination table encapsulated by > backquote > > > Key: HIVE-16907 > URL: https://issues.apache.org/jira/browse/HIVE-16907 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 1.1.0, 2.1.1 >Reporter: Nemon Lou >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-16907.02.patch, HIVE-16907.03.patch, > HIVE-16907.03.patch, HIVE-16907.1.patch > > > A way to reproduce: > {noformat} > create database tdb; > use tdb; > create table t1(id int); > create table t2(id int); > explain insert into `tdb.t1` select * from t2; > {noformat} > {noformat} > +---+ > | > Explain | > +---+ > | STAGE DEPENDENCIES: > | > | Stage-1 is a root stage > | > | Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, > Stage-4 | > | Stage-3 > | > | Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 > | > | Stage-2 > | > | Stage-4 > | > | Stage-5 depends on stages: Stage-4 > | > | > | > | STAGE PLANS: > | > | Stage: Stage-1 > | > | Map Reduce > | > | Map Operator Tree: > | > | TableScan > | > | alias: t2 >
[jira] [Updated] (HIVE-19968) UDF exception is not throw out
[ https://issues.apache.org/jira/browse/HIVE-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laszlo Bodor updated HIVE-19968: Attachment: HIVE-19968.05.patch > UDF exception is not throw out > -- > > Key: HIVE-19968 > URL: https://issues.apache.org/jira/browse/HIVE-19968 > Project: Hive > Issue Type: Bug >Reporter: sandflee >Assignee: Laszlo Bodor >Priority: Major > Attachments: HIVE-19968.01.patch, HIVE-19968.02.patch, > HIVE-19968.03.patch, HIVE-19968.04.patch, HIVE-19968.05.patch, hive-udf.png > > > UDF init fails and throws an exception, but Hive catches it and does nothing, so the application succeeds but no data is generated. > {code} > GenericUDFReflect.java#evaluate() > try { > o = null; > o = ReflectionUtils.newInstance(c, null); > } catch (Exception e) { > // ignored > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
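The quoted catch block swallows the reflection failure entirely. A minimal standalone sketch of the alternative behavior (propagating the failure instead of ignoring it); the class name and the wrapping exception type here are illustrative, not Hive's actual fix:

```java
// Sketch: instead of "catch (Exception e) { // ignored }", wrap and rethrow
// so the query fails loudly rather than silently producing no data.
public class ReflectNewInstance {
    public static Object newInstanceOrFail(Class<?> c) {
        try {
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Propagate: callers (and ultimately the user) see the failure.
            throw new IllegalStateException("Failed to instantiate " + c.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newInstanceOrFail(StringBuilder.class).getClass().getName());
    }
}
```

With this shape, a UDF whose target class cannot be instantiated fails the query instead of letting the job finish "successfully" with empty output.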
[jira] [Updated] (HIVE-21048) Remove needless org.mortbay.jetty from hadoop exclusions
[ https://issues.apache.org/jira/browse/HIVE-21048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laszlo Bodor updated HIVE-21048: Attachment: HIVE-21048.05.patch > Remove needless org.mortbay.jetty from hadoop exclusions > > > Key: HIVE-21048 > URL: https://issues.apache.org/jira/browse/HIVE-21048 > Project: Hive > Issue Type: Bug >Reporter: Laszlo Bodor >Assignee: Laszlo Bodor >Priority: Major > Attachments: HIVE-21048.01.patch, HIVE-21048.02.patch, > HIVE-21048.03.patch, HIVE-21048.04.patch, HIVE-21048.05.patch, dep.out > > > During HIVE-20638 i found that org.mortbay.jetty exclusions from e.g. hadoop > don't take effect, as the actual groupId of jetty is org.eclipse.jetty for > most of the current projects, please find attachment (example for hive > commons project). > https://en.wikipedia.org/wiki/Jetty_(web_server)#History -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16255) Support percentile_cont / percentile_disc
[ https://issues.apache.org/jira/browse/HIVE-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724982#comment-16724982 ] Laszlo Bodor commented on HIVE-16255: - [~kgyrtkirk] , [~ashutoshc] : 05.patch made a green run, could you please take a look? > Support percentile_cont / percentile_disc > - > > Key: HIVE-16255 > URL: https://issues.apache.org/jira/browse/HIVE-16255 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Laszlo Bodor >Priority: Major > Attachments: HIVE-16255.01.patch, HIVE-16255.02.patch, > HIVE-16255.03.patch, HIVE-16255.04.patch, HIVE-16255.05.patch > > > Way back in HIVE-259, a percentile function was added that provides a subset > of the standard percentile_cont aggregate function. > The SQL standard provides some additional options and also a percentile_disc > aggregate function with different rules. In the standard you specify an > ordering with an arbitrary value expression and the results are drawn from this > value expression. These aggregate functions should be usable as analytic > functions as well (i.e. support the over clause). The current percentile > function is able to be used with an over clause. > The rough outline of how this works is: > percentile_cont(number) within group (order by expression) [ over(window > spec) ] > percentile_disc(number) within group (order by expression) [ over(window > spec) ] > The value of number should be between 0 and 1. The value expression is > evaluated for each row of the group, nulls are discarded, and the remaining > rows are ordered. > — If PERCENTILE_CONT is specified, by considering the pair of consecutive > rows that are indicated by the argument, treated as a fraction of the total > number of rows in the group, and interpolating the value of the value > expression evaluated for these rows. 
> — If PERCENTILE_DISC is specified, by treating the group as a window > partition of the CUME_DIST window function, using the specified ordering of > the value expression as the window ordering, and returning the first value > expression whose cumulative distribution value is greater than or equal to > the argument. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
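The two definitions quoted above can be sketched in a few lines of standalone Java (nulls assumed already discarded and values assumed distinct, so CUME_DIST of the i-th of n ordered rows is simply i/n; Hive's real implementation of course differs):

```java
public class Percentiles {
    // percentile_cont(p): interpolate between the two rows adjacent to the
    // fractional row number p * (n - 1) of the ordered group.
    static double cont(double[] sorted, double p) {
        double rn = p * (sorted.length - 1);
        int lo = (int) Math.floor(rn);
        int hi = (int) Math.ceil(rn);
        return sorted[lo] + (rn - lo) * (sorted[hi] - sorted[lo]);
    }

    // percentile_disc(p): first value whose cumulative distribution is >= p,
    // per the CUME_DIST-based definition quoted above.
    static double disc(double[] sorted, double p) {
        int n = sorted.length;
        for (int i = 1; i <= n; i++) {
            if ((double) i / n >= p) {
                return sorted[i - 1];
            }
        }
        return sorted[n - 1];
    }

    public static void main(String[] args) {
        double[] v = {10, 20, 30, 40};
        System.out.println(cont(v, 0.5)); // 25.0: interpolated between rows 2 and 3
        System.out.println(disc(v, 0.5)); // 20.0: first row with cume_dist >= 0.5
    }
}
```

The example shows the key behavioral difference: percentile_cont may return a value not present in the group, while percentile_disc always returns an actual row value.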
[jira] [Commented] (HIVE-20959) cbo_rp_limit / cbo_limit are flaky - intermittent whitespace difference
[ https://issues.apache.org/jira/browse/HIVE-20959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724981#comment-16724981 ] Laszlo Bodor commented on HIVE-20959: - [~vihangk1] : okay, makes sense. To be honest, I cannot find the code path where result field separators are handled/described; I'll have to dig deeper and check it. However, it's very odd that cbo_rp_limit and cbo_limit are the failing tests: they have nothing to do with field separators or anything like that, and if the preceding test made them fail, we should have been able to reproduce it, I guess... > cbo_rp_limit / cbo_limit are flaky - intermittent whitespace difference > --- > > Key: HIVE-20959 > URL: https://issues.apache.org/jira/browse/HIVE-20959 > Project: Hive > Issue Type: Bug >Reporter: Laszlo Bodor >Assignee: Laszlo Bodor >Priority: Major > Attachments: > 171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more.txt, > > TEST-171-TestMiniLlapLocalCliDriver-dynamic_semijoin_reduction.q-materialized_view_create_rewrite_3.q-vectorization_pushdown.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.xml, > diff > > > {code:java} > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_rp_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {code} > After copying here to jira, the difference cannot be seen, but checking the original junit xml shows a whitespace difference in the lines (represented as hex values), between 1 (x31) and 4 (x34). See [^diff] . > The original golden file contains a horizontal tab (x09); the actual output contains a space (x20). > The serious thing is that the separator changes to x20, which is wrong, but then in the same line, it changes back to x09. 
> {code} > 20 31 *20* 34 09 32 <- actual > 20 31 *09* 34 09 32 <- expected > {code} > Tried to reproduce it by running the failing batch of qtests locally, but no > luck (maybe it's an environment issue) > {code} > mvn test -T 1C -Dtest.output.overwrite=true -Pitests,hadoop-2 -pl > itests/qtest -pl itests/util -Dtest=TestMiniLlapLocalCliDriver > -Dqfile=dynamic_semijoin_reduction.q,materialized_view_create_rewrite_3.q,vectorization_pushdown.q,correlationoptimizer2.q,cbo_gby_empty.q,schema_evol_text_nonvec_part_all_complex_llap_io.q,vectorization_short_regress.q,mapjoin3.q,cross_product_check_1.q,results_cache_quoted_identifiers.q,unionDistinct_3.q,cbo_join.q,correlationoptimizer6.q,union_remove_26.q,cbo_rp_limit.q,convert_decimal64_to_decimal.q,vector_groupby_cube1.q,union2.q,groupby2.q,dynpart_sort_opt_vectorization.q,constraints_optimization.q,exchgpartition2lel.q,retry_failure.q,schema_evol_text_vecrow_part_llap_io.q,sample10.q,vectorized_timestamp_ints_casts.q,auto_sortmerge_join_2.q,bucketizedhiveinputformat.q,cte_mat_2.q,vectorization_8.q > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
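The hex comparison above can be reproduced locally with a few lines of Java when hunting for such invisible tab-vs-space differences; the class and method names here are mine, not part of Hive:

```java
import java.nio.charset.StandardCharsets;

public class HexDump {
    // Render each byte of an ASCII string as two lowercase hex digits,
    // space-separated, so a tab (09) vs space (20) difference becomes visible.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.US_ASCII)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(" 1\t4\t2")); // expected golden-file line: 20 31 09 34 09 32
        System.out.println(hex(" 1 4\t2")); // actual flaky output:        20 31 20 34 09 32
    }
}
```

Running this on the expected and actual result lines reproduces exactly the two hex rows quoted in the issue.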
[jira] [Commented] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote
[ https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724962#comment-16724962 ] Hive QA commented on HIVE-16907: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 38s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} ql: The patch generated 0 new + 862 unchanged - 2 fixed = 862 total (was 864) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 9s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15382/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15382/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
> "INSERT INTO" overwrite old data when destination table encapsulated by > backquote > > > Key: HIVE-16907 > URL: https://issues.apache.org/jira/browse/HIVE-16907 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 1.1.0, 2.1.1 >Reporter: Nemon Lou >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-16907.02.patch, HIVE-16907.03.patch, > HIVE-16907.03.patch, HIVE-16907.1.patch > > > A way to reproduce: > {noformat} > create database tdb; > use tdb; > create table t1(id int); > create table t2(id int); > explain insert into `tdb.t1` select * from t2; > {noformat} > {noformat} > +---+ > | > Explain | > +---+ > | STAGE DEPENDENCIES: > | > | Stage-1 is a root stage > | > | Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, >
[jira] [Commented] (HIVE-21048) Remove needless org.mortbay.jetty from hadoop exclusions
[ https://issues.apache.org/jira/browse/HIVE-21048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724971#comment-16724971 ] Laszlo Bodor commented on HIVE-21048: - uploading 05.patch in order to get a green run. druidkafkamini tests, please don't fail during teardown next time, thanks in advance! > Remove needless org.mortbay.jetty from hadoop exclusions > > > Key: HIVE-21048 > URL: https://issues.apache.org/jira/browse/HIVE-21048 > Project: Hive > Issue Type: Bug >Reporter: Laszlo Bodor >Assignee: Laszlo Bodor >Priority: Major > Attachments: HIVE-21048.01.patch, HIVE-21048.02.patch, > HIVE-21048.03.patch, HIVE-21048.04.patch, HIVE-21048.05.patch, dep.out > > > During HIVE-20638 i found that org.mortbay.jetty exclusions from e.g. hadoop > don't take effect, as the actual groupId of jetty is org.eclipse.jetty for > most of the current projects, please find attachment (example for hive > commons project). > https://en.wikipedia.org/wiki/Jetty_(web_server)#History -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724951#comment-16724951 ] Hive QA commented on HIVE-21055: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952321/HIVE-21055.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 15712 tests executed *Failed tests:* {noformat} TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) (batchId=251) TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=251) TestReplAcidTablesWithJsonMessage - did not produce a TEST-*.xml file (likely timed out) (batchId=251) TestReplicationScenariosIncrementalLoadAcidTables - did not produce a TEST-*.xml file (likely timed out) (batchId=249) TestReplicationWithTableMigration - did not produce a TEST-*.xml file (likely timed out) (batchId=244) TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=251) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15381/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15381/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15381/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12952321 - PreCommit-HIVE-Build > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch > > > For the repl load command, the user can specify the execution mode as part of the "with" > clause. But the config for executing tasks in parallel or serial is not read > from the command-specific config. It is read from the hive server config. So > even if the user specifies to run the tasks in parallel during the repl load command, > the tasks are getting executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
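The intended lookup order described in the issue (command-scoped WITH-clause settings taking precedence over the server-level default) can be sketched with plain maps; the key name and method names here are illustrative, not Hive's actual HiveConf API:

```java
import java.util.HashMap;
import java.util.Map;

public class ConfResolver {
    // Illustrative key name; the real Hive configuration key may differ.
    static final String PARALLEL_KEY = "hive.exec.parallel";

    // Command-scoped overrides (e.g. from a repl load WITH clause) must win
    // over the server defaults; the bug was reading only the server config.
    static boolean isParallel(Map<String, String> commandConf, Map<String, String> serverConf) {
        String v = commandConf.getOrDefault(PARALLEL_KEY,
                serverConf.getOrDefault(PARALLEL_KEY, "false"));
        return Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> server = new HashMap<>();
        server.put(PARALLEL_KEY, "false");
        Map<String, String> command = new HashMap<>();
        command.put(PARALLEL_KEY, "true");
        System.out.println(isParallel(command, server)); // true: WITH-clause value wins
    }
}
```

Consulting only `serverConf` here reproduces the reported behavior: tasks run serially regardless of what the user passed in the WITH clause.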
[jira] [Commented] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724909#comment-16724909 ] Hive QA commented on HIVE-21055: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 54s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 27s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 40s{color} | {color:blue} ql in master has 2310 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 45s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15381/dev-support/hive-personality.sh | | git revision | master / 1020be0 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15381/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch > > > For repl load command use can specify the execution mode as part of "with" > clause. But the config for executing task in parallel or serial is not read > from the command specific config. It is read from the hive server config. So > even if user specifies to run the tasks in parallel during repl load command, > the tasks are getting executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-16907) "INSERT INTO" overwrite old data when destination table encapsulated by backquote
[ https://issues.apache.org/jira/browse/HIVE-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-16907: Attachment: HIVE-16907.03.patch > "INSERT INTO" overwrite old data when destination table encapsulated by > backquote > > > Key: HIVE-16907 > URL: https://issues.apache.org/jira/browse/HIVE-16907 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 1.1.0, 2.1.1 >Reporter: Nemon Lou >Assignee: Zoltan Haindrich >Priority: Major > Attachments: HIVE-16907.02.patch, HIVE-16907.03.patch, > HIVE-16907.03.patch, HIVE-16907.1.patch > > > A way to reproduce: > {noformat} > create database tdb; > use tdb; > create table t1(id int); > create table t2(id int); > explain insert into `tdb.t1` select * from t2; > {noformat} > {noformat} > +---+ > | > Explain | > +---+ > | STAGE DEPENDENCIES: > | > | Stage-1 is a root stage > | > | Stage-6 depends on stages: Stage-1 , consists of Stage-3, Stage-2, > Stage-4 | > | Stage-3 > | > | Stage-0 depends on stages: Stage-3, Stage-2, Stage-5 > | > | Stage-2 > | > | Stage-4 > | > | Stage-5 depends on stages: Stage-4 > | > | > | > | STAGE PLANS: > | > | Stage: Stage-1 > | > | Map Reduce > | > | Map Operator Tree: > | > | TableScan > | > | alias: t2 > | > | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE | > | Select Operator > | > | expressions: id (type: int) > | > | outputColumnNames: _col0 > | > | Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE | > | File Output Operator > | > | compressed: false > | > | Statistics: Num rows: 0 Data size: 0 Basic stats:
[jira] [Updated] (HIVE-18884) Simplify Logging in Hive Metastore Client
[ https://issues.apache.org/jira/browse/HIVE-18884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mani M updated HIVE-18884: -- Attachment: HIVE.18884.patch > Simplify Logging in Hive Metastore Client > - > > Key: HIVE-18884 > URL: https://issues.apache.org/jira/browse/HIVE-18884 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: Mani M >Priority: Minor > Labels: noob > Attachments: HIVE.18884.patch > > > https://github.com/apache/hive/blob/4047befe48c8f762c58d8854e058385c1df151c6/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java > The current logging is: > {code} > 2018-02-26 07:02:44,883 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Trying to connect to metastore with URI > thrift://host.company.com:9083 > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Connected to metastore. > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to metastore, current connections: 2 > {code} > Please simplify to something like: > {code} > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to the Metastore Server (URI > thrift://host.company.com:9083), current connections: 2 > ... or ... > 2018-02-26 07:02:44,892 ERROR hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Failed to connect to the Metastore Server (URI > thrift://host.company.com:9083) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18884) Simplify Logging in Hive Metastore Client
[ https://issues.apache.org/jira/browse/HIVE-18884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mani M updated HIVE-18884: -- Release Note: Included URI details in the connection & failure logs Status: Patch Available (was: Open) > Simplify Logging in Hive Metastore Client > - > > Key: HIVE-18884 > URL: https://issues.apache.org/jira/browse/HIVE-18884 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: Mani M >Priority: Minor > Labels: noob > Attachments: HIVE.18884.patch > > > https://github.com/apache/hive/blob/4047befe48c8f762c58d8854e058385c1df151c6/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java > The current logging is: > {code} > 2018-02-26 07:02:44,883 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Trying to connect to metastore with URI > thrift://host.company.com:9083 > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Connected to metastore. > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to metastore, current connections: 2 > {code} > Please simplify to something like: > {code} > 2018-02-26 07:02:44,892 INFO hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Opened a connection to the Metastore Server (URI > thrift://host.company.com:9083), current connections: 2 > ... or ... > 2018-02-26 07:02:44,892 ERROR hive.metastore: [HiveServer2-Handler-Pool: > Thread-65]: Failed to connect to the Metastore Server (URI > thrift://host.company.com:9083) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
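The requested change amounts to carrying the URI through to one consolidated message instead of three separate lines; a standalone sketch of the message formatting (in Hive this would be an SLF4J log call, and the class and method names here are mine):

```java
public class MetastoreLogFormat {
    // Single success line, as proposed in the issue, instead of
    // "Trying to connect" + "Connected" + "Opened a connection".
    static String opened(String uri, int openConnections) {
        return String.format(
            "Opened a connection to the Metastore Server (URI %s), current connections: %d",
            uri, openConnections);
    }

    // Single failure line, carrying the URI that failed.
    static String failed(String uri) {
        return String.format("Failed to connect to the Metastore Server (URI %s)", uri);
    }

    public static void main(String[] args) {
        System.out.println(opened("thrift://host.company.com:9083", 2));
        System.out.println(failed("thrift://host.company.com:9083"));
    }
}
```

Including the URI in both outcomes means a single log line is enough to tell which metastore instance a handler thread talked to, which is the point of the simplification.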
[jira] [Commented] (HIVE-21050) Upgrade Parquet to 1.12.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724850#comment-16724850 ] Hive QA commented on HIVE-21050: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952305/HIVE-21050.2.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15380/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15380/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15380/ Messages: {noformat} This message was trimmed, see log for full details {noformat}
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Status: Patch Available (was: Open) > Replication load command executing copy in serial mode even if parallel > execution is enabled using with clause > -- > > Key: HIVE-21055 > URL: https://issues.apache.org/jira/browse/HIVE-21055 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21055.01.patch > > > For the repl load command, the user can specify the execution mode as part of the > "with" clause. But the config that decides whether tasks run in parallel or serial is > not read from the command-specific config; it is read from the HiveServer2 config. So > even if the user requests parallel execution in the repl load command, the tasks are > executed serially. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
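The precedence bug described above can be sketched in a few lines: the command-scoped "with"-clause overlay should be consulted before falling back to the server default. A minimal Python sketch, assuming hypothetical names (`hive.exec.parallel` and the overlay map are stand-ins for Hive's actual config plumbing, not the real implementation):

```python
# Minimal sketch of config precedence for REPL LOAD (hypothetical names).
# The reported bug: the parallel/serial decision consulted only server_conf;
# the fix is to check the command-scoped "with"-clause overlay first.

def resolve_parallel(server_conf, with_clause_overlay):
    """Return True if tasks should run in parallel for this command."""
    key = "hive.exec.parallel"  # stand-in for the real config key
    # Command-scoped settings (REPL LOAD ... WITH (...)) win over server defaults.
    if key in with_clause_overlay:
        return with_clause_overlay[key].lower() == "true"
    return server_conf.get(key, "false").lower() == "true"

server_conf = {"hive.exec.parallel": "false"}   # server default: serial
overlay = {"hive.exec.parallel": "true"}        # user asked for parallel

print(resolve_parallel(server_conf, overlay))   # overlay wins -> True
print(resolve_parallel(server_conf, {}))        # no overlay -> server default -> False
```

With the buggy behavior, the first call would ignore the overlay and return False; the fix makes the with-clause value authoritative for that command only.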
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Attachment: HIVE-21055.02.patch
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Attachment: (was: HIVE-21055.02.patch)
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Attachment: HIVE-21055.01.patch
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Attachment: (was: HIVE-21055.02.patch)
[jira] [Updated] (HIVE-21055) Replication load command executing copy in serial mode even if parallel execution is enabled using with clause
[ https://issues.apache.org/jira/browse/HIVE-21055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera updated HIVE-21055: --- Attachment: (was: HIVE-21055.01.patch)
[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.12.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21050: - Attachment: HIVE-21050.2.patch Status: Patch Available (was: Open) > Upgrade Parquet to 1.12.0 and use LogicalTypes > -- > > Key: HIVE-21050 > URL: https://issues.apache.org/jira/browse/HIVE-21050 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: Parquet, parquet > Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, > HIVE-21050.1.patch, HIVE-21050.2.patch > > > [WIP; contains necessary jars until Parquet community releases version 1.12.0] > The new Parquet version (1.12.0) uses > [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] > instead of OriginalTypes. > These are backwards-compatible with OriginalTypes. > Thanks to [~kuczoram] for her work on this patch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.12.0 and use LogicalTypes
[ https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-21050: - Status: Open (was: Patch Available)
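As background for the HIVE-21050 migration, the linked parquet-format spec replaces the flat OriginalType/ConvertedType enum with parameterized LogicalType annotations. The sketch below is plain Python data, not the Parquet API: it illustrates how a few old names map to the new annotations per that spec (the function and tuple shape here are purely illustrative):

```python
# Illustrative mapping from Parquet's old OriginalType/ConvertedType names to
# the parameterized LogicalType annotations that replace them, per the
# parquet-format LogicalTypes spec. Plain Python data, not the Parquet API.

def logical_type_for(original_type, **params):
    """Return a (name, params) tuple describing the replacement annotation."""
    mapping = {
        "UTF8": ("STRING", {}),
        "DATE": ("DATE", {}),
        # DECIMAL keeps precision/scale, now carried on the annotation itself.
        "DECIMAL": ("DECIMAL", {"precision": params.get("precision"),
                                "scale": params.get("scale")}),
        # TIMESTAMP_MILLIS becomes TIMESTAMP with an explicit unit and UTC flag.
        "TIMESTAMP_MILLIS": ("TIMESTAMP",
                             {"isAdjustedToUTC": True, "unit": "MILLIS"}),
    }
    return mapping[original_type]

print(logical_type_for("UTF8"))                            # ('STRING', {})
print(logical_type_for("DECIMAL", precision=10, scale=2))  # parameters preserved
```

Because each new annotation carries at least the information the old enum value did, the LogicalTypes are backwards-compatible with OriginalTypes, which is what lets the upgrade read existing files.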
[jira] [Commented] (HIVE-20776) Run HMS filterHooks on server-side in addition to client-side
[ https://issues.apache.org/jira/browse/HIVE-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724793#comment-16724793 ] Hive QA commented on HIVE-20776: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12952298/HIVE-20776.004.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15741 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.metastore.TestFilterHooks.testDummyFilterForPartition (batchId=220) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15379/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15379/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15379/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12952298 - PreCommit-HIVE-Build > Run HMS filterHooks on server-side in addition to client-side > - > > Key: HIVE-20776 > URL: https://issues.apache.org/jira/browse/HIVE-20776 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Karthik Manamcheri >Assignee: Na Li >Priority: Major > Attachments: HIVE-20776.001.patch, HIVE-20776.003.patch, > HIVE-20776.004.patch > > > In HMS, I noticed that all the filter hooks are applied on the client side > (in HiveMetaStoreClient.java). Is there any reason why we can't apply the > filters on the server-side? > Motivation: Some newer apache projects such as Kudu use HMS for metadata > storage. 
Kudu is not completely Java-based and there are interaction points > where they have C++ clients. In such cases, it would be ideal to have > consistent behavior from the HMS side as far as filters etc. are concerned. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
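The motivation above, identical filtering regardless of client language, amounts to moving the hook invocation into the server path. A minimal Python sketch with hypothetical class and method names (`FilterHook`, `filter_table_names`, etc. are illustrative stand-ins, not the actual HMS interfaces):

```python
# Sketch: applying a metadata filter hook on the server side so every client
# (Java, C++, ...) sees the same filtered view. All names here are
# hypothetical stand-ins for the real HMS filter-hook interfaces.

class FilterHook:
    def filter_table_names(self, user, names):
        raise NotImplementedError

class DenyPrefixHook(FilterHook):
    """Example hook: hide tables whose names start with a private prefix."""
    def filter_table_names(self, user, names):
        return [n for n in names if not n.startswith("_private_")]

class MetastoreServer:
    def __init__(self, hook):
        self.hook = hook
        self.tables = ["sales", "_private_audit", "users"]

    def get_all_tables(self, user):
        # Filtering happens before results leave the server, so a C++ client
        # gets the same view that a Java client-side hook would have produced.
        return self.hook.filter_table_names(user, list(self.tables))

server = MetastoreServer(DenyPrefixHook())
print(server.get_all_tables("alice"))   # ['sales', 'users']
```

Running the same hook client-side as well (as HMS already does) is then defense in depth rather than the only enforcement point.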