[jira] [Work logged] (HIVE-23413) Create a new config to skip all locks
[ https://issues.apache.org/jira/browse/HIVE-23413?focusedWorklogId=456455&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456455 ] ASF GitHub Bot logged work on HIVE-23413: - Author: ASF GitHub Bot Created on: 09/Jul/20 06:48 Start Date: 09/Jul/20 06:48 Worklog Time Spent: 10m Work Description: pvargacl commented on pull request #1220: URL: https://github.com/apache/hive/pull/1220#issuecomment-655935856 @deniskuzZ could you check this out? You already reviewed it on the review board, but I forgot about it, so I have now created a PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456455) Time Spent: 20m (was: 10m) > Create a new config to skip all locks > - > > Key: HIVE-23413 > URL: https://issues.apache.org/jira/browse/HIVE-23413 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23413.1.patch, HIVE-23413.2.patch > > Time Spent: 20m > Remaining Estimate: 0h > > From time to time a query gets blocked on locks when it should not be. > To have a quick workaround for this, we should have a config that the user > can set in the session to disable acquiring/checking locks, so we can provide > it immediately and then later investigate and fix the root cause. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23525) TestAcidTxnCleanerService is unstable
[ https://issues.apache.org/jira/browse/HIVE-23525?focusedWorklogId=456454&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456454 ] ASF GitHub Bot logged work on HIVE-23525: - Author: ASF GitHub Bot Created on: 09/Jul/20 06:46 Start Date: 09/Jul/20 06:46 Worklog Time Spent: 10m Work Description: pvargacl commented on pull request #1219: URL: https://github.com/apache/hive/pull/1219#issuecomment-655935237 @pvary The flaky test run passed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456454) Time Spent: 1h 40m (was: 1.5h) > TestAcidTxnCleanerService is unstable > - > > Key: HIVE-23525 > URL: https://issues.apache.org/jira/browse/HIVE-23525 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23525.1.patch, HIVE-23525.2.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > from time to time this exception happens > http://34.66.156.144:8080/job/hive-c/7/console > {code} > 15:03:41 [INFO] > 15:03:41 [INFO] --- > 15:03:41 [INFO] T E S T S > 15:03:41 [INFO] --- > 15:03:42 [INFO] Running > org.apache.hadoop.hive.metastore.txn.TestAcidTxnCleanerService > 15:04:10 [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time > elapsed: 25.582 s <<< FAILURE! - in > org.apache.hadoop.hive.metastore.txn.TestAcidTxnCleanerService > 15:04:10 [ERROR] > cleansAllCommittedTxns(org.apache.hadoop.hive.metastore.txn.TestAcidTxnCleanerService) > Time elapsed: 9.952 s <<< FAILURE! 
> 15:04:10 java.lang.AssertionError: expected:<6> but was:<7> > 15:04:10 at > org.apache.hadoop.hive.metastore.txn.TestAcidTxnCleanerService.cleansAllCommittedTxns(TestAcidTxnCleanerService.java:107) > 15:04:10 > 15:04:10 [INFO] > 15:04:10 [INFO] Results: > 15:04:10 [INFO] > 15:04:10 [ERROR] Failures: > 15:04:10 [ERROR] TestAcidTxnCleanerService.cleansAllCommittedTxns:107 > expected:<6> but was:<7> > 15:04:10 [INFO] > 15:04:10 [ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0 > 15:04:10 [INFO] > 15:04:10 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.20.1:test (default-test) on > project hive-standalone-metastore-server: There are test failures. > 15:04:10 [ERROR] > 15:04:10 [ERROR] Please refer to > /home/jenkins/agent/workspace/hive-c/standalone-metastore/metastore-server/target/surefire-reports > for the individual test results. > 15:04:10 [ERROR] Please refer to dump files (if any exist) > [date]-jvmRun[N].dump, [date].dumpstream and [date]-jvmRun[N].dumpstream. > 15:04:10 [ERROR] -> [Help 1] > 15:04:10 [ERROR] > 15:04:10 [ERROR] To see the full stack trace of the errors, re-run Maven > with the -e switch. > 15:04:10 [ERROR] Re-run Maven using the -X switch to enable full debug > logging. > 15:04:10 [ERROR] > 15:04:10 [ERROR] For more information about the errors and possible > solutions, please read the following articles: > 15:04:10 [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-21825) Improve client error msg when Active/Passive HA is enabled
[ https://issues.apache.org/jira/browse/HIVE-21825?focusedWorklogId=456445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456445 ] ASF GitHub Bot logged work on HIVE-21825: - Author: ASF GitHub Bot Created on: 09/Jul/20 06:35 Start Date: 09/Jul/20 06:35 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #682: URL: https://github.com/apache/hive/pull/682 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456445) Time Spent: 20m (was: 10m) > Improve client error msg when Active/Passive HA is enabled > -- > > Key: HIVE-21825 > URL: https://issues.apache.org/jira/browse/HIVE-21825 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0, 3.2.0 >Reporter: Prasanth Jayachandran >Assignee: Richard Zhang >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0, 3.2.0 > > Attachments: Hive-21825.1.patch, Hive-21825.2.patch, > Hive-21825.3.patch > > Time Spent: 20m > Remaining Estimate: 0h > > When Active/Passive HA is enabled and when client tries to connect to Passive > HA or when HS2 is still starting up, clients will receive the following the > error msg > {code:java} > 'Cannot open sessions on an inactive HS2 instance; use service discovery to > connect'{code} > This error msg can be improved to say that HS2 is still starting up (or more > user-friendly error msg). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22952) Use LinkedHashMap in TestStandardObjectInspectors.java
[ https://issues.apache.org/jira/browse/HIVE-22952?focusedWorklogId=456441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456441 ] ASF GitHub Bot logged work on HIVE-22952: - Author: ASF GitHub Bot Created on: 09/Jul/20 06:30 Start Date: 09/Jul/20 06:30 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #929: URL: https://github.com/apache/hive/pull/929 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456441) Time Spent: 40m (was: 0.5h) > Use LinkedHashMap in TestStandardObjectInspectors.java > -- > > Key: HIVE-22952 > URL: https://issues.apache.org/jira/browse/HIVE-22952 > Project: Hive > Issue Type: Bug > Components: Test, Tests >Reporter: cpugputpu >Assignee: cpugputpu >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > The test in > _org.apache.hadoop.hive.serde2.objectinspector.TestStandardObjectInspectors#testStandardUnionObjectInspector_ > can fail due to a different iteration order of HashMap. The failure is > presented as follows. > org.junit.ComparisonFailure: > expected:<\{4:{6:"six",7:"seven",8:"eight"}}> > but was:<\{4:{6:"six",8:"eight",7:"seven"}}> > The reason is that the assertion > _assertEquals("\{4:{6:\"six\",7:\"seven\",8:\"eight\"}}", > SerDeUtils.getJSONString(union, uoi1));_ compares a hard-coded string against > the string representation of a JSON object, which is implemented by a > HashMap. 
To get the string, the HashMap is iterated here at > _serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:343_ > _for (Object entry : omap.entrySet())_ > The specification about HashMap says that "this class makes no guarantees as > to the order of the map; in particular, it does not guarantee that the order > will remain constant over time". The documentation is here for your > reference: [https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html] > > The fix is to use LinkedHashMap instead of HashMap. In this way, the > non-deterministic behaviour is eliminated and the test will become more > stable. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
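The ordering guarantee behind the fix can be demonstrated in a small standalone sketch (plain Java, not Hive's actual SerDeUtils code): iterating a LinkedHashMap's entrySet() always yields insertion order, so a string rendered from it is stable and safe to compare against a hard-coded expectation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapOrderDemo {
    // Render a map the way a JSON-ish writer would, by iterating entrySet();
    // with LinkedHashMap the iteration order is the insertion order.
    static String render(Map<Integer, String> m) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<Integer, String> e : m.entrySet()) {
            if (sb.length() > 1) {
                sb.append(",");
            }
            sb.append(e.getKey()).append(":\"").append(e.getValue()).append("\"");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> m = new LinkedHashMap<>();
        m.put(6, "six");
        m.put(7, "seven");
        m.put(8, "eight");
        // Insertion order is preserved, so this output is deterministic.
        System.out.println(render(m)); // {6:"six",7:"seven",8:"eight"}
    }
}
```

With a plain HashMap the same render call could legally produce the entries in any order, which is exactly the non-determinism the test was tripping over.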
[jira] [Resolved] (HIVE-22952) Use LinkedHashMap in TestStandardObjectInspectors.java
[ https://issues.apache.org/jira/browse/HIVE-22952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-22952. - Fix Version/s: 4.0.0 Resolution: Fixed pushed to master. Thank you [~cpugputpu] and David for reviewing the changes! > Use LinkedHashMap in TestStandardObjectInspectors.java > -- > > Key: HIVE-22952 > URL: https://issues.apache.org/jira/browse/HIVE-22952 > Project: Hive > Issue Type: Bug > Components: Test, Tests >Reporter: cpugputpu >Assignee: cpugputpu >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The test in > _org.apache.hadoop.hive.serde2.objectinspector.TestStandardObjectInspectors#testStandardUnionObjectInspector_ > can fail due to a different iteration order of HashMap. The failure is > presented as follows. > org.junit.ComparisonFailure: > expected:<\{4:{6:"six",7:"seven",8:"eight"}}> > but was:<\{4:{6:"six",8:"eight",7:"seven"}}> > The reason is that the assertion > _assertEquals("\{4:{6:\"six\",7:\"seven\",8:\"eight\"}}", > SerDeUtils.getJSONString(union, uoi1));_ compares a hard-coded string against > the string representation of a JSON object, which is implemented by a > HashMap. To get the string, the HashMap is iterated here at > _serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:343_ > _for (Object entry : omap.entrySet())_ > The specification about HashMap says that "this class makes no guarantees as > to the order of the map; in particular, it does not guarantee that the order > will remain constant over time". The documentation is here for your > reference: [https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html] > > The fix is to use LinkedHashMap instead of HashMap. In this way, the > non-deterministic behaviour is eliminated and the test will become more > stable. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23813) Fix MetricsMaintTask run frequency
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-23813: Fix Version/s: 4.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) pushed to master. Thank you [~aasha]! > Fix MetricsMaintTask run frequency > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23813.01.patch > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23813) Fix MetricsMaintTask run frequency
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-23813: Summary: Fix MetricsMaintTask run frequency (was: Fix Flaky tests due to JDO ConnectionException) > Fix MetricsMaintTask run frequency > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23813) Fix MetricsMaintTask run frequency
[ https://issues.apache.org/jira/browse/HIVE-23813?focusedWorklogId=456440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456440 ] ASF GitHub Bot logged work on HIVE-23813: - Author: ASF GitHub Bot Created on: 09/Jul/20 06:27 Start Date: 09/Jul/20 06:27 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1223: URL: https://github.com/apache/hive/pull/1223 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456440) Time Spent: 1h (was: 50m) > Fix MetricsMaintTask run frequency > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-22952) Use LinkedHashMap in TestStandardObjectInspectors.java
[ https://issues.apache.org/jira/browse/HIVE-22952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-22952: --- Assignee: cpugputpu > Use LinkedHashMap in TestStandardObjectInspectors.java > -- > > Key: HIVE-22952 > URL: https://issues.apache.org/jira/browse/HIVE-22952 > Project: Hive > Issue Type: Bug > Components: Test, Tests >Reporter: cpugputpu >Assignee: cpugputpu >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The test in > _org.apache.hadoop.hive.serde2.objectinspector.TestStandardObjectInspectors#testStandardUnionObjectInspector_ > can fail due to a different iteration order of HashMap. The failure is > presented as follows. > org.junit.ComparisonFailure: > expected:<\{4:{6:"six",7:"seven",8:"eight"}}> > but was:<\{4:{6:"six",8:"eight",7:"seven"}}> > The reason is that the assertion > _assertEquals("\{4:{6:\"six\",7:\"seven\",8:\"eight\"}}", > SerDeUtils.getJSONString(union, uoi1));_ compares a hard-coded string against > the string representation of a JSON object, which is implemented by a > HashMap. To get the string, the HashMap is iterated here at > _serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java:343_ > _for (Object entry : omap.entrySet())_ > The specification about HashMap says that "this class makes no guarantees as > to the order of the map; in particular, it does not guarantee that the order > will remain constant over time". The documentation is here for your > reference: [https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html] > > The fix is to use LinkedHashMap instead of HashMap. In this way, the > non-deterministic behaviour is eliminated and the test will become more > stable. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154211#comment-17154211 ] Aasha Medhi commented on HIVE-23813: Thank you for the review [~kgyrtkirk] > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23363) Upgrade DataNucleus dependency to 5.2
[ https://issues.apache.org/jira/browse/HIVE-23363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-23363: Fix Version/s: 4.0.0 Assignee: David Mollitor (was: Zoltan Chovan) Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. Thanks, [~belugabehr] > Upgrade DataNucleus dependency to 5.2 > - > > Key: HIVE-23363 > URL: https://issues.apache.org/jira/browse/HIVE-23363 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Zoltan Chovan >Assignee: David Mollitor >Priority: Critical > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23363.2.patch, HIVE-23363.patch > > Time Spent: 2h > Remaining Estimate: 0h > > Upgrade DataNucleus from 4.2 to 5.2, since according to its docs 4.2 has been > retired: > [http://www.datanucleus.org/documentation/products.html] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23770) Druid filter translation unable to handle inverted between
[ https://issues.apache.org/jira/browse/HIVE-23770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154169#comment-17154169 ] Ashutosh Chauhan commented on HIVE-23770: - [~nishantbangarwa] is this patch ready for commit? > Druid filter translation unable to handle inverted between > -- > > Key: HIVE-23770 > URL: https://issues.apache.org/jira/browse/HIVE-23770 > Project: Hive > Issue Type: Bug >Reporter: Nishant Bangarwa >Assignee: Nishant Bangarwa >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23770.1.patch, HIVE-23770.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Druid filter translation happens in Calcite and does not use the HiveBetween > inverted flag for translation; this misses a negation in the planned query -- This message was sent by Atlassian Jira (v8.3.4#803005)
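As an illustration of the bug class described in HIVE-23770 (a hypothetical sketch, not the actual Calcite/Druid translation code): an inverted BETWEEN must negate the range test, and a translator that drops the inverted flag silently returns the un-negated predicate.

```java
// Illustrative sketch only, not Hive/Calcite code: evaluating a BETWEEN
// whose "inverted" flag requests NOT BETWEEN. Ignoring the flag (the bug
// class described above) would return the un-negated range predicate.
class BetweenSketch {
    static boolean between(boolean inverted, int v, int lo, int hi) {
        boolean inRange = lo <= v && v <= hi;
        return inverted ? !inRange : inRange;
    }

    public static void main(String[] args) {
        System.out.println(between(false, 5, 1, 10)); // 5 BETWEEN 1 AND 10 -> true
        System.out.println(between(true, 5, 1, 10));  // 5 NOT BETWEEN 1 AND 10 -> false
    }
}
```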
[jira] [Updated] (HIVE-23811) deleteReader SARG rowId/bucketId are not getting validated properly
[ https://issues.apache.org/jira/browse/HIVE-23811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R updated HIVE-23811: -- Status: Patch Available (was: Open) > deleteReader SARG rowId/bucketId are not getting validated properly > --- > > Key: HIVE-23811 > URL: https://issues.apache.org/jira/browse/HIVE-23811 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Though we are iterating over the min/max stripeIndex, we always seem to pick > ColumnStats from the first stripe: > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java#L596] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23800) Make HiveServer2 oom hook interface
[ https://issues.apache.org/jira/browse/HIVE-23800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154134#comment-17154134 ] Zhihua Deng commented on HIVE-23800: The OOM hook holds a HiveServer2 instance and calls HiveServer2::stop() to end HiveServer2 gracefully, which cleans up the scratch (staging) directory, operation logs, and so on. Although the hooks in the driver can handle OOM, they may not be able to stop HiveServer2 as gracefully as the OOM hook does. Sometimes we may want to dump the heap for further analysis when OOM happens, or alert the devops, so it may be better to make the OOM hook here an interface. > Make HiveServer2 oom hook interface > --- > > Key: HIVE-23800 > URL: https://issues.apache.org/jira/browse/HIVE-23800 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Make the OOM hook an interface of HiveServer2, so users can implement the hook to > do something before HS2 stops, such as dumping the heap or alerting the > devops. -- This message was sent by Atlassian Jira (v8.3.4#803005)
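A minimal sketch of the shape being proposed (all names here are hypothetical, not Hive's actual API): the server exposes a hook interface, runs every registered hook while the instance is still alive (so a hook can dump the heap or send an alert), and only then stops itself gracefully.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the real HiveServer2 API: a server that runs
// user-registered OOM hooks before shutting itself down gracefully.
interface OomHook {
    void onOutOfMemory(SketchServer server);
}

class SketchServer {
    private final List<OomHook> hooks = new ArrayList<>();
    private boolean stopped = false;

    void addOomHook(OomHook hook) {
        hooks.add(hook);
    }

    // Invoked by the OOM handler: let every hook observe the live server
    // (e.g. to trigger a heap dump or an alert), then stop gracefully.
    void handleOom() {
        for (OomHook hook : hooks) {
            hook.onOutOfMemory(this);
        }
        stop();
    }

    // In the real server this is where scratch/staging directories and
    // operation logs would be cleaned up.
    void stop() {
        stopped = true;
    }

    boolean isStopped() {
        return stopped;
    }
}
```

Because the hook is an interface rather than a fixed implementation, users can plug in their own behavior without patching the server itself.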
[jira] [Updated] (HIVE-23800) Make HiveServer2 oom hook interface
[ https://issues.apache.org/jira/browse/HIVE-23800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23800: --- Description: Make the OOM hook an interface of HiveServer2, so users can implement the hook to do something before HS2 stops, such as dumping the heap or alerting the devops. > Make HiveServer2 oom hook interface > --- > > Key: HIVE-23800 > URL: https://issues.apache.org/jira/browse/HIVE-23800 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Make the OOM hook an interface of HiveServer2, so users can implement the hook to > do something before HS2 stops, such as dumping the heap or alerting the > devops. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23727) Improve SQLOperation log handling when cancel background
[ https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23727: --- Summary: Improve SQLOperation log handling when cancel background (was: Improve SQLOperation log handling when cleanup) > Improve SQLOperation log handling when cancel background > > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
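The unreachable-logging argument in HIVE-23727 can be checked against a tiny sketch of the quoted guard (illustrative only, not the actual SQLOperation code): if the guard already excludes CANCELED and TIMEDOUT, any branch inside the guarded block that tests for state == CANCELED is dead code.

```java
// Illustrative sketch of the guard quoted in the issue description.
class CancelGuardSketch {
    enum OperationState { RUNNING, FINISHED, CANCELED, TIMEDOUT, CLOSED }

    // Mirrors: if (shouldRunAsync() && state != CANCELED && state != TIMEDOUT)
    static boolean shouldCancelBackground(boolean runAsync, OperationState state) {
        return runAsync
                && state != OperationState.CANCELED
                && state != OperationState.TIMEDOUT;
    }
}
```

Whenever shouldCancelBackground returns true, state cannot be CANCELED, which is why the CANCELED-specific logging inside that block can never fire.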
[jira] [Assigned] (HIVE-23727) Improve SQLOperation log handling when cleanup
[ https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng reassigned HIVE-23727: -- Assignee: Zhihua Deng > Improve SQLOperation log handling when cleanup > -- > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (HIVE-23727) Improve SQLOperation log handling when cleanup
[ https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23727: --- Comment: was deleted (was: Fix the log output only, refine the condition in the future if needed.) > Improve SQLOperation log handling when cleanup > -- > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (HIVE-23727) Improve SQLOperation log handling when cleanup
[ https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23727: --- Comment: was deleted (was: In a busy env, the operation may be pended(asyncPrepare is enabled), so it's better to change the condition from if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) to if (shouldRunAsync() && oldState == OperationState.PENDING). ) > Improve SQLOperation log handling when cleanup > -- > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (HIVE-23727) Improve SQLOperation log handling when cleanup
[ https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23727: --- Comment: was deleted (was: I'm wondering if we can improve the whole branch if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) here. The codes here make some confusing to me, as state = OperationState.CLOSED will be the only case that the canceling background will take effect, in this case the operation may be finished, closed, failed, running(ctrl+c or session timeout) or pended. There is no need to cancel the finished, closed, failed operations, the running operations can be treated as the timeout operations, which are cleaned up by driver::close.) > Improve SQLOperation log handling when cleanup > -- > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (HIVE-23727) Improve SQLOperation log handling when cleanup
[ https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23727: --- Comment: was deleted (was: [~ychena] [~ctang] Could you take some time to look at this?) > Improve SQLOperation log handling when cleanup > -- > > Key: HIVE-23727 > URL: https://issues.apache.org/jira/browse/HIVE-23727 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > The SQLOperation checks _if (shouldRunAsync() && state != > OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the > background task. If true, the state should not be OperationState.CANCELED, so > logging under the state == OperationState.CANCELED should never happen. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23797) Throw exception when no metastore found in zookeeper
[ https://issues.apache.org/jira/browse/HIVE-23797?focusedWorklogId=456397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456397 ] ASF GitHub Bot logged work on HIVE-23797: - Author: ASF GitHub Bot Created on: 09/Jul/20 01:54 Start Date: 09/Jul/20 01:54 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1201: URL: https://github.com/apache/hive/pull/1201#issuecomment-655848906 @belugabehr can you take another look at the changes? Thank you! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456397) Time Spent: 50m (was: 40m) > Throw exception when no metastore found in zookeeper > - > > Key: HIVE-23797 > URL: https://issues.apache.org/jira/browse/HIVE-23797 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > When service discovery is enabled for the metastore, there is a chance that the > client may find no metastore uris available in zookeeper, such as during > metastore startup or when the client has wrongly configured the path. This results in > redundant retries and finally a MetaException with an "Unknown exception" message. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.
[ https://issues.apache.org/jira/browse/HIVE-23347?focusedWorklogId=456371&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456371 ] ASF GitHub Bot logged work on HIVE-23347: - Author: ASF GitHub Bot Created on: 09/Jul/20 00:32 Start Date: 09/Jul/20 00:32 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1003: URL: https://github.com/apache/hive/pull/1003 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456371) Time Spent: 0.5h (was: 20m) > MSCK REPAIR cannot discover partitions with upper case directory names. > --- > > Key: HIVE-23347 > URL: https://issues.apache.org/jira/browse/HIVE-23347 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Affects Versions: 3.1.0 >Reporter: Sankar Hariappan >Assignee: Adesh Kumar Rao >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23347.01.patch, HIVE-23347.10.patch, > HIVE-23347.2.patch, HIVE-23347.3.patch, HIVE-23347.4.patch, > HIVE-23347.5.patch, HIVE-23347.6.patch, HIVE-23347.7.patch, > HIVE-23347.8.patch, HIVE-23347.9.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > For the following scenario, we expect MSCK REPAIR to discover partitions, but > it cannot. > 1. Have partitioned data paths as follows: > hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10 > hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11 > 2. create external table t1 (key int, value string) partitioned by (Year int, > Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1'; > 3. msck repair table t1; > 4. show partitions t1; --> Returns zero partitions > 5. select * from t1; --> Returns empty data. 
> When the partition directory names are changed to lower case, this works fine. > hdfs://mycluster/datapath/t1/year=2020/month=03/day=10 > hdfs://mycluster/datapath/t1/year=2020/month=03/day=11 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23822: -- Labels: pull-request-available (was: ) > Sorted dynamic partition optimization could remove auto stat task > - > > Key: HIVE-23822 > URL: https://issues.apache.org/jira/browse/HIVE-23822 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {{mm_dp}} has a reproducer where an INSERT query is missing the auto stats task. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?focusedWorklogId=456363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456363 ] ASF GitHub Bot logged work on HIVE-23822: - Author: ASF GitHub Bot Created on: 08/Jul/20 23:00 Start Date: 08/Jul/20 23:00 Worklog Time Spent: 10m Work Description: vineetgarg02 opened a new pull request #1231: URL: https://github.com/apache/hive/pull/1231 …at task ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456363) Remaining Estimate: 0h Time Spent: 10m > Sorted dynamic partition optimization could remove auto stat task > - > > Key: HIVE-23822 > URL: https://issues.apache.org/jira/browse/HIVE-23822 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > {{mm_dp}} has reproducer where INSERT query is missing auto stats task. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23822) Sorted dynamic partition optimization could remove auto stat task
[ https://issues.apache.org/jira/browse/HIVE-23822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-23822: -- > Sorted dynamic partition optimization could remove auto stat task > - > > Key: HIVE-23822 > URL: https://issues.apache.org/jira/browse/HIVE-23822 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > {{mm_dp}} has reproducer where INSERT query is missing auto stats task. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23277) HiveProtoLogger should carry out JSON conversion in its own thread
[ https://issues.apache.org/jira/browse/HIVE-23277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-23277. - Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master. Thanks, Attila! > HiveProtoLogger should carry out JSON conversion in its own thread > -- > > Key: HIVE-23277 > URL: https://issues.apache.org/jira/browse/HIVE-23277 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Attila Magyar >Priority: Minor > Fix For: 4.0.0 > > Attachments: HIVE-23277.1.patch, Screenshot 2020-04-23 at 11.27.42 > AM.png > > > !Screenshot 2020-04-23 at 11.27.42 AM.png|width=623,height=423! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23277) HiveProtoLogger should carry out JSON conversion in its own thread
[ https://issues.apache.org/jira/browse/HIVE-23277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154024#comment-17154024 ] Ashutosh Chauhan commented on HIVE-23277: - +1 > HiveProtoLogger should carry out JSON conversion in its own thread > -- > > Key: HIVE-23277 > URL: https://issues.apache.org/jira/browse/HIVE-23277 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Attila Magyar >Priority: Minor > Attachments: HIVE-23277.1.patch, Screenshot 2020-04-23 at 11.27.42 > AM.png > > > !Screenshot 2020-04-23 at 11.27.42 AM.png|width=623,height=423! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23780) Fail dropTable if acid cleanup fails
[ https://issues.apache.org/jira/browse/HIVE-23780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154002#comment-17154002 ] Naveen Gangam commented on HIVE-23780: -- That makes sense. I forgot that there are no JDO mappings for these tables, which is probably why we have this AcidListener in the first place. I haven't reviewed the test changes but the code changes look good to me. +1 from me. > Fail dropTable if acid cleanup fails > > > Key: HIVE-23780 > URL: https://issues.apache.org/jira/browse/HIVE-23780 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Acid cleanup happens after dropTable is committed. If cleanup fails for some > reason, there are leftover entries in acid tables. This later causes dropped > table's name to be unusable by new tables. > [~pvary] [~ngangam] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-23069: Attachment: HIVE-23069.01.patch > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23069.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory-based. In the case > of a database with a very large number of tables/partitions, such an iterator may > cause the HS2 process to go OOM. > It also introduces a config option to run data copy tasks during the repl load > operation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-23069: Attachment: (was: HIVE-23069.01.patch) > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23069.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory-based. In the case > of a database with a very large number of tables/partitions, such an iterator may > cause the HS2 process to go OOM. > It also introduces a config option to run data copy tasks during the repl load > operation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
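The improvement described in HIVE-23069 boils down to replacing a fully materialized list with a lazy iterator that fetches one page at a time, keeping only a bounded number of items in memory. A generic sketch of that pattern, assuming a `fetchPage(offset, limit)`-style metastore call — the class and names are hypothetical, not taken from the actual patch:

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/**
 * Lazily pages through items instead of loading them all up front.
 * fetchPage.apply(offset, pageSize) returns the next batch; a batch shorter
 * than pageSize (possibly empty) signals that there is nothing more to fetch.
 */
public class PagedIterator<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> fetchPage;
    private final int pageSize;
    private List<T> page = null;
    private int posInPage = 0;
    private int offset = 0;

    public PagedIterator(BiFunction<Integer, Integer, List<T>> fetchPage, int pageSize) {
        this.fetchPage = fetchPage;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (page == null || posInPage == page.size()) {
            if (page != null && page.size() < pageSize) {
                return false; // last page was short: nothing more to fetch
            }
            page = fetchPage.apply(offset, pageSize);
            offset += page.size();
            posInPage = 0;
        }
        return posInPage < page.size();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(posInPage++);
    }
}
```

The peak memory is then proportional to `pageSize` rather than to the total number of tables/partitions being replicated.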
[jira] [Commented] (HIVE-23780) Fail dropTable if acid cleanup fails
[ https://issues.apache.org/jira/browse/HIVE-23780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153829#comment-17153829 ] Mustafa Iman commented on HIVE-23780: - [~ngangam] I had initially considered doing what you said. It was harder to do it that way because ObjectStore used JDO and this cleanup happens using raw sql. I could not find a way for them to share the same transaction. Then, Peter said we should do this using transactionalListener anyway for the reasons he explained above. > Fail dropTable if acid cleanup fails > > > Key: HIVE-23780 > URL: https://issues.apache.org/jira/browse/HIVE-23780 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Acid cleanup happens after dropTable is committed. If cleanup fails for some > reason, there are leftover entries in acid tables. This later causes dropped > table's name to be unusable by new tables. > [~pvary] [~ngangam] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23819) Use ranges in ValidReadTxnList serialization
[ https://issues.apache.org/jira/browse/HIVE-23819?focusedWorklogId=456269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456269 ] ASF GitHub Bot logged work on HIVE-23819: - Author: ASF GitHub Bot Created on: 08/Jul/20 17:53 Start Date: 08/Jul/20 17:53 Worklog Time Spent: 10m Work Description: pvary commented on pull request #1230: URL: https://github.com/apache/hive/pull/1230#issuecomment-655666860 Can we have microbenchmarks for serializing/deserializing, at least for the edge cases? * Everything is one big range * Everything is a single event * Everything is a 2-long range Thanks, Peter This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456269) Time Spent: 20m (was: 10m) > Use ranges in ValidReadTxnList serialization > > > Key: HIVE-23819 > URL: https://issues.apache.org/jira/browse/HIVE-23819 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > From time to time we see a case where the open / aborted transaction count is high > and the aborted transactions often come in continuous ranges. > When the transaction count goes high, the serialization / deserialization to the > hive.txn.valid.txns conf gets slower and produces a large config value. > Using ranges in the string representation can mitigate the issue somewhat. -- This message was sent by Atlassian Jira (v8.3.4#803005)
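As a rough illustration of the range encoding the issue description proposes — a sketch under my own assumptions, not the actual HIVE-23819 patch — consecutive txn ids collapse into `lo-hi` pairs, which also covers the edge cases mentioned in the review (one big range, all single events, 2-long ranges):

```java
public class TxnRangeEncoder {

    /**
     * Collapses a sorted array of txn ids into a compact string where
     * consecutive runs become "lo-hi" ranges,
     * e.g. {1,2,3,7,9,10} -> "1-3,7,9-10".
     */
    public static String encode(long[] sortedTxns) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < sortedTxns.length) {
            int j = i;
            // extend the run while the ids stay consecutive
            while (j + 1 < sortedTxns.length && sortedTxns[j + 1] == sortedTxns[j] + 1) {
                j++;
            }
            if (sb.length() > 0) {
                sb.append(',');
            }
            if (j > i) {
                sb.append(sortedTxns[i]).append('-').append(sortedTxns[j]);
            } else {
                sb.append(sortedTxns[i]);
            }
            i = j + 1;
        }
        return sb.toString();
    }
}
```

When aborted txns really do come in long continuous runs, the encoded string shrinks from O(n) ids to O(number of runs) range tokens; in the worst case (all singletons) it degrades to the plain comma-separated list.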
[jira] [Commented] (HIVE-23780) Fail dropTable if acid cleanup fails
[ https://issues.apache.org/jira/browse/HIVE-23780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153797#comment-17153797 ] Peter Vary commented on HIVE-23780: --- Sorry [~ngangam], just realized that you commented on the jira (after pushing the change :( ). To answer your question, we very strictly try to separate TxnHandler stuff from ObjectStore stuff - If we want to split out transaction related classes later this way we can do it without too much effort. Also, the transactionalListener was created for just the same purpose. We use it for the notifications as well. So all-in-all the answer is that we want to keep TxnHandler as separated from ObjectStore as possible. Thanks, Peter > Fail dropTable if acid cleanup fails > > > Key: HIVE-23780 > URL: https://issues.apache.org/jira/browse/HIVE-23780 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Acid cleanup happens after dropTable is committed. If cleanup fails for some > reason, there are leftover entries in acid tables. This later causes dropped > table's name to be unusable by new tables. > [~pvary] [~ngangam] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23780) Fail dropTable if acid cleanup fails
[ https://issues.apache.org/jira/browse/HIVE-23780?focusedWorklogId=456258&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456258 ] ASF GitHub Bot logged work on HIVE-23780: - Author: ASF GitHub Bot Created on: 08/Jul/20 17:38 Start Date: 08/Jul/20 17:38 Worklog Time Spent: 10m Work Description: pvary merged pull request #1192: URL: https://github.com/apache/hive/pull/1192 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456258) Time Spent: 0.5h (was: 20m) > Fail dropTable if acid cleanup fails > > > Key: HIVE-23780 > URL: https://issues.apache.org/jira/browse/HIVE-23780 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Acid cleanup happens after dropTable is committed. If cleanup fails for some > reason, there are leftover entries in acid tables. This later causes dropped > table's name to be unusable by new tables. > [~pvary] [~ngangam] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23780) Fail dropTable if acid cleanup fails
[ https://issues.apache.org/jira/browse/HIVE-23780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary resolved HIVE-23780. --- Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master. Thanks for the patch [~mustafaiman]! > Fail dropTable if acid cleanup fails > > > Key: HIVE-23780 > URL: https://issues.apache.org/jira/browse/HIVE-23780 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Acid cleanup happens after dropTable is committed. If cleanup fails for some > reason, there are leftover entries in acid tables. This later causes dropped > table's name to be unusable by new tables. > [~pvary] [~ngangam] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23780) Fail dropTable if acid cleanup fails
[ https://issues.apache.org/jira/browse/HIVE-23780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153742#comment-17153742 ] Naveen Gangam commented on HIVE-23780: -- [~mustafaiman] [~pvary] I don't have enough context on this AcidListener and why it's being done here, but would it make sense to do this logic as part of the dropTable prior to committing the table drop, in the ObjectStore itself? The net effect with the proposed fix is about the same (other than the order of dropping rows) but this appears cleaner and more predictable. Just wanted to know your thoughts on it. Thanks > Fail dropTable if acid cleanup fails > > > Key: HIVE-23780 > URL: https://issues.apache.org/jira/browse/HIVE-23780 > Project: Hive > Issue Type: Bug > Components: Metastore, Standalone Metastore, Transactions >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Acid cleanup happens after dropTable is committed. If cleanup fails for some > reason, there are leftover entries in acid tables. This later causes dropped > table's name to be unusable by new tables. > [~pvary] [~ngangam] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23819) Use ranges in ValidReadTxnList serialization
[ https://issues.apache.org/jira/browse/HIVE-23819?focusedWorklogId=456219&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456219 ] ASF GitHub Bot logged work on HIVE-23819: - Author: ASF GitHub Bot Created on: 08/Jul/20 16:00 Start Date: 08/Jul/20 16:00 Worklog Time Spent: 10m Work Description: pvargacl opened a new pull request #1230: URL: https://github.com/apache/hive/pull/1230 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456219) Remaining Estimate: 0h Time Spent: 10m > Use ranges in ValidReadTxnList serialization > > > Key: HIVE-23819 > URL: https://issues.apache.org/jira/browse/HIVE-23819 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Time to time we see a case, when the open / aborted transaction count is high > and often the aborted transactions come in continues ranges. > When the transaction count goes high the serialization / deserialization to > hive.txn.valid.txns conf gets slower and produces a large config value. > Using ranges in the string representation can mitigate the issue somewhat. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23819) Use ranges in ValidReadTxnList serialization
[ https://issues.apache.org/jira/browse/HIVE-23819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23819: -- Labels: pull-request-available (was: ) > Use ranges in ValidReadTxnList serialization > > > Key: HIVE-23819 > URL: https://issues.apache.org/jira/browse/HIVE-23819 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > From time to time we see a case where the open / aborted transaction count is high > and the aborted transactions often come in continuous ranges. > When the transaction count goes high, the serialization / deserialization to the > hive.txn.valid.txns conf gets slower and produces a large config value. > Using ranges in the string representation can mitigate the issue somewhat. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23819) Use ranges in ValidReadTxnList serialization
[ https://issues.apache.org/jira/browse/HIVE-23819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Varga reassigned HIVE-23819: -- > Use ranges in ValidReadTxnList serialization > > > Key: HIVE-23819 > URL: https://issues.apache.org/jira/browse/HIVE-23819 > Project: Hive > Issue Type: Improvement >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > > From time to time we see a case where the open / aborted transaction count is high > and the aborted transactions often come in continuous ranges. > When the transaction count goes high, the serialization / deserialization to the > hive.txn.valid.txns conf gets slower and produces a large config value. > Using ranges in the string representation can mitigate the issue somewhat. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23760) Upgrading to Kafka 2.5 Clients
[ https://issues.apache.org/jira/browse/HIVE-23760?focusedWorklogId=456189&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456189 ] ASF GitHub Bot logged work on HIVE-23760: - Author: ASF GitHub Bot Created on: 08/Jul/20 15:08 Start Date: 08/Jul/20 15:08 Worklog Time Spent: 10m Work Description: klcopp commented on pull request #1216: URL: https://github.com/apache/hive/pull/1216#issuecomment-655578545 @pvary, would you mind taking a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456189) Time Spent: 1.5h (was: 1h 20m) > Upgrading to Kafka 2.5 Clients > -- > > Key: HIVE-23760 > URL: https://issues.apache.org/jira/browse/HIVE-23760 > Project: Hive > Issue Type: Improvement > Components: kafka integration >Reporter: Andras Katona >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20447) Add JSON Outputformat support
[ https://issues.apache.org/jira/browse/HIVE-20447?focusedWorklogId=456188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456188 ] ASF GitHub Bot logged work on HIVE-20447: - Author: ASF GitHub Bot Created on: 08/Jul/20 15:06 Start Date: 08/Jul/20 15:06 Worklog Time Spent: 10m Work Description: belugabehr commented on a change in pull request #1169: URL: https://github.com/apache/hive/pull/1169#discussion_r451617508
## File path: beeline/src/java/org/apache/hive/beeline/JSONOutputFormat.java
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * This source file is based on code taken from SQLLine 1.9
+ * See SQLLine notice in LICENSE
+ */
+package org.apache.hive.beeline;
+
+import java.sql.SQLException;
+import java.sql.Types;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+
+import com.fasterxml.jackson.core.JsonEncoding;
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonGenerator;
+
+/**
+ * OutputFormat for standard JSON format.
+ *
+ */
+public class JSONOutputFormat extends AbstractOutputFormat {
+  protected final BeeLine beeLine;
+  protected JsonGenerator generator;
+
+  /**
+   * @param beeLine
+   */
+  JSONOutputFormat(BeeLine beeLine) {
+    this.beeLine = beeLine;
+    ByteArrayOutputStream buf = new ByteArrayOutputStream();
+    try {
+      this.generator = new JsonFactory().createGenerator(buf, JsonEncoding.UTF8);
+    } catch (IOException e) {
+      beeLine.handleException(e);
+    }
+  }
+
+  @Override
+  void printHeader(Rows.Row header) {
+    try {
+      generator.writeStartObject();
+      generator.writeArrayFieldStart("resultset");
+    } catch (IOException e) {
+      beeLine.handleException(e);
+    }
+  }
+
+  @Override
+  void printFooter(Rows.Row header) {
+    try {
+      generator.writeEndArray();
+      generator.writeEndObject();
+      beeLine.output(generator.getOutputTarget().toString());
+      generator.flush();
+    } catch (IOException e) {
+      beeLine.handleException(e);
+    }
+  }
+
+  @Override
+  void printRow(Rows rows, Rows.Row header, Rows.Row row) {
+    String[] head = header.values;
+    String[] vals = row.values;
+    try {
+      for (int i = 0; (i < head.length) && (i < vals.length); i++) {
+        generator.writeFieldName(head[i]);
+        switch (rows.rsMeta.getColumnType(i)) {
+          case Types.TINYINT:
+          case Types.SMALLINT:
+          case Types.INTEGER:
+          case Types.BIGINT:
+          case Types.REAL:
+          case Types.FLOAT:
+          case Types.DOUBLE:
+          case Types.DECIMAL:
+          case Types.NUMERIC:
+            generator.writeNumber(vals[i]);
+            break;
+          case Types.NULL:
+            generator.writeNull();
+            break;
+          case Types.BOOLEAN:
+            generator.writeBoolean(vals[i].equalsIgnoreCase("true"));

Review comment: Take a look at using `Boolean#parseBoolean` instead: https://docs.oracle.com/javase/8/docs/api/java/lang/Boolean.html#parseBoolean-java.lang.String- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456188) Time Spent: 2h 10m (was: 2h) > Add JSON Outputformat support > - > > Key: HIVE-20447 > URL: https://issues.apache.org/jira/browse/HIVE-20447 > Project: Hive > Issue Type: Task > Components: Beeline >Reporter: Max Efremov >Assignee: Hunter Logan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20447.01.patch > > Time Spent: 2h 10m > Remaining Estimate: 0h > > This function is present in SQLLine. We need add it to beeline
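A side note on the review suggestion above: beyond readability, `Boolean.parseBoolean` is null-safe, while calling `equalsIgnoreCase` on a possibly-null column value is not. This is my observation, not necessarily the reviewer's stated rationale:

```java
public class BoolParseDemo {
    public static void main(String[] args) {
        // Boolean.parseBoolean returns true only for a case-insensitive "true",
        // and a null argument yields false instead of throwing.
        String fromDb = null; // e.g. a SQL NULL surfaced as a Java null
        System.out.println(Boolean.parseBoolean(fromDb)); // prints: false
        System.out.println(Boolean.parseBoolean("TRUE")); // prints: true
        // By contrast, fromDb.equalsIgnoreCase("true") would throw a
        // NullPointerException for the null value above.
    }
}
```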
[jira] [Work logged] (HIVE-20447) Add JSON Outputformat support
[ https://issues.apache.org/jira/browse/HIVE-20447?focusedWorklogId=456187&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456187 ] ASF GitHub Bot logged work on HIVE-20447: - Author: ASF GitHub Bot Created on: 08/Jul/20 15:05 Start Date: 08/Jul/20 15:05 Worklog Time Spent: 10m Work Description: belugabehr closed pull request #421: URL: https://github.com/apache/hive/pull/421 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456187) Time Spent: 2h (was: 1h 50m) > Add JSON Outputformat support > - > > Key: HIVE-20447 > URL: https://issues.apache.org/jira/browse/HIVE-20447 > Project: Hive > Issue Type: Task > Components: Beeline >Reporter: Max Efremov >Assignee: Hunter Logan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20447.01.patch > > Time Spent: 2h > Remaining Estimate: 0h > > This function is present in SQLLine. We need add it to beeline too. > https://github.com/julianhyde/sqlline/pull/84 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23818) Use String Switch-Case Statement in StatUtils
[ https://issues.apache.org/jira/browse/HIVE-23818?focusedWorklogId=456179&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456179 ] ASF GitHub Bot logged work on HIVE-23818: - Author: ASF GitHub Bot Created on: 08/Jul/20 14:53 Start Date: 08/Jul/20 14:53 Worklog Time Spent: 10m Work Description: belugabehr opened a new pull request #1229: URL: https://github.com/apache/hive/pull/1229 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456179) Remaining Estimate: 0h Time Spent: 10m > Use String Switch-Case Statement in StatUtils > - > > Key: HIVE-23818 > URL: https://issues.apache.org/jira/browse/HIVE-23818 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > switch-case statements with Java is now available. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23818) Use String Switch-Case Statement in StatUtils
[ https://issues.apache.org/jira/browse/HIVE-23818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23818: -- Labels: pull-request-available (was: ) > Use String Switch-Case Statement in StatUtils > - > > Key: HIVE-23818 > URL: https://issues.apache.org/jira/browse/HIVE-23818 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > String switch-case statements are now available in Java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23818) Use String Switch-Case Statement in StatUtils
[ https://issues.apache.org/jira/browse/HIVE-23818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor updated HIVE-23818: -- Description: String switch-case statements are now available in Java. > Use String Switch-Case Statement in StatUtils > - > > Key: HIVE-23818 > URL: https://issues.apache.org/jira/browse/HIVE-23818 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > > String switch-case statements are now available in Java. -- This message was sent by Atlassian Jira (v8.3.4#803005)
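For reference, the language feature HIVE-23818 refers to — `switch` on String values, available since Java 7 — replaces chains of `equals` comparisons. The type-to-size mapping below is purely illustrative and is not taken from StatsUtils:

```java
public class TypeSizeDemo {

    /** Illustrative only: maps a type name to an assumed fixed size in bytes. */
    public static int sizeOf(String colType) {
        // Since Java 7, switch works directly on String values; case labels
        // are matched with String.equals, so the case must match exactly.
        switch (colType) {
            case "tinyint":
                return 1;
            case "smallint":
                return 2;
            case "int":
            case "float":
                return 4;
            case "bigint":
            case "double":
                return 8;
            default:
                return -1; // unknown / variable-length type
        }
    }
}
```

Compared to an if/else `equals` chain, the String switch compiles to a hash-based dispatch and makes fall-through groups (e.g. `"int"`/`"float"`) explicit.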
[jira] [Resolved] (HIVE-22301) Hive lineage is not generated for insert overwrite queries on partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-22301. - Fix Version/s: 4.0.0 Resolution: Fixed pushed to master. Thank you Jesus and Denys for reviewing the changes! > Hive lineage is not generated for insert overwrite queries on partitioned > tables > > > Key: HIVE-22301 > URL: https://issues.apache.org/jira/browse/HIVE-22301 > Project: Hive > Issue Type: Bug > Components: lineage >Affects Versions: 3.1.2 >Reporter: Sidharth Kumar Mishra >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: ScreenShot HookContext.png, ScreenShot > RunPostExecHook.png, ScreenShot runBeforeExecution.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > Problem: When I run the below mentioned queries, the last query should have > given the proper hive lineage info (through HookContext) from table_b to > table_t. > * Create table table_t (id int) partitioned by (dob date); > * Create table table_b (id int) partitioned by (dob date); > * from table_b a insert overwrite table table_t select a.id,a.dob; > Note : for CTAS query from a partitioned table , this issue is not seen. Only > for insert queries like insert into select * from and query > like above, issue is seen. > > Technical Observations: > At HookContext (passed from hive.ql.Driver to Hive Hook of Atlas through > hookRunner.runPostExecHooks call) contains no outputs. Check below screenshot > from IntelliJ. > !ScreenShot RunPostExecHook.png|width=728,height=427! > > I found that the PrivateHookContext is getting created with proper outputs > value as shown below initially: > !ScreenShot HookContext.png|width=714,height=541! > The same is passed properly to runBeforeExecutionHook as shown below: > !ScreenShot runBeforeExecution.png|width=719,height=620! > > Later when we pass HookContext to runPostExecHooks, there is no output > populated. 
Kindly check the reason and let me know if you need any further > information from my end. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23818) Use String Switch-Case Statement in StatUtils
[ https://issues.apache.org/jira/browse/HIVE-23818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor reassigned HIVE-23818: - > Use String Switch-Case Statement in StatUtils > - > > Key: HIVE-23818 > URL: https://issues.apache.org/jira/browse/HIVE-23818 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22301) Hive lineage is not generated for insert overwrite queries on partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-22301?focusedWorklogId=456178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456178 ] ASF GitHub Bot logged work on HIVE-22301: - Author: ASF GitHub Bot Created on: 08/Jul/20 14:51 Start Date: 08/Jul/20 14:51 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1210: URL: https://github.com/apache/hive/pull/1210 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456178) Time Spent: 0.5h (was: 20m) > Hive lineage is not generated for insert overwrite queries on partitioned > tables > > > Key: HIVE-22301 > URL: https://issues.apache.org/jira/browse/HIVE-22301 > Project: Hive > Issue Type: Bug > Components: lineage >Affects Versions: 3.1.2 >Reporter: Sidharth Kumar Mishra >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Attachments: ScreenShot HookContext.png, ScreenShot > RunPostExecHook.png, ScreenShot runBeforeExecution.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > Problem: When I run the below mentioned queries, the last query should have > given the proper hive lineage info (through HookContext) from table_b to > table_t. > * Create table table_t (id int) partitioned by (dob date); > * Create table table_b (id int) partitioned by (dob date); > * from table_b a insert overwrite table table_t select a.id,a.dob; > Note : for CTAS query from a partitioned table , this issue is not seen. Only > for insert queries like insert into select * from and query > like above, issue is seen. > > Technical Observations: > At HookContext (passed from hive.ql.Driver to Hive Hook of Atlas through > hookRunner.runPostExecHooks call) contains no outputs. Check below screenshot > from IntelliJ. 
> !ScreenShot RunPostExecHook.png|width=728,height=427! > > I found that the PrivateHookContext is getting created with proper outputs > value as shown below initially: > !ScreenShot HookContext.png|width=714,height=541! > The same is passed properly to runBeforeExecutionHook as shown below: > !ScreenShot runBeforeExecution.png|width=719,height=620! > > Later when we pass HookContext to runPostExecHooks, there is no output > populated. Kindly check the reason and let me know if you need any further > information from my end. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20447) Add JSON Outputformat support
[ https://issues.apache.org/jira/browse/HIVE-20447?focusedWorklogId=456176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456176 ] ASF GitHub Bot logged work on HIVE-20447: - Author: ASF GitHub Bot Created on: 08/Jul/20 14:49 Start Date: 08/Jul/20 14:49 Worklog Time Spent: 10m Work Description: belugabehr commented on a change in pull request #1169: URL: https://github.com/apache/hive/pull/1169#discussion_r451603702

## File path: beeline/src/java/org/apache/hive/beeline/JSONOutputFormat.java

```
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * This source file is based on code taken from SQLLine 1.9
+ * See SQLLine notice in LICENSE
+ */
+package org.apache.hive.beeline;
+
+import java.sql.SQLException;
+import java.sql.Types;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+
+import com.fasterxml.jackson.core.JsonEncoding;
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonGenerator;
+
+/**
+ * OutputFormat for standard JSON format.
+ *
+ */
+public class JSONOutputFormat extends AbstractOutputFormat {
+  protected final BeeLine beeLine;
+  protected JsonGenerator generator;
+
+  /**
+   * @param beeLine
+   */
+  JSONOutputFormat(BeeLine beeLine) {
+    this.beeLine = beeLine;
+    ByteArrayOutputStream buf = new ByteArrayOutputStream();
+    try {
+      this.generator = new JsonFactory().createGenerator(buf, JsonEncoding.UTF8);
+    } catch (IOException e) {
+      beeLine.handleException(e);
+    }
+  }
+
+  @Override
+  void printHeader(Rows.Row header) {
+    try {
+      generator.writeStartObject();
+      generator.writeArrayFieldStart("resultset");
+    } catch (IOException e) {
+      beeLine.handleException(e);
+    }
+  }
+
+  @Override
+  void printFooter(Rows.Row header) {
+    try {
+      generator.writeEndArray();
+      generator.writeEndObject();
+      beeLine.output(generator.getOutputTarget().toString());
+      generator.flush();
```

Review comment: Typically you want to `flush` the object to the underlying stream (`ByteArrayOutputStream` in this case) in case the object is doing any kind of internal buffering, in order to "flush" its content out. While your `beeLine.output` is technically correct, it's a bit confusing to other coders. Order of operations should be:

1. Flush the generator to clear any buffering into the target `OutputStream`
2. Convert the `OutputStream` into the target output format (String in this situation)
3. Please use a `new String()` with UTF-8 encoding explicitly specified here https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#String-byte:A-java.nio.charset.Charset-

## File path: beeline/src/java/org/apache/hive/beeline/JSONOutputFormat.java

```
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * This source file is based on code taken from SQLLine 1.9
+ * See SQLLine notice in LICENSE
+ */
+package org.apache.hive.beeline;
+
+import java.sql.SQLException;
+import java.sql.Types;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+
+import com.fasterxml.jackson.core.JsonEncoding;
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonGenerator;
+
+/**
+ * Outpu
```
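The reviewer's ordering advice can be illustrated with plain `java.io` types. This is a sketch, not BeeLine code: `FlushOrderDemo` and its `drain` helper are hypothetical names, and `OutputStreamWriter` stands in for a buffering generator such as Jackson's `JsonGenerator`. The point is the order: flush the writer into the target stream first, then convert the collected bytes with an explicitly named charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class FlushOrderDemo {
    // Hypothetical helper illustrating the review advice; OutputStreamWriter
    // buffers characters internally, much like a JSON generator would.
    static String drain(Writer writer, ByteArrayOutputStream buf) throws IOException {
        // 1. Flush first, so internally buffered characters reach the stream.
        writer.flush();
        // 2. Only then convert the bytes, with the charset named explicitly.
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(buf, StandardCharsets.UTF_8);
        writer.write("{\"resultset\":[]}");
        System.out.println(drain(writer, buf));
    }
}
```

Converting `buf` before the flush would silently drop whatever the writer was still buffering, which is exactly the confusion the review comment warns about.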
[jira] [Updated] (HIVE-23817) Pushing TopN Key operator PKFK inner joins
[ https://issues.apache.org/jira/browse/HIVE-23817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23817: -- Labels: pull-request-available (was: ) > Pushing TopN Key operator PKFK inner joins > -- > > Key: HIVE-23817 > URL: https://issues.apache.org/jira/browse/HIVE-23817 > Project: Hive > Issue Type: Improvement >Reporter: Attila Magyar >Assignee: Attila Magyar >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If there is primary key foreign key relationship between the tables we can > push the topnkey operator through the join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23817) Pushing TopN Key operator PKFK inner joins
[ https://issues.apache.org/jira/browse/HIVE-23817?focusedWorklogId=456166&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456166 ] ASF GitHub Bot logged work on HIVE-23817: - Author: ASF GitHub Bot Created on: 08/Jul/20 14:23 Start Date: 08/Jul/20 14:23 Worklog Time Spent: 10m Work Description: zeroflag opened a new pull request #1228: URL: https://github.com/apache/hive/pull/1228

## NOTICE (work in progress)

### Pushing the TopNKey operator through PK-FK inner joins.

Example:

Customer table:

ID (PK) | LAST_NAME
-- | --
1 | Robinson
2 | Jones
3 | Smith
4 | Heisenberg

Order table:

CUSTOMER_ID (FK) | AMOUNT
-- | --
1 | 100
1 | 50
2 | 200
3 | 30
3 | 40

Requirements for doing TopN Key pushdown:
* The PRIMARY KEY constraint on Customer.ID that forbids NULL and duplicate values.
* The NOT_NULL constraint on Order.CUSTOMER_ID that forbids NULL values.
* Plus the FOREIGN KEY constraint between Customer.ID and Order.CUSTOMER_ID ensures that exactly one row exists in the Customer table for any given row in the Order table.

In general, if the first n of the order by columns are coming from the child table (FK) then we can copy the TopNKey operator with the first n columns and put it before the join. If all columns are coming from the child table we can move the TopNKey operator without keeping the original.

```
SELECT * FROM Customer, Order
WHERE Customer.ID = Order.CUSTOMER_ID
ORDER BY Order.AMOUNT, [Order.*], [Customer.*]
LIMIT 3;
```

Result:

CUSTOMER.ID (PK) | CUSTOMER.LAST_NAME | ORDER.AMOUNT
-- | -- | --
3 | Smith | 30
3 | Smith | 40
1 | Robinson | 50
1 | Robinson | 100
2 | Jones | 200

Plan:

```
Top N Key Operator
  sort order: +
  keys: ORDER.AMOUNT, [ORDER.*]
  top n: 3
Select Operator (Order)
  [...]
Join
  [...]
Top N Key Operator
  sort order: +
  keys: ORDER.AMOUNT, [ORDER.*], [Customer.*]
  top n: 3
```

Implementation notes: PkFk join information is extracted on the calcite side and it is attached (child table index & name) to the AST as a query hint.
At the physical plan level we make use of this information to decide if we can push through the topn key operator. We also need to get the origins of the columns (in the order by) to see if they're coming from the child table. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456166) Remaining Estimate: 0h Time Spent: 10m > Pushing TopN Key operator PKFK inner joins > -- > > Key: HIVE-23817 > URL: https://issues.apache.org/jira/browse/HIVE-23817 > Project: Hive > Issue Type: Improvement >Reporter: Attila Magyar >Assignee: Attila Magyar >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > If there is primary key foreign key relationship between the tables we can > push the topnkey operator through the join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
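The push-through rule described in the PR can be sketched as a prefix check. This is an illustrative helper, not Hive's planner code: `pushableKeyPrefix` and its shapes are invented for the example. A non-zero prefix means a TopNKey over those leading keys can be copied below the join; a prefix covering every key means the operator can be moved outright.

```java
import java.util.List;
import java.util.Set;

public class TopNKeyPushdownSketch {
    // Hypothetical helper: length of the longest ORDER BY prefix whose
    // columns all originate from the child (FK-side) table.
    static int pushableKeyPrefix(List<String> orderByColumns, Set<String> childTableColumns) {
        int n = 0;
        for (String col : orderByColumns) {
            if (!childTableColumns.contains(col)) {
                break; // first parent-side column ends the pushable prefix
            }
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> orderBy = List.of("Order.AMOUNT", "Order.CUSTOMER_ID", "Customer.LAST_NAME");
        Set<String> childCols = Set.of("Order.AMOUNT", "Order.CUSTOMER_ID");
        // n == 2 here: copy a TopNKey over the first two keys below the join,
        // keeping the original above it, since not every key matched.
        System.out.println(pushableKeyPrefix(orderBy, childCols));
    }
}
```

In the PR's example, the ORDER BY starts with `Order.AMOUNT`, so at least a one-key TopNKey can be placed below the join on the Order side.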
[jira] [Assigned] (HIVE-23808) "MSCK REPAIR.. DROP Partitions fail" with kryo Exception
[ https://issues.apache.org/jira/browse/HIVE-23808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Sinkovits reassigned HIVE-23808: -- Assignee: Antal Sinkovits
> "MSCK REPAIR.. DROP Partitions fail" with kryo Exception
> -
>
> Key: HIVE-23808
> URL: https://issues.apache.org/jira/browse/HIVE-23808
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.2.0
> Reporter: Rajkumar Singh
> Assignee: Antal Sinkovits
> Priority: Major
>
> Steps to reproduce:
> 1. Create an external partitioned table
> 2. Remove some partition manually by using the hdfs dfs -rm command
> 3. Run "MSCK REPAIR.. DROP Partitions" and it will fail with the following exception
> {code:java}
> 2020-07-06 10:42:11,434 WARN org.apache.hadoop.hive.metastore.utils.RetryUtilities$ExponentiallyDecayingBatchWork: [HiveServer2-Background-Pool: Thread-210]: Exception thrown while processing using a batch size 2
> org.apache.hadoop.hive.metastore.utils.MetastoreException: MetaException(message:Index: 117, Size: 0)
> at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:479) ~[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:432) ~[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.metastore.utils.RetryUtilities$ExponentiallyDecayingBatchWork.run(RetryUtilities.java:91) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.metastore.Msck.dropPartitionsInBatches(Msck.java:496) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.metastore.Msck.repair(Msck.java:223) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.ddl.misc.msck.MsckOperation.execute(MsckOperation.java:74) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_242]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_242]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) [hadoop-common-3.1.1.7.1.1.0-565.jar:?]
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
> at java.util.concurrent.F
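The `ExponentiallyDecayingBatchWork` named in the trace retries the same work with progressively smaller batches. The following is a generic sketch of that pattern under my own naming (`DecayingBatches.run`), not Hive's actual `RetryUtilities` implementation: on any batch failure, the batch size is halved and the whole run is retried, until it decays to zero.

```java
import java.util.List;
import java.util.function.Consumer;

public class DecayingBatches {
    // Illustrative sketch of an exponentially decaying batch retry:
    // process items batch by batch; on failure, halve the batch size
    // and restart from the beginning.
    static <T> void run(List<T> items, int batchSize, Consumer<List<T>> work) {
        while (batchSize > 0) {
            try {
                for (int i = 0; i < items.size(); i += batchSize) {
                    work.accept(items.subList(i, Math.min(i + batchSize, items.size())));
                }
                return; // every batch succeeded
            } catch (RuntimeException e) {
                batchSize /= 2; // decay, then retry
            }
        }
        throw new IllegalStateException("batch size decayed to zero without success");
    }
}
```

The WARN line in the trace ("Exception thrown while processing using a batch size 2") corresponds to the catch branch above; the reported `MetaException(message:Index: 117, Size: 0)` is the per-batch failure that triggers the decay, not a problem in the retry loop itself.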
[jira] [Updated] (HIVE-23816) Concurrent access of metastore dynamic partition registration API resulting in data loss due to HDFS dir deletion
[ https://issues.apache.org/jira/browse/HIVE-23816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rameshkrishnan muthusamy updated HIVE-23816:
Description:
During the process of partition registration via the thrift api we are noticing that the associated HDFS file path is being deleted even though the path was not created by the same process. This results in loss of data in the dir path. In the below example there are 3 threads trying to create a dir, and only one of them succeeds in registering a partition, resulting in the other 2 threads deleting the directory created and registered by the original thread.

hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,307 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-379217]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,308 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-386717]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,308 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-379074]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,314 INFO hive.metastore.hivemetastoressimpl: [pool-5-thread-386717]: deleting hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,315 INFO hive.metastore.hivemetastoressimpl: [pool-5-thread-379217]: deleting hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,321 INFO org.apache.hadoop.fs.TrashPolicyDefault: [pool-5-thread-386717]: Moved: 'hdfs://test_path/dt=2020-07-02/hhmm-0850' to trash at: hdfs://user/test/.Trash/Current/test/dt=2020-07-02/hhmm=0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,321 INFO hive.metastore.hivemetastoressimpl: [pool-5-thread-386717]: Moved to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,323 ERROR hive.log: [pool-5-thread-379217]: Got exception: java.io.IOException Failed to move to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:java.io.IOException: Failed to move to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850
hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,328 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-379217]: MetaException(message:Got exception: java.io.IOException Failed to move to trash: hdfs://test_path/dt=2020-07-02/hhmm-0850)

was:
During the process of partition registration via the thrift api we are noticing that the associated HDFS file path is being deleted even though the path was not created by the same process. This results in loss of data in the dir path.

> Concurrent access of metastore dynamic partition registration API resulting
> in data loss due to HDFS dir deletion
> ---
>
> Key: HIVE-23816
> URL: https://issues.apache.org/jira/browse/HIVE-23816
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: rameshkrishnan muthusamy
> Assignee: rameshkrishnan muthusamy
> Priority: Major
>
> During the process of partition registration via the thrift api we are noticing
> that the associated HDFS file path is being deleted even though the path was
> not created by the same process.
> This results in loss of data in the dir path. In the below example there are
> 3 threads trying to create a dir, and only one of them succeeds in
> registering a partition, resulting in the other 2 threads deleting the
> directory created and registered by the original thread.
> hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,307 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-379217]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
> hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,308 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-386717]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
> hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,308 INFO org.apache.hadoop.hive.common.FileUtils: [pool-5-thread-379074]: Creating directory if it doesn't exist: hdfs://test_path/dt=2020-07-02/hhmm-0850
> hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,314 INFO hive.metastore.hivemetastoressimpl: [pool-5-thread-386717]: deleting hdfs://test_path/dt=2020-07-02/hhmm-0850
> hadoop-cmf-hive-HIVEMETASTORE-**.41:2020-07-02 08:50:31,315 INF
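One general way to avoid the race described above is to record whether the current caller actually created the directory and restrict any later cleanup to paths it created. This is a sketch of that pattern with invented names (`PartitionDirSketch`, `ensureDir`, `cleanupOnFailure`), not the actual metastore fix; it relies on `Files.createDirectory` failing atomically when the path already exists.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PartitionDirSketch {
    // Returns true only when this call created the directory. A caller that
    // gets false must never delete the path in its own error handling,
    // because another thread (or an earlier registration) owns it.
    static boolean ensureDir(Path dir) throws IOException {
        try {
            Files.createDirectory(dir); // atomic: throws if the dir already exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    static void cleanupOnFailure(Path dir, boolean createdByUs) throws IOException {
        if (createdByUs) {
            Files.deleteIfExists(dir); // only remove what this caller created
        }
    }
}
```

In the log excerpt, threads 386717 and 379217 both delete a directory they did not exclusively create; with ownership tracking, only the thread whose create actually succeeded would be allowed to clean it up.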
[jira] [Work logged] (HIVE-23790) The error message length of 2000 is exceeded for scheduled query
[ https://issues.apache.org/jira/browse/HIVE-23790?focusedWorklogId=456144&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456144 ] ASF GitHub Bot logged work on HIVE-23790: - Author: ASF GitHub Bot Created on: 08/Jul/20 13:56 Start Date: 08/Jul/20 13:56 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1211: URL: https://github.com/apache/hive/pull/1211 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456144) Time Spent: 40m (was: 0.5h) > The error message length of 2000 is exceeded for scheduled query > > > Key: HIVE-23790 > URL: https://issues.apache.org/jira/browse/HIVE-23790 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > {code:java} > 2020-07-01 08:24:23,916 ERROR org.apache.thrift.server.TThreadPoolServer: > [pool-7-thread-189]: Error occurred during processing of message. > org.datanucleus.exceptions.NucleusUserException: Attempt to store value > "FAILED: Execution Error, return code 30045 from > org.apache.hadoop.hive.ql.exec.repl.DirCopyTask. 
Permission denied: > user=hive, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:496) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:336) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkDefaultEnforcer(RangerHdfsAuthorizer.java:626) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkRangerPermission(RangerHdfsAuthorizer.java:388) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermissionWithContext(RangerHdfsAuthorizer.java:229) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:239) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1908) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1892) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1851) > at > org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:60) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3226) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1130) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:729) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985) > at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882) > " in column ""ERROR_MESSAGE"" that has maximum length of 2000. Please correct > your data! > at > org.datanucleus.store.rdbms.mapping.datastore.CharRDBMSMapping.setString(CharRDBMSMapping.java:254) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.setString(SingleFieldMapping.java:180) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeStringField(ParameterSetter.java:158) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.state.AbstractStateManager.providedStringField(AbstractStateManager.java:1448) > ~[datanucleus-core-4.1.17.jar:?] > at > org.datanucleus.state.StateManagerImpl.p
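The failure above is a message longer than the `ERROR_MESSAGE` column's 2000-character limit being handed to DataNucleus. A straightforward guard, shown here as a sketch with invented names (`ErrorMessageLimit.abbreviate`) rather than the actual fix committed for this issue, is to abbreviate the message before persisting it:

```java
public class ErrorMessageLimit {
    static final int MAX_ERROR_MESSAGE_LENGTH = 2000; // the ERROR_MESSAGE column limit

    // Truncate over-long messages, keeping a visible marker that content was cut.
    static String abbreviate(String msg) {
        if (msg == null || msg.length() <= MAX_ERROR_MESSAGE_LENGTH) {
            return msg;
        }
        return msg.substring(0, MAX_ERROR_MESSAGE_LENGTH - 3) + "...";
    }
}
```

A multi-kilobyte stack trace like the quoted `DirCopyTask` permission error would then be stored as its first ~2000 characters, which keeps the root-cause line while satisfying the column constraint.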
[jira] [Resolved] (HIVE-23790) The error message length of 2000 is exceeded for scheduled query
[ https://issues.apache.org/jira/browse/HIVE-23790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-23790. - Fix Version/s: 4.0.0 Resolution: Fixed pushed to master. Thank you Jesus for reviewing the changes! > The error message length of 2000 is exceeded for scheduled query > > > Key: HIVE-23790 > URL: https://issues.apache.org/jira/browse/HIVE-23790 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > {code:java} > 2020-07-01 08:24:23,916 ERROR org.apache.thrift.server.TThreadPoolServer: > [pool-7-thread-189]: Error occurred during processing of message. > org.datanucleus.exceptions.NucleusUserException: Attempt to store value > "FAILED: Execution Error, return code 30045 from > org.apache.hadoop.hive.ql.exec.repl.DirCopyTask. Permission denied: > user=hive, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:496) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:336) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkDefaultEnforcer(RangerHdfsAuthorizer.java:626) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkRangerPermission(RangerHdfsAuthorizer.java:388) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermissionWithContext(RangerHdfsAuthorizer.java:229) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:239) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1908) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1892) > at > 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1851) > at > org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:60) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3226) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1130) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:729) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882) > " in column ""ERROR_MESSAGE"" that has maximum length of 2000. Please correct > your data! > at > org.datanucleus.store.rdbms.mapping.datastore.CharRDBMSMapping.setString(CharRDBMSMapping.java:254) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.setString(SingleFieldMapping.java:180) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeStringField(ParameterSetter.java:158) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.state.AbstractStateManager.providedStringField(AbstractStateManager.java:1448) > ~[datanucleus-core-4.1.17.jar:?] 
> at > org.datanucleus.state.StateManagerImpl.providedStringField(StateManagerImpl.java:120) > ~[datanucleus-core-4.1.17.jar:?] > at > org.apache.hadoop.hive.metastore.model.MScheduledExecution.dnProvideField(MScheduledExecution.java) > ~[hive-exec-3.1.3000.7.2.1.0-246.jar:3.1.3000.7.2.1.0-246] > at > org.apache.hadoop.hive.metastore.model.MScheduledExecution.dnProvideFields(MScheduledExecution.java) > ~[hive-exec-3.1.3000.7.2.1.0-246.jar:3.1.3000.7.2.1.0-246] > at > org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:1170) > ~[datanucleus-core-4.1.17.jar:?] > at > org.datanucleus.sto
[jira] [Work started] (HIVE-23790) The error message length of 2000 is exceeded for scheduled query
[ https://issues.apache.org/jira/browse/HIVE-23790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-23790 started by Zoltan Haindrich. --- > The error message length of 2000 is exceeded for scheduled query > > > Key: HIVE-23790 > URL: https://issues.apache.org/jira/browse/HIVE-23790 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {code:java} > 2020-07-01 08:24:23,916 ERROR org.apache.thrift.server.TThreadPoolServer: > [pool-7-thread-189]: Error occurred during processing of message. > org.datanucleus.exceptions.NucleusUserException: Attempt to store value > "FAILED: Execution Error, return code 30045 from > org.apache.hadoop.hive.ql.exec.repl.DirCopyTask. Permission denied: > user=hive, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:496) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:336) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkDefaultEnforcer(RangerHdfsAuthorizer.java:626) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkRangerPermission(RangerHdfsAuthorizer.java:388) > at > org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermissionWithContext(RangerHdfsAuthorizer.java:229) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:239) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1908) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1892) > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1851) > at > 
org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:60) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3226) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1130) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:729) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:985) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:913) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882) > " in column ""ERROR_MESSAGE"" that has maximum length of 2000. Please correct > your data! > at > org.datanucleus.store.rdbms.mapping.datastore.CharRDBMSMapping.setString(CharRDBMSMapping.java:254) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.setString(SingleFieldMapping.java:180) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeStringField(ParameterSetter.java:158) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datanucleus.state.AbstractStateManager.providedStringField(AbstractStateManager.java:1448) > ~[datanucleus-core-4.1.17.jar:?] > at > org.datanucleus.state.StateManagerImpl.providedStringField(StateManagerImpl.java:120) > ~[datanucleus-core-4.1.17.jar:?] 
> at > org.apache.hadoop.hive.metastore.model.MScheduledExecution.dnProvideField(MScheduledExecution.java) > ~[hive-exec-3.1.3000.7.2.1.0-246.jar:3.1.3000.7.2.1.0-246] > at > org.apache.hadoop.hive.metastore.model.MScheduledExecution.dnProvideFields(MScheduledExecution.java) > ~[hive-exec-3.1.3000.7.2.1.0-246.jar:3.1.3000.7.2.1.0-246] > at > org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:1170) > ~[datanucleus-core-4.1.17.jar:?] > at > org.datanucleus.store.rdbms.request.UpdateRequest.execute(UpdateRequest.java:326) > ~[datanucleus-rdbms-4.1.19.jar:?] > at > org.datan
[jira] [Assigned] (HIVE-23817) Pushing TopN Key operator PKFK inner joins
[ https://issues.apache.org/jira/browse/HIVE-23817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Magyar reassigned HIVE-23817: > Pushing TopN Key operator PKFK inner joins > -- > > Key: HIVE-23817 > URL: https://issues.apache.org/jira/browse/HIVE-23817 > Project: Hive > Issue Type: Improvement >Reporter: Attila Magyar >Assignee: Attila Magyar >Priority: Major > > If there is a primary key-foreign key relationship between the tables, we can > push the TopN Key operator through the join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23816) Concurrent access of metastore dynamic partition registration API resulting in data loss due to HDFS dir deletion
[ https://issues.apache.org/jira/browse/HIVE-23816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rameshkrishnan muthusamy reassigned HIVE-23816: --- > Concurrent access of metastore dynamic partition registration API resulting > in data loss due to HDFS dir deletion > --- > > Key: HIVE-23816 > URL: https://issues.apache.org/jira/browse/HIVE-23816 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: rameshkrishnan muthusamy >Assignee: rameshkrishnan muthusamy >Priority: Major > > During the process of partition registration via thrift api we are noticing > that the HDFS file path associated is being deleted even though the path was > not created by the same process. > This results in loss of data in the dir path. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?focusedWorklogId=456117&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456117 ] ASF GitHub Bot logged work on HIVE-23813: - Author: ASF GitHub Bot Created on: 08/Jul/20 13:30 Start Date: 08/Jul/20 13:30 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1223: URL: https://github.com/apache/hive/pull/1223#discussion_r451545094 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ReplicationMetricsMaintTask.java ## @@ -63,13 +63,12 @@ public void run() { if (!MetastoreConf.getBoolVar(conf, ConfVars.SCHEDULED_QUERIES_ENABLED)) { Review comment: Metrics are always enabled by default. So didn't want to introduce a new config. The metric collection depends if the scheduled queries are enabled. If not, there is no metric collection for replication as the primary key for the table is schedule id. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456117) Time Spent: 50m (was: 40m) > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23800) Make HiveServer2 oom hook interface
[ https://issues.apache.org/jira/browse/HIVE-23800?focusedWorklogId=456110&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456110 ] ASF GitHub Bot logged work on HIVE-23800: - Author: ASF GitHub Bot Created on: 08/Jul/20 13:23 Start Date: 08/Jul/20 13:23 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1205: URL: https://github.com/apache/hive/pull/1205#discussion_r451539918 ## File path: service/src/java/org/apache/hive/service/server/HiveServer2OomHookRunner.java ## @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hive.service.server; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.hive.common.JavaUtils; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.conf.HiveConf.ConfVars; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.Constructor; +import java.util.ArrayList; +import java.util.List; + +public class HiveServer2OomHookRunner implements Runnable { + + private static final Logger LOG = LoggerFactory.getLogger(HiveServer2OomHookRunner.class); + private OomHookContext context; + private final List<OomHookWithContext> hooks = new ArrayList<>(); + + HiveServer2OomHookRunner(HiveServer2 hiveServer2, HiveConf hiveConf) { +context = new OomHookContext(hiveServer2); +// The hs2 has not been initialized yet, hiveServer2.getHiveConf() would be null +init(hiveConf); + } + + private void init(HiveConf hiveConf) { +String csHooks = hiveConf.getVar(ConfVars.HIVE_SERVER2_OOM_HOOKS); +if (!StringUtils.isBlank(csHooks)) { + String[] hookClasses = csHooks.split(","); + for (String hookClass : hookClasses) { +try { + Class<?> clazz = JavaUtils.loadClass(hookClass.trim()); + Constructor<?> ctor = clazz.getDeclaredConstructor(); + ctor.setAccessible(true); + hooks.add((OomHookWithContext)ctor.newInstance()); +} catch (Exception e) { + LOG.error("Skip adding oom hook '" + hookClass + "'", e); +} + } +} + } + + @VisibleForTesting + public HiveServer2OomHookRunner(HiveConf hiveConf) { +init(hiveConf); + } + + @VisibleForTesting + public List<OomHookWithContext> getHooks() { +return hooks; + } + + @Override + public void run() { +for (OomHookWithContext hook : hooks) { + hook.run(context); +} + } + + public static interface OomHookWithContext { +public void run(OomHookContext context); + } + + public static class OomHookContext { +private final HiveServer2 hiveServer2; +public OomHookContext(HiveServer2 hiveServer2) { + this.hiveServer2 = hiveServer2; +} +public 
HiveServer2 getHiveServer2() { + return hiveServer2; +} + } + + /** + * Used as default oom hook + */ + private static class DefaultOomHook implements OomHookWithContext { Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456110) Time Spent: 1h 10m (was: 1h) > Make HiveServer2 oom hook interface > --- > > Key: HIVE-23800 > URL: https://issues.apache.org/jira/browse/HIVE-23800 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
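The reflective hook loading in the patch above can be sketched in isolation. Everything below (`OomHook`, `SampleOomHook`, `loadHooks`) is a hypothetical stand-in for the patch's `OomHookWithContext` and `init(HiveConf)`; the real runner reads its class list from `ConfVars.HIVE_SERVER2_OOM_HOOKS` and casts to its own interface:

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the comma-separated-class-names hook loading used by
// HiveServer2OomHookRunner. OomHook and SampleOomHook are hypothetical
// stand-ins for the patch's OomHookWithContext and a user-supplied hook.
public class OomHookLoaderSketch {

  public interface OomHook {
    void run();
  }

  public static class SampleOomHook implements OomHook {
    @Override
    public void run() {
      System.out.println("oom hook fired");
    }
  }

  // Instantiate each named class via its no-arg constructor; a class that
  // cannot be loaded is skipped (logged), not fatal -- mirroring the patch.
  public static List<OomHook> loadHooks(String csHooks) {
    List<OomHook> hooks = new ArrayList<>();
    if (csHooks == null || csHooks.trim().isEmpty()) {
      return hooks;
    }
    for (String hookClass : csHooks.split(",")) {
      try {
        Class<?> clazz = Class.forName(hookClass.trim());
        Constructor<?> ctor = clazz.getDeclaredConstructor();
        ctor.setAccessible(true);
        hooks.add((OomHook) ctor.newInstance());
      } catch (Exception e) {
        System.err.println("Skip adding oom hook '" + hookClass + "': " + e);
      }
    }
    return hooks;
  }

  public static void main(String[] args) {
    // one valid hook, one bogus entry that is skipped
    List<OomHook> hooks = loadHooks("OomHookLoaderSketch$SampleOomHook,no.such.Hook");
    System.out.println("loaded " + hooks.size() + " hook(s)");
    for (OomHook hook : hooks) {
      hook.run();
    }
  }
}
```

The same shape generalizes to any pluggable-hook config: failures to load one entry degrade to a log line instead of preventing server startup.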
[jira] [Work logged] (HIVE-23800) Make HiveServer2 oom hook interface
[ https://issues.apache.org/jira/browse/HIVE-23800?focusedWorklogId=456109&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456109 ] ASF GitHub Bot logged work on HIVE-23800: - Author: ASF GitHub Bot Created on: 08/Jul/20 13:22 Start Date: 08/Jul/20 13:22 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1205: URL: https://github.com/apache/hive/pull/1205#discussion_r451539648 ## File path: service/src/java/org/apache/hive/service/server/HiveServer2OomHookRunner.java ## @@ -0,0 +1,102 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hive.service.server; + +import com.google.common.annotations.VisibleForTesting; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.hive.common.JavaUtils; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.conf.HiveConf.ConfVars; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.Constructor; +import java.util.ArrayList; +import java.util.List; + +public class HiveServer2OomHookRunner implements Runnable { + + private static final Logger LOG = LoggerFactory.getLogger(HiveServer2OomHookRunner.class); + private OomHookContext context; + private final List<OomHookWithContext> hooks = new ArrayList<>(); + + HiveServer2OomHookRunner(HiveServer2 hiveServer2, HiveConf hiveConf) { +context = new OomHookContext(hiveServer2); +// The hs2 has not been initialized yet, hiveServer2.getHiveConf() would be null +init(hiveConf); + } + + private void init(HiveConf hiveConf) { +String csHooks = hiveConf.getVar(ConfVars.HIVE_SERVER2_OOM_HOOKS); +if (!StringUtils.isBlank(csHooks)) { + String[] hookClasses = csHooks.split(","); + for (String hookClass : hookClasses) { +try { + Class<?> clazz = JavaUtils.loadClass(hookClass.trim()); + Constructor<?> ctor = clazz.getDeclaredConstructor(); + ctor.setAccessible(true); + hooks.add((OomHookWithContext)ctor.newInstance()); +} catch (Exception e) { + LOG.error("Skip adding oom hook '" + hookClass + "'", e); Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456109) Time Spent: 1h (was: 50m) > Make HiveServer2 oom hook interface > --- > > Key: HIVE-23800 > URL: https://issues.apache.org/jira/browse/HIVE-23800 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153599#comment-17153599 ] Aasha Medhi commented on HIVE-23813: http://ci.hive.apache.org/job/hive-flaky-check/67/ > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22957) Support Partition Filtering In MSCK REPAIR TABLE Command
[ https://issues.apache.org/jira/browse/HIVE-22957?focusedWorklogId=456097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456097 ] ASF GitHub Bot logged work on HIVE-22957: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:59 Start Date: 08/Jul/20 12:59 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1105: URL: https://github.com/apache/hive/pull/1105#discussion_r451502201 ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g ## @@ -734,6 +734,21 @@ dropPartitionOperator EQUAL | NOTEQUAL | LESSTHANOREQUALTO | LESSTHAN | GREATERTHANOREQUALTO | GREATERTHAN ; +filterPartitionSpec +: +LPAREN filterPartitionVal (COMMA filterPartitionVal )* RPAREN -> ^(TOK_PARTSPEC filterPartitionVal +) +; + +filterPartitionVal +: +identifier filterPartitionOperator constant -> ^(TOK_PARTVAL identifier filterPartitionOperator constant) Review comment: the old `partitionSpec` did not strictly require the constant ``` identifier (EQUAL constant)? ``` were there any use cases of that? 
## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java ## @@ -383,7 +375,29 @@ void findUnknownPartitions(Table table, Set<Path> partPaths, // now check the table folder and see if we find anything // that isn't in the metastore Set<Path> allPartDirs = new HashSet<>(); +Set<Path> partDirs = new HashSet<>(); +List<FieldSchema> partColumns = table.getPartitionKeys(); checkPartitionDirs(tablePath, allPartDirs, Collections.unmodifiableList(getPartColNames(table))); + +if (filterExp != null) { + PartitionExpressionProxy expressionProxy = createExpressionProxy(conf); + List<String> paritions = new ArrayList<>(); + for (Path path : allPartDirs) { +// remove the table's path from the partition path
// eg: /p1=1/p2=2/p3=3 ---> p1=1/p2=2/p3=3 +paritions.add(path.toString().substring(tablePath.toString().length() + 1)); + } + // Remove all partition paths which does not matches the filter expression. + expressionProxy.filterPartitionsByExpr(partColumns, filterExp, + conf.get(MetastoreConf.ConfVars.DEFAULTPARTITIONNAME.getVarname()), paritions); + + // now the partition list will contain all the paths that matches the filter expression. + // add them back to partDirs. + for (String path : paritions) { +partDirs.add(new Path(tablePath.toString() + "/" + path)); Review comment: instead of concatenating with `/` use `new Path(parentPath,child)` - it's more portable ## File path: itests/src/test/resources/testconfiguration.properties ## @@ -222,6 +222,7 @@ mr.query.files=\ mapjoin_subquery2.q,\ mapjoin_test_outer.q,\ masking_5.q,\ + msck_repair_filter.q,\ Review comment: is there a reason that we run this test with mr? ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ## @@ -1942,9 +1942,8 @@ metastoreCheck @after { popMsg(state); } : KW_MSCK (repair=KW_REPAIR)? (KW_TABLE tableName -((add=KW_ADD | drop=KW_DROP | sync=KW_SYNC) (parts=KW_PARTITIONS))? | -(partitionSpec)?) --> ^(TOK_MSCK $repair? tableName? $add? $drop? $sync? 
(partitionSpec*)?) +((add=KW_ADD | drop=KW_DROP | sync=KW_SYNC) (parts=KW_PARTITIONS) (filterPartitionSpec)?)?) +-> ^(TOK_MSCK $repair? tableName? $add? $drop? $sync? (filterPartitionSpec)?) Review comment: I know it was here before - but let's fix this up: instead of separate add/drop/sync variable ...we could have `opt=(KW_ADD|KW_DROP|KW_SYNC)` ? that will make the other end more readable as well ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java ## @@ -63,13 +67,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException { } Table table = getTable(tableName); -List> specs = getPartitionSpecs(table, root); +Map> partitionSpecs = getFullPartitionSpecs(root, table, conf, false); +byte[] filterExp = null; +if (partitionSpecs != null & !partitionSpecs.isEmpty()) { + // explicitly set expression proxy class to PartitionExpressionForMetastore since we intend to use the + // filterPartitionsByExpr of PartitionExpressionForMetastore for partition pruning down the line. + conf.set(MetastoreConf.ConfVars.EXPRESSION_PROXY_CLASS.getVarname(), Review comment: I don't think this will work - this is the ql module ; while `EXPRESSION_PROXY_CLASS` is a metastore conf key; in a remote metastore setup this set will probably have no effect... have you tried it? I think making a check and returning with an error that this feature is not available du
[jira] [Updated] (HIVE-23815) output statistics of underlying datastore
[ https://issues.apache.org/jira/browse/HIVE-23815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rossetti Wong updated HIVE-23815: - External issue URL: (was: https://github.com/apache/hive/pull/1226) > output statistics of underlying datastore > -- > > Key: HIVE-23815 > URL: https://issues.apache.org/jira/browse/HIVE-23815 > Project: Hive > Issue Type: Improvement >Reporter: Rossetti Wong >Assignee: Rossetti Wong >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This patch provides a way to get the statistics data of metastore's > underlying datastore, like MySQL, Oracle and so on. You can get the number > of datastore reads and writes, the average time of transaction execution, the > total active connection and so on. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23815) output statistics of underlying datastore
[ https://issues.apache.org/jira/browse/HIVE-23815?focusedWorklogId=456093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456093 ] ASF GitHub Bot logged work on HIVE-23815: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:49 Start Date: 08/Jul/20 12:49 Worklog Time Spent: 10m Work Description: xinghuayu007 opened a new pull request #1227: URL: https://github.com/apache/hive/pull/1227 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456093) Time Spent: 20m (was: 10m) > output statistics of underlying datastore > -- > > Key: HIVE-23815 > URL: https://issues.apache.org/jira/browse/HIVE-23815 > Project: Hive > Issue Type: Improvement >Reporter: Rossetti Wong >Assignee: Rossetti Wong >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This patch provides a way to get the statistics data of metastore's > underlying datastore, like MySQL, Oracl and so on. You can get the number of > datastore reads and writes, the average time of transaction execution, the > total active connection and so on. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23815) output statistics of underlying datastore
[ https://issues.apache.org/jira/browse/HIVE-23815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rossetti Wong updated HIVE-23815: - Description: This patch provides a way to get the statistics data of metastore's underlying datastore, like MySQL, Oracle and so on. You can get the number of datastore reads and writes, the average time of transaction execution, the total active connection and so on. (was: This patch provides a way to get the statistics data of metastore's underlying datastore, like MySQL, Oracl and so on. You can get the number of datastore reads and writes, the average time of transaction execution, the total active connection and so on.) > output statistics of underlying datastore > -- > > Key: HIVE-23815 > URL: https://issues.apache.org/jira/browse/HIVE-23815 > Project: Hive > Issue Type: Improvement >Reporter: Rossetti Wong >Assignee: Rossetti Wong >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This patch provides a way to get the statistics data of metastore's > underlying datastore, like MySQL, Oracle and so on. You can get the number > of datastore reads and writes, the average time of transaction execution, the > total active connection and so on. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23815) output statistics of underlying datastore
[ https://issues.apache.org/jira/browse/HIVE-23815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23815: -- Labels: pull-request-available (was: ) > output statistics of underlying datastore > -- > > Key: HIVE-23815 > URL: https://issues.apache.org/jira/browse/HIVE-23815 > Project: Hive > Issue Type: Improvement >Reporter: Rossetti Wong >Assignee: Rossetti Wong >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This patch provides a way to get the statistics data of metastore's > underlying datastore, like MySQL, Oracl and so on. You can get the number of > datastore reads and writes, the average time of transaction execution, the > total active connection and so on. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23815) output statistics of underlying datastore
[ https://issues.apache.org/jira/browse/HIVE-23815?focusedWorklogId=456092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456092 ] ASF GitHub Bot logged work on HIVE-23815: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:47 Start Date: 08/Jul/20 12:47 Worklog Time Spent: 10m Work Description: xinghuayu007 closed pull request #1226: URL: https://github.com/apache/hive/pull/1226 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456092) Remaining Estimate: 0h Time Spent: 10m > output statistics of underlying datastore > -- > > Key: HIVE-23815 > URL: https://issues.apache.org/jira/browse/HIVE-23815 > Project: Hive > Issue Type: Improvement >Reporter: Rossetti Wong >Assignee: Rossetti Wong >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This patch provides a way to get the statistics data of metastore's > underlying datastore, like MySQL, Oracl and so on. You can get the number of > datastore reads and writes, the average time of transaction execution, the > total active connection and so on. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23815) output statistics of underlying datastore
[ https://issues.apache.org/jira/browse/HIVE-23815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rossetti Wong reassigned HIVE-23815: > output statistics of underlying datastore > -- > > Key: HIVE-23815 > URL: https://issues.apache.org/jira/browse/HIVE-23815 > Project: Hive > Issue Type: Improvement >Reporter: Rossetti Wong >Assignee: Rossetti Wong >Priority: Major > > This patch provides a way to get the statistics data of metastore's > underlying datastore, like MySQL, Oracl and so on. You can get the number of > datastore reads and writes, the average time of transaction execution, the > total active connection and so on. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?focusedWorklogId=456067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456067 ] ASF GitHub Bot logged work on HIVE-23813: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:11 Start Date: 08/Jul/20 12:11 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1223: URL: https://github.com/apache/hive/pull/1223#discussion_r451486233 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ReplicationMetricsMaintTask.java ## @@ -63,13 +63,12 @@ public void run() { if (!MetastoreConf.getBoolVar(conf, ConfVars.SCHEDULED_QUERIES_ENABLED)) { Review comment: please correct the comment in `initialDelay` method as well ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ReplicationMetricsMaintTask.java ## @@ -63,13 +63,12 @@ public void run() { if (!MetastoreConf.getBoolVar(conf, ConfVars.SCHEDULED_QUERIES_ENABLED)) { Review comment: I think this class should depend on a different config knob ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/metric/MetricSink.java ## @@ -121,6 +121,7 @@ public void run() { ObjectMapper mapper = new ObjectMapper(); Review comment: is this method supposed to be fast? 
(because `ArrayList` was created for a specified size) ...anyway try not to throw away `ObjectMapper` instances right after use - `ObjectMapper`'s first time use cost could be high ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ReplicationMetricsMaintTask.java ## @@ -63,13 +63,12 @@ public void run() { if (!MetastoreConf.getBoolVar(conf, ConfVars.SCHEDULED_QUERIES_ENABLED)) { return; } - LOG.debug("Cleaning up older Metrics"); RawStore ms = HiveMetaStore.HMSHandler.getMSForConf(conf); - int maxRetainSecs = (int) TimeUnit.DAYS.toSeconds(MetastoreConf.getTimeVar(conf, -ConfVars.REPL_METRICS_MAX_AGE, TimeUnit.DAYS)); + int maxRetainSecs = (int) MetastoreConf.getTimeVar(conf, ConfVars.REPL_METRICS_MAX_AGE, TimeUnit.SECONDS); + LOG.info("Cleaning up Metrics older than {} ", maxRetainSecs); int deleteCnt = ms.deleteReplicationMetrics(maxRetainSecs); if (deleteCnt > 0L){ Review comment: nit: space before `{` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456067) Time Spent: 40m (was: 0.5h) > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
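The `ObjectMapper` remark above boils down to a standard reuse pattern: Jackson documents `ObjectMapper` as thread-safe once configured, so one shared instance can serve every `run()` call instead of paying the first-use serializer-caching cost repeatedly. Since Jackson is not assumed on the classpath here, `ExpensiveCodec` below is a hypothetical stand-in for `com.fasterxml.jackson.databind.ObjectMapper`:

```java
// Sketch of the reviewer's suggestion: keep one mapper for the lifetime of the
// sink rather than constructing a fresh one per call. ExpensiveCodec is a
// hypothetical stand-in for ObjectMapper, whose first use builds and caches
// (de)serializers and is therefore the costly part.
public class MapperReuseSketch {

  static int constructions = 0;

  static class ExpensiveCodec {
    ExpensiveCodec() {
      constructions++; // imagine serializer caching / classpath work here
    }
    String write(Object o) {
      return String.valueOf(o);
    }
  }

  // Shared, effectively immutable instance: the common static-final pattern
  // for a thread-safe, configured-once mapper.
  private static final ExpensiveCodec CODEC = new ExpensiveCodec();

  static String serialize(Object o) {
    return CODEC.write(o); // no per-call construction
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      serialize(i);
    }
    System.out.println("constructions = " + constructions);
  }
}
```

If per-call configuration really is needed, a `ThreadLocal` mapper is the usual middle ground; the anti-pattern is only the construct-use-discard cycle inside a hot loop.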
[jira] [Work logged] (HIVE-23638) Fix FindBug issues in hive-common
[ https://issues.apache.org/jira/browse/HIVE-23638?focusedWorklogId=456064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456064 ] ASF GitHub Bot logged work on HIVE-23638: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:04 Start Date: 08/Jul/20 12:04 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #1161: URL: https://github.com/apache/hive/pull/1161#discussion_r451489380 ## File path: common/src/java/org/apache/hive/common/util/SuppressFBWarnings.java ## @@ -0,0 +1,37 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hive.common.util; + +import java.lang.annotation.Retention; +import java.lang.annotation.RetentionPolicy; + +@Retention(RetentionPolicy.CLASS) +public @interface SuppressFBWarnings { +/** + * The set of FindBugs warnings that are to be suppressed in + * annotated element. The value can be a bug category, kind or pattern. + * + */ +String[] value() default {}; + +/** + * Optional documentation of the reason why the warning is suppressed + */ +String justification() default ""; +} Review comment: Sure, done :) This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456064) Time Spent: 1h 40m (was: 1.5h) > Fix FindBug issues in hive-common > - > > Key: HIVE-23638 > URL: https://issues.apache.org/jira/browse/HIVE-23638 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 1h 40m > Remaining Estimate: 0h > > mvn -Pspotbugs > -Dorg.slf4j.simpleLogger.log.org.apache.maven.plugin.surefire.SurefirePlugin=INFO > -pl :hive-common test-compile > com.github.spotbugs:spotbugs-maven-plugin:4.0.0:check -- This message was sent by Atlassian Jira (v8.3.4#803005)
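For context on how an annotation like the quoted `SuppressFBWarnings` gets used: SpotBugs only needs `RetentionPolicy.CLASS` (it reads bytecode, not runtime reflection), and the `value` names a bug pattern. The sketch below bundles the annotation with a hypothetical usage; `EI_EXPOSE_REP` is a real SpotBugs pattern name, but whether it fires depends on analyzer configuration:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// The annotation from the patch, plus a hypothetical example of applying it.
public class SuppressionExample {

  @Retention(RetentionPolicy.CLASS)
  public @interface SuppressFBWarnings {
    // bug category, kind, or pattern names to suppress
    String[] value() default {};
    // why the finding is accepted rather than fixed
    String justification() default "";
  }

  private final byte[] buffer = new byte[] {1, 2, 3};

  // Returning an internal array normally trips SpotBugs' EI_EXPOSE_REP;
  // the annotation records the reason the warning is deliberately ignored.
  @SuppressFBWarnings(value = "EI_EXPOSE_REP",
      justification = "callers are trusted not to mutate the buffer")
  public byte[] getBuffer() {
    return buffer;
  }

  public static void main(String[] args) {
    System.out.println(new SuppressionExample().getBuffer().length);
  }
}
```

Requiring a `justification` string, as the patch does, keeps suppressions auditable: a reviewer can see at the call site why each finding was waived.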
[jira] [Work logged] (HIVE-23638) Fix FindBug issues in hive-common
[ https://issues.apache.org/jira/browse/HIVE-23638?focusedWorklogId=456062&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456062 ] ASF GitHub Bot logged work on HIVE-23638: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:03 Start Date: 08/Jul/20 12:03 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #1161: URL: https://github.com/apache/hive/pull/1161#discussion_r451488767 ## File path: common/src/java/org/apache/hadoop/hive/common/StringInternUtils.java ## @@ -135,10 +135,10 @@ public static Path internUriStringsInPath(Path path) { public static <K> Map<K, String> internValuesInMap(Map<K, String> map) { if (map != null) { - for (K key : map.keySet()) { -String value = map.get(key); + for (Map.Entry<K, String> entry : map.entrySet()) { +String value = entry.getValue(); if (value != null) { - map.put(key, value.intern()); + map.put(entry.getKey(), value.intern()); Review comment: Nice idea! I followed similar logic to check if values are already interned in all helper methods in StringInternUtils class This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456062) Time Spent: 1h 20m (was: 1h 10m) > Fix FindBug issues in hive-common > - > > Key: HIVE-23638 > URL: https://issues.apache.org/jira/browse/HIVE-23638 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 1h 20m > Remaining Estimate: 0h > > mvn -Pspotbugs > -Dorg.slf4j.simpleLogger.log.org.apache.maven.plugin.surefire.SurefirePlugin=INFO > -pl :hive-common test-compile > com.github.spotbugs:spotbugs-maven-plugin:4.0.0:check -- This message was sent by Atlassian Jira (v8.3.4#803005)
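The `entrySet()` rewrite plus the "skip already-interned values" follow-up mentioned in the comment can be sketched as below. `InternMapSketch` is a hypothetical stand-in for `StringInternUtils`, and it goes one small step further than the quoted patch by writing through `entry.setValue` instead of `map.put`:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the reviewed change: iterate entrySet() so each value is read and
// written without a second hash lookup per key, and skip values that are
// already the canonical interned instance.
public class InternMapSketch {

  public static <K> Map<K, String> internValuesInMap(Map<K, String> map) {
    if (map != null) {
      for (Map.Entry<K, String> entry : map.entrySet()) {
        String value = entry.getValue();
        if (value != null) {
          String interned = value.intern();
          if (interned != value) {     // already interned: nothing to write
            entry.setValue(interned);  // write through the entry, not map.put
          }
        }
      }
    }
    return map;
  }

  public static void main(String[] args) {
    Map<String, String> m = new HashMap<>();
    m.put("k", new String("value")); // deliberately non-interned copy
    internValuesInMap(m);
    // the stored value is now the canonical interned instance
    System.out.println(m.get("k") == "value");
  }
}
```

The `interned != value` guard is exactly the "check if values are already interned" idea the author describes: for maps whose values mostly come from string literals or prior interning, it turns the common case into a read-only pass.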
[jira] [Work logged] (HIVE-23638) Fix FindBug issues in hive-common
[ https://issues.apache.org/jira/browse/HIVE-23638?focusedWorklogId=456063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456063 ] ASF GitHub Bot logged work on HIVE-23638: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:03 Start Date: 08/Jul/20 12:03 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #1161: URL: https://github.com/apache/hive/pull/1161#discussion_r451489258 ## File path: common/src/java/org/apache/hadoop/hive/conf/Validator.java ## @@ -357,14 +357,15 @@ public String validate(String value) { final Path path = FileSystems.getDefault().getPath(value); if (path == null && value != null) { return String.format("Path '%s' provided could not be located.", value); - } - final boolean isDir = Files.isDirectory(path); - final boolean isWritable = Files.isWritable(path); - if (!isDir) { -return String.format("Path '%s' provided is not a directory.", value); - } - if (!isWritable) { -return String.format("Path '%s' provided is not writable.", value); + } else if (path != null) { +final boolean isDir = Files.isDirectory(path); +final boolean isWritable = Files.isWritable(path); +if (!isDir) { + return String.format("Path '%s' provided is not a directory.", value); +} +if (!isWritable) { + return String.format("Path '%s' provided is not writable.", value); +} } return null; Review comment: Refactored the code to return early when the argument is actually null, the following logic is now simplified to Null and non null path This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456063) Time Spent: 1.5h (was: 1h 20m) > Fix FindBug issues in hive-common > - > > Key: HIVE-23638 > URL: https://issues.apache.org/jira/browse/HIVE-23638 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 1.5h > Remaining Estimate: 0h > > mvn -Pspotbugs > -Dorg.slf4j.simpleLogger.log.org.apache.maven.plugin.surefire.SurefirePlugin=INFO > -pl :hive-common test-compile > com.github.spotbugs:spotbugs-maven-plugin:4.0.0:check -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23638) Fix FindBug issues in hive-common
[ https://issues.apache.org/jira/browse/HIVE-23638?focusedWorklogId=456061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456061 ] ASF GitHub Bot logged work on HIVE-23638: - Author: ASF GitHub Bot Created on: 08/Jul/20 12:00 Start Date: 08/Jul/20 12:00 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #1161: URL: https://github.com/apache/hive/pull/1161#discussion_r451487709 ## File path: common/src/java/org/apache/hadoop/hive/common/FileUtils.java ## @@ -926,8 +925,7 @@ public static File createLocalDirsTempFile(Configuration conf, String prefix, St * delete a temporary file and remove it from delete-on-exit hook. */ public static boolean deleteTmpFile(File tempFile) { -if (tempFile != null) { - tempFile.delete(); +if (tempFile != null && tempFile.delete()) { Review comment: Good catch, fixed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456061) Time Spent: 1h 10m (was: 1h) > Fix FindBug issues in hive-common > - > > Key: HIVE-23638 > URL: https://issues.apache.org/jira/browse/HIVE-23638 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Attachments: spotbugsXml.xml > > Time Spent: 1h 10m > Remaining Estimate: 0h > > mvn -Pspotbugs > -Dorg.slf4j.simpleLogger.log.org.apache.maven.plugin.surefire.SurefirePlugin=INFO > -pl :hive-common test-compile > com.github.spotbugs:spotbugs-maven-plugin:4.0.0:check -- This message was sent by Atlassian Jira (v8.3.4#803005)
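The SpotBugs finding behind this hunk is an ignored File.delete() return value. A hedged sketch of the corrected pattern (the class is illustrative; Hive's real method also unregisters the file from a delete-on-exit hook, which is omitted here):

```java
import java.io.File;

public class TmpFileCleaner {
    // Delete a temporary file and report whether deletion actually happened,
    // instead of silently discarding the boolean that File.delete() returns.
    public static boolean deleteTmpFile(File tempFile) {
        if (tempFile != null && tempFile.delete()) {
            // Real code would also remove tempFile from the delete-on-exit
            // hook at this point; omitted in this sketch.
            return true;
        }
        return false;
    }
}
```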
[jira] [Work started] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-23069 started by Pravin Sinha. --- > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23069.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory based. In case > of a database with very large number of table/partitions, such iterator may > cause HS2 process to go OOM. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-23069: Description: Currently the iterator used while copying table data is memory based. In case of a database with very large number of table/partitions, such iterator may cause HS2 process to go OOM. Also introduces a config option to run data copy tasks during repl load operation. was:Currently the iterator used while copying table data is memory based. In case of a database with very large number of table/partitions, such iterator may cause HS2 process to go OOM. > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23069.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory based. In case > of a database with very large number of table/partitions, such iterator may > cause HS2 process to go OOM. > Also introduces a config option to run data copy tasks during repl load > operation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-23069: Status: Patch Available (was: In Progress) > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23069.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory based. In case > of a database with very large number of table/partitions, such iterator may > cause HS2 process to go OOM. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-23069: Attachment: HIVE-23069.01.patch > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23069.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory based. In case > of a database with very large number of table/partitions, such iterator may > cause HS2 process to go OOM. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23069: -- Labels: pull-request-available (was: ) > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory based. In case > of a database with very large number of table/partitions, such iterator may > cause HS2 process to go OOM. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23069) Memory efficient iterator should be used during replication.
[ https://issues.apache.org/jira/browse/HIVE-23069?focusedWorklogId=456054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456054 ] ASF GitHub Bot logged work on HIVE-23069: - Author: ASF GitHub Bot Created on: 08/Jul/20 11:48 Start Date: 08/Jul/20 11:48 Worklog Time Spent: 10m Work Description: pkumarsinha opened a new pull request #1225: URL: https://github.com/apache/hive/pull/1225 …n. Config option to execute data copy during load. ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456054) Remaining Estimate: 0h Time Spent: 10m > Memory efficient iterator should be used during replication. > > > Key: HIVE-23069 > URL: https://issues.apache.org/jira/browse/HIVE-23069 > Project: Hive > Issue Type: Improvement >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently the iterator used while copying table data is memory based. In case > of a database with very large number of table/partitions, such iterator may > cause HS2 process to go OOM. -- This message was sent by Atlassian Jira (v8.3.4#803005)
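The OOM risk described in HIVE-23069 comes from materializing every copy entry in HS2 memory; a file-backed iterator streams one entry at a time instead. A toy sketch of that idea (not Hive's actual replication iterator):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Streams entries from a backing file line by line, so heap usage stays
// constant no matter how many tables/partitions are listed.
public class FileBackedIterator implements Iterator<String>, AutoCloseable {
    private final BufferedReader reader;
    private String nextLine;

    public FileBackedIterator(Path file) throws IOException {
        this.reader = Files.newBufferedReader(file);
        this.nextLine = reader.readLine(); // prefetch so hasNext() is cheap
    }

    @Override public boolean hasNext() { return nextLine != null; }

    @Override public String next() {
        if (nextLine == null) throw new NoSuchElementException();
        String current = nextLine;
        try {
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return current;
    }

    @Override public void close() throws IOException { reader.close(); }
}
```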
[jira] [Commented] (HIVE-22957) Support Partition Filtering In MSCK REPAIR TABLE Command
[ https://issues.apache.org/jira/browse/HIVE-22957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153523#comment-17153523 ] Syed Shameerur Rahman commented on HIVE-22957: -- [~jcamachorodriguez] [~kgyrtkirk] ping for review request! > Support Partition Filtering In MSCK REPAIR TABLE Command > > > Key: HIVE-22957 > URL: https://issues.apache.org/jira/browse/HIVE-22957 > Project: Hive > Issue Type: Improvement >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Design Doc_ Partition Filtering In MSCK REPAIR > TABLE.pdf, HIVE-22957.01.patch, HIVE-22957.02.patch, HIVE-22957.03.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > *Design Doc:* > [^Design Doc_ Partition Filtering In MSCK REPAIR TABLE.pdf] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23805) ValidReadTxnList need not be constructed multiple times in AcidUtils::getAcidState
[ https://issues.apache.org/jira/browse/HIVE-23805?focusedWorklogId=456036&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456036 ] ASF GitHub Bot logged work on HIVE-23805: - Author: ASF GitHub Bot Created on: 08/Jul/20 11:29 Start Date: 08/Jul/20 11:29 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #1224: URL: https://github.com/apache/hive/pull/1224#discussion_r451472066 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java ## @@ -730,7 +729,10 @@ public boolean validateInput(FileSystem fs, HiveConf conf, ? AcidOperationalProperties.parseString(txnProperties) : null; String value = conf.get(ValidWriteIdList.VALID_WRITEIDS_KEY); - writeIdList = value == null ? new ValidReaderWriteIdList() : new ValidReaderWriteIdList(value); + writeIdList = new ValidReaderWriteIdList(value); + + value = conf.get(ValidTxnList.VALID_TXNS_KEY); + validTxnList = new ValidReadTxnList(value); Review comment: Will this help, if we have multiple partitions with multiple files to parse the TxnList only once? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456036) Time Spent: 40m (was: 0.5h) > ValidReadTxnList need not be constructed multiple times in > AcidUtils::getAcidState > --- > > Key: HIVE-23805 > URL: https://issues.apache.org/jira/browse/HIVE-23805 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2020-07-06 at 4.53.44 PM.png > > Time Spent: 40m > Remaining Estimate: 0h > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1273] > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1286] > > {code:java} > String s = conf.get(ValidTxnList.VALID_TXNS_KEY); > > > if(!Strings.isNullOrEmpty(s)) { > > ... > ... > validTxnList.readFromString(s); > > > } {code} > > > !Screenshot 2020-07-06 at 4.53.44 PM.png|width=610,height=621! > AM spends good amount of CPU parsing the same validtxnlist multiple times. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
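The question above is whether multiple partitions can reuse a single parsed txn list. One way to guarantee parse-once behavior is to memoize the parsed form keyed by its serialized string; the sketch below is hypothetical (a real ValidReadTxnList parse is more involved than the split used here):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Memoizes "parsed" txn lists by their serialized form, so the expensive
// parse runs once per distinct string rather than once per partition/file.
public class ParsedTxnListCache {
    private static final Map<String, List<String>> CACHE = new ConcurrentHashMap<>();
    static int parses = 0; // exposed only so the demo can observe the parse count

    public static List<String> get(String serialized) {
        return CACHE.computeIfAbsent(serialized, s -> {
            parses++; // stands in for the costly readFromString-style parse
            return Arrays.asList(s.split(":"));
        });
    }
}
```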
[jira] [Work logged] (HIVE-23805) ValidReadTxnList need not be constructed multiple times in AcidUtils::getAcidState
[ https://issues.apache.org/jira/browse/HIVE-23805?focusedWorklogId=456031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-456031 ] ASF GitHub Bot logged work on HIVE-23805: - Author: ASF GitHub Bot Created on: 08/Jul/20 11:26 Start Date: 08/Jul/20 11:26 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #1224: URL: https://github.com/apache/hive/pull/1224#discussion_r451470405 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ## @@ -1262,8 +1262,8 @@ public static boolean isAcid(FileSystem fileSystem, Path directory, * @throws IOException on filesystem errors */ public static Directory getAcidState(FileSystem fileSystem, Path candidateDirectory, Configuration conf, - ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles) throws IOException { -return getAcidState(fileSystem, candidateDirectory, conf, writeIdList, useFileIds, ignoreEmptyFiles, null); + ValidWriteIdList writeIdList, ValidTxnList validTxnList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles) throws IOException { +return getAcidState(fileSystem, candidateDirectory, conf, writeIdList, validTxnList, useFileIds, ignoreEmptyFiles, null); Review comment: Maybe create a separate class for AcidState? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 456031) Time Spent: 0.5h (was: 20m) > ValidReadTxnList need not be constructed multiple times in > AcidUtils::getAcidState > --- > > Key: HIVE-23805 > URL: https://issues.apache.org/jira/browse/HIVE-23805 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2020-07-06 at 4.53.44 PM.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1273] > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1286] > > {code:java} > String s = conf.get(ValidTxnList.VALID_TXNS_KEY); > > > if(!Strings.isNullOrEmpty(s)) { > > ... > ... > validTxnList.readFromString(s); > > > } {code} > > > !Screenshot 2020-07-06 at 4.53.44 PM.png|width=610,height=621! > AM spends good amount of CPU parsing the same validtxnlist multiple times. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23805) ValidReadTxnList need not be constructed multiple times in AcidUtils::getAcidState
[ https://issues.apache.org/jira/browse/HIVE-23805?focusedWorklogId=455998&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-455998 ] ASF GitHub Bot logged work on HIVE-23805: - Author: ASF GitHub Bot Created on: 08/Jul/20 10:36 Start Date: 08/Jul/20 10:36 Worklog Time Spent: 10m Work Description: pvargacl commented on a change in pull request #1224: URL: https://github.com/apache/hive/pull/1224#discussion_r451446059 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ## @@ -1262,8 +1262,8 @@ public static boolean isAcid(FileSystem fileSystem, Path directory, * @throws IOException on filesystem errors */ public static Directory getAcidState(FileSystem fileSystem, Path candidateDirectory, Configuration conf, - ValidWriteIdList writeIdList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles) throws IOException { -return getAcidState(fileSystem, candidateDirectory, conf, writeIdList, useFileIds, ignoreEmptyFiles, null); + ValidWriteIdList writeIdList, ValidTxnList validTxnList, Ref<Boolean> useFileIds, boolean ignoreEmptyFiles) throws IOException { +return getAcidState(fileSystem, candidateDirectory, conf, writeIdList, validTxnList, useFileIds, ignoreEmptyFiles, null); Review comment: I will create a separate issue to change getAcidState to take just one parameter object built with the builder pattern, because this argument list is getting out of hand This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 455998) Time Spent: 20m (was: 10m) > ValidReadTxnList need not be constructed multiple times in > AcidUtils::getAcidState > --- > > Key: HIVE-23805 > URL: https://issues.apache.org/jira/browse/HIVE-23805 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2020-07-06 at 4.53.44 PM.png > > Time Spent: 20m > Remaining Estimate: 0h > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1273] > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1286] > > {code:java} > String s = conf.get(ValidTxnList.VALID_TXNS_KEY); > > > if(!Strings.isNullOrEmpty(s)) { > > ... > ... > validTxnList.readFromString(s); > > > } {code} > > > !Screenshot 2020-07-06 at 4.53.44 PM.png|width=610,height=621! > AM spends good amount of CPU parsing the same validtxnlist multiple times. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
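The follow-up issue promised above would collapse getAcidState's growing positional argument list into one parameter object built with a builder. A hypothetical sketch of that shape (field names echo the diff, but this is not Hive's API; String stands in for the real list types):

```java
// Parameter object replacing a long positional argument list; each builder
// method returns `this`, so call sites read as a fluent chain and new
// options can be added later without breaking existing callers.
public class AcidStateOptions {
    private final String writeIdList;   // stand-in for ValidWriteIdList
    private final String validTxnList;  // stand-in for ValidTxnList
    private final boolean ignoreEmptyFiles;

    private AcidStateOptions(Builder b) {
        this.writeIdList = b.writeIdList;
        this.validTxnList = b.validTxnList;
        this.ignoreEmptyFiles = b.ignoreEmptyFiles;
    }

    public String getWriteIdList() { return writeIdList; }
    public String getValidTxnList() { return validTxnList; }
    public boolean isIgnoreEmptyFiles() { return ignoreEmptyFiles; }

    public static class Builder {
        private String writeIdList;
        private String validTxnList;
        private boolean ignoreEmptyFiles;

        public Builder writeIdList(String v) { writeIdList = v; return this; }
        public Builder validTxnList(String v) { validTxnList = v; return this; }
        public Builder ignoreEmptyFiles(boolean v) { ignoreEmptyFiles = v; return this; }
        public AcidStateOptions build() { return new AcidStateOptions(this); }
    }
}
```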
[jira] [Work logged] (HIVE-23805) ValidReadTxnList need not be constructed multiple times in AcidUtils::getAcidState
[ https://issues.apache.org/jira/browse/HIVE-23805?focusedWorklogId=455997&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-455997 ] ASF GitHub Bot logged work on HIVE-23805: - Author: ASF GitHub Bot Created on: 08/Jul/20 10:35 Start Date: 08/Jul/20 10:35 Worklog Time Spent: 10m Work Description: pvargacl opened a new pull request #1224: URL: https://github.com/apache/hive/pull/1224 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 455997) Remaining Estimate: 0h Time Spent: 10m > ValidReadTxnList need not be constructed multiple times in > AcidUtils::getAcidState > --- > > Key: HIVE-23805 > URL: https://issues.apache.org/jira/browse/HIVE-23805 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Peter Varga >Priority: Major > Attachments: Screenshot 2020-07-06 at 4.53.44 PM.png > > Time Spent: 10m > Remaining Estimate: 0h > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1273] > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1286] > > {code:java} > String s = conf.get(ValidTxnList.VALID_TXNS_KEY); > > > if(!Strings.isNullOrEmpty(s)) { > > ... > ... > validTxnList.readFromString(s); > > > } {code} > > > !Screenshot 2020-07-06 at 4.53.44 PM.png|width=610,height=621! > AM spends good amount of CPU parsing the same validtxnlist multiple times. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23805) ValidReadTxnList need not be constructed multiple times in AcidUtils::getAcidState
[ https://issues.apache.org/jira/browse/HIVE-23805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23805: -- Labels: pull-request-available (was: ) > ValidReadTxnList need not be constructed multiple times in > AcidUtils::getAcidState > --- > > Key: HIVE-23805 > URL: https://issues.apache.org/jira/browse/HIVE-23805 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Attachments: Screenshot 2020-07-06 at 4.53.44 PM.png > > Time Spent: 10m > Remaining Estimate: 0h > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1273] > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1286] > > {code:java} > String s = conf.get(ValidTxnList.VALID_TXNS_KEY); > > > if(!Strings.isNullOrEmpty(s)) { > > ... > ... > validTxnList.readFromString(s); > > > } {code} > > > !Screenshot 2020-07-06 at 4.53.44 PM.png|width=610,height=621! > AM spends good amount of CPU parsing the same validtxnlist multiple times. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23805) ValidReadTxnList need not be constructed multiple times in AcidUtils::getAcidState
[ https://issues.apache.org/jira/browse/HIVE-23805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Varga reassigned HIVE-23805: -- Assignee: Peter Varga > ValidReadTxnList need not be constructed multiple times in > AcidUtils::getAcidState > --- > > Key: HIVE-23805 > URL: https://issues.apache.org/jira/browse/HIVE-23805 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Peter Varga >Priority: Major > Attachments: Screenshot 2020-07-06 at 4.53.44 PM.png > > > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1273] > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1286] > > {code:java} > String s = conf.get(ValidTxnList.VALID_TXNS_KEY); > > > if(!Strings.isNullOrEmpty(s)) { > > ... > ... > validTxnList.readFromString(s); > > > } {code} > > > !Screenshot 2020-07-06 at 4.53.44 PM.png|width=610,height=621! > AM spends good amount of CPU parsing the same validtxnlist multiple times. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?focusedWorklogId=455996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-455996 ] ASF GitHub Bot logged work on HIVE-23813: - Author: ASF GitHub Bot Created on: 08/Jul/20 10:33 Start Date: 08/Jul/20 10:33 Worklog Time Spent: 10m Work Description: aasha commented on pull request #1223: URL: https://github.com/apache/hive/pull/1223#issuecomment-655435804 > @aasha: Could you please verify that the flaky tests are fixed with running the flaky check tester jenkins job?http://ci.hive.apache.org/job/hive-flaky-check/ > > Thanks, > Peter Yes already triggered that. 23 run and all good till now. Will monitor that. http://ci.hive.apache.org/job/hive-flaky-check/67/console This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 455996) Time Spent: 0.5h (was: 20m) > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?focusedWorklogId=455964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-455964 ] ASF GitHub Bot logged work on HIVE-23813: - Author: ASF GitHub Bot Created on: 08/Jul/20 09:50 Start Date: 08/Jul/20 09:50 Worklog Time Spent: 10m Work Description: pvary commented on pull request #1223: URL: https://github.com/apache/hive/pull/1223#issuecomment-655415683 @aasha: Could you please verify that the flaky tests are fixed with running the flaky check tester jenkins job?http://ci.hive.apache.org/job/hive-flaky-check/ Thanks, Peter This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 455964) Time Spent: 20m (was: 10m) > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-20441) NPE in ExprNodeGenericFuncDesc when hive.allow.udf.load.on.demand is set to true
[ https://issues.apache.org/jira/browse/HIVE-20441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153437#comment-17153437 ] Zhihua Deng commented on HIVE-20441: ...The problem may still be there in the trunk, [~BIGrey] are you still working on this? > NPE in ExprNodeGenericFuncDesc when hive.allow.udf.load.on.demand is set to > true > - > > Key: HIVE-20441 > URL: https://issues.apache.org/jira/browse/HIVE-20441 > Project: Hive > Issue Type: Bug > Components: CLI, HiveServer2 >Affects Versions: 1.2.1, 2.3.3 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Attachments: HIVE-20441.1.patch, HIVE-20441.2.patch, > HIVE-20441.3.patch, HIVE-20441.4.patch, HIVE-20441.patch > > > When hive.allow.udf.load.on.demand is set to true and hiveserver2 has been > started, a newly created function from other clients or hiveserver2 will be > loaded from the metastore the first time it is used. > When the udf is used in a where clause, we get an NPE like: > {code:java} > Error executing statement: > org.apache.hive.service.cli.HiveSQLException: Error while compiling > statement: FAILED: NullPointerException null > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517) 
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57) > ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [?:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [?:1.8.0_77] > at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77] > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] 
> at > org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT] > at > org.apache.hadoop.hive.ql.lib.Expressi
[jira] [Work logged] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?focusedWorklogId=455928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-455928 ] ASF GitHub Bot logged work on HIVE-23813: - Author: ASF GitHub Bot Created on: 08/Jul/20 08:43 Start Date: 08/Jul/20 08:43 Worklog Time Spent: 10m Work Description: aasha opened a new pull request #1223: URL: https://github.com/apache/hive/pull/1223 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 455928) Remaining Estimate: 0h Time Spent: 10m > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-23813.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23813: -- Labels: pull-request-available (was: ) > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23813.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-23813 started by Aasha Medhi. -- > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-23813.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-23813: --- Attachment: HIVE-23813.01.patch Status: Patch Available (was: In Progress) > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-23813.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23814) Clean up Driver
[ https://issues.apache.org/jira/browse/HIVE-23814?focusedWorklogId=455917&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-455917 ] ASF GitHub Bot logged work on HIVE-23814: - Author: ASF GitHub Bot Created on: 08/Jul/20 08:23 Start Date: 08/Jul/20 08:23 Worklog Time Spent: 10m Work Description: miklosgergely opened a new pull request #1222: URL: https://github.com/apache/hive/pull/1222 Driver is now cut down to its minimal size by extracting all of its sub tasks into separate classes. The rest should be cleaned up by - moving out some smaller parts of the code to sub task and utility classes wherever it is still possible - cutting large functions into meaningful and manageable parts - re-ordering the functions to follow the order of processing - fixing checkstyle issues This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 455917) Remaining Estimate: 0h Time Spent: 10m > Clean up Driver > --- > > Key: HIVE-23814 > URL: https://issues.apache.org/jira/browse/HIVE-23814 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Driver is now cut down to its minimal size by extracting all of its sub > tasks into separate classes. The rest should be cleaned up by > * moving out some smaller parts of the code to sub task and utility classes > wherever it is still possible > * cutting large functions into meaningful and manageable parts > * re-ordering the functions to follow the order of processing > * fixing checkstyle issues > -- This message was sent by Atlassian Jira (v8.3.4#803005)
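The "extract sub tasks to separate classes" refactoring described in the HIVE-23814 work description can be sketched as follows. This is only an illustrative pattern, not the actual Hive code: the class names `Compiler` and `Executor` and their methods are hypothetical stand-ins for the extracted sub task classes, and the "plan" is reduced to a tagged string.

```java
// Illustrative sketch of the HIVE-23814 cleanup pattern: a monolithic
// Driver delegates each phase to a dedicated sub task class, keeping only
// orchestration logic in Driver itself. All names here are hypothetical.

class Compiler {
    // Extracted compilation phase: turns a query string into a "plan"
    // (modeled here as a tagged string).
    String compile(String query) {
        return "PLAN[" + query.trim() + "]";
    }
}

class Executor {
    // Extracted execution phase: runs a compiled plan, reporting success.
    boolean execute(String plan) {
        return plan.startsWith("PLAN[");
    }
}

public class Driver {
    private final Compiler compiler = new Compiler();
    private final Executor executor = new Executor();

    // Driver keeps only the high-level flow; the heavy lifting lives in
    // the extracted classes, which can be tested and cleaned up separately.
    public boolean run(String query) {
        String plan = compiler.compile(query);
        return executor.execute(plan);
    }

    public static void main(String[] args) {
        Driver driver = new Driver();
        System.out.println(driver.run("SELECT 1")); // prints "true"
    }
}
```

The payoff of this shape, as the work description notes, is that once each phase lives in its own class, the remaining cleanup (splitting large functions, reordering, checkstyle fixes) can proceed class by class instead of inside one oversized Driver.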
[jira] [Updated] (HIVE-23814) Clean up Driver
[ https://issues.apache.org/jira/browse/HIVE-23814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23814: -- Labels: pull-request-available (was: ) > Clean up Driver > --- > > Key: HIVE-23814 > URL: https://issues.apache.org/jira/browse/HIVE-23814 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Driver is now cut down to its minimal size by extracting all of its sub > tasks into separate classes. The rest should be cleaned up by > * moving out some smaller parts of the code to sub task and utility classes > wherever it is still possible > * cutting large functions into meaningful and manageable parts > * re-ordering the functions to follow the order of processing > * fixing checkstyle issues > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23814) Clean up Driver
[ https://issues.apache.org/jira/browse/HIVE-23814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely reassigned HIVE-23814: - > Clean up Driver > --- > > Key: HIVE-23814 > URL: https://issues.apache.org/jira/browse/HIVE-23814 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > > Driver is now cut down to its minimal size by extracting all of its sub > tasks into separate classes. The rest should be cleaned up by > * moving out some smaller parts of the code to sub task and utility classes > wherever it is still possible > * cutting large functions into meaningful and manageable parts > * re-ordering the functions to follow the order of processing > * fixing checkstyle issues > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23813) Fix Flaky tests due to JDO ConnectionException
[ https://issues.apache.org/jira/browse/HIVE-23813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi reassigned HIVE-23813: -- > Fix Flaky tests due to JDO ConnectionException > -- > > Key: HIVE-23813 > URL: https://issues.apache.org/jira/browse/HIVE-23813 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23762) TestPigHBaseStorageHandler tests are flaky
[ https://issues.apache.org/jira/browse/HIVE-23762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153342#comment-17153342 ] Aasha Medhi commented on HIVE-23762: will be fixed as part of https://issues.apache.org/jira/browse/HIVE-23813 > TestPigHBaseStorageHandler tests are flaky > -- > > Key: HIVE-23762 > URL: https://issues.apache.org/jira/browse/HIVE-23762 > Project: Hive > Issue Type: Bug >Reporter: Peter Varga >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Most likely caused by HIVE-23668 -- This message was sent by Atlassian Jira (v8.3.4#803005)