[ https://issues.apache.org/jira/browse/HIVE-25970?focusedWorklogId=733189&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-733189 ]
ASF GitHub Bot logged work on HIVE-25970:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Feb/22 17:30
            Start Date: 25/Feb/22 17:30
    Worklog Time Spent: 10m
      Work Description: zabetak closed pull request #3048:
URL: https://github.com/apache/hive/pull/3048

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 733189)
    Time Spent: 40m  (was: 0.5h)

> Missing messages in HS2 operation logs
> --------------------------------------
>
>                 Key: HIVE-25970
>                 URL: https://issues.apache.org/jira/browse/HIVE-25970
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> After HIVE-22753 & HIVE-24590, with some unlucky timing of events, operation
> log messages can get lost and never appear in the appropriate files.
> The changes in HIVE-22753 will prevent a {{HushableRandomAccessFileAppender}}
> from being created if the latter refers to a file that has been closed in the
> last second. Preventing the creation of the appender also means that the
> message which triggered the creation will be lost forever. In fact any
> message (for the same query) that comes in the interval of 1 second will be
> lost forever.
> Before HIVE-24590 the appender/file was closed only once (explicitly by HS2)
> and thus the problem may be very hard to notice in practice. However, with
> the arrival of HIVE-24590 appenders may close much more frequently (and not
> via HS2) making the issue reproducible rather easily. It suffices to set
> _hive.server2.operation.log.purgePolicy.timeToLive_ property very low and
> check the operation logs.
> The problem was discovered by investigating some intermittent failures in
> operation logging tests (e.g., TestOperationLoggingAPIWithTez).

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
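
For readers who want to see the timing window described in the issue, here is a minimal toy model in Java. It is not Hive or Log4j code: the class OperationLogModel, its methods, the constant CLOSE_GRACE_MILLIS and the file name "q1.log" are all hypothetical names chosen for illustration. The only behaviour taken from the issue text is that an appender is not created for a file closed within the last second, and that the message which would have triggered the creation is dropped. The clock is injected so the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

/** Toy model of the race described in HIVE-25970; all names are hypothetical. */
class OperationLogModel {
    /** Assumed 1 second window, as described in the issue text. */
    static final long CLOSE_GRACE_MILLIS = 1000;

    private final LongSupplier clock;
    private final Map<String, Long> closedAt = new HashMap<>();
    private final Map<String, List<String>> openAppenders = new HashMap<>();
    private final Map<String, List<String>> files = new HashMap<>();
    private final List<String> lost = new ArrayList<>();

    OperationLogModel(LongSupplier clock) {
        this.clock = clock;
    }

    /** Routes a message to the appender for the file, creating it if allowed. */
    void log(String file, String message) {
        List<String> appender = openAppenders.get(file);
        if (appender == null) {
            Long closed = closedAt.get(file);
            if (closed != null && clock.getAsLong() - closed < CLOSE_GRACE_MILLIS) {
                // Appender creation is refused, so the triggering message is dropped.
                lost.add(message);
                return;
            }
            appender = files.computeIfAbsent(file, f -> new ArrayList<>());
            openAppenders.put(file, appender);
        }
        appender.add(message);
    }

    /** Closes the appender, e.g. because a purge policy timed it out. */
    void close(String file) {
        if (openAppenders.remove(file) != null) {
            closedAt.put(file, clock.getAsLong());
        }
    }

    List<String> written(String file) {
        return files.getOrDefault(file, new ArrayList<>());
    }

    List<String> lost() {
        return lost;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        OperationLogModel model = new OperationLogModel(() -> now[0]);
        model.log("q1.log", "compiling");
        model.close("q1.log");            // early close, as after HIVE-24590
        now[0] = 500;
        model.log("q1.log", "running");   // inside the 1 second window
        now[0] = 1500;
        model.log("q1.log", "done");      // window has passed, appender is recreated
        System.out.println("written=" + model.written("q1.log"));
        System.out.println("lost=" + model.lost());
    }
}
```

In this sketch the message logged 500 ms after the close never reaches the file, while the one logged at 1500 ms does. The more often appenders are closed early (for example with a very low timeToLive), the more often a message can fall into that window.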