Incorrect regular expression for extracting task id from filename
-----------------------------------------------------------------
Key: HIVE-2309
URL: https://issues.apache.org/jira/browse/HIVE-2309
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.7.1
Reporter: Paul Yang
Priority: Minor
To produce the correct filenames for bucketed tables, a method in
Utilities.java extracts the task id from the filename and replaces it with the
bucket number. The regex used to extract this value is wrong for attempt
numbers >= 10: the optional suffix group (_[0-9])? accepts only one digit, so
the attempt number is captured instead of the task id:
{code}
>>> import re
>>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
...          'attempt_201107090429_64965_m_001210_10').group(1)
'10'
>>> re.match("^.*?([0-9]+)(_[0-9])?(\\..*)?$",
...          'attempt_201107090429_64965_m_001210_9').group(1)
'001210'
{code}
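One possible direction, sketched in Python to match the session above (the
real code lives in Java in Utilities.java): let the optional attempt suffix
accept more than one digit. The names REPORTED, CANDIDATE and extract_task_id
are hypothetical, and the widened quantifier is an assumption for illustration,
not necessarily the change that should go into Hive.

```python
import re

# Pattern quoted in this report: the attempt suffix allows a single digit.
REPORTED = re.compile(r"^.*?([0-9]+)(_[0-9])?(\..*)?$")

# Hypothetical variant (assumption): allow a multi-digit attempt suffix.
CANDIDATE = re.compile(r"^.*?([0-9]+)(_[0-9]+)?(\..*)?$")


def extract_task_id(pattern, filename):
    """Return capture group 1 of the match, or None if nothing matches."""
    match = pattern.match(filename)
    return match.group(1) if match else None


if __name__ == "__main__":
    for name in (
        "attempt_201107090429_64965_m_001210_9",
        "attempt_201107090429_64965_m_001210_10",
    ):
        # Print what each pattern captures for the same filename.
        print(name,
              extract_task_id(REPORTED, name),
              extract_task_id(CANDIDATE, name))
```

With the suffix group widened, both filenames from the report yield the same
task id. Any real fix would still need to be checked against the other
filename shapes this method handles.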