[jira] Commented: (MAPREDUCE-118) Job.getJobID() will always return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868095#action_12868095 ] Sharad Agarwal commented on MAPREDUCE-118: -- Should we override getJobID() in Job and do ensureState before doing super.getJobID()? This would give the user a consistent error message instead of returning null in some cases. Job.getJobID() will always return null -- Key: MAPREDUCE-118 URL: https://issues.apache.org/jira/browse/MAPREDUCE-118 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 0.20.1 Reporter: Amar Kamat Assignee: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.21.0 Attachments: patch-118-0.20-1.txt, patch-118-0.20.txt, patch-118-0.21.txt, patch-118-1.txt, patch-118-2.txt, patch-118-3.txt, patch-118-4.txt, patch-118.txt JobContext is used for a read-only view of a job's info; hence all the read-only fields in JobContext are set in the constructor. Job extends JobContext. When a Job is created, the job id is not known, and hence there is no way to set the JobID once the Job is created. The JobID is obtained only when the JobClient queries the JobTracker for a job id, which happens later, i.e. upon job submission. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-118) Job.getJobID() will always return null
[ https://issues.apache.org/jira/browse/MAPREDUCE-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868096#action_12868096 ] Amareshwari Sriramadasu commented on MAPREDUCE-118: --- bq. Should we override getJobID() in Job and do ensureState before doing super.getJobID() ? I had this in my earlier patch, but I have seen problems when the user calls getJobID() from his InputFormat.getSplits(JobContext) etc., even though the JobID is available by that time.
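Sharad's suggested override, and the concern about calls from user code, can be illustrated with stand-in classes. This is a sketch only: the class and method names below mirror Hadoop's Job/JobContext but are not the real implementation, and the exact behavior of ensureState here is an assumption modeled on Job's other state-checked methods.

```java
// Minimal stand-in classes (NOT Hadoop's real Job/JobContext) sketching the
// suggestion: override getJobID() to check submission state via ensureState()
// instead of silently returning null.
class JobContextSketch {
    protected String jobId;                  // null until the tracker assigns one

    public String getJobID() {
        return jobId;                        // current behavior: may be null
    }
}

class JobSketch extends JobContextSketch {
    enum State { DEFINE, RUNNING }
    private State state = State.DEFINE;

    // Assumed behavior: throw when the job is not yet in the required state.
    private void ensureState(State required) {
        if (state != required) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + required);
        }
    }

    @Override
    public String getJobID() {
        ensureState(State.RUNNING);          // fail fast instead of returning null
        return super.getJobID();
    }

    public void submit() {                   // stands in for real job submission
        state = State.RUNNING;
        jobId = "job_201005180000_0001";     // hypothetical id for illustration
    }
}
```

The catch Amareshwari describes: user code holding a JobContext (e.g. inside getSplits()) would trip such a check even though the id is known by that point, since the state lives on Job, not on the view.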
[jira] Commented: (MAPREDUCE-1779) Should we provide a way to know JobTracker's memory info from client?
[ https://issues.apache.org/jira/browse/MAPREDUCE-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868099#action_12868099 ] Amar Kamat commented on MAPREDUCE-1779: --- Why would client applications need the JobTracker's memory information? I think the reason we added it to ClusterStatus was that it is maintained in one place and passed to the web UI for display. I don't think the JobTracker's memory information should be part of ClusterStatus. If some admins require it, it should be made available via MRAdmin; I don't see any reason why a client should be aware of the JobTracker's memory details. Should we provide a way to know JobTracker's memory info from client? - Key: MAPREDUCE-1779 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1779 Project: Hadoop Map/Reduce Issue Type: Bug Components: client, jobtracker Affects Versions: 0.21.0 Reporter: Amareshwari Sriramadasu Priority: Blocker Fix For: 0.22.0 In HADOOP-4435, in branch 0.20, the getClusterStatus() method returns the JobTracker's used memory and total memory. But these details are missing in the new API (through MAPREDUCE-777). If these details are needed only for the web UI, I don't think they are needed for the client. So, should we provide a way to know the JobTracker's memory info from the client? If yes, an API should be added in org.apache.hadoop.mapreduce.Cluster for the same.
[jira] Commented: (MAPREDUCE-1779) Should we provide a way to know JobTracker's memory info from client?
[ https://issues.apache.org/jira/browse/MAPREDUCE-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868104#action_12868104 ] dhruba borthakur commented on MAPREDUCE-1779: - Two reasons: 1. Hive actually throttles new job-submissions if the heap memory on the JT exceeds a certain threshold. 2. It is also needed to monitor the health of the JT.
[jira] Created: (MAPREDUCE-1793) Exception exclusion functionality is not working correctly.
Exception exclusion functionality is not working correctly. --- Key: MAPREDUCE-1793 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1793 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Vinay Kumar Thota Assignee: Balaji Rajagopalan Exception exclusion functionality is not working correctly; because of that, tests are failing by not matching the error count. I debugged the issue and found that the problem is with the shell command generated in the getNumberOfMatchesInLogFile function. Currently the shell command is built in the following way:
{code}
if (list != null) {
  for (int i = 0; i < list.length; ++i) {
    filePattern.append(" | grep -v " + list[i]);
  }
}
String[] cmd = new String[] {
    "bash", "-c",
    "grep -c " + pattern + " " + filePattern
        + " | awk -F: '{s+=$2} END {print s}'"
};
{code}
However, the above command won't work correctly, because it counts the exceptions in the file before excluding the known exceptions, so it reports a mismatched error count every time. The shell command should be built in the following way to work correctly:
{code}
if (list != null) {
  int index = 0;
  for (String excludeExp : list) {
    filePattern.append((++index < list.length)
        ? " | grep -v " + excludeExp
        : " | grep -vc " + excludeExp);
  }
}
String[] cmd = new String[] {
    "bash", "-c",
    "grep " + pattern + " " + filePattern
        + " | awk -F: '{s+=$2} END {print s}'"
};
{code}
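Why the original command miscounts can be shown without shelling out at all. The sketch below reproduces the two pipelines' logic in plain Java: the log lines and exclusion patterns are made up for illustration, and the stream filters stand in for grep.

```java
import java.util.List;

// In-process analogue of the two pipelines. 'grep -c PATTERN file | grep -v X'
// takes the count FIRST, so the exclusions never see individual log lines;
// the fixed pipeline filters the lines and only counts at the end.
class ExclusionCountSketch {
    // analogous to the broken command: count matches, then (uselessly) exclude
    static long countBeforeExcluding(List<String> lines, String pattern) {
        return lines.stream().filter(l -> l.contains(pattern)).count();
    }

    // analogous to the fixed command: drop known exceptions, then count
    static long excludeThenCount(List<String> lines, String pattern,
                                 List<String> excludes) {
        return lines.stream()
                    .filter(l -> l.contains(pattern))
                    .filter(l -> excludes.stream().noneMatch(l::contains))
                    .count();
    }
}
```

With two ERROR lines of which one mentions a known (excluded) exception, the first method reports 2 while the second reports 1, which is exactly the mismatch the tests were tripping over.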
[jira] Created: (MAPREDUCE-1794) Test the job status of lost task trackers before and after the timeout.
Test the job status of lost task trackers before and after the timeout. --- Key: MAPREDUCE-1794 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1794 Project: Hadoop Map/Reduce Issue Type: Task Components: test Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota This test covers the following scenarios. 1. Verify whether the job succeeds when a task tracker is lost and becomes alive again before the timeout. 2. Verify whether the job succeeds and whether the killed attempts match the expected count when the task trackers are lost and time out for all four attempts of a task.
[jira] Updated: (MAPREDUCE-1794) Test the job status of lost task trackers before and after the timeout.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1794: - Attachment: 1794_lost_tasktracker.patch Please review the patch and give me your comments.
[jira] Commented: (MAPREDUCE-1779) Should we provide a way to know JobTracker's memory info from client?
[ https://issues.apache.org/jira/browse/MAPREDUCE-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868269#action_12868269 ] Arun C Murthy commented on MAPREDUCE-1779: -- bq. It is also needed to monitor the health of the JT. This is already available in JVM Metrics. bq. Hive actually throttles new job-submissions if the heap memory on the JT exceeds a certain threshold. Again, I'd like to reiterate that this 'feature' causes *serious* performance issues on the JobTracker - JNI calls are *very* expensive, and a rogue client using this feature can easily cause *severe* harm to the JobTracker and hence the entire cluster. Hive can get the same functionality from JVM Metrics, given that the data is already available there. Thus, I'm -1 on this feature.
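For context on the JVM Metrics alternative: heap figures of the kind ClusterStatus exposed come from java.lang.management, which is what a daemon's own metrics draw on. A minimal sketch of reading those numbers in-process follows; note it is illustrative only and reads the current JVM, not a remote JobTracker (a client wanting the JT's figures would consume the JT's published metrics, not the job-submission RPC).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Read the current JVM's own heap usage via the platform MemoryMXBean --
// the same kind of used/committed/max figures a daemon's JVM metrics expose.
class HeapInfoSketch {
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }
}
```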
[jira] Commented: (MAPREDUCE-1779) Should we provide a way to know JobTracker's memory info from client?
[ https://issues.apache.org/jira/browse/MAPREDUCE-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868302#action_12868302 ] dhruba borthakur commented on MAPREDUCE-1779: - Sounds like a fine idea to me, if this data is already available via JVM Metrics.
[jira] Commented: (MAPREDUCE-1793) Exception exclusion functionality is not working correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868367#action_12868367 ] Konstantin Boudnik commented on MAPREDUCE-1793: --- It isn't good style to paste the content of a patch into the description field.
[jira] Resolved: (MAPREDUCE-1779) Should we provide a way to know JobTracker's memory info from client?
[ https://issues.apache.org/jira/browse/MAPREDUCE-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved MAPREDUCE-1779. -- Resolution: Won't Fix Thanks Dhruba. Closing as 'wontfix'.
[jira] Commented: (MAPREDUCE-115) Map tasks are receiving FileNotFound Exceptions for spill files on a regular basis and are getting killed
[ https://issues.apache.org/jira/browse/MAPREDUCE-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868472#action_12868472 ] geoff hendrey commented on MAPREDUCE-115: - Most of my mappers are dying with this error. I am using Hadoop 0.20.2. Any suggestions for a workaround?
2010-05-17 14:03:42,738 INFO org.apache.hadoop.mapred.Merger: Merging 22 sorted segments
2010-05-17 14:03:43,099 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.FileNotFoundException: /hive4/mapred/local/taskTracker/jobcache/job_201005141621_0137/attempt_201005141621_0137_m_00_0/output/spill15.out
    at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:167)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
    at org.apache.hadoop.mapred.Merger$Segment.init(Merger.java:205)
    at org.apache.hadoop.mapred.Merger$Segment.access$100(Merger.java:165)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:418)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1522)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Map tasks are receiving FileNotFound Exceptions for spill files on a regular basis and are getting killed - Key: MAPREDUCE-115 URL: https://issues.apache.org/jira/browse/MAPREDUCE-115 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Jothi Padmanabhan The following is the log -- Map tasks are unable to locate the spill files when they are doing the final merge (mergeParts).
java.io.FileNotFoundException: File /xxx/mapred-tt/mapred-local/taskTracker/jobcache/job_200808190959_0001/attempt_200808190959_0001_m_00_0/output/spill23.out does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
    at org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:682)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.getFileLength(ChecksumFileSystem.java:218)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.seek(ChecksumFileSystem.java:259)
    at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:37)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1102)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:769)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:255)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)
[jira] Created: (MAPREDUCE-1795) add error option if file-based record-readers fail to consume all input (e.g., concatenated gzip, bzip2)
add error option if file-based record-readers fail to consume all input (e.g., concatenated gzip, bzip2) Key: MAPREDUCE-1795 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1795 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Greg Roelofs Assignee: Ravi Gummadi When running MapReduce with concatenated gzip files as input, only the first part is read, which is confusing, to say the least. Concatenated gzip is described in http://www.gnu.org/software/gzip/manual/gzip.html#Advanced-usage and in http://www.ietf.org/rfc/rfc1952.txt. (See original report at http://www.nabble.com/Problem-with-Hadoop-and-concatenated-gzip-files-to21383097.html)
[jira] Updated: (MAPREDUCE-1795) add error option if file-based record-readers fail to consume all input (e.g., concatenated gzip, bzip2)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Roelofs updated MAPREDUCE-1795: Original Estimate: 336h Remaining Estimate: 336h Affects Version/s: 0.20.2 Description: When running MapReduce with concatenated gzip files as input, only the first part (member in gzip spec parlance, http://www.ietf.org/rfc/rfc1952.txt) is read; the remainder is silently ignored. As a first step toward fixing that, this issue will add a configurable option to throw an error in such cases. MAPREDUCE-469 is the tracker for the more complete fix/feature, whenever that occurs. was: When running MapReduce with concatenated gzip files as input only the first part is read, which is confusing, to say the least. Concatenated gzip is described in http://www.gnu.org/software/gzip/manual/gzip.html#Advanced-usage and in http://www.ietf.org/rfc/rfc1952.txt. (See original report at http://www.nabble.com/Problem-with-Hadoop-and-concatenated-gzip-files-to21383097.html)
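What "concatenated gzip" means can be reproduced in a few lines. The sketch below builds a two-member stream the way `gzip a; gzip b; cat a.gz b.gz` would, then decompresses the whole thing with java.util.zip, which in modern JDKs does continue past the first member; the issue tracked here is that the record reader stopped after the first member and silently dropped the rest.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Build and fully decompress a two-member (concatenated) gzip stream.
// Per RFC 1952, a decompressor should emit the concatenation of the
// decompressed data of all members.
class ConcatGzipSketch {
    static byte[] gzip(String s) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(s.getBytes("UTF-8"));
        }
        return bos.toByteArray();
    }

    static byte[] concat(byte[] a, byte[] b) {    // stands in for `cat a.gz b.gz`
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    static String gunzipAll(byte[] data) throws Exception {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return out.toString("UTF-8");
        }
    }
}
```

A reader with the bug described above would return only the first member's data; a spec-conforming one returns both parts.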
[jira] Assigned: (MAPREDUCE-1795) add error option if file-based record-readers fail to consume all input (e.g., concatenated gzip, bzip2)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Roelofs reassigned MAPREDUCE-1795: --- Assignee: Greg Roelofs
[jira] Updated: (MAPREDUCE-1795) add error option if file-based record-readers fail to consume all input (e.g., concatenated gzip, bzip2)
[ https://issues.apache.org/jira/browse/MAPREDUCE-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Roelofs updated MAPREDUCE-1795: Original Estimate: (was: 336h) Remaining Estimate: (was: 336h) Assignee: (was: Ravi Gummadi) Affects Version/s: (was: 0.20.2)
[jira] Created: (MAPREDUCE-1796) job tracker history viewer shows all recent jobs as being run at job tracker (re)start time
job tracker history viewer shows all recent jobs as being run at job tracker (re)start time --- Key: MAPREDUCE-1796 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1796 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Affects Versions: 0.20.2 Reporter: Ted Yu Fix For: 0.20.3 The history viewer has long behaved this way: it shows the timestamp at which the JobTracker restarted rather than the job's start time.
[jira] Resolved: (MAPREDUCE-1796) job tracker history viewer shows all recent jobs as being run at job tracker (re)start time
[ https://issues.apache.org/jira/browse/MAPREDUCE-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu resolved MAPREDUCE-1796. Fix Version/s: (was: 0.20.3) Resolution: Duplicate Duplicate of MAPREDUCE-1541
[jira] Commented: (MAPREDUCE-1793) Exception exclusion functionality is not working correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868526#action_12868526 ] Vinay Kumar Thota commented on MAPREDUCE-1793: -- It's my suggestion about where exactly the code needs to change to resolve the issue; that is why I mentioned that part of the code in the description field. The pattern should always be ERROR, WARN, or FATAL, and we need to fetch the exceptions matching that pattern from the file. Once we have the matches, we need to exclude the known exceptions from the output list and then count the remaining exceptions. For example, with my suggestion the shell command is generated like this:
{code}
grep ERROR logfiles* | grep -v IOException | grep -vc java.net.ConnectException | awk -F: '{s+=$2} END {print s}'
{code}
Here {{filePattern}} is logfiles* and {{pattern}} is ERROR. I would say my suggestion is correct and there is nothing faulty in it.
[jira] Commented: (MAPREDUCE-1793) Exception exclusion functionality is not working correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12868535#action_12868535 ] Konstantin Boudnik commented on MAPREDUCE-1793: --- Oops, you are right. I had misread the proposed fix. Sorry. I guess the absence of a patch is to blame ;-)