[jira] Commented: (MAPREDUCE-323) Improve the way job history files are managed
[ https://issues.apache.org/jira/browse/MAPREDUCE-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891928#action_12891928 ] Dick King commented on MAPREDUCE-323:
-

A PROPOSAL

Introduction

The way the completed job history file system now works is that when a job is started, the job tracker creates an empty history file. The name of the file contains enough information about the job to let an application tell whether the file documents a job that satisfies a search criterion. In particular, it includes the job tracker instance ID, the job ID, the user name, and the job name. As the job progresses, records get added to the file, and when the job finishes [either successfully or failed] the file is moved to another directory, the completed job history files directory [the "DONE directory"]. Currently this directory has a simple flat structure. If an application [in particular, the job history browser] wants some job histories, it reads this directory and chooses the files whose names indicate that they will meet the criteria. In practical cases this can include hundreds of thousands or even a million files. Note that each job is represented by two files, the history file and the config file, doubling the burden on the name node.

Proposal

I would like to implement a simple database to solve this problem. My proposal has the following features:

1: The DONE directory will contain subdirectories, each containing a few hundred or a thousand files.

2: At any time, the job tracker will be filling one of the DONE directory's subdirectories. All the rest are closed out, never to be added to again.

3: The subdirectories have a naming scheme so they're created in lexicographical order. We would like to use subdirectory names like 2010-07-23--, etc [the four digits are a serial number, not an HHMM field]. 
4: When the job tracker decides to bind off a subdirectory and start a new one, it creates a new index file in the subdirectory it's closing out. That index is a simple list of the history files the directory contains.

4a: The job tracker starts a new subdirectory whenever the first history file is copied on a given day, and whenever the current subdirectory would otherwise contain more than a certain number of files.

4b: Perhaps the files can be renamed? These files' names are a few dozen characters each, and in a system that has run a half million jobs the names collectively occupy 100+ megabytes in the name node. Significant, but not decisive.

4b1: 4b would require that rumen understand indices.

5: The processing is:

5a: [optional] create a new short name for every file in the subdirectory that's being closed out

5a1: The job tracker keeps this information in memory. It doesn't need to read the directory.

5b: Write out the index file in a temporary location {{temp-index}} within the directory it's indexing.

5b1: The index contains all of the names in text form [if 5a is not used], or all pairs of { long name, short name } in text form if we are shortening the names.

5c: rename the temp-index file to {{index}} when it's done

5d: [optional] If we chose file renaming, delete all of the long names.

6: When doing a search, we:

6a: determine all subdirectories of the DONE directory

6b: see which ones have an index

6c: read each index that exists, and

6d: read all of the files, for the subdirectories that don't have indices yet.

7: To aid retirement of old job history files, the job tracker always binds off the current subdirectory when the date changes, even if it doesn't have very many files, and we retire files on date boundaries, a subdirectory at a time. The relevant date is the date the file is moved, which is normally a short time after the job is completed. 
8: [optional] We may want to consolidate the indices of a completed day in a per-day index written as a file directly under the DONE directory.

> Improve the way job history files are managed > - > > Key: MAPREDUCE-323 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-323 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Affects Versions: 0.21.0, 0.22.0 >Reporter: Amar Kamat >Assignee: Dick King >Priority: Critical > > Today all the jobhistory files are dumped in one _job-history_ folder. This > can cause problems when there is a need to search the history folder > (job-recovery etc). It would be nice if we group all the jobs under a _user_ > folder. So all the jobs for user _amar_ will go in _history-folder/amar/_. > Jobs can be categorized using various features like _jobid, date, jobname_ >
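The date-plus-serial naming scheme in item 3 could be sketched roughly as below. This is an illustrative assumption (the comment does not spell out the exact format of the four-digit serial, and the class and method names here are hypothetical):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of the proposed DONE-subdirectory naming: a date plus a
// zero-padded serial number, so names sort lexicographically in creation order.
public class DoneSubdirNaming {
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // e.g. subdirName for 2010-07-23, serial 0 -> "2010-07-23-0000"
    static String subdirName(LocalDate day, int serial) {
        return day.format(DAY) + "-" + String.format("%04d", serial);
    }

    public static void main(String[] args) {
        String a = subdirName(LocalDate.of(2010, 7, 23), 0);
        String b = subdirName(LocalDate.of(2010, 7, 23), 1);
        String c = subdirName(LocalDate.of(2010, 7, 24), 0);
        // Lexicographic order matches creation order, which is what lets the
        // job tracker retire whole subdirectories on date boundaries (item 7).
        System.out.println(a.compareTo(b) < 0 && b.compareTo(c) < 0); // prints true
    }
}
```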
[jira] Assigned: (MAPREDUCE-1966) Fix tracker blacklisting
[ https://issues.apache.org/jira/browse/MAPREDUCE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Roelofs reassigned MAPREDUCE-1966: --- Assignee: Greg Roelofs > Fix tracker blacklisting > - > > Key: MAPREDUCE-1966 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1966 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Reporter: Arun C Murthy >Assignee: Greg Roelofs > > The current heuristic of rolling up fixed number of job failures per tracker > isn't working well, we need better design/heuristics. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1966) Fix tracker blacklisting
[ https://issues.apache.org/jira/browse/MAPREDUCE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891905#action_12891905 ] Greg Roelofs commented on MAPREDUCE-1966: - There's an ambiguity between sick nodes (typically due to failing hardware, either hard drive or memory or occasionally NIC/network switch) and nodes that have been rendered unresponsive due to user abuse. The existing blacklist heuristics touch on this, but they're a bit ad hoc, and there's not much visibility on the internal state at any given time. One improvement would be to track the per-node, per-job blacklisting history in a sliding window that's divided into buckets of some suitable granularity. Bad hardware would tend to show up as an elevated fault level on one node (or a few nodes) for an extended period--i.e., multiple buckets--while abusive jobs would tend to show up as a spike (ideally) or at least a limited-duration jump in faults (one or a few buckets) across many nodes. Because the heuristics are open to argument even among experts (which would not include me), and because automatic, hardcoded blacklisting has the potential to wipe out a good fraction of a cluster for the wrong reasons, it would seem best to convert the heuristic form of blacklisting to an advisory mode (i.e., "graylisting") until the behavior is better understood. > Fix tracker blacklisting > - > > Key: MAPREDUCE-1966 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1966 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobtracker >Reporter: Arun C Murthy > > The current heuristic of rolling up fixed number of job failures per tracker > isn't working well, we need better design/heuristics. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
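The bucketed sliding window Roelofs describes could look roughly like the sketch below. This is a hedged illustration, not JobTracker code; the class name, bucket granularity, and threshold are all assumptions:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch (not actual Hadoop code) of per-node fault tracking in a
// sliding window of fixed-size time buckets. Bad hardware tends to show as
// elevated counts across many consecutive buckets on one node; an abusive job
// as a spike in one or a few buckets across many nodes.
public class FaultWindow {
    private final int maxBuckets;                         // window length in buckets
    private final Deque<Integer> closed = new ArrayDeque<>();
    private int current = 0;                              // faults in the open bucket

    public FaultWindow(int maxBuckets) { this.maxBuckets = maxBuckets; }

    public void recordFault() { current++; }

    // Called at each bucket boundary (e.g. hourly); the oldest bucket expires.
    public void rollBucket() {
        closed.addLast(current);
        current = 0;
        if (closed.size() > maxBuckets) closed.removeFirst();
    }

    public int totalFaults() {
        int sum = current;
        for (int b : closed) sum += b;
        return sum;
    }

    // Advisory "graylisting": flag the node for operators to inspect rather
    // than automatically removing it from the cluster.
    public boolean graylisted(int threshold) { return totalFaults() >= threshold; }

    public static void main(String[] args) {
        FaultWindow w = new FaultWindow(24);
        w.recordFault();
        w.rollBucket();
        System.out.println(w.graylisted(1)); // prints true
    }
}
```

Keeping the mode advisory matches the concern above: a hardcoded threshold applied automatically could wipe out a good fraction of a cluster for the wrong reasons.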
[jira] Created: (MAPREDUCE-1967) When a reducer fails on DFS quota, the job should fail immediately
When a reducer fails on DFS quota, the job should fail immediately -- Key: MAPREDUCE-1967 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1967 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Dick King Suppose an M/R job has so much output that the user is certain to exceed their quota. Then some of the reducers will succeed, but the job will get into a state where the remaining reducers squabble over the remaining space. The remaining reducers will nibble at the remaining space, and finally one reducer will fail on quota. Its output file will be erased, and the other reducers will collectively consume that space until one of _them_ fails on quota. Since the incomplete reducer that fails on quota is "chosen" randomly, the tasks will accumulate their failures at similar rates, and the system will have made a substantial futile investment. I would like to say that if a single reducer fails on DFS quota, the job should be failed. There may be a corner case that induces us to think we shouldn't be quite this stringent, but at least we shouldn't have to await four failures by one task before shutting the job down. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
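The proposal amounts to treating a quota-exceeded reduce failure as fatal to the job rather than as an ordinary retryable task failure. A minimal sketch of that decision, with hypothetical names (this is not proposed patch code):

```java
// Hypothetical sketch of the proposed policy: a DFS-quota failure fails the
// whole job immediately instead of counting toward the usual per-task retry
// limit (typically four attempts).
public class QuotaFailFast {
    enum TaskOutcome { SUCCESS, GENERIC_FAILURE, DFS_QUOTA_EXCEEDED }

    // Generic failures still go through normal retries; quota failures do not,
    // since retrying only lets the remaining reducers squabble over the same
    // nearly exhausted space.
    static boolean shouldFailJobNow(TaskOutcome outcome) {
        return outcome == TaskOutcome.DFS_QUOTA_EXCEEDED;
    }

    public static void main(String[] args) {
        System.out.println(shouldFailJobNow(TaskOutcome.GENERIC_FAILURE));    // prints false
        System.out.println(shouldFailJobNow(TaskOutcome.DFS_QUOTA_EXCEEDED)); // prints true
    }
}
```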
[jira] Updated: (MAPREDUCE-1961) [gridmix3] ConcurrentModificationException when shutting down Gridmix
[ https://issues.apache.org/jira/browse/MAPREDUCE-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1961: - Status: Patch Available (was: Open) > [gridmix3] ConcurrentModificationException when shutting down Gridmix > - > > Key: MAPREDUCE-1961 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1961 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Hong Tang >Assignee: Hong Tang > Attachments: mr-1961-20100723.patch > > > We observed the following exception occasionally at the end of the Gridmix > run: > {code} > Exception in thread "StatsCollectorThread" > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > org.apache.hadoop.mapred.gridmix.Statistics$StatCollector.updateAndNotifyClusterStatsListeners(Statistics.java:220) > at > org.apache.hadoop.mapred.gridmix.Statistics$StatCollector.run(Statistics.java:205) > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1961) [gridmix3] ConcurrentModificationException when shutting down Gridmix
[ https://issues.apache.org/jira/browse/MAPREDUCE-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Tang updated MAPREDUCE-1961: - Attachment: mr-1961-20100723.patch Trivial patch that uses CopyOnWriteArrayList to avoid concurrent modification. No unit test included as it is hard to reproduce the synchronization bug through unit tests. > [gridmix3] ConcurrentModificationException when shutting down Gridmix > - > > Key: MAPREDUCE-1961 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1961 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Hong Tang >Assignee: Hong Tang > Attachments: mr-1961-20100723.patch > > > We observed the following exception occasionally at the end of the Gridmix > run: > {code} > Exception in thread "StatsCollectorThread" > java.util.ConcurrentModificationException > at > java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) > at java.util.AbstractList$Itr.next(AbstractList.java:343) > at > org.apache.hadoop.mapred.gridmix.Statistics$StatCollector.updateAndNotifyClusterStatsListeners(Statistics.java:220) > at > org.apache.hadoop.mapred.gridmix.Statistics$StatCollector.run(Statistics.java:205) > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
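The fix works because CopyOnWriteArrayList iterators walk an immutable snapshot, so a listener list mutated from another thread never throws ConcurrentModificationException. A minimal standalone illustration of that property (not the Gridmix code itself; the listener names are made up):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Demonstrates the property the patch relies on: a CopyOnWriteArrayList may be
// modified while it is being iterated, because each iterator operates on the
// snapshot taken when iteration began. The same loop over a plain ArrayList
// would throw ConcurrentModificationException.
public class ListenerSnapshotDemo {
    static int addDuringIteration() {
        List<String> listeners = new CopyOnWriteArrayList<>();
        listeners.add("clusterStatsListener");
        for (String l : listeners) {
            // Safe: the running iterator does not see this addition.
            listeners.add("late-" + l);
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        System.out.println(addDuringIteration()); // prints 2
    }
}
```

The trade-off is that every mutation copies the backing array, which suits listener lists that are read often and modified rarely, as in Statistics here.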
[jira] Updated: (MAPREDUCE-1718) job conf key for the services name of DelegationToken for HFTP url is constructed incorrectly in HFTPFileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated MAPREDUCE-1718: --- Status: Resolved (was: Patch Available) Fix Version/s: 0.22.0 Resolution: Fixed I just committed this. Thanks, Boris! > job conf key for the services name of DelegationToken for HFTP url is > constructed incorrectly in HFTPFileSystem > --- > > Key: MAPREDUCE-1718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1718 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1718-2.patch, MAPREDUCE-1718-3.patch, > MAPREDUCE-1718-4.patch, MAPREDUCE-1718-4.patch, MAPREDUCE-1718-BP20-1.patch, > MAPREDUCE-1718-BP20-2.patch > > > the key (built in TokenCache) is hdfs.service.host_HOSTNAME.PORT, but > in HftpFileSystem it is sometimes built as hdfs.service.host_IP.PORT. > Fix: change it to always be IP. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
[ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891836#action_12891836 ] Junjie Liang commented on MAPREDUCE-1901: - To supplement Joydeep's comment: We are trying to reduce the number of calls to the NameNode through the following optimizations: 1) Currently, files loaded through the hadoop libjars/files/archives mechanism are copied onto HDFS and removed on every job. This is inefficient if most jobs are submitted from only 3-4 versions of hive, because rightfully the files should persist in HDFS to be reused. Hence the idea of decoupling files from their jobId to make them sharable across jobs. 2) If files are identified by their md5 checksums, we no longer need to verify file modification time in the TT. This saves another call to the NameNode to get the FileStatus object. The reduction in the number of calls to the NameNode is small per job, but over a large number of jobs we believe it will make a noticeable difference. > Jobs should not submit the same jar files over and over again > - > > Key: MAPREDUCE-1901 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > Attachments: 1901.PATCH > > > Currently each Hadoop job uploads the required resources > (jars/files/archives) to a new location in HDFS. Map-reduce nodes involved in > executing this job would then download these resources into local disk. > In an environment where most of the users are using a standard set of jars > and files (because they are using a framework like Hive/Pig) - the same jars > keep getting uploaded and downloaded repeatedly. 
The overhead of this > protocol (primarily in terms of end-user latency) is significant when: > - the jobs are small (and conversely - large in number) > - Namenode is under load (meaning hdfs latencies are high and made worse, in > part, by this protocol) > Hadoop should provide a way for jobs in a cooperative environment to not > submit the same files over and over again. Identifying and caching execution > resources by a content signature (md5/sha) would be a good alternative to > have available. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1966) Fix tracker blacklisting
Fix tracker blacklisting - Key: MAPREDUCE-1966 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1966 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker Reporter: Arun C Murthy The current heuristic of rolling up a fixed number of job failures per tracker isn't working well; we need better design/heuristics. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1718) job conf key for the services name of DelegationToken for HFTP url is constructed incorrectly in HFTPFileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik updated MAPREDUCE-1718: -- Attachment: MAPREDUCE-1718-4.patch Merged with trunk. Ran tests; all passed (except TestRumenJobTraces - see MAPREDUCE-1925). > job conf key for the services name of DelegationToken for HFTP url is > constructed incorrectly in HFTPFileSystem > --- > > Key: MAPREDUCE-1718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1718 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: MAPREDUCE-1718-2.patch, MAPREDUCE-1718-3.patch, > MAPREDUCE-1718-4.patch, MAPREDUCE-1718-4.patch, MAPREDUCE-1718-BP20-1.patch, > MAPREDUCE-1718-BP20-2.patch > > > the key (built in TokenCache) is hdfs.service.host_HOSTNAME.PORT, but > in HftpFileSystem it is sometimes built as hdfs.service.host_IP.PORT. > Fix: change it to always be IP. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
[ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891824#action_12891824 ] Joydeep Sen Sarma commented on MAPREDUCE-1901: -- > The DistributedCache already tracks mtimes for files That's what I am saying: if you consider objects as immutable, then you don't have to track and look up mtimes. Part of the goal here is to not have to look up mtimes again and again. If you have an object with a matching md5 already localized, you are done. (But we can't use the names alone for that: names can collide; md5 cannot, or nearly so. So we name objects based on their content signature (md5), which is what a content-addressable store/cache does.) > Admin installs pig/hive on hdfs: > /share/hive/v1/hive.jar > /share/hive/v2/hive.jar That's not how hive works (or how hadoop streaming works). People deploy hive on NFS filers or local disks. Users run hive jobs from these installation points. There's no hdfs involvement anywhere. People add jars to hive or hadoop streaming from their personal or shared folders. When people run hive jobs, they are not writing java; there's no .setRemoteJar() code they are writing. Hive loads the required jars (from the install directory) to hadoop via the hadoop libjars/files/archives functionality. Different hive clients are not aware of each other (ditto for hadoop streaming). Most of the hive clients are running from common install points, but people may be running from personal install points with altered builds. With what we have done in this patch, all these uncoordinated clients automatically share jars with each other, because the name for the shared object now is derived from the content of the object. We are still leveraging the distributed cache, but we are naming objects based on their contents. Junjie tells me we can leverage the 'shared' objects namespace from trunk (in 20 we added our own shared namespace). 
Because the names are based on a strong content signature, we can make the assumption of immutability. As I have tried to point out many times, when objects are immutable one can make optimizations and skip timestamp-based validation; the latter requires hdfs lookups and creates load and latency. Note that we need zero application changes for this sharing and zero admin overhead, so all sorts of hadoop users will automatically start getting the benefit of shared jars without writing any code and without any special admin recipe. Isn't that good? > Jobs should not submit the same jar files over and over again > - > > Key: MAPREDUCE-1901 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > Attachments: 1901.PATCH > > > Currently each Hadoop job uploads the required resources > (jars/files/archives) to a new location in HDFS. Map-reduce nodes involved in > executing this job would then download these resources into local disk. > In an environment where most of the users are using a standard set of jars > and files (because they are using a framework like Hive/Pig) - the same jars > keep getting uploaded and downloaded repeatedly. The overhead of this > protocol (primarily in terms of end-user latency) is significant when: > - the jobs are small (and conversely - large in number) > - Namenode is under load (meaning hdfs latencies are high and made worse, in > part, by this protocol) > Hadoop should provide a way for jobs in a cooperative environment to not > submit the same files over and over again. Identifying and caching execution > resources by a content signature (md5/sha) would be a good alternative to > have available. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
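The content-addressing scheme discussed in this thread hinges on deriving the cache name from the file's bytes. A hedged sketch of the idea follows; it is illustrative only, with hypothetical names, and is not the patch's code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of content-addressed caching: key a resource by the MD5 of its
// bytes, so identical jars uploaded by uncoordinated clients map to the same
// immutable cache entry, and no mtime re-validation is ever needed.
public class ContentKey {
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required JDK algorithm, so this cannot happen.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        byte[] jarFromClientA = "identical jar bytes".getBytes(StandardCharsets.UTF_8);
        byte[] jarFromClientB = "identical jar bytes".getBytes(StandardCharsets.UTF_8);
        // Same content => same cache key, regardless of path, name, or mtime.
        System.out.println(md5Hex(jarFromClientA).equals(md5Hex(jarFromClientB))); // prints true
    }
}
```

Because the key is a function of the content, immutability of a cache entry holds by construction: a changed jar produces a different key rather than invalidating an existing one.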
[jira] Updated: (MAPREDUCE-1925) TestRumenJobTraces fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated MAPREDUCE-1925: - Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Resolution: Fixed +1 I committed this. Thanks, Ravi! > TestRumenJobTraces fails in trunk > - > > Key: MAPREDUCE-1925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amareshwari Sriramadasu >Assignee: Ravi Gummadi > Fix For: 0.22.0 > > Attachments: 1925.patch, 1925.v1.1.patch, 1925.v1.patch, > 1925.v2.1.patch, 1925.v2.patch > > > TestRumenJobTraces failed with following error: > Error Message > the gold file contains more text at line 1 expected:<56> but was:<0> > Stacktrace > at > org.apache.hadoop.tools.rumen.TestRumenJobTraces.testHadoop20JHParser(TestRumenJobTraces.java:294) > Full log of the failure is available at > http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/292/testReport/org.apache.hadoop.tools.rumen/TestRumenJobTraces/testHadoop20JHParser/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
[ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891812#action_12891812 ] Arun C Murthy commented on MAPREDUCE-1901: -- Joydeep - Maybe we are talking past each other, yet ... The DistributedCache _already_ tracks mtimes for files. Each TT, via the DistributedCache, localizes the file based on . This seems sufficient for the use case as I understand it. Here is the flow: Admin installs pig/hive on hdfs: /share/hive/v1/hive.jar /share/hive/v2/hive.jar The pig/hive framework, in fact any MR job, then does: JobConf job = new JobConf(); job.setRemoteJar(new Path("/share/hive/v1/hive.jar")); JobClient.runJob(job); That's it. The JobClient has the smarts to use DistributedCache.addArchiveToClassPath as the implementation of JobConf.setRemoteJar. If you want a new version of hive.jar, you change hive to use /share/hive/v2/hive.jar. What am I missing here? > Jobs should not submit the same jar files over and over again > - > > Key: MAPREDUCE-1901 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > Attachments: 1901.PATCH > > > Currently each Hadoop job uploads the required resources > (jars/files/archives) to a new location in HDFS. Map-reduce nodes involved in > executing this job would then download these resources into local disk. > In an environment where most of the users are using a standard set of jars > and files (because they are using a framework like Hive/Pig) - the same jars > keep getting uploaded and downloaded repeatedly. 
The overhead of this > protocol (primarily in terms of end-user latency) is significant when: > - the jobs are small (and conversely - large in number) > - Namenode is under load (meaning hdfs latencies are high and made worse, in > part, by this protocol) > Hadoop should provide a way for jobs in a cooperative environment to not > submit the same files over and over again. Identifying and caching execution > resources by a content signature (md5/sha) would be a good alternative to > have available. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1965) Add info for job failure on jobtracker UI.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated MAPREDUCE-1965: - Fix Version/s: 0.22.0 > Add info for job failure on jobtracker UI. > -- > > Key: MAPREDUCE-1965 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1965 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Mahadev konar >Assignee: Mahadev konar > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1965-yahoo-hadoop-0.20S.patch > > > MAPREDUCE-1521 added a field to jobstatus to mark the reason for failure of the > job. This information needs to be displayed on the jobtracker UI. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1965) Add info for job failure on jobtracker UI.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated MAPREDUCE-1965: - Attachment: MAPREDUCE-1965-yahoo-hadoop-0.20S.patch This patch adds the failure info to the jobtracker UI. > Add info for job failure on jobtracker UI. > -- > > Key: MAPREDUCE-1965 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1965 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Mahadev konar >Assignee: Mahadev konar > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1965-yahoo-hadoop-0.20S.patch > > > MAPREDUCE-1521 added a field to jobstatus to mark the reason for failure of the > job. This information needs to be displayed on the jobtracker UI. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1965) Add info for job failure on jobtracker UI.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated MAPREDUCE-1965: - Attachment: MAPREDUCE-1965-yahoo-hadoop-0.20S.patch forgot --no-prefix. > Add info for job failure on jobtracker UI. > -- > > Key: MAPREDUCE-1965 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1965 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Mahadev konar >Assignee: Mahadev konar > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1965-yahoo-hadoop-0.20S.patch, > MAPREDUCE-1965-yahoo-hadoop-0.20S.patch > > > MAPREDUCE-1521 added a field to jobstatus to mark the reason for failure of the > job. This information needs to be displayed on the jobtracker UI. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1965) Add info for job failure on jobtracker UI.
Add info for job failure on jobtracker UI. -- Key: MAPREDUCE-1965 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1965 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Mahadev konar Assignee: Mahadev konar Attachments: MAPREDUCE-1965-yahoo-hadoop-0.20S.patch MAPREDUCE-1521 added a field to jobstatus to mark the reason for failure of the job. This information needs to be displayed on the jobtracker UI. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1718) job conf key for the services name of DelegationToken for HFTP url is constructed incorrectly in HFTPFileSystem
[ https://issues.apache.org/jira/browse/MAPREDUCE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik updated MAPREDUCE-1718: -- Attachment: MAPREDUCE-1718-4.patch Merged with the trunk. Modified the test to verify the value set in conf. > job conf key for the services name of DelegationToken for HFTP url is > constructed incorrectly in HFTPFileSystem > --- > > Key: MAPREDUCE-1718 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1718 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: MAPREDUCE-1718-2.patch, MAPREDUCE-1718-3.patch, > MAPREDUCE-1718-4.patch, MAPREDUCE-1718-BP20-1.patch, > MAPREDUCE-1718-BP20-2.patch > > > the key (built in TokenCache) is hdfs.service.host_HOSTNAME.PORT, but > in HftpFileSystem it is sometimes built as hdfs.service.host_IP.PORT. > Fix: change it to always be IP. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1919) [Herriot] Test for verification of per cache file ref count.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891735#action_12891735 ] Konstantin Boudnik commented on MAPREDUCE-1919: --- Looks good. Please do the same for the trunk and validate through {{test-patch}} and by running the test in a cluster. > [Herriot] Test for verification of per cache file ref count. > - > > Key: MAPREDUCE-1919 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1919 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: test >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1919-ydist-security.patch, 1919-ydist-security.patch, > MAPREDUCE-1919.patch > > > It covers the following scenarios. > 1. Run the job with two distributed cache files and verify whether job is > succeeded or not. > 2. Run the job with distributed cache files and remove one cache file from > the DFS when it is localized.verify whether the job is failed or not. > 3. Run the job with two distribute cache files and the size of one file > should be larger than local.cache.size.Verify whether job is succeeded or > not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1955) Because of changes in JobInProgress.java, JobInProgressAspect.aj also needs to change.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891732#action_12891732 ] Konstantin Boudnik commented on MAPREDUCE-1955: --- I'm looking into the MR trunk code and I see that {{protected AtomicBoolean tasksInited = new AtomicBoolean(false);}} Besides, there's {{public boolean inited()}} to access it. So, I don't see how this patch makes any sense for trunk? > Because of changes in JobInProgress.java, JobInProgressAspect.aj also needs > to change. > -- > > Key: MAPREDUCE-1955 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1955 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Reporter: Iyappan Srinivasan >Assignee: Iyappan Srinivasan > Attachments: 1955-ydist-security-patch.txt, > JobInProgressAspectaj.patch, MAPREDUCE-1955.patch > > > Because of changes in JobInProgress.java, JobInProgressAspect.aj also needs > to change. > A variable taskInited was changed from Boolean to boolean in > JobInProgress.java, so JobInProgressAspect.aj needs to change as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1809) Ant build changes for Streaming system tests in contrib projects.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891726#action_12891726 ] Konstantin Boudnik commented on MAPREDUCE-1809: --- looks Ok, please verify as usual > Ant build changes for Streaming system tests in contrib projects. > - > > Key: MAPREDUCE-1809 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1809 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: build >Affects Versions: 0.21.0 >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1809-ydist-security.patch, 1809-ydist-security.patch, > MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, > MAPREDUCE-1809.patch > > > Implementing new target( test-system) in build-contrib.xml file for executing > the system test that are in contrib projects. Also adding 'subant' target in > aop.xml that calls the build-contrib.xml file for system tests. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1933) Create automated testcase for tasktracker dealing with corrupted disk.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891725#action_12891725 ] Konstantin Boudnik commented on MAPREDUCE-1933: --- Let's see... - {{find src -name '*.java' | xargs grep 'mapred.*.local.dir'}} shows that {noformat} src/java/org/apache/hadoop/mapreduce/util/ConfigUtil.java: Configuration.addDeprecation("mapred.local.dir", {noformat} - also it finds {noformat} src/java/org/apache/hadoop/mapreduce/MRConfig.java: public static final String LOCAL_DIR = "mapreduce.cluster.local.dir"; {noformat} Hope it helps. > Create automated testcase for tasktracker dealing with corrupted disk. > -- > > Key: MAPREDUCE-1933 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1933 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Reporter: Iyappan Srinivasan >Assignee: Iyappan Srinivasan > Attachments: 1933-ydist-security-patch.txt, MAPREDUCE-1933.patch, > MAPREDUCE-1933.patch, TestCorruptedDiskJob.java > > > After the TaskTracker has already run some tasks successfully, "corrupt" a > disk by making the corresponding mapred.local.dir unreadable/unwritable. > Make sure that jobs continue to succeed even though some tasks scheduled > there fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
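The deprecation Konstantin points at (ConfigUtil translating the old mapred.local.dir key to mapreduce.cluster.local.dir) can be sketched roughly as follows. This is an illustrative Python model of a deprecation table, not Hadoop's actual ConfigUtil API:

```python
# Minimal sketch of a config-key deprecation table, loosely modeled on what
# ConfigUtil.addDeprecation does in Hadoop (simplified; not the real API).
DEPRECATED_KEYS = {
    "mapred.local.dir": "mapreduce.cluster.local.dir",
}

def resolve_key(key):
    """Translate a deprecated config key to its current name, if any."""
    return DEPRECATED_KEYS.get(key, key)

def get_conf(conf, key):
    """Look up a value in a dict-like config, honoring old and new names."""
    canonical = resolve_key(key)
    # Prefer an explicitly set canonical key; fall back to the old name.
    if canonical in conf:
        return conf[canonical]
    return conf.get(key)
```

A lookup under either name then resolves to the same setting, which is what lets old job configurations keep working after a rename.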
[jira] Updated: (MAPREDUCE-1960) Limit the size of jobconf.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated MAPREDUCE-1960: - Attachment: MAPREDUCE-1960-yahoo-hadoop-0.20S.patch changed the default limit to 5MB. > Limit the size of jobconf. > -- > > Key: MAPREDUCE-1960 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1960 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Mahadev konar >Assignee: Mahadev konar > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1960-yahoo-hadoop-0.20S.patch, > MAPREDUCE-1960-yahoo-hadoop-0.20S.patch, > MAPREDUCE-1960-yahoo-hadoop-0.20S.patch > > > In some of our production clusters, users have huge job.xml's that bring down > the jobtracker. This jira is to put a limit on the size of the jobconf, so that > we don't blow up the memory on the jobtracker. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
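The kind of check this patch adds can be sketched as below. The 5MB default comes from the comment above; the class and function names are hypothetical, not the patch's actual code:

```python
# Sketch of a job-conf size check: reject submissions whose serialized
# job.xml exceeds a configured limit (default 5 MB, per the comment above),
# so an oversized configuration cannot blow up the tracker's memory.
DEFAULT_MAX_JOBCONF_BYTES = 5 * 1024 * 1024

class JobConfTooLargeError(Exception):
    """Raised when a submitted job.xml is over the configured limit."""

def check_jobconf_size(serialized_conf, limit=DEFAULT_MAX_JOBCONF_BYTES):
    """Return the size if acceptable, raise if over the limit."""
    size = len(serialized_conf)
    if size > limit:
        raise JobConfTooLargeError(
            "job.xml is %d bytes, exceeds limit of %d" % (size, limit))
    return size
```

Rejecting at submission time (rather than after loading the conf into the tracker) is what protects the daemon itself.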
[jira] Commented: (MAPREDUCE-1925) TestRumenJobTraces fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891687#action_12891687 ] Hong Tang commented on MAPREDUCE-1925: -- Patch looks good to me. +1. > TestRumenJobTraces fails in trunk > - > > Key: MAPREDUCE-1925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amareshwari Sriramadasu >Assignee: Ravi Gummadi > Fix For: 0.22.0 > > Attachments: 1925.patch, 1925.v1.1.patch, 1925.v1.patch, > 1925.v2.1.patch, 1925.v2.patch > > > TestRumenJobTraces failed with following error: > Error Message > the gold file contains more text at line 1 expected:<56> but was:<0> > Stacktrace > at > org.apache.hadoop.tools.rumen.TestRumenJobTraces.testHadoop20JHParser(TestRumenJobTraces.java:294) > Full log of the failure is available at > http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/292/testReport/org.apache.hadoop.tools.rumen/TestRumenJobTraces/testHadoop20JHParser/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
[ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891674#action_12891674 ] Joydeep Sen Sarma commented on MAPREDUCE-1901: -- @Arun - you are right - this is a layer above distributed cache for the most part. Take a look at our use case (bottom of my previous comments). Essentially we are extending the Distributed Cache a bit to be a content addressable cache. I do not think our use case is directly supported by Hadoop for this purpose - and we are hoping to make the change in the framework (instead of Hive) because there's nothing Hive specific here and whatever we are doing will be directly leveraged by other apps. Sharing != Content addressable. An NFS filer can be globally shared - but it's not content addressable. An EMC Centera (amongst others) is. Sorry - terrible examples - trying to come up with something quickly. Will address Vinod's comments later - we have taken race considerations into account. > Jobs should not submit the same jar files over and over again > - > > Key: MAPREDUCE-1901 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > Attachments: 1901.PATCH > > > Currently each Hadoop job uploads the required resources > (jars/files/archives) to a new location in HDFS. Map-reduce nodes involved in > executing this job would then download these resources into local disk. > In an environment where most of the users are using a standard set of jars > and files (because they are using a framework like Hive/Pig) - the same jars > keep getting uploaded and downloaded repeatedly.
The overhead of this > protocol (primarily in terms of end-user latency) is significant when: > - the jobs are small (and conversely - large in number) > - Namenode is under load (meaning hdfs latencies are high and made worse, in > part, by this protocol) > Hadoop should provide a way for jobs in a cooperative environment to not > submit the same files over and over again. Identifying and caching execution > resources by a content signature (md5/sha) would be a good alternative to > have available. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
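The content-signature scheme the description proposes (identify an execution resource by a digest of its bytes, so identical jars are uploaded once) can be sketched like this. The cache-root path and helper names are illustrative, not from the attached patch:

```python
import hashlib

def content_signature(data):
    """md5 hex digest of a resource's bytes, used as its cache key."""
    return hashlib.md5(data).hexdigest()

def shared_cache_path(data, cache_root="/shared-cache"):
    """Content-addressed location for a resource: identical jars from
    different jobs map to the same path, so a jar already present in
    the cache never needs to be uploaded again."""
    return "%s/%s" % (cache_root, content_signature(data))
```

Before uploading, a client would compute the signature locally and skip the upload if the content-addressed path already exists in HDFS.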
[jira] Updated: (MAPREDUCE-1960) Limit the size of jobconf.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahadev konar updated MAPREDUCE-1960: - Attachment: MAPREDUCE-1960-yahoo-hadoop-0.20S.patch updated patch which throws an exception in jobinprogress rather than the jobtracker. > Limit the size of jobconf. > -- > > Key: MAPREDUCE-1960 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1960 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Mahadev konar >Assignee: Mahadev konar > Fix For: 0.22.0 > > Attachments: MAPREDUCE-1960-yahoo-hadoop-0.20S.patch, > MAPREDUCE-1960-yahoo-hadoop-0.20S.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1270) Hadoop C++ Extention
[ https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891662#action_12891662 ] Doug Cutting commented on MAPREDUCE-1270: - Looks like BSD: http://www.boost.org/LICENSE_1_0.txt So we'd just need to append it to LICENSE.txt, noting there which files are under this license. > Hadoop C++ Extention > > > Key: MAPREDUCE-1270 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: task >Affects Versions: 0.20.1 > Environment: hadoop linux >Reporter: Wang Shouyan > Attachments: HADOOP-HCE-1.0.0.patch, HCE InstallMenu.pdf, HCE > Performance Report.pdf, HCE Tutorial.pdf, Overall Design of Hadoop C++ > Extension.doc > > > Hadoop C++ extension is an internal project at Baidu. We started it for these > reasons: >1 To provide a C++ API. We mostly used Streaming before, and we also tried to > use PIPES, but we did not find PIPES more efficient than Streaming. So we > think a new C++ extension is needed for us. >2 Even using PIPES or Streaming, it is hard to control the memory of the hadoop > map/reduce Child JVM. >3 It costs so much to read/write/sort TB/PB data in Java. When using > PIPES or Streaming, a pipe or socket is not efficient enough to carry such huge data. >What we want to do: >1 We do not use the map/reduce Child JVM to do any data processing; it just > prepares the environment, starts the C++ mapper, tells the mapper which split it should > deal with, and reads reports from the mapper until it finishes. The mapper will > read records, invoke the user-defined map, do the partitioning, write spills, combine > and merge into file.out. We think these operations can be done by C++ code. >2 The reducer is similar to the mapper; it is started after the sort finishes, it > reads from sorted files, invokes the user-defined reduce, and writes to the user-defined > record writer. >3 We also intend to rewrite shuffle and sort in C++, for efficiency and > memory control.
>at first, 1 and 2, then 3. >What's the difference with PIPES: >1 Yes, We will reuse most PIPES code. >2 And, We should do it more completely, nothing changed in scheduling and > management, but everything in execution. > *UPDATE:* > Now you can get a test version of HCE from this link > http://docs.google.com/leaf?id=0B5xhnqH1558YZjcxZmI0NzEtODczMy00NmZiLWFkNjAtZGM1MjZkMmNkNWFk&hl=zh_CN&pli=1 > This is a full package with all hadoop source code. > Following document "HCE InstallMenu.pdf" in attachment, you will build and > deploy it in your cluster. > Attachment "HCE Tutorial.pdf" will lead you to write the first HCE program > and give other specifications of the interface. > Attachment "HCE Performance Report.pdf" gives a performance report of HCE > compared to Java MapRed and Pipes. > Any comments are welcomed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
[ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891640#action_12891640 ] Arun C Murthy commented on MAPREDUCE-1901: -- To reiterate: Pre-security - Artifacts in DistributedCache are _already_ shared across jobs, no changes needed. Post-security - MAPREDUCE-774 allows for a shared distributed cache across jobs too. > Jobs should not submit the same jar files over and over again > - > > Key: MAPREDUCE-1901 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > Attachments: 1901.PATCH -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
[ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891633#action_12891633 ] Arun C Murthy commented on MAPREDUCE-1901: -- bq. I'm proposing a change in the way files are stored in HDFS. Instead of storing files in /jobid/files or /jobid/archives, we store them directly in {mapred.system.dir}/files and {mapred.system.dir}/archives. This removes the association between a file and the job ID, so that files can be persistent across jobs. I'm confused here. The distributed-cache does not write any files to HDFS, it merely is configured with a set of files to be copied from HDFS to the compute node. Why are we making these changes? > Jobs should not submit the same jar files over and over again > - > > Key: MAPREDUCE-1901 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > Attachments: 1901.PATCH -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1270) Hadoop C++ Extention
[ https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891596#action_12891596 ] Allen Wittenauer commented on MAPREDUCE-1270: - This patch appears to contain code from the C++ Boost library. Someone needs to do the legwork to determine the legality of the patch. > Hadoop C++ Extention > > > Key: MAPREDUCE-1270 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: task >Affects Versions: 0.20.1 > Environment: hadoop linux >Reporter: Wang Shouyan > Attachments: HADOOP-HCE-1.0.0.patch, HCE InstallMenu.pdf, HCE > Performance Report.pdf, HCE Tutorial.pdf, Overall Design of Hadoop C++ > Extension.doc -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1962) [Herriot] IOException throws and it fails with token expired while running the tests.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1962: - Summary: [Herriot] IOException throws and it fails with token expired while running the tests. (was: IOException throws and it fails with token expired while running the tests.) > [Herriot] IOException throws and it fails with token expired while running > the tests. > - > > Key: MAPREDUCE-1962 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1962 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > > An IOException is thrown and the tests fail because the token has expired. I could see > this issue on a secure cluster. > This issue has been resolved by setting the following attribute in the > configuration before running the tests. > mapreduce.job.complete.cancel.delegation.tokens=false -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1809) Ant build changes for Streaming system tests in contrib projects.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1809: - Attachment: 1809-ydist-security.patch > Ant build changes for Streaming system tests in contrib projects. > - > > Key: MAPREDUCE-1809 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1809 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: build >Affects Versions: 0.21.0 >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1809-ydist-security.patch, 1809-ydist-security.patch, > MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, > MAPREDUCE-1809.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1925) TestRumenJobTraces fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891552#action_12891552 ] Ravi Gummadi commented on MAPREDUCE-1925: - Hudson seems to be not responding. I ran ant test and test-patch myself. All unit tests passed except the known failure of MAPREDUCE-1834. ant test-patch gave:
{noformat}
+1 overall.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
{noformat}
> TestRumenJobTraces fails in trunk > - > > Key: MAPREDUCE-1925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amareshwari Sriramadasu >Assignee: Ravi Gummadi > Fix For: 0.22.0 > > Attachments: 1925.patch, 1925.v1.1.patch, 1925.v1.patch, > 1925.v2.1.patch, 1925.v2.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1270) Hadoop C++ Extention
[ https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Yang updated MAPREDUCE-1270: - Attachment: HADOOP-HCE-1.0.0.patch HCE-1.0.0.patch for mapreduce trunk (revision 963075) > Hadoop C++ Extention > > > Key: MAPREDUCE-1270 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: task >Affects Versions: 0.20.1 > Environment: hadoop linux >Reporter: Wang Shouyan > Attachments: HADOOP-HCE-1.0.0.patch, HCE InstallMenu.pdf, HCE > Performance Report.pdf, HCE Tutorial.pdf, Overall Design of Hadoop C++ > Extension.doc -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1270) Hadoop C++ Extention
[ https://issues.apache.org/jira/browse/MAPREDUCE-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891544#action_12891544 ] Dong Yang commented on MAPREDUCE-1270: -- Here is a HADOOP-HCE-1.0.0.patch for mapreduce trunk (revision 963075), which includes Hadoop C++ Extension (short for HCE) changes to mapreduce-963075. The steps for using this patch are as follows: 1. Download HADOOP-HCE-1.0.0.patch 2. svn co -r 963075 http://svn.apache.org/repos/asf/hadoop/mapreduce/trunk trunk-963075; 3. cd trunk-963075; 4. patch -p0 < HADOOP-HCE-1.0.0.patch 5. sh build.sh (need java, forrest and ant) HCE includes Java and C++ code and depends on libhdfs, so build.sh first checks out hdfs trunk and builds it. > Hadoop C++ Extention > > > Key: MAPREDUCE-1270 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1270 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: task >Affects Versions: 0.20.1 > Environment: hadoop linux >Reporter: Wang Shouyan > Attachments: HCE InstallMenu.pdf, HCE Performance Report.pdf, HCE > Tutorial.pdf, Overall Design of Hadoop C++ Extension.doc -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1902) job jar file is not distributed via DistributedCache
[ https://issues.apache.org/jira/browse/MAPREDUCE-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891532#action_12891532 ] Vinod K V commented on MAPREDUCE-1902: -- Both are equally efficient I think, unless you bring in sharing of job jars across jobs also. It'd definitely help code reuse. I checked trunk and realized that only a minor difference exists between the present way and the dist-cache way. We also un-jar the job.jar so that classes inside sub-directories (according to a job-configurable pattern), e.g. lib/, classes/, are also made available on the class-path. Accommodating it should be straightforward. > job jar file is not distributed via DistributedCache > > > Key: MAPREDUCE-1902 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1902 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > > The main jar file for a job is not distributed via the distributed cache. It > would be more efficient if that were the case. > It would also allow us to comprehensively tackle the inefficiencies in > distribution of jar files and such (see MAPREDUCE-1901). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
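The un-jarring Vinod describes (entries under sub-directories of job.jar, selected by a job-configurable pattern, unpacked so they land on the class-path) can be sketched as follows. The default pattern and function names are assumptions for illustration, not Hadoop's actual localization code:

```python
import re
import zipfile

def matching_members(names, pattern=r"^(lib/|classes/)"):
    """Select the jar entries that should be unpacked onto the class-path,
    per a configurable pattern (lib/ and classes/ assumed as the default)."""
    rx = re.compile(pattern)
    return [n for n in names if rx.match(n)]

def unjar_matching(jar_path, dest_dir, pattern=r"^(lib/|classes/)"):
    """Extract only the matching members of a job jar into dest_dir."""
    with zipfile.ZipFile(jar_path) as jar:
        members = matching_members(jar.namelist(), pattern)
        jar.extractall(dest_dir, members=members)
        return members
```

Making the pattern part of the job configuration is what lets different jobs decide which sub-directories of their jar should be expanded.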
[jira] Commented: (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again
[ https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891529#action_12891529 ] Vinod K V commented on MAPREDUCE-1901: -- bq. Currently, auxiliary files added through DistributedCache.addCacheFiles and DistributedCache.addCacheArchive end up in {mapred.system.dir}/job_id/files or {mapred.system.dir}/job_id/archives. The /job_id directory is then removed after every job, which is why files cannot be reused across jobs. That is only true for private distributed cache files. Artifacts which are already public on the DFS don't go to mapredsystem directly at all and are reusable across users/jobs. bq. 2. it treats shared objects as immutable. meaning that we never look up the timestamp of the backing object in hdfs during task localization/validation. this saves time during task setup. bq. 3. reasonable effort has been put to bypass as many hdfs calls as possible in step 1. the client gets a listing of all shared objects and their md5 signatures in one shot. because of the immutability assumption - individual file stamps are never required and save hdfs calls. I think this is orthogonal. If md5 checksums are preferred over timestamp-based checks for the sake of lessening DFS accesses, that can be done separately within the current design, no? Distributed cache files originally did rely on the md5 checksum of the files/jars that HDFS itself used to have. However that changed via HADOOP-1084 when checksums paved the way for block-level CRCs. bq. 4. finally - there is inbuilt code to do garbage collection of the shared namespace (in hdfs) by deleting old shared objects that have not been recently accessed. This is where I think it gets tricky. First, garbage collection of the dfs namespace should be accompanied by the same on individual TTs - more complexity. There are race conditions too.
It's not clear how the JobTracker is prevented from expiring shared cache files/jars when some JobClient has already marked or is in the process of marking those artifacts for usage by the job. Warranting such synchronization across JobTracker and JobClients is difficult and, at best, brittle. Leaving the synchronization issues unsolved would only mean leaving the tasks/job to fail later which is not desirable. bq. the difference here is that all applications (like Hive) using libjars etc. options provided in hadoop automatically share jars with each other (when they set this option). the applications don't have to do anything special (like figuring out the right global identifier in hdfs for their jars). That seems like a valid use-case. But as I mentioned above, because of complexity and race conditions it seems like a wrong place to develop it. I think the core problem is trying to perform a service (sharing of files) that strictly belongs to the layer above mapreduce - maintaining the share list doesn't seem like a JT's responsibility. The current way of leaving it to the users to decide which are public files(and hence shareable) and which are not and how and when they are purged, keeps things saner from the mapreduce framework point of view. What do you think? bq. if u can look at the patch a bit - that might help understand the differences as well I looked at the patch. And I am still not convinced. Yet, that is. > Jobs should not submit the same jar files over and over again > - > > Key: MAPREDUCE-1901 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Joydeep Sen Sarma > Attachments: 1901.PATCH > > > Currently each Hadoop job uploads the required resources > (jars/files/archives) to a new location in HDFS. Map-reduce nodes involved in > executing this job would then download these resources into local disk. 
> In an environment where most of the users are using a standard set of jars > and files (because they are using a framework like Hive/Pig) - the same jars > keep getting uploaded and downloaded repeatedly. The overhead of this > protocol (primarily in terms of end-user latency) is significant when: > - the jobs are small (and, conversely, large in number) > - Namenode is under load (meaning hdfs latencies are high and made worse, in > part, by this protocol) > Hadoop should provide a way for jobs in a cooperative environment to not > submit the same files over and over again. Identifying and caching execution > resources by a content signature (md5/sha) would be a good alternative to > have available. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
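The content-signature idea proposed in the issue above can be sketched in plain Java. This is a minimal, self-contained sketch, not anything from the attached patch; the ResourceSignature class and md5Hex method are hypothetical names, and a real implementation would stream the jar from disk rather than hold its bytes in memory:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ResourceSignature {

    // Compute an MD5 content signature over a resource's bytes. Two jars with
    // the same signature could share one cached copy in HDFS (assuming, as the
    // patch does, that shared objects are immutable once published).
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present in every JRE.
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] jarBytes = "example jar contents".getBytes();
        System.out.println("signature: " + md5Hex(jarBytes));
    }
}
```

A client could use such a signature as the global identifier under a shared HDFS path (e.g. a per-signature directory), uploading the resource only when no entry with that signature already exists, which is the upload-skipping behavior the issue asks for.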
[jira] Commented: (MAPREDUCE-1809) Ant build changes for Streaming system tests in contrib projects.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891521#action_12891521 ] Balaji Rajagopalan commented on MAPREDUCE-1809: --- +1 > Ant build changes for Streaming system tests in contrib projects. > - > > Key: MAPREDUCE-1809 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1809 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: build >Affects Versions: 0.21.0 >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1809-ydist-security.patch, MAPREDUCE-1809.patch, > MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, MAPREDUCE-1809.patch > > > Implementing a new target (test-system) in the build-contrib.xml file for executing > the system tests that are in contrib projects. Also adding a 'subant' target in > aop.xml that calls the build-contrib.xml file for system tests. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1964) Running hi Ram jobs when TTs are blacklisted
[ https://issues.apache.org/jira/browse/MAPREDUCE-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891519#action_12891519 ] Vinay Kumar Thota commented on MAPREDUCE-1964: -- 1. Please add a brief description of the class. 2. Add javadoc for each public method. 3. {noformat} +Assert.assertEquals("Job has not been succeeded", + jInfo.getStatus().getRunState(), JobStatus.SUCCEEDED); {noformat} Don't use the above statement in the helper method; leave it up to the test. 4. {noformat} + private int runTool(Configuration job, Tool tool, + String[] jobArgs) throws Exception { + int returnStatus = ToolRunner.run(job, tool, jobArgs); + return returnStatus; + } {noformat} Instead of writing a separate method, use the ToolRunner statement directly. {noformat} +JobID jobId = helper.runHighRamJob(conf,jobClient,remoteJTClient); {noformat} final HighRamJobHelper helper = new HighRamJobHelper(); JobID jobId = helper.runHighRamJob(conf,jobClient,remoteJTClient); Make it final and use it locally instead of defining it globally, because it is used in only one place in the class. > Running hi Ram jobs when TTs are blacklisted > > > Key: MAPREDUCE-1964 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1964 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Balaji Rajagopalan > Attachments: hiRam_bList_y20.patch > > > More slots are getting reserved for HiRAM job tasks than required > Blacklist more than 25% of TTs across the job. Run a high ram job. No > java.lang.RuntimeException should be displayed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1964) Running hi Ram jobs when TTs are blacklisted
[ https://issues.apache.org/jira/browse/MAPREDUCE-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Rajagopalan updated MAPREDUCE-1964: -- Attachment: hiRam_bList_y20.patch First patch for review > Running hi Ram jobs when TTs are blacklisted > > > Key: MAPREDUCE-1964 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1964 > Project: Hadoop Map/Reduce > Issue Type: New Feature >Reporter: Balaji Rajagopalan > Attachments: hiRam_bList_y20.patch > > > More slots are getting reserved for HiRAM job tasks than required > Blacklist more than 25% of TTs across the job. Run a high ram job. No > java.lang.RuntimeException should be displayed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1964) Running hi Ram jobs when TTs are blacklisted
Running hi Ram jobs when TTs are blacklisted Key: MAPREDUCE-1964 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1964 Project: Hadoop Map/Reduce Issue Type: New Feature Reporter: Balaji Rajagopalan More slots are getting reserved for HiRAM job tasks than required Blacklist more than 25% of TTs across the job. Run a high ram job. No java.lang.RuntimeException should be displayed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1963) [Herriot] TaskMemoryManager should log process-tree's status while killing tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1963: - Attachment: 1963-ydist-security.patch Patch for the yahoo security dist branch. > [Herriot] TaskMemoryManager should log process-tree's status while killing > tasks > > > Key: MAPREDUCE-1963 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1963 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: test >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1963-ydist-security.patch > > > 1. Execute a streaming job which will increase memory usage beyond configured > memory limits during the map phase. TaskMemoryManager should log the map > task's process-tree's status just before killing the task. > 2. Execute a streaming job which will increase memory usage beyond configured > memory limits during the reduce phase. TaskMemoryManager should log the > reduce task's process-tree's status just before killing the task. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1963) [Herriot] TaskMemoryManager should log process-tree's status while killing tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Kumar Thota updated MAPREDUCE-1963: - Assignee: Vinay Kumar Thota > [Herriot] TaskMemoryManager should log process-tree's status while killing > tasks > > > Key: MAPREDUCE-1963 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1963 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: test >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > > 1. Execute a streaming job which will increase memory usage beyond configured > memory limits during the map phase. TaskMemoryManager should log the map > task's process-tree's status just before killing the task. > 2. Execute a streaming job which will increase memory usage beyond configured > memory limits during the reduce phase. TaskMemoryManager should log the > reduce task's process-tree's status just before killing the task. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1963) [Herriot] TaskMemoryManager should log process-tree's status while killing tasks
[Herriot] TaskMemoryManager should log process-tree's status while killing tasks Key: MAPREDUCE-1963 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1963 Project: Hadoop Map/Reduce Issue Type: Task Components: test Reporter: Vinay Kumar Thota 1. Execute a streaming job which will increase memory usage beyond configured memory limits during the map phase. TaskMemoryManager should log the map task's process-tree's status just before killing the task. 2. Execute a streaming job which will increase memory usage beyond configured memory limits during the reduce phase. TaskMemoryManager should log the reduce task's process-tree's status just before killing the task. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1957) [Herriot] Test Job cache directories cleanup after job completes.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891501#action_12891501 ] Balaji Rajagopalan commented on MAPREDUCE-1957: --- +1 > [Herriot] Test Job cache directories cleanup after job completes. > - > > Key: MAPREDUCE-1957 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1957 > Project: Hadoop Map/Reduce > Issue Type: Task > Components: test >Reporter: Vinay Kumar Thota >Assignee: Vinay Kumar Thota > Attachments: 1957-ydist-security.patch, 1957-ydist-security.patch, > 1957-ydist-security.patch > > > Test the job cache directories cleanup after the job completes. The test covers the > following scenarios. > 1. Submit a job and create folders and files in the work folder with > non-writable permissions under the task attempt id folder. Wait till the job > completes and verify whether the files and folders are cleaned up or not. > 2. Submit a job and create folders and files in the work folder with > non-writable permissions under the task attempt id folder. Kill the job and > verify whether the files and folders are cleaned up or not. > 3. Submit a job and create folders and files in the work folder with > non-writable permissions under the task attempt id folder. Fail the job and > verify whether the files and folders are cleaned up or not. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1925) TestRumenJobTraces fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1925: Status: Patch Available (was: Open) > TestRumenJobTraces fails in trunk > - > > Key: MAPREDUCE-1925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amareshwari Sriramadasu >Assignee: Ravi Gummadi > Fix For: 0.22.0 > > Attachments: 1925.patch, 1925.v1.1.patch, 1925.v1.patch, > 1925.v2.1.patch, 1925.v2.patch > > > TestRumenJobTraces failed with following error: > Error Message > the gold file contains more text at line 1 expected:<56> but was:<0> > Stacktrace > at > org.apache.hadoop.tools.rumen.TestRumenJobTraces.testHadoop20JHParser(TestRumenJobTraces.java:294) > Full log of the failure is available at > http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/292/testReport/org.apache.hadoop.tools.rumen/TestRumenJobTraces/testHadoop20JHParser/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (MAPREDUCE-1962) IOException throws and it fails with token expired while running the tests.
IOException throws and it fails with token expired while running the tests. --- Key: MAPREDUCE-1962 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1962 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Reporter: Vinay Kumar Thota Assignee: Vinay Kumar Thota An IOException is thrown and the tests fail because the token has expired. I could see this issue on a secure cluster. The issue was resolved by setting the following attribute in the configuration before running the tests. mapreduce.job.complete.cancel.delegation.tokens=false -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
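The workaround described above can be expressed as a Hadoop configuration fragment. This is a sketch of how the property might be set in a site-style config file; whether it belongs in mapred-site.xml or in the per-test job configuration depends on the test harness:

```xml
<!-- Keep delegation tokens alive after job completion, so that follow-on
     test jobs in the same secure-cluster run do not fail with expired
     tokens. Assumption: the test framework reads this site file. -->
<property>
  <name>mapreduce.job.complete.cancel.delegation.tokens</name>
  <value>false</value>
</property>
```

Setting the property per-job (e.g. via `conf.setBoolean(...)` before submission) would scope the change to the tests instead of the whole cluster.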
[jira] Updated: (MAPREDUCE-1925) TestRumenJobTraces fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1925: Attachment: 1925.v2.1.patch Attaching a new patch removing the dependency on InputDemuxer in getRewindableInputStream(). > TestRumenJobTraces fails in trunk > - > > Key: MAPREDUCE-1925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amareshwari Sriramadasu >Assignee: Ravi Gummadi > Fix For: 0.22.0 > > Attachments: 1925.patch, 1925.v1.1.patch, 1925.v1.patch, > 1925.v2.1.patch, 1925.v2.patch > > > TestRumenJobTraces failed with the following error: > Error Message > the gold file contains more text at line 1 expected:<56> but was:<0> > Stacktrace > at > org.apache.hadoop.tools.rumen.TestRumenJobTraces.testHadoop20JHParser(TestRumenJobTraces.java:294) > Full log of the failure is available at > http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/292/testReport/org.apache.hadoop.tools.rumen/TestRumenJobTraces/testHadoop20JHParser/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1871) Create automated test scenario for "Collect information about number of tasks succeeded / total per time unit for a tasktracker"
[ https://issues.apache.org/jira/browse/MAPREDUCE-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iyappan Srinivasan updated MAPREDUCE-1871: -- Status: Patch Available (was: Open) Affects Version/s: 0.21.0 > Create automated test scenario for "Collect information about number of tasks > succeeded / total per time unit for a tasktracker" > > > Key: MAPREDUCE-1871 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1871 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Affects Versions: 0.21.0 >Reporter: Iyappan Srinivasan >Assignee: Iyappan Srinivasan > Attachments: 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > MAPREDUCE-1871.patch, MAPREDUCE-1871.patch, MAPREDUCE-1871.patch, > MAPREDUCE-1871.patch, MAPREDUCE-1871.patch > > > Create automated test scenario for "Collect information about number of tasks > succeeded / total per time unit for a tasktracker" > 1) Verification of all the above mentioned fields with the specified TTs. > The total no. of tasks and successful tasks should be equal to the corresponding > no. of tasks specified in the TT's logs > 2) Fail a task on a tasktracker. The Node UI should update the status of tasks on > that TT accordingly. > 3) Kill a task on a tasktracker. The Node UI should update the status of tasks on > that TT accordingly > 4) Positive: Run simultaneous jobs and check if all the fields are populated > with proper values of tasks. The Node UI should have correct values for all the > fields mentioned above. > 5) Check the fields across a one-hour window. Fields related to the hour should be > updated every hour > 6) Check the fields across a one-day window. Fields related to the day should be > updated every day > 7) Restart a TT and bring it back. 
The UI should retain the field values. > 8) Positive: Run a bunch of jobs with 0 maps and 0 reduces simultaneously. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1871) Create automated test scenario for "Collect information about number of tasks succeeded / total per time unit for a tasktracker"
[ https://issues.apache.org/jira/browse/MAPREDUCE-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iyappan Srinivasan updated MAPREDUCE-1871: -- Attachment: MAPREDUCE-1871.patch Patch for trunk; attaching it on top to make sure it gets picked up when the issue is marked Patch Available. > Create automated test scenario for "Collect information about number of tasks > succeeded / total per time unit for a tasktracker" > > > Key: MAPREDUCE-1871 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1871 > Project: Hadoop Map/Reduce > Issue Type: Test > Components: test >Reporter: Iyappan Srinivasan >Assignee: Iyappan Srinivasan > Attachments: 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > 1871-ydist-security-patch.txt, 1871-ydist-security-patch.txt, > MAPREDUCE-1871.patch, MAPREDUCE-1871.patch, MAPREDUCE-1871.patch, > MAPREDUCE-1871.patch, MAPREDUCE-1871.patch > > > Create automated test scenario for "Collect information about number of tasks > succeeded / total per time unit for a tasktracker" > 1) Verification of all the above mentioned fields with the specified TTs. > The total no. of tasks and successful tasks should be equal to the corresponding > no. of tasks specified in the TT's logs > 2) Fail a task on a tasktracker. The Node UI should update the status of tasks on > that TT accordingly. > 3) Kill a task on a tasktracker. The Node UI should update the status of tasks on > that TT accordingly > 4) Positive: Run simultaneous jobs and check if all the fields are populated > with proper values of tasks. The Node UI should have correct values for all the > fields mentioned above. 
> 5) Check the fields across a one-hour window. Fields related to the hour should be > updated every hour > 6) Check the fields across a one-day window. Fields related to the day should be > updated every day > 7) Restart a TT and bring it back. The UI should retain the field values. > 8) Positive: Run a bunch of jobs with 0 maps and 0 reduces simultaneously. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1925) TestRumenJobTraces fails in trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Gummadi updated MAPREDUCE-1925: Status: Open (was: Patch Available) > TestRumenJobTraces fails in trunk > - > > Key: MAPREDUCE-1925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-1925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: tools/rumen >Affects Versions: 0.22.0 >Reporter: Amareshwari Sriramadasu >Assignee: Ravi Gummadi > Fix For: 0.22.0 > > Attachments: 1925.patch, 1925.v1.1.patch, 1925.v1.patch, 1925.v2.patch > > > TestRumenJobTraces failed with following error: > Error Message > the gold file contains more text at line 1 expected:<56> but was:<0> > Stacktrace > at > org.apache.hadoop.tools.rumen.TestRumenJobTraces.testHadoop20JHParser(TestRumenJobTraces.java:294) > Full log of the failure is available at > http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/292/testReport/org.apache.hadoop.tools.rumen/TestRumenJobTraces/testHadoop20JHParser/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.