Re: [DISCUSS] A final minor release off branch-2?
On 11/15/17 10:34 AM, Andrew Wang wrote:
> Hi Junping,
>
> On Wed, Nov 15, 2017 at 1:37 AM, Junping Du wrote:
>> 3. Besides incompatibilities, it is also possible for new Hadoop
>> releases to have performance regressions (lower throughput, higher
>> latency, slower job runs, a bigger memory footprint, or even memory
>> leaks). While the performance impact of migration (if any) may be
>> negligible to some users, others could be very sensitive and wish to
>> roll back if it happens on their production cluster.
>
> Yes, bugs exist. I won't claim that 3.0.0 is bug-free. All new releases
> can potentially introduce new bugs. However, I don't think rollback is
> the solution. In my experience, users rarely roll back, since it's so
> disruptive and causes data loss. It's much more common that they patch
> and upgrade. With that in mind, I'd rather we spend our effort on making
> 3.0.x high-quality than on making it easier to roll back.
>
> The root of my concern about announcing a "bridge release" is that it
> discourages users from upgrading to 3.0.0 until a bridge release is out.
> I strongly believe the level of quality provided by 3.0.0 is at least
> equal to that of a new 2.x minor release, given our extended testing and
> integration process, and we don't have bridge releases for 2.x. This is
> why I asked for a list of known issues with 2.x -> 3.0 upgrades that
> would necessitate a bridge release. Arun raised a concern about NM
> rollback. Are there any other *known* issues?

While going over the JACC report as part of YARN-6142, I filed HADOOP-14534, MAPREDUCE-6902, and YARN-6717 to document the major issues that I ran across. I think we found one or two other JIRAs which we marked as incompatible as part of this investigation.

The protobuf changes should be forward compatible going from 2.8.0 to 3.0.0. YARN-6798 should fix the NM state store versioning when upgrading from 2.9.0 to 3.0.0. 2.8.0 to 3.0.0 could have an issue if the relevant features are enabled (queued containers, work-preserving NM restart w/ AMRMProxy).

-Ray

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
Re: Branch merges and 3.0.0-beta1 scope
On 8/22/17 3:20 AM, Steve Loughran wrote:
> On 21 Aug 2017, at 22:22, Vinod Kumar Vavilapalli wrote:
>> Steve,
>> You can be strict & ruthless about the timelines. Anything that doesn't
>> get in by mid-September, as was originally planned, can move to the
>> next release - whether it is feature work on branches or feature work
>> on trunk.
>> The problem I see here is that code & branches being worked on for a
>> year are now (apparently) close to being done, and we are telling them
>> to hold for 7 more months - this is not a reasonable ask. If you are
>> advocating for a 3.1 plan, I'm sure one of these branch 'owners' can
>> volunteer. But this is how you get competing releases and split
>> bandwidth.
>> As for compatibility / testing etc., there seems to be a belief that
>> the current 'scoped' features are all well tested in these areas and
>> that adding more is going to hurt the release. There is no way this is
>> the reality; trunk has so many features that have been landing for
>> years, and the only way we can collectively attempt to make this stable
>> is by getting as many parties together as possible, each verifying the
>> stuff they need - not by excluding specific features.
>
> If everyone is confident & it's coming together, it does make sense. I
> think those of us (myself included) who are merging stuff in do have to
> recognise that we really need to follow it through by being responsive
> to any problem - and with the release manager having the right to pull
> things out if they are felt to significantly threaten the stability of
> the final 3.0 release.
> I think we should also consider making the 3.0 beta the feature freeze;
> after that, fixes on the features go in but nothing else of
> significance, otherwise the value of the beta ("test this code more
> broadly") is diminished.

At this point, there have been three planned alphas from September 2016 until July 2017 to "get in features". While a couple of upcoming features are "a few weeks" away, I think all of us are aware of how predictable software development schedules can be. I think we can also all agree that rushing just to meet a release deadline isn't the best practice for software development either.

Andrew has been very clear about his goals at each step, and I think Wangda's willingness not to rush in resource types was an appropriate response. I'm sympathetic to the goal of getting a feature into 3.0, but it might be a good idea for each project that is a "few weeks away" to seriously assess its readiness compared to the features that have already been in testing for 6+ months.

-Ray
[jira] [Created] (MAPREDUCE-6902) [Umbrella] API related cleanup for Hadoop 3
Ray Chiang created MAPREDUCE-6902:
-------------------------------------

Summary: [Umbrella] API related cleanup for Hadoop 3
Key: MAPREDUCE-6902
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6902
Project: Hadoop Map/Reduce
Issue Type: Task
Reporter: Ray Chiang
Assignee: Ray Chiang

Creating this umbrella JIRA for tracking various API-related issues that need to be properly tracked, adjusted, or documented before the Hadoop 3 release.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Created] (MAPREDUCE-6901) Remove DistributedCache from trunk
Ray Chiang created MAPREDUCE-6901:
-------------------------------------

Summary: Remove DistributedCache from trunk
Key: MAPREDUCE-6901
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6901
Project: Hadoop Map/Reduce
Issue Type: Task
Components: distributed-cache
Affects Versions: 3.0.0-alpha3
Reporter: Ray Chiang
Priority: Critical

Doing this as part of Hadoop 3 cleanup. DistributedCache has been marked as deprecated for so long that the change which deprecated it isn't in Git. I don't really have a preference for whether we remove it or not, but I'd like to have a discussion and have the outcome properly documented as a release note for Hadoop 3 before we hit final release. At the very least we can have a Release Note that sums up whatever discussion we have here.
[jira] [Resolved] (MAPREDUCE-6685) LocalDistributedCacheManager can have overlapping filenames
[ https://issues.apache.org/jira/browse/MAPREDUCE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ray Chiang resolved MAPREDUCE-6685.
-----------------------------------
Resolution: Duplicate
Target Version/s: (was: )

Properly closing as duplicate.

> LocalDistributedCacheManager can have overlapping filenames
> -----------------------------------------------------------
>
> Key: MAPREDUCE-6685
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6685
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha1
> Reporter: Ray Chiang
> Assignee: Ray Chiang
> Attachments: MAPREDUCE-6685.001.patch, MAPREDUCE-6685.002.patch
>
> LocalDistributedCacheManager has this setup:
> bq. AtomicLong uniqueNumberGenerator = new AtomicLong(System.currentTimeMillis());
> to create this temporary filename:
> bq. new FSDownload(localFSFileContext, ugi, conf, new Path(destPath, Long.toString(uniqueNumberGenerator.incrementAndGet())), resource);
> when using LocalJobRunner. When two or more jobs start on the same machine, it's possible for them to end up with the same timestamp, or with timestamps close enough that two successive values are not sufficiently far apart.
> Given the assumptions:
> 1) Assume the timestamp is the same. Then the most common starting random seed will be the same.
> 2) The process ID will very likely be unique, but will likely be close in value.
> 3) The thread ID is not guaranteed to be unique.
> A unique ID based on the PID as a seed (in addition to the timestamp) should be a better unique identifier for temporary filenames.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Reopened] (MAPREDUCE-6685) LocalDistributedCacheManager can have overlapping filenames
[ https://issues.apache.org/jira/browse/MAPREDUCE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ray Chiang reopened MAPREDUCE-6685:
-----------------------------------

> LocalDistributedCacheManager can have overlapping filenames
> -----------------------------------------------------------
>
> Key: MAPREDUCE-6685
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6685
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0-alpha1
> Reporter: Ray Chiang
> Assignee: Ray Chiang
> Attachments: MAPREDUCE-6685.001.patch, MAPREDUCE-6685.002.patch
>
> LocalDistributedCacheManager has this setup:
> bq. AtomicLong uniqueNumberGenerator = new AtomicLong(System.currentTimeMillis());
> to create this temporary filename:
> bq. new FSDownload(localFSFileContext, ugi, conf, new Path(destPath, Long.toString(uniqueNumberGenerator.incrementAndGet())), resource);
> when using LocalJobRunner. When two or more jobs start on the same machine, it's possible for them to end up with the same timestamp, or with timestamps close enough that two successive values are not sufficiently far apart.
> Given the assumptions:
> 1) Assume the timestamp is the same. Then the most common starting random seed will be the same.
> 2) The process ID will very likely be unique, but will likely be close in value.
> 3) The thread ID is not guaranteed to be unique.
> A unique ID based on the PID as a seed (in addition to the timestamp) should be a better unique identifier for temporary filenames.
[jira] [Created] (MAPREDUCE-6685) LocalDistributedCacheManager can have overlapping filenames
Ray Chiang created MAPREDUCE-6685:
-------------------------------------

Summary: LocalDistributedCacheManager can have overlapping filenames
Key: MAPREDUCE-6685
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6685
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ray Chiang
Assignee: Ray Chiang

LocalDistributedCacheManager has this setup:

bq. AtomicLong uniqueNumberGenerator = new AtomicLong(System.currentTimeMillis());

to create this temporary filename:

bq. new FSDownload(localFSFileContext, ugi, conf, new Path(destPath, Long.toString(uniqueNumberGenerator.incrementAndGet())), resource);

when using LocalJobRunner. When two or more jobs start on the same machine, it's possible for them to end up with the same timestamp, or with timestamps close enough that two successive values are not sufficiently far apart.

Given the assumptions:
1) Assume the timestamp is the same. Then the most common starting random seed will be the same.
2) The process ID will very likely be unique, but will likely be close in value.
3) The thread ID is not guaranteed to be unique.

A unique ID based on the PID as a seed (in addition to the timestamp) should be a better unique identifier for temporary filenames.
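The direction proposed in the last paragraph could be sketched as follows. This is illustrative only, not the committed patch; the class name and the way the PID is mixed into the counter are invented for the example (the PID is read by parsing the pre-Java-9 "pid@hostname" runtime name):

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: seed the counter with both the timestamp and the process id, so two
// LocalJobRunner processes starting in the same millisecond no longer collide
// on their generated temp-file names.
public class UniqueNameGenerator {
    private final AtomicLong counter;

    public UniqueNameGenerator() {
        // getName() returns "pid@hostname" on typical JVMs; parsing it is a
        // common pre-Java-9 way to obtain the process id.
        String jvmName = ManagementFactory.getRuntimeMXBean().getName();
        long pid = Long.parseLong(jvmName.split("@")[0]);
        // Mix timestamp and pid so distinct processes start from distinct values.
        counter = new AtomicLong((System.currentTimeMillis() << 16) ^ pid);
    }

    public String nextName() {
        return Long.toString(counter.incrementAndGet());
    }

    public static void main(String[] args) {
        UniqueNameGenerator g = new UniqueNameGenerator();
        System.out.println(g.nextName());
    }
}
```

Within one process the AtomicLong already guarantees uniqueness; the seed change only addresses collisions across processes.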
[jira] [Created] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit
Ray Chiang created MAPREDUCE-6622:
-------------------------------------

Summary: Add capability to set JHS job cache to a task-based limit
Key: MAPREDUCE-6622
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: jobhistoryserver
Affects Versions: 2.7.2
Reporter: Ray Chiang
Assignee: Ray Chiang

When setting the property mapreduce.jobhistory.loadedjobs.cache.size, the cached jobs can be of varying size. This is generally not a problem when the job sizes are uniform or small, but when jobs can be very large (say, greater than 250k tasks), the JHS heap size can grow tremendously. In cases where multiple jobs are very large, the JHS can lock up and spend all its time in GC. However, since the cache is holding on to all the jobs, not much heap space can be freed up. Setting a property that caps the number of tasks allowed in the cache should help prevent the JHS from locking up, since the total number of tasks loaded is directly proportional to the amount of heap used.
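The idea can be sketched with a small LRU map bounded by total task count rather than job count. The class and method names here are invented for illustration and do not reflect the eventual JHS implementation:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: evict least-recently-used jobs until the total number of loaded
// tasks falls under a configured cap, instead of capping the job count.
public class TaskBoundedJobCache {
    private final long maxTasks;
    private long totalTasks = 0;
    // jobId -> task count; access-order mode gives LRU iteration order.
    private final LinkedHashMap<String, Long> jobTasks =
        new LinkedHashMap<>(16, 0.75f, true);

    public TaskBoundedJobCache(long maxTasks) {
        this.maxTasks = maxTasks;
    }

    public void put(String jobId, long tasks) {
        Long prev = jobTasks.put(jobId, tasks);
        totalTasks += tasks - (prev == null ? 0 : prev);
        // Evict oldest jobs until under the task cap, keeping the new job.
        Iterator<Map.Entry<String, Long>> it = jobTasks.entrySet().iterator();
        while (totalTasks > maxTasks && jobTasks.size() > 1 && it.hasNext()) {
            Map.Entry<String, Long> oldest = it.next();
            if (oldest.getKey().equals(jobId)) {
                continue; // never evict the job we just inserted
            }
            totalTasks -= oldest.getValue();
            it.remove();
        }
    }

    public int size() { return jobTasks.size(); }
    public long totalTasks() { return totalTasks; }
}
```

Because heap usage is roughly proportional to loaded tasks, bounding the task total bounds the heap regardless of how unevenly sized the jobs are.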
[jira] [Created] (MAPREDUCE-6613) Change mapreduce.jobhistory.jhist.format default from json to binary
Ray Chiang created MAPREDUCE-6613:
-------------------------------------

Summary: Change mapreduce.jobhistory.jhist.format default from json to binary
Key: MAPREDUCE-6613
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6613
Project: Hadoop Map/Reduce
Issue Type: New Feature
Affects Versions: 2.8.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

MAPREDUCE-6376 added a configuration setting for the internal .jhist format: mapreduce.jobhistory.jhist.format. Currently, the default is "json". Changing the default to "binary" allows faster parsing, with the downside that the file is no longer human-readable via "hadoop fs -cat".
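For reference, a cluster can opt in to the binary format today regardless of the default; a sketch of the mapred-site.xml entry (the property name is from MAPREDUCE-6376, as quoted above):

```xml
<!-- mapred-site.xml: select the binary .jhist format explicitly.
     "json" stays readable with "hadoop fs -cat"; "binary" parses faster. -->
<property>
  <name>mapreduce.jobhistory.jhist.format</name>
  <value>binary</value>
</property>
```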
Re: Newcomer developer seeking to contribute
Welcome aboard. I'd recommend reading the HowToContribute wiki to understand the contribution process. For other tips on getting started, there is the "newbie" label in JIRA, but it's an imperfect sorting. I recommend the following:

1) Look at unit test JIRAs, both open and closed. Tons of benefits: they'll help you understand some part of the API, some could use comments and enhancements, etc.
2) Pick a component (ResourceManager, NodeManager, MR APIs, DataNode, NameNode, etc.) and focus on that to start. It's easier than trying to understand it all at once.
3) Participate in some code reviews. Your review will be non-binding, but testing out other people's code can help you understand parts of the system.

-Ray

On Mon, Jul 13, 2015 at 9:56 AM, Ajoy Bhatia wrote:
> Hi,
>
> I am a software developer and have been working in the industry since 1991.
> I have written Java code for Map-Reduce jobs in my recent jobs. I want to
> contribute to the Hadoop project, and signed up for the mapreduce-dev mailing
> list, but I am open to contributing to any Hadoop project module that needs
> help.
>
> I would like to know of any beginner-level issues that I could start
> working on. I have gone through the Newcomers and Getting Started pages on
> community.apache.org. As suggested, I searched the Hadoop Map/Reduce JIRA for
> issues tagged "GSoC" or "mentor". That didn't help me identify something
> suitable. I would appreciate any help in getting me started as a
> contributor.
>
> Thanks...
> - Ajoy
[jira] [Created] (MAPREDUCE-6432) Fix typos in hadoop-mapreduce-project module
Ray Chiang created MAPREDUCE-6432:
-------------------------------------

Summary: Fix typos in hadoop-mapreduce-project module
Key: MAPREDUCE-6432
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6432
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

Fix a bunch of typos in comments, strings, variable names, and method names in the hadoop-mapreduce-project module.
[jira] [Created] (MAPREDUCE-6421) Fix Findbugs pre-patch warning in org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.reduceNodeLabelExpression
Ray Chiang created MAPREDUCE-6421:
-------------------------------------

Summary: Fix Findbugs pre-patch warning in org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.reduceNodeLabelExpression
Key: MAPREDUCE-6421
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6421
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Ray Chiang

The actual error message is:

Inconsistent synchronization of org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.reduceNodeLabelExpression; locked 66% of time

The full findbugs report is at:

https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5858/artifact/patchprocess/trunkFindbugsWarningshadoop-mapreduce-client-app.html

I haven't looked to see if this message is in error or if findbugs is correct.
[jira] [Created] (MAPREDUCE-6406) Update FileOutputCommitter.FILEOUTPUTCOMMITTER_ALGORITHM_VERSION_DEFAULT to match mapred-default.xml
Ray Chiang created MAPREDUCE-6406:
-------------------------------------

Summary: Update FileOutputCommitter.FILEOUTPUTCOMMITTER_ALGORITHM_VERSION_DEFAULT to match mapred-default.xml
Key: MAPREDUCE-6406
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6406
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

MAPREDUCE-6336 updated the default value of the property mapreduce.fileoutputcommitter.algorithm.version to 2. Should the FileOutputCommitter class default be updated to match?
[jira] [Created] (MAPREDUCE-6394) Avoid copying counters in task reports for Job History Server
Ray Chiang created MAPREDUCE-6394:
-------------------------------------

Summary: Avoid copying counters in task reports for Job History Server
Key: MAPREDUCE-6394
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6394
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: jobhistoryserver
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang

In HsTasksBlock#render(), there is a loop that creates a JavaScript table, which slows down immensely for jobs with many tasks.
[jira] [Created] (MAPREDUCE-6388) Remove deprecation warnings from JobHistoryServer classes
Ray Chiang created MAPREDUCE-6388:
-------------------------------------

Summary: Remove deprecation warnings from JobHistoryServer classes
Key: MAPREDUCE-6388
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6388
Project: Hadoop Map/Reduce
Issue Type: Task
Components: jobhistoryserver
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

There are a ton of deprecation warnings in the JobHistoryServer classes. This is affecting some modifications I'm making, since a single-line move shifts all the deprecation warnings. I'd like to get these fixed to prevent minor changes from generating a ton of warnings in test-patch.
[jira] [Created] (MAPREDUCE-6376) Fix long load times of .jhist file in JobHistoryServer
Ray Chiang created MAPREDUCE-6376:
-------------------------------------

Summary: Fix long load times of .jhist file in JobHistoryServer
Key: MAPREDUCE-6376
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6376
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang

When you click on a Job link in the JHS Web UI, it loads the .jhist file. For jobs which have a large number of tasks, the load time can break UI responsiveness.
New unit tests for *-default.xml verification
I was going to update the Wiki with this information, but it seems to be down for me at the moment. A few people have already bumped into some issues, so I'm informing via the *-dev lists for now.

In trunk, there are three new unit tests:

TestMapreduceConfigFields (MAPREDUCE-6192)
TestYarnConfigurationFields (YARN-2957)
TestHdfsConfigFields (HDFS-7559)

which perform an automatic comparison of properties between one of the *-default.xml files and a set of Java classes. As of right now, if a property is added to the .xml file, one of the following should hold:

1) The property is defined in the appropriate Java *Config.java class (e.g. MRJobConfig).
2) The property is added to the xmlPropsToSkipCompare or xmlPrefixToSkipCompare HashSet in the appropriate unit test.

This guarantees that when a .xml property is created, the corresponding property is spelled correctly in the Java class, or its exception is documented in the unit test.

The following three JIRAs are open to do the reverse check (flag when a property exists in a Java class, but not in the .xml file):

YARN-3069
MAPREDUCE-6358
HDFS-8356

These JIRAs will take more time due to getting the description fields properly filled out and making sure the default values are correct. In some cases, such as properties that override an older property, more exceptions will need to be added.

Once all of the above are done, I may make a more solid attempt at doing the same for CommonConfigurationKeys/core-default.xml.

Thanks to everyone who has been reviewing, committing, and updating the new unit tests.

-Ray
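The comparison these tests perform can be illustrated with a stripped-down sketch. DemoConfig and the helper method are hypothetical stand-ins written for this example; the real logic lives in the unit tests named above:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashSet;
import java.util.Set;

// Sketch: collect every public static final String constant from a config
// class via reflection, then check each property name found in the
// *-default.xml file against that set (or against an exception list).
public class ConfigFieldCheck {

    static Set<String> declaredProperties(Class<?> configClass) {
        Set<String> props = new HashSet<>();
        for (Field f : configClass.getDeclaredFields()) {
            int m = f.getModifiers();
            if (Modifier.isPublic(m) && Modifier.isStatic(m)
                    && Modifier.isFinal(m) && f.getType() == String.class) {
                try {
                    props.add((String) f.get(null));
                } catch (IllegalAccessException e) {
                    throw new AssertionError(e); // public static field: unreachable
                }
            }
        }
        return props;
    }

    // Hypothetical stand-in for a class like MRJobConfig.
    public static class DemoConfig {
        public static final String QUEUE_NAME = "mapreduce.job.queuename";
    }

    public static void main(String[] args) {
        Set<String> props = declaredProperties(DemoConfig.class);
        // A property read from the .xml file either matches a constant or
        // must appear in an xmlPropsToSkipCompare-style exception set.
        System.out.println(props.contains("mapreduce.job.queuename"));
    }
}
```

A mismatch in either direction then points at a misspelled constant, a missing xml entry, or a missing exception-list entry.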
[jira] [Created] (MAPREDUCE-6358) Document missing properties in mapred-default.xml
Ray Chiang created MAPREDUCE-6358:
-------------------------------------

Summary: Document missing properties in mapred-default.xml
Key: MAPREDUCE-6358
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6358
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: documentation
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang

The following properties are currently not defined in mapred-default.xml. These properties should either be
A) documented in mapred-default.xml, OR
B) listed as an exception (with comments, e.g. for internal use) in the TestMapreduceConfigFields unit test
[jira] [Created] (MAPREDUCE-6349) Fix typo in property org.apache.hadoop.mapreduce.lib.chain.Chain.REDUCER_INPUT_VALUE_CLASS
Ray Chiang created MAPREDUCE-6349:
-------------------------------------

Summary: Fix typo in property org.apache.hadoop.mapreduce.lib.chain.Chain.REDUCER_INPUT_VALUE_CLASS
Key: MAPREDUCE-6349
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6349
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

Ran across this typo in a property. It doesn't look like it's used anywhere externally.
Question about org.apache.hadoop.yarn.webapp.Dispatcher
In Dispatcher#service(), I see this comment:

// TODO: support args converted from /path/:arg1/...
dest.action.invoke(controller, (Object[]) null);

Right now, I've made some changes in MAPREDUCE-6222 that seem to trigger exceptions at this TODO. Can someone give me a clearer idea of what sort of processing should occur at the point of this TODO?

On a related note, is there any additional/better documentation for the various org.apache.hadoop.yarn.webapp and org.apache.hadoop.yarn.webapp.view classes? There are a few things I'm trying to figure out, and stepping through with a debugger gets painful at times.

Any information is appreciated. Thanks.

-Ray
[jira] [Created] (MAPREDUCE-6340) Remove .xml and documentation references to dfs.webhdfs.enabled
Ray Chiang created MAPREDUCE-6340:
-------------------------------------

Summary: Remove .xml and documentation references to dfs.webhdfs.enabled
Key: MAPREDUCE-6340
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6340
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor
Fix For: 3.0.0

HDFS-7985 removed any references in the code to the dfs.webhdfs.enabled property. The property should also be removed from hdfs-default.xml as well as other places. As mentioned in HDFS-7985, this is an incompatible change with branch-2, so this fix should be limited to trunk.
Re: Removal of unused properties
We have done this before, where the properties were documented in the .xml file but didn't exist anywhere in the Configuration classes or the rest of the code:

HDFS-7566 (committed)
YARN-2460 (committed)
MAPREDUCE-6057 (pending)

-Ray

On Thu, Apr 9, 2015 at 4:33 AM, Akira AJISAKA wrote:
> Hi Folks,
>
> In MAPREDUCE-6307, I'd like to remove the unused
> "mapreduce.tasktracker.taskmemorymanager.monitoringinterval" property;
> however, the compatibility document says "Hadoop-defined properties are
> to be deprecated at least for one major release before being removed."
>
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/Compatibility.html#Hadoop_Configuration_Files
>
> Is this applicable to unused properties?
> Can we remove unused properties right now?
>
> Regards,
> Akira
[jira] [Created] (MAPREDUCE-6266) Job#getTrackingURL should consistently return a proper URL
Ray Chiang created MAPREDUCE-6266:
-------------------------------------

Summary: Job#getTrackingURL should consistently return a proper URL
Key: MAPREDUCE-6266
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6266
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

When a job is running, Job#getTrackingURL returns a proper URL like:

http://:8088/proxy/application_1424910897258_0004/

Once the job is finished and has moved to the JHS, Job#getTrackingURL returns a URL without the protocol, like:

:19888/jobhistory/job/job_1424910897258_0004
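Until the inconsistency is fixed, callers could normalize the returned string defensively. A sketch; the helper is hypothetical and not part of the MapReduce API, and the assumed default scheme of "http" may not match secure (https) clusters:

```java
// Client-side workaround sketch: prepend a scheme when Job#getTrackingURL
// returns a bare "host:port/path" after the job has moved to the JHS.
public class TrackingUrl {
    static String normalize(String url) {
        // Already has a scheme like "http://" or "https://"? Leave it alone.
        if (url.matches("^[A-Za-z][A-Za-z0-9+.-]*://.*")) {
            return url;
        }
        return "http://" + url;
    }

    public static void main(String[] args) {
        System.out.println(normalize(":19888/jobhistory/job/job_1424910897258_0004"));
    }
}
```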
[jira] [Created] (MAPREDUCE-6254) Update use of Iterator to Iterable
Ray Chiang created MAPREDUCE-6254:
-------------------------------------

Summary: Update use of Iterator to Iterable
Key: MAPREDUCE-6254
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6254
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

Found these using the IntelliJ Findbugs-IDEA plugin, which uses findbugs3.
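The kind of mechanical change this JIRA (and its sibling MAPREDUCE-6253) covers looks like the following; the methods are illustrative, not code from the patch:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Replacing explicit Iterator loops with the enhanced for statement,
// which works on any Iterable.
public class IteratorToIterable {

    // Before: explicit Iterator management.
    static String joinWithIterator(List<String> items) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            sb.append(it.next()).append(',');
        }
        return sb.toString();
    }

    // After: for-each over the Iterable; same traversal, less boilerplate,
    // and no stray Iterator variable left in scope.
    static String joinWithForEach(Iterable<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            sb.append(item).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> tasks = Arrays.asList("map_0", "map_1", "reduce_0");
        System.out.println(joinWithIterator(tasks).equals(joinWithForEach(tasks))); // true
    }
}
```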
[jira] [Created] (MAPREDUCE-6253) Update use of Iterator to Iterable
Ray Chiang created MAPREDUCE-6253:
-------------------------------------

Summary: Update use of Iterator to Iterable
Key: MAPREDUCE-6253
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6253
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

Found these using the IntelliJ Findbugs-IDEA plugin, which uses findbugs3.
[jira] [Created] (MAPREDUCE-6196) Fix BigDecimal ArithmeticException in PiEstimator
Ray Chiang created MAPREDUCE-6196:
-------------------------------------

Summary: Fix BigDecimal ArithmeticException in PiEstimator
Key: MAPREDUCE-6196
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6196
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

Certain combinations of arguments to PiEstimator cause the following exception:

java.lang.ArithmeticException: Non-terminating decimal expansion; no exact representable decimal result.
	at java.math.BigDecimal.divide(BigDecimal.java:1603)
	at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:313)
	at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

The calls to the BigDecimal methods should have some large default precision to prevent this exception.
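A minimal reproduction of the exception and the kind of fix the description suggests (the precision value here is arbitrary, not taken from the patch):

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

// BigDecimal.divide with no precision argument throws ArithmeticException for
// non-terminating expansions such as 1/3. Supplying a MathContext (or an
// explicit scale and rounding mode) bounds the result and avoids the throw.
public class BigDecimalDivide {
    public static void main(String[] args) {
        BigDecimal one = BigDecimal.ONE;
        BigDecimal three = BigDecimal.valueOf(3);

        try {
            one.divide(three); // throws: no exact decimal representation
        } catch (ArithmeticException e) {
            System.out.println("exact divide failed: " + e.getMessage());
        }

        // Fix: give the division a large explicit precision.
        BigDecimal ok = one.divide(three, new MathContext(50, RoundingMode.HALF_UP));
        System.out.println(ok);
    }
}
```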
[jira] [Created] (MAPREDUCE-6192) Create unit test to automatically compare MR related classes and mapred-default.xml
Ray Chiang created MAPREDUCE-6192:
-------------------------------------

Summary: Create unit test to automatically compare MR related classes and mapred-default.xml
Key: MAPREDUCE-6192
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6192
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor

Create a unit test that will automatically compare the fields in the various MapReduce related classes and mapred-default.xml. It should throw an error if a property is missing in either the class or the file.
[jira] [Created] (MAPREDUCE-6061) Fix MR_CLIENT_TO_AM_IPC_MAX_RETRIES_ON_TIMEOUTS property in MRJobConfig
Ray Chiang created MAPREDUCE-6061:
-------------------------------------

Summary: Fix MR_CLIENT_TO_AM_IPC_MAX_RETRIES_ON_TIMEOUTS property in MRJobConfig
Key: MAPREDUCE-6061
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6061
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial

The property MR_CLIENT_TO_AM_IPC_MAX_RETRIES_ON_TIMEOUTS is defined as:

MR_PREFIX + "yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts"

which results in the prefix part showing up twice. It should be:

MR_PREFIX + "client-am.ipc.max-retries-on-timeouts"

--
This message was sent by Atlassian JIRA
(v6.2#6252)
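The bug and fix can be shown directly. This sketch assumes MR_PREFIX is "yarn.app.mapreduce.", as in MRJobConfig; the class name is invented for the example:

```java
// Sketch of the doubled-prefix bug described above.
public class PrefixBug {
    static final String MR_PREFIX = "yarn.app.mapreduce.";

    // Buggy definition: the suffix repeats the prefix, so the resulting
    // configuration key contains "yarn.app.mapreduce." twice.
    static final String BAD =
        MR_PREFIX + "yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts";

    // Fixed definition: the suffix alone is concatenated onto the prefix.
    static final String GOOD = MR_PREFIX + "client-am.ipc.max-retries-on-timeouts";

    public static void main(String[] args) {
        System.out.println("bad:  " + BAD);
        System.out.println("good: " + GOOD);
    }
}
```

A practical consequence of such bugs: the doubled key never matches what users set in mapred-site.xml, so the configured value is silently ignored.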
[jira] [Created] (MAPREDUCE-6057) Remove obsolete entries from mapred-default.xml
Ray Chiang created MAPREDUCE-6057:
-------------------------------------

Summary: Remove obsolete entries from mapred-default.xml
Key: MAPREDUCE-6057
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6057
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Priority: Minor

The following properties are defined in mapred-default.xml but no longer exist in MRJobConfig:

map.sort.class
mapred.child.env
mapred.child.java.opts
mapreduce.app-submission.cross-platform
mapreduce.client.completion.pollinterval
mapreduce.client.output.filter
mapreduce.client.progressmonitor.pollinterval
mapreduce.client.submit.file.replication
mapreduce.cluster.acls.enabled
mapreduce.cluster.local.dir
mapreduce.framework.name
mapreduce.ifile.readahead
mapreduce.ifile.readahead.bytes
mapreduce.input.fileinputformat.list-status.num-threads
mapreduce.input.fileinputformat.split.minsize
mapreduce.input.lineinputformat.linespermap
mapreduce.job.counters.limit
mapreduce.job.max.split.locations
mapreduce.job.reduce.shuffle.consumer.plugin.class
mapreduce.jobhistory.address
mapreduce.jobhistory.admin.acl
mapreduce.jobhistory.admin.address
mapreduce.jobhistory.cleaner.enable
mapreduce.jobhistory.cleaner.interval-ms
mapreduce.jobhistory.client.thread-count
mapreduce.jobhistory.datestring.cache.size
mapreduce.jobhistory.done-dir
mapreduce.jobhistory.http.policy
mapreduce.jobhistory.intermediate-done-dir
mapreduce.jobhistory.joblist.cache.size
mapreduce.jobhistory.keytab
mapreduce.jobhistory.loadedjobs.cache.size
mapreduce.jobhistory.max-age-ms
mapreduce.jobhistory.minicluster.fixed.ports
mapreduce.jobhistory.move.interval-ms
mapreduce.jobhistory.move.thread-count
mapreduce.jobhistory.principal
mapreduce.jobhistory.recovery.enable
mapreduce.jobhistory.recovery.store.class
mapreduce.jobhistory.recovery.store.fs.uri
mapreduce.jobhistory.store.class
mapreduce.jobhistory.webapp.address
mapreduce.local.clientfactory.class.name
mapreduce.map.skip.proc.count.autoincr
mapreduce.output.fileoutputformat.compress
mapreduce.output.fileoutputformat.compress.codec
mapreduce.output.fileoutputformat.compress.type
mapreduce.reduce.skip.proc.count.autoincr
mapreduce.shuffle.connection-keep-alive.enable
mapreduce.shuffle.connection-keep-alive.timeout
mapreduce.shuffle.max.connections
mapreduce.shuffle.max.threads
mapreduce.shuffle.port
mapreduce.shuffle.ssl.enabled
mapreduce.shuffle.ssl.file.buffer.size
mapreduce.shuffle.transfer.buffer.size
mapreduce.shuffle.transferTo.allowed
yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts

Submitting this bug for comment/feedback about which properties should be kept in mapred-default.xml.
[jira] [Created] (MAPREDUCE-6051) Fix typos in log messages
Ray Chiang created MAPREDUCE-6051:
-------------------------------------

Summary: Fix typos in log messages
Key: MAPREDUCE-6051
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6051
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial

There are a bunch of typos in log messages. HADOOP-10946 was initially created, but may have failed due to spanning multiple components. Try fixing typos on a per-component basis.