Re: [DISCUSS] A final minor release off branch-2?

2017-11-15 Thread Ray Chiang

On 11/15/17 10:34 AM, Andrew Wang wrote:

Hi Junping,

On Wed, Nov 15, 2017 at 1:37 AM, Junping Du  wrote:

3. Besides incompatibilities, it is also possible to have performance
regressions (lower throughput, higher latency, slower job runs, bigger
memory footprint or even memory leaks, etc.) in new Hadoop releases.
While the performance impact of migration (if any) could be negligible to
some users, other users could be very sensitive and wish to roll back if it
happens on their production cluster.

Yes, bugs exist. I won't claim that 3.0.0 is bug-free. All new releases
can potentially introduce new bugs.

However, I don't think rollback is the solution. In my experience, users
rarely roll back since it's so disruptive and causes data loss. It's much
more common that they patch and upgrade. With that in mind, I'd rather we
spend our effort on making 3.0.x high-quality vs. making it easier to
roll back.

The root of my concern in announcing a "bridge release" is that it
discourages users from upgrading to 3.0.0 until a bridge release is out. I
strongly believe the level of quality provided by 3.0.0 is at least equal
to new 2.x minor releases, given our extended testing and integration
process, and we don't have bridge releases for 2.x.

This is why I asked for a list of known issues with 2.x -> 3.0 upgrades,
that would necessitate a bridge release. Arun raised a concern about NM
rollback. Are there any other *known* issues?



While going over the JACC report as part of YARN-6142, I filed 
HADOOP-14534, MAPREDUCE-6902, and YARN-6717 to document the major issues 
that I ran across.  I think we found one or two other JIRAs which we 
marked as incompatible as part of this investigation.


The protobuf changes should be forward compatible going from 2.8.0 to 3.0.0.

YARN-6798 should fix the NM state store versioning when upgrading from
2.9.0 to 3.0.0.  2.8.0 to 3.0.0 could have an issue if the relevant
features are enabled (queued containers, work-preserving NM restart
w/AMRMProxy).


-Ray


-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: Branch merges and 3.0.0-beta1 scope

2017-08-22 Thread Ray Chiang

On 8/22/17 3:20 AM, Steve Loughran wrote:


On 21 Aug 2017, at 22:22, Vinod Kumar Vavilapalli  wrote:

Steve,

You can be strict & ruthless about the timelines. Anything that doesn’t get in 
by mid-September, as was originally planned, can move to the next release - whether 
it is feature work on branches or feature work on trunk.

The problem I see here is that code & branches being worked on for a year are
now (apparently) close to being done and we are telling them to hold for 7 more
months - this is not a reasonable ask.

If you are advocating for a 3.1 plan, I’m sure one of these branch ‘owners’ can 
volunteer. But this is how you get competing releases and split bandwidth.

As for compatibility / testing etc, it seems like there is a belief that the 
current ‘scoped’ features are all tested well in these areas and so adding more 
is going to hurt the release. There is no way this is the reality, trunk has so 
many features that have been landing for years, the only way we can 
collectively attempt towards making this stable is by getting as many parties 
together as possible, each verifying stuff that they need. Not by excluding 
specific features.


If everyone is confident & it's coming together, it does make sense. I think
those of us (myself included) who are merging stuff in do have to recognise that we
really need to follow it through by being responsive to any problem - and with the
release manager having the right to pull things out if it's felt to be significantly
threatening the stability of the final 3.0 release.

I think we should also consider making the 3.0 beta the feature freeze; after that fixes
on the features go in, but nothing else of significance, otherwise the value of the beta
"test this code more broadly" is diminished.

At this point, there have been three planned alphas from September 2016
until July 2017 to "get in features".  While a couple of upcoming 
features are "a few weeks" away, I think all of us are aware how 
predictable software development schedules can be.  I think we can also 
all agree that rushing just to meet a release deadline isn't the best 
practice when it comes to software development either.


Andrew has been very clear about his goals at each step and I think 
Wangda's willingness to not rush in resource types was an appropriate 
response.  I'm sympathetic to the goals of getting in a feature for 3.0, 
but it might be a good idea for each project that is a "few weeks away" 
to seriously look at the readiness compared to the features which have 
been testing for 6+ months already.


-Ray





[jira] [Created] (MAPREDUCE-6902) [Umbrella] API related cleanup for Hadoop 3

2017-06-16 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6902:
-

 Summary: [Umbrella] API related cleanup for Hadoop 3
 Key: MAPREDUCE-6902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6902
 Project: Hadoop Map/Reduce
  Issue Type: Task
Reporter: Ray Chiang
Assignee: Ray Chiang


Creating this umbrella JIRA for tracking various API related issues that need 
to be properly tracked, adjusted, or documented before Hadoop 3 release.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Created] (MAPREDUCE-6901) Remove DistributedCache from trunk

2017-06-16 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6901:
-

 Summary: Remove DistributedCache from trunk
 Key: MAPREDUCE-6901
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6901
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: distributed-cache
Affects Versions: 3.0.0-alpha3
Reporter: Ray Chiang
Priority: Critical


Doing this as part of Hadoop 3 cleanup.

DistributedCache has been marked as deprecated for so long that the change
that deprecated it isn't in Git.

I don't really have a preference for whether we remove it or not, but I'd like
to have a discussion and have it properly documented as a release note for
Hadoop 3 before we hit final release.  At the very least we can have a Release
Note that will sum up whatever discussion we have here.






[jira] [Resolved] (MAPREDUCE-6685) LocalDistributedCacheManager can have overlapping filenames

2016-05-31 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang resolved MAPREDUCE-6685.
---
  Resolution: Duplicate
Target Version/s:   (was: )

Properly closing as duplicate.

> LocalDistributedCacheManager can have overlapping filenames
> ---
>
> Key: MAPREDUCE-6685
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6685
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha1
>    Reporter: Ray Chiang
>    Assignee: Ray Chiang
> Attachments: MAPREDUCE-6685.001.patch, MAPREDUCE-6685.002.patch
>
>
> LocalDistributedCacheManager has this setup:
> bq. AtomicLong uniqueNumberGenerator = new 
> AtomicLong(System.currentTimeMillis());
> to create this temporary filename:
> bq. new FSDownload(localFSFileContext, ugi, conf, new Path(destPath,  
> Long.toString(uniqueNumberGenerator.incrementAndGet())), resource);
> when using LocalJobRunner.  When two or more start on the same machine, then 
> it's possible to end up having the same timestamp or a large enough overlap 
> that two successive timestamps may not be sufficiently far apart.
> Given the assumptions:
> 1) Assume timestamp is the same. Then the most common starting random seed 
> will be the same.
> 2) Process ID will very likely be unique, but will likely be close in value.
> 3) Thread ID is not guaranteed to be unique.
> A unique ID based on PID as a seed (in addition to the timestamp) should be a 
> better unique identifier for temporary filenames.






[jira] [Reopened] (MAPREDUCE-6685) LocalDistributedCacheManager can have overlapping filenames

2016-05-31 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang reopened MAPREDUCE-6685:
---







[jira] [Created] (MAPREDUCE-6685) LocalDistributedCacheManager can have overlapping filenames

2016-04-26 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6685:
-

 Summary: LocalDistributedCacheManager can have overlapping 
filenames
 Key: MAPREDUCE-6685
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6685
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ray Chiang
Assignee: Ray Chiang


LocalDistributedCacheManager has this setup:

bq. AtomicLong uniqueNumberGenerator = new AtomicLong(System.currentTimeMillis());

to create this temporary filename:

bq. new FSDownload(localFSFileContext, ugi, conf, new Path(destPath, Long.toString(uniqueNumberGenerator.incrementAndGet())), resource);

when using LocalJobRunner.  When two or more LocalJobRunner instances start on
the same machine, it's possible for them to get the same timestamp, or
timestamps close enough together that the incrementing counters overlap.

Given the assumptions:

1) Assume timestamp is the same. Then the most common starting random seed will 
be the same.
2) Process ID will very likely be unique, but will likely be close in value.
3) Thread ID is not guaranteed to be unique.

A unique ID based on PID as a seed (in addition to the timestamp) should be a 
better unique identifier for temporary filenames.
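
The suggestion above can be sketched roughly as follows. This is only an
illustration, not the patch attached to this JIRA: the class TempNameGenerator
and its methods are hypothetical names, and ProcessHandle requires Java 9 or
later. The PID is used as a prefix rather than mixed into the counter, so two
processes that start in the same millisecond still produce different names.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not part of Hadoop. */
final class TempNameGenerator {
  private final String prefix;
  private final AtomicLong counter;

  /** pid and startMillis are parameters so the behaviour is easy to test. */
  TempNameGenerator(long pid, long startMillis) {
    this.prefix = Long.toString(pid) + "_";
    this.counter = new AtomicLong(startMillis);
  }

  /** Builds a generator for the current JVM (assumes Java 9+). */
  static TempNameGenerator forCurrentProcess() {
    return new TempNameGenerator(ProcessHandle.current().pid(),
        System.currentTimeMillis());
  }

  /** Returns "<pid>_<counter>", unique across processes with distinct PIDs. */
  String next() {
    return prefix + counter.incrementAndGet();
  }
}
```

With the same starting timestamp, generators for PIDs 100 and 101 yield
"100_1001" and "101_1001" respectively, so the names no longer collide.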





[jira] [Created] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit

2016-01-28 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6622:
-

 Summary: Add capability to set JHS job cache to a task-based limit
 Key: MAPREDUCE-6622
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.7.2
Reporter: Ray Chiang
Assignee: Ray Chiang


The property mapreduce.jobhistory.loadedjobs.cache.size limits the cache by
number of jobs, but the jobs can be of varying size.  This is generally not a
problem when the job sizes are uniform or small, but when the job sizes can be
very large (say greater than 250k tasks), the JHS heap size can grow
tremendously.

In cases where multiple jobs are very large, the JHS can lock up and spend all
its time in GC.  However, since the cache is holding on to all the jobs, not
much heap space can be freed up.

Since the total number of tasks loaded is directly proportional to the amount
of heap used, a property that caps the number of tasks allowed in the cache
should help prevent the JHS from locking up.
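
One way to picture a task-based limit is an LRU cache that tracks the total
number of tasks it holds and evicts the least recently used jobs once a cap is
exceeded. The sketch below is only an assumption about how this could look; the
class TaskBoundedJobCache, its methods, and the job IDs are made up for
illustration and are not the JHS implementation.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: an LRU cache bounded by total task count. */
final class TaskBoundedJobCache {
  private final long maxTasks;
  private long totalTasks = 0;
  // accessOrder=true makes iteration order least-recently-used first.
  private final LinkedHashMap<String, Integer> taskCounts =
      new LinkedHashMap<>(16, 0.75f, true);

  TaskBoundedJobCache(long maxTasks) {
    this.maxTasks = maxTasks;
  }

  /** Records a loaded job, then evicts old jobs until the cap is respected. */
  synchronized void put(String jobId, int numTasks) {
    Integer previous = taskCounts.put(jobId, numTasks);
    if (previous != null) {
      totalTasks -= previous;
    }
    totalTasks += numTasks;
    Iterator<Map.Entry<String, Integer>> it = taskCounts.entrySet().iterator();
    // The newest job is always kept, even if it alone exceeds the cap.
    while (totalTasks > maxTasks && taskCounts.size() > 1) {
      Map.Entry<String, Integer> eldest = it.next();
      totalTasks -= eldest.getValue();
      it.remove();
    }
  }

  synchronized boolean contains(String jobId) {
    return taskCounts.containsKey(jobId);
  }

  synchronized long totalTasks() {
    return totalTasks;
  }
}
```

With a cap of 300,000 tasks, loading a 250k-task job followed by a 100k-task
job evicts the first one, whereas a job-count limit of 2 would have kept both
in the heap.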






[jira] [Created] (MAPREDUCE-6613) Change mapreduce.jobhistory.jhist.format default from json to binary

2016-01-21 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6613:
-

 Summary: Change mapreduce.jobhistory.jhist.format default from 
json to binary
 Key: MAPREDUCE-6613
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6613
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Affects Versions: 2.8.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


MAPREDUCE-6376 added a configuration setting to set up .jhist internal format:

mapreduce.jobhistory.jhist.format

Currently, the default is "json".  Changing the default to "binary" allows
faster parsing, but with the downside of making the file no longer
human-readable when viewed with "hadoop fs -cat".





Re: Newcomer developer seeking to contribute

2015-07-13 Thread Ray Chiang
Welcome aboard.

I'd recommend reading the HowToContribute wiki for understanding the
contribution process.

For other tips on getting started, there is the "newbie" label in JIRA, but
it's an imperfect sorting.  I recommend doing the following:

1) Look at unit test JIRAs, both open and closed.  Tons of benefits:
they'll help you understand some part of the API, some could use some
comments and enhancements, etc.
2) Pick a component (ResourceManager, NodeManager, MR APIs, DataNode,
NameNode, etc.) and focus on that to start.  It's easier than trying to
understand it all at once.
3) Participate in some code reviews.  It will be non-binding, but testing
out other code can help you understand some part.

-Ray


On Mon, Jul 13, 2015 at 9:56 AM, Ajoy Bhatia  wrote:

> Hi,
>
> I am a software developer and have been working in the industry since 1991.
> I have written Java code for Map-Reduce jobs in my recent jobs. I want to
> contribute to the Hadoop project, and signed up for mapreduce-dev mailing
> list but I am open to contributing to any Hadoop project module that needs
> help.
>
> I would like to know of any beginner-level issues that I could start
> working on. I have gone through the Newcomers and Getting Started pages on
> community.apache.org. As suggested, I searched Hadoop Map/Reduce Jira for
> issues tagged "GSoC" or "mentor". That didn't help me identify something
> suitable. I would appreciate any help in getting me started as a
> contributor.
>
> Thanks...
> - Ajoy
>


[jira] [Created] (MAPREDUCE-6432) Fix typos in hadoop-mapreduce-project module

2015-07-10 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6432:
-

 Summary: Fix typos in hadoop-mapreduce-project module
 Key: MAPREDUCE-6432
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6432
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


Fix a bunch of typos in comments, strings, variable names, and method names in 
the hadoop-mapreduce-project module.





[jira] [Created] (MAPREDUCE-6421) Fix Findbugs pre-patch warning in org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.reduceNodeLabelExpression

2015-06-29 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6421:
-

 Summary: Fix Findbugs pre-patch warning in 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.reduceNodeLabelExpression
 Key: MAPREDUCE-6421
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6421
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ray Chiang


The actual error message is:

  Inconsistent synchronization of org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.reduceNodeLabelExpression; locked 66% of time

The full URL for the findbugs is at:

  
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5858/artifact/patchprocess/trunkFindbugsWarningshadoop-mapreduce-client-app.html

I haven't looked to see if this message is in error or if findbugs is correct.
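
For readers unfamiliar with this Findbugs category, the pattern it reports
generally looks like the sketch below: a field that is written while holding a
lock in some methods and read without it in others. "Locked 66% of time"
presumably means about two out of three accesses were synchronized. The class
and member names here are invented for illustration and are not taken from
RMContainerAllocator.

```java
/** Hypothetical example; not the RMContainerAllocator code. */
final class LabelHolder {
  private String label;

  // The write is guarded by the object's monitor.
  synchronized void setLabel(String newLabel) {
    this.label = newLabel;
  }

  // Unsynchronized read: the kind of access that gets flagged.
  String getLabelUnsafe() {
    return label;
  }

  // Consistent alternative: take the same lock on every access.
  synchronized String getLabel() {
    return label;
  }
}
```

Whether the real warning is a true positive depends on how the field is
actually used, which is the open question in this JIRA.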






[jira] [Created] (MAPREDUCE-6406) Update FileOutputCommitter.FILEOUTPUTCOMMITTER_ALGORITHM_VERSION_DEFAULT to match mapred-default.xml

2015-06-18 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6406:
-

 Summary: Update 
FileOutputCommitter.FILEOUTPUTCOMMITTER_ALGORITHM_VERSION_DEFAULT to match 
mapred-default.xml
 Key: MAPREDUCE-6406
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6406
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


MAPREDUCE-6336 updated the default version for the property 
mapreduce.fileoutputcommitter.algorithm.version to 2.  Should the 
FileOutputCommitter class default be updated to match?






[jira] [Created] (MAPREDUCE-6394) Avoid copying counters in task reports for Job History Server

2015-06-10 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6394:
-

 Summary: Avoid copying counters in task reports for Job History 
Server
 Key: MAPREDUCE-6394
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6394
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang


In HsTasksBlock#render(), there is a loop to create a Javascript table which 
slows down immensely.





[jira] [Created] (MAPREDUCE-6388) Remove deprecation warnings from JobHistoryServer classes

2015-06-05 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6388:
-

 Summary: Remove deprecation warnings from JobHistoryServer classes
 Key: MAPREDUCE-6388
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6388
 Project: Hadoop Map/Reduce
  Issue Type: Task
  Components: jobhistoryserver
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


There are a ton of deprecation warnings in the JobHistoryServer classes.  This 
is affecting some modifications I'm making since a single line move shifts all 
the deprecation warnings.  I'd like to get these fixed to prevent minor changes 
from generating a ton of warnings in test-patch.





[jira] [Created] (MAPREDUCE-6376) Fix long load times of .jhist file in JobHistoryServer

2015-05-28 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6376:
-

 Summary: Fix long load times of .jhist file in JobHistoryServer
 Key: MAPREDUCE-6376
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6376
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang


When you click on a Job link in the JHS Web UI, it loads the .jhist file.  For 
jobs which have a large number of tasks, the load time can break UI 
responsiveness.





New unit tests for *-default.xml verification

2015-05-12 Thread Ray Chiang
I was going to update the Wiki with the information, but it seems to be
down for me at the moment.  A few people have already bumped into some
issues, so I'm informing via the *-dev lists for now.

In trunk, there are three new unit tests:

TestMapreduceConfigFields (MAPREDUCE-6192)
TestYarnConfigurationFields (YARN-2957)
TestHdfsConfigFields (HDFS-7559)

which perform automatic comparison of properties between one of the
*-default.xml files and a set of Java classes.

As of right now, if a property is added to the .xml file, one of the
following should also be true:

1) The property is defined in the appropriate Java *Config.java class (e.g.
MRJobConfig).

2) The property is added to the xmlPropsToSkipCompare or
xmlPrefixToSkipCompare HashSet in the appropriate unit test.

This guarantees that if a .xml property is created, the corresponding
property is spelled correctly in the Java class or its exception is
documented in the unit test.
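
The comparison described above can be sketched in plain Java as follows. This
is a simplified illustration and not the code of the actual unit tests: the
classes DemoConfig and ConfigFieldChecker and the property names are
hypothetical, and only the names xmlPropsToSkipCompare and
xmlPrefixToSkipCompare come from this message.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a *Config.java class such as MRJobConfig. */
final class DemoConfig {
  public static final String FOO = "demo.foo";
  public static final String BAR = "demo.bar";
  private DemoConfig() {}
}

/** Hypothetical checker; a sketch of the XML-to-Java direction only. */
final class ConfigFieldChecker {
  /** Collects the values of all public static final String fields. */
  static Set<String> javaProperties(Class<?> clazz) {
    Set<String> names = new TreeSet<>();
    for (Field f : clazz.getDeclaredFields()) {
      int m = f.getModifiers();
      if (Modifier.isPublic(m) && Modifier.isStatic(m) && Modifier.isFinal(m)
          && f.getType() == String.class) {
        try {
          names.add((String) f.get(null));
        } catch (IllegalAccessException e) {
          throw new IllegalStateException(e);
        }
      }
    }
    return names;
  }

  /** Returns XML properties with no Java counterpart and no documented skip. */
  static Set<String> xmlOnly(Set<String> xmlProps, Set<String> javaProps,
      Set<String> xmlPropsToSkipCompare, Set<String> xmlPrefixToSkipCompare) {
    Set<String> missing = new TreeSet<>();
    for (String prop : xmlProps) {
      if (javaProps.contains(prop) || xmlPropsToSkipCompare.contains(prop)) {
        continue;
      }
      boolean skippedByPrefix = false;
      for (String prefix : xmlPrefixToSkipCompare) {
        if (prop.startsWith(prefix)) {
          skippedByPrefix = true;
          break;
        }
      }
      if (!skippedByPrefix) {
        missing.add(prop);
      }
    }
    return missing;
  }
}
```

A misspelled XML entry such as "demo.fooo" would be the only name returned by
xmlOnly, which is the situation the new tests are meant to flag.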

The following three JIRAs are open to do the reverse check (flag when a
property exists in a Java class, but not in the .xml file).

YARN-3069
MAPREDUCE-6358
HDFS-8356

The above JIRAs will take more time due to getting the description fields
properly filled out and making sure the default values are correct.  In
some cases, such as properties that would override an older property, more
exceptions will need to be added.

Once all of the above are done, I may take a more solid attempt at doing
the same for CommonConfigurationKeys/core-default.xml.

Thanks to everyone who has been reviewing, committing, and updating the
new unit tests.

-Ray


[jira] [Created] (MAPREDUCE-6358) Document missing properties in mapred-default.xml

2015-05-06 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6358:
-

 Summary: Document missing properties in mapred-default.xml
 Key: MAPREDUCE-6358
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6358
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.7.0
Reporter: Ray Chiang
Assignee: Ray Chiang


The following properties are currently not defined in mapred-default.xml. These 
properties should either be
A) documented in mapred-default.xml OR
B) listed as an exception (with comments, e.g. for internal use) in the 
TestMapreduceConfigFields unit test





[jira] [Created] (MAPREDUCE-6349) Fix typo in property org.apache.hadoop.mapreduce.lib.chain.Chain.REDUCER_INPUT_VALUE_CLASS

2015-05-01 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6349:
-

 Summary: Fix typo in property 
org.apache.hadoop.mapreduce.lib.chain.Chain.REDUCER_INPUT_VALUE_CLASS
 Key: MAPREDUCE-6349
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6349
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


Ran across this typo in a property.  It doesn't look like it's used anywhere 
externally.





Question about org.apache.hadoop.yarn.webapp.Dispatcher

2015-04-29 Thread Ray Chiang
In Dispatcher#service(), I see this comment:

  // TODO: support args converted from /path/:arg1/...
  dest.action.invoke(controller, (Object[]) null);

Right now, I've made some changes in MAPREDUCE-6222 that seem to trigger
exceptions at this TODO.  Can someone give me a clearer idea about what
sort of processing should occur at the point of this TODO?

On a related note, is there any additional/better documentation about the
various org.apache.hadoop.yarn.webapp
and org.apache.hadoop.yarn.webapp.view classes?  There are a few things I'm
trying to figure out and stepping through with a debugger gets painful at
times.

Any information is appreciated.  Thanks.

-Ray


[jira] [Created] (MAPREDUCE-6340) Remove .xml and documentation references to dfs.webhdfs.enabled

2015-04-27 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6340:
-

 Summary: Remove .xml and documentation references to 
dfs.webhdfs.enabled
 Key: MAPREDUCE-6340
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6340
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor
 Fix For: 3.0.0


HDFS-7985 removed all references in the code to the dfs.webhdfs.enabled
property.  The property should also be removed from hdfs-default.xml and any
other places that still mention it.

As mentioned in HDFS-7985, this is an incompatible change with branch-2, so 
this fix should be limited to trunk.





Re: Removal of unused properties

2015-04-09 Thread Ray Chiang
We have done this before where the properties were documented in the .xml
file, but didn't exist anywhere in the Configuration files or the rest of
the code.

HDFS-7566 (committed)
YARN-2460 (committed)
MAPREDUCE-6057 (pending)

-Ray

On Thu, Apr 9, 2015 at 4:33 AM, Akira AJISAKA 
wrote:

> Hi Folks,
>
> In MAPREDUCE-6307, I'd like to remove the unused
> "mapreduce.tasktracker.taskmemorymanager.monitoringinterval" property;
> however, the compatibility document says "Hadoop-defined properties are to
> be deprecated at least for one major release before being removed."
>
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/Compatibility.html#Hadoop_Configuration_Files
>
> Is it applicable for unused properties?
> Can we remove unused properties right now?
>
> Regards,
> Akira
>


[jira] [Created] (MAPREDUCE-6266) Job#getTrackingURL should consistently return a proper URL

2015-02-26 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6266:
-

 Summary: Job#getTrackingURL should consistently return a proper URL
 Key: MAPREDUCE-6266
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6266
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


When a job is running, Job#getTrackingURL returns a proper URL like:

http://:8088/proxy/application_1424910897258_0004/

Once a job is finished and the job has moved to the JHS, then 
Job#getTrackingURL returns a URL without the protocol like:

:19888/jobhistory/job/job_1424910897258_0004
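
To make the inconsistency concrete, the sketch below shows one way a caller
could normalize the two forms today. It is a workaround illustration only, not
the proposed fix: the class TrackingUrls is hypothetical, and the host name
used in the usage note is a placeholder because the hosts are elided in the
examples above.

```java
/** Hypothetical helper for callers; not part of the Job API. */
final class TrackingUrls {
  /** Prepends a default scheme when the given address has none. */
  static String withScheme(String url, String defaultScheme) {
    if (url == null || url.isEmpty() || url.contains("://")) {
      return url;
    }
    return defaultScheme + "://" + url;
  }
}
```

For example, withScheme("jhs.example.com:19888/jobhistory/job/job_1424910897258_0004", "http")
returns the same address prefixed with "http://", while an address that
already carries a scheme is returned unchanged.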






[jira] [Created] (MAPREDUCE-6254) Update use of Iterator to Iterable

2015-02-11 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6254:
-

 Summary: Update use of Iterator to Iterable
 Key: MAPREDUCE-6254
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6254
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


Found these using the IntelliJ Findbugs-IDEA plugin, which uses findbugs3.





[jira] [Created] (MAPREDUCE-6253) Update use of Iterator to Iterable

2015-02-11 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6253:
-

 Summary: Update use of Iterator to Iterable
 Key: MAPREDUCE-6253
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6253
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


Found these using the IntelliJ Findbugs-IDEA plugin, which uses findbugs3.





[jira] [Created] (MAPREDUCE-6196) Fix BigDecimal ArithmeticException in PiEstimator

2014-12-15 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6196:
-

 Summary: Fix BigDecimal ArithmeticException in PiEstimator
 Key: MAPREDUCE-6196
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6196
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


Certain combinations of arguments to PiEstimator cause the following exception:

java.lang.ArithmeticException: Non-terminating decimal expansion; no exact representable decimal result.
    at java.math.BigDecimal.divide(BigDecimal.java:1603)
    at org.apache.hadoop.examples.PiEstimator.estimate(PiEstimator.java:313)
    at org.apache.hadoop.examples.PiEstimator.run(PiEstimator.java:342)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.PiEstimator.main(PiEstimator.java:351)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

The calls to the BigDecimal methods should have some large default precision to 
prevent this exception.
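
The behaviour can be reproduced outside Hadoop with a few lines. The sketch
below is not the PiEstimator code; the class BigDecimalDivideDemo and the
precision value are illustrative assumptions. It shows that the one-argument
BigDecimal.divide throws when the quotient has no finite decimal expansion,
while the overload taking a MathContext rounds to a fixed precision instead.

```java
import java.math.BigDecimal;
import java.math.MathContext;

/** Hypothetical demo class; not taken from PiEstimator. */
final class BigDecimalDivideDemo {
  /** Exact division: throws ArithmeticException for non-terminating quotients. */
  static BigDecimal exact(BigDecimal a, BigDecimal b) {
    return a.divide(b);
  }

  /** Division rounded to the given number of significant digits. */
  static BigDecimal bounded(BigDecimal a, BigDecimal b, int precision) {
    return a.divide(b, new MathContext(precision));
  }
}
```

Here exact(1, 3) throws the ArithmeticException shown above, while
bounded(1, 3, 5) returns 0.33333.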





[jira] [Created] (MAPREDUCE-6192) Create unit test to automatically compare MR related classes and mapred-default.xml

2014-12-12 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6192:
-

 Summary: Create unit test to automatically compare MR related 
classes and mapred-default.xml
 Key: MAPREDUCE-6192
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6192
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor


Create a unit test that will automatically compare the fields in the various 
MapReduce related classes and mapred-default.xml. It should throw an error if a 
property is missing in either the class or the file.





[jira] [Created] (MAPREDUCE-6061) Fix MR_CLIENT_TO_AM_IPC_MAX_RETRIES_ON_TIMEOUTS property in MRJobConfig

2014-08-28 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6061:
-

 Summary: Fix MR_CLIENT_TO_AM_IPC_MAX_RETRIES_ON_TIMEOUTS property 
in MRJobConfig
 Key: MAPREDUCE-6061
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6061
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial


The property MR_CLIENT_TO_AM_IPC_MAX_RETRIES_ON_TIMEOUTS is defined as:

  MR_PREFIX + "yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts"

which results in the prefix part showing up twice.  It should be

  MR_PREFIX + "client-am.ipc.max-retries-on-timeouts"
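
Spelled out as string concatenation, the problem looks like this. The value of
MR_PREFIX below is an assumption inferred from the description (the prefix part
showing up twice); the class PrefixDemo is hypothetical and only mirrors the
two definitions quoted above.

```java
/** Hypothetical demo class mirroring the two definitions above. */
final class PrefixDemo {
  // Assumed value, inferred from the description in this JIRA.
  static final String MR_PREFIX = "yarn.app.mapreduce.";

  // Current definition: the prefix appears twice in the resulting key.
  static final String BROKEN =
      MR_PREFIX + "yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts";

  // Proposed definition: the prefix appears once.
  static final String FIXED =
      MR_PREFIX + "client-am.ipc.max-retries-on-timeouts";
}
```

Under that assumption, BROKEN evaluates to
"yarn.app.mapreduce.yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts",
so a value set under the single-prefix name would not be found through this
constant.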






[jira] [Created] (MAPREDUCE-6057) Remove obsolete entries from mapred-default.xml

2014-08-27 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6057:
-

 Summary: Remove obsolete entries from mapred-default.xml
 Key: MAPREDUCE-6057
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6057
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Priority: Minor


The following properties are defined in mapred-default.xml but no longer exist 
in MRJobConfig.

  map.sort.class
  mapred.child.env
  mapred.child.java.opts
  mapreduce.app-submission.cross-platform
  mapreduce.client.completion.pollinterval
  mapreduce.client.output.filter
  mapreduce.client.progressmonitor.pollinterval
  mapreduce.client.submit.file.replication
  mapreduce.cluster.acls.enabled
  mapreduce.cluster.local.dir
  mapreduce.framework.name
  mapreduce.ifile.readahead
  mapreduce.ifile.readahead.bytes
  mapreduce.input.fileinputformat.list-status.num-threads
  mapreduce.input.fileinputformat.split.minsize
  mapreduce.input.lineinputformat.linespermap
  mapreduce.job.counters.limit
  mapreduce.job.max.split.locations
  mapreduce.job.reduce.shuffle.consumer.plugin.class
  mapreduce.jobhistory.address
  mapreduce.jobhistory.admin.acl
  mapreduce.jobhistory.admin.address
  mapreduce.jobhistory.cleaner.enable
  mapreduce.jobhistory.cleaner.interval-ms
  mapreduce.jobhistory.client.thread-count
  mapreduce.jobhistory.datestring.cache.size
  mapreduce.jobhistory.done-dir
  mapreduce.jobhistory.http.policy
  mapreduce.jobhistory.intermediate-done-dir
  mapreduce.jobhistory.joblist.cache.size
  mapreduce.jobhistory.keytab
  mapreduce.jobhistory.loadedjobs.cache.size
  mapreduce.jobhistory.max-age-ms
  mapreduce.jobhistory.minicluster.fixed.ports
  mapreduce.jobhistory.move.interval-ms
  mapreduce.jobhistory.move.thread-count
  mapreduce.jobhistory.principal
  mapreduce.jobhistory.recovery.enable
  mapreduce.jobhistory.recovery.store.class
  mapreduce.jobhistory.recovery.store.fs.uri
  mapreduce.jobhistory.store.class
  mapreduce.jobhistory.webapp.address
  mapreduce.local.clientfactory.class.name
  mapreduce.map.skip.proc.count.autoincr
  mapreduce.output.fileoutputformat.compress
  mapreduce.output.fileoutputformat.compress.codec
  mapreduce.output.fileoutputformat.compress.type
  mapreduce.reduce.skip.proc.count.autoincr
  mapreduce.shuffle.connection-keep-alive.enable
  mapreduce.shuffle.connection-keep-alive.timeout
  mapreduce.shuffle.max.connections
  mapreduce.shuffle.max.threads
  mapreduce.shuffle.port
  mapreduce.shuffle.ssl.enabled
  mapreduce.shuffle.ssl.file.buffer.size
  mapreduce.shuffle.transfer.buffer.size
  mapreduce.shuffle.transferTo.allowed
  yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts

Submitting bug for comment/feedback about which properties should be kept in 
mapred-default.xml.





[jira] [Created] (MAPREDUCE-6051) Fix typos in log messages

2014-08-25 Thread Ray Chiang (JIRA)
Ray Chiang created MAPREDUCE-6051:
-

 Summary: Fix typos in log messages
 Key: MAPREDUCE-6051
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6051
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Trivial


There are a bunch of typos in log messages. HADOOP-10946 was initially created, 
but may have failed due to being in multiple components. Try fixing typos on a 
per-component basis.


