Re: Planning Hadoop 2.6.1 release
+YARN-3011: without it, the NM is likely to crash and cannot come back with recovery enabled. Thanks, Zhijie From: Wangda Tan wheele...@gmail.com Sent: Wednesday, August 05, 2015 10:42 AM To: yarn-...@hadoop.apache.org Cc: mapreduce-...@hadoop.apache.org; common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org Subject: Re: Planning Hadoop 2.6.1 release Can we add the following two fixes to 2.6.1? https://issues.apache.org/jira/browse/YARN-2922 and https://issues.apache.org/jira/browse/YARN-3487. They're not fatal issues, but they can cause lots of problems in a large cluster. Thanks, Wangda On Mon, Aug 3, 2015 at 1:21 PM, Sangjin Lee sj...@apache.org wrote: See my later update in the thread. HDFS-7704 is in the list. Thanks, Sangjin On Mon, Aug 3, 2015 at 1:19 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Makes sense, it was caused by HDFS-7704 which got into 2.7.0 only and is not part of the candidate list. Removed HDFS-7916 from the list. Thanks +Vinod On Jul 24, 2015, at 6:32 PM, Sangjin Lee sj...@apache.org wrote: Out of the JIRAs we proposed, please remove HDFS-7916. I don't think it applies to 2.6. Thanks, Sangjin
[jira] [Resolved] (HADOOP-8392) Add YARN audit logging to log4j.properties
[ https://issues.apache.org/jira/browse/HADOOP-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen resolved HADOOP-8392. Resolution: Duplicate This sounds like a YARN issue. A YARN JIRA already exists for it: YARN-2255. Closing this JIRA as a duplicate of that one. Add YARN audit logging to log4j.properties -- Key: HADOOP-8392 URL: https://issues.apache.org/jira/browse/HADOOP-8392 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Eli Collins MAPREDUCE-2655 added MR/NM audit logging but it's not hooked up into log4j.properties or the bin and env scripts like the other audit logs, so you have to modify the deployed binary to change them. Let's add the relevant plumbing that the other audit loggers have, and update log4j.properties with a sample configuration that's disabled by default, e.g. see [this comment|https://issues.apache.org/jira/browse/MAPREDUCE-2655?focusedCommentId=13084191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13084191]. Also, it looks like mapred.AuditLogger and its plumbing can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Release Apache Hadoop 2.7.0 RC0
+1 (Binding) + Downloaded the source tarball + Built the source with Java 7 successfully + Deployed a single node insecure cluster, and ran MR example jobs and DS jobs successfully, MR AM didn't hang + RM and TS webUI showed the apps' information correctly Thanks, Zhijie From: Wangda Tan wheele...@gmail.com Sent: Friday, April 10, 2015 5:25 PM To: yarn-...@hadoop.apache.org Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; Vinod Kumar Vavilapalli Subject: Re: [VOTE] Release Apache Hadoop 2.7.0 RC0 +1 (Non-binding) Built from src, deployed a single node cluster, and tried to run some MR jobs. On Fri, Apr 10, 2015 at 4:44 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I've created a release candidate RC0 for Apache Hadoop 2.7.0. The RC is available at: http://people.apache.org/~vinodkv/hadoop-2.7.0-RC0/ The RC tag in git is: release-2.7.0-RC0 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1017/ As discussed before - This release will only work with JDK 1.7 and above - I’d like to use this as a starting release for 2.7.x [1], depending on how it goes, get it stabilized and potentially use a 2.7.1 in a few weeks as the stable release. Please try the release and vote; the vote will run for the usual 5 days. Thanks, Vinod [1]: A 2.7.1 release to follow up 2.7.0 http://markmail.org/thread/zwzze6cqqgwq4rmw
Re: A 2.7.1 release to follow up 2.7.0
+1 for roll out 2.7.0 soon and continuing stabilization in 2.7.1. Agree with Karthik, it's better to exclude all improvements unless it turns out to blocking something. In terms of jdiff, we have done the compatibility check for quite a while in branch-2. Do we want to back port it to (some of) early releases? Thanks Zhijie From: Steve Loughran ste...@hortonworks.com Sent: Thursday, April 09, 2015 12:56 PM To: yarn-...@hadoop.apache.org Cc: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org; Vinod Kumar Vavilapalli Subject: Re: A 2.7.1 release to follow up 2.7.0 There's a couple of S3a fixes coming along which could go into a 2.7.1; they've been held back to avoid rushing them in to 2.7.0 last-minute. On 9 Apr 2015, at 20:33, Junping Du j...@hortonworks.com wrote: +1 (non-binding). The plan sounds reasonable. We should make our release train more fast-moving, and predictable - it could benefit our community and ecosystem in many aspects. Thanks, Junping From: Arpit Agarwal aagar...@hortonworks.com Sent: Thursday, April 09, 2015 8:23 PM To: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org Cc: Vinod Kumar Vavilapalli Subject: Re: A 2.7.1 release to follow up 2.7.0 +1 for 2.7.1 and +1 for promoting it to 'stable', assuming it includes no new features or gratuitous improvements. Arpit On 4/9/15, 11:48 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I feel like we haven't done a great job of maintaining the previous 2.x releases. Seeing as how long 2.7.0 release has taken, I am sure we will spend more time stabilizing it, fixing issues etc. I propose that we immediately follow up 2.7.0 with a 2.7.1 within 2-3 weeks. The focus obviously is to have blocker issues, bug-fixes and *no* features. Improvements are going to be slightly hard to reason about, but I propose limiting ourselves to very small improvements, if at all. 
The other area of concern with the previous releases had been compatibility. With help from Li Lu, I got jdiff reinstated in branch-2 (though patches are not yet in), and did a pass. In the unavoidable event that we find incompatibilities with 2.7.0, we can fix those in 2.7.1 and promote that to be the stable release. Thoughts? Thanks,+Vinod
Re: A 2.7.1 release to follow up 2.7.0
I meant we *haven't* done the compatibility check. From: Zhijie Shen zs...@hortonworks.com Sent: Thursday, April 09, 2015 2:00 PM To: yarn-...@hadoop.apache.org; common-dev@hadoop.apache.org Cc: hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; Vinod Kumar Vavilapalli Subject: Re: A 2.7.1 release to follow up 2.7.0 +1 for roll out 2.7.0 soon and continuing stabilization in 2.7.1. Agree with Karthik, it's better to exclude all improvements unless it turns out to blocking something. In terms of jdiff, we have done the compatibility check for quite a while in branch-2. Do we want to back port it to (some of) early releases? Thanks Zhijie From: Steve Loughran ste...@hortonworks.com Sent: Thursday, April 09, 2015 12:56 PM To: yarn-...@hadoop.apache.org Cc: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org; mapreduce-...@hadoop.apache.org; Vinod Kumar Vavilapalli Subject: Re: A 2.7.1 release to follow up 2.7.0 There's a couple of S3a fixes coming along which could go into a 2.7.1; they've been held back to avoid rushing them in to 2.7.0 last-minute. On 9 Apr 2015, at 20:33, Junping Du j...@hortonworks.com wrote: +1 (non-binding). The plan sounds reasonable. We should make our release train more fast-moving, and predictable - it could benefit our community and ecosystem in many aspects. Thanks, Junping From: Arpit Agarwal aagar...@hortonworks.com Sent: Thursday, April 09, 2015 8:23 PM To: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org Cc: Vinod Kumar Vavilapalli Subject: Re: A 2.7.1 release to follow up 2.7.0 +1 for 2.7.1 and +1 for promoting it to 'stable', assuming it includes no new features or gratuitous improvements. Arpit On 4/9/15, 11:48 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Hi all, I feel like we haven't done a great job of maintaining the previous 2.x releases. 
Seeing as how long 2.7.0 release has taken, I am sure we will spend more time stabilizing it, fixing issues etc. I propose that we immediately follow up 2.7.0 with a 2.7.1 within 2-3 weeks. The focus obviously is to have blocker issues, bug-fixes and *no* features. Improvements are going to be slightly hard to reason about, but I propose limiting ourselves to very small improvements, if at all. The other area of concern with the previous releases had been compatibility. With help from Li Lu, I got jdiff reinstated in branch-2 (though patches are not yet in), and did a pass. In the unavoidable event that we find incompatibilities with 2.7.0, we can fix those in 2.7.1 and promote that to be the stable release. Thoughts? Thanks,+Vinod
Re: 2.7 status
Can we include YARN-3273 into 2.7? YARN-3430 needs it to fix the RM web UI bug. Thanks, Zhijie From: Vinod Kumar Vavilapalli vino...@hortonworks.com Sent: Monday, March 30, 2015 3:35 PM To: mapreduce-...@hadoop.apache.org Cc: yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; Hadoop Common Subject: Re: 2.7 status This is the filter I am using: https://issues.apache.org/jira/issues/?filter=12330598 +Vinod On Mar 25, 2015, at 11:47 AM, Konstantin Shvachko shv.had...@gmail.com wrote: Progress is good! What are the four blockers? Could you please mark them as such in the Jira. Thanks, --Konst On Wed, Mar 25, 2015 at 9:53 AM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Progress has been really slow, but now we are down to four blockers across the board. I plan to roll an RC this weekend. Thanks, +Vinod On Mar 8, 2015, at 8:40 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: 2.7 branch created and branch-2 updated to point to 2.8-SNAPSHOT. Committers, please exercise caution on the content going into branch-2.7. I'd take help everyone's help in pushing through the remaining set of blockers targeted for 2.7. Thanks, +Vinod On Mar 8, 2015, at 8:21 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Branching 2.7 now. Request holding off commits to branch-2 to avoid commit race. Will send an all-clear in the next 30 mins once I am done. Thanks +Vinod On Mar 5, 2015, at 1:56 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: The 2.7 blocker JIRA went down and going back up again, we will need to converge. Unless I see objections, I plan to cut a branch this weekend and selectively filter stuff in after that in the interest of convergence. Thoughts welcome! Thanks, +Vinod On Mar 1, 2015, at 11:58 AM, Arun Murthy a...@hortonworks.com wrote: Sounds good, thanks for the help Vinod! Arun From: Vinod Kumar Vavilapalli Sent: Sunday, March 01, 2015 11:43 AM To: Hadoop Common; Jason Lowe; Arun Murthy Subject: Re: 2.7 status Agreed. 
How about we roll an RC end of this week? As a Java 7+ release with features, patches that already got in? Here's a filter tracking blocker tickets - https://issues.apache.org/jira/issues/?filter=12330598. Nine open now. +Arun Arun, I'd like to help get 2.7 out without further delay. Do you mind me taking over release duties? Thanks, +Vinod From: Jason Lowe jl...@yahoo-inc.com.INVALID Sent: Friday, February 13, 2015 8:11 AM To: common-dev@hadoop.apache.org Subject: Re: 2.7 status I'd like to see a 2.7 release sooner than later. It has been almost 3 months since Hadoop 2.6 was released, and there have already been 634 JIRAs committed to 2.7. That's a lot of changes waiting for an official release. https://issues.apache.org/jira/issues/?jql=project%20in%20%28hadoop%2Chdfs%2Cyarn%2Cmapreduce%29%20AND%20fixversion%3D2.7.0%20AND%20resolution%3DFixed Jason From: Sangjin Lee sj...@apache.org To: common-dev@hadoop.apache.org common-dev@hadoop.apache.org Sent: Tuesday, February 10, 2015 1:30 PM Subject: 2.7 status Folks, What is the current status of the 2.7 release? I know initially it started out as a java-7 only release, but looking at the JIRAs that is very much not the case. Do we have a certain timeframe for 2.7 or is it time to discuss it? Thanks, Sangjin
[jira] [Created] (HADOOP-11749) HttpServer2 thread pool is set to daemon
Zhijie Shen created HADOOP-11749: Summary: HttpServer2 thread pool is set to daemon Key: HADOOP-11749 URL: https://issues.apache.org/jira/browse/HADOOP-11749 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen In many cases this is not a problem, because the RPC protocol will block the process from exiting. However, if the process only has a web server, then since the thread pool is set to daemon, the process will exit immediately after starting it instead of staying up and listening for incoming requests. It's possible for us to work around this by blocking the main thread, but I'm wondering if we can resolve it within HttpServer2.
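The behaviour described in this report follows from how the JVM treats daemon threads: the process exits once only daemon threads remain. The following is a minimal sketch using plain java.lang and java.util.concurrent, not HttpServer2 itself; the class DaemonPoolSketch, its factory method and the thread name are hypothetical names chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadFactory;

// Hypothetical illustration only; this is not HttpServer2 code.
class DaemonPoolSketch {

  /** Builds a thread factory whose threads are daemon or non-daemon. */
  static ThreadFactory factory(final boolean daemon) {
    return new ThreadFactory() {
      @Override
      public Thread newThread(Runnable r) {
        Thread t = new Thread(r, "web-worker");
        t.setDaemon(daemon);
        return t;
      }
    };
  }

  public static void main(String[] args) {
    // Stands in for a web server's accept loop that never finishes.
    final CountDownLatch never = new CountDownLatch(1);
    Thread worker = factory(true).newThread(new Runnable() {
      @Override
      public void run() {
        try {
          never.await();
        } catch (InterruptedException ignored) {
          // Nothing to clean up in this sketch.
        }
      }
    });
    worker.start();
    // Because the worker is a daemon, the JVM exits once main returns.
    // With factory(false) the process would stay alive instead; the
    // workaround in the report amounts to blocking here, for example
    // with worker.join().
    System.out.println("main returning; daemon=" + worker.isDaemon());
  }
}
```

Switching the argument to factory(false) keeps the process alive, which is the behaviour a web-server-only process would need.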
Re: Reviving HADOOP-7435: Making Jenkins pre-commit build work with branches
+1. It's really helpful for branch development. To continue Karthik's point, would it be good to make pre-commit testing against branch-2 a default too, like that against trunk? On 3/4/15, 1:47 PM, Sean Busbey bus...@cloudera.com wrote: +1 If we can make things look like HBase support for precommit testing on branches (HBASE-12944), that would make it easier for new and occasional contributors who might end up working in other ecosystem projects. AFAICT, Jonathan's proposal for branch names in patch names does this. On Wed, Mar 4, 2015 at 3:41 PM, Karthik Kambatla ka...@cloudera.com wrote: Thanks for reviving this on email, Vinod. Newer folks like me might not be aware of this JIRA/effort. This would be wonderful to have so (1) we know the status of release branches (branch-2, etc.) and also (2) feature branches (YARN-2928). Jonathan's or Matt's proposal for including branch name looks reasonable to me. If no one has any objections, I think we can continue on JIRA and get this in. On Wed, Mar 4, 2015 at 1:20 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote: Hi all, I'd like us to revive the effort at https://issues.apache.org/jira/browse/HADOOP-7435 to make precommit builds able to work with branches. Having Jenkins verify patches on branches is very useful even if there may be relaxed review oversight on the said branch. Unless there are objections, I'd request help from Giri, who already has a patch that has been sitting there for more than a year. This may need us to collectively agree on some convention - the last comment says that the branch patch name should be in some format for this to work. Thanks, +Vinod -- Karthik Kambatla Software Engineer, Cloudera Inc. http://five.sentenc.es -- Sean
[jira] [Resolved] (HADOOP-11370) Fix new findbug warnings hadoop-yarn
[ https://issues.apache.org/jira/browse/HADOOP-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen resolved HADOOP-11370. -- Resolution: Fixed Fix Version/s: 2.7.0 Assignee: (was: Varun Saxena) Close it as all incorporated Jiras are resolved. Unset the assignee because of multiple contributors. Fix new findbug warnings hadoop-yarn Key: HADOOP-11370 URL: https://issues.apache.org/jira/browse/HADOOP-11370 Project: Hadoop Common Issue Type: Sub-task Reporter: Zhijie Shen Fix For: 2.7.0 Attachments: FindBugs Report.html
Re: Upgrading findbugs
Findbugs warnings are cleaned up at YARN side. Thanks for the contribution from Li Lu and Varun Saxena! On Thu, Dec 18, 2014 at 2:44 PM, Haohui Mai h...@hortonworks.com wrote: So far we made great progress on fixing findbugs warnings. We're free of findbugs warnings in hdfs, nfs, and a couple other sub projects. There are two findbugs warnings left in hadoop-common. I saw there are some progresses on the YARN side as well. Thanks very much for the contributors (Brandon Li, Li Lu, and many others) that have worked on this. We have a cleaner code base now, and the newer findbugs can help us to catch more issues during pre-commits. :-) I plan to finish the remaining work on in 2.7 timeframe. Thanks again for contribution. Thanks, Haohui On Tue, Dec 9, 2014 at 2:16 AM, Steve Loughran ste...@hortonworks.com wrote: +1 to upgrade. regarding the newly surfacing issues, I'd recommend we look at them and see which are critical problems and fix them. One of the conclusions I got from the building is that there are a lot of javac and javadoc warnings that everyone ignores. Sitting down to fix them is time consuming and doesn't directly fix anything or add new features —but it keeps the code cleaner, which is something we want to encourage -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. 
-- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
[jira] [Created] (HADOOP-11370) Fix new findbug warnings hadoop-yarn
Zhijie Shen created HADOOP-11370: Summary: Fix new findbug warnings hadoop-yarn Key: HADOOP-11370 URL: https://issues.apache.org/jira/browse/HADOOP-11370 Project: Hadoop Common Issue Type: Sub-task Reporter: Zhijie Shen
Re: [VOTE] Release Apache Hadoop 2.6.0
+1 (non-binding) Replayed the verifications for rc0, and everything keeps working fine. On Fri, Nov 14, 2014 at 9:21 AM, Chen He airb...@gmail.com wrote: +1 non-binding download and compile source code; setup single node cluster; successfully run sleep job; Regards! Chen On Fri, Nov 14, 2014 at 1:44 AM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote: Thanks Arun for creating another rc! +1 (non-binding) - patched Tez 0.5.2 pom to compile against 2.6.0-rc1 - patched Hive 0.14 pom to compile against 2.6.0-rc1 - run several Hive queries on Tez Thanks, Akira (11/14/14, 10:16), Wangda Tan wrote: Thanks Arun! Have tried compile, deploy and configured a local cluster and can successfully execute a job with node labels. +1 (non-binding) Thanks, Wangda On Thu, Nov 13, 2014 at 3:39 PM, Ravi Prakash ravi...@ymail.com wrote: Thanks for the respin Arun! I've verified all checksums, and tested that the DockerContainerExecutor was able to launch jobs. I'm a +1 on the release On Thursday, November 13, 2014 3:09 PM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created another release candidate (rc1) for hadoop-2.6.0 based on the feedback. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc1 The RC tag in git is: release-2.6.0-rc1 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1013. Please try the release and vote; the vote will run for the usual 5 days. thanks, Arun -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. 
If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.6.0
+1 (non-binding) * Downloaded the source tar ball, and built binaries from it successfully. * Ran DS apps and MR jobs with emitting timeline data enabled successfully. * Verified the generic history information, DS-specific and MR-specific metrics were available. * Ran the timeline server in secure mode, delegation token and domain-based ACLs worked properly. On Tue, Nov 11, 2014 at 4:30 PM, Wei Yan ywsk...@gmail.com wrote: On Nov 11, 2014, at 2:06 PM, Robert Kanter rkan...@cloudera.com wrote: Hi Arun, We were testing the RC and ran into a problem with the recent fixes that were done for POODLE for Tomcat (HADOOP-11217 for KMS and HDFS-7274 for HttpFS). Basically, in disabling SSLv3, we also disabled SSLv2Hello, which is required for older clients (e.g. Java 6 with openssl 0.9.8x) so they can't connect without it. Just to be clear, it does not mean SSLv2, which is insecure. This also affects the MR shuffle in HADOOP-11243. For HADOOP-11243, as the shuffle happens only between NMs, it’s ok to keep it only support TLS. The fix is super simple, so I think we should reopen these 3 JIRAs and put in addendum patches and get them into 2.6.0. thanks - Robert On Tue, Nov 11, 2014 at 1:04 PM, Ravi Prakash ravi...@ymail.com wrote: Hi Arun! We are very close to completion on YARN-1964 (DockerContainerExecutor). I'd also like HDFS-4882 to be checked in. Do you think these issues merit another RC? ThanksRavi On Tuesday, November 11, 2014 11:57 AM, Steve Loughran ste...@hortonworks.com wrote: +1 binding -patched slider pom to build against 2.6.0 -verified build did download, which it did at up to ~8Mbps. Faster than a local build. -full clean test runs on OS/X Linux Windows 2012: Same thing. I did have to first build my own set of the windows native binaries, by checking out branch-2.6.0; doing a native build, copying the binaries and then purging the local m2 repository of hadoop artifacts to be confident I was building against. 
For anyone who wants those native libs they will be up on https://github.com/apache/incubator-slider/tree/develop/bin/windows/ once it syncs with the ASF repos. Afterwards: the tests worked! On 11 November 2014 02:52, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.6.0 that I would like to see released. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc0 The RC tag in git is: release-2.6.0-rc0 The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1012. Please try the release and vote; the vote will run for the usual 5 days. thanks, Arun -- Zhijie Shen Hortonworks Inc.
http://hortonworks.com/
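Robert's point in the thread above is that disabling SSLv3 should not also disable SSLv2Hello, which is only a handshake format and not the insecure SSLv2 protocol. A minimal sketch of that distinction, assuming the server's protocol list is filtered in Java before being handed to something like SSLEngine.setEnabledProtocols; the class ProtocolFilterSketch and the method withoutSslv3 are hypothetical and do not reproduce the HADOOP-11217, HDFS-7274 or HADOOP-11243 patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not taken from any of the JIRAs mentioned above.
class ProtocolFilterSketch {

  /** Drops only SSLv3 and keeps everything else, including SSLv2Hello. */
  static String[] withoutSslv3(String[] supported) {
    List<String> kept = new ArrayList<String>();
    for (String protocol : supported) {
      if (!"SSLv3".equals(protocol)) {
        kept.add(protocol);
      }
    }
    return kept.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // Example protocol names as JSSE spells them.
    String[] supported = {"SSLv2Hello", "SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2"};
    // prints [SSLv2Hello, TLSv1, TLSv1.1, TLSv1.2]
    System.out.println(Arrays.toString(withoutSslv3(supported)));
  }
}
```

Filtering by exact name, rather than by an "SSL" prefix, is what keeps older clients that open with an SSLv2-format hello able to negotiate TLS.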
[jira] [Created] (HADOOP-11254) Promoting AccessControlList to be public
Zhijie Shen created HADOOP-11254: Summary: Promoting AccessControlList to be public Key: HADOOP-11254 URL: https://issues.apache.org/jira/browse/HADOOP-11254 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen The motivation for promoting AccessControlList to a public API is to make it easier for users to programmatically parse or construct an ACL string. A typical use case is the timeline domain, where the client lib accepts strings that comply with the ACL string format to define the authorized readers/writers. It will be more convenient and less error-prone if users can compose this string with AccessControlList.
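To show why hand-composing the string is error-prone, here is a minimal sketch of the ACL string layout, assuming the usual Hadoop format of a comma-separated user list, one space, then a comma-separated group list. It deliberately does not call AccessControlList so that it stays self-contained; AclStringSketch and compose are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for what a public AccessControlList would offer.
class AclStringSketch {

  /** Builds "user1,user2 group1,group2" from the two lists. */
  static String compose(List<String> users, List<String> groups) {
    return join(users) + " " + join(groups);
  }

  /** Joins items with commas, trimming stray whitespace around each one. */
  private static String join(List<String> items) {
    StringBuilder sb = new StringBuilder();
    for (String item : items) {
      if (sb.length() > 0) {
        sb.append(',');
      }
      sb.append(item.trim());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // prints alice,bob admins
    System.out.println(compose(Arrays.asList("alice", "bob"), Arrays.asList("admins")));
    // Two empty lists yield a single space, i.e. an ACL that admits nobody.
    System.out.println("[" + compose(Collections.<String>emptyList(),
        Collections.<String>emptyList()) + "]");
  }
}
```

The separator handling (commas inside a list, a single space between the lists) is exactly the kind of detail a public class would let callers stop re-implementing.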
[jira] [Created] (HADOOP-11240) Jenkins build seems to be broken by changes in test-patch.sh
Zhijie Shen created HADOOP-11240: Summary: Jenkins build seems to be broken by changes in test-patch.sh Key: HADOOP-11240 URL: https://issues.apache.org/jira/browse/HADOOP-11240 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Priority: Blocker
* https://builds.apache.org/job/PreCommit-YARN-Build/5596//console
* https://builds.apache.org/job/PreCommit-YARN-Build/5595//console
* https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4981//console
A couple of Jenkins builds failed for the same reason:
{code}
HEAD is now at b0e19c9 HADOOP-10926. Improve test-patch.sh to apply binary diffs (cmccabe)
Previous HEAD position was b0e19c9... HADOOP-10926. Improve test-patch.sh to apply binary diffs (cmccabe)
Switched to branch 'trunk'
Your branch is behind 'origin/trunk' by 17 commits, and can be fast-forwarded.
(use git pull to update your local branch)
First, rewinding head to replay your work on top of it...
Fast-forwarded trunk to b0e19c9d54cecef191b91431f9ca62a76a000f45.
MAPREDUCE-5933 patch is being downloaded at Tue Oct 28 02:11:12 UTC 2014 from
http://issues.apache.org/jira/secure/attachment/12677496/MAPREDUCE-5933.patch
cp: cannot stat '/home/jenkins/buildSupport/lib/*': No such file or directory
Error: Patch dryrun couldn't detect changes the patch would make. Exiting.
PATCH APPLICATION FAILED
{code}
It seems to have been broken by HADOOP-10926.
[jira] [Created] (HADOOP-11215) DT management ops in DelegationTokenAuthenticatedURL assume the authenticator is KerberosDelegationTokenAuthenticator
Zhijie Shen created HADOOP-11215: Summary: DT management ops in DelegationTokenAuthenticatedURL assume the authenticator is KerberosDelegationTokenAuthenticator Key: HADOOP-11215 URL: https://issues.apache.org/jira/browse/HADOOP-11215 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen
Here's the code in get/renew/cancel DT:
{code}
return ((KerberosDelegationTokenAuthenticator) getAuthenticator()).
    renewDelegationToken(url, token, token.delegationToken, doAsUser);
{code}
This does not seem right, because PseudoDelegationTokenAuthenticator should work here as well. At the least, it is inconsistent in the context of delegation token authentication, as DelegationTokenAuthenticationHandler doesn't require the authentication to be Kerberos.
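The problem with the cast can be sketched with stand-in types. Everything below is hypothetical (DtCastSketch and its nested classes are not the Hadoop classes), and it assumes only that the Kerberos and pseudo authenticators share a common base type that declares the delegation token operations; it is not a statement of how the issue was actually fixed.

```java
// Hypothetical stand-ins; not the Hadoop authenticator classes.
class DtCastSketch {

  interface Authenticator { }

  /** Assumed common base type that declares the DT operations. */
  static abstract class DtAuthenticator implements Authenticator {
    abstract String scheme();

    String renew(String token) {
      return scheme() + ":renewed:" + token;
    }
  }

  static class KerberosDtAuthenticator extends DtAuthenticator {
    @Override
    String scheme() {
      return "kerberos";
    }
  }

  static class PseudoDtAuthenticator extends DtAuthenticator {
    @Override
    String scheme() {
      return "pseudo";
    }
  }

  /** Mirrors the reported pattern: only works for the Kerberos subtype. */
  static String renewNarrow(Authenticator a, String token) {
    return ((KerberosDtAuthenticator) a).renew(token);
  }

  /** Casts to the shared base type instead, so both subtypes work. */
  static String renewGeneral(Authenticator a, String token) {
    return ((DtAuthenticator) a).renew(token);
  }

  public static void main(String[] args) {
    Authenticator pseudo = new PseudoDtAuthenticator();
    System.out.println(renewGeneral(pseudo, "t1"));
    try {
      renewNarrow(pseudo, "t1");
    } catch (ClassCastException e) {
      System.out.println("narrow cast failed for the pseudo authenticator");
    }
  }
}
```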
[jira] [Created] (HADOOP-11210) Findbugs warning about SpanReceiverHost
Zhijie Shen created HADOOP-11210: Summary: Findbugs warning about SpanReceiverHost Key: HADOOP-11210 URL: https://issues.apache.org/jira/browse/HADOOP-11210 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Priority: Minor
The Findbugs warning about SpanReceiverHost has been reported multiple times in several JIRAs:
{quote}
Dereference of the result of readLine() without nullcheck in org.apache.hadoop.tracing.SpanReceiverHost.getUniqueLocalTraceFileName()
Bug type NP_DEREFERENCE_OF_READLINE_VALUE
In class org.apache.hadoop.tracing.SpanReceiverHost
In method org.apache.hadoop.tracing.SpanReceiverHost.getUniqueLocalTraceFileName()
Value loaded from line
At SpanReceiverHost.java:[line 104]
{quote}
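NP_DEREFERENCE_OF_READLINE_VALUE fires when the result of BufferedReader.readLine() is used without checking for null, which is what readLine() returns at end of stream. A minimal sketch of the usual guard follows; the class ReadLineSketch, the method firstLineOrDefault and the sample input are hypothetical and are not the SpanReceiverHost code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical example of the null check that Findbugs asks for.
class ReadLineSketch {

  /** Returns the trimmed first line, or the fallback if the stream is empty. */
  static String firstLineOrDefault(BufferedReader reader, String fallback)
      throws IOException {
    String line = reader.readLine(); // null at end of stream
    if (line == null) {
      return fallback;
    }
    return line.trim();
  }

  public static void main(String[] args) throws IOException {
    BufferedReader empty = new BufferedReader(new StringReader(""));
    BufferedReader oneLine = new BufferedReader(new StringReader(" 4242 \n"));
    System.out.println(firstLineOrDefault(empty, "unknown"));   // prints unknown
    System.out.println(firstLineOrDefault(oneLine, "unknown")); // prints 4242
  }
}
```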
[jira] [Created] (HADOOP-11207) DelegationTokenAuthenticationHandler needs to support DT operations for proxy user
Zhijie Shen created HADOOP-11207: Summary: DelegationTokenAuthenticationHandler needs to support DT operations for proxy user Key: HADOOP-11207 URL: https://issues.apache.org/jira/browse/HADOOP-11207 Project: Hadoop Common Issue Type: Bug Components: security Reporter: Zhijie Shen Assignee: Zhijie Shen Currently, DelegationTokenAuthenticationHandler only supports DT operations for the request user after it passes authentication. However, it should also allow the request user to do DT operations on behalf of a proxy user. The timeline server uses the authentication filter for DT operations instead of the traditional RPC-based ones. It needs this feature to enable proxy users to use the timeline service (YARN-2676).
[jira] [Created] (HADOOP-11181) o.a.h.security.token.delegation.DelegationTokenManager should be more generalized to handle other DelegationTokenIdentifier
Zhijie Shen created HADOOP-11181: Summary: o.a.h.security.token.delegation.DelegationTokenManager should be more generalized to handle other DelegationTokenIdentifier Key: HADOOP-11181 URL: https://issues.apache.org/jira/browse/HADOOP-11181 Project: Hadoop Common Issue Type: Bug Components: security Reporter: Zhijie Shen Assignee: Zhijie Shen
While DelegationTokenManager can be given an external secretManager, it assumes that the token is going to be an o.a.h.security.token.delegation.DelegationTokenIdentifier, and uses the DelegationTokenIdentifier method to decode a token.
{code}
@SuppressWarnings("unchecked")
public UserGroupInformation verifyToken(Token<DelegationTokenIdentifier> token)
    throws IOException {
  ByteArrayInputStream buf = new ByteArrayInputStream(token.getIdentifier());
  DataInputStream dis = new DataInputStream(buf);
  DelegationTokenIdentifier id = new DelegationTokenIdentifier(tokenKind);
  id.readFields(dis);
  dis.close();
  secretManager.verifyToken(id, token.getPassword());
  return id.getUser();
}
{code}
It's not going to work if the token kind is other than web.DelegationTokenIdentifier. For example, the RM wants to reuse it but hook it up to RMDelegationTokenSecretManager and RMDelegationTokenIdentifier, which has its own customized way of decoding a token.
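One way to remove the hard-coded identifier type is to let whatever owns the token format create the empty identifier, and have the manager only drive the decoding. The sketch below illustrates that idea with stand-in types; TokenDecodeSketch, TokenId, IdFactory, WebId, RmId and the two wire formats are all hypothetical, and this is not presented as the change that resolved the issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-ins; none of these are Hadoop classes.
class TokenDecodeSketch {

  interface TokenId {
    void readFields(DataInputStream in) throws IOException;
    String user();
  }

  /** Plays the role of a secret manager that knows its identifier type. */
  interface IdFactory<T extends TokenId> {
    T createIdentifier();
  }

  /** Assumed wire format: a single UTF string holding the owner. */
  static class WebId implements TokenId {
    private String owner;
    public void readFields(DataInputStream in) throws IOException {
      owner = in.readUTF();
    }
    public String user() {
      return owner;
    }
  }

  /** A different assumed wire format: a version int, then the owner. */
  static class RmId implements TokenId {
    private String owner;
    public void readFields(DataInputStream in) throws IOException {
      in.readInt(); // skip the version field
      owner = in.readUTF();
    }
    public String user() {
      return owner;
    }
  }

  /** Decodes without naming a concrete identifier class. */
  static <T extends TokenId> T decode(IdFactory<T> factory, byte[] raw)
      throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
    try {
      T id = factory.createIdentifier();
      id.readFields(in);
      return id;
    } finally {
      in.close();
    }
  }

  /** Test helper: serializes an owner in the RmId format. */
  static byte[] rmBytes(String owner) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    out.writeInt(1);
    out.writeUTF(owner);
    out.flush();
    return buf.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    IdFactory<RmId> rmFactory = new IdFactory<RmId>() {
      public RmId createIdentifier() {
        return new RmId();
      }
    };
    System.out.println(decode(rmFactory, rmBytes("alice")).user()); // prints alice
  }
}
```

Decoding the same bytes with a WebId factory would misread the version field, which is the failure mode the report describes.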
[DISCUSS] Reasonable Hadoop ACL Defaults
Hi folks, There are a number of ACL configuration defaults that are set to *: 1. yarn.admin.acl in yarn-default.xml 2. yarn.scheduler.capacity.root.default.[acl_submit_applications|acl_administer_queue] in capacity-scheduler.xml 3. security.*.protocol.acl in hadoop-policy.xml When ACL (or server authorization) is enabled, the resources are supposed to be protected. However, anybody can still access them because the default configurations are *, which accepts anybody. These defaults do not seem to make much sense and only confuse users. Instead, the reasonable behavior should be that when ACL is enabled, a user is denied by default unless we explicitly add him/her to the admin ACLs or the authorized user/group list. I have a patch to invert the * defaults so that all users are blocked by default. Please let me know what you think about it, and how we should proceed. Thanks, Zhijie -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
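The difference between an allow-all default and a deny-by-default setting can be sketched as follows. This is a simplified illustration only: AclDefaultSketch and isAllowed are hypothetical names, and the ACL string here is just a comma-separated user list with * as a wildcard, which is an assumption and not the full Hadoop ACL syntax.

```java
import java.util.Arrays;

class AclDefaultSketch {
    // Simplified check: "*" admits everybody, a blank ACL admits nobody,
    // otherwise the user must appear in the comma-separated list.
    static boolean isAllowed(String acl, String user) {
        String trimmed = acl.trim();
        if (trimmed.equals("*")) {
            return true;
        }
        if (trimmed.isEmpty()) {
            return false;
        }
        return Arrays.asList(trimmed.split("\\s*,\\s*")).contains(user);
    }

    public static void main(String[] args) {
        // With the "*" default, enabling ACLs protects nothing.
        System.out.println(isAllowed("*", "mallory"));
        // With a blank default, users are denied until listed explicitly.
        System.out.println(isAllowed(" ", "mallory"));
        System.out.println(isAllowed("alice, bob", "bob"));
    }
}
```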
[jira] [Resolved] (HADOOP-10995) HBase cannot run correctly with Hadoop trunk
[ https://issues.apache.org/jira/browse/HADOOP-10995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen resolved HADOOP-10995. -- Resolution: Invalid Release Note: Per the new patch on YARN-2032, we plan to ignore the test cases on trunk to work around the compatibility issue. Close this ticket as invalid. HBase cannot run correctly with Hadoop trunk Key: HADOOP-10995 URL: https://issues.apache.org/jira/browse/HADOOP-10995 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Attachments: HADOOP-10995.1.patch, YARN-2032.dependency.patch Several incompatible changes that happened on trunk but not on branch-2 have broken compatibility for HBase: HADOOP-10348 HADOOP-8124 HADOOP-10255 In general, HttpServer and Syncable.sync are missing. This blocks YARN-2032, which makes the timeline server support an HBase store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Release Apache Hadoop 2.5.1 RC0
+1 (non-binding). Downloaded the source tarball, built binaries from the source, deployed a one-node cluster, and ran some MR examples successfully. On Wed, Sep 10, 2014 at 12:15 PM, Alejandro Abdelnur t...@cloudera.com wrote: Thanks Karthik. +1. + verified MD5 for source tarball + verified signature for source tarball + successfully run apache-rat:check + checked CHANGES, LICENSE, README, NOTICE files. + built from source tarball + started pseudo cluster + run a couple of MR example jobs + basic test on HttpFS On Wed, Sep 10, 2014 at 10:10 AM, Karthik Kambatla ka...@cloudera.com wrote: Thanks for reporting the mistake in the documentation, Akira. While it is good to fix it, I am not sure it is big enough to warrant another RC, particularly because 2.5.1 is very much 2.5.0 done right. I just updated the how-to-release wiki to capture this step in the release process, so we don't miss it in the future. On Mon, Sep 8, 2014 at 11:37 PM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote: -0 (non-binding) In the document, Apache Hadoop 2.5.1 is a minor release in the 2.x.y release line, building upon the previous stable release 2.4.1. Hadoop 2.5.1 is a point release. Filed HADOOP-11078 to track this. Regards, Akira (2014/09/09 0:51), Karthik Kambatla wrote: +1 (non-binding) Built the source tarball, brought up a pseudo-distributed cluster and ran a few MR jobs. Verified documentation and size of the binary tarball. On Fri, Sep 5, 2014 at 5:18 PM, Karthik Kambatla ka...@cloudera.com wrote: Hi folks, I have put together a release candidate (RC0) for Hadoop 2.5.1. The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.1-RC0/ The RC git tag is release-2.5.1-RC0 The maven artifacts are staged at: https://repository.apache.org/content/repositories/orgapachehadoop-1010/ You can find my public key at: http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS Please try the release and vote. The vote will run for the now usual 5 days. 
Thanks Karthik -- Alejandro -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Git repo ready to use
Committed YARN-2035 successfully via git. Email notification seems to work already. On Wed, Aug 27, 2014 at 1:40 AM, Karthik Kambatla ka...@cloudera.com wrote: Oh.. a couple more things. The git commit hashes have changed and are different from what we had on our github. This might interfere with any build automations that folks have. Another follow-up item: email and JIRA integration On Wed, Aug 27, 2014 at 1:33 AM, Karthik Kambatla ka...@cloudera.com wrote: Hi folks, I am very excited to let you know that the git repo is now writable. I committed a few changes (CHANGES.txt fixes and branching for 2.5.1) and everything looks good. Current status: 1. All branches have the same names, including trunk. 2. Force push is disabled on trunk, branch-2 and tags. 3. Even if you are experienced with git, take a look at https://wiki.apache.org/hadoop/HowToCommitWithGit . Particularly, let us avoid merge commits. Follow-up items: 1. Update rest of the wiki documentation 2. Update precommit Jenkins jobs and get HADOOP-11001 committed (reviews appreciated). Until this is done, the precommit jobs will run against our old svn repo. 3. git mirrors etc. to use the new repo instead of the old svn repo. Thanks again for your cooperation through the migration process. Please reach out to me (or the list) if you find anything missing or have suggestions. Cheers! Karthik -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. 
Thank You.
Re: Updates on migration to git
Do we have any convention about user.name and user.email? For example, we'd like to use @apache.org for the email. Moreover, do we want to use --author=Author Name em...@address.com when committing on behalf of a particular contributor? On Mon, Aug 25, 2014 at 9:56 AM, Karthik Kambatla ka...@cloudera.com wrote: Thanks for your input, Steve. Sorry for sending the email out that late, I sent it as soon as I could. On Mon, Aug 25, 2014 at 2:20 AM, Steve Loughran ste...@hortonworks.com wrote: just caught up with this after some offlininess...15:48 PST is too late for me. I'd be -1 to a change to master because of that risk that it does break existing code -especially people that have trunk off the git mirrors and automated builds/merges to go with it. Fair enough. It makes sense to leave it as trunk, unless someone is against it being trunk. master may be viewed as the official git way, but it doesn't have to be. For git-flow workflows (which we use in slider) master/ is for releases, develop/ for dev. On 24 August 2014 02:31, Karthik Kambatla ka...@cloudera.com wrote: Couple of things: 1. Since no one expressed any reservations against doing this on Sunday or renaming trunk to master, I ll go ahead and confirm that. I think that serves us better in the long run. 2. Arpit brought up the precommit builds - we should definitely fix them as soon as we can. I understand Giri maintains those builds, do we have anyone else who has access in case Giri is not reachable? Giri - please shout out if you can help us with this either on Sunday or Monday. Thanks Karthik On Fri, Aug 22, 2014 at 3:50 PM, Karthik Kambatla ka...@cloudera.com wrote: Also, does anyone know what we use for integration between JIRA and svn? I am assuming svn2jira. On Fri, Aug 22, 2014 at 3:48 PM, Karthik Kambatla ka...@cloudera.com wrote: Hi folks, For the SCM migration, feel free to follow https://issues.apache.org/jira/browse/INFRA-8195 Most of this is planned to be handled this Sunday. 
As a result, the subversion repository would be read-only. If this is a major issue for you, please shout out. Daniel Gruno, the one helping us with the migration, was asking if we are open to renaming trunk to master to better conform to git lingo. I am tempted to say yes, but wanted to check. Would greatly appreciate any help with checking the git repo has everything. Thanks Karthik -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Created] (HADOOP-10995) HBase cannot run correctly with Hadoop trunk
Zhijie Shen created HADOOP-10995: Summary: HBase cannot run correctly with Hadoop trunk Key: HADOOP-10995 URL: https://issues.apache.org/jira/browse/HADOOP-10995 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Several incompatible changes that happened on trunk but not on branch-2 have broken compatibility for HBase: HADOOP-10348 HADOOP-8124 HADOOP-10255 In general, HttpServer and Syncable.sync are missing. This blocks YARN-2032, which makes the timeline server support an HBase store. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: [VOTE] Migration from subversion to git for version control
+1 non-binding On Mon, Aug 11, 2014 at 4:16 PM, jgho...@gmail.com wrote: +1 From: Colin McCabe Sent: Monday, August 11, 2014 3:40 PM To: Hadoop Common Cc: hdfs-...@hadoop.apache.org, yarn-...@hadoop.apache.org, mapreduce-...@hadoop.apache.org +1. best, Colin On Fri, Aug 8, 2014 at 7:57 PM, Karthik Kambatla ka...@cloudera.com wrote: I have put together this proposal based on recent discussion on this topic. Please vote on the proposal. The vote runs for 7 days. 1. Migrate from subversion to git for version control. 2. Force-push to be disabled on trunk and branch-* branches. Applying changes from any of trunk/branch-* to any of branch-* should be through git cherry-pick -x. 3. Force-push on feature-branches is allowed. Before pulling in a feature, the feature-branch should be rebased on latest trunk and the changes applied to trunk through git rebase --onto or git cherry-pick commit-range. 4. Every time a feature branch is rebased on trunk, a tag that identifies the state before the rebase needs to be created (e.g. tag_feature_JIRA-2454_2014-08-07_rebase). These tags can be deleted once the feature is pulled into trunk and the tags are no longer useful. 5. The relevance/use of tags stay the same after the migration. Thanks Karthik PS: Per Andrew Wang, this should be a Adoption of New Codebase kind of vote and will be Lazy 2/3 majority of PMC members. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. 
Thank You.
Re: [VOTE] Release Apache Hadoop 2.5.0 RC2
+1 non-binding + downloaded the source, built the tarball from it successfully + verified the dates of CHANGES.txt + ran a couple of MR example jobs successfully + ran DS jobs successfully + verified the applications' history was accessible from the timeline server + verified the DS jobs published the right entities and the events to the timeline server On Thu, Aug 7, 2014 at 6:28 PM, Andrew Wang andrew.w...@cloudera.com wrote: +1 binding * verified mds * built tarball from source * checked the CHANGES.txt files, they have dates * ran apache-rat:check * ran pseudo cluster from the resulting tarball, ran teragen, sort, validate on uncached and cached files On Thu, Aug 7, 2014 at 8:58 AM, Masatake Iwasaki iwasak...@oss.nttdata.co.jp wrote: +1 (non-binding) + verified MD5 for tarball and source tarball + built from source tarball + ran example jobs such as nutchindexing, wordcount, dfsioe, hivebench, kmeans, pagerank, bayes, sort, terasort with HiBench on the cluster with 3 slave nodes. (8/6/14, 13:59), Karthik Kambatla wrote: Hi folks, I have put together a release candidate (rc2) for Hadoop 2.5.0. The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.0-RC2/ The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.5.0-rc2/ The maven artifacts are staged at: https://repository.apache.org/content/repositories/orgapachehadoop-1009/ You can find my public key at: http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS Please try the release and vote. The vote will run for the now usual 5 days. Thanks -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. 
If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Hadoop Code of Incompatible Changes
Hi folks, Recently we had a conversation on YARN-2209 about incompatible changes across releases. For API changes that break binary or source compatibility for existing API users, we already have a rather clear picture of what we should do. However, YARN-2209 has introduced another case that I'm not quite sure about, which is a kind of *logic incompatibility*. In detail, an ApplicationMasterProtocol API is going to throw an exception that was not expected before. The exception is a sub-class of YarnException, so it doesn't need any method signature change and won't break binary/source compatibility. However, the exception was not expected before and needs to be treated specially on the AM side. Not being aware of the newly introduced exception, the AMs of existing YARN applications may not handle the exception properly and are at risk of breaking on a new YARN cluster after this change. An additional thought around this problem is that changing which exception is thrown in which situation may be considered a *soft* method signature change, because we're supposed to write this in the javadoc to tell the users (though we didn't do it well in Hadoop), and users refer to it for guidance on how to handle the exception. In a more generalized form, let's assume we have a method that behaves as A in release 1.0. In release 2.0, the method signature stays the same, while its behavior is altered from A to B, where A and B are different behaviors. In this case, do we consider it an incompatible change? I think it's a fairly common issue, so I'm raising it on the mailing list. Please share your ideas. Thanks, Zhijie -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. 
If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
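The "logic incompatibility" raised in the message above can be made concrete with a small sketch. All names here (SketchBaseException, SketchNewException, LogicIncompatibilitySketch, allocate, oldAm, updatedAm) are hypothetical and stand in for YarnException, the new sub-class, and an AM; the recovery behavior is invented for illustration. The point is that the method signature is identical in both releases, yet a caller written against the old behavior ends up on a different code path.

```java
// Hypothetical stand-in for the declared base exception.
class SketchBaseException extends Exception {
    SketchBaseException(String message) {
        super(message);
    }
}

// Hypothetical new sub-class introduced in the later release.
class SketchNewException extends SketchBaseException {
    SketchNewException(String message) {
        super(message);
    }
}

class LogicIncompatibilitySketch {
    // The signature is the same in both releases; only the behavior differs.
    static String allocate(boolean newRelease) throws SketchBaseException {
        if (newRelease) {
            throw new SketchNewException("condition that needs special handling");
        }
        return "allocated";
    }

    // Caller written against the old release: any base exception is fatal.
    static String oldAm(boolean newRelease) {
        try {
            return allocate(newRelease);
        } catch (SketchBaseException e) {
            return "fatal";
        }
    }

    // Caller that knows about the new sub-class and treats it specially.
    static String updatedAm(boolean newRelease) {
        try {
            return allocate(newRelease);
        } catch (SketchNewException e) {
            return "recovered";
        } catch (SketchBaseException e) {
            return "fatal";
        }
    }

    public static void main(String[] args) {
        System.out.println(oldAm(false));    // old cluster, old caller
        System.out.println(oldAm(true));     // new cluster, old caller
        System.out.println(updatedAm(true)); // new cluster, updated caller
    }
}
```

Both callers compile unchanged against either release, which is why this kind of change does not show up as a binary or source incompatibility.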
Re: Branching 2.5
I've just committed YARN-2247, which is the last 2.5 blocker from YARN. On Sat, Jul 26, 2014 at 5:02 AM, Karthik Kambatla ka...@cloudera.com wrote: A quick update: All remaining blockers are on the verge of getting committed. Once that is done, I plan to cut a branch for 2.5.0 and get an RC out hopefully this coming Monday. On Fri, Jul 25, 2014 at 12:32 PM, Andrew Wang andrew.w...@cloudera.com wrote: One thing I forgot, the release note activities are happening at HADOOP-10821. If you have other things you'd like to see mentioned, feel free to leave a comment on the JIRA and I'll try to include it. Thanks, Andrew On Fri, Jul 25, 2014 at 12:28 PM, Andrew Wang andrew.w...@cloudera.com wrote: I just went through and fixed up the HDFS and Common CHANGES.txt for 2.5.0. As a friendly reminder, please try to put things under the correct section :) We have subsections for the xattr changes in HDFS-2006 and HADOOP-10514, and there were some unrelated JIRAs appended to the end. I'd also encourage committers to be more liberal with their use of the NEW FEATURES section. I'm helping Karthik write up the 2.5 release notes, and I'm using NEW FEATURES to fill it out. When looking through the JIRA list though, I decided to promote things like the SNN/DN/JN webUI improvements, the HCFS specification work, and OIV read-only WebHDFS access to new features. One rule-of-thumb, if a feature required an umbrella JIRA, put the umbrella under NEW FEATURES when it's resolved. 
Thanks, Andrew On Wed, Jul 16, 2014 at 7:59 PM, Wangda Tan wheele...@gmail.com wrote: Thanks Tsuyoshi for pointing me this, Wangda On Thu, Jul 17, 2014 at 10:30 AM, Tsuyoshi OZAWA ozawa.tsuyo...@gmail.com wrote: Hi Wangda, The following link is same link as Karthik mentioned: https://issues.apache.org/jira/browse/YARN-2247?jql=project%20in%20(Hadoop%2C%20HDFS%2C%20YARN%2C%20%22Hadoop%20Map%2FReduce%22)%20AND%20resolution%20%3D%20Unresolved%20AND%20%22Target%20Version%2Fs%22%20%3D%202.5.0%20AND%20priority%20in%20(Blocker) Or, please access to http://goo.gl/FX3iWp Thanks, - Tsuyoshi On Thu, Jul 17, 2014 at 10:55 AM, Zhijie Shen zs...@hortonworks.com wrote: I raised YARN-2247 as the blocker of 2.5.0. On Thu, Jul 17, 2014 at 9:42 AM, Wangda Tan wheele...@gmail.com wrote: Hi Karthik, I found I cannot access the filter: http://s.apache.org/vJg. Could you please check its permission? I'd like to know if there's any related issues to me. :) Thanks, Wangda On Thu, Jul 17, 2014 at 5:54 AM, Karthik Kambatla ka...@cloudera.com wrote: We are down to 4 blockers and looks like they are all actively being worked on. Please reconsider marking new JIRAs as blockers. Thanks Karthik PS: I moved out a couple of JIRAs that didn't seem like true blockers to 2.6. On Wed, Jul 9, 2014 at 11:43 AM, Karthik Kambatla ka...@cloudera.com wrote: Folks, We have 10 blockers for 2.5. Can the people working on them revisit and see if they are really blockers. If they are, can we try to get them in soon? It would be nice to get an RC out the end of this week or at least early next week? Thanks Karthik On Wed, Jul 2, 2014 at 11:32 PM, Karthik Kambatla ka...@cloudera.com wrote: I just 1. moved non-blocker 2.5 JIRAs to 2.6. 2. created branch-2.5 and added sections for 2.6.0 in all CHANGES.txt in trunk and branch-2. 3. 
Will create branch-2.5.0 when we are ready to create an RC There are 11 pending blockers for 2.5.0: http://s.apache.org/vJg Committers - please exercise caution when merging to branch-2.5 and target non-blockers preferably to 2.6 On Wed, Jul 2, 2014 at 10:24 PM, Karthik Kambatla ka...@cloudera.com wrote: Committers, I am working on branching 2.5. Will send an update as soon as I am done branching. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding
Re: Branching 2.5
I raised YARN-2247 as the blocker of 2.5.0. On Thu, Jul 17, 2014 at 9:42 AM, Wangda Tan wheele...@gmail.com wrote: Hi Karthik, I found I cannot access the filter: http://s.apache.org/vJg. Could you please check its permission? I'd like to know if there's any related issues to me. :) Thanks, Wangda On Thu, Jul 17, 2014 at 5:54 AM, Karthik Kambatla ka...@cloudera.com wrote: We are down to 4 blockers and looks like they are all actively being worked on. Please reconsider marking new JIRAs as blockers. Thanks Karthik PS: I moved out a couple of JIRAs that didn't seem like true blockers to 2.6. On Wed, Jul 9, 2014 at 11:43 AM, Karthik Kambatla ka...@cloudera.com wrote: Folks, We have 10 blockers for 2.5. Can the people working on them revisit and see if they are really blockers. If they are, can we try to get them in soon? It would be nice to get an RC out the end of this week or at least early next week? Thanks Karthik On Wed, Jul 2, 2014 at 11:32 PM, Karthik Kambatla ka...@cloudera.com wrote: I just 1. moved non-blocker 2.5 JIRAs to 2.6. 2. created branch-2.5 and added sections for 2.6.0 in all CHANGES.txt in trunk and branch-2. 3. Will create branch-2.5.0 when we are ready to create an RC There are 11 pending blockers for 2.5.0: http://s.apache.org/vJg Committers - please exercise caution when merging to branch-2.5 and target non-blockers preferably to 2.6 On Wed, Jul 2, 2014 at 10:24 PM, Karthik Kambatla ka...@cloudera.com wrote: Committers, I am working on branching 2.5. Will send an update as soon as I am done branching. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. 
If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Created] (HADOOP-10794) A hadoop cluster needs clock synchronization
Zhijie Shen created HADOOP-10794: Summary: A hadoop cluster needs clock synchronization Key: HADOOP-10794 URL: https://issues.apache.org/jira/browse/HADOOP-10794 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen As a distributed system, a hadoop cluster wants the clocks on all the participating hosts synchronized. Otherwise, some problems might happen. For example, in MAPREDUCE-5940, because the clock on the host of the task container falls behind that on the host of the AM container, the computed elapsed time (the diff between the timestamps produced on two hosts) becomes negative. In MAPREDUCE-5940, we tried to mask the negative elapsed time. However, we should seek a decent long-term solution, such as providing a mechanism to perform and check clock synchronization. -- This message was sent by Atlassian JIRA (v6.2#6252)
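The arithmetic behind the negative elapsed time can be sketched as follows. ClockSkewSketch and its methods are hypothetical; clamping at zero is only one assumed way of "masking" the problem and is not taken from the MAPREDUCE-5940 patch, and the tolerance check is a made-up illustration of what a synchronization check might compare.

```java
class ClockSkewSketch {
    // Elapsed time computed from timestamps taken on two different hosts.
    static long rawElapsedMillis(long startOnAmHost, long finishOnTaskHost) {
        return finishOnTaskHost - startOnAmHost;
    }

    // Masking: hide a negative value by clamping at zero (assumed approach).
    static long maskedElapsedMillis(long startOnAmHost, long finishOnTaskHost) {
        return Math.max(0L, rawElapsedMillis(startOnAmHost, finishOnTaskHost));
    }

    // A possible check: are two clock readings within a tolerated skew?
    static boolean withinTolerance(long localNowMillis, long remoteNowMillis,
                                   long toleranceMillis) {
        return Math.abs(localNowMillis - remoteNowMillis) <= toleranceMillis;
    }

    public static void main(String[] args) {
        long amStart = 1_000_000L;          // stamped by the AM host clock
        long taskFinish = amStart - 2_500L; // task host clock runs behind
        System.out.println(rawElapsedMillis(amStart, taskFinish));
        System.out.println(maskedElapsedMillis(amStart, taskFinish));
        System.out.println(withinTolerance(amStart, taskFinish, 1_000L));
    }
}
```

Masking keeps downstream consumers from seeing a negative duration, but the reported value is still wrong, which is the motivation for checking synchronization instead.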
Re: [VOTE] Change by-laws on release votes: 5 days instead of 7
+1 (non-binding) On Wed, Jun 25, 2014 at 1:26 AM, Aaron T. Myers a...@cloudera.com wrote: +1 (binding) -- Aaron T. Myers Software Engineer, Cloudera On Tue, Jun 24, 2014 at 1:53 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, As discussed, I'd like to call a vote on changing our by-laws to change release votes from 7 days to 5. I've attached the change to by-laws I'm proposing. Please vote, the vote will run for the usual period of 7 days. thanks, Arun
[main]$ svn diff
Index: author/src/documentation/content/xdocs/bylaws.xml
===
--- author/src/documentation/content/xdocs/bylaws.xml (revision 1605015)
+++ author/src/documentation/content/xdocs/bylaws.xml (working copy)
@@ -344,7 +344,16 @@
 <p>Votes are open for a period of 7 days to allow all active
 voters time to consider the vote. Votes relating to code changes are
 not subject to a strict timetable but should be
-made as timely as possible.</p></li>
+made as timely as possible.</p>
+
+ <ul>
+ <li> <strong>Product Release - Vote Timeframe</strong>
+ <p>Release votes, alone, run for a period of 5 days. All other
+ votes are subject to the above timeframe of 7 days.</p>
+ </li>
+ </ul>
+ </li>
+ </ul>
 </section>
 </body>
-- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Zhijie Shen Hortonworks Inc. 
http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: [VOTE] Release Apache Hadoop 0.23.11
+1 (non-binding) built it from source and ran some MR examples successfully. On Mon, Jun 23, 2014 at 11:59 PM, Mit Desai mitde...@yahoo-inc.com.invalid wrote: +1 (non-binding) Tested on: Fedora17 -Successful build from src -Verified Signature -Deployed source to my single node cluster and ran couple of sample MR jobs -Mit Desai On 6/19/14, 10:14 AM, Thomas Graves tgra...@yahoo-inc.com.INVALID wrote: Hey Everyone, There have been various bug fixes that have gone into branch-0.23 since the 0.23.10 release. We think it's time to do a 0.23.11. This is also the last planned release off of branch-0.23 we plan on doing. The RC is available at: http://people.apache.org/~tgraves/hadoop-0.23.11-candidate-0/ The RC Tag in svn is here: http://svn.apache.org/viewvc/hadoop/common/tags/release-0.23.11-rc0/ The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days til June 26th. I am +1 (binding). thanks, Tom Graves -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: [VOTE] Release Apache Hadoop 2.4.1
Sounds good to me. Remove MAPREDUCE-5831 out of the scope of 2.4.1. On Sat, Jun 21, 2014 at 2:29 AM, Arun C Murthy a...@hortonworks.com wrote: On Jun 20, 2014, at 11:23 AM, Vinod Kumar Vavilapalli vino...@apache.org wrote: Unfortunately even though we documented wire compatiblity, cross-version client/server support doesn't yet really work for YARN and MapReduce. We can only do that once we have wire-compatibility and eventually rolling upgrades. No fix is in sight for MAPREDUCE-5831 - for now both clients and AMs have to upgrade together in the MapReduce land.. Actually, that is a reasonable expectation - particularly because we should all be migrating towards MAPREDUCE-4421 and should stop installing MR on every node. Arun +Vinod On Wed, Jun 18, 2014 at 6:52 PM, Zhijie Shen zs...@hortonworks.com wrote: Point to another MR compatibility issue marked for 2.4.1: MAPREDUCE-5831 Old MR client is not compatible with new MR application, though it happens since 2.3. It would be good to figure out whether we include it now or later. It seems that we're going to be in a better position once we have versioning for MR components. -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Arun C. Murthy Hortonworks Inc. 
http://hortonworks.com/hdp/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/ -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Created] (HADOOP-10718) IOException: An existing connection was forcibly closed by the remote host frequently happens on Windows
Zhijie Shen created HADOOP-10718: Summary: IOException: An existing connection was forcibly closed by the remote host frequently happens on Windows Key: HADOOP-10718 URL: https://issues.apache.org/jira/browse/HADOOP-10718 Project: Hadoop Common Issue Type: Bug Components: ipc Reporter: Zhijie Shen
After HADOOP-317, we still observe that on the Windows platform a number of "IOException: An existing connection was forcibly closed by the remote host" errors occur when running an MR job. For example,
{code}
2014-06-09 09:11:40,675 INFO [Socket Reader #3 for port 59622] org.apache.hadoop.ipc.Server: Socket Reader #3 for port 59622: readAndProcess from client 10.215.30.53 threw exception [java.io.IOException: An existing connection was forcibly closed by the remote host]
java.io.IOException: An existing connection was forcibly closed by the remote host
    at sun.nio.ch.SocketDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
    at sun.nio.ch.IOUtil.read(IOUtil.java:198)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
    at org.apache.hadoop.ipc.Server.channelRead(Server.java:2558)
    at org.apache.hadoop.ipc.Server.access$2800(Server.java:130)
    at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1459)
    at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:750)
    at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:624)
    at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:595)
{code}
{code}
2014-06-09 09:15:38,539 WARN [main] org.apache.hadoop.mapred.Task: Failure sending commit pending: java.io.IOException: Failed on local exception: java.io.IOException: An existing connection was forcibly closed by the remote host; Host Details : local host is: sdevin-clster53/10.215.16.72; destination host is: sdevin-clster54:63415;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
    at org.apache.hadoop.ipc.Client.call(Client.java:1414)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
    at com.sun.proxy.$Proxy9.commitPending(Unknown Source)
    at org.apache.hadoop.mapred.Task.done(Task.java:1006)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:397)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: An existing connection was forcibly closed by the remote host
    at sun.nio.ch.SocketDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
    at sun.nio.ch.IOUtil.read(IOUtil.java:198)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:359)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:510)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1054)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:949)
{code}
The latter one results in the issue described in MAPREDUCE-5924.
-- This message was sent by Atlassian JIRA (v6.2#6252)
Re: [VOTE] Release Apache Hadoop 2.4.1
Pointing to another MR compatibility issue marked for 2.4.1: MAPREDUCE-5831 (Old MR client is not compatible with new MR application), though it has been happening since 2.3. It would be good to figure out whether we include it now or later. It seems that we're going to be in a better position once we have versioning for MR components. Other than that, +1 (non-binding) for rc0. I've downloaded the source code, built the executable from it, run through MR examples and DS jobs, checked the metrics in the timeline server, and passed the test cases mentioned in the change log. - Zhijie On Thu, Jun 19, 2014 at 5:45 AM, Mayank Bansal maban...@gmail.com wrote: I think we should fix this one; that will help older clients (2.2/2.3) not to be updated if not absolutely required. Thanks, Mayank On Wed, Jun 18, 2014 at 12:13 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: There is one item [MAPREDUCE-5830 HostUtil.getTaskLogUrl is not backwards binary compatible with 2.3] marked for 2.4. Should we include it? There is no patch there yet, and it doesn't really help much other than letting older clients compile - even if we put the API back in, the URL returned is invalid. +Vinod On Jun 16, 2014, at 9:27 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.4.1 (bug-fix release) that I would like to push out. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.1-rc0 The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.1-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/hdp/ -- Thanks and Regards, Mayank Cell: 408-718-9370 -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
[jira] [Created] (HADOOP-10600) AuthenticationFilterInitializer doesn't allow null signature secret file
Zhijie Shen created HADOOP-10600: Summary: AuthenticationFilterInitializer doesn't allow null signature secret file Key: HADOOP-10600 URL: https://issues.apache.org/jira/browse/HADOOP-10600 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen AuthenticationFilterInitializer doesn't allow null signature secret file. However, null signature secret is acceptable in AuthenticationFilter, and a random signature secret is going to be created instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
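A minimal sketch of the fallback behaviour the report describes for AuthenticationFilter (no secret file configured, so a random signature secret is created instead). This is an illustration in Python, not Hadoop code: the function name `resolve_signature_secret` and the 32-byte secret length are assumptions made for the example.

```python
import secrets
from pathlib import Path
from typing import Optional


def resolve_signature_secret(secret_file: Optional[str]) -> bytes:
    """Return the signing secret, generating a random one if no file is configured.

    Hypothetical helper: the name and the 32-byte length are illustrative only.
    """
    if secret_file is None:
        # No file configured: fall back to a random secret instead of failing.
        return secrets.token_bytes(32)
    data = Path(secret_file).read_bytes().strip()
    if not data:
        raise ValueError(f"signature secret file is empty: {secret_file}")
    return data
```

With this shape, an initializer that passes `None` through behaves the same way as the filter itself, which is the consistency the ticket asks for.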
[jira] [Created] (HADOOP-10596) HttpServer2 should apply the authentication filter to some urls instead of null
Zhijie Shen created HADOOP-10596: Summary: HttpServer2 should apply the authentication filter to some urls instead of null Key: HADOOP-10596 URL: https://issues.apache.org/jira/browse/HADOOP-10596 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen HttpServer2 should apply the authentication filter to some urls instead of null. In addition, it should be more flexible for users to configure SPNEGO. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10601) We should prevent AuthenticationFilter to be installed twice
Zhijie Shen created HADOOP-10601: Summary: We should prevent AuthenticationFilter to be installed twice Key: HADOOP-10601 URL: https://issues.apache.org/jira/browse/HADOOP-10601 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen It seems that we have two ways to install the authentication filter (at least from the YARN side): 1. have SPNEGO configs and use withHttpSpnego when starting the web app; 2. add AuthenticationFilterInitializer to the configured list of filter initializers. If both ways are used, it seems that two AuthenticationFilter instances will be created, which is not expected. It would be good to allow one or the other, but not both. -- This message was sent by Atlassian JIRA (v6.2#6252)
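One way to picture the "one or the other" rule from this report is an install-once guard. The sketch below is a generic Python illustration, not HttpServer2 code: `FilterRegistry`, `install`, and the filter name `"authentication"` are all hypothetical; only `withHttpSpnego` and `AuthenticationFilterInitializer` come from the report, and they are used here purely as labels.

```python
from typing import Dict, List, Optional

AUTH_FILTER = "authentication"  # hypothetical filter name used as the registry key


class FilterRegistry:
    """Keeps at most one installation per filter name (illustrative only)."""

    def __init__(self) -> None:
        self._filters: Dict[str, str] = {}
        self.skipped: List[str] = []  # sources whose duplicate install was ignored

    def install(self, name: str, source: str) -> bool:
        """Install the filter unless one with the same name already exists."""
        if name in self._filters:
            self.skipped.append(source)
            return False
        self._filters[name] = source
        return True

    def installed_by(self, name: str) -> Optional[str]:
        """Return which code path installed the filter, if any."""
        return self._filters.get(name)


registry = FilterRegistry()
registry.install(AUTH_FILTER, "withHttpSpnego")
registry.install(AUTH_FILTER, "AuthenticationFilterInitializer")  # ignored
```

The first path to register wins here; whether the real fix should prefer one path, or reject the configuration outright, is a design choice the ticket leaves open.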
[jira] [Created] (HADOOP-10550) HttpAuthentication.html is out of date
Zhijie Shen created HADOOP-10550: Summary: HttpAuthentication.html is out of date Key: HADOOP-10550 URL: https://issues.apache.org/jira/browse/HADOOP-10550 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.4.0, 3.0.0 Reporter: Zhijie Shen Priority: Minor It is still saying: {code} By default Hadoop HTTP web-consoles (JobTracker, NameNode, TaskTrackers and DataNodes) allow access without any form of authentication. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
Wiki Edit Permission
To whom it may concern, would you mind granting me Wiki edit permission? My username is Zhijie Shen. Thanks, Zhijie -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: Thinking ahead
On Sun, Apr 13, 2014 at 5:50 PM, Arun C Murthy a...@hortonworks.com wrote: I just opened https://issues.apache.org/jira/browse/YARN-1935. I've just assigned the ticket to me, and opened/linked a number of related tickets. I'll work on the security issues of the timeline server. Thanks, Zhijie -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: Thinking ahead
+1 for Timeline Server stability. In addition to security, we may also want to deal with scalability, generic and per-framework services integration, and MR integration. Any plan for YARN long-running services? On Sat, Apr 12, 2014 at 10:09 AM, Sandy Ryza sandy.r...@cloudera.com wrote: +1 for starting to think about 2.5. Early June seems a little early to me - we had talked about a quarterly release cadence and this would be about half that. I'm having trouble editing the wiki, but I think Timeline Server stability (e.g. security and locking down APIs) should go on that list. I think we should probably take YARN-1404 off the list - even with 3 months it's unlikely to be complete. On Sat, Apr 12, 2014 at 9:50 AM, Chris Nauroth cnaur...@hortonworks.com wrote: +1 The proposed content for 2.5 in the roadmap wiki looks good to me. On Apr 12, 2014 7:26 AM, Arun C Murthy a...@hortonworks.com wrote: Gang, With hadoop-2.4 out, it's time to think ahead. In the short-term hadoop-2.4.1 is in order; particularly with https://issues.apache.org/jira/browse/MAPREDUCE-5830 (it's a break to @Private API, unfortunately something Hive is using - sigh!). There are some other fixes which testing has uncovered; so it will be nice to pull them in. I'm thinking of an RC by end of the coming week - committers, please be *very* conservative when getting stuff into 2.4.1 (i.e. merging to branch-2.4). Next up, hadoop-2.5. I've updated https://wiki.apache.org/hadoop/Roadmap with some candidates for consideration - please chime in and say 'aye'/'nay' or add new content. IAC, I suspect that list is too large. Rather than wait for everything it would be better to plan on releasing it in a time-bound manner; particularly around the Hadoop Summit. If that makes sense; I think we should target branching for 2.5 by mid-May to get it stable and released by early June. Thoughts? thanks, Arun -- Arun C. Murthy Hortonworks Inc.
http://hortonworks.com/ -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.4.0
+1 non-binding I built from source code, set up a single-node non-secure cluster with almost the default configurations, and ran a number of MR example jobs and distributedshell jobs. I verified that the generic and the per-framework (DS job only) historic information has been persisted, and that such information is accessible via web UI, RESTful APIs and CLI. Thanks, Zhijie On Wed, Apr 2, 2014 at 1:26 PM, Jian He j...@hortonworks.com wrote: +1 non-binding Built from source code, tested on a single node cluster. Successfully ran a few MR sample jobs. Tested RM restart while job is running. Thanks, Jian On Tue, Apr 1, 2014 at 5:42 PM, Travis Thompson tthomp...@linkedin.com wrote: +1 non-binding Built from git. Started with 120 node 2.3.0 cluster with security and non HA, ran upgrade (non rolling) to 2.4.0. Confirmed fsimage is OK and HDFS successfully upgraded. Also successfully ran some pig jobs and mapreduce examples. Haven't found any issues yet but will continue testing. Did not test Timeline Server since I'm using security. Thanks, Travis On 03/31/2014 02:24 AM, Arun C Murthy wrote: Folks, I've created a release candidate (rc0) for hadoop-2.4.0 that I would like to get released. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.0-rc0 The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/ -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.4.0
Hi Sandy, Application History Server (we prefer to call it Timeline Server instead now) is going to be shipped in 2.4 without security. I've already written a document about what it is, what the current status is, and how to configure and start it. Hopefully the document makes the timeline server clear to users. There's another jira, YARN-1876, in which I've attached a patch about the REST APIs for using generic history data and per-framework data. However, it seems to be too late to get it in. For the timeline store, we already have one implementation based on Leveldb. We have monitored its performance closely and reviewed its license, and believe it should be in good shape now. Currently the timeline store is for storing per-framework data only, while we eventually hope to move the generic data there as well. Thanks, Zhijie On Mon, Mar 31, 2014 at 11:02 AM, Sandy Ryza sandy.r...@cloudera.com wrote: What's the state of the application history server? Do we have security, documentation, and are APIs stable? If any of these are missing, do we have a plan for how to make this clear to users? What about the timeline store? thanks, Sandy On Mon, Mar 31, 2014 at 2:22 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.4.0 that I would like to get released. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.0-rc0 The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/ -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.4.0
Hi Sandy, Please find TimelineServer.apt.vm. Thanks, Zhijie On Mon, Mar 31, 2014 at 11:47 AM, Sandy Ryza sandy.r...@cloudera.com wrote: Thanks Zhijie - where is that document located? On Mon, Mar 31, 2014 at 11:23 AM, Zhijie Shen zs...@hortonworks.com wrote: Hi Sandy, Application History Server (we prefer to call it Timeline Server instead now) is going to be shipped in 2.4 without security. I've already written a document about what it is, what the current status is, and how to configure and start it. Hopefully the document makes the timeline server clear to users. There's another jira, YARN-1876, in which I've attached a patch about the REST APIs for using generic history data and per-framework data. However, it seems to be too late to get it in. For the timeline store, we already have one implementation based on Leveldb. We have monitored its performance closely and reviewed its license, and believe it should be in good shape now. Currently the timeline store is for storing per-framework data only, while we eventually hope to move the generic data there as well. Thanks, Zhijie On Mon, Mar 31, 2014 at 11:02 AM, Sandy Ryza sandy.r...@cloudera.com wrote: What's the state of the application history server? Do we have security, documentation, and are APIs stable? If any of these are missing, do we have a plan for how to make this clear to users? What about the timeline store? thanks, Sandy On Mon, Mar 31, 2014 at 2:22 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.4.0 that I would like to get released. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.4.0-rc0 The RC tag in svn is here: https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc.
http://hortonworks.com/ -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: Subscribe to Mailing List
Please search for "Subscribe to List" on https://hadoop.apache.org/mailing_lists.html to find the subscription links for each list. On Mon, Mar 17, 2014 at 11:44 AM, Suraj Nayak snay...@gmail.com wrote: Kindly Subscribe to the Mailing List Thanks Suraj Nayak -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
[jira] [Created] (HADOOP-10397) No 64-bit native lib in Hadoop releases
Zhijie Shen created HADOOP-10397: Summary: No 64-bit native lib in Hadoop releases Key: HADOOP-10397 URL: https://issues.apache.org/jira/browse/HADOOP-10397 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Recently, I had a chance to talk to a Hadoop user, who complained that there's no 64-bit native lib in Hadoop releases, and that it is user-unfriendly to make users download all the dependencies to build the 64-bit lib themselves. Hence I checked the two most recent releases, 2.2 and 2.3, whose native libs are both ELF 32-bit LSB shared objects. I'm not aware of the reason why we don't release 64-bit, but I'd like to open this ticket to tackle the issue, given that we haven't before. -- This message was sent by Atlassian JIRA (v6.2#6252)
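For anyone wanting to repeat the check described in this report without extra tools, here is a minimal Python sketch that reads the ELF identification bytes. It relies on the ELF header layout (byte 4 is the class, 1 for 32-bit and 2 for 64-bit; byte 5 is the data encoding, 1 for LSB and 2 for MSB). The function name `elf_class` is made up for the example, and no particular library path inside a Hadoop release is assumed.

```python
def elf_class(header: bytes) -> str:
    """Describe an ELF file from its first six bytes, e.g. 'ELF 32-bit LSB'."""
    if len(header) < 6 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] (EI_CLASS): 1 = 32-bit objects, 2 = 64-bit objects
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    # e_ident[5] (EI_DATA): 1 = little-endian (LSB), 2 = big-endian (MSB)
    order = {1: "LSB", 2: "MSB"}.get(header[5], "unknown encoding")
    return f"ELF {bits} {order}"
```

To apply it to a real file, read the first six bytes in binary mode and pass them in, for example `elf_class(open(path, "rb").read(6))`, where `path` points at whichever native library is being inspected.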
Re: Re-swizzle 2.3
making much progress till he is back; so I'm inclined to push both to 2.4 too. Any objections? Looks like Daryn has both HADOOP-10301 and HDFS-4564 covered. Overall, I'll try to get this out in the next couple of days if we can clear the list. thanks, Arun On Feb 3, 2014, at 12:14 PM, Arun C Murthy a...@hortonworks.com wrote: An update. Per https://s.apache.org/hadoop-2.3.0-blockers we are now down to 5 blockers: 1 Common, 1 HDFS, 3 YARN. Daryn (thanks!) has both the non-YARN ones covered. Vinod is helping out with the YARN ones. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/ -- Alejandro -- Zhijie Shen Hortonworks Inc.
http://hortonworks.com/
[VOTE] Merge YARN-321 Generic Application History Service to trunk
Hi folks, As previously discussed here (http://markmail.org/message/iscvp7cedrtvmd6p), I would like to call a vote to merge the YARN-321 branch for Generic Application History Server into trunk. *Scope of the changes* The changes enable ResourceManager to record the historic information of the application, the application attempt and the container in terms of events via a history writer. In addition, the changes set up an application history server, which allows users to access the recorded information via RPC interface, web UI and REST APIs. *Details of development* - Development of the feature is tracked in the jira - https://issues.apache.org/jira/browse/YARN-321. - Development has been done in a separate branch - https://svn.apache.org/repos/asf/hadoop/common/branches/YARN-321. - The feature development involved about 35 subtasks. - The up-to-date design is posted at - https://issues.apache.org/jira/secure/attachment/12619638/Generic%C2%A0Application%20History%20-%20Design-20131219.pdf - The uber merge patch Jira - https://issues.apache.org/jira/browse/YARN-1587 *Testing* A number of unit tests have been added as a part of the feature. In addition, we've also done end-to-end functional tests, and performance tests for HDFS-based history storage and history events processing. Last but not least, we have updated branch YARN-321 against the latest trunk, resolved merge conflicts, fixed test failures caused by the merge, and corrected a number of source code issues. The uber merge patch that contains all the diff between branch YARN-321 and trunk has been run through Jenkins. *Pending work* - Make it work in secure mode - Pending bug fixes We wish to merge the branch now instead of waiting for later. The main reason for this is that as the branch grew in size, the cost of its maintenance became huge.
Once the feature is merged into trunk, we will continue to work on pending items such as security, test and fix any bugs that may be found on trunk, and refactor the code to share some pieces between the RPC and web interfaces. *Release status* If the security work and the pending fixes arrive by the time everything else planned for Release 2.4 is done, we can include it as well. This is what we are striving for. Otherwise, we will call AHS not-feature-complete and not stable. The bulk of the design and implementation was done by Mayank Bansal and me with contributions from Devaraj K and Vinod Kumar Vavilapalli amongst others. Also, thanks to Robert Joseph Evans and Sandy Ryza for providing feedback on the design discussions. This vote runs for a week and closes on 1/24/2014 at 11:59 pm PT. Thanks, Zhijie -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
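The recording path described in this vote (application, attempt and container history captured as events via a history writer) can be pictured with the following Python sketch. It is not the YARN-321 implementation: `HistoryEvent`, `HistoryWriter` and the in-memory `store` are hypothetical stand-ins, and draining the queue on a background thread is an assumption made so that recording never blocks the caller.

```python
import queue
import threading
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HistoryEvent:
    entity_type: str  # "application", "attempt" or "container"
    entity_id: str
    event: str        # e.g. "started" or "finished"


class HistoryWriter:
    """Queues events and persists them on a background thread (illustrative)."""

    def __init__(self) -> None:
        self._queue: "queue.Queue[Optional[HistoryEvent]]" = queue.Queue()
        self.store: List[HistoryEvent] = []  # in-memory stand-in for real storage
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def record(self, event: HistoryEvent) -> None:
        """Called by the producer; returns immediately."""
        self._queue.put(event)

    def close(self) -> None:
        """Flush everything queued so far, then stop the background thread."""
        self._queue.put(None)  # sentinel marks the end of the stream
        self._thread.join()

    def _drain(self) -> None:
        while True:
            event = self._queue.get()
            if event is None:
                return
            self.store.append(event)
```

A history server process would then read from the same store to serve RPC, web UI and REST requests; that read side is left out of the sketch.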
Re: [Discussion] Merge YARN-321 into Branch-2
Hi folks, Thanks for replying. Please see the responses below. bq. 2* The last update of YARN-321 was done in NOV10, this was done from branch-2 (that seems a NIT as it should be against trunk) Basically it's a discussion thread. I'm already in the process of updating the branch. As I mentioned before, one of the motivations for merging branch YARN-321 is to prevent it from falling further behind. bq. 1* The merge must be done in trunk first (then, ideally, from trunk into branch-2) bq. 3* IMO, until we don’t have security, we should merge into trunk only Yes, we plan to merge to trunk, then to branch-2, but I agree, let's have security ready before going to branch-2. We can continue with security on trunk. bq. Regarding doc, while we don't necessarily need full documentation before merging, my feeling is that we should at least have a page on it that will allow give cluster operators a sketch of its role, APIs, and implications. Agreed, let's prepare a doc as well. bq. My opinion is that we should mark each API @Stable unless there is a strong reason for it not to be. When I say APIs, I am thinking of the REST APIs, RPC interface, and RM-AHS shared-bus. I'm afraid it's still too early to confirm which APIs can be marked @Stable. How about not doing this until security is ready and the refactoring of duplicate code is done? At that time, we should have a clearer picture of it. Thanks, Zhijie On Mon, Jan 6, 2014 at 3:02 PM, Alejandro Abdelnur t...@cloudera.com wrote: This is great news. A few things to consider before doing the merge: 1* The merge must be done in trunk first (then, ideally, from trunk into branch-2) 2* The last update of YARN-321 was done in NOV10, this was done from branch-2 (that seems a NIT as it should be against trunk) 3* IMO, until we don’t have security, we should merge into trunk only I would like to see #1 and #2 taken care before making a decision. 
The reason for this is that if the source changes outside of the AHS are too pervasive, then we may end up be making difficult backports from trunk to branch-2 and release branches because of the delta. Can we get a rebase of the YARN-321 to the head of trunk to see if my concerns are valid or not? Thanks. Alejandro On Mon, Jan 6, 2014 at 1:11 AM, Sandy Ryza sandy.r...@cloudera.com wrote: Very excited for this feature and appreciative of all the work put into this. Reviewed the JIRA and my only two remaining concerns are about documentation and API stability. Regarding doc, while we don't necessarily need full documentation before merging, my feeling is that we should at least have a page on it that will allow give cluster operators a sketch of its role, APIs, and implications. If this doesn't exist yet, would be happy to help with reviewing. Regarding APIs, I imagine some will jump on this as soon as it becomes part of a release as it's a fairly essential feature for non-MR apps. My opinion is that we should mark each API @Stable unless there is a strong reason for it not to be. When I say APIs, I am thinking of the REST APIs, RPC interface, and RM-AHS shared-bus. -Sandy On Sun, Jan 5, 2014 at 8:33 PM, lohit lohit.vijayar...@gmail.com wrote: +1 to merge into branch2 now. On Jan 5, 2014 6:22 PM, Zhijie Shen zs...@hortonworks.com wrote: Hi folks, Majority of the functionality of Application History Server has been completed on branch YARN-321. AHS can now work end-to-end. ResourceManager records the historical information of the application, the application attempt and the container in terms of events via a history writer on a separate thread. The historical information is going to be persisted on HDFS. On the other side, an application history server runs as a separate process, and it allows users to access the historical information via RPC interface, web UI and REST APIs. According to the proposal, the only major missing piece is security. 
Except it, YARN-321 should be pretty much ready to be merged to Branch-2. We propose to merge YARN-321 to Branch-2 now, such that we can prevent YARN-321 from falling behind too much, and reduce the effort of editing merge conflicts. After merge, we can continue on security, refactoring duplicate code, fixing bugs and other improvements. Anyone who is interested in looking at AHS can review/play with YARN-321 branch: http://svn.apache.org/viewvc/hadoop/common/branches/YARN-321/ You can also have a look at the latest design doc: https://issues.apache.org/jira/secure/attachment/12619638/Generic%20Application%20History%20-%20Design-20131219.pdf If there are no objections, we'll push towards updating the branch, running the patch through Jenkins and getting ready for merge vote by the end of this week. Thanks, Zhijie -- Zhijie Shen Hortonworks Inc
[Discussion] Merge YARN-321 into Branch-2
Hi folks, The majority of the functionality of Application History Server has been completed on branch YARN-321. AHS can now work end-to-end. ResourceManager records the historical information of the application, the application attempt and the container in terms of events via a history writer on a separate thread. The historical information is going to be persisted on HDFS. On the other side, an application history server runs as a separate process, and it allows users to access the historical information via RPC interface, web UI and REST APIs. According to the proposal, the only major missing piece is security. Apart from that, YARN-321 should be pretty much ready to be merged to Branch-2. We propose to merge YARN-321 to Branch-2 now, so that we can prevent YARN-321 from falling behind too much, and reduce the effort of resolving merge conflicts. After the merge, we can continue on security, refactoring duplicate code, fixing bugs and other improvements. Anyone who is interested in looking at AHS can review/play with the YARN-321 branch: http://svn.apache.org/viewvc/hadoop/common/branches/YARN-321/ You can also have a look at the latest design doc: https://issues.apache.org/jira/secure/attachment/12619638/Generic%20Application%20History%20-%20Design-20131219.pdf If there are no objections, we'll push towards updating the branch, running the patch through Jenkins and getting ready for a merge vote by the end of this week. Thanks, Zhijie -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: MapReduce V1 vs MapReduce V2
Hi Matt, in my opinion, the basic difference between MapReduce V1 and V2 is not about the mapred or mapreduce API package, but about the platform that runs the job. In MapReduce V1, the job was managed by JobTracker and TaskTracker. In MapReduce V2, the resource management part of the MapReduce project has been spun off and has evolved into YARN, a generic distributed resource management system. MapReduce as well as other types of applications can run on this common platform. The remaining part, which is the code base of MapReduce V2, is a pure distributed computation framework. With regard to the API packages, both mapred.* and mapreduce.* have existed since MapReduce V1, but mapreduce.* has been evolving a lot. If you're writing a new MapReduce application against the latest Hadoop libraries, it's MapReduce V2 no matter whether you're using mapred.* or mapreduce.*. If you already have some MapReduce applications that were built with the MapReduce V1 framework and use mapred.* APIs, they are supposed to run on YARN without problems. However, if those applications use mapreduce.* APIs, you may need to recompile them against the MapReduce V2 framework to be able to run them on YARN. Here are a few resources that you may want to look at for further information: http://hortonworks.com/hadoop/yarn/ http://hortonworks.com/blog/running-existing-applications-on-hadoop-2-yarn/ http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/ Thanks, Zhijie On Fri, Jan 3, 2014 at 2:19 AM, Matt Fellows matt.fell...@bespokesoftware.com wrote: I'm thoroughly confused about which API is the recent one, which is the old one and which method I should be using to write MapReduce applications. I'm under the impression that MRv2 is primarily driven by the org.apache.hadoop.mapreduce.* packages and MRv1 is primarily driven by the org.apache.hadoop.mapred.* packages. 
I've been led to believe that MRv2 applications extend MapReduceBase and implement Mapper, Reducer etc. and conversely the MRv1 applications extend Mapper, Reducer directly. However I can not find a canonical statement to back any of this up. What's more I keep finding conflicting statements about these, such as 'Hadoop - the definitive guide' gives example in MRv2 format but then I look at the examples and they use org.apache.hadoop.mapreduce.* packages, but extend Mapper and extend Reducer, not MapReduceBase... Can someone either point me at a canonical resource or just confirm / deny my assumptions? Kind regards -- First Option Software Ltd Signal House Jacklyns Lane Alresford SO24 9JJ Tel: +44 (0)1962 738232 Mob: +44 (0)7710 160458 Fax: +44 (0)1962 600112 Web: www.bespokesoftware.com This is confidential, non-binding and not company endorsed - see full terms at www.fosolutions.co.uk/emailpolicy.html First Option Software Ltd Registered No. 06340261 Signal House, Jacklyns Lane, Alresford, Hampshire, SO24 9JJ, U.K. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.2.0
+1 (non-binding) Built the source files, tried Hadoop-1 examples on it, ran some MR jobs while RM restarted on a single-node cluster and checked JHS. Zhijie On Tue, Oct 8, 2013 at 3:13 PM, Sandy Ryza sandy.r...@cloudera.com wrote: +1 (non-binding) Built from source and ran a few jobs on a pseudo-distributed cluster with the Fair Scheduler. On Tue, Oct 8, 2013 at 6:48 AM, Thomas Graves tgra...@yahoo-inc.com wrote: +1. Downloaded, verified signature/md5, CHANGES.txt, NOTICE, LICENSE, README, release notes, built the source tar ball, and ran a few small jobs on a pseudo-distributed cluster. Tom On 10/7/13 2:00 AM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.2.0 that I would like to get released - this release fixes a small number of bugs and some protocol/api issues which should ensure they are now stable and will not change in hadoop-2.x. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.2.0-rc0 The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun P.S.: Thanks to Colin, Andrew, Daryn, Chris and others for helping nail down the symlinks-related issues. I'll release note the fact that we have disabled it in 2.2. Also, thanks to Vinod for some heavy-lifting on the YARN side in the last couple of weeks. -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/ -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release Apache Hadoop 2.1.1-beta
I've added MAPREDUCE-5531 to the blocker list. - Zhijie On Tue, Sep 24, 2013 at 3:41 PM, Arun C Murthy a...@hortonworks.com wrote: With 4 +1s (3 binding) and no -1s the vote passes. I'll push it out… I'll make it clear on the release page, that there are some known issues and that we will follow up very shortly with another release. Meanwhile, let's fix the remaining blockers (please mark them as such with Target Version 2.1.2-beta). The current blockers are here: http://s.apache.org/hadoop-2.1.2-beta-blockers thanks, Arun On Sep 16, 2013, at 11:38 PM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.1.1-beta that I would like to get released - this release fixes a number of bugs on top of hadoop-2.1.0-beta as a result of significant amounts of testing. If things go well, this might be the last of the *beta* releases of hadoop-2.x. The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.1.1-beta-rc0 The RC tag in svn is here: http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.1.1-beta-rc0 The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/ -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/ -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: git line endings trouble since recent trunk
The problem seems to happen again on the latest trunk On Mon, Jul 1, 2013 at 11:44 AM, Colin McCabe cmcc...@alumni.cmu.eduwrote: On Mon, Jul 1, 2013 at 11:09 AM, Raja Aluri r...@cmbasics.com wrote: They are not that many that I can think of unless you are using notepad for editing, but some of the windows related files might require CRLF. This can be handled in .gitattributes I think it's a very good idea to add this check to test-patch script and reject the patch based on the CRLF check. -Raja I don't see why you would need a hook. Just set svn:eol-style=crlf on the file that need CRLF. On Mon, Jul 1, 2013 at 11:00 AM, Alejandro Abdelnur t...@cloudera.com wrote: why not just add a precommit hook in svn to reject commits with CRLF? On Mon, Jul 1, 2013 at 10:51 AM, Raja Aluri r...@cmbasics.com wrote: I added a couple of links that discusses 'line endings' when I added .gitattributes in this JIRA. HADOOP-8912https://issues.apache.org/jira/browse/HADOOP-8912 I am just reproducing them here. 1. http://git-scm.com/docs/gitattributes#_checking_out_and_checking_in 2. http://stackoverflow.com/questions/170961/whats-the-best-crlf-handling-strategy-with-git Regardless of what we do or don't do in git, we should have the line endings correct in subversion. cheers. Colin -- Raja On Mon, Jul 1, 2013 at 10:42 AM, Colin McCabe cmcc...@alumni.cmu.edu wrote: On Sat, Jun 29, 2013 at 5:18 PM, Luke Lu l...@vicaya.com wrote: The problem is due to relnotes.py generating the html containing some CRLF (from JIRA) and the release manager not using git-svn, which caused the html with mixed eol getting checked in. The problem would then manifest for git users due to text=auto in .gitattributes (see HADOOP-8912) that auto converts CRLF to LF, hence the persistent modified status of the releasenotes.html for a fresh checkout. Adding svn:eol-style=native would only fix half the problem. 
We need to fix relnotes.py to avoid generating html with mixed eols (by normalizing everything to LF or native). While I agree that it would be nice to fix relnotes.py, it seems to me that setting svn:eol-style=native should fix the problem completely. Files with this attribute set are stored internally by subversion with all newlines as LF, and converted to CRLF as needed. After all, eol-style=native would not be very useful if it only applied on checkout. Windows users would be constantly checking in CRLF in that case. I'm not an svn expert, though, and I haven't tested the above. Colin On Fri, Jun 28, 2013 at 1:03 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote: Clarification: svn:eol-style = native causes the files to contain whatever the native platform used to check out the code uses. I think just setting this property on all the HTML files should resolve this and future problems. patch posted. C. On Fri, Jun 28, 2013 at 12:56 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote: I think the fix for this is to set svn:eol-style to native on this file. It's set on many other files, just not on this one: cmccabe@keter:~/hadoopST/trunk svn propget svn:eol-style ./hadoop-project-dist/README.txt native cmccabe@keter:~/hadoopST/trunk svn propget svn:eol-style ./hadoop-hdfs-project/hadoop-hdfs/src/main/docs/releasenotes.html cmccabe@keter:~/hadoopST/trunk It might also be a good idea to run dos2unix on it. I thought that in general we wanted to have 'LF' everywhere, so perhaps we should add this to the patch.sh script to prevent this from re-occurring. Colin On Fri, Jun 28, 2013 at 12:27 PM, Sandy Ryza sandy.r...@cloudera.com wrote: I haven't been able to find a solution. Just filed https://issues.apache.org/jira/browse/HADOOP-9675. On Fri, Jun 28, 2013 at 12:19 PM, Omkar Joshi ojo...@hortonworks.com wrote: Sandy... did you fix this? any jira to track? me too facing same problem.. 
Thanks, Omkar Joshi *Hortonworks Inc.* http://www.hortonworks.com On Fri, Jun 28, 2013 at 11:54 AM, Zhijie Shen zs...@hortonworks.com wrote: Have the same problem here, have to edit the patch manually to exclude the changes in releasenotes.html On Fri, Jun 28, 2013 at 11:01 AM, Sandy Ryza sandy.r...@cloudera.com wrote: Has anybody else been having trouble with line endings since pulling trunk recently? hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html shows
Re: please, subscribe me!
yarn-dev-subscr...@hadoop.apache.org is the correct email address for dev mailing list subscription. The other projects' dev mailing lists follow the same pattern. Please check http://hadoop.apache.org/mailing_lists.html for details. On Thu, Jul 11, 2013 at 11:26 PM, Man-Young Goo my...@nate.com wrote: please, subscribe me! -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: git line endings trouble since recent trunk
I have the same problem here; I have to edit the patch manually to exclude the changes in releasenotes.html. On Fri, Jun 28, 2013 at 11:01 AM, Sandy Ryza sandy.r...@cloudera.com wrote: Has anybody else been having trouble with line endings since pulling trunk recently? hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html shows up as modified even though I haven't touched it, and I can't check it out or reset to a previous version to make that go away. The only thing I can do to neutralize it is to put it in a dummy commit, but I have to do this every time I switch branches or rebase. This appears to have begun after the release notes commit (8c5676830bb176157b2dc28c48cd3dd0a9712741), and must be due to a line endings change. -Sandy -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
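For anyone who wants to confirm that a file such as releasenotes.html really contains mixed line endings, here is a minimal Java sketch. It is only an illustration under stated assumptions: the class name EolSketch and its methods are hypothetical and are not part of Hadoop, relnotes.py, or the test-patch script. It counts CRLF and bare LF endings and normalizes CRLF to LF; lone CR characters are left untouched.

```java
// Hypothetical helper for illustration only; not part of Hadoop or its tooling.
final class EolSketch {
    private EolSketch() {}

    /** Returns {crlfCount, bareLfCount} for the given text. */
    static int[] countEols(String text) {
        int crlf = 0;
        int lf = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                // A '\n' preceded by '\r' is a CRLF ending; otherwise it is a bare LF.
                if (i > 0 && text.charAt(i - 1) == '\r') {
                    crlf++;
                } else {
                    lf++;
                }
            }
        }
        return new int[] {crlf, lf};
    }

    /** True when the text contains both CRLF and bare LF line endings. */
    static boolean hasMixedEols(String text) {
        int[] counts = countEols(text);
        return counts[0] > 0 && counts[1] > 0;
    }

    /** Rewrites every CRLF as LF; lone CR characters are not handled here. */
    static String normalizeToLf(String text) {
        return text.replace("\r\n", "\n");
    }

    public static void main(String[] args) {
        String mixed = "<h1>notes</h1>\r\n<p>one</p>\n<p>two</p>\r\n";
        System.out.println("mixed before: " + hasMixedEols(mixed));
        System.out.println("mixed after:  " + hasMixedEols(normalizeToLf(mixed)));
    }
}
```

A check like hasMixedEols could in principle run before a file is committed, while normalizeToLf corresponds to normalizing generated output to a single line-ending style.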
[jira] [Created] (HADOOP-9658) SnappyCodec#checkNativeCodeLoaded may unexpectedly fail when native code is not loaded
Zhijie Shen created HADOOP-9658: --- Summary: SnappyCodec#checkNativeCodeLoaded may unexpectedly fail when native code is not loaded Key: HADOOP-9658 URL: https://issues.apache.org/jira/browse/HADOOP-9658 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen
{code}
public static void checkNativeCodeLoaded() {
  if (!NativeCodeLoader.buildSupportsSnappy()) {
    throw new RuntimeException("native snappy library not available: " +
        "this version of libhadoop was built without " +
        "snappy support.");
  }
  if (!SnappyCompressor.isNativeCodeLoaded()) {
    throw new RuntimeException("native snappy library not available: " +
        "SnappyCompressor has not been loaded.");
  }
  if (!SnappyDecompressor.isNativeCodeLoaded()) {
    throw new RuntimeException("native snappy library not available: " +
        "SnappyDecompressor has not been loaded.");
  }
}
{code}
buildSupportsSnappy is a native method. If the native code is not loaded, the method will be missing, and calling it fails with an UnsatisfiedLinkError instead of reaching the throw statement. Therefore, when the native code is not loaded, the first RuntimeException is never thrown. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
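The failure mode described in HADOOP-9658 can be reproduced without Hadoop. The following is a minimal sketch, not the actual patch: NativeGuardSketch, its methods, and the library name are hypothetical stand-ins, and the private native method only plays the role of NativeCodeLoader.buildSupportsSnappy(). It shows that calling a native method whose library was never loaded raises UnsatisfiedLinkError, and that checking a "library loaded" flag first lets the intended RuntimeException surface instead.

```java
// Hypothetical stand-in classes and names; this is not Hadoop code.
final class NativeGuardSketch {
    // Plays the role of a "native code loaded" flag: true only if loadLibrary succeeded.
    // The library name is made up and is assumed not to exist on the machine.
    private static final boolean NATIVE_LOADED = tryLoad("snappy_sketch_missing_library");

    // Plays the role of a native capability check. With no library loaded,
    // invoking it throws UnsatisfiedLinkError.
    private static native boolean buildSupportsSnappy();

    private NativeGuardSketch() {}

    private static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    /** Mirrors the unguarded check: the native call is made unconditionally. */
    static String uncheckedCheck() {
        try {
            if (!buildSupportsSnappy()) {
                throw new RuntimeException("native snappy library not available");
            }
            return "ok";
        } catch (UnsatisfiedLinkError e) {
            // This is what happens when the native code is not loaded.
            return "UnsatisfiedLinkError";
        } catch (RuntimeException e) {
            return "RuntimeException: " + e.getMessage();
        }
    }

    /** Guarded variant: the native method is only called if the library loaded. */
    static String guardedCheck() {
        try {
            if (!NATIVE_LOADED || !buildSupportsSnappy()) {
                throw new RuntimeException("native snappy library not available");
            }
            return "ok";
        } catch (RuntimeException e) {
            return "RuntimeException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("unchecked: " + uncheckedCheck());
        System.out.println("guarded:   " + guardedCheck());
    }
}
```

The guard relies on short-circuit evaluation of ||, so the native method is never invoked when the loaded flag is false. Whether the real fix takes this exact shape is an assumption; the sketch only illustrates the ordering problem.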
Re: [VOTE] Plan to create release candidate for 0.23.8
+1 (non-binding) On Sun, May 19, 2013 at 6:23 PM, Sandy Ryza sandy.r...@cloudera.com wrote: +1 (non-binding) On Sun, May 19, 2013 at 1:22 PM, Derek Dagit der...@yahoo-inc.com wrote: +1 (non-binding) On May 17, 2013, at 4:14 PM, Thomas Graves tgra...@yahoo-inc.com wrote: Hello all, We've had a few critical issues come up in 0.23.7 that I think warrants a 0.23.8 release. The main one is MAPREDUCE-5211. There are a couple of other issues that I want finished up and get in before we spin it. Those include HDFS-3875, HDFS-4805, and HDFS-4835. I think those are on track to finish up early next week. So I hope to spin 0.23.8 soon after this vote completes. Please vote '+1' to approve this plan. Voting will close on Friday May 24th at 2:00pm PDT. Thanks, Tom Graves -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] - Release 2.0.5-beta
+1 (non-binding) on the proposal. On Wed, May 15, 2013 at 1:43 PM, Eli Collins e...@cloudera.com wrote: On Wed, May 15, 2013 at 1:29 PM, Matt Foley mfo...@hortonworks.com wrote: Arun, not sure whether your Yes to all already covered this, but I'd like to throw in support for the compatibility guidelines being a blocker. +1 to that. Definitely an overriding concern for me. +1 Likewise. Would be great to get more eyeballs on Karthik's patch on HADOOP-9517 if people haven't review it already. -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
[jira] [Created] (HADOOP-9538) Make AMLauncher in RM Use NMClient
Zhijie Shen created HADOOP-9538: --- Summary: Make AMLauncher in RM Use NMClient Key: HADOOP-9538 URL: https://issues.apache.org/jira/browse/HADOOP-9538 Project: Hadoop Common Issue Type: Bug Reporter: Zhijie Shen Assignee: Zhijie Shen YARN-422 adds NMClient. RM's AMLauncher is responsible for the interactions with an application's AM container. AMLauncher should also replace the raw ContainerManager proxy with NMClient.
Re: [VOTE] Release Apache Hadoop 0.23.7
+1 (non-binding) 1. Downloaded the source bundle and the binary bundle, and built the source code successfully. 2. Verified the checksum and the signature. 3. Set up a single-node cluster, and ran a couple of examples successfully. On Tue, Apr 16, 2013 at 8:28 PM, Harsh J ha...@cloudera.com wrote: +1 Downloaded sources, built successfully, stood up a 1-node cluster and ran a Pi MR job. On Wed, Apr 17, 2013 at 2:27 AM, Hitesh Shah hit...@hortonworks.com wrote: +1. Downloaded source, built and ran a couple of sample jobs on a single node cluster. -- Hitesh On Apr 11, 2013, at 12:55 PM, Thomas Graves wrote: I've created a release candidate (RC0) for hadoop-0.23.7 that I would like to release. This release is a sustaining release with several important bug fixes in it. The RC is available at: http://people.apache.org/~tgraves/hadoop-0.23.7-candidate-0/ The RC tag in svn is here: http://svn.apache.org/viewvc/hadoop/common/tags/release-0.23.7-rc0/ The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Tom Graves -- Harsh J -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/