[jira] [Created] (HADOOP-9484) Genetic Algorithm Library for Hadoop

2013-04-18 Thread Vaibhav Singh Rajput (JIRA)
Vaibhav Singh Rajput created HADOOP-9484:


 Summary: Genetic Algorithm Library for Hadoop
 Key: HADOOP-9484
 URL: https://issues.apache.org/jira/browse/HADOOP-9484
 Project: Hadoop Common
  Issue Type: New Feature
  Components: contrib/hod
 Environment: Linux Operating System
Reporter: Vaibhav Singh Rajput


I have developed a Genetic Algorithm Library for Hadoop. With this library, 
problems that use a Genetic Algorithm can be solved on Hadoop MapReduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [VOTE] Release Apache Hadoop 0.23.7

2013-04-18 Thread Derek Dagit
+1 (non-binding)
checked sigs and checksums
built and ran some simple jobs on single-node
-- 
Derek

On Apr 17, 2013, at 18:27, Sandy Ryza wrote:

 +1 (non-binding)
 Built from source and ran a couple of MR examples on a single node cluster.
 
 -Sandy
 
 
 On Wed, Apr 17, 2013 at 12:03 PM, Siddharth Seth
 seth.siddha...@gmail.com wrote:
 
 +1 (binding).
 Verified checksums and signature.
 Built from source tar, deployed a single node cluster (CapacityScheduler)
 and tried a couple of simple MR jobs.
 
 - Sid
 
 
 On Thu, Apr 11, 2013 at 12:55 PM, Thomas Graves tgra...@yahoo-inc.com
 wrote:
 
 I've created a release candidate (RC0) for hadoop-0.23.7 that I would like
 to release.
 
 This release is a sustaining release with several important bug fixes in it.
 
 The RC is available at:
 http://people.apache.org/~tgraves/hadoop-0.23.7-candidate-0/
 The RC tag in svn is here:
 http://svn.apache.org/viewvc/hadoop/common/tags/release-0.23.7-rc0/
 
 The maven artifacts are available via repository.apache.org.
 
 Please try the release and vote; the vote will run for the usual 7 days.
 
 thanks,
 Tom Graves
 
 
 



RE: How to understand Hadoop source code ?

2013-04-18 Thread Noelle Jakusz (c)
+1

There are quite a few new people, so maybe start a collaborative group where 
you can collect notes and steps (videos and articles). I know I would have some 
for you that I have created as I have gotten started... it would be a great 
idea to post them after some collaboration and review.

Thanks Chris for the detailed reply...

-----Original Message-----
From: Chris Nauroth [mailto:cnaur...@hortonworks.com] 
Sent: Thursday, April 18, 2013 1:14 PM
To: common-dev@hadoop.apache.org
Subject: Re: How to understand Hadoop source code ?

Is there a specific bug fix or feature that you are trying to contribute?
Specific questions like "how can I help with jira X?" or "what is the main 
entry point when I run the hdfs command?" or "where does the namenode 
serialize metadata to disk?" or "where does the secondary namenode execute a 
checkpoint?" can help focus the conversation.

AFAIK, we don't have a general code walkthrough document focused on onboarding 
new engineers.  This could be a valuable contribution if you want to gather 
notes while you learn.  I think this always works best if it's driven by a new 
engineer with review by an expert.  (If the experts write it, then they might 
accidentally skip something non-obvious that they've already internalized.)

Since that document doesn't exist yet, the other option is to do some reading 
of the code, ideally while trying to fix a specific bug that has been filed in 
jira.  Like you said, it's a relatively large codebase, so it's impractical to 
read the whole thing top-to-bottom.  Instead, it's important to look for 
high-level clues that steer you towards the right files.  I've found that the 
Maven module structure and the Java package names are usually descriptive 
enough to steer me in the right direction.  If you focus on getting familiar 
with those, you'll basically build a btree inside your brain that helps you 
index into the right part of the codebase and answer your own questions 
rapidly.  Several examples:

Where is the main entry point for the datanode daemon?: module hadoop-hdfs, 
package org.apache.hadoop.hdfs.server.datanode

What is the algorithm for rebalancing an unbalanced cluster?: module 
hadoop-hdfs, package org.apache.hadoop.hdfs.server.balancer

How does YARN launch a new container process?: module 
hadoop-yarn-server-nodemanager, package 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher

Multiple daemons publish JMX metrics as a common concern.  Where is that
implemented?: module hadoop-common, package org.apache.hadoop.metrics2
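
As a rough illustration of the module/package mapping in the examples above: 
under the standard Maven layout, a Java package name translates to a directory 
below src/main/java inside its module. The class and method names in this 
sketch (PackageLocator, sourceDir) are made up for illustration, and nothing 
is assumed here about where each module sits in the Hadoop checkout.

```java
class PackageLocator {
    // Standard Maven layout: production Java sources live under src/main/java.
    private static final String MAVEN_MAIN_SOURCES = "src/main/java";

    /** Returns the source directory of a package, relative to its module root. */
    static String sourceDir(String javaPackage) {
        return MAVEN_MAIN_SOURCES + "/" + javaPackage.replace('.', '/');
    }

    public static void main(String[] args) {
        // Module/package pairs taken from the examples in this thread.
        String[][] examples = {
            {"hadoop-hdfs", "org.apache.hadoop.hdfs.server.datanode"},
            {"hadoop-hdfs", "org.apache.hadoop.hdfs.server.balancer"},
            {"hadoop-yarn-server-nodemanager",
             "org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher"},
            {"hadoop-common", "org.apache.hadoop.metrics2"},
        };
        for (String[] example : examples) {
            System.out.println(example[0] + ": " + sourceDir(example[1]));
        }
    }
}
```

Searching the checkout for the module directory name and then opening the 
printed path below it is usually enough to land on the right set of classes.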

I hope this is helpful to get the process started for you.  We're always here 
to help if you have specific follow-up questions.

Thanks,
--Chris


On Wed, Apr 17, 2013 at 10:33 PM, Prabakaran Krishnan  
prabakaran_j...@yahoo.in wrote:

 Could you please help me understand MapReduce in Hadoop?



 
 From: Mohammad Mustaqeem 3m.mustaq...@gmail.com
 To: common-dev common-dev@hadoop.apache.org
 Sent: Thursday, 18 April 2013 10:44 AM
 Subject: Re: How to understand Hadoop source code ?


 I am interested in HDFS. Please guide me.


 On Thu, Apr 18, 2013 at 3:36 AM, Arun C Murthy a...@hortonworks.com
 wrote:

  Please don't cross post.
 
  What parts of Hadoop are you interested in? HDFS? YARN? MapReduce?
 
  Arun
 
  On Apr 17, 2013, at 2:50 PM, Mohammad Mustaqeem wrote:
 
   Hello everyone,
   I am new to this group. Since the source code of Hadoop is very
   big, I am not able to understand it entirely.
   Is there any document that describes the code?
   Is there any way to understand the functionality of each class and
   its methods?
  
  
   --
   *With regards ---*
   *Mohammad Mustaqeem*,
   M.Tech (CSE)
   MNNIT Allahabad
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
 
 


 --
 *With regards ---*
 *Mohammad Mustaqeem*,
 M.Tech (CSE)
 MNNIT Allahabad
 9026604270



Re: How to understand Hadoop source code ?

2013-04-18 Thread Ronnie Ghose
+1 I'm one of those new people :)


On Thu, Apr 18, 2013 at 1:32 PM, Noelle Jakusz (c) njak...@vmware.com wrote:

 +1

 There are quite a few new people, so maybe start a collaborative group
 where you can collect notes and steps (videos and articles). I know I would
 have some for you that I have created as I have gotten started... it would
 be a great idea to post them after some collaboration and review.

 Thanks Chris for the detailed reply...




Re: [VOTE] Release Apache Hadoop 2.0.4-alpha

2013-04-18 Thread Konstantin Boudnik
-0

The release is missing HADOOP-9704, which has a critical effect on downstream
projects, e.g. builds are affected. The issue was first raised back on
4/10/13 (http://is.gd/OGb3GG) and has never even been sneezed upon.

Cos

On Sat, Apr 13, 2013 at 03:26AM, Arun C Murthy wrote:
 Folks,
 
 I've created a release candidate (RC2) for hadoop-2.0.4-alpha that I would 
 like to release.
 
 The RC is available at: 
 http://people.apache.org/~acmurthy/hadoop-2.0.4-alpha-rc2/
 The RC tag in svn is here: 
 http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.4-alpha-rc2
 
 The maven artifacts are available via repository.apache.org.
 
 Please try the release and vote; the vote will run for the usual 7 days.
 
 thanks,
 Arun
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 




Re: How to understand Hadoop source code ?

2013-04-18 Thread Steve Loughran
On 18 April 2013 18:32, Noelle Jakusz (c) njak...@vmware.com wrote:

 +1

 There are quite a few new people, so maybe start a collaborative group
 where you can collect notes and steps (videos and articles). I know I would
 have some for you that I have created as I have gotten started... it would
 be a great idea to post them after some collaboration and review.

 Thanks Chris for the detailed reply...


Material on wiki.apache.org would be welcome, though it comes with a
commitment to keep it up to date.

Once you've created wiki accounts, email this list to get write access.

One thing to consider is prerequisites. Hadoop is not a place to learn
about basic distributed system concepts (liveness, failures, RPC), though
some of the specifics (how liveness is implemented, how Hadoop RPC works)
are going to be relevant. It's probably best to list things you need to
know. In particular, you should know Java and its build and test tools
before going near Hadoop.

-steve


Re: [VOTE] Release Apache Hadoop 0.23.7

2013-04-18 Thread Thomas Graves
Thanks everyone for trying 0.23.7 out and voting.

The vote passes with 13 +1s (8 binding and 5 non-binding) and no -1s.

I'll push the release.

Tom






[jira] [Created] (HADOOP-9485) inconsistent defaults for hadoop.rpc.socket.factory.class.default

2013-04-18 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HADOOP-9485:


 Summary: inconsistent defaults for 
hadoop.rpc.socket.factory.class.default
 Key: HADOOP-9485
 URL: https://issues.apache.org/jira/browse/HADOOP-9485
 Project: Hadoop Common
  Issue Type: Bug
  Components: net
Affects Versions: 2.0.5-beta
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


In {{core-default.xml}}, {{hadoop.rpc.socket.factory.class.default}} defaults 
to {{org.apache.hadoop.net.StandardSocketFactory}}.  However, in 
{{CommonConfigurationKeysPublic.java}}, there is no default for this key.  This 
is inconsistent (defaults in the code versus defaults in the XML files should 
match.)  It also leads to problems with {{RemoteBlockReader2}}, since the 
default {{SocketFactory}} creates a {{Socket}} without an associated channel.  
{{RemoteBlockReader2}} cannot use such a {{Socket}}.

This bug only really becomes apparent when you create a {{Configuration}} using 
the {{Configuration(loadDefaults=true)}} constructor.  Thanks to AB Srinivasan 
for his help in discovering this bug.
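
The channel problem described above can be sketched with plain JDK classes, 
without any Hadoop dependency. A socket from the JDK's default 
{{SocketFactory}} has no associated channel, while a socket obtained through 
{{SocketChannel.open()}} does. The class and method names below 
(SocketChannelCheck, fromDefaultFactory, fromChannel) are hypothetical, and 
the channel-backed helper is only an assumption about how such a socket can be 
created, not a copy of {{StandardSocketFactory}}.

```java
import java.io.IOException;
import java.net.Socket;
import java.nio.channels.SocketChannel;
import javax.net.SocketFactory;

class SocketChannelCheck {
    /** Unconnected socket from the JDK's default factory; it has no channel. */
    static Socket fromDefaultFactory() throws IOException {
        return SocketFactory.getDefault().createSocket();
    }

    /** Unconnected socket created through a SocketChannel; it keeps that channel. */
    static Socket fromChannel() throws IOException {
        return SocketChannel.open().socket();
    }

    /** True if code that needs a SocketChannel could use this socket. */
    static boolean hasChannel(Socket socket) {
        return socket.getChannel() != null;
    }

    public static void main(String[] args) throws IOException {
        try (Socket plain = fromDefaultFactory(); Socket channelBacked = fromChannel()) {
            System.out.println("default factory socket has channel: " + hasChannel(plain));
            System.out.println("channel-backed socket has channel:  " + hasChannel(channelBacked));
        }
    }
}
```

Neither socket is connected, so the sketch needs no network access. It only 
shows why code that calls {{getChannel()}} cannot work with a socket from the 
default factory.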



[jira] [Created] (HADOOP-9486) Promote Windows and Shell related utils from YARN to Hadoop Common

2013-04-18 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created HADOOP-9486:
---

 Summary: Promote Windows and Shell related utils from YARN to 
Hadoop Common
 Key: HADOOP-9486
 URL: https://issues.apache.org/jira/browse/HADOOP-9486
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Chris Nauroth


This is happening as part of YARN-493; this is a tracking ticket for the 
common changes.



[jira] [Resolved] (HADOOP-9486) Promote Windows and Shell related utils from YARN to Hadoop Common

2013-04-18 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved HADOOP-9486.
-

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed

I just committed this to trunk together with YARN-493. Closing this as resolved.

I've been looking at this under YARN-493 for a while, so I went ahead with the 
check-in.

Apologies for the short time between issue creation and closure. Let me 
know if anyone finds any issues. Thanks.

 Promote Windows and Shell related utils from YARN to Hadoop Common
 --

 Key: HADOOP-9486
 URL: https://issues.apache.org/jira/browse/HADOOP-9486
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Chris Nauroth
 Fix For: 3.0.0

 Attachments: HADOOP-9486.patch


 This is happening as part of YARN-493; this is a tracking ticket for the 
 common changes.
