Re: Next releases

2014-01-06 Thread Arpit Agarwal
This merge to branch-2 is complete. The changes have been merged to
branch-2 and target version set to 2.4.0 (r1556076).


On Fri, Jan 3, 2014 at 4:13 PM, Arpit Agarwal aagar...@hortonworks.com wrote:

 We plan to merge HDFS-2832 to branch-2 next week for inclusion in 2.4.



 On Fri, Dec 6, 2013 at 1:53 PM, Arun C Murthy a...@hortonworks.com wrote:

 Thanks Suresh & Colin.

 Please update the Roadmap wiki with your proposals.

 As always, we will try our best to get these in - but we can collectively
 decide to slip some of these to subsequent releases based on timelines.

 Arun

 On Dec 6, 2013, at 10:43 AM, Suresh Srinivas sur...@hortonworks.com
 wrote:

  Arun,
 
  I propose the following changes for 2.3:
  - There have been a lot of improvements related to supporting http
  policy.
  - There is still a discussion going on, but I would like to deprecate
  BackupNode in 2.3 as well.
  - We are currently working on rolling-upgrade-related changes in HDFS.
  We might add a couple of changes that enable rolling upgrades from 2.3
  onwards (hopefully we can get this done by December)
 
  I propose the following for 2.4 release, if they are tested and stable:
  - Heterogeneous storage support - HDFS-2832
  - Datanode cache related change - HDFS-4949
  - HDFS ACLs - HDFS-4685
  - Rolling upgrade changes
 
  Let me know if you want me to update the wiki.
 
  Regards,
  Suresh
 

 On Dec 6, 2013, at 12:27 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

  If 2.4 is released in January, I think it's very unlikely to include
  symlinks.  There is still a lot of work to be done before they're
  usable.  You can look at the progress on HADOOP-10019.  For some of
  the subtasks, it will require some community discussion before any
  code can be written.
 
  For better or worse, symlinks have not been requested by users as
  often as features like NFS export, HDFS caching, ACLs, etc, so effort
  has been focused on those instead.
 
  For now, I think we should put the symlinks-disabling patches
  (HADOOP-10020, etc) into branch-2, so that they will be part of the
  next releases without additional effort.
 
  I would like to see HDFS caching make it into 2.4.  The APIs and
  implementation are beginning to stabilize, and around January it
  should be ok to backport to a stable branch.
 
  best,
  Colin
 
 
  On Thu, Nov 7, 2013 at 6:42 PM, Arun C Murthy a...@hortonworks.com
 wrote:
 
  Gang,
 
  Thinking through the next couple of releases here, appreciate f/b.
 
  # hadoop-2.2.1
 
  I was looking through commit logs and there is a *lot* of content here
  (81 commits as of 11/7). Some are features/improvements and some are
  fixes - it's really hard to distinguish what is important and what isn't.
 
  I propose we start with a blank slate (i.e. blow away branch-2.2 and
  start fresh from a copy of branch-2.2.0) and then be very careful and
  meticulous about including only *blocker* fixes in branch-2.2. So, most
  of the content here comes via the next minor release (i.e. hadoop-2.3)
 
  In future, we continue to be *very* parsimonious about what gets into a
  patch release (major.minor.patch) - in general, these should be only
  *blocker* fixes or key operational issues.
 
  # hadoop-2.3
 
  I'd like to propose the following features for YARN/MR to make it into
  hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
  * Application History Server - This is happening in a branch and is
  close; with it we can provide a reasonable experience for new
  frameworks being built on top of YARN.
  * Bug-fixes in RM Restart
  * Minimal support for long-running applications (e.g. security) via
  YARN-896
  * RM Fail-over via ZKFC
  * Anything else?
 
  HDFS???
 
  Overall, I feel like we have a decent chance of rolling hadoop-2.3 by
  the end of the year.
 
  Thoughts?
 
  thanks,
  Arun
 
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
 
 
  --
  CONFIDENTIALITY NOTICE
  NOTICE: This message is intended for the use of the individual or
 entity to
  which it is addressed and may contain information that is confidential,
  privileged and exempt from disclosure under applicable law. If the
 reader
  of this message is not the intended recipient, you are hereby notified
 that
  any printing, copying, dissemination, distribution, disclosure or
  forwarding of this communication is strictly prohibited. If you have
  received this communication in error, please contact the sender
 immediately
  and delete it from your system. Thank You.
 
 
 
 
  --
  http://hortonworks.com/download/
 

Re: Next releases

2014-01-06 Thread Jun Ping Du
Great news! Thanks Arpit!
I think we should update the Roadmap wiki to include this work. :)

Thanks,

Junping

- Original Message -
From: Arpit Agarwal aagar...@hortonworks.com
To: common-dev@hadoop.apache.org
Cc: yarn-...@hadoop.apache.org, hdfs-...@hadoop.apache.org, 
mapreduce-...@hadoop.apache.org
Sent: Tuesday, January 7, 2014 8:47:32 AM
Subject: Re: Next releases

This merge to branch-2 is complete. The changes have been merged to
branch-2 and target version set to 2.4.0 (r1556076).


On Fri, Jan 3, 2014 at 4:13 PM, Arpit Agarwal aagar...@hortonworks.com wrote:

 We plan to merge HDFS-2832 to branch-2 next week for inclusion in 2.4.



 On Fri, Dec 6, 2013 at 1:53 PM, Arun C Murthy a...@hortonworks.com wrote:

 Thanks Suresh & Colin.

 Please update the Roadmap wiki with your proposals.

 As always, we will try our best to get these in - but we can collectively
 decide to slip some of these to subsequent releases based on timelines.

 Arun

 On Dec 6, 2013, at 10:43 AM, Suresh Srinivas sur...@hortonworks.com
 wrote:

  Arun,
 
  I propose the following changes for 2.3:
  - There have been a lot of improvements related to supporting http
  policy.
  - There is still a discussion going on, but I would like to deprecate
  BackupNode in 2.3 as well.
  - We are currently working on rolling-upgrade-related changes in HDFS.
  We might add a couple of changes that enable rolling upgrades from 2.3
  onwards (hopefully we can get this done by December)
 
  I propose the following for 2.4 release, if they are tested and stable:
  - Heterogeneous storage support - HDFS-2832
  - Datanode cache related change - HDFS-4949
  - HDFS ACLs - HDFS-4685
  - Rolling upgrade changes
 
  Let me know if you want me to update the wiki.
 
  Regards,
  Suresh
 

 On Dec 6, 2013, at 12:27 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

  If 2.4 is released in January, I think it's very unlikely to include
  symlinks.  There is still a lot of work to be done before they're
  usable.  You can look at the progress on HADOOP-10019.  For some of
  the subtasks, it will require some community discussion before any
  code can be written.
 
  For better or worse, symlinks have not been requested by users as
  often as features like NFS export, HDFS caching, ACLs, etc, so effort
  has been focused on those instead.
 
  For now, I think we should put the symlinks-disabling patches
  (HADOOP-10020, etc) into branch-2, so that they will be part of the
  next releases without additional effort.
 
  I would like to see HDFS caching make it into 2.4.  The APIs and
  implementation are beginning to stabilize, and around January it
  should be ok to backport to a stable branch.
 
  best,
  Colin
 
 
  On Thu, Nov 7, 2013 at 6:42 PM, Arun C Murthy a...@hortonworks.com
 wrote:
 
  Gang,
 
  Thinking through the next couple of releases here, appreciate f/b.
 
  # hadoop-2.2.1
 
  I was looking through commit logs and there is a *lot* of content here
  (81 commits as of 11/7). Some are features/improvements and some are
  fixes - it's really hard to distinguish what is important and what isn't.
 
  I propose we start with a blank slate (i.e. blow away branch-2.2 and
  start fresh from a copy of branch-2.2.0) and then be very careful and
  meticulous about including only *blocker* fixes in branch-2.2. So, most
  of the content here comes via the next minor release (i.e. hadoop-2.3)
 
  In future, we continue to be *very* parsimonious about what gets into a
  patch release (major.minor.patch) - in general, these should be only
  *blocker* fixes or key operational issues.
 
  # hadoop-2.3
 
  I'd like to propose the following features for YARN/MR to make it into
  hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
  * Application History Server - This is happening in a branch and is
  close; with it we can provide a reasonable experience for new
  frameworks being built on top of YARN.
  * Bug-fixes in RM Restart
  * Minimal support for long-running applications (e.g. security) via
  YARN-896
  * RM Fail-over via ZKFC
  * Anything else?
 
  HDFS???
 
  Overall, I feel like we have a decent chance of rolling hadoop-2.3 by
  the end of the year.
 
  Thoughts?
 
  thanks,
  Arun
 
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
 
 

Re: Next releases

2014-01-03 Thread Arpit Agarwal
We plan to merge HDFS-2832 to branch-2 next week for inclusion in 2.4.


On Fri, Dec 6, 2013 at 1:53 PM, Arun C Murthy a...@hortonworks.com wrote:

 Thanks Suresh & Colin.

 Please update the Roadmap wiki with your proposals.

 As always, we will try our best to get these in - but we can collectively
 decide to slip some of these to subsequent releases based on timelines.

 Arun

 On Dec 6, 2013, at 10:43 AM, Suresh Srinivas sur...@hortonworks.com
 wrote:

  Arun,
 
  I propose the following changes for 2.3:
  - There have been a lot of improvements related to supporting http
  policy.
  - There is still a discussion going on, but I would like to deprecate
  BackupNode in 2.3 as well.
  - We are currently working on rolling-upgrade-related changes in HDFS.
  We might add a couple of changes that enable rolling upgrades from 2.3
  onwards (hopefully we can get this done by December)
 
  I propose the following for 2.4 release, if they are tested and stable:
  - Heterogeneous storage support - HDFS-2832
  - Datanode cache related change - HDFS-4949
  - HDFS ACLs - HDFS-4685
  - Rolling upgrade changes
 
  Let me know if you want me to update the wiki.
 
  Regards,
  Suresh
 

 On Dec 6, 2013, at 12:27 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

  If 2.4 is released in January, I think it's very unlikely to include
  symlinks.  There is still a lot of work to be done before they're
  usable.  You can look at the progress on HADOOP-10019.  For some of
  the subtasks, it will require some community discussion before any
  code can be written.
 
  For better or worse, symlinks have not been requested by users as
  often as features like NFS export, HDFS caching, ACLs, etc, so effort
  has been focused on those instead.
 
  For now, I think we should put the symlinks-disabling patches
  (HADOOP-10020, etc) into branch-2, so that they will be part of the
  next releases without additional effort.
 
  I would like to see HDFS caching make it into 2.4.  The APIs and
  implementation are beginning to stabilize, and around January it
  should be ok to backport to a stable branch.
 
  best,
  Colin
 
 
  On Thu, Nov 7, 2013 at 6:42 PM, Arun C Murthy a...@hortonworks.com
 wrote:
 
  Gang,
 
  Thinking through the next couple of releases here, appreciate f/b.
 
  # hadoop-2.2.1
 
  I was looking through commit logs and there is a *lot* of content here
  (81 commits as of 11/7). Some are features/improvements and some are
  fixes - it's really hard to distinguish what is important and what isn't.
 
  I propose we start with a blank slate (i.e. blow away branch-2.2 and
  start fresh from a copy of branch-2.2.0) and then be very careful and
  meticulous about including only *blocker* fixes in branch-2.2. So, most
  of the content here comes via the next minor release (i.e. hadoop-2.3)
 
  In future, we continue to be *very* parsimonious about what gets into a
  patch release (major.minor.patch) - in general, these should be only
  *blocker* fixes or key operational issues.
 
  # hadoop-2.3
 
  I'd like to propose the following features for YARN/MR to make it into
  hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
  * Application History Server - This is happening in a branch and is
  close; with it we can provide a reasonable experience for new frameworks
  being built on top of YARN.
  * Bug-fixes in RM Restart
  * Minimal support for long-running applications (e.g. security) via
  YARN-896
  * RM Fail-over via ZKFC
  * Anything else?
 
  HDFS???
 
  Overall, I feel like we have a decent chance of rolling hadoop-2.3 by
  the end of the year.
 
  Thoughts?
 
  thanks,
  Arun
 
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
 
 
 
 
 
 
  --
  http://hortonworks.com/download/
 

 --
 Arun C. Murthy
 Hortonworks

Re: Next releases

2013-12-06 Thread Colin McCabe
If 2.4 is released in January, I think it's very unlikely to include
symlinks.  There is still a lot of work to be done before they're
usable.  You can look at the progress on HADOOP-10019.  For some of
the subtasks, it will require some community discussion before any
code can be written.

For better or worse, symlinks have not been requested by users as
often as features like NFS export, HDFS caching, ACLs, etc, so effort
has been focused on those instead.

For now, I think we should put the symlinks-disabling patches
(HADOOP-10020, etc) into branch-2, so that they will be part of the
next releases without additional effort.

I would like to see HDFS caching make it into 2.4.  The APIs and
implementation are beginning to stabilize, and around January it
should be ok to backport to a stable branch.

best,
Colin

On Thu, Dec 5, 2013 at 3:57 PM, Arun C Murthy a...@hortonworks.com wrote:
 Ok, I've updated https://wiki.apache.org/hadoop/Roadmap with an initial
 strawman list for hadoop-2.4 which I feel we can get out in Jan.

 What else would folks like to see? Please keep timeframe in mind.

 thanks,
 Arun

 On Dec 2, 2013, at 10:55 AM, Arun C Murthy a...@hortonworks.com wrote:


 On Nov 13, 2013, at 1:55 PM, Jason Lowe jl...@yahoo-inc.com wrote:


 +1 to limiting checkins of patch releases to Blockers/Criticals.  If 
 necessary committers check into trunk/branch-2 only and defer to the patch 
 release manager for the patch release merge.  Then there should be fewer 
 surprises for everyone what ended up in a patch release and less likely the 
 patch release becomes destabilized from the sheer amount of code churn.  
 Maybe this won't be necessary if everyone understands that the patch 
 release isn't the only way to get a change out in a timely manner.

 I've updated https://wiki.apache.org/hadoop/Roadmap to reflect that we only 
 put in Blocker/Critical bugs into Point Releases.

 Committers, from now, please exercise extreme caution when committing to a 
 point release: they should only be limited to Blocker bugs.

 thanks,
 Arun


 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/





Re: Next releases

2013-12-06 Thread Arun C Murthy
Thanks Suresh & Colin.

Please update the Roadmap wiki with your proposals.

As always, we will try our best to get these in - but we can collectively 
decide to slip some of these to subsequent releases based on timelines.

Arun

On Dec 6, 2013, at 10:43 AM, Suresh Srinivas sur...@hortonworks.com wrote:

 Arun,
 
 I propose the following changes for 2.3:
 - There have been a lot of improvements related to supporting http policy.
 - There is still a discussion going on, but I would like to deprecate
 BackupNode in 2.3 as well.
 - We are currently working on rolling-upgrade-related changes in HDFS. We
 might add a couple of changes that enable rolling upgrades from 2.3
 onwards (hopefully we can get this done by December)
 
 I propose the following for 2.4 release, if they are tested and stable:
 - Heterogeneous storage support - HDFS-2832
 - Datanode cache related change - HDFS-4949
 - HDFS ACLs - HDFS-4685
 - Rolling upgrade changes
 
 Let me know if you want me to update the wiki.
 
 Regards,
 Suresh
 

On Dec 6, 2013, at 12:27 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

 If 2.4 is released in January, I think it's very unlikely to include
 symlinks.  There is still a lot of work to be done before they're
 usable.  You can look at the progress on HADOOP-10019.  For some of
 the subtasks, it will require some community discussion before any
 code can be written.
 
 For better or worse, symlinks have not been requested by users as
 often as features like NFS export, HDFS caching, ACLs, etc, so effort
 has been focused on those instead.
 
 For now, I think we should put the symlinks-disabling patches
 (HADOOP-10020, etc) into branch-2, so that they will be part of the
 next releases without additional effort.
 
 I would like to see HDFS caching make it into 2.4.  The APIs and
 implementation are beginning to stabilize, and around January it
 should be ok to backport to a stable branch.
 
 best,
 Colin
 
 
 On Thu, Nov 7, 2013 at 6:42 PM, Arun C Murthy a...@hortonworks.com wrote:
 
 Gang,
 
 Thinking through the next couple of releases here, appreciate f/b.
 
 # hadoop-2.2.1
 
 I was looking through commit logs and there is a *lot* of content here
 (81 commits as of 11/7). Some are features/improvements and some are fixes
 - it's really hard to distinguish what is important and what isn't.
 
 I propose we start with a blank slate (i.e. blow away branch-2.2 and
 start fresh from a copy of branch-2.2.0)  and then be very careful and
 meticulous about including only *blocker* fixes in branch-2.2. So, most of
 the content here comes via the next minor release (i.e. hadoop-2.3)
 
 In future, we continue to be *very* parsimonious about what gets into a
 patch release (major.minor.patch) - in general, these should be only
 *blocker* fixes or key operational issues.
 
 # hadoop-2.3
 
 I'd like to propose the following features for YARN/MR to make it into
 hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
 * Application History Server - This is happening in a branch and is
 close; with it we can provide a reasonable experience for new frameworks
 being built on top of YARN.
 * Bug-fixes in RM Restart
 * Minimal support for long-running applications (e.g. security) via
 YARN-896
 * RM Fail-over via ZKFC
 * Anything else?
 
 HDFS???
 
 Overall, I feel like we have a decent chance of rolling hadoop-2.3 by the
 end of the year.
 
 Thoughts?
 
 thanks,
 Arun
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 
 
 
 
 
 -- 
 http://hortonworks.com/download/
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Next releases

2013-12-05 Thread Arun C Murthy
Ok, I've updated https://wiki.apache.org/hadoop/Roadmap with an initial strawman
list for hadoop-2.4 which I feel we can get out in Jan.

What else would folks like to see? Please keep timeframe in mind.

thanks,
Arun

On Dec 2, 2013, at 10:55 AM, Arun C Murthy a...@hortonworks.com wrote:

 
 On Nov 13, 2013, at 1:55 PM, Jason Lowe jl...@yahoo-inc.com wrote:
 
 
 +1 to limiting checkins of patch releases to Blockers/Criticals.  If 
 necessary committers check into trunk/branch-2 only and defer to the patch 
 release manager for the patch release merge.  Then there should be fewer 
 surprises for everyone what ended up in a patch release and less likely the 
 patch release becomes destabilized from the sheer amount of code churn.  
 Maybe this won't be necessary if everyone understands that the patch release 
 isn't the only way to get a change out in a timely manner.
 
 I've updated https://wiki.apache.org/hadoop/Roadmap to reflect that we only 
 put in Blocker/Critical bugs into Point Releases.
 
 Committers, from now, please exercise extreme caution when committing to a 
 point release: they should only be limited to Blocker bugs.
 
 thanks,
 Arun
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/





Re: Next releases

2013-12-02 Thread Arun C Murthy

On Dec 2, 2013, at 10:31 AM, Arun C Murthy a...@hortonworks.com wrote:

 Ok, looks like there are no objections.
 
 I'm starting the work to rename 2.2.1 to 2.3 now. Committers, please hold 
 commits till I send out the all clear.

Done. I've renamed 2.3 -> 2.4 and 2.2.1 -> 2.3.

I'll create the first RC for 2.3 a week from now i.e. 12/9.

thanks,
Arun

 
 thanks,
 Arun
 
 On Nov 20, 2013, at 6:38 AM, Arun C Murthy a...@hortonworks.com wrote:
 
 Jason,
 
  I'm glad to see we are converging. I'll update the Roadmap wiki with 
 details about major/minor/patch releases.
 
  Here is a straight-forward approach for now: I'll just roll contents of 
 branch-2.2 as a 2.3-rc0 candidate right-away. This way we don't have to get 
 embroiled in details of individual patches (there are too many). Next up, 
 I'll roll 2.4 in December.
 
  Thoughts?
 
 thanks,
 Arun
 
 On Nov 13, 2013, at 1:55 PM, Jason Lowe jl...@yahoo-inc.com wrote:
 
 I think a lot of confusion comes from the fact that the 2.x line is 
 starting to mature.  Before this there wasn't such a big contention of what 
 went into patch vs. minor releases and often the lines were blurred between 
 the two.  However now we have significant customers and products starting 
 to use 2.x as a base, which means we need to start treating it like we 
 treat 1.x.  That means getting serious about what we should put into a 
 patch release vs. what we postpone to a minor release.
 
 Here's my $0.02 on recent proposals:
 
 +1 to releasing more often in general.  A lot of the rush to put changes 
 into a patch release is because it can be a very long time between any kind 
 of release.  If minor releases are more frequent then I hope there would be 
 less of a need to rush something or hold up a release.
 
 +1 to limiting checkins of patch releases to Blockers/Criticals.  If 
 necessary committers check into trunk/branch-2 only and defer to the patch 
 release manager for the patch release merge.  Then there should be fewer 
 surprises for everyone what ended up in a patch release and less likely the 
 patch release becomes destabilized from the sheer amount of code churn.  
 Maybe this won't be necessary if everyone understands that the patch 
 release isn't the only way to get a change out in a timely manner.
 
 As for 2.2.1, again I think it's expectations for what that release means.  
 If it's really just a patch release then there shouldn't be features in it 
 and tons of code churn, but I think many were treating it as the next 
 vehicle to deliver changes in general.  If we think 2.2.1 is just as good 
 or better than 2.2.0 then let's wrap it up and move to a more disciplined 
 approach for subsequent patch releases and more frequent minor releases.
 
 Jason
 
 On 11/13/2013 12:10 PM, Arun C Murthy wrote:
 On Nov 12, 2013, at 1:54 PM, Todd Lipcon t...@cloudera.com wrote:
 
 On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:
 
 To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
 there.  However, I have only been following the HDFS and common side
 of things so I may not have the full picture.  Arun, can you give a
 specific example of something you'd like to blow away?
 There are bunch of issues in YARN/MapReduce which clearly aren't 
 *critical*, similarly in HDFS a cursory glance showed up some 
 *enhancements*/*improvements* in CHANGES.txt which aren't necessary for a 
 patch release, plus things like:
 
 HADOOP-9623: Update jets3t dependency to 0.9.0
 
 Having said that, the HDFS devs know their code the best.
 
 I agree with Colin. If we've been backporting things into a patch release
 (third version component) which don't belong, we should explicitly call
 out those patches, so we can learn from our mistakes and have a discussion
 about what belongs.
 Good point.
 
 Here is a straw man proposal:
 
 
 A patch (third version) release should only include *blocker* bugs which 
 are critical from an operational, security or data-integrity standpoint.
 
 This way, we can ensure that a minor series release (2.2.x or 2.3.x or 
 2.4.x) is always release-able, and more importantly, deploy-able at any 
 point in time.
 
 
 
 Sandy did bring up a related point about timing of releases and the urge 
 for everyone to cram features/fixes into a dot release.
 
 So, we could remedy that situation by doing a release every 4-6 weeks 
 (2.3, 2.4 etc.) and keep the patch releases limited to blocker bugs.
 
 Thoughts?
 
 thanks,
 Arun
 
 
 
 
 
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Re: Next releases

2013-12-02 Thread Arun C Murthy

On Nov 13, 2013, at 1:55 PM, Jason Lowe jl...@yahoo-inc.com wrote:
 
 
 +1 to limiting checkins of patch releases to Blockers/Criticals.  If 
 necessary committers check into trunk/branch-2 only and defer to the patch 
 release manager for the patch release merge.  Then there should be fewer 
 surprises for everyone what ended up in a patch release and less likely the 
 patch release becomes destabilized from the sheer amount of code churn.  
 Maybe this won't be necessary if everyone understands that the patch release 
 isn't the only way to get a change out in a timely manner.

I've updated https://wiki.apache.org/hadoop/Roadmap to reflect that we only put 
in Blocker/Critical bugs into Point Releases.

Committers, from now, please exercise extreme caution when committing to a 
point release: they should only be limited to Blocker bugs.

thanks,
Arun




Re: Next releases

2013-11-20 Thread Arun C Murthy
Jason,

 I'm glad to see we are converging. I'll update the Roadmap wiki with details 
about major/minor/patch releases.

 Here is a straight-forward approach for now: I'll just roll contents of 
branch-2.2 as a 2.3-rc0 candidate right-away. This way we don't have to get 
embroiled in details of individual patches (there are too many). Next up, I'll 
roll 2.4 in December.

 Thoughts?

thanks,
Arun

On Nov 13, 2013, at 1:55 PM, Jason Lowe jl...@yahoo-inc.com wrote:

 I think a lot of confusion comes from the fact that the 2.x line is starting 
 to mature.  Before this there wasn't such a big contention of what went into 
 patch vs. minor releases and often the lines were blurred between the two.  
 However now we have significant customers and products starting to use 2.x as 
 a base, which means we need to start treating it like we treat 1.x.  That 
 means getting serious about what we should put into a patch release vs. what 
 we postpone to a minor release.
 
 Here's my $0.02 on recent proposals:
 
 +1 to releasing more often in general.  A lot of the rush to put changes into 
 a patch release is because it can be a very long time between any kind of 
 release.  If minor releases are more frequent then I hope there would be less 
 of a need to rush something or hold up a release.
 
 +1 to limiting checkins of patch releases to Blockers/Criticals.  If 
 necessary committers check into trunk/branch-2 only and defer to the patch 
 release manager for the patch release merge.  Then there should be fewer 
 surprises for everyone what ended up in a patch release and less likely the 
 patch release becomes destabilized from the sheer amount of code churn.  
 Maybe this won't be necessary if everyone understands that the patch release 
 isn't the only way to get a change out in timely manner.
 
 As for 2.2.1, again I think it's expectations for what that release means.  
 If it's really just a patch release then there shouldn't be features in it 
 and tons of code churn, but I think many were treating it as the next vehicle 
 to deliver changes in general.  If we think 2.2.1 is just as good or better 
 than 2.2.0 then let's wrap it up and move to a more disciplined approach for 
 subsequent patch releases and more frequent minor releases.
 
 Jason
 
 On 11/13/2013 12:10 PM, Arun C Murthy wrote:
 On Nov 12, 2013, at 1:54 PM, Todd Lipcon t...@cloudera.com wrote:
 
 On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:
 
 To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
 there.  However, I have only been following the HDFS and common side
 of things so I may not have the full picture.  Arun, can you give a
 specific example of something you'd like to blow away?
 There are bunch of issues in YARN/MapReduce which clearly aren't *critical*, 
 similarly in HDFS a cursory glance showed up some 
 *enhancements*/*improvements* in CHANGES.txt which aren't necessary for a 
 patch release, plus things like:
 
  HADOOP-9623 
 Update jets3t dependency to 0.9.0
 
 
 
 
 
 
 
  
 Having said that, the HDFS devs know their code the best.
 
 I agree with Colin. If we've been backporting things into a patch release
 (third version component) which don't belong, we should explicitly call out
 those patches, so we can learn from our mistakes and have a discussion
 about what belongs.
 Good point.
 
 Here is a straw man proposal:
 
 
 A patch (third version) release should only include *blocker* bugs which are 
 critical from an operational, security or data-integrity issues.
 
 This way, we can ensure that a minor series release (2.2.x or 2.3.x or 
 2.4.x) is always release-able, and more importantly, deploy-able at any 
 point in time.
 
 
 
 Sandy did bring up a related point about timing of releases and the urge for 
 everyone to cram features/fixes into a dot release.
 
 So, we could remedy that situation by doing a release every 4-6 weeks (2.3, 
 2.4 etc.) and keep the patch releases limited to blocker bugs.
 
 Thoughts?
 
 thanks,
 Arun
 
 
 
 
 
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/





Re: Next releases

2013-11-14 Thread Colin McCabe
On Wed, Nov 13, 2013 at 10:10 AM, Arun C Murthy a...@hortonworks.com wrote:

 On Nov 12, 2013, at 1:54 PM, Todd Lipcon t...@cloudera.com wrote:

 On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

 To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
 there.  However, I have only been following the HDFS and common side
 of things so I may not have the full picture.  Arun, can you give a
 specific example of something you'd like to blow away?

 There are bunch of issues in YARN/MapReduce which clearly aren't *critical*, 
 similarly in HDFS a cursory glance showed up some 
 *enhancements*/*improvements* in CHANGES.txt which aren't necessary for a 
 patch release, plus things like:

 HADOOP-9623
 Update jets3t dependency to 0.9.0

I'm fine with reverting HADOOP-9623 from branch-2.2 and pushing it out
to branch-2.3.  It does bring in httpcore, a dependency that wasn't
there before.

Colin


 Here is a straw man proposal:

 
 A patch (third version) release should only include *blocker* bugs which are 
 critical from an operational, security or data-integrity issues.

 This way, we can ensure that a minor series release (2.2.x or 2.3.x or 2.4.x) 
 is always release-able, and more importantly, deploy-able at any point in 
 time.

 

 Sandy did bring up a related point about timing of releases and the urge for 
 everyone to cram features/fixes into a dot release.

 So, we could remedy that situation by doing a release every 4-6 weeks (2.3, 
 2.4 etc.) and keep the patch releases limited to blocker bugs.

 Thoughts?

 thanks,
 Arun







Re: Next releases

2013-11-14 Thread Steve Loughran
On 14 November 2013 17:15, Colin McCabe cmcc...@alumni.cmu.edu wrote:


  HADOOP-9623
  Update jets3t dependency to 0.9.0

 I'm fine with reverting HADOOP-9623 from branch-2.2 and pushing it out
 to branch-2.3.  It does bring in httpcore, a dependency that wasn't
 there before.


I think http components came in 2.3 with the openstack package, if
that's the same one.

Also, I'm doing a step-by-step update of all the dependencies for 2.3+; the
jets3t one is a major code change, but there are lots of others to deal with, even
if we leave jetty and friends alone:
https://issues.apache.org/jira/browse/HADOOP-9991



Re: Next releases

2013-11-13 Thread Steve Loughran
On 13 November 2013 18:11, Arun C Murthy a...@hortonworks.com wrote:



HADOOP-9623
 Update jets3t dependency to 0.9.0


I saw that change. I don't think it's a bad one, but I do think we need more
testing of blobstores, especially big operations like 6GB uploads (which
should now work with the 0.9.0 jets3t).



Re: Next releases

2013-11-13 Thread Sandy Ryza
Here are a few patches that I put into 2.2.1 that are minimally invasive, but
that I don't think are blockers:

  YARN-305. Fair scheduler logs too many Node offered to app messages.
  YARN-1335. Move duplicate code from FSSchedulerApp and
FiCaSchedulerApp into SchedulerApplication
  YARN-1333. Support blacklisting in the Fair Scheduler
  YARN-1109. Demote NodeManager Sending out status for container logs
to debug (haosdent via Sandy Ryza)
  YARN-1388. Fair Scheduler page always displays blank fair share

+1 to doing releases at some fixed time interval.

-Sandy


On Wed, Nov 13, 2013 at 10:10 AM, Arun C Murthy a...@hortonworks.com wrote:


 On Nov 12, 2013, at 1:54 PM, Todd Lipcon t...@cloudera.com wrote:

  On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu
 wrote:
 
  To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
  there.  However, I have only been following the HDFS and common side
  of things so I may not have the full picture.  Arun, can you give a
  specific example of something you'd like to blow away?

 There are bunch of issues in YARN/MapReduce which clearly aren't
 *critical*, similarly in HDFS a cursory glance showed up some
 *enhancements*/*improvements* in CHANGES.txt which aren't necessary for a
 patch release, plus things like:

 HADOOP-9623
 Update jets3t dependency to 0.9.0









 Having said that, the HDFS devs know their code the best.

  I agree with Colin. If we've been backporting things into a patch release
  (third version component) which don't belong, we should explicitly call
 out
  those patches, so we can learn from our mistakes and have a discussion
  about what belongs.

 Good point.

 Here is a straw man proposal:

 
 A patch (third version) release should only include *blocker* bugs which
 are critical from an operational, security or data-integrity issues.

 This way, we can ensure that a minor series release (2.2.x or 2.3.x or
 2.4.x) is always release-able, and more importantly, deploy-able at any
 point in time.

 

 Sandy did bring up a related point about timing of releases and the urge
 for everyone to cram features/fixes into a dot release.

 So, we could remedy that situation by doing a release every 4-6 weeks
 (2.3, 2.4 etc.) and keep the patch releases limited to blocker bugs.

 Thoughts?

 thanks,
 Arun








Re: Next releases

2013-11-13 Thread Arun C Murthy

On Nov 13, 2013, at 12:38 PM, Sandy Ryza sandy.r...@cloudera.com wrote:

 Here are few patches that I put into 2.2.1 and are minimally invasive, but
 I don't think are blockers:
 
  YARN-305. Fair scheduler logs too many Node offered to app messages.
  YARN-1335. Move duplicate code from FSSchedulerApp and
 FiCaSchedulerApp into SchedulerApplication
  YARN-1333. Support blacklisting in the Fair Scheduler
  YARN-1109. Demote NodeManager Sending out status for container logs
 to debug (haosdent via Sandy Ryza)
  YARN-1388. Fair Scheduler page always displays blank fair share
 
 +1 to doing releases at some fixed time interval.

To be clear, I still think we should be *very* clear about what features we 
target for each release (2.3, 2.4, etc.).

Except, we don't wait infinitely for any specific feature - if we miss a 4-6 
week window a feature goes to the next train.

Makes sense?

thanks,
Arun

 
 -Sandy
 
 
 On Wed, Nov 13, 2013 at 10:10 AM, Arun C Murthy a...@hortonworks.com wrote:
 
 
 On Nov 12, 2013, at 1:54 PM, Todd Lipcon t...@cloudera.com wrote:
 
 On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu
 wrote:
 
 To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
 there.  However, I have only been following the HDFS and common side
 of things so I may not have the full picture.  Arun, can you give a
 specific example of something you'd like to blow away?
 
 There are bunch of issues in YARN/MapReduce which clearly aren't
 *critical*, similarly in HDFS a cursory glance showed up some
 *enhancements*/*improvements* in CHANGES.txt which aren't necessary for a
 patch release, plus things like:
 
HADOOP-9623
 Update jets3t dependency to 0.9.0
 
 
 
 
 
 
 
 
 
 Having said that, the HDFS devs know their code the best.
 
 I agree with Colin. If we've been backporting things into a patch release
 (third version component) which don't belong, we should explicitly call
 out
 those patches, so we can learn from our mistakes and have a discussion
 about what belongs.
 
 Good point.
 
 Here is a straw man proposal:
 
 
 A patch (third version) release should only include *blocker* bugs which
 are critical from an operational, security or data-integrity issues.
 
 This way, we can ensure that a minor series release (2.2.x or 2.3.x or
 2.4.x) is always release-able, and more importantly, deploy-able at any
 point in time.
 
 
 
 Sandy did bring up a related point about timing of releases and the urge
 for everyone to cram features/fixes into a dot release.
 
 So, we could remedy that situation by doing a release every 4-6 weeks
 (2.3, 2.4 etc.) and keep the patch releases limited to blocker bugs.
 
 Thoughts?
 
 thanks,
 Arun
 
 
 
 
 
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/





Re: Next releases

2013-11-13 Thread Alejandro Abdelnur
Sounds good, just a little impedance mismatch between what seem to be 2 conflicting
goals:

* what features we target for each release
* train releases

If we want to do train releases at fixed times, then if a feature is not
ready, it catches the next train; no delays of the train because of a
feature. If a bug is delaying the train and a feature becomes ready in the
meantime and it does not destabilize the release, it can jump on board; if
it breaks something, it goes out of the window until the next train.
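
A minimal sketch of the train rule described above, just to make the decision explicit. The names here (Feature, boards_train, plan_train) are hypothetical and not part of any Hadoop tooling; it only assumes that each feature can be labeled as ready or not, and as destabilizing or not.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    ready: bool          # feature work is complete when the train leaves
    destabilizes: bool   # merging it would break or destabilize the release


def boards_train(feature):
    """A feature boards the current train only if it is ready and safe."""
    return feature.ready and not feature.destabilizes


def plan_train(features):
    """Split features into those on the current train and those deferred."""
    on_board = [f.name for f in features if boards_train(f)]
    next_train = [f.name for f in features if not boards_train(f)]
    return on_board, next_train
```

The train itself never waits: anything that is not ready, or that breaks something, simply lands in the second list and is reconsidered for the following release.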

Also, we have to decide what we do with 2.2.1. I would say start wrapping up
the current 2.2 branch and make it the first train.

thx




On Wed, Nov 13, 2013 at 12:55 PM, Arun C Murthy a...@hortonworks.com wrote:


 On Nov 13, 2013, at 12:38 PM, Sandy Ryza sandy.r...@cloudera.com wrote:

  Here are few patches that I put into 2.2.1 and are minimally invasive,
 but
  I don't think are blockers:
 
   YARN-305. Fair scheduler logs too many Node offered to app messages.
   YARN-1335. Move duplicate code from FSSchedulerApp and
  FiCaSchedulerApp into SchedulerApplication
   YARN-1333. Support blacklisting in the Fair Scheduler
   YARN-1109. Demote NodeManager Sending out status for container logs
  to debug (haosdent via Sandy Ryza)
   YARN-1388. Fair Scheduler page always displays blank fair share
 
  +1 to doing releases at some fixed time interval.

 To be clear, I still think we should be *very* clear about what features
 we target for each release (2.3, 2.4, etc.).

 Except, we don't wait infinitely for any specific feature - if we miss a
 4-6 week window a feature goes to the next train.

 Makes sense?

 thanks,
 Arun

 
  -Sandy
 
 
  On Wed, Nov 13, 2013 at 10:10 AM, Arun C Murthy a...@hortonworks.com
 wrote:
 
 
  On Nov 12, 2013, at 1:54 PM, Todd Lipcon t...@cloudera.com wrote:
 
  On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu
  wrote:
 
  To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
  there.  However, I have only been following the HDFS and common side
  of things so I may not have the full picture.  Arun, can you give a
  specific example of something you'd like to blow away?
 
  There are bunch of issues in YARN/MapReduce which clearly aren't
  *critical*, similarly in HDFS a cursory glance showed up some
  *enhancements*/*improvements* in CHANGES.txt which aren't necessary for
 a
  patch release, plus things like:
 
 HADOOP-9623
  Update jets3t dependency to 0.9.0
 
 
 
 
 
 
 
 
 
  Having said that, the HDFS devs know their code the best.
 
  I agree with Colin. If we've been backporting things into a patch
 release
  (third version component) which don't belong, we should explicitly call
  out
  those patches, so we can learn from our mistakes and have a discussion
  about what belongs.
 
  Good point.
 
  Here is a straw man proposal:
 
  
  A patch (third version) release should only include *blocker* bugs which
  are critical from an operational, security or data-integrity issues.
 
  This way, we can ensure that a minor series release (2.2.x or 2.3.x or
  2.4.x) is always release-able, and more importantly, deploy-able at any
  point in time.
 
  
 
  Sandy did bring up a related point about timing of releases and the urge
  for everyone to cram features/fixes into a dot release.
 
  So, we could remedy that situation by doing a release every 4-6 weeks
  (2.3, 2.4 etc.) and keep the patch releases limited to blocker bugs.
 
  Thoughts?
 
  thanks,
  Arun
 
 
 
 
 
 

 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/







-- 
Alejandro


Re: Next releases

2013-11-13 Thread Jason Lowe
I think a lot of confusion comes from the fact that the 2.x line is 
starting to mature.  Before this there wasn't such a big contention of 
what went into patch vs. minor releases and often the lines were blurred 
between the two.  However now we have significant customers and products 
starting to use 2.x as a base, which means we need to start treating it 
like we treat 1.x.  That means getting serious about what we should put 
into a patch release vs. what we postpone to a minor release.


Here's my $0.02 on recent proposals:

+1 to releasing more often in general.  A lot of the rush to put changes 
into a patch release is because it can be a very long time between any 
kind of release.  If minor releases are more frequent then I hope there 
would be less of a need to rush something or hold up a release.


+1 to limiting checkins of patch releases to Blockers/Criticals.  If 
necessary committers check into trunk/branch-2 only and defer to the 
patch release manager for the patch release merge.  Then there should be 
fewer surprises for everyone what ended up in a patch release and less 
likely the patch release becomes destabilized from the sheer amount of 
code churn.  Maybe this won't be necessary if everyone understands that 
the patch release isn't the only way to get a change out in timely manner.
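
A rough sketch of how that split of responsibilities could look, under stated assumptions: the function name merge_plan and the issue keys are made up for illustration, "branch-2.2" stands in for whichever patch branch is current, and the priority strings are the usual JIRA priority names. It is not an existing script.

```python
# Branches every committer checks into directly.
BRANCHES_FOR_ALL = ("trunk", "branch-2")
# Patch-release branch, merged only by the patch release manager (assumed name).
PATCH_BRANCH = "branch-2.2"
# Priorities allowed into a patch release under the proposal.
PATCH_PRIORITIES = {"Blocker", "Critical"}


def merge_plan(issue_key, priority):
    """Return who merges an issue where, given its JIRA priority."""
    plan = {
        "issue": issue_key,
        "committer": list(BRANCHES_FOR_ALL),
        "release_manager": [],
    }
    if priority in PATCH_PRIORITIES:
        plan["release_manager"].append(PATCH_BRANCH)
    return plan
```

For example, merge_plan("EXAMPLE-1", "Major") leaves the release manager's list empty, so the change waits for the next minor release.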


As for 2.2.1, again I think it's expectations for what that release 
means.  If it's really just a patch release then there shouldn't be 
features in it and tons of code churn, but I think many were treating it 
as the next vehicle to deliver changes in general.  If we think 2.2.1 is 
just as good or better than 2.2.0 then let's wrap it up and move to a 
more disciplined approach for subsequent patch releases and more 
frequent minor releases.


Jason

On 11/13/2013 12:10 PM, Arun C Murthy wrote:

On Nov 12, 2013, at 1:54 PM, Todd Lipcon t...@cloudera.com wrote:


On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:


To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
there.  However, I have only been following the HDFS and common side
of things so I may not have the full picture.  Arun, can you give a
specific example of something you'd like to blow away?

There are bunch of issues in YARN/MapReduce which clearly aren't *critical*, 
similarly in HDFS a cursory glance showed up some *enhancements*/*improvements* 
in CHANGES.txt which aren't necessary for a patch release, plus things like:

HADOOP-9623 
Update jets3t dependency to 0.9.0







  


Having said that, the HDFS devs know their code the best.


I agree with Colin. If we've been backporting things into a patch release
(third version component) which don't belong, we should explicitly call out
those patches, so we can learn from our mistakes and have a discussion
about what belongs.

Good point.

Here is a straw man proposal:


A patch (third version) release should only include *blocker* bugs which are 
critical from an operational, security or data-integrity issues.

This way, we can ensure that a minor series release (2.2.x or 2.3.x or 2.4.x) 
is always release-able, and more importantly, deploy-able at any point in time.



Sandy did bring up a related point about timing of releases and the urge for 
everyone to cram features/fixes into a dot release.

So, we could remedy that situation by doing a release every 4-6 weeks (2.3, 2.4 
etc.) and keep the patch releases limited to blocker bugs.

Thoughts?

thanks,
Arun









Re: Next releases

2013-11-12 Thread Todd Lipcon
On Mon, Nov 11, 2013 at 2:57 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

 To be honest, I'm not aware of anything in 2.2.1 that shouldn't be
 there.  However, I have only been following the HDFS and common side
 of things so I may not have the full picture.  Arun, can you give a
 specific example of something you'd like to blow away?


I agree with Colin. If we've been backporting things into a patch release
(third version component) which don't belong, we should explicitly call out
those patches, so we can learn from our mistakes and have a discussion
about what belongs. Otherwise we'll just end up doing it again. Saying
there were a few mistakes, so let's reset back a bunch of backport work
seems like a baby-with-the-bathwater situation.

Todd


Re: Next releases

2013-11-11 Thread Hari Mankude
Hi Arun,

Another feature that would be relevant and got deferred was the symlink
work (HADOOP-10020) that Colin and Andrew were working on. Can we include
this in hadoop-2.3.0 also?

thanks
hari


On Sun, Nov 10, 2013 at 2:07 PM, Alejandro Abdelnur t...@cloudera.com wrote:

 Arun, thanks for jumping on this.

 On hadoop branch-2.2: I've quickly scanned the commit logs starting from
 the 2.2.0 release and I've found around 20 JIRAs that I'd like to see in
 2.2.1. Not all of them are bugs but they don't shake anything and they improve
 usability.

 I presume others will have their own laundry lists as well, and I wonder how
 close the union of all of them comes to the current 81 commits.

 How about splitting the JIRAs among a few contributors to assert there is
 nothing risky in there? And if there is, discuss getting rid of those commits
 for 2.2.1. IMO doing that would be cheaper than selectively applying
 commits on a fresh branch.

 That said, I think we should get 2.2.1 out of the door before switching
 main efforts to 2.3.0. I volunteer to drive a 2.2.1 release ASAP
 if you don't have the bandwidth at the moment for it.

 Cheers.

 Alejandro


 
 Commits in branch-2.2 that I'd like them to be in the 2.2.1 release:

 The ones prefixed with '*' technically are not bugs.

  YARN-1284. LCE: Race condition leaves dangling cgroups entries for killed
 containers. (Alejandro Abdelnur via Sandy Ryza)
  YARN-1265. Fair Scheduler chokes on unhealthy node reconnect (Sandy Ryza)
  YARN-1044. used/min/max resources do not display info in the scheduler
 page (Sangjin Lee via Sandy Ryza)
  YARN-305. Fair scheduler logs too many Node offered to app messages.
 (Lohit Vijayarenu via Sandy Ryza)
 *MAPREDUCE-5463. Deprecate SLOTS_MILLIS counters. (Tsuyoshi Ozawa via Sandy
 Ryza)
  YARN-1259. In Fair Scheduler web UI, queue num pending and num active apps
 switched. (Robert Kanter via Sandy Ryza)
  YARN-1295. In UnixLocalWrapperScriptBuilder, using bash -c can cause Text
 file busy errors. (Sandy Ryza)
 *MAPREDUCE-5457. Add a KeyOnlyTextOutputReader to enable streaming to write
 out text files without separators (Sandy Ryza)
 *YARN-1258. Allow configuring the Fair Scheduler root queue (Sandy Ryza)
 *YARN-1288. Make Fair Scheduler ACLs more user friendly (Sandy Ryza)
  YARN-1330. Fair Scheduler: defaultQueueSchedulingPolicy does not take
 effect (Sandy Ryza)
  HDFS-5403. WebHdfs client cannot communicate with older WebHdfs servers
 post HDFS-5306. Contributed by Aaron T. Myers.
 *YARN-1335. Move duplicate code from FSSchedulerApp and FiCaSchedulerApp
 into SchedulerApplication (Sandy Ryza)
 *YARN-1333. Support blacklisting in the Fair Scheduler (Tsuyoshi Ozawa via
 Sandy Ryza)
 *MAPREDUCE-4680. Job history cleaner should only check timestamps of files
 in old enough directories (Robert Kanter via Sandy Ryza)
  YARN-1109. Demote NodeManager Sending out status for container logs to
 debug (haosdent via Sandy Ryza)
 *YARN-1321. Changed NMTokenCache to support both singleton and an instance
 usage. Contributed by Alejandro Abdelnur
  YARN-1343. NodeManagers additions/restarts are not reported as node
 updates in AllocateResponse responses to AMs. (tucu)
  YARN-1381. Same relaxLocality appears twice in exception message of
 AMRMClientImpl#checkLocalityRelaxationConflict() (Ted Yu via Sandy Ryza)
  HADOOP-9898. Set SO_KEEPALIVE on all our sockets. Contributed by Todd
 Lipcon.
  YARN-1388. Fair Scheduler page always displays blank fair share (Liyin
 Liang via Sandy Ryza)



 On Fri, Nov 8, 2013 at 10:35 PM, Chris Nauroth cnaur...@hortonworks.com
 wrote:

  Arun, what are your thoughts on test-only patches?  I know I've been
  merging a lot of Windows test stabilization patches down to branch-2.2.
   These can't rightly be called blockers, but they do improve dev
  experience, and there is no risk to product code.
 
  Chris Nauroth
  Hortonworks
  http://hortonworks.com/
 
 
 
  On Fri, Nov 8, 2013 at 1:30 AM, Steve Loughran ste...@hortonworks.com
  wrote:
 
   On 8 November 2013 02:42, Arun C Murthy a...@hortonworks.com wrote:
  
Gang,
   
 Thinking through the next couple of releases here, appreciate f/b.
   
 # hadoop-2.2.1
   
 I was looking through commit logs and there is a *lot* of content
 here
(81 commits as on 11/7). Some are features/improvements and some are
   fixes
- it's really hard to distinguish what is important and what isn't.
   
 I propose we start with a blank slate (i.e. blow away branch-2.2 and
start fresh from a copy of branch-2.2.0)  and then be very careful
 and
meticulous about including only *blocker* fixes in branch-2.2. So,
 most
   of
the content here comes via the next minor release (i.e. hadoop-2.3)
   
 In future, we continue to be *very* parsimonious about what gets
 into
  a
patch release (major.minor.patch) - in general, these should be only
*blocker* fixes or 

Re: Next releases

2013-11-08 Thread Sandy Ryza
Starting afresh with 2.2.1 and keeping it as small as possible sounds
reasonable to me.

Would love to get 2.3 out soon.  To that end, how would people feel about
having code and/or feature freeze and/or ship dates?  We've been way behind
our goals for recent releases. Having actual targets on the calendar would
help us achieve the regular release cadence that can really benefit both
downstream projects and ourselves.  Without this, we get in the habit of
stuffing changes into the closest release, even if minor or patch, for fear
that the next will be many months away.

Vinod, Zhijie, or Mayank, please correct me if I'm wrong, but my intuition
is that end-of-year is a little unrealistic for YARN-321.  Will the ~30
working days before the end of the year be enough to complete the feature,
stabilize it, bring APIs to the point that we won't need to break them in
the future, and merge the branch?  Could we either push the feature out or
aim for the end of January instead?

Assuming the latter, some strawman dates:
Feature freeze - 1/6
Code freeze - 1/16
Release - 1/30

Lastly, I'd like to add finer-grained CPU scheduling a la YARN-1089 to the
target features.

thanks,
Sandy


On Thu, Nov 7, 2013 at 6:42 PM, Arun C Murthy a...@hortonworks.com wrote:

 Gang,

  Thinking through the next couple of releases here, appreciate f/b.

  # hadoop-2.2.1

  I was looking through commit logs and there is a *lot* of content here
 (81 commits as on 11/7). Some are features/improvements and some are fixes
 - it's really hard to distinguish what is important and what isn't.

  I propose we start with a blank slate (i.e. blow away branch-2.2 and
 start fresh from a copy of branch-2.2.0)  and then be very careful and
 meticulous about including only *blocker* fixes in branch-2.2. So, most of
 the content here comes via the next minor release (i.e. hadoop-2.3)

  In future, we continue to be *very* parsimonious about what gets into a
 patch release (major.minor.patch) - in general, these should be only
 *blocker* fixes or key operational issues.

  # hadoop-2.3

  I'd like to propose the following features for YARN/MR to make it into
 hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
  * Application History Server - This is happening in  a branch and is
 close; with it we can provide a reasonable experience for new frameworks
 being built on top of YARN.
  * Bug-fixes in RM Restart
  * Minimal support for long-running applications (e.g. security) via
 YARN-896
  * RM Fail-over via ZKFC
  * Anything else?

  HDFS???

  Overall, I feel like we have a decent chance of rolling hadoop-2.3 by the
 end of the year.

  Thoughts?

 thanks,
 Arun


 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/



 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.



Re: Next releases

2013-11-08 Thread Jun Ping Du
Hi Arun,
   Thanks for working out this list, which looks great to me. In addition, I 
would like to add an item to the 2.3 release: YARN-291, which enhances YARN's 
resource elasticity in cloud scenarios and can benefit other scenarios, e.g. 
graceful NM decommission (YARN-914), non job/app regression (or maintenance 
model) in NM rolling upgrade (YARN-671), etc. With great help from Luke, Bikas 
and Vinod, we have already gotten the first and most important piece of work 
(YARN-311) in. Now I am working on the remaining parts, including interfaces 
(RPC, CLI, REST, etc.) and a few enhancements (persistence, support for 
different policies, etc.), and I am optimistic about completing most of the 
work by the end of 2013. Would you help get it in if we can make it on time? :)

Thanks,

Junping

- Original Message -
From: Arun C Murthy a...@hortonworks.com
To: common-dev@hadoop.apache.org, hdfs-...@hadoop.apache.org, 
yarn-...@hadoop.apache.org, mapreduce-...@hadoop.apache.org
Sent: Friday, November 8, 2013 10:42:36 AM
Subject: Next releases

Gang,

 Thinking through the next couple of releases here, appreciate f/b.

 # hadoop-2.2.1

 I was looking through commit logs and there is a *lot* of content here (81 
commits as on 11/7). Some are features/improvements and some are fixes - it's 
really hard to distinguish what is important and what isn't.

 I propose we start with a blank slate (i.e. blow away branch-2.2 and start 
fresh from a copy of branch-2.2.0)  and then be very careful and meticulous 
about including only *blocker* fixes in branch-2.2. So, most of the content 
here comes via the next minor release (i.e. hadoop-2.3)

 In future, we continue to be *very* parsimonious about what gets into a patch 
release (major.minor.patch) - in general, these should be only *blocker* fixes 
or key operational issues.

 # hadoop-2.3
 
 I'd like to propose the following features for YARN/MR to make it into 
hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
 * Application History Server - This is happening in  a branch and is close; 
with it we can provide a reasonable experience for new frameworks being built 
on top of YARN.
 * Bug-fixes in RM Restart
 * Minimal support for long-running applications (e.g. security) via YARN-896
 * RM Fail-over via ZKFC
 * Anything else?

 HDFS???

 Overall, I feel like we have a decent chance of rolling hadoop-2.3 by the end 
of the year.

 Thoughts?

thanks,
Arun
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/





Re: Next releases

2013-11-08 Thread Steve Loughran
On 8 November 2013 02:42, Arun C Murthy a...@hortonworks.com wrote:

 Gang,

  Thinking through the next couple of releases here, appreciate f/b.

  # hadoop-2.2.1

  I was looking through commit logs and there is a *lot* of content here
 (81 commits as on 11/7). Some are features/improvements and some are fixes
 - it's really hard to distinguish what is important and what isn't.

  I propose we start with a blank slate (i.e. blow away branch-2.2 and
 start fresh from a copy of branch-2.2.0)  and then be very careful and
 meticulous about including only *blocker* fixes in branch-2.2. So, most of
 the content here comes via the next minor release (i.e. hadoop-2.3)

  In future, we continue to be *very* parsimonious about what gets into a
 patch release (major.minor.patch) - in general, these should be only
 *blocker* fixes or key operational issues.


+1



  # hadoop-2.3

  I'd like to propose the following features for YARN/MR to make it into
 hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
  * Application History Server - This is happening in  a branch and is
 close; with it we can provide a reasonable experience for new frameworks
 being built on top of YARN.
  * Bug-fixes in RM Restart
  * Minimal support for long-running applications (e.g. security) via
 YARN-896


+1 -the complete set isn't going to make it, but I'm sure we can identify
the key ones



  * RM Fail-over via ZKFC
  * Anything else?

  HDFS???



   - If I had the time, I'd like to do some work on the HADOOP-9361
   filesystem spec & tests - this is mostly some specification, the basis of a
   better test framework for newer FS tests, and some more tests, with a
   couple of minor changes to some of the FS code, mainly in terms of
   tightening some of the exceptions thrown (IOE -> EOF)

otherwise:

   - I'd like the hadoop-openstack JAR in; it's already in branch-2 so
   it's a matter of ensuring testing during the release against as many
   providers as possible.
   - There are a fair few JIRAs about updating versions of dependencies
   - the S3 JetS3t update went in this week, but there are more, as well as
   cruft in the POMs which shows up downstream. I think we could update the
   low-risk dependencies (test-time, log4j, &c), while avoiding those we know
   will be trouble (jetty). This may seem minor but it does make a big diff to
   the downstream projects.



Re: Next releases

2013-11-08 Thread Chris Nauroth
Arun, what are your thoughts on test-only patches?  I know I've been
merging a lot of Windows test stabilization patches down to branch-2.2.
 These can't rightly be called blockers, but they do improve dev
experience, and there is no risk to product code.

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Fri, Nov 8, 2013 at 1:30 AM, Steve Loughran ste...@hortonworks.com wrote:

 On 8 November 2013 02:42, Arun C Murthy a...@hortonworks.com wrote:

  Gang,
 
   Thinking through the next couple of releases here, appreciate f/b.
 
   # hadoop-2.2.1
 
   I was looking through commit logs and there is a *lot* of content here
  (81 commits as on 11/7). Some are features/improvements and some are fixes
  - it's really hard to distinguish what is important and what isn't.
 
   I propose we start with a blank slate (i.e. blow away branch-2.2 and
  start fresh from a copy of branch-2.2.0)  and then be very careful and
  meticulous about including only *blocker* fixes in branch-2.2. So, most of
  the content here comes via the next minor release (i.e. hadoop-2.3)
 
   In future, we continue to be *very* parsimonious about what gets into a
  patch release (major.minor.patch) - in general, these should be only
  *blocker* fixes or key operational issues.
 

 +1


 
   # hadoop-2.3
 
   I'd like to propose the following features for YARN/MR to make it into
  hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
   * Application History Server - This is happening in  a branch and is
  close; with it we can provide a reasonable experience for new frameworks
  being built on top of YARN.
   * Bug-fixes in RM Restart
   * Minimal support for long-running applications (e.g. security) via
  YARN-896
 

 +1 -the complete set isn't going to make it, but I'm sure we can identify
 the key ones



   * RM Fail-over via ZKFC
   * Anything else?
 
   HDFS???
 
 

 - If I had the time, I'd like to do some work on the HADOOP-9361
 filesystem spec & tests - this is mostly some specification, the basis of a
 better test framework for newer FS tests, and some more tests, with a
 couple of minor changes to some of the FS code, mainly in terms of
 tightening some of the exceptions thrown (IOE -> EOF)

 otherwise:

 - I'd like the hadoop-openstack JAR in; it's already in branch-2 so
 it's a matter of ensuring testing during the release against as many
 providers as possible.
 - There are a fair few JIRAs about updating versions of dependencies
 - the S3 JetS3t update went in this week, but there are more, as well as
 cruft in the POMs which shows up downstream. I think we could update the
 low-risk dependencies (test-time, log4j, &c), while avoiding those we know
 will be trouble (jetty). This may seem minor but it does make a big diff to
 the downstream projects.



Next releases

2013-11-07 Thread Arun C Murthy
Gang,

 Thinking through the next couple of releases here, appreciate f/b.

 # hadoop-2.2.1

 I was looking through commit logs and there is a *lot* of content here (81 
commits as on 11/7). Some are features/improvements and some are fixes - it's 
really hard to distinguish what is important and what isn't.

 I propose we start with a blank slate (i.e. blow away branch-2.2 and start 
fresh from a copy of branch-2.2.0)  and then be very careful and meticulous 
about including only *blocker* fixes in branch-2.2. So, most of the content 
here comes via the next minor release (i.e. hadoop-2.3)

 In future, we continue to be *very* parsimonious about what gets into a patch 
release (major.minor.patch) - in general, these should be only *blocker* fixes 
or key operational issues.

 # hadoop-2.3
 
 I'd like to propose the following features for YARN/MR to make it into 
hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
 * Application History Server - This is happening in  a branch and is close; 
with it we can provide a reasonable experience for new frameworks being built 
on top of YARN.
 * Bug-fixes in RM Restart
 * Minimal support for long-running applications (e.g. security) via YARN-896
 * RM Fail-over via ZKFC
 * Anything else?

 HDFS???

 Overall, I feel like we have a decent chance of rolling hadoop-2.3 by the end 
of the year.

 Thoughts?

thanks,
Arun
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


