Re: Thinking ahead to hadoop-2.6

2014-11-05 Thread Arun C Murthy
 On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang andrew.w...@cloudera.com
 wrote:

 Hey Arun,

 Maybe we could do a quick run through of the Roadmap wiki and
 add/retarget things accordingly?

 I think the KMS and transparent encryption are ready to go. We've got a
 few further bug fixes pending, but that's it.
 
 Two HDFS things that I think probably won't make the end of the week are
 archival storage (HDFS-6584) and single replica memory writes
 (HDFS-6581), which I believe are under the HSM banner. HDFS-6584 was just
 merged to trunk and I think needs a little more work before it goes into
 branch-2. HDFS-6581 hasn't even been merged to trunk yet, so it seems a
 bit further off.
 
 Just my 2c as I did not work directly on these features. I just generally
 shy away from shipping bits quite this fresh.
 
 Thanks,
 Andrew
 
 On Tue, Sep 23, 2014 at 3:03 PM, Arun Murthy a...@hortonworks.com wrote:

 Looks like most of the content is in and hadoop-2.6 is shaping up nicely.

 I'll create branch-2.6 by end of the week and we can go from there to
 stabilize it - hopefully in the next few weeks.
 
 Thoughts?
 
 thanks,
 Arun
 
 On Tue, Aug 12, 2014 at 1:34 PM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 With hadoop-2.5 nearly done, it's time to start thinking ahead to
 hadoop-2.6.
 
 Currently, here is the Roadmap per the wiki:
 
   • HADOOP
       • Credential provider HADOOP-10607
   • HDFS
       • Heterogeneous storage (Phase 2) - Support APIs for using storage
         tiers by the applications HDFS-5682
       • Memory as storage tier HDFS-5851
   • YARN
       • Dynamic Resource Configuration YARN-291
       • NodeManager Restart YARN-1336
       • ResourceManager HA Phase 2 YARN-556
       • Support for admin-specified labels in YARN YARN-796
       • Support for automatic, shared cache for YARN application
         artifacts YARN-1492
       • Support NodeGroup layer topology on YARN YARN-18
       • Support for Docker containers in YARN YARN-1964
       • YARN service registry YARN-913
 
 My suspicion is, as is normal, some will make the cut and some won't.
 Please do add/subtract from the list as appropriate. Ideally, it would be
 good to ship hadoop-2.6 in 6-8 weeks (say, October) to keep up a cadence.

 More importantly, as we discussed previously, we'd like hadoop-2.6 to be
 the *last* Apache Hadoop 2.x release which supports JDK6. I'll start a
 discussion with other communities (HBase, Pig, Hive, Oozie etc.) and see
 how they feel about this.
 
 thanks,
 Arun
 
 
 --
 
 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/
 
 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.
 
 
 

Re: Thinking ahead to hadoop-2.6

2014-10-23 Thread Yongjun Zhang
...@hortonworks.com wrote:
 
 
 
 
  I actually would like to see both archival storage and single replica
  memory writes in the 2.6 release. Archival storage is in the final
  stages of getting ready for the branch-2 merge, as Nicholas has already
  indicated on the dev mailing list. Hopefully HDFS-6581 gets ready
  sooner. Both of these features have been in development for some time.
 

Re: Thinking ahead to hadoop-2.6

2014-10-19 Thread Yongjun Zhang

Re: Thinking ahead to hadoop-2.6

2014-10-15 Thread Vinod Kumar Vavilapalli

Re: Thinking ahead to hadoop-2.6

2014-10-14 Thread Arun C Murthy

Re: Thinking ahead to hadoop-2.6

2014-10-02 Thread Arun C Murthy

Re: Thinking ahead to hadoop-2.6

2014-09-24 Thread Andrew Wang
Hey Nicholas,

My concern about Archival Storage isn't related to the code quality or the
size of the feature. I think that you and Jing did good work. My concern is
that once we ship, we're locked into that set of archival storage APIs, and
these APIs are not yet finalized. Simply being able to turn off the feature
does not change the compatibility story.

I'm willing to devote time to help review these JIRAs and kick the tires on
the APIs, but my point above was that I'm not sure it'd all be done by the
end of the week. Testing might also reveal additional changes that need to
be made, which also might not happen by end-of-week.

I guess the question before us is whether we're comfortable putting
something in branch-2.6 and then potentially adding API changes after. I'm
okay with that as long as we're all aware that this might happen.

Arun, as RM is this cool with you? Again, I like this feature and I'm fine
with its inclusion, just a heads up that we might need some extra time to
finalize things before an RC can be cut.

Thanks,
Andrew

On Tue, Sep 23, 2014 at 7:30 PM, Tsz Wo (Nicholas), Sze 
s29752-hadoop...@yahoo.com.invalid wrote:

 Hi,

 I am worried about KMS and transparent encryption since quite a few bugs
 were discovered after it got merged to branch-2.  It gives the impression
 that the feature is not yet well tested.  Indeed, transparent encryption is
 a complicated feature which changes the core part of HDFS.  It is not easy
 to get everything right.


 For HDFS-6584: Archival Storage, it is a relatively simple and low risk
 feature.  It introduces a new storage type ARCHIVE and the concept of block
 storage policy to HDFS.  When a cluster is configured with ARCHIVE storage,
 the blocks will be stored using the appropriate storage types specified by
 the storage policies assigned to the files/directories.  A cluster admin can
 disable the feature simply by not configuring any storage type and not
 setting any storage policy, as before.  As Suresh mentioned, HDFS-6584 is
 in the final stages of being merged to branch-2.

 Regards,
 Tsz-Wo



 On Wednesday, September 24, 2014 7:00 AM, Suresh Srinivas 
 sur...@hortonworks.com wrote:


 
 
 I actually would like to see both archival storage and single replica
 memory writes in the 2.6 release. Archival storage is in the final stages
 of getting ready for the branch-2 merge, as Nicholas has already indicated
 on the dev mailing list. Hopefully HDFS-6581 gets ready sooner. Both of
 these features have been in development for some time.
 

Re: Thinking ahead to hadoop-2.6

2014-09-24 Thread Suresh Srinivas
Given that some of the features are in the final stages of stabilization,
Arun, should we hold off creating the 2.6 branch or building an RC by a week?
All the features in flux are important ones and worth delaying the release
by a week.

On Wed, Sep 24, 2014 at 11:36 AM, Andrew Wang andrew.w...@cloudera.com
wrote:

 Hey Nicholas,

 My concern about Archival Storage isn't related to the code quality or the
 size of the feature. I think that you and Jing did good work. My concern is
 that once we ship, we're locked into that set of archival storage APIs, and
 these APIs are not yet finalized. Simply being able to turn off the feature
 does not change the compatibility story.

 I'm willing to devote time to help review these JIRAs and kick the tires on
 the APIs, but my point above was that I'm not sure it'd all be done by the
 end of the week. Testing might also reveal additional changes that need to
 be made, which also might not happen by end-of-week.

 I guess the question before us is if we're comfortable putting something in
 branch-2.6 and then potentially adding API changes after. I'm okay with
 that as long as we're all aware that this might happen.

 Arun, as RM, is this cool with you? Again, I like this feature and I'm fine
 with its inclusion; just a heads up that we might need some extra time to
 finalize things before an RC can be cut.

 Thanks,
 Andrew


Re: Thinking ahead to hadoop-2.6

2014-09-24 Thread Jitendra Pandey
I also believe it's worth a week's wait to include HDFS-6584 and HDFS-6581
in 2.6.


Re: Thinking ahead to hadoop-2.6

2014-09-24 Thread Jing Zhao

Re: Thinking ahead to hadoop-2.6

2014-09-24 Thread Vinod Kumar Vavilapalli
Thanks for restarting this, Arun!

The following are the efforts that I am involved in directly or indirectly
and shepherding from YARN's point of view: rolling-upgrades (YARN-666),
work-preserving RM restart (YARN-556), minimal long-running services
support (YARN-896), including service-registry (YARN-913), timeline-service
stability/security (some sub-tasks of YARN-1935), node labels (YARN-796)
and some enhancements/bug fixes in cpu-scheduling/cgroups. We also have the
reservations sub-system (YARN-1051) and YARN shared cache (YARN-1492), which
I would like to get in and am helping review. Almost all these efforts
are on the edge of completion; some of them are just pending reviews.
Clearly some of them will be more stable than others; some of them will make
it eventually, some may not.

+1 for branching it within a week so that we can stabilize it and let other
features go into branch-2. I am pulling out all the stops to get the above
YARN efforts into reasonable shape for 2.6.

Regarding remaining items that I haven't paid much attention to: the NM
restart effort (YARN-1336), mostly done by Jason Lowe, is all in. YARN-291,
YARN-18 and YARN-1964 are not going to make the cut from what I see.

I edited the road-map to reflect this latest reality.

+Vinod


Re: Thinking ahead to hadoop-2.6

2014-09-24 Thread Vinod Kumar Vavilapalli


Re: In hindsight... Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Arun Murthy
Sorry, coming to discussion late.

We all agreed that 2.6 would be the *last* release supporting JDK6 and
hadoop-2.7 would drop support for JDK6. We could easily do 2.7 right after
2.6 (maybe with a few critical bug-fixes) with the defining feature of 2.7
being *JDK7 only*. I've checked with HBase, Pig and other communities and
they are good with this. I'm thinking I'll follow up 2-3 weeks after 2.6 goes
out to release 2.7, which drops JDK6.

We can certainly add support for JDK8 as early as 2.7 if there are
volunteers - clearly we won't make it depend on JDK8 features right away,
since it would still need to support JDK7.

To recap: hadoop-2.7 onwards would be minimum JDK7, with potential support
for JDK8. We can revisit our discussion from a few months ago to discuss
when we *drop* support for JDK7; clearly something I'd like to avoid doing
in haste.

thanks,
Arun

On Thu, Sep 18, 2014 at 8:41 AM, Alejandro Abdelnur tuc...@gmail.com
wrote:

 Am I missing something, or did we already agree that after the 2.5 release
 we would move trunk and branch-2 to Java 7?

 On Wed, Sep 17, 2014 at 3:33 PM, Travis Thompson 
 travis.r.thomp...@gmail.com wrote:

 There's actually an umbrella JIRA to track issues with JDK8
 (HADOOP-11090), in case anyone missed it.

 At LinkedIn we've been running our Hadoop 2.3 deployment on JDK8 for
 about a month now with some mixed results.  It definitely works but
 there are issues, mostly around virtual memory exploding.  The reason
 we took the jump early is that there is a company-wide push to move to
 JDK8 ASAP; I suspect this isn't unique to LinkedIn.  To get this
 to work with security enabled, we've had to apply patches not even in
 trunk yet because they break JDK6 compatibility.

 From my perspective, based on what I've seen and people I've talked
 to, there is a huge push to move to JDK8 ASAP so it's becoming
 increasingly urgent to at least get support to run on JDK8.

 On Wed, Sep 17, 2014 at 9:55 AM, Allen Wittenauer a...@altiscale.com
 wrote:
 
  On Sep 17, 2014, at 2:47 AM, Steve Loughran ste...@hortonworks.com
 wrote:
 
  I don't agree. Certainly the stuff I got into Hadoop 2.5 nailed down
 the
  filesystem binding with more tests than ever before.
 
  FWIW, based upon my survey of JIRA, there are a lot of unit
 test fixes that are only in trunk.
 
  But I am also aware of large organisations that are still on Java 6.
  Giving a clear roadmap (move to Java 7 now, Java 8 in XX months) can
  help them plan.
 
  Planning is a big thing.  That’s one of the reasons why it’d be
 prudent to start doing 3.0+JDK8 now as well.  Even if April slips, other
 projects and orgs are already moving to 8.  These people wonder what our
 plans are so that they can run one JVM.  Right now our answer is ¯\_(ツ)_/¯ .
 
  I’m sure I can dig up a user running Hadoop 0.13 because it ran
 on JDK5.  That doesn’t mean the open source project should stall because
 certain orgs don’t/can't upgrade.
 
 
 Drop the 2.6.0 release, branch trunk, and start rolling a
  3.0.0-alpha with JDK8 as the minimum.  2.5.1 becomes the base for all
  sustaining work.  This gives the rest of the community time to move
 to JDK8
  if they haven’t already.  For downstream vendors, it gives a roadmap
 for
  their customers who will be asking about JDK8 sooner rather than
 later.  By
  the time 3.0 stabilizes, we’re probably looking at April, which is
 perfect
  timing.
 
 
  That delays getting stuff out too much; if april slips it becomes a
 long
  time since an ASF release came out.
 
  I’m assuming you specifically mean a ‘stable’ release.  If, as
 everyone seems to be saying, 3.x isn’t that much different from
 2.x, doesn’t this mean that 3.x should be stable much quicker than 2.x
 took?  In other words, if 2.5 is stable and the biggest difference between
 it and trunk is the majority of code (450+ JIRAs as of yesterday afternoon)
 that just happens to also be in 2.6, doesn’t that mean 2.6 is also extremely
 unstable?  (Thus supporting my conjecture that 2.6 is going to be a
 problematic release?)
 
  Saying you must run on Java 8 for
  this will only scare people off and hold back adoption of 3.x,
 leaving 2.5
  as the last release that ends up being used for a while; the new 1.0.4
 
  From the outside, trunk looks a lot like 0.21 already.  From what
 I can tell, there is zero motivation to get it out the door and on a
 roadmap, primarily because there is little difference between trunk and
 branch-2.  This is a very dangerous place to be, as those few differences,
 some measured in years old, rot and wither. :(
 
  Here's an alternative
 
  -2.6 on Java 6, announce EOL for Java 6 support
  -2.7 on Java 7, state that the lifespan of J7 support will be some
  bounded time period (12-18 mo)
  -trunk to build and test on Java 8 in Jenkins alongside Java 7. For
  that to be useful, someone needs to volunteer to care about build
  failures. Are you volunteering, Allen?
 
   

Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Arun Murthy
Looks like most of the content is in and hadoop-2.6 is shaping up nicely.

I'll create branch-2.6 by end of the week and we can go from there to
stabilize it - hopefully in the next few weeks.

Thoughts?

thanks,
Arun

On Tue, Aug 12, 2014 at 1:34 PM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

  With hadoop-2.5 nearly done, it's time to start thinking ahead to
 hadoop-2.6.

  Currently, here is the Roadmap per the wiki:

 • HADOOP
 • Credential provider HADOOP-10607
 • HDFS
 • Heterogeneous storage (Phase 2) - Support APIs for using
 storage tiers by the applications HDFS-5682
 • Memory as storage tier HDFS-5851
 • YARN
 • Dynamic Resource Configuration YARN-291
 • NodeManager Restart YARN-1336
 • ResourceManager HA Phase 2 YARN-556
 • Support for admin-specified labels in YARN YARN-796
 • Support for automatic, shared cache for YARN application
 artifacts YARN-1492
 • Support NodeGroup layer topology on YARN YARN-18
 • Support for Docker containers in YARN YARN-1964
 • YARN service registry YARN-913

  My suspicion is, as is normal, some will make the cut and some won't.
 Please do add/subtract from the list as appropriate. Ideally, it would be
 good to ship hadoop-2.6 in 6-8 weeks (say, October) to keep up a cadence.

  More importantly, as we discussed previously, we'd like hadoop-2.6 to be
 the *last* Apache Hadoop 2.x release which supports JDK6. I'll start a
 discussion with other communities (HBase, Pig, Hive, Oozie etc.) and see
 how they feel about this.

 thanks,
 Arun




-- 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Andrew Wang
Hey Arun,

Maybe we could do a quick run through of the Roadmap wiki and add/retarget
things accordingly?

I think the KMS and transparent encryption are ready to go. We have a few
more bug fixes pending, but that's it.

Two HDFS things that I think probably won't make the end of the week are
archival storage (HDFS-6584) and single replica memory writes (HDFS-6581),
which I believe are under the HSM banner. HDFS-6584 was just merged to
trunk and I think it needs a little more work before it goes into branch-2.
HDFS-6581 hasn't even been merged to trunk yet, so it seems a bit further
off.

Just my 2c as I did not work directly on these features. I just generally
shy away from shipping bits quite this fresh.

Thanks,
Andrew




Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Suresh Srinivas
I would actually like to see both archival storage and single replica
memory writes in the 2.6 release. Archival storage is in the final stages
of getting ready for the branch-2 merge, as Nicholas has already indicated on
the dev mailing list. Hopefully HDFS-6581 will be ready soon. Both of these
features have been in development for some time.

 




-- 
http://hortonworks.com/download/



Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Chris Trezzo
I would like to see the shared cache (YARN-1492) make it in as well. We are
going through the final review process (with Karthik and Vinod) and should
be fairly close to complete. This is another feature that has been under
development for a while.

Thanks,
Chris




Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Andrew Wang
I think we should delay cutting the branch beyond the end of the week then,
seeing as the HDFS-6581 merge vote (assuming it goes through) doesn't close
until 12am on Tuesday next week. Archival storage is also still finishing
up user APIs and I had a few other q's posted on HDFS-7081, so again might
not make it by the end of the week.

I don't think it'd be too bad to retarget these to 2.7 though, since Arun
said he plans to roll 2.7 shortly after 2.6. I think a few more weeks would
really seal the deal.

Thanks,
Andrew


Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Karthik Kambatla
Looking at the patches, we might be able to get most of YARN-1492 in by the
end of next week. There would be a couple of security items still
remaining, but we can maybe call the feature alpha-ready without them.


Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Subramaniam V K
+1 for end of next week.

We have all the patches for YARN-1051 committed to the branch. We are
currently fixing test-patch and plan to call a merge vote soon. Hopefully it
will go through by the end of next week.

Thanks,
Subru


Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Sangjin Lee
I am also +1 for the end of next week. YARN-1492 (shared cache) is near the
final stages of the review, and a little more time would make it easier to
get it in.

On Tue, Sep 23, 2014 at 4:48 PM, Subramaniam V K subru...@gmail.com wrote:

 +1 for end of next week.

 We have got all the patches for YARN-1051 committed to the branch. We are
 currently fixing test-patch  plan to call a merge vote soon. Hopefully it
 will go through by end of next week.

 Thanks,
 Subru

 On Tue, Sep 23, 2014 at 4:31 PM, Karthik Kambatla ka...@cloudera.com
 wrote:

  Looking at the patches, we might be able to get most of YARN-1492 in by
 the
  end of next week. There would be a couple of security items still
  remaining, but we can may be call the feature alpha-ready without them.
 
  On Tue, Sep 23, 2014 at 4:25 PM, Chris Trezzo ctre...@gmail.com wrote:
 
   I would like to see the shared cache (YARN-1492) make it in as well. We
  are
   going through the final review process (with Karthik and Vinod) and
  should
   be fairly close to complete. This is another feature that has been
 under
   development for a while.
  
   Thanks,
   Chris
  
   On Tue, Sep 23, 2014 at 4:00 PM, Suresh Srinivas 
 sur...@hortonworks.com
  
   wrote:
  
I actually would like to see both archival storage and single replica
memory writes to be in 2.6 release. Archival storage is in the final
   stages
of getting ready for branch-2 merge as Nicholas has already indicated
  on
the dev mailing list. Hopefully HDFS-6581 gets ready sooner. Both of
   these
features are being in development for sometime.
   
On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang 
 andrew.w...@cloudera.com
  
wrote:
   
 Hey Arun,

 Maybe we could do a quick run through of the Roadmap wiki and
add/retarget
 things accordingly?

 I think the KMS and transparent encryption are ready to go. We've
  got a
 very few further bug fixes pending, but that's it.

 Two HDFS things that I think probably won't make the end of the
 week
   are
 archival storage (HDFS-6584) and single replica memory writes
(HDFS-6581),
 which I believe are under the HSM banner. HDFS-6484 was just merged
  to
 trunk and I think needs a little more work before it goes into
   branch-2.
 HDFS-6581 hasn't even been merged to trunk yet, so seems a bit
  further
off
 yet.

 Just my 2c as I did not work directly on these features. I just
   generally
 shy away from shipping bits quite this fresh.

 Thanks,
 Andrew

 On Tue, Sep 23, 2014 at 3:03 PM, Arun Murthy a...@hortonworks.com
wrote:

  Looks like most of the content is in and hadoop-2.6 is shaping up
nicely.
 
  I'll create branch-2.6 by end of the week and we can go from
 there
  to
  stabilize it - hopefully in the next few weeks.
 
  Thoughts?
 
  thanks,
  Arun
 
  On Tue, Aug 12, 2014 at 1:34 PM, Arun C Murthy 
  a...@hortonworks.com
  wrote:
 
   Folks,
  
With hadoop-2.5 nearly done, it's time to start thinking ahead
  to
   hadoop-2.6.
  
Currently, here is the Roadmap per the wiki:
  
   * HADOOP
   * Credential provider HADOOP-10607
   * HDFS
   * Heterogeneous storage (Phase 2) - Support
 APIs
   for
  using
   storage tiers by the applications HDFS-5682
   * Memory as storage tier HDFS-5851
   * YARN
   * Dynamic Resource Configuration YARN-291
   * NodeManager Restart YARN-1336
   * ResourceManager HA Phase 2 YARN-556
   * Support for admin-specified labels in YARN
   YARN-796
   * Support for automatic, shared cache for YARN
  application
   artifacts YARN-1492
   * Support NodeGroup layer topology on YARN
  YARN-18
   * Support for Docker containers in YARN
 YARN-1964
   * YARN service registry YARN-913
  
My suspicion is, as is normal, some will make the cut and some
won't.
   Please do add/subtract from the list as appropriate. Ideally,
 it
would
 be
   good to ship hadoop-2.6 in a 6-8 weeks (say, October) to keep
 up
  a
  cadence.
  
More importantly, as we discussed previously, we'd like
  hadoop-2.6
to
 be
   the *last* Apache Hadoop 2.x release which support JDK6. I'll
   start a
   discussion with other communities (HBase, Pig, Hive, Oozie
 etc.)
   and
 see
   how they feel about this.
  
   thanks,
   Arun
  
  
 
 
  --
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
  --
  CONFIDENTIALITY NOTICE
  NOTICE: This message is intended for the use of the individual or
entity

Re: Thinking ahead to hadoop-2.6

2014-09-23 Thread Tsz Wo (Nicholas), Sze
Hi,

I am worried about KMS and transparent encryption, since quite a few bugs
were discovered after it was merged to branch-2. This gives the impression
that the feature is not yet well tested. Indeed, transparent encryption is a
complicated feature which changes the core part of HDFS. It is not easy to
get everything right.


As for HDFS-6584 (Archival Storage), it is a relatively simple and low-risk
feature. It introduces a new storage type, ARCHIVE, and the concept of block
storage policy to HDFS. When a cluster is configured with ARCHIVE storage,
blocks are stored using the appropriate storage types specified by the
storage policies assigned to the files/directories. A cluster admin can
disable the feature simply by not configuring any storage type and not
setting any storage policy, as before. As Suresh mentioned, HDFS-6584 is in
the final stages of being merged to branch-2.
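
To make the storage policy idea above concrete, here is a toy Python model. It is only a sketch and not HDFS code: the policy table, the "nearest ancestor wins" lookup, the fallback to DISK, and all function names are assumptions made up for illustration.

```python
from typing import Dict, List, Set

# Hypothetical policy table (an assumption, not read from HDFS): the storage
# types wanted for the first replicas; the last entry is reused for the rest.
POLICIES: Dict[str, List[str]] = {
    "HOT": ["DISK"],
    "WARM": ["DISK", "ARCHIVE"],
    "COLD": ["ARCHIVE"],
}
DEFAULT_POLICY = "HOT"  # assumed default when nothing is assigned


def resolve_policy(path: str, assigned: Dict[str, str]) -> str:
    """Return the policy set on the path or on its nearest ancestor directory."""
    current = path.rstrip("/") or "/"
    while True:
        if current in assigned:
            return assigned[current]
        if current == "/":
            return DEFAULT_POLICY
        # Step up one directory; the parent of "/tmp" is "/".
        current = current.rsplit("/", 1)[0] or "/"


def storage_types_for(path: str, replication: int,
                      assigned: Dict[str, str],
                      configured: Set[str]) -> List[str]:
    """Choose one storage type per replica for a file at `path`."""
    wanted = POLICIES[resolve_policy(path, assigned)]
    chosen = []
    for i in range(replication):
        storage = wanted[min(i, len(wanted) - 1)]
        # Simplified fallback (assumption): use DISK if the wanted type
        # is not configured on the cluster.
        chosen.append(storage if storage in configured else "DISK")
    return chosen


if __name__ == "__main__":
    assigned = {"/archive": "COLD", "/warm": "WARM"}
    cluster = {"DISK", "ARCHIVE"}
    print(storage_types_for("/archive/2013/logs.gz", 3, assigned, cluster))
    print(storage_types_for("/warm/report.csv", 3, assigned, cluster))
    # No ARCHIVE storage and no policies: everything stays on DISK.
    print(storage_types_for("/archive/2013/logs.gz", 3, {}, {"DISK"}))
```

In this model, a cluster with no ARCHIVE storage configured and no policies assigned places every replica on DISK, which mirrors the point that the feature is effectively off unless an admin opts in.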

Regards,
Tsz-Wo



On Wednesday, September 24, 2014 7:00 AM, Suresh Srinivas 
sur...@hortonworks.com wrote:
 



I actually would like to see both archival storage and single replica
memory writes to be in 2.6 release. Archival storage is in the final stages
of getting ready for branch-2 merge as Nicholas has already indicated on
the dev mailing list. Hopefully HDFS-6581 gets ready sooner. Both of these
features are being in development for sometime.

On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang andrew.w...@cloudera.com
wrote:

 Hey Arun,

 Maybe we could do a quick run through of the Roadmap wiki and add/retarget
 things accordingly?

 I think the KMS and transparent encryption are ready to go. We've got a
 very few further bug fixes pending, but that's it.

 Two HDFS things that I think probably won't make the end of the week are
 archival storage (HDFS-6584) and single replica memory writes (HDFS-6581),
 which I believe are under the HSM banner. HDFS-6484 was just merged to
 trunk and I think needs a little more work before it goes into branch-2.
 HDFS-6581 hasn't even been merged to trunk yet, so seems a bit further off
 yet.

 Just my 2c as I did not work directly on these features. I just generally
 shy away from shipping bits quite this fresh.

 Thanks,
 Andrew

 On Tue, Sep 23, 2014 at 3:03 PM, Arun Murthy a...@hortonworks.com wrote:

  Looks like most of the content is in and hadoop-2.6 is shaping up nicely.
 
  I'll create branch-2.6 by end of the week and we can go from there to
  stabilize it - hopefully in the next few weeks.
 
  Thoughts?
 
  thanks,
  Arun
 
  On Tue, Aug 12, 2014 at 1:34 PM, Arun C Murthy a...@hortonworks.com
  wrote:
 
   Folks,
  
   With hadoop-2.5 nearly done, it's time to start thinking ahead to
   hadoop-2.6.
  
   Currently, here is the Roadmap per the wiki:
  
   • HADOOP
   • Credential provider HADOOP-10607
   • HDFS
   • Heterogeneous storage (Phase 2) - Support APIs for using
   storage tiers by the applications HDFS-5682
   • Memory as storage tier HDFS-5851
   • YARN
   • Dynamic Resource Configuration YARN-291
   • NodeManager Restart YARN-1336
   • ResourceManager HA Phase 2 YARN-556
   • Support for admin-specified labels in YARN YARN-796
   • Support for automatic, shared cache for YARN application
   artifacts YARN-1492
   • Support NodeGroup layer topology on YARN YARN-18
   • Support for Docker containers in YARN YARN-1964
   • YARN service registry YARN-913
  
   My suspicion is, as is normal, some will make the cut and some won't.
   Please do add/subtract from the list as appropriate. Ideally, it would be
   good to ship hadoop-2.6 in a 6-8 weeks (say, October) to keep up a
   cadence.
  
   More importantly, as we discussed previously, we'd like hadoop-2.6 to be
   the *last* Apache Hadoop 2.x release which support JDK6. I'll start a
   discussion with other communities (HBase, Pig, Hive, Oozie etc.) and see
   how they feel about this.
  
   thanks,
   Arun
  
  
 
 
  --
 
  --
  Arun C. Murthy
  Hortonworks Inc.
  http://hortonworks.com/
 
  --
  CONFIDENTIALITY NOTICE
  NOTICE: This message is intended for the use of the individual or entity to
  which it is addressed and may contain information that is confidential,
  privileged and exempt from disclosure under applicable law. If the reader
  of this message is not the intended recipient, you are hereby notified that
  any printing, copying, dissemination, distribution, disclosure or
  forwarding of this communication is strictly prohibited. If you have
  received this communication in error, please contact the sender immediately
  and delete it from your system. Thank You.
 





Re: In hindsight... Re: Thinking ahead to hadoop-2.6

2014-09-17 Thread Steve Loughran
On 15 September 2014 18:48, Allen Wittenauer a...@altiscale.com wrote:


 It’s now September.  With the passage of time, I have a lot of
 doubts about this plan and where that trajectory takes us.

 * The list of changes that are already in branch-2 scares the crap out of
 any risk-averse person (Hello to my fellow operations people!). Not only
 is the number of changes extremely high, but in addition there are a lot
 of major, blockbuster features in what is supposed to be a minor release.
 Combined with the fact that we’ve had to do some micro releases, it seems
 to hint that branch-2 is getting less stable over time.


I don't agree. Certainly the stuff I got into Hadoop 2.5 nailed down the
filesystem binding with more tests than ever before.

There are changes coming in 2.6, but I see most of them as extensions,
rather than big reworks of the internals



 *  One of the plans talked about was rolling a 2.7 release that drops JDK6
 and makes JDK7 the standard.  If 2.7 comes after 2.6 in October, date wise
 makes it somewhere around January 2015.  JDK7 EOL’s in April 2015.  So
 we’ll have a viable JDK7 release for exactly 3 months.  Frankly, it is too
 late for us to talk about JDK7 and we need to start thinking about JDK8.

I think my stance on JDK7 is well known. I'm personally in favour of
moving to it ASAP, allowing Hadoop to improve, and allowing downstream
projects to move up secure in the knowledge that hadoop 2.7 == Java 7+.

I also think the language improvements of Java 8 look appealing; we can do
stuff that you currently only gain by moving to groovy, and with the
improved concurrency stuff you can do it way faster, with code that is easier to
maintain. A fair few downstream projects use groovy as the test language,
Bigtop and Slider being the two I know of. We get the language improvements
plus better assertions, at a price. Java 8 would let us improve the Junit
tests in core hadoop as well as the production code.
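
As a rough illustration of the kind of test improvement Java 8 allows, here is a sketch of a lambda-based exception assertion. The helper and its names are hypothetical; this is not a JUnit or Hadoop API, just one way such a helper could look.

```java
// Hypothetical test helper; not part of JUnit or Hadoop.
class LambdaAssertSketch {

    // A block of code that may throw any exception.
    interface ThrowingBlock {
        void run() throws Exception;
    }

    // Runs the block and returns the exception if it is of the expected type.
    static <T extends Throwable> T expectThrows(Class<T> expected, ThrowingBlock block) {
        try {
            block.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("expected " + expected.getName() + " but nothing was thrown");
    }

    public static void main(String[] args) {
        // With Java 8 the failing call is a one-line lambda instead of try/fail/catch.
        NumberFormatException e =
            expectThrows(NumberFormatException.class, () -> Integer.parseInt("forty-two"));
        System.out.println(e.getMessage());
    }
}
```

On Java 6 or 7 the same check needs either an anonymous inner class or a try/fail/catch block in every test.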

But I am also aware of large organisations that are still on Java 6.
Giving a clear roadmap (move to Java 7 now, Java 8 in XX months) can help
them plan.



 * trunk is currently sitting at 3 years old.  There is a lot of stuff that
 has been hanging around that really needs to get into people hands so that
 we can start stabilizing it for a “real” release.


 To me this all says one thing:

 Drop the 2.6.0 release, branch trunk, and start rolling a
 3.0.0-alpha with JDK8 as the minimum.  2.5.1 becomes the base for all
 sustaining work.  This gives the rest of the community time to move to JDK8
 if they haven’t already.  For downstream vendors, it gives a roadmap for
 their customers who will be asking about JDK8 sooner rather than later.  By
 the time 3.0 stabilizes, we’re probably looking at April, which is perfect
 timing.


That delays getting stuff out too much; if April slips it becomes a long
time since an ASF release came out. Saying you must run on Java 8 for
this will only scare people off and hold back adoption of 3.x, leaving 2.5
as the last release that ends up being used for a while: the new 1.0.4.


Here's an alternative

-2.6 on java 6, announce EOL for Java 6 support
-2.7 on Java 7, state that the lifespan of j7 support will be some bounded
time period (12-18 mo)
-trunk to build and test on Java 8 in jenkins alongside java 7. For that to
be useful someone needs to volunteer to care about build failures. are you
volunteering, Allen?


-steve


-we switch trunk to Java 7 NOW. That doesn't mean a rewrite fest going
through all catch() statements making them multicatch, and the same for
string case.
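
For anyone who has not used the two Java 7 language features mentioned here (multi-catch and switching on strings), a minimal sketch follows. The class and method names are made up for illustration and are not Hadoop code.

```java
import java.io.IOException;

// Hypothetical illustration only; none of these names exist in Hadoop.
class Java7Idioms {

    // Java 7 switch on a String, replacing an if/else-if chain of equals() calls.
    static String describeScheme(String scheme) {
        switch (scheme) {
            case "hdfs":
                return "distributed";
            case "file":
                return "local";
            default:
                return "unknown";
        }
    }

    // Java 7 multi-catch: one handler for several unrelated exception types.
    static String parsePort(String text) {
        try {
            if (text == null) {
                throw new IOException("no port configured");
            }
            return "port " + Integer.parseInt(text);
        } catch (IOException | NumberFormatException e) {
            return "error: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeScheme("hdfs"));
        System.out.println(parsePort("8020"));
        System.out.println(parsePort("not-a-number"));
    }
}
```

Neither construct compiles with a Java 6 source level, which is why code using them cannot be backported unchanged to a Java 6 branch.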

What it does mean is that trunk can
-adopt the Java 7 APIs where needed for improvements, such as in the native
filesystem IO
-prepare for the java 8 migration by fixing things that don't work on java
8 and which are incompatible with a java 7 codebase.

Doing that in trunk does mean those features can't seamlessly migrate to
hadoop 2.6, so people who hope to do that had better stay in java-6
compatibility mode. Those bits of the code where we know java 6 is
crippling it can move up, announcing it clearly.







 One of the issues I’ve heard mention is that 3.0 doesn’t have
 anything “compelling” in it.  Well, dropping 2.6 makes the feature list the
 carrot, JDK8 support is obviously the stick.

 Thoughts?




 On Aug 15, 2014, at 6:07 PM, Subramaniam Krishnan su...@apache.org
 wrote:

  Thanks for initiating the thread Arun.
 
  Can we add YARN-1051 https://issues.apache.org/jira/browse/YARN-1051
 to
  the list? We have most of the patches for the sub-JIRAs under review and
  have committed a couple.
 
  -Subru
 
  -- Forwarded message --
 
  From: Arun C Murthy a...@hortonworks.com
 
  Date: Tue, Aug 12, 2014 at 1:34 PM
 
  Subject: Thinking ahead to hadoop-2.6
 
  To: common-dev@hadoop.apache.org common-dev@hadoop.apache.org, 
  hdfs-...@hadoop.apache.org hdfs-...@hadoop.apache.org, 
  mapreduce-...@hadoop.apache.org mapreduce

Re: In hindsight... Re: Thinking ahead to hadoop-2.6

2014-09-17 Thread Allen Wittenauer

On Sep 17, 2014, at 2:47 AM, Steve Loughran ste...@hortonworks.com wrote:
 
 I don't agree. Certainly the stuff I got into Hadoop 2.5 nailed down the
 filesystem binding with more tests than ever before.

FWIW, based upon my survey of JIRA, there are a lot of unit test fixes 
that are only in trunk. 

 But I am also aware of large organisations that are still on Java 6.
 Giving a clear roadmap move to Java 7 now, java 8 in XX months can help
 them plan.

Planning is a big thing.  That’s one of the reasons why it’d be prudent 
to start doing 3.0+JDK8 now as well.  Even if April slips, other projects and 
orgs are already moving to 8.  These people wonder what our plans are so that 
they can run one JVM.  Right now our answer is ¯\_(ツ)_/¯ . 

I’m sure I can dig up a user running Hadoop 0.13 because it ran on 
JDK5.  That doesn’t mean the open source project should stall because certain 
orgs don’t/can't upgrade. 

 
Drop the 2.6.0 release, branch trunk, and start rolling a
 3.0.0-alpha with JDK8 as the minimum.  2.5.1 becomes the base for all
 sustaining work.  This gives the rest of the community time to move to JDK8
 if they haven’t already.  For downstream vendors, it gives a roadmap for
 their customers who will be asking about JDK8 sooner rather than later.  By
 the time 3.0 stabilizes, we’re probably looking at April, which is perfect
 timing.
 
 
 That delays getting stuff out too much; if april slips it becomes a long
 time since an ASF release came out.

I’m assuming you specifically mean a ‘stable’ release.  If, as everyone 
seems to be saying, 3.x isn’t that much different from 2.x, doesn’t 
this mean that 3.x should become stable much more quickly than 2.x did?  In other 
words, if 2.5 is stable and the biggest difference between it and trunk is the 
majority of code (450+ JIRAs as of yesterday afternoon) that just also happens 
to be in 2.6, doesn’t it mean 2.6 is also extremely unstable?  (Thus supporting 
my conjecture that 2.6 is going to be a problematic release?)

 Saying you must run on Java 8 for
 this will only scare people off and hold back adoption of 3.x, leaving 2.5
 as the last release that ends up being used for a while; the new 1.0.4

From the outside, trunk looks a lot like 0.21 already.  From what I can 
tell, there is zero motivation to get it out the door and on a roadmap, 
primarily because there is little difference between trunk and branch-2.  This 
is a very dangerous place to be, as those few differences, some of them 
years old, rot and wither. :(

 Here's an alternative
 
 -2.6 on java 6, announce EOL for Java 6 support
 -2.7 on Java 7, state that the lifespan of j7 support will be some bounded
 time period (12-18 mo)
 -trunk to build and test on Java 8 in jenkins alongside java 7. For that to
 be useful someone needs to volunteer to care about build failures. are you
 volunteering, Allen?

This seems reasonable, except what release should folks who *require* 
java 8 use? Nightly trunk+patches builds? How do downstream projects test?  
Should JDK8 fixes be going into a branch?  (I’m making the assumption that 
fixes for JDK8 are not backward compatible with JDK7.  Hopefully they are, but 
given our usage of private APIs…)

I’ve been approached by a few people over the past month+ asking if I’d be 
interested in RM’ing 3.0.  I’m seriously considering it, esp. given that a) 
it’d be a nice learning experience for me, b) my “day job” makes it practical 
time-wise, and c) I seem to be the only one concerned enough about quite a bit of 
stale code to get it out the door.

FWIW, I’m in the process of moving my test VM to JDK8 to see how bad 
the damage truly is right now. Based on what others report, it seems security doesn’t work, 
which is a pretty big deal.  I can certainly start posting trunk builds on 
people.apache.org if folks are interested.

 -we switch trunk to Java 7 NOW. That doesn't mean a rewrite fest going
 through all catch() statements making them multicatch, and the same for
 string case.

Yup.  There’s little reason *not* to switch trunk to JDK7 now. 

Re: In hindsight... Re: Thinking ahead to hadoop-2.6

2014-09-17 Thread Travis Thompson
There's actually an umbrella JIRA to track issues with JDK8
(HADOOP-11090), in case anyone missed it.

At LinkedIn we've been running our Hadoop 2.3 deployment on JDK8 for
about a month now with mixed results.  It definitely works but
there are issues, mostly around virtual memory exploding.  The reason
we took the jump early is that there is a company-wide push to move to JDK8
ASAP; I suspect this isn't unique to LinkedIn.  To get this
to work with security enabled, we've had to apply patches that are not even in
trunk yet because they break JDK6 compatibility.

From my perspective, based on what I've seen and people I've talked
to, there is a huge push to move to JDK8 ASAP so it's becoming
increasingly urgent to at least get support to run on JDK8.

On Wed, Sep 17, 2014 at 9:55 AM, Allen Wittenauer a...@altiscale.com wrote:

 On Sep 17, 2014, at 2:47 AM, Steve Loughran ste...@hortonworks.com wrote:

 I don't agree. Certainly the stuff I got into Hadoop 2.5 nailed down the
 filesystem binding with more tests than ever before.

 FWIW, based upon my survey of JIRA, there are a lot of unit test 
 fixes that are only in trunk.

 But I am also aware of large organisations that are still on Java 6.
 Giving a clear roadmap move to Java 7 now, java 8 in XX months can help
 them plan.

 Planning is a big thing.  That’s one of the reasons why it’d be 
 prudent to start doing 3.0+JDK8 now as well.  Even if April slips, other 
 projects and orgs are already moving to 8.  These people wonder what our 
 plans are so that they can run one JVM.  Right now our answer is ¯\_(ツ)_/¯ .

 I’m sure I can dig up a user running Hadoop 0.13 because it ran on 
 JDK5.  That doesn’t mean the open source project should stall because certain 
 orgs don’t/can't upgrade.


Drop the 2.6.0 release, branch trunk, and start rolling a
 3.0.0-alpha with JDK8 as the minimum.  2.5.1 becomes the base for all
 sustaining work.  This gives the rest of the community time to move to JDK8
 if they haven’t already.  For downstream vendors, it gives a roadmap for
 their customers who will be asking about JDK8 sooner rather than later.  By
 the time 3.0 stabilizes, we’re probably looking at April, which is perfect
 timing.


 That delays getting stuff out too much; if april slips it becomes a long
 time since an ASF release came out.

 I’m assuming you specifically mean a ‘stable’ release.  If, as 
 everyone seems to be saying, that 3.x doesn’t have that much different than 
 2.x, doesn’t this mean that 3.x should be stable much quicker than 2.x took?  
 In other words, if 2.5 is stable and the biggest differences between it and 
 trunk is the majority of code (450+ JIRAs as of yesterday afternoon) that 
 just also happens to be in 2.6, doesn’t it mean 2.6 is also extremely 
 unstable?  (Thus supporting my conjecture that 2.6 is going to be a 
 problematic release?)

 Saying you must run on Java 8 for
 this will only scare people off and hold back adoption of 3.x, leaving 2.5
 as the last release that ends up being used for a while; the new 1.0.4

 From the outside, trunk looks a lot of 0.21 already.  From what I can 
 tell, there is zero motivation to get it out the door and on a roadmap. 
 Primarily because there is little different between trunk and branch-2.  This 
 is a very dangerous place to be as those few differences, some measured in 
 years old, rot and wither. :(

 Here's an alternative

 -2.6 on java 6, announce EOL for Java 6 support
 -2.7 on Java 7, state that the lifespan of j7 support will be some bounded
 time period (12-18 mo)
 -trunk to build and test on Java 8 in jenkins alongside java 7. For that to
 be useful someone needs to volunteer to care about build failures. are you
 volunteering, Allen?

 This seems reasonable, except what release should folks who *require* 
 java 8 use? Nightly trunk+patches builds? How do downstream projects test?  
 Should JDK8 fixes be going into a branch?  (I’m making the assumption that 
 fixes for JDK8 are not backward compatible with JDK7.  Hopefully they are, 
 but given our usage of private APIs…)

 I’ve been approached by a few people over the past month+ if I’d be 
 interested in or will be RM’ing 3.0.  I’m seriously considering it esp given 
 a) it’d be a nice learning experience for me  b) my “day job” makes it 
 practical time-wise c) I seem to be the only one concerned enough about quite 
 a bit of stale code  to get it out the door.

 FWIW, I’m in the process of moving my test vm to JDK8 to see how bad 
 the damage truly is right now. Based on others, it seems security doesn’t 
 work, which is a pretty big deal.  I can certainly start posting trunk builds 
 on people.apache.org if folks are interested.

 -we switch trunk to Java 7 NOW. That doesn't mean a rewrite fest going
 through all catch() statements making them multicatch, and the same for
 string case.

 Yup.  There’s little reason *not* to switch 

Re: In hindsight... Re: Thinking ahead to hadoop-2.6

2014-09-15 Thread Colin McCabe
On Mon, Sep 15, 2014 at 10:48 AM, Allen Wittenauer a...@altiscale.com wrote:

 It’s now September.  With the passage of time, I have a lot of doubts 
 about this plan and where that trajectory takes us.

 * The list of changes that are already in branch-2 scares the crap out of any 
 risk-averse person (Hello to my fellow operations people!). Not only is the 
 number of changes extremely high, but in addition there are a lot of major, 
 blockbuster features in what is supposed to be a minor release.  Combined 
 with the fact that we’ve had to do some micro releases, it seems to hint that 
 branch-2 is getting less stable over time.

I don't see what is so scary about 2.6, can you be more concrete?  It
seems like a pretty normal release to me and most of the new features
are optional.

I also don't see why you think that branch-2 is getting less stable
over time.  Actually, I think that branch-2 has gotten more stable
over time as people have finally gotten around to upgrading from 1.x
or earlier, and contributed their efforts to addressing regressions in
branch-2.

 *  One of the plans talked about was rolling a 2.7 release that drops JDK6 
 and makes JDK7 the standard.  If 2.7 comes after 2.6 in October, date wise  
 makes it somewhere around January 2015.  JDK7 EOL’s in April 2015.  So we’ll 
 have a viable JDK7 release for exactly 3 months.  Frankly, it is too late for 
 us to talk about JDK7 and we need to start thinking about JDK8.

 * trunk is currently sitting at 3 years old.  There is a lot of stuff that 
 has been hanging around that really needs to get into people hands so that we 
 can start stabilizing it for a “real” release.

We have been pretty careful about minimizing trunk's divergence from
branch-2.  I can't think of an example of anything in trunk that
really needs to get into people's hands-- did I forget something?



 To me this all says one thing:

 Drop the 2.6.0 release, branch trunk, and start rolling a 3.0.0-alpha 
 with JDK8 as the minimum.  2.5.1 becomes the base for all sustaining work.  
 This gives the rest of the community time to move to JDK8 if they haven’t 
 already.  For downstream vendors, it gives a roadmap for their customers who 
 will be asking about JDK8 sooner rather than later.  By the time 3.0 
 stabilizes, we’re probably looking at April, which is perfect timing.

 One of the issues I’ve heard mention is that 3.0 doesn’t have 
 anything “compelling” in it.  Well, dropping 2.6 makes the feature list the 
 carrot, JDK8 support is obviously the stick.

 Thoughts?

As we've discussed before, supporting JDK8 is very different from
forcing people to use JDK8.  branch-2 and Hadoop 2.6 most certainly
should support JDK8, and most certainly NOT force people to use JDK8.
Cloudera has been using JDK7 internally for a long time, and
recommending it to customers too.  Some developers are using JDK8 as
well.  It works fine (although I'm sure there will be bugs and
workarounds that get reported and fixed as more people migrate).  I
don't see this particular issue as a reason to change the schedule.

best,
Colin






 On Aug 15, 2014, at 6:07 PM, Subramaniam Krishnan su...@apache.org wrote:

 Thanks for initiating the thread Arun.

 Can we add YARN-1051 https://issues.apache.org/jira/browse/YARN-1051 to
 the list? We have most of the patches for the sub-JIRAs under review and
 have committed a couple.

 -Subru

 -- Forwarded message --

 From: Arun C Murthy a...@hortonworks.com

 Date: Tue, Aug 12, 2014 at 1:34 PM

 Subject: Thinking ahead to hadoop-2.6

 To: common-dev@hadoop.apache.org common-dev@hadoop.apache.org, 
 hdfs-...@hadoop.apache.org hdfs-...@hadoop.apache.org, 
 mapreduce-...@hadoop.apache.org mapreduce-...@hadoop.apache.org,

 yarn-...@hadoop.apache.org yarn-...@hadoop.apache.org





 Folks,



 With hadoop-2.5 nearly done, it's time to start thinking ahead to
 hadoop-2.6.



 Currently, here is the Roadmap per the wiki:



• HADOOP

• Credential provider HADOOP-10607

• HDFS

• Heterogeneous storage (Phase 2) - Support APIs for using
 storage tiers by the applications HDFS-5682

• Memory as storage tier HDFS-5851

• YARN

• Dynamic Resource Configuration YARN-291

• NodeManager Restart YARN-1336

• ResourceManager HA Phase 2 YARN-556

• Support for admin-specified labels in YARN YARN-796

• Support for automatic, shared cache for YARN application
 artifacts YARN-1492

• Support NodeGroup layer topology on YARN YARN-18

• Support for Docker containers in YARN YARN-1964

• YARN service registry YARN-913



 My suspicion is, as is normal, some will make the cut and some won't.

 Please do add/subtract from the list as appropriate. Ideally, it would be
 good to ship hadoop-2.6 in a 6-8 weeks (say, October) to keep up

Re: Thinking ahead to hadoop-2.6

2014-08-15 Thread Subramaniam Krishnan
Thanks for initiating the thread Arun.

Can we add YARN-1051 https://issues.apache.org/jira/browse/YARN-1051 to
the list? We have most of the patches for the sub-JIRAs under review and
have committed a couple.

-Subru

-- Forwarded message --

From: Arun C Murthy a...@hortonworks.com

Date: Tue, Aug 12, 2014 at 1:34 PM

Subject: Thinking ahead to hadoop-2.6

To: common-dev@hadoop.apache.org common-dev@hadoop.apache.org, 
hdfs-...@hadoop.apache.org hdfs-...@hadoop.apache.org, 
mapreduce-...@hadoop.apache.org mapreduce-...@hadoop.apache.org,

yarn-...@hadoop.apache.org yarn-...@hadoop.apache.org





Folks,



 With hadoop-2.5 nearly done, it's time to start thinking ahead to
hadoop-2.6.



 Currently, here is the Roadmap per the wiki:



• HADOOP

• Credential provider HADOOP-10607

• HDFS

• Heterogeneous storage (Phase 2) - Support APIs for using
storage tiers by the applications HDFS-5682

• Memory as storage tier HDFS-5851

• YARN

• Dynamic Resource Configuration YARN-291

• NodeManager Restart YARN-1336

• ResourceManager HA Phase 2 YARN-556

• Support for admin-specified labels in YARN YARN-796

• Support for automatic, shared cache for YARN application
artifacts YARN-1492

• Support NodeGroup layer topology on YARN YARN-18

• Support for Docker containers in YARN YARN-1964

• YARN service registry YARN-913



 My suspicion is, as is normal, some will make the cut and some won't.

Please do add/subtract from the list as appropriate. Ideally, it would be
good to ship hadoop-2.6 in a 6-8 weeks (say, October) to keep up a cadence.



 More importantly, as we discussed previously, we'd like hadoop-2.6 to be
the *last* Apache Hadoop 2.x release which support JDK6. I'll start a
discussion with other communities (HBase, Pig, Hive, Oozie etc.) and see
how they feel about this.



thanks,

Arun







Thinking ahead to hadoop-2.6

2014-08-12 Thread Arun C Murthy
Folks,

 With hadoop-2.5 nearly done, it's time to start thinking ahead to hadoop-2.6.

 Currently, here is the Roadmap per the wiki:
 
• HADOOP
• Credential provider HADOOP-10607
• HDFS
• Heterogeneous storage (Phase 2) - Support APIs for using 
storage tiers by the applications HDFS-5682
• Memory as storage tier HDFS-5851
• YARN
• Dynamic Resource Configuration YARN-291
• NodeManager Restart YARN-1336
• ResourceManager HA Phase 2 YARN-556
• Support for admin-specified labels in YARN YARN-796
• Support for automatic, shared cache for YARN application 
artifacts YARN-1492
• Support NodeGroup layer topology on YARN YARN-18
• Support for Docker containers in YARN YARN-1964
• YARN service registry YARN-913

 My suspicion is, as is normal, some will make the cut and some won't. Please 
do add/subtract from the list as appropriate. Ideally, it would be good to ship 
hadoop-2.6 in 6-8 weeks (say, October) to keep up a cadence.

 More importantly, as we discussed previously, we'd like hadoop-2.6 to be the 
*last* Apache Hadoop 2.x release which supports JDK6. I'll start a discussion 
with other communities (HBase, Pig, Hive, Oozie etc.) and see how they feel 
about this.

thanks,
Arun

