Re: [VOTE] Apache Hadoop Ozone 1.0.0 RC1

2020-09-01 Thread Jitendra Pandey
+1 (binding)

1. Verified signatures
2. Built from source
3. Deployed with Docker
4. Tested basic S3 APIs.
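
For anyone repeating the checks, roughly what I ran (the artifact names, dist
path and the S3 gateway port 9878 are from my environment and should be treated
as assumptions):

    # signature / checksum checks on the downloaded RC artifacts
    gpg --import KEYS
    gpg --verify hadoop-ozone-1.0.0-src.tar.gz.asc hadoop-ozone-1.0.0-src.tar.gz
    sha512sum -c hadoop-ozone-1.0.0-src.tar.gz.sha512

    # build from source, then bring up the docker-compose cluster shipped in the dist
    mvn clean install -DskipTests
    cd hadoop-ozone/dist/target/ozone-1.0.0/compose/ozone
    docker-compose up -d

    # basic S3 checks through the S3 gateway
    aws s3api create-bucket --bucket rc1-test --endpoint-url http://localhost:9878
    aws s3 cp README.md s3://rc1-test/ --endpoint-url http://localhost:9878
    aws s3 ls s3://rc1-test/ --endpoint-url http://localhost:9878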

On Tue, Aug 25, 2020 at 7:01 AM Sammi Chen  wrote:

> RC1 artifacts are at:
> https://home.apache.org/~sammichen/ozone-1.0.0-rc1/
> 
>
> Maven artifacts are staged at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1278
> 
>
> The public key used for signing the artifacts can be found at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> The RC1 tag in github is at:
> https://github.com/apache/hadoop-ozone/releases/tag/ozone-1.0.0-RC1
> 
>
> Changes in RC1, added on top of RC0:
> 1. HDDS-4063. Fix InstallSnapshot in OM HA
> 2. HDDS-4139. Update version number in upgrade tests.
> 3. HDDS-4144. Update version info in hadoop client dependency readme
>
> *The vote will run for 7 days, ending on Aug 31st 2020 at 11:59 pm PST.*
>
> Thanks,
> Sammi Chen
>


Re: [DISCUSS] Feature branch for HDFS-14978 In-place Erasure Coding Conversion

2020-01-23 Thread Jitendra Pandey
+1 for the feature branch.

On Thu, Jan 23, 2020 at 1:34 PM Wei-Chiu Chuang
 wrote:

> Hi we are working on a feature to improve Erasure Coding, and I would like
> to seek your opinion on creating a feature branch for it. (HDFS-14978
> )
>
> Reason for a feature branch
> (1) it turns out we need to update NameNode layout version
> (2) It's a medium size project and we want to get this feature merged in
> its entirety.
>
> Aravindan Vijayan and I are planning to work on this feature.
>
> Thoughts?
>


Re: [VOTE] create ozone-dev and ozone-issues mailing lists

2019-10-28 Thread Jitendra Pandey
The immediate problem we need to fix is to prevent GitHub updates from
spamming the dev mailing list.
It might make sense to just have a separate issues@ mailing list and point
GitHub to that?

On Sun, Oct 27, 2019 at 10:12 PM Dinesh Chitlangia
 wrote:

> +1
>
> -Dinesh
>
>
>
>
> On Sun, Oct 27, 2019, 4:25 AM Elek, Marton wrote:
> >
> > As discussed earlier in the thread "Hadoop-Ozone repository mailing
> > list configurations" [1], I suggested solving the current
> > misconfiguration problem by creating separate mailing lists
> > (dev/issues) for Hadoop Ozone.
> >
> > It would have an additional benefit: for example, it would make it easier
> > to follow the Ozone development and future plans.
> >
> > Here I am starting a new vote thread (open for at least 72 hours) to
> > collect more feedback about this.
> >
> > Please express your opinion / vote.
> >
> > Thanks a lot,
> > Marton
> >
> > [1]
> >
> >
> https://lists.apache.org/thread.html/dc66a30f48a744534e748c418bf7ab6275896166ca5ade11560ebaef@%3Chdfs-dev.hadoop.apache.org%3E
> >
> >
> >
>


Re: [DISCUSS] Remove Ozone and Submarine from Hadoop repo

2019-10-24 Thread Jitendra Pandey
+1

On Thu, Oct 24, 2019 at 6:42 PM Ayush Saxena  wrote:

> Thanx Akira for putting this up.
> +1, makes sense to remove them.
>
> -Ayush
>
> > On 25-Oct-2019, at 6:55 AM, Dinesh Chitlangia 
> > 
> wrote:
> >
> > +1 and Anu's approach of creating a tag makes sense.
> >
> > Dinesh
> >
> >
> >
> >
> >> On Thu, Oct 24, 2019 at 9:24 PM Sunil Govindan 
> wrote:
> >>
> >> +1 on this to remove staleness.
> >>
> >> - Sunil
> >>
> >> On Thu, Oct 24, 2019 at 12:51 PM Akira Ajisaka 
> >> wrote:
> >>
> >>> Hi folks,
> >>>
> >>> Both Ozone and Apache Submarine have separate repositories.
> >>> Can we remove these modules from hadoop-trunk?
> >>>
> >>> Regards,
> >>> Akira
> >>>
> >>
>
>
>


Re: [VOTE] Release Apache Hadoop Ozone 0.3.0-alpha (RC1)

2018-11-19 Thread Jitendra Pandey
+1 (binding)

Built from source.
Ran smoke tests.

On 11/19/18, 9:02 AM, "Anu Engineer"  wrote:

+1. (Binding)

Thanks for getting this release done. Verified the signatures and S3 
Gateway.

--Anu


On 11/16/18, 5:15 AM, "Shashikant Banerjee"  
wrote:

+1 (non-binding).

  - Verified signatures
  - Verified checksums
  - Checked LICENSE/NOTICE files
  - Built from source
  - Ran smoke tests.

Thanks Marton for putting the release together.

Thanks
Shashi

On 11/14/18, 10:44 PM, "Elek, Marton"  wrote:

Hi all,

I've created the second release candidate (RC1) for Apache Hadoop Ozone
0.3.0-alpha including one more fix on top of the previous RC0 (HDDS-854)

This is the second release of Apache Hadoop Ozone. Notable changes since
the first release:

* A new S3-compatible REST server has been added. Ozone can be used from any
S3-compatible tool (HDDS-434)
* The Ozone Hadoop file system URL prefix is renamed from o3:// to o3fs://
(HDDS-651)
* Extensive testing and stability improvements of OzoneFs.
* Spark, YARN and Hive support and stability improvements.
* Improved Pipeline handling and recovery.
* Separated/dedicated classpath definitions for all the Ozone
components. (HDDS-447)

The RC artifacts are available from:
https://home.apache.org/~elek/ozone-0.3.0-alpha-rc1/

The RC tag in git is: ozone-0.3.0-alpha-RC1 (ebbf459e6a6)

Please try it out, vote, or just give us feedback.

The vote will run for 5 days, ending on November 19, 2018 18:00 UTC.


Thank you very much,
Marton


PS:

The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs from ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d
4. open localhost:9874 or localhost:9876



The easiest way to try it out from the source:

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha && docker-compose up -d



The easiest way to test basic functionality (with acceptance tests):

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha/smoketest
3. ./test.sh





Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-06 Thread Jitendra Pandey
Hi Andrew, 
  
 I think we can eliminate the maintenance costs even in the same repo. We can
make the following changes, which incorporate suggestions from Daryn and Owen as well.
1. The hadoop-hdsl project will be at the root of the Hadoop repo, in a separate
directory.
2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
3. Based on Daryn's suggestion, HDSL can optionally (via config) be loaded in
the DN as a pluggable module.
 If not loaded, there will be absolutely no code path through hdsl or ozone.
4. To further make it easier for folks building Hadoop, we can support a Maven
profile for hdsl/ozone. If the profile is deactivated, hdsl/ozone will not be
built.
 For example, Cloudera can choose to skip even compiling/building
hdsl/ozone, and therefore incur no maintenance overhead whatsoever.
 HADOOP-14453 has a patch that shows how it can be done.
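
As a rough sketch of what (4) means day to day for someone building Hadoop
(the hdds profile name is borrowed from the current build and is only a
placeholder until HADOOP-14453 lands):

    # default build: hdsl/ozone modules are not compiled at all
    mvn clean install -DskipTests

    # opt in explicitly when the hdsl/ozone modules are wanted
    mvn clean install -DskipTests -Phdds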

Arguably, there are two kinds of maintenance costs: the cost for developers and
the cost for users.
- Developers: A Maven profile as noted in points (3) and (4) above completely
addresses the concern for developers,
 as there are no compile-time dependencies and,
further, they can choose not to build ozone/hdsl.
- Users: The cost to users will be completely alleviated if ozone/hdsl is not loaded,
as mentioned in point (3) above.

jitendra

From: Andrew Wang <andrew.w...@cloudera.com>
Date: Monday, March 5, 2018 at 3:54 PM
To: Wangda Tan <wheele...@gmail.com>
Cc: Owen O'Malley <owen.omal...@gmail.com>, Daryn Sharp 
<da...@oath.com.invalid>, Jitendra Pandey <jiten...@hortonworks.com>, hdfs-dev 
<hdfs-...@hadoop.apache.org>, "common-...@hadoop.apache.org" 
<common-...@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" 
<yarn-dev@hadoop.apache.org>, "mapreduce-...@hadoop.apache.org" 
<mapreduce-...@hadoop.apache.org>
Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk

Hi Owen, Wangda, 

Thanks for clearly laying out the subproject options, that helps the discussion.

I'm all onboard with the idea of regular releases, and it's something I tried 
to do with the 3.0 alphas and betas. The problem though isn't a lack of 
commitment from feature developers like Sanjay or Jitendra; far from it! I 
think every feature developer makes a reasonable effort to test their code 
before it's merged. Yet, my experience as an RM is that more code comes with 
more risk. I don't believe that Ozone is special or different in this regard. 
It comes with a maintenance cost, not a maintenance benefit.


I'm advocating for #3: separate source, separate release. Since HDSL stability 
and FSN/BM refactoring are still a ways out, I don't want to incur a 
maintenance cost now. I sympathize with the sentiment that working cross-repo 
is harder than within same repo, but the right tooling can make this a lot 
easier (e.g. git submodule, Google's repo tool). We have experience doing this 
internally here at Cloudera, and I'm happy to share knowledge and possibly code.

Best,
Andrew

On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wheele...@gmail.com> wrote:
I like the idea of same source / same release and putting Ozone's source under a
different directory.

Like Owen mentioned, it is going to be important for all parties to keep a regular
and shorter release cycle for Hadoop, e.g. 3-4 months between minor releases.
Users can try features and give feedback to stabilize features earlier;
developers can be happier since their efforts reach users soon after
features get merged. In addition to this, if features are merged to trunk after
reasonable testing/review, Andrew's concern may not be a problem anymore:

bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements.

Thanks,
Wangda


On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <owen.omal...@gmail.com> wrote:
On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <andrew.w...@cloudera.com>
wrote:

Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.


Ok, there are multiple levels of sub-projects that all make sense:

   - Same source tree, same releases - examples like HDFS & YARN
   - Same master branch, separate releases and release branches - Hive's
   Storage API vs Hive. It is in the source tree for the master branch, but
   has distinct releases and release branches.
   - Separate source, separate release - Apache Commons.

There are advantages and disadvantages to each. I'd propose that we use the
same source, same release pattern for Ozone. Note that we tried and later
reverted doing Common, HDFS, and YARN

[VOTE] Merging branch HDFS-7240 to trunk

2018-02-26 Thread Jitendra Pandey
Dear folks,
   We would like to start a vote to merge the HDFS-7240 branch into trunk.
The context can be reviewed in the DISCUSSION thread and in the jiras (see
references below).
  
HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which is a 
distributed, replicated block layer.
The old HDFS namespace and NN can be connected to this new block layer as 
we have described in HDFS-10419.
We also introduce a key-value namespace called Ozone built on HDSL.
  
The code is in a separate module and is turned off by default. In a secure 
setup, HDSL and Ozone daemons cannot be started.

The detailed documentation is available at 
 
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications


I will start with my vote.
+1 (binding)


Discussion Thread:
  https://s.apache.org/7240-merge
  https://s.apache.org/4sfU

Jiras:
   https://issues.apache.org/jira/browse/HDFS-7240
   https://issues.apache.org/jira/browse/HDFS-10419
   https://issues.apache.org/jira/browse/HDFS-13074
   https://issues.apache.org/jira/browse/HDFS-13180

   
Thanks
jitendra





DISCUSSION THREAD SUMMARY :

On 2/13/18, 6:28 PM, "sanjay Radia"  wrote:

Sorry, the formatting got messed up by my email client. Here it is
again


Dear Hadoop Community Members,

   We had multiple community discussions, a few meetings in 
smaller groups and also jira discussions with respect to this thread. We 
express our gratitude for participation and valuable comments. 

The key questions raised were the following:
1) How do the new block storage layer and OzoneFS benefit HDFS? We were
asked to chalk out a roadmap towards the goal of a scalable namenode
working with the new storage layer.
2) We were asked to provide a security design.
3) There were questions around stability, given that Ozone brings in a
large body of code.
4) Why can't they be separate projects forever, or be merged in
when production ready?

We have responded to all the above questions with detailed
explanations and answers on the jira as well as in the discussions. We believe
that should sufficiently address the community's concerns.

Please see the summary below:

1) The new code base benefits HDFS scaling and a roadmap has 
been provided. 

Summary:
  - The new block storage layer addresses the scalability of the
block layer. We have shown how the existing NN can be connected to the new block
layer and its benefits. We have shown 2 milestones; the 1st milestone is much
simpler than the 2nd milestone while giving almost the same scaling benefits.
Originally we had proposed simply milestone 2, and the community felt that
removing the FSN/BM lock was a fair amount of work and a simpler solution
would be useful.
  - We provide a new K-V namespace called Ozone FS with
FileSystem/FileContext plugins to allow users to use the new system. BTW,
Hive and Spark work very well on K-V namespaces in the cloud. This will
facilitate stabilizing the new block layer.
  - The new block layer has a new Netty-based protocol engine
in the Datanode which, when stabilized, can be used by the old HDFS block
layer. See details below on sharing of code.


2) Stability impact on the existing HDFS code base and code
separation. The new block layer and the OzoneFS are in modules that are
separate from the old HDFS code - currently there are no calls from HDFS into Ozone
except for the DN starting the new block layer module if configured to do so. It
does not add instability (the instability argument has been raised many times).
Over time, as we share code, we will ensure that the old HDFS continues to
remain stable. (For example, we plan to stabilize the new Netty-based protocol
engine in the new block layer before sharing it with HDFS's old block layer.)


3) In the short term and medium term, the new system and HDFS
will be used side-by-side by users: side-by-side in the short term for
testing, and side-by-side in the medium term for actual production use until the
new system has feature parity with old HDFS. During this time, sharing the DN
daemon and admin functions between the two systems is operationally important:

Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-30 Thread Jitendra Pandey
Hi Konstantin,
 Thank you for taking out time to review ozone. I appreciate your comments and 
questions.

> There are two main limitations in HDFS
> a) The throughput of Namespace operations. Which is limited by the 
>number of RPCs the NameNode can handle
> b) The number of objects (files + blocks) the system can maintain. 
>Which is limited by the memory size of the NameNode.

   I agree completely. We believe ozone attempts to address both these issues 
for HDFS.
   
   Let us look at the Number of objects problem. Ozone directly addresses the 
scalability of number of blocks by introducing storage containers that can hold 
multiple blocks together. The earlier efforts on this were complicated by the 
fact that block manager and namespace are intertwined in HDFS Namenode. There 
have been efforts in past to separate block manager from namespace for e.g. 
HDFS-5477. Ozone addresses this problem by cleanly separating the block layer.  
Separation of block layer also addresses the file/directories scalability 
because it frees up the blockmap from the namenode.
   
   A separate block layer relieves the namenode from handling block reports, IBRs,
heartbeats, the replication monitor, etc., and thus reduces contention on the
FSNamesystem lock and significantly reduces GC pressure on the namenode.
These improvements will greatly help the RPC performance of the Namenode.

> Ozone is probably just the first step in rebuilding HDFS under a new
> architecture. With the next steps presumably being HDFS-10419 and 
> HDFS-8. The design doc for the new architecture has never been 
> published.

  We do believe that the Namenode can leverage Ozone's storage container layer;
however, that is also a big effort. We would like to first have the block layer
stabilized in Ozone before taking that up. However, we would certainly support
any community effort on that, and in fact it was brought up in the last BoF session
at the summit.

  Big data is evolving rapidly. We see our customers needing scalable file
systems, object stores (like S3) and block stores (for Docker and VMs). Ozone
improves HDFS in two ways: it addresses the throughput and scale issues of HDFS,
and enriches it with newer capabilities.


> Ozone is a big enough system to deserve its own project.

I took a quick look at the core code in ozone and the cloc command reports 
22,511 lines of functionality changes in Java.

This patch also brings in web framework code like Angular.js, and with it a
bunch of CSS and JS files that contribute to the size of the patch; the
rest are test and documentation changes.

I hope this addresses your concerns.

Best regards,
jitendra

On 10/28/17, 2:00 PM, "Konstantin Shvachko"  wrote:

Hey guys,

It is an interesting question whether Ozone should be a part of Hadoop.
There are two main reasons why I think it should not.

1. With close to 500 sub-tasks, with 6 MB of code changes, and with a
sizable community behind, it looks to me like a whole new project.
It is essentially a new storage system, with different (than HDFS)
architecture, separate S3-like APIs. This is really great - the World sure
needs more distributed file systems. But it is not clear why Ozone should
co-exist with HDFS under the same roof.

2. Ozone is probably just the first step in rebuilding HDFS under a new
architecture. With the next steps presumably being HDFS-10419 and
HDFS-8.
The design doc for the new architecture has never been published. I can
only assume based on some presentations and personal communications that
the idea is to use Ozone as a block storage, and re-implement the NameNode, so
that it stores only a partial namespace in memory, while the bulk of it
(cold data) is persisted to local storage.
Such architecture makes me wonder if it solves Hadoop's main problems.
There are two main limitations in HDFS:
  a. The throughput of Namespace operations. Which is limited by the number
of RPCs the NameNode can handle
  b. The number of objects (files + blocks) the system can maintain. Which
is limited by the memory size of the NameNode.
The RPC performance (a) is more important for Hadoop scalability than the
object count (b). The read RPCs being the main priority.
The new architecture targets the object count problem, but at the expense
of RPC throughput, which seems to be the wrong resolution of the tradeoff.
Also based on the use patterns on our large clusters we read up to 90% of
the data we write, so cold data is a small fraction and most of it must be
cached.

To summarize:
- Ozone is a big enough system to deserve its own project.
- The architecture that Ozone leads to does not seem to solve the intrinsic
problems of current HDFS.

I will post my opinion in the Ozone jira. Should be more convenient to
discuss it there for further 

Re: 答复: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-23 Thread Jitendra Pandey
I have filed https://issues.apache.org/jira/browse/HDFS-12697 to ensure Ozone
stays disabled in a secure environment.
Since Ozone is disabled by default and does not yet come with security, it will
not expose any new attack surface in a Hadoop deployment.
The Ozone security effort will need a detailed design and discussion on a community
jira. Hopefully, that effort will start soon after the merge.
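
For reference, a minimal sketch of what "disabled by default" means operationally;
the ozone.enabled key and the config path below are assumptions based on the
current branch and may still change under HDFS-12697:

    # quick check on a node: is the HDSL/Ozone plugin switched on?
    # an empty result means the key is unset, so it falls back to the default
    # (false) and the DN never loads any HDSL/Ozone code path
    grep -A1 '<name>ozone.enabled</name>' etc/hadoop/ozone-site.xml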

Thanks
jitendra

On 10/20/17, 2:40 PM, "larry mccay"  wrote:

All -

I broke this list of questions out into a separate DISCUSS thread where we
can iterate over how a security audit process at merge time might look and
whether it is even something that we want to take on.

I will try and continue discussion on that thread and drive that to some
conclusion before bringing it into any particular merge discussion.

thanks,

--larry

On Fri, Oct 20, 2017 at 12:37 PM, larry mccay  wrote:

> I previously sent this same email from my work email and it doesn't seem
> to have gone through - resending from apache account (apologizing up front
> for the length)
>
> For such sizable merges in Hadoop, I would like to start doing security
> audits in order to have an initial idea of the attack surface, the
> protections available for known threats, what sort of configuration is
> being used to launch processes, etc.
>
> I dug into the architecture documents while in the middle of this list -
> nice docs!
> I do intend to try and make a generic check list like this for such
> security audits in the future so a lot of this is from that but I tried to
> also direct specific questions from those docs as well.
>
> 1. UIs
> I see there are at least two UIs - Storage Container Manager and Key Space
> Manager. There are a number of typical vulnerabilities that we find in UIs
>
> 1.1. What sort of validation is being done on any accepted user input?
> (pointers to code would be appreciated)
> 1.2. What explicit protections have been built in for (pointers to code
> would be appreciated):
>   1.2.1. cross site scripting
>   1.2.2. cross site request forgery
>   1.2.3. click jacking (X-Frame-Options)
> 1.3. What sort of authentication is required for access to the UIs?
> 1.4. What authorization is available for determining who can access what
> capabilities of the UIs for either viewing, modifying data or affecting
> object stores and related processes?
> 1.5. Are the UIs built with proxying in mind by leveraging X-Forwarded
> headers?
> 1.6. Is there any input that will ultimately be persisted in configuration
> for executing shell commands or processes?
> 1.7. Do the UIs support the trusted proxy pattern with doas impersonation?
> 1.8. Is there TLS/SSL support?
>
> 2. REST APIs
>
> 2.1. Do the REST APIs support the trusted proxy pattern with doas
> impersonation capabilities?
> 2.2. What explicit protections have been built in for:
>   2.2.1. cross site scripting (XSS)
>   2.2.2. cross site request forgery (CSRF)
>   2.2.3. XML External Entity (XXE)
> 2.3. What is being used for authentication - Hadoop Auth Module?
> 2.4. Are there separate processes for the HTTP resources (UIs and REST
> endpoints) or are they part of existing HDFS processes?
> 2.5. Is there TLS/SSL support?
> 2.6. Are there new CLI commands and/or clients for accessing the REST APIs?
> 2.7. Bucket Level API allows for setting of ACLs on a bucket - what
> authorization is required here - is there a restrictive ACL set on 
creation?
> 2.8. Bucket Level API allows for deleting a bucket - I assume this is
> dependent on ACLs based access control?
> 2.9. Bucket Level API to list bucket returns up to 1000 keys - is there
> paging available?
> 2.10. Storage Level APIs indicate “Signed with User Authorization” what
> does this refer to exactly?
> 2.11. Object Level APIs indicate that there is no ACL support and only
> bucket owners can read and write - but there are ACL APIs on the Bucket
> Level are they meaningless for now?
> 2.12. How does a REST client know which Ozone Handler to connect to or am
> I missing some well known NN type endpoint in the architecture doc
> somewhere?
>
> 3. Encryption
>
> 3.1. Is there any support for encryption of persisted data?
> 3.2. If so, is KMS and the hadoop key command used for key management?
>
> 4. Configuration
>
> 4.1. Are there any passwords or secrets being added to configuration?
> 4.2. If so, are they accessed via Configuration.getPassword() to allow for
> provisioning in credential providers?
> 4.3. Are there any settings that are used to launch docker containers or
> shell out any commands, etc?
>
> 5. HA
>
> 
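
For the UI questions in section 1 above (especially 1.2.x and 1.8), one quick
spot-check that can be run against a live SCM or KSM web endpoint; the port
9876 is taken from the docker-compose example elsewhere in this archive, so
adjust it to your deployment:

    # dump only the response headers from the SCM UI and look for the usual
    # protections larry is asking about
    curl -s -D - -o /dev/null http://localhost:9876/ | \
      grep -iE 'x-frame-options|x-xss-protection|x-content-type-options|strict-transport-security'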

Re: Why there are so many revert operations on trunk?

2016-06-06 Thread Jitendra Pandey
Colin raised the -1 demanding a design document. The document was added the
very next day. There were constructive discussions on the design. There was a
demand for a listenable future or futures with callbacks, which was accepted and
accommodated. With the rest of the work having been completed, there was no need to
revert. Andrew's objection was primarily against releasing in 2.8 without the
aforementioned change in API, which is reasonable and, IMO, it should be
possible to make the above improvement within the 2.8 timeline.

On Jun 6, 2016, at 10:13 AM, Chris Douglas  wrote:

> Reading through HDFS-9924, a request for a design doc- and a -1 on
> committing to trunk- was raised in mid-May, but commits to trunk
> continued. Why is that? Shouldn't this have paused while the details
> were discussed? Branching is neutral to the pace of feature
> development, but consensus on the result is required. Working through
> possibilities in a branch- or in multiple branches- seems like a
> reasonable way to determine which approach has support and code to
> back it.
> 
> Reverting code is not "illegal"; the feature will be in/out of any
> release by appealing to bylaws. Our rules exist to facilitate
> consensus, not declare it a fait accompli.
> 
> An RM only exists by creating an RC. Someone can declare themselves
> Grand Marshall of trunk and stomp around in a fancy hat, but it
> doesn't affect anything. -C
> 
> 
> On Mon, Jun 6, 2016 at 9:36 AM, Junping Du  wrote:
>> Thanks Aaron for pointing it out. I didn't see any consensus on HDFS-9924, so 
>> I think we should bring it here to a broader audience for more discussion.
>> 
>> I saw several very bad practices here:
>> 
>> 1. A committer (no need to say who) reverted all commits from trunk without 
>> reaching consensus with all related contributors/committers.
>> 
>> 2. Someone's comments on feature branches are very misleading... If I 
>> remember correctly, feature development doesn't have to go through a feature 
>> branch, which is just an optional process. This creative process of feature 
>> branches and branch committers - I believe the intention is to 
>> accelerate feature development, not to slow it down.
>> 
>> 3. Someone (again, no need to say who) seems to claim himself as RM for 
>> trunk. I don't think we need any RM for trunk. Even for the RM of 3.0.0-alpha, I 
>> think we need someone else who demonstrates that he/she is more responsible, works 
>> hard and carefully, and communicates openly with the whole community. Only through 
>> this is the success of Hadoop in the age of 3.0 guaranteed.
>> 
>> 
>> Thanks,
>> 
>> 
>> Junping
>> 
>> 
>> 
>> From: Aaron T. Myers 
>> Sent: Monday, June 06, 2016 4:46 PM
>> To: Junping Du
>> Cc: Andrew Wang; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
>> mapreduce-...@hadoop.apache.org; yarn-dev@hadoop.apache.org
>> Subject: Re: Why there are so many revert operations on trunk?
>> 
>> Junping,
>> 
>> All of this is being discussed on HDFS-9924. Suggest you follow the 
>> conversation there.
>> 
>> --
>> Aaron T. Myers
>> Software Engineer, Cloudera
>> 
>> On Mon, Jun 6, 2016 at 7:20 AM, Junping Du 
>> > wrote:
>> Hi Andrew,
>> 
>> I just noticed you revert 8 commits on trunk last Friday:
>> 
>> HADOOP-13226
>> 
>> HDFS-10430
>> 
>> HDFS-10431
>> 
>> HDFS-10390
>> 
>> HADOOP-13168
>> 
>> HDFS-10390
>> 
>> HADOOP-13168
>> 
>> HDFS-10346
>> 
>> HADOOP-12957
>> 
>> HDFS-10224
>> 
>>   And I didn't see any comments from you on JIRA or in email discussion before 
>> you did this. I don't think we are legally allowed to do this even as 
>> committers/PMC members. Can you explain your intention in doing this?
>> 
>>   BTW, thanks Nicolas for reverting all these "illegal" revert operations.
>> 
>> 
>> 
>> Thanks,
>> 
>> 
>> Junping
>> 
> 
> 
> 





Re: Thinking ahead to hadoop-2.6

2014-09-24 Thread Jitendra Pandey
I also believe it's worth a week's wait to include HDFS-6584 and HDFS-6581
in 2.6.

On Wed, Sep 24, 2014 at 3:28 PM, Suresh Srinivas sur...@hortonworks.com
wrote:

 Given that some of the features are in the final stages of stabilization,
 Arun, should we hold off creating the 2.6 branch or building an RC by a week?
 All the features in flux are important ones and worth delaying the release
 by a week.

 On Wed, Sep 24, 2014 at 11:36 AM, Andrew Wang andrew.w...@cloudera.com
 wrote:

  Hey Nicholas,
 
   My concern about Archival Storage isn't related to the code quality or the
   size of the feature. I think that you and Jing did good work. My concern is
   that once we ship, we're locked into that set of archival storage APIs, and
   these APIs are not yet finalized. Simply being able to turn off the feature
   does not change the compatibility story.

   I'm willing to devote time to help review these JIRAs and kick the tires on
   the APIs, but my point above was that I'm not sure it'd all be done by the
   end of the week. Testing might also reveal additional changes that need to
   be made, which also might not happen by end-of-week.

   I guess the question before us is if we're comfortable putting something in
   branch-2.6 and then potentially adding API changes after. I'm okay with
   that as long as we're all aware that this might happen.

   Arun, as RM is this cool with you? Again, I like this feature and I'm fine
   with its inclusion, just a heads up that we might need some extra time to
   finalize things before an RC can be cut.
 
  Thanks,
  Andrew
 
  On Tue, Sep 23, 2014 at 7:30 PM, Tsz Wo (Nicholas), Sze 
  s29752-hadoop...@yahoo.com.invalid wrote:
 
   Hi,
  
    I am worried about KMS and transparent encryption since quite many
    bugs were discovered after it got merged to branch-2.  It gives us the
    impression that the feature is not yet well tested.  Indeed, transparent
    encryption is a complicated feature which changes the core part of HDFS.
    It is not easy to get everything right.
  
  
    For HDFS-6584: Archival Storage, it is a relatively simple and low-risk
    feature.  It introduces a new storage type ARCHIVE and the concept of block
    storage policy to HDFS.  When a cluster is configured with ARCHIVE storage,
    the blocks will be stored using the appropriate storage types specified by
    storage policies assigned to the files/directories.  A cluster admin could
    disable the feature by simply not configuring any storage type and not
    setting any storage policy, as before.  As Suresh mentioned, HDFS-6584 is
    in the final stages of being merged to branch-2.
  
   Regards,
   Tsz-Wo
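
For context, the admin-facing surface of HDFS-6584 is small; roughly the flow
below (the exact subcommand names and flags follow the current branch-2 patch
and may still change before the release ships):

    # tag a datanode volume as ARCHIVE in dfs.datanode.data.dir, e.g.
    #   [DISK]file:///grid/0/dn,[ARCHIVE]file:///grid/1/dn
    hdfs storagepolicies                              # list the built-in policies
    hdfs dfsadmin -setStoragePolicy /data/2013 COLD   # pin a directory to ARCHIVE
    hdfs mover -p /data/2013                          # migrate its existing blocks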
  
  
  
   On Wednesday, September 24, 2014 7:00 AM, Suresh Srinivas 
   sur...@hortonworks.com wrote:
  
  
   
   
    I actually would like to see both archival storage and single replica
    memory writes in the 2.6 release. Archival storage is in the final stages
    of getting ready for the branch-2 merge, as Nicholas has already indicated on
    the dev mailing list. Hopefully HDFS-6581 gets ready sooner. Both of these
    features have been in development for some time.
   
   On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang 
 andrew.w...@cloudera.com
   wrote:
   
Hey Arun,
   
 Maybe we could do a quick run through of the Roadmap wiki and add/retarget
 things accordingly?

 I think the KMS and transparent encryption are ready to go. We've got a
 very few further bug fixes pending, but that's it.

 Two HDFS things that I think probably won't make the end of the week are
 archival storage (HDFS-6584) and single replica memory writes (HDFS-6581),
 which I believe are under the HSM banner. HDFS-6584 was just merged to
 trunk and I think needs a little more work before it goes into branch-2.
 HDFS-6581 hasn't even been merged to trunk yet, so seems a bit further off
 yet.

 Just my 2c as I did not work directly on these features. I just generally
 shy away from shipping bits quite this fresh.
   
Thanks,
Andrew
   
On Tue, Sep 23, 2014 at 3:03 PM, Arun Murthy a...@hortonworks.com
   wrote:
   
  Looks like most of the content is in and hadoop-2.6 is shaping up nicely.

  I'll create branch-2.6 by end of the week and we can go from there to
  stabilize it - hopefully in the next few weeks.

 Thoughts?

 thanks,
 Arun

 On Tue, Aug 12, 2014 at 1:34 PM, Arun C Murthy 
 a...@hortonworks.com
  
 wrote:

  Folks,
 
    With hadoop-2.5 nearly done, it's time to start thinking ahead to
   hadoop-2.6.
 
   Currently, here is the Roadmap per the wiki:
 
  • HADOOP
  • Credential provider HADOOP-10607
  • HDFS
   • Heterogeneous storage (Phase 2) - Support APIs for using
   storage tiers by the applications HDFS-5682
  • Memory as storage tier 

Re: [DISCUSS] Change by-laws on release votes: 5 days instead of 7

2014-06-23 Thread Jitendra Pandey
+1, sounds good!


On Mon, Jun 23, 2014 at 12:02 PM, Andrew Wang andrew.w...@cloudera.com
wrote:

 +1 here as well, let's do a vote thread (for 7 days, maybe for the last
 time!)


 On Mon, Jun 23, 2014 at 11:46 AM, Vinod Kumar Vavilapalli 
 vino...@apache.org wrote:

  This seems reasonable, +1.
 
   In case there is inactivity, we already extend dates even today, we can do
   the same going forward.
 
  I don't see major disagreements, time to start a vote?
 
  Thanks
  +Vinod
 
  On Jun 20, 2014, at 10:54 PM, Arun C Murthy a...@hortonworks.com wrote:
 
   Folks,
  
   I'd like to propose we change our by-laws to reduce our voting periods
  on new releases from 7 days to 5.
  
   Currently, it just takes too long to turn around releases; particularly
  if we have critical security fixes etc.
  
   Thoughts?
  
   thanks,
   Arun
  
  
 
 
 



