On Wed, May 11, 2016 at 7:53 PM, 张铎 wrote:
> I think at that time I will start a new project called AsyncDFSClient which
> will implement the whole client side logic of HDFS without using reflection
> :)
>
If we end up in this dystopian future, then please have that project
live as a subproject o
On Wed, May 11, 2016 at 10:28 PM, Andrew Purtell wrote:
> All you have to do is stick around long enough. Hadoop 0.20-append v2 :-)
>
*palm-all-the-faces*
> On May 11, 2016, at 9:46 PM, Stack wrote:
> >
> >> On Wed, May 11, 2016 at 7:53 PM, 张铎 wrote:
> >>
> >> I think at that time I will star
All you have to do is stick around long enough. Hadoop 0.20-append v2 :-)
> On May 11, 2016, at 9:46 PM, Stack wrote:
>
>> On Wed, May 11, 2016 at 7:53 PM, 张铎 wrote:
>>
>> I think at that time I will start a new project called AsyncDFSClient which
>> will implement the whole client side logic
On Wed, May 11, 2016 at 7:53 PM, 张铎 wrote:
> I think at that time I will start a new project called AsyncDFSClient which
> will implement the whole client side logic of HDFS without using reflection
> :)
>
>
Haven't I seen this movie before? (smile)
St.Ack
> 2016-05-12 10:27 GMT+08:00 Andrew P
I think at that time I will start a new project called AsyncDFSClient which
will implement the whole client side logic of HDFS without using reflection
:)
2016-05-12 10:27 GMT+08:00 Andrew Purtell :
> If Hadoop refuses the changes before we release, we can change the default
> back.
>
>
> On May
If Hadoop refuses the changes before we release, we can change the default
back.
On May 11, 2016, at 6:50 PM, Gary Helmling wrote:
>>
>>
>> I was trying to avoid the below oft-repeated pattern at least for the case
>> of critical developments:
>>
>> + New feature arrives after much work by
>
>
> I was trying to avoid the below oft-repeated pattern at least for the case
> of critical developments:
>
> + New feature arrives after much work by developer, reviewers and testers
> accompanied by fanfare (blog, talks).
> + Developers and reviewers move on after getting it committed or it ge
See HDFS-223 and HDFS-916. There are plenty of related issues. The most
important thing is that we need a suitable API, and there is an asynchronous
file system proposal in HADOOP-12910 which does not fit our requirements, so
I need to stop it from being committed first...
And a default choice in a later
On Tue, May 10, 2016 at 10:39 AM, Gary Helmling wrote:
> >
> > The suggestion is that we make this new client the default now in master
> > branch so we have plenty of time to find any issues with the
> > implementation. We'd also enable it as the default because the
> improvement
> > is dramatic
I'm not sure this should be the default for 2.0, but I'd definitely like to see it
as an option we're comfortable supporting for as long as we are negotiating
with HDFS. That would be one major reason why trying out a 2.0.0 release would be
compelling.
On May 10, 2016, at 10:51 AM, Gary Helmling wr
>
> Yeah the 'push to upstream' work has been started already. See here
>
> https://issues.apache.org/jira/browse/HADOOP-12910
>
> But it is much harder to push code into HDFS than HBase. It is the core of
> all hadoop systems and I do not have many contacts in the hdfs community...
>
>
Yes, I'm fa
>
> The suggestion is that we make this new client the default now in master
> branch so we have plenty of time to find any issues with the
> implementation. We'd also enable it as the default because the improvement
> is dramatic (performance, less moving parts, comprehensible, etc.) and we
> thin
On Mon, May 9, 2016 at 11:59 PM, Gary Helmling wrote:
...
> To me, it seems much safer to actively try to push this upstream into HDFS
> right now, and still pointing to its optional, non-default use in HBase as
> a compelling story. I don't understand why making it the default in 2.0 is
> nece
Some methods are moved from classes in hadoop-hdfs to classes in
hadoop-hdfs-client.
ClientProtocol.addBlock method adds an extra parameter.
DFSClient.Conf is moved to a separate file and renamed to DFSClientConf.
Not very hard. I promise that I can give a patch within 3 days after the
release of
Yeah the 'push to upstream' work has been started already. See here
https://issues.apache.org/jira/browse/HADOOP-12910
But it is much harder to push code into HDFS than HBase. It is the core of
all hadoop systems and I do not have many contacts in the hdfs community...
And it is more convincing
Thanks for adding the tests and fixing up AES support.
My only real concern is the maintainability of this code as our own private
DFS client. The SASL support, for example, is largely based on reflection
and reaches in to private fields of @InterfaceAudience.Private Hadoop
classes. This seems b
Any other suggestions/objections here? If not, will make the cut over in
next day or so.
Thanks,
St.Ack
On Thu, May 5, 2016 at 10:02 PM, Stack wrote:
> On Thu, May 5, 2016 at 7:39 PM, Yu Li wrote:
>
>> Almost missed the party...
>>
>> bq. Do you think it is worth backporting this feature to branch-1
On Thu, May 5, 2016 at 7:39 PM, Yu Li wrote:
> Almost missed the party...
>
> bq. Do you think it is worth backporting this feature to branch-1 and releasing
> it in the next 1.x release? This may introduce a compatibility issue: as said
> in HBASE-14949, we need HBASE-14949 to make sure that the r
Almost missed the party...
bq. Do you think it is worth backporting this feature to branch-1 and releasing
it in the next 1.x release? This may introduce a compatibility issue: as said
in HBASE-14949, we need HBASE-14949 to make sure that the rolling upgrade
does not lose data...
From current perf da
Thanks for your effort, Duo.
I am in favor of making AsyncWAL the default in master branch.
Cheers
On Thu, May 5, 2016 at 6:03 PM, 张铎 wrote:
> Some progress.
>
> I have filed HBASE-15743 for the transparent encryption support,
> and HBASE-15754 for the AES encryption UT. Now both of them are r
Some progress.
I have filed HBASE-15743 for the transparent encryption support,
and HBASE-15754 for the AES encryption UT. Now both of them are resolved.
Let's resume the discussion here.
Thanks.
2016-05-03 10:09 GMT+08:00 张铎 :
> Fine, will add the testcase.
>
> And for the RPC, we only impleme
Fine, will add the testcase.
And for the RPC, we only implement a new client side DTP here and still use
the original RPC.
Thanks.
2016-05-03 3:20 GMT+08:00 Gary Helmling :
> On Fri, Apr 29, 2016 at 6:24 PM 张铎 wrote:
>
> > Yes, it does. There is a testcase that enumerates all the possible
> prot
On Fri, Apr 29, 2016 at 6:24 PM 张铎 wrote:
> Yes, it does. There is a testcase that enumerates all the possible protection
> levels (authentication, integrity and privacy) and encryption algorithms (none,
> 3des, rc4).
>
>
> https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apac
On Sat, Apr 30, 2016 at 10:06 PM, Sean Busbey wrote:
> On Sat, Apr 30, 2016 at 1:34 PM, Stack wrote:
> > On Sat, Apr 30, 2016 at 6:33 AM, Ted Yu wrote:
> >
> >> What about support for Transparent Data Encryption feature which was
> >> introduced in Apache Hadoop 2.6.0 ?
> >>
> >>
> > Transparen
I checked the code. The HdfsFileStatus returned when creating a file
inside an encryption zone will contain a FileEncryptionInfo. DFSClient will
create a CryptoOutputStream which wraps a DFSOutputStream based on the
FileEncryptionInfo.
Let me file an issue to implement it. Just another piece of refle
On Sat, Apr 30, 2016 at 1:34 PM, Stack wrote:
> On Sat, Apr 30, 2016 at 6:33 AM, Ted Yu wrote:
>
>> What about support for Transparent Data Encryption feature which was
>> introduced in Apache Hadoop 2.6.0 ?
>>
>>
> Transparent: "...(of a process or interface) functioning without the user
> being
On Sat, Apr 30, 2016 at 6:33 AM, Ted Yu wrote:
> What about support for Transparent Data Encryption feature which was
> introduced in Apache Hadoop 2.6.0 ?
>
>
Transparent: "...(of a process or interface) functioning without the user
being aware of its presence."
St.Ack
> On Fri, Apr 29, 2016
What about support for Transparent Data Encryption feature which was
introduced in Apache Hadoop 2.6.0 ?
On Fri, Apr 29, 2016 at 6:24 PM, 张铎 wrote:
> Yes, it does. There is a testcase that enumerates all the possible protection
> levels (authentication, integrity and privacy) and encryption algorith
Yes, it does. There is a testcase that enumerates all the possible protection
levels (authentication, integrity and privacy) and encryption algorithms (none,
3des, rc4).
https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/io/asyncfs/TestSaslFanOutOneBlockAsyncD
How well has this been tested on secure clusters? I know SASL support was
lacking initially, but I believe it had been added? Does AsyncFSWAL
support all the HDFS transport encryption options?
On Fri, Apr 29, 2016 at 12:05 AM Stack wrote:
> I'm +1 on enabling asyncfswal as default in 2.0:
>
>
I'm +1 on enabling asyncfswal as default in 2.0:
+ We'll have plenty of time to figure issues if any if we get it in now,
early.
+ The improvement in throughput is substantial
+ There are now less moving parts
+ A critical piece of our write path is much less opaque in its workings
and no longer (
I've done some digging in the HDFS and HADOOP projects and found that there is an
active issue, HADOOP-12910, related to an asynchronous FileSystem implementation.
I have left some comments on it; maybe we could start from there.
Thanks.
2016-04-29 14:42 GMT+08:00 Stack :
> On Thu, Apr 28, 2016 at 8:47 PM,
On Thu, Apr 28, 2016 at 8:47 PM, Ted Yu wrote:
> Last comment on HDFS-916 was from 2010.
>
> Suggest making a new issue or reviving discussion on HDFS-916 (currently
> assigned to Todd).
>
>
Duo is on it. Some mileage and confidence in the new code would be good to
have before going to HDFS (Gett
On Thu, Apr 28, 2016 at 8:34 PM, Heng Chen wrote:
> The performance is quite great, but I think maybe we should collect some
> experience on a real production cluster before we make it the default.
>
Yeah. A production deploy before we made the switch would be nice, but
in the absence of that,
2016-04-29 11:47 GMT+08:00 Ted Yu :
> Last comment on HDFS-916 was from 2010.
>
> Suggest making a new issue or reviving discussion on HDFS-916 (currently
> assigned to Todd).
>
> bq. The fallback implementation is not aimed at getting good performance
>
> For more than two weeks, I have been working
Last comment on HDFS-916 was from 2010.
Suggest making a new issue or reviving discussion on HDFS-916 (currently
assigned to Todd).
bq. The fallback implementation is not aimed at getting good performance
For more than two weeks, I have been working with Azure Data Lake
developers so that all hbase
But 2.0 is not released yet...
Do you think it is worth backporting this feature to branch-1 and releasing it
in the next 1.x release? This may introduce a compatibility issue: as said
in HBASE-14949, we need HBASE-14949 to make sure that the rolling
upgrade does not lose data...
2016-04-29 11:34 G
2016-04-29 11:35 GMT+08:00 Ted Yu :
> bq. AsyncFSOutput will be in HDFS-3.0
>
> Is there HDFS JIRA for the above ? Can you share the number ?
>
I have not filed a new one, but there are a bunch of related issues already,
such as this one: https://issues.apache.org/jira/browse/HDFS-916
>
> bq. Just wr
+1 to what Heng said.
On Thu, Apr 28, 2016 at 8:34 PM, Heng Chen wrote:
> The performance is quite great, but I think maybe we should collect some
> experience on a real production cluster before we make it the default.
>
> 2016-04-29 11:30 GMT+08:00 张铎 :
>
> > Inline comments.
> > Thanks,
> >
> >
bq. AsyncFSOutput will be in HDFS-3.0
Is there HDFS JIRA for the above ? Can you share the number ?
bq. Just wrap FSDataOutputStream to make it act like an asynchronous output
Can you be a bit more specific ?
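One possible reading of the quoted "wrap FSDataOutputStream" idea, as a minimal Java sketch. It assumes a plain java.io.OutputStream standing in for FSDataOutputStream, and the class name BlockingToAsyncOutput and its writeAndFlush method are hypothetical; this is not HBase's AsyncFSOutput or any HDFS API. Blocking writes are handed to a single writer thread, and the caller gets a CompletableFuture back.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper that gives a blocking stream an asynchronous calling convention. */
final class BlockingToAsyncOutput implements AutoCloseable {
  // Stand-in for FSDataOutputStream (assumption: any blocking OutputStream).
  private final OutputStream out;
  // A single thread keeps writes in submission order.
  private final ExecutorService writer = Executors.newSingleThreadExecutor();
  // Total bytes written and flushed; only touched by the writer thread.
  private long flushed;

  BlockingToAsyncOutput(OutputStream out) {
    this.out = out;
  }

  /** Queues a write plus flush; the future completes with the total bytes flushed so far. */
  CompletableFuture<Long> writeAndFlush(byte[] data) {
    byte[] copy = data.clone(); // the caller may reuse its buffer right away
    return CompletableFuture.supplyAsync(() -> {
      try {
        out.write(copy);
        out.flush(); // a wrapper over a real HDFS stream would sync here instead
        flushed += copy.length;
        return flushed;
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }, writer);
  }

  /** Waits for queued writes to finish, then closes the underlying stream. */
  @Override
  public void close() {
    writer.shutdown();
    try {
      if (!writer.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("pending writes did not finish");
      }
      out.close();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Every write still blocks a thread underneath, so a wrapper like this only changes the calling convention. That would be consistent with the remark that the fallback implementation is not aimed at good performance.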
HBase currently works with WASB and Azure Data Lake. Does the above mean
their performa
The performance is quite great, but I think maybe we should collect some
experience on a real production cluster before we make it the default.
2016-04-29 11:30 GMT+08:00 张铎 :
> Inline comments.
> Thanks,
>
> 2016-04-29 10:57 GMT+08:00 Sean Busbey :
>
> > I am nervous about having default out-of-th
Inline comments.
Thanks,
2016-04-29 10:57 GMT+08:00 Sean Busbey :
> I am nervous about having default out-of-the-box new HBase users reliant on
> a bespoke HDFS client, especially given Hadoop's compatibility
> promises and history. Answers for these questions would make me more
> confident:
>
>
I am nervous about having default out-of-the-box new HBase users reliant on
a bespoke HDFS client, especially given Hadoop's compatibility
promises and history. Answers for these questions would make me more
confident:
1) Where are we on getting the client-side changes to HDFS pushed back upstream
Six months after I filed HBASE-14790...
Now the AsyncFSWAL is ready. The WALPE result shows that it is *1.4x~3.7x*
faster than FSHLog. The ITBLL result shows that it is *no worse* than
FSHLog (the master branch is not that stable itself...).
More details can be found on HBASE-15536.
So here we