Re: Getting close to a vote on a merge of S3Guard., HADOOP-13345
Thanks for the reply Steve, it aligns with what Aaron said above. The sooner the better for this branch merge :)

On Thu, Aug 17, 2017 at 6:49 AM, Steve Loughran wrote:
> Code targets trunk, current state is ready to go in.
Re: Getting close to a vote on a merge of S3Guard., HADOOP-13345
On 16 Aug 2017, at 18:39, Andrew Wang <andrew.w...@cloudera.com> wrote:

> Hi Steve,
>
> What's the target release vehicle, and the timeline for merging this? The target date for beta1 is mid-September, so any large code movements make me nervous.

Code targets trunk, current state is ready to go in.

I've also got it building & running against branch-2: all the code is Java-7 and the classpath problems were dealt with by Mingliang.

> Could you comment on testing and API stability of this branch? I'm trusting the judgement of the contributors involved, since there isn't much time to fix things before beta1.

This is all working in the s3 code, and it's something you have to explicitly enable; I'm confident that when disabled it doesn't cause problems.

There are two modes of use in production (as well as a local dynamodb for testing):

* dynamo DB as cache, "non authoritative"
* dynamo DB as store of record, "authoritative"

I'm fairly happy with non-auth; but as auth assumes that all clients are using s3guard, it's the one with the most risks. That one I'd be cautious over. But it does deliver the best speedup. And it lets you use the v1/v2 algorithms to commit output, as now you get the consistent directory listings you need. There's still the O(data) COPY call, but at least the risk of incomplete listings -> incomplete copy operation is eliminated.

We've had a preview version up for a while, running large hive/LLAP tests against it happily in particular, and my spark & cloud testing has shown all is well (indeed, I can show how all isn't well if you enable the inconsistent FS client and *don't* turn s3guard on).

After the initial merge, there is more work to do, but mostly around: metrics, diagnostics, and the new committer work which depends on the consistent listings for one of the committers, but doesn't do *any* API calls into s3guard itself. All it needs is a consistent S3 endpoint, be it AWS S3 & S3Guard, or something else like the WDC cloud store. That's not going to be ready for Beta 1.

-Steve

> Best,
> Andrew
>
> On Wed, Aug 16, 2017 at 5:25 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>> FYI, We're getting ready for a patch to merge the current S3Guard branch, HADOOP-13345, via a patch https://issues.apache.org/jira/browse/HADOOP-13998
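
The difference between the two DynamoDB modes described above can be sketched in a few lines of Python. This is an illustrative model only, not S3Guard code: `list_directory`, `CountingS3`, and the set standing in for the metadata store are hypothetical names. It assumes that non-authoritative mode merges the S3 listing with the table, while authoritative mode answers from the table alone. Deletion markers and everything else a real implementation must handle are left out.

```python
from typing import Callable, Iterable, Set


def list_directory(s3_list: Callable[[], Iterable[str]],
                   store_entries: Iterable[str],
                   authoritative: bool) -> Set[str]:
    """Return the names a client would see for one directory (toy model)."""
    if authoritative:
        # Store of record: trust the table and skip the S3 LIST call.
        return set(store_entries)
    # Cache: still issue the S3 LIST, then add entries S3 has not shown yet.
    return set(s3_list()) | set(store_entries)


class CountingS3:
    """Fake S3 listing that records how many LIST calls were made."""

    def __init__(self, visible: Iterable[str]):
        self.visible = set(visible)
        self.calls = 0

    def list(self) -> Set[str]:
        self.calls += 1
        return set(self.visible)


if __name__ == "__main__":
    # The table knows about "a" and "b". S3 has not listed "b" yet, and a
    # client that bypassed the table has written "c" directly to S3.
    s3 = CountingS3({"a", "c"})
    table = {"a", "b"}
    print(sorted(list_directory(s3.list, table, authoritative=False)))
    print(sorted(list_directory(s3.list, table, authoritative=True)))
    print("LIST calls:", s3.calls)
```

In this model the authoritative listing avoids the LIST call, which is where the speedup comes from, but it also misses `c`, the file written by a client that did not update the table. That is the risk attached to the assumption that all clients use s3guard.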
Re: Getting close to a vote on a merge of S3Guard., HADOOP-13345
Thanks for the detailed explanation Aaron. The fact that this has gone through Cloudera's QA cycle and is running in production adds a lot of confidence in the feature.

Looking forward to having this in 3.0.0-beta1!

Best,
Andrew

On Wed, Aug 16, 2017 at 2:17 PM, Aaron Fabbri wrote:
> I think this is ready to get in before beta1. Most of upstream s3a dev has been happening on this branch so it has a lot of improvements and testing.
Re: Getting close to a vote on a merge of S3Guard., HADOOP-13345
On Wed, Aug 16, 2017 at 1:39 PM, Andrew Wang wrote:

> Hi Steve,
>
> What's the target release vehicle, and the timeline for merging this? The target date for beta1 is mid-September, so any large code movements make me nervous.

I think this is ready to get in before beta1. Most of upstream s3a dev has been happening on this branch so it has a lot of improvements and testing.

> Could you comment on testing and API stability of this branch? I'm trusting the judgement of the contributors involved, since there isn't much time to fix things before beta1.

We've done a ton of testing on this branch:

- List consistency tests with failure injection. (HADOOP-13793) This integration test forces a delay in visibility of certain files by wrapping the AWS S3 client. It asserts listing is consistent. The test fails without S3Guard, and succeeds with it.

- All existing S3 integration tests with and without S3Guard. The filesystem contract tests have been invaluable here. (HADOOP-13589 makes these very easy to run).

- MetadataStore contract tests that ensure that the API semantics of the DynamoDB and in-memory reference implementations are correct.

- MetadataStore scale tests that can be used to force DynamoDB service throttling and ensure we are robust to that.

- Unit tests for different parts of the S3Guard logic.

As you probably know, at Cloudera we are using this codebase in production, and have run all of our downstream tests including Hive, Spark, Impala on the new S3A client code, with and without S3Guard enabled.

In terms of API compatibility, the new features sit behind the FileSystem / FileContext APIs, which have not changed. Applications don't require any changes. Internal APIs for S3Guard, such as MetadataStore (currently private / evolving), should be properly annotated already. The S3Guard work has been active for quite a while now, so the APIs are fairly stable in practice.

Probably my biggest goal in writing the S3AFileSystem integration code (HADOOP-13651) was to preserve existing logic and correctness when S3Guard is not enabled. One design choice which has worked well was to define a "null" implementation of the MetadataStore (the API that filesystem clients use to log metadata changes):

https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/NullMetadataStore.java

This is used in S3A by default. This made it easier to reason about correctness and minimized the size of the diff to the FS client as well.

Other questions welcomed!

Cheers,
Aaron

> Best,
> Andrew
>
> On Wed, Aug 16, 2017 at 5:25 AM, Steve Loughran wrote:
>> FYI, We're getting ready for a patch to merge the current S3Guard branch, HADOOP-13345, via a patch https://issues.apache.org/jira/browse/HADOOP-13998
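
The "null" MetadataStore choice Aaron describes is an instance of the null-object pattern. A minimal Python sketch of the idea follows. The real code is Java and its interface is larger, so every class and method name here (`MetadataStore`, `InMemoryMetadataStore`, `FileSystemClient`, `put`, `get`) is a simplified, hypothetical stand-in and not the Hadoop API.

```python
from typing import Dict, Optional


class MetadataStore:
    """Hypothetical, cut-down interface for logging metadata changes."""

    def put(self, path: str, meta: Dict[str, int]) -> None:
        raise NotImplementedError

    def get(self, path: str) -> Optional[Dict[str, int]]:
        raise NotImplementedError


class NullMetadataStore(MetadataStore):
    """Default store: accepts every update and never returns anything."""

    def put(self, path: str, meta: Dict[str, int]) -> None:
        return None

    def get(self, path: str) -> Optional[Dict[str, int]]:
        return None


class InMemoryMetadataStore(MetadataStore):
    """Stand-in for a real store such as a database table."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, int]] = {}

    def put(self, path: str, meta: Dict[str, int]) -> None:
        self._entries[path] = dict(meta)

    def get(self, path: str) -> Optional[Dict[str, int]]:
        return self._entries.get(path)


class FileSystemClient:
    """Toy client: one code path whether or not a real store is configured."""

    def __init__(self, store: Optional[MetadataStore] = None) -> None:
        self.store = store if store is not None else NullMetadataStore()
        self._objects: Dict[str, int] = {}  # path -> size, stands in for S3

    def create(self, path: str, size: int) -> None:
        self._objects[path] = size
        # Always log the change; the null store simply discards it.
        self.store.put(path, {"size": size})

    def get_size(self, path: str) -> Optional[int]:
        cached = self.store.get(path)
        if cached is not None:
            return cached["size"]
        # The null store always lands here, so behaviour matches the old path.
        return self._objects.get(path)
```

Because the client calls the store unconditionally, there is no "is the feature enabled" branch scattered through the filesystem code, which fits the point about keeping the diff small and the disabled path easy to reason about.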
Re: Getting close to a vote on a merge of S3Guard., HADOOP-13345
Hi Steve,

What's the target release vehicle, and the timeline for merging this? The target date for beta1 is mid-September, so any large code movements make me nervous.

Could you comment on testing and API stability of this branch? I'm trusting the judgement of the contributors involved, since there isn't much time to fix things before beta1.

Best,
Andrew

On Wed, Aug 16, 2017 at 5:25 AM, Steve Loughran wrote:
> FYI, We're getting ready for a patch to merge the current S3Guard branch, HADOOP-13345, via a patch https://issues.apache.org/jira/browse/HADOOP-13998
Getting close to a vote on a merge of S3Guard., HADOOP-13345
FYI, We're getting ready for a patch to merge the current S3Guard branch, HADOOP-13345, via a patch https://issues.apache.org/jira/browse/HADOOP-13998

After that's done, we do plan to have a second iteration, work on a 0-rename committer (HADOOP-13786) with all the other tuning and improvements; we'd add a new uber-JIRA & move stuff over, maybe branch, and/or do things patch-by-patch.

Anyway, now is a great time for people to download and play

https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md

testing this

https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md

The Inconsistent AWS Client is also something everyone is free to use for injecting inconsistencies (and soon faults) into their own apps by way of 2-3 config options. Want to know how your code handles S3A being observably inconsistent? We'll let you do that.

-Steve
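
To give a feel for what injecting inconsistencies means in practice, here is a small Python model of a listing that hides recently written keys for a while. It is a sketch under stated assumptions, not the Inconsistent AWS Client itself: `InconsistentListing` and its `key_substring`, `delay_seconds`, and `clock` parameters are hypothetical, and the real client is configured through the Hadoop options described in testing.md, which are not reproduced here.

```python
import time
from typing import Callable, Dict, List


class InconsistentListing:
    """Toy object store whose listing lags behind writes for selected keys."""

    def __init__(self, key_substring: str, delay_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._substring = key_substring  # only matching keys are delayed
        self._delay = delay_seconds      # how long a new key stays hidden
        self._clock = clock              # injectable so tests need no sleep
        self._written: Dict[str, float] = {}  # key -> time of the write

    def put(self, key: str) -> None:
        self._written[key] = self._clock()

    def list(self) -> List[str]:
        now = self._clock()
        return sorted(
            key for key, written_at in self._written.items()
            if self._substring not in key or now - written_at >= self._delay
        )


if __name__ == "__main__":
    fake_now = [0.0]
    store = InconsistentListing("DELAY", 5.0, clock=lambda: fake_now[0])
    store.put("data/part-0000")
    store.put("data/DELAY-part-0001")
    print(store.list())  # the delayed key is not visible yet
    fake_now[0] = 5.0
    print(store.list())  # both keys are visible once the delay has passed
```

Code that lists a directory immediately after writing to it and then copies or commits what it sees will silently skip the hidden key in the first listing. That is the kind of failure a consistent listing layer is meant to rule out.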