Re: Suggested S3 FileIO/Getting Started

John Clara Wed, 11 Nov 2020 18:17:41 -0800

(Not sure if this is actually replying or just starting a new thread)


Hi Daniel,

Thanks for the response! It's very helpful and answers a lot of my questions.

A couple of follow-ups:

One of my concerns with S3FileIO is getting tied too much to a single cloud provider. I'm wondering if an ObjectStoreFileIO would be helpful so that S3FileIO and (a future) GCSFileIO could share logic? I haven't looked deep enough into the S3FileIO to know how much logic is not S3-specific. Maybe the FileIO interface is enough.
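
To make the idea concrete, here is a rough sketch of what I mean. It is Python rather than Java just to keep it short, and everything in it is hypothetical: ObjectStoreFileIO does not exist in Iceberg, and the `_get`/`_put`/`_delete` primitives and the in-memory subclass are names I made up. The only thing taken from Iceberg is the shape of the three FileIO operations (new input file, new output file, delete file).

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class ObjectStoreFileIO(ABC):
    """Hypothetical shared base: everything here is provider-neutral."""

    scheme = None  # e.g. "s3" or "gs"; set by each subclass

    def _split(self, location):
        # Shared logic: turn "s3://bucket/a/b" into ("bucket", "a/b").
        parsed = urlparse(location)
        if parsed.scheme != self.scheme:
            raise ValueError(f"expected {self.scheme}:// location, got {location}")
        return parsed.netloc, parsed.path.lstrip("/")

    # The three FileIO-style operations, written once for all providers.
    def new_input_file(self, location):
        bucket, key = self._split(location)
        return lambda: self._get(bucket, key)

    def new_output_file(self, location):
        bucket, key = self._split(location)
        return lambda data: self._put(bucket, key, data)

    def delete_file(self, location):
        bucket, key = self._split(location)
        self._delete(bucket, key)

    # Provider-specific primitives (assumed names, not a real client API).
    @abstractmethod
    def _get(self, bucket, key): ...

    @abstractmethod
    def _put(self, bucket, key, data): ...

    @abstractmethod
    def _delete(self, bucket, key): ...


class InMemoryS3FileIO(ObjectStoreFileIO):
    """Stand-in for an S3-backed subclass; a dict replaces the real client."""

    scheme = "s3"

    def __init__(self):
        self.objects = {}

    def _get(self, bucket, key):
        return self.objects[(bucket, key)]

    def _put(self, bucket, key, data):
        self.objects[(bucket, key)] = data

    def _delete(self, bucket, key):
        del self.objects[(bucket, key)]
```

If the provider-neutral part really is as thin as the location parsing above, then a shared base class probably isn't worth it and the FileIO interface alone is the right level of abstraction.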


About consistency (no need to respond here):

I'm seeing that during "getFileStatus" my version of s3a does some list requests (but I'm not sure if that could fail from consistency issues).

I'm also confused about the read-after-(initial) write part:

"Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all Regions with one caveat. The caveat is that if you make a HEAD or GET request to a key name before the object is created, then create the object shortly after that, a subsequent GET might not return the object due to eventual consistency." - https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html

When my version of s3a does a create, it first does a getMetadataRequest (HEAD) to check if the object exists before creating the object. I think this is talked about in this issue: https://github.com/apache/iceberg/issues/1398 and talked about in the S3FileIO PR: https://github.com/apache/iceberg/pull/1573. I'll follow up in that issue for more info.
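
Here is how I picture the caveat, as a toy model in Python. To be clear, the "negative cache" with a short lifetime is purely my assumption for illustration; I don't know how S3 implements this internally, and the class and function names are made up. It only reproduces the documented behavior: a miss on a key before the object exists can make the first read after the write miss as well.

```python
class ToyStore:
    """Toy model only: a negative cache stands in for S3's eventual consistency."""

    def __init__(self, negative_ttl=2):
        self.objects = {}
        self.negative_until = {}  # key -> logical time until which a miss is remembered
        self.clock = 0            # logical clock, one tick per request
        self.negative_ttl = negative_ttl

    def exists(self, key):
        """Models a HEAD or GET: True if the object is visible to this request."""
        self.clock += 1
        if self.negative_until.get(key, 0) >= self.clock:
            return False  # a stale "not found" is still being served
        if key in self.objects:
            return True
        # Miss on a key that does not exist yet: remember the miss for a while.
        self.negative_until[key] = self.clock + self.negative_ttl
        return False

    def put(self, key, data):
        self.clock += 1
        self.objects[key] = data


def create_then_read(store, key, check_first):
    if check_first:
        store.exists(key)     # HEAD before the object exists, as my s3a does
    store.put(key, b"data")
    return store.exists(key)  # first read right after the write


print(create_then_read(ToyStore(), "a", check_first=False))  # True
print(create_then_read(ToyStore(), "b", check_first=True))   # False
```

In this model, skipping the existence check before the initial write is what keeps the read-after-write guarantee intact.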


John


On 2020/11/12 00:36:10, Daniel Weeks <[email protected]> wrote:

> Hey John, I might be able to help answer some of your questions and
> provide some context around how you might want to go forward.
>
> So, one fundamental aspect of Iceberg is that it only relies on a few
> operations (as defined by the FileIO interface). This makes much of the
> functionality and complexity of full file system implementations
> unnecessary. You should not need features like S3Guard or additional S3
> operations these implementations rely on in order to achieve filesystem
> contract behavior. Consistency issues should also not be a problem since
> Iceberg does not overwrite or list and read-after-(initial)write is a
> guarantee provided by S3.
>
> At Netflix, we use a custom FileSystem implementation (somewhat like
> S3A), but with much of the contract behavior that drives additional
> operations against S3 disabled. However, we are transitioning to a more
> native implementation of S3FileIO, which you'll see as part of the
> ongoing work in Iceberg.
>
> Per your specific questions:
>
> 1) The S3FileIO implementation is very new, though internally we have
> something very similar. There are features missing that we are working
> to add (e.g. progressive multipart upload for large files is likely the
> most important).
> 2) You can use S3AFileSystem with the HadoopFileIO implementation,
> though you may still see similar behavior with additional calls being
> made (I don't know if these can be disabled).
> 3) The PrestoS3FileSystem is tailored to Presto's use and is likely not
> as complete as S3A, but seeing as it is using the Hadoop FileSystem api,
> it would likely work for what HadoopFileIO exercises (as would the
> EMRFileSystem).
> 4) I would probably discourage you from writing your own file system as
> the S3FileIO will likely be a more optimized implementation for what
> Iceberg needs.
>
> If you want to contribute or have time to help contribute to S3FileIO,
> that is the path I would recommend. As for configuration, I would say a
> lot of it comes down to how to configure the AWS S3 Client that you
> provide to the S3FileIO implementation, but a lot of the defaults are
> reasonable (you might want to tweak a few like max connections and maybe
> the retry policy). The recently committed work to dynamically load your
> FileIO should make it relatively easy to test out and we'd love to have
> extra eyes and feedback on it.
>
> Let me know if that helps,
> -Dan
>
>
>
> On Wed, Nov 11, 2020 at 1:45 PM John Clara <[email protected]> wrote:
>
> > Hello all,
> >
> > Thank you all for creating/continuing this great project! I am just
> > starting to get comfortable with the fundamentals and I'm thinking that
> > my team has been using Iceberg the wrong way at the FileIO level.
> >
> > I was wondering if people would be willing to share how they set up
> > their FileIO/FileSystem with S3 and any customizations they had to add.
> >
> > (Preferably from smaller teams. My team is small and cannot
> > realistically customize everything. If there's an up to date thread
> > discussing this that I missed, please link me that instead.)
> >
> > ***** My team's specific problems/setup which you can ignore ***
> >
> > My team has been using Hadoop FileIO with the S3AFileSystem. Jars are
> > provided by AWS EMR 5.23 which is on Hadoop 2.8.5. We use DynamoDB for
> > atomic renames by implementing Iceberg's provided interfaces. We
> > read/write from either Spark in EMR or on-prem JVM's in docker
> > containers (managed by k8s). Both use s3a, but the EMR clusters have
> > HDFS (backed by core nodes) for the s3a buffered writes while the
> > on-prem containers use the docker container's default file system which
> > uses an overlay2 storage driver (that I know nothing about).
> >
> > Hadoop 2.8.5's S3AFileSystem does a bunch of unnecessary get and list
> > requests which is well known in the community (but not to my team
> > unfortunately). There's also GET PUT GET inconsistency issues with S3
> > that have been talked about, but I don't yet understand how they arise
> > in the 2.8.5 S3AFilesystem
> > (https://github.com/apache/iceberg/issues/1398).
> >
> > *** End of specific ***
> >
> >
> > The options I'm seeing are:
> >
> > 1. Using Iceberg's new S3 FileIO. Is anyone using this in prod?
> >
> > This still seems very new unless it is actually based on Netflix's
> > prod implementation that they're releasing to the community? (I'm
> > wondering if it's safe to start moving onto it in prod in the near
> > term. If Netflix is using it (or rolling it out) that would be more
> > than enough for my team.)
> >
> > 2. Using a newer hadoop version and use the S3AFileSystem. Any
> > recommendations on a version and are you also using S3Guard?
> >
> > From a quick look, most gains compared to older versions seem to be
> > from S3Guard. Are there substantial gains without it? (My team doesn't
> > have experience with S3Guard and Iceberg seems to not need it outside
> > of atomic renames?)
> >
> > 3. Using an alternative hadoop file system. Any recommendations?
> >
> > In the recent Iceberg S3 FileIO, the License states it was based off
> > the Presto FileSystem. Has anyone used this file system as is with
> > Iceberg?
> > (https://github.com/apache/iceberg/blob/master/LICENSE#L251)
> >
> > 4. Roll our own hadoop file system. Anyone have stories/blogs about
> > pitfalls or difficulties?
> >
> > rdblue hints that Netflix has already done this:
> > https://github.com/apache/iceberg/issues/1398#issuecomment-682837392 .
> > (My team probably doesn't have the capacity for this)
> >
> >
> > Places where I tried looking for this info:
> >
> > - https://github.com/apache/iceberg/issues/761 (issue for getting
> > started guide)
> > - https://iceberg.apache.org/spec/#file-system-operations
> >
> > Thanks everyone,
> >
> > John Clara
> >
>
