Update: I think I'm wrong about the listing part. I think it will only do the HEAD request. Also it seems like the consistency issue is probably not something my team would encounter with our current jobs.

On 2020/11/12 02:17:10, John Clara <j...@gmail.com> wrote:
> (Not sure if this is actually replying or just starting a new thread)>
>
> Hi Daniel,>
>
> Thanks for the response! It's very helpful and answers a lot my questions.>
>
> A couple follow ups:>
>
> One of my concerns with S3FileIO is getting tied too much to a single >
> cloud provider. I'm wondering if an ObjectStoreFileIO would be helpful >
> so that S3FileIO and (a future) GCSFileIO could share logic? I haven't >
> looked deep enough into the S3FileIO to know how much logic is not s3 >
> specific. Maybe the FileIO interface is enough.>
>
> About consistency (no need to respond here):>
> I'm seeing that during "getFileStatus" my version of s3a does some list >
> requests (but I'm not sure if that could fail from consistency issues).>
> I'm also confused about the read-after-(initial) write part:>
> "Amazon S3 provides read-after-write consistency for PUTS of new objects > > in your S3 bucket in all Regions with one caveat. The caveat is that if >
> you make a HEAD or GET request to a key name before the object is >
> created, then create the object shortly after that, a subsequent GET >
> might not return the object due to eventual consistency. - >
> https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html";>
>
> When my version of s3a does a create, it first does a getMetadataRequest > > (HEAD) to check if the object exists before creating the object. I think >
> this is talked about in this issue: >
> https://github.com/apache/iceberg/issues/1398 and talked about in the >
> S3FileIO PR: https://github.com/apache/iceberg/pull/1573. I'll follow up >
> in that issue for more info.>
>
> John>
>
>
> On 2020/11/12 00:36:10, Daniel Weeks <d....@netflix.com.INVALID> wrote:>
> > Hey John, I might be able to help answer some of your questions and >
> provide>>
> > some context around how you might want to go forward.>>
> >>
> > So, one fundamental aspect of Iceberg is that it only relies on a few>> > > operations (as defined by the FileIO interface). This makes much of the>>
> > functionality and complexity of full file system implementations>>
> > unnecessary. You should not need features like S3Guard or additional S3>>
> > operations these implementations rely on in order to achieve file >
> system>>
> > contract behavior. Consistency issues should also not be a problem >
> since>>
> > Iceberg does not overwrite or list and read-after-(initial)write is a>>
> > guarantee provided by S3.>>
> >>
> > At Netflix, we use a custom FileSystem implementation (somewhat like >
> S3A),>>
> > but with much of the contract behavior that drives additional >
> operations>>
> > against S3 disabled. However, we are transitioning to a more native>>
> > implementation of S3FileIO, which you'll see as part of the ongoing >
> work in>>
> > Iceberg.>>
> >>
> > Per your specific questions:>>
> >>
> > 1) The S3FileIO implementation is very new, though internally we have>>
> > something very similar. There are features missing that we are >
> working to>>
> > add (e.g. progressive multipart upload for large files is likely the >
> most>>
> > important).>>
> > 2) You can use S3AFileSystem with the HadoopFileIO implementation, >
> though>>
> > you may still see similar behavior with additional calls being made (I>>
> > don't know if these can be disabled).>>
> > 3) The PrestoS3FileSystem is tailored to Presto's use and is likely >
> not as>>
> > complete as S3A, but seeing as it is using the Hadoop FileSystem api, >
> it>>
> > would likely work for what HadoopFileIO exercises (as would the>>
> > EMRFileSystem).>>
> > 4) I would probably discourage you from writing your own file system >
> as the>>
> > S3FileIO will likely be a more optimized implementation for what >
> Iceberg>>
> > needs.>>
> >>
> > If you want to contribute or have time to help contribute to >
> S3FileIO, that>>
> > is the path I would recommend. As for configuration, I would say a >
> lot of>>
> > it comes down to how to configure the AWS S3 Client that you provide >
> to the>>
> > S3FileIO implementation, but a lot of the defaults are reasonable (you>>
> > might want to tweak a few like max connections and maybe the retry >
> policy).>>
> >>
> > The recently committed work to dynamically load your FileIO should >
> make it>>
> > relatively easy to test out and we'd love to have extra eyes and >
> feedback>>
> > on it.>>
> >>
> > Let me know if that helps,>>
> > -Dan>>
> >>
> >>
> >>
> > On Wed, Nov 11, 2020 at 1:45 PM John Clara <jo...@gmail.com>>>
> > wrote:>>
> >>
> > > Hello all,>>
> > >>>
> > > Thank you all for creating/continuing this great project! I am just>>
> > > starting to get comfortable with the fundamentals and I'm thinking >
> that my>>
> > > team has been using Iceberg the wrong way at the FileIO level.>>
> > >>>
> > > I was wondering if people would be willing to share how they set up >
> their>>
> > > FileIO/FileSystem with S3 and any customizations they had to add.>>
> > >>>
> > > (Preferably from smaller teams. My team is small and cannot>>
> > > realistically customize everything. If there's an up to date thread>>
> > > discussing this that I missed, please link me that instead.)>>
> > >>>
> > > ***** My team's specific problems/setup which you can ignore ***>>
> > >>>
> > > My team has been using Hadoop FileIO with the S3AFileSystem. Jars are>>
> > > provided by AWS EMR 5.23 which is on Hadoop 2.8.5. We use DynamoDB >
> for>>
> > > atomic renames by implementing Iceberg's provided interfaces. We >
> read/write>>
> > > from either Spark in EMR or on-prem JVM's in docker containers >
> (managed by>>
> > > k8s). Both use s3a, but the EMR clusters have HDFS (backed by core >
> nodes)>>
> > > for the s3a buffered writes while the on-prem containers use the >
> docker>>
> > > container's default file system which uses an overlay2 storage >
> driver (that>>
> > > I know nothing about).>>
> > >>>
> > > Hadoop 2.8.5's S3AFileSystem does a bunch of unnecessary get and list>>
> > > requests which is well known in the community (but not to my team>>
> > > unfortunately). There's also GET PUT GET inconsistency issues with >
> S3 that>>
> > > have been talked about, but I don't yet understand how they arise >
> in the>>
> > > 2.8.5 S3AFilesystem (https://github.com/apache/iceberg/issues/1398).>>
> > >>>
> > > *** End of specific ***>>
> > >>>
> > >>>
> > > The options I'm seeing are:>>
> > >>>
> > > 1. Using Iceberg's new S3 FileIO. Is anyone using this in prod?>>
> > >>>
> > > This still seems very new unless it is actually based on Netflix's>>
> > > prod implementation that they're releasing to the community? (I'm >
> wondering>>
> > > if it's safe to start moving onto it in prod in the near term. If >
> Netflix>>
> > > is using it (or rolling it out) that would be more than enough for >
> my team.)>>
> > >>>
> > > 2. Using a newer hadoop version and use the S3AFileSystem. Any>>
> > > recommendations on a version and are you also using S3Guard?>>
> > >>>
> > > From a quick look, most gains compared to older versions seem to be>>
> > > from S3Guard. Are there substantial gains without it? (My team >
> doesn't have>>
> > > experience with S3Guard and Iceberg seems to not need it outside of >
> atomic>>
> > > renames?)>>
> > >>>
> > > 3. Using an alternative hadoop file system. Any recommendations?>>
> > >>>
> > > In the recent Iceberg S3 FileIO, the License states it was based off>>
> > > the Presto FileSystem. Has anyone used this file system as is with >
> Iceberg?>>
> > > (https://github.com/apache/iceberg/blob/master/LICENSE#L251)>>
> > >>>
> > > 4. Roll our own hadoop file system. Anyone have stories/blogs about>>
> > > pitfalls or difficulties?>>
> > >>>
> > > rdblue hints that Netflix already done this:>>
> > > >
> https://github.com/apache/iceberg/issues/1398#issuecomment-682837392 .>>
> > > (My team probably doesn't have the capacity for this)>>
> > >>>
> > >>>
> > > Places where I tried looking for this info:>>
> > >>>
> > > - https://github.com/apache/iceberg/issues/761 (issue for getting>>
> > > started guide)>>
> > > - https://iceberg.apache.org/spec/#file-system-operations>>
> > >>>
> > > Thanks everyone,>>
> > >>>
> > > John Clara>>
> > >>>
> >>
>

Reply via email to