FWIW, I'm pretty sure this
<https://github.com/GoogleCloudPlatform/bigdata-interop/tree/master/util-hadoop>
is Google's gs connector for Hadoop, and I think hadoop-aws
<https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/2.6.0> should
work for s3. Azure's is documented here
<https://hadoop.apache.org/docs/stable2/hadoop-azure/index.html>.
So if we go with Hadoop's FileSystem interface, we are already compatible with
hdfs, gs, s3, and azure.
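
To make the compatibility point concrete: the usual mechanism is dispatch on the URI scheme, so each connector only has to register itself under its own scheme (hdfs, gs, s3, and so on). Here is a minimal sketch of that idea using only java.nio. It is not Hadoop's or Beam's API. SchemeDispatchSketch, SimpleFileSystem, LocalFileSystem, register, forUri, and roundTrip are names made up for illustration, and only a local "file" implementation is included.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: none of these type or method names come from Beam or Hadoop. */
public class SchemeDispatchSketch {

  /** Minimal contract: a writable channel for create, a seekable channel for read. */
  interface SimpleFileSystem {
    WritableByteChannel create(URI uri) throws IOException;
    SeekableByteChannel open(URI uri) throws IOException;
  }

  /** Local implementation backed by java.nio.file; handles "file" URIs only. */
  static final class LocalFileSystem implements SimpleFileSystem {
    @Override
    public WritableByteChannel create(URI uri) throws IOException {
      return Files.newByteChannel(
          Paths.get(uri),
          StandardOpenOption.CREATE,
          StandardOpenOption.TRUNCATE_EXISTING,
          StandardOpenOption.WRITE);
    }

    @Override
    public SeekableByteChannel open(URI uri) throws IOException {
      return Files.newByteChannel(Paths.get(uri), StandardOpenOption.READ);
    }
  }

  /** Scheme -> implementation. A gs or s3 connector would register itself here. */
  private static final Map<String, SimpleFileSystem> REGISTRY = new ConcurrentHashMap<>();

  static {
    register("file", new LocalFileSystem());
  }

  static void register(String scheme, SimpleFileSystem fs) {
    REGISTRY.put(scheme, fs);
  }

  /** Picks the implementation from the URI scheme; fails loudly if none is registered. */
  static SimpleFileSystem forUri(URI uri) {
    String scheme = uri.getScheme();
    SimpleFileSystem fs = (scheme == null) ? null : REGISTRY.get(scheme);
    if (fs == null) {
      throw new IllegalArgumentException("No file system registered for scheme: " + scheme);
    }
    return fs;
  }

  /** Writes text to the URI, then reads it back through the same file system. */
  static String roundTrip(URI uri, String text) throws IOException {
    SimpleFileSystem fs = forUri(uri);
    try (WritableByteChannel out = fs.create(uri)) {
      ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
      while (buf.hasRemaining()) {
        out.write(buf);
      }
    }
    try (SeekableByteChannel in = fs.open(uri)) {
      ByteBuffer buf = ByteBuffer.allocate((int) in.size());
      while (buf.hasRemaining() && in.read(buf) >= 0) {
        // keep reading until the buffer is full or end of file is reached
      }
      return new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("sketch", ".txt");
    System.out.println(roundTrip(tmp.toUri(), "hello"));
    Files.delete(tmp);
  }
}
```

The caller only ever sees a URI. Which backend handles it is decided by whatever is registered for the scheme, and that is why reusing an interface that already has connectors is attractive.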

On Thu, Nov 17, 2016 at 9:19 PM Pei He <[email protected]> wrote:

> Hi JB,
> My proposals are based on the current IOChannelFactory and how it is
> used in FileBasedSink.
>
> Let me spend more time investigating the Hadoop FileSystem interface.
> --
> Pei
>
> On Thu, Nov 17, 2016 at 1:21 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
> > By the way, Pei, for the record: why introduce BeamFileSystem instead of
> > using the Hadoop FileSystem interface?
> >
> > Thanks
> > Regards
> > JB
> >
> > On 11/17/2016 01:09 AM, Pei He wrote:
> >
> >> Hi,
> >>
> >> I am working on BEAM-59
> >> <https://issues.apache.org/jira/browse/BEAM-59> "IOChannelFactory
> >> redesign". The goals are:
> >>
> >> 1. Support file-based IOs (TextIO, AvroIO) with user-defined file systems.
> >>
> >> 2. Support configuring any user-defined file system.
> >>
> >> And, I drafted the design proposal in two parts to address them in order:
> >>
> >> Part 1: IOChannelFactory Redesign
> >> <https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#>
> >>
> >> Summary:
> >>
> >> Old API: WritableByteChannel create(String spec, String mimeType);
> >>
> >> New API: WritableByteChannel create(URI uri, CreateOptions options);
> >>
> >> Noticeable proposed changes:
> >>
> >>    1. Include the options parameter in most methods to specify behaviors.
> >>    2. Replace String with URI to include the scheme for file/directory
> >>    locations.
> >>    3. Require file systems to provide a SeekableByteChannel for read.
> >>    4. Add methods such as getMetadata(), rename(), etc.
> >>
> >> Part 2: Configurable BeamFileSystem
> >> <https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs>
> >>
> >> Summary:
> >>
> >> Old API: IOChannelUtils.getFactory(glob).match(glob);
> >>
> >> New API: BeamFileSystems.getFileSystem(glob, config).match(glob);
> >>
> >>
> >> Looking for comments and feedback.
> >>
> >> Thanks
> >>
> >> --
> >>
> >> Pei
> >>
> >>
> > --
> > Jean-Baptiste Onofré
> > [email protected]
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>
