I received a lot of comments on "Part 1: IOChannelFactory
Redesign" [1], and I have updated the design based on that feedback.

Now I feel it is close to being ready for implementation, and I would like to
summarize the changes:
1. Replaced FilePath with URI for resolving file paths.
2. Required match(String spec) to handle ambiguities in user-provided
strings (see the match() Javadoc in the design doc for details).
3. Changed Metadata to use the Future.get() paradigm, and removed exception().
4. Changed methods on the FileSystem interface to be protected (visible to
implementors), and created a FileSystems utility (visible to callers).
5. Simplified the FileSystem interface by moving operation options, such as
DeleteOptions and MatchOptions, to the FileSystems utility.
6. Simplified the FileSystem interface by requiring certain behaviors, such as
creating recursively and throwing for missing files.
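
To make items 1 and 4-6 concrete, here is a rough Java sketch of how the split
between FileSystem and FileSystems could look. It is only an illustration, not
the API in the design doc: the method signatures, the shape of DeleteOptions,
the register() method, and the InMemoryFileSystem class are all assumptions
made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Small demo that only talks to the caller-facing utility.
class FileSystemsDemo {
    public static void main(String[] args) throws IOException {
        FileSystems.register("mem", new InMemoryFileSystem());
        URI uri = URI.create("mem://bucket/output.txt");
        try (WritableByteChannel channel = FileSystems.create(uri)) {
            channel.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
        }
        FileSystems.delete(uri, DeleteOptions.FAIL_IF_MISSING); // file exists
        FileSystems.delete(uri, DeleteOptions.IGNORE_MISSING);  // already gone, no error
    }
}

// Hypothetical option type; it is interpreted by the utility, not by the
// FileSystem implementations (item 5).
enum DeleteOptions { FAIL_IF_MISSING, IGNORE_MISSING }

// Hypothetical SPI. Methods are protected: visible to implementors and to the
// same-package FileSystems utility, but not to callers in other packages (item 4).
abstract class FileSystem {
    protected abstract WritableByteChannel create(URI uri) throws IOException;

    // Required behavior (item 6): always throw if the file is missing.
    protected abstract void delete(URI uri) throws IOException;
}

// Hypothetical caller-facing utility: picks a FileSystem by URI scheme (item 1)
// and applies the operation options itself.
final class FileSystems {
    private static final Map<String, FileSystem> REGISTRY = new HashMap<>();

    private FileSystems() {}

    static void register(String scheme, FileSystem fs) {
        REGISTRY.put(scheme, fs);
    }

    static WritableByteChannel create(URI uri) throws IOException {
        return forUri(uri).create(uri);
    }

    static void delete(URI uri, DeleteOptions options) throws IOException {
        try {
            forUri(uri).delete(uri);
        } catch (FileNotFoundException e) {
            if (options != DeleteOptions.IGNORE_MISSING) {
                throw e;
            }
        }
    }

    private static FileSystem forUri(URI uri) throws IOException {
        FileSystem fs = REGISTRY.get(uri.getScheme());
        if (fs == null) {
            throw new IOException("No file system registered for: " + uri);
        }
        return fs;
    }
}

// Toy in-memory implementation, only to exercise the sketch.
final class InMemoryFileSystem extends FileSystem {
    private final Map<URI, ByteArrayOutputStream> files = new HashMap<>();

    @Override
    protected WritableByteChannel create(URI uri) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        files.put(uri, out);
        return Channels.newChannel(out);
    }

    @Override
    protected void delete(URI uri) throws IOException {
        if (files.remove(uri) == null) {
            throw new FileNotFoundException(uri.toString());
        }
    }
}
```

With this shape, an implementor only has to provide the strict behavior
(always throw for a missing file), and the lenient variants are handled once,
in the utility, for every file system.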

Any thoughts / feedback?
--
Pei

[1]
https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#

On Wed, Nov 30, 2016 at 1:32 PM, Pei He <[email protected]> wrote:

> Thanks JB for the feedback.
>
> Yes, we should provide a hadoop.fs.FileSystem adaptor. As you said, it
> will make a range of file systems available in Beam.
>
> And people can choose to implement BeamFileSystem directly to get the
> best performance (for example, by providing bulk operations).
>
> --
> Pei
>
>
>
> On Tue, Nov 29, 2016 at 11:11 AM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
>> Hi Pei,
>>
>> Thinking about this again, I understand that the purpose of the Beam
>> filesystem is to avoid bringing a bunch of dependencies into the core. That
>> makes perfect sense.
>>
>> So, I agree that a Beam filesystem abstraction is fine.
>>
>> My point is that we should provide a HadoopFilesystem extension/plugin
>> for Beam filesystem asap: that would help us to support a good range of
>> filesystems quickly.
>>
>> Just my $0.01 ;)
>>
>> Regards
>> JB
>>
>>
>> On 11/17/2016 08:18 PM, Pei He wrote:
>>
>>> Hi JB,
>>> My proposals are based on the current IOChannelFactory and how it is
>>> used in FileBasedSink.
>>>
>>> Let me spend more time investigating the Hadoop FileSystem interface.
>>> --
>>> Pei
>>>
>>> On Thu, Nov 17, 2016 at 1:21 AM, Jean-Baptiste Onofré <[email protected]>
>>> wrote:
>>>
>>>> By the way, Pei, for the record: why introduce BeamFileSystem rather than
>>>> use the Hadoop FileSystem interface?
>>>>
>>>> Thanks
>>>> Regards
>>>> JB
>>>>
>>>> On 11/17/2016 01:09 AM, Pei He wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am working on BEAM-59
>>>>> <https://issues.apache.org/jira/browse/BEAM-59> "IOChannelFactory
>>>>> redesign". The goals are:
>>>>>
>>>>> 1. Support file-based IOs (TextIO, AvroIO) with user-defined file
>>>>> systems.
>>>>>
>>>>> 2. Support configuring any user-defined file system.
>>>>>
>>>>> And, I drafted the design proposal in two parts to address them in
>>>>> order:
>>>>>
>>>>> Part 1: IOChannelFactory Redesign
>>>>> <https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#>
>>>>>
>>>>> Summary:
>>>>>
>>>>> Old API: WritableByteChannel create(String spec, String mimeType);
>>>>>
>>>>> New API: WritableByteChannel create(URI uri, CreateOptions options);
>>>>>
>>>>> Notable proposed changes:
>>>>>
>>>>>    1. Include an options parameter in most methods to specify behaviors.
>>>>>    2. Replace String with URI to include the scheme in file/directory
>>>>>       locations.
>>>>>    3. Require file systems to provide a SeekableByteChannel for reads.
>>>>>    4. Add methods such as getMetadata(), rename(), etc.
>>>>>
>>>>>
>>>>> Part 2: Configurable BeamFileSystem
>>>>> <https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs>
>>>>>
>>>>> Summary:
>>>>>
>>>>> Old API: IOChannelUtils.getFactory(glob).match(glob);
>>>>>
>>>>> New API: BeamFileSystems.getFileSystem(glob, config).match(glob);
>>>>>
>>>>>
>>>>> Looking for comments and feedback.
>>>>>
>>>>> Thanks
>>>>>
>>>>> --
>>>>>
>>>>> Pei
>>>>>
>>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> [email protected]
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>>>
>>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
>
