Fantastic.

Big +1 for this.

Regards
JB

On 09/07/2017 03:44 AM, Eugene Kirpichov wrote:
Hi,

Please take a look at the following proposal.

I believe, together with the (already available) FileIO.match() and
FileIO.readMatches() this proposal will empower Beam users to address all
use cases of file-based IO I'm aware of - which makes me quite excited.

http://s.apache.org/fileio-write

*We propose a new API for writing files in Beam: FileIO.write(). It is more
modular and cleaner to code against than FileBasedSink, and aims to
completely replace it.*

*FileIO.write() lets an IO author implement only logic and configuration
specific to a particular file format (e.g. Avro) and automatically get all
format-agnostic features, such as sharding, cleanup, windowed writes,
DynamicDestinations, compression, returning the successfully written
filenames, etc.*

TL;DR:

FileIO.write(FileSink<DestT, InputT> { open(dest), write(input), close() })
       .to(input → dest)
       .withFilenamePolicy(dest → prefix, shard pattern)
       .withEverythingElse() // like in WriteFiles


--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com

Reply via email to