Did you see that I modified the second proposal so that users can map
DestinationT to the actual PTransform (i.e. DestinationT->TextIO or
DestinationT->AvroIO). This means that users do not have to deal with
FileBasedSink or even know it exists.

I prefer the second approach for two reason:

1. It allows customizing some useful things that the FilenamePolicy does
not. e.g. it's very reasonable to want to customize the output directory
and have a different number output shards for each directory. If the
function returns a TextIO or AvroIO they can do that. If there's simply a
mapping to a FilenamePolicy, the can't do that.

2. The majority of users don't need to deal with DefaultFilenamePolicy
today. Allowing them to use the TextIO etc. builders for this will be
more-familiar than the DefaultFilenamePolicy.Config option suggested.

On Wed, May 24, 2017 at 10:59 AM, Kenneth Knowles <[email protected]>
wrote:

> I commented a little in the doc I want to reply on list because this is a
> really great feature.
>
> The two alternatives, as I understand them, both include mapping your
> elements to an intermediate DestinationT that you can group by before
> writing. Then the big picture decision is whether to map each DestinationT
> to a different FilenamePolicy (which may need to be made more powerful) or
> map each DestinationT to a different FileBasedSink.
>
> I think both are reasonable, modulo pitfalls that I'm probably glossing
> over. I favor the FilenamePolicy version a bit, because it is focused just
> on the file names, whereas the FileBasedSink version seems a bit
> overpowered for the use case. The other consideration is that
> FilenamePolicy is intended for user consumption, while FileBasedSink is not
> so much.
>
> Kenn
>
> On Thu, May 18, 2017 at 10:31 PM, Reuven Lax <[email protected]>
> wrote:
>
> > While Beam now supports file-based sinks that can depend on the current
> > window, I've seen interest in value-dependent sinks as well (and there's
> a
> > long-standing JIRA for this). I wrote up a short API proposal for this
> for
> > discussion on the list.
> >
> > https://docs.google.com/document/d/1Bd9mJO1YC8vOoFObJFupVURBMCl7j
> > Wt6hOgw6ClwxE4/edit?usp=sharing
> >
> > Reuven
> >
>

Reply via email to