Stephen Sisk created BEAM-2081:
----------------------------------

             Summary: I/O Authoring overview - better clarify how to read from files
                 Key: BEAM-2081
                 URL: https://issues.apache.org/jira/browse/BEAM-2081
             Project: Beam
          Issue Type: Improvement
          Components: website
            Reporter: Stephen Sisk
            Assignee: Davor Bonaci
            Priority: Minor


The I/O authoring doc is a little confusing: it has an example of reading
from file globs and says to use ParDos, but then states "A class derived from
FileBasedSource is often the best option when reading from files".

It'd be nice to better clarify this and provide guidance as to when to use 
which.

I *think* the right answer here is that if your file is splittable you use
FileBasedSource (FBS) and let it handle the glob splitting, and if it's not
splittable you use ParDos.
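
To make the proposed rule concrete, here is a rough sketch in plain Python that does not use Beam at all. It only illustrates the difference in work planning: a non-splittable file becomes one unit of work per matched file (roughly what a ParDo over the expanded glob gives you), while a splittable file can be cut into byte ranges (roughly what a FileBasedSource does for you). The name plan_reads and the bundle_bytes parameter are hypothetical, made up for this illustration, and are not part of any Beam API.

```python
import glob
import os


def plan_reads(pattern, splittable, bundle_bytes=64 * 1024 * 1024):
    """Return (path, start, end) work items for each file matching pattern.

    Hypothetical helper for illustration only; not a Beam API.
    """
    work = []
    for path in sorted(glob.glob(pattern)):
        size = os.path.getsize(path)
        if not splittable or size == 0:
            # Non-splittable (or empty) file: the whole file is one work item,
            # as in the "expand the glob, then ParDo per file" pattern.
            work.append((path, 0, size))
            continue
        # Splittable file: cut it into byte ranges that can be read in
        # parallel, which is the part a file-based source would handle.
        for start in range(0, size, bundle_bytes):
            work.append((path, start, min(start + bundle_bytes, size)))
    return work
```

A real splittable reader would also need to align each byte range to record boundaries, which this sketch leaves out.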

I believe SDF (Splittable DoFn) will make all of this easier.

cc [~kirpichov] [~dhalp...@google.com]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
