Stephen Sisk created BEAM-2081:
----------------------------------

             Summary: I/O Authoring overview - better clarify how to read from files
                 Key: BEAM-2081
                 URL: https://issues.apache.org/jira/browse/BEAM-2081
             Project: Beam
          Issue Type: Improvement
          Components: website
            Reporter: Stephen Sisk
            Assignee: Davor Bonaci
            Priority: Minor

The I/O authoring doc is somewhat confusing: it has an example of reading from file globs and says to use ParDos, but then mentions "A class derived from FileBasedSource is often the best option when reading from files".

It'd be nice to clarify this and provide guidance on when to use which. I *think* the right answer here is that if your file is splittable you use FileBasedSource (FBS) and let it handle the glob splitting, and if it's not splittable you use ParDos. I believe SDF (Splittable DoFn) will make all this easier.

cc [~kirpichov] [~dhalp...@google.com]
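To make the splittable vs. non-splittable distinction concrete, here is a rough sketch in plain Python. It deliberately does not use the Beam API: `plan_reads`, `read_range`, and `demo` are hypothetical names invented for this illustration, and the fixed `chunk_size` is an assumption. The idea is only that a splittable file can be cut into byte ranges that are read independently (roughly what a FileBasedSource-style source does after expanding the glob), while a non-splittable file yields a single unit that one worker reads in full (the ParDo-style approach).

```python
import glob
import os
import tempfile


def plan_reads(pattern, splittable, chunk_size):
    """Expand a glob and return (path, start, end) read units.

    Hypothetical helper, not a Beam API. Splittable files are cut into
    byte ranges; non-splittable files produce one unit covering the file.
    A real source would also have to align ranges to record boundaries.
    """
    units = []
    for path in sorted(glob.glob(pattern)):
        size = os.path.getsize(path)
        if not splittable or size == 0:
            units.append((path, 0, size))
            continue
        for start in range(0, size, chunk_size):
            units.append((path, start, min(start + chunk_size, size)))
    return units


def read_range(unit):
    """Read the bytes of one unit; stands in for per-element work in a ParDo."""
    path, start, end = unit
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start)


def demo():
    """Build two small files and compare the two read plans."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, data in [("a.txt", b"0123456789"), ("b.txt", b"abcde")]:
            with open(os.path.join(tmp, name), "wb") as f:
                f.write(data)
        pattern = os.path.join(tmp, "*.txt")
        whole = plan_reads(pattern, splittable=False, chunk_size=4)
        ranged = plan_reads(pattern, splittable=True, chunk_size=4)
        content = b"".join(read_range(u) for u in ranged)
        return len(whole), len(ranged), content
```

In this sketch the non-splittable plan has one unit per matched file, so parallelism is bounded by the number of files, whereas the splittable plan produces more, smaller units from the same glob.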