[ https://issues.apache.org/jira/browse/BEAM-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Halperin resolved BEAM-434.
----------------------------------
       Resolution: Fixed
         Assignee: Thomas Groh  (was: Amit Sela)
    Fix Version/s: 0.2.0-incubating

> Limit the number of output files a beam-examples execution writes
> -----------------------------------------------------------------
>
>                 Key: BEAM-434
>                 URL: https://issues.apache.org/jira/browse/BEAM-434
>             Project: Beam
>          Issue Type: Bug
>          Components: examples-java
>            Reporter: Amit Sela
>            Assignee: Thomas Groh
>            Priority: Minor
>             Fix For: 0.2.0-incubating
>
>
> When using `TextIO.Write.to("/path/to/output")` without any restrictions on
> the number of shards, it might generate many output files (depending on your
> input). For WordCount, for example, you'll get as many output files as there
> are unique words in your input.
> Since I think examples are expected to execute in a friendly manner, so that
> you can "see" what they do rather than optimize for performance in some way,
> I suggest using `withoutSharding()` when writing the example output to an
> output file.
> Examples I could find that behave this way:
> org.apache.beam.examples.WordCount
> org.apache.beam.examples.complete.TfIdf
> org.apache.beam.examples.cookbook.DeDupExample

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
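
For readers unfamiliar with sharded output, the difference the issue describes can be sketched in plain Java without any Beam dependency. This is only an illustration, not Beam's implementation: the class `ShardingSketch`, the helpers `writeSharded` and `writeUnsharded`, and the `prefix-SSSSS-of-NNNNN` file naming are all assumptions made for this sketch. In the examples themselves, the calls named in the issue (`TextIO.Write.to(...)` and `withoutSharding()`) are what control this behavior.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardingSketch {

    // Hypothetical helper: spreads lines round-robin over numShards files
    // named like "output-00000-of-00003" (naming pattern is an assumption).
    static List<Path> writeSharded(Path dir, String prefix, List<String> lines, int numShards)
            throws IOException {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            shards.add(new ArrayList<>());
        }
        for (int i = 0; i < lines.size(); i++) {
            shards.get(i % numShards).add(lines.get(i));
        }
        List<Path> written = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            Path file = dir.resolve(String.format("%s-%05d-of-%05d", prefix, i, numShards));
            Files.write(file, shards.get(i));
            written.add(file);
        }
        return written;
    }

    // Hypothetical helper: the "without sharding" case, one file with every line.
    static Path writeUnsharded(Path dir, String name, List<String> lines) throws IOException {
        Path file = dir.resolve(name);
        Files.write(file, lines);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sharding-sketch");
        List<String> counts = Arrays.asList("beam: 3", "apache: 2", "jira: 1", "shard: 4");

        // One file per shard: easy to parallelize, awkward to inspect by hand.
        for (Path shard : writeSharded(dir, "output", counts, 3)) {
            System.out.println(shard.getFileName() + " -> " + Files.readAllLines(shard));
        }

        // A single file: what the issue proposes for the examples.
        Path single = writeUnsharded(dir, "output.txt", counts);
        System.out.println(single.getFileName() + " -> " + Files.readAllLines(single));
    }
}
```

With many shards, a reader has to open several files to see the complete result, which is the inconvenience the issue raises for example pipelines. A single output file trades write parallelism for readability.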