[ https://issues.apache.org/jira/browse/BEAM-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davor Bonaci reassigned BEAM-2751:
----------------------------------

    Assignee: Eugene Kirpichov  (was: Davor Bonaci)

> Write PCollection elements to individual files
> ----------------------------------------------
>
>                 Key: BEAM-2751
>                 URL: https://issues.apache.org/jira/browse/BEAM-2751
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Christopher Hebert
>            Assignee: Eugene Kirpichov
>
> I'd like to write elements as individual files.
> Rather than smashing thousands of outputs into a handful of files as TextIO
> does (output-00000-of-00005, output-00001-of-00005, ...), I want to write each
> element to its own file.
> So if I used WholeFileIO from [BEAM-2750] to read in three files (hi.txt,
> what.txt, and yes.txt), then I'd like to write the processed contents out to
> individual files with user-defined or data-defined filenames (like
> hi-modified.txt, what-modified.txt, and yes-modified.txt).
> With a WholeFileIO, this would look like:
> {code:java}
> PCollection<KV<String, byte[]>> fileNamesAndBytes = p.apply("Read",
>     WholeFileIO.read().from("/path/to/input/dir/*"));
> ...
> // Do stuff that changes contents and file names
> PCollection<KV<String, byte[]>> modifiedFileNamesAndBytes = ...
> ...
> modifiedFileNamesAndBytes.apply("Write",
>     WholeFileIO.write().to("/path/to/output/dir/"));
> {code}
> This ticket complements [BEAM-2750].

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
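
For illustration only, the per-element write semantics requested in the issue can be sketched in plain Java, outside of Beam. This is not the proposed WholeFileIO and uses no Beam API. The class name PerElementFileWriter and the method writeEach are hypothetical, and the sketch assumes each element is a (file name, bytes) pair whose file name is relative to the output directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT part of Beam: writes each (file name, bytes)
// pair to its own file under a single output directory.
final class PerElementFileWriter {
    private PerElementFileWriter() {}

    // Returns the paths written, in the iteration order of the input map.
    static List<Path> writeEach(Path outputDir, Map<String, byte[]> namedContents)
            throws IOException {
        Path root = outputDir.toAbsolutePath().normalize();
        Files.createDirectories(root);

        List<Path> written = new ArrayList<>();
        for (Map.Entry<String, byte[]> element : namedContents.entrySet()) {
            // The file name comes from the data, so reject names that would
            // resolve to the output directory itself or escape it ("../x").
            Path target = root.resolve(element.getKey()).normalize();
            if (target.equals(root) || !target.startsWith(root)) {
                throw new IllegalArgumentException(
                        "File name escapes output directory: " + element.getKey());
            }
            // Allow data-defined names that contain subdirectories.
            Files.createDirectories(target.getParent());
            // One element, one file; an existing file is overwritten.
            Files.write(target, element.getValue());
            written.add(target);
        }
        return written;
    }
}
```

Calling PerElementFileWriter.writeEach with entries named hi-modified.txt, what-modified.txt, and yes-modified.txt would produce three separate files rather than a fixed number of shards. A real Beam transform would additionally have to handle distributed workers, retries, and non-local filesystems, none of which this sketch attempts.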