Hi all,

I am looking for a way to read a large file and generate the following three files:

1. the header
2. column #1 from all lines
3. column #2 from all lines
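
To make the goal concrete, here is a minimal plain-Python sketch (not Beam code) of the three outputs I expect for a tiny sample. The comma delimiter, the sample data, and the helper name `split_outputs` are assumptions for illustration only; my real file may differ.

```python
import csv
import io


def split_outputs(text, delimiter=","):
    """Return (header, column_1, column_2) for delimited text.

    Hypothetical helper for illustration; the delimiter is an assumption.
    """
    # Parse every non-empty line into a list of fields.
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    # Output 1: the header is the first line.
    header = rows[0] if rows else []
    # Output 2: column #1 from all lines (line 1 included).
    column_1 = [row[0] for row in rows]
    # Output 3: column #2 from all lines; empty string if a line is too short.
    column_2 = [row[1] if len(row) > 1 else "" for row in rows]
    return header, column_1, column_2


# Made-up sample data standing in for the large file.
sample = "id,name,score\n1,alice,90\n2,bob,85\n"
header, column_1, column_2 = split_outputs(sample)
print(header)
print(column_1)
print(column_2)
```

Each of the three results would go to its own output file.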

I use a DoFn to extract the values, and I am looking for a way to redirect the output to three different files. My thought was a single DoFn that iterates over every line and returns:

1. header file: the first line for line 1, and an empty string for lines 2..n
2. column #1 file: the column #1 value for lines 1..n
3. column #2 file: the column #2 value for lines 1..n

What is the correct way to split the output between files? Should the DoFn return a tuple? Should I process each line in three different DoFns instead of one?

Thanks,
Eila

--
Eila <http://www.orielresearch.com>
Meetup <https://www.meetup.com/Deep-Learning-In-Production/>