Hi all,

I am looking for a way to read a large file and generate the following three
files:
1. the header
2. column #1 from all lines
3. column #2 from all lines

I use a DoFn to extract the values. How can I redirect the output to three
different files? My thought was something like this:

One DoFn that iterates over every line and returns:
1. header file => the first line, and an empty string for lines 2..n
2. column #1 file => the column #1 value for lines 1..n
3. column #2 file => the column #2 value for lines 1..n

What is the correct way to split the output between files? Should the DoFn
return a tuple? Or should I process each line in three different DoFns instead
of one?
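
To make the single-DoFn idea concrete, here is a rough sketch in plain Python
(not the Beam API) of what I have in mind: one pass over the lines, where each
emitted value carries a tag naming the file it belongs to. The names
split_line and route, the tags 'header', 'col1' and 'col2', and the comma
delimiter are all assumptions for illustration only. In the Beam Python SDK I
believe the analogous mechanism is tagged outputs (beam.pvalue.TaggedOutput
together with ParDo(...).with_outputs(...)), but I have not verified that this
is the recommended approach for my case.

```python
def split_line(index, line, delimiter=","):
    """Yield (tag, value) pairs for one input line.

    Hypothetical helper: the tags and the comma delimiter are assumptions.
    """
    # Only the first line is routed to the header output.
    if index == 0:
        yield ("header", line)
    fields = line.split(delimiter)
    # Column #1 and column #2 are emitted for every line, header included.
    yield ("col1", fields[0])
    if len(fields) > 1:
        yield ("col2", fields[1])


def route(lines):
    """Collect tagged values into one list per intended output file."""
    outputs = {"header": [], "col1": [], "col2": []}
    for index, line in enumerate(lines):
        for tag, value in split_line(index, line.rstrip("\n")):
            outputs[tag].append(value)
    return outputs


if __name__ == "__main__":
    sample = ["id,name\n", "1,a\n", "2,b\n"]
    for tag, values in route(sample).items():
        print(tag, values)
```

With tags, the header output does not need an empty string for lines 2..n,
since a line simply emits nothing for an output it does not contribute to.
Note that this sketch relies on knowing each line's index, which a
distributed pipeline may not provide directly.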

Thanks,
Eila

-- 
Eila
<http://www.orielresearch.com>
Meetup <https://www.meetup.com/Deep-Learning-In-Production/>
