Re: help with right transform to read tgz file

2018-12-26 Thread Jeff Klukas
The general approach in your example looks reasonable to me. I don't think there's anything built in to Beam to help with parsing the tar file format and I don't know how robust the method of replacing "^@" and then splitting on newlines will be. I'd likely use Apache's commons-compress library for
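To make the tar-parsing concern concrete, here is a minimal JDK-only sketch. It is not the commons-compress approach mentioned above and not code from the thread; it is a hand-rolled stand-in that shows why NUL bytes (displayed as "^@") appear when a tar stream is read as plain text. The class name TgzCsvReader and the method readCsvEntries are hypothetical. It assumes Java 9+, plain ustar headers, entries under 2 GB, and file names of at most 100 bytes; GNU long names, PAX headers, and base-256 size fields are not handled, which is a good reason to prefer a real tar library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: extracts the lines of every .csv entry in a .tgz stream.
final class TgzCsvReader {
    private static final int BLOCK = 512; // tar headers and data are padded to 512-byte blocks

    /** Maps each regular ".csv" entry name to its lines, in archive order. */
    static Map<String, List<String>> readCsvEntries(InputStream tgz) throws IOException {
        Map<String, List<String>> result = new LinkedHashMap<>();
        try (InputStream tar = new GZIPInputStream(tgz)) {
            byte[] header = new byte[BLOCK];
            while (tar.readNBytes(header, 0, BLOCK) == BLOCK) {
                if (header[0] == 0) {
                    break; // an empty name is treated as the end-of-archive marker
                }
                String name = field(header, 0, 100);
                String sizeText = field(header, 124, 12).trim(); // size is octal ASCII
                long size = sizeText.isEmpty() ? 0 : Long.parseLong(sizeText, 8);
                if (size > Integer.MAX_VALUE) {
                    throw new IOException("Entry too large for this sketch: " + name);
                }
                byte type = header[156]; // '0' or NUL means a regular file

                byte[] data = new byte[(int) size];
                if (tar.readNBytes(data, 0, data.length) != data.length) {
                    throw new IOException("Truncated entry: " + name);
                }
                // Skip the NUL padding that rounds the entry up to a full block.
                int padding = (int) ((BLOCK - size % BLOCK) % BLOCK);
                tar.readNBytes(new byte[padding], 0, padding);

                boolean regular = type == '0' || type == 0;
                if (regular && name.endsWith(".csv")) {
                    String text = new String(data, StandardCharsets.UTF_8);
                    List<String> lines = text.isEmpty()
                            ? new ArrayList<>()
                            : new ArrayList<>(Arrays.asList(text.split("\\R")));
                    result.put(name, lines);
                }
            }
        }
        return result;
    }

    // Reads a NUL-terminated ASCII field from a header block.
    private static String field(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            readCsvEntries(in).forEach((name, lines) ->
                    System.out.println(name + ": " + lines.size() + " lines"));
        }
    }
}
```

Because the parser reads each entry by its declared size and skips the padding explicitly, the CSV lines come out without stray NUL characters, so no "^@" replacement step is needed. In a Beam pipeline a helper like this could presumably be called from a DoFn that receives each file's contents, though that wiring is not shown here.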

help with right transform to read tgz file

2018-12-21 Thread Sridevi Nookala
Hi, I am a newbie to Apache Beam. I am trying to write a simple pipeline using the Apache Beam Java SDK. The pipeline will read a bunch of tgz files. Each tgz file has multiple CSV files with data. public static final void main(String args[]) throws Exception { PipelineOptions options = PipelineO