Oops, not sure I replied to all, but I'm using ParquetIO:

PCollection<GenericRecord> records = pipeline.apply("Read parquet file in as Generic Records",
    ParquetIO.read(finalSchema).from(beamReadPath).withConfiguration(configuration));
The variable beamReadPath starts with the s3:// prefix, and I set the
initial cre
Yes, I'm using ParquetIO as below:

PCollection<GenericRecord> records = pipeline.apply("Read parquet file in as Generic Records",
    ParquetIO.read(finalSchema).from(beamReadPath).withConfiguration(configuration));
On Fri, Dec 22, 2023 at 10:39 AM XQ Hu via user wrote:
> Can you share some code snippets about how to read from S3? Do you use the
> builtin TextIO?
Can you share some code snippets about how to read from S3? Do you use the
builtin TextIO?
On Fri, Dec 22, 2023 at 11:28 AM Ramya Prasad via user wrote:
> Hello,
>
> I am a developer trying to use Apache Beam, and I have a nuanced problem I
> need help with. I have a pipeline which has to read in 40 million records
> from multiple Parquet files from AWS S3.
Hello,
I am a developer trying to use Apache Beam, and I have a nuanced problem I
need help with. I have a pipeline which has to read in 40 million records
from multiple Parquet files from AWS S3. The only way I can get the
credentials I need for this particular bucket is to call an API, which I d
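Since the message is cut off here: one possible way to feed credentials obtained from an external API into Beam's S3 filesystem is to set them on the AWS pipeline options before the pipeline runs. This is a minimal sketch assuming the `beam-sdks-java-io-amazon-web-services2` module; `ApiCredentials` and `fetchCredentialsFromApi()` are hypothetical stand-ins for the bucket's credentials API mentioned above:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.aws2.options.AwsOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;

public class S3CredentialsSketch {

  // Hypothetical holder for whatever the external credentials API returns.
  static final class ApiCredentials {
    final String accessKeyId;
    final String secretAccessKey;
    ApiCredentials(String accessKeyId, String secretAccessKey) {
      this.accessKeyId = accessKeyId;
      this.secretAccessKey = secretAccessKey;
    }
  }

  // Hypothetical: replace with the real call to the credentials API.
  static ApiCredentials fetchCredentialsFromApi() {
    throw new UnsupportedOperationException("replace with the real API call");
  }

  public static void main(String[] args) {
    AwsOptions options = PipelineOptionsFactory.fromArgs(args).as(AwsOptions.class);

    // Fetch the credentials first, then hand them to Beam's S3 filesystem
    // via the AWS pipeline options before constructing the pipeline.
    ApiCredentials creds = fetchCredentialsFromApi();
    options.setAwsCredentialsProvider(
        StaticCredentialsProvider.create(
            AwsBasicCredentials.create(creds.accessKeyId, creds.secretAccessKey)));

    Pipeline pipeline = Pipeline.create(options);
    // ... apply ParquetIO.read(...).from("s3://...") as in the snippet above ...
    pipeline.run().waitUntilFinish();
  }
}
```

One caveat: StaticCredentialsProvider never refreshes, so if the API issues short-lived credentials that could expire during a 40-million-record read, a custom AwsCredentialsProvider that re-calls the API when asked for credentials would be a safer fit.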