I am fairly ignorant of the details of package management in Java (I usually write in Python, but the Beam Python SDK is not at the same level the Java one is). I am troubleshooting an issue specific to the DataflowRunner, and I decided to try upgrading Beam from 2.28.0 to 2.30.0. However, code that ran under 2.28.0 now throws a class-not-found exception when it attempts to write data to Parquet locally.
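To make the failure concrete, here is a minimal sketch of the kind of write involved; the schema, record contents, and output path are stand-ins rather than my actual code:

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.coders.AvroCoder;
    import org.apache.beam.sdk.io.FileIO;
    import org.apache.beam.sdk.io.parquet.ParquetIO;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.parquet.hadoop.metadata.CompressionCodecName;

    public class ParquetWriteSketch {
      public static void main(String[] args) {
        // Stand-in Avro schema; the real pipeline's schema is larger.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Event\",\"fields\":"
                + "[{\"name\":\"id\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", "example");

        Pipeline p = Pipeline.create();
        p.apply(Create.of(record).withCoder(AvroCoder.of(GenericRecord.class, schema)))
            // ParquetIO.sink comes from beam-sdks-java-io-parquet, while
            // CompressionCodecName comes from parquet-hadoop -- which is,
            // as far as I can tell, where the Hadoop-side classes enter.
            .apply(FileIO.<GenericRecord>write()
                .via(ParquetIO.sink(schema)
                    .withCompressionCodec(CompressionCodecName.SNAPPY))
                .to("/tmp/parquet-out/"));
        p.run().waitUntilFinish();
      }
    }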
My question is: when upgrading the Beam SDK, what is the expected way to find out that I am going to need additional dependencies, and what they are? I would assume there is a path that does not involve googling the classes the pipeline tries to call and adding dependencies until it stops complaining. Could someone more experienced tell me what that path is?

The specific error I am getting concerns some Hadoop class used by either ParquetIO or Snappy compression, but my question is more general: how do I know which packages and versions are intended to be used with the different Beam extensions?
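For reference, my current guess at the dependency declarations involved is below. The Hadoop artifact and version are assumptions on my part, which is exactly the kind of thing I am hoping there is a documented answer for:

    <!-- Beam core plus the Parquet extension -->
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-core</artifactId>
      <version>2.30.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-io-parquet</artifactId>
      <version>2.30.0</version>
    </dependency>
    <!-- My guess: ParquetIO expects the user to supply the Hadoop classes
         rather than pulling them in transitively. The artifact and version
         here are assumptions, not something I found documented. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.10.1</version>
    </dependency>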
Andrew Kettmann
DevOps Engineer