Ironically, I needed this recently and added it to my fork of parquet. Happy to contribute it back:
https://issues.apache.org/jira/browse/PARQUET-2129
https://github.com/apache/parquet-mr/pull/949

Thanks,
Vinoo Ganesh | vinoo.gan...@gmail.com <vinoo.gan...@gmail.com>

On Sun, Feb 20, 2022 at 1:18 PM Xinli shang <sha...@uber.com.invalid> wrote:

> You seem right. The 'uncompressedSize' is having the value but not printed
> out anywhere. Do you want to make a fix?
>
> On Thu, Feb 17, 2022 at 3:29 AM Deepak Gangwar <dgang...@vmware.com>
> wrote:
>
> > Hi folks,
> >
> > I was using parquet-tools to see the data or metadata of parquet files.
> > I noticed that parquet-tools has been deprecated and removed from the
> > latest branch and it is replaced by parquet-cli. Most of my use-cases
> > are fulfilled by parquet-cli but there is 1 thing missing in
> > parquet-cli. I am not able to find any way to get the uncompressed size
> > of the data present. “parquet-tools size -u” gave the uncompressed size
> > but there is no equivalent parquet-cli command and “parquet-cli meta”
> > only prints the compressed size.
> >
> > I looked around in the codebase and noticed that uncompressedSize is
> > assigned to a variable in meta command but it is not used or printed
> > anywhere [1]. I think usage of the variable is missed but I am not able
> > to find any open issue in jira so I might be completely wrong here.
> > Please confirm whether this is actually an issue and is there any other
> > way to get uncompressed size that I am missing?
> >
> > [1]
> > https://github.com/apache/parquet-mr/blob/master/parquet-cli/src/main/java/org/apache/parquet/cli/commands/ParquetMetadataCommand.java#L123
> >
> > --
> > Thanks & Regards
> > Deepak Gangwar
>
> --
> Xinli Shang