I originally sent this to the project's pre-Apache GitHub issues list. I guess that GitHub account is inactive now.

I'm getting a divide by zero in the timing measurement code of InternalParquetRecordReader at line 109 here: https://github.com/apache/incubator-parquet-mr/blob/master/parquet-hadoop/src/main/java/parquet/hadoop/InternalParquetRecordReader.java#L109

totalTime is 0 and there's no check.

In general I find this code somewhat confusing: it's not obvious what's being tracked, because it operates through the side effect of updating certain timing values and assumes it will be called at certain points of the reading lifecycle. With each call to "checkRead", startedAssemblingCurrentBlockAt is reset to the current time. If checkRead is called again within the same millisecond, the measured elapsed time can be 0, and then the division is likely to fail.
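
To illustrate the kind of check I mean, here is a minimal sketch of a guarded percentage calculation. The class and method names (TimingPercentages, percentOf) are hypothetical and are not part of parquet-mr; it only assumes the percentages are computed with integer (long) division over a total measured in milliseconds.

```java
/** Hypothetical helper for illustration only; not part of parquet-mr. */
final class TimingPercentages {
    private TimingPercentages() {}

    /**
     * Returns part as a whole-number percentage of total.
     * When total is not positive (for example, everything finished within
     * the same millisecond), returns 0 instead of dividing by zero.
     */
    static long percentOf(long part, long total) {
        if (total <= 0) {
            return 0;
        }
        return 100 * part / total;
    }
}
```

With this, percentOf(0, 0) yields 0 rather than throwing an ArithmeticException. Returning 0 is just one possible choice; skipping the log line altogether when the total is 0 would also work.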

Also, why are all these timing measurements taken (System.currentTimeMillis is called twice in this one method) and strings constructed for logging when the logging might not even be at INFO level (it seems this code has no operational purpose unless logging is at INFO)?
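
As a rough sketch of what I would expect instead, the timing and string building could be gated on the log level. This example uses java.util.logging as a stand-in for whatever logging class the project actually uses, and the names GatedTimingLog, describe and logIfEnabled are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical illustration only; java.util.logging stands in for the project's logger. */
final class GatedTimingLog {
    private static final Logger LOG = Logger.getLogger(GatedTimingLog.class.getName());

    private GatedTimingLog() {}

    /** Builds the message; avoids computing a rate when no measurable time has elapsed. */
    static String describe(long records, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return "Processed " + records + " records in under 1 ms";
        }
        return "Processed " + records + " records in " + elapsedMillis + " ms: "
            + ((float) records / elapsedMillis) + " rec/ms";
    }

    /** Reads the clock and builds the string only when INFO is enabled. */
    static void logIfEnabled(long records, long startedAtMillis) {
        if (!LOG.isLoggable(Level.INFO)) {
            return; // nothing would be logged, so skip the work entirely
        }
        long elapsed = System.currentTimeMillis() - startedAtMillis;
        LOG.info(describe(records, elapsed));
    }
}
```

A caller would record a start time and later call GatedTimingLog.logIfEnabled(recordCount, startedAt). The start-time capture itself could be put behind the same level check if the numbers are only ever used for logging.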

