[ 
https://issues.apache.org/jira/browse/PARQUET-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393807#comment-17393807
 ] 

Gabor Szadovszky commented on PARQUET-2073:
-------------------------------------------

So, we are talking about [this 
line|https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/impl/ColumnWriteStoreBase.java#L243].
 
The original line was
{code:java}
(long) ((float) rows) / usedMem * remainingMem
{code}
Here both casts apply only to {{rows}}, so it is completely fine to remove the 
{{(float)}} cast. Even the {{(long)}} cast can be removed since all three 
values are {{long}}. I can see only one case where the two results can differ: 
when the value in {{rows}} is too large to be represented exactly as a 
{{float}} (larger than 16,777,216), so the cast rounds it. Could you please 
list the exact values for which you got different results?
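
To make the point about the casts concrete, here is a minimal sketch. The class name, method names and sample values are made up for illustration and are not taken from parquet-mr. It assumes all three inputs are {{long}}, as stated above, and shows that the result only changes once {{rows}} no longer fits exactly into a {{float}}.

```java
final class CastPrecedenceSketch {
    // The expression as originally written: both casts bind to `rows` only,
    // so the division and the multiplication are plain long arithmetic.
    static long withCasts(long rows, long usedMem, long remainingMem) {
        return (long) ((float) rows) / usedMem * remainingMem;
    }

    // The same expression with both casts removed.
    static long withoutCasts(long rows, long usedMem, long remainingMem) {
        return rows / usedMem * remainingMem;
    }

    public static void main(String[] args) {
        long small = 1_000_000L;      // exactly representable as a float
        long large = (1L << 24) + 1;  // 16_777_217 is not exactly representable as a float

        // Identical results while rows fits into a float exactly.
        System.out.println(withCasts(small, 3, 7) == withoutCasts(small, 3, 7)); // true

        // The float round trip rounds the large value to 16_777_216.
        System.out.println(withCasts(large, 1, 1));    // 16777216
        System.out.println(withoutCasts(large, 1, 1)); // 16777217
    }
}
```

Note that in Java the conversion from {{long}} to {{float}} rounds to the nearest representable value, so the difference shows up as lost precision in {{rows}}.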

Separately, the original code should really have been
{code:java}
(long) ((double) rows / usedMem * remainingMem)
{code}
This way the division is done in floating point instead of being truncated as 
an integer division, so you would get more accurate numbers.
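
The difference between the two forms can be sketched as follows. The class name, method names and input values are hypothetical, chosen only so that the arithmetic is easy to follow; they are not measurements from the reported issue.

```java
final class RowEstimateSketch {
    // All-long arithmetic: rows / usedMem is an integer division and truncates.
    static long integerEstimate(long rows, long usedMem, long remainingMem) {
        return rows / usedMem * remainingMem;
    }

    // The cast to double applies to rows, so the whole expression is evaluated
    // in floating point and only the final result is truncated to long.
    static long doubleEstimate(long rows, long usedMem, long remainingMem) {
        return (long) ((double) rows / usedMem * remainingMem);
    }

    public static void main(String[] args) {
        // Fewer rows than used bytes: the integer division yields 0.
        System.out.println(integerEstimate(100, 400, 1000)); // 0
        System.out.println(doubleEstimate(100, 400, 1000));  // 250

        // When rows is an exact multiple of usedMem, both forms agree.
        System.out.println(integerEstimate(800, 400, 1000)); // 2000
        System.out.println(doubleEstimate(800, 400, 1000));  // 2000
    }
}
```

With integer arithmetic, any input where {{rows}} is smaller than {{usedMem}} collapses to zero, whereas the {{double}} form keeps the fractional ratio until the final cast.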

> Is there something wrong calculate usedMem in ColumnWriteStoreBase.java
> -----------------------------------------------------------------------
>
>                 Key: PARQUET-2073
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2073
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.12.0
>            Reporter: JiangYang
>            Priority: Critical
>         Attachments: image-2021-08-05-14-37-51-299.png
>
>
> !image-2021-08-05-14-37-51-299.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
