bvaradar commented on issue #1847:
URL: https://github.com/apache/hudi/issues/1847#issuecomment-660836081


   @zuyanton : Thanks for the detailed write-up. This is very interesting. If 
you look at the base implementation of the FileStatus getLen() method, it returns 
a cached copy of the length, so I wouldn't expect it to be the cause of such 
high variance. Also, the 100 milliseconds you observed would definitely mean it 
is making some blocking operation such as an RPC call. Does the EMR/S3 
implementation of the filesystem override these classes?
   
   ```java
     /**
      * Get the length of this file, in bytes.
      * @return the length of this file, in bytes.
      */
     public long getLen() {
       return length;
     }
   ```
   
   @zuyanton : Can you log the runtime class of the incoming FileStatus object?
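
   As a rough sketch of that check, the snippet below logs the runtime class of a status object and times a single getLen() call. BaseStatus and LazyStatus are hypothetical stand-ins written for this illustration, not Hadoop's FileStatus or any EMR class, and the 100 ms sleep only simulates a blocking call. In the real job the same two probes (getClass().getName() and a System.nanoTime() delta) would be applied to the incoming object.

```java
public class FileStatusProbe {

    /** Hypothetical stand-in for a base status class that returns a cached length. */
    static class BaseStatus {
        private final long length;

        BaseStatus(long length) {
            this.length = length;
        }

        public long getLen() {
            return length;
        }
    }

    /** Hypothetical subclass whose getLen() blocks, simulating a remote lookup. */
    static class LazyStatus extends BaseStatus {
        private final long remoteLength;

        LazyStatus(long remoteLength) {
            super(-1L);
            this.remoteLength = remoteLength;
        }

        @Override
        public long getLen() {
            try {
                Thread.sleep(100); // simulated blocking round trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return remoteLength;
        }
    }

    /** Returns the runtime class name, which reveals any overriding subclass. */
    static String runtimeType(BaseStatus status) {
        return status.getClass().getName();
    }

    /** Returns the elapsed wall-clock milliseconds of one getLen() call. */
    static long timeGetLenMillis(BaseStatus status) {
        long start = System.nanoTime();
        status.getLen();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        BaseStatus[] samples = { new BaseStatus(42L), new LazyStatus(42L) };
        for (BaseStatus s : samples) {
            System.out.println(runtimeType(s) + " getLen() took "
                    + timeGetLenMillis(s) + " ms");
        }
    }
}
```

   If the logged class is a subclass rather than the base class, its getLen() override is the place to look for the blocking call.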
   
   cc @umehrot2 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

