KnightChess commented on issue #10511:
URL: https://github.com/apache/hudi/issues/10511#issuecomment-1905205987

   Many factors can lead to a difference in query time: at the system level, 
IO, CPU, disk load...; in Spark, parallelism, the time it takes to bring up 
executors...; in Hudi, snapshot reads should in theory be slower than 
read-optimized reads, because they use different readers on different files 
(read-optimized reads only the base files, snapshot reads the base + log 
files).
   There is also another question: is reading a parquet file with a bloom 
filter actually faster than reading one without? I don't think that is 
certain; you need to look at the actual effect in production. 
   For a Spark query, a difference of 2 s is not enough to show that there is 
a slowness problem. What do you think? This is only my limited understanding; 
others may have a better opinion.
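
   One way to check whether a 2 s gap means anything is to run each query 
several times and compare the gap with the run-to-run spread. Below is a 
minimal sketch, not a definitive benchmark: the helper names (`time_runs`, 
`gap_exceeds_noise`) and `base_path` are hypothetical, and the "gap larger 
than spread" rule is only a rough heuristic I am assuming here.

```python
import statistics
import time


def time_runs(run_query, repeats=5, warmup=1):
    """Return wall-clock durations (seconds) of repeated calls to run_query."""
    # Warm-up runs absorb one-off costs such as executor start-up or cold caches.
    for _ in range(warmup):
        run_query()
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def gap_exceeds_noise(a, b):
    """True if the median gap between two timing sets is larger than the
    widest min-to-max spread inside either set (a rough heuristic only)."""
    gap = abs(statistics.median(a) - statistics.median(b))
    spread = max(max(a) - min(a), max(b) - min(b))
    return gap > spread


# Hypothetical usage with a SparkSession named `spark` and a table at `base_path`:
#
# def read(query_type):
#     return (spark.read.format("hudi")
#             .option("hoodie.datasource.query.type", query_type)
#             .load(base_path)
#             .count())
#
# ro = time_runs(lambda: read("read_optimized"))
# rt = time_runs(lambda: read("snapshot"))
# print(gap_exceeds_noise(ro, rt))
```

   If the spread inside one query type is already as large as the gap between 
the two, the 2 s difference is most likely noise from IO, CPU or scheduling.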

