XorSum commented on PR #8179:
URL: https://github.com/apache/hadoop/pull/8179#issuecomment-3742113718

   > Hi @XorSum I think it should hit in most cases. In cases where it does 
hit, have there been tests to assess the performance difference?
   
   Thanks for the review! 
   
   In our cluster, only a minority of paths are mounted, while the majority of 
paths are unmounted. 
   Unfortunately, the original method is extremely slow when it fails to find a 
mount point, as it traverses the entire tree.
   This has become a performance bottleneck, making it critical for us to 
optimize the unmounted path query performance.
   
   
   This program is able to benchmark the performance difference: 
   
https://gist.github.com/XorSum/823641e4b599bfc5c242cdb7b018f044#file-benchmarkfinddeepest-java
   
   To test the case that **all queries hit**,  we can delete the code in Line 
103~109.
   The results shows that, the improved method does not slow down the 
performance.
   This is because the `tree.floorEntry(path)` is able to find the hit entry 
quickly.
   
   ```
   
================================================================================
   BENCHMARK COMPARISON RESULTS
   
================================================================================
   
   【Original Method - findDeepest】
     Total Execution Time:        2,546,286 ns  (2.55 ms)
     Total Lookup Count:              3,001
     Avg Time Per Query:             848.48 ns  (0.00 ms)
     Avg Lookups Per Query:            1.00
   
   【Improved Method - findDeepestImproved】
     Total Execution Time:        1,486,916 ns  (1.49 ms)
     Total Lookup Count:              3,001
     Avg Time Per Query:             495.47 ns  (0.00 ms)
     Avg Lookups Per Query:            1.00
   
   【Performance Improvement】
     Time Saved:                      41.60%
     Lookup Count Reduced:             0.00%
     Speedup Factor:                   1.71x
   
   【Correctness Verification】
     ✓ All results match! Both methods returned identical results.
   ```
   
   To test the case that **all queries NOT hit**,  we can delete the code in 
Line 96~100.
   The results shows that, the improved method significantly speed up the 
performance.
   
   ```
   
================================================================================
   BENCHMARK COMPARISON RESULTS
   
================================================================================
   
   【Original Method - findDeepest】
     Total Execution Time:    1,981,994,707 ns  (1981.99 ms)
     Total Lookup Count:         16,107,249
     Avg Time Per Query:         660,444.75 ns  (0.66 ms)
     Avg Lookups Per Query:        5,367.29
   
   【Improved Method - findDeepestImproved】
     Total Execution Time:       33,253,666 ns  (33.25 ms)
     Total Lookup Count:             24,008
     Avg Time Per Query:          11,080.86 ns  (0.01 ms)
     Avg Lookups Per Query:            8.00
   
   【Performance Improvement】
     Time Saved:                      98.32%
     Lookup Count Reduced:            99.85%
     Speedup Factor:                  59.60x
   
   【Correctness Verification】
     ✓ All results match! Both methods returned identical results.
   ```
   
   
   
   
   Let me know if you'd like me to provide additional benchmark results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to