Baunsgaard opened a new pull request, #2223:
URL: https://github.com/apache/systemds/pull/2223

   A recent commit, b43fa112c7b4ba53921e8471fdcbc3d0a15da7ae, moved the append 
function out of MatrixBlock. After the changes in this PR, the primary benefit 
is faster sparse appends: appending 10 blocks of 1000x100 (sp: 0.3) now takes 
1.084 ms on average, down from 1.208 ms. More importantly, it improves the 
first calls, before better JIT compilation sets in: q1 (the 1% quantile) drops 
from 4.401 ms to 2.801 ms (see the table below).
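
To put the two numbers quoted above in relative terms, here is a small sketch of the arithmetic. The helper name `relative_reduction` is made up for illustration and is not part of SystemDS; the inputs are copied from the table rows for rows: 1000, cols: 100, sp: 0.3, Blocks: 10.

```python
def relative_reduction(before_ms: float, after_ms: float) -> float:
    """Return the fraction of time saved going from before_ms to after_ms."""
    return (before_ms - after_ms) / before_ms


# Values copied from the benchmark table (rows: 1000, cols: 100, sp: 0.3, Blocks: 10).
mean_before, mean_after = 1.208, 1.084
q1_before, q1_after = 4.401, 2.801

mean_gain = relative_reduction(mean_before, mean_after)
q1_gain = relative_reduction(q1_before, q1_after)

print(f"mean: {mean_gain:.1%} less time, q1: {q1_gain:.1%} less time")
```

This works out to roughly 10.3% less time on average and roughly 36.4% less time at q1, which is why the tail improvement is the more notable one.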
   
   Furthermore, there are still more places where the append functionality 
could be specialized so that JIT compilation kicks in earlier.
   
   ```txt
   SU After:
   appending:  rows:  100 cols:  100 sp:1.0  Blocks:   1  rep: 30000  ,    0.058+-  0.034 ms [q1:  0.160, q2.5:  0.151, q5:  0.145, q10:  0.129]
   appending:  rows: 1000 cols:  100 sp:1.0  Blocks:   1  rep:  3000  ,    0.748+-  0.379 ms [q1:  1.635, q2.5:  1.542, q5:  1.526, q10:  1.511]
   appending:  rows: 1000 cols: 1000 sp:1.0  Blocks:   1  rep:  3000  ,    4.692+-  1.360 ms [q1: 18.938, q2.5: 15.894, q5:  9.155, q10:  7.748]
   appending:  rows:  100 cols:  100 sp:0.3  Blocks:   1  rep: 30000  ,    0.039+-  0.023 ms [q1:  0.107, q2.5:  0.089, q5:  0.081, q10:  0.076]
   appending:  rows: 1000 cols:  100 sp:0.3  Blocks:   1  rep:  3000  ,    0.248+-  0.139 ms [q1:  0.734, q2.5:  0.529, q5:  0.458, q10:  0.447]
   appending:  rows: 1000 cols: 1000 sp:0.3  Blocks:   1  rep:  3000  ,    1.510+-  0.526 ms [q1:  3.219, q2.5:  3.087, q5:  3.050, q10:  2.916]
   appending:  rows:  100 cols:  100 sp:1.0  Blocks:  10  rep: 30000  ,    0.175+-  0.027 ms [q1:  0.827, q2.5:  0.794, q5:  0.379, q10:  0.203]
   appending:  rows: 1000 cols:  100 sp:1.0  Blocks:  10  rep:  3000  ,    3.168+-  0.975 ms [q1: 14.196, q2.5:  9.038, q5:  8.730, q10:  4.289]
   appending:  rows: 1000 cols: 1000 sp:1.0  Blocks:  10  rep:  1000  ,   27.612+-  5.183 ms [q1: 83.574, q2.5: 62.994, q5: 53.881, q10: 39.627]
   appending:  rows:  100 cols:  100 sp:0.3  Blocks:  10  rep: 30000  ,    0.118+-  0.066 ms [q1:  0.397, q2.5:  0.386, q5:  0.379, q10:  0.354]
   appending:  rows: 1000 cols:  100 sp:0.3  Blocks:  10  rep:  3000  ,    1.084+-  0.565 ms [q1:  2.801, q2.5:  2.520, q5:  2.480, q10:  2.402]
   appending:  rows: 1000 cols: 1000 sp:0.3  Blocks:  10  rep:  1000  ,    8.958+-  3.280 ms [q1: 25.492, q2.5: 18.136, q5: 17.960, q10: 17.310]
   appending:  rows:  100 cols:  100 sp:1.0  Blocks: 100  rep:  3000  ,    2.546+-  0.905 ms [q1:  8.725, q2.5:  8.170, q5:  7.382, q10:  3.760]
   appending:  rows: 1000 cols:  100 sp:1.0  Blocks: 100  rep:   300  ,   28.371+-  9.353 ms [q1: 84.630, q2.5: 81.429, q5: 66.162, q10: 48.081]
   appending:  rows: 1000 cols: 1000 sp:1.0  Blocks: 100  rep:   200  ,  257.702+- 15.004 ms [q1:453.715, q2.5:381.001, q5:291.361, q10:282.998]
   appending:  rows:  100 cols:  100 sp:0.3  Blocks: 100  rep:  3000  ,    1.298+-  0.728 ms [q1:  3.937, q2.5:  3.882, q5:  3.853, q10:  3.788]
   appending:  rows: 1000 cols:  100 sp:0.3  Blocks: 100  rep:  2000  ,    9.269+-  2.117 ms [q1: 42.300, q2.5: 26.376, q5: 23.182, q10: 10.511]
   appending:  rows: 1000 cols: 1000 sp:0.3  Blocks: 100  rep:  1000  ,   85.760+-  3.179 ms [q1:187.034, q2.5:173.308, q5:101.105, q10: 93.012]
   
   
   SU Before:
   appending:  rows:  100 cols:  100 sp:1.0  Blocks:   1  rep: 30000  ,    0.057+-  0.033 ms [q1:  0.159, q2.5:  0.145, q5:  0.137, q10:  0.130]
   appending:  rows: 1000 cols:  100 sp:1.0  Blocks:   1  rep:  3000  ,    0.714+-  0.393 ms [q1:  1.600, q2.5:  1.567, q5:  1.552, q10:  1.535]
   appending:  rows: 1000 cols: 1000 sp:1.0  Blocks:   1  rep:  3000  ,    4.599+-  1.273 ms [q1: 17.844, q2.5: 15.859, q5:  9.143, q10:  7.687]
   appending:  rows:  100 cols:  100 sp:0.3  Blocks:   1  rep: 30000  ,    0.033+-  0.020 ms [q1:  0.098, q2.5:  0.094, q5:  0.079, q10:  0.066]
   appending:  rows: 1000 cols:  100 sp:0.3  Blocks:   1  rep:  3000  ,    0.373+-  0.242 ms [q1:  0.905, q2.5:  0.819, q5:  0.808, q10:  0.798]
   appending:  rows: 1000 cols: 1000 sp:0.3  Blocks:   1  rep:  3000  ,    1.916+-  0.861 ms [q1:  5.557, q2.5:  5.497, q5:  5.435, q10:  2.928]
   appending:  rows:  100 cols:  100 sp:1.0  Blocks:  10  rep: 30000  ,    0.205+-  0.100 ms [q1:  0.821, q2.5:  0.796, q5:  0.755, q10:  0.426]
   appending:  rows: 1000 cols:  100 sp:1.0  Blocks:  10  rep:  3000  ,    2.837+-  0.905 ms [q1: 12.766, q2.5:  9.073, q5:  8.292, q10:  4.115]
   appending:  rows: 1000 cols: 1000 sp:1.0  Blocks:  10  rep:  1000  ,   26.707+-  3.109 ms [q1: 61.407, q2.5: 54.978, q5: 52.100, q10: 26.595]
   appending:  rows:  100 cols:  100 sp:0.3  Blocks:  10  rep: 30000  ,    0.106+-  0.064 ms [q1:  0.394, q2.5:  0.383, q5:  0.376, q10:  0.243]
   appending:  rows: 1000 cols:  100 sp:0.3  Blocks:  10  rep:  3000  ,    1.208+-  0.881 ms [q1:  4.401, q2.5:  4.360, q5:  4.312, q10:  4.205]
   appending:  rows: 1000 cols: 1000 sp:0.3  Blocks:  10  rep:  1000  ,   10.174+-  5.625 ms [q1: 32.897, q2.5: 32.365, q5: 30.840, q10: 28.711]
   appending:  rows:  100 cols:  100 sp:1.0  Blocks: 100  rep:  3000  ,    2.554+-  0.816 ms [q1:  8.693, q2.5:  8.394, q5:  7.975, q10:  3.474]
   appending:  rows: 1000 cols:  100 sp:1.0  Blocks: 100  rep:   300  ,   29.651+-  9.058 ms [q1: 87.337, q2.5: 86.180, q5: 65.105, q10: 43.344]
   appending:  rows: 1000 cols: 1000 sp:1.0  Blocks: 100  rep:   200  ,  262.928+- 19.467 ms [q1:429.463, q2.5:393.549, q5:330.729, q10:290.166]
   appending:  rows:  100 cols:  100 sp:0.3  Blocks: 100  rep:  3000  ,    1.257+-  0.726 ms [q1:  4.029, q2.5:  3.991, q5:  3.968, q10:  3.510]
   appending:  rows: 1000 cols:  100 sp:0.3  Blocks: 100  rep:  2000  ,    9.461+-  1.851 ms [q1: 25.104, q2.5: 24.209, q5: 22.847, q10: 10.446]
   appending:  rows: 1000 cols: 1000 sp:0.3  Blocks: 100  rep:  1000  ,   84.473+-  2.733 ms [q1:249.184, q2.5:108.774, q5: 96.848, q10: 90.587]
   ```
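
For readers unfamiliar with this row format, the following is a rough sketch of how a `mean+-std ms [q1, q2.5, q5, q10]` summary could be derived from raw per-repetition timings. It is not the SystemDS benchmark code. It assumes that `qP` is the time at the boundary of the slowest P% of repetitions (which matches the values above being larger than the mean and decreasing from q1 to q10) and that the spread is a population standard deviation; the names `summarize` and `format_row` and the synthetic timings are made up for illustration.

```python
import statistics


def summarize(times_ms, tail_percents=(1, 2.5, 5, 10)):
    """Return (mean, std, tails) where tails[p] bounds the slowest p percent."""
    ordered = sorted(times_ms, reverse=True)  # slowest repetition first
    n = len(ordered)
    tails = {}
    for p in tail_percents:
        # Index of the sample sitting at the edge of the slowest p percent
        # (assumed definition, see the lead-in).
        idx = min(n - 1, int(n * p / 100))
        tails[p] = ordered[idx]
    return statistics.mean(ordered), statistics.pstdev(ordered), tails


def format_row(times_ms):
    """Format timings in the same layout as the table rows above."""
    mean, std, tails = summarize(times_ms)
    qs = ", ".join(f"q{p:g}:{v:7.3f}" for p, v in tails.items())
    return f"{mean:7.3f}+-{std:7.3f} ms [{qs}]"


# Synthetic timings: a few slow early calls (before JIT warm-up),
# followed by a fast steady state.
samples = [5.0] * 20 + [1.0] * 980
row = format_row(samples)
print(row)
```

Under this reading, a change that only speeds up the early, not yet JIT-compiled calls would lower q1 and q2.5 noticeably while leaving the mean almost unchanged, which is the pattern visible in several sparse rows.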


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
