The runs can't be repeated at present, but we plan to collect this data when there is an opportunity.
Attached are some tables and plots related to the previous set, which also
show the per-OST performance from the same runs as the original four plots.
(fpp == file per process, sf == shared file, vn == 2 clients per compute node)

Figure 1 attached files: fpp_avg_per_ost.png, fpp_table.txt
Figure 2 attached files: fpp_vn_avg_per_ost.png, fpp_vn_table.txt
Figure 3 attached files: sf_avg_per_ost.png, sf_table.txt
Figure 4 attached files: sf_vn_avg_per_ost.png, sf_vn_table.txt

Thanks Jim, Ruth

On Wed, 2006-11-22 at 02:04 +0300, Alex Tomas wrote:
> I think we'd very appreciate if users provide us more data
> from their runs. the most interesting things are:
> 1) vmstat 1
> 2) utils/llobdstat.pl <ost name> 1
> 3) utils/llstat.pl <path to stats file> 1
>
> note that, every service has own stats file. for example,
>
> [EMAIL PROTECTED] tests]# find /proc/fs/lustre -name stats
> /proc/fs/lustre/ldlm/services/ldlm_canceld/stats
> /proc/fs/lustre/ldlm/services/ldlm_cbd/stats
> /proc/fs/lustre/llite/fs0/stats
> /proc/fs/lustre/mdt/MDT/mds_readpage/stats
> /proc/fs/lustre/mdt/MDT/mds_setattr/stats
> /proc/fs/lustre/mdt/MDT/mds/stats
> /proc/fs/lustre/osc/OSC_tom_OST_tom_MNT_tom/stats
> /proc/fs/lustre/osc/OSC_tom_OST_tom_mds1/stats
> /proc/fs/lustre/obdfilter/OST_tom/stats
> /proc/fs/lustre/ost/OSS/ost_io/stats
> /proc/fs/lustre/ost/OSS/ost_create/stats
> /proc/fs/lustre/ost/OSS/ost/stats
>
> it seems like we need a silly script to simplify gathering.
> Peter, what do you think?
>
> thanks, Alex
>
> >>>>> Peter Braam (PB) writes:
>
> PB> Sandia has made available a very interesting set of graphs with some
> PB> questions. They study single file per process and shared file IO on
> PB> Red Storm.
>
> PB> Most striking is that on the liblustre platform, shared file IO seems
> PB> to be 4x slower. It feels like this should be a simple contention
> PB> issue on the server, but we haven't been able to find it.
>
> PB> A question not mentioned in this email is why this won't scale beyond
> PB> 12,500 clients. Probably that has a very simple answer too, and
> PB> something blows up there.
>
> PB> LLNL has shown that in some cases shared file IO can be faster than
> PB> file per process, and it is high time we put the puzzle to bed.
>
> PB> I attach the graphs and copy Sandia's questions below.
>
> PB> Peter
>
> PB> What follows are some IOR tests run at Sandia on a 160-OSS/320-OST
> PB> Lustre file system. This file system had just been reformatted, prior
> PB> to the runs.
>
> PB> The following issues seem key ones:
> PB> - the single shared file is a factor 4-5 too slow, what is the
> PB> overhead?
> PB> - why are reads so slow?
> PB> - why is there a significant read dropoff?
> PB> - why is two cores so much slower than single core?
>
> PB> _______________________________________________
> PB> Lustre-devel mailing list
> PB> [email protected]
> PB> https://mail.clusterfs.com/mailman/listinfo/lustre-devel
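
As a starting point for the gathering script Alex asks for, here is a minimal sketch in Python. It only walks a directory tree for files named stats, as the find command in the quoted mail does, and reads each one once. The function names find_stats_files and snapshot are hypothetical, and the default root is simply the /proc/fs/lustre path from the quoted example. It does not replace vmstat, llobdstat.pl or llstat.pl.

```python
import os


def find_stats_files(root="/proc/fs/lustre"):
    """Return sorted paths of every file named 'stats' under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "stats" in filenames:
            found.append(os.path.join(dirpath, "stats"))
    return sorted(found)


def snapshot(paths):
    """Read each stats file once; unreadable files map to None."""
    result = {}
    for path in paths:
        try:
            with open(path, "r") as handle:
                result[path] = handle.read()
        except OSError:
            result[path] = None
    return result


if __name__ == "__main__":
    # On a node without Lustre mounted this simply prints nothing.
    for path, text in snapshot(find_stats_files()).items():
        size = "unreadable" if text is None else f"{len(text)} bytes"
        print(f"{path}: {size}")
```

Running it before and after a test on each server and client would give one snapshot pair per service; periodic sampling during the run would need a loop around snapshot.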
fpp_avg_per_ost.png
Description: PNG image
fpp_table.txt

# CLIENTS  STRIPE  MAX OSTS  OP     SAMPLE 1  SAMPLE 2       AVG  AVG/OST     ERROR
      513       1       320  READ      31442     31930  31686.00    99.02    488.00
      513       1       320  WRITE     30190     28824  29507.00    92.21   1366.00
     1023       1       320  READ      32332     32571  32451.50   101.41    239.00
     1023       1       320  WRITE     34263     15097  24680.00    77.12  19166.00
     2049       1       320  READ      35720     31865  33792.50   105.60   3855.00
     2049       1       320  WRITE     41169     46168  43668.50   136.46   4999.00
     2559       1       320  READ      32900     31898  32399.00   101.25   1002.00
     2559       1       320  WRITE     39888     39814  39851.00   124.53     74.00
     3073       1       320  READ      31012     31453  31232.50    97.60    441.00
     3073       1       320  WRITE     38459     39904  39181.50   122.44   1445.00
     3583       1       320  READ      31123     31035  31079.00    97.12     88.00
     3583       1       320  WRITE     40635     41507  41071.00   128.35    872.00
     4300       1       320  READ      26512         -  26512.00    82.85      0.00
     4300       1       320  WRITE     41298         -  41298.00   129.06      0.00
     6300       1       320  READ      25177         -  25177.00    78.68      0.00
     6300       1       320  WRITE     38915         -  38915.00   121.61      0.00
     8300       1       320  READ      15549         -  15549.00    48.59      0.00
     8300       1       320  WRITE     38386         -  38386.00   119.96      0.00
    10300       1       320  READ      14877         -  14877.00    46.49      0.00
    10300       1       320  WRITE     36556         -  36556.00   114.24      0.00
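
The derived columns in these tables can be reproduced from the samples. The sketch below assumes that AVG is the mean of the samples, AVG/OST is AVG divided by MAX OSTS, and ERROR is the difference between the largest and smallest sample; those definitions are inferred from the posted numbers, not stated in the files. The function name summarize is hypothetical.

```python
def summarize(samples, n_osts):
    """Return (avg, avg_per_ost, error) for one table row.

    Assumption: ERROR is max(samples) - min(samples), which matches the
    two-sample rows and gives 0.00 for the single-sample rows.
    """
    avg = sum(samples) / len(samples)
    return avg, avg / n_osts, float(max(samples) - min(samples))


# 513-client fpp READ row: samples 31442 and 31930 over 320 OSTs.
avg, per_ost, err = summarize([31442, 31930], 320)
print(f"{avg:.2f} {per_ost:.2f} {err:.2f}")
```

Under this reading, the 1023-client fpp WRITE row stands out: its two samples differ by 19166, so its average is much less trustworthy than the neighbouring rows.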
fpp_vn_avg_per_ost.png
Description: PNG image
fpp_vn_table.txt

# CLIENTS  STRIPE  MAX OSTS  OP     SAMPLE 1       AVG  AVG/OST     ERROR
      513       1       320  READ      22650  22650.00    70.78      0.00
      513       1       320  WRITE     27913  27913.00    87.23      0.00
     1023       1       320  READ      25037  25037.00    78.24      0.00
     1023       1       320  WRITE     30648  30648.00    95.78      0.00
     2049       1       320  READ      30358  30358.00    94.87      0.00
     2049       1       320  WRITE     37164  37164.00   116.14      0.00
     4095       1       320  READ      28612  28612.00    89.41      0.00
     4095       1       320  WRITE     39290  39290.00   122.78      0.00
     8191       1       320  READ      19229  19229.00    60.09      0.00
     8191       1       320  WRITE     30000  30000.00    93.75      0.00
    10300       1       320  READ      12732  12732.00    39.79      0.00
    10300       1       320  WRITE     27912  27912.00    87.22      0.00
sf_avg_per_ost.png
Description: PNG image
sf_table.txt

# CLIENTS  STRIPE  MAX OSTS  OP     SAMPLE 1       AVG  AVG/OST     ERROR
     2049     157       157  READ       5119   5119.00    32.61      0.00
     2049     157       157  WRITE      7474   7474.00    47.61      0.00
     4095     157       157  READ       6826   6826.00    43.48      0.00
     4095     157       157  WRITE     11122  11122.00    70.84      0.00
     8191     157       157  READ       6281   6281.00    40.01      0.00
     8191     157       157  WRITE      9289   9289.00    59.17      0.00
    10300     157       157  READ       5883   5883.00    37.47      0.00
    10300     157       157  WRITE      8407   8407.00    53.55      0.00
sf_vn_avg_per_ost.png
Description: PNG image
sf_vn_table.txt

# CLIENTS  STRIPE  MAX OSTS  OP     SAMPLE 1       AVG  AVG/OST     ERROR
     4095     157       157  READ       6653   6653.00    42.38      0.00
     4095     157       157  WRITE     11138  11138.00    70.94      0.00
     8191     157       157  READ       6157   6157.00    39.22      0.00
     8191     157       157  WRITE      8530   8530.00    54.33      0.00
    10300     157       157  READ       5822   5822.00    37.08      0.00
    10300     157       157  WRITE      7850   7850.00    50.00      0.00
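
Because the fpp runs span 320 OSTs and the sf runs span 157, the aggregate and per-OST gaps between the two modes differ. The sketch below is plain arithmetic on the 10300-client rows of fpp_table.txt and sf_table.txt (the non-vn runs); the function name ratio is hypothetical, and nothing here explains the cause of the gap. For writes it gives roughly 4.3x in aggregate but only about 2.1x per OST.

```python
def ratio(fpp_rate, fpp_osts, sf_rate, sf_osts):
    """Return (aggregate_ratio, per_ost_ratio) of fpp over sf throughput."""
    aggregate = fpp_rate / sf_rate
    per_ost = (fpp_rate / fpp_osts) / (sf_rate / sf_osts)
    return aggregate, per_ost


# 10300-client rows (non-vn): fpp used 320 OSTs, sf used 157.
rows = {
    "read": (14877, 5883),
    "write": (36556, 8407),
}
for op, (fpp_rate, sf_rate) in rows.items():
    aggregate, per_ost = ratio(fpp_rate, 320, sf_rate, 157)
    print(f"{op}: aggregate {aggregate:.2f}x, per OST {per_ost:.2f}x")
```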
