Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-30 Thread Adrian Ulrich
Hi Frederik,

> 'lustre-info.pl --monitor=io-size' seems to sit at collecting data,

`io-size' reads its data from the 'disk I/O size' part of brw_stats:

                            read         |        write
 disk I/O size          ios   % cum %    |  ios   % cum %

..and in your case there are no stats, (for reasons unknown to me...)
that's why lustre-info.pl cannot display anything.

Otherwise the file looks fine: e.g. `--monitor=io-time' should work.
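
A quick way to check which brw_stats sections actually contain rows is sketched below. This is only an illustration, not the code lustre-info.pl uses: it assumes each section starts with a title line ending in `rpcs % cum % |` or `ios % cum % |` and is followed by `bucket: count % cum% | count % cum%` rows. The helper name parse_brw_stats and the sample text are made up for the example.

```perl
use strict;
use warnings;

# Split brw_stats text into sections keyed by the histogram title
# (e.g. "disk I/O size") and collect the data rows of each section.
# The line layout matched here is an assumption, see above.
sub parse_brw_stats {
    my ($text) = @_;
    my (%sections, $current);
    for my $line (split /\n/, $text) {
        if ($line =~ /^(\S.*?)\s+(?:rpcs|ios)\s+%\s+cum\s+%\s+\|/) {
            $current = $1;                  # a title line opens a new section
            $sections{$current} = [];
        }
        elsif (defined $current
               && $line =~ /^(\d+\w*):\s+(\d+)\s+\d+\s+\d+\s+\|\s+(\d+)/) {
            push @{ $sections{$current} },
                 { bucket => $1, read => $2, write => $3 };
        }
    }
    return \%sections;
}

# Hypothetical sample: one populated section, one section without rows.
my $sample = <<'EOT';
                           read      |     write
disk fragmented I/Os   ios   % cum % |  ios   % cum %
0:                      18   0   0   |      0   0   0
1:                  719295  98  98   | 357926  99  99

                           read      |     write
disk I/O size          ios   % cum % |  ios   % cum %
EOT

my $s = parse_brw_stats($sample);
for my $name (sort keys %$s) {
    my $rows = scalar @{ $s->{$name} };
    printf "%-22s %s\n", $name, $rows ? "$rows rows" : "no data";
}
```

A section that reports "no data" here is one that a monitor mode reading it would have nothing to display for.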

Regards,
 Adrian


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-29 Thread Larry
good job, I'll download and learn from them.

On Wed, Jul 28, 2010 at 9:38 PM, Adrian Ulrich adr...@blinkenlights.ch wrote:
[snip]



Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-29 Thread Frederik Ferner
Hi Adrian,

thanks for sharing these with us.

Adrian Ulrich wrote:
> I uploaded two lustre-related modules to the CPAN:
>
> #1: Lustre::Info provides easy access to information located
>     at /proc/fs/lustre, it also comes with a 'performance monitoring'
>     script called 'lustre-info.pl'

I did have a bit of a play with the lustre-info.pl script on our test 
file system and it seems to work nicely. If you've got a lot of OSTs on 
your server you need a wide monitor for some of the options like 
--monitor=ost-patterns for all OSTs...

We are currently running Lustre 1.6.7.2 (+ a few patches) on our OSTs, 
in case this makes a difference for my issues below.

[snip]
> Examples and details:
>
> Lustre::Info and lustre-info.pl
> ---
[snip]
> The module also includes a script called 'lustre-info.pl' that can
> be used to gather some live performance statistics:
>
> Use `--ost-stats' to get a quick overview on what's going on:
> $ lustre-info.pl --ost-stats

In our case this looks like this (on a very quiet file system):
   play01-OST (@ /dev/sdb) :  write=   0.000 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/s, setattr=  0.0 R/s, preprw=  0.0 R/s
   play01-OST0001 (@ /dev/sdc) :  write=   0.000 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/sUse of uninitialized value in division (/) at /usr/local/bin/lustre-info.pl line 187.
, setattr=  0.0 R/s, preprw=  0.0 R/s
   play01-OST0002 (@ /dev/sdd) :  write=   0.000 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/s, setattr=  0.0 R/s, preprw=  0.0 R/s
   play01-OST0003 (@ /dev/sde) :  write=   0.000 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/s, setattr=  0.0 R/s, preprw=  0.0 R/s
   play01-OST0004 (@ /dev/sdf) :  write=   0.000 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/sUse of uninitialized value in division (/) at /usr/local/bin/lustre-info.pl line 187.
, setattr=  0.0 R/s, preprw=  0.0 R/s
   play01-OST0005 (@ /dev/sdg) :  write=   0.000 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/sUse of uninitialized value in division (/) at /usr/local/bin/lustre-info.pl line 187.
, setattr=  0.0 R/s, preprw=  0.0 R/s


Note the 'Use of uninitialized value in division...' errors. Looking at 
the code it seems the value for 'setattr' is missing from the stats file 
for some of our OSTs. Looking at the stats file, indeed the setattr line 
is missing for some OSTs.

Has anyone seen this before? What could have caused this?

> You can also get client-ost details via `--monitor=MODE'
>
> $ lustre-info.pl --monitor=ost --as-list  # this will only show clients where read+write >= 1MB/s
> client nid         | lustre1-OST0006    | lustre1-OST000e    | lustre1-OST0016    | lustre1-OST001e    | +++ TOTALS +++ (MB/s)
> 10.201.46...@o2ib  | r=   0.0, w=   0.0 | r=   0.0, w=   0.0 | r=   0.0, w=   0.0 | r=   0.0, w=   1.1 | read=   0.0, write=   1.1
> 10.201.47...@o2ib  | r=   0.0, w=   0.0 | r=   0.0, w=   1.2 | r=   0.0, w=   2.0 | r=   0.0, w=   0.0 | read=   0.0, write=   3.2

'lustre-info.pl --monitor=io-size' seems to sit at "collecting data, 
please wait..." for a very long time until I killed it. I have not had 
the time to debug this yet.

Kind regards,
Frederik
-- 
Frederik Ferner
Computer Systems Administrator  phone: +44 1235 77 8624
Diamond Light Source Ltd.   mob:   +44 7917 08 5110
(Apologies in advance for the lines below. Some bits are a legal
requirement and I have no control over them.)


Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-29 Thread Andreas Dilger
On 2010-07-29, at 09:08, Frederik Ferner wrote:
> Note the 'Use of uninitialized value in division...' errors. Looking at 
> the code it seems the value for 'setattr' is missing from the stats file 
> for some of our OSTs. Looking at the stats file, indeed the setattr line 
> is missing for some OSTs.
>
> Has anyone seen this before? What could have caused this?

The statistics code for OBD devices is generic.  For operations that have never 
been done on a particular target there is no stats value printed.  Otherwise, 
the stats file would be 60 lines long and mostly be filled with counters that 
are all 0.
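
A consumer of the stats file therefore has to treat a missing counter as zero. The sketch below shows one way to do that in Perl; it assumes counter lines of the form `name count samples [unit] ...`, and both the helper parse_stats and the sample numbers are invented for illustration.

```perl
use strict;
use warnings;

# Parse "name count samples [unit] ..." lines into { name => count }.
# This line layout is an assumption about the stats file format.
sub parse_stats {
    my ($text) = @_;
    my %count;
    for my $line (split /\n/, $text) {
        next unless $line =~ /^(\S+)\s+(\d+)\s+samples\b/;
        $count{$1} = $2;
    }
    return \%count;
}

# Hypothetical stats content: no 'setattr' line, because no setattr
# operation has been done on this target yet.
my $stats = parse_stats(<<'EOT');
snapshot_time 1280423140.470719 secs.usecs
create 12 samples [reqs]
destroy 3 samples [reqs]
preprw 40 samples [reqs]
EOT

for my $type (qw(create destroy setattr preprw)) {
    # An absent key means "never happened", so report it as 0.
    my $n = exists $stats->{$type} ? $stats->{$type} : 0;
    printf "%-8s %d\n", $type, $n;
}
```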

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.



Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-29 Thread Tina Friedrich

Thanks Andreas,

that explains that.

So the warning can be made to go away by changing line 183 in lustre-info.pl from

printf(", %s=%5.1f R/s",$type,$stats->{$type}/$slice)

to

if (exists $stats->{$type})
	{ printf(", %s=%5.1f R/s",$type,$stats->{$type}/$slice); }



(patch attached)

Tina

--
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
--- lustre-info.pl      2010-07-29 17:45:10.0 +0100
+++ lustre-info.pl_new  2010-07-29 17:45:54.0 +0100
@@ -180,7 +180,8 @@ sub loop_ost_stats {
 
         # Add some 'metadata' info
         foreach my $type (qw(create destroy setattr preprw)) {
-            printf(", %s=%5.1f R/s",$type,$stats->{$type}/$slice);
+            if (exists $stats->{$type})
+            { printf(", %s=%5.1f R/s",$type,$stats->{$type}/$slice); }
         }
         print "\n";
     }


Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-29 Thread Adrian Ulrich
Hi Frederik,

> If you've got a lot of OSTs on your server you need a wide monitor for some
> of the options like --monitor=ost-patterns for all OSTs...

The output format is not ideal, but it's a good reason to upgrade your
workstation to a dualhead configuration ;-)


> Looking at the code it seems the value for 'setattr' is missing from the 
> stats file for some of our OSTs. Looking at the stats file, indeed the 
> setattr line is missing for some OSTs.

As Andreas already said: If 'setattr' is missing, there was no setattr 
operation (yet).
Changing
  printf(", %s=%5.1f R/s",$type,$stats->{$type}/$slice);
into
  printf(", %s=%5.1f R/s",$type,(($stats->{$type}||0)/$slice) );

should fix the warning. (The totals are ok, because in Perl undef/$x == 0/$x)
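
The two fixes proposed in this thread (an `exists` guard versus a `|| 0` default) behave slightly differently, which the standalone sketch below tries to show. The counter values and the 5-second $slice are invented for the example; only the format string and the list of types come from the script.

```perl
use strict;
use warnings;

my @types = qw(create destroy setattr preprw);
my $stats = { create => 10, destroy => 5, preprw => 30 };  # no 'setattr' key
my $slice = 5;                          # hypothetical interval in seconds

# Variant 1: skip counters that are absent, so the column disappears.
my $skipped = '';
for my $type (@types) {
    next unless exists $stats->{$type};
    $skipped .= sprintf(", %s=%5.1f R/s", $type, $stats->{$type} / $slice);
}

# Variant 2: treat absent counters as zero, so the column layout stays fixed.
my $zeroed = '';
for my $type (@types) {
    $zeroed .= sprintf(", %s=%5.1f R/s", $type, ($stats->{$type} || 0) / $slice);
}

print "$skipped\n$zeroed\n";
```

Neither variant triggers the 'uninitialized value' warning; the second keeps every line the same width, which is easier to read when several OSTs are listed below each other.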


> 'lustre-info.pl --monitor=io-size' seems to sit at "collecting data, 
> please wait..." for a very long time until I killed it, I have not had 
> the time to debug this yet.

I never tested it with anything other than 1.8.1.1 but this should be
trivial to fix:

Could you mail me the output of
 /proc/fs/lustre/obdfilter/##SOME_OST##/exports/##A_RANDOM_NID##/brw_stats ?

Regards,
 Adrian



-- 
 RFC 1925:
   (11) Every old idea will be proposed again with a different name and
a different presentation, regardless of whether it works.



Re: [Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-29 Thread Frederik Ferner

Adrian Ulrich wrote:
>> If you've got a lot of OSTs on your server you need a wide monitor for some
>> of the options like --monitor=ost-patterns for all OSTs...
>
> The output format is not ideal, but it's a good reason to upgrade your
> workstation to a dualhead configuration ;-)

;-)

[snip]
>> 'lustre-info.pl --monitor=io-size' seems to sit at "collecting data, 
>> please wait..." for a very long time until I killed it, I have not had 
>> the time to debug this yet.
>
> I never tested it with anything else than 1.8.1.1 but this should be
> trivial to fix:
>
> Could you mail me the output of
>  /proc/fs/lustre/obdfilter/##SOME_OST##/exports/##A_RANDOM_NID##/brw_stats ?


See attached.

Thanks!
Frederik

--
Frederik Ferner
Computer Systems Administrator  phone: +44 1235 77 8624
Diamond Light Source Ltd.   mob:   +44 7917 08 5110
(Apologies in advance for the lines below. Some bits are a legal
requirement and I have no control over them.)
snapshot_time: 1280423140.470719 (secs.usecs)

                            read         |        write
pages per bulk r/w        rpcs   % cum % |    rpcs   % cum %
1:                         657   0   0   |    2533   0   0
2:                         119   0   0   |    2571   0   1
4:                         146   0   0   |   44921  12  13
8:                         107   0   0   |  280002  77  91
16:                        112   0   0   |       6   0  91
32:                         37   0   0   |      12   0  91
64:                         19   0   0   |      10   0  91
128:                        26   0   0   |      21   0  91
256:                    727565  99 100   |   28991   8 100

                            read         |        write
discontiguous pages       rpcs   % cum % |    rpcs   % cum %
0:                      726505  99  99   |  359067 100 100
1:                           0   0  99   |       0   0 100
2:                          15   0  99   |       0   0 100
3:                           0   0  99   |       0   0 100
4:                           1   0  99   |       0   0 100
5:                           2   0  99   |       0   0 100
6:                        2243   0  99   |       0   0 100
7:                          22   0 100   |       0   0 100

                            read         |        write
discontiguous blocks      rpcs   % cum % |    rpcs   % cum %
0:                      725512  99  99   |  359067 100 100
1:                         541   0  99   |       0   0 100
2:                         315   0  99   |       0   0 100
3:                         134   0  99   |       0   0 100
4:                          14   0  99   |       0   0 100
5:                           3   0  99   |       0   0 100
6:                        2247   0  99   |       0   0 100
7:                          22   0 100   |       0   0 100

                            read         |        write
disk fragmented I/Os       ios   % cum % |     ios   % cum %
0:                          18   0   0   |       0   0   0
1:                      719295  98  98   |  357926  99  99
2:                        6200   0  99   |    1141   0 100
3:                         540   0  99   |       0   0 100
4:                          94   0  99   |       0   0 100
5:                         221   0  99   |       0   0 100
6:                          35   0  99   |       0   0 100
7:                          68   0  99   |       0   0 100
8:                          15   0  99   |       0   0 100
9:                          16   0  99   |       0   0 100
10:                          6   0  99   |       0   0 100
11:                          4   0  99   |       0   0 100
12:                          0   0  99   |       0   0 100
13:                          3   0  99   |       0   0 100
14:                          1   0  99   |       0   0 100
15:                          0   0  99   |       0   0 100
16:                          0   0  99   |       0   0 100
17:                          0   0  99   |       0   0 100
18:                          0   0  99   |       0   0 100
19:                          0   0  99   |       0   0 100
20:                          0   0  99   |       0   0 100
21:                          1   0  99   |       0   0 100
22:                          0   0  99   |       0   0 100
23:                          0   0  99   |       0   0 100
24:                          1   0  99   |       0   0 100
25:                          0   0  99   |       0   0 100
26:                          0   0  99   |       0   0 100
27:                          0   0  99   |       0   0 100
28:                          0   0  99   |       0   0 100
29:                          0   0  99   |       0   0 100
30:                          0   0  99   |       0   0 100
31:                       2270   0 100   |       0   0 100

                            read         |        write
disk I/Os in flight        ios   % cum % |     ios   % cum %

                            read         |        write
I/O time (1/1000s)         ios   % cum % |     ios   % cum %
1:                         865   0   0   |  331676  92  92
2:                       10821   1   1   |   26464   7  99
4:                      221868  30  32   |     875   0  99
8:                      391101  53  85   |      16   0  99
16:                      66126   9  94   |       1   0  99
32:                      31806   4  99   |       7   0  99
64:                       2475   0  99   |      17   0  99
128:                      1288   

[Lustre-discuss] Lustre::LFS + Lustre::Info (inc. lustre-info.pl) available on the CPAN

2010-07-28 Thread Adrian Ulrich
First: Sorry for the shameless self advertising, but...

I uploaded two lustre-related modules to the CPAN:

#1: Lustre::Info provides easy access to information located
at /proc/fs/lustre, it also comes with a 'performance monitoring'
script called 'lustre-info.pl'

#2 Lustre::LFS offers IO::Dir and IO::File-like filehandles but
   with additional lustre-specific features ($dir_fh->set_stripe...)


Examples and details:

Lustre::Info and lustre-info.pl
---

Lustre::Info provides a Perl-OO interface to Lustre's procfs information.

(confusing) example code to get the blockdevice of all OSTs:

 #
 my $l = Lustre::Info->new;
 print join("\n", map( { $l->get_ost($_)->get_name.": ".$l->get_ost($_)->get_blockdevice }
                       @{$l->get_ost_list}), '' ) if $l->is_ost;
 #

..output:
 $ perl test.pl
 lustre1-OST001e: /dev/md17
 lustre1-OST0016: /dev/md15
 lustre1-OST000e: /dev/md13
 lustre1-OST0006: /dev/md11

The module also includes a script called 'lustre-info.pl' that can
be used to gather some live performance statistics:

Use `--ost-stats' to get a quick overview on what's going on:
$ lustre-info.pl --ost-stats
 lustre1-OST0006 (@ /dev/md11) :  write=   5.594 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/s, setattr=  0.0 R/s, preprw=  6.0 R/s
 lustre1-OST000e (@ /dev/md13) :  write=   3.997 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/s, setattr=  0.0 R/s, preprw=  4.0 R/s
 lustre1-OST0016 (@ /dev/md15) :  write=   5.502 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/s, setattr=  0.0 R/s, preprw=  6.0 R/s
 lustre1-OST001e (@ /dev/md17) :  write=   5.905 MB/s, read=   0.000 MB/s, create=  0.0 R/s, destroy=  0.0 R/s, setattr=  0.0 R/s, preprw=  6.7 R/s


You can also get client-ost details via `--monitor=MODE'

$ lustre-info.pl --monitor=ost --as-list  # this will only show clients where read+write >= 1MB/s
client nid         | lustre1-OST0006    | lustre1-OST000e    | lustre1-OST0016    | lustre1-OST001e    | +++ TOTALS +++ (MB/s)
10.201.46...@o2ib  | r=   0.0, w=   0.0 | r=   0.0, w=   0.0 | r=   0.0, w=   0.0 | r=   0.0, w=   1.1 | read=   0.0, write=   1.1
10.201.47...@o2ib  | r=   0.0, w=   0.0 | r=   0.0, w=   1.2 | r=   0.0, w=   2.0 | r=   0.0, w=   0.0 | read=   0.0, write=   3.2


There are many more options, check out `lustre-info.pl --help' for details!


Lustre::LFS::Dir and Lustre::LFS::File
---

These two packages behave like IO::File and IO::Dir but both of
them add some lustre-only features to the returned filehandle.

Quick example:
 my $fh = Lustre::LFS::File->new; # $fh is a normal IO::File-like FH
 $fh->open("> test") or die;
 print $fh "Foo Bar!\n";
 my $stripe_info = $fh->get_stripe or die "Not on a lustre filesystem?!\n";



Keep in mind that both Lustre modules are far from being complete:
Lustre::Info really needs some MDT support and Lustre::LFS is just a
wrapper for /usr/bin/lfs: An XS-Version would be much better.

But I'd love to hear some feedback if someone decides to play around
with these modules + lustre-info.pl :-)


Cheers,
 Adrian

