On Tue, Feb 15, 2011 at 09:07:31PM +0100, Marc Grimme wrote:
> Hi Steve,
> I think I recently observed very similar behavior with RHEL5 and gfs2.
> It was a gfs2 filesystem with about 2 million files totalling 2GB in a 
> directory. When I ran du -shx . in this directory it took about 5 minutes 
> (noatime mount option given), regardless of how many nodes took part in the 
> cluster (in the end I only tested with one node). This applied only to the 
> first run; all later du commands were much faster.
> When I mounted the exact same filesystem with lockproto=lock_nolock, the 
> same command took about 10-20 seconds.
> 
> Next I started to analyze this with oprofile and observed the following 
> result:
> 
> opreport --long-file-names:
> CPU: AMD64 family10, speed 2900.11 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
> mask of 0x00 (No unit mask) count 100000
> samples  %        symbol name
> 200569   46.7639  search_rsb_list
> 118905   27.7234  create_lkb

Hi Marc, thanks for sending this again. I remember that you pointed these
out a long time ago, but I had forgotten just how bad those searches were.
I really do need to do some optimizing there.

> This very much reminded me of a similar test we did years ago with
> gfs (see 
> http://www.open-sharedroot.org/Members/marc/blog/blog-on-dlm/red-hat-dlm-__find_lock_by_id/profile-data-with-diffrent-table-sizes).
>
> Doesn't this show that during the du command the kernel spends 46% of
> its time in the dlm:search_rsb_list function looking up locks?
> It still looks like the hash table for locks in dlm is much too small,
> so searching it is no longer constant time.

We should definitely check whether the default hash table sizes should be
increased.
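
To put rough numbers on why table size matters here, the following is a
back-of-the-envelope sketch, not dlm code. It assumes a chained hash table
with a uniform hash, where a successful lookup examines on average
1 + (n - 1) / (2 * m) entries for n entries in m buckets. The function name
and the bucket counts are hypothetical values chosen for illustration, not
dlm defaults; only the figure of about 2 million files comes from the report
above.

```python
def expected_probes(n_items: int, n_buckets: int) -> float:
    """Average entries examined by a successful lookup in a chained
    hash table, assuming the hash spreads keys uniformly."""
    if n_items < 1 or n_buckets < 1:
        raise ValueError("n_items and n_buckets must be positive")
    return 1 + (n_items - 1) / (2 * n_buckets)


if __name__ == "__main__":
    n = 2_000_000  # roughly the number of files in the reported directory
    # Hypothetical bucket counts, for illustration only.
    for m in (256, 1024, 65536, 4_194_304):
        print(f"{m:>9} buckets: ~{expected_probes(n, m):.1f} entries per lookup")
```

With the bucket count far below the number of entries, each lookup walks a
long chain, which would be consistent with a list-search function dominating
the profile.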

Dave

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
