On 13/09/2010 11:31, Stu Midgley wrote:
> Afternoon
>
> I upgraded our oss's from 1.8.3 to 1.8.4 on Saturday (due to
> https://bugzilla.lustre.org/show_bug.cgi?id=22755) and suffered a
> great deal of pain.
>
> We have 30 oss's of multiple vintages. The basic difference between them is
>
> * md on first 20 nodes
> * 3ware 9650SE ML12 on last 10 nodes
>
> After the upgrade to 1.8.4 we were seeing terrible throughput on the
> nodes with 3ware cards (and only the nodes with 3ware cards). This
> was typified by seeing the block device being 100% utilised (iostat),
> doing about 100r/s and 400kb/s and all the ost_io threads in D state
> (no writes). They would be in this state for 10mins and then suddenly
> awake and start pushing data again. 1-2 mins later, they would lock
> up again.
>
> The oss's were dumping stacks all over the place, crawling along and
> generally making our lustrefs unusable.
>
> After trying different kernels, raid card drivers, changing write back
> policy on the raid cards etc. the solution was to
>
>   lctl set_param obdfilter.*.writethrough_cache_enable=0
>   lctl set_param obdfilter.*.read_cache_enable=0
>
> on all the nodes with the 3ware cards.
>
> Has anyone else seen this? I am completely baffled as to why it only
> affects our nodes with 3ware cards.
>
> These nodes were working very well under 1.8.3...
>
We have the same problem here, but we are not on 3ware: our setup is
qla2462 and Xyratex F5404E 4Gb FC-SAS/SATA-II RAID, on 1.8.4.

On 1.8.3 this also occurred at startup, but after that it was OK.

-- 
Weill Philippe - System and Network Administrator
CNRS/UPMC/IPSL LATMOS (UMR 8190)

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss