Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-15 Thread Ravikiran G Thirumalai
On Sat, Jan 13, 2007 at 01:20:23PM -0800, Andrew Morton wrote: > > Seeing the code helps. But there was a subtle problem with the hold-time instrumentation here: the code assumed that a critical section exiting through spin_unlock_irq had been entered with spin_lock_irq, but that might not be t

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-13 Thread Andrew Morton
> On Sat, 13 Jan 2007 11:53:34 -0800 Ravikiran G Thirumalai <[EMAIL PROTECTED]> > wrote: > On Sat, Jan 13, 2007 at 12:00:17AM -0800, Andrew Morton wrote: > > > On Fri, 12 Jan 2007 23:36:43 -0800 Ravikiran G Thirumalai <[EMAIL > > > PROTECTED]> wrote: > > > > >void __lockfunc _spin_lock_irq(spinlo

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-13 Thread Ravikiran G Thirumalai
On Sat, Jan 13, 2007 at 12:00:17AM -0800, Andrew Morton wrote: > > On Fri, 12 Jan 2007 23:36:43 -0800 Ravikiran G Thirumalai <[EMAIL > > PROTECTED]> wrote: > > > >void __lockfunc _spin_lock_irq(spinlock_t *lock) > > > >{ > > > >local_irq_disable(); > > > >>

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-13 Thread Andrew Morton
> On Fri, 12 Jan 2007 23:36:43 -0800 Ravikiran G Thirumalai <[EMAIL PROTECTED]> > wrote: > On Sat, Jan 13, 2007 at 03:39:45PM +1100, Nick Piggin wrote: > > Ravikiran G Thirumalai wrote: > > >Hi, > > >We noticed high interrupt hold off times while running some memory > > >intensive > > >tests on a

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Nick Piggin
Ravikiran G Thirumalai wrote: On Sat, Jan 13, 2007 at 03:39:45PM +1100, Nick Piggin wrote: What is the "CS time"? Critical Section :). This is the maximal time interval I measured from t2 above to the time point we release the spin lock. This is the hold time I guess. It would be in

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Ravikiran G Thirumalai
On Fri, Jan 12, 2007 at 05:11:16PM -0800, Andrew Morton wrote: > On Fri, 12 Jan 2007 17:00:39 -0800 > Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote: > > > But is > > lru_lock an issue is another question. > > I doubt it, although there might be changes we can make in there to > work around it.

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Ravikiran G Thirumalai
On Sat, Jan 13, 2007 at 03:39:45PM +1100, Nick Piggin wrote: > Ravikiran G Thirumalai wrote: > >Hi, > >We noticed high interrupt hold off times while running some memory > >intensive > >tests on a Sun x4600 8 socket 16 core x86_64 box. We noticed softlockups, > > [...] > > >We did not use any l

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Nick Piggin
Ravikiran G Thirumalai wrote: Hi, We noticed high interrupt hold off times while running some memory intensive tests on a Sun x4600 8 socket 16 core x86_64 box. We noticed softlockups, [...] We did not use any lock debugging options and used plain old rdtsc to measure cycles. (We disable cp

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Andrew Morton
On Fri, 12 Jan 2007 17:00:39 -0800 Ravikiran G Thirumalai <[EMAIL PROTECTED]> wrote: > But is > lru_lock an issue is another question. I doubt it, although there might be changes we can make in there to work around it.

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Ravikiran G Thirumalai
On Fri, Jan 12, 2007 at 01:45:43PM -0800, Christoph Lameter wrote: > On Fri, 12 Jan 2007, Ravikiran G Thirumalai wrote: > > Moreover most atomic operations are to remote memory, which is also > increasing the problem by making the atomic ops take longer. Typically > mature NUMA systems have impleme

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Christoph Lameter
On Fri, 12 Jan 2007, Ravikiran G Thirumalai wrote: > > Does the system scale the right way if you stay within the bounds of node > > memory? I.e. allocate 1.5GB from each process? > > Yes. We see problems only when we oversubscribe memory. Ok in that case we can have more than 2 processors tryi

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Ravikiran G Thirumalai
On Fri, Jan 12, 2007 at 11:46:22AM -0800, Christoph Lameter wrote: > On Fri, 12 Jan 2007, Ravikiran G Thirumalai wrote: > > > The test was simple: we have 16 processes, each allocating 3.5G of memory > > and touching each and every page and returning. Each process is > bound to a nod

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Andrew Morton
On Fri, 12 Jan 2007 11:46:22 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > > While the softlockups and the like went away by enabling interrupts during > > spinning, as mentioned in http://lkml.org/lkml/2007/1/3/29 , > > Andi thought maybe this is exposing a problem with zone->lru_loc

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Christoph Lameter
On Fri, 12 Jan 2007, Ravikiran G Thirumalai wrote: > The test was simple: we have 16 processes, each allocating 3.5G of memory > and touching each and every page and returning. Each process is > bound to a node (socket), with the local node being the preferred node for > allocation (nu

Re: High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Peter Zijlstra
On Fri, 2007-01-12 at 08:01 -0800, Ravikiran G Thirumalai wrote: > Hi, > We noticed high interrupt hold off times while running some memory intensive > tests on a Sun x4600 8 socket 16 core x86_64 box. We noticed softlockups, > lost ticks and even wall time drifting (which is probably a bug in the

High lock spin time for zone->lru_lock under extreme conditions

2007-01-12 Thread Ravikiran G Thirumalai
Hi, We noticed high interrupt hold off times while running some memory intensive tests on a Sun x4600 8 socket 16 core x86_64 box. We noticed softlockups, lost ticks and even wall time drifting (which is probably a bug in the x86_64 timer subsystem). The test was simple, we have 16 processes, ea