Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
* Andrew Morton [EMAIL PROTECTED] wrote:

> On Wed, 07 Feb 2007 00:17:33 +0100 Thomas Gleixner [EMAIL PROTECTED] wrote:
>
> > On Wed, 2007-02-07 at 00:12 +0100, Tilman Schmidt wrote:
> > > > No, not this. Anyway the last patch Thomas forwarded does fix the
> > > > problem.
> > >
> > > Which one would that be? I might try it for comparison.
> >
> > Find the combined patch of all fixlets on top of -mm3 below.
>
> err, I don't have most of this. I just uploaded the
> crappile-of-the-moment to
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/broken-out-2007-02-06-16-59.tar.gz

hm:

  ERROR 404: Not Found.

pls. do:

  ssh master.kernel.org chmod a+r /pub/linux/kernel/people/akpm/mm/broken-out-2007-02-06-16-59.tar.gz

	Ingo
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
Mattia,

* Mattia Dongili [EMAIL PROTECTED] wrote:

> > I have it halfways reproducible now and I'm working to find the root
> > cause. Thanks for providing the info.
>
> Great, I'm obviously available to test any patch :)

Could you try the patch below? The RCU serialization code (a rare call
but can be common in some types of setups) has a nasty implicit
dependency on the HZ tick - which until now was a hidden wart but became
an explicit bug under dynticks. Maybe this is what is slowing down your
box.

	Ingo

-
Subject: [patch] dynticks: make sure synchronize_rcu() completes
From: Ingo Molnar [EMAIL PROTECTED]

synchronize_rcu() has a nasty implicit dependency on the HZ tick: it
relies on another CPU finishing all RCU work so that this CPU can finish
its RCU work too - in IRQ context. But wait_for_completion() goes to
sleep indefinitely on dynticks and there might be no other IRQs to this
CPU for a long time.

Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 kernel/rcupdate.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux/kernel/rcupdate.c
===================================================================
--- linux.orig/kernel/rcupdate.c
+++ linux/kernel/rcupdate.c
@@ -85,8 +85,13 @@ void synchronize_rcu(void)
 	/* Will wake me after RCU finished */
 	call_rcu(&rcu.head, wakeme_after_rcu);
 
-	/* Wait for it */
-	wait_for_completion(&rcu.completion);
+	/*
+	 * Wait for it. Note: on dynticks RCU completion needs to be
+	 * polled frequently, to make sure we finish work. If this CPU
+	 * goes idle then another CPU cannot finish this CPU's work.
+	 */
+	while (wait_for_completion_timeout(&rcu.completion, HZ/100 ? : 1) == 0)
+		/* nothing */;
 }
 
 static void rcu_barrier_callback(struct rcu_head *notused)
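
A note on the timeout expression in the patch above: "HZ/100 ? : 1" uses the GNU C conditional with an omitted middle operand, so it evaluates to HZ/100 when that is non-zero and to 1 otherwise. The sketch below spells the same arithmetic out in portable C. The helper name poll_interval_jiffies is hypothetical and not part of the kernel; it only illustrates which polling interval, in ticks, the loop would presumably use for a given HZ.

```c
#include <assert.h>

/*
 * Hypothetical helper, not kernel code: mirrors the timeout expression
 * "HZ/100 ? : 1" from the patch, written without the GNU extension.
 */
static long poll_interval_jiffies(long hz)
{
	long ticks;

	assert(hz > 0);
	ticks = hz / 100;		/* roughly 10 ms expressed in ticks */
	return ticks ? ticks : 1;	/* fall back to one tick when hz < 100 */
}
```

With HZ=1000 this gives 10 ticks, with HZ=250 it gives 2, and with HZ=100 it gives 1. For any HZ below 100 the integer division yields 0, and the fallback keeps the timeout at one tick rather than zero.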
Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
On Tue, Feb 06, 2007 at 05:48:26PM +0100, Ingo Molnar wrote:
> Mattia,
>
> * Mattia Dongili [EMAIL PROTECTED] wrote:
>
> > > I have it halfways reproducible now and I'm working to find the root
> > > cause. Thanks for providing the info.
> >
> > Great, I'm obviously available to test any patch :)
>
> Could you try the patch below? The RCU serialization code (a rare call
> but can be common in some types of setups) has a nasty implicit
> dependency on the HZ tick - which until now was a hidden wart but became
> an explicit bug under dynticks. Maybe this is what is slowing down your
> box.

No, not this. Anyway the last patch Thomas forwarded does fix the
problem.

By the way, I have all the patches I received stacked up, if you want me
to test some different combination, just ask.

Thanks
--
mattia
:wq!
Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
On 06.02.2007 20:28, Mattia Dongili wrote:
> On Tue, Feb 06, 2007 at 05:48:26PM +0100, Ingo Molnar wrote:
> > Could you try the patch below? The RCU serialization code (a rare call
> > but can be common in some types of setups) has a nasty implicit
> > dependency on the HZ tick - which until now was a hidden wart but became
> > an explicit bug under dynticks. Maybe this is what is slowing down your
> > box.

I have the same problem (huge delay when loading iptables) with
2.6.20-rc6-mm3, and for me this patch did fix it.

> No, not this. Anyway the last patch Thomas forwarded does fix the
> problem.

Which one would that be? I might try it for comparison.

Thanks,
Tilman

--
Tilman Schmidt                    E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
On Wed, 2007-02-07 at 00:12 +0100, Tilman Schmidt wrote:
> > No, not this. Anyway the last patch Thomas forwarded does fix the
> > problem.
>
> Which one would that be? I might try it for comparison.

Find the combined patch of all fixlets on top of -mm3 below.

	tglx

Index: linux-2.6.20/kernel/timer.c
===================================================================
--- linux-2.6.20.orig/kernel/timer.c
+++ linux-2.6.20/kernel/timer.c
@@ -985,8 +985,9 @@ static int timekeeping_resume(struct sys
 
 	if (now && (now > timekeeping_suspend_time)) {
 		unsigned long sleep_length = now - timekeeping_suspend_time;
+
 		xtime.tv_sec += sleep_length;
-		jiffies_64 += (u64)sleep_length * HZ;
+		wall_to_monotonic.tv_sec -= sleep_length;
 	}
 	/* re-base the last cycle value */
 	clock->cycle_last = clocksource_read(clock);
@@ -994,7 +995,7 @@ static int timekeeping_resume(struct sys
 	timekeeping_suspended = 0;
 	write_sequnlock_irqrestore(&xtime_lock, flags);
 
-	clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
+	touch_softlockup_watchdog();
 	/* Resume hrtimers */
 	clock_was_set();
 
Index: linux-2.6.20/kernel/time/clockevents.c
===================================================================
--- linux-2.6.20.orig/kernel/time/clockevents.c
+++ linux-2.6.20/kernel/time/clockevents.c
@@ -42,8 +42,8 @@ unsigned long clockevent_delta2ns(unsign
 	u64 clc = ((u64) latch << evt->shift);
 
 	do_div(clc, evt->mult);
-	if (clc < KTIME_MONOTONIC_RES.tv64)
-		clc = KTIME_MONOTONIC_RES.tv64;
+	if (clc < 1000)
+		clc = 1000;
 	if (clc > LONG_MAX)
 		clc = LONG_MAX;
 
@@ -72,18 +72,22 @@ void clockevents_set_mode(struct clock_e
  *
  * Returns 0 on success, -ETIME when the event is in the past.
  */
-int clockevents_program_event(struct clock_event_device *dev, ktime_t expires)
+int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
+			      ktime_t now)
 {
 	unsigned long long clc;
 	int64_t delta;
 
-	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
+	delta = ktime_to_ns(ktime_sub(expires, now));
 
 	if (delta <= 0)
 		return -ETIME;
 
 	dev->next_event = expires;
 
+	if (dev->mode == CLOCK_EVT_MODE_SHUTDOWN)
+		return 0;
+
 	if (delta > dev->max_delta_ns)
 		delta = dev->max_delta_ns;
 	if (delta < dev->min_delta_ns)
Index: linux-2.6.20/kernel/time/tick-broadcast.c
===================================================================
--- linux-2.6.20.orig/kernel/time/tick-broadcast.c
+++ linux-2.6.20/kernel/time/tick-broadcast.c
@@ -159,6 +159,8 @@ static void tick_do_periodic_broadcast(v
  */
 static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
 {
+	dev->next_event.tv64 = KTIME_MAX;
+
 	tick_do_periodic_broadcast();
 
 	/*
@@ -174,7 +176,7 @@ static void tick_handle_periodic_broadca
 	for (;;) {
 		ktime_t next = ktime_add(dev->next_event, tick_period);
 
-		if (!clockevents_program_event(dev, next))
+		if (!clockevents_program_event(dev, next, ktime_get()))
 			return;
 		tick_do_periodic_broadcast();
 	}
@@ -294,17 +296,31 @@ cpumask_t *tick_get_broadcast_oneshot_ma
 	return tick_broadcast_oneshot_mask;
 }
 
+static int tick_broadcast_set_event(ktime_t expires, int force)
+{
+	struct clock_event_device *bc = tick_broadcast_device.evtdev;
+	ktime_t now = ktime_get();
+	int res;
+
+	for(;;) {
+		res = clockevents_program_event(bc, expires, now);
+		if (!res || !force)
+			return res;
+		now = ktime_get();
+		expires = ktime_add(now, ktime_set(0, bc->min_delta_ns));
+	}
+}
+
 /*
  * Reprogram the broadcast device:
  *
  * Called with tick_broadcast_lock held and interrupts disabled.
  */
-static int tick_broadcast_reprogram(int force)
+static int tick_broadcast_reprogram(void)
 {
-	struct clock_event_device *bc = tick_broadcast_device.evtdev;
-	ktime_t tmp, expires = { .tv64 = KTIME_MAX };
+	ktime_t expires = { .tv64 = KTIME_MAX };
 	struct tick_device *td;
-	int cpu, res;
+	int cpu;
 
 	/*
 	 * Find the event which expires next:
@@ -319,13 +335,7 @@ static int tick_broadcast_reprogram(int
 	if (expires.tv64 == KTIME_MAX)
 		return 0;
 
-	for(;;) {
-		res = clockevents_program_event(bc, expires);
-		if (!res || !force)
-			return res;
-		tmp = ktime_set(0, bc->min_delta_ns << 1);
-		expires = ktime_add(ktime_get(), tmp);
-	}
+	return tick_broadcast_set_event(expires, 0);
 }
 
 /*
@@ -333,14 +343,15 @@ static int tick_broadcast_reprogram(int
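
The retry logic that the patch factors out into tick_broadcast_set_event() can be tried out in user space. What follows is only a rough sketch under stated assumptions: struct sim_clock, sim_clock_read(), sim_program_event(), sim_set_event() and SIM_ETIME are all hypothetical stand-ins invented for illustration, plain nanosecond integers replace ktime_t, and the "device" does nothing but reject expiry times that are not in the future. It is not the kernel implementation.

```c
#include <stdint.h>

#define SIM_ETIME (-1)	/* stand-in for -ETIME: the event is in the past */

/* Simulated monotonic clock in nanoseconds; every read advances it. */
struct sim_clock {
	int64_t now_ns;
	int64_t step_ns;	/* simulated cost of one clock read */
};

static int64_t sim_clock_read(struct sim_clock *c)
{
	c->now_ns += c->step_ns;
	return c->now_ns;
}

/* Stand-in for clockevents_program_event(): reject non-future expiries. */
static int sim_program_event(int64_t expires_ns, int64_t now_ns)
{
	return (expires_ns - now_ns <= 0) ? SIM_ETIME : 0;
}

/*
 * Same shape as the loop in the patch: try the requested expiry once and,
 * if it already lies in the past and force is set, retry with
 * "now + min_delta" until programming succeeds.
 */
static int sim_set_event(struct sim_clock *c, int64_t expires_ns,
			 int64_t min_delta_ns, int force,
			 int64_t *programmed_ns)
{
	int64_t now = sim_clock_read(c);

	for (;;) {
		int res = sim_program_event(expires_ns, now);

		if (res == 0)
			*programmed_ns = expires_ns;
		if (res == 0 || !force)
			return res;
		now = sim_clock_read(c);
		expires_ns = now + min_delta_ns;
	}
}
```

In this sketch, a non-forced call with an expiry in the past simply returns the error, while a forced call ends up programming an expiry one min_delta after the most recent clock read. The forced loop only terminates because min_delta_ns is positive, which is an assumption of the sketch.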
Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
On Wed, 07 Feb 2007 00:17:33 +0100 Thomas Gleixner [EMAIL PROTECTED] wrote:

> On Wed, 2007-02-07 at 00:12 +0100, Tilman Schmidt wrote:
> > > No, not this. Anyway the last patch Thomas forwarded does fix the
> > > problem.
> >
> > Which one would that be? I might try it for comparison.
>
> Find the combined patch of all fixlets on top of -mm3 below.

err, I don't have most of this. I just uploaded the
crappile-of-the-moment to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/broken-out-2007-02-06-16-59.tar.gz
dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
Cc-ing netdev and netfilter-devel, the beginning of the thread is here
http://lkml.org/lkml/2007/1/31/306

On Thu, Feb 01, 2007 at 11:33:22PM +0100, Thomas Gleixner wrote:
> Mattia,
...
> May I ask you for another test ? Please turn on high resolution timers
> and check, if the same strange behaviour is happening.

Yep, here we go again. Still seeing long stalls but no negative expires
offset.

Actually one more test I did is disabling my iptables script and the
boot process went fine. The script is just:

#!/bin/sh
iptables -F INPUT
iptables -F FORWARD
iptables -F OUTPUT
iptables -P INPUT DROP
iptables -P FORWARD ACCEPT
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A INPUT -p tcp --dport ssh -j ACCEPT
# LAN
iptables -I INPUT -s 10.0.0.0/8 -j ACCEPT
# LAN UML
iptables -I INPUT -s 172.20.0.0/16 -j ACCEPT
echo iptables: MASQUERADING for virtual machines
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
iptables -t nat -A POSTROUTING -o eth2 -j MASQUERADE
sysctl -w net.ipv4.ip_forward=1

and executing it from a shell once the boot process is done doesn't
generate all that strangeness/slowness...

Dmesg with iptables script enabled:

[0.00] Linux version 2.6.20-rc6-mm3-1 ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #8 SMP Fri Feb 2 10:26:07 CET 2007
[0.00] BIOS-provided physical RAM map:
[0.00] sanitize start
[0.00] sanitize end
[0.00] copy_e820_map() start: size: 0009f800 end: 0009f800 type: 1
[0.00] copy_e820_map() type is E820_RAM
[0.00] copy_e820_map() start: 0009f800 size: 0800 end: 000a type: 2
[0.00] copy_e820_map() start: 000dc000 size: 00024000 end: 0010 type: 2
[0.00] copy_e820_map() start: 0010 size: 3fd7 end: 3fe7 type: 1
[0.00] copy_e820_map() type is E820_RAM
[0.00] copy_e820_map() start: 3fe7 size: 0009 end: 3ff0 type: 4
[0.00] copy_e820_map() start: 3ff0 size: 0010 end: 4000 type: 2
[0.00] copy_e820_map() start: e000 size: 1000 end: f000 type: 2
[0.00] copy_e820_map() start: fec0 size: 0001 end: fec1 type: 2
[0.00] copy_e820_map() start: fed14000 size: 6000 end: fed1a000 type: 2
[0.00] copy_e820_map() start: fed1c000 size: 00074000 end: fed9 type: 2
[0.00] copy_e820_map() start: fee0 size: 1000 end: fee01000 type: 2
[0.00] copy_e820_map() start: ff00 size: 0100 end: 0001 type: 2
[0.00] BIOS-e820: - 0009f800 (usable)
[0.00] BIOS-e820: 0009f800 - 000a (reserved)
[0.00] BIOS-e820: 000dc000 - 0010 (reserved)
[0.00] BIOS-e820: 0010 - 3fe7 (usable)
[0.00] BIOS-e820: 3fe7 - 3ff0 (ACPI NVS)
[0.00] BIOS-e820: 3ff0 - 4000 (reserved)
[0.00] BIOS-e820: e000 - f000 (reserved)
[0.00] BIOS-e820: fec0 - fec1 (reserved)
[0.00] BIOS-e820: fed14000 - fed1a000 (reserved)
[0.00] BIOS-e820: fed1c000 - fed9 (reserved)
[0.00] BIOS-e820: fee0 - fee01000 (reserved)
[0.00] BIOS-e820: ff00 - 0001 (reserved)
[0.00] 126MB HIGHMEM available.
[0.00] 896MB LOWMEM available.
[0.00] found SMP MP-table at 000f6480
[0.00] Entering add_active_range(0, 0, 261744) 0 entries of 256 used
[0.00] sizeof(struct page) = 32
[0.00] Zone PFN ranges:
[0.00]   DMA 0 -> 4096
[0.00]   Normal 4096 -> 229376
[0.00]   HighMem 229376 -> 261744
[0.00] early_node_map[1] active PFN ranges
[0.00]   0: 0 -> 261744
[0.00] On node 0 totalpages: 261744
[0.00] Node 0 memmap at 0xc100 size 8388608 first pfn 0xc100
[0.00]   DMA zone: 32 pages used for memmap
[0.00]   DMA zone: 0 pages reserved
[0.00]   DMA zone: 4064 pages, LIFO batch:0
[0.00]   Normal zone: 1760 pages used for memmap
[0.00]   Normal zone: 223520 pages, LIFO batch:31
[0.00]   HighMem zone: 252 pages used for memmap
[0.00]   HighMem zone: 32116 pages, LIFO batch:7
[0.00] DMI present.
[0.00] ACPI: RSDP @ 0x000f63b0/0x0014 (v000 PTLTD )
[0.00] ACPI: RSDT @ 0x3fe764ef/0x0048 (v001 Sony N0 0x20060710 PTL 0x)
[
Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
On Fri, 2007-02-02 at 20:18 +0100, Mattia Dongili wrote:
> > May I ask you for another test ? Please turn on high resolution timers
> > and check, if the same strange behaviour is happening.
>
> Yep, here we go again. Still seeing long stalls but no negative expires
> offset.
>
> Actually one more test I did is disabling my iptables script and the
> boot process went fine. The script is just:

Mattia,

I have it halfways reproducible now and I'm working to find the root
cause. Thanks for providing the info.

	tglx
Re: dynticks + iptables almost stops the boot process [was: Re: 2.6.20-rc6-mm3]
On Fri, Feb 02, 2007 at 09:27:14PM +0100, Thomas Gleixner wrote:
> On Fri, 2007-02-02 at 20:18 +0100, Mattia Dongili wrote:
> > > May I ask you for another test ? Please turn on high resolution timers
> > > and check, if the same strange behaviour is happening.
> >
> > Yep, here we go again. Still seeing long stalls but no negative expires
> > offset.
> >
> > Actually one more test I did is disabling my iptables script and the
> > boot process went fine. The script is just:
>
> Mattia,
>
> I have it halfways reproducible now and I'm working to find the root
> cause. Thanks for providing the info.

Great, I'm obviously available to test any patch :)

--
mattia
:wq!