[Bug 94940] Re: mdns listed in nsswitch.conf causes excessive time for dns lookups

2008-01-07 Thread ed_p
After reading through related bugs, it looks like avahi / nss is trying
multicast dns before traditional dns. With multicast DNS, the only real
option is long timeouts and retries, since only one avahi-enabled
machine on the network may have a response for a given request. (That
is, a successful lookup could have NXDomain responses from all but one
host.) Since IP networks are assumed unreliable, it makes sense to retry
requests, since the request may not have reached the one host that has
the record.

I'm not sure that there's an easy way to fix this. Anything we did to
fix this issue would weaken multicast dns (lower the timeout, reduce the
number of retries, etc). It's unfortunate that currently this impacts
servers that have no use for avahi-style multicast dns, since avahi mdns
is enabled by default on many systems (e.g. gutsy).
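
For anyone checking whether a given server is affected, here is a rough
Python sketch that lists the services on the hosts line of nsswitch.conf
in lookup order and picks out the mdns ones. The sample line is only an
assumed example of an avahi-enabled configuration, not copied from an
affected machine, and the function names are made up for illustration.

```python
def hosts_services(nsswitch_text):
    """Return the services on the 'hosts:' line, in lookup order.

    Simplification: bracketed action items such as [NOTFOUND=return]
    are dropped, and are assumed to contain no spaces.
    """
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments
        if line.startswith("hosts:"):
            tokens = line[len("hosts:"):].split()
            return [t for t in tokens if not t.startswith("[")]
    return []


def mdns_services(services):
    """Pick out the entries served by nss-mdns (names starting with 'mdns')."""
    return [s for s in services if s.startswith("mdns")]


# Hypothetical sample, for illustration only.
SAMPLE = """
passwd: compat
hosts:  files mdns4_minimal [NOTFOUND=return] dns mdns4
"""

if __name__ == "__main__":
    services = hosts_services(SAMPLE)
    print("lookup order:", services)
    print("mdns entries:", mdns_services(services))
```

On a real machine you would pass in the contents of /etc/nsswitch.conf
instead of SAMPLE. A server with no use for multicast DNS should end up
with an empty list of mdns entries.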

-- 
mdns listed in nsswitch.conf causes excessive time for dns lookups
https://bugs.launchpad.net/bugs/94940
You received this bug notification because you are a member of Ubuntu
Bugs, which is the bug contact for Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 94940] Re: mdns listed in nsswitch.conf causes excessive time for dns lookups

2008-01-06 Thread ed_p
After some stracing and tcpdumping, it looks like the changed behavior
here is that when mdns gets an NXDomain response, it keeps retrying for
up to 5 seconds, then reports a "timeout" to the requesting client,
rather than immediately reporting that the record doesn't exist.

Is there a reason why requests that get NXDomain responses are retried?
I can't think of a situation where that would be what you'd want, but
maybe I'm missing something.
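
To put numbers on the retries, here is a small Python sketch that turns
the three sendmsg timestamps from the avahi-daemon trace in this message
into offsets from the first request. The timestamps are copied from that
trace; the function name is just for illustration.

```python
from datetime import datetime, timedelta

# Timestamps of the three multicast sendmsg() calls in the avahi-daemon trace.
SEND_TIMES = ["02:23:44.035615", "02:23:45.039623", "02:23:47.043610"]


def offsets_ms(stamps):
    """Whole milliseconds elapsed between the first timestamp and each one."""
    parsed = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(t - parsed[0]) // timedelta(milliseconds=1) for t in parsed]


if __name__ == "__main__":
    offsets = offsets_ms(SEND_TIMES)
    gaps = [b - a for a, b in zip(offsets, offsets[1:])]
    print("offsets (ms):", offsets)
    print("gaps (ms):", gaps)
```

The gaps come out at roughly 1 second and then 2 seconds, which looks
like a doubling retry interval, though three samples are not enough to
be sure of that.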

Trace excerpts are below.

With mdns disabled, the request is over in about 13 ms, and we do not
retry (stracing sshd, following forks):

[pid  7728] 02:10:20.474565 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
[pid  7728] 02:10:20.474649 connect(4, {sa_family=AF_INET, sin_port=htons(53), 
sin_addr=inet_addr("68.87.76.178")}, 28) = 0
[pid  7728] 02:10:20.474745 fcntl64(4, F_GETFL) = 0x2 (flags O_RDWR)
[pid  7728] 02:10:20.474806 fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  7728] 02:10:20.474870 gettimeofday({1199614220, 474899}, NULL) = 0
[pid  7728] 02:10:20.474943 poll([{fd=4, events=POLLOUT, revents=POLLOUT}], 1, 
0) = 1
[pid  7728] 02:10:20.475030 send(4, 
"\214\222\1\0\0\1\0\0\0\0\0\0\0011\0010\003168\003192\7"..., 42, MSG_NOSIGNAL) 
= 42
[pid  7728] 02:10:20.475158 poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, 
5000) = 1
[pid  7728] 02:10:20.488319 ioctl(4, FIONREAD, [42]) = 0
[pid  7728] 02:10:20.488425 recvfrom(4, 
"\214\222\201\203\0\1\0\0\0\0\0\0\0011\0010\003168\0031"..., 1024, 0, 
{sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("68.87.76.178")}, 
[16]) = 42
[pid  7728] 02:10:20.488640 close(4)= 0

There are no further communications with the dns server in the trace.
(It's not very clear from the trace, but the IP being looked up is
192.168.0.1.)
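
For anyone who wants to check that reading of the trace, here is a
minimal Python sketch that decodes the DNS header and the leading name
labels from the send() and recvfrom() buffers above. The byte strings
are copied from the strace output, which truncates them, so only the
labels that are completely present get decoded. The helper names are
made up for illustration.

```python
import struct

# Response codes from the DNS header (RFC 1035, section 4.1.1).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def parse_header(packet):
    """Decode the fixed 12-byte DNS header."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", packet[:12])
    rcode = flags & 0x000F
    return {
        "id": ident,
        "is_response": bool(flags >> 15),
        "rcode": RCODES.get(rcode, str(rcode)),
        "questions": qd,
        "answers": an,
    }


def complete_labels(packet):
    """Return the question-name labels that are fully present in the buffer."""
    labels, pos = [], 12
    while pos < len(packet):
        length = packet[pos]
        if length == 0 or pos + 1 + length > len(packet):
            break  # end of name, or label cut off by strace's truncation
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return labels


# Buffers as shown (truncated) by strace.
QUERY = b"\214\222\1\0\0\1\0\0\0\0\0\0\0011\0010\003168\003192\7"
REPLY = b"\214\222\201\203\0\1\0\0\0\0\0\0\0011\0010\003168\0031"

if __name__ == "__main__":
    print(parse_header(QUERY))
    print(parse_header(REPLY))
    # Reverse lookups list the address octets in reverse order.
    print(".".join(reversed(complete_labels(QUERY))))
```

The reply has the same ID as the query, the response bit set, no
answers, and rcode 3 (NXDOMAIN). The query labels 1, 0, 168, 192 are
the reversed octets of 192.168.0.1.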


With mdns enabled, we retry several times (stracing avahi-daemon). I've 
annotated it with shell-style comments, since it's much longer.

# avahi-daemon gets the RESOLVE-ADDRESS command from sshd over its socket
02:23:43.930581 poll([{fd=7, events=POLLIN}, {fd=3, events=POLLIN, 
revents=POLLIN}, {fd=16, events=POLLIN}, {fd=15, events=POLLIN}, {fd=14, 
events=POLLIN}, {fd=13, events=POLLIN}, {fd=12, events=POLLIN}, {fd=11, 
events=POLLIN}, {fd=9, events=POLLIN}], 9, 2196610) = 1
02:23:43.930711 gettimeofday({1199615023, 930740}, NULL) = 0
02:23:43.930778 read(3, "RESOLVE-ADDRESS 192.168.0.1\n", 20480) = 28
# (snip)
# request #1
02:23:44.035615 sendmsg(14, {msg_name(16)={sa_family=AF_INET, 
sin_port=htons(5353), sin_addr=inet_addr("224.0.0.251")}, 
msg_iov(1)=[{"\0\0\0\0\0\1\0\0\0\0\0\0\0011\0010\003168\003192\7in-a"..., 42}], 
msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_IP, cmsg_type=, ...}, 
msg_flags=0}, 0) = 42
# (snip)
02:23:44.036015 poll([{fd=7, events=POLLIN}, {fd=3, events=POLLIN}, {fd=16, 
events=POLLIN}, {fd=15, events=POLLIN}, {fd=14, events=POLLIN, revents=POLLIN}, 
{fd=13, events=POLLIN}, {fd=12, events=POLLIN}, {fd=11, events=POLLIN}, {fd=9, 
events=POLLIN}], 9, 100) = 1
02:23:44.036144 gettimeofday({1199615024, 36173}, NULL) = 0
02:23:44.036212 ioctl(14, FIONREAD, [42]) = 0
02:23:44.036307 recvmsg(14, {msg_name(16)={sa_family=AF_INET, 
sin_port=htons(5353), sin_addr=inet_addr("192.168.0.50")}, 
msg_iov(1)=[{"\0\0\0\0\0\1\0\0\0\0\0\0\0011\0010\003168\003192\7in-a"..., 42}], 
msg_controllen=40, {cmsg_len=24, cmsg_level=SOL_IP, cmsg_type=, ...}, 
msg_flags=0}, 0) = 42
# (snip)
# request #2, 1 second after first request
02:23:45.039623 sendmsg(14, {msg_name(16)={sa_family=AF_INET, 
sin_port=htons(5353), sin_addr=inet_addr("224.0.0.251")}, 
msg_iov(1)=[{"\0\0\0\0\0\1\0\0\0\0\0\0\0011\0010\003168\003192\7in-a"..., 42}], 
msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_IP, cmsg_type=, ...}, 
msg_flags=0}, 0) = 42
02:23:45.039807 write(8, "W", 1)= 1
02:23:45.039899 read(7, "WW", 10)   = 2
02:23:45.039965 gettimeofday({1199615025, 39994}, NULL) = 0
02:23:45.040032 poll([{fd=7, events=POLLIN}, {fd=3, events=POLLIN}, {fd=16, 
events=POLLIN}, {fd=15, events=POLLIN}, {fd=14, events=POLLIN, revents=POLLIN}, 
{fd=13, events=POLLIN}, {fd=12, events=POLLIN}, {fd=11, events=POLLIN}, {fd=9, 
events=POLLIN}], 9, 100) = 1
02:23:45.040162 gettimeofday({1199615025, 40190}, NULL) = 0
02:23:45.040229 ioctl(14, FIONREAD, [42]) = 0
02:23:45.040307 recvmsg(14, {msg_name(16)={sa_family=AF_INET, 
sin_port=htons(5353), sin_addr=inet_addr("192.168.0.50")}, 
msg_iov(1)=[{"\0\0\0\0\0\1\0\0\0\0\0\0\0011\0010\003168\003192\7in-a"..., 42}], 
msg_controllen=40, {cmsg_len=24, cmsg_level=SOL_IP, cmsg_type=, ...}, 
msg_flags=0}, 0) = 42
# (snip)
# request #3, 3 seconds after first request
02:23:47.043610 sendmsg(14, {msg_name(16)={sa_family=AF_INET, 
sin_port=htons(5353), sin_addr=inet_addr("224.0.0.251")}, 
msg_iov(1)=[{"\0\0\0\0\0\1\0\0\0\0\0\0\0011\0010\

[Bug 84899] Re: SSH with GSSAPIAuthentication option on SSH servers are very slow

2008-01-06 Thread ed_p
This is really an nss-mdns bug, reported here:

https://bugs.launchpad.net/ubuntu/+source/avahi/+bug/94940

-- 
SSH with GSSAPIAuthentication option on SSH servers are very slow
https://bugs.launchpad.net/bugs/84899


[Bug 34902] Re: Ralink Wireless USB/PCMCIA/PCI hangs PC

2007-04-08 Thread ed_p
So ndiswrapper works, but that solution only helps x86, and not other
supported platforms (ppc, sparc). When possible, it's better to fix a
driver than to retreat to ndiswrapper.

-- 
Ralink Wireless USB/PCMCIA/PCI hangs PC
https://bugs.launchpad.net/bugs/34902


[Bug 34902] Re: Ralink Wireless USB/PCMCIA/PCI hangs PC

2006-09-19 Thread ed_p
This seems to be a bug in the rt2570 module. Since you can also use
ndiswrapper to drive these cards (at least those of us on intel boxes),
there is a simple workaround: disable the rt2570 module. You can compile
it out, or just add these two lines to your /etc/modprobe.d/blacklist:

# use ndiswrapper instead for rt2x00 devices: rt2570 panics kernel
blacklist rt2570

With rt2570 disabled, my device works as expected, appearing as the
wlan0 interface (ndiswrapper), rather than rausb0 (rt2570).

-- 
Ralink Wireless USB/PCMCIA/PCI hangs PC
https://launchpad.net/bugs/34902



[Bug 34902] Re: Ralink Wireless USB/PCMCIA/PCI hangs PC

2006-09-13 Thread ed_p
Oh, more information, for those more interested in getting their card
working than in fixing kernel bugs:

I was able to get my card up by properly configuring
/etc/network/interfaces manually, in the standard Ubuntu kernels, with
DHCP. I can get exactly one "ifup" command through per reboot. You can
also do one "ifdown" afterwards. However, on the second "ifup" the
system panics. I think the wizard does several ip-down steps, which
triggers the panic. The end of my /etc/network/interfaces file, where I
added wireless configs, looks like this:

iface rausb0 inet dhcp
wireless-essid 
wireless-key 

*Don't* declare your interface "auto" in /etc/network/interfaces, since
that tells automatic processes that they can up/down that interface,
which will panic your kernel.

I have a simple script that checks whether I've ever brought the card up
(by reading the ESSID from iwconfig), and only brings it up once. It
runs from my startup scripts:

is_wlan_up=$(iwconfig 2>/dev/null | grep '^rausb0' | sed -e 's/^.*ESSID://' | cut -d\" -f2)

[ "$is_wlan_up" ] || ifup rausb0
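
For those who would rather not do this in shell, here is a rough Python
sketch of the same check. It assumes that iwconfig prints the ESSID in
double quotes once the interface has been configured. The sample output
lines and the function names are hypothetical, for illustration only.

```python
import re


def essid_of(iface, iwconfig_output):
    """Return the quoted ESSID on the interface's line, or '' if there is none."""
    for line in iwconfig_output.splitlines():
        if line.startswith(iface):
            match = re.search(r'ESSID:"([^"]*)"', line)
            return match.group(1) if match else ""
    return ""


def should_ifup(iface, iwconfig_output):
    """True when no ESSID is set yet, i.e. the interface was never brought up."""
    return essid_of(iface, iwconfig_output) == ""


# Hypothetical iwconfig output lines, not captured from a real machine.
CONFIGURED = 'rausb0    RT2500USB WLAN  ESSID:"homenet"'
UNCONFIGURED = 'rausb0    RT2500USB WLAN  ESSID:""'

if __name__ == "__main__":
    print(should_ifup("rausb0", CONFIGURED))
    print(should_ifup("rausb0", UNCONFIGURED))
```

In a startup script you would feed in the real output of iwconfig (for
example via subprocess.run) and only call "ifup rausb0" when
should_ifup() returns True.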

That was sufficient to get my D-Link DWL-G122 USB card (rt2500 based)
working with Ubuntu. My network basically works at this point, but I'm
curious to poke around a bit more to see what's wrong.

-- 
Ralink Wireless USB/PCMCIA/PCI hangs PC
https://launchpad.net/bugs/34902



[Bug 34902] Re: Ralink Wireless USB/PCMCIA/PCI hangs PC

2006-09-13 Thread ed_p
Here is some additional information:

Playing around in a text console, it appears that the "freeze" is caused
by a kernel panic that occurs when we hit a BUG in a bh context at
kernel/timer.c, line 411 (in cascade, an inline function that appears in
__run_timers, which itself is an inline function that appears in
run_timer_softirq, which is run in a bh context).

Here's the trace I was working from. The code indicates that it's line
411 of some file:

Code: ... <0f> 0b 9b 01 ee 03 30 c0 ...

It panics, so I couldn't conclusively trace which file from the panic,
but I think it's clearly kernel/timer.c, given the stack trace.

 [] run_timer_softirq+0x132/0x1d0
 [] __do_softirq+0x4f/0xb0
 [] do_softirq+0x35/0x40
 [] irq_exit+0x35/0x40
 [] do_IRQ+0x1f/0x30
 [] common_interrupt+0x1a/0x20

I'm doing some testing now, as to how to fix this. Since this is a
double-fault, I'm curious if we can't just disable BUG()'s in the kernel
(by recompiling). I don't know if the system will recover or not - it
realizes something's wrong, but we panic since it reports the BUG in BH
context, not necessarily because the problem is un-recoverable.

I just did a build with big-kernel-lock pre-emption (CONFIG_PREEMPT_BKL)
turned off, and I didn't see any trouble. So this may be a pre-emption
issue. There is a pretty big configuration delta between the kernel that
had the problem and the one that didn't, so I'm still trying to figure
out what fixed it.

Still looking...

-- 
Ralink Wireless USB/PCMCIA/PCI hangs PC
https://launchpad.net/bugs/34902
