Re: CARP and em0 timeout watchdog

2007-04-20 Thread Brian McCann

On 4/20/07, Jack Vogel [EMAIL PROTECTED] wrote:

On 4/20/07, Sven Willenberger [EMAIL PROTECTED] wrote:
 On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
  On 4/20/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
   On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
Having done more diagnostics I have found out it is not CARP related at
all. It turns out that the same timeouts will happen when ftp'ing to the
physical address IPs as well. There is also an odd situation here
depending on which protocol I use. The two boxes are connected to a Dell
Powerconnect 2616 gig switch with CAT6. If I scp files from the
192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
hiccup (I used dd to create various sized testfiles from 32M to 1G in
size and just scp testfile* to the other box). On the other hand, if I
connect to 192.168.0.19 using ftp (either active or passive) where ftp
is being run through inetd, the interface resets (watchdog) within
seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
such behavioral differences between scp and ftp?
  
   You'll get a much higher throughput rate with FTP than you will with
   SSH, simply because encryption overhead is quite high (even with the
   Blowfish cipher).  With a very fast processor and on a gigE network
   you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
   That's the only difference I can think of.
  
   The watchdog resets I can't explain; Jack Vogel should be able to assist
   with that.  But it sounds like the resets only happen under very high
   throughput conditions (which is why you'd see it with FTP but not SSH).
 
  What kind of hardware is this interface? Watchdogs mean TX cleanup
  isn't happening in a reasonable time; without further data it's hard to
  know what might be going on.
 
  Jack
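
To picture what "TX cleanup isn't happening in a reasonable time" means, here is a toy Python model of a transmit watchdog. It is only a sketch and is not the em(4) driver code: the class name, the 5-tick timeout, and the one-packet-per-tick traffic pattern are all invented for illustration. The idea is that queuing a packet arms a countdown, a successful TX cleanup disarms it, and if the countdown expires with work still outstanding the interface is reset.

```python
class TxWatchdog:
    """Toy model of a transmit watchdog (NOT the real em(4) logic)."""

    def __init__(self, timeout_ticks=5):
        self.timeout_ticks = timeout_ticks  # hypothetical timeout value
        self.pending = 0   # packets queued but not yet cleaned up
        self.timer = 0     # 0 means the watchdog is disarmed
        self.resets = 0    # how many times the interface was reset

    def transmit(self, n=1):
        """Queue n packets; arm the watchdog if it is not already armed."""
        self.pending += n
        if self.timer == 0:
            self.timer = self.timeout_ticks

    def tx_cleanup(self):
        """Hardware reported completion: reclaim everything and disarm."""
        self.pending = 0
        self.timer = 0

    def tick(self):
        """Called once per timer tick; resets the interface on expiry."""
        if self.timer == 0:
            return
        self.timer -= 1
        if self.timer == 0 and self.pending > 0:
            self.resets += 1   # this is the "watchdog timeout" event
            self.pending = 0   # a reset drops whatever was queued


def run(cleanup_every, ticks=20):
    """Send one packet per tick; run TX cleanup every `cleanup_every`
    ticks (0 means cleanup never runs). Returns the number of resets."""
    wd = TxWatchdog(timeout_ticks=5)
    for t in range(1, ticks + 1):
        wd.transmit()
        if cleanup_every and t % cleanup_every == 0:
            wd.tx_cleanup()
        wd.tick()
    return wd.resets
```

With these made-up numbers, run(cleanup_every=2) never resets, while run(cleanup_every=0), where cleanup never completes, resets once every five ticks. The model says nothing about why cleanup stalls on a given card; it only shows why a stall surfaces as repeated interface resets.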

 from pciconf:

 [EMAIL PROTECTED]:0:0:  class=0x02 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
     vendor   = 'Intel Corporation'
     device   = 'PRO/1000 PM'
     class    = network
     subclass = ethernet
 [EMAIL PROTECTED]:0:0:  class=0x02 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
     vendor   = 'Intel Corporation'
     class    = network
     subclass = ethernet

 em0 is the interface in question.

 from dmesg:

 em0: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
 0x4000-0x401f mem 0xe030-0xe031 irq 16 at device 0.0 on pci13

 em1: Intel(R) PRO/1000 Network Connection Version - 6.2.9 port
 0x5000-0x501f mem 0xe040-0xe041 irq 17 at device 0.0 on pci14

OH, this is an 82573, and I've posted a firmware patcher a couple of
different times. There is a bit in the MANC register that is incorrectly
programmed in some vendors' systems. Can you search your email for
that patcher? It needs to run from DOS. If you are unable to find
it, let me know and I'll resend you a copy.

Jack
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]



FWIW, I've got 82546B cards and it's happening to me as well, but I'm
on 6.1.  I'm upgrading to 6.2 and trying polling as we speak.

--Brian


--
_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_
Brian McCann
Systems & Network Administrator, K12USA

I don't have to take this abuse from you -- I've got hundreds of
people waiting to abuse me.
   -- Bill Murray, Ghostbusters


Re: pthreads and nagios issue

2005-05-06 Thread Brian McCann
On 5/6/05, Christophe Yayon [EMAIL PROTECTED] wrote:
  Hi all,
 
 I am upgrading our Nagios 1.2 (on FreeBSD 5.3-RELEASE) to Nagios 2.0
 (currently the latest CVS after 2.0b3) on FreeBSD 5.4-RC3, and I saw a
 very strange thing.
 
 After a few hours, the main nagios process (nagios -d ...) uses a lot of
 CPU time, and when I run truss on the pid, I see a loop of kse_release
 messages.
 
 # top
 last pid: 75729;  load averages:  1.81,  2.08,  2.03
 63 processes:  2 running, 61 sleeping
 CPU states: 12.5% user,  0.0% nice, 16.0% system,  0.0% interrupt, 71.5% idle
 Mem: 36M Active, 1639M Inact, 219M Wired, 68M Cache, 112M Buf, 44M Free
 Swap: 5000M Total, 52K Used, 5000M Free
 
   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
 40435 nagios   112    0  4688K  3544K CPU0   0 569:46 93.99% 93.99% nagios
 [...]
 
 # truss -p 40435
 kse_release(0xbfbf9b70)  ERR#22 Invalid argument
 kse_release(0xbfbf9b70)  ERR#22 Invalid argument
 kse_release(0xbfbf9b70)  ERR#22 Invalid argument
 [...]
 
 I know there is a pthread_acquire() issue with Nagios and FreeBSD
 threads, but is there any patch for this?
 
 Thanks in advance...
 
 

I've got the same issues, as do some people on the nagios-users list. 
If there is a patch available, the Nagios team still isn't aware of it
yet as that is one of the reasons 2.0 is still in beta.

-- 
_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_-=-_
Brian McCann
Systems & Network Administrator, K12USA

I don't have to take this abuse from you -- I've got hundreds of
people waiting to abuse me.
-- Bill Murray, Ghostbusters


ggatec ggated question/issue

2005-01-14 Thread Brian McCann
Hi all.  Since this is only in STABLE, figured I'd try here first,
then in -questions.  I'm playing around with ggatec and ggated, and
would like to eventually be able to mirror a partition over the
network...but for now I've just exported /dev/da1s1d as RO, created
the ggate device on the client as RO, mounted it as RO (and tried
using async as well), and when I make a change on the server, I NEVER
see it on the client without unmounting and remounting the client.
What's odd is this: say I create a file by doing:

# echo foo > /share/bar

Then mounting the client, I see the file.  Now I delete the file on
the server, I can still cat the file on the client.  It's like the
client can still read the old superblock or something.  Any ideas on
why this is doing this, or how to make it work so the client sees what
the server sees?

Thanks,
--Brian


Re: ggatec ggated question/issue

2005-01-14 Thread Brian McCann
 That's exactly what's happening.  I put ggatec and ggated into
verbose mode and tried the same thing.  When I cat the file, nothing
appears in the logs for either ggatec or ggated (or even when I do an
ls, for that matter).  This is definitely different than what I
expected: I thought ggated just exported a raw block device, ggatec
created the other end of the tunnel and a dev node pointing to it, and
mount would then just go across the tunnel.  I thought that mounting
it RO and async would get rid of any buffering on the client side.
 Yes, this is cool nonetheless, and a huge advancement, but
there's got to be a way to have it not buffer and instead actively read
from the server through the virtual block device on the client.  Maybe
I just missed something in the man page...guess I'll try reading it again.

Thanks!
--Brian


On Fri, 14 Jan 2005 16:21:59 +0100, Holger Kipp [EMAIL PROTECTED] wrote:
 On Fri, Jan 14, 2005 at 10:01:10AM -0500, Brian McCann wrote:
  # echo foo > /share/bar
 
  Then mounting the client, I see the file.  Now I delete the file on
  the server, I can still cat the file on the client.  It's like the
  client can still read the old superblock or something.  Any ideas on
  why this is doing this, or how to make it work so the client sees what
  the server sees?
 
 Looking at http://kerneltrap.org/node/3104 should explain this. My
 current idea (IANAKH) would be that the client is caching the directory
 and file data and is not notified that anything has changed on disk, so
 there is no reason to refresh the cached data from disk.
 
 The behaviour sounds similar to two FreeBSD systems accessing the same
 disk device via SCSI (without synchronizing disk access).
 
 Regards,
 Holger Kipp
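
The caching explanation above can be sketched with a small Python model. This is only an illustration under assumptions, not how ggatec, ggated, or the filesystem buffer cache are actually implemented: BlockServer, CachingClient, and the single-block "file" are hypothetical stand-ins. It shows a client that keeps what it has already read and is never told the server-side data changed, so repeated reads generate no traffic and return the old contents until the cache is dropped (the equivalent of unmounting and remounting).

```python
class BlockServer:
    """Stands in for the exported device on the server side."""

    def __init__(self):
        self.blocks = {}

    def write(self, blkno, data):
        self.blocks[blkno] = data

    def read(self, blkno):
        return self.blocks.get(blkno)


class CachingClient:
    """Stands in for a read-only mount that caches what it has read."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.network_reads = 0  # how often we actually asked the server

    def read(self, blkno):
        # Only go to the server on a cache miss; nothing ever
        # invalidates an entry while the "mount" is active.
        if blkno not in self.cache:
            self.network_reads += 1
            self.cache[blkno] = self.server.read(blkno)
        return self.cache[blkno]

    def remount(self):
        """Unmount and mount again: all cached data is thrown away."""
        self.cache.clear()


def demo():
    server = BlockServer()
    server.write(0, "foo")           # file created on the server
    client = CachingClient(server)

    first = client.read(0)           # fetched from the server
    server.write(0, None)            # file deleted on the server
    stale = client.read(0)           # served from cache, no traffic
    reads_before_remount = client.network_reads

    client.remount()
    fresh = client.read(0)           # fetched again after the remount
    return first, stale, reads_before_remount, fresh, client.network_reads
```

In this model the second read returns the old data without contacting the server at all, which would be consistent with seeing nothing in the ggatec and ggated logs when the file is read again on the client.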
