Re: FreeBSD based bandwidth manager, traffic shaper

2007-11-07 Thread Oleg Derevenetz
I am looking for a high-performance bandwidth manager and traffic shaper for an
IP core network, to configure leased line, xDSL, Ethernet, GPON/EPON, and
wireless subscribers.


Is there any FreeBSD based solution?


Juniper :-)

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-11-05 Thread Oleg Derevenetz

Anyway, I already looked at the ddb output and said, with very high
confidence, that it looks like either a driver or a hardware problem.

I think the project's time could be spent more productively elsewhere
while the submitter checks his hardware, for instance by changing the
controller, the disks, or the controller type.


I already said that:

1. This controller and these disks previously worked with FreeBSD 4.6.2 
without any problems;
2. I tried replacing a disk with another one (the same model), but it 
didn't help. Unfortunately, I have no other free SCSI controller (but see 
#1);
3. I have another AMD64 machine with different hardware (including disks and 
SCSI controller) that periodically suffers from the same problem. 
Unfortunately, that machine is in production and heavily loaded, so I can't 
load it even more with INVARIANTS, WITNESS, and DIAGNOSTIC; my clients 
would not forgive me for that.


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru




Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-11-04 Thread Oleg Derevenetz
Dumpdev is a swap partition on da0 (a single physical disk) that is 
connected to a Mylex AcceleRAID 170 RAID controller. The problem 
occurs when I copy a large number of files from FTP to another disk 
(da1) that is connected to the same RAID controller.


If the driver or controller is misbehaving it could explain both 
problems. Any chance you can get another disk in there on a different 
controller to dump onto?


Yes, I got an IDE disk and saved a kernel dump of another static hang state 
on it. Here is the dump:


ftp://oleg.vsi.ru/private/vmcore.0.zip


Is this just the vmcore, or the debugging kernel also?  Both are needed 
to make sense of the dump.


The kernel binary, along with the kernel config, is here:

ftp://oleg.vsi.ru/private/kernel.zip

This kernel was built statically, and no modules are loaded at boot at all.

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-11-04 Thread Oleg Derevenetz
(kgdb) thread 26
[Switching to thread 26 (Thread 100019)]#0  0xc05663ab in sched_switch ()
(kgdb) bt
#0  0xc05663ab in sched_switch ()
#1  0xc055b868 in mi_switch ()
#2  0xc0573bd9 in sleepq_switch ()
#3  0xc0573de2 in sleepq_timedwait ()
#4  0xc055b269 in msleep ()
#5  0xc050cfa8 in usb_event_thread ()
#6  0xc05400cc in fork_exit ()
#7  0xc06b17cc in fork_trampoline ()

Anyway, the kernel in kernel.zip is exactly the kernel that this vmcore 
was generated from. I have no other kernel on this machine :-)


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-11-01 Thread Oleg Derevenetz
Dumpdev is a swap partition on da0 (a single physical disk) that is connected 
to a Mylex AcceleRAID 170 RAID controller. The problem occurs when I copy a 
large number of files from FTP to another disk (da1) that is connected to 
the same RAID controller.


If the driver or controller is misbehaving it could explain both problems. 
Any chance you can get another disk in there on a different controller to 
dump onto?


Yes, I got an IDE disk and saved a kernel dump of another static hang state on 
it. Here is the dump:


ftp://oleg.vsi.ru/private/vmcore.0.zip

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: rrdtool performance tuning (fwd)

2007-10-30 Thread Oleg Derevenetz

[hmm, after thinking a bit I decided it would be more appropriate here, in
[EMAIL PROTECTED]

Dear colleagues,

any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?

machine is mostly IO-bound, showing 100% disk load with 8 or sometimes even 3
MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)


For example, the update algorithm can be changed. Try not to update RRD files 
simultaneously; instead, queue the update data (with timestamps), for example 
in memory, and periodically do a bulk update using a single rrdupdate call 
for all queued items related to a single RRD file. This saves a lot of I/O. 
For example, in my NMS (TclMon) I use a similar scheme, and it now updates 50K 
RRD files with 5-10 variables each. The I/O load is 100-150 tps with 1-1.5 MB/s 
throughput.
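
The queue-then-bulk-update scheme described above could look roughly like the
following. This is only a minimal Python sketch under stated assumptions, not
TclMon's actual code: the UpdateQueue class and its method names are
hypothetical, and it uses the "rrdtool update" form of the command, which
accepts several timestamp:value[:value...] arguments for one file in a
single call.

```python
from collections import defaultdict


class UpdateQueue:
    """Hypothetical in-memory queue of pending RRD samples, grouped per file."""

    def __init__(self):
        # Maps RRD file path -> list of (timestamp, "ts:v1:v2...") tuples.
        self._pending = defaultdict(list)

    def add(self, rrd_path, timestamp, values):
        """Remember one sample instead of writing it to disk immediately."""
        ts = int(timestamp)
        sample = ":".join([str(ts)] + [str(v) for v in values])
        self._pending[rrd_path].append((ts, sample))

    def flush_commands(self):
        """Build one 'rrdtool update' command per file and empty the queue."""
        commands = []
        for path, samples in self._pending.items():
            # rrdtool rejects samples that are not newer than the last update,
            # so pass them in increasing timestamp order.
            samples.sort(key=lambda item: item[0])
            commands.append(["rrdtool", "update", path] +
                            [sample for _, sample in samples])
        self._pending.clear()
        return commands
```

Each returned command can then be run from a periodic timer, for example with
subprocess.run(cmd, check=True). Very large batches may need to be split to
stay under the operating system's argument-length limit.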


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Fw: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-29 Thread Oleg Derevenetz

 Oleg, one thing you can do to make this less painful is to
 run your machine's console over serial port.

 First get a crossover serial cable, make sure it works from one
 box to another, it should be easy to run tip com1 on both
 boxes to ensure that it works.

 Then you just need to add console="comconsole" to /boot/loader.conf
 and your box's console should come over serial.

 Then on the machine watching the console, you can just do this:

 % script
 Script started, output file is typescript
 % tip com1
 ...do ddb stuff now...
 ...stop tip
 % exit

 Now you should have everything logged into a file called typescript,
 which should save you a big headache.

Thanks, I'll try it on Monday morning.


I posted a followup to kern/104406 that includes all the information listed in the Debugging Deadlocks chapter of the FreeBSD Developers' 
Handbook. Can anyone take a look at it and say whether this is certainly a hardware problem or some sort of software problem?


Anyone? :-)

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-22 Thread Oleg Derevenetz

  Can anyone take a look at PR kern/104406? I have a repeatable hang,
  but I can't obtain a kernel dump to get the output of all the show
  commands from here:

  http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

  After I break into the debugger with the Ctrl+Alt+Esc sequence and enter
  the panic command, the kernel does not write a kernel dump but seems to
  hang. Can anyone describe how to obtain a kernel dump in this situation,
  or at least say which show command output is needed first to debug this?
  The output of all the suggested commands is huge, and I am afraid of making
  a mistake when copying it from the screen to a sheet of paper and back :-)

 Oleg, one thing you can do to make this less painful is to
 run your machine's console over serial port.

 First get a crossover serial cable, make sure it works from one
 box to another, it should be easy to run tip com1 on both
 boxes to ensure that it works.

 Then you just need to add console="comconsole" to /boot/loader.conf
 and your box's console should come over serial.

 Then on the machine watching the console, you can just do this:

 % script
 Script started, output file is typescript
 % tip com1
 ...do ddb stuff now...
 ...stop tip
 % exit

 Now you should have everything logged into a file called typescript,
 which should save you a big headache.

Thanks, I'll try it on Monday morning.


I posted a followup to kern/104406 that includes all the information listed in the Debugging Deadlocks chapter of the FreeBSD Developers' 
Handbook. Can anyone take a look at it and say whether this is certainly a hardware problem or some sort of software problem?


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-21 Thread Oleg Derevenetz

  After I break into the debugger with the Ctrl+Alt+Esc sequence and enter
  the panic command, the kernel does not write a kernel dump but seems to
  hang. Can anyone describe how to obtain a kernel dump in this situation,
  or at least say which show command output is needed first to debug this?
  The output of all the suggested commands is huge, and I am afraid of making
  a mistake when copying it from the screen to a sheet of paper and back :-)

 Oleg, one thing you can do to make this less painful is to
 run your machine's console over serial port.

 First get a crossover serial cable, make sure it works from one
 box to another, it should be easy to run tip com1 on both
 boxes to ensure that it works.

 Then you just need to add console="comconsole" to /boot/loader.conf
 and your box's console should come over serial.

 Then on the machine watching the console, you can just do this:

 % script
 Script started, output file is typescript
 % tip com1
 ...do ddb stuff now...
 ...stop tip
 % exit

 Now you should have everything logged into a file called typescript,
 which should save you a big headache.

Thanks, I'll try it on Monday morning.

 As far as getting a dump from ddb, try this:

 ddb> call doadump

 I'm completely at a loss why this isn't a base ddb command "dump"
 but whatever... :)

Unfortunately, this doesn't work either. I called the duty personnel at the
datacenter and asked them to do this, and the person on duty told me that
after he entered this command, something like this appeared on the monitor:

db> call doadump
Dumping 3072 MB

Dump aborted error I/O
Dump failed. (Error 5)


Hmmm, it seems like you might be having a hardware problem,


It is possible, but unlikely:

1. I have similar symptoms on another AMD64 machine with 6.2 (uname -a from 
this machine is listed in PR kern/104406 in my followup of Wed, 7 Mar 2007 
05:10:59 +0300), but they are rare and this machine is in production, so I 
can't experiment with it;

2. All this hardware previously worked fine with FreeBSD 4.6.


what disk device do you have?


Dumpdev is a swap partition on da0 (a single physical disk) that is connected 
to a Mylex AcceleRAID 170 RAID controller. The problem occurs when I copy a 
large number of files from FTP to another disk (da1) that is connected to the 
same RAID controller.



Have you also enabled kernel dumps via /etc/rc.conf:dumpdev=?


Yes, I have dumpdev=AUTO in rc.conf and a swap device (4G) listed in 
/etc/fstab.


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Oleg Derevenetz
   Can anyone take a look at PR kern/104406? I have a repeatable hang,
   but I can't obtain a kernel dump to get the output of all the show
   commands from here:

   http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

   After I break into the debugger with the Ctrl+Alt+Esc sequence and enter
   the panic command, the kernel does not write a kernel dump but seems to
   hang. Can anyone describe how to obtain a kernel dump in this situation,
   or at least say which show command output is needed first to debug this?
   The output of all the suggested commands is huge, and I am afraid of making
   a mistake when copying it from the screen to a sheet of paper and back :-)

 This is a very easy to reproduce [ufs] uninterruptible deadlock
 on both RELENG_6 and RELENG_7. Look at this PR:
 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439

 The PR is closed but the problem is still here with 7.0-PRERELEASE
 and, perhaps, CURRENT.

This is probably another bug because:

1. I built a kernel with INVARIANTS as described on the Debugging Deadlocks
page of the FreeBSD Developers' Handbook and got no panic, only a deadlock;
2. I have no NTFS filesystem at all and just copy file(s) from FTP
to a local UFS using mc. In that PR the panic occurred when NTFS was mounted
r/w (and did NOT occur when the same NTFS was mounted r/o).

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Oleg Derevenetz
  Can anyone take a look at PR kern/104406? I have a repeatable hang,
  but I can't obtain a kernel dump to get the output of all the show
  commands from here:

  http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

  After I break into the debugger with the Ctrl+Alt+Esc sequence and enter
  the panic command, the kernel does not write a kernel dump but seems to
  hang. Can anyone describe how to obtain a kernel dump in this situation,
  or at least say which show command output is needed first to debug this?
  The output of all the suggested commands is huge, and I am afraid of making
  a mistake when copying it from the screen to a sheet of paper and back :-)

 Oleg, one thing you can do to make this less painful is to
 run your machine's console over serial port.

 First get a crossover serial cable, make sure it works from one
 box to another, it should be easy to run tip com1 on both
 boxes to ensure that it works.

 Then you just need to add console="comconsole" to /boot/loader.conf
 and your box's console should come over serial.

 Then on the machine watching the console, you can just do this:

 % script
 Script started, output file is typescript
 % tip com1
 ...do ddb stuff now...
 ...stop tip
 % exit

 Now you should have everything logged into a file called typescript,
 which should save you a big headache.

Thanks, I'll try it on Monday morning.

 As far as getting a dump from ddb, try this:

 ddb> call doadump

 I'm completely at a loss why this isn't a base ddb command "dump"
 but whatever... :)

Unfortunately, this doesn't work either. I called the duty personnel at the
datacenter and asked them to do this, and the person on duty told me that
after he entered this command, something like this appeared on the monitor:

db> call doadump
Dumping 3072 MB

Dump aborted error I/O
Dump failed. (Error 5)

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-19 Thread Oleg Derevenetz

Hi all,

Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the output of all the 
show commands from here:


http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

After I break into the debugger with the Ctrl+Alt+Esc sequence and enter the panic command, the kernel does not write a kernel dump 
but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show command output is 
needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making a mistake when copying it 
from the screen to a sheet of paper and back :-)


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: How to report bugs (Re: 6.2-STABLE deadlock?)

2007-04-25 Thread Oleg Derevenetz
Quoting Kris Kennaway [EMAIL PROTECTED]:

  Oleg Derevenetz wrote:
   Quoting LI Xin [EMAIL PROTECTED]:
  [...]
   I'm not very sure if this is specific to one disk controller. Actually
   I got some occasional reports about similar hangs on amd64 6.2-RELEASE
   (slightly patched version) where most of the processes were stuck in the
   'ufs' state, under very light load; the box was equipped with amr(4) RAID.
  
   I was not able to reproduce the problem in my lab, though, so it's still
   unknown how to trigger the livelock :-(  I still need to do some
   investigation on their production system.
   
   I reported a similar issue for FreeBSD 6.2 in the audit trail for
   kern/104406:
   
   http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=
   
   and there should be a thread related to this. Briefly, I suspect that
   this is related to the nullfs filesystems on my server: when I cvsuped to
   FreeBSD 6.2-STABLE with Daichi's unionfs-related patches and replaced the
   nullfs-mounted filesystems with unionfs-mounted ones (that was done
   10.03.07), the problem went away (or so it seems, at least).
  
  Hmm...  These seem to be different issues.  The problem reported to me was
  on a pgsql server (no nullfs/unionfs involved), and the hang always happens
  when it is not heavily loaded (usually in the morning, for instance, and
  there is no special configuration, like scheduled tasks which can generate
  disk load, etc., only the entropy harvesting), so this is quite confusing.
 
 Yes, a large part of the confusion is the unfortunate tendency of
 people to do the following:
 
 <user1> my system hangs/panics/etc
 <user2> my system hangs/panics/etc too; it must be the same problem!
 
 What we really need is for every FreeBSD user who encounters a
 hang/panic/etc to avoid jumping to conclusions -- no matter how many
 superficial similarities there may seem to you -- and instead go
 through the relevant steps described here:
 
  
 http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
 
 Until you (or a developer) have analyzed the resulting information,
 you cannot definitively determine whether or not your problem is the
 same as a given random other problem, and you may just confuse the
 issue by making claims of similarity when you are really reporting a
 completely separate problem.

Not everyone can do deadlock debugging, though. In my case, turning on 
INVARIANTS and WITNESS leads to an unacceptable performance penalty because the 
server is heavily loaded. So I can only describe my case, my actions, and the 
result, without providing any debug information.




Re: How to report bugs (Re: 6.2-STABLE deadlock?)

2007-04-25 Thread Oleg Derevenetz
Quoting Kris Kennaway [EMAIL PROTECTED]:

 On Wed, Apr 25, 2007 at 12:14:20PM +0400, Oleg Derevenetz wrote:
 
   Until you (or a developer) have analyzed the resulting information,
   you cannot definitively determine whether or not your problem is the
   same as a given random other problem, and you may just confuse the
   issue by making claims of similarity when you are really reporting a
   completely separate problem.
  
  Not everyone can do deadlock debugging, though. In my case, turning on
  INVARIANTS and WITNESS leads to an unacceptable performance penalty because
  the server is heavily loaded. So I can only describe my case, my actions,
  and the result, without providing any debug information.
 
 But you can still do *some* things, e.g. backtraces and/or a coredump:
 every little bit helps.
 
 Ultimately, though, you have to understand and accept that the less
 information you provide, the less chance there is that a developer
 will be able to track down your problem.  In fact a developer may have
 to effectively ignore your problem report altogether, because of what
 I explained about symptoms usually not being enough to tell one bug
 from another.
 
 In general, when you encounter a bug in FreeBSD, you have a little bit
 of work to do on your side before we can start doing the rest.  I
 understand that you may not be in a position to do that work, but that
 means you also need to understand that we can't do it either.

In fact, I solved (or worked around) this problem for myself, so in this thread 
I offer my workaround as a possible workaround for users who experience the 
same problem. This is only a hint for them, not a bug report for you. I cannot 
provide full (or even partial) debug information because I will not back out 
the cvsuped sources, will not replace unionfs with nullfs again, and will not 
wait a week or more for another hang.


Re: 6.2-STABLE deadlock?

2007-04-24 Thread Oleg Derevenetz
Quoting LI Xin [EMAIL PROTECTED]:

 Kostik Belousov wrote:
  On Mon, Apr 23, 2007 at 03:56:32AM +0100, Adrian Wontroba wrote:
  On Tue, Mar 13, 2007 at 02:08:48PM +, Adrian Wontroba wrote:
  At work, amongst my stable of old computers running FreeBSD, I have a
  Fujitsu M800 - a 4 Xeon SMP processor with 4 GB of memory. This
  primarily runs Nagios and a small and lightly used MySQL database, along
  with a few inbound FTP transfers per minute. It has a Mylex card based
  disc subsystem, ruling out crash dumps.
 
  At some point during 5.5-STABLE this machine started to occasionally hang ...
  Another 6-STABLE (cvsupped on 27/03/07) example, with diagnostics taken
  rather sooner after the hang.  Processes with wmesg=ufs feature often in
  the ps output.
 
  http://www.stade.co.uk/crash1/
  
  I would suspect the mlx controller. There are several processes (for
  instance, 988, 50918) waiting for completion of a block read, and the
  processes in the ufs state are the result of the lock cascade, IMHO.
 
 I'm not very sure if this is specific to one disk controller.  Actually
 I got some occasional reports about similar hangs on amd64 6.2-RELEASE
 (slightly patched version) where most of the processes were stuck in the
 'ufs' state, under very light load; the box was equipped with amr(4) RAID.
 
 I was not able to reproduce the problem in my lab, though, so it's still
 unknown how to trigger the livelock :-(  I still need to do some
 investigation on their production system.

I reported a similar issue for FreeBSD 6.2 in the audit trail for kern/104406:

http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=

and there should be a thread related to this. Briefly, I suspect that this is 
related to the nullfs filesystems on my server: when I cvsuped to FreeBSD 
6.2-STABLE with Daichi's unionfs-related patches and replaced the nullfs-mounted 
filesystems with unionfs-mounted ones (that was done 10.03.07), the problem 
went away (or so it seems, at least).


Re: Processes get stuck in ufs state

2007-03-25 Thread Oleg Derevenetz
Quoting Oleg Derevenetz [EMAIL PROTECTED]:

 On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
 
  Sometimes (once a week approximately) I have a problem with the same
  symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm)
  Processor 850:
 
  http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=
 
  Sometimes (apparently when CPU load suddenly goes up) all processes that
  interact with the disk get stuck in the ufs state, but in my case
  SIGSTOP/SIGCONT seemingly does not help.
 
  See the Developers' Handbook, Deadlock Debugging chapter, for instructions
  on what information should be gathered to debug the problem.
 
 OK, I built a kernel with debug options and will wait for a hang. By the
 way, with debug options turned on, I see this message on every
 boot while nullfs is being mounted:
 
 acquiring duplicate lock of same type: vnode interlock
  1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
  2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
 KDB: stack backtrace:
 kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
 witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
 _mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
 vrefcnt(cfd5c414) at vrefcnt+0x20
 null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
 null_lock(f02f1a68) at null_lock+0x66
 VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
 vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
 nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
 vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
 vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
 nmount(cfc60300,f02f1d04) at nmount+0x8b
 syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
 Xint0x80_syscall() at Xint0x80_syscall+0x1f
 --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---
 
 This host has nullfs filesystems. Can this be related to the deadlock?

FYI: after replacing the nullfs filesystems with unionfs (using the new unionfs 
implementation):

http://people.freebsd.org/~daichi/unionfs/

all deadlocks are gone. It seems to be a problem in the current nullfs 
implementation, but I can't debug it properly because the deadlocks are 
relatively rare and the machine that uses nullfs is heavily loaded, so the 
WITNESS and DEBUG options lead to an unacceptable performance penalty.


Re: Processes get stuck in ufs state

2007-03-09 Thread Oleg Derevenetz

On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:


Sometimes (once a week approximately) I have a problem with the same
symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm)
Processor 850:

http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=

Sometimes (apparently when CPU load suddenly goes up) all processes that
interact with the disk get stuck in the ufs state, but in my case
SIGSTOP/SIGCONT seemingly does not help.


See the Developers' Handbook, Deadlock Debugging chapter, for instructions on
what information should be gathered to debug the problem.


OK, I built a kernel with debug options and will wait for a hang. By the way, with debug options turned on, I see this message on every 
boot while nullfs is being mounted:


acquiring duplicate lock of same type: vnode interlock
1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
KDB: stack backtrace:
kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
_mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
vrefcnt(cfd5c414) at vrefcnt+0x20
null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
null_lock(f02f1a68) at null_lock+0x66
VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
nmount(cfc60300,f02f1d04) at nmount+0x8b
syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---

This host has nullfs filesystems. Can this be related to the deadlock?

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Processes get stuck in ufs state

2007-03-06 Thread Oleg Derevenetz
 0xfe30-0xfe300fff irq 32 at device 1.0 on pci32
pci33: ACPI PCI bus on pcib5
pci32: base peripheral, interrupt controller at device 1.1 (no driver attached)
pcib6: ACPI PCI-PCI bridge mem 0xfe302000-0xfe302fff irq 36 at device 2.0 on pci32
pci34: ACPI PCI bus on pcib6
pci32: base peripheral, interrupt controller at device 2.1 (no driver attached)
pcib7: ACPI PCI-PCI bridge mem 0xfe304000-0xfe304fff irq 40 at device 3.0 on pci32
pci35: ACPI PCI bus on pcib7
pci32: base peripheral, interrupt controller at device 3.1 (no driver attached)
pcib8: ACPI PCI-PCI bridge mem 0xfe306000-0xfe306fff irq 44 at device 4.0 on pci32
pci36: ACPI PCI bus on pcib8
pci32: base peripheral, interrupt controller at device 4.1 (no driver attached)
atkbdc0: Keyboard controller (i8042) port 0x60,0x64 irq 1 on acpi0
atkbd0: AT Keyboard irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sio0: 16550A-compatible COM port port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
fdc0: floppy drive controller port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: 1440-KB 3.5 drive on fdc0 drive 0
pmtimer0 on isa0
orm0: ISA Option ROMs at iomem 0xc-0xc7fff,0xc8000-0xc97ff,0xc9800-0xcafff,0xcb000-0xcefff on isa0
ppc0: parallel port not found.
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
Timecounters tick every 1.000 msec
IP Filter: v4.1.13 initialized.  Default = block all, Logging = enabled
Waiting 5 seconds for SCSI devices to settle
acd0: DMA limited to UDMA33, controller found non-ATA66 cable
acd0: DVDROM MATSHITADVD-ROM SR-8178/PZ21 at ata1-master UDMA33
ses0 at mpt0 bus 0 target 6 lun 0
ses0: SDR GEM318P 1 Fixed Processor SCSI-2 device
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
SMP: AP CPU #1 Launched!
da0 at mpt0 bus 0 target 0 lun 0
da0: SEAGATE ST373207LC 0003 Fixed Direct Access SCSI-3 device
da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 70007MB (143374744 512 byte sectors: 255H 63S/T 8924C)
da1 at mpt0 bus 0 target 2 lun 0
da1: SEAGATE ST336807LC 0C01 Fixed Direct Access SCSI-3 device
da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
Accounting enabled

I recently posted a followup to this PR with a description of the problem. Any 
ideas on how to debug this?


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru


