panic: pool_do_get: pfstitem free list modified

2016-06-09 Thread Alexey Suslikov
With the diff http://marc.info/?l=openbsd-tech&m=146468839001627&w=2

panic: pool_do_get: pfstitem free list modified: page 0xff01e8e2c0008
Stopped at  Debugger+0x9:   leave
   TID PID UID PRFLAGS PFLAGS  CPU  COMMAND
*44675  44675  0 0x14000  0x2103  softnet
Debugger() at Debugger+0x9
panic() at panic+0xfe
pool_runqueue() at pool_runqueue
pool_get() at pool_get+0xb5
pf_state_key_attach() at pf_state_key_attach+0x124
pf_state_insert() at pf_state_insert+0x1c6
pfsync_state_import() at pfsync_state_import+0x6ab
pfsync_in_ins() at pfsync_in_ins+0xe9
pfsync_input() at pfsync_input+0x23e
ipv4_input() at ipv4_input+0x37e
ipintr() at ipintr+0x1e
if_netisr() at if_netisr+0xea
taskq_thread() at taskq_thread+0x6c
end trace frame: 0x0, count: 2
http://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.

ddb{3}> show panic
pool_do_get: pfstitem free list modified: page 0xff01e8e2c000; item addr 0xff01e8e2cc90; offset 0x0=0x139e137bdb8816c8 != 0x139e136bdb8816c8

ddb{3}> trace
Debugger() at Debugger+0x9
panic() at panic+0xfe
pool_runqueue() at pool_runqueue
pool_get() at pool_get+0xb5
pf_state_key_attach() at pf_state_key_attach+0x124
pf_state_insert() at pf_state_insert+0x1c6
pfsync_state_import() at pfsync_state_import+0x6ab
pfsync_in_ins() at pfsync_in_ins+0xe9
pfsync_input() at pfsync_input+0x23e
ipv4_input() at ipv4_input+0x37e
ipintr() at ipintr+0x1e
if_netisr() at if_netisr+0xea
taskq_thread() at taskq_thread+0x6c
end trace frame: 0x0, count: -13

ddb{3}> show registers
rdi  0x1
rsi  0x292  hibernate_resume_vector_3+0x9
rbp   0x800022176a70
rbx   0x817204e8  __func__.3995+0x8a8
rdx  0
rcx   0x80068000
rax  0x1
r8   0x800022176990
r9   0
r10   0x800022176a00
r11   0x8108aeb0  comcnputc
r12  0x100  mptramp_gdt32_desc+0xde
r13   0x800022176a80
r14  0x2
r15  0
rip   0x8134e4d9  Debugger+0x9
cs   0x8
rflags 0x286  mptramp_gdt32_desc+0x264
rsp   0x800022176a60
ss  0x10
Debugger+0x9:   leave

ddb{3}> show all pool
Name  Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
ipsec policy 352  402 2 1 1 1 0 80
pfsync  72   4106260   410626 19065 19064 1 1 0 81
arp 56  1260   99 1 0 1 1 0 80
inpcbpl288  25615740  2561342   335   3171821 0 80
plimitpl   152  4450  414 2 0 2 2 0 80
rtentry112  2970  184 4 0 4 4 0 80
syncache   264  11433180  1143318 3 2 1 1 0 81
sackhl  24   260   262121 0 1 0 80
tcpqe   32  11546920  1154692 2 1 1 1 0 81
tcpcb  560  14588950  1458778   354   3441013 0 80
rttmr   72   510   472120 1 1 0 80
nd6 96   500   38 1 0 1 1 0 80
art_heap8  4096   100 1 0 1 1 0 80
art_heap4  256 10650  72725 32222 0 80
art_table   32 10660  727 3 0 3 3 0 80
art_node24  2970  201 1 0 1 1 0 80
ppxss  1128   100 1 0 1 1 0 80
pfstscr 40   7386220   738012   302   2921025 0 81
pffrag 112  1600  1602121 0 1 0220
pffrent 40  3220  3222121 0 1 0 80
pfosfp  40  8400  420 5 0 5 5 0 80
pfosfpen   112 14200  71021 02121 0 80
pfrke_plain 160 2170   90 6 0 6 6 0 80
pfrktable  1344   13046013021 4 0 4 4 0 80
pfruleitem  1683481083469 2 0 2 2 0 80
pfstitem24  51196160  5115944   140   1112936 0 80
pfstitem: pool(0x81940f20:pfstitem): free list modified: page 0xff08
pfstkey112  52155660  5211778  1000   871   129   174 0 80
pfstate312  52005070  5197063  9203  8899   304   474 0 80
pfrule 1336 1470   1813 21113 0 80
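
The two 64-bit values at the end of the "show panic" message above can be compared bit by bit. The following is a minimal sketch, not code from the kernel; the helper names `diff_bits` and `count_bits` are made up for illustration. Applied to the values from the report, the XOR comes out as 0x1000000000, i.e. the value found and the value expected differ in exactly one bit (bit 36).

```c
#include <stdint.h>

/* Return a mask of the bits that differ between the two values. */
static uint64_t
diff_bits(uint64_t found, uint64_t expected)
{
	return found ^ expected;
}

/* Count the set bits in v without relying on compiler builtins. */
static int
count_bits(uint64_t v)
{
	int n = 0;

	while (v != 0) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

A difference of a single bit does not by itself identify the cause, but it is a cheap check to run on any "free list modified" report before digging into the code.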

Re: pool_put uvm fault, amappl1 "page inconsistency"

2016-05-30 Thread Alexey Suslikov
On Mon, May 30, 2016 at 8:19 PM, Stefan Kempf <sisnk...@gmail.com> wrote:
> Somehow the page inconsistency seems truncated:
>
> Alexey Suslikov wrote:
>
>> ddb{2}> show all pool
>> Name  Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
>> [...]
>> amappl1 7299331095371   559   48574   103 0 8 0
>> amappl1: pool(0x81974640:amappl1): page inconsistency: page 0xff01e0
>
> The error message looks odd to me.
> kern/subr_pool.c prints additional info after the 'page inconsistency'.
>
> That seems to be missing in all your 'page inconsistency' bug reports.
> Not sure why this output is not captured here, but there should also be
> something like "at page head addr", "item ordinal", "on list, missing"
> after the page address.

I don't know what to say. It's a copy-paste from kvm output.

>> OpenBSD 6.0-beta (GENERIC.MP) #0: Mon May 30 14:47:09 EEST 2016
>> ***@***:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Can you reproduce these page inconsistency problems with a snapshot kernel?
> Is this truly a kernel built from clean original sources?

Yes, it is. For sure. I'm not interested in obfuscation.

Plus, experimental code in snapshots can mangle results. And
it is still a carp backup machine...



pool_put uvm fault, amappl1 "page inconsistency"

2016-05-30 Thread Alexey Suslikov
uvm_fault(0x81910900, 0xff11ea28e888, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  pool_put+0x1dd: xorq    0x8(%rax),%rcx

ddb{2}> show panic
the kernel did not panic

ddb{2}> trace
pool_put() at pool_put+0x1dd
uvm_unmap_detach() at uvm_unmap_detach+0x61
uvmspace_exec() at uvmspace_exec+0xc9
sys_execve() at sys_execve+0x5eb
syscall() at syscall+0x27b
--- syscall (number 59) ---
end of kernel
end trace frame: 0x7f7df330, count: -5
0x9094941923a:

ddb{2}> show registers
rdi   0x
rsi   0xff01ea28e958
rbp   0x8000223238f0
rbx   0xff01ea28ef90
rdx   0x818d02fd2fb02fa4
rcx   0x818d02fd2fb02fa4
rax   0xff11ea28e880
r8   0
r9   0xff0106d5fd00
r10 0x73  mptramp_gdt32_desc+0x51
r11  0x1
r12   0xff01ea28e958
r13   0x81974640  uvm_small_amap_pool
r14  0
r15   0x1000  __ALIGN_SIZE
rip   0x811aa71d  pool_put+0x1dd
cs   0x8
rflags   0x10287  mptramp_longmode+0x1df
rsp   0x800022323850
ss  0x10
pool_put+0x1dd: xorq    0x8(%rax),%rcx

ddb{2}> show all pool
Name  Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
ipsec policy 352  200 1 0 1 1 0 80
pfsync  72 45490 4549   248   247 1 1 0 81
arp 56   2801 1 0 1 1 0 80
inpcbpl2882515002492119 11818 0 80
plimitpl   152   460   15 2 0 2 2 0 80
rtentry112  10902 4 0 4 4 0 80
syncache   26411526011525 2 1 1 1 0 80
sackhl  24505 3 3 0 1 0 80
tcpqe   3211521011521 2 1 1 1 0 81
tcpcb  5601475301463915 6 910 0 80
nd6104   1101 1 0 1 1 0 80
art_heap8  4096   100 1 0 1 1 0 80
art_heap4  256  2910019 01919 0 80
art_table   32  29200 3 0 3 3 0 80
art_node24  1090   19 1 0 1 1 0 80
ppxss  1128   100 1 0 1 1 0 80
pfstscr 40 55150 5415 4 1 3 3 0 80
pfosfp  40  8400  420 5 0 5 5 0 80
pfosfpen   112 14200  71021 02121 0 80
pfrke_plain 160 10702 5 0 5 5 0 80
pfrktable  1344 1770  152 4 0 4 4 0 80
pfruleitem  16  9160  912 2 1 1 2 0 80
pfstitem2450180049568 7 0 7 7 0 80
pfstkey1125116205048835 82732 0 80
pfstate31251084050445   137825588 0 80
pfrule 1336 1450   1613 11213 0 80
semupl 11238854038854 1 0 1 1 0 81
semapl 112100 1 0 1 1 0 80
shmpl  112100 1 0 1 1 0 80
dirhash1024  560   25 5 1 4 4 0 80
dino1pl128 28180  99276175959 0 80
ffsino 240 28180  992   12517   108   108 0 80
nchpl  144 40670 1029   13421   113   113 0 80
rtmask  32300 1 0 1 1 0 80
uvmvnodes   72 28340052 05252 0 80
vnodes 200 283400   150 0   150   150 0 80
namei  1024   27305027304   224   223 1 2 0 80
uhcixfer   264  1080  102 1 0 1 1 0 80
wdcxfer176606 1 1 0 1 0 80
pfiaddrpl  120500 1 0 1 1 0 80
ehcixfer   264   780   74 1 0 1 1 0 80
scxspl 192 90420 9042   362   361 1 2 0 81
sigapl 432 11630 11133529 6 7 0 80
knotepl11213275013078 7 1 6 7 0 80
kqueuepl   320   210
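
In this report the address passed to uvm_fault lines up with the memory operand of the instruction where the kernel stopped: the operand is 0x8(%rax), and rax holds 0xff11ea28e880 (addresses as printed in this archive). A minimal sketch of that arithmetic, with a made-up helper name `effective_addr`:

```c
#include <stdint.h>

/* Effective address of a base+displacement operand such as 0x8(%rax). */
static uint64_t
effective_addr(uint64_t base, int64_t disp)
{
	return base + (uint64_t)disp;
}
```

With base 0xff11ea28e880 and displacement 0x8 this gives 0xff11ea28e888, the faulting address in the uvm_fault line. The same relation holds for the 2016-05-12 pool_put report, where rax is 0x10 and the fault address is 0x18.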

pool_put uvm fault

2016-05-12 Thread Alexey Suslikov
machine acts as carp BACKUP in MASTER/BACKUP setup and MASTER has
never crashed like that. notice "page inconsistency" in "show all pools".


login: uvm_fault(0xff022f69f700, 0x18, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  pool_put+0x1dd: xorq    0x8(%rax),%rcx
ddb{3}>


ddb{3}> show panic
the kernel did not panic


ddb{3}> trace
pool_put() at pool_put+0x1dd
m_free() at m_free+0x60
soreceive() at soreceive+0xba4
recvit() at recvit+0x139
sys_recvfrom() at sys_recvfrom+0x9d
syscall() at syscall+0x368
--- syscall (number 29) ---
end of kernel
end trace frame: 0x625b29c4800, count: -6
0x626016de63a:


ddb{3}> show registers
rdi   0x8194c0a0  mbpool+0xa0
rsi   0xff022378e430
rbp   0x8000221fbbe0
rbx   0xff022378edd0
rdx   0x248bdd84cb423a17
rcx   0x248bdd84cb423a17
rax 0x10
r8   0x7f7fc000
r9   0x8000221fbe0c
r10   0x8000221fbd40
r11  0x1
r12   0xff0089c03000
r13   0x8194c000  mbpool
r14   0xff0089c03300
r15 0x84  mptramp_gdt32_desc+0x62
rip   0x811a5a1d  pool_put+0x1dd
cs   0x8
rflags   0x10286  mptramp_longmode+0x1de
rsp   0x8000221fbb40
ss  0x10
pool_put+0x1dd: xorq    0x8(%rax),%rcx


ddb{3}> show all pools
Name  Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
ipsec policy 352  200 1 0 1 1 0 80
pfsync  7287019087019  4103  4102 1 1 0 81
arp 56   380   11 1 0 1 1 0 80
inpcbpl288   6988380   69859289701922 0 80
plimitpl   152  1290   95 2 0 2 2 0 80
rtentry184  1460   37 6 0 6 6 0 80
syncache   248   2389680   238968 4 3 1 2 0 81
sackhl  24   110   11 8 8 0 1 0 80
tcpqe   32   2407250   240725 2 1 1 1 0 81
tcpcb  560   3050350   304906   113   1031014 0 80
rttmr   72   150   15 8 8 0 1 0 80
nd6104   230   11 1 0 1 1 0 80
ppxss  1128   100 1 0 1 1 0 80
pfstscr 40336480334803430 411 0 80
pffrag 112   510   51 8 8 0 1 0220
pffrent 40  1020  102 8 8 0 1 0 80
pfosfp  40  8400  420 5 0 5 5 0 80
pfosfpen   112 14200  71021 02121 0 80
pfrke_plain 160 17602 8 0 8 8 0 80
pfrktable  134427370 2713 4 1 3 4 0 80
pfruleitem  1616752016746 4 2 2 4 0 80
pfstitem24   9468650   94610649381141 0 80
pfstkey104   9668700   966111   333   29934   177 0 80
pfstate312   9606330   959848  2059  198376   516 0 80
pfrule 1336 1470   1613 11213 0 80
semupl 112   7998190   799819 1 0 1 1 0 81
semapl 112100 1 0 1 1 0 80
shmpl  112100 1 0 1 1 0 80
dirhash1024   93921093507  1637  157463   226 0 80
dino1pl128  41637390  4108055  1797 0  1797  1797 0 80
ffsino 240  41637390  4108055  3276 0  3276  3276 0 80
nchpl  144  18160540  1811004   188 0   188   188 0 80
rtmask  32   2600 1 0 1 1 0 80
uvmvnodes   725569500  1013 0  1013  1013 0 80
vnodes 2005569500  2932 0  2932  2932 0 80
namei  1024 15227234   0 15227234  3864  3863 1 2 0 81
uhcixfer   264  1080  102 1 0 1 1 0 80
wdcxfer176606 1 1 0 1 0 80
pfiaddrpl  120500 1 0 1 1 0 80
ehcixfer   264   840   80 1 0 1 1 0 80
scxspl 192  78317950  7831795  7031  7030 1 2 0 81
sigapl 43213697013634  4556  4549 7 9 0 80
knotepl112   2888340   288636 7 

pool_do_get uvm fault

2016-05-12 Thread Alexey Suslikov
machine acts as carp BACKUP in MASTER/BACKUP setup and MASTER has
never crashed like that. notice "page inconsistency" in "show all pools".


uvm_fault(0x81909940, 0xff109c941a00, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  pool_do_get+0x90:   movq    0(%r13),%rdi
ddb{2}>


ddb{2}> show panic
the kernel did not panic


ddb{2}> trace
pool_do_get() at pool_do_get+0x90
pool_get() at pool_get+0xb5
m_get() at m_get+0x28
m_dup_pkt() at m_dup_pkt+0x59
carp_input() at carp_input+0x17c
if_input_process() at if_input_process+0xcd
taskq_thread() at taskq_thread+0x6c
end trace frame: 0x0, count: -7


ddb{2}> show registers
rdi  0x7
rsi   0x83978492b2dd8e99
rbp   0x800022176cc0
rbx   0xff022d2f18d0
rdx   0x800022176d24
rcx   0x80067000
rax   0x67592a7da6f1a793
r8   0x80797e00
r9   0x1
r10  0x1
r11   0xff00cd740a6a
r12   0x8194c000  mbpool
r13   0xff109c941a00
r14  0x2
r15  0x2
rip   0x811a5340  pool_do_get+0x90
cs   0x8
rflags   0x10286  mptramp_longmode+0x1de
rsp   0x800022176c70
ss  0x10
pool_do_get+0x90:   movq    0(%r13),%rdi


ddb{2}> show all pools
Name  Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
ipsec policy 352  200 1 0 1 1 0 80
pfsync  7286343086343  4219  4218 1 1 0 81
arp 56   430   16 1 0 1 1 0 80
inpcbpl288   7016510   701406   134   1161821 0 80
plimitpl   152  1290   95 2 0 2 2 0 80
rtentry184  1290   21 7 1 6 6 0 80
syncache   248   2401770   240177 2 1 1 1 0 81
sackhl  24   100   101010 0 1 0 80
tcpqe   32   2426860   242686 2 1 1 1 0 81
tcpcb  560   3066160   30648786761012 0 80
rttmr   72808 2 2 0 1 0 80
nd6104   1201 1 0 1 1 0 80
ppxss  1128   100 1 0 1 1 0 80
pfstscr 40   2338610   233851   117   116 130 0 80
pffrag 112   470   47 6 6 0 1 0220
pffrent 40   940   94 6 6 0 1 0 80
pfosfp  40  8400  420 5 0 5 5 0 80
pfosfpen   112 14200  71021 02121 0 80
pfrke_plain 160 17602 8 0 8 8 0 80
pfrktable  134427540 2730 4 1 3 4 0 80
pfruleitem  1615044015039 3 1 2 3 0 80
pfstitem24  10797950  10792693428 620 0 80
pfstkey104  10997890  1099263   382   3671585 0 80
pfstate312  10992780  1098729  2387  234443   268 0 80
pfrule 1336 1470   1613 11213 0 80
semupl 112   8046640   804664 1 0 1 1 0 81
semapl 112100 1 0 1 1 0 80
shmpl  112100 1 0 1 1 0 80
dirhash1024   82402081988  1648  159256   225 0 80
dino1pl128  31620480  3106408  1795 0  1795  1795 0 80
ffsino 240  31620480  3106408  3274 0  3274  3274 0 80
nchpl  144  14606050  1455646   186 1   185   185 0 80
rtmask  32   2500 1 0 1 1 0 80
uvmvnodes   725564900  1012 0  1012  1012 0 80
vnodes 2005564900  2929 0  2929  2929 0 80
namei  1024 12038132   0 12038132  4573  4572 1 2 0 81
uhcixfer   264  1080  102 1 0 1 1 0 80
wdcxfer176606 1 1 0 1 0 80
pfiaddrpl  120500 1 0 1 1 0 80
ehcixfer   264   840   80 1 0 1 1 0 80
scxspl 192  58499240  5849924  7676  7675 1 2 0 81
sigapl 43213760013697   273   265 8 9 0 80
knotepl112   2693370   269139 7 0 7 7 0 80

Re: bsd.rd from snapshot not booting on x220

2015-07-24 Thread Alexey Suslikov
danielk danielk_lists at z9d.de writes:

 OpenBSD 5.7-current (GENERIC.MP) #969: Fri May  1 00:36:03 MDT 2015

May 1 is too old. Try a newer one.



Re: Elantech Clickpad (pms0) stopped working on current

2015-06-07 Thread Alexey Suslikov
Stefan Sperling stsp at stsp.name writes:

 - if (((fw_version  0x0f)  16)  6)
 + if (((fw_version  0x0f)  16) != 6 
 + (fw_version  0x0f)  16 != 8)

fw_version  0x0f)  16 is repeating twice
(at least).

Would it be a good idea to turn the above magic into a macro
with a meaningful name?
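
A minimal sketch of what such a macro could look like. The archive dropped the operators (and possibly trailing zeros) from the quoted diff, so the mask 0x0f0000 and the shift by 16 are assumptions here, and all names (`ELANTECH_IC_VERSION`, `elantech_ic_supported`) are hypothetical rather than taken from the driver.

```c
#include <stdint.h>

/*
 * Assumed reconstruction of the repeated expression: mask out one
 * nibble of the firmware version and shift it down.  The mask and
 * shift values are guesses, since the quoted diff is garbled.
 */
#define ELANTECH_IC_MASK	0x0f0000
#define ELANTECH_IC_SHIFT	16
#define ELANTECH_IC_VERSION(fw)	\
	(((fw) & ELANTECH_IC_MASK) >> ELANTECH_IC_SHIFT)

/* Return 1 if the IC version is one the new check accepts (6 or 8). */
static int
elantech_ic_supported(uint32_t fw_version)
{
	uint32_t ic = ELANTECH_IC_VERSION(fw_version);

	return ic == 6 || ic == 8;
}
```

Naming the field once keeps the two comparisons in the proposed diff from drifting apart if the mask ever changes.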



Re: OpenBSD-release v5.6 amd64 azalia kernel panic

2014-12-11 Thread Alexey Suslikov
tixx at openmailbox.org writes:

 # cat -n azalia.c | head -n 2352 | tail -n 1
 2352   index = w->connections[w->selected];

Could you place a debug printf and dump w->type, w->selected and the
size of w->connections before the above line of code?
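
Something along these lines might do. This is only a sketch: `struct widget` below is a hypothetical stand-in for the driver's widget structure, and the `nconnections` field used as the "size of connections" is an assumption about how the driver tracks it.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver's widget structure. */
struct widget {
	int type;
	int selected;
	int nconnections;	/* assumed count of valid entries */
	int connections[8];
};

/*
 * Format the fields of interest into buf.
 * Returns 1 if selected is a valid index into connections, else 0.
 */
static int
widget_debug(const struct widget *w, char *buf, size_t len)
{
	snprintf(buf, len, "type=%d selected=%d nconnections=%d",
	    w->type, w->selected, w->nconnections);
	return w->selected >= 0 && w->selected < w->nconnections;
}
```

In the kernel the string would simply be passed to printf right before the indexing line; a return value of 0 would point at an out-of-range index.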



Re: kernel panic on 5.3 running under RedHat KVM in virtio mode

2013-05-21 Thread Alexey Suslikov
 virtio0 at pci0 dev 3 function 0 Qumranet Virtio Network rev 0x00: Virtio Network Device
 vio0 at virtio0: address 52:54:00:00:1c:a9

Could you switch network adapter emulation to Intel PRO/1000 to
see if it helps?



Re: Kernel panic on current amd64 running under Ubuntu KVM (patch included)

2013-05-20 Thread Alexey Suslikov
On Mon, May 20, 2013 at 8:42 PM, Mike Larkin mlar...@azathoth.net wrote:
 On Mon, May 20, 2013 at 05:11:35PM +, Alexey E. Suslikov wrote:
 Theo de Raadt deraadt at cvs.openbsd.org writes:

  If these VM's are real VM's the should start emulating the machines
  they claim to be emulating correctly, or they should start advertising
  that they are something different, so that we can isolate the bullshit
  factor.

 Ok. I see.

 Could we trim that down to the following?

 --- sys/arch/amd64/amd64/identcpu.c.orig  Mon May 20 19:58:06 2013
 +++ sys/arch/amd64/amd64/identcpu.c   Mon May 20 20:01:08 2013
 @@ -127,6 +127,7 @@
   { CPUIDECX_AVX, "AVX" },
   { CPUIDECX_F16C,"F16C" },
   { CPUIDECX_RDRAND,  "RDRAND" },
 + { CPUIDECX_HV,  "HV" },
  }, cpu_ecpuid_ecxfeatures[] = {
   { CPUIDECX_LAHF,"LAHF" },
   { CPUIDECX_CMPLEG,  "CMPLEG" },
 --- sys/arch/amd64/include/specialreg.h.orig  Mon May 20 20:01:56 2013
 +++ sys/arch/amd64/include/specialreg.h   Mon May 20 20:06:09 2013
 @@ -158,6 +158,7 @@
  #define  CPUIDECX_AVX    0x10000000  /* Advanced Vector Extensions */
  #define  CPUIDECX_F16C   0x20000000  /* 16bit fp conversion  */
  #define  CPUIDECX_RDRAND 0x40000000  /* RDRAND instruction  */
 +#define  CPUIDECX_HV     0x80000000  /* Hypervisor presence */

  /*
   * Structured Extended Feature Flags Parameters (CPUID function 0x7, leaf 0)


 That's certainly less objectionable but I'm not sure what useful information
 this diff provides.

When seen in dmesg, the HV flag indicates that the operating system is running
under a hypervisor, and that weird things are possible in kernel code which
depends on CPU features.

After all, it is kinda documented by AMD on page 570 of
http://support.amd.com/us/Processor_TechDocs/24594_APM_v3.pdf

(AMD named it RAZ, but I gave it a meaningful name as in FreeBSD. Should we
put a reference to the above-mentioned document near the define?)
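
To illustrate how such a flag would end up in the dmesg feature list, here is a small self-contained sketch of a table-driven decoder. It is not the kernel's code: the table and `decode_ecx` are made up for the example, and the bit values are the architectural positions of these flags in CPUID function 1 ECX (bits 28 to 31).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CPUIDECX_AVX	0x10000000U	/* bit 28 */
#define CPUIDECX_F16C	0x20000000U	/* bit 29 */
#define CPUIDECX_RDRAND	0x40000000U	/* bit 30 */
#define CPUIDECX_HV	0x80000000U	/* bit 31: hypervisor present */

static const struct {
	uint32_t bit;
	const char *name;
} ecx_features[] = {
	{ CPUIDECX_AVX,		"AVX" },
	{ CPUIDECX_F16C,	"F16C" },
	{ CPUIDECX_RDRAND,	"RDRAND" },
	{ CPUIDECX_HV,		"HV" },
};

/* Write the names of all flags set in ecx to buf, comma separated. */
static void
decode_ecx(uint32_t ecx, char *buf, size_t len)
{
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(ecx_features) / sizeof(ecx_features[0]); i++) {
		if ((ecx & ecx_features[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, ecx_features[i].name, len - strlen(buf) - 1);
	}
}
```

For an ECX value with bits 28 and 31 set, the decoder yields "AVX,HV", which is the kind of hint in the cpu0 line that the diff is after.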



Cannot open whatis database for `OpenBSD Current'

2012-01-12 Thread Alexey Suslikov
hello bugs@

http://www.openbsd.org/cgi-bin/man.cgi says Cannot open whatis
database for `OpenBSD Current' if Keyword Search is chosen.

regards,
alexey



Re: kernel/6486: MCLGETI breaks re(4)

2010-10-15 Thread Alexey Suslikov
Claudio Jeker wrote:

 On Wed, Oct 13, 2010 at 03:33:27PM +0600, Anton Maksimenkov wrote:
  2010/10/13 Claudio Jeker clau...@openbsd.org:
   Most probably re(4) was unable to allocate
   clusters and now the RX ring is empty and stuck, in the worst case you hit
   an interrupt storm.
 
  BTW, in worst case (I used ping -f, just can't find anything useful to
  generate many small packets) re generate about 5000 interrupts.
  While vr on same machine can do about 20 000 interrupts and the
  machine still stay.
 

 As Henning mentioned, number of interrupts do not matter at all. My re(4)

It does. Bug is triggered at 15+ interrupts per second (according to systat).

 is happyly pumping 75kpps bidirectional with 6500 interrupts per second.
 The system is still mostly idle. ping -f is a bad test as it is a ping
 pong protocol and not mad to max out a line.

In our test we didn't use ping, but sipp making 200-400 simultaneous
calls to an Asterisk box.

Alexey



Re: kernel/6486: MCLGETI breaks re(4)

2010-10-15 Thread Alexey Suslikov
On Fri, Oct 15, 2010 at 17:30, Alexey Suslikov
alexey.susli...@gmail.com wrote:
 Claudio Jeker wrote:

 On Wed, Oct 13, 2010 at 03:33:27PM +0600, Anton Maksimenkov wrote:
  2010/10/13 Claudio Jeker clau...@openbsd.org:
   Most probably re(4) was unable to allocate
   clusters and now the RX ring is empty and stuck, in the worst case you 
   hit
   an interrupt storm.
 
  BTW, in worst case (I used ping -f, just can't find anything useful to
  generate many small packets) re generate about 5000 interrupts.
  While vr on same machine can do about 20 000 interrupts and the
  machine still stay.
 

 As Henning mentioned, number of interrupts do not matter at all. My re(4)

 It does. Bug is triggered at 15+ interrupts per second (according to systat).

Should be read as 15k+ interrupts per second.


 is happyly pumping 75kpps bidirectional with 6500 interrupts per second.
 The system is still mostly idle. ping -f is a bad test as it is a ping
 pong protocol and not mad to max out a line.

 In our test we didn't use ping, but sipp making 200-400 simultaneous
 calls to an Asterisk box.

 Alexey



Re: kernel/6486: MCLGETI breaks re(4)

2010-10-13 Thread Alexey Suslikov
Claudio Jeker wrote:

 From my test:

  netstat -m
 20/128034/614400 mbuf 2048 byte clusters in use (current/peak/max)

 As you can see during my tcpbench test I peaked at 128034 active clusters
 which is way more then the 6144 setup by default. Oh and just for the
 kicks:
 Memory resource pool statistics
 Name   Size Requests Fail InUse Pgreq Pgrel Npage Hiwat Minpg  Maxpg  Idle
 mbpl    256 68895154   10    52  9645     0  9645  9645     1  38400  9641
 mcl2k  2048  7987962   11    20 64017     0 64017 64017     4 307200 64006

 As you can see my i386 had mbuf and mcluster failures because I run the
 kernel out of kvm (this would not have happend if I increased
 kern.maxclusters a bit more carefully).

 Anyway, you need to properly tune your system to handle 1000 and more
 TCP connections.
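
For a sense of scale, the cluster counts from the netstat -m line above can be turned into bytes. A minimal sketch, with a made-up helper name `cluster_bytes`; it only multiplies the count by the 2048-byte cluster size and ignores any per-page or per-pool overhead.

```c
#include <stdint.h>

/* Memory taken by nclusters clusters of cluster_size bytes each. */
static uint64_t
cluster_bytes(uint64_t nclusters, uint64_t cluster_size)
{
	return nclusters * cluster_size;
}
```

The peak of 128034 clusters comes to 262213632 bytes (about 250 MB), and the configured maximum of 614400 clusters to 1258291200 bytes (1200 MB).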

Hello Claudio.

In my case I just replaced re(4) with bnx(4), without any tuning, and the router
can handle 70k interrupts without issues.

http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&numbers=6419
says so:

The same board equipped with a dual-port intel em card in its PCI slot is
able to forward 60k packets and will not crash with more.

Alexey

OpenBSD 4.7-beta (GENERIC.MP) #82: Fri Feb  5 01:05:44 MST 2010
t...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 3720871936 (3548MB)
avail mem = 3615391744 (3447MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.5 @ 0xf06d0 (63 entries)
bios0: vendor American Megatrends Inc. version 0302 date 06/01/2009
bios0: ASUSTeK Computer INC. P5QL-VM EPU
acpi0 at bios0: rev 0
acpi0: tables DSDT FACP APIC MCFG OEMB HPET GSCI SSDT
acpi0: wakeup devices P0P2(S4) P0P3(S4) P0P1(S4) UAR1(S4) PS2K(S4)
PS2M(S4) USB0(S4) USB1(S4) USB2(S4) USB3(S4) EUSB(S4) USB4(S4)
USB5(S4) USB6(S4) USBE(S4) P0P4(S4) P0P5(S4) P0P6(S4) P0P7(S4)
P0P8(S4) P0P9(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz, 2930.81 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR,NXE,LONG
cpu0: 3MB 64b/line 8-way L2 cache
cpu0: apic clock running at 266MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz, 2930.40 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16,xTPR,NXE,LONG
cpu1: 3MB 64b/line 8-way L2 cache
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (P0P3)
acpiprt2 at acpi0: bus 5 (P0P1)
acpiprt3 at acpi0: bus 4 (P0P4)
acpiprt4 at acpi0: bus -1 (P0P8)
acpiprt5 at acpi0: bus 3 (P0P9)
acpicpu0 at acpi0: PSS
acpicpu1 at acpi0: PSS
aibs0 at acpi0
acpibtn0 at acpi0: PWRB
cpu0: Enhanced SpeedStep 2930 MHz: speeds: 2936, 2670, 2403, 2136,
1870, 1603 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 Intel G45 Host rev 0x03
ppb0 at pci0 dev 1 function 0 Intel G45 PCIE rev 0x03: apic 2 int 16 (irq
10)
pci1 at ppb0 bus 1
ppb1 at pci1 dev 0 function 0 ServerWorks PCIE-PCIX rev 0xc3
pci2 at ppb1 bus 2
bnx0 at pci2 dev 0 function 0 Broadcom BCM5708 rev 0x12: apic 2 int
16 (irq 10)
vga1 at pci0 dev 2 function 0 Intel G45 Video rev 0x03
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0xe000, size 0x1000
inteldrm0 at vga1: apic 2 int 16 (irq 10)
drm0 at inteldrm0
uhci0 at pci0 dev 26 function 0 Intel 82801JI USB rev 0x00: apic 2
int 16 (irq 10)
uhci1 at pci0 dev 26 function 1 Intel 82801JI USB rev 0x00: apic 2
int 21 (irq 14)
uhci2 at pci0 dev 26 function 2 Intel 82801JI USB rev 0x00: apic 2
int 18 (irq 15)
ehci0 at pci0 dev 26 function 7 Intel 82801JI USB rev 0x00: apic 2
int 18 (irq 15)
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb2 at pci0 dev 28 function 0 Intel 82801JI PCIE rev 0x00: apic 2
int 17 (irq 11)
pci3 at ppb2 bus 4
ppb3 at pci0 dev 28 function 5 Intel 82801JI PCIE rev 0x00: apic 2
int 16 (irq 10)
pci4 at ppb3 bus 3
re0 at pci4 dev 0 function 0 Realtek 8168 rev 0x01: RTL8168 2
(0x3800), apic 2 int 17 (irq 11), address 90:e6:ba:b6:ec:6e
rgephy0 at re0 phy 7: RTL8169S/8110S PHY, rev. 2
uhci3 at pci0 dev 29 function 0 Intel 82801JI USB rev 0x00: apic 2
int 23 (irq 3)
uhci4 at pci0 dev 29 function 1 Intel 82801JI USB rev 0x00: apic 2
int 19 (irq 5)
uhci5 at pci0 dev 29 function 2 Intel 82801JI USB rev 0x00: apic 2
int 18 (irq 15)
ehci1 at pci0 dev 29 function 7 Intel 82801JI USB rev 0x00: apic 2
int 23 (irq 3)
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 Intel EHCI root hub rev 2.00/1.00 addr 1
ppb4 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0x90
pci5 at ppb4 bus 5
rl0 at 

Re: MCLGETI breaks re(4)

2010-10-07 Thread Alexey Suslikov
Anton,

Looks like you don't have enough interrupts to trigger the bug on pre-MCLGETI.

Take a look at PR 6419: our box also can't survive 15k with re(4),
yet is rock stable at 70k with bnx(4).

OpenBSD 4.7-beta (GENERIC.MP) #82: Fri Feb  5 01:05:44 MST 2010
t...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

re0 at pci4 dev 0 function 0 Realtek 8168 rev 0x01: RTL8168 2
(0x3800), apic 2 int 17 (irq 11), address 90:e6:ba:b6:ec:6e

Alexey

Anton Maksimenkov wrote:

 Description:
 When system reaches thousand connections, machine becomes unresponsive.
 Though it's possible to break into ddb (stack traces differ from time
 to time).

 Original report: http://marc.info/?l=openbsd-misc&m=128426958003630&w=2
 How-To-Repeat:
 Run any network stress program that opens about a thousand connections.
 Fix:
 Use pre-MCLGETI version of the driver.



systat -b vmstat

2010-07-10 Thread Alexey Suslikov
Hello b...@.

In contrast to other views, systat -b vmstat only shows

   2 usersLoad 1.26 0.94 0.65

Is it normal behavior?

In fact, I'm interested in current irq rate on particular device
(vmstat -i gives 5 sec averages which is not the same).

Alexey
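
One way to get a current rate rather than an average is to sample a device's interrupt counter twice and divide the difference by the elapsed time. The sketch below shows only that arithmetic; `irq_rate` is a hypothetical helper, and how the two counter values are obtained (for example from the totals that vmstat -i prints) is left out.

```c
#include <stdint.h>

/*
 * Interrupts per second between two samples of a monotonically
 * increasing counter.  Returns 0.0 for a non-positive interval or
 * a counter that went backwards.
 */
static double
irq_rate(uint64_t count_prev, uint64_t count_now, double seconds)
{
	if (seconds <= 0.0 || count_now < count_prev)
		return 0.0;
	return (double)(count_now - count_prev) / seconds;
}
```

For example, a counter that goes from 100000 to 115000 within one second corresponds to 15000 interrupts per second.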