Re: NFS issue with 10.0_BETA

2023-02-28 Thread Mark Davies




On 28/02/23 19:05, Michael van Elst wrote:

That's the truncate operation where things fail. Somewhere
the NFS call must be different. Please dump the full RPC
call and reply.



linux->netbsd10
09:45:24.673875 IP (tos 0x0, ttl 64, id 40711, offset 0, flags [DF], 
proto TCP (6), length 264)
city-art.ecs.vuw.ac.nz.782 > paramount.ecs.vuw.ac.nz.shilp: Flags 
[P.], cksum 0x05ad (correct), seq 1240:1452, ack 829, win 501, options 
[nop,nop,TS val 3429007002 ecr 786], length 212: NFS request xid 
3426046061 208 setattr fh 168,6/629417642
09:45:24.673894 IP (tos 0x0, ttl 64, id 35365, offset 0, flags [DF], 
proto TCP (6), length 200)
paramount.ecs.vuw.ac.nz.shilp > city-art.ecs.vuw.ac.nz.782: Flags 
[P.], cksum 0x35d7 (correct), seq 829:977, ack 1452, win 15441, options 
[nop,nop,TS val 786 ecr 3429007002], length 148: NFS reply xid 
3426046061 reply ok 144 setattr ERROR: Permission denied PRE: sz 8 mtime 
1677530437.104914503 ctime 1677530492.245647910 POST: REG 664 ids 
93/1020 sz 0 nlink 1 rdev 4095/1048575 fsid a806 fileid 56fcd02 
a/m/ctime 1677530437.103108884 1677530724.673885881 1677530724.673890095




linux->netbsd9
09:56:09.776109 IP (tos 0x0, ttl 64, id 19849, offset 0, flags [DF], 
proto TCP (6), length 264)
city-art.ecs.vuw.ac.nz.972 > circa.ecs.vuw.ac.nz.shilp: Flags [P.], 
cksum 0xac49 (correct), seq 1240:1452, ack 829, win 501, options 
[nop,nop,TS val 3777571609 ecr 502], length 212: NFS request xid 
1226647613 208 setattr fh 142,16/1575639005
09:56:09.776144 IP (tos 0x0, ttl 64, id 24979, offset 0, flags [DF], 
proto TCP (6), length 200)
circa.ecs.vuw.ac.nz.shilp > city-art.ecs.vuw.ac.nz.972: Flags [P.], 
cksum 0xf28c (correct), seq 829:977, ack 1452, win 15967, options 
[nop,nop,TS val 502 ecr 3777571609], length 148: NFS reply xid 
1226647613 reply ok 144 setattr PRE: sz 8 mtime 1677531182.479689377 
ctime 1677531182.479689377 POST: REG 664 ids 93/1020 sz 0 nlink 1 rdev 
4095/1048575 fsid 8e10 fileid 43e502 a/m/ctime 1677452099.883185744 
1677531369.774888087 1677531369.776138274



netbsd10->netbsd10
10:08:08.603146 IP (tos 0x0, ttl 64, id 10629, offset 0, flags [DF], 
proto TCP (6), length 244)
turakirae.ecs.vuw.ac.nz.942 > paramount.ecs.vuw.ac.nz.shilp: Flags 
[P.], cksum 0x9cfc (correct), seq 792:984, ack 605, win 15874, options 
[nop,nop,TS val 162 ecr 161], length 192: NFS request xid 2805821546 188 
setattr fh 168,6/629417642
10:08:08.603163 IP (tos 0x0, ttl 64, id 2825, offset 0, flags [DF], 
proto TCP (6), length 200)
paramount.ecs.vuw.ac.nz.shilp > turakirae.ecs.vuw.ac.nz.942: Flags 
[P.], cksum 0x0a59 (correct), seq 605:753, ack 984, win 15990, options 
[nop,nop,TS val 161 ecr 162], length 148: NFS reply xid 2805821546 reply 
ok 144 setattr PRE: sz 8 mtime 1677531851.703190779 ctime 
1677531851.703190779 POST: REG 664 ids 93/1020 sz 0 nlink 1 rdev 
4095/1048575 fsid a806 fileid 56fcd02 a/m/ctime 1677530437.103108884 
1677532088.603155935 1677532088.603159940




Full raw tcpdumps can be grabbed from:
https://homepages.ecs.vuw.ac.nz/~mark/linux-netbsd10.tcpdump
https://homepages.ecs.vuw.ac.nz/~mark/linux-netbsd9.tcpdump
https://homepages.ecs.vuw.ac.nz/~mark/netbsd10-netbsd10.tcpdump
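If it's useful, the failing reply can be picked out of the decoded captures mechanically. A sketch (the decode() function here just replays one decoded line from the failing trace above, standing in for something like `tcpdump -v -r linux-netbsd10.tcpdump`):

```shell
# Sketch: count NFS setattr replies that carry an error in decoded
# tcpdump output.  decode() replays one line from the trace above,
# standing in for `tcpdump -v -r linux-netbsd10.tcpdump`.
decode() {
    echo 'NFS reply xid 3426046061 reply ok 144 setattr ERROR: Permission denied'
}
decode | grep -c 'setattr ERROR'
```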

cheers
mark



Re: mfii(4) and Dell PERC

2022-08-08 Thread Mark Davies




On 8/08/22 21:37, Edgar Fuß wrote:

Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode?


I have several NetBSD systems with PERC H730P's


mfii(4) says all configuration is done via the controller's BIOS.
Does that mean I need to shut down in case a drive fails and I need to rebuild?


No.
On my systems powerd will notify me of a state change on the disk array,
then I can use bioctl to see the current state.  I can then hot-swap the
faulty disk and the array will automatically start rebuilding.




Can I monitor the RAID state?


yes.

green-mountain# bioctl mfii0 show
Volume Status   Size Device/Label   Level Stripe
================================================
 0 Online   2.6TRAID 564K
   0:0 Online   894G 1:0.0 noencl DSF8>
   0:1 Online   894G 1:1.0 noencl DSF8>
   0:2 Online   894G 1:2.0 noencl DSF8>
   0:3 Online   894G 1:3.0 noencl DSF8>
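A sketch of how that output could be checked automatically (e.g. from cron). It reads `bioctl mfii0 show` output on stdin so it can be exercised against a saved sample; the status keywords matched are assumptions rather than an exhaustive list from bioctl(8):

```shell
# Sketch: flag any bioctl volume/component that is not "Online".
# Usage (on the real system): bioctl mfii0 show | check_raid
check_raid() {
    awk '$2 ~ /^(Online|Offline|Degraded|Rebuild|Failed|Hot.spare)$/ {
             if ($2 != "Online") bad++
         }
         END { print (bad ? "DEGRADED" : "OK") }'
}
```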





Can I monitor the BBU Battery health?


yes

green-mountain# envstat -d mfii0
 Current  CritMax  WarnMax  WarnMin  CritMin  Unit
  mfii0 BBU state:  TRUE
mfii0 BBU voltage: 3.885 V
mfii0 BBU current: 0.000 A
mfii0 BBU temperature:26.000  degC
  mfii0:0:online
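A sketch of pulling a single sensor value out of that output, e.g. for graphing or a low-voltage alarm. The parsing assumes the `label: value unit` layout shown above:

```shell
# Sketch: extract the BBU voltage from `envstat -d mfii0` output.
# Usage (on the real system): envstat -d mfii0 | bbu_voltage
bbu_voltage() {
    awk -F': *' '/BBU voltage/ { split($2, v, " "); print v[1] }'
}
```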


cheers
mark


Dell PERC H750

2022-06-15 Thread Mark Davies
I have a machine with a Dell PERC H750 raid card.  I'd like to get it 
working under NetBSD.



When FreeBSD added support to their driver (mrsas) it looks like they 
did it with these three patches:


https://cgit.freebsd.org/src/commit/sys/dev/mrsas?id=2909aab4cfc296bcf83fa3e87ed41ed1f4244fea

https://cgit.freebsd.org/src/commit/sys/dev/mrsas?id=b518670c218c4e2674207e946d1b9a70502c5451

https://cgit.freebsd.org/src/commit/sys/dev/mrsas?id=e315cf4dc4d167d9f2e34fe03cd79468f035a6e8

The first patch seems to treat it the same as the previous-generation
card, and then the other two patches add specific changes for the "aero"
generation.



If I add the following to our  mfii driver:

--- mfii.c  17 May 2022 10:29:47 -  1.4.4.1
+++ mfii.c  8 Jun 2022 04:22:54 -
@@ -604,6 +604,8 @@
{ PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_MEGARAID_3416,
&mfii_iop_35 },
{ PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_MEGARAID_3516,
+   &mfii_iop_35 },
+   { PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_MEGARAID_39XX_3,
&mfii_iop_35 }
 };



to add the card and initially treat it the same as the previous-gen
card, then on startup I detect the card but never get an "sd" disk detected:


 [...]
[ 1.058596] mfii0 at pci6 dev 0 function 0: "PERC H750 Adapter", 
firmware 52.16.1-4074, 8192MB cache

[ 1.058596] mfii0: interrupting at ioapic2 pin 2
[ 1.058596] scsibus0 at mfii0: 240 targets, 8 luns per target
 [...]
[ 1.418319] mfii0: physical disk inserted id 64 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 0 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 1 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 2 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 3 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 4 enclosure 64
[ 1.418319] mfii0: physical disk inserted id 5 enclosure 64
 [...]



On another machine with an H730 installed I see:

mfii0 at pci6 dev 0 function 0: "PERC H730P Adapter", firmware 
25.5.6.0009, 2048MB cache

mfii0: interrupting at ioapic2 pin 2
scsibus0 at mfii0: 64 targets, 8 luns per target
 [...]
mfii0: physical disk inserted id 32 enclosure 32
mfii0: physical disk inserted id 0 enclosure 32
mfii0: physical disk inserted id 1 enclosure 32
mfii0: physical disk inserted id 2 enclosure 32
mfii0: physical disk inserted id 3 enclosure 32
 [...]
sd0 at scsibus0 target 0 lun 0:  disk fixed
sd0: fabricating a geometry
sd0: 2681 GB, 2745600 cyl, 64 head, 32 sec, 512 bytes/sect x 5622988800 
sectors

sd0: fabricating a geometry
sd0: GPT GUID: 92f5aca9-29d3-4c7e-8c41-85fb2df819d6
 [...]



Any suggestions on what the equivalent of FreeBSD patches 2 and 3
would be, or anything else I may need to do to get this going (or why
this approach won't work), would be appreciated.


cheers
mark


Re: missing memory with uefi

2019-06-20 Thread Mark Davies



On 24/05/19 4:05 pm, Mark Davies wrote:
> A few weeks ago there was some discussion here and in ticket 54147 about
> the amount of system memory being calculated incorrectly (for at least
> some machines) when doing a UEFI boot, but no solution was determined.

So increasing BOOTINFO_MAXSIZE in sys/arch/x86/include/bootinfo.h to
16384 fixed my issue (see ticket 54147).  Any reason not to commit that
change (and pull up to 8)?
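For reference, a sketch of the change being proposed (the previous value is omitted here; 16384 is the value that fixed the issue on my machine):

```c
/*
 * sys/arch/x86/include/bootinfo.h -- sketch of the change described
 * above (see ticket 54147).
 */
#define BOOTINFO_MAXSIZE 16384
```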

cheers
mark


Re: missing memory with uefi

2019-05-23 Thread Mark Davies
A few weeks ago there was some discussion here and in ticket 54147 about
the amount of system memory being calculated incorrectly (for at least
some machines) when doing a UEFI boot, but no solution was determined.

If anyone has any ideas I can test them on the machine of ours that is
exhibiting the issue for the next week or so before I have to put that
machine into production (by switching to a bios boot if necessary).

cheers
mark


Re: missing memory with uefi

2019-05-02 Thread Mark Davies



On 2/05/19 5:51 pm, Michael van Elst wrote:

> On the other hand, the types are mapped to a few BIOS memory types
> before combination. Even this map should shrink that way to only
> a few entries unless the stripped parts vary a lot between "available"
> and "reserved" or contain lots of holes.

the stripped parts were all "available" and I didn't notice any holes,
but they comprised:

42 pairs of
xx/ available [BootServicesData]
+1/ available [BootServicesCode]

then
12 pairs

then
52 pairs

then
10 pairs


(I have the 12 photos of the screen I took to get this, but didn't
really want to transcribe the lot)

cheers
mark


Re: missing memory with uefi

2019-05-01 Thread Mark Davies



On 1/05/19 9:06 pm, Michael van Elst wrote:
> efiboot has a 'memmap' command that tells you what memory is
> reported by UEFI. Maybe that helps to find out where the memory
> goes missing.

If it helps someone this is what it reported:


/0fff available [BootServicesCode]
1000/ available [ConventionalMemory]
0001/00013fff available [BootServicesCode]
00014000/00061fff available [ConventionalMemory]
00062000/0008dfff available [BootServicesCode]
0008e000/0008 available [ConventionalMemory]
0009/0009 available [BootServicesCode]
000a/000f reserved [Reserved]
0010/00bf available [ConventionalMemory]
00c0/00ff available [BootServicesCode]
0100/4c2c available [ConventionalMemory]
4c2d/4c3c available [BootServicesData]
4c3d/5bd59fff available [ConventionalMemory]
5bd5a000/5be59fff available [LoaderData]
5be5a000/5be92fff available [LoaderCode]
5be93000/5c046fff available [BootServicesData]
5c047000/5c095fff available [BootServicesCode]
5c096000/5c0e4fff available [BootServicesData]
5c0e5000/5c167fff available [BootServicesCode]
  ...
5f5c2000/5f609fff reserved [RuntimeServicesCode]
  ...
5f84a000/5f84dfff available [LoaderData]
5f84e000/5f863fff available [BootServicesCode]
5f864000/5f86afff available [BootServicesData]
5f86b000/5f86dfff available [LoaderData]
  ...
611be000/651bdfff reserved [Reserved]
  ...
6c2cf000/6c3cefff reserved [RuntimeServicesData]
6c3cf000/6c5cefff reserved [RuntimeServicesCode]
6c5cf000/6e7cefff reserved [Reserved]
6e7cf000/6f5fefff ACPI NVS [ACPIMemoryNVS]
6f5ff000/6f7fefff ACPI reclaimable [ACPIReclaimMemory]
6f7ff000/6f7f available [BootServicesData]
6f80/7fff reserved [Reserved]
8000/8fff reserved [MemoryMappedIO]
fe00/fe010fff reserved [MemoryMappedIO]
0001/00087fff available [ConventionalMemory]


where the "..." sections are repeated chunks of [BootServicesCode] and
[BootServicesData]

cheers
mark


missing memory with uefi

2019-05-01 Thread Mark Davies
Hi,
   I have a new Dell PowerEdge R440 that I was installing a few-month-old
8.0_STABLE on.   For the first time I tried setting it up as a UEFI
system and everything seems to have worked OK _except_ that it seems to
be significantly under-reporting the amount of physical memory - the
system has 32GB; it reports 1540MB.

If I do a BIOS boot on the same system from a USB stick, I get the full
32GB reported.

Suggestions?

Attached are the dmesg outputs from the bios and uefi boots.

cheers
mark
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
2018 The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 8.0_STABLE (GENERIC) #5: Tue Dec 11 22:39:29 NZDT 2018

m...@turakirae.ecs.vuw.ac.nz:/local/SAVE/8_64.obj/src/work/8/src/sys/arch/amd64/compile/GENERIC
total memory = 32389 MB
avail memory = 31427 MB
cpu_rng: RDSEED
timecounter: Timecounters tick every 10.000 msec
Kernelized RAIDframe activated
running cgd selftest aes-xts-256 aes-xts-512 done
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell Inc. PowerEdge R440
mainbus0 (root)
ACPI: RSDP 0x000FE320 24 (v02 DELL  )
ACPI: XSDT 0x6F7F1188 E4 (v01 DELL   PE_SC3     
0113)
ACPI: FACP 0x6F7FA000 000114 (v06 DELL   PE_SC3    DELL 
0001)
ACPI: DSDT 0x6F7A2000 02FE1A (v02 DELL   PE_SC3   0003 DELL 
0001)
ACPI: FACS 0x6F54F000 40
ACPI: WD__ 0x6F7FC000 000134 (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: SLIC 0x6F7FB000 24 (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: HPET 0x6F7F9000 38 (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: APIC 0x6F7F7000 0016DE (v04 DELL   PE_SC3    DELL 
0001)
ACPI: MCFG 0x6F7F6000 3C (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: MIGT 0x6F7F5000 40 (v01 DELL   PE_SC3    DELL 
0001)
ACPI: MSCT 0x6F7F4000 90 (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: PCAT 0x6F7DA000 48 (v01 DELL   PE_SC3   0002 DELL 
0001)
ACPI: PCCT 0x6F7D9000 6E (v01 DELL   PE_SC3   0002 DELL 
0001)
ACPI: RASF 0x6F7D8000 30 (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: SLIT 0x6F7D7000 6C (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: SRAT 0x6F7D4000 002830 (v03 DELL   PE_SC3   0002 DELL 
0001)
ACPI: SVOS 0x6F7D3000 32 (v01 DELL   PE_SC3    DELL 
0001)
ACPI: WSMT 0x6F7D2000 28 (v01 DELL   PE_SC3    DELL 
0001)
ACPI: OEM4 0x6F6FF000 0A27C4 (v02 INTEL  CPU  CST 3000 INTL 
20150818)
ACPI: SSDT 0x6F6C9000 035130 (v02 INTEL  SSDT  PM 4000 INTL 
20150818)
ACPI: SSDT 0x6F6C8000 0009CE (v02 DELL   PE_SC3    DELL 
0001)
ACPI: SSDT 0x6F6C5000 002541 (v02 INTEL  SpsNm0002 INTL 
20150818)
ACPI: DMAR 0x6F7F3000 000108 (v01 DELL   PE_SC3   0001 DELL 
0001)
ACPI: HEST 0x6F7F2000 00017C (v01 DELL   PE_SC3   0002 DELL 
0001)
ACPI: BERT 0x6F7FD000 30 (v01 DELL   PE_SC3   0002 DELL 
0001)
ACPI: ERST 0x6F7F 000230 (v01 DELL   PE_SC3   0002 DELL 
0001)
ACPI: EINJ 0x6F7EF000 000150 (v01 DELL   PE_SC3   0002 DELL 
0001)
ACPI: 4 ACPI AML tables successfully acquired and loaded
ioapic0 at mainbus0 apid 8: pa 0xfec0, version 0x20, 24 pins
ioapic1 at mainbus0 apid 9: pa 0xfec01000, version 0x20, 8 pins
ioapic2 at mainbus0 apid 10: pa 0xfec08000, version 0x20, 8 pins
ioapic3 at mainbus0 apid 11: pa 0xfec1, version 0x20, 8 pins
ioapic4 at mainbus0 apid 12: pa 0xfec18000, version 0x20, 8 pins
x2APIC available but disabled by DMAR table
cpu0 at mainbus0 apid 0
cpu0: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu0: package 0, core 0, smt 0
cpu1 at mainbus0 apid 14
cpu1: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu1: package 0, core 7, smt 0
cpu2 at mainbus0 apid 2
cpu2: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu2: package 0, core 1, smt 0
cpu3 at mainbus0 apid 12
cpu3: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu3: package 0, core 6, smt 0
cpu4 at mainbus0 apid 4
cpu4: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu4: package 0, core 2, smt 0
cpu5 at mainbus0 apid 10
cpu5: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu5: package 0, core 5, smt 0
cpu6 at mainbus0 apid 6
cpu6: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu6: package 0, core 3, smt 0
cpu7 at mainbus0 apid 8
cpu7: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz, id 0x50654
cpu7: package 0, core 4, smt 0
acpi0 at mainbus0: Intel ACPICA 20170303
acpi0: X/RSDT: OemId , AslId <,0113>
acpi0: MCFG: segment 

problems when multiple mfi cards in system?

2014-09-18 Thread Mark Davies
I have a Dell PERC H810 pcie card and an external disk array attached 
to it.  If I put this into a desktop 6.1_STABLE/amd64 system I can 
happily read and write terabytes from the external array.  However if 
I move the card to a Dell poweredge r610 (which also has an internal 
PERC 6/i) and try reading from the external array, after a few minutes 
the entire machine just freezes solid.  There doesn't seem to be any 
problem reading from the internal disks.

Any ideas how to identify what's failing here and how to fix it?

dmesg for the r610 is below.


cheers
mark

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 
2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 6.1_STABLE (GENERIC) #25: Mon Jun  9 12:44:09 NZST 2014

m...@turakirae.ecs.vuw.ac.nz:/local/SAVE/6_64.obj/src/work/6/src/sys/arch/amd64/compile/GENERIC
total memory = 8182 MB
avail memory = 7929 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell Inc. PowerEdge R610
mainbus0 (root)
cpu0 at mainbus0 apid 16: Intel(R) Xeon(R) CPU   E5504  @ 
2.00GHz, id 0x106a5
cpu1 at mainbus0 apid 18: Intel(R) Xeon(R) CPU   E5504  @ 
2.00GHz, id 0x106a5
cpu2 at mainbus0 apid 20: Intel(R) Xeon(R) CPU   E5504  @ 
2.00GHz, id 0x106a5
cpu3 at mainbus0 apid 22: Intel(R) Xeon(R) CPU   E5504  @ 
2.00GHz, id 0x106a5
ioapic0 at mainbus0 apid 0: pa 0xfec0, version 20, 24 pins
ioapic1 at mainbus0 apid 1: pa 0xfec8, version 20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20110623
acpi0: X/RSDT: OemId , AslId 
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Safe" frequency 3579545 Hz quality 900
hpet0 at acpi0: high precision event timer (mem 0xfed0-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
WHEA (PNP0C33) at acpi0 not configured
SPK (PNP0C01) at acpi0 not configured
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x5f irq 0
COMA (PNP0501) at acpi0 not configured
COMB (PNP0501) at acpi0 not configured
MBIO (PNP0C01) at acpi0 not configured
NIPM (IPI0001) at acpi0 not configured
MBI1 (PNP0C01) at acpi0 not configured
PEHB (PNP0C02) at acpi0 not configured
VTD (PNP0C02) at acpi0 not configured
ipmi0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x3403 (rev. 
0x13)
ppb0 at pci0 dev 1 function 0: vendor 0x8086 product 0x3408 (rev. 
0x13)
ppb0: PCI Express 2.0 
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
bnx0 at pci1 dev 0 function 0: Broadcom NetXtreme II BCM5709 1000Base-
T
bnx0: Ethernet address 00:22:19:60:83:95
bnx0: interrupting at ioapic1 pin 4
brgphy0 at bnx0 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
bnx1 at pci1 dev 0 function 1: Broadcom NetXtreme II BCM5709 1000Base-
T
bnx1: Ethernet address 00:22:19:60:83:97
bnx1: interrupting at ioapic1 pin 16
brgphy1 at bnx1 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
ppb1 at pci0 dev 3 function 0: vendor 0x8086 product 0x340a (rev. 
0x13)
ppb1: PCI Express 2.0 
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
bnx2 at pci2 dev 0 function 0: Broadcom NetXtreme II BCM5709 1000Base-
T
bnx2: Ethernet address 00:22:19:60:83:99
bnx2: interrupting at ioapic1 pin 0
brgphy2 at bnx2 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
bnx3 at pci2 dev 0 function 1: Broadcom NetXtreme II BCM5709 1000Base-
T
bnx3: Ethernet address 00:22:19:60:83:9b
bnx3: interrupting at ioapic1 pin 10
brgphy3 at bnx3 phy 1: BCM5709 10/100/1000baseT PHY, rev. 8
brgphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
ppb2 at pci0 dev 7 function 0: vendor 0x8086 product 0x340e (rev. 
0x13)
ppb2: PCI Express 2.0 
pci3 at ppb2 bus 4
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
ppb3 at pci0 dev 9 function 0: vendor 0x8086 product 0x3410 (rev. 
0x13)
ppb3: PCI Express 2.0 
pci4 at ppb3 bus 5
pci4: i/o space, memory space enabled, rd/line, wr/inv ok
mfi0 at pci4 dev 0 function 0: Dell PERC H810 Adapter
mfi0: interrupting at ioapic1 pin 8
mfi0: PERC H810 Adapter version 21.0.1-0132
mfi0: logical drives 1, 1024MB RAM, BBU type BBU, status good
scsibus0 at mfi0: 64 targets, 8 luns per target
vendor 0x8086 product 0x342e (interrupt system, revision 0x13) at pci0 
dev 20 function 0 not configured
vendor 0x8086 product 0x3422 (interrupt system, revision 0x13) at pci0 
dev 20 function 1 not configured
vendor 0x8086 product 0x3423 (

ffsv2 extattr support

2014-06-17 Thread Mark Davies
Hi,
  back in Jan 2012 there was some discussion about ffsv2 extattr
support initiated by Manuel Bouyer.  What is the current state of
that?  Is there anything in -current?

cheers
mark


Re: resource leak in linux emulation?

2014-05-04 Thread Mark Davies
On Mon, 05 May 2014, Christos Zoulas wrote:
> I wrote:
> >So can someone suggest where exactly the patch should go.  And
> >isn't proc_lock held at this point (entered at line 344, exit at
> >line 569)?
> 
> How about this?

Seems good to me, and I can confirm that it fixed the increasing proc
count problem.  Can someone commit and pull up to 6?

cheers
mark


Re: resource leak in linux emulation?

2014-05-03 Thread Mark Davies
On Thu, 24 Apr 2014 07:18:10 David Laight wrote:
> > To fix, this should be added somewhere, probably at
> > sys/kern/kern_exit.c:487 (but I'm not sure if there's a better
> > location):
> >     if ((l->l_pflag & LP_PIDLID) != 0 && l->l_lid != p->p_pid) {
> >             proc_free_pid(l->l_lid);
> >     }
> 
> That doesn't look like the right place.
> I think it should be further down (and with proc_lock held).

So can someone suggest where exactly the patch should go?  And isn't proc_lock
held at this point (entered at line 344, exit at line 569)?

cheers
mark


Re: resource leak in linux emulation?

2014-04-03 Thread Mark Davies
On Thursday 27 March 2014 14:00:37 I wrote:
> So what resource could this be running out of?

Coming back to this, it looks like nprocs isn't being incremented/decremented
properly in some circumstances:

test# cat > HelloWorld.java

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}
test# cat /proc/loadavg 
0.00 0.03 0.30 1/887 3
test# /usr/pkg/java/sun-7/bin/javac HelloWorld.java 
test# cat /proc/loadavg
0.00 0.02 0.30 1/888 3
test# /usr/pkg/java/sun-7/bin/javac HelloWorld.java
test# cat /proc/loadavg
0.00 0.02 0.27 1/889 3
test# ps uaxww | wc -l
  27

Note that nprocs (2nd to last value in the /proc/loadavg output) 
increments every time javac runs until it hits maxproc.
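A sketch of extracting that value, for anyone wanting to watch the count grow across javac runs (the format is the one quoted above, "runnable/nprocs" in the fourth field):

```shell
# Sketch: pull nprocs out of the emulated /proc/loadavg format.
# On the affected system: nprocs < /proc/loadavg
nprocs() {
    awk '{ split($4, a, "/"); print a[2] }'
}
echo '0.00 0.02 0.27 1/889 3' | nprocs
```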

cheers
mark


resource leak in linux emulation?

2014-03-26 Thread Mark Davies
On a NetBSD/amd64 6.1_STABLE system, I have a perl script that 
effectively calls /usr/pkg/java/sun-7/bin/javac twice.  It doesn't 
really matter which Java file it's compiling.
If I call this script in an infinite loop, after an hour or so the 
javac's start failing with memory errors:

  # There is insufficient memory for the Java Runtime Environment to 
continue.
  # Cannot create GC thread. Out of system resources.

and after some more time the perl fails to fork (to exec the second 
javac)

   23766  1 perl CALL  fork
   23766  1 perl RET   fork -1 errno 35 Resource temporarily 
unavailable

Mar 27 11:43:24 test /netbsd: proc: table is full - increase 
kern.maxproc or NPROC

But all through this, top et al. tell me there are plenty of processes
and memory:

25 processes: 23 sleeping, 2 on CPU
CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  
100% idle
CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  
100% idle
CPU2 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  
100% idle
CPU3 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  
100% idle
Memory: 141M Act, 15M Wired, 11M Exec, 90M File, 15G Free
Swap: 2048M Total, 2048M Free


So what resource could this be running out of?

cheers
mark


Re: SAS tape drives

2013-12-12 Thread Mark Davies
On Wed, 11 Dec 2013, you wrote:
> I completely missed the mpii driver.  I'll order the card in the
> morning and see how it goes.

Just to close this off: plugged the card in, mpii found it and the 
tape drive is there as st0 so all looks good.

cheers
mark


Re: SAS tape drives

2013-12-11 Thread Mark Davies
On Wednesday 11 December 2013 20:46:17 Manuel Bouyer wrote:
> > I found this:
> > https://www.ascent.co.nz/productspecification.aspx?ItemID=413461
> > but that seems to use the mps driver on FreeBSD and NetBSD doesn't
> > have it.  Any guess on how hard it would be to port?
> 
> Hopefully it's supported by our mpii(4) driver.

I completely missed the mpii driver.  I'll order the card in the morning 
and see how it goes.

cheers
mark


Re: SAS tape drives

2013-12-10 Thread Mark Davies
On Wed, 11 Dec 2013, Eduardo Horvath wrote:
> Last time I fiddled around with the LSI MegaRAID stack it did not
> provide any sort of transparent access to attached devices.  Can
> you create a LUN with the tape device?
> 
> You might have more success with the LSI MPT stack.  That at least
> provides transparent access to the target devices.  I don't know
> whether mpt hooks into NetBSD's scsipi layer in a way that it will
> attach non-disk devices, but I suspect it would.

I used the MegaRAID card as that was what we had lying around but I 
could buy something else.

I found this: 
https://www.ascent.co.nz/productspecification.aspx?ItemID=413461
but that seems to use the mps driver on FreeBSD and NetBSD doesn't 
have it.  Any guess on how hard it would be to port?

cheers
mark


SAS tape drives

2013-12-10 Thread Mark Davies
Are SAS tape drives supported in NetBSD?

I have an LSI MegaRAID SAS card with an HP LTO5 SAS drive attached.  
The card's WebBIOS can see the tape attached and NetBSD can see the
LSI card, but NetBSD shows no evidence of seeing the tape drive (not
even as an unconfigured device).

cheers
mark


Re: Weird memory usage, performance problem with 6.1_STABLE

2013-09-24 Thread Mark Davies
On Tue, 24 Sep 2013, Lars Heidieker wrote:
> I think the pagescanner can't keep up with the speed, is there any
> change in network bandwidth between those machines?

The two ftp servers are both gigabit connected to the same switch and
it's gigabit all the way to the clients.

> Another idea, the images nearly fit into memory and the ftpd
> process trigger the file cache in a way that the page scanner
> can't find pages to free...

So on this 16GB machine the original 36GB image causes the problem
I tried a 15GB image and it also caused the problem.
With an 8GB image (probably not suprisingly) it worked happily with 
active memory never getting above 10GB - didn't try increasing the 
number of simultaneous ftp sessions beyond 3 to see if that had any 
effect.

> You went from non-smp to smp, right?

No.  amd64 box is 6 core Intel(R) Xeon(R) CPU E5-2430 while the i386 
is 4 core Intel(R) Core(TM) i7-3770.

cheers
mark


Re: Weird memory usage, performance problem with 6.1_STABLE

2013-09-23 Thread Mark Davies
On Fri, 20 Sep 2013, Mark Davies wrote:
> On Fri, 20 Sep 2013, Lars Heidieker wrote:
> > Can you see which kernel thread causes high CPU usage by showing
> > lwps in top? (t toggles those modes)
> 
> 149 threads: 25 idle, 118 sleeping, 6 on CPU
> Memory: 15G Act, 15M Wired, 28M Exec, 15G File, 4620K Free
> Swap: 8192M Total, 8192M Free
> 
>   PID   LID USERNAME PRI STATE  TIME   WCPUCPU NAME
> COMMAND
> 081 root 221 CPU/0 24:19 93.60% 93.60% pgdaemon
> [system]
> 13362 1 root 116 tstile/4   6:09 19.58% 19.58% - ftpd
> 17398 1 root 221 tstile/2   8:26 18.07% 18.07% - ftpd
> 14648 1 root 116 tstile/1   4:15 16.41% 16.41% - ftpd
> 
> > or even better try:
> > systat vm 1
> > and check for page scan rates etc.
> 
> When it was in the above state  pdscn was reporting around 95000,
> pdfre was 0.


So is there something I can tune to get better behaviour out of this?
Just for comparison I tried the same thing on a 6.1_STABLE/i386 with
4GB (3.3 available), and with 3 ftp's going it sat at 2G active and
1G inactive and remained responsive throughout the entire download.

cheers
mark


Re: Weird memory usage, performance problem with 6.1_STABLE

2013-09-19 Thread Mark Davies
On Fri, 20 Sep 2013, Mark Davies wrote:
> When it was in the above state  pdscn was reporting around 95000,
> pdfre was 0.   With just one ftp going both pdfre and pdscn report
> values in the range 7000 - 14000

Actually the figures for "just one ftp" were from shortly after I
killed off the other two.  Now, for just one ftp, it's settled down to:

Memory: 10G Act, 5005M Inact, 15M Wired, 28M Exec, 15G File, 14M Free

  PID   LID USERNAME PRI STATE  TIME   WCPUCPU NAME  
COMMAND
081 root 126 pgdaem/2  28:12  0.00%  0.00% pgdaemon  
[system]
14648 1 root  85 netio/56:20  0.00%  0.00% - ftpd

and both pdfre and pdscn typically reporting 0.

cheers
mark


Re: Weird memory usage, performance problem with 6.1_STABLE

2013-09-19 Thread Mark Davies
On Fri, 20 Sep 2013, Lars Heidieker wrote:
> Can you see which kernel thread causes high CPU usage by showing
> lwps in top? (t toggles those modes)

149 threads: 25 idle, 118 sleeping, 6 on CPU
Memory: 15G Act, 15M Wired, 28M Exec, 15G File, 4620K Free
Swap: 8192M Total, 8192M Free

  PID   LID USERNAME PRI STATE  TIME   WCPUCPU NAME  
COMMAND
081 root 221 CPU/0 24:19 93.60% 93.60% pgdaemon  
[system]
13362 1 root 116 tstile/4   6:09 19.58% 19.58% - ftpd
17398 1 root 221 tstile/2   8:26 18.07% 18.07% - ftpd
14648 1 root 116 tstile/1   4:15 16.41% 16.41% - ftpd


> or even better try:
> systat vm 1
> and check for page scan rates etc.

When it was in the above state pdscn was reporting around 95000 and
pdfre was 0.  With just one ftp going, both pdfre and pdscn report
values in the range 7000 - 14000.


cheers
mark


Re: Weird memory usage, performance problem with 6.1_STABLE

2013-09-19 Thread Mark Davies
On Thursday 19 September 2013 20:36:49 Manuel Bouyer wrote:
> the file cache shouldn't be allowed to use that much memory.
> What are your vm.* settings (sysctl vm) ?

They are the default settings from GENERIC; I haven't explicitly set
anything.

vm.loadavg: 0.00 0.00 0.00
vm.maxslp = 20
vm.uspace = 12288
vm.user_va0_disable = 1
vm.idlezero = 1
vm.anonmin = 10
vm.filemin = 10
vm.execmin = 5
vm.anonmax = 80
vm.filemax = 50
vm.execmax = 30
vm.inactivepct = 33
vm.bufcache = 15
vm.bufmem = 195102208
vm.bufmem_lowater = 321228800
vm.bufmem_hiwater = 2569830400

> did you change ftp.conf, especially the mmapsize parameter ?

No.  Wasn't aware of that parameter.
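For reference, mmapsize is an ftpd.conf(5) directive; a hedged sketch of disabling mmap for all classes (the value is illustrative, not a verified fix for this problem):

```
# /etc/ftpd.conf -- sketch per ftpd.conf(5):
# a size of 0 disables mmap(2) for file transfers.
mmapsize  all  0
```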

cheers
mark


Weird memory usage, performance problem with 6.1_STABLE

2013-09-18 Thread Mark Davies
I have a system that is (sometimes) used as an ftp server to serve g4u 
disk images.  Current machine is a Dell PowerEdge R320 with 16GB 
memory running 6.1_STABLE from yesterday.

If I get 3 ftp clients all reading the same 45GB image from it I 
quickly get into the situation that all memory is used by something 
and the machine becomes very unresponsive.  Below is the output of top 
while in this state:

load averages:  3.38,  1.56,  0.68;   up 0+16:11:05 

14:51:52
43 processes: 2 runnable, 38 sleeping, 3 on CPU
CPU0 states:  0.0% user,  0.0% nice, 23.6% system,  0.0% interrupt, 
76.4% idle
CPU1 states:  0.0% user,  0.0% nice,  4.1% system,  0.0% interrupt, 
95.9% idle
CPU2 states:  0.0% user,  0.0% nice, 28.5% system,  0.0% interrupt, 
71.5% idle
CPU3 states:  0.0% user,  0.0% nice,  2.1% system,  0.0% interrupt, 
97.9% idle
CPU4 states:  0.0% user,  0.0% nice,  3.8% system,  0.0% interrupt, 
96.2% idle
CPU5 states:  0.0% user,  0.0% nice, 78.9% system,  0.0% interrupt, 
21.1% idle
Memory: 15G Act, 113M Inact, 15M Wired, 29M Exec, 15G File, 112K Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE   RES STATE  TIME   WCPUCPU 
COMMAND
0 root 1260 0K   29M CPU/1  3:37 99.76% 99.76% 
[system]
 7173 root  79053M 4588K CPU/2  1:52 36.13% 36.13% 
ftpd
 6931 root 113053M 4588K RUN/2  1:13 28.42% 28.42% 
ftpd
 7025 root 115053M 4588K RUN/1  2:03 26.61% 26.61% 
ftpd
 6601 root  43017M 1876K CPU/5  0:01  4.05%  4.05% top


If I just run two ftp clients there seems to be about 5GB of Inactive 
memory and performance is "fine".

Previously I had an i386 box running 5.x or 6.x doing this job and it 
could quite happily have 10-15 clients slurping images at once.

So what's going on?

cheers
mark


Re: 5.1_RC3 on Dell r610 fails

2010-08-31 Thread Mark Davies
On Tue, 31 Aug 2010, Manuel Bouyer wrote:
> On Tue, Aug 31, 2010 at 03:09:01PM +1000, matthew green wrote:
> > i think you have the problem fixed by this pullup (not yet
> > processed):
> >
> > http://releng.netbsd.org/cgi-bin/req-5.cgi?show=1439
>
> I just processed this one;

Just to confirm that this fixed my problem.

cheers
mark


Re: 5.1_RC3 on Dell r610 fails

2010-08-31 Thread Mark Davies
On Tuesday 31 August 2010 23:03:58 Manuel Bouyer wrote:
> I just processed this one; rebuilding the netbsd-5 branch should be
> enough (unfortunately the TNF build cluster producing binary snapshots is
> down at this time and needs physical presence; we hope it will be back
> up today or tomorrow).

I applied the patch manually and am rebuilding at the moment.  Won't be able
to test on the box till tomorrow.

cheers
mark


5.1_RC3 on Dell r610 fails

2010-08-30 Thread Mark Davies
I have 5.1_RC3/i386 running quite happily on a couple of slightly
older Dell PowerEdge R610's.  I went to install it on a new R610 and
the system dies immediately:

   [...]
Intel 686-class, 2660MHz, id 0x20bc2
Fatal protection fault in supervisor mode
Stopped in pid 0.1(system) at netbsd:rdmsr+0x4

backtrace:
rdmsr()
est_init_once()
_run_once()
est_init()
cpu_identify()
cpu_attach()
config_attach_loc()
mpacpi_config_cpu()
acpi_madt_walk()
mpacpi_scan_apics()
mainbus_attach()
  [...]


This new machine has a Xeon E5640 quad-core processor, otherwise not
too different from the working ones.

Any suggestions on what's broken and how to fix it?

cheers
mark