from:"Sean Bruno"

installworld failure with poudriere

2022-05-26 Thread Sean Bruno


https://people.freebsd.org/~sbruno/poudriere_atf_debug.txt

I think that some of the atf_check(1) Makefiles aren't respecting 
WITHOUT_DEBUG_FILES and this change has happened in the last few months.


===> libexec/atf/atf-check (install)
--- _proginstall ---
install -N /usr/src/etc  -s -o root -g wheel -m 555   atf-check /usr/local/poudriere/jails/14amd64/usr/libexec/atf-check
install -N /usr/src/etc  -o root -g wheel -m 444  atf-check.debug /usr/local/poudriere/jails/14amd64/usr/lib/debug/usr/libexec/atf-check.debug

install: atf-check.debug: No such file or directory
*** [_proginstall] Error code 71


sean

xhci(4) hanging in early boot with drives attached

2022-04-02 Thread Sean Bruno

As far as I can tell, xhci(4) is hanging on "something" during early 
boot (probably something uninitialized) that causes it to basically 
never attach drives.


I only have one controller here, so it could be something special with 
my device.


If I single user my machine and attach my storage array, probe/attach 
works just fine.  Leaving the USB3 array attached during a full boot 
causes xhci(4) to become completely unresponsive and requires a reboot 
to resolve.


I probably need someone who understands the early bits well enough to 
know what things a driver can/cannot do prior to single-user mode.


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261912

sean

USB CD Eject Failures

2022-02-16 Thread Sean Bruno


Been playing around with sysutils/eject to automate some media backup stuff.

I note that "after a number of ejects" the USB 2 CD drive will cease 
responding.  I don't think it's a race to failure; it acts like resource 
starvation/leak.  Seems fairly reproducible; if someone gets to it 
before I do, let me know.


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261961

I suspect that something has changed in the 12 years since 
sysutils/eject was last looked at and the CDIOCEJECT case in 
sys/cam/scsi_cd.c probably needs an eyeball.


The close tray command also seems nonfunctional, which probably means 
that a data structure has changed or something else that I haven't 
stared at in quite some time.


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261936

sean

Re: USB Disk Stalls on -current

2022-02-12 Thread Sean Bruno









> I think I'm going to grab one of these fancy 5-disk USB 3 enclosures I
> see on the Internet.  ;-)
>
> Most of the consumer grade units seem to have one or more issues (no NCQ
> depth more than 1, loud/bad fans, hard to get a real JBOD mode, JBOD
> mode eats serial numbers making ID hard).
>
> I think I settled on the TERRAMASTER D5-300 USB3.1
>
> I'll post back results when it arrives and I get done swearing and
> cursing during the replacement.
>
> sean




Performance, unsurprisingly, is way better.

Some irritating XHCI initialization was noted on reboots:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261912

I am currently moving stuff around but it is fun to be able to put my 
old 2 disk USB 2 array into the 5 disk USB 3 array and have ZFS just 
flat out DO THE RIGHT THING.  It's fantastic.


Another thing I noticed is the lack of support by this vendor for SATA 
NCQ.  Not surprising, but SUPER irritating.  The drives I'm using 
definitely supported this in their old USB 2 enclosures, but the 
only thing I can do is yell at the vendor.


sean

Re: USB Disk Stalls on -current

2022-02-08 Thread Sean Bruno





On 2/6/22 10:14, Sean Bruno wrote:
> I'm doing something "gross" with ZFS & Plex on a little Intel NUC that I
> have here at the house to provide me with a nice little NAS at home. I'm
> using 2x USB2 external disks as the mirror.
>
> I noted that the two USB2 disks I'm using in a mirror seem to "stall"
> from time to time and it's not clear to me why.
>
> I'd like to poke further into the USB system but I'm not sure where I
> should start to see if there is something amiss with the hardware (e.g.
> the disks suck) or if FreeBSD is losing track of something during I/O
> leading to a stall/timeout.
>
> I'm not seeing data loss or anything, I just note from time to time
> during large file transfers that the clanking/grinding sound of the
> spinning rust on my desk completely stops, the encoding of the video
> files stops (so it's waiting for a read to complete) and it gets much
> quieter in my office.  :-)
>
> sean



I think I'm going to grab one of these fancy 5-disk USB 3 enclosures I 
see on the Internet.  ;-)


Most of the consumer grade units seem to have one or more issues (no NCQ 
depth more than 1, loud/bad fans, hard to get a real JBOD mode, JBOD 
mode eats serial numbers making ID hard).


I think I settled on the TERRAMASTER D5-300 USB3.1

I'll post back results when it arrives and I get done swearing and 
cursing during the replacement.


sean

Re: USB Disk Stalls on -current

2022-02-06 Thread Sean Bruno








> So there's some tools you can use. For usb, there's usbdump that can
> get you the USB transactions. I've not used it enough to give more details
> here. This will let you know what's going on, and when, on the USB endpoint.
>
> You can also enable the CAM_IOSCHED stuff. This will allow you to get
> latency measurements for 'requests in the sim' which basically will tell
> you what your latency spread is for the drives. This will tell you if
> things are getting caught up in the USB layer, or after CAM's da driver
> completes the I/O request (granted, that's almost certainly not happening,
> but it will help you figure out what's going on and put numbers to the
> oddities you are seeing).
>
> Also, make sure you have good cables. I've had lots of hiccups over the
> years from dodgy USB cables. Also make sure you have good, high quality
> enclosures. Many from the USB2 time-period are sketchy at best and I
> went through several at one point trying to find a good one. I'd be
> tempted to get USB 3 enclosures. I've had better luck with USB3 gear than
> USB2 gear here, but you need a USB-3 controller to get USB-3 speeds which
> might not be compatible with the NUC's built-in stuff (though my NUC has
> one USB3 port, there's lots of different models).
>
> Usually, though, I see weirdness associated with dmesg messages from
> usb, cam, etc when the hardware is on the sketchy end.
>
> Warner


I'm assuming that I have a fairly dodgy USB device, as the pauses seem 
to correspond to this from CAM being emitted:


Feb  6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): READ(10). CDB: 28 00 36 69 02 6e 00 00 80 00
Feb  6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): CAM status: CCB request completed with an error
Feb  6 11:56:43 alice kernel: (da0:umass-sim1:1:0:0): Retrying command, 2 more tries remain



Things resume after this is emitted, but there is a substantial 
(multiple minutes) pause here.  I would assume that timeouts would fire 
much quicker.
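
To see what the bus is doing during one of these pauses, the usbdump suggestion above could be tried roughly as follows. This is only a sketch: the bus name usbus0 is taken from the usbconfig output elsewhere in this thread, while the capture_usb helper name and the output path are made up for illustration.

```shell
# Capture traffic on one USB bus to a file for later inspection.
# $1: bus interface name (usbus0 here, per the usbconfig output in this thread)
# $2: output file (hypothetical path)
capture_usb() {
    usbdump -i "$1" -s 256 -w "$2"
}

# Example (run as root on the affected machine, stop with Ctrl-C once
# the stall has happened):
#   capture_usb usbus0 /tmp/usbus0.dump
# Read the capture back verbosely afterwards:
#   usbdump -r /tmp/usbus0.dump -v
```

Lining up the timestamps in the capture with the CAM retry message should at least show whether the device goes silent or the host stops asking.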


sean

Re: USB Disk Stalls on -current

2022-02-06 Thread Sean Bruno





On 2/6/22 10:52, Mehmet Erol Sanliturk wrote:
> On Sun, Feb 6, 2022 at 8:15 PM Sean Bruno <sbr...@freebsd.org> wrote:
>
>> I'm doing something "gross" with ZFS & Plex on a little Intel NUC
>> that I have here at the house to provide me with a nice little NAS
>> at home.  I'm using 2x USB2 external disks as the mirror.
>>
>> I noted that the two USB2 disks I'm using in a mirror seem to "stall"
>> from time to time and it's not clear to me why.
>>
>> I'd like to poke further into the USB system but I'm not sure where I
>> should start to see if there is something amiss with the hardware (e.g.
>> the disks suck) or if FreeBSD is losing track of something during I/O
>> leading to a stall/timeout.
>>
>> I'm not seeing data loss or anything, I just note from time to time
>> during large file transfers that the clanking/grinding sound of the
>> spinning rust on my desk completely stops, the encoding of the video
>> files stops (so it's waiting for a read to complete) and it gets much
>> quieter in my office.  :-)
>>
>> sean
>
> I encountered such a case in Fedora Linux with an external USB 2.0 disk.
> When the external disk was connected to a USB 1.? port, loading the
> operating system was terrifically slow, though sometimes some parts
> loaded normally.
>
> You may check your USB port versions to ensure that they match each
> other.  The board USB port may be 2.0, but the connected chassis USB
> port may be 1.?, as in my chassis.
>
> When the external USB disk was connected to the chassis USB 2.0 port,
> everything became normal.
>
> Mehmet Erol Sanliturk






I see them all up as 480Mbps / USB 2.0 if usbconfig and the driver 
attach is anything to go by.  Disk read/write perf when running seems to 
approach 40MB/s, so I think it's running pretty close to the correct speed.

...
ugen0.7:  at usbus0
umass1 on uhub0
umass1:  on usbus0
da0 at umass-sim1 bus 1 scbus4 target 0 lun 0
da0:  Fixed Direct Access SCSI device
da0: 40.000MB/s transfers
da0: 1907729MB (3907029168 512 byte sectors)
da0: quirks=0x2
Root mount waiting for: usbus0

ugen0.8:  at usbus0
umass2 on uhub0
umass2:  on usbus0
da1 at umass-sim2 bus 2 scbus5 target 0 lun 0
da1:  Fixed Direct Access SPC-2 SCSI device
da1: Serial Number ABCDEF0123456847
da1: 40.000MB/s transfers
da1: 1907729MB (3907029168 512 byte sectors)
da1: quirks=0x2

...
ugen0.7:  at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (2mA)
ugen0.8:  at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (500mA)

...
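
As a rough sanity check on those numbers (a back-of-the-envelope sketch, not a measurement), the 480Mbps high-speed signalling rate works out to 60MB/s before any protocol overhead, so 40MB/s is about two thirds of the raw link rate:

```shell
# USB 2.0 high-speed signalling rate in Mbit/s (from the usbconfig output).
link_mbps=480
# Throughput in MB/s (from the da0/da1 attach messages).
observed_mbs=40

# 8 bits per byte gives the raw ceiling, ignoring all protocol overhead.
raw_mbs=$((link_mbps / 8))
# Integer percentage of the raw ceiling that the observed rate represents.
pct=$((observed_mbs * 100 / raw_mbs))

echo "raw ceiling: ${raw_mbs} MB/s, observed: ${observed_mbs} MB/s (${pct}% of raw)"
# prints: raw ceiling: 60 MB/s, observed: 40 MB/s (66% of raw)
```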

sean

USB Disk Stalls on -current

2022-02-06 Thread Sean Bruno

I'm doing something "gross" with ZFS & Plex on a little Intel NUC that I 
have here at the house to provide me with a nice little NAS at home. 
I'm using 2x USB2 external disks as the mirror.


I noted that the two USB2 disks I'm using in a mirror seem to "stall" 
from time to time and it's not clear to me why.


I'd like to poke further into the USB system but I'm not sure where I 
should start to see if there is something amiss with the hardware (e.g. 
the disks suck) or if FreeBSD is losing track of something during I/O 
leading to a stall/timeout.


I'm not seeing data loss or anything, I just note from time to time 
during large file transfers that the clanking/grinding sound of the 
spinning rust on my desk completely stops, the encoding of the video 
files stops (so it's waiting for a read to complete) and it gets much 
quieter in my office.  :-)


sean

test

2018-12-11 Thread Sean Bruno

Just testing mailing lists.  Delete.

sean




pkg-base noise

2018-12-11 Thread Sean Bruno

make[8]: "/home/sbruno/bsd/fbsd_head/share/mk/bsd.files.mk" line 92:
warning: duplicate script for target "_testsFILESINS_cleanup.ksh" ignored
make[8]: "/home/sbruno/bsd/fbsd_head/share/mk/bsd.files.mk" line 92:
warning: using previous script for "_testsFILESINS_cleanup.ksh" defined here


Is this something easily fixable?  I'm unclear what is throwing a
warning here?

sean




Re: pkg problem on FreeBSD 13.0-CURRENT

2018-10-25 Thread Sean Bruno



On 10/25/18 10:29 AM, Pieper, Jeffrey E wrote:
> I'm seeing:
> 
> Installing pkg-1.10.5_5...
> Extracting pkg-1.10.5_5: 100%
> ld-elf.so.1: Shared object "libssl.so.9" not found, required by "pkg"
> 
> Thanks,
> Jeff

Is this before or after the move of libssl.so.9 -> libssl.so.111 ?

https://svnweb.freebsd.org/changeset/base/339709

sean




Re: pkg problem on FreeBSD 13.0-CURRENT

2018-10-25 Thread Sean Bruno



On 10/25/18 6:57 AM, Kurt Jaeger wrote:
> Hi!
> 
>> FreeBSD konjak 13.0-CURRENT FreeBSD 13.0-CURRENT r339705 GENERIC  amd64
>>
>> pkg-static install -f pkg
>> pkg-static: Warning: Major OS version upgrade detected.  Running
>> "pkg-static install -f pkg" recommended
>> Updating FreeBSD repository catalogue...
>> pkg-static: Repository FreeBSD load error: access repo
>> file(/var/db/pkg/repo-FreeBSD.sqlite) failed: No such file or directory
>> pkg-static: http://pkg.FreeBSD.org/FreeBSD:13:amd64/latest/meta.txz: Not
>> Found
> 
> The package builders are not yet providing packages for 13.0-CURRENT.
> 
> It's being worked on, but will take a little more time.
> 


Portmgr has pushed out packages for 13, give 'em a spin and see what
happens.

sean




Re: UFS panics

2018-10-24 Thread Sean Bruno



On 10/24/18 9:22 AM, Leandro wrote:
> Hello,
> 
> I'm seeing a kernel panic when trying to move a specific file.
> 
> panic: Bad effnlink fip 0xc004a2c69a00, fdp 0xc00497093be0,
> tdp 0xc004be6295a0
> cpuid = 72
> time = 1540283798
> KDB: stack backtrace:
> 0xe000adb8dcc0: at .kdb_backtrace+0x5c
> 0xe000adb8ddf0: at .vpanic+0x1b4
> 0xe000adb8deb0: at .panic+0x38
> 0xe000adb8df40: at .ufs_readdir+0x2f24
> 0xe000adb8e1b0: at .VOP_RENAME_APV+0x190
> 0xe000adb8e240: at .kern_renameat+0x3c0
> 0xe000adb8e540: at .sys_rename+0x2c
> 0xe000adb8e5c0: at .trap+0x65c
> 0xe000adb8e780: at .powerpc_interrupt+0x290
> 0xe000adb8e820: user SC trap by 0x81010b7b8: srr1=0x9000f032
> r1=0x3fffc490 cr=0x3428 xer=0x2000
> ctr=0x81010b7b0 r2=0x8102c5950
> KDB: enter: panic
> 
> Using 'ls' or 'rm' on the file gives a "Bad file descriptor" error.
> Using 'cat', I get another panic, but now during open.
> Everything indicates that the file system is in an inconsistent state.
> 
> Therefore, I would just like to ask the following: is it expected for
> kernel panics to happen when there are errors in the file system?
> 
> Thanks,
> Leandro

I can't make much sense of the ppc backtrace here, but I assume this is UFS?

If it is, UFS tends to panic to protect data in an error case and ZFS
tends to go read-only in an error case.  (An overgeneralization here, but
it's been my experience.)

sean




Re: UEFI layout includes a freebsd-boot?

2018-09-23 Thread Sean Bruno



On 9/23/18 3:57 PM, David P. Discher wrote:
> This is correct for a EFI+BIOS map. If the installer did this for an EFI
> only map, then that is a bug.
> 

Is it supposed to "Detect" BIOS vs UEFI booting in the installer?

sean

> My 12-Alpha7 install I just did, with BIOS only:
> 
> dpd@amd:~ % gpart show
> =>       40  234441568  ada0  GPT  (112G)
>          40       1024     1  freebsd-boot  (512K)
>        1064        984        - free -  (492K)
>        2048    4194304     2  freebsd-swap  (2.0G)
>     4196352  230244352     3  freebsd-zfs  (110G)
>   234440704        904        - free -  (452K)
> 
> 
> 
> The “free” section seems a bit aggressive (large) … assuming for sector
> alignment.  ( Would be cool for future feature if freebsd-boot can be
> encapsulated in the EFI partition. ) 
> 
> 
> --
> David P. Discher 
> https://davidpdischer.com/
> d...@dpdtech.com
> 
>> On Sep 23, 2018, at 1:57 PM, Sean Bruno <sbr...@freebsd.org> wrote:
>>
>> I don't think this layout from the installer is correct, but I could be
>> wrong.  Is there any reason to have a freebsd-boot in this layout
>> created by the installer when using UEFI?
>>
>> % gpart show
>> =>       40  537234688  ada0  GPT  (256G)
>>          40     409600     1  efi  (200M)
>>      409640       1024     2  freebsd-boot  (512K)
>>      410664        984        - free -  (492K)
>>      411648   67108864     3  freebsd-swap  (32G)
>>    67520512  469712896     4  freebsd-zfs  (224G)
>>   537233408       1320        - free -  (660K)
>>
>>
> 




IPv6 for local_unbound?

2018-09-23 Thread Sean Bruno

Does it make sense to add an IPv6 localhost (::1) to our setup scripts
for local_unbound?  unbound is definitely listening on ::1 as well as
127.0.0.1, so things like "host -6" will work if we add it like this, perhaps?

--- /usr/sbin/local-unbound-setup	2018-09-20 21:47:41.0 -0600
+++ /tmp/local-unbound-setup	2018-09-23 13:27:01.841365000 -0600
@@ -152,6 +152,7 @@
 	done
 	if [ "${localhost}" = "no" ] ; then
 		echo "nameserver 127.0.0.1"
+		echo "nameserver ::1"
 	fi
 	if [ "${edns0}" = "no" ] ; then
 		echo "options edns0"




UEFI layout includes a freebsd-boot?

2018-09-23 Thread Sean Bruno

I don't think this layout from the installer is correct, but I could be
wrong.  Is there any reason to have a freebsd-boot in this layout
created by the installer when using UEFI?

% gpart show
=>       40  537234688  ada0  GPT  (256G)
         40     409600     1  efi  (200M)
     409640       1024     2  freebsd-boot  (512K)
     410664        984        - free -  (492K)
     411648   67108864     3  freebsd-swap  (32G)
   67520512  469712896     4  freebsd-zfs  (224G)
  537233408       1320        - free -  (660K)
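
For anyone checking the numbers, gpart reports start and size in sectors. Assuming 512-byte sectors (the 537234688-sector disk showing as 256G is consistent with that), a small sketch like this reproduces the sizes and shows that the 984-sector gap pads the next partition out to a 1 MiB boundary. The helper name is made up for illustration.

```shell
# Convert a count of 512-byte sectors to KiB (assumes 512-byte sectors).
sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

sectors_to_kib 409600   # efi partition: 204800 KiB, i.e. 200M
sectors_to_kib 1024     # freebsd-boot: 512 KiB
sectors_to_kib 984      # free gap: 492 KiB

# The swap partition starts at sector 411648; 2048 sectors is 1 MiB.
echo $(( 411648 % 2048 ))   # 0 means the start is 1 MiB aligned
```

So the gap itself looks like ordinary alignment padding; the open question is only why freebsd-boot is there at all on a UEFI install.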





Re: Some delete-old detrius

2018-09-19 Thread Sean Bruno



On 9/18/18 8:48 PM, Mark Millard wrote:
> A problem here is that the references should be to usr.bin instead
> of usr/bin . See:
> 
> https://lists.freebsd.org/pipermail/svn-src-head/2018-September/118426.html
> 
> which reports for svn commit: r336601 - head:
> 
> QUOTE
> Looks like this head/ObsoleteFiles.inc update has a typo
> in each thing added to OLD_FILES . . .
> 
> # ls -lTdt /usr/tests/usr.bin/indent/*
> -r--r--r--  1 root  wheel   121 Sep 13 22:53:30 2018 /usr/tests/usr.bin/indent/Kyuafile
> . . .
> -r--r--r--  1 root  wheel   295 Sep 13 22:53:29 2018 /usr/tests/usr.bin/indent/binary.0
> -r--r--r--  1 root  wheel    92 May  1 19:35:24 2018 /usr/tests/usr.bin/indent/sac.0.pro
> -r--r--r--  1 root  wheel   130 May  1 19:35:24 2018 /usr/tests/usr.bin/indent/sac.0.stdout
> -r--r--r--  1 root  wheel   122 May  1 19:35:24 2018 /usr/tests/usr.bin/indent/sac.0
> -r--r--r--  1 root  wheel    94 May  1 19:35:24 2018 /usr/tests/usr.bin/indent/nsac.0.pro
> -r--r--r--  1 root  wheel   130 May  1 19:35:24 2018 /usr/tests/usr.bin/indent/nsac.0.stdout
> -r--r--r--  1 root  wheel   123 May  1 19:35:24 2018 /usr/tests/usr.bin/indent/nsac.0
> 
> vs. ( note usr.bin vs. usr/bin ):
> 
> Modified: head/ObsoleteFiles.inc
> ==
> --- head/ObsoleteFiles.inc  Sun Jul 22 12:04:21 2018  (r336600)
> +++ head/ObsoleteFiles.inc  Sun Jul 22 12:45:02 2018  (r336601)
> @@ -38,6 +38,13 @@
>  #   xargs -n1 | sort | uniq -d;
>  # done
>  
> +# 20180722: indent(1) option renamed, test files follow
> +OLD_FILES+=usr/bin/indent/tests/nsac.0
> +OLD_FILES+=usr/bin/indent/tests/nsac.0.pro
> +OLD_FILES+=usr/bin/indent/tests/nsac.0.stdout
> +OLD_FILES+=usr/bin/indent/tests/sac.0
> +OLD_FILES+=usr/bin/indent/tests/sac.0.pro
> +OLD_FILES+=usr/bin/indent/tests/sac.0.stdout
>  # 20180721: move of libmlx5.so.1 and libibverbs.so.1
>  OLD_LIBS+=usr/lib/libmlx5.so.1
>  OLD_LIBS+=usr/lib/libibverbs.so.1
> END QUOTE
> 
> This was after having the nsac and sac kyua tests report
> failures in my environment, long after they should have
> not existed.
> 
> 
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
> 
> 

So, what should we even do here?  It sort of looks like this entire
section should be purged or something.

There was a bit of error committing this section I guess.
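
If the fix is simply to correct the paths, one way to sketch it (going by the ls output quoted above, where the files live under usr/tests/usr.bin/indent/) is a sed rewrite of the six entries. The function name is hypothetical and this has not been run against the tree:

```shell
# Rewrite the mistyped OLD_FILES entries to the location the ls output
# shows the files actually occupying. Reads stdin, writes stdout.
fix_indent_paths() {
    sed 's|^OLD_FILES+=usr/bin/indent/tests/|OLD_FILES+=usr/tests/usr.bin/indent/|'
}

# Example with one of the six entries from r336601:
printf '%s\n' 'OLD_FILES+=usr/bin/indent/tests/nsac.0' | fix_indent_paths
# prints: OLD_FILES+=usr/tests/usr.bin/indent/nsac.0
```

Lines that do not match the mistyped prefix pass through unchanged, so the rest of ObsoleteFiles.inc would be left alone.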

sean




Some delete-old detrius

2018-09-18 Thread Sean Bruno

doing a delete-old run:
drwxr-xr-x  2 root  wheel  25 Sep 18 22:55 usr/lib/debug/usr/lib/i18n
rm: usr/bin/indent/tests/nsac.0: Not a directory
rm: usr/bin/indent/tests/nsac.0.pro: Not a directory
rm: usr/bin/indent/tests/nsac.0.stdout: Not a directory
rm: usr/bin/indent/tests/sac.0: Not a directory
rm: usr/bin/indent/tests/sac.0.pro: Not a directory
rm: usr/bin/indent/tests/sac.0.stdout: Not a directory
rm: usr/lib/debug/usr/lib/i18n: is a directory


I'm not clear on how long we leave stuff lying around in Obsolete or
whatever before we purge them.  Is this output with us forever?

sean




Re: page fault in ip6_output

2018-09-02 Thread Sean Bruno



On 9/2/18 6:17 AM, Alexander Leidinger wrote:
> Hi,
> 
> -current at r338322 with manually applied r338372 (fix potential data
> corruption in iflib) and r338416 (re-compute arc size).
> 
> What worries me a little bit about the validity of this report is the
> gdb 8.1.1 error when loading the dump/kernel:
> ---snip---
> warning: kld_current_sos: Can't read filename: Unknown error: -1
> 
> inferior.c:311: internal-error: struct inferior *find_inferior_pid(int):
> Assertion `pid != 0' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) [answered Y; input not from terminal]
> 
> This is a bug, please report it.  For instructions, see:
> .
> 
> inferior.c:311: internal-error: struct inferior *find_inferior_pid(int):
> Assertion `pid != 0' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) [answered Y; input not from terminal]
> Abort trap (core dumped)
> ---snip---
> 
> kernel panic:
> ---snip---
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 5; apic id = 13
> fault virtual address   = 0x98
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x8068cbf2
> stack pointer   = 0x28:0xfe0128caa510
> frame pointer   = 0x28:0xfe0128caa760
> code segment    = base 0x0, limit 0xf, type 0x1b
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = interrupt enabled, resume, IOPL = 0
> current process = 1658 (isc-worker0003)
> trap number = 12
> panic: page fault
> cpuid = 5
> time = 1535835179
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe0128caa1c0
> vpanic() at vpanic+0x1a3/frame 0xfe0128caa220
> panic() at panic+0x43/frame 0xfe0128caa280
> trap_fatal() at trap_fatal+0x35f/frame 0xfe0128caa2d0
> trap_pfault() at trap_pfault+0x49/frame 0xfe0128caa330
> trap() at trap+0x2ba/frame 0xfe0128caa440
> calltrap() at calltrap+0x8/frame 0xfe0128caa440
> --- trap 0xc, rip = 0x8068cbf2, rsp = 0xfe0128caa510, rbp =
> 0xfe0128caa760 ---
> ip6_output() at ip6_output+0xf82/frame 0xfe0128caa760
> udp6_send() at udp6_send+0x702/frame 0xfe0128caa920
> sosend_dgram() at sosend_dgram+0x346/frame 0xfe0128caa980
> kern_sendit() at kern_sendit+0x170/frame 0xfe0128caaa10
> sendit() at sendit+0x19e/frame 0xfe0128caaa60
> sys_sendmsg() at sys_sendmsg+0x61/frame 0xfe0128caaac0
> amd64_syscall() at amd64_syscall+0x254/frame 0xfe0128caabf0
> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe0128caabf0
> --- syscall (28, FreeBSD ELF64, sys_sendmsg), rip = 0x8015adf0a, rsp =
> 0x7fffdf9f7218, rbp = 0x7fffdf9f7250 ---
> Uptime: 22h37m4s
> Dumping 13174 out of 61352
> MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> ---snip---
> 
> I can not reproduce it at will, but it happens often enough (from once a
> day to several times after each reboot).
> 
> Can this gdb be trusted? If yes, which frame do you want to see more
> detailed?
> 
> Bye,
> Alexander.
> 


I think you have hit this, no?
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230950




Re: IPMI SOL seems to not accept characters after getty starts

2018-08-08 Thread Sean Bruno



On 08/08/18 17:13, Warner Losh wrote:
> 
> 
> On Wed, Aug 8, 2018 at 4:39 PM, Sean Bruno <sbr...@freebsd.org> wrote:
> 
> tl;dr pxeboot new x86 host, ipmi sol works in loader, not after
> multiuser.
> 
> The FreeBSD cluster just acquired 4x Supermicro X11DDW-L and I am having
> the hardest time with the IPMI SOL interface.
> 
> I have configured "COM2" as the IPMI SOL interface and enabled console
> redirection.  Netbooting via pxeboot works well and the loader menu is
> interactive and responds.
> 
> After NFS booting into freebsd (current or stable/11), getty fires up
> and attaches to ttyu0.  It prompts me correctly, but it does not accept
> my keystrokes.
> 
> If I do not configure /etc/ttys to enable a tty unconditionally (on vs
> onifconsole), I see dmesg/kernel boot messages but never get a tty.
> 
> It's as though FreeBSD does *not* recognize the IPMI SOL port as the
> console or something and I'm super confused.  Any thoughts here?
> 
> 
> Works fine for me.
> 
> So, let's start with your /boot.config (or /boot/config), loader.conf and
> device.hints settings. Also BIOS or UEFI?
> 
> Warner
>  

Works for you on this exact Supermicro?

I am using the defaults all around.  This is booting BIOS mode PXE, all
console output appears on the IPMI SOL interface.  Driving the beastie
menu in pxeboot/loader works fine.  When the loader hands the uart off
to the kernel, I see all boot output and "everything is fine"

The problem arises when trying to login.  I see the amnesiac login
prompt, but no key strokes are registered.

loader.conf:
console="comconsole"
comconsole_speed="115200"

boot.config:


sean





IPMI SOL seems to not accept characters after getty starts

2018-08-08 Thread Sean Bruno

tl;dr pxeboot new x86 host, ipmi sol works in loader, not after multiuser.

The FreeBSD cluster just acquired 4x Supermicro X11DDW-L and I am having
the hardest time with the IPMI SOL interface.

I have configured "COM2" as the IPMI SOL interface and enabled console
redirection.  Netbooting via pxeboot works well and the loader menu is
interactive and responds.

After NFS booting into freebsd (current or stable/11), getty fires up
and attaches to ttyu0.  It prompts me correctly, but it does not accept
my keystrokes.

If I do not configure /etc/ttys to enable a tty unconditionally (on vs
onifconsole), I see dmesg/kernel boot messages but never get a tty.

It's as though FreeBSD does *not* recognize the IPMI SOL port as the
console or something and I'm super confused.  Any thoughts here?

sean




Re: [regression] The USB WiFi card stopped working: if_run doesn't create the 'run0' interface any more

2018-07-03 Thread Sean Bruno



On 07/03/18 13:53, Yuri wrote:
> On 07/03/18 12:50, Lev Serebryakov wrote:
>>   No, it isn't.
>>
>>  
>> https://lists.freebsd.org/pipermail/freebsd-wireless/2016-October/007232.html
>>
>>
>>   I don't know, why is it not mentioned in UPDATING:-(
> 
> 
> I may be mistaken about the interface creation.
> 
> But regardless of the cause, WiFi doesn't work any more. :-(
> 
> 
> Yuri
> 
> 
> 


Yuri:

If you're still having trouble, dump your rc.conf entries for your
wireless.  Mine looks like this at the moment with iwn(4):

wlans_iwn0="wlan0"
ifconfig_wlan0="WPA DHCP"
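
For a run(4) adapter, the equivalent would presumably just swap the parent device name. A sketch, assuming the device attaches as run0 (that name comes from the thread subject, not from anything verified on your machine):

```shell
# /etc/rc.conf (sketch): create wlan0 on top of the run0 USB device
wlans_run0="wlan0"
ifconfig_wlan0="WPA DHCP"
```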

sean





Re: em0 link fail

2018-07-03 Thread Sean Bruno



On 07/03/18 11:47, Michael Butler wrote:
> On June 1st, I was able to do my monthly laptop ZFS snap-shot/back-up
> (using "zfs snapshot -r zroot@backup; zfs send -R >nfs-filesys"). Now I
> can't without the em0 interface stalling :-(
> 

Can you tell what version of FreeBSD SVN was in use on "June 1st" ?

sean

> On a guess, I tried reverting SVN r335303 but that didn't help.
> 
> em0:  port 0xf080-0xf09f mem
> 0xf7e0-0xf7e1,0xf7e39000-0xf7e39fff irq 20 at device 25.0 on pci0
> em0: attach_pre capping queues at 1
> em0: using 1024 tx descriptors and 1024 rx descriptors
> em0: msix_init qsets capped at 1
> em0: PCIY_MSIX capability not found; or rid 0 == 0.
> em0: Using an MSI interrupt
> em0: allocated for 1 tx_queues
> em0: allocated for 1 rx_queues
> em0: Ethernet address: f0:1f:af:66:95:7e
> em0: netmap queues/slots: TX 1/1024, RX 1/1024
> em0: link state changed to UP
> 
>  [ initiate "zfs send" ]
> 
> em0: TX(0) desc avail = 41, pidx = 172
> em0: link state changed to DOWN
> em0: TX(0) desc avail = 1024, pidx = 0
> em0: TX(0) desc avail = 1024, pidx = 0
> 
>  .. ad nauseam ..
> 
> "ifconfig em0 down; ifconfig em0 up" doesn't help.
> 
> Any hints?
> 
>   imb
> 
> 




Re: head -r335795 broke the builds for ci.freebsd.org 's FreeBSD-head-{amd64,i386,powerpc,risc64,sparc64}-build

2018-06-29 Thread Sean Bruno



On 06/29/18 08:50, Mark Millard wrote:
> Side note:
> 
> All I get for the lists that I normally look at is:
> 
> Error 503 Backend fetch failed
> 
> Backend status: Backend fetch failed
> 
> Transaction ID: . . .


I think I fixed that about an hour ago.  Try again.

sean




Re: how to browse svnweb source?

2018-05-28 Thread Sean Bruno



On 05/28/18 15:37, Jeffrey Bouquet wrote:
> Suddenly the site www.secnetix.de/olli/FreeBSD/svnews which showed sequential
> source as for example xx1966 on april 3  xx2040 on april 4 this year, is 
> not loading
> in the browser.  It was informative to read color coded the source backported 
> to v10
> v11 vs new, and new drivers coming into play.  I can find NOWHERE except
> freshsource.org which has the ports updates interspersed which makes the 
> information
> too time consuming.  As an example,
> 
> 09:36:34 - r 318137Affects: /head/usr.bin/mking/mking.1 [ mking.c] 
> on 5-10-2017 adding the -C and --capacity options...
> 
> ...
>   What was educational to browse now is found at 
> ..
> 


https://svnweb.freebsd.org/base/head/

?

sean




Re: Intel Corporation Wireless 8265 / 8275 (rev 78) on FreeBSD 12

2018-05-23 Thread Sean Bruno



On 05/23/18 01:41, David Pan wrote:
> Hi:
> How do I configure the Intel Corporation Wireless 8265 / 8275 (rev 78) on
> FreeBSD 12? I installed FreeBSD 12 on my thinkpad t470p, and cannot drive
> the Intel Wireless Card.
> Please help me fix that,
> 
> thx
> David Pan
> 

This seems to be an "iwm(4)" device.  Try loading this driver in loader.conf.


https://www.freebsd.org/doc/handbook/network-wireless.html
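
Roughly, and assuming the 8265 firmware module is the right one for this card (check iwm(4) on your system), the settings would look like this:

```shell
# /boot/loader.conf (sketch): load the iwm(4) driver and 8265 firmware at boot
if_iwm_load="YES"
iwm8265fw_load="YES"

# /etc/rc.conf (sketch): create wlan0 on top of iwm0 and bring it up
wlans_iwm0="wlan0"
ifconfig_wlan0="WPA DHCP"
```

The first two lines belong in loader.conf and the last two in rc.conf; they are shown together only to keep the example short.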

sean




Re: Microcode Updater Changes

2018-05-17 Thread Sean Bruno

On 05/16/18 10:42, Sean Bruno wrote:
> https://reviews.freebsd.org/D15443
> 
> The FreeBSD Foundation has collaborated with a few folks so that we can
> continue to process and update microcode on FreeBSD.
> 
> This review represents the first step in updating our infrastructure.
> If you are currently using microcode updates, please give this a spin
> and report back if you see a different version of microcode being loaded
> on the new package vs the old package.
> 
> sean
> 
> bcc emaste
> 

Ports svn 470255 updates the microcode package devcpu-data to process
the same files that Linux uses (and the only ones Intel releases now).

If you are a consumer of this package, you should see *no* difference in
the version update to your CPU.  If you do, please respond to this thread.

sean


Microcode Updater Changes

2018-05-16 Thread Sean Bruno

https://reviews.freebsd.org/D15443

The FreeBSD Foundation has collaborated with a few folks so that we can
continue to process and update microcode on FreeBSD.

This review represents the first step in updating our infrastructure.
If you are currently using microcode updates, please give this a spin
and report back if you see a different version of microcode being loaded
on the new package vs the old package.

sean

bcc emaste




Recent warnings.

2018-05-04 Thread Sean Bruno

make[3]: "/usr/src/share/mk/bsd.prog.mk" line 274: warning: duplicate
script for target "_scriptsinstall" ignored
make[3]: "/usr/src/share/mk/bsd.prog.mk" line 274: warning: using
previous script for "_scriptsinstall" defined here


This popped up on me this week.  Anyone see what's going on?

sean




Re: [CFT] sysutils/devcpu-data Intel microcode migration

2018-01-17 Thread Sean Bruno



On 01/15/18 11:14, Sean Bruno wrote:
> https://reviews.freebsd.org/D13921
> 
> In order to better absorb updates as they appear, I'm proposing that we
> switch from the current model of processing the "microcode.dat" legacy
> file to consuming the pre-digested update files.
> 
> This update should not change the microcode version that you previously
> received, but I'd like for folks to give it a spin before we commit yet
> another update to this port.
> 
> sean
> 


*sigh*

https://reviews.freebsd.org/D13958

Looks like there are some "discrepancies" in the legacy vs current
method of distribution of microcode.

sean




[CFT] sysutils/devcpu-data Intel microcode migration

2018-01-15 Thread Sean Bruno

https://reviews.freebsd.org/D13921

In order to better absorb updates as they appear, I'm proposing that we
switch from the current model of processing the "microcode.dat" legacy
file to consuming the pre-digested update files.

This update should not change the microcode version that you previously
received, but I'd like for folks to give it a spin before we commit yet
another update to this port.

sean




Re: [CFT] AMD cpu microcode update port sysutils/devcpu-data D13832

2018-01-12 Thread Sean Bruno



On 01/12/18 04:49, Rainer Hurling wrote:
> Am 12.01.2018 um 00:03 schrieb Sean Bruno:
>> https://reviews.freebsd.org/D13832  <--- test this update
>>
>> I'd like to get some feedback from AMD cpu users on this update.  I've
>> restructured and undone a few things that may have been keeping folks
>> using this port from getting their runtime cpu microcode updates.
>>
>> After installing the port, grab your microcode version via
>> sysutils/x86info.  If you don't see an update, that only means there is
>> no update available for your system.
>>
>> x86info -a | grep Microcode
>>
>> Run the microcode_update and repeat.  Check /var/log/messages for a
>> notification that the code was updated.  You should be able to get
>> something like the following for your system if it executed an update:
>>
>> root@lab:/home/sbruno # x86info -a | grep "CPU Model"
>> CPU Model (x86info's best guess): AMD FX Series Processor (OR-B2)
>>
>> root@lab:/home/sbruno # x86info -a | grep Microcode
>> Microcode patch level: 0x6000629
>>
>> root@lab:/home/sbruno # /usr/local/etc/rc.d/microcode_update onestart
>> Updating CPU Microcode...
>> Done.
>>
>> root@lab:/home/sbruno # x86info -a | grep Microcode
>> Microcode patch level: 0x600063d
>>
>> root@lab:/home/sbruno # grep microcode_update /var/log/messages
>> Jan 10 16:52:26 lab microcode_update:
>> /usr/local/share/cpucontrol/microcode_amd_fam15h.bin: updating cpu
>> /dev/cpuctl0 to revision 0x600063d... done.
>>
> 
> Just for the record, for an older Phenom with dmesg:
> 
> CPU: AMD Phenom(tm) II X6 1090T Processor (3214.31-MHz K8-class CPU)
>   Origin="AuthenticAMD"  Id=0x100fa0  Family=0x10  Model=0xa  Stepping=0
> Features=0x178bfbff
>   Features2=0x802009
>   AMD
> Features=0xee500800
>   AMD
> Features2=0x37ff
>   SVM: NP,NRIP,NAsids=64
>   TSC: P-state invariant, performance statistics
> 
> 
> #x86info -a | grep "CPU Model"
> CPU Model (x86info's best guess): Phenom/Athlon/Sempron/Turion
> (II)/Opteron (PH-E0)
> 
> #x86info -a | grep Microcode
> Microcode patch level: 0x1bf
> 
> #/usr/local/etc/rc.d/microcode_update onestart
> Updating CPU Microcode...
> Done.
> 
> #x86info -a | grep Microcode
> Microcode patch level: 0x1bf
> 
> #grep microcode_update /var/log/messages
> ---
> 
> So no recent update and no log messages, as expected ;)
> 
> Regards,
> Rainer Hurling
> 


Thank you!

sean




Re: [CFT] AMD cpu microcode update port sysutils/devcpu-data D13832

2018-01-12 Thread Sean Bruno



On 01/12/18 08:38, Mike Tancsa wrote:
> On 1/11/2018 6:03 PM, Sean Bruno wrote:
>> https://reviews.freebsd.org/D13832  <--- test this update
>>
>> I'd like to get some feedback from AMD cpu users on this update.  I've
>> restructured and undone a few things that may have been keeping folks
>> using this port from getting their runtime cpu microcode updates.
> Hi,
>   I am trying out on RELENG_11 on a Ryzen CPU
> 
> Without kib's commits at
> 
> https://lists.freebsd.org/pipermail/svn-src-stable-11/2018-January/005320.html
> 
> I get
> 
> root@testamd:/usr/ports/sysutils/devcpu-data #
> /usr/local/etc/rc.d/microcode_update onestart
> Updating CPU Microcode...
> Re-evalutation of CPU flags Failed.
> root@testamd:/usr/ports/sysutils/devcpu-data #

Correct, this is expected.

> 
> with r327597,
> 
> root@testamd:/home/mdtancsa # /usr/local/etc/rc.d/microcode_update onestart
> Updating CPU Microcode...
> Done.
> root@testamd:/home/mdtancsa #
> 
> 
> 
> running x86info -a also generates this error / warning ?
> 
> CPU0: local APIC error 0x80
> 
> 

Probably.  Update your x86info port; I pushed an update for this and
it might work better for you.

> root@testamd:/home/mdtancsa # x86info -a | grep -i microco
> Microcode patch level: 0x8001129
> root@testamd:/home/mdtancsa # x86info -a | head -20
> x86info v1.31pre
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Unknown CPU family: 0x17
> Found 12 identical CPUs
> Extended Family: 8 Extended Model: 0 Family: 15 Model: 1 Stepping: 1
> CPU Model (x86info's best guess):
> Processor name string (BIOS programmed): AMD Ryzen 5 1600X Six-Core
> Processor
> 
> Number of reporting banks : 7
> 
> root@testamd:/home/mdtancsa #
> 
> 
> Also your diff is based on a previous version of the port. There was an
> update since, with new microcode from Intel that gets clobbered in your
> diffs.
> 
> 

The update to Intel microcode was reverted.  So, possibly the tree you
are using is out of date?

sean




[CFT] AMD cpu microcode update port sysutils/devcpu-data D13832

2018-01-11 Thread Sean Bruno

https://reviews.freebsd.org/D13832  <--- test this update

I'd like to get some feedback from AMD cpu users on this update.  I've
restructured and undone a few things that may have been keeping folks
using this port from getting their runtime cpu microcode updates.

After installing the port, grab your microcode version via
sysutils/x86info.  If you don't see an update, that only means there is
no update available for your system.

x86info -a | grep Microcode

Run the microcode_update and repeat.  Check /var/log/messages for a
notification that the code was updated.  You should be able to get
something like the following for your system if it executed an update:

root@lab:/home/sbruno # x86info -a | grep "CPU Model"
CPU Model (x86info's best guess): AMD FX Series Processor (OR-B2)

root@lab:/home/sbruno # x86info -a | grep Microcode
Microcode patch level: 0x6000629

root@lab:/home/sbruno # /usr/local/etc/rc.d/microcode_update onestart
Updating CPU Microcode...
Done.

root@lab:/home/sbruno # x86info -a | grep Microcode
Microcode patch level: 0x600063d

root@lab:/home/sbruno # grep microcode_update /var/log/messages
Jan 10 16:52:26 lab microcode_update:
/usr/local/share/cpucontrol/microcode_amd_fam15h.bin: updating cpu
/dev/cpuctl0 to revision 0x600063d... done.
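
The before/after check above can also be scripted. What follows is only a
minimal sketch, assuming the "Microcode patch level: 0x..." line format shown
in the transcript; the helper names patch_level and microcode_changed are
made up for illustration and are not part of x86info or the port.

```python
import re


def patch_level(x86info_output: str) -> int:
    """Return the microcode patch level found in `x86info -a` output."""
    match = re.search(r"Microcode patch level:\s*(0x[0-9a-fA-F]+)", x86info_output)
    if match is None:
        raise ValueError("no 'Microcode patch level' line found")
    return int(match.group(1), 16)


def microcode_changed(before: str, after: str) -> bool:
    """True if the patch level differs between the two captured outputs."""
    return patch_level(before) != patch_level(after)


# Values copied from the transcript above.
before = "Microcode patch level: 0x6000629\n"
after = "Microcode patch level: 0x600063d\n"
print(microcode_changed(before, after))
```

An unchanged patch level is not an error by itself; as noted above, it can
simply mean there is no update available for that CPU.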




Re: Panic: @r323525: iflib

2017-09-13 Thread Sean Bruno

> Previous successful build was:
> FreeBSD g1-252.catwhisker.org 12.0-CURRENT FreeBSD 12.0-CURRENT #398  
> r323483M/323489:1200044: Tue Sep 12 04:31:08 PDT 2017 
> r...@g1-252.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY  amd64
> 
> The usual historical information, including a verbose-boot dmesg.boot
> from the above-cited build, may be found at
> .
> 
> I will try hand-transcribing some of the lock & backtrace info:
> 
> ...
> em0: allocated for 1 rx_queues
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex taskqgroup (taskqgroup) r = 0 (0xfe07be2e4800) 
> locked @ /usr/src/sys/kern/subr_gtaskqueue.c:803
> stack backtrace:  [which I am abbreviating at this point -- dhw]
> #0 ... at witness_debugger+0x73
> #1 ... at witness_warn+0x43f
> #2 ... at trap_pfault+0x53
> #3 ... at trap+0x2c5
> #4 ... at calltrap+0x8
> #5 ... at iflib_device_register+0x2a61
> #6 ... at iflib_device_attach+0xb7
> #7 ... at device_attach+0x3ee
> #8 ... at bus_generic_attach+0x5a
> #9 ... at pci_attach+0xd5
> #10 ... at device_attach+0x3ee
> #11 ... at bus_generic_attach+0x5a
> #12 ... at acpi_pcib_acpi_attach+0x3bc
> #13 ... at device_attach+0x3ee
> #14 ... at bus_generic_attach+0x5a
> #15 ... at acpi_attach+0xe85
> #16 ... at device_attach+0x3ee
> #17 ... at bus_generic_attach+0x5a
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 02
> fault virtual address   = 0x8b530c20
> fault code  = supervisor write data, page not present
> ...
> [ thread pid 0 tid 10 ]
> Stopped at  0x80a743b0 = taskqgroup_attach+0x230:orq   
> %rax,-0x 58(%rbp,%xrx,8)
> 
> I can provide more specific excerpts, but I need to focus on some
> other activities for a while.
> 
> Peace,
> david
> 


When you get a chance, let me know what em(4) device is in your machine
(pciconf -lvbc).  I'll see if I have one around here to test.
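
If it helps with matching hardware, the "chip=" field in pciconf output packs
the PCI device ID in the upper 16 bits and the vendor ID in the lower 16 bits.
A rough sketch of pulling that out for em(4) devices follows; the sample line
is illustrative only and the function names are hypothetical.

```python
import re
from typing import Dict, Tuple

# Matches the first line of a pciconf -l entry, e.g. "em0@pci0:0:25:0: ... chip=0x153a8086".
PCICONF_RE = re.compile(r"^(?P<dev>\w+)@pci\S*.*?chip=0x(?P<chip>[0-9a-fA-F]{8})")


def decode_chip(chip_hex: str) -> Tuple[int, int]:
    """Split a pciconf chip value into (vendor_id, device_id)."""
    value = int(chip_hex, 16)
    return value & 0xFFFF, value >> 16


def nic_ids(pciconf_output: str, driver: str = "em") -> Dict[str, Tuple[int, int]]:
    """Map each device attached to `driver` to its (vendor_id, device_id)."""
    ids = {}
    for line in pciconf_output.splitlines():
        match = PCICONF_RE.match(line)
        if match and match.group("dev").rstrip("0123456789") == driver:
            ids[match.group("dev")] = decode_chip(match.group("chip"))
    return ids


# Illustrative sample line, not taken from the machine in this report.
sample = "em0@pci0:0:25:0: card=0x11ed1734 chip=0x153a8086 rev=0x05 hdr=0x00\n"
for dev, (vendor, device) in nic_ids(sample).items():
    print(f"{dev}: vendor=0x{vendor:04x} device=0x{device:04x}")
```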

I'm assuming you do *not* have any iflib or em(4) tuning options set either.

sean




Re: Failover Mode Between Ethernet and Wireless Interfaces broken on >= 11

2017-06-21 Thread Sean Bruno



On 06/21/17 11:48, Renato Botelho wrote:
> I've already sent it to net, but I suspect this is the appropriate place
> to discuss this subject.
> 
> Last night I was configuring a new laptop and decided to give it [1] a
> try. I figured out this section of handbook (similar instructions are on
> lagg(4) manpage) is outdated, based on FreeBSD 10.x.
> 
> Then I modified a bit the commands and tried to get it configured on
> 12-CURRENT, without success. I spoke with adrian@, who told me this
> setup doesn't work on FreeBSD > 10, because on newer versions Wireless
> interfaces mac address cannot be changed.
> 
> My next attempt was to do the other way round and make lagg to use wlan0
> mac address instead of em0's. but even doing this my wireless interface
> ended up not working.
> 
> After further investigation I noted that a simple command:
> 
> # ifconfig wlan0 ether $wlan0_current_mac_address
> 
> is enough to break it on 12-CURRENT.
> 
> I've checked the if_setlladdr() source code and noted it always replaces the
> mac address, even if the same one is already configured on the interface. Is
> that the expected behavior?
> 
> Just as a PoC I've applied the following patch to if_setlladdr():
> 
> Index: sys/net/if.c
> ===
> --- sys/net/if.c  (revision 320097)
> +++ sys/net/if.c  (working copy)
> @@ -3519,6 +3519,10 @@
>   ifa_free(ifa);
>   return (EINVAL);
>   }
> + if (memcmp(lladdr, LLADDR(sdl), len) == 0) {
> + ifa_free(ifa);
> + return (0);
> + }
>   switch (ifp->if_type) {
>   case IFT_ETHER:
>   case IFT_FDDI:
> 
> And configured it to use wlan0 mac address on rc.conf:
> 
> ifconfig_em0="ether 60:67:20:c5:2d:48 up"
> wlans_iwn0="wlan0"
> ifconfig_wlan0="WPA"
> cloned_interfaces="lagg0"
> ifconfig_lagg0="up laggproto failover laggport em0 laggport wlan0 DHCP"
> 
> and it's now working as expected.
> 
> Other than that, I believe if wlan interfaces cannot have their mac
> address changed, ifconfig should return an error when user attempts to
> do it, and if_setlladdr() should do the same.
> 
> Thoughts?
> 
> [1]
> https://www.freebsd.org/doc/handbook/network-aggregation.html#networking-lagg-wired-and-wireless
> 


Maybe this is an "iflib" problem.  em(4) and igb(4) are pretty different
now in head.  Can you shove it into bugzilla with a test case
(copy/paste your email) and tag me on it?

sean




List test, please ignore.

2017-04-11 Thread Sean Bruno

ignore




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-24 Thread Sean Bruno



On 01/24/17 08:27, Olivier Cochard-Labbé wrote:
> On Tue, Jan 24, 2017 at 3:17 PM, Sean Bruno wrote:
> 
> 
> 
> Did you increase the number of rx/tx rings to 8 and the number of
> descriptors to 4k in your tests or just the defaults?
> 
> 
> Tuning are same as described in my previous email (rxd|txd=2048, rx|tx
> process_limit=-1, max_interrupt_rate=16000).
> [root@apu2]~# sysctl hw.igb.
> hw.igb.tx_process_limit: -1
> hw.igb.rx_process_limit: -1
> hw.igb.num_queues: 0
> hw.igb.header_split: 0
> hw.igb.max_interrupt_rate: 16000
> hw.igb.enable_msix: 1
> hw.igb.enable_aim: 1
> hw.igb.txd: 2048
> hw.igb.rxd: 2048
> 
> 

Oh, I think you missed my note on these.  In order to adjust txd/rxd you
need to tweak the iflib version of these numbers.  nrxds/ntxds should be
adjusted upwards to your value of 2048.  nrxqs/ntxqs should be adjusted
upwards to 8, I think, so you can test equivalent settings to the legacy
driver.

Specifically, you may want to adjust these:

dev.em.0.iflib.override_nrxds: 0
dev.em.0.iflib.override_ntxds: 0

dev.em.0.iflib.override_nrxqs: 0
dev.em.0.iflib.override_ntxqs: 0
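
As a rough sketch only: the snippet below prints name="value" lines for the
four overrides, using the values suggested above (2048 descriptors, 8 queues).
It assumes these OIDs are honored as boot-time tunables when placed in
/boot/loader.conf, which should be checked against the iflib documentation for
your revision. The function name iflib_overrides is made up for this example.

```python
from typing import List


def iflib_overrides(driver: str, unit: int, nrxds: int, ntxds: int,
                    nrxqs: int, ntxqs: int) -> List[str]:
    """Build loader.conf-style lines for the iflib override OIDs of one device."""
    prefix = f"dev.{driver}.{unit}.iflib"
    settings = [
        ("override_nrxds", nrxds),  # rx descriptors per ring
        ("override_ntxds", ntxds),  # tx descriptors per ring
        ("override_nrxqs", nrxqs),  # number of rx queues
        ("override_ntxqs", ntxqs),  # number of tx queues
    ]
    return [f'{prefix}.{name}="{value}"' for name, value in settings]


for line in iflib_overrides("em", 0, nrxds=2048, ntxds=2048, nrxqs=8, ntxqs=8):
    print(line)
```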

sean

> But I ran new benchmarks with the default settings, and the performance drop
> is now about 25%:
> 
> x head r311848 packets-per-second (default settings)
> + head r311849 packets-per-second (default settings)
> [ministat ASCII distribution plot]
>     N           Min           Max        Median           Avg        Stddev
> x   5        618711        621135      619930.5      619840.8     951.83787
> +   5        467389        468740        467778      467864.8     550.40322
> Difference at 95.0% confidence
> -151976 +/- 1133.9
> -24.5186% +/- 0.150581%
> (Student's t, pooled s = 777.476)
> 
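
For anyone wanting to sanity-check the "Difference at 95.0% confidence" lines
quoted above, here is a small sketch of the arithmetic, assuming the usual
pooled two-sample Student's t interval with equal sample sizes. The averages
and standard deviations are the ones in the quoted table; the critical value
2.306 (two-sided 95%, 8 degrees of freedom) is hard-coded from a t table, and
the function name compare is hypothetical.

```python
import math


def compare(n, avg_x, sd_x, avg_y, sd_y, t_crit):
    """Difference of means (y - x) with a pooled Student's t confidence half-width."""
    diff = avg_y - avg_x
    # Pooled standard deviation for two samples of equal size n.
    pooled = math.sqrt(((n - 1) * sd_x ** 2 + (n - 1) * sd_y ** 2) / (2 * n - 2))
    half_width = t_crit * pooled * math.sqrt(2.0 / n)
    pct = 100.0 * diff / avg_x
    return diff, half_width, pct, pooled


T_CRIT_DF8 = 2.306  # two-sided 95% critical value for 8 degrees of freedom

diff, half_width, pct, pooled = compare(
    5, 619840.8, 951.83787, 467864.8, 550.40322, T_CRIT_DF8
)
print(f"{diff:.0f} +/- {half_width:.1f}")
print(f"{pct:.4f}%  (pooled s = {pooled:.3f})")
```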




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-24 Thread Sean Bruno



On 01/23/17 23:31, Olivier Cochard-Labbé wrote:
> On Tue, Jan 24, 2017 at 2:40 AM, Sean Bruno wrote:
> 
> 
> 
> Which set of configs from your test suite are you using for this?
> Specifically, what packet size are you slamming across?
> 
> https://github.com/ocochard/netbenches/tree/master/pktgen.configs
> 
> 
> From a telco's point of view, I'm measuring the «worst» case, which means
> the smallest frame size.
> Here is the exact pkt-gen command line I'm using:
> - 60 byte Ethernet frame size (excluding the 4 CRC bytes)
> - 2000 UDP flows (20 IP sources * 100 IP destinations)
> 
> pkt-gen -U -i igb2 -f tx -n 8000 -l 60 -d 198.19.10.1:2000-198.19.10.20 
> -D 00:0d:b9:41:ca:3d -s 198.18.10.1:2000-198.18.10.100 -w 4
> 
> Option -U is available in a patched netmap version [1]: it fixes the
> checksum calculation when using a source/destination IP range on NICs that
> don't enable HW CHKSUM in netmap mode, and adds IPv6 support.
> 
> [1]
> https://github.com/ocochard/BSDRP/blob/master/BSDRPcur/patches/freebsd.pkt-gen.ae-ipv6.patch
> 
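
The flow count quoted above follows from the two address ranges on the pkt-gen
command line. A quick sketch of that arithmetic, using only the addresses from
the quoted command (the helper name range_size is made up here):

```python
import ipaddress


def range_size(first: str, last: str) -> int:
    """Number of IPv4 addresses in the inclusive range first..last."""
    return int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first)) + 1


dst_count = range_size("198.19.10.1", "198.19.10.20")   # range given to -d
src_count = range_size("198.18.10.1", "198.18.10.100")  # range given to -s
flows = src_count * dst_count
print(dst_count, src_count, flows)
```

In the command as quoted, the 20-address range is passed to -d and the
100-address range to -s; the product is the same either way.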


Did you increase the number of rx/tx rings to 8 and the number of
descriptors to 4k in your tests or just the defaults?

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-23 Thread Sean Bruno



On 01/23/17 08:39, Olivier Cochard-Labbé wrote:
> 
> On Thu, Jan 12, 2017 at 1:54 AM, Matthew Macy wrote:
> 
>  >  A flame graph for the core cycle count and a flame graph with
> cache miss stats from pmc would be a great start.
>  >
>  >
>  > I didn't know the exact event name to use for cache miss stats,
> but here are the flame graphs for CPU_CLK_UNHALTED_CORE:
>  > http://dev.bsdrp.net/netgate.r311848.CPU_CLK_UNHALTED_CORE.svg
> 
>  > http://dev.bsdrp.net/netgate.r311849.CPU_CLK_UNHALTED_CORE.svg
> 
> 
> Thanks. Having twice as many txqs would definitely help. It's also
> clear that there may be some sort of performance issue in
> iflib_txq_drain. Although it could just be non-stop cache misses on
> the packet headers.
> 
> 
> Any news about the performance issue in iflib_txq_drain ?
> 
> On different hardware (PC Engine APU2), I get a 20% performance drop:
> 
> x head r311848: packets per second
> + head r311849: packets per second
> [ministat ASCII distribution plot]
>     N           Min           Max        Median           Avg        Stddev
> x   5        580021        588650        585676      585406.1     3550.8673
> +   5        463865        467599        465428      465638.6     1437.9347
> Difference at 95.0% confidence
> -119768 +/- 3950.78
> -20.4589% +/- 0.558328%
> (Student's t, pooled s = 2708.9)
> 
>  
> Because it's an AMD processor, I couldn't find the pmc equivalent of
> CPU_CLK_UNHALTED_CORE, so I used BU_CPU_CLK_UNHALTED, but I have no
> idea if it's the right one.
> 
> http://dev.bsdrp.net/apu2.r311848.BU_CPU_CLK_UNHALTED.svg
> http://dev.bsdrp.net/apu2.r311849.BU_CPU_CLK_UNHALTED.svg
> 
> Thanks
> 
> 


Olivier:

Which set of configs from your test suite are you using for this?
Specifically, what packet size are you slamming across?

https://github.com/ocochard/netbenches/tree/master/pktgen.configs

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-18 Thread Sean Bruno



On 01/18/17 08:20, O. Hartmann wrote:
> On Wed, 18 Jan 2017 07:59:17 -0700
> Sean Bruno  wrote:
> 
>> On 01/18/17 07:41, Sean Bruno wrote:
>>>
>>>
>>> On 01/18/17 00:34, O. Hartmann wrote:  
>>>> On Thu, 5 Jan 2017 20:17:56 -0700
>>>> Sean Bruno  wrote:  
>>>>>  
>>>> On a Fujitsu Celsius M740, the "em0" device gets stuck on heavy I/O. I can
>>>> still trigger this behaviour on recent CURRENT (12.0-CURRENT #17 r312369:
>>>> Wed Jan 18 06:18:45 CET 2017 amd64) by rsync'ing a large poudriere ports
>>>> repository onto a remote NFSv4 fileserver. The freeze always occur on large
>>>> tarballs.
>>>>
>>>> Again, here is the pciconf output of the device: 
>>>>
>>>> em0@pci0:0:25:0: class=0x02 card=0x11ed1734 chip=0x153a8086
>>>> rev=0x05 hdr=0x00
>>>> vendor = 'Intel Corporation'
>>>> device = 'Ethernet Connection I217-LM'
>>>> class  = network
>>>> subclass   = ethernet
>>>> bar   [10] = type Memory, range 32, base 0xfb30, size 131072, enabled
>>>> bar   [14] = type Memory, range 32, base 0xfb339000, size 4096, enabled
>>>> bar   [18] = type I/O Port, range 32, base 0xf020, size 32, enabled
>>>>
>>>> On another box, equipped with a dual-port Intel i350 NIC, the igb0 and
>>>> igb1 do have negotiation problems with several types of switches (in my
>>>> SoHo environment, I use a Netgear GS110TP, at work there are several types
>>>> of Cisco Catalyst 3XXX types). The igbX very often fall back to 100MBit/s.
>>>>
>>>> Since yesterday, the igbX on that specific i350-based NIC (we have plenty
>>>> of them and they show similar phenomena with FreeBSD), although the switch
>>>> reports an uplink with 1 GBit, FreeBSD CURRENT shows this weird crap
>>>> message: 
>>>>> igb0: flags=8843 metric 0 mtu
>>>>> 1500
>>>>> options=653dbb
>>>>> ether xx:xx:xx:xx:xx:xx inet 192.168.0.111 netmask 0xff00 broadcast
>>>>> 192.168.0.255 nd6 options=29
>>>>>media: Ethernet autoselect (100baseTX )
>>>>>status: active  
>>>>  
>>
>> I just checked my test machines (which are auto/auto on the Juniper
>> EX4200 switches in use) and I see them come up with 1000baseTX.  Do you
>> set any options in /etc/rc.conf?
>>
>> sean
>>
> 
> No, I don't.
> 
> The line is:
> ifconfig_igb0="inet 192.168.0.10 netmask 0xff00"
> 
> Nothing else.
> 

Ok, good.  Definitely a regression.

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-18 Thread Sean Bruno



On 01/18/17 07:41, Sean Bruno wrote:
> 
> 
> On 01/18/17 00:34, O. Hartmann wrote:
>> On Thu, 5 Jan 2017 20:17:56 -0700
>> Sean Bruno  wrote:
>>>
>> On a Fujitsu Celsius M740, the "em0" device gets stuck on heavy I/O. I can
>> still trigger this behaviour on recent CURRENT (12.0-CURRENT #17 r312369: Wed
>> Jan 18 06:18:45 CET 2017 amd64) by rsync'ing a large poudriere ports
>> repository onto a remote NFSv4 fileserver. The freeze always occur on large
>> tarballs.
>>
>> Again, here is the pciconf output of the device: 
>>
>> em0@pci0:0:25:0: class=0x02 card=0x11ed1734 chip=0x153a8086
>> rev=0x05 hdr=0x00
>> vendor = 'Intel Corporation'
>> device = 'Ethernet Connection I217-LM'
>> class  = network
>> subclass   = ethernet
>> bar   [10] = type Memory, range 32, base 0xfb30, size 131072, enabled
>> bar   [14] = type Memory, range 32, base 0xfb339000, size 4096, enabled
>> bar   [18] = type I/O Port, range 32, base 0xf020, size 32, enabled
>>
>> On another box, equipped with a dual-port Intel i350 NIC, the igb0 and igb1
>> do have negotiation problems with several types of switches (in my SoHo
>> environment, I use a Netgear GS110TP, at work there are several types of
>> Cisco Catalyst 3XXX types). The igbX very often fall back to 100MBit/s.
>>
>> Since yesterday, the igbX on that specific i350-based NIC (we have plenty of
>> them and they show similar phenomena with FreeBSD), although the switch
>> reports an uplink with 1 GBit, FreeBSD CURRENT shows this weird crap message:
>>
>>> igb0: flags=8843 metric 0 mtu
>>> 1500
>>> options=653dbb
>>> ether xx:xx:xx:xx:xx:xx inet 192.168.0.111 netmask 0xff00 broadcast
>>> 192.168.0.255 nd6 options=29
>>>media: Ethernet autoselect (100baseTX )
>>>status: active
>>

I just checked my test machines (which are auto/auto on the Juniper
EX4200 switches in use) and I see them come up with 1000baseTX.  Do you
set any options in /etc/rc.conf?

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-18 Thread Sean Bruno



On 01/18/17 00:34, O. Hartmann wrote:
> On Thu, 5 Jan 2017 20:17:56 -0700
> Sean Bruno  wrote:
> 
>> tl;dr --> igbX devices will become emX devices
>>
>> We're about to commit an update to sys/dev/e1000 that will implement and
>> activate IFLIB for em(4), lem(4) & igb(4) and would appreciate all folks
>> who can test and poke at the drivers to do so this week.  This will have
>> some really great changes for performance and standardization that have
>> been bouncing around inside of various FreeBSD shops that have been
>> collaborating with Matt Macy over the last year.
>>
>> This will implement multiple queues for certain em(4) devices that are
>> capable of such things and add some new sysctl's for you to poke at in
>> your monitoring tools.
>>
>> Due to limitations of device registration, igbX devices will become emX
>> devices.  So, you'll need to make a minor update to your rc.conf and
>> scripts that manipulate the network devices.
>>
>> UPDATING will be bumped to reflect these changes.
>>
>> MFC to stable/11 will have a legacy implementation that doesn't use
>> IFLIB for compatibility reasons.
>>
>> A documentation and man page update will follow in the next few days
>> explaining how to work with the changed driver.
>>
>> sean
>>
>> bcc net@ current@ re@
>>
>>
>>
> On a Fujitsu Celsius M740, the "em0" device gets stuck on heavy I/O. I can
> still trigger this behaviour on recent CURRENT (12.0-CURRENT #17 r312369: Wed
> Jan 18 06:18:45 CET 2017 amd64) by rsync'ing a large poudriere ports
> repository onto a remote NFSv4 fileserver. The freeze always occur on large
> tarballs.
> 
> Again, here is the pciconf output of the device: 
> 
> em0@pci0:0:25:0: class=0x02 card=0x11ed1734 chip=0x153a8086
> rev=0x05 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Connection I217-LM'
> class  = network
> subclass   = ethernet
> bar   [10] = type Memory, range 32, base 0xfb30, size 131072, enabled
> bar   [14] = type Memory, range 32, base 0xfb339000, size 4096, enabled
> bar   [18] = type I/O Port, range 32, base 0xf020, size 32, enabled
> 
> On another box, equipped with a dual-port Intel i350 NIC, the igb0 and igb1 do
> have negotiation problems with several types of switches (in my SoHo
> environment, I use a Netgear GS110TP, at work there are several types of Cisco
> Catalyst 3XXX types). The igbX very often fall back to 100MBit/s.
> 
> Since yesterday, the igbX on that specific i350-based NIC (we have plenty of
> them and they show similar phenomena with FreeBSD), although the switch
> reports an uplink with 1 GBit, FreeBSD CURRENT shows this weird crap message:
> 
>> igb0: flags=8843 metric 0 mtu
>> 1500
>> options=653dbb
>> ether xx:xx:xx:xx:xx:xx inet 192.168.0.111 netmask 0xff00 broadcast
>> 192.168.0.255 nd6 options=29
>>media: Ethernet autoselect (100baseTX )
>>status: active
> 
> I haven't checked whether FreeBSD lies or the switch lies about the linkspeed,
> but will do next time I have access to the box.
> 
> 
> regards,
> Oliver
> 


Ugh.  Ok.  Investigating the link issue, that's gross.

sean




Re: crash in iflib_fast_intr

2017-01-18 Thread Sean Bruno



On 01/18/17 03:37, peter.b...@bsd4all.org wrote:
> Hi,
> 
> A kernel without option EARLY_AP_STARTUP crashes in iflib_fast_intr. Since
> GENERIC now has EARLY_AP_STARTUP, this probably went unnoticed. The problem is
> reproducible.
> 
> KDB: stack backtrace:
> #0 0x805cec97 at kdb_backtrace+0x67
> #1 0x80584816 at vpanic+0x186
> #2 0x80584683 at panic+0x43
> #3 0x8090f222 at trap_fatal+0x322
> #4 0x8090f3ec at trap_pfault+0x1bc
> #5 0x8090eaa0 at trap+0x280
> #6 0x808f35e1 at calltrap+0x8
> #7 0x806a202d at iflib_fast_intr+0x3d
> #8 0x8054963b at intr_event_handle+0x9b
> #9 0x80965f38 at intr_execute_handlers+0x48
> #10 0x8096b1cf at lapic_handle_intr+0x3f
> #11 0x808f3cc7 at Xapic_isr1+0xb7
> #12 0x805b994a at sched_idletd+0x37a
> #13 0x805460f5 at fork_exit+0x85
> #14 0x808f3b1e at fork_trampoline+0xe
> 
> Peter

Thanks for the report.  We're looking at this.

Is this with an igb(4) interface or em(4)?

sean




Re: Panic on boot current amd64

2017-01-17 Thread Sean Bruno



On 01/17/17 02:10, O. Hartmann wrote:
> Am Mon, 16 Jan 2017 10:33:35 -0800
> Manfred Antar  schrieb:
> 
>> From current today after changes to /sys/sys/gtaskqueue.h (r312293) I get 
>> panic on boot.
>> reverting to r312235 boot ok
>>
>> random: harvesting attach, 8 bytes (4 bits) from uhub9
>> ugen1.3:  at usbus1
>> kernel trap 12 with interrupts disabled
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 2; apic id = 02
>> fault virtual address= 0x64
>> fault code   = supervisor read data, page not present
>> instruction pointer  = 0x20:0x80660449
>> stack pointer= 0x28:0xfe0466aa9010
>> frame pointer= 0x28:0xfe0466aa9030
>> code segment = base 0x0, limit 0xf, type 0x1b
>>  = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = resume, IOPL = 0
>> current process  = 60445 (ifconfig)
>> [ thread pid 60445 tid 100131 ]
>> Stopped at  grouptaskqueue_enqueue+0x19:cmpl$0,0x64(%rbx)
>> db> bt  
>> Tracing pid 60445 tid 100131 td 0xf800088df500
>> grouptaskqueue_enqueue() at grouptaskqueue_enqueue+0x19/frame 
>> 0xfe0466aa9030
>> em_intr() at em_intr+0x8c/frame 0xfe0466aa9060
>> iflib_fast_intr() at iflib_fast_intr+0x2c/frame 0xfe0466aa9080
>> intr_event_handle() at intr_event_handle+0x9b/frame 0xfe0466aa90d0
>> intr_execute_handlers() at intr_execute_handlers+0x48/frame 
>> 0xfe0466aa9100
>> lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfe0466aa9120
>> Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0466aa9120
>> --- interrupt, rip = 0x809639ad, rsp = 0xfe0466aa91f0, rbp =
>> 0xfe0466aa9200 --- spinlock_exit() at spinlock_exit+0x2d/frame 
>> 0xfe0466aa9200
>> smp_rendezvous_cpus() at smp_rendezvous_cpus+0x272/frame 0xfe0466aa9270
>> smp_rendezvous() at smp_rendezvous+0x40/frame 0xfe0466aa92a0
>> counter_u64_alloc() at counter_u64_alloc+0x3e/frame 0xfe0466aa92c0
>> rtentry_zinit() at rtentry_zinit+0x11/frame 0xfe0466aa92e0
>> keg_alloc_slab() at keg_alloc_slab+0x1e3/frame 0xfe0466aa9350
>> keg_fetch_slab() at keg_fetch_slab+0x16e/frame 0xfe0466aa93a0
>> zone_fetch_slab() at zone_fetch_slab+0x9e/frame 0xfe0466aa93e0
>> zone_import() at zone_import+0x52/frame 0xfe0466aa9430
>> uma_zalloc_arg() at uma_zalloc_arg+0x450/frame 0xfe0466aa94a0
>> rtrequest1_fib() at rtrequest1_fib+0xfc/frame 0xfe0466aa95c0
>> rtinit() at rtinit+0x390/frame 0xfe0466aa9740
>> in_addprefix() at in_addprefix+0xef/frame 0xfe0466aa97b0
>> in_control() at in_control+0x9dc/frame 0xfe0466aa9850
>> ifioctl() at ifioctl+0xdcc/frame 0xfe0466aa98d0
>> kern_ioctl() at kern_ioctl+0x274/frame 0xfe0466aa9950
>> sys_ioctl() at sys_ioctl+0x13c/frame 0xfe0466aa9a20
>> amd64_syscall() at amd64_syscall+0x488/frame 0xfe0466aa9bb0
>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0466aa9bb0
>> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x4b194a, rsp = 
>> 0x7fffe478, rbp =
>> 0x7fffe4d0 ---
>> db>   
>>
>>
>>
>>
>> ___
>> freebsd-current@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
> 
> 
> Has this been fixed? It bugs me, too.
> I went back to r312235, too, which doesn't coredump.
> 
> Regards,
> 
> oh
> 

For the time being, I'm suggesting people add EARLY_AP_STARTUP to their
kernel configs, as it is in GENERIC.
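
If you want to check a custom config quickly, something like the sketch below
would do. It is a simplification: it assumes plain "options NAME" and
"nooptions NAME" lines with "#" comments, and it does not follow "include"
directives, so a config that only includes GENERIC will report False. The
sample config and the function name has_option are made up for illustration.

```python
def has_option(config_text: str, name: str) -> bool:
    """Report whether `options NAME` is set (and not later unset) in a kernel config."""
    enabled = False
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        if len(fields) < 2:
            continue
        keyword = fields[0]
        option = fields[1].split("=", 1)[0]  # tolerate "options FOO=1"
        if option != name:
            continue
        if keyword == "options":
            enabled = True
        elif keyword == "nooptions":
            enabled = False
    return enabled


# Hypothetical custom kernel config.
sample = """
ident   MYKERNEL
options SCHED_ULE
options EARLY_AP_STARTUP   # same as GENERIC
"""
print(has_option(sample, "EARLY_AP_STARTUP"))
```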

We are still debugging this.

sean




Re: igb is broken, even across reboots, at r312294

2017-01-16 Thread Sean Bruno



On 01/16/17 13:52, Alan Somers wrote:
> Today I updated my machine from 311787 to 312294.  After the update,
> my igb ports can pass no traffic.  If I reboot into kernel.old, they
> still can't pass any traffic.  They won't even work in the PXE ROM.  I
> have to power off, pull the power cables, then boot into kernel.old
> before they'll work.  This behavior is repeatable.
> 
> $ pciconf -lv
> ...
> igb0@pci0:1:0:0:class=0x02 card=0x34dc8086 chip=0x10a78086 
> rev=0x02
> hdr=0x00
> vendor = 'Intel Corporation'
> device = '82575EB Gigabit Network Connection'
> class  = network
> subclass   = ethernet
> igb1@pci0:1:0:1:class=0x02 card=0x34dc8086 chip=0x10a78086
> rev=0x02 hdr=0x00
> vendor = 'Intel Corporation'
> device = '82575EB Gigabit Network Connection'
> class  = network
> subclass   = ethernet
> ...
> 
> $ dmesg # on 311787, it's identical whether or not the igb ports are working
> ...
> igb0:  port
> 0x2020-0x203f mem 0xb1b2-0xb1b3,0xb1b44000-0xb1b47fff irq 40
> at device 0.0 on pci1
> igb0: Using MSIX interrupts with 5 vectors
> igb0: Ethernet address: 00:1e:67:25:71:bc
> igb0: Bound queue 0 to cpu 0
> igb0: Bound queue 1 to cpu 1
> igb0: Bound queue 2 to cpu 2
> igb0: Bound queue 3 to cpu 3
> igb0: netmap queues/slots: TX 4/1024, RX 4/1024
> igb1:  port
> 0x2000-0x201f mem 0xb1b0-0xb1b1,0xb1b4-0xb1b43fff irq 28
> at device 0.1 on pci1
> igb1: Using MSIX interrupts with 5 vectors
> igb1: Ethernet address: 00:1e:67:25:71:bd
> igb1: Bound queue 0 to cpu 4
> igb1: Bound queue 1 to cpu 5
> igb1: Bound queue 2 to cpu 6
> igb1: Bound queue 3 to cpu 7
> igb1: netmap queues/slots: TX 4/1024, RX 4/1024
> ...
> 
> $ dmesg # on 312294, when the igb ports are not working
> ...
> igb0:  port
> 0x2020-0x203f mem 0xb1b2-0xb1b3,0xb1b44000-0xb1b47fff irq 40
> at device 0.0 on pci1
> igb0: attach_pre capping queues at 4
> igb0: using 1024 tx descriptors and 1024 rx descriptors
> igb0: msix_init qsets capped at 4
> igb0: pxm cpus: 8 queue msgs: 9 admincnt: 1
> igb0: using 4 rx queues 4 tx queues
> igb0: Using MSIX interrupts with 5 vectors
> igb0: allocated for 4 tx_queues
> igb0: allocated for 4 rx_queues
> igb0: Ethernet address: 00:1e:67:25:71:bc
> igb0: netmap queues/slots: TX 4/1024, RX 4/1024
> igb1:  port
> 0x2000-0x201f mem 0xb1b0-0xb1b1,0xb1b4-0xb1b43fff irq 28
> at device 0.1 on pci1
> igb1: attach_pre capping queues at 4
> igb1: using 1024 tx descriptors and 1024 rx descriptors
> igb1: msix_init qsets capped at 4
> igb1: pxm cpus: 8 queue msgs: 9 admincnt: 1
> igb1: using 4 rx queues 4 tx queues
> igb1: Using MSIX interrupts with 5 vectors
> igb1: allocated for 4 tx_queues
> igb1: allocated for 4 rx_queues
> igb1: Ethernet address: 00:1e:67:25:71:bd
> igb1: netmap queues/slots: TX 4/1024, RX 4/1024
> ...
> 
> Any ideas?
> 
> -Alan
> 
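
For anyone matching this report against other igb reports, the chip= field
in the pciconf output above packs the PCI device ID in the upper 16 bits and
the vendor ID in the lower 16 bits.  A minimal sketch of splitting it apart
(the helper name decode_chip is made up for illustration):

```sh
#!/bin/sh
# Split a pciconf "chip=" value into PCI vendor and device IDs.
# Low 16 bits: vendor ID.  High 16 bits: device ID.
# decode_chip is a hypothetical helper, not part of pciconf(8).
decode_chip() {
    chip=$1
    printf 'vendor=0x%04x device=0x%04x\n' \
        $(($chip & 0xffff)) $(($chip >> 16))
}

decode_chip 0x10a78086   # prints: vendor=0x8086 device=0x10a7
```

Here 0x8086 is Intel and 0x10a7 is the 82575EB part named in the device line.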

Yeah, I'm fighting with EARLY_AP_STARTUP with regard to initialization of
the interfaces.  em(4) seems to be ok with my change today, but that
change makes igb(4) *very* angry.

I'm aware and trying to find a happy medium.

sean



signature.asc
Description: OpenPGP digital signature

Re: Panic on boot current amd64

2017-01-16 Thread Sean Bruno



On 01/16/17 11:33, Manfred Antar wrote:
> From current today, after the changes to /sys/sys/gtaskqueue.h (r312293), I get
> a panic on boot.
> Reverting to r312235 boots ok.
> 
> random: harvesting attach, 8 bytes (4 bits) from uhub9
> ugen1.3:  at usbus1
> kernel trap 12 with interrupts disabled
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 02
> fault virtual address = 0x64
> fault code= supervisor read data, page not present
> instruction pointer   = 0x20:0x80660449
> stack pointer = 0x28:0xfe0466aa9010
> frame pointer = 0x28:0xfe0466aa9030
> code segment  = base 0x0, limit 0xf, type 0x1b
>   = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags  = resume, IOPL = 0
> current process   = 60445 (ifconfig)
> [ thread pid 60445 tid 100131 ]
> Stopped at      grouptaskqueue_enqueue+0x19:    cmpl    $0,0x64(%rbx)
> db> bt
> Tracing pid 60445 tid 100131 td 0xf800088df500
> grouptaskqueue_enqueue() at grouptaskqueue_enqueue+0x19/frame 
> 0xfe0466aa9030
> em_intr() at em_intr+0x8c/frame 0xfe0466aa9060
> iflib_fast_intr() at iflib_fast_intr+0x2c/frame 0xfe0466aa9080
> intr_event_handle() at intr_event_handle+0x9b/frame 0xfe0466aa90d0
> intr_execute_handlers() at intr_execute_handlers+0x48/frame 0xfe0466aa9100
> lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfe0466aa9120
> Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0466aa9120
> --- interrupt, rip = 0x809639ad, rsp = 0xfe0466aa91f0, rbp = 
> 0xfe0466aa9200 ---
> spinlock_exit() at spinlock_exit+0x2d/frame 0xfe0466aa9200
> smp_rendezvous_cpus() at smp_rendezvous_cpus+0x272/frame 0xfe0466aa9270
> smp_rendezvous() at smp_rendezvous+0x40/frame 0xfe0466aa92a0
> counter_u64_alloc() at counter_u64_alloc+0x3e/frame 0xfe0466aa92c0
> rtentry_zinit() at rtentry_zinit+0x11/frame 0xfe0466aa92e0
> keg_alloc_slab() at keg_alloc_slab+0x1e3/frame 0xfe0466aa9350
> keg_fetch_slab() at keg_fetch_slab+0x16e/frame 0xfe0466aa93a0
> zone_fetch_slab() at zone_fetch_slab+0x9e/frame 0xfe0466aa93e0
> zone_import() at zone_import+0x52/frame 0xfe0466aa9430
> uma_zalloc_arg() at uma_zalloc_arg+0x450/frame 0xfe0466aa94a0
> rtrequest1_fib() at rtrequest1_fib+0xfc/frame 0xfe0466aa95c0
> rtinit() at rtinit+0x390/frame 0xfe0466aa9740
> in_addprefix() at in_addprefix+0xef/frame 0xfe0466aa97b0
> in_control() at in_control+0x9dc/frame 0xfe0466aa9850
> ifioctl() at ifioctl+0xdcc/frame 0xfe0466aa98d0
> kern_ioctl() at kern_ioctl+0x274/frame 0xfe0466aa9950
> sys_ioctl() at sys_ioctl+0x13c/frame 0xfe0466aa9a20
> amd64_syscall() at amd64_syscall+0x488/frame 0xfe0466aa9bb0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0466aa9bb0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x4b194a, rsp = 
> 0x7fffe478, rbp = 0x7fffe4d0 ---
> db> 
> 
> 
> 
> 
> 


Just to make sure, you're running GENERIC?

sean




Re: CURRENT: em0 NIC freezes under heavy I/O on net

2017-01-12 Thread Sean Bruno



On 01/11/17 01:27, O. Hartmann wrote:
> Running recent CURRENT (FreeBSD 12.0-CURRENT #5 r311919: Wed Jan 11 08:24:28
> CET 2017 amd64), the system freezes when doing an rsync over an automounted
> (autofs) NFSv4 filesystem, mounted from another CURRENT server (same revision,
> but with BCM NICs).
> 
> The host in question is a Fujitsu Celsius M740 equipped with an Intel NIC:
> 
> [...]
> em0:  port 0xf020-0xf03f mem
> 0xfb30-0xfb31,0xfb339000-0xfb339fff at device 25.0 numa-domain 0 on pci1
> em0: attach_pre capping queues at 1
> em0: using 1024 tx descriptors and 1024 rx descriptors
> em0: msix_init qsets capped at 1
> em0: Unable to map MSIX table 
> em0: Using an MSI interrupt
> em0: allocated for 1 tx_queues
> em0: allocated for 1 rx_queues
> em0: netmap queues/slots: TX 1/1024, RX 1/1024
> [...]
> 
> The pciconf output reveals:
> 
> em0@pci0:0:25:0: class=0x02 card=0x11ed1734 chip=0x153a8086 rev=0x05 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = 'Ethernet Connection I217-LM'
>     class      = network
>     subclass   = ethernet
>     bar   [10] = type Memory, range 32, base 0xfb30, size 131072, enabled
>     bar   [14] = type Memory, range 32, base 0xfb339000, size 4096, enabled
>     bar   [18] = type I/O Port, range 32, base 0xf020, size 32, enabled
>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>     cap 13[e0] = PCI Advanced Features: FLR TP
> 
> I have a customized kernel. The NIC has always shown up as an
> "emX" device (never as igbX). The kernel contains device netmap (if
> relevant).
> 
> The phenomenon:
> 
> To sync a poudriere repository between two remote hosts, I use rsync on an
> NFSv4 exported filesystem, mounted via AUTOFS. This worked perfectly two days
> ago. Since yesterday, syncing brings down the network connection: the
> connection is simply dead. Terminating the rsync and bringing em0 down and up
> again doesn't help much; the connection is established for short moments, but
> dies within seconds. Restarting all network services via "service netif
> restart" has the same effect. After the disaster, it is impossible for me to
> bring the NIC/connection back to normal; I have to reboot. The same happens
> under other heavy network load, but it takes a while, and even rsync isn't
> "deadly" within a consistent timeframe: sometimes it takes a couple of seconds,
> other times only one or two seconds, to make the connection die.
> 
> I checked by dd'ing a large file over that connection; it then takes several
> seconds for the connection to freeze (so someone could reproduce it
> without necessarily using rsync).
> 
> Kind regards,
> 
> oh

If you have the time today or tomorrow, can you please capture 'sysctl
dev.em.0' and post it here?

In addition, I would like to have this patch tested in your configuration:

https://people.freebsd.org/~sbruno/em_tx_limit.diff

Finally, if you have any loader.conf entries for hw.em, please post them
as well.

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-11 Thread Sean Bruno



On 01/11/17 15:44, Sean Bruno wrote:
> 
>> My tuning is (same for both tests):
>> hw.igb.rxd="2048" (it should be useless now)
>> hw.igb.txd="2048" (it should be useless now)
>> hw.em.rxd="2048"
>> hw.em.txd="2048"
>> hw.igb.rx_process_limit="-1" (It should be useless now too)
>> hw.em.rx_process_limit="-1"
>>
>> dev.igb.2.fc=0
>> dev.igb.3.fc=0
>>
>> I can generate profiling data for you: what kind of data do you want ?
> 
> 
> Specifically, you may want to adjust these:
> 
> dev.em.0.iflib.override_nrxds: 0
> dev.em.0.iflib.override_ntxds: 0
> 
> dev.em.0.iflib.override_nrxqs: 0
> dev.em.0.iflib.override_ntxqs: 0
> 
> sean
> 

dev.igb.0  but you get the point.

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-11 Thread Sean Bruno


> My tuning is (same for both tests):
> hw.igb.rxd="2048" (it should be useless now)
> hw.igb.txd="2048" (it should be useless now)
> hw.em.rxd="2048"
> hw.em.txd="2048"
> hw.igb.rx_process_limit="-1" (It should be useless now too)
> hw.em.rx_process_limit="-1"
> 
> dev.igb.2.fc=0
> dev.igb.3.fc=0
> 
> I can generate profiling data for you: what kind of data do you want ?


Specifically, you may want to adjust these:

dev.em.0.iflib.override_nrxds: 0
dev.em.0.iflib.override_ntxds: 0

dev.em.0.iflib.override_nrxqs: 0
dev.em.0.iflib.override_ntxqs: 0
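
To make the mapping from the old hw.igb.rxd/hw.igb.txd settings concrete,
here is a small sketch that prints per-device override lines for the knobs
listed above.  It assumes these overrides are honored as boot-time tunables
from /boot/loader.conf, which should be checked against the driver you are
running; the driver name, unit number and the value 2048 are placeholders,
and iflib_overrides is a made-up helper name.

```sh
#!/bin/sh
# Print loader.conf-style lines for the iflib descriptor overrides.
# Assumption: the knobs are read as tunables at boot.  The arguments
# (driver, unit, rx descriptors, tx descriptors) are illustrative only.
iflib_overrides() {
    drv=$1; unit=$2; nrxd=$3; ntxd=$4
    printf 'dev.%s.%s.iflib.override_nrxds="%s"\n' "$drv" "$unit" "$nrxd"
    printf 'dev.%s.%s.iflib.override_ntxds="%s"\n' "$drv" "$unit" "$ntxd"
}

iflib_overrides igb 2 2048 2048
```

The output can be reviewed and then appended to /boot/loader.conf by hand.
A value of 0, as in the sysctl listing above, means "use the driver default".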

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-11 Thread Sean Bruno



On 01/11/17 12:47, Olivier Cochard-Labbé wrote:
> On Wed, Jan 11, 2017 at 4:17 PM, Sean Bruno <sbr...@freebsd.org> wrote:
> 
> 
> 
> Olivier:
> 
> Give this a quick try.  This isn't the correct way to do this, but I
> want to see if I'm on the right path:
> 
> 
> Thanks, it fixes the problem; I've got the 4 queues back:
> 
> igb2:  port 0x3000-0x301f
> mem 0xdfea-0xdfeb,0xdff24000-0xdff27fff irq 18 at device 20.0 on
> pci0
> igb2: attach_pre capping queues at 8
> igb2: using 1024 tx descriptors and 1024 rx descriptors
> igb2: msix_init qsets capped at 8
> igb2: pxm cpus: 4 queue msgs: 9 admincnt: 1
> igb2: using 4 rx queues 4 tx queues
> igb2: Using MSIX interrupts with 5 vectors
> igb2: allocated for 4 tx_queues
> igb2: allocated for 4 rx_queues
> igb2: Ethernet address: 00:08:a2:09:33:da
> igb2: netmap queues/slots: TX 4/1024, RX 4/1024
> 
> In forwarding mode, I measure about a 10% performance drop with the new
> driver on this hardware:
> 
> x head r311848: packets per second
> + head r311849 and BAR patch: packets per second
> +--+
> |++++ +   xxx x   x|
> ||__M__A|  |
> | |___AM__||
> +--+
>     N           Min           Max        Median           Avg        Stddev
> x   5        924170        943071        927509      931612.1     8096.8269
> +   5        831452      845929.5        840940      838730.5     6413.5602
> Difference at 95.0% confidence
> -92881.6 +/- 10652.2
> -9.96999% +/- 1.07481%
> (Student's t, pooled s = 7303.85)
> 
> Regards,
> 
> Olivier
> 
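
The relative difference in the ministat output quoted above can be
re-derived from the two averages.  A minimal sketch using awk (pct_diff is a
made-up helper name; the two numbers are the Avg values from the quoted
results):

```sh
#!/bin/sh
# Relative change of a new average against a baseline, in percent.
# pct_diff is a hypothetical helper, not part of ministat(1).
pct_diff() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (new - base) / base * 100 }'
}

# r311848 average vs. r311849 + BAR patch average
pct_diff 931612.1 838730.5   # prints: -9.97
```

This only reproduces the point estimate; the +/- interval in the ministat
output comes from its Student's t calculation and is not recomputed here.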


Hmmm ... did your old tests do 4 or 8 queues on this hardware?

Did the old tests run 1024 tx/rx slots or the max 4096?

sean




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-11 Thread Sean Bruno



On 01/11/17 05:54, Matthew Macy wrote:
> 
> 
> 
>   On Wed, 11 Jan 2017 01:23:46 -0800 Olivier Cochard-Labbé 
>  wrote  
>  > On Tue, Jan 10, 2017 at 4:31 AM, Sean Bruno  wrote:
>  > 
>  > >
>  > > I've updated sys/dev/e1000 at svn R311849 to match Matt Macy's work on
>  > > IFLIB in the kernel.
>  > >
>  > > At this point, the driver deviates from Intel's code dramatically and
>  > > you now get to yell directly into the freebsd-net@ megaphone for things
>  > > that I may have broken.
>  > >
>  > >
>  > >
>  > I've got a problem with the new driver regarding the number of queues used on
>  > a Netgate RCC-VE 4860 (Intel i354 NIC).
>  > Only one queue is used instead of the previous 4 (on a 4-core processor), and
>  > performance drops dramatically.
>  > 

>  > igb2:  port 0x3000-0x301f mem
>  > 0xdfea-0xdfeb,0xdff24000-0xdff27fff irq 18 at device 20.0 on pci0
>  > igb2: attach_pre capping queues at 8
>  > igb2: using 1024 tx descriptors and 1024 rx descriptors
>  > igb2: msix_init qsets capped at 8
>  > igb2: Unable to map MSIX table
> 
> It has the wrong MSI-X BAR for your device. I'll look into it.
> 


Olivier:

Give this a quick try.  This isn't the correct way to do this, but I
want to see if I'm on the right path:
Index: sys/net/iflib.c
===
--- sys/net/iflib.c (revision 311875)
+++ sys/net/iflib.c (working copy)
@@ -4721,7 +4721,7 @@
if_softc_ctx_t scctx = &ctx->ifc_softc_ctx;
int vectors, queues, rx_queues, tx_queues, queuemsgs, msgs;
int iflib_num_tx_queues, iflib_num_rx_queues;
-   int err, admincnt, bar;
+   int err, admincnt, bar, use_different_bar;

iflib_num_tx_queues = scctx->isc_ntxqsets;
iflib_num_rx_queues = scctx->isc_nrxqsets;
@@ -4729,6 +4729,16 @@
device_printf(dev, "msix_init qsets capped at %d\n", 
iflib_num_tx_queues);

bar = ctx->ifc_softc_ctx.isc_msix_bar;
+
+/*
+** Some new devices, as with ixgbe, now may
+** use a different BAR, so we need to keep
+** track of which is used.
+*/
+   use_different_bar = pci_read_config(dev, bar, 4);
+   if (use_different_bar == 0)
+   bar += 4;
+
admincnt = sctx->isc_admin_intrcnt;
/* Override by tuneable */
if (enable_msix == 0)






Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-09 Thread Sean Bruno

tl;dr --> you get to keep your igbX devices (thanks jhb), no POLA
violations this week.

I've updated sys/dev/e1000 at svn R311849 to match Matt Macy's work on
IFLIB in the kernel.

At this point, the driver deviates from Intel's code dramatically and
you now get to yell directly into the freebsd-net@ megaphone for things
that I may have broken.

Man page updates are coming up next.  Please let us know if this
revision has made things better, worse, or none of the above on whatever
Intel Gigabit NIC you happen to have lying around.

sean

On 01/05/17 20:17, Sean Bruno wrote:
> tl;dr --> igbX devices will become emX devices
> 
> We're about to commit an update to sys/dev/e1000 that will implement and
> activate IFLIB for em(4), lem(4) & igb(4) and would appreciate all folks
> who can test and poke at the drivers to do so this week.  This will have
> some really great changes for performance and standardization that have
> been bouncing around inside of various FreeBSD shops that have been
> collaborating with Matt Macy over the last year.
> 
> This will implement multiple queues for certain em(4) devices that are
> capable of such things and add some new sysctl's for you to poke at in
> your monitoring tools.
> 
> Due to limitations of device registration, igbX devices will become emX
> devices.  So, you'll need to make a minor update to your rc.conf and
> scripts that manipulate the network devices.
> 
> UPDATING will be bumped to reflect these changes.
> 
> MFC to stable/11 will have a legacy implementation that doesn't use
> IFLIB for compatibility reasons.
> 
> A documentation and man page update will follow in the next few days
> explaining how to work with the changed driver.
> 
> sean
> 
> bcc net@ current@ re@
> 
> 
> 




Re: HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-06 Thread Sean Bruno



On 01/06/17 03:48, Steven Hartland wrote:
> Hmm, I'm not sure about everyone else, but we treat emX as legacy
> devices (we haven't used one in years), while igbX is very common here.
> 
> The impact of changing a NIC device name is quite a bit more involved
> than just rc.conf; it affects other areas too (jails, etc.), so given we can
> lose access to the machine on reboot if everything isn't done right, it
> would be worth considering:
> 
>  * Change emX -> igbX to lower the impact.
>  * Add shims / alias so that operations on the device name going away
>still work.
> 
> What do people think?
> 

We have a "legacy" code implementation that does exactly what you're
describing, and we intend to put that version into 11-stable so that
existing users won't bang their heads against it.

The amount of code in the tree dropped *significantly* when we dropped
this implementation, which is why we wanted to make 12-current a clean break.

sean

bcc matt
> 
> On 06/01/2017 03:17, Sean Bruno wrote:
>> tl;dr --> igbX devices will become emX devices
>>
>> We're about to commit an update to sys/dev/e1000 that will implement and
>> activate IFLIB for em(4), lem(4) & igb(4) and would appreciate all folks
>> who can test and poke at the drivers to do so this week.  This will have
>> some really great changes for performance and standardization that have
>> been bouncing around inside of various FreeBSD shops that have been
>> collaborating with Matt Macy over the last year.
>>
>> This will implement multiple queues for certain em(4) devices that are
>> capable of such things and add some new sysctl's for you to poke at in
>> your monitoring tools.
>>
>> Due to limitations of device registration, igbX devices will become emX
>> devices.  So, you'll need to make a minor update to your rc.conf and
>> scripts that manipulate the network devices.
>>
>> UPDATING will be bumped to reflect these changes.
>>
>> MFC to stable/11 will have a legacy implementation that doesn't use
>> IFLIB for compatibility reasons.
>>
>> A documentation and man page update will follow in the next few days
>> explaining how to work with the changed driver.
>>
>> sean
>>
>> bcc net@ current@ re@
>>
>>
>>
> 
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
> 




HEADS-UP: IFLIB implementations of sys/dev/e1000 em, lem, igb pending

2017-01-05 Thread Sean Bruno

tl;dr --> igbX devices will become emX devices

We're about to commit an update to sys/dev/e1000 that will implement and
activate IFLIB for em(4), lem(4) & igb(4) and would appreciate all folks
who can test and poke at the drivers to do so this week.  This will have
some really great changes for performance and standardization that have
been bouncing around inside of various FreeBSD shops that have been
collaborating with Matt Macy over the last year.

This will implement multiple queues for certain em(4) devices that are
capable of such things and add some new sysctl's for you to poke at in
your monitoring tools.

Due to limitations of device registration, igbX devices will become emX
devices.  So, you'll need to make a minor update to your rc.conf and
scripts that manipulate the network devices.

UPDATING will be bumped to reflect these changes.

MFC to stable/11 will have a legacy implementation that doesn't use
IFLIB for compatibility reasons.

A documentation and man page update will follow in the next few days
explaining how to work with the changed driver.

sean

bcc net@ current@ re@






Re: Large patch/diff refuses to apply

2016-10-06 Thread Sean Bruno



On 10/06/16 10:58, Dimitry Andric wrote:
> On 06 Oct 2016, at 17:23, Sean Bruno  wrote:
>>
>> On 10/06/16 09:19, Sean Bruno wrote:
>>>
>>> On 10/06/16 09:00, Alan Somers wrote:
>>>> On Thu, Oct 6, 2016 at 8:41 AM, Sean Bruno  wrote:
>>>>>
>>>>> On 10/06/16 08:23, Slawa Olhovchenkov wrote:
>>>>>> On Thu, Oct 06, 2016 at 07:57:57AM -0600, Sean Bruno wrote:
> ...
>> A good example of what I was trying to do:
>> svn mkdir sys/modules/ix/ix_iflib
>> svn cp sys/modules/ix/Makefile sys/modules/ix/ix_iflib/Makefile
>> 
>>
>> That seems to really confuse things.
> 
> Try using: svn diff --show-copies-as-adds
> 
> -Dimitry
> 

Most of what I was trying to do was with arcanist.  The review
generation seemed to be fine, but applying the diff from arcanist seemed
to blow up.

sean




Re: Large patch/diff refuses to apply

2016-10-06 Thread Sean Bruno



On 10/06/16 09:19, Sean Bruno wrote:
> 
> 
> On 10/06/16 09:00, Alan Somers wrote:
>> On Thu, Oct 6, 2016 at 8:41 AM, Sean Bruno  wrote:
>>>
>>>
>>> On 10/06/16 08:23, Slawa Olhovchenkov wrote:
>>>> On Thu, Oct 06, 2016 at 07:57:57AM -0600, Sean Bruno wrote:
>>>>
>>>>> I'm making a large number of changes to ixgbe(4) in support of IFLIB
>>>>> implementations and running into failures when trying to apply large
>>>>> diffs.  This is causing phabricator reviews to be unusable as well.
>>>>>
>>>>> I've set up two trees to test this.  The first tree is used to generate
>>>>> the diff and the second (vanilla) is used to apply the diff.  The entire
>>>>> patch fails to apply, so I'm assuming that the size of the diff is
>>>>> tripping a sanity check or something.
>>>>
>>>> No. This is an expanded/collapsed keywords issue:
>>>>
>>>> ===
>>>> -**/
>>>> -/*$FreeBSD$*/
>>>> ===
>>>>
>>>> svn diff against the repo generates a patch with collapsed keywords.
>>>> In the working copy, all keywords are expanded.
>>>>
>>>
>>> Ah, I see.  Thank you.
>>>
>>> I am regenerating the failed files now.  That seems to work (if I leave
>>> the keywords alone).
>>>
>>> sean
>>>
>>
>> Also, I think the "svn patch" command (as opposed to plain "patch")
>> can deal with the RCS keywords.
>>
> 
> The use of "svn cp" also seems to generate a lot of failures.
> 
> I do want to maintain history between certain files when I go to commit.
>  e.g. svn cp fileA new_fileA, but this seems to confuse the crap out of
> diff/patch no matter what I do.  This leads to suffering with
> phabricator as well.
> 
> For now, I'm not going to do the "svn cp" part of my commit, but I will
> try and do this when I am ready to shove my work into current.
> 
> sean
> 

A good example of what I was trying to do:
svn mkdir sys/modules/ix/ix_iflib
svn cp sys/modules/ix/Makefile sys/modules/ix/ix_iflib/Makefile


That seems to really confuse things.

sean




Re: Large patch/diff refuses to apply

2016-10-06 Thread Sean Bruno



On 10/06/16 09:00, Alan Somers wrote:
> On Thu, Oct 6, 2016 at 8:41 AM, Sean Bruno  wrote:
>>
>>
>> On 10/06/16 08:23, Slawa Olhovchenkov wrote:
>>> On Thu, Oct 06, 2016 at 07:57:57AM -0600, Sean Bruno wrote:
>>>
>>>> I'm making a large number of changes to ixgbe(4) in support of IFLIB
>>>> implementations and running into failures when trying to apply large
>>>> diffs.  This is causing phabricator reviews to be unusable as well.
>>>>
>>>> I've set up two trees to test this.  The first tree is used to generate
>>>> the diff and the second (vanilla) is used to apply the diff.  The entire
>>>> patch fails to apply, so I'm assuming that the size of the diff is
>>>> tripping a sanity check or something.
>>>
>>> No. This is an expanded/collapsed keywords issue:
>>>
>>> ===
>>> -**/
>>> -/*$FreeBSD$*/
>>> ===
>>>
>>> svn diff against the repo generates a patch with collapsed keywords.
>>> In the working copy, all keywords are expanded.
>>>
>>
>> Ah, I see.  Thank you.
>>
>> I am regenerating the failed files now.  That seems to work (if I leave
>> the keywords alone).
>>
>> sean
>>
> 
> Also, I think the "svn patch" command (as opposed to plain "patch")
> can deal with the RCS keywords.
> 

The use of "svn cp" also seems to generate a lot of failures.

I do want to maintain history between certain files when I go to commit.
 e.g. svn cp fileA new_fileA, but this seems to confuse the crap out of
diff/patch no matter what I do.  This leads to suffering with
phabricator as well.

For now, I'm not going to do the "svn cp" part of my commit, but I will
try and do this when I am ready to shove my work into current.

sean




Re: Large patch/diff refuses to apply

2016-10-06 Thread Sean Bruno



On 10/06/16 08:23, Slawa Olhovchenkov wrote:
> On Thu, Oct 06, 2016 at 07:57:57AM -0600, Sean Bruno wrote:
> 
>> I'm making a large number of changes to ixgbe(4) in support of IFLIB
>> implementations and running into failures when trying to apply large
>> diffs.  This is causing phabricator reviews to be unusable as well.
>>
>> I've set up two trees to test this.  The first tree is used to generate
>> the diff and the second (vanilla) is used to apply the diff.  The entire
>> patch fails to apply, so I'm assuming that the size of the diff is
>> tripping a sanity check or something.
> 
> No. This is an expanded/collapsed keywords issue:
> 
> ===
> -**/
> -/*$FreeBSD$*/
> ===
> 
> svn diff against the repo generates a patch with collapsed keywords.
> In the working copy, all keywords are expanded.
> 

Ah, I see.  Thank you.

I am regenerating the failed files now.  That seems to work (if I leave
the keywords alone).
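
One way to check whether keyword expansion is the only reason a hunk is
rejected is to collapse the expanded keywords in a scratch copy of the file
and compare or patch against that copy.  A minimal sketch, assuming sed -E is
available; collapse_keywords and the keyword list are made up for
illustration and are not part of svn:

```sh
#!/bin/sh
# Collapse expanded Subversion keywords such as
#   $FreeBSD: head/sys/... 306673 2016-10-04 sbruno $
# back to their unexpanded form ($FreeBSD$).
# Hypothetical helper; the keyword list below is an assumption.
collapse_keywords() {
    sed -E 's/\$(FreeBSD|Id|Revision|Date|Author)(:[^$]*)?\$/$\1$/g'
}

# Example: filter one line that carries an expanded keyword.
printf '%s\n' \
    '/*$FreeBSD: head/sys/dev/ixgbe/if_ix.c 306673 2016-10-04 sbruno $*/' |
    collapse_keywords
# prints: /*$FreeBSD$*/
```

Used as a filter (collapse_keywords < if_ix.c > /tmp/if_ix.c), it leaves
already-collapsed keywords untouched, so it is safe to run on either tree.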

sean




Large patch/diff refuses to apply

2016-10-06 Thread Sean Bruno

I'm making a large number of changes to ixgbe(4) in support of IFLIB
implementations and running into failures when trying to apply large
diffs.  This is causing phabricator reviews to be unusable as well.

I've set up two trees to test this.  The first tree is used to generate
the diff and the second (vanilla) is used to apply the diff.  The entire
patch fails to apply, so I'm assuming that the size of the diff is
tripping a sanity check or something.

https://people.freebsd.org/~sbruno/test.diff

% svn stat sys/dev/ixgbe/if_ix.c
M   sys/dev/ixgbe/if_ix.c

% svn diff sys/dev/ixgbe/if_ix.c > ~/test.diff


% cd ../vanilla/

% patch -p0 < ~/test.diff
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|Index: sys/dev/ixgbe/if_ix.c
|===
|--- sys/dev/ixgbe/if_ix.c  (revision 306673)
|+++ sys/dev/ixgbe/if_ix.c  (working copy)
--
Patching file sys/dev/ixgbe/if_ix.c using Plan A...
Hunk #1 failed at 1.
1 out of 1 hunks failed--saving rejects to sys/dev/ixgbe/if_ix.c.rej
done

% ls -l sys/dev/ixgbe/if_ix.c*
-rw-r--r--  1 sbruno  wheel  166692 Oct  6 06:41 sys/dev/ixgbe/if_ix.c
-rw-r--r--  1 sbruno  wheel  166692 Oct  6 06:37 sys/dev/ixgbe/if_ix.c.orig
-rw-r--r--  1 sbruno  wheel  172779 Oct  6 06:41 sys/dev/ixgbe/if_ix.c.rej

% svn stat sys/dev/ixgbe/
?   sys/dev/ixgbe/if_ix.c.orig






NFS clients (nfsroot netboot) never purged from showmount -a

2016-09-13 Thread Sean Bruno

I just noticed that our netboot nfsroot implementation never clears its
entry or logs out or whatever when rebooting.  I see the same server in
the showmount -a output over the course of testing various
configurations on various nfsroots:

showmount -a
All mount points on localhost:
sysdev05.phx7.llnw.net:/tftpboot/netboot_rbhave
sysdev05.phx7.llnw.net:/tftpboot/netboot_sysdev05
sysdev06.phx7.llnw.net:/tftpboot/netboot_psunkapur
sysdev06.phx7.llnw.net:/tftpboot/netboot_rbhave
sysdev06.phx7.llnw.net:/tftpboot/netboot_sysdev06


Is there some logout mechanism that we're missing in our nfs client that
would do the right thing here?

sean




Re: r292688: ix_txrx.c:812:4: error: use of undeclared identifier 'ip6';

2015-12-24 Thread Sean Bruno




On 12/24/15 04:03, O. Hartmann wrote:
> Building kernel on r292688 fails with the error shown below:
> 

This appears to be breakage due to missing "options INET6" in the
kernconf being used.

I've repaired this at svn revision 292697.
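
For anyone wanting to check whether their own kernel configuration is in the
same situation, here is a minimal sketch that looks for an "options INET6"
line in a config file.  It is deliberately simple: it does not follow
"include" lines or honor "nooptions INET6", so a config that includes
GENERIC will be reported as not setting INET6 itself.  The helper name
has_inet6 and the sample config written to the temporary file are made up
for illustration.

```sh
#!/bin/sh
# Succeed if the given kernel config file enables INET6 directly.
# Limitation: "include" and "nooptions" directives are ignored.
# has_inet6 is a hypothetical helper name.
has_inet6() {
    grep -Eq '^[[:space:]]*options[[:space:]]+INET6([[:space:]]|#|$)' "$1"
}

# Demonstrate on a throwaway config that only enables IPv4.
tmp=$(mktemp)
printf 'ident\t\tGATE\noptions \tINET\n' > "$tmp"
if has_inet6 "$tmp"; then
    echo "INET6 enabled"
else
    echo "INET6 not set"
fi
rm -f "$tmp"
```

On a real system the function would be pointed at the config in use, for
example the GATE config from the build log quoted here.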

> [...] cc  -O2 -pipe -O3 -O3 -pipe -march=native
> -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc
> -DHAVE_KERNEL_OPTION_HEADERS -include
> /usr/obj/usr/src/sys/GATE/opt_global.h -I. -I/usr/src/sys
> -fno-common -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer
> -I/usr/obj/usr/src/sys/GATE -mcmodel=kernel -mno-red-zone -mno-mmx
> -mno-sse -msoft-float -fno-asynchronous-unwind-tables
> -ffreestanding -fwrapv -fstack-protector -Wall -Wredundant-decls
> -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
> -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign 
> -D__printf__=__freebsd_kprintf__  -Wmissing-include-dirs
> -fdiagnostics-show-option -Wno-unknown-pragmas
> -Wno-error-tautological-compare -Wno-error-empty-body 
> -Wno-error-parentheses-equality -Wno-error-unused-function
> -Wno-error-pointer-sign -Wno-error-shift-negative-value  -mno-aes
> -mno-avx  -std=iso9899:1999 -c
> /usr/src/sys/modules/isci/../../dev/isci/scil/scif_sas_smp_remote_device.c
> -o scif_sas_smp_remote_device.o
> --- all_subdir_iwnfw ---
> --- iwn5150fw.ko ---
> ld -d -warn-common -r -d -o iwn5150fw.ko iwlwifi-5150-8.24.2.2.fw.fwo iwn5150fw.o
> :> export_syms
> awk -f /usr/src/sys/conf/kmod_syms.awk iwn5150fw.ko  export_syms | xargs -J% objcopy % iwn5150fw.ko
> --- all_subdir_ix ---
> --- ix_txrx.o ---
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:812:4: error: use of undeclared identifier 'ip6'; did you mean 'ip'?
>     ip6 = (struct ip6_hdr *)(l3d);
>     ^~~
>     ip
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:730:13: note: 'ip' declared here
>     struct ip *ip;
>                ^
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:812:8: error: incompatible pointer types assigning to 'struct ip *' from 'struct ip6_hdr *' [-Werror,-Wincompatible-pointer-types]
>     ip6 = (struct ip6_hdr *)(l3d);
>         ^ ~~~
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:814:14: error: use of undeclared identifier 'ip6'
>     ipproto = ip6->ip6_nxt;
>               ^
> --- all_subdir_isci ---
> 

Re: This igb change makes my igb not working anymore - Re: regression in igb/clang?

2015-12-24 Thread Sean Bruno



Please create/update/point-me-at a bugzilla issue with all this information.

sean

On 12/22/15 08:37, Alan Somers wrote:
> I'm experiencing the same problem, and I can confirm that Alexander's
> workaround fixes it.  Here is some more information:
> 
> * I see the exact same problem on two different systems, both with
> S5520HC motherboards.
> * Both systems have two igb ports, and igb1 works on both.  Only igb0 is 
> broken.
> * Disabling tso, lro, rxcsum, and txcsum has no effect.
> * tcpdump reveals that igb0 transmits successfully, but fails to receive
> * Curiously, "netstat -I igb0" shows nonzero values for Ipkts, even
> though "tcpdump  -i igb0" shows no inbound packets at all.
> * I can't really tell if IPv4 or IPv6 are working, because even ARP
> doesn't work.
> * SVN revisions 291495 and 292570 are both bad.  I don't know any
> recent good revision.
> 
> -Alan
> 
> On Thu, Nov 19, 2015 at 11:21 PM, Alexander Leidinger
>  wrote:
>> Dual stack.
>> Ping was on ipv4, no answer. Without the line I get the answer.
>> I have not tried a ping6.
>> --
>> Send from a mobile device, please forgive brevity and misspelling.
>>
>>
>> Gesendet mit AquaMail für Android
>> http://www.aqua-mail.com
>>
>>
>> Am 20. November 2015 02:07:11 schrieb Eric Joyner :
>>
>>> Are you using IPv6?
>>>
>>> On Thu, Nov 19, 2015 at 12:42 PM Alexander Leidinger <
>>> alexan...@leidinger.net> wrote:
>>>
 On Wed, 11 Nov 2015 11:45:32 +0100
 Alexander Leidinger  wrote:

> Hi,
>
> I updated a system with -current as of r287323 (end August) to
> r290633 (yesterday).
>
> Result: no network connection (not even ping) on igb.
> Ping internally (local addresses) works, anything outgoing/incoming
> doesn't.

 And this is the function which causes it:
 e1000_rx_fifo_flush_82575(&adapter->hw);

 If I comment it out in if_igb.c, the network card works again.

 Full quote below for the PCI ID of my card in case it helps for fixing
 the issue.

 Bye,
 Alexander.

> I disabled HW support (tso4, lro, rxcsum, txcsum): doesn't help.
>
> Did I miss some known defect/workaround?
>
> Anything I should test/provide besides what is below?
>
> The igb device is a:
> ---snip---
> igb0@pci0:1:0:0: class=0x02 card=0x34e28086 chip=0x10a78086
> rev=0x02 hdr=0x00 ---snip---
>
> My src.conf:
> ---snip---
> WITH_IDEA=yes
> WITHOUT_PROFILE=yes
> CFLAGS+=-DFTP_COMBINE_CWDS
> MALLOC_PRODUCTION=yes
> LOADER_FIREWIRE_SUPPORT=yes
> #WITH_FAST_DEPEND=yes
> ---snip---
>
> My buildworld related config in make.conf:
> ---snip---
> CFLAGS+= -O2 -pipe
> COPTFLAGS= -O2 -pipe
> #CPUTYPE?=core2
> #WITH_CCACHE_BUILD=yes
> #.if (!empty(.CURDIR:M/usr/src*) || !empty(.CURDIR:M/usr/obj*)|| !empty(.CURDIR:M/space/system/usr_obj*))
> #.if !defined(NOCCACHE) && exists(/usr/local/libexec/ccache/world/cc)
> #CC:=${CC:C,^cc,/usr/local/libexec/ccache/world/cc,1}
> #CXX:=${CXX:C,^c\+\+,/usr/local/libexec/ccache/world/c++,1}
> #.endif
> #.endif
> ---snip---
>
> The commented out parts were active initially, but then I commented
> them out, cleaned out /usr/obj (rm -r) and rebuilt/reinstalled to make
> sure it's not due to them (CPUTYPE commented out due to the fact that
> there's a new compiler, and I use zsh and there was a commit talking
> about zsh and CPUTYPE workaround).
>
> Bye,
> Alexander.
>


 --
 http://www.Leidinger.net alexan...@leidinger.net: PGP 0xC773696B3BAC17DC
 http://www.FreeBSD.org netch...@freebsd.org  : PGP 0xC773696B3BAC17DC
 ___
 freebsd-current@freebsd.org mailing list
 https://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

>>
>>
>> ___
>> freebsd-current@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
> 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: r292688: ix_txrx.c:812:4: error: use of undeclared identifier 'ip6';

2015-12-24 Thread Sean Bruno



On 12/24/15 04:03, O. Hartmann wrote:
> Building kernel on r292688 fails with the error shown below:
> 
> [...] cc  -O2 -pipe -O3 -O3 -pipe -march=native
> -fno-strict-aliasing -Werror -D_KERNEL -DKLD_MODULE -nostdinc
> -DHAVE_KERNEL_OPTION_HEADERS -include
> /usr/obj/usr/src/sys/GATE/opt_global.h -I. -I/usr/src/sys
> -fno-common -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer
> -I/usr/obj/usr/src/sys/GATE -mcmodel=kernel -mno-red-zone -mno-mmx
> -mno-sse -msoft-float -fno-asynchronous-unwind-tables
> -ffreestanding -fwrapv -fstack-protector -Wall -Wredundant-decls
> -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
> -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign 
> -D__printf__=__freebsd_kprintf__  -Wmissing-include-dirs
> -fdiagnostics-show-option -Wno-unknown-pragmas
> -Wno-error-tautological-compare -Wno-error-empty-body 
> -Wno-error-parentheses-equality -Wno-error-unused-function
> -Wno-error-pointer-sign -Wno-error-shift-negative-value  -mno-aes
> -mno-avx  -std=iso9899:1999 -c
> /usr/src/sys/modules/isci/../../dev/isci/scil/scif_sas_smp_remote_device.c
> -o scif_sas_smp_remote_device.o
> --- all_subdir_iwnfw ---
> --- iwn5150fw.ko ---
> ld -d -warn-common -r -d -o iwn5150fw.ko iwlwifi-5150-8.24.2.2.fw.fwo iwn5150fw.o
> :> export_syms
> awk -f /usr/src/sys/conf/kmod_syms.awk iwn5150fw.ko  export_syms | xargs -J% objcopy % iwn5150fw.ko
> --- all_subdir_ix ---
> --- ix_txrx.o ---
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:812:4: error: use of undeclared identifier 'ip6'; did you mean 'ip'?
>     ip6 = (struct ip6_hdr *)(l3d);
>     ^~~
>     ip
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:730:13: note: 'ip' declared here
>     struct ip *ip;
>                ^
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:812:8: error: incompatible pointer types assigning to 'struct ip *' from 'struct ip6_hdr *' [-Werror,-Wincompatible-pointer-types]
>     ip6 = (struct ip6_hdr *)(l3d);
>         ^ ~~~
> /usr/src/sys/modules/ix/../../dev/ixgbe/ix_txrx.c:814:14: error: use of undeclared identifier 'ip6'
>     ipproto = ip6->ip6_nxt;
>               ^
> --- all_subdir_isci ---
> 


Is this just GENERIC? Or custom kernconf?
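
A minimal sketch of why the kernel config could matter here, assuming the
declaration of 'ip6' sits behind a config-dependent #ifdef while a use of it
does not (an assumption, nothing in this thread confirms it). SKETCH_INET6,
the two header structs and l4_proto() are hypothetical stand-ins, not the
actual ix_txrx.c code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel headers; only the fields used below. */
struct ip_hdr_sketch  { uint8_t ip_p; };
struct ip6_hdr_sketch { uint8_t ip6_nxt; };

/* Models "options INET6" in a kernel config; comment out to model a
 * custom config without IPv6 support. */
#define SKETCH_INET6 1

/* Return the L4 protocol number from an L3 header, or -1 if unsupported. */
int l4_proto(const void *l3d, int is_v6)
{
	const struct ip_hdr_sketch *ip;
#ifdef SKETCH_INET6
	const struct ip6_hdr_sketch *ip6;	/* exists only with IPv6 support */
#endif

	assert(l3d != NULL);
	if (!is_v6) {
		ip = l3d;
		return ip->ip_p;
	}
#ifdef SKETCH_INET6
	/* Every use of ip6 needs the same guard as its declaration;
	 * an unguarded use is "undeclared" when the option is off. */
	ip6 = l3d;
	return ip6->ip6_nxt;
#else
	return -1;
#endif
}
```

If the pattern is as assumed, a GENERIC build would compile while a custom
config without IPv6 would fail in the way shown above.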

sean

Re: sys/modules "make clean" seems broken again

2015-11-30 Thread Sean Bruno



On 11/19/15 15:42, Warner Losh wrote:
> 
> 
> On Thu, Nov 19, 2015 at 11:12 AM, John Baldwin <j...@freebsd.org> wrote:
> 
> On Thursday, November 19, 2015 09:13:06 AM Sean Bruno wrote:
>> I thought I fixed this a year or two ago, but now a "make clean"
>> in sys/modules seems to leave bus_if.h device_if.h and pci_if.h
>> in the directory.
>> 
>> Should I just add these to the clean targets?
> 
> Blame Warner as his MFILES changes broke this.  In particular, see 
> r287263 which turned off cleaning these up.  I'm not sure what that
> broke that caused it to be disabled.
> 
> Your change is a gross hack though.
> 
> 
> Yes. The reason I had to break it was that there were a few files
> named _if.c and _if.h in the tree that would get deleted when make
> clean was run in the modules.
> 
> I'll take another run at fixing this. Sean's "fix" is a horrible
> hack.
> 
> Warner


Just bouncing a ping here.  I haven't actually checked to see if last
week's commits to head fixed this, but I suspect this hasn't been
addressed.  What do we want to do differently, if not the hackery I proposed?



sean

Re: sys/modules "make clean" seems broken again

2015-11-19 Thread Sean Bruno




On 11/19/15 10:12, John Baldwin wrote:
> On Thursday, November 19, 2015 09:13:06 AM Sean Bruno wrote:
>> I thought I fixed this a year or two ago, but now a "make clean"
>> in sys/modules seems to leave bus_if.h device_if.h and pci_if.h
>> in the directory.
>> 
>> Should I just add these to the clean targets?
> 
> Blame Warner as his MFILES changes broke this.  In particular, see
> r287263 which turned off cleaning these up.  I'm not sure what that
> broke that caused it to be disabled.
> 
> Your change is a gross hack though.
> 

Definitely.  Just committing that would be pretty bad.  There's
similar code above it that I used as a template for this change.  I'm
not sure that it's any less gross ... :-(

sean

sys/modules "make clean" seems broken again

2015-11-19 Thread Sean Bruno

I thought I fixed this a year or two ago, but now a "make clean" in
sys/modules seems to leave bus_if.h device_if.h and pci_if.h in the
directory.

Should I just add these to the clean targets?


Index: kmod.mk
===================================================================
--- kmod.mk	(revision 290770)
+++ kmod.mk	(working copy)
@@ -425,6 +425,10 @@
 	${SYSDIR}/${MACHINE}/${MACHINE}/genassym.c
 .endif
 
+.if !empty(SRCS:M${_i}pci_if.h)
+CLEANFILES+=	${_i}pci_if.h ${_i}device_if.h ${_i}bus_if.h
+.endif
+
 lint: ${SRCS}
 	${LINT} ${LINTKERNFLAGS} ${CFLAGS:M-[DILU]*} ${.ALLSRC:M*.c}


sean

bcc -- imp@

[CFT] em(4) - Update to use Extended RX Descriptor Format

2015-10-24 Thread Sean Bruno

https://reviews.freebsd.org/D3447

I'm looking for a bunch of you folks who have em(4) and *not* lem(4)
to test out this update.  This matches what Linux does in e1000e vs
e1000 and should be a no-op for your machines.  The more devices we
test here the better.

sean

Re: CURRENT: net/igb broken

2015-10-02 Thread Sean Bruno



On 10/02/15 00:47, O. Hartmann wrote:
> On Thu, 01 Oct 2015 15:39:11 + Eric Joyner 
> wrote:
> 
>> Oliver,
>> 
>> did you try Sean's suggestion?
>> 
>> - Eric
>> 
>> On Tue, Sep 22, 2015 at 1:10 PM Sean Bruno 
>> wrote:
>> 
> 
> 
> On 09/21/15 23:23, O. Hartmann wrote:
>>>>> On Mon, 21 Sep 2015 21:13:18 + Eric Joyner
>>>>>  wrote:
>>>>> 
>>>>>> If you do a diff between r288057 and r287761, there are
>>>>>> no differences between the sys/dev/e1000, sys/modules/em,
>>>>>> and sys/modules/igb directories. Are you sure r287761
>>>>>> actually works?
>>>>> 
>>>>> I'm quite sure r287761 works (and r287762 doesn't), double
>>>>> checked this this morning again. I also checked r288093 and
>>>>> it is still not working.
>>>>> 
>>>>> To ensure that I'm not the culprit and stupid here:
>>>>> 
>>>>> I use a NanoBSD environment and the only thing that gets
>>>>> exchanged, is the underlying OS/OS revision. The
>>>>> configuration always stays the same. The base system for
>>>>> all of my tests is built from a clean source - (deleted
>>>>> obj/ dir, clean, fresh build into obj/ for every test I
>>>>> ran).
>>>>> 
>>>>> I realised a funny thing. Playing around with
>>>>> enabling/disabling TSO (I have been told in an earlier email
>>>>> from this list that it could be the culprit) with the
>>>>> command sequence:
>>>>> 
>>>>> ifconfig igb1 down
>>>>> ifconfig igb1 -tso
>>>>> ifconfig igb1 up
>>>>> ifconfig igb1 down
>>>>> ifconfig igb1 tso
>>>>> ifconfig igb1 up
>>>>> ...
>>>>> 
>>>>> while a ping to a remote host connected to that specific
>>>>> interface runs in the background, the ping works for a
>>>>> while and then dies after roughly 10 - 20 round trips.
>>>>> I can reproduce this.
>>>>> 
>>>>> Is that observation of any help?
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> oh
>>>>> 
>>>>>> 
>>>>>> On Mon, Sep 21, 2015 at 1:58 AM O. Hartmann 
>>>>>>  wrote:
>>>>>> 
>>>>>>> On Sat, 19 Sep 2015 11:23:44 -0700 Sean Bruno 
>>>>>>>  wrote:
>>>>>>> 
>>>>> 
>>>>> 
>>>>> On 09/18/15 10:20, Eric Joyner wrote:
>>>>>>>>>> He has an i210 -- he would want to revert 
>>>>>>>>>> e1000_i210.[ch], too.
>>>>>>>>>> 
>>>>>>>>>> Sorry for the thrash Sean -- it sounds like it
>>>>>>>>>> would be a good idea for you to revert this
>>>>>>>>>> patch, and Jeff and I can go look at trying these
>>>>>>>>>> shared code updates and igb changes internally
>>>>>>>>>> again. We at Intel really could've done a better
>>>>>>>>>> job of making sure these changes worked across a
>>>>>>>>>> wider variety of devices.
>>>>>>>>>> 
>>>>>>>>>> - Eric
>>>>> 
>>>>> I've reverted the changes to head.  I'll reopen the reviews
>>>>> and we can proceed from there.
>>>>> 
>>>>> sean
>>>>> 
>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Fri, Sep 18, 2015 at 9:50 AM Sean Bruno
>>>>>>>>>> <sbr...@freebsd.org> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> r287762 broke the system
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Before I revert this changeset *again* can you
>>>>>>>>>> test revert r287762 from if_igb.c, e1000_82575.c
>>>>>>>>>> and e1000_82575.h *only*
>>>>>>>>>> 
>>>>>>>>>> That narrows down the change quite

Re: HWPMC panic

2015-09-22 Thread Sean Bruno



On 08/26/15 14:12, Larry Rosenman wrote:
> Was playing with pmcstat -S instructions -T, and got the following
> when I tried to ctrl/c out.
> 
> 
> oldtbh.lerctr.org dumped core - see /var/crash/vmcore.3
> 
> Wed Aug 26 16:05:16 CDT 2015
> 
> FreeBSD oldtbh.lerctr.org 11.0-CURRENT FreeBSD 11.0-CURRENT #18
> r287033: Sun Aug 23 18:08:24 CDT 2015
> r...@oldtbh.lerctr.org:/usr/obj/usr/src/sys/VT-LER  amd64
> 
> panic: [p4,700] class mismatch pd 260 != id class 4
> 
> GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation,
> Inc. GDB is free software, covered by the GNU General Public
> License, and you are welcome to change it and/or distribute copies
> of it under certain conditions. Type "show copying" to see the
> conditions. There is absolutely no warranty for GDB.  Type "show
> warranty" for details. This GDB was configured as
> "amd64-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> panic: [p4,700] class mismatch pd 260 != id class 4
> cpuid = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0238744770
> vpanic() at vpanic+0x189/frame 0xfe02387447f0
> kassert_panic() at kassert_panic+0x132/frame 0xfe0238744860
> p4_read_pmc() at p4_read_pmc+0x185/frame 0xfe02387448b0
> pmc_stop() at pmc_stop+0x132/frame 0xfe02387448f0
> pmc_syscall_handler() at pmc_syscall_handler+0x1752/frame 0xfe0238744ae0
> amd64_syscall() at amd64_syscall+0x25d/frame 0xfe0238744bf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0238744bf0
> --- syscall (0, FreeBSD ELF64, nosys), rip = 0x801407ffa, rsp = 0x7fffe588, rbp = 0x7fffe5a0 ---
> Uptime: 2d8h36m19s
> Dumping 3475 out of 8158 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> 
> Reading symbols from /boot/kernel/if_lagg.ko.symbols...done.
> Loaded symbols for /boot/kernel/if_lagg.ko.symbols
> Reading symbols from /boot/kernel/hwpmc.ko.symbols...done.
> Loaded symbols for /boot/kernel/hwpmc.ko.symbols
> #0  doadump (textdump=1) at pcpu.h:221
> 221	pcpu.h: No such file or directory.
> 	in pcpu.h
> (kgdb) #0  doadump (textdump=1) at pcpu.h:221
> #1  0x80b34ca5 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:329
> #2  0x80b35298 in vpanic (fmt=, ap= optimized out>) at /usr/src/sys/kern/kern_shutdown.c:626
> #3  0x80b350c2 in kassert_panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:516
> #4  0x8242ee25 in p4_read_pmc (cpu=1, ri=12, v=0xf8012b206aa0) at /usr/src/sys/modules/hwpmc/../../dev/hwpmc/hwpmc_piv.c:699
> #5  0x82425102 in pmc_stop (pm=0xf8012b206a80) at /usr/src/sys/modules/hwpmc/../../dev/hwpmc/hwpmc_mod.c:2741
> #6  0x82423a12 in pmc_syscall_handler (td= out>, syscall_args=) at /usr/src/sys/modules/hwpmc/../../dev/hwpmc/hwpmc_mod.c:3923
> #7  0x80f7c38d in amd64_syscall (td=0xf801060759a0, traced=0) at subr_syscall.c:133
> #8  0x80f5b26b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:395
> #9  0x000801407ffa in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
> 
> 
> 
> vmcore IS available, and I *CAN* give ssh access.
> 


Odd, can you post what CPU type you have? e.g. from my dmesg:

CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2400.13-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2

Features=0xbfebfbff

Features2=0x29ee3ff
  AMD Features=0x2c100800
  AMD Features2=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
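
For reference, the Family/Model/Stepping values in a dmesg line like the one
above are derived from the Id field. A minimal sketch of that decoding,
assuming the standard x86 CPUID signature layout; struct cpu_sig and
decode_cpu_sig() are illustrative names, not kernel APIs. The panic above
went through p4_read_pmc() in hwpmc_piv.c, which looks like the Pentium 4
code path, so the family value is the interesting part:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative container for the decoded signature (not a kernel type). */
struct cpu_sig {
	unsigned family;
	unsigned model;
	unsigned stepping;
};

/* Decode a CPUID leaf-1 EAX signature, e.g. the "Id=0x206c2" in dmesg. */
struct cpu_sig decode_cpu_sig(uint32_t id)
{
	struct cpu_sig s;
	unsigned base_family = (id >> 8) & 0xf;
	unsigned base_model = (id >> 4) & 0xf;
	unsigned ext_family = (id >> 20) & 0xff;
	unsigned ext_model = (id >> 16) & 0xf;

	s.stepping = id & 0xf;

	/* The extended family is only added when the base family is 0xf. */
	s.family = base_family;
	if (base_family == 0xf)
		s.family += ext_family;

	/* The extended model supplies the high nibble for families 6 and 0xf. */
	s.model = base_model;
	if (base_family == 0x6 || base_family == 0xf)
		s.model |= ext_model << 4;

	return s;
}

/* Self-check against the dmesg sample quoted in this message. */
void decode_cpu_sig_selfcheck(void)
{
	struct cpu_sig s = decode_cpu_sig(0x206c2);

	assert(s.family == 0x6);
	assert(s.model == 0x2c);
	assert(s.stepping == 2);
}
```

The self-check reproduces Family=0x6 Model=0x2c Stepping=2 from Id=0x206c2.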

Re: CURRENT: net/igb broken

2015-09-22 Thread Sean Bruno



On 09/21/15 23:23, O. Hartmann wrote:
> On Mon, 21 Sep 2015 21:13:18 + Eric Joyner 
> wrote:
> 
>> If you do a diff between r288057 and r287761, there are no
>> differences between the sys/dev/e1000, sys/modules/em, and
>> sys/modules/igb directories. Are you sure r287761 actually
>> works?
> 
> I'm quite sure r287761 works (and r287762 doesn't), double checked
> this this morning again. I also checked r288093 and it is still not
> working.
> 
> To ensure that I'm not the culprit and stupid here:
> 
> I use a NanoBSD environment and the only thing that gets exchanged,
> is the underlying OS/OS revision. The configuration always stays
> the same. The base system for all of my tests is built from a clean
> source - (deleted obj/ dir, clean, fresh build into obj/ for every
> test I ran).
> 
> I realised a funny thing. Playing around with enabling/disabling
> TSO (I have been told in an earlier email from this list that it
> could be the culprit) with the command sequence:
> 
> ifconfig igb1 down
> ifconfig igb1 -tso
> ifconfig igb1 up
> ifconfig igb1 down
> ifconfig igb1 tso
> ifconfig igb1 up
> ...
> 
> while a ping to a remote host connected to that specific interface
> runs in the background, the ping works for a while and then dies
> after roughly 10 - 20 round trips. I can reproduce this.
> 
> Is that observation of any help?
> 
> Regards,
> 
> oh
> 
>> 
>> On Mon, Sep 21, 2015 at 1:58 AM O. Hartmann
>>  wrote:
>> 
>>> On Sat, 19 Sep 2015 11:23:44 -0700 Sean Bruno
>>>  wrote:
>>> 
> 
> 
> On 09/18/15 10:20, Eric Joyner wrote:
>>>>>> He has an i210 -- he would want to revert
>>>>>> e1000_i210.[ch], too.
>>>>>> 
>>>>>> Sorry for the thrash Sean -- it sounds like it would be a
>>>>>> good idea for you to revert this patch, and Jeff and
>>>>>> I can go look at trying these shared code updates and igb
>>>>>> changes internally again. We at Intel really could've
>>>>>> done a better job of making sure these changes worked
>>>>>> across a wider variety of devices.
>>>>>> 
>>>>>> - Eric
> 
> I've reverted the changes to head.  I'll reopen the reviews and we
> can proceed from there.
> 
> sean
> 
> 
>>>>>> 
>>>>>> On Fri, Sep 18, 2015 at 9:50 AM Sean Bruno
>>>>>> <sbr...@freebsd.org> wrote:
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> r287762 broke the system
>>>>>> 
>>>>>> 
>>>>>> Before I revert this changeset *again* can you test
>>>>>> revert r287762 from if_igb.c, e1000_82575.c and
>>>>>> e1000_82575.h *only*
>>>>>> 
>>>>>> That narrows down the change quite a bit.
>>>>>> 
>>>>>> sean
>>>>>> 
>>> 
>>> I'm now on r288057 on that specific machine, which supposedly
>>> has the changes identified as the culprit reverted. Still NO
>>> change in behaviour!
>>> 
>>> r287761 works with the same configuration on igb (i210); any
>>> later revision does not. No ping/connect from the outside, no
>>> ping/connect from the inside. Tried different protocols (SAMBA,
>>> ssh, LDAP, DNS). Only boxes with the igb driver and i210
>>> chipset are affected (we do not have other chips covered by igb).
>>> 
>>> Regards, Oliver
>>> 
> 
> 


For my entertainment (and HPS's), can you run HEAD and revert r287775?

sean

Re: CURRENT: net/igb broken

2015-09-19 Thread Sean Bruno



On 09/18/15 10:20, Eric Joyner wrote:
> He has an i210 -- he would want to revert e1000_i210.[ch], too.
> 
> Sorry for the thrash Sean -- it sounds like it would be a good idea
> for you to revert this patch, and Jeff and I can go look at
> trying these shared code updates and igb changes internally again.
> We at Intel really could've done a better job of making sure these
> changes worked across a wider variety of devices.
> 
> - Eric

I've reverted the changes to head.  I'll reopen the reviews and we can
proceed from there.

sean


> 
> On Fri, Sep 18, 2015 at 9:50 AM Sean Bruno <sbr...@freebsd.org> wrote:
> 
> 
>> 
>> r287762 broke the system
> 
> 
> Before I revert this changeset *again* can you test revert r287762
> from if_igb.c, e1000_82575.c and e1000_82575.h *only*
> 
> That narrows down the change quite a bit.
> 
> sean
> 

Re: CURRENT: net/igb broken

2015-09-18 Thread Sean Bruno


> 
> r287762 broke the system


Before I revert this changeset *again* can you test revert r287762 from
if_igb.c, e1000_82575.c and e1000_82575.h *only*

That narrows down the change quite a bit.

sean

Re: em broken on current amd64

2015-09-13 Thread Sean Bruno



On 09/12/15 13:45, Mark R V Murray wrote:
> 
>> On 8 Sep 2015, at 19:02, Mark R V Murray  wrote:
>>
>>
>>> On 8 Sep 2015, at 17:22, Sean Bruno  wrote:
>>>
>>>
>>>>>>
>>>>>> I’m also seeing breakage with the em0 device; this isn’t a kernel
>>>>>> hang, it is a failure to move data after about 10-15 minutes. The
>>>>>> symptom is that my WAN ethernet no longer moves traffic, no pings,
>>>>>> nothing. Booting looks normal:
>>>>>>
>>>>>> em0:  port 0x30c0-0x30df mem 0x5030-0x5031,0x50324000-0x50324fff irq 20 at device 25.0 on pci0
>>>>>> em0: Using an MSI interrupt
>>>>>> em0: Ethernet address: 00:16:76:d3:e1:5b
>>>>>> em0: netmap queues/slots: TX 1/1024, RX 1/1024
>>>>>>
>>>>>> Fixing it is as easy as …
>>>>>>
>>>>>> # ifconfig em0 down ; service ipfw restart ; ifconfig em0 up
>>>>>>
>>>>>> :-)
>>>>>>
>>>>>> I’m running CURRENT, r287538. This last worked for me a month or so
>>>>>> ago at my previous build.
>>>>>>
>>>>>> M
>>>>>>
>>>>>
>>>>>
>>>>> Just so I'm clear, the original problem reported was a failure to
>>>>> attach (you were among several folks reporting breakage).  Is that
>>>>> fixed?
>>>>
>>>> I did not report the failure to attach, and I am not seeing it as I don’t
>>>> think I built a kernel that had that particular failure. I am having the
>>>> “failure after 10-15 minutes” problem; this is on an em0 device.
>>>>
>>>> M
>>>>
>>>
>>>
>>> Hrm, that's odd.  That sounds like a hole where interrupts aren't being
>>> reset for "reasons" that I cannot fathom.
>>>
>>> What hardware (pciconf -lv) does your system actually have?  The em(4)
>>> driver doesn't identify components which is frustrating.
>>
>> pciconf -lv output below:
>>
>> hostb0@pci0:0:0:0:   class=0x06 card=0x514d8086 chip=0x29a08086 rev=0x02 hdr=0x00
>>     vendor     = 'Intel Corporation'
>>     device     = '82P965/G965 Memory Controller Hub'
>>     class      = bridge
>>     subclass   = HOST-PCI
> 
> I just caught this, on today’s build:
> 
> em0: Watchdog timeout Queue[0]-- resetting
> Interface is RUNNING and ACTIVE
> em0: TX Queue 0 --
> em0: hw tdh = 127, hw tdt = 139
> em0: Tx Queue Status = -2147483648
> em0: TX descriptors avail = 1012
> em0: Tx Descriptors avail failure = 0
> em0: RX Queue 0 --
> em0: hw rdh = 0, hw rdt = 1023
> em0: RX discarded packets = 0
> em0: RX Next to Check = 0
> em0: RX Next to Refresh = 1023
> 
> [graveyard] /usr/ports 09:42 pm # uname -a
> FreeBSD graveyard.grondar.org 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r287705: 
> Sat Sep 12 15:07:54 BST 2015 
> r...@graveyard.grondar.org:/b/obj/usr/src/sys/G_AMD64_GATE  amd64
> 
> M
> 

Any chance you can turn TSO off, if it's on, and see what your results are?
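
A quick cross-check of the watchdog dump quoted above, as a minimal sketch:
assuming tdh and tdt are the head and tail indices of a 1024-slot TX ring
(the slot count is taken from the netmap line earlier in this thread), the
distance from head to tail is the number of descriptors still owned by the
hardware. ring_pending() and ring_avail() are hypothetical helpers, not
functions from em(4), and a real driver may keep a slot in reserve:

```c
#include <assert.h>
#include <stdint.h>

/* Descriptors queued to the hardware but not yet completed, in a
 * circular ring where head is the consumer and tail the producer. */
uint32_t ring_pending(uint32_t head, uint32_t tail, uint32_t ring_size)
{
	assert(ring_size > 0 && head < ring_size && tail < ring_size);
	/* Adding ring_size before subtracting handles tail wrapping past 0. */
	return (tail + ring_size - head) % ring_size;
}

/* Descriptors free for new packets under the same assumptions. */
uint32_t ring_avail(uint32_t head, uint32_t tail, uint32_t ring_size)
{
	return ring_size - ring_pending(head, tail, ring_size);
}
```

With tdh = 127 and tdt = 139 this gives 12 pending and 1012 free, which
matches "TX descriptors avail = 1012": the ring is almost empty, yet those
12 descriptors never completed before the watchdog fired.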

sean

Re: em broken on current amd64

2015-09-08 Thread Sean Bruno


>>>
>>> I’m also seeing breakage with the em0 device; this isn’t a kernel
>>> hang, it is a failure to move data after about 10-15 minutes. The
>>> symptom is that my WAN ethernet no longer moves traffic, no pings,
>>> nothing. Booting looks normal:
>>>
>>> em0:  port 0x30c0-0x30df mem 0x5030-0x5031,0x50324000-0x50324fff irq 20 at device 25.0 on pci0
>>> em0: Using an MSI interrupt
>>> em0: Ethernet address: 00:16:76:d3:e1:5b
>>> em0: netmap queues/slots: TX 1/1024, RX 1/1024
>>>
>>> Fixing it is as easy as …
>>>
>>> # ifconfig em0 down ; service ipfw restart ; ifconfig em0 up
>>>
>>> :-)
>>>
>>> I’m running CURRENT, r287538. This last worked for me a month or so
>>> ago at my previous build.
>>>
>>> M
>>>
>>
>>
>> Just so I'm clear, the original problem reported was a failure to
>> attach (you were among several folks reporting breakage).  Is that
>> fixed?
> 
> I did not report the failure to attach, and I am not seeing it as I don’t
> think I built a kernel that had that particular failure. I am having the
> “failure after 10-15 minutes” problem; this is on an em0 device.
> 
> M
> 


Hrm, that's odd.  That sounds like a hole where interrupts aren't being
reset for "reasons" that I cannot fathom.

What hardware (pciconf -lv) does your system actually have?  The em(4)
driver doesn't identify components which is frustrating.

sean

Re: em broken on current amd64

2015-09-07 Thread Sean Bruno



On 09/07/15 14:10, Mark R V Murray wrote:
> 
>> On 5 Sep 2015, at 17:11, Garrett Cooper 
>> wrote:
>> 
>> 
>>> On Sep 5, 2015, at 08:50, Manfred Antar  wrote:
>>> 
>>> Recent changes to em have broken current on amd64. Booting the
>>> kernel will hang when trying to load em0, then will continue
>>> booting without the driver loading (no network). This is on an HP
>>> SFF 8000 with em0 embedded on the motherboard.
>>> 
>>> boot messages:
>>> 
>>> em0:  port 0x3100-0x311f mem 0xf310-0xf311,0xf3125000-0xf3125fff irq 19 at device 25.0 on pci0
>>> em0: attempting to allocate 1 MSI vectors (1 supported)
>>> em0: using IRQ 265 for MSI
>>> em0: Using an MSI interrupt
>>> em0: The EEPROM Checksum Is Not Valid
>>> device_attach: em0 attach returned 5
>> 
>> Tijl said the same. The offending commit's r287467. Cheers,
> 
> I’m also seeing breakage with the em0 device; this isn’t a kernel
> hang, it is a failure to move data after about 10-15 minutes. The
> symptom is that my WAN ethernet no longer moves traffic, no pings,
> nothing. Booting looks normal:
> 
> em0:  port 0x30c0-0x30df mem 0x5030-0x5031,0x50324000-0x50324fff irq 20 at device 25.0 on pci0
> em0: Using an MSI interrupt
> em0: Ethernet address: 00:16:76:d3:e1:5b
> em0: netmap queues/slots: TX 1/1024, RX 1/1024
> 
> Fixing it is as easy as …
> 
> # ifconfig em0 down ; service ipfw restart ; ifconfig em0 up
> 
> :-)
> 
> I’m running CURRENT, r287538. This last worked for me a month or so
> ago at my previous build.
> 
> M
> 


Just so I'm clear, the original problem reported was a failure to
attach (you were among several folks reporting breakage).  Is that
fixed?

sean

Re: Kernel panic with fresh current, probably nfs related

2015-08-31 Thread Sean Bruno



On 08/29/15 00:38, Joel Dahl wrote:
> On Tue, Aug 25, 2015 at 12:55:02PM -0700, Sean Bruno wrote:
>>
>>
>> On 08/25/15 12:10, Joel Dahl wrote:
>>>>> Seems to work. However, I cannot reproduce the user panic in the first
>>>>> place.  What's the scenario that seems to work here?  NFS seems happy
>>>>> with/without the patch so I'm not confident in anything we are doing
>>>>> here.
>>> I see several patches here. Which one should I be using?
>>
>> This:
>>
>> Index: sys/dev/e1000/if_em.c
>> ===
>> --- sys/dev/e1000/if_em.c(revision 287087)
>> +++ sys/dev/e1000/if_em.c(working copy)
>> @@ -3044,7 +3044,7 @@ em_setup_interface(device_t dev, struct adapter *a
>>  if_setioctlfn(ifp, em_ioctl);
>>  if_setgetcounterfn(ifp, em_get_counter);
>>  /* TSO parameters */
>> -ifp->if_hw_tsomax = EM_TSO_SIZE;
>> +ifp->if_hw_tsomax = IP_MAXPACKET;
>>  ifp->if_hw_tsomaxsegcount = EM_MAX_SCATTER;
>>  ifp->if_hw_tsomaxsegsize = EM_TSO_SEG_SIZE;
> 
> Using this patch, my nfs server has survived several
> installkernel/installworld cycles.
> 


Committed as svn R287330

sean

Re: Kernel panic with fresh current, probably nfs related

2015-08-25 Thread Sean Bruno



On 08/25/15 12:10, Joel Dahl wrote:
>> > Seems to work. However, I cannot reproduce the user panic in the first
>> > place.  What's the scenario that seems to work here?  NFS seems happy
>> > with/without the patch so I'm not confident in anything we are doing
>> > here.
> I see several patches here. Which one should I be using?

This:

Index: sys/dev/e1000/if_em.c
===
--- sys/dev/e1000/if_em.c   (revision 287087)
+++ sys/dev/e1000/if_em.c   (working copy)
@@ -3044,7 +3044,7 @@ em_setup_interface(device_t dev, struct adapter *a
if_setioctlfn(ifp, em_ioctl);
if_setgetcounterfn(ifp, em_get_counter);
/* TSO parameters */
-   ifp->if_hw_tsomax = EM_TSO_SIZE;
+   ifp->if_hw_tsomax = IP_MAXPACKET;
ifp->if_hw_tsomaxsegcount = EM_MAX_SCATTER;
ifp->if_hw_tsomaxsegsize = EM_TSO_SEG_SIZE;

Re: Kernel panic with fresh current, probably nfs related

2015-08-24 Thread Sean Bruno



On 08/23/15 18:36, Yonghyeon PYUN wrote:
> Index: sys/dev/e1000/if_em.c
> ===
> --- sys/dev/e1000/if_em.c (revision 287087)
> +++ sys/dev/e1000/if_em.c (working copy)
> @@ -3044,7 +3044,7 @@ em_setup_interface(device_t dev, struct adapter *a
>  	if_setioctlfn(ifp, em_ioctl);
>  	if_setgetcounterfn(ifp, em_get_counter);
>  	/* TSO parameters */
> -	ifp->if_hw_tsomax = EM_TSO_SIZE;
> +	ifp->if_hw_tsomax = IP_MAXPACKET;
>  	ifp->if_hw_tsomaxsegcount = EM_MAX_SCATTER;
>  	ifp->if_hw_tsomaxsegsize = EM_TSO_SEG_SIZE;


Seems to work. However, I cannot reproduce the user panic in the first
place.  What's the scenario that seems to work here?  NFS seems happy
with/without the patch so I'm not confident in anything we are doing
here.

sean

Re: Kernel panic with fresh current, probably nfs related

2015-08-22 Thread Sean Bruno




> I'm going to guess that you're using an "em" net driver, since that is the
> only one that sets if_hw_tsomax > IP_MAXPACKET (65535) from what I can see.
> 
> Sean, EM_TSO_SIZE is defined as (65535 + sizeof(struct ether_vlan_header)),
> which makes it > IP_MAXPACKET. The value of if_hw_tsomax must be
> <= IP_MAXPACKET and I'm guessing this is what caused the above panic.
> (Someday it would be nice if TSO segments > IP_MAXPACKET could be handled,
> but that will take changes in the ip layer and router software so that a
> bogus ip_len field doesn't cause problems.)
> 
> if_hw_tsomax needs to be the maximum segment size that the driver can accept
> from IP. Since the driver adds any MAC header after accepting the TSO segment
> from the IP layer, it shouldn't include MAC header(s) in the value for
> if_hw_tsomax. (If its limit includes MAC header(s), it needs to subtract
> those out when setting if_hw_tsomax, not add them.)
> 
> Since I am working up a patch for the value of if_hw_tsomaxsegcount, I think
> I'll add a check for > IP_MAXPACKET for if_hw_tsomax as well.
> 
> rick

Huh, ok.  You want to try something like this then?

sean


Index: if_em.h
===
--- if_em.h (revision 286991)
+++ if_em.h (working copy)
@@ -268,7 +268,7 @@

 #define EM_MAX_SCATTER 64
 #define EM_VFTA_SIZE   128
-#define EM_TSO_SIZE    (65535 + sizeof(struct ether_vlan_header))
+#define EM_TSO_SIZE    (65535 - sizeof(struct ether_vlan_header))
 #define EM_TSO_SEG_SIZE 4096   /* Max dma segment size */
 #define EM_MSIX_MASK   0x01F0 /* For 82574 use */
 #define EM_MSIX_LINK   0x0100 /* For 82574 use */
Index: if_lem.h
===
--- if_lem.h (revision 286991)
+++ if_lem.h (working copy)
@@ -238,7 +238,7 @@

 #define EM_MAX_SCATTER 64
 #define EM_VFTA_SIZE   128
-#define EM_TSO_SIZE    (65535 + sizeof(struct ether_vlan_header))
+#define EM_TSO_SIZE    (65535 - sizeof(struct ether_vlan_header))
 #define EM_TSO_SEG_SIZE 4096   /* Max dma segment size */
 #define EM_MSIX_MASK   0x01F0 /* For 82574 use */
 #define ETH_ZLEN   60
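
To make the size arithmetic in the quoted explanation concrete, here is a
minimal userland sketch in C (not driver code). It assumes that
struct ether_vlan_header is 18 bytes (two 6-byte MAC addresses plus three
16-bit fields), so it uses a hypothetical stand-in struct; every sketch_
and tsomax_ name is made up for illustration. It only shows that the old
EM_TSO_SIZE lands above IP_MAXPACKET while the subtracted value lands
below it.

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as IP_MAXPACKET (65535), the largest length ip_len can hold. */
#define SKETCH_IP_MAXPACKET 65535u

/*
 * Hypothetical stand-in for struct ether_vlan_header: destination MAC,
 * source MAC, 802.1Q ethertype, VLAN tag, encapsulated ethertype.
 */
struct sketch_ether_vlan_header {
	uint8_t  dhost[6];
	uint8_t  shost[6];
	uint16_t encap_proto;
	uint16_t tag;
	uint16_t proto;
};

/* EM_TSO_SIZE before the patch: MAC header added on top of 65535. */
static size_t
tsomax_old(void)
{
	return (SKETCH_IP_MAXPACKET + sizeof(struct sketch_ether_vlan_header));
}

/* EM_TSO_SIZE with the patch above: MAC header subtracted instead. */
static size_t
tsomax_patched(void)
{
	return (SKETCH_IP_MAXPACKET - sizeof(struct sketch_ether_vlan_header));
}

/* The constraint from the quoted explanation: if_hw_tsomax <= IP_MAXPACKET. */
static int
tsomax_is_valid(size_t tsomax)
{
	return (tsomax <= SKETCH_IP_MAXPACKET);
}
```

Under these assumptions tsomax_is_valid(tsomax_old()) is 0 and
tsomax_is_valid(tsomax_patched()) is 1.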


Re: Are DDB custom function hooks broken in head?

2015-07-15 Thread Sean Bruno

On 07/15/15 13:24, Sean Bruno wrote:
> 
> I added a couple of DDB_HOOKS to em(4) recently and they definitely
> used to work.  :-)
> 
> 
> KDB: enter: Break to debugger
> [ thread pid 11 tid 16 ]
> Stopped at      kdb_alt_break_internal+0x197:   movq    $0,kdb_why
> db> help
> ahd_dump        ahd_in          ahd_out         ahd_pause
> ahd_sunit       ahd_unpause     alltrace        b
> break           bt              c               call
> capture         continue        countfreebufs   d
> delete          dhwatch         dump            dwatch
> em_dump_queue   em_reset_dev    examine         findstack
> gdb             halt            hwatch          kill
> match           next            p               panic
> print           ps              reboot          reset
> run             s               script          scripts
> search          set             show            step
> t               textdump        thread          trace
> unscript        until           w               watch
> watchdog        where           write           x
> db> call em_dump_queue
> Symbol not found
> KDB: reentering
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00f7128160
> kdb_reenter() at kdb_reenter+0x33/frame 0xfe00f7128170
> db_term() at db_term+0x88/frame 0xfe00f7128190
> db_unary() at db_unary+0x74/frame 0xfe00f71281b0
> db_mult_expr() at db_mult_expr+0x1b/frame 0xfe00f71281f0
> db_add_expr() at db_add_expr+0x1b/frame 0xfe00f7128230
> db_expression() at db_expression+0x1d/frame 0xfe00f7128280
> db_fncall() at db_fncall+0x17/frame 0xfe00f7128320
> db_command() at db_command+0x361/frame 0xfe00f71283e0
> db_command_loop() at db_command_loop+0x64/frame 0xfe00f71283f0
> db_trap() at db_trap+0xdb/frame 0xfe00f7128480
> kdb_trap() at kdb_trap+0x194/frame 0xfe00f7128510
> trap() at trap+0x4a1/frame 0xfe00f7128720
> calltrap() at calltrap+0x8/frame 0xfe00f7128720
> --- trap 0x3, rip = 0x80716767, rsp = 0xfe00f71287e0, rbp = 0xfe00f7128810 ---
> kdb_alt_break_internal() at kdb_alt_break_internal+0x197/frame 0xfe00f7128810
> kdb_alt_break() at kdb_alt_break+0xb/frame 0xfe00f7128820
> uart_intr_rxready() at uart_intr_rxready+0x99/frame 0xfe00f7128850
> uart_intr() at uart_intr+0x111/frame 0xfe00f7128890
> intr_event_handle() at intr_event_handle+0x9c/frame 0xfe00f71288e0
> intr_execute_handlers() at intr_execute_handlers+0x48/frame 0xfe00f7128910
> lapic_handle_intr() at lapic_handle_intr+0x68/frame 0xfe00f7128950
> Xapic_isr1() at Xapic_isr1+0xba/frame 0xfe00f7128950
> --- interrupt, rip = 0x80aa29f6, rsp = 0xfe00f7128a10, rbp = 0xfe00f7128a20 ---
> acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfe00f7128a20
> acpi_cpu_idle() at acpi_cpu_idle+0x304/frame 0xfe00f7128a70
> cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfe00f7128a90
> cpu_idle() at cpu_idle+0x92/frame 0xfe00f7128ab0
> sched_idletd() at sched_idletd+0x4d5/frame 0xfe00f7128bb0
> fork_exit() at fork_exit+0x84/frame 0xfe00f7128bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe00f7128bf0
> --- trap 0, rip = 0, rsp = 0xfe00f7128cb0, rbp = 0 ---
> 
> 
> 


Or ... you know ... I could just type "em_dump_queue" without the
"call" in it ... if I wanted it to actually work.

Thanks to those of you poking me off list with my derpy-ness today.  :-)
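
For anyone else who trips over this, here is a toy userland model in C of
what seems to be going on. The assumption is that ddb keeps a command
table (what "help" prints) that is separate from the symbol table that
"call" evaluates its argument against, and that the C function behind the
command has a different name than the command itself. The handler name
em_ddb_dump_queue and every toy_ name are hypothetical; this is a sketch,
not the real ddb code.

```c
#include <stddef.h>
#include <string.h>

typedef void (*toy_fn)(void);

struct toy_name_entry {
	const char *name;
	toy_fn fn;
};

static int toy_dump_calls;	/* counts how often the handler ran */

/* Hypothetical handler; its C symbol differs from the command name. */
static void
em_ddb_dump_queue(void)
{
	toy_dump_calls++;
}

/* Command table: names as typed directly at the db> prompt. */
static const struct toy_name_entry toy_commands[] = {
	{ "em_dump_queue", em_ddb_dump_queue },
};

/* Symbol table: names of C functions, as "call" would look them up. */
static const struct toy_name_entry toy_symbols[] = {
	{ "em_ddb_dump_queue", em_ddb_dump_queue },
};

/* Linear search of one table; returns NULL when the name is absent. */
static toy_fn
toy_lookup(const struct toy_name_entry *tab, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return (tab[i].fn);
	return (NULL);
}

/* Models "db> <name>": 0 if the command was found and run, else -1. */
static int
toy_run_command(const char *name)
{
	toy_fn fn = toy_lookup(toy_commands,
	    sizeof(toy_commands) / sizeof(toy_commands[0]), name);

	if (fn == NULL)
		return (-1);
	fn();
	return (0);
}

/* Models "db> call <name>": -1 stands in for "Symbol not found". */
static int
toy_call_symbol(const char *name)
{
	toy_fn fn = toy_lookup(toy_symbols,
	    sizeof(toy_symbols) / sizeof(toy_symbols[0]), name);

	if (fn == NULL)
		return (-1);
	fn();
	return (0);
}
```

In this model toy_call_symbol("em_dump_queue") fails because only the
command table knows that name, while toy_run_command("em_dump_queue")
runs the handler.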

sean

Are DDB custom function hooks broken in head?

2015-07-15 Thread Sean Bruno


I added a couple of DDB_HOOKS to em(4) recently and they definitely used
to work.  :-)


KDB: enter: Break to debugger
[ thread pid 11 tid 16 ]
Stopped at      kdb_alt_break_internal+0x197:   movq    $0,kdb_why
db> help
ahd_dump        ahd_in          ahd_out         ahd_pause
ahd_sunit       ahd_unpause     alltrace        b
break           bt              c               call
capture         continue        countfreebufs   d
delete          dhwatch         dump            dwatch
em_dump_queue   em_reset_dev    examine         findstack
gdb             halt            hwatch          kill
match           next            p               panic
print           ps              reboot          reset
run             s               script          scripts
search          set             show            step
t               textdump        thread          trace
unscript        until           w               watch
watchdog        where           write           x
db> call em_dump_queue
Symbol not found
KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00f7128160
kdb_reenter() at kdb_reenter+0x33/frame 0xfe00f7128170
db_term() at db_term+0x88/frame 0xfe00f7128190
db_unary() at db_unary+0x74/frame 0xfe00f71281b0
db_mult_expr() at db_mult_expr+0x1b/frame 0xfe00f71281f0
db_add_expr() at db_add_expr+0x1b/frame 0xfe00f7128230
db_expression() at db_expression+0x1d/frame 0xfe00f7128280
db_fncall() at db_fncall+0x17/frame 0xfe00f7128320
db_command() at db_command+0x361/frame 0xfe00f71283e0
db_command_loop() at db_command_loop+0x64/frame 0xfe00f71283f0
db_trap() at db_trap+0xdb/frame 0xfe00f7128480
kdb_trap() at kdb_trap+0x194/frame 0xfe00f7128510
trap() at trap+0x4a1/frame 0xfe00f7128720
calltrap() at calltrap+0x8/frame 0xfe00f7128720
--- trap 0x3, rip = 0x80716767, rsp = 0xfe00f71287e0, rbp = 0xfe00f7128810 ---
kdb_alt_break_internal() at kdb_alt_break_internal+0x197/frame 0xfe00f7128810
kdb_alt_break() at kdb_alt_break+0xb/frame 0xfe00f7128820
uart_intr_rxready() at uart_intr_rxready+0x99/frame 0xfe00f7128850
uart_intr() at uart_intr+0x111/frame 0xfe00f7128890
intr_event_handle() at intr_event_handle+0x9c/frame 0xfe00f71288e0
intr_execute_handlers() at intr_execute_handlers+0x48/frame 0xfe00f7128910
lapic_handle_intr() at lapic_handle_intr+0x68/frame 0xfe00f7128950
Xapic_isr1() at Xapic_isr1+0xba/frame 0xfe00f7128950
--- interrupt, rip = 0x80aa29f6, rsp = 0xfe00f7128a10, rbp = 0xfe00f7128a20 ---
acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfe00f7128a20
acpi_cpu_idle() at acpi_cpu_idle+0x304/frame 0xfe00f7128a70
cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfe00f7128a90
cpu_idle() at cpu_idle+0x92/frame 0xfe00f7128ab0
sched_idletd() at sched_idletd+0x4d5/frame 0xfe00f7128bb0
fork_exit() at fork_exit+0x84/frame 0xfe00f7128bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe00f7128bf0
--- trap 0, rip = 0, rsp = 0xfe00f7128cb0, rbp = 0 ---

ex(4) Removal from -current

2015-06-26 Thread Sean Bruno


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201127

ex(4) is an old Intel ethernet driver that isn't being actively
maintained.  I propose purging it from -current for 11.0 release things.

sean

p.s. will post to -net as well

Re: Seeing about 4K "exec: No such file or directory" msgs in installworld

2015-06-04 Thread Sean Bruno

On 06/01/15 10:38, David Wolfskill wrote:
> On Mon, Jun 01, 2015 at 07:31:53PM +0200, Dimitry Andric wrote:
>> ...
>>> 87386:exec: No such file or directory
>> 
>> Yep, I saw these too.  Hundreds of them.
> 
> Ah; well, I'm fairly relieved that I'm not either hallucinating
> (about this, anyway) or the only one seeing it.  So thanks for
> that!  :-)
> 
>> At this point, I think it should run makewhatis on
>> /usr/share/openssl/man?
> 
> Hmmm...
> 
>> I suspect this has to do with some of the recent mandoc changes,
>> but which one... no idea. :) ...
> 
> At least it doesn't seem to be causing observable harm -- it just
> seemed a bit odd to me.
> 
> Peace, david
> 


Bapt fixed this in head, so you should not see this anymore.

sean

Re: panic on application core dump?

2015-02-22 Thread Sean Bruno

On 02/22/15 10:54, Sean Bruno wrote:
> On 02/22/15 10:53, Konstantin Belousov wrote:
>> On Sun, Feb 22, 2015 at 10:46:53AM -0800, Sean Bruno wrote:
>>> Hmm ... looks unrelated to signals (maybe).  This looks like a 
>>> common ZFS deadlock that is yet undiagnosed.  I do not have a 
>>> show alllocks command available in db> .  I will show each
>>> lock information below:
>> Add witness.
> 
>>> 
>>> db> show lockedvnods Locked vnodes
>>> 
>>> 0xf801141a6588: tag zfs, type VDIR usecount 19, writecount
>>> 0, refcount 20 mountedhere 0 flags (VV_ROOT|VI_ACTIVE)
>>> v_object 0xf80079be4500 ref 0 pages 0 cleanbuf 0 dirtybuf 0
>>> lock type zfs: EXCL by thread 0xf801ca10c4a0 (pid 75907,
>>> sh, tid 101262) with exclusive waiters pending
>> Without backtraces of the acquisition, it is not useful.  You
>> need DEBUG_VFS_LOCKS for this.
> 
> 
> 
> Thank you.  I will do so and restart my non-deterministic test and
> see what I can find.
> 
> sean

Well, that was certainly enlightening.  I was able to get a WITNESS
panic in imgact_binmisc.c in an hour or two.  I need to *not* hold the
mtx protecting the list of activators over the bcopy in
imgact_binmisc_exec().

Jiles proposes that we switch to an sx lock here to keep the code
change simple.

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex imgact_binmisc (imgact_binmisc) r = 0 (0x82012418) locked @ /usr/src/sys/modules/imgact_binmisc/../../kern/imgact_binmisc.c:596
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe046a236280
witness_warn() at witness_warn+0x4ae/frame 0xfe046a236350
trap_pfault() at trap_pfault+0x59/frame 0xfe046a2363f0
trap() at trap+0x45e/frame 0xfe046a236600
calltrap() at calltrap+0x8/frame 0xfe046a236600
--- trap 0xc, rip = 0x80d21279, rsp = 0xfe046a2366c0, rbp = 0xfe046a2366d0 ---
bcopy() at bcopy+0x39/frame 0xfe046a2366d0
imgact_binmisc_exec() at imgact_binmisc_exec+0x23d/frame 0xfe046a236720
kern_execve() at kern_execve+0x4c6/frame 0xfe046a236a80
sys_execve() at sys_execve+0x37/frame 0xfe046a236ae0
amd64_syscall() at amd64_syscall+0x27f/frame 0xfe046a236bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe046a236bf0
--- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x4297ba, rsp = 0x7fffdaf8, rbp = 0x7fffdb00 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 13; apic id = 33
fault virtual address   = 0xfe0456c01007
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0x80d21279
stack pointer           = 0x28:0xfe046a2366c0
frame pointer           = 0x28:0xfe046a2366d0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 27028 (cc)
[ thread pid 27028 tid 100872 ]
Stopped at      bcopy+0x39:     repe movsb      (%rsi),%es:(%rdi)
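
The locking rule behind this panic can be modelled in userland with a toy
C sketch. It is not the imgact_binmisc code and not the sx lock change
proposed above; it illustrates a different, generic pattern: snapshot the
shared entry into a local buffer while the lock is held, drop the lock,
and only then do the copy that might page-fault. All toy_ names, the
interpreter path, and the "refuse if a lock is held" check (a crude
stand-in for the WITNESS warning) are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define TOY_INTERP_MAX 64

struct toy_mtx {
	int held;
};

/* Number of toy non-sleepable locks currently held. */
static int toy_nonsleepable_held;

static void
toy_mtx_lock(struct toy_mtx *m)
{
	m->held = 1;
	toy_nonsleepable_held++;
}

static void
toy_mtx_unlock(struct toy_mtx *m)
{
	m->held = 0;
	toy_nonsleepable_held--;
}

/*
 * Models a copy that may page-fault: it refuses (returns -1) when a
 * non-sleepable lock is held, otherwise copies and returns 0.
 */
static int
toy_copy_may_fault(void *dst, const void *src, size_t len)
{
	if (toy_nonsleepable_held > 0)
		return (-1);
	memcpy(dst, src, len);
	return (0);
}

static struct toy_mtx toy_list_mtx;
/* Hypothetical interpreter entry protected by toy_list_mtx. */
static char toy_interp[TOY_INTERP_MAX] = "/usr/local/bin/qemu-mips64";

/* Problem pattern: the possibly-faulting copy runs under the lock. */
static int
toy_exec_copy_under_lock(char *dst, size_t dstlen)
{
	size_t len;
	int error;

	toy_mtx_lock(&toy_list_mtx);
	len = strlen(toy_interp) + 1;
	error = (len > dstlen) ? -1 : toy_copy_may_fault(dst, toy_interp, len);
	toy_mtx_unlock(&toy_list_mtx);
	return (error);
}

/* Alternative: snapshot under the lock, copy after dropping it. */
static int
toy_exec_snapshot_then_copy(char *dst, size_t dstlen)
{
	char snap[TOY_INTERP_MAX];
	size_t len;

	toy_mtx_lock(&toy_list_mtx);
	len = strlen(toy_interp) + 1;
	memcpy(snap, toy_interp, len);	/* local buffer, cannot "fault" here */
	toy_mtx_unlock(&toy_list_mtx);

	if (len > dstlen)
		return (-1);
	return (toy_copy_may_fault(dst, snap, len));
}
```

In this model toy_exec_copy_under_lock() is rejected while
toy_exec_snapshot_then_copy() succeeds; a sleepable lock such as the
proposed sx lock sidesteps the problem in a different way, by making it
legal to fault while the lock is held.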





Re: panic on application core dump?

2015-02-22 Thread Sean Bruno

On 02/22/15 10:53, Konstantin Belousov wrote:
> On Sun, Feb 22, 2015 at 10:46:53AM -0800, Sean Bruno wrote:
>> Hmm ... looks unrelated to signals (maybe).  This looks like a
>> common ZFS deadlock that is yet undiagnosed.  I do not have a
>> show alllocks command available in db> .  I will show each lock
>> information below:
> Add witness.
> 
>> 
>> db> show lockedvnods Locked vnodes
>> 
>> 0xf801141a6588: tag zfs, type VDIR usecount 19, writecount 0,
>> refcount 20 mountedhere 0 flags (VV_ROOT|VI_ACTIVE) v_object
>> 0xf80079be4500 ref 0 pages 0 cleanbuf 0 dirtybuf 0 lock type
>> zfs: EXCL by thread 0xf801ca10c4a0 (pid 75907, sh, tid
>> 101262) with exclusive waiters pending
> Without backtraces of the acquisition, it is not useful.  You need 
> DEBUG_VFS_LOCKS for this.
> 
> 

Thank you.  I will do so and restart my non-deterministic test and see
what I can find.

sean

Re: panic on application core dump?

2015-02-22 Thread Sean Bruno

On 02/22/15 10:04, Konstantin Belousov wrote:
> On Sun, Feb 22, 2015 at 09:34:29AM -0800, Sean Bruno wrote:
>>> Err.  Is it easily reproducible in your setup? The core file
>>> vnode is indeed unreferenced before notification is sent.
>>> 
>>> Try this.
>>> 
>>> diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
>>> index 41da3dd..57f66b0 100644
>> 
>> Restarted my non-deterministic test case.  Three instances of
>> qemu core dumped and the system did *not* panic.
>> 
>> However, this appears to be interfering with signal handling and
>> reaping.  Applications seem to stall out and become
>> unkillable/unreapable.  I have to reboot the system via
>> panic/reset.
>> 
> What applications?  What is the (kernel) backtrace for them?
> 

Most of the waiting applications are shell scripts:

root  40727  0.0  0.0 17180 5276  1  I+   9:20AM  0:00.18 /bin/sh /usr/local/share/poudriere/bulk.sh -j 11mips64 -p 11mips32 -ac

db> trace 47027
Tracing pid 47027 tid 100835 td 0xf80018cae4a0
sched_switch() at sched_switch+0x326/frame 0xfe046a1446f0
mi_switch() at mi_switch+0xde/frame 0xfe046a144730
sleepq_catch_signals() at sleepq_catch_signals+0xab/frame 0xfe046a1447b0
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfe046a1447e0
_cv_wait_sig() at _cv_wait_sig+0x1b0/frame 0xfe046a144830
seltdwait() at seltdwait+0x104/frame 0xfe046a144880
kern_select() at kern_select+0x963/frame 0xfe046a144a90
sys_select() at sys_select+0x54/frame 0xfe046a144ad0
amd64_syscall() at amd64_syscall+0x3e7/frame 0xfe046a144bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe046a144bf0

>>  1288 root         1  20    0 17180K  5336K zfs    14   1:13  0.00% sh
> Is this the problem  ?
> 
> If yes, do you have ddb compiled in ?  Show the output of 'show
> lockedvnods' and 'show alllocks'.

Hmm ... looks unrelated to signals (maybe).  This looks like a common
ZFS deadlock that is yet undiagnosed.  I do not have a show alllocks
command available in db> .  I will show each lock information below:

db> show lockedvnods
Locked vnodes

0xf801141a6588: tag zfs, type VDIR
usecount 19, writecount 0, refcount 20 mountedhere 0
flags (VV_ROOT|VI_ACTIVE)
v_object 0xf80079be4500 ref 0 pages 0 cleanbuf 0 dirtybuf 0
lock type zfs: EXCL by thread 0xf801ca10c4a0 (pid 75907, sh,
tid 101262)
 with exclusive waiters pending

0xf800184c8b10: tag zfs, type VDIR
usecount 1, writecount 0, refcount 3 mountedhere 0
flags (VI_ACTIVE)
v_object 0xf80355409300 ref 0 pages 0 cleanbuf 0 dirtybuf 0
lock type zfs: EXCL by thread 0xf8001863d000 (pid 94699, rm,
tid 100930)
 with exclusive waiters pending

0xf80404d47b10: tag zfs, type VDIR
usecount 1, writecount 0, refcount 4 mountedhere 0
flags (VI_ACTIVE)
lock type zfs: EXCL by thread 0xf80013b29000 (pid 94698, rm,
tid 100772)
 with exclusive waiters pending

0xf802ec2b5000: tag zfs, type VDIR
usecount 3, writecount 0, refcount 4 mountedhere 0
flags (VI_ACTIVE)
lock type zfs: EXCL by thread 0xf801ca106940 (pid 94700, mv,
tid 101021)
 with exclusive waiters pending

db> show lock 0xf801141a6588
 class: sleep mutex
 name: zfs
 flags: {DEF, DUPOK}
 state: {OWNED}
KDB: reentering
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0456b0e050
kdb_reenter() at kdb_reenter+0x33/frame 0xfe0456b0e060
trap() at trap+0x54/frame 0xfe0456b0e270
calltrap() at calltrap+0x8/frame 0xfe0456b0e270
--- trap 0xc, rip = 0x809b2c37, rsp = 0xfe0456b0e330, rbp = 0xfe0456b0e350 ---
db_show_mtx() at db_show_mtx+0x127/frame 0xfe0456b0e350
db_command() at db_command+0x27c/frame 0xfe0456b0e410
db_command_loop() at db_command_loop+0x64/frame 0xfe0456b0e420
db_trap() at db_trap+0xe0/frame 0xfe0456b0e4b0
kdb_trap() at kdb_trap+0x18e/frame 0xfe0456b0e540
trap() at trap+0x447/frame 0xfe0456b0e750
calltrap() at calltrap+0x8/frame 0xfe0456b0e750
--- trap 0x3, rip = 0x80a0ee87, rsp = 0xfe0456b0e810, rbp = 0xfe0456b0e840 ---
kdb_alt_break_internal() at kdb_alt_break_internal+0x197/frame 0xfe0456b0e840
kdb_alt_break() at kdb_alt_break+0xb/frame 0xfe0456b0e850
uart_intr_rxready() at uart_intr_rxready+0x99/frame 0xfe0456b0e880
uart_intr() at uart_intr+0x111/frame 0xfe0456b0e8c0
intr_event_handle() at intr_event_handle+0x9b/frame 0xfe0456b0e910
intr_execute_handlers() at intr_execute_handlers+0x48/frame 0xfe0456b0e940
lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfe0456b0e960
Xapic_isr1() at Xapic_isr1+0xba/frame 0xfe0456b0e960
--- interrupt, rip = 0x803849

Re: panic on application core dump?

2015-02-22 Thread Sean Bruno

> Err.  Is it easily reproducible in your setup? The core file vnode
> is indeed unreferenced before notification is sent.
> 
> Try this.
> 
> diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
> index 41da3dd..57f66b0 100644

Restarted my non-deterministic test case.  Three instances of qemu
core dumped and the system did *not* panic.

However, this appears to be interfering with signal handling and
reaping.  Applications seem to stall out and become
unkillable/unreapable.  I have to reboot the system via panic/reset.

sean

last pid: 41009;  load averages:  0.46,  0.29,  0.21    up 0+12:18:57  17:33:37
72 processes:  1 running, 69 sleeping, 2 zombie
CPU:  0.9% user,  0.0% nice,  0.2% system,  0.0% interrupt, 98.8% idle
Mem: 15M Active, 6675M Inact, 7201M Wired, 244K Cache, 2010M Free
ARC: 4678M Total, 910M MFU, 3001M MRU, 6748K Anon, 70M Header, 691M Other
Swap: 16G Total, 16G Free

  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
  718 www          1  20    0 28708K  6636K kqread 12   0:08  0.70% nginx
 1430 root         1  52    0 17180K  3448K wait   13   2:28  0.44% sh
 1276 sbruno       1  20    0 21548K  8796K select 11   4:37  0.17% tmux
40832 root         1  20    0 22000K  3072K CPU7    7   0:00  0.08% top
 1267 sbruno       1  20    0 86528K  7580K select 15   0:04  0.02% sshd
  695 root         1  20    0 25496K  4884K select  9   0:03  0.01% ntpd
  698 root         1  20    0 14492K  1992K select  7   0:04  0.01% powerd
 1288 root         1  20    0 17180K  5336K zfs    14   1:13  0.00% sh
79939 root         1  52    0 17180K  3384K wait   10   0:03  0.00% sh
  750 root         1  20    0 24164K  5444K select  2   0:01  0.00% sendmail
  444 unbound      1  20    0 34780K  9672K select  7   0:00  0.00% unbound
40727 root         1  20    0 17180K  5276K wait   14   0:00  0.00% sh
68893 root         1  26    0 17180K  5336K wait    3   0:00  0.00% sh
80937 root         1  24    0 17180K  5336K wait   13   0:00  0.00% sh
55102 root         1  24    0 17180K  5336K wait    3   0:00  0.00% sh
35713 root         1  25    0 17180K  5336K wait   10   0:00  0.00% sh
48828 root         1  24    0 17180K  5336K wait    6   0:00  0.00% sh
97473 root         1  26    0 17180K  5336K wait    0   0:00  0.00% sh
64113 root         1  24    0 17180K  5336K wait    5   0:00  0.00% sh
46980 root         1  27    0 17180K  5336K wait   12   0:00  0.00% sh
80439 root         1  24    0 17180K  5336K wait    7   0:00  0.00% sh
44960 root         1  20    0 17180K  5276K wait    3   0:00  0.00% sh
49661 root         1  27    0 17180K  5336K wait    8   0:00  0.00% sh
78496 root         1  52    0 17180K  5336K zfs     0   0:00  0.00% sh
69491 root         1  27    0 17180K  5336K wait    6   0:00  0.00% sh
75907 root         1  52    0 17180K  5336K zfs     8   0:00  0.00% sh
  534 root         1  20    0 14524K  2200K select 12   0:00  0.00% syslogd
  757 root         1  52    0 16620K  2296K nanslp 13   0:00  0.00% cron
  747 root         1  20    0 59148K  6620K select  6   0:00  0.00% sshd
26179 root         1  31    0 23652K  4044K pause  14   0:00  0.00% csh
79371 root         1  20    0 17180K  5336K wait    9   0:00  0.00% sh
 1268 sbruno       1  23    0 23652K  3736K pause   5   0:00  0.00% tcsh
 1265 root         1  22    0 86528K  7284K select 11   0:00  0.00% sshd
 1283 root         1  21    0 23652K  4044K pause   3   0:00  0.00% csh
 1277 sbruno       1  23    0 23652K  3748K pause   6   0:00  0.00% tcsh
 1282 sbruno       1  21    0 74188K  6348K wait   11   0:00  0.00% su
24565 sbruno       1  20    0 23652K  3740K pause   8   0:00  0.00% tcsh
  438 root         1  20    0 13588K  4788K select 11   0:00  0.00% devd
26119 sbruno       1  20    0 74188K  6348K wait   13   0:00  0.00% su
  753 smmsp        1  20    0 24164K  5392K pause   1   0:00  0.00% sendmail


Re: panic on application core dump?

2015-02-21 Thread Sean Bruno

On 02/21/15 13:17, Konstantin Belousov wrote:
> On Sat, Feb 21, 2015 at 12:27:22PM -0800, Sean Bruno wrote:
>> 
>> Well, this is new.  It looks like current panic'd when trying to
>> dump a core from a qemu crash?  I can leave this at the debugger
>> for now as this is a machine doing mips package builds and is not
>> "production".
>> 
>> sean
>> 
>> Thu Feb 19 18:50:59 UTC 2015
>> 
>> FreeBSD/amd64 (dirty.ysv.freebsd.org) (ttyu0)
>> 
>> login: Feb 20 08:06:05 dirty sshd[51311]: fatal: Read from socket failed: Connection reset by peer [preauth]
>> Feb 20 16:47:29 dirty su: sbruno to root on /dev/pts/1
>> Feb 21 02:15:44 dirty sshd[95051]: fatal: Read from socket failed: Connection reset by peer [preauth]
>> 
>> 
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 15; apic id = 35
>> fault virtual address   = 0x380
>> fault code              = supervisor read data, page not present
>> instruction pointer     = 0x20:0x809b2ed1
>> stack pointer           = 0x28:0xfe046a3a30f0
>> frame pointer           = 0x28:0xfe046a3a3170
>> code segment            = base 0x0, limit 0xf, type 0x1b
>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 42563 (qemu-mips64)
>> [ thread pid 42563 tid 100956 ]
>> Stopped at      __mtx_lock_sleep+0xb1:  movl    0x380(%rax),%ecx
>> db> bt
>> Tracing pid 42563 tid 100956 td 0xf80109a214a0
>> __mtx_lock_sleep() at __mtx_lock_sleep+0xb1/frame 0xfe046a3a3170
>> vref() at vref+0x6d/frame 0xfe046a3a31a0
>> vn_fullpath1() at vn_fullpath1+0x62/frame 0xfe046a3a3200
>> vn_fullpath_global() at vn_fullpath_global+0x6e/frame 0xfe046a3a3240
>> sigexit() at sigexit+0xa22/frame 0xfe046a3a34f0
>> sendsig() at sendsig+0x65e/frame 0xfe046a3a3960
>> trapsignal() at trapsignal+0x2f7/frame 0xfe046a3a39e0
>> trap() at trap+0x3ba/frame 0xfe046a3a3bf0
>> calltrap() at calltrap+0x8/frame 0xfe046a3a3bf0
>> --- trap 0xc, rip = 0x600334bc, rsp = 0x7ffbffe19990, rbp = 0x7ffe4a20 ---
>> db> p vref+0x6d
>> 80a876cd
> 
> Err.  Is it easily reproducible in your setup?  The core file vnode
> is indeed unreferenced before notification is sent.
> 

Probably not a deterministic crash, but I will get a dump and try out
this change directly today.

sean


> Try this.
> 
> diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c
> index 41da3dd..57f66b0 100644
> --- a/sys/kern/kern_sig.c
> +++ b/sys/kern/kern_sig.c
> @@ -3310,7 +3310,7 @@ coredump(struct thread *td)
>  	    vattr.va_nlink != 1 || (vp->v_vflag & VV_SYSTEM) != 0) {
>  		VOP_UNLOCK(vp, 0);
>  		error = EFAULT;
> -		goto close;
> +		goto out;
>  	}
>  
>  	VOP_UNLOCK(vp, 0);
> @@ -3347,17 +3347,12 @@ coredump(struct thread *td)
>  		VOP_ADVLOCK(vp, (caddr_t)p, F_UNLCK, &lf, F_FLOCK);
>  	}
>  	vn_rangelock_unlock(vp, rl_cookie);
> -close:
> -	error1 = vn_close(vp, FWRITE, cred, td);
> -	if (error == 0)
> -		error = error1;
> -	else
> -		goto out;
> +
>  	/*
>  	 * Notify the userland helper that a process triggered a core dump.
>  	 * This allows the helper to run an automated debugging session.
>  	 */
> -	if (coredump_devctl == 0)
> +	if (error != 0 || coredump_devctl == 0)
>  		goto out;
>  	len = MAXPATHLEN * 2 + sizeof(comm_name) - 1 +
>  	    sizeof(' ') + sizeof(core_name) - 1;
> @@ -3377,6 +3372,9 @@ close:
>  	strlcat(data, fullpath, len);
>  	devctl_notify("kernel", "signal", "coredump", data);
>  out:
> +	error1 = vn_close(vp, FWRITE, cred, td);
> +	if (error == 0)
> +		error = error1;
>  #ifdef AUDIT
>  	audit_proc_coredump(td, name, error);
>  #endif
> 
> 

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJU6SdJXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kZlIH/0usK1j2BfzNT95UFE6KAoUv
0+JqmkJyGiR9hEQvvgcSiL9++NXnTLLz1z5SGbwAqL0hebXYckLXhxusObENBnnK
ZMz12bsFmAI615eXK6ZJjsnZJWzmU/tjQjcY93Rao0M+AUTaGk5PFoR486hjhSM+
7lg4KA+BlD5K991Zy9BzR0ZGSkjRnuZSQBsKbHe1RGbS1SAsf4PyfpvXDt0lhfN9
E/C2uvehYbBJi3vJuJx3pVXg5s+uyutnGLjBRY/sqOuiDOuGfHFdKbdgtApIAc0o
B0/3I2IbAx3q0zy9c/4nuKjKHbr+di2pbFymUAH8beHcBuo5wsNFvfGTOqiX5ro=
=nlAC
-END PGP SIGNATURE-
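
An illustrative aside on the patch in this message: the backtrace goes from sigexit() into vn_fullpath_global() and vref(), and the proposed change moves vn_close() below the devctl notification so that the vnode is still referenced while its path is looked up. What follows is a minimal userspace sketch of that ordering, not the kernel code. dump_and_notify() and its arguments are hypothetical names, and stdio merely stands in for the vnode API.

```c
#include <stdio.h>

/*
 * Hypothetical userspace analogy: write a "core" file, then build a
 * notification string that still needs the open handle.  The handle is
 * released in exactly one place, after its last use.
 */
int
dump_and_notify(const char *path, const char *payload, char *note,
    size_t notelen)
{
	FILE *fp;
	int error = 0, error1;

	fp = fopen(path, "w");
	if (fp == NULL)
		return (-1);

	if (fputs(payload, fp) == EOF) {
		error = -1;
		goto out;	/* skip the notification, but still close */
	}

	/* Last use of the handle: this must run before the close below. */
	snprintf(note, notelen, "coredump path=%s bytes=%ld", path, ftell(fp));

out:
	/* Single release point, shared by the success and error paths. */
	error1 = (fclose(fp) == EOF) ? -1 : 0;
	if (error == 0)
		error = error1;
	return (error);
}
```

The point of the single out: label is that no path can release the handle before the notification step has finished with it, which appears to be the ordering the patch restores.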

panic on application core dump?

2015-02-21 Thread Sean Bruno


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

Well, this is new.  It looks like current panic'd when trying to dump a
core from a qemu crash?  I can leave this at the debugger for now as
this is a machine doing mips package builds and is not "production".

sean

Thu Feb 19 18:50:59 UTC 2015

FreeBSD/amd64 (dirty.ysv.freebsd.org) (ttyu0)

login: Feb 20 08:06:05 dirty sshd[51311]: fatal: Read from socket
failed: Connection reset by peer [preauth]
Feb 20 16:47:29 dirty su: sbruno to root on /dev/pts/1
Feb 21 02:15:44 dirty sshd[95051]: fatal: Read from socket failed:
Connection reset by peer [preauth]


Fatal trap 12: page fault while in kernel mode
cpuid = 15; apic id = 35
fault virtual address   = 0x380
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x809b2ed1
stack pointer   = 0x28:0xfe046a3a30f0
frame pointer   = 0x28:0xfe046a3a3170
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 42563 (qemu-mips64)
[ thread pid 42563 tid 100956 ]
Stopped at  __mtx_lock_sleep+0xb1:  movl0x380(%rax),%ecx
db> bt
Tracing pid 42563 tid 100956 td 0xf80109a214a0
__mtx_lock_sleep() at __mtx_lock_sleep+0xb1/frame 0xfe046a3a3170
vref() at vref+0x6d/frame 0xfe046a3a31a0
vn_fullpath1() at vn_fullpath1+0x62/frame 0xfe046a3a3200
vn_fullpath_global() at vn_fullpath_global+0x6e/frame 0xfe046a3a3240
sigexit() at sigexit+0xa22/frame 0xfe046a3a34f0
sendsig() at sendsig+0x65e/frame 0xfe046a3a3960
trapsignal() at trapsignal+0x2f7/frame 0xfe046a3a39e0
trap() at trap+0x3ba/frame 0xfe046a3a3bf0
calltrap() at calltrap+0x8/frame 0xfe046a3a3bf0
- --- trap 0xc, rip = 0x600334bc, rsp = 0x7ffbffe19990, rbp =
0x7ffe4a20 ---
db> p vref+0x6d
80a876cd

-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJU6OonXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5k/d8IANC2xfQm9xp/g/sa2R5alFs3
MHBFfk/QyyZCfKShX8aBkfKBUOIB/VJAR3QHoU1EXpBmL9xRZcTWvZFB3Tvt/hZS
S6EJJqW51CjAHVry20yd3lObjQ2ltMtpQ+UhMnNO43wzzLXaGeyPBghLqsPrrYpT
qTlRnOdxP610eDSy/PuziSn/1foohvw1IgdbU4NljA0PRCtj4SPybNuznWYKrcZF
6Lbphw+yRp6KBTYsm3nZMZVVR8j/232cX/Hqc3Ptay9yI8BJTb3tDji0XwxPRm6k
aTQFN86/Yc1gMeg57igj1kq6+xS7hALuhaT/3ZdagTCjiAcP0OOUceeyqOoBofk=
=ni1d
-END PGP SIGNATURE-


Re: Xen HVM Panic, HEAD

2015-02-19 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512


> 
> This panic starts at: 
> https://svnweb.freebsd.org/base?view=revision&revision=278473
> 
> If I use 278472, I can boot the Xen VM normally.
> 
> If I use head and set hw.x2apic_enable="0" in loader.conf and boot 
> head (278970), it boots normally.
> 
> Second issue:
> 
> However, the UFS disk access is SO slow on this that it took
> 2+hours to do an installworld, which I couldn't abort because it
> had already started.  I'm not sure if the UFS disk access is
> related at all or not.  Once things are read from disk into memory
> they are fast and responsive (e.g. sshd, tmux, shells, etc).
> 
> sean
> 
> bcc royger

Maybe helpful, verbose dmesg on bootup with x2apic disabled:

https://people.freebsd.org/~sbruno/xen_dmesg_verbose.txt

sean


-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJU5mIxXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kZewH/j/Fy9b8tteEM68ZGti3XZAK
jLDkr8m21pFO8YxPYWrWhXp4f6pvLnTbUIjq5V+8nlCTK+douRfhy8OvG4lW2a5r
Hvgwc147mDBpELtlByijsasc9ulkUveI7pSDDSu49dD8RBCOBhGjkfr4iU1tiSbI
NHog3vBKk5IYV0u4pynyq1ROMESMNtHSfobt1oHgzxUS1xWHcv4YvnWK05dLM07D
lU+g8sY9aBtU8L+IPlAtQW8fZFTwt1RO5oPk3BveA32KxV90vO9bJ2AyHl821/US
Fo0NU8zhOuCYaSLF43xexEH1h3EQbjLJho6YslHtIaLqqE0cqpBhnbfKHsE4SmA=
=H6sT
-END PGP SIGNATURE-

Re: Xen HVM Panic, HEAD

2015-02-19 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

> 

> This panic starts at: 
> https://svnweb.freebsd.org/base?view=revision&revision=278473
> 
> If I use 278472, I can boot the Xen VM normally.
> 
> If I use head and set hw.x2apic_enable="0" in loader.conf and boot 
> head (278970), it boots normally.
> 
> Second issue:
> 
> However, the UFS disk access is SO slow on this that it took
> 2+hours to do an installworld, which I couldn't abort because it
> had already started.  I'm not sure if the UFS disk access is
> related at all or not.  Once things are read from disk into memory
> they are fast and responsive (e.g. sshd, tmux, shells, etc).
> 
> sean
> 
> bcc royger
> 
> 
> 

BTW, my kernconf for this contains:

include GENERIC
ident BLOG

nooptions DDB
nooptions GDB
nooptions DEADLKRES
nooptions INVARIANTS              # Enable calls of extra sanity checking
nooptions INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
nooptions WITNESS                 # Enable checks to detect deadlocks and cycles
nooptions WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
nooptions MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zone
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJU5iDJXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kiOEIAM5UWud6DGlGgQM5PsHmP4nO
LMiAyq2bIE/MtaR13IZL1hv9zBKHAYva9CyaAiAVDFtOSP6nR+/zcxi2SlkNbO9Z
lmERNmIs2AvfZiX/+krqVJXcI0MoeXO+9WSpz1SuUo1kXRaWXYNuTDw2qG0lbG/e
282EyNBIg4Jz+KfcTK/cmKQCc1jCMA7Fwym1G7Lwfd8HwxaqJFGa446Y6vle8UZt
BSa52DOUP5D0RswByXzNS4aqMI9fLJRcTQZSrY15lUAYmi8uF37n+u8KH7dzbRM+
czW3vJpwckJieA5h+EmV0bABplT3L80/JhdA6vXF98kgrq+6b7xPVowmmvxECfQ=
=MPg8
-END PGP SIGNATURE-

Re: Xen HVM Panic, HEAD

2015-02-18 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

On 02/17/15 12:30, Sean Bruno wrote:
> On 02/17/15 12:26, Konstantin Belousov wrote:
>> On Tue, Feb 17, 2015 at 12:00:04PM -0800, Sean Bruno wrote:
>>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA512
>>> 
>>> On 02/17/15 00:56, Konstantin Belousov wrote:
>>>> On Mon, Feb 16, 2015 at 08:10:06PM -0800, Sean Bruno wrote:
>>>>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA512
>>>>> 
>>>>> https://people.freebsd.org/~sbruno/Xen_APIC_panic.png
>>>>> 
>>>>> I suspect that there may be one or two more lines above
>>>>> this that are relevant to this panic, but XENHVM kernels
>>>>> now panic when booting on Xen server.  The working kernel
>>>>> output looks like this:
>>>>> 
>>>>> FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
>>>>> XEN: Hypervisor version 4.2 detected.
>>>>> CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2400.05-MHz K8-class CPU)
>>>>>   Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
>>>>>   Features=0x1783fbff
>>>>>   Features2=0x81ba2201
>>>>>   AMD Features=0x28100800
>>>>>   AMD Features2=0x1
>>>>> Hypervisor: Origin = "XenVMMXenVMM"
>>>>> real memory  = 1434451968 (1368 MB)
>>>>> avail memory = 1353293824 (1290 MB)
>>>>> Event timer "LAPIC" quality 400
>>>>> ACPI APIC Table:
>>>>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>>>>> FreeBSD/SMP: 1 package(s) x 2 core(s)
>>>>>  cpu0 (BSP): APIC ID:  0
>>>>>  cpu1 (AP): APIC ID:  2
>>>>> ioapic0: Changing APIC ID to 1
>>>>> MADT: Forcing active-low polarity and level trigger for SCI
>>>> I am not sure why your machine uses native lapic instead of
>>>> xen lapic, and whether it should be the other way around.
>>>> 
>>>> Regardless, show the line number for the ipi_startup+0x56.
>>>> Did you perform a clean kernel build?
>>>> 
>>>> 
>>> 
>>> I have rebuilt a kernel/world based on head at svn r276627.  I
>>> have deleted /usr/obj completely and started from scratch.
>>> 
>>> Updated kernelpanic image at 
>>> https://people.freebsd.org/~sbruno/Xen_APIC_panic.png
>>> 
>>> /usr/src/sys/x86/include # kgdb /boot/kernel/kernel
>>> GNU gdb 6.1.1 [FreeBSD]
>>> Copyright 2004 Free Software Foundation, Inc.
>>> GDB is free software, covered by the GNU General Public License, and you are
>>> welcome to change it and/or distribute copies of it under certain conditions.
>>> Type "show copying" to see the conditions.
>>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>>> This GDB was configured as "amd64-marcel-freebsd"...
>>> (kgdb) list *(ipi_startup+0x56)
>>> 0x80e088c6 is in ipi_startup (apicvar.h:383).
>>> 378
>>> 379     static inline int
>>> 380     lapic_ipi_wait(int delay)
>>> 381     {
>>> 382
>>> 383             return (apic_ops.ipi_wait(delay));
>>> 384     }
>>> 385
>>> 386     static inline int
>>> 387     lapic_set_lvt_mask(u_int apic_id, u_int lvt, u_char masked)
>>> 
> 
>> Please disassemble your ipi_startup, also please do 'p
>> *apic_ops'.
> 
> 
> 
> 
> (kgdb) disassemble ipi_startup
> 
> Dump of assembler code for function ipi_startup:
> 0x80df3900 : push   %rbp
> 0x80df3901 : mov    %rsp,%rbp
> 0x80df3904 : push   %r14
> 0x80df3906 : push   %rbx
> 0x80df3907 : mov    %esi,%ebx
> 0x80df3909 : mov    %edi,%r14d
> 0x80df390c : mov    $0xc500,%edi
> 0x80df3911 : mov    %r14d,%esi
> 0x80df3914 : callq  *0x815ac428
> 0x80df391b : mov    $0x14,%edi
> 0x80df3920 : callq  *0x815ac438
> 0x80df3927 : mov    $0x8500,%edi
> 0x80df392c : mov    %r14d,%esi
> 0x80df392f : callq  *0x815ac428
> 0x80df3936 : mov    $0x2710,%edi
> 0x80df393b : callq  0x80f39c10
> 0x80df3940 : or     $0x4600,%ebx
> 0x80df3946 : movslq %ebx,%rbx
> 0x80df3949 : mov    %rbx,%rdi
> 0x80df394c : mov    %r14d,%esi
> 0x80df394f : callq  *0x815ac428
> 0x80df3956 : mov

Re: Xen HVM Panic, HEAD

2015-02-17 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

On 02/17/15 12:26, Konstantin Belousov wrote:
> On Tue, Feb 17, 2015 at 12:00:04PM -0800, Sean Bruno wrote:
>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA512
>> 
>> On 02/17/15 00:56, Konstantin Belousov wrote:
>>> On Mon, Feb 16, 2015 at 08:10:06PM -0800, Sean Bruno wrote:
>>>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA512
>>>> 
>>>> https://people.freebsd.org/~sbruno/Xen_APIC_panic.png
>>>> 
>>>> I suspect that there may be one or two more lines above this
>>>> that are relevant to this panic, but XENHVM kernels now
>>>> panic when booting on Xen server.  The working kernel output
>>>> looks like this:
>>>> 
>>>> FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
>>>> XEN: Hypervisor version 4.2 detected.
>>>> CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2400.05-MHz K8-class CPU)
>>>>   Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
>>>>   Features=0x1783fbff
>>>>   Features2=0x81ba2201
>>>>   AMD Features=0x28100800
>>>>   AMD Features2=0x1
>>>> Hypervisor: Origin = "XenVMMXenVMM"
>>>> real memory  = 1434451968 (1368 MB)
>>>> avail memory = 1353293824 (1290 MB)
>>>> Event timer "LAPIC" quality 400
>>>> ACPI APIC Table:
>>>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>>>> FreeBSD/SMP: 1 package(s) x 2 core(s)
>>>>  cpu0 (BSP): APIC ID:  0
>>>>  cpu1 (AP): APIC ID:  2
>>>> ioapic0: Changing APIC ID to 1
>>>> MADT: Forcing active-low polarity and level trigger for SCI
>>> I am not sure why your machine uses native lapic instead of
>>> xen lapic, and whether it should be the other way around.
>>> 
>>> Regardless, show the line number for the ipi_startup+0x56. Did
>>> you perform a clean kernel build?
>>> 
>>> 
>> 
>> I have rebuilt a kernel/world based on head at svn r276627.  I
>> have deleted /usr/obj completely and started from scratch.
>> 
>> Updated kernelpanic image at 
>> https://people.freebsd.org/~sbruno/Xen_APIC_panic.png
>> 
>> /usr/src/sys/x86/include # kgdb /boot/kernel/kernel
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd"...
>> (kgdb) list *(ipi_startup+0x56)
>> 0x80e088c6 is in ipi_startup (apicvar.h:383).
>> 378
>> 379     static inline int
>> 380     lapic_ipi_wait(int delay)
>> 381     {
>> 382
>> 383             return (apic_ops.ipi_wait(delay));
>> 384     }
>> 385
>> 386     static inline int
>> 387     lapic_set_lvt_mask(u_int apic_id, u_int lvt, u_char masked)
>> 
> 
> Please disassemble your ipi_startup, also please do 'p *apic_ops'.
> 
> 


(kgdb) disassemble ipi_startup



Dump of assembler code for function ipi_startup:
0x80df3900 : push   %rbp
0x80df3901 : mov    %rsp,%rbp
0x80df3904 : push   %r14
0x80df3906 : push   %rbx
0x80df3907 : mov    %esi,%ebx
0x80df3909 : mov    %edi,%r14d
0x80df390c : mov    $0xc500,%edi
0x80df3911 : mov    %r14d,%esi
0x80df3914 : callq  *0x815ac428
0x80df391b : mov    $0x14,%edi
0x80df3920 : callq  *0x815ac438
0x80df3927 : mov    $0x8500,%edi
0x80df392c : mov    %r14d,%esi
0x80df392f : callq  *0x815ac428
0x80df3936 : mov    $0x2710,%edi
0x80df393b : callq  0x80f39c10
0x80df3940 : or     $0x4600,%ebx
0x80df3946 : movslq %ebx,%rbx
0x80df3949 : mov    %rbx,%rdi
0x80df394c : mov    %r14d,%esi
0x80df394f : callq  *0x815ac428
0x80df3956 : mov    $0x14,%edi
0x80df395b : callq  *0x815ac438
0x80df3962 : test   %eax,%eax
0x80df3964 : je     0x80df399b
0x80df3966 : mov    $0xc8,%edi
0x80df396b : callq  0x80f39c10
0x80df3970 : mov    %rbx,%rdi
0x80df3973 : mov    %r14d,%esi
0x80df3976 : callq  *0x815ac428
0x80df397d : mov    $0x14,%edi
0x80df3982 : callq  *0x815ac438
0x80df3989 : test   %eax,%

Re: Xen HVM Panic, HEAD

2015-02-17 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

On 02/17/15 00:56, Konstantin Belousov wrote:
> On Mon, Feb 16, 2015 at 08:10:06PM -0800, Sean Bruno wrote:
>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA512
>> 
>> https://people.freebsd.org/~sbruno/Xen_APIC_panic.png
>> 
>> I suspect that there may be one or two more lines above this that
>> are relevant to this panic, but XENHVM kernels now panic when
>> booting on Xen server.  The working kernel output looks like this:
>> 
>> FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
>> XEN: Hypervisor version 4.2 detected.
>> CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2400.05-MHz K8-class CPU)
>>   Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
>>   Features=0x1783fbff
>>   Features2=0x81ba2201
>>   AMD Features=0x28100800
>>   AMD Features2=0x1
>> Hypervisor: Origin = "XenVMMXenVMM"
>> real memory  = 1434451968 (1368 MB)
>> avail memory = 1353293824 (1290 MB)
>> Event timer "LAPIC" quality 400
>> ACPI APIC Table:
>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>> FreeBSD/SMP: 1 package(s) x 2 core(s)
>>  cpu0 (BSP): APIC ID:  0
>>  cpu1 (AP): APIC ID:  2
>> ioapic0: Changing APIC ID to 1
>> MADT: Forcing active-low polarity and level trigger for SCI
> I am not sure why your machine uses native lapic instead of xen
> lapic, and whether it should be the other way around.
> 
> Regardless, show the line number for the ipi_startup+0x56. Did you
> perform a clean kernel build?
> 
> 

I have rebuilt a kernel/world based on head at svn r276627.  I have
deleted /usr/obj completely and started from scratch.

Updated kernelpanic image at
https://people.freebsd.org/~sbruno/Xen_APIC_panic.png

/usr/src/sys/x86/include # kgdb /boot/kernel/kernel
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
(kgdb) list *(ipi_startup+0x56)
0x80e088c6 is in ipi_startup (apicvar.h:383).
378 
379 static inline int
380 lapic_ipi_wait(int delay)
381 {
382 
383 return (apic_ops.ipi_wait(delay));
384 }
385 
386 static inline int
387 lapic_set_lvt_mask(u_int apic_id, u_int lvt, u_char masked)



-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJU453BXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5k9PcH/07PKefR3xkJT0W10i2xHYcp
5jNoVfPCP+crWcP7OOqfLY9aQr3KDx5GDZtb/nMbQ36YfCfB5LwAX0cJcGqVbAby
LeznkBqzHa/KPl5RtHtQKPGi25YVm6Q+3mDbH/eGN9DcYwpuNyGrwd7J08XAioux
8UIMCzSy57GlUwMdr6EMOUIP8Uz5Fhm4cryTBhMgAzdIoXnTGIdG1jpatwvXQmtx
dFH3c+vDlJdo3eqA34kufw3yENEjvOd10SVmw1RVs4KJX8pcTJMxRZs4VbayEAFb
V/2FlunDsWnKGm8ybPXrUzSkGgKlQsmaM+gPRiUNpSc9tncnekX9YxqEt36UEJM=
=n5pr
-END PGP SIGNATURE-
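
An illustrative aside on the listing in this message: lapic_ipi_wait() is a thin inline wrapper that makes one indirect call through the apic_ops table, which is presumably why the thread goes on to ask for the contents of apic_ops. Below is a minimal sketch of that dispatch pattern in plain C. Everything with a _sketch, _tbl or _checked suffix is a hypothetical name made up for this example, the table is trimmed to a single slot, and nothing here claims to identify the actual cause of the panic.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the kernel's apic_ops table. */
struct apic_ops_sketch {
	int (*ipi_wait)(int delay);
};

/* Stand-in implementation: pretend the IPI completed within the delay. */
static int
native_ipi_wait_sketch(int delay)
{
	return (delay >= 0);
}

static struct apic_ops_sketch apic_ops_tbl = {
	.ipi_wait = native_ipi_wait_sketch,
};

/* Same shape as lapic_ipi_wait(): one indirect call through the table. */
static inline int
lapic_ipi_wait_sketch(int delay)
{
	return (apic_ops_tbl.ipi_wait(delay));
}

/* Defensive variant: report an unset slot instead of jumping through it. */
static inline int
lapic_ipi_wait_checked(const struct apic_ops_sketch *ops, int delay)
{
	if (ops == NULL || ops->ipi_wait == NULL)
		return (-1);
	return (ops->ipi_wait(delay));
}
```

With a table like this, whichever backend filled in the slots decides what an indirect call actually runs, so printing the table in the debugger shows which implementation is in effect and whether any slot was left unset.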

Xen HVM Panic, HEAD

2015-02-17 Thread Sean Bruno

https://people.freebsd.org/~sbruno/Xen_APIC_panic.png

I suspect that there may be one or two more lines above this that are
relevant to this panic, but XENHVM kernels now panic when booting on Xen
server.  The working kernel output looks like this:

FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
XEN: Hypervisor version 4.2 detected.
CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2400.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
  Features=0x1783fbff
  Features2=0x81ba2201
  AMD Features=0x28100800
  AMD Features2=0x1
Hypervisor: Origin = "XenVMMXenVMM"
real memory  = 1434451968 (1368 MB)
avail memory = 1353293824 (1290 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
ioapic0: Changing APIC ID to 1
MADT: Forcing active-low polarity and level trigger for SCI


bcc: royger@



Xen HVM Panic, HEAD

2015-02-16 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

https://people.freebsd.org/~sbruno/Xen_APIC_panic.png

I suspect that there may be one or two more lines above this that are
relevant to this panic, but XENHVM kernels now panic when booting on Xen
server.  The working kernel output looks like this:

FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
XEN: Hypervisor version 4.2 detected.
CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2400.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c  Stepping=2
  Features=0x1783fbff
  Features2=0x81ba2201
  AMD Features=0x28100800
  AMD Features2=0x1
Hypervisor: Origin = "XenVMMXenVMM"
real memory  = 1434451968 (1368 MB)
avail memory = 1353293824 (1290 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
ioapic0: Changing APIC ID to 1
MADT: Forcing active-low polarity and level trigger for SCI


bcc: royger@
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJU4r8bXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kOTcH/R4jzSNBELU+Jc1E0N7b97wS
pRzbL69AQaDnjI8yCHvMX9AwmqC1x4Fd+bpk4Xqf9Aut9SHZTUhlZlw3BAqZfPmj
ofPaaDn3B4AUIMW/K1yPUE7tup1GlM+hSdX4czoBzzO3wKC5aBz4qgv+Peb2FMDe
LwEoeWpbJFu5y11uITN0en08bdRAg7B+gJCPkPbzY+W6m0RKpWJ8PavXNfxlMTYt
WQThTEy8SdRIPQRdAKURYSqWAPkfMP2s07h4Ckm9rXybbLWCQBYMwJZxhcDfXWlz
EoYLHoQ2nt0dT3Lu9lxH8EppCZpVQRAnVLYYB6tBpeDt9boNpNxoJ0UuOf906AM=
=EspR
-END PGP SIGNATURE-

Re: Connected sanitizer libraries to the build (for x86)

2015-01-13 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

On 01/13/15 12:28, Dimitry Andric wrote:
> Hi,
> 
> In r277146, I have connected the sanitizer libraries from
> compiler-rt to the build.  Currently, this works for i386 and
> amd64, and contains Address Sanitizer (ASan) and Undefined Behavior
> Sanitizer.
> 
> AddressSanitizer is a fast memory error detector [1].  It consists
> of a compiler instrumentation module and a run-time library. The
> tool can detect the following types of bugs:
> 
> * Out-of-bounds accesses to heap, stack and globals
> * Use-after-free
> * Use-after-return (to some extent)
> * Double-free, invalid free
> * Memory leaks (experimental)
> 
> The typical slowdown introduced by AddressSanitizer is 2x.  Enable
> it by compiling and linking with clang, and using the
> -fsanitize=address flag.
> 
> Undefined Behavior Sanitizer is a fast and compatible undefined
> behavior checker, which enables a number of checks that have small
> runtime cost and no impact on address space layout or ABI.  Enable
> it by using the -fsanitize=undefined flag. [2]
> 
> Please note that the sanitizers still have some rough edges on
> FreeBSD, particularly on i386.  These will hopefully be smoothed
> out in the coming time.  Reports of problems (and fixes :) are very
> welcome, but please log them in Bugzilla, so they can be tracked.
> 
> -Dimitry
> 
> [1] http://llvm.org/releases/3.5.0/tools/clang/docs/AddressSanitizer.html
> [2] http://llvm.org/releases/3.5.0/tools/clang/docs/UsersManual.html#opt-fsanitize-undefined
> 


Do you want a test run for arm?

sean
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJUtaFkXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5k5nsH/iHhTW359K0f2BtUDBwH/+ga
9w7MbymRJZvKTh60LABeuN//DJ9BBRHzGHtRd5nYvvLSBN5HfVy2LbmNvz9H1p0B
/Gw6N9XL3pVMpLxU4JP6IMJ6c1YIlrapDxfUPOVpEPmdOeZ2xPsgRDB20tDNuKxj
AQftpNqf1KJL0FhzfKv0TupxPpCKuffTfO+kYa5tQQU/bDXkgxB1BsuxUD/4HiZU
nRAsbhlZV1roEo3l36a2mlRtc6sEPpZTKn4Phv3oNT7cfCd5hnuhyCfcZOWk7yXo
HyboVn10ABX8GismKQ0erkxNhcHD4VepY2CCc/0z+AhUV0DztpfLvzDoEE0lDaY=
=DqeZ
-END PGP SIGNATURE-
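
An illustrative aside on the announcement quoted in this message: the first bug class in its list, an out-of-bounds heap access, can be shown with a few lines of C. This is a minimal sketch, and read_elem() is a hypothetical function written only for this example. It omits the bounds check on purpose, so it is well defined only while idx < n.

```c
#include <stdlib.h>

/*
 * Return element idx of a freshly filled n-element heap array, or -1 if
 * the allocation fails.  There is deliberately no bounds check: with
 * idx >= n the read goes past the end of the heap block, which is the
 * kind of access AddressSanitizer is designed to report at run time.
 */
int
read_elem(size_t n, size_t idx)
{
	int *a, v;
	size_t i;

	a = malloc(n * sizeof(*a));
	if (a == NULL)
		return (-1);
	for (i = 0; i < n; i++)
		a[i] = (int)(i * 2);
	v = a[idx];		/* out of bounds when idx >= n */
	free(a);
	return (v);
}
```

Built with clang and the -fsanitize=address flag named in the announcement, a call such as read_elem(4, 4) should abort with a heap-buffer-overflow report, while read_elem(4, 3) runs normally. Without the flag the same out-of-range call is simply undefined behavior and may go unnoticed.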

Re: Haswell CPU Feature

2015-01-08 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

On 01/06/15 07:48, John Baldwin wrote:
> On 1/6/15 12:44 AM, Jia-Shiun Li wrote:
>> On Tue, Jan 6, 2015 at 1:23 PM, Neel Natu 
>> wrote:
>> 
>>> Hi Sean,
>>> 
>>> On Mon, Jan 5, 2015 at 6:34 PM, Sean Bruno
>>>  wrote:
>>>> I'm thinking something like this:
>>>> 
>>>> Index: sys/x86/x86/identcpu.c
>>>> ===
>>>> --- sys/x86/x86/identcpu.c	(revision 276729)
>>>> +++ sys/x86/x86/identcpu.c	(working copy)
>>>> @@ -781,7 +781,7 @@
>>>>  	"\011TM2"	/* Thermal Monitor 2 */
>>>>  	"\012SSSE3"	/* SSSE3 */
>>>>  	"\013CNXT-ID"	/* L1 context ID available */
>>>> -	"\014"
>>>> +	"\014SDBG"	/* IA32_DEBUG_INTERFACE debug*/
>>>>  	"\015FMA"	/* Fused Multiply Add */
>>>>  	"\016CX16"	/* CMPXCHG16B Instruction */
>>>>  	"\017xTPR"	/* Send Task Priority Messages*/
>>>> 
>>>> 
>>>> 
>>> 
>>> Looks good.
>>> 
>> 
>> Maybe also this for completeness?
>> 
>> # svnlite diff
>> Index: sys/x86/include/specialreg.h
>> ===
>> --- sys/x86/include/specialreg.h	(revision 276737)
>> +++ sys/x86/include/specialreg.h	(working copy)
>> @@ -154,6 +154,7 @@
>>  #define	CPUID2_TM2	0x0100
>>  #define	CPUID2_SSSE3	0x0200
>>  #define	CPUID2_CNXTID	0x0400
>> +#define	CPUID2_SDBG	0x0800
>>  #define	CPUID2_FMA	0x1000
>>  #define	CPUID2_CX16	0x2000
>>  #define	CPUID2_XTPR	0x4000
> 
> Yes, please include both.  SDBG matches the label in the Intel SDM,
> so that's the preferred name.
> 


Thanks folks, I've committed all of this to head after a quick
download and read of the SDM.

sean
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJUrrWGXxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kvBEH/0XgaqmdXENMYYnq18nBdSrt
lEs8qJuZXwvJPJbKxYLXrL6UFp4Yprw+Z4I6aeJp0zXmQP3Kv6yT+yd/ATYt7E5t
rf6ytd/qStLaq2FZu4rNQdePVWyMA4qXT0dINMChA0SishDef80WSY2J8LA7sExV
EyuD+nBmpr8/oB3UImAbihK2/YGcdi7FEjJe1hWtzcBAp655A5I5fakxDwsQz4iE
kqKaCMT50ib9D4G4JicWx1L72hcOAPWpvj9oOplHzp89ZtkuLSrWeKfKX4GriWEY
gg6jcKSds6TYCs/3wuMM63YaimJ1wZbpGhvb09at1DPFT8CamqhMspAe70yr5a8=
=KAfi
-END PGP SIGNATURE-
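
An illustrative aside on the two patches in this message: the identcpu.c table names feature bits with strings such as "\014SDBG", where the leading octal byte is a 1-based bit number (\014 is 12, so bit 11 counting from zero, mask 0x800). The sketch below decodes a value against such a table in userspace. It is an assumption-laden illustration, not the kernel's formatter: decode_bits() and cpu_feature2_names are hypothetical names, only five entries from the patch context are included, and the leading base byte that a real %b format string carries is left out.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Subset of the patched table; each entry starts with a 1-based bit number. */
static const char cpu_feature2_names[] =
    "\011TM2"
    "\012SSSE3"
    "\013CNXT-ID"
    "\014SDBG"
    "\015FMA";

/*
 * Write the names of all bits set in val into out, comma-separated.
 * Returns the number of names written.
 */
int
decode_bits(uint32_t val, const char *names, char *out, size_t outlen)
{
	const char *p = names;
	int count = 0;

	if (outlen == 0)
		return (0);
	out[0] = '\0';
	while (*p != '\0') {
		int bit = (unsigned char)*p++;	/* 1-based bit number */
		const char *name = p;
		size_t len = 0, used;

		/* A name runs until the next control byte or the final NUL. */
		while ((unsigned char)p[len] > ' ')
			len++;
		p += len;
		if (bit < 1 || bit > 32 ||
		    (val & ((uint32_t)1 << (bit - 1))) == 0)
			continue;
		used = strlen(out);
		snprintf(out + used, outlen - used, "%s%.*s",
		    count > 0 ? "," : "", (int)len, name);
		count++;
	}
	return (count);
}
```

Before the patch, the "\014" entry had an empty name, so a set bit 11 had no label to print. With the entry named, a value with only that bit set decodes to SDBG.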

Re: Haswell CPU Feature

2015-01-05 Thread Sean Bruno

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

On 01/05/15 17:50, Andrey Fesenko wrote:
> On Tue, Jan 6, 2015 at 4:24 AM, Sean Bruno 
> wrote:
>> -BEGIN PGP SIGNED MESSAGE- Hash: SHA512
>> 
>> On 01/05/15 16:57, Neel Natu wrote:
>>> Congratulations, you have the ability to debug the Haswell
>>> silicon
>>> 
>> HA!
>> 
>> Is this turned on purposefully (it's a feature of the CPU) or is
>> this turned on unintentionally and is a bug in manufacturing?
>> 
>> sean
> 
> My desktop i5-4570 contains this flag too
> 
> Origin="GenuineIntel"  Id=0x306c3  Family=0x6  Model=0x3c
> Stepping=3 
> Features=0xbfebfbff
>
> 
Features2=0x7ffafbff,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
> 
> 
I'm thinking something like this:

Index: sys/x86/x86/identcpu.c
===
--- sys/x86/x86/identcpu.c	(revision 276729)
+++ sys/x86/x86/identcpu.c	(working copy)
@@ -781,7 +781,7 @@
 	"\011TM2"	/* Thermal Monitor 2 */
 	"\012SSSE3"	/* SSSE3 */
 	"\013CNXT-ID"	/* L1 context ID available */
-	"\014"
+	"\014SDBG"	/* IA32_DEBUG_INTERFACE debug*/
 	"\015FMA"	/* Fused Multiply Add */
 	"\016CX16"	/* CMPXCHG16B Instruction */
 	"\017xTPR"	/* Send Task Priority Messages*/


sean
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQF8BAEBCgBmBQJUq0m+XxSAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRCQUFENDYzMkU3MTIxREU4RDIwOTk3REQx
MjAxRUZDQTFFNzI3RTY0AAoJEBIB78oecn5kigkIAI2Naiv2TENl+1SFAQ5oHTUu
HR+sbz3o7p71W8/9WhdkiMhHLwpF5ZGrpqh9Jc4CYuyNax4OvzD9du8b1RFVVBtx
AR6K+zFE1/wHC8S+iEB2QsWLWjd6Y0NbZL1MvgEQTybFwLzdtEXafOi2gSsa2lK0
RFMd0VbE2xn2q9mp5GuTnR8fvqWGPSJLEtWTpEZri8vFnBIMC+kocb//kOhY6JsF
SNcpJ2RfhXiQyOOZT/ETe47s7A29R9VW5u/+Hg8VnNuq5rV5o2PXa68VvSmAu4gr
IxPMoodFUITXTpS/lfmkOf4W+uTSqUji+Y/u1yjNzS4MgodoEh6mc7gDuH9Xoj0=
=j1kZ
-END PGP SIGNATURE-
