Re: Issue with gpart "Device Busy"

2020-04-16 Thread Harry Schmalzbauer

On 15.04.2020 at 20:35, i...@dijix.com wrote:

I have an issue with gpart, it will not let me delete partition ada0p2 
responding with “Device Busy”
The man page gpart(8) says this may be shown if a partition exists but I cannot 
seem to delete partition 2 in my case via gpart delete or gpart destroy

This is a used disk but new to the machine, I can modify the partition type and 
create partitions before and after partition 2 but I cannot delete it.

Here’s what I have tried so far:


root@beastie:~ # gpart show
=>        34  1250263661  ada0  GPT  (596G)
          34      409606        - free -  (200M)
      409640  1249591904     2  freebsd-ufs  (596G)
  1250001544      262151        - free -  (128M)

=>       40  976773088  ada1  GPT  (466G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048    4194304     2  freebsd-swap  (2.0G)
    4196352  972576768     3  freebsd-zfs  (464G)
  976773120          8        - free -  (4.0K)

root@beastie:~ # gpart delete -i2 ada0
gpart: Device busy

:
:

:
root@beastie:~ # gpart destroy -F ada0
gpart: Device busy


There might still be situations where 'sysctl kern.geom.debugflags=16' 
helps, but I haven't needed it in recent years (since 7.x, I guess).
Are you sure p2 (-i2) of ada0, most likely home to a UFS filesystem, 
isn't mounted anymore? Was it a mountpoint inside a jail?  Stopping the 
jail might leave network-related active sockets blocking the filesystem 
(rebooting without starting the jail before deleting the partition should 
work in that case).
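
One way to answer the "still mounted?" question before retrying gpart delete is to look for the device node in the output of mount(8). A minimal sketch, assuming the partition shows up as /dev/ada0p2; the helper name is_mounted is made up for illustration, and the check does not cover other possible consumers such as swap or a ZFS pool:

```sh
#!/bin/sh
# is_mounted (hypothetical helper): read mount(8)-style output on stdin and
# succeed if the given device name appears as the mounted device (first field).
is_mounted() {
    dev="$1"
    # mount(8) prints lines such as "/dev/ada0p2 on /mnt (ufs, local)".
    awk -v d="/dev/$dev" '$1 == d { found = 1 } END { exit !found }'
}

if mount | is_mounted ada0p2; then
    echo "ada0p2 is still mounted; unmount it before running gpart delete"
else
    echo "ada0p2 is not listed by mount(8)"
fi
```

Mounts made inside a jail are normally visible in the host's mount(8) output as well, so the same check can be run on the host after stopping the jail.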


-harry
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Audio mixer and mixer control

2020-04-11 Thread Harry Schmalzbauer

On 11.04.2020 at 06:57, O'Connor, Daniel via freebsd-stable wrote:
…

So if I have dsp0 with line-in and line-out, and dsp3 with a S/PDIF out, there's no way 
to get the dsp0-"mix" over to dsp3?


You can't use mixer to do what you want, but you can probably do something with 
a sox pipe line that would read from one input and feed to another if that is 
indeed what you need.


What I'm looking for is a mixer which processes various input sources and sends 
them to arbitrary output devices.
Does anybody know if there's such kind of mixer available?

Or is it possible to interconnect different dsp channels? (ugh, I don't really 
know anything about contemporary audio hardware :-( )

I also have problems understanding the mixer(8) channels.  It is hard to find the corresponding dsp channel... The relation of 
"speaker", "mix", the invisible "monitor" and "rec" is completely unclear to me, likewise the 
difference between "vol" and "pcm".

Is it common that S/PDIF out is a separate dsp?  I never had to investigate this on 
other OSes, where I get the same signal on analog and digital outputs 
simultaneously.


I don't think it's very uncommon, although I haven't used FreeBSD on a desktop 
for quite a while..

What does this output?
cat /dev/sndstat

If you just want to play some audio out to the S/PDIF you can tell your audio 
program to use that particular device (eg /dev/dsp1 or whatever it is)


Hello and thanks for your help.  The main issue is playing back 
simultaneously on more than one dsp (musicpd(1) provides that 
feature out of the box, but I was looking for a more general way, 
covering mixed line-in (DAB+ radio)).


Here's my sndstat:
FreeBSD Audio Driver (64bit 2009061500/amd64)
Installed devices:
pcm0:  on hdaa0  (1p:2v/1r:2v) default
pcm1:  on hdaa0  (1p:2v/1r:1v)
pcm2:  on hdaa0  (1p:1v/0r:0v)
pcm3:  on hdaa0  (1p:1v/0r:0v)
pcm4:  at ? kld snd_uaudio (0p:0v/1r:1v)
No devices installed from userspace.

To my surprise, today there's a dsp0_line-in/mix signal on dsp1_line-out. 
No idea if it was a layer 8 error yesterday (pretty sure it was not) or 
if some smart chip on the mainboard decided to interconnect overnight 
(no reboot)?!? In fact, adjusting "mix" on dsp0 controls the output 
volume on dsp1 (analog line-in on dsp0 somehow gets routed to analog 
out on dsp1; I killed pulseaudio and nothing else is running, so it must be 
done in hardware).


I'd like to share what I discovered while browsing freshports.org/audio:
rawrec(1) might be the leanest way to pipe signals, like you mentioned 
using sox(1).
virtual_oss(8) seems to do exactly what I was looking for regarding 
"mixing". No idea how cuse(3) comes into play, seems to be not as native 
as I prefer things.


Unfortunately, I don't have time to play with it at the moment.  But once I 
come back to it, I'll find it here for reference ;-)


Thanks,

-harry


Audio mixer and mixer control

2020-04-10 Thread Harry Schmalzbauer

Hello,

today I wanted to utilize my optical S/PDIF out with an external D/A 
converter to empower my garden radio.
Unfortunately, it seems mixer(8) isn't really doing what I understand a 
mixer's job is.


As far as I understand, mixer(8) just controls/pushes settings to 
the dsp's specific hardware mixer (if that's true, a name like mixctl(8) 
would be clearer).


So if I have dsp0 with line-in and line-out, and dsp3 with a S/PDIF out, 
there's no way to get the dsp0-"mix" over to dsp3?
What I'm looking for is a mixer which processes various input sources 
and sends them to arbitrary output devices.

Does anybody know if there's such kind of mixer available?

Or is it possible to interconnect different dsp channels? (ugh, I don't 
really know anything about contemporary audio hardware :-( )


I also have problems understanding the mixer(8) channels.  It is hard to find 
the corresponding dsp channel... The relation of "speaker", "mix", the 
invisible "monitor" and "rec" is completely unclear to me, likewise the 
difference between "vol" and "pcm".


Is it common that S/PDIF out is a separate dsp?  I never had to 
investigate this on other OSes, where I get the same signal on analog and 
digital outputs simultaneously.


Thanks for any hints,

-harry



Re: UEFI ISO boot not working in 12.1 ?

2019-12-02 Thread Harry Schmalzbauer

On 09.11.2019 at 18:24, Kyle Evans wrote:

On Sat, Nov 9, 2019 at 10:42 AM Chris Ross  wrote:

On Thu, Nov 07, 2019 at 02:53:25PM -0500, Chris Ross wrote:

On Thu, Nov 7, 2019 at 9:46 AM Julian Elischer  wrote:

You could try some bisection back along the  12 branch..

Yeah.  I was hoping for an easier path, but.  I can try slogging back
through stable-12 a month or two at a time.

Okay.  I spent a bunch of time moving around stable-12 by date, and
an ISO build from stable-12 as of 2019-10-14 works (rev 353483), and
2019-10-15 (rev 353541) does not.

…

That helps- thanks! I'm CC'ing tsoome@, as this is basically just
r353501 in that range. Can you give the latest -CURRENT snapshot boot
as another data point?


I can confirm that reverting r353501 on stable/12 from yesterday solves 
my problem with booting the setup media.
(My symptoms on an ESXi 6.7 guest using SATA vODD: r355263 loads 
kernel/modules but gets stuck at 100% CPU while trying to hand over to the kernel)


My svn skills are as lousy as my C skills, but to me it seems like a 
mismerge.
The attached patch (against r355263, stable/12 from yesterday _without_ 
reverting r353501!) solves my problem.


But please could someone familiar with svn&code inspect what happened 
and verify/correct/commit the fix.
Solving my problem doesn't mean my approach is correct.  I don't know 
HandleProtocol() nor OpenProtocol() nor did I read the code trying to 
understand what's happening in "proto.c".

I just text-edited an obvious can't-be… Maybe I missed a lot of things…
In case attachment won't make it to the list (white space nits to be 
expected):

Index: stand/efi/boot1/proto.c
===================================================================
--- stand/efi/boot1/proto.c (revision 355263)
+++ stand/efi/boot1/proto.c (working copy)
@@ -61,7 +61,7 @@
     int preferred;

     /* Figure out if we're dealing with an actual partition. */
-   status = BS->HandleProtocol(h, &DevicePathGUID, (void **)&devpath);
+   status = OpenProtocolByHandle(h, &DevicePathGUID, (void **)&devpath);
     if (status == EFI_UNSUPPORTED)
     return (0);

@@ -77,7 +77,7 @@
     efi_free_devpath_name(text);
     }
  #endif
-   status = BS->HandleProtocol(h, &BlockIoProtocolGUID, (void **)&blkio);
+   status = OpenProtocolByHandle(h, &BlockIoProtocolGUID, (void **)&blkio);
     if (status == EFI_UNSUPPORTED)
     return (0);

Index: stand/efi/gptboot/proto.c
===================================================================
--- stand/efi/gptboot/proto.c   (revision 355263)
+++ stand/efi/gptboot/proto.c   (working copy)
@@ -146,7 +146,7 @@
     EFI_STATUS status;

     /* Figure out if we're dealing with an actual partition. */
-   status = BS->HandleProtocol(h, &DevicePathGUID, (void **)&devpath);
+   status = OpenProtocolByHandle(h, &DevicePathGUID, (void **)&devpath);
     if (status != EFI_SUCCESS)
     return;
  #ifdef EFI_DEBUG
@@ -169,7 +169,7 @@
     return;
     }
     }
-   status = BS->HandleProtocol(h, &BlockIoProtocolGUID, (void **)&blkio);
+   status = OpenProtocolByHandle(h, &BlockIoProtocolGUID, (void **)&blkio);
     if (status != EFI_SUCCESS) {
     DPRINTF("Can't get the block I/O protocol block\n");
     return;


But reading this thread leaves one question:
Does 12.1-RELEASE refuse to boot on regular ESXi UEFI guests!?

Thanks,

-Harry

(resent due to ?expired? subscription…; original message was addressed 
to all other recipients)

ZVOLs volmode/sync performance influence – affecting windows guests via FC RDM.vmdk

2019-10-04 Thread Harry Schmalzbauer

Hello,

I noticed a significant guest write performance drop with volmode=dev 
during my 12.1 fibre channel tests.
I remember hearing such reports from some people occasionally 
over the last few years, so I decided to see how far I can track it down.
Unfortunately, I found no way to demonstrate the effect with in-box 
tools, not even with fio(1) (from ports/benchmarks/fio).


Since I don't know how zvols/ctl work under the hood, I'd need help from 
the experts, how/why volmode seems to affect sync property/behaviour.


The numbers I see make me think that setting volmode=geom causes the 
same ZFS _zvol_ behaviour as setting the sync property to "disabled".
Why? Shortest summary: Performance on Windows guests writing files onto 
an NTFS filesystem drops by a factor of ~8 with volmode=dev, but
· After setting sync=disabled on volmode=dev ZVOLs, I see the same 
write rate as I get with volmode=geom.
· Also, disabling write cache flush on Windows has exactly the same 
effect, while leaving sync=standard.


Here's a little more background information.

The windows guest uses the zvol-backed-FC-target as mapped raw device 
from a virtual SCSI controller 
(ZVOL->ctl(4)->isp(4)->qlnativefc(ESXi-Initiator)->RDM.vmdk->paravirt-SCSI->\\.\PhysicalDrive1->GPT...NTFS)
The initiator is ESXi 6.7, but I'm quite sure I saw the same effect with 
iSCSI (Windows software iSCSI initiator) instead of FC some time ago, 
although I haven't verified that in this run.



Here's what I've done trying to reproduce the issue, leaving 
windows/ESXi out of the game:


I'm creating a ZVOL block backend for ctl(4):
    zfs create -V 10G -o compression=off -o volmode=geom -o sync=standard MyPool/testvol

    ctladm create -b block -d guest-zvol -o file=/dev/zvol/MyPool/testvol

The first line creates the ZVOL with default values.  If the pool or 
parent dataset hasn't set local values for the compression, volmode or 
sync properties, the 3 "-o" options can be omitted.


    ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o on


Now I have a "FREEBSD CTLDISK 0001", available as geom "daN".

To simulate even better, I'm using the second isp(4) port as initiator 
(to be precise, I use 2 ports in simultaneous target/initiator role, so 
I have the ZVOL-backed block device available with and without a real FC 
link in the path).
Utilizing dd(1) on the 'da' connected to the FC initiator, I get 
_exactly_ the same numbers as I get in my Windows guest across all the 
different block sizes!!!

E.g., for the 1k test, I'm running
    dd if=/dev/zero bs=1k of=/dev/da11 count=100k status=progress (~8MB/s)
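
The per-block-size runs mentioned above can be scripted. A minimal sketch, assuming the test LUN is /dev/da11 as in the 1k example; the block size list and the helper name dd_count are illustrative, and the loop only prints the dd commands (a dry run), because dd overwrites the target device:

```sh
#!/bin/sh
# Assumptions: DEV defaults to the LUN used in the 1k example; TOTAL keeps the
# amount written per run constant at 100 MiB (the same as bs=1k count=100k).
DEV=${DEV:-/dev/da11}
TOTAL=$((100 * 1024 * 1024))

# dd_count (hypothetical helper): print how many blocks of size $2 (bytes)
# are needed to write $1 bytes in total.
dd_count() {
    echo $(( $1 / $2 ))
}

for bs in 512 1024 4096 32768 131072; do
    count=$(dd_count "$TOTAL" "$bs")
    # Dry run: print the command instead of executing it.
    echo "dd if=/dev/zero of=$DEV bs=$bs count=$count status=progress"
done
```

Piping the printed lines to sh would execute them; that step is left out on purpose so the target device can be double-checked first.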

For those wanting to follow the experiment – remove the "volmode=geom"-zvol:
    ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o off
    ctladm remove -b block -l 0  (<– only if you don't have LUN 0 in use otherwise)

    zfs destroy MyPool/testvol

The "volmode" property can be altered at runtime, but the change won't have 
any effect until you either reboot or re-import the pool.
For my test I can simply create a new, identical ZVOL, this time with 
volmode=dev (instead of geom like before).
    zfs create -V 10G -o compression=off -o volmode=dev -o sync=standard MyPool/testvol

    ctladm create -b block -d guest-zvol -o file=/dev/zvol/MyPool/testvol
    ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o on


Now the throughput of the same Windows filesystem write test drops by a 
factor of roughly 8 for 0.5-32k block sizes, and still by about a factor 
of 3 for larger block sizes.


(at this point you'll most likely have noticed a panic with 12.1-BETA3; 
see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240917 )


Unfortunately, I can't see any performance drop with the dd line from above.
Since fio(1) has a parameter to issue fsync(2) after every N written blocks, I 
also tried to reproduce it with:
    echo "[noop]" | fio --ioengine=sync --filename=/dev/da11 --bs=1k --rw=write --io_size=80m --fsync=1 -
To my surprise, I still do _not_ see any performance drop, while I 
reproducibly see the big factor 8 penalty on the Windows guest.


Can anybody tell me, which part I'm missing to simulate the real-world 
issue?
As mentioned, either disabling the disk's write cache flush in Windows 
or, alternatively, setting sync=disabled restores the Windows write 
throughput to the same numbers as with volmode=geom.


fio(1) also has the "sg" ioengine, which is not usable here and which I 
know nothing about.  Maybe somebody has a hint in that direction?


Thanks

-harry




12.1-prerelease nullfs? related panic

2019-09-13 Thread Harry Schmalzbauer

Hello,

got this panic today booting a test machine with kernel from 09/09/2019, 
r352054:


Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0x80541088
stack pointer   = 0x28:0xfe578420
frame pointer   = 0x28:0xfe578470
code segment    = base 0x0, limit 0xf, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process = 593 (limits)
trap number = 9
panic: general protection fault
cpuid = 0
time = 1568392101
KDB: stack backtrace:
#0 0x8061eec7 at kdb_backtrace+0x67
#1 0x805d323d at vpanic+0x19d
#2 0x805d3093 at panic+0x43
#3 0x80941d2c at trap_fatal+0x39c
#4 0x8094113c at trap+0x6c
#5 0x8091b4ac at calltrap+0x8
#6 0x805421c2 at null_lookup+0x162
#7 0x809c40c0 at VOP_LOOKUP_APV+0x50
#8 0x806948f1 at lookup+0x6d1
#9 0x80693dc7 at namei+0x437
#10 0x806aaad2 at kern_statat+0x72
#11 0x806ab2cf at sys_fstatat+0x2f
#12 0x809428e4 at amd64_syscall+0x364
#13 0x8091bdd0 at fast_syscall_common+0x101
Uptime: 26s

#4  0x80941d2c in trap_fatal (frame=, 
eva=)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/trap.c:943
#5  0x8094113c in trap (frame=0xfe578360) at 
RELENG_12/src/sys/amd64/include/counter.h:87
#6  0x8091b4ac in calltrap () at 
/usr/local/share/deploy-tools/RELENG_12/src/sys/amd64/amd64/exception.S:289
#7  0x80541088 in null_nodeget (mp=0xf8000524d000, 
lowervp=0xf8000535b000, vpp=0xfe5784a8)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/sys/fs/nullfs/null_subr.c:117

#8  0x805421c2 in null_lookup (ap=0xfe578568)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/sys/fs/nullfs/null_vnops.c:429
#9  0x809c40c0 in VOP_LOOKUP_APV (vop=0x80c81d98, 
a=0xfe578568) at vnode_if.c:126

#10 0x806948f1 in lookup (ndp=0xfe578768) at vnode_if.h:54
#11 0x80693dc7 in namei (ndp=0xfe578768) at 
/usr/local/share/deploy-tools/RELENG_12/src/sys/kern/vfs_lookup.c:445
#12 0x806aaad2 in kern_statat (td=0xf800052ad5e0, 
flag=, fd=,
    path=0x8002600b0 , 
pathseg=UIO_USERSPACE, sbp=0xfe57, hook=0)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/sys/kern/vfs_syscalls.c:2300
#13 0x806ab2cf in sys_fstatat (td=, 
uap=0xf800052ad9a0)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/sys/kern/vfs_syscalls.c:2277

#14 0x809428e4 in amd64_syscall (td=0xf800052ad5e0, traced=0)
    at RELENG_12/src/sys/amd64/amd64/../../kern/subr_syscall.c:135


This occurred only once, and after the reboot I found a corrupted file on 
my nullfs mount.  It wasn't mutilated, but showed the content of another 
valid file in the same directory (like 'cat file1 >> file2').


Any ideas if this has been recently addressed since 09/09?

Thanks,

-harry



Re: EFI loader doesn't handle md_preload (md_image) correct?

2019-04-07 Thread Harry Schmalzbauer

On 16.05.2017 at 18:26, Harry Schmalzbauer wrote:

Regarding Toomas Soome's message of 16.05.2017 18:20 (localtime):

On 16 May 2017, at 19:13, Harry Schmalzbauer wrote:

Regarding Toomas Soome's message of 16.05.2017 18:00 (localtime):

On 16 May 2017, at 18:45, Harry Schmalzbauer <free...@omnilan.de> wrote:

Regarding Harry Schmalzbauer's message of 16.05.2017 17:28 (localtime):

Regarding Toomas Soome's message of 16.05.2017 16:57 (localtime):

On 16 May 2017, at 17:55, Harry Schmalzbauer <free...@omnilan.de> wrote:

Hello,

unfortunately I had some trouble with my preferred MFS-root setups.
It seems EFI loader doesn't handle type md_image correctly.

If I load any md_image with loader invoked by gptboot or gptzfsboot,
'lsmod'
shows "elf kernel", "elf obj module(s)" and "md_image".

Using the same loader.conf, but with the EFI loader, the md_image file is
prompted and seems to be loaded, but not registered.  There's no
md_image
with 'lsmod', hence it's not astonishing that the kernel doesn't attach md0,
so booting fails since there's no rootfs.

Any help highly appreciated, hope Toomas doesn't mind being
initially CC'd.

Thanks,

-harry

The first question is, how large is the md_image and what other
modules are loaded?

Thanks for your quick response.

The images are 50-500MB uncompressed (provided as gzip-compressed files).
Small number of elf modules, 5, each ~50kB.

On the real HW, there's vmm and some more:
Id Refs Address     Size Name
 1   46 0x8020       16M kernel
 2    1 0x8121d000   86K unionfs.ko
 3    1 0x81233000  3.1M zfs.ko
 4    2 0x81545000   51K opensolaris.ko
 5    7 0x81552000  279K usb.ko
 6    1 0x81598000   67K ukbd.ko
 7    1 0x815a9000   51K umass.ko
 8    1 0x815b6000   46K aesni.ko
 9    1 0x815c3000   54K uhci.ko
10    1 0x815d1000   65K ehci.ko
11    1 0x815e2000   15K cc_htcp.ko
12    1 0x815e6000  3.4M vmm.ko
13    1 0xa3a21000   12K ums.ko
14    1 0xa3a24000  9.1K uhid.ko

Providing md_image uncompressed doesn't change anything.

Will deploy a /usr separated rootfs, which is only ~100MB uncompressed
and see if that changes anything.
That's all I can provide, code is far beyond my knowledge...

-harry


The issue is that the current UEFI implementation uses a 64MB staging
area for loading the kernel, modules and files. When the boot is
called, the relocation code moves the bits from the staging area into their
final places. The BIOS version does not need such a staging area, and that
explains the difference.

I actually have a different implementation to address the same problem,
but that's for the illumos case, and it will need some work to make it usable
for FreeBSD; the idea is actually simple - allocate a staging area per
loaded file and relocate the bits into place by component, not as one
continuous large chunk (this would also allow avoiding mines like the ones
planted by hyperv;), but right now there is no very quick real solution
other than just building the efi loader with a larger staging size.

I see, thanks for the explanation.
While I'm aware of neither the purpose of the staging area nor the
consequences of enlarging it, do you think it's feasible to increase it
to 768MiB?

At least now I have an idea about the issue and an explanation why
reducing md_image to 100MB hasn't helped – still more than 64...

Any quick hint where to define the staging area size is highly appreciated,
if there are no hard objections against a 768MB size.

-harry

The problem is that until UEFI Boot Services are switched off, the memory 
is managed (and owned) by the firmware,

Hmm, I've been expecting something like that (owned by firmware) ;-)

So I'll stay with CSM for now, and will happily be an early adopter if
you need someone to try anything (-stable mergeable).


Hello Toomas,

thanks for your ongoing FreeBSD commits, saw your recent libstand 
improvements and the efiloader commit.

Which reminds me to nag the skilled ones about my unmet needs ;-)

I guess nobody had time to look at the MFS-root limitation with EFI vs. 
BIOS.

If you have any news/plans, please share.
The ability to boot via EFI gives a much better console 
experience/usability for admins, but on MFS-root systems, I'm still 
forced to use the old loader path, because of the 64MB size limit.


Do you think there's a chance that this will be resolved for FreeBSD?

Thanks,

-harry



Re: libcrypto.so.111 linked binaries SIGSEGV (in bhyve guest)

2019-02-22 Thread Harry Schmalzbauer

On 22.02.2019 at 04:51, Eugene Grosbein wrote:

21.02.2019 22:27, Harry Schmalzbauer wrote:


The object is clearly corrupted.


Thanks to your hint to readelf, I found out that it gets corrupted during 
dump(8) (or restore(8), not yet analyzed).
The obj tree contains the good version, the dump archive does not.
The dump archive is used as source for the ISO, hence the described errors.
Now I have to dig into 10-year-old deployment scripts to track down and 
reproduce the corruption.  No explanation so far, but for sure no rtld-elf 
problem :-)
And also not a problem in the FreeBSD make chain: building stable/12 on 
stable/11 works as intended and doesn't produce the mutilated libcrypto.so.111!


You may find useful reading trail of this PR 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228174

Long story short: dump(8) will read inconsistent data (or even garbage) from 
mounted file system
unless used with -L to make and dump a snapshot. And UFS snapshots are not 
compatible with SU+J UFS
created with installer by default in some versions of FreeBSD.


Thanks a lot for that additional relevant information.  I'm aware of 
the -L & SU+J problem.  And I'm not convinced that the default installer 
settings handle this situation correctly, at least not for the root 
filesystem!


My issue was unrelated though.
I dump(8)ed an unmounted md(4), but restore(8) didn't have enough space 
(short by only a few bytes, so the size of the corrupted file wasn't 
obviously wrong) and the deployment script didn't check the return status 
at all. 
Fixed the script and now the restore(8)ed libcrypto.so.111 works.
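
For reference, a minimal sketch of the kind of check the script was missing. run_checked is a hypothetical helper name, and the paths in the usage comment are placeholders, not taken from the real deployment script:

```sh
#!/bin/sh
# run_checked (hypothetical helper): run a command and report a non-zero exit
# status on stderr, so the caller can stop instead of continuing silently.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "error: '$*' exited with status $status" >&2
        return "$status"
    fi
}

# Example usage (placeholder paths):
#   run_checked sh -c 'cd /mnt/target && restore -rf /path/to/image.dump' || exit 1
```

With a wrapper like this, a restore(8) that runs out of space stops the deployment step instead of leaving a truncated file behind unnoticed.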


Thanks,

-harry



Re: libcrypto.so.111 linked binaries SIGSEGV (in bhyve guest)

2019-02-21 Thread Harry Schmalzbauer

On 21.02.2019 at 10:36, Konstantin Belousov wrote:
…


ELF Header:
Magic:   7f 45 4c 46 02 01 01 09 00 00 00 00 00 00 00 00
Class: ELF64
Data:  2's complement, little endian
Version:   1 (current)
OS/ABI:FreeBSD
ABI Version:   0
Type:  DYN (Shared object file)
Machine:   Advanced Micro Devices x86-64
Version:   0x1
Entry point address:   0x116000
Start of program headers:  64 (bytes into file)
Start of section headers:  3090864 (bytes into file)
Flags: 0
Size of this header:   64 (bytes)
Size of program headers:   56 (bytes)
Number of program headers: 8
Size of section headers:   64 (bytes)
Number of section headers: 29
Section header string table index: 28

Elf file type is DYN (Shared object file)
Entry point 0x116000
There are 8 program headers, starting at offset 64

Program Headers:
Type   Offset VirtAddr   PhysAddr
   FileSizMemSiz  FlgAlign
PHDR   0x0040 0x0040 0x0040
   0x01c0 0x01c0  R  0x8
LOAD   0x 0x 0x
   0x00115a7c 0x00115a7c  R  0x1000
LOAD   0x00116000 0x00116000 0x00116000
   0x001acb20 0x001acb20  R E0x1000
LOAD   0x002c3000 0x002c3000 0x002c3000
   0x0002f790 0x000325e0  RW 0x1000
DYNAMIC0x002f1a80 0x002f1a80 0x002f1a80
   0x0190 0x0190  RW 0x8
GNU_RELRO  0x002c9000 0x002c9000 0x002c9000
   0x00029790 0x00029790  R  0x1
GNU_EH_FRAME   0x000d0050 0x000d0050 0x000d0050
   0xbc74 0xbc74  R  0x4
GNU_STACK  0x 0x 0x
   0x 0x  RW 0

   Section to Segment mapping:
Segment Sections...
 00
 01 (null) (null) (null) (null) (null) (null) (null) (null)
(null) (null) (null) (null) (null) (null) (null) (null) (null) (null)
(null) (null) (null) (null) (null) (null) (null) (null) (null) (null)
 02
 03
 04
 05
 06
 07 (null) (null) (null) (null) (null) (null) (null) (null)
(null) (null) (null) (null) (null) (null) (null) (null) (null) (null)
(null) (null) (null) (null) (null) (null) (null) (null) (null) (null)
There are 29 section headers, starting at offset 0x2f29b0:



…


The object is clearly corrupted.


Thanks to your hint to readelf, I found out that it gets corrupted 
during dump(8) (or restore(8), not yet analyzed).

The obj tree contains the good version, the dump archive does not.
The dump archive is used as source for the ISO, hence the described errors.
Now I have to dig into 10-year-old deployment scripts to track down and 
reproduce the corruption.  No explanation so far, but for sure no 
rtld-elf problem :-)
And also not a problem in the FreeBSD make chain: building stable/12 on 
stable/11 works as intended and doesn't produce the mutilated 
libcrypto.so.111!
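
A byte-wise comparison is a quick way to confirm which copy is damaged. A minimal sketch, assuming the good copy lives in the obj tree and the suspect copy has been restored somewhere else; the helper name same_file is illustrative and both paths are passed as arguments:

```sh
#!/bin/sh
# same_file (hypothetical helper): succeed only if both files are byte-identical.
same_file() {
    cmp -s "$1" "$2"
}

# Usage: sh compare.sh <file-from-obj-tree> <file-from-restored-dump>
if [ "$#" -eq 2 ]; then
    if same_file "$1" "$2"; then
        echo "identical: $1 $2"
    else
        echo "DIFFERENT: $1 $2"
    fi
fi
```

cmp(1) compares content rather than size, so it also catches a copy that differs by only a few bytes.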


Thanks,

-harry


Re: libcrypto.so.111 linked binaries SIGSEGV (in bhyve guest)

2019-02-21 Thread Harry Schmalzbauer

On 21.02.2019 at 09:54, Konstantin Belousov wrote:

On Thu, Feb 21, 2019 at 09:24:43AM +0100, Harry Schmalzbauer wrote:

On 20.02.2019 at 17:51, Harry Schmalzbauer wrote:

Hello,


…

gdb shows:
Core was generated by `/usr/sbin/auditdistd'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libutil.so.9...Reading symbols from
/usr/lib/debug//lib/libutil.so.9.debug...done.
done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from
/usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
Loaded symbols for /libexec/ld-elf.so.1
#0  memset (dest=0x80056f790, c=0, len=)
     at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624
5624    ((char *)dest)[i] = c;
(gdb) bt
#0  memset (dest=0x80056f790, c=0, len=)
     at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624
#1  0x000800235b07 in map_object (fd=3, path=0x800246140
"/lib/libcrypto.so.111",
     sb=0x7fffd4a8)
     at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/map_object.c:249
#2  0x000800230806 in load_object (name=0x201dba
"libcrypto.so.111", fd_u=-1,
     refobj=0x800248000, flags=)
     at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2493
#3  0x000800229972 in _rtld (sp=,
exit_proc=0x7fffea30,
     objp=0x7fffea38)
     at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2315
#4  0x000800228019 in .rtld_start ()
     at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/amd64/rtld_start.S:39
#5  0x in ?? ()
Current language:  auto; currently minimal

Any help highly appreciated.

This is with a live CD (amd64), compiled with stable/12 from today (so
clang 7.01).
The bhyve guest has 2GB hardwired and ran stable/11 beforehand, which
compiled the live CD.
bhyve host is 11.2.  But that shouldn't play a role, should it?


I'm really interested in what happens here.
I built stable/11 in that bhyve guest and updated that guest to
stable/11 from yesterday.
To my surprise llvm 7.01 was also merged to stable/11.  Thank you for
that great support!
No problems with any binary in the stable/11 bhyve guest.

Then I built stable/12 in that re-built stable/11 guest.
As result, again all binaries linked to /lib/libcrypto.so.111 crash
(signal 11) with the stable/12 iso in the same bhyve guest.

Here the example from ntpq:
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libedit.so.7...Reading symbols from
/usr/lib/debug//lib/libedit.so.7.debug...done.
done.
Loaded symbols for /lib/libedit.so.7
Reading symbols from /lib/libm.so.5...Reading symbols from
/usr/lib/debug//lib/libm.so.5.debug...done.
done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from
/usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
#0  memset (dest=0x8005ef790, c=0, len=) at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624
5624    ((char *)dest)[i] = c;
(gdb) bt
#0  memset (dest=0x8005ef790, c=0, len=) at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624
#1  0x00080025db07 in map_object (fd=3, path=0x80026e1a0
"/lib/libcrypto.so.111", sb=0x7fffd4c8) at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/map_object.c:249
#2  0x000800258806 in load_object (name=0x201b40 "libcrypto.so.111",
fd_u=-1, refobj=0x80027, flags=) at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2493
#3  0x000800251972 in _rtld (sp=,
exit_proc=0x7fffea50, objp=0x7fffea58) at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2315
#4  0x000800250019 in .rtld_start () at
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/amd64/rtld_start.S:39
#5  0x in ?? ()

So please correct me if I'm completely wrong, but the problem here seems
to be reproducibly rtld-elf related.
Unfortunately I don't know anything about object files and linkers and
the related fundamental stuff.

If you do not know about linkers, why do you claim that the problem
is related to rtld ?


But maybe someone else has an idea what's going wrong here?


The fault happens during zeroing of bss.  Most likely it is due to some
strangeness of the object being loaded.  For diagnostic, show
the output of "readelf -a libcrypto.so.111".
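
For readers unfamiliar with the term: when a loadable ELF segment has p_memsz
larger than p_filesz, the bytes past the file-backed data (the bss) must read
as zero, so a loader typically clears the tail of the last file-backed page by
hand.  The following is only a minimal sketch of that arithmetic, assuming
4 KiB pages; the struct, the helper names and the header values used to
exercise it are made up for illustration and are not taken from rtld or from
the library in question.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096ULL	/* assumption: 4 KiB pages (amd64) */

/* Hypothetical subset of an ELF PT_LOAD program header. */
struct load_seg {
	uint64_t p_vaddr;	/* virtual address of the segment */
	uint64_t p_filesz;	/* bytes backed by the file */
	uint64_t p_memsz;	/* bytes occupied in memory */
};

static uint64_t
round_up_page(uint64_t x)
{
	return ((x + PAGE_SIZE_ASSUMED - 1) & ~(PAGE_SIZE_ASSUMED - 1));
}

/* First virtual address past the file-backed data: start of the bss. */
static uint64_t
bss_start(const struct load_seg *s)
{
	return (s->p_vaddr + s->p_filesz);
}

/* Bytes that have to be zeroed by hand in the last file-backed page. */
static uint64_t
bss_clear_len(const struct load_seg *s)
{
	uint64_t end_mem, page_end;

	if (s->p_memsz <= s->p_filesz)
		return (0);		/* no bss in this segment */
	end_mem = s->p_vaddr + s->p_memsz;
	page_end = round_up_page(bss_start(s));
	return ((end_mem < page_end ? end_mem : page_end) - bss_start(s));
}
```

With a made-up segment of p_vaddr 0x1000, p_filesz 0x790 and p_memsz 0x3000,
bss_start() yields 0x1790 and bss_clear_len() yields 0x870, i.e. the rest of
that page.  The program headers in the readelf output are where the real
values would come from.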


Thanks for your help!
I only guessed it was rtld related; I obviously misinterpreted the 
backtrace.  Reverting topic change…


ELF Header:
  Magic:   7f 45 4c 46 02 01 01 09 00 00 00 00 00 00 00 00
  Class: ELF64
  Data:  2's complement, little endian
  Version:   1 (current)
  OS/ABI:FreeBSD
  ABI Version:   0
  Type:  

Strange rtld-elf failure on stable/12 [Was: libcrypto.so.111 linked binaries SIGSEGV (in bhyve guest)]

2019-02-21 Thread Harry Schmalzbauer

Am 20.02.2019 um 17:51 schrieb Harry Schmalzbauer:

Hello,


…

gdb shows:
Core was generated by `/usr/sbin/auditdistd'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libutil.so.9...Reading symbols from 
/usr/lib/debug//lib/libutil.so.9.debug...done.

done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from 
/usr/lib/debug//libexec/ld-elf.so.1.debug...done.

done.
Loaded symbols for /libexec/ld-elf.so.1
#0  memset (dest=0x80056f790, c=0, len=)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624

5624    ((char *)dest)[i] = c;
(gdb) bt
#0  memset (dest=0x80056f790, c=0, len=)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624
#1  0x000800235b07 in map_object (fd=3, path=0x800246140 
"/lib/libcrypto.so.111",

    sb=0x7fffd4a8)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/map_object.c:249
#2  0x000800230806 in load_object (name=0x201dba 
"libcrypto.so.111", fd_u=-1,

    refobj=0x800248000, flags=)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2493
#3  0x000800229972 in _rtld (sp=, 
exit_proc=0x7fffea30,

    objp=0x7fffea38)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2315

#4  0x000800228019 in .rtld_start ()
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/amd64/rtld_start.S:39

#5  0x in ?? ()
Current language:  auto; currently minimal

Any help highly appreciated.

This is with a live CD (amd64), compiled with stable/12 from today (so 
clang 7.01).
The bhyve guest has 2GB hardwired and ran stable/11 beforehand, which 
compiled the live CD.

The bhyve host is 11.2.  But that shouldn't play a role, should it?


I'm really interested what happens here.
I built stable/11 in that bhyve guest and updated that guest to 
stable/11 from yesterday.
To my surprise llvm 7.01 was also merged to stable/11.  Thank you for 
that great support!

No problems with any binary in the stable/11 bhyve guest.

Then I built stable/12 in that re-built stable/11 guest.
As result, again all binaries linked to /lib/libcrypto.so.111 crash 
(signal 11) with the stable/12 iso in the same bhyve guest.


Here the example from ntpq:
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libedit.so.7...Reading symbols from 
/usr/lib/debug//lib/libedit.so.7.debug...done.

done.
Loaded symbols for /lib/libedit.so.7
Reading symbols from /lib/libm.so.5...Reading symbols from 
/usr/lib/debug//lib/libm.so.5.debug...done.

done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from 
/usr/lib/debug//libexec/ld-elf.so.1.debug...done.

done.
#0  memset (dest=0x8005ef790, c=0, len=) at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624

5624    ((char *)dest)[i] = c;
(gdb) bt
#0  memset (dest=0x8005ef790, c=0, len=) at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624
#1  0x00080025db07 in map_object (fd=3, path=0x80026e1a0 
"/lib/libcrypto.so.111", sb=0x7fffd4c8) at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/map_object.c:249
#2  0x000800258806 in load_object (name=0x201b40 "libcrypto.so.111", 
fd_u=-1, refobj=0x80027, flags=) at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2493
#3  0x000800251972 in _rtld (sp=, 
exit_proc=0x7fffea50, objp=0x7fffea58) at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2315
#4  0x000800250019 in .rtld_start () at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/amd64/rtld_start.S:39

#5  0x in ?? ()

So please correct me if I'm completely wrong, but the problem here seems 
to be reproducibly rtld-elf related.
Unfortunately I don't know anything about object files and linkers and 
the related fundamental stuff.

But maybe someone else has an idea what's going wrong here?

Thanks,

-Harry
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


libcrypto.so.111 linked binaries SIGSEGV (in bhyve guest)

2019-02-20 Thread Harry Schmalzbauer

Hello,

I'm trying to upgrade a bhyve guest from stable/11 to stable/12.

pkg(8) for example crashes with signal 11.

I looked for other affected binaries with
ldd /usr/sbin/* |& grep 'signal 11$'
which gives
/usr/sbin/auditdistd: signal 11
/usr/sbin/bhyve: signal 11
/usr/sbin/bsnmpd: signal 11
/usr/sbin/gssd: signal 11
/usr/sbin/hostapd: signal 11
/usr/sbin/iprop-log: signal 11
/usr/sbin/keyserv: signal 11
/usr/sbin/kstash: signal 11
/usr/sbin/ktutil: signal 11
/usr/sbin/local-unbound: signal 11
/usr/sbin/local-unbound-anchor: signal 11
/usr/sbin/local-unbound-checkconf: signal 11
/usr/sbin/local-unbound-control: signal 11
/usr/sbin/ntp-keygen: signal 11
/usr/sbin/ntpd: signal 11
/usr/sbin/ntpdate: signal 11
/usr/sbin/ntpdc: signal 11
/usr/sbin/pkg: signal 11
/usr/sbin/ppp: signal 11
/usr/sbin/sntp: signal 11
/usr/sbin/sshd: signal 11
/usr/sbin/tcpdump: signal 11
/usr/sbin/uefisign: signal 11
/usr/sbin/wpa_supplicant: signal 11

They all seem to have in common being linked against 
'/lib/libcrypto.so.111'


truss /usr/sbin/auditdistd
:
close(3) = 0 (0x0)
openat(AT_FDCWD,"/lib/libcrypto.so.111",O_RDONLY|O_CLOEXEC|O_VERIFY,00) 
= 3 (0x3)
fstat(3,{ mode=-r--r--r-- ,inode=15002,size=3006464,blksize=4096 }) 
= 0 (0x0)
mmap(0x0,4096,PROT_READ,MAP_PRIVATE|MAP_PREFAULT_READ,3,0x0) = 
34362249216 (0x800265000)

mmap(0x0,3104768,PROT_NONE,MAP_GUARD,-1,0x0) = 34362347520 (0x80027d000)
mmap(0x80027d000,1138688,PROT_READ,MAP_PRIVATE|MAP_FIXED|MAP_NOCORE|MAP_PREFAULT_READ,3,0x0) 
= 34362347520 (0x80027d000)
mmap(0x800393000,1757184,PROT_READ|PROT_EXEC,MAP_PRIVATE|MAP_FIXED|MAP_NOCORE|MAP_PREFAULT_READ,3,0x116000) 
= 34363486208 (0x800393000)
mmap(0x80054,196608,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_FIXED|MAP_PREFAULT_READ,3,0x2c3000) 
= 34365243392 (0x80054) SIGNAL 11 (SIGSEGV) code=SEGV_ACCERR 
trapno=12 addr=0x80056f790

process killed, signal = 11 (core dumped)
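
As a cross-check of the numbers above, the following sketch tests whether the
faulting address lies inside the last mapping.  It is plain arithmetic and
assumes only what the truss output shows: the final mmap() returned
34365243392 for a length of 196608 bytes, and the fault was reported at
0x80056f790.  The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr lies in [start, start + len), else 0. */
static int
addr_in_mapping(uint64_t addr, uint64_t start, uint64_t len)
{
	return (addr >= start && addr - start < len);
}

/* Bytes from addr up to the end of the mapping (0 if addr is outside). */
static uint64_t
bytes_to_mapping_end(uint64_t addr, uint64_t start, uint64_t len)
{
	if (!addr_in_mapping(addr, start, len))
		return (0);
	return (start + len - addr);
}
```

Called with those three values, addr_in_mapping() returns 1 and
bytes_to_mapping_end() returns 2160, so the address falls into the last page
of the PROT_READ|PROT_WRITE mapping.  This only locates the fault; it does not
explain why the access was refused.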

I have no idea how to analyze further or what the reason could be (as 
mentioned, all binaries listed dump core after opening /lib/libcrypto.so.111).


gdb shows:
Core was generated by `/usr/sbin/auditdistd'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libutil.so.9...Reading symbols from 
/usr/lib/debug//lib/libutil.so.9.debug...done.

done.
Loaded symbols for /lib/libutil.so.9
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from 
/usr/lib/debug//libexec/ld-elf.so.1.debug...done.

done.
Loaded symbols for /libexec/ld-elf.so.1
#0  memset (dest=0x80056f790, c=0, len=)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624

5624    ((char *)dest)[i] = c;
(gdb) bt
#0  memset (dest=0x80056f790, c=0, len=)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:5624
#1  0x000800235b07 in map_object (fd=3, path=0x800246140 
"/lib/libcrypto.so.111",

    sb=0x7fffd4a8)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/map_object.c:249
#2  0x000800230806 in load_object (name=0x201dba "libcrypto.so.111", 
fd_u=-1,

    refobj=0x800248000, flags=)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2493
#3  0x000800229972 in _rtld (sp=, 
exit_proc=0x7fffea30,

    objp=0x7fffea38)
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/rtld.c:2315

#4  0x000800228019 in .rtld_start ()
    at 
/usr/local/share/deploy-tools/RELENG_12/src/libexec/rtld-elf/amd64/rtld_start.S:39

#5  0x in ?? ()
Current language:  auto; currently minimal

Any help highly appreciated.

This is with a live CD (amd64), compiled with stable/12 from today (so 
clang 7.01).
The bhyve guest has 2GB hardwired and ran stable/11 beforehand, which 
compiled the live CD.

The bhyve host is 11.2.  But that shouldn't play a role, should it?

-harry



Re: MSI allocation regression, still to be corrected in HEAD and please MFC before release/12.0 gets branched

2018-11-13 Thread Harry Schmalzbauer

Am 13.11.2018 um 19:45 schrieb Scott Long:




On Nov 13, 2018, at 11:11 AM, Harry Schmalzbauer  wrote:

Am 13.11.2018 um 19:02 schrieb Scott Long:

On Nov 12, 2018, at 10:03 AM, Harry Schmalzbauer  wrote:

Am 11.06.2018 um 20:28 schrieb Harry Schmalzbauer:

Am 05.06.2018 um 19:54 schrieb Scott Long:
…

Late in the 11.2 phase, I identified this commit as a regression for MSI 
(non-x) allocation.


…


thanks a lot, in fact I'm not surprised that you come up with a better solution 
than that quick fix :-)
Had hoped someone else would do an intermediate commit to get it into 12.0 in 
time, so you won't feel any time pressure - good job needs the time it needs, 
as long as the right person is doing the job.

Unfortunately I don't have a non-productive setup where I could test before 
release/12.0 will be branched – might be subject to change...


12.0 has completely different code from 11.x, and from my review of it last 
night it should be fine.  If you have evidence that what’s currently in 12 is 
not working, please let me know ASAP.


Sorry for the confusion, I missed that.
I just verified that I do apply the patch (without errors) to my local 
stable/12 source tree for local releases...  That's probably a mistake. 
I can't remember whether I ever checked if MSI fallback allocation works 
without the patch on stable/12 (certainly not on stable/12 itself, but 
on -current back then).


As mentioned, I don't have a non-production machine of that kind for 
testing, but that's superfluous anyway if you know that the code paths 
differ in that part.


Please ignore my references to 12.0, sorry.

-harry



Re: MSI allocation regression, still to be corrected in HEAD and please MFC before release/12.0 gets branched

2018-11-13 Thread Harry Schmalzbauer

Am 13.11.2018 um 19:02 schrieb Scott Long:




On Nov 12, 2018, at 10:03 AM, Harry Schmalzbauer  wrote:

Am 11.06.2018 um 20:28 schrieb Harry Schmalzbauer:

Am 05.06.2018 um 19:54 schrieb Scott Long:
…

Late in the 11.2 phase, I identified this commit as a regression for MSI 
(non-x) allocation.
I have an idea what probably causes the problem here (INTx allocation, although 
MSI (and MSI-x) capability):
disable_msix is not 0 (I need to disable MSI-x because of ESXi-passthru…).

Corresponding lines:
{
  device_t dev;
  int error, msgs;

  dev = sc->mps_dev;
  error = 0;
  msgs = 0;

  if ((sc->disable_msix == 0) &&
  ((msgs = pci_msix_count(dev)) >= MPS_MSI_COUNT))
  error = mps_alloc_msix(sc, MPS_MSI_COUNT);
  if ((error != 0) && (sc->disable_msi == 0) &&
  ((msgs = pci_msi_count(dev)) >= MPS_MSI_COUNT))
  error = mps_alloc_msi(sc, MPS_MSI_COUNT);
  if (error != 0)
  msgs = 0;

  sc->msi_msgs = msgs;
  return (error);
}


…

Hi Harry,
You are correct about the bug.  Please change the line at the top of the 
function that reads
error = 0;
to
error = ENXIO;
Let me know if that fixes the MSI problem for you.


…


…

Index: src/sys/dev/mps/mps_pci.c
===
--- sys/dev/mps/mps_pci.c   (Revision 334948)
+++ sys/dev/mps/mps_pci.c   (Arbeitskopie)
@@ -244,7 +244,7 @@
 int error, msgs;

 dev = sc->mps_dev;
-   error = 0;
+   error = ENXIO;
 msgs = 0;

 if ((sc->disable_msix == 0) &&



To my understanding, it's obvious that the way mps_pci_alloc_interrupts() 
currently works is unintended.
This might not affect too many people, but is there a reason not to fix it?

I already created a corresponding problem report: 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229267
Anything else I should do?



Hi Harry,

Sorry for ignoring this for so long.  I’m going to commit a fix today, but it 
won’t be the same one-line change.
Upon reviewing the code, I’m going to refactor it so it’s not so confusing and 
prone to these kinds of mistakes.
Thank you for the continued reminders to finish this.


Hi Scott,

thanks a lot, in fact I'm not surprised that you come up with a better 
solution than that quick fix :-)
Had hoped someone else would do an intermediate commit to get it into 
12.0 in time, so you won't feel any time pressure - good job needs the 
time it needs, as long as the right person is doing the job.


Unfortunately I don't have a non-productive setup where I could test 
before release/12.0 will be branched – might be subject to change...


best,

-harry


MSI allocation regression, still to be corrected in HEAD and please MFC before release/12.0 gets branched

2018-11-12 Thread Harry Schmalzbauer

Am 11.06.2018 um 20:28 schrieb Harry Schmalzbauer:

Am 05.06.2018 um 19:54 schrieb Scott Long:
…
Late in the 11.2 phase, I identified this commit as a regression 
for MSI (non-x) allocation.
I have an idea what probably causes the problem here (INTx 
allocation, although MSI (and MSI-x) capability):
disable_msix is not 0 (I need to disable MSI-x because of 
ESXi-passthru…).


Corresponding lines:
{
 device_t dev;
 int error, msgs;

 dev = sc->mps_dev;
 error = 0;
 msgs = 0;

 if ((sc->disable_msix == 0) &&
 ((msgs = pci_msix_count(dev)) >= MPS_MSI_COUNT))
 error = mps_alloc_msix(sc, MPS_MSI_COUNT);
 if ((error != 0) && (sc->disable_msi == 0) &&
 ((msgs = pci_msi_count(dev)) >= MPS_MSI_COUNT))
 error = mps_alloc_msi(sc, MPS_MSI_COUNT);
 if (error != 0)
 msgs = 0;

 sc->msi_msgs = msgs;
 return (error);
}


…

Hi Harry,
You are correct about the bug.  Please change the line at the top 
of the function that reads

error = 0;
to
error = ENXIO;
Let me know if that fixes the MSI problem for you.


…


…

Index: src/sys/dev/mps/mps_pci.c
===
--- sys/dev/mps/mps_pci.c   (Revision 334948)
+++ sys/dev/mps/mps_pci.c   (Arbeitskopie)
@@ -244,7 +244,7 @@
    int error, msgs;

    dev = sc->mps_dev;
-   error = 0;
+   error = ENXIO;
    msgs = 0;

    if ((sc->disable_msix == 0) &&



To my understanding, it's obvious that the way 
mps_pci_alloc_interrupts() currently works is unintended.
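
To make the effect of the one-line change easier to see, here is a small
user-space model of the quoted control flow.  It is only a sketch: the struct,
the enum and pick_interrupt() are made up for illustration, the allocators are
assumed to always succeed, and the message count checks are left out.  It is
not the driver code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical result type: which interrupt flavour ends up in use. */
enum intr_kind { INTR_INTX = 0, INTR_MSI, INTR_MSIX };

/* Hypothetical stand-in for the two tunables in the driver softc. */
struct fake_softc {
	int disable_msix;	/* non-zero: skip MSI-X */
	int disable_msi;	/* non-zero: skip MSI */
};

/*
 * Mirrors the two if statements of the quoted function.  initial_error
 * is the value "error" holds before the first if statement.
 */
static enum intr_kind
pick_interrupt(const struct fake_softc *sc, int initial_error)
{
	enum intr_kind kind = INTR_INTX;
	int error = initial_error;

	if (sc->disable_msix == 0) {
		error = 0;		/* pretend MSI-X allocation succeeded */
		kind = INTR_MSIX;
	}
	if (error != 0 && sc->disable_msi == 0) {
		error = 0;		/* pretend MSI allocation succeeded */
		kind = INTR_MSI;
	}
	return (kind);
}
```

With disable_msix set, pick_interrupt(&sc, 0) skips both branches and ends at
INTx, which matches what I observe; pick_interrupt(&sc, ENXIO) reaches the MSI
branch instead.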

This might not affect too many people, but is there a reason not to fix it?

I already created a corresponding problem report: 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229267

Anything else I should do?

Thanks,

-harry



12.0-BETA3 isp(4): exclusive sleep mutex CAM device lock (CAM device lock)

2018-11-02 Thread Harry Schmalzbauer

Hello,

unfortunately I can't determine if this is benign, so I'd like to ask 
the experts:


uma_zalloc_arg: zone "64" with the following non-sleepable locks held:
exclusive sleep mutex CAM device lock (CAM device lock) r = 0 
(0xf8000f1424d0) locked @ 
/usr/local/share/deploy-tools/RELENG_12/src/sys/cam/cam_xpt.c:4309

stack backtrace:
#0 0x8060ead3 at witness_debugger+0x73
#1 0x8060fa48 at witness_warn+0x448
#2 0x808b6f18 at uma_zalloc_arg+0x38
#3 0x8058773a at malloc+0x9a
#4 0x8036f423 at nvlist_create+0x23
#5 0x8030d597 at ctl_port_register+0x187
#6 0x8031aa25 at ctlfeasync+0x405
#7 0x802ca772 at xpt_async_process_dev+0x162
#8 0x802c603d at xpt_async_process+0x15d
#9 0x802c697e at xpt_done_process+0x35e
#10 0x802c8a76 at xpt_done_td+0xf6
#11 0x8056e094 at fork_exit+0x84
#12 0x808f46de at fork_trampoline+0xe
uma_zalloc_arg: zone "64" with the following non-sleepable locks held:
exclusive sleep mutex CAM device lock (CAM device lock) r = 0 
(0xf8000f1414d0) locked @ 
/usr/local/share/deploy-tools/RELENG_12/src/sys/cam/cam_xpt.c:4309

stack backtrace:
#0 0x8060ead3 at witness_debugger+0x73
#1 0x8060fa48 at witness_warn+0x448
#2 0x808b6f18 at uma_zalloc_arg+0x38
#3 0x8058773a at malloc+0x9a
#4 0x8036f423 at nvlist_create+0x23
#5 0x8030d597 at ctl_port_register+0x187
#6 0x8031aa25 at ctlfeasync+0x405
#7 0x802ca772 at xpt_async_process_dev+0x162
#8 0x802c603d at xpt_async_process+0x15d
#9 0x802c697e at xpt_done_process+0x35e
#10 0x802c8a76 at xpt_done_td+0xf6
#11 0x8056e094 at fork_exit+0x84
#12 0x808f46de at fork_trampoline+0xe

Thanks for hints,

-harry



Re: ctld(8) 11.2-release lockup with w2k16 [Was: Re: ctld(8), multiple 'portal-group' on same socket (individual 'discovery-auth-group' restrictions)]

2018-08-25 Thread Harry Schmalzbauer

Am 05.07.2018 um 18:17 schrieb Harry Schmalzbauer:

Am 21.10.2014 um 12:43 schrieb Edward Tomasz Napierała:

On 1020T1035, Harald Schmalzbauer wrote:

  Hello,

I'm trying to move from istgt(1) to ctld(8), but it seems my setup 
isn't

possible with ctld.
Besides missing support for virtual-DVDs ('UnitType DVD' in istgt) and
real ODD-devices ('UnitType pass' in istgt),

Yup, we don't implement virtual DVDs and passthrough. Especially the
latter would be a nice feature to have.



Hello Edward,

my current problem is unrelated.
But this old mail illustrates the timeframe I've been happily using 
ctld(8) without problems :-) Thanks!


Recently, I discovered that WindowsServerBackup fails with Win2k16 
(never used 2k12).
Old initiators running 2008R2 (or ESXi 5.5) are still able to use 
ctld(8) ZVOL targets for WindowsServerBackup on 11.2-release without 
problems.


Unfortunately, ESXi6.5 initiators also no longer work well with 
ctld(8).

Read performance is incredibly slow.
I have a 2x3z1 pool with 6SAS10krpm spindles.
Local ZFS performance doesn't show anything unexpected.
But reading from a ctld(8) ZVOL backed target under ESXi6.5 seems to 
cause an interrupt deadlock – not completely dead, but almost.

gstat(8) tells me that all 6 HDDs are idle.
top(1) shows no thread consuming CPU cycles, with one exception (besides 
idle):

12 root 38 -56    - 0K   608K WAIT   -1 569:02 482.78% intr
systat(1) shows NICs almost idle (<100irqs/s) and permanent 25% INTR 
load (one of 4 cores).


This is with 11.2 release.
It's an ESXi guest, which I used for several years with previous FreeBSD 
versions without such massive iSCSI performance problems.


Using the same /dev/zvol with istgt(1) on the same 11.2-release VM also 
solves the performance issue.


Is anybody using ctld(8) in production post 10.x? If so, without 
observing a similar regression?


Thanks,

-harry



ctld(8) 11.2-release lockup with w2k16 [Was: Re: ctld(8), multiple 'portal-group' on same socket (individual 'discovery-auth-group' restrictions)]

2018-07-05 Thread Harry Schmalzbauer

Am 21.10.2014 um 12:43 schrieb Edward Tomasz Napierała:

On 1020T1035, Harald Schmalzbauer wrote:

  Hello,

I'm trying to move from istgt(1) to ctld(8), but it seems my setup isn't
possible with ctld.
Besides missing support for virtual-DVDs ('UnitType DVD' in istgt) and
real ODD-devices ('UnitType pass' in istgt),

Yup, we don't implement virtual DVDs and passthrough.  Especially the
latter would be a nice feature to have.



Hello Edward,

my current problem is unrelated.
But this old mail illustrates the timeframe I've been happily using 
ctld(8) without problems :-) Thanks!


Recently, I discovered that WindowsServerBackup fails with Win2k16 
(never used 2k12).
Old initiators running 2008R2 (or ESXi 5.5) are still able to use 
ctld(8) ZVOL targets for WindowsServerBackup on 11.2-release without 
problems.


I haven't had time to do much analysis and I'm lacking skills/equipment 
to do them down at debugger level, but I wanted to ask if you're aware 
about problems with Windows Server 2016 as ctld(8) initiator.


The Symptoms:

The system locks up for about 30-60 seconds with iSCSI load from w2k16.
When the lockup happens, systat(1) shows 25% intr usage (which is one 
core) and not even the login session is responsive anymore. Neither 
updating userland-output nor reacting to input.

But, the input is queued and gets processed after the lockup releases.
The lockup vanishes as soon as iSCSI session was reset:
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 
(iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): no ping reply 
(NOP-Out) after 5 seconds; dropping

connection
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 
(iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): waiting for CTL 
to terminate 94 tasks
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 
(iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): tasks terminated


Sometimes it's possible to transfer 30GB before the lockup happens, 
sometimes even a NTFS-quick-format leads to the lockup.



Yesterday I used istgt(1) instead of ctld(8) to export exactly the same 
ZVOL using exactly the same network backend, with exactly the same 
initiator.
The lockup no longer occurred, and the complete WindowsServerBackup task 
finishes successfully on the Windows Server 2016 initiator.  So I 
strongly suspect a ctld(8) locking problem.
As mentioned, the target backend is a ZFS volume.  I already used an HDD as 
target backend (and observed a much better performance, which drops even 
if I use a UFS vnode backend on the same HDD), but I'm not sure anymore 
whether the lockup also occurred...


For now I can't tell anything helpful; I can only describe the symptoms and 
ask if you have any hints for me what to try next to narrow down the 
problem, or if this is an already known problem.


Thanks,

-harry


Re: Does VirtualBox's vboxnetflt(4) work on stable/11 | 11.2?

2018-06-19 Thread Harry Schmalzbauer

Am 16.06.2018 um 21:42 schrieb Harry Schmalzbauer:
…
To rule out a known vboxnetflt(4) limitation/failure, I'd like to know 
if somebody successfully uses vboxnetflt(4) from 
virtualbox-ose-kmod-5.2.12 on stable/11|11.2.


Really nobody out there who has VirtualBox successfully running under 
stable/11 or 11.x with bridged network?


Anyone who tried, but also observed similar problems like I have on 
-current?


Thanks,

-harry


Does VirtualBox's vboxnetflt(4) work on stable/11 | 11.2?

2018-06-16 Thread Harry Schmalzbauer

Hello,

I'm observing some kind of congestion with virtualbox-ose-kmod-5.2.12 on 
-current.


Frames from the guest make it through ng_ether(4), but frames coming 
from ng_ether(4) seem to choke this direction.
Initially, the guest successfully can get a DHCP lease (both, v4 and v6) 
and all DNS traffic is passed successfully, but as soon as the first TCP 
transmission with some payload happens (jumbo frames, but quick test 
showed that also 1514-frames choke the LAN->guest direction), not even 
ARP replies reach the guest anymore. At least they are traversing 
ng_ether(4).  So there's either a general netgraph(4) problem or 
vboxnetflt(4) problem on -current.


To rule out a known vboxnetflt(4) limitation/failure, I'd like to know 
if somebody successfully uses vboxnetflt(4) from 
virtualbox-ose-kmod-5.2.12 on stable/11|11.2.


Thanks,

-harry



Re: removable storage usability, devd, hald and X11-desktop in general

2018-05-21 Thread Harry Schmalzbauer

Am 20.05.2018 um 23:52 schrieb EBFE:

On Sat, 19 May 2018 20:35:59 +0200
Harry Schmalzbauer  wrote:

Hi,


Biggest question: How are users expected to handle removable media?

I'm a happy user of autofs(5) in several environments (mostly for NFS
mounts), but I'm not aware of any helper tool which enables _users_
to unmount before pulling the UFD.
I've heard of PC-BSD and Lumina (see later why I haven't really tried
out the modern "light" desktops) and I think I remember having read
they utilize devd(8).  But again, how to unmount?


There is a nice little daemon: sysutils/dsbmd
(see https://freeshell.de/~mk/projects/dsbmd.html
and /usr/local/etc/dsbmd.conf.sample)

with a simple GUI sysutils/dsbmc and cli (sysutils/dsbmc-cli) clients.
It supports automounting using devd and/or polling and automatic
or manual unmounting.



Thanks!  Also to Kurt and Edward for their answers.  Suitable advice 
for people with an idea what a filesystem is about, but not for my step 
daughter.  Even if her biggest idol told her that it's cool to 
wait 5 seconds before pulling the UFD, she wouldn't accept it; if she's 
done copying, the device also has to be ready. Period. But she 
accepts the "eject" step from other OSes...  it's an instruction she gives, 
so it's acceptable.  As long as she needn't type anything...  And 
she's by far not the only one I know with similar expectations – the 
computer has to do what the user tells it; as soon as the user has to 
follow "strange" computer "rules", fun abruptly ends ;-)


sysutils/dsbmc is completely new to me.

Meanwhile I read about sysutils/bsdisks – UDisks2 compliant.  Never 
heard of UDisks2 before, but will have a look asap, sounds interesting too.


It's supposed to be supported by x11-fm/pcmanfm-qt – by far the most 
sensible x11 filemanager I've tried so far (offers checkmark to store 
folder specific preferences, switches from beautified path to text path 
on click, easy to configure single-click, and the usual thumbnail etc. 
is working too at acceptable performance – not even close to Rox filer 
or Thunar, but this might vary if one doesn't use it from gtk session 
but Qt based session/desktop).


Also found out that it should be easily possible to use xfce4wm with LXQt.
As time permits I'll keep trying out those highly appreciated 
alternatives – I've always been happy that my X11/xfce4 desktop helped 
me save time compared to Windows XP, but since then, many usability 
cherries grew in windows, which I'm missing on X11 and hoped that the 
famous X11 desktop projects would have picked.  pcmanfm-qt at least 
catches up with XP usability...


-harry



removable storage usability, devd, hald and X11-desktop in general

2018-05-19 Thread Harry Schmalzbauer

Hello,

after 10 years I replaced my personal desktop machine (FreeBSD8 -> 
FreeBSD12).
While aware of OS progress, I haven't followed any development on the 
X11 planet.


To my surprise, things were in better shape 10 years ago, regarding 
desktop usability.


Biggest question: How are users expected to handle removable media?

I'm a happy user of autofs(5) in several environments (mostly for NFS 
mounts), but I'm not aware of any helper tool which enables _users_ to 
unmount before pulling the UFD.
I've heard of PC-BSD and Lumina (see later why I haven't really tried 
out the modern "light" desktops) and I think I remember having read they 
utilize devd(8).  But again, how to unmount?


10 years ago my choice was xdm with xfce4-session, supplemented by 
selected gnome tools (eog, evince ...) and firefox/thunderbird.
KDE4 was working well too, the PIM suite was a full/over-featured 
solution, yet acceptable performance.  I liked many aspects of Kmail, 
this was really full featured and I thought the new Qt5 based version 
will become my new X default MUA (since I'm very satisfied how Qt5 
performs on my old Jolla phone)
But today, Kmail is completely unusable thanks to akonadi – likewise 
others of the PIM suite (MySQL database grew beyond quota for only two 
folders of my primary IMAP postbox; MySQL as dependency for a MUA is 
ridiculous anyways, but a index of several gigabytes?!?! if it read all 
folders, I would have needed a second ssd)


My prerequisite is still three separate x screens. Since no window 
manager of the modern "light" desktops like LXDE, LXQt, Lumina could 
handle my triple-head Xorg setup, I came exactly to the same result like 
10 years ago: xfce4.
It has served me incredibly well for 10 years.  The only 
usability flaw was Thunar for me, because of its feature limitations.
I really expected that in the mean time, it was possible to store 
directory dependent sorting preferences – wrong.  Much more frustrating, 
on FreeBSD, there's no thunar-volman anymore (which wasn't really stable 
with hald(8), but as far as I'm aware sysutils/hal has been greatly 
improved some years ago – which I never tried on my old machine).


Sorry for throwing in another topic, but where's 
$PREFIX/etc/X11/xorg.conf.d/*.conf.sample?
It took me the better part of a day to _search the web_ in order to get 
my keyboard working.  I'm still unsure if I did it how it's supposed to 
be done on FreeBSD these days – without hal but with auto-detection.
No idea what the devd(8) dependency of the port controls.  How does X 
interact with devd(8)???
In my opinion, the Xorg(-server) ports need much more attention 
regarding documentation.  Of course I could have used xfce-settings to 
adjust my keyboard layout, but when it comes to the mouse, I need to 
set  AccelerationProfile, AdaptiveDeceleration and ConstantDeceleration 
which isn't covered by xfce-settings and much more important, I want to 
be able to fire up twm(1) and also have my keyboard and mouse adequately 
supported.  In my opinion, these settings have to be done in the xserver 
config.  And it was really hard to find out how to do that on FreeBSD these days.


Then, there's slim(1).  It incorporates ConsoleKit, so it's my 
preference over xdm(1).
But: It's incredibly the only authentication mask I ever used, which 
doesn't handle the vert-tab-key as user/password selector.
Don't get me wrong, thanks for slim, it is exactly what I need, but I 
regularly log in as "myuseraccountmynotsosecretpasswordanymore".  I 
could have bet my tab-key is broken; still can't believe there's not 
even a config switch to enable this everywhere-else behaviour.


If somebody's still reading and totally agreeing and fighting the same 
usability wars, here's another one:
Inconsistency at its best: gtk-file-chooser.  (In order to walk through 
the filesystem tree from /) Select "Other Places", _single_ click 
"Computer", and then you have to _double_ click for entering 
directories!?!?! I always disliked the double click and for most parts 
of my X application collection I always found a GUI helper to switch to 
single click, but not for the gtk-file-chooser; neither for gtk2 nor 
these days for gtk3.  Maybe you have had time to read the developer 
repositories and know what to put into "gtkfilechooser.ini" and want to 
share?


I'm happy to share my Xorg-server setup for a haswell triple head setup 
on request.


For those who agree with my _user_ usability view, as a summary, can you 
tell me:

· How do you access mobile media like FAT UFDs and NTFS HDDs?
· How do you access arbitrary DVDs (yes, besides data, I'm also curious 
if someone watches video discs and how)?
· How to use geli(8)+nonZFS based native mass storage? (I don't expect 
there's a ZFS pool import covering method, but for UFS e.g.?)
Any experience report welcome, also and especially from 
Lumina/ROX/LXQt/etc. users!


One last exclusion: Anything depending on gnome-vfs can't handle my ZFS 
aclmode setu

11.2 roadmap - Up to date www.freebsd.org/releng/#schedule?

2018-02-19 Thread Harry Schmalzbauer
 Dear REs et al.

The commit log indicates that 11.2 is going to be on its way in the not too
distant future – my personal interpretation only!

Is https://www.freebsd.org/releng/#schedule up to date?
Or is there a different/better source for such info (for those without
svn accounts)?

Thanks,

-harry


New in 11? ZFS ACL -> aclinherit stacks synthesized mode ACEs

2017-12-08 Thread Harry Schmalzbauer
 Hello,

quick question, haven't had time to investigate yet, but accidentally
noticed that something between FreeBSD 10 and 11 has changed regarding
ZFS ACL inheritance.
Example:
If a parent directory has the file-inherit flag set in a
mode-synthesized ACE, the ACL of a file in that directory gets the mode ACEs
stacked:

getfacl DIR
# file: DIR/   
# owner: toor   
# group: wheel  
owner@:rwxp-daARWcCos:fd-:allow
group@:rwxp--a-R-c--s:fd-:allow
 everyone@:D-:-d-:deny   
 everyone@:--a-R-c--s:fd-:allow

touch DIR/testfile
getfacl DIR/testfile

# file: DIR/testfile  
# owner: toor
# group: wheel
owner@:rw-p-daARWcCos:--I:allow   
group@:rw-p--a-R-c--s:--I:allow   
 everyone@:--a-R-c--s:--I:allow
owner@:rw-p--aARWcCos:---:allow
group@:rw-p--a-R-c--s:---:allow
 everyone@:--a-R-c--s:---:allow

The (my) ACL of the parent hasn't changed for some years (and aclinherit
is set to "passthrough-x" and aclmode is "passthrough", also unchanged
for several years).
I never saw the resulting ACL before FreeBSD 11.1
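
A rough sketch, not part of the original message, for spotting the stacking quickly: it counts how many ACEs in getfacl output carry the inherited flag (a trailing "I" in the flags field) and how many do not. The function name ace_summary is made up for illustration, and it assumes the colon-separated NFSv4 layout shown above, where the flags are the second-to-last field.

```sh
#!/bin/sh
# ace_summary (hypothetical helper): read getfacl output on stdin and
# report how many ACEs have the inherited flag set and how many do not.
# Assumption: NFSv4-style lines "who:perms:flags:type", flags ending
# in "I" when the entry was inherited. Comment lines (#) are skipped.
ace_summary() {
    awk -F: '
        !/^#/ && NF >= 4 {
            if ($(NF-1) ~ /I$/) inherited++; else other++
        }
        END { printf "inherited=%d other=%d\n", inherited, other }
    '
}
```

Usage would be something like `getfacl DIR/testfile | ace_summary`; for the file above, equal non-zero counts in both columns would match the stacking described.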

Does anyone out there know what changed, and why?

Thanks,

-harry


Re: EFI loader doesn't handle md_preload (md_image) correct?

2017-12-02 Thread Harry Schmalzbauer
Regarding Toomas Soome's message of 29.06.2017 10:39 (localtime):
> 
>> On 29 June 2017, at 11:24, Harry Schmalzbauer <free...@omnilan.de> wrote:
>>
>> Regarding Harry Schmalzbauer's message of 16.05.2017 18:26 (localtime):
>>> B
>> …
>>>>>> The issue is, that current UEFI implementation is using 64MB staging
>>>>>> memory for loading the kernel and modules and files. When the boot is
>>>>>> called, the relocation code will put the bits from staging area
>>>>>> into the
>>>>>> final places. The BIOS version does not need such staging area,
>>>>>> and that
>>>>>> will explain the difference.
>>>>>>
>>>>>> I actually have different implementation to address the same problem,
>>>>>> but that's for the illumos case, and will need some work to make it usable
>>>>>> for freebsd; the idea is actually simple - allocate staging area per
>>>>>> loaded file and relocate the bits into the place by component, not as
>>>>>> continuous large chunk (this would also allow to avoid the mines like
>>>>>> planted by hyperv;), but right now there is no very quick real
>>>>>> solution
>>>>>> other than just build efi loader with larger staging size.
>>>>> I see, thanks for the explanation.
>>>>> While not aware of the purpose of the staging area nor the
>>>>> consequences of enlarging it, do you think it's feasible to increase it
>>>>> to 768MiB?
>>>>>
>>>>> At least now I have an idea about the issue and an explanation why
>>>>> reducing md_image to 100MB hasn't helped – still more than 64...
>>>>>
>>>>> Any quick hint where to define the staging area size highly
>>>>> appreciated,
>>>>> if there are no hard objections against a 768MB size.
>>>>>
>>>>> -harry
>>>> The problem is that as long as UEFI Boot Services are not switched off,
>>>> the memory is managed (and owned) by the firmware,
>>> Hmm, I've been expecting something like that (owned by firmware) ;-)
>>>

…

> There has not been too much activities about this topic, except some
> discussions. But it is quite clear that this change has to be handled by
> the loader in first place - as we need to get the data in safe location;
> now of course there is secondary part as well - it may be that kernel
> would need some work as well, depending on how the md image(s) are to be
> handled in relation to memory maps.

Hello Toomas,

unfortunately my skills don't allow me to make this happen myself :-(
But since almost every production system here is MFS_ROOT based, I'm
awfully missing the UEFI boot feature, especially on those where I have
to do work via vt(4) from time to time, which would be a lot easier if
vt_efi was usable instead of vt_vga :-)

Can you estimate whether someone has the intention/interest/time to implement
the missing extensions in boot and kernel, and in what timeframe?

Thanks,

-harry



Re: buildworld fail in stable/11 @r325033 -- r325029?

2017-10-27 Thread Harry Schmalzbauer
Regarding Konstantin Belousov's message of 27.10.2017 16:42 (localtime):
> On Fri, Oct 27, 2017 at 04:12:54AM -0700, David Wolfskill wrote:
>> This is observed on systems (both my laptop & my build machine) running
>> stable/11 @r325003, after updating sources to r325033:
>>
>> --- libprocstat.o ---
>> In file included from /usr/src/lib/libprocstat/libprocstat.c:69:
>> /usr/obj/usr/src/tmp/usr/include/sys/ptrace.h:148:19: error: field has 
>> incomplete type 'struct siginfo32'
>> struct siginfo32 pl_siginfo;/* siginfo for signal */
>>  ^
>> /usr/obj/usr/src/tmp/usr/include/sys/ptrace.h:148:9: note: forward 
>> declaration of 'struct siginfo32'
>> struct siginfo32 pl_siginfo;/* siginfo for signal */
>>^
>>
>> I don't know that r325029 is to blame, but that was the last commit
>> in that area (in the range r325003 - r325033).  And there were not
>> very many commits to stable/11 in that range:
> Can you confirm that the following patch allows your system to build ?
>
> Index: lib/libprocstat/libprocstat.c
> ===
> --- lib/libprocstat/libprocstat.c (revision 325038)
> +++ lib/libprocstat/libprocstat.c (working copy)
> @@ -63,10 +63,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #define  _KERNEL
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> Index: .
> ===
> --- . (revision 325038)
> +++ . (working copy)
>

Confirmed.
Sorry for the obsolete last post. It had been in the pipeline during the
build, and I hadn't checked that you had already posted the solution!

-harry



Re: buildworld fail in stable/11 @r325033 -- r325029?

2017-10-27 Thread Harry Schmalzbauer
Regarding David Wolfskill's message of 27.10.2017 13:12 (localtime):
> This is observed on systems (both my laptop & my build machine) running
> stable/11 @r325003, after updating sources to r325033:
>
> --- libprocstat.o ---
> In file included from /usr/src/lib/libprocstat/libprocstat.c:69:
> /usr/obj/usr/src/tmp/usr/include/sys/ptrace.h:148:19: error: field has 
> incomplete type 'struct siginfo32'
> struct siginfo32 pl_siginfo;/* siginfo for signal */
>  ^
> /usr/obj/usr/src/tmp/usr/include/sys/ptrace.h:148:9: note: forward 
> declaration of 'struct siginfo32'
> struct siginfo32 pl_siginfo;/* siginfo for signal */
>^

I know nothing about the code changes in r324932 (MFC r316286), but the
followup fix r325029 (MFC r320481) seems to have caused the early
buildworld failure in stable/11 – for me too.
Since clang reported the error referencing some include from obj/tmp
(tmp/usr/include/sys/ptrace.h), I found:
r316304
(https://svnweb.freebsd.org/base/head/lib/libprocstat/libprocstat.c?view=patch&r1=316304&r2=316303&pathrev=316304),
which I thought could explain the symptom.
Seems to be the solution (applies cleanly to stable/11, buildworld
succeeded). Even a blind squirrel sometimes finds the nut ;-)

-harry

Re: reboot-less zfs volmode property refresh?

2017-10-15 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 15.10.2017 11:57 (localtime):
> Regarding Harry Schmalzbauer's message of 15.10.2017 11:33 (localtime):
…
>> 3.) Modify existing volmode=dev dataset and write new GPT
>>
>> zfs set volmode=geom hostPsys/bhyveVOL/sys/test
>> zfs get volmode hostPsys/bhyveVOL/sys/test
>> NAME                        PROPERTY  VALUE  SOURCE
>> hostPsys/bhyveVOL/sys/test  volmode   geom   local
>> gpart create -s gpt /dev/zvol/hostPsys/bhyveVOL/sys/test
>> gpart: arg0 'zvol/hostPsys/bhyveVOL/sys/test': Invalid argument
>>
>> (fails unexpectedly)
>>
>> What can I do to let geom(4) know that there's a new device?
> Device should read provider.
>
> I found that last sentence in the zfs(8) man page, describing the
> volmode property:
> »This property can be changed any time, but
> so far it is processed only during volume creation and pool import.«
>
> So it seems to be a limitation by design.
>
> Re-importing the pool is no option for me, so I'll keep in mind that
> changing volmode means outage.
>
> I'm aware that I can utilize ctl(4) to get access to a volmode=dev
> volume, also md(4) might help in that case, but changing to
> volmode=geom requires a reboot / re-import.

It's not correct that md(4) could help here, since md(4) can't use
character devices as vnode backend.
Just to correct myself.


> I guess the benefit of extending the implementation design is much too
> small to justify the effort.
> But I think making "volmode" a creation-only property (like utf8only)
> should be considered.

On the other hand, a reboot/re-import might be preferable over zfs
send|recv, which would be the only way in that case and which would be a
regression; better not to consider making it creation-only...

I'd like to share two workarounds:
The first makes it possible to easily create backups from (bhyve(8))
guests which have ZVOLs as ahci/virtio-block backend.
As described, changing the "volmode" property for the corresponding volume
dataset from "dev" to "geom" doesn't help without reboot/re-import.
But you can clone:

zfs snapshot hostPsys/bhyveVOL/sys/guest@offline
zfs clone -o volmode=geom hostPsys/bhyveVOL/sys/guest@offline \
    hostPsys/bhyveVOL/sys/guest.gc
glabel status
gpt/guestSWAP N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp3
gpt/guestBOOT N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp4
gpt/guestSAFE N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp5
gpt/guestVAR N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp6
gpt/guestLOCAL N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp7
gpt/guestDATA N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp8
gpt/guestJbase N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp9
gpt/guestENTITIES N/A zvol/hostPsys/bhyveVOL/sys/guest.gcp10

After you have done whatever you need, simply destroy the snapshot and its
clone:

zfs destroy -R hostPsys/bhyveVOL/sys/guest@offline
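
The snapshot/clone/destroy sequence above could be wrapped roughly like this. It is only a sketch: the function names, the run helper and the DRYRUN switch are made up for illustration, while the zfs commands themselves are the ones shown above. With DRYRUN=1 the commands are only printed, which is handy for checking the dataset names before touching anything.

```sh
#!/bin/sh
# run (hypothetical helper): execute a command, or only print it
# when DRYRUN=1 is set.
run() {
    if [ "${DRYRUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Snapshot a volmode=dev volume and expose the snapshot as a
# volmode=geom clone named <volume>.gc, as in the example above.
zvol_clone_open() {
    vol=$1
    run zfs snapshot "${vol}@offline" &&
    run zfs clone -o volmode=geom "${vol}@offline" "${vol}.gc"
}

# Destroy the snapshot together with its clone.
zvol_clone_close() {
    run zfs destroy -R "$1@offline"
}
```

For example, `DRYRUN=1 zvol_clone_open hostPsys/bhyveVOL/sys/guest` prints the two commands without executing them.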


This is just a way to work around the volmode limitation for one special
case (reading data from the guest's volume).

If you need to access the guest volume for anything other than
non-destructive tasks (i.e. writing data), you have to go the ctl(4) way.
Of course, ctl(4) would also be appropriate for read-only tasks, but
since I'm more used to `zfs` than to `ctladm`, the former is easier for me.
The latter is more flexible.
Here's the ctl(4) example:

ctladm create -b block -d guest-zvol -o \
    file=/dev/zvol/hostPsys/bhyveVOL/sys/guest
ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o on
geom disk list
Geom name: da11
Providers:
1. Name: da11
Mediasize: 8589934592 (8.0G)
Sectorsize: 512
Stripesize: 8192
Stripeoffset: 0
Mode: r0w0e0
descr: FREEBSD CTLDISK
lunname: FREEBSD guest-zvol
lunid: FREEBSD guest-zvol
ident: MYSERIAL 0
rotationrate: 0
fwsectors: 63
fwheads: 255
glabel status
gpt/guestSWAP N/A da11p3
gpt/guestBOOT N/A da11p4
gpt/guestSAFE N/A da11p5
gpt/guestVAR N/A da11p6
gpt/guestLOCAL N/A da11p7
gpt/guestDATA N/A da11p8
gpt/guestJbase N/A da11p9
gpt/guestENTITIES N/A da11p10

ctladm devlist and ctladm portlist give more info.

After you have done whatever is needed, switch camsim off and remove the target:

ctladm port -p `ctladm port -l | grep "camsim.*naa" | cut -w -f 1` -o off
ctladm remove -b block -l `ctladm devl | grep "guest-zvol" | cut -w -f 2`
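
A side note, as a sketch only: the `cut -w` used in the pipelines above is a FreeBSD extension. The same port lookup can be done with awk alone, which also works where cut(1) lacks -w. The function name camsim_port is made up, and it assumes that `ctladm port -l` prints the port number in the first column of the camsim line, as the pipeline above already relies on.

```sh
#!/bin/sh
# camsim_port (hypothetical helper): read `ctladm port -l` output on
# stdin and print the port number (first column) of the first line
# that mentions the camsim frontend together with a naa. identifier.
camsim_port() {
    awk '/camsim.*naa/ { print $1; exit }'
}
```

It would be used like `ctladm port -p "$(ctladm port -l | camsim_port)" -o on`.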

-harry



Re: reboot-less zfs volmode property refresh?

2017-10-15 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 15.10.2017 11:33 (localtime):
>  Hello,
>
> maybe I'm just missing something obvious, but modifying a dataset's
> volmode property seems to force me to reboot the host to have any effect.
>
> Test to reproduce (parent dataset hostPsys/bhyveVOL/sys has volmode set
> to "dev"):
>
> 1.) Create new volume with volmode=geom, and write new GPT
>
> zfs create -o volmode=geom -V 10G hostPsys/bhyveVOL/sys/test
> gpart create -s gpt /dev/zvol/hostPsys/bhyveVOL/sys/test
> gpart show -l /dev/zvol/hostPsys/bhyveVOL/sys/test
> =>  40  20971440  zvol/hostPsys/bhyveVOL/sys/test  GPT  (10G)
>     40  20971440  - free -  (10G)
>
> (works as expected)
>
> 2.) Create new volume with volmode=dev, and write new GPT
>
> zfs destroy hostPsys/bhyveVOL/sys/test
> zfs create -V 10G hostPsys/bhyveVOL/sys/test
> gpart create -s gpt /dev/zvol/hostPsys/bhyveVOL/sys/test
> gpart: arg0 'zvol/hostPsys/bhyveVOL/sys/test': Invalid argument
>
> (fails as expected)
>
> 3.) Modify existing volmode=dev dataset and write new GPT
>
> zfs set volmode=geom hostPsys/bhyveVOL/sys/test
> zfs get volmode hostPsys/bhyveVOL/sys/test
> NAME                        PROPERTY  VALUE  SOURCE
> hostPsys/bhyveVOL/sys/test  volmode   geom   local
> gpart create -s gpt /dev/zvol/hostPsys/bhyveVOL/sys/test
> gpart: arg0 'zvol/hostPsys/bhyveVOL/sys/test': Invalid argument
>
> (fails unexpectedly)
>
> What can I do to let geom(4) know that there's a new device?

Device should read provider.

I found that last sentence in the zfs(8) man page, describing the
volmode property:
»This property can be changed any time, but
so far it is processed only during volume creation and pool import.«

So it seems to be a limitation by design.

Re-importing the pool is no option for me, so I'll keep in mind that
changing volmode means outage.

I'm aware that I can utilize ctl(4) to get access to a volmode=dev
volume, also md(4) might help in that case, but changing to
volmode=geom requires a reboot / re-import.

I guess the benefit of extending the implementation design is much too
small to justify the effort.
But I think making "volmode" a creation-only property (like utf8only)
should be considered.

Thanks,

-harry


reboot-less zfs volmode property refresh?

2017-10-15 Thread Harry Schmalzbauer
 Hello,

maybe I'm just missing something obvious, but modifying a dataset's
volmode property seems to force me to reboot the host to have any effect.

Test to reproduce (parent dataset hostPsys/bhyveVOL/sys has volmode set
to "dev"):

1.) Create new volume with volmode=geom, and write new GPT

zfs create -o volmode=geom -V 10G hostPsys/bhyveVOL/sys/test
gpart create -s gpt /dev/zvol/hostPsys/bhyveVOL/sys/test
gpart show -l /dev/zvol/hostPsys/bhyveVOL/sys/test
=>  40  20971440  zvol/hostPsys/bhyveVOL/sys/test  GPT  (10G)
    40  20971440  - free -  (10G)

(works as expected)

2.) Create new volume with volmode=dev, and write new GPT

zfs destroy hostPsys/bhyveVOL/sys/test
zfs create -V 10G hostPsys/bhyveVOL/sys/test
gpart create -s gpt /dev/zvol/hostPsys/bhyveVOL/sys/test
gpart: arg0 'zvol/hostPsys/bhyveVOL/sys/test': Invalid argument

(fails as expected)

3.) Modify existing volmode=dev dataset and write new GPT

zfs set volmode=geom hostPsys/bhyveVOL/sys/test
zfs get volmode hostPsys/bhyveVOL/sys/test
NAME                        PROPERTY  VALUE  SOURCE
hostPsys/bhyveVOL/sys/test  volmode   geom   local
gpart create -s gpt /dev/zvol/hostPsys/bhyveVOL/sys/test
gpart: arg0 'zvol/hostPsys/bhyveVOL/sys/test': Invalid argument

(fails unexpectedly)

What can I do to let geom(4) know that there's a new device?
And vice versa, changing the volmode property from "geom" to "dev" or "none"
doesn't have any effect either, until reboot.

Thanks,

-harry


bhyve ppt usage can cause severe RAM corruption [Was: Re: panic: Memory modified after free in zio_create, passthru in use]

2017-10-11 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 11.06.2017 12:37 (localtime):
> Regarding Harry Schmalzbauer's message of 06.06.2017 14:03 (localtime):
>>  Hello,
>>
>> suddenly, I'm getting this error:
>> /lib/libc.so.7: Undefined symbol "xdr_accepted_reply"
>>
>> Very mysterious: It showed up on a running system, which worked
>> flawlessly for some hours. And that host has root-fs (/) mounted
>> readonly from a memorydisk. So to my understanding, it's completely
>> impossible that /lib/libc.so.7 is corrupted since last boot.
>>
>> I'm completely out of ideas what could cause this strange error during
>> "normal" operation.
>>
>> Normal operation in this case is serving as a bhyve test machine.
>> I first noticed that error after one guest - with passthru device
>> attached - was shut down.
>>
>> My suspicion is some undiscovered passthru interference... Since I
>> noticed one other _very_ strange passthru-effect:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215740
> Hello,
>
> this time I caught a panic with a debugging kernel under 11.1-BETA1,
> which again occurred after shutting down a VM which had ppt in use:
>
…
> Please, can any of the experts add a comment?

It turned out to be a problem with PCIe cards which don't support
FLR, or with cards which are not PCIe, even if they have FLR capability.

jhb@ helped me to diagnose this.

Unfortunately I once forgot to manually bring down the passthrough NICs
in question, which resulted in a completely destroyed ZFS pool.
That hurt, so I won't rely on manual intervention before shutting down
(I had to recreate the complete (system) pool).
Unfortunately my skills don't allow me to help fix the root cause, so
I created a little rc(8) script, which should protect reliably.
Please see also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222937

Since it's quite small overhead, I'll also attach it here (to be copied
to /etc/rc.d).

-harry

#!/bin/sh
#

# PROVIDE: pciptdetach
# REQUIRE: swap
# BEFORE: devd
# KEYWORD: shutdown

. /etc/rc.subr

name=pciptdetach
rcvar=pciptdetach_enable

load_rc_config ${name}

: ${pciptdetach_enable:="YES"}

start_cmd="true"
stop_cmd="${name}"

pciptdetach()
{
    sysctl -n hw.hv_vendor | grep -q bhyve || return 0

    echo "Disabling passthrough adapters:"

    pptcandidate=`pciconf -l | grep -v -E \
      "^([[:blank:]]|hostb|virtio|isab)[^@]+" | sed -n -E \
      's/^[[:blank:]]*(^[[:alnum:]]+)@([^[:blank:]]+)(:[[:blank:]]).*$/\2/p'`

    for pcidev in ${pptcandidate}; do

        drv_class=`pciconf -lv | grep -A 3 "@${pcidev}" | sed -n -E -e \
          's/^[[:blank:]]*class[[:blank:]]+=[[:blank:]]+([^[:blank:]].*)$/\1/p' \
          -e 's/^([[:alnum:]]+)@.*$/\1/p' | tr '\n' ' '`

        # Don't disable mass storage devices, might be busy for shutdown
        [ X"${drv_class}" = X"${drv_class%mass storage*}" ] || continue

        # Make sure network adapters don't have active vlan(4) clones.
        if [ -z "${netstoped}" ] &&
            [ X"${drv_class}" != X"${drv_class%network*}" ]
        then
            /etc/rc.d/netif stop >/dev/null 2>&1 && netstoped=y
        fi

        # Non-PCIe devices and PCIe devices without FLR support are
        # known to cause RAM corruption.
        if ! pciconf -lc ${pcidev} | grep -A 20 PCI-Express |
            grep -q "[[:blank:]]FLR"
        then
            devctl disable ${pcidev} >/dev/null 2>&1 ||
                echo " ${drv_class%% *}:FAILED"
        fi

    done
}

run_rc_command "$1"
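
To check by hand whether a given device would be disabled by the script, the FLR test it uses can be pulled out into a small function. This is a sketch only: the name has_flr is made up, and it simply reuses the two grep stages from the script above on whatever `pciconf -lc` output is fed to it.

```sh
#!/bin/sh
# has_flr (hypothetical helper): succeed if the capability listing on
# stdin (as printed by `pciconf -lc <device>`) contains a PCI-Express
# capability line, or one of the 20 lines after it, mentioning FLR.
# Devices for which this fails are the ones the script disables.
has_flr() {
    grep -A 20 'PCI-Express' | grep -q '[[:blank:]]FLR'
}
```

Usage would be something like `pciconf -lc em0 | has_flr && echo "FLR supported"`, where em0 stands for whatever passthrough device is of interest.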

Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-03 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 03.10.2017 16:39 (localtime):
> Regarding Andriy Gapon's message of 03.10.2017 16:28 (localtime):
>> On 03/10/2017 17:19, Harry Schmalzbauer wrote:
>>> Have tried several different txg IDs, but the latest 5 or so lead to the
>>> panic and some other random picked all claim missing devices...
>>> Doh, if I only knew about -T some days ago, when I had all 4 devices
>>> available.
>> I don't think that the error is really about the missing devices.
>> Most likely the real problem is that you are going too far back in history 
>> where
>> the data required to import the pool is not present.  It's just that there 
>> is no
>> special error code to report that condition distinctly, so it gets 
>> interpreted
>> as a missing device condition.
> Sounds reasonable.
> When the RAM corruption happened, a live update was started, where
> several pool availability checks were done. No data write.
> The last data write was a few KBytes some minutes before the corruption, and
> the last significant amount written to that pool was a long time before that.
> So I still have hope to find an importable txg ID.
>
> Are they strictly serialized?

Seems so.
Just for the record, I couldn't recover any data yet, but in general,
if a pool isn't damaged that much, the following steps were the most
promising and got me closest:

I have attached dumps of the physical disks as md2 and md3.
'zpool import' offers
cetusPsys                  DEGRADED
  mirror-0                 DEGRADED
    8178308212021996317    UNAVAIL  cannot open
    md3                    ONLINE
  mirror-1                 DEGRADED
    md2p5                  ONLINE
    4036286347185017167    UNAVAIL  cannot open

Which is known to be corrupt.
This time I also attached zdb(8) dumps (sparse files) of the remaining
two disks, resp. partition.
Now import offers this:
   pool: cetusPsys
 id: 13207378952432032998
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

cetusPsys   ONLINE
  mirror-0  ONLINE
md5 ONLINE
md3 ONLINE
  mirror-1  ONLINE
md2p5   ONLINE
md4 ONLINE

'zdb -ue cetusPsys' showed me the latest txg ID (3757573 in my case).

So I decremented the txg ID by one and repeated until the following
fatal, panic-causing indicator vanished:
loading space map for vdev 1 of 2, metaslab 108 of 109 ...
WARNING: blkptr at 0x80e0ead00 has invalid CHECKSUM 1
WARNING: blkptr at 0x80e0ead00 has invalid COMPRESS 0
WARNING: blkptr at 0x80e0ead00 DVA 0 has invalid VDEV 2337865727
WARNING: blkptr at 0x80e0ead00 DVA 1 has invalid VDEV 289407040
WARNING: blkptr at 0x80e0ead00 DVA 2 has invalid VDEV 3959586324

Which was 'zdb -c -t 3757569 -AAA -e cetusPsys':

Traversing all blocks to verify metadata checksums and verify nothing
leaked ...

loading space map for vdev 1 of 2, metaslab 108 of 109 ...
89.0M completed (   6MB/s) estimated time remaining: 3hr 34min 47sec
zdb_blkptr_cb: Got error 122 reading <69, 0, 0, c>  -- skipping
86.8G completed ( 588MB/s) estimated time remaining: 0hr 00min 00sec   
Error counts:

errno  count
  122  1
leaked space: vdev 0, offset 0xa01084200, size 512
leaked space: vdev 0, offset 0xd0dc23c00, size 512
leaked space: vdev 0, offset 0x2380182200, size 3072
leaked space: vdev 0, offset 0x2380189a00, size 1536
leaked space: vdev 0, offset 0x2380183000, size 1536
leaked space: vdev 0, offset 0x238039a200, size 2560
leaked space: vdev 0, offset 0x238039be00, size 18944
leaked space: vdev 0, offset 0x23801b3200, size 9216
leaked space: vdev 0, offset 0x33122a8800, size 512
leaked space: vdev 1, offset 0x2808f1600, size 512
leaked space: vdev 1, offset 0x2808f1e00, size 512
leaked space: vdev 1, offset 0x2808f2e00, size 4096
leaked space: vdev 1, offset 0x2808f1a00, size 512
leaked space: vdev 1, offset 0x9010e6c00, size 512
leaked space: vdev 1, offset 0x23c5ad9c00, size 512
leaked space: vdev 1, offset 0x2e00ad4800, size 512
leaked space: vdev 1, offset 0x2f0030b200, size 50176
leaked space: vdev 1, offset 0x2f000ca800, size 512
leaked space: vdev 1, offset 0x2f003a9800, size 15360
leaked space: vdev 1, offset 0x2f003af600, size 13312
leaked space: vdev 1, offset 0x2f00715c00, size 1024
leaked space: vdev 1, offset 0x2f003adc00, size 6144
leaked space: vdev 1, offset 0x2f00363600, size 38912
block traversal size 93540302336 != alloc 93540473344 (leaked 171008)

bp count: 3670624
ganged count:   0
bp logical:96083156992  avg:  26176
bp physical:   93308853248  avg:  25420 compression:   1.03
bp allocated:  93540302336  avg:  25483 compression:   1.03
bp deduped: 0ref>1:  0   deduplication:   1.00
SPA allocated: 93540473344  
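
The walk-back over txg IDs described above can be sketched as a small loop. This is only an illustration under assumptions: the function names and the ZDB_CHECK hook are made up, the zdb(8) invocation is the one quoted above, and "clean" here merely means the "invalid CHECKSUM" warning no longer shows up in the output. Each zdb run traverses the whole pool, so this can take a long time.

```sh
#!/bin/sh
# Print candidate txg IDs, newest first: latest, latest-1, ...
txg_candidates() {
    latest=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ] && [ $((latest - i)) -gt 0 ]; do
        echo $((latest - i))
        i=$((i + 1))
    done
}

# Default check: the zdb(8) call from the message above.
zdb_check() {
    zdb -c -t "$2" -AAA -e "$1"
}

# Print the newest txg whose check output lacks the blkptr warning.
# ZDB_CHECK is a hypothetical hook to substitute another check command.
first_clean_txg() {
    pool=$1
    for txg in $(txg_candidates "$2" "$3"); do
        if ! ${ZDB_CHECK:-zdb_check} "$pool" "$txg" 2>&1 |
            grep -q 'invalid CHECKSUM'
        then
            echo "$txg"
            return 0
        fi
    done
    return 1
}
```

With the numbers from above, `first_clean_txg cetusPsys 3757573 10` would try 3757573 downwards and stop at the first txg without the warning. A zdb crash that prints no such warning would be misread as clean, so the result still needs a look by hand.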

Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-03 Thread Harry Schmalzbauer
Regarding Andriy Gapon's message of 03.10.2017 16:28 (localtime):
> On 03/10/2017 17:19, Harry Schmalzbauer wrote:
>> Have tried several different txg IDs, but the latest 5 or so lead to the
>> panic and some other random picked all claim missing devices...
>> Doh, if I only knew about -T some days ago, when I had all 4 devices
>> available.
> 
> I don't think that the error is really about the missing devices.
> Most likely the real problem is that you are going too far back in history 
> where
> the data required to import the pool is not present.  It's just that there is 
> no
> special error code to report that condition distinctly, so it gets interpreted
> as a missing device condition.

Sounds reasonable.
When the RAM corruption happened, a live update was started, where
several pool availability checks were done. No data write.
The last data write was a few KBytes some minutes before the corruption, and
the last significant amount written to that pool was a long time before that.
So I still have hope to find an importable txg ID.

Are they strictly serialized?

Do you know any other possibility to get data from a dataset (by
object ID) without importing the whole pool?

zdb successfully checks the one dataset (object ID) I'm interested in;
the rest of the pool isn't much of an issue...

Thank you very much for your help!

-harry

Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-03 Thread Harry Schmalzbauer
Regarding Andriy Gapon's message of 03.10.2017 11:20 (localtime):
> On 03/10/2017 11:43, Harald Schmalzbauer wrote:
> ...
>>  action: The pool can be imported despite missing or damaged devices.  The
>> fault tolerance of the pool may be compromised if imported.
> ...
>> Is it impossible to import degraded pools in general, or only together with
>> "-X -T"?
> It should be possible to import degraded pools...
> Perhaps the pool originally had more devices?  Like log devices.
> Or maybe there is some issue with the txg you picked.
>
> By the way, I think that you didn't have to provide -T option for -F or -X.
> It's either -F or -X or -T , the first two try to figure out txg
> automatically.  But I could be wrong.

You're right that -T works without F[X] flag, but -X needs -F.

Unfortunately not specifying a distinct txg leads to panic.
Specifying one leads to "device is missing".
Which is true, but only redundant data...

It's absolutely certain that this pool never had any log or cache or other
device than the two mirror vdevs.
zdb -l confirms that.

I have tried several different txg IDs, but the latest 5 or so lead to the
panic and some other randomly picked ones all claim missing devices...
Doh, if I had only known about -T some days ago, when I had all 4 devices
available.
I didn't expect problems due to missing redundant mirrors.

Can anybody imagine why degraded import doesn't work in my case and how
to work around it?

I will try to provide the sparse zdb dumps in addition, maybe that changes
anything. But I'm sure these don't have much data; dump time was within
3 seconds at most.

Thanks,

-harry


Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-02 Thread Harry Schmalzbauer
Regarding Andriy Gapon's message of 02.10.2017 13:49 (localtime):
> On 01/10/2017 00:38, Harry Schmalzbauer wrote:
>> Now my striped mirror has all 4 devices healthy available, but all
>> datasets seem to be lost.
>> No problem for 450G (99,9_%), but there's a 80M dataset which I'm really
>> missing :-(
> 
> If it's not too late now, you may try to experiment with an "unwind" / 
> "extreme
> unwind" import using -F -n / -X -n.  Or manually specifying a txg number for
> import (in read-only mode).

Thanks for your reply!

I had dumped one drive of each mirror, and attaching them as memory disks
works as intended.
So "zpool import" offers me the corrupt backup (on the host with an already
recreated pool).

Unfortunately my knowledge about ZFS internals (transaction group number
relations to (ü)uberblocks) doesn't allow me to follow your hint.

How can I determine the last txg#, resp. the ones before the last?
I guess 'zpool import -t' is the tool/parameter to use.
ZFS has wonderful documentation, but although this was a perfect reason
to start learning the details about my beloved ZFS, I don't have the
time to.

Is there a zdb(8) equivalent of 'zpool import -t', so I can issue the
zdb check, which doesn't crash the kernel but only zdb(8)?

For regular 'zpool import', 'zdb -ce' seems to be such a synonym. At
least the crash report is identical, see my reply to Scott Bennett's post.

Thanks,

-harry

Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-10-02 Thread Harry Schmalzbauer
Regarding Scott Bennett's message of 01.10.2017 15:20 (localtime):
>  On Sat, 30 Sep 2017 23:38:45 +0200 Harry Schmalzbauer 
> 
> wrote:

…
>>
>> OpenIndiana also panics at regular import.
>> Unfortunately I don't know the equivalent of vfs.zfs.recover in OI.
>>
>> panic[cpu1]/thread=ff06dafe8be0: blkptr at ff06dbe63000 has
>> invalid CHECKSUM 1
>>
>> Warning - stack not written to the dump buffer
>> ff001f67f070 genunix:vcmn_err+42 ()
>> ff001f67f0e0 zfs:zfs_panic_recover+51 ()
>> ff001f67f140 zfs:zfs_blkptr_verify+8d ()
>> ff001f67f220 zfs:zio_read+55 ()
>> ff001f67f310 zfs:arc_read+662 ()
>> ff001f67f370 zfs:traverse_prefetch_metadata+b5 ()
>> ff001f67f450 zfs:traverse_visitbp+1c3 ()
>> ff001f67f4e0 zfs:traverse_dnode+af ()
>> ff001f67f5c0 zfs:traverse_visitbp+6dd ()
>> ff001f67f720 zfs:traverse_impl+1a6 ()
>> ff001f67f830 zfs:traverse_pool+9f ()
>> ff001f67f8a0 zfs:spa_load_verify+1e6 ()
>> ff001f67f990 zfs:spa_load_impl+e1c ()
>> ff001f67fa30 zfs:spa_load+14e ()
>> ff001f67fad0 zfs:spa_load_best+7a ()
>> ff001f67fb90 zfs:spa_import+1b0 ()
>> ff001f67fbe0 zfs:zfs_ioc_pool_import+10f ()
>> ff001f67fc80 zfs:zfsdev_ioctl+4b7 ()
>> ff001f67fcc0 genunix:cdev_ioctl+39 ()
>> ff001f67fd10 specfs:spec_ioctl+60 ()
>> ff001f67fda0 genunix:fop_ioctl+55 ()
>> ff001f67fec0 genunix:ioctl+9b ()
>> ff001f67ff10 unix:brand_sys_sysenter+1c9 ()
>>
>> This is an important lesson.
>> My impression was that it's not possible to corrupt a complete pool, but
>> there's always a way to recover healthy/redundant data.
>> Now my striped mirror has all 4 devices healthy available, but all
>> datasets seem to be lost.
>> No problem for 450G (99,9_%), but there's a 80M dataset which I'm really
>> missing :-(
>>
>> Unfortunately I don't know the DVA and blkptr internals, so I won't
>> write a zfs_fsck(8) soon ;-)
>>
>> Does it make sense to dump the disks for further analysis?
>> I need to recreate the pool because I need the machine's resources... :-(
>> Any help highly appreciated!
>>
>  First, if it's not too late already, make a copy of the pool's cache file,
> and save it somewhere in case you need it unchanged again.
>  Can zdb(8) see it without causing a panic, i.e., without importing the
> pool?  You might be able to track down more information if zdb can get you in.

Thank you very much for your help.

zdb(8) is able to get all config data, along with all dataset information.

For the record, I'll provide zdb(8) output below.

In the meantime I recreated the pool and the host is back to life.
Since other pools weren't affected and had plenty of space, I dumped two
of the 4 drives along with the zdb(8) -x dump. I don't know what exactly
the latter dumps (all blocks accessed!?!); the result is a big sparse
file, but given the time it took to write, it can't contain anything but
metadata, at best.

Attaching the two native dumps as memory disks works for "zpool import" :-)
To be continued as an answer to Andriy Gapon's reply from today...
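
A rough sketch of the attach step, written as a dry run that only prints the
commands instead of executing them. The image file names are made up; the
commands themselves are plain mdconfig(8) with a read-only vnode backing and
a bare "zpool import", which only lists importable pools.

```sh
#!/bin/sh
# attach_cmds: print (not run) the commands for attaching raw disk dumps as
# read-only memory disks and then listing the importable pools.
# The image paths passed in are hypothetical examples.
attach_cmds() {
    for img in "$@"; do
        printf 'mdconfig -a -t vnode -o readonly -f %s\n' "$img"
    done
    printf 'zpool import\n'
}
```

For example, "attach_cmds /dumps/disk0.img /dumps/disk2.img" prints two
mdconfig lines followed by the zpool import line, which can be reviewed
before running anything against the dumps.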

>  Another thing you could try with an admittedly very low probability of
> working would be to try importing the pool with one drive of one mirror
> missing, then try it with a different drive of one mirror, and so on the minor
> chance that the critical error is limited to one drive.  If you find a case
> where that works, then you could try to rebuild the missing drive and then run
> a scrub.  Or vice versa.  This one is time-consuming, I would imagine, given

I did try, although I had no hope that this could change the picture,
since the cause of the inconsistency wasn't drive related.
And as expected, I had no luck.

Dataset mos [META], ID 0, cr_txg 4, 19.2M, 6503550977762669098 objects

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
         2    1   128K    512      0    512    0.00  DSL directory

Dataset mos [META], ID 0, cr_txg 4, 19.2M, 6503550977762669098 objects

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
         2    1   128K    512      0    512    0.00  DSL directory

loading space map for vdev 1 of 2, metaslab 108 of 109 ...
error: blkptr at 0x80d726040 has invalid CHECKSUM 1

Traversing all blocks to verify checksums and verify nothing leaked ...

Assertion failed: (!BP_IS_EMBEDDED(bp) || BPE_GET_ETYPE(bp) ==
BP_EMBEDDED_TYPE_DATA), file
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c,
line 5220.
loading space map for vdev 1 of 2, metaslab 108 of 109 ...
error: blkptr

Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-09-30 Thread Harry Schmalzbauer
 Regarding Harry Schmalzbauer's message of 30.09.2017 19:25 (localtime):
>  Regarding Harry Schmalzbauer's message of 30.09.2017 18:30 (localtime):
>>  Bad surprise.
>> Most likely I forgot to stop a PCIe passthrough NIC before shutting down
>> that (bhyve(8)) guest – jhb@ helped me identify this as the root
>> cause of the severe memory corruption I regularly had (on stable-11).
>>
>> This time, the corruption obviously affected ZFS's RAM area.
>>
>> What I didn't expect is the panic.
>> The machine has a memory disk as root, so luckily I can still boot (from
>> ZFS, –> mdpreload rootfs) into single user mode, but the early rc stage
>> (most likely mounting ZFS datasets) leads to the following panic:
>>
>> Trying to mount root from ufs:/dev/ufs/cetusROOT []...
>> panic: Solaris(panic): blkptr at 0xfe0005b6b000 has invalid CHECKSUM 1
>> cpuid = 1
>> KDB: stack backtrace:
>> #0 0x805e3837 at kdb_backtrace+0x67
>> #1 0x805a2286 at vpanic+0x186
>> #2 0x805a20f3 at panic+0x43
>> #3 0x81570192 at vcmn_err+0xc2
>> #4 0x812d7dda at zfs_panic_recover+0x5a
>> #5 0x812ff49b at zfs_blkptr_verify+0x8b
>> #6 0x812ff72c at zio_read+0x2c
>> #7 0x812761de at arc_read+0x6de
>> #8 0x81298b4d at traverse_prefetch_metadata+0xbd
>> #9 0x812980ed at traverse_visitbp+0x39d
>> #10 0x81298c27 at traverse_dnode+0xc7
>> #11 0x812984a3 at traverse_visitbp+0x753
>> #12 0x8129788b at traverse_impl+0x22b
>> #13 0x81297afc at traverse_pool+0x5c
>> #14 0x812cce06 at spa_load+0x1c06
>> #15 0x812cc302 at spa_load+0x1102
>> #16 0x812cac6e at spa_load_best+0x6e
>> #17 0x812c73a1 at spa_open_common+0x101
>> Uptime: 37s
>> Dumping 1082 out of 15733 MB:..2%..…
>> Dump complete
>> mps0: Sending StopUnit: path (xpt0:mps0:0:2:): handle 12
>> mps0: Incrementing SSU count
>> …
>>
>> Haven't done any scrub attempts yet – expectation is to get all datasets
>> of the striped mirror pool back...
>>
>> Any hints highly appreciated.
> Now it seems I'm in really big trouble.
> Regular import doesn't work (not even when booted from cd9660).
> I get all pools listed, but trying to import (unmounted) leads to the
> same panic as initially reported – because rc is just doing the same.
>
> I booted into single user mode (which works since the bootpool isn't
> affected and root is a memory disk from the bootpool)
> and set vfs.zfs.recover=1.
> But this time I don't even get the list of pools to import; 'zpool
> import' instantaneously leads to this panic:
>
> Solaris: WARNING: blkptr at 0xfe0005a8e000 has invalid CHECKSUM 1
> Solaris: WARNING: blkptr at 0xfe0005a8e000 has invalid COMPRESS 0
> Solaris: WARNING: blkptr at 0xfe0005a8e000 DVA 0 has invalid VDEV
> 2337865727
> Solaris: WARNING: blkptr at 0xfe0005a8e000 DVA 1 has invalid VDEV
> 289407040
> Solaris: WARNING: blkptr at 0xfe0005a8e000 DVA 2 has invalid VDEV
> 3959586324
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x50
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x812de904
> stack pointer   = 0x28:0xfe043f6bcbc0
> frame pointer   = 0x28:0xfe043f6bcbc0
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 44 (zpool)
> trap number = 12
> panic: page fault
> cpuid = 0

…

OpenIndiana also panics at regular import.
Unfortunately I don't know the equivalent of vfs.zfs.recover in OI.

panic[cpu1]/thread=ff06dafe8be0: blkptr at ff06dbe63000 has
invalid CHECKSUM 1

Warning - stack not written to the dump buffer
ff001f67f070 genunix:vcmn_err+42 ()
ff001f67f0e0 zfs:zfs_panic_recover+51 ()
ff001f67f140 zfs:zfs_blkptr_verify+8d ()
ff001f67f220 zfs:zio_read+55 ()
ff001f67f310 zfs:arc_read+662 ()
ff001f67f370 zfs:traverse_prefetch_metadata+b5 ()
ff001f67f450 zfs:traverse_visitbp+1c3 ()
ff001f67f4e0 zfs:traverse_dnode+af ()
ff001f67f5c0 zfs:traverse_visitbp+6dd ()
ff001f67f720 zfs:traverse_impl+1a6 ()
ff001f67f830 zfs:traverse_pool+9f ()
ff001f67f8a0 zfs:spa_load_verify+1e6 ()
ff001f67f990 zfs:spa_load_impl+e1c ()
ff001f67fa30 zfs:spa_load+14e ()
ff001f67fad0 zfs:spa_load_best+7a ()
ff001f67fb90 zfs:spa_import+1b0 ()
ff001f67fbe0 zfs:zfs_ioc_pool_import+10f ()
ff001f67fc80 zfs:zfsdev_ioctl+4b7 ()
ff001f67fcc0 genunix:cdev_ioctl+39 ()
ff001f67fd10 specfs:spec_ioctl+60 ()
ff001f67fda0 genunix:fop_ioctl+55 ()
ff001f67fec0 genunix:ioctl+9b ()
ff001f67ff10 unix:brand_sys_sysenter+1c9 ()

This is an important lesson.
My impression was that it's not possible to corrupt a complete pool, but
there's always a way to recover healthy/re

Re: panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-09-30 Thread Harry Schmalzbauer
 Regarding Harry Schmalzbauer's message of 30.09.2017 18:30 (localtime):
>  Bad surprise.
> Most likely I forgot to stop a PCIe passthrough NIC before shutting down
> that (bhyve(8)) guest – jhb@ helped me identify this as the root
> cause of the severe memory corruption I regularly had (on stable-11).
>
> This time, the corruption obviously affected ZFS's RAM area.
>
> What I didn't expect is the panic.
> The machine has a memory disk as root, so luckily I can still boot (from
> ZFS, –> mdpreload rootfs) into single user mode, but the early rc stage
> (most likely mounting ZFS datasets) leads to the following panic:
>
> Trying to mount root from ufs:/dev/ufs/cetusROOT []...
> panic: Solaris(panic): blkptr at 0xfe0005b6b000 has invalid CHECKSUM 1
> cpuid = 1
> KDB: stack backtrace:
> #0 0x805e3837 at kdb_backtrace+0x67
> #1 0x805a2286 at vpanic+0x186
> #2 0x805a20f3 at panic+0x43
> #3 0x81570192 at vcmn_err+0xc2
> #4 0x812d7dda at zfs_panic_recover+0x5a
> #5 0x812ff49b at zfs_blkptr_verify+0x8b
> #6 0x812ff72c at zio_read+0x2c
> #7 0x812761de at arc_read+0x6de
> #8 0x81298b4d at traverse_prefetch_metadata+0xbd
> #9 0x812980ed at traverse_visitbp+0x39d
> #10 0x81298c27 at traverse_dnode+0xc7
> #11 0x812984a3 at traverse_visitbp+0x753
> #12 0x8129788b at traverse_impl+0x22b
> #13 0x81297afc at traverse_pool+0x5c
> #14 0x812cce06 at spa_load+0x1c06
> #15 0x812cc302 at spa_load+0x1102
> #16 0x812cac6e at spa_load_best+0x6e
> #17 0x812c73a1 at spa_open_common+0x101
> Uptime: 37s
> Dumping 1082 out of 15733 MB:..2%..…
> Dump complete
> mps0: Sending StopUnit: path (xpt0:mps0:0:2:): handle 12
> mps0: Incrementing SSU count
> …
>
> Haven't done any scrub attempts yet – expectation is to get all datasets
> of the striped mirror pool back...
>
> Any hints highly appreciated.

Now it seems I'm in really big trouble.
Regular import doesn't work (not even when booted from cd9660).
I get all pools listed, but trying to import (unmounted) leads to the
same panic as initially reported – because rc is just doing the same.

I booted into single user mode (which works since the bootpool isn't
affected and root is a memory disk from the bootpool)
and set vfs.zfs.recover=1.
But this time I don't even get the list of pools to import; 'zpool
import' instantaneously leads to this panic:

Solaris: WARNING: blkptr at 0xfe0005a8e000 has invalid CHECKSUM 1
Solaris: WARNING: blkptr at 0xfe0005a8e000 has invalid COMPRESS 0
Solaris: WARNING: blkptr at 0xfe0005a8e000 DVA 0 has invalid VDEV
2337865727
Solaris: WARNING: blkptr at 0xfe0005a8e000 DVA 1 has invalid VDEV
289407040
Solaris: WARNING: blkptr at 0xfe0005a8e000 DVA 2 has invalid VDEV
3959586324


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x50
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x812de904
stack pointer   = 0x28:0xfe043f6bcbc0
frame pointer   = 0x28:0xfe043f6bcbc0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 44 (zpool)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0x805e3837 at kdb_backtrace+0x67
#1 0x805a2286 at vpanic+0x186
#2 0x805a20f3 at panic+0x43
#3 0x808a4922 at trap_fatal+0x322
#4 0x808a4979 at trap_pfault+0x49
#5 0x808a41f8 at trap+0x298
#6 0x80889fb1 at calltrap+0x8
#7 0x812e58a3 at vdev_mirror_child_select+0x53
#8 0x812e535e at vdev_mirror_io_start+0x2ee
#9 0x81303aa1 at zio_vdev_io_start+0x161
#10 0x8130054c at zio_execute+0xac
#11 0x812ffe7b at zio_nowait+0xcb
#12 0x812761f3 at arc_read+0x6f3
#13 0x81298b4d at traverse_prefetch_metadata+0xbd
#14 0x812980ed at traverse_visitbp+0x39d
#15 0x81298c27 at traverse_dnode+0xc7
#16 0x812984a3 at traverse_visitbp+0x753
#17 0x8129788b at traverse_impl+0x22b

Now I hope some ZFS guru can help me out. Needless to say, the
bits on this mirrored pool are important to me – no production data,
but lots of intermediate...

Thanks,

-harry


panic: Solaris(panic): blkptr invalid CHECKSUM1

2017-09-30 Thread Harry Schmalzbauer
 Bad surprise.
Most likely I forgot to stop a PCIe passthrough NIC before shutting down
that (bhyve(8)) guest – jhb@ helped me identify this as the root
cause of the severe memory corruption I regularly had (on stable-11).

This time, the corruption obviously affected ZFS's RAM area.

What I didn't expect is the panic.
The machine has a memory disk as root, so luckily I can still boot (from
ZFS, –> mdpreload rootfs) into single user mode, but the early rc stage
(most likely mounting ZFS datasets) leads to the following panic:

Trying to mount root from ufs:/dev/ufs/cetusROOT []...
panic: Solaris(panic): blkptr at 0xfe0005b6b000 has invalid CHECKSUM 1
cpuid = 1
KDB: stack backtrace:
#0 0x805e3837 at kdb_backtrace+0x67
#1 0x805a2286 at vpanic+0x186
#2 0x805a20f3 at panic+0x43
#3 0x81570192 at vcmn_err+0xc2
#4 0x812d7dda at zfs_panic_recover+0x5a
#5 0x812ff49b at zfs_blkptr_verify+0x8b
#6 0x812ff72c at zio_read+0x2c
#7 0x812761de at arc_read+0x6de
#8 0x81298b4d at traverse_prefetch_metadata+0xbd
#9 0x812980ed at traverse_visitbp+0x39d
#10 0x81298c27 at traverse_dnode+0xc7
#11 0x812984a3 at traverse_visitbp+0x753
#12 0x8129788b at traverse_impl+0x22b
#13 0x81297afc at traverse_pool+0x5c
#14 0x812cce06 at spa_load+0x1c06
#15 0x812cc302 at spa_load+0x1102
#16 0x812cac6e at spa_load_best+0x6e
#17 0x812c73a1 at spa_open_common+0x101
Uptime: 37s
Dumping 1082 out of 15733 MB:..2%..…
Dump complete
mps0: Sending StopUnit: path (xpt0:mps0:0:2:): handle 12
mps0: Incrementing SSU count
…

Haven't done any scrub attempts yet – expectation is to get all datasets
of the striped mirror pool back...

Any hints highly appreciated.

-harry

find(1)'s "newer" primary expression gives wrong results with symbolic links

2017-09-30 Thread Harry Schmalzbauer


Hello,

find(1)'s 'newer' primary expression has been broken with symbolic links
for a very long time.

Anyone who uses find for timestamp comparisons should pay special attention
to symbolic links.
The man page states for "-P" (which is the default) that it causes »the file
information and file type (see stat(2)) returned for each symbolic link to be
those of the link itself«.

That's not the case, -P doesn't work as expected, see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=222698
for details.
I hope someone with better C skills will find the root of the problem as soon
as possible.  I'm trying to find it myself, but I'll happily be proven not to
be the fastest ;-)
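
For anyone who wants to check their own system, here is a minimal fixture
sketch. It is not the reproduction from the PR, only an assumption of what a
simple test could look like; the file names target, ref and link are made up.
It gives a symlink its own, newer timestamp while the file it points to stays
older than the reference file, then asks find which links are newer.

```sh
#!/bin/sh
# newer_demo DIR: build a small fixture and show what find(1) reports.
# Layout (all names are made up for this sketch):
#   target  regular file, mtime 2017-01-01
#   ref     regular file, mtime 2017-06-01
#   link    symlink to target, with its own mtime set to 2017-09-01
newer_demo() {
    dir=$1
    mkdir -p "$dir" || return 1
    : > "$dir/target"
    : > "$dir/ref"
    ln -sf target "$dir/link"
    touch -t 201701010000 "$dir/target"
    touch -t 201706010000 "$dir/ref"
    touch -h -t 201709010000 "$dir/link"   # -h: change the link itself
    # Going by the man page text quoted above, the link's own timestamp
    # should be the one compared against "ref" under -P.
    find -P "$dir" -type l -newer "$dir/ref"
}
```

Running "newer_demo /tmp/newer-test" and comparing the output with the same
find invocation using -L instead of -P shows which timestamp the local
find(1) actually compares.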

-harry



Re: Any support creating a Windows Server 2012 unattended install

2017-07-13 Thread Harry Schmalzbauer
Regarding Paul Webster's message of 13.07.2017 10:22 (localtime):
> Ah ha we can now see installs, perfect thank you harry! just what I
> needed I thought we still had no way of seeing the install process

You can run anything that provides a UEFI x64 loader with VNC graphics,
thanks to the ongoing effort of many bhyve developers.
Thanks go to them!

-Harry

Re: EFI loader doesn't handle md_preload (md_image) correct?

2017-06-29 Thread Harry Schmalzbauer
 Regarding Harry Schmalzbauer's message of 16.05.2017 18:26 (localtime):
> B
…
 The issue is, that current UEFI implementation is using 64MB staging
 memory for loading the kernel and modules and files. When the boot is
 called, the relocation code will put the bits from staging area into the
 final places. The BIOS version does not need such staging area, and that
 will explain the difference.

 I actually have different implementation to address the same problem,
 but thats for illumos case, and will need some work to make it usable
 for freebsd; the idea is actually simple - allocate staging area per
 loaded file and relocate the bits into the place by component, not as
 continuous large chunk (this would also allow to avoid the mines like
 planted by hyperv;), but right now there is no very quick real solution
 other than just build efi loader with larger staging size.
>>> I see, thanks for the explanation.
>>> While I'm aware of neither the purpose of the staging area nor the
>>> consequences of enlarging it, do you think it's feasible to increase it
>>> to 768MiB?
>>>
>>> At least now I have an idea about the issue and an explanation why
>>> reducing md_image to 100MB hasn't helped – still more than 64...
>>>
>>> Any quick hint where to define the staging area size is highly appreciated,
>>> if there are no hard objections against a 768MB size.
>>>
>>> -harry
>> The problem is that before UEFI Boot Services are not switched off, the 
>> memory is managed (and owned) by the firmware,
> Hmm, I've been expecting something like that (owned by firmware) ;-)
>
> So I'll stay with CSM for now, and will happily be an early adopter if
> you need someone to try anything (-stable mergable).

Toomas, thanks for your help so far! I'm just curious if there's news on
this.
Was there a decision made whether kernel should be utilized to relocate
the MD image modules or the loader should be extended to handle
(x-)large staging areas?

I'd like to switch back to UEFI booting for various reasons (consistency
being the top priority), but can't since it breaks md-rootfs on that
machine (the others still run ESXi).

If there's anything to test, please let me know.

Thanks,

-harry


[LOR] 11.1-BETA1 dev/md/md.c <-> vm/vm_pager.c - lost benign matchlist

2017-06-12 Thread Harry Schmalzbauer
 Dear hackers,

I couldn't find an up-to-date list of benign LOR reports.  Since
ffs/vfs.. LORs, which I don't understand, have diluted my attention over
time, I'm not aware of their actual importance these days
(without a panic).
Here is one, happening on 11.1-BETA1, for which I haven't found a previous
report:
lock order reversal:
 1st 0xfe03bca22c10 bufwait (bufwait) @ /usr/local/share/deploy-tools/RELENG_11/src/sys/vm/vm_pager.c:370
 2nd 0xf8001f9ff068 ufs (ufs) @ /usr/local/share/deploy-tools/RELENG_11/src/sys/dev/md/md.c:942
stack backtrace:
#0 0x805e79b0 at witness_debugger+0x70
#1 0x805e78a3 at witness_checkorder+0xe23
#2 0x805621d5 at __lockmgr_args+0x875
#3 0x8081c215 at ffs_lock+0xa5
#4 0x808d78d0 at VOP_LOCK1_APV+0xe0
#5 0x8065531a at _vn_lock+0x6a
#6 0x803fb948 at mdstart_vnode+0x438
#7 0x803f9f4d at md_kthread+0x19d
#8 0x8054ea34 at fork_exit+0x84
#9 0x80864c0e at fork_trampoline+0xe

-harry


panic: Memory modified after free in zio_create, passthru in use [Was: 11.1-pre runtime Undefined symbol "xdr_accepted_reply" /lib/libc.so.7]

2017-06-11 Thread Harry Schmalzbauer
 Regarding Harry Schmalzbauer's message of 06.06.2017 14:03 (localtime):
>  Hello,
>
> suddenly, I'm getting this error:
> /lib/libc.so.7: Undefined symbol "xdr_accepted_reply"
>
> Very mysterious: It showed up on a running system, which worked
> flawlessly for some hours. And that host has root-fs (/) mounted
> readonly from a memorydisk. So to my understanding, it's completely
> impossible that /lib/libc.so.7 is corrupted since last boot.
>
> I'm completely out of ideas what could cause this strange error during
> "normal" operation.
>
> Normal operation in this case is serving as a bhyve test machine.
> I first noticed that error after one guest - with passthru device
> attached - was shut down.
>
> My suspicion is some undiscovered passthru interference... Since I
> noticed one other _very_ strange passthru-effect:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215740

Hello,

this time I caught a panic with a debugging kernel under 11.1-BETA1,
which again occurred after shutting down a VM which had ppt in use:
cpuid = 5
KDB: stack backtrace:
#0 0x805bf327 at kdb_backtrace+0x67
#1 0x8057f266 at vpanic+0x186
#2 0x8057f2e3 at panic+0x43
#3 0x8082eaeb at trash_ctor+0x4b
#4 0x8082aaec at uma_zalloc_arg+0x52c
#5 0x813b54a6 at zio_add_child+0x26
#6 0x813b5a05 at zio_create+0x385
#7 0x813b6de2 at zio_vdev_child_io+0x232
#8 0x81396be0 at vdev_mirror_io_start+0x370
#9 0x813bc629 at zio_vdev_io_start+0x4a9
#10 0x813b76bc at zio_execute+0x36c
#11 0x813b6868 at zio_nowait+0xb8
#12 0x81396bec at vdev_mirror_io_start+0x37c
#13 0x813bc383 at zio_vdev_io_start+0x203
#14 0x813b76bc at zio_execute+0x36c
#15 0x805d10dd at taskqueue_run_locked+0x13d
#16 0x805d1e78 at taskqueue_thread_loop+0x88
#17 0x80543844 at fork_exit+0x84

#0  doadump (textdump=) at pcpu.h:222
#1  0x8057ece0 in kern_reboot (howto=260) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:366
#2  0x8057f2a0 in vpanic (fmt=, ap=)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:759
#3  0x8057f2e3 in panic (fmt=) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:690
#4  0x8082eaeb in trash_ctor (mem=,
size=, arg=, flags=)
at /usr/local/share/deploy-tools/RELENG_11/src/sys/vm/uma_dbg.c:80
#5  0x8082aaec in uma_zalloc_arg (zone=0xf8001febc680,
udata=0xf8001ad5f340, flags=)
at /usr/local/share/deploy-tools/RELENG_11/src/sys/vm/uma_core.c:2152
#6  0x813b54a6 in zio_add_child (pio=0xf8026f350b88,
cio=0xf8002478b7b0)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:460
#7  0x813b5a05 in zio_create (pio=0xf8026f350b88, spa=, txg=433989, bp=,
data=0xfe0058afa000,
size=1024, type=,
priority=ZIO_PRIORITY_ASYNC_WRITE, flags=,
vd=,
offset=, zb=,
pipeline=)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:690
#8  0x813b6de2 in zio_vdev_child_io (pio=0xf8026f350b88,
bp=, vd=, offset=325398016,
data=, size=1024, type=,
flags=1048704, done=)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1141
#9  0x81396be0 in vdev_mirror_io_start (zio=0xf8026f350b88)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:488
#10 0x813bc629 in zio_vdev_io_start (zio=0xf8026f350b88)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3143
#11 0x813b76bc in zio_execute (zio=)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1681
#12 0x813b6868 in zio_nowait (zio=0xf8026f350b88)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1739
#13 0x81396bec in vdev_mirror_io_start (zio=0xf8026f7a7b88)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:488
#14 0x813bc383 in zio_vdev_io_start (zio=0xf8026f7a7b88)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3021
#15 0x813b76bc in zio_execute (zio=)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1681
#16 0x805d10dd in taskqueue_run_locked
(queue=0xf8001ab5a700) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_taskqueue.c:454
#17 0x805d1e78 in taskqueue_thread_loop (arg=) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/subr_taskqueue.c:741
#18 0x80543844 in fork_exit (callout=0x805d1df0
, arg=0xf8001aa90720, frame=0xfe043f609ac0)
at /usr/l

11.1-pre runtime Undefined symbol "xdr_accepted_reply" /lib/libc.so.7

2017-06-06 Thread Harry Schmalzbauer
 Hello,

suddenly, I'm getting this error:
/lib/libc.so.7: Undefined symbol "xdr_accepted_reply"

Very mysterious: It showed up on a running system, which worked
flawlessly for some hours. And that host has root-fs (/) mounted
readonly from a memorydisk. So to my understanding, it's completely
impossible that /lib/libc.so.7 is corrupted since last boot.

I'm completely out of ideas what could cause this strange error during
"normal" operation.

Normal operation in this case is serving as a bhyve test machine.
I first noticed that error after one guest - with passthru device
attached - was shut down.

My suspicion is some undiscovered passthru interference... Since I
noticed one other _very_ strange passthru-effect:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215740

Thanks for any hints,

-harry


Re: [ports] r438901 causes PACKAGES= issues

2017-06-04 Thread Harry Schmalzbauer
 Regarding Harry Schmalzbauer's message of 04.06.2017 17:00 (localtime):
>  Regarding Harry Schmalzbauer's message of 22.05.2017 12:51 (localtime):
>> Regarding Julian Elischer's message of 22.05.2017 09:52 (localtime):
>>> On 22/5/17 3:04 pm, Harry Schmalzbauer wrote:
>>>>   Regarding Harry Schmalzbauer's message of 21.05.2017 20:25 (localtime):
>>>>>   Mk/bsd.port.mk still tells:
>>>>> # PACKAGES  - A top level directory where all packages go (rather than
>>>>> # going locally to each port).
>>>>> # Default: ${PORTSDIR}/packages
>>>>>
>>>>> Since r438901 (
>>>>> https://svnweb.freebsd.org/ports?view=revision&sortby=date&revision=438901
>>>>> )
>>>> Actually, r438058 broke PACKAGES. For the records, see
>>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218827
>>>>
>>>> ___
>>>> freebsd-stable@freebsd.org mailing list
>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>>>>
>>>>
>>> has this been unbroken?   We use this feature but are not on the head of
>>> the tree yet..
>> Nope, not fixed yet and I guess it won't happen, from what I read.
>>
>> Reverting r438901 and r438058 locally is a suitable solution at the
>> moment, but this is going to change soon I fear. The commits seem to be
>> required to make ports pkg/poudriere compatible.
> My assumption was wrong, it has been "fixed" meanwhile – by emitting
> PKGFILE with escaped colons. Great, breaks scripts again here.

Scripts of the ports infrastructure itself are also still broken after the
fix in r441712, so USE_PACKAGE_DEPENDS doesn't work at the moment (if one
uses PACKAGES with colons).
See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219780

-harry

Re: [ports] r438901 causes PACKAGES= issues

2017-06-04 Thread Harry Schmalzbauer
 Regarding Harry Schmalzbauer's message of 22.05.2017 12:51 (localtime):
> Regarding Julian Elischer's message of 22.05.2017 09:52 (localtime):
>> On 22/5/17 3:04 pm, Harry Schmalzbauer wrote:
>>>   Regarding Harry Schmalzbauer's message of 21.05.2017 20:25 (localtime):
>>>>   Mk/bsd.port.mk still tells:
>>>> # PACKAGES  - A top level directory where all packages go (rather than
>>>> # going locally to each port).
>>>> # Default: ${PORTSDIR}/packages
>>>>
>>>> Since r438901 (
>>>> https://svnweb.freebsd.org/ports?view=revision&sortby=date&revision=438901
>>>> )
>>> Actually, r438058 broke PACKAGES. For the records, see
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218827
>>>
>>> ___
>>> freebsd-stable@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>>>
>>>
>> has this been unbroken?   We use this feature but are not on the head of
>> the tree yet..
> Nope, not fixed yet and I guess it won't happen, from what I read.
>
> Reverting r438901 and r438058 locally is a suitable solution at the
> moment, but this is going to change soon I fear. The commits seem to be
> required to make ports pkg/poudriere compatible.

My assumption was wrong, it has been "fixed" meanwhile – by emitting
PKGFILE with escaped colons. Great, breaks scripts again here.
No discussion, no approvals... I wish someone could migrate ports/Mk
into base and freeze it.
"make clean" seems to be fundamentally changed with not yet discovered
side effects. Local scripts don't work as expected anymore. Great! Take
poudriere and unlink -R your own stuff or stay away from ports... I
can't believe how ports evolved during the last years :-(
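
For local scripts that consume the PKGFILE value, a small workaround sketch
is to strip the escaping before using the path. This assumes the escaping is
a backslash in front of each colon, as described above; the function name
and the example path are made up.

```sh
#!/bin/sh
# unescape_colons: turn every "\:" in the argument back into a plain ":".
# Assumption: colons are escaped with a single preceding backslash.
unescape_colons() {
    printf '%s\n' "$1" | sed 's/\\:/:/g'
}
```

A script could then use pkgfile=$(unescape_colons "$raw") on whatever the
ports framework emitted, leaving paths without escaped colons untouched.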
 
-harry


ifconfig(4) name and tap(4)'s character special device name

2017-06-02 Thread Harry Schmalzbauer
 Hello,

renaming vmnet/tap(4) interfaces, defined in rc.conf(5) via
"cloned_interfaces" for example, isn't prohibited by rc(8)-network.subr
nor by ifconfig(8).
If such an interface is renamed, the Ethernet device shows up correctly
with the new name, and ifconfig(8) also reports the new name if it's
created and renamed in the same invocation.
The problem is that the control device isn't renamed; it keeps its initial
creation name like tap0, and I found no userland way to determine the
corresponding Ethernet interface name.

Several solutions come to my mind:
– Prohibit renaming of Ethernet-group tap (and tun?) interface types.
– Extend ifconfig(8) to alter the character special device name.
Either rename or create symlink?
– Let rc(8)-network.subr alter character special device name.
Either rename or create symlink?

Has anybody else thought about that problem?
The last mentioned possibility of course wouldn't cover manual CLI
renaming, so most likely it isn't feasible.

On the other hand, there's some magic I haven't discovered yet:
Defining ifconfig_bridge0_name="br0" in rc.conf(5), but _not_ listing
bridge0 in cloned_interfaces, leads to run-time auto-renaming as soon as
I manually invoke 'ifconfig bridge0 create' (the result is that ifconfig(8)
lists br0, although I haven't invoked 'ifconfig bridge0 name br0')!
I haven't found any traces in devd.conf(5) and haven't inspected
rc(8)-network.subr carefully enough to know why/how that happens!
But maybe this method could also be utilized to handle the
vmnet/tap-character-special-device renaming?
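
To make the scenario concrete, a minimal rc.conf(5) fragment along these
lines should be enough to end up with the mismatch; the new name "vm0" is
just an example.

```sh
# /etc/rc.conf fragment (sketch; the name "vm0" is made up)
cloned_interfaces="tap0"
ifconfig_tap0_name="vm0"    # Ethernet side becomes vm0, /dev/tap0 keeps its name
```

After boot, ifconfig(8) lists vm0 while the control device is still
/dev/tap0, so anything opening the character device has to know the
original name.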

-harry

Re: Update netmap for 11.1-RELEASE

2017-05-27 Thread Harry Schmalzbauer
 Regarding George Amanakis via freebsd-stable's message of 24.05.2017
19:09 (localtime):
> Regarding the upcoming 11.1-RELEASE:
> Could somebody update netmap from CURRENT to STABLE, so that it would make it 
> into 11.1-RELEASE? 
> I would really like to see ptnet and ptnetmap in 11.1-RELEASE.

Can't help regarding 11.1-RELEASE, but if you're looking for pt_netmap
on stable/11, I prepared a private MFC:
ftp://ftp.omnilan.de/pub/FreeBSD/OmniLAN/misc/MFC-netmap-to-11.1-prerelease.diff

I hope I found all related commits, but I've done it by hand; no idea if
there's a way to let svn do the same job...
List of included revisions:
r306772, r307394-r307396, r307569, r307572, r307574, r307703, r307706,
r307728, r308000, r308038, r309306, r310822, r311045, r311986, r313747,
r314915.
No guarantee that I haven't missed something!
No tests regarding pt_netmap done yet, I just wanted to have a better
testing platform, since there are outstanding vale-related problems (in the
native stable/11 netmap version, as well as in the updated version, but
developers tend to work with the latest code, so I thought it might be easier
to get help with an updated code base).

-harry



Re: [ports] r438901 causes PACKAGES= issues

2017-05-22 Thread Harry Schmalzbauer
Regarding Julian Elischer's message of 22.05.2017 09:52 (localtime):
> On 22/5/17 3:04 pm, Harry Schmalzbauer wrote:
>> Regarding Harry Schmalzbauer's message of 21.05.2017 20:25
>> (localtime):
>>> Mk/bsd.port.mk still says:
>>> # PACKAGES  - A top level directory where all packages go (rather than
>>> #             going locally to each port).
>>> #             Default: ${PORTSDIR}/packages
>>>
>>> Since r438901 (
>>> https://svnweb.freebsd.org/ports?view=revision&sortby=date&revision=438901
>>>
>>> )
>> Actually, r438058 broke PACKAGES. For the record, see
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218827
>>
>>
>>
> has this been unbroken?   We use this feature but are not on the head of
> the tree yet..

Nope, not fixed yet and I guess it won't happen, from what I read.

Reverting r438901 and r438058 locally is a suitable solution at the
moment, but this is going to change soon I fear. The commits seem to be
required to make ports pkg/poudriere compatible.

-harry

Re: [ports] r438901 causes PACKAGES= issues

2017-05-22 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 21.05.2017 20:25 (localtime):
> Mk/bsd.port.mk still says:
> # PACKAGES  - A top level directory where all packages go (rather than
> #             going locally to each port).
> #             Default: ${PORTSDIR}/packages
>
> Since r438901 (
> https://svnweb.freebsd.org/ports?view=revision&sortby=date&revision=438901
> )

Actually, r438058 broke PACKAGES. For the record, see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218827



[ports] r438901 causes PACKAGES= issues

2017-05-21 Thread Harry Schmalzbauer
Mk/bsd.port.mk still says:
# PACKAGES  - A top level directory where all packages go (rather than
#             going locally to each port).
#             Default: ${PORTSDIR}/packages

Since r438901 (
https://svnweb.freebsd.org/ports?view=revision&sortby=date&revision=438901
)
when PACKAGES is defined, 'make -VPKGFILE' leads to:
make[1]: "/usr/ports/Mk/bsd.port.mk" line 5118: warning: duplicate script for target "/mnt/pkg/ivybridge/FreeBSD" ignored
make[1]: "/usr/ports/Mk/bsd.port.mk" line 3288: warning: using previous script for "/mnt/pkg/ivybridge/FreeBSD" defined here
make: "/usr/ports/Mk/bsd.port.mk" line 5118: warning: duplicate script for target "/mnt/pkg/ivybridge/FreeBSD" ignored
make: "/usr/ports/Mk/bsd.port.mk" line 3288: warning: using previous script for "/mnt/pkg/ivybridge/FreeBSD" defined here
/mnt/pkg/ivybridge/FreeBSD:11:amd64/All/zxfer-1.1.6.txz

In that case, the ':' is the culprit, but I think this is a completely
legal path name and support for it should be fixed, especially
considering pkg definitions...
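
As a rough illustration of why the ':' hurts: the warnings above name
"/mnt/pkg/ivybridge/FreeBSD" as the target, which is what remains if the
path is cut at its first ':'. The sketch below only mimics that cut in
sh; make_target_prefix is a hypothetical helper, not something from
bsd.port.mk, and the exact parsing inside make(1) is assumed, not verified.

```sh
#!/bin/sh
# Illustration only: show which part of a path would be taken as the
# target name if a dependency line is split at the first ':'.
make_target_prefix() {
    # ${1%%:*} strips the longest suffix starting at a ':'.
    printf '%s\n' "${1%%:*}"
}

make_target_prefix "/mnt/pkg/ivybridge/FreeBSD:11:amd64/All/zxfer-1.1.6.txz"
# prints /mnt/pkg/ivybridge/FreeBSD
```

A check like this could be used to warn early when PACKAGES contains a ':'.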

Thanks,

-harry


Re: EFI loader doesn't handle md_preload (md_image) correct?

2017-05-16 Thread Harry Schmalzbauer
Regarding Toomas Soome's message of 16.05.2017 18:20 (localtime):
> 
>> On 16 May 2017, at 19:13, Harry Schmalzbauer wrote:
>>
>> Regarding Toomas Soome's message of 16.05.2017 18:00 (localtime):
>>>
>>>> On 16 May 2017, at 18:45, Harry Schmalzbauer <free...@omnilan.de> wrote:
>>>>
>>>> Regarding Harry Schmalzbauer's message of 16.05.2017 17:28 (localtime):
>>>>> Regarding Toomas Soome's message of 16.05.2017 16:57 (localtime):
>>>>>>> On 16 May 2017, at 17:55, Harry Schmalzbauer <free...@omnilan.de> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> unfortunately I had some trouble with my preferred MFS-root setups.
>>>>>>> It seems EFI loader doesn't handle type md_image correctly.
>>>>>>>
>>>>>>> If I load any md_image with loader invoked by gptboot or gptzfsboot,
>>>>>>> 'lsmod'
>>>>>>> shows "elf kernel", "elf obj module(s)" and "md_image".
>>>>>>>
>>>>>>> Using the same loader.conf, but the EFI loader, the md_image file is
>>>>>>> prompted and seems to be loaded, but not registered.  There's no
>>>>>>> md_image with 'lsmod', hence it's not astonishing that the kernel
>>>>>>> doesn't attach md0, so booting fails since there's no rootfs.
>>>>>>>
>>>>>>> Any help highly appreciated, hope Toomas doesn't mind being
>>>>>>> initially CC'd.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -harry
>>>>>>
>>>>>> The first question is, how large is the md_image and what other
>>>>>> modules are loaded?
>>>>> Thanks for your quick response.
>>>>>
>>>>> The images are 50-500MB uncompressed (provided by gzip compressed file).
>>>>> A small number of ELF modules (5), each ~50kB.
>>>>
>>>> On the real HW, there's vmm and some more:
>>>> Id Refs Address      Size Name
>>>>  1   46 0x8020        16M kernel
>>>>  2    1 0x8121d000    86K unionfs.ko
>>>>  3    1 0x81233000   3.1M zfs.ko
>>>>  4    2 0x81545000    51K opensolaris.ko
>>>>  5    7 0x81552000   279K usb.ko
>>>>  6    1 0x81598000    67K ukbd.ko
>>>>  7    1 0x815a9000    51K umass.ko
>>>>  8    1 0x815b6000    46K aesni.ko
>>>>  9    1 0x815c3000    54K uhci.ko
>>>> 10    1 0x815d1000    65K ehci.ko
>>>> 11    1 0x815e2000    15K cc_htcp.ko
>>>> 12    1 0x815e6000   3.4M vmm.ko
>>>> 13    1 0xa3a21000    12K ums.ko
>>>> 14    1 0xa3a24000   9.1K uhid.ko
>>>>
>>>> Providing md_image uncompressed doesn't change anything.
>>>>
>>>> Will deploy a /usr separated rootfs, which is only ~100MB uncompressed
>>>> and see if that changes anything.
>>>> That's all I can provide, code is far beyond my knowledge...
>>>>
>>>> -harry
>>>
>>>
>>> The issue is, that current UEFI implementation is using 64MB staging
>>> memory for loading the kernel and modules and files. When the boot is
>>> called, the relocation code will put the bits from staging area into the
>>> final places. The BIOS version does not need such staging area, and that
>>> will explain the difference.
>>>
>>> I actually have different implementation to address the same problem,
>>> but thats for illumos case, and will need some work to make it usable
>>> for freebsd; the idea is actually simple - allocate staging area per
>>> loaded file and relocate the bits into the place by component, not as
>>> continuous large chunk (this would also allow to avoid the mines like
>>> planted by hyperv;), but right now there is no very quick real solution
>>> other than just build efi loader with larger staging size.
>>
>> I see, thanks for the explanation.
>> While I'm not aware of the purpose of the staging area nor the
>> consequences of enlarging it, do you think it's feasible to increase it
>> to 768MiB?
>>
>> At least now I have an idea about the issue and an explanation why
>> reducing the md_image to 100MB hasn't helped – still more than 64...
>>
>> Any quick hint where to define the staging area size highly appreciated,
>> if there are no hard objections against a 768MB size.
>>
>> -harry
> 
> The problem is that as long as UEFI Boot Services are not switched off, the 
> memory is managed (and owned) by the firmware,

Hmm, I've been expecting something like that (owned by firmware) ;-)

So I'll stay with CSM for now, and will happily be an early adopter if
you need someone to try anything (-stable mergeable).

Thanks,

-harry

Re: EFI loader doesn't handle md_preload (md_image) correct?

2017-05-16 Thread Harry Schmalzbauer
Regarding Toomas Soome's message of 16.05.2017 18:00 (localtime):
> 
>> On 16 May 2017, at 18:45, Harry Schmalzbauer <free...@omnilan.de> wrote:
>>
>> Regarding Harry Schmalzbauer's message of 16.05.2017 17:28 (localtime):
>>> Regarding Toomas Soome's message of 16.05.2017 16:57 (localtime):
>>>>> On 16 May 2017, at 17:55, Harry Schmalzbauer <free...@omnilan.de> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> unfortunately I had some trouble with my preferred MFS-root setups.
>>>>> It seems EFI loader doesn't handle type md_image correctly.
>>>>>
>>>>> If I load any md_image with loader invoked by gptboot or gptzfsboot,
>>>>> 'lsmod'
>>>>> shows "elf kernel", "elf obj module(s)" and "md_image".
>>>>>
>>>>> Using the same loader.conf, but the EFI loader, the md_image file is
>>>>> prompted and seems to be loaded, but not registered.  There's no
>>>>> md_image with 'lsmod', hence it's not astonishing that the kernel
>>>>> doesn't attach md0, so booting fails since there's no rootfs.
>>>>>
>>>>> Any help highly appreciated, hope Toomas doesn't mind being
>>>>> initially CC'd.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -harry
>>>>
>>>> The first question is, how large is the md_image and what other
>>>> modules are loaded?
>>> Thanks for your quick response.
>>>
>>> The images are 50-500MB uncompressed (provided by gzip compressed file).
>>> A small number of ELF modules (5), each ~50kB.
>>
>> On the real HW, there's vmm and some more:
>> Id Refs Address      Size Name
>>  1   46 0x8020        16M kernel
>>  2    1 0x8121d000    86K unionfs.ko
>>  3    1 0x81233000   3.1M zfs.ko
>>  4    2 0x81545000    51K opensolaris.ko
>>  5    7 0x81552000   279K usb.ko
>>  6    1 0x81598000    67K ukbd.ko
>>  7    1 0x815a9000    51K umass.ko
>>  8    1 0x815b6000    46K aesni.ko
>>  9    1 0x815c3000    54K uhci.ko
>> 10    1 0x815d1000    65K ehci.ko
>> 11    1 0x815e2000    15K cc_htcp.ko
>> 12    1 0x815e6000   3.4M vmm.ko
>> 13    1 0xa3a21000    12K ums.ko
>> 14    1 0xa3a24000   9.1K uhid.ko
>>
>> Providing md_image uncompressed doesn't change anything.
>>
>> Will deploy a /usr separated rootfs, which is only ~100MB uncompressed
>> and see if that changes anything.
>> That's all I can provide, code is far beyond my knowledge...
>>
>> -harry
> 
> 
> The issue is, that current UEFI implementation is using 64MB staging
> memory for loading the kernel and modules and files. When the boot is
> called, the relocation code will put the bits from staging area into the
> final places. The BIOS version does not need such staging area, and that
> will explain the difference.
> 
> I actually have different implementation to address the same problem,
> but thats for illumos case, and will need some work to make it usable
> for freebsd; the idea is actually simple - allocate staging area per
> loaded file and relocate the bits into the place by component, not as
> continuous large chunk (this would also allow to avoid the mines like
> planted by hyperv;), but right now there is no very quick real solution
> other than just build efi loader with larger staging size.

I see, thanks for the explanation.
While I'm not aware of the purpose of the staging area nor the
consequences of enlarging it, do you think it's feasible to increase it
to 768MiB?

At least now I have an idea about the issue and an explanation why
reducing the md_image to 100MB hasn't helped – still more than 64...

Any quick hint where to define the staging area size highly appreciated,
if there are no hard objections against a 768MB size.
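
The arithmetic behind "still more than 64" can be checked with a small
sketch. It assumes, as described in the quoted explanation, that the kernel,
the modules and the md_image all have to fit into one 64MB staging area;
fits_in_staging is a hypothetical helper and only understands the K and M
suffixes used in the module listing.

```sh
#!/bin/sh
# Sum sizes given with a K or M suffix and compare the total against a
# staging-area limit in MB (first argument).
fits_in_staging() {
    limit_mb=$1
    shift
    printf '%s\n' "$@" | awk -v limit="$limit_mb" '
        /K$/ { sub(/K$/, ""); total += $0 / 1024; next }
        /M$/ { sub(/M$/, ""); total += $0; next }
        END  { printf "%.1f MB of %d MB: %s\n", total, limit,
                      (total <= limit ? "fits" : "too large") }'
}

# Kernel, the two largest modules and a 100MB md_image against 64MB:
fits_in_staging 64 16M 3.1M 3.4M 100M
```

Under that assumption a 100MB image cannot fit regardless of the modules,
which matches the observed behaviour.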

-harry

Re: EFI loader doesn't handle md_preload (md_image) correct?

2017-05-16 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 16.05.2017 17:28 (localtime):
> Regarding Toomas Soome's message of 16.05.2017 16:57 (localtime):
>>> On 16 May 2017, at 17:55, Harry Schmalzbauer wrote:
>>>
>>> Hello,
>>>
>>> unfortunately I had some trouble with my preferred MFS-root setups.
>>> It seems EFI loader doesn't handle type md_image correctly.
>>>
>>> If I load any md_image with loader invoked by gptboot or gptzfsboot,
>>> 'lsmod'
>>> shows "elf kernel", "elf obj module(s)" and "md_image".
>>>
>>> Using the same loader.conf, but the EFI loader, the md_image file is
>>> prompted and seems to be loaded, but not registered.  There's no md_image
>>> with 'lsmod', hence it's not astonishing that the kernel doesn't attach md0,
>>> so booting fails since there's no rootfs.
>>>
>>> Any help highly appreciated, hope Toomas doesn't mind being initially CC'd.
>>>
>>> Thanks,
>>>
>>> -harry
>>
>> The first question is, how large is the md_image and what other modules are 
>> loaded?
> Thanks for your quick response.
>
> The images are 50-500MB uncompressed (provided by gzip compressed file).
> A small number of ELF modules (5), each ~50kB.

On the real HW, there's vmm and some more:
Id Refs Address      Size Name
 1   46 0x8020        16M kernel
 2    1 0x8121d000    86K unionfs.ko
 3    1 0x81233000   3.1M zfs.ko
 4    2 0x81545000    51K opensolaris.ko
 5    7 0x81552000   279K usb.ko
 6    1 0x81598000    67K ukbd.ko
 7    1 0x815a9000    51K umass.ko
 8    1 0x815b6000    46K aesni.ko
 9    1 0x815c3000    54K uhci.ko
10    1 0x815d1000    65K ehci.ko
11    1 0x815e2000    15K cc_htcp.ko
12    1 0x815e6000   3.4M vmm.ko
13    1 0xa3a21000    12K ums.ko
14    1 0xa3a24000   9.1K uhid.ko

Providing md_image uncompressed doesn't change anything.

Will deploy a /usr separated rootfs, which is only ~100MB uncompressed
and see if that changes anything.
That's all I can provide, code is far beyond my knowledge...

-harry



Re: EFI loader doesn't handle md_preload (md_image) correct?

2017-05-16 Thread Harry Schmalzbauer
Regarding Toomas Soome's message of 16.05.2017 16:57 (localtime):
> 
>> On 16 May 2017, at 17:55, Harry Schmalzbauer wrote:
>>
>> Hello,
>>
>> unfortunately I had some trouble with my preferred MFS-root setups.
>> It seems EFI loader doesn't handle type md_image correctly.
>>
>> If I load any md_image with loader invoked by gptboot or gptzfsboot,
>> 'lsmod'
>> shows "elf kernel", "elf obj module(s)" and "md_image".
>>
>> Using the same loader.conf, but the EFI loader, the md_image file is
>> prompted and seems to be loaded, but not registered.  There's no md_image
>> with 'lsmod', hence it's not astonishing that the kernel doesn't attach md0,
>> so booting fails since there's no rootfs.
>>
>> Any help highly appreciated, hope Toomas doesn't mind being initially CC'd.
>>
>> Thanks,
>>
>> -harry
> 
> 
> The first question is, how large is the md_image and what other modules are 
> loaded?

Thanks for your quick response.

The images are 50-500MB uncompressed (provided as gzip-compressed files).
A small number of ELF modules (5), each ~50kB.

I haven't checked yet whether the size has any influence.
I just wondered why I can't see any md_image with 'lsmod' and the EFI loader.

Btw, I forgot to mention I'm running 11-stable from this week, tested on
real HW and as an ESXi guest.

Thanks,

-harry


EFI loader doesn't handle md_preload (md_image) correct?

2017-05-16 Thread Harry Schmalzbauer
 Hello,

unfortunately I had some trouble with my preferred MFS-root setups.
It seems the EFI loader doesn't handle type md_image correctly.

If I load any md_image with the loader invoked by gptboot or gptzfsboot,
'lsmod' shows "elf kernel", "elf obj module(s)" and "md_image".

Using the same loader.conf, but the EFI loader, the md_image file is
prompted and seems to be loaded, but not registered.  There's no md_image
with 'lsmod', hence it's not astonishing that the kernel doesn't attach md0,
so booting fails since there's no rootfs.

Any help highly appreciated, hope Toomas doesn't mind being initially CC'd.

Thanks,

-harry


Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c

2017-03-08 Thread Harry Schmalzbauer
Regarding Konstantin Belousov's message of 08.03.2017 00:55 (localtime):
> On Tue, Mar 07, 2017 at 10:49:01PM +, Rick Macklem wrote:
>> Hmm, this is going to sound dumb, but I don't recall generating any
>> unionfs patch;-)
>> I'll go look for it. Maybe it was Kostik's?
> I did not touch unionfs, and have no plans to.  It is equally broken in
> all relevant versions of FreeBSD.

ACK.

While this is no good news, I have more bad news: the deadlock came back…

I'd like to summarize in case anybody else is interested in unionfs,
maybe at any time in the future:

I observed locking problems back in 2012 and Attilio Rao's final attempt
was this: https://people.freebsd.org/~attilio/unionfs_nodeget4.patch
I never used it, most likely because it didn't work even back with
RELENG_9. It applies to stable/11, but has no effect besides panicking
KDB kernels.
What I used up to 10.3 was the following simple patch:
--- src/sys/fs/unionfs/union_subr.c (revision 231702)
+++ src/sys/fs/unionfs/union_subr.c (working copy)
@@ -261,7 +261,9 @@ unionfs_nodeget(struct mount *mp, struct vnode *up
 		free(unp, M_UNIONFSNODE);
 		return (error);
 	}
+	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 	error = insmntque(vp, mp);	/* XXX: Too early for mpsafe fs */
+	VOP_UNLOCK(vp, 0);
 	if (error != 0) {
 		free(unp, M_UNIONFSNODE);
 		return (error);

This hasn't led to any panic or deadlock during the last 5 years on ~50
machines, up to 10.3.

In 2016 I did some tests with 11.0-Beta1, where this thread originates, and
Rick kindly looked into it and provided the following patch:
https://lists.freebsd.org/pipermail/freebsd-stable/attachments/20160818/d1d1691d/attachment.obj
(Explanation:
https://lists.freebsd.org/pipermail/freebsd-stable/2016-August/085294.html)

This also panics KDB kernels (and works without KDB) but at least does
have an influence on the deadlock: in case symlinks are involved,
deadlocks are significantly postponed.

…

 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00982220e0
 vpanic() at vpanic+0x186/frame 0xfe0098222160
 kassert_panic() at kassert_panic+0x126/frame 0xfe00982221d0
 witness_assert() at witness_assert+0x35a/frame 0xfe009830
 __lockmgr_args() at __lockmgr_args+0x517/frame 0xfe0098d0
 vop_stdunlock() at vop_stdunlock+0x3b/frame 0xfe0098f0
 VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfe0098222320
 unionfs_unlock() at unionfs_unlock+0x112/frame 0xfe0098222390
 VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfe00982223c0
 unionfs_nodeget() at unionfs_nodeget+0x3ef/frame 0xfe0098222470
 unionfs_domount() at unionfs_domount+0x518/frame 0xfe00982226b0
 vfs_donmount() at vfs_donmount+0xe37/frame 0xfe00982228f0
 sys_nmount() at sys_nmount+0x72/frame 0xfe0098222930
 amd64_syscall() at amd64_syscall+0x2f9/frame 0xfe0098222ab0
 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0098222ab0
 --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x80086ecea, rsp = 0x7fffe318, rbp = 0x7fffeca0 ---
>>> New discovery:
>>> Rick's latest patch causes a panic only with KDB. If I compile a kernel
>>> without WITNESS and KDB, the machine boots fine!
>>> Also, it's at least not so easy anymore to trigger the deadlock :-) . I
>>> need to do more testing but until now Rick's approach seems very
>>> promising :-) .
>>
>> My unionfs deadlock problem isn't really solved with Rick's latest
>> patch, I still can reproduce it: krb5.conf and krb5.keytab are files on
>> unionfs referenced by /etc.  libexec/negotiate_kerberos_auth reads these
>> and if I have enough helper processes handling requests, the deadlock
>> occurs.
>>
>> _But_: If I move the files outside the unionfs and create a symlink, I
>> cannot reproduce the deadlock anymore, which was similarly easy to
>> reproduce without it or any of the other workarounds.

The picture has changed: the machine deadlocked overnight. So it does have
a significant influence, but unfortunately isn't the real solution.

Thanks for any help,

-harry


Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c

2017-03-07 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 07.03.2017 19:44 (localtime):
> Regarding Harry Schmalzbauer's message of 07.03.2017 13:42 (localtime):
> …
>> Something ufs related seems to have tightened the unionfs locking
>> problem in stable/11.  Now the machine instantaneously panics during
>> boot after mounting root with Rick's latest patch.
>>
>> Unfortunately I don't have SWAP available on that machine (yet), but
>> maybe this is a hint for anybody.
>>
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00982220e0
>> vpanic() at vpanic+0x186/frame 0xfe0098222160
>> kassert_panic() at kassert_panic+0x126/frame 0xfe00982221d0
>> witness_assert() at witness_assert+0x35a/frame 0xfe009830
>> __lockmgr_args() at __lockmgr_args+0x517/frame 0xfe0098d0
>> vop_stdunlock() at vop_stdunlock+0x3b/frame 0xfe0098f0
>> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfe0098222320
>> unionfs_unlock() at unionfs_unlock+0x112/frame 0xfe0098222390
>> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfe00982223c0
>> unionfs_nodeget() at unionfs_nodeget+0x3ef/frame 0xfe0098222470
>> unionfs_domount() at unionfs_domount+0x518/frame 0xfe00982226b0
>> vfs_donmount() at vfs_donmount+0xe37/frame 0xfe00982228f0
>> sys_nmount() at sys_nmount+0x72/frame 0xfe0098222930
>> amd64_syscall() at amd64_syscall+0x2f9/frame 0xfe0098222ab0
>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0098222ab0
>> --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x80086ecea, rsp = 0x7fffe318, rbp = 0x7fffeca0 ---
> New discovery:
> Rick's latest patch causes a panic only with KDB. If I compile a kernel
> without WITNESS and KDB, the machine boots fine!
> Also, it's at least not so easy anymore to trigger the deadlock :-) . I
> need to do more testing but until now Rick's approach seems very
> promising :-) . 

My unionfs deadlock problem isn't really solved with Rick's latest
patch, I still can reproduce it: krb5.conf and krb5.keytab are files on
unionfs referenced by /etc.  libexec/negotiate_kerberos_auth reads these
and if I have enough helper processes handling requests, the deadlock
occurs.

_But_: If I move the files outside the unionfs and create a symlink, I
cannot reproduce the deadlock anymore, which was similarly easy to
reproduce without it or any of the other workarounds.
So it looks like I have an acceptable solution for now, although it's
only usable under certain conditions.
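
The workaround described above (move the file out of the unionfs, leave a
symlink behind) can be sketched like this. relocate_with_symlink is a
hypothetical helper and the example paths are placeholders; only the
move-and-symlink idea comes from the text.

```sh
#!/bin/sh
# Move a file to a directory outside the union mount and replace it with
# a symlink to the new location.
relocate_with_symlink() {
    src=$1
    destdir=$2
    mkdir -p "$destdir" &&
    mv "$src" "$destdir/" &&
    ln -s "$destdir/$(basename "$src")" "$src"
}

# Example (hypothetical paths):
# relocate_with_symlink /etc/krb5.conf /var/outside-unionfs
```

Readers of the original path then follow the symlink to the relocated file.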

Unfortunately I can't do tests with a debug kernel since the patch
prevents the system with the debug kernel from starting up.
But if this was ironed out, I'd happily provide more info.


Thanks,

-Harry



Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c

2017-03-07 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 07.03.2017 13:42 (localtime):
…
> Something ufs related seems to have tightened the unionfs locking
> problem in stable/11.  Now the machine instantaneously panics during
> boot after mounting root with Rick's latest patch.
>
> Unfortunately I don't have SWAP available on that machine (yet), but
> maybe this is a hint for anybody.
>
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00982220e0
> vpanic() at vpanic+0x186/frame 0xfe0098222160
> kassert_panic() at kassert_panic+0x126/frame 0xfe00982221d0
> witness_assert() at witness_assert+0x35a/frame 0xfe009830
> __lockmgr_args() at __lockmgr_args+0x517/frame 0xfe0098d0
> vop_stdunlock() at vop_stdunlock+0x3b/frame 0xfe0098f0
> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfe0098222320
> unionfs_unlock() at unionfs_unlock+0x112/frame 0xfe0098222390
> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfe00982223c0
> unionfs_nodeget() at unionfs_nodeget+0x3ef/frame 0xfe0098222470
> unionfs_domount() at unionfs_domount+0x518/frame 0xfe00982226b0
> vfs_donmount() at vfs_donmount+0xe37/frame 0xfe00982228f0
> sys_nmount() at sys_nmount+0x72/frame 0xfe0098222930
> amd64_syscall() at amd64_syscall+0x2f9/frame 0xfe0098222ab0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0098222ab0
> --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x80086ecea, rsp = 0x7fffe318, rbp = 0x7fffeca0 ---

New discovery:
Rick's latest patch causes a panic only with KDB. If I compile a kernel
without WITNESS and KDB, the machine boots fine!
Also, it's at least not so easy anymore to trigger the deadlock :-) . I
need to do more testing but until now Rick's approach seems very
promising :-) . Unfortunately I can't provide a fix or suggestion as to why
the KDB kernel panics and the non-KDB one doesn't, just the vague guess
that additional locking checks (KASSERT?), preventing more
damage, are not in place. So I guess I'm in dangerous waters, but it
definitely is a highly appreciated improvement for me and my very best
bet for now (neither eliminating unionfs nor holding off 11 updates were
real options for me, especially because unionfs isn't really
working well on 10.3 either, just not leading to deadlocks in more environments)!

I tried the non-debug kernel because I browsed old unionfs discussions
and desperately gave Attilio Rao's patch a try since I couldn't remember
why I haven't kept it locally:
https://people.freebsd.org/~attilio/unionfs_nodeget4.patch (he tried to
solve unionfs problems for RELENG_9 back in 2012:
https://lists.freebsd.org/pipermail/freebsd-stable/2012-November/070358.html)

It's still true that his patch leads to a panic with a debugging kernel –
only. The same patch without KDB allows booting and starting squid. But the
result is the same as with plain r314856, the system deadlocks reproducibly.

Also, the trace with his patch looks identical to the plain r314856
unionfs panic.

So I hope Rick or someone else can pick up the latest patch and polish
it to make KDB-kernels happy :-)
I can offer a small donation if that helps!
Of course, I'll also provide KDB info if needed/helpful.

thanks,

-harry


Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c

2017-03-07 Thread Harry Schmalzbauer
Regarding Rick Macklem's message of 05.09.2016 23:21 (localtime):
> Harry Schmalzbauer wrote:
>>Regarding Rick Macklem's message of 18.08.2016 02:03 (localtime):
>>>  Kostik wrote:
>>> [stuff snipped]
>>>> insmntque() performs the cleanup on its own, and that default cleanup
>>>> is not suitable for the situation.  I think that insmntque1() would
>>>> better fit your requirements, you need to move the common code into a
>>>> helper.  It seems that unionfs_ins_cached_vnode() cleanup could reuse it.
>>>
>>> I've attached an updated patch (untested like the last one). This one creates a
>>> custom version insmntque_stddtr() that first calls unionfs_noderem() and then
>>> does the same stuff as insmntque_stddtr(). This looks like it does the required
>>> stuff (unionfs_noderem() is what the unionfs VOP_RECLAIM() does).
>>> It switches the node back to using its own v_vnlock that is exclusively locked,
>>> among other things.
>>
>>Thanks a lot, today I gave it a try.
>>
>>With this patch, one reproducible panic can still be easily triggered:
>>I have directory A unionfs_mounted under directory B.
>>Then I mount_unionfs the same directory A below another directory C.
>>panic: __lockmgr_args: downgrade a recursed lockmgr nfs @
>>/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905
>>Result is this backtrace, hardly helpful I guess:
>>
>>#1  0x80ae5fd9 in kern_reboot (howto=260) at
>>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:366
>>#2  0x80ae658b in vpanic (fmt=<value optimized out>, ap=<value optimized out>)
>>at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:759
>>#3  0x80ae63c3 in panic (fmt=0x0) at
>>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:690
>>#4  0x80ab7ab7 in __lockmgr_args (lk=<value optimized out>,
>>flags=<value optimized out>, ilk=<value optimized out>, wmesg=<value optimized out>,
>>pri=<value optimized out>, timo=<value optimized out>, file=<value optimized out>,
>>line=<value optimized out>)
>>at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_lock.c:992
>>#5  0x80ba510c in vop_stdlock (ap=<value optimized out>) at
>>lockmgr.h:98
>>#6  0x8111932d in VOP_LOCK1_APV (vop=<value optimized out>,
>>a=<value optimized out>) at vnode_if.c:2087
>>#7  0x80a18cfc in unionfs_lock (ap=0xfe007a3ba6a0) at
>>vnode_if.h:859
>>#8  0x8111932d in VOP_LOCK1_APV (vop=<value optimized out>,
>>a=<value optimized out>) at vnode_if.c:2087
>>#9  0x80bc9b93 in _vn_lock (vp=<value optimized out>,
>>flags=66560, file=<value optimized out>, line=<value optimized out>) at
>>vnode_if.h:859
>>#10 0x80a18460 in unionfs_readdir (ap=<value optimized out>) at
>>/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1531
>>#11 0x81118ecf in VOP_READDIR_APV (vop=<value optimized out>,
>>a=<value optimized out>) at vnode_if.c:1822
>>#12 0x80bc6e3b in kern_getdirentries (td=<value optimized out>,
>>fd=<value optimized out>, buf=0x800c3d000 <Address 0x800c3d000 out of bounds>,
>>count=<value optimized out>, basep=0xfe007a3ba980, residp=0x0)
>>at vnode_if.h:758
>>#13 0x80bc6bf8 in sys_getdirentries (td=0x0,
>>uap=0xfe007a3baa40) at
>>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_syscalls.c:3940
>>#14 0x80fad6b8 in amd64_syscall (td=<value optimized out>,
>>traced=0) at subr_syscall.c:135
>>#15 0x80f8feab in Xfast_syscall () at
>>/usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:396
>>#16 0x00452eea in ?? ()
>>Previous frame inner to this frame (corrupt stack?)
> Ok, I finally got around to looking at this and the panic() looks like a
> pretty straightforward bug in the unionfs code.
> - In unionfs_readdir(), it does a vn_lock(..LK_UPGRADE) and then later in the
>   code vn_lock(..LK_DOWNGRADE) if it did the upgrade. (At line#1531 as noted
>   in the backtrace.)
>   - In unionfs_lock(), it sets LK_CANRECURSE when it is the rootvp and
>     LK_EXCLUSIVE.
>     (So it allows recursive acquisition in this case.)
> --> Then it would call vn_lock(..LK_DOWNGRADE), which would panic if it
>     has recursed.
> 
> Now, I'll admit unionfs_lock() is too obscure for me to understand, but...
> Is it necessary to vn_lock(..LK_DOWNGRADE) or can unionfs_readdir() just
> return with the vnode exclusively locked?
> (It would be easy to change the code to avoid the vn_lock(..LK_DOWNGRADE)
>  call when it has done the vn_lock(..LK_EXCLUSIVE) after
>  vn_lock(..LK_UPGRADE) fails.)
> 
> rick
> 
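The failure mode described above can be illustrated with a toy model. This is only a sketch: ToyVnodeLock, ToyPanic and readdir_like_sequence are hypothetical names, and the class is not the kernel's lockmgr, just a single-owner stand-in for the upgrade / recurse / downgrade ordering.

```python
class ToyPanic(Exception):
    """Stands in for a kernel panic in this toy model."""


class ToyVnodeLock:
    """Single-owner toy lock; only mimics the rough shape of lockmgr."""

    def __init__(self):
        self.state = "unlocked"  # "unlocked", "shared" or "exclusive"
        self.recursed = 0        # extra exclusive acquisitions by the owner

    def lock_shared(self):
        if self.state != "unlocked":
            raise ToyPanic("toy model only takes a shared lock when unlocked")
        self.state = "shared"

    def upgrade(self):
        if self.state != "shared":
            raise ToyPanic("upgrade without a shared lock")
        self.state = "exclusive"

    def lock_exclusive(self, can_recurse=False):
        if self.state == "exclusive":
            if not can_recurse:
                raise ToyPanic("recursing on a non-recursive lock")
            self.recursed += 1
        elif self.state == "unlocked":
            self.state = "exclusive"
        else:
            raise ToyPanic("exclusive request while shared (use upgrade)")

    def downgrade(self):
        if self.state != "exclusive":
            raise ToyPanic("downgrade without an exclusive lock")
        if self.recursed:
            # The case the analysis above points at.
            raise ToyPanic("downgrade of a recursed lock")
        self.state = "shared"


def readdir_like_sequence(recurse_in_between):
    """Upgrade, optionally recurse (as with LK_CANRECURSE), then downgrade."""
    lock = ToyVnodeLock()
    lock.lock_shared()
    lock.upgrade()
    if recurse_in_between:
        lock.lock_exclusive(can_recurse=True)
    lock.downgrade()
    return lock.state
```

Without the recursive acquisition the sequence ends with a shared lock; with it, the downgrade raises ToyPanic, which is the ordering problem in miniature.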
>>I ran your previous patch for some time.
>>Similarly, mounting one directory below a 2nd mountpoint crashed the
>>machine (forgot to config dumpdir, so can't compare backtrace with the
>>current patch).
>>Otherwise, at least with

Re: 'show alllocks' of completely locked machine [Was: Re: Complete IO lockup, state "ufs" from userland, debuging help wanted]

2017-03-07 Thread Harry Schmalzbauer
Regarding hiren panchasara's message of 06.03.2017 21:10 (localtime):
> On 03/06/17 at 08:56P, Harry Schmalzbauer wrote:
>>  Regarding Harry Schmalzbauer's message of 05.03.2017 22:59 (localtime):
>>>  Hello,
>>>
>>> I can easily lock up FreeBSD stable/11 from userland. Not that I want to...
>>> I'm running squid, which starts an authentication helper
>>> "*negotiate_kerberos_auth*", which seems to be the culprit.
>>> All I/O is completely blocked; there's no way to get anything from any
>>> filesystem.
>>> All processes (threads) that don't request I/O run well, including sshd and
>>> shells.
>>> There's no load (neither CPU nor I/O), but any process requesting I/O
>>> gets stuck in state "ufs".
>>>
>>> Can anyone help me find out what's going wrong?
>>> Serial console is available.
>>
>> Dear hackers,
>>
>> I managed to get into DDB, but I'm lost from there…
>>
>> What information could be useful to find out the cause of this complete
>> lockup?
>>
>> I'd need someone who could guide me through – I'd pay for a debugging
>> lesson! (quite constrained budget though)
>>
>> This happens when the machine got stuck:
>>
>> intr_event_handle() at intr_event_handle+0x9c/frame 0xfe0093dcb7d0
>> intr_execute_handlers() at intr_execute_handlers+0x48/frame
>> 0xfe0093dcb800
>> lapic_handle_intr() at lapic_handle_intr+0x68/frame 0xfe0093dcb840
>> Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0093dcb840
>> --- interrupt, rip = 0x807b9bd6, rsp = 0xfe0093dcb910, rbp =
>> 0xfe0093dcb910 ---
>> acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfe0093dcb910
>> acpi_cpu_idle() at acpi_cpu_idle+0x2ea/frame 0xfe0093dcb960
>> cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfe0093dcb980
>> cpu_idle() at cpu_idle+0x8f/frame 0xfe0093dcb9a0
>> sched_idletd() at sched_idletd+0x436/frame 0xfe0093dcba70
>> fork_exit() at fork_exit+0x84/frame 0xfe0093dcbab0
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe0093dcbab0
>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>
>>
>> db> show alllocks
>> Process 1259 (negotiate_kerberos_) thread 0xf80005ddea00 (100096)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1258 (negotiate_kerberos_) thread 0xf80005ddc500 (100252)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1257 (negotiate_kerberos_) thread 0xf80005ddda00 (100247)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1256 (negotiate_kerberos_) thread 0xf80065612500 (100261)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1255 (negotiate_kerberos_) thread 0xf80065612a00 (100260)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1254 (negotiate_kerberos_) thread 0xf80065613000 (100257)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1253 (negotiate_kerberos_) thread 0xf80065614000 (100254)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1252 (negotiate_kerberos_) thread 0xf800651e1000 (100246)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1251 (negotiate_kerberos_) thread 0xf80005ddca00 (100251)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1250 (negotiate_kerberos_) thread 0xf800651e2a00 (100241)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1251 (negotiate_kerberos_) thread 0xf80005ddca00 (100251)
>> shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
>> /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
>> Process 1250 (negotiate_kerberos_) thread 0xf80

'show alllocks' of completely locked machine [Was: Re: Complete IO lockup, state "ufs" from userland, debuging help wanted]

2017-03-06 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 05.03.2017 22:59 (localtime):
>  Hello,
>
> I can easily lock up FreeBSD stable/11 from userland. Not that I want to...
> I'm running squid, which starts an authentication helper
> "*negotiate_kerberos_auth*", which seems to be the culprit.
> All I/O is completely blocked; there's no way to get anything from any
> filesystem.
> All processes (threads) that don't request I/O run well, including sshd and
> shells.
> There's no load (neither CPU nor I/O), but any process requesting I/O
> gets stuck in state "ufs".
>
> Can anyone help me find out what's going wrong?
> Serial console is available.

Dear hackers,

I managed to get into DDB, but I'm lost from there…

What information could be useful to find out the cause of this complete
lockup?

I'd need someone who could guide me through – I'd pay for a debugging
lesson! (quite constrained budget though)

This happens when the machine got stuck:

intr_event_handle() at intr_event_handle+0x9c/frame 0xfe0093dcb7d0
intr_execute_handlers() at intr_execute_handlers+0x48/frame
0xfe0093dcb800
lapic_handle_intr() at lapic_handle_intr+0x68/frame 0xfe0093dcb840
Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0093dcb840
--- interrupt, rip = 0x807b9bd6, rsp = 0xfe0093dcb910, rbp =
0xfe0093dcb910 ---
acpi_cpu_c1() at acpi_cpu_c1+0x6/frame 0xfe0093dcb910
acpi_cpu_idle() at acpi_cpu_idle+0x2ea/frame 0xfe0093dcb960
cpu_idle_acpi() at cpu_idle_acpi+0x3f/frame 0xfe0093dcb980
cpu_idle() at cpu_idle+0x8f/frame 0xfe0093dcb9a0
sched_idletd() at sched_idletd+0x436/frame 0xfe0093dcba70
fork_exit() at fork_exit+0x84/frame 0xfe0093dcbab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0093dcbab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---


db> show alllocks
Process 1259 (negotiate_kerberos_) thread 0xf80005ddea00 (100096)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1258 (negotiate_kerberos_) thread 0xf80005ddc500 (100252)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1257 (negotiate_kerberos_) thread 0xf80005ddda00 (100247)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1256 (negotiate_kerberos_) thread 0xf80065612500 (100261)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1255 (negotiate_kerberos_) thread 0xf80065612a00 (100260)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1254 (negotiate_kerberos_) thread 0xf80065613000 (100257)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1253 (negotiate_kerberos_) thread 0xf80065614000 (100254)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1252 (negotiate_kerberos_) thread 0xf800651e1000 (100246)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1251 (negotiate_kerberos_) thread 0xf80005ddca00 (100251)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1250 (negotiate_kerberos_) thread 0xf800651e2a00 (100241)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1251 (negotiate_kerberos_) thread 0xf80005ddca00 (100251)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1250 (negotiate_kerberos_) thread 0xf800651e2a00 (100241)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1247 (sqtop) thread 0xf80065650a00 (100259)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1184 (systat) thread 0xf80065613a00 (100255)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1042 (negotiate_kerberos_) thread 0xf800651e2500 (100242)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1041 (negotiate_kerberos_) thread 0xf800055e4000 (100078)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 639 (
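
One way to get an overview of output like the above is to group the lock holders by lock address. The following is a minimal sketch, assuming the line layout shown in this message; holders_by_lock and SAMPLE are hypothetical names, and the sample lines are copied from the output above.

```python
import re
from collections import defaultdict

# Patterns assume the 'show alllocks' layout seen in this thread.
PROCESS_RE = re.compile(r"^Process (\d+) \((.*?)\) thread (0x[0-9a-f]+) \((\d+)\)")
LOCK_RE = re.compile(r"^(shared|exclusive) (\w+) (\S+) .*\((0x[0-9a-f]+)\) locked @")


def holders_by_lock(ddb_text):
    """Map each lock address to a list of (mode, pid, process name)."""
    holders = defaultdict(list)
    current = None
    for line in ddb_text.splitlines():
        line = line.strip()
        match = PROCESS_RE.match(line)
        if match:
            current = (int(match.group(1)), match.group(2))
            continue
        match = LOCK_RE.match(line)
        if match and current is not None:
            holders[match.group(4)].append((match.group(1),) + current)
    return dict(holders)


SAMPLE = """\
Process 1259 (negotiate_kerberos_) thread 0xf80005ddea00 (100096)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1247 (sqtop) thread 0xf80065650a00 (100259)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
Process 1184 (systat) thread 0xf80065613a00 (100255)
shared lockmgr ufs (ufs) r = 0 (0xf8000523d5f0) locked @
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_lookup.c:611
"""

if __name__ == "__main__":
    for address, entries in holders_by_lock(SAMPLE).items():
        print(address, len(entries), "holder(s):", entries)
```

For the sample, every holder ends up under the same lock address, which suggests that all of these threads hold a shared lock on one and the same vnode; which thread is actually blocking them is not visible from this listing alone.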

Complete IO lockup, state "ufs" from userland, debuging help wanted

2017-03-05 Thread Harry Schmalzbauer
 Hello,

I can easily lock up FreeBSD stable/11 from userland. Not that I want to...
I'm running squid, which starts an authentication helper
"*negotiate_kerberos_auth*", which seems to be the culprit.
All I/O is completely blocked; there's no way to get anything from any
filesystem.
All processes (threads) that don't request I/O run well, including sshd and
shells.
There's no load (neither CPU nor I/O), but any process requesting I/O
gets stuck in state "ufs".

Can anyone help me find out what's going wrong?
Serial console is available.

Thanks in advance,

-harry


Re: FreeBSD 11.0, bxe and lagg

2017-02-22 Thread Harry Schmalzbauer
Regarding Ingeborg Hellemo's message of 21.02.2017 11:23 (localtime):
> trond.endres...@fagskolen.gjovik.no said:
>> Why does lagg0 refer to bge2 and bge3 in the ifconfig output, and not  to
>> bxe2 and bxe3?
> My bad! No cut and paste from the console of the host without net. Wrote most 
> of it by hand but ended up using cut and paste from another host and forgot 
> to 
> edit.
>
> Correct lines:
>  laggproto lacp lagghash l2,l3,l4
>  laggport: bxe2 flags=0<>
>  laggport: bxe3 flags=0<>
>

There are known problems with laggproto lacp and if_bxe(4), which I guess
are due to full-duplex detection.
The workaround seems to be putting if_bxe(4) into promiscuous mode.

See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213606

Especially comment #8 and #13!

Maybe you can add your report to that bug, to raise its priority.

Using if_lagg(4) with laggproto lacp and if_igb(4) doesn't show any
problems here on 11(-stable).
I also think there were LACPDU changes forcing the switch to be set to
active mode, but a very quick look didn't reveal a matching commit, so
maybe I'm wrong. Anyway, in your case it seems if_bxe(4) is the root
cause, not a configuration mismatch.

-harry



Re: ASM1062 AHCI timeouts, ppt(4) BAR aligning [Was: Re: svn commit: r309251 - head/sys/dev/ahci]

2016-12-29 Thread Harry Schmalzbauer
Regarding Alexander Motin's message of 29.12.2016 11:32 (localtime):
> On 29.12.2016 10:35, Harry Schmalzbauer wrote:
>> I'd like to report that this doesn't fix timeouts for me (applied to
>> 11-stable).
>>
>> For example my REV120 works without problems on Intel-AHCI but not on
>> ASM1062-AHCI.
>> Even attaching gives different output. Both look fine at first:
>> #cd0 at ahcich0 bus 0 scbus5 target 0 lun 0
>> #cd0:  Removable CD-ROM SCSI device
>> #cd0: Serial Number 0C1E4D046E5DFF18
>> #cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO
>> 8192bytes)
>>
>> When attached to the Intel-AHCI, it's followed by
>> +cd0: Attempt to query device size failed: NOT READY, Medium not present
>> while attaching to ASM1062 it reads (!?)
>> -cd0: 0MB (1 0 byte sectors)
>>
>> Then these timeouts occur:
>> ahcich7: Timeout on slot 11 port 0
>> ahcich7: is  cs 0c00 ss  rs 0c00 tfd 6051 serr
>>  cmd 0004cb17
>> ahcich7: Timeout on slot 24 port 0
>> ahcich7: is  cs 0180 ss  rs 0180 tfd 2051 serr
>>  cmd 0004d817
>> ahcich7: Timeout on slot 6 port 0
>> ahcich7: is  cs 0060 ss  rs 0060 tfd 2051 serr
>>  cmd 0004c617
>> ahcich7: Timeout on slot 20 port 0
>> ahcich7: is  cs 0018 ss  rs 0018 tfd 2051 serr
>>  cmd 0004d417
>>
>> Also IDENT (via camcontrol) "hangs" for 20 seconds, but finally succeeds.
> 
> I think problem may be different in your case.  The HBA still reports
> that command is not completed by the device.  Unfortunately I don't have
> those fancy drives to try, but I'll try to reproduce it with regular CD
> drive when I get back home after short New Year holidays.

Oh, I see. Then I need to test the patch with regular SSD usage!
I have had problems with the ASM1062 when I used it for a "roaming" SSD
backing virtio-blk/ahci,hd:/dev/ada4. After a significant amount of I/O there
were _always_ outages, the details of which I don't remember exactly, but
moving the SSD from asm-ahci to intel-ahci made them vanish.
I'll see whether these are now solved for the ASM1062; otherwise I'll report back.


>> Btw: I already found out that extending ppt(4) to support unaligned base
>> address register wouldn't be too easy.
>> Initially I added that ASM1062 card to use it for bhyve(8) passthrough.
>> Unfortunately that doesn't work:
>> bhyve: passthru device 6/0/0 BAR 5: base 0xc3e1 or size 0x200 not
…
> 
> I believe it is bhyve bug, since these values are just what hardware
> reports.  BAR size of 512 bytes indeed does not align to 4K, but that is
> not our problem. :)
> 
>> Are there any recommendations for AHCI (SATA-PCIe) controller
>> cards/chips that do work (both for bhyve passthrough and as a plain
>> AHCI provider)?
> 
> Please don't mix multiple unrelated questions in one email.

Yes, sorry, I should have sent that in two different mails. Took the
wrong route because I thought others who are possibly searching/trying
low-power SATA passthrough controllers could find this (useful)...


> There are very few reasonable external AHCI controllers on the market
> now.  I am not sure anything other than Marvell and ASMedia was
> released at all in recent years, since 6Gbps SATA came out.  Marvell and
> ASMedia are probably on par with each other, while later Marvell may be slightly
> better on functionality (number of ports and FBS PMP support), but they
> are both desktop products.  If you need this in a server environment,
> think about a SAS adapter like LSI.  Or just use on-board Intel
> AHCI, since it is probably the best in reliability you can get out of
> SATA.


Thanks for your hints!
Usually I go with LSI2008 for such cases, but this time the additional
7W power consumption for only _one_ roaming SATA SSD seemed
inappropriate. Furthermore, I only have one spare slot, so I got a card
that combines the ASM1062 with an Etron EJ168 USB 3.0 controller.

That was exactly what I'd want to have as passthrough for my guest :-)
Hopefully bhyve(8)/ppt will make that possible in the future.

Thanks,

-harry


ASM1062 AHCI timeouts, ppt(4) BAR aligning [Was: Re: svn commit: r309251 - head/sys/dev/ahci]

2016-12-29 Thread Harry Schmalzbauer
Regarding Alexander Motin's message of 28.11.2016 17:23 (localtime):
> Author: mav
> Date: Mon Nov 28 16:23:32 2016
> New Revision: 309251
> URL: https://svnweb.freebsd.org/changeset/base/309251
>
> Log:
>   Process port interrupt even if PxIS register is zero.
>   
>   ASMedia ASM1062 AHCI chips with some fancy firmware handling PMP inside
>   seem to sometimes forget to set bits in PxIS, causing command timeouts.
>   Removal of this check fixes the issue by the theoretical cost of slightly
>   higher CPU usage in some odd cases, but this is what Linux does too.
>   
>   MFC after:  1 month
>
> Modified:
>   head/sys/dev/ahci/ahci.c
>
> Modified: head/sys/dev/ahci/ahci.c
> ==
> --- head/sys/dev/ahci/ahci.c  Mon Nov 28 15:14:31 2016(r309250)
> +++ head/sys/dev/ahci/ahci.c  Mon Nov 28 16:23:32 2016(r309251)
> @@ -1169,8 +1169,6 @@ ahci_ch_intr(void *arg)
>  
>   /* Read interrupt statuses. */
>   istatus = ATA_INL(ch->r_mem, AHCI_P_IS);
> - if (istatus == 0)
> - return;
>  
>   mtx_lock(&ch->mtx);
>   ahci_ch_intr_main(ch, istatus);
> @@ -1187,8 +1185,6 @@ ahci_ch_intr_direct(void *arg)
>  
>   /* Read interrupt statuses. */
>   istatus = ATA_INL(ch->r_mem, AHCI_P_IS);
> - if (istatus == 0)
> - return;
>  
>   mtx_lock(&ch->mtx);
>   ch->batch = 1;

Hello,

I'd like to report that this doesn't fix timeouts for me (applied to
11-stable).

For example my REV120 works without problems on Intel-AHCI but not on
ASM1062-AHCI.
Even attaching gives different output. Both look fine at first:
#cd0 at ahcich0 bus 0 scbus5 target 0 lun 0
#cd0:  Removable CD-ROM SCSI device
#cd0: Serial Number 0C1E4D046E5DFF18
#cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO
8192bytes)

When attached to the Intel-AHCI, it's followed by
+cd0: Attempt to query device size failed: NOT READY, Medium not present
while attaching to ASM1062 it reads (!?)
-cd0: 0MB (1 0 byte sectors)

Then these timeouts occur:
ahcich7: Timeout on slot 11 port 0
ahcich7: is  cs 0c00 ss  rs 0c00 tfd 6051 serr
 cmd 0004cb17
ahcich7: Timeout on slot 24 port 0
ahcich7: is  cs 0180 ss  rs 0180 tfd 2051 serr
 cmd 0004d817
ahcich7: Timeout on slot 6 port 0
ahcich7: is  cs 0060 ss  rs 0060 tfd 2051 serr
 cmd 0004c617
ahcich7: Timeout on slot 20 port 0
ahcich7: is  cs 0018 ss  rs 0018 tfd 2051 serr
 cmd 0004d417

Also IDENT (via camcontrol) "hangs" for 20 seconds, but finally succeeds.

Btw: I already found out that extending ppt(4) to support unaligned base
address registers wouldn't be too easy.
Initially I added that ASM1062 card to use it for bhyve(8) passthrough.
Unfortunately that doesn't work:
bhyve: passthru device 6/0/0 BAR 5: base 0xc3e1 or size 0x200 not
page aligned
That's the ASM1062:
ppt0@pci0:6:0:0:class=0x010601 card=0x10601b21 chip=0x06121b21
rev=0x01 hdr=0x00
bar   [10] = type I/O Port, range 32, base 0x5050, size 8, enabled
bar   [14] = type I/O Port, range 32, base 0x5040, size 4, enabled
bar   [18] = type I/O Port, range 32, base 0x5030, size 8, enabled
bar   [1c] = type I/O Port, range 32, base 0x5020, size 4, enabled
bar   [20] = type I/O Port, range 32, base 0x5000, size 32, enabled
bar   [24] = type Memory, range 32, base 0xc3e1, size 512, enabled
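
The bhyve message above complains that the BAR's base or size is not page aligned. A minimal sketch of such a check follows, assuming 4 KiB pages; is_page_aligned and passthru_bar_ok are hypothetical helper names, not bhyve functions. Only the size (512 bytes) is taken from the listing above; the base used in the example is made up, since the base printed above looks truncated in this archive.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on amd64


def is_page_aligned(value, page_size=PAGE_SIZE):
    """True if value is a multiple of the page size."""
    return value % page_size == 0


def passthru_bar_ok(base, size, page_size=PAGE_SIZE):
    """Both base and size must be page aligned for the BAR to pass."""
    return is_page_aligned(base, page_size) and is_page_aligned(size, page_size)


if __name__ == "__main__":
    hypothetical_base = 0xC0000000   # illustrative only, not from the report
    print(passthru_bar_ok(hypothetical_base, 512))    # 512-byte BAR as listed
    print(passthru_bar_ok(hypothetical_base, 4096))   # a page-sized BAR
```

Even with a page-aligned base, a 512-byte memory BAR fails this test, which matches the observation that the size alone is enough to trigger the message.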

Are there any recommendations for AHCI (SATA-PCIe) controller
cards/chips that do work (both for bhyve passthrough and as a plain
AHCI provider)?

Thanks,

-harry


share/mk/bsd.cpu.mk disables lang/gcc48 build with CPUTYPE

2016-12-10 Thread Harry Schmalzbauer
 Hello,

I'm unsure if I'd better file a bug report, but I'm also unsure if it's
ports or base…

When one defines CPUTYPE in make.conf(5), share/mk/bsd.cpu.mk translates,
for example, 'core-avx-i' into 'ivybridge'.
This breaks building ports such as lang/gcc48:
configure:3374: /usr/local/ports-wrktree/lang/gcc48/work/.build/./gcc/xgcc
-B/usr/local/ports-wrktree/lang/gcc48/work/.build/./gcc/
-B/usr/local/x86_64-portbld-freebsd11.0/bin/
-B/usr/local/x86_64-portbld-freebsd11.0/lib/
-isystem /usr/local/x86_64-portbld-freebsd11.0/include
-isystem /usr/local/x86_64-portbld-freebsd11.0/sys-include
-o conftest -g -O2 -pipe -march=ivybridge -DLIBICONV_PLUG
-fno-strict-aliasing conftest.c >&5
conftest.c:1:0: error: bad value (ivybridge) for -march= switch

Translating back would require a second point of maintenance,
so it would be better to make the translation conditional.

But I'm not really familiar with the share/mk make conventions,
so I can only provide a quick workaround (in case anyone is searching for
a temporary solution to the build problem, see the attached patch).

This workaround changes the following:
· In /usr/share/mk/bsd.cpu.mk, do the _CPUCFLAGS assignments _before_ the
alias settings.
· Instead of the usual deferred variable expansion, use := to set the
value immediately, before aliasing "does the wrong thing"…
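
The difference between the two assignment forms can be mimicked with a small toy model. This is only a sketch of deferred versus immediate expansion, not of make itself; march_flags is a hypothetical name, and string.Template merely stands in for make's variable references.

```python
from string import Template


def march_flags(cputype, alias):
    """Return (deferred, immediate) results after CPUTYPE has been aliased."""
    variables = {"CPUTYPE": cputype}

    # Like '=': the reference is stored and expanded only when used.
    deferred = Template("-march=${CPUTYPE}")
    # Like ':=': the value is expanded right at assignment time.
    immediate = deferred.substitute(variables)

    # The alias block runs afterwards and rewrites CPUTYPE.
    variables["CPUTYPE"] = alias

    return deferred.substitute(variables), immediate


if __name__ == "__main__":
    print(march_flags("core-avx-i", "ivybridge"))
```

In this model the deferred form picks up the aliased name, while the immediate form keeps the name that was set in make.conf, which is the effect the workaround relies on.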

Note that the attached patch has only been verified on amd64 with
core-avx-i vs. ivybridge (CPUTYPE in /etc/make.conf) and is not suitable
for general use; it is only meant for getting binaries of lang/gcc48 built
on today's 11-stable with CPUTYPE set.

Best,

-harry


--- usr/share/mk/bsd.cpu.mk.orig	2016-12-10 15:16:39.625929000 +0100
+++ usr/share/mk/bsd.cpu.mk	2016-12-10 18:14:10.294579000 +0100
@@ -25,62 +25,6 @@
 . endif
 .else
 
-# Handle aliases (not documented in make.conf to avoid user confusion
-# between e.g. i586 and pentium)
-
-. if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386"
-.  if ${CPUTYPE} == "barcelona"
-CPUTYPE = amdfam10
-.  elif ${CPUTYPE} == "core-avx2"
-CPUTYPE = haswell
-.  elif ${CPUTYPE} == "core-avx-i"
-CPUTYPE = ivybridge
-.  elif ${CPUTYPE} == "corei7-avx"
-CPUTYPE = sandybridge
-.  elif ${CPUTYPE} == "corei7"
-CPUTYPE = nehalem
-.  elif ${CPUTYPE} == "slm"
-CPUTYPE = silvermont
-.  elif ${CPUTYPE} == "atom"
-CPUTYPE = bonnell
-.  elif ${CPUTYPE} == "core"
-CPUTYPE = prescott
-.  endif
-.  if ${MACHINE_CPUARCH} == "amd64"
-.   if ${CPUTYPE} == "prescott"
-CPUTYPE = nocona
-.   endif
-.  else
-.   if ${CPUTYPE} == "k7"
-CPUTYPE = athlon
-.   elif ${CPUTYPE} == "p4"
-CPUTYPE = pentium4
-.   elif ${CPUTYPE} == "p4m"
-CPUTYPE = pentium4m
-.   elif ${CPUTYPE} == "p3"
-CPUTYPE = pentium3
-.   elif ${CPUTYPE} == "p3m"
-CPUTYPE = pentium3m
-.   elif ${CPUTYPE} == "p-m"
-CPUTYPE = pentium-m
-.   elif ${CPUTYPE} == "p2"
-CPUTYPE = pentium2
-.   elif ${CPUTYPE} == "i686"
-CPUTYPE = pentiumpro
-.   elif ${CPUTYPE} == "i586/mmx"
-CPUTYPE = pentium-mmx
-.   elif ${CPUTYPE} == "i586"
-CPUTYPE = pentium
-.   endif
-.  endif
-. elif ${MACHINE_ARCH} == "sparc64"
-.  if ${CPUTYPE} == "us"
-CPUTYPE = ultrasparc
-.  elif ${CPUTYPE} == "us3"
-CPUTYPE = ultrasparc3
-.  endif
-. endif
-
 ###
 # Logic to set up correct gcc optimization flag.  This must be included
 # after /etc/make.conf so it can react to the local value of CPUTYPE
@@ -99,10 +43,10 @@
 .  elif ${CPUTYPE} == "c7"
 _CPUCFLAGS = -march=c3-2
 .  else
-_CPUCFLAGS = -march=${CPUTYPE}
+_CPUCFLAGS := -march=${CPUTYPE}
 .  endif
 . elif ${MACHINE_CPUARCH} == "amd64"
-_CPUCFLAGS = -march=${CPUTYPE}
+_CPUCFLAGS := -march=${CPUTYPE}
 . elif ${MACHINE_CPUARCH} == "arm"
 .  if ${CPUTYPE} == "xscale"
 #XXX: gcc doesn't seem to like -mcpu=xscale, and dies while rebuilding itself
@@ -167,6 +111,62 @@
 _CPUCFLAGS = -mcpu=${CPUTYPE}
 . endif
 
+# Handle aliases (not documented in make.conf to avoid user confusion
+# between e.g. i586 and pentium)
+
+. if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386"
+.  if ${CPUTYPE} == "barcelona"
+CPUTYPE = amdfam10
+.  elif ${CPUTYPE} == "core-avx2"
+CPUTYPE = haswell
+.  elif ${CPUTYPE} == "core-avx-i"
+CPUTYPE = ivybridge
+.  elif ${CPUTYPE} == "corei7-avx"
+CPUTYPE = sandybridge
+.  elif ${CPUTYPE} == "corei7"
+CPUTYPE = nehalem
+.  elif ${CPUTYPE} == "slm"
+CPUTYPE = silvermont
+.  elif ${CPUTYPE} == "atom"
+CPUTYPE = bonnell
+.  elif ${CPUTYPE} == "core"
+CPUTYPE = prescott
+.  endif
+.  if ${MACHINE_CPUARCH} == "amd64"
+.   if ${CPUTYPE} == "prescott"
+CPUTYPE = nocona
+.   endif
+.  else
+.   if ${CPUTYPE} == "k7"
+CPUTYPE = athlon
+.   elif ${CPUTYPE} == "p4"
+CPUTYPE = pentium4
+.   elif ${CPUTYPE} == "p4m"
+CPUTYPE = pentium4m
+.   elif ${CPUTYPE} == "p3"
+CPUTYPE = pentium3
+.   elif ${CPUTYPE} == "p3m"
+CPUTYPE = pentium3m
+.   elif ${CPUTYPE} == "p-m"
+CPUTYPE = pentium-m
+.   elif ${CPUTYPE} == "p2"
+CPUTYPE = pentium2
+.   elif ${CPUTYPE} == "i686"
+CPUTYPE = pentiumpro
+.   elif ${CPUTYPE} == "i586/mmx"
+CPUTYPE

Re: boot1.efifat's FAT12 volume label prevents booting (some systems)

2016-11-07 Thread Harry Schmalzbauer
Regarding Patrick M. Hausen's message of 07.11.2016 09:12 (localtime):
> Hi,
> 
>> On 07.11.2016 at 09:04, Harry Schmalzbauer wrote:
>>> create the EFI boot volume like this?
>>>
>>> gpart add -t efi -l efi -a 512k -s 512k 
>>> newfs_msdos /dev/gpt/efi
>>> mount_msdosfs /dev/gpt/efi /mnt
>>> mkdir -p /mnt/efi/boot
>>> cp /boot/boot1.efi /mnt/efi/boot/bootx64.efi
>>
>> You are missing startup.nsh...
>> See
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214282
> 
> Care to elaborate? This is what we use in production - all
> systems booting just fine ;-)

Of course you can boot UEFI systems without startup.nsh, but it does
offer another way of processing the boot sequence – the most sensible one,
in my opinion.
And it's what the FreeBSD Releng team decided to provide out of the box, so
helpful hints shouldn't do it any other way.

-Harry


Re: boot1.efifat's FAT12 volume label prevents booting (some systems)

2016-11-07 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 07.11.2016 09:04 (localtime):
> Regarding Patrick M. Hausen's message of 07.11.2016 08:10 (localtime):
>> Hi, all,
>>
>>> On 06.11.2016 at 18:14, Dimitry Andric wrote:
>>>
>>> Please do, so it is not forgotten.  It is relatively easy to change the
>>> volume label, by editing sys/boot/efi/boot1/generate-fat.sh, and then
>>> regenerating the FAT templates.
>> Why use the pre-generated image at all when you can easily
> It's what bsdinstall seems to do, which left the system unbootable, not
> what I do.
>
>
>> create the EFI boot volume like this?
>>
>> gpart add -t efi -l efi -a 512k -s 512k 

And you might run into other firmware problems by again using EFI as the
label.
I don't know the standards, but it's obvious that at least one
unexpected label/path clash causes problems, so it's better not
to provoke another one, which might affect only very few
implementations but would cause needless trouble.
Better to use something like 'gpart add -t efi -l A-uefiLOADER -a 512k -s
512k ' (so that /dev/gpt/A-uefiLOADER is used instead
of /dev/gpt/efi in the following commands).

I personally use the "A-" prefix to indicate that it's the first
mirror component… Change it to whatever you like.

>> newfs_msdos /dev/gpt/efi
>> mount_msdosfs /dev/gpt/efi /mnt
>> mkdir -p /mnt/efi/boot
>> cp /boot/boot1.efi /mnt/efi/boot/bootx64.efi

-Harry


Re: boot1.efifat's FAT12 volume label prevents booting (some systems)

2016-11-07 Thread Harry Schmalzbauer
Regarding Patrick M. Hausen's message of 07.11.2016 08:10 (localtime):
> Hi, all,
> 
>> On 06.11.2016 at 18:14, Dimitry Andric wrote:
>>
>> Please do, so it is not forgotten.  It is relatively easy to change the
>> volume label, by editing sys/boot/efi/boot1/generate-fat.sh, and then
>> regenerating the FAT templates.
> 
> Why use the pre-generated image at all when you can easily

It's what bsdinstall seems to do, which left the system unbootable, not
what I do.


> create the EFI boot volume like this?
> 
> gpart add -t efi -l efi -a 512k -s 512k 
> newfs_msdos /dev/gpt/efi
> mount_msdosfs /dev/gpt/efi /mnt
> mkdir -p /mnt/efi/boot
> cp /boot/boot1.efi /mnt/efi/boot/bootx64.efi

You are missing startup.nsh...
See
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214282

-Harry

boot1.efifat's FAT12 volume label prevents booting (some systems)

2016-11-06 Thread Harry Schmalzbauer
Recently I played with bsdinstall and a UEFI setup, which left the system
unbootable (11.0-Release).
The culprit is the MS-DOS volume label "EFI" of the EFI partition.
At least on Intel Single-Socket Servers (for Xeon E3 IvyBridge/BearToot
+ Haswell/RainbowPass), the UEFI firmware can't handle the identical
path/volume label.

Simply reformatting with a different volume label (EFIFAT e.g.) solves
that problem!
Shall I file a bug report?
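
To see which label a FAT12/FAT16 image carries, the label field of its boot sector can be read directly. The following is a minimal sketch, assuming the standard FAT12/16 boot sector layout (extended boot signature 0x29 at offset 38, 11-byte label at offset 43); fat12_16_volume_label is a hypothetical helper name, and the file path in the comment is only an example.

```python
def fat12_16_volume_label(boot_sector):
    """Return the volume label stored in a FAT12/FAT16 boot sector."""
    if len(boot_sector) < 512 or boot_sector[510:512] != b"\x55\xaa":
        raise ValueError("not a 512-byte boot sector with a 0x55AA signature")
    if boot_sector[38] != 0x29:
        raise ValueError("no extended boot signature, so no label field")
    # Bytes 43..53 hold the space-padded label.
    return boot_sector[43:54].decode("ascii", errors="replace").rstrip()


if __name__ == "__main__":
    # Synthetic boot sector for illustration; for a real image one could
    # pass the first 512 bytes of a file such as boot1.efifat instead.
    sector = bytearray(512)
    sector[38] = 0x29
    sector[43:54] = b"EFI".ljust(11)
    sector[510:512] = b"\x55\xaa"
    print(fat12_16_volume_label(bytes(sector)))
```

Note that this reads only the boot sector field; a label may also be stored as a root directory entry, which this sketch does not inspect.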

Btw, can someone explain in short words why BOOT64.EFI seems to be
boot1.efi, but padded with 0x20 up to 128k?

Thanks,

-Harry


Re: pax(1) needs to learn POSIX-pax format (by libarchive(3)?)

2016-11-04 Thread Harry Schmalzbauer
Regarding David Magda's message of 04.11.2016 03:56 (localtime):
> On Nov 1, 2016, at 13:44, Harry Schmalzbauer  wrote:
> 
>> Has anyone ever thought about? Unfortunately I'm lacking skills and time :-(
> 
> You’ll want to talk to the folks here:
> 
>   http://libarchive.org
> 
> That is the upstream project. It actually started on FreeBSD over a decade 
> ago but spun off on its own, and is used by a wider audience nowadays.
> 
> I provided some sample Solaris-ACL files early in the development. If you 
> provide some problematic files I’m sure they’ll be willing to help.


Hello David,

thanks for your hint. I'm already using libarchive(3)'s pax support via
tar(1), so I'm not sure whether they are the ones who would be interested
in making pax(1) use libarchive(3).

If I remember correctly I haven't had problems restoring files with
NFSv4 ACLs, but that's not the major problem anyway.

My real-world problem is that the pax(1) tool can't restore from or back up
to pax-format files. All formats supported by pax(1) have impractical
path/filename length limits.
I'm aware of cpio(1)'s and tar(1)'s transition to libarchive several years
ago. Unfortunately, pax(1) was overlooked :-(

It would be wonderful if someone could take on pax(1)'s libarchive(3)
transition, but I guess the libarchive developers aren't interested or
don't have many spare resources…

Thanks,

-Harry


pax(1) needs to learn POSIX-pax format (by libarchive(3)?)

2016-11-01 Thread Harry Schmalzbauer
 Dear hackers,

I'm frequently missing pax(1)'s ability to handle the pax (the POSIX pax)
format.
Backing up real-world file names and lengths doesn't work with the ustar
format, which pax(1) uses and which tar(1) also uses by default.

I'd prefer using pax(1) because of its CLI usage (personal taste…).
But in practice, I'm forced to use tar(1), overriding tar's default
format with the "--format pax" (or --posix) option, for almost any
archive/backup job (where zfs send isn't feasible).
Since tar(1) does support the POSIX pax format, it's not a big issue,
but it's weird using tar for pax and pax for tar ;-)
I'd love to see pax(1) being libarchive(3)ed.
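
A minimal sketch of the tar(1) workaround described above, assuming a
tar that accepts --format=pax (bsdtar and GNU tar both should). The
helper name pax_roundtrip and the 150-character file name are made up
for illustration only:

```shell
#!/bin/sh
# Sketch: archive a file whose single name component is longer than
# ustar's 100-byte name field, using the POSIX pax format via tar(1).
# pax_roundtrip is a hypothetical helper, not an existing tool.
pax_roundtrip() {
    workdir=$(mktemp -d) || return 1
    # Build a 150-character file name out of 'x' characters.
    longname=$(printf '%150s' '' | tr ' ' 'x')
    : > "$workdir/$longname"
    # Write the archive in pax format, then list what was stored.
    tar --format=pax -cf "$workdir/out.tar" -C "$workdir" "$longname" || return 1
    tar -tf "$workdir/out.tar"
    rm -rf "$workdir"
}

pax_roundtrip
```

Swapping pax for ustar in the same call should be refused for this
name, since a component of more than 100 bytes cannot be split across
ustar's name and prefix fields.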

Has anyone ever thought about this? Unfortunately I lack the skills and time :-(

Thanks,

-harry

bhyve(8) passthru affects ahci-hd with device backend [Was: Re: Unexpected ahci-hd bytes when running in bhyve(8)]

2016-10-30 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 29.10.2016 19:35 (localtime):
> Regarding Harry Schmalzbauer's message of 29.10.2016 17:32 (localtime):
>
> …
>> Like mentioned, while reading the first 448 bytes on the host, I get
>> identical results from /usr/local/guest.img and /dev/ada4, but when
>> attaching /dev/ada4 to ahci-hd (-s 7,ahci-hd,/dev/ada4) and inspecting
>> inside vmm, all I see is 0x0, while ahci-hd attached
>> /usr/local/guest.img shows the same pmbr as on the host!?
>>
>> Do I have to exclude /dev/ada4 on the host from geom? As soon as bhyve
>> opens /dev/ada4, all partitions vanish from the host – probably ada4
>> itself gets blocked somehow?
…
> Just another symptom I can only describe, not debug:
> Opening /dev/adaX on the host works by 'hd /dev/ada4 | less',
> but not inside the guest, where it just leads to endless IO when trying
> the same on the ahci-hd attached /dev/ada4

The described defects only happen when passthru is used with bhyve(8)!

If I simply don't attach the passthru device (keeping memory wired),
everything works the way it's supposed to.
Opening the guest's ada1 device with hexdump works, geom tastes GPT and
the first 448 bytes show exactly the same PMBR as on the host.
As soon as I start the guest with the passthru device (it doesn't matter
which slot, tested with an 82574L), the ahci-hd device can't be read
anymore; it just returns 0x0 when dumped.

Shall I file a bug report? Is anybody aware of this, or does anyone have
an idea where to start fixing?

To summarize:
FreeBSD-11-RELEASE hosting a bhyve(8) guest with a physical device as
storage backend (regardless of whether it is accessed through virtio-blk
or ahci-hd) corrupts guest disk access if a passthrough device is also
attached.
If you use a file-backed ahci-hd (or virtio-blk) device, the problem
doesn't show up, regardless of whether passthru is involved or not.
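
For anyone who wants to bisect this, here is a minimal dry-run sketch.
It only prints the two bhyve(8) command lines to compare and does not
start a guest. The slot numbers, device paths and VM name are the ones
from the switches posted at the start of this thread (the ahci-cd line
is left out for brevity); the helper name bhyve_cmd and its
with/without argument are made up for illustration:

```shell
#!/bin/sh
# Dry run: print the bhyve(8) invocation with or without the passthru slot.
# bhyve_cmd is a hypothetical helper; nothing is executed besides printf.
bhyve_cmd() {
    passthru=""
    if [ "$1" = "with" ]; then
        passthru="-s 6,passthru,6/0/0 "
    fi
    # -S (wired guest memory) is kept in both variants, as in the test above.
    printf '%s\n' "bhyve -u -A -H -P -S -s 0,hostbridge ${passthru}-s 31,lpc -s 7,ahci-hd,/dev/ada4 -l com1,/dev/nmdm0A -m 3G -c 4 preed"
}

bhyve_cmd without
bhyve_cmd with
```

Running the guest once with each printed line and dumping the first
sectors of the guest disk each time should show whether the passthru
slot alone makes the difference.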


Thanks,

-Harry



Re: Unexpected ahci-hd bytes when running in bhyve(8)

2016-10-29 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 29.10.2016 17:32 (localtime):

…
> Like mentioned, while reading the first 448 bytes on the host, I get
> identical results from /usr/local/guest.img and /dev/ada4, but when
> attaching /dev/ada4 to ahci-hd (-s 7,ahci-hd,/dev/ada4) and inspecting
> inside vmm, all I see is 0x0, while ahci-hd attached
> /usr/local/guest.img shows the same pmbr as on the host!?
>
> Do I have to exclude /dev/ada4 on the host from geom? As soon as bhyve
> opens /dev/ada4, all partitions vanish from the host – probably ada4
> itself gets blocked somehow?

Maybe that's related?
https://lists.freebsd.org/pipermail/freebsd-virtualization/2015-April/003509.html
(resulting in https://svnweb.freebsd.org/base?view=revision&revision=281700)

Just another symptom I can only describe, not debug:
Opening /dev/adaX on the host works by 'hd /dev/ada4 | less',
but not inside the guest, where it just leads to endless IO when trying
the same on the ahci-hd attached /dev/ada4.

Of course I found discussion threads about virtio-scsi, which would be
more appropriate for my needs, but unfortunately nobody skilled enough
has had time to implement it yet, afaik. It also wouldn't solve my
problem while this SSD is SATA-attached (I could switch to a SAS port so
the HBA would do SAT, which should work then...).
Are there any other ways I'm missing to get mass storage into the guest?
A ZVOL is a very good candidate for many setups, but not for all.
Raw device mappings to HBA virtual drives do a great job on ESXi, but
replacing those raw mappings with ZVOLs isn't really the same, and
sometimes I need physical devices in the guest.
PCIe passthrough (PPT) seems to work great in bhyve, but I can't
sacrifice a complete HBA for that.

Thanks,

-harry



Re: Unexpected ahci-hd bytes when running in bhyve(8)

2016-10-29 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 27.10.2016 20:05 (localtime):
>  Hello,
>
> I wanted to use a "roaming" ssd with bhyve/vmm, which is the home of a
> GPT based FreeBSD setup.
> I've been using this for years with ESXi and bare-metal-hosts, and
> wanted to try out bhyve.
> Unfortunately this doesn't work the way I'm used to.
> Booting of ufs:/dev/gpt/myROOT fails with error 19, loader does only see
> a diskid/BHYVEDISK, not the GPT partitions.
>
> I guess ahci-hd isn't 1:1 mapping blocks, neither does virtio-blk, since
> it shows exactly the same result, which is a bit strange to me:
> When I boot a Live-CD in vmm with the physical SSD ahci-hd attached, the
> first 8kByte of /dev/ada0 is 0x0.
>
> The same test on the host ('dd if=/dev/ada4 count=16 | hd') shows me
> PMBR and GPT content, which I also expected to see in bhyve…
>
> What am I missing?

To verify whether my assumptions might be correct at all, I installed a
bhyve guest on a file-backed ahci-hd drive (/usr/local/guest.img).
The first 448 bytes of that file are exactly the same when inspected on
the host as the first 448 bytes of /dev/ada4.

Inspecting /dev/ada0 inside the guest vmm, I see again the same 448
bytes when /usr/local/guest.img was attached to ahci-hd (-s
7,ahci-hd,/usr/local/guest.img).
That was what I expected.
What I don't understand is why my expectations are true for
/usr/local/guest.img but not for /dev/ada4.
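
The comparison described above can be scripted roughly like this. It is
only a sketch: first_bytes_sum and is_all_zero are made-up helper names,
and the 4096-byte read size is an assumption chosen so that the read is
a whole number of sectors for both 512-byte and 4K-sector disks:

```shell
#!/bin/sh
# first_bytes_sum and is_all_zero are hypothetical helpers for comparing
# the first 448 bytes of a disk image and a disk device.

# Print a checksum of the first 448 bytes of a file or device node.
# dd reads one 4096-byte block; head trims the stream to 448 bytes.
first_bytes_sum() {
    dd if="$1" bs=4096 count=1 2>/dev/null | head -c 448 | cksum
}

# Succeed if the first 448 bytes consist only of 0x00 bytes.
is_all_zero() {
    nonzero=$(dd if="$1" bs=4096 count=1 2>/dev/null | head -c 448 \
        | tr -d '\000' | wc -c | tr -d ' ')
    [ "$nonzero" -eq 0 ]
}

# Intended use on the host (paths as in this thread):
#   first_bytes_sum /usr/local/guest.img
#   first_bytes_sum /dev/ada4
# Intended use inside the guest:
#   is_all_zero /dev/ada0 && echo "guest sees only zeros"
```

Equal checksums on the host combined with is_all_zero succeeding in the
guest would match the symptom described here.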

As mentioned, while reading the first 448 bytes on the host, I get
identical results from /usr/local/guest.img and /dev/ada4, but when
attaching /dev/ada4 to ahci-hd (-s 7,ahci-hd,/dev/ada4) and inspecting
inside vmm, all I see is 0x0, while ahci-hd attached
/usr/local/guest.img shows the same pmbr as on the host!?

Do I have to exclude /dev/ada4 on the host from geom? As soon as bhyve
opens /dev/ada4, all partitions vanish from the host – probably ada4
itself gets blocked somehow?

Thanks for any hint in advance,

-harry

Unexpected ahci-hd bytes when running in bhyve(8)

2016-10-27 Thread Harry Schmalzbauer
 Hello,

I wanted to use a "roaming" SSD with bhyve/vmm, which is the home of a
GPT-based FreeBSD setup.
I've been using this for years with ESXi and bare-metal hosts, and
wanted to try out bhyve.
Unfortunately this doesn't work the way I'm used to.
Booting from ufs:/dev/gpt/myROOT fails with error 19 (ENODEV); the loader
only sees a diskid/BHYVEDISK, not the GPT partitions.

I guess ahci-hd isn't mapping blocks 1:1, and neither is virtio-blk, since
it shows exactly the same result, which is a bit strange to me:
When I boot a live CD in vmm with the physical SSD attached via ahci-hd,
the first 8 kB of /dev/ada0 are 0x0.

The same test on the host ('dd if=/dev/ada4 count=16 | hd') shows me
PMBR and GPT content, which I also expected to see in bhyve…

What am I missing?

Here are my switches:

bhyve -u -A -H -P \
-S \
-s 0,hostbridge \
-s 6,passthru,6/0/0 \
-s 31,lpc \
-s 1,ahci-cd,releases/ISO-IMAGES/11.0/FreeBSD-11.0-RELEASE-amd64-disc1.iso \
-s 7,ahci-hd,/dev/ada4 \
-l com1,/dev/nmdm0A \
-m 3G -c 4 preed

/dev/ada4 is the "roaming" (hot-pluggable) SSD on the host.

Thanks for any hint,

-harry

Re: vale-ctl(-8), ifconfig(8), SIOCAIFADDR: Invalid argument [utilizing netmap(4) providing virtual switches+interfaces to BHyVe]

2016-10-15 Thread Harry Schmalzbauer
Regarding Vincenzo Maffione's message of 15.10.2016 09:32 (localtime):
> 2016-10-14 15:38 GMT+02:00 Harry Schmalzbauer :

…
>> I'm familiar with epair(4), but not with tap(4).
>> I don't understand the man page for tap, perhaps I should read pty(4)…
>> But I guess I don't have to know the details of tap(4), since you
>> confirmed that it can be connected to VALE.
>>
> 
> It's not necessary to understand the details. However, a TAP device is
> conceptually similar to the two ends of an epair, with the difference that
> in the TAP a network interface (e.g. tap0) is conceptually "connected"
> back-to-back to a file descriptor. The file descriptor is written/read by
> the hypervisor (to inject/intercept packets to/from the network stack),
> while the tap0 interface can be attached to if_bridge.

Hi Vincenzo, thanks for your explanation!


>>
>> So one could summarize:
>> VALE (as part of netmap(4)) can act as a if_bridge(4) replacement in
>> FreeBSD-10/11, keeping everything else involved untouched.
>> Please correct me if I'm wrong.
>>
> 
> For simple cases yes. if_bridge may have features that are not supported by
> netmap (e.g. configuring ports as VLAN access ports). Moreover, if_bridge has
> an interface (br0), whereas VALE bridges don't.

Again, thank you for your time! (R)STP comes to mind (which I don't
need any more). I'm not sure if VALE really lacks that, but I guess
it wouldn't match VALE's philosophy/design at all…

…
>>> https://github.com/luigirizzo/netmap). Among the new features, there is a
>>> new solution for bhyve networking, which will let you attach your bhyve VMs
>>> directly to a VALE switch, without paying additional overheads related to
>>> TAPs, epairs, and vtnet emulation. You can find additional information,
>>> code and performance numbers here:
>>> https://wiki.freebsd.org/SummerOfCode2016/PtnetDriverAndDeviceModel.
>>
>> Thanks for that hint!
>> I guess it's about ptnetmap(4)? I read papers but haven't considered it
>> could be production-ready for FreeBSD in the near future.
>> It's extremely interesting and I'd love to be an early adopter, but my
>> (ESXi) setups are currently doing well and I don't have spare time or
>> any business project to try out… :-(
>>
> 
> Yes, it's ptnetmap. However, bhyve is going to have support for VALE ports
> anyway (even without ptnetmap), as QEMU already does, so at least you will
> be able to replace TAPs with VALE ports (while still using vtnet devices
> for the VM).

Oic, I wasn't aware that there will be a direct VALE-vtnet path! That is
really great news :-) And a big achievement for guests preferring
"standard" drivers; ptnetmap could limit the guest OS choice, I guess.

For now, I'm happy to have been in touch with netmap(4), at least with a
very small fraction of netmap, but I'll stay with the legacy way, using
if_bridge(4), and see if there are still oddities and try to find some
time to track them down (involving LACP, VLANs, jumbo frames and IPv6;
that was the problematic constellation).

Since I have extra PHYs, I can do PCIe passthrough like before (with
ESXi) for some special guests. I'm looking forward to finding out how
this works with bhyve!

Best,

-Harry

Re: vale-ctl(-8), ifconfig(8), SIOCAIFADDR: Invalid argument [utilizing netmap(4) providing virtual switches+interfaces to BHyVe]

2016-10-14 Thread Harry Schmalzbauer
Regarding Vincenzo Maffione's message of 14.10.2016 15:08 (localtime):
> Hi,
> 
>   Thanks for your feedback.
> 
…
>> Accidentally I found out that 'vale-ctl -n testif0' creates an artificial
>> interface, which is reported by ifconfig(8):
>> testif0: flags=8801 metric 0 mtu 1500
>> options=8
>> ether 00:be:eb:8d:f8:00
>> nd6 options=21
>>
>> But I can't assign an IP address: 'ifconfig testif0 203.0.113.1/24'
>> ifconfig: ioctl (SIOCAIFADDR): Invalid argument
>>
>> I guess I couldn't get the picture of the netmap(4) world yet.
>> Probably, testif0 is available only in netmap(4) world, not in "host
>> world".
>> I'm assuming, because I found vale-ctl(-8)s "-h" switch.
>>
> 
> Yes, those are the "persistent" VALE ports. They are a recent feature, and
> probably you don't need to use them if you are going to play with Virtual
> Machines and jails (see below).

Hello Vincenzo,

thank you very much for your help!!!


…
>> Now my question:
>>
>> How can I plug a jail's or vmm's artificial interface to a VALE virtual
>> switch, bridging frames to real-world via physical interfaces?
>> (the latter part should work with vale-ctl -h vale0:em1, but what
>> interface to use for jail(8) vnet.interface and how to create/attach?)
>>
> 
> If you use bhyve/vmm, you can attach the VM TAP interface to the VALE
> switch, as you would do for "em1". Regarding jails, I don't know exactly
> how networking works there, but I guess epair(4) interface (or similar) are
> used. If this is the case, then you would have one end of the epair only
> visible in the jail, and the other end only visible in the "host"; then you

I'm familiar with epair(4), but not with tap(4).
I don't understand the man page for tap, perhaps I should read pty(4)…
But I guess I don't have to know the details of tap(4), since you
confirmed that it can be connected to VALE.

So one could summarize:
VALE (as part of netmap(4)) can act as a if_bridge(4) replacement in
FreeBSD-10/11, keeping everything else involved untouched.
Please correct me if I'm wrong.


> could attach the host end to a VALE switch again with "vale-ctl -a".
> Unfortunately, the performance you would get in any case is not great,
> because TAP and epair interface do not have netmap "native support".
> Moreover, when using bhyve, you have to pay the cost of the emulation of
> the vtnet device, since each packet passes through this device (other than
> passing across netmap).

I understand, thanks.
In fact, I expected that in the first place, but I had some oddities with
if_bridge(4) some years ago, so I thought I'd better try something new ;-)
Can I expect any resource savings over if_bridge(4)? I guess if so, the
amount isn't relevant considering the whole bhyve scenario.


> However, consider the following: a consistent netmap update is going to
> happen in FreeBSD-CURRENT, in short. This is going to align the netmap code
> which is now in FreeBSD to the code on the official github repository (
> https://github.com/luigirizzo/netmap). Among the new features, there is a
> new solution for bhyve networking, which will let you attach your bhyve VMs
> directly to a VALE switch, without paying additional overheads related to
> TAPs, epairs, and vtnet emulation. You can find additional information,
> code and performance numbers here:
> https://wiki.freebsd.org/SummerOfCode2016/PtnetDriverAndDeviceModel.

Thanks for that hint!
I guess it's about ptnetmap(4)? I read papers but hadn't considered that
it could be production-ready for FreeBSD in the near future.
It's extremely interesting and I'd love to be an early adopter, but my
(ESXi) setups are currently doing well and I don't have spare time or
any business project to try it out… :-(

Is an MFC likely to happen? Or will it be an exclusive 12.0 feature? If
ptnetmap gets MFCed I'll definitely give it a try next summer and stay
with 11.0 for my replacement machines for now.
Otherwise I'm unsure…

best,

-Harry

vale-ctl(-8), ifconfig(8), SIOCAIFADDR: Invalid argument [utilizing netmap(4) providing virtual switches+interfaces to BHyVe]

2016-10-14 Thread Harry Schmalzbauer
 Dear all,

I found great papers about netmap(4)'s design and implementation
details, and I'm sure it's another masterpiece of Rizzo quality :-)
Thanks to all participants for that great code!

To be honest, I haven't read all of that, because I'm short on time and
my first mission is to see if FreeBSD 11 can replace some of my ESXi
machines.

One key element seems to be netmap(4).
It's quite hard to find userland documentation.

So far, I've discovered that there are three essential tools waiting in
/usr/src/tools/tools/netmap to be compiled
(resulting in *./vale-ctl*, *./bridge*, *./pkt-gen*)

While the latter is often referenced in netmap(4) documentation, it's
not of interest to me, because I'll be doing real-world performance
tests and I'm convinced that all the impressive numbers presented in the
netmap documentation are valid :-)

So *vale-ctl(-8)* seems to be of interest (I'm using (-8) because
currently there is no section 8 man page; I guess that's the reason these
tools are not integrated into the base binaries).

Accidentally I found out that 'vale-ctl -n testif0' creates an artificial
interface, which is reported by ifconfig(8):
testif0: flags=8801 metric 0 mtu 1500
options=8
ether 00:be:eb:8d:f8:00
nd6 options=21

But I can't assign an IP address: 'ifconfig testif0 203.0.113.1/24'
ifconfig: ioctl (SIOCAIFADDR): Invalid argument

I guess I couldn't get the picture of the netmap(4) world yet.
Probably, testif0 is available only in the netmap(4) world, not in the "host world".
I'm assuming that because I found vale-ctl(-8)'s "-h" switch.

Another small piece of the netmap(4) world I'm aware of is how to
attach physical interfaces to virtual switches:
'/usr/src/tools/tools/netmap/vale-ctl -a vale0:em1'
Now vale-ctl(-8) shows:
bdg_ctl [149] bridge:0 port:0 vale0:em1

/*
To share my experience: one cannot use anything other than vale[[:digit:]]
as the name of the virtual switch instance that is created on demand, so
e.g. "vale-ctl -a vale-test:em1" doesn't work, although this is described
in the netmap(4) man page in FreeBSD-11:
»valeXXX:YYY (arbitrary XXX and YYY)
the file descriptor is bound to port YYY of a VALE switch called
XXX, both dynamically created if necessary. The string cannot
exceed IFNAMSIZ characters, and YYY cannot be the name of any
existing OS network interface«

I was about to give up on my netmap(4) investigations because I thought it
wasn't production-ready yet (in FreeBSD), since even adding the first
physical interface fails: '/usr/src/tools/tools/netmap/vale-ctl -a
vale-test:em1'
vale-test:em1: Invalid argument

Probably by accident I used vale[[:digit:]] instead and wondered why
it suddenly worked…
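
The naming observation above can be expressed as a small pre-check. This
is only a sketch of what was observed here, not of what netmap itself
enforces: is_vale_port is a made-up helper, and accepting one or more
digits after "vale" is an assumption:

```shell
#!/bin/sh
# is_vale_port is a hypothetical helper, not part of vale-ctl.
# It accepts only names of the form vale<digits>:<port>, the form that
# was observed to work; the man page wording is more permissive.
is_vale_port() {
    printf '%s\n' "$1" | grep -Eq '^vale[0-9]+:[^:]+$'
}

for name in vale0:em1 vale-test:em1; do
    if is_vale_port "$name"; then
        echo "$name: matches the form that worked"
    else
        echo "$name: does not match, vale-ctl -a may reject it"
    fi
done
```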

To get back to vale-ctl(-8)s "-h" switch:
*/

If I add a physical interface with -h instead of -a, the host's IP stack
doesn't get disconnected from the interface, so it's still usable by
host applications, and vale-ctl(-8) lists one more line:
bdg_ctl [149] bridge:0 port:0 vale0:em1
bdg_ctl [149] bridge:0 port:1 vale0:em1^
Hence my assumption that netmap(4) lives decoupled from the well-known
FreeBSD IP world.


Now my question:

How can I plug a jail's or vmm's artificial interface to a VALE virtual
switch, bridging frames to real-world via physical interfaces?
(the latter part should work with vale-ctl -h vale0:em1, but what
interface to use for jail(8) vnet.interface and how to create/attach?)

Thanks,

-harry


Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c

2016-08-20 Thread Harry Schmalzbauer
Regarding Rick Macklem's message of 18.08.2016 02:03 (localtime):
>  Kostik wrote:
> [stuff snipped]
>> insmntque() performs the cleanup on its own, and that default cleanup is not
>> suitable for the situation.  I think that insmntque1() would better fit your
>> requirements, you need to move the common code into a helper. It seems that
>> unionfs_ins_cached_vnode() cleanup could reuse it.
> 
> I've attached an updated patch (untested like the last one). This one creates a
> custom version of insmntque_stddtr() that first calls unionfs_noderem() and then
> does the same stuff as insmntque_stddtr(). This looks like it does the required
> stuff (unionfs_noderem() is what the unionfs VOP_RECLAIM() does).
> It switches the node back to using its own v_vnlock that is exclusively locked,
> among other things.

Thanks a lot, today I gave it a try.

With this patch, one reproducible panic can still be easily triggered:
I have directory A unionfs-mounted under directory B.
Then I mount_unionfs the same directory A below another directory C.
panic: __lockmgr_args: downgrade a recursed lockmgr nfs @
/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905
The result is this backtrace, hardly helpful I guess:

#1  0x80ae5fd9 in kern_reboot (howto=260) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:366
#2  0x80ae658b in vpanic (fmt=, ap=)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:759
#3  0x80ae63c3 in panic (fmt=0x0) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:690
#4  0x80ab7ab7 in __lockmgr_args (lk=,
flags=, ilk=, wmesg=,
pri=, timo=, file=, line=)
at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_lock.c:992
#5  0x80ba510c in vop_stdlock (ap=) at
lockmgr.h:98
#6  0x8111932d in VOP_LOCK1_APV (vop=,
a=) at vnode_if.c:2087
#7  0x80a18cfc in unionfs_lock (ap=0xfe007a3ba6a0) at
vnode_if.h:859
#8  0x8111932d in VOP_LOCK1_APV (vop=,
a=) at vnode_if.c:2087
#9  0x80bc9b93 in _vn_lock (vp=,
flags=66560, file=, line=) at
vnode_if.h:859
#10 0x80a18460 in unionfs_readdir (ap=) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1531
#11 0x81118ecf in VOP_READDIR_APV (vop=,
a=) at vnode_if.c:1822
#12 0x80bc6e3b in kern_getdirentries (td=,
fd=, buf=0x800c3d000 ,
count=, basep=0xfe007a3ba980, residp=0x0)
at vnode_if.h:758
#13 0x80bc6bf8 in sys_getdirentries (td=0x0,
uap=0xfe007a3baa40) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_syscalls.c:3940
#14 0x80fad6b8 in amd64_syscall (td=,
traced=0) at subr_syscall.c:135
#15 0x80f8feab in Xfast_syscall () at
/usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:396
#16 0x00452eea in ?? ()
Previous frame inner to this frame (corrupt stack?)

I ran your previous patch for some time.
Similarly, mounting one directory below a second mountpoint crashed the
machine (I forgot to configure dumpdir, so I can't compare the backtrace
with the current patch).
Otherwise, at least with the previous patch, I haven't had any other
panic for about one week.

Thanks,

-Harry


Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c

2016-08-09 Thread Harry Schmalzbauer
Regarding Mark Johnston's message of 09.08.2016 08:02 (localtime):
…
>>
>> Just for anybody else needing unionfs:
>> https://people.freebsd.org/~attilio/unionfs_missing_insmntque_lock.patch
>>
>> This patch still applies and I'm successfully using this (unmodified) up
>> to FreeBSD-10.3 and never had any panic in all these years.
> 
> Having spent some time looking at unionfs, I'm a bit skeptical that this
> patch will address the panic you reported earlier, though I'd be
> interested to know if it does. 

Thanks for your attention.
I can confirm that it has prevented panics for more than 4 years
(9.0-10.3) and it seems to be still "good enough" to also prevent panics
in 11-BETA4.
I updated my build host (stable/11, this time with the
unionfs_missing_insmntque_lock.patch), where the recent panics happened
and unionfs gets utilized much more than usual in my setups: no panic
with that patch anymore.
Just one message like "prevented resource deadlock" occurred.

> Reading the code, I think it will just
> address an INVARIANTS-only assertion in insmntque1().
> 
> Unfortunately, unionfs is quite difficult to fix within the current
> constraints of FreeBSD's VFS. unionfs_readdir() is a particularly good
> demonstration of this fact: some callers of VOP_READDIR expect the
> cookies returned by the FS to be monotonically increasing, but unionfs
> has no straightforward way to make this guarantee.

I'm sorry, I can't provide help here. With my skills it would take a huge
amount of learning time to get into that matter. I'd love to do that,
but I can't afford it :-(

Thanks,

-Harry


Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c

2016-08-09 Thread Harry Schmalzbauer
Regarding Kurt Jaeger's message of 09.08.2016 07:32 (localtime):
> Hi!
> 
>> Since then I've been carrying a minimal patch which at least prevents the
>> kernel panics for me.
>> Unfortunately I don't have the skills to continue Attilio Raos work.
>>
>> Just for anybody else needing unionfs:
>> https://people.freebsd.org/~attilio/unionfs_missing_insmntque_lock.patch
> 
> Is this referenced in any PR ? If not, can you create one ?

Good question, bad answer: No.
I had been told not to file a PR at that time because a unionfs overhaul
was planned/needed. Partial fixes would have been counterproductive in
that state and not "the right way to go".
It looks like the overhaul hasn't really happened during the last 4 years,
and since I was happy with the partial patch, I haven't followed unionfs
development.
In the meantime, I remember that lots of fs stress tests were added to
the FreeBSD test suite, but I'm not familiar with a single one, not even
with ATF and the like, and won't find the time to dig into them.
So presently I hesitate to pick up that old issue and file a PR, because
developer resources are extremely limited regarding unionfs, and the PR
should be attended by someone who has the time and knowledge to make
developers' lives easier by providing qualified analysis and tests. All I
could do at the moment is point at a dysfunction which is already
documented in the man page, which isn't really helpful or needed.


>> First thing to do for me, after I won in lottery, was to find someone
>> who can be sponsored fixing unionfs ;-) And bringing MNAMELEN into 21st
>> century state, matching ZFS needs:
>> https://lists.freebsd.org/pipermail/freebsd-hackers/2015-November/048640.html
>> This is another patch I'm carrying for a very long time which solves
>> tremendous limitations for me. Without that, I couldn't use ZFS
>> snapshots in real world, along with a human-friendly dataset naming :-)
> 
> And is there a PR for that ?

Sadly, the same answer applies, for similar reasons.
I asked Doug Ambrisko (the author of the mount_bigger_2_1.patch) at
10.2-BETA time (2015/07/10) about progress and future handling of his patch.
He answered that Marshall Kirk McKusick asked him not to continue in that
direction, since »it would make the 64 bit inode work harder«.

So we agreed continuing being happy with the patch in our world, leaving
the rest unhappy ;-) See the link to that discussion above.

I was kind of astonished that this patch still applies to 11, and it
seems 11 still has the MNAMELEN limitation, despite make_dev_p()'s ability
to handle extended lengths.

CC'ed Doug, perhaps he'll join this thread to clarify things.

The unionfs thing is an edge case, but MNAMELEN is much more
important, imho, since ZFS is one of FreeBSD's most appreciated "new
features" and this MNAMELEN limit needlessly counteracts ZFS deployment.

Of course I'll keep nagging from time to time ;-) And file PRs if not
told otherwise :-)

-Harry


unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:190

2016-08-08 Thread Harry Schmalzbauer
Regarding Rick Macklem's message of 07.08.2016 23:34 (localtime):
> Harry Schmalzbauer wrote:
>> Hello,
>>
>> I had another crash which I'm quite sure was triggered by mount_unionfs:
> Just in case you are not already aware, unionfs is always broken. Read
> the BUGS
> section at the end of "man mount_unionfs". If it were easy to fix,
> someone would
> have done so long ago. Yes, some use it successfully, but if not...
> 
> Sorry, but I suspect that is how it will remain, rick

Thanks for the hint. I'm not happy to hear that, but I was not aware of
that explicit warning in man 8 mount_unionfs :-(

This feature is utterly important for me (all my production machines
have "/" mounted read-only, and "/etc" is a union with a separate
writable fs mounted sync), so back in 2012, after a lot of locking
redesign had been done in 9-current, I got Attilio Rao's attention and he
gave out some test patches for 9.0.
He was aware of missing locking adjustments, but the patches addressing
the majority of them didn't work.
Since then I've been carrying a minimal patch which at least prevents the
kernel panics for me.
Unfortunately I don't have the skills to continue Attilio Rao's work.

Just for anybody else needing unionfs:
https://people.freebsd.org/~attilio/unionfs_missing_insmntque_lock.patch

This patch still applies, and I've been using it successfully (unmodified)
up to FreeBSD-10.3 without a single panic in all these years.

I will continue using it for FreeBSD-11 and I guess it will also prevent
my last reported panics.
But I wanted to take part in the BETA test without local modifications
at first.

Another very important usage scenario of unionfs for me is on my build
host(s). I'm (nfs4-)sharing a read-only ports tree checked out from svn. My
unofficial "ports/inofficial" directory shows up perfectly by
unionfs-mounting it below the unaltered ports tree :-)

For me, unionfs is as important as ZFS (and nullfs) in FreeBSD.

The first thing I would do after winning the lottery is find someone
who can be sponsored to fix unionfs ;-) And to bring MNAMELEN into a 21st
century state, matching ZFS needs:
https://lists.freebsd.org/pipermail/freebsd-hackers/2015-November/048640.html
This is another patch I've been carrying for a very long time which
removes tremendous limitations for me. Without it, I couldn't use ZFS
snapshots in the real world along with human-friendly dataset naming :-)

-Harry



1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905

2016-08-07 Thread Harry Schmalzbauer
 Hello,

I had another crash which I'm quite sure was triggered by mount_unionfs:

Unread portion of the kernel message buffer:
panic: __lockmgr_args: downgrade a recursed lockmgr nfs @
/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905

cpuid = 3
KDB: stack backtrace:
#0 0x80b2d887 at kdb_backtrace+0x67
#1 0x80ae5332 at vpanic+0x182
#2 0x80ae51a3 at panic+0x43
#3 0x80ab6987 at __lockmgr_args+0xe87
#4 0x80ba3c7c at vop_stdlock+0x3c
#5 0x82cd at VOP_LOCK1_APV+0x8d
#6 0x80a17c1c at unionfs_lock+0x48c
#7 0x82cd at VOP_LOCK1_APV+0x8d
#8 0x80bc8703 at _vn_lock+0x43
#9 0x80a17380 at unionfs_readdir+0x140
#10 0x81110e6f at VOP_READDIR_APV+0x8f
#11 0x80bc59ab at kern_getdirentries+0x21b
#12 0x80bc5768 at sys_getdirentries+0x28
#13 0x80fab6ae at amd64_syscall+0x4ce
#14 0x80f8dc0b at Xfast_syscall+0xfb
Uptime: 44m36s
Dumping 337 out of 2002 MB:..5%..15%..24%..34%..43%..53%..62%..72%..81%..91%

#0  doadump (textdump=) at pcpu.h:221
221 pcpu.h: Permission denied.
in pcpu.h
(kgdb) backtrace
#0  doadump (textdump=) at pcpu.h:221
#1  0x80ae4db9 in kern_reboot (howto=260) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:366
#2  0x80ae536b in vpanic (fmt=, ap=)
at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:759
#3  0x80ae51a3 in panic (fmt=0x0) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:690
#4  0x80ab6987 in __lockmgr_args (lk=,
flags=, ilk=, wmesg=,
pri=, timo=, file=, line=)
at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_lock.c:992
#5  0x80ba3c7c in vop_stdlock (ap=) at
lockmgr.h:98
#6  0x82cd in VOP_LOCK1_APV (vop=,
a=) at vnode_if.c:2087
#7  0x80a17c1c in unionfs_lock (ap=0xfe2296a0) at
vnode_if.h:859
#8  0x82cd in VOP_LOCK1_APV (vop=,
a=) at vnode_if.c:2087
#9  0x80bc8703 in _vn_lock (vp=,
flags=66560, file=, line=) at
vnode_if.h:859
#10 0x80a17380 in unionfs_readdir (ap=) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1531
#11 0x81110e6f in VOP_READDIR_APV (vop=,
a=) at vnode_if.c:1822
#12 0x80bc59ab in kern_getdirentries (td=,
fd=, buf=0x800a35000 ,
count=, basep=0xfe229980, residp=0x0)
at vnode_if.h:758
#13 0x80bc5768 in sys_getdirentries (td=0x0,
uap=0xfe229a40) at
/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_syscalls.c:3940
#14 0x80fab6ae in amd64_syscall (td=,
traced=0) at subr_syscall.c:135
#15 0x80f8dc0b in Xfast_syscall () at
/usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:396
#16 0x0045da4a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal

Analyzing this is beyond my skills, sorry.
But I hope somebody else can look into it before 11-RELEASE ships with
this problem.
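
One small data point can be read off the backtrace without kernel
knowledge: frame #9 (_vn_lock) shows flags=66560. The sketch below
decodes that number into lockmgr flag names. The flag values are quoted
from memory of sys/sys/lockmgr.h and are an assumption, so please verify
them against your own source tree before drawing conclusions.

```python
# Decode the lockmgr flags argument seen in frame #9 (_vn_lock, flags=66560).
# ASSUMPTION: these bit values are recalled from sys/sys/lockmgr.h and only
# a subset is listed; check them against the tree you are debugging.
LK_FLAGS = {
    0x000100: "LK_INTERLOCK",
    0x000200: "LK_NOWAIT",
    0x000400: "LK_RETRY",
    0x000800: "LK_SLEEPFAIL",
    0x010000: "LK_DOWNGRADE",
    0x080000: "LK_EXCLUSIVE",
    0x100000: "LK_RELEASE",
    0x200000: "LK_SHARED",
    0x400000: "LK_UPGRADE",
}


def decode_lk_flags(flags):
    """Return names of all known bits set in flags, plus leftover bits as hex."""
    names = [name for bit, name in sorted(LK_FLAGS.items()) if flags & bit]
    unknown = flags & ~sum(LK_FLAGS)  # bits not covered by the table above
    if unknown:
        names.append(hex(unknown))
    return names


print(hex(66560), decode_lk_flags(66560))
```

If the table is right, the value splits into a downgrade request plus
the retry attribute, which would at least be consistent with the
"downgrade a recursed lockmgr" wording of the panic message.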

Thanks,

-Harry


Re: [CFT] ypldap testing against OpenLDAP and Microsoft Active Directory

2016-08-03 Thread Harry Schmalzbauer
 Regarding Craig Rodrigues's message of 02.08.2016 22:31 (localtime):
> Thanks for the feedback.  Please consider posting your questions
> on freebsd-current so that other people can jump in and help
> answer your questions.
>
> I don't have an LDAP server to test against, so don't know the answer
> to all your questions.
>
> What type of LDAP server are you testing against?  Is it Active Directory?

Thanks for your response!
In this (production) environment I use OpenLDAP with core, cosine, nis
and sambaSchema.
But I'd also have MS Active Directory servers to test against, once I
get it working and switch to stable/11 in other setups too.

Found your question
https://reviews.freebsd.org/D4744#142095
which makes me wonder if ypldap(8) has been successfully used in FreeBSD
at all yet?

Unfortunately I don't have time to help find integration problems and
I'm not familiar with the NIS subsystem at all, so all I can contribute
is questions :-(

And a short summary which might help others joining ypldap(8) testing
under FreeBSD-11:

– 'ypldap -vd' gives reasonable output and does query the LDAP server
defined in the directory "" {} section, where it looks like you can use
any form of IP/hostname, including IPv6 addresses without any brackets.
– If run in foreground, it registers service "ypserv" version 2 only
with rpcbind.
– 'ypcat passwd.byname' just doesn't work, same is true for 'id'. No
interaction at all with ypldap(8) seems to happen, no errors/results.
– When stopping ypldap(8) from foreground, it does NOT unregister ypserv
service!

The same is true if you run ypldap(8) in background, started without
running ypserv(8)

– If started by the rc.d script, ypserv(8) registers service ypserv
version 1 and 2, before ypldap(8) overrides service ypserv version 2.
– 'ypcat passwd.byname' _sometimes_ responds with this error:
clnttcp_create failed
ypcat: no such map passwd.byname. Reason: Can't communicate with
portmapper
– ypldap(8) doesn't connect to the server at all when started by rc.d.
– When stopping ypldap(8) only, keeping ypserv (started by rc.d/ypldap)
running and starting ypldap(8) in the foreground, LDAP server connection
gets established and again sensible maps are shown, followed by regular:
connecting to directories
searching password entries
searching group entries
  In that state ypcat results in:
yp_all: clnt_call: RPC: Authentication error; why = Failed
(unspecified error)
yp_all: clnt_call: RPC: Authentication error; why = Failed
(unspecified error)
… repeat 19 more times …
ypcat: no such map passwd.byname. Reason: RPC failure
– After some minutes, ypcat doesn't respond with any errors/results again.

ldap.conf(5) contradicts
https://svnweb.freebsd.org/base?view=revision&revision=301480. The
latter (rc.d start script by Marcelo Araujo, CC'ed) starts ypserv(8) as a
dependency; the former claims ypldap(8) and ypserv(8) are mutually exclusive.

Since I have no clue how ypldap(8) is designed to integrate with NIS/YP,
I don't know how to start finding the root of presently existing
problems – with or without ypserv(8)?!

Right now, ypldap(8) in stable/11 doesn't enable LDAP maintained users
for me.
This should either be solved before 11-RELEASE or, if _nobody_ else can
confirm it's working, /etc/rc.d/ypldap needs to be suspended for
11-RELEASE and live in CURRENT until functional.

Any hints very welcome, but for now I'll have to switch back to nslcd(8).

Since CURRENT turned to stable/11 in the meantime, I'm posting to
stable@ referencing the original post:
https://lists.freebsd.org/pipermail/freebsd-current/2016-June/061775.html


> On Tue, Aug 2, 2016 at 10:49 AM, Harald Schmalzbauer
> <h.schmalzba...@omnilan.de> wrote:
>
>  Regarding Harald Schmalzbauer's message of 02.08.2016 17:36
> (localtime):
>
…
>
> > How can I define the host to which ypldap connects for LDAP
> > queries? Is it "directory"? What syntax is allowed, FQDN, IPs,
> > IP6-spelling?
> >
> > Tried a lot but always end up in ypldap[6960]: fatal: getpwnam:
> > Socket is not connected
>
> Hello, I made some progress :-)
>
> "fatal: getpwnam: Socket is not connected" was due to my outdated
> master.passwd, missing the _ypldap account.
> The "directory" seems to define the host to connect with any
> addressing; IPv6 addresses work just as they are notated everywhere,
> without any braces. Will try to find out what about unqualified host
> names and hosts with A and AAAA records...
>
> I couldn't figure out if ypserv(8) is needed to authenticate LDAP
> users on the local host, where ypldap(8) runs.
>
> Running ypldap in foreground gives lot of reasonable output like
> "pushing line: ..." with valid content.
> So contacting, binding and querying the LDAP seems to work :-)
>
> Unfortunately 'ypcat passwd.byname' and 'id someldapuser' do not
> work – neither with ypserv started nor withou

11-BETA3 Panic with ip6+ESP, Fatal trap 12, severe outage

2016-08-01 Thread Harry Schmalzbauer
 Hello,

unfortunately my upgrade from 10.3 to 11-BETA3 caused a machine outage.
ESP-encrypted IPv6 traffic causes an immediate crash.
Please see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211486
where I provided this info:

Unread portion of the kernel message buffer:
Kernel page fault with the following non-sleepable locks held:
exclusive rw tcpinp (tcpinp) r = 0 (0xf80007b1fe18) locked @ 
/usr/local/share/deploy-tools/RELENG_11/src/sys/netinet6/in6_pcb.c:1172
shared rw tcp (tcp) r = 0 (0x82ad2bd8) locked @ 
/usr/local/share/deploy-tools/RELENG_11/src/sys/netinet/tcp_input.c:802
stack backtrace:
#0 0x80ab4d30 at witness_debugger+0x70
#1 0x80ab6017 at witness_warn+0x3d7
#2 0x80ec63d7 at trap_pfault+0x57
#3 0x80ec5a64 at trap+0x284
#4 0x80ea6161 at calltrap+0x8
#5 0x80c43c51 at tcp_twrespond+0x231
#6 0x80c436f5 at tcp_twstart+0x1f5
#7 0x80c34078 at tcp_do_segment+0x23c8
#8 0x80c310b4 at tcp_input+0xe44
#9 0x80c30221 at tcp6_input+0xf1
#10 0x80c82799 at ipsec6_common_input_cb+0x4c9
#11 0x80c97101 at esp_input_cb+0x671
#12 0x80ca9e69 at swcr_process+0xd69
#13 0x80ca6c2f at crypto_dispatch+0x7f
#14 0x80c9605a at esp_input+0x4fa
#15 0x80c8179b at ipsec_common_input+0x40b
#16 0x80c8222d at ipsec6_common_input+0xcd
#17 0x80c64070 at ip6_input+0xc70


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x1a
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80c65afc
stack pointer   = 0x28:0xfe0091f1e5f0
frame pointer   = 0x28:0xfe0091f1e850
code segment= base rx0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 0 (em0 que)


Thanks,

-Harry


Re: mfi driver performance too bad on LSI MegaRAID SAS 9260-8i

2016-07-31 Thread Harry Schmalzbauer
 Regarding Jason Zhang's message of 17.06.2016 09:16 (localtime):
> Hi,
>
> I am working on storage service based on FreeBSD.  I look forward to a good 
> result because many professional storage company use FreeBSD as its OS.  But 
> I am disappointed with the Bad performance.  I tested the the performance of 
> LSI MegaRAID 9260-8i and had the following bad result:
>
>1.  Test environment:
> (1) OS:   FreeBSD 10.0 release
> (2) Memory:  16G
> (3) RAID adapter:   LSI MegaRAID 9260-8i
> (4) Disks:  9 SAS hard drives (1 rpm),  performance is expected 
> for each hard drive

Were the drives completely initialized?
I remember that at least one vendor had implemented read-after-write for
every sector on its first write.
It was with 15k 3.5" spindles and I'm really not sure which vendor it
was, so I won't name any. But a "slow init" solved a similar problem
for me back then...

-Harry


pkg(8) still not part of stable/11 (yet)???

2016-07-28 Thread Harry Schmalzbauer
 Hello,

I just tried 11-beta2 – as usual, many thanks for all your hard work,
Devs and REs!

I'm very concerned that there's still just a bootstrap-pkg :-(

Most of the machines I'm responsible for don't have internet access –
and never will.
Currently, I don't have an 11 machine handy for building pkg(8).
How do I get pkg(8) the official way (for machines without internet access)?
Are there official ftp servers available for manual downloading? (I know
of pkg's SRV lookup method, but since I need to 'scp' packages from
arbitrary internet-connected clients, I can't utilize pkg(8), especially
in a situation like now, where I don't have a single compatible
alternative client/host.)

pkg(8) has bitten me some dozen times since stable/10 due to its circular
dependency :-( :-( :-(

Please, don't ship FreeBSD 11 without a full version of pkg(8) in the base!

Thanks,

-Harry

Re: ahci-timeout regression in beta3

2016-03-05 Thread Harry Schmalzbauer
Regarding John Baldwin's message of 05.03.2016 22:50 (localtime):
> On Saturday, March 05, 2016 01:11:13 PM Harry Schmalzbauer wrote:
>> Regarding John Baldwin's message of 02.03.2016 18:32 (localtime):
…
>> With BETA3-iso, where booting fails, "random: unblocking device."
>> happens after timecounter initialization and before attaching ses0/cdX/adaX.
>> With HEAD-iso, where booting succeeds, "random: unblocking device."
>> happens way after ses0/adaX/cdX attached, right before rc.
> 
> Yes, HEAD's /dev/random has many more changes than were put into 10 for
> BETA3.
> 
>> On HEAD, ahci-devices attach in the same order as with -stable pre-r295480.
>> Since r295480, cdX attaches before adaX on -stable and while searching
>> for the cluprit, I had observed that attaching-order was a clear
>> indicator whether machine boots or not.
…
>> Perhpas it's related?!
>> https://lists.freebsd.org/pipermail/freebsd-stable/2015-July/082706.html
> 
> I think it's related in the sense that there is a timing race in ahci and
> that the /dev/random and RACCT changes alter the timing enough to trigger
> the race simply by changing the relative order of SYSINIT's during boot
> (and/or the amount of time between the ahci driver doing its initial
> probe and the second probe that is run for the interrupt config hooks that
> actually probes the attached SATA devices).


Thanks for your comment. I had that kind of race in mind, but I don't
have the skills to debug it myself, neither then nor now, and
unfortunately no time for an upgrade either ;-)

But meanwhile I have deployed 10.3-RC1 without reverting r295480 (and
have also removed "nooptions RACCT" (+ RCTL), since the ineffective
»kern.racct.enable« was corrected some time after that problem hit me).

The good news is that these ahci timeouts haven't shown up elsewhere yet –
I've updated several _very_ similar setups (C200 chipsets, but none with
a suspected faulty ODD).

So it's clearly not a show stopper for 10.3.

But there's a timing race to find, which affects ahci timeouts. These are
the nastiest ones I've ever fought... And it's not very welcome to find a
remote machine no longer booting because of a faulty ODD one wasn't
aware of, since it boots fine with the previous FreeBSD release and other
OSs.

Tell me if I can help out with my skills.

Thanks,

-Harry


Re: ahci-timeout regression in beta3

2016-03-05 Thread Harry Schmalzbauer
Regarding John Baldwin's message of 02.03.2016 18:32 (localtime):
> On Monday, February 29, 2016 07:29:03 PM Harry Schmalzbauer wrote:
>>  Regarding Harry Schmalzbauer's message of 28.02.2016 20:55 (localtime):
>>>  Hello,
>>>
>>> I have a remote machine with a probably defective ODD, but until r294989
>>> (from Jan 28th) I could boot with just these warnings:
>>> (cd1:ahcich1:0:0:0): READ(10). CDB: 28 00 00 38 85 e0 00 00 01 00
>>> (cd1:ahcich1:0:0:0): CAM status: SCSI Status Error
>>> (cd1:ahcich1:0:0:0): SCSI status: Check Condition
>>> (cd1:ahcich1:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read
>>> error)
>>> (cd1:ahcich1:0:0:0): Error 5, Unretryable error
>>> (cd1:ahcich1:0:0:0): cddone: got error 0x5 back
>>> …
>>>
>>> beta3 doesn't boot anymore, it's hanging with ahci-timeouts:
>>> ahcich2: Timeout on slot 11 port 0
>>> ahcich2: is 0008 cs  ss  rs 0800 tfd 40 derr
>>>  cmd 0004cb17
>>> (ada1:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 01 ae a3 50 40 5d 01 00
>>> 00 00 00
>>> ...
>>> (aprobe0:ahcich2:0:0:0) ATA_IDENTIFY. ACB eec 00 00 00 00 40 00 00 00 00
>>> 00 00
>>> (aprobe0:ahcich2:0:0:0) CAM status: Command timeout
>>> (aprobe0:ahcich2:0:0:0) Error 5, Retry was blocked
>>> ada1 detached
>>> ...
>>> The numbers (first ACB) and also the channel varies from time to time
>>
>> I could narrow it down to r295480
>> (https://svnweb.freebsd.org/base?view=revision&revision=295480)
>>
>> Reverting that lets the machine boot again.
>>
>> I captured verbose boot messages, finding out that problem relaxes with
>> verbose-booting, since ahci seems to recover:
>> …
>> TSC timecounter discards lower 1 bit(s)
>> Timecounter "TSC-low" frequency 1746033500 Hz quality -100
>> ahcich2: Timeout on slot 12 port 0
>> ahcich2: is 0008 cs  ss  rs 1000 tfd 40 serr
>>  cmd 0004cc17
>> ahcich2: AHCI reset...
>> (ada1:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 04 71 a3 50 40 5d 01 00
>> 00 00 00
>> (ada1:ahcich2:0:0:0): CAM status: Command timeout
>> (ada1:ahcich2:0:0:0): Retrying command
>> ahcich2: SATA connect time=100us status=0123
>> ahcich2: AHCI reset: device found
>> ahcich2: AHCI reset: device ready after 100ms
>> ahcich1: SNTF 0x0001
>> ahcich1: SNTF 0x0001
>> …
>>
>> I have checked twice that r295480 introduces boot failure here.
>>
>> I have absolutely no idea where/how/why/what race happens...
>>
>> Thanks for any hints,
> 
> That is most bizarre.  Does HEAD boot fine on this machine?  The change
> in question probably alters the timing of startup a bit since the random
> kthread is placed on the run queue later which might affect the relative
> order of kthreads as they start executing, but that would just mean it is
> exposting a race in some other part of the system.


Bizarre it is...
HEAD (r295683, 02/17/2016) boots fine.

BETA3 fails.

This time I checked with vendor-ISO images, while before it was a custom
setup rollout with local patches and special hw adaptions. Now I'm sure
it's not site-related.

With BETA3-iso, where booting fails, "random: unblocking device."
happens after timecounter initialization and before attaching ses0/cdX/adaX.
With HEAD-iso, where booting succeeds, "random: unblocking device."
happens way after ses0/adaX/cdX attached, right before rc.

On HEAD, ahci devices attach in the same order as with -stable pre-r295480.
Since r295480, cdX attaches before adaX on -stable, and while searching
for the culprit I had observed that the attach order was a clear
indicator of whether the machine boots or not.

Sorry, I can't provide more useful info at this time; I can only
describe simple symptoms :-(


While playing with that machine I remembered having had a similarly
bizarre problem before, arising with r284665. Here's what I found in my
kernel config:
# Don't build kernel with RACCT by default, which was enabled with r284665,
# since ahci(4) fails when using MSI with ahcichX timeout!
nooptions   RACCT   # Resource accounting framework
nooptions   RACCT_DEFAULT_TO_DISABLED # Set kern.racct.enable=0 by default
nooptions   RCTL    # Resource limits

Perhaps it's related?!
https://lists.freebsd.org/pipermail/freebsd-stable/2015-July/082706.html

Thanks,

-Harry

Re: ahci-timeout regression in beta3

2016-02-29 Thread Harry Schmalzbauer
 Regarding Harry Schmalzbauer's message of 28.02.2016 20:55 (localtime):
>  Hello,
>
> I have a remote machine with a probably defective ODD, but until r294989
> (from Jan 28th) I could boot with just these warnings:
> (cd1:ahcich1:0:0:0): READ(10). CDB: 28 00 00 38 85 e0 00 00 01 00
> (cd1:ahcich1:0:0:0): CAM status: SCSI Status Error
> (cd1:ahcich1:0:0:0): SCSI status: Check Condition
> (cd1:ahcich1:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read
> error)
> (cd1:ahcich1:0:0:0): Error 5, Unretryable error
> (cd1:ahcich1:0:0:0): cddone: got error 0x5 back
> …
>
> beta3 doesn't boot anymore, it's hanging with ahci-timeouts:
> ahcich2: Timeout on slot 11 port 0
> ahcich2: is 0008 cs  ss  rs 0800 tfd 40 derr
>  cmd 0004cb17
> (ada1:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 01 ae a3 50 40 5d 01 00
> 00 00 00
> ...
> (aprobe0:ahcich2:0:0:0) ATA_IDENTIFY. ACB eec 00 00 00 00 40 00 00 00 00
> 00 00
> (aprobe0:ahcich2:0:0:0) CAM status: Command timeout
> (aprobe0:ahcich2:0:0:0) Error 5, Retry was blocked
> ada1 detached
> ...
> The numbers (first ACB) and also the channel varies from time to time

I could narrow it down to r295480
(https://svnweb.freebsd.org/base?view=revision&revision=295480)

Reverting that lets the machine boot again.

I captured verbose boot messages and found that the problem eases with
verbose booting, since ahci seems to recover:
…
TSC timecounter discards lower 1 bit(s)
Timecounter "TSC-low" frequency 1746033500 Hz quality -100
ahcich2: Timeout on slot 12 port 0
ahcich2: is 0008 cs  ss  rs 1000 tfd 40 serr
 cmd 0004cc17
ahcich2: AHCI reset...
(ada1:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 04 71 a3 50 40 5d 01 00
00 00 00
(ada1:ahcich2:0:0:0): CAM status: Command timeout
(ada1:ahcich2:0:0:0): Retrying command
ahcich2: SATA connect time=100us status=0123
ahcich2: AHCI reset: device found
ahcich2: AHCI reset: device ready after 100ms
ahcich1: SNTF 0x0001
ahcich1: SNTF 0x0001
…

I have checked twice that r295480 introduces boot failure here.

I have absolutely no idea where/how/why/what race happens...

Thanks for any hints,

-Harry


ahci-timeout regression in beta3

2016-02-28 Thread Harry Schmalzbauer
 Hello,

I have a remote machine with a probably defective ODD, but until r294989
(from Jan 28th) I could boot with just these warnings:
(cd1:ahcich1:0:0:0): READ(10). CDB: 28 00 00 38 85 e0 00 00 01 00
(cd1:ahcich1:0:0:0): CAM status: SCSI Status Error
(cd1:ahcich1:0:0:0): SCSI status: Check Condition
(cd1:ahcich1:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read
error)
(cd1:ahcich1:0:0:0): Error 5, Unretryable error
(cd1:ahcich1:0:0:0): cddone: got error 0x5 back
…

beta3 doesn't boot anymore, it's hanging with ahci-timeouts:
ahcich2: Timeout on slot 11 port 0
ahcich2: is 0008 cs  ss  rs 0800 tfd 40 derr
 cmd 0004cb17
(ada1:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 01 ae a3 50 40 5d 01 00
00 00 00
...
(aprobe0:ahcich2:0:0:0) ATA_IDENTIFY. ACB eec 00 00 00 00 40 00 00 00 00
00 00
(aprobe0:ahcich2:0:0:0) CAM status: Command timeout
(aprobe0:ahcich2:0:0:0) Error 5, Retry was blocked
ada1 detached
...
The numbers (first ACB) and also the channel vary from time to time.

I couldn't track down the revision yet; I've checked r295124 and r295131 so far.
I just noticed that probing differs between the working (r294989) and the
non-working revision (r296074): the latter attaches cd after ada, the
former (working) probes cd first.

I'll try to find out more by next weekend.
Any hints welcome.

Thanks,

-Harry


Re: FreeBSD and UDF

2016-02-28 Thread Harry Schmalzbauer
 Regarding Eugene M. Zheganin's message of 25.02.2016 13:17 (localtime):
> Hi,
>
> recenlty I needed to mount the Windows 2012 R2 iso image, which pappened
> to ba an UDF image. After mdconfiging it and attempting to mount I got:
>
> # mount -t udf /dev/md1 cdrom01
> mount_udf: /dev/md1: Invalid argument
>
> udf is in kernel. Is UDF filesystem supported in FreeBSD ? I run
> 10.3-PRERELEASE r294405.

It's a matter of the UDF version.
I'm no expert; I just miss the ability to quickly look into the DVDs
commonly floating around, which doesn't work, so I asked ddg.
It seems today's DVDs are all UDF 2 – FreeBSD only has support for UDF
1.02: https://people.freebsd.org/~scottl/udf/

Blu-rays seem to use UDF 2.5:
https://lists.freebsd.org/pipermail/freebsd-fs/2015-July/021528.html
A UDF 2.0 driver for FreeBSD 10 is also mentioned there:
https://github.com/williamdevries/UDF

Hope this helps; unfortunately you'll have to enable UDF2 support on
your own.

-Harry




Re: svn commit: r294958 - in stable/10: share/man/man4 sys/dev/e1000 sys/dev/ixgb sys/dev/netmap

2016-01-29 Thread Harry Schmalzbauer
 Regarding Mike Tancsa's message of 29.01.2016 19:08 (localtime):
> On 1/27/2016 5:31 PM, Marius Strobl wrote:
>> Author: marius
>> Date: Wed Jan 27 22:31:08 2016
>> New Revision: 294958
>> URL: https://svnweb.freebsd.org/changeset/base/294958
>>
>> Log:
>>   Sync the e1000 drivers with what's in head as of r294327, modulo parts
>>   that don't apply to stable/10 (driver API, if_inc_counter(), RSS changes
>
> Hi,
>   I am seeing some timeouts since upgrading to this rev. I am running
> r295008, i386. onboard NIC
>
> Manufacturer: Supermicro
> Product Name: PDSMi
>
>
> em0: Watchdog timeout Queue[0]-- resetting
> Interface is RUNNING and ACTIVE
> em0: TX Queue 0 --
> em0: hw tdh = 946, hw tdt = 159
> em0: Tx Queue Status = -2147483648
> em0: TX descriptors avail = 786
> em0: Tx Descriptors avail failure = 0
> em0: RX Queue 0 --
> em0: hw rdh = 401, hw rdt = 400
> em0: RX discarded packets = 0
> em0: RX Next to Check = 401
> em0: RX Next to Refresh = 400
> em0: link state changed to DOWN
> em0: link state changed to UP
> em0: Watchdog timeout Queue[0]-- resetting
> Interface is RUNNING and ACTIVE
> em0: TX Queue 0 --
> em0: hw tdh = 87, hw tdt = 378
> em0: Tx Queue Status = -2147483648
> em0: TX descriptors avail = 720
> em0: Tx Descriptors avail failure = 0
> em0: RX Queue 0 --
> em0: hw rdh = 740, hw rdt = 739
> em0: RX discarded packets = 0
> em0: RX Next to Check = 741
> em0: RX Next to Refresh = 740
> em0: link state changed to DOWN
> em0: link state changed to UP
> Limiting open port RST response from 292 to 200 packets/sec
> em0: Watchdog timeout Queue[0]-- resetting
> Interface is RUNNING and ACTIVE
> em0: TX Queue 0 --
> em0: hw tdh = 611, hw tdt = 840
> em0: Tx Queue Status = -2147483648
> em0: TX descriptors avail = 773
> em0: Tx Descriptors avail failure = 0
> em0: RX Queue 0 --
> em0: hw rdh = 660, hw rdt = 659
> em0: RX discarded packets = 0
> em0: RX Next to Check = 660
> em0: RX Next to Refresh = 659
> em0: link state changed to DOWN
> em0: link state changed to UP
>
>
>
> # pciconf -lBvcb em0
> em0@pci0:13:0:0:class=0x02 card=0x108c15d9 chip=0x108c8086
> rev=0x03 hdr=0x00
> vendor = 'Intel Corporation'
> device = '82573E Gigabit Ethernet Controller (Copper)'

I guess you haven't compiled the kernel with EM_MULTIQUEUE. I can't
remember if the 82573 is supposed to be able to handle 2 queues. I can't
help solve your problem anyway, but I found that the default number of
rx/tx descriptors was at some point increased from 1024 to 4096 for my
82574.
What does hw.em.txd read with your 82573?
Before my EM_MULTIQUEUE problem vanished, reducing hw.em.txd (and rxd)
to 256 eased the timeout problem a lot.
It seems your interface recovers after the watchdog reset? Mine stayed
unusable until I triggered ifconfig down/up.
Have you checked whether disabling TSO changes anything?
Checking whether hw.em.enable_msix changes the symptoms could probably
also narrow down the root cause.

Hope your problem also vanishes soon :-)

-Harry



em(4) watchdog timeout redemption [Was: Re: svn commit: r294958 - in stable/10: share/man/man4 sys/dev/e1000 sys/dev/ixgb sys/dev/netmap]

2016-01-28 Thread Harry Schmalzbauer
 Regarding Marius Strobl's message of 27.01.2016 23:31 (localtime):
> Author: marius
> Date: Wed Jan 27 22:31:08 2016
> New Revision: 294958
> URL: https://svnweb.freebsd.org/changeset/base/294958
>
> Log:
>   Sync the e1000 drivers with what's in head as of r294327, modulo parts
>   that don't apply to stable/10 (driver API, if_inc_counter(), RSS changes
>   etc.) and modulo r287465 (which reportedly breaks igb(4)), i. e. assorted
>   fixes and improvements only:
>   
>   o MFC r267385 (partial):
> - Don't compare bus_dma map pointers for static DMA allocations against
>   NULL to determine if bus_dmamap_unload() or bus_dmamem_free() should be
>   called. Instead, check the associated bus and virtual addresses.
> - Don't clear static DMA maps to NULL.
>   o MFC r284933:
> Delete the refernce to VLAN handling being disabled by default. This is
> no longer the case. [1]
>   o MFC r285639:
> Add an adapter CORE lock in the DDB hook em_dump_queue to avoid WITNESS
> panic in em_init_locked() while debugging.
>   o MFC r285879:
> - Remove unused txd_saved.
> - Intialize txd_upper, txd_lower and txd_used at declaration.
>   o MFC r286162:
> Free mbufs when busdma loading fails.
>   o MFC r286829:
> Add capability to disable CRC stripping as it breaks IPMI/BMC capabilities
> on certain adatpers. [2]
>   o MFC r286831: [3]
> - Increase EM_MAX_SCATTER to 64 such that the size of em_xmit()::
>   segs[EM_MAX_SCATTER] doesn't get overrun by things like NFS that can
>   and do shove more than 32 segs when being used with em(4) and TSO4.
> - Update tso handling code in em_xmit() with update from jhb@
> - Set if_hw_tsomax, if_hw_tsomaxsegcount and if_hw_tsomaxsegsize to
>   appropriate values.
> - Define a TSO workaround "magic" number of 4 that is used to avoid an
>   alignment issue in hardware.
> - Change a couple of integer values that were used as booleans to actual
>   bool types.
> - Ensure that em_enable_intr() enables the appropriate mask of interrupts
>   and not just a hardcoded define of values.
>   o MFC r286832:
> e1000/if_lem.c bump to 1.1.0
>   o MFC r286833:
> Bump all copywrite dates to 2015.
>   o MFC r287112:
> Style/whitespace cleanup in shared/common code.
>   o MFC r293331:
> - Switch em(4) to the extended RX descriptor format.
> - Split rxbuffer and txbuffer apart to support the new RX descriptor
>   format structures. Move rxbuffer manipulation to em_setup_rxdesc() to
>   unify the new behavior changes.
> - Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
>   in the card.
> - Change em_receive_checksum() to process the new rxdescriptor format
>   status bit.
>   o MFC r293332:
> Disable the reuse of checksum offload context descriptors in the case
> of multiple queues in em(4). Document errata in the code.
>   o MFC r293854:
> Given that em(4), lem(4) and igb(4) hardware doesn't require the
> alignment guarantees provided by m_defrag(9), use m_collapse(9)
> instead for performance reasons.
> While at it, sanitize the statistics softc members, i. e. retire
> unused ones and add SYSCTL nodes missing for actually used ones.
>   
>   PR: 118693 [1], 161277 [2], 195078 [3], 199174 [3], 200221 [3]

Thanks, especially to sbruno@!
I'd like to confirm that r294958 fixes multiple em(4) problems I observed
up to r294156, especially with EM_MULTIQUEUE support on Hartwell (82574).
(I haven't filed a bug report since I haven't had time to analyze; 199174
and 200221 seem to match well.)

Glad to see 10.3 will ship with em(4) able to sustain GbE with one NFS
transfer (111.3 MiB/s), while keeping latency low for additional
(low-throughput) connections, without unrecoverable watchdog timeouts
anymore (adding a 2nd queue to em(4) reduces latency from 10ms to
~3ms on new sockets).
For the record, this kind of watchdog timeout with unsuccessful
interface resets is fixed for me:
em0: Watchdog timeout Queue[0]-- resetting
Interface is RUNNING and ACTIVE
em0: TX Queue 0 --
em0: hw tdh = 210, hw tdt = 674
em0: Tx Queue Status = -2147483648
em0: TX descriptors avail = 3632
em0: Tx Descriptors avail failure = 0
em0: RX Queue 0 --
em0: hw rdh = 896, hw rdt = 895
em0: RX discarded packets = 0
em0: RX Next to Check = 896
em0: RX Next to Refresh = 895
em0: TX Queue 1 --
em0: hw tdh = 575, hw tdt = 716
em0: Tx Queue Status = -2147483648
em0: TX descriptors avail = 3937
em0: Tx Descriptors avail failure = 0
em0: RX Queue 1 --
em0: hw rdh = 192, hw rdt = 191
em0: RX discarded packets = 0
em0: RX Next to Check = 192
em0: RX Next to Refresh = 191
em0: link state changed to DOWN
em0: link state changed to UP
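
For anyone reading such a dump: the tdh/tdt values are the transmit
descriptor head and tail registers, so the number of descriptors still
pending in hardware can be estimated from them. The sketch below does
this for the two queues above. The ring size of 4096 is an assumption
(it would correspond to hw.em.txd=4096), and the reading that the
driver's own "avail" counter can lag behind the registers is my
interpretation, not something confirmed from the driver source.

```python
# Estimate pending TX descriptors from the head (tdh) and tail (tdt)
# registers printed by the watchdog dump.
# ASSUMPTION: ring size of 4096 descriptors, i.e. hw.em.txd=4096.
RING_SIZE = 4096


def in_flight(tdh, tdt, ring_size=RING_SIZE):
    """Descriptors between head and tail, allowing for ring wrap-around."""
    return (tdt - tdh) % ring_size


def as_u32(value):
    """Reinterpret a signed 32-bit integer as the raw unsigned value."""
    return value & 0xFFFFFFFF


# (queue, tdh, tdt, "TX descriptors avail") taken from the dump above
for queue, tdh, tdt, avail in [(0, 210, 674, 3632), (1, 575, 716, 3937)]:
    pending = in_flight(tdh, tdt)
    print(f"queue {queue}: {pending} pending, {RING_SIZE - pending} free by "
          f"register math, driver reports {avail}")

# "Tx Queue Status = -2147483648" printed as an unsigned register value
print(hex(as_u32(-2147483648)))
```

The same function works for a 1024-entry ring by passing ring_size=1024.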

-Harry


Big geli problem: MD5 hash checksum mismatch for da0

2008-01-22 Thread Harry Schmalzbauer
Hello,

I tried to change my passphrase for a geli provider.
As the man page says, I attached the provider (da0) and used
'geli setkey da0' to change the key (only one key, no keyfile used).
Everything seemed to work, but after detaching, any attach attempt fails with:
MD5 hash checksum mismatch for da0

What went wrong? And how can I solve it? Needless to say, it's
important data, and I really hope geli didn't corrupt it!

Thanks,

-Harry