from:"Landry Breuil"

Re: edgerouter poe panics when bringing network up at boot

2024-07-06 Thread Landry Breuil

On Tue, Jul 02, 2024 at 07:50:56PM +, Miod Vallat wrote:
> > hi,
> > 
> > playing with 7.5 on an edgerouter poe, it panics at boot:
> 
> Congratulations, you've reached a divide by zero in the kernel.
> 
> This is caused by cn30xxgmx_rgmii_speed() setting the local variable
> `baudrate' to zero (with an XXX comment) prior to dividing by it.
> 
> The following diff sweeps the issue under the rug by following the
> "consider unknown value as 1Gbps" logic used all over the file. It might
> help your setup.

i've committed the other diff, but should this one go in too for
correctness? in that case, any oks, or someone to commit it with my ok?

Landry

> Index: dev/cn30xxgmx.c
> ===
> RCS file: /OpenBSD/src/sys/arch/octeon/dev/cn30xxgmx.c,v
> retrieving revision 1.54
> diff -u -p -r1.54 cn30xxgmx.c
> --- dev/cn30xxgmx.c   20 May 2024 23:13:33 -  1.54
> +++ dev/cn30xxgmx.c   2 Jul 2024 19:49:45 -
> @@ -991,7 +991,8 @@ cn30xxgmx_rgmii_speed(struct cn30xxgmx_p
>   baudrate = IF_Gbps(1);
>   break;
>   default:
> - baudrate = 0/* XXX */;
> + /* Assume 1Gbps for now*/
> + baudrate = IF_Gbps(1);  /* XXX */
>   break;
>   }
>   ifp->if_baudrate = baudrate;
> @@ -1202,7 +1203,7 @@ cn30xxgmx_rgmii_speed_speed(struct cn30x
>   SET(prt_cfg, PRTN_CFG_SPEED);
>   break;
>   default:
> - /* NOT REACHED! */
> + /* THEORETICALLY NOT REACHED! */
>   /* Following configuration is default value of system.
>   */
>   tx_clk = 1;
>
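
For readers following along, the failure mode described above can be reproduced outside the kernel. The following is a minimal standalone sketch, not the driver code: `SKETCH_MBPS`, `SKETCH_GBPS`, `enum link_speed`, `speed_to_baudrate()` and `picoseconds_per_bit()` are hypothetical names made up for illustration. It only shows why a default case that yields zero is dangerous once the value is used as a divisor, and how the "treat unknown as 1Gbps" fallback avoids the trap.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's IF_Mbps()/IF_Gbps() macros. */
#define SKETCH_MBPS(x)	((uint64_t)(x) * 1000000ULL)
#define SKETCH_GBPS(x)	(SKETCH_MBPS((x) * 1000ULL))

/* Hypothetical link speed codes; the real driver reads these from hardware. */
enum link_speed { SPEED_10, SPEED_100, SPEED_1000, SPEED_UNKNOWN };

/* Map a speed code to a baudrate, never returning zero. */
uint64_t
speed_to_baudrate(enum link_speed s)
{
	switch (s) {
	case SPEED_10:
		return SKETCH_MBPS(10);
	case SPEED_100:
		return SKETCH_MBPS(100);
	case SPEED_1000:
		return SKETCH_GBPS(1);
	default:
		/* Assumption: unknown speed is treated as 1Gbps. */
		return SKETCH_GBPS(1);
	}
}

/*
 * Example of a later computation that divides by the baudrate.
 * With a zero baudrate this division would trap.
 */
uint64_t
picoseconds_per_bit(enum link_speed s)
{
	return 1000000000000ULL / speed_to_baudrate(s);
}
```

With the fallback in place, `picoseconds_per_bit(SPEED_UNKNOWN)` gives the same result as for a 1Gbps link instead of faulting. As the thread notes, this only hides the unknown state; it does not make the port work.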

Re: edgerouter poe panics when bringing network up at boot

2024-07-04 Thread Landry Breuil

On Wed, Jul 03, 2024 at 02:31:02PM +, Visa Hankala wrote:
> On Wed, Jul 03, 2024 at 03:02:52PM +0200, Landry Breuil wrote:
> > On Tue, Jul 02, 2024 at 07:50:56PM +, Miod Vallat wrote:
> > > > hi,
> > > > 
> > > > playing with 7.5 on an edgerouter poe, it panics at boot:
> 
> The condition should be (port > 1 && octeon_boot_info->board_rev_major == 1).
> Otherwise the code skips a working port on the EdgeRouter Lite.
> The Lite uses the same board id but has board_rev_major == 2 as far
> as I know.

sure, here's a new diff:

RCS file: /cvs/src/sys/arch/octeon/dev/cn30xxsmi.c,v
diff -u -r1.12 cn30xxsmi.c
--- cn30xxsmi.c 20 May 2024 23:13:33 -  1.12
+++ cn30xxsmi.c 4 Jul 2024 11:04:59 -
@@ -197,6 +197,10 @@
reg = nutm25_phys[port];
break;
case BOARD_UBIQUITI_E100:
+   /* XXX Skip the switch port on ERPoe-5.
+* XXX There is no driver for it. */
+   if (port > 1 && octeon_boot_info->board_rev_major == 1)
+   return ENOENT;
case BOARD_UBIQUITI_E120:
if (port > 2)
return ENOENT;

i didnt see any use for board_rev_major so i wrongly assumed that
https://github.com/openbsd/src/commit/2bbf581cd0b529f21a6f418a4467e74927b76dd5
made a proper distinction between those.

Landry
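
To make the effect of the condition easier to check, here is a small standalone sketch of the same decision, written as a plain function. It is an illustration only: `enum board`, `phy_port_check()` and the `rev_major` parameter are hypothetical names, and the board values are placeholders rather than the real board ids. The logic mirrors the diff above: on the E100 board id, ports above 1 are rejected only when the major revision is 1, then control falls through to the shared check that rejects ports above 2.

```c
#include <errno.h>

/* Hypothetical board identifiers; placeholders, not the real ids. */
enum board { BOARD_E100, BOARD_E120 };

/*
 * Return 0 if a PHY lookup should proceed for this port,
 * or ENOENT if the port should be skipped.
 */
int
phy_port_check(enum board b, int port, int rev_major)
{
	switch (b) {
	case BOARD_E100:
		/* Skip the switch port only on major revision 1. */
		if (port > 1 && rev_major == 1)
			return ENOENT;
		/* FALLTHROUGH */
	case BOARD_E120:
		if (port > 2)
			return ENOENT;
		break;
	}
	return 0;
}
```

Under these assumptions, port 2 is skipped for major revision 1 and kept for major revision 2, which matches the behaviour Visa asked for.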

Re: edgerouter poe panics when bringing network up at boot

2024-07-03 Thread Landry Breuil

On Tue, Jul 02, 2024 at 07:50:56PM +, Miod Vallat wrote:
> > hi,
> > 
> > playing with 7.5 on an edgerouter poe, it panics at boot:
> 
> Congratulations, you've reached a divide by zero in the kernel.
> 
> This is caused by cn30xxgmx_rgmii_speed() setting the local variable
> `baudrate' to zero (with an XXX comment) prior to dividing by it.
> 
> The following diff sweeps the issue under the rug by following the
> "consider unknown value as 1Gbps" logic used all over the file. It might
> help your setup.

it's right in the sense that it doesn't panic anymore and i can
up/configure the interface:

erpoe# ifconfig cnmac2 media
cnmac2: flags=8843 mtu 1500
lladdr 80:2a:a8:8e:2d:51
index 3 priority 0 llprio 3
media: Ethernet manual (none)
supported media:
media manual
inet 10.0.2.2 netmask 0xff00 broadcast 10.0.2.255

but it shows no media type, doesn't work and no packets flow.

digging in the archives, i found
https://marc.info/?l=openbsd-bugs&m=151063517020444&w=2 which seems to
say that it would be better to skip that port entirely (and that there's
no hope for the two remaining ports?)

visa, any insight on this?

trying the following diff adapted from the 2017 thread:

Index: arch/octeon/dev/cn30xxsmi.c
===
RCS file: /cvs/src/sys/arch/octeon/dev/cn30xxsmi.c,v
diff -u -r1.12 cn30xxsmi.c
--- arch/octeon/dev/cn30xxsmi.c 20 May 2024 23:13:33 -  1.12
+++ arch/octeon/dev/cn30xxsmi.c 3 Jul 2024 12:54:48 -
@@ -197,6 +197,10 @@
reg = nutm25_phys[port];
break;
case BOARD_UBIQUITI_E100:
+   /* XXX Skip the switch port on ERPoe-5.
+* XXX There is no driver for it. */
+   if (port > 1)
+   return ENOENT;
case BOARD_UBIQUITI_E120:
if (port > 2)
return ENOENT;

dmesg goes from
mainbus0 at root: board 20002 rev 1.27, model CN3xxx/CN5xxx
...
octgmx0 at octpip0 interface 0
cnmac0 at octgmx0: port 0 RGMII, address 80:2a:a8:8e:2d:4f
atphy0 at cnmac0 phy 7: AR8035 10/100/1000 PHY, rev. 2
cnmac1 at octgmx0: port 1 RGMII, address 80:2a:a8:8e:2d:50
atphy1 at cnmac1 phy 6: AR8035 10/100/1000 PHY, rev. 2
cnmac2 at octgmx0: port 2 RGMII, address 80:2a:a8:8e:2d:51
com0 at simplebus0: ns16550a, 64 byte fifo

to now only two supported ports:
octgmx0 at octpip0 interface 0
cnmac0 at octgmx0: port 0 RGMII, address 80:2a:a8:8e:2d:4f
atphy0 at cnmac0 phy 7: AR8035 10/100/1000 PHY, rev. 2
cnmac1 at octgmx0: port 1 RGMII, address 80:2a:a8:8e:2d:50
atphy1 at cnmac1 phy 6: AR8035 10/100/1000 PHY, rev. 2
com0 at simplebus0: ns16550a, 64 byte fifo

at least the edgerouter lite's 3 physical ports are supported.. it's a
pity to have only 2 working ports out of 5 physical ones...

Landry

Re: edgerouter poe panics when bringing network up at boot

2024-07-02 Thread Landry Breuil

On Tue, Jul 02, 2024 at 09:31:42PM +0200, Landry Breuil wrote:
> hi,
> 
> playing with 7.5 on an edgerouter poe, it panics at boot:
> 
> starting network
> 
> Trap cause = 13 Frame 0x98ba79b8
> Trap PC 0x8108f438 RA 0x8108f41c fault 0x9fc9ae078
> cn30xxgmx_rgmii_speed+0x270 (c0004870,b4b2685682005875,10,7e683fa7441755a4)  ra 0x810919ac sp 0x98ba7b10, sz 48
> cn30xxgmx_reset_speed+0x4c (c0004870,4356c4a931833103,10,7e683fa7441755a4)  ra 0x81063b18 sp 0x98ba7b40, sz 16
> cnmac_init+0xb8 (c0004870,c0899982aa17119c,10,7e683fa7441755a4)  ra 0x81062630 sp 0x98ba7b50, sz 32
> cnmac_ioctl+0x168 (c0004870,c0899982aa17119c,10,5241b1e1580098d8)  ra 0x814f9f98 sp 0x98ba7b70, sz 80
> in_ifinit+0x170 (c0004870,c0899982aa17119c,10,5241b1e1580098d8)  ra 0x814f9b58 sp 0x98ba7bc0, sz 96
> in_ioctl_change_ifaddr+0x420 (c0004870,c0899982aa17119c,10,19c4fd59d992c467)  ra 0x814f9094 sp 0x98ba7c20, sz 96
> User-level: pid 9874
> stopped on non ddb fault
> Stopped at  cn30xxgmx_rgmii_speed+0x270:teq at,zero
> 
> # cat /etc/hostname.cnmac*
> inet 10.0.0.2 0xff00
> inet 10.0.1.2 0xff00
> inet 10.0.2.2 0xff00
> 
> it doesnt panic if i move away the config files and configure network
> after boot.

more testing, it just panics (with the same trace) with ifconfig cnmac2
up. having hostname.cnmac{0,1} bringing network up at boot is fine.

Landry

edgerouter poe panics when bringing network up at boot

2024-07-02 Thread Landry Breuil

hi,

playing with 7.5 on an edgerouter poe, it panics at boot:

starting network

Trap cause = 13 Frame 0x98ba79b8
Trap PC 0x8108f438 RA 0x8108f41c fault 0x9fc9ae078
cn30xxgmx_rgmii_speed+0x270 (c0004870,b4b2685682005875,10,7e683fa7441755a4)  ra 0x810919ac sp 0x98ba7b10, sz 48
cn30xxgmx_reset_speed+0x4c (c0004870,4356c4a931833103,10,7e683fa7441755a4)  ra 0x81063b18 sp 0x98ba7b40, sz 16
cnmac_init+0xb8 (c0004870,c0899982aa17119c,10,7e683fa7441755a4)  ra 0x81062630 sp 0x98ba7b50, sz 32
cnmac_ioctl+0x168 (c0004870,c0899982aa17119c,10,5241b1e1580098d8)  ra 0x814f9f98 sp 0x98ba7b70, sz 80
in_ifinit+0x170 (c0004870,c0899982aa17119c,10,5241b1e1580098d8)  ra 0x814f9b58 sp 0x98ba7bc0, sz 96
in_ioctl_change_ifaddr+0x420 (c0004870,c0899982aa17119c,10,19c4fd59d992c467)  ra 0x814f9094 sp 0x98ba7c20, sz 96
User-level: pid 9874
stopped on non ddb fault
Stopped at  cn30xxgmx_rgmii_speed+0x270:teq at,zero

# cat /etc/hostname.cnmac*
inet 10.0.0.2 0xff00
inet 10.0.1.2 0xff00
inet 10.0.2.2 0xff00

it doesnt panic if i move away the config files and configure network
after boot.

no cable is plugged into cnmac0, and a cable is plugged into cnmac1, directly
connected to a working er-lite.

reproducible at will so can try various things if instructed to.

Landry

Re: t945s hangs on ttyflags -a

2024-03-31 Thread Landry Breuil

On Sun, Mar 31, 2024 at 09:30:05AM +0200, Landry Breuil wrote:
> hi,
> 
> istr this has been discussed/fixed at some point and it used to work
> last year, but the t495s i have here on -current hangs at ttyflags -a in
> /etc/rc, commenting it again allows boot to succeed.
> 
> dmesg attached with -current. i dont boot that machine often enough, so
> the regression window is .. large.. guess i'll try bisecting.
> 
> last known working: #1463: Wed Nov 22 21:13:03 MST 2023.

after bisecting a bit, i'm puzzled because it seems ttyflags -a hangs
only happen when a spurious com0 is found in dmesg:

com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: probed fifo depth: 0 bytes

but that device isnt present in the working boots from various kernel
versions (tried kernels from end of december to 1 feb so far)

it's enough to test with boot -s and ttyflags -a, i think i triggered it
once with a kernel from #1587: Sat Dec 30 22:44:51 MST 2023, next boots
on the same kernel were okay..

I've tried differentiating cold boots vs reboots, but that didn't help.

Landry

t945s hangs on ttyflags -a

2024-03-31 Thread Landry Breuil

hi,

istr this has been discussed/fixed at some point and it used to work
last year, but the t495s i have here on -current hangs at ttyflags -a in
/etc/rc, commenting it again allows boot to succeed.

dmesg attached with -current. i dont boot that machine often enough, so
the regression window is .. large.. guess i'll try bisecting.

last known working: #1463: Wed Nov 22 21:13:03 MST 2023.
OpenBSD 7.5-current (RAMDISK_CD) #1: Sat Mar 30 06:11:20 MDT 2024
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
real mem = 14860877824 (14172MB)
avail mem = 14406131712 (13738MB)
random: good seed from bootblocks
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.1 @ 0xb9eaa000 (62 entries)
bios0: vendor LENOVO version "R13ET54W(1.28 )" date 01/12/2023
bios0: LENOVO 20QJCTO1WW
acpi0 at bios0: ACPI 5.0
acpi0: tables DSDT FACP SSDT SSDT SSDT TPM2 SSDT MSDM SLIC BATB HPET APIC MCFG 
SBST WSMT IVRS SSDT CRAT CDIT FPDT SSDT SSDT SSDT UEFI SSDT
acpihpet0 at acpi0: 14318180 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx, 2300.00 MHz, 17-18-01, 
patch 08108109
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,HWPSTATE,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 32KB 64b/line 8-way D-cache, 64KB 64b/line 4-way I-cache, 512KB 64b/line 
8-way L2 cache, 4MB 64b/line 16-way L3 cache
cpu0: apic clock running at 25MHz
cpu0: mwait min=64, max=64, C-substates=1.1, IBE
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 32 pa 0xfec0, version 21, 24 pins, can't remap
ioapic1 at mainbus0: apid 33 pa 0xfec01000, version 21, 32 pins, can't remap
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (GPP0)
acpiprt2 at acpi0: bus 1 (GPP1)
acpiprt3 at acpi0: bus 2 (GPP2)
acpiprt4 at acpi0: bus 3 (GPP3)
acpiprt5 at acpi0: bus -1 (GPP4)
acpiprt6 at acpi0: bus -1 (GPP5)
acpiprt7 at acpi0: bus 4 (GPP6)
acpiprt8 at acpi0: bus 5 (GP17)
acpiprt9 at acpi0: bus -1 (GP18)
acpiec0 at acpi0
"PNP0C0C" at acpi0 not configured
acpipci0 at acpi0 PCI0: 0x0010 0x0011 0x
acpicmos0 at acpi0
"PNP0C0A" at acpi0 not configured
"ACPI0003" at acpi0 not configured
"LEN0268" at acpi0 not configured
"SMB0001" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"PNP0C0D" at acpi0 not configured
"PNP0C0E" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
amdgpio0 at acpi0 GPIO uid 0 addr 0xfed81500/0x400 irq 9, 184 pins
"USBC000" at acpi0 not configured
"STM7308" at acpi0 not configured
"PNP0C14" at acpi0 not configured
acpicpu at acpi0 not configured
acpipwrres at acpi0 not configured
acpipwrres at acpi0 not configured
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "AMD 17h/1xh Root Complex" rev 0x00
"AMD 17h/1xh IOMMU" rev 0x00 at pci0 dev 0 function 2 not configured
pchb1 at pci0 dev 1 function 0 "AMD 17h PCIE" rev 0x00
ppb0 at pci0 dev 1 function 2 "AMD 17h/1xh PCIE" rev 0x00: msi
pci1 at ppb0 bus 1
iwm0 at pci1 dev 0 function 0 "Intel Dual Band Wireless-AC 9260" rev 0x29, msix
ppb1 at pci0 dev 1 function 3 "AMD 17h/1xh PCIE" rev 0x00: msi
pci2 at ppb1 bus 2
nvme0 at pci2 dev 0 function 0 "Samsung SM981/PM981 NVMe" rev 0x00: msix, NVMe 
1.3
nvme0: SAMSUNG MZVLB1T0HBLR-000L7, firmware 4M2QEXF7, serial S4EMNX0N762169
scsibus0 at nvme0: 2 targets, initiator 0
sd0 at scsibus0 targ 1 lun 0: 
sd0: 976762MB, 512 bytes/sector, 2000409264 sectors
ppb2 at pci0 dev 1 function 4 "AMD 17h/1xh PCIE" rev 0x00: msi
pci3 at ppb2 bus 3
re0 at pci3 dev 0 function 0 "Realtek 8168" rev 0x0e: RTL8168EP/8111EP 
(0x5000), msi, address 00:2b:67:e6:c7:83
rgephy0 at re0 phy 7: RTL8251 PHY, rev. 0
"Realtek RealManage Serial" rev 0x0e at pci3 dev 0 function 1 not configured
"Realtek RealManage Serial" rev 0x0e at pci3 dev 0 function 2 not configured
"Realtek RealManage IPMI" rev 0x0e at pci3 dev 0 function 3 not configured
ehci0 at pci3 dev 0 function 4 "Realtek RealManage USB" rev 0x0e: apic 33 int 15
ehci0: pre-2.0 USB rev
ppb3 at pci0 dev 1 function 7 "AMD 17h/1xh PCIE" rev 0x00: msi
pci4 at ppb3 bus 4
rtsx0 at pci4 dev 0 function 0 "Realtek RTS522A Card Reader" rev 0x01: msi
sdmmc0 at rtsx0: 4-bit, dma
pchb2 at pci0 dev 8 function 0 "AMD 17h PCIE" rev 0x00
ppb4 at pci0 dev 8 function 1 "AMD 17h/1xh PCIE" rev 0x00
pci5 at ppb4 bus 5
"ATI Picasso" rev 0xd1 at pci5 dev 0 function 0 not configured
"ATI Radeon Vega HD Audio" rev

Re: Thunderbird crashes on certain operations on big mailboxes

2023-07-18 Thread Landry Breuil

On Mon, Jul 17, 2023 at 10:38:51PM +0200, Peter N. M. Hansteen wrote:
> Subject says the essentials. sendbug output with some supplemental egdb output
> follows. What are useful steps to further debug this?

per /usr/local/share/doc/pkg-readmes/thunderbird install the
debug-thunderbird package (with a version matching the one you use) to
get something useful from the traceback, the one you show is 100%
useless...

This is probably pointless anyway, as 102esr is soon EOL, so you should
also try to reproduce with 115 (see ports@, i've sent a diff with
packages..)

also, this might be due to a corrupted profile (because
crashes/fsck/etc), and at that point you'd better recreate the profile
from scratch, not sure a corrupted profile can be salvaged at all.

Landry

Re: Memory issue on desktop at home, probably Radeon related

2023-03-02 Thread Landry Breuil

On Thu, Mar 02, 2023 at 10:26:55AM +0100, Marc Espie wrote:
> Every few days, my desktop at home suddenly slows down. Looking at top,
> I can see that literally everything is being forced into swap (I have 16G
> of memory, when the problem starts, I usually have roughly 5-7G of memory 
> used, so very far below the limit, and I can see it drain into swap, 
> until there's 100K of memory used and everything else in swap).
> 
> I haven't tinkered with bufcachepercent.
> 
> At that point the machine is completely frozen, and I have no option besides
> materially stopping it.
> 
> If I catch it before that, I can try to reboot it to avoid fsck.
> 
> Same kind of workload I use at home and at work: lots of chrome with images, 
> lots of image/video displays with mpv, lots of youtube, and also a game
> that wants webgl (elvenar) which doesn't work on firefox at all.
> 
> The big difference is that my home box is an amd with a radeon TURKS adapter
> (r300 I think ?)
> 
> I suspect something in the memory allocator of dri wants memory with dma
> constraints, and somehow, the pagedaemon decides to put everything into swap.
> 
> I've talked a bit to kettenis@, and followed a red herring through to 
> instrumenting the memory allocation routines that this radeon does NOT 
> abuse (namely alloc_pages and friends: allocates a grant total of one single
> page). I will try to look at the ttm stuff next, probably.
> 
> Since this takes usually a day or three to trigger, I don't know when this
> started exactly, but this has been going on for at least a few months.
> 
> Just in case somebody has a bright idea, or is experiencing similar issues.

I'm having a very similar issue on my old optiplex with radeondrm @work w/
8GB RAM (sorry no dmesg for now), running top -SH in a term i see
pagedaemon going berserk before things slow down to a halt. Sometimes it
recovers after a minute or two, sometimes i give up and powercycle.
Happens most days and sometimes several times a day. Been
happening for 1 year maybe.
Discussed it a bit with claudio@ who told me it was somewhat already
known, about having a 'special region for the first 4Gb of RAM' that
apparently goes low on available free mem and advised talking to mpi@ :).

I know this doesnt help, but if i have some precise guidance i can try
to extract info from the box by breaking into ddb, i should have serial
somewhere.

Landry

7.2: 'No buffer space available' after a while with axe0

2022-12-29 Thread Landry Breuil

Hi,

at some remote place i'm using an axe0 to connect to the ISP's CPE in
bridge mode, which hands me a public IP. The CPE had power plug issues,
so was randomly disappearing... and required going on-site.

on the same box, i have monit from ports doing various checks and trying
to ensure connectivity and daemons keep working fine. Operations done by
monit are '/etc/netstart axe0' and 'rcctl restart unbound' when some
checks fail:

check network outside-link with interface axe0
  start program "/bin/sh /etc/netstart axe0"
  if failed link for 5 cycles then start

check network outside-ip with address 
  start program "/bin/sh /etc/netstart axe0"
  if failed link for 5 cycles then start

check process unbound matching unbound
  restart program "/usr/sbin/rcctl restart unbound"
  if failed port 53 type udp protocol dns for 2 cycles then restart

this works fine, until at some point things accumulate (eg too much CPE
downtime), and even if the CPE comes back and things should come back to
normal operation mode, i get in a situation where most of the network
operations by daemons fail with 'No buffer space available', matching
the following patterns:

dhcpleased.*: bpf_send_packet: writev: No buffer space available
unbound:.*notice: send failed: No buffer space available
iked.*: ikev2_msg_send: sendtofrom: No buffer space available

from that point, i tried restarting dhcpleased, but the new process kept
getting 'No buffer space available' errors.

The only 'working workaround' i've found so far is to ifconfig axe0
down/up, which resumes normal operation mode immediately, i get my
public ip/dhcp lease, iked reconnects, unbound does its job, etc.

i can add a monit check to do the down/up dance when the aforementioned
patterns show up in /var/log/messages, but i'd be curious to
investigate it. What system metric should be monitored via systat to
check for that 'buffer space'? A leak happening because of the numerous
netstart calls? an issue in the axe driver?
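
As an illustration of the log check mentioned above, here is a minimal sketch of a matcher for the three patterns listed earlier. It is only a sketch under stated assumptions: the function name `line_reports_enobufs()` and the `daemons` table are made up for this example, and it does a plain substring match rather than the regular expressions shown above. The message text itself is the standard description of the `ENOBUFS` errno.

```c
#include <stddef.h>
#include <string.h>

/* Daemons whose ENOBUFS messages were seen in the log. */
static const char *const daemons[] = { "dhcpleased", "unbound", "iked" };

/*
 * Return 1 if a log line looks like one of the
 * "No buffer space available" failures, 0 otherwise.
 */
int
line_reports_enobufs(const char *line)
{
	size_t i;

	if (strstr(line, "No buffer space available") == NULL)
		return 0;
	for (i = 0; i < sizeof(daemons) / sizeof(daemons[0]); i++)
		if (strstr(line, daemons[i]) != NULL)
			return 1;
	return 0;
}
```

A monitoring hook could run each new log line through such a function and trigger the ifconfig down/up workaround on a match. That treats the symptom only; it says nothing about where the buffers are being exhausted.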

hw is:
axe0 at uhub0 port 7 configuration 1 interface 0 "ASIX Electronics AX88772" rev 
2.00/0.01 addr 2
axe0: AX88772, address 00:14:d1:da:77:4f
ukphy0 at axe0 phy 16: Generic IEEE 802.3u media interface, rev. 1: OUI 
0x000ec6, model 0x0006

and sometimes does some 
axe0: usb errors on rx: IOERROR
axe0 detached

but reattaches immediately, and those msgs cant be correlated to issues
with the CPE or the 'No buffer space available' messages.

Hints welcome.

Landry
OpenBSD 7.2 (GENERIC.MP) #4: Mon Dec 12 06:06:42 MST 2022

r...@syspatch-72-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1835331584 (1750MB)
avail mem = 1762369536 (1680MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe (43 entries)
bios0: vendor Apple Inc. version "MM31.88Z.00AD.B00.0907171535" date 07/17/09
bios0: Apple Inc. Macmini3,1
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP HPET APIC APIC MCFG ASF! SBST ECDT SSDT SSDT SSDT
acpi0: wakeup devices EC__(S3) OHC1(S3) EHC1(S3) OHC2(S3) EHC2(S3) GIGE(S5) 
ARPT(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 2500 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU P7550 @ 2.26GHz, 2255.38 MHz, 06-17-0a
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 3MB 64b/line 
12-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 265MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU P7550 @ 2.26GHz, 2255.36 MHz, 06-17-0a
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 3MB 64b/line 
12-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 11, 24 pins, remapped
acpimcfg0 at acpi0
acpimcfg0: addr 0xf000, bus 0-255
acpiec0 at acpi0
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 2 (IXVE)
acpibtn0 at acpi0: PWRB
acpibtn1 at acpi0: SLPB
acpipci0 at acpi0 PCI0: 0x0010 0x0011 0xmemory map conflict 
0xffc0/0x40

asmc0 at acpi0: SMC_ (smc-mcp) addr 0x300/0x20: rev 1.35f535, 154 keys
acpicmos0 at acpi0
acpicpu0 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1 
mwait), PSS
acpicpu1 at acpi0: !C3(100@57 mwait.3@0x31), !C2(500@1 mwait@0x10), C1(1000@1 
mwait),

Re: menulibre crashes after upgrading to 7.2

2022-10-28 Thread Landry Breuil

On Wed, Oct 26, 2022 at 02:30:04PM +, noizeless.v...@tutanota.com wrote:
> >Synopsis:menulibre crashes after upgrading to 7.2
> >Category:desktop, gui
> >Environment:
>   System  : OpenBSD 7.2
>   Details : OpenBSD 7.2 (GENERIC.MP) #758: Tue Sep 27 11:57:54 MDT 
> 2022
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> 
>   Information for inst:menulibre-2.3.0p0
> 
> >Description:
> 
> I upgraded from 7.1 to 7.2 and ran the menulibre program.
> 
> Traceback (most recent call last):
>   File "/usr/local/bin/menulibre", line 44, in <module>
>     import menulibre
>   File "/usr/local/lib/python3.9/site-packages/menulibre/__init__.py", line 23, in <module>
>     from menulibre import MenulibreApplication
>   File "/usr/local/lib/python3.9/site-packages/menulibre/MenulibreApplication.py", line 38, in <module>
>     from .util import escapeText, getCurrentDesktop, find_program, getProcessList
> ImportError: cannot import name 'getProcessList' from 'menulibre.util' (/usr/local/lib/python3.9/site-packages/menulibre/util.py)

thanks for the report, should be fixed with menulibre-2.3.0p1 in
7.2-stable.

Landry

Re: Unable to complete secure online login

2022-09-15 Thread Landry Breuil

On Thu, Sep 15, 2022 at 05:40:47AM -0700, Alton Shaw wrote:
> So I completely removed Firefox, Firefox-ESR, & Chromium and then reinstall
> Firefox.  Started Firefox, without changing any settings went straight to my
> banking site, tried to login and received exactly the same result: "/We’re
> currently having technical issue Please try again later./"
> 
> At this point I don't know what to try next.

Removing browsers is one thing, did you try with a clean profile ?
reinstalling browsers wont do anything if you dont remove old
profiles...

> Also, would surf behave differently than any of the other browsers with
> regards to "natted" and "/29"

surf uses webkitgtk.

iked doesnt cope with dns changes

2022-08-01 Thread Landry Breuil

hi,
not sure it's really a bug or if i should work around it, i have a setup
with multiple iked talking to each other, one of my endpoints is behind
a consumer dsl, so its IP changes from time to time - and one of my
tunnels tries to connect to it, using 'peer my.fqdn'.

when the ip changes, i update my dns to use the new ip for my.fqdn, but
iked still tries to connect to the previous ip, and so far i need to
remember to restart iked so that it picks up the new ip via dns.

I've looked at parse.y and my understanding is that host_dns() is only
called when loading the config, so technically i guess i could try
'ikectl reload' when i detect that the ip changed but it would be much
nicer if iked would gracefully handle that..

Landry
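
One possible workaround outside of iked is to detect the address change externally. The sketch below is an assumption-laden illustration, not part of iked: `resolve_first()` and `peer_addr_changed()` are hypothetical helpers built on the standard `getaddrinfo(3)` and `getnameinfo(3)` calls. It resolves a peer name, takes the first returned address in numeric form, and compares it with the last address that was seen.

```c
#define _POSIX_C_SOURCE 200112L

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

/*
 * Resolve host and write its first address, in numeric form, into buf.
 * Returns 0 on success, -1 on failure.
 */
int
resolve_first(const char *host, char *buf, size_t len)
{
	struct addrinfo hints, *res;
	int rv;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_DGRAM;
	if (getaddrinfo(host, NULL, &hints, &res) != 0)
		return -1;
	rv = getnameinfo(res->ai_addr, res->ai_addrlen, buf, len,
	    NULL, 0, NI_NUMERICHOST);
	freeaddrinfo(res);
	return rv == 0 ? 0 : -1;
}

/*
 * Compare the current first address of host with last_known.
 * Returns 1 if it differs, 0 if it is the same, -1 if lookup failed.
 */
int
peer_addr_changed(const char *host, const char *last_known)
{
	char now[256];

	if (resolve_first(host, now, sizeof(now)) == -1)
		return -1;
	return strcmp(now, last_known) != 0;
}
```

A periodic job could call `peer_addr_changed("my.fqdn", previous)` and restart or reload iked when it returns 1. Only the first address is compared, so a name with several records would need a fuller comparison.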

random high cpu spinning on core2quad

2022-04-20 Thread Landry Breuil

Hi,

i have a work desktop (dmesg below, dell optiplex 960 with an old radeon
RV620, Core2Quad CPU Q9400) that quite often feels like it's 'hardlocking',
with music stuttering to a halt and the mouse reacting very slowly.
sometimes it recovers, sometimes i can switch to a VT and kill either
ffx or thunderbird, but most of the time i dont have much option but
hitting the power button.

That behaviour is random, mostly happens when switching apps (but that's
only a feeling) - on that box i'm mostly using firefox, thunderbird,
some terms, music & qgis (which can be heavy, but is mostly idling when
open).

that's not something new, it's been happening for some months now (yeah
i know sucky regression window), and when it hardlocks, a running systat
via ssh shows:
75% cpu spinning
25% sys
not many interrupts (eg no interrupt storm afaict)
load around 5/7
and plenty of free memory

what systat screen should i watch/keep open to figure out what goes that
wrong ?

Landry
OpenBSD 7.1-current (GENERIC.MP) #476: Tue Apr 19 20:47:53 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8436838400 (8045MB)
avail mem = 8163807232 (7785MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.5 @ 0xf0450 (82 entries)
bios0: vendor Dell Inc. version "A05" date 07/31/2009
bios0: Dell Inc. OptiPlex 960
acpi0 at bios0
acpi0: TCPA checksum error: ACPI 3.0
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC BOOT ASF! MCFG HPET TCPA DMAR SSDT SSDT SSDT 
SSDT SSDT
acpi0: wakeup devices VBTN(S4) PCI0(S5) PCI4(S5) PCI2(S5) PCI3(S5) PCI1(S5) 
PCI5(S5) PCI6(S5) USB0(S3) USB1(S3) USB2(S3) USB3(S3) USB4(S3) USB5(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz, 2660.40 MHz, 06-17-0a
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu0: 3MB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 7 var ranges, 88 fixed ranges
cpu0: apic clock running at 332MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz, 2660.00 MHz, 06-17-0a
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu1: 3MB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz, 2660.02 MHz, 06-17-0a
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu2: 3MB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz, 2660.00 MHz, 06-17-0a
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu3: 3MB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins, remapped
acpimcfg0 at acpi0
acpimcfg0: addr 0xe000, bus 0-255
acpimcfg0: addr 0x97adfefb2028, bus 181-20
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 4 (PCI4)
acpiprt2 at acpi0: bus 2 (PCI2)
acpiprt3 at acpi0: bus 3 (PCI3)
acpiprt4 at acpi0: bus 1 (PCI1)
acpiprt5 at acpi0: bus -1 (PCI5)
acpiprt6 at acpi0: bus -1 (PCI6)
acpibtn0 at acpi0: VBTN
acpipci0 at acpi0 PCI0
acpicmos0 at acpi0
com0 at acpi0 COMA addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
"*pnp0c14" at acpi0 not configured
acpicpu0 at acpi0: C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C1(1000@1 mwait.1), PSS
cpu0: Enhanced SpeedStep 2660 MHz: speeds: 2667, 2333, 2000 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Q45 Host" rev 0x03
ppb0 at pci0 dev 1 function 0 "Intel Q45 PCIE" rev 0x03: msi
pci1 at ppb0 bus 1
radeondrm0 at pci1 dev 0 function 0 "ATI Radeon HD 3450" rev 0x00
drm0 at radeondrm0
radeondrm0: msi
"Intel Q45 HECI" rev 0x03 at pci0 dev 3 function 0 not configured
pciide0 at pci0 dev 3 function 2 "Intel Q45 PT IDER" rev 0x03: DMA 
(unsupported), channel 0 wired to native-PCI, channel 1 wired to native-PCI
pciide0: using apic 8 int

pmap_unwire messages in dmesg since 7.0

2022-01-27 Thread Landry Breuil

Hi,

been seeing regular msgs like this on my home gateway (dmesg attached) since
7.0 update, lenovo box with an external usb disk, nothing fancy.  its mostly
used as a pf gw, mediabox for movies on tv and mpd server.

I havent been able to relate the timing of those messages with actual events on
the server, access to external disk (eg some at night arent related to
backups?)

Pretty sure i didnt have those with 6.9.

the left address only changes across reboots, and i have several of those
messages per boot at seemingly random occasions.

Jan 21 23:38:20 sunset /bsd: pmap_unwire: wiring for pmap 0xfd804e3a86d0 va 0xc000152000 didn't change!
Jan 23 10:00:10 sunset /bsd: pmap_unwire: wiring for pmap 0xfd804e3a86d0 va 0xc000153000 didn't change!
Jan 23 11:12:40 sunset /bsd: pmap_unwire: wiring for pmap 0xfd804e3a86d0 va 0xc000152000 didn't change!
Jan 23 17:53:50 sunset /bsd: pmap_unwire: wiring for pmap 0xfd804e3a86d0 va 0xc000152000 didn't change!
Jan 24 01:46:00 sunset /bsd: pmap_unwire: wiring for pmap 0xfd804e3a86d0 va 0xc000aaa000 didn't change!
Jan 24 10:21:00 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xcfb000 didn't change!
Jan 24 16:23:20 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xc000c62000 didn't change!
Jan 24 17:25:40 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xcfb000 didn't change!
Jan 24 19:04:30 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xcfa000 didn't change!
Jan 25 14:09:10 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xc0007c2000 didn't change!
Jan 25 14:43:20 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xc000dc4000 didn't change!
Jan 26 01:06:30 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xc000ece000 didn't change!
Jan 26 06:49:30 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xc0010b4000 didn't change!
Jan 27 11:47:40 sunset /bsd: pmap_unwire: wiring for pmap 0xfd8219f2c318 va 0xcfa000 didn't change!

the attached dmesg.boot has several reboots kept in memory and shows all those
pmap_unwire messages.

the usb enclosure might be at fault, since i've had several issues where sd2
was being 'stuck', at which point i had

umass0: Invalid CSW: tag 1000 should be 92575172

messages in dmesg, and had to force a reboot to regain access to the usb
disk.

happy to provide more info upon request.

Landry
OpenBSD 7.0 (GENERIC.MP) #1: Fri Oct 29 12:04:07 MDT 2021

r...@syspatch-70-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8400904192 (8011MB)
avail mem = 8130285568 (7753MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0x9b8a9000 (100 entries)
bios0: vendor LENOVO version "M1UKT28A" date 02/18/2019
bios0: LENOVO 10RS002AFR
acpi0 at bios0: ACPI 6.1
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT MCFG SSDT SSDT SLIC MSDM SSDT HPET SSDT 
SSDT UEFI LPIT SSDT SSDT DBGP DBG2 SSDT DMAR NHLT BGRT TPM2 LUFT ASF! WSMT
acpi0: wakeup devices SIO1(S3) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) 
PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) RP06(S4) PXSX(S4) RP07(S4) 
PXSX(S4) RP08(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-8500T CPU @ 2.10GHz, 1995.35 MHz, 06-9e-0a
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i5-8500T CPU @ 2.10GHz, 1995.36 MHz, 06-9e-0a
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at

Re: OpenBSD/amd64: xfce screensaver/lock is not interruptible, only mouse pointer visible on blank screen, no password prompt

2021-11-16 Thread Landry Breuil

On Tue, Nov 16, 2021 at 09:44:38PM +0100, Peter Nicolai Mathias Hansteen wrote:
> 
> 
> > On 16 Nov 2021, at 19:44, Aaron Bieber wrote:
> > 
> > 
> > Matthieu Herrb <matth...@openbsd.org> writes:
> > 
> >> So to summarize: you should rebuild and install the patched version to
> >> test.
> >> 
> >> cd /usr/ports/x11/xfce4/xfce4-screensaver
> >> patch -p0 -E < /this/patch
> >> doas make clean=all
> >> doas make package FETCH_PACKAGES=
> >> doas make install
> >> 
> >> I'm adding Landry's patch below for reference :
> > 
> > This resolves the issue for me!
> 
> 
> Yes, this solved it for me too.
> 
> I would love to see this go in.

thanks all for your patience, I was offline but it has now been committed;
xfce4-screensaver-4.16.0p0 should have the fix.

Landry

Re: OpenBSD/amd64: xfce screensaver/lock is not interruptible, only mouse pointer visible on blank screen, no password prompt

2021-11-01 Thread Landry Breuil

On Mon, Nov 01, 2021 at 12:15:01PM +0100, Matthieu Herrb wrote:
> On Mon, Nov 01, 2021 at 12:00:30PM +0100, Landry Breuil wrote:
> > On Sun, Oct 31, 2021 at 10:47:36PM +0100, Landry Breuil wrote:
> > 
> > > > > > > ** (xfce4-screensaver-dialog:72106): ERROR **: 21:36:25.353: Failed to
> > > > > > >    connect to xfconf daemon: Cannot spawn a message bus when setuid.
> > > > > > > 
> > > > > > > I don't know much about xfconf / dbus / setuid applications
> > > > > > > interactions, but this doesn't look like something related to changes
> > > > > > > in base.
> > > > > > 
> > > > > > Well... iirc, nothing changed between xfconf and xfce4-screensaver since
> > > > > > months ... ? changes in credentials passing over sockets ?
> > > > > 
> > > > > The error messages comes from libgio-2.0.so.4200.14 part of glib2.
> > > > 
> > > > https://gitlab.xfce.org/apps/xfce4-screensaver/-/issues/96
> > > 
> > > well, good catch. i'll come up with something adapted from
> > > https://gitlab.alpinelinux.org/alpine/aports/-/commit/ee7f451b3a1b1bdcf1de4303369a0b8a152f4d73
> > > for bsdauth. I guess that's a regression from glib 2.70 update then, and
> > > mate-screensaver might be affected by the same issue as they share the
> > > same ancestor.
> > 
> > That still strange because xfce4-screensaver-dialog has code for
> > bsdauth, but if i try setting the binary setgid auth instead of setuid
> > root, and remove the setgroups() call, glib will still complain the
> > same, even if not setuid anymore..
> 
> But it's setgid, and while the error message only refers to setuid,
> the glib commit  makes it clear it's any kind of elevated privileges that
> make it refuse to connect.

I've looked a bit and I haven't found the glib commit/MR that changed this
in 2.70... I've only found
https://gitlab.gnome.org/GNOME/glib/-/issues/2316 which doesn't talk
about gid.

> > Havent looked at mate-screensaver, but the below diff adapted from above
> > seems to work in my limited testing (eg xfce4-screensaver --debug, and
> > xflock4 in another term).
> 
> The problem I see with this approach is that it provides a tool that
> makes it possible to do brute-force password checking.
> 
> I think that a solution where main screensaver process keeps the setgid
> auth bit, forks a privileged child to do the password check and
> revokes its setgid privilege is better. But I'd like to hear other
> people on this (millert@, kn@,...)

Well, i'm not going to be the one writing this code :)
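
For illustration only, the layout suggested above (fork a helper while the setgid bit is still in effect, then drop it in the parent) could look roughly like the sketch below. Everything named here is hypothetical: `pw_helper_start()`, `pw_helper_check()`, `pw_helper_stop()` and `demo_check()` are made-up names, and `demo_check()` is only a stand-in for a real bsd_auth(3) call such as auth_userokay(3). It does no rate limiting, so it does not by itself address the brute-force concern.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical checker type: returns nonzero if the password is accepted. */
typedef int (*pw_check_fn)(const char *user, const char *pass);

struct pw_helper {
	pid_t pid;
	int to_child;	/* parent writes one password per request */
	int from_child;	/* parent reads a one-byte verdict */
};

/* Placeholder checker; a real port would call bsd_auth(3) here. */
int
demo_check(const char *user, const char *pass)
{
	(void)user;
	return strcmp(pass, "hunter2") == 0;
}

/* Fork the helper first, then drop the elevated gid in the parent. */
int
pw_helper_start(struct pw_helper *h, const char *user, pw_check_fn check)
{
	int req[2], rep[2];
	gid_t gid;

	if (pipe(req) == -1)
		return -1;
	if (pipe(rep) == -1) {
		close(req[0]);
		close(req[1]);
		return -1;
	}
	h->pid = fork();
	if (h->pid == -1)
		return -1;
	if (h->pid == 0) {
		/* Child: keeps whatever privileges it inherited. */
		char pass[256], ok;
		ssize_t n;

		close(req[1]);
		close(rep[0]);
		while ((n = read(req[0], pass, sizeof(pass) - 1)) > 0) {
			pass[n] = '\0';
			ok = check(user, pass) ? 1 : 0;
			memset(pass, 0, sizeof(pass));
			if (write(rep[1], &ok, 1) != 1)
				break;
		}
		_exit(0);
	}
	/* Parent: talks to the helper over the pipes only. */
	close(req[0]);
	close(rep[1]);
	h->to_child = req[1];
	h->from_child = rep[0];
	gid = getgid();
	return setregid(gid, gid);	/* give up the setgid group */
}

/* Returns 1 if accepted, 0 if rejected, -1 on error. */
int
pw_helper_check(struct pw_helper *h, const char *pass)
{
	size_t len = strlen(pass);
	char ok;

	if (len == 0 || len > 255)
		return -1;
	if (write(h->to_child, pass, len) != (ssize_t)len)
		return -1;
	if (read(h->from_child, &ok, 1) != 1)
		return -1;
	return ok;
}

void
pw_helper_stop(struct pw_helper *h)
{
	close(h->to_child);
	close(h->from_child);
	waitpid(h->pid, NULL, 0);
}
```

The parent waits for each verdict before sending the next password, so one read in the child corresponds to one request. Whether glib would then treat the parent as unprivileged is exactly the open question raised in the thread; this sketch does not answer it.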

> But whether glib will properly recognise that the process doesn't have
> privileges anymore is an open question before someone has looked at
> the code or tried it.

from looking at glib, it uses g_check_setuid:
https://gitlab.gnome.org/GNOME/glib/-/blob/main/gio/gdbusaddress.c#L1097
which is implemented here:
https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gutils.c#L3013
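
As a rough illustration of why a setgid binary trips the same check as a setuid one, here is a minimal sketch of an "elevated privileges" test that compares real and effective ids. This is not glib's code, and the function names are made up for the example; the real g_check_setuid may use other mechanisms depending on the platform.

```c
#include <sys/types.h>
#include <unistd.h>

/* Nonzero if either the uid pair or the gid pair differs. */
int
ids_are_elevated(uid_t ruid, uid_t euid, gid_t rgid, gid_t egid)
{
	return ruid != euid || rgid != egid;
}

/* Apply the same comparison to the calling process. */
int
process_is_elevated(void)
{
	return ids_are_elevated(getuid(), geteuid(), getgid(), getegid());
}
```

With a check of this shape, switching the dialog from setuid root to setgid auth only moves the mismatch from the uid pair to the gid pair, which would match the behaviour seen above.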

Landry

Re: OpenBSD/amd64: xfce screensaver/lock is not interruptible, only mouse pointer visible on blank screen, no password prompt

2021-11-01 Thread Landry Breuil

On Sun, Oct 31, 2021 at 10:47:36PM +0100, Landry Breuil wrote:

> > > > > ** (xfce4-screensaver-dialog:72106): ERROR **: 21:36:25.353: Failed to
> > > > >connect to xfconf daemon: Cannot spawn a message bus when setuid.
> > > > > 
> > > > > I don't know much about xfconf / dbus / setuid applications
> > > > > interactions, but this doesn't look like something related to changes
> > > > > in base.
> > > > 
> > > > Well... iirc, nothing changed between xfconf and xfce4-screensaver since
> > > > months ... ? changes in credentials passing over sockets ?
> > > 
> > > The error messages comes from libgio-2.0.so.4200.14 part of glib2.
> > 
> > https://gitlab.xfce.org/apps/xfce4-screensaver/-/issues/96
> 
> well, good catch. i'll come up with something adapted from
> https://gitlab.alpinelinux.org/alpine/aports/-/commit/ee7f451b3a1b1bdcf1de4303369a0b8a152f4d73
> for bsdauth. I guess that's a regression from glib 2.70 update then, and
> mate-screensaver might be affected by the same issue as they share the
> same ancestor.

That's still strange, because xfce4-screensaver-dialog has code for
bsdauth, but if I try setting the binary setgid auth instead of setuid
root, and remove the setgroups() call, glib still complains the same
way, even though it's not setuid anymore.

Haven't looked at mate-screensaver, but the diff below, adapted from the
above, seems to work in my limited testing (e.g. xfce4-screensaver --debug,
and xflock4 in another term).

[error_watch] gs-window-x11.c:893 (11:53:14.465):Command output: 
[request_response] xfce4-screensaver-dialog.c:148 (11:53:14.465):Got response: -2
[error_watch] gs-window-x11.c:893 (11:53:14.643):Command output: 
[do_auth_check] xfce4-screensaver-dialog.c:305 (11:53:14.642):  Verify user returned: TRUE
[dialog_process_watch] gs-window-x11.c:1405 (11:53:14.648):  Command output: RESPONSE=OK
[dialog_process_watch] gs-window-x11.c:1419 (11:53:14.648):  Got OK response

entering the wrong password properly dismisses the attempt too :)

feedback welcome..

Landry
? patchesno
? xfce4-screensaver-askpass.diff
Index: Makefile
===
RCS file: /cvs/ports/x11/xfce4/xfce4-screensaver/Makefile,v
retrieving revision 1.11
diff -u -r1.11 Makefile
--- Makefile    3 Jan 2021 17:34:23 -   1.11
+++ Makefile    1 Nov 2021 10:53:53 -
@@ -3,6 +3,7 @@
 COMMENT =  Xfce4 screensaver
 
 XFCE_GOODIE =  xfce4-screensaver
+REVISION = 0
 
 # GPLv2
 PERMIT_PACKAGE =   Yes
@@ -32,7 +33,13 @@
 
 FAKE_FLAGS =   menudir=${PREFIX}/share/examples/xfce4-screensaver/xdg/menus
 
+CONFIGURE_ARGS +=  --with-passwd-helper=${LOCALBASE}/libexec/xfce4-screensaver-ask-pass
+
+post-build:
+   ${CC} ${CFLAGS} ${FILESDIR}/ask-pass.c -o ${WRKBUILD}/ask-pass
+
 post-install:
+   ${INSTALL_PROGRAM} ${WRKBUILD}/ask-pass ${PREFIX}/libexec/xfce4-screensaver-ask-pass
    @mv ${WRKINST}/etc/xdg/autostart \
        ${PREFIX}/share/examples/xfce4-screensaver/xdg/autostart
    rm -Rf ${WRKINST}/etc/xdg
Index: files/ask-pass.c
===
RCS file: files/ask-pass.c
diff -N files/ask-pass.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ files/ask-pass.c    1 Nov 2021 10:53:53 -
@@ -0,0 +1,84 @@
+/* $OpenBSD$
+ * verifying typed passwords with bsd_auth(3)
+ *
+ * Copyright (c) 2009 Antoine Jacoutot 
+ * Copyright (c) 2021 Landry Breuil 
+ * Copyright (c) 2021 Natanael Copa 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+static void sighandler(int sig)
+{
+   if (sig > 0)
+   errx(sig, "caught signal %d", sig);
+}
+
+static void setup_signals(void)
+{
+   struct sigaction action;
+
+   memset((void *) &action, 0, sizeof(action));
+   action.sa_handler = sighandler;
+   action.sa_flags = SA_RESETHAND;
+   sigaction(SIGILL, &action, NULL);
+   sigaction(SIGTRAP, &action, NULL);
+   sigaction(SIGBUS, &action, NULL);
+

Re: OpenBSD/amd64: xfce screensaver/lock is not interruptible, only mouse pointer visible on blank screen, no password prompt

2021-10-31 Thread Landry Breuil

On Sun, Oct 31, 2021 at 10:10:55PM +0100, Antoine Jacoutot wrote:
> On Sun, Oct 31, 2021 at 10:05:15PM +0100, Matthieu Herrb wrote:
> > On Sun, Oct 31, 2021 at 09:48:53PM +0100, Landry Breuil wrote:
> > > On Sun, Oct 31, 2021 at 09:39:35PM +0100, Matthieu Herrb wrote:
> > > > On Fri, Oct 29, 2021 at 10:33:04AM +0200, Peter N. M. Hansteen wrote:
> > > > > On Fri, Oct 29, 2021 at 10:05:34AM +0200, Landry Breuil wrote:
> > > > >  
> > > > > > i can confirm i've seen this when upgrading my machines, and most of
> > > > > > them were previously running various snapshots from beginning of
> > > > > > september/end of september, so likely a regression from "something" in X
> > > > > > or drm ?
> > > > > 
> > > > > I think we can narrow it down to some time this week. If I remember correctly,
> > > > > I did not see this after upgrading to whatever was the latest snap late
> > > > > last Sunday afternoon CEST, but I noticed this first yesterday morning
> > > > > after my late Wednesday evening CEST upgrade.
> > > > > 
> > > > > The problem persists after upgrading to this morning's snap, ie
> > > > > 
> > > > > index.txt   29-Oct-2021 02:19    1690
> > > > > 
> > > > 
> > > > Ktracing the screensaver process shows it execs
> > > > /usr/local/libexec/xfce4-screensaver-dialog
> > > > which fails with the following error:
> > > > 
> > > > ** (xfce4-screensaver-dialog:72106): ERROR **: 21:36:25.353: Failed to
> > > >connect to xfconf daemon: Cannot spawn a message bus when setuid.
> > > > 
> > > > I don't know much about xfconf / dbus / setuid applications
> > > > interactions, but this doesn't look like something related to changes
> > > > in base.
> > > 
> > > Well... iirc, nothing changed between xfconf and xfce4-screensaver since
> > > months ... ? changes in credentials passing over sockets ?
> > 
> > The error messages comes from libgio-2.0.so.4200.14 part of glib2.
> 
> https://gitlab.xfce.org/apps/xfce4-screensaver/-/issues/96

well, good catch. i'll come up with something adapted from
https://gitlab.alpinelinux.org/alpine/aports/-/commit/ee7f451b3a1b1bdcf1de4303369a0b8a152f4d73
for bsdauth. I guess that's a regression from glib 2.70 update then, and
mate-screensaver might be affected by the same issue as they share the
same ancestor.

Landry

Re: OpenBSD/amd64: xfce screensaver/lock is not interruptible, only mouse pointer visible on blank screen, no password prompt

2021-10-31 Thread Landry Breuil

On Sun, Oct 31, 2021 at 09:39:35PM +0100, Matthieu Herrb wrote:
> On Fri, Oct 29, 2021 at 10:33:04AM +0200, Peter N. M. Hansteen wrote:
> > On Fri, Oct 29, 2021 at 10:05:34AM +0200, Landry Breuil wrote:
> >  
> > > i can confirm i've seen this when upgrading my machines, and most of
> > > them were previously running various snapshots from beginning of
> > > september/end of september, so likely a regression from "something" in X
> > > or drm ?
> > 
> > I think we can narrow it down to some time this week. If I remember correctly,
> > I did not see this after upgrading to whatever was the latest snap late
> > last Sunday afternoon CEST, but I noticed this first yesterday morning
> > after my late Wednesday evening CEST upgrade.
> > 
> > The problem persists after upgrading to this morning's snap, ie
> > 
> > index.txt   29-Oct-2021 02:19    1690
> > 
> 
> Ktracing the screensaver process shows it execs
> /usr/local/libexec/xfce4-screensaver-dialog
> which fails with the following error:
> 
> ** (xfce4-screensaver-dialog:72106): ERROR **: 21:36:25.353: Failed to
>connect to xfconf daemon: Cannot spawn a message bus when setuid.
> 
> I don't know much about xfconf / dbus / setuid applications
> interactions, but this doesn't look like something related to changes
> in base.

Well... iirc, nothing has changed in xfconf or xfce4-screensaver in
months ... ? changes in credentials passing over sockets ?

Landry

Re: OpenBSD/amd64: xfce screensaver/lock is not interruptible, only mouse pointer visible on blank screen, no password prompt

2021-10-29 Thread Landry Breuil

On Fri, Oct 29, 2021 at 09:16:31AM +0200, Peter N. M. Hansteen wrote:
> SENDBUG: -*- sendbug -*-
> SENDBUG: Lines starting with `SENDBUG' will be removed automatically.
> SENDBUG:
> SENDBUG: Choose from the following categories:
> SENDBUG:
> SENDBUG: system user library documentation kernel alpha amd64 arm hppa i386
> m88k mips64 powerpc sh sparc sparc64 vax
> SENDBUG:
> SENDBUG:
> To: bugs@openbsd.org
> Subject: OpenBSD/amd64: xfce screensaver/lock is not interruptible, only
> mouse pointer visible on blank screen, no password prompt
> From: pe...@bsdly.net
> Cc: pe...@bsdly.net
> Reply-To: pe...@bsdly.net
> 
> >Synopsis:OpenBSD/amd64 -
> >Category:xorg xenodm xfce
> >Environment:
>   System  : OpenBSD 7.0
>   Details : OpenBSD 7.0-current (GENERIC.MP) #55: Thu Oct 28 17:44:49 MDT 2021
>             dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD/amd64: xfce screensaver/lock is not interruptible,
> only mouse pointer visible on blank screen, no password prompt
>   Machine : amd64
> >Description:
>   I only noticed yesterday after leaving the machine alone for long enough
>   for the screensaver to kick in. The setup is the default with a blank
>   screen. However, wiggling the mouse (touchpad) or pressing a key (Ctrl or
>   whatever) only produces a very short lived mouse pointer, with no prompt
>   for password to unlock. Blindly typing my password after touching Ctrl or
>   whatever does not unlock either.
> >How-To-Repeat:
>   Install amd64 snapshot post 2021-10-27 (I think), pkg_add -u to latest xfce
>   packages with xfce running from xenodm
> >Fix:
>   The only (slightly inelegant) workaround I have found is logging on to the
>   system via ssh and restarting xenodm (doas rcctl restart xenodm)

i can confirm i've seen this when upgrading my machines, and most of
them were previously running various snapshots from beginning of
september/end of september, so likely a regression from "something" in X
or drm ?

Landry

fdisk during install on -current/octeon fails, works on 6.9

2021-10-03 Thread Landry Breuil

hi,

trying to install -current on an edgerouter lite, fdisk fails when accessing
the usb disk, dunno if it's a dwctwo or an fdisk regression:

Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2021 OpenBSD. All rights reserved.  https://www.OpenBSD.org
 
OpenBSD 7.0 (RAMDISK) #750: Thu Sep 30 21:33:10 MDT 2021 
dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/RAMDISK
real mem = 536870912 (512MB) 
avail mem = 520880128 (496MB)
random: boothowto does not indicate good seed
mainbus0 at root: board 20002 rev 2.18, model CN3xxx/CN5xxx
cpu0 at mainbus0: CN50xx CPU rev 0.1 500 MHz, Software FP emulation
cpu0: cache L1-I 32KB 4 way D 16KB 64 way, L2 128KB 8 way
clock0 at mainbus0: int 5
iobus0 at mainbus0 
simplebus0 at iobus0: "soc"
octciu0 at simplebus0
octsmi0 at simplebus0
octpip0 at simplebus0
octgmx0 at octpip0 interface 0 
cnmac0 at octgmx0: port 0 RGMII, address 80:2a:a8:f1:19:06 
atphy0 at cnmac0 phy 7: AR8035 10/100/1000 PHY, rev. 2 
cnmac1 at octgmx0: port 1 RGMII, address 80:2a:a8:f1:19:07 
atphy1 at cnmac1 phy 6: AR8035 10/100/1000 PHY, rev. 2 
cnmac2 at octgmx0: port 2 RGMII, address 80:2a:a8:f1:19:08 
atphy2 at cnmac2 phy 5: AR8035 10/100/1000 PHY, rev. 2 
com0 at simplebus0: ns16550a, 64 byte fifo 
com0: console
dwctwo0 at iobus0 base 0x118006800 irq 56
usb0 at dwctwo0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "Octeon DWC2 root hub" rev 2.00/1.00 
addr 1 
umass0 at uhub0 port 1 configuration 1 interface 0 "JMicron USB to ATA/ATAPI 
bridge" rev 2.00/1.00 addr 2 
umass0: using SCSI over Bulk-Only 
scsibus0 at umass0: 2 targets, initiator 0
sd0 at scsibus0 targ 1 lun 0:  
serial.152d232950026B76829C
sd0: 114473MB, 512 bytes/sector, 234441648 sectors
root on rd0a swap on rd0b dump on rd0b
...

Available disks are: sd0. 
Which disk is the root disk? ('?' for details) [sd0]
MBR has invalid signature; not showing it.
Use (W)hole disk or (E)dit the MBR? [whole] W 
Creating a FAT partition and an OpenBSD partition for rest of sd0...fdisk: 
DIOCGPDINFO: Input/output error
done. 
disklabel: DIOCGPDINFO: Input/output error
newfs_msdos: /dev/rsd0i: Input/output error 
The auto-allocated layout for sd0 is:
disklabel: DIOCGPDINFO: Input/output error
Use (A)uto layout, (E)dit auto layout, or create (C)ustom layout? [a] ! 
Type 'exit' to return to install. 
erl# dmesg|grep sd
dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/RAMDISK 
sd0 at scsibus0 targ 1 lun 0:  
serial.152d232950026B76829C
sd0: 114473MB, 512 bytes/sector, 234441648 sectors 
erl# fdisk sd0
fdisk: DIOCGPDINFO: Input/output error 
erl# disklabel sd0
disklabel: DIOCGDINFO: Input/output error

with a miniroot69.img from 6.9, it seems better:

OpenBSD 6.9 (RAMDISK) #606: Sun Apr 18 03:34:14 MDT 2021 
dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/RAMDISK
real mem = 536870912 (512MB) 
avail mem = 520863744 (496MB)
random: boothowto does not indicate good seed
mainbus0 at root: board 20002 rev 2.18, model CN3xxx/CN5xxx
cpu0 at mainbus0: CN50xx CPU rev 0.1 500 MHz, Software FP emulation
cpu0: cache L1-I 32KB 4 way D 16KB 64 way, L2 128KB 8 way
clock0 at mainbus0: int 5 
iobus0 at mainbus0
simplebus0 at iobus0: "soc" 
octciu0 at simplebus0 
octsmi0 at simplebus0 
octpip0 at simplebus0 
octgmx0 at octpip0 interface 0
cnmac0 at octgmx0: port 0 RGMII, address 80:2a:a8:f1:19:06
atphy0 at cnmac0 phy 7: AR8035 10/100/1000 PHY, rev. 2
cnmac1 at octgmx0: port 1 RGMII, address 80:2a:a8:f1:19:07
atphy1 at cnmac1 phy 6: AR8035 10/100/1000 PHY, rev. 2
cnmac2 at octgmx0: port 2 RGMII, address 80:2a:a8:f1:19:08
atphy2 at cnmac2 phy 5: AR8035 10/100/1000 PHY, rev. 2 
com0 at simplebus0: ns16550a, 64 byte fifo
com0: console 
dwctwo0 at iobus0 base 0x118006800 irq 56
usb0 at dwctwo0: USB revision 2.0 
uhub0 at usb0 configuration 1 interface 0 "Octeon DWC2 root hub" rev 2.00/1.00 
addr 1
umass0 at uhub0 port 1 configuration 1 interface 0 "JMicron USB to ATA/ATAPI 
bridge" rev 2.00/1.00 addr 2
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, initiator 0 
sd0 at scsibus0 targ 1 lun 0:  
serial.152d232950026B76829C
sd0: 114473MB, 512 bytes/sector, 234441648 sectors
root on rd0a swap on rd0b dump on rd0b 
...
Available disks are: sd0.
Which disk is the root disk? ('?' for details) [sd0] 
Disk: sd0       geometry: 14593/255/63 [234441648 Sectors]
Offset: 0       Signature: 0xAA55
            Starting         Ending         LBA Info:
 #: id      C   H   S -      C   H   S [       start:        size ]
-------------------------------------------------------------------------------
*0: 0C      0   1   2 -      1 103  38 [          64:       22528 ] FAT32L
 1: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
 2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
 3: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
Use (W)hole disk or (E)dit the MBR? [whole] W
Creating a FAT partition and an OpenBSD partition for rest of sd0...done.
/dev/rsd0i: 65372 sectors in 16343 FAT16 clusters (2048 bytes/cluster)
bps=512 spc=4 res=1 nft=2 rde=512 mid=0xf8 spf=64 spt=63 hds=255 hid=64 
bsec=65536

(and i

revert to pre-fragattack iwm firmware to fix hw rev 0x210 / AC 7265 on X1 gen3

2021-10-03 Thread Landry Breuil

Hi,

I've been having iwm(4) instability issues on an X1 Gen3 since July
(symptom/reproducer: unable to fetch base70.tgz, it stalls after 100Mb and the
link goes down, ifconfig up/down/reassoc 'resolves' it; pkg_add -u also fails
after a while, erratic ping, etc.). I tried forcing 2GHz/5GHz modes but that
doesn't help on this laptop (forcing 5GHz 'solves' a similar issue on a T470s
but I don't remember the hw rev/model right now).

iwm0 at pci2 dev 0 function 0 "Intel AC 7265" rev 0x59, msi
iwm0: hw rev 0x210, fw ver 17.3216344376.0

I bisected kernels and found out that
6.9-current (GENERIC.MP) #120: Thu Jul  8 23:45:06 MDT 2021
was okay and
6.9-current (GENERIC.MP) #122: Fri Jul  9 16:29:05 MDT 2021
wasn't, which narrowed the regression down to the switch to the newer firmware
version 29 (e.g.
https://github.com/openbsd/src/commit/6a5f473d46d947cc596b5dbde0c8b074f6a98723);
on July 20 there was also
https://github.com/openbsd/src/commit/b57370d64c440d7c4d5594a677fb066e3786fbfc.

so the iwm-7265D-29 firmware doesn't work on this revision of hardware on
this laptop (mpi@ has been seeing the same issues on a similar X1 gen3)
but forcing the use of the older firmware 'solves' it. As stsp@ told me,
the same hardware revision works in another laptop with the newer
firmware...

anyway, the diff below from stsp@ gives me a working iwm(4) on -current:

Index: dev/pci/if_iwm.c
===
RCS file: /cvs/src/sys/dev/pci/if_iwm.c,v
retrieving revision 1.370
diff -u -r1.370 if_iwm.c
--- dev/pci/if_iwm.c    2 Oct 2021 07:47:54 -   1.370
+++ dev/pci/if_iwm.c    3 Oct 2021 07:20:32 -
@@ -11085,11 +11085,7 @@
                break;
        case PCI_PRODUCT_INTEL_WL_7265_1:
        case PCI_PRODUCT_INTEL_WL_7265_2:
-               if ((sc->sc_hw_rev & IWM_CSR_HW_REV_TYPE_MSK) ==
-                   IWM_CSR_HW_REV_TYPE_7265D)
-                       sc->sc_fwname = "iwm-7265D-29";
-               else
-                       sc->sc_fwname = "iwm-7265-17";
+               sc->sc_fwname = "iwm-7265-17";
                sc->host_interrupt_operation_mode = 0;
                sc->sc_device_family = IWM_DEVICE_FAMILY_7000;
                sc->sc_fwdmasegsz = IWM_FWDMASEGSZ;
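
To make the effect of the diff above concrete, here is a small standalone sketch of the firmware choice before and after the change. The two numeric constants are assumptions recalled from the iwm register definitions and should be checked against if_iwmreg.h; `fwname_before()` and `fwname_after()` are hypothetical names used only for this example.

```c
#include <stdint.h>

/* Assumed values; verify against the driver's register header. */
#define HW_REV_TYPE_MSK		0x000fff0
#define HW_REV_TYPE_7265D	0x0000210

/* Selection logic as it was before the diff. */
const char *
fwname_before(uint32_t hw_rev)
{
	if ((hw_rev & HW_REV_TYPE_MSK) == HW_REV_TYPE_7265D)
		return "iwm-7265D-29";
	return "iwm-7265-17";
}

/* Selection logic with the diff applied: always the older firmware. */
const char *
fwname_after(uint32_t hw_rev)
{
	(void)hw_rev;
	return "iwm-7265-17";
}
```

Under those assumed values, the "hw rev 0x210" reported in the dmesg line above matches the 7265D type, which is why this laptop was getting the version 29 firmware before the diff and falls back to version 17 with it.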

Re: Black screen after drm upgrade to Linux 5.10.47

2021-07-08 Thread Landry Breuil

On Thu, Jul 08, 2021 at 04:53:34PM +1000, Jonathan Gray wrote:
> On Wed, Jul 07, 2021 at 11:02:27AM -0400, Josh Rickmar wrote:
> > With latest amd64 snapshot on my Thinkpad E485 I'm only seeing a
> > black screen when using X.  xenodm is still running, and with some
> > finger memory I can still log in and run various X applications, but
> > none of it is visible.
> 
> Can you try the following diff to revert to the 5.7 drm_mm ?
> I have not been able to reproduce the problem on t495/picasso
> with this change.

X starts on t495s with this diff, dmesg below. Thanks jsg!

OpenBSD 6.9-current (GENERIC.MP) #0: Thu Jul  8 09:35:15 CEST 2021
lan...@dawn.home.rhaalovely.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 14888513536 (14198MB)
avail mem = 14421245952 (13753MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.1 @ 0xb9ecc000 (63 entries)
bios0: vendor LENOVO version "R13ET48W(1.22 )" date 08/14/2020
bios0: LENOVO 20QJCTO1WW
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT SSDT SSDT TPM2 SSDT MSDM SLIC BATB HPET APIC MCFG 
SBST WSMT IVRS SSDT CRAT CDIT FPDT SSDT SSDT SSDT UEFI
acpi0: wakeup devices GPP0(S3) GPP1(S3) GPP2(S3) GPP3(S4) GPP4(S3) L850(S3) 
GPP5(S3) GPP6(S3) GP17(S3) XHC0(S3) XHC1(S3) GP18(S3) LID_(S3) SLPB(S3)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpihpet0 at acpi0: 14318180 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx, 2295.98 MHz, 17-18-01
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 
8-way L2 cache
cpu0: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu0: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx, 2295.68 MHz, 17-18-01
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu1: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 
8-way L2 cache
cpu1: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu1: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu1: smt 1, core 0, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx, 2295.68 MHz, 17-18-01
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu2: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 
8-way L2 cache
cpu2: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu2: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx, 2295.68 MHz, 17-18-01
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu3: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 
8-way L2 cache
cpu3: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu3: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu3: smt 1, core 1,

Re: firefox vs jitsi: stack exhaustion?

2021-04-20 Thread Landry Breuil

On Tue, Apr 20, 2021 at 09:06:27AM +0200, Jeremie Courreges-Anglas wrote:
> On Fri, Apr 09 2021, Mark Kettenis  wrote:
> >> Date: Fri, 9 Apr 2021 07:09:06 +1000
> >> From: Jonathan Matthew 
> >> 
> >> On Thu, Apr 08, 2021 at 10:24:06AM +0200, Martin Pieuchot wrote:
> >> > firefox often crash when somebody else connects to the jitsi I'm in.
> >> > The trace looks like a stack exhaustion, see below. 
> >> > 
> >> > Does this ring a bell?
> >> > 
> >> > #530 
> >> > #531 pthread_setschedparam (thread=0x0, policy=1, param=0x2ade5ca1df0)
> >> > at /usr/src/lib/librthread/rthread_sched.c:56
> >> > #532 0x02ae65f9f016 in 
> >> > rtc::PlatformThread::SetPriority(rtc::ThreadPriority) () from 
> >> > /usr/local/lib/firefox/libxul.so.101.0
> >> > #533 0x02ae65f9ed75 in rtc::PlatformThread::Run() ()
> >> >from /usr/local/lib/firefox/libxul.so.101.0
> >> > #534 0x02ae65f9ead9 in rtc::PlatformThread::StartThread(void*) ()
> >> >from /usr/local/lib/firefox/libxul.so.101.0
> >> > #535 0x02ade9d4df51 in _rthread_start (v=)
> >> > at /usr/src/lib/librthread/rthread.c:96
> >> > #536 0x02ad9c2ec3da in __tfork_thread ()
> >> > at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84
> >> 
> >> This looks like, at least with our pthreads implementation, libwebrtc has a
> >> race between the newly created thread and the thread creating it.
> >> 
> >> The creating thread stores the thread handle in thread_ here:
> >> https://github.com/mozilla/gecko-dev/blob/master/third_party/libwebrtc/webrtc/rtc_base/platform_thread.cc#L186
> >> 
> >> and the new thread uses it here:
> >> https://github.com/mozilla/gecko-dev/blob/master/third_party/libwebrtc/webrtc/rtc_base/platform_thread.cc#L363
> >> which is called as almost the first thing the new thread does.
> >> 
> >> Our pthread_create() only stores the pthread handle into the supplied
> >> address after __tfork_thread() returns, by which time the new thread
> >> could already be running.
> >
> > Right.  POSIX isn't very explicit about this, but it does state that
> > the thread ID is stored upon successful completion.  So I don't think
> > portable code can rely on that until pthread_create() has returned.
> >
> > Using pthread_self() in the pthread_setschedparam() call will probably
> > help.
> 
> It does indeed, firefox diff below.
> 
> > But if PlatformThread::GetThreadRef() gets called early on in
> > the thread, there will still be a race.  Setting thread_ at the start of
> > PlatformThread::Run() would be better.
> 
> This brings questions, like how to do the same thing on Windows and
> non-Windows.  Lots of TODO entries in that code.  As far as I'm
> concerned I'll leave significant rethinking to upstream.

Fwiw, i'm not qualified to judge anything in this area, but i've filed
https://bugzilla.mozilla.org/show_bug.cgi?id=1706261 upstream at mozilla
if people want to discuss the specifics (or a better fix/refactoring).

note that mozilla is a libwebrtc downstream, the 'real' upstream is at
https://webrtc.googlesource.com/src/ (and i suppose somehow shared with
chrome) but i'm not going there, my last experience with trying to
discuss things with google was a nightmare.

Landry
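
For readers following the race described above, here is a minimal sketch in
plain C of the pthread_self() approach. It is not the libwebrtc code nor the
actual firefox diff: struct worker, worker_main() and worker_spawn() are
hypothetical names made up for illustration. The only point it tries to show
is that the new thread asks for its own ID instead of reading the handle the
creating thread gets back from pthread_create().

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the thread object: `handle` plays the role of
 * the thread_ member that the creating thread fills in.
 */
struct worker {
	pthread_t handle;	/* written by pthread_create() in the creator */
	int sched_rc;		/* result of the scheduling calls in the thread */
};

static void *
worker_main(void *arg)
{
	struct worker *w = arg;
	struct sched_param param;
	int policy;

	/*
	 * Do not read w->handle here: pthread_create() may not have stored
	 * it yet. pthread_self() is always valid in the running thread.
	 */
	w->sched_rc = pthread_getschedparam(pthread_self(), &policy, &param);
	if (w->sched_rc == 0)
		w->sched_rc = pthread_setschedparam(pthread_self(), policy,
		    &param);
	return NULL;
}

/* Returns 0 if the thread could query and re-apply its own scheduling. */
int
worker_spawn(void)
{
	struct worker w;

	w.sched_rc = -1;
	if (pthread_create(&w.handle, NULL, worker_main, &w) != 0)
		return -1;
	/* w.handle is only guaranteed to be set from this point on. */
	if (pthread_join(w.handle, NULL) != 0)
		return -1;
	return w.sched_rc;
}
```

The sketch re-applies the current policy and parameters rather than raising
the priority, so it should not need any privilege. As noted in the thread,
this only covers the scheduling call; any other early reader of the stored
handle would still race.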

Re: Firefox fails to create profile's permanent storage

2021-03-11 Thread Landry Breuil

On Thu, Mar 11, 2021 at 09:03:13AM +, RJ Johnson wrote:
> > im not 100% sure at all, but *maybe* the method creating the dir
> > hierarchy is
> > https://searchfox.org/mozilla-central/source/xpcom/io/nsLocalFileUnix.cpp#360
> >  .
> 
> You were right. I've created a bug on Bugzilla at
> https://bugzilla.mozilla.org/show_bug.cgi?id=1697721 about this issue.
> 
> If you are interested, a patch-compatible version is below.

well, thanks for that ! I'll make sure this is properly tracked.

note that in the upcoming 87.0, there's another issue with unveil and
downloads, being tracked in
https://bugzilla.mozilla.org/show_bug.cgi?id=1696958 - i dunno if those
issues can be more or less linked with each other..

Landry

Re: Firefox fails to create profile's permanent storage

2021-03-04 Thread Landry Breuil

On Thu, Mar 04, 2021 at 07:30:18AM +, RJ Johnson wrote:
> When creating a new profile (on first launch or with "firefox -P"),
> Firefox fails to create the
> "~/.mozilla/firefox//storage/permanent" folder.
> 
> I have observed this behavior with Firefox 85, 86, and 78esr, although
> more versions are likely affected. This behavior was observed on a
> machine running -current.
> 
> The two most obvious symptoms of this failure are the browser's Web
> Developer tools showing no page source in the Inspector tab (non-esr)
> and various error messages in the Browser Console relating to IndexedDB.
> 
> This failure is caused by unveil. When creating a profile, Firefox
> begins checking each directory in the path
> "/home//.mozilla/firefox//storage/permanent" for
> existence (i.e., "/home" then "/home/" then ...). If any directory
> in this chain does not exist, Firefox gives up on creating the
> "permanent" folder. This is easily observed in a ktrace. (I did
> "ktrace -id firefox -P". Search for "permanent".) Since Firefox has no
> access to "/home" (despite having access to the profile folder), the
> "permanent" folder is never created.
> 
> The easiest way to fix this issue, for profiles both new and old, is to
> manually create the "permanent" folder after Firefox creates the profile
> for you. Once this folder exists, Firefox seems to have no more issues.
> It only has trouble creating this folder initially.

That's a good finding. Someone (tm) (not me) with enough motivation
should look into fixing that in the code. Looking for 'PERMANENT' on
searchfox.org, the corresponding codepath seems to be around
https://searchfox.org/mozilla-central/source/dom/quota/ActorsParent.cpp#3548

im not 100% sure at all, but *maybe* the method creating the dir
hierarchy is
https://searchfox.org/mozilla-central/source/xpcom/io/nsLocalFileUnix.cpp#360 .

Or somewhere else. That's a 'nice' maze.. Good luck !

Landry

Re: OpenBSD 6.8 errata 014 breaks pf

2021-02-25 Thread Landry Breuil

On Thu, Feb 25, 2021 at 10:31:59AM +, Mikolaj Kucharski wrote:
> On Thu, Feb 25, 2021 at 10:07:32AM +0100, stef...@fritz.wtf wrote:
> > >Synopsis: After installing OpenBSD 6.8 errata 014 pf allows no connections 
> > >and knows no tables 
> > >Category: kernel   
> > >Environment:
> > System  : OpenBSD 6.8
> > Details : OpenBSD 6.8 (GENERIC) #4: Mon Jan 11 10:34:36 MST 2021
> >  
> > r...@syspatch-68-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> > 
> > Architecture: OpenBSD.amd64
> > Machine : amd64
> > >Description:
> > After patching my system with syspatch to 6.8-014 no connections to the 
> > server were possible: no ssh, no smtp, https, imap.  Disabling pf allowed 
> > connections. 
> > 
> > 
> > >How-To-Repeat:
> > 
> > Patch system using syspatch.
> > 
> > >Fix:
> > I had to revert the most recently installed patch with syspatch -r.
> > 
> > 
> > dmesg:
> > OpenBSD 6.8 (GENERIC) #4: Mon Jan 11 10:34:36 MST 2021
> > 
> > r...@syspatch-68-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> 
> Can you show your pf.conf? I don't see that problem here.
> 
> # syspatch | wc -l
>0
> 
> # sysctl -n kern.version
> OpenBSD 6.8 (GENERIC.MP) #5: Mon Feb 22 04:36:10 MST 2021
> r...@syspatch-68-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

no problem either on a VM doing dns/dhcp, i can connect over ssh and it
correctly does dns/dhcp:

furka# pfctl -sr
block drop in all
pass in on vio0 inet proto icmp from any to 172.20.97.3 icmp-type echorep
pass in on vio0 inet proto icmp from any to 172.20.97.3 icmp-type echoreq
pass in on vio0 inet proto icmp from any to 172.20.97.3 icmp-type timex
pass in on vio0 inet proto icmp from any to 172.20.97.3 icmp-type unreach
pass out all flags S/SA
pass in log on vio0 inet proto tcp from <__automatic_1e5c56b2_0> to 172.20.97.3 port = 22 flags S/SA
pass in log on vio0 inet proto tcp from 172.20.97.21 to 172.20.97.3 port = 2812 flags S/SA
pass in log on vio0 inet proto udp from <__automatic_1e5c56b2_1> to 172.20.97.3 port = 53
pass in log on vio0 inet proto udp from any to any port = 67

furka# sysctl kern.version
kern.version=OpenBSD 6.8 (GENERIC) #5: Mon Feb 22 04:04:49 MST 2021
r...@syspatch-68-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC

Re: firefox pledge violation

2021-02-19 Thread Landry Breuil

On Fri, Feb 19, 2021 at 09:44:40PM +1100, Jonathan Gray wrote:
> On Fri, Feb 19, 2021 at 11:31:25AM +0100, Landry Breuil wrote:
> > On Fri, Feb 19, 2021 at 11:17:35AM +0100, Martin Pieuchot wrote:
> > > Firefox from -current, tab crashes, kernel says:
> > > 
> > > firefox[86270]: pledge "", syscall 289
> > 
> > maybe the drm update triggers again the codepaths leading to shm calls
> > prevented by pledge/unveil ?
> > try fiddling with the various knobs forcing/disabling acceleration ?
> > depends on the gfx chipset ?
> 
> $ LIBGL_ALWAYS_SOFTWARE=true firefox
> go to https://get.webgl.org/
> firefox[71928]: pledge "", syscall 289
> 
> $ LIBGL_ALWAYS_SOFTWARE=true chrome
> chrome[4]: pledge "", syscall 289
> chrome[85649]: pledge "", syscall 289
> chrome[23547]: pledge "", syscall 289

ok, so nothing new under the sun.

with the latest snapshot, get.webgl.org works here in firefox with amdgpu.

Re: firefox pledge violation

2021-02-19 Thread Landry Breuil

On Fri, Feb 19, 2021 at 11:17:35AM +0100, Martin Pieuchot wrote:
> Firefox from -current, tab crashes, kernel says:
> 
> firefox[86270]: pledge "", syscall 289

maybe the drm update triggers again the codepaths leading to shm calls
prevented by pledge/unveil ?
try fiddling with the various knobs forcing/disabling acceleration ?
depends on the gfx chipset ?

anyway, nothing i can do here, i'm just the guy pushing random buttons
to have updates.

you know better than me how to debug that :)

Re: ibus not work with firefox

2020-11-30 Thread Landry Breuil

On Mon, Nov 30, 2020 at 08:52:06AM +0800, Player wrote:
> Thanks for reading.
> On OpenBSD 6.8, ibus does not work with firefox.
> Other gnome software works well,
> and tor-browser can use ibus.
> In firefox, the call to gtk_im_context_filter_keypress returns false,
> and ibus does not receive the focus-in event from firefox.

you might want to try disabling pledge or unveil to figure out if some
paths are missing in unveil configuration for firefox to access ibus
socket or library, i have zero idea how ibus works and what is needed
for such interaction.

Landry

Re: amdgpu - extreme instability with Radeon RX 550

2020-07-04 Thread Landry Breuil

On Sat, Jul 04, 2020 at 05:58:07PM +1000, Jonathan Gray wrote:
> On Fri, Jul 03, 2020 at 11:14:02PM -0400, Joe Gidi wrote:
> > Hello,
> > 
> 
> firefox seems to be doing a dlopen after it has unveil'd and can't
> open libLLVM.  unveil removes visibility of parts of the filesystem,
> but it has to be done in the right place.

i dont think firefox itself is doing this dlopen, rather MESA ?
https://searchfox.org/mozilla-central/search?q=libllvm

>  37357 firefox  NAMI  "/usr/lib/libLLVM.so.2.0"
>  37357 firefox  RET   open -1 errno 2 No such file or directory
> 
> this can be reproduced on other hardware by forcing swrast which also
> uses libLLVM
> 
> LIBGL_ALWAYS_SOFTWARE=1 firefox
> 
> this is a firefox-specific problem which does not occur with chromium

Since unveil doesnt allow wildcards, i guess adding '/usr/lib r' to
/etc/firefox/unveil.gpu is the way to go. I dont have a machine with
amdgpu, and it doesnt seem to help LIBGL_ALWAYS_SOFTWARE=1 firefox
https://get.webgl.org here but maybe that's unrelated.

Landry

ltrace fails to display symbols for huge libraries ?

2019-11-06 Thread Landry Breuil

Hi,
playing with ltrace, trying to see what functions are called by firefox,
it works fine in many situations:

ltrace -i -ulibnspr4 /usr/local/bin/firefox
ltrace -i -u:NSS* /usr/local/bin/firefox

*but* for some reason it doesnt seem to work with libxul.so, which is
where most of the symbols of the engine lie.

At the beginning i thought it was an issue as libxul.so is dlopen()'ed
by firefox binary (and not linked in the binary) but in some trunk
builds i have around, libnss3 & libnspr4 are also dlopen()'ed by firefox
binary and ltrace still shows symbols from those libs, so that doesnt
seem an issue with dlopen().

/usr/local/bin/firefox:
Start            End              Type  Open Ref GrpRef Name
13c3e030         13c3e0353000     exe   2    0   0      /usr/local/bin/firefox
13c68ce1b000     13c68cef7000     rlib  0    1   0      /usr/lib/libc++.so.3.0
13c6451ec000     13c64522e000     rlib  0    2   0      /usr/lib/libc++abi.so.1.0
13c6a1986000     13c6a1993000     rlib  0    1   0      /usr/lib/libpthread.so.26.1
13c6d5491000     13c6d54c         rlib  0    1   0      /usr/lib/libm.so.10.1
13c6987e5000     13c6988d9000     rlib  0    1   0      /usr/lib/libc.so.96.0
13c696dba000     13c696dba000     ld.so 0    1   0      /usr/libexec/ld.so

objdump -p /usr/local/lib/firefox/libxul.so.85.0 |grep NEED lists all
the dependencies of libxul, and i have no issue showing called symbols
from those via ltrace, ie ltrace -i -ulibcairo /usr/local/bin/firefox
yields many things.

So, what could be the issue here ? the symbols i expect being called by
the code are not all exported in libxul.so ? The library is too large
(250mb)?  I find it strange that
ltrace -i -ulibxul /usr/local/bin/firefox
yields nothing while most of the actual code is there.

Thanks for any pointers.

Landry

Re: radeondrm hang details

2019-07-02 Thread Landry Breuil

On Fri, Jun 28, 2019 at 01:30:57PM +0200, Landry Breuil wrote:
> On Tue, Jun 04, 2019 at 11:29:58AM +0200, Landry Breuil wrote:
> > Hi,
> > 
> > so i have this dell optiplex 960 with radeon hd 3450 which hangs since
> > the drm update. For the first week, i had hangs with _x11 Xorg process
> > in schto, but since 2 or 3 weeks all the hangs (2 or 3 a day, usually
> > happening when i focus the browser or type smth in the address bar or
> > switch tab) leave this process in the fsleep state.
> > 
> > rebuilt kernel without karl, set allowkmem.
> > attached details is the full process list, and the backtrace for the
> > _x11 Xorg process + dmesg.
> 
> New backtrace, this time running -current without any diff (not the one
> bumping some thread pools to 4), and the hang was on schto, clearly
> pointing at radeondrm methods. Hope it helps.

Fwiw, been running most of the day without hangs so far, so it seems that
toggling layers.acceleration.force-enabled & webgl.force-enabled to
false in my ffx profile has an effect.

If you want to reproduce this issue, might want to try with those turned
on..

Landry

Re: radeondrm hang details

2019-06-28 Thread Landry Breuil

On Tue, Jun 04, 2019 at 11:29:58AM +0200, Landry Breuil wrote:
> Hi,
> 
> so i have this dell optiplex 960 with radeon hd 3450 which hangs since
> the drm update. For the first week, i had hangs with _x11 Xorg process
> in schto, but since 2 or 3 weeks all the hangs (2 or 3 a day, usually
> happening when i focus the browser or type smth in the address bar or
> switch tab) leave this process in the fsleep state.
> 
> rebuilt kernel without karl, set allowkmem.
> attached details is the full process list, and the backtrace for the
> _x11 Xorg process + dmesg.

New backtrace, this time running -current without any diff (not the one
bumping some thread pools to 4), and the hang was on schto, clearly
pointing at radeondrm methods. Hope it helps.

Landry
(gdb) target kvm
#0  0x8110a742 in mi_switch () at atomic.h:298
298 atomic.h: No such file or directory.
in atomic.h
Current language:  auto; currently minimal
(gdb) kvm proc 0x800023542ef8
#0  0x8110a742 in mi_switch () at atomic.h:298
298 in atomic.h
(gdb) bt
#0  0x8110a742 in mi_switch () at atomic.h:298
#1  0x8184c3d4 in sleep_finish (sls=0x8000234a5220, 
do_sleep=Variable "do_sleep" is not available.
) at /usr/src/sys/kern/kern_synch.c:297
#2  0x8184c2b1 in sleep_finish_all (sls=0x95acc6c3e69b3938, 
do_sleep=Variable "do_sleep" is not available.
) at /usr/src/sys/kern/kern_synch.c:156
#3  0x815a251c in schedule_timeout (timeout=-3125116618890996260) at 
/usr/src/sys/dev/pci/drm/drm_linux.c:99
#4  0x81104e87 in radeon_fence_default_wait (f=0xbb8, 
t=7344343019588750691)
at /usr/src/sys/dev/pci/drm/radeon/radeon_fence.c:1104
#5  0x814d32d8 in reservation_object_wait_timeout_rcu 
(obj=0x81c0f000, timeout=Variable "timeout" is not available.
)
at dev/pci/drm/include/linux/dma-fence.h:184
#6  0x81313661 in radeon_gem_wait_idle_ioctl (dev=Variable "dev" is not 
available.
) at /usr/src/sys/dev/pci/drm/radeon/radeon_gem.c:479
#7  0x8182cda8 in drmioctl (kdev=Variable "kdev" is not available.
) at /usr/src/sys/dev/pci/drm/drm_ioctl.c:788
#8  0x813e3ec5 in VOP_IOCTL (vp=0x8000234a55f0, command=Variable 
"command" is not available.
) at /usr/src/sys/kern/vfs_vops.c:289
#9  0x814a5302 in vn_ioctl (fp=0x15700, com=18446603336813915896, 
data=Variable "data" is not available.
) at /usr/src/sys/kern/vfs_vnops.c:524
#10 0x81499c32 in sys_ioctl (p=0x8000234a55f0, 
v=0xd660235d4ab8e4b1, retval=Variable "retval" is not available.
) at atomic.h:234
#11 0x8167c8b9 in syscall (frame=0xd4a15afe2168f879) at 
sys/syscall_mi.h:92
#12 0x81a77134 in Xsyscall ()

USER   PID %CPU %MEM  VSZ  RSS TT  STAT STARTED      TIME COMMAND   UID PPID CPU PRI NI WCHAN   PADDR
root 82287  2.8  0.0    0    0 ??  DK   8:20AM    9:02.15 drmwq       0    0   0  10  0 bored   8000220bb150
root     0  0.0  0.0    0    0 ??  DK   8:20AM    0:02.13 swapper     0    0   0 -18  0 schedul 81d93680
root     1  0.0  0.0  476  440 ??  I    8:20AM    0:01.01 init        0    0  28  10  0 wait    80002206db28
root 82998  0.0  0.0    0    0 ??  DK   8:20AM    0:01.00 smr         0    0   0 -18  0 bored   80002206d8b0
root 80452  0.0  0.0    0    0 ??  DK   8:20AM  290:20.26 idle0       0    0   0 -22  0 -       80002206ced0
root 70409  0.0  0.0    0    0 ??  DK   8:20AM    0:23.94 softclock   0    0   0 -22  0 bored   80002206cc58
root 19366  0.0  0.0    0    0 ??  DK   8:20AM    0:00.03 systq       0    0   0  10  0 bored   80002206c768
root 68439  0.0  0.0    0    0 ??  DK   8:20AM    0:10.00 systqmp     0    0   0  10  0 bored   80002206c000
root 66983  0.0  0.0    0    0 ??  DK   8:20AM    0:10.07 softnet     0    0   0  10  0 bored   80002206c278
root 89409  0.0  0.0    0    0 ??  DK   8:20AM    0:01.16 sensors     0    0   0  10  0 bored   80002206c4f0
root 39103  0.0  0.0    0    0 ??  RK/1 8:20AM  288:08.83 idle1       0    0   0 -22  0 -       80002206c9e0
root 76385  0.0  0.0    0    0 ??  RK/2 8:20AM  287:50.70 idle2       0    0   0 -22  0 -       80002206d148
root 11063  0.0  0.0    0    0 ??  RK/3 8:20AM  289:07.49 idle3       0    0   0 -22  0 -       80002206d3c0
root 30524  0.0  0.0    0    0 ??  DK   8:20AM    0:00.00 acpi0       0    0   0  10  0 acpi0   80002206d638
root 33676  0.0  0.0    0    0 ??  DK   8:20AM    0:01.00 drmubwq     0    0   0  10  0 bored   8000220baed8
root 84257  0.0  0.0 0 0 ??

Re: radeondrm cursor problems

2019-05-03 Thread Landry Breuil

On Fri, May 03, 2019 at 10:40:55PM -0600, Anthony J. Bentley wrote:
> Hi,
> 
> Since the radeondrm update, I've noticed the cursor doesn't always behave
> as expected. Sometimes it displays the wrong graphic.
> 
> For example, as I write this email, hovering over one xterm window shows
> the normal text selection cursor, but moving the mouse over the other it
> changes to the "pointing finger" cursor. In Firefox, hovering over links
> or selectable text shows the normal arrow pointer instead of the finger
> or text selector; the arrow is also noticeably offset, where moving
> between regular text and a link results in the arrow graphic shifting up
> by 10 pixels or so.

I think that matches what i experienced on my work desktop which is also
radeondrm - sometimes, when i want to click on a link, i need to get the
mouse some pixels above the link itself, and i've also seen the "wrong"
cursor being displayed when hovering different windows.

Landry

Re: macppc can't modify pages in swap

2018-12-25 Thread Landry Breuil

On Sun, Dec 23, 2018 at 09:20:11PM +0100, Mark Kettenis wrote:
> > Date: Sun, 23 Dec 2018 19:35:02 +0100 (CET)
> > From: Mark Kettenis 
> > 
> > > Date: Sun, 23 Dec 2018 12:26:17 +0100 (CET)
> > > From: Mark Kettenis 
> > > 
> > > That's a very good find.  I think there still is a potential race in
> > > your diff on MP systems since you save the bits before removing the
> > > PTE from the hash tables.  I'll see if I can come up with a better diff.
> > 
> > So here is the diff I propose instead.  This zaps the PTE before
> > unlinking.  At that point the PTED_VA_MANAGED_M flag is still set so
> > the MOD/REF accounting will happen.  A process running on the other
> > CPU can't put the PTE back into the hash as we still hold the lock on
> > the pmap.
> > 
> > Bootstrapping clang with this diff, and things are still running even
> > though I've hit swap.  I'll give this a spin on an MP machine as well.
> 
> I fear somebody else needs to test on an MP machine as my dual G4
> doesn't boot anymore :(.

I'll put this diff on macppc-*.p in the next bulk (in some days), both
are MP. Maybe it'll fix the random ICEs when building c++ monsters,
which probably hit swap as there's only 2Gb physmem...

Landry

Re: Digital audio out no longer working?

2018-06-24 Thread Landry Breuil

On Sat, Jun 23, 2018 at 08:48:57PM +0100, Laurence Tratt wrote:
> On Sat, Jun 23, 2018 at 06:43:04PM +0200, Alexandre Ratchov wrote:
> 
> Hello Alexandre,
> 
> >> As of the last two snapshots, digital audio out via S/PDIF on my machine
> >> no longer works:
> >> 
> >>   $ doas mixerctl outputs.mode=digital
> >>   mixerctl: field outputs.mode does not exist
> >> 
> >> There also seems to be something odd with mixerctl's output which may or
> >> may not be related? Note the "invalid format" near the end and the incomplete
> >> formatting of "volume.record".
> > Could you confirm that this diff fixes the problem?
> 
> I am happy to confirm that this completely solves the problem. Thanks so much
> for doing this!

Also fixes mixerctl on a secondary device/mic, before the patch:

$mixerctl -f /dev/mixer1 -av
record.enable=sysctl  [ off on sysctl ]

After the patch:

$mixerctl -f /dev/mixer1 -av
record.mic.mute=off  [ off on ]
record.mic=191 volume
record.enable=sysctl  [ off on sysctl ]

Thanks !

Landry

Re: no media opt on axe(4)

2018-04-26 Thread Landry Breuil

On Thu, Apr 26, 2018 at 07:05:11AM -0700, Mike Larkin wrote:
> On Thu, Apr 26, 2018 at 03:50:19PM +0200, Landry Breuil wrote:
> > On Tue, Apr 24, 2018 at 08:35:21PM +0200, Landry Breuil wrote:
> > > Hi,
> > > 
> > > sometime since 6.3, something broke axe(4) mediaopt:
> > > 
> > > media: Ethernet none (none)
> > > supported media:
> > > media none
> > > 
> > > This is a trendnet TU2-ET100, listed in the manpage and shown in dmesg as:
> > > axe0 at uhub1 port 1 configuration 1 interface 0 "ASIX Electronics AX88772" rev 2.00/0.01 addr 5
> > > axe0: AX88772, address 00:14:d1:da:77:57
> > > I think i used it in the past months but can't exactly remember when.
> > 
> > Something is strange somewhere, because booting a 6.2 or 6.3 kernel
> > yields the same.
> > 
> 
> Followup - we tried landry's axe on my surface book and it worked. On my x230,
> it didn't. Perhaps an xhci vs ehci issue?

Not sure anymore, because on the x1 it does the same with usb3
disabled/enabled via the bios. will poke a bit with AXE_DEBUG.

Landry

Re: no media opt on axe(4)

2018-04-26 Thread Landry Breuil

On Tue, Apr 24, 2018 at 08:35:21PM +0200, Landry Breuil wrote:
> Hi,
> 
> sometime since 6.3, something broke axe(4) mediaopt:
> 
> media: Ethernet none (none)
> supported media:
> media none
> 
> This is a trendnet TU2-ET100, listed in the manpage and shown in dmesg as:
> axe0 at uhub1 port 1 configuration 1 interface 0 "ASIX Electronics AX88772" rev 2.00/0.01 addr 5
> axe0: AX88772, address 00:14:d1:da:77:57
> I think i used it in the past months but can't exactly remember when.

Something is strange somewhere, because booting a 6.2 or 6.3 kernel
yields the same.

no media opt on axe(4)

2018-04-24 Thread Landry Breuil

Hi,

sometime since 6.3, something broke axe(4) mediaopt:

media: Ethernet none (none)
supported media:
media none

This is a trendnet TU2-ET100, listed in the manpage and shown in dmesg as:
axe0 at uhub1 port 1 configuration 1 interface 0 "ASIX Electronics AX88772" rev 2.00/0.01 addr 5
axe0: AX88772, address 00:14:d1:da:77:57
I think i used it in the past months but can't exactly remember when.

I'm using another axe(4) (D-Link DUB-E100) at home on 6.3 and it works
fine (of course down right now so cant compare mediaopt output..) - both
on amd64.

Since there have been no commits to axe(4) recently, and mpi@
lent me an axen(4) that works (ie that lists mediaopts) im not sure if
it's related to the device, the usb stack, or something else..
Device available here in the hackroom.

Landry

Re: modesetting driver broke video(1)

2018-04-24 Thread Landry Breuil

On Thu, Mar 29, 2018 at 11:47:16AM +0200, Martin Pieuchot wrote:
> Since we switched to the modesetting driver by default, the supported
> XvImage formats no longer include YUY2 nor UYVY which are expected by
> video(1).  Using the following Xorg.conf makes video(1) work again.
> 
> Section "Device"
>   Identifier "Device0"
>   Driver "intel"
> EndSection
> 
> Attached are the outputs of xvinfo(1) with the modesetting driver and
> the intel driver.

Interestingly, while video(1) itself is broken i managed to make
firefox/webrtc talk to the webcam and use it from the browser some weeks
ago, and it works right now on my x200s with an external uvideo using
-current. (Same webcam on my x1 doesnt work but that's probably a
conflict with the internal webcam which is broken for other reasons:
"uvideo0: could not open VS pipe: INVAL")

All that to say, maybe that means that video(1) can/should be adapted to
use another method/encoding, because the camera itself works in firefox
and is supported, if YUY2 and UYVY are not going to come back with
modesetting...

Re: Thunar dies and dumps core

2018-04-10 Thread Landry Breuil

On Mon, Apr 09, 2018 at 04:58:36PM -0600, George Mihai IACOB wrote:
> >Synopsis: Thunar dies and dumps core when opening documents with double-click
> >Category:user
> >Environment:
> System  : OpenBSD 6.3
> Details : OpenBSD 6.3 (GENERIC.MP) #107: Sat Mar 24 14:21:59 MDT 2018
>  dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> When I double-click a file (image, spreadsheet) to launch the
> associated program, Thunar launches the program, dumps core and dies. This
> happens every time I try to open a file from Thunar.
> 
> >How-To-Repeat:
> 1. Open Thunar.
> 2. Navigate to the folder containing the file to open.
> 3. (Double)click the file to open.
> 
> >Fix:
> I don't know how to fix this. Using another file manager (xfe) helps.

I of course cant reproduce it, be it on current or 6.3, so you'll have to
find a way to narrow down the issue for you. Is it on any file/mimetype,
is it on a network share, are you running thunar within xfce? And most
of all, use gdb on the coredump to generate a proper backtrace of the
crash. Note that you might need to rebuild the thunar port with
"make DEBUG=-g" to have debug symbols in the binary.

Without more useful information, nobody can help you as is.

Re: firefox (or some rthread / network stuff) broken in -current

2018-02-11 Thread Landry Breuil

On Sun, Feb 11, 2018 at 03:13:07PM +0100, Matthieu Herrb wrote:
> On Sun, Feb 11, 2018 at 02:50:30PM +0100, Martin Pieuchot wrote:
> > On 11/02/18(Sun) 12:37, Matthieu Herrb wrote:
> > > Hi,
> > > 
> > > I upgraded my laptop from sources to -current yesterday. Since then
> > > firefox stops resolving host names after a dozen of minutes or so.
> > 
> > What do you mean with "stops resolving host names"?  What happens?  What
> > do you see?
> > 
> > How did you figure out it was a name resolution problem?
> > 
> > A firefox error page?
> 
> Firefox says 'Resolving herrb.net' in popup  message area at the
> bottom of the windows, plays the little animation with a dot moving
> left to right and back in the tab, and the tab stays blank.

Im seeing the same thing right now, and it is 'by periods'. Are you
using wifi or wired ? Here over iwm, and a kernel from yesterday.

Re: httpd.conf configuration mismatch when setting ticket lifetime w/ servers sharing a cert

2018-02-09 Thread Landry Breuil

On Fri, Feb 09, 2018 at 09:40:33PM +0100, Landry Breuil wrote:
> On Fri, Feb 09, 2018 at 07:54:22PM +0100, Landry Breuil wrote:
> > Hi,

I think i found it with some printf-debugging...

If the default vhost has no tls config, and any of the other vhosts has some
non-default tls config (for protocols, ticket, dhe, ciphers..), the
server_match() function will return the default vhost for 's', and then parse.y
unconditionally compares the tls config for s and the current server - as the
default vhost has no tls config, of course they wont match.

My idea would be to compare the tls configs only if the default vhost has a tls
config.. but i'm not sure that's the way to go, since i'm not sure i understand
the rationale about comparing tls configs. Any httpd/ssl experts ? joel, i
think it is this way since r1.79...

With this diff, i can validate a config that would previously error out. I'm not
sure this is the way to go of course.

Index: parse.y
===
RCS file: /cvs/src/usr.sbin/httpd/parse.y,v
retrieving revision 1.92
diff -u -r1.92 parse.y
--- parse.y 28 Aug 2017 06:00:05 -  1.92
+++ parse.y 9 Feb 2018 22:40:20 -
@@ -316,7 +316,8 @@
free(srv);
YYERROR;
}
-   if (server_tls_cmp(s, srv, 0) != 0) {
+   if ((s->srv_conf.flags & SRVFLAG_TLS) &&
+   (server_tls_cmp(s, srv, 0) != 0)) {
yyerror("server \"%s\": tls "
"configuration mismatch on same "
"address/port",

Landry
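
The guard in the diff above can be sketched outside of httpd as follows.
Everything here is made up for illustration: struct vhost_sketch,
tls_cmp_sketch() and tls_config_conflicts() are hypothetical stand-ins for the
real server config and server_tls_cmp(), and the SRVFLAG_TLS value is an
assumption, not the one from httpd.h. It only shows the idea that the
comparison is skipped when the first server matched on the address/port has
no tls config at all.

```c
#include <stddef.h>
#include <string.h>

#define SRVFLAG_TLS	0x01	/* hypothetical value, for illustration only */

/* Hypothetical, heavily reduced stand-in for a server's configuration. */
struct vhost_sketch {
	unsigned int flags;		/* SRVFLAG_TLS set if "listen ... tls" */
	int ticket_lifetime;		/* 0 when tickets are not configured */
	const char *cert;		/* certificate path, NULL when unset */
};

/* Returns 0 when both tls configs are identical, -1 otherwise. */
static int
tls_cmp_sketch(const struct vhost_sketch *a, const struct vhost_sketch *b)
{
	if (a->ticket_lifetime != b->ticket_lifetime)
		return -1;
	if (a->cert == NULL || b->cert == NULL)
		return (a->cert == b->cert) ? 0 : -1;
	return strcmp(a->cert, b->cert) != 0 ? -1 : 0;
}

/*
 * Returns 1 when `cur` must be rejected with "tls configuration mismatch".
 * The comparison only makes sense if `first` has a tls config itself.
 */
int
tls_config_conflicts(const struct vhost_sketch *first,
    const struct vhost_sketch *cur)
{
	if ((first->flags & SRVFLAG_TLS) == 0)
		return 0;
	return tls_cmp_sketch(first, cur) != 0;
}
```

In this sketch a plain-http "default" server followed by a tls server with a
ticket lifetime of 1800 is accepted, while two tls servers with different
lifetimes are still rejected. Whether that matches the intent behind the
original check is the open question raised in the mail.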

Re: httpd.conf configuration mismatch when setting ticket lifetime w/ servers sharing a cert

2018-02-09 Thread Landry Breuil

On Fri, Feb 09, 2018 at 07:54:22PM +0100, Landry Breuil wrote:
> Hi,
> 
> on ftp.fr we use httpd on 6.2.
> 
> The config more or less looks like:
> 
> server "default" {
> alias distfiles.bsdfrog.org
> listen on egress port www
> location "/*" {
> block return 301 "https://$SERVER_NAME$REQUEST_URI"
> }
> 
> }
> 
> server "distfiles.bsdfrog.org" {
> listen on egress tls port https
> root "/distfiles"
> #   tls ticket lifetime 1800
> tls certificate "/etc/ssl/pond.obspm.bsdfrog.org.crt"
> tls key "/etc/ssl/private/pond.obspm.bsdfrog.org.key"
> }
> 
> server "ftp.fr.openbsd.org" {
> listen on egress port www
> listen on egress tls port https
> root "/mirror/ftp"
> #   tls ticket lifetime 1800
> tls certificate "/etc/ssl/pond.obspm.bsdfrog.org.crt"
> tls key "/etc/ssl/private/pond.obspm.bsdfrog.org.key"
> }
> 
> 
> Which works fine with https on the different vhosts. But as soon as i uncomment
> the tls ticket lifetime lines, httpd -nvv complains about configuration
> mismatch:
> 
> server_tls_load_keypair: using certificate /etc/ssl/pond.obspm.bsdfrog.org.crt
> server_tls_load_keypair: using private key /etc/ssl/private/pond.obspm.bsdfrog.org.key
> /etc/httpd.conf:37: server "ftp.fr.openbsd.org": tls configuration mismatch on same address/port

I think i've found the bug - it manifests only if there are 3 server
definitions sharing a cert, not with 2. Will dig further.

Landry

httpd.conf configuration mismatch when setting ticket lifetime w/ servers sharing a cert

2018-02-09 Thread Landry Breuil

Hi,

on ftp.fr we use httpd on 6.2.

The config more or less looks like:

server "default" {
alias distfiles.bsdfrog.org
listen on egress port www
location "/*" {
block return 301 "https://$SERVER_NAME$REQUEST_URI"
}

}

server "distfiles.bsdfrog.org" {
listen on egress tls port https
root "/distfiles"
#   tls ticket lifetime 1800
tls certificate "/etc/ssl/pond.obspm.bsdfrog.org.crt"
tls key "/etc/ssl/private/pond.obspm.bsdfrog.org.key"
}

server "ftp.fr.openbsd.org" {
listen on egress port www
listen on egress tls port https
root "/mirror/ftp"
#   tls ticket lifetime 1800
tls certificate "/etc/ssl/pond.obspm.bsdfrog.org.crt"
tls key "/etc/ssl/private/pond.obspm.bsdfrog.org.key"
}


Which works fine with https on the different vhosts. But as soon as i uncomment
the tls ticket lifetime lines, httpd -nvv complains about configuration
mismatch:

server_tls_load_keypair: using certificate /etc/ssl/pond.obspm.bsdfrog.org.crt
server_tls_load_keypair: using private key /etc/ssl/private/pond.obspm.bsdfrog.org.key
/etc/httpd.conf:37: server "ftp.fr.openbsd.org": tls configuration mismatch on same address/port

which comes from
https://github.com/openbsd/src/blob/master/usr.sbin/httpd/parse.y#L319 - and
there i dont see what could mismatch here.. broken comparison on integers ?
same thing with 'default' for the value (without quotes) or

tls {
 ticket lifetime 1800
 certificate "/etc/ssl/pond.obspm.bsdfrog.org.crt"
 key "/etc/ssl/private/pond.obspm.bsdfrog.org.key"
}

which afaiui should be equivalent. Of course the ssl cert has all the necessary
altnames.

Anyone having a clue ? Running a similar config without issue ?

Landry

Re: amd64/machdep knob: forceukb forcing wrong encoding.

2018-02-04 Thread Landry Breuil

On Sun, Feb 04, 2018 at 09:32:41AM +, Jason McIntyre wrote:
> On Sun, Feb 04, 2018 at 11:28:31AM +0200, Artturi Alm wrote:
> > Hi,
> > 
> > machdep.forceukbd=1 feels broken to me, as i use "sv", and it doesn't 
> > respect
> > /etc/kbdtype.
> > caused nothing but my time being (not-so-well-)wasted, because i forgot 
> > having
> > just added forceukbd=1 && this 'quirk' being undocumented. so maybe a 
> > docbug.
> > 
> > just wanted to make a note of this somewhere, sry4noise..
> > -Artturi
> > 
> 
> morning.
> 
> i ran into this this week too. i think it is a bug in its behaviour, not
> a doc bug.

I've seen it too, and it doesn't help when you have an azerty and a
qwerty kbd at the same time: with /etc/kbdtype set to fr and
forceukbd=1, both end up having a 'funny' behaviour..

Re: pure-ftpd config file

2018-01-21 Thread Landry Breuil

On Sun, Jan 21, 2018 at 01:02:31PM +0100, Torsten Boese wrote:
> Hi,
> 
> I'm using OpenBSD 6.2 and I want to use pure-ftpd as a daemon. I activated 
> it via 
> 
> # rcctl enable pureftpd 
> 
> and the daemon starts as expected. The problem is that pure-ftpd ignores 
> the config file.
> As written in /usr/local/share/doc/pure-ftpd/README.Configuration-File I 
> copied the pure-ftpd.conf in 
> 
> # /etc 
> as well as in 
> 
> # /usr/local/etc/
> . But pure-ftpd ignores the config. If I do  
> 
> # pure-ftpd /etc/etc/pure-ftpd.conf
> 
> the config works.
> 
> Where do I have to store the config, or what do I need to do to make the 
> daemon use the config file?

Iirc, pure-ftpd upstream didnt support a configuration file itself, you
configure it via flags passed on the commandline... by default the rc
script uses -A -B -H -u1000 (see /etc/rc.d/pure_ftpd) and you can
override it via

# rcctl set pure_ftpd flags 

(note that the rc script has an underscore - i doubt your 'rcctl enable
pureftpd' did anything at all)

I know debian has/had a perl script which parses a config directory with
one file by option, which are translated to commandline switches at
startup, but that wasnt upstream. I looked at the doc you mention, and
maybe it'd work if you set the path to where you installed the config
file (which i suppose you copied/adapted from the provided one in
share/examples?) as a startup flag:

# rcctl set pure_ftpd flags /path/to/config

The fact that the doc shows a 'double' /etc makes me think there's a
substitution error somewhere.

cc'ing ports@ as it's more relevant to a pure port issue..

Landry

ifstated link state ?

2018-01-04 Thread Landry Breuil

Hi,

i know ifstated was designed with carp in mind, but nothing from the
manpage seems to forbid one to use it with other interfaces, thus im
trying to make it work with ppp so that i can just up a potentially
existing ppp0 interface and ifstated would magically start xl2tpd to up
a tunnel.

fiddling with this, i came up with this sample config which... behaves
weird:

init-state vpn_unknown

state vpn_up {
	if ppp1.link.unknown {
		run 'echo "ppp1 is up->unk" > /tmp/foo'
	}
	if ppp1.link.down {
		run 'echo "ppp1 is up->down" > /tmp/foo'
	}
}
state vpn_unknown {
	if ppp1.link.up {
		run 'echo "ppp1 is unk->up" > /tmp/foo'
		set-state vpn_up
	}
	if ppp1.link.down {
		run 'echo "ppp1 is unk->down" > /tmp/foo'
		set-state vpn_down
	}
}

$ ifconfig ppp1
ppp1: flags=8010 mtu 1500
	index 8 priority 0 llprio 3
	groups: ppp

$ doas ifstated -vd -f t.conf
initial state: vpn_unknown
changing state to vpn_unknown
running echo "ppp1 is unk->up" > /tmp/foo
changing state to vpn_up
running echo "ppp1 is up->unk" > /tmp/foo
started

i dont really understand the state flipping here... it's as if at first
ppp1.link.up was true, then immediately the link state is unknown.. or
maybe it just evaluates the first statement to true, when no macros are
used ? the config is valid according to the grammar, but the behaviour
is .. weird. Dunno if the grammar or the examples or the manpage should be
adapted/clarified ?

Re: Unable to boot OpenBSD within QEMU on an Intel Platinum 8176M

2018-01-02 Thread Landry Breuil

On Sat, Dec 30, 2017 at 09:23:03PM -0800, Mike Larkin wrote:
> On Tue, Jan 02, 2018 at 11:30:47AM -0500, Brian Rak wrote:
> > 
> > 
> The only thing I can say is that recently I've been noticing an uptick in the
> quantity of KVM related issues on OpenBSD. Whether this is due to some recent
> changes in KVM, or maybe due to more people running OpenBSD on KVM (and thus
> increasing the number of reports), I'm not sure. But kettenis@ did note a few
> days ago in a reply to a different KVM related issue that it seems their local
> APIC emulation code isn't behaving exactly as we expect. But that code hasn't
> changed in OpenBSD since, well, forever, so it's likely a KVM issue there.
> Whether this is your issue or not I don't know. You might bring this up on
> the KVM mailing lists and see if someone can shed light on it. If you search
> the tech@/misc@ archives for proxmox related threads, there was a KVM option
> reported a week or so back that seemed to fix the issue kettenis@ was 
> commenting
> on; perhaps this can help you.

ftr that option was kvm-intel.preemption_timer=0 on the host kernel
commandline.

Re: mp deadlock on 6.2 running on kvm

2017-12-22 Thread Landry Breuil

On Fri, Dec 15, 2017 at 01:59:06PM +0100, Landry Breuil wrote:
> On Fri, Dec 15, 2017 at 06:21:08PM +1000, Jonathan Matthew wrote:



> > So, for me at least, adding 'kvm-intel.preemption_timer=0' to the linux 
> > kernel
> > commandline (on the host) fixes this.  This option disables kvm's use of the
> > vmx preemption timer to schedule lapic counter events.  I think there's some
> > problem in how kvm calculates the tsc deadline, or the relationship between
> > that and the time value it uses for lapic counter reads, but I'm not sure 
> > what
> > exactly.
> > 
> 
> I'm also running with this option now, will bang the vms with some heavy
> workload and keep an eye on it. Thanks for the time you spent on this ! :)

Been running with it for a week without crashes/hardlocks, and i'm
also able to reboot VMs without issues. Sounds good!

Re: mp deadlock on 6.2 running on kvm

2017-12-15 Thread Landry Breuil

On Fri, Dec 15, 2017 at 06:21:08PM +1000, Jonathan Matthew wrote:
> On Tue, Dec 12, 2017 at 09:11:40PM +1000, Jonathan Matthew wrote:
> > On Mon, Dec 11, 2017 at 09:34:00AM +0100, Landry Breuil wrote:
> > > On Mon, Dec 11, 2017 at 06:21:01PM +1000, Jonathan Matthew wrote:
> > > > On 10/12/17 03:26, Landry Breuil wrote:
> > > > > On Sat, Dec 09, 2017 at 04:33:28PM +0100, Juan Francisco Cantero 
> > > > > Hurtado wrote:
> > > > > > On Thu, Dec 07, 2017 at 02:27:29PM +0100, Landry Breuil wrote:
> > > > > > > On Thu, Dec 07, 2017 at 11:52:46AM +0100, Martin Pieuchot wrote:
> > > > > > > > On 07/12/17(Thu) 08:34, Landry Breuil wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > i've been having kvm VMs running 6.2 hardlocking/deadlocking 
> > > > > > > > > since a
> > > > > > > > > while, all those running on proxmox 5.1 using linux 4.13.8 & 
> > > > > > > > > qemu-kvm
> > > > > > > > > 2.9.1. There were hardlocks upon reboot which were 'solved' 
> > > > > > > > > by disabling
> > > > > > > > > x2apic emulation in kvm (args: -cpu=kvm64,-x2apic) or giving 
> > > > > > > > > the host
> > > > > > > > > cpu flags to the vm (args: -cpu host) but there still remains 
> > > > > > > > > deadlocks
> > > > > > > > > during normal operation.
> > > > > > > > > 
> > > > > > > > > I'm now running a kernel with MP_LOCKDEBUG, so i'm collecting 
> > > > > > > > > traces in
> > > > > > > > > the vain hope that it might help someone interested in 
> > > > > > > > > locking issues.
> > > > > > > > > Here's the latest one:
> > > > > > > > 
> > > > > > > > Let me add that when you had x2apic enabled the kernel 'froze' 
> > > > > > > > inside
> > > > > > > > x2apic_readreg, trace below:
> > > > > > > > 
> > > > > > > >ddb{0}> tr
> > > > > > > >x2apic_readreg(10) at x2apic_readreg+0xf
> > > > > > > >lapic_delay(800022136900) at lapic_delay+0x5c
> > > > > > > >rtcput(800022136960) at rtcput+0x65
> > > > > > > >resettodr() at resettodr+0x1d6
> > > > > > > >perform_resettodr(81769b29) at perform_resettodr+0x9
> > > > > > > >taskq_thread(0) at taskq_thread+0x67
> > > > > > > >end trace frame: 0x0, count: -6
> > > > > > > > 
> > > > > > > > What you're seeing with a MP_LOCKDEBUG kernel is just a 
> > > > > > > > symptom.  A CPU
> > > > > > > > enters DDB because another one is 'frozen' while holding the
> > > > > > > > KERNEL_LOCK().  What's interesting is that in both case the 
> > > > > > > > frozen CPU
> > > > > > > > is trying to execute apic related code:
> > > > > > > >- x2apic_readreg
> > > > > > > >- lapic_delay
> > > > > > > > 
> > > > > > > > I believe this issue should be reported to KVM developers as 
> > > > > > > > well.
> > > > > > > 
> > > > > > > *very* interestingly, i had a new lock, running bsd.sp.. So i 
> > > > > > > think that
> > > > > > > rules out openbsd mp.
> > > > > > > 
> > > > > > > ddb> tr
> > > > > > > i82489_readreg(0) at i82489_readreg+0xd
> > > > > > > lapic_delay(81a84090) at lapic_delay+0x5c
> > > > > > > rtcget(81a84090) at rtcget+0x1a
> > > > > > > resettodr() at resettodr+0x3a
> > > > > > > perform_resettodr(81659e99) at perform_resettodr+0x9
> > > > > > > taskq_thread(0) at taskq_thread+0x57
> > > > > > > end trace frame: 0x0, count: -6
> > > > > > 
> > > > > > Try running with "-machine q35". It changes the emulated machine to
> > > > > > a modern pl

Re: mp deadlock on 6.2 running on kvm

2017-12-11 Thread Landry Breuil

On Mon, Dec 11, 2017 at 06:21:01PM +1000, Jonathan Matthew wrote:
> On 10/12/17 03:26, Landry Breuil wrote:
> > On Sat, Dec 09, 2017 at 04:33:28PM +0100, Juan Francisco Cantero Hurtado 
> > wrote:
> > > On Thu, Dec 07, 2017 at 02:27:29PM +0100, Landry Breuil wrote:
> > > > On Thu, Dec 07, 2017 at 11:52:46AM +0100, Martin Pieuchot wrote:
> > > > > On 07/12/17(Thu) 08:34, Landry Breuil wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > i've been having kvm VMs running 6.2 hardlocking/deadlocking since a
> > > > > > while, all those running on proxmox 5.1 using linux 4.13.8 & 
> > > > > > qemu-kvm
> > > > > > 2.9.1. There were hardlocks upon reboot which were 'solved' by 
> > > > > > disabling
> > > > > > x2apic emulation in kvm (args: -cpu=kvm64,-x2apic) or giving the 
> > > > > > host
> > > > > > cpu flags to the vm (args: -cpu host) but there still remains 
> > > > > > deadlocks
> > > > > > during normal operation.
> > > > > > 
> > > > > > I'm now running a kernel with MP_LOCKDEBUG, so i'm collecting 
> > > > > > traces in
> > > > > > the vain hope that it might help someone interested in locking 
> > > > > > issues.
> > > > > > Here's the latest one:
> > > > > 
> > > > > Let me add that when you had x2apic enabled the kernel 'froze' inside
> > > > > x2apic_readreg, trace below:
> > > > > 
> > > > >ddb{0}> tr
> > > > >x2apic_readreg(10) at x2apic_readreg+0xf
> > > > >lapic_delay(800022136900) at lapic_delay+0x5c
> > > > >rtcput(800022136960) at rtcput+0x65
> > > > >resettodr() at resettodr+0x1d6
> > > > >perform_resettodr(81769b29) at perform_resettodr+0x9
> > > > >taskq_thread(0) at taskq_thread+0x67
> > > > >end trace frame: 0x0, count: -6
> > > > > 
> > > > > What you're seeing with a MP_LOCKDEBUG kernel is just a symptom.  A 
> > > > > CPU
> > > > > enters DDB because another one is 'frozen' while holding the
> > > > > KERNEL_LOCK().  What's interesting is that in both case the frozen CPU
> > > > > is trying to execute apic related code:
> > > > >- x2apic_readreg
> > > > >- lapic_delay
> > > > > 
> > > > > I believe this issue should be reported to KVM developers as well.
> > > > 
> > > > *very* interestingly, i had a new lock, running bsd.sp.. So i think that
> > > > rules out openbsd mp.
> > > > 
> > > > ddb> tr
> > > > i82489_readreg(0) at i82489_readreg+0xd
> > > > lapic_delay(81a84090) at lapic_delay+0x5c
> > > > rtcget(81a84090) at rtcget+0x1a
> > > > resettodr() at resettodr+0x3a
> > > > perform_resettodr(81659e99) at perform_resettodr+0x9
> > > > taskq_thread(0) at taskq_thread+0x57
> > > > end trace frame: 0x0, count: -6
> > > 
> > > Try running with "-machine q35". It changes the emulated machine to
> > > a modern platform.
> > 
> > Right, i suppose that matches https://wiki.qemu.org/Features/Q35,
> > interesting. Will definitely try, mailed the kvm mailing list but got no
> > feedback so far.
> 
> I've been seeing this for a while too, on VMs that are already run with
> -machine q35 and -cpu host.  I was blaming my (still unfinished) pvclock
> code, but now I can fairly easily trigger it on single cpu VMs without that,
> mostly by running kernel compiles in a loop in a couple of different guests.
> I'm using a Fedora 25 (4.10.15-200.fc25.x86_64) kernel.
> 
> Adding some debug output to lapic_delay, it appears the KVM virtualized
> lapic counter hits zero and doesn't reset, so the lapic_delay loop in the
> guest never terminates.  KVM has several different ways it can provide the
> lapic counter and I'm not sure which one I'm using yet.
> 
> I just tried making lapic_delay give up after a million zero reads, and it
> seems to recover after a minute or so.  I'll leave it running to see if it
> happens again.

Hah, interesting. So i know i can try q35 just for the sake of emulating
a newer hw platform, but that wont fix this issue.

I see this thread/patchset upstream: https://lkml.org/lkml/2017/9/28/773
But i'm not sure that's the same issue.
https://patchwork.kernel.org/patch/9036161/ might be related too ?

When you say you modified lapic_delay, i guess that's on the guest side ?
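
For anyone following along, the "give up after a million zero reads"
idea could look roughly like the sketch below. This is not the actual
lapic_delay() change: bounded_delay(), read_counter_fn, fake_counter,
fake_read() and MAX_ZERO_READS are made-up names for illustration, and
the timer is modelled as a plain down-counter that reloads after
passing zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ZERO_READS 1000000	/* "a million zero reads" from the mail */

/* Hypothetical stand-in for reading the lapic current-count register. */
typedef uint32_t (*read_counter_fn)(void *ctx);

/*
 * Busy-wait until `ticks' timer ticks have elapsed on a down-counter
 * that restarts from `reload' after reaching zero.
 * Returns 0 on success, -1 if the counter read as zero
 * MAX_ZERO_READS times in a row (i.e. it looks stuck).
 */
int
bounded_delay(read_counter_fn read_counter, void *ctx, uint32_t reload,
    uint64_t ticks)
{
	uint32_t prev, cur, zero_reads = 0;
	uint64_t elapsed = 0;

	assert(read_counter != NULL);
	prev = read_counter(ctx);
	while (elapsed < ticks) {
		cur = read_counter(ctx);
		if (cur == 0) {
			if (++zero_reads >= MAX_ZERO_READS)
				return -1;	/* give up instead of spinning forever */
		} else
			zero_reads = 0;
		if (cur > prev)		/* counter wrapped and reloaded */
			elapsed += (uint64_t)prev + (reload - cur);
		else
			elapsed += prev - cur;
		prev = cur;
	}
	return 0;
}

/* Demo-only counter: counts down by `step', optionally sticks at zero. */
struct fake_counter {
	uint32_t value, reload, step;
	int stuck;
};

uint32_t
fake_read(void *ctx)
{
	struct fake_counter *c = ctx;

	if (c->value >= c->step)
		c->value -= c->step;
	else if (c->stuck)
		c->value = 0;		/* hit zero and never reloads */
	else
		c->value = c->reload - (c->step - c->value);
	return c->value;
}
```

With a fake_counter that has stuck set, bounded_delay() returns -1 once
the zero-read budget is used up; without the bound, the loop would never
make progress because prev and cur both stay at zero.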

Landry

Re: mp deadlock on 6.2 running on kvm

2017-12-09 Thread Landry Breuil

On Sat, Dec 09, 2017 at 04:33:28PM +0100, Juan Francisco Cantero Hurtado wrote:
> On Thu, Dec 07, 2017 at 02:27:29PM +0100, Landry Breuil wrote:
> > On Thu, Dec 07, 2017 at 11:52:46AM +0100, Martin Pieuchot wrote:
> > > On 07/12/17(Thu) 08:34, Landry Breuil wrote:
> > > > Hi,
> > > > 
> > > > i've been having kvm VMs running 6.2 hardlocking/deadlocking since a
> > > > while, all those running on proxmox 5.1 using linux 4.13.8 & qemu-kvm
> > > > 2.9.1. There were hardlocks upon reboot which were 'solved' by disabling
> > > > x2apic emulation in kvm (args: -cpu=kvm64,-x2apic) or giving the host
> > > > cpu flags to the vm (args: -cpu host) but there still remains deadlocks
> > > > during normal operation.
> > > > 
> > > > I'm now running a kernel with MP_LOCKDEBUG, so i'm collecting traces in
> > > > the vain hope that it might help someone interested in locking issues.
> > > > Here's the latest one:
> > > 
> > > Let me add that when you had x2apic enabled the kernel 'froze' inside
> > > x2apic_readreg, trace below:
> > > 
> > >   ddb{0}> tr
> > >   x2apic_readreg(10) at x2apic_readreg+0xf
> > >   lapic_delay(800022136900) at lapic_delay+0x5c
> > >   rtcput(800022136960) at rtcput+0x65
> > >   resettodr() at resettodr+0x1d6
> > >   perform_resettodr(81769b29) at perform_resettodr+0x9
> > >   taskq_thread(0) at taskq_thread+0x67
> > >   end trace frame: 0x0, count: -6
> > > 
> > > What you're seeing with a MP_LOCKDEBUG kernel is just a symptom.  A CPU
> > > enters DDB because another one is 'frozen' while holding the
> > > KERNEL_LOCK().  What's interesting is that in both case the frozen CPU
> > > is trying to execute apic related code:
> > >   - x2apic_readreg
> > >   - lapic_delay
> > > 
> > > I believe this issue should be reported to KVM developers as well.
> > 
> > *very* interestingly, i had a new lock, running bsd.sp.. So i think that
> > rules out openbsd mp.
> > 
> > ddb> tr
> > i82489_readreg(0) at i82489_readreg+0xd
> > lapic_delay(81a84090) at lapic_delay+0x5c
> > rtcget(81a84090) at rtcget+0x1a
> > resettodr() at resettodr+0x3a
> > perform_resettodr(81659e99) at perform_resettodr+0x9
> > taskq_thread(0) at taskq_thread+0x57
> > end trace frame: 0x0, count: -6
> 
> Try running with "-machine q35". It changes the emulated machine to
> a modern platform.

Right, i suppose that matches https://wiki.qemu.org/Features/Q35,
interesting. Will definitely try, mailed the kvm mailing list but got no
feedback so far.

Thanks!
Landry

Re: mp deadlock on 6.2 running on kvm

2017-12-07 Thread Landry Breuil

On Thu, Dec 07, 2017 at 11:52:46AM +0100, Martin Pieuchot wrote:
> On 07/12/17(Thu) 08:34, Landry Breuil wrote:
> > Hi,
> > 
> > i've been having kvm VMs running 6.2 hardlocking/deadlocking since a
> > while, all those running on proxmox 5.1 using linux 4.13.8 & qemu-kvm
> > 2.9.1. There were hardlocks upon reboot which were 'solved' by disabling
> > x2apic emulation in kvm (args: -cpu=kvm64,-x2apic) or giving the host
> > cpu flags to the vm (args: -cpu host) but there still remains deadlocks
> > during normal operation.
> > 
> > I'm now running a kernel with MP_LOCKDEBUG, so i'm collecting traces in
> > the vain hope that it might help someone interested in locking issues.
> > Here's the latest one:
> 
> Let me add that when you had x2apic enabled the kernel 'froze' inside
> x2apic_readreg, trace below:
> 
>   ddb{0}> tr
>   x2apic_readreg(10) at x2apic_readreg+0xf
>   lapic_delay(800022136900) at lapic_delay+0x5c
>   rtcput(800022136960) at rtcput+0x65
>   resettodr() at resettodr+0x1d6
>   perform_resettodr(81769b29) at perform_resettodr+0x9
>   taskq_thread(0) at taskq_thread+0x67
>   end trace frame: 0x0, count: -6
> 
> What you're seeing with a MP_LOCKDEBUG kernel is just a symptom.  A CPU
> enters DDB because another one is 'frozen' while holding the
> KERNEL_LOCK().  What's interesting is that in both case the frozen CPU
> is trying to execute apic related code:
>   - x2apic_readreg
>   - lapic_delay
> 
> I believe this issue should be reported to KVM developers as well.

*very* interestingly, i had a new lock, running bsd.sp.. So i think that
rules out openbsd mp.

ddb> tr
i82489_readreg(0) at i82489_readreg+0xd
lapic_delay(81a84090) at lapic_delay+0x5c
rtcget(81a84090) at rtcget+0x1a
resettodr() at resettodr+0x3a
perform_resettodr(81659e99) at perform_resettodr+0x9
taskq_thread(0) at taskq_thread+0x57
end trace frame: 0x0, count: -6

Landry

mp deadlock on 6.2 running on kvm

2017-12-06 Thread Landry Breuil

Hi,

i've been having kvm VMs running 6.2 hardlocking/deadlocking since a
while, all those running on proxmox 5.1 using linux 4.13.8 & qemu-kvm
2.9.1. There were hardlocks upon reboot which were 'solved' by disabling
x2apic emulation in kvm (args: -cpu=kvm64,-x2apic) or giving the host
cpu flags to the vm (args: -cpu host) but there still remains deadlocks
during normal operation.

I'm now running a kernel with MP_LOCKDEBUG, so i'm collecting traces in
the vain hope that it might help someone interested in locking issues.
Here's the latest one:

login: __mp_lock(0x81b23ed0): lock spun out
Stopped at  db_enter+0x5:   popq    %rbp
ddb{2}> trace
db_enter() at db_enter+0x5
___mp_lock(40) at ___mp_lock+0x66
syscall() at syscall+0x1ff
--- syscall (number 4) ---
end of kernel
end trace frame: 0xb3b20d05fe0, count: -3
0xb3ba081bbfa:
ddb{2}> mach ddbcpu 1
Stopped at  x86_ipi_db+0x5: popq    %rbp
ddb{1}> tr
x86_ipi_db(800022136530) at x86_ipi_db+0x5
x86_ipi_handler() at x86_ipi_handler+0x6b
Xresume_lapic_ipi() at Xresume_lapic_ipi+0x1f
--- interrupt ---
Bad frame pointer: 0xfe4fe85250cc
end trace frame: 0xfe4fe85250cc, count: -3
0x41cb8c419c524153:
ddb{1}> ps /o
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
 216909  39313726   0  0x4002  rrdcached
*366090  74010  0 0x14000  0x2001  systq
ddb{1}> tr /p 0t366090
lapic_delay(81a97c88) at lapic_delay+0x5c
rtcget(81a97c88) at rtcget+0x1a
resettodr() at resettodr+0x3a
perform_resettodr(8150a7d9) at perform_resettodr+0x9
taskq_thread(0) at taskq_thread+0x67
end trace frame: 0x0, count: -5
ddb{1}> tr /p 0t216909
uvm_fault(0x81b526f8, 0x0, 0, 1) -> e
kernel: double fault trap, code=0
Faulted in DDB; continuing...

if i continue at that point or try boot sync, ddb is kaput and i can only stop
the vm. Of course an option would be to run bsd.sp, but that's a bit .. sad.

dmesg attached.

Landry
OpenBSD 6.2-stable (GENERIC.MP) #0: Fri Dec  1 11:27:09 CET 2017
landry@s64.proxmox2:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 17162944512 (16367MB)
avail mem = 16635760640 (15865MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf68d0 (10 entries)
bios0: vendor SeaBIOS version "rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org" date 04/01/2014
bios0: QEMU Standard PC (i440FX + PIIX, 1996)
acpi0 at bios0: rev 0
acpi0: sleep states S3 S4 S5
acpi0: tables DSDT FACP APIC HPET
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, 286.92 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,ARAT
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 999MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, 472.57 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,ARAT
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, 465.40 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,HTT,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,ARAT
cpu2: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu2: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu2: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, 467.31 MHz
cpu3:

pf.conf/pf faq inconsistency about log (user) ?

2017-11-29 Thread Landry Breuil

Hi,

while testing some bits of logging with pf, i came on this
doc inconsistency:

pf.conf(5) says:

The keyword user logs the UID and PID of the socket on the local host
used to send or receive a packet, in addition to the normal information. 

while the faq (https://www.openbsd.org/faq/pf/logging.html) says:

user
Causes the user id and group id that owns the socket that the packet
is sourced from/destined to (whichever socket is local) to be logged
along with the standard log information. 

Afaict, after digging a bit to figure out how to see the logged info
(you need tcpdump -v), i figured out it was the process id owning the socket
that was logged, so i think the faq is wrong.

18:25:21.580108 rule 1/(match) [uid 0, pid 33213] pass out on em0: [uid
1000, pid 23403]

interestingly i couldnt figure out what the info '[uid 0, pid 33213]'
was referring to since on the local system there's no such pid (in that
case the logging was triggered by ssh'ing outside, pid 23403 being the local
pid for the ssh process) - its the same info for all logged pkts whatever the
process triggering the connection, but that doesnt seem to be a tcpdump
subprocess id...

Landry

Index: logging.html
===
RCS file: /cvs/www/faq/pf/logging.html,v
retrieving revision 1.69
diff -u -r1.69 logging.html
--- logging.html10 Oct 2017 19:17:08 -  1.69
+++ logging.html29 Nov 2017 17:27:55 -
@@ -99,7 +99,7 @@
 The default log interface pflog0 is created automatically.
 
 user
-Causes the user id and group id that owns the socket that the packet is
+Causes the user id and process id that owns the socket that the packet is
 sourced from/destined to (whichever socket is local) to be logged along
 with the standard log information.

Re: i386: memory pressure problem when building rust 1.22.0beta3

2017-11-28 Thread Landry Breuil

On Thu, Nov 16, 2017 at 02:37:40PM +0100, Sebastien Marie wrote:
> Hi,
> 
> I am working on new lang/rust version (next stable in 1 week), and I
> have problem building it under i386 (full dmesg below).
> 
> I suspect memory pressure in some way, but I dunno options I have to
> workaround (if possible).
> 
> It is possible that occasionnal failures sthen@ saw in bulk with current
> rustc version (1.21) in ports to be related. The version 1.22 hits it
> almost at every build try.

Fwiw, this issue is soon going to be a showstopper for having firefox
on i386, if some ppl still care about it now is a good time to dig into
that. Oh, and blaming it on the mozilla/rust devs wont be very helpful i
fear :)

Landry

Re: pkg_add: fatal error when updating packages

2017-11-04 Thread Landry Breuil

On Sat, Nov 04, 2017 at 05:47:46PM -0400, Michael Reed wrote:
> $ sysctl -n kern.version
> OpenBSD 6.2-current (GENERIC.MP) #193: Wed Nov  1 12:24:15 MDT 2017
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> 
> Recently I ran pkg_add -u and it bailed out with an ultra-long error
> message.

Welcome to the large club of people who had this error. Take a seat,
have a cocktail, you're here for a while as we sail on the
faiilbooaaatt..

Re: Libreoffice won't open hyperlinks

2017-09-04 Thread Landry Breuil

On Mon, Sep 04, 2017 at 12:10:21AM +0200, Jonathan Drews wrote:
> SENDBUG: -*- sendbug -*-
> SENDBUG: Lines starting with `SENDBUG' will be removed automatically.
> SENDBUG:
> SENDBUG: Choose from the following categories:
> SENDBUG:
> SENDBUG: system user library documentation kernel alpha amd64 arm hppa i386 
> m88k mips64 powerpc sh sparc sparc64 vax
> SENDBUG:
> SENDBUG:
> To: bugs@openbsd.org
> Subject: Libreoffice can't open it's hyperlinks
> From: jdr...@gmx.com
> Cc: cleetus
> Reply-To: jdr...@gmx.com
> 
> >Synopsis:  LibreOffice cannot find xdg-open
> >Category:  Low
> >Environment:
> System  : OpenBSD 6.1
> Details : OpenBSD 6.1 (GENERIC.MP) #21: Wed Aug 30 20:33:45 MDT 
> 2017
>  
> clee...@jackcat.cats.com:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> libreoffice won't open it's hyperlinks when you do Ctrl-Left Click.
> >How-To-Repeat:
> Open a document that has a URL for a web site. Do CTRL-Left Click
> >Fix:
> Solution: The software that opens applications is xdg-open and is in 
> /usr/local/bin/
> on OpenBSD. However libreoffice looks for xdg-open in /usr/bin.

This was fixed 2 months ago and will be in 6.2:

http://cvsweb.openbsd.org/cgi-bin/cvsweb/ports/editors/libreoffice/Makefile.diff?r1=1.160&r2=1.161

Re: libtls.so is missing from base61.tgz set

2017-07-08 Thread Landry Breuil

On Sat, Jul 08, 2017 at 08:38:37AM -0400, RD Thrush wrote:
> >Synopsis:libtls.so is missing from base61.tgz set
> >Category:system
> >Environment:
>   System  : OpenBSD 6.1
>   Details : OpenBSD 6.1-current (GENERIC.MP) #93: Thu Jul  6 15:41:21 
> MDT 2017
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
>   After upgrading -current, I noticed the following warning
>   about libtls:
> 
> Path to firmware: http://firmware.openbsd.org/firmware/snapshots/
> Updating: vmm-firmware-1.10.2p3
> http://firmware.openbsd.org/firmware/snapshots/: warning: libtls.so.15.6: 
> minor version >= 7 expected, using it anyway
> http://firmware.openbsd.org/firmware/snapshots/vmm-firmware-1.10.2p3.tgz: 
> warning: libtls.so.15.6: minor version >= 7 expected, using it anyway
> 
>   libtls.so.15.7 is in /usr/src/distrib/sets/lists/base/mi;
>   however, it is missing from the base61.tgz set which likely contributes
>   to the above message, ie.:
> 1>(cd /usr/src/distrib/sets/lists && grep -r libtls.so *)
> base/mi:./usr/lib/libtls.so.15.7
> 2>tar -tvzf base61.tgz | grep libtls.so | wc
>0   0   0
> 
> >How-To-Repeat:
>   Upgrade -current and notice the first reboot message.
> >Fix:
>   Dunno

Known, fixed by
https://github.com/openbsd/src/commit/5de0d37d6f006dea60577d9428060862d8e94405,
wait for next snaps, etc..

Re: panic when mounting 8To ntfs partition

2017-03-14 Thread Landry Breuil

On Tue, Mar 14, 2017 at 11:58:23AM +0100, Jeremie Courreges-Anglas wrote:
> Jeremie Courreges-Anglas <j...@wxcvbn.org> writes:
> 
> > Landry Breuil <lan...@openbsd.org> writes:
> >
> >> Hi,
> >>
> >> i know we don't really have an ntfs maintainer, and that nobody should
> >> use ntfs, but interoperability...
> >>
> >> i have a 8Tb 'seagate backup plus hub' appearing as:
> >>
> >> uhub8 at uhub1 port 5 configuration 1 interface 0 "Seagate Backup+ Hub" 
> >> rev 2.10/48.85 addr 2
> >> umass0 at uhub8 port 1 configuration 1 interface 0 "Seagate Backup+ Hub 
> >> BK" rev 2.10/1.00 addr 3
> >> umass0: using SCSI over Bulk-Only
> >> scsibus4 at umass0: 2 targets, initiator 0
> >> sd2 at scsibus4 targ 1 lun 0: <Seagate, Backup+ Hub BK, D781> SCSI4 
> >> 0/direct fixed
> >>
> >> which just has a huge ntfs partition:
> >>
> >> # /dev/rsd2c:
> >> type: SCSI
> >> disk: SCSI disk
> >> label: Backup+ Hub BK  
> >> duid: 
> >> flags:
> >> bytes/sector: 512
> >> sectors/track: 63
> >> tracks/cylinder: 255
> >> sectors/cylinder: 16065
> >> cylinders: 972801
> >> total sectors: 15628053167
> >> boundstart: 0
> >> boundend: 15628053167
> >> drivedata: 0 
> >>
> >> 16 partitions:
> >> #                size           offset  fstype [fsize bsize   cpg]
> >>   c:      15628053167                0  unused
> >>   i:           262144               34 unknown                    # /mnt/sd2
> >>   j:      15627788288           264192   MSDOS
> >>
> >> (with -pG)
> >> total sectors: 15628053167 # total bytes: 7452.0G
> >>   c:          7452.0G                0  unused
> >>   i:             0.1G               34 unknown                    # /mnt/sd2
> >>   j:          7451.9G           264192   MSDOS
> >>
> >> Trying to mount it (dev/sd2j, of course) immediately panics the kernel
> >> (hand-written):
> >>
> >> panic: out of space in kmem_map
> >> panic
> >> malloc
> >> ntfs_calccfree
> >> ntfs_mountfs
> >> ntfs_mount
> >> sys_mount
> >> syscall
> >>
> >> I suppose pointing at
> >> https://github.com/openbsd/src/blob/master/sys/ntfs/ntfs_vfsops.c#L567
> >>
> >> So, what can be done to 1) avoid the panic and 2) eventually find a way to
> >> support those partitions sizes ?
> >
> > I guess that we should not allocate memory for the whole bitmap file,
> > but instead split it in chunks.
> >
> > I have only tested with dummy ntfs volumes on vnd devices, but this
> > fixes landry's panic and allows him to browse his NTFS filesystem.
> >
> > Concerns:
> > - what would be a better chunk size than 1MB?
> > - uint64_t vs off_t: add more tests
> >
> > Input welcome.
> 
> No opinion on that one?

I thought i had replied to the list but it seems i only replied to
jeremie.. with this diff, i've been able to mount and browse the ntfs
partition on the 8tb disk without issue. Mounting is slow, but i guess
that's expected.

$ doas time mount -t ntfs /dev/sd2j /mnt/sd2
       50.33 real         0.00 user         4.29 sys
$ df -h /mnt/sd2
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd2j      7.3T    5.9T    1.4T    81%    /mnt/sd2

So even if that's not an ok (since i have no authority in this area),
that's a yay - at least it fixes the panic and restores the
basic functionality - thanks!
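
For the archives, the chunking idea can be sketched as below. This is
only an illustration and not jca's diff: count_free_clusters(),
bitmap_read_fn, mem_bitmap and mem_read() are made-up names, the reader
callback stands in for however the bitmap file is actually read, and the
1MB CHUNK_SIZE is simply the value mentioned in the concerns above. A
clear bit is counted as a free cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE (1024 * 1024)	/* 1MB, as discussed above */

/* Hypothetical reader: copy `len' bytes at offset `off' into `buf'. */
typedef int (*bitmap_read_fn)(void *ctx, uint64_t off, size_t len,
    uint8_t *buf);

/*
 * Count clear bits in a bitmap of `bmsize' bytes, reading it through
 * one fixed-size buffer instead of allocating the whole bitmap.
 * Returns 0 and stores the count in *freep, or non-zero on error.
 */
int
count_free_clusters(bitmap_read_fn rd, void *ctx, uint64_t bmsize,
    uint64_t *freep)
{
	uint8_t *buf;
	uint64_t off, nfree = 0;
	size_t len = 0, i;
	int bit, error = 0;

	assert(rd != NULL && freep != NULL);
	buf = malloc(CHUNK_SIZE);
	if (buf == NULL)
		return -1;
	for (off = 0; off < bmsize; off += len) {
		len = (bmsize - off < CHUNK_SIZE) ?
		    (size_t)(bmsize - off) : CHUNK_SIZE;
		error = rd(ctx, off, len, buf);
		if (error)
			break;
		for (i = 0; i < len; i++)
			for (bit = 0; bit < 8; bit++)
				if ((buf[i] & (1 << bit)) == 0)
					nfree++;
	}
	free(buf);
	if (error == 0)
		*freep = nfree;
	return error;
}

/* Demo-only reader backed by a memory buffer. */
struct mem_bitmap {
	const uint8_t *data;
	uint64_t size;
};

int
mem_read(void *ctx, uint64_t off, size_t len, uint8_t *buf)
{
	struct mem_bitmap *m = ctx;

	if (off > m->size || len > m->size - off)
		return -1;
	memcpy(buf, m->data + off, len);
	return 0;
}
```

The memory use stays at CHUNK_SIZE whatever the volume size, at the cost
of one read per chunk, which would fit with mounting being slow.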

Landry
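
For reference, here is a rough userland sketch of the chunking idea discussed
above. It is not the diff from this thread: bitmap_read_fn, mem_bitmap,
mem_read and the 4096-byte cluster size used in the size estimate are
assumptions made for illustration only. With 15627788288 sectors of 512
bytes and 4096-byte clusters, the cluster bitmap would be about 233MB, which
may be why a single allocation for the whole bitmap runs out of kmem_map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader: copy len bytes of the bitmap at offset off into buf. */
typedef int (*bitmap_read_fn)(void *ctx, uint64_t off, void *buf, size_t len);

/* Bitmap size in bytes: one bit per cluster, rounded up to a whole byte. */
static uint64_t
bitmap_bytes(uint64_t sectors, uint32_t sector_size, uint32_t cluster_size)
{
	uint64_t clusters = sectors / (cluster_size / sector_size);

	return (clusters + 7) / 8;
}

/*
 * Count clear bits (free clusters) while holding at most chunk_size bytes
 * of the bitmap in memory. Padding bits in the last byte are not treated
 * specially in this sketch.
 */
static int
count_free_clusters(bitmap_read_fn rd, void *ctx, uint64_t bmsize,
    size_t chunk_size, uint64_t *freep)
{
	uint8_t *buf;
	uint64_t off, nfree = 0;
	size_t len, i;
	int bit, error = 0;

	assert(chunk_size > 0);
	if ((buf = malloc(chunk_size)) == NULL)
		return -1;
	for (off = 0; off < bmsize; off += len) {
		len = (bmsize - off < chunk_size) ?
		    (size_t)(bmsize - off) : chunk_size;
		if ((error = rd(ctx, off, buf, len)) != 0)
			break;
		for (i = 0; i < len; i++)
			for (bit = 0; bit < 8; bit++)
				if ((buf[i] & (1 << bit)) == 0)
					nfree++;
	}
	free(buf);
	if (error == 0)
		*freep = nfree;
	return error;
}

/* Memory-backed reader, only here so the sketch can be exercised. */
struct mem_bitmap {
	const uint8_t *data;
	uint64_t size;
};

static int
mem_read(void *ctx, uint64_t off, void *buf, size_t len)
{
	const struct mem_bitmap *m = ctx;

	if (off > m->size || len > m->size - off)
		return -1;
	memcpy(buf, m->data + off, len);
	return 0;
}
```

The chunk size only bounds memory use. Whether 1MB or something else gives
the best mount time would still have to be measured on real hardware.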

Re: panic when mounting 8To ntfs partition

2017-03-01 Thread Landry Breuil

On Wed, Mar 01, 2017 at 10:28:41AM +0100, Landry Breuil wrote:
> Hi,
> 
> i know we don't really have an ntfs maintainer, and that nobody should
> use ntfs, but interoperability...
> 
> i have a 8Tb 'seagate backup plus hub' appearing as:
> 
> uhub8 at uhub1 port 5 configuration 1 interface 0 "Seagate Backup+ Hub" rev 
> 2.10/48.85 addr 2
> umass0 at uhub8 port 1 configuration 1 interface 0 "Seagate Backup+ Hub BK" 
> rev 2.10/1.00 addr 3
> umass0: using SCSI over Bulk-Only
> scsibus4 at umass0: 2 targets, initiator 0
> sd2 at scsibus4 targ 1 lun 0: <Seagate, Backup+ Hub BK, D781> SCSI4 0/direct 
> fixed
> 
> which just has a huge ntfs partition:
> 
> # /dev/rsd2c:
> type: SCSI
> disk: SCSI disk
> label: Backup+ Hub BK  
> duid: 
> flags:
> bytes/sector: 512
> sectors/track: 63
> tracks/cylinder: 255
> sectors/cylinder: 16065
> cylinders: 972801
> total sectors: 15628053167
> boundstart: 0
> boundend: 15628053167
> drivedata: 0 
> 
> 16 partitions:
> #                size           offset  fstype [fsize bsize   cpg]
>   c:      15628053167                0  unused
>   i:           262144               34 unknown                    # /mnt/sd2
>   j:      15627788288           264192   MSDOS
> 
> (with -pG)
> total sectors: 15628053167 # total bytes: 7452.0G
>   c:          7452.0G                0  unused
>   i:             0.1G               34 unknown                    # /mnt/sd2
>   j:          7451.9G           264192   MSDOS
> 
> Trying to mount it (dev/sd2j, of course) immediately panics the kernel
> (hand-written):
> 
> panic: out of space in kmem_map
> panic
> malloc
> ntfs_calccfree
> ntfs_mountfs
> ntfs_mount
> sys_mount
> syscall

And since it's reproducible, here's another one with offsets this time:

panic: malloc: out of space in kmem_map
panic()
malloc+0x4b1
ntfs_calccfree+0x3b
ntfs_mountfs+0x3f9
ntfs_mount+0x21c
sys_mount+0x271
syscall+0x27b
syscall number 21
end trace frame 0x7f7c3901 count:7
0x1a4e3d60109a

That's with a self-built kernel from
OpenBSD 6.0-current (GENERIC.MP) #4: Mon Feb 13 08:58:42 CET 2017

panic when mounting 8To ntfs partition

2017-03-01 Thread Landry Breuil

Hi,

i know we don't really have an ntfs maintainer, and that nobody should
use ntfs, but interoperability...

i have a 8Tb 'seagate backup plus hub' appearing as:

uhub8 at uhub1 port 5 configuration 1 interface 0 "Seagate Backup+ Hub" rev 
2.10/48.85 addr 2
umass0 at uhub8 port 1 configuration 1 interface 0 "Seagate Backup+ Hub BK" rev 
2.10/1.00 addr 3
umass0: using SCSI over Bulk-Only
scsibus4 at umass0: 2 targets, initiator 0
sd2 at scsibus4 targ 1 lun 0: <Seagate, Backup+ Hub BK, D781> SCSI4 0/direct 
fixed

which just has a huge ntfs partition:

# /dev/rsd2c:
type: SCSI
disk: SCSI disk
label: Backup+ Hub BK  
duid: 
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 972801
total sectors: 15628053167
boundstart: 0
boundend: 15628053167
drivedata: 0 

16 partitions:
#                size           offset  fstype [fsize bsize   cpg]
  c:      15628053167                0  unused
  i:           262144               34 unknown                    # /mnt/sd2
  j:      15627788288           264192   MSDOS

(with -pG)
total sectors: 15628053167 # total bytes: 7452.0G
  c:          7452.0G                0  unused
  i:             0.1G               34 unknown                    # /mnt/sd2
  j:          7451.9G           264192   MSDOS

Trying to mount it (dev/sd2j, of course) immediately panics the kernel
(hand-written):

panic: out of space in kmem_map
panic
malloc
ntfs_calccfree
ntfs_mountfs
ntfs_mount
sys_mount
syscall

I suppose pointing at
https://github.com/openbsd/src/blob/master/sys/ntfs/ntfs_vfsops.c#L567

So, what can be done to 1) avoid the panic and 2) eventually find a way to
support those partitions sizes ?

Landry

Re: coredump of firefox(76129), write failed: errno 14

2017-02-19 Thread Landry Breuil

On Sun, Jan 29, 2017 at 12:41:20PM +0100, Sebastien Marie wrote:
> Hi,
> 
> I report this error as it is the first time I saw it when a coredump is
> generated.
> 
> It occurs reproducibly with my current profile (but not with a fresh
> profile) when firefox crashes (currently when I visit
> https://locka99.gitbooks.io/a-guide-to-porting-c-to-rust with JS
> activated).
> 
> Please note that my point isn't the firefox crash, but the kernel error
> message.
> 
> (launch firefox and visit url above)
> $ firefox
> 
> (firefox:76129): Gdk-ERROR **: The program 'firefox' received an X Window 
> System error.
> This probably reflects a bug in the program.
> The error was 'RenderBadPicture (invalid Picture parameter)'.
>   (Details: serial 59593 error_code 143 request_code 139 (RENDER) minor_code 
> 7)
>   (Note to programmers: normally, X errors are reported asynchronously;
>that is, you will receive the error a while after causing it.
>To debug your program, run it with the GDK_SYNCHRONIZE environment
>variable to change this behavior. You can then get a meaningful
>backtrace from your debugger if you break on the gdk_x_error() function.)
> [Child 4704] ###!!! ABORT: Aborting on channel error.: file 
> /usr/obj/ports/firefox-51.0/firefox-51.0/ipc/glue/MessageChannel.cpp, line 
> 2056
> [Child 4704] ###!!! ABORT: Aborting on channel error.: file 
> /usr/obj/ports/firefox-51.0/firefox-51.0/ipc/glue/MessageChannel.cpp, line 
> 2056

As for this particular crash, looking for RenderBadPicture in bugzilla
yields https://bugzilla.mozilla.org/show_bug.cgi?id=1335827 - testing a
fix.

Re: openbsd 5.9 firefox and printer problems

2016-06-26 Thread Landry Breuil

On Thu, Jun 23, 2016 at 08:50:09PM +, sare...@att.net wrote:
> I installed OpenBSD 5.9 from install59iso and the xfce desktop. Firefox is 
> unusable with such slow web page loading, it just keeps loading and loading. 
> No plug-ins or extensions are installed. My HP PSC 1500 printer will not 
> setup, no ppd file found. Running hp-check says cups not installed or not 
> enabled. I used pkg_add to install hplip which installs cups and other 
> depends. Also what do you put in the .xinitrc file to get shutdown and reboot 
> in the menu?

Check /usr/local/share/docs/pkg-readmes/xfce* for shutdown/reboot. Same
thing for cups i'd say.

Landry

Re: -Wl,--gc-sections broken on powerpc

2016-03-19 Thread Landry Breuil

On Thu, Mar 17, 2016 at 10:46:04PM +0100, Jeremie Courreges-Anglas wrote:
> Landry Breuil <lan...@rhaalovely.net> writes:
> 
> > Hi,
> >
> > mpi@ already fixed some ports (audio/mpd at least) by removing
> > -Wl,--gc-sections from the linking flags, but 'lots' of other ports
> > using this construct fail on powerpc. Maybe we should make it a noop on
> > this arch ?
> 
> Unless someone comes up with a proper fix, here's a diff to
> disable --gc-sections. You should get the following warning:

Well, given the insane amount of replies my mail got, i doubt anyone's
coming with a proper fix... Mark, Martin, since you're the macppc port
maintainers, any opinion ?

Landry

> Index: bfd/elf32-ppc.c
> ===
> RCS file: /cvs/src/gnu/usr.bin/binutils-2.17/bfd/elf32-ppc.c,v
> retrieving revision 1.3
> diff -u -p -r1.3 elf32-ppc.c
> --- bfd/elf32-ppc.c   3 Aug 2015 18:03:04 -   1.3
> +++ bfd/elf32-ppc.c   17 Mar 2016 21:44:41 -
> @@ -7447,7 +7447,7 @@ ppc_elf_finish_dynamic_sections (bfd *ou
>  #endif
>  
>  #define elf_backend_plt_not_loaded   1
> -#define elf_backend_can_gc_sections  1
> +#define elf_backend_can_gc_sections  0
>  #define elf_backend_can_refcount 1
>  #define elf_backend_rela_normal  1
>  
> 
> 
> -- 
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
>

-Wl,--gc-sections broken on powerpc

2016-03-06 Thread Landry Breuil

Hi,

mpi@ already fixed some ports (audio/mpd at least) by removing
-Wl,--gc-sections from the linking flags, but 'lots' of other ports
using this construct fail on powerpc. Maybe we should make it a noop on
this arch ?

To reproduce, try building net/ntp on macppc, ld should reliably
segfault with:

libtool: link: cc -o sntp -pthread -ffunction-sections -fdata-sections -Wall 
-Wcast-align -Wcast-qual -Wmissing-prototypes -Wpointer-arith -Wshadow 
-Winit-self -Wstrict-overflow -Wno-strict-prototypes -O2 -pipe 
-Wl,--gc-sections sntp.o version.o libsntp.a 
/usr/obj/ports/ntp-4.2.8pl6/ntp-4.2.8p6/sntp/libopts/.libs/libopts.a 
../libntp/libntp.a -L.libs -levent_pthreads -levent_core  -lm -lssl -lcrypto 
-Wl,-rpath-link,/usr/local/lib
collect2: ld terminated with signal 11 [Segmentation fault], core dumped

Other affected ports (probably more..):
2016-01-22/audio/ncmpc.log
2016-01-22/devel/llvm.log
2016-01-22/net/dnscrypt-proxy,-main.log
2016-01-22/net/ntp.log
2016-01-22/x11/fltk.log

(all under http://build-failures.rhaalovely.net/powerpc)

Landry

ntpd crashes at startup with constraints when an ipv6 is configured

2015-12-16 Thread Landry Breuil

Hi,

i have an ipv6 configured on my external if, and with this config:

listen on 10.246.200.1
servers pool.ntp.org
sensor *
constraints from "https://www.google.com"

i get an ipv6 resolution for www.google.com
$host www.google.com
www.google.com has address 216.58.208.228
www.google.com has IPv6 address 2a00:1450:4007:80e::2004

but no ping for it (dont ask me why):
$ping6 www.google.com
PING6 www.google.com (2a00:1450:4007:80e::2004): 24 data bytes
^C--- www.google.com ping6 statistics ---
7 packets transmitted, 0 packets received, 100.0% packet loss

and ntpd crashes at startup (-current, macppc)

ntpd[1932]: listening on 10.246.200.1 
ntpd[1932]: ntp engine ready
ntpd[1932]: constraint reply from 216.58.208.228: offset -0.413616
ntpd[16978]: fatal: constraint 2a00:1450:4007:80e::2004, signal 15
ntpd[1932]: ntp_dispatch_imsg in ntp engine: pipe closed
ntpd[1932]: ntp_dispatch_imsg_dns in ntp engine: pipe closed
ntpd[1932]: ntp engine exiting

Of course if i disable the constraints, it starts fine.

ntpd[4086]: listening on 10.246.200.1 
ntpd[4086]: ntp engine ready
ntpd[4086]: peer 78.192.65.63 now valid
ntpd[4086]: peer 178.33.227.201 now valid
ntpd[4086]: peer 176.31.127.215 now valid
ntpd[4086]: peer 212.83.131.33 now valid

I dunno what ntpd should do in that case, but it shouldnt crash..

Landry

Re: Error building ffmpeg in ports tree

2015-12-06 Thread Landry Breuil

On Mon, Dec 07, 2015 at 02:59:43AM +, walk...@ssimicro.com wrote:
> Hello,
> 
> While trying to build gnome (and xfce4) from the ports tree,
> 5.8-release, AMD64, the build process died while trying to retrieve
> source files for ffmpeg.  The files did not appear listed at the
> location.  In order to resolve this I downloaded the package file,
> installed that, and then resumed the build process (this worked fine
> once that was installed.)  In adding the package I noted that the
> file name/versions are different between the port and the package ..
> seems like the port is looking for an older version.

You're probably mixing -release and -current, but since you provide no
logs nor details we can't really tell

Landry

Re: libjavascriptcoregtk-1.0 crashes midori

2015-09-12 Thread Landry Breuil

On Sat, Sep 12, 2015 at 05:52:45PM -0500, Herminio Hernandez Jr. wrote:
> Midori crashes upon startup on my iBook G4 running OpenBSD/macppc current. I 
> ran Midori in gdb and saw that libjavascriptcoregtk-1.0 was causing the crash.
> 
> I am attaching my gdb log file.

webkit has been unusable on powerpc for some years. Dig into the upstream
bugzilla for some pointers.

Landry

Re: Weekly network disconnect with G4 Mac Mini (gem0)

2015-09-08 Thread Landry Breuil

On Tue, Sep 08, 2015 at 04:44:57PM +0100, Stuart Henderson wrote:
> On 2015/09/08 17:28, Carlos Fenollosa wrote:
> > 
> > > On 07 Sep 2015, at 20:40, Stuart Henderson <st...@openbsd.org> wrote:
> > > 
> > > On 2015/09/07 20:26, Landry Breuil wrote:
> > >> I cant help you on the issue itself, but i can confirm you that i've
> > >> been seeing the exact same issue with gem0 on my g4 mac mini here, and
> > >> since some releases. randomly, gem0 just doesnt receive/send pkts
> > >> anymore and needs to be downed/upped.
> > > 
> > > Interesting - I don't see that on mine.
> > > 
> > > Out of interest does your switch have flow control enabled? (you will
> > > see rxpause and/or txpause in the ifconfig output). If it does, is there
> > > any change if you disable it on the switch (if you can do so)?
> > > 
> > > gem0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 
> > > 1500
> > >lladdr 00:0d:93:63:da:5a
> > >priority: 0
> > >groups: egress
> > >media: Ethernet autoselect (100baseTX full-duplex)
> > >status: active
> > 
> > Yes, it seems to be the case:
> > 
> > gem0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > lladdr 00:11:24:87:a7:64
> > priority: 0
> > groups: egress
> > media: Ethernet autoselect (100baseTX full-duplex,rxpause,txpause)
> > status: active
> > inet 192.168.1.199 netmask 0xff00 broadcast 192.168.1.255
> > 
> > 
> > I have a crappy telco router, I’m actually not sure if I can disable it 
> > there. There is a section on QoS, but the option is disabled.
> > Could the driver be forced to disable flow control? At least I could try 
> > running it for a couple weeks to see if the bug is triggered again.
> > 
> > 
> > Thanks a lot,
> > Carlos
> > 
> 
> Flow control was a complete guess btw and might be unconnected.
> This diff ought to disable it but my mac is 1500km away at the moment
> so untested!
> 
> Landry, does yours show rxpause/txpause on this line?

nope, as i said my switch is more than basic..

gem0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:14:51:1f:5c:f4
priority: 0
media: Ethernet autoselect (100baseTX full-duplex)
status: active
inet 10.246.200.1 netmask 0xff00 broadcast 10.255.255.255
inet6 fe80::214:51ff:fe1f:5cf4%gem0 prefixlen 64 scopeid 0x2
inet6 2a01:e34:edcb:85c0::1 prefixlen 120

Landry

Re: Weekly network disconnect with G4 Mac Mini (gem0)

2015-09-07 Thread Landry Breuil

On Mon, Sep 07, 2015 at 08:03:51PM +0200, carles.fenoll...@gmail.com wrote:
> >Synopsis:Weekly network disconnect with G4 Mac Mini (gem0)
> >Category:powerpc
> >Environment:
>   System  : OpenBSD 5.7
>   Details : OpenBSD 5.7-stable (GENERIC) #2: Wed Aug 12 23:45:47 CEST 
> 2015
>root@mini:/usr/src/sys/arch/macppc/compile/GENERIC
> 
>   Architecture: OpenBSD.macppc
>   Machine : macppc
> >Description:
> 
>   Hello,
> 
>   I'm experiencing a very strange bug with a headless G4 Mac Mini with 
> the gem0 network driver. The network disconnects by itself and the machine 
> loses all internet connectivity. It doesn't respond to pings/ssh even inside 
> the local network. The rest of the machines in my network seem unaffected so 
> it's not an issue regarding my router.
> 
> >How-To-Repeat:
> 
> I've narrowed it down to the following conditions:
> 
> - It usually happens about a week of regular usage. My G4 has a fairly 
> consistent usage pattern so it makes sense that the bug also appears with a 
> pattern.
> Here are some sample dates where the bug was triggered:
>   - Restart on 12/Aug 04:15, happens again on 19/Aug 15:15
>   - Restart on 22/Aug 23:10, happens again on 31/Aug 12:46
>   - Restart on 31/Aug 15:10, happens again on 5/Sep 16:11
> 
> - It once happened after just a couple hours heavily downloading data 
> (BitTorrent, so it can either be a number of connections issue or an absolute 
> tx/rx amount issue)
> 
> - It can be fixed with with "ifconfig gem0 down && ifconfig gem0 up", but not 
> unplugging and replugging the cable. A system restart also solves the issue. 
> 
> 
> There are no error logs. The closest I can get to an error log is the fact 
> that afpd times out, and I used this timestamp to establish the exact time of 
> the issue. 
> 
> I also run an internet-dependent cron job which starts to fail consistently 
> with the afpd error message, so I'm confident that the bug trigger time is 
> correct.
> 
> Here is what I can see on /var/log/messages for the time when the bug is 
> triggered:
> 
> Aug 22 23:09:57 mini afpd[8461]: afp_alarm: child timed out, entering 
> disconnected state
> Aug 22 23:09:57 mini afpd[8461]: dsi_disconnect: entering disconnected state
> Aug 22 23:09:57 mini afpd[8461]: dsi_disconnect: entering disconnected state
> 
> Another one:
> 
> Aug 31 12:46:19 mini afpd[24528]: afp_alarm: child timed out, entering 
> disconnected state
> Aug 31 12:46:19 mini afpd[24528]: dsi_disconnect: entering disconnected state
> Aug 31 12:46:19 mini afpd[24528]: dsi_wrtreply: Bad file descriptor
> Aug 31 12:46:19 mini afpd[24528]: dsi_disconnect: entering disconnected state
> 
> This one is from yesterday:
> 
> Sep  5 16:10:50 mini ntpd[6258]: 2 out of 4 peers valid
> Sep  5 16:10:50 mini ntpd[6258]: bad peer from pool pool.ntp.org 
> (46.17.142.10)
> Sep  5 16:10:50 mini ntpd[6258]: bad peer from pool pool.ntp.org 
> (194.140.131.21)
> 
> 
> I then try to grep on /var/log for timestamps which are close to that date, 
> but there are no other error messages.
> 
> The machine is running headless so I can't see if there are any error 
> messages on screen.
> 
> >Fix:
> 
> ifconfig gem0 down && ifconfig gem0 up
> 
> As to a permanent fix, here are some hyphotheses:
> 
> - It is clearly a network issue, since it's solved by an ifconfig down+up
> - It is probably something driver-related, since I googled and looked at the 
> mailing lists, and there is nobody experiencing the same issue. I guess there 
> are few people using OpenBSD on a G4 with the gem0 driver, so this may be an 
> untested corner case of the driver. If it were a system-wide issue, somebody 
> else would probably have noticed it.
> - This may be a data overflow. It can be either in a counter of absolute 
> tx/rx data, or number of connections. The weird weekly periodicity has 
> probably something to do with it. Or maybe connections aren't properly 
> cleaned up and eventually they fill up some buffer? This is my best guess
> - It does not seem to affect the kernel/other processes since there are no 
> dmesg messages and the system doesn't require a restart.
> 
> 
> Can anybody give me more pointers to further narrow down the issue?

I can't help you with the issue itself, but i can confirm that i've
been seeing the exact same issue with gem0 on my g4 mac mini here, for
several releases now. randomly, gem0 just doesn't receive/send pkts
anymore and needs to be downed/upped.

Landry

Re: Weekly network disconnect with G4 Mac Mini (gem0)

2015-09-07 Thread Landry Breuil

On Mon, Sep 07, 2015 at 07:40:10PM +0100, Stuart Henderson wrote:
> On 2015/09/07 20:26, Landry Breuil wrote:
> > I cant help you on the issue itself, but i can confirm you that i've
> > been seeing the exact same issue with gem0 on my g4 mac mini here, and
> > since some releases. randomly, gem0 just doesnt receive/send pkts
> > anymore and needs to be downed/upped.
> 
> Interesting - I don't see that on mine.
> 
> Out of interest does your switch have flow control enabled? (you will
> see rxpause and/or txpause in the ifconfig output). If it does, is there
> any change if you disable it on the switch (if you can do so)?

switch is an el cheapo 'mercury 8-port' something, so the only feature
it has is making pkts flow...

Landry

hppa sp crash

2015-04-03 Thread Landry Breuil

Hi,

sending this one because for once it has an unusual trace, regular ports
building workflow, running SP kernel. Trace and dmesg attached.

Landry
[ using 407040 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2015 OpenBSD. All rights reserved.  http://www.OpenBSD.org

OpenBSD 5.7-current (GENERIC.MP) #509: Tue Mar 31 17:03:44 MDT 2015
dera...@hppa.openbsd.org:/usr/src/sys/arch/hppa/compile/GENERIC.MP
HP 9000/785/J6000 (Duet W+) PA-RISC 2.0a
real mem = 536870912 (512MB)
rsvd mem = 524288 (512KB)
avail mem = 518483968 (494MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root [flex fff8]
pdc0 at mainbus0
power0 at mainbus0 offset 400804
lcd0 at mainbus0 offset 5d0008: model 0
cpu0 at mainbus0 offset ffa: PCXW L1-B 552MHz, FPU PCXW rev 1
cpu0: 512K(64b/l) Icache, 1024K(64b/l) wr-back Dcache, 160 coherent TLB
cpu1 at mainbus0 offset ffa2000: PCXW L1-B 552MHz, FPU PCXW rev 1
cpu1: 512K(64b/l) Icache, 1024K(64b/l) wr-back Dcache, 160 coherent TLB
mem0 at mainbus0 offset ed10200: size 512MB
astro0 at mainbus0 offset ed0: Astro rev 2.1
elroy0 at astro0 offset ed3c000: Elroy TR4.0 APIC ver 20, 7 pins
pci0 at elroy0
elroy1 at astro0 offset ed38000: Elroy TR4.0 APIC ver 20, 7 pins
pci1 at elroy1
em0 at pci1 dev 2 function 0 Intel 82546GB rev 0x03: line 0 irq 2, address 
00:04:23:b7:51:3c
em1 at pci1 dev 2 function 1 Intel 82546GB rev 0x03: line 1 irq 3, address 
00:04:23:b7:51:3d
elroy2 at astro0 offset ed34000: Elroy TR4.0 APIC ver 20, 7 pins
pci2 at elroy2
elroy3 at astro0 offset ed3: Elroy TR4.0 APIC ver 20, 7 pins
pci3 at elroy3
dc0 at pci3 dev 12 function 0 DEC 21142/3 rev 0x41: line 2 irq 5, address 
00:10:83:ff:8e:f5
lxtphy0 at dc0 phy 1: LXT970 10/100 PHY, rev. 3
Analog Devices AD1889 Audio rev 0x00 at pci3 dev 13 function 0 not configured
pciide0 at pci3 dev 14 function 0 NS PC87415 IDE rev 0x03: DMA, channel 0 
configured to native-PCI, channel 1 configured to native-PCI
pciide0: using line 0 irq 6 for native-PCI interrupt
ssio0 at pci3 dev 14 function 1 NS 87560 Legacy I/O rev 0x01: line 0 irq 6
com0 at ssio0 offset 3f8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at ssio0 offset 2f8 irq 3: ns16550a, 16 byte fifo
lpt0 at ssio0 offset 378 irq 7
ohci0 at pci3 dev 14 function 2 NS USB rev 0x02: line 0 irq 6, version 1.0, 
legacy support
siop0 at pci3 dev 15 function 0 Symbios Logic 53c896 rev 0x04: line 1 irq 7, 
using 8K of on-board RAM
scsibus1 at siop0: 16 targets, initiator 7
siop1 at pci3 dev 15 function 1 Symbios Logic 53c896 rev 0x04: line 1 irq 7, 
using 8K of on-board RAM
scsibus2 at siop1: 16 targets, initiator 7
sym0 at scsibus2 targ 5 lun 0: <SEAGATE, ST3146707LC, D703> SCSI3 0/direct 
fixed serial.SEAGATE_ST3146707LC_3KS395LN
sd0 at scsibus0 targ 0 lun 0: <SEAGATE, ST3146707LC, D703> SCSI3 0/direct fixed 
serial.SEAGATE_ST3146707LC_3KS395LN
sd0: 140014MB, 512 bytes/sector, 286749480 sectors
usb0 at ohci0: USB revision 1.0
uhub0 at usb0 NS OHCI root hub rev 1.00/1.00 addr 1
siop1: target 5 now using tagged 16 bit 40.0 MHz 31 REQ/ACK offset xfers
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
bootpath: 10/0/15/1.5 class=1 flags=80autoboot hpa=0xf4004000 spa=0x0 
io=0x19000
root on sd0a (127282a85bc65f44.a) swap on sd0b dump on sd0b
WARNING: / was not properly unmounted
[-- MARK -- Thu Apr  2 15:00:00 2015]
[-- MARK -- Thu Apr  2 16:00:00 2015]
[-- MARK -- Thu Apr  2 17:00:00 2015]
[-- MARK -- Thu Apr  2 18:00:00 2015]
[-- MARK -- Thu Apr  2 19:00:00 2015]
panic: trap: uvm_fault(0x6bc654, 14, 1, 2): 14

Stopped at  Debugger+0x18:  break   0,5

RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!

IF RUNNING SMP, USE 'mach ddbcpu #' AND 'trace' ON OTHER PROCESSORS, TOO.

DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!

ddb{0} [-- MARK -- Thu Apr  2 20:00:00 2015]
[-- MARK -- Thu Apr  2 21:00:00 2015]
[-- MARK -- Thu Apr  2 22:00:00 2015]
[-- MARK -- Thu Apr  2 23:00:00 2015]
[-- MARK -- Fri Apr  3 00:00:00 2015]
[-- MARK -- Fri Apr  3 01:00:00 2015]
[-- MARK -- Fri Apr  3 02:00:00 2015]
[-- MARK -- Fri Apr  3 03:00:00 2015]
trace

Debugger(6198bc,689268,5,228) at Debugger+0x18

panic(6623c0,6bc654,14,1) at panic+0xc8

panic(34aeac,a,c0a80690,80) at panic+0xa8

trap(c0,0,1b1504,c05af260) at trap+0xa88

-- trap #26

uvm_pmr_addr_RB_REMOVE_COLOR(19f940,33681c,1e94de70,ce2f8238) at uvm_pmr_addr_RB_REMOVE_COLOR+0x258

uvm_pmr_addr_RB_REMOVE(701090,a,156bffc8,156bf8f8) at uvm_pmr_addr_RB_REMOVE+0xe4

pool_do_get(3383bc,80713,24e0fc8,24e0c10) at pool_do_get+0x12c

uvm_pmr_remove_addr(1,71e260,43b7b000,2) at uvm_pmr_remove_addr+0x28

uvm_pmr_get1page(ce2f8000,71e560,5cbf3f04,1) at uvm_pmr_get1page+0x33c

uvm_pmr_getpages(33d784,702a00,703200,2240ac) at uvm_pmr_getpages+0x468

uvm_pagealloc(0,43b77000,0,) at uvm_pagealloc+0x17c

si_addr should be void* in signal.h ?

2015-03-02 Thread Landry Breuil

Hi,

stumbled upon this thanks to mozilla buildbot:

/home/buildslave-amd64/mozilla-central-amd64/build/js/src/asmjs/AsmJSSignalHandlers.cpp:1106:32:
error: static_cast from 'caddr_t' (aka 'char *') to 'uint8_t *' (aka 'unsigned 
char *') is not allowed
uint8_t *faultingAddress = static_cast<uint8_t*>(info->si_addr);

(see https://bugzilla.mozilla.org/show_bug.cgi?id=1138205)

But it turns out we're the only ones failing, because everyone has
si_addr as a void *. FreeBSD has it in /usr/include/sys/signal.h, and
NetBSD has it via a #define indirection in /usr/include/sys/siginfo.h.

According to POSIX, si_addr should be void* :
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/signal.h.html

Is there a particular reason we diverge here ?

Landry

Re: tor segmentation fault on amd64 current

2014-09-30 Thread Landry Breuil

On Tue, Sep 30, 2014 at 11:30:06AM +0400, ba...@yandex.ru wrote:
> it sounds funny, but you are doing something wrong :) i simply have no
> place to make a mistake.
> on virtual machine(i use vmware) install _CURRENT_, install mc, install
there's your problem-^

Re: xfce4-netload-1.2.0 no network flow display

2014-07-08 Thread Landry Breuil

On Tue, Jul 08, 2014 at 10:02:09AM +0800, 3akai wrote:
> I installed xfce4-netload-1.2.0 on amd64/kvm/qemu guest vm,
> and it only work on lo0, when I config it to display vio0/em0 network flow,
> it popup the error message like this:
>
> Xfce4-Netload-Plugin: Error in initializing:
> Interface was not found.
>
> and it no network flow display on vio0/em0,
> but the network(vio0/em0) is working.
> If I config it to lo0, netload can show the network(lo0) flow.

The code checking for interface type/status is here:
http://git.xfce.org/panel-plugins/xfce4-netload-plugin/tree/panel-plugin/wormulon/openbsd.c#n37

maybe for some reason vio devices dont pass that check (AF_LINK, maybe ?)

Patch welcome..

Landry
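
One way to check that theory would be a small test program that reports
whether the kernel lists a given address family for an interface. A sketch,
assuming getifaddrs(3) is a fair stand-in for whatever the plugin does
internally (this was not checked against its source); interface_has_family
is a hypothetical helper.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <assert.h>
#include <ifaddrs.h>
#include <string.h>

/*
 * Return 1 if the named interface has an address of the given family,
 * 0 if it does not, and -1 if the interface list cannot be fetched.
 */
static int
interface_has_family(const char *name, int family)
{
	struct ifaddrs *ifap, *ifa;
	int found = 0;

	assert(name != NULL);
	if (getifaddrs(&ifap) == -1)
		return -1;
	for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
		/* Entries without an address carry no family to compare. */
		if (ifa->ifa_addr == NULL)
			continue;
		if (strcmp(ifa->ifa_name, name) == 0 &&
		    ifa->ifa_addr->sa_family == family) {
			found = 1;
			break;
		}
	}
	freeifaddrs(ifap);
	return found;
}
```

On OpenBSD, comparing interface_has_family("vio0", AF_LINK) with the result
for "lo0" would show whether the link-level entry is what differs.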

pdksh bug on an empty for loop preceded by a set

2013-06-02 Thread Landry Breuil

Hi,

as i found out while wandering through mozilla's build system revamp for
thunderbird - it turns out our pdksh behaves differently from zsh, bash and
dash in the following case:

[05:38] dawn:/src/comm-central/ $set foo bar baz ; for out in ; do echo $out ; done
foo
bar
baz
[05:40] dawn:/src/comm-central/ $zsh 
dawn% set foo bar baz ; for out in ; do echo $out ; done
dawn% 
[05:41] dawn:/src/comm-central/ $bash
bash-4.2$ set foo bar baz ; for out in ; do echo $out ; done
bash-4.2$ exit
[05:43] dawn:/src/comm-central/ $dash
$set foo bar baz ; for out in ; do echo $out ; done
$^D[05:44] dawn:/src/comm-central/ $zoid
--[ This is the Zoidberg shell ]--[ Version 0.981 ]--
### This is a development version, consider it unstable
landry@dawn 05:43 ~$ set foo bar baz ; for out in ; do echo $out ; done
Missing $ on loop variable at line 3

syntax error at (eval 55) line 4, near do echo $out


zoid: done: command not found

(yeah, why not zoidberg ?)

As you can see, ksh outputs set values. boom in the thunderbird trunk
configure script, cf https://bugzilla.mozilla.org/show_bug.cgi?id=878661
for the complete analysis.

So, a real bug to fix, or just a difference ?

Landry

Re: Thread-related crash in Firefox on current snapshot

2012-04-04 Thread Landry Breuil

On Wed, Apr 04, 2012 at 02:54:07PM +0200, HSL GmbH - Lukas Ratajski wrote:
> Packages were updated today from snapshots around 10:00 UTC. I am aware that
> there is some serious rthreads-related work in progress, this report may be
> helpful for further bug hunting.
>
> The corefile is still around. I will keep the system in the current state
> (OS/packages) in case you need more information.
>

Bug report lacks:
- what site you were visiting that triggered the crash
- where you were clicking
- backtrace for ALL threads
- backtrace with debug symbols
- firefox crashed, you get the pieces.. anyway, likely not debuggable.
  Live with it.

Landry

Re: Testing 5.0-beta / errors when installing packages

2011-07-28 Thread Landry Breuil

On Wed, Jul 27, 2011 at 11:14:41AM +0200, Stefan Wollny wrote:
> Hi,
>
> (1)
> THANK YOU for your continued efforts!
>
> (2)
> This is my very first report, ever. Please be kind if the form is unfamiliar
> and some relevant information is missing.
> I installed snapshot on July 26th, version i386, from
> ftp://openbsd.cs.fau.de/ plus ports as a fresh install. Everything went
> smooth.
> Added this site to .profile as PKG_PATH, too.
>
> (3)
> dmesg attached
>
> (4)
> Installed packages from the same site as common user using 'sudo'.
> - fluxbox: OK
>
> - firefox: ERROR!
> Can't install firefox-5.0p1: can't resolve gtk+2-2.24.5p0,cairo-1.10.2p1
> tried as workaround: Installation gtk+2 from ports (with dependend
> cairo-1.10.2p1)
> result:Can't install firefox-5.0p1 because of libraries
>  library pixman-1.20.0 not found


Sometimes packages are out of sync with snapshots. Try with another
mirror, or update to newer snapshots.

Landry

make NAN a constant (netbsd pr40695)

2010-10-23 Thread Landry Breuil

Hi,

i'm porting an app which uses:
static gdouble line_speed = NAN;
static gdouble line_course = NAN;

which yields:
gpspoint.c:84: error: initializer element is not constant
gpspoint.c:85: error: initializer element is not constant

Kirill Bychkov pointed out
(http://marc.info/?l=openbsd-ports&m=128645629406557&w=2)
to me that netbsd had the following related pr which affects us too:
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=40695

Can we consider using the same libc fix ?

is the initialization using:
static gdouble line_speed = __builtin_nanf("");
static gdouble line_course = __builtin_nanf("");

a valid temporary fix for the port ?

Thx,
Landry
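
A self-contained sketch of that temporary workaround, assuming a compiler
that provides the GCC-style __builtin_nanf and treats it as a constant
expression. Plain double stands in for gdouble (glib defines gdouble as
double), and check_defaults is a hypothetical helper added only to show the
values really are NaN.

```c
#include <assert.h>
#include <math.h>

/*
 * The builtin is folded at compile time, so it is accepted as a static
 * initializer even where the NAN macro expands to a non-constant
 * expression.
 */
static double line_speed = __builtin_nanf("");
static double line_course = __builtin_nanf("");

/* Sanity check: both defaults must read back as NaN. */
static void
check_defaults(void)
{
	assert(isnan(line_speed));
	assert(isnan(line_course));
}
```

This keeps the port building without touching the system headers, at the
cost of depending on a compiler builtin.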
