Re: bsdinstall wifi setup is broken on CURRENT

2024-05-20 Thread Renato Botelho

On 18/05/24 11:33, Alfonso S. Siciliano wrote:
> On 5/16/24 20:40, Renato Botelho wrote:
>> I saw some users on a .br group complaining that bsdinstall was failing
>> to set up a wifi network on 15.0 snapshots and tried it myself.  I was
>> able to reproduce the problem and also noticed another one.
>
> Thank you for your report; the video is highly appreciated, as it made
> the problem quick and easy to understand.
>
>> I noticed the Network Selection screen only shows one line, which makes
>> it awkward to navigate through items.  On 14.1-BETA2 it shows
>> multiple lines, so it seems to be a regression.
>
> Problem 1. Looking at wlanconfig, it seems related to $height $width
> $rows for the selection menu. Could you please open a PR and add me, so
> we can test and solve it?

I've fixed it locally and submitted a fix for review:

https://reviews.freebsd.org/D45271

>> The problem users reported was: after selecting the desired network it
>> just starts over instead of asking for a password.  I made a video [1]
>> showing the problem.
>
> Problem 2. I know about this issue with --mixedform; my last import, 2 days
> ago, should solve it (a6d8be451f62d425b71a4874f7d4e133b9fb393c).
> You could try the latest main snapshot (yesterday, 17 May); please let me
> know of any problems.

The last snapshot still contains bsddialog 1.0, so I'll wait for the next one
and give it a try.

>> Jessica, I've cc'd you because git shows you were the last person
>> making changes in this area.  If it's not related and I made a
>> mistake, just ignore me.
>>
>> [1] https://youtube.com/shorts/Gmeckokw2a0
>
> Again, thanks for the video.
>
> Best Regards,
> Alfonso




--
Renato Botelho



RES: RES: RES: usb mouse not work on boot

2024-05-20 Thread Ivan Quitschal


> -----Original Message-----
> From: owner-freebsd-curr...@freebsd.org On behalf of Dag-Erling Smørgrav
> Sent: Monday, 20 May 2024 06:01
> To: Ivan Quitschal
> Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia; FreeBSD Current
> Subject: Re: RES: RES: usb mouse not work on boot
> 
> Ivan Quitschal  writes:
> > > Ivan Quitschal  writes:
> > > > diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> > > > index 174e1c28ae96..7b19d713c943 100644
> > > > --- a/sys/dev/usb/input/usbhid.c
> > > > +++ b/sys/dev/usb/input/usbhid.c
> > > > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > > > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > > > return (ENXIO);
> > > > +// return (BUS_PROBE_GENERIC + 1);
> > > > return (BUS_PROBE_DEFAULT + 1);
> > > > }
> > > You realize this diff does nothing at all, right?
> > Yeap, i also said it worked in 14-current old code only ,and has more
> > than 2 years already
> 
> No, I mean all this does is add a comment.  It has no effect on the code.
> 
> DES
> --
> Dag-Erling Smørgrav - d...@freebsd.org


Oh ok, sorry.

But actually it did change one return for another.

usbhid.ko used to return this:
return (BUS_PROBE_GENERIC + 1);

and ums.ko used to take over instead, messing up our multimedia keyboards and all.
It was a priority issue when it shouldn't matter.

Then Vladimir changed it to this:
return (BUS_PROBE_DEFAULT + 1);

and everything went "voilà".
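
For readers following the thread, here is a minimal sketch of why the return value matters. It rests on several assumptions: that newbus picks the driver whose probe value is closest to zero, that BUS_PROBE_DEFAULT is -20 and BUS_PROBE_GENERIC is -100 (values as recalled from sys/sys/bus.h, not checked against a particular tree), and that ums probes at BUS_PROBE_DEFAULT. best_driver is a hypothetical helper for illustration, not a kernel interface.

```sh
#!/bin/sh
# Probe priorities, assumed from sys/sys/bus.h (not verified here).
BUS_PROBE_DEFAULT=-20
BUS_PROBE_GENERIC=-100

# best_driver NAME=PRIO ...: print the name with the highest priority,
# i.e. the probe value closest to zero. Hypothetical helper.
best_driver() {
    bd_best=
    bd_prio=
    for bd_pair in "$@"; do
        bd_name=${bd_pair%%=*}
        bd_val=${bd_pair#*=}
        if [ -z "$bd_best" ] || [ "$bd_val" -gt "$bd_prio" ]; then
            bd_best=$bd_name
            bd_prio=$bd_val
        fi
    done
    printf '%s\n' "$bd_best"
}

# Old return value: usbhid at GENERIC + 1 loses to ums (assumed DEFAULT).
best_driver "ums=$BUS_PROBE_DEFAULT" "usbhid=$(($BUS_PROBE_GENERIC + 1))"
# Current return value: usbhid at DEFAULT + 1 outranks ums.
best_driver "ums=$BUS_PROBE_DEFAULT" "usbhid=$(($BUS_PROBE_DEFAULT + 1))"
```

Under those assumptions the first call prints ums and the second prints usbhid, which matches the behaviour described above.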


Sorry for the miscommunication.
Regards,

tzk


Re: RES: RES: usb mouse not work on boot

2024-05-20 Thread Dag-Erling Smørgrav
Ivan Quitschal  writes:
> > Ivan Quitschal  writes:
> > > diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> > > index 174e1c28ae96..7b19d713c943 100644
> > > --- a/sys/dev/usb/input/usbhid.c
> > > +++ b/sys/dev/usb/input/usbhid.c
> > > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > > return (ENXIO);
> > > +// return (BUS_PROBE_GENERIC + 1);
> > > return (BUS_PROBE_DEFAULT + 1);
> > >  }
> > You realize this diff does nothing at all, right?
> Yeap, i also said it worked in 14-current old code only ,and has more
> than 2 years already

No, I mean all this does is add a comment.  It has no effect on the
code.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



RES: RES: usb mouse not work on boot

2024-05-19 Thread Ivan Quitschal
Hans participated in that one, and in this one too. He was the last person I spoke 
to on this forum; then I found out the tragic news and lost some interest, not 
in BSD, but in the things I remember he was directly involved in, like 
that one for example.

But yes, I know it's not a proper patch; he told me that too .. with all the 
9 and such ...

The last USB thing I spoke about on this list:
https://lists.freebsd.org/archives/freebsd-current/2022-September/002580.html
so please 
not necessary


Thanks
tzk


> -----Original Message-----
> From: owner-freebsd-curr...@freebsd.org On behalf of Ivan Quitschal
> Sent: Sunday, 19 May 2024 19:49
> To: Dag-Erling Smørgrav
> Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia; FreeBSD Current
> Subject: RES: RES: usb mouse not work on boot
> 
> Yeap, i also said it worked in 14-current old code only ,and has more than  2 
> years
> already
> 
> Only point was whether freebsd had
> this
> return (BUS_PROBE_DEFAULT + 1); }
> or that
> return (BUS_PROBE_GENERIC + 1);
> 
> glad we have the first one , aka the right return
> 
> We have an entire email chain about this day back in the day august 2022 don’t
> remember correctly
> 
> 
> 
> > -----Original Message-----
> > From: Dag-Erling Smørgrav
> > Sent: Sunday, 19 May 2024 08:04
> > To: Ivan Quitschal
> > Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia; FreeBSD Current
> > Subject: Re: RES: usb mouse not work on boot
> >
> > Ivan Quitschal  writes:
> > > diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> > > index 174e1c28ae96..7b19d713c943 100644
> > > --- a/sys/dev/usb/input/usbhid.c
> > > +++ b/sys/dev/usb/input/usbhid.c
> > > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > > return (ENXIO);
> > > +// return (BUS_PROBE_GENERIC + 1);
> > > return (BUS_PROBE_DEFAULT + 1);
> > > }
> >
> > You realize this diff does nothing at all, right?
> >
> > DES
> > --
> > Dag-Erling Smørgrav - d...@freebsd.org


RES: RES: usb mouse not work on boot

2024-05-19 Thread Ivan Quitschal
Yep, I also said it worked in the old 14-current code only, and it is more than
2 years old already.

The only point was whether FreeBSD had
this
return (BUS_PROBE_DEFAULT + 1);
or that
return (BUS_PROBE_GENERIC + 1);

Glad we have the first one, aka the right return.

We had an entire email chain about this back in the day, around August 2022; I
don't remember exactly.



> -----Original Message-----
> From: Dag-Erling Smørgrav
> Sent: Sunday, 19 May 2024 08:04
> To: Ivan Quitschal
> Cc: Vladimir Kondratyev; Warner Losh; Oleksandr Kryvulia; FreeBSD Current
> Subject: Re: RES: usb mouse not work on boot
> 
> Ivan Quitschal  writes:
> > diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> > index 174e1c28ae96..7b19d713c943 100644
> > --- a/sys/dev/usb/input/usbhid.c
> > +++ b/sys/dev/usb/input/usbhid.c
> > @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> > if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> > return (ENXIO);
> > +// return (BUS_PROBE_GENERIC + 1);
> > return (BUS_PROBE_DEFAULT + 1);
> > }
> 
> You realize this diff does nothing at all, right?
> 
> DES
> --
> Dag-Erling Smørgrav - d...@freebsd.org


Re: RES: usb mouse not work on boot

2024-05-19 Thread Dag-Erling Smørgrav
Ivan Quitschal  writes:
> diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
> index 174e1c28ae96..7b19d713c943 100644
> --- a/sys/dev/usb/input/usbhid.c
> +++ b/sys/dev/usb/input/usbhid.c
> @@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
> if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
> return (ENXIO);
> +// return (BUS_PROBE_GENERIC + 1);
> return (BUS_PROBE_DEFAULT + 1);
> }

You realize this diff does nothing at all, right?

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: usb mouse not work on boot

2024-05-18 Thread Chris

On 2024-05-18 08:33, Warner Losh wrote:

On Sat, May 18, 2024, 9:22 AM Oleksandr Kryvulia 
wrote:


18.05.24 16:06, Warner Losh:



On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
wrote:


18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia   
writes:


Gary Jennejohn   writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"


One more correction. Via kld_list I need load ums(4), loading only
uhid(4) does not solve a problem.




You don't need to change kld_list. In fact, you should undo any changes
you've made there. Undo everything in loader.conf you've done.

This is a bug in the boot optimization stuff. Or rather, this exposes a
long standing bug in the USB code where there's an asymmetry between the
nomatch events and the bus tree it presents to devctl causing devmatch to
fail when the nomatch events aren't present on boot.

Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot.
Or update to the change I'm about to make.


Thanks for the detailed explanation, Warner. Interesting that on my system
hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch but only
explicit set it in /boot/loader.conf did the trick. That is why I think
this sysctl don't work in my case.



Yea. That's the optimization. We don't start generating events until it is
one. Setting it in the bootloader causes all events to come through.
Setting it in devmatch turns them on after we run devmatch the first time,
omitting all of the ones generated on boot.

Why is sysctl.conf(5) not the best location for this?



Warner




--Chris



Re: RES: RES: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 21:39, Ivan Quitschal:


Not sure, im on 14-current because of my synergy  insists on crashing  
after version synergy-1.14.0.4,3


But that’s pretty simple to check

Just do a
# grep 'return (BUS_PROBE_' /usr/src/sys/dev/usb/input/usbhid.c
in your own kernel source tree to see what line is there




That's from my source tree:

root@thinkpad:/usr/src # grep 'return (BUS_PROBE_' /usr/src/sys/dev/usb/input/usbhid.c

   return (BUS_PROBE_DEFAULT + 1);

RES: RES: usb mouse not work on boot

2024-05-18 Thread Ivan Quitschal
Not sure, I'm on 14-current because my synergy insists on crashing after
version synergy-1.14.0.4,3.
But that's pretty simple to check.

Just run
# grep 'return (BUS_PROBE_' /usr/src/sys/dev/usb/input/usbhid.c
in your own kernel source tree to see what line is there.

Thanks

Ivan


From: owner-freebsd-curr...@freebsd.org On behalf of Oleksandr Kryvulia
Sent: Saturday, 18 May 2024 15:29
To: freebsd-current@freebsd.org
Subject: Re: RES: usb mouse not work on boot

18.05.24 19:29, Ivan Quitschal:

Hi Warner /  WBR / Oleksandr

Im not sure if that's the case with this uhid.ko, but you guys remember I had a 
priority issue with this module and Vladimir made me a patch to fix the attach 
priority?

Warner, was it fixed since then?


Let me show the patch I use to this very day important line is this, the patch 
might be wrong , because im still on 14-current

+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);



diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
index 174e1c28ae96..7b19d713c943 100644
--- a/sys/dev/usb/input/usbhid.c
+++ b/sys/dev/usb/input/usbhid.c
@@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
return (ENXIO);
+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);
}


If I understand correctly, this patch is already in main as
975407b1d8dcceac2b54e2c4df96aadec7dc4c3a



Re: RES: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 19:29, Ivan Quitschal:


Hi Warner /  WBR / Oleksandr

Im not sure if that’s the case with this uhid.ko, but you guys 
remember I had a priority issue with this module and Vladimir made me 
a patch to fix the attach priority?


Warner, was it fixed since then?

Let me show the patch I use to this very day important line is this, 
the patch might be wrong , because im still on 14-current


+// return (BUS_PROBE_GENERIC + 1);

    return (BUS_PROBE_DEFAULT + 1);

diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
index 174e1c28ae96..7b19d713c943 100644
--- a/sys/dev/usb/input/usbhid.c
+++ b/sys/dev/usb/input/usbhid.c
@@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
    if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
    return (ENXIO);
+// return (BUS_PROBE_GENERIC + 1);
    return (BUS_PROBE_DEFAULT + 1);
}



If I understand correctly, this patch is already in main as
975407b1d8dcceac2b54e2c4df96aadec7dc4c3a


RES: usb mouse not work on boot

2024-05-18 Thread Ivan Quitschal
Hi Warner /  WBR / Oleksandr

I'm not sure if that's the case with this uhid.ko, but do you guys remember I had a
priority issue with this module and Vladimir made me a patch to fix the attach
priority?

Warner, was it fixed since then?


Let me show the patch I use to this very day. The important line is this; the patch
might be wrong, because I'm still on 14-current

+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);



diff --git a/sys/dev/usb/input/usbhid.c b/sys/dev/usb/input/usbhid.c
index 174e1c28ae96..7b19d713c943 100644
--- a/sys/dev/usb/input/usbhid.c
+++ b/sys/dev/usb/input/usbhid.c
@@ -802,6 +802,7 @@ usbhid_probe(device_t dev)
if (hid_test_quirk(&sc->sc_hw, HQ_HID_IGNORE))
return (ENXIO);
+// return (BUS_PROBE_GENERIC + 1);
return (BUS_PROBE_DEFAULT + 1);
}

Thanks

--tzk

From: owner-freebsd-curr...@freebsd.org On behalf of Warner Losh
Sent: Saturday, 18 May 2024 12:33
To: Oleksandr Kryvulia
Cc: FreeBSD Current
Subject: Re: usb mouse not work on boot


On Sat, May 18, 2024, 9:22 AM Oleksandr Kryvulia <shur...@shurik.kiev.ua> wrote:
18.05.24 16:06, Warner Losh:


On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia <shur...@shurik.kiev.ua> wrote:
18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:


Oleksandr Kryvulia  
writes:

Gary Jennejohn  writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"

One more correction. Via kld_list I need load ums(4), loading only uhid(4) does 
not solve a problem.


You don't need to change kld_list. In fact, you should undo any changes you've 
made there. Undo everything in loader.conf you've done.

This is a bug in the boot optimization stuff. Or rather, this exposes a long 
standing bug in the USB code where there's an asymmetry between the nomatch 
events and the bus tree it presents to devctl causing devmatch to fail when the 
nomatch events aren't present on boot.

Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot. Or 
update to the change I'm about to make.


Thanks for the detailed explanation, Warner. Interesting that on my system 
hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch but only explicit 
set it in /boot/loader.conf did the trick. That is why I think this sysctl 
don't work in my case.

Yea. That's the optimization. We don't start generating events until it is one. 
Setting it in the bootloader causes all events to come through. Setting it in
devmatch turns them on after we run devmatch the first time, omitting all of 
the ones generated on boot.

Warner


Re: usb mouse not work on boot

2024-05-18 Thread Warner Losh
On Sat, May 18, 2024, 9:22 AM Oleksandr Kryvulia 
wrote:

> 18.05.24 16:06, Warner Losh:
>
>
>
> On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
> wrote:
>
>> 18.05.24 12:59, Oleksandr Kryvulia:
>>
>> 18.05.24 12:55, Dag-Erling Smørgrav:
>>
>> Oleksandr Kryvulia   writes:
>>
>> Gary Jennejohn   writes:
>>
>> Try adding uhid_load="YES" to your /boot/loader.conf.  With that
>> added the module should be automatically loaded during the kernel
>> boot.
>>
>> As workaround I already have kld_list+="uhid" in /etc/rc.conf.
>>
>> I hope you don't mean that literally, because /etc/rc.conf is a shell
>> script and += is not valid shell syntax.  On the other hand, something
>> like
>>
>> kld_list="${kld_list} uhid"
>>
>> Yes, you are right. I mean
>> sysrc kld_list+="uhid"
>>
>>
>> One more correction. Via kld_list I need load ums(4), loading only
>> uhid(4) does not solve a problem.
>>
>
>
> You don't need to change kld_list. In fact, you should undo any changes
> you've made there. Undo everything in loader.conf you've done.
>
> This is a bug in the boot optimization stuff. Or rather, this exposes a
> long standing bug in the USB code where there's an asymmetry between the
> nomatch events and the bus tree it presents to devctl causing devmatch to
> fail when the nomatch events aren't present on boot.
>
> Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot.
> Or update to the change I'm about to make.
>
>
> Thanks for the detailed explanation, Warner. Interesting that on my system
> hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch but only
> explicit set it in /boot/loader.conf did the trick. That is why I think
> this sysctl don't work in my case.
>

Yea. That's the optimization. We don't start generating events until it is
one. Setting it in the bootloader causes all events to come through.
Setting it in devmatch turns them on after we run devmatch the first time,
omitting all of the ones generated on boot.
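
The timing described above can be pictured with a toy model. This is only a sketch under loud assumptions: boot is reduced to numbered steps, an event is counted as delivered only if the flag is already 1 at the step where the device shows up, and events_seen is a hypothetical helper that has nothing to do with the real devctl code.

```sh
#!/bin/sh
# Toy model only (not the devctl implementation): count how many nomatch
# events a listener would see, given the step at which the flag becomes 1.
# Usage: events_seen ENABLE_STEP EVENT_STEP...
events_seen() {
    es_enable=$1
    shift
    es_count=0
    for es_step in "$@"; do
        # An event is delivered only if the flag is already 1 at that step.
        if [ "$es_step" -ge "$es_enable" ]; then
            es_count=$(($es_count + 1))
        fi
    done
    echo "$es_count"
}

# Three devices found at steps 1-3 during boot, one hot-plugged at step 9.
events_seen 0 1 2 3 9    # flag set from the loader (step 0): all 4 seen
events_seen 5 1 2 3 9    # flag set later by an rc script (step 5): 1 seen
```

In this model the late setting only catches the device plugged in afterwards, which is consistent with replugging the mouse making it work.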

Warner

>


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 16:06, Warner Losh:



On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
 wrote:


18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia   
 writes:

Gary Jennejohn    writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"


One more correction. Via kld_list I need load ums(4), loading only
uhid(4) does not solve a problem.



You don't need to change kld_list. In fact, you should undo any 
changes you've made there. Undo everything in loader.conf you've done.


This is a bug in the boot optimization stuff. Or rather, this exposes 
a long standing bug in the USB code where there's an asymmetry between 
the nomatch events and the bus tree it presents to devctl causing 
devmatch to fail when the nomatch events aren't present on boot.


Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and 
reboot. Or update to the change I'm about to make.




Thanks for the detailed explanation, Warner. Interestingly, on my 
system hw.bus.devctl_nomatch_enabled=1 is set by /etc/rc.d/devmatch, but 
only explicitly setting it in /boot/loader.conf did the trick. That is why I 
thought this sysctl didn't work in my case.

Re: bsdinstall wifi setup is broken on CURRENT

2024-05-18 Thread Alfonso S. Siciliano

On 5/16/24 20:40, Renato Botelho wrote:
> I saw some users on a .br group complaining that bsdinstall was failing
> to set up a wifi network on 15.0 snapshots and tried it myself.  I was
> able to reproduce the problem and also noticed another one.

Thank you for your report; the video is highly appreciated, as it made
the problem quick and easy to understand.

> I noticed the Network Selection screen only shows one line, which makes
> it awkward to navigate through items.  On 14.1-BETA2 it shows
> multiple lines, so it seems to be a regression.

Problem 1. Looking at wlanconfig, it seems related to $height $width
$rows for the selection menu. Could you please open a PR and add me, so
we can test and solve it?

> The problem users reported was: after selecting the desired network it
> just starts over instead of asking for a password.  I made a video [1]
> showing the problem.

Problem 2. I know about this issue with --mixedform; my last import, 2 days
ago, should solve it (a6d8be451f62d425b71a4874f7d4e133b9fb393c).
You could try the latest main snapshot (yesterday, 17 May); please let me
know of any problems.

> Jessica, I've cc'd you because git shows you were the last person making
> changes in this area.  If it's not related and I made a mistake, just
> ignore me.
>
> [1] https://youtube.com/shorts/Gmeckokw2a0

Again, thanks for the video.

Best Regards,
Alfonso




Re: usb mouse not work on boot

2024-05-18 Thread Warner Losh
On Sat, May 18, 2024 at 6:51 AM Oleksandr Kryvulia 
wrote:

> 18.05.24 12:59, Oleksandr Kryvulia:
>
> 18.05.24 12:55, Dag-Erling Smørgrav:
>
> Oleksandr Kryvulia   writes:
>
> Gary Jennejohn   writes:
>
> Try adding uhid_load="YES" to your /boot/loader.conf.  With that
> added the module should be automatically loaded during the kernel
> boot.
>
> As workaround I already have kld_list+="uhid" in /etc/rc.conf.
>
> I hope you don't mean that literally, because /etc/rc.conf is a shell
> script and += is not valid shell syntax.  On the other hand, something
> like
>
> kld_list="${kld_list} uhid"
>
> Yes, you are right. I mean
> sysrc kld_list+="uhid"
>
>
> One more correction. Via kld_list I need load ums(4), loading only uhid(4)
> does not solve a problem.
>

Also, in this case, kld_list is a terrible place to load the files. You're
better off loading them with xxx_load=YES in loader.conf. The reason is
that both uhid and ums will match your mouse. kld_list loads these in a
random order (effectively) and the first one to load will claim the device,
since there's no re-probe when the next one loads. You should never use it,
unless the module you're loading isn't supported by the boot loader (like
drm-kmod). The old advice was to put everything in kld_list and it would
speed up boot, but all the performance bugs in the boot loader have been
fixed by a combination of moving to UEFI (which is generally faster),
BIOSes with performance bugs disappearing 10 years ago and block caching
being added to the boot loader. It should almost always be empty or just
drm-kmod these days (unless you somehow have special needs).

By adding uhid last to this list in this way, you're guaranteeing you'll
hit this bug because it's not after ums, and that things won't work.

Warner


Re: usb mouse not work on boot

2024-05-18 Thread Warner Losh
On Sat, May 18, 2024, 6:51 AM Oleksandr Kryvulia 
wrote:

> 18.05.24 12:59, Oleksandr Kryvulia:
>
> 18.05.24 12:55, Dag-Erling Smørgrav:
>
> Oleksandr Kryvulia   writes:
>
> Gary Jennejohn   writes:
>
> Try adding uhid_load="YES" to your /boot/loader.conf.  With that
> added the module should be automatically loaded during the kernel
> boot.
>
> As workaround I already have kld_list+="uhid" in /etc/rc.conf.
>
> I hope you don't mean that literally, because /etc/rc.conf is a shell
> script and += is not valid shell syntax.  On the other hand, something
> like
>
> kld_list="${kld_list} uhid"
>
> Yes, you are right. I mean
> sysrc kld_list+="uhid"
>
>
> One more correction. Via kld_list I need load ums(4), loading only uhid(4)
> does not solve a problem.
>


You don't need to change kld_list. In fact, you should undo any changes
you've made there. Undo everything in loader.conf you've done.

This is a bug in the boot optimization stuff. Or rather, this exposes a
long standing bug in the USB code where there's an asymmetry between the
nomatch events and the bus tree it presents to devctl causing devmatch to
fail when the nomatch events aren't present on boot.

Just set hw.bus.devctl_nomatch_enabled=1 in /boot/loader.conf and reboot.
Or update to the change I'm about to make.
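
A minimal sketch of adding that tunable without leaving duplicate lines behind. set_tunable is a hypothetical helper (not an existing FreeBSD tool), the name="value" quoting follows the uhid_load="YES" style used elsewhere in this thread, and the demo writes to a scratch file rather than the real /boot/loader.conf.

```sh
#!/bin/sh
# set_tunable FILE NAME VALUE: hypothetical helper that writes NAME="VALUE"
# into a loader.conf-style file, replacing any earlier setting of NAME.
set_tunable() {
    st_file=$1
    st_name=$2
    st_value=$3
    st_tmp=$(mktemp "${TMPDIR:-/tmp}/tunable.XXXXXX") || return 1
    # Keep every existing line that does not start with NAME=.
    if [ -f "$st_file" ]; then
        awk -v n="$st_name" 'index($0, n "=") != 1' "$st_file" > "$st_tmp"
    fi
    printf '%s="%s"\n' "$st_name" "$st_value" >> "$st_tmp"
    cat "$st_tmp" > "$st_file" && rm -f "$st_tmp"
}

# Demo against a scratch file; point it at /boot/loader.conf at your own risk.
conf=$(mktemp "${TMPDIR:-/tmp}/loader.XXXXXX")
set_tunable "$conf" hw.bus.devctl_nomatch_enabled 1
cat "$conf"
```

Running it a second time with a different value replaces the line instead of appending another one.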

Warner


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 12:59, Oleksandr Kryvulia:

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia  writes:

Gary Jennejohn  writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"


One more correction. Via kld_list I need load ums(4), loading only 
uhid(4) does not solve a problem.


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 12:55, Dag-Erling Smørgrav:

Oleksandr Kryvulia  writes:

Gary Jennejohn  writes:

Try adding uhid_load="YES" to your /boot/loader.conf.  With that
added the module should be automatically loaded during the kernel
boot.

As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

Yes, you are right. I mean
sysrc kld_list+="uhid"

Re: usb mouse not work on boot

2024-05-18 Thread Dag-Erling Smørgrav
Oleksandr Kryvulia  writes:
> Gary Jennejohn  writes:
> > Try adding uhid_load="YES" to your /boot/loader.conf.  With that
> > added the module should be automatically loaded during the kernel
> > boot.
> As workaround I already have kld_list+="uhid" in /etc/rc.conf.

I hope you don't mean that literally, because /etc/rc.conf is a shell
script and += is not valid shell syntax.  On the other hand, something
like

kld_list="${kld_list} uhid"

should work, and is preferable to Gary's suggestion since loading
modules pre-boot is significantly slower and should only be done for
modules which are required to boot or mount the root filesystem, such as
zfs.
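
As an illustration of the append idiom above, here is a small sketch that also avoids adding the same module twice. add_module is a hypothetical helper, not part of rc.subr, and i915kms is used only as a placeholder for whatever is already in the list.

```sh
#!/bin/sh
# add_module LIST MODULE: print LIST with MODULE appended, unless MODULE is
# already one of the space-separated words in LIST. Hypothetical helper.
add_module() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;        # already present
        "  ")     printf '%s\n' "$2" ;;        # list was empty
        *)        printf '%s\n' "$1 $2" ;;     # plain sh append, no +=
    esac
}

kld_list="i915kms"
kld_list=$(add_module "$kld_list" uhid)
kld_list=$(add_module "$kld_list" uhid)    # second call changes nothing
echo "$kld_list"
```

The echo prints "i915kms uhid"; in a real rc.conf the plain kld_list="${kld_list} uhid" form shown above is all that is needed.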

> But IMHO it some regression.

I agree, and 6437872c1d66 should be reverted until devmatch is capable
of loading uhid.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 12:42, Tomek CEDRO:

does it also affect usb keyboard in single boot mode?


Good question. I don't have a USB keyboard right now; I will check it a 
bit later.





Re: usb mouse not work on boot

2024-05-18 Thread Nuno Teixeira
Hello,

To fix my setup with usb mouse and audio dac on both amd64 (laptop) and
rpi4:

/boot/loader.conf.local:
snd_uaudio_load="YES"
ums_load="YES"

This restores the previous behaviour: the mouse is detected before the login
prompt, and the audio DAC is present in time for its sysctl setting to be applied.

Cheers,

Oleksandr Kryvulia wrote (Saturday, 18/05/2024 at 09:24):

> 18.05.24 10:26, Gary Jennejohn:
> > On Sat, 18 May 2024 09:20:24 +0300
> > Oleksandr Kryvulia  wrote:
> >
> >> After 6437872c1d665c2605f54e8ff040b0ba41edad07 my usb mouse no longer
> >> works on boot because uhid(4) is not autoloaded. To make it work I need
> >> manually load uhid or replug my usb mouse.
> >>
> > Try adding uhid_load="YES" to your /boot/loader.conf.  With that added
> > the module should be automatically loaded during the kernel boot.
>
> As workaround I already have kld_list+="uhid" in /etc/rc.conf. But IMHO
> it some regression.
>
>
>

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia

18.05.24 10:26, Gary Jennejohn:

On Sat, 18 May 2024 09:20:24 +0300
Oleksandr Kryvulia  wrote:


After 6437872c1d665c2605f54e8ff040b0ba41edad07 my usb mouse no longer
works on boot because uhid(4) is not autoloaded. To make it work I need
manually load uhid or replug my usb mouse.


Try adding uhid_load="YES" to your /boot/loader.conf.  With that added
the module should be automatically loaded during the kernel boot.


As a workaround I already have kld_list+="uhid" in /etc/rc.conf. But IMHO 
it is a regression.





Re: usb mouse not work on boot

2024-05-18 Thread Gary Jennejohn
On Sat, 18 May 2024 09:20:24 +0300
Oleksandr Kryvulia  wrote:

> After 6437872c1d665c2605f54e8ff040b0ba41edad07 my usb mouse no longer
> works on boot because uhid(4) is not autoloaded. To make it work I need
> manually load uhid or replug my usb mouse.
>

Try adding uhid_load="YES" to your /boot/loader.conf.  With that added
the module should be automatically loaded during the kernel boot.

--
Gary Jennejohn



usb mouse not work on boot

2024-05-18 Thread Oleksandr Kryvulia
After 6437872c1d665c2605f54e8ff040b0ba41edad07 my USB mouse no longer 
works on boot because uhid(4) is not autoloaded. To make it work I need to 
manually load uhid or replug my USB mouse.




usb devices discovery delay

2024-05-17 Thread Nuno Teixeira
Hello all,

On a recent main-n270203-2790ff21452f, USB devices (mouse and audio DAC) get
detected about 30 seconds after the login prompt.

I don't see anything relevant in dmesg, but I do see that this setting in

sysctl.conf
dev.pcm.4.play.vchanmode=passthrough

gives an error about a non-existent dev.pcm.4 (the USB audio DAC), which means
that the USB devices were not yet detected at that time.

Has anyone else experienced this?

Thanks

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: kldload tpm: Fail to load: "link_elf_obj: symbol tpm_bus_driver undefined"

2024-05-17 Thread Nuno Teixeira
Working fine!

Thanks for the fast fix.

Justin Hibbits wrote (Friday, 17/05/2024 at 13:57):

> On Fri, 17 May 2024 11:09:00 +0100
> Nuno Teixeira  wrote:
>
> > Hello,
> >
> > tpm kernel module fails to load starting on main from May 9.
> > Updated today and same error:
> >
> > ```
> > $ kldload tpm
> > kldload: an error occurred while loading module tpm. Please check
> > dmesg(8) for more details.
> >
> > (dmesg)
> > link_elf_obj: symbol tpm_bus_driver undefined
> > linker_load_file: /boot/kernel/tpm.ko - unsupported file type
> > ```
> >
> > I believe it is related to:
> >
> > ---
> > commit 10eea8dc8c4f3d2a3495e7fb08837d91adf465e9
> > Author: Justin Hibbits 
> > Date:   Thu May 9 15:27:35 2024 -0400
> >
> > tpm20: Support partial reads
> >
> > Summary:
> > In some cases the TPM utilities may read only a partial block,
> > instead of a full block.  If a new command starts while in the middle
> > of a read it may cause the TPM to go catatonic and no longer respond
> > to SPI.
> >
> > Reviewed by:kd
> > Obtained from:  Juniper Networks, Inc.
> > Differential Revision: https://reviews.freebsd.org/D45140
> > ---
> >
> > I use tpm for bhyve/Win11.
> >
> > Thanks,
>
> Sorry for the breakage.  Should be fixed by 62adeb92.
>
> - Justin
>


-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: kldload tpm: Fail to load: "link_elf_obj: symbol tpm_bus_driver undefined"

2024-05-17 Thread Justin Hibbits
On Fri, 17 May 2024 11:09:00 +0100
Nuno Teixeira  wrote:

> Hello,
> 
> tpm kernel module fails to load starting on main from May 9.
> Updated today and same error:
> 
> ```
> $ kldload tpm
> kldload: an error occurred while loading module tpm. Please check
> dmesg(8) for more details.
> 
> (dmesg)
> link_elf_obj: symbol tpm_bus_driver undefined
> linker_load_file: /boot/kernel/tpm.ko - unsupported file type
> ```
> 
> I believe it is related to:
> 
> ---
> commit 10eea8dc8c4f3d2a3495e7fb08837d91adf465e9
> Author: Justin Hibbits 
> Date:   Thu May 9 15:27:35 2024 -0400
> 
> tpm20: Support partial reads
> 
> Summary:
> In some cases the TPM utilities may read only a partial block,
> instead of a full block.  If a new command starts while in the middle
> of a read it may cause the TPM to go catatonic and no longer respond
> to SPI.
> 
> Reviewed by:kd
> Obtained from:  Juniper Networks, Inc.
> Differential Revision: https://reviews.freebsd.org/D45140
> ---
> 
> I use tpm for bhyve/Win11.
> 
> Thanks,

Sorry for the breakage.  Should be fixed by 62adeb92.

- Justin



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-17 Thread David Wolfskill
On Fri, May 17, 2024 at 08:00:05AM +0200, Emmanuel Vadot wrote:
> ...
>  Indeed, even if I know that I tested with GENERIC and amdgpu I think
> that I've only tested GENERIC-NODEBUG with i915kms.
>  Anyway, I've pushed both patches now. Sorry for the breakage.
> 
>  Cheers,
> 

Success:

g1-70(15.0-C)[1] uname -aUK
FreeBSD g1-70.catwhisker.org 15.0-CURRENT FreeBSD 15.0-CURRENT #147 
main-n270199-cd3681011001: Fri May 17 11:10:47 UTC 2024 
r...@g1-70.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY amd64 
1500018 1500018

Thank you! :-)

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Please do not mistake "authoritarian" for "conservative" -- or vice versa.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.




kldload tpm: Fail to load: "link_elf_obj: symbol tpm_bus_driver undefined"

2024-05-17 Thread Nuno Teixeira
Hello,

tpm kernel module fails to load starting on main from May 9.
Updated today and same error:

```
$ kldload tpm
kldload: an error occurred while loading module tpm. Please check dmesg(8)
for more details.

(dmesg)
link_elf_obj: symbol tpm_bus_driver undefined
linker_load_file: /boot/kernel/tpm.ko - unsupported file type
```

I believe it is related to:

---
commit 10eea8dc8c4f3d2a3495e7fb08837d91adf465e9
Author: Justin Hibbits 
Date:   Thu May 9 15:27:35 2024 -0400

tpm20: Support partial reads

Summary:
In some cases the TPM utilities may read only a partial block, instead
of a full block.  If a new command starts while in the middle of a read
it may cause the TPM to go catatonic and no longer respond to SPI.

Reviewed by:kd
Obtained from:  Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D45140
---

I use tpm for bhyve/Win11.

Thanks,
-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-17 Thread Emmanuel Vadot
On Thu, 16 May 2024 22:10:16 -0700
Ryan Libby  wrote:

> On Thu, May 16, 2024 at 9:56 PM Emmanuel Vadot  wrote:
> >
> > On Thu, 16 May 2024 10:27:40 -0700
> > Ryan Libby  wrote:
> >
> > > On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
> > > >
> > > > This is running main-n270174-abb1a1340e3f (built in-place from
> > > > main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> > > > the ports-resident kernel modules were rebuilt with the kernel,
> > > > courtesy (e.g.):
> > > >
> > > > g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> > > > PORTS_MODULES+=graphics/drm-61-kmod
> > > >
> > > > And since I dislike "sample sizes of one," I have this result on
> > > > two different laptops, each of which has both Nvidia & Intel graphics
> > > > (but for the older one (M4800), I stopped using (& building) the
> > > > Nvidia driver, since enabling it appears to disable GLX).
> > > >
> > > > Anyway: photos of the backtraces are at
> > > > https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> > > > as are copies of the build typescripts.
> > > >
> > > > Unfortunately, the panic message itself had (just) scrolled off the
> > > > top at the time I took the photos, but I hand-typed it (from the
> > > > M4800) in the Subject.
> > > >
> > > > Peace,
> > > > david
> > > > --
> > > > David H. Wolfskill  da...@catwhisker.org
> > > > Please do not mistake "authoritarian" for "conservative" -- or vice versa.
> > > >
> > > > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
> > >
> > > Maybe regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.
> > >
> > > It looks like spin_lock_init was changed to no longer zero out the
> > > mutex before calling mtx_init, but the MTX_NEW flag was not added.
> > >
> > > Ryan
> > >
> >
> > Could be, I cannot reproduce this here (either with i915kms or amdgpu)
> > but I guess that depending on the hardware version or number of screens
> > etc ... code path is different and might trigger this.
> >  David can you test with
> > https://people.freebsd.org/~manu/0001-linuxkpi-Fix-spin_lock_init.patch
> > just to be sure that it fixes this issue ?
> >
> >  Cheers,
> >
> > --
> > Emmanuel Vadot  
> 
> It may depend on getting lucky with the uninitialized junk too, and you would
> need a kernel with KASSERTs enabled.
> 
> manu, I think the rwlock patch 5c0a1923486e65cd47398e52c03cb289d6120a78
> may need the same treatment with RW_NEW.
> 
> Ryan
> 

 Indeed. Although I know that I tested with GENERIC and amdgpu, I think
that I've only tested GENERIC-NODEBUG with i915kms.
 Anyway, I've pushed both patches now. Sorry for the breakage.

 Cheers,

-- 
Emmanuel Vadot  



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread Ryan Libby
On Thu, May 16, 2024 at 9:56 PM Emmanuel Vadot  wrote:
>
> On Thu, 16 May 2024 10:27:40 -0700
> Ryan Libby  wrote:
>
> > On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
> > >
> > > This is running main-n270174-abb1a1340e3f (built in-place from
> > > main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> > > the ports-resident kernel modules were rebuilt with the kernel,
> > > courtesy (e.g.):
> > >
> > > g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> > > PORTS_MODULES+=graphics/drm-61-kmod
> > >
> > > And since I dislike "sample sizes of one," I have this result on
> > > two different laptops, each of which has both Nvidia & Intel graphics
> > > (but for the older one (M4800), I stopped using (& building) the
> > > Nvidia driver, since enabling it appears to disable GLX).
> > >
> > > Anyway: photos of the backtraces are at
> > > https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> > > as are copies of the build typescripts.
> > >
> > > Unfortunately, the panic message itself had (just) scrolled off the
> > > top at the time I took the photos, but I hand-typed it (from the
> > > M4800) in the Subject.
> > >
> > > Peace,
> > > david
> > > --
> > > David H. Wolfskill  da...@catwhisker.org
> > > Please do not mistake "authoritarian" for "conservative" -- or vice versa.
> > >
> > > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
> >
> > Maybe regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.
> >
> > It looks like spin_lock_init was changed to no longer zero out the
> > mutex before calling mtx_init, but the MTX_NEW flag was not added.
> >
> > Ryan
> >
>
> Could be, I cannot reproduce this here (either with i915kms or amdgpu)
> but I guess that depending on the hardware version or number of screens
> etc ... code path is different and might trigger this.
>  David can you test with
> https://people.freebsd.org/~manu/0001-linuxkpi-Fix-spin_lock_init.patch
> just to be sure that it fixes this issue ?
>
>  Cheers,
>
> --
> Emmanuel Vadot  

It may depend on getting lucky with the uninitialized junk too, and you would
need a kernel with KASSERTs enabled.

manu, I think the rwlock patch 5c0a1923486e65cd47398e52c03cb289d6120a78
may need the same treatment with RW_NEW.

Ryan



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread Emmanuel Vadot
On Thu, 16 May 2024 10:27:40 -0700
Ryan Libby  wrote:

> On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
> >
> > This is running main-n270174-abb1a1340e3f (built in-place from
> > main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> > the ports-resident kernel modules were rebuilt with the kernel,
> > courtesy (e.g.):
> >
> > g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> > PORTS_MODULES+=graphics/drm-61-kmod
> >
> > And since I dislike "sample sizes of one," I have this result on
> > two different laptops, each of which has both Nvidia & Intel graphics
> > (but for the older one (M4800), I stopped using (& building) the
> > Nvidia driver, since enabling it appears to disable GLX).
> >
> > Anyway: photos of the backtraces are at
> > https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> > as are copies of the build typescripts.
> >
> > Unfortunately, the panic message itself had (just) scrolled off the
> > top at the time I took the photos, but I hand-typed it (from the
> > M4800) in the Subject.
> >
> > Peace,
> > david
> > --
> > David H. Wolfskill  da...@catwhisker.org
> > Please do not mistake "authoritarian" for "conservative" -- or vice versa.
> >
> > See https://www.catwhisker.org/~david/publickey.gpg for my public key.
> 
> Maybe regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.
> 
> It looks like spin_lock_init was changed to no longer zero out the
> mutex before calling mtx_init, but the MTX_NEW flag was not added.
> 
> Ryan
> 

Could be. I cannot reproduce this here (with either i915kms or amdgpu),
but I guess that, depending on the hardware version, number of screens,
etc., the code path is different and might trigger this.
 David, can you test with
https://people.freebsd.org/~manu/0001-linuxkpi-Fix-spin_lock_init.patch
just to be sure that it fixes this issue?

 Cheers,

-- 
Emmanuel Vadot  



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Zhenlei Huang


> On May 17, 2024, at 2:26 AM, Konstantin Belousov  wrote:
> 
> On Thu, May 16, 2024 at 08:06:46PM +0800, Zhenlei Huang wrote:
>> Hi,
>> 
>> I'm currently working on https://reviews.freebsd.org/D45194 and noticed
>> that gcc behaves weirdly.
>> 
>> A simple source file to demonstrate that.
>> 
>> ```
>> # cat ctors.c
>> 
>> #include <stdio.h>
>> 
>> __attribute__((constructor(101))) void init_101() { puts("init 1"); }
>> __attribute__((constructor(65535))) void init_65535() { puts("init 3"); }
>> __attribute__((constructor)) void init() { puts("init 4"); }
>> __attribute__((constructor(65535))) void init_65535_2() { puts("init 5"); }
>> __attribute__((constructor(65534))) void init_65534() { puts("init 2"); }
>> 
>> int main() { puts("main"); }
>> 
>> __attribute__((destructor(65534))) void fini_65534() { puts("fini 2"); }
>> __attribute__((destructor(65535))) void fini_65535() { puts("fini 3"); }
>> __attribute__((destructor)) void fini() { puts("fini 4"); }
>> __attribute__((destructor(65535))) void fini_65535_2() { puts("fini 5"); }
>> __attribute__((destructor(101))) void fini_101() { puts("fini 1"); }
>> 
>> # clang ctors.c && ./a.out
>> init 1
>> init 2
>> init 3
>> init 4
>> init 5
>> main
>> fini 5
>> fini 4
>> fini 3
>> fini 2
>> fini 1
>> ```
>> 
>> Compiling with clang and the option -fno-use-init-array and running the
>> result produces the same output, which is what I expected.
> Why do you add that switch?

gcc13 in ports is not configured with the option --enable-initfini-array, so it
only produces .ctors / .dtors sections and not .init_array / .fini_array
sections. I added that switch so that clang produces `.ctors` sections instead,
as a baseline (the .ctors produced by clang does work as expected, the same as
.init_array).

> 
>> 
>> gcc13 from ports
>> ```
>> # gcc ctors.c && ./a.out
>> init 1
>> init 2
>> init 5
>> init 4
>> init 3
>> main
>> fini 3
>> fini 4
>> fini 5
>> fini 2
>> fini 1
>> ```
>> 
>> The above order is not expected. I think clang's one is correct.
>> 
>> Further hacking with readelf shows that clang produces the right order of
>> section .rela.ctors but gcc does not.
>> 
>> ```
>> # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
>> # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
>> # diff clang.txt gcc.txt
>> 3,5c3,5
>> <  00080001 R_X86_64_64 0060 init_65535_2 + 0
>> < 0008 00070001 R_X86_64_64 0040 init + 0
>> < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
>> ---
>> >  00060001 R_X86_64_64 0011 init_65535 + 0
>> > 0008 00070001 R_X86_64_64 0022 init + 0
>> > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
>> ```
>> 
>> The above clearly shows that gcc produces the wrong order in section `.rela.ctors`.
>> 
>> Is that expected behavior ?
>> 
>> I have not tried Linux version of gcc.
> Note that init array vs. init function behavior is encoded by a note added
> by crt1.o.  I suspect that the problem is that gcc port is built without
> --enable-initfini-array configure option.





Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Dag-Erling Smørgrav
Renato Botelho  writes:
> I'm not sure about a good way to test it on a running system instead.

Update your source tree, build and install world, run `sudo bsdconfig`,
scroll down and select “Network Management”, then select “Wireless
Networks”.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Konstantin Belousov
On Thu, May 16, 2024 at 08:05:57PM +, Lorenzo Salvadore wrote:
> On Thursday, May 16th, 2024 at 20:26, Konstantin Belousov  wrote:
> > > gcc13 from ports
> > > `# gcc ctors.c && ./a.out init 1 init 2 init 5 init 4 init 3 main fini 3 
> > > fini 4 fini 5 fini 2 fini 1`
> > > 
> > > The above order is not expected. I think clang's one is correct.
> > > 
> > > Further hacking with readelf shows that clang produces the right order of
> > > section .rela.ctors but gcc does not.
> > > 
> > > ```
> > > # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
> > > # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
> > > # diff clang.txt gcc.txt
> > > 3,5c3,5
> > > <  00080001 R_X86_64_64 0060 init_65535_2 + 0
> > > < 0008 00070001 R_X86_64_64 0040 init + 0
> > > < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
> > > ---
> > > >  00060001 R_X86_64_64 0011 init_65535 + 0
> > > > 0008 00070001 R_X86_64_64 0022 init + 0
> > > > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
> > > ```
> > > 
> > > The above clearly shows that gcc produces the wrong order in section
> > > `.rela.ctors`.
> > > 
> > > Is that expected behavior ?
> > > 
> > > I have not tried Linux version of gcc.
> > 
> > Note that init array vs. init function behavior is encoded by a note added
> > by crt1.o. I suspect that the problem is that gcc port is built without
> > --enable-initfini-array configure option.
> 
> Indeed, support for .init_array and .fini_array has been added to the GCC ports
> but is present in the *-devel ports only for now. I will
> soon proceed to enable it for the GCC standard ports too. lang/gcc14 is soon
> to be added to the ports tree and will have it from the beginning.
It is not 'support', but a bug.  For a very long time, crt1.o has instructed rtld
to use initarray instead of initfunc.  gcc generates broken binaries by trying
to use initfunc.

> 
> If this is indeed the issue, switching to a -devel GCC port should fix it.
> 
> Cheers,
> 
> Lorenzo Salvadore



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Lorenzo Salvadore
On Thursday, May 16th, 2024 at 20:26, Konstantin Belousov  wrote:
> > gcc13 from ports
> > `# gcc ctors.c && ./a.out init 1 init 2 init 5 init 4 init 3 main fini 3 
> > fini 4 fini 5 fini 2 fini 1`
> > 
> > The above order is not expected. I think clang's one is correct.
> > 
> > Further hacking with readelf shows that clang produces the right order of
> > section .rela.ctors but gcc does not.
> > 
> > ```
> > # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
> > # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
> > # diff clang.txt gcc.txt
> > 3,5c3,5
> > <  00080001 R_X86_64_64 0060 init_65535_2 + 0
> > < 0008 00070001 R_X86_64_64 0040 init + 0
> > < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
> > ---
> > >  00060001 R_X86_64_64 0011 init_65535 + 0
> > > 0008 00070001 R_X86_64_64 0022 init + 0
> > > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
> > ```
> > 
> > The above clearly shows that gcc produces the wrong order in section
> > `.rela.ctors`.
> > 
> > Is that expected behavior ?
> > 
> > I have not tried Linux version of gcc.
> 
> Note that init array vs. init function behavior is encoded by a note added
> by crt1.o. I suspect that the problem is that gcc port is built without
> --enable-initfini-array configure option.

Indeed, support for .init_array and .fini_array has been added to the GCC ports
but is present in the *-devel ports only for now. I will
soon proceed to enable it for the GCC standard ports too. lang/gcc14 is soon
to be added to the ports tree and will have it from the beginning.

If this is indeed the issue, switching to a -devel GCC port should fix it.

Cheers,

Lorenzo Salvadore



Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Nuno Teixeira
Hello Renato,

I will give it a try this weekend with bhyve, since I have passthrough set up
for an iwlwifi card.

Cheers,

Renato Botelho  wrote (Thursday, 16/05/2024 at 19:56):

> On 16/05/24 15:47, Jessica Clarke wrote:
> > On 16 May 2024, at 19:40, Renato Botelho  wrote:
> >>
> >> I saw some users on a .br group complaining bsdinstall was failing to
> >> setup wifi network on 15.0 snapshots and tried it myself.  I was able to
> >> reproduce the problem and also noticed another one.
> >>
> >> I noticed Network Selection screen only shows one line, it's not
> >> beautiful to navigate through items this way.  On 14.1-BETA2 it shows
> >> multiple lines so it seems to be a regression.
> >>
> >> The problem users reported was: after selecting desired network it just
> >> starts over instead of asking for password.  I made a video [1] showing the
> >> problem.
> >>
> >> Jessica, I've cc'd you because git shows you were the last person
> >> making changes in this area.  If it's not related and I made a mistake,
> >> just ignore me.
> >
> > Hi Renato,
> > I touched the code that lets you select the wireless interface in the
> > first place, but not the script that then gets called to set it up and
> > is responsible for the dialogs you see. Given the behaviour, I wonder
> > if this is what today’s import of bsddialog[1] fixes? From reading the
> > script the next dialog uses --mixedform, and restarts the script on
> > error, which it looks like is what you observe.
>
> Thanks for pointing that out, Jessica.  I'll wait for the next 15
> snapshot and will check.
>
> I'm not sure about a good way to test it on a running system instead.
>
> --
> Renato Botelho
>
>

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Renato Botelho

On 16/05/24 15:47, Jessica Clarke wrote:
> On 16 May 2024, at 19:40, Renato Botelho  wrote:
>>
>> I saw some users on a .br group complaining bsdinstall was failing to setup
>> wifi network on 15.0 snapshots and tried it myself.  I was able to reproduce
>> the problem and also noticed another one.
>>
>> I noticed Network Selection screen only shows one line, it's not beautiful to
>> navigate through items this way.  On 14.1-BETA2 it shows multiple lines so it
>> seems to be a regression.
>>
>> The problem users reported was: after selecting desired network it just starts
>> over instead of asking for password.  I made a video [1] showing the problem.
>>
>> Jessica, I've cc'd you because git shows you were the last person making
>> changes in this area.  If it's not related and I made a mistake, just ignore me.
>
> Hi Renato,
> I touched the code that lets you select the wireless interface in the
> first place, but not the script that then gets called to set it up and
> is responsible for the dialogs you see. Given the behaviour, I wonder
> if this is what today’s import of bsddialog[1] fixes? From reading the
> script the next dialog uses --mixedform, and restarts the script on
> error, which it looks like is what you observe.

Thanks for pointing that out, Jessica.  I'll wait for the next 15
snapshot and will check.

I'm not sure about a good way to test it on a running system instead.

--
Renato Botelho



Re: bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Jessica Clarke
On 16 May 2024, at 19:40, Renato Botelho  wrote:
> 
> I saw some users on a .br group complaining bsdinstall was failing to setup 
> wifi network on 15.0 snapshots and tried it myself.  I was able to reproduce 
> the problem and also noticed another one.
> 
> I noticed Network Selection screen only shows one line, it's not beautiful to 
> navigate through items this way.  On 14.1-BETA2 it shows multiple lines so it 
> seems to be a regression.
> 
> The problem users reported was: after selecting desired network it just 
> starts over instead of asking for password.  I made a video [1] showing the 
> problem.
> 
> Jessica, I've cc'd you because git shows you were the last person making 
> changes in this area.  If it's not related and I made a mistake, just ignore 
> me.

Hi Renato,
I touched the code that lets you select the wireless interface in the
first place, but not the script that then gets called to set it up and
is responsible for the dialogs you see. Given the behaviour, I wonder
if this is what today’s import of bsddialog[1] fixes? From reading the
script the next dialog uses --mixedform, and restarts the script on
error, which it looks like is what you observe.

Jess

[1] https://cgit.freebsd.org/src/commit/?id=a6d8be451f62d425b71a4874f7d4e133b9fb393c

> [1] https://youtube.com/shorts/Gmeckokw2a0
> -- 
> Renato Botelho




bsdinstall wifi setup is broken on CURRENT

2024-05-16 Thread Renato Botelho
I saw some users on a .br group complaining bsdinstall was failing to 
setup wifi network on 15.0 snapshots and tried it myself.  I was able to 
reproduce the problem and also noticed another one.


I noticed Network Selection screen only shows one line, it's not 
beautiful to navigate through items this way.  On 14.1-BETA2 it shows 
multiple lines so it seems to be a regression.


The problem users reported was: after selecting desired network it just 
starts over instead of asking for password.  I made a video [1] showing 
the problem.


Jessica, I've cc'd you because git shows you were the last person making 
changes in this area.  If it's not related and I made a mistake, just 
ignore me.


[1] https://youtube.com/shorts/Gmeckokw2a0
--
Renato Botelho



Re: gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Konstantin Belousov
On Thu, May 16, 2024 at 08:06:46PM +0800, Zhenlei Huang wrote:
> Hi,
> 
> I'm currently working on https://reviews.freebsd.org/D45194 and noticed
> that gcc behaves weirdly.
> 
> A simple source file to demonstrate that.
> 
> ```
> # cat ctors.c
> 
> #include <stdio.h>
> 
> __attribute__((constructor(101))) void init_101() { puts("init 1"); }
> __attribute__((constructor(65535))) void init_65535() { puts("init 3"); }
> __attribute__((constructor)) void init() { puts("init 4"); }
> __attribute__((constructor(65535))) void init_65535_2() { puts("init 5"); }
> __attribute__((constructor(65534))) void init_65534() { puts("init 2"); }
> 
> int main() { puts("main"); }
> 
> __attribute__((destructor(65534))) void fini_65534() { puts("fini 2"); }
> __attribute__((destructor(65535))) void fini_65535() { puts("fini 3"); }
> __attribute__((destructor)) void fini() { puts("fini 4"); }
> __attribute__((destructor(65535))) void fini_65535_2() { puts("fini 5"); }
> __attribute__((destructor(101))) void fini_101() { puts("fini 1"); }
> 
> # clang ctors.c && ./a.out
> init 1
> init 2
> init 3
> init 4
> init 5
> main
> fini 5
> fini 4
> fini 3
> fini 2
> fini 1
> ```
> 
> Compiling with clang and the option -fno-use-init-array and running the
> result produces the same output, which is what I expected.
Why do you add that switch?

> 
> gcc13 from ports
> ```
> # gcc ctors.c && ./a.out
> init 1
> init 2
> init 5
> init 4
> init 3
> main
> fini 3
> fini 4
> fini 5
> fini 2
> fini 1
> ```
> 
> The above order is not expected. I think clang's one is correct.
> 
> Further hacking with readelf shows that clang produces the right order of
> section .rela.ctors but gcc does not.
> 
> ```
> # clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
> # gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
> # diff clang.txt gcc.txt
> 3,5c3,5
> <  00080001 R_X86_64_64 0060 init_65535_2 + 0
> < 0008 00070001 R_X86_64_64 0040 init + 0
> < 0010 00060001 R_X86_64_64 0020 init_65535 + 0
> ---
> >  00060001 R_X86_64_64 0011 init_65535 + 0
> > 0008 00070001 R_X86_64_64 0022 init + 0
> > 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
> ```
> 
> The above clearly shows that gcc produces the wrong order in section `.rela.ctors`.
> 
> Is that expected behavior ?
> 
> I have not tried Linux version of gcc.
Note that init array vs. init function behavior is encoded by a note added
by crt1.o.  I suspect that the problem is that gcc port is built without
--enable-initfini-array configure option.



Re: Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread Ryan Libby
On Thu, May 16, 2024 at 6:00 AM David Wolfskill  wrote:
>
> This is running main-n270174-abb1a1340e3f (built in-place from
> main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
> the ports-resident kernel modules were rebuilt with the kernel,
> courtesy (e.g.):
>
> g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
> PORTS_MODULES+=graphics/drm-61-kmod
>
> And since I dislike "sample sizes of one," I have this result on
> two different laptops, each of which has both Nvidia & Intel graphics
> (but for the older one (M4800), I stopped using (& building) the
> Nvidia driver, since enabling it appears to disable GLX).
>
> Anyway: photos of the backtraces are at
> https://www.catwhisker.org/~david/FreeBSD/head/n270174/
> as are copies of the build typescripts.
>
> Unfortunately, the panic message itself had (just) scrolled off the
> top at the time I took the photos, but I hand-typed it (from the
> M4800) in the Subject.
>
> Peace,
> david
> --
> David H. Wolfskill  da...@catwhisker.org
> Please do not mistake "authoritarian" for "conservative" -- or vice versa.
>
> See https://www.catwhisker.org/~david/publickey.gpg for my public key.

Maybe regression from ae38a1a1bfdf320089c254e4dbffdf4769d89110 by manu.

It looks like spin_lock_init was changed to no longer zero out the
mutex before calling mtx_init, but the MTX_NEW flag was not added.

Ryan
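
The failure mode described above can be pictured with a small userland
model. This is only a sketch under stated assumptions: toy_lock,
TOY_INITIALIZED and TOY_NEW are hypothetical names, with TOY_NEW standing in
for MTX_NEW, and the check is a simplified stand-in for the double-init
assertion, not the kernel code. It shows why an init over non-zeroed storage
can trip the check, why a "new" flag avoids it, and why junk can also pass
by luck.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_INITIALIZED	0x1u	/* hypothetical "lock is live" marker bit */
#define TOY_NEW		0x2u	/* hypothetical stand-in for MTX_NEW */

struct toy_lock {
	unsigned int state;
};

/*
 * Returns false in the case where a kernel built with assertions would
 * panic with "already initialized"; returns true after a successful init.
 */
bool
toy_lock_init(struct toy_lock *l, unsigned int flags)
{
	/* Without the "new" flag, stale state is treated as a double init. */
	if ((flags & TOY_NEW) == 0 && (l->state & TOY_INITIALIZED) != 0)
		return (false);
	l->state = TOY_INITIALIZED;
	return (true);
}

/* Initialise a lock whose storage is first filled with the given byte. */
bool
init_over_fill(unsigned char fill, unsigned int flags)
{
	struct toy_lock l;

	memset(&l, fill, sizeof(l));
	return (toy_lock_init(&l, flags));
}
```

In this model init_over_fill(0xff, 0) fails while init_over_fill(0xff,
TOY_NEW) succeeds, and init_over_fill(0xfe, 0) succeeds only because the
junk happens to leave the marker bit clear.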



Panic: lock "lnxspin" 0xfffff800176c0730 already initialized

2024-05-16 Thread David Wolfskill
This is running main-n270174-abb1a1340e3f (built in-place from
main-n270163-154ad8e0f88f), with ports at main-n663685-3f732745ab06;
the ports-resident kernel modules were rebuilt with the kernel,
courtesy (e.g.):

g1-70(14.1-S)[4] grep '^PORT' /etc/src.conf
PORTS_MODULES+=graphics/drm-61-kmod

And since I dislike "sample sizes of one," I have this result on
two different laptops, each of which has both Nvidia & Intel graphics
(but for the older one (M4800), I stopped using (& building) the
Nvidia driver, since enabling it appears to disable GLX).

Anyway: photos of the backtraces are at
https://www.catwhisker.org/~david/FreeBSD/head/n270174/
as are copies of the build typescripts.

Unfortunately, the panic message itself had (just) scrolled off the
top at the time I took the photos, but I hand-typed it (from the
M4800) in the Subject.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Please do not mistake "authoritarian" for "conservative" -- or vice versa.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.




gcc behavior of init priority of .ctors and .dtors section

2024-05-16 Thread Zhenlei Huang
Hi,

I'm currently working on https://reviews.freebsd.org/D45194 and noticed
that gcc behaves weirdly.

A simple source file to demonstrate that.

```
# cat ctors.c

#include <stdio.h>

__attribute__((constructor(101))) void init_101() { puts("init 1"); }
__attribute__((constructor(65535))) void init_65535() { puts("init 3"); }
__attribute__((constructor)) void init() { puts("init 4"); }
__attribute__((constructor(65535))) void init_65535_2() { puts("init 5"); }
__attribute__((constructor(65534))) void init_65534() { puts("init 2"); }

int main() { puts("main"); }

__attribute__((destructor(65534))) void fini_65534() { puts("fini 2"); }
__attribute__((destructor(65535))) void fini_65535() { puts("fini 3"); }
__attribute__((destructor)) void fini() { puts("fini 4"); }
__attribute__((destructor(65535))) void fini_65535_2() { puts("fini 5"); }
__attribute__((destructor(101))) void fini_101() { puts("fini 1"); }

# clang ctors.c && ./a.out
init 1
init 2
init 3
init 4
init 5
main
fini 5
fini 4
fini 3
fini 2
fini 1
```

Compiling with clang and the option -fno-use-init-array and running the
result produces the same output, which is what I expected.

gcc13 from ports
```
# gcc ctors.c && ./a.out
init 1
init 2
init 5
init 4
init 3
main
fini 3
fini 4
fini 5
fini 2
fini 1
```

The above order is not expected. I think clang's one is correct.
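
The order seen with clang can be modelled as a stable sort by ascending
priority, with ties kept in source order. The sketch below is illustrative
only: struct ctor, sort_ctors and expected_init_order are hypothetical
names, and treating a constructor without an explicit priority as 65535 is
an assumption chosen because it matches the clang output above.

```c
#include <stddef.h>

#define DEFAULT_PRIO 65535	/* assumption: matches the clang output above */

struct ctor {
	int prio;		/* constructor(N) argument, or DEFAULT_PRIO */
	const char *label;	/* the string the function prints */
};

/* Stable insertion sort by ascending priority; ties keep source order. */
static void
sort_ctors(struct ctor *v, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		struct ctor key = v[i];
		size_t j = i;

		while (j > 0 && v[j - 1].prio > key.prio) {
			v[j] = v[j - 1];
			j--;
		}
		v[j] = key;
	}
}

/* Fill out[] with the labels in the modelled run order; returns the count. */
size_t
expected_init_order(const char **out)
{
	/* Same priorities and source order as the constructors in ctors.c. */
	struct ctor v[] = {
		{ 101, "init 1" },
		{ 65535, "init 3" },
		{ DEFAULT_PRIO, "init 4" },
		{ 65535, "init 5" },
		{ 65534, "init 2" },
	};
	size_t n = sizeof(v) / sizeof(v[0]);

	sort_ctors(v, n);
	for (size_t i = 0; i < n; i++)
		out[i] = v[i].label;
	return (n);
}
```

Under this model the destructors would run in the opposite order, which is
also what the clang run shows.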

Further hacking with readelf shows that clang produces the right order of
section .rela.ctors but gcc does not.

```
# clang -fno-use-init-array -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > clang.txt
# gcc -c ctors.c && readelf -r ctors.o | grep 'Relocation section with addend (.rela.ctors)' -A5 > gcc.txt
# diff clang.txt gcc.txt
3,5c3,5
<  00080001 R_X86_64_64 0060 init_65535_2 + 0
< 0008 00070001 R_X86_64_64 0040 init + 0
< 0010 00060001 R_X86_64_64 0020 init_65535 + 0
---
>  00060001 R_X86_64_64 0011 init_65535 + 0
> 0008 00070001 R_X86_64_64 0022 init + 0
> 0010 00080001 R_X86_64_64 0033 init_65535_2 + 0
```

The above clearly shows that gcc produces the wrong order in section `.rela.ctors`.

Is that expected behavior ?

I have not tried Linux version of gcc.
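
One way to relate the relocation order in the diff to the run order seen
earlier is a toy model in which a .ctors-style table is walked from its last
entry to its first. That traversal direction is an assumption that happens
to fit both outputs in this message; it is not a statement about what
FreeBSD's startup code does. The table contents are the three
priority-65535 entries in the order the diff lists them, and the function
names are hypothetical.

```c
#include <stddef.h>

#define NENT 3

/* Table order per the readelf diff: offsets 0, 8 and 0x10 of .rela.ctors. */
static const char *const gcc_table[NENT]   = { "init 3", "init 4", "init 5" };
static const char *const clang_table[NENT] = { "init 5", "init 4", "init 3" };

/* Assumption: a .ctors-style table is walked from last entry to first. */
static void
walk_backward(const char *const *table, const char **out)
{
	for (size_t i = 0; i < NENT; i++)
		out[i] = table[NENT - 1 - i];
}

/* Modelled run order of the gcc-built table. */
void
gcc_run_order(const char **out)
{
	walk_backward(gcc_table, out);
}

/* Modelled run order of the clang-built table. */
void
clang_run_order(const char **out)
{
	walk_backward(clang_table, out);
}
```

With this model the gcc table yields init 5, init 4, init 3 and the clang
table yields init 3, init 4, init 5, matching the two runs shown above.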


Best regards,
Zhenlei




Re: pkg scripts need updating

2024-05-15 Thread Stefan Esser




On 15.05.24 at 02:21, Enji Cooper wrote:



On May 14, 2024, at 7:19 AM, Michael Butler  wrote:

After commit aa48259f337100e79933d660fec8856371f761ed to src, which removed
security_daily_compat_var, I get these warnings daily:

aaron.protected-networks.net login failures:

aaron.protected-networks.net refused connections:
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: 
not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: 
not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: 
not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: 
not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: 
not found

Checking for security vulnerabilities in base (userland & kernel):
Database fetched: 2024-05-12T14:16-04:00
0 problem(s) in 0 installed package(s) found.
0 problem(s) in 0 installed package(s) found.
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not 
found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not 
found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not 
found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not 
found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not 
found

Checking for packages with security vulnerabilities:
Database fetched: 2024-05-12T14:16-04:00
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: 
not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: 
not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: 
not found

Checking for packages with mismatched checksums:


Have you tried emailing the issue to the committer/filing a bug report to bring 
this to their attention?
Cheers,


The messages are caused by running:

/usr/local/etc/periodic/security/405.pkg-base-audit
/usr/local/etc/periodic/security/460.pkg-checksum
/usr/local/etc/periodic/security/410.pkg-audit

These scripts have been installed by pkg-1.12.2 on my system ...

Best regards, STefan



Re: Unfamiliar console message: in prompt_tty(): caught signal 2

2024-05-14 Thread Enji Cooper

> On Apr 21, 2024, at 1:48 PM, bob prohaska  wrote:
> 
> On Sun, Apr 21, 2024 at 10:16:55PM +0200, Dag-Erling Smørgrav wrote:
>> bob prohaska  writes:
>>> Apr 20 22:14:37 www su[30398]: in prompt_tty(): caught signal 2
>> 
>> This means someone ran `su` and pressed Ctrl-C instead of entering a
>> password when prompted.
> 
> Ahh, that would have been me. Thank you!

Logging SIGINT seems kind of odd, given that it would probably be a regular 
occurrence (to me at least)…
-Enji



Re: pkg scripts need updating

2024-05-14 Thread Enji Cooper

> On May 14, 2024, at 7:19 AM, Michael Butler  
> wrote:
> 
> After commit aa48259f337100e79933d660fec8856371f761ed to src which removed 
> security_daily_compat_var, I get these warnings daily..
> 
> aaron.protected-networks.net login failures:
> 
> aaron.protected-networks.net refused connections:
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
>
> Checking for security vulnerabilities in base (userland & kernel):
> Database fetched: 2024-05-12T14:16-04:00
> 0 problem(s) in 0 installed package(s) found.
> 0 problem(s) in 0 installed package(s) found.
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
>
> Checking for packages with security vulnerabilities:
> Database fetched: 2024-05-12T14:16-04:00
> /usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
> /usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
> 
> Checking for packages with mismatched checksums:

Have you tried emailing the issue to the committer/filing a bug report to bring 
this to their attention?
Cheers,
-Enji



Re: Graph of the FreeBSD memory fragmentation

2024-05-14 Thread Ryan Libby
On Tue, May 14, 2024 at 9:09 AM Ryan Libby  wrote:
>
> On Tue, May 14, 2024 at 1:14 AM Alexander Leidinger
>  wrote:
> >
> > On 2024-05-14 03:54, Ryan Libby wrote:
> > > That was a long winded way of saying: the "UMA bucket" axis is
> > > actually "vm phys free list order".
> > >
> > > That said, I find that dimension confusing because in fact there's
> > > just one piece of information there, the average size of a free list
> > > entry, and it doesn't actually depend on the free list order.  The
> > > graph could be 2D.
> >
> > It evolved into that...
> > At first I had a 3 dimensional dataset and the first try was to plot it
> > as is (3D). The outcome (as points) was not as good as I wanted it to
> > be, and plotting as lines gave the wrong direction of lines. I massaged
> > the plotting instructions until it looked good enough. I did not try a
> > 2D plot. I agree, with different colors for each free list order a 2D
> > plot may work too. If a 2D plot is better than a 3D plot in this case,
> > depends on the mental model of the topic the viewer has. One size may
> > not fit all. Feel free to experiment with other plotting styles.
> >
>
> What I mean is that the 13 values in the depth dimension (now "freelist
> size") are actually all showing the same information -- except for
> integer truncation issues and having clamped the negative values at
> -1000.  Each index value for a given order completely determines the
> values for the other orders at a given time point.
>
> In the patch (D40575) this is
> return (1000 -
> ((info.free_pages * 1000) / (1 << order) / info.free_blocks));
> but notice that free_pages and free_blocks don't depend on order, they
> are computed across all free list entries, of all orders, and are the
> same for a calculation for any order.  So for example we could solve for
> the average free list entry size by taking the value from order of 0:
> index_0 = 1000 - 1000 / 1 * free_pages / free_blocks
> avg_pages = free_pages / free_blocks = -(index_0 - 1000) / 1000
> and from that you can calculate all the other values.  Or just display
> it directly.  I'd suggest try plotting log2(avg_pages).
>
> In other words, I think just considering one value per time point is
> simpler and doesn't lose any information.
>
> > > The paper that defines this fragmentation index also says that "the
> > > fragmentation index is only meaningful when an allocation fails".  Are
> > > you actually seeing any contiguous allocations failures in your
> > > measurements?
> >
> > I'm not aware of such.
> > The index may only be meaningful for the purposes of the goal of the
> > paper when there are such failures, but if you look at the graph and how
> > it changed when Bojan changed the guard pages, I see value in the graph
> > for more than what the paper suggests.
> >
> > > Without that context, it seems like what the proposed sysctl reports
> > > is indirectly just the average size of free list entries.  We could
> > > just report that.
> >
> > The calculation of the value is part of a bigger picture. The value
> > returned is used by some other code to make decisions.
> >
> > Bye,
> > Alexander.
> >
> > --
> > http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> > http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF
>
> Okay I see that D40772 uses it, but always passes order=9, and compares
> it with threshold=300.

I see that it is not "always" as the order is actually arch dependent.

> Rearranging, it asks if the average free list
> entry size is at least 1.4 MiB.

..on amd64.

>
> Personally I'd prefer to consider values that are easy to interpret
> rather than an arbitrary index value.
>
> Ryan



Re: Graph of the FreeBSD memory fragmentation

2024-05-14 Thread Ryan Libby
On Tue, May 14, 2024 at 1:14 AM Alexander Leidinger
 wrote:
>
> On 2024-05-14 03:54, Ryan Libby wrote:
> > That was a long winded way of saying: the "UMA bucket" axis is
> > actually "vm phys free list order".
> >
> > That said, I find that dimension confusing because in fact there's
> > just one piece of information there, the average size of a free list
> > entry, and it doesn't actually depend on the free list order.  The
> > graph could be 2D.
>
> It evolved into that...
> At first I had a 3 dimensional dataset and the first try was to plot it
> as is (3D). The outcome (as points) was not as good as I wanted it to
> be, and plotting as lines gave the wrong direction of lines. I massaged
> the plotting instructions until it looked good enough. I did not try a
> 2D plot. I agree, with different colors for each free list order a 2D
> plot may work too. If a 2D plot is better than a 3D plot in this case,
> depends on the mental model of the topic the viewer has. One size may
> not fit all. Feel free to experiment with other plotting styles.
>

What I mean is that the 13 values in the depth dimension (now "freelist
size") are actually all showing the same information -- except for
integer truncation issues and having clamped the negative values at
-1000.  Each index value for a given order completely determines the
values for the other orders at a given time point.

In the patch (D40575) this is
return (1000 -
((info.free_pages * 1000) / (1 << order) / info.free_blocks));
but notice that free_pages and free_blocks don't depend on order, they
are computed across all free list entries, of all orders, and are the
same for a calculation for any order.  So for example we could solve for
the average free list entry size by taking the value from order of 0:
index_0 = 1000 - 1000 / 1 * free_pages / free_blocks
avg_pages = free_pages / free_blocks = -(index_0 - 1000) / 1000
and from that you can calculate all the other values.  Or just display
it directly.  I'd suggest try plotting log2(avg_pages).

In other words, I think just considering one value per time point is
simpler and doesn't lose any information.
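
The point can be checked with a small Python sketch. It mirrors the integer expression quoted from D40575 and then rebuilds the index for every order from the order-0 value alone. The sample numbers and the helper names `frag_index` and `avg_pages_from_index0` are made up for illustration, and the inversion is exact only up to the integer truncation mentioned above.

```python
import math


def frag_index(free_pages, free_blocks, order):
    """Integer form of the expression quoted from D40575 (positive inputs only)."""
    return 1000 - (free_pages * 1000) // (1 << order) // free_blocks


def avg_pages_from_index0(index_0):
    """Invert the order-0 index to recover free_pages / free_blocks."""
    return (1000 - index_0) / 1000


# Made-up sample: 4096 free pages spread over 64 free blocks.
free_pages, free_blocks = 4096, 64

index_0 = frag_index(free_pages, free_blocks, 0)
avg_pages = avg_pages_from_index0(index_0)

# Rebuild the index for all 13 orders from avg_pages alone ...
rebuilt = [1000 - int(avg_pages * 1000) // (1 << order) for order in range(13)]
# ... and compare with the values computed directly from the raw counters.
direct = [frag_index(free_pages, free_blocks, order) for order in range(13)]

print(index_0, avg_pages, math.log2(avg_pages))
print(rebuilt == direct)
```

For this sample the two lists agree, which is the sense in which the 13 per-order values carry a single piece of information.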

> > The paper that defines this fragmentation index also says that "the
> > fragmentation index is only meaningful when an allocation fails".  Are
> > you actually seeing any contiguous allocations failures in your
> > measurements?
>
> I'm not aware of such.
> The index may only be meaningful for the purposes of the goal of the
> paper when there are such failures, but if you look at the graph and how
> it changed when Bojan changed the guard pages, I see value in the graph
> for more than what the paper suggests.
>
> > Without that context, it seems like what the proposed sysctl reports
> > is indirectly just the average size of free list entries.  We could
> > just report that.
>
> The calculation of the value is part of a bigger picture. The value
> returned is used by some other code to make decisions.
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF

Okay I see that D40772 uses it, but always passes order=9, and compares
it with threshold=300.  Rearranging, it asks if the average free list
entry size is at least 1.4 MiB.
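
The rearrangement can be sketched as follows. This assumes 4 KiB base pages (`PAGE_SIZE` is an assumption, as is the helper name `min_avg_pages`) and ignores integer truncation; it only solves the index formula for the average block size at which the index equals the threshold.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def min_avg_pages(order, threshold):
    """Average free block size (in pages) at which the index equals the threshold."""
    # index = 1000 - 1000 * avg_pages / 2**order
    # =>  avg_pages = (1000 - index) * 2**order / 1000
    return (1000 - threshold) * (1 << order) / 1000


pages = min_avg_pages(order=9, threshold=300)
mib = pages * PAGE_SIZE / (1024 * 1024)
print(pages, round(mib, 2))
```

With order=9 and threshold=300 this comes out at roughly 358 pages, which is where the 1.4 MiB figure comes from.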

Personally I'd prefer to consider values that are easy to interpret
rather than an arbitrary index value.

Ryan



pkg scripts need updating

2024-05-14 Thread Michael Butler
After commit aa48259f337100e79933d660fec8856371f761ed to src which 
removed security_daily_compat_var, I get these warnings daily..


aaron.protected-networks.net login failures:

aaron.protected-networks.net refused connections:
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/405.pkg-base-audit: security_daily_compat_var: not found


Checking for security vulnerabilities in base (userland & kernel):
Database fetched: 2024-05-12T14:16-04:00
0 problem(s) in 0 installed package(s) found.
0 problem(s) in 0 installed package(s) found.
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found
/usr/local/etc/periodic/security/410.pkg-audit: security_daily_compat_var: not found


Checking for packages with security vulnerabilities:
Database fetched: 2024-05-12T14:16-04:00
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found
/usr/local/etc/periodic/security/460.pkg-checksum: security_daily_compat_var: not found


Checking for packages with mismatched checksums:




Re: Graph of the FreeBSD memory fragmentation

2024-05-14 Thread Alexander Leidinger

On 2024-05-14 03:54, Ryan Libby wrote:

> That was a long winded way of saying: the "UMA bucket" axis is
> actually "vm phys free list order".
>
> That said, I find that dimension confusing because in fact there's
> just one piece of information there, the average size of a free list
> entry, and it doesn't actually depend on the free list order.  The
> graph could be 2D.

It evolved into that...
At first I had a 3 dimensional dataset and the first try was to plot it
as is (3D). The outcome (as points) was not as good as I wanted it to
be, and plotting as lines gave the wrong direction of lines. I massaged
the plotting instructions until it looked good enough. I did not try a
2D plot. I agree, with different colors for each free list order a 2D
plot may work too. Whether a 2D plot is better than a 3D plot in this
case depends on the mental model the viewer has of the topic. One size
may not fit all. Feel free to experiment with other plotting styles.

> The paper that defines this fragmentation index also says that "the
> fragmentation index is only meaningful when an allocation fails".  Are
> you actually seeing any contiguous allocations failures in your
> measurements?

I'm not aware of such.
The index may only be meaningful for the purposes of the goal of the
paper when there are such failures, but if you look at the graph and how
it changed when Bojan changed the guard pages, I see value in the graph
for more than what the paper suggests.

> Without that context, it seems like what the proposed sysctl reports
> is indirectly just the average size of free list entries.  We could
> just report that.

The calculation of the value is part of a bigger picture. The value
returned is used by some other code to make decisions.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF




Re: Graph of the FreeBSD memory fragmentation

2024-05-13 Thread Ryan Libby
On Thu, May 9, 2024 at 2:36 AM Alexander Leidinger
 wrote:
>
> On 2024-05-08 18:45, Bojan Novković wrote:
> > Hi,
> >
> > On 5/7/24 14:02, Alexander Leidinger wrote:
> >
> >> Hi,
> >>
> >> I created some graphs of the memory fragmentation.
> >> https://www.leidinger.net/blog/2024/05/07/plotting-the-freebsd-memory-fragmentation/
> >>
> >> My goal was not comparing a specific change on a given benchmark, but
> >> to "have something which visualizes memory fragmentation". As part of
> >> that, Bojans commit
> >> https://cgit.freebsd.org/src/commit/?id=7a79d066976149349ecb90240d02eed0c4268737
> >> was just in the middle of my data collection. I have the impression
> >> that it made a positive difference in my non deterministic workload.
> > Thank you for working on this, the plots look great!
> > They provide a really clean visual overview of what's happening.
> > I'm working on another type of memory visualization which might
> > interest you, I'll share it with you once its done.
> > One small nit - the fragmentation index does not quantify fragmentation
> > for UMA buckets, but for page allocator freelists.
>
> Do I get it more correctly now: UMA buckets are type/structure specific
> allocation lists, and the page allocator freelists are size-specific
> allocation lists (which are used by UMA when no free item is available
> in a bucket)?
>

Yeah, that's more correct.  UMA is a higher level allocator than the
vm phys free lists.  The latter is what the proposed sysctl measures.

That measurement is not directly related to UMA or how UMA allocates
pages for most zones, except the handful of UMA_ZONE_CONTIG zones.
Because otherwise, UMA doesn't explicitly do or require allocations of
contiguous pages.

Most allocations by UMA of backing pages are done as single pages from
the perspective of the vm phys free lists.  When possible (slab size
of one page and machine architecture support) it then uses direct map
addresses to refer to items.  Otherwise the backing pages (single or
multiple not-necessarily-contiguous) are mapped into kernel virtual
address space.

Re malloc(9), "small" (up to 64 KB) allocations are served from uma
zone "buckets" of various sizes, while larger allocations skip UMA and
allocate kmem directly in a similar way to how UMA does allocations
for large slabs (as in multiple individual pages are mapped).

UMA "buckets" in the sense of struct uma_bucket are caches of free
items.  For non-cache zones, if buckets are exhausted then UMA goes to
slabs.  If slabs are exhausted then UMA allocates pages.  Pages will
first be served from the page cache, before the vm phys free lists.
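
The fall-through order described above can be pictured with a deliberately simplified Python toy. It models only the lookup order (bucket, slab, page cache, vm phys free lists) and ignores refilling, per-CPU caches and everything else the real allocator does; the tier names and the `allocate` helper are illustrative, not kernel interfaces.

```python
def allocate(tiers):
    """tiers: ordered list of [name, available_count]; take from the first non-empty tier."""
    for tier in tiers:
        if tier[1] > 0:
            tier[1] -= 1
            return tier[0]
    return None  # every tier is exhausted


# One item available at each level, in the order described in the text.
tiers = [
    ["bucket", 1],
    ["slab", 1],
    ["page cache", 1],
    ["vm phys free list", 1],
]

served = [allocate(tiers) for _ in range(5)]
print(served)
```

Only the last tier in this toy corresponds to what the proposed sysctl measures.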

That was a long winded way of saying: the "UMA bucket" axis is
actually "vm phys free list order".

That said, I find that dimension confusing because in fact there's
just one piece of information there, the average size of a free list
entry, and it doesn't actually depend on the free list order.  The
graph could be 2D.

The paper that defines this fragmentation index also says that "the
fragmentation index is only meaningful when an allocation fails".  Are
you actually seeing any contiguous allocations failures in your
measurements?

Without that context, it seems like what the proposed sysctl reports
is indirectly just the average size of free list entries.  We could
just report that.

> >> Is there anything which prevents https://reviews.freebsd.org/D40575 to
> >> be committed?
> > D40575 is closely tied to the compaction patch (D40772) which is
> > currently on hold until another issue is solved (see D45046 and related
> > revisions for more details).
>
> Any idea about https://reviews.freebsd.org/D16620 ? Is D45046 supposed
> to replace this, or is it about something else?
> I wanted to try D16620, but it doesn't apply and my naive/mechanical way
> of applying it panics.
>
> > I didn't consider landing D40575 because of that, but I guess it could
> > be useful on its own.
>
> It at least gives a way to quantify with numbers resp. qualitatively
> visualize. And as such it may help in visualizing differences like with
> your guard-pages commit. I wonder if the segregation of nofree
> allocations may result in a similar improvement for long-running
> systems.
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF

Ryan



Re: Graph of the FreeBSD memory fragmentation

2024-05-09 Thread Alexander Leidinger

On 2024-05-08 18:45, Bojan Novković wrote:

> Hi,
>
> On 5/7/24 14:02, Alexander Leidinger wrote:
>
>> Hi,
>>
>> I created some graphs of the memory fragmentation.
>> https://www.leidinger.net/blog/2024/05/07/plotting-the-freebsd-memory-fragmentation/
>>
>> My goal was not comparing a specific change on a given benchmark, but
>> to "have something which visualizes memory fragmentation". As part of
>> that, Bojan's commit
>> https://cgit.freebsd.org/src/commit/?id=7a79d066976149349ecb90240d02eed0c4268737
>> was just in the middle of my data collection. I have the impression
>> that it made a positive difference in my non-deterministic workload.
>
> Thank you for working on this, the plots look great!
> They provide a really clean visual overview of what's happening.
> I'm working on another type of memory visualization which might
> interest you, I'll share it with you once it's done.
> One small nit - the fragmentation index does not quantify fragmentation
> for UMA buckets, but for page allocator freelists.

Do I get it more correctly now: UMA buckets are type/structure specific
allocation lists, and the page allocator freelists are size-specific
allocation lists (which are used by UMA when no free item is available
in a bucket)?

>> Is there anything which prevents https://reviews.freebsd.org/D40575 from
>> being committed?
>
> D40575 is closely tied to the compaction patch (D40772) which is
> currently on hold until another issue is solved (see D45046 and related
> revisions for more details).

Any idea about https://reviews.freebsd.org/D16620 ? Is D45046 supposed
to replace this, or is it about something else?
I wanted to try D16620, but it doesn't apply and my naive/mechanical way
of applying it panics.

> I didn't consider landing D40575 because of that, but I guess it could
> be useful on its own.

It at least gives a way to quantify fragmentation with numbers or to
visualize it qualitatively. And as such it may help in visualizing
differences like with your guard-pages commit. I wonder if the
segregation of nofree allocations may result in a similar improvement
for long-running systems.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF




Re: Graph of the FreeBSD memory fragmentation

2024-05-08 Thread Mike Jakubik
Hi Alex,

No, I can't comment on the C code or its change impact otherwise. But the
graphs are impressive; I say let's try it. I can test on 14-stable.

Ty.

On Tue, May 7, 2024 at 8:03 AM Alexander Leidinger 
wrote:

> Hi,
>
> I created some graphs of the memory fragmentation.
>
>
> https://www.leidinger.net/blog/2024/05/07/plotting-the-freebsd-memory-fragmentation/
>
> My goal was not comparing a specific change on a given benchmark, but to
> "have something which visualizes memory fragmentation". As part of that,
> Bojans commit
>
> https://cgit.freebsd.org/src/commit/?id=7a79d066976149349ecb90240d02eed0c4268737
> was just in the middle of my data collection. I have the impression that
> it made a positive difference in my non deterministic workload.
>
> Is there anything which prevents https://reviews.freebsd.org/D40575 to
> be committed?
>
> Maybe some other people want to have a look at the memory fragmentation
> and some of Bojans work
> (
> https://wiki.freebsd.org/SummerOfCode2023Projects/PhysicalMemoryAntiFragmentationMechanisms
> ).
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF
>


Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, . . .] [Update to Host OSVERSION 1500018 did not help]

2024-05-08 Thread Philip Paeps

On 2024-05-08 23:53:57 (+0800), Mark Millard wrote:


On Apr 29, 2024, at 20:16, Mark Millard  wrote:


On Apr 29, 2024, at 20:11, Mark Millard  wrote:


On Apr 29, 2024, at 19:54, Mark Millard  wrote:


On Apr 28, 2024, at 18:06, Philip Paeps  wrote:


On 2024-04-18 23:14:22 (+0800), Mark Millard wrote:
On Apr 18, 2024, at 08:02, Mark Millard  
wrote:

void  wrote on
Date: Thu, 18 Apr 2024 14:08:36 UTC :


Not sure where to post this..

The last bulk build for arm64 appears to have happened around
mid-March on ampere2. Is it broken?


main-armv7 building is broken and the last completed build
was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
gets stuck making no progress until manually forced to stop,
which leads to huge elapsed times for the incomplete builds:

[...]

My guess is that FreeBSD has something that broken after bd45bbe440
that was broken as of f5f08e41aa and was still broken at 75464941dc .




One thing of possible note:

Failing . . .

Host OSVERSION: 156
Jail OSVERSION: 1500014


I have finished a package builder refresh this morning.  All our 
builder hosts (except PowerPC - I don't touch those) are now on 
main-n269671-feabaf8d5389 (OSVERSION 1500018).


ampere1 successfully finished its 140releng-armv7-quarterly build, 
so it looks like the problem with stuck builds was limited to 
ampere2 building main-armv7.  I'll keep a close eye on this one 
when it starts its next build.




I see that main-armv7 started.

It queued only 31935 instead of the prior 34528 (or more): it is doing an
incremental build instead of a full build. For example, pkg was not built
but instead the prior build is in use. Thus bad results from the prior
build might be involved in this new build.

I'd recommend forcing a full "poudriere bulk -c -a" that does a from-scratch
build for the purposes of the main-armv7 test.


Actually the test is not going to provide the information we are
after as things are.

giflib-5.2.2 failed to build, which leads to devel/doxygen being
skipped. devel/doxygen was the first one to hang up in the prior
2 failing attempts, if I remember right.

giflib-5.2.2 also causes graphics/graphviz to be skipped.
graphics/graphviz was installed just before the hangup in all of
the example hangups. So the context will not be replicated.

We need graphics/giflib to build to actually do the test.


Looks like:

https://cgit.freebsd.org/ports/commit/graphics/giflib?id=5007109903fc271e3ef0ba01d78781c1fed99f3f

is the fix for the graphics/giflib build failure.


Well, main-armv7 is building again and things are still
getting stuck. So much for my idea. For reference I
list the over 10-hr-so-far ones:

doxygen-1.9.6_1,2     build-depends  13:03:54
py39-pydot-2.0.0      run-depends    12:24:04
py39-pygraphviz-1.6   lib-depends    12:10:38

It would likely be appropriate to get a copy of the output
of "ps -alxdww".

"procstat -k -k" usage and the like on stuck processes
would probably be appropriate.

Does anyone with appropriate investigative background
have login access to ampere2 to take a look at what
is getting stuck?


This is unfortunate.  I'm sure I have the appropriate background, but 
I'm spread very thin!  I'll get as much information as I can about this 
machine while it's stuck, before I bounce it again.


I think it may be worth a try building those ports in isolation on 
ref14-aarch64, and see what they're trying to do.  I'll also set up a 
set of refX-armv7 jails on that machine.


Hopefully we can get to the bottom of this soon.  This is a very tedious 
failure mode.


We could also try to put an older armv7 image on the builder jail on 
ampere2.  Depending on whether we have a sufficiently old image, that 
will either be very straightforward, or a very deep rabbit hole.


Thanks again for keeping an eye on this.  We really should have better 
monitoring for stuck builds than "Mark will tell us". :-)


Philip



Re: Graph of the FreeBSD memory fragmentation

2024-05-08 Thread Bojan Novković

Hi,

On 5/7/24 14:02, Alexander Leidinger wrote:


> Hi,
>
> I created some graphs of the memory fragmentation.
> https://www.leidinger.net/blog/2024/05/07/plotting-the-freebsd-memory-fragmentation/
>
> My goal was not comparing a specific change on a given benchmark, but
> to "have something which visualizes memory fragmentation". As part of
> that, Bojan's commit
> https://cgit.freebsd.org/src/commit/?id=7a79d066976149349ecb90240d02eed0c4268737
> was just in the middle of my data collection. I have the impression
> that it made a positive difference in my non-deterministic workload.

Thank you for working on this, the plots look great!
They provide a really clean visual overview of what's happening.
I'm working on another type of memory visualization which might interest
you, I'll share it with you once it's done.
One small nit - the fragmentation index does not quantify fragmentation
for UMA buckets, but for page allocator freelists.

> Is there anything which prevents https://reviews.freebsd.org/D40575 from
> being committed?

D40575 is closely tied to the compaction patch (D40772) which is
currently on hold until another issue is solved (see D45046 and related
revisions for more details).
I didn't consider landing D40575 because of that, but I guess it could
be useful on its own.



Bojan




Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, . . .] [Update to Host OSVERSION 1500018 did not help]

2024-05-08 Thread Mark Millard
On Apr 29, 2024, at 20:16, Mark Millard  wrote:

> On Apr 29, 2024, at 20:11, Mark Millard  wrote:
> 
>> On Apr 29, 2024, at 19:54, Mark Millard  wrote:
>> 
>>> On Apr 28, 2024, at 18:06, Philip Paeps  wrote:
>>> 
>>>> On 2024-04-18 23:14:22 (+0800), Mark Millard wrote:
>>>>> On Apr 18, 2024, at 08:02, Mark Millard wrote:
>>>>>> void wrote on
>>>>>> Date: Thu, 18 Apr 2024 14:08:36 UTC :
>>>>>>
>>>>>>> Not sure where to post this..
>>>>>>>
>>>>>>> The last bulk build for arm64 appears to have happened around
>>>>>>> mid-March on ampere2. Is it broken?
>>>>>>
>>>>>> main-armv7 building is broken and the last completed build
>>>>>> was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
>>>>>> gets stuck making no progress until manually forced to stop,
>>>>>> which leads to huge elapsed times for the incomplete builds:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>> My guess is that FreeBSD has something that broken after bd45bbe440
>>>>>> that was broken as of f5f08e41aa and was still broken at 75464941dc .
>>>>>>
>>>>>
>>>>> One thing of possible note:
>>>>>
>>>>> Failing . . .
>>>>>
>>>>> Host OSVERSION: 156
>>>>> Jail OSVERSION: 1500014
>>>>
>>>> I have finished a package builder refresh this morning.  All our builder
>>>> hosts (except PowerPC - I don't touch those) are now on
>>>> main-n269671-feabaf8d5389 (OSVERSION 1500018).
>>>>
>>>> ampere1 successfully finished its 140releng-armv7-quarterly build, so it
>>>> looks like the problem with stuck builds was limited to ampere2 building
>>>> main-armv7.  I'll keep a close eye on this one when it starts its next
>>>> build.
>>>>
>>> 
>>> I see that main-armv7 started.
>>> 
>>> It queued only 31935 instead of the prior 34528 (or more): it is doing an
>>> incremental build instead of a full build. For example, pkg was not built
>>> but instead the prior build is in use. Thus bad results from the prior
>>> build might be involved in this new build.
>>> 
>>> I'd recommend forcing a full "poudriere bulk -c -a" that does a from-scratch
>>> build for the purposes of the main-armv7 test.
>> 
>> Actually the test is not going to provide the information we are
>> after as things are.
>> 
>> giflib-5.2.2 failed to build, which leads to devel/doxygen being
>> skipped. devel/doxygen was the first one to hang up in the prior
>> 2 failing attempts, if I remember right.
>> 
>> giflib-5.2.2 also causes graphics/graphviz to be skipped.
>> graphics/graphviz was installed just before the hangup in all of
>> the example hanups. So the context will not be replicated.
>> 
>> We need graphics/giflib to build to actually do the test.
> 
> Looks like:
> 
> https://cgit.freebsd.org/ports/commit/graphics/giflib?id=5007109903fc271e3ef0ba01d78781c1fed99f3f
> 
> is the fix for the graphics/giflib build failure.

Well, main-armv7 is building again and things are still
getting stuck. So much for my idea. For reference I
list the over 10-hr-so-far ones:

doxygen-1.9.6_1,2    build-depends  13:03:54
py39-pydot-2.0.0     run-depends    12:24:04
py39-pygraphviz-1.6  lib-depends    12:10:38

It would likely be appropriate to capture the output of
"ps -alxdww".

"procstat -k -k" usage and the like on stuck processes
would probably be appropriate.

Does anyone with appropriate investigative background
have login access to ampere2 to take a look at what
is getting stuck?


===
Mark Millard
marklmi at yahoo.com




termcap.db unused?

2024-05-07 Thread Jamie Landeg-Jones
I was looking at a ktrace dump file recently (/bin/ls) and noticed that
during initialisation, it attempted to read /etc/termcap.db - as that
failed, it then read the text version pointed to by /etc/termcap

Adding a link: /etc/termcap.db -> /usr/share/misc/termcap.db

caused subsequent runs to use the termcap.db version.

Is there any reason why /etc/termcap is linked, whilst /etc/termcap.db
isn't? And if so, what's the purpose of /usr/share/misc/termcap.db ?

Cheers, Jamie



Graph of the FreeBSD memory fragmentation

2024-05-07 Thread Alexander Leidinger

Hi,

I created some graphs of the memory fragmentation.

https://www.leidinger.net/blog/2024/05/07/plotting-the-freebsd-memory-fragmentation/


My goal was not to compare a specific change on a given benchmark, but to 
"have something which visualizes memory fragmentation". As part of that, 
Bojan's commit 
https://cgit.freebsd.org/src/commit/?id=7a79d066976149349ecb90240d02eed0c4268737 
landed right in the middle of my data collection. I have the impression that 
it made a positive difference in my non-deterministic workload.


Is there anything which prevents https://reviews.freebsd.org/D40575 from 
being committed?


Maybe some other people want to have a look at the memory fragmentation 
and some of Bojan's work 
(https://wiki.freebsd.org/SummerOfCode2023Projects/PhysicalMemoryAntiFragmentationMechanisms).


Bye,
Alexander.

--
http://www.Leidinger.net  alexan...@leidinger.net : PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org    : PGP 0x8F31830F9F2772BF




FreeBSD Status Report - First Quarter 2024

2024-05-05 Thread Lorenzo Salvadore
FreeBSD Status Report First Quarter 2024

Here is the first 2024 status report, with 21 entries.

The New Year brings us many new interesting projects, such as the new libsys
that separates system calls from libc and libpthread or work on a graphical
installer for FreeBSD, which will help making our OS more user-friendly. Of
course, the usual projects keep going on, such as the work on cloud-init,
OpenStack, or the GCC ports. As usual our main teams share their progress with
us.

Have a nice read.

Lorenzo Salvadore, on behalf of the Status Team.

━━━

A rendered version of this report is available here:
https://www.freebsd.org/status/report-2024-01-2024-03/

━━━
Table of Contents

  • FreeBSD Team Reports
      □ FreeBSD Core Team
      □ FreeBSD Foundation
      □ FreeBSD Release Engineering Team
      □ Cluster Administration Team
      □ Continuous Integration
      □ Ports Collection
  • Projects
      □ Audio Stack Improvements
      □ Bhyve Improvements
      □ Graphical Installer for FreeBSD
  • Userland
      □ libsys
      □ PackageKit backend for FreeBSD pkg
  • Kernel
      □ iwlwifi(4) and wireless for 13.3-RELEASE
  • Architectures
      □ Ten64, WHLE-LS1, and HoneyComb
  • Cloud
      □ FreeBSD on Microsoft HyperV and Azure
      □ FreeBSD as a Tier 1 cloud-init Platform
      □ OpenStack on FreeBSD
  • Documentation
      □ Documentation Engineering Team
  • Ports
      □ FreshPorts: Notification of new packages
      □ GCC on FreeBSD
      □ Valgrind: port to arm64 on its way
  • Third Party Projects
      □ Containers and FreeBSD: Pot, Potluck and Potman

━━━

FreeBSD Team Reports

Entries from the various official and semi-official teams, as found in the
Administration Page.

FreeBSD Core Team

Contact: FreeBSD Core Team 

The FreeBSD Core Team is the governing body of FreeBSD.

13.3-RELEASE

FreeBSD 13.3 was released on March 5th, 2024.

The release announcement is at:

https://www.freebsd.org/releases/13.3R/announce/

Along with the release engineering team, the project dedicates 13.3-RELEASE to
Glen Barber, with thanks for his many years of contributions as Release
Engineer.

Future of 32-bit platform support

Core announced "Future of 32-bit platform support in FreeBSD", which covers
deprecating 32-bit platforms over the next couple of major releases.

Commit bits

  • Core approved the src commit bit for Bojan Novković

  • Core reactivated the src commit bits for Mark Peek, Mark Murray, and
Lawrence Stewart

━━━

FreeBSD Foundation

Links:
FreeBSD Foundation URL: https://freebsdfoundation.org/
Technology Roadmap URL: https://freebsdfoundation.org/blog/technology-roadmap/
Donate URL: https://freebsdfoundation.org/donate/
Foundation Partnership Program URL: https://freebsdfoundation.org/our-donors/freebsd-foundation-partnership-program/
FreeBSD Journal URL: https://freebsdfoundation.org/journal/
Foundation Events URL: https://freebsdfoundation.org/our-work/events/

Contact: Deb Goodkin 

The FreeBSD Foundation is a 501(c)(3) non-profit organization dedicated to
supporting and promoting the FreeBSD Project and worldwide community, and
helping to advance the state of FreeBSD. We do this in both technical and
non-technical ways. We are 100% supported by donations from individuals and
corporations and those investments help us fund the:

  • Software development projects to implement features and functionality in
FreeBSD

  • Sponsor and organize conferences and developer summits to provide
collaborative opportunities and promote FreeBSD

  • Purchase and support of hardware to improve and maintain FreeBSD
infrastructure

  • Resources to improve security, quality assurance, and continuous
integration efforts

  • Materials and staff needed to promote, educate, and advocate for FreeBSD

  • Collaboration between commercial vendors and FreeBSD developers

  • Representation of the FreeBSD Project in executing contracts, license
agreements, and other legal arrangements that require a recognized legal
entity

Operations

We kicked off the new year with ambitious goals to help move the FreeBSD
Project forward by identifying features and functionality to support in the
operating system and increasing our advocacy efforts to increase and expand the
visibility of FreeBSD. Stay tuned for a blog post that will provide more
information on our 2024 goals and plans.

We also published the 2024 Budget. In order to provide greater transparency
about the budgeting process, we wrote a blog post that provides more details on
how funding is allocated, new breakouts of some of the project expense
categories, and more details on where the funding is going.

OS Improvements

During the first 

Re: main [so: 15] amd64: Rare poudriere bulk builder "stuck in umtxq_sleep" condition (race failure?) during high-load-average "poudriere bulk -c -a" runs

2024-05-04 Thread Mark Millard
On May 4, 2024, at 09:59, Mark Millard  wrote:

> I recently did some of my rare "poudriere bulk -c -a" high-load-average
> style experiments, here on a 7950X3D (amd64) system and I ended up with
> a couple of stuck builders (one per bulk run of 2 runs). Contexts:
> 
> # uname -apKU
> FreeBSD 7950X3D-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT #142 
> main-n269589-9dcf39575efb-dirty: Sun Apr 21 07:28:55 UTC 2024 
> root@7950X3D-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64amd64/sys/GENERIC-NODBG
>  amd64 amd64 1500018 1500018
> 
> # uname -apKU
> FreeBSD 7950X3D-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT #142 
> main-n269589-9dcf39575efb-dirty: Sun Apr 21 07:28:55 UTC 2024 
> root@7950X3D-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64amd64/sys/GENERIC-NODBG
>  amd64 amd64 1500018 1500018
> 
> So: One was in a ZFS context and the other was in a UFS context.
> 
> 32 hardware threads, 32 builders, ALLOW_MAKE_JOBS=yes in use
> (no use of MAKE_JOBS_NUMBER_LIMIT or the like), USE_TMPFS=all
> in use, TMPFS_BLACKLIST in use, 192 GiBytes of RAM, 512 GiByte
> Swap partition in use, so SystemRAM+SystemSWAP being
> 704 GiBytes.
> 
> 
> I'll start with notes about the more recent UFS context experiment . . .
> 
> graphics/pinta in the UFS experiment had gotten stuck in threads
> of /usr/local/bin/mono (mono-sgen):
> 
> [05] 15:31:47 graphics/pinta | pinta-1.7.1_4  
> stage 15:28:31 2.30 GiB   0%   0%
> 
> # procstat -k -k 93415
>   PID    TID COMM                TDNAME              KSTACK
> 93415 671706 mono-sgen   -   mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
> do_lock_umutex+0x6a6 __umtx_op_wait_umutex+0x49 sys__umtx_op+0x7e 
> amd64_syscall+0x115 fast_syscall_common+0xf8 
> 93415 678651 mono-sgen   SGen worker mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
> do_wait+0x244 __umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e 
> amd64_syscall+0x115 fast_syscall_common+0xf8 
> 93415 678652 mono-sgen   Finalizer   mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
> __umtx_op_sem2_wait+0x49a sys__umtx_op+0x7e amd64_syscall+0x115 
> fast_syscall_common+0xf8 
> 93415 678655 mono-sgen   -   mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
> do_wait+0x244 __umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e 
> amd64_syscall+0x115 fast_syscall_common+0xf8 
> 93415 678660 mono-sgen   Thread Pool Wor mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
> do_lock_umutex+0x6a6 __umtx_op_wait_umutex+0x49 sys__umtx_op+0x7e 
> amd64_syscall+0x115 fast_syscall_common+0xf8
> 
> So I did a kill -9 93415 to let the bulk run complete.
> 
> I then removed my ADDITION of BROKEN to print/miktex that had gotten
> stuck in the ZFS experiment and tried in the now tiny-load-average
> UFS context: bulk print/miktex graphics/pinta
> 
> They both worked just fine, not getting stuck (UFS context):
> 
> [00:00:50] [02] [00:00:25] Finished  graphics/pinta | pinta-1.7.1_4: Success 
> ending TMPFS: 2.30 GiB
> [00:14:11] [01] [00:13:47] Finished  print/miktex | miktex-23.9_3: Success 
> ending TMPFS: 3.21 GiB
> 
> I'll note that the "procstat -k -k" for the stuck print/miktex
> in the ZFS context had looked like:
> 
> # procstat -k -k 70121
>   PID    TID COMM                TDNAME              KSTACK
> 70121 409420 miktex-ctangle  -   mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
> do_wait+0x244 __umtx_op_wait+0x53 sys__umtx_op+0x7e amd64_syscall+0x115 
> fast_syscall_common+0xf8 
> 70121 646547 miktex-ctangle  -   mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae kqueue_scan+0x9f1 
> kqueue_kevent+0x13b kern_kevent_fp+0x4b kern_kevent_generic+0xd6 
> sys_kevent+0x61 amd64_syscall+0x115 fast_syscall_common+0xf8 
> 70121 646548 miktex-ctangle  -   mi_switch+0xba 
> sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
> do_wait+0x244 __umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e 
> amd64_syscall+0x115 fast_syscall_common+0xf8
> 
> Note that, unlike the UFS context, the above also involves: kqueue_scan
> 
> It looks like there is some form of failing race(?) condition
> that can occur on amd64 --and does rarely occur in high load
> average contexts.
> 
> I've no clue how to reduce this to a simple, repeatable context.
> 

Some other oddities, including a comparison on ZFS of a run using
MUTUALLY_EXCLUSIVE_BUILD_PACKAGES with a run not using it. USE_TMPFS=all
and ALLOW_MAKE_JOBS were always in use. The combinations were:

A) ZFS 

Kenny Wayne Shepherd - Voodoo Child (Slight Return)

2024-05-04 Thread Bakul Shah

https://www.youtube.com/watch?v=oHO5a4l_8zY
Kenny Wayne Shepherd - Voodoo Child (Slight Return) - KTBA Cruise 2019
youtube.com



main [so: 15] amd64: Rare poudriere bulk builder "stuck in umtxq_sleep" condition (race failure?) during high-load-average "poudriere bulk -c -a" runs

2024-05-04 Thread Mark Millard
I recently did some of my rare "poudriere bulk -c -a" high-load-average
style experiments, here on a 7950X3D (amd64) system and I ended up with
a couple of stuck builders (one per bulk run of 2 runs). Contexts:

# uname -apKU
FreeBSD 7950X3D-UFS 15.0-CURRENT FreeBSD 15.0-CURRENT #142 
main-n269589-9dcf39575efb-dirty: Sun Apr 21 07:28:55 UTC 2024 
root@7950X3D-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64amd64/sys/GENERIC-NODBG
 amd64 amd64 1500018 1500018

# uname -apKU
FreeBSD 7950X3D-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT #142 
main-n269589-9dcf39575efb-dirty: Sun Apr 21 07:28:55 UTC 2024 
root@7950X3D-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64amd64/sys/GENERIC-NODBG
 amd64 amd64 1500018 1500018

So: One was in a ZFS context and the other was in a UFS context.

32 hardware threads, 32 builders, ALLOW_MAKE_JOBS=yes in use
(no use of MAKE_JOBS_NUMBER_LIMIT or the like), USE_TMPFS=all
in use, TMPFS_BLACKLIST in use, 192 GiBytes of RAM, 512 GiByte
Swap partition in use, so SystemRAM+SystemSWAP being
704 GiBytes.


I'll start with notes about the more recent UFS context experiment . . .

graphics/pinta in the UFS experiment had gotten stuck in threads
of /usr/local/bin/mono (mono-sgen):

[05] 15:31:47 graphics/pinta | pinta-1.7.1_4
  stage 15:28:31 2.30 GiB   0%   0%

# procstat -k -k 93415
  PID    TID COMM                TDNAME              KSTACK
93415 671706 mono-sgen   -   mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
do_lock_umutex+0x6a6 __umtx_op_wait_umutex+0x49 sys__umtx_op+0x7e 
amd64_syscall+0x115 fast_syscall_common+0xf8 
93415 678651 mono-sgen   SGen worker mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
do_wait+0x244 __umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e 
amd64_syscall+0x115 fast_syscall_common+0xf8 
93415 678652 mono-sgen   Finalizer   mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
__umtx_op_sem2_wait+0x49a sys__umtx_op+0x7e amd64_syscall+0x115 
fast_syscall_common+0xf8 
93415 678655 mono-sgen   -   mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
do_wait+0x244 __umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e 
amd64_syscall+0x115 fast_syscall_common+0xf8 
93415 678660 mono-sgen   Thread Pool Wor mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
do_lock_umutex+0x6a6 __umtx_op_wait_umutex+0x49 sys__umtx_op+0x7e 
amd64_syscall+0x115 fast_syscall_common+0xf8

So I did a kill -9 93415 to let the bulk run complete.

I then removed my ADDITION of BROKEN to print/miktex that had gotten
stuck in the ZFS experiment and tried in the now tiny-load-average
UFS context: bulk print/miktex graphics/pinta

They both worked just fine, not getting stuck (UFS context):

[00:00:50] [02] [00:00:25] Finished  graphics/pinta | pinta-1.7.1_4: Success 
ending TMPFS: 2.30 GiB
[00:14:11] [01] [00:13:47] Finished  print/miktex | miktex-23.9_3: Success 
ending TMPFS: 3.21 GiB

I'll note that the "procstat -k -k" for the stuck print/miktex
in the ZFS context had looked like:

# procstat -k -k 70121
  PID    TID COMM                TDNAME              KSTACK
70121 409420 miktex-ctangle  -   mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
do_wait+0x244 __umtx_op_wait+0x53 sys__umtx_op+0x7e amd64_syscall+0x115 
fast_syscall_common+0xf8 
70121 646547 miktex-ctangle  -   mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae kqueue_scan+0x9f1 
kqueue_kevent+0x13b kern_kevent_fp+0x4b kern_kevent_generic+0xd6 
sys_kevent+0x61 amd64_syscall+0x115 fast_syscall_common+0xf8 
70121 646548 miktex-ctangle  -   mi_switch+0xba 
sleepq_catch_signals+0x2c6 sleepq_wait_sig+0x9 _sleep+0x1ae umtxq_sleep+0x2cd 
do_wait+0x244 __umtx_op_wait_uint_private+0x54 sys__umtx_op+0x7e 
amd64_syscall+0x115 fast_syscall_common+0xf8

Note that, unlike the UFS context, the above also involves: kqueue_scan

It looks like there is some form of failing race(?) condition
that can occur on amd64 --and does rarely occur in high load
average contexts.

I've no clue how to reduce this to a simple, repeatable context.


===
Mark Millard
marklmi at yahoo.com




Re: build failure affecting port: "error: reference to 'filesystem' is ambiguous"

2024-05-02 Thread Dimitry Andric
Nice to see that upstream chose the more correct solution. :)

-Dimitry

> On 2 May 2024, at 11:57, Nuno Teixeira  wrote:
> 
> Hello Dimitry,
> 
> I quoted your words in an upstream PR and it was solved with:
> 
> Stop using namespace std
> https://github.com/amsynth/amsynth/commit/6fb79100a6254220e5adc69a1428572539ecc377
> 
> I'm applying the patch globally; it unbreaks main, and the rest of the
> supported releases don't complain about it.
> 
> Thanks!
> 
> Dimitry Andric  wrote (Tuesday, 30/04/2024 at 18:45):
> On 30 Apr 2024, at 14:26, Nuno Teixeira  wrote:
> > 
> > I'm lost on a build failure of audio/amsynth (updated to version 1.13.3) on
> > recent main.
> > The strange thing is that if I use llvm from ports, USES+=llvm, it fails with
> > the same error, so I suspect it is something related to main.
> > 
> > Any help is welcome. I haven't opened an upstream PR yet.
> > 
> > Thanks,
> > 
> > ---
> > src/Configuration.cpp:35:20: error: reference to 'filesystem' is ambiguous
> >35 | amsynthrc_fname = filesystem::get().config;
> >   |   ^
> > src/filesystem.h:27:7: note: candidate found by name lookup is 'filesystem'
> >27 | class filesystem
> >   |   ^
> > /usr/include/c++/v1/__chrono/file_clock.h:49:1: note: candidate found by 
> > name lookup is 'std::filesystem'
> >49 | _LIBCPP_BEGIN_NAMESPACE_FILESYSTEM
> >   | ^
> > /usr/include/c++/v1/__config:892:80: note: expanded from macro 
> > '_LIBCPP_BEGIN_NAMESPACE_FILESYSTEM'
> >   892 |  inline namespace __fs 
> > { namespace filesystem {
> >   | 
> >^
> 
> It looks like the program defines its own "filesystem" class, and also
> has "using namespace std;". 
> 
> Usually the easiest fix is to use "::filesystem" for the call sites that
> want to use the program's own definition.
> 
> Alternatively, rename the 'local' definition to something else, like
> "my_filesystem".
> 
> -Dimitry
> 
> 
> 
> -- 
> Nuno Teixeira
> FreeBSD UNIX: Web:  https://FreeBSD.org




Re: build failure affecting port: "error: reference to 'filesystem' is ambiguous"

2024-05-02 Thread Nuno Teixeira
Hello Dimitry,

I quoted your words in an upstream PR and it was solved with:

Stop using namespace std
https://github.com/amsynth/amsynth/commit/6fb79100a6254220e5adc69a1428572539ecc377

I'm applying the patch globally; it unbreaks main, and the rest of the
supported releases don't complain about it.
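
For anyone reading along, here is a minimal sketch of why dropping the
using-directive helps. It is not the upstream patch: the class body and
the path string are hypothetical stand-ins, and only the class name,
get(), and the default_bank member are taken from the compiler output.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical stand-in for the program's own class (body is illustrative).
class filesystem {
public:
    static filesystem &get() {
        static filesystem instance;
        return instance;
    }
    std::string default_bank = "/hypothetical/banks/default.bank";
};

// No "using namespace std;" in this file, so the unqualified name can
// only mean the program's own class.
std::string default_bank_path() {
    return filesystem::get().default_bank;
}

// The standard library facility stays reachable through its full name.
std::string bank_file_name() {
    return std::filesystem::path(default_bank_path()).filename().string();
}
```

The cost of this approach is that every standard name in the file has to
be spelled with its std:: prefix.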

Thanks!

Dimitry Andric  wrote (Tuesday, 30/04/2024 at 18:45):

> On 30 Apr 2024, at 14:26, Nuno Teixeira  wrote:
> >
> > I'm lost on a build failure of audio/amsynth (updated to version 1.13.3)
> > on recent main.
> > The strange thing is that if I use llvm from ports, USES+=llvm, it fails
> > with the same error, so I suspect it is something related to main.
> >
> > Any help is welcome. I haven't opened an upstream PR yet.
> >
> > Thanks,
> >
> > ---
> > src/Configuration.cpp:35:20: error: reference to 'filesystem' is
> ambiguous
> >35 | amsynthrc_fname = filesystem::get().config;
> >   |   ^
> > src/filesystem.h:27:7: note: candidate found by name lookup is
> 'filesystem'
> >27 | class filesystem
> >   |   ^
> > /usr/include/c++/v1/__chrono/file_clock.h:49:1: note: candidate found by
> name lookup is 'std::filesystem'
> >49 | _LIBCPP_BEGIN_NAMESPACE_FILESYSTEM
> >   | ^
> > /usr/include/c++/v1/__config:892:80: note: expanded from macro
> '_LIBCPP_BEGIN_NAMESPACE_FILESYSTEM'
> >   892 |  inline namespace
> __fs { namespace filesystem {
> >   |
>   ^
>
> It looks like the program defines its own "filesystem" class, and also
> has "using namespace std;".
>
> Usually the easiest fix is to use "::filesystem" for the call sites that
> want to use the program's own definition.
>
> Alternatively, rename the 'local' definition to something else, like
> "my_filesystem".
>
> -Dimitry
>
>

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


14.1-PRERELEASE boot stuck on feeding entropy

2024-05-02 Thread cglogic
Hello,

After upgrading from February's 14-stable amd64 to 14.1-PRERELEASE amd64, one of
my systems can't boot anymore. The bootloader was, of course, updated.
It freezes at the "Feeding entropy ." message.
The system is installed on a pretty old Supermicro server with dual Intel Xeon
E5620 CPUs, uses the BIOS boot method, and has root on ZFS.
These CPUs have no integrated random number generator, if I recall correctly.

When I managed to boot this machine from recent 14.1-PRERELEASE installation
media, mounted zroot on a temporary directory, removed the /boot/entropy file,
and rebooted, the system booted.
However, on the next reboot it failed to boot again, with the same
"Feeding entropy ." last message.

More modern hardware is not affected by this issue.

Thanks.

Re: build failure affecting port: "error: reference to 'filesystem' is ambiguous"

2024-04-30 Thread Dimitry Andric
On 30 Apr 2024, at 14:26, Nuno Teixeira  wrote:
> 
> I'm lost on a build failure of audio/amsynth (updated to version 1.13.3) on
> recent main.
> The strange thing is that if I use llvm from ports, USES+=llvm, it fails with
> the same error, so I suspect it is something related to main.
> 
> Any help is welcome. I haven't opened an upstream PR yet.
> 
> Thanks,
> 
> ---
> src/Configuration.cpp:35:20: error: reference to 'filesystem' is ambiguous
>35 | amsynthrc_fname = filesystem::get().config;
>   |   ^
> src/filesystem.h:27:7: note: candidate found by name lookup is 'filesystem'
>27 | class filesystem
>   |   ^
> /usr/include/c++/v1/__chrono/file_clock.h:49:1: note: candidate found by name 
> lookup is 'std::filesystem'
>49 | _LIBCPP_BEGIN_NAMESPACE_FILESYSTEM
>   | ^
> /usr/include/c++/v1/__config:892:80: note: expanded from macro 
> '_LIBCPP_BEGIN_NAMESPACE_FILESYSTEM'
>   892 |  inline namespace __fs { 
> namespace filesystem {
>   |   
>  ^

It looks like the program defines its own "filesystem" class, and also
has "using namespace std;". 

Usually the easiest fix is to use "::filesystem" for the call sites that
want to use the program's own definition.

Alternatively, rename the 'local' definition to something else, like
"my_filesystem".
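
A minimal sketch of the first suggestion follows. The class body and the
path string are hypothetical stand-ins, not amsynth's actual code; only
the class name, get(), and the config member come from the error output
above. A leading "::" limits lookup to the global namespace, so the
using-directive no longer brings std::filesystem into consideration:

```cpp
#include <cassert>
#include <filesystem>  // declares namespace std::filesystem (C++17)
#include <string>

// Hypothetical stand-in for the program's own class from src/filesystem.h.
class filesystem {
public:
    static filesystem &get() {
        static filesystem instance;
        return instance;
    }
    std::string config = "/hypothetical/.amsynthrc";  // placeholder value
};

using namespace std;  // makes std::filesystem visible to unqualified lookup

// Unqualified use is ambiguous between ::filesystem and std::filesystem:
//   string broken() { return filesystem::get().config; }

// A leading "::" looks the name up in the global namespace only. Because
// the class is declared there directly, the using-directive is not consulted.
string config_path() {
    return ::filesystem::get().config;
}
```

This keeps the change local to the affected call sites, at the price of
leaving the using-directive (and the potential for similar clashes) in
place.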

-Dimitry




build failure affecting port: "error: reference to 'filesystem' is ambiguous"

2024-04-30 Thread Nuno Teixeira
Hello all,

I'm lost on a build failure of audio/amsynth (updated to version 1.13.3) on
recent main.
The strange thing is that if I use llvm from ports, USES+=llvm, it fails with
the same error, so I suspect it is something related to main.

Any help is welcome. I haven't opened an upstream PR yet.

Thanks,

---
src/Configuration.cpp:35:20: error: reference to 'filesystem' is ambiguous
   35 | amsynthrc_fname = filesystem::get().config;
  |   ^
src/filesystem.h:27:7: note: candidate found by name lookup is 'filesystem'
   27 | class filesystem
  |   ^
/usr/include/c++/v1/__chrono/file_clock.h:49:1: note: candidate found by
name lookup is 'std::filesystem'
   49 | _LIBCPP_BEGIN_NAMESPACE_FILESYSTEM
  | ^
/usr/include/c++/v1/__config:892:80: note: expanded from macro
'_LIBCPP_BEGIN_NAMESPACE_FILESYSTEM'
  892 |  inline namespace __fs
{ namespace filesystem {
  |
   ^
src/Configuration.cpp:60:22: error: reference to 'filesystem' is ambiguous
   60 | current_bank_file = filesystem::get().default_bank;
  | ^
src/filesystem.h:27:7: note: candidate found by name lookup is 'filesystem'
   27 | class filesystem
  |   ^
/usr/include/c++/v1/__chrono/file_clock.h:49:1: note: candidate found by
name lookup is 'std::filesystem'
   49 | _LIBCPP_BEGIN_NAMESPACE_FILESYSTEM
  | ^
/usr/include/c++/v1/__config:892:80: note: expanded from macro
'_LIBCPP_BEGIN_NAMESPACE_FILESYSTEM'
  892 |  inline namespace __fs
{ namespace filesystem {
  |
   ^
2 errors generated.
gmake[2]: *** [Makefile:3376: src/amsynth_dssi_gtk-Configuration.o] Error 1
gmake[2]: *** Waiting for unfinished jobs
src/PresetController.cpp:474:9: error: reference to 'filesystem' is
ambiguous
  474 | return filesystem::get().user_banks;
  |^
src/filesystem.h:27:7: note: candidate found by name lookup is 'filesystem'
   27 | class filesystem
  |   ^
/usr/include/c++/v1/__chrono/file_clock.h:49:1: note: candidate found by
name lookup is 'std::filesystem'
   49 | _LIBCPP_BEGIN_NAMESPACE_FILESYSTEM
  | ^
/usr/include/c++/v1/__config:892:80: note: expanded from macro
'_LIBCPP_BEGIN_NAMESPACE_FILESYSTEM'
  892 |  inline namespace __fs
{ namespace filesystem {
  |
   ^
mv -f src/.deps/amsynth-AudioOutput.Tpo src/.deps/amsynth-AudioOutput.Po
1 error generated.
---

-- 
Nuno Teixeira
FreeBSD UNIX: Web:  https://FreeBSD.org


Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-29 Thread Mark Millard



On Apr 29, 2024, at 20:11, Mark Millard  wrote:

> On Apr 29, 2024, at 19:54, Mark Millard  wrote:
> 
>> On Apr 28, 2024, at 18:06, Philip Paeps  wrote:
>> 
>>> On 2024-04-18 23:14:22 (+0800), Mark Millard wrote:
 On Apr 18, 2024, at 08:02, Mark Millard  wrote:
> void  wrote on
> Date: Thu, 18 Apr 2024 14:08:36 UTC :
> 
>> Not sure where to post this..
>> 
>> The last bulk build for arm64 appears to have happened around
>> mid-March on ampere2. Is it broken?
> 
> main-armv7 building is broken and the last completed build
> was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
> gets stuck making no progress until manually forced to stop,
> which leads to huge elapsed times for the incomplete builds:
> 
> [...]
> 
> My guess is that FreeBSD has something that broke after bd45bbe440,
> was broken as of f5f08e41aa, and was still broken at 75464941dc .
> 
 
 One thing of possible note:
 
 Failing . . .
 
 Host OSVERSION: 156
 Jail OSVERSION: 1500014
>>> 
>>> I have finished a package builder refresh this morning.  All our builder 
>>> hosts (except PowerPC - I don't touch those) are now on 
>>> main-n269671-feabaf8d5389 (OSVERSION 1500018).
>>> 
>>> ampere1 successfully finished its 140releng-armv7-quarterly build, so it 
>>> looks like the problem with stuck builds was limited to ampere2 building 
>>> main-armv7.  I'll keep a close eye on this one when it starts its next 
>>> build.
>>> 
>> 
>> I see that main-armv7 started.
>> 
>> It queued only 31935 instead of the prior 34528 (or more): it is doing an
>> incremental build instead of a full build. For example, pkg was not built
>> but instead the prior build is in use. Thus bad results from the prior
>> build might be involved in this new build.
>> 
>> I'd recommend forcing a full "poudriere bulk -c -a" that does a from-scratch
>> build for the purposes of the main-armv7 test.
> 
> Actually, the test is not going to provide the information we are
> after as things are.
> 
> giflib-5.2.2 failed to build, which leads to devel/doxygen being
> skipped. devel/doxygen was the first one to hang up in the prior
> 2 failing attempts, if I remember right.
> 
> The giflib-5.2.2 failure also causes graphics/graphviz to be skipped.
> graphics/graphviz was installed just before the hangup in all of
> the example hangups. So the context will not be replicated.
> 
> We need graphics/giflib to build to actually do the test.

Looks like:

https://cgit.freebsd.org/ports/commit/graphics/giflib?id=5007109903fc271e3ef0ba01d78781c1fed99f3f

is the fix for the graphics/giflib build failure.

===
Mark Millard
marklmi at yahoo.com




Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-29 Thread Mark Millard
On Apr 29, 2024, at 19:54, Mark Millard  wrote:

> On Apr 28, 2024, at 18:06, Philip Paeps  wrote:
> 
>> On 2024-04-18 23:14:22 (+0800), Mark Millard wrote:
>>> On Apr 18, 2024, at 08:02, Mark Millard  wrote:
 void  wrote on
 Date: Thu, 18 Apr 2024 14:08:36 UTC :
 
> Not sure where to post this..
> 
> The last bulk build for arm64 appears to have happened around
> mid-March on ampere2. Is it broken?
 
 main-armv7 building is broken and the last completed build
 was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
 gets stuck making no progress until manually forced to stop,
 which leads to huge elapsed times for the incomplete builds:
 
 [...]
 
 My guess is that FreeBSD has something that broke after bd45bbe440,
 was broken as of f5f08e41aa, and was still broken at 75464941dc .
 
>>> 
>>> One thing of possible note:
>>> 
>>> Failing . . .
>>> 
>>> Host OSVERSION: 156
>>> Jail OSVERSION: 1500014
>> 
>> I have finished a package builder refresh this morning.  All our builder 
>> hosts (except PowerPC - I don't touch those) are now on 
>> main-n269671-feabaf8d5389 (OSVERSION 1500018).
>> 
>> ampere1 successfully finished its 140releng-armv7-quarterly build, so it 
>> looks like the problem with stuck builds was limited to ampere2 building 
>> main-armv7.  I'll keep a close eye on this one when it starts its next build.
>> 
> 
> I see that main-armv7 started.
> 
> It queued only 31935 instead of the prior 34528 (or more): it is doing an
> incremental build instead of a full build. For example, pkg was not built
> but instead the prior build is in use. Thus bad results from the prior
> build might be involved in this new build.
> 
> I'd recommend forcing a full "poudriere bulk -c -a" that does a from-scratch
> build for the purposes of the main-armv7 test.

Actually, the test is not going to provide the information we are
after as things are.

giflib-5.2.2 failed to build, which leads to devel/doxygen being
skipped. devel/doxygen was the first one to hang up in the prior
2 failing attempts, if I remember right.

The giflib-5.2.2 failure also causes graphics/graphviz to be skipped.
graphics/graphviz was installed just before the hangup in all of
the example hangups. So the context will not be replicated.

We need graphics/giflib to build to actually do the test.


===
Mark Millard
marklmi at yahoo.com




Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-29 Thread Mark Millard
On Apr 28, 2024, at 18:06, Philip Paeps  wrote:

> On 2024-04-18 23:14:22 (+0800), Mark Millard wrote:
>> On Apr 18, 2024, at 08:02, Mark Millard  wrote:
>>> void  wrote on
>>> Date: Thu, 18 Apr 2024 14:08:36 UTC :
>>> 
 Not sure where to post this..
 
 The last bulk build for arm64 appears to have happened around
 mid-March on ampere2. Is it broken?
>>> 
>>> main-armv7 building is broken and the last completed build
>>> was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
>>> gets stuck making no progress until manually forced to stop,
>>> which leads to huge elapsed times for the incomplete builds:
>>> 
>>> [...]
>>> 
>>> My guess is that FreeBSD has something that broke after bd45bbe440,
>>> was broken as of f5f08e41aa, and was still broken at 75464941dc .
>>> 
>> 
>> One thing of possible note:
>> 
>> Failing . . .
>> 
>> Host OSVERSION: 156
>> Jail OSVERSION: 1500014
> 
> I have finished a package builder refresh this morning.  All our builder 
> hosts (except PowerPC - I don't touch those) are now on 
> main-n269671-feabaf8d5389 (OSVERSION 1500018).
> 
> ampere1 successfully finished its 140releng-armv7-quarterly build, so it 
> looks like the problem with stuck builds was limited to ampere2 building 
> main-armv7.  I'll keep a close eye on this one when it starts its next build.
> 

I see that main-armv7 started.

It queued only 31935 instead of the prior 34528 (or more): it is doing an
incremental build instead of a full build. For example, pkg was not built
but instead the prior build is in use. Thus bad results from the prior
build might be involved in this new build.

I'd recommend forcing a full "poudriere bulk -c -a" that does a from-scratch
build for the purposes of the main-armv7 test.

===
Mark Millard
marklmi at yahoo.com




Request for non-GENERIC kernel diff

2024-04-29 Thread Vladislav V. Prodan
Hello!

Could those who use their own kernel config please share the diff between
their configs for the 13.x and 14 branches?

Thanks.

--
 Vladislav V. Prodan
 System & Network Administrator
 support.od.ua



Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-28 Thread Philip Paeps

On 2024-04-18 23:14:22 (+0800), Mark Millard wrote:

On Apr 18, 2024, at 08:02, Mark Millard  wrote:

void  wrote on
Date: Thu, 18 Apr 2024 14:08:36 UTC :


Not sure where to post this..

The last bulk build for arm64 appears to have happened around
mid-March on ampere2. Is it broken?


main-armv7 building is broken and the last completed build
was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
gets stuck making no progress until manually forced to stop,
which leads to huge elapsed times for the incomplete builds:

[...]

My guess is that something in FreeBSD broke after bd45bbe440: it
was broken as of f5f08e41aa and was still broken at 75464941dc .



One thing of possible note:

Failing . . .

Host OSVERSION: 156
Jail OSVERSION: 1500014


I have finished a package builder refresh this morning.  All our builder 
hosts (except PowerPC - I don't touch those) are now on 
main-n269671-feabaf8d5389 (OSVERSION 1500018).


ampere1 successfully finished its 140releng-armv7-quarterly build, so it 
looks like the problem with stuck builds was limited to ampere2 building 
main-armv7.  I'll keep a close eye on this one when it starts its next 
build.


Philip



Re: serial/ulscom: response timeout using pySerial/esptool.py

2024-04-27 Thread FreeBSD User
On Sat, 27 Apr 2024 11:28:55 +0200
FreeBSD User  wrote:

Just for the record: I ran a small "victim NAS" based on an HP EliteDesk 800
G2 mini with XigmaNAS (latest official version, kernel see below) and installed
packages from an official FreeBSD site for FBSD 13.2-RELEASE. An ESP32 D1 mini
that does not work with the aforementioned host gives this on the NAS (I issued
the esptool.py command in a loop 100 times; no issues detected):

[...]
nas01: ~# esptool.py --chip esp32 --port /dev/cuaU0 --baud 115200 read_mac
esptool.py v4.5
Serial port /dev/cuaU0
Connecting..
Chip is ESP32-D0WD-V3 (revision v3.1)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: XX:XX:XX:XX:XX:XX
Uploading stub...
Running stub...
Stub running...
MAC: XX:XX:XX:XX:XX:XX
Hard resetting via RTS pin...
[...]

.. and those from AZdelivery (larger and older chips):
[...]
nas01: ~# esptool.py --chip esp32 --port /dev/cuaU0 --baud 115200 read_mac
esptool.py v4.5
Serial port /dev/cuaU0
Connecting.
Chip is ESP32-D0WDQ6 (revision v1.0)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: XX:XX:XX:XX:XX:XX
Uploading stub...
Running stub...
Stub running...
MAC: XX:XX:XX:XX:XX:XX
Hard resetting via RTS pin...

[...]

or

[... considered a different revision, but in fact the same old ESP32, as it
reveals itself ...]
nas01: ~# esptool.py --chip esp32 --port /dev/cuaU0 --baud 115200 read_mac
esptool.py v4.5
Serial port /dev/cuaU0
Connecting...
Chip is ESP32-D0WDQ6 (revision v1.0)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: XX:XX:XX:XX:XX:XX
Uploading stub...
Running stub...
Stub running...
MAC: XX:XX:XX:XX:XX:XX
Hard resetting via RTS pin...


The big question is: is this an issue introduced with FBSD 14? In 2020 I made
my first attempts with the Arduino IDE, which worked pretty well with some
minor issues (I had to make several attempts to get connected, using 12- and
13-STABLE at that time). But the Arduino IDE doesn't work either.


> On Thu, 25 Apr 2024 21:51:21 +0200
> Tomek CEDRO  wrote:
> 
> > CP2102 are pretty good ones and never let me down :-)
> > 
> > Is your UART connection to ESP32 working correctly? Can you see the
> > boot message and whatever happens next in terminal (cu / minicom)? Are
> > RX TX pins not swapped? Power supply okay?  
> 
> The ESP32s used are:
> - ESP32-WROOM32 D1 mini; I have 10 of those, and every single one shows the
>   same behaviour on the same host
> - ESP32-WROOM32 sold by the Chinese distributor AZdelivery in Germany; I got
>   a bunch of them, Revision 1 (bought 2020) and a more recent revision V4,
>   bought a couple of months ago.
> 
> AGAIN: ALL chips fail to communicate with my private hosts (dmesg: CPU
> microcode: updated from 0x1f to 0x21 CPU: Intel(R) Core(TM) i5-3470 CPU @
> 3.20GHz (3200.18-MHz K8-class CPU)), OS: FreeBSD 15.0-CURRENT #39
> main-n269723-4ba444de708b: Sat Apr 27 06:42:44 CEST 2024 amd64; the
> mainboard is a crappy Z77 Pro4 ASRock.
> 
> pciconf excerpts:
> [...]
> ichsmb0@pci0:0:31:3:class=0x0c0500 rev=0x04 hdr=0x00 vendor=0x8086 
> device=0x1e22
> subvendor=0x1849 subdevice=0x1e22 vendor = 'Intel Corporation'
> device = '7 Series/C216 Chipset Family SMBus Controller'
> class  = serial bus
> subclass   = SMBus
> bar   [10] = type Memory, range 64, base 0xf7c15000, size 256, enabled
> bar   [20] = type I/O Port, range 32, base 0xf040, size 32, enabled
> ..
> ehci1@pci0:0:29:0:  class=0x0c0320 rev=0x04 hdr=0x00 vendor=0x8086 
> device=0x1e26
> subvendor=0x1849 subdevice=0x1e26 vendor = 'Intel Corporation'
> device = '7 Series/C216 Chipset Family USB Enhanced Host Controller'
> class  = serial bus
> subclass   = USB
> bar   [10] = type Memory, range 32, base 0xf7c17000, size 1024, enabled
> cap 01[50] = powerspec 2  supports D0 D3  current D0
> cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
> cap 13[98] = PCI Advanced Features: FLR TP
> ..
> xhci0@pci0:0:20:0:  class=0x0c0330 rev=0x04 hdr=0x00 vendor=0x8086 
> device=0x1e31
> subvendor=0x1849 subdevice=0x1e31 vendor = 'Intel Corporation'
> device = '7 Series/C210 Series Chipset Family USB xHCI Host 
> Controller'
> class  = serial bus
> subclass   = USB
> bar   [10] = type Memory, range 64, base 0xf7c0, size 65536, enabled
> cap 01[70] = powerspec 2  supports D0 D3  current D0
> cap 05[80] = MSI supports 8 messages, 64 bit enabled with 1 message
> 
> 
> 
> > 
> > Are boards generic devkits or custom hardware? ESP32 in addition to RX
> > TX needs two control lines Reset and Boot that will switch the chip to
> > bootloader / flashing mode. Most USB-to-UART use RTS/CTS lines for
> > that. Are you sure these lines are available on your board and
> > connected to the target correctly? Do you have Reset and 

Re: serial/ulscom: response timeout using pySerial/esptool.py

2024-04-27 Thread FreeBSD User
On Thu, 25 Apr 2024 21:51:21 +0200
Tomek CEDRO  wrote:

> CP2102 are pretty good ones and never let me down :-)
> 
> Is your UART connection to ESP32 working correctly? Can you see the
> boot message and whatever happens next in terminal (cu / minicom)? Are
> RX TX pins not swapped? Power supply okay?

The ESP32s used are:
- ESP32-WROOM32 D1 mini; I have 10 of those, and every single one shows the
  same behaviour on the same host
- ESP32-WROOM32 sold by the Chinese distributor AZdelivery in Germany; I got
  a bunch of them, Revision 1 (bought 2020) and a more recent revision V4,
  bought a couple of months ago.

AGAIN: ALL chips fail to communicate with my private hosts (dmesg: CPU
microcode: updated from 0x1f to 0x21 CPU: Intel(R) Core(TM) i5-3470 CPU @
3.20GHz (3200.18-MHz K8-class CPU)), OS: FreeBSD 15.0-CURRENT #39
main-n269723-4ba444de708b: Sat Apr 27 06:42:44 CEST 2024 amd64; the
mainboard is a crappy Z77 Pro4 ASRock.

pciconf excerpts:
[...]
ichsmb0@pci0:0:31:3:class=0x0c0500 rev=0x04 hdr=0x00 vendor=0x8086 
device=0x1e22
subvendor=0x1849 subdevice=0x1e22 vendor = 'Intel Corporation'
device = '7 Series/C216 Chipset Family SMBus Controller'
class  = serial bus
subclass   = SMBus
bar   [10] = type Memory, range 64, base 0xf7c15000, size 256, enabled
bar   [20] = type I/O Port, range 32, base 0xf040, size 32, enabled
..
ehci1@pci0:0:29:0:  class=0x0c0320 rev=0x04 hdr=0x00 vendor=0x8086 
device=0x1e26
subvendor=0x1849 subdevice=0x1e26 vendor = 'Intel Corporation'
device = '7 Series/C216 Chipset Family USB Enhanced Host Controller'
class  = serial bus
subclass   = USB
bar   [10] = type Memory, range 32, base 0xf7c17000, size 1024, enabled
cap 01[50] = powerspec 2  supports D0 D3  current D0
cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
cap 13[98] = PCI Advanced Features: FLR TP
..
xhci0@pci0:0:20:0:  class=0x0c0330 rev=0x04 hdr=0x00 vendor=0x8086 
device=0x1e31
subvendor=0x1849 subdevice=0x1e31 vendor = 'Intel Corporation'
device = '7 Series/C210 Series Chipset Family USB xHCI Host Controller'
class  = serial bus
subclass   = USB
bar   [10] = type Memory, range 64, base 0xf7c0, size 65536, enabled
cap 01[70] = powerspec 2  supports D0 D3  current D0
cap 05[80] = MSI supports 8 messages, 64 bit enabled with 1 message



> 
> Are boards generic devkits or custom hardware? ESP32 in addition to RX
> TX needs two control lines Reset and Boot that will switch the chip to
> bootloader / flashing mode. Most USB-to-UART use RTS/CTS lines for
> that. Are you sure these lines are available on your board and
> connected to the target correctly? Do you have Reset and Boot buttons
> on the board so you could trigger bootloader by hand (hold Boot, press
> and release Reset, device will be in bootloader upload mode, retry
> esptool flashing now). You can also play with the buttons with active
> terminal attached (i.e. minicom) to see if they work as expected.

I tried minicom, but I have to confess I'm a novice in that matter, so do not
expect professional debugging info:

I was unsuccessful issuing the following command on three different types of
ESP32 as described above. I used at least two of them, and even one (a D1 mini)
freshly unpacked from its sealed anti-static bag, while observing with minicom
attached via -D /dev/cuaU1:

[...]
[ohartmann]: esptool.py --chip esp32 --baud 115200 --connect-attempts 400 --port /dev/cuaU1 read_mac
esptool.py v4.7.0
Loaded custom configuration from /pool/home/ohartmann/esptool.cfg
Serial port /dev/cuaU1
Connecting...

A serial exception error occurred: device reports readiness to read but
returned no data (device disconnected or multiple access on port?)
Note: This error originates from pySerial. It is likely not a problem with
esptool, but with the hardware connection or drivers.
For troubleshooting steps visit:
https://docs.espressif.com/projects/esptool/en/latest/troubleshooting.html

[...]

On the reference minicom terminal I observed with the D1 mini
(minicom -b 115200 -D /dev/cuaU1):
[...]

Welcome to minicom 2.8

OPTIONS: I18n 
Compiled on Apr 27 2024, 09:04:50.
Port /dev/cuaU1, 10:50:53

Press CTRL-A Z for help on special keys

ts Jul 29 2019 12:21:46

rst:x1 (POWERON_RESET),boot:0x3 (DOWNLOAD_BOOT(UART0/UART1/SDIO_REI_REO_V2))
waiting for download
 U� U� U� U� U� U� U� U


[... the older ESP32 from 2020 ...]

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DOUT, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:1044
load:0x40078000,len:10124
load:0x40080400,len:5828
entry 0x400806a8
�un  8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x3 (DOWNLOAD_BOOT(UART0/UART1/SDIO_REI_REO_V2))
waiting for download
es Jun  8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH]�(�:���   �


[... and the one purchased last year, called 

Re: TXT Kernel linking failed on -CURRENT

2024-04-26 Thread BSD USER

Konstantin, good day!

On 25.04.2024 0:09, Konstantin Belousov wrote:

On Wed, Apr 24, 2024 at 01:12:39PM +0500, BSD USER wrote:

linking kernel
ld: error: undefined symbol: ktrcapfail

referenced by vfs_lookup.c
    vfs_lookup.o:(namei)
referenced by vfs_lookup.c
    vfs_lookup.o:(namei_setup)
referenced by vfs_lookup.c
    vfs_lookup.o:(vfs_lookup)
referenced 3 more times

*** [kernel] Error code 1

Try
https://reviews.freebsd.org/D44931


Yes, now the system and kernel build fine.

Thanks!



Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-26 Thread Mark Millard
On Apr 26, 2024, at 18:55, Philip Paeps  wrote:

> On 2024-04-18 23:02:30 (+0800), Mark Millard wrote:
>> void  wrote on
>> Date: Thu, 18 Apr 2024 14:08:36 UTC :
>> 
>>> Not sure where to post this..
>>> 
>>> The last bulk build for arm64 appears to have happened around
>>> mid-March on ampere2. Is it broken?
>> 
>> main-armv7 building is broken and the last completed build
>> was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
>> gets stuck making no progress until manually forced to stop,
>> which leads to huge elapsed times for the incomplete builds:
>> 
>> pd5512ae7b8c6_s75464941dc 34472 12282  (+9196) 107  (+77) 4753  (+2247) 1390 
>>  (+529) 15940 parallel_build: Fri, 22 Mar 2024 11:05:01 GMT 651:21:56
>> 
>> p43e3af5f5763_sf5f08e41aa 19809 5919  (+3126) 137  (+100) 5363  (+2741) 1395 
>>  (+522) 6995 parallel_build: Wed, 28 Feb 2024 15:46:14 GMT 359:42:14 ampere2
>> 
>> ampere2 alternates between trying to build main-arm64 and main-armv7, so 
>> main-armv7 being stuck blocks main-arm64 from building.
>> 
>> One can see that all 13 job ID's show over 570 hours:
>> 
>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pd5512ae7b8c6_s75464941dc
>> 
>> It is not random which packages are building when this happens. Compare:
>> 
>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=p43e3af5f5763_sf5f08e41aa
>> 
>> By contrast, the 19 Feb 2024 from-scratch (full) build worked:
>> 
>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pe9c9c73181b5_sbd45bbe440
>> 
>> My guess is that something in FreeBSD broke after bd45bbe440: it
>> was broken as of f5f08e41aa and was still broken at 75464941dc .
> 
> It looks like ampere2 is going to end up in this state again:
> 
> https://pkg-status.freebsd.org/ampere2/build.html?mastername=main-armv7-default=p1c7a816cd0ad_s1bd4f769ca
> 
> It's got a couple of things stuck in -depends already.  I'll keep an eye on 
> it for the next hour or two.  If no progress is made, I'll kill this build 
> and force an upgrade.  The next build will start at 01:01 UTC Sunday.  So we 
> won't have long to wait before it tries again.
> 
> ampere1 is chewing away at llvm, and doesn't look stuck.
> 
> ampere3 has been upgraded.

Output from the likes of:

# ps -axldww

could be interesting. As might be output from:

# procstat -k -k PIDs_OF_STUCK_PROCESSES

(kernel stack backtraces).


===
Mark Millard
marklmi at yahoo.com
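
A minimal sketch of collecting the diagnostics suggested above for a list of
stuck processes. It assumes procstat(1) with the -k flag given twice is the
tool that prints kernel stack backtraces; the helper names are hypothetical
and nothing here comes from poudriere itself.

```python
import subprocess


def ps_tree_cmd():
    """argv for the process listing with parent/child layout, wide output."""
    return ["ps", "-axldww"]


def procstat_kstack_cmd(pids):
    """argv for kernel stack backtraces of the given PIDs.

    Assumption: `procstat -kk` prints kernel stacks with function offsets.
    """
    if not pids:
        raise ValueError("need at least one PID")
    return ["procstat", "-kk", *[str(int(pid)) for pid in pids]]


def capture(cmd):
    """Run a command and return its stdout as text (errors are not fatal)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


# Example use on the builder, as root (PIDs are placeholders):
#   print(capture(ps_tree_cmd()))
#   print(capture(procstat_kstack_cmd([1234, 5678])))
```

Keeping the command construction separate from the execution makes it easy to
log exactly what was run next to the captured output.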




Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-26 Thread Philip Paeps

On 2024-04-18 23:02:30 (+0800), Mark Millard wrote:

void  wrote on
Date: Thu, 18 Apr 2024 14:08:36 UTC :


Not sure where to post this..

The last bulk build for arm64 appears to have happened around
mid-March on ampere2. Is it broken?


main-armv7 building is broken and the last completed build
was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
gets stuck making no progress until manually forced to stop,
which leads to huge elapsed times for the incomplete builds:

pd5512ae7b8c6_s75464941dc 34472 12282  (+9196) 107  (+77) 4753  
(+2247) 1390  (+529) 15940 parallel_build: Fri, 22 Mar 2024 11:05:01 
GMT 651:21:56


p43e3af5f5763_sf5f08e41aa 19809 5919  (+3126) 137  (+100) 5363  
(+2741) 1395  (+522) 6995 parallel_build: Wed, 28 Feb 2024 15:46:14 
GMT 359:42:14 ampere2


ampere2 alternates between trying to build main-arm64 and main-armv7, 
so main-armv7 being stuck blocks main-arm64 from building.


One can see that all 13 job ID's show over 570 hours:

http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pd5512ae7b8c6_s75464941dc

It is not random which packages are building when this happens. 
Compare:


http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=p43e3af5f5763_sf5f08e41aa

By contrast, the 19 Feb 2024 from-scratch (full) build worked:

http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pe9c9c73181b5_sbd45bbe440

My guess is that something in FreeBSD broke after bd45bbe440: it
was broken as of f5f08e41aa and was still broken at 75464941dc .


It looks like ampere2 is going to end up in this state again:

https://pkg-status.freebsd.org/ampere2/build.html?mastername=main-armv7-default=p1c7a816cd0ad_s1bd4f769ca

It's got a couple of things stuck in -depends already.  I'll keep an eye 
on it for the next hour or two.  If no progress is made, I'll kill this 
build and force an upgrade.  The next build will start at 01:01 UTC 
Sunday.  So we won't have long to wait before it tries again.


ampere1 is chewing away at llvm, and doesn't look stuck.

ampere3 has been upgraded.

Philip
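
For scale, the elapsed values quoted in this thread (651:21:56 and 359:42:14)
can be converted to days. This small sketch assumes the column is
hours:minutes:seconds, which is how it reads in the quoted status lines; the
function names are made up for illustration.

```python
def parse_elapsed(text):
    """Convert 'H:MM:SS' (hours may exceed 24) to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def as_days(text):
    """Return (days, hours, minutes, seconds) for an 'H:MM:SS' string."""
    days, rest = divmod(parse_elapsed(text), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return days, hours, minutes, seconds


print(as_days("651:21:56"))  # (27, 3, 21, 56)
print(as_days("359:42:14"))  # (14, 23, 42, 14)
```

Under that assumption the two stuck builds sat for roughly 27 and 15 days
before being stopped.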



Re: mysterious setting of B_DIRECT?

2024-04-25 Thread Rick Macklem
On Thu, Apr 25, 2024 at 8:51 PM Rick Macklem  wrote:
>
> On Thu, Apr 25, 2024 at 8:09 PM Konstantin Belousov  wrote:
> >
> > On Thu, Apr 25, 2024 at 07:49:23PM -0700, Rick Macklem wrote:
> > > Hi,
> > >
> > > This week I have been doing active testing as a part of an IETF
> > > bakeathon for NFSv4. During the week I had a NFSv4 client
> > > crash. On the surface, it is straightforward, in that it called
> > > ncl_doio_directwrite() and the field called b_caller1 was NULL.
> > >
> > > Now, here's the weird part...
> > > ncl_doio_directwrite() should never be called because B_DIRECT
> > > should never be set. (The only place B_DIRECT gets set in the code
> > > is never currently executed.)
> > Do you mean the place in nfs_directio_write()?  And the fact that
> > IO_SYNC is always set.
> Yes.
>
> >
> > >
> > > > I have a patch that clears out the "never to be executed" code and
> > > > this seems to avoid the crash, since with the patch,
> > > > ncl_doio_directwrite() no longer exists.
> > >
> > > What I cannot figure out is how B_DIRECT got set?
> > > I can note that UFS was under heavy load when the client crashed,
> > > but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
> > > without b_flags being set to 0.
> >
> > There are also vfs_bio_brelse()/vfs_bio_setflags() functions which can
> > set B_DIRECT.  On the other hand, they are not used by nfs client.
> Yes, again.
>
> >
> > What was the overall state of the buffer with the B_DIRECT flag?  Which
> > vnode it was assigned to?
> Unfortunately I was in a hurry and didn't get more info.
> And, since I have never seen this crash before, I doubt I'll be able
> to reproduce it.
Oh, and I will put the cleanup patch on Phabricator. I didn't see the
crash again during a few days of testing with the patch. This makes sense,
since it gets rid of ncl_doio_directwrite().

>
> Thanks, rick



Re: mysterious setting of B_DIRECT?

2024-04-25 Thread Rick Macklem
On Thu, Apr 25, 2024 at 8:09 PM Konstantin Belousov  wrote:
>
> On Thu, Apr 25, 2024 at 07:49:23PM -0700, Rick Macklem wrote:
> > Hi,
> >
> > This week I have been doing active testing as a part of an IETF
> > bakeathon for NFSv4. During the week I had a NFSv4 client
> > crash. On the surface, it is straightforward, in that it called
> > ncl_doio_directwrite() and the field called b_caller1 was NULL.
> >
> > Now, here's the weird part...
> > ncl_doio_directwrite() should never be called because B_DIRECT
> > should never be set. (The only place B_DIRECT gets set in the code
> > is never currently executed.)
> Do you mean the place in nfs_directio_write()?  And the fact that
> IO_SYNC is always set.
Yes.

>
> >
> > I have a patch that clears out the "never to be executed" code and
> > this seems to avoid the crash, since with the patch, ncl_doio_directwrite()
> > no longer exists.
> >
> > What I cannot figure out is how B_DIRECT got set?
> > I can note that UFS was under heavy load when the client crashed,
> > but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
> > without b_flags being set to 0.
>
> There are also vfs_bio_brelse()/vfs_bio_setflags() functions which can
> set B_DIRECT.  On the other hand, they are not used by nfs client.
Yes, again.

>
> What was the overall state of the buffer with the B_DIRECT flag?  Which
> vnode it was assigned to?
Unfortunately I was in a hurry and didn't get more info.
And, since I have never seen this crash before, I doubt I'll be able
to reproduce it.

Thanks, rick



Re: mysterious setting of B_DIRECT?

2024-04-25 Thread Konstantin Belousov
On Thu, Apr 25, 2024 at 07:49:23PM -0700, Rick Macklem wrote:
> Hi,
> 
> This week I have been doing active testing as a part of an IETF
> bakeathon for NFSv4. During the week I had a NFSv4 client
> crash. On the surface, it is straightforward, in that it called
> ncl_doio_directwrite() and the field called b_caller1 was NULL.
> 
> Now, here's the weird part...
> ncl_doio_directwrite() should never be called because B_DIRECT
> should never be set. (The only place B_DIRECT gets set in the code
> is never currently executed.)
Do you mean the place in nfs_directio_write()?  And the fact that
IO_SYNC is always set.

> 
> I have a patch that clears out the "never to be executed" code and
> this seems to avoid the crash, since with the patch, ncl_doio_directwrite()
> no longer exists.
> 
> What I cannot figure out is how B_DIRECT got set?
> I can note that UFS was under heavy load when the client crashed,
> but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
> without b_flags being set to 0.

There are also vfs_bio_brelse()/vfs_bio_setflags() functions which can
set B_DIRECT.  On the other hand, they are not used by nfs client.

What was the overall state of the buffer with the B_DIRECT flag?  Which
vnode was it assigned to?



mysterious setting of B_DIRECT?

2024-04-25 Thread Rick Macklem
Hi,

This week I have been doing active testing as a part of an IETF
bakeathon for NFSv4. During the week I had a NFSv4 client
crash. On the surface, it is straightforward, in that it called
ncl_doio_directwrite() and the field called b_caller1 was NULL.

Now, here's the weird part...
ncl_doio_directwrite() should never be called because B_DIRECT
should never be set. (The only place B_DIRECT gets set in the code
is never currently executed.)

I have a patch that clears out the "never to be executed" code and
this seems to avoid the crash, since with the patch, ncl_doio_directwrite()
no longer exists.

What I cannot figure out is how B_DIRECT got set?
I can note that UFS was under heavy load when the client crashed,
but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
without b_flags being set to 0.

Anyone have any ideas? rick



Re: serial/ulscom: response timeout using pySerial/esptool.py

2024-04-25 Thread Tom Jones
Can you strip out the extraneous stuff, loop TX and RX on a CP2101 board,
and send bytes through?

I did a bunch of development on an esp8266 board in the last few weeks and had
no issues, but I’ve no idea if it was the same USB serial chip.

I’ll have a dig around and see if I have something matching 
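
A loopback check along those lines could look like the sketch below. It
assumes pySerial is installed and that TX is physically jumpered to RX on the
adapter, so whatever is written comes straight back; the port name, lengths
and helper names are placeholders, not anything taken from esptool.

```python
def make_pattern(length):
    """Test payload: the byte values 0..255 repeated up to `length` bytes."""
    return bytes(i % 256 for i in range(length))


def first_mismatch(sent, received):
    """Index of the first differing or missing byte, or None if identical."""
    for index, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return index
    if len(sent) != len(received):
        return min(len(sent), len(received))
    return None


def loopback_test(port, baud=115200, length=256, timeout=2.0):
    """Write a pattern to `port` and compare it with what is read back."""
    import serial  # pySerial (third-party), imported only when actually used

    pattern = make_pattern(length)
    with serial.Serial(port, baud, timeout=timeout) as link:
        link.reset_input_buffer()
        link.write(pattern)
        link.flush()
        received = link.read(length)
    return first_mismatch(pattern, received)


# Example (placeholder device name):
#   print(loopback_test("/dev/cuaU1"))   # None means every byte came back
```

A result of 0 would mean nothing came back at all, which points at the
adapter or driver rather than at the ESP32 behind it.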

On Thu, Apr 25, 2024, at 20:17, FreeBSD User wrote:
> Hello,
>
> Host: 15.0-CURRENT FreeBSD 15.0-CURRENT #36 main-n269703-54c3aa02e926: 
> Thu Apr 25 18:48:56
> CEST 2024 amd64 or 14-STABLE recently compiled (dmesg/uname not at 
> hand).
>
> Hardware: oldish Z77Pro 4 based Asrock mainboard, a Lenovo T560 
> notebook, Fujitsu Esprimo Q5XX
> (simple desktop, Pentium Gold) or an oldish Fujitsu Celsius 7XX 
> workstation, 6 core Haswell
> XEON.
>
> Phenomenon: a couple of weeks now I try to connect to several Xtensa 
> ESP32 dev boards
> (ESP32-WROOM32 with CP2101 or CP2104 UART) via comms/py-esptool 
> (doesn't matter whether it is
> tho port's py39-esptool 4.5 or the latest py-esptool 4.7.0, doesn't 
> matter whether pkg package
> or self compiled on CURRENT and 14-STABLE, on all hardware platforms 
> same result).
>
> Attaching the ESP devel module via Micro USB cable (several type, 
> differnt vendors tried ...)
> show
>
> dmesg:
> [...]
> ugen0.4:  at usbus0
> uslcom0 on uhub3
> uslcom0:  rev 1.10/1.00, addr 4>
> on usbus0
> [...]
>
> When trying to connect to the ESP32 via below shown command (--trace 
> not every time issued), I
> get no connection:
>
> [ohartmann]: esptool.py --trace --chip esp32 --baud 115200 --port 
> /dev/cuaU1  flash_id
> esptool.py v4.7.0
> Loaded custom configuration from /pool/home/ohartmann/esptool.cfg
> Serial port /dev/cuaU1
> Connecting...TRACE +0.000 command op=0x08 data len=36 wait_response=1 
> timeout=0.100 data=
> 07071220  | ... 
>   | 
>   | 
> TRACE +0.000 Write 46 bytes: 
> c824 000707122055 | ...$ UUU
>   | 
>  55c0 | U.
> TRACE +0.102 No serial data received.
> TRACE +0.052 command op=0x08 data len=36 wait_response=1 timeout=0.100 
> data=
> 07071220  | ... 
>   | 
>   | 
> TRACE +0.000 Write 46 bytes: 
> c824 000707122055 | ...$ UUU
>   | 
>  55c0 | U.
> TRACE +0.107 No serial data received.
> TRACE +0.054 command op=0x08 data len=36 wait_response=1 timeout=0.100 
> data=
> 07071220  | ... 
>   | 
>   | 
> TRACE +0.000 Write 46 bytes: 
> c824 000707122055 | ...$ UUU
>   | 
>  55c0 | U.
> TRACE +0.107 No serial data received.
> TRACE +0.054 command op=0x08 data len=36 wait_response=1 timeout=0.100 
> data=
> 07071220  | ... 
>   | 
>   | 
> TRACE +0.000 Write 46 bytes: 
> c824 000707122055 | ...$ UUU
>   | 
>  55c0 | U.
>
>
> A serial exception error occurred: device reports readiness to read but 
> returned no data
> (device disconnected or multiple access on port?) Note: This error 
> originates from pySerial.
> It is likely not a problem with esptool, but with the hardware 
> connection or drivers. For
> troubleshooting steps visit:
> https://docs.espressif.com/projects/esptool/en/latest/troubleshooting.html
> [...]
>
>
> Whatever baud rate issued, in most cases on all tested OS versions and 
> almost all hardware
> platforms except the Fujistu Celsius 7XX (2015 model) I do not get any 
> connection! And it get
> more weird: To avoid out-of-sync-software I recompiled everything via 
> "portmaster -df
> comms/py-pyserial comms/py-esptool" and after that, everything was 
> fine, the connection was
> made, I got results out of the chip. Seconds later same problems.
>
> I exchanged cablings, exchanged the ESP32 model and vendor. Invariants 
> are 14-STABLE, daily
> compiled, CURRENT. daily compiled. On my private box (old Z77 based 
> IvyBridge ASRock crap), a
> couple of Lenovo T560 running 14-STABLE and several Fujitsu Esprimo 
> Q5XX boxes there is always
> this weird error message, but in very rare cases I get connection.
>
> Only exception: the Fujsitus Celsius 7XX workstation (14-STABLE, last 
> complied today noon). No
> matter what ESP32, no 

Re: serial/ulscom: response timeout using pySerial/esptool.py

2024-04-25 Thread Tomek CEDRO
CP2102 are pretty good ones and never let me down :-)

Is your UART connection to ESP32 working correctly? Can you see the
boot message and whatever happens next in terminal (cu / minicom)? Are
RX TX pins not swapped? Power supply okay?

Are boards generic devkits or custom hardware? ESP32 in addition to RX
TX needs two control lines Reset and Boot that will switch the chip to
bootloader / flashing mode. Most USB-to-UART use RTS/CTS lines for
that. Are you sure these lines are available on your board and
connected to the target correctly? Do you have Reset and Boot buttons
on the board so you could trigger bootloader by hand (hold Boot, press
and release Reset, device will be in bootloader upload mode, retry
esptool flashing now). You can also play with the buttons with active
terminal attached (i.e. minicom) to see if they work as expected.

Minicom serial terminal is pretty cool as it allows you to watch UART
behavior on adapter (un)plug. In minicom you can also enable/disable
hardware flow control lines (Ctrl+A O -> Serial Port Setup -> (F)
Hardware Flow Control). You can switch that easily and watch the
target behavior. If this is the problem you may want to use stty (1)
to enable/disable hardware flow control on the port.

Can you try with another board? ESP32 has fuses that may permanently
disable and/or mess up some hardware features.

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info
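
The manual "hold Boot, press and release Reset" procedure described above can
be imitated from software by toggling the adapter's modem control lines. This
is only a sketch: it assumes the common devkit wiring in which RTS drives the
chip's reset (EN) and DTR drives the Boot pin (GPIO0), which may not match a
given board, and the timings are illustrative. `port` is any object with
writable `dtr` and `rts` attributes, as pySerial's Serial object provides.

```python
import time

# (dtr, rts, seconds to hold); True means the line is asserted.
# Assumed wiring: RTS asserted holds EN low (reset), DTR asserted holds
# GPIO0 low (Boot pressed).
BOOTLOADER_SEQUENCE = [
    (False, True, 0.1),   # hold the chip in reset, Boot released
    (True, False, 0.05),  # release reset while Boot is held low
    (False, False, 0.0),  # release Boot; chip should now wait for download
]


def enter_bootloader(port, sequence=BOOTLOADER_SEQUENCE, sleep=time.sleep):
    """Apply each (dtr, rts) state in order, pausing where requested."""
    for dtr, rts, hold in sequence:
        port.dtr = dtr
        port.rts = rts
        if hold:
            sleep(hold)


# Example with pySerial (placeholder device name):
#   import serial
#   with serial.Serial("/dev/cuaU1", 115200) as port:
#       enter_bootloader(port)
```

If the chip then prints "waiting for download" in a terminal, the control
lines are reaching the target and the problem lies elsewhere.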



serial/ulscom: response timeout using pySerial/esptool.py

2024-04-25 Thread FreeBSD User
Hello,

Host: 15.0-CURRENT FreeBSD 15.0-CURRENT #36 main-n269703-54c3aa02e926: Thu Apr
25 18:48:56 CEST 2024 amd64, or 14-STABLE recently compiled (dmesg/uname not at
hand).

Hardware: oldish Z77 Pro4 based ASRock mainboard, a Lenovo T560 notebook,
Fujitsu Esprimo Q5XX (simple desktop, Pentium Gold) or an oldish Fujitsu
Celsius 7XX workstation, 6-core Haswell XEON.

Phenomenon: for a couple of weeks now I have been trying to connect to several
Xtensa ESP32 dev boards (ESP32-WROOM32 with CP2101 or CP2104 UART) via
comms/py-esptool. It doesn't matter whether it is the port's py39-esptool 4.5
or the latest py-esptool 4.7.0, nor whether it is a pkg package or self-compiled
on CURRENT and 14-STABLE: on all hardware platforms the result is the same.

Attaching the ESP devel module via Micro USB cable (several types, different
vendors tried ...) shows

dmesg:
[...]
ugen0.4:  at usbus0
uslcom0 on uhub3
uslcom0: 
on usbus0
[...]

When trying to connect to the ESP32 via the command shown below (--trace not
issued every time), I get no connection:

[ohartmann]: esptool.py --trace --chip esp32 --baud 115200 --port /dev/cuaU1 flash_id
esptool.py v4.7.0
Loaded custom configuration from /pool/home/ohartmann/esptool.cfg
Serial port /dev/cuaU1
Connecting...TRACE +0.000 command op=0x08 data len=36 wait_response=1 
timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.
TRACE +0.102 No serial data received.
TRACE +0.052 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.
TRACE +0.107 No serial data received.
TRACE +0.054 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.
TRACE +0.107 No serial data received.
TRACE +0.054 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.


A serial exception error occurred: device reports readiness to read but
returned no data (device disconnected or multiple access on port?)
Note: This error originates from pySerial. It is likely not a problem with
esptool, but with the hardware connection or drivers.
For troubleshooting steps visit:
https://docs.espressif.com/projects/esptool/en/latest/troubleshooting.html
[...]


Whatever baud rate issued, in most cases on all tested OS versions and almost 
all hardware
platforms except the Fujistu Celsius 7XX (2015 model) I do not get any 
connection! And it get
more weird: To avoid out-of-sync-software I recompiled everything via 
"portmaster -df
comms/py-pyserial comms/py-esptool" and after that, everything was fine, the 
connection was
made, I got results out of the chip. Seconds later same problems.

I exchanged cablings, exchanged the ESP32 model and vendor. Invariants are 
14-STABLE, daily
compiled, CURRENT. daily compiled. On my private box (old Z77 based IvyBridge 
ASRock crap), a
couple of Lenovo T560 running 14-STABLE and several Fujitsu Esprimo Q5XX boxes 
there is always
this weird error message, but in very rare cases I get connection.

The only exception is the Fujitsu Celsius 7XX workstation (14-STABLE, last
compiled today at noon). No matter which ESP32, which vendor, or which cabling
is used, the connection is established at any baud rate, at any time. Not one
single failure like the one shown above in any session (I checked several
dozen times)!

Now I'm out of ideas, and I suspect that the CP210X uslcom serial driver has
trouble with most onboard serial chipsets.

Can anyone help me track down this issue? Is there anything I could have missed?

It drives me nuts ...

Thanks in advance,

Oliver

 
-- 
O. Hartmann



Re: TXT Kernel linking failed on -CURRENT

2024-04-24 Thread Konstantin Belousov
On Wed, Apr 24, 2024 at 01:12:39PM +0500, BSD USER wrote:
> linking kernel
> ld: error: undefined symbol: ktrcapfail
> >>> referenced by vfs_lookup.c
> >>>   vfs_lookup.o:(namei)
> >>> referenced by vfs_lookup.c
> >>>   vfs_lookup.o:(namei_setup)
> >>> referenced by vfs_lookup.c
> >>>   vfs_lookup.o:(vfs_lookup)
> >>> referenced 3 more times
> *** [kernel] Error code 1

Try
https://reviews.freebsd.org/D44931
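
When a link step fails like this, it can help to condense the linker output into "which symbol, referenced from where" before comparing it against a review. The following is only a minimal sketch: the function name `summarize_undefined` and the sample text are illustrative assumptions, and the regular expressions only cover the ld message layout quoted in this thread.

```python
import re
from collections import defaultdict

# Sample taken from the error quoted in this thread (unquoted form).
SAMPLE_LD_OUTPUT = """\
linking kernel
ld: error: undefined symbol: ktrcapfail
>>> referenced by vfs_lookup.c
>>>               vfs_lookup.o:(namei)
>>> referenced by vfs_lookup.c
>>>               vfs_lookup.o:(namei_setup)
>>> referenced by vfs_lookup.c
>>>               vfs_lookup.o:(vfs_lookup)
>>> referenced 3 more times
*** [kernel] Error code 1
"""

UNDEFINED_RE = re.compile(r"^ld: error: undefined symbol: (\S+)")
REFERENCE_RE = re.compile(r"^>>>\s+(\S+\.o):\((\w+)\)")


def summarize_undefined(log_text):
    """Map each undefined symbol to the (object file, function) pairs using it."""
    symbols = defaultdict(list)
    current = None
    for line in log_text.splitlines():
        match = UNDEFINED_RE.match(line)
        if match:
            current = match.group(1)
            symbols[current]  # register the symbol even if no reference follows
            continue
        match = REFERENCE_RE.match(line)
        if match and current is not None:
            symbols[current].append((match.group(1), match.group(2)))
    return dict(symbols)


if __name__ == "__main__":
    for symbol, refs in summarize_undefined(SAMPLE_LD_OUTPUT).items():
        print(symbol)
        for obj, func in refs:
            print(f"  {obj}: {func}()")
```

The summary only restates what the linker already printed; it does not say why the symbol is missing.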



Re: Strange network/socket anomalies since about a month

2024-04-24 Thread Dag-Erling Smørgrav
Alexander Leidinger  writes:
> Gleb Smirnoff  writes:
> > I don't have any better idea than ktrace over failing application.
> > Yep, I understand that poudriere will produce a lot.
> Yes, it does. 4.4 GB just for the start of poudriere until the first
> package build fails due to a failed sccache start [...]

Using `ktrace -tcnpstuy` instead of just `ktrace` should greatly reduce
the size of the trace file.

(remind me to modify ktrace and kdump so this can be written as `-t-i`
or `-tI` instead...)

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: Strange network/socket anomalies since about a month

2024-04-24 Thread Alexander Leidinger

On 2024-04-22 18:12, Gleb Smirnoff wrote:

> There were several preparatory commits that were not reverted and one of them
> had a bug.  The bug manifested itself as failure to send(2) zero bytes over
> unix/stream.  It was fixed with e6a4b57239dafc6c944473326891d46d966c0264. Can
> you please check you have this revision? Other than that there are no known
> bugs left.


Yes, I have this fix in my running kernel.


> A> Any ideas how to track this down more easily than running the entire
> A> poudriere in ktrace (e.g. a hint/script which dtrace probes to use)?
>
> I don't have any better idea than ktrace over failing application.  Yep, I
> understand that poudriere will produce a lot.  But first we need to determine


Yes, it does. 4.4 GB just for the start of poudriere until the first 
package build fails due to a failed sccache start (luckily in the first 
builder, but I had at least 2 builders automatically spin up by 
poudriere at the time when I validated the failure in the logs and 
disabled the tracing).


> what syscall fails and on what type of socket.  After that we can scope down to
> using dtrace on very particular functions.


I'm not sure I managed to find the cause of the failure... the only thing
that remotely looks like an issue is "Resource temporarily unavailable",
but this is from the process which waits for the server to have started:

---snip---
 58406 sccache  1713947887.504834367 RET   __sysctl 0
 58406 sccache  1713947887.505521884 CALL  
rfork(0x8000<>2147483648)
 58406 sccache  1713947887.50575 CAP   system call not allowed: 
rfork

 58406 sccache  1713947887.505774176 RET   rfork 58426/0xe43a
 58406 sccache  1713947887.507304865 CALL  
compat11.kevent(0x3,0x371d360f89e8,0x2,0x371d360f89e8,0x2,0)
 58406 sccache  1713947887.507657906 STRU  struct freebsd11_kevent[] = { 
{ ident=11, filter=EVFILT_READ, flags=0x61, 
fflags=0, data=0, udata=0x0 }
 { ident=11, filter=EVFILT_WRITE, 
flags=0x61, fflags=0, data=0, udata=0x0 } }
 58406 sccache  1713947887.507689980 STRU  struct freebsd11_kevent[] = { 
{ ident=11, filter=EVFILT_READ, flags=0x4000, fflags=0, 
data=0, udata=0x0 }
 { ident=11, filter=EVFILT_WRITE, flags=0x4000, 
fflags=0, data=0, udata=0x0 } }

 58406 sccache  1713947887.507977155 RET   compat11.kevent 2
 58406 sccache  1713947887.508015751 CALL  write(0x5,0x371515685bcc,0x1)
 58406 sccache  1713947887.508086434 GIO   fd 5 wrote 1 byte
   0x 01   |.|

 58406 sccache  1713947887.508145930 RET   write 1
 58406 sccache  1713947887.508183140 CALL  
compat11.kevent(0x7,0,0,0x5a5689ab0c40,0x400,0)
 58406 sccache  1713947887.508396614 STRU  struct freebsd11_kevent[] = { 
 }
 58406 sccache  1713947887.508156537 STRU  struct freebsd11_kevent[] = { 
{ ident=4, filter=EVFILT_READ, flags=0x60, 
fflags=0, data=0x1, udata=0x } }

 58406 sccache  1713947887.508530888 RET   compat11.kevent 1
 58406 sccache  1713947887.508563736 CALL  read(0x4,0x371d3a2887c0,0x80)
 58406 sccache  1713947887.508729102 GIO   fd 4 read 1 byte
   0x 01   |.|

 58406 sccache  1713947887.508967661 RET   read 1
 58406 sccache  1713947887.508996753 CALL  read(0x4,0x371d3a2887c0,0x80)
 58406 sccache  1713947887.509028311 RET   read -1 errno 35 Resource 
temporarily unavailable
 58406 sccache  1713947887.509068838 CALL  
compat11.kevent(0x3,0,0,0x5a5689a97540,0x400,0x371d3a2887c8)

..
 58406 sccache  1713947897.514352552 CALL  
_umtx_op(0x5a5689a3d290,0x10,0x7fff,0,0)

 58406 sccache  1713947897.514383653 RET   _umtx_op 0
 58406 sccache  1713947897.514421273 CALL  write(0x5,0x371515685bcc,0x1)
 58406 sccache  1713947897.515050967 STRU  struct freebsd11_kevent[] = { 
{ ident=4, filter=EVFILT_READ, flags=0x60, 
fflags=0, data=0x1, udata=0x } }

 58406 sccache  1713947897.515146151 RET   compat11.kevent 1
 58406 sccache  1713947897.515178978 CALL  read(0x4,0x371d3a2887c0,0x80)
 58406 sccache  1713947897.515368070 GIO   fd 4 read 1 byte
   0x 01   |.|

 58406 sccache  1713947897.515396600 RET   read 1
 58406 sccache  1713947897.515426523 CALL  read(0x4,0x371d3a2887c0,0x80)
 58406 sccache  1713947897.515457073 RET   read -1 errno 35 Resource 
temporarily unavailable

 58406 sccache  1713947897.515004494 GIO   fd 5 wrote 1 byte
   0x 01   |.|
---snip---

https://www.leidinger.net/test/sccache.tar.bz2 contains the parts of the 
trace of the sccache processes (in case someone wants to have a look).
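
For a dump of this size, one way to narrow the search is to list only the system calls that returned an error. The following is a rough sketch under assumptions: the helper name `failed_syscalls` is made up for illustration, the input is kdump text with one record per line (not wrapped as in this mail), and the regular expression only covers the "RET ... -1 errno N" layout shown in the excerpt above. On FreeBSD errno 35 is EAGAIN, which is usually expected from a read on a non-blocking descriptor with no data pending, so the sketch allows ignoring selected errno values.

```python
import re
from collections import Counter

# Matches kdump "RET" records that report an error; the timestamp column is optional.
KDUMP_FAILED_RET = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<comm>\S+)\s+(?:\d+\.\d+\s+)?RET\s+"
    r"(?P<syscall>\S+)\s+-1 errno (?P<errno>\d+)"
)

# Records copied from the excerpt above, re-joined onto single lines.
SAMPLE_KDUMP = """\
 58406 sccache  1713947887.508967661 RET   read 1
 58406 sccache  1713947887.509028311 RET   read -1 errno 35 Resource temporarily unavailable
 58406 sccache  1713947897.514383653 RET   _umtx_op 0
 58406 sccache  1713947897.515396600 RET   read 1
 58406 sccache  1713947897.515457073 RET   read -1 errno 35 Resource temporarily unavailable
"""


def failed_syscalls(kdump_text, ignore_errnos=()):
    """Count failing syscalls as (syscall name, errno) pairs."""
    ignored = set(ignore_errnos)
    counts = Counter()
    for line in kdump_text.splitlines():
        match = KDUMP_FAILED_RET.match(line)
        if not match:
            continue
        errno = int(match.group("errno"))
        if errno in ignored:
            continue
        counts[(match.group("syscall"), errno)] += 1
    return counts


if __name__ == "__main__":
    for (syscall, errno), count in failed_syscalls(SAMPLE_KDUMP).most_common():
        print(f"{syscall} failed with errno {errno}: {count} time(s)")
```

Calling `failed_syscalls(text, ignore_errnos={35})` hides the EAGAIN noise, so that any remaining entries are more likely to be the interesting failures.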


The poudriere run had several builders in parallel, at least 2 were 
running at that point in time. What the overlay does is to startup 
(sccache --start-server) the sccache server process (forks to return 
back on the command line) which creates a file system socket, and then 
it queries the stats (sccache --show-stats). So some of the traces in 
the tarball are the server start (those with "Timed 

TXT Kernel linking failed on -CURRENT

2024-04-24 Thread BSD USER

Sorry for HTML-trash from previous mail :)

Hi, FreeBSD Community!
I am learning FreeBSD and run -CURRENT on my test machine.
A few days ago, after
- git pull
- make buildworld
- make buildkernel
the kernel failed to link. My /etc/src.conf and the BSDSERV kernel config are
below. What can cause this error?
Thanks for your help!

My /usr/src state is:

 git log -n 1
commit a0d7d68a2dd818ce84e37e1ff20c8849cda6d853 (HEAD -> main, origin/main, origin/HEAD)

Author: Cy Schubert 


The kernel build failed with these messages:
--
--- force-dynamic-hack.pico ---
cc -target x86_64-unknown-freebsd15.0
    --sysroot=/usr/obj/usr/src/amd64.amd64/tmp
    -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -shared -O2 -pipe
    -fno-strict-aliasing -march=native -nostdinc -I. -I/usr/src/sys
    -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt
    -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h
    -fno-common -MD -MF.depend.force-dynamic-hack.pico
    -MTforce-dynamic-hack.pico
    -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include
    -fdebug-prefix-map=./x86=/usr/src/sys/x86/include
    -fdebug-prefix-map=./i386=/usr/src/sys/i386/include -mcmodel=kernel
    -mno-red-zone -mno-mmx -mno-sse -msoft-float
    -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -Wall
    -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual
    -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__
    -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas
    -Wswitch -Wno-error=tautological-compare -Wno-error=empty-body
    -Wno-error=parentheses-equality -Wno-error=unused-function
    -Wno-error=pointer-sign -Wno-error=shift-negative-value
    -Wno-address-of-packed-member -Wno-format-zero-length -mno-aes
    -mno-avx -std=gnu99 -nostdlib force-dynamic-hack.c
    -o force-dynamic-hack.pico

--- vers.c ---
MAKE="make" sh /usr/src/sys/conf/newvers.sh  BSDSERV
--- vers.o ---
cc -target x86_64-unknown-freebsd15.0
    --sysroot=/usr/obj/usr/src/amd64.amd64/tmp
    -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -c -O2 -pipe
    -fno-strict-aliasing -march=native -nostdinc -I. -I/usr/src/sys
    -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt
    -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h
    -fno-common
    -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include
    -fdebug-prefix-map=./x86=/usr/src/sys/x86/include
    -fdebug-prefix-map=./i386=/usr/src/sys/i386/include -mcmodel=kernel
    -mno-red-zone -mno-mmx -mno-sse -msoft-float
    -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -Wall
    -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual
    -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__
    -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas
    -Wswitch -Wno-error=tautological-compare -Wno-error=empty-body
    -Wno-error=parentheses-equality -Wno-error=unused-function
    -Wno-error=pointer-sign -Wno-error=shift-negative-value
    -Wno-address-of-packed-member -Wno-format-zero-length -mno-aes
    -mno-avx -std=gnu99 -Werror vers.c
--- kernel ---
linking kernel
ld: error: undefined symbol: ktrcapfail
>>> referenced by vfs_lookup.c
>>>   vfs_lookup.o:(namei)
>>> referenced by vfs_lookup.c
>>>   vfs_lookup.o:(namei_setup)
>>> referenced by vfs_lookup.c
>>>   vfs_lookup.o:(vfs_lookup)
>>> referenced 3 more times
*** [kernel] Error code 1
make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/BSDSERV
make[2]: 1 error
make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/BSDSERV
 1098.27 real  2002.17 user   176.26 sys
make[1]: stopped in /usr/src
make: stopped in /usr/src

/etc/src.conf
===
WITHOUT_APM=yes
WITHOUT_ASSERT_DEBUG=yes
WITHOUT_AUTHPF=yes
WITHOUT_BHYVE=yes
WITHOUT_BLACKLIST=yes
WITHOUT_BLUETOOTH=yes
WITHOUT_CCD=yes
WITHOUT_CXGBETOOL=yes
WITHOUT_DEBUG_FILES=yes
WITHOUT_DTRACE=yes
WITHOUT_FLOPPY=yes
WITHOUT_GOOGLETEST=yes
WITHOUT_HAST=yes
WITHOUT_HTML=yes
WITHOUT_HYPERV=yes
WITHOUT_INET6=yes
WITHOUT_IPFILTER=yes
WITHOUT_ISCSI=yes
WITHOUT_KDUMP=yes
WITHOUT_KERNEL_SYMBOLS=yes
WITH_MALLOC_PRODUCTION=yes
WITHOUT_MLX5TOOL=yes
WITHOUT_NVME=yes
WITHOUT_OFED=yes
WITHOUT_PF=yes
WITHOUT_PTHREADS_ASSERTIONS=yes
WITHOUT_RADIUS_SUPPORT=yes
WITHOUT_RELRO=yes
WITHOUT_SSP=yes
WITHOUT_WARNS=yes
WITHOUT_WERROR=yes
WITHOUT_TESTS=yes
WITHOUT_WIRELESS=yes
BSDSERV
===
cpu HAMMER
ident   BSDSERV
device  amdtemp
options SCHED_ULE   # ULE scheduler
options PREEMPTION  # Enable kernel thread preemption
options VIMAGE  # Subsystem virtualization, e.g. VNET
options INET    # InterNETworking
options TCP_OFFLOAD # TCP offload
options TCP_BLACKBOX    # Enhanced TCP event 

Kernel linking error on -CURRENT

2024-04-24 Thread USER BSD
Hi, FreeBSD Community!
I am learning FreeBSD and run -CURRENT on my test machine.
A few days ago, after
- git pull
- make buildworld
- make buildkernel
the kernel failed to link. My /etc/src.conf and the BSDSERV kernel config are
below. What can cause this error?
Thanks for your help!

The kernel build failed with these messages:
--- force-dynamic-hack.pico ---
cc -target x86_64-unknown-freebsd15.0
    --sysroot=/usr/obj/usr/src/amd64.amd64/tmp
    -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -shared -O2 -pipe
    -fno-strict-aliasing -march=native -nostdinc -I. -I/usr/src/sys
    -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt
    -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h
    -fno-common -MD -MF.depend.force-dynamic-hack.pico
    -MTforce-dynamic-hack.pico
    -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include
    -fdebug-prefix-map=./x86=/usr/src/sys/x86/include
    -fdebug-prefix-map=./i386=/usr/src/sys/i386/include -mcmodel=kernel
    -mno-red-zone -mno-mmx -mno-sse -msoft-float
    -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -Wall
    -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual
    -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__
    -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas
    -Wswitch -Wno-error=tautological-compare -Wno-error=empty-body
    -Wno-error=parentheses-equality -Wno-error=unused-function
    -Wno-error=pointer-sign -Wno-error=shift-negative-value
    -Wno-address-of-packed-member -Wno-format-zero-length -mno-aes
    -mno-avx -std=gnu99 -nostdlib force-dynamic-hack.c
    -o force-dynamic-hack.pico
--- vers.c ---
MAKE="make" sh /usr/src/sys/conf/newvers.sh  BSDSERV
--- vers.o ---
cc -target x86_64-unknown-freebsd15.0
    --sysroot=/usr/obj/usr/src/amd64.amd64/tmp
    -B/usr/obj/usr/src/amd64.amd64/tmp/usr/bin -c -O2 -pipe
    -fno-strict-aliasing -march=native -nostdinc -I. -I/usr/src/sys
    -I/usr/src/sys/contrib/ck/include -I/usr/src/sys/contrib/libfdt
    -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h
    -fno-common
    -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include
    -fdebug-prefix-map=./x86=/usr/src/sys/x86/include
    -fdebug-prefix-map=./i386=/usr/src/sys/i386/include -mcmodel=kernel
    -mno-red-zone -mno-mmx -mno-sse -msoft-float
    -fno-asynchronous-unwind-tables -ffreestanding -fwrapv -Wall
    -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wcast-qual
    -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__
    -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas
    -Wswitch -Wno-error=tautological-compare -Wno-error=empty-body
    -Wno-error=parentheses-equality -Wno-error=unused-function
    -Wno-error=pointer-sign -Wno-error=shift-negative-value
    -Wno-address-of-packed-member -Wno-format-zero-length -mno-aes
    -mno-avx -std=gnu99 -Werror vers.c
--- kernel ---
linking kernel
ld: error: undefined symbol: ktrcapfail
>>> referenced by vfs_lookup.c
>>>   vfs_lookup.o:(namei)
>>> referenced by vfs_lookup.c
>>>   vfs_lookup.o:(namei_setup)
>>> referenced by vfs_lookup.c
>>>   vfs_lookup.o:(vfs_lookup)
>>> referenced 3 more times
*** [kernel] Error code 1
make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/BSDSERV
make[2]: 1 error
make[2]: stopped in /usr/obj/usr/src/amd64.amd64/sys/BSDSERV
 1098.27 real  2002.17 user   176.26 sys
make[1]: stopped in /usr/src
make: stopped in /usr/src

/etc/src.conf
===
WITHOUT_APM=yes
WITHOUT_ASSERT_DEBUG=yes
WITHOUT_AUTHPF=yes
WITHOUT_BHYVE=yes
WITHOUT_BLACKLIST=yes
WITHOUT_BLUETOOTH=yes
WITHOUT_CCD=yes
WITHOUT_CXGBETOOL=yes
WITHOUT_DEBUG_FILES=yes
WITHOUT_DTRACE=yes
WITHOUT_FLOPPY=yes
WITHOUT_GOOGLETEST=yes
WITHOUT_HAST=yes
WITHOUT_HTML=yes
WITHOUT_HYPERV=yes
WITHOUT_INET6=yes
WITHOUT_IPFILTER=yes
WITHOUT_ISCSI=yes
WITHOUT_KDUMP=yes
WITHOUT_KERNEL_SYMBOLS=yes
WITH_MALLOC_PRODUCTION=yes
WITHOUT_MLX5TOOL=yes
WITHOUT_NVME=yes
WITHOUT_OFED=yes
WITHOUT_PF=yes
WITHOUT_PTHREADS_ASSERTIONS=yes
WITHOUT_RADIUS_SUPPORT=yes
WITHOUT_RELRO=yes
WITHOUT_SSP=yes
WITHOUT_WARNS=yes
WITHOUT_WERROR=yes
WITHOUT_TESTS=yes
WITHOUT_WIRELESS=yes

BSDSERV
===
cpu HAMMER
ident   BSDSERV
device  amdtemp
options SCHED_ULE   # ULE scheduler
options PREEMPTION  # Enable kernel thread preemption
options VIMAGE  # Subsystem virtualization, e.g. VNET
options INET    # InterNETworking
options TCP_OFFLOAD # TCP offload
options TCP_BLACKBOX    # Enhanced TCP event logging
options TCP_HHOOK   # hhook(9) framework for TCP
options TCP_RFC7413 # TCP Fast Open
options KERN_TLS    # TLS transmit & receive offload
options FFS # Berkeley Fast Filesystem
options SOFTUPDATES # Enable FFS soft updates support
options MD_ROOT

Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-23 Thread Philip Paeps

On 2024-04-24 02:12:41 (+0800), Mark Millard wrote:

> On Apr 19, 2024, at 07:16, Philip Paeps  wrote:
>
>> On 2024-04-18 23:02:30 (+0800), Mark Millard wrote:
>>
>>> void  wrote on
>>> Date: Thu, 18 Apr 2024 14:08:36 UTC :
>>>
>>>> Not sure where to post this..
>>>>
>>>> The last bulk build for arm64 appears to have happened around
>>>> mid-March on ampere2. Is it broken?
>>>
>>> main-armv7 building is broken and the last completed build
>>> was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
>>> gets stuck making no progress until manually forced to stop,
>>> which leads to huge elapsed times for the incomplete builds:
>>>
>>> pd5512ae7b8c6_s75464941dc 34472 12282  (+9196) 107  (+77) 4753  (+2247) 1390
>>>  (+529) 15940 parallel_build: Fri, 22 Mar 2024 11:05:01 GMT 651:21:56
>>>
>>> p43e3af5f5763_sf5f08e41aa 19809 5919  (+3126) 137  (+100) 5363  (+2741) 1395
>>>  (+522) 6995 parallel_build: Wed, 28 Feb 2024 15:46:14 GMT 359:42:14 ampere2
>>>
>>> ampere2 alternates between trying to build main-arm64 and main-armv7, so
>>> main-armv7 being stuck blocks main-arm64 from building.
>>>
>>> One can see that all 13 job ID's show over 570 hours:
>>>
>>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pd5512ae7b8c6_s75464941dc
>>>
>>> It is not random which packages are building when this happens. Compare:
>>>
>>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=p43e3af5f5763_sf5f08e41aa
>>>
>>> By contrast, the 19 Feb 2024 from-scratch (full) build worked:
>>>
>>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pe9c9c73181b5_sbd45bbe440
>>>
>>> My guess is that something in FreeBSD broke after bd45bbe440, was
>>> broken as of f5f08e41aa, and was still broken at 75464941dc.
>>
>> I'll kill the build on ampere2 again.  Thanks for the nudge.
>>
>> We don't really have good monitoring for this.  Also: builds should time out
>> after 36 hours.  The fact that this one does not is a bug in itself.
>>
>> Philip [hat: clusteradm]
>
> I'll note that I've never managed to replicate the problem for
> building for armv7 on aarch64. But my context never has the
> likes of:
>
> QUOTE
> Host OSVERSION: 156
> Jail OSVERSION: 1500015
> . . .
> !!! Jail is newer than host. (Jail: 1500015, Host: 156) !!!
> !!! This is not supported. !!!
> !!! Host kernel must be same or newer than jail. !!!
> !!! Expect build failures. !!!
> END QUOTE
>
> but always has the two OSVERSION's the same, such as:
>
> Host OSVERSION: 1500015
> Jail OSVERSION: 1500015
>
> or, recently,
>
> Host OSVERSION: 1500018
> Jail OSVERSION: 1500018
>
> My bulk runs do go through the sequence where the hangups
> have repeated for main-armv7 on ampere2.
>
> I wonder what would happen if "Host OSVERSION" was updated
> (modernized) to match the modern "Jail OSVERSION" that would
> be used?


The package builders are due for a regular refresh to newer -CURRENT 
dogfood.  I'll do the aarch64 builders first this time.


I've set /root/stop-builds on them.  I'll upgrade them when they go 
idle.  Or I'll kill them if they take much longer to build what they're 
building.  It annoys me that they do not stop building after 36 hours, 
like they're supposed to.


They're currently running:

n266879-6abee52e0d79   2023-12-09 01:06:28 jlduran strfmon: Silence scan-build warning

Our current clusteradm build is:

n269399-bbc6e6c5ec8c   2024-04-14 03:12:36 sigsys daemon: fix -R to enable supervision mode

I may do a new build while waiting for them to go idle:

-   quarterly 140arm64 1b931669de11 parallel_build 28776 15299 33 588 985 0 11871 3D:01:08:29
    https://pkg-status.freebsd.org/ampere1/build.html?mastername=140arm64-quarterly=1b931669de11
-   default main-arm64 p1c7a816cd0ad_s1bd4f769caf parallel_build 34528 19888 65 669 980 0 12926 4D:00:52:21
    https://pkg-status.freebsd.org/ampere2/build.html?mastername=main-arm64-default=p1c7a816cd0ad_s1bd4f769caf
-   default 140releng-armv7 2910ff97e727 parallel_build 34543 14826 60 5539 1397 0 12721 1D:09:35:28
    https://pkg-status.freebsd.org/ampere3/build.html?mastername=140releng-armv7-default=2910ff97e727


Philip



Re: April 2024 stabilization week

2024-04-23 Thread Gleb Smirnoff
  Hi FreeBSD/main users & developers,

this stabilization week [likely final] status update:

* Netflix testing didn't discover any stability issues with
  main-n269602-dd03eafacba9.
* Netflix testing didn't discover any substantial performance
  degradations.  The data is still being analyzed though.
* A regression with ZFS reported in 
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278494
  has been addressed by ZFS 9f83eec03904b18e052fbe2c66542bd47254cf57.
* An old (more than a month old) regression has been identified
  with accept_filter(9).
  Fixed by a8acc2bf5699556946dda2a37589d3c3bd9762c6.

Since several commits that are neither documentation changes nor trivial
bugfixes were pushed to FreeBSD/main during the stabilization week, I can't
guarantee that the above testing results apply to the current state of
FreeBSD/main.  That's why I created a temporary cherry-picking branch,
stabweek-2024-Apr, which is published at https://github.com/glebius/FreeBSD.git.

Users of FreeBSD/main are advised to choose one of the following options:

- Pull FreeBSD/main up to a8acc2bf5699556946dda2a37589d3c3bd9762c6 and use it
  as your stabilization point.  There is a tiny risk from untested changes
  added recently.
- Pull stabweek-2024-Apr from https://github.com/glebius/FreeBSD.git.
- Craft stabweek-2024-Apr yourself:
  # git checkout -b stabweek-2024-Apr dd03eafacba962c9dcec929c3ed9d63e7c43da3a
  # git cherry-pick -x --strategy=subtree -Xsubtree=sys/contrib/openzfs 9f83eec03904b18e052fbe2c66542bd47254cf57
  # git cherry-pick -x a8acc2bf5699556946dda2a37589d3c3bd9762c6

I'm planning to end the advisory freeze on the main branch Wednesday morning
at 8:00 UTC, unless somebody opposes that with a valid reason, e.g. a
regression that I missed.

-- 
Gleb Smirnoff




Re: pkg server for current/arm64 stopped ? [main-armv7 on ampere2, elapsed so far: 651:21:56]

2024-04-23 Thread Mark Millard
On Apr 19, 2024, at 07:16, Philip Paeps  wrote:

> On 2024-04-18 23:02:30 (+0800), Mark Millard wrote:
> 
>> void  wrote on
>> Date: Thu, 18 Apr 2024 14:08:36 UTC :
>> 
>>> Not sure where to post this..
>>> 
>>> The last bulk build for arm64 appears to have happened around
>>> mid-March on ampere2. Is it broken?
>> 
>> main-armv7 building is broken and the last completed build
>> was the one started on Mon, 19 Feb 2024 12:32:10 GMT. It
>> gets stuck making no progress until manually forced to stop,
>> which leads to huge elapsed times for the incomplete builds:
>> 
>> pd5512ae7b8c6_s75464941dc 34472 12282  (+9196) 107  (+77) 4753  (+2247) 1390 
>>  (+529) 15940 parallel_build: Fri, 22 Mar 2024 11:05:01 GMT 651:21:56
>> 
>> p43e3af5f5763_sf5f08e41aa 19809 5919  (+3126) 137  (+100) 5363  (+2741) 1395 
>>  (+522) 6995 parallel_build: Wed, 28 Feb 2024 15:46:14 GMT 359:42:14 ampere2
>> 
>> ampere2 alternates between trying to build main-arm64 and main-armv7, so 
>> main-armv7 being stuck blocks main-arm64 from building.
>> 
>> One can see that all 13 job ID's show over 570 hours:
>> 
>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pd5512ae7b8c6_s75464941dc
>> 
>> It is not random which packages are building when this happens. Compare:
>> 
>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=p43e3af5f5763_sf5f08e41aa
>> 
>> By contrast, the 19 Feb 2024 from-scratch (full) build worked:
>> 
>> http://ampere2.nyi.freebsd.org/build.html?mastername=main-armv7-default=pe9c9c73181b5_sbd45bbe440
>> 
>> My guess is that something in FreeBSD broke after bd45bbe440, was
>> broken as of f5f08e41aa, and was still broken at 75464941dc.
> 
> I'll kill the build on ampere2 again.  Thanks for the nudge.
> 
> We don't really have good monitoring for this.  Also: builds should time out 
> after 36 hours.  The fact that this one does not is a bug in itself.
> 
> Philip [hat: clusteradm]

I'll note that I've never managed to replicate the problem for
building for armv7 on aarch64. But my context never has the
likes of:

QUOTE
Host OSVERSION: 156
Jail OSVERSION: 1500015
 . .
!!! Jail is newer than host. (Jail: 1500015, Host: 156) !!!
!!! This is not supported. !!!
!!! Host kernel must be same or newer than jail. !!!
!!! Expect build failures. !!!
END QUOTE

but always has the two OSVERSION's the same, such as:

Host OSVERSION: 1500015
Jail OSVERSION: 1500015

or, recently,

Host OSVERSION: 1500018
Jail OSVERSION: 1500018

My bulk runs do go through the sequence where the hangups
have repeated for main-armv7 on ampere2.

I wonder what would happen if "Host OSVERSION" was updated
(modernized) to match the modern "Jail OSVERSION" that would
be used?
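
The rule quoted in the warning above ("Host kernel must be same or newer than jail") can be checked mechanically from a build log. This is a minimal sketch, not poudriere's own code: the helper names are made up for illustration, and the mixed 1500015/1500018 pair in the demo is a hypothetical combination of the two values mentioned above. On a running FreeBSD system the host value is also available via `sysctl kern.osreldate`.

```python
import re


def parse_osversions(log_text):
    """Extract (host, jail) OSVERSION integers from poudriere-style log text."""
    host = re.search(r"Host OSVERSION:\s*(\d+)", log_text)
    jail = re.search(r"Jail OSVERSION:\s*(\d+)", log_text)
    if host is None or jail is None:
        raise ValueError("log does not contain both Host and Jail OSVERSION lines")
    return int(host.group(1)), int(jail.group(1))


def jail_is_supported(host_osversion, jail_osversion):
    """The host kernel must be the same version as the jail, or newer."""
    return host_osversion >= jail_osversion


if __name__ == "__main__":
    # Hypothetical mix of the two OSVERSION values mentioned above.
    log = "Host OSVERSION: 1500015\nJail OSVERSION: 1500018\n"
    host, jail = parse_osversions(log)
    if jail_is_supported(host, jail):
        print(f"ok: host {host} >= jail {jail}")
    else:
        print(f"unsupported: jail {jail} is newer than host {host}")
```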



===
Mark Millard
marklmi at yahoo.com




Re: April 2024 stabilization week

2024-04-23 Thread Gleb Smirnoff
  Hi FreeBSD/main users & developers,

On Mon, Apr 22, 2024 at 01:00:50AM -0700, Gleb Smirnoff wrote:
T> This is an automated email to inform you that the April 2024 stabilization week
T> started with FreeBSD/main at main-n269602-dd03eafacba9, which was tagged as
T> main-stabweek-2024-Apr.
T> 
T> The tag main-stabweek-2024-Apr has been published at
T> https://github.com/glebius/FreeBSD/tags.  Those who want to participate
T> in the stabilization week are encouraged to update to the above
T> revision/tag and test their systems.

* Netflix testing didn't discover any stability issues with
  main-n269602-dd03eafacba9.  We are still running the performance test,
  but preliminary results are that everything is fine.
* My personal desktop/server experience with the tag has not shown any
  problems either.

Feel free to reply with your reports if you participated in the testing, too.

In bugzilla we have this submission, which looks important:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278494

I want to hear from Alexander and Martin before thawing the advisory freeze.
I don't want to declare the tag good if some ZFS systems fail to boot after
the upgrade.

-- 
Gleb Smirnoff



Re: llvm and Undefined symbols: ___truncsfbf2 problem

2024-04-23 Thread Hiroo Ono
Thank you.
I updated my system to a recent -CURRENT and confirmed that julia
1.11.0 beta1 builds and runs with the system clang (18.1.4).


On Thu, 18 Apr 2024 00:36:28 +0200
Dimitry Andric  wrote:

> On 11 Apr 2024, at 15:07, Hiroo Ono  wrote:
> > 
> > Hello,
> > 
> > I am trying to update the lang/julia port to 1.11.0 (currently
> > still in beta 1). I seem to have run across this problem, initially
> > reported on macOS: https://github.com/JuliaLang/julia/issues/52067
> > 
> > The llvm team seems to have patched this problem only for Darwin.
> > https://github.com/llvm/llvm-project/pull/84192
> > 
> > I think the solution is also needed for FreeBSD, but should I
> > report it directly to llvm team or report here or to FreeBSD
> > bugzilla and ask toolchain maintainer of FreeBSD to report
> > upstream?  
> 
> The __bf16 type is only available on some architectures, and only
> supported by relatively recent compiler versions, in combination with
> some runtime support (i.e. compiler-rt or libgcc).
> 
> Approximately: it is available on aarch64, amd64, arm (with fp), i386
> (with sse2) and riscv. And it is supported by clang 15 and later
> (though not for riscv, which requires clang 18), and gcc 13 and later.
> 
> However, the runtime support in FreeBSD was only added with the recent
> merge of llvm 18. The necessary library functions (truncdfbf2 and
> truncsfbf2) are now in compiler-rt.
> 
> -Dimitry
> 
> 
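
For readers wondering what the missing helper does: truncsfbf2 converts a single-precision float to bfloat16, which keeps the sign, the 8 exponent bits and the top 7 mantissa bits of the float32 encoding. The following is only an illustrative sketch of that conversion with round-to-nearest-even, written from the format definition; it is not the compiler-rt source, and the function name `f32_to_bf16_bits` is made up.

```python
import struct


def f32_to_bf16_bits(value):
    """Return the 16-bit bfloat16 encoding of a float, rounding to nearest even."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    exponent_all_ones = (bits & 0x7F800000) == 0x7F800000
    if exponent_all_ones and (bits & 0x007FFFFF):
        # NaN: keep the upper half and force the quiet bit so the payload
        # cannot be rounded away into an infinity.
        return (bits >> 16) | 0x0040
    # Add half of the discarded range, plus one more if the kept LSB is odd,
    # so that exact ties round to the even neighbour.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


if __name__ == "__main__":
    for x in (1.0, 1.00390625, 1.01171875, float("inf")):
        print(f"{x!r:>12} -> 0x{f32_to_bf16_bits(x):04X}")
```

The two middle inputs are exact ties between neighbouring bfloat16 values, which is where the even-rounding rule matters.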




Re: Strange network/socket anomalies since about a month

2024-04-22 Thread Gleb Smirnoff
  Alexander,

On Mon, Apr 22, 2024 at 09:26:59AM +0200, Alexander Leidinger wrote:
A> I see a higher failure rate of socket/network related stuff since a while.
A> Those failures are transient. Directly executing the same thing again may
A> or may not result in success/failure. I'm not able to reproduce this at
A> will. Sometimes they show up.
A> 
A> Examples:
A>  - poudriere runs with the sccache overlay (like ccache but also works for
A> rust) sometimes fail to create the communication socket and as such the
A> build fails. I have 3 different poudriere bulk runs after each other in my
A> build script, and when the first one fails, the second and third still run.
A> If the first fails due to the sccache issue, the second and 3rd may or may
A> not fail. Sometimes the first fails and the rest is ok. Sometimes all fail,
A> and if I then run one by hand it works (the script does the same as the
A> manual run, the script is simply a "for type in A B C; do; poudriere bulk
A> -O sccache -j $type -f  ${type}.pkglist; done" which I execute from the
A> same shell, and the script doesn't do env-sanitizing).
A>  - A webmail interface (inet / local net -> nginx (rev-proxy) -> nginx
A> (webmail service) -> php -> imap) sees intermittent issues sometimes.
A> Opening the same email directly again afterwards normally works. I've also
A> seen transient issues with pgp signing (webmail interface -> gnupg /
A> gpg-agent on the server), simply hitting send again after a failure works
A> fine.
A> 
A> Gleb, could this be related to the socket stuff you did 2 weeks ago? My
A> world is from 2024-04-17-112537. I do notice this since at least then, but
A> I'm not sure if they were there before that and I simply didn't notice
A> them. They are surely "new recently"; that amount of issues I haven't seen
A> in January. The last two updates of current I did before the last one were
A> on 2024-03-31-120210 and 2024-04-08-112551.

The stuff I pushed 2 weeks ago was a large rewrite of unix/stream, but that was
reverted, as it appears to need more work with respect to aio(4) and nfs/rpc,
and it also turned out that sendfile(2) over unix(4) has some non-zero use.

There were several preparatory commits that were not reverted and one of them
had a bug.  The bug manifested itself as failure to send(2) zero bytes over
unix/stream.  It was fixed with e6a4b57239dafc6c944473326891d46d966c0264. Can
you please check you have this revision? Other than that there are no known
bugs left.
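
One way to check whether a source tree already contains a given revision is `git merge-base --is-ancestor`, which exits 0 if the commit is an ancestor of the given ref and 1 if it is not. A minimal sketch follows, assuming the sources live in /usr/src; the helper name `has_commit` and the injectable `run` parameter are illustrative choices, not part of any existing tool.

```python
import subprocess

FIX_COMMIT = "e6a4b57239dafc6c944473326891d46d966c0264"


def has_commit(commit, repo="/usr/src", ref="HEAD", run=subprocess.run):
    """Return True if `commit` is an ancestor of `ref` in the checkout at `repo`."""
    result = run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref],
        capture_output=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other exit status means git itself failed (unknown commit, no repo, ...).
    raise RuntimeError(f"git failed with exit status {result.returncode}")


if __name__ == "__main__":
    try:
        present = has_commit(FIX_COMMIT)
    except (OSError, RuntimeError) as error:
        print(f"could not check: {error}")
    else:
        print("fix is present" if present else "fix is missing")
```

Note that this checks the source checkout, not the running kernel; whether the running kernel was built from that checkout still has to be confirmed separately.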

A> I could also imagine that some memory related transient failure could cause
A> this, but with >3 GB free I do not expect this. Important here may be that
A> I have https://reviews.freebsd.org/D40575 in my tree, which is memory
A> related, but it's only a metric to quantify memory fragmentation.
A> 
A> Any ideas how to track this down more easily than running the entire
A> poudriere in ktrace (e.g. a hint/script which dtrace probes to use)?

I don't have any better idea than running ktrace on the failing application.
Yep, I understand that poudriere will produce a lot.  But first we need to
determine which syscall fails and on what type of socket.  After that we can
narrow it down to using dtrace on very specific functions.

-- 
Gleb Smirnoff



Re: Strange network/socket anomalies since about a month

2024-04-22 Thread Paul Mather
On Apr 22, 2024, at 3:26 AM, Alexander Leidinger  wrote:


> Hi,
> 
> I see a higher failure rate of socket/network related stuff since a while. 
> Those failures are transient. Directly executing the same thing again may or 
> may not result in success/failure. I'm not able to reproduce this at will. 
> Sometimes they show up.
> 
> Examples:
> - poudriere runs with the sccache overlay (like ccache but also works for 
> rust) sometimes fail to create the communication socket and as such the build 
> fails. I have 3 different poudriere bulk runs after each other in my build 
> script, and when the first one fails, the second and third still run. If the 
> first fails due to the sccache issue, the second and 3rd may or may not fail. 
> Sometimes the first fails and the rest is ok. Sometimes all fail, and if I 
> then run one by hand it works (the script does the same as the manual run, 
> the script is simply a "for type in A B C; do; poudriere bulk -O sccache -j 
> $type -f  ${type}.pkglist; done" which I execute from the same shell, and the 
> script doesn't do env-sanitizing).
> - A webmail interface (inet / local net -> nginx (rev-proxy) -> nginx 
> (webmail service) -> php -> imap) sees intermittent issues sometimes. Opening 
> the same email directly again afterwards normally works. I've also seen 
> transient issues with pgp signing (webmail interface -> gnupg / gpg-agent on 
> the server), simply hitting send again after a failure works fine.
> 
> Gleb, could this be related to the socket stuff you did 2 weeks ago? My world 
> is from 2024-04-17-112537. I have noticed this since at least then, but I'm 
> not sure whether the failures were there before that and I simply didn't 
> notice them. They are surely "new recently"; I hadn't seen that amount of 
> issues in January. The last two updates of current I did before the last one 
> were on 2024-03-31-120210 and 2024-04-08-112551.
> 
> I could also imagine that some memory related transient failure could cause 
> this, but with >3 GB free I do not expect this. Important here may be that I 
> have https://reviews.freebsd.org/D40575 in my tree, which is memory related, 
> but it's only a metric to quantify memory fragmentation.
> 
> Any ideas how to track this down more easily than running the entire 
> poudriere in ktrace (e.g. a hint/script which dtrace probes to use)?


No answers, I'm afraid, just a "me too."

I have the same problem as you describe when using ports-mgmt/sccache-overlay 
when building packages with Poudriere.  In my case, I'm using FreeBSD 14-STABLE 
(stable/14-13952fbca).

I actually stopped using ports-mgmt/sccache-overlay because it got to the point 
where it didn't work more often than it did.  Then, a few months ago, I decided 
to start using it again on a whim and it worked reliably for me.  Then, 
starting a few weeks ago, it has reverted to the behaviour you describe above.  
It is not as bad right now as it got when I quit using it.  Now, sometimes it 
will fail, but it will succeed when re-running a "poudriere bulk" run.

I'd love it to go back to when it was working 100% of the time.

Cheers,

Paul.




April 2024 stabilization week

2024-04-22 Thread Gleb Smirnoff
  Hi FreeBSD/main users & developers:

This is an automated email to inform you that the April 2024 stabilization week
started with FreeBSD/main at main-n269602-dd03eafacba9, which was tagged as
main-stabweek-2024-Apr.

The tag main-stabweek-2024-Apr has been published at
https://github.com/glebius/FreeBSD/tags.  Those who want to participate
in the stabilization week are encouraged to update to the above
revision/tag and test their systems.

Developers are encouraged to avoid pushing new features to FreeBSD/main,
but focus on bugfixes instead.  The stabilization week runs up to
Friday 18:00 UTC, but if there is consensus that any regressions
discovered by participants have been fixed, it will end early.

Once that happens, the advisory freeze of FreeBSD/main branch is thawed.

--
Gleb Smirnoff



Strange network/socket anomalies since about a month

2024-04-22 Thread Alexander Leidinger

Hi,

I have been seeing a higher failure rate of socket/network related stuff 
for a while now. The failures are transient: directly executing the same 
thing again may or may not succeed. I'm not able to reproduce this at 
will; the failures just show up sometimes.


Examples:
 - poudriere runs with the sccache overlay (like ccache but also works 
for rust) sometimes fail to create the communication socket and as such 
the build fails. I have 3 different poudriere bulk runs after each other 
in my build script, and when the first one fails, the second and third 
still run. If the first fails due to the sccache issue, the second and 
3rd may or may not fail. Sometimes the first fails and the rest is ok. 
Sometimes all fail, and if I then run one by hand it works (the script 
does the same as the manual run, the script is simply a "for type in A B 
C; do poudriere bulk -O sccache -j $type -f ${type}.pkglist; done" 
which I execute from the same shell, and the script doesn't do 
env-sanitizing).
 - A webmail interface (inet / local net -> nginx (rev-proxy) -> nginx 
(webmail service) -> php -> imap) sees intermittent issues sometimes. 
Opening the same email directly again afterwards normally works. I've 
also seen transient issues with pgp signing (webmail interface -> gnupg 
/ gpg-agent on the server), simply hitting send again after a failure 
works fine.
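
To make it easier to see which of the runs failed and when, the loop
described above could be wrapped roughly like this. This is only a sketch:
BULK_CMD, LOG and run_all are hypothetical names for this illustration, not
poudriere settings, and the poudriere invocation itself is the one from the
build script.

```sh
#!/bin/sh
# BULK_CMD and LOG are hypothetical knobs for this sketch only.
: "${BULK_CMD:=poudriere}"
: "${LOG:=bulk-status.log}"

# Run one bulk build per jail name given as argument and append a
# UTC timestamp, the jail name and the exit status to $LOG.
run_all() {
    for type in "$@"; do
        # Same invocation as in the build script; only the bookkeeping is new.
        $BULK_CMD bulk -O sccache -j "$type" -f "${type}.pkglist"
        rc=$?
        printf '%s %s exit=%d\n' \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$type" "$rc" >> "$LOG"
    done
}

# Example call (not executed here):
#   run_all A B C
```

The timestamps in the log could then be matched against a ktrace dump or the
sccache server log for the same period.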


Gleb, could this be related to the socket stuff you did 2 weeks ago? My 
world is from 2024-04-17-112537. I have noticed this since at least then, 
but I'm not sure whether the failures were there before that and I simply 
didn't notice them. They are surely "new recently"; I hadn't seen that 
amount of issues in January. The last two updates of current I did before 
the last one were on 2024-03-31-120210 and 2024-04-08-112551.


I could also imagine that some memory related transient failure could 
cause this, but with >3 GB free I do not expect this. Important here may 
be that I have https://reviews.freebsd.org/D40575 in my tree, which is 
memory related, but it's only a metric to quantify memory fragmentation.


Any ideas how to track this down more easily than running the entire 
poudriere in ktrace (e.g. a hint/script which dtrace probes to use)?


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



