Re: time keeping fallback mechanics during reboot on octeon

2024-01-15 Thread Christian Gut
This behaved differently some releases ago. Since then the BOOT kernel has been 
introduced, and write access became necessary because of /dev/random and the 
upgrade kernel (bsd.upgrade).

> On 14. Jan 2024, at 09:32, Alexander Hall  wrote:
> 
> I don't have mine (EdgeRouter lite) running anymore, but IIRC, I had a cron 
> job poking the root fs to "resolve" this.
> 
> Sth like "mkdir /bump && rmdir /bump && sync".
> 
> /Alexander
> 
> On January 12, 2024 2:35:47 PM GMT+01:00, Christian Gut  
> wrote:
>> Hi,
>> 
>> Could somebody point me to documentation or tell me where OpenBSD gets the 
>> time from, when the system has no RTC and ntpd is not working?
>> 
>> I am using an EdgeRouter / octeon and at every reboot, the date/time gets 
>> reset to the exact same date.
>> 
>> I tried to read the source code of boot(9) and inittodr(9). I can see, that 
>> there seems to be a fallback to some timestamp that comes from the 
>> filesystem. Maybe when the root filesystem is mounted as of ffs_mountroot() 
>> for example. But my understanding did not go so far to identify from which 
>> file, directory, superblock or other filesystem metadata the information 
>> really comes from.
>> 
>> It seems to me, that either my system is broken or something on octeon does 
>> not work correctly for this fallback to happen correctly.
>> 
>> Kind Regards,
>> Christian
>> 
> 
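
The cron workaround Alexander recalls above could be sketched roughly as follows. This is only an illustration: the helper name `bump_fs`, the temporary directory name and the 30-minute interval are assumptions, not something taken from the thread, and whether the trick still helps on current releases is not confirmed here.

```shell
#!/bin/sh
# bump_fs (hypothetical helper): force a small metadata write on the
# filesystem that holds the given directory, then flush it to disk.
# The idea is that the write refreshes the filesystem's "last written"
# timestamp, which the kernel can fall back to when there is no RTC.
bump_fs() {
    dir="${1:-/}"
    tmp="${dir%/}/bump.$$"      # assumed name; the thread used /bump
    mkdir "$tmp" && rmdir "$tmp" && sync
}

# Possible root crontab entry (the interval is an assumption):
# */30 * * * * /bin/sh -c 'mkdir /bump && rmdir /bump && sync'
```

Calling `bump_fs /` as root would mirror the one-liner quoted above.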



Re: time keeping fallback mechanics during reboot on octeon

2024-01-13 Thread Christian Gut



> On 13. Jan 2024, at 10:03, Christian Gut  wrote:
> 
> 
> 
>> On 13. Jan 2024, at 00:58, Theo de Raadt  wrote:
>> 
>> I suspect this is due to how powerpc64 and octeon boot.  Their bootblocks are
>> a special kernel called BOOT which mounts the ffs filesystem directly.  I 
>> suspect
>> during the transition to loading GENERIC.MP something wrong happens with the
>> on-disk time information, which misleads the next kernel.
> 
> Is there anything I could do myself, or any information I could provide, to 
> improve that?
> I think I have one other of these machines where it seems to behave 
> differently.

I might have found the reason:

octeon seems to boot that special “BOOT” kernel, which has a ram disk. Inside 
that small boot process in 

src/sys/arch/octeon/stand/rdboot/disk.c
   disk_init() calls disk_proberoot()

all candidate filesystems are iterated over and mounted on /mnt in order to 
find the root filesystem. I assume that inside that boot kernel the clock has 
been set to the ramdisk's time. As it mounts the root filesystem temporarily, 
the timestamp of that filesystem gets set back to the time the ramdisk kernel 
is running with.

But that mount uses MNT_RDONLY?

These are a lot of assumptions; maybe someone with deeper understanding can 
confirm.




Re: time keeping fallback mechanics during reboot on octeon

2024-01-13 Thread Christian Gut



> On 13. Jan 2024, at 00:58, Theo de Raadt  wrote:
> 
> I suspect this is due to how powerpc64 and octeon boot.  Their bootblocks are
> a special kernel called BOOT which mounts the ffs filesystem directly.  I 
> suspect
> during the transition to loading GENERIC.MP something wrong happens with the
> on-disk time information, which misleads the next kernel.

Is there anything I could do myself, or any information I could provide, to 
improve that?
I think I have one other of these machines where it seems to behave differently.

That special kernel resides on a fat32 partition, as far as I know. Maybe I 
would need to “touch” or update that filesystem on shutdown? I did try to 
mount, change and unmount it. But I had no luck.

Kind Regards,

Christian



Re: time keeping fallback mechanics during reboot on octeon

2024-01-13 Thread Christian Gut



> On 12. Jan 2024, at 19:39, Otto Moerbeek  wrote:
> 
> On Fri, Jan 12, 2024 at 07:15:43PM +0100, Christian Weisgerber wrote:
> 
>> Otto Moerbeek:
>> 
>>> http://man.openbsd.org/octrtc seems to suggest EdgeRouter does not have
>>> an RTC. A dmesg should give more certainty.
>> 
>> I think the original poster is aware of this.
>> 
>> If I understand correctly, he expects that on reboot the system
>> clock is restored to the last value from before the reboot (so only
>> a minute or so is lost), and that this value is transported by a
>> time stamp on the root filesystem.  Apparently that doesn't happen.
> 
> Which is strange as the code to fall back to fs time is MI.
> That's why I asked for a dmesg, it might give a clue of what is going on.
> 

I should have clarified this. Thanks for pointing that out, Christian. That 
machine has no RTC or battery-backed clock. Please find dmesg at the end of 
this mail.

Here is what I observe in /var/log/messages; note the timestamps.

I think Theo’s remarks might be the right pointer...



/var/log/messages during reboot:
Jan 13 09:52:48 cgbox reboot: rebooted by root
Jan 13 09:52:51 cgbox syslogd[67592]: exiting on signal 15
Oct  9 00:09:42 cgbox syslogd[96730]: start
Oct  9 00:09:42 cgbox /bsd: [ using 763704 bytes of bsd ELF symbol table ]
Oct  9 00:09:42 cgbox /bsd: Copyright (c) 1982, 1986, 1989, 1991, 1993
Oct  9 00:09:42 cgbox /bsd: The Regents of the University of California.  All rights reserved.
Oct  9 00:09:42 cgbox /bsd: Copyright (c) 1995-2023 OpenBSD. All rights reserved.  https://www.OpenBSD.org
Oct  9 00:09:42 cgbox /bsd: Copyright (c) 1995-2023 OpenBSD. All rights reserved.  https://www.OpenBSD.org
Oct  9 00:09:42 cgbox /bsd: OpenBSD 7.4 (GENERIC.MP) #1382: Tue Oct 10 09:43:29 MDT 2023
Oct  9 00:09:42 cgbox /bsd: dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/GENERIC.MP
Oct  9 00:09:42 cgbox /bsd: real mem = 1073741824 (1024MB)
Oct  9 00:09:42 cgbox /bsd: avail mem = 1035239424 (987MB)
Oct  9 00:09:42 cgbox /bsd: random: good seed from bootblocks
Oct  9 00:09:42 cgbox /bsd: mainbus0 at root: board 20300 rev 0.15, model cavium,ubnt_e300
[...]
Oct  9 00:09:42 cgbox /bsd: scsibus2 at softraid0: 256 targets
Oct  9 00:09:42 cgbox /bsd: root on sd0a (e313814cbcc56c6c.a) swap on sd0b dump on sd0b
Oct  9 00:09:42 cgbox /bsd: WARNING: CHECK AND RESET THE DATE!
Oct  9 00:09:44 cgbox ntpd[49396]: recvmsg x.x.x.x: No route to host
Oct  9 00:09:44 cgbox ntpd[49396]: recvmsg x.x.x.x: No route to host
Oct  9 00:09:44 cgbox savecore: no core dump






# dmesg

[ using 774200 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2023 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 7.4 (GENERIC.MP) #1382: Tue Oct 10 09:43:29 MDT 2023
dera...@octeon.openbsd.org:/usr/src/sys/arch/octeon/compile/GENERIC.MP
real mem = 1073741824 (1024MB)
avail mem = 1035223040 (987MB)
random: boothowto does not indicate good seed
mainbus0 at root: board 20300 rev 0.15, model cavium,ubnt_e300
cpu0 at mainbus0: CN70xx/CN71xx CPU rev 0.2 1000 MHz, CN70xx/CN71xx FPU rev 0.0
cpu0: cache L1-I 78KB 39 way D 32KB 32 way, L2 1024KB 8 way
clock0 at mainbus0: int 5
octcrypto0 at mainbus0
iobus0 at mainbus0
simplebus0 at iobus0: "soc"
"bootbus" at simplebus0 not configured
octciu0 at simplebus0
octcib0 at simplebus0: max-bits 23
octcib1 at simplebus0: max-bits 12
octcib2 at simplebus0: max-bits 6
octcib3 at simplebus0: max-bits 15
octcib4 at simplebus0: max-bits 4
octcib5 at simplebus0: max-bits 11
octcib6 at simplebus0: max-bits 11
octgpio0 at simplebus0: 20 pins, xbit 16
octsmi0 at simplebus0
octsmi1 at simplebus0
octpip0 at simplebus0
octgmx0 at octpip0 interface 0
cnmac0 at octgmx0: port 0 SGMII, address 74:ac:b9:43:0d:79
ukphy0 at cnmac0 phy 4: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
cnmac1 at octgmx0: port 1 SGMII, address 74:ac:b9:43:0d:7a
ukphy1 at cnmac1 phy 5: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
cnmac2 at octgmx0: port 2 SGMII, address 74:ac:b9:43:0d:7b
ukphy2 at cnmac2 phy 6: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
cnmac3 at octgmx0: port 3 SGMII, address 74:ac:b9:43:0d:7c
ukphy3 at cnmac3 phy 7: Generic IEEE 802.3u media interface, rev. 2: OUI 0x0001c1, model 0x000c
octsctl0 at simplebus0: disabled
octxctl0 at simplebus0: DWC3 rev 0x250a
xhci0 at octxctl0, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
octxctl1 at simplebus0: DWC3 rev 0x250a
xhci1 at octxctl1, xHCI 1.0
usb1 at xhci1: USB revision 3.0
uhub1 at usb1 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
"i2c" at simplebus0 not configured
"i2c" at simplebus0 not configured
com0 at simplebus0: ns16550a, 64 byte fifo
com0: console
com1 at simplebus0: ns16550a, 64 

Re: time keeping fallback mechanics during reboot on octeon

2024-01-12 Thread Christian Gut
Hi Otto,


> On 12. Jan 2024, at 15:52, Otto Moerbeek  wrote:
> 
> On Fri, Jan 12, 2024 at 02:35:47PM +0100, Christian Gut wrote:
> 
>> Hi,
>> 
>> Could somebody point me to documentation or tell me where OpenBSD gets the 
>> time from, when the system has no RTC and ntpd is not working?
>> 
>> I am using an EdgeRouter / octeon and at every reboot, the date/time gets 
>> reset to the exact same date.
>> 
>> I tried to read the source code of boot(9) and inittodr(9). I can see, that 
>> there seems to be a fallback to some timestamp that comes from the 
>> filesystem. Maybe when the root filesystem is mounted as of ffs_mountroot() 
>> for example. But my understanding did not go so far to identify from which 
>> file, directory, superblock or other filesystem metadata the information 
>> really comes from.
>> 
>> It seems to me, that either my system is broken or something on octeon does 
>> not work correctly for this fallback to happen correctly.
>> 
>> Kind Regards,
>> Christian
>> 
> 
> When there is no RTC or reading it fails, the "last written" timestamp
> of the root filesystem is used. 

Somehow this does not seem to work on that system. 

> What are you observing exactly? You did not tell any details.  How
> does ntpd fail? Please show some logs (/var/log/daemon should have some
> lines).

I know why ntpd does not work. The system operates in a very constrained 
environment. No ntp servers reachable, no https reachable.

I am fine if the time stays roughly the same across a reboot: the same day, or 
even a few days off, is okay. But on that 7.4 system it always gets reset to 
Oct 9, 2023.

I checked the root filesystem. It is clean after reboot.
When the system is running and the time has been corrected, the following 
output suggests that the fallback from the filesystem should work correctly, 
but it doesn't:

# file -s /dev/rsd0a

/dev/rsd0a: Unix Fast File system [v2] (big-endian) last mounted on /, last 
written at Fri Jan 12 16:45:42 2024, clean flag 0, readonly flag 0, number of 
blocks 441848, number of data blocks 425527, number of cylinder groups 5, block 
size 16384, fragment size 2048, average file size 16384, average number of 
files in dir 64, pending blocks to free 0, pending inodes to free 0, 
system-wide uuid 0, minimum percentage of free blocks 5, TIME optimization

Anything I could check?
Maybe it has to do with the way Octeon boots?

Kind Regards,
Christian
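
To compare that superblock timestamp before and after a reboot, the "last written at" field can be pulled out of the `file -s` output shown above. The following is a minimal sketch; the function name `ffs_last_written` is made up for illustration, and it assumes the output format matches the sample in this mail.

```shell
#!/bin/sh
# ffs_last_written (hypothetical helper): read `file -s` output on stdin
# and print only the "last written at" timestamp.
# Line breaks and repeated blanks are collapsed first, so output that was
# wrapped (as in the mail above) is handled as well.
ffs_last_written() {
    tr '\n' ' ' | tr -s ' ' |
        sed -n 's/.*last written at \([^,]*\),.*/\1/p'
}

# Usage on the affected box (device name taken from the mail above):
#   file -s /dev/rsd0a | ffs_last_written
```

Running it once before `reboot` and once right after boot would show whether the on-disk value itself jumps back.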



time keeping fallback mechanics during reboot on octeon

2024-01-12 Thread Christian Gut
Hi,

Could somebody point me to documentation or tell me where OpenBSD gets the time 
from, when the system has no RTC and ntpd is not working?

I am using an EdgeRouter / octeon and at every reboot, the date/time gets reset 
to the exact same date.

I tried to read the source code of boot(9) and inittodr(9). I can see, that 
there seems to be a fallback to some timestamp that comes from the filesystem. 
Maybe when the root filesystem is mounted as of ffs_mountroot() for example. 
But my understanding did not go so far to identify from which file, directory, 
superblock or other filesystem metadata the information really comes from.

It seems to me, that either my system is broken or something on octeon does not 
work correctly for this fallback to happen correctly.

Kind Regards,
Christian



Re: 7.4 pfsync possible state update loop?

2023-12-04 Thread Christian Gut



> On 4. Dec 2023, at 10:59, Stuart Henderson  wrote:
> 
> On 2023-12-01, Christian Gut  wrote:
>> Hi List,
>> 
>> I just updated two carp/pfsync firewalls from 7.3 to 7.4. After updating the 
>> second box I see a massive increase in traffic on the sync interface. I now 
>> reproduced this with another pair of firewalls - same thing.
>> 
>> Both firewalls have three physical interfaces: external, internal and sync. 
>> The sync interface is connected directly via Ethernet cable and has an IP 
>> address.
>> 
>> Configuration of hostname.pfsync0:
>> syncdev em2
>> up
>> 
>> The way I updated these boxes, let's call them primary and secondary:
>> 
>> 1. update secondary to 7.4, including the change in hostname.pfsync0
>> 2. change hostname.carp0 to promote to master - reboot
>> 3. secondary is now master
>> 4. update primary to 7.4
>> => traffic on syncif increases
>> 
>> I tried so far - without any improvements:
>> - reboot both machines one after another
>> - promote primary again
>> - ifconfig pfsync0 down; pfctl -F states; ifconfig pfsync0 up
> 
> When you tried down/flush/up did you do it on both firewalls at the same
> time? (i.e. down pfsync on both, then flush on both, then up pfsync)?

I did this only on the slave, as doing it on the master firewall would affect 
production.

> 
>> I think they might see some kind of loop updating the states between each 
>> other. Could someone point me to how I could diagnose further?
> 
> pfsync was largely rewritten between 7.3 and 7.4, we found one problem
> like this but it was fixed before release.
> 
> Best way to proceed is probably to capture traffic on the pfsync
> interface with tcpdump and see if it relates to any particular state/s
> and if there's anything special about them or the rules that generate
> them.
> 
> bugs@ might be a better place than misc@ to continue this.

I will try to gather more input and send it to bugs@ - thanks.
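
A capture along the lines Stuart suggests might look like the sketch below. The wrapper name `capture_pfsync`, the output path and the packet count are assumptions for illustration; `em2` is the syncdev from the configuration quoted above, and the filter relies on pfsync using IP protocol 249.

```shell
#!/bin/sh
# capture_pfsync (hypothetical helper): record pfsync traffic seen on the
# sync device into a pcap file for later inspection.
# With DRY_RUN set, the tcpdump command is only printed, not executed.
capture_pfsync() {
    syncdev="${1:-em2}"             # syncdev from hostname.pfsync0
    out="${2:-/tmp/pfsync.pcap}"    # assumed output file
    count="${3:-5000}"              # assumed number of packets to keep
    ${DRY_RUN:+echo} tcpdump -n -i "$syncdev" -c "$count" -w "$out" ip proto 249
}

# Usage (as root):   capture_pfsync em2 /tmp/pfsync.pcap 5000
# Read it back with: tcpdump -n -vv -r /tmp/pfsync.pcap
```

Limiting the capture with `-c` keeps the file small enough to attach to a report for bugs@.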




7.4 pfsync possible state update loop?

2023-12-01 Thread Christian Gut
Hi List,

I just updated two carp/pfsync firewalls from 7.3 to 7.4. After updating the 
second box I see a massive increase in traffic on the sync interface. I now 
reproduced this with another pair of firewalls - same thing.

Both firewalls have three physical interfaces: external, internal and sync. The 
sync interface is connected directly via Ethernet cable and has an IP 
address.

Configuration of hostname.pfsync0:
syncdev em2
up

The way I updated these boxes, let's call them primary and secondary:

1. update secondary to 7.4, including the change in hostname.pfsync0
2. change hostname.carp0 to promote to master - reboot
3. secondary is now master
4. update primary to 7.4
=> traffic on syncif increases

I tried so far - without any improvements:
- reboot both machines one after another
- promote primary again
- ifconfig pfsync0 down; pfctl -F states; ifconfig pfsync0 up

I think they might see some kind of loop updating the states between each 
other. Could someone point me to how I could diagnose further?


Kind Regards,

Christian 


Re: i386 syspatch65-021_libcaut breaks bash and zsh

2019-12-06 Thread Christian Gut


> On 6. Dec 2019, at 13:20, Stuart Henderson  wrote:
> 
> On 2019-12-06, Christian Gut  wrote:
>> after installing syspatch65-021_libcaut on an i386 machine, bash and zsh 
>> installed from ports are broken:
>> 
>> $ bash
>> bash:bash: undefined symbol '__divdi3'
>> ld.so: bash: lazy binding failed!
>> Killed 
>> 
>> is this a known issue or am I doing something wrong?
>> 
> 
> It appears that the "library_aslr" step (by default this is run
> automatically at boot) fixes things up, so for now I would suggest
> that rebooting is likely to get your machine back to normal.
> 

Confirmed, reboot fixes this on 6.5 (both bash and zsh)

Thank you very much



i386 syspatch65-021_libcaut breaks bash and zsh

2019-12-06 Thread Christian Gut
Hi List,

after installing syspatch65-021_libcaut on an i386 machine, bash and zsh 
installed from ports are broken:

$ bash
bash:bash: undefined symbol '__divdi3'
ld.so: bash: lazy binding failed!
Killed 

is this a known issue or am I doing something wrong?

Kind Regards,

Christian



dhcpd(8) synchronisation

2018-01-30 Thread Christian Gut
Hi,

I just found the SYNCHRONISATION section of the dhcpd manual page. Sounds great 
- thanks for again implementing such a nice simple feature into OpenBSD.

I also had a look into dhcpd.c and sync.c and with my limited knowledge tried 
to answer the following question:

Is there any mechanism to compensate for a member of the synchronisation group 
being offline and coming back online after the other member(s) have generated 
new leases? As far as I understand, the member that is coming back online does 
not get bulk updates, as is the case with CARP and pfsync.

Am I right that the user of such a setup is expected to fetch a current copy 
of dhcpd.leases before starting dhcpd in sync mode? Or is there some other 
mechanism I did not see?

Kind Regards,
Christian


Re: multiple relays in smtpd.conf

2017-08-02 Thread Christian Gut

> On 2.Aug. 2017, at 14:09, Gilles Chehade <gil...@poolp.org> wrote:
> 
> On Wed, Aug 02, 2017 at 01:47:09PM +0200, Kirill Miazine wrote:
>> * Eric Faurot [2017-08-02 13:24]:
>>> On Wed, Aug 02, 2017 at 11:44:47AM +0200, Christian Gut wrote:
>>>> Hi List,
>>>> 
>>>> is it possible to have multiple relays (you might want to say smart hosts) 
>>>> in smtpd?
>>>> 
>>>> I currently use the following line:
>>>> 
>>>> accept from local for any relay via smarthost.example.org
>>>> 
>>>> Now I would like to have multiple smart hosts in there for backup reasons, 
>>>> if one of the smart hosts is in maintenance. Is something like this 
>>>> possible?
>>>> 
>>>> accept from local for any relay via { smarthost1.example.org, smarthost2.example.org }
>>>> 
>>>> Kind Regards,
>>>> Christian
>>>> 
>>> It's not possible at the moment.  There is ongoing work to support this 
>>> feature, along with other improvements. But it's quite a big change, and 
>>> we can't give an ETA right now.
>> 
>> what about defining a new name in DNS containing addresses of all
>> smarthosts as a workaround for the OP for now?
>> 
> 
> This can work in some use-cases, this is exactly what a co-worker did to
> work around the limitation.

How will smtpd operate then? Does it use the DNS records in a round robin 
fashion or does it try them one after another if they fail?

Christian


multiple relays in smtpd.conf

2017-08-02 Thread Christian Gut
Hi List,

is it possible to have multiple relays (you might want to say smart hosts) in 
smtpd?

I currently use the following line:

accept from local for any relay via smarthost.example.org

Now I would like to have multiple smart hosts in there for backup reasons, if 
one of the smart hosts is in maintenance. Is something like this possible?

accept from local for any relay via { smarthost1.example.org, smarthost2.example.org }

Kind Regards,
Christian



authpf kills other states from same user

2012-07-03 Thread Christian Gut
Hi List,

I am using authpf to grant users access to a special part of our network. The
same firewall doing this is also used for other network separation and
internet access.

If my observation and understanding of the manual page are correct, authpf
kills all states that correspond to the user_ip. For my setup that means, it
also kills connections to other parts of the network which have not been
created through authpf rules.

The documentation reads:

On session exit the same rules and table entries
that were added at startup are removed, and all states associated with
the client's IP address are purged.

Is it possible to configure authpf so that it only kills states which have
been created through authpf rules? If not, could this feature be considered for
future versions?

Kind regards,

Christian



Re: CARP/PFSYNC

2005-08-31 Thread Christian Gut

[EMAIL PROTECTED] wrote:
If the machine fails all is well [ ;) ] and the traffic is routed over the
other machine, however if only one interface fails, CARP notices this
and the interface is moved to the other machine, however this still
means that either ext_if or int_if is still left on the machine with
one failed card. This of course mucks up the routing! So my question
is, how do I best handle this?


CARP does this. From the manpage:

net.inet.carp.preempt:
Allow virtual hosts to preempt each other. It is also used to failover 
carp interfaces as a group.  When the option is enabled and one of the 
carp enabled physical interfaces goes down, advskew is changed to 240 on 
all carp interfaces.  See also the first example. Disabled by default.



My solution is that I have now started
coding a small daemon that will down the other interface
automatically should one fail.


That would be ifstated in the tree (not built by default)
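
Enabling the preempt option quoted from the manpage above could be done roughly like this. It is a sketch only: the helper name `enable_carp_preempt` is made up, and it simply appends the setting to a sysctl.conf-style file if it is not already there.

```shell
#!/bin/sh
# enable_carp_preempt (hypothetical helper): make sure the preempt setting
# is present exactly once in the given sysctl.conf-style file.
enable_carp_preempt() {
    conf="${1:-/etc/sysctl.conf}"
    line="net.inet.carp.preempt=1"
    # Append only if the exact line is not already in the file.
    grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
}

# To change the running system without a reboot (as root):
#   sysctl net.inet.carp.preempt=1
```

With this set on both firewalls, the failure of one carp-enabled physical interface should move all carp interfaces over together, as the manpage text describes.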