ipmi driver broken
I noticed back when I upgraded to 5.9 that the ipmi driver stopped working; it just said:

ipmi0: get header fails
ipmi0: no SDRs IPMI disabled

I found the following post at the time, which appeared to point out the issue and suggest a fix:

http://openbsd-archive.7691.n7.nabble.com/fix-for-quot-ipmi0-get-header-fails-quot-td299427.html

After applying this and installing the resulting kernel, ipmi worked fine. I skipped 6.0, but just updated my boxes to 6.1, and see the same ipmi failures. It looks like this fix hasn't been applied; the code in head is still missing this line. I applied it again to my 6.1 kernel and it still seems to make ipmi work fine as far as I can tell.

Is there anyone maintaining ipmi, or someone with commit privs, who might be kind enough to apply this so the next release would have working ipmi?

Thanks much...
Re: ipmi driver broken
On Wed, Jun 28, 2017 at 06:31:34PM -0400, Predrag Punosevac wrote:

> My understanding is that ipmi driver used by ipmitool is disabled
> intensionally due to the security problems. IPMI pose a grave security
> risk.

IPMI on the SP is available whether or not the openbsd driver is enabled or in use; my understanding as to why it's disabled by default is that it's not necessarily considered stable. I've never had an issue with it, at least not for the limited use I make of it.

> As you probably know OpenBSD comes with its own sensoring
> framework. You probably want to check out

Yes; I actually want the ipmi driver loaded so it can supply data to said framework:

hw.sensors.ipmi0.temp0=34.00 degC (System Temp), OK
hw.sensors.ipmi0.temp1=40.00 degC (Peripheral Temp), OK
hw.sensors.ipmi0.fan0=4875 RPM (FAN 1), OK
hw.sensors.ipmi0.fan1=3000 RPM (FAN 2), OK
hw.sensors.ipmi0.fan2=3150 RPM (FAN 3), OK
hw.sensors.ipmi0.fan3=5100 RPM (FAN 4), OK
hw.sensors.ipmi0.fan4=3300 RPM (FAN A), OK
hw.sensors.ipmi0.volt0=0.71 VDC (Vcore), OK
hw.sensors.ipmi0.volt1=3.23 VDC (3.3VCC), OK
hw.sensors.ipmi0.volt2=12.14 VDC (12V), OK
hw.sensors.ipmi0.volt3=1.53 VDC (VDIMM), OK
hw.sensors.ipmi0.volt4=4.99 VDC (5VCC), OK
hw.sensors.ipmi0.volt5=-12.49 VDC (-12V), OK
hw.sensors.ipmi0.volt6=3.17 VDC (VBAT), OK
hw.sensors.ipmi0.volt7=3.36 VDC (VSB), OK
hw.sensors.ipmi0.volt8=3.23 VDC (AVCC), OK
hw.sensors.ipmi0.indicator0=Off (Chassis Intru), OK

There's more sensor data available via the IPMI interface than the kernel supplies without it. It's also useful to be able to view the SEL without having to loop over the network to the SP management IP.

On my linux boxes I also use the ipmi hardware watchdog, but last time I tried that on openbsd it just kept rebooting continuously 8-/. Guess that's one of the parts that's not stable :), but I can't remember the last time one of my openbsd boxes wedged up anyway.

Anyway, thanks for the thoughts; but I do still want a working ipmi :). No biggie to add one line and recompile the kernel, but it would be nice to get it fixed. It's still disabled by default out of the box; you have to explicitly reconfigure your kernel to enable it.
Re: ipmi driver broken
> From: Theo de Raadt > Sent: Wednesday, June 28, 2017 8:41 PM > > If you want it working, you will need to get it fixed. On all > machines, so that we can renable it. I definitely don't want to be one of those entitled people demanding work from developers without providing anything that you trounce upon ;). But that's a bit of a big ask, make it work on all machines? I've got four different models of supermicro servers that I certainly can do testing on, although as I said, on these particular servers as far as I can tell (other than the watchdog) the driver seems to work fine. > Let me explain how we work. I understand; really, I'm not asking you guys to invest a significant amount of effort in improving the driver, or even technically "fixing" any new issues or problems with it. I was only kindly requesting that you put back a line that appears to have accidentally been deleted a few revisions ago that broke it. So unless you're intentionally sabotaging it in preparation for the ritual sacrifice :)? It's too bad nobody else finds value in it; it provides sensors that aren't otherwise available, provides access to the system event log for event data, allows access to the management interface without needing to go through the network, and ideally would allow access to the hardware watchdog. Unfortunately I don't have expertise in low level hardware device driver development so while I could be a tester I can't be a primary maintainer. So if you guys end up scrapping it, I will be sad but that's the way it is. But until then, given it works for me, it doesn't hurt to use it :). Or to ask for one line to be put back so it would work in the shipped kernel; unless I suppose said request results in it getting scrapped ;). Thanks.
Re: ipmi driver broken
> From: Ted Unangst > Sent: Wednesday, June 28, 2017 8:50 PM > > i'm afraid i won't make a very good ipmi maintainer, but i think i applied the > patch in the right spot. Cool, thanks; much appreciated.
openldap port mdb support
mdb has been disabled in the openldap port since what looks like 2015/02/16. I was wondering if anyone has tried it since then to see if maybe the issues with it have been resolved? The other backends are deprecated upstream, so it would be nice to get mdb working under openbsd. I'm going to try enabling it and running through the tests to see how things turn out, but I was just curious if anyone else had worked with it in the past couple of years. Thanks...
WARNING: symbol(icudt58_dat) size mismatch, relink your program
I'm trying to compile openldap from ports under 6.1, and running it fails with the error:

slapd:/usr/local/lib/libicuuc.so.12.0: /usr/local/lib/libicudata.so.12.0 : WARNING: symbol(icudt58_dat) size mismatch, relink your program

I see there was some discussion of this back around April, but no resolution, and I didn't see anything since then. Evidently it impacts anything that uses textproc/icu, from what I could tell.

I poked around with it a bit but nothing jumped out as to why it's doing this. The symbol seems to be defined in libicudata.so and accessed by libicuuc.so. The actual object file in the distribution that contains it is dynamically generated. I have the exact same version running ok on a linux box, so it doesn't seem to be an issue with the code itself.

Has anyone figured out what's going on with this code under openbsd that's causing it to fail like this?

Thanks...
Re: WARNING: symbol(icudt58_dat) size mismatch, relink your program
On Wed, Aug 02, 2017 at 05:37:40PM -0700, Paul B. Henson wrote: > I'm trying to compile openldap from ports under 6.1, and running it > fails with the error: > > slapd:/usr/local/lib/libicuuc.so.12.0: /usr/local/lib/libicudata.so.12.0 > : WARNING: symbol(icudt58_dat) size mismatch, relink your program I ended up checking out the 6.0 version of textproc/icu (57.1) into my 6.1 ports tree and compiling that, which seems to work fine. There must just be some weird issue with 58.2 under OpenBSD.
Re: openldap port mdb support
On Mon, Jul 10, 2017 at 07:34:11AM +, Stuart Henderson wrote:

> Feel free to try it, I believe the required patch to force MDB_WRITEMAP
> is still in there..but I don't think there were any major changes upstream
> since the last attempt so I wouldn't hold out too much hope for it working
> straight off.

Hmm, as you said, trying to use mdb resulted in crashes. My initial debugging led to the cause of this as a NULL mdb environment, and ironically the root cause of that turned out to be the OpenBSD specific MDB_WRITEMAP patch 8-/.

	if ( !(flags & MDB_WRITEMAP) ) {
		Debug( LDAP_DEBUG_ANY,
			LDAP_XSTRING(mdb_db_open) ": database \"%s\" does not have writemap. "
			"This is required on systems without unified buffer cache.\n",
			be->be_suffix[0].bv_val, rc, 0 );
		goto fail;
	}

There are two problems with it. First, it accesses the local flags variable before it is initialized to mdb->mi_dbenv_flags shortly thereafter, so the value checked is random and the if block nondeterministically triggers. Second, it doesn't assign a failure value to rc before it jumps to fail, so the function returns successfully but with a closed be, and the code keeps going but later segfaults because of the NULL mdb environment.

I updated the patch and moved the check to be after the flags initialization:

	flags = mdb->mi_dbenv_flags;

and added an assignment to rc on failure:

	rc = MDB_INCOMPATIBLE;

I then tweaked the mdb test suite to always enable MDB_WRITEMAP, and so far it's been running for 20 minutes with no errors, crashes, or failures. Right now it's compiled "-O0 -ggdb"; if everything keeps looking good, I'll recompile it normally and do more testing.
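The two fixes described above can be illustrated with a minimal standalone sketch. This is not the OpenLDAP code: struct env_info, db_open_checked(), WRITEMAP_FLAG and ERR_INCOMPATIBLE are hypothetical stand-ins for mdb->mi_dbenv_flags, mdb_db_open, MDB_WRITEMAP and MDB_INCOMPATIBLE, and the numeric values are made up. It only shows the ordering: read the flags before testing them, and set a failure code before jumping to the cleanup label.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real constants live in the LMDB headers. */
#define WRITEMAP_FLAG    0x80000u
#define ERR_INCOMPATIBLE (-1)

struct env_info {
	unsigned int dbenv_flags;   /* plays the role of mdb->mi_dbenv_flags */
	int          is_open;       /* nonzero while the environment is usable */
};

/*
 * Returns 0 on success and nonzero on failure. On failure the
 * environment is marked closed, so a caller that checks the return
 * value never touches a closed environment.
 */
int
db_open_checked(struct env_info *env)
{
	unsigned int flags;
	int rc = 0;

	assert(env != NULL);

	/* Fix 1: initialize the local copy before anything tests it. */
	flags = env->dbenv_flags;
	env->is_open = 1;

	if (!(flags & WRITEMAP_FLAG)) {
		/* Fix 2: record a failure code before jumping to cleanup. */
		rc = ERR_INCOMPATIBLE;
		goto fail;
	}
	return rc;

fail:
	env->is_open = 0;
	return rc;
}
```

Without the rc assignment the function would return 0 with is_open cleared, which is the "returns successfully but closed" situation that led to the later segfault.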
Re: WARNING: symbol(icudt58_dat) size mismatch, relink your program
On Thu, Aug 03, 2017 at 05:33:15PM -0400, Predrag Punosevac wrote: > It is well known issue. > > https://marc.info/?l=openbsd-misc&m=149271724912565&w=2 > > It seems to be benign at least for my use case. Yah, I saw that discussion from back in April, but then it just stopped with no resolution. I'm not sure what your use case is, but as far as I can tell, it's preventing programs linked against libicuuc.so from running? So not too benign for me 8-/. But fortunately downgrading to the 6.0 version of the port seems to have worked around the issue. Thanks...
Re: WARNING: symbol(icudt58_dat) size mismatch, relink your program
On Sat, Aug 05, 2017 at 12:35:24AM +, Stuart Henderson wrote: > The ports@ list is a better venue for ports-related queries, > please see this: https://marc.info/?l=openbsd-ports&m=150157643516239&w=2 Ah, ok, thanks for the pointer. > This is not preventing programs from running. Hmm, I could've sworn I got that message and then slapd failed to start. Dunno, maybe I got confused. Once I'm done working with openldap mdb I'll start over from scratch and try again and see what happens. Thanks for the info...
Re: OpenBSD 6.1 some Warnings when using OpenLDAP Tools
On Wed, Aug 09, 2017 at 09:06:19AM +0200, Markus Rosjat wrote: > this is more an info then a problem though since it seems to work. > When I use the slap tool like slapcat I get a size mismatch warning like > this Heh, we were just talking about that: https://marc.info/?l=openbsd-misc&m=150199443929908&w=2
opensmtpd / ldap unreliable
So I recently converted my opensmtpd server to use ldap as the backend for user authentication. It seems it's a bit intolerant of ldap issues? If the ldap server isn't available when opensmtpd is started, it says it started:

# /etc/rc.d/smtpd start
smtpd(ok)

But it isn't there:

# ps -aux | grep smtpd
root 89090 0.0 0.0 304 1208 p6 S+p 5:52PM 0:00.00 grep smtpd

And it's not really obvious why:

May 22 17:52:51 bart smtpd[46044]: info: OpenSMTPD 6.0.4 starting
May 22 17:52:51 bart smtpd[23325]: warn: table-proc: pipe closed
May 22 17:52:51 bart smtpd[23325]: lookup: table-proc: exiting
May 22 17:52:51 bart smtpd[73239]: smtpd: process lka socket closed

Starting in debug mode:

# smtpd -d
info: OpenSMTPD 6.0.4 starting
users[43283]: debug: reading key "url" -> "ldap://localhost:3389"
users[43283]: debug: reading key "basedn" ->
users[43283]: debug: reading key "username" ->
users[43283]: debug: reading key "password" ->
users[43283]: debug: reading key "credentials_filter" -> "(&(objectClass=uidObject)(uid=%s))"
users[43283]: debug: parsing attribute "credentials_attributes" (2) -> "uid,description"
users[43283]: debug: done reading config
users[43283]: warn: aldap_parse
users[43283]: fatal: failed to connect
warn: table-proc: pipe closed
lookup: table-proc: exiting
smtpd: process lka socket closed

You can see it looks like it fails to connect to the ldap server at startup and just dies.

Further, if the ldap server is up at startup, but ever restarts or has the connection broken, authentication just fails:

May 21 13:22:10 bart smtpd[42132]: warn: user credentials lookup fail for users:henson

The opensmtpd process needs to be restarted before authentication works again. In debug mode, it shows:

users[7295]: debug: table_ldap: ldap_query: filter=(&(objectClass=uidObject)(uid=henson)), ret=0
5e46e2fabbf8d72e smtp event=authentication user=henson address=134.71.249.41 host=134.71.249.41 result=permfail

Is it expected that the ldap support is currently not production ready?
I see in a presentation from back in 2013 that ldap was classified experimental at the time, but it's not clear if that's still the case. I see in the repo at https://github.com/OpenSMTPD/OpenSMTPD-extras/blob/master/extras/tables/table-ldap/table_ldap.c there's a change to add ldap reconnection support: https://github.com/OpenSMTPD/OpenSMTPD-extras/commit/04e4c521b34d1987af915ff97dcb0d87daf122b0#diff-369c0fcbfbc85bf2cdad7dba1131b872 but it's dated 7/27/2017, and the last github release seems to be 201601072302 (although the openbsd port appears to be 201703132115, I guess it's not downloading it from github?). It looks like the code in head still fails to start if the ldap server isn't available when opensmtpd is started though. Is anybody using opensmtpd with ldap in production? If so, how are you working around this issue? Thanks...
Re: opensmtpd / ldap unreliable
> From: justina colmena
> Sent: Tuesday, May 22, 2018 9:08 PM
>
> Are they being started in the wrong order at boot time?

The LDAP server in use is not running on the local OpenBSD system. It might not be available due to an underlying network issue or some other problem that temporarily prevents successful connections/queries.

> What you ask is a very general question: If A depends on B, and B is
> missing, how do expect A to behave?

In this specific case, I expect A to complain it was unable to contact B, to continue initializing, return temporary failures for any operation which requires B, and reattempt a connection to B on a regular basis until it is successful. From a reliability and fault tolerance perspective, falling over and dying doesn't seem a very good choice for the circumstances.
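The behavior asked for above (keep running, return temporary failures, retry on a regular basis) could look roughly like the following sketch. It is not OpenSMTPD code; struct backend, backend_lookup(), the LOOKUP_* values and the two stub connectors are hypothetical names invented for illustration, and the caller passes in the current time so the retry logic can be exercised without waiting.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

enum lookup_result { LOOKUP_OK, LOOKUP_TEMPFAIL };

struct backend {
	int    connected;            /* nonzero once a connection exists */
	time_t next_retry;           /* earliest time to try connecting again */
	time_t retry_interval;       /* seconds between connection attempts */
	int  (*try_connect)(void);   /* returns nonzero on success */
};

/* Stub connectors standing in for a real LDAP connect call. */
int stub_connect_fail(void) { return 0; }
int stub_connect_ok(void)   { return 1; }

/* Startup never fails here: the first connection attempt is deferred. */
void
backend_init(struct backend *b, int (*try_connect)(void), time_t interval)
{
	assert(b != NULL && try_connect != NULL);
	b->connected = 0;
	b->next_retry = 0;           /* allow an immediate first attempt */
	b->retry_interval = interval;
	b->try_connect = try_connect;
}

/*
 * Attempt a (re)connection if one is due, then report either success
 * or a temporary failure. The actual query is omitted from the sketch.
 */
enum lookup_result
backend_lookup(struct backend *b, time_t now)
{
	assert(b != NULL);
	if (!b->connected && now >= b->next_retry) {
		b->connected = (b->try_connect() != 0);
		if (!b->connected)
			b->next_retry = now + b->retry_interval;
	}
	return b->connected ? LOOKUP_OK : LOOKUP_TEMPFAIL;
}
```

Rate limiting the attempts with next_retry is one possible design choice; it keeps a dead server from adding a connection timeout to every single lookup.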
Re: opensmtpd / ldap unreliable
> From: Gilles Chehade > Sent: Wednesday, May 23, 2018 1:20 PM > > That's bad but could easily be fixed if you want to help us Definitely; I'll pull the latest github head down and see if that fixes the LDAP connection recovery after startup issue, and then I can try any suggestions to make it more reliable at startup or possibly fiddle with that code myself. > That would be a bad idea... it's experimental :-p I did see that mentioned circa 2013, but I guess I kind of hoped it had moved beyond that by now :). Thanks much.
Re: opensmtpd / ldap unreliable
> From: Gilles Chehade > Sent: Wednesday, May 23, 2018 1:20 PM > > That's bad but could easily be fixed if you want to help us So I dropped in the latest table-ldap from git, and it still failed authentications after an LDAP server outage. It looks like the check is only in the table_ldap_check function? I'm not sure what that's for, but it doesn't seem to be called at all when doing authentication. I added a similar check into the table_ldap_lookup function, and also had to reorder the functions in the file a bit due to errors like this: table_ldap.c:92:15: warning: implicit declaration of function 'ldap_open' is invalid in C99 [-Wimplicit-function-declaration] Afterwards, opensmtpd successfully reconnected to LDAP and performed authentication after an LDAP outage :). users[14726]: debug: table_ldap: ldap_query: filter=(&(objectClass=uidObject)(uid=henson)), ret=0 users[14726]: debug: table-ldap: reconnecting users[14726]: info: table-ldap: closed previous connection users[14726]: debug: ldap server accepted credentials users[14726]: debug: table_ldap: ldap_query: filter=(&(objectClass=uidObject)(uid=henson)), ret=1 Here's what my changes currently are. I can submit a pull request on github if you'd like. Thanks. 
diff --git a/extras/tables/table-ldap/table_ldap.c b/extras/tables/table-ldap/table_ldap.c
index 88c9ffd..9d20526 100644
--- a/extras/tables/table-ldap/table_ldap.c
+++ b/extras/tables/table-ldap/table_ldap.c
@@ -74,45 +74,6 @@ table_ldap_update(void)
 	return 1;
 }
 
-static int
-table_ldap_check(int service, struct dict *params, const char *key)
-{
-	int ret;
-
-	switch(service) {
-	case K_ALIAS:
-	case K_DOMAIN:
-	case K_CREDENTIALS:
-	case K_USERINFO:
-	case K_MAILADDR:
-		if ((ret = ldap_run_query(service, key, NULL, 0)) >= 0) {
-			return ret;
-		}
-		log_debug("debug: table-ldap: reconnecting");
-		if (!(ret = ldap_open())) {
-			log_warnx("warn: table-ldap: failed to connect");
-		}
-		return ret;
-	default:
-		return -1;
-	}
-}
-
-static int
-table_ldap_lookup(int service, struct dict *params, const char *key, char *dst, size_t sz)
-{
-	switch(service) {
-	case K_ALIAS:
-	case K_DOMAIN:
-	case K_CREDENTIALS:
-	case K_USERINFO:
-	case K_MAILADDR:
-		return ldap_run_query(service, key, dst, sz);
-	default:
-		return -1;
-	}
-}
-
 static int
 table_ldap_fetch(int service, struct dict *params, char *dst, size_t sz)
 {
@@ -361,6 +322,32 @@ err:
 	return 0;
 }
 
+static int
+table_ldap_lookup(int service, struct dict *params, const char *key, char *dst, size_t sz)
+{
+	int ret;
+
+	switch(service) {
+	case K_ALIAS:
+	case K_DOMAIN:
+	case K_CREDENTIALS:
+	case K_USERINFO:
+	case K_MAILADDR:
+		if ((ret = ldap_run_query(service, key, dst, sz)) > 0) {
+			return ret;
+		}
+		log_debug("debug: table-ldap: reconnecting");
+		if (!(ret = ldap_open())) {
+			log_warnx("warn: table-ldap: failed to connect");
+			return ret;
+		}
+		return ldap_run_query(service, key, dst, sz);
+	default:
+		return -1;
+	}
+}
+
+
 static int
 ldap_query(const char *filter, char **attributes, char ***outp, size_t n)
 {
@@ -498,6 +485,31 @@ end:
 	return ret;
 }
 
+static int
+table_ldap_check(int service, struct dict *params, const char *key)
+{
+	int ret;
+
+	switch(service) {
+	case K_ALIAS:
+	case K_DOMAIN:
+	case K_CREDENTIALS:
+	case K_USERINFO:
+	case K_MAILADDR:
+		if ((ret = ldap_run_query(service, key, NULL, 0)) >= 0) {
+			return ret;
+		}
+		log_debug("debug: table-ldap: reconnecting");
+		if (!(ret = ldap_open())) {
+			log_warnx("warn: table-ldap: failed to connect");
+		}
+		return ret;
+	default:
+		return -1;
+	}
+}
+
+
 int
 main(int argc, char **argv)
 {
Re: opensmtpd / ldap unreliable
On Sat, May 26, 2018 at 08:16:28AM +0200, Gilles Chehade wrote: > please do so we have more people able to test Done, thanks. What are your thoughts design-wise on dealing with ldap not being available at startup? Should layer 7 issues (ldap auth failed, etc) be handled differently than transport level issues (connection refused/timed out)?
smtpd new "relay as" syntax?
I just upgraded to OpenBSD 6.4, and I'm trying to figure out how to do this with the new syntax:

accept from local for any relay via smtp://smtp.domain.com as "@domain.com"

This would rewrite the outbound message to masquerade as being from the domain rather than from a specific machine. Right now I've got:

action local_relay relay host smtp.domain.com
match from local for any action local_relay

But this doesn't do the rewriting. The only thing I see in the man page talks about 'senders [masquerade]', which seems to be for authenticated users. Am I missing something obvious?

Thanks...
Re: smtpd new "relay as" syntax?
On Wed, Oct 31, 2018 at 08:07:09PM -0400, TronDD wrote:

> Mail-from in the action options, I believe.

Ah, yes; that seems to work, thanks. The previous implementation was documented as:

    If the as parameter is specified, smtpd(8) will rewrite the sender advertised in the SMTP session. address may be a user, a domain prefixed with `@', or an email address, causing smtpd(8) to rewrite the user-part, the domain-part, or the entire address, respectively.

whereas this just said:

    mail-from mailaddr
        Use mailaddr as the MAIL FROM address within the SMTP transaction.

It wasn't clear it would do the same rewriting functionality; I thought at first it just took a single email address.
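For reference, the three rewriting cases in the old "as" documentation quoted above can be written out as a small function. This is only an illustration of the documented semantics, not code from smtpd; rewrite_sender() and its signature are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite the sender `orig` according to `as`:
 *   "user"        -> replace the user-part, keep the domain-part
 *   "@domain"     -> keep the user-part, replace the domain-part
 *   "user@domain" -> replace the entire address
 * Returns 0 on success, -1 if `orig` has no '@' or `dst` is too small.
 */
int
rewrite_sender(const char *orig, const char *as, char *dst, size_t sz)
{
	const char *at;
	int n;

	assert(orig != NULL && as != NULL && dst != NULL);

	at = strrchr(orig, '@');
	if (at == NULL)
		return -1;

	if (as[0] == '@') {
		/* domain form: copy everything before the '@', append `as` */
		n = snprintf(dst, sz, "%.*s%s", (int)(at - orig), orig, as);
	} else if (strchr(as, '@') == NULL) {
		/* bare user: `as` followed by the original "@domain" */
		n = snprintf(dst, sz, "%s%s", as, at);
	} else {
		/* full address: use `as` unchanged */
		n = snprintf(dst, sz, "%s", as);
	}
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

With "@domain.com" as the parameter, a sender such as root@host.domain.com comes out as root@domain.com, which is the masquerading effect the original rule was after.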
isc bind - error sending response: would block
I recently updated a couple servers that were running OpenBSD 6.3 with bind 9.11.3 to OpenBSD 6.4 and bind 9.11.4pl2. Since then, I've been getting a large number of "error sending response: would block" log messages:

Nov 15 11:03:58 lisa named[79587]: client @0x6f2f02bc440 10.128.30.77#65198 (p64-keyvalueservice.icloud.com): view internal: error sending response: would block
Nov 15 11:07:42 lisa named[79587]: client @0x6f325b7a440 10.128.0.19#1851 (alt1.gmail-smtp-in.l.google.com): view internal: error sending response: would block

I reviewed the article at https://kb.isc.org/docs/aa-00717 ; but it's not clear if this is just a warning message, and it tries again and successfully responds to the client, or if it's a hard error and the client never gets a response.

I wasn't getting any errors before the upgrade, and I don't think the load on these servers is anywhere near high enough to cause them to be overloaded.

Any thoughts on what might be going on? New bug in bind? Change in OpenBSD? So far I haven't gotten a response on the bind mailing list.

Thanks...
mysteriously disappearing pf state entries
I'm running OpenBSD 6.6 operating as an inter-VLAN and border router using pf. Recently I wanted to use a nondefault state timeout for some UDP traffic traversing from my voip subnet to a provider off site. Within pf, there are three rules involved. The first is for traffic coming from the voip subnet, which gets a six minute state timeout and a tag: pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP keep state (udp.multiple 360) Then there is a NAT rule: match out on $ext_if from 10.128.0.0/16 nat-to { $ext_vip } sticky-address And a rule giving the traffic going out to the Internet a six minute timeout as well: pass out quick on $ext_if proto udp tagged VOIP_UDP keep state (udp.multiple 360) This initially looks like it worked; after the initial connection: bash-5.0# pfctl -v -s state | grep -A1 '110.73:9430' all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:00:00, expires in 00:06:00, 7:5 pkts, 4451:2203 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:00:00, expires in 00:06:00, 7:5 pkts, 4451:2203 bytes, rule 48, source-track There are two states, one with the internal addressing and one for the NAT translation, both with six minute timeouts. 
As time goes by: all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:00:08, expires in 00:05:54, 31:31 pkts, 16469:18285 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:00:08, expires in 00:05:54, 31:31 pkts, 16469:18285 bytes, rule 48, source-track all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:00:20, expires in 00:05:42, 31:31 pkts, 16469:18285 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:00:20, expires in 00:05:42, 31:31 pkts, 16469:18285 bytes, rule 48, source-track More packets are seen, resetting the timeout: all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:00:23, expires in 00:05:58, 32:32 pkts, 16872:19073 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:00:23, expires in 00:05:58, 32:32 pkts, 16872:19073 bytes, rule 48, source-track all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:00:38, expires in 00:05:43, 32:32 pkts, 16872:19073 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:00:38, expires in 00:05:43, 32:32 pkts, 16872:19073 bytes, rule 48, source-track again: all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:00:41, expires in 00:06:00, 33:33 pkts, 17275:19931 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:00:41, expires in 00:06:00, 33:33 pkts, 17275:19931 bytes, rule 48, source-track all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:00:58, expires in 00:05:43, 33:33 pkts, 17275:19931 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:00:58, expires in 00:05:43, 33:33 pkts, 17275:19931 bytes, rule 48, source-track etc, etc: all udp 
198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:02:26, expires in 00:05:52, 37:37 pkts, 18863:23594 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:02:26, expires in 00:05:52, 37:37 pkts, 18863:23594 bytes, rule 48, source-track Until finally, there are no more packets for a while: all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:02:36, expires in 00:05:54, 47:46 pkts, 24551:29876 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:02:36, expires in 00:05:54, 47:46 pkts, 24551:29876 bytes, rule 48, source-track all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:03:31, expires in 00:04:59, 47:46 pkts, 24551:29876 bytes, rule 63 all udp 96.251.22.157:55202 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:03:31, expires in 00:04:59, 47:46 pkts, 24551:29876 bytes, rule 48, source-track After this, the next time I look a couple seconds later, the state is gone? It reproducibly seems to disappear a minute after the last traffic is seen on the connection. Yet the timeout says 5 minutes are left? Why would the state be removed when it still had five minutes left before it expired? I know if it were a TCP state, it might go away before the timeout expires if the connection is shut down. But this is a UDP state. What would cause it to go away before the timeout expiration? Is there something wrong
lost pf state - disappeared before expiration?
I'm trying to set a longer timeout on a udp state, and for some reason it seems to be disappearing before the expiration 8-/. There are 3 rules involved: pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP keep state (udp.multiple 360) pass out quick on $ext_if proto udp tagged VOIP_UDP keep state (udp.multiple 360) match out on $ext_if from 10.128.0.0/16 nat-to { $ext_vip } sticky-address I turned on pf debugging, when the connection is created I see: May 17 15:36:39 lisa /bsd: pf: key search, in on vlan110: UDP wire: (0) 10.128.110.73:9430 198.148.6.55:9430 May 17 15:36:39 lisa /bsd: pf: key setup: UDP wire: (0) 10.128.110.73:9430 198.148.6.55:9430 stack: (0) - May 17 15:36:39 lisa /bsd: pf: key search, out on em2: UDP wire: (0) 198.148.6.55:9430 10.128.110.73:9430 May 17 15:36:39 lisa /bsd: pf: key setup: UDP wire: (0) 198.148.6.55:9430 96.251.22.157:63529 stack: (0) 198.148.6.55:9430 10.128.110.73:9430 and there are state entries: all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 63 all udp 96.251.22.157:55205 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 48, source-track However, right after the 5 minute mark the states disappear. The last pf log entries are; May 17 15:38:47 lisa /bsd: pf: key search, in on vlan110: UDP wire: (0) 10.128.110.73:9430 198.148.6.55:9430 May 17 15:38:47 lisa /bsd: pf: key search, out on em2: UDP wire: (0) 198.148.6.55:9430 10.128.110.73:9430 I was hoping to see something about expiration in the pf debug logs but this is all that appears to be available. Any idea why these states would go away when there is 5 minutes left before the expiration? Thanks much...
Re: lost pf state - disappeared before expiration?
On 5/17/2020 8:40 PM, Strahil Nikolov wrote:

> What is your conf having as a timeout ?

Both of the rules explicitly override the default timeout with a six minute rule level timeout:

pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP keep state (udp.multiple 360)
pass out quick on $ext_if proto udp tagged VOIP_UDP keep state (udp.multiple 360)

Which is being successfully applied, as shown by the states, which start out with a six minute expiration:

all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE
   age 00:00:02, expires in 00:06:00, 24:23 pkts, 12163:13840 bytes, rule 63
all udp 96.251.22.157:55205 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE
   age 00:00:02, expires in 00:06:00, 24:23 pkts, 12163:13840 bytes, rule 48, source-track

However, once a minute has passed and the expiration shows five minutes left:

   age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 63
all udp 96.251.22.157:55205 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE
   age 00:02:21, expires in 00:05:00, 29:29 pkts, 14166:18501 bytes, rule 48, source-track

both of the states simply disappear. Interestingly, I believe the default multiple:multiple timeout is one minute. Which makes me wonder if for some reason the default timeout is being applied to the states created by these rules, even though the rules have an explicit longer timeout? That seems buggy, unless there is something wrong with my configuration. Even so, it doesn't seem right for a state that says it has five minutes left to go away.

Thanks for the input…
state replication bug in pfsync?
I've been trying to diagnose a mysterious issue where a UDP state disappears before it's supposed to expire. I finally tracked it down to pfsync. On the primary server, the state entries look like: all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:02:21, expires in 00:04:59, 34:34 pkts, 17887:20606 bytes, rule 64 all udp 96.251.22.157:58308 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE age 00:02:21, expires in 00:04:59, 34:34 pkts, 17887:20606 bytes, rule 49 They shouldn't expire for five minutes. However, the same states, at the same time, on the backup server: Thu Jun 4 18:17:27 PDT 2020 all udp 198.148.6.55:9430 <- 10.128.110.73:9430 MULTIPLE:MULTIPLE age 00:02:22, expires in 00:00:00, 0:0 pkts, 0:0 bytes all udp 96.251.22.157:58308 (10.128.110.73:9430) -> 198.148.6.55:9430 MULTIPLE:MULTIPLE expire. And then the synchronization from the backup to the primary removes them. These two systems share a carp vip, and other than the macro defining the local IP address of each individual system, pf.conf is exactly the same on both. How come when the state is transferred to the backup after initially being created on the primary, the state on the backup has the default timeout for udp multiple rather than the custom one defined in my rules: match out on $ext_if from 10.128.0.0/16 nat-to $ext_vip pass out quick on $ext_if proto udp tagged VOIP_UDP keep state (udp.multiple 360) pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP keep state (udp.multiple 360) That doesn't seem right. Am I missing something? Thanks much.
pfsync and rule specific state timeouts
Where is it documented that in order for pfsync to properly synchronize rule specific state timeouts that the rule sets on the systems being synchronized must be *exactly* the same? I have a pair of redundant firewalls synchronizing state, and recently added a couple rules that increase the default timeout for a UDP connection: pass out quick on $ext_if proto udp tagged VOIP_UDP keep state (udp.multiple 360) pass in quick on vlan110 proto udp from any to port = 9430 tag VOIP_UDP keep state (udp.multiple 360) Despite the timeout being set to six minutes, the states kept disappearing after approximately a minute of idle time. After spending a lot of time trying to debug it, I finally figured out that the states replicated to the backup firewall received the default one minute timeout rather than the six minute timeout specified by the rule, and when they expired on the backup firewall, they were deleted from the primary firewall. After further debugging, I discovered that pfsync on the receiving system only applies the rule specific timeout if the entire rule set is exactly identical on both systems. While my rule set was functionally identical on both systems, it was not exactly the same, having rules such as: pass in quick on $ext_if proto tcp from any to $ext_if port ssh which had the primary IP address on each system substituted, resulting in a rule set that was "different". This seems overly strict. What if two systems being used as redundant firewalls had different network cards? This would make the names of the interfaces different, resulting in rule sets that were not the same, preventing per-rule state timeouts from being properly applied. I can understand you wouldn't want to apply the wrong timeout, but it seems that validating a per rule checksum rather than an entire rule set checksum would be more flexible. Both the rule number and the rule content on both of these systems for these rules are exactly the same. 
It is just other rules that have a different IP address given that each system has its own separate IP address in addition to the virtual carp address...
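To make the per-rule checksum suggestion concrete, here is a toy sketch in C. It is not how pf computes its checksum: FNV-1a stands in for the real hash, rules are treated as plain strings, and the names rule_sum and ruleset_sum are hypothetical. It only illustrates that one differing, unrelated rule changes a whole-ruleset checksum while the checksum of an identical rule stays the same on both systems.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV_INIT  14695981039346656037ULL	/* FNV-1a 64-bit offset basis */
#define FNV_PRIME 1099511628211ULL		/* FNV-1a 64-bit prime */

/* Fold a NUL-terminated string into a running FNV-1a hash. */
static uint64_t
fnv1a(uint64_t h, const char *s)
{
	while (*s != '\0') {
		h ^= (unsigned char)*s++;
		h *= FNV_PRIME;
	}
	return h;
}

/* Hypothetical per-rule checksum: depends on this one rule only. */
uint64_t
rule_sum(const char *rule)
{
	assert(rule != NULL);
	return fnv1a(FNV_INIT, rule);
}

/* Hypothetical whole-ruleset checksum: any differing rule changes it. */
uint64_t
ruleset_sum(const char **rules, size_t n)
{
	uint64_t h = FNV_INIT;
	size_t i;

	assert(rules != NULL);
	for (i = 0; i < n; i++) {
		h = fnv1a(h, rules[i]);
		h = fnv1a(h, "\n");	/* separator between rules */
	}
	return h;
}
```

With two rule sets that differ only in the address of the ssh rule, ruleset_sum() disagrees between the firewalls while rule_sum() of the shared VOIP_UDP rule agrees, which is the property a per-rule check would rely on.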
Re: pfsync and rule specific state timeouts
On 6/5/2020 11:15 PM, obs...@loopw.com wrote: > 1) “egress” can be used to reference the external nic in a rule, instead of having a specific IP. Egress is defined as the nic with the default route. pass in quick log on egress inet proto tcp to (egress) port 22 Ah, I think I've seen that in the past but did not remember it offhand. Thanks; although these boxes run OSPF and the default route changes depending on the network state, so I'm not sure that this would work. > 2) Both of the firewall IP addresses can be in a rule if egress is not suitable for your topology, something like this will sync over cleanly with pfsync: pass in quick log on $ext_if inet proto tcp to { $fw1_ext $fw2_ext } port 22 I thought about doing that, but I ended up just making a table with a single IP address in it, each router having the appropriate IP address in the table, and the rule referencing the table being exactly the same on both. Everything is working properly now. I do still wonder if this requirement is documented anywhere? I've been looking, and could not find it. It was very confusing trying to sort out why my states were mysteriously disappearing; I ended up having to add some extra debugging code in the kernel to figure out what was happening. Thanks…
pfsync interface in carp group
I've had a pair of redundant firewalls using pfsync for years. I've noticed in the past that whenever I rebooted the secondary firewall, the carp interfaces on the primary would flip to backup and then back to master as the secondary one rebooted. I never really noticed any issues with it, so I just ignored it. Since upgrading to 6.7 though, if I have an active ssh connection to the carp IP address on the primary when I reboot the secondary and these interface flip-flops occur, my connection is dropped, which is undesirable. It looks like this is happening because by default the pfsync interface is in the carp group, so when it goes down, it demotes all of the carp interfaces on the system. I can see why this would be useful for a setup using multicast and more than two firewalls, as if you are not synchronizing states, you are probably not the best choice to be the active owner of the virtual IP addresses. However, for only two firewalls, when you're using the syncpeer directive for the pfsync interface, it seems it would be better not to default to belonging to the carp group? With only two firewalls, if one of them has broken synchronization, so does the other, so is there any real point in trying to migrate away from the one that's currently master? I updated my configuration to remove the pfsync interface from the carp group and now when I reboot there are no issues with the carp interfaces changing state or connections being dropped. Would it make sense to not automatically include the pfsync interface in the carp group if it is using the syncpeer directive? Thanks…
Re: pfsync interface in carp group
On 6/7/2020 5:21 PM, Markus Wernig wrote: I don't see that behaviour on my carp pair. Are you using a cross-link cable between the two firewalls? (You shouldn't, in my experience.) Yes, I am using a direct link between the two physical firewalls. It seems to be the configuration recommended by the documentation? https://www.openbsd.org/faq/pf/carp.html "The firewalls are connected back-to-back using a crossover cable on em1." As well as in 'man pfsync': "Only run the pfsync protocol on a trusted network - ideally a network dedicated to pfsync messages such as a crossover cable between two firewalls." "A crossover cable connects the two firewalls via their sis2 interfaces." Is this no longer a best practice?
Re: pfsync interface in carp group
On 6/8/2020 6:29 AM, Philipp Buehler wrote: did you follow some "howto" and set net.inet.carp.preempt=1? Well, if you consider the official openBSD documentation a "how-to", then yes :). In the example in https://www.openbsd.org/faq/pf/carp.html under the section "Combining CARP and pfsync for Failover" it says: ! enable preemption and group interface failover # sysctl net.inet.carp.preempt=1 # echo 'net.inet.carp.preempt=1' >> /etc/sysctl.conf As well as in the example in 'man pfsync': The following must also be added to /etc/sysctl.conf: net.inet.carp.preempt=1 One of my firewalls has newer hardware and more power than the other, it is the primary. If I reboot it and the load fails over to the secondary, I want the load to automatically come back to the primary once it is available again. Thanks…
Re: pfsync interface in carp group
On 6/9/2020 7:36 AM, Stuart Henderson wrote: IME the best setup for pfsync between 2 machines is to use a dedicated cross-connect (preferably configured for jumbo frames). Obviously that's not possible with >2 machines though. Hmm, I had never considered using jumbo frames. It looks like based on the traffic level on my systems, the packets are generally below the default 1500 MTU anyway though, so it probably wouldn't help. 12:16:27.564940 lisa-bart.pbhware.com: PFSYNCv6 len 896 12:16:28.023806 lisa-bart.pbhware.com: PFSYNCv6 len 712 12:16:28.195774 bart-lisa.pbhware.com: PFSYNCv6 len 276 12:16:28.207817 lisa-bart.pbhware.com: PFSYNCv6 len 528 I'm undecided what's best to do with the group by default. But I never use syncpeer for that config, just the default with multicast, which I think is quite common - changing group based on whether or not syncpeer is used doesn't make sense to me. I guess multicast would work too for a direct peer relationship, but it just seemed more accurate to explicitly configure the two peers. Some documentation regarding the interaction of pfsync and the carp group might be helpful, along with a suggestion to remove the carp group from the pfsync interface in certain deployment scenarios. It would also be nice to document the dependency on the two rule sets being exactly identical in order to properly replicate rule specific state timeouts between them. It took me a while to sort out why that was failing. Maybe I will try to write something up; the source for the pfsync man page is in CVS, where is the source for webpages such as: https://www.openbsd.org/faq/pf/carp.html#pfsyncop Thanks…
Re: pfsync interface in carp group
On 6/9/2020 1:42 PM, Markus Wernig wrote: Neither jumbo frames nor multicast will prevent group demotion when the other side of a crosslink cable goes physically down. Only not having the sync interface in the carp group will. True. But I think he was just discussing general best practices, not things specific to the issue I raised. Using jumbo frames would reduce the packet overhead if the number of states to be sent in a particular transmission would have exceeded the default MTU. In my scenario, it looks like I don't have enough traffic for that to occur, at least not on a regular basis.
skylake Xeon, C232 chipset, i210-AT ethernet
I'm about to build a server with a supermicro X11SSL-F motherboard and a Xeon E3-1240L v5 processor. The SATA ports should be AHCI compliant, and it looks like the i210-AT ethernet is supported by the em driver, so I think everything should work ok. But it's pretty new stuff, so I wanted to check and see if anybody was aware of any problems or issues with OpenBSD 5.8 and the latest Intel processors and chipsets before I pulled the trigger on it. Thanks much!
kernel reordering and config -e
I just updated a server to 6.2; unfortunately this box has an oddball SOL com2 on irq10 so I need to run 'config -e' on the kernel to update it and make the serial console work. I noticed afterwards in the boot messages it was complaining about kernel reordering failures, and thinking I was fixing it, I updated the file /var/db/kernel.SHA256 with the hash of my modified kernel. I quickly discovered that resulted in a successfully reordered kernel with a stock com2 irq :(. I didn't see anything in the config man page or faq about interaction between kernel reordering and config on a binary kernel. In hindsight I see that the hash check is to keep from replacing a locally modified kernel. Is there a supported way to both fix hardcoded settings on a stock kernel and use reordering? Or do you need to update your settings in the config and compile a kernel from scratch? If you do, does /usr/share/compile automatically get populated with your new kernel objects and reordering just starts working, or do you need to do something manually to get it running with a locally compiled kernel? Thanks...
Re: kernel reordering and config -e
On Mon, Nov 20, 2017 at 06:50:30AM +0100, Sebastien Marie wrote: > When it did that, it uses the object (I didn't recall the exact name) > with the previous mentioned array, with *default* configuration. So the > previous modification done with config(8) is cleared. Yeah, I figured that out after I updated the saved KARL hash and then my box came up with no serial console :). > For me, there is currently no way to ask config(8) to alter the right > file in /usr/share/relink/kernel to "ship" the modification in all > future generated KARL kernels. I thought that might be the case; maybe someday config(8) will be extended to work with the object files as well as the kernel binary itself to allow that. > - makes your changes in /usr/src/sys, build and install a new no-GENERIC > kernel (and do it at each upgrade) If I do that, can the resultant object files (which will have my com2 irq change) be used with KARL? Hmm, it seems like all I really need to do is compile a new com_isa.o and drop it in to the existing directory? Or replace whichever object file contains the constant I need to change; it's not like I'm modifying code or making any drastic changes... Hmm, I'll have to compile a new kernel and poke at it; it'll just be a matter of remembering to redo it after patches, but I already had to redo the config -e anyway. Thanks...
Sierra Wireless MC7455 LTE cell network card
I'm trying to get the subject card to work under OpenBSD 6.2; it works fine under Linux so I know the card itself and its SIM etc are correctly configured and functional. The card is set to MBIM mode, and I'd like to use the umb driver rather than the umsm driver so as not to have to muck with PPP. It seems this card is detected first by the umsm driver though, as I had to disable that driver for the card to be picked up by umb. The umb man page says "Devices which fail to provide a conforming MBIM implementation will probably be attached as some other driver", does this indicate the MC7455 (as opposed to the EM7455, which is explicitly listed as compatible) isn't recognized as an MBIM device? It seems to work fine in MBIM mode under linux, and the umb driver does find it once umsm is disabled. Is there any way to access the serial interface of the device under openbsd in order to execute diagnostic AT commands? Under linux in addition to the network device the card also generates a few USB serial devices, one of which can be used to run commands on it. I saw such devices with the umsm driver, but I don't see any with the umb driver. I haven't gotten any farther than installing the card and getting the umb driver to recognize it at this point, but it would be nice to be able to poke at it and see what the card has to say for itself. Thanks...
Re: kernel reordering and config -e
On Mon, Nov 20, 2017 at 08:37:43AM +, Roderick wrote: > Commenting out the line "/usr/libexec/reorder_kernel &" at the > end of rc? > > I suspect it is not forseen not to benefice of KARL. No, actually, if the hash of the kernel is different than expected, the reorder_kernel aborts and doesn't generate a new one. So you don't need to do anything explicitly after the config -e to avoid your change being wiped out. What I did was update the saved hash with one matching my modified kernel (not quite understanding what was going on yet) which caused KARL to wipe my changes out with the default.
Re: kernel reordering and config -e
On Mon, Nov 20, 2017 at 02:01:56PM -0700, Theo de Raadt wrote: > If someone wants to solve this fully there have been some proposals > for keeping track of the instruction sequence, and attempting to > reapply it upon each relink in the build directory. There just hasn't > been any scripting changes to do that from anyone, and it isn't on my > radar as important. Ah, rather than make binary changes to the object files that get linked, just redo the changes to the resultant kernel binary every time it is generated. That's definitely simpler to do with the existing tools. I see someone made a suggestion that you replied to with a classic "where's the patch" :), I don't think he was suggesting someone else do it but more looking for guidance on what you'd consider acceptable before spending time on it. For example, would the basic "user manually constructs a text file with config commands by hand that then just gets passed to config on stdin" approach he mentioned be good enough to commit? Or would you want something more integrated into config where it would have a new command that would generate a file based on the current session, and a new option to process changes from a file rather than interactively? It looks like it would be difficult to detect errors in the first scenario, and I don't know if that would be an issue. Thanks...
Re: kernel reordering and config -e
On Tue, Nov 21, 2017 at 09:49:37AM +, Dimitris Papastamos wrote: > This is what I do in rc.shutdown to handle this case: > > /usr/bin/printf "disable inteldrm*\nquit\n" | /usr/sbin/config -ef /bsd > /bin/sha256 -h /var/db/kernel.SHA256 /bsd Cool, thanks for the suggestion; that should be good as long as the box doesn't panic or otherwise have an unclean shutdown.
Re: kernel reordering and config -e
On Wed, Nov 22, 2017 at 04:45:59PM +, Kevin Chadwick wrote: > I believe the second scenario would need /dev/mem access making it a > larger change than it first appears (config with a new option could > possibly save the original kernel file and compare the two kernel > files). Ah, I didn't mean that; I meant save your interactive 'config -e' session in a file that could be played back later. IE, you run 'config -e - /etc/ukc.conf ...', then type 'change x', 'disable y' etc, and then when you 'quit', config would write a transcript of your changes to /etc/ukc.conf such that 'config -e -
umb device, SIM has no PIN?
I'm trying to get an LTE card working in MBIM mode with the umb device driver, but it just keeps saying "SIM not initialized PIN required". The SIM isn't PIN locked, as far as I know the SIM has no PIN. I've tested the card and SIM under linux on the exact same system and was able to get it working fine just by supplying the APN. The card is a Sierra Wireless MC7455; to get it working with the umb driver I did have to disable the umsm driver as for some reason that one claimed it first. Once that driver was disabled the umb driver seemed happy with it: umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3 ugen0 at uhub2 port 3 configuration 1 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3 After boot, the interface looked like: umb0: flags=8810 mtu 1500 index 6 priority 0 llprio 3 roaming disabled registration unknown state down cell-class none SIM not initialized PIN required status: down I set the APN and tried to bring it up: umb0: flags=8811 mtu 1500 index 6 priority 0 llprio 3 roaming disabled registration unknown state down cell-class none SIM not initialized PIN required APN r.ispsn status: down But it still just says the SIM is not initialized. After a minute or two, it starts logging these to the console: umb0: state change timeout umb0: state change timeout umb0: state change timeout umb0: state change timeout Am I missing something? This card isn't listed explicitly as being compatible, is there a problem with the driver and this particular card? Under linux, the serial control interfaces were available as USB devices so you could poke at the card with AT commands, I don't see any listed booted under openbsd. The umb driver doesn't support accessing the card directly for debugging and diagnostics? Thanks...
Re: umb device, SIM has no PIN?
> The card is a Sierra Wireless MC7455; to get it working with the umb Looking at the source code, I see that there's a workaround for the EM7455 card, something about requiring an "FCC Authentication" command? From what I understand the MC7455 is the same as the EM7455 other than form factor, so I added it to the list for that workaround and also turned on debugging in the driver. Here's what it has to say now: Jul 23 18:12:41 maggie /bsd: umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3 Jul 23 18:12:41 maggie /bsd: umb0: ctrl_len=4096, maxpktlen=1422, cap=0x20 Jul 23 18:12:41 maggie /bsd: umb0: ctrl-ifno#12: ep-ctrl=5, data-ifno#13: ep-rx=4, ep-tx=3 Jul 23 18:12:41 maggie /bsd: umb0: rx/tx size 16384/16384 Jul 23 18:12:41 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 1) Jul 23 18:12:41 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 1) Jul 23 18:12:41 maggie /bsd:0: 01 00 00 00 10 00 00 00 01 00 00 00 00 10 00 00 Jul 23 18:12:41 maggie /bsd: umb0: vers 1.0 Jul 23 18:12:41 maggie /bsd: ugen0 at uhub2 port 3 configuration 1 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3 Jul 23 18:13:31 maggie /bsd: umb0: stop: reached state DOWN Jul 23 18:13:59 maggie /bsd: umb0: init: opening ... Jul 23 18:13:59 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 2) Jul 23 18:13:59 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 2) Jul 23 18:13:59 maggie /bsd:0: 01 00 00 00 10 00 00 00 02 00 00 00 00 10 00 00 Jul 23 18:14:29 maggie /bsd: umb0: state change timeout Jul 23 18:14:29 maggie /bsd: umb0: init: opening ... Jul 23 18:14:29 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 3) Jul 23 18:14:29 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 3) Jul 23 18:14:29 maggie /bsd:0: 01 00 00 00 10 00 00 00 03 00 00 00 00 10 00 00 Jul 23 18:14:59 maggie /bsd: umb0: state change timeout Not sure where to go from here.
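For readers following the hexdump lines in the debug output, here is a minimal C sketch that decodes the 16 bytes logged after each "sent MBIM_OPEN_MSG". It assumes the MBIM open message layout of four little-endian 32-bit fields (message type, message length, transaction id, maximum control transfer size); the struct and function names are hypothetical and are not the ones used in the umb driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBIM_OPEN_MSG 1U	/* message type of an open request */

/* Hypothetical struct; fields follow the assumed open message layout. */
struct open_msg {
	uint32_t type;		/* MessageType */
	uint32_t len;		/* MessageLength, in bytes */
	uint32_t tid;		/* TransactionId */
	uint32_t maxctrl;	/* MaxControlTransfer, in bytes */
};

/* Read one little-endian 32-bit value from a byte buffer. */
static uint32_t
le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode an open message; returns 0 on success, -1 if it is malformed. */
int
decode_open_msg(const uint8_t *buf, size_t n, struct open_msg *out)
{
	assert(buf != NULL && out != NULL);
	if (n < 16)
		return -1;
	out->type = le32(buf);
	out->len = le32(buf + 4);
	out->tid = le32(buf + 8);
	out->maxctrl = le32(buf + 12);
	if (out->type != MBIM_OPEN_MSG || out->len != 16)
		return -1;
	return 0;
}
```

Fed the bytes logged for tid 2 (01 00 00 00 10 00 00 00 02 00 00 00 00 10 00 00), this yields type 1, length 16, transaction id 2 and a maximum control transfer of 4096, which matches the ctrl_len=4096 reported at attach. Under that reading the request itself looks well formed, and what is missing is any response from the card.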
Re: umb device, SIM has no PIN?
On Fri, Nov 24, 2017 at 11:08:25AM +, Stuart Henderson wrote: > > booted under openbsd. The umb driver doesn't support accessing the card > > directly for debugging and diagnostics? > > Correct, you can't get at those from OpenBSD atm. That's a bummer; guess you wouldn't care too much if things were working :), but when you're trying to sort out why they're not it sure would be nice. Then if you're placing an external antenna the signal strength readings are cool. > I don't have it handy to check now, but IIRC that's similar to what I > see on MC8805 after adding the ID for fcc auth. Interestingly, I tried the card in an external miniPCI to USB adapter, and it worked fine? Without any driver changes or adding the fcc auth ID. But when the card is installed directly in the system it doesn't work :(. But it works fine under Linux, so it's not that the system hardware is broken or incompatible. It's a PC Engines APU 3, maybe the OpenBSD USB drivers for this board aren't working quite right? With the external adapter, the driver sends the MBIM_OPEN_MSG, gets an interrupt, receives a response, parses the response, and moves on. Installed on the board, the driver sends the MBIM_OPEN_MSG, and then nothing happens. No interrupt, no response, nothing. Jul 23 18:00:35 maggie /bsd: uhub1 at usb1 configuration 1 interface 0 "AMD EHCI root hub" rev 2.00/1.00 addr 1 Jul 23 18:00:35 maggie /bsd: uhub2 at uhub1 port 1 configuration 1 interface 0 "Advanced Micro Devices product 0x7900" rev 2.00/0.18 addr 2 Jul 23 18:00:35 maggie /bsd: umb0 at uhub2 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 3 It looks like the card is on a stacked hub or something when installed in the box? When it's plugged in on the external adapter: Jul 23 18:15:38 maggie /bsd: umb1 at uhub0 port 4 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 2 It's connected directly to a controller. They seem to init the same: Jul 23 18:00:35 maggie /bsd: umb0: umb_attach Jul 23 18:00:35 maggie /bsd: umb0: ctrl_len=4096, maxpktlen=1422, cap=0x20 Jul 23 18:00:35 maggie /bsd: umb0: ctrl-ifno#12: ep-ctrl=5, data-ifno#13: ep-rx=4, ep-tx=3 Jul 23 18:00:35 maggie /bsd: umb0: rx/tx size 16384/16384 Jul 23 18:00:35 maggie /bsd: umb0: umb_open Jul 23 18:00:35 maggie /bsd: umb0: umb_ctrl_msg Jul 23 18:00:35 maggie /bsd: umb0: -> snd MBIM_OPEN_MSG (tid 1) Jul 23 18:00:35 maggie /bsd: umb0: sent MBIM_OPEN_MSG (tid 1) Jul 23 18:00:35 maggie /bsd:0: 01 00 00 00 10 00 00 00 01 00 00 00 00 10 00 00 Jul 23 18:21:20 maggie /bsd: umb1: umb_attach Jul 23 18:21:20 maggie /bsd: umb1: ctrl_len=4096, maxpktlen=1422, cap=0x20 Jul 23 18:21:20 maggie /bsd: umb1: ctrl-ifno#12: ep-ctrl=5, data-ifno#13: ep-rx=4, ep-tx=3 Jul 23 18:21:20 maggie /bsd: umb1: rx/tx size 16384/16384 Jul 23 18:21:20 maggie /bsd: umb1: umb_open Jul 23 18:21:20 maggie /bsd: umb1: umb_ctrl_msg Jul 23 18:21:20 maggie /bsd: umb1: -> snd MBIM_OPEN_MSG (tid 1) Jul 23 18:21:20 maggie /bsd: umb1: sent MBIM_OPEN_MSG (tid 1) Jul 23 18:21:20 maggie /bsd:0: 01 00 00 00 10 00 00 00 01 00 00 00 00 10 00 00 But it's only when connected externally that the card actually generates an interrupt and sends a response: Jul 23 18:15:48 maggie /bsd: umb1: umb_intr Jul 23 18:15:48 maggie /bsd: umb1: umb_intr: response available Jul 23 18:15:48 maggie /bsd: umb1: umb_get_response_task Jul 23 18:15:48 maggie /bsd: umb1: umb_decode_response Jul 23 18:15:48 maggie /bsd: umb1: got response: len 16 Any thoughts on how to diagnose what might be a USB driver issue as opposed to an LTE card issue 8-/? Thanks...
broken EHCI USB on AMD chipset?
I have a pcengines APU 3 system, which has both USB3 and USB2 ports: ehci0 at pci0 dev 18 function 0 "AMD Hudson-2 USB2" rev 0x39: apic 4 int 18 ehci1 at pci0 dev 19 function 0 "AMD Hudson-2 USB2" rev 0x39: apic 4 int 18 xhci0 at pci0 dev 16 function 0 "AMD Bolton xHCI" rev 0x11: msi The USB2 ports seem to be broken. I initially was having trouble getting an LTE modem to work, but then noticed more general underlying issues. If a USB device is connected when the system boots, it will find it; however, if you hot plug a USB device after the system is up, it doesn't notice. Further, if you unplug a device while the system is up, it doesn't notice it was removed. It appears that the system isn't receiving interrupts for the EHCI USB devices, the interrupt count from vmstat -i : irq101/ehci0 190 irq101/ehci1 460 does not change when I plug in or remove a device, and I think the reason the LTE modem is not working is because the cell modem driver never receives a response from the commands sent to the modem, presumably because the interrupt notifying the USB driver the data is ready to read is never seen/handled. The xhci usb3 ports work fine, they hot plug/remove devices correctly, the lte modem works when plugged into them, and the vmstat interrupt count for irq99/xhci0 increases when devices are using those ports. The EHCI ports seem to work fine under Linux, including the LTE modem when attached to them, so this seems to be an issue with openbsd, not faulty hardware per se. The Linux driver does have a couple of workarounds in their EHCI driver for AMD chipsets, I'm not sure if either of them are relevant for this; one involves disabling low power mode during transfers and the other says: "EHCI controller on AMD SB700/SB800/Hudson-2/3 platforms may read/write memory space which does not belong to it when there is NULL pointer with T-bit set to 1 in the frame list table. 
To avoid the issue, the frame list link pointer should always contain a valid pointer to a inactive qh" I don't see anything specifically discussing flaky interrupts. Any thoughts on what might be going on here with USB and how to fix it? Thanks...
pcengines apu boards
I was wondering if anybody is successfully running openbsd on pcengines apu boards? I have one of their APU3 series, specifically an apu3b4 with OpenBSD 6.2 on it, but I can't get the USB2 EHCI ports functioning correctly (for one thing, they don't detect a hot plugged device). I'm not sure if it's an issue with the ehci driver and the amd ehci chipset or possibly something in the bios acpi tables. But just as a data point, it would be interesting to know if the problem is specific to my board or endemic to the design, so if anyone has an APU series board with fully functional USB2 ports on the ehci controller, I would much appreciate hearing which board it is, which specific AMD chipset is driving the controller, and what bios version you are running (and what OpenBSD version too). Thanks much.
Re: pcengines apu boards
> From: Base Pr1me > Sent: Thursday, November 30, 2017 2:08 PM > > I run 5 apu2 devices with no problems. I don't have any apu3 devices ... yet. Thanks for the feedback. Do you by any chance have any USB type Mini PCI cards installed internally? I initially noticed the issue with a mini PCI LTE modem card. Then I realized it was a more generic USB problem; I believe the apu2 has USB1 and USB2 ports, the apu3 has two USB3 ports externally, and then the mini PCI and a couple of internal headers are USB2. The USB3 ports, using the xHCI driver, work fine, I suppose in the worst case I could use an external Mini PCI to USB adapter and plug the card in outside of the case, but that just seems so kludgy. I actually found a friend locally who had an apu2 board, he couldn't get the LTE card to work on the internal mini PCI slot, which also appeared to be EHCI based, and it would sometimes work and sometimes not when plugged into the external USB ports. It was really weird, when plugged into the same external port, sometimes the device would show up on the EHCI bus (and not work) and sometimes it would show up on the OHCI bus (and work). He didn't seem to have any trouble with USB flash drives on the EHCI bus on his apu2 though.
Re: pcengines apu boards
> From: Bryan Everly > Sent: Thursday, November 30, 2017 2:46 PM > > I'm running my primary firewall at home on an apu2... Cool. Have you ever tried using an internal Mini PCI card in it?
Re: pcengines apu boards
> From: Eike Lantzsch > Sent: Thursday, November 30, 2017 3:12 PM > > here: APU2C4 with one SATA drive of 6TB and one 4TB via USB3 and an Hmm, I didn't think the apu2 had USB3, but double checking the specs I see it does. My friend that said he had an APU2 must actually have an original APU, as his board doesn't have USB3. Yeah, the external xHCI USB3 ports work fine on my APU3, it's the EHCI ones that are screwed up, they are only available via two internal headers or if you use the Mini PCI slot. There probably aren't very many people that are routing the internal USB headers to external connectors, so unless somebody is using a USB Mini PCI expansion card on an APU2/3, they probably aren't using the EHCI controller. Thanks for the info.
Re: broken EHCI USB on AMD chipset?
On Tue, Nov 28, 2017 at 08:03:05PM -0800, Paul B. Henson wrote: > The EHCI ports seem to work fine under Linux, including the LTE modem > when attached to them, so this seems to be an issue with openbsd, not > faulty hardware per se. I tested FreeBSD on this box as well, it detected the EHCI ports as: usbus1: EHCI version 1.0 usbus1 on ehci0 usbus1: 480Mbps High Speed USB v2.0 usbus2: EHCI version 1.0 usbus2 on ehci1 usbus2: 480Mbps High Speed USB v2.0 ugen1.1: at usbus1 ugen2.1: at usbus2 uhub0: on usbus1 on usbus2 ugen1.2: at usbus1 uhub3: on usb us1 ugen2.2: at usbus2 uhub4: on usb us2 As far as I can tell the ports work ok under FreeBSD, detecting hot plug and removal of devices, and the interrupt count from vmstat -i increases when doing so. FreeBSD doesn't support the Sierra Wireless card I have but I'm guessing it would work. So it just seems to be an issue with OpenBSD and this board or USB chipset or something. I turned on debugging in the ehci and uhub code, but when I plug something in nothing whatsoever happens, so that wasn't very useful. Any suggestions on other debugging to enable or any other approach to figure out what's going on here? Thanks...
Re: broken EHCI USB on AMD chipset?
> From: Stefan Sperling > Sent: Friday, December 1, 2017 10:35 AM > > Problems with ehci(4) on AMD SB700 are known. > For instance, athn(4) USB devices don't work on such ports. Interesting; that's a similar device to the LTE network modem I'm working with. > Could you try adding missing workarounds to our EHCI driver to fix > your problem? That would probably help with other known issues, too. Hmm, sadly low level device drivers aren't my area of expertise :(. I was trying to compare the Linux driver, but it is structured quite differently than the openbsd one. Now that I see that the FreeBSD one works, at least as far as hot plug/remove, and it is more similar to the openbsd driver, I'll see if I can pick anything out of it that I can make sense out of to add to openbsd. I did find a section of code in OpenBSD's ehci_pci.c with a comment of "Enable workaround for dropped interrupts as required" which was being applied to ATI chipsets; I was excited for a moment as that seems to be exactly the problem being experienced and AMD bought ATI, so I hoped perhaps enabling that for my AMD chipset would do something, but unfortunately all it did was result in interrupt timed out messages. There's another function called ehci_sb700_match that's looking for an ATI chipset, which controls whether or not to "apply the ATI SB600/SB700 workaround", those are also the names of the AMD controllers, I'm going to look to see if perhaps those should be applied or not and if they do anything. If my shooting in the dark comes up with anything promising I'll bring it back to show somebody who knows what they're doing :). Thanks.
Re: pcengines apu boards
On Sat, Dec 02, 2017 at 10:40:14PM +1000, Douglas Ray wrote: > On the APU3a4 the internal USB headers were broken. > I had email from pcengines (March 2017) saying this would > be addressed in the APU3b series., but we went for APU2. I have an APU3b series, they fixed the incorrect pinout on the internal usb headers. The internal EHCI ports work fine under both linux and freebsd connected to a USB backplate I'm testing with. It's definitely a disagreement between the AMD EHCI USB chipset and OpenBSD. I'm going to see if I can port some of the workarounds and quirks for that chipset from linux/freebsd to the openbsd driver and see if I have any luck getting it working; drivers aren't my strong suit but we'll see what happens. In the worst case I guess I'll use an external miniPCI to USB adapter and connect my LTE modem to the external xHCI ports, they seem to work fine under OpenBSD. Thanks...
Re: pcengines apu boards
> From: Marko Cupac > Sent: Monday, December 4, 2017 3:54 AM > > I have just ordered one APU3b4, as I wanted to test mobile provider as > a backup link. I see it probably won't be any good as OpenBSD router > (yet), but at least I'll be able to test and give feedback. Assuming you're planning to use an internal Mini PCI card, unless you have more luck than me, it's not going to work :(. I'm hoping I will be able to fix the EHCI driver to be happier with the AMD USB chipset, but at this point I'm still fumbling with it :).
help updating EHCI driver
I'm trying to port some quirks for AMD USB chipsets from other operating systems to OpenBSD to hopefully resolve issues I am having with the pc engines APU3 EHCI ports, as they seem to work fine on those systems. I've got a pretty rough draft of one of them, which disables low-power mode during transfers, but would appreciate a little clarification on device I/O as I'm not generally a device driver developer. Under Linux, the kernel uses absolute addresses when it's doing port I/O to a device, so that's what I am referencing in their implementation. In OpenBSD I see that a driver maps a handle to a region of memory and then uses offsets from the base of that region for port I/O. It looks like the EHCI driver code has already mapped that region and the handle is available for me to use, but I don't see where that mapping was made or how to figure out what the base was in order to turn the absolute addresses I have into appropriate offsets to use with the openbsd API? Then, for some of the chipsets, in addition to poking at the USB device itself to twiddle the low-power mode, you also have to muck with the northbridge configuration. I think I gathered the device information, although I don't know that was the correct way to do so; but I need to map the I/O region for it to a handle so I can modify it. If a driver for one device needs to write to a different device is it supposed to call bus_space_map on its own to get a mapping, or can it somehow get access to the existing one already in place for that device? Finally, low power mode is supposed to be disabled whenever there are asynchronous transfers occurring, and then re-enabled once they complete. I'm not sure I've put the calls in the right place, and I know I haven't handled the case where transfers fail or are canceled rather than complete. 
The other quirk involves never having an empty frame list; I have implemented the logic to detect when that is required, but haven't even come close to wrapping my head around actually implementing the quirk itself. In any case, here is my current laughable diff, advice and corrections most appreciated.

Index: pci/ehci_pci.c
===
RCS file: /cvs/src/sys/dev/pci/ehci_pci.c,v
retrieving revision 1.30
diff -u -p -r1.30 ehci_pci.c
--- pci/ehci_pci.c 20 Jul 2016 09:48:06 - 1.30
+++ pci/ehci_pci.c 6 Dec 2017 02:46:24 -
@@ -66,6 +66,8 @@ struct ehci_pci_softc {
 };
 int ehci_sb700_match(struct pci_attach_args *pa);
+int ehci_amd_pll_quirk_match(struct pci_attach_args *pa);
+int ehci_amd_pll_quirk_match_nb(struct pci_attach_args *pa);
 #define EHCI_SBx00_WORKAROUND_REG 0x50
 #define EHCI_SBx00_WORKAROUND_ENABLE (1 << 3)
@@ -111,6 +113,7 @@ ehci_pci_attach(struct device *parent, s
 	char *devname = sc->sc.sc_bus.bdev.dv_xname;
 	usbd_status r;
 	int s;
+	struct pci_attach_args amd_pa;
 	/* Map I/O registers */
 	if (pci_mapreg_map(pa, PCI_CBMEM, PCI_MAPREG_TYPE_MEM, 0,
@@ -131,6 +134,86 @@ ehci_pci_attach(struct device *parent, s
 	/* Handle quirks */
 	switch (PCI_VENDOR(pa->pa_id)) {
+	case PCI_VENDOR_AMD:
+		/* AMD errata indicates 8111 chipset EHCI is broken */
+		if (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_AMD_8111_EHCI) {
+			printf("%s: AMD 8111 EHCI broken, skipping", devname);
+			goto disestablish_ret;
+		}
+		if (pci_find_device(&amd_pa, ehci_amd_pll_quirk_match)) {
+			sc->sc.amd_chipset.rev = PCI_REVISION(amd_pa.pa_class);
+			if (PCI_PRODUCT(amd_pa.pa_id) == PCI_PRODUCT_ATI_SBX00_SMB) {
+				if (sc->sc.amd_chipset.rev >= 0x10 &&
+				    sc->sc.amd_chipset.rev <= 0x1f)
+					sc->sc.amd_chipset.gen = AMD_CHIPSET_SB600;
+				else if (sc->sc.amd_chipset.rev >= 0x30 &&
+				    sc->sc.amd_chipset.rev <= 0x3f)
+					sc->sc.amd_chipset.gen = AMD_CHIPSET_SB700;
+				else if (sc->sc.amd_chipset.rev >= 0x40 &&
+				    sc->sc.amd_chipset.rev <= 0x4f)
+					sc->sc.amd_chipset.gen = AMD_CHIPSET_SB800;
+				else
+					sc->sc.amd_chipset.gen = AMD_CHIPSET_UNKNOWN;
+
+			}
+			else if (PCI_PRODUCT(amd_pa.pa_id) == PCI_PRODUCT_AMD_HUDSON2_SMB) {
+				if (sc->sc.amd_chipset.rev >= 0x11 &&
+				    sc->sc.amd_chipset.rev <= 0x14)
+					sc->sc.amd_chipset.gen = AMD_CHIPSET_HUDSON2;
+				else if (sc->sc.amd_chipset.rev >= 0x15 &&
+
Re: 3g modem support
> From: Marko Cupac > Sent: Wednesday, December 6, 2017 2:47 AM > > ...which suggests some Sierra Wireless modems, none of which are > available for purchase in the country I live in. I've got the MC7455, which I believe is basically the same as the EM7455. Presumably this might be one of the cards you say you can't get though. I haven't been able to thoroughly test it given my issues with the APU3, but a friend of mine played with it a bit under vmware and it seems functional under both the umsm driver with PPP and the umb driver in MBIM mode, although in order to use the latter you need to disable the umsm module as it claims the device with a higher priority. I ended up ordering one of these: https://www.amazon.com/gp/product/B01JGCSPEA/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1 and will most likely connect the miniPCI card to the external xHCI ports on the APU3 which at least at initial glance seems functional. Kind of kludgy, but better than the other fallback plan of using linux for this deployment :).
Re: help updating EHCI driver
> From: Martin Pieuchot
> Sent: Thursday, December 7, 2017 3:18 AM
>
> Which issue are you having?

Sorry, there was more context in an earlier thread. Basically, I have a PC Engines APU3 board which has AMD Hudson-2 EHCI USB ports on it. If devices are plugged in when the system boots and the ports are initialized, the operating system sees they are there. However, if you hot plug a device after it is booted, or remove a device that was plugged in at boot, the system does not notice the change in state. Also, the Sierra Wireless LTE modem I was trying to use does not function; the driver sends an open message over the USB bus to the device and then nothing ever comes back. Once the system is booted, the interrupt count for the EHCI ports from 'vmstat -I' never increases. The xHCI USB ports on the same system seemed to work fine, detecting hot plug/remove events, and properly initializing the wireless modem.

> What makes you think that the quirks below
> will help? What do you mean with 'work fine on those systems'? If they
> work fine, which issues are you having?

The same board when booted up under either Linux or FreeBSD appears to have fully functional EHCI USB ports; they detect hot plug/remove events, and under Linux the wireless modem is initialized and can successfully pass traffic (FreeBSD doesn't have a driver for it, so I was unable to test it there). I honestly don't know if these specific quirks will resolve the issue under OpenBSD; all I know is that there is something Linux/FreeBSD is doing differently regarding the USB hardware, and porting these quirks seemed a good place to start. I'm not really a low level hardware/device driver guy, so I'm flying a bit blind. Someone told me there are known issues with AMD USB ports in general, such as ath based USB wireless cards not working, so these might help that problem even if they don't fix mine.

> It depends how the controller is connected to the host. 
If you look at
> the PCI glue driver, dev/pci/ehci_pci.c you'll see
>
> 115: /* Map I/O registers */
> 116: if (pci_mapreg_map(pa, PCI_CBMEM, PCI_MAPREG_TYPE_MEM, 0,
> 117: &sc->sc.iot, &sc->sc.ioh, NULL, &sc->sc.sc_size, 0))
>
> Then in the EHCI driver, dev/usb/ehci.c, these are accessed via the
> EREAD/EWRITE/EOREAD/EOWRITE macros.

Ah, ok; it appears that pci_mapreg_map defines the start of the region as:

ex = pa->pa_ioex;
if (ex != NULL) {
start = max(PCI_IO_START, ex->ex_start);

with:

#define PCI_IO_START 0

So the starting memory address is either 0 or pa->pa_ioex from the struct pci_attach_args that was passed into ehci_pci_attach. Given the existing reads:

sc->sc.sc_offs = EREAD1(&sc->sc, EHCI_CAPLENGTH);
#define EHCI_CAPLENGTH 0x00

I'm pretty sure it's not zero, so it must be the one from pa_ioex.

> But maybe you just want to use pci_conf_read()/pci_conf_write()?

Hmm, given my lack of detailed knowledge of this area I can't say for sure. However, there are two different things being done in the Linux code I am referring to:

outb_p(0xe0, 0xcd6);

This I believe writes the byte 0xe0 to the I/O port at address 0xcd6, whereas this:

pci_write_config_dword(amd_chipset.nb_dev, 0xe4, val);

writes the contents of the variable val to the PCI configuration register located at 0xe4. Those are two different operations, right? So wherever the Linux code writes to an absolute I/O port for the USB device, such as 0xcd6, I can subtract the beginning of the mapped region as stored in pa_ioex from it to arrive at the appropriate offset value to use with EREAD/WRITE?

> Is low power mode enabled on OpenBSD?

Based on the comment in the Linux code:

"The hardware normally enables the A-link power management feature, which lets the system lower the power consumption in idle states."

I believe that is the default behavior of the hardware unless you explicitly do something otherwise with it.
> > The other quirk involves never having an empty frame list; I have
> > implemented the logic to detect when that is required, but haven't even
> > come close to wrapping my head around actually implementing the quirk
> > itself.
>
> For which transfer type is this quirk required, isochronous only?

The explanation of this quirk is:

"EHCI controller on AMD SB700/SB800/Hudson-2/3 platforms may read/write memory space which does not belong to it when there is NULL pointer with T-bit set to 1 in the frame list table. To avoid the issue, the frame list link pointer should always contain a valid pointer to a inactive qh."

I am also unfortunately not that expert in the underlying USB hardware level protocol, but the quirk is referenced in two functions, scan_isoc:

if (!ehci->use_dummy_qh || q.itd->hw_next != EHCI_LIST_END(ehci))
*hw_p = q.itd->hw_next;
else
rdomain/rtable
I've got a box with an LTE cellular modem in it whose purpose is to provide a backup connection to the Internet if the hardwire service goes down. It's running OSPF to connect to the rest of the network, and the only time any traffic should go over the cellular link (which is slower and bandwidth capped) is if the hardwire interconnection is down, including ideally traffic generated from the system itself. I have that part working, by adding in a local static default route to the cellular gateway with a lower priority than the OSPF default route.

However, for testing purposes, I'd like to be able to poke out the cellular link on an as-needed basis without having to switch the entire box over to using it. Virtual routing tables looked perfect for this purpose, as I could just spawn a single process with a different default route; we do something similar with network name spaces under Linux. However, I can't quite get it to work.

What I'd really like is to be able to make a copy of the current system routing table, then change one thing about it. However, a new rdomain shows up with no routes or interfaces in the routing table. I can add the new default route pointing out the cellular link, and get traffic to go out there. But I haven't sorted out how to make all the traffic for my internal network still go through the internal link rather than get sent out the default route. While ideally all the OSPF routes would propagate to the other routing domain, I tried just adding a static route to the /16 for our internal address space:

Internet:
Destination  Gateway      Flags  Refs  Use  Mtu  Prio  Iface
default      24.x.x.x     UGS    0     6    -    8     umb0
10.0/16      10.128.0.21  UGS    0     0    -    8     em0

That doesn't work; the documentation says you need to get pf to pass packets across routing domains. However, it says:

rtable number
    Used to select an alternate routing table for the routing lookup.
    Only effective before the route lookup happened, i.e. when filtering
    inbound.

Unfortunately, for traffic originating from the system itself, there isn't really an "inbound" interface? So I'm not sure what pf rule would make this work. Is it just not possible, or am I missing something? Thanks much.
Re: Solved IPMI, but I can't get onto network to outside
On Thu, Dec 21, 2017 at 12:52:33PM -0700, Chris Bennett wrote: > > > IP: 104.217.196.248/29 > > > Gateway: 104.217.196.249 > > > Netmask: 255.255.255.248 > > > > > > > What is your network interface? > > > > I have two, em0 and em1 > > em0: > inet 104.217.196.248 255.255.255.248 > > And I admit I really don't see what IP addresses I get > with 104.217.196.248/29. That's not the IP address you're supposed to use, that's the subnet they've allocated you. See: http://www.subnet-calculator.com/subnet.php 104.217.196.248 is the network address, you can't assign that to an actual host. The usable IP addresses in that subnet are 104.217.196.249-104.217.196.254, 104.217.196.249 is your gateway, so that leaves you 104.217.196.250-104.217.196.254 to assign to your systems. 104.217.196.255 is the broadcast address for the subnet. Update your hostname.em0 to use 104.217.196.250 and make sure your /etc/mygate file contains 104.217.196.249.
Re: rdomain/rtable
Thanks for the info. I don't want to move any interfaces to a non-default routing domain, I just want to be able to run a process with a different default route. I can make that work, via the route -T 10 exec you mention after setting a default route in that domain. But I can't seem to get traffic for my local subnet sent out my internal interface, even after I add a route to it in the non-default routing domain. Dunno, maybe I'm missing something. I set it up like:

Internet:
Destination  Gateway      Flags  Refs  Use  Mtu  Prio  Iface
default      24.x.x.x     UGS    0     2    -    8     umb0
10.0/16      10.128.0.20  UGS    0     0    -    8     em0

But 'ping 10.128.0.20' shows the packets going out umb0, not em0? Thanks again.

On Sat, Dec 23, 2017 at 05:07:37PM +0100, Sebastian Benoit wrote:
>
> When you create a new routing domain, for example by adding an interface to
> a routing domain (e.g. ifconfig umb0 rdomain 10), you create a new routing
> table 10. It will be empty until you add an address on umb0 or, for example
> add your default route.
>
> This routing table will be used to forward packets that are "in that routing
> domain" (the packet is marked with the rdomain or rather the rtable it will
> use). How does the packet get marked?
>
> Three ways:
>
> * with pf, as you have discovered. As the manpage documents, the
> mark needs to be set before route lookup is done.
>
> * when a packet comes in on an interface in rdomain 10, it will stay in
> rdomain 10 (unless pf changes it).
>
> * a packet is generated on the local machine by a process that "is in that
> routing domain". I.e. processes are also marked with a rdomain.
>
> To start a process in a specific rdomain (10), use "route -T 10 exec
> command", for example
>
> route -T 10 exec ping -n ip
>
> or even
>
> route -T 10 exec ksh
>
> Processes spawned by that shell will inherit the rdomain.
>
> Note that i used -n in the ping example. DNS resolving using the resolvers
> in resolv.conf might not work, as long as those resolvers are not reachable
> in rdomain 10.
> > Hope this helps ...
Re: pcengines apu boards
On Wed, Jan 17, 2018 at 12:56:04PM +0100, Christopher Zimmermann wrote: > I have the same problem and have tried to hunt the bug, but failed so > far. Have you already identified the quirks linux and freebsd use to > fix this problem? No :(, I worked on it for a while but kernel hacking isn't my speciality. I don't think the specific quirks I was initially trying to port would have fixed it anyway, as they seemed mainly aimed at data transfers and I couldn't even get the miniPCI card to get hot plugged and detected while testing with an external miniPCI to USB adapter plugged into the internal EHCI header. I ended up just using the external adapter plugged into the xHCI ports exposed outside the case. Annoying not to be able to just have it inside the case, but it works like a champ in this configuration.
OpenLDAP under 6.8 - no intermediate certs in chain
I just updated one of my servers running 6.7 to 6.8, and am having a problem with openldap. I have the intermediate cert and root CA in a file referenced by the openldap config:

TLSCACertificateFile /etc/openldap/cabundle.crt

Under 6.7 with the openldap port from that version, this results in the chain being served:

Certificate chain
 0 s:CN = ldap-netsvc.pbhware.com
   i:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
 1 s:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
   i:O = Digital Signature Trust Co., CN = DST Root CA X3
 2 s:O = Digital Signature Trust Co., CN = DST Root CA X3
   i:O = Digital Signature Trust Co., CN = DST Root CA X3

However, under 6.8 with the newer openldap 2.4.53 port, only the server cert itself is being served, not the intermediate or root:

Certificate chain
 0 s:CN = ldap-netsvc.pbhware.com
   i:C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3

This of course causes clients to fail to validate the server cert :(. I'm running openldap 2.4.53 on other operating systems and as far as I know there's no change in behavior with it. So I'm guessing there's an interoperability issue between openbsd libressl and openldap that's causing this problem? Do I need to configure something differently? Any other suggestions? Thanks much...
Re: OpenLDAP under 6.8 - no intermediate certs in chain
On 11/15/2020 10:18 PM, Brad Smith wrote: I remember seeing this commit recently. Not sure if this is your problem or not. https://marc.info/?l=openbsd-cvs&m=160511882917510&w=2 That definitely looks like it, thanks for the pointer.
Re: OpenLDAP under 6.8 - no intermediate certs in chain
On 11/16/2020 2:30 AM, Stuart Henderson wrote: Yes OpenLDAP is broken with TLS 1.3 server-side unless you have that commit (or build LibreSSL with TLS 1.3 server support disabled). As far as I can tell there's no method to disable TLS 1.3 via config. Hmm, yah, you can disable old versions, but I don't think there is any way to disable newer ones.
Re: OpenLDAP under 6.8 - no intermediate certs in chain
On 11/16/2020 6:52 AM, Stuart Henderson wrote: ...actually I have now added a workaround to the databases/openldap port in 6.8-stable to disable TLS 1.3, so either rebuild or wait for -stable packages and it should fix things. Cool, I was actually already building from source in order to enable modules. I updated my ports tree and rebuilt, looks good now, thanks much for the quick fix. It still does behave a little bit differently; under 6.7 it was including the root CA in the chain sent by the server, under 6.8 it is only including the intermediate, not the root. Which I actually prefer, as sending the root is a waste of time, the client needs to have that itself anyway in order to validate the chain in the first place.
umb0 broke in 6.9
I just upgraded a box that has a cell data card in it and it no longer seems to work :(. The card is: umb0 at uhub0 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 2 The contents of /etc/hostname.umb0 are just: apn r.ispsn The interface shows: umb0: flags=8811 mtu 1500 index 6 priority 6 llprio 3 roaming disabled registration unknown state down cell-class none SIM not initialized PIN required APN r.ispsn status: down There is no PIN on the SIM. It was working fine right before the upgrade. The only umb change I see in the changelog is: Added vid/pid table to umb(4) allowing matching to alternate configurations. I'm not sure what that means or if my config needs something changed to work again? Any suggestions appreciated. The card is in an external minipci adapter connected via USB3. The server is a PC Engines apu3 which actually has an internal minipci connector, but I couldn't get that to work as internally it was connected via USB2 and there were issues with that chipset. I vaguely recall it was actually failing something like this 8-/. Thanks...
6.9 kernel compile fails
I'm trying to compile a kernel with some debugging enabled for a problem I'm having with umb, and now my problem has turned into an error compiling the kernel :). After getting the error on my code base updated from 6.8, I whacked it and did a fresh checkout, but it still shows up:

-bash-5.1$ pwd
/sys/arch/amd64/compile/GENERIC.MP
-bash-5.1$ make
make: don't know how to make /usr/src/sys/dev/pci/drm/i915/dvo_ch7017.c (prerequisite of: dvo_ch7017.o)
Stop in /sys/arch/amd64/compile/GENERIC.MP

It looks like that file is at:

/usr/src/sys/dev/pci/drm/i915/display/dvo_ch7017.c

not where it's looking:

/usr/src/sys/dev/pci/drm/i915/dvo_ch7017.c

I created a symlink and then it complained about a missing header file, so I made another symlink, and it complained about another C file, etc etc, until I finally just ran:

ln -s display/* .

A whole bunch of stuff compiled, then it complained about:

make: don't know how to make /usr/src/sys/dev/pci/drm/i915/i915_gem_clflush.c (prerequisite of: i915_gem_clflush.o)

so cue:

ln -s gem/* .
ln -s gt/* .
ln -s uc/* .

and it trundled along for a while, then:

make: don't know how to make /usr/src/sys/dev/isa/asmc.c (prerequisite of: asmc.o)

Finally, after:

ln -s ../acpi/asmc.c .

it finished compiling and the resultant kernel seems to work. It seems odd the stable kernel source would be broken, but I'm not sure what I might have done wrong? It's a fresh checkout, and there's not much to compiling it. The box doing the compiling was updated from 6.8, I haven't tried on a box with a fresh 6.9 install. Thanks...
Re: umb0 broke in 6.9
On Mon, Jun 14, 2021 at 08:07:15AM -, Stuart Henderson wrote:
> just add "#define UMB_DEBUG" to if_umb.c and send the full dmesg output.

Hmm, that didn't work on its own; I also needed to set umb_debug = 1 in the code? After that, I got a little output. The full dmesg is included below, but the umb part looks like:

umb0 at uhub0 port 3 configuration 1 interface 12 "Sierra Wireless, Incorporated Sierra Wireless MC7455 Qualcomm\M-. Snapdragon? X7 LTE-A" rev 2.10/0.06 addr 2
umb0: NCM align=4 div=4 rem=0
umb0: Only NTB16 format supported.
umb0: -> snd MBIM_OPEN_MSG (tid 1)
umb0: vers 1.0
umb0: stop: reached state DOWN
umb0: init: opening ...
umb0: -> snd MBIM_OPEN_MSG (tid 2)
umb0: init: opening ...
umb0: -> snd MBIM_OPEN_MSG (tid 3)
umb0: stop: reached state DOWN

This seems kind of like the original problem I had with the card when it was attached to the internal USB2 minipci slot rather than to the external USB3 one:

http://openbsd-archive.7691.n7.nabble.com/umb-device-SIM-has-no-PIN-td331358.html

Maybe a change in the USB code broke it?

OpenBSD 6.9-stable (GENERIC.MP) #12: Mon Jun 14 15:54:43 PDT 2021 r...@obsd-bld.pbhware.com:/sys/arch/amd64/compile/GENERIC.MP real mem = 4261011456 (4063MB) avail mem = 4116484096 (3925MB) User Kernel Config UKC> disable Humsm 361 umsm* disabled UKC> quit Continuing... random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.7 @ 0xcff9f020 (7 entries) bios0: vendor coreboot version "v4.6.3" date 20171030 bios0: PC Engines PC Engines apu3 acpi0 at bios0: ACPI 4.0 acpi0: sleep states S0 S1 S2 S4 S5 acpi0: tables DSDT FACP SSDT TCPA APIC HEST SSDT SSDT HPET acpi0: wakeup devices PWRB(S4) PBR4(S4) PBR5(S4) PBR6(S4) PBR7(S4) PBR8(S4) UOH1(S3) UOH2(S3) UOH3(S3) UOH4(S3) UOH5(S3) UOH6(S3) XHC0(S4) acpitimer0 at acpi0: 3579545 Hz, 32 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: AMD GX-412TC SOC, 998.40 MHz, 16-30-01 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: AMD GX-412TC SOC, 998.13 MHz, 16-30-01 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu1: smt 0, core 1, package 0 cpu2 at 
mainbus0: apid 2 (application processor) cpu2: AMD GX-412TC SOC, 998.13 MHz, 16-30-01 cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT cpu2: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache cpu2: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative cpu2: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu2: disabling user TSC (skew=-144) cpu2: smt 0, core 2, package 0 cpu3 at mainbus0: apid 3 (application processor) cpu3: AMD GX-412TC SOC, 998.13 MHz, 16-30-01 cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT cpu3: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache cpu3: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative cpu3: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu3: smt 0, core 3, package 0 ioapic0 at mainbus0: apid 4 pa 0xfec0, version 21, 24 pins ioapic1 at mainbus0: apid 5 pa 0x
Re: umb0 broke in 6.9
On 6/14/2021 4:54 PM, Stuart Henderson wrote: find when the problem started .. with 6.9 userland you can probably get away with just booting the relevant older kernel for a test for probably most/maybe all of the way back to 6.8. So I booted the 6.8 kernel, and everything seemed to be mostly working, but the umb interface still wasn't initialized properly :(. I was thinking I'd have to do a fresh install of 6.8 and start the test from there, but then I considered that one thing I still hadn't done was a cold power cycle. I have a remote console on the serial port of the system, but it doesn't have built-in remote power control and it's not hooked up to a remote control power switch, so it's not that convenient to deal with power. I coordinated a cold power cycle and booted up the 6.8 kernel, and the umb interface worked :). I then booted the 6.9 kernel, and it also worked? By default the kernel allocates the device to the umsm driver (it would be nice if the umb driver took priority instead), so the first 6.9 boot after the install used that driver until I disabled it and rebooted. I thought perhaps the 6.9 version of that driver put the card in a bad state, so I tried booting the 6.9 kernel with it enabled, and then booting it again with it disabled. But the umb interface was still working after that test. So it seems that somehow the upgrade process put the hardware in a bad state where it would not initialize, and a cold power cycle seems to have sorted that out. I wasn't able to reproduce the issue doing some testing, so I guess I will write it off as "that's odd" and just be happy it seems to be working reliably now :). Thanks much for your assistance looking at it…
wireguard reconfiguration reliability
We're using wireguard to set up VPN connections from various systems deployed on-prem at customer sites to central openbsd boxes to route internal traffic between the remote boxes and the internal network. After a fresh reboot with a given configuration, everything works great. The problem we have is when we later add or remove a remote system and try to reconfigure the wireguard interface on the central servers. Sometimes the new system just won't work, or oddly the new system works fine but an existing system that was working breaks 8-/. When that happens, we generally have to reboot the server, at which point everything works. Occasionally ifconfig on the wg interface just wedges completely. When that happens, the box won't reboot cleanly and we have to hard reset it. Has anyone else seen this type of behavior? I'm not sure how common it is to have regular ongoing changes to wireguard like we are doing, so it might not pop up often. Thanks much...
openbsd vm with SR-IOV vf nic
Is it very common for people to be running openbsd boxes under virtualization and using an SR-IOV vf nic? I'm curious what cards people are using. It looks like the only available driver is iavf, for the Intel 700 cards? Are there any other drivers I missed? We have some systems with Intel X550 cards in them, based on the 82599 chipset, which openbsd doesn't currently support. Yuichiro NAITO ported a driver from netbsd: https://marc.info/?l=openbsd-tech&m=168722323125036&w=2 We tested it under 7.3, and then an updated version for 7.4, and it's been working great. At one point yasuoka@ had said he would review and merge it, but it looks like that hasn't happened yet and I haven't heard back the last couple of times I tried to ask him about it (I assume he's busy with other things and don't want to bug him any more). So I was just wondering if there are any other available drivers I might have missed for other cards we might have, if anybody else was interested in X550/82599 vf support, and if maybe any other dev might be willing to take a look at it and possibly commit it. Thanks much...
Re: wireguard reconfiguration reliability
On 3/20/2024 1:44 AM, Kirill Miazine wrote: actually I checked, and I do use wgpka on clients, but not on the server -- I don't remember why I didn't... In our case the server is on an Internet accessible address, whereas the clients are behind a NAT firewall. We also have keepalives enabled on the clients (to maintain their NAT mapping) but not on the server (as if the client isn't sending its keepalives the server isn't going to get through anyway). A scenario where it stops but then works again as soon as traffic is sent does kind of sound like a firewall or NAT timeout issue? We don't have that problem, if we leave it completely alone it generally works indefinitely with no issues. It's just when we try to modify the configuration that things sometimes go sideways. Thanks for the data point…
Re: openbsd vm with SR-IOV vf nic
On 3/20/2024 2:46 AM, Jonathan Matthew wrote: mcx(4) supports virtual functions, mostly because they're identical to physical functions from the driver's perspective, so all we had to do was add the device IDs. Ah, that wasn't readily apparent; I didn't see anything in the man page mentioning SR-IOV or virtual functions, nor in the source code when I went to take a peek now. I guess if you're familiar with Mellanox you'd be aware the vf was basically the same as the physical card and presumably supported by the same driver. bnxt(4) could support virtual functions pretty easily, since they're largely the same as physical functions, but some work would be required there. Cool, thanks for the pointers…
Re: wireguard reconfiguration reliability
On 3/20/2024 9:21 AM, Zack Newman wrote: clients in rdomain(4) 0. Last week I ran ifconfig wg1 destroy, replaced the wgkey and wgpsk for one of the three wgpeers in the second interface, and ran sh /etc/netstart wg1. Once I did this, the server seemingly froze: That's similar to what we see, although generally the entire server doesn't die, just the ifconfig command wedges and can't be killed, and the box can't be rebooted cleanly. Thanks for the feedback…
Re: wireguard reconfiguration reliability
On Wed, Mar 20, 2024 at 09:56:06PM +0100, Kirill Miazine wrote: > Like in this thread, I guess: > > https://marc.info/?t=16964239631&r=1&w=2 Yes, that is likely the issue we're hitting. Seems last message is from 10/2023 and the issue wasn't resolved :(, so I guess it's a known problem with no solution on the horizon. Next time I'll try your workaround of batching the commands up (ifconfig wg1 down; ifconfig wg1 delete; ifconfig wg1 destroy) rather than running one at a time and keep my fingers crossed I win the race condition :). Thanks for the help...
Re: wireguard reconfiguration reliability
On Thu, Mar 21, 2024 at 12:23:06PM +0300, Vitaliy Makkoveev wrote: > wg(4) diff was committed to -current. Does the problem exist in upcoming > 7.5? Oh, I didn't know a fix had been committed, the referenced thread didn't mention a final one. Thanks, I'll take a look.
Intel 10G X550T sr-iov virtual function driver
I recently migrated an OpenBSD vm running under qemu/kvm to a new server which has an Intel 10G X550T NIC (Intel Corporation Ethernet Converged Network Adapter X550-T2) and am passing a vf through to the vm. Unfortunately, it appears openbsd doesn't have a driver for this virtualized device? The dmesg output shows:

vendor "Intel", unknown product 0x1565 (class network subclass ethernet, rev 0x0 0)

I see in the current PCI device list:

https://github.com/openbsd/src/blob/master/sys/dev/pci/pcidevs

there is support for the native card itself (INTEL X550T id 0x1563) but nothing for the virtual function 0x1565. Are there any plans to support this card as a virtual function in a vm? Thanks much...
what all touches the carp demote counter?
I'm setting up a second router that's going to sit next to an existing one and become a redundant failover system. The current one is in production, and I've been converting some of the existing LAN subnets on it to use carp interfaces and making them primary and the new box secondary. I also set up a carp interface on the WAN side and made the new box primary for testing as that didn't exist before. That all worked fine when I set it up by hand, but when I rebooted the new box, the old box stayed primary for everything including the WAN interface, which I tracked down to the carp demote counter, which ended up at 2 on the new box after the reboot:

bash-4.3# ifconfig -g carp
carp: carp demote count 2

After I manually decreased the demote counter by 2 back to 0 the WAN interface master switched back to the new box.

I'm not sure what's doing that at boot? I am running ospfd on the box, but I don't have any demote statements in my configuration. I'm also running npppd, but I don't see anything about that and carp demotion. What else might be setting carp demotion values?

Thanks...
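A small sketch of reading the counter from the output format shown in this message, for anyone who wants to script the check. carp_demote_count is a hypothetical helper name; the reset command in the comment uses the -carpdemote option from ifconfig(8).

```shell
#!/bin/sh
# carp_demote_count: read "ifconfig -g carp" output on stdin and print
# the number at the end of the "carp demote count" line.
carp_demote_count() {
    awk '/carp demote count/ { print $NF; exit }'
}

# Example (on the router):
#   n=$(ifconfig -g carp | carp_demote_count)
#   [ "$n" -gt 0 ] && echo "carp group demoted by $n"
# Lowering the counter by 2 by hand, as described in the message:
#   ifconfig -g carp -carpdemote 2
```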
Re: what all touches the carp demote counter?
On Mon, Oct 10, 2016 at 09:43:56PM -0300, R0me0 *** wrote:
> Did you adjust advskew value on the machine you want to be Backup ?

Yes, the backup has an advskew of 5 and the primary an advskew of 1. As I mentioned, when I first configured the interfaces by hand the two systems properly negotiated master/backup roles; it was only after I rebooted the one that was supposed to be primary on this interface that it came up as backup, and I traced it to the fact that the carp demote value was set to 2. When I manually changed the carp demote value to 0, the system once again pre-empted the master role on the interface. I'm just not sure what is twiddling with the carp demotion value. Unless ospfd does it by default? The man page for the config file reads like it would only do it if you explicitly include the demote keyword in the area or interface section. Thanks for the suggestion though.
Re: what all touches the carp demote counter?
On Tue, Oct 11, 2016 at 08:44:05AM +0200, mxb wrote: > Master-Backup setup with pfsync in place, means that you synchronize > states between boxes. Then Master is rebooted, it becomes out-of-sync > then it comes to states. So until it is in sync with Backup (which > became Master after reboot), it will not become Master. > > This process is auto. Just need to wait. I haven't set up pfsync yet, I need to upgrade the old box first. Right now I'm just working with carp. Does pfsync fiddle with the carp demotion value even if it's not configured? Thanks...
Re: what all touches the carp demote counter?
On Wed, Oct 12, 2016 at 08:37:59AM +0200, mxb wrote: > But as R0me0 stated, you should probably re-check your configuration. The configuration checked out. I rebooted a few more times, and I couldn't reproduce the problem. I still have no idea why the carp demotion counter was set to 2 the first time I rebooted. It doesn't seem to be doing it anymore though. Thanks for all the suggestions though, it helped to verify everything was set up right.
Re: what all touches the carp demote counter?
Arg, I'm still having issues with the carp demote counter. I disabled ospfd for now, but something is still changing it. After a reboot without ospfd, the counter is changing between 0 and 1:

bash-4.3# ifconfig -g carp
carp: carp demote count 1
bash-4.3# ifconfig -g carp
carp: carp demote count 0
bash-4.3# ifconfig -g carp
carp: carp demote count 1
bash-4.3# ifconfig -g carp
carp: carp demote count 0

And the carp interface is flapping:

Oct 14 13:17:17 lisa /bsd: carp0: state transition: BACKUP -> MASTER
Oct 14 13:17:23 lisa /bsd: carp0: state transition: MASTER -> BACKUP
Oct 14 13:17:43 lisa /bsd: carp0: state transition: BACKUP -> MASTER
Oct 14 13:17:49 lisa /bsd: carp0: state transition: MASTER -> BACKUP
Oct 14 13:18:08 lisa /bsd: carp0: state transition: BACKUP -> MASTER

There's not too much running; smtpd, sshd, npppd, dhcpd. Any suggestions as to what might be screwing with the carp demote value?

Thanks...

root 1 0.0 0.0 440 520 ?? Is 1:14PM 0:01.01 /sbin/init
root 21696 0.0 0.0 1044 1296 ?? Isp 1:14PM 0:00.00 syslogd: [priv] (syslogd)
_syslogd 22103 0.0 0.0 1044 1388 ?? Sp 1:14PM 0:00.07 /usr/sbin/syslogd
_pflogd 5335 0.0 0.0 684 400 ?? Sp 1:14PM 0:00.02 pflogd: [running] -s 160 -i pfl
root 27252 0.0 0.0 620 600 ?? Is 1:14PM 0:00.00 pflogd: [priv] (pflogd)
_ntp 16170 0.0 0.0 636 1472 ?? Isp 1:14PM 0:00.02 ntpd: dns engine (ntpd)
_ntp 15754 0.0 0.0 688 1540 ?? S

> I'm setting up a second router that's going to sit next to an existing one and become a redundant failover system. The current one is in production, and I've been converting some of the existing LAN subnets on it to use carp interfaces and making them primary and the new box secondary. I also set up a carp interface on the WAN side and made the new box primary for testing as that didn't exist before. That all worked fine when I set it up by hand, but when I rebooted the new box, the old box stayed primary for everything including the WAN interface, which I tracked down to the carp demote counter, which ended up at 2 on the new box after the reboot:
>
> bash-4.3# ifconfig -g carp
> carp: carp demote count 2
>
> After I manually decreased the demote counter by 2 back to 0 the WAN interface master switched back to the new box.
>
> I'm not sure what's doing that at boot? I am running ospfd on the box, but I don't have any demote statements in my configuration. I'm also running npppd, but I don't see anything about that and carp demotion. What else might be setting carp demotion values?
>
> Thanks...
Re: what all touches the carp demote counter?
On Fri, Oct 14, 2016 at 01:27:42PM -0700, Paul B. Henson wrote:
> Arg, I'm still having issues with the carp demote counter. I disabled ospfd for now, but something is still changing it. After a reboot without ospfd, the counter is changing between 0 and 1:

Ah, I tracked it down. I had configured another carp interface on the new system which didn't yet have a corresponding interface on the old system. I have the carp interfaces configured with explicit peer addresses rather than using multicast, and evidently the inability to send a packet to the peer was causing the other carp interface to twiddle the global carp demote counter, which popped up once I cranked up the carp log level:

Oct 14 15:21:48 lisa /bsd: carp: carp1 demoted group carp by -1 to 2 (< snderrors)
Oct 14 15:21:52 lisa /bsd: carp1: ip_output failed: 64
Oct 14 15:21:54 lisa /bsd: carp: carp1 demoted group carp by 1 to 3 (> snderrors)
Oct 14 15:21:55 lisa /bsd: carp1: ip_output failed: 64
Oct 14 15:22:14 lisa /bsd: carp: carp1 demoted group carp by -1 to 2 (< snderrors)
Oct 14 15:22:18 lisa /bsd: carp1: ip_output failed: 64
Oct 14 15:22:20 lisa /bsd: carp: carp1 demoted group carp by 1 to 3 (> snderrors)

It doesn't do this if I remove the carppeer and use the default multicast; that's an unexpected side effect of configuring a carppeer that might be worth documenting. A down carppeer on one interface can impact the functionality of all carp interfaces on the system.
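To find which interface is moving the counter without reading the log by eye, something like the following sketch could tally the "demoted group" messages per interface. It assumes the log line layout shown in this message; carp_demote_sources is a hypothetical helper name, and the log path in the example is an assumption. As far as I know the log level mentioned here is the net.inet.carp.log sysctl.

```shell
#!/bin/sh
# carp_demote_sources: read kernel log lines on stdin and print
# "<interface> <number of demote adjustments>" for each carp interface
# that logged a "demoted group" message.
carp_demote_sources() {
    awk '/ demoted group / {
        # The interface name is the word right before "demoted".
        for (i = 2; i <= NF; i++)
            if ($i == "demoted") { n[$(i - 1)]++; break }
    }
    END { for (ifc in n) print ifc, n[ifc] }' | sort
}

# Example (log file location is an assumption):
#   carp_demote_sources < /var/log/messages
```

An interface that shows up here over and over, like carp1 above, is the one to look at first.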
Supermicro X11SSL-F freezes probing USB 3
I just put together a new server with a Supermicro X11SSL-F motherboard and a Xeon E3-1240L v5 processor, and was trying to install openbsd 5.8 on it. The install cd freezes while booting after it probes the USB 3 devices:

>>> xhci probe won
xhci0 at pci0 dev 20 function 0 "Intel 100 Series xHCI" rev 0x31: msi
>>> probing for usb*
>>> usb probe returned 1
>>> usb probe won
usb0 at xhci0: USB revision 3.0
>>> probing for uhub*
>>> uhub probe returned 10
>>> uhub probe won
uhub0 at usb0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
[system freezes here]

I also tried the latest snapshot install cd, same problem. If I disable xhci, the installer boots successfully, although I haven't actually tried installing yet. I don't really need usb on this box, so I'm not really concerned if it's not going to work. It's not going into production for a few weeks though, so if anybody's interested in looking at why it's broken I could provide further details or test possible fixes.

Here's a dmesg of the snapshot install kernel booted without xhci:

OpenBSD 5.9-current (RAMDISK_CD) #1737: Sun Mar 6 19:18:13 MST 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
real mem = 16976416768 (16189MB)
avail mem = 16460062720 (15697MB)
User Kernel Config
UKC> disable xhci
98 xhci* disabled
UKC> quit
Continuing...
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x7fb95000 (59 entries)
bios0: vendor American Megatrends Inc. version "1.0b" date 12/29/2015
bios0: Supermicro Super Server
acpi0 at bios0: rev 2
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG HPET SSDT LPIT SSDT SSDT SSDT DBGP DBG2 SSDT SSDT UEFI SSDT DMAR EINJ ERST BERT HEST
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.73 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)
acpiprt3 at acpi0: bus -1 (PEG2)
acpiprt4 at acpi0: bus 2 (RP09)
acpiprt5 at acpi0: bus 3 (RP10)
Re: Supermicro X11SSL-F freezes probing USB 3
On Mon, Mar 28, 2016 at 03:06:39PM -0400, Sonic wrote:
> If I wait long enough the install will finally finish booting but the keyboard (no ps2 ports) doesn't work.

Could I trouble you to be more specific as to the duration of "long enough" :)? I think my patience ran out after about 15-20 minutes. So it eventually boots without disabling xhci, but the USB doesn't work in the end anyway?

I'm installing via an IPMI virtual serial port so the lack of keyboard isn't really an issue for me. I can live without USB, but as the box won't be going live for a few weeks I thought I'd see if any devs wanted me to try anything on it before I just moved forward without USB support. I've got -current set up and ready to patch and compile to test stuff on it if I can. It would be nice to get it working for situations like yours where it's needed.

I booted a FreeBSD 10.2 livecd on it, and that initialized the xhci chipset fine and usb devices seem to work ok. I tried to compare the drivers; they share a bit in common but they're also quite different, and it doesn't help that I'm not really a low level driver guy 8-/. I'm sure the new Skylake stuff just needs some minor tweak to make it happy. Thanks...
Re: Supermicro X11SSL-F freezes probing USB 3
On Tue, Mar 29, 2016 at 04:55:05PM -0400, Sonic wrote:
> Unfortunately that option isn't available for me. The IPMI SOL on this Dell stops forwarding the console once the system boots.

The usb keyboard should still work when the bootloader is running, that's being handled by the BIOS. You just need to determine what port/baud rate the IPMI serial port is on your system, and when the bootloader shows up, just type for example:

stty com1 115200
set tty com1

That will switch the bootloader to directly use the serial port, and when you boot the OS, it will do so as well. After these two commands, just 'boot -c' as usual to disable xhci, and continue on the serial port. Real servers don't need heads or keyboards ;).

At least on my supermicro box, the default bios setting also allows me to type that on the IPMI serial console. In addition to the boot up messages it also forwards the bootloader by default; it doesn't stop forwarding until the OS itself loads. If you install over the serial console, the OS will by default be installed to use it too. So maybe you don't need that keyboard after all :).
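If typing those two commands at every boot gets old, the same lines can go in /etc/boot.conf on the installed system, which the bootloader reads at startup. A minimal sketch that generates them; serial_boot_conf is a hypothetical helper name, and com1 at 115200 is only the example from this message, so the port and speed have to match the BIOS/IPMI settings.

```shell
#!/bin/sh
# serial_boot_conf [PORT] [SPEED]
# Print the two boot(8) commands that move the console to a serial port.
# Defaults are the example values from the message, not a recommendation.
serial_boot_conf() {
    port=${1:-com1}
    speed=${2:-115200}
    printf 'stty %s %s\nset tty %s\n' "$port" "$speed" "$port"
}

# Example (as root, on the installed system):
#   serial_boot_conf com1 115200 > /etc/boot.conf
```

An installation done over the serial console normally sets this up on its own, so this is mostly for boxes that were installed with a head attached.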
Re: Supermicro X11SSL-F freezes probing USB 3
On Tue, Mar 29, 2016 at 07:06:41PM -0400, Sonic wrote:
> On Tue, Mar 29, 2016 at 6:15 PM, Paul B. Henson wrote:
> > stty com1 115200
> > set tty com1
>
> Yes, tried that with no luck, SOL still stops forwarding. The box does

Hmm, that sounds broken. Are you sure you've got the right serial port and baud rate? Once you switch the boot loader to serial, it's no longer a matter of "forwarding", it's direct serial access as far as the bootloader/OS is concerned. The BIOS forwarding piece is out of the picture. Unless your IPMI serial port implementation is broken you've probably got the wrong settings. Double check in the BIOS what the IPMI serial port settings are (which port, speed, etc) and make sure they match what you tell the bootloader.

> rely on ssh for everything. But there's always some possible problem that it would be nice to be able to plug in a keyboard and monitor to work with.

Can't you use the IPMI virtual head? I can't remember the last time I used a physical anything with a server. Rack and forget... Unless a power supply blows.
Re: Supermicro X11SSL-F freezes probing USB 3
On Tue, Mar 29, 2016 at 10:46:15PM -0400, Sonic wrote: > The IPMI is part of Dell's iDRAC stuff and the only thing I've found [...] > may be the iDRAC license level as well, anything above the "basic" > level, providing a limited feature set, requires purchasing a license Eeew. We've got some HP gear that requires an extra cost license to make the remote kvm gui head work past the bootloader which is ridiculous (but technically, I don't think remote kvm is part of the base IPMI standard), but the IPMI SOL serial port??? That's just crazy. I've never used Dell and never will for servers; desktops/notebooks, sure, but servers? Nah. Sun gear was pretty good until Oracle killed them off, we used IBM for a while until they sold it off to Lenovo and policy wouldn't let us buy from a non-US company (like the gear itself doesn't come from China anyway). Right now we're using HP at my dayjob and it's working out ok. I pretty much use supermicro for personal gear and sidejobs, it's generally good stuff. At least my IPMI SOL port works :). Good luck :).
Re: Supermicro X11SSL-F freezes probing USB 3
On Wed, Mar 30, 2016 at 03:34:25PM -0400, Sonic wrote: > Ahha! Who would have thought... com0 was the ticket. Thanks much! Sweet, glad to hear you got it working. Usually the IPMI SOL comes after the physical serial ports, I've never seen it be the first one. But hey, it's Dell :). Maybe now that 5.9 is out (a month early, nice, just in time for my new box) one of the devs will have time to take a look at the skylake usb 3 issues.
no SDRs IPMI disabled?
I just installed 5.9 on a Supermicro X11SSL-F board, and tried to enable the ipmi driver. During boot, it shows:

ipmi0 at mainbus0: version 2.0 interface KCS iobase 0xca2/2 spacing 1
iic0: skipping sensors to avoid ipmi0 interactions
ipmi0: get header fails
ipmi0: no SDRs IPMI disabled
ipmi at mainbus0 not configured

Any suggestions on how to make this work? The full dmesg is:

OpenBSD 5.9 (GENERIC.MP) #1888: Fri Feb 26 01:20:19 MST 2016
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 16976416768 (16189MB)
avail mem = 16457711616 (15695MB)
User Kernel Config
UKC> enable ipmi
401 ipmi0 enabled
UKC> quit
Continuing...
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x7fb95000 (59 entries)
bios0: vendor American Megatrends Inc. version "1.0b" date 12/29/2015
bios0: Supermicro Super Server
acpi0 at bios0: rev 2
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG HPET SSDT LPIT SSDT SSDT SSDT DBGP DBG2 SSDT SSDT UEFI SSDT DMAR EINJ ERST BERT HEST
acpi0: wakeup devices PEGP(S4) PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PXSX(S4) RP09(S4) PXSX(S4) RP10(S4) PXSX(S4) RP11(S4) PXSX(S4) RP12(S4) PXSX(S4) RP13(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.85 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 24MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1, IBE cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 0, core 1, package 0 cpu2 at mainbus0: apid 4 (application processor) cpu2: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, 
core 2, package 0 cpu3 at mainbus0: apid 6 (application processor) cpu3: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT cpu3: 256KB 64b/line 8-way L2 cache cpu3: smt 0, core 3, package 0 cpu4 at mainbus0: apid 1 (application processor) cpu4: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT cpu4: 256KB 64b/line 8-way L2 cache cpu4: smt 1, core 0, package 0 cpu5 at mainbus0: apid 3 (application processor) cpu5: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz, 2100.00 MHz cpu5: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SENSOR,ARAT cpu5: 256K
Intel Atom S1260 (SuperServer 5017A-EF)
I'm looking at a supermicro SuperServer 5017A-EF for openbsd purposes, it's got an Intel atom S1260 SoC, Marvell 88SE9230 SATA, and i350AM2 dual gig interfaces. It looks like i350 support shipped in 5.2, and I'm pretty sure the Marvell chip is AHCI compliant, so I'd think that would be ok, but I'm leery about the SoC, I can't find any references to openbsd running on this specific chip or any atom based SoC for that matter and I'd hate to buy a box that didn't run openbsd well :(. Any feedback on this particular server, this atom SoC in specific, or even a general opinion on how well this might work out much appreciated :). Thanks much...
Re: Intel Atom S1260 (SuperServer 5017A-EF)
On Fri, Nov 15, 2013 at 11:25:50PM +0100, Sebastian Benoit wrote:
> Don't buy this one (yet). The Marvell 88SE9230 SATA does not work.
> i know cause i have one ;-)

Arg, disappointing, but I'm glad I thought to check before buying :). Do you know if anybody's working on it? So much for "standard" AHCI; does it not find it, or find it but crap out? Do all the other components work ok? I could temporarily stick a PCI SATA card in it to get by until the onboard SATA is supported if all the other pieces are happy. Does anybody have any suggestions for a good/cheap 2 port SATA PCI card that supports openbsd?

> The earlier 5017A-* machines are ok.

Hmm, the only other 5017A model I see doesn't have IPMI. Thanks for the help...
Re: Intel Atom S1260 (SuperServer 5017A-EF)
On Fri, Nov 15, 2013 at 11:25:50PM +0100, Sebastian Benoit wrote:
> Don't buy this one (yet). The Marvell 88SE9230 SATA does not work.
> i know cause i have one ;-)

Hmm, looks like support was added in FreeBSD back in June 2012:

http://lists.freebsd.org/pipermail/svn-src-stable-9/2012-June/002131.html

so hopefully it wouldn't be too hard for somebody with the right skill set (unfortunately not me when it comes to low level drivers) to tune it up for openbsd. Looking at the backstory behind that commit:

http://forums.freebsd.org/showthread.php?t=32563

evidently marvell doesn't follow the AHCI spec very well and the freebsd driver has workarounds for various quirks. Stupid marvell :(, too bad supermicro didn't use a better sata chip.

Poking through the freebsd code, it looks like it has a workaround for "Marvell controllers do not wait for readyness", which appears to add an extra delay when the controller is reset, and one for "Some weird controllers do not return signature in FIS receive area. Read it from PxSIG register.", which copies results from a different location over what was copied in from the standard location. Other than that, I don't see any other kludges; the rest is just the standard ahci stuff. I see the openbsd ahci driver is completely different from the freebsd one, so dunno how easily such workarounds could be implemented.
Re: Intel Atom S1260 (SuperServer 5017A-EF)
On Sat, Nov 16, 2013 at 11:34:15AM +0100, Sebastian Benoit wrote: > sorry, i mispoke, i meant 5015A-* and they dont have a dedicated ipmi port. Oh, yah, I've actually got one of those, it's been working great. I was actually planning on replacing it with this newer one, which supports more memory and has more power, and reallocate it to another task. > anyway, dmesg attached, if someone cares. i'm not going to do anything more > with it. > > cpu0: apic clock running at 99MHz > cpu at mainbus0: not configured > cpu at mainbus0: not configured > cpu at mainbus0: not configured > > ahci0: failed to stop port, cannot softreset Hmm, not very promising, it didn't even initialize all four cores. The ahci error is one of the things the freebsd driver works around, the crappy marvell chipset breaks spec on the reset function. Lots of "unknowns" and "unconfigured" in that dmesg :(, guess I need to find another option. Least I found out before I bought it, thanks much for the heads up.
Re: Intel Atom S1260 (SuperServer 5017A-EF)
On Sat, Nov 16, 2013 at 12:27:08PM +0100, Carsten Larsen wrote: > Maybe just buy the previous model 5015A-*? I have been running one of > those for some years now and it works like a charm. From their website I > see it has reached End-of-Life though. I've actually got one of those, as you say, I've been very happy with it. I was looking for a newer model with more power and a separate IPMI port. Guess I've got to keep looking...
Re: Intel Atom S1260 (SuperServer 5017A-EF)
On Fri, Nov 15, 2013 at 08:42:50PM -0800, Chris Cappuccio wrote: > It's very old. This patch did not make it into the driver and I have > no idea if those chips work through some other change, or not. Likely > not. These older chips must be really buggy pieces of shit if you have > to disable NCQ. Bleh. I can definitely see the openbsd philosophy leaning towards not supporting crap ;). The two workarounds in freebsd for this newer marvell sata chipset don't seem quite as egregious, but I'm not really a low level driver guy...
Re: Intel Atom S1260 (SuperServer 5017A-EF)
On Sat, Nov 16, 2013 at 12:15:19PM -0800, Paul B. Henson wrote: > > sorry, i mispoke, i meant 5015A-* and they dont have a dedicated ipmi port. > > Oh, yah, I've actually got one of those, it's been working great. I was > actually planning on replacing it with this newer one, which supports > more memory and has more power, and reallocate it to another task. I forgot to mention, but the newer one also supports ECC memory, which is a plus.
low-power/small form factor server (supermicro X9SCL-F w Core i3-3220T)
I was recently looking for a low-power small form factor box and was initially thinking of the supermicro SuperServer 5017A-EF, which seemed a good fit. Unfortunately, the fairly new atom SoC in that box isn't currently supported, nor is the crappy "not-quite-AHCI" Marvell sata controller. So, I'm thinking of putting something together from parts instead. I'm looking at the supermicro X9SCL-F motherboard which has an Intel C202 PCH chipset and 2 gigabit interfaces (Intel 82579LM and 82574L), combined with a Core i3-3220T, stuffed in a 510T-203B chassis. I see from the em man page and the list archives that those two Intel ethernet chipsets seem reasonably well supported. I couldn't find any specific mention of the C202 chipset, but I believe the Intel AHCI SATA interface is actually AHCI compliant, so trust it would work fine with the standard ahci driver. The i3 processor has a 35w TDP versus the atom's 8.5w, but actually working with openbsd is a bit more important than saving a few watts :). According to the Intel ARK this i3 processor should support ECC memory when installed on a board with a server class chipset. I really appreciated the heads up I got last week about the unsupported atom, that definitely saved me from ordering a box I couldn't use 8-/, so if anybody sees any potential issues with this combination for an openBSD server I'd appreciate hearing about it :). Thanks much.
Re: low-power/small form factor server (supermicro X9SCL-F w Core i3-3220T)
> From: Bryan Vyhmeister [mailto:br...@bsdjournal.net] > Sent: Tuesday, November 19, 2013 9:46 PM > > I have lots of X9SCL-F, X9SCL+-F, X9SCM-F, X9SCI-LN4, X9SCI-LN4F, > X9SCM-iiF boards running OpenBSD in production. Both network interfaces > work flawlessly. Cool, thanks much for the info. > Although I'm not using any of the low power chips since I've > found that heat is really not an issue and the non T chips scale down With the 200W power supply in the small form factor chassis, supermicro says the max processor TDP supported by the motherboard is 45w. I guess if you put one in that potentially uses greater power but never push it to do so it would still work, but I assume bad things would happen if it ever accidentally cranked and tried to suck more power than was available 8-/. > G860, Core i3 2120, Core i3 3240, Xeon E3 1220, Xeon E3 1260L, and Xeon The specifications for the motherboard on the supermicro site say processors with integrated graphics are "not recommended", and since so far I've been unable to push them into clarifying why I was a little leery. The footnote indicates it's coming from Intel regarding the C202 chipset. I've seen a handful of reports of people using processors with integrated graphics with this chipset, and then with your confirmation I feel better about ordering it. Obviously the chipset doesn't support integrated graphics, so the silicon in the CPU is going to waste, but I'm guessing the supermicro documentation team read "doesn't support integrated graphics" in the chipset documentation and translated that into "you shouldn't use one" as opposed to "if you do use one, you can't use the integrated graphics". > If you don't need IPMI, you could save a few dollars and go with the non > F versions of the boards. I have found that the IPMI "Text Console" > never works right for anything I've tried including OpenBSD. I've used the serial redirection on illumos and linux boxes without any trouble. 
I rarely use the video redirection other than for potentially initial bootstrapping and rare diagnostic issues. It's nice not ever having to visit the box in person once it's racked :). Thanks again for the feedback, it was very helpful.
Re: low-power/small form factor server (supermicro X9SCL-F w Core i3-3220T)
> From: Stuart Henderson > Sent: Wednesday, November 20, 2013 3:54 AM > > One thing to note, which may be irrelevant, but may be very important, > is which CPUs support AES-NI - the LGA1155 Pentium/i3 don't. Yeah, you've got to bump up to a much more expensive Xeon to get that :(. Thanks for the heads up, but for this box the extra cost isn't worth it.
Re: low-power/small form factor server (supermicro X9SCL-F w Core i3-3220T)
On Wed, Nov 20, 2013 at 12:35:35PM -0800, 'Bryan Vyhmeister' wrote: > From looking at Supermicro's CSE-510-203B page, it says 65W TDP and > every CPU I've mentioned below except for the Xeon E3 1220 (80W) and > Xeon E3 1230v2 (69W) fall below this. Hmm, I guess I was actually looking at the SuperServer 5017C-LF page: http://www.supermicro.com/products/system/1U/5017/SYS-5017C-LF.cfm It has the X9SCL-F motherboard, a similar chassis with a 200w power supply, and indicates max tdp <= 45w. I asked supermicro support about the 510T-203B chassis with the same motherboard, and they told me it only supported up to 45w as well. Dunno, better safe than sorry, the T version is about the same price as the regular.
Re: low-power/small form factor server (supermicro X9SCL-F w Core i3-3220T)
> From: 'Bryan Vyhmeister' [mailto:br...@bsdjournal.net]
> Sent: Wednesday, November 20, 2013 1:51 PM
>
> Very interesting. There is some ambiguity in the specs. Looking at the SC510L-200B chassis which is what's included with the SYS-5017C-LF system you linked to, it also says 65W TDP.

Well, it can't hurt to use less power :). I ended up ordering all the parts, hopefully they'll show up by mid next week so I can assemble them over the long weekend. I managed to escalate the integrated graphics question high enough to find somebody who knew what they were talking about, he said, as you confirmed, that they work fine with this motherboard other than that you cannot use the integrated graphics, it is disabled. Pretty much what I thought, they should clarify their footnote on the specification page.

> the hw.sensors values:

Out of curiosity, have you tried enabling the ipmi driver? I have an older atom server with an X7SPA-HF motherboard (which is actually being replaced by this one), and I found that the ipmi sensor provided more values than the lm one. The box came as passively cooled, I ended up sticking in three fans anyway. It still runs a bit hot, but within the acceptable range for that processor:

hw.sensors.ipmi0.temp0=54.00 degC (System Temp), CRITICAL
hw.sensors.ipmi0.temp1=64.00 degC (CPU Temp), CRITICAL
hw.sensors.ipmi0.fan0=4840 RPM (CPU FAN), OK
hw.sensors.ipmi0.fan1=6135 RPM (SYS FAN), OK
hw.sensors.ipmi0.volt0=1.11 VDC (CPU Vcore), OK
hw.sensors.ipmi0.volt1=1.04 VDC (Vichcore), OK
hw.sensors.ipmi0.volt2=3.30 VDC (+3.3VCC), OK
hw.sensors.ipmi0.volt3=1.53 VDC (VDIMM), OK
hw.sensors.ipmi0.volt4=5.09 VDC (+5 V), OK
hw.sensors.ipmi0.volt5=12.30 VDC (+12 V), OK
hw.sensors.ipmi0.volt6=3.30 VDC (+3.3VSB), OK
hw.sensors.ipmi0.volt7=3.22 VDC (VBAT), OK
hw.sensors.ipmi0.indicator0=Off (Chassis Intru), OK
hw.sensors.ipmi0.indicator1=On (PS Status), OK

> I think the SC510T-200B/203B is the better choice for hard drives. I use

Yep. I like hot-swap, it would be a pain to have to disconnect and unrack the unit to swap out a fixed internal drive.
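With a sensor listing like the one in this message, a short filter can pick out only the entries whose status is not OK (here the two CRITICAL temperatures). A minimal sketch, assuming the "name=value (description), STATUS" layout shown in the message; sensors_not_ok is a hypothetical helper name.

```shell
#!/bin/sh
# sensors_not_ok: read "sysctl hw.sensors" style lines on stdin and print
# those that carry a status field other than OK. Lines with no ", STATUS"
# suffix are skipped rather than reported.
sensors_not_ok() {
    awk -F', ' '/^hw\.sensors\./ && NF > 1 && $NF != "OK"'
}

# Example:
#   sysctl hw.sensors | sensors_not_ok
```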
Re: low-power/small form factor server (supermicro X9SCL-F w Core i3-3220T)
On Wed, Nov 20, 2013 at 10:16:05PM -0500, Ted Unangst wrote:
> The ipmi driver is disabled by default because it does bad things on some systems. If you don't go out of your way to enable it, the not configured line is all you'll see.

That's what I was going to say, but you beat me to it ;). For mailing list archive purposes, you need to run 'config -e' on your kernel binary and 'enable ipmi', then reboot.

So far I haven't found anything bad it does on my system. When it's enabled, there is a watchdog setting in sysctl on my box, but the last time I tried to use it, the box always rebooted after the interval; it didn't seem to be resetting the timeout correctly. My box hasn't ever frozen since it's been deployed, so that lack hasn't been a problem. The only difference I've noticed between ipmi/no ipmi is a few more sensors displayed.
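Also for the archives, a sketch of doing that without typing at the UKC prompt. It assumes config(8) reads its UKC commands from standard input, which is how it is commonly scripted, and that -f is used to write the change back into the kernel file; ukc_enable_script is a hypothetical helper name.

```shell
#!/bin/sh
# ukc_enable_script DEVICE
# Print the UKC commands that enable DEVICE and then leave the editor.
ukc_enable_script() {
    [ -n "$1" ] || { echo "usage: ukc_enable_script DEVICE" >&2; return 1; }
    printf 'enable %s\nquit\n' "$1"
}

# Example (as root; keep a copy of the kernel first, then reboot):
#   cp /bsd /bsd.orig
#   ukc_enable_script ipmi | config -e -f /bsd
```

The change lives in the kernel binary, so it has to be redone whenever the kernel is replaced, for example after an upgrade.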
Re: Patch to remove "adult" content from spamd(8) man page
On Fri, Nov 22, 2013 at 01:09:36PM -0600, J. Lewis Muir wrote: > I don't see it that way. Huckleberry Finn is a book, and I don't need > to read it unless I want to. The spamd(8) man page is a man page I need > to read in order to understand how to use spamd. Let me fix that for you: "The spamd(8) man page is a man page I don't need to read it unless I want to use spamd, a choice I am making of my own free will, and if I don't like it, I guess I could just go use some other software that doesn't get my panties in a bunch." Maybe you could try spam assassin instead? Unless, of course, you find the metaphor of "killing" spam offensive...