Re: rtadvd: no longer decrement lifetimes in real time
On Mon, Aug 14, 2017 at 07:12:03PM -0400, Jeremie Courreges-Anglas wrote:
>
> This one fell through the cracks...
>
> On Sat, Aug 12 2017, Florian Obser wrote:
> > Stop supporting prefix lifetimes that decrement in real time.
> > It complicates the code, it's off by default and RFC 4861 section
> > 6.2.1 lists it as MAY.
> > After this we can stop regenerating the RA packets every time we send
> > them.
>
> I don't think regenerating the RA every time is a big problem, even
> though I looked into this at some point.

That is true, we can do better though. I might have a diff ;)

>
> > Also I'm not convinced that this has a use case. I think it
> > comes from a fairy tale where renumbering is easy.
>
> Easy renumbering looks like a laudable goal to me.
>
> > Considering the two hour rule in RFC 4862 this might not actually
> > work to begin with...
>
> I can't do proper checks right now, but I think that this feature + the
> rules described in 5.5.3.e are compatible.  Preferred lifetime is not
> affected by these rules.  For valid lifetime, it seems that it will be
> maintained at 2 hours, as long as routers keep announcing the prefix.
> Which appears to be compatible with renumbering.

I'm saying that you probably would do renumbering in a different
way. Just put in a static 2h vltime for a week and be done with
it. Eventually stop announcing the prefix.

>
> > OK?
>
> I'd prefer that we keep this code.

Sure, it's not massively in the way. This came up because I was
hacking (again) on a parse.y config file for rtadvd. And fewer features
are better :)

In the parse.y code I implemented it in a way that you can say, I want
it to expire on Feb 14th 2016. Damn, all the cool references from my
childhood about the future are now in the past :(. Anyway, we can
bikeshed over that if and when I finally get that code done.

>
> --
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
>

--
I'm not entirely sure you are real.
Re: ksh(1) history lines allocation
On Mon, Aug 14, 2017 at 10:26:48PM -0400, Jeremie Courreges-Anglas wrote:
>
> So I tinkered with the way ksh(1) tracks memory allocation, trying to
> make it faster in the general case.  One approach used an RB tree;
> I have since written a simple hash table implementation, which seems
> to work rather well.
>
> But the actual problem I'd first like to solve is a corner case.  I use
> HISTSIZE=20000, and when the actual line count in my histfile approaches
> 25000 (1.25 * HISTSIZE), ksh(1) has a hard time handling it.  The main
> reason is that it calls afree() ~5000 times in a loop, with afree()
> traversing the APERM freelist, which contains >20000 elements.  This is
> expensive.
>
> For history lines, we don't actually need to keep track of allocations
> using an area: history lines are private to history.c and no gc/whatever
> is needed there.  So here's a diff that just uses strdup(3)/free(3).
>
> Comments? ok?

I was able to reproduce the problem with a HISTSIZE of 100000, which at
125000 entries rendered my system unusable.

With the patch I am running fine with a HISTSIZE of 120000 and have come
back several times after hitting the 1.25x threshold.

Regression tests pass.

Rob

> Index: history.c
> ===
> RCS file: /d/cvs/src/bin/ksh/history.c,v
> retrieving revision 1.64
> diff -u -p -p -u -r1.64 history.c
> --- history.c	11 Aug 2017 19:37:58 -0000	1.64
> +++ history.c	15 Aug 2017 01:14:58 -0000
> @@ -428,7 +428,7 @@ histbackup(void)
>  
>  	if (histptr >= history && last_line != hist_source->line) {
>  		hist_source->line--;
> -		afree(*histptr, APERM);
> +		free(*histptr);
>  		histptr--;
>  		last_line = hist_source->line;
>  	}
> @@ -613,14 +613,15 @@ histsave(int lno, const char *cmd, int d
>  #endif
>  	}
>  
> -	c = str_save(cmd, APERM);
> +	if ((c = strdup(cmd)) == NULL)
> +		internal_errorf(1, "unable to allocate memory");
>  	if ((cp = strrchr(c, '\n')) != NULL)
>  		*cp = '\0';
>  
>  	if (histptr < history + histsize - 1)
>  		histptr++;
>  	else { /* remove oldest command */
> -		afree(*history, APERM);
> +		free(*history);
>  		memmove(history, history + 1,
>  		    (histsize - 1) * sizeof(*history));
>  	}
>
>
>
> --
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
>
Re: ifstated: stop tracking interface indexes
On Mon, Aug 14 2017, Rob Pierce wrote:
> ifstated currently tracks and maintains the index of each monitored interface
> and does not maintain interface names. This means we need to re-index on
> interface departure and arrival.
>
> The following diff moves away from indexes to names. Indexes are still
> required, but easily obtained dynamically as needed. This helps simplify
> the next diff that will provide support for interface departure and arrival.
>
> Suggested by deraadt.
>
> No intended functional change. Regress tests pass.
>
> Ok?

The idea looks sound to me, however I would keep the "interface" symbol
in parse.y (your diff doesn't remove all "interface" references btw).

The current code checks the existence of the interface at startup.  If
the interface doesn't exist, you get a syntax error.  This could happen
because of a missing interface (an interesting case), or because of
a typo.  Whether or not we're erroring out, it is nice to print
a diagnostic message.

I'm not sure this change was intended, so here's a tentative diff that
keeps the existing behavior.  Regress tests pass.

Index: ifstated.c === RCS file: /d/cvs/src/usr.sbin/ifstated/ifstated.c,v retrieving revision 1.59 diff -u -p -r1.59 ifstated.c --- ifstated.c 14 Aug 2017 03:15:28 - 1.59 +++ ifstated.c 15 Aug 2017 03:04:47 - @@ -61,8 +61,8 @@ void external_handler(int, short, void void external_exec(struct ifsd_external *, int); void check_external_status(struct ifsd_state *); void external_evtimer_setup(struct ifsd_state *, int); -void scan_ifstate(int, int, int); -intscan_ifstate_single(int, int, struct ifsd_state *); +void scan_ifstate(const char *, int, int); +intscan_ifstate_single(const char *, int, struct ifsd_state *); void fetch_ifstate(int); __dead voidusage(void); void adjust_expressions(struct ifsd_expression_list *, int); @@ -233,6 +233,8 @@ rt_msg_handler(int fd, short event, void char msg[2048]; struct rt_msghdr *rtm = (struct rt_msghdr *)&msg; struct if_msghdr ifm; + char ifnamebuf[IFNAMSIZ]; + char *ifname; ssize_t len; if ((len = read(fd, msg, sizeof(msg))) == -1) { @@ -250,7 +252,10 @@ rt_msg_handler(int fd, short event, void switch (rtm->rtm_type) { case RTM_IFINFO: memcpy(&ifm, rtm, sizeof(ifm)); - scan_ifstate(ifm.ifm_index, ifm.ifm_data.ifi_link_state, 1); + ifname = if_indextoname(ifm.ifm_index, ifnamebuf); + /* ifname is NULL on interface departure */ + if (ifname != NULL) + scan_ifstate(ifname, ifm.ifm_data.ifi_link_state, 1); break; case RTM_DESYNC: fetch_ifstate(1); @@ -431,7 +436,7 @@ external_evtimer_setup(struct ifsd_state #defineLINK_STATE_IS_DOWN(_s) (!LINK_STATE_IS_UP((_s))) int -scan_ifstate_single(int ifindex, int s, struct ifsd_state *state) +scan_ifstate_single(const char *ifname, int s, struct ifsd_state *state) { struct ifsd_ifstate *ifstate; struct ifsd_expression_list expressions; @@ -440,7 +445,7 @@ scan_ifstate_single(int ifindex, int s, TAILQ_INIT(&expressions); TAILQ_FOREACH(ifstate, &state->interface_states, entries) { - if (ifstate->ifindex == ifindex) { + if (strcmp(ifstate->ifname, ifname) == 0) { if (ifstate->prevstate != s && 
(ifstate->prevstate != -1 || !opt_inhibit)) { struct ifsd_expression *expression; @@ -472,15 +477,15 @@ scan_ifstate_single(int ifindex, int s, } void -scan_ifstate(int ifindex, int s, int do_eval) +scan_ifstate(const char *ifname, int s, int do_eval) { struct ifsd_state *state; int cur_eval = 0; - if (scan_ifstate_single(ifindex, s, &conf->initstate) && do_eval) + if (scan_ifstate_single(ifname, s, &conf->initstate) && do_eval) eval_state(&conf->initstate); TAILQ_FOREACH(state, &conf->states, entries) { - if (scan_ifstate_single(ifindex, s, state) && + if (scan_ifstate_single(ifname, s, state) && (do_eval && state == conf->curstate)) cur_eval = 1; } @@ -619,8 +624,8 @@ fetch_ifstate(int do_eval) for (ifa = ifap; ifa; ifa = ifa->ifa_next) { if (ifa->ifa_addr->sa_family == AF_LINK) { struct if_data *ifdata = ifa->ifa_data; - scan_ifstate(if_nametoindex(ifa->ifa_name), - ifdata->ifi_link_state, do_eval); + scan_ifstate(ifa->ifa_name, ifdata->ifi_link_state, + do_eval); } } Index: ifstated.h ==
ifstated: stop tracking interface indexes
ifstated currently tracks and maintains the index of each monitored interface and does not maintain interface names. This means we need to re-index on interface departure and arrival. The following diff moves away from indexes to names. Indexes are still required, but easily obtained dynamically as needed. This helps simplify the next diff that will provide support for interface departure and arrival. Suggested by deraadt. No intended functional change. Regress tests pass. Ok? Index: ifstated.c === RCS file: /cvs/src/usr.sbin/ifstated/ifstated.c,v retrieving revision 1.59 diff -u -p -r1.59 ifstated.c --- ifstated.c 14 Aug 2017 03:15:28 - 1.59 +++ ifstated.c 15 Aug 2017 00:34:11 - @@ -61,8 +61,8 @@ void external_handler(int, short, void void external_exec(struct ifsd_external *, int); void check_external_status(struct ifsd_state *); void external_evtimer_setup(struct ifsd_state *, int); -void scan_ifstate(int, int, int); -intscan_ifstate_single(int, int, struct ifsd_state *); +void scan_ifstate(const char *, int, int); +intscan_ifstate_single(const char *, int, struct ifsd_state *); void fetch_ifstate(int); __dead voidusage(void); void adjust_expressions(struct ifsd_expression_list *, int); @@ -233,6 +233,8 @@ rt_msg_handler(int fd, short event, void char msg[2048]; struct rt_msghdr *rtm = (struct rt_msghdr *)&msg; struct if_msghdr ifm; + char ifnamebuf[IFNAMSIZ]; + char *ifname; ssize_t len; if ((len = read(fd, msg, sizeof(msg))) == -1) { @@ -250,7 +252,10 @@ rt_msg_handler(int fd, short event, void switch (rtm->rtm_type) { case RTM_IFINFO: memcpy(&ifm, rtm, sizeof(ifm)); - scan_ifstate(ifm.ifm_index, ifm.ifm_data.ifi_link_state, 1); + ifname = if_indextoname(ifm.ifm_index, ifnamebuf); + /* ifname is NULL on interface departure */ + if (ifname != NULL) + scan_ifstate(ifname, ifm.ifm_data.ifi_link_state, 1); break; case RTM_DESYNC: fetch_ifstate(1); @@ -431,7 +436,7 @@ external_evtimer_setup(struct ifsd_state #defineLINK_STATE_IS_DOWN(_s) (!LINK_STATE_IS_UP((_s))) 
int -scan_ifstate_single(int ifindex, int s, struct ifsd_state *state) +scan_ifstate_single(const char *ifname, int s, struct ifsd_state *state) { struct ifsd_ifstate *ifstate; struct ifsd_expression_list expressions; @@ -440,7 +445,7 @@ scan_ifstate_single(int ifindex, int s, TAILQ_INIT(&expressions); TAILQ_FOREACH(ifstate, &state->interface_states, entries) { - if (ifstate->ifindex == ifindex) { + if (strcmp(ifstate->ifname, ifname) == 0) { if (ifstate->prevstate != s && (ifstate->prevstate != -1 || !opt_inhibit)) { struct ifsd_expression *expression; @@ -472,15 +477,15 @@ scan_ifstate_single(int ifindex, int s, } void -scan_ifstate(int ifindex, int s, int do_eval) +scan_ifstate(const char *ifname, int s, int do_eval) { struct ifsd_state *state; int cur_eval = 0; - if (scan_ifstate_single(ifindex, s, &conf->initstate) && do_eval) + if (scan_ifstate_single(ifname, s, &conf->initstate) && do_eval) eval_state(&conf->initstate); TAILQ_FOREACH(state, &conf->states, entries) { - if (scan_ifstate_single(ifindex, s, state) && + if (scan_ifstate_single(ifname, s, state) && (do_eval && state == conf->curstate)) cur_eval = 1; } @@ -619,8 +624,8 @@ fetch_ifstate(int do_eval) for (ifa = ifap; ifa; ifa = ifa->ifa_next) { if (ifa->ifa_addr->sa_family == AF_LINK) { struct if_data *ifdata = ifa->ifa_data; - scan_ifstate(if_nametoindex(ifa->ifa_name), - ifdata->ifi_link_state, do_eval); + scan_ifstate(ifa->ifa_name, ifdata->ifi_link_state, + do_eval); } } Index: ifstated.h === RCS file: /cvs/src/usr.sbin/ifstated/ifstated.h,v retrieving revision 1.18 diff -u -p -r1.18 ifstated.h --- ifstated.h 14 Aug 2017 03:15:28 - 1.18 +++ ifstated.h 15 Aug 2017 00:34:11 - @@ -41,7 +41,7 @@ struct ifsd_ifstate { #define IFSD_LINKUP2 int prevstate; u_int32_trefcount; - u_short ifindex; + char ifname[IFNAMSIZ]; }; struct ifsd_external { Index: parse.y
ksh(1) history lines allocation
So I tinkered with the way ksh(1) tracks memory allocation, trying to
make it faster in the general case.  One approach used an RB tree;
I have since written a simple hash table implementation, which seems
to work rather well.

But the actual problem I'd first like to solve is a corner case.  I use
HISTSIZE=20000, and when the actual line count in my histfile approaches
25000 (1.25 * HISTSIZE), ksh(1) has a hard time handling it.  The main
reason is that it calls afree() ~5000 times in a loop, with afree()
traversing the APERM freelist, which contains >20000 elements.  This is
expensive.

For history lines, we don't actually need to keep track of allocations
using an area: history lines are private to history.c and no gc/whatever
is needed there.  So here's a diff that just uses strdup(3)/free(3).

Comments? ok?


Index: history.c
===
RCS file: /d/cvs/src/bin/ksh/history.c,v
retrieving revision 1.64
diff -u -p -p -u -r1.64 history.c
--- history.c	11 Aug 2017 19:37:58 -0000	1.64
+++ history.c	15 Aug 2017 01:14:58 -0000
@@ -428,7 +428,7 @@ histbackup(void)
 
 	if (histptr >= history && last_line != hist_source->line) {
 		hist_source->line--;
-		afree(*histptr, APERM);
+		free(*histptr);
 		histptr--;
 		last_line = hist_source->line;
 	}
@@ -613,14 +613,15 @@ histsave(int lno, const char *cmd, int d
 #endif
 	}
 
-	c = str_save(cmd, APERM);
+	if ((c = strdup(cmd)) == NULL)
+		internal_errorf(1, "unable to allocate memory");
 	if ((cp = strrchr(c, '\n')) != NULL)
 		*cp = '\0';
 
 	if (histptr < history + histsize - 1)
 		histptr++;
 	else { /* remove oldest command */
-		afree(*history, APERM);
+		free(*history);
 		memmove(history, history + 1,
 		    (histsize - 1) * sizeof(*history));
 	}


--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
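
The approach in the diff, owning each history line with strdup(3) and releasing it with a single free(3), can be sketched as a standalone bounded buffer. This is only an illustration under assumptions: `struct hist`, `hist_init()`, `hist_save()` and `hist_free()` are made-up names, and ksh's real code keeps its state in globals (`history`, `histptr`, `histsize`) rather than a struct.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup(3) */
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size history; names are illustrative, not ksh's. */
struct hist {
	char	**lines;	/* array of cap slots */
	size_t	  cap;		/* maximum number of lines */
	size_t	  len;		/* lines currently stored */
};

/* Returns 0 on success, -1 if allocation fails. */
int
hist_init(struct hist *h, size_t cap)
{
	h->lines = calloc(cap, sizeof(*h->lines));
	h->cap = cap;
	h->len = 0;
	return (h->lines == NULL ? -1 : 0);
}

/* Append a private copy of cmd, dropping the oldest line when full. */
int
hist_save(struct hist *h, const char *cmd)
{
	char *c, *nl;

	if (h->cap == 0 || (c = strdup(cmd)) == NULL)
		return (-1);
	if ((nl = strrchr(c, '\n')) != NULL)
		*nl = '\0';

	if (h->len == h->cap) {
		/* remove oldest command: one free(3), no freelist walk */
		free(h->lines[0]);
		memmove(h->lines, h->lines + 1,
		    (h->cap - 1) * sizeof(*h->lines));
		h->len--;
	}
	h->lines[h->len++] = c;
	return (0);
}

/* Release every stored line and the slot array itself. */
void
hist_free(struct hist *h)
{
	size_t i;

	for (i = 0; i < h->len; i++)
		free(h->lines[i]);
	free(h->lines);
	h->lines = NULL;
	h->len = h->cap = 0;
}
```

The cost of dropping a line here does not depend on how many other allocations exist, which is the property the message is after; the memmove over the slot array is still linear in the history size, as in the original code.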
Re: rtadvd: no longer decrement lifetimes in real time
This one fell through the cracks...

On Sat, Aug 12 2017, Florian Obser wrote:
> Stop supporting prefix lifetimes that decrement in real time.
> It complicates the code, it's off by default and RFC 4861 section
> 6.2.1 lists it as MAY.
> After this we can stop regenerating the RA packets every time we send
> them.

I don't think regenerating the RA every time is a big problem, even
though I looked into this at some point.

> Also I'm not convinced that this has a use case. I think it
> comes from a fairy tale where renumbering is easy.

Easy renumbering looks like a laudable goal to me.

> Considering the two hour rule in RFC 4862 this might not actually
> work to begin with...

I can't do proper checks right now, but I think that this feature + the
rules described in 5.5.3.e are compatible.  Preferred lifetime is not
affected by these rules.  For valid lifetime, it seems that it will be
maintained at 2 hours, as long as routers keep announcing the prefix.
Which appears to be compatible with renumbering.

> OK?

I'd prefer that we keep this code.

--
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
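
For reference, the two hour rule from RFC 4862 section 5.5.3 e) can be written down as a small function. This is a sketch of the rule as it applies to an unauthenticated Router Advertisement on the host side; it is not rtadvd code, and the names `TWO_HOURS` and `update_valid_lifetime()` are made up for the example. Lifetimes are in seconds.

```c
#include <stdint.h>

#define TWO_HOURS	7200u

/*
 * Return the new valid lifetime of an address, given the lifetime it
 * has left (remaining) and the Valid Lifetime advertised for its prefix.
 */
uint32_t
update_valid_lifetime(uint32_t remaining, uint32_t advertised)
{
	/* 1. Long or lifetime-extending advertisements are taken as is. */
	if (advertised > TWO_HOURS || advertised > remaining)
		return (advertised);
	/* 2. With two hours or less left, the advertisement is ignored. */
	if (remaining <= TWO_HOURS)
		return (remaining);
	/* 3. Otherwise the lifetime is cut down to two hours, not lower. */
	return (TWO_HOURS);
}
```

The function only touches the valid lifetime; as noted in the message, the preferred lifetime is not subject to this rule.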
Re: Please test: HZ bump
On 14/08/17(Mon) 22:32, Mark Kettenis wrote: > > Date: Mon, 14 Aug 2017 16:06:51 -0400 > > From: Martin Pieuchot > > > > I'd like to improve the fairness of the scheduler, with the goal of > > mitigating userland starvations. For that the kernel needs to have > > a better understanding of the amount of executed time per task. > > > > The smallest interval currently usable on all our architectures for > > such accounting is a tick. With the current HZ value of 100, this > > smallest interval is 10ms. I'd like to bump this value to 1000. > > > > The diff below intentionally bump other `hz' value to keep current > > ratios. We certainly want to call schedclock(), or a similar time > > accounting function, at a higher frequency than 16 Hz. However this > > will be part of a later diff. > > > > I'd be really interested in test reports. mlarkin@ raised a good > > question: is your battery lifetime shorter with this diff? > > > > Comments, oks? > > Need to look at this a bit more carefully but: > > > Index: conf/param.c > > === > > RCS file: /cvs/src/sys/conf/param.c,v > > retrieving revision 1.37 > > diff -u -p -r1.37 param.c > > --- conf/param.c6 May 2016 19:45:35 - 1.37 > > +++ conf/param.c14 Aug 2017 17:03:23 - > > @@ -76,7 +76,7 @@ > > # define DST 0 > > #endif > > #ifndef HZ > > -#defineHZ 100 > > +#defineHZ 1000 > > #endif > > inthz = HZ; > > inttick = 100 / HZ; > > Index: kern/kern_clock.c > > === > > RCS file: /cvs/src/sys/kern/kern_clock.c,v > > retrieving revision 1.93 > > diff -u -p -r1.93 kern_clock.c > > --- kern/kern_clock.c 22 Jul 2017 14:33:45 - 1.93 > > +++ kern/kern_clock.c 14 Aug 2017 19:50:49 - > > @@ -406,12 +406,11 @@ statclock(struct clockframe *frame) > > if (p != NULL) { > > p->p_cpticks++; > > /* > > -* If no schedclock is provided, call it here at ~~12-25 Hz; > > +* If no schedclock is provided, call it here; > > * ~~16 Hz is best > > */ > > if (schedhz == 0) { > > - if ((++curcpu()->ci_schedstate.spc_schedticks & 3) == > > - 0) > > + if 
((spc->spc_schedticks & 0x3f) == 0) > > That ++ should not be dropped sould it? Indeed! Index: conf/param.c === RCS file: /cvs/src/sys/conf/param.c,v retrieving revision 1.37 diff -u -p -r1.37 param.c --- conf/param.c6 May 2016 19:45:35 - 1.37 +++ conf/param.c14 Aug 2017 17:03:23 - @@ -76,7 +76,7 @@ # define DST 0 #endif #ifndef HZ -#defineHZ 100 +#defineHZ 1000 #endif inthz = HZ; inttick = 100 / HZ; Index: kern/kern_clock.c === RCS file: /cvs/src/sys/kern/kern_clock.c,v retrieving revision 1.93 diff -u -p -r1.93 kern_clock.c --- kern/kern_clock.c 22 Jul 2017 14:33:45 - 1.93 +++ kern/kern_clock.c 14 Aug 2017 21:03:54 - @@ -406,12 +406,11 @@ statclock(struct clockframe *frame) if (p != NULL) { p->p_cpticks++; /* -* If no schedclock is provided, call it here at ~~12-25 Hz; +* If no schedclock is provided, call it here; * ~~16 Hz is best */ if (schedhz == 0) { - if ((++curcpu()->ci_schedstate.spc_schedticks & 3) == - 0) + if ((++spc->spc_schedticks & 0x3f) == 0) schedclock(p); } } Index: arch/amd64/isa/clock.c === RCS file: /cvs/src/sys/arch/amd64/isa/clock.c,v retrieving revision 1.25 diff -u -p -r1.25 clock.c --- arch/amd64/isa/clock.c 11 Aug 2017 21:18:11 - 1.25 +++ arch/amd64/isa/clock.c 14 Aug 2017 17:19:35 - @@ -303,8 +303,8 @@ rtcdrain(void *v) void i8254_initclocks(void) { - stathz = 128; - profhz = 1024; + stathz = 1024; + profhz = 8192; isa_intr_establish(NULL, 0, IST_PULSE, IPL_CLOCK, clockintr, 0, "clock"); @@ -321,7 +321,7 @@ rtcstart(void) { static struct timeout rtcdrain_timeout; - mc146818_write(NULL, MC_REGA, MC_BASE_32_KHz | MC_RATE_128_Hz); + mc146818_write(NULL, MC_REGA, MC_BASE_32_KHz | MC_RATE_1024_Hz); mc146818_write(NULL, MC_REGB, MC_REGB_24HR | MC_REGB_PIE); /* @@ -577,10 +577,10 @@ setstatclockrate(int arg) if (initclock_func == i8254_initclocks) { if (arg == stathz) mc146818_write(NULL, MC_REGA, - MC_BASE_32_KHz | MC_RATE_128_Hz); + MC_BASE
Re: Please test: HZ bump
Ted Unangst wrote: > Martin Pieuchot wrote: > > I'd like to improve the fairness of the scheduler, with the goal of > > mitigating userland starvations. For that the kernel needs to have > > a better understanding of the amount of executed time per task. > > > > The smallest interval currently usable on all our architectures for > > such accounting is a tick. With the current HZ value of 100, this > > smallest interval is 10ms. I'd like to bump this value to 1000. > > Maybe we want this too, for sh? This looks like accidental netbsd copying. Or > are we intentionally resetting hz on sh for some reason? apparently yes because the clock only works at 64hz. is the conf file a better place for that, instead of having two separate ifndef initializers with different values? that troubles me, even if it seems to work. just define HZ=64 in the right place. Index: arch/landisk/conf/GENERIC === RCS file: /cvs/src/sys/arch/landisk/conf/GENERIC,v retrieving revision 1.51 diff -u -p -r1.51 GENERIC --- arch/landisk/conf/GENERIC 28 Jun 2016 04:41:37 - 1.51 +++ arch/landisk/conf/GENERIC 14 Aug 2017 20:56:29 - @@ -21,6 +21,8 @@ optionPCLOCK= # 33.33MHz clo option DONT_INIT_BSC #optionDONT_INIT_PCIBSC +option HZ=64 + option PCIVERBOSE option USER_PCICONF# user-space PCI configuration option USBVERBOSE > > > Index: arch/sh/sh/clock.c > === > RCS file: /cvs/src/sys/arch/sh/sh/clock.c,v > retrieving revision 1.9 > diff -u -p -r1.9 clock.c > --- arch/sh/sh/clock.c5 Mar 2016 17:16:33 - 1.9 > +++ arch/sh/sh/clock.c14 Aug 2017 20:49:31 - > @@ -47,9 +47,6 @@ > > #define NWDOG 0 > > -#ifndef HZ > -#define HZ 64 > -#endif > #define MINYEAR 2002/* "today" */ > #define SH_RTC_CLOCK16384 /* Hz */ > > @@ -231,10 +228,6 @@ cpu_initclocks(void) > { > if (sh_clock.pclock == 0) > panic("No PCLOCK information."); > - > - /* Set global variables. */ > - hz = HZ; > - tick = 100 / hz; > > /* >* Use TMU channel 0 as hard clock >
Re: Please test: HZ bump
Martin Pieuchot wrote:
> I'd like to improve the fairness of the scheduler, with the goal of
> mitigating userland starvations.  For that the kernel needs to have
> a better understanding of the amount of executed time per task.
>
> The smallest interval currently usable on all our architectures for
> such accounting is a tick.  With the current HZ value of 100, this
> smallest interval is 10ms.  I'd like to bump this value to 1000.

Maybe we want this too, for sh? This looks like accidental netbsd copying. Or
are we intentionally resetting hz on sh for some reason?


Index: arch/sh/sh/clock.c
===
RCS file: /cvs/src/sys/arch/sh/sh/clock.c,v
retrieving revision 1.9
diff -u -p -r1.9 clock.c
--- arch/sh/sh/clock.c	5 Mar 2016 17:16:33 -0000	1.9
+++ arch/sh/sh/clock.c	14 Aug 2017 20:49:31 -0000
@@ -47,9 +47,6 @@
 
 #define	NWDOG 0
 
-#ifndef HZ
-#define	HZ		64
-#endif
 #define	MINYEAR		2002	/* "today" */
 #define	SH_RTC_CLOCK	16384	/* Hz */
 
@@ -231,10 +228,6 @@ cpu_initclocks(void)
 {
 	if (sh_clock.pclock == 0)
 		panic("No PCLOCK information.");
-
-	/* Set global variables. */
-	hz = HZ;
-	tick = 1000000 / hz;
 
 	/*
 	 * Use TMU channel 0 as hard clock
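
The relation between the tick rate and the accounting granularity is plain arithmetic. A minimal sketch, assuming `tick` holds the number of microseconds per clock tick (1000000 / hz); `tick_usec()` is a made-up helper, not a kernel function.

```c
/* Microseconds per clock tick for a given tick rate; hz must be > 0. */
int
tick_usec(int hz)
{
	return (1000000 / hz);
}
```

With hz = 100 this gives 10000 microseconds (the 10ms mentioned above), with hz = 1000 it gives 1000, and with the sh value of 64 it gives 15625.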
Re: hme: incorrect register endian for PCI sun hme devices?
On 14/08/17 21:18, Mark Kettenis wrote:

>> So tracing through HME register writes it seems the difference between
>> OpenBSD and the other OSs is that OpenBSD appears to write to the
>> virtual address 0x40008098000 with a standard (0x80) primary ASI,
>> whereas the other OSs seem to write directly to the physical address
>> 0x1ff0400 with a physical LE ASI.
>>
>> Is this because in OpenBSD the memory is being allocated as DVMA memory
>> via the IOMMU?
>
> Ah, no.  For memory mapped io it seems we create an actual
> little-endian memory mapping (i.e. with the IE bit set).  That was
> probably done to support mapping framebuffers.

Ah yes, I bet that's it - thanks for the pointer! Not sure it's going to
be the easiest job to implement though.


ATB,

Mark.
Re: Please test: HZ bump
> Date: Mon, 14 Aug 2017 16:06:51 -0400
> From: Martin Pieuchot
>
> I'd like to improve the fairness of the scheduler, with the goal of
> mitigating userland starvations.  For that the kernel needs to have
> a better understanding of the amount of executed time per task.
>
> The smallest interval currently usable on all our architectures for
> such accounting is a tick.  With the current HZ value of 100, this
> smallest interval is 10ms.  I'd like to bump this value to 1000.
>
> The diff below intentionally bumps the other `hz' values to keep current
> ratios.  We certainly want to call schedclock(), or a similar time
> accounting function, at a higher frequency than 16 Hz.  However this
> will be part of a later diff.
>
> I'd be really interested in test reports.  mlarkin@ raised a good
> question: is your battery lifetime shorter with this diff?
>
> Comments, oks?

Need to look at this a bit more carefully but:

> Index: conf/param.c
> ===
> RCS file: /cvs/src/sys/conf/param.c,v
> retrieving revision 1.37
> diff -u -p -r1.37 param.c
> --- conf/param.c	6 May 2016 19:45:35 -0000	1.37
> +++ conf/param.c	14 Aug 2017 17:03:23 -0000
> @@ -76,7 +76,7 @@
>  # define DST 0
>  #endif
>  #ifndef HZ
> -#define	HZ 100
> +#define	HZ 1000
>  #endif
>  int	hz = HZ;
>  int	tick = 1000000 / HZ;
> Index: kern/kern_clock.c
> ===
> RCS file: /cvs/src/sys/kern/kern_clock.c,v
> retrieving revision 1.93
> diff -u -p -r1.93 kern_clock.c
> --- kern/kern_clock.c	22 Jul 2017 14:33:45 -0000	1.93
> +++ kern/kern_clock.c	14 Aug 2017 19:50:49 -0000
> @@ -406,12 +406,11 @@ statclock(struct clockframe *frame)
>  	if (p != NULL) {
>  		p->p_cpticks++;
>  		/*
> -		 * If no schedclock is provided, call it here at ~~12-25 Hz;
> +		 * If no schedclock is provided, call it here;
>  		 * ~~16 Hz is best
>  		 */
>  		if (schedhz == 0) {
> -			if ((++curcpu()->ci_schedstate.spc_schedticks & 3) ==
> -			    0)
> +			if ((spc->spc_schedticks & 0x3f) == 0)

That ++ should not be dropped, should it?
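
Why the pre-increment matters can be shown with a standalone counter. This is a sketch, not kernel code: `struct sched_counter` stands in for the per-CPU `spc_schedticks` field, and the function names are made up.

```c
/* Hypothetical per-CPU counter standing in for spc_schedticks. */
struct sched_counter {
	unsigned int ticks;
};

/*
 * Return 1 on every 64th call.  The pre-increment is what advances the
 * counter; the mask then picks out every 64th value.
 */
int
sched_due(struct sched_counter *sc)
{
	return ((++sc->ticks & 0x3f) == 0);
}

/* Same test with the increment dropped: the counter never moves. */
int
sched_due_no_increment(const struct sched_counter *sc)
{
	return ((sc->ticks & 0x3f) == 0);
}

/* Count how often sched_due() fires over n calls, starting from zero. */
unsigned int
count_fired(unsigned int n)
{
	struct sched_counter sc = { 0 };
	unsigned int i, fired = 0;

	for (i = 0; i < n; i++)
		fired += sched_due(&sc);
	return (fired);
}
```

At 1024 calls per second the incrementing version fires 1024 / 64 = 16 times, matching the "~~16 Hz" in the comment. Without the increment the answer is frozen at whatever the counter happened to hold, so the check is either always true or always false.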
Re: hme: incorrect register endian for PCI sun hme devices?
> From: Mark Cave-Ayland
> Date: Mon, 14 Aug 2017 19:59:55 +0100
>
> On 14/08/17 14:25, Mark Kettenis wrote:
>
> >> Great, thanks for the information - the fact that the nsphy0 has been
> >> detected correctly means that the access still works. Looks like I'll
> >> have to go digging deeper.
> >
> > The OpenBSD code uses %asi if necessary to let the hardware do the
> > byteswapping.  However, I think the psycho(4) host bridge also does an
> > implicit byteswap.  Always has been a bit confusing to me.  But the
> > code definitely works correctly on real hardware.
>
> So tracing through HME register writes it seems the difference between
> OpenBSD and the other OSs is that OpenBSD appears to write to the
> virtual address 0x40008098000 with a standard (0x80) primary ASI,
> whereas the other OSs seem to write directly to the physical address
> 0x1ff0400 with a physical LE ASI.
>
> Is this because in OpenBSD the memory is being allocated as DVMA memory
> via the IOMMU?

Ah, no.  For memory mapped io it seems we create an actual
little-endian memory mapping (i.e. with the IE bit set).  That was
probably done to support mapping framebuffers.
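
What "byteswapping" means for a big-endian CPU talking to little-endian PCI registers can be illustrated in portable C. This is only a sketch of the concept: it says nothing about how sparc64 ASIs, the IE mapping bit or OpenBSD's bus_space(9) implement it, and the function names are made up.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t
bswap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00u) |
	    ((v << 8) & 0x00ff0000u) | (v << 24));
}

/* Read a little-endian 32-bit register image, whatever the host order. */
uint32_t
le32_load(const uint8_t *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Read the same four bytes the way a big-endian CPU would natively. */
uint32_t
be32_load(const uint8_t *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}
```

A native big-endian read followed by one swap yields the little-endian value; a second swap anywhere on the path undoes it, which is why it matters to know exactly one layer (the access ASI, the mapping, or the bridge) performs the swap.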
Please test: HZ bump
I'd like to improve the fairness of the scheduler, with the goal of mitigating userland starvations. For that the kernel needs to have a better understanding of the amount of executed time per task. The smallest interval currently usable on all our architectures for such accounting is a tick. With the current HZ value of 100, this smallest interval is 10ms. I'd like to bump this value to 1000. The diff below intentionally bump other `hz' value to keep current ratios. We certainly want to call schedclock(), or a similar time accounting function, at a higher frequency than 16 Hz. However this will be part of a later diff. I'd be really interested in test reports. mlarkin@ raised a good question: is your battery lifetime shorter with this diff? Comments, oks? Index: conf/param.c === RCS file: /cvs/src/sys/conf/param.c,v retrieving revision 1.37 diff -u -p -r1.37 param.c --- conf/param.c6 May 2016 19:45:35 - 1.37 +++ conf/param.c14 Aug 2017 17:03:23 - @@ -76,7 +76,7 @@ # define DST 0 #endif #ifndef HZ -#defineHZ 100 +#defineHZ 1000 #endif inthz = HZ; inttick = 100 / HZ; Index: kern/kern_clock.c === RCS file: /cvs/src/sys/kern/kern_clock.c,v retrieving revision 1.93 diff -u -p -r1.93 kern_clock.c --- kern/kern_clock.c 22 Jul 2017 14:33:45 - 1.93 +++ kern/kern_clock.c 14 Aug 2017 19:50:49 - @@ -406,12 +406,11 @@ statclock(struct clockframe *frame) if (p != NULL) { p->p_cpticks++; /* -* If no schedclock is provided, call it here at ~~12-25 Hz; +* If no schedclock is provided, call it here; * ~~16 Hz is best */ if (schedhz == 0) { - if ((++curcpu()->ci_schedstate.spc_schedticks & 3) == - 0) + if ((spc->spc_schedticks & 0x3f) == 0) schedclock(p); } } Index: arch/amd64/isa/clock.c === RCS file: /cvs/src/sys/arch/amd64/isa/clock.c,v retrieving revision 1.25 diff -u -p -r1.25 clock.c --- arch/amd64/isa/clock.c 11 Aug 2017 21:18:11 - 1.25 +++ arch/amd64/isa/clock.c 14 Aug 2017 17:19:35 - @@ -303,8 +303,8 @@ rtcdrain(void *v) void i8254_initclocks(void) { - stathz = 128; - 
profhz = 1024; + stathz = 1024; + profhz = 8192; isa_intr_establish(NULL, 0, IST_PULSE, IPL_CLOCK, clockintr, 0, "clock"); @@ -321,7 +321,7 @@ rtcstart(void) { static struct timeout rtcdrain_timeout; - mc146818_write(NULL, MC_REGA, MC_BASE_32_KHz | MC_RATE_128_Hz); + mc146818_write(NULL, MC_REGA, MC_BASE_32_KHz | MC_RATE_1024_Hz); mc146818_write(NULL, MC_REGB, MC_REGB_24HR | MC_REGB_PIE); /* @@ -577,10 +577,10 @@ setstatclockrate(int arg) if (initclock_func == i8254_initclocks) { if (arg == stathz) mc146818_write(NULL, MC_REGA, - MC_BASE_32_KHz | MC_RATE_128_Hz); + MC_BASE_32_KHz | MC_RATE_1024_Hz); else mc146818_write(NULL, MC_REGA, - MC_BASE_32_KHz | MC_RATE_1024_Hz); + MC_BASE_32_KHz | MC_RATE_8192_Hz); } } Index: arch/armv7/omap/dmtimer.c === RCS file: /cvs/src/sys/arch/armv7/omap/dmtimer.c,v retrieving revision 1.6 diff -u -p -r1.6 dmtimer.c --- arch/armv7/omap/dmtimer.c 22 Jan 2015 14:33:01 - 1.6 +++ arch/armv7/omap/dmtimer.c 14 Aug 2017 17:16:01 - @@ -296,8 +296,8 @@ dmtimer_cpu_initclocks() { struct dmtimer_softc*sc = dmtimer_cd.cd_devs[1]; - stathz = 128; - profhz = 1024; + stathz = 1024; + profhz = 8192; sc->sc_ticks_per_second = TIMER_FREQUENCY; /* 32768 */ Index: arch/armv7/omap/gptimer.c === RCS file: /cvs/src/sys/arch/armv7/omap/gptimer.c,v retrieving revision 1.4 diff -u -p -r1.4 gptimer.c --- arch/armv7/omap/gptimer.c 20 Jun 2014 14:08:11 - 1.4 +++ arch/armv7/omap/gptimer.c 14 Aug 2017 17:15:44 - @@ -283,8 +283,8 @@ void gptimer_cpu_initclocks() { // u_int32_t now; - stathz = 128; - profhz = 1024; + stathz = 1024; + profhz = 8192; ticks_per_second = TIMER_FREQUENCY; Index: arch/armv7/sunxi/sxitimer.c === RCS file: /cvs/src/sys/arch/armv7/sunxi/sxitimer.c,v retrieving revision 1.10 diff -u -p -r1.10 sxitimer.c --- arch/armv7/sunxi/sxitimer.c 21 Jan 2017 08:26:49 - 1.10 +++ arch/arm
Re: remove in6_are_prefix_equal()
On Mon, Aug 14, 2017 at 05:53:10PM +0000, Florian Obser wrote:
> After we stopped processing router advertisements in the kernel
> sppp_update_ip6_addr() became the last user of in6_are_prefix_equal().
> Since it compares /128 prefixes it doesn't need all the bells and
> whistles and can be converted to a memcmp. Remove the now unused
> in6_are_prefix_equal().
>
> OK?

OK bluhm@

>
> diff --git net/if_spppsubr.c net/if_spppsubr.c
> index 4b541535bda..89e3f1b5713 100644
> --- net/if_spppsubr.c
> +++ net/if_spppsubr.c
> @@ -4355,7 +4355,6 @@ sppp_update_ip6_addr(void *arg)
>  	struct sppp *sp = arg;
>  	struct ifnet *ifp = &sp->pp_if;
>  	struct in6_aliasreq *ifra = &sp->ipv6cp.req_ifid;
> -	struct in6_addr mask = in6mask128;
>  	struct in6_ifaddr *ia6;
>  	int error;
>  
> @@ -4386,7 +4385,8 @@ sppp_update_ip6_addr(void *arg)
>  	 */
>  
>  	/* Destination address can only be set for /128. */
> -	if (!in6_are_prefix_equal(&ia6->ia_prefixmask.sin6_addr, &mask, 128)) {
> +	if (memcmp(&ia6->ia_prefixmask.sin6_addr, &in6mask128,
> +	    sizeof(in6mask128)) != 0) {
>  		ifra->ifra_dstaddr.sin6_len = 0;
>  		ifra->ifra_dstaddr.sin6_family = AF_UNSPEC;
>  	}
> diff --git netinet6/in6.c netinet6/in6.c
> index b83e6df6c66..f9596be629b 100644
> --- netinet6/in6.c
> +++ netinet6/in6.c
> @@ -1519,32 +1519,6 @@ in6_matchlen(struct in6_addr *src, struct in6_addr *dst)
>  	return match;
>  }
>  
> -int
> -in6_are_prefix_equal(struct in6_addr *p1, struct in6_addr *p2, int len)
> -{
> -	int bytelen, bitlen;
> -
> -	/* sanity check */
> -	if (0 > len || len > 128) {
> -		log(LOG_ERR, "in6_are_prefix_equal: invalid prefix length(%d)\n",
> -		    len);
> -		return (0);
> -	}
> -
> -	bytelen = len / 8;
> -	bitlen = len % 8;
> -
> -	if (bcmp(&p1->s6_addr, &p2->s6_addr, bytelen))
> -		return (0);
> -	/* len == 128 is ok because bitlen == 0 then */
> -	if (bitlen != 0 &&
> -	    p1->s6_addr[bytelen] >> (8 - bitlen) !=
> -	    p2->s6_addr[bytelen] >> (8 - bitlen))
> -		return (0);
> -
> -	return (1);
> -}
> -
>  void
>  in6_prefixlen2mask(struct in6_addr *maskp, int len)
>  {
> diff --git netinet6/in6_var.h netinet6/in6_var.h
> index a023d12a1bf..e5a72e6f903 100644
> --- netinet6/in6_var.h
> +++ netinet6/in6_var.h
> @@ -395,7 +395,6 @@ struct in6_ifaddr *in6ifa_ifpforlinklocal(struct ifnet *, int);
>  struct in6_ifaddr *in6ifa_ifpwithaddr(struct ifnet *, struct in6_addr *);
>  int	in6_addr2scopeid(unsigned int, struct in6_addr *);
>  int	in6_matchlen(struct in6_addr *, struct in6_addr *);
> -int	in6_are_prefix_equal(struct in6_addr *, struct in6_addr *, int);
>  void	in6_prefixlen2mask(struct in6_addr *, int);
>  void	in6_purgeprefix(struct ifnet *);
>  #endif /* _KERNEL */
>
> --
> I'm not entirely sure you are real.
Re: [patch] Add -z and -Z to apmd for automatic suspend/hibernate
On Mon, Aug 14, 2017 at 11:21:03AM -0400, Ted Unangst wrote: > Klemens Nanni wrote: > > > + case 'z': > > > + autoaction = AUTO_SUSPEND; > > > + autolimit = strtonum(optarg, 1, 100, &errstr); > > > + if (errstr != NULL) > > > + error("invalid percent: %s", errstr); > > > + break; > > You should pass optarg instead of errstr to error(). Either way error() > > will still append since it uses err(3). This leads to > > > > $ obj/apmd -dz0 > > apmd: invalid percent: too small: Result too large > > actually, both, but you should use something like errc() instead of errno. Sure, error("%s percentage: %s", errstr, optarg) would be the idiomatic way, but that requires changing error()'s signature and usage all over apmd.c. Do you suggest replacing err(3) with errc(3) in error()?
Re: hme: incorrect register endian for PCI sun hme devices?
On 14/08/17 14:25, Mark Kettenis wrote: >> Great, thanks for the information - the fact that the nsphy0 has been >> detected correctly means that the access still works. Looks like I'll >> have to go digging deeper. > > The OpenBSD code uses %asi if necessary to let the hardware do the > byteswapping. However, I think the psycho(4) host bridge also does an > implicit byteswap. Always has been a bit confusing to me. But the > code definitely works correctly on real hardware. So, tracing through HME register writes, it seems the difference between OpenBSD and the other OSs is that OpenBSD appears to write to the virtual address 0x40008098000 with a standard (0x80) primary ASI, whereas the other OSs seem to write directly to the physical address 0x1ff0400 with a physical LE ASI. Is this because in OpenBSD the memory is being allocated as DVMA memory via the IOMMU? ATB, Mark.
remove in6_are_prefix_equal()
After we stopped processing router advertisements in the kernel, sppp_update_ip6_addr() became the last user of in6_are_prefix_equal(). Since it compares /128 prefixes it doesn't need all the bells and whistles and can be converted to a memcmp. Remove the now unused in6_are_prefix_equal(). OK? diff --git net/if_spppsubr.c net/if_spppsubr.c index 4b541535bda..89e3f1b5713 100644 --- net/if_spppsubr.c +++ net/if_spppsubr.c @@ -4355,7 +4355,6 @@ sppp_update_ip6_addr(void *arg) struct sppp *sp = arg; struct ifnet *ifp = &sp->pp_if; struct in6_aliasreq *ifra = &sp->ipv6cp.req_ifid; - struct in6_addr mask = in6mask128; struct in6_ifaddr *ia6; int error; @@ -4386,7 +4385,8 @@ sppp_update_ip6_addr(void *arg) */ /* Destination address can only be set for /128. */ - if (!in6_are_prefix_equal(&ia6->ia_prefixmask.sin6_addr, &mask, 128)) { + if (memcmp(&ia6->ia_prefixmask.sin6_addr, &in6mask128, + sizeof(in6mask128)) != 0) { ifra->ifra_dstaddr.sin6_len = 0; ifra->ifra_dstaddr.sin6_family = AF_UNSPEC; } diff --git netinet6/in6.c netinet6/in6.c index b83e6df6c66..f9596be629b 100644 --- netinet6/in6.c +++ netinet6/in6.c @@ -1519,32 +1519,6 @@ in6_matchlen(struct in6_addr *src, struct in6_addr *dst) return match; } -int -in6_are_prefix_equal(struct in6_addr *p1, struct in6_addr *p2, int len) -{ - int bytelen, bitlen; - - /* sanity check */ - if (0 > len || len > 128) { - log(LOG_ERR, "in6_are_prefix_equal: invalid prefix length(%d)\n", - len); - return (0); - } - - bytelen = len / 8; - bitlen = len % 8; - - if (bcmp(&p1->s6_addr, &p2->s6_addr, bytelen)) - return (0); - /* len == 128 is ok because bitlen == 0 then */ - if (bitlen != 0 && - p1->s6_addr[bytelen] >> (8 - bitlen) != - p2->s6_addr[bytelen] >> (8 - bitlen)) - return (0); - - return (1); -} - void in6_prefixlen2mask(struct in6_addr *maskp, int len) { diff --git netinet6/in6_var.h netinet6/in6_var.h index a023d12a1bf..e5a72e6f903 100644 --- netinet6/in6_var.h +++ netinet6/in6_var.h @@ -395,7 +395,6 @@ struct in6_ifaddr
*in6ifa_ifpforlinklocal(struct ifnet *, int); struct in6_ifaddr *in6ifa_ifpwithaddr(struct ifnet *, struct in6_addr *); int in6_addr2scopeid(unsigned int, struct in6_addr *); int in6_matchlen(struct in6_addr *, struct in6_addr *); -int in6_are_prefix_equal(struct in6_addr *, struct in6_addr *, int); void in6_prefixlen2mask(struct in6_addr *, int); void in6_purgeprefix(struct ifnet *); #endif /* _KERNEL */ -- I'm not entirely sure you are real.
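The claim that a /128 comparison needs no prefix logic can be checked in userland. The following is a minimal sketch, not the kernel code: prefix_equal() is a re-creation of the removed helper with the log() call dropped, and addr_equal128() is a hypothetical name for the memcmp form used in the diff.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/*
 * Userland re-creation of the removed helper, for illustration only.
 * Returns 1 when the first `len` bits of p1 and p2 match, 0 otherwise.
 */
static int
prefix_equal(const struct in6_addr *p1, const struct in6_addr *p2, int len)
{
	int bytelen, bitlen;

	assert(p1 != NULL && p2 != NULL);
	if (len < 0 || len > 128)
		return (0);

	bytelen = len / 8;
	bitlen = len % 8;

	/* Compare the whole bytes covered by the prefix. */
	if (memcmp(p1->s6_addr, p2->s6_addr, bytelen) != 0)
		return (0);
	/* For len == 128, bitlen is 0 and this branch is never taken. */
	if (bitlen != 0 &&
	    p1->s6_addr[bytelen] >> (8 - bitlen) !=
	    p2->s6_addr[bytelen] >> (8 - bitlen))
		return (0);
	return (1);
}

/* The /128 special case: every bit matters, so a plain memcmp suffices. */
static int
addr_equal128(const struct in6_addr *p1, const struct in6_addr *p2)
{
	return (memcmp(p1, p2, sizeof(*p1)) == 0);
}
```

With len fixed at 128, bytelen is 16 and bitlen is 0, so the helper degenerates to a 16-byte compare, which is what the memcmp in sppp_update_ip6_addr() does directly.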
Re: 1M routes or 1M arp entries
On 2017/08/14 16:48, Simon Mages wrote: > Hi, > > you may want to take a look into /etc/login.conf > login.conf(5), cap_mkdb(1) I wouldn't normally recommend cap_mkdb for the login.conf file; it's too easy to forget to update the db after making a change. I'd just edit the text file. You will need to fully log out for changes to take effect (with ssh multiplexing, the master needs to exit; with X, you need a new session).
Re: 1M routes or 1M arp entries
On 14.8.2017. 16:48, Simon Mages wrote: > Hi, > > you may want to take a look into /etc/login.conf > login.conf(5), cap_mkdb(1) > > In this file you can fiddle with your limit maxima > for login classes. > > BR > Simon > Thank you, i will do that ...
Re: [patch] Add -z and -Z to apmd for automatic suspend/hibernate
Klemens Nanni wrote: > > + case 'z': > > + autoaction = AUTO_SUSPEND; > > + autolimit = strtonum(optarg, 1, 100, &errstr); > > + if (errstr != NULL) > > + error("invalid percent: %s", errstr); > > + break; > You should pass optarg instead of errstr to error(). Either way error() > will still append since it uses err(3). This leads to > > $ obj/apmd -dz0 > apmd: invalid percent: too small: Result too large actually, both, but you should use something like errc() instead of errno.
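The "Result too large" suffix in the quoted output is the text for ERANGE, which strtonum(3) leaves in errno when the value is out of range and which err(3) then appends. A rough sketch of a message that carries both the reason and the offending argument follows. parse_range() is a hypothetical portable stand-in for strtonum(3), written here only so the example builds outside OpenBSD, and percent_error() is likewise an assumed helper that formats what an errx(3)-style call would print.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical portable stand-in for strtonum(3); not the libc code.
 * On failure it returns 0, sets errno and points *errstrp at a reason.
 * On success it returns the value and sets *errstrp to NULL.
 */
static long long
parse_range(const char *s, long long minval, long long maxval,
    const char **errstrp)
{
	char *end;
	long long v;
	int saved_errno = errno;

	assert(s != NULL && errstrp != NULL);
	errno = 0;
	v = strtoll(s, &end, 10);
	if (end == s || *end != '\0') {
		errno = EINVAL;
		*errstrp = "invalid";
		return (0);
	}
	if ((v == LLONG_MIN && errno == ERANGE) || v < minval) {
		errno = ERANGE;
		*errstrp = "too small";
		return (0);
	}
	if ((v == LLONG_MAX && errno == ERANGE) || v > maxval) {
		errno = ERANGE;
		*errstrp = "too large";
		return (0);
	}
	errno = saved_errno;
	*errstrp = NULL;
	return (v);
}

/* Format the reason and the argument, without any strerror(errno) text. */
static int
percent_error(char *buf, size_t len, const char *errstr, const char *arg)
{
	return (snprintf(buf, len, "percent is %s: %s", errstr, arg));
}
```

For the -dz0 case from the thread this yields "percent is too small: 0", with no errno text appended.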
Re: [patch] Add -z and -Z to apmd for automatic suspend/hibernate
On Sun, Aug 13, 2017 at 02:13:42PM +0200, Jesper Wallin wrote: > On Sun, Aug 13, 2017 at 09:52:22AM +0200, Martijn van Duren wrote: > > I've also been bitten by this a couple of times, but you can also solve > > this via the sensorsd framework, which is how I've done it. > > Yeah, someone on IRC also suggested sensorsd or even ksh and a cronjob. > I personally find it a bit too ducttapey though, especially for a > feature one would expect on a laptop. This also saves me from running an > extra daemon just in case my battery runs out. > > > I'm no expert in this area and I'm not going to make any statements on > > whether we should add this or not, but two (non functionality) nits > > inline for future reference. > > Thanks for the feedback, greatly appreciated! I've done a few changes > and also rewrote some of the manual, as my previous patch was lying. > > > Jesper Wallin > > > Index: usr.sbin/apmd/apmd.8 > === > RCS file: /cvs/src/usr.sbin/apmd/apmd.8,v > retrieving revision 1.47 > diff -u -p -r1.47 apmd.8 > --- usr.sbin/apmd/apmd.8 12 Feb 2015 14:03:49 - 1.47 > +++ usr.sbin/apmd/apmd.8 13 Aug 2017 11:50:16 - > @@ -113,6 +113,20 @@ The polling rate defaults to > once per 10 minutes, but may be specified using the > .Fl t > command-line flag. > +.It Fl z Ar percent > +Automatically suspend the system if no AC is connected and the > +estimated battery life is equal or below > +.Ar percent . > +.It Fl Z Ar percent > +Automatically hibernate the system if no AC is connected and the > +estimated battery life is equal or below > +.Ar percent . > +.Pp > +If both > +.Fl Z > +and > +.Fl z > +are specified, the first one will supersede the other. 
> .El > .Pp > When a client requests a suspend or stand-by state, > Index: usr.sbin/apmd/apmd.c > === > RCS file: /cvs/src/usr.sbin/apmd/apmd.c,v > retrieving revision 1.79 > diff -u -p -r1.79 apmd.c > --- usr.sbin/apmd/apmd.c 16 Nov 2015 17:35:05 - 1.79 > +++ usr.sbin/apmd/apmd.c 13 Aug 2017 11:50:16 - > @@ -56,6 +56,9 @@ > #define TRUE 1 > #define FALSE 0 > > +#define AUTO_SUSPEND 1 > +#define AUTO_HIBERNATE 2 > + > const char apmdev[] = _PATH_APM_CTLDEV; > const char sockfile[] = _PATH_APM_SOCKET; > > @@ -94,8 +97,8 @@ void > usage(void) > { > fprintf(stderr, > - "usage: %s [-AadHLs] [-f devname] [-S sockname] [-t seconds]\n", > - __progname); > + "usage: %s [-AadHLs] [-f devname] [-S sockname] [-t seconds] " > + "[-z percent] [-Z percent]\n", __progname); > exit(1); > } > > @@ -348,6 +351,8 @@ main(int argc, char *argv[]) > { > const char *fname = apmdev; > int ctl_fd, sock_fd, ch, suspends, standbys, hibernates, resumes; > + int autoaction = 0; > + int autolimit = 0; > int statonly = 0; > int powerstatus = 0, powerbak = 0, powerchange = 0; > int noacsleep = 0; > @@ -355,13 +360,14 @@ main(int argc, char *argv[]) > struct apm_power_info pinfo; > time_t apmtimeout = 0; > const char *sockname = sockfile; > + const char *errstr; > int kq, nchanges; > struct kevent ev[2]; > int ncpu_mib[2] = { CTL_HW, HW_NCPU }; > int ncpu; > size_t ncpu_sz = sizeof(ncpu); > > - while ((ch = getopt(argc, argv, "aACdHLsf:t:S:")) != -1) > + while ((ch = getopt(argc, argv, "aACdHLsf:t:S:z:Z:")) != -1) > switch(ch) { > case 'a': > noacsleep = 1; > @@ -402,6 +408,18 @@ main(int argc, char *argv[]) > doperf = PERF_MANUAL; > setperfpolicy("high"); > break; > + case 'z': > + autoaction = AUTO_SUSPEND; > + autolimit = strtonum(optarg, 1, 100, &errstr); > + if (errstr != NULL) > + error("invalid percent: %s", errstr); > + break; > + case 'Z': > + autoaction = AUTO_HIBERNATE; > + autolimit = strtonum(optarg, 1, 100, &errstr); > + if (errstr != NULL) > + error("invalid percent: %s", errstr); > 
+ break; > case '?': > default: > usage(); > @@ -479,6 +497,20 @@ main(int argc, char *argv[]) > if (powerstatus != powerbak) { > powerstatus = powerbak; > powerchange = 1; > + } > + > + if (!powerstatus && autoaction && > + autolimit > (int)pinfo.battery_life) { > + syslog(LOG_NOTICE, "estimated battery life > %d%%, " > + "autoaction limit set to %d%% .", > +
Re: 1M routes or 1M arp entries
Hi, you may want to take a look into /etc/login.conf login.conf(5), cap_mkdb(1) In this file you can fiddle with your limit maxima for login classes. BR Simon 2017-08-14 16:28 GMT+02:00, Hrvoje Popovski : > On 14.8.2017. 16:03, Alexander Bluhm wrote: >> On Mon, Aug 14, 2017 at 03:52:56PM +0200, Hrvoje Popovski wrote: >>> # netstat -rnf inet >>> netstat: Cannot allocate memory >> >> Have you tried to increase ulimit -d ? > > it seems that i can decrease it but not increase it, or i don't know how > to do it properly :) > > # ulimit -d > 33554432 > > # ulimit -d 33554433 > > # ulimit -d > 33554432 > >
Re: 1M routes or 1M arp entries
On 14.8.2017. 16:03, Alexander Bluhm wrote: > On Mon, Aug 14, 2017 at 03:52:56PM +0200, Hrvoje Popovski wrote: >> # netstat -rnf inet >> netstat: Cannot allocate memory > > Have you tried to increase ulimit -d ? it seems that i can decrease it but not increase it, or i don't know how to do it properly :) # ulimit -d 33554432 # ulimit -d 33554433 # ulimit -d 33554432
Re: 1M routes or 1M arp entries
On Mon, Aug 14, 2017 at 03:52:56PM +0200, Hrvoje Popovski wrote: > # netstat -rnf inet > netstat: Cannot allocate memory Have you tried to increase ulimit -d ? bluhm
1M routes or 1M arp entries
Hi all, when OpenBSD imports cca 1M routes or more and i want to see them with "netstat -rn", i'm getting "Cannot allocate memory". bgpd can see all routes. i don't think that this is a real problem, since a full bgp table is cca 700K routes. # bgpctl show ip bgp mem RDE memory statistics 1245184 IPv4 unicast network entries using 47.5M of memory 2490368 rib entries using 152M of memory 2490368 prefix entries using 152M of memory 1 BGP path attribute entries using 120B of memory 1 BGP AS-PATH attribute entries using 37B of memory, and holding 1 references 0 BGP attributes entries using 0B of memory and holding 0 references 0 BGP attributes using 0B of memory RIB using 352M of memory # bgpctl show ip bgp | wc -l 1245188 # netstat -rnf inet netstat: Cannot allocate memory same happens with arp. if cca 1M arp entries are injected, both "arp" and "netstat -rn" fail with "Cannot allocate memory". of course this is an extremely ridiculous example, but it would be good if i could at least delete arp entries. # vmstat -m | egrep "Name|arp" Name Size Requests Fail InUse Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle arp 56 14819030 983053 13950 104 13846 13846 0 80 # arp -an Host Ethernet Address Netif Expire Flags arp: malloc: Cannot allocate memory # arp -ad arp: malloc: Cannot allocate memory # netstat -rnf inet netstat: Cannot allocate memory
Re: hme: incorrect register endian for PCI sun hme devices?
> From: Mark Cave-Ayland > Date: Mon, 14 Aug 2017 06:31:34 +0100 > > On 13/08/17 16:52, Kaashif Hymabaccus wrote: > > > Hello Mark, > > > > I have a Sun Ultra 5 with the following dmesg: > > > > console is /pci@1f,0/pci@1,1/ebus@1/se@14,40:a > > Copyright (c) 1982, 1986, 1989, 1991, 1993 > > The Regents of the University of California. All rights reserved. > > Copyright (c) 1995-2017 OpenBSD. All rights reserved. > > https://www.OpenBSD.org > > > > OpenBSD 6.1-current (GENERIC) #225: Fri Aug 11 19:58:43 MDT 2017 > > dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC > > real mem = 536870912 (512MB) > > avail mem = 512393216 (488MB) > > mpath0 at root > > scsibus0 at mpath0: 256 targets > > mainbus0 at root: Sun Ultra 5/10 UPA/PCI (UltraSPARC-IIi 270MHz) > > cpu0 at mainbus0: SUNW,UltraSPARC-IIi (rev 1.3) @ 269.802 MHz > > cpu0: physical 16K instruction (32 b/l), 16K data (32 b/l), 256K external > > (64 b/l) > > psycho0 at mainbus0 addr 0xfffc4000: SUNW,sabre, impl 0, version 0, ign 7c0 > > psycho0: bus range 0-2, PCI bus 0 > > psycho0: dvma map c000-dfff > > pci0 at psycho0 > > ppb0 at pci0 dev 1 function 1 "Sun Simba" rev 0x11 > > pci1 at ppb0 bus 1 > > ebus0 at pci1 dev 1 function 0 "Sun PCIO EBus2" rev 0x01 > > auxio0 at ebus0 addr 726000-726003, 728000-728003, 72a000-72a003, > > 72c000-72c003, 72f000-72f003 > > power0 at ebus0 addr 724000-724003 ivec 0x25 > > "SUNW,pll" at ebus0 addr 504000-504002 not configured > > sab0 at ebus0 addr 40-40007f ivec 0x2b: rev 3.2 > > sabtty0 at sab0 port 0: console > > sabtty1 at sab0 port 1 > > comkbd0 at ebus0 addr 3083f8-3083ff ivec 0x29: no keyboard > > comms0 at ebus0 addr 3062f8-3062ff ivec 0x2a > > wsmouse0 at comms0 mux 0 > > lpt0 at ebus0 addr 3043bc-3043cb, 30015c-30015d, 70-7f ivec 0x22: > > polled > > "fdthree" at ebus0 addr 3023f0-3023f7, 706000-70600f, 72-720003 ivec > > 0x27 not configured > > clock1 at ebus0 addr 0-1fff: mk48t59 > > "flashprom" at ebus0 addr 0-f not configured > > audioce0 at 
ebus0 addr 20-2000ff, 702000-70200f, 704000-70400f, > > 722000-722003 ivec 0x23 ivec 0x24: nvaddrs 0 > > audio0 at audioce0 > > hme0 at pci1 dev 1 function 1 "Sun HME" rev 0x01: ivec 0x7e1, address > > 08:00:20:19:39:20 > > nsphy0 at hme0 phy 1: DP83840 10/100 PHY, rev. 1 > > machfb0 at pci1 dev 2 function 0 "ATI Mach64" rev 0x9a > > machfb0: ATY,GT-B, 1152x900 > > wsdisplay0 at machfb0 mux 1 > > wsdisplay0: screen 0 added (std, sun emulation) > > pciide0 at pci1 dev 3 function 0 "CMD Technology PCI0646" rev 0x03: DMA, > > channel 0 configured to native-PCI, channel 1 configured to native-PCI > > pciide0: using ivec 0x7e0 for native-PCI interrupt > > wd0 at pciide0 channel 0 drive 0: > > wd0: 16-sector PIO, LBA48, 117800MB, 241254720 sectors > > wd0(pciide0:0:0): using PIO mode 4, DMA mode 2 > > atapiscsi0 at pciide0 channel 1 drive 0 > > scsibus1 at atapiscsi0: 2 targets > > cd0 at scsibus1 targ 0 lun 0: ATAPI > > 5/cdrom removable > > wd1 at pciide0 channel 1 drive 1: > > wd1: 16-sector PIO, LBA, 19546MB, 40031712 sectors > > cd0(pciide0:1:0): using PIO mode 4, DMA mode 2 > > wd1(pciide0:1:1): using PIO mode 4, DMA mode 2 > > ppb1 at pci0 dev 1 function 0 "Sun Simba" rev 0x11 > > pci2 at ppb1 bus 2 > > vscsi0 at root > > scsibus2 at vscsi0: 256 targets > > softraid0 at root > > scsibus3 at softraid0: 256 targets > > bootpath: /pci@1f,0/pci@1,1/ide@3,0/disk@0,0 > > root on wd0a (f52f0bbc65e53556.a) swap on wd0b dump on wd0b > > > > It has a PCI hme card and it works great. > > > > I would be happy to help if you want to test some diff or program, but > > I am not knowledgeable enough to comment on the inner workings of the > > hme driver. > > Great, thanks for the information - the fact that the nsphy0 has been > detected correctly means that the access still works. Looks like I'll > have to go digging deeper. The OpenBSD code uses %asi if necessary to let the hardware do the byteswapping. Howver, I think the psycho(4) host bridge also does an implicit byteswap. 
Always has been a bit confusing to me. But the code definitely works correctly on real hardware.
Re: no seqpacket in nfs
On Sun, Aug 13, 2017 at 11:03:02PM -0400, Ted Unangst wrote: > here's a new version that pulls the check higher. OK bluhm@ > - if (so->so_type == SOCK_SEQPACKET) > - flags = MSG_EOR; > - else > - flags = 0; > + flags = 0; > > error = sosend(so, sendnam, NULL, top, NULL, flags); You could kill the flags variable and just pass 0.
allow iwm_stop to sleep
This diff makes iwm_stop() always run in a process context. I want iwm_stop() to be able to sleep so that it can wait for asynchronous driver tasks, and perhaps even wait for firmware commands, in the future. If the interrupt handler detects a fatal condition, instead of calling iwm_stop() directly, defer to the init task. The init task looks at flags to understand what happened and restarts or stops the device as appropriate. I found that toggling the RF kill switch can trigger a fatal firmware error. Hence I am letting the interrupt handler check RF kill before checking for fatal firmware error. Provides better error reporting when the kill switch is used. During suspend, bring the device down during the QUIESCE stage which is allowed to sleep. dhclient/down/scan in a loop still works (as far as it always has, with various errors reported in dmesg). Suspend/resume still works. ok? Index: if_iwm.c === RCS file: /cvs/src/sys/dev/pci/if_iwm.c,v retrieving revision 1.211 diff -u -p -r1.211 if_iwm.c --- if_iwm.c13 Aug 2017 18:08:03 - 1.211 +++ if_iwm.c14 Aug 2017 06:44:59 - @@ -6517,6 +6517,7 @@ iwm_stop(struct ifnet *ifp) sc->sc_flags &= ~IWM_FLAG_BINDING_ACTIVE; sc->sc_flags &= ~IWM_FLAG_STA_ACTIVE; sc->sc_flags &= ~IWM_FLAG_TE_ACTIVE; + sc->sc_flags &= ~IWM_FLAG_HW_ERR; sc->sc_newstate(ic, IEEE80211_S_INIT, -1); @@ -7169,7 +7170,6 @@ int iwm_intr(void *arg) { struct iwm_softc *sc = arg; - struct ifnet *ifp = IC2IFP(&sc->sc_ic); int handled = 0; int r1, r2, rv = 0; int isperiodic = 0; @@ -7218,6 +7218,15 @@ iwm_intr(void *arg) /* ignored */ handled |= (r1 & (IWM_CSR_INT_BIT_ALIVE /*| IWM_CSR_INT_BIT_SCD*/)); + if (r1 & IWM_CSR_INT_BIT_RF_KILL) { + handled |= IWM_CSR_INT_BIT_RF_KILL; + if (iwm_check_rfkill(sc)) { + task_add(systq, &sc->init_task); + rv = 1; + goto out; + } + } + if (r1 & IWM_CSR_INT_BIT_SW_ERR) { #ifdef IWM_DEBUG int i; @@ -7238,7 +7247,6 @@ iwm_intr(void *arg) #endif printf("%s: fatal firmware error\n", DEVNAME(sc)); - iwm_stop(ifp); task_add(systq, 
&sc->init_task); rv = 1; goto out; @@ -7248,7 +7256,8 @@ iwm_intr(void *arg) if (r1 & IWM_CSR_INT_BIT_HW_ERR) { handled |= IWM_CSR_INT_BIT_HW_ERR; printf("%s: hardware error, stopping device \n", DEVNAME(sc)); - iwm_stop(ifp); + sc->sc_flags |= IWM_FLAG_HW_ERR; + task_add(systq, &sc->init_task); rv = 1; goto out; } @@ -7262,13 +7271,6 @@ iwm_intr(void *arg) wakeup(&sc->sc_fw); } - if (r1 & IWM_CSR_INT_BIT_RF_KILL) { - handled |= IWM_CSR_INT_BIT_RF_KILL; - if (iwm_check_rfkill(sc) && (ifp->if_flags & IFF_UP)) { - iwm_stop(ifp); - } - } - if (r1 & IWM_CSR_INT_BIT_RX_PERIODIC) { handled |= IWM_CSR_INT_BIT_RX_PERIODIC; IWM_WRITE(sc, IWM_CSR_INT, IWM_CSR_INT_BIT_RX_PERIODIC); @@ -7739,23 +7741,27 @@ iwm_init_task(void *arg1) { struct iwm_softc *sc = arg1; struct ifnet *ifp = &sc->sc_ic.ic_if; - int s; + int s = splnet(); int generation = sc->sc_generation; + int fatal = (sc->sc_flags & (IWM_FLAG_HW_ERR | IWM_FLAG_RFKILL)); rw_enter_write(&sc->ioctl_rwl); if (generation != sc->sc_generation) { rw_exit(&sc->ioctl_rwl); + splx(s); return; } - s = splnet(); if (ifp->if_flags & IFF_RUNNING) iwm_stop(ifp); - if ((ifp->if_flags & (IFF_UP | IFF_RUNNING)) == IFF_UP) + else if (sc->sc_flags & IWM_FLAG_HW_ERR) + sc->sc_flags &= ~IWM_FLAG_HW_ERR; + + if (!fatal && (ifp->if_flags & (IFF_UP | IFF_RUNNING)) == IFF_UP) iwm_init(ifp); - splx(s); rw_exit(&sc->ioctl_rwl); + splx(s); } int @@ -7778,9 +7784,12 @@ iwm_activate(struct device *self, int ac int err = 0; switch (act) { - case DVACT_SUSPEND: - if (ifp->if_flags & IFF_RUNNING) + case DVACT_QUIESCE: + if (ifp->if_flags & IFF_RUNNING) { + rw_enter_write(&sc->ioctl_rwl); iwm_stop(ifp); + rw_exit(&sc->ioctl_rwl); + } break; case DVACT_RESUME: err = iwm_resume(sc); Index: if_iwmvar.h === RCS file: /cvs/src/sys/dev/pci/if_iwmvar.h,v retrieving revision 1.33 diff -u -p -r1.33 if_iwmvar.h --- if