Re: equivalent flag or code to MSG_MORE in NetBSD?

2019-06-10 Thread Erik Fair
There is an ancient BSD ioctl(2) which might cover this case: FIONREAD. The 
point of it back in the day was to be able to know just how much data could be 
read from a file descriptor (e.g., from TTY input buffers) without blocking.
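
A minimal sketch of the idea, assuming a platform where Python's termios module exposes FIONREAD (the BSDs and Linux do). The helper name bytes_readable and the pipe used for the demonstration are illustrative only:

```python
import fcntl
import os
import struct
import termios


def bytes_readable(fd):
    """Ask the kernel how many bytes can be read from fd without blocking."""
    # FIONREAD fills in a C int; pass a zeroed int-sized buffer and unpack it.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


# Demonstration on a pipe: write five bytes, then query the read end.
r, w = os.pipe()
os.write(w, b"hello")
pending = bytes_readable(r)
os.close(r)
os.close(w)
```

Whether this actually substitutes for MSG_MORE depends on what the application wanted from that flag; this only shows the query itself.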

Erik Fair

Re: Running out of buffers?

2018-05-02 Thread Erik Fair

> On Apr 27, 2018, at 15:58, Robert Elz wrote:
> 
>    Date:        Fri, 27 Apr 2018 21:34:49 +0100
>    From:        Roy Marples
> 
>  | Hopefully this fixes the issues and won't impact small memory devices 
>  | too much.
> 
> While those are probably useful changes to make, they don't fix anything,
> merely make it less likely.
> 
> We really need to turn off the error on recv() by default - and allow it
> to be turned on by applications that actually want to deal with this.
> 
> kre
> 

Very old related PR for the transmit side: http://gnats.netbsd.org/7285 


Erik



Re: running out of file descriptors

2018-04-22 Thread Erik Fair
We (NetBSD) still have a historical construct in config(5) whereby a bunch of
system-wide limits like MAXFILES are calculated from a presumed average or
median amount of those resources per user, expressed as multiples of “maxusers
[n]”.

We may wish to survey typical applications now in use and see what the 
distribution of {file, socket} descriptors “per user” looks like (standard 
distribution? bi-modal? tri-modal?) and see if our current multipliers still 
reflect actual use. Or perhaps use a different scheme/model for sizing kernel & 
system resource limits than “n per user.”

I’ve had to contend with our assumed limits by significantly raising the 
per-process soft limits for programs like pkgsrc/www/privoxy wherein descriptor 
use is proportional to traffic/clients rather than to number of processes.
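
As a rough illustration of raising the per-process soft limit from inside a program (rather than via the shell or login classes), here is a sketch using getrlimit(2)/setrlimit(2) through Python's resource module. The function name and the target of 4096 are assumptions chosen for the example, not values anyone in this thread used:

```python
import resource


def raise_nofile_soft_limit(target):
    """Raise the soft RLIMIT_NOFILE toward target, never past the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    # An unprivileged process may only raise the soft limit up to the hard one.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if wanted > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


before = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
after = raise_nofile_soft_limit(4096)  # 4096 is an arbitrary example target
```

The function only ever raises the limit; a target below the current soft limit leaves it untouched.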

something to consider,

    Erik Fair

> On Apr 18, 2018, at 02:27, Thomas Klausner <t...@giga.or.at> wrote:
> 
> On Wed, Apr 18, 2018 at 11:10:37AM +0200, Martin Husemann wrote:
>> On Wed, Apr 18, 2018 at 11:08:49AM +0200, Thomas Klausner wrote:
>>> Did anyone else notice something similar?
>> 
>> Check with fstat(1) ?
> 
> Good idea. Right now, the top ones seem nearly ok:
> 
> # fstat | sed "s/ [0-9].*$//" | sort | uniq -c | sort -n
> ...
>  24 root pbulk-build
>  26 wiz  at-spi2-registry 
>  30 wiz  at-spi-bus-launc
>  34 wiz  zsh   
>  40 wiz  dbus-daemon
>  42 root sh
>  45 root X  
>  49 bulk sh
>  56 bulk cc1plus   
>  62 root sshd   
>  73 wiz  transmission-gtk
>  92 root master  
> 115 wiz  syncthing  
> 142 wiz  firefox   
> 
> I wonder about some of the numbers (master 92, sshd 62) but I don't
> see anything eating thousands. Perhaps it's one particular package.
> Thomas



Re: Potentially undesirable behavior with apropos(1)

2016-07-11 Thread Erik Fair

> On Jul 8, 2016, at 08:36, Abhinav Upadhyay <er.abhinav.upadh...@gmail.com> 
> wrote:
> 
> On Fri, Jul 8, 2016 at 8:56 PM, Tom Ivar Helbekkmo <t...@hamartun.priv.no> 
> wrote:
>> Abhinav Upadhyay <er.abhinav.upadh...@gmail.com> writes:
>> 
>>> We just need to handle the special cases where we don't want to stem :)
>> 
>> ...or perhaps do the stemming only when the resulting stem is found in
>> /usr/share/dict/words?
> 
> Yes, that's probably a good idea. I first need to write the custom
> tokenizer and I can probably use that dictionary to decide what to
> stem and what not to stem.
> 
> -
> Abhinav

In principle a lot of technical names are marked up in mandoc as “.Tn foo” 
which might provide a good list of words to “not stem.”
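
A sketch of how such a list could be collected, assuming the manual sources are available as text and that only line-leading “.Tn” requests matter (inline uses inside other macros are ignored here). The function name and the sample input are made up for illustration:

```python
import re

# Matches a line-leading .Tn request and captures its arguments.
TN_LINE = re.compile(r"^\.Tn\s+(.+)$", re.MULTILINE)


def tradenames(mdoc_source):
    """Collect lower-cased .Tn arguments as a set of words not to stem."""
    names = set()
    for match in TN_LINE.finditer(mdoc_source):
        for word in match.group(1).split():
            word = word.strip(".,;:()").lower()
            if word:  # skip bare punctuation arguments such as "."
                names.add(word)
    return names


sample = ".Sh DESCRIPTION\nThe\n.Tn NetBSD\nkernel follows\n.Tn POSIX .\n"
no_stem = tradenames(sample)
```

A tokenizer could then consult this set before handing a word to the stemmer.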

Erik Fair




Re: building current - signal 6

2016-05-05 Thread Erik Fair
In /usr/include/sys/signal.h find:

#define SIGABRT 6   /* abort() */

See also signal(7) and abort(3).

It would appear that make(1) called abort(3) since rm(1) doesn’t have any 
direct calls to it that I could find by reading the base source. Cursory grep 
through /usr/src/usr.bin/make/* shows possibilities …

Perhaps turning on some make debug options will reveal more.
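
For anyone wanting to confirm the mapping without reading the header, a small sketch: it decodes signal number 6 and shows how a child that calls abort(3) is reported to its parent. The child here is just a Python one-liner standing in for whatever aborted in the build:

```python
import signal
import subprocess
import sys

# Decode the number that make printed as "*** Signal 6".
name = signal.Signals(6).name

# A child that calls abort(3); the parent sees it as "killed by SIGABRT",
# which subprocess reports as a negative return code.
child = subprocess.run(
    [sys.executable, "-c", "import os; os.abort()"],
    stderr=subprocess.DEVNULL,
)
killed_by = -child.returncode
```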

Erik Fair

> On May 3, 2016, at 16:59, Riccardo Mottola <riccardo.mott...@libero.it> wrote:
> 
> Hi,
> 
> when I build current on amd64, my build fails with:
> 
> #create  font-misc-misc/6x12-ISO8859-4.bdf
> /usr/src/../obj/external/mit/xorg/tools/ucs2any/ucs2any 
> /usr/xsrc/external/mit/font-misc-misc/dist/6x12.bdf 
> /usr/src/../obj/destdir.amd64/usr/X11R7/lib/X11/fonts/util/map-ISO8859-4 
> ISO8859-4
> Writing 223 characters into file '6x12-ISO8859-4.bdf'.
> #create  font-misc-misc/6x12-ISO8859-4.pcf.gz
> rm -f 6x12-ISO8859-4.pcf.gz
> *** Signal 6
> 
> Stop.
> nbmake[11]: stopped in 
> /usr/src/external/mit/xorg/share/fonts/misc/font-misc-misc
> 
> *** Failed target:  dependall
> *** Failed command: cd 
> "/usr/src/external/mit/xorg/share/fonts/misc/font-misc-misc"; 
> /usr/src/../tools/bin/nbmake realall
> *** Error code 1
> 
> Stop.
> nbmake[10]: stopped in 
> /usr/src/external/mit/xorg/share/fonts/misc/font-misc-misc
> 
> 
> is the issue "Signal 6", what is going wrong?
> 
> 
> Thanks, Riccardo




Re: No buffer space available

2014-09-02 Thread Erik Fair

On Sep 1, 2014, at 01:46, Michael van Elst <mlel...@serpens.de> wrote:

> t...@hamartun.priv.no (Tom Ivar Helbekkmo) writes:
>> Sep  1 09:32:49 barsoom openvpn[2896]: write UDPv4: No buffer space
>> available (code=55)
>
> This doesn't necessarily mean that the buffer space is too small. It often
> also means that you try to send data faster than possible through the
> particular hardware. It can also mean that the hardware currently cannot
> send data (e.g. if the interface is inactive / has no carrier).

See http://gnats.netbsd.org/7285

Erik <f...@netbsd.org>



Re: No buffer space available

2014-09-02 Thread Erik Fair
Network hangs are insidious. [old fart story time]

The headscratcher for me was the one in the 1990s at apple.com (when apple.com
was a DEC VAX-8650 running 4.3BSD) that led me to discover TCP SYN attacks and
report that to the CERT two years before panix.com was attacked in the same
way. Problem: far too limited initial TCP SYN queue length (5!), and when the
short queue was full, any new TCP connection attempts to that port failed with
“connection timed out” (SYN packet inbound dropped because queue for that port
is full), despite ping (ICMP) working fine.

Imagine:

telnet localhost 25 gives “connection timed out” (wait, what? How is that
possible?)

kill sendmail (yeah, we used sendmail back then)

telnet localhost 25 gives “connection refused” (OK, as expected)

restart sendmail

telnet localhost 25 gives “connection timed out” (WTF?!!)

Rebooting the VAX didn't clear the problem either - same behavior afterwards.

That's when I went looking to our routers to see if anything was wrong with the 
rest of our connections to the Internet.

The source of my problem was warring default routes in a pair of our 
exterior-facing Cisco routers (round & round a class of outbound packets went
until TTL exceeded), but because the routers carried about 2/3rds of the full 
default-free Internet routing table at the time, we didn't immediately notice 
that we couldn't talk to 1/3rd of the Internet. Of course, they could all still 
send packets to us ... which is how the TCP SYN queue got full: our SYN_ACKs 
weren't getting out to that 1/3rd, and with the SYN queue full (and a 
two-minute timeout), suddenly SMTP stops accepting any other connection 
attempts.
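
The failure mode can be illustrated with a toy model. This is only a simulation sketch, not the 4.3BSD code: the class name is invented, and the only figures taken from the story are the queue length of 5 and the two-minute timeout.

```python
from collections import deque


class SynQueue:
    """Toy model of a short, fixed-length queue of half-open connections."""

    def __init__(self, maxlen=5, timeout=120.0):
        self.maxlen = maxlen
        self.timeout = timeout
        self.pending = deque()  # arrival times of SYNs still awaiting an ACK

    def on_syn(self, now):
        # Expire entries whose handshake never completed.
        while self.pending and now - self.pending[0] >= self.timeout:
            self.pending.popleft()
        if len(self.pending) >= self.maxlen:
            return False  # SYN dropped; the client eventually times out
        self.pending.append(now)
        return True


q = SynQueue()
# Five SYNs from hosts our SYN-ACKs can never reach, at t = 0..4 seconds.
unanswered = [q.on_syn(float(t)) for t in range(5)]
blocked = q.on_syn(10.0)      # queue full: a reachable client is dropped
recovered = q.on_syn(125.0)   # all five stale entries have timed out by now
```

With only five slots and a 120-second expiry, a handful of unanswerable SYNs is enough to keep every other client out for two minutes at a time.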

Once I found the default route loop, I fixed it, and then watched the load on 
apple.com shoot up as the Internet started actually being able to speak to our 
SMTP server again.

My report to the CERT (then at CMU SEI) came out of first “how did this
happen?”, followed by, “wow, I could send five or six packets every two minutes
with totally random non-responsive (non-existent!) IP source addresses to any
particular host/TCP port combination and stop that host from being able to
respond on that port! I could shut down E-mail at AOL! Moo hah hah! Oh, and,
yeah, just try to trace & stop me, I dare you.” [the CERT did nothing with my
report, alas. I quietly provided it to friends at SGI and a few other places]

I also sent a somewhat oblique message to the IETF mailing list, asserting that 
a class of one-way (bidirectional communication not required) attacks existed, 
and that ISP ingress filtering of customer IP source addresses was the only way 
we'd be able to both forestall them, and trace them. That's a BCP now, but Phil 
Karn flamed me at the time for wanting to break one mode of mobile-IP. I wasn't 
graphic or explicit because that list was public, and I didn't want to provide 
a recipe for any would-be attackers until both the ingress filtering was 
deployed, and the OS companies had fixed their TCP implementations.

This all got fixed a few years later after Panix.com was attacked (though 
nowhere near as elegantly - they were really massively flooded) with the TCP 
SYN queue system we now have in NetBSD and all other responsible OSes.

The Internet is a pretty hostile network.

[/old fart story time]

How this relates: as noted in PR/7285, we have a semantic problem with our 
errors from the networking code: ENOBUFS (55) is returned for BOTH mbuf 
exhaustion, AND for network interface queue full (see the IFQ_MAXLEN, 
IFQ_SET_MAXLEN(), IF_QFULL() macros in /usr/include/net/if.h, and then in the 
particular network interface driver you use).

TCP is well-behaved: it just backs off and retransmits when it hits a condition 
like that, and your application probably never hears about it - though it may 
experience the condition as a performance degradation as TCP backs off.

UDP, not so much.

If your UDP-based applications are reporting that error, they're probably not 
doing anything active/adaptive about it. Some human is expected to analyze the 
situation and deal with it somehow. Lucky you, human. It might be time for 
you to recapitulate the TCP congestion measurement and backoff algorithms in 
your UDP application (good luck with that well-trod path to tears). Or just 
convert to TCP. Or ... fix your network (stack? interface? media? switches?), 
if you can figure out what's actually wrong.
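
For the simplest possible “do something active about it”, here is a sketch of a UDP send wrapper that treats ENOBUFS as a signal to wait and retry with exponentially growing delays. It is nowhere near real congestion control; the function name, retry count, and delays are arbitrary assumptions for illustration:

```python
import errno
import time


def send_with_backoff(sock, data, addr, retries=5, base_delay=0.01,
                      sleep=time.sleep):
    """sendto() that backs off and retries when the kernel reports ENOBUFS."""
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return sock.sendto(data, addr)
        except OSError as err:
            # Anything other than ENOBUFS, or running out of retries, is fatal.
            if err.errno != errno.ENOBUFS or attempt == retries:
                raise
            sleep(delay)
            delay *= 2  # wait twice as long before the next attempt
```

Called as send_with_backoff(sock, payload, (host, port)) on an ordinary datagram socket, it behaves like sendto() unless ENOBUFS comes back. Because the error code does not say which resource ran out, the wrapper cannot tell mbuf exhaustion from a full interface queue; it just slows down either way.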

The bad part is that without a distinct error message for queue full, I can't 
tell you whether you really are running out of mbufs (though netstat -m will 
tell you if you've ever hit the limit, and netstat -s will tell you about some 
queues on a per-protocol basis, but I don't see counters for network interfaces 
in there, as there probably should be), or whether you're overrunning the 
network interface output queue limit, whatever that is.

In both cases, your application should take such an error as a message to back 
off and retransmit later (like TCP