Re: getaddrinfo error for existing host without requested address family

2022-10-26 Thread Mike Karels
On Oct 17, I wrote:

> On Wed, 28 Sep 2022, Konstantin Belousov wrote:

> > On Tue, Sep 27, 2022 at 03:53:12PM -0500, Mike Karels wrote:
> > > I recently noticed the following behavior:
> > > 
> > > % ping6 redrock
> > > ping6: Name does not resolve
> > > % host redrock
> > > redrock.karels.net has address 10.0.2.2
> > > redrock.karels.net mail is handled by 10 mail.karels.net.
> > > % ping6 nonexistenthost
> > > ping6: Name does not resolve
> > > 
> > > The first error message is misleading, because the name *does* resolve,
> > > but has no AAAA record, and it is the same error message as for a name
> > > that truly does not exist.  The problem comes from the set of error
> > > codes that getaddrinfo() returns in these two cases.  The problem did
> > > not exist with gethostbyname(), which has separate error codes for the
> > > two (although gethostbyname did not have provision for IPv6, it handled
> > > cases like domain names and mail domains without IPv4 addresses).
> > > 
> > > getaddrinfo() uses a richer set of error codes than gethostbyname(), but
> > > still misses this case.  However, looking at <netdb.h>, I see
> > > 
> > > #if 0
> > > /* Obsoleted on RFC 2553bis-02 */
> > > #define   EAI_ADDRFAMILY   1  /* address family for hostname not supported */
> > > #endif
> > > ...
> > > #if 0
> > > /* Obsoleted on RFC 2553bis-02 */
> > > #define   EAI_NODATA   7  /* no address associated with hostname */
> > > #endif
> > > 
> > > I don't know why these two were omitted from the update to RFC 2553, but
> > > the first seems to me to be the correct error for an existing name without
> > > an address for the requested address family.  Also, that is the error
> > > message produced by Linux (Ubuntu 22.04.1).
> > > 
> > > NetBSD and OpenBSD produce the second of these two errors for a host
> > > without the requested address.  But they also produce the same error
> > > when a name does not exist.
> > > 
> > > RFC 2553bis-02 has timed out, and is replaced by RFC 3493, which is also
> > > missing EAI_ADDRFAMILY.  These are informational RFCs, not specifying an
> > > Internet standard.
> > > 
> > > I propose re-enabling EAI_ADDRFAMILY and using it for the situation
> > > where a name exists but does not have an address in the requested family.
> > > This would make the error in the example less misleading, and would behave
> > > the same as Linux in this regard.  The change to netdb.h is trivial, but
> > > getaddrinfo() needs a little more work because it uses the NS_* errors
> > > from <nsswitch.h> internally and then translates.  But it will benefit
> > > from greater accuracy in other cases as well (e.g.  "out of memory"
> > > rather than "Name does not resolve").
> > > 
> > > Comments?  I have a change in progress, but wanted to float the idea
> > > before I finish it and put it into review.

> > Perhaps look there
> > https://www.openwall.com/lists/libc-coord/2022/09/27/1

> > You might want to participate in the thread, instead of me.

> I participated in a short discussion on that list.  The TL;DR:

> - Linux/glibc (Ubuntu at least) uses EAI_NODATA ("No address associated
> with hostname") when a name is valid but does not have the requested
> address family.  This is better than FreeBSD currently, as it is
> distinguished from EAI_NONAME ("Name or service not known").  But it
> implies that there is no address in any family.  (I showed an example
> from ping6 above, but it turns out to be atypical.)

> - The author of the musl C library for Linux plans to use EAI_NODATA as
> well, but with a different error message.

> - Linux also uses EAI_ADDRFAMILY, but only when a numeric address is in the
> wrong family, e.g. telnet -6 127.0.0.1.

> - POSIX, like the latest RFC, does not define EAI_NODATA or EAI_ADDRFAMILY.

> - There were no other opinions expressed.

> I see two choices for FreeBSD when there is no address in the requested
> family.  One is to use EAI_NODATA, probably using a modified error message.
> This has the main disadvantage that we have several NLS translations.  Also,
> it is different than Linux.

> The other choice is to use EAI_ADDRFAMILY ("Address family for hostname
> not supported") as originally proposed.  The existing error message seems
> reasonable for this case.

> Any comments or votes?  I am inclined to use EAI_ADDRFAMILY as originally
> proposed.

I put up a review, https://reviews.freebsd.org/D37139, with these changes.
The changes should be submitted as several commits, as indicated in the
review.
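
For anyone who wants to see how the distinction looks from the caller's
side, here is a minimal sketch in Python, whose socket.getaddrinfo() wraps
the libc call and reports the EAI_* code in the exception.  The helper
names classify_gai_error() and lookup() are hypothetical, made up for this
illustration, and which code a given lookup actually returns depends on
the platform's libc.

```python
import socket

# EAI_NODATA and EAI_ADDRFAMILY are optional: a platform whose <netdb.h> does
# not define them will not expose them here, hence the getattr() fallback.
EAI_NODATA = getattr(socket, "EAI_NODATA", None)
EAI_ADDRFAMILY = getattr(socket, "EAI_ADDRFAMILY", None)


def classify_gai_error(code):
    """Hypothetical helper: sort a getaddrinfo() error code into a category."""
    if code == socket.EAI_NONAME:
        return "name does not exist"
    if code is not None and code in (EAI_NODATA, EAI_ADDRFAMILY):
        return "name exists, no address in requested family"
    return "other failure"


def lookup(host, family):
    """Return (addresses, category); category is None when the lookup succeeds."""
    try:
        infos = socket.getaddrinfo(host, None, family)
    except socket.gaierror as exc:
        # gaierror carries the EAI_* code as its errno attribute.
        return [], classify_gai_error(exc.errno)
    # Each entry is (family, type, proto, canonname, sockaddr); keep the address.
    return sorted({info[4][0] for info in infos}), None
```

A caller would use lookup("redrock", socket.AF_INET6) for the case in the
original report.  With the current FreeBSD behavior described above, that
lookup lands in the "name does not exist" category, which is exactly the
ambiguity the review is meant to remove.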

Mike



Re: finding optimal ipfw strategy

2022-10-26 Thread Victor Gamov

Hi All

Reviving this thread after a few years :-)


I now have the following HW/SW setup:
- FreeBSD 12.3-STABLE r372089 GENERIC amd64
- Xeon(R) CPU E5-2470 v2 @ 2.40GHz
- ix0 hardware
- about 10-15 vlans like "vlan: 100 vlanpcp: 0 parent interface: ix0"
- and all vlans are bridged via bridgeX
- about 200 multicast streams (200K packets / 2G multicast traffic
  incoming via one vlan100)

- ipfw to allow/deny incoming/outgoing traffic on any vlanX:
  -- net.link.bridge.ipfw=1 -- to enable layer2 filtering (ARP)
  -- net.link.bridge.ipfw_arp=1 -- to filter ARP
  -- net.link.bridge.pfil_bridge=0 -- no filtering on bridgeX
  -- net.link.bridge.pfil_member=1 -- to filter in/out on bridged vlans
  -- net.link.ether.ipfw=1 -- to filter inter-vlan non-IP packets like
     STP/CDP/etc



The ipfw strategy was optimized based on the earlier messages in this
thread, but it is still a work in progress :-)

Some things are working fine, but some I still can't figure out.

As documented in ipfw(8) (section "PACKET FLOW"), [bdg_forward] sits at
the lower layer. Does that mean ALL packets are bridged unconditionally,
and I can't drop undesired incoming packets based on the incoming vlan
(e.g. "deny ip from any to any in recv vlanX") _before_ they are bridged?


Then, once a packet is bridged, is it copied to all bridged vlans?

And are undesired outgoing packets dropped only in [ip_output]?


Is it possible to drop incoming packets _before_ they are bridged?

--
CU,
Victor Gamov



Re: netstat -I ix0

2022-10-26 Thread Victor Gamov
I checked the patch, and it looks like the huge error counter is due to
the heavy UDP traffic in my case.  I'll try to upgrade and check it.


Thanks, everyone!


On 26.10.2022 14:34, Franco Fichtner wrote:
>> On 26. Oct 2022, at 13:11, Cristian Cardoso wrote:
>>
>> Despite the error increments in the interface, I used the equipment for a month
>> until I migrated my things from the router I had and even with a large error
>> increment I didn't have any problems with packet loss or routing.
>
> The cards exhibiting this input error are counting (valid) zero-checksum
> UDP packets as input errors despite handling them just fine. This is a
> documented errata.
>
> I think the whole "problem" started when Intel opted to push all
> interface errors into the input error counter in FreeBSD 13 which
> surfaced this non-issue.


--
CU,
Victor Gamov



Re: Too aggressive TCP ACKs

2022-10-26 Thread tuexen
> On 26. Oct 2022, at 14:59, Tom Jones  wrote:
> 
> On Wed, Oct 26, 2022 at 02:55:21PM +0200, tue...@freebsd.org wrote:
>>> On 26. Oct 2022, at 10:57, Tom Jones  wrote:
>>> 
>>> On Sat, Oct 22, 2022 at 12:14:25PM +0200, Hans Petter Selasky wrote:
>>>> Hi,
>>>>
>>>> Some thoughts about this topic.
>>>>
>>>> Delaying ACKs means loss of performance when using Gigabit TCP
>>>> connections in data centers. There it is important to ACK the data as
>>>> quick as possible, to avoid running out of TCP window space. Thinking
>>>> about TCP connections at 30 GBit/s and above!
>>>>
>>>> I think the implementation should be exactly like it is.
>>>>
>>>> There is a software LRO in FreeBSD to coalesce the ACKs before they hit
>>>> the network stack, so there are no real problems there.
>>>>
>>> 
>>> Changing the ACK ratio seems to be okay in most cases, a paper I wrote
>>> about this was published this week:
>>> 
>>> https://onlinelibrary.wiley.com/doi/10.1002/sat.1466
>>> 
>>> It focuses on QUIC, but congestion control dynamics don't change with
>>> the protocol. You should be able to read there, but if not I'm happy to
>>> send anyone a pdf.
>> Is QUIC using an L=2 for ABC?
> 
> I think that is the rfc recommendation, actual deployed reality is more
> scattershot.
Wouldn't that be relevant? If you get an ack for, let's say 8 packets, you would
only increment (in slow start) the cwnd by 2 packets, not 8?
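
To make the arithmetic concrete, here is a minimal sketch of the slow-start
rule from Appropriate Byte Counting (RFC 3465), where each ACK may grow
cwnd by at most L * SMSS bytes.  The function name and the 1460-byte SMSS
are assumptions chosen for illustration; this is not code from any stack.

```python
def slow_start_increase(bytes_acked, smss, limit_l=2):
    """Bytes added to cwnd for one ACK in slow start under ABC (RFC 3465)."""
    # ABC counts the bytes newly acknowledged, capped at L * SMSS per ACK.
    return min(bytes_acked, limit_l * smss)


SMSS = 1460          # assumed sender maximum segment size, in bytes
acked = 8 * SMSS     # one ACK covering 8 full-sized packets

print(slow_start_increase(acked, SMSS) // SMSS)             # prints 2 (L=2)
print(slow_start_increase(acked, SMSS, limit_l=8) // SMSS)  # prints 8 (L=8)
```

So with L=2 a receiver that ACKs every 8th packet makes the sender grow
cwnd by 2 packets per ACK instead of 8, which is the effect I am asking
about.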

Best regards
Michael
> 
> - Tom
> 




Re: Too aggressive TCP ACKs

2022-10-26 Thread Tom Jones
On Wed, Oct 26, 2022 at 02:55:21PM +0200, tue...@freebsd.org wrote:
> > On 26. Oct 2022, at 10:57, Tom Jones  wrote:
> > 
> > On Sat, Oct 22, 2022 at 12:14:25PM +0200, Hans Petter Selasky wrote:
> >> Hi,
> >> 
> >> Some thoughts about this topic.
> >> 
> >> Delaying ACKs means loss of performance when using Gigabit TCP 
> >> connections in data centers. There it is important to ACK the data as 
> >> quick as possible, to avoid running out of TCP window space. Thinking 
> >> about TCP connections at 30 GBit/s and above!
> >> 
> >> I think the implementation should be exactly like it is.
> >> 
> >> There is a software LRO in FreeBSD to coalesce the ACKs before they hit 
> >> the network stack, so there are no real problems there.
> >> 
> > 
> > Changing the ACK ratio seems to be okay in most cases, a paper I wrote
> > about this was published this week:
> > 
> > https://onlinelibrary.wiley.com/doi/10.1002/sat.1466
> > 
> > It focuses on QUIC, but congestion control dynamics don't change with
> > the protocol. You should be able to read there, but if not I'm happy to
> > send anyone a pdf.
> Is QUIC using an L=2 for ABC?

I think that is the rfc recommendation, actual deployed reality is more
scattershot.

- Tom



Re: Too aggressive TCP ACKs

2022-10-26 Thread tuexen
> On 26. Oct 2022, at 10:57, Tom Jones  wrote:
> 
> On Sat, Oct 22, 2022 at 12:14:25PM +0200, Hans Petter Selasky wrote:
>> Hi,
>> 
>> Some thoughts about this topic.
>> 
>> Delaying ACKs means loss of performance when using Gigabit TCP 
>> connections in data centers. There it is important to ACK the data as 
>> quick as possible, to avoid running out of TCP window space. Thinking 
>> about TCP connections at 30 GBit/s and above!
>> 
>> I think the implementation should be exactly like it is.
>> 
>> There is a software LRO in FreeBSD to coalesce the ACKs before they hit 
>> the network stack, so there are no real problems there.
>> 
> 
> Changing the ACK ratio seems to be okay in most cases, a paper I wrote
> about this was published this week:
> 
> https://onlinelibrary.wiley.com/doi/10.1002/sat.1466
> 
> It focuses on QUIC, but congestion control dynamics don't change with
> the protocol. You should be able to read there, but if not I'm happy to
> send anyone a pdf.
Is QUIC using an L=2 for ABC?

Best regards
Michael
> 
> - Tom




Re: netstat -I ix0

2022-10-26 Thread Franco Fichtner


> On 26. Oct 2022, at 13:11, Cristian Cardoso wrote:
> 
> Despite the error increments in the interface, I used the equipment for a 
> month until I migrated my things from the router I had and even with a large 
> error increment I didn't have any problems with packet loss or routing.

The cards exhibiting this input error are counting (valid) zero-checksum
UDP packets as input errors despite handling them just fine. This is a
documented errata.

I think the whole "problem" started when Intel opted to push all
interface errors into the input error counter in FreeBSD 13 which
surfaced this non-issue.


Cheers,
Franco


Re: netstat -I ix0

2022-10-26 Thread Cristian Cardoso
Hi

Despite the error increments in the interface, I used the equipment for a
month until I migrated my things from the router I had and even with a
large error increment I didn't have any problems with packet loss or
routing.

On Tue, Oct 25, 2022 at 17:58, Victor Gamov wrote:

> So, it's not a problem but counter only bug?
>
>
> On 24.10.2022 23:29, Cristian Cardoso wrote:
> > Hi
> >
> > After I upgraded my hardware from 12.2 to 12.3-RELEASE, I also started
> > getting error alerts on my FreeBSD monitors via snmp.
> > People asked me to upgrade the version to 13, I did but the problem
> > persisted, I changed SFP modules and fiber cable, even so the error
> > persisted.
> > This was a system change that created more counters from the
> > implementation of a new revision in FreeBSD.
> >
> > I reported my bug at the time:
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=262093
> >
> >  From what you could see in the url that Konrad passed, a fix is already
> > being tested.
> >
> > On Mon, Oct 24, 2022 at 17:21, Konrad Kręciwilk
> > <konrad.kreciw...@korbank.pl> wrote:
> >
> > Hello,
> >
> > It seems your issue is related to
> > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266048
> > Regards,
> > KK
> >
> > On 2022-10-24 17:59, Victor Gamov wrote:
> >  > Hi All
> >  >
> >  > On my FreeBSD 12.3-STABLE r372089 GENERIC amd64 I have a huge amount
> >  > of RX errors on ix0:
> >  >
> >  > =
> >  > # netstat -n -I ix0
> >  > Name   Mtu Network   Address                   Ipkts        Ierrs Idrop         Opkts Oerrs  Coll
> >  > ix0   1500           a0:36:9f:1d:80:80 2336589698003 774784702245     0 1816532334151     0     0
> >  > =
> >  >
> >  > =
> >  > # sysctl dev.ix.0 | grep errs
> >  > dev.ix.0.mac_stats.checksum_errs: 774729511713
> >  > dev.ix.0.mac_stats.rec_len_errs: 31682
> >  > dev.ix.0.mac_stats.byte_errs: 14414734
> >  > dev.ix.0.mac_stats.ill_errs: 3034419
> >  > dev.ix.0.mac_stats.crc_errs: 54705100
> >  > dev.ix.0.mac_stats.rx_errs: 774787585085
> >  > =
> >  >
> >  >
> >  > Are these errors phy/hw problems, e.g. wrong SFP+ hardware/firmware,
> >  > too high / too low RX levels, broken patch cord?  Or may it be some
> >  > OS errors?
> >  >
> >  >
> >  > Thanks for any advice!
>
>
> --
> CU,
> Victor Gamov
>
>


Re: Too aggressive TCP ACKs

2022-10-26 Thread Hans Petter Selasky

On 10/26/22 10:57, Tom Jones wrote:

> It focuses on QUIC, but congestion control dynamics don't change with
> the protocol. You should be able to read there, but if not I'm happy to
> send anyone a pdf.


If QUIC doesn't support TSO (TCP Segmentation Offload, also known as
Large Send Offload), it cannot be compared, I think.


--HPS



Re: Too aggressive TCP ACKs

2022-10-26 Thread Tom Jones
On Sat, Oct 22, 2022 at 12:14:25PM +0200, Hans Petter Selasky wrote:
> Hi,
> 
> Some thoughts about this topic.
> 
> Delaying ACKs means loss of performance when using Gigabit TCP 
> connections in data centers. There it is important to ACK the data as 
> quick as possible, to avoid running out of TCP window space. Thinking 
> about TCP connections at 30 GBit/s and above!
> 
> I think the implementation should be exactly like it is.
> 
> There is a software LRO in FreeBSD to coalesce the ACKs before they hit 
> the network stack, so there are no real problems there.
> 

Changing the ACK ratio seems to be okay in most cases; a paper I wrote
about this was published this week:

https://onlinelibrary.wiley.com/doi/10.1002/sat.1466

It focuses on QUIC, but congestion control dynamics don't change with
the protocol. You should be able to read it there, but if not I'm happy to
send anyone a pdf.

- Tom