Re: WARNING in ip_recv_error

2018-05-24 Thread Paolo Abeni
On Wed, 2018-05-23 at 11:40 -0400, Willem de Bruijn wrote:
> On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
>  wrote:
> > On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
> >  wrote:
> > > On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
> > >  wrote:
> > > > On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
> > > >  wrote:
> > > > > On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
> > > > >  wrote:
> > > > > > On Fri, May 18, 2018 at 11:44 AM, David Miller 
> > > > > >  wrote:
> > > > > > > From: Eric Dumazet 
> > > > > > > Date: Fri, 18 May 2018 08:30:43 -0700
> > > > > > > 
> > > > > > > > We probably need to revert Willem patch 
> > > > > > > > (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
> > > > > > > 
> > > > > > > Is it really valid to reach ip_recv_err with an ipv6 socket?
> > > > > > 
> > > > > > I guess the issue is that setsockopt IPV6_ADDRFORM is not an
> > > > > > atomic operation, so that the socket is neither fully ipv4 nor fully
> > > > > > ipv6 by the time it reaches ip_recv_error.
> > > > > > 
> > > > > >   sk->sk_socket->ops = _dgram_ops;
> > > > > >   < HERE >
> > > > > >   sk->sk_family = PF_INET;
> > > > > > 
> > > > > > Even calling inet_recv_error to demux would not necessarily help.
> > > > > > 
> > > > > > Safest would be to look up by skb->protocol, similar to what
> > > > > > ipv6_recv_error does to handle v4-mapped-v6.
> > > > > > 
> > > > > > Or to make that function safe with PF_INET and swap the order
> > > > > > of the above two operations.
> > > > > > 
> > > > > > All sound needlessly complicated for this rare socket option, but
> > > > > > I don't have a better idea yet. Dropping on the floor is not nice,
> > > > > > either.
> > > > > 
> > > > > Ensuring that ip_recv_error correctly handles packets from either
> > > > > socket and removing the warning should indeed be good.
> > > > > 
> > > > > It is robust against v4-mapped packets from an AF_INET6 socket,
> > > > > but see caveat on reconnect below.
> > > > > 
> > > > > The code between ipv6_recv_error for v4-mapped addresses and
> > > > > ip_recv_error is essentially the same, the main difference being
> > > > > whether to return network headers as sockaddr_in with SOL_IP
> > > > > or sockaddr_in6 with SOL_IPV6.
> > > > > 
> > > > > There are very few other locations in the stack that explicitly test
> > > > > sk_family in this way and thus would be vulnerable to races with
> > > > > IPV6_ADDRFORM.
> > > > > 
> > > > > I'm not sure whether it is possible for a udpv6 socket to queue a
> > > > > real ipv6 packet on the error queue, disconnect, connect to an
> > > > > ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
> > > > > on a true ipv6 packet. That would return buggy data, e.g., in
> > > > > msg_name.
> > > > 
> > > > In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
> > > > error queue is empty, and then take its lock for the duration of the
> > > > operation.
> > > 
> > > Actually, no reason to hold the lock. This setsockopt holds the socket
> > > lock, which connect would need, too. So testing that the queue
> > > is empty after testing that it is connected to a v4 address is
> > > sufficient to ensure that no ipv6 packets are queued for reception.
> > > 
> > > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > > index 4d780c7f0130..a975d6311341 100644
> > > --- a/net/ipv6/ipv6_sockglue.c
> > > +++ b/net/ipv6/ipv6_sockglue.c
> > > @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
> > > int level, int optname,
> > > 
> > > if (ipv6_only_sock(sk) ||
> > > !ipv6_addr_v4mapped(>sk_v6_daddr)) {
> > > retv = -EADDRNOTAVAIL;
> > > break;
> > > }
> > > 
> > > +   if (!skb_queue_empty(>sk_error_queue)) {
> > > +   retv = -EBUSY;
> > > +   break;
> > > +   }
> > > +
> > > fl6_free_socklist(sk);
> > > __ipv6_sock_mc_close(sk);
> > > 
> > > After this it should be safe to remove the warning in ip_recv_error.
> > 
> > Hmm.. nope.
> > 
> > This ensures that the socket cannot produce any new true v6 packets.
> > But it does not guarantee that they are not already in the system, e.g.
> > queued in tc, and will find their way to the error queue later.
> > 
> > We'll have to just be able to handle ipv6 packets in ip_recv_error.
> > Since IPV6_ADDRFORM is used to pass to legacy v4-only
> > processes and those likely are only confused by SOL_IPV6
> > error messages, it is probably best to just drop them and perhaps
> > WARN_ONCE.
> 
> Even more fun, this is not limited to the error queue.
> 

Re: WARNING in ip_recv_error

2018-05-24 Thread Paolo Abeni
On Wed, 2018-05-23 at 11:40 -0400, Willem de Bruijn wrote:
> On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
>  wrote:
> > On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
> >  wrote:
> > > On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
> > >  wrote:
> > > > On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
> > > >  wrote:
> > > > > On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
> > > > >  wrote:
> > > > > > On Fri, May 18, 2018 at 11:44 AM, David Miller 
> > > > > >  wrote:
> > > > > > > From: Eric Dumazet 
> > > > > > > Date: Fri, 18 May 2018 08:30:43 -0700
> > > > > > > 
> > > > > > > > We probably need to revert Willem patch 
> > > > > > > > (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
> > > > > > > 
> > > > > > > Is it really valid to reach ip_recv_err with an ipv6 socket?
> > > > > > 
> > > > > > I guess the issue is that setsockopt IPV6_ADDRFORM is not an
> > > > > > atomic operation, so that the socket is neither fully ipv4 nor fully
> > > > > > ipv6 by the time it reaches ip_recv_error.
> > > > > > 
> > > > > >   sk->sk_socket->ops = _dgram_ops;
> > > > > >   < HERE >
> > > > > >   sk->sk_family = PF_INET;
> > > > > > 
> > > > > > Even calling inet_recv_error to demux would not necessarily help.
> > > > > > 
> > > > > > Safest would be to look up by skb->protocol, similar to what
> > > > > > ipv6_recv_error does to handle v4-mapped-v6.
> > > > > > 
> > > > > > Or to make that function safe with PF_INET and swap the order
> > > > > > of the above two operations.
> > > > > > 
> > > > > > All sound needlessly complicated for this rare socket option, but
> > > > > > I don't have a better idea yet. Dropping on the floor is not nice,
> > > > > > either.
> > > > > 
> > > > > Ensuring that ip_recv_error correctly handles packets from either
> > > > > socket and removing the warning should indeed be good.
> > > > > 
> > > > > It is robust against v4-mapped packets from an AF_INET6 socket,
> > > > > but see caveat on reconnect below.
> > > > > 
> > > > > The code between ipv6_recv_error for v4-mapped addresses and
> > > > > ip_recv_error is essentially the same, the main difference being
> > > > > whether to return network headers as sockaddr_in with SOL_IP
> > > > > or sockaddr_in6 with SOL_IPV6.
> > > > > 
> > > > > There are very few other locations in the stack that explicitly test
> > > > > sk_family in this way and thus would be vulnerable to races with
> > > > > IPV6_ADDRFORM.
> > > > > 
> > > > > I'm not sure whether it is possible for a udpv6 socket to queue a
> > > > > real ipv6 packet on the error queue, disconnect, connect to an
> > > > > ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
> > > > > on a true ipv6 packet. That would return buggy data, e.g., in
> > > > > msg_name.
> > > > 
> > > > In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
> > > > error queue is empty, and then take its lock for the duration of the
> > > > operation.
> > > 
> > > Actually, no reason to hold the lock. This setsockopt holds the socket
> > > lock, which connect would need, too. So testing that the queue
> > > is empty after testing that it is connected to a v4 address is
> > > sufficient to ensure that no ipv6 packets are queued for reception.
> > > 
> > > diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> > > index 4d780c7f0130..a975d6311341 100644
> > > --- a/net/ipv6/ipv6_sockglue.c
> > > +++ b/net/ipv6/ipv6_sockglue.c
> > > @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
> > > int level, int optname,
> > > 
> > > if (ipv6_only_sock(sk) ||
> > > !ipv6_addr_v4mapped(>sk_v6_daddr)) {
> > > retv = -EADDRNOTAVAIL;
> > > break;
> > > }
> > > 
> > > +   if (!skb_queue_empty(>sk_error_queue)) {
> > > +   retv = -EBUSY;
> > > +   break;
> > > +   }
> > > +
> > > fl6_free_socklist(sk);
> > > __ipv6_sock_mc_close(sk);
> > > 
> > > After this it should be safe to remove the warning in ip_recv_error.
> > 
> > Hmm.. nope.
> > 
> > This ensures that the socket cannot produce any new true v6 packets.
> > But it does not guarantee that they are not already in the system, e.g.
> > queued in tc, and will find their way to the error queue later.
> > 
> > We'll have to just be able to handle ipv6 packets in ip_recv_error.
> > Since IPV6_ADDRFORM is used to pass to legacy v4-only
> > processes and those likely are only confused by SOL_IPV6
> > error messages, it is probably best to just drop them and perhaps
> > WARN_ONCE.
> 
> Even more fun, this is not limited to the error queue.
> 
> I can queue a v6 packet for reception on a socket, connect to a v4
> address, call IPV6_ADDRFORM and then a regular recvfrom will
> return a partial v6 address as AF_INET.
> 
> We definitely do not want to 

Re: WARNING in ip_recv_error

2018-05-23 Thread Willem de Bruijn
On Wed, May 23, 2018 at 11:40 AM, Willem de Bruijn
 wrote:
> On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>>>  wrote:
 On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
  wrote:
> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 11:44 AM, David Miller  
>> wrote:
>>> From: Eric Dumazet 
>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>
 We probably need to revert Willem patch 
 (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>
>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>
>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>> atomic operation, so that the socket is neither fully ipv4 nor fully
>> ipv6 by the time it reaches ip_recv_error.
>>
>>   sk->sk_socket->ops = _dgram_ops;
>>   < HERE >
>>   sk->sk_family = PF_INET;
>>
>> Even calling inet_recv_error to demux would not necessarily help.
>>
>> Safest would be to look up by skb->protocol, similar to what
>> ipv6_recv_error does to handle v4-mapped-v6.
>>
>> Or to make that function safe with PF_INET and swap the order
>> of the above two operations.
>>
>> All sound needlessly complicated for this rare socket option, but
>> I don't have a better idea yet. Dropping on the floor is not nice,
>> either.
>
> Ensuring that ip_recv_error correctly handles packets from either
> socket and removing the warning should indeed be good.
>
> It is robust against v4-mapped packets from an AF_INET6 socket,
> but see caveat on reconnect below.
>
> The code between ipv6_recv_error for v4-mapped addresses and
> ip_recv_error is essentially the same, the main difference being
> whether to return network headers as sockaddr_in with SOL_IP
> or sockaddr_in6 with SOL_IPV6.
>
> There are very few other locations in the stack that explicitly test
> sk_family in this way and thus would be vulnerable to races with
> IPV6_ADDRFORM.
>
> I'm not sure whether it is possible for a udpv6 socket to queue a
> real ipv6 packet on the error queue, disconnect, connect to an
> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
> on a true ipv6 packet. That would return buggy data, e.g., in
> msg_name.

 In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
 error queue is empty, and then take its lock for the duration of the
 operation.
>>>
>>> Actually, no reason to hold the lock. This setsockopt holds the socket
>>> lock, which connect would need, too. So testing that the queue
>>> is empty after testing that it is connected to a v4 address is
>>> sufficient to ensure that no ipv6 packets are queued for reception.
>>>
>>> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
>>> index 4d780c7f0130..a975d6311341 100644
>>> --- a/net/ipv6/ipv6_sockglue.c
>>> +++ b/net/ipv6/ipv6_sockglue.c
>>> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
>>> int level, int optname,
>>>
>>> if (ipv6_only_sock(sk) ||
>>> !ipv6_addr_v4mapped(>sk_v6_daddr)) {
>>> retv = -EADDRNOTAVAIL;
>>> break;
>>> }
>>>
>>> +   if (!skb_queue_empty(>sk_error_queue)) {
>>> +   retv = -EBUSY;
>>> +   break;
>>> +   }
>>> +
>>> fl6_free_socklist(sk);
>>> __ipv6_sock_mc_close(sk);
>>>
>>> After this it should be safe to remove the warning in ip_recv_error.
>>
>> Hmm.. nope.
>>
>> This ensures that the socket cannot produce any new true v6 packets.
>> But it does not guarantee that they are not already in the system, e.g.
>> queued in tc, and will find their way to the error queue later.
>>
>> We'll have to just be able to handle ipv6 packets in ip_recv_error.
>> Since IPV6_ADDRFORM is used to pass to legacy v4-only
>> processes and those likely are only confused by SOL_IPV6
>> error messages, it is probably best to just drop them and perhaps
>> WARN_ONCE.
>
> Even more fun, this is not limited to the error queue.
>
> I can queue a v6 packet for reception on a socket, connect to a v4
> address, call IPV6_ADDRFORM and then a regular recvfrom will
> return a partial v6 address as AF_INET.
>
> We definitely do not want to have to add a check
>
>   if (skb->protocol == htons(ETH_P_IPV6)) {
> kfree_skb(skb);
> goto try_again;
>   }
>

Re: WARNING in ip_recv_error

2018-05-23 Thread Willem de Bruijn
On Wed, May 23, 2018 at 11:40 AM, Willem de Bruijn
 wrote:
> On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>>>  wrote:
 On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
  wrote:
> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 11:44 AM, David Miller  
>> wrote:
>>> From: Eric Dumazet 
>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>
 We probably need to revert Willem patch 
 (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>
>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>
>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>> atomic operation, so that the socket is neither fully ipv4 nor fully
>> ipv6 by the time it reaches ip_recv_error.
>>
>>   sk->sk_socket->ops = _dgram_ops;
>>   < HERE >
>>   sk->sk_family = PF_INET;
>>
>> Even calling inet_recv_error to demux would not necessarily help.
>>
>> Safest would be to look up by skb->protocol, similar to what
>> ipv6_recv_error does to handle v4-mapped-v6.
>>
>> Or to make that function safe with PF_INET and swap the order
>> of the above two operations.
>>
>> All sound needlessly complicated for this rare socket option, but
>> I don't have a better idea yet. Dropping on the floor is not nice,
>> either.
>
> Ensuring that ip_recv_error correctly handles packets from either
> socket and removing the warning should indeed be good.
>
> It is robust against v4-mapped packets from an AF_INET6 socket,
> but see caveat on reconnect below.
>
> The code between ipv6_recv_error for v4-mapped addresses and
> ip_recv_error is essentially the same, the main difference being
> whether to return network headers as sockaddr_in with SOL_IP
> or sockaddr_in6 with SOL_IPV6.
>
> There are very few other locations in the stack that explicitly test
> sk_family in this way and thus would be vulnerable to races with
> IPV6_ADDRFORM.
>
> I'm not sure whether it is possible for a udpv6 socket to queue a
> real ipv6 packet on the error queue, disconnect, connect to an
> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
> on a true ipv6 packet. That would return buggy data, e.g., in
> msg_name.

 In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
 error queue is empty, and then take its lock for the duration of the
 operation.
>>>
>>> Actually, no reason to hold the lock. This setsockopt holds the socket
>>> lock, which connect would need, too. So testing that the queue
>>> is empty after testing that it is connected to a v4 address is
>>> sufficient to ensure that no ipv6 packets are queued for reception.
>>>
>>> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
>>> index 4d780c7f0130..a975d6311341 100644
>>> --- a/net/ipv6/ipv6_sockglue.c
>>> +++ b/net/ipv6/ipv6_sockglue.c
>>> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
>>> int level, int optname,
>>>
>>> if (ipv6_only_sock(sk) ||
>>> !ipv6_addr_v4mapped(>sk_v6_daddr)) {
>>> retv = -EADDRNOTAVAIL;
>>> break;
>>> }
>>>
>>> +   if (!skb_queue_empty(>sk_error_queue)) {
>>> +   retv = -EBUSY;
>>> +   break;
>>> +   }
>>> +
>>> fl6_free_socklist(sk);
>>> __ipv6_sock_mc_close(sk);
>>>
>>> After this it should be safe to remove the warning in ip_recv_error.
>>
>> Hmm.. nope.
>>
>> This ensures that the socket cannot produce any new true v6 packets.
>> But it does not guarantee that they are not already in the system, e.g.
>> queued in tc, and will find their way to the error queue later.
>>
>> We'll have to just be able to handle ipv6 packets in ip_recv_error.
>> Since IPV6_ADDRFORM is used to pass to legacy v4-only
>> processes and those likely are only confused by SOL_IPV6
>> error messages, it is probably best to just drop them and perhaps
>> WARN_ONCE.
>
> Even more fun, this is not limited to the error queue.
>
> I can queue a v6 packet for reception on a socket, connect to a v4
> address, call IPV6_ADDRFORM and then a regular recvfrom will
> return a partial v6 address as AF_INET.
>
> We definitely do not want to have to add a check
>
>   if (skb->protocol == htons(ETH_P_IPV6)) {
> kfree_skb(skb);
> goto try_again;
>   }
>
> to the normal recvmsg path.
>
> An alternative may be to tighten the check on when to allow
> IPV6_ADDRFORM. Not only return EBUSY if a packet is pending,
> but also if any sk_{rmem, omem, wmem}_alloc is non-zero. Only,
> these tightened 

Re: WARNING in ip_recv_error

2018-05-23 Thread Willem de Bruijn
On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>>>  wrote:
 On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
  wrote:
> On Fri, May 18, 2018 at 11:44 AM, David Miller  
> wrote:
>> From: Eric Dumazet 
>> Date: Fri, 18 May 2018 08:30:43 -0700
>>
>>> We probably need to revert Willem patch 
>>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>
>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>
> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
> atomic operation, so that the socket is neither fully ipv4 nor fully
> ipv6 by the time it reaches ip_recv_error.
>
>   sk->sk_socket->ops = _dgram_ops;
>   < HERE >
>   sk->sk_family = PF_INET;
>
> Even calling inet_recv_error to demux would not necessarily help.
>
> Safest would be to look up by skb->protocol, similar to what
> ipv6_recv_error does to handle v4-mapped-v6.
>
> Or to make that function safe with PF_INET and swap the order
> of the above two operations.
>
> All sound needlessly complicated for this rare socket option, but
> I don't have a better idea yet. Dropping on the floor is not nice,
> either.

 Ensuring that ip_recv_error correctly handles packets from either
 socket and removing the warning should indeed be good.

 It is robust against v4-mapped packets from an AF_INET6 socket,
 but see caveat on reconnect below.

 The code between ipv6_recv_error for v4-mapped addresses and
 ip_recv_error is essentially the same, the main difference being
 whether to return network headers as sockaddr_in with SOL_IP
 or sockaddr_in6 with SOL_IPV6.

 There are very few other locations in the stack that explicitly test
 sk_family in this way and thus would be vulnerable to races with
 IPV6_ADDRFORM.

 I'm not sure whether it is possible for a udpv6 socket to queue a
 real ipv6 packet on the error queue, disconnect, connect to an
 ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
 on a true ipv6 packet. That would return buggy data, e.g., in
 msg_name.
>>>
>>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>>> error queue is empty, and then take its lock for the duration of the
>>> operation.
>>
>> Actually, no reason to hold the lock. This setsockopt holds the socket
>> lock, which connect would need, too. So testing that the queue
>> is empty after testing that it is connected to a v4 address is
>> sufficient to ensure that no ipv6 packets are queued for reception.
>>
>> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
>> index 4d780c7f0130..a975d6311341 100644
>> --- a/net/ipv6/ipv6_sockglue.c
>> +++ b/net/ipv6/ipv6_sockglue.c
>> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
>> int level, int optname,
>>
>> if (ipv6_only_sock(sk) ||
>> !ipv6_addr_v4mapped(>sk_v6_daddr)) {
>> retv = -EADDRNOTAVAIL;
>> break;
>> }
>>
>> +   if (!skb_queue_empty(>sk_error_queue)) {
>> +   retv = -EBUSY;
>> +   break;
>> +   }
>> +
>> fl6_free_socklist(sk);
>> __ipv6_sock_mc_close(sk);
>>
>> After this it should be safe to remove the warning in ip_recv_error.
>
> Hmm.. nope.
>
> This ensures that the socket cannot produce any new true v6 packets.
> But it does not guarantee that they are not already in the system, e.g.
> queued in tc, and will find their way to the error queue later.
>
> We'll have to just be able to handle ipv6 packets in ip_recv_error.
> Since IPV6_ADDRFORM is used to pass to legacy v4-only
> processes and those likely are only confused by SOL_IPV6
> error messages, it is probably best to just drop them and perhaps
> WARN_ONCE.

Even more fun, this is not limited to the error queue.

I can queue a v6 packet for reception on a socket, connect to a v4
address, call IPV6_ADDRFORM and then a regular recvfrom will
return a partial v6 address as AF_INET.

We definitely do not want to have to add a check

  if (skb->protocol == htons(ETH_P_IPV6)) {
kfree_skb(skb);
goto try_again;
  }

to the normal recvmsg path.

An alternative may be to tighten the check on when to allow
IPV6_ADDRFORM. Not only return EBUSY if a packet is pending,
but also if any sk_{rmem, omem, wmem}_alloc is non-zero. Only,
these 

Re: WARNING in ip_recv_error

2018-05-23 Thread Willem de Bruijn
On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>>>  wrote:
 On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
  wrote:
> On Fri, May 18, 2018 at 11:44 AM, David Miller  
> wrote:
>> From: Eric Dumazet 
>> Date: Fri, 18 May 2018 08:30:43 -0700
>>
>>> We probably need to revert Willem patch 
>>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>
>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>
> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
> atomic operation, so that the socket is neither fully ipv4 nor fully
> ipv6 by the time it reaches ip_recv_error.
>
>   sk->sk_socket->ops = _dgram_ops;
>   < HERE >
>   sk->sk_family = PF_INET;
>
> Even calling inet_recv_error to demux would not necessarily help.
>
> Safest would be to look up by skb->protocol, similar to what
> ipv6_recv_error does to handle v4-mapped-v6.
>
> Or to make that function safe with PF_INET and swap the order
> of the above two operations.
>
> All sound needlessly complicated for this rare socket option, but
> I don't have a better idea yet. Dropping on the floor is not nice,
> either.

 Ensuring that ip_recv_error correctly handles packets from either
 socket and removing the warning should indeed be good.

 It is robust against v4-mapped packets from an AF_INET6 socket,
 but see caveat on reconnect below.

 The code between ipv6_recv_error for v4-mapped addresses and
 ip_recv_error is essentially the same, the main difference being
 whether to return network headers as sockaddr_in with SOL_IP
 or sockaddr_in6 with SOL_IPV6.

 There are very few other locations in the stack that explicitly test
 sk_family in this way and thus would be vulnerable to races with
 IPV6_ADDRFORM.

 I'm not sure whether it is possible for a udpv6 socket to queue a
 real ipv6 packet on the error queue, disconnect, connect to an
 ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
 on a true ipv6 packet. That would return buggy data, e.g., in
 msg_name.
>>>
>>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>>> error queue is empty, and then take its lock for the duration of the
>>> operation.
>>
>> Actually, no reason to hold the lock. This setsockopt holds the socket
>> lock, which connect would need, too. So testing that the queue
>> is empty after testing that it is connected to a v4 address is
>> sufficient to ensure that no ipv6 packets are queued for reception.
>>
>> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
>> index 4d780c7f0130..a975d6311341 100644
>> --- a/net/ipv6/ipv6_sockglue.c
>> +++ b/net/ipv6/ipv6_sockglue.c
>> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
>> int level, int optname,
>>
>> if (ipv6_only_sock(sk) ||
>> !ipv6_addr_v4mapped(>sk_v6_daddr)) {
>> retv = -EADDRNOTAVAIL;
>> break;
>> }
>>
>> +   if (!skb_queue_empty(>sk_error_queue)) {
>> +   retv = -EBUSY;
>> +   break;
>> +   }
>> +
>> fl6_free_socklist(sk);
>> __ipv6_sock_mc_close(sk);
>>
>> After this it should be safe to remove the warning in ip_recv_error.
>
> Hmm.. nope.
>
> This ensures that the socket cannot produce any new true v6 packets.
> But it does not guarantee that they are not already in the system, e.g.
> queued in tc, and will find their way to the error queue later.
>
> We'll have to just be able to handle ipv6 packets in ip_recv_error.
> Since IPV6_ADDRFORM is used to pass to legacy v4-only
> processes and those likely are only confused by SOL_IPV6
> error messages, it is probably best to just drop them and perhaps
> WARN_ONCE.

Even more fun, this is not limited to the error queue.

I can queue a v6 packet for reception on a socket, connect to a v4
address, call IPV6_ADDRFORM and then a regular recvfrom will
return a partial v6 address as AF_INET.

We definitely do not want to have to add a check

  if (skb->protocol == htons(ETH_P_IPV6)) {
kfree_skb(skb);
goto try_again;
  }

to the normal recvmsg path.

An alternative may be to tighten the check on when to allow
IPV6_ADDRFORM. Not only return EBUSY if a packet is pending,
but also if any sk_{rmem, omem, wmem}_alloc is non-zero. Only,
these tightened constraints could break a legacy application.

Either way, this race is somewhat tangential to the one that
RaceFuzzer found. The sk changes that IPV6_ADDRFORM makes
to sk_prot, sk_socket->ops and 

Re: WARNING in ip_recv_error

2018-05-20 Thread Willem de Bruijn
On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>>  wrote:
 On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
> From: Eric Dumazet 
> Date: Fri, 18 May 2018 08:30:43 -0700
>
>> We probably need to revert Willem patch 
>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>
> Is it really valid to reach ip_recv_err with an ipv6 socket?

 I guess the issue is that setsockopt IPV6_ADDRFORM is not an
 atomic operation, so that the socket is neither fully ipv4 nor fully
 ipv6 by the time it reaches ip_recv_error.

   sk->sk_socket->ops = _dgram_ops;
   < HERE >
   sk->sk_family = PF_INET;

 Even calling inet_recv_error to demux would not necessarily help.

 Safest would be to look up by skb->protocol, similar to what
 ipv6_recv_error does to handle v4-mapped-v6.

 Or to make that function safe with PF_INET and swap the order
 of the above two operations.

 All sound needlessly complicated for this rare socket option, but
 I don't have a better idea yet. Dropping on the floor is not nice,
 either.
>>>
>>> Ensuring that ip_recv_error correctly handles packets from either
>>> socket and removing the warning should indeed be good.
>>>
>>> It is robust against v4-mapped packets from an AF_INET6 socket,
>>> but see caveat on reconnect below.
>>>
>>> The code between ipv6_recv_error for v4-mapped addresses and
>>> ip_recv_error is essentially the same, the main difference being
>>> whether to return network headers as sockaddr_in with SOL_IP
>>> or sockaddr_in6 with SOL_IPV6.
>>>
>>> There are very few other locations in the stack that explicitly test
>>> sk_family in this way and thus would be vulnerable to races with
>>> IPV6_ADDRFORM.
>>>
>>> I'm not sure whether it is possible for a udpv6 socket to queue a
>>> real ipv6 packet on the error queue, disconnect, connect to an
>>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>>> on a true ipv6 packet. That would return buggy data, e.g., in
>>> msg_name.
>>
>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>> error queue is empty, and then take its lock for the duration of the
>> operation.
>
> Actually, no reason to hold the lock. This setsockopt holds the socket
> lock, which connect would need, too. So testing that the queue
> is empty after testing that it is connected to a v4 address is
> sufficient to ensure that no ipv6 packets are queued for reception.
>
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index 4d780c7f0130..a975d6311341 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
> int level, int optname,
>
> if (ipv6_only_sock(sk) ||
> !ipv6_addr_v4mapped(>sk_v6_daddr)) {
> retv = -EADDRNOTAVAIL;
> break;
> }
>
> +   if (!skb_queue_empty(>sk_error_queue)) {
> +   retv = -EBUSY;
> +   break;
> +   }
> +
> fl6_free_socklist(sk);
> __ipv6_sock_mc_close(sk);
>
> After this it should be safe to remove the warning in ip_recv_error.

Hmm.. nope.

This ensures that the socket cannot produce any new true v6 packets.
But it does not guarantee that they are not already in the system, e.g.
queued in tc, and will find their way to the error queue later.

We'll have to just be able to handle ipv6 packets in ip_recv_error.
Since IPV6_ADDRFORM is used to pass to legacy v4-only
processes and those likely are only confused by SOL_IPV6
error messages, it is probably best to just drop them and perhaps
WARN_ONCE.


Re: WARNING in ip_recv_error

2018-05-20 Thread Willem de Bruijn
On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>>  wrote:
 On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
> From: Eric Dumazet 
> Date: Fri, 18 May 2018 08:30:43 -0700
>
>> We probably need to revert Willem patch 
>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>
> Is it really valid to reach ip_recv_err with an ipv6 socket?

 I guess the issue is that setsockopt IPV6_ADDRFORM is not an
 atomic operation, so that the socket is neither fully ipv4 nor fully
 ipv6 by the time it reaches ip_recv_error.

   sk->sk_socket->ops = _dgram_ops;
   < HERE >
   sk->sk_family = PF_INET;

 Even calling inet_recv_error to demux would not necessarily help.

 Safest would be to look up by skb->protocol, similar to what
 ipv6_recv_error does to handle v4-mapped-v6.

 Or to make that function safe with PF_INET and swap the order
 of the above two operations.

 All sound needlessly complicated for this rare socket option, but
 I don't have a better idea yet. Dropping on the floor is not nice,
 either.
>>>
>>> Ensuring that ip_recv_error correctly handles packets from either
>>> socket and removing the warning should indeed be good.
>>>
>>> It is robust against v4-mapped packets from an AF_INET6 socket,
>>> but see caveat on reconnect below.
>>>
>>> The code between ipv6_recv_error for v4-mapped addresses and
>>> ip_recv_error is essentially the same, the main difference being
>>> whether to return network headers as sockaddr_in with SOL_IP
>>> or sockaddr_in6 with SOL_IPV6.
>>>
>>> There are very few other locations in the stack that explicitly test
>>> sk_family in this way and thus would be vulnerable to races with
>>> IPV6_ADDRFORM.
>>>
>>> I'm not sure whether it is possible for a udpv6 socket to queue a
>>> real ipv6 packet on the error queue, disconnect, connect to an
>>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>>> on a true ipv6 packet. That would return buggy data, e.g., in
>>> msg_name.
>>
>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>> error queue is empty, and then take its lock for the duration of the
>> operation.
>
> Actually, no reason to hold the lock. This setsockopt holds the socket
> lock, which connect would need, too. So testing that the queue
> is empty after testing that it is connected to a v4 address is
> sufficient to ensure that no ipv6 packets are queued for reception.
>
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index 4d780c7f0130..a975d6311341 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
> int level, int optname,
>
> if (ipv6_only_sock(sk) ||
> !ipv6_addr_v4mapped(>sk_v6_daddr)) {
> retv = -EADDRNOTAVAIL;
> break;
> }
>
> +   if (!skb_queue_empty(>sk_error_queue)) {
> +   retv = -EBUSY;
> +   break;
> +   }
> +
> fl6_free_socklist(sk);
> __ipv6_sock_mc_close(sk);
>
> After this it should be safe to remove the warning in ip_recv_error.

Hmm.. nope.

This ensures that the socket cannot produce any new true v6 packets.
But it does not guarantee that they are not already in the system, e.g.
queued in tc, and will find their way to the error queue later.

We'll have to just be able to handle ipv6 packets in ip_recv_error.
Since IPV6_ADDRFORM is used to pass to legacy v4-only
processes and those likely are only confused by SOL_IPV6
error messages, it is probably best to just drop them and perhaps
WARN_ONCE.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
 From: Eric Dumazet 
 Date: Fri, 18 May 2018 08:30:43 -0700

> We probably need to revert Willem patch 
> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)

 Is it really valid to reach ip_recv_err with an ipv6 socket?
>>>
>>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>>> atomic operation, so that the socket is neither fully ipv4 nor fully
>>> ipv6 by the time it reaches ip_recv_error.
>>>
>>>   sk->sk_socket->ops = _dgram_ops;
>>>   < HERE >
>>>   sk->sk_family = PF_INET;
>>>
>>> Even calling inet_recv_error to demux would not necessarily help.
>>>
>>> Safest would be to look up by skb->protocol, similar to what
>>> ipv6_recv_error does to handle v4-mapped-v6.
>>>
>>> Or to make that function safe with PF_INET and swap the order
>>> of the above two operations.
>>>
>>> All sound needlessly complicated for this rare socket option, but
>>> I don't have a better idea yet. Dropping on the floor is not nice,
>>> either.
>>
>> Ensuring that ip_recv_error correctly handles packets from either
>> socket and removing the warning should indeed be good.
>>
>> It is robust against v4-mapped packets from an AF_INET6 socket,
>> but see caveat on reconnect below.
>>
>> The code between ipv6_recv_error for v4-mapped addresses and
>> ip_recv_error is essentially the same, the main difference being
>> whether to return network headers as sockaddr_in with SOL_IP
>> or sockaddr_in6 with SOL_IPV6.
>>
>> There are very few other locations in the stack that explicitly test
>> sk_family in this way and thus would be vulnerable to races with
>> IPV6_ADDRFORM.
>>
>> I'm not sure whether it is possible for a udpv6 socket to queue a
>> real ipv6 packet on the error queue, disconnect, connect to an
>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>> on a true ipv6 packet. That would return buggy data, e.g., in
>> msg_name.
>
> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
> error queue is empty, and then take its lock for the duration of the
> operation.

Actually, no reason to hold the lock. This setsockopt holds the socket
lock, which connect would need, too. So testing that the queue
is empty after testing that it is connected to a v4 address is
sufficient to ensure that no ipv6 packets are queued for reception.

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 4d780c7f0130..a975d6311341 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
int level, int optname,

if (ipv6_only_sock(sk) ||
!ipv6_addr_v4mapped(>sk_v6_daddr)) {
retv = -EADDRNOTAVAIL;
break;
}

+   if (!skb_queue_empty(>sk_error_queue)) {
+   retv = -EBUSY;
+   break;
+   }
+
fl6_free_socklist(sk);
__ipv6_sock_mc_close(sk);

After this it should be safe to remove the warning in ip_recv_error.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>  wrote:
>>> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
 From: Eric Dumazet 
 Date: Fri, 18 May 2018 08:30:43 -0700

> We probably need to revert Willem patch 
> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)

 Is it really valid to reach ip_recv_err with an ipv6 socket?
>>>
>>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>>> atomic operation, so that the socket is neither fully ipv4 nor fully
>>> ipv6 by the time it reaches ip_recv_error.
>>>
>>>   sk->sk_socket->ops = _dgram_ops;
>>>   < HERE >
>>>   sk->sk_family = PF_INET;
>>>
>>> Even calling inet_recv_error to demux would not necessarily help.
>>>
>>> Safest would be to look up by skb->protocol, similar to what
>>> ipv6_recv_error does to handle v4-mapped-v6.
>>>
>>> Or to make that function safe with PF_INET and swap the order
>>> of the above two operations.
>>>
>>> All sound needlessly complicated for this rare socket option, but
>>> I don't have a better idea yet. Dropping on the floor is not nice,
>>> either.
>>
>> Ensuring that ip_recv_error correctly handles packets from either
>> socket and removing the warning should indeed be good.
>>
>> It is robust against v4-mapped packets from an AF_INET6 socket,
>> but see caveat on reconnect below.
>>
>> The code between ipv6_recv_error for v4-mapped addresses and
>> ip_recv_error is essentially the same, the main difference being
>> whether to return network headers as sockaddr_in with SOL_IP
>> or sockaddr_in6 with SOL_IPV6.
>>
>> There are very few other locations in the stack that explicitly test
>> sk_family in this way and thus would be vulnerable to races with
>> IPV6_ADDRFORM.
>>
>> I'm not sure whether it is possible for a udpv6 socket to queue a
>> real ipv6 packet on the error queue, disconnect, connect to an
>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>> on a true ipv6 packet. That would return buggy data, e.g., in
>> msg_name.
>
> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
> error queue is empty, and then take its lock for the duration of the
> operation.

Actually, no reason to hold the lock. This setsockopt holds the socket
lock, which connect would need, too. So testing that the queue
is empty after testing that it is connected to a v4 address is
sufficient to ensure that no ipv6 packets are queued for reception.

diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 4d780c7f0130..a975d6311341 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
int level, int optname,

if (ipv6_only_sock(sk) ||
!ipv6_addr_v4mapped(>sk_v6_daddr)) {
retv = -EADDRNOTAVAIL;
break;
}

+   if (!skb_queue_empty(>sk_error_queue)) {
+   retv = -EBUSY;
+   break;
+   }
+
fl6_free_socklist(sk);
__ipv6_sock_mc_close(sk);

After this it should be safe to remove the warning in ip_recv_error.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
>>> From: Eric Dumazet 
>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>
 We probably need to revert Willem patch 
 (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>
>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>
>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>> atomic operation, so that the socket is neither fully ipv4 nor fully
>> ipv6 by the time it reaches ip_recv_error.
>>
>>   sk->sk_socket->ops = _dgram_ops;
>>   < HERE >
>>   sk->sk_family = PF_INET;
>>
>> Even calling inet_recv_error to demux would not necessarily help.
>>
>> Safest would be to look up by skb->protocol, similar to what
>> ipv6_recv_error does to handle v4-mapped-v6.
>>
>> Or to make that function safe with PF_INET and swap the order
>> of the above two operations.
>>
>> All sound needlessly complicated for this rare socket option, but
>> I don't have a better idea yet. Dropping on the floor is not nice,
>> either.
>
> Ensuring that ip_recv_error correctly handles packets from either
> socket and removing the warning should indeed be good.
>
> It is robust against v4-mapped packets from an AF_INET6 socket,
> but see caveat on reconnect below.
>
> The code between ipv6_recv_error for v4-mapped addresses and
> ip_recv_error is essentially the same, the main difference being
> whether to return network headers as sockaddr_in with SOL_IP
> or sockaddr_in6 with SOL_IPV6.
>
> There are very few other locations in the stack that explicitly test
> sk_family in this way and thus would be vulnerable to races with
> IPV6_ADDRFORM.
>
> I'm not sure whether it is possible for a udpv6 socket to queue a
> real ipv6 packet on the error queue, disconnect, connect to an
> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
> on a true ipv6 packet. That would return buggy data, e.g., in
> msg_name.

In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
error queue is empty, and then take its lock for the duration of the
operation.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>  wrote:
>> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
>>> From: Eric Dumazet 
>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>
 We probably need to revert Willem patch 
 (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>
>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>
>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>> atomic operation, so that the socket is neither fully ipv4 nor fully
>> ipv6 by the time it reaches ip_recv_error.
>>
>>   sk->sk_socket->ops = _dgram_ops;
>>   < HERE >
>>   sk->sk_family = PF_INET;
>>
>> Even calling inet_recv_error to demux would not necessarily help.
>>
>> Safest would be to look up by skb->protocol, similar to what
>> ipv6_recv_error does to handle v4-mapped-v6.
>>
>> Or to make that function safe with PF_INET and swap the order
>> of the above two operations.
>>
>> All sound needlessly complicated for this rare socket option, but
>> I don't have a better idea yet. Dropping on the floor is not nice,
>> either.
>
> Ensuring that ip_recv_error correctly handles packets from either
> socket and removing the warning should indeed be good.
>
> It is robust against v4-mapped packets from an AF_INET6 socket,
> but see caveat on reconnect below.
>
> The code between ipv6_recv_error for v4-mapped addresses and
> ip_recv_error is essentially the same, the main difference being
> whether to return network headers as sockaddr_in with SOL_IP
> or sockaddr_in6 with SOL_IPV6.
>
> There are very few other locations in the stack that explicitly test
> sk_family in this way and thus would be vulnerable to races with
> IPV6_ADDRFORM.
>
> I'm not sure whether it is possible for a udpv6 socket to queue a
> real ipv6 packet on the error queue, disconnect, connect to an
> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
> on a true ipv6 packet. That would return buggy data, e.g., in
> msg_name.

In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
error queue is empty, and then take its lock for the duration of the
operation.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
>> From: Eric Dumazet 
>> Date: Fri, 18 May 2018 08:30:43 -0700
>>
>>> We probably need to revert Willem patch 
>>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>
>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>
> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
> atomic operation, so that the socket is neither fully ipv4 nor fully
> ipv6 by the time it reaches ip_recv_error.
>
>   sk->sk_socket->ops = _dgram_ops;
>   < HERE >
>   sk->sk_family = PF_INET;
>
> Even calling inet_recv_error to demux would not necessarily help.
>
> Safest would be to look up by skb->protocol, similar to what
> ipv6_recv_error does to handle v4-mapped-v6.
>
> Or to make that function safe with PF_INET and swap the order
> of the above two operations.
>
> All sound needlessly complicated for this rare socket option, but
> I don't have a better idea yet. Dropping on the floor is not nice,
> either.

Ensuring that ip_recv_error correctly handles packets from either
socket and removing the warning should indeed be good.

It is robust against v4-mapped packets from an AF_INET6 socket,
but see caveat on reconnect below.

The code between ipv6_recv_error for v4-mapped addresses and
ip_recv_error is essentially the same, the main difference being
whether to return network headers as sockaddr_in with SOL_IP
or sockaddr_in6 with SOL_IPV6.

There are very few other locations in the stack that explicitly test
sk_family in this way and thus would be vulnerable to races with
IPV6_ADDRFORM.

I'm not sure whether it is possible for a udpv6 socket to queue a
real ipv6 packet on the error queue, disconnect, connect to an
ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
on a true ipv6 packet. That would return buggy data, e.g., in
msg_name.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
 wrote:
> On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
>> From: Eric Dumazet 
>> Date: Fri, 18 May 2018 08:30:43 -0700
>>
>>> We probably need to revert Willem patch 
>>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>
>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>
> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
> atomic operation, so that the socket is neither fully ipv4 nor fully
> ipv6 by the time it reaches ip_recv_error.
>
>   sk->sk_socket->ops = _dgram_ops;
>   < HERE >
>   sk->sk_family = PF_INET;
>
> Even calling inet_recv_error to demux would not necessarily help.
>
> Safest would be to look up by skb->protocol, similar to what
> ipv6_recv_error does to handle v4-mapped-v6.
>
> Or to make that function safe with PF_INET and swap the order
> of the above two operations.
>
> All sound needlessly complicated for this rare socket option, but
> I don't have a better idea yet. Dropping on the floor is not nice,
> either.

Ensuring that ip_recv_error correctly handles packets from either
socket and removing the warning should indeed be good.

It is robust against v4-mapped packets from an AF_INET6 socket,
but see caveat on reconnect below.

The code between ipv6_recv_error for v4-mapped addresses and
ip_recv_error is essentially the same, the main difference being
whether to return network headers as sockaddr_in with SOL_IP
or sockaddr_in6 with SOL_IPV6.

There are very few other locations in the stack that explicitly test
sk_family in this way and thus would be vulnerable to races with
IPV6_ADDRFORM.

I'm not sure whether it is possible for a udpv6 socket to queue a
real ipv6 packet on the error queue, disconnect, connect to an
ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
on a true ipv6 packet. That would return buggy data, e.g., in
msg_name.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
> From: Eric Dumazet 
> Date: Fri, 18 May 2018 08:30:43 -0700
>
>> We probably need to revert Willem patch 
>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>
> Is it really valid to reach ip_recv_err with an ipv6 socket?

I guess the issue is that setsockopt IPV6_ADDRFORM is not an
atomic operation, so that the socket is neither fully ipv4 nor fully
ipv6 by the time it reaches ip_recv_error.

  sk->sk_socket->ops = _dgram_ops;
  < HERE >
  sk->sk_family = PF_INET;

Even calling inet_recv_error to demux would not necessarily help.

Safest would be to look up by skb->protocol, similar to what
ipv6_recv_error does to handle v4-mapped-v6.

Or to make that function safe with PF_INET and swap the order
of the above two operations.

All sound needlessly complicated for this rare socket option, but
I don't have a better idea yet. Dropping on the floor is not nice,
either.


Re: WARNING in ip_recv_error

2018-05-18 Thread Willem de Bruijn
On Fri, May 18, 2018 at 11:44 AM, David Miller  wrote:
> From: Eric Dumazet 
> Date: Fri, 18 May 2018 08:30:43 -0700
>
>> We probably need to revert Willem patch 
>> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>
> Is it really valid to reach ip_recv_err with an ipv6 socket?

I guess the issue is that setsockopt IPV6_ADDRFORM is not an
atomic operation, so that the socket is neither fully ipv4 nor fully
ipv6 by the time it reaches ip_recv_error.

  sk->sk_socket->ops = _dgram_ops;
  < HERE >
  sk->sk_family = PF_INET;

Even calling inet_recv_error to demux would not necessarily help.

Safest would be to look up by skb->protocol, similar to what
ipv6_recv_error does to handle v4-mapped-v6.

Or to make that function safe with PF_INET and swap the order
of the above two operations.

All sound needlessly complicated for this rare socket option, but
I don't have a better idea yet. Dropping on the floor is not nice,
either.


Re: WARNING in ip_recv_error

2018-05-18 Thread David Miller
From: Eric Dumazet 
Date: Fri, 18 May 2018 08:30:43 -0700

> We probably need to revert Willem patch 
> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)

Is it really valid to reach ip_recv_err with an ipv6 socket?


Re: WARNING in ip_recv_error

2018-05-18 Thread David Miller
From: Eric Dumazet 
Date: Fri, 18 May 2018 08:30:43 -0700

> We probably need to revert Willem patch 
> (7ce875e5ecb8562fd44040f69bda96c999e38bbc)

Is it really valid to reach ip_recv_err with an ipv6 socket?


Re: WARNING in ip_recv_error

2018-05-18 Thread Eric Dumazet


On 05/18/2018 05:08 AM, DaeRyong Jeong wrote:
> We report the crash: WARNING in ip_recv_error
> (I resend the email since I mistakenly missed the subject in my previous
> email. I'm sorry.)
> 
> 
> This crash has been found in v4.17-rc1 using RaceFuzzer (a modified
> version of Syzkaller), which we describe more at the end of this
> report. Our analysis shows that the race occurs when invoking two
> syscalls concurrently, setsockopt$inet6_IPV6_ADDRFORM and recvmsg.
> 
> 
> Diagnosis:
> We think the concurrent execution of do_ipv6_setsockopt() with optname
> IPV6_ADDRFORM and inet_recvmsg() causes the crash. do_ipv6_setsockopt()
> can update sk->prot to _prot and sk->sk_family to PF_INET. But
> inet_recvmsg() can execute sk->sk_prot->recvmsg() right after that
> sk->prot is updated and sk->sk_family is not updated by
> do_ipv6_setsockopt(). This will lead WARN_ON in ip_recv_error().
> 
> 
> Thread interleaving:
> CPU0 (do_ipv6_setsockopt) CPU1 (inet_recvmsg)
> = =
> struct proto *prot = _prot;
> ...
> sk->sk_prot = prot;
> sk->sk_socket->ops = _dgram_ops;
>   err = sk->sk_prot->recvmsg(sk, 
> msg, size, flags & MSG_DONTWAIT,
>   flags & 
> ~MSG_DONTWAIT, _len);
> 
>   (in udp_recvmsg)
>   if (flags & MSG_ERRQUEUE)
>   return 
> ip_recv_error(sk, msg, len, addr_len);
>   
>   (in ip_recv_error)
>   WARN_ON_ONCE(sk->sk_family == 
> AF_INET6);
> sk->sk_family = PF_INET;
> 
> 
> Call Sequence:
> CPU0
> =
> udpv6_setsockopt
>   ipv6_setsockopt
>   do_ipv6_setsockopt
> 
> CPU1
> =
> sock_recvmsg
>   sock_recvmsg_nosec
>   inet_recvmsg
>   udp_recvmsg
> 
> 
> ==
> WARNING: CPU: 1 PID: 32600 at 
> /home/daeryong/workspace/new-race-fuzzer/kernels_repo/kernel_v4.17-rc1/net/ipv4/ip_sockglue.c:508
>  ip_recv_error+0x6f2/0x720 net/ipv4/ip_sockglue.c:508
> Kernel panic - not syncing: panic_on_warn set ...
> 
> CPU: 1 PID: 32600 Comm: syz-executor0 Not tainted 4.17.0-rc1 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x166/0x21c lib/dump_stack.c:113
>  panic+0x1a0/0x3a7 kernel/panic.c:184
>  __warn+0x191/0x1a0 kernel/panic.c:536
>  report_bug+0x132/0x1b0 lib/bug.c:186
>  fixup_bug.part.11+0x28/0x50 arch/x86/kernel/traps.c:178
>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
>  do_error_trap+0x28b/0x2d0 arch/x86/kernel/traps.c:296
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
> RIP: 0010:ip_recv_error+0x6f2/0x720 net/ipv4/ip_sockglue.c:508
> RSP: 0018:8801dadff630 EFLAGS: 00010212
> RAX: 0004 RBX: 2002 RCX: 8327de12
> RDX: 008a RSI: c90001a0c000 RDI: 8801be615010
> RBP: 8801dadff720 R08: 2002 R09: 8801dadff918
> R10: 8801dadff738 R11: 8801dadffaff R12: 8801be615000
> R13: 8801dadffd50 R14: 11003b5bfece R15: 8801dadffb90
>  udp_recvmsg+0x834/0xa10 net/ipv4/udp.c:1571
>  inet_recvmsg+0x121/0x420 net/ipv4/af_inet.c:830
>  sock_recvmsg_nosec net/socket.c:802 [inline]
>  sock_recvmsg+0x7f/0xa0 net/socket.c:809
>  ___sys_recvmsg+0x1f0/0x430 net/socket.c:2279
>  __sys_recvmsg+0xfc/0x1c0 net/socket.c:2328
>  __do_sys_recvmsg net/socket.c:2338 [inline]
>  __se_sys_recvmsg net/socket.c:2335 [inline]
>  __x64_sys_recvmsg+0x48/0x50 net/socket.c:2335
>  do_syscall_64+0x15f/0x4a0 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4563f9
> RSP: 002b:7f24f6927b28 EFLAGS: 0246 ORIG_RAX: 002f
> RAX: ffda RBX: 0072bfa0 RCX: 004563f9
> RDX: 2002 RSI: 2240 RDI: 0016
> RBP: 04e4 R08:  R09: 
> R10:  R11: 0246 R12: 7f24f69286d4
> R13:  R14: 006fc600 R15: 
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
> ==
> 
> 
> = About RaceFuzzer
> 
> RaceFuzzer is a customized version of Syzkaller, specifically tailored
> to find race condition bugs in the Linux kernel. While we leverage
> many different technique, the notable feature of RaceFuzzer is in
> leveraging a custom hypervisor (QEMU/KVM) to interleave the
> scheduling. In particular, we modified the hypervisor to 

Re: WARNING in ip_recv_error

2018-05-18 Thread Eric Dumazet


On 05/18/2018 05:08 AM, DaeRyong Jeong wrote:
> We report the crash: WARNING in ip_recv_error
> (I resend the email since I mistakenly missed the subject in my previous
> email. I'm sorry.)
> 
> 
> This crash has been found in v4.17-rc1 using RaceFuzzer (a modified
> version of Syzkaller), which we describe more at the end of this
> report. Our analysis shows that the race occurs when invoking two
> syscalls concurrently, setsockopt$inet6_IPV6_ADDRFORM and recvmsg.
> 
> 
> Diagnosis:
> We think the concurrent execution of do_ipv6_setsockopt() with optname
> IPV6_ADDRFORM and inet_recvmsg() causes the crash. do_ipv6_setsockopt()
> can update sk->prot to _prot and sk->sk_family to PF_INET. But
> inet_recvmsg() can execute sk->sk_prot->recvmsg() right after that
> sk->prot is updated and sk->sk_family is not updated by
> do_ipv6_setsockopt(). This will lead WARN_ON in ip_recv_error().
> 
> 
> Thread interleaving:
> CPU0 (do_ipv6_setsockopt) CPU1 (inet_recvmsg)
> = =
> struct proto *prot = _prot;
> ...
> sk->sk_prot = prot;
> sk->sk_socket->ops = _dgram_ops;
>   err = sk->sk_prot->recvmsg(sk, 
> msg, size, flags & MSG_DONTWAIT,
>   flags & 
> ~MSG_DONTWAIT, _len);
> 
>   (in udp_recvmsg)
>   if (flags & MSG_ERRQUEUE)
>   return 
> ip_recv_error(sk, msg, len, addr_len);
>   
>   (in ip_recv_error)
>   WARN_ON_ONCE(sk->sk_family == 
> AF_INET6);
> sk->sk_family = PF_INET;
> 
> 
> Call Sequence:
> CPU0
> =
> udpv6_setsockopt
>   ipv6_setsockopt
>   do_ipv6_setsockopt
> 
> CPU1
> =
> sock_recvmsg
>   sock_recvmsg_nosec
>   inet_recvmsg
>   udp_recvmsg
> 
> 
> ==
> WARNING: CPU: 1 PID: 32600 at 
> /home/daeryong/workspace/new-race-fuzzer/kernels_repo/kernel_v4.17-rc1/net/ipv4/ip_sockglue.c:508
>  ip_recv_error+0x6f2/0x720 net/ipv4/ip_sockglue.c:508
> Kernel panic - not syncing: panic_on_warn set ...
> 
> CPU: 1 PID: 32600 Comm: syz-executor0 Not tainted 4.17.0-rc1 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x166/0x21c lib/dump_stack.c:113
>  panic+0x1a0/0x3a7 kernel/panic.c:184
>  __warn+0x191/0x1a0 kernel/panic.c:536
>  report_bug+0x132/0x1b0 lib/bug.c:186
>  fixup_bug.part.11+0x28/0x50 arch/x86/kernel/traps.c:178
>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
>  do_error_trap+0x28b/0x2d0 arch/x86/kernel/traps.c:296
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
> RIP: 0010:ip_recv_error+0x6f2/0x720 net/ipv4/ip_sockglue.c:508
> RSP: 0018:8801dadff630 EFLAGS: 00010212
> RAX: 0004 RBX: 2002 RCX: 8327de12
> RDX: 008a RSI: c90001a0c000 RDI: 8801be615010
> RBP: 8801dadff720 R08: 2002 R09: 8801dadff918
> R10: 8801dadff738 R11: 8801dadffaff R12: 8801be615000
> R13: 8801dadffd50 R14: 11003b5bfece R15: 8801dadffb90
>  udp_recvmsg+0x834/0xa10 net/ipv4/udp.c:1571
>  inet_recvmsg+0x121/0x420 net/ipv4/af_inet.c:830
>  sock_recvmsg_nosec net/socket.c:802 [inline]
>  sock_recvmsg+0x7f/0xa0 net/socket.c:809
>  ___sys_recvmsg+0x1f0/0x430 net/socket.c:2279
>  __sys_recvmsg+0xfc/0x1c0 net/socket.c:2328
>  __do_sys_recvmsg net/socket.c:2338 [inline]
>  __se_sys_recvmsg net/socket.c:2335 [inline]
>  __x64_sys_recvmsg+0x48/0x50 net/socket.c:2335
>  do_syscall_64+0x15f/0x4a0 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x4563f9
> RSP: 002b:7f24f6927b28 EFLAGS: 0246 ORIG_RAX: 002f
> RAX: ffda RBX: 0072bfa0 RCX: 004563f9
> RDX: 2002 RSI: 2240 RDI: 0016
> RBP: 04e4 R08:  R09: 
> R10:  R11: 0246 R12: 7f24f69286d4
> R13:  R14: 006fc600 R15: 
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
> ==
> 
> 
> = About RaceFuzzer
> 
> RaceFuzzer is a customized version of Syzkaller, specifically tailored
> to find race condition bugs in the Linux kernel. While we leverage
> many different technique, the notable feature of RaceFuzzer is in
> leveraging a custom hypervisor (QEMU/KVM) to interleave the
> scheduling. In particular, we modified the hypervisor to