Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-11-22 Thread Alexander Graf

On 29.09.2011, at 18:06, Amit Shah wrote:

> On (Wed) 03 Aug 2011 [13:24:22], Jan Kiszka wrote:
>> From: Fabien Chouteau 
>> 
>> In the current implementation, if Slirp tries to send an IP packet to a 
>> client
>> with an unknown hardware address, the packet is simply dropped and an ARP
>> request is sent (if_encap in slirp/slirp.c).
>> 
>> With this patch, Slirp will send the ARP request, re-queue the packet and try
>> to send it later. The packet is dropped after one second if the ARP reply is
>> not received.
> 
> This patch causes a segfault when guests wake up from hibernate.
> 
> Recipe:
> 1. Start guest with -net user -net nic,model=virtio
> 2. (guest) ping 10.0.2.2
> 3. (guest) echo "disk" > /sys/power/state
> 4. Re-start guest with same command line
> 5. Ping has stopped receiving replies.
> 6. Kill that ping process and start a new one.  qemu segfaults.
> 
> This needs the not-upstream-yet virtio S4 handling patches, found at
> http://thread.gmane.org/gmane.linux.kernel/1197141
> 
> The backtrace is:
> 
> (gdb) bt
> #0  0x77e421f7 in slirp_insque (a=0x0, b=0x78f95d50) at
> /home/amit/src/qemu/slirp/misc.c:27
> #1  0x77e40738 in if_start (slirp=0x78a9cdf0) at
> /home/amit/src/qemu/slirp/if.c:194
> #2  0x77e44828 in slirp_select_poll (readfds=0x7fffd930,
> writefds=0x7fffd9b0, xfds=0x7fffda30, select_error=0)
>at /home/amit/src/qemu/slirp/slirp.c:588
> #3  0x77e110f1 in main_loop_wait (nonblocking=)
> at /home/amit/src/qemu/vl.c:1549
> #4  0x77d7dc47 in main_loop () at
> /home/amit/src/qemu/vl.c:1579
> #5  main (argc=, argv=, envp= out>) at /home/amit/src/qemu/vl.c:3574
> 
> 
> Reverting the patch keeps the ping going on after resume.  

I get the same thing with yesterday's HEAD (close to 1.0-rc3), but without 
hibernation.

I'm running KVM Autotest on PPC machines to check my ppc-next queue and every 
single test failed for me because of segmentation faults in the slirp code. 
Reverting this patch (and the follow-up patch which fixes the struct mbuf 
definition) makes all tests not segfault for me, so I'm fairly sure this is the 
offending one :).

I'm not saying that the patch is actually wrong - maybe it only exposes another 
bug that was only hidden so far. Either way, the breakage looks pretty much 
like memory corruption to me.

Also, I'm having a hard time reproducing the problem manually. It triggers 
every time in Autotest, but never when I try to trigger it manually. 
Essentially Autotest is merely trying to connect to the guest using ssh every 
couple of seconds, so I don't know why I can't reproduce it without it.

Please fix or revert this for 1.0.


Alex




Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-09-29 Thread Amit Shah
On (Thu) 29 Sep 2011 [20:19:42], Jan Kiszka wrote:
> On 2011-09-29 20:05, Amit Shah wrote:
> > On (Thu) 29 Sep 2011 [19:53:47], Jan Kiszka wrote:
>  Can't reproduce, I'm not getting stable hibernation here even without
>  any network configured.
> >>>
> >>> With virtio devices and the patches applied?  Can you tell me what
> >>> you're seeing?
> >>
> >> No, I didn't patch my guest. I was using standard IDE with an emulated
> >> NIC (or without) against a 3.1-rc3 (or so) guest.
> > 
> > Strange, using qemu.git and an F14 guest (2.6.38) using this cmd line:
> > 
> > ./x86_64-softmmu/qemu-system-x86_64 -m 512 /guests/f14-suspend.qcow2 -net 
> > none -enable-kvm -smp 2
> > 
> > I could successfully hibernate and resume.
> 
> -cpu qemu64,-kvmclock makes it work. Would have to check a different
> kernel version to find out if it's a guest kernel or qemu/kvm issue.

Aha, yes.  I disable kvmclock in my kernels as that has known issues
with hibernate.  I haven't checked if the 2.6.38 f14 kernel had it
disabled, or maybe I didn't hit its problem point soon enough.

Amit



Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-09-29 Thread Jan Kiszka
On 2011-09-29 20:05, Amit Shah wrote:
> On (Thu) 29 Sep 2011 [19:53:47], Jan Kiszka wrote:
 Can't reproduce, I'm not getting stable hibernation here even without
 any network configured.
>>>
>>> With virtio devices and the patches applied?  Can you tell me what
>>> you're seeing?
>>
>> No, I didn't patch my guest. I was using standard IDE with an emulated
>> NIC (or without) against a 3.1-rc3 (or so) guest.
> 
> Strange, using qemu.git and an F14 guest (2.6.38) using this cmd line:
> 
> ./x86_64-softmmu/qemu-system-x86_64 -m 512 /guests/f14-suspend.qcow2 -net 
> none -enable-kvm -smp 2
> 
> I could successfully hibernate and resume.

-cpu qemu64,-kvmclock makes it work. Would have to check a different
kernel version to find out if it's a guest kernel or qemu/kvm issue.

> 
 Could you check if the recent pull request [1] changes the picture for you?
>>>
>>> Thanks, that series fixes the problem.
>>
>> Perfect! Right in time. :)
> 
> And people say slirp is neglected and unmaintainable :-)

Who says this? I think slirp just needed some attention. It already
presented two fairly old bugs to me, but strangely right after I adopted
it... ;)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-09-29 Thread Amit Shah
On (Thu) 29 Sep 2011 [19:53:47], Jan Kiszka wrote:
> >> Can't reproduce, I'm not getting stable hibernation here even without
> >> any network configured.
> > 
> > With virtio devices and the patches applied?  Can you tell me what
> > you're seeing?
> 
> No, I didn't patch my guest. I was using standard IDE with an emulated
> NIC (or without) against a 3.1-rc3 (or so) guest.

Strange, using qemu.git and an F14 guest (2.6.38) using this cmd line:

./x86_64-softmmu/qemu-system-x86_64 -m 512 /guests/f14-suspend.qcow2 -net none 
-enable-kvm -smp 2

I could successfully hibernate and resume.

> >> Could you check if the recent pull request [1] changes the picture for you?
> > 
> > Thanks, that series fixes the problem.
> 
> Perfect! Right in time. :)

And people say slirp is neglected and unmaintainable :-)

Amit



Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-09-29 Thread Jan Kiszka
On 2011-09-29 19:50, Amit Shah wrote:
> On (Thu) 29 Sep 2011 [19:41:33], Jan Kiszka wrote:
>> On 2011-09-29 18:06, Amit Shah wrote:
>>> On (Wed) 03 Aug 2011 [13:24:22], Jan Kiszka wrote:
 From: Fabien Chouteau 

 In the current implementation, if Slirp tries to send an IP packet to a 
 client
 with an unknown hardware address, the packet is simply dropped and an ARP
 request is sent (if_encap in slirp/slirp.c).

 With this patch, Slirp will send the ARP request, re-queue the packet and 
 try
 to send it later. The packet is dropped after one second if the ARP reply 
 is
 not received.
>>>
>>> This patch causes a segfault when guests wake up from hibernate.
>>>
>>> Recipe:
>>> 1. Start guest with -net user -net nic,model=virtio
>>> 2. (guest) ping 10.0.2.2
>>> 3. (guest) echo "disk" > /sys/power/state
>>> 4. Re-start guest with same command line
>>> 5. Ping has stopped receiving replies.
>>> 6. Kill that ping process and start a new one.  qemu segfaults.
>>
>> Can't reproduce, I'm not getting stable hibernation here even without
>> any network configured.
> 
> With virtio devices and the patches applied?  Can you tell me what
> you're seeing?

No, I didn't patch my guest. I was using standard IDE with an emulated
NIC (or without) against a 3.1-rc3 (or so) guest.

> 
>> Could you check if the recent pull request [1] changes the picture for you?
> 
> Thanks, that series fixes the problem.

Perfect! Right in time. :)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-09-29 Thread Amit Shah
On (Thu) 29 Sep 2011 [19:41:33], Jan Kiszka wrote:
> On 2011-09-29 18:06, Amit Shah wrote:
> > On (Wed) 03 Aug 2011 [13:24:22], Jan Kiszka wrote:
> >> From: Fabien Chouteau 
> >>
> >> In the current implementation, if Slirp tries to send an IP packet to a 
> >> client
> >> with an unknown hardware address, the packet is simply dropped and an ARP
> >> request is sent (if_encap in slirp/slirp.c).
> >>
> >> With this patch, Slirp will send the ARP request, re-queue the packet and 
> >> try
> >> to send it later. The packet is dropped after one second if the ARP reply 
> >> is
> >> not received.
> > 
> > This patch causes a segfault when guests wake up from hibernate.
> > 
> > Recipe:
> > 1. Start guest with -net user -net nic,model=virtio
> > 2. (guest) ping 10.0.2.2
> > 3. (guest) echo "disk" > /sys/power/state
> > 4. Re-start guest with same command line
> > 5. Ping has stopped receiving replies.
> > 6. Kill that ping process and start a new one.  qemu segfaults.
> 
> Can't reproduce, I'm not getting stable hibernation here even without
> any network configured.

With virtio devices and the patches applied?  Can you tell me what
you're seeing?

> Could you check if the recent pull request [1] changes the picture for you?

Thanks, that series fixes the problem.

Amit



Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-09-29 Thread Jan Kiszka
On 2011-09-29 18:06, Amit Shah wrote:
> On (Wed) 03 Aug 2011 [13:24:22], Jan Kiszka wrote:
>> From: Fabien Chouteau 
>>
>> In the current implementation, if Slirp tries to send an IP packet to a 
>> client
>> with an unknown hardware address, the packet is simply dropped and an ARP
>> request is sent (if_encap in slirp/slirp.c).
>>
>> With this patch, Slirp will send the ARP request, re-queue the packet and try
>> to send it later. The packet is dropped after one second if the ARP reply is
>> not received.
> 
> This patch causes a segfault when guests wake up from hibernate.
> 
> Recipe:
> 1. Start guest with -net user -net nic,model=virtio
> 2. (guest) ping 10.0.2.2
> 3. (guest) echo "disk" > /sys/power/state
> 4. Re-start guest with same command line
> 5. Ping has stopped receiving replies.
> 6. Kill that ping process and start a new one.  qemu segfaults.

Can't reproduce, I'm not getting stable hibernation here even without
any network configured.

Could you check if the recent pull request [1] changes the picture for you?

Thanks,
Jan

[1] http://thread.gmane.org/gmane.comp.emulators.qemu/118992

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-09-29 Thread Amit Shah
On (Wed) 03 Aug 2011 [13:24:22], Jan Kiszka wrote:
> From: Fabien Chouteau 
> 
> In the current implementation, if Slirp tries to send an IP packet to a client
> with an unknown hardware address, the packet is simply dropped and an ARP
> request is sent (if_encap in slirp/slirp.c).
> 
> With this patch, Slirp will send the ARP request, re-queue the packet and try
> to send it later. The packet is dropped after one second if the ARP reply is
> not received.

This patch causes a segfault when guests wake up from hibernate.

Recipe:
1. Start guest with -net user -net nic,model=virtio
2. (guest) ping 10.0.2.2
3. (guest) echo "disk" > /sys/power/state
4. Re-start guest with same command line
5. Ping has stopped receiving replies.
6. Kill that ping process and start a new one.  qemu segfaults.

This needs the not-upstream-yet virtio S4 handling patches, found at
http://thread.gmane.org/gmane.linux.kernel/1197141

The backtrace is:

(gdb) bt
#0  0x77e421f7 in slirp_insque (a=0x0, b=0x78f95d50) at
/home/amit/src/qemu/slirp/misc.c:27
#1  0x77e40738 in if_start (slirp=0x78a9cdf0) at
/home/amit/src/qemu/slirp/if.c:194
#2  0x77e44828 in slirp_select_poll (readfds=0x7fffd930,
writefds=0x7fffd9b0, xfds=0x7fffda30, select_error=0)
at /home/amit/src/qemu/slirp/slirp.c:588
#3  0x77e110f1 in main_loop_wait (nonblocking=)
at /home/amit/src/qemu/vl.c:1549
#4  0x77d7dc47 in main_loop () at
/home/amit/src/qemu/vl.c:1579
#5  main (argc=, argv=, envp=) at /home/amit/src/qemu/vl.c:3574


Reverting the patch keeps the ping going on after resume.  

Leaving the patch in context:

> 
> Signed-off-by: Fabien Chouteau 
> Signed-off-by: Jan Kiszka 
> ---
>  slirp/if.c|   28 +++---
>  slirp/main.h  |2 +-
>  slirp/mbuf.c  |2 +
>  slirp/mbuf.h  |2 +
>  slirp/slirp.c |   72 +++-
>  5 files changed, 69 insertions(+), 37 deletions(-)
> 
> diff --git a/slirp/if.c b/slirp/if.c
> index 0f04e13..2d79e45 100644
> --- a/slirp/if.c
> +++ b/slirp/if.c
> @@ -6,6 +6,7 @@
>   */
>  
>  #include 
> +#include "qemu-timer.h"
>  
>  #define ifs_init(ifm) ((ifm)->ifs_next = (ifm)->ifs_prev = (ifm))
>  
> @@ -105,6 +106,9 @@ if_output(struct socket *so, struct mbuf *ifm)
>   ifs_init(ifm);
>   insque(ifm, ifq);
>  
> +/* Expiration date = Now + 1 second */
> +ifm->expiration_date = qemu_get_clock_ns(rt_clock) + 10ULL;
> +
>  diddit:
>   slirp->if_queued++;
>  
> @@ -153,6 +157,9 @@ diddit:
>  void
>  if_start(Slirp *slirp)
>  {
> +int requeued = 0;
> +uint64_t now;
> +
>   struct mbuf *ifm, *ifqt;
>  
>   DEBUG_CALL("if_start");
> @@ -165,6 +172,8 @@ if_start(Slirp *slirp)
>  if (!slirp_can_output(slirp->opaque))
>  return;
>  
> +now = qemu_get_clock_ns(rt_clock);
> +
>   /*
>* See which queue to get next packet from
>* If there's something in the fastq, select it immediately
> @@ -199,11 +208,22 @@ if_start(Slirp *slirp)
>  ifm->ifq_so->so_nqueued = 0;
>   }
>  
> - /* Encapsulate the packet for sending */
> -if_encap(slirp, (uint8_t *)ifm->m_data, ifm->m_len);
> -
> -m_free(ifm);
> +if (ifm->expiration_date < now) {
> +/* Expired */
> +m_free(ifm);
> +} else {
> +/* Encapsulate the packet for sending */
> +if (if_encap(slirp, ifm)) {
> +m_free(ifm);
> +} else {
> +/* re-queue */
> +insque(ifm, ifqt);
> +requeued++;
> +}
> +}
>  
>   if (slirp->if_queued)
>  goto again;
> +
> +slirp->if_queued = requeued;
>  }
> diff --git a/slirp/main.h b/slirp/main.h
> index 0dd8d81..028df4b 100644
> --- a/slirp/main.h
> +++ b/slirp/main.h
> @@ -42,5 +42,5 @@ extern int tcp_keepintvl;
>  #define PROTO_PPP 0x2
>  #endif
>  
> -void if_encap(Slirp *slirp, const uint8_t *ip_data, int ip_data_len);
> +int if_encap(Slirp *slirp, struct mbuf *ifm);
>  ssize_t slirp_send(struct socket *so, const void *buf, size_t len, int 
> flags);
> diff --git a/slirp/mbuf.c b/slirp/mbuf.c
> index ce2eb84..c699c75 100644
> --- a/slirp/mbuf.c
> +++ b/slirp/mbuf.c
> @@ -70,6 +70,8 @@ m_get(Slirp *slirp)
>   m->m_len = 0;
>  m->m_nextpkt = NULL;
>  m->m_prevpkt = NULL;
> +m->arp_requested = false;
> +m->expiration_date = (uint64_t)-1;
>  end_error:
>   DEBUG_ARG("m = %lx", (long )m);
>   return m;
> diff --git a/slirp/mbuf.h b/slirp/mbuf.h
> index b74544b..55170e5 100644
> --- a/slirp/mbuf.h
> +++ b/slirp/mbuf.h
> @@ -86,6 +86,8 @@ struct mbuf {
>   charm_dat_[1]; /* ANSI don't like 0 sized arrays */
>   char*m_ext_;
>   } M_dat;
> +bool arp_requested;
> +uint64_t expiration_date;
>  };
>  
>  #define m_next   m_hdr.mh_next
> diff --gi

[Qemu-devel] [PATCH 3/3] Delayed IP packets

2011-08-03 Thread Jan Kiszka
From: Fabien Chouteau 

In the current implementation, if Slirp tries to send an IP packet to a client
with an unknown hardware address, the packet is simply dropped and an ARP
request is sent (if_encap in slirp/slirp.c).

With this patch, Slirp will send the ARP request, re-queue the packet and try
to send it later. The packet is dropped after one second if the ARP reply is
not received.

Signed-off-by: Fabien Chouteau 
Signed-off-by: Jan Kiszka 
---
 slirp/if.c|   28 +++---
 slirp/main.h  |2 +-
 slirp/mbuf.c  |2 +
 slirp/mbuf.h  |2 +
 slirp/slirp.c |   72 +++-
 5 files changed, 69 insertions(+), 37 deletions(-)

diff --git a/slirp/if.c b/slirp/if.c
index 0f04e13..2d79e45 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -6,6 +6,7 @@
  */
 
 #include 
+#include "qemu-timer.h"
 
 #define ifs_init(ifm) ((ifm)->ifs_next = (ifm)->ifs_prev = (ifm))
 
@@ -105,6 +106,9 @@ if_output(struct socket *so, struct mbuf *ifm)
ifs_init(ifm);
insque(ifm, ifq);
 
+/* Expiration date = Now + 1 second */
+ifm->expiration_date = qemu_get_clock_ns(rt_clock) + 10ULL;
+
 diddit:
slirp->if_queued++;
 
@@ -153,6 +157,9 @@ diddit:
 void
 if_start(Slirp *slirp)
 {
+int requeued = 0;
+uint64_t now;
+
struct mbuf *ifm, *ifqt;
 
DEBUG_CALL("if_start");
@@ -165,6 +172,8 @@ if_start(Slirp *slirp)
 if (!slirp_can_output(slirp->opaque))
 return;
 
+now = qemu_get_clock_ns(rt_clock);
+
/*
 * See which queue to get next packet from
 * If there's something in the fastq, select it immediately
@@ -199,11 +208,22 @@ if_start(Slirp *slirp)
   ifm->ifq_so->so_nqueued = 0;
}
 
-   /* Encapsulate the packet for sending */
-if_encap(slirp, (uint8_t *)ifm->m_data, ifm->m_len);
-
-m_free(ifm);
+if (ifm->expiration_date < now) {
+/* Expired */
+m_free(ifm);
+} else {
+/* Encapsulate the packet for sending */
+if (if_encap(slirp, ifm)) {
+m_free(ifm);
+} else {
+/* re-queue */
+insque(ifm, ifqt);
+requeued++;
+}
+}
 
if (slirp->if_queued)
   goto again;
+
+slirp->if_queued = requeued;
 }
diff --git a/slirp/main.h b/slirp/main.h
index 0dd8d81..028df4b 100644
--- a/slirp/main.h
+++ b/slirp/main.h
@@ -42,5 +42,5 @@ extern int tcp_keepintvl;
 #define PROTO_PPP 0x2
 #endif
 
-void if_encap(Slirp *slirp, const uint8_t *ip_data, int ip_data_len);
+int if_encap(Slirp *slirp, struct mbuf *ifm);
 ssize_t slirp_send(struct socket *so, const void *buf, size_t len, int flags);
diff --git a/slirp/mbuf.c b/slirp/mbuf.c
index ce2eb84..c699c75 100644
--- a/slirp/mbuf.c
+++ b/slirp/mbuf.c
@@ -70,6 +70,8 @@ m_get(Slirp *slirp)
m->m_len = 0;
 m->m_nextpkt = NULL;
 m->m_prevpkt = NULL;
+m->arp_requested = false;
+m->expiration_date = (uint64_t)-1;
 end_error:
DEBUG_ARG("m = %lx", (long )m);
return m;
diff --git a/slirp/mbuf.h b/slirp/mbuf.h
index b74544b..55170e5 100644
--- a/slirp/mbuf.h
+++ b/slirp/mbuf.h
@@ -86,6 +86,8 @@ struct mbuf {
charm_dat_[1]; /* ANSI don't like 0 sized arrays */
char*m_ext_;
} M_dat;
+bool arp_requested;
+uint64_t expiration_date;
 };
 
 #define m_next m_hdr.mh_next
diff --git a/slirp/slirp.c b/slirp/slirp.c
index 4a9a4d5..a86cc6e 100644
--- a/slirp/slirp.c
+++ b/slirp/slirp.c
@@ -692,55 +692,63 @@ void slirp_input(Slirp *slirp, const uint8_t *pkt, int 
pkt_len)
 }
 }
 
-/* output the IP packet to the ethernet device */
-void if_encap(Slirp *slirp, const uint8_t *ip_data, int ip_data_len)
+/* Output the IP packet to the ethernet device. Returns 0 if the packet must be
+ * re-queued.
+ */
+int if_encap(Slirp *slirp, struct mbuf *ifm)
 {
 uint8_t buf[1600];
 struct ethhdr *eh = (struct ethhdr *)buf;
 uint8_t ethaddr[ETH_ALEN];
-const struct ip *iph = (const struct ip *)ip_data;
+const struct ip *iph = (const struct ip *)ifm->m_data;
 
-if (ip_data_len + ETH_HLEN > sizeof(buf))
-return;
+if (ifm->m_len + ETH_HLEN > sizeof(buf)) {
+return 1;
+}
 
 if (!arp_table_search(slirp, iph->ip_dst.s_addr, ethaddr)) {
 uint8_t arp_req[ETH_HLEN + sizeof(struct arphdr)];
 struct ethhdr *reh = (struct ethhdr *)arp_req;
 struct arphdr *rah = (struct arphdr *)(arp_req + ETH_HLEN);
 
-/* If the client addr is not known, there is no point in
-   sending the packet to it. Normally the sender should have
-   done an ARP request to get its MAC address. Here we do it
-   in place of sending the packet and we hope that the sender
-   will retry sending its packet. */
-memset(reh->h_dest, 0xff, ETH_ALEN);