Re: [RFC PATCH 09/21] contrib/gitdm: Add Nutanix to the domain map

2020-10-06 Thread Prerna Saxena


On 04/10/20, 11:35 PM, "Philippe Mathieu-Daudé"  wrote:

There are a number of contributors from this domain,
add its own entry to the gitdm domain map.

Cc: Ani Sinha 
Cc: David Vrabel 
Cc: Felipe Franciosi 
Cc: Jonathan Davies 
Cc: Malcolm Crossley 
Cc: Mike Cui 
Cc: Peter Turschmid 
Cc: Prerna Saxena 
Cc: Raphael Norwitz 
Cc: Swapnil Ingle 
Cc: Ani Sinha 
Signed-off-by: Philippe Mathieu-Daudé 
---
One Reviewed-by/Ack-by from someone from this domain
should be sufficient to get this patch merged.

Ani, can you confirm the a...@anisinha.ca email?
Should it go into 'individual contributors' instead?
---
 contrib/gitdm/domain-map        | 1 +
 contrib/gitdm/group-map-nutanix | 2 ++
 gitdm.config                    | 1 +
 3 files changed, 4 insertions(+)
 create mode 100644 contrib/gitdm/group-map-nutanix

diff --git a/contrib/gitdm/domain-map b/contrib/gitdm/domain-map
index 4850eab4c4..39251fd97c 100644
--- a/contrib/gitdm/domain-map
+++ b/contrib/gitdm/domain-map
@@ -24,6 +24,7 @@ linaro.org  Linaro
 codesourcery.com Mentor Graphics
 microsoft.com   Microsoft
 nokia.com   Nokia
+nutanix.com Nutanix
 oracle.com  Oracle
 proxmox.com Proxmox
 redhat.com  Red Hat
diff --git a/contrib/gitdm/group-map-nutanix b/contrib/gitdm/group-map-nutanix
new file mode 100644
index 00..a3f11425b3
--- /dev/null
+++ b/contrib/gitdm/group-map-nutanix
@@ -0,0 +1,2 @@
+raphael.s.norw...@gmail.com
+a...@anisinha.ca
diff --git a/gitdm.config b/gitdm.config
index c01c219078..4f821ab8ba 100644
--- a/gitdm.config
+++ b/gitdm.config
@@ -37,6 +37,7 @@ GroupMap contrib/gitdm/group-map-cadence Cadence Design Systems
 GroupMap contrib/gitdm/group-map-codeweavers CodeWeavers
 GroupMap contrib/gitdm/group-map-ibm IBM
 GroupMap contrib/gitdm/group-map-janustech Janus Technologies
+GroupMap contrib/gitdm/group-map-nutanix Nutanix

-- 
2.26.2

LGTM. Raphael is still a part of Nutanix. I see Ani has already responded about 
him not being with the company anymore, so you might want to add him to the 
individual contributors' list.

Regards,
Prerna



Re: [Qemu-devel] [PATCH 2/2] vhost-user: only seek a reply if needed in set_mem_table

2016-09-08 Thread Prerna Saxena
Hi Maxime,


On 08/09/16 2:04 pm, "Maxime Coquelin"  wrote:

>The goal of this patch is to request a sync (reply_ack,
>or get_features) in set_mem_table only when necessary.
>
>It should not be necessary the first time we set the table,
>or when we add a new region which hasn't been merged with an
>existing one.


I don’t think so.
This patch does not help us solve the issue.
The hang introduced by the original use of get_features() in set_mem_table was
traced down to the use of TCG mode for the vhost-user test. This has now been fixed via:

-
commit cdafe929615ec5eca71bcd5a3d12bab5678e5886
Author: Eduardo Habkost 
Date:   Fri Sep 2 15:59:43 2016 -0300


vhost-user-test: Use libqos instead of pxe-virtio.rom

vhost-user-test relies on iPXE just to initialize the virtio-net
device, and doesn't do any actual packet tx/rx testing.

In addition to that, the test relies on TCG, which is
incompatible with vhost. The test only worked by accident: a bug in
the memory backend initialization made memory regions not have
the DIRTY_MEMORY_CODE bit set in dirty_log_mask.

This changes vhost-user-test to initialize the virtio-net device
using libqos, and not use TCG nor pxe-virtio.rom.

Signed-off-by: Eduardo Habkost 

---

So I think the original hang has been fixed by Patch 1/2 of this
series alone.

Regarding Patch 2/2:
This patch seems to filter responses from set_mem_table only for certain 
updates of memory regions. It violates the definition of the REPLY_ACK feature. 
This feature expects the client to send a response for every call of 
set_mem_table. And here, qemu exits the set_mem_table() function in some cases 
without even waiting for the reply that is going to come in.
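
To make the REPLY_ACK expectation above concrete, here is a minimal sketch of
what a backend could do for a request that carries the need_reply bit. The
SketchMsg struct and sketch_build_ack() are hypothetical names used only for
illustration; the flag masks are the ones from the REPLY_ACK patch. Clearing
need_reply in the ack is an assumption of this sketch, and none of this is
code from QEMU or from any existing backend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag layout as described in the REPLY_ACK patch to vhost-user.txt. */
#define VHOST_USER_VERSION_MASK     (0x3)
#define VHOST_USER_REPLY_MASK       (0x1 << 2)
#define VHOST_USER_NEED_REPLY_MASK  (0x1 << 3)

/* Hypothetical, trimmed-down message; the real VhostUserMsg has more fields. */
typedef struct SketchMsg {
    uint32_t request;
    uint32_t flags;
    uint32_t size;   /* size of the payload that follows */
    uint64_t u64;    /* u64 payload: zero on success, non-zero on failure */
} SketchMsg;

/*
 * Build the ack for a request whose handler returned `status` (0 on success).
 * Returns false when the sender did not set need_reply, so no ack is owed.
 */
static bool sketch_build_ack(const SketchMsg *req, int status, SketchMsg *ack)
{
    if (!(req->flags & VHOST_USER_NEED_REPLY_MASK)) {
        return false;
    }

    ack->request = req->request;
    /* Keep the version bits, mark the message as a reply (assumption:
     * need_reply is not echoed back). */
    ack->flags = (req->flags & VHOST_USER_VERSION_MASK) | VHOST_USER_REPLY_MASK;
    ack->size = sizeof(ack->u64);
    ack->u64 = status ? 1 : 0;
    return true;
}
```

The point for this discussion is that the ack is owed for every request with
need_reply set, so the sender has to read it every time rather than only for
some memory-region updates.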

As for the use of this approach with get_features, we have already debated that on
the list before:
https://lists.nongnu.org/archive/html/qemu-devel/2016-07/msg00689.html
To quote:
"I do not entirely agree with that. The first set_mem_table command is not much
different from subsequent set_mem_table calls."

Regards,
Prerna



Re: [Qemu-devel] [PATCH] Revert "vhost-user: Attempt to fix a race with set_mem_table."

2016-08-16 Thread Prerna Saxena





On 16/08/16 2:39 am, "Michael S. Tsirkin" <m...@redhat.com> wrote:

>On Mon, Aug 15, 2016 at 04:15:08PM +0100, Peter Maydell wrote:
>> On 15 August 2016 at 14:35, Michael S. Tsirkin <m...@redhat.com> wrote:
>> > This reverts commit 28ed5ef16384f12500abd3647973ee21b03cbe23.
>> >
>> > I still think it's the right thing to do, but
>> > tests have been failing sporadically.
>> >
>> > Revert for now, and hope to fix it before the release.
>> >
>> > Cc: Prerna Saxena <prerna.sax...@nutanix.com>
>> > Cc: Peter Maydell <peter.mayd...@linaro.org>
>> > Cc: Marc-André Lureau <mlur...@redhat.com>
>> > Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>> > ---
>> 
>> Applied, thanks. I found my clang-on-x86-64 Linux Ubuntu xenial
>> build would hang in vhost-user/read-guest-mem after 10 or
>> so iterations, but with this revert applied it seems fine,
>> so I think this commit was definitely the culprit.
>> 
>> -- PMM
>
>That's nice for the RC, but I think we do want to
>have the underlying issue fixed down the road.
>Prerna, Marc André - any chance you could try
>reproducing on an ubuntu guest?
>
>In particular to make sure the issue, whatever it is,
>will not trigger once clients negotiate the
>new feature bit.


Sure, I’ll give it a try.

Prerna


Re: [Qemu-devel] [PATCH] Revert "vhost-user: Attempt to fix a race with set_mem_table."

2016-08-15 Thread Prerna Saxena
Ack. You beat me to the patch by a few minutes :)

Prerna





On 15/08/16 7:05 pm, "Michael S. Tsirkin" <m...@redhat.com> wrote:

>This reverts commit 28ed5ef16384f12500abd3647973ee21b03cbe23.
>
>I still think it's the right thing to do, but
>tests have been failing sporadically.
>
>Revert for now, and hope to fix it before the release.
>
>Cc: Prerna Saxena <prerna.sax...@nutanix.com>
>Cc: Peter Maydell <peter.mayd...@linaro.org>
>Cc: Marc-André Lureau <mlur...@redhat.com>
>Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>---
> hw/virtio/vhost-user.c | 127 +++--
> 1 file changed, 60 insertions(+), 67 deletions(-)
>
>diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>index 1a7d53c..b57454a 100644
>--- a/hw/virtio/vhost-user.c
>+++ b/hw/virtio/vhost-user.c
>@@ -263,6 +263,66 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
>uint64_t base,
> return 0;
> }
> 
>+static int vhost_user_set_mem_table(struct vhost_dev *dev,
>+struct vhost_memory *mem)
>+{
>+int fds[VHOST_MEMORY_MAX_NREGIONS];
>+int i, fd;
>+size_t fd_num = 0;
>+bool reply_supported = virtio_has_feature(dev->protocol_features,
>+  
>VHOST_USER_PROTOCOL_F_REPLY_ACK);
>+
>+VhostUserMsg msg = {
>+.request = VHOST_USER_SET_MEM_TABLE,
>+.flags = VHOST_USER_VERSION,
>+};
>+
>+if (reply_supported) {
>+msg.flags |= VHOST_USER_NEED_REPLY_MASK;
>+}
>+
>+for (i = 0; i < dev->mem->nregions; ++i) {
>+struct vhost_memory_region *reg = dev->mem->regions + i;
>+ram_addr_t offset;
>+MemoryRegion *mr;
>+
>+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
>+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
>+ &offset);
>+fd = memory_region_get_fd(mr);
>+if (fd > 0) {
>+msg.payload.memory.regions[fd_num].userspace_addr = 
>reg->userspace_addr;
>+msg.payload.memory.regions[fd_num].memory_size  = 
>reg->memory_size;
>+msg.payload.memory.regions[fd_num].guest_phys_addr = 
>reg->guest_phys_addr;
>+msg.payload.memory.regions[fd_num].mmap_offset = offset;
>+assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
>+fds[fd_num++] = fd;
>+}
>+}
>+
>+msg.payload.memory.nregions = fd_num;
>+
>+if (!fd_num) {
>+error_report("Failed initializing vhost-user memory map, "
>+ "consider using -object memory-backend-file share=on");
>+return -1;
>+}
>+
>+msg.size = sizeof(msg.payload.memory.nregions);
>+msg.size += sizeof(msg.payload.memory.padding);
>+msg.size += fd_num * sizeof(VhostUserMemoryRegion);
>+
>+if (vhost_user_write(dev, &msg, fds, fd_num) < 0) {
>+return -1;
>+}
>+
>+if (reply_supported) {
>+return process_message_reply(dev, msg.request);
>+}
>+
>+return 0;
>+}
>+
> static int vhost_user_set_vring_addr(struct vhost_dev *dev,
>  struct vhost_vring_addr *addr)
> {
>@@ -477,73 +537,6 @@ static int vhost_user_get_features(struct vhost_dev *dev, 
>uint64_t *features)
> return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
> }
> 
>-static int vhost_user_set_mem_table(struct vhost_dev *dev,
>-struct vhost_memory *mem)
>-{
>-int fds[VHOST_MEMORY_MAX_NREGIONS];
>-int i, fd;
>-size_t fd_num = 0;
>-uint64_t features;
>-bool reply_supported = virtio_has_feature(dev->protocol_features,
>-  
>VHOST_USER_PROTOCOL_F_REPLY_ACK);
>-
>-VhostUserMsg msg = {
>-.request = VHOST_USER_SET_MEM_TABLE,
>-.flags = VHOST_USER_VERSION,
>-};
>-
>-if (reply_supported) {
>-msg.flags |= VHOST_USER_NEED_REPLY_MASK;
>-}
>-
>-for (i = 0; i < dev->mem->nregions; ++i) {
>-struct vhost_memory_region *reg = dev->mem->regions + i;
>-ram_addr_t offset;
>-MemoryRegion *mr;
>-
>-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
>-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
>- &offset);
>-fd = memory_region_get_fd(mr);
>-if (fd > 0) {
>-msg.payload.memory.regions[fd_num].userspace_addr

Re: [Qemu-devel] [PULL 3/3] vhost-user: Attempt to fix a race with set_mem_table.

2016-08-14 Thread Prerna Saxena
On 14/08/16 8:21 am, "Michael S. Tsirkin" <m...@redhat.com> wrote:


>On Fri, Aug 12, 2016 at 07:16:34AM +, Prerna Saxena wrote:
>> 
>> On 12/08/16 12:08 pm, "Fam Zheng" <f...@redhat.com> wrote:
>> 
>> 
>> 
>> 
>> 
>> >On Wed, 08/10 18:30, Michael S. Tsirkin wrote:
>> >> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> >> 
>> >> The set_mem_table command currently does not seek a reply. Hence, there is
>> >> no easy way for a remote application to notify to QEMU when it finished
>> >> setting up memory, or if there were errors doing so.
>> >> 
>> >> As an example:
>> >> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
>> >> application). SET_MEM_TABLE does not require a reply according to the 
>> >> spec.
>> >> (2) Qemu commits the memory to the guest.
>> >> (3) Guest issues an I/O operation over a new memory region which was 
>> >> configured on (1).
>> >> (4) The application has not yet remapped the memory, but it sees the I/O 
>> >> request.
>> >> (5) The application cannot satisfy the request because it does not know 
>> >> about those GPAs.
>> >> 
>> >> While a guaranteed fix would require a protocol extension (committed 
>> >> separately),
>> >> a best-effort workaround for existing applications is to send a 
>> >> GET_FEATURES
>> >> message before completing the vhost_user_set_mem_table() call.
>> >> Since GET_FEATURES requires a reply, an application that processes 
>> >> vhost-user
>> >> messages synchronously would probably have completed the SET_MEM_TABLE 
>> >> before replying.
>> >> 
>> >> Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
>> >> Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>> >> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>> >
>> >Sporadic hangs are seen with test-vhost-user after this patch:
>> >
>> >https://travis-ci.org/qemu/qemu/builds
>> >
>> >Reverting seems to fix it for me.
>> >
>> >Is this a known problem?
>> >
>> >Fam
>> 
>> Hi Fam,
>> Thanks for reporting the sporadic hangs. I had seen ‘make check’ pass on my 
>> CentOS 6 environment, so I missed this.
>> I am setting up the docker test env to repro this, but I think I can guess 
>> the problem:
>> 
>> In tests/vhost-user-test.c: 
>> 
>> static void chr_read(void *opaque, const uint8_t *buf, int size)
>> {
>> ..[snip]..
>> 
>> case VHOST_USER_SET_MEM_TABLE:
>>/* received the mem table */
>>memcpy(&s->memory, &msg.payload.memory, sizeof(msg.payload.memory));
>>s->fds_num = qemu_chr_fe_get_msgfds(chr, s->fds, 
>> G_N_ELEMENTS(s->fds));
>> 
>> 
>>/* signal the test that it can continue */
>>g_cond_signal(&s->data_cond);
>>break;
>> ..[snip]..
>> }
>> 
>> 
>> The test seems to be marked complete as soon as mem_table is copied. 
>> However, this patch 3/3 changes the behaviour of the SET_MEM_TABLE vhost 
>> command implementation with qemu. SET_MEM_TABLE now sends out a new message 
>> GET_FEATURES, and the call is only completed once it receives features from 
>> the remote application. (or the test framework, as is the case here.)
>
>Hmm but why does it matter that data_cond is woken up?

Michael, sorry, I didn’t quite understand that. Could you please explain?

>
>
>> While the test itself can be modified (do not signal completion until we’ve 
>> sent a follow-up response to GET_FEATURES), I am now wondering if this patch 
>> may break existing vhost applications too? If so, reverting it is possibly 
>> better.
>
>What bothers me is that the new feature might cause the same
>issue once we enable it in the test.

No, it won't. The new feature is a protocol extension, and only takes effect if it
has been negotiated. If it is not negotiated, that part of the code is never executed.
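
As a rough illustration of that gating, the sketch below shows how the
need_reply flag would only ever be set when the REPLY_ACK bit is present in
the negotiated protocol features. sketch_has_feature() is a stand-in for
QEMU's virtio_has_feature(), and sketch_request_flags() is a hypothetical
helper that does not exist in the tree; the two constants are the ones added
by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
#define VHOST_USER_NEED_REPLY_MASK       (0x1 << 3)

/* Stand-in for virtio_has_feature(): test one bit in a feature mask. */
static bool sketch_has_feature(uint64_t features, unsigned int bit)
{
    return (features & (1ULL << bit)) != 0;
}

/*
 * Hypothetical helper: compute the flags for an outgoing request.
 * need_reply is added only if REPLY_ACK was negotiated; otherwise the
 * flags are returned untouched and the old behaviour is preserved.
 */
static uint32_t sketch_request_flags(uint64_t protocol_features,
                                     uint32_t base_flags)
{
    if (sketch_has_feature(protocol_features,
                           VHOST_USER_PROTOCOL_F_REPLY_ACK)) {
        return base_flags | VHOST_USER_NEED_REPLY_MASK;
    }
    return base_flags;
}
```

With a feature mask that lacks bit 3, which is what the existing test
advertises in this scenario, the flags come back unchanged.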

>
>How about a patch to tests/vhost-user-test.c adding the new
>protocol feature? I would be quite interested to see what
>is going on with it.

Yes that can be done. But you can see that the protocol extension patch will 
not change the behaviour of the _existing_ test.

>
>
>> What confuses me is why it doesn’t fail all the time, but only about 20% to 
>> 30% of the time, as Fam reports.
>
>And succeeds every time on my systems :(

+1 to that :( I have had no luck repro’ing it.

>
>> 
>> Thoughts : Michael, Fam, MarcAndre ?
>> 
>> Regards,
>>

Prerna


Re: [Qemu-devel] [PULL 3/3] vhost-user: Attempt to fix a race with set_mem_table.

2016-08-12 Thread Prerna Saxena

On 12/08/16 12:08 pm, "Fam Zheng" <f...@redhat.com> wrote:





>On Wed, 08/10 18:30, Michael S. Tsirkin wrote:
>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> 
>> The set_mem_table command currently does not seek a reply. Hence, there is
>> no easy way for a remote application to notify to QEMU when it finished
>> setting up memory, or if there were errors doing so.
>> 
>> As an example:
>> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
>> application). SET_MEM_TABLE does not require a reply according to the spec.
>> (2) Qemu commits the memory to the guest.
>> (3) Guest issues an I/O operation over a new memory region which was 
>> configured on (1).
>> (4) The application has not yet remapped the memory, but it sees the I/O 
>> request.
>> (5) The application cannot satisfy the request because it does not know 
>> about those GPAs.
>> 
>> While a guaranteed fix would require a protocol extension (committed 
>> separately),
>> a best-effort workaround for existing applications is to send a GET_FEATURES
>> message before completing the vhost_user_set_mem_table() call.
>> Since GET_FEATURES requires a reply, an application that processes vhost-user
>> messages synchronously would probably have completed the SET_MEM_TABLE 
>> before replying.
>> 
>> Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
>> Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>
>Sporadic hangs are seen with test-vhost-user after this patch:
>
>https://travis-ci.org/qemu/qemu/builds
>
>Reverting seems to fix it for me.
>
>Is this a known problem?
>
>Fam

Hi Fam,
Thanks for reporting the sporadic hangs. I had seen ‘make check’ pass on my
CentOS 6 environment, so I missed this.
I am setting up the docker test env to repro this, but I think I can guess the
problem:

In tests/vhost-user-test.c: 

static void chr_read(void *opaque, const uint8_t *buf, int size)
{
..[snip]..

case VHOST_USER_SET_MEM_TABLE:
   /* received the mem table */
   memcpy(&s->memory, &msg.payload.memory, sizeof(msg.payload.memory));
   s->fds_num = qemu_chr_fe_get_msgfds(chr, s->fds, G_N_ELEMENTS(s->fds));


   /* signal the test that it can continue */
   g_cond_signal(&s->data_cond);
   break;
..[snip]..
}


The test seems to be marked complete as soon as mem_table is copied.
However, this patch 3/3 changes the behaviour of the SET_MEM_TABLE vhost
command implementation in qemu. SET_MEM_TABLE now sends out a new message,
GET_FEATURES, and the call is only completed once it receives features from the
remote application (or the test framework, as is the case here).
While the test itself can be modified (do not signal completion until we’ve
sent a follow-up response to GET_FEATURES), I am now wondering if this patch
may break existing vhost applications too? If so, reverting it is possibly better.
What confuses me is why it doesn’t fail all the time, but only about 20% to 30%
of the time, as Fam reports.

Thoughts : Michael, Fam, MarcAndre ?

Regards,
Prerna 


Re: [Qemu-devel] [PATCH for-2.7 v5.1 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-08-05 Thread Prerna Saxena
On 04/08/16 9:41 am, "Michael S. Tsirkin" <m...@redhat.com> wrote:



>On Sat, Jul 30, 2016 at 06:38:23AM +, Prerna Saxena wrote:
>> 
>> 
>> 
>> 
>> 
>> On 30/07/16 2:19 am, "Eric Blake" <ebl...@redhat.com> wrote:
>> 
>> >On 07/28/2016 01:07 AM, Prerna Saxena wrote:
>> >> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> >> 
>> >> This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.
>> >> 
>> >
>> >> +
>> >> +With this protocol extension negotiated, the sender (QEMU) can set the
>> >> +"need_reply" [Bit 3] flag to any command. This indicates that
>> >> +the client MUST respond with a Payload VhostUserMsg indicating success or
>> >> +failure. The payload should be set to zero on success or non-zero on 
>> >> failure.
>> >> +(Unless the message already has an explicit reply body)
>> >
>> >Rather than make this parenthetical, I would go with:
>> >
>> >The payload should be set to zero on success or non-zero on failure,
>> >unless the message already has an explicit reply body.
>> 
>> Hi Eric,
>> Thank you for taking a look, but I think you possibly missed the latest 
>> patchset posted last night.
>> This had already been incorporated in v6 that I’d posted last night before 
>> your message.
>> See https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06772.html
>> 
>> 
>> >
>> >> +
>> >> +This indicates to QEMU that the requested operation has deterministically
>> >> +been met or not. Today, QEMU is expected to terminate the main vhost-user
>> >
>> >Reads awkwardly; maybe:
>> >
>> >The response payload gives QEMU a deterministic indication of the result
>> >of the command.
>> 
>> Hmm, it is more of personal taste, so I’ll refrain from commenting either 
>> way.
>
>I prefer Eric's form too. "that ... or not" isn't very clear.

Done.

>
>> >
>> >> +loop upon receiving such errors. In future, qemu could be taught to be 
>> >> more
>> >> +resilient for selective requests.
>> >> +
>> >> +For the message types that already solicit a reply from the client, the
>> >> +presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set 
>> >> brings
>> >> +no behaviourial change. (See the 'Communication' section for details.)
>> >
>> >s/behaviourial/behavioural/ (or if the document widely favors US
>> >spelling, behavioral)
>> 
>> 
>> The last 3 iterations of this patchset have only seen review comments 
>> focussed on documentation suggestions and indentation of code, but nothing 
>> on the idea/code itself. This gives me hope that the patch is possibly close 
>> to merging within 2.7 timeframe :-)
>> May I request the maintainers to please correct this tiny spelling typo as 
>> this is checked in?
>> 
>> Regards,
>> Prerna
>
>Probably easier to post v7 with above minor things.

Posted a v7 which incorporates all suggestions made by Eric.
https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg01027.html

Regards,



[Qemu-devel] [PATCH for-2.7 v7 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-08-05 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_reply" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.

Currently implemented only for SET_MEM_TABLE.

Reviewed-by: Marc-André Lureau <marcandre.lur...@redhat.com>
Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt | 26 ++
 hw/virtio/vhost-user.c| 32 
 2 files changed, 58 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..7890d71 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_reply flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+ details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +469,24 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+---
+The original vhost-user specification only demands replies for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed.
+
+With this protocol extension negotiated, the sender (QEMU) can set the
+"need_reply" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure,
+unless the message already has an explicit reply body.
+
+The response payload gives QEMU a deterministic indication of the result
+of the command. Today, QEMU is expected to terminate the main vhost-user
+loop upon receiving such errors. In future, qemu could be taught to be more
+resilient for selective requests.
+
+For the message types that already solicit a reply from the client, the
+presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
+no behavioural change. (See the 'Communication' section for details.)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 1995fd2..b57454a 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK (0x3)
 #define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_NEED_REPLY_MASK  (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -158,6 +160,25 @@ fail:
 return -1;
 }
 
+static int process_message_reply(struct vhost_dev *dev,
+ VhostUserRequest request)
+{
+VhostUserMsg msg;
+
+if (vhost_user_read(dev, &msg) < 0) {
+return -1;
+}
+
+if (msg.request != request) {
+error_report("Received unexpected msg type. "
+ "Expected %d received %d",
+ request, msg.request);
+return -1;
+}
+
+return msg.payload.u64 ? -1 : 0;
+}
+
 static bool vhost_user_one_time_request(VhostUserRequest request)
 {
 switch (request) {
@@ -248,11 +269,18 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
 int fds[VHOST_MEMORY_MAX_NREGIONS];
 int i, fd;
 size_t fd_num = 0;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+  VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
 VhostUserMsg msg = {
 .request = VHOST_USER_SET_MEM_TABLE,
 .flags = VHOST_USER_VERSION,
 };
 
+if (reply_supported) {
+  

[Qemu-devel] [PATCH for-2.7 v7 0/2]vhost-user: Extend protocol to receive replies on any command.

2016-08-05 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

[ This series incorporates all the documentation suggestions made during
review. ]
vhost-user: Extend protocol to receive replies on any command.

The current vhost-user protocol requires the client to send a reply to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- ie, did it succeed? If so, by what time?

This is inconvenient, and can also lead to races. As an example:

(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
application). Note that SET_MEM_TABLE does not require a reply according to the
spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing 
applications.
Patch 1 introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This
feature, if negotiated, allows QEMU to request a reply to any message by setting
the newly introduced "need_reply" flag. The application must then respond to
qemu by providing a status about the requested operation.

Patch 2 adds a workaround for the race described above for clients that do not
support the REPLY_ACK feature. It introduces a get_features command to be sent
before returning from set_mem_table. While this is not a complete fix, it will
help client applications that strictly process messages in order.
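
The reasoning behind the Patch 2 workaround can be sketched with a toy
backend that handles messages strictly in arrival order. Everything here
(SketchBackend, sketch_handle(), sketch_run(), the SKETCH_* request names) is
hypothetical and exists only to illustrate the ordering argument; a real
backend that handles messages out of order gets no such guarantee.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative request names; not the real VhostUserRequest enum. */
enum sketch_request {
    SKETCH_GET_FEATURES,
    SKETCH_SET_MEM_TABLE
};

typedef struct SketchBackend {
    bool mem_table_mapped;      /* set once SET_MEM_TABLE has been handled */
    bool mapped_at_last_reply;  /* state observed when the last reply went out */
    int replies_sent;
} SketchBackend;

/* Handle one message to completion before looking at the next one. */
static void sketch_handle(SketchBackend *b, enum sketch_request req)
{
    switch (req) {
    case SKETCH_SET_MEM_TABLE:
        /* No reply is sent; the backend just (re)maps guest memory. */
        b->mem_table_mapped = true;
        break;
    case SKETCH_GET_FEATURES:
        /* A reply is mandatory; record what was mapped when it was sent. */
        b->mapped_at_last_reply = b->mem_table_mapped;
        b->replies_sent++;
        break;
    }
}

/* Strictly in-order message loop over a queue of pending requests. */
static void sketch_run(SketchBackend *b, const enum sketch_request *queue,
                       size_t n)
{
    for (size_t i = 0; i < n; i++) {
        sketch_handle(b, queue[i]);
    }
}
```

If the queue is SET_MEM_TABLE followed by GET_FEATURES, the single reply is
sent only after the memory table has been mapped, which is the property the
workaround relies on.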

Changelog:
--
Changes v6 -> v7:
1) Patch 1: In docs/specs/vhost-user.txt
*   s/behaviourial/behavioural/
*   "This indicates to QEMU that the requested operation has deterministically 
been met or not" -> " The response payload gives QEMU a deterministic 
indication of the result of the command."
2) Patch 2 : Unchanged.

Changes v5.1 -> v6:
1) Patch 1 : fixed some minor indentation issues and a really tiny
documentation change.
2) Patch 2 : unchanged.

Changes v5->v5.1 :
1) Patch 1 : no change
2) Patch 2 : fixes a tiny typo I'd accidentally introduced while creating v5 
from v4. The code itself is unchanged from v4.

Changes v4->v5:
1) Patch 1 :
* Reword 'response' to 'reply' on public demand.
* Documentation is more concise.
2) Patch 2 : unchanged

Changes v3->v4:
1) Rearranged code in PATCH 1 to offset compiler warnings about missing 
declaration of vhost_user_read(). Fixed by moving process_message_reply() after 
definition of vhost_user_read()
2) Fixed minor suggestions in writeup for this protocol extension.

Changes v2->v3:
1) Swapped the patch numbers 1 & 2 from the previous series.
2) Patch 1 (previously patch 2 in v2): addresses MarcAndre's review comments 
and renames function 'process_message_response' to 'process_message_reply'
3) Patch 2 (ie patch 1 in v2) : Unchanged from v2.

Changes v1->v2:
1) Patch 1 : Ask for get_features before returning from set_mem_table(new).
2) Patch 2 : * Improve documentation.
  * Abstract out commonly used operations in the form of a function, 
process_message_response(). Also implement this only for SET_MEM_TABLE.

References:
v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html
v3 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg01598.html
v4 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html
v5 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06338.html
v5.1:https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06359.html 
v6 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06772.html

Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  26 +
 hw/virtio/vhost-user.c| 137 +-
 2 files changed, 114 insertions(+), 49 deletions(-)

-- 
1.8.1.2




[Qemu-devel] [PATCH for-2.7 v7 2/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-08-05 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify QEMU when it has finished
setting up memory, or if there were errors doing so.

As an example:
(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application has not yet remapped the memory, but it sees the I/O 
request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

While a guaranteed fix would require a protocol extension (committed 
separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before 
replying.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 hw/virtio/vhost-user.c | 127 ++---
 1 file changed, 67 insertions(+), 60 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index b57454a..1a7d53c 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -263,66 +263,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-struct vhost_memory *mem)
-{
-int fds[VHOST_MEMORY_MAX_NREGIONS];
-int i, fd;
-size_t fd_num = 0;
-bool reply_supported = virtio_has_feature(dev->protocol_features,
-  VHOST_USER_PROTOCOL_F_REPLY_ACK);
-
-VhostUserMsg msg = {
-.request = VHOST_USER_SET_MEM_TABLE,
-.flags = VHOST_USER_VERSION,
-};
-
-if (reply_supported) {
-msg.flags |= VHOST_USER_NEED_REPLY_MASK;
-}
-
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-ram_addr_t offset;
-MemoryRegion *mr;
-
-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
- &offset);
-fd = memory_region_get_fd(mr);
-if (fd > 0) {
-msg.payload.memory.regions[fd_num].userspace_addr = 
reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr = 
reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-}
-}
-
-msg.payload.memory.nregions = fd_num;
-
-if (!fd_num) {
-error_report("Failed initializing vhost-user memory map, "
- "consider using -object memory-backend-file share=on");
-return -1;
-}
-
-msg.size = sizeof(msg.payload.memory.nregions);
-msg.size += sizeof(msg.payload.memory.padding);
-msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-if (vhost_user_write(dev, &msg, fds, fd_num) < 0) {
-return -1;
-}
-
-if (reply_supported) {
-return process_message_reply(dev, msg.request);
-}
-
-return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
  struct vhost_vring_addr *addr)
 {
@@ -537,6 +477,73 @@ static int vhost_user_get_features(struct vhost_dev *dev, 
uint64_t *features)
 return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+struct vhost_memory *mem)
+{
+int fds[VHOST_MEMORY_MAX_NREGIONS];
+int i, fd;
+size_t fd_num = 0;
+uint64_t features;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+  VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_MEM_TABLE,
+.flags = VHOST_USER_VERSION,
+};
+
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_REPLY_MASK;
+}
+
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+ram_addr_t offset;
+MemoryRegion *mr;
+
+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+ &offset);
+fd = memory_

Re: [Qemu-devel] [PATCH for-2.7 v5.1 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-30 Thread Prerna Saxena





On 30/07/16 2:19 am, "Eric Blake" <ebl...@redhat.com> wrote:

>On 07/28/2016 01:07 AM, Prerna Saxena wrote:
>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> 
>> This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.
>> 
>
>> +
>> +With this protocol extension negotiated, the sender (QEMU) can set the
>> +"need_reply" [Bit 3] flag to any command. This indicates that
>> +the client MUST respond with a Payload VhostUserMsg indicating success or
>> +failure. The payload should be set to zero on success or non-zero on 
>> failure.
>> +(Unless the message already has an explicit reply body)
>
>Rather than make this parenthetical, I would go with:
>
>The payload should be set to zero on success or non-zero on failure,
>unless the message already has an explicit reply body.

Hi Eric,
Thank you for taking a look, but I think you possibly missed the latest 
patchset posted last night.
This had already been incorporated in v6 that I’d posted last night before your 
message.
See https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06772.html


>
>> +
>> +This indicates to QEMU that the requested operation has deterministically
>> +been met or not. Today, QEMU is expected to terminate the main vhost-user
>
>Reads awkwardly; maybe:
>
>The response payload gives QEMU a deterministic indication of the result
>of the command.

Hmm, it is more of personal taste, so I’ll refrain from commenting either way.

>
>> +loop upon receiving such errors. In future, qemu could be taught to be more
>> +resilient for selective requests.
>> +
>> +For the message types that already solicit a reply from the client, the
>> +presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set 
>> brings
>> +no behaviourial change. (See the 'Communication' section for details.)
>
>s/behaviourial/behavioural/ (or if the document widely favors US
>spelling, behavioral)


The last 3 iterations of this patchset have only seen review comments focussed 
on documentation suggestions and indentation of code, but nothing on the 
idea/code itself. This gives me hope that the patch is possibly close to 
merging within 2.7 timeframe :-)
May I request the maintainers to please correct this tiny spelling typo as this 
is checked in?

Regards,
Prerna
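
For readers following the thread, the reply rule discussed above (payload zero
on success, non-zero on failure) can be sketched from the master's side roughly
as below. This is a simplified illustration and not the QEMU code: DemoMsg,
DEMO_REQ_SET_MEM_TABLE and check_ack() are hypothetical names, and the real
VhostUserMsg carries more fields than shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for VhostUserMsg. */
typedef struct {
    uint32_t request;   /* request type echoed back by the client */
    uint32_t flags;
    uint32_t size;
    uint64_t u64;       /* ack payload: zero on success, non-zero on failure */
} DemoMsg;

/* Arbitrary request id used only for this illustration. */
#define DEMO_REQ_SET_MEM_TABLE 5u

/*
 * Returns 0 if the reply acknowledges the expected request with a zero
 * payload, and -1 if the reply is for another request or reports failure.
 */
static int check_ack(const DemoMsg *reply, uint32_t expected_request)
{
    if (reply->request != expected_request) {
        return -1;
    }
    return reply->u64 ? -1 : 0;
}
```

A caller would read one message from the socket after sending a request with
need_reply set, then pass it to check_ack() along with the request type it
sent.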


[Qemu-devel] [PATCH for-2.7 v6 2/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-07-29 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify to QEMU when it finished
setting up memory, or if there were errors doing so.

As an example:
(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application has not yet remapped the memory, but it sees the I/O 
request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

While a guaranteed fix would require a protocol extension (committed 
separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before 
replying.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 hw/virtio/vhost-user.c | 125 ++---
 1 file changed, 67 insertions(+), 58 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 521a5db..53c37a6 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -254,64 +254,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-struct vhost_memory *mem)
-{
-int fds[VHOST_MEMORY_MAX_NREGIONS];
-int i, fd;
-size_t fd_num = 0;
-bool reply_supported = virtio_has_feature(dev->protocol_features,
-  VHOST_USER_PROTOCOL_F_REPLY_ACK);
-
-VhostUserMsg msg = {
-.request = VHOST_USER_SET_MEM_TABLE,
-.flags = VHOST_USER_VERSION,
-};
-
-if (reply_supported) {
-msg.flags |= VHOST_USER_NEED_REPLY_MASK;
-}
-
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-ram_addr_t offset;
-MemoryRegion *mr;
-
-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
- &offset);
-fd = memory_region_get_fd(mr);
-if (fd > 0) {
-msg.payload.memory.regions[fd_num].userspace_addr = 
reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr = 
reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-}
-}
-
-msg.payload.memory.nregions = fd_num;
-
-if (!fd_num) {
-error_report("Failed initializing vhost-user memory map, "
- "consider using -object memory-backend-file share=on");
-return -1;
-}
-
-msg.size = sizeof(msg.payload.memory.nregions);
-msg.size += sizeof(msg.payload.memory.padding);
-msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-vhost_user_write(dev, &msg, fds, fd_num);
-
-if (reply_supported) {
-return process_message_reply(dev, msg.request);
-}
-
-return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
  struct vhost_vring_addr *addr)
 {
@@ -514,6 +456,73 @@ static int vhost_user_get_features(struct vhost_dev *dev, 
uint64_t *features)
 return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+struct vhost_memory *mem)
+{
+int fds[VHOST_MEMORY_MAX_NREGIONS];
+int i, fd;
+size_t fd_num = 0;
+uint64_t features;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+  VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_MEM_TABLE,
+.flags = VHOST_USER_VERSION,
+};
+
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_REPLY_MASK;
+}
+
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+ram_addr_t offset;
+MemoryRegion *mr;
+
+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+ &offset);
+fd = memory_region_get_fd(mr);
+if (fd > 0) {
+ 

[Qemu-devel] [PATCH for-2.7 v6 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-29 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_reply" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.

Currently implemented only for SET_MEM_TABLE.

Reviewed-by: Marc-André Lureau <marcandre.lur...@redhat.com>
Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt | 26 ++
 hw/virtio/vhost-user.c| 32 
 2 files changed, 58 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..57a8357 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_reply flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+ details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +469,24 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+---
+The original vhost-user specification only demands replies for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed.
+
+With this protocol extension negotiated, the sender (QEMU) can set the
+"need_reply" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure,
+unless the message already has an explicit reply body.
+
+This indicates to QEMU that the requested operation has deterministically
+been met or not. Today, QEMU is expected to terminate the main vhost-user
+loop upon receiving such errors. In future, qemu could be taught to be more
+resilient for selective requests.
+
+For the message types that already solicit a reply from the client, the
+presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
+no behaviourial change. (See the 'Communication' section for details.)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..521a5db 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK (0x3)
 #define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_NEED_REPLY_MASK  (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -158,6 +160,25 @@ fail:
 return -1;
 }
 
+static int process_message_reply(struct vhost_dev *dev,
+ VhostUserRequest request)
+{
+VhostUserMsg msg;
+
+if (vhost_user_read(dev, &msg) < 0) {
+return -1;
+}
+
+if (msg.request != request) {
+error_report("Received unexpected msg type."
+ "Expected %d received %d",
+ request, msg.request);
+return -1;
+}
+
+return msg.payload.u64 ? -1 : 0;
+}
+
 static bool vhost_user_one_time_request(VhostUserRequest request)
 {
 switch (request) {
@@ -239,11 +260,18 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
 int fds[VHOST_MEMORY_MAX_NREGIONS];
 int i, fd;
 size_t fd_num = 0;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+  VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
 VhostUserMsg msg = {
 .request = VHOST_USER_SET_MEM_TABLE,
 .flags = VHOST_USER_VERSION,
 };
 
+if (reply_supported) {
+ 

[Qemu-devel] [PATCH for-2.7 v6 0/2] vhost-user: Extend protocol to receive replies on any command.

2016-07-29 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

vhost-user: Extend protocol to receive replies on any command.

The current vhost-user protocol requires the client to send reply to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- ie, did it succeed? If so, by what time?

This is inconvenient, and can also lead to races. As an example:

(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
application). Note that SET_MEM_TABLE does not require a reply according to the 
spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing 
applications.
Patch 1 introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This
feature, if negotiated, allows QEMU to request a reply to any message by setting
the newly introduced "need_reply" flag. The application must then respond to 
qemu
by providing a status about the requested operation.
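
As a rough illustration of the flag layout Patch 1 relies on, the sketch below
builds the flags word for a request. The mask values are the ones in the patch
(version in the lower 2 bits, reply flag in bit 2, need_reply in bit 3);
request_flags(), ack_requested() and version_of() are hypothetical helper
names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask values as they appear in the patch and the spec text. */
#define VHOST_USER_VERSION          (0x1)
#define VHOST_USER_VERSION_MASK     (0x3)
#define VHOST_USER_REPLY_MASK       (0x1 << 2)
#define VHOST_USER_NEED_REPLY_MASK  (0x1 << 3)

/* Hypothetical helper: flags for a request sent by QEMU. */
static uint32_t request_flags(bool reply_ack_negotiated)
{
    uint32_t flags = VHOST_USER_VERSION;

    /* Only ask for an ack if the feature was negotiated. */
    if (reply_ack_negotiated) {
        flags |= VHOST_USER_NEED_REPLY_MASK;
    }
    return flags;
}

/* Hypothetical helper: does the sender expect an ack for this message? */
static bool ack_requested(uint32_t flags)
{
    return (flags & VHOST_USER_NEED_REPLY_MASK) != 0;
}

/* Hypothetical helper: extract the protocol version from the flags. */
static uint32_t version_of(uint32_t flags)
{
    return flags & VHOST_USER_VERSION_MASK;
}
```

With the feature negotiated, request_flags(true) yields 0x9 (version 1 plus
bit 3); without it, the flags stay at 0x1 and existing clients see no change.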

Patch 2 adds a workaround for the race described above for clients that do not 
support REPLY_ACK
feature. It introduces a get_features command to be sent before returning from 
set_mem_table. While this is not a complete fix, it will help client 
applications that strictly process messages in order.
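
To show why the workaround helps only in-order clients, here is a toy model of
a backend that handles queued requests strictly in sequence. It is a sketch
under that assumption, not real backend code: DemoBackend, demo_backend_run()
and the DEMO_* request names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request types for the toy model. */
enum demo_req { DEMO_SET_MEM_TABLE, DEMO_GET_FEATURES };

typedef struct {
    bool mem_table_applied;
} DemoBackend;

/*
 * Process the queue strictly in order and return the number of replies sent.
 * *mem_ready_at_reply records whether the memory table had been applied at
 * the moment the first reply went out.
 */
static size_t demo_backend_run(DemoBackend *be, const enum demo_req *queue,
                               size_t n, bool *mem_ready_at_reply)
{
    size_t replies = 0;

    for (size_t i = 0; i < n; i++) {
        switch (queue[i]) {
        case DEMO_SET_MEM_TABLE:
            be->mem_table_applied = true;   /* no reply without REPLY_ACK */
            break;
        case DEMO_GET_FEATURES:
            if (replies == 0) {
                *mem_ready_at_reply = be->mem_table_applied;
            }
            replies++;                      /* this request always replies */
            break;
        }
    }
    return replies;
}

/* SET_MEM_TABLE followed by GET_FEATURES, as sent by the workaround. */
static bool demo_mem_ready_when_features_reply(void)
{
    const enum demo_req queue[] = { DEMO_SET_MEM_TABLE, DEMO_GET_FEATURES };
    DemoBackend be = { .mem_table_applied = false };
    bool ready = false;
    size_t replies = demo_backend_run(&be, queue, 2, &ready);

    return replies == 1 && ready;
}
```

In this model the GET_FEATURES reply can only be produced after SET_MEM_TABLE
has been handled. A client that dispatches messages to separate threads or
reorders them gets no such guarantee, which is why the cover letter calls this
a best-effort measure.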

Changelog:
--
Changes v5.1 -> v6:
1) Patch 1 : fixed some minor indentation issues and a really tiny 
documentation change.
2) Patch 2 : unchanged.

Changes v5->v5.1 :
1) Patch 1 : no change
2) Patch 2 : fixes a tiny typo I'd accidentally introduced while creating v5 
from v4. The code itself is unchanged from v4.

Changes v4->v5:
1) Patch 1 :
* Reword 'response' to 'reply' on public demand.
* Documentation is more concise.
Patch 2 : unchanged

Changes v3->v4:
1) Rearranged code in PATCH 1 to offset compiler warnings about missing 
declaration of vhost_user_read(). Fixed by moving process_message_reply() after 
definition of vhost_user_read()
2) Fixed minor suggestions in writeup for this protocol extension.

Changes v2->v3:
1) Swapped the patch numbers 1 & 2 from the previous series.
2) Patch 1 (previously patch 2 in v2): addresses MarcAndre's review comments 
and renames function 'process_message_response' to 'process_message_reply'
3) Patch 2 (ie patch 1 in v2) : Unchanged from v2.

Changes v1->v2:
1) Patch 1 : Ask for get_features before returning from set_mem_table(new).
2) Patch 2 : * Improve documentation.
  * Abstract out commonly used operations in the form of a function, 
process_message_response(). Also implement this only for SET_MEM_TABLE.

References:
v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html
v3 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg01598.html
v4 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html
v5 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06338.html
v5.1:https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06359.html 

Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  26 +
 hw/virtio/vhost-user.c| 135 ++
 2 files changed, 114 insertions(+), 47 deletions(-)

-- 
1.8.1.2




Re: [Qemu-devel] [PATCH v4 2/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-07-28 Thread Prerna Saxena
On 27/07/16 7:00 pm, "Michael S. Tsirkin" <m...@redhat.com> wrote:



>On Wed, Jul 27, 2016 at 02:52:37AM -0700, Prerna Saxena wrote:
>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> 
>> The set_mem_table command currently does not seek a reply. Hence, there is
>> no easy way for a remote application to notify to QEMU when it finished
>> setting up memory, or if there were errors doing so.
>> 
>> As an example:
>> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
>> application). SET_MEM_TABLE does not require a reply according to the spec.
>> (2) Qemu commits the memory to the guest.
>> (3) Guest issues an I/O operation over a new memory region which was 
>> configured on (1).
>> (4) The application has not yet remapped the memory, but it sees the I/O 
>> request.
>> (5) The application cannot satisfy the request because it does not know 
>> about those GPAs.
>> 
>> While a guaranteed fix would require a protocol extension (committed 
>> separately),
>> a best-effort workaround for existing applications is to send a GET_FEATURES
>> message before completing the vhost_user_set_mem_table() call.
>> Since GET_FEATURES requires a reply, an application that processes vhost-user
>> messages synchronously would probably have completed the SET_MEM_TABLE 
>> before replying.
>> 
>> Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
>
>Could you pls reorder patchset so this is 1/2?
>1/1 is still under review but I'd like to make sure
>we have some kind of fix in place for 2.7.

Hi Michael,
The review comments for patch 1 were around documentation and the choice of 
name of flag.
There has been no recommendation/comment on the code itself.
I have fixed all of that and posted a new patch series. (Version v5.1)
Hope both the patches make it in time for 2.7.

Thanks, once again, for reviewing this.

Regards,
Prerna


[Qemu-devel] [PATCH for-2.7 v5.1 2/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-07-28 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify to QEMU when it finished
setting up memory, or if there were errors doing so.

As an example:
(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application has not yet remapped the memory, but it sees the I/O 
request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

While a guaranteed fix would require a protocol extension (committed 
separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before 
replying.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 hw/virtio/vhost-user.c | 123 ++---
 1 file changed, 65 insertions(+), 58 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 86e7ae0..d0dafa0 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -254,64 +254,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-struct vhost_memory *mem)
-{
-int fds[VHOST_MEMORY_MAX_NREGIONS];
-int i, fd;
-size_t fd_num = 0;
-bool reply_supported = virtio_has_feature(dev->protocol_features,
-VHOST_USER_PROTOCOL_F_REPLY_ACK);
-
-VhostUserMsg msg = {
-.request = VHOST_USER_SET_MEM_TABLE,
-.flags = VHOST_USER_VERSION,
-};
-
-if (reply_supported) {
-msg.flags |= VHOST_USER_NEED_REPLY_MASK;
-}
-
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-ram_addr_t offset;
-MemoryRegion *mr;
-
-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
- &offset);
-fd = memory_region_get_fd(mr);
-if (fd > 0) {
-msg.payload.memory.regions[fd_num].userspace_addr = 
reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr = 
reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-}
-}
-
-msg.payload.memory.nregions = fd_num;
-
-if (!fd_num) {
-error_report("Failed initializing vhost-user memory map, "
- "consider using -object memory-backend-file share=on");
-return -1;
-}
-
-msg.size = sizeof(msg.payload.memory.nregions);
-msg.size += sizeof(msg.payload.memory.padding);
-msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-vhost_user_write(dev, &msg, fds, fd_num);
-
-if (reply_supported) {
-return process_message_reply(dev, msg.request);
-}
-
-return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
  struct vhost_vring_addr *addr)
 {
@@ -514,6 +456,71 @@ static int vhost_user_get_features(struct vhost_dev *dev, 
uint64_t *features)
 return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+struct vhost_memory *mem)
+{
+int fds[VHOST_MEMORY_MAX_NREGIONS];
+int i, fd;
+size_t fd_num = 0;
+uint64_t features;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_MEM_TABLE,
+.flags = VHOST_USER_VERSION,
+};
+
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_REPLY_MASK;
+}
+
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+ram_addr_t offset;
+MemoryRegion *mr;
+
+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+ &offset);
+fd = memory_region_get_fd(mr);
+if (fd > 0) {
+msg.payload.memory.regions[fd

[Qemu-devel] [PATCH for-2.7 v5.1 0/2] vhost-user: Extend protocol to receive replies on any command.

2016-07-28 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

vhost-user: Extend protocol to receive replies on any command.

The current vhost-user protocol requires the client to send reply to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- ie, did it succeed? If so, by what time?

This is inconvenient, and can also lead to races. As an example:

(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
application). Note that SET_MEM_TABLE does not require a reply according to the 
spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing 
applications.
Patch 1 introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This
feature, if negotiated, allows QEMU to request a reply to any message by setting
the newly introduced "need_reply" flag. The application must then respond to 
qemu
by providing a status about the requested operation.
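
On the application side, the status reply described above might be produced
along the following lines. This is only a sketch: DemoMsg and build_ack() are
hypothetical, the struct is a cut-down stand-in for VhostUserMsg, and clearing
need_reply in the reply is a choice made for this example rather than
something the text above mandates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask values as given in Patch 1. */
#define VHOST_USER_VERSION_MASK     (0x3)
#define VHOST_USER_REPLY_MASK       (0x1 << 2)
#define VHOST_USER_NEED_REPLY_MASK  (0x1 << 3)

/* Hypothetical, cut-down stand-in for VhostUserMsg. */
typedef struct {
    uint32_t request;
    uint32_t flags;
    uint32_t size;      /* payload size in bytes */
    uint64_t u64;       /* ack payload */
} DemoMsg;

/*
 * Fill in *reply for a request the application has just handled.
 * status is 0 if the operation succeeded and non-zero otherwise.
 * Returns false when the sender did not set need_reply, in which
 * case no ack should be sent.
 */
static bool build_ack(const DemoMsg *req, int status, DemoMsg *reply)
{
    if (!(req->flags & VHOST_USER_NEED_REPLY_MASK)) {
        return false;
    }

    reply->request = req->request;
    /* Keep the version bits, mark the message as a reply. */
    reply->flags = (req->flags & VHOST_USER_VERSION_MASK) |
                   VHOST_USER_REPLY_MASK;
    reply->size = sizeof(reply->u64);
    reply->u64 = status ? 1 : 0;    /* zero on success, non-zero on error */
    return true;
}
```

An application would call build_ack() only after the requested operation (for
example, remapping memory for SET_MEM_TABLE) has fully completed, so that the
ack tells qemu both the outcome and that the work is done.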

Patch 2 adds a workaround for the race described above for clients that do not 
support REPLY_ACK
feature. It introduces a get_features command to be sent before returning from 
set_mem_table. While this is not a complete fix, it will help client 
applications that strictly process messages in order.

Changelog:
--
Changes v5->v5.1 :
1) Patch 1 : no change
2) Patch 2 : fixes a tiny typo I'd accidentally introduced while creating v5 
from v4. The code itself is unchanged from v4.

Changes v4->v5:
1) Patch 1 :
* Reword 'response' to 'reply' on public demand.
* Documentation is more concise.
Patch 2 : unchanged

Changes v3->v4:
1) Rearranged code in PATCH 1 to offset compiler warnings about missing 
declaration of vhost_user_read(). Fixed by moving process_message_reply() after 
definition of vhost_user_read()
2) Fixed minor suggestions in writeup for this protocol extension.

Changes v2->v3:
1) Swapped the patch numbers 1 & 2 from the previous series.
2) Patch 1 (previously patch 2 in v2): addresses MarcAndre's review comments 
and renames function 'process_message_response' to 'process_message_reply'
3) Patch 2 (ie patch 1 in v2) : Unchanged from v2.

Changes v1->v2:
1) Patch 1 : Ask for get_features before returning from set_mem_table(new).
2) Patch 2 : * Improve documentation.
  * Abstract out commonly used operations in the form of a function, 
process_message_response(). Also implement this only for SET_MEM_TABLE.

References:
v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html
v3 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg01598.html
v4 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html

Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  44 +++
 hw/virtio/vhost-user.c| 133 ++
 2 files changed, 130 insertions(+), 47 deletions(-)

-- 
1.8.1.2



[Qemu-devel] [PATCH for-2.7 v5.1 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-28 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_reply" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.

Currently implemented only for SET_MEM_TABLE.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt | 26 ++
 hw/virtio/vhost-user.c| 32 
 2 files changed, 58 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..54b5c8f 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_reply flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+ details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +469,24 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+---
+The original vhost-user specification only demands replies for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed.
+
+With this protocol extension negotiated, the sender (QEMU) can set the
+"need_reply" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure.
+(Unless the message already has an explicit reply body)
+
+This indicates to QEMU that the requested operation has deterministically
+been met or not. Today, QEMU is expected to terminate the main vhost-user
+loop upon receiving such errors. In future, qemu could be taught to be more
+resilient for selective requests.
+
+For the message types that already solicit a reply from the client, the
+presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
+no behaviourial change. (See the 'Communication' section for details.)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..86e7ae0 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK (0x3)
 #define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_NEED_REPLY_MASK   (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -158,6 +160,25 @@ fail:
 return -1;
 }
 
+static int process_message_reply(struct vhost_dev *dev,
+VhostUserRequest request)
+{
+VhostUserMsg msg;
+
+if (vhost_user_read(dev, &msg) < 0) {
+return 0;
+}
+
+if (msg.request != request) {
+error_report("Received unexpected msg type."
+"Expected %d received %d",
+request, msg.request);
+return -1;
+}
+
+return msg.payload.u64 ? -1 : 0;
+}
+
 static bool vhost_user_one_time_request(VhostUserRequest request)
 {
 switch (request) {
@@ -239,11 +260,18 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
 int fds[VHOST_MEMORY_MAX_NREGIONS];
 int i, fd;
 size_t fd_num = 0;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
 VhostUserMsg msg = {
 .request = VHOST_USER_SET_MEM_TABLE,
 .flags = VHOST_USER_VERSION,
 };
 
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_REPLY_MASK;
+}
+
 for 

Re: [Qemu-devel] [PATCH v4 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-28 Thread Prerna Saxena





On 27/07/16 6:58 pm, "Michael S. Tsirkin" <m...@redhat.com> wrote:

>On Wed, Jul 27, 2016 at 12:56:18PM +, Prerna Saxena wrote:
>> Hi Marc,
>> Thanks, please find my reply inline.
>> 
>> 
>> 
>> 
>> 
>> On 27/07/16 4:35 pm, "Marc-André Lureau" <marcandre.lur...@gmail.com> wrote:
>> 
>> >Hi
>> >
>> >On Wed, Jul 27, 2016 at 1:52 PM, Prerna Saxena <saxenap@gmail.com> 
>> >wrote:
>> >> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> >>
>> >> This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.
>> >>
>> >> If negotiated, client applications should send a u64 payload in
>> >> response to any message that contains the "need_response" bit set
>> >> on the message flags. Setting the payload to "zero" indicates the
>> >> command finished successfully. Likewise, setting it to "non-zero"
>> >> indicates an error.
>> >>
>> >> Currently implemented only for SET_MEM_TABLE.
>> >>
>> >> Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
>> >> ---
>> >>  docs/specs/vhost-user.txt | 41 +
>> >>  hw/virtio/vhost-user.c| 32 
>> >>  2 files changed, 73 insertions(+)
>> >>
>> >> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
>> >> index 777c49c..57df586 100644
>> >> --- a/docs/specs/vhost-user.txt
>> >> +++ b/docs/specs/vhost-user.txt
>> >> @@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
>> >>   * Flags: 32-bit bit field:
>> >> - Lower 2 bits are the version (currently 0x01)
>> >> - Bit 2 is the reply flag - needs to be sent on each reply from the 
>> >> slave
>> >> +   - Bit 3 is the need_response flag - see 
>> >> VHOST_USER_PROTOCOL_F_REPLY_ACK for
>> >> + details.
>> >
>> >Why need_response and not "need reply"?
>> 
>> (I’d already pointed this out earlier, but it looks like I was possibly not 
>> very clear.)
>> Before deciding on the right name for Bit 3, let us look at the nomenclature for 
>> Bit 2 above: "Bit 2 is the reply flag - needs to be sent on each reply from 
>> the slave".
>> So we already have a _reply_ flag in use. If we name Bit 3 the 
>> _need_reply_ flag, don’t you think it would be ultra-confusing? I found it 
>> confusing when I reviewed the documentation with this different term.
>> So I chose the name need_response with much deliberation: it conveys the 
>> essence of what this flag is meant to achieve, without adding to the confusion.
>
>I don't see confusion, I think I agree with Marc André.


All right. I have posted a new series with the reworded terminology and updated
(more concise) documentation.

Regards,
Prerna


[Qemu-devel] [PATCH for-2.7 v5 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-28 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_reply" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.

Currently implemented only for SET_MEM_TABLE.
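
For illustration only, the backend side of this exchange might look roughly
like the sketch below. It is not part of the patch: the struct and helper
names (SketchMsg, sketch_build_ack) and the simplified message layout are
assumptions made for this example. Only the bit positions (version in the
low 2 bits, reply flag at bit 2, need_reply at bit 3) and the u64 status
convention come from the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flag layout as described in the spec text; names are hypothetical. */
#define SKETCH_VERSION          0x1
#define SKETCH_REPLY_MASK       (0x1 << 2)
#define SKETCH_NEED_REPLY_MASK  (0x1 << 3)

/* Simplified stand-in for a vhost-user message with a u64 payload. */
typedef struct SketchMsg {
    uint32_t request;
    uint32_t flags;
    uint32_t size;   /* size of the payload that follows the header */
    uint64_t u64;    /* 0 on success, non-zero on failure */
} SketchMsg;

/*
 * Build the status reply for a request that has need_reply set.
 * Returns 1 if a reply was built, 0 if the request did not ask for one.
 */
static int sketch_build_ack(const SketchMsg *req, int status, SketchMsg *reply)
{
    assert(req != NULL && reply != NULL);

    if (!(req->flags & SKETCH_NEED_REPLY_MASK)) {
        return 0;
    }

    memset(reply, 0, sizeof(*reply));
    reply->request = req->request;               /* echo the request type */
    reply->flags = SKETCH_VERSION | SKETCH_REPLY_MASK;
    reply->size = sizeof(reply->u64);
    reply->u64 = status ? 1 : 0;                 /* non-zero signals an error */
    return 1;
}
```

A backend would call such a helper once it has finished handling the
command, so that the reply doubles as a completion notification.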

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt | 26 ++
 hw/virtio/vhost-user.c| 32 
 2 files changed, 58 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..54b5c8f 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_reply flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+ details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +469,24 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+---
+The original vhost-user specification only demands replies for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed.
+
+With this protocol extension negotiated, the sender (QEMU) can set the
+"need_reply" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure.
+(Unless the message already has an explicit reply body)
+
+This indicates to QEMU that the requested operation has deterministically
+been met or not. Today, QEMU is expected to terminate the main vhost-user
+loop upon receiving such errors. In future, qemu could be taught to be more
+resilient for selective requests.
+
+For the message types that already solicit a reply from the client, the
+presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
+no behavioural change. (See the 'Communication' section for details.)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..86e7ae0 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK (0x3)
 #define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_NEED_REPLY_MASK   (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -158,6 +160,25 @@ fail:
 return -1;
 }
 
+static int process_message_reply(struct vhost_dev *dev,
+VhostUserRequest request)
+{
+VhostUserMsg msg;
+
+if (vhost_user_read(dev, &msg) < 0) {
+return 0;
+}
+
+if (msg.request != request) {
+error_report("Received unexpected msg type."
+"Expected %d received %d",
+request, msg.request);
+return -1;
+}
+
+return msg.payload.u64 ? -1 : 0;
+}
+
 static bool vhost_user_one_time_request(VhostUserRequest request)
 {
 switch (request) {
@@ -239,11 +260,18 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
 int fds[VHOST_MEMORY_MAX_NREGIONS];
 int i, fd;
 size_t fd_num = 0;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
 VhostUserMsg msg = {
 .request = VHOST_USER_SET_MEM_TABLE,
 .flags = VHOST_USER_VERSION,
 };
 
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_REPLY_MASK;
+}
+
 for 

[Qemu-devel] [PATCH for-2.7 v5 0/2] vhost-user: Extend protocol to receive replies on any command.

2016-07-28 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>


vhost-user: Extend protocol to receive replies on any command.

The current vhost-user protocol requires the client to send a reply to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- i.e., did it succeed? If so, by what time?

This is inconvenient, and can also lead to races. As an example:

(1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net
application). Note that SET_MEM_TABLE does not require a reply according to the
spec.
(2) QEMU commits the memory to the guest.
(3) The guest issues an I/O operation over a new memory region which was
configured in (1).
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about
those GPAs.

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing
applications.
Patch 1 introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This
feature, if negotiated, allows QEMU to request a reply to any message by setting
the newly introduced "need_reply" flag. The application must then respond to
QEMU with the status of the requested operation.
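
As a rough sketch of the QEMU side of this handshake (not the patch itself):
the helper names below are hypothetical, and the feature bits are passed in
as a plain integer rather than read from a vhost_dev. The bit positions and
the "zero means success" rule are the ones described in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the spec text in this series; names are hypothetical. */
#define SKETCH_VERSION              0x1
#define SKETCH_VERSION_MASK         0x3
#define SKETCH_NEED_REPLY_MASK      (0x1 << 3)
#define SKETCH_PROTOCOL_F_REPLY_ACK 3

/* Choose the request flags based on the negotiated protocol features. */
static uint32_t sketch_request_flags(uint64_t protocol_features)
{
    uint32_t flags = SKETCH_VERSION;
    bool reply_supported =
        (protocol_features & (1ULL << SKETCH_PROTOCOL_F_REPLY_ACK)) != 0;

    /* The version must fit in the low 2 bits, below the flag bits. */
    assert((flags & ~(uint32_t)SKETCH_VERSION_MASK) == 0);

    if (reply_supported) {
        flags |= SKETCH_NEED_REPLY_MASK;
    }
    return flags;
}

/* Map the backend's reply to 0 (success) or -1 (failure). */
static int sketch_check_reply(uint32_t sent_request, uint32_t reply_request,
                              uint64_t reply_payload)
{
    if (reply_request != sent_request) {
        return -1;              /* reply belongs to a different request */
    }
    return reply_payload ? -1 : 0;
}
```

If REPLY_ACK was not negotiated, the flag is never set and the sender simply
does not wait, which is what keeps existing applications working.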

Patch 2 adds a workaround for the race described above for clients that do not
support the REPLY_ACK feature. It introduces a get_features command to be sent
before returning from set_mem_table. While this is not a complete fix, it will
help client applications that strictly process messages in order.
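
To show why the workaround helps only in-order clients, here is a toy model
(an assumption-laden sketch, not QEMU code): the backend is reduced to a FIFO
of request ids, and all names (SketchBackend, sketch_send, ...) are
hypothetical. Because the queue is drained strictly in order, seeing the
GET_FEATURES reply implies SET_MEM_TABLE was already handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_SET_MEM_TABLE = 1, SKETCH_GET_FEATURES = 2 };

#define SKETCH_QUEUE_LEN 8

/* Toy backend that handles queued requests strictly in arrival order. */
typedef struct SketchBackend {
    int queue[SKETCH_QUEUE_LEN];
    size_t head, tail;
    bool mem_table_applied;
} SketchBackend;

static void sketch_send(SketchBackend *b, int request)
{
    assert(b->tail < SKETCH_QUEUE_LEN);
    b->queue[b->tail++] = request;
}

/*
 * Drain the queue in order until the backend produces a GET_FEATURES reply.
 * Returns true if such a reply was produced.
 */
static bool sketch_wait_for_features_reply(SketchBackend *b)
{
    while (b->head < b->tail) {
        int request = b->queue[b->head++];

        if (request == SKETCH_SET_MEM_TABLE) {
            b->mem_table_applied = true;    /* no reply for this command */
        } else if (request == SKETCH_GET_FEATURES) {
            return true;                    /* this command always replies */
        }
    }
    return false;
}

/* SET_MEM_TABLE followed by GET_FEATURES used as a barrier. */
static bool sketch_set_mem_table_with_barrier(SketchBackend *b)
{
    sketch_send(b, SKETCH_SET_MEM_TABLE);
    sketch_send(b, SKETCH_GET_FEATURES);
    return sketch_wait_for_features_reply(b) && b->mem_table_applied;
}
```

A backend that handles messages out of order, or applies the memory table
asynchronously, would break this assumption, which is why this is only a
best-effort workaround.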

Changelog:
--
Changes v4->v5:
1) Patch 1 :
* Reword 'response' to 'reply' on public demand.
* Documentation is more concise.
Patch 2 : unchanged

Changes v3->v4:
1) Rearranged code in PATCH 1 to offset compiler warnings about missing 
declaration of vhost_user_read(). Fixed by moving process_message_reply() after 
definition of vhost_user_read()
2) Fixed minor suggestions in writeup for this protocol extension.

Changes v2->v3:
1) Swapped the patch numbers 1 & 2 from the previous series.
2) Patch 1 (previously patch 2 in v2): addresses MarcAndre's review comments 
and renames function 'process_message_response' to 'process_message_reply'
3) Patch 2 (ie patch 1 in v2) : Unchanged from v2.

Changes v1->v2:
1) Patch 1 : Ask for get_features before returning from set_mem_table(new).
2) Patch 2 : * Improve documentation.
  * Abstract out commonly used operations in the form of a function, 
process_message_response(). Also implement this only for SET_MEM_TABLE.

References:
v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html
v3 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg01598.html
v4 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06173.html

Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  44 +++
 hw/virtio/vhost-user.c| 133 ++
 2 files changed, 130 insertions(+), 47 deletions(-)

-- 
1.8.1.2


Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  26 +
 hw/virtio/vhost-user.c| 132 +-
 2 files changed, 111 insertions(+), 47 deletions(-)

-- 
1.8.1.2




[Qemu-devel] [PATCH for-2.7 v5 2/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-07-28 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify QEMU when it has finished
setting up memory, or if there were errors doing so.

As an example:
(1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) QEMU commits the memory to the guest.
(3) The guest issues an I/O operation over a new memory region which was
configured in (1).
(4) The application has not yet remapped the memory, but it sees the I/O
request.
(5) The application cannot satisfy the request because it does not know about
those GPAs.

While a guaranteed fix would require a protocol extension (committed
separately), a best-effort workaround for existing applications is to send a
GET_FEATURES message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before
replying.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 hw/virtio/vhost-user.c | 122 ++---
 1 file changed, 64 insertions(+), 58 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 86e7ae0..2fc7f25 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -254,64 +254,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-struct vhost_memory *mem)
-{
-int fds[VHOST_MEMORY_MAX_NREGIONS];
-int i, fd;
-size_t fd_num = 0;
-bool reply_supported = virtio_has_feature(dev->protocol_features,
-VHOST_USER_PROTOCOL_F_REPLY_ACK);
-
-VhostUserMsg msg = {
-.request = VHOST_USER_SET_MEM_TABLE,
-.flags = VHOST_USER_VERSION,
-};
-
-if (reply_supported) {
-msg.flags |= VHOST_USER_NEED_REPLY_MASK;
-}
-
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-ram_addr_t offset;
-MemoryRegion *mr;
-
-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
- &offset);
-fd = memory_region_get_fd(mr);
-if (fd > 0) {
-msg.payload.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-}
-}
-
-msg.payload.memory.nregions = fd_num;
-
-if (!fd_num) {
-error_report("Failed initializing vhost-user memory map, "
- "consider using -object memory-backend-file share=on");
-return -1;
-}
-
-msg.size = sizeof(msg.payload.memory.nregions);
-msg.size += sizeof(msg.payload.memory.padding);
-msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-vhost_user_write(dev, &msg, fds, fd_num);
-
-if (reply_supported) {
-return process_message_reply(dev, msg.request);
-}
-
-return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
  struct vhost_vring_addr *addr)
 {
@@ -514,6 +456,70 @@ static int vhost_user_get_features(struct vhost_dev *dev, uint64_t *features)
 return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+struct vhost_memory *mem)
+{
+int fds[VHOST_MEMORY_MAX_NREGIONS];
+int i, fd;
+size_t fd_num = 0;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_MEM_TABLE,
+.flags = VHOST_USER_VERSION,
+};
+
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_REPLY_MASK;
+}
+
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+ram_addr_t offset;
+MemoryRegion *mr;
+
+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+ &offset);
+fd = memory_region_get_fd(mr);
+if (fd > 0) {
+msg.payload.memory.regions[fd_num].userspace_addr

Re: [Qemu-devel] [PATCH v4 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-27 Thread Prerna Saxena
Hi Marc,
Thanks, please find my reply inline.





On 27/07/16 4:35 pm, "Marc-André Lureau" <marcandre.lur...@gmail.com> wrote:

>Hi
>
>On Wed, Jul 27, 2016 at 1:52 PM, Prerna Saxena <saxenap@gmail.com> wrote:
>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>>
>> This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.
>>
>> If negotiated, client applications should send a u64 payload in
>> response to any message that contains the "need_response" bit set
>> on the message flags. Setting the payload to "zero" indicates the
>> command finished successfully. Likewise, setting it to "non-zero"
>> indicates an error.
>>
>> Currently implemented only for SET_MEM_TABLE.
>>
>> Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
>> ---
>>  docs/specs/vhost-user.txt | 41 +
>>  hw/virtio/vhost-user.c| 32 
>>  2 files changed, 73 insertions(+)
>>
>> diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
>> index 777c49c..57df586 100644
>> --- a/docs/specs/vhost-user.txt
>> +++ b/docs/specs/vhost-user.txt
>> @@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
>>   * Flags: 32-bit bit field:
>> - Lower 2 bits are the version (currently 0x01)
>> - Bit 2 is the reply flag - needs to be sent on each reply from the slave
>> +   - Bit 3 is the need_response flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK 
>> for
>> + details.
>
>Why need_response and not "need reply"?

(I’d already pointed this out earlier, but it looks like I was possibly not very 
clear.)
Before deciding on the right name for Bit 3, let us look at the nomenclature for 
Bit 2 above: "Bit 2 is the reply flag - needs to be sent on each reply from 
the slave".
So we already have a _reply_ flag in use. If we name Bit 3 the _need_reply_ 
flag, don’t you think it would be ultra-confusing? I found it confusing when 
I reviewed the documentation with this different term.
So I chose the name need_response with much deliberation: it conveys the 
essence of what this flag is meant to achieve, without adding to the confusion.

>
>btw, I wonder if it would be worth to introduce an enum at this point
>
>>   * Size - 32-bit size of the payload
>>
>>
>> @@ -126,6 +128,8 @@ the ones that do:
>>   * VHOST_GET_VRING_BASE
>>   * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
>>
>> +[ Also see the section on REPLY_ACK protocol extension. ]
>> +
>>  There are several messages that the master sends with file descriptors 
>> passed
>>  in the ancillary data:
>>
>> @@ -254,6 +258,7 @@ Protocol features
>>  #define VHOST_USER_PROTOCOL_F_MQ 0
>>  #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
>>  #define VHOST_USER_PROTOCOL_F_RARP   2
>> +#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
>>
>>  Message types
>>  -
>> @@ -464,3 +469,39 @@ Message types
>>is present in VHOST_USER_GET_PROTOCOL_FEATURES.
>>The first 6 bytes of the payload contain the mac address of the guest 
>> to
>>allow the vhost user backend to construct and broadcast the fake RARP.
>> +
>> +VHOST_USER_PROTOCOL_F_REPLY_ACK:
>> +---
>> +The original vhost-user specification only demands responses for certain
>
>responses/replies

If you feel strongly about it, will change it here.

>
>> +commands. This differs from the vhost protocol implementation where commands
>> +are sent over an ioctl() call and block until the client has completed.
>> +
>> +With this protocol extension negotiated, the sender (QEMU) can set the newly
>> +introduced "need_response" [Bit 3] flag to any command. This indicates that
>
>need reply, you can remove the "newly introduced" (it's not going to
>be so new after a while)

* need_reply = no I don’t agree, for reasons cited earlier.
* remove the “newly introduced” phrase = agree, will do.

>
>> +the client MUST respond with a Payload VhostUserMsg indicating success or
>
>I would put right here for clarity:
>
>...MUST respond with a Payload VhostUserMsg (unless the message has
>already an explicit reply body)...
>
>alternatively, I would forbid using the bit 3 on commands that have
>already an explicit reply.

I don’t currently have any code that raises an error for such cases.
The implementation silently ignores it.

>
>> +failure. The payload should be set to zero on success or non-zero on 
>> failure.
>> +In other words, r

Re: [Qemu-devel] [PATCH v3 0/2] vhost-user: Extend protocol to receive replies on any command.

2016-07-27 Thread Prerna Saxena





On 27/07/16 9:51 am, "Michael S. Tsirkin" <m...@redhat.com> wrote:

>On Mon, Jul 25, 2016 at 02:27:18PM +0400, Marc-André Lureau wrote:
>> Hi
>> 
>> On Mon, Jul 25, 2016 at 10:41 AM, Prerna <saxenap@gmail.com> wrote:
>> >
>> >
>> > On Thu, Jul 7, 2016 at 12:04 PM, Prerna Saxena <saxenap@gmail.com>
>> > wrote:
>> >>
>> >> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> >>
>> >> The current vhost-user protocol requires the client to send responses to
>> >> only a
>> >> few commands. For the remaining commands, it is impossible for QEMU to
>> >> know the
>> >> status of the requested operation -- ie, did it succeed? If so, by what
>> >> time?
>> >>
>> >> This is inconvenient, and can also lead to races. As an example:
>> >>  [..snip..]
>> >>
>> >> References:
>> >> v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
>> >> v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html
>> >>
>> >>
>> >> Prerna Saxena (2):
>> >>   vhost-user: Introduce a new protocol feature REPLY_ACK.
>> >>   vhost-user: Attempt to fix a race with set_mem_table.
>> >>
>> >>  docs/specs/vhost-user.txt |  44 +++
>> >>  hw/virtio/vhost-user.c| 133
>> >> ++
>> >>  2 files changed, 130 insertions(+), 47 deletions(-)
>> >>
>> >
>> > Ping !
>> > Michael, MarcAndre, Did you have a chance to look at this patch series?
>> >
>> 
>> That's not going to make it in 2.7 I am afraid.
>
>It's a bugfix so - depends on how quickly can comments be addressed.
>
>-- 
>MST

Thanks Michael, Marc, 
I just posted a v4 addressing the review comments. Both make-check and 
compilation run to completion.

Marc,
I addressed part of your suggestion on documentation. However, I have been 
reminded in the past about being more verbose while describing the change : 
<https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07428.html>

Hope this patch series is in time for 2.7 :-)

Regards,
Prerna


[Qemu-devel] [PATCH v4 0/2] vhost-user: Extend protocol to receive replies on any command.

2016-07-27 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

vhost-user: Extend protocol to receive replies on any command.

The current vhost-user protocol requires the client to send responses to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- i.e., did it succeed? If so, by what time?

This is inconvenient, and can also lead to races. As an example:

(1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net
application). Note that SET_MEM_TABLE does not require a reply according to the
spec.
(2) QEMU commits the memory to the guest.
(3) The guest issues an I/O operation over a new memory region which was
configured in (1).
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about
those GPAs.

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing
applications.
Patch 1 introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This
feature, if negotiated, allows QEMU to request a response to any message by
setting the newly introduced "need_response" flag. The application must then
respond to QEMU with the status of the requested operation.

Patch 2 adds a workaround for the race described above for clients that do not
support the REPLY_ACK feature. It introduces a get_features command to be sent
before returning from set_mem_table. While this is not a complete fix, it will
help client applications that strictly process messages in order.

Changelog:
--
Changes v3->v4:
1) Rearranged code in PATCH 1 to offset compiler warnings about missing 
declaration of vhost_user_read(). Fixed by moving process_message_reply() after 
definition of vhost_user_read()
2) Fixed minor suggestions in writeup for this protocol extension.

Changes v2->v3:
1) Swapped the patch numbers 1 & 2 from the previous series.
2) Patch 1 (previously patch 2 in v2): addresses MarcAndre's review comments 
and renames function 'process_message_response' to 'process_message_reply'
3) Patch 2 (ie patch 1 in v2) : Unchanged from v2.

Changes v1->v2:
1) Patch 1 : Ask for get_features before returning from set_mem_table(new).
2) Patch 2 : * Improve documentation.
  * Abstract out commonly used operations in the form of a function, 
process_message_response(). Also implement this only for SET_MEM_TABLE.

References:
v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html
v3 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg01598.html

Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  44 +++
 hw/virtio/vhost-user.c| 133 ++
 2 files changed, 130 insertions(+), 47 deletions(-)

-- 
1.8.1.2

Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  41 ++
 hw/virtio/vhost-user.c| 133 ++
 2 files changed, 127 insertions(+), 47 deletions(-)

-- 
1.8.1.2




[Qemu-devel] [PATCH v4 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-27 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_response" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.

Currently implemented only for SET_MEM_TABLE.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt | 41 +
 hw/virtio/vhost-user.c| 32 
 2 files changed, 73 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..57df586 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_response flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+ details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +469,39 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+---
+The original vhost-user specification only demands responses for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed.
+
+With this protocol extension negotiated, the sender (QEMU) can set the newly
+introduced "need_response" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure.
+In other words, response must be in the following format :
+
+
+| request | flags | size | payload |
+
+
+ * Request: 32-bit type of the request
+ * Flags: 32-bit bit field:
+ * Size: size of the payload ( see below)
+ * Payload : a u64 integer, where a non-zero value indicates a failure.
+
+This indicates to QEMU that the requested operation has deterministically
+been met or not. Today, QEMU is expected to terminate the main vhost-user
+loop upon receiving such errors. In future, qemu could be taught to be more
+resilient for selective requests.
+
+Note that as per the original vhost-user protocol, the following four messages
+anyway require distinct responses from the vhost-user client process:
+ * VHOST_GET_FEATURES
+ * VHOST_GET_PROTOCOL_FEATURES
+ * VHOST_GET_VRING_BASE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+
+For these message types, the presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or
+need_response bit being set brings no behavioural change.
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..0cdb918 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK (0x3)
 #define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_NEED_RESPONSE_MASK   (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -158,6 +160,25 @@ fail:
 return -1;
 }
 
+static int process_message_reply(struct vhost_dev *dev,
+VhostUserRequest request)
+{
+VhostUserMsg msg;
+
+if (vhost_user_read(dev, &msg) < 0) {
+return 0;
+}
+
+if (msg.request != request) {
+error_report("Received unexpected msg type."
+"Expected %d received %d",
+request, msg.request);
+return -1;
+}
+
+return msg.payload.u64 ? -1 : 0;
+}
+
 static bool vhost_user_one_time_reques

[Qemu-devel] [PATCH v4 2/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-07-27 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify QEMU when it has finished
setting up memory, or if there were errors doing so.

As an example:
(1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) QEMU commits the memory to the guest.
(3) The guest issues an I/O operation over a new memory region which was
configured in (1).
(4) The application has not yet remapped the memory, but it sees the I/O
request.
(5) The application cannot satisfy the request because it does not know about
those GPAs.

While a guaranteed fix would require a protocol extension (committed
separately), a best-effort workaround for existing applications is to send a
GET_FEATURES message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before
replying.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 hw/virtio/vhost-user.c | 123 ++---
 1 file changed, 65 insertions(+), 58 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 0cdb918..f96607e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -254,64 +254,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-struct vhost_memory *mem)
-{
-int fds[VHOST_MEMORY_MAX_NREGIONS];
-int i, fd;
-size_t fd_num = 0;
-bool reply_supported = virtio_has_feature(dev->protocol_features,
-VHOST_USER_PROTOCOL_F_REPLY_ACK);
-
-VhostUserMsg msg = {
-.request = VHOST_USER_SET_MEM_TABLE,
-.flags = VHOST_USER_VERSION,
-};
-
-if (reply_supported) {
-msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
-}
-
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-ram_addr_t offset;
-MemoryRegion *mr;
-
-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
- &offset);
-fd = memory_region_get_fd(mr);
-if (fd > 0) {
-msg.payload.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-}
-}
-
-msg.payload.memory.nregions = fd_num;
-
-if (!fd_num) {
-error_report("Failed initializing vhost-user memory map, "
- "consider using -object memory-backend-file share=on");
-return -1;
-}
-
-msg.size = sizeof(msg.payload.memory.nregions);
-msg.size += sizeof(msg.payload.memory.padding);
-msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-vhost_user_write(dev, &msg, fds, fd_num);
-
-if (reply_supported) {
-return process_message_reply(dev, msg.request);
-}
-
-return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
  struct vhost_vring_addr *addr)
 {
@@ -514,6 +456,71 @@ static int vhost_user_get_features(struct vhost_dev *dev, 
uint64_t *features)
 return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+struct vhost_memory *mem)
+{
+int fds[VHOST_MEMORY_MAX_NREGIONS];
+int i, fd;
+size_t fd_num = 0;
+uint64_t features;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_MEM_TABLE,
+.flags = VHOST_USER_VERSION,
+};
+
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
+}
+
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+ram_addr_t offset;
+MemoryRegion *mr;
+
+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+ &offset);
+fd = memory_region_get_fd(mr);
+if (fd > 0) {
+msg.payload.memo

Re: [Qemu-devel] [PATCH v3 0/2] vhost-user: Extend protocol to receive replies on any command.

2016-07-25 Thread Prerna Saxena
Hi Marc,
Thank you for taking a look.




On 25/07/16 3:57 pm, "Marc-André Lureau" <marcandre.lur...@gmail.com> wrote:

>Hi
>
>On Mon, Jul 25, 2016 at 10:41 AM, Prerna <saxenap@gmail.com> wrote:
>>
>>
>> On Thu, Jul 7, 2016 at 12:04 PM, Prerna Saxena <saxenap....@gmail.com>
>> wrote:
>>>
>>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>>>
>>> The current vhost-user protocol requires the client to send responses to
>>> only a
>>> few commands. For the remaining commands, it is impossible for QEMU to
>>> know the
>>> status of the requested operation -- ie, did it succeed? If so, by what
>>> time?
>>>
>>> This is inconvenient, and can also lead to races. As an example:
>>>  [..snip..]
>>>
>>> References:
>>> v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
>>> v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html
>>>
>>>
>>> Prerna Saxena (2):
>>>   vhost-user: Introduce a new protocol feature REPLY_ACK.
>>>   vhost-user: Attempt to fix a race with set_mem_table.
>>>
>>>  docs/specs/vhost-user.txt |  44 +++
>>>  hw/virtio/vhost-user.c| 133
>>> ++
>>>  2 files changed, 130 insertions(+), 47 deletions(-)
>>>
>>
>> Ping !
>> Michael, MarcAndre, Did you have a chance to look at this patch series?
>>
>
>That's not going to make it in 2.7 I am afraid. Beside the second
>patch that I think is somewhat superfluous or worse, as I said in
>previous review (so I won't ack it, but Michael liked it and he is the
>maintainer)
>
>It fails to compile, easy to fix by moving process_message_reply after
>vhost_user_read:
>
>/home/elmarco/src/qemu/hw/virtio/vhost-user.c: In function
>‘process_message_reply’:
>/home/elmarco/src/qemu/hw/virtio/vhost-user.c:117:9: warning: implicit
>declaration of function ‘vhost_user_read’
>[-Wimplicit-function-declaration]
> if (vhost_user_read(dev, &msg) < 0) {
> ^~~
>/home/elmarco/src/qemu/hw/virtio/vhost-user.c:117:5: warning: nested
>extern declaration of ‘vhost_user_read’ [-Wnested-externs]
> if (vhost_user_read(dev, &msg) < 0) {
> ^~
>/home/elmarco/src/qemu/hw/virtio/vhost-user.c: At top level:
>/home/elmarco/src/qemu/hw/virtio/vhost-user.c:136:12: error: static
>declaration of ‘vhost_user_read’ follows non-static declaration
> static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
>^~~
>/home/elmarco/src/qemu/hw/virtio/vhost-user.c:117:9: note: previous
>implicit declaration of ‘vhost_user_read’ was here
> if (vhost_user_read(dev, &msg) < 0) {
> ^~~

I really need to check on this. I am pretty positive I had verified this before 
posting, but it's been a while since these patches were posted.


>
>Secondly, make check just hangs in /x86_64/vhost-user/read-guest-mem
>(a sign that backward compatibility is broken).
>
>There are still many "response" wordings, where "reply" should be used
>for more consistency (VHOST_USER_NEED_RESPONSE_MASK and in the doc)

Right. There is a reason I haven't reworded it here. We already have a
VHOST_USER_REPLY_MASK 
flag that assumes that the incoming message is a reply to an already-sent vhost 
command.
Use of the word ‘REPLY’ in this context would have caused some confusion.
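
To make the distinction concrete, here is a minimal sketch of how the two flag bits differ. The mask values are the ones from the patch; the two helper functions are hypothetical and exist only for illustration, they are not part of QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values as defined in hw/virtio/vhost-user.c by this series. */
#define VHOST_USER_VERSION_MASK        (0x3)
#define VHOST_USER_REPLY_MASK          (0x1 << 2)
#define VHOST_USER_NEED_RESPONSE_MASK  (0x1 << 3)

/* Hypothetical helper: true if this message is itself a reply
 * to a command that was sent earlier (bit 2). */
static inline bool sketch_msg_is_reply(uint32_t flags)
{
    return (flags & VHOST_USER_REPLY_MASK) != 0;
}

/* Hypothetical helper: true if the sender asks the receiver to
 * acknowledge this command once it has been processed (bit 3). */
static inline bool sketch_msg_needs_response(uint32_t flags)
{
    return (flags & VHOST_USER_NEED_RESPONSE_MASK) != 0;
}
```

A request from QEMU would carry bit 3 and not bit 2; the acknowledgement coming back would carry bit 2. Naming both after "reply" would make the two masks easy to mix up.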

>
>Regarding the doc, I would simplify it a bit:
>
>VHOST_USER_PROTOCOL_F_REPLY_ACK:
>---
>The original vhost-user specification only demands replies for certain
>commands. This differs from the vhost protocol implementation where commands
>are sent over an ioctl() call and block until the client has completed.
>
>With this protocol extension negotiated, the sender (QEMU) can set the newly
>introduced "need_reply" [Bit 3] flag to any command. This indicates that
>the client MUST reply with a Payload VhostUserMsg indicating success or
>failure. The payload should be set to zero on success or non-zero on failure.
>In other words, reply message must be in the following format :
>
>
>| request | flags | size | payload |
>
>
> * Request: 32-bit type of the request
> * Flags: 32-bit bit field:
> * Size: size of the payload ( see below)
> * Payload : a u64 integer, where a non-zero value indicates a failure.
>
>This indicates to QEMU that the requested operation has
>deterministically been met or not. Today, QEMU is expected to terminate
>the main vhost-user loop upon receiving such errors. In future, qemu could
>be taught to be more resilient for selective requests.
>
>Note that for messages that already require distinct replies, the presence of
>need_reply bit being set brings no behavioural change.
>
>-- 
>Marc-André Lureau

Regards,
Prerna


[Qemu-devel] [PATCH v3 2/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-07-07 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify QEMU when it has finished
setting up memory, or if there were errors doing so.

As an example:
(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application has not yet remapped the memory, but it sees the I/O 
request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

While a guaranteed fix would require a protocol extension (committed 
separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before 
replying.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 hw/virtio/vhost-user.c | 123 ++---
 1 file changed, 65 insertions(+), 58 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 899f354..a3a114d 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -254,64 +254,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-struct vhost_memory *mem)
-{
-int fds[VHOST_MEMORY_MAX_NREGIONS];
-int i, fd;
-size_t fd_num = 0;
-bool reply_supported = virtio_has_feature(dev->protocol_features,
-VHOST_USER_PROTOCOL_F_REPLY_ACK);
-
-VhostUserMsg msg = {
-.request = VHOST_USER_SET_MEM_TABLE,
-.flags = VHOST_USER_VERSION,
-};
-
-if (reply_supported) {
-msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
-}
-
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-ram_addr_t offset;
-MemoryRegion *mr;
-
-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
- &offset);
-fd = memory_region_get_fd(mr);
-if (fd > 0) {
-msg.payload.memory.regions[fd_num].userspace_addr = reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr = reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-}
-}
-
-msg.payload.memory.nregions = fd_num;
-
-if (!fd_num) {
-error_report("Failed initializing vhost-user memory map, "
- "consider using -object memory-backend-file share=on");
-return -1;
-}
-
-msg.size = sizeof(msg.payload.memory.nregions);
-msg.size += sizeof(msg.payload.memory.padding);
-msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-vhost_user_write(dev, &msg, fds, fd_num);
-
-if (reply_supported) {
-return process_message_reply(dev, msg.request);
-}
-
-return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
  struct vhost_vring_addr *addr)
 {
@@ -514,6 +456,71 @@ static int vhost_user_get_features(struct vhost_dev *dev, 
uint64_t *features)
 return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+struct vhost_memory *mem)
+{
+int fds[VHOST_MEMORY_MAX_NREGIONS];
+int i, fd;
+size_t fd_num = 0;
+uint64_t features;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+VHOST_USER_PROTOCOL_F_REPLY_ACK);
+
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_MEM_TABLE,
+.flags = VHOST_USER_VERSION,
+};
+
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
+}
+
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+ram_addr_t offset;
+MemoryRegion *mr;
+
+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+ &offset);
+fd = memory_region_get_fd(mr);
+if (fd > 0) {
+msg.payload.memo

[Qemu-devel] [PATCH v3 1/2] vhost-user: Introduce a new protocol feature REPLY_ACK.

2016-07-07 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_response" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.

Currently implemented only for SET_MEM_TABLE.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt | 44 
 hw/virtio/vhost-user.c| 32 
 2 files changed, 76 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..26dbe71 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_response flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+ details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +469,42 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+---
+The original vhost-user specification only demands responses for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed.
+
+With this protocol extension negotiated, the sender (QEMU) can set the newly
+introduced "need_response" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure.
+In other words, response must be in the following format :
+
+
+| request | flags | size | payload |
+
+
+ * Request: 32-bit type of the request
+ * Flags: 32-bit bit field:
+ * Size: size of the payload ( see below)
+ * Payload : a u64 integer, where a non-zero value indicates a failure.
+
+This aids debugging the application's responses from QEMU. More
+importantly, it indicates to QEMU that the requested operation has
+deterministically (not) been met. Today, QEMU is expected to terminate
+the main vhost-user loop upon receiving such errors. In future, qemu could
+be taught to be more resilient for selective requests.
+
+Note that as per the original vhost-user protocol, the following four messages
+anyway require distinct responses from the vhost-user client process:
+ * VHOST_GET_FEATURES
+ * VHOST_GET_PROTOCOL_FEATURES
+ * VHOST_GET_VRING_BASE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+
+For these message types, the presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or
+need_response bit being set brings no behavioural change.
+The response from the client is identical whether or not the REPLY_ACK feature
+has been negotiated.
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..899f354 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK (0x3)
 #define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_NEED_RESPONSE_MASK   (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -107,6 +109,25 @@ static VhostUserMsg m __attribute__ ((unused));
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION(0x1)
 
+static int process_message_reply(struct vhost_dev *dev,
+VhostUserRequest request)
+{
+VhostUserMsg msg;
+
+if (vhost_user_read(dev, &msg) < 0) {
+return 0;
+}
+
+if (msg.request != request) {
+error_report

[Qemu-devel] [PATCH v3 0/2] vhost-user: Extend protocol to receive replies on any command.

2016-07-07 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The current vhost-user protocol requires the client to send responses to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- ie, did it succeed? If so, by what time?

This is inconvenient, and can also lead to races. As an example:

(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
application). Note that SET_MEM_TABLE does not require a reply according to the
spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing 
applications.
Patch 1 introduces a protocol extension, VHOST_USER_PROTOCOL_F_REPLY_ACK. This 
feature, if negotiated, allows QEMU to request a response to any message by 
setting the newly introduced "need_response" flag. The application must then 
respond to qemu by providing a status about the requested operation.
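
As a rough illustration of the exchange described above, the sketch below shows how a backend might build the acknowledgement and how the master might check it. It is not the code from the patch: SketchMsg is a simplified stand-in for VhostUserMsg, and sketch_build_ack() and sketch_check_ack() are hypothetical names. Only the mask values and the "zero means success" rule come from this series.

```c
#include <stdint.h>
#include <string.h>

#define VHOST_USER_VERSION             (0x1)
#define VHOST_USER_REPLY_MASK          (0x1 << 2)
#define VHOST_USER_NEED_RESPONSE_MASK  (0x1 << 3)

/* Simplified message layout (assumption): the real VhostUserMsg
 * carries a larger payload union; only the u64 member matters here. */
typedef struct {
    uint32_t request;
    uint32_t flags;
    uint32_t size;     /* size of the payload that follows */
    uint64_t u64;      /* status: 0 on success, non-zero on failure */
} SketchMsg;

/* Backend side (hypothetical): fill *reply for *req.
 * Returns 1 if a reply must be sent, 0 if the request did not ask for one. */
static inline int sketch_build_ack(const SketchMsg *req, uint64_t status,
                                   SketchMsg *reply)
{
    if (!(req->flags & VHOST_USER_NEED_RESPONSE_MASK)) {
        return 0;
    }
    memset(reply, 0, sizeof(*reply));
    reply->request = req->request;
    reply->flags = VHOST_USER_VERSION | VHOST_USER_REPLY_MASK;
    reply->size = sizeof(reply->u64);
    reply->u64 = status;
    return 1;
}

/* Master side (hypothetical): 0 if the reply acknowledges success
 * for the expected request, -1 otherwise. */
static inline int sketch_check_ack(const SketchMsg *reply,
                                   uint32_t expected_request)
{
    if (reply->request != expected_request) {
        return -1;
    }
    if (reply->size != sizeof(reply->u64)) {
        return -1;
    }
    return reply->u64 ? -1 : 0;
}
```

A request sent without the need_response bit produces no acknowledgement at all, which is what keeps existing applications working unchanged.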

Patch 2 adds a workaround for the race described above for clients that do not 
support the REPLY_ACK feature. It introduces a get_features command to be sent 
before returning from set_mem_table. While this is not a complete fix, it will 
help client applications that strictly process messages in order.
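
The reasoning behind the workaround can be shown with a toy model. This is only a sketch under the assumption that the backend handles its queue strictly in arrival order; the types and the function sketch_run_in_order() are hypothetical and do not exist in QEMU or in any backend.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request kinds for the toy model. */
enum sketch_req { SKETCH_SET_MEM_TABLE, SKETCH_GET_FEATURES };

typedef struct {
    bool mem_table_mapped;   /* set once the new regions are mapped */
} SketchBackend;

/* Process the queued requests strictly in order.
 * Returns true if the memory table was already mapped at the moment
 * the GET_FEATURES reply was generated. */
static inline bool sketch_run_in_order(SketchBackend *b,
                                       const enum sketch_req *queue,
                                       size_t n)
{
    bool mapped_when_replying = false;

    for (size_t i = 0; i < n; i++) {
        switch (queue[i]) {
        case SKETCH_SET_MEM_TABLE:
            /* Stands in for mmap()ing the guest memory regions. */
            b->mem_table_mapped = true;
            break;
        case SKETCH_GET_FEATURES:
            /* The reply is produced here, after everything queued earlier. */
            mapped_when_replying = b->mem_table_mapped;
            break;
        }
    }
    return mapped_when_replying;
}
```

With the queue {SET_MEM_TABLE, GET_FEATURES}, the reply can only be generated after the mapping step has run. A backend that handles messages asynchronously or out of order gets no such guarantee, which is why this is a best-effort measure only.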

Changelog:
--

Changes v2->v3:
1) Swapped the patch numbers 1 & 2 from the previous series.
2) Patch 1 (previously patch 2 in v2): addresses MarcAndre's review comments 
and renames function 'process_message_response' to 'process_message_reply'
3) Patch 2 (ie patch 1 in v2) : Unchanged from v2.

Changes v1->v2:
1) Patch 1 : Ask for get_features before returning from set_mem_table(new).
2) Patch 2 : * Improve documentation.
  * Abstract out commonly used operations in the form of a function, 
process_message_response(). Also implement this only for SET_MEM_TABLE.

References:
v1 : https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg07152.html
v2 : https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00048.html


Prerna Saxena (2):
  vhost-user: Introduce a new protocol feature REPLY_ACK.
  vhost-user: Attempt to fix a race with set_mem_table.

 docs/specs/vhost-user.txt |  44 +++
 hw/virtio/vhost-user.c| 133 ++
 2 files changed, 130 insertions(+), 47 deletions(-)

-- 
1.8.1.2




Re: [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.

2016-07-04 Thread Prerna Saxena
Hi Michael,
Thank you for taking a look.





On 04/07/16 5:29 pm, "Michael S. Tsirkin" <m...@redhat.com> wrote:

>On Fri, Jul 01, 2016 at 02:46:20AM -0700, Prerna Saxena wrote:
>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> 
>> The current vhost-user protocol requires the client to send responses to 
>> only a
>> few commands. For the remaining commands, it is impossible for QEMU to know 
>> the
>> status of the requested operation -- ie, did it succeed? If so, by what time?
>> 
>> This is inconvenient, and can also lead to races. As an example:
>> 
>> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
>> application).
>> Note that SET_MEM_TABLE does not require a reply according to the spec.
>> (2) Qemu commits the memory to the guest.
>> (3) Guest issues an I/O operation over a new memory region which was 
>> configured on (1).
>> (4) The application hasn't yet remapped the memory, but it sees the I/O 
>> request.
>> (5) The application cannot satisfy the request because it does not know 
>> about those GPAs.
>> 
>> Note that the kernel implementation does not suffer from this limitation 
>> since messages are sent via an ioctl(). The ioctl() blocks until the backend 
>> (eg. vhost-net) completes the command and returns (with an error code).
>> 
>> Changing the behaviour of current vhost-user commands would break existing 
>> applications. 
>> To work around this race, Patch 1 adds a get_features command to be sent 
>> before returning from set_mem_table. While this is not a complete fix, it 
>> will help client applications that strictly process messages in order.
>> 
>> The second patch introduces a protocol extension, 
>> VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to 
>> request a response to any message by setting the newly introduced 
>> "need_response" flag. The application must then respond to qemu by providing 
>> a status about the requested operation.
>
>
>OK this all looks very reasonable (and I do like patch 1 too)
>but there's one source of waste here: we do not need to
>synchronize when we set up device the first time
>when hdev->memory_changed is false.
>
>I think we should test that and skip synch in both patches
>unless  hdev->memory_changed is set.

I do not entirely agree with that. The first set_mem_table command is not much 
different from subsequent set_mem_table calls.
For all cases, there is a fair chance that the vhost-user application may, for 
some reason, not be able to map the guest memory.
This protocol extension provides a mechanism for such errors to be propagated 
back to QEMU. It is up to QEMU to acknowledge the failure (by terminating itself 
or failing the device) or ignore it. However, in the absence of such a 
mechanism, it would be really bad for QEMU to believe that the vhost 
application is all set to process guest requests when reality is quite the 
opposite.

Also, as pointed out before, QEMU needs to have a notion of _when_ the memory 
mapping was finished, so that it may proceed to pass on actual requests to the 
vhost user application. The race described in the covering letter (above) can 
potentially happen even at first-time initialization.

This protocol extension is an attempt to bridge the subtle behavioural 
difference between vhost-user and vhost-kernel. Patch 1, in my opinion, makes 
the code less intuitive. This is because we are calling a GET_FEATURES vhost 
message from inside the handler for another vhost command— SET_MEM_TABLE. 
However, if you think it better to have both Patch 1 & 2, I’ll be happy to post 
both.

Regards,
Prerna

>
>
>> Changelog:
>> -
>> Changes since v1:
>> Patch 1 : Ask for get_features before returning from set_mem_table(new). 
>> Patch 2 : * Improve documentation. 
>>   * Abstract out commonly used operations in the form of a function, 
>> process_message_response(). Also implement this only for SET_MEM_TABLE.
>> 
>> Prerna Saxena (2):
>>   vhost-user: Attempt to prevent a race on set_mem_table.
>>   vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.
>> 
>>  docs/specs/vhost-user.txt |  40 
>>  hw/virtio/vhost-user.c| 157 
>> --
>>  2 files changed, 150 insertions(+), 47 deletions(-)
>> 
>> -- 
>> 1.8.1.2


Re: [Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.

2016-07-03 Thread Prerna Saxena
Hi Marc-Andre,
Thank you for taking a look.





On 03/07/16 5:17 pm, "Marc-André Lureau" <marcandre.lur...@gmail.com> wrote:

>Hi
>
>On Fri, Jul 1, 2016 at 11:46 AM, Prerna Saxena <saxenap@gmail.com> wrote:
>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>>
>> The current vhost-user protocol requires the client to send responses to 
>> only a
>> few commands. For the remaining commands, it is impossible for QEMU to know 
>> the
>> status of the requested operation -- ie, did it succeed? If so, by what time?
>>
>> This is inconvenient, and can also lead to races. As an example:
>>
>> (1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
>> application).
>> Note that SET_MEM_TABLE does not require a reply according to the spec.
>> (2) Qemu commits the memory to the guest.
>> (3) Guest issues an I/O operation over a new memory region which was 
>> configured on (1).
>> (4) The application hasn't yet remapped the memory, but it sees the I/O 
>> request.
>> (5) The application cannot satisfy the request because it does not know 
>> about those GPAs.
>>
>> Note that the kernel implementation does not suffer from this limitation 
>> since messages are sent via an ioctl(). The ioctl() blocks until the backend 
>> (eg. vhost-net) completes the command and returns (with an error code).
>>
>> Changing the behaviour of current vhost-user commands would break existing 
>> applications.
>> To work around this race, Patch 1 adds a get_features command to be sent 
>> before returning from set_mem_table. While this is not a complete fix, it 
>> will help client applications that strictly process messages in order.
>>
>> The second patch introduces a protocol extension, 
>> VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to 
>> request a response to any message by setting the newly introduced 
>> "need_response" flag. The application must then respond to qemu by providing 
>> a status about the requested operation.
>>
>> Changelog:
>> -
>> Changes since v1:
>> Patch 1 : Ask for get_features before returning from set_mem_table(new).
>> Patch 2 : * Improve documentation.
>>   * Abstract out commonly used operations in the form of a function, 
>> process_message_response(). Also implement this only for SET_MEM_TABLE.
>>
>
>Overall, that looks good to me.
>
>Why do we have both "response" and "reply" which basically means the
>same thing, right? I would rather stick with "reply".

All right, will rename this function to process_message_reply().

>
>I am not convinced the first patch is needed, imho it is a
>workaround/hack, the solution is given with the patch 2 only.

Great, I’ll post a v3 with just Patch2.

Regards,
Prerna

>
>> Prerna Saxena (2):
>>   vhost-user: Attempt to prevent a race on set_mem_table.
>>   vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.
>>
>>  docs/specs/vhost-user.txt |  40 
>>  hw/virtio/vhost-user.c| 157 
>> --
>>  2 files changed, 150 insertions(+), 47 deletions(-)
>>
>> --
>> 1.8.1.2
>>
>
>
>
>-- 
>Marc-André Lureau
>


[Qemu-devel] [PATCH 2/2] vhost-user : Introduce a new protocol feature REPLY_ACK.

2016-07-01 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

This introduces the VHOST_USER_PROTOCOL_F_REPLY_ACK.

If negotiated, client applications should send a u64 payload in
response to any message that contains the "need_response" bit set
on the message flags. Setting the payload to "zero" indicates the
command finished successfully. Likewise, setting it to "non-zero"
indicates an error.

Currently implemented only for SET_MEM_TABLE.

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt | 44 
 hw/virtio/vhost-user.c| 32 
 2 files changed, 76 insertions(+)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..26dbe71 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,8 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_response flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for
+ details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +128,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension. ]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +258,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +469,42 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+---
+The original vhost-user specification only demands responses for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed.
+
+With this protocol extension negotiated, the sender (QEMU) can set the newly
+introduced "need_response" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure.
+In other words, response must be in the following format :
+
+
+| request | flags | size | payload |
+
+
+ * Request: 32-bit type of the request
+ * Flags: 32-bit bit field:
+ * Size: size of the payload ( see below)
+ * Payload : a u64 integer, where a non-zero value indicates a failure.
+
+This aids debugging the application's responses from QEMU. More
+importantly, it indicates to QEMU that the requested operation has
+deterministically (not) been met. Today, QEMU is expected to terminate
+the main vhost-user loop upon receiving such errors. In future, qemu could
+be taught to be more resilient for selective requests.
+
+Note that as per the original vhost-user protocol, the following four messages
+anyway require distinct responses from the vhost-user client process:
+ * VHOST_GET_FEATURES
+ * VHOST_GET_PROTOCOL_FEATURES
+ * VHOST_GET_VRING_BASE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+
+For these message types, the presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or
+need_response bit being set brings no behavioural change.
+The response from the client is identical whether or not the REPLY_ACK feature
+has been negotiated.
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 858a1bb..bff229e 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -84,6 +85,7 @@ typedef struct VhostUserMsg {
 
 #define VHOST_USER_VERSION_MASK (0x3)
 #define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_NEED_RESPONSE_MASK   (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -107,6 +109,25 @@ static VhostUserMsg m __attribute__ ((unused));
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION(0x1)
 
+static int process_message_response(struct vhost_dev *dev,
+VhostUserRequest request)
+{
+VhostUserMsg msg;
+
+if (vhost_user_read(dev, &msg) < 0) {
+return 0;
+}
+
+if (msg.reques

[Qemu-devel] [PATCH v2 0/2]vhost-user: Extend protocol to seek response for any command.

2016-07-01 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The current vhost-user protocol requires the client to send responses to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- ie, did it succeed? If so, by what time?

This is inconvenient, and can also lead to races. As an example:

(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
application).
Note that SET_MEM_TABLE does not require a reply according to the spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing 
applications. 
To work around this race, Patch 1 adds a get_features command to be sent before 
returning from set_mem_table. While this is not a complete fix, it will help 
client applications that strictly process messages in order.

The second patch introduces a protocol extension, 
VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to 
request a response to any message by setting the newly introduced 
"need_response" flag. The application must then respond to qemu by providing a 
status about the requested operation.

Changelog:
-
Changes since v1:
Patch 1 : Ask for get_features before returning from set_mem_table(new). 
Patch 2 : * Improve documentation. 
  * Abstract out commonly used operations in the form of a function, 
process_message_response(). Also implement this only for SET_MEM_TABLE.

Prerna Saxena (2):
  vhost-user: Attempt to prevent a race on set_mem_table.
  vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK.

 docs/specs/vhost-user.txt |  40 
 hw/virtio/vhost-user.c| 157 --
 2 files changed, 150 insertions(+), 47 deletions(-)

-- 
1.8.1.2




[Qemu-devel] [PATCH 1/2] vhost-user: Attempt to fix a race with set_mem_table.

2016-07-01 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The set_mem_table command currently does not seek a reply. Hence, there is
no easy way for a remote application to notify QEMU when it has finished
setting up memory, or if there were errors doing so.

As an example:
(1) Qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net
application). SET_MEM_TABLE does not require a reply according to the spec.
(2) Qemu commits the memory to the guest.
(3) Guest issues an I/O operation over a new memory region which was configured 
on (1).
(4) The application has not yet remapped the memory, but it sees the I/O 
request.
(5) The application cannot satisfy the request because it does not know about 
those GPAs.

While a guaranteed fix would require a protocol extension (committed separately),
a best-effort workaround for existing applications is to send a GET_FEATURES
message before completing the vhost_user_set_mem_table() call.
Since GET_FEATURES requires a reply, an application that processes vhost-user
messages synchronously would probably have completed the SET_MEM_TABLE before
replying.

For a vhost-user application that processes messages strictly in order, a
response to GET_FEATURES will ensure that the application has finished
processing the previous set_mem request too.
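
The ordering argument above can be sketched with a toy backend that handles its queue strictly in order. Everything here (SketchBackend, sketch_backend_run, the enum) is a hypothetical model for illustration, not code from any real vhost-user application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request kinds for this sketch only. */
enum sketch_req { SKETCH_SET_MEM_TABLE, SKETCH_GET_FEATURES };

typedef struct SketchBackend {
    bool mem_table_applied;
} SketchBackend;

/*
 * Process queued requests strictly in order. Returns true as soon as a
 * reply is produced (only GET_FEATURES replies in this model).
 */
static bool sketch_backend_run(SketchBackend *b,
                               const enum sketch_req *queue, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        switch (queue[i]) {
        case SKETCH_SET_MEM_TABLE:
            b->mem_table_applied = true;   /* handled, but no reply is sent */
            break;
        case SKETCH_GET_FEATURES:
            return true;   /* reply sent: all earlier requests are done */
        }
    }
    return false;
}
```

If the backend does not process messages in order, the reply says nothing about SET_MEM_TABLE, which is why this is only a best-effort workaround.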

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 hw/virtio/vhost-user.c | 104 +++--
 1 file changed, 57 insertions(+), 47 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..858a1bb 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -233,53 +233,6 @@ static int vhost_user_set_log_base(struct vhost_dev *dev, 
uint64_t base,
 return 0;
 }
 
-static int vhost_user_set_mem_table(struct vhost_dev *dev,
-struct vhost_memory *mem)
-{
-int fds[VHOST_MEMORY_MAX_NREGIONS];
-int i, fd;
-size_t fd_num = 0;
-VhostUserMsg msg = {
-.request = VHOST_USER_SET_MEM_TABLE,
-.flags = VHOST_USER_VERSION,
-};
-
-for (i = 0; i < dev->mem->nregions; ++i) {
-struct vhost_memory_region *reg = dev->mem->regions + i;
-ram_addr_t offset;
-MemoryRegion *mr;
-
-assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
-mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
- &offset);
-fd = memory_region_get_fd(mr);
-if (fd > 0) {
-msg.payload.memory.regions[fd_num].userspace_addr = 
reg->userspace_addr;
-msg.payload.memory.regions[fd_num].memory_size  = reg->memory_size;
-msg.payload.memory.regions[fd_num].guest_phys_addr = 
reg->guest_phys_addr;
-msg.payload.memory.regions[fd_num].mmap_offset = offset;
-assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
-fds[fd_num++] = fd;
-}
-}
-
-msg.payload.memory.nregions = fd_num;
-
-if (!fd_num) {
-error_report("Failed initializing vhost-user memory map, "
- "consider using -object memory-backend-file share=on");
-return -1;
-}
-
-msg.size = sizeof(msg.payload.memory.nregions);
-msg.size += sizeof(msg.payload.memory.padding);
-msg.size += fd_num * sizeof(VhostUserMemoryRegion);
-
-vhost_user_write(dev, &msg, fds, fd_num);
-
-return 0;
-}
-
 static int vhost_user_set_vring_addr(struct vhost_dev *dev,
  struct vhost_vring_addr *addr)
 {
@@ -482,6 +435,63 @@ static int vhost_user_get_features(struct vhost_dev *dev, 
uint64_t *features)
 return vhost_user_get_u64(dev, VHOST_USER_GET_FEATURES, features);
 }
 
+static int vhost_user_set_mem_table(struct vhost_dev *dev,
+struct vhost_memory *mem)
+{
+int fds[VHOST_MEMORY_MAX_NREGIONS];
+int i, fd;
+size_t fd_num = 0;
+uint64_t features;
+VhostUserMsg msg = {
+.request = VHOST_USER_SET_MEM_TABLE,
+.flags = VHOST_USER_VERSION,
+};
+
+for (i = 0; i < dev->mem->nregions; ++i) {
+struct vhost_memory_region *reg = dev->mem->regions + i;
+ram_addr_t offset;
+MemoryRegion *mr;
+
+assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
+mr = memory_region_from_host((void *)(uintptr_t)reg->userspace_addr,
+ &offset);
+fd = memory_region_get_fd(mr);
+if (fd > 0) {
+msg.payload.memory.regions[fd_num].userspace_addr \
+= reg->userspace_addr;
+msg.payload.memory.regions[fd_num].memory_size  \
+= reg->memory_size;
+msg.payload.memory.regions[fd_num].guest_phys_addr \
+= reg->guest_

Re: [Qemu-devel] [PATCH 0/1] vhost-user: Add a protocol extension for client responses to vhost commands.

2016-06-25 Thread Prerna Saxena





On 26/06/16 8:15 am, "Michael S. Tsirkin" <m...@redhat.com> wrote:

>On Sat, Jun 25, 2016 at 03:13:54AM +, Prerna Saxena wrote:
>> 
>> 
>> 
>> 
>> 
>> On 25/06/16 4:43 am, "Michael S. Tsirkin" <m...@redhat.com> wrote:
>> 
>> >On Fri, Jun 24, 2016 at 05:39:31PM +, Prerna Saxena wrote:
>> >> 
>> >> 
>> >> On 24/06/16 9:15 pm, "Felipe Franciosi" <fel...@nutanix.com> wrote:
>> >> 
>> >> >We talked to MST on IRC a while back and he brainstormed the idea of 
>> >> >doing this per-message.
>> >> >(I even recall proposing to call this feature REPLY_ALL and he suggested 
>> >> >REPLY_ANY due to that.)
>> >> >
>> >> >I agree with doing it per message, as the protocol itself should be 
>> >> >flexible in that sense.
>> >> >(Even if qemu today will probably want to ask for a reply in all 
>> >> >messages.)
>> >> 
>> >> In fact, the current implementation does exactly this. If 
>> >> VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, the current QEMU patch 
>> >> sets the NEED_RESPONSE flag bit for all outgoing messages — basically 
>> >> enforcing the vhost-user application to respond to all messages.
>> >
>> >
>> >This seems unnecessary. Let's only do that for messages that actually
>> >need to be synchronous.
>> 
>> It would be nice to distinguish the vhost-user protocol itself from its QEMU 
>> implementation.
>> The protocol should, in theory, have provision for an implementation (such 
>> as QEMU’s vhost-user implementation) to seek response for _any_ command. 
>> However, we can choose to be selective in our QEMU implementation and just 
>> have limited commands currently send a response, such as SET_MEM_TABLE. 
>> 
>> In other words, we will still require the NEED_RESPONSE flag bit defined, 
>> but we can just set it to 1 for the SET_MEM_TABLE command in our QEMU 
>> implementation. All other vhost-user commands are sent from QEMU setting 
>> this to 0, so the application does not send an ack.
>> 
>> Michael, Does that correctly summarize what you were meaning to suggest here 
>> ?
>> 
>> Regards,
>> Prerna
>
>Exactly.

Thanks for your response. I will rework and send out a patch to that end.

Regards,
Prerna

>
>> 
>> >
>> >> >
>> >> >On 24/06/2016, 14:59, "Qemu-devel on behalf of Marc-André Lureau" 
>> >> ><qemu-devel-bounces+felipe=nutanix@nongnu.org on behalf of 
>> >> >marcandre.lur...@gmail.com> wrote:
>> >> >
>> >> >Hi
>> >> >
>> >> >On Fri, Jun 24, 2016 at 10:17 AM, Prerna Saxena <saxenap@gmail.com> 
>> >> >wrote:
>> >> >> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> >> >>
>> >> >> The current vhost-user protocol requires the client to send responses 
>> >> >> to only few commands. For the remaining commands, it is impossible for 
>> >> >> QEMU to know the status of the requested operation -- ie, did it 
>> >> >> succeed at all, and if so, at what time.
>> >> >>
>> >> >> This is inconvenient, and can also lead to races. As an example:
>> >> >>
>> >> >> (1) qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
>> >> >> application) and SET_MEM_TABLE doesn't require a reply according to 
>> >> >> the spec.
>> >> >> (2) qemu commits the memory to the guest.
>> >> >> (3) guest issues an I/O operation over a new memory region which was 
>> >> >> configured on (1)
>> >> >> (4) The application hasn't yet remapped the memory, but it sees the 
>> >> >> I/O request.
>> >> >> (5) The application cannot satisfy the request because it doesn't know 
>> >> >> about those GPAs
>> >> >>
>> >> >> Note that the kernel implementation does not suffer from this 
>> >> >> limitation since messages are sent via an ioctl(). The ioctl() blocks 
>> >> >> until the backend (eg. vhost-net) completes the command and returns 
>> >> >> (with an error code).
>> >> >>
>> >> >> Changing the behaviour of current vhost-user commands would break 
>> >> >> existing applications. This patch introduces a protocol extension, 
>> >> >> VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows 
>> >> >> QEMU to annotate messages to the application that it seeks a response 
>> >> >> for. The application must then respond to qemu by providing a status 
>> >> >> about the requested operation.
>> >> >
>> >> >I like the idea, as I encountered a similar issue in my
>> >> >"vhost-user-gpu" development (which I worked around by sending a dump
>> >> >GET_FEATURES.. to sync things). But I question the need to have a flag
>> >> >per message. I think if the protocol feature is negotiated, all
>> >> >messages should have a reply. Why do you want it to be per-message?
>> >> >
>> >> >thanks
>> >> >
>> >> >-- 
>> >> >Marc-André Lureau
>> >> >
>> >> >
>> >> >


Re: [Qemu-devel] [PATCH 0/1] vhost-user: Add a protocol extension for client responses to vhost commands.

2016-06-24 Thread Prerna Saxena





On 25/06/16 4:43 am, "Michael S. Tsirkin" <m...@redhat.com> wrote:

>On Fri, Jun 24, 2016 at 05:39:31PM +, Prerna Saxena wrote:
>> 
>> 
>> On 24/06/16 9:15 pm, "Felipe Franciosi" <fel...@nutanix.com> wrote:
>> 
>> >We talked to MST on IRC a while back and he brainstormed the idea of doing 
>> >this per-message.
>> >(I even recall proposing to call this feature REPLY_ALL and he suggested 
>> >REPLY_ANY due to that.)
>> >
>> >I agree with doing it per message, as the protocol itself should be 
>> >flexible in that sense.
>> >(Even if qemu today will probably want to ask for a reply in all messages.)
>> 
>> In fact, the current implementation does exactly this. If 
>> VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, the current QEMU patch sets 
>> the NEED_RESPONSE flag bit for all outgoing messages — basically enforcing 
>> the vhost-user application to respond to all messages.
>
>
>This seems unnecessary. Let's only do that for messages that actually
>need to be synchronous.

It would be nice to distinguish the vhost-user protocol itself from its QEMU 
implementation.
The protocol should, in theory, have provision for an implementation (such as 
QEMU’s vhost-user implementation) to seek response for _any_ command. However, 
we can choose to be selective in our QEMU implementation and just have limited 
commands currently send a response, such as SET_MEM_TABLE. 

In other words, we will still require the NEED_RESPONSE flag bit defined, but 
we can just set it to 1 for the SET_MEM_TABLE command in our QEMU 
implementation. All other vhost-user commands are sent from QEMU setting this 
to 0, so the application does not send an ack.
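
A minimal sketch of that selection, assuming the flag values from the patch; the enum and the sketch_flags_for() helper are hypothetical names made up for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VERSION             0x1u
#define SKETCH_NEED_RESPONSE_MASK  (0x1u << 3)

/* Hypothetical subset of requests; values are not the real protocol IDs. */
enum sketch_request {
    SKETCH_REQ_SET_MEM_TABLE,
    SKETCH_REQ_SET_VRING_NUM,
    SKETCH_REQ_SET_VRING_ADDR
};

/*
 * The protocol allows the flag on any message, but this sender only
 * sets it for SET_MEM_TABLE, and only when REPLY_ACK was negotiated.
 */
static uint32_t sketch_flags_for(enum sketch_request req,
                                 bool reply_ack_negotiated)
{
    uint32_t flags = SKETCH_VERSION;

    if (reply_ack_negotiated && req == SKETCH_REQ_SET_MEM_TABLE) {
        flags |= SKETCH_NEED_RESPONSE_MASK;
    }
    return flags;
}
```

Extending the set of acknowledged commands later would then only touch the condition in this one helper.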

Michael, Does that correctly summarize what you were meaning to suggest here ?

Regards,
Prerna


>
>> >
>> >On 24/06/2016, 14:59, "Qemu-devel on behalf of Marc-André Lureau" 
>> ><qemu-devel-bounces+felipe=nutanix@nongnu.org on behalf of 
>> >marcandre.lur...@gmail.com> wrote:
>> >
>> >Hi
>> >
>> >On Fri, Jun 24, 2016 at 10:17 AM, Prerna Saxena <saxenap@gmail.com> 
>> >wrote:
>> >> From: Prerna Saxena <prerna.sax...@nutanix.com>
>> >>
>> >> The current vhost-user protocol requires the client to send responses to 
>> >> only few commands. For the remaining commands, it is impossible for QEMU 
>> >> to know the status of the requested operation -- ie, did it succeed at 
>> >> all, and if so, at what time.
>> >>
>> >> This is inconvenient, and can also lead to races. As an example:
>> >>
>> >> (1) qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
>> >> application) and SET_MEM_TABLE doesn't require a reply according to the 
>> >> spec.
>> >> (2) qemu commits the memory to the guest.
>> >> (3) guest issues an I/O operation over a new memory region which was 
>> >> configured on (1)
>> >> (4) The application hasn't yet remapped the memory, but it sees the I/O 
>> >> request.
>> >> (5) The application cannot satisfy the request because it doesn't know 
>> >> about those GPAs
>> >>
>> >> Note that the kernel implementation does not suffer from this limitation 
>> >> since messages are sent via an ioctl(). The ioctl() blocks until the 
>> >> backend (eg. vhost-net) completes the command and returns (with an error 
>> >> code).
>> >>
>> >> Changing the behaviour of current vhost-user commands would break 
>> >> existing applications. This patch introduces a protocol extension, 
>> >> VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU 
>> >> to annotate messages to the application that it seeks a response for. The 
>> >> application must then respond to qemu by providing a status about the 
>> >> requested operation.
>> >
>> >I like the idea, as I encountered a similar issue in my
>> >"vhost-user-gpu" development (which I worked around by sending a dump
>> >GET_FEATURES.. to sync things). But I question the need to have a flag
>> >per message. I think if the protocol feature is negotiated, all
>> >messages should have a reply. Why do you want it to be per-message?
>> >
>> >thanks
>> >
>> >-- 
>> >Marc-André Lureau
>> >
>> >
>> >


Re: [Qemu-devel] [PATCH 0/1] vhost-user: Add a protocol extension for client responses to vhost commands.

2016-06-24 Thread Prerna Saxena


On 24/06/16 9:15 pm, "Felipe Franciosi" <fel...@nutanix.com> wrote:

>We talked to MST on IRC a while back and he brainstormed the idea of doing 
>this per-message.
>(I even recall proposing to call this feature REPLY_ALL and he suggested 
>REPLY_ANY due to that.)
>
>I agree with doing it per message, as the protocol itself should be flexible 
>in that sense.
>(Even if qemu today will probably want to ask for a reply in all messages.)

In fact, the current implementation does exactly this. If 
VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, the current QEMU patch sets the 
NEED_RESPONSE flag bit for all outgoing messages — basically enforcing the 
vhost-user application to respond to all messages.

>
>On 24/06/2016, 14:59, "Qemu-devel on behalf of Marc-André Lureau" 
><qemu-devel-bounces+felipe=nutanix@nongnu.org on behalf of 
>marcandre.lur...@gmail.com> wrote:
>
>Hi
>
>On Fri, Jun 24, 2016 at 10:17 AM, Prerna Saxena <saxenap@gmail.com> wrote:
>> From: Prerna Saxena <prerna.sax...@nutanix.com>
>>
>> The current vhost-user protocol requires the client to send responses to 
>> only few commands. For the remaining commands, it is impossible for QEMU to 
>> know the status of the requested operation -- ie, did it succeed at all, and 
>> if so, at what time.
>>
>> This is inconvenient, and can also lead to races. As an example:
>>
>> (1) qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
>> application) and SET_MEM_TABLE doesn't require a reply according to the spec.
>> (2) qemu commits the memory to the guest.
>> (3) guest issues an I/O operation over a new memory region which was 
>> configured on (1)
>> (4) The application hasn't yet remapped the memory, but it sees the I/O 
>> request.
>> (5) The application cannot satisfy the request because it doesn't know about 
>> those GPAs
>>
>> Note that the kernel implementation does not suffer from this limitation 
>> since messages are sent via an ioctl(). The ioctl() blocks until the backend 
>> (eg. vhost-net) completes the command and returns (with an error code).
>>
>> Changing the behaviour of current vhost-user commands would break existing 
>> applications. This patch introduces a protocol extension, 
>> VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to 
>> annotate messages to the application that it seeks a response for. The 
>> application must then respond to qemu by providing a status about the 
>> requested operation.
>
>I like the idea, as I encountered a similar issue in my
>"vhost-user-gpu" development (which I worked around by sending a dump
>GET_FEATURES.. to sync things). But I question the need to have a flag
>per message. I think if the protocol feature is negotiated, all
>messages should have a reply. Why do you want it to be per-message?
>
>thanks
>
>-- 
>Marc-André Lureau
>
>
>


[Qemu-devel] [PATCH 1/1] vhost-user : Introduce a new feature VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, forces the remote vhost-user process to send a u64 reply containing a status

2016-06-24 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

Signed-off-by: Prerna Saxena <prerna.sax...@nutanix.com>
---
 docs/specs/vhost-user.txt |  36 +++
 hw/virtio/vhost-user.c| 153 +-
 2 files changed, 186 insertions(+), 3 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 777c49c..e5388b2 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -37,6 +37,7 @@ consists of 3 header fields and a payload:
  * Flags: 32-bit bit field:
- Lower 2 bits are the version (currently 0x01)
- Bit 2 is the reply flag - needs to be sent on each reply from the slave
+   - Bit 3 is the need_response flag - see VHOST_USER_PROTOCOL_F_REPLY_ACK for details.
  * Size - 32-bit size of the payload
 
 
@@ -126,6 +127,8 @@ the ones that do:
  * VHOST_GET_VRING_BASE
  * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
 
+[ Also see the section on REPLY_ACK protocol extension]
+
 There are several messages that the master sends with file descriptors passed
 in the ancillary data:
 
@@ -254,6 +257,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MQ 0
 #define VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
 #define VHOST_USER_PROTOCOL_F_RARP   2
+#define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 
 Message types
 -
@@ -464,3 +468,35 @@ Message types
   is present in VHOST_USER_GET_PROTOCOL_FEATURES.
   The first 6 bytes of the payload contain the mac address of the guest to
   allow the vhost user backend to construct and broadcast the fake RARP.
+
+VHOST_USER_PROTOCOL_F_REPLY_ACK:
+
+The original vhost-user specification only demands responses for certain
+commands. This differs from the vhost protocol implementation where commands
+are sent over an ioctl() call and block until the client has completed. Not
+receiving a response for commands like VHOST_SET_MEM_TABLE makes the sender
+unable to tell when the client has finished (re)mapping the GPA, or whether it
+has failed altogether.
+
+With this protocol extension negotiated, the sender can set the newly
+introduced "need_response" [Bit 3] flag to any command. This indicates that
+the client MUST respond with a Payload VhostUserMsg indicating success or
+failure. The payload should be set to zero on success or non-zero on failure.
+In other words, response must be in the following format :
+
+| request | flags | size | payload |
+
+
+ * Request: 32-bit type of the original request which is being responded to.
+ * Flags: 32-bit bit field: (VHOST_USER_VERSION | VHOST_USER_REPLY_MASK)
+ * Size: size of the payload ( see below)
+ * Payload : a u64 integer, where a non-zero value indicates a failure.
+
+Note that as per the original vhost-user protocol, the following four messages
+anyway require distinct responses from the vhost-user client process :
+ * VHOST_GET_FEATURES
+ * VHOST_GET_PROTOCOL_FEATURES
+ * VHOST_GET_VRING_BASE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+For these message types, the presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or
+need_response bit being set brings no behavioural change.
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 495e09f..f01ebb4 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -31,6 +31,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
 VHOST_USER_PROTOCOL_F_RARP = 2,
+VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -82,8 +83,9 @@ typedef struct VhostUserLog {
 typedef struct VhostUserMsg {
 VhostUserRequest request;
 
-#define VHOST_USER_VERSION_MASK (0x3)
-#define VHOST_USER_REPLY_MASK   (0x1<<2)
+#define VHOST_USER_VERSION_MASK (0x3)
+#define VHOST_USER_REPLY_MASK   (0x1 << 2)
+#define VHOST_USER_NEED_RESPONSE_MASK   (0x1 << 3)
 uint32_t flags;
 uint32_t size; /* the following payload size */
 union {
@@ -239,10 +241,17 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
 int fds[VHOST_MEMORY_MAX_NREGIONS];
 int i, fd;
 size_t fd_num = 0;
+bool reply_supported = virtio_has_feature(dev->protocol_features,
+VHOST_USER_PROTOCOL_F_REPLY_ACK);
 VhostUserMsg msg = {
 .request = VHOST_USER_SET_MEM_TABLE,
 .flags = VHOST_USER_VERSION,
 };
+VhostUserRequest request = msg.request;
+
+if (reply_supported) {
+msg.flags |= VHOST_USER_NEED_RESPONSE_MASK;
+}
 
 for (i = 0; i < dev->mem->nregions; ++i) {
 struct vhost_memory_region *reg = dev->mem->regions + i;
@@ -277,6 +286,20 @@ static int vhost_user_set_mem_table(struct vhost_dev *dev,
 
 vhost_user_write(dev, &msg, fds, fd_num);
 
+if (reply_supported) {
+ 

[Qemu-devel] [PATCH 0/1] vhost-user: Add a protocol extension for client responses to vhost commands.

2016-06-24 Thread Prerna Saxena
From: Prerna Saxena <prerna.sax...@nutanix.com>

The current vhost-user protocol requires the client to send responses to only a
few commands. For the remaining commands, it is impossible for QEMU to know the
status of the requested operation -- i.e., did it succeed at all, and if so, at
what time.

This is inconvenient, and can also lead to races. As an example:

(1) qemu sends a SET_MEM_TABLE to the backend (eg, a vhost-user net 
application) and SET_MEM_TABLE doesn't require a reply according to the spec.
(2) qemu commits the memory to the guest.
(3) guest issues an I/O operation over a new memory region which was configured 
on (1)
(4) The application hasn't yet remapped the memory, but it sees the I/O request.
(5) The application cannot satisfy the request because it doesn't know about 
those GPAs

Note that the kernel implementation does not suffer from this limitation since 
messages are sent via an ioctl(). The ioctl() blocks until the backend (eg. 
vhost-net) completes the command and returns (with an error code).

Changing the behaviour of current vhost-user commands would break existing 
applications. This patch introduces a protocol extension, 
VHOST_USER_PROTOCOL_F_REPLY_ACK. This feature, if negotiated, allows QEMU to 
annotate messages to the application that it seeks a response for. The 
application must then respond to qemu by providing a status about the requested 
operation.
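
For the application side, a minimal sketch of building such a status reply might look like this. It assumes the reply layout proposed in the patch (request, flags, size, u64 payload); the SketchReply struct and sketch_build_ack() are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_VERSION     0x1u
#define SKETCH_REPLY_MASK  (0x1u << 2)

/* Hypothetical flattened reply; the real message uses a payload union. */
typedef struct SketchReply {
    uint32_t request;   /* type of the request being acknowledged */
    uint32_t flags;     /* version bits plus the reply bit */
    uint32_t size;      /* payload size in bytes */
    uint64_t payload;   /* 0 on success, non-zero on failure */
} SketchReply;

/* Build the ack for 'request' from the result of the operation. */
static SketchReply sketch_build_ack(uint32_t request, int op_result)
{
    SketchReply r;

    r.request = request;
    r.flags = SKETCH_VERSION | SKETCH_REPLY_MASK;
    r.size = (uint32_t)sizeof(r.payload);
    r.payload = (op_result == 0) ? 0 : 1;
    return r;
}
```

The payload deliberately collapses every error into a single non-zero value, since the proposal only distinguishes success from failure.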


Prerna Saxena (1):
  vhost-user : Introduce a new feature, VHOST_USER_PROTOCOL_F_REPLY_ACK 
   This feature, if negotiated, forces the remote vhost-user
   process to send a u64 reply containing a status code for each
   requested operation.  
   Status codes are '0' for success, and non-zero for error.

 docs/specs/vhost-user.txt |  36 +++
 hw/virtio/vhost-user.c| 153 +-
 2 files changed, 186 insertions(+), 3 deletions(-)

-- 
1.8.1.2




[Qemu-devel] [PATCH 2/2] Debug : Add error messages before a call to debug().

2016-04-15 Thread Prerna Saxena
QEMU code has abort() calls in various places, each of which raises SIGABRT.
This patch adds error messages before (most) calls to abort(), so that
it is easier to determine why QEMU died.
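
The pattern applied throughout the patch can be sketched as follows. This is a standalone illustration: sketch_error_report() is a stand-in that prints to stderr (QEMU's own error_report() is not reproduced here), and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an error reporting helper; prints one line to stderr. */
static void sketch_error_report(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

/* Hypothetical actions used only for this sketch. */
enum sketch_action { SKETCH_ACTION_IGNORE, SKETCH_ACTION_STOP };

static const char *sketch_action_name(int action)
{
    char buf[64];

    switch (action) {
    case SKETCH_ACTION_IGNORE:
        return "ignore";
    case SKETCH_ACTION_STOP:
        return "stop";
    default:
        /* Say why we are dying before raising SIGABRT. */
        snprintf(buf, sizeof(buf),
                 "Unrecognized action %d. Aborting.", action);
        sketch_error_report(buf);
        abort();
    }
}
```

Nothing follows the abort() call, since it never returns.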

Signed-off-by: Prerna Saxena <saxenap@gmail.com>
---
 block.c| 1 +
 block/block-backend.c  | 4 
 block/curl.c   | 1 +
 block/io.c | 1 +
 block/linux-aio.c  | 1 +
 block/mirror.c | 2 ++
 block/qcow2-cache.c| 1 +
 block/qcow2-cluster.c  | 3 +++
 block/qcow2-refcount.c | 7 +++
 block/qcow2.c  | 2 ++
 blockdev.c | 3 +++
 crypto/aes.c   | 1 +
 exec.c | 4 
 hw/scsi/scsi-disk.c| 2 ++
 hw/virtio/virtio.c | 5 -
 vl.c   | 2 ++
 16 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index d4939b4..160f277 100644
--- a/block.c
+++ b/block.c
@@ -3725,6 +3725,7 @@ void bdrv_remove_aio_context_notifier(BlockDriverState 
*bs,
 }
 }
 
+error_report("Matching context notifier not found for removal. Aborting");
 abort();
 }
 
diff --git a/block/block-backend.c b/block/block-backend.c
index d74f670..0aa8692 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -407,6 +407,7 @@ BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
 return blk;
 }
 }
+error_report("Drive Info not found, Aborting.");
 abort();
 }
 
@@ -463,6 +464,8 @@ int blk_attach_dev(BlockBackend *blk, void *dev)
 void blk_attach_dev_nofail(BlockBackend *blk, void *dev)
 {
 if (blk_attach_dev(blk, dev) < 0) {
+error_report("Attaching device model to block %s failed. Aborting",
+blk->name);
 abort();
 }
 }
@@ -1143,6 +1146,7 @@ BlockErrorAction blk_get_error_action(BlockBackend *blk, 
bool is_read,
 case BLOCKDEV_ON_ERROR_IGNORE:
 return BLOCK_ERROR_ACTION_IGNORE;
 default:
+error_report("Unrecognized Block Error Action %d. Aborting.",on_err);
 abort();
 }
 }
diff --git a/block/curl.c b/block/curl.c
index 5a8f8b6..fe2225a 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -382,6 +382,7 @@ static void curl_multi_timeout_do(void *arg)
 
 curl_multi_check_completion(s);
 #else
+error_report("Curl timer expired, Aborting.");
 abort();
 #endif
 }
diff --git a/block/io.c b/block/io.c
index a7dbf85..6f45959 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2045,6 +2045,7 @@ void bdrv_aio_cancel(BlockAIOCB *acb)
 } else if (acb->bs) {
 aio_poll(bdrv_get_aio_context(acb->bs), true);
 } else {
+error_report("Aio context not found. Aborting.");
 abort();
 }
 }
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 805757e..38d7812 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -206,6 +206,7 @@ static void ioq_submit(struct qemu_laio_state *s)
 break;
 }
 if (ret < 0) {
+error_report("Error %d submitting io. Aborting.", ret);
 abort();
 }
 
diff --git a/block/mirror.c b/block/mirror.c
index c2cfc1a..600e3c2 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -389,6 +389,8 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 mirror_do_zero_or_discard(s, sector_num, io_sectors, true);
 break;
 default:
+error_report("Unrecognized mirror option %d. Aborting.",
+mirror_method);
 abort();
 }
 assert(io_sectors);
diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 0fe8eda..80766a2 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -334,6 +334,7 @@ static int qcow2_cache_do_get(BlockDriverState *bs, 
Qcow2Cache *c,
 if (min_lru_index == -1) {
 /* This can't happen in current synchronous code, but leave the check
  * here as a reminder for whoever starts using AIO with the cache */
+error_report("Invalid Index %d, Aborting", min_lru_index);
 abort();
 }
 
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 31ecc10..1914d97 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -583,6 +583,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t 
offset,
 }
 break;
 default:
+error_report("Invalid cluster type %d. Aborting.", ret);
 abort();
 }
 
@@ -868,6 +869,7 @@ static int count_cow_clusters(BDRVQcow2State *s, int 
nb_clusters,
 case QCOW2_CLUSTER_ZERO:
 break;
 default:
+error_report("Invalid cluster type %d, Aborting.", cluster_type);
 abort();
 }
 }
@@ -1494,6 +1496,7 @@ static int discard_single_l2(BlockDriverState *bs, 
uint64_t offset,
 break;
 
 default:
+error_report("Invalid clus

[Qemu-devel] [PATCH 0/2] Cleanup and instrumenting qemu exits due to abort().

2016-04-15 Thread Prerna Saxena
Today, some calls to abort() do not have a preceding error string that
might hint to the end user why QEMU died. Debugging such scenarios is
painful.

This patchset attempts to clean up some dead code in vvfat.c;
it also aims to improve qemu error-reporting by placing error messages
that precede calls to abort().

Prerna Saxena (2):
  Block: Cleanup vvfat.c to remove dead code.
  Debug : Add error messages before a call to debug().

 block.c|  1 +
 block/block-backend.c  |  4 
 block/curl.c   |  1 +
 block/io.c |  1 +
 block/linux-aio.c  |  1 +
 block/mirror.c |  2 ++
 block/qcow2-cache.c|  1 +
 block/qcow2-cluster.c  |  3 +++
 block/qcow2-refcount.c |  7 +++
 block/qcow2.c  |  2 ++
 block/vvfat.c  | 17 +++--
 blockdev.c |  3 +++
 crypto/aes.c   |  1 +
 exec.c |  4 
 hw/scsi/scsi-disk.c|  2 ++
 hw/virtio/virtio.c |  5 -
 vl.c   |  2 ++
 17 files changed, 42 insertions(+), 15 deletions(-)

-- 
1.8.1.2




[Qemu-devel] [PATCH 1/2] Block: Cleanup vvfat.c to remove dead code.

2016-04-15 Thread Prerna Saxena
Commit 43dc2a64 replaced assert() with abort(), but did not remove statements
that followed these calls. So the current code still has return values set after
a call to abort(). Such statements will never execute and need to be cleaned
up.

Signed-off-by: Prerna Saxena <saxenap@nutanix.com>
---
 block/vvfat.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 6b85314..ffe739b 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1747,8 +1747,7 @@ static uint32_t 
get_cluster_count_for_direntry(BDRVVVFATState* s,
schedule_new_file(s, g_strdup(path), cluster_num);
else {
 abort();
-   return 0;
-   }
+   }
 }
 
 while(1) {
@@ -1768,7 +1767,6 @@ static uint32_t 
get_cluster_count_for_direntry(BDRVVVFATState* s,
* (cluster_num - mapping->begin)) {
/* offset of this cluster in file chain has changed */
 abort();
-   copy_it = 1;
} else if (offset == 0) {
const char* basename = get_basename(mapping->path);
 
@@ -1780,7 +1778,6 @@ static uint32_t 
get_cluster_count_for_direntry(BDRVVVFATState* s,
if (mapping->first_mapping_index != first_mapping_index
&& mapping->info.file.offset > 0) {
 abort();
-   copy_it = 1;
}
 
/* need to write out? */
@@ -1946,8 +1943,6 @@ DLOG(fprintf(stderr, "check direntry %d:\n", i); 
print_direntry(direntries + i))
}
} else
 abort(); /* cluster_count = 0; */
-
-   ret += cluster_count;
}
 
cluster_num = modified_fat_get(s, cluster_num);
@@ -2578,10 +2573,6 @@ static int handle_commits(BDRVVVFATState* s)
 for (i = 0; !fail && i < s->commits.next; i++) {
commit_t* commit = array_get(&(s->commits), i);
switch(commit->action) {
-   case ACTION_RENAME: case ACTION_MKDIR:
-abort();
-   fail = -2;
-   break;
case ACTION_WRITEOUT: {
 #ifndef NDEBUG
 /* these variables are only used by assert() below */
@@ -2639,6 +2630,8 @@ static int handle_commits(BDRVVVFATState* s)
 
break;
}
+case ACTION_RENAME:
+case ACTION_MKDIR:
default:
 abort();
}
@@ -2729,7 +2722,6 @@ static int do_commit(BDRVVVFATState* s)
 if (ret) {
fprintf(stderr, "Error handling renames (%d)\n", ret);
 abort();
-   return ret;
 }
 
 /* copy FAT (with bdrv_read) */
@@ -2740,21 +2732,18 @@ static int do_commit(BDRVVVFATState* s)
 if (ret) {
fprintf(stderr, "Fatal: error while committing (%d)\n", ret);
 abort();
-   return ret;
 }
 
 ret = handle_commits(s);
 if (ret) {
fprintf(stderr, "Error handling commits (%d)\n", ret);
 abort();
-   return ret;
 }
 
 ret = handle_deletes(s);
 if (ret) {
fprintf(stderr, "Error deleting\n");
 abort();
-   return ret;
 }
 
 if (s->qcow->drv->bdrv_make_empty) {
-- 
1.8.1.2




Re: [Qemu-devel] [PATCH 2/2] [v3] target-ppc: Enhance CPU nodes of device tree to be PAPR compliant.

2013-08-11 Thread Prerna Saxena
On 08/08/2013 04:04 PM, Andreas Färber wrote:
 Am 08.08.2013 09:26, schrieb Prerna Saxena:

 From: Prerna Saxena pre...@linux.vnet.ibm.com
 Date: Thu, 8 Aug 2013 06:38:03 +0530
 Subject: [PATCH 2/2] Enhance CPU nodes of device tree to be PAPR compliant.

 This is based on patch from Andreas which enables the default CPU with KVM
 to show up as -cpu type, such as POWER7_V2.3@0

 While this is definitely more descriptive, PAPR mandates the device tree CPU
 node names to be of the form : PowerPC,name where name should not have
 underscores.
 Hence replacing the CPU model (which has underscores) with CPU alias.

 With this patch, the CPU nodes of device tree show up as :
 /proc/device-tree/cpus/PowerPC,POWER7@0/...
 /proc/device-tree/cpus/PowerPC,POWER7@4/...

 Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
 
 Not yet happy...

:(

 
 ---
  hw/ppc/spapr.c | 22 --
  1 file changed, 20 insertions(+), 2 deletions(-)

 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 59e2fea..8efd84e 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -43,6 +43,7 @@
  #include "hw/pci-host/spapr.h"
  #include "hw/ppc/xics.h"
  #include "hw/pci/msi.h"
 +#include "cpu-models.h"
  
  #include "hw/pci/pci.h"
  
 @@ -80,6 +81,8 @@
  
  #define HTAB_SIZE(spapr) (1ULL << ((spapr)->htab_shift))
  
 +#define PPC_DEVTREE_STR "PowerPC,"
 +
  sPAPREnvironment *spapr;
  
  int spapr_allocate_irq(int hint, bool lsi)
 @@ -322,9 +325,16 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
  _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
  _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
  
 -modelname = g_strdup(cpu_model);
 +/*
 + * PAPR convention mandates that
 + * Device tree nodes must be named as:
 + * PowerPC,CPU-NAME@...
 + * Also, CPU-NAME must not have underscores.(hence use of CPU-ALIAS)
 + */
 +
 +modelname = g_strdup_printf(PPC_DEVTREE_STR "%s", cpu_model);
  
 -for (i = 0; i < strlen(modelname); i++) {
 +for (i = strlen(PPC_DEVTREE_STR); i < strlen(modelname); i++) {
  modelname[i] = toupper(modelname[i]);
  }
  
 
 One of your colleagues had brought up that the "PowerPC," prefix was not
 mandatory - is it *required* by the PAPR spec now, or is it just that
 the IBM CPUs used with PAPR happen to have such a name?

I don't know what context led to this observation.
However, PAPR mentions the following nomenclature guideline:

The value of this property shall be of the form: “PowerPC,name”,
where name is the name of the processor chip which may be displayed to
the user. name shall not contain underscores.

I think this name guideline will hold good for all PAPR compliant
processors.
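
To make the guideline concrete, here is a minimal sketch of a check that builds a node name of the form "PowerPC,name" and rejects names containing underscores. The helper papr_cpu_node_name() and the macro PAPR_CPU_PREFIX are hypothetical names used only for illustration; they are not part of QEMU or of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Prefix taken from the PAPR wording quoted above. */
#define PAPR_CPU_PREFIX "PowerPC,"

/*
 * Hypothetical helper (not a QEMU function): write "PowerPC,<name>" into out.
 * Returns 0 on success, -1 if name is empty, contains an underscore,
 * or the result does not fit into the buffer.
 */
int papr_cpu_node_name(const char *name, char *out, size_t outlen)
{
    int n;

    assert(name != NULL && out != NULL);
    if (name[0] == '\0' || strchr(name, '_') != NULL) {
        return -1;
    }
    n = snprintf(out, outlen, PAPR_CPU_PREFIX "%s", name);
    if (n < 0 || (size_t)n >= outlen) {
        return -1; /* output would have been truncated */
    }
    return 0;
}
```

With this sketch, "POWER7" is accepted and yields "PowerPC,POWER7", while "POWER7_v2.3" is rejected because of the underscore, which is why the patch switches to the alias.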

 
 @@ -1315,6 +1325,14 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
  
  cpu_model = g_strndup(parent_name,
  strlen(parent_name) - strlen("-" TYPE_POWERPC_CPU));
 +
 +for (i = 0; ppc_cpu_aliases[i].model != NULL; i++) {
 +if (strcmp(ppc_cpu_aliases[i].model, cpu_model) == 0) {
 +g_free(cpu_model);
 +cpu_model = g_strndup(ppc_cpu_aliases[i].alias,
 +strlen(ppc_cpu_aliases[i].alias));
 +}
 +}
  }
  
  /* Prepare the device tree */
 
 This is still fixing up the name in the wrong place: -cpu POWER7_v2.3
 will not get fixed, only -cpu host or KVM's default.
 
 The solution I had discussed with Alex is the following: When devices
 need to expose their name to firmware in a special way, we have the
 DeviceClass::fw_name field. All we have to do is assign it and use it
 instead of cpu_model if non-NULL, just like we assign DeviceClass::desc.
 The way to do it would be to extend the family of POWERPC_DEF* macros to
 specify the additional field on the relevant CPU models.
 

Would this be the same use case as the one reflected by ppc_cpu_aliases[i].alias?
If so, do we really need a separate field to convey the same information?
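
For reference, the lookup I have in mind is roughly the following. This is a simplified, standalone sketch: struct cpu_alias and alias_for_model() are illustrative stand-ins for PowerPCCPUAlias and the loop in the patch, and the table holds only the POWER7 and POWER8 entries mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for QEMU's PowerPCCPUAlias; illustration only. */
struct cpu_alias {
    const char *alias;
    const char *model;
};

/* Two alias/model pairs from this thread; the table is NULL-terminated. */
const struct cpu_alias cpu_aliases[] = {
    { "POWER7", "POWER7_v2.3" },
    { "POWER8", "POWER8_v1.0" },
    { NULL, NULL }
};

/* Return the alias registered for a concrete model name, or NULL if none. */
const char *alias_for_model(const char *model)
{
    size_t i;

    assert(model != NULL);
    for (i = 0; cpu_aliases[i].model != NULL; i++) {
        if (strcmp(cpu_aliases[i].model, model) == 0) {
            return cpu_aliases[i].alias;
        }
    }
    return NULL;
}
```

A model without an alias entry returns NULL here, so the caller has to decide what name to fall back to in that case.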

 Therefore my above question: Would it be sufficient to explicitly name
 POWER7_v2.3 "PowerPC,POWER7" etc. and to drop the upper-casing?
 Or would we also need to name a CPU such as MPC8572E (random Freescale
 CPU where I don't know the expected fw_name and that is unlikely to
 occur/work in sPAPR) "PowerPC,MPC8572E" if someone specified it with
 -cpu MPC8572E?
 

If this is not a PAPR-compliant CPU, I don't think the PAPR naming
convention is of any use.
I haven't worked with non-PAPR CPUs. Is the device tree for such CPUs
generated by routines in hw/ppc/spapr.c? Or do they have custom
routines to generate appropriate device tree nodes?

Regards,
-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [PATCH 0/2] [v3] target-ppc: Enhance CPU nodes of SPAPR-generated device tree

2013-08-08 Thread Prerna Saxena
By default on KVM or when user asks for it via -cpu host, cpu_model will
be host and sPAPR merely upper-cases it for the SLOF device tree.

PATCH 1/2 : Change the SPAPR code so that we get the underlying CPU type,
 e.g., POWER7_V2.3@0 in the device tree.
PATCH 2/2 : Make the device-tree CPU nodes PAPR-compliant.

Changelog from v2:
PATCH 1/2 : Reworked and augmented by Andreas Färber against the original
posted by Prerna.
PATCH 2/2 : New.

Regards,
-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [PATCH 1/2] [v3] target-ppc: Get CPU name to correctly reflect its model in the SLOF device tree.

2013-08-08 Thread Prerna Saxena
From: Andreas Farber afaer...@suse.de
Date: Wed, 7 Aug 2013 14:50:41 +0530
Subject: [PATCH 1/2] By default on KVM or when user asks for it via -cpu
 host, cpu_model will be host and sPAPR merely
 upper-cases it for the SLOF device tree.

Change it so that we get the underlying CPU type, e.g., POWER7_V2.3@0.

Tested-by: Prerna Saxena pre...@linux.vnet.ibm.com
Signed-off-by: Andreas Färber afaer...@suse.de
---
 hw/ppc/spapr.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 16bfab9..59e2fea 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1072,7 +1072,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 const char *kernel_cmdline = args->kernel_cmdline;
 const char *initrd_filename = args->initrd_filename;
 const char *boot_device = args->boot_device;
-PowerPCCPU *cpu;
+PowerPCCPU *cpu = NULL;
 CPUPPCState *env;
 PCIHostState *phb;
 int i;
@@ -1307,6 +1307,16 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 register_savevm_live(NULL, "spapr/htab", -1, 1,
  &savevm_htab_handlers, spapr);
 
+if (kvm_enabled() && strcmp(cpu_model, "host") == 0) {
+ObjectClass *cpu_class = object_get_class(OBJECT(cpu));
+ObjectClass *parent_cpu_class = object_class_get_parent(cpu_class);
+
+const char *parent_name = object_class_get_name(parent_cpu_class);
+
+cpu_model = g_strndup(parent_name,
+strlen(parent_name) - strlen("-" TYPE_POWERPC_CPU));
+}
+
 /* Prepare the device tree */
 spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
 initrd_base, initrd_size,
-- 
1.7.11.4


-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [PATCH 2/2] [v3] target-ppc: Enhance CPU nodes of device tree to be PAPR compliant.

2013-08-08 Thread Prerna Saxena

From: Prerna Saxena pre...@linux.vnet.ibm.com
Date: Thu, 8 Aug 2013 06:38:03 +0530
Subject: [PATCH 2/2] Enhance CPU nodes of device tree to be PAPR compliant.

This is based on patch from Andreas which enables the default CPU with KVM
to show up as -cpu type, such as POWER7_V2.3@0

While this is definitely more descriptive, PAPR mandates the device tree CPU
node names to be of the form : PowerPC,name where name should not have
underscores.
Hence replacing the CPU model (which has underscores) with CPU alias.

With this patch, the CPU nodes of device tree show up as :
/proc/device-tree/cpus/PowerPC,POWER7@0/...
/proc/device-tree/cpus/PowerPC,POWER7@4/...

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 hw/ppc/spapr.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 59e2fea..8efd84e 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -43,6 +43,7 @@
 #include "hw/pci-host/spapr.h"
 #include "hw/ppc/xics.h"
 #include "hw/pci/msi.h"
+#include "cpu-models.h"
 
 #include "hw/pci/pci.h"
 
@@ -80,6 +81,8 @@
 
 #define HTAB_SIZE(spapr) (1ULL << ((spapr)->htab_shift))
 
+#define PPC_DEVTREE_STR "PowerPC,"
+
 sPAPREnvironment *spapr;
 
 int spapr_allocate_irq(int hint, bool lsi)
@@ -322,9 +325,16 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
 _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
 
-modelname = g_strdup(cpu_model);
+/*
+ * PAPR convention mandates that
+ * Device tree nodes must be named as:
+ * PowerPC,CPU-NAME@...
+ * Also, CPU-NAME must not have underscores.(hence use of CPU-ALIAS)
+ */
+
+modelname = g_strdup_printf(PPC_DEVTREE_STR "%s", cpu_model);
 
-for (i = 0; i < strlen(modelname); i++) {
+for (i = strlen(PPC_DEVTREE_STR); i < strlen(modelname); i++) {
 modelname[i] = toupper(modelname[i]);
 }
 
@@ -1315,6 +1325,14 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 
 cpu_model = g_strndup(parent_name,
 strlen(parent_name) - strlen("-" TYPE_POWERPC_CPU));
+
+for (i = 0; ppc_cpu_aliases[i].model != NULL; i++) {
+if (strcmp(ppc_cpu_aliases[i].model, cpu_model) == 0) {
+g_free(cpu_model);
+cpu_model = g_strndup(ppc_cpu_aliases[i].alias,
+strlen(ppc_cpu_aliases[i].alias));
+}
+}
 }
 
 /* Prepare the device tree */
-- 
1.7.11.4



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




Re: [Qemu-devel] [PATCH for-next] spapr: Avoid HOST@0 CPU node name in SLOF device tree for -cpu host

2013-08-07 Thread Prerna Saxena
On 08/01/2013 06:32 AM, Andreas Färber wrote:
 By default on KVM or when user asks for it via -cpu host, cpu_model will
 be host and sPAPR merely upper-cases it for the SLOF device tree.
 
 Change it so that we get the underlying CPU type, e.g., POWER7_V2.3@0.
 
 Reported-by: Prerna Saxena pre...@linux.vnet.ibm.com
 Signed-off-by: Andreas Färber afaer...@suse.de
 ---

ACK.
Reviewed and tested; works as expected.

I'll send out an updated follow-up patch later in the day which ensures
PAPR compliance for nomenclature.

Regards,
-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




Re: [Qemu-devel] [PATCH 18/19] target-ppc: Enhance the CPU node labels for the guest device tree for pseries.

2013-07-10 Thread Prerna Saxena
Hi Andreas,
Thanks for the response.

On 07/08/2013 10:15 PM, Andreas Färber wrote:
 Hi,
 
 Am 08.07.2013 17:49, schrieb Prerna Saxena:
 On 07/08/2013 02:32 PM, Andreas Färber wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Am 08.07.2013 03:09, schrieb David Gibson:
 On Sat, Jul 06, 2013 at 11:54:15PM +1000, Alexey Kardashevskiy
 wrote:
 @@ -1342,6 +1346,13 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 register_savevm_live(NULL, "spapr/htab", -1, 1,
 &savevm_htab_handlers, spapr);

 +/* Ensure that cpu_model is correctly reflected for a KVM guest */
 +if (kvm_enabled() && !strcmp(cpu_model, "host")) {
 +asm ("mfpvr %0"
 +: "=r"(pvr));
 +cpu_model = ppc_cpu_alias_by_pvr(pvr);

 This needs to be protected by an ifdef CONFIG_KVM or similar.  If
 the compiler optimization level is turned down, so that it doesn't 
 recognize that the kvm_enabled() is always false, then this could 
 attempt to compile the ppc asm instructions on an x86 (or
 whatever) host.

 This hunk can be completely replaced by QOM mechanisms - just didn't
 get to replying yet...

 Sorry I already sent out a v2, and only then saw your message. Could you
 pls explain how I could use QOM to replace this code block ?
 
 Well, in short the thing is it has not much to do with KVM. The
 KVM-specific host-powerpc64-cpu type is derived from the one you're
 looking for and thus you can use object_class_get_parent() to obtain the
 parent type and look at its name - stripping "-" TYPE_POWERPC_CPU from
 it should be much more efficient but will give you the detailed name
 including revision. I was planning to propose an alternative patch for that.

This is what my patch does :-)

+const char *ppc_cpu_alias_by_pvr(uint32_t pvr)
+{
+int i;
+const char *cpu_alias;
+char *offset, *model;
+
+cpu_alias  = object_class_get_name(OBJECT_CLASS
+(ppc_cpu_class_by_pvr(pvr)));
+ [snip]

 
 Replacing a concrete model name with its simpler alias is a secondary
 issue (separate patch) that is not specific to KVM or -cpu host. Compare
 -cpu POWER8_v1.0 printing .../POWER8_v1.0@0/... presumably.
 

I agree that this is not specific to KVM. That is why I have put it
in a separate function, which can be called from elsewhere as well.

Just to clarify your response: do you want the function I coded to be split
into two pieces, one for each of the two requirements you mention?
That can be done, but I am not sure whether it would be too much code bloat.
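
If it were split, the first piece would only strip the type suffix from the class name, roughly as in the standalone sketch below. model_from_type_name() is a hypothetical name, and the value given to TYPE_POWERPC_CPU here is an assumption for a 64-bit target; in QEMU the macro comes from the target headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed value for a 64-bit target; QEMU defines the real macro. */
#define TYPE_POWERPC_CPU "powerpc64-cpu"

/*
 * Hypothetical helper: return a newly allocated copy of type_name with the
 * trailing "-" TYPE_POWERPC_CPU removed, or NULL if the suffix is absent
 * or nothing would remain. The caller releases the result with free().
 */
char *model_from_type_name(const char *type_name)
{
    const char *suffix = "-" TYPE_POWERPC_CPU;
    size_t len, slen = strlen(suffix);
    char *model;

    assert(type_name != NULL);
    len = strlen(type_name);
    if (len <= slen || strcmp(type_name + len - slen, suffix) != 0) {
        return NULL;
    }
    model = malloc(len - slen + 1);
    if (model == NULL) {
        return NULL;
    }
    memcpy(model, type_name, len - slen);
    model[len - slen] = '\0';
    return model;
}
```

The second piece would then map the resulting model name (for example POWER7_v2.3) to its alias, independently of how the model name was obtained.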

 Further, Alex has already applied a patch of his working around the
 alias table being a rather archaic construct, not intended for frequent
 use. Instead of adding even more functions that iterate it, we should
 turn it into a hashtable for efficient lookup.
 

Could you or Alexander Graf point me to the fix? I can then rework my patch
to use it.

 (Note that the cpu_model_str field may contain more than just the model
 name, it is otherwise unused in softmmu and I was therefore preparing a
 patch to ban its use to linux-user solely, so the type name seems the
 most reliable indicator we have and as a bonus no PVR needed for it.)
 

Hmm, maybe obsoleting the PVR check is not such a great idea.
I'm not sure my earlier email clearly outlined the use case this
patch was attempting to fix. Here is a detailed explanation:

We will still need PVR-based lookups for cases such as the one I have
described. As an illustration, consider running in a KVM environment
where QEMU hasn't been started with a specific CPU type via -cpu
PPC_MODEL. In this case, a PVR-based lookup is the only way to make
sure the guest gets initialized with the same CPU as the host. The
notion of the "same CPU model" can only be built on a PVR check.
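
As a rough illustration of what I mean by a PVR-based lookup, here is a standalone sketch. The two PVR values are the ones listed in target-ppc/cpu-models.h for POWER7 v2.3 and POWER8 v1.0; struct pvr_entry, known_pvrs and model_by_pvr() are hypothetical names, and the exact-match policy is an assumption made for the example, not a description of what ppc_cpu_class_by_pvr() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative pairing of a PVR value with a CPU model name. */
struct pvr_entry {
    uint32_t pvr;
    const char *model;
};

/* PVR values as listed in target-ppc/cpu-models.h in this thread. */
const struct pvr_entry known_pvrs[] = {
    { 0x003F0203, "POWER7_v2.3" },
    { 0x004B0100, "POWER8_v1.0" },
};

/*
 * Hypothetical helper: return the model whose PVR matches exactly,
 * or NULL if the PVR is not in the table.
 */
const char *model_by_pvr(const struct pvr_entry *table, size_t n, uint32_t pvr)
{
    size_t i;

    assert(table != NULL);
    for (i = 0; i < n; i++) {
        if (table[i].pvr == pvr) {
            return table[i].model;
        }
    }
    return NULL;
}
```

On a host, the pvr argument would come from reading the processor version register (the mfpvr instruction in the patch); the sketch only covers the table lookup that follows.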

Regards,
-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [PATCH v2 18/19] target-ppc: Enhance the CPU node labels for the guest device tree for pseries.

2013-07-08 Thread Prerna Saxena
Hi David,
Thanks for the review feedback. I have incorporated your changes in v2 of
the patch, which follows herewith.

Regards,
Prerna

Subject: [PATCH v2] Target-ppc : Enhance the CPU node labels for the guest
 device tree for pseries.

In the absence of a -cpu parameter in the qemu command line, the nodes of
KVM-enabled guest device tree look like this :

/proc/device-tree/cpus/HOST@0/...
/proc/device-tree/cpus/HOST@4/...

This patch replaces this obscure 'HOST' label with a more descriptive label.
This is gathered by first identifying the PVR of the host, and then determining
the host CPU alias which corresponds to the model indicated by this PVR.

Sample final outcome for a KVM-enabled pseries guest running on POWER7:
/proc/device-tree/cpus/PowerPC,POWER7@0/...
/proc/device-tree/cpus/PowerPC,POWER7@4/...

This also helps userspace tools like ppc64_cpu, which expect the device tree
to be in this format.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: David Gibson da...@gibson.dropbear.id.au
---
 hw/ppc/spapr.c  | 18 +++---
 target-ppc/cpu-qom.h|  1 +
 target-ppc/translate_init.c | 28 
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fe34291..ddf263a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -79,6 +79,7 @@
 
 #define HTAB_SIZE(spapr) (1ULL << ((spapr)->htab_shift))
 
+#define PPC_DEVTREE_STR "PowerPC,"
 sPAPREnvironment *spapr;
 
 int spapr_allocate_irq(int hint, bool lsi)
@@ -295,9 +296,12 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
 _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
 
-modelname = g_strdup(cpu_model);
+/* device tree nodes must look like this :
+ * PowerPC,CPU_ALIAS@0
+ */
+modelname = g_strdup_printf(PPC_DEVTREE_STR "%s", cpu_model);
 
-for (i = 0; i < strlen(modelname); i++) {
+for (i = strlen(PPC_DEVTREE_STR); i < strlen(modelname); i++) {
 modelname[i] = toupper(modelname[i]);
 }
 
@@ -735,7 +739,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 MemoryRegion *sysmem = get_system_memory();
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 hwaddr rma_alloc_size;
-uint32_t initrd_base = 0;
+uint32_t initrd_base = 0, pvr = 0;
 long kernel_size = 0, initrd_size = 0;
 long load_limit, rtas_limit, fw_size;
 char *filename;
@@ -959,6 +963,14 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 
 spapr->entry_point = 0x100;
 
+#ifdef CONFIG_KVM
+/* Ensure that cpu_model is correctly reflected for a KVM guest */
+if (kvm_enabled() && !strcmp(cpu_model, "host")) {
+asm ("mfpvr %0"
+: "=r"(pvr));
+cpu_model = ppc_cpu_alias_by_pvr(pvr);
+}
+#endif
 /* Prepare the device tree */
 spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
 initrd_base, initrd_size,
diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index 84ba105..90dd1dd 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -99,6 +99,7 @@ static inline PowerPCCPU *ppc_env_get_cpu(CPUPPCState *env)
 #define ENV_OFFSET offsetof(PowerPCCPU, env)
 
 PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr);
+const char *ppc_cpu_alias_by_pvr(uint32_t pvr);
 
 void ppc_cpu_do_interrupt(CPUState *cpu);
 void ppc_cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf,
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 50e0ee5..21a7f6f 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7913,6 +7913,34 @@ PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr)
 return pcc;
 }
 
+const char *ppc_cpu_alias_by_pvr(uint32_t pvr)
+{
+int i;
+const char *cpu_alias;
+char *offset, *model;
+
+cpu_alias  = object_class_get_name(OBJECT_CLASS
+(ppc_cpu_class_by_pvr(pvr)));
+
+/* Replace the full class name in cpu_alias with the CPU alias
+ * Eg, POWER7_V2.3-POWERPC64-CPU can simply be called
+ * POWER7
+ */
+
+offset = strstr(cpu_alias, "-" TYPE_POWERPC_CPU);
+if (offset) {
+model = g_strndup(cpu_alias, offset - cpu_alias);
+for (i = 0; ppc_cpu_aliases[i].model != NULL; i++) {
+if (strcmp(ppc_cpu_aliases[i].model, model) == 0) {
+g_free(model);
+return ppc_cpu_aliases[i].alias;
+}
+}
+g_free(model);
+}
+return NULL;
+}
+
 static gint ppc_cpu_compare_class_name(gconstpointer a, gconstpointer b)
 {
 ObjectClass *oc = (ObjectClass *)a;
-- 
1.7.11.7



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




Re: [Qemu-devel] [PATCH 18/19] target-ppc: Enhance the CPU node labels for the guest device tree for pseries.

2013-07-08 Thread Prerna Saxena
On 07/08/2013 02:32 PM, Andreas Färber wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 Am 08.07.2013 03:09, schrieb David Gibson:
 On Sat, Jul 06, 2013 at 11:54:15PM +1000, Alexey Kardashevskiy
 wrote:
 @@ -1342,6 +1346,13 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 register_savevm_live(NULL, "spapr/htab", -1, 1,
 &savevm_htab_handlers, spapr);

 +/* Ensure that cpu_model is correctly reflected for a KVM guest */
 +if (kvm_enabled() && !strcmp(cpu_model, "host")) {
 +asm ("mfpvr %0"
 +: "=r"(pvr));
 +cpu_model = ppc_cpu_alias_by_pvr(pvr);

 This needs to be protected by an ifdef CONFIG_KVM or similar.  If
 the compiler optimization level is turned down, so that it doesn't 
 recognize that the kvm_enabled() is always false, then this could 
 attempt to compile the ppc asm instructions on an x86 (or
 whatever) host.
 
 This hunk can be completely replaced by QOM mechanisms - just didn't
 get to replying yet...
 

Hi Andreas,
Sorry, I had already sent out a v2 and only then saw your message. Could you
please explain how I could use QOM to replace this code block?

Regards,
-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [PATCH] Target-ppc : Enhance the CPU node labels for guest device tree for pseries.

2013-07-05 Thread Prerna Saxena
[PATCH] Target-ppc : Enhance the CPU node labels for the guest
 device tree for pseries.

In the absence of a -cpu parameter in the qemu command line, the nodes of
KVM-enabled guest device tree look like this :

/proc/device-tree/cpus/HOST@0/...
/proc/device-tree/cpus/HOST@4/...

This patch replaces this obscure 'HOST' label with a more descriptive label.
This is gathered by first identifying the PVR of the host, and then determining
the host CPU alias which corresponds to the model indicated by this PVR.

Sample final outcome for a KVM-enabled pseries guest running on POWER7:
/proc/device-tree/cpus/PowerPC,POWER7@0/...
/proc/device-tree/cpus/PowerPC,POWER7@4/...

This also helps userspace tools like ppc64_cpu, which expect the device tree
to be in this format in the guest.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 hw/ppc/spapr.c  | 17 ++---
 target-ppc/cpu-qom.h|  1 +
 target-ppc/translate_init.c | 28 
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fe34291..e084f3f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -79,6 +79,7 @@
 
 #define HTAB_SIZE(spapr) (1ULL << ((spapr)->htab_shift))
 
+#define PPC_DEVTREE_STR "PowerPC,"
 sPAPREnvironment *spapr;
 
 int spapr_allocate_irq(int hint, bool lsi)
@@ -295,9 +296,12 @@ static void *spapr_create_fdt_skel(const char *cpu_model,
 _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
 _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
 
-modelname = g_strdup(cpu_model);
+/* device tree nodes must look like this :
+ * PowerPC,CPU_ALIAS@0
+ */
+modelname = g_strdup_printf(PPC_DEVTREE_STR "%s", cpu_model);
 
-for (i = 0; i < strlen(modelname); i++) {
+for (i = strlen(PPC_DEVTREE_STR); i < strlen(modelname); i++) {
 modelname[i] = toupper(modelname[i]);
 }
 
@@ -735,7 +739,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 MemoryRegion *sysmem = get_system_memory();
 MemoryRegion *ram = g_new(MemoryRegion, 1);
 hwaddr rma_alloc_size;
-uint32_t initrd_base = 0;
+uint32_t initrd_base = 0, pvr = 0;
 long kernel_size = 0, initrd_size = 0;
 long load_limit, rtas_limit, fw_size;
 char *filename;
@@ -959,6 +963,13 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 
 spapr->entry_point = 0x100;
 
+/* Ensure that cpu_model is correctly reflected for a KVM guest */
+if (kvm_enabled() && !strcmp(cpu_model, "host")) {
+asm ("mfpvr %0"
+: "=r"(pvr));
+cpu_model = ppc_cpu_alias_by_pvr(pvr);
+}
+
 /* Prepare the device tree */
 spapr->fdt_skel = spapr_create_fdt_skel(cpu_model,
 initrd_base, initrd_size,
diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index 84ba105..90dd1dd 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -99,6 +99,7 @@ static inline PowerPCCPU *ppc_env_get_cpu(CPUPPCState *env)
 #define ENV_OFFSET offsetof(PowerPCCPU, env)
 
 PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr);
+const char *ppc_cpu_alias_by_pvr(uint32_t pvr);
 
 void ppc_cpu_do_interrupt(CPUState *cpu);
 void ppc_cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf,
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 50e0ee5..21a7f6f 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7913,6 +7913,34 @@ PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr)
 return pcc;
 }
 
+const char *ppc_cpu_alias_by_pvr(uint32_t pvr)
+{
+int i;
+const char *cpu_alias;
+char *offset, *model;
+
+cpu_alias  = object_class_get_name(OBJECT_CLASS
+(ppc_cpu_class_by_pvr(pvr)));
+
+/* Replace the full class name in cpu_alias with the CPU alias
+ * Eg, POWER7_V2.3-POWERPC64-CPU can simply be called
+ * POWER7
+ */
+
+offset = strstr(cpu_alias, "-" TYPE_POWERPC_CPU);
+if (offset) {
+model = g_strndup(cpu_alias, offset - cpu_alias);
+for (i = 0; ppc_cpu_aliases[i].model != NULL; i++) {
+if (strcmp(ppc_cpu_aliases[i].model, model) == 0) {
+g_free(model);
+return ppc_cpu_aliases[i].alias;
+}
+}
+g_free(model);
+}
+return NULL;
+}
+
 static gint ppc_cpu_compare_class_name(gconstpointer a, gconstpointer b)
 {
 ObjectClass *oc = (ObjectClass *)a;
-- 
1.7.11.7



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




Re: [Qemu-devel] [PATCH 16/17] ppc64: Enable QEMU to run on POWER 8 DD1 chip.

2013-07-04 Thread Prerna Saxena
Hi Andreas,
Thank you for taking a look.
I have incorporated your feedback into a new patch, attached herewith.


Regards,
Prerna

Subject: [PATCH] target-ppc: Add POWER8 v1.0 CPU model

This patch adds CPU PVR definition for POWER8,
and enables QEMU to launch guests on POWER8 hardware.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
Reviewed-by: Paul Mackerras pau...@samba.org
Reviewed-by: Andreas Farber afaer...@suse.de
---
 target-ppc/cpu-models.c |  3 +++
 target-ppc/cpu-models.h |  1 +
 target-ppc/translate_init.c | 34 ++
 3 files changed, 38 insertions(+)

diff --git a/target-ppc/cpu-models.c b/target-ppc/cpu-models.c
index 17f56b7..72f7088 100644
--- a/target-ppc/cpu-models.c
+++ b/target-ppc/cpu-models.c
@@ -1145,6 +1145,8 @@
 "POWER7 v2.1")
 POWERPC_DEF("POWER7_v2.3",   CPU_POWERPC_POWER7_v23, POWER7,
 "POWER7 v2.3")
+POWERPC_DEF("POWER8_v1.0",   CPU_POWERPC_POWER8_v10, POWER8,
+"POWER8 v1.0")
 POWERPC_DEF("970",   CPU_POWERPC_970, 970,
 "PowerPC 970")
 POWERPC_DEF("970fx_v1.0", CPU_POWERPC_970FX_v10,  970FX,
@@ -1390,6 +1392,7 @@ const PowerPCCPUAlias ppc_cpu_aliases[] = {
 { "Dino",  "POWER3" },
 { "POWER3+", "631" },
 { "POWER7", "POWER7_v2.3" },
+{ "POWER8", "POWER8_v1.0" },
 { "970fx", "970fx_v3.1" },
 { "970mp", "970mp_v1.1" },
 { "Apache", "RS64" },
diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
index a94f835..1c67a0e 100644
--- a/target-ppc/cpu-models.h
+++ b/target-ppc/cpu-models.h
@@ -555,6 +555,7 @@ enum {
 CPU_POWERPC_POWER7_v20 = 0x003F0200,
 CPU_POWERPC_POWER7_v21 = 0x003F0201,
 CPU_POWERPC_POWER7_v23 = 0x003F0203,
+CPU_POWERPC_POWER8_v10 = 0x004B0100,
 CPU_POWERPC_970= 0x00390202,
 CPU_POWERPC_970FX_v10  = 0x00391100,
 CPU_POWERPC_970FX_v20  = 0x003C0200,
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 71e434a..a1d8e70 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7042,6 +7042,40 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 pcc->l1_dcache_size = 0x8000;
 pcc->l1_icache_size = 0x8000;
 }
+
+POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
+
+dc->desc = "POWER8";
+pcc->init_proc = init_proc_POWER7;
+pcc->check_pow = check_pow_nocheck;
+pcc->insns_flags = PPC_INSNS_BASE | PPC_STRING | PPC_MFTB |
+   PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
+   PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |
+   PPC_FLOAT_STFIWX |
+   PPC_CACHE | PPC_CACHE_ICBI | PPC_CACHE_DCBZ |
+   PPC_MEM_SYNC | PPC_MEM_EIEIO |
+   PPC_MEM_TLBIE | PPC_MEM_TLBSYNC |
+   PPC_64B | PPC_ALTIVEC |
+   PPC_SEGMENT_64B | PPC_SLBI |
+   PPC_POPCNTB | PPC_POPCNTWD;
+pcc->insns_flags2 = PPC2_VSX | PPC2_DFP | PPC2_DBRX;
+pcc->msr_mask = 0x800000000204FF36ULL;
+pcc->mmu_model = POWERPC_MMU_2_06;
+#if defined(CONFIG_SOFTMMU)
+pcc->handle_mmu_fault = ppc_hash64_handle_mmu_fault;
+#endif
+pcc->excp_model = POWERPC_EXCP_POWER7;
+pcc->bus_model = PPC_FLAGS_INPUT_POWER7;
+pcc->bfd_mach = bfd_mach_ppc64;
+pcc->flags = POWERPC_FLAG_VRE | POWERPC_FLAG_SE |
+ POWERPC_FLAG_BE | POWERPC_FLAG_PMM |
+ POWERPC_FLAG_BUS_CLK | POWERPC_FLAG_CFAR;
+pcc->l1_dcache_size = 0x8000;
+pcc->l1_icache_size = 0x8000;
+}
 #endif /* defined (TARGET_PPC64) */
 
 
-- 
1.7.11.4



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Request for inputs]Qemu parameters that need runtime change.

2011-03-02 Thread Prerna Saxena

Hi,
QEMU at present can be started with a huge list of parameters, and only 
a subset of these can be changed at runtime. For the remaining ones, one 
needs to restart the qemu instance.
I've been trying to put together a list of such parameters that would
make good candidates for a runtime change. I'd appreciate suggestions for
other parameters that belong on this list, and opinions on whether the
following would be good-to-have features:
1. Allowing a runtime change from one chardev backend to another (e.g.,
from a TCP socket to a Unix socket, and vice versa)

2. Changing the network interfaces (e.g., from -net user to -net tap)

These are the ones I'm presently aware of; it would be good to get more
input on what else could be done here.


--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] [RFC][PATCH 0/2] Allow cache settings for block devices to be changed at runtime.

2011-02-28 Thread Prerna Saxena
The following patchset introduces monitor commands:

1. set_cache DEVICE CACHE-SETTING
Change cache settings for block device, DEVICE, through the monitor.
(Available options : 'none', 'writeback', 'writethrough')
Eg,
(qemu)set_cache ide0-hd0 none 
- Changes cache setting for ide0-hd0 to 'none'

2. info block
Now extended to display cache settings for available block devices.

TODOS :
---
1. Support 'unsafe' cache mode.
2. Display current cache setting for device, if the CACHE-SETTING option
is not supplied by the user. Eg, 
(qemu)set_cache ide0-hd0
presently errors out. Ideally, it should display current cache setting 
for the given device ide0-hd0
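
The mapping from CACHE-SETTING to open flags in patch 1/2 can be sketched in isolation as follows. This is a standalone illustration: the numeric flag values are placeholders chosen for the example (QEMU's block layer defines the real BDRV_O_* constants), and cache_mode_to_flags() is a hypothetical helper rather than a function from the patch.

```c
#include <assert.h>
#include <string.h>

/* Placeholder values for illustration; QEMU defines the real constants. */
#define BDRV_O_NOCACHE    0x0020
#define BDRV_O_CACHE_WB   0x0040
#define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_CACHE_WB)

/*
 * Hypothetical helper: compute the open flags for a requested cache mode.
 * Returns 0 and stores the result in *new_flags, or -1 for an unknown or
 * unsupported mode (including "unsafe", which is still on the TODO list).
 */
int cache_mode_to_flags(int old_flags, const char *mode, int *new_flags)
{
    int flags;

    assert(mode != NULL && new_flags != NULL);
    flags = old_flags & ~BDRV_O_CACHE_MASK; /* clear the old cache bits */

    if (strcmp(mode, "none") == 0) {
        flags |= BDRV_O_NOCACHE;
    } else if (strcmp(mode, "writeback") == 0) {
        flags |= BDRV_O_CACHE_WB;
    } else if (strcmp(mode, "writethrough") != 0) {
        return -1; /* writethrough is the default: no cache bits set */
    }

    *new_flags = flags;
    return 0;
}
```

If the computed flags equal the current ones, nothing needs to be reopened, which is the case the patch reports without returning an error.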

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [RFC][PATCH 1/2] Add monitor command 'set_cache' to change cache settings for a block device.

2011-02-28 Thread Prerna Saxena
Usage :
(qemu) set_cache DEVICE CACHE-MODE
where CACHE-MODE can be one of writeback/ writethrough/ none.

At present, the image file is closed and re-opened with the appropriate flags.
This could cause problems if the underlying image is deleted while a running
qemu instance is using it: changing the cache setting closes the image file,
and a deleted file cannot be re-opened afterwards.
Suggestions to fix this?

---
 blockdev.c  |   76 +++
 blockdev.h  |1 +
 hmp-commands.hx |   13 +
 3 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 0690cc8..6735205 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -636,6 +636,82 @@ out:
 return ret;
 }
 
+int do_set_cache(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+const char *device = qdict_get_str(qdict, "device");
+const char *cache = qdict_get_str(qdict, "cache");
+BlockDriverState *bs;
+BlockDriver *drv;
+int ret = 0;
+int bdrv_flags = 0;
+
+if (!cache) {
+   /* TODO: in the absence of a change request,
+ simply display current cache setting.
+ Currently one needs 'info block' to query this */
+qerror_report(QERR_MISSING_PARAMETER, "cache");
+return -1;
+}
+
+bs = bdrv_find(device);
+if (!bs) {
+qerror_report(QERR_DEVICE_NOT_FOUND, device);
+return -1;
+}
+
+/* Clear old flags */
+bdrv_flags = bs->open_flags;
+if (bdrv_flags & BDRV_O_CACHE_MASK) {
+bdrv_flags &= ~BDRV_O_CACHE_MASK;
+}
+
+/* Determine flags for requested cache setting */
+if (!strcmp(cache, "none")) {
+bdrv_flags |= BDRV_O_NOCACHE;
+} else if (!strcmp(cache, "writeback")) {
+bdrv_flags |= BDRV_O_CACHE_WB;
+} else if (!strcmp(cache, "unsafe")) {
+   /* TODO : Support unsafe mode */
+qerror_report(QERR_INVALID_PARAMETER_VALUE, "cache",
+   "writeback, writethrough, none");
+return -1;
+} else if (!strcmp(cache, "writethrough")) {
+/* Default setting */
+} else {
+qerror_report(QERR_INVALID_PARAMETER_VALUE, "cache",
+   "'cache' must be one of writeback, writethrough, none");
+return -1;
+}
+
+/* Verify that the cache setting specified is different from current.
+ * Does NOT call for error return, since the 'request' is already
+ * honoured.
+ */
+if (bdrv_flags == bs->open_flags) {
+qerror_report(QERR_PROPERTY_VALUE_IN_USE, device, "cache", cache);
+return 0;
+}
+
+/* Quiesce IO for the given block device */
+qemu_aio_flush();
+bdrv_flush(bs);
+
+/* Change cache value and restart IO on the block device */
+printf("Setting cache=%s for device %s [ filename %s ]", cache, device,
+bs->filename );
+drv = bs->drv;
+bdrv_close(bs);
+ret = bdrv_open(bs, bs->filename, bdrv_flags, drv);
+/*
+ * A failed attempt to reopen the image file must lead to 'abort()'
+ */
+if (ret != 0) {
+abort();
+}
+
+return ret;
+}
+
 static int eject_device(Monitor *mon, BlockDriverState *bs, int force)
 {
 if (!force) {
diff --git a/blockdev.h b/blockdev.h
index 2c9e780..9f35817 100644
--- a/blockdev.h
+++ b/blockdev.h
@@ -63,6 +63,7 @@ int do_change_block(Monitor *mon, const char *device,
 const char *filename, const char *fmt);
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data);
+int do_set_cache(Monitor *mon, const QDict *qdict, QObject **ret_data);
 int do_block_resize(Monitor *mon, const QDict *qdict, QObject **ret_data);
 
 #endif
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 372bef4..18761cf 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1066,7 +1066,20 @@ STEXI
 @findex watchdog_action
 Change watchdog action.
 ETEXI
+    {
+        .name       = "set_cache",
+        .args_type  = "device:B,cache:s",
+        .params     = "device writeback|writethrough|none",
+        .help       = "change cache settings for device",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_set_cache,
+    },
 
+STEXI
+@item set_cache
+@findex set_cache
+Set cache options for a block device.
+ETEXI
 {
 .name   = acl_show,
 .args_type  = aclname:s,
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
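
The flag selection in the set_cache patch above (clear the old cache bits, then set the bits for the requested mode) can be sketched in isolation. This is only an illustrative sketch: the EX_O_* values and the helper name ex_cache_mode_to_flags are assumptions made here, not QEMU's real BDRV_O_* constants or any existing QEMU function.

```c
#include <string.h>

/* Hypothetical flag values for illustration only; QEMU's real
 * BDRV_O_* constants are defined in its block layer and may differ. */
#define EX_O_NOCACHE    0x0020
#define EX_O_CACHE_WB   0x0040
#define EX_O_CACHE_MASK (EX_O_NOCACHE | EX_O_CACHE_WB)

/* Stores the new open flags in *out and returns 0, or returns -1 when
 * the mode is unknown or unsupported ("unsafe" is rejected, as in the
 * patch). Non-cache bits of old_flags are preserved. */
static int ex_cache_mode_to_flags(const char *mode, int old_flags, int *out)
{
    int flags = old_flags & ~EX_O_CACHE_MASK;   /* clear old cache bits */

    if (!strcmp(mode, "none")) {
        flags |= EX_O_NOCACHE;
    } else if (!strcmp(mode, "writeback")) {
        flags |= EX_O_CACHE_WB;
    } else if (!strcmp(mode, "writethrough")) {
        /* default setting: no cache bits set */
    } else {
        return -1;
    }

    *out = flags;
    return 0;
}
```

A caller would compare the result with the current flags and reopen the device only when they differ, which is what the patch does before calling bdrv_close() and bdrv_open().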




[Qemu-devel] [RFC][PATCH 2/2] Extend monitor command 'info block' to display cache settings for block devices.

2011-02-28 Thread Prerna Saxena
(qemu)info block
SAMPLE output :
ide0-hd0: type=hd removable=0 cache=none file=/tmp/abc.img ro=0
drv=qcow2 encrypted=0

---
 block.c |   22 --
 1 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index f7d91a2..c717888 100644
--- a/block.c
+++ b/block.c
@@ -1707,6 +1707,23 @@ static void bdrv_print_dict(QObject *obj, void *opaque)
         monitor_printf(mon, " locked=%d", qdict_get_bool(bs_dict, "locked"));
     }
 
+    if (qdict_haskey(bs_dict, "open_flags") &&
+        !strcmp(qdict_get_str(bs_dict, "type"), "hd")) {
+        int open_flags = qdict_get_int(bs_dict, "open_flags");
+        if (open_flags & BDRV_O_NOCACHE) {
+            monitor_printf(mon, " cache=none");
+        } else if (open_flags & BDRV_O_CACHE_WB) {
+            if (open_flags & BDRV_O_NO_FLUSH) {
+                monitor_printf(mon, " cache=unsafe");
+            }
+            else {
+                monitor_printf(mon, " cache=writeback");
+            }
+        } else {
+            monitor_printf(mon, " cache=writethrough");
+        }
+    }
+
     if (qdict_haskey(bs_dict, "inserted")) {
         QDict *qdict = qobject_to_qdict(qdict_get(bs_dict, "inserted"));
 
@@ -1756,9 +1773,10 @@ void bdrv_info(Monitor *mon, QObject **ret_data)
 }
 
         bs_obj = qobject_from_jsonf("{ 'device': %s, 'type': %s, "
-                                    "'removable': %i, 'locked': %i }",
+                                    "'removable': %i, 'locked': %i, "
+                                    "'open_flags': %d }",
                                     bs->device_name, type, bs->removable,
-                                    bs->locked);
+                                    bs->locked, bs->open_flags);
 
         if (bs->drv) {
 QObject *obj;
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
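
The decision order used by the 'info block' patch above (cache=none first, then writeback versus unsafe, otherwise writethrough) can be shown as a small standalone function. This is a sketch under assumptions: the EX_O_* values and the name ex_cache_mode_name are made up for illustration and are not QEMU definitions.

```c
/* Hypothetical flag values for illustration only; not QEMU's real
 * BDRV_O_* constants. */
#define EX_O_NOCACHE  0x0020
#define EX_O_CACHE_WB 0x0040
#define EX_O_NO_FLUSH 0x0200

/* Maps a set of open flags to the cache mode name that the patch
 * prints, using the same precedence as the patch. */
static const char *ex_cache_mode_name(int open_flags)
{
    if (open_flags & EX_O_NOCACHE) {
        return "none";
    }
    if (open_flags & EX_O_CACHE_WB) {
        return (open_flags & EX_O_NO_FLUSH) ? "unsafe" : "writeback";
    }
    return "writethrough";
}
```

Note that, as in the patch, the no-cache bit takes precedence over the writeback bit when both are set.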




Re: [Qemu-devel] [PATCH v2] Add a DTrace tracing backend targetted for SystemTAP compatability

2010-10-27 Thread Prerna Saxena

ACK, works well!
A suggestion though..

On 10/20/2010 07:39 PM, Daniel P. Berrange wrote:


eg, instead of

   probe process("qemu").mark("qemu_malloc") {
     printf("Malloc %d %p\n", $arg1, $arg2);
   }

The addition of qemu.stp to /usr/share/systemtap/tapset/
lets users write

   probe qemu.qemu_malloc {
     printf("Malloc %d %p\n", size, ptr);
   }
...


diff --git a/tracetool b/tracetool
index 7010858..047f16b 100755
--- a/tracetool
+++ b/tracetool
+linetos_dtrace()
+{
+local name args arglist state
+
+# Define prototype for probe arguments
+cat <<EOF
+probe qemu.$name = process("qemu").mark("$name")
+{


The 'process' probes only work by looking for the binary in $PATH,
unless the full path is specified. When compiling qemu in non-standard
locations (i.e. with --prefix), such probes would not point to the
correct binary. It would be nice if tracetool could pass the full build
path for defining the probe point. E.g.,


probe qemu.qemu_malloc =
process("/Path/to/build/dir/bin/qemu").mark("qemu_malloc") { .. }


--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
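
To make the suggestion above concrete, here is a minimal sketch of generating a tapset alias that carries the full binary path. tracetool itself is a shell script, so this C helper is purely illustrative; the function name ex_format_probe_alias is an assumption, and the path would come from the build configuration.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes a SystemTap probe alias for one trace event into buf, using
 * the full path of the binary instead of a bare name looked up in
 * $PATH. Returns the snprintf() result: the length that was needed,
 * excluding the terminating NUL. */
static int ex_format_probe_alias(char *buf, size_t len,
                                 const char *binary, const char *event)
{
    return snprintf(buf, len,
                    "probe qemu.%s = process(\"%s\").mark(\"%s\") {}",
                    event, binary, event);
}
```

Called with a binary path such as the install prefix plus bin/qemu and the event name qemu_malloc, this produces one alias line of the form shown in the mail above.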



[Qemu-devel] Re: [RFC][PATCH 4/5] trace-event

2010-10-24 Thread Prerna Saxena

On 10/22/2010 08:57 PM, Stefan Hajnoczi wrote:

On Thu, Oct 21, 2010 at 03:10:18PM +0530, Prerna Saxena wrote:

trace-event : QMP interface to change state of a trace-event.
(Analogous to hmp command : trace-event )

Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
  qmp-commands.hx |   32 
  1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 7e95f4e..f2008e8 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -761,6 +761,38 @@ Example:

  Note: This command must be issued before issuing any other command.

+EQMP
+
+    {
+        .name       = "trace-event",
+        .args_type  = "name:s,option:b",
+        .params     = "name on|off",
+        .help       = "changes state of a specific trace event",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_change_trace_event_state_qmp,
+    },
+
+SQMP
+trace-event
+---
+
+Change state of a trace-event.


The name is a little odd because it has no verb.  How about
set-trace-event or enable-trace-event?



Sure, makes sense.
I'll incorporate this when I send out the next set of patches.

Thanks,
--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] Re: [Tracing][v4 PATCH 2/2] Add documentation for QMP interfaces

2010-10-21 Thread Prerna Saxena
 set.


+- "status": State of trace-event [ '0': disabled; '1':enabled  ] (json-int)


This should be a json bool called 'enabled' or 'disabled', but what happens
when a file is not defined?



Changed type to json bool.
The trace infrastructure sets the trace-output file to trace-PID
(created in the current directory) if no explicit trace-file is
specified at startup. (Users can also change the default trace-file at
runtime using the hmp command 'trace-file set FILE'. I'll be covering
the QMP interface for the same in the upcoming patchset.)



+
+Example:
+
+-> { "execute": "query-trace-file" }
+<- {
+      "return":{
+         "trace-file": "trace-26609",
+         "status": 1
+      }
+   }
+
+EQMP





--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] Re: [Tracing][RFC] QMP interface to toggle state of a trace-event

2010-10-21 Thread Prerna Saxena


Thanks for the review!

On 10/21/2010 12:53 AM, Luiz Capitulino wrote:

On Wed, 20 Oct 2010 15:28:49 +0530
Prerna Saxena <pre...@linux.vnet.ibm.com> wrote:


QMP command trace-event to toggle state of a trace-event.
  Illustration :
  -> { "execute": "trace-event", "arguments": { "name": "qemu_malloc",
"option": true} }
  <- { "return": {} }

Posting this as an RFC for now. I'll post the final version as a part of
  the cumulative QMP patchset for tracing ( including patches for query-*
commands posted earlier :
http://lists.gnu.org/archive/html/qemu-devel/2010-10/msg01232.html )

Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
  hmp-commands.hx |2 +-
  monitor.c   |   43 +--
  qmp-commands.hx |   32 
  3 files changed, 70 insertions(+), 7 deletions(-)
diff --git a/qmp-commands.hx b/qmp-commands.hx
index bc79b55..7613d73 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -761,6 +761,38 @@ Example:

  Note: This command must be issued before issuing any other command.

+EQMP
+
+    {
+        .name       = "trace-event",
+        .args_type  = "name:s,option:b",
+        .params     = "name on|off",
+        .help       = "changes state of a specific trace event",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_change_trace_event_state_qmp,
+    },
+
+SQMP
+trace-event
+---
+
+Change state of a trace-event.
+
+Arguments:
+
+- "name": name of trace-event (json-string)
+- "option": new state for the trace-event (json-bool)


This should be called 'enabled'.



I agree, 'enabled' is less ambiguous. Will change in the next patchset.



I think you should submit a new series containing only the proposed
interfaces documentation (one patch per interface) and the intro email
should describe the use cases the proposed interfaces are supposed to
address.


I'll send out the new documentation patchset series shortly.


+
+Example:
+
+-> { "execute": "trace-event", "arguments": { "name": "ABC", "option":false } }
+<- { "return": {} }
+
+Notes:
+
+(1) The 'query-trace-events' command should be used to check the new state
+of the trace-event.
+
  3. Query Commands
  =






--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] [RFC][PATCH 1/5] query-trace command

2010-10-21 Thread Prerna Saxena
QMP interface query-trace to list current contents of trace-buffer.
( Analogous to hmp command : info trace )


Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 qmp-commands.hx |   51 +++
 1 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 793cf1c..f289064 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1539,3 +1539,54 @@ Example:
 
 EQMP
 
+SQMP
+query-trace
+-
+
+Show current contents of trace buffer.
+
+Returns a json-array of json-objects containing the following data:
+
+- "event_id": Event ID for the trace-event (json-int)
+- "timestamp": trace timestamp in ns (json-int)
+- "trace-arg": A json-object containing args logged by the trace-event:
+    - "arg1": First trace argument (json-int)
+    - "arg2": Second trace argument (json-int)
+    - "arg3": Third trace argument (json-int)
+    - "arg4": Fourth trace argument (json-int)
+    - "arg5": Fifth trace argument (json-int)
+    - "arg6": Sixth trace argument (json-int)
+
+Example:
+
+-> { "execute": "query-trace" }
+<- {
+      "return":[
+         {
+            "event": 22,
+            "timestamp": 129456235912365,
+            "trace-arg":{
+               "arg1": 886,
+               "arg2": 80,
+               "arg3": 0,
+               "arg4": 0,
+               "arg5": 0,
+               "arg6": 0
+            }
+         },
+         {
+            "event": 22,
+            "timestamp": 129456235973407,
+            "trace-arg":{
+               "arg1": 886,
+               "arg2": 80,
+               "arg3": 0,
+               "arg4": 0,
+               "arg5": 0,
+               "arg6": 0
+            }
+         }
+      ]
+   }
+
+EQMP
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
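
As an illustration of the per-record layout proposed in the query-trace documentation above, the following sketch formats one trace record as a JSON object with a nested "trace-arg" member. It follows the key names used in the example ("event", "timestamp"). The ExTraceRecord struct and ex_record_to_json() are hypothetical names invented for this sketch; QEMU itself builds the reply with its QObject API rather than snprintf().

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record layout for illustration; not QEMU's own
 * trace record structure. */
typedef struct {
    uint64_t event;
    uint64_t timestamp_ns;
    uint64_t args[6];
} ExTraceRecord;

/* Formats one record as a JSON object with the six arguments grouped
 * under "trace-arg". Returns the snprintf() result: the length that
 * was needed, excluding the terminating NUL. */
static int ex_record_to_json(char *buf, size_t len, const ExTraceRecord *r)
{
    return snprintf(buf, len,
                    "{\"event\": %" PRIu64 ", \"timestamp\": %" PRIu64
                    ", \"trace-arg\": {"
                    "\"arg1\": %" PRIu64 ", \"arg2\": %" PRIu64
                    ", \"arg3\": %" PRIu64 ", \"arg4\": %" PRIu64
                    ", \"arg5\": %" PRIu64 ", \"arg6\": %" PRIu64 "}}",
                    r->event, r->timestamp_ns,
                    r->args[0], r->args[1], r->args[2],
                    r->args[3], r->args[4], r->args[5]);
}
```

Grouping the arguments in a nested object keeps the fixed fields (event, timestamp) separate from the event-specific payload, which is the difference between this proposal and the flat arg1..arg6 layout of the earlier patch versions.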




[Qemu-devel] [RFC][PATCH 2/5] query-trace-events

2010-10-21 Thread Prerna Saxena
'query-trace-events' : QMP interface to display currently available 
trace-events with their state.
( Analogous to hmp command : info trace-events )

Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 qmp-commands.hx |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index f289064..e079eef 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1590,3 +1590,35 @@ Example:
}
 
 EQMP
+
+SQMP
+query-trace-events
+--
+
+Show all available trace-events & their state.
+
+Returns a json-array of json-objects containing the following data:
+
+- "name": Name of Trace-event (json-string)
+- "event_id": Event ID of Trace-event (json-int)
+- "state": State of trace-event (json-bool)
+
+Example:
+
+-> { "execute": "query-trace-events" }
+<- {
+      "return":[
+         {
+            "name": "qemu_malloc",
+            "event_id": 0,
+            "state": false
+         },
+         {
+            "name": "qemu_realloc",
+            "event_id": 1,
+            "state": false
+         }
+      ]
+   }
+
+EQMP
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [RFC][PATCH 5/5] set-trace-file

2010-10-21 Thread Prerna Saxena
set-trace-file : QMP command to:
 - Enable/disable logging traces to file
 - Set a new output file
 - Flush a semi-filled trace-buffer to output file.


Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 qmp-commands.hx |   41 +
 1 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index f2008e8..295382f 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -793,6 +793,47 @@ Notes:
 (1) The 'query-trace-events' command should be used to check the new state
 of the trace-event.
 
+EQMP
+
+    {
+        .name       = "set-trace-file",
+        .args_type  = "enable:-e?,flush:-f?,filename:F?",
+        .params     = "[-e] [-f] [filename]",
+        .help       = "Sets a user-specified output file to write traces to",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_set_trace_file_qmp,
+    },
+
+SQMP
+set-trace-file
+--
+
+Set a new output file to log trace data to.
+
+Arguments:
+
+- "filename": name of new output file to write trace data to.
+              (json-string, optional)
+- "enable": if false, traces are not written to file.
+            Trace buffer contents are logged to a file only when this
+            is true. (json-bool, optional, defaults to false)
+- "flush": if true, contents of trace buffer are immediately written to file,
+           instead of waiting for the buffer to be full.
+           (json-bool, optional, defaults to false)
+
+Example:
+1. Set a new trace-file:
+-> { "execute": "set-trace-file", "arguments": { "filename": "ABC",
+                                                 "enable":true } }
+<- { "return": {} }
+
+2. Flush the current traces to file:
+-> { "execute": "set-trace-file", "arguments": { "flush": true } }
+
+Notes:
+
+(1) The 'query-trace-file' command should be used to check active trace-file.
+
 3. Query Commands
 =
 
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [RFC] [PATCH 3/5] query-trace-file

2010-10-21 Thread Prerna Saxena
'query-trace-file' : QMP interface to find currently set trace file and 
its status.
(Analogous to hmp command : trace-file)

Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 qmp-commands.hx |   24 
 1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index e079eef..7e95f4e 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1622,3 +1622,27 @@ Example:
}
 
 EQMP
+
+SQMP
+query-trace-file
+
+
+Display the trace file name to which trace data is currently logged, and its
+status.
+
+Returns a json-object containing the following data:
+
+- "trace-file": Name + path of Trace-file (json-string)
+- "enabled": State of trace-event (json-bool)
+
+Example:
+
+-> { "execute": "query-trace-file" }
+<- {
+      "return":{
+         "trace-file": "/tmp/trace-26609",
+         "enabled": true
+      }
+   }
+
+EQMP
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [RFC][PATCH 4/5] trace-event

2010-10-21 Thread Prerna Saxena
trace-event : QMP interface to change state of a trace-event.
(Analogous to hmp command : trace-event )

Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 qmp-commands.hx |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 7e95f4e..f2008e8 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -761,6 +761,38 @@ Example:
 
 Note: This command must be issued before issuing any other command.
 
+EQMP
+
+    {
+        .name       = "trace-event",
+        .args_type  = "name:s,option:b",
+        .params     = "name on|off",
+        .help       = "changes state of a specific trace event",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_change_trace_event_state_qmp,
+    },
+
+SQMP
+trace-event
+---
+
+Change state of a trace-event.
+
+Arguments:
+
+- "name": name of trace-event (json-string)
+- "enable": New state to be set for the trace-event (json-bool)
+
+Example:
+
+-> { "execute": "trace-event", "arguments": { "name": "ABC", "enable":false } }
+<- { "return": {} }
+
+Notes:
+
+(1) The 'query-trace-events' command should be used to check the new state
+of the trace-event.
+
 3. Query Commands
 =
 
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [RFC 0/5] QMP interfaces for tracing

2010-10-21 Thread Prerna Saxena
As suggested by Luiz, I'm posting this set of documentation patches that
describe the proposed QMP interfaces for tracing.

QMP commands :
* trace-event : to toggle state of a trace-event.
* set-trace-file : to set a new output file for tracing; enable/disable 
   writing traces to file; flush buffer contents to file.
* Query Commands :
--
   * query-trace : to list current contents of the trace buffer that haven't
   been written to file.
   * query-trace-events : to list all available trace-events and their status.
   * query-trace-file : to display currently set trace file and its status.

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing][RFC] QMP interface to toggle state of a trace-event

2010-10-20 Thread Prerna Saxena
QMP command trace-event to toggle state of a trace-event.
 Illustration :
 -> { "execute": "trace-event", "arguments": { "name": "qemu_malloc", "option":
true} }
 <- { "return": {} }

Posting this as an RFC for now. I'll post the final version as a part of
 the cumulative QMP patchset for tracing ( including patches for query-* 
commands posted earlier : 
http://lists.gnu.org/archive/html/qemu-devel/2010-10/msg01232.html )

Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 hmp-commands.hx |2 +-
 monitor.c   |   43 +--
 qmp-commands.hx |   32 
 3 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 81999aa..76ec2fe 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -149,7 +149,7 @@ ETEXI
         .args_type  = "name:s,option:b",
         .params     = "name on|off",
         .help       = "changes status of a specific trace event",
-        .mhandler.cmd = do_change_trace_event_state,
+        .mhandler.cmd = do_change_trace_event_state_hmp,
     },
 
 STEXI
diff --git a/monitor.c b/monitor.c
index c7e1f53..0766ed3 100644
--- a/monitor.c
+++ b/monitor.c
@@ -545,17 +545,43 @@ static void do_help_cmd(Monitor *mon, const QDict *qdict)
 }
 
 #ifdef CONFIG_SIMPLE_TRACE
-static void do_change_trace_event_state(Monitor *mon, const QDict *qdict)
+
+/**
+ * HMP handler to change trace event state.
+ *
+ */
+void do_change_trace_event_state_hmp(Monitor *mon, const QDict *qdict)
 {
-    const char *tp_name = qdict_get_str(qdict, "name");
-    bool new_state = qdict_get_bool(qdict, "option");
-    int ret = st_change_trace_event_state(tp_name, new_state);
+    if (!do_change_trace_event_state_generic(qdict)) {
+        monitor_printf(mon, "unknown event name \"%s\"\n",
+                       qdict_get_str(qdict, "name"));
+    }
+}
 
-    if (!ret) {
-        monitor_printf(mon, "unknown event name \"%s\"\n", tp_name);
+/**
+ * QMP handler to change trace event state.
+ *
+ */
+static int do_change_trace_event_state_qmp(Monitor *mon, const QDict *qdict,
+                                           QObject **ret_data)
+{
+    if (!do_change_trace_event_state_generic(qdict)) {
+        qerror_report(QERR_INVALID_PARAMETER, qdict_get_str(qdict, "name"));
+        return -1;
     }
+    return 0;
 }
 
+/**
+ * Generic handler to change trace event state.
+ *
+ */
+static int do_change_trace_event_state_generic(const QDict *qdict)
+{
+    const char *tp_name = qdict_get_str(qdict, "name");
+    bool new_state = qdict_get_bool(qdict, "option");
+    return st_change_trace_event_state(tp_name, new_state);
+}
 static void do_trace_file(Monitor *mon, const QDict *qdict)
 {
     const char *op = qdict_get_try_str(qdict, "op");
@@ -583,6 +609,11 @@ static void do_info_trace_file_to_qmp(Monitor *mon, 
QObject **ret_data)
 {
 *ret_data = st_print_file_to_qobject();
 }
+
+#else
+static int do_change_trace_event_state_qmp(Monitor *mon, const QDict *qdict,
+QObject **ret_data) {}
+
 #endif
 
 static void user_monitor_complete(void *opaque, QObject *ret_data)
diff --git a/qmp-commands.hx b/qmp-commands.hx
index bc79b55..7613d73 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -761,6 +761,38 @@ Example:
 
 Note: This command must be issued before issuing any other command.
 
+EQMP
+
+    {
+        .name       = "trace-event",
+        .args_type  = "name:s,option:b",
+        .params     = "name on|off",
+        .help       = "changes state of a specific trace event",
+        .user_print = monitor_user_noop,
+        .mhandler.cmd_new = do_change_trace_event_state_qmp,
+    },
+
+SQMP
+trace-event
+---
+
+Change state of a trace-event.
+
+Arguments:
+
+- "name": name of trace-event (json-string)
+- "option": new state for the trace-event (json-bool)
+
+Example:
+
+-> { "execute": "trace-event", "arguments": { "name": "ABC", "option":false } }
+<- { "return": {} }
+
+Notes:
+
+(1) The 'query-trace-events' command should be used to check the new state 
+of the trace-event.
+
 3. Query Commands
 =
 
-- 
1.7.2.3



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
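
The split in the patch above (one shared helper, with thin HMP and QMP wrappers that differ only in how they report failure) can be sketched on its own. Everything here is a simplified stand-in: the event table, ex_change_state() and ex_change_state_qmp() are hypothetical names for illustration, not QEMU functions. The helper is placed before its callers so that no forward declaration is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace-event table for illustration only. */
typedef struct {
    const char *name;
    bool state;
} ExTraceEvent;

static ExTraceEvent ex_events[] = {
    { "qemu_malloc", false },
    { "qemu_realloc", false },
};

/* Shared helper: sets the state of the named event.
 * Returns true if the event exists, false otherwise. */
static bool ex_change_state(const char *name, bool enable)
{
    size_t i;

    for (i = 0; i < sizeof(ex_events) / sizeof(ex_events[0]); i++) {
        if (!strcmp(ex_events[i].name, name)) {
            ex_events[i].state = enable;
            return true;
        }
    }
    return false;
}

/* QMP-style wrapper: 0 on success, -1 for an unknown event name.
 * A real handler would also report an error object to the client. */
static int ex_change_state_qmp(const char *name, bool enable)
{
    return ex_change_state(name, enable) ? 0 : -1;
}
```

An HMP-style wrapper would call the same helper and print a message instead of returning an error code.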




[Qemu-devel] [Tracing][v4 PATCH 0/2] QMP Query interfaces for tracing

2010-10-19 Thread Prerna Saxena
This patch set introduces three QMP query interfaces for tracing :

* query-trace: to list current contents of trace-buffer
* query-trace-events : to list all available trace-events with their 
   state.
* query-trace-file   : to list currently set trace-file with its status.

Changelog :
---
Changes v3 -> v4 :
- Add 'query-trace-file' interface to query currently active trace-file.
- Cleanup.

Changes v2 -> v3 :
- Change declarations of st_print_trace_to_qlist() and 
st_print_trace_events_to_qlist() to return QList*

Changes v1 -> v2 :
- Add 'timestamp' field for query-trace output.
- Misc cleanups.


-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing][v4 PATCH 1/2] Introduce QMP interfaces

2010-10-19 Thread Prerna Saxena
[PATCH 1/2] Introduce QMP interfaces :
 - query-trace
 - query-trace-events
 - query-trace-file


Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 monitor.c |   53 ---
 simpletrace.c |   69 +
 simpletrace.h |5 
 3 files changed, 123 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index 260cc02..c7e1f53 100644
--- a/monitor.c
+++ b/monitor.c
@@ -578,6 +578,11 @@ static void do_trace_file(Monitor *mon, const QDict *qdict)
         help_cmd(mon, "trace-file");
 }
 }
+
+static void do_info_trace_file_to_qmp(Monitor *mon, QObject **ret_data)
+{
+*ret_data = st_print_file_to_qobject();
+}
 #endif
 
 static void user_monitor_complete(void *opaque, QObject *ret_data)
@@ -945,15 +950,27 @@ static void do_info_cpu_stats(Monitor *mon)
 #endif
 
 #if defined(CONFIG_SIMPLE_TRACE)
-static void do_info_trace(Monitor *mon)
+static void do_info_trace_print(Monitor *mon, const QObject *data)
 {
 st_print_trace((FILE *)mon, monitor_fprintf);
 }
 
-static void do_info_trace_events(Monitor *mon)
+static void do_info_trace(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = st_print_trace_to_qlist();
+*ret_data = QOBJECT(trace_event_list);
+}
+
+static void do_info_trace_events_print(Monitor *mon, const QObject *data)
 {
 st_print_trace_events((FILE *)mon, monitor_fprintf);
 }
+
+static void do_info_trace_events(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = st_print_trace_events_to_qlist();
+*ret_data = QOBJECT(trace_event_list);
+}
 #endif
 
 /**
@@ -2610,14 +2627,16 @@ static const mon_cmd_t info_cmds[] = {
         .args_type  = "",
         .params     = "",
         .help       = "show current contents of trace buffer",
-        .mhandler.info = do_info_trace,
+        .user_print = do_info_trace_print,
+        .mhandler.info_new = do_info_trace,
     },
     {
         .name       = "trace-events",
         .args_type  = "",
         .params     = "",
         .help       = "show available trace-events & their state",
-        .mhandler.info = do_info_trace_events,
+        .user_print = do_info_trace_events_print,
+        .mhandler.info_new = do_info_trace_events,
     },
 #endif
 {
@@ -2752,6 +2771,32 @@ static const mon_cmd_t qmp_query_cmds[] = {
 .mhandler.info_async = do_info_balloon,
 .flags  = MONITOR_CMD_ASYNC,
 },
+#if defined(CONFIG_SIMPLE_TRACE)
+    {
+        .name       = "trace",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show current contents of trace buffer",
+        .user_print = do_info_trace_print,
+        .mhandler.info_new = do_info_trace,
+    },
+    {
+        .name       = "trace-events",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show available trace-events & their state",
+        .user_print = do_info_trace_events_print,
+        .mhandler.info_new = do_info_trace_events,
+    },
+    {
+        .name       = "trace-file",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show currently active trace output file and its status",
+        .user_print = monitor_user_noop,
+        .mhandler.info_new = do_info_trace_file_to_qmp,
+    },
+#endif
 { /* NULL */ },
 };
 
diff --git a/simpletrace.c b/simpletrace.c
index deb1e07..d24d6b0 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -220,6 +220,43 @@ void st_print_trace(FILE *stream, int 
(*stream_printf)(FILE *stream, const char
 }
 }
 
+/**
+ * Add the current contents of trace-buffer as a QList.
+ *
+ */
+QList* st_print_trace_to_qlist(void)
+{
+QObject *data;
+QList *tlist;
+unsigned int i;
+
+tlist = qlist_new();
+
+    for (i = 0; i < trace_idx; i++) {
+        data = qobject_from_jsonf("{"
+                                  " 'timestamp': %" PRId64 ","
+                                  " 'event': %" PRId64 ","
+                                  " 'arg1': %" PRId64 ","
+                                  " 'arg2': %" PRId64 ","
+                                  " 'arg3': %" PRId64 ","
+                                  " 'arg4': %" PRId64 ","
+                                  " 'arg5': %" PRId64 ","
+                                  " 'arg6': %" PRId64
+                                  "}",
+                                  trace_buf[i].timestamp_ns,
+                                  trace_buf[i].event,
+                                  trace_buf[i].x1,
+                                  trace_buf[i].x2,
+                                  trace_buf[i].x3,
+                                  trace_buf[i].x4,
+                                  trace_buf[i].x5,
+                                  trace_buf[i].x6);
+        qlist_append_obj(tlist, data);
+    }
+
+return tlist;
+}
+
 void st_print_trace_events(FILE *stream, int (*stream_printf)(FILE *stream, 
const char *fmt, ...))
 {
 unsigned int i;
@@ -230,6 +267,38

[Qemu-devel] [Tracing][v4 PATCH 2/2] Add documentation for QMP interfaces

2010-10-19 Thread Prerna Saxena
[PATCH 2/2] Add documentation for QMP commands:
 - query-trace
 - query-trace-events
 - query-trace-file.


Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 qmp-commands.hx |   94 +++
 1 files changed, 94 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 793cf1c..bc79b55 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1539,3 +1539,97 @@ Example:
 
 EQMP
 
+SQMP
+query-trace
+-
+
+Show contents of trace buffer.
+
+Returns a set of json-objects containing the following data:
+
+- "event": Event ID for the trace-event (json-int)
+- "timestamp": trace timestamp (json-int)
+- "arg1" .. "arg6": Arguments logged by the trace-event (json-int)
+
+Example:
+
+-> { "execute": "query-trace" }
+<- {
+      "return":{
+         "event": 22,
+         "timestamp": 129456235912365,
+         "arg1": 886,
+         "arg2": 80,
+         "arg3": 0,
+         "arg4": 0,
+         "arg5": 0,
+         "arg6": 0
+       },
+       {
+         "event": 22,
+         "timestamp": 129456235973407,
+         "arg1": 886,
+         "arg2": 80,
+         "arg3": 0,
+         "arg4": 0,
+         "arg5": 0,
+         "arg6": 0
+       },
+       ...
+   }
+
+EQMP
+
+SQMP
+query-trace-events
+--
+
+Show all available trace-events & their state.
+
+Returns a set of json-objects containing the following data:
+
+- "name": Name of Trace-event (json-string)
+- "event-id": Event ID of Trace-event (json-int)
+- "state": State of trace-event [ '0': inactive; '1':active  ] (json-int)
+
+Example:
+
+-> { "execute": "query-trace-events" }
+<- {
+      "return":{
+         "name": "qemu_malloc",
+         "event-id": 0,
+         "state": 0
+      },
+      {
+         "name": "qemu_realloc",
+         "event-id": 1,
+         "state": 0
+      },
+      ...
+   }
+
+EQMP
+
+SQMP
+query-trace-file
+
+
+Display currently set trace file name and its status.
+
+Returns a set of json-objects containing the following data:
+
+- "trace-file": Name of Trace-file (json-string)
+- "status": State of trace-event [ '0': disabled; '1':enabled  ] (json-int)
+
+Example:
+
+-> { "execute": "query-trace-file" }
+<- {
+      "return":{
+         "trace-file": "trace-26609",
+         "status": 1
+      }
+   }
+
+EQMP
-- 
1.7.2.2



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




Re: [Qemu-devel] [Tracing][v4 PATCH 2/2] Add documentation for QMP interfaces

2010-10-19 Thread Prerna Saxena

On 10/19/2010 11:57 AM, Prerna Saxena wrote:

[PATCH 2/2] Add documentation for QMP commands:
  - query-trace
  - query-trace-events
  - query-trace-file.




I've been looking for ways to avoid building this documentation for
other trace backends (since these commands are only available with the
'simple' backend). However, it looks like hxtool blindly copies the text
between SQMP and EQMP.
The only option I can think of is to make hxtool parse CONFIG_* options
and build documentation accordingly. Is there a workaround I'm missing?


--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] [Tracing][RFC v3 PATCH 0/2] QMP Query interfaces for tracing

2010-10-18 Thread Prerna Saxena
This patch set introduces two QMP interfaces for tracing :

* query-trace: to list current contents of trace-buffer
* query-trace-events : to list all available trace-events with their state.

Changelog :
---
Changes v2 -> v3 :
- Change declarations of st_print_trace_to_qlist() and 
st_print_trace_events_to_qlist() to return QList*

Changes v1 -> v2 :
- Add 'timestamp' field for query-trace output.
- Misc cleanups.

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing][RFC v3 PATCH 1/2] Introduce QMP interfaces : query-trace query-trace-events

2010-10-18 Thread Prerna Saxena
[PATCH 1/2] Introduce QMP interfaces : query-trace & query-trace-events.

Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 monitor.c |   40 +++---
 simpletrace.c |   58 +
 simpletrace.h |4 +++
 3 files changed, 98 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index fbb678d..41f3477 100644
--- a/monitor.c
+++ b/monitor.c
@@ -941,15 +941,27 @@ static void do_info_cpu_stats(Monitor *mon)
 #endif
 
 #if defined(CONFIG_SIMPLE_TRACE)
-static void do_info_trace(Monitor *mon)
+static void do_info_trace_print(Monitor *mon)
 {
 st_print_trace((FILE *)mon, monitor_fprintf);
 }
 
-static void do_info_trace_events(Monitor *mon)
+static void do_info_trace(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = st_print_trace_to_qlist();
+*ret_data = QOBJECT(trace_event_list);
+}
+
+static void do_info_trace_events_print(Monitor *mon, const QObject *data)
 {
 st_print_trace_events((FILE *)mon, monitor_fprintf);
 }
+
+static void do_info_trace_events(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = st_print_trace_events_to_qlist();
+*ret_data = QOBJECT(trace_event_list);
+}
 #endif
 
 /**
@@ -2606,14 +2618,16 @@ static const mon_cmd_t info_cmds[] = {
         .args_type  = "",
         .params     = "",
         .help       = "show current contents of trace buffer",
-        .mhandler.info = do_info_trace,
+        .user_print = do_info_trace_print,
+        .mhandler.info_new = do_info_trace,
     },
     {
         .name       = "trace-events",
         .args_type  = "",
         .params     = "",
         .help       = "show available trace-events & their state",
-        .mhandler.info = do_info_trace_events,
+        .user_print = do_info_trace_events_print,
+        .mhandler.info_new = do_info_trace_events,
     },
 #endif
 {
@@ -2748,6 +2762,24 @@ static const mon_cmd_t qmp_query_cmds[] = {
 .mhandler.info_async = do_info_balloon,
 .flags  = MONITOR_CMD_ASYNC,
 },
+#if defined(CONFIG_SIMPLE_TRACE)
+    {
+        .name       = "trace",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show current contents of trace buffer",
+        .user_print = do_info_trace_print,
+        .mhandler.info_new = do_info_trace,
+    },
+    {
+        .name       = "trace-events",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show available trace-events & their state",
+        .user_print = do_info_trace_events_print,
+        .mhandler.info_new = do_info_trace_events,
+    },
+#endif
 { /* NULL */ },
 };
 
diff --git a/simpletrace.c b/simpletrace.c
index f849e42..9d7ec68 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -220,6 +220,43 @@ void st_print_trace(FILE *stream, int 
(*stream_printf)(FILE *stream, const char
 }
 }
 
+/**
+ * Add the current contents of trace-buffer as a QList.
+ *
+ */
+QList* st_print_trace_to_qlist()
+{
+QObject *data;
+QList *tlist;
+unsigned int i;
+
+tlist = qlist_new();
+
+    for (i = 0; i < trace_idx; i++) {
+        data = qobject_from_jsonf("{"
+                                  " 'timestamp': %" PRId64 ","
+                                  " 'event': %" PRId64 ","
+                                  " 'arg1': %" PRId64 ","
+                                  " 'arg2': %" PRId64 ","
+                                  " 'arg3': %" PRId64 ","
+                                  " 'arg4': %" PRId64 ","
+                                  " 'arg5': %" PRId64 ","
+                                  " 'arg6': %" PRId64
+                                  "}",
+                                  trace_buf[i].timestamp_ns,
+                                  trace_buf[i].event,
+                                  trace_buf[i].x1,
+                                  trace_buf[i].x2,
+                                  trace_buf[i].x3,
+                                  trace_buf[i].x4,
+                                  trace_buf[i].x5,
+                                  trace_buf[i].x6);
+        qlist_append_obj(tlist, data);
+    }
+
+return tlist;
+}
+
 void st_print_trace_events(FILE *stream, int (*stream_printf)(FILE *stream, 
const char *fmt, ...))
 {
 unsigned int i;
@@ -230,6 +267,27 @@ void st_print_trace_events(FILE *stream, int 
(*stream_printf)(FILE *stream, cons
 }
 }
 
+/**
+ * Add current set of trace-events as a QList.
+ *
+ */
+QList* st_print_trace_events_to_qlist()
+{
+QObject *data;
+QList *tlist;
+unsigned int i;
+
+tlist = qlist_new();
+
+    for (i = 0; i < NR_TRACE_EVENTS; i++) {
+        data = qobject_from_jsonf("{ 'name': %s, 'event-id': %d, 'state': %d}",
+                                  trace_list[i].tp_name, i,
+                                  trace_list[i].state);
+        qlist_append_obj(tlist, data);
+    }
+
+return tlist;
+}
+
 static TraceEvent* find_trace_event_by_name(const char

[Qemu-devel] [Tracing][RFC v3 PATCH 2/2] Add documentation for QMP commands: query-trace query-trace-events.

2010-10-18 Thread Prerna Saxena
[PATCH 2/2] Add documentation for QMP commands: query-trace & query-trace-events.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 qmp-commands.hx |   71 +++
 1 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 793cf1c..fefc93d 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1539,3 +1539,74 @@ Example:
 
 EQMP
 
+SQMP
+query-trace
+-----------
+
+Show contents of trace buffer.
+
+Returns a set of json-objects containing the following data:
+
+- "event": Event ID for the trace-event (json-int)
+- "timestamp": trace timestamp (json-int)
+- "arg1" .. "arg6": Arguments logged by the trace-event (json-int)
+
+Example:
+
+-> { "execute": "query-trace" }
+<- {
+  "return":{
+ "event": 22,
+ "timestamp": 129456235912365,
+ "arg1": 886,
+ "arg2": 80,
+ "arg3": 0,
+ "arg4": 0,
+ "arg5": 0,
+ "arg6": 0
+   },
+   {
+ "event": 22,
+ "timestamp": 129456235973407,
+ "arg1": 886,
+ "arg2": 80,
+ "arg3": 0,
+ "arg4": 0,
+ "arg5": 0,
+ "arg6": 0
+   },
+   ...
+   }
+
+EQMP
+
+SQMP
+query-trace-events
+------------------
+
+Show all available trace-events & their state.
+
+Returns a set of json-objects containing the following data:
+
+- "name": Name of Trace-event (json-string)
+- "event-id": Event ID of Trace-event (json-int)
+- "state": State of trace-event [ '0': inactive; '1': active ] (json-int)
+
+Example:
+
+-> { "execute": "query-trace-events" }
+<- {
+  "return":{
+ "name": "qemu_malloc",
+ "event-id": 0,
+ "state": 0
+  },
+  {
+ "name": "qemu_realloc",
+ "event-id": 1,
+ "state": 0
+  },
+  ...
+   }
+
+EQMP
-- 
1.7.2.2



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] Re: [Tracing][RFC v3 PATCH 0/2] QMP Query interfaces for tracing

2010-10-18 Thread Prerna Saxena

On 10/18/2010 07:51 PM, Luiz Capitulino wrote:

> On Mon, 18 Oct 2010 11:36:55 +0530
> Prerna Saxena <pre...@linux.vnet.ibm.com> wrote:
>
>> This patch set introduces two QMP interfaces for tracing:
>>
>> * query-trace: to list current contents of trace-buffer
>> * query-trace-events: to list all available trace-events with their state.
>
> This is in my to-review queue, but it's going to take a few days, because
> I have to take a deeper look at the tracing feature to be able to review it.

Thanks for looking. I look forward to your comments :-)


> Two initial questions:
>
>   o This is labeled as an RFC, but you're versioning it. Should this be
>     considered for inclusion?

I'm sending out a new version with some enhancements shortly, for
inclusion.

>   o Is this really useful w/o being able to set new traces?

I'm working on that as well. The query commands are the earliest
interfaces to be implemented. I will be adding interfaces to toggle the
state of trace-events, set a new trace-file, etc.




>> Changelog:
>> ----------
>> Changes v2 -> v3:
>> - Change declarations of st_print_trace_to_qlist() and
>>   st_print_trace_events_to_qlist() to return QList*
>>
>> Changes v1 -> v2:
>> - Add 'timestamp' field for query-trace output.
>> - Misc cleanups.





Thanks,
--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] [Tracing] [RFC PATCH 2/2] : Documentation for QMP interfaces

2010-10-14 Thread Prerna Saxena
[PATCH 2/2] Add documentation for QMP commands: query-trace & query-trace-events.


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 qmp-commands.hx |   53 +
 1 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 793cf1c..9a48984 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1539,3 +1539,56 @@ Example:
 
 EQMP
 
+SQMP
+query-trace
+-----------
+
+Show contents of trace buffer.
+
+Returns a set of json-objects containing the following data:
+
+- "Event": Event ID for the trace-event (json-int)
+- "arg1" .. "arg6": Arguments logged by the trace-event (json-int)
+
+Example:
+
+-> { "execute": "query-trace" }
+<- {
+  "return":{
+ "Event": 22,
+ "arg6": 0,
+ "arg5": 0,
+ "arg4": 0,
+ "arg3": 0,
+ "arg2": 80,
+ "arg1": 886
+   }
+   }
+
+EQMP
+
+SQMP
+query-trace-events
+------------------
+
+Show all available trace-events & their state.
+
+Returns a set of json-objects containing the following data:
+
+- "name": Name of Trace-event (json-string)
+- "state": State of trace-event [ '0': inactive; '1': active ] (json-int)
+- "eventID": Event ID of Trace-event (json-int)
+
+Example:
+
+-> { "execute": "query-trace-events" }
+<- {
+  "return":{
+ "name": "qemu_malloc",
+ "state": 0,
+ "eventID": 0
+  }
+   }
+
+EQMP
+
-- 
1.7.2.2



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing][RFC v2 PATCH 0/2] QMP Query interfaces for tracing

2010-10-14 Thread Prerna Saxena
This patch set introduces two QMP interfaces for tracing :

* query-trace: to list current contents of trace-buffer
* query-trace-events : to list all available trace-events with their state.

Changelog :
---
Changes from v1 :
- Add 'timestamp' field for query-trace output.
- Misc cleanups.
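
As a rough illustration of how a client might consume query-trace output, the following Python sketch parses a reply shaped like the example in the documentation patch. It assumes the reply is a JSON object whose "return" member is a list of records carrying "event", "timestamp" and "arg1".."arg6"; the sample values are copied from that example, and records_for_event() is a hypothetical helper, not part of QEMU.

```python
import json

# Sample reply, assuming "return" holds a JSON list of trace records.
# The values are copied from the documentation example in patch 2/2.
SAMPLE_REPLY = """
{
  "return": [
    {"event": 22, "timestamp": 129456235912365,
     "arg1": 886, "arg2": 80, "arg3": 0, "arg4": 0, "arg5": 0, "arg6": 0},
    {"event": 22, "timestamp": 129456235973407,
     "arg1": 886, "arg2": 80, "arg3": 0, "arg4": 0, "arg5": 0, "arg6": 0}
  ]
}
"""


def records_for_event(reply_text, event_id):
    """Return the trace records with the given event ID, oldest first."""
    records = json.loads(reply_text)["return"]
    matching = [r for r in records if r["event"] == event_id]
    return sorted(matching, key=lambda r: r["timestamp"])


records = records_for_event(SAMPLE_REPLY, 22)
# Nanoseconds elapsed between the two sample records.
delta_ns = records[1]["timestamp"] - records[0]["timestamp"]
```

Sorting by timestamp is a choice made for this sketch; the patch itself returns records in trace-buffer order.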

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing][RFC v2 PATCH 1/2] Introduce 'query-trace' 'query-trace-events' interfaces

2010-10-14 Thread Prerna Saxena
[PATCH 1/2] Introduce QMP interfaces: query-trace & query-trace-events


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c |   46 +---
 simpletrace.c |   58 +
 simpletrace.h |4 +++
 3 files changed, 104 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index fbb678d..7a150ae 100644
--- a/monitor.c
+++ b/monitor.c
@@ -941,15 +941,33 @@ static void do_info_cpu_stats(Monitor *mon)
 #endif
 
 #if defined(CONFIG_SIMPLE_TRACE)
-static void do_info_trace(Monitor *mon)
+static void do_info_trace_print(Monitor *mon)
 {
 st_print_trace((FILE *)mon, monitor_fprintf);
 }
 
-static void do_info_trace_events(Monitor *mon)
+static void do_info_trace(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = NULL;
+
+st_print_trace_to_qlist(&trace_event_list);
+
+*ret_data = QOBJECT(trace_event_list);
+}
+
+static void do_info_trace_events_print(Monitor *mon, const QObject *data)
 {
 st_print_trace_events((FILE *)mon, monitor_fprintf);
 }
+
+static void do_info_trace_events(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = NULL;
+
+st_print_trace_events_to_qlist(&trace_event_list);
+
+*ret_data = QOBJECT(trace_event_list);
+}
 #endif
 
 /**
@@ -2606,14 +2624,16 @@ static const mon_cmd_t info_cmds[] = {
 .args_type  = "",
 .params = "",
 .help   = "show current contents of trace buffer",
-.mhandler.info = do_info_trace,
+.user_print = do_info_trace_print,
+.mhandler.info_new = do_info_trace,
 },
 {
 .name   = "trace-events",
 .args_type  = "",
 .params = "",
 .help   = "show available trace-events & their state",
-.mhandler.info = do_info_trace_events,
+.user_print = do_info_trace_events_print,
+.mhandler.info_new = do_info_trace_events,
 },
 #endif
 {
@@ -2748,6 +2768,24 @@ static const mon_cmd_t qmp_query_cmds[] = {
 .mhandler.info_async = do_info_balloon,
 .flags  = MONITOR_CMD_ASYNC,
 },
+#if defined(CONFIG_SIMPLE_TRACE)
+{
+.name   = "trace",
+.args_type  = "",
+.params = "",
+.help   = "show current contents of trace buffer",
+.user_print = do_info_trace_print,
+.mhandler.info_new = do_info_trace,
+},
+{
+.name   = "trace-events",
+.args_type  = "",
+.params = "",
+.help   = "show available trace-events & their state",
+.user_print = do_info_trace_events_print,
+.mhandler.info_new = do_info_trace_events,
+},
+#endif
 { /* NULL */ },
 };
 
diff --git a/simpletrace.c b/simpletrace.c
index f849e42..a964312 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -220,6 +220,43 @@ void st_print_trace(FILE *stream, int 
(*stream_printf)(FILE *stream, const char
 }
 }
 
+/**
+ * Add the current contents of trace-buffer as a QList.
+ * NOTE: This assumes trace_list hasn't already been allocated with a QList.
+ *The initialization happens here.
+ */
+void st_print_trace_to_qlist(QList **tlist)
+{
+QObject *data;
+unsigned int i;
+
+assert(tlist);
+
+*tlist = qlist_new();
+
+for (i = 0; i < trace_idx; i++) {
+  data = qobject_from_jsonf("{"
+ " 'timestamp': %" PRId64 ","
+ " 'event': %" PRId64 ","
+ " 'arg1': %" PRId64 ","
+ " 'arg2': %" PRId64 ","
+ " 'arg3': %" PRId64 ","
+ " 'arg4': %" PRId64 ","
+ " 'arg5': %" PRId64 ","
+ " 'arg6': %" PRId64
+"}",
+trace_buf[i].timestamp_ns,
+trace_buf[i].event,
+trace_buf[i].x1,
+trace_buf[i].x2,
+trace_buf[i].x3,
+trace_buf[i].x4,
+trace_buf[i].x5,
+trace_buf[i].x6);
+  qlist_append_obj(*tlist, data);
+}
+}
+
 void st_print_trace_events(FILE *stream, int (*stream_printf)(FILE *stream, 
const char *fmt, ...))
 {
 unsigned int i;
@@ -230,6 +267,27 @@ void st_print_trace_events(FILE *stream, int 
(*stream_printf)(FILE *stream, cons
 }
 }
 
+/**
+ * Add current set of trace-events as a QList.
+ * NOTE: This assumes trace_list hasn't already been allocated with a QList.
+ *The initialization happens here.
+ */
+void st_print_trace_events_to_qlist(QList **tlist)
+{
+QObject *data;
+unsigned int i;
+
+assert(tlist);
+
+*tlist = qlist_new();
+
+for (i = 0; i < NR_TRACE_EVENTS; i++) {
+  data

[Qemu-devel] [Tracing][RFC v2 PATCH 2/2] Documentation for QMP interfaces

2010-10-14 Thread Prerna Saxena
[PATCH 2/2] Add documentation for QMP commands: query-trace & query-trace-events.


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 qmp-commands.hx |   71 +++
 1 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 793cf1c..fefc93d 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1539,3 +1539,74 @@ Example:
 
 EQMP
 
+SQMP
+query-trace
+-----------
+
+Show contents of trace buffer.
+
+Returns a set of json-objects containing the following data:
+
+- "event": Event ID for the trace-event (json-int)
+- "timestamp": trace timestamp (json-int)
+- "arg1" .. "arg6": Arguments logged by the trace-event (json-int)
+
+Example:
+
+-> { "execute": "query-trace" }
+<- {
+  "return":{
+ "event": 22,
+ "timestamp": 129456235912365,
+ "arg1": 886,
+ "arg2": 80,
+ "arg3": 0,
+ "arg4": 0,
+ "arg5": 0,
+ "arg6": 0
+   },
+   {
+ "event": 22,
+ "timestamp": 129456235973407,
+ "arg1": 886,
+ "arg2": 80,
+ "arg3": 0,
+ "arg4": 0,
+ "arg5": 0,
+ "arg6": 0
+   },
+   ...
+   }
+
+EQMP
+
+SQMP
+query-trace-events
+------------------
+
+Show all available trace-events & their state.
+
+Returns a set of json-objects containing the following data:
+
+- "name": Name of Trace-event (json-string)
+- "event-id": Event ID of Trace-event (json-int)
+- "state": State of trace-event [ '0': inactive; '1': active ] (json-int)
+
+Example:
+
+-> { "execute": "query-trace-events" }
+<- {
+  "return":{
+ "name": "qemu_malloc",
+ "event-id": 0,
+ "state": 0
+  },
+  {
+ "name": "qemu_realloc",
+ "event-id": 1,
+ "state": 0
+  },
+  ...
+   }
+
+EQMP
-- 
1.7.2.2



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing] [RFC PATCH 0/2] : QMP query Interfaces for tracing

2010-10-13 Thread Prerna Saxena
This patch set introduces two QMP interfaces for tracing :

* query-trace: to list current contents of trace-buffer
* query-trace-events : to list all available trace-events with their state.
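
A minimal sketch of how a client might use query-trace-events output to pick out the active events. It assumes the reply is a JSON object whose "return" member is a list of entries with the "name", "state" and "eventID" fields documented in patch 2/2 of this series; active_event_names() is a hypothetical helper, and the second sample entry is invented purely to show the filtering.

```python
import json

# Sample reply, assuming "return" holds a JSON list of trace-event entries.
# The first entry mirrors the documentation example; the second entry and
# its active state are made up for illustration only.
SAMPLE_REPLY = """
{
  "return": [
    {"name": "qemu_malloc", "state": 0, "eventID": 0},
    {"name": "qemu_realloc", "state": 1, "eventID": 1}
  ]
}
"""


def active_event_names(reply_text):
    """Return the names of trace-events whose state is 1 (active)."""
    events = json.loads(reply_text)["return"]
    return [e["name"] for e in events if e["state"] == 1]


active = active_event_names(SAMPLE_REPLY)
```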

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing] [RFC PATCH 1/2] : Introduce 'query-trace' 'query-trace-events' interfaces

2010-10-13 Thread Prerna Saxena
[PATCH 1/2] Introduce QMP interfaces: query-trace & query-trace-events


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c |   46 ++
 simpletrace.c |   54 ++
 simpletrace.h |2 ++
 3 files changed, 98 insertions(+), 4 deletions(-)

diff --git a/monitor.c b/monitor.c
index fbb678d..7a150ae 100644
--- a/monitor.c
+++ b/monitor.c
@@ -941,15 +941,33 @@ static void do_info_cpu_stats(Monitor *mon)
 #endif
 
 #if defined(CONFIG_SIMPLE_TRACE)
-static void do_info_trace(Monitor *mon)
+static void do_info_trace_print(Monitor *mon)
 {
 st_print_trace((FILE *)mon, monitor_fprintf);
 }
 
-static void do_info_trace_events(Monitor *mon)
+static void do_info_trace(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = NULL;
+
+st_print_trace_to_qlist(&trace_event_list);
+
+*ret_data = QOBJECT(trace_event_list);
+}
+
+static void do_info_trace_events_print(Monitor *mon, const QObject *data)
 {
 st_print_trace_events((FILE *)mon, monitor_fprintf);
 }
+
+static void do_info_trace_events(Monitor *mon, QObject **ret_data)
+{
+QList *trace_event_list = NULL;
+
+st_print_trace_events_to_qlist(&trace_event_list);
+
+*ret_data = QOBJECT(trace_event_list);
+}
 #endif
 
 /**
@@ -2606,14 +2624,16 @@ static const mon_cmd_t info_cmds[] = {
 .args_type  = "",
 .params = "",
 .help   = "show current contents of trace buffer",
-.mhandler.info = do_info_trace,
+.user_print = do_info_trace_print,
+.mhandler.info_new = do_info_trace,
 },
 {
 .name   = "trace-events",
 .args_type  = "",
 .params = "",
 .help   = "show available trace-events & their state",
-.mhandler.info = do_info_trace_events,
+.user_print = do_info_trace_events_print,
+.mhandler.info_new = do_info_trace_events,
 },
 #endif
 {
@@ -2748,6 +2768,24 @@ static const mon_cmd_t qmp_query_cmds[] = {
 .mhandler.info_async = do_info_balloon,
 .flags  = MONITOR_CMD_ASYNC,
 },
+#if defined(CONFIG_SIMPLE_TRACE)
+{
+.name   = "trace",
+.args_type  = "",
+.params = "",
+.help   = "show current contents of trace buffer",
+.user_print = do_info_trace_print,
+.mhandler.info_new = do_info_trace,
+},
+{
+.name   = "trace-events",
+.args_type  = "",
+.params = "",
+.help   = "show available trace-events & their state",
+.user_print = do_info_trace_events_print,
+.mhandler.info_new = do_info_trace_events,
+},
+#endif
 { /* NULL */ },
 };
 
diff --git a/simpletrace.c b/simpletrace.c
index f849e42..d1f66b4 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -220,6 +220,39 @@ void st_print_trace(FILE *stream, int 
(*stream_printf)(FILE *stream, const char
 }
 }
 
+void st_print_trace_to_qlist(QList **tlist)
+{
+QObject *data;
+unsigned int i;
+
+if (!tlist || *tlist )
+return;
+
+/* NOTE : This assumes trace_list hasn't already been allocated with a QList.
+ *The initialization happens here.
+ */
+*tlist = qlist_new();
+
+for (i = 0; i < trace_idx; i++) {
+  data = qobject_from_jsonf("{"
+ " 'Event': %" PRId64 ","
+ " 'arg1': %" PRId64 ","
+ " 'arg2': %" PRId64 ","
+ " 'arg3': %" PRId64 ","
+ " 'arg4': %" PRId64 ","
+ " 'arg5': %" PRId64 ","
+ " 'arg6': %" PRId64
+"}",
+trace_buf[i].event, trace_buf[i].x1,
+trace_buf[i].x2, trace_buf[i].x3,
+trace_buf[i].x4, trace_buf[i].x5,
+trace_buf[i].x6);
+  qlist_append_obj(*tlist, data);
+}
+
+return;
+}
+
 void st_print_trace_events(FILE *stream, int (*stream_printf)(FILE *stream, 
const char *fmt, ...))
 {
 unsigned int i;
@@ -230,6 +263,27 @@ void st_print_trace_events(FILE *stream, int 
(*stream_printf)(FILE *stream, cons
 }
 }
 
+void st_print_trace_events_to_qlist(QList **tlist)
+{
+QObject *data;
+unsigned int i;
+
+if (!tlist || *tlist )
+return;
+
+/* NOTE : This assumes trace_list hasn't already been allocated with a QList.
+ *The initialization happens here.
+ */
+*tlist = qlist_new();
+
+for (i = 0; i < NR_TRACE_EVENTS; i++) {
+  data = qobject_from_jsonf("{ 'name': %s, 'eventID': %d, 'state': %d }", trace_list[i].tp_name, i, trace_list[i].state);
+  qlist_append_obj(*tlist, data);
+}
+
+return;
+}
+
 static TraceEvent* find_trace_event_by_name(const char *tname)
 {
 unsigned int

[Qemu-devel] [PATCH][Tracing v2] Process -trace using QemuOptsList

2010-08-27 Thread Prerna Saxena
[PATCH] Add -trace file FILENAME switch to qemu startup command.
 This processes the argument using QemuOptsList
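
To make the option handling concrete, here is a simplified Python stand-in for what parsing an argument such as "file=FILENAME" against a list of known option names involves. It is only a sketch: parse_opts() and TRACE_OPTIONS are hypothetical names, the path is an example value, and QEMU's real QemuOpts parser handles more cases (for instance implied option names) than this does.

```python
def parse_opts(arg, known_options):
    """Parse 'key=value[,key=value...]' and reject unknown keys."""
    opts = {}
    for item in arg.split(","):
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % item)
        if key not in known_options:
            raise ValueError("unknown option %r" % key)
        opts[key] = value
    return opts


# The trace option group in this patch describes a single string option, "file".
TRACE_OPTIONS = {"file"}

opts = parse_opts("file=/tmp/trace.bin", TRACE_OPTIONS)
trace_file = opts.get("file")
```

Routing the switch through a described option list means an unknown key is rejected up front instead of being silently treated as a file name.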


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 qemu-config.c |   18 ++
 qemu-config.h |3 +++
 vl.c  |5 -
 3 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index 95abe61..9106511 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -294,6 +294,21 @@ QemuOptsList qemu_mon_opts = {
 },
 };
 
+#ifdef CONFIG_SIMPLE_TRACE
+QemuOptsList qemu_trace_opts = {
+.name = "trace",
+.implied_opt_name = "trace",
+.head = QTAILQ_HEAD_INITIALIZER(qemu_trace_opts.head),
+.desc = {
+{
+.name = "file",
+.type = QEMU_OPT_STRING,
+},
+{ /* end if list */ }
+},
+};
+#endif
+
 QemuOptsList qemu_cpudef_opts = {
 .name = "cpudef",
 .head = QTAILQ_HEAD_INITIALIZER(qemu_cpudef_opts.head),
@@ -352,6 +367,9 @@ static QemuOptsList *vm_config_groups[] = {
 &qemu_global_opts,
 &qemu_mon_opts,
 &qemu_cpudef_opts,
+#ifdef CONFIG_SIMPLE_TRACE
+&qemu_trace_opts,
+#endif
 NULL,
 };
 
diff --git a/qemu-config.h b/qemu-config.h
index dca69d4..4db2fb5 100644
--- a/qemu-config.h
+++ b/qemu-config.h
@@ -14,6 +14,9 @@ extern QemuOptsList qemu_rtc_opts;
 extern QemuOptsList qemu_global_opts;
 extern QemuOptsList qemu_mon_opts;
 extern QemuOptsList qemu_cpudef_opts;
+#ifdef CONFIG_SIMPLE_TRACE
+extern QemuOptsList qemu_trace_opts;
+#endif
 
 QemuOptsList *qemu_find_opts(const char *group);
 int qemu_set_option(const char *str);
diff --git a/vl.c b/vl.c
index 99664e9..0ff04e9 100644
--- a/vl.c
+++ b/vl.c
@@ -2599,7 +2599,10 @@ int main(int argc, char **argv, char **envp)
 break;
 #ifdef CONFIG_SIMPLE_TRACE
 case QEMU_OPTION_trace:
-trace_file = optarg;
+opts = qemu_opts_parse(&qemu_trace_opts, optarg, 0);
+if (opts) {
+trace_file = qemu_opt_get(opts, "file");
+}
 break;
 #endif
 case QEMU_OPTION_readconfig:
-- 
1.7.2.1



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




Re: [Qemu-devel] [PATCH] trace: Make trace record fields 64-bit

2010-08-11 Thread Prerna Saxena

On 08/09/2010 07:05 PM, Stefan Hajnoczi wrote:

Explicitly use 64-bit fields in trace records so that timestamps and
magic numbers work for 32-bit host builds.

Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
---
  simpletrace.c  |   31 +--
  simpletrace.h  |   11 ++-
  simpletrace.py |2 +-
  tracetool  |6 +++---
  4 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/simpletrace.c b/simpletrace.c
index 954cc4e..01acfc5 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -9,18 +9,29 @@
   */

  #include <stdlib.h>
+#include <stdint.h>
  #include <stdio.h>
  #include <time.h>
  #include "trace.h"

+/** Trace file header event ID */
+#define HEADER_EVENT_ID (~(uint64_t)0) /* avoids conflicting with 
TraceEventIDs */
+
+/** Trace file magic number */
+#define HEADER_MAGIC 0xf2b177cb0aa429b4ULL
+
+/** Trace file version number, bump if format changes */
+#define HEADER_VERSION 0
+
+/** Trace buffer entry */
  typedef struct {
-unsigned long event;
-unsigned long timestamp_ns;
-unsigned long x1;
-unsigned long x2;
-unsigned long x3;
-unsigned long x4;
-unsigned long x5;
+uint64_t event;
+uint64_t timestamp_ns;
+uint64_t x1;
+uint64_t x2;
+uint64_t x3;
+uint64_t x4;
+uint64_t x5;
  } TraceRecord;

  enum {
@@ -42,9 +53,9 @@ void st_print_trace_file_status(FILE *stream, int 
(*stream_printf)(FILE *stream,
  static bool write_header(FILE *fp)
  {
  TraceRecord header = {
-.event = -1UL, /* max avoids conflicting with TraceEventIDs */
-.timestamp_ns = 0xf2b177cb0aa429b4, /* magic number */
-.x1 = 0, /* bump this version number if file format changes */
+.event = HEADER_EVENT_ID,
+.timestamp_ns = HEADER_MAGIC,
+.x1 = HEADER_VERSION,
  };

  return fwrite(&header, sizeof header, 1, fp) == 1;
diff --git a/simpletrace.h b/simpletrace.h
index 6a2b8d9..f81aa8e 100644
--- a/simpletrace.h
+++ b/simpletrace.h
@@ -10,6 +10,7 @@
  #define SIMPLETRACE_H

  #include <stdbool.h>
+#include <stdint.h>
  #include <stdio.h>

  typedef unsigned int TraceEventID;


It would be useful to have:

typedef uint64_t TraceEventID;

This ensures that the maximum number of trace events is the same on both
32-bit and 64-bit builds.



@@ -20,11 +21,11 @@ typedef struct {
  } TraceEvent;

  void trace0(TraceEventID event);
-void trace1(TraceEventID event, unsigned long x1);
-void trace2(TraceEventID event, unsigned long x1, unsigned long x2);
-void trace3(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3);
-void trace4(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3, unsigned long x4);
-void trace5(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3, unsigned long x4, unsigned long x5);
+void trace1(TraceEventID event, uint64_t x1);
+void trace2(TraceEventID event, uint64_t x1, uint64_t x2);
+void trace3(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3);
+void trace4(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4);
+void trace5(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4, uint64_t x5);
  void st_print_trace(FILE *stream, int (*stream_printf)(FILE *stream, const 
char *fmt, ...));
  void st_print_trace_events(FILE *stream, int (*stream_printf)(FILE *stream, 
const char *fmt, ...));
  void st_change_trace_event_state(const char *tname, bool tstate);
diff --git a/simpletrace.py b/simpletrace.py
index 979d911..fdf0eb5 100755
--- a/simpletrace.py
+++ b/simpletrace.py
@@ -17,7 +17,7 @@ header_event_id = 0x
  header_magic= 0xf2b177cb0aa429b4
  header_version  = 0

-trace_fmt = 'LLLLLLL'
+trace_fmt = '=QQQQQQQ'
  trace_len = struct.calcsize(trace_fmt)
  event_re  = re.compile(r'(disable\s+)?([a-zA-Z0-9_]+)\(([^)]*)\)\s+"([^"]*)"')

diff --git a/tracetool b/tracetool
index c5a5bdc..b78cd97 100755
--- a/tracetool
+++ b/tracetool
@@ -151,11 +151,11 @@ EOF
  simple_event_num=0
  }

-cast_args_to_ulong()
+cast_args_to_uint64_t()
  {
  local arg
  for arg in $(get_argnames $1); do
-echo -n "(unsigned long)$arg"
+echo -n "(uint64_t)$arg"


Tested this on a 32-bit host. It throws up some warnings, and we need:
echo -n "(uint64_t)(uintptr_t)$arg"


  done
  }

@@ -173,7 +173,7 @@ linetoh_simple()
  trace_args=$simple_event_num
  if [ $argc -gt 0 ]
  then
-trace_args="$trace_args, $(cast_args_to_ulong $1)"
+trace_args="$trace_args, $(cast_args_to_uint64_t $1)"
  fi

  cat <<EOF



--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] [Tracing] More Trace events

2010-08-11 Thread Prerna Saxena
This patch adds a few more trace events for tracking port I/O and for
tracing balloon events flagged via the monitor.
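
For reference, each entry added to trace-events has the shape name(arguments) followed by a format string. The Python sketch below splits such an entry into its parts using the event_re pattern from simpletrace.py; it assumes the format string is double-quoted (the quotes do not survive in this archive), and parse_event() is a hypothetical helper written for this illustration.

```python
import re

# Pattern as used by simpletrace.py, assuming a double-quoted format string:
# optional "disable" keyword, event name, argument list, format string.
EVENT_RE = re.compile(r'(disable\s+)?([a-zA-Z0-9_]+)\(([^)]*)\)\s+"([^"]*)"')


def parse_event(line):
    """Split one trace-events line into its parts, or return None."""
    match = EVENT_RE.match(line)
    if match is None:
        return None  # comments and blank lines do not match
    disabled, name, args, fmt = match.groups()
    arg_list = [a.strip() for a in args.split(",")] if args.strip() else []
    return {
        "name": name,
        "disabled": disabled is not None,
        "args": arg_list,
        "fmt": fmt,
    }


event = parse_event('cpu_in(unsigned int addr, unsigned int val) "Addr %u Value %u"')
```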

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 balloon.c|2 ++
 ioport.c |7 +++
 trace-events |8 
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/balloon.c b/balloon.c
index 8e0b7f1..0021fef 100644
--- a/balloon.c
+++ b/balloon.c
@@ -29,6 +29,7 @@
 #include "cpu-common.h"
 #include "kvm.h"
 #include "balloon.h"
+#include "trace.h"
 
 
 static QEMUBalloonEvent *qemu_balloon_event;
@@ -43,6 +44,7 @@ void qemu_add_balloon_handler(QEMUBalloonEvent *func, void 
*opaque)
 int qemu_balloon(ram_addr_t target, MonitorCompletion cb, void *opaque)
 {
 if (qemu_balloon_event) {
+trace_balloon_event(qemu_balloon_event_opaque, target);
 qemu_balloon_event(qemu_balloon_event_opaque, target, cb, opaque);
 return 1;
 } else {
diff --git a/ioport.c b/ioport.c
index 53dd87a..ec3dc65 100644
--- a/ioport.c
+++ b/ioport.c
@@ -26,6 +26,7 @@
  */
 
 #include "ioport.h"
+#include "trace.h"
 
 /***/
 /* IO Port */
@@ -195,18 +196,21 @@ void isa_unassign_ioport(pio_addr_t start, int length)
 void cpu_outb(pio_addr_t addr, uint8_t val)
 {
 LOG_IOPORT("outb: %04"FMT_pioaddr" %02"PRIx8"\n", addr, val);
+trace_cpu_out(addr, val);
 ioport_write(0, addr, val);
 }
 
 void cpu_outw(pio_addr_t addr, uint16_t val)
 {
 LOG_IOPORT("outw: %04"FMT_pioaddr" %04"PRIx16"\n", addr, val);
+trace_cpu_out(addr, val);
 ioport_write(1, addr, val);
 }
 
 void cpu_outl(pio_addr_t addr, uint32_t val)
 {
 LOG_IOPORT("outl: %04"FMT_pioaddr" %08"PRIx32"\n", addr, val);
+trace_cpu_out(addr, val);
 ioport_write(2, addr, val);
 }
 
@@ -214,6 +218,7 @@ uint8_t cpu_inb(pio_addr_t addr)
 {
 uint8_t val;
 val = ioport_read(0, addr);
+trace_cpu_in(addr, val);
 LOG_IOPORT("inb : %04"FMT_pioaddr" %02"PRIx8"\n", addr, val);
 return val;
 }
@@ -222,6 +227,7 @@ uint16_t cpu_inw(pio_addr_t addr)
 {
 uint16_t val;
 val = ioport_read(1, addr);
+trace_cpu_in(addr, val);
 LOG_IOPORT("inw : %04"FMT_pioaddr" %04"PRIx16"\n", addr, val);
 return val;
 }
@@ -230,6 +236,7 @@ uint32_t cpu_inl(pio_addr_t addr)
 {
 uint32_t val;
 val = ioport_read(2, addr);
+trace_cpu_in(addr, val);
 LOG_IOPORT("inl : %04"FMT_pioaddr" %08"PRIx32"\n", addr, val);
 return val;
 }
diff --git a/trace-events b/trace-events
index 80197b6..cade0b5 100644
--- a/trace-events
+++ b/trace-events
@@ -59,3 +59,11 @@ virtio_blk_handle_write(void *req, unsigned long sector, 
unsigned long nsectors)
 
 # posix-aio-compat.c
 paio_submit(void *acb, void *opaque, unsigned long sector_num, unsigned long nb_sectors, unsigned long type) "acb %p opaque %p sector_num %lu nb_sectors %lu type %lu"
+
+# ioport.c
+cpu_in(unsigned int addr, unsigned int val) "Addr %u Value %u"
+cpu_out(unsigned int addr, unsigned int val) "Addr %u Value %u"
+
+# balloon.c
+# Since requests are raised via monitor, not many tracepoints are needed.
+balloon_event(void *opaque, unsigned long addr) "Opaque %p Addr %lu"
-- 
1.6.2.5



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [PATCH v2] trace: Make trace record fields 64-bit

2010-08-11 Thread Prerna Saxena
Explicitly use 64-bit fields in trace records so that timestamps and
magic numbers work for 32-bit host builds.

Changelog (from initial patch posted by Stefan):
1) TraceEventID is now uint64_t, so that the same number of tracepoints
is available on both 32-bit and 64-bit builds.
2) Cast arguments to uintptr_t, and then to uint64_t, to avoid compiler
warnings on 32-bit hosts.
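
The effect on the trace file layout can be sketched from the reader's side in Python. The format string below assumes seven 64-bit fields, matching the seven members of TraceRecord (event, timestamp_ns, x1..x5), and the header constants are the ones defined in this patch; pack_header() and is_valid_header() are hypothetical helpers, not functions from simpletrace.py.

```python
import struct

# Seven unsigned 64-bit fields with standard size and no padding, so the
# record has the same layout no matter which host wrote the file.
TRACE_FMT = "=QQQQQQQ"
TRACE_LEN = struct.calcsize(TRACE_FMT)

HEADER_EVENT_ID = 0xFFFFFFFFFFFFFFFF  # ~(uint64_t)0
HEADER_MAGIC = 0xF2B177CB0AA429B4
HEADER_VERSION = 0


def pack_header():
    """Build a header record: event ID, magic, version, then unused fields."""
    return struct.pack(TRACE_FMT, HEADER_EVENT_ID, HEADER_MAGIC,
                       HEADER_VERSION, 0, 0, 0, 0)


def is_valid_header(raw):
    """Check that raw bytes are a well-formed header record."""
    if len(raw) != TRACE_LEN:
        return False
    event, magic, version = struct.unpack(TRACE_FMT, raw)[:3]
    return (event, magic, version) == (HEADER_EVENT_ID, HEADER_MAGIC,
                                       HEADER_VERSION)


header = pack_header()
```

The "=" prefix selects standard sizes, which is what keeps a record at 56 bytes regardless of the native width of long.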


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 simpletrace.c  |   41 ++---
 simpletrace.h  |   13 +++--
 simpletrace.py |2 +-
 tracetool  |6 +++---
 4 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/simpletrace.c b/simpletrace.c
index 954cc4e..27b0cab 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -9,18 +9,29 @@
  */
 
 #include <stdlib.h>
+#include <stdint.h>
 #include <stdio.h>
 #include <time.h>
 #include "trace.h"
 
+/** Trace file header event ID */
+#define HEADER_EVENT_ID (~(uint64_t)0) /* avoids conflicting with 
TraceEventIDs */
+
+/** Trace file magic number */
+#define HEADER_MAGIC 0xf2b177cb0aa429b4ULL
+
+/** Trace file version number, bump if format changes */
+#define HEADER_VERSION 0
+
+/** Trace buffer entry */
 typedef struct {
-unsigned long event;
-unsigned long timestamp_ns;
-unsigned long x1;
-unsigned long x2;
-unsigned long x3;
-unsigned long x4;
-unsigned long x5;
+uint64_t event;
+uint64_t timestamp_ns;
+uint64_t x1;
+uint64_t x2;
+uint64_t x3;
+uint64_t x4;
+uint64_t x5;
 } TraceRecord;
 
 enum {
@@ -42,9 +53,9 @@ void st_print_trace_file_status(FILE *stream, int 
(*stream_printf)(FILE *stream,
 static bool write_header(FILE *fp)
 {
 TraceRecord header = {
-.event = -1UL, /* max avoids conflicting with TraceEventIDs */
-.timestamp_ns = 0xf2b177cb0aa429b4, /* magic number */
-.x1 = 0, /* bump this version number if file format changes */
+.event = HEADER_EVENT_ID,
+.timestamp_ns = HEADER_MAGIC,
+.x1 = HEADER_VERSION,
 };
 
 return fwrite(&header, sizeof header, 1, fp) == 1;
@@ -160,27 +171,27 @@ void trace0(TraceEventID event)
 trace(event, 0, 0, 0, 0, 0);
 }
 
-void trace1(TraceEventID event, unsigned long x1)
+void trace1(TraceEventID event, uint64_t x1)
 {
 trace(event, x1, 0, 0, 0, 0);
 }
 
-void trace2(TraceEventID event, unsigned long x1, unsigned long x2)
+void trace2(TraceEventID event, uint64_t x1, uint64_t x2)
 {
 trace(event, x1, x2, 0, 0, 0);
 }
 
-void trace3(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3)
+void trace3(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3)
 {
 trace(event, x1, x2, x3, 0, 0);
 }
 
-void trace4(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3, unsigned long x4)
+void trace4(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4)
 {
 trace(event, x1, x2, x3, x4, 0);
 }
 
-void trace5(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3, unsigned long x4, unsigned long x5)
+void trace5(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4, uint64_t x5)
 {
 trace(event, x1, x2, x3, x4, x5);
 }
diff --git a/simpletrace.h b/simpletrace.h
index 6a2b8d9..00ca439 100644
--- a/simpletrace.h
+++ b/simpletrace.h
@@ -10,9 +10,10 @@
 #define SIMPLETRACE_H
 
 #include <stdbool.h>
+#include <stdint.h>
 #include <stdio.h>
 
-typedef unsigned int TraceEventID;
+typedef uint64_t TraceEventID;
 
 typedef struct {
 const char *tp_name;
@@ -20,11 +21,11 @@ typedef struct {
 } TraceEvent;
 
 void trace0(TraceEventID event);
-void trace1(TraceEventID event, unsigned long x1);
-void trace2(TraceEventID event, unsigned long x1, unsigned long x2);
-void trace3(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3);
-void trace4(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3, unsigned long x4);
-void trace5(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3, unsigned long x4, unsigned long x5);
+void trace1(TraceEventID event, uint64_t x1);
+void trace2(TraceEventID event, uint64_t x1, uint64_t x2);
+void trace3(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3);
+void trace4(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4);
+void trace5(TraceEventID event, uint64_t x1, uint64_t x2, uint64_t x3, 
uint64_t x4, uint64_t x5);
 void st_print_trace(FILE *stream, int (*stream_printf)(FILE *stream, const 
char *fmt, ...));
 void st_print_trace_events(FILE *stream, int (*stream_printf)(FILE *stream, 
const char *fmt, ...));
 void st_change_trace_event_state(const char *tname, bool tstate);
diff --git a/simpletrace.py b/simpletrace.py
index 979d911..fdf0eb5 100755
--- a/simpletrace.py
+++ b/simpletrace.py
@@ -17,7 +17,7 @@ header_event_id = 0x
 header_magic= 0xf2b177cb0aa429b4
 header_version  = 0
 
-trace_fmt = 'LLLLLLL'
+trace_fmt = '=QQQQQQQ'
 trace_len = struct.calcsize(trace_fmt)
 event_re

[Qemu-devel] [Tracing][PATCH 0/2] More Trace events

2010-08-11 Thread Prerna Saxena
Set of patches to add trace-events for tracking IO and balloon events 
flagged via the monitor.

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing][PATCH 1/2] More Trace events

2010-08-11 Thread Prerna Saxena
[PATCH 1/2] Trace events for tracking port IO

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 ioport.c |7 +++
 trace-events |4 
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/ioport.c b/ioport.c
index 53dd87a..ec3dc65 100644
--- a/ioport.c
+++ b/ioport.c
@@ -26,6 +26,7 @@
  */
 
 #include "ioport.h"
+#include "trace.h"
 
 /***/
 /* IO Port */
@@ -195,18 +196,21 @@ void isa_unassign_ioport(pio_addr_t start, int length)
 void cpu_outb(pio_addr_t addr, uint8_t val)
 {
 LOG_IOPORT("outb: %04"FMT_pioaddr" %02"PRIx8"\n", addr, val);
+trace_cpu_out(addr, val);
 ioport_write(0, addr, val);
 }
 
 void cpu_outw(pio_addr_t addr, uint16_t val)
 {
 LOG_IOPORT("outw: %04"FMT_pioaddr" %04"PRIx16"\n", addr, val);
+trace_cpu_out(addr, val);
 ioport_write(1, addr, val);
 }
 
 void cpu_outl(pio_addr_t addr, uint32_t val)
 {
 LOG_IOPORT("outl: %04"FMT_pioaddr" %08"PRIx32"\n", addr, val);
+trace_cpu_out(addr, val);
 ioport_write(2, addr, val);
 }
 
@@ -214,6 +218,7 @@ uint8_t cpu_inb(pio_addr_t addr)
 {
 uint8_t val;
 val = ioport_read(0, addr);
+trace_cpu_in(addr, val);
 LOG_IOPORT("inb : %04"FMT_pioaddr" %02"PRIx8"\n", addr, val);
 return val;
 }
@@ -222,6 +227,7 @@ uint16_t cpu_inw(pio_addr_t addr)
 {
 uint16_t val;
 val = ioport_read(1, addr);
+trace_cpu_in(addr, val);
 LOG_IOPORT("inw : %04"FMT_pioaddr" %04"PRIx16"\n", addr, val);
 return val;
 }
@@ -230,6 +236,7 @@ uint32_t cpu_inl(pio_addr_t addr)
 {
 uint32_t val;
 val = ioport_read(2, addr);
+trace_cpu_in(addr, val);
 LOG_IOPORT("inl : %04"FMT_pioaddr" %08"PRIx32"\n", addr, val);
 return val;
 }
diff --git a/trace-events b/trace-events
index 80197b6..7dbd08f 100644
--- a/trace-events
+++ b/trace-events
@@ -59,3 +59,7 @@ virtio_blk_handle_write(void *req, unsigned long sector, 
unsigned long nsectors)
 
 # posix-aio-compat.c
 paio_submit(void *acb, void *opaque, unsigned long sector_num, unsigned long nb_sectors, unsigned long type) "acb %p opaque %p sector_num %lu nb_sectors %lu type %lu"
+
+# ioport.c
+cpu_in(unsigned int addr, unsigned int val) addr %u value %u
+cpu_out(unsigned int addr, unsigned int val) addr %u value %u
-- 
1.6.2.5



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing] Compilation failure

2010-08-09 Thread Prerna Saxena

Hi Stefan,
I think this needs to be resolved.

  CCtrace.o
  CCsimpletrace.o
cc1: warnings being treated as errors
/home/prerna/qemu-testing/git/qemu/simpletrace.c: In function 
‘write_header’:
/home/prerna/qemu-testing/git/qemu/simpletrace.c:46: error: integer 
constant is too large for ‘long’ type
/home/prerna/qemu-testing/git/qemu/simpletrace.c:46: error: large 
integer implicitly truncated to unsigned type

make: *** [simpletrace.o] Error 1

The error arises from this initializer in write_header():
TraceRecord header = {
.event = -1UL, /* max avoids conflicting with TraceEventIDs */
.timestamp_ns = 0xf2b177cb0aa429b4, /* magic number */
 error.

Also, it would be better to #define the magic number as a macro and use 
that instead of the bare constant.


Regards,
--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] [Tracing][PATCH] Compilation fixes

2010-08-05 Thread Prerna Saxena
Ensure that a rebuild is properly triggered when switching trace backends 
using ./configure.
Also, when using the 'ust' backend, check that the relevant headers are 
available on the host.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 Makefile  |4 ++--
 configure |   20 +---
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 8831174..3bd41ce 100644
--- a/Makefile
+++ b/Makefile
@@ -132,10 +132,10 @@ bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS)
 
 iov.o: iov.c iov.h
 
-trace.h: $(SRC_PATH)/trace-events
+trace.h: $(SRC_PATH)/trace-events config-host.mak
$(call quiet-command,sh $(SRC_PATH)/tracetool --$(TRACE_BACKEND) -h  
$  $@,  GEN   $@)
 
-trace.c: $(SRC_PATH)/trace-events
+trace.c: $(SRC_PATH)/trace-events config-host.mak
$(call quiet-command,sh $(SRC_PATH)/tracetool --$(TRACE_BACKEND) -c  
$  $@,  GEN   $@)
 
 trace.o: trace.c $(GENERATED_HEADERS)
diff --git a/configure b/configure
index fe1b027..ee9f1e3 100755
--- a/configure
+++ b/configure
@@ -2011,6 +2011,23 @@ if test $? -ne 0 ; then
   exit 1
 fi
 
+##
+# For 'ust' backend, test if ust headers are present
+if test $trace_backend = ust; then
+  cat  $TMPC  EOF
+#include ust/tracepoint.h
+#include ust/marker.h
+int main(void) { return 0; }
+EOF
+  if compile_prog   ; then
+LIBS=-lust $LIBS
+  else
+echo ERROR: Trace backend 'ust' does not have relevant headers available
+echoon the host. Pls choose a different backend.
+exit 1
+  fi
+fi
+##
 # End of CC checks
 # After here, no more $cc or $ld runs
 
@@ -2392,9 +2409,6 @@ echo TRACE_BACKEND=$trace_backend  $config_host_mak
 if test $trace_backend = simple; then
   echo CONFIG_SIMPLE_TRACE=y  $config_host_mak
 fi
-if test $trace_backend = ust; then
-  LIBS=-lust $LIBS
-fi
 # Set the appropriate trace file.
 if test $trace_backend = simple; then
   trace_file=\$trace_file-%u\
-- 
1.6.2.5



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [Tracing][PATCH v2] Add options to specify trace file name at startup and runtime.

2010-08-04 Thread Prerna Saxena
This patch adds an optional command-line switch '-trace' that specifies the 
file to write traces to when qemu starts.
E.g., if compiled with the 'simple' trace backend,
[t...@system]$ qemu -trace FILENAME IMAGE
writes the binary traces to FILENAME instead of the file name set at 
configure time.

It also adds a 'set' sub-command to the monitor's trace-file command to 
change the trace log file at runtime.
E.g.,
(qemu) trace-file set FILENAME
redirects trace output to FILENAME instead of the default specified at 
startup.

Changelog from v1 :
- Cleanups.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c   |6 ++
 qemu-monitor.hx |6 +++---
 qemu-options.hx |   11 +++
 simpletrace.c   |   41 +++--
 tracetool   |1 +
 vl.c|   20 
 6 files changed, 72 insertions(+), 13 deletions(-)

diff --git a/monitor.c b/monitor.c
index 1e35a6b..1d6c4c0 100644
--- a/monitor.c
+++ b/monitor.c
@@ -544,6 +544,7 @@ static void do_change_trace_event_state(Monitor *mon, const 
QDict *qdict)
 static void do_trace_file(Monitor *mon, const QDict *qdict)
 {
 const char *op = qdict_get_try_str(qdict, op);
+const char *arg = qdict_get_try_str(qdict, arg);
 
 if (!op) {
 st_print_trace_file_status((FILE *)mon, monitor_fprintf);
@@ -553,8 +554,13 @@ static void do_trace_file(Monitor *mon, const QDict *qdict)
 st_set_trace_file_enabled(false);
 } else if (!strcmp(op, flush)) {
 st_flush_trace_buffer();
+} else if (!strcmp(op, set)) {
+if (arg) {
+st_set_trace_file(arg);
+}
 } else {
 monitor_printf(mon, unexpected argument \%s\\n, op);
+help_cmd(mon, trace-file);
 }
 }
 #endif
diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index 25887bd..adfaf2b 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -276,9 +276,9 @@ ETEXI
 
 {
 .name   = trace-file,
-.args_type  = op:s?,
-.params = op [on|off|flush],
-.help   = open, close, or flush trace file,
+.args_type  = op:s?,arg:F?,
+.params = on|off|flush|set [arg],
+.help   = open, close, or flush trace file, or set a new file 
name,
 .mhandler.cmd = do_trace_file,
 },
 
diff --git a/qemu-options.hx b/qemu-options.hx
index d1d2272..aea9675 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2223,6 +2223,17 @@ Normally QEMU loads a configuration file from 
@var{sysconfdir}/qemu.conf and
 @var{sysconfdir}/targ...@var{arch}.conf on startup.  The @code{-nodefconfig}
 option will prevent QEMU from loading these configuration files at startup.
 ETEXI
+#ifdef CONFIG_SIMPLE_TRACE
+DEF(trace, HAS_ARG, QEMU_OPTION_trace,
+-trace\n
+Specify a trace file to log traces to\n,
+QEMU_ARCH_ALL)
+STEXI
+...@item -trace
+...@findex -trace
+Specify a trace file to log output traces to.
+ETEXI
+#endif
 
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
diff --git a/simpletrace.c b/simpletrace.c
index 71110b3..19855f4 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -20,25 +20,46 @@ enum {
 static TraceRecord trace_buf[TRACE_BUF_LEN];
 static unsigned int trace_idx;
 static FILE *trace_fp;
-static bool trace_file_enabled = true;
+static char *trace_file_name = NULL;
+static bool trace_file_enabled = false;
 
 void st_print_trace_file_status(FILE *stream, int (*stream_printf)(FILE 
*stream, const char *fmt, ...))
 {
-stream_printf(stream, Trace file \ CONFIG_TRACE_FILE \ %s.\n,
-  getpid(), trace_file_enabled ? on : off);
+stream_printf(stream, Trace file \%s\ %s.\n,
+  trace_file_name, trace_file_enabled ? on : off);
 }
 
-static bool open_trace_file(void)
+static inline bool open_trace_file(void)
 {
-char *filename;
+trace_fp = fopen(trace_file_name, w);
+return trace_fp != NULL;
+}
+
+/**
+ * set_trace_file : To set the name of a trace file.
+ * @file : pointer to the name to be set.
+ * If NULL, set to the default name-pid set at config time.
+ */
+bool st_set_trace_file(const char *file)
+{
+st_set_trace_file_enabled(false);
 
-if (asprintf(filename, CONFIG_TRACE_FILE, getpid())  0) {
-return false;
+free(trace_file_name);
+
+if (!file) {
+if (asprintf(trace_file_name, CONFIG_TRACE_FILE, getpid())  0) {
+trace_file_name = NULL;
+   return false;
+} 
+} else {
+if (asprintf(trace_file_name, %s, file)  0) {
+trace_file_name = NULL;
+return false;
+}
 }
 
-trace_fp = fopen(filename, w);
-free(filename);
-return trace_fp != NULL;
+st_set_trace_file_enabled(true);
+return true;
 }
 
 static void flush_trace_file(void)
diff --git a/tracetool b/tracetool
index ac832af..5b979f5 100755
--- a/tracetool
+++ b/tracetool
@@ -158,6 +158,7 @@ void

Re: [Qemu-devel] [Tracing][PATCH] Add options to specify trace file name at startup and runtime.

2010-08-04 Thread Prerna Saxena

On 08/03/2010 07:45 PM, Stefan Hajnoczi wrote:

On Tue, Aug 3, 2010 at 6:37 AM, Prerna Saxena pre...@linux.vnet.ibm.com wrote:

This patch adds an optional command line switch '-trace' to specify the
filename to write traces to, when qemu starts.
Eg, If compiled with the 'simple' trace backend,
[t...@system]$ qemu -trace FILENAME IMAGE
Allows the binary traces to be written to FILENAME instead of the option
set at config-time.

Also, this adds monitor sub-command 'set' to trace-file commands to
dynamically change trace log file at runtime.
Eg,
(qemu)trace-file set FILENAME
This allows one to set trace outputs to FILENAME from the default
specified at startup.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
  monitor.c   |6 ++
  qemu-monitor.hx |6 +++---
  qemu-options.hx |   11 +++
  simpletrace.c   |   41 -
  tracetool   |1 +
  vl.c|   22 ++
  6 files changed, 75 insertions(+), 12 deletions(-)


Looks like a good approach.  I checked that this also handles the case
where trace events fire before the command-line option is handled and
the trace filename is set.


diff --git a/monitor.c b/monitor.c
index 1e35a6b..8e2a3a6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -544,6 +544,7 @@ static void do_change_trace_event_state(Monitor *mon, const 
QDict *qdict)
  static void do_trace_file(Monitor *mon, const QDict *qdict)
  {
 const char *op = qdict_get_try_str(qdict, op);
+const char *arg = qdict_get_try_str(qdict, arg);

 if (!op) {
 st_print_trace_file_status((FILE *)mon,monitor_fprintf);
@@ -553,8 +554,13 @@ static void do_trace_file(Monitor *mon, const QDict *qdict)
 st_set_trace_file_enabled(false);
 } else if (!strcmp(op, flush)) {
 st_flush_trace_buffer();
+} else if (!strcmp(op, set)) {
+if (arg) {
+st_set_trace_file(arg);
+}
 } else {
 monitor_printf(mon, unexpected argument \%s\\n, op);
+monitor_printf(mon, Options are: [on | off| flush| set FILENAME]);


Can we use help_cmd() here to print the help text and avoid
duplicating the options?


Agree, changed in v2.


 }
  }
  #endif
...
...
  static bool open_trace_file(void)
  {
-char *filename;
+trace_fp = fopen(trace_file_name, w);
+return trace_fp != NULL;
+}


This could be inlined now.  The function is only used by one caller.



Done in v2.



-if (asprintf(filename, CONFIG_TRACE_FILE, getpid())  0) {
-return false;
+/**
+ * set_trace_file : To set the name of a trace file.
+ * @file : pointer to the name to be set.
+ * If NULL, set to the default name-pid  set at config time.
+ */
+bool st_set_trace_file(const char *file)
+{
+if (trace_file_enabled) {
+st_set_trace_file_enabled(false);
 }


No need for an if statement.  If trace_file_enabled is already false,
then st_set_trace_file_enabled() is a nop.


Agree this is unnecessary. Changed in v2.



-trace_fp = fopen(filename, w);
-free(filename);
-return trace_fp != NULL;
+if (trace_file_name) {
+free(trace_file_name);
+}


No need for an if statement.  free(NULL) is a nop.


Changed in v2.


+
+if (!file) {
+if (asprintf(trace_file_name, CONFIG_TRACE_FILE, getpid())  0) {
+   return false;
+}
+} else {
+if (asprintf(trace_file_name, %s, file)  0) {
+return false;
+}
+}


When asprintf() fails, the value of the string pointer is undefined
according to the man page.  That can result in double frees.  It would
be safest to set trace_file_name = NULL on failure.



Done.
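
The review points above (no guard around free(), and never leaving the name pointer undefined after a failed allocation) can be illustrated with a small sketch. This is not the patch's code: because asprintf() is a GNU/BSD extension, the sketch uses snprintf() and malloc() instead, and set_trace_file_sketch(), make_default_name(), copy_name() and the "/tmp/trace-%u" pattern are assumptions made for illustration. The pid is passed in as a parameter only to keep the example deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_TRACE_FILE_FMT "/tmp/trace-%u"  /* assumed default pattern */

static char *trace_file_name;  /* NULL until a name has been set */

/* Format the default name into a fresh allocation; NULL on failure. */
static char *make_default_name(unsigned int pid)
{
    int len = snprintf(NULL, 0, DEFAULT_TRACE_FILE_FMT, pid);
    char *name;

    if (len < 0) {
        return NULL;
    }
    name = malloc((size_t)len + 1);
    if (name) {
        snprintf(name, (size_t)len + 1, DEFAULT_TRACE_FILE_FMT, pid);
    }
    return name;
}

/* Duplicate a caller-supplied name; NULL on failure. */
static char *copy_name(const char *file)
{
    size_t len;
    char *name;

    assert(file != NULL);
    len = strlen(file) + 1;
    name = malloc(len);
    if (name) {
        memcpy(name, file, len);
    }
    return name;
}

/* Replace the stored name. On failure the pointer is NULL, never dangling. */
static bool set_trace_file_sketch(const char *file, unsigned int pid)
{
    char *new_name = file ? copy_name(file) : make_default_name(pid);

    free(trace_file_name);       /* free(NULL) is a no-op, so no guard is needed */
    trace_file_name = new_name;  /* NULL if the allocation failed */
    return new_name != NULL;
}
```

Building the new string before freeing the old one also keeps the function safe if the caller passes the currently stored name back in.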



...
 ...

@@ -2590,6 +2597,12 @@ int main(int argc, char **argv, char **envp)
 }
 xen_mode = XEN_ATTACH;
 break;
+#ifdef CONFIG_SIMPLE_TRACE
+case QEMU_OPTION_trace:
+trace_file = (char *) qemu_malloc(strlen(optarg) + 1);
+strcpy(trace_file, optarg);
+break;
+#endif


Malloc isn't necessary, just hold the optarg pointer like gdbstub_dev
and other string options do.


It wouldn't be correct to use optarg directly here. If this optional 
argument is not specified, st_set_trace_file() is called with a NULL 
argument, and the filename defaults to the config-specified name.
(This is how gdbstub_dev works too. The optional argument is copied to 
gdbstub_dev if provided.)
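
For readers following this exchange, here is a minimal sketch of the pattern under discussion: a pointer that starts out NULL and is assigned only when the option is seen, so that NULL still selects the default name. It uses a hand-written scan over argv rather than QEMU's option parser, and find_trace_arg() is a hypothetical helper. The C standard guarantees that the argv strings remain valid until program termination, which is why no copy is made in this sketch; whether the same holds for optarg in vl.c at the time is not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option scan: return the argument that follows "-trace",
 * or NULL when the option is absent. The returned pointer refers into
 * argv, so it stays valid for the lifetime of the program. */
static const char *find_trace_arg(int argc, char **argv)
{
    const char *trace_file = NULL;  /* NULL means "use the default name" */
    int i;

    assert(argv != NULL);
    for (i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-trace") == 0) {
            trace_file = argv[i + 1];
            i++;  /* skip the option's argument */
        }
    }
    return trace_file;
}
```

The result can be handed to the name-setting function as is: a NULL return requests the default, and a non-NULL return is copied by the callee if it needs to own the string.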




...



Thanks,
--
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India



[Qemu-devel] [Tracing][PATCH v3] Add options to specify trace file name at startup and runtime.

2010-08-04 Thread Prerna Saxena
Stefanha, Malc,
Thanks for the suggestions. Resending the patch after clean-up.

This patch adds an optional command-line switch '-trace' that specifies the 
file to write traces to when qemu starts.
E.g., if compiled with the 'simple' trace backend,
[t...@system]$ qemu -trace FILENAME IMAGE
writes the binary traces to FILENAME instead of the file name set at 
configure time.

It also adds a 'set' sub-command to the monitor's trace-file command to 
change the trace log file at runtime.
E.g.,
(qemu) trace-file set FILENAME
redirects trace output to FILENAME instead of the default specified at 
startup.

Changelog from v2 :
- Cleanups.


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c   |6 ++
 qemu-monitor.hx |6 +++---
 qemu-options.hx |   11 +++
 simpletrace.c   |   41 +++--
 tracetool   |1 +
 vl.c|   18 ++
 6 files changed, 70 insertions(+), 13 deletions(-)

diff --git a/monitor.c b/monitor.c
index 1e35a6b..1d6c4c0 100644
--- a/monitor.c
+++ b/monitor.c
@@ -544,6 +544,7 @@ static void do_change_trace_event_state(Monitor *mon, const 
QDict *qdict)
 static void do_trace_file(Monitor *mon, const QDict *qdict)
 {
 const char *op = qdict_get_try_str(qdict, op);
+const char *arg = qdict_get_try_str(qdict, arg);
 
 if (!op) {
 st_print_trace_file_status((FILE *)mon, monitor_fprintf);
@@ -553,8 +554,13 @@ static void do_trace_file(Monitor *mon, const QDict *qdict)
 st_set_trace_file_enabled(false);
 } else if (!strcmp(op, flush)) {
 st_flush_trace_buffer();
+} else if (!strcmp(op, set)) {
+if (arg) {
+st_set_trace_file(arg);
+}
 } else {
 monitor_printf(mon, unexpected argument \%s\\n, op);
+help_cmd(mon, trace-file);
 }
 }
 #endif
diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index 25887bd..adfaf2b 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -276,9 +276,9 @@ ETEXI
 
 {
 .name   = trace-file,
-.args_type  = op:s?,
-.params = op [on|off|flush],
-.help   = open, close, or flush trace file,
+.args_type  = op:s?,arg:F?,
+.params = on|off|flush|set [arg],
+.help   = open, close, or flush trace file, or set a new file 
name,
 .mhandler.cmd = do_trace_file,
 },
 
diff --git a/qemu-options.hx b/qemu-options.hx
index d1d2272..aea9675 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2223,6 +2223,17 @@ Normally QEMU loads a configuration file from 
@var{sysconfdir}/qemu.conf and
 @var{sysconfdir}/targ...@var{arch}.conf on startup.  The @code{-nodefconfig}
 option will prevent QEMU from loading these configuration files at startup.
 ETEXI
+#ifdef CONFIG_SIMPLE_TRACE
+DEF(trace, HAS_ARG, QEMU_OPTION_trace,
+-trace\n
+Specify a trace file to log traces to\n,
+QEMU_ARCH_ALL)
+STEXI
+...@item -trace
+...@findex -trace
+Specify a trace file to log output traces to.
+ETEXI
+#endif
 
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
diff --git a/simpletrace.c b/simpletrace.c
index 71110b3..860bcf1 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -20,25 +20,46 @@ enum {
 static TraceRecord trace_buf[TRACE_BUF_LEN];
 static unsigned int trace_idx;
 static FILE *trace_fp;
-static bool trace_file_enabled = true;
+static char *trace_file_name = NULL;
+static bool trace_file_enabled = false;
 
 void st_print_trace_file_status(FILE *stream, int (*stream_printf)(FILE 
*stream, const char *fmt, ...))
 {
-stream_printf(stream, Trace file \ CONFIG_TRACE_FILE \ %s.\n,
-  getpid(), trace_file_enabled ? on : off);
+stream_printf(stream, Trace file \%s\ %s.\n,
+  trace_file_name, trace_file_enabled ? on : off);
 }
 
-static bool open_trace_file(void)
+static inline bool open_trace_file(void)
 {
-char *filename;
+trace_fp = fopen(trace_file_name, w);
+return trace_fp != NULL;
+}
+
+/**
+ * set_trace_file : To set the name of a trace file.
+ * @file : pointer to the name to be set.
+ * If NULL, set to the default name-pid set at config time.
+ */
+bool st_set_trace_file(const char *file)
+{
+st_set_trace_file_enabled(false);
 
-if (asprintf(filename, CONFIG_TRACE_FILE, getpid())  0) {
-return false;
+free(trace_file_name);
+
+if (!file) {
+if (asprintf(trace_file_name, CONFIG_TRACE_FILE, getpid())  0) {
+trace_file_name = NULL;
+return false;
+} 
+} else {
+if (asprintf(trace_file_name, %s, file)  0) {
+trace_file_name = NULL;
+return false;
+}
 }
 
-trace_fp = fopen(filename, w);
-free(filename);
-return trace_fp != NULL;
+st_set_trace_file_enabled(true);
+return true;
 }
 
 static void flush_trace_file(void)
diff --git a/tracetool b/tracetool
index ac832af..5b979f5

[Qemu-devel] [Tracing][PATCH] Add options to specify trace file name at startup and runtime.

2010-08-02 Thread Prerna Saxena
This patch adds an optional command-line switch '-trace' that specifies the 
file to write traces to when qemu starts.
E.g., if compiled with the 'simple' trace backend,
[t...@system]$ qemu -trace FILENAME IMAGE
writes the binary traces to FILENAME instead of the file name set at 
configure time.

It also adds a 'set' sub-command to the monitor's trace-file command to 
change the trace log file at runtime.
E.g.,
(qemu) trace-file set FILENAME
redirects trace output to FILENAME instead of the default specified at 
startup.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c   |6 ++
 qemu-monitor.hx |6 +++---
 qemu-options.hx |   11 +++
 simpletrace.c   |   41 -
 tracetool   |1 +
 vl.c|   22 ++
 6 files changed, 75 insertions(+), 12 deletions(-)

diff --git a/monitor.c b/monitor.c
index 1e35a6b..8e2a3a6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -544,6 +544,7 @@ static void do_change_trace_event_state(Monitor *mon, const 
QDict *qdict)
 static void do_trace_file(Monitor *mon, const QDict *qdict)
 {
 const char *op = qdict_get_try_str(qdict, op);
+const char *arg = qdict_get_try_str(qdict, arg);
 
 if (!op) {
 st_print_trace_file_status((FILE *)mon, monitor_fprintf);
@@ -553,8 +554,13 @@ static void do_trace_file(Monitor *mon, const QDict *qdict)
 st_set_trace_file_enabled(false);
 } else if (!strcmp(op, flush)) {
 st_flush_trace_buffer();
+} else if (!strcmp(op, set)) {
+if (arg) {
+st_set_trace_file(arg);
+}
 } else {
 monitor_printf(mon, unexpected argument \%s\\n, op);
+monitor_printf(mon, Options are: [on | off| flush| set FILENAME]);
 }
 }
 #endif
diff --git a/qemu-monitor.hx b/qemu-monitor.hx
index 25887bd..adfaf2b 100644
--- a/qemu-monitor.hx
+++ b/qemu-monitor.hx
@@ -276,9 +276,9 @@ ETEXI
 
 {
 .name   = trace-file,
-.args_type  = op:s?,
-.params = op [on|off|flush],
-.help   = open, close, or flush trace file,
+.args_type  = op:s?,arg:F?,
+.params = on|off|flush|set [arg],
+.help   = open, close, or flush trace file, or set a new file 
name,
 .mhandler.cmd = do_trace_file,
 },
 
diff --git a/qemu-options.hx b/qemu-options.hx
index d1d2272..aea9675 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2223,6 +2223,17 @@ Normally QEMU loads a configuration file from 
@var{sysconfdir}/qemu.conf and
 @var{sysconfdir}/targ...@var{arch}.conf on startup.  The @code{-nodefconfig}
 option will prevent QEMU from loading these configuration files at startup.
 ETEXI
+#ifdef CONFIG_SIMPLE_TRACE
+DEF(trace, HAS_ARG, QEMU_OPTION_trace,
+-trace\n
+Specify a trace file to log traces to\n,
+QEMU_ARCH_ALL)
+STEXI
+...@item -trace
+...@findex -trace
+Specify a trace file to log output traces to.
+ETEXI
+#endif
 
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
diff --git a/simpletrace.c b/simpletrace.c
index 71110b3..5812fe9 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -20,25 +20,48 @@ enum {
 static TraceRecord trace_buf[TRACE_BUF_LEN];
 static unsigned int trace_idx;
 static FILE *trace_fp;
-static bool trace_file_enabled = true;
+static char *trace_file_name = NULL;
+static bool trace_file_enabled = false;
 
 void st_print_trace_file_status(FILE *stream, int (*stream_printf)(FILE 
*stream, const char *fmt, ...))
 {
-stream_printf(stream, Trace file \ CONFIG_TRACE_FILE \ %s.\n,
-  getpid(), trace_file_enabled ? on : off);
+stream_printf(stream, Trace file \%s\ %s.\n,
+  trace_file_name, trace_file_enabled ? on : off);
 }
 
 static bool open_trace_file(void)
 {
-char *filename;
+trace_fp = fopen(trace_file_name, w);
+return trace_fp != NULL;
+}
 
-if (asprintf(filename, CONFIG_TRACE_FILE, getpid())  0) {
-return false;
+/**
+ * set_trace_file : To set the name of a trace file.
+ * @file : pointer to the name to be set.
+ * If NULL, set to the default name-pid set at config time.
+ */
+bool st_set_trace_file(const char *file)
+{
+if (trace_file_enabled) {
+st_set_trace_file_enabled(false);
 }
 
-trace_fp = fopen(filename, w);
-free(filename);
-return trace_fp != NULL;
+if (trace_file_name) {
+free(trace_file_name);
+}
+
+if (!file) {
+if (asprintf(trace_file_name, CONFIG_TRACE_FILE, getpid())  0) {
+   return false;
+} 
+} else {
+if (asprintf(trace_file_name, %s, file)  0) {
+return false;
+}
+}
+
+st_set_trace_file_enabled(true);
+return true;
 }
 
 static void flush_trace_file(void)
diff --git a/tracetool b/tracetool
index ac832af..5b979f5 100755
--- a/tracetool
+++ b/tracetool
@@ -158,6 +158,7 @@ void st_print_trace_events(FILE *stream, int

[Qemu-devel] [Tracing][PATCH] Allow bulk enabling of trace events at compile time.

2010-07-13 Thread Prerna Saxena
[PATCH] For the 'simple' trace backend, allow bulk enabling/disabling of trace
 events at compile time.
 Trace events that are preceded by the 'disable' keyword are compiled in, but 
 turned off by default. These can be turned on individually using the monitor.
 All other trace events are enabled by default.
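
As an illustration of the behaviour described above, the following sketch shows what a generated event table and a runtime toggle might look like. It is not tracetool output: TraceEventSketch, trace_list_sketch, find_event() and set_event_state() are hypothetical names, and the event qemu_free is an assumed example of an entry marked 'disable' in trace-events.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *tp_name;  /* trace event name */
    bool state;           /* true: records are collected */
} TraceEventSketch;

/* Assumed table for a trace-events file containing
 *     qemu_malloc(...)          enabled by default
 *     disable qemu_free(...)    compiled in, but off by default
 */
static TraceEventSketch trace_list_sketch[] = {
    { .tp_name = "qemu_malloc", .state = true  },
    { .tp_name = "qemu_free",   .state = false },
};

#define NR_EVENTS_SKETCH \
    (sizeof(trace_list_sketch) / sizeof(trace_list_sketch[0]))

/* Look an event up by name; NULL if there is no such event. */
static TraceEventSketch *find_event(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < NR_EVENTS_SKETCH; i++) {
        if (strcmp(trace_list_sketch[i].tp_name, name) == 0) {
            return &trace_list_sketch[i];
        }
    }
    return NULL;
}

/* Toggle an event at runtime, as a monitor command might.
 * Returns false when the name is unknown. */
static bool set_event_state(const char *name, bool state)
{
    TraceEventSketch *ev = find_event(name);

    if (!ev) {
        return false;
    }
    ev->state = state;
    return true;
}
```

The initial value of the state field is the only thing the 'disable' keyword changes in this model; both events exist in the table either way.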

TODO :
This could be enhanced when the trace-event namespace is partitioned into a
group and an ID within that group. In such a case, marking a group as enabled 
would automatically enable all trace-events listed under it.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 trace-events |3 +++
 tracetool|   36 
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/trace-events b/trace-events
index a533414..cb5ef00 100644
--- a/trace-events
+++ b/trace-events
@@ -17,6 +17,9 @@
 # Example: qemu_malloc(size_t size) size %zu
 #
 # The disable keyword will build without the trace event.
+# In case of 'simple' trace backend, it will allow the trace event to be
+# compiled, but this would be turned off by default. It can be toggled on via 
+# the monitor.
 #
 # The name must be a valid as a C function name.
 #
diff --git a/tracetool b/tracetool
index b7a0499..98d23fb 100755
--- a/tracetool
+++ b/tracetool
@@ -73,6 +73,20 @@ get_fmt()
 echo $fmt
 }
 
+# Get the state of a trace event
+get_state()
+{
+local str disable state
+str=$(get_name $1)
+disable=${str##disable }
+if [ $disable = $str ] ; then
+state=1
+else
+state=0
+fi
+echo $state
+}
+
 linetoh_begin_nop()
 {
 return
@@ -155,12 +169,16 @@ cast_args_to_ulong()
 
 linetoh_simple()
 {
-local name args argc ulong_args
+local name args argc ulong_args state
 name=$(get_name $1)
 args=$(get_args $1)
 argc=$(get_argc $1)
 ulong_args=$(cast_args_to_ulong $1)
 
+state=$(get_state $1)
+if [ $state = 0 ]; then
+name=${name##disable }
+fi
 cat EOF
 static inline void trace_$name($args) {
 trace$argc($simple_event_num, $ulong_args);
@@ -191,10 +209,14 @@ EOF
 
 linetoc_simple()
 {
-local name
+local name state
 name=$(get_name $1)
+state=$(get_state $1)
+if [ $state = 0 ] ; then
+name=${name##disable }
+fi
 cat EOF
-{.tp_name = $name, .state=0},
+{.tp_name = $name, .state=$state},
 EOF
 simple_event_num=$((simple_event_num + 1))
 }
@@ -305,7 +327,13 @@ convert()
 disable=${str%%disable *}
 echo
 if test -z $disable; then
-lineto$1_nop ${str##disable }
+# Pass the disabled state as an arg to lineto$1_simple().
+# For all other cases, call lineto$1_nop()
+if [ $backend = simple ]; then
+$process_line $str
+else
+lineto$1_nop ${str##disable }
+fi
 else
 $process_line $str
 fi
-- 
1.6.2.5



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [RFC v5][PATCH][Tracing] Fix build errors for target i386-linux-user

2010-07-11 Thread Prerna Saxena
[PATCH] Separate monitor command handler interfaces and tracing internals.

Changelog from v3:
 - cleanup ( removed unnecessary references to 'rec' )

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c |   23 +++
 simpletrace.c |   50 --
 tracetool |7 +++
 3 files changed, 58 insertions(+), 22 deletions(-)

diff --git a/monitor.c b/monitor.c
index 433a3ec..1f89938 100644
--- a/monitor.c
+++ b/monitor.c
@@ -540,6 +540,29 @@ static void do_change_trace_event_state(Monitor *mon, 
const QDict *qdict)
 bool new_state = qdict_get_bool(qdict, option);
 change_trace_event_state(tp_name, new_state);
 }
+
+void do_info_trace(Monitor *mon)
+{
+unsigned int i;
+char rec[MAX_TRACE_STR_LEN];
+unsigned int trace_idx = get_trace_idx();
+
+for (i = 0; i  trace_idx ; i++) {
+if (format_trace_string(i, rec)) {
+monitor_printf(mon, rec);
+}
+}
+}
+
+void do_info_all_trace_events(Monitor *mon)
+{
+unsigned int i;
+
+for (i = 0; i  NR_TRACE_EVENTS; i++) {
+monitor_printf(mon, %s [Event ID %u] : state %u\n,
+trace_list[i].tp_name, i, trace_list[i].state);
+}
+}
 #endif
 
 static void user_monitor_complete(void *opaque, QObject *ret_data)
diff --git a/simpletrace.c b/simpletrace.c
index 57c41fc..9e3b46c 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -1,8 +1,8 @@
 #include stdlib.h
 #include stdio.h
-#include monitor.h
 #include trace.h
 
+/* Remember to update MAX_TRACE_STR_LEN when changing TraceRecord structure */
 typedef struct {
 unsigned long event;
 unsigned long x1;
@@ -69,27 +69,6 @@ void trace5(TraceEventID event, unsigned long x1, unsigned 
long x2, unsigned lon
 trace(event, x1, x2, x3, x4, x5);
 }
 
-void do_info_trace(Monitor *mon)
-{
-unsigned int i;
-
-for (i = 0; i  trace_idx ; i++) {
-monitor_printf(mon, Event %lu : %lx %lx %lx %lx %lx\n,
-  trace_buf[i].event, trace_buf[i].x1, trace_buf[i].x2,
-trace_buf[i].x3, trace_buf[i].x4, trace_buf[i].x5);
-}
-}
-
-void do_info_all_trace_events(Monitor *mon)
-{
-unsigned int i;
-
-for (i = 0; i  NR_TRACE_EVENTS; i++) {
-monitor_printf(mon, %s [Event ID %u] : state %u\n,
-trace_list[i].tp_name, i, trace_list[i].state);
-}
-}
-
 static TraceEvent* find_trace_event_by_name(const char *tname)
 {
 unsigned int i;
@@ -115,3 +94,30 @@ void change_trace_event_state(const char *tname, bool 
tstate)
 tp-state = tstate;
 }
 }
+
+/**
+ * Return the current trace index.
+ *
+ */
+unsigned int get_trace_idx(void)
+{
+return trace_idx;
+}
+
+/**
+ * returns formatted TraceRecord at a given index in the trace buffer.
+ * FORMAT : Event %lu : %lx %lx %lx %lx %lx\n
+ * 
+ * @idx : index in the buffer for which trace record is returned.
+ * @trace_str : output string passed.
+ */
+char* format_trace_string(unsigned int idx, char trace_str[])
+{
+if (idx = TRACE_BUF_LEN) {
+return NULL;
+}
+sprintf(trace_str[0], Event %lu : %lx %lx %lx %lx %lx\n,
+ trace_buf[idx].event, trace_buf[idx].x1, 
trace_buf[idx].x2,
+   trace_buf[idx].x3, trace_buf[idx].x4, 
trace_buf[idx].x5);
+return trace_str[0];
+}
diff --git a/tracetool b/tracetool
index c77280d..b7a0499 100755
--- a/tracetool
+++ b/tracetool
@@ -125,6 +125,11 @@ typedef struct {
 bool state;
 } TraceEvent;
 
+/* Max size of trace string to be displayed via the monitor.
+ * Format : Event %lu : %lx %lx %lx %lx %lx\n
+ */
+#define MAX_TRACE_STR_LEN 100
+
 void trace1(TraceEventID event, unsigned long x1);
 void trace2(TraceEventID event, unsigned long x1, unsigned long x2);
 void trace3(TraceEventID event, unsigned long x1, unsigned long x2, unsigned 
long x3);
@@ -133,6 +138,8 @@ void trace5(TraceEventID event, unsigned long x1, unsigned 
long x2, unsigned lon
 void do_info_trace(Monitor *mon);
 void do_info_all_trace_events(Monitor *mon);
 void change_trace_event_state(const char *tname, bool tstate);
+unsigned int get_trace_idx(void);
+char* format_trace_string(unsigned int idx, char *trace_str);
 EOF
 
 simple_event_num=0
-- 
1.6.2.5



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] [PATCH][Tracing] Specify trace file name

2010-07-09 Thread Prerna Saxena
[PATCH] Allow users to specify a file for trace output at configure time.
Also, allow trace file names to be annotated with the pid so that each qemu 
instance writes its own trace file.

The trace file name can be passed as a config option:
--trace-file=/path/to/file
(Default : /tmp/trace )
At runtime, the pid of the qemu process is appended to the filename so 
that multiple qemu instances do not have overlapping logs.

Eg : /tmp/trace-1234 for qemu launched with pid 1234.
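
The naming rule can be sketched as follows. This is an illustration only: build_trace_file_name() is a hypothetical helper that uses snprintf() into a caller-supplied buffer, whereas the patch itself formats CONFIG_TRACE_FILE with asprintf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<base>-<pid>" into buf.
 * Returns 0 on success, -1 if formatting failed or the name was truncated. */
static int build_trace_file_name(char *buf, size_t size,
                                 const char *base, unsigned int pid)
{
    int n;

    assert(buf != NULL && base != NULL);
    n = snprintf(buf, size, "%s-%u", base, pid);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

A caller would pass (unsigned int)getpid() as the last argument on POSIX hosts, so that a base of /tmp/trace yields /tmp/trace-1234 for pid 1234.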

I have yet to test this on Windows. getpid() is used in many places
in the code (including vnc.c), so I'm hoping this will be okay there too.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 configure |   20 
 simpletrace.c |   13 -
 tracetool |1 +
 vl.c  |8 
 4 files changed, 41 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 02bf602..18cb6ab 100755
--- a/configure
+++ b/configure
@@ -313,6 +313,7 @@ check_utests=no
 user_pie=no
 zero_malloc=
 trace_backend=nop
+trace_file=
 
 # OS specific
 if check_define __linux__ ; then
@@ -517,6 +518,8 @@ for opt do
   ;;
   --trace-backend=*) trace_backend=$optarg
   ;;
+  --trace-file=*) trace_file=$optarg
+  ;;
   --enable-gprof) gprof=yes
   ;;
   --static)
@@ -876,6 +879,9 @@ echo   --disable-docs   disable documentation 
build
 echo   --disable-vhost-net  disable vhost-net acceleration support
 echo   --enable-vhost-net   enable vhost-net acceleration support
 echo   --trace-backend=BTrace backend nop simple ust
+echo   --trace-file=NAMEFull PATH,NAME of file to store traces
+echoDefault:/tmp/trace-pid
+echoDefault:trace-pid on Windows
 echo 
 echo NOTE: The object files are built at the place where configure is 
launched
 exit 1
@@ -2132,6 +2138,7 @@ echo fdatasync $fdatasync
 echo uuid support  $uuid
 echo vhost-net support $vhost_net
 echo Trace backend $trace_backend
+echo Trace Output File $trace_file-pid
 
 if test $sdl_too_old = yes; then
 echo - Your SDL version is too old - please upgrade to have SDL support
@@ -2387,6 +2394,19 @@ fi
 if test $trace_backend = ust; then
   LIBS=-lust $LIBS
 fi
+# Set the appropriate trace file.
+if test $trace_backend = simple; then
+  if test $trace_file = ; then
+if test $mingw32 = yes ; then
+  trace_file=\trace-%u\
+else
+  trace_file=\/tmp/trace-%u\
+fi
+  else
+trace_file=\$trace_file-%u\
+  fi
+fi
+echo CONFIG_TRACE_FILE=$trace_file  $config_host_mak
 echo TOOLS=$tools  $config_host_mak
 echo ROMS=$roms  $config_host_mak
 echo MAKE=$make  $config_host_mak
diff --git a/simpletrace.c b/simpletrace.c
index 57c41fc..4f3228f 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -20,6 +20,16 @@ static TraceRecord trace_buf[TRACE_BUF_LEN];
 static unsigned int trace_idx;
 static FILE *trace_fp;
 
+char* trace_file_name;
+
+/**
+ * Initialize trace file name.
+ */
+int init_trace_file(void)
+{
+   return asprintf(trace_file_name, CONFIG_TRACE_FILE, getpid());
+}
+
 static void trace(TraceEventID event, unsigned long x1,
   unsigned long x2, unsigned long x3,
   unsigned long x4, unsigned long x5) {
@@ -40,7 +50,7 @@ static void trace(TraceEventID event, unsigned long x1,
 trace_idx = 0;
 
 if (!trace_fp) {
-trace_fp = fopen(/tmp/trace.log, w);
+trace_fp = fopen(trace_file_name, w);
 }
 if (trace_fp) {
 size_t result = fwrite(trace_buf, sizeof trace_buf, 1, trace_fp);
@@ -78,6 +88,7 @@ void do_info_trace(Monitor *mon)
   trace_buf[i].event, trace_buf[i].x1, trace_buf[i].x2,
 trace_buf[i].x3, trace_buf[i].x4, trace_buf[i].x5);
 }
+monitor_printf(mon, Trace output logged at %s, trace_file_name);
 }
 
 void do_info_all_trace_events(Monitor *mon)
diff --git a/tracetool b/tracetool
index c77280d..05ece45 100755
--- a/tracetool
+++ b/tracetool
@@ -125,6 +125,7 @@ typedef struct {
 bool state;
 } TraceEvent;
 
+int init_trace_file(void);
 void trace1(TraceEventID event, unsigned long x1);
 void trace2(TraceEventID event, unsigned long x1, unsigned long x2);
 void trace3(TraceEventID event, unsigned long x1, unsigned long x2, unsigned long x3);
diff --git a/vl.c b/vl.c
index 920717a..adc28ef 100644
--- a/vl.c
+++ b/vl.c
@@ -95,6 +95,10 @@ extern int madvise(caddr_t, size_t, int);
 #include <windows.h>
 #endif
 
+#ifdef CONFIG_SIMPLE_TRACE
+#include "trace.h"
+#endif
+
 #ifdef CONFIG_SDL
 #if defined(__APPLE__) || defined(main)
 #include <SDL.h>
@@ -2758,6 +2762,10 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+/* Init tracing, if configured */
+#ifdef CONFIG_SIMPLE_TRACE
+init_trace_file();
+#endif
 /* init the bluetooth world */
 if (foreach_device_config(DEV_BT, bt_parse))
 exit(1);
-- 
1.6.2.5

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems

[Qemu-devel] [RFC v4][PATCH][Tracing] Fix build errors for target i386-linux-user

2010-07-09 Thread Prerna Saxena
[PATCH] Separate monitor command handler interfaces and tracing internals.

Changelog from v3 :
1. Cleanups.

Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
 monitor.c |   23 +++
 simpletrace.c |   52 ++--
 tracetool |7 +++
 3 files changed, 60 insertions(+), 22 deletions(-)

diff --git a/monitor.c b/monitor.c
index 433a3ec..1f89938 100644
--- a/monitor.c
+++ b/monitor.c
@@ -540,6 +540,29 @@ static void do_change_trace_event_state(Monitor *mon, const QDict *qdict)
     bool new_state = qdict_get_bool(qdict, "option");
     change_trace_event_state(tp_name, new_state);
 }
+
+void do_info_trace(Monitor *mon)
+{
+    unsigned int i;
+    char rec[MAX_TRACE_STR_LEN];
+    unsigned int trace_idx = get_trace_idx();
+
+    for (i = 0; i < trace_idx ; i++) {
+        if (format_trace_string(i, rec)) {
+            monitor_printf(mon, rec);
+        }
+    }
+}
+
+void do_info_all_trace_events(Monitor *mon)
+{
+    unsigned int i;
+
+    for (i = 0; i < NR_TRACE_EVENTS; i++) {
+        monitor_printf(mon, "%s [Event ID %u] : state %u\n",
+                       trace_list[i].tp_name, i, trace_list[i].state);
+    }
+}
 #endif
 
 static void user_monitor_complete(void *opaque, QObject *ret_data)
diff --git a/simpletrace.c b/simpletrace.c
index 57c41fc..78507ec 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -1,8 +1,8 @@
 #include <stdlib.h>
 #include <stdio.h>
-#include "monitor.h"
 #include "trace.h"
 
+/* Remember to update MAX_TRACE_STR_LEN when changing TraceRecord structure */
 typedef struct {
 unsigned long event;
 unsigned long x1;
@@ -69,27 +69,6 @@ void trace5(TraceEventID event, unsigned long x1, unsigned long x2, unsigned lon
     trace(event, x1, x2, x3, x4, x5);
 }
 
-void do_info_trace(Monitor *mon)
-{
-    unsigned int i;
-
-    for (i = 0; i < trace_idx ; i++) {
-        monitor_printf(mon, "Event %lu : %lx %lx %lx %lx %lx\n",
-                       trace_buf[i].event, trace_buf[i].x1, trace_buf[i].x2,
-                       trace_buf[i].x3, trace_buf[i].x4, trace_buf[i].x5);
-    }
-}
-
-void do_info_all_trace_events(Monitor *mon)
-{
-    unsigned int i;
-
-    for (i = 0; i < NR_TRACE_EVENTS; i++) {
-        monitor_printf(mon, "%s [Event ID %u] : state %u\n",
-                       trace_list[i].tp_name, i, trace_list[i].state);
-    }
-}
-
 static TraceEvent* find_trace_event_by_name(const char *tname)
 {
 unsigned int i;
@@ -115,3 +94,32 @@ void change_trace_event_state(const char *tname, bool tstate)
         tp->state = tstate;
     }
 }
+
+/**
+ * Return the current trace index.
+ *
+ */
+unsigned int get_trace_idx(void)
+{
+    return trace_idx;
+}
+
+/**
+ * returns formatted TraceRecord at a given index in the trace buffer.
+ * FORMAT : Event %lu : %lx %lx %lx %lx %lx\n
+ *
+ * @idx : index in the buffer for which trace record is returned.
+ * @trace_str : output string passed.
+ */
+char* format_trace_string(unsigned int idx, char trace_str[])
+{
+    TraceRecord rec;
+    if (idx >= TRACE_BUF_LEN) {
+        return NULL;
+    }
+    rec = trace_buf[idx];
+    sprintf(&trace_str[0], "Event %lu : %lx %lx %lx %lx %lx\n",
+            trace_buf[idx].event, trace_buf[idx].x1, trace_buf[idx].x2,
+            trace_buf[idx].x3, trace_buf[idx].x4, trace_buf[idx].x5);
+    return &trace_str[0];
+}
diff --git a/tracetool b/tracetool
index c77280d..b7a0499 100755
--- a/tracetool
+++ b/tracetool
@@ -125,6 +125,11 @@ typedef struct {
 bool state;
 } TraceEvent;
 
+/* Max size of trace string to be displayed via the monitor.
+ * Format : Event %lu : %lx %lx %lx %lx %lx\n
+ */
+#define MAX_TRACE_STR_LEN 100
+
 void trace1(TraceEventID event, unsigned long x1);
 void trace2(TraceEventID event, unsigned long x1, unsigned long x2);
 void trace3(TraceEventID event, unsigned long x1, unsigned long x2, unsigned long x3);
@@ -133,6 +138,8 @@ void trace5(TraceEventID event, unsigned long x1, unsigned long x2, unsigned lon
 void do_info_trace(Monitor *mon);
 void do_info_all_trace_events(Monitor *mon);
 void change_trace_event_state(const char *tname, bool tstate);
+unsigned int get_trace_idx(void);
+char* format_trace_string(unsigned int idx, char *trace_str);
 EOF
 
 simple_event_num=0
-- 
1.6.2.5



-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India




[Qemu-devel] Re: [RFC v3][PATCH][Tracing] Fix build errors for target i386-linux-user

2010-07-09 Thread Prerna Saxena

On 07/08/2010 07:04 PM, Stefan Hajnoczi wrote:

On Thu, Jul 08, 2010 at 04:50:52PM +0530, Prerna Saxena wrote:

On 07/08/2010 02:50 PM, Stefan Hajnoczi wrote:

On Thu, Jul 08, 2010 at 10:58:58AM +0530, Prerna Saxena wrote:

[PATCH] Separate monitor command handler interfaces and tracing internals.


Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com
---
  monitor.c |   23 +++
  simpletrace.c |   51 +--
  tracetool |7 +++
  3 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/monitor.c b/monitor.c
index 433a3ec..1f89938 100644
--- a/monitor.c
+++ b/monitor.c
@@ -540,6 +540,29 @@ static void do_change_trace_event_state(Monitor *mon, const QDict *qdict)
      bool new_state = qdict_get_bool(qdict, "option");
      change_trace_event_state(tp_name, new_state);
  }
+
+void do_info_trace(Monitor *mon)
+{
+    unsigned int i;
+    char rec[MAX_TRACE_STR_LEN];
+    unsigned int trace_idx = get_trace_idx();
+
+    for (i = 0; i < trace_idx ; i++) {
+        if (format_trace_string(i, rec)) {
+            monitor_printf(mon, rec);
+        }
+    }
+}
+
+void do_info_all_trace_events(Monitor *mon)
+{
+    unsigned int i;
+
+    for (i = 0; i < NR_TRACE_EVENTS; i++) {
+        monitor_printf(mon, "%s [Event ID %u] : state %u\n",
+                       trace_list[i].tp_name, i, trace_list[i].state);
+    }
+}
  #endif

  static void user_monitor_complete(void *opaque, QObject *ret_data)
diff --git a/simpletrace.c b/simpletrace.c
index 57c41fc..c7b1e7e 100644
--- a/simpletrace.c
+++ b/simpletrace.c
@@ -1,8 +1,8 @@
  #include <stdlib.h>
  #include <stdio.h>
-#include "monitor.h"
  #include "trace.h"

+/* Remember to update TRACE_REC_SIZE when changing TraceRecord structure */


I can't see TRACE_REC_SIZE anywhere else in this patch.


Oops. This comment must go. The intent was for
MAX_TRACE_STR_LEN to be large enough to hold the formatted string,
but I'm not sure if there is a way to test that.
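
One possible way to test this is to derive the worst-case length from the
format string and check it at compile time. The following is only a sketch
and not part of the patch: ULONG_DEC_DIGITS, ULONG_HEX_DIGITS,
TRACE_STR_WORST_CASE and worst_case_formatted_len() are hypothetical names,
and MAX_TRACE_STR_LEN is assumed to be 128 here rather than the patch's 100.

```c
#include <limits.h>
#include <stdio.h>

/* Upper bounds on the digits needed to print an unsigned long. */
#define ULONG_HEX_DIGITS ((sizeof(unsigned long) * CHAR_BIT + 3) / 4)
#define ULONG_DEC_DIGITS (sizeof(unsigned long) * CHAR_BIT * 302 / 1000 + 1)

/* "Event " + %lu + " : " + five %lx + four spaces + "\n" + NUL */
#define TRACE_STR_WORST_CASE \
    (6 + ULONG_DEC_DIGITS + 3 + 5 * ULONG_HEX_DIGITS + 4 + 1 + 1)

#define MAX_TRACE_STR_LEN 128   /* assumed value for this sketch */

/* Compile-time check: the array size is negative if the limit is too small. */
typedef char trace_str_len_check[
    (MAX_TRACE_STR_LEN >= TRACE_STR_WORST_CASE) ? 1 : -1];

/* Runtime cross-check: characters produced when every field is all-ones. */
int worst_case_formatted_len(void)
{
    return snprintf(NULL, 0, "Event %lu : %lx %lx %lx %lx %lx\n",
                    ULONG_MAX, ULONG_MAX, ULONG_MAX,
                    ULONG_MAX, ULONG_MAX, ULONG_MAX);
}
```

With a 64-bit unsigned long the worst case works out to 115 bytes including
the terminating NUL, so a limit of 100 would trip the check on such hosts.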



Done in v4.




  typedef struct {
  unsigned long event;
  unsigned long x1;
@@ -69,27 +69,6 @@ void trace5(TraceEventID event, unsigned long x1, unsigned long x2, unsigned lon
      trace(event, x1, x2, x3, x4, x5);
  }

-void do_info_trace(Monitor *mon)
-{
-    unsigned int i;
-
-    for (i = 0; i < trace_idx ; i++) {
-        monitor_printf(mon, "Event %lu : %lx %lx %lx %lx %lx\n",
-                       trace_buf[i].event, trace_buf[i].x1, trace_buf[i].x2,
-                       trace_buf[i].x3, trace_buf[i].x4, trace_buf[i].x5);
-    }
-}
-
-void do_info_all_trace_events(Monitor *mon)
-{
-    unsigned int i;
-
-    for (i = 0; i < NR_TRACE_EVENTS; i++) {
-        monitor_printf(mon, "%s [Event ID %u] : state %u\n",
-                       trace_list[i].tp_name, i, trace_list[i].state);
-    }
-}
-
  static TraceEvent* find_trace_event_by_name(const char *tname)
  {
  unsigned int i;
@@ -115,3 +94,31 @@ void change_trace_event_state(const char *tname, bool tstate)
          tp->state = tstate;
      }
  }
+
+/**
+ * Return the current trace index.
+ *
+ */
+unsigned int get_trace_idx(void)
+{
+return trace_idx;
+}


format_trace_string() returns NULL if the index is beyond the last valid
trace record.  monitor.c doesn't need to know how many trace records
there are ahead of time; it can just keep printing until it gets NULL.
I don't feel strongly about this but wanted to mention it.


format_trace_string() returns NULL when the index passed exceeds the
size of the trace buffer. This function is meant for printing the current
contents of the trace buffer, which may be less than the entire buffer
size.


Sorry, you're right: the patch will return NULL if the index exceeds the
size of the trace buffer.

The idea I was suggesting requires it to return NULL when the
index >= trace_idx.
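
The suggestion of returning NULL once the index reaches trace_idx, so that
the caller simply loops until it sees NULL, could look roughly like the
sketch below. This is an illustration only, not the posted patch:
print_trace_records() is a hypothetical stand-in for the monitor handler,
and TRACE_BUF_LEN and MAX_TRACE_STR_LEN are assumed values.

```c
#include <stdio.h>

#define TRACE_BUF_LEN 4           /* assumed, kept small for the sketch */
#define MAX_TRACE_STR_LEN 128     /* assumed; callers pass a buffer this large */

typedef struct {
    unsigned long event, x1, x2, x3, x4, x5;
} TraceRecord;

static TraceRecord trace_buf[TRACE_BUF_LEN];
static unsigned int trace_idx;    /* number of valid records in trace_buf */

/* Returns trace_str, or NULL once idx is past the last valid record. */
char *format_trace_string(unsigned int idx, char *trace_str)
{
    if (idx >= trace_idx) {
        return NULL;
    }
    snprintf(trace_str, MAX_TRACE_STR_LEN,
             "Event %lu : %lx %lx %lx %lx %lx\n",
             trace_buf[idx].event, trace_buf[idx].x1, trace_buf[idx].x2,
             trace_buf[idx].x3, trace_buf[idx].x4, trace_buf[idx].x5);
    return trace_str;
}

/* The caller never looks at trace_idx; it stops at the first NULL. */
unsigned int print_trace_records(FILE *out)
{
    char rec[MAX_TRACE_STR_LEN];
    unsigned int i;

    for (i = 0; format_trace_string(i, rec); i++) {
        fputs(rec, out);
    }
    return i;   /* number of records printed */
}
```

The trade-off is that get_trace_idx() is no longer needed by the monitor
code, at the cost of a less general interface.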



I've tried to keep this as generic as possible. get_trace_idx() can be
used to query the state of the trace buffer in different scenarios.





+
+/**
+ * returns formatted TraceRecord at a given index in the trace buffer.
+ * FORMAT : Event %lu : %lx %lx %lx %lx %lx\n
+ *
+ * @idx : index in the buffer for which trace record is returned.
+ * @trace_str : output string passed.
+ */
+char* format_trace_string(unsigned int idx, char trace_str[])
+{
+TraceRecord rec;
+if (idx= TRACE_BUF_LEN || sizeof(trace_str)= MAX_TRACE_STR_LEN) {


sizeof(trace_str) == sizeof(char *), not the size of the caller's array in bytes.
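
The sizeof point can be seen in isolation. In C, a parameter declared as
char trace_str[] is adjusted to char *, so sizeof inside the callee measures
a pointer. A small sketch with hypothetical function names:

```c
#include <stddef.h>

#define MAX_TRACE_STR_LEN 100

/*
 * Declaring the parameter as "char trace_str[]" is exactly equivalent to
 * "char *trace_str", so sizeof here yields the size of a pointer.
 */
size_t size_seen_by_callee(char *trace_str)
{
    return sizeof(trace_str);
}

/* In the scope where the array is declared, sizeof gives the full size. */
size_t size_seen_by_caller(void)
{
    char rec[MAX_TRACE_STR_LEN];

    return sizeof(rec);
}
```

If the callee needs the buffer size, it has to be passed as a separate
argument.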


Hmm, I'll need to scrap this check.



Done.



The fixed size limit can be eliminated using asprintf(3), which
allocates a string of the right size while doing the string formatting.
The caller of format_trace_string() is then responsible for freeing the
string when they are done with it.
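
A rough sketch of the asprintf(3) approach follows. It is not taken from any
posted patch: format_trace_record() is a hypothetical name, TraceRecord is a
stand-in for the structure in simpletrace.c, and asprintf() is a GNU/BSD
extension rather than standard C.

```c
#define _GNU_SOURCE             /* asprintf() is a GNU/BSD extension */
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the TraceRecord structure in simpletrace.c. */
typedef struct {
    unsigned long event, x1, x2, x3, x4, x5;
} TraceRecord;

/*
 * Format one record into a freshly allocated string of the right size.
 * Returns NULL on failure; the caller must free() the result.
 */
char *format_trace_record(const TraceRecord *rec)
{
    char *str = NULL;

    if (asprintf(&str, "Event %lu : %lx %lx %lx %lx %lx\n",
                 rec->event, rec->x1, rec->x2, rec->x3,
                 rec->x4, rec->x5) < 0) {
        return NULL;
    }
    return str;
}
```

A caller would print the returned string and then pass it to free().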



I am somewhat reluctant to allocate memory here and free it somewhere
else. That invites memory leaks if the free is ever missed.
I'd rather use stack-allocated arrays that clean up after the call
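
The stack-allocated alternative can still guard against overflow if the
caller passes the buffer size explicitly. A minimal sketch, assuming a
hypothetical format_record_into() helper and a stand-in TraceRecord:

```c
#include <stdio.h>

/* Stand-in for the TraceRecord structure in simpletrace.c. */
typedef struct {
    unsigned long event, x1, x2, x3, x4, x5;
} TraceRecord;

/*
 * Format rec into the caller's buffer buf of capacity len.
 * Returns buf, or NULL if the formatted record did not fit.
 * Nothing is heap-allocated, so there is nothing to free.
 */
char *format_record_into(const TraceRecord *rec, char *buf, size_t len)
{
    int n = snprintf(buf, len, "Event %lu : %lx %lx %lx %lx %lx\n",
                     rec->event, rec->x1, rec->x2, rec->x3,
                     rec->x4, rec->x5);

    if (n < 0 || (size_t)n >= len) {
        return NULL;    /* error, or output was truncated */
    }
    return buf;
}
```

The caller declares the array on its own stack and passes sizeof of that
array as len, so the buffer goes away when the caller returns.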
