Re: [Qemu-devel] [ANNOUNCE] qemu-test: a set of tests scripts for QEMU

2012-01-01 Thread Avi Kivity
On 12/29/2011 11:10 PM, Peter Maydell wrote:

>
> > Even if that turns out to be the case, it's fine.  Better to
> > have a few good devices than dozens of bad ones.
>
> This is easier to say on the x86 side of the fence, since
> most of the devices you need are already in the codebase and
> nobody is going to throw them out again if tests don't get
> written for them :-)
>
> There are lots of things where I'd rather have a "tested by
> booting a guest OS" implementation than none at all (like audio
> support for the versatile/etc boards, which went in recently) or
> TrustZone support (not in yet but may be along later). At least
> then we have something that works for most people and something
> we can fix bugs in, rather than a gaping hole in capability.
>

We can have different criteria for different parts of the tree. 
Undesirable, but tradeoffs have to be made.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 00/21][RFC] postcopy live migration

2012-01-01 Thread Orit Wasserman
On 12/30/2011 12:39 AM, Anthony Liguori wrote:
> On 12/28/2011 07:25 PM, Isaku Yamahata wrote:
>> Intro
>> =
>> This patch series implements postcopy live migration.[1]
>> As discussed at KVM forum 2011, dedicated character device is used for
>> distributed shared memory between migration source and destination.
>> Now we can discuss/benchmark/compare with precopy. I believe there are
>> much rooms for improvement.
>>
>> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>>
>>
>> Usage
>> =
>> You need load umem character device on the host before starting migration.
>> Postcopy can be used for tcg and kvm accelarator. The implementation depend
>> on only linux umem character device. But the driver dependent code is split
>> into a file.
>> I tested only host page size == guest page size case, but the implementation
>> allows host page size != guest page size case.
>>
>> The following options are added with this patch series.
>> - incoming part
>>command line options
>>-postcopy [-postcopy-flags]
>>where flags is for changing behavior for benchmark/debugging
>>Currently the following flags are available
>>0: default
>>1: enable touching page request
>>
>>example:
>>qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm
>>
>> - outging part
>>options for migrate command
>>migrate [-p [-n]] URI
>>-p: indicate postcopy migration
>>-n: disable background transferring pages: This is for benchmark/debugging
>>
>>example:
>>migrate -p -n tcp::
>>
>>
>> TODO
>> 
>> - benchmark/evaluation. Especially how async page fault affects the result.
> 
> I'll review this series next week (Mike/Juan, please also review when you 
> can).
> 
> But we really need to think hard about whether this is the right thing to 
> take into the tree.  I worry a lot about the fact that we don't test pre-copy 
> migration nearly enough and adding a second form just introduces more things 
> to test.
>
> It's also not clear to me why post-copy is better.  If you were going to sit 
> down and explain to someone building a management tool when they should use 
> pre-copy and when they should use post-copy, what would you tell them?

Start with pre-copy , if it doesn't converge switch to post-copy  

Orit
> 
> Regards,
> 
> Anthony Liguori
> 




Re: [Qemu-devel] [PATCH 00/21][RFC] postcopy live migration

2012-01-01 Thread Dor Laor

On 12/30/2011 12:39 AM, Anthony Liguori wrote:

On 12/28/2011 07:25 PM, Isaku Yamahata wrote:

Intro
=
This patch series implements postcopy live migration.[1]
As discussed at KVM forum 2011, dedicated character device is used for
distributed shared memory between migration source and destination.
Now we can discuss/benchmark/compare with precopy. I believe there are
much rooms for improvement.

[1] http://wiki.qemu.org/Features/PostCopyLiveMigration


Usage
=
You need load umem character device on the host before starting
migration.
Postcopy can be used for tcg and kvm accelarator. The implementation
depend
on only linux umem character device. But the driver dependent code is
split
into a file.
I tested only host page size == guest page size case, but the
implementation
allows host page size != guest page size case.

The following options are added with this patch series.
- incoming part
command line options
-postcopy [-postcopy-flags]
where flags is for changing behavior for benchmark/debugging
Currently the following flags are available
0: default
1: enable touching page request

example:
qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm

- outging part
options for migrate command
migrate [-p [-n]] URI
-p: indicate postcopy migration
-n: disable background transferring pages: This is for
benchmark/debugging

example:
migrate -p -n tcp::


TODO

- benchmark/evaluation. Especially how async page fault affects the
result.


I'll review this series next week (Mike/Juan, please also review when
you can).

But we really need to think hard about whether this is the right thing
to take into the tree. I worry a lot about the fact that we don't test
pre-copy migration nearly enough and adding a second form just
introduces more things to test.


It is an issue but it can't be a merge criteria, Isaku is not blame of 
pre copy live migration lack of testing.


I would say that 90% of issues of live migration problems are not 
related to the pre|post stage but more of issues of device model save 
state. So post-copy shouldn't add a significant regression here.


Probably it will be good to ask every migration patch writer to write an 
additional unit test for migration.



It's also not clear to me why post-copy is better. If you were going to
sit down and explain to someone building a management tool when they
should use pre-copy and when they should use post-copy, what would you
tell them?


Today, we have a default of max-downtime of 100ms.
If either the guest work set size or the host networking throughput 
can't match the downtime, migration won't end.

The mgmt user options are:
 - increase the downtime more and more to an actual stop
 - fail migrate

W/ post-copy there is another option.
Performance measurements will teach us (probably prior to commit) when 
this stage is valuable. Most likely, we better try first with pre-copy 
and if we can't meet the downtime we can optionally use post-copy.


Here's a paper by Umesh (the migration thread writer):
http://osnet.cs.binghamton.edu/publications/hines09postcopy_osr.pdf

Regards,
Dor



Regards,

Anthony Liguori






Re: [Qemu-devel] Better qemu/kvm defaults (was Re: [RFC PATCH 0/4] Gang scheduling in CFS)

2012-01-01 Thread Dor Laor

On 12/29/2011 06:16 PM, Anthony Liguori wrote:

On 12/29/2011 10:07 AM, Dor Laor wrote:

On 12/26/2011 11:05 AM, Avi Kivity wrote:

On 12/26/2011 05:14 AM, Nikunj A Dadhania wrote:


btw you can get an additional speedup by enabling x2apic, for
default_send_IPI_mask_logical().


In the host?



In the host, for the guest:

qemu -cpu ...,+x2apic



It seems to me that we should improve our default flags.
So many times users fail to submit the proper huge command-line
options that we
require. Honestly, we can't blame them, there are so many flags and so
many use
cases its just too hard to get it right for humans.

I propose a basic idea and folks are welcome to discuss it:

1. Improve qemu/kvm defaults
Break the current backward compatibility (but add a --default-
backward-compat-mode) and set better values for:
- rtc slew time


What do you specifically mean?


-rtc localtime,driftfix=slew




- cache=none


I'm not sure I see this as a "better default" particularly since
O_DIRECT fails on certain file systems. I think we really need to let
WCE be toggable from the guest and then have a caching mode independent
of WCE. We then need some heuristics to only enable cache=off when we
know it's safe.


cache=none is still faster then it has the FS support.
qemu can test-run O_DIRECT and fall back to cache mode or just test the 
filesystem capabilities.





- x2apic, maybe enhance qemu64 or move to -cpu host?


Alex posted a patch for this. I'm planning on merging it although so far
no one has chimed up either way.


- aio=native|threads (auto-sense?)


aio=native is unsafe to default because linux-aio is just fubar. It
falls back to synchronous I/O if the underlying filesystem doesn't
support aio. There's no way in userspace to problem if it's actually
supported or not either...


Can we test-run this too? Maybe as a separate qemu mode or even binary 
that given a qemu cmdline, it will try to suggest better parameters?



- use virtio devices by default


I don't think this is realistic since appropriately licensed signed
virtio drivers do not exist for Windows. (Please note the phrase
"appropriately licensed signed").


What's the percentage of qemu invocation w/ windows guest and a short 
cmd line? My hunch is that plain short cmdline indicates a developer and 
probably they'll use linux guest.





- more?

Different defaults may be picked automatically when TCG|KVM used.

2. External hardening configuration file kept in qemu.git
For non qemu/kvm specific definitions like the io scheduler we
should maintain a script in our tree that sets/sense the optimal
settings of the host kernel (maybe similar one for the guest).


What are "appropriate host settings" and why aren't we suggesting that
distros and/or upstream just set them by default?


It's hard to set the right default for a distribution since the same 
distro should optimize for various usages of the same OS. For example, 
Fedora has tuned-adm w/ available profiles:

- desktop-powersave
- server-powersave
- enterprise-storage
- spindown-disk
- laptop-battery-powersave
- default
- throughput-performance
- latency-performance
- laptop-ac-powersave

We need to keep on recommending the best profile for virtualization, for 
Fedora I think it either enterprise-storage and maybe 
throughput-performance.


If we have a such a script, it can call the matching tuned profile 
instead of tweaking every /sys option.




Regards,

Anthony Liguori


HTH,
Dor









Re: [Qemu-devel] Compiling without python?

2012-01-01 Thread Michael S. Tsirkin
On Fri, Dec 30, 2011 at 11:36:53PM +0100, Sebastian Herbszt wrote:
> Sebastian Herbszt wrote:
> >Is it still possible to compile without python?
> >
> >python /v1.0-267-gda5361c/scripts/qapi-commands.py -m -o .
> >/bin/sh: python: command not found
> >make: *** [qmp-commands.h] Error 127
> >
> >Sebastian
> 
> Care to answer?
> 
> Thanks,
> Sebastian

Probably not. I think configure should check for python.

-- 
MST



Re: [Qemu-devel] Better qemu/kvm defaults (was Re: [RFC PATCH 0/4] Gang scheduling in CFS)

2012-01-01 Thread Ronen Hod

On 01/01/2012 12:16 PM, Dor Laor wrote:

On 12/29/2011 06:16 PM, Anthony Liguori wrote:

On 12/29/2011 10:07 AM, Dor Laor wrote:

On 12/26/2011 11:05 AM, Avi Kivity wrote:

On 12/26/2011 05:14 AM, Nikunj A Dadhania wrote:


btw you can get an additional speedup by enabling x2apic, for
default_send_IPI_mask_logical().


In the host?



In the host, for the guest:

qemu -cpu ...,+x2apic



It seems to me that we should improve our default flags.
So many times users fail to submit the proper huge command-line
options that we
require. Honestly, we can't blame them, there are so many flags and so
many use
cases its just too hard to get it right for humans.


You might want to take into account migration considerations. I.e., the 
target host's optimal setup.
Also, we need to beware of too much automation, since hardware changes 
might void Windows license activations.
Some of the parameters will depend on dynamic factors such as the total 
guest's nCPUs, mem, sharing (KSM), or whatever.
As a minimum, we can automatically suggest the qemu parameters and the 
host setup.


Ronen.



I propose a basic idea and folks are welcome to discuss it:

1. Improve qemu/kvm defaults
Break the current backward compatibility (but add a --default-
backward-compat-mode) and set better values for:
- rtc slew time


What do you specifically mean?


-rtc localtime,driftfix=slew




- cache=none


I'm not sure I see this as a "better default" particularly since
O_DIRECT fails on certain file systems. I think we really need to let
WCE be toggable from the guest and then have a caching mode independent
of WCE. We then need some heuristics to only enable cache=off when we
know it's safe.


cache=none is still faster then it has the FS support.
qemu can test-run O_DIRECT and fall back to cache mode or just test 
the filesystem capabilities.





- x2apic, maybe enhance qemu64 or move to -cpu host?


Alex posted a patch for this. I'm planning on merging it although so far
no one has chimed up either way.


- aio=native|threads (auto-sense?)


aio=native is unsafe to default because linux-aio is just fubar. It
falls back to synchronous I/O if the underlying filesystem doesn't
support aio. There's no way in userspace to problem if it's actually
supported or not either...


Can we test-run this too? Maybe as a separate qemu mode or even binary 
that given a qemu cmdline, it will try to suggest better parameters?



- use virtio devices by default


I don't think this is realistic since appropriately licensed signed
virtio drivers do not exist for Windows. (Please note the phrase
"appropriately licensed signed").


What's the percentage of qemu invocation w/ windows guest and a short 
cmd line? My hunch is that plain short cmdline indicates a developer 
and probably they'll use linux guest.





- more?

Different defaults may be picked automatically when TCG|KVM used.

2. External hardening configuration file kept in qemu.git
For non qemu/kvm specific definitions like the io scheduler we
should maintain a script in our tree that sets/sense the optimal
settings of the host kernel (maybe similar one for the guest).


What are "appropriate host settings" and why aren't we suggesting that
distros and/or upstream just set them by default?


It's hard to set the right default for a distribution since the same 
distro should optimize for various usages of the same OS. For example, 
Fedora has tuned-adm w/ available profiles:

- desktop-powersave
- server-powersave
- enterprise-storage
- spindown-disk
- laptop-battery-powersave
- default
- throughput-performance
- latency-performance
- laptop-ac-powersave

We need to keep on recommending the best profile for virtualization, 
for Fedora I think it either enterprise-storage and maybe 
throughput-performance.


If we have a such a script, it can call the matching tuned profile 
instead of tweaking every /sys option.




Regards,

Anthony Liguori


HTH,
Dor












Re: [Qemu-devel] [PATCH 1/4] rtc: fix 12-hour mode

2012-01-01 Thread Avi Kivity
On 11/21/2011 08:00 PM, Paolo Bonzini wrote:
> Hours in 12-hour mode are in the 1-12 range, not 0-11.
>
> @@ -320,7 +324,8 @@ static void rtc_copy_date(RTCState *s)
>  s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm->tm_hour);
>  } else {
>  /* 12 hour format */
> -s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm->tm_hour % 12);
> +int h = (tm->tm_hour % 12) ? tm->tm_hour % 12 : 12;
> +s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, h);
>  if (tm->tm_hour >= 12)
>  s->cmos_data[RTC_HOURS] |= 0x80;
>  }

Nitpick, don't update patch on this account:

I dislike seeing int-to-bool conversion on anything that is not a
true/false or count value.  Things like if (has_some_property) or if
(!nr_items) read well and are easily understood.  But here, you're not
checking for "are there any tm_hours, if yes, use them, if not, use
12".  You're testing against the value 0 which has a special encoding in
12 hour mode.

The is usually manifested in

   if (!strcmp(a, b)) ...

strcmp() does not return a bool or a count, and in fact it reads exactly
the opposite of the intent: "if not string compare".  strcmp() returns
an enumeration, or perhaps a mapping of string trichotomy to integer
trichotomy.

Sorry about the pontification, back to the regularly scheduled
whitespace discussion.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 00/21][RFC] postcopy live migration

2012-01-01 Thread Stefan Hajnoczi
On Sun, Jan 1, 2012 at 9:43 AM, Orit Wasserman  wrote:
> On 12/30/2011 12:39 AM, Anthony Liguori wrote:
>> On 12/28/2011 07:25 PM, Isaku Yamahata wrote:
>>> Intro
>>> =
>>> This patch series implements postcopy live migration.[1]
>>> As discussed at KVM forum 2011, dedicated character device is used for
>>> distributed shared memory between migration source and destination.
>>> Now we can discuss/benchmark/compare with precopy. I believe there are
>>> much rooms for improvement.
>>>
>>> [1] http://wiki.qemu.org/Features/PostCopyLiveMigration
>>>
>>>
>>> Usage
>>> =
>>> You need load umem character device on the host before starting migration.
>>> Postcopy can be used for tcg and kvm accelarator. The implementation depend
>>> on only linux umem character device. But the driver dependent code is split
>>> into a file.
>>> I tested only host page size == guest page size case, but the implementation
>>> allows host page size != guest page size case.
>>>
>>> The following options are added with this patch series.
>>> - incoming part
>>>    command line options
>>>    -postcopy [-postcopy-flags]
>>>    where flags is for changing behavior for benchmark/debugging
>>>    Currently the following flags are available
>>>    0: default
>>>    1: enable touching page request
>>>
>>>    example:
>>>    qemu -postcopy -incoming tcp:0: -monitor stdio -machine accel=kvm
>>>
>>> - outging part
>>>    options for migrate command
>>>    migrate [-p [-n]] URI
>>>    -p: indicate postcopy migration
>>>    -n: disable background transferring pages: This is for 
>>> benchmark/debugging
>>>
>>>    example:
>>>    migrate -p -n tcp::
>>>
>>>
>>> TODO
>>> 
>>> - benchmark/evaluation. Especially how async page fault affects the result.
>>
>> I'll review this series next week (Mike/Juan, please also review when you 
>> can).
>>
>> But we really need to think hard about whether this is the right thing to 
>> take into the tree.  I worry a lot about the fact that we don't test 
>> pre-copy migration nearly enough and adding a second form just introduces 
>> more things to test.
>>
>> It's also not clear to me why post-copy is better.  If you were going to sit 
>> down and explain to someone building a management tool when they should use 
>> pre-copy and when they should use post-copy, what would you tell them?
>
> Start with pre-copy , if it doesn't converge switch to post-copy

Post-copy throttles the guest when page faults are encountered because
the destination machine waits for memory pages from the source
machine.  Is there a reason this page fault-based throttling cannot be
done on the source machine with pre-copy migration?  I'm not sure
post-copy provides new behavior in terms of convergence, we could do
the same with pre-copy migration.

Post-copy has other advantages though, it immediately frees logical
CPUs on the source machine (though RAM and network bandwidth is still
required until migration completes).

Stefan



Re: [Qemu-devel] [PATCH v6] block:add-cow file format

2012-01-01 Thread Stefan Hajnoczi
On Sat, Dec 31, 2011 at 9:17 AM, Dong Xu Wang
 wrote:
> On Fri, Dec 30, 2011 at 22:09, Stefan Hajnoczi  wrote:
>> On Thu, Dec 29, 2011 at 05:36:59PM +0800, Dong Xu Wang wrote:
>> > +    ret = bdrv_file_open(&backing_bs, bs->backing_file, 0);
>> > +    if (ret < 0) {
>> > +        return ret;
>> > +    }
>> > +    bdrv_delete(backing_bs);
>>
>> What does this do?  (It leaks s->bitmap when it fails.)
>
>
> I wanna make sure backing_file exists while opening.

block.c:bdrv_open() opens the backing file after .bdrv_open() has
returned.  It fails if the backing file does not exist.  There's no
need to duplicate this.

>> > +    s->image_hd = bdrv_new("");
>> > +
>> > +    if (path_has_protocol(s->image_file)) {
>> > +        pstrcpy(image_filename, sizeof(image_filename),
>> > +                s->image_file);
>> > +    } else {
>> > +        path_combine(image_filename, sizeof(image_filename),
>> > +                     bs->filename, s->image_file);
>> > +    }
>> > +
>> > +    image_drv = bdrv_find_format("raw");
>> > +    ret = bdrv_open(s->image_hd, image_filename, flags, image_drv);
>> > +    if (ret < 0) {
>> > +        bdrv_delete(s->image_hd);
>> > +        s->image_hd = NULL;
>> > +        goto fail;
>> > +    }
>>
>> TODO

If you were wondering why I put "TODO" here it's because I had some
thoughts when reviewing but didn't fully investigate it yet.  When I
send the rest of my feedback I'll include my comment here.  (I should
have removed this before replying :))

>> > +         1036 - 2059:   image_file
>> > +                        image_file is a raw file, While using
>> > image_file, must
>> > +                        together with image_file. All unused bytes are
>> > padded
>>
>> "While using image_file, must together with image_file"
>>
>> What does this mean?
>
>
> I mean while using add-cow, must together with image_file and backing_file.
> Both of them can not be missing.
> Errors with sentences like that, I will correct them in v7.

That sounds like qemu-img create behavior which should not be part of
the file format specification.

The only impact on the file format speficiation is that backing_file
can be an empty string but image_file must always be a valid filename.



[Qemu-devel] [PATCH] Fix wrong region_offset when overlaying a page with another

2012-01-01 Thread Avi Kivity
cpu_register_physical_memory_log() does not update region_offset
if a page was previously registered for the same address.  This
could cause mmio accesses going to the wrong place, by using the
old region_offset.

Signed-off-by: Avi Kivity 
---

Once qemu-test is merged, remind me to post a testlet for this.

 exec.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index 8a3f621..c366835 100644
--- a/exec.c
+++ b/exec.c
@@ -2542,6 +2542,7 @@ void cpu_register_physical_memory_log(target_phys_addr_t 
start_addr,
 p->region_offset = 0;
 } else {
 p->phys_offset = phys_offset;
+p->region_offset = region_offset;
 if ((phys_offset & ~TARGET_PAGE_MASK) <= IO_MEM_ROM ||
 (phys_offset & IO_MEM_ROMD))
 phys_offset += TARGET_PAGE_SIZE;
-- 
1.7.7.1




Re: [Qemu-devel] [PATCH 1/4] rtc: fix 12-hour mode

2012-01-01 Thread Paolo Bonzini

On 01/01/2012 03:53 PM, Avi Kivity wrote:

On 11/21/2011 08:00 PM, Paolo Bonzini wrote:

Hours in 12-hour mode are in the 1-12 range, not 0-11.

@@ -320,7 +324,8 @@ static void rtc_copy_date(RTCState *s)
  s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm->tm_hour);
  } else {
  /* 12 hour format */
-s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm->tm_hour % 12);
+int h = (tm->tm_hour % 12) ? tm->tm_hour % 12 : 12;
+s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, h);
  if (tm->tm_hour>= 12)
  s->cmos_data[RTC_HOURS] |= 0x80;
  }


Nitpick, don't update patch on this account:

I dislike seeing int-to-bool conversion on anything that is not a
true/false or count value.  Things like if (has_some_property) or if
(!nr_items) read well and are easily understood.  But here, you're not
checking for "are there any tm_hours, if yes, use them, if not, use
12".  You're testing against the value 0 which has a special encoding in
12 hour mode.

The is usually manifested in

if (!strcmp(a, b)) ...

strcmp() does not return a bool or a count, and in fact it reads exactly
the opposite of the intent: "if not string compare".  strcmp() returns
an enumeration, or perhaps a mapping of string trichotomy to integer
trichotomy.

Sorry about the pontification, back to the regularly scheduled
whitespace discussion.


Yeah, I probably would have remarked the same if it was someone else's 
patch. :)


BTW, I have kvm-unit-tests tests for these and other RTC emulation 
problems.  Long term perhaps they should become qtests, but for now 
kvm-unit-tests is the best place.  However, without these four patches 
they hang which is a bit too heavy.  So I'll submit as soon as these are in.


Paolo




Re: [Qemu-devel] [PATCH 4/9] virtio-9p: remove PCI dependencies from hw/9pfs/

2012-01-01 Thread Paolo Bonzini

On 11/29/2011 02:12 PM, Michael S. Tsirkin wrote:

On Tue, Nov 29, 2011 at 09:38:43AM +0100, Paolo Bonzini wrote:

On 11/28/2011 07:46 PM, Michael S. Tsirkin wrote:

  +#ifdef CONFIG_LINUX
  +static int virtio_9p_init_pci(PCIDevice *pci_dev)
  +{
  +VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
  +VirtIODevice *vdev;
  +
  +vdev = virtio_9p_init(&pci_dev->qdev,&proxy->fsconf);
  +vdev->nvectors = proxy->nvectors;
  +virtio_init_pci(proxy, vdev);
  +/* make the actual value visible */
  +proxy->nvectors = vdev->nvectors;
  +return 0;
  +}
  +#endif
  +

This ifdef looks wrong to me - is there no way 9p can thinkably
work on non-linux hosts? If yes, we should have a separate config
entry for 9p, configure script can make it
conditional on linux host.


I think it was true in the beginning, but more recent versions
should only depend on CONFIG_POSIX.  I will set up FreeBSD and come
back.

Paolo


It might be easier to add CONFIG_VIRTIO_9P. That won't hurt in any case.


Actually CONFIG_VIRTFS exists already (Linux dependencies are mostly 
statfs and d_off in struct dirent, plus extended attributes which are 
checked separately).


Will resend.

Paolo




Re: [Qemu-devel] virtio-net with virtio-mmio

2012-01-01 Thread Ying-Shiuan Pan
Hi, Stefan

Thanks for your response.


2011/12/30 Stefan Hajnoczi 
>
> On Wed, Dec 28, 2011 at 06:16:42PM +0800, Ying-Shiuan Pan wrote:
> > I'm very interested in virtio-mmio Peter Maydell did for QEMU,
> > (http://lists.nongnu.org/archive/html/qemu-devel/2011-11/msg01870.html)
> >
> > actually, I've tested the virtio-blk, and it is working.
> > I applied those patch to QEMU-1.0 and brought the virtio_mmio.c from
> > Linux-3.2-rc back to Linux-linaro-2.6.38.
> >
> > I also found a small bug in virito-mmio.c: virtio_mmio_write(),
> > Peter forgot to break in each case of switch block.
> > After fixed the small bug, the virtio-balloon works as well.
>
> PMM: Do you have a git branch where you could apply Ying-Shiuan's fix?
hmm... I've sent Peter my patch :)
>
> > Then, when I attempted to enable the virtio-net, the initialization part is
> > fine,
> > however, the QEMU crashed and printed this message:
> > "virtio-net header not in first element"
> >
> > It happens when the front-end virtio-net is invoking virtqueue_kick() at
> > the end of try_fill_recv().
> > Then, QEMU gets this message and invokes virtio_net_receive(), then the
> > error happens.
>
> virtio-net is looking at a virtqueue element (a packet buffer provided
> by the guest OS into which QEMU will copy the received packet) but the
> virtqueue element does not have the expected format.  You can check the
> virtio specification for the details on the virtio-net buffer layout:
> http://ozlabs.org/~rusty/virtio-spec/
>
> It sounds like the vring is corrupted or QEMU's virtio code is not
> correctly reading the virtqueue element (which is basically an iovec
> array).
>
> virtio-net is more advanced than virtio-blk in how it uses virtio so
> it's not surprising that you've discovered an issue.  Debugging this
> comes down to looking at the vring descriptors and the virtqueue
> elements that QEMU extracts.
>
> It sounds like there's a problem with the rx queue, you could try
> indirect_desc=off to disable indirect vring descriptors to see if that
> is part of the problem.
>
> Stefan
I guessed that the problem was in adding header,
virtio-net should insert its header to the ring buffer before its
packet, correct me if I'm wrong.

virtio-net driver (backend part) can use add_recvbuf_small() /
add_recvbuf_big() / add_recvbuf_mergeable()
depending on what features the virtio-net device (QEMU part) provides.
I found that both small and big add the header, while the mergeable
one does not.
Besides, in that add_recvbuf_big(), the comments mentioned about a QEMU bug??

Anyway, I found a workaround solution, that is disable the
VIRTIO_NET_F_MRG_RXBUF feature
and make virtio-net driver use add_recvbuf_big()

However, I am still not clear about how the problem happened.
If I luckily figure out, I'll post it. :)

So~ currently, virito-balloon, virtio-blk, and virtio-net can work
properly  with virtio-mmio. :D

The following is the *workaround solution* which is for virtio-net
driver part (frontend).

diff --git a/virtio_net.c b/virtio_net.c
index 82dba5a..811e2d9 100644
--- a/virtio_net.c
+++ b/virtio_net.c
@@ -955,8 +955,10 @@ static int virtnet_probe(struct virtio_device *vdev)
virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN))
vi->big_packets = true;

+#ifndef ARM
if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
vi->mergeable_rx_bufs = true;
+#endif

/* We expect two virtqueues, receive then send,
 * and optionally control. */



Best Regards,
Ying-Shiuan Pan



[Qemu-devel] [RFC PATCH 2/4] Makefile and configure changes for tracetool.py

2012-01-01 Thread Harsh Prateek Bora
Use tracetool.py to generate tracing code for requested backend.

TODO: tracetool.py needs to be updated to add support for other backends.

Signed-off-by: Harsh Prateek Bora 
---
 Makefile.objs |4 ++--
 configure |4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 804bc3c..24123a8 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -344,12 +344,12 @@ else
 trace.h: trace.h-timestamp
 endif
 trace.h-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
-   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -h < $< > $@,"  GEN   trace.h")
+   $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/tracetool.py 
--backend=$(TRACE_BACKEND) -h < $< > $@,"  GEN   trace.h")
@cmp -s $@ trace.h || cp $@ trace.h
 
 trace.c: trace.c-timestamp
 trace.c-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak
-   $(call quiet-command,sh $(SRC_PATH)/scripts/tracetool 
--$(TRACE_BACKEND) -c < $< > $@,"  GEN   trace.c")
+   $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/tracetool.py 
--backend=$(TRACE_BACKEND) -c < $< > $@,"  GEN   trace.c")
@cmp -s $@ trace.c || cp $@ trace.c
 
 trace.o: trace.c $(GENERATED_HEADERS)
diff --git a/configure b/configure
index 19e8394..0ea8634 100755
--- a/configure
+++ b/configure
@@ -1040,7 +1040,7 @@ echo "  --disable-docs   disable documentation 
build"
 echo "  --disable-vhost-net  disable vhost-net acceleration support"
 echo "  --enable-vhost-net   enable vhost-net acceleration support"
 echo "  --enable-trace-backend=B Set trace backend"
-echo "   Available backends:" 
$("$source_path"/scripts/tracetool --list-backends)
+echo "   Available backends:" 
$("$source_path"/scripts/tracetool.py --list-backends)
 echo "  --with-trace-file=NAME   Full PATH,NAME of file to store traces"
 echo "   Default:trace-"
 echo "  --disable-spice  disable spice"
@@ -2477,7 +2477,7 @@ fi
 ##
 # check if trace backend exists
 
-sh "$source_path/scripts/tracetool" "--$trace_backend" --check-backend > 
/dev/null 2> /dev/null
+$python "$source_path/scripts/tracetool.py" "--backend=$trace_backend" 
--check-backend  > /dev/null 2> /dev/null
 if test "$?" -ne 0 ; then
   echo
   echo "Error: invalid trace backend"
-- 
1.7.1.1




[Qemu-devel] [RFC PATCH 4/4] simpletrace.py: updated log reader script to handle new log format

2012-01-01 Thread Harsh Prateek Bora
Note: This script has been updated with minimal changes required to observe
the new trace log format in action and therefore can be improved for a
better design. It can still read the logs from older log format as well.

Signed-off-by: Harsh Prateek Bora 
---
 scripts/simpletrace.py |  110 +++-
 1 files changed, 99 insertions(+), 11 deletions(-)

diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
index f55e5e6..69829dd 100755
--- a/scripts/simpletrace.py
+++ b/scripts/simpletrace.py
@@ -12,16 +12,37 @@
 import struct
 import re
 import inspect
+import sys
+from tracetool import *
 
 header_event_id = 0x
 header_magic= 0xf2b177cb0aa429b4
 header_version  = 0
+log_header_ver  = 2
 dropped_event_id = 0xfffe
 
 trace_fmt = '='
+header_fmt = '=QQII'
+header_len = struct.calcsize(header_fmt)
 trace_len = struct.calcsize(trace_fmt)
 event_re  = re.compile(r'(disable\s+)?([a-zA-Z0-9_]+)\(([^)]*)\).*')
 
+def get_event_argtypes(fobj):
+"""Parse a trace-events file into {event_num: (name, type arg, ...)}."""
+
+events = {dropped_event_id: ('name', 'args')}
+event_num = 0
+for line in fobj:
+m = event_re.match(line.strip())
+if m is None:
+continue
+
+disable, name, args = m.groups()
+events[event_num] = tuple(args.split(','))
+event_num += 1
+return events
+
+
 def parse_events(fobj):
 """Parse a trace-events file into {event_num: (name, arg1, ...)}."""
 
@@ -48,15 +69,16 @@ def read_record(fobj):
 return None
 return struct.unpack(trace_fmt, s)
 
+def read_header(fobj):
+'''Read a trace record header'''
+s = fobj.read(header_len)
+if len(s) != header_len:
+return None
+return struct.unpack(header_fmt, s)
+
+
 def read_trace_file(fobj):
 """Deserialize trace records from a file, yielding record tuples 
(event_num, timestamp, arg1, ..., arg6)."""
-header = read_record(fobj)
-if header is None or \
-   header[0] != header_event_id or \
-   header[1] != header_magic or \
-   header[2] != header_version:
-raise ValueError('not a trace file or incompatible version')
-
 while True:
 rec = read_record(fobj)
 if rec is None:
@@ -64,6 +86,59 @@ def read_trace_file(fobj):
 
 yield rec
 
+def process_event(event_id, fobj):
+params = etypes[event_id]
+for elem in params:
+if is_string(elem):
+l = fobj.read(4)
+(len,) = struct.unpack('=L', l)
+s = fobj.read(len)
+sfmt = `len`+'s'
+(str,) = struct.unpack_from(sfmt, s)
+type, sep, var = elem.rpartition('*')
+print '%(arg)s=%(val)s' % { 'arg': var.lstrip(), 'val': str },
+elif '*' in elem:
+(value,) = struct.unpack('=Q', fobj.read(8))
+type, sep, var = elem.rpartition('*')
+print '%(arg)s=0x%(val)u' % { 'arg': var.lstrip(), 'val': value },
+else:
+(value,) = struct.unpack('=Q', fobj.read(8))
+type, sep, var = elem.rpartition(' ')
+print '%(arg)s=%(val)u' % { 'arg': var.lstrip(), 'val': value },
+print
+return
+
+etypes = get_event_argtypes(open(sys.argv[1], 'r'))
+def processv2(fobj):
+'''Process simpletrace v2 log trace records'''
+events = parse_events(open(sys.argv[1], 'r'))
+#print events
+max_events = len(events) - 1
+last_timestamp = None
+while True:
+# read record header
+rechdr = read_header(fobj)
+if rechdr is None:
+print "No more records"
+break
+
+if rechdr[0] >= max_events:
+print "Invalid Event ID found, Trace Log may be corrupted !!!"
+sys.exit(1)
+
+if last_timestamp is None:
+last_timestamp = rechdr[1]
+delta_ns = rechdr[1] - last_timestamp
+last_timestamp = rechdr[1]
+
+print events[rechdr[0]][0],  '%0.3f' % (delta_ns / 1000.0),
+# read data
+# process_event(rec[0], fobj)
+# rechdr[2] is record length (including header)
+#print etypes[rechdr[0]]
+#data = fobj.read(rechdr[2] - header_len)
+process_event(rechdr[0], fobj)
+
 class Analyzer(object):
 """A trace file analyzer which processes trace records.
 
@@ -90,8 +165,6 @@ def process(events, log, analyzer):
 """Invoke an analyzer on each event in a log."""
 if isinstance(events, str):
 events = parse_events(open(events, 'r'))
-if isinstance(log, str):
-log = open(log, 'rb')
 
 def build_fn(analyzer, event):
 fn = getattr(analyzer, event[0], None)
@@ -129,8 +202,10 @@ def run(analyzer):
 sys.exit(1)
 
 events = parse_events(open(sys.argv[1], 'r'))
-process(events, sys.argv[2], analyzer)
+process(events, log, analyzer)
+
 
+log = open(sys.argv[2], 'rb')
 if __name__ == '__main__':
 class Formatter(Analy

[Qemu-devel] [PATCH 4/4] simpletrace.py: Simpletrace v2 tracelog reader script

2012-01-01 Thread Harsh Prateek Bora

Signed-off-by: Harsh Prateek Bora 
---
 scripts/simpletrace.py |  110 +++-
 1 files changed, 99 insertions(+), 11 deletions(-)

diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
index f55e5e6..69829dd 100755
--- a/scripts/simpletrace.py
+++ b/scripts/simpletrace.py
@@ -12,16 +12,37 @@
 import struct
 import re
 import inspect
+import sys
+from tracetool import *
 
 header_event_id = 0x
 header_magic= 0xf2b177cb0aa429b4
 header_version  = 0
+log_header_ver  = 2
 dropped_event_id = 0xfffe
 
 trace_fmt = '='
+header_fmt = '=QQII'
+header_len = struct.calcsize(header_fmt)
 trace_len = struct.calcsize(trace_fmt)
 event_re  = re.compile(r'(disable\s+)?([a-zA-Z0-9_]+)\(([^)]*)\).*')
 
+def get_event_argtypes(fobj):
+"""Parse a trace-events file into {event_num: (name, type arg, ...)}."""
+
+events = {dropped_event_id: ('name', 'args')}
+event_num = 0
+for line in fobj:
+m = event_re.match(line.strip())
+if m is None:
+continue
+
+disable, name, args = m.groups()
+events[event_num] = tuple(args.split(','))
+event_num += 1
+return events
+
+
 def parse_events(fobj):
 """Parse a trace-events file into {event_num: (name, arg1, ...)}."""
 
@@ -48,15 +69,16 @@ def read_record(fobj):
 return None
 return struct.unpack(trace_fmt, s)
 
+def read_header(fobj):
+'''Read a trace record header'''
+s = fobj.read(header_len)
+if len(s) != header_len:
+return None
+return struct.unpack(header_fmt, s)
+
+
 def read_trace_file(fobj):
 """Deserialize trace records from a file, yielding record tuples 
(event_num, timestamp, arg1, ..., arg6)."""
-header = read_record(fobj)
-if header is None or \
-   header[0] != header_event_id or \
-   header[1] != header_magic or \
-   header[2] != header_version:
-raise ValueError('not a trace file or incompatible version')
-
 while True:
 rec = read_record(fobj)
 if rec is None:
@@ -64,6 +86,59 @@ def read_trace_file(fobj):
 
 yield rec
 
+def process_event(event_id, fobj):
+params = etypes[event_id]
+for elem in params:
+if is_string(elem):
+l = fobj.read(4)
+(len,) = struct.unpack('=L', l)
+s = fobj.read(len)
+sfmt = `len`+'s'
+(str,) = struct.unpack_from(sfmt, s)
+type, sep, var = elem.rpartition('*')
+print '%(arg)s=%(val)s' % { 'arg': var.lstrip(), 'val': str },
+elif '*' in elem:
+(value,) = struct.unpack('=Q', fobj.read(8))
+type, sep, var = elem.rpartition('*')
+print '%(arg)s=0x%(val)u' % { 'arg': var.lstrip(), 'val': value },
+else:
+(value,) = struct.unpack('=Q', fobj.read(8))
+type, sep, var = elem.rpartition(' ')
+print '%(arg)s=%(val)u' % { 'arg': var.lstrip(), 'val': value },
+print
+return
+
+etypes = get_event_argtypes(open(sys.argv[1], 'r'))
+def processv2(fobj):
+'''Process simpletrace v2 log trace records'''
+events = parse_events(open(sys.argv[1], 'r'))
+#print events
+max_events = len(events) - 1
+last_timestamp = None
+while True:
+# read record header
+rechdr = read_header(fobj)
+if rechdr is None:
+print "No more records"
+break
+
+if rechdr[0] >= max_events:
+print "Invalid Event ID found, Trace Log may be corrupted !!!"
+sys.exit(1)
+
+if last_timestamp is None:
+last_timestamp = rechdr[1]
+delta_ns = rechdr[1] - last_timestamp
+last_timestamp = rechdr[1]
+
+print events[rechdr[0]][0],  '%0.3f' % (delta_ns / 1000.0),
+# read data
+# process_event(rec[0], fobj)
+# rechdr[2] is record length (including header)
+#print etypes[rechdr[0]]
+#data = fobj.read(rechdr[2] - header_len)
+process_event(rechdr[0], fobj)
+
 class Analyzer(object):
 """A trace file analyzer which processes trace records.
 
@@ -90,8 +165,6 @@ def process(events, log, analyzer):
 """Invoke an analyzer on each event in a log."""
 if isinstance(events, str):
 events = parse_events(open(events, 'r'))
-if isinstance(log, str):
-log = open(log, 'rb')
 
 def build_fn(analyzer, event):
 fn = getattr(analyzer, event[0], None)
@@ -129,8 +202,10 @@ def run(analyzer):
 sys.exit(1)
 
 events = parse_events(open(sys.argv[1], 'r'))
-process(events, sys.argv[2], analyzer)
+process(events, log, analyzer)
+
 
+log = open(sys.argv[2], 'rb')
 if __name__ == '__main__':
 class Formatter(Analyzer):
 def __init__(self):
@@ -148,4 +223,17 @@ if __name__ == '__main__':
 fields.append('%s=0x%x' % (event[i], rec[i + 1]))
 print ' '.join(fields)
 
-run(Formatter())
+#run(

[Qemu-devel] [RFC PATCH 1/4] Converting tracetool.sh to tracetool.py

2012-01-01 Thread Harsh Prateek Bora
Note: As of now, tracetool.py only supports simpletrace backend.
TODO: Add support for other tracing backends.

Signed-off-by: Harsh Prateek Bora 
---
 scripts/tracetool.py |  299 ++
 1 files changed, 299 insertions(+), 0 deletions(-)
 create mode 100755 scripts/tracetool.py

diff --git a/scripts/tracetool.py b/scripts/tracetool.py
new file mode 100755
index 000..95c3c2e
--- /dev/null
+++ b/scripts/tracetool.py
@@ -0,0 +1,299 @@
+#!/usr/bin/env python
+# Python based tracetool script (Code generator for trace events)
+#
+# Copyright IBM, Corp. 2011
+#
+# This work is licensed under the terms of the GNU GPL, version 2.  See
+# the COPYING file in the top-level directory.
+#
+#
+
+import sys
+import getopt
+
+def usage():
+print "Tracetool: Generate tracing code for trace events."
+print "   Takes input from stdin, output can be redirected from 
stdout."
+print "Usage:"
+print sys.argv[0]+" --backend=simple [-h | -c]"
+sys.exit(1)
+
+def get_name(line, sep='('):
+head, sep, tail = line.partition(sep)
+return head
+
+def get_args(line, sep1='(', sep2=')'):
+head, sep1, tail = line.partition(sep1)
+args, sep2, fmt_str = tail.partition(sep2)
+return args
+
+def get_argnames(line, sep=','):
+nfields = 0
+str = []
+args = get_args(line)
+for field in args.split():
+  nfields = nfields + 1
+  # Drop pointer star
+  type, ptr, tail = field.partition('*')
+  if type != field:
+field = tail
+
+  name, sep, tail = field.partition(',')
+
+  if name == field:
+continue
+  str.append(name)
+  str.append(", ")
+
+if nfields > 1:
+  str.append(name)
+  return ''.join(str)
+else:
+  return
+
+def get_argc(line):
+argc = 0
+argnames = get_argnames(line)
+if argnames:
+  for name in argnames.split(','):
+argc = argc + 1
+return argc
+
+def cast_args_to_uint64_t(line):
+str = []
+argnames = line.argnames
+if argnames:
+  for name in argnames.split():
+str.append("(uint64_t)(uintptr_t)")
+str.append(name)
+return ''.join(str)
+
+# trace args to be generated for simpletrace backend
+def trace_args_simple(event):
+if event.argc:
+return (str(event.num) + ', ' + cast_args_to_uint64_t(event))
+else:
+return str(event.num)
+
+def calc_sizeofargs(line):
+args = get_args(line)
+argc = get_argc(line)
+strtype = ('const char*', 'char*', 'const char *', 'char *')
+str = []
+newstr = ""
+if argc > 0:
+  str = args.split(',')
+  for elem in str:
+if elem.lstrip().startswith(strtype): #strings
+  type, sep, var = elem.rpartition('*')
+  newstr = newstr+"4 + strlen("+var.lstrip()+") + "
+#elif '*' in elem:
+#  newstr = newstr + "4 + " # pointer vars
+else:
+  #type, sep, var = elem.rpartition(' ')
+  #newstr = newstr+"sizeof("+type.lstrip()+") + "
+  newstr = newstr + '8 + '
+newstr = newstr + '0' # for last +
+return newstr
+
+
+def trace_h_begin():
+print '''#ifndef TRACE_H
+#define TRACE_H
+
+/* This file is autogenerated by tracetool, do not edit. */
+
+#include "qemu-common.h"'''
+return
+
+def trace_h_end():
+print '#endif /* TRACE_H */'
+return
+
+def trace_c_begin():
+print '/* This file is autogenerated by tracetool, do not edit. */'
+return
+
+def trace_c_end():
+# nop, required for trace_gen
+return
+
+def simple_h(events):
+print '#include "trace/simple.h"'
+print
+for event in events:
+print 'void trace_%(name)s(%(args)s);' % {
+'name': event.name,
+'args': event.args
+}
+print
+print '#define NR_TRACE_EVENTS %d' % (event.num + 1)
+print 'extern TraceEvent trace_list[NR_TRACE_EVENTS];'
+
+return
+
+def is_string(arg):
+strtype = ('const char*', 'char*', 'const char *', 'char *')
+if arg.lstrip().startswith(strtype):
+return True
+else:
+return False
+
+def simple_c(events):
+rec_off = 0
+print '#include "trace.h"'
+print '#include "trace/simple.h"'
+print
+print 'TraceEvent trace_list[] = {'
+print
+eventlist = list(events)
+for event in eventlist:
+print '{.tp_name = "%(name)s", .state=0},' % {
+'name': event.name
+}
+print
+print '};'
+print
+for event in eventlist:
+argc = event.argc
+print '''void trace_%(name)s(%(args)s)
+{
+unsigned int tbuf_idx, rec_off;
+uint64_t var64 __attribute__ ((unused));
+uint64_t pvar64 __attribute__ ((unused));
+uint32_t slen __attribute__ ((unused));
+
+if (!trace_list[%(event_id)s].state) {
+return;
+}
+''' % {
+'name': event.name,
+'args': event.args,
+'event_id': event.num,
+}
+print '''tbuf_idx = trace_alloc_record(%(event_id)s, %(sizestr)s);
+rec_off = (tbuf_idx + ST_V2_REC

[Qemu-devel] [RFC PATCH 0/4] simpletrace : support var num/size of args, strings.

2012-01-01 Thread Harsh Prateek Bora
Existing simple trace can log upto 6 args per trace event and does not
support strings in trace record format. Introducing new trace format as
discussed earlier on list to support variable number/size of arguments.
(Ref: http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg03426.html)

Basic testing of this patch is successful. Stress testing not yet done.

Apply patches, then run:

make distclean
./configure with --enable-trace-backend=simple
make
sudo make install

TODO: Add suppport for other backends in tracetool.py

Harsh Prateek Bora (4):
  Converting tracetool.sh to tracetool.py
  Makefile and configure changes for tracetool.py
  simpletrace-v2: Handle variable number/size of elements per trace record.
  simpletrace.py: updated log reader script to handle new log format

 Makefile.objs  |4 +-
 configure  |4 +-
 monitor.c  |2 +-
 scripts/simpletrace.py |  110 --
 scripts/tracetool.py   |  299 
 trace/simple.c |  178 ++--
 trace/simple.h |   31 -
 7 files changed, 490 insertions(+), 138 deletions(-)
 create mode 100755 scripts/tracetool.py




Re: [Qemu-devel] [PATCH 4/4] simpletrace.py: Simpletrace v2 tracelog reader script

2012-01-01 Thread Harsh Bora

Hi,
Please ignore this duplicate patch in the series which was sent by 
mistake, consider the one with subject '[RFC PATCH 4/4] simpletrace.py: 
updated log reader script to handle new log format'. Sorry for 
inconvenience.


- Harsh

On 01/02/2012 12:20 PM, Harsh Prateek Bora wrote:

Signed-off-by: Harsh Prateek Bora
---
  scripts/simpletrace.py |  110 +++-
  1 files changed, 99 insertions(+), 11 deletions(-)

diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
index f55e5e6..69829dd 100755
--- a/scripts/simpletrace.py
+++ b/scripts/simpletrace.py
@@ -12,16 +12,37 @@
  import struct
  import re
  import inspect
+import sys
+from tracetool import *

  header_event_id = 0x
  header_magic= 0xf2b177cb0aa429b4
  header_version  = 0
+log_header_ver  = 2
  dropped_event_id = 0xfffe

  trace_fmt = '='
+header_fmt = '=QQII'
+header_len = struct.calcsize(header_fmt)
  trace_len = struct.calcsize(trace_fmt)
  event_re  = re.compile(r'(disable\s+)?([a-zA-Z0-9_]+)\(([^)]*)\).*')

+def get_event_argtypes(fobj):
+"""Parse a trace-events file into {event_num: (name, type arg, ...)}."""
+
+events = {dropped_event_id: ('name', 'args')}
+event_num = 0
+for line in fobj:
+m = event_re.match(line.strip())
+if m is None:
+continue
+
+disable, name, args = m.groups()
+events[event_num] = tuple(args.split(','))
+event_num += 1
+return events
+
+
  def parse_events(fobj):
  """Parse a trace-events file into {event_num: (name, arg1, ...)}."""

@@ -48,15 +69,16 @@ def read_record(fobj):
  return None
  return struct.unpack(trace_fmt, s)

+def read_header(fobj):
+'''Read a trace record header'''
+s = fobj.read(header_len)
+if len(s) != header_len:
+return None
+return struct.unpack(header_fmt, s)
+
+
  def read_trace_file(fobj):
  """Deserialize trace records from a file, yielding record tuples (event_num, timestamp, 
arg1, ..., arg6)."""
-header = read_record(fobj)
-if header is None or \
-   header[0] != header_event_id or \
-   header[1] != header_magic or \
-   header[2] != header_version:
-raise ValueError('not a trace file or incompatible version')
-
  while True:
  rec = read_record(fobj)
  if rec is None:
@@ -64,6 +86,59 @@ def read_trace_file(fobj):

  yield rec

+def process_event(event_id, fobj):
+params = etypes[event_id]
+for elem in params:
+if is_string(elem):
+l = fobj.read(4)
+(len,) = struct.unpack('=L', l)
+s = fobj.read(len)
+sfmt = `len`+'s'
+(str,) = struct.unpack_from(sfmt, s)
+type, sep, var = elem.rpartition('*')
+print '%(arg)s=%(val)s' % { 'arg': var.lstrip(), 'val': str },
+elif '*' in elem:
+(value,) = struct.unpack('=Q', fobj.read(8))
+type, sep, var = elem.rpartition('*')
+print '%(arg)s=0x%(val)u' % { 'arg': var.lstrip(), 'val': value },
+else:
+(value,) = struct.unpack('=Q', fobj.read(8))
+type, sep, var = elem.rpartition(' ')
+print '%(arg)s=%(val)u' % { 'arg': var.lstrip(), 'val': value },
+print
+return
+
+etypes = get_event_argtypes(open(sys.argv[1], 'r'))
+def processv2(fobj):
+'''Process simpletrace v2 log trace records'''
+events = parse_events(open(sys.argv[1], 'r'))
+#print events
+max_events = len(events) - 1
+last_timestamp = None
+while True:
+# read record header
+rechdr = read_header(fobj)
+if rechdr is None:
+print "No more records"
+break
+
+if rechdr[0]>= max_events:
+print "Invalid Event ID found, Trace Log may be corrupted !!!"
+sys.exit(1)
+
+if last_timestamp is None:
+last_timestamp = rechdr[1]
+delta_ns = rechdr[1] - last_timestamp
+last_timestamp = rechdr[1]
+
+print events[rechdr[0]][0],  '%0.3f' % (delta_ns / 1000.0),
+# read data
+# process_event(rec[0], fobj)
+# rechdr[2] is record length (including header)
+#print etypes[rechdr[0]]
+#data = fobj.read(rechdr[2] - header_len)
+process_event(rechdr[0], fobj)
+
  class Analyzer(object):
  """A trace file analyzer which processes trace records.

@@ -90,8 +165,6 @@ def process(events, log, analyzer):
  """Invoke an analyzer on each event in a log."""
  if isinstance(events, str):
  events = parse_events(open(events, 'r'))
-if isinstance(log, str):
-log = open(log, 'rb')

  def build_fn(analyzer, event):
  fn = getattr(analyzer, event[0], None)
@@ -129,8 +202,10 @@ def run(analyzer):
  sys.exit(1)

  events = parse_events(open(sys.argv[1], 'r'))
-process(events, sys.argv[2], analyzer)
+process(events, log, analyzer)
+

+

[Qemu-devel] [RFC PATCH 3/4] simpletrace-v2: Handle variable number/size of elements per trace record.

2012-01-01 Thread Harsh Prateek Bora
Advantages over existing simpletrace backend:
- More than 6 elements (vitually unlimited) arguments can be traced.
- This allows to trace strings (variable size element) as well.

Signed-off-by: Harsh Prateek Bora 
---
 monitor.c  |2 +-
 trace/simple.c |  178 
 trace/simple.h |   31 --
 3 files changed, 88 insertions(+), 123 deletions(-)

diff --git a/monitor.c b/monitor.c
index ffda0fe..b6f85d1 100644
--- a/monitor.c
+++ b/monitor.c
@@ -945,7 +945,7 @@ static void do_info_cpu_stats(Monitor *mon)
 #if defined(CONFIG_TRACE_SIMPLE)
 static void do_info_trace(Monitor *mon)
 {
-st_print_trace((FILE *)mon, &monitor_fprintf);
+/* FIXME: st_print_trace((FILE *)mon, &monitor_fprintf); */
 }
 #endif
 
diff --git a/trace/simple.c b/trace/simple.c
index b639dda..f7e7967 100644
--- a/trace/simple.c
+++ b/trace/simple.c
@@ -27,7 +27,7 @@
 #define HEADER_MAGIC 0xf2b177cb0aa429b4ULL
 
 /** Trace file version number, bump if format changes */
-#define HEADER_VERSION 0
+#define HEADER_VERSION 2
 
 /** Records were dropped event ID */
 #define DROPPED_EVENT_ID (~(uint64_t)0 - 1)
@@ -35,23 +35,6 @@
 /** Trace record is valid */
 #define TRACE_RECORD_VALID ((uint64_t)1 << 63)
 
-/** Trace buffer entry */
-typedef struct {
-uint64_t event;
-uint64_t timestamp_ns;
-uint64_t x1;
-uint64_t x2;
-uint64_t x3;
-uint64_t x4;
-uint64_t x5;
-uint64_t x6;
-} TraceRecord;
-
-enum {
-TRACE_BUF_LEN = 4096,
-TRACE_BUF_FLUSH_THRESHOLD = TRACE_BUF_LEN / 4,
-};
-
 /*
  * Trace records are written out by a dedicated thread.  The thread waits for
  * records to become available, writes them out, and then waits again.
@@ -62,11 +45,12 @@ static GCond *trace_empty_cond;
 static bool trace_available;
 static bool trace_writeout_enabled;
 
-static TraceRecord trace_buf[TRACE_BUF_LEN];
+uint8_t trace_buf[TRACE_BUF_LEN];
 static unsigned int trace_idx;
 static FILE *trace_fp;
 static char *trace_file_name = NULL;
 
+
 /**
  * Read a trace record from the trace buffer
  *
@@ -75,16 +59,19 @@ static char *trace_file_name = NULL;
  *
  * Returns false if the record is not valid.
  */
-static bool get_trace_record(unsigned int idx, TraceRecord *record)
+static bool get_trace_record(unsigned int idx, TraceRecord **recordptr)
 {
-if (!(trace_buf[idx].event & TRACE_RECORD_VALID)) {
+TraceRecord *record = (TraceRecord *) &trace_buf[idx];
+if (!(record->event & TRACE_RECORD_VALID)) {
 return false;
 }
 
 __sync_synchronize(); /* read memory barrier before accessing record */
 
-*record = trace_buf[idx];
-record->event &= ~TRACE_RECORD_VALID;
+*recordptr = g_malloc(record->length);
+/* make a copy of record to avoid being overwritten */
+memcpy(*recordptr, record, record->length);
+(*recordptr)->event &= ~TRACE_RECORD_VALID;
 return true;
 }
 
@@ -106,6 +93,47 @@ static void flush_trace_file(bool wait)
 g_static_mutex_unlock(&trace_lock);
 }
 
+unsigned int trace_alloc_record(TraceEventID event, uint32_t datasize)
+{
+unsigned int idx, rec_off;
+uint32_t rec_len = ST_V2_REC_HDR_LEN + datasize;
+uint64_t timestamp_ns = get_clock();
+
+idx = g_atomic_int_exchange_and_add((gint *)&trace_idx, rec_len) % 
TRACE_BUF_LEN;
+rec_off = idx;
+write_to_buffer(rec_off, (uint8_t*)&event, sizeof(event));
+rec_off += sizeof(event);
+write_to_buffer(rec_off, (uint8_t*)×tamp_ns, sizeof(timestamp_ns));
+rec_off += sizeof(timestamp_ns);
+write_to_buffer(rec_off, (uint8_t*)&rec_len, sizeof(rec_len));
+rec_off += sizeof(rec_len);
+return idx;
+}
+
+void write_to_buffer(unsigned int idx, uint8_t *dataptr, uint32_t size)
+{
+uint32_t x = 0;
+while (x < size)
+{
+if (idx >= TRACE_BUF_LEN) {
+idx = idx % TRACE_BUF_LEN;
+}
+trace_buf[idx++] = dataptr[x++];
+}
+}
+
+void trace_mark_record_complete(unsigned int idx)
+{
+TraceRecord *record = (TraceRecord*) &trace_buf[idx];;
+
+__sync_synchronize(); /* write barrier before marking as valid */
+record->event |= TRACE_RECORD_VALID;
+
+if (idx > TRACE_BUF_FLUSH_THRESHOLD) {
+flush_trace_file(false);
+}
+}
+
 static void wait_for_trace_records_available(void)
 {
 g_static_mutex_lock(&trace_lock);
@@ -120,7 +148,7 @@ static void wait_for_trace_records_available(void)
 
 static gpointer writeout_thread(gpointer opaque)
 {
-TraceRecord record;
+TraceRecord record, *recordptr;
 unsigned int writeout_idx = 0;
 unsigned int num_available, idx;
 size_t unused __attribute__ ((unused));
@@ -130,19 +158,20 @@ static gpointer writeout_thread(gpointer opaque)
 
 num_available = trace_idx - writeout_idx;
 if (num_available > TRACE_BUF_LEN) {
-record = (TraceRecord){
-.event = DROPPED_EVENT_ID,
-.x1 = num_available,
-};
-unused = fwrite(&record, size