Re: [Qemu-devel] [PATCH 5/6] PPC: booke206: Check for min/max TLB entry size

2012-01-20 Thread Paolo Bonzini

On 01/20/2012 04:17 AM, Alexander Graf wrote:

+if ((size_tlb > size_max) || (size_tlb < size_max)) {


You want < size_min, and the extra parentheses look odd.


+/* set to min size */
+tlb->mas1 &= ~MAS1_TSIZE_MASK;
+tlb->mas1 |= size_min << (MAS1_TSIZE_SHIFT + 1);
+}
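
For illustration only, the check being asked for could be sketched as below. This is not code from the patch: `clamp_tsize` is a made-up helper name, and it assumes the three values have already been extracted from MAS1 and TLBnCFG as in the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): return the TSIZE value
 * that should end up in MAS1.  A size outside [size_min, size_max]
 * falls back to the minimum size supported by the TLB.
 */
static uint32_t clamp_tsize(uint32_t size_tlb, uint32_t size_min,
                            uint32_t size_max)
{
    assert(size_min <= size_max);

    /* Note the second comparison is against size_min, not size_max. */
    if (size_tlb > size_max || size_tlb < size_min) {
        return size_min;
    }
    return size_tlb;
}
```

With the comparison written against size_max twice, every size other than size_max itself would be reset to the minimum, which is presumably not what was intended.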






Re: [Qemu-devel] [PATCH 5/6] PPC: booke206: Check for min/max TLB entry size

2012-01-20 Thread Andreas Färber
Am 20.01.2012 04:17, schrieb Alexander Graf:
 When setting a TLB entry, we need to check if the TLB we're putting it in
 actually supports the given size. According to the 2.06 PowerPC ISA, a
 value that's out of range results in the minimum page size for the TLB
 to be used.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  target-ppc/op_helper.c |   11 +++
  1 files changed, 11 insertions(+), 0 deletions(-)
 
 diff --git a/target-ppc/op_helper.c b/target-ppc/op_helper.c
 index 6339c95..0a88bf4 100644
 --- a/target-ppc/op_helper.c
 +++ b/target-ppc/op_helper.c
 @@ -4228,6 +4228,7 @@ void helper_booke206_tlbwe(void)
  {
  uint32_t tlbncfg, tlbn;
  ppcmas_tlb_t *tlb;
 +uint32_t size_tlb, size_min, size_max;
  
  switch (env->spr[SPR_BOOKE_MAS0] & MAS0_WQ_MASK) {
  case MAS0_WQ_ALWAYS:
 @@ -4273,6 +4274,16 @@ void helper_booke206_tlbwe(void)
  tlb->mas1 &= ~MAS1_IPROT;
  }
  
 +/* XXX only applies for MAV 1.0 */
 +size_tlb = (tlb->mas1 & MAS1_TSIZE_MASK) >> (MAS1_TSIZE_SHIFT + 1);
 +size_min = (tlbncfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT;
 +size_max = (tlbncfg & TLBnCFG_MAXSIZE) >> TLBnCFG_MAXSIZE_SHIFT;
 +if ((size_tlb > size_max) || (size_tlb < size_max)) {

This looks wrong...?

Andreas

 +/* set to min size */
 +tlb->mas1 &= ~MAS1_TSIZE_MASK;
 +tlb->mas1 |= size_min << (MAS1_TSIZE_SHIFT + 1);
 +}
 +
  if (booke206_tlb_to_page_size(env, tlb) == TARGET_PAGE_SIZE) {
  tlb_flush_page(env, tlb->mas2 & MAS2_EPN_MASK);
  } else {

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH v4 08/15] qmp: add block_job_cancel command

2012-01-20 Thread Kevin Wolf
Am 20.01.2012 01:02, schrieb Eric Blake:
 On 01/06/2012 07:01 AM, Stefan Hajnoczi wrote:
 Add block_job_cancel, which stops an active block streaming operation.
 When the operation has been cancelled the new BLOCK_JOB_CANCELLED event
 is emitted.

 Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 
 +++ b/hmp-commands.hx
 @@ -98,6 +98,20 @@ Set maximum speed for a background block operation.
  ETEXI
  
  {
 +.name   = "block_job_cancel",
 +.args_type  = "device:B",
 +.params = "device",
 +.help   = "stop an active block streaming operation",
 +.mhandler.cmd = hmp_block_job_cancel,
 +},
 +
 
 Looking at this from libvirt's perspective, would it be possible to give
 this a different name?  Then libvirt would know that if
 block_job_cancel_async exists, we have the official semantics; and if it
 doesn't exist, then we can try block_job_cancel as a fallback to see if
 we have the old blocking semantics.
 
 But by using the same name as the old unofficial blocking command, it's
 difficult to tell if we should expect an event, or whether completion of
 the command means completion of the cancel.
 
 On the other hand, I guess we can rely on completion of the command,
 followed by reading block job status to see if the job is still in
 flight, will tell us whether we need to worry about waiting for an event
 - if the job is complete (whether or not this command was the blocking
 variant), we are done; if the job is ongoing, we have the new semantics
 and can expect an event; and that only leaves the race of calling the
 command, then the job completes, then we query and see it done, then the
 event comes, where we just have to be ready to ignore an unexpected event.

You're quoting the HMP part, is that intentional? You shouldn't be using
this at all.

Anyway, are there even any qemu versions out there that implement an
older interface?

 +##
 +# @block_job_cancel:
 +#
 +# Stop an active block streaming operation.
 +#
 +# This command returns immediately after marking the active block streaming
 +# operation for cancellation.  It is an error to call this command if no
 +# operation is in progress.
 +#
 +# The operation will cancel as soon as possible and then emit the
 +# BLOCK_JOB_CANCELLED event.  Before that happens the job is still visible when
 +# enumerated using query-block-jobs.
 
 Is there any policy on _ vs - in command names?  It seems awkward to
 have block_job_cancel but query-block-jobs.

block_job_cancel is HMP, whereas query-block-jobs is a QMP command. QMP
uses - consistently. Not sure if HMP is consistent, but it tends to use _.

Kevin



Re: [Qemu-devel] [PATCH v12 4/4] arm: SoC model for Calxeda Highbank

2012-01-20 Thread Peter Maydell
On 19 January 2012 23:17, Rob Herring rob.herr...@calxeda.com wrote:
 On 01/19/2012 03:44 PM, Peter Maydell wrote:
 On 19 January 2012 21:31, Mark Langsdorf mark.langsd...@calxeda.com wrote:
 +    highbank_binfo.board_id = 0xEC10100f; /* provided by deviceTree */

 Where does this number come from? It's not in
 http://www.arm.linux.org.uk/developer/machines/

 Is 3027 (==0xbd3) you?
 http://www.arm.linux.org.uk/developer/machines/list.php?id=3027


 Much of the data there is wrong as none of it is used. 0 or -1 is the
 right value as those are obviously meaningless. A highbank kernel will
 never be booted without devicetree and in that case this number is
 irrelevant. This is the legacy boot interface and qemu really needs to
 learn to boot with a separate dtb.

Yeah, but the documentation even for DTB boot says we should pass
in a machine number. If 0 or -1 are right then there should be
some documentation that says so. I'll accept mailing list post
from some authoritative person [eg Grant Likely] if necessary.
But this is an ABI between boot loaders and the kernel so I don't
want to just have something random that happens to work. (And in
particular if -1 is the officially sanctioned number then we need
to fix arm_boot to be able to pass values > 16 bits wide.)

-- PMM
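
A small sketch of the width concern, assuming (as the remark above implies) that the boot stub can only carry a 16-bit machine number. `fits_in_16_bits` is a made-up helper for illustration, not a function in arm_boot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does a board ID survive being passed through a
 * 16-bit wide field unchanged?
 */
static bool fits_in_16_bits(uint32_t board_id)
{
    return (board_id & 0xffffu) == board_id;
}
```

Under that assumption 3027 (0xbd3) passes through unchanged, whereas both 0xEC10100f and -1 (0xffffffff) would be truncated.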



Re: [Qemu-devel] Get only TCG code without execution

2012-01-20 Thread Peter Maydell
On 20 January 2012 06:12, 陳韋任 che...@iis.sinica.edu.tw wrote:
  Out of curiosity. What's ARM memory model? From the Wikipedia [1], it seems
 ARMv7 has the same memory model as IA64.

The ARM memory model is the set of semantics for memory
accesses as defined in the ARM Architecture Reference
Manual (covering not just reordering but also exclusive
accesses, alignment, barriers, etc). The manual devotes
50 pages to it so I'm not about to try to summarise it here :-)

 What's load/store exclusive implementation?

How we implement the ARM instructions LDREX/STREX/LDREXD/STREXD/etc.
These have documented (complicated!) semantics which our
implementation doesn't provide.

 And as a general emulator, QEMU shouldn't implement any
 architecture-specific memory model, right?

Wrong, at least in theory. Ideally QEMU should implement exactly
the semantics required by the guest architecture memory model
(it's allowed to be stricter than the architecture requires, of
course), in the same way it should implement the semantics required
by the guest architecture instruction set. A guest binary for ARM
can rely on the memory ordering constraints imposed by the memory
model just as much as it can rely on the fact that the ADD instruction
adds two registers together. In practice, of course (a) this is an
enormous amount of work and also slows the emulator down drastically
and (b) guest binaries don't actually rely that much on the memory
model. And the fairly strict memory model provided by x86 means that
for x86 hosts we actually get most of the important bits of the guest
memory model right anyway.

 What comes into my mind is QEMU only need to follow guest memory
 operations when translates guest binary to TCG ops. When translate
 TCG ops to host binary, it also has to be careful not to mess up
 the memory ordering.

This might be doable if TCG provided a set of ops which allowed
you to implement the guest memory model; it doesn't. If we ever
move to emulating guest SMP in multiple host threads this will
become more important, I suspect.

From a pragmatic "we just want to run guests" point of view, what
QEMU does now is entirely sufficient; I'm just saying that for
a strictly correct emulation of the guest architecture we're a
bit lacking.

-- PMM



Re: [Qemu-devel] bad USB tablet update rate on qemu-1.0

2012-01-20 Thread Andreas Färber
Hi Erik,

Am 19.01.2012 20:15, schrieb Erik Rull:
 Andreas Färber wrote:
 Am 19.01.2012 17:40, schrieb Erik Rull:
 [...] there seems to be a
 difference between the captured cursor for the native X-Windows window
 and the VNC window that occurred somewhere between 0.14 and 1.0.

 Then try `git bisect start v1.0 v0.14.0' to find out when exactly the
 perceived behavior changed. :)

 I just did a clone of the current qemu-kvm (which I use) and started to
 bisect, but got an error, where I don't know how to proceed:
 erik@debian:~/qemu-test/qemu-kvm$ git bisect good qemu-kvm-0.14.0
 You need to start by "git bisect start"
 Do you want me to do it for you [Y/n]?
 erik@debian:~/qemu-test/qemu-kvm$ git bisect bad qemu-kvm-1.0
 Bisecting: 2043 revisions left to test after this
 fatal: Entry 'roms/seabios' not uptodate. Cannot merge.
 erik@debian:~/qemu-test/qemu-kvm$

Hm, did you maybe previously do a `git submodule init'? You may need to
run `git submodule update' then (which may fail as recently for roms/SLOF).

Otherwise, generally when it does not compile you can try `git bisect
skip' to try a different commit.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



[Qemu-devel] [PATCH 2/3] vnc: fix ctrl key in vnc terminal emulation

2012-01-20 Thread Gerd Hoffmann
Make the control keys for terminals on the vnc display
(i.e. qemu -vnc :0 -serial vc) work.  Makes the terminals
a lot more usable as typing Ctrl-C in your serial console
actually has the desired effect ;)

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 ui/vnc.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 5752bf8..810582b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1552,9 +1552,11 @@ static void do_key_event(VncState *vs, int down, int keycode, int sym)
 else
 kbd_put_keycode(keycode | SCANCODE_UP);
 } else {
+bool numlock = vs->modifiers_state[0x45];
+bool control = (vs->modifiers_state[0x1d] ||
+vs->modifiers_state[0x9d]);
 /* QEMU console emulation */
 if (down) {
-int numlock = vs->modifiers_state[0x45];
 switch (keycode) {
 case 0x2a:  /* Left Shift */
 case 0x36:  /* Right Shift */
@@ -1642,7 +1644,11 @@ static void do_key_event(VncState *vs, int down, int keycode, int sym)
 break;
 
 default:
-kbd_put_keysym(sym);
+if (control) {
+kbd_put_keysym(sym & 0x1f);
+} else {
+kbd_put_keysym(sym);
+}
 break;
 }
 }
-- 
1.7.1
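
To illustrate why masking with 0x1f gives the expected control characters, here is a minimal sketch. `apply_control` is a made-up helper that only mirrors the masking in the default case above; it is not a function in ui/vnc.c.

```c
#include <assert.h>

/*
 * Hypothetical helper: with Ctrl held, keep only the low five bits of
 * the keysym, which maps ASCII letters onto control codes 0x01..0x1a.
 */
static int apply_control(int sym, int control)
{
    assert(sym >= 0);
    return control ? (sym & 0x1f) : sym;
}
```

So 'c' (0x63) and 'C' (0x43) both become 0x03 (ETX, i.e. Ctrl-C), and '[' (0x5b) becomes 0x1b (ESC).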




[Qemu-devel] [PATCH 3/3] vnc: implement shared flag handling.

2012-01-20 Thread Gerd Hoffmann
VNC clients send a shared flag in the client init message.  Up to now
qemu completely ignores this.  This patch implements shared flag
handling.  It comes with three policies:  By default qemu behaves as one
would expect:  Asking for exclusive access grants exclusive access to
the client connecting.  There is also a desktop sharing mode which
disallows exclusive connects (so one forgetting -shared wouldn't drop
everybody else) and a compatibility mode which mimics the traditional
(but non-conforming) qemu behavior.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 qemu-options.hx |   13 +++
 ui/vnc.c|   98 +++
 ui/vnc.h|   16 +
 3 files changed, 127 insertions(+), 0 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 6295cde..6317660 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1090,6 +1090,19 @@ This can be really helpful to save bandwidth when playing videos. Disabling
 adaptive encodings allows to restore the original static behavior of encodings
 like Tight.
 
+@item share=[allow-exclusive|force-shared|ignore]
+
+Set display sharing policy.  'allow-exclusive' allows clients to ask
+for exclusive access.  As suggested by the rfb spec this is
+implemented by dropping other connections.  Connecting multiple
+clients in parallel requires all clients asking for a shared session
+(vncviewer: -shared switch).  This is the default.  'force-shared'
+disables exclusive client access.  Useful for shared desktop sessions,
+where you don't want someone forgetting specify -shared disconnect
+everybody else.  'ignore' completely ignores the shared flag and
+allows everybody connect unconditionally.  Doesn't conform to the rfb
+spec but is traditional qemu behavior.
+
 @end table
 ETEXI
 
diff --git a/ui/vnc.c b/ui/vnc.c
index 810582b..83a9b15 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -47,6 +47,29 @@ static DisplayChangeListener *dcl;
 
 static int vnc_cursor_define(VncState *vs);
 
+static void vnc_set_share_mode(VncState *vs, VncShareMode mode)
+{
+#ifdef _VNC_DEBUG
+static const char *mn[] = {
+[0]   = "undefined",
+[VNC_SHARE_MODE_CONNECTING]   = "connecting",
+[VNC_SHARE_MODE_SHARED]   = "shared",
+[VNC_SHARE_MODE_EXCLUSIVE]= "exclusive",
+[VNC_SHARE_MODE_DISCONNECTED] = "disconnected",
+};
+fprintf(stderr, "%s/%d: %s -> %s\n", __func__,
+vs->csock, mn[vs->share_mode], mn[mode]);
+#endif
+
+if (vs->share_mode == VNC_SHARE_MODE_EXCLUSIVE) {
+vs->vd->num_exclusive--;
+}
+vs->share_mode = mode;
+if (vs->share_mode == VNC_SHARE_MODE_EXCLUSIVE) {
+vs->vd->num_exclusive++;
+}
+}
+
 static char *addr_to_string(const char *format,
 struct sockaddr_storage *sa,
 socklen_t salen) {
@@ -997,6 +1020,7 @@ static void vnc_disconnect_start(VncState *vs)
 {
 if (vs->csock == -1)
 return;
+vnc_set_share_mode(vs, VNC_SHARE_MODE_DISCONNECTED);
 qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL);
 closesocket(vs->csock);
 vs->csock = -1;
@@ -2054,8 +2078,67 @@ static int protocol_client_msg(VncState *vs, uint8_t *data, size_t len)
 static int protocol_client_init(VncState *vs, uint8_t *data, size_t len)
 {
 char buf[1024];
+VncShareMode mode;
 int size;
 
+mode = data[0] ? VNC_SHARE_MODE_SHARED : VNC_SHARE_MODE_EXCLUSIVE;
+switch (vs->vd->share_policy) {
+case VNC_SHARE_POLICY_IGNORE:
+/*
+ * Ignore the shared flag.  Nothing to do here.
+ *
+ * Doesn't conform to the rfb spec but is traditional qemu
+ * behavior, thus left here as option for compatibility
+ * reasons.
+ */
+break;
+case VNC_SHARE_POLICY_ALLOW_EXCLUSIVE:
+/*
+ * Policy: Allow clients ask for exclusive access.
+ *
+ * Implementation: When a client asks for exclusive access,
+ * disconnect all others. Shared connects are allowed as long
+ * as no exclusive connection exists.
+ *
+ * This is how the rfb spec suggests to handle the shared flag.
+ */
+if (mode == VNC_SHARE_MODE_EXCLUSIVE) {
+VncState *client;
+QTAILQ_FOREACH(client, &vs->vd->clients, next) {
+if (vs == client) {
+continue;
+}
+if (client->share_mode != VNC_SHARE_MODE_EXCLUSIVE &&
+client->share_mode != VNC_SHARE_MODE_SHARED) {
+continue;
+}
+vnc_disconnect_start(client);
+}
+}
+if (mode == VNC_SHARE_MODE_SHARED) {
+if (vs->vd->num_exclusive > 0) {
+vnc_disconnect_start(vs);
+return 0;
+}
+}
+break;
+case VNC_SHARE_POLICY_FORCE_SHARED:
+/*
+ * Policy: Shared connects 
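
As a rough summary of the three policies described in the commit message, the decision for a newly connecting client might be sketched like this. All names here (`decide`, the enums) are invented for the sketch and are not the identifiers used in ui/vnc.c. The force-shared branch is inferred from the commit message ("disallows exclusive connects"), since the corresponding hunk is cut off above.

```c
#include <assert.h>

enum share_mode   { MODE_SHARED, MODE_EXCLUSIVE };
enum share_policy { POLICY_IGNORE, POLICY_ALLOW_EXCLUSIVE, POLICY_FORCE_SHARED };
enum verdict      { ACCEPT, ACCEPT_DROP_OTHERS, REJECT };

/*
 * Hypothetical decision function: what to do with a client that asked
 * for `requested` access while `num_exclusive` exclusive clients are
 * already connected.
 */
static enum verdict decide(enum share_policy policy,
                           enum share_mode requested,
                           int num_exclusive)
{
    assert(num_exclusive >= 0);

    switch (policy) {
    case POLICY_IGNORE:
        /* Traditional behavior: the shared flag is not looked at. */
        return ACCEPT;
    case POLICY_ALLOW_EXCLUSIVE:
        if (requested == MODE_EXCLUSIVE) {
            /* The exclusive client wins; everybody else is dropped. */
            return ACCEPT_DROP_OTHERS;
        }
        /* Shared connects only work while nobody holds exclusive access. */
        return num_exclusive > 0 ? REJECT : ACCEPT;
    case POLICY_FORCE_SHARED:
        /* Assumption from the commit message: exclusive requests are refused. */
        return requested == MODE_EXCLUSIVE ? REJECT : ACCEPT;
    }
    return REJECT;
}
```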

[Qemu-devel] [PATCH 1/3] Fix vnc memory corruption with width = 1400

2012-01-20 Thread Gerd Hoffmann
vnc assumes that the screen width is a multiple of 16 in several places.
If this is not the case vnc will overrun buffers, corrupt memory, make
qemu crash.

This is the minimum fix for this bug. It makes sure we don't overrun the
scanline, thereby fixing the segfault.  The rendering is *not* correct
though, there is a black border at the right side of the screen, 8
pixels wide because 1400 % 16 == 8.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 ui/vnc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 16b79ec..5752bf8 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2445,7 +2445,7 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
 guest_ptr  = guest_row;
 server_ptr = server_row;
 
-for (x = 0; x < vd->guest.ds->width;
+for (x = 0; x + 15 < vd->guest.ds->width;
 x += 16, guest_ptr += cmp_bytes, server_ptr += cmp_bytes) {
 if (!test_and_clear_bit((x / 16), vd->guest.dirty[y]))
 continue;
-- 
1.7.1
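
The effect of the changed loop bound can be checked with a small standalone sketch. The helper names below are made up; the two loops only reproduce the old and new conditions from the hunk above and count how many 16-pixel blocks each one visits.

```c
#include <assert.h>

#define BLOCK 16

/* Blocks visited with the old bound: the last block may run past `width`. */
static int blocks_old(int width)
{
    int n = 0;
    assert(width >= 0);
    for (int x = 0; x < width; x += BLOCK) {
        n++;
    }
    return n;
}

/* Blocks visited with the new bound: only blocks that fit completely. */
static int blocks_new(int width)
{
    int n = 0;
    assert(width >= 0);
    for (int x = 0; x + 15 < width; x += BLOCK) {
        n++;
    }
    return n;
}
```

For width 1400 the old bound visits 88 blocks (1408 pixels, 8 past the end of the scanline), while the new bound visits 87 blocks (1392 pixels) and leaves the last 8 pixels untouched, which matches the black border mentioned in the description. For a width that is a multiple of 16 both bounds visit the same blocks.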




[Qemu-devel] [PATCH 0/3] vnc patch collection

2012-01-20 Thread Gerd Hoffmann
  Hi,

All of these patches have been on the list before, this is just a
re-post after rebasing them.  This time with a git repo to pull from
for your convenience.  Check the individual patch descriptions for
details.

cheers,
  Gerd

The following changes since commit 515689235c4c3d9c3f0406ddcdd21ed8da77062b:

  Merge remote-tracking branch 'spice/spice.v47' into staging (2012-01-19 12:51:02 -0600)

are available in the git repository at:

  git://git.kraxel.org/qemu vnc.1

Gerd Hoffmann (3):
  Fix vnc memory corruption with width = 1400
  vnc: fix ctrl key in vnc terminal emulation
  vnc: implement shared flag handling.

 qemu-options.hx |   13 ++
 ui/vnc.c|  110 +-
 ui/vnc.h|   16 
 3 files changed, 136 insertions(+), 3 deletions(-)



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-20 Thread Kevin Wolf
Am 20.01.2012 09:58, schrieb ronnie sahlberg:
 On Thu, Jan 19, 2012 at 11:17 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 18.12.2011 05:48, schrieb Ronnie Sahlberg:
 This patch adds configuration variables for iSCSI to set
 initiator-name to use when logging in to the target,
 which type of header-digest to negotiate with the target
 and username and password for CHAP authentication.

 This allows specifying an initiator-name either from the command line
 -iscsi initiator-name=iqn.2004-01.com.example:test
 or from a configuration file included with -readconfig
 [iscsi]
   initiator-name = iqn.2004-01.com.example:test
   header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
   user = CHAP username
   password = CHAP password

 The patch also updates the manpage and qemu-doc

 Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com

 So these options are global? What if I wanted to use two different
 setups for two different images?

 
 
 Good point.
 I will rework the patch so that it first checks for
 [iscsi iqn.target.name]
 and if that is not found it falls-back to just checking for [iscsi]
 
 That would allow to have one catch all section for all targets, but
 also the possibility to override and use different settings on a
 per-target basis.
 
 I will post an updated patch in a day or two.

Thanks, this sounds like a good solution.

Kevin
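
The fallback described above (look for a per-target section first, then the plain [iscsi] section) could look roughly like this. This is a sketch with made-up types and names (`struct section`, `lookup_iscsi`), using a flat array instead of QEMU's real config machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a parsed config section. */
struct section {
    const char *name;            /* e.g. "iscsi" or "iscsi iqn.target.name" */
    const char *initiator_name;
};

static const struct section *find_section(const struct section *tab,
                                          size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) == 0) {
            return &tab[i];
        }
    }
    return NULL;
}

/* Try "[iscsi <target>]" first, then fall back to the catch-all "[iscsi]". */
static const struct section *lookup_iscsi(const struct section *tab,
                                          size_t n, const char *target)
{
    char key[256];
    const struct section *s;

    assert(target != NULL);
    snprintf(key, sizeof(key), "iscsi %s", target);
    s = find_section(tab, n, key);
    if (s == NULL) {
        s = find_section(tab, n, "iscsi");
    }
    return s;
}
```

A target with its own section gets its own settings; every other target ends up with whatever the catch-all section says, or with nothing if no such section exists.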



Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Marcelo Tosatti
On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
 On 2012-01-19 18:53, Marcelo Tosatti wrote:
  What problems does it cause, and in which scenarios? Can't they be
  fixed?
  
  If the guest compensates for lost ticks, and KVM reinjects them, guest
  time advances faster than it should, to the extent where NTP fails to
  correct it. This is the case with RHEL4.
  
  But for example v2.4 kernel (or Windows with non-acpi HAL) do not
  compensate. In that case you want KVM to reinject.
  
  I don't know of any other way to fix this.
 
 OK, i see. The old unsolved problem of guessing what is being executed.
 
 Then the next question is how and where to control this. Conceptually,
 there should rather be a global switch say "compensate for lost ticks of
 periodic timers: yes/no" - instead of a per-timer knob. Didn't we
 discuss something like this before?

I don't see the advantage of a global control versus per device
control (in fact it lowers flexibility).

 What about periodic APIC tick compensation? I suppose the kernel does
 not support this as no common guest makes use of this as clock source,
 right? 

Recent guests use the APIC timer as clock event, but their time keeping 
algorithms are not as susceptible to lost ticks as the ones that use 
PIT/RTC.

 Or the HPET? Once the user space model supports compensation, we
 need to control it as well. Individually?

Ulrich has posted patches for HPET compensation:

http://lists.gnu.org/archive/html/qemu-devel/2011-03/msg01989.html

 I just want to avoid introducing a clumsy interface we then need to
 maintain for a long time.
 
 Jan

If the option is a qdev property, i don't see what is clumsy about it?




Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Jan Kiszka
On 2012-01-20 11:14, Marcelo Tosatti wrote:
 On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
 On 2012-01-19 18:53, Marcelo Tosatti wrote:
 What problems does it cause, and in which scenarios? Can't they be
 fixed?

 If the guest compensates for lost ticks, and KVM reinjects them, guest
 time advances faster than it should, to the extent where NTP fails to
 correct it. This is the case with RHEL4.

 But for example v2.4 kernel (or Windows with non-acpi HAL) do not
 compensate. In that case you want KVM to reinject.

 I don't know of any other way to fix this.

 OK, i see. The old unsolved problem of guessing what is being executed.

 Then the next question is how and where to control this. Conceptually,
 there should rather be a global switch say "compensate for lost ticks of
 periodic timers: yes/no" - instead of a per-timer knob. Didn't we
 discuss something like this before?
 
 I don't see the advantage of a global control versus per device
 control (in fact it lowers flexibility).

Usability. Users should not have to care about individual tick-based
clocks. They care about "my OS requires lost ticks compensation, yes or no".

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Daniel P. Berrange
On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
 On 2012-01-20 11:14, Marcelo Tosatti wrote:
  On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
  On 2012-01-19 18:53, Marcelo Tosatti wrote:
  What problems does it cause, and in which scenarios? Can't they be
  fixed?
 
  If the guest compensates for lost ticks, and KVM reinjects them, guest
  time advances faster than it should, to the extent where NTP fails to
  correct it. This is the case with RHEL4.
 
  But for example v2.4 kernel (or Windows with non-acpi HAL) do not
  compensate. In that case you want KVM to reinject.
 
  I don't know of any other way to fix this.
 
  OK, i see. The old unsolved problem of guessing what is being executed.
 
  Then the next question is how and where to control this. Conceptually,
  there should rather be a global switch say "compensate for lost ticks of
  periodic timers: yes/no" - instead of a per-timer knob. Didn't we
  discuss something like this before?
  
  I don't see the advantage of a global control versus per device
  control (in fact it lowers flexibility).
 
 Usability. Users should not have to care about individual tick-based
 clocks. They care about "my OS requires lost ticks compensation, yes or no".

FYI, at the libvirt level we model policy against individual timers, for
example:

  <clock offset="localtime">
    <timer name="rtc" tickpolicy="catchup" track="guest"/>
    <timer name="pit" tickpolicy="delay"/>
  </clock>


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Jamie Lokier
Jan Kiszka wrote:
 Usability. Users should not have to care about individual tick-based
 clocks. They care about "my OS requires lost ticks compensation, yes or no".

Conceivably an OS may require lost ticks compensation depending on
boot options given to the OS telling it which clock sources to use.

However I like the idea of a global default, which you can set and all
the devices inherit it unless overridden in each device.

-- Jamie
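
A global default that individual devices may override could be modelled along these lines. This is only a sketch of the idea; the enum and `effective_policy` are invented names, not an existing QEMU or qdev interface.

```c
#include <assert.h>

/* Hypothetical tri-state: a device either inherits or sets its own policy. */
enum tick_policy {
    TICK_INHERIT = 0,       /* device has no opinion, use the global default */
    TICK_COMPENSATE,        /* reinject lost ticks */
    TICK_NO_COMPENSATE      /* guest compensates on its own */
};

/* Resolve the policy a timer device actually uses. */
static enum tick_policy effective_policy(enum tick_policy global_default,
                                         enum tick_policy device_setting)
{
    /* The global default itself must be a concrete choice. */
    assert(global_default != TICK_INHERIT);

    return device_setting == TICK_INHERIT ? global_default : device_setting;
}
```

A user who only knows "my OS needs compensation" sets the global value once, while a per-device override remains possible for guests whose behavior depends on the clock source chosen at boot.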



Re: [Qemu-devel] Get only TCG code without execution

2012-01-20 Thread Peter Maydell
On 20 January 2012 09:44, 陳韋任 che...@iis.sinica.edu.tw wrote:
 On Fri, Jan 20, 2012 at 09:09:46AM +0000, Peter Maydell wrote:
  AFAIK, LLVM defines its own memory model [1] which is inspired by the C++11
 memory model. That's why I think instead of implementing architecture-specific
 memory model, QEMU should define a more general (strict) one.

LLVM has the advantage that it can require all its incoming code
to adhere to a common memory model (ie something like the C++ one).

  You said,

  guest binaries don't actually rely that much on the memory model.

 I think the reason is those guest binaries are single thread. Memory model is
 important in multi-threaded case. BTW, our binary translator now can translate
 x86 binary to ARM binary, and ARM has weaker memory model than x86.

Yes. At the moment this works for QEMU on ARM hosts because in
system mode QEMU itself is single-threaded so the nastier interactions
between multiple guest CPUs don't occur (just about every memory model
defines that memory interactions within a single thread of execution
behave in the obvious manner). I also had in mind that guest binaries
tend to make fairly stereotypical use of things like LDREX/STREX
rather than relying on obscure details like their interaction with
plain load/stores.

 P.S. Happy Chinese New Year. :)

You too!

-- PMM



[Qemu-devel] [PATCH v9 8/9] hw/exynos4210.c: Add LAN support for SMDKC210.

2012-01-20 Thread Evgeny Voevodin
SMDKC210 uses lan9215 chip, but lan9118 in 16-bit mode seems to
be enough.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 hw/exynos4_boards.c |   27 +--
 1 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/exynos4_boards.c b/hw/exynos4_boards.c
index 55fca06..8befe8b 100644
--- a/hw/exynos4_boards.c
+++ b/hw/exynos4_boards.c
@@ -23,6 +23,7 @@
 
 #include "sysemu.h"
 #include "sysbus.h"
+#include "net.h"
 #include "arm-misc.h"
 #include "exec-memory.h"
 #include "exynos4210.h"
@@ -42,6 +43,8 @@
 #define  PRINT_DEBUG(fmt, args...)  do {} while (0)
 #endif
 
+#define SMDK_LAN9118_BASE_ADDR  0x05000000
+
 typedef enum Exynos4BoardType {
 EXYNOS4_BOARD_NURI,
 EXYNOS4_BOARD_SMDKC210,
@@ -68,6 +71,24 @@ static struct arm_boot_info exynos4_board_binfo = {
 .smp_loader_start = EXYNOS4210_SMP_BOOT_ADDR,
 };
 
+static void lan9215_init(uint32_t base, qemu_irq irq)
+{
+DeviceState *dev;
+SysBusDevice *s;
+
+/* This should be a 9215 but the 9118 is close enough */
+if (nd_table[0].vlan) {
+qemu_check_nic_model(&nd_table[0], "lan9118");
+dev = qdev_create(NULL, "lan9118");
+qdev_set_nic_properties(dev, &nd_table[0]);
+qdev_prop_set_uint32(dev, "mode_16bit", 1);
+qdev_init_nofail(dev);
+s = sysbus_from_qdev(dev);
+sysbus_mmio_map(s, 0, base);
+sysbus_connect_irq(s, 0, irq);
+}
+}
+
 static Exynos4210State *exynos4_boards_init_common(
 const char *kernel_filename,
 const char *kernel_cmdline,
@@ -114,9 +135,11 @@ static void smdkc210_init(ram_addr_t ram_size,
 const char *kernel_filename, const char *kernel_cmdline,
 const char *initrd_filename, const char *cpu_model)
 {
-exynos4_boards_init_common(kernel_filename, kernel_cmdline,
-initrd_filename, EXYNOS4_BOARD_SMDKC210);
+Exynos4210State *s = exynos4_boards_init_common(kernel_filename,
+kernel_cmdline, initrd_filename, EXYNOS4_BOARD_SMDKC210);
 
+lan9215_init(SMDK_LAN9118_BASE_ADDR,
+qemu_irq_invert(s->irq_table[exynos4210_get_irq(37, 1)]));
 arm_load_kernel(first_cpu, &exynos4_board_binfo);
 }
 
-- 
1.7.4.1




[Qemu-devel] [PATCH v9 6/9] ARM: exynos4210: MCT support.

2012-01-20 Thread Evgeny Voevodin

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 Makefile.target |2 +-
 hw/exynos4210.c |   19 +
 hw/exynos4210_mct.c | 1479 +++
 3 files changed, 1499 insertions(+), 1 deletions(-)
 create mode 100644 hw/exynos4210_mct.c

diff --git a/Makefile.target b/Makefile.target
index 6cddf0c..24e7e99 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -341,7 +341,7 @@ obj-arm-y += versatile_pci.o
 obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o
 obj-arm-y += exynos4210_gic.o exynos4210_combiner.o exynos4210.o
 obj-arm-y += exynos4_boards.o exynos4210_uart.o exynos4210_pwm.o
-obj-arm-y += exynos4210_pmu.o
+obj-arm-y += exynos4210_pmu.o exynos4210_mct.o
 obj-arm-y += arm_l2x0.o
 obj-arm-y += arm_mptimer.o
 obj-arm-y += armv7m.o armv7m_nvic.o stellaris.o pl022.o stellaris_enet.o
diff --git a/hw/exynos4210.c b/hw/exynos4210.c
index 2e2dbf0..aceec2a 100644
--- a/hw/exynos4210.c
+++ b/hw/exynos4210.c
@@ -32,6 +32,9 @@
 /* PWM */
 #define EXYNOS4210_PWM_BASE_ADDR   0x139D0000
 
+/* MCT */
+#define EXYNOS4210_MCT_BASE_ADDR   0x10050000
+
 /* UART's definitions */
 #define EXYNOS4210_UART0_BASE_ADDR 0x13800000
 #define EXYNOS4210_UART1_BASE_ADDR 0x13810000
@@ -222,6 +225,22 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
 irq_table[exynos4210_get_irq(22, 4)],
 NULL);
 
+/* Multi Core Timer */
+dev = qdev_create(NULL, "exynos4210.mct");
+qdev_init_nofail(dev);
+busdev = sysbus_from_qdev(dev);
+for (n = 0; n < 4; n++) {
+/* Connect global timer interrupts to Combiner gpio_in */
+sysbus_connect_irq(busdev, n,
+irq_table[exynos4210_get_irq(1, 4 + n)]);
+}
+/* Connect local timer interrupts to Combiner gpio_in */
+sysbus_connect_irq(busdev, 4,
+irq_table[exynos4210_get_irq(51, 0)]);
+sysbus_connect_irq(busdev, 5,
+irq_table[exynos4210_get_irq(35, 3)]);
+sysbus_mmio_map(busdev, 0, EXYNOS4210_MCT_BASE_ADDR);
+
 /*** UARTs ***/
 exynos4210_uart_create(EXYNOS4210_UART0_BASE_ADDR,
EXYNOS4210_UART0_FIFO_SIZE, 0, NULL,
diff --git a/hw/exynos4210_mct.c b/hw/exynos4210_mct.c
new file mode 100644
index 0000000..81bc04e
--- /dev/null
+++ b/hw/exynos4210_mct.c
@@ -0,0 +1,1479 @@
+/*
+ * Samsung exynos4210 Multi Core timer
+ *
+ * Copyright (c) 2000 - 2011 Samsung Electronics Co., Ltd.
+ * All rights reserved.
+ *
+ * Evgeny Voevodin e.voevo...@samsung.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+/*
+ * Global Timer:
+ *
+ * Consists of two timers. First represents Free Running Counter and second
+ * is used to measure interval from FRC to nearest comparator.
+ *
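+ *        0                                                UINT64_MAX
+ *        |                          timer0                         |
+ *        | <------------------------------------------------------ |
+ *        | ------------------------------------frc---------------> |
+ *        |______________________________________|__________________|
+ *              CMP0        CMP1        CMP2     |          CMP3
+ *                                             __|            |_
+ *                                             |    timer1     |
+ *                                             | ------------> |
+ *                                            frc             CMPx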
+ *
+ * Problem: when implementing global timer as is, overflow arises.
+ * next_time = cur_time + period * count;
+ * period and count are 64 bits width.
+ * Lets arm timer for MCT_GT_COUNTER_STEP count and update internal G_CNT
+ * register during each event.
+ *
+ * Problem: both timers need to be implemented using MCT_XT_COUNTER_STEP because
+ * local timer contains two counters: TCNT and ICNT. TCNT == 0 -> ICNT--.
+ * IRQ is generated when ICNT reaches zero. Implementation where TCNT == 0
+ * generates IRQs suffers from too frequent events. Better to have one
+ * uint64_t counter equal to TCNT*ICNT and arm ptimer.c for a minimum(TCNT*ICNT,
+ * MCT_GT_COUNTER_STEP); (yes, if target tunes ICNT * TCNT to be too low values,
+ * there is no way to avoid frequent events).
+ */
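
The stepping idea in the comment above can be sketched in a few lines of C. This is only an illustration, not the patch's code: COUNTER_STEP stands in for MCT_GT_COUNTER_STEP (whose value is not shown in this excerpt), and StepTimer and both helpers are hypothetical names. The point is that the deadline is reached in bounded chunks, so no product of two 64-bit values is ever computed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical step size; stands in for MCT_GT_COUNTER_STEP. */
#define COUNTER_STEP 0x100000000ULL

typedef struct {
    uint64_t count;      /* emulated free-running counter (G_CNT analogue) */
    uint64_t comparator; /* counter value at which the IRQ should fire */
} StepTimer;

/* Ticks to arm for next: distance to the comparator, capped at one step. */
static uint64_t step_timer_next_chunk(const StepTimer *t)
{
    uint64_t distance = t->comparator - t->count; /* wraps modulo 2^64 */
    return distance < COUNTER_STEP ? distance : COUNTER_STEP;
}

/* Called when the armed chunk expires; returns 1 once the comparator is hit. */
static int step_timer_expire(StepTimer *t, uint64_t chunk)
{
    assert(chunk <= COUNTER_STEP);
    t->count += chunk; /* update the internal counter on each event */
    return t->count == t->comparator;
}
```

With a comparator three full steps and five ticks away, the timer is armed four times: three full steps and one short final chunk.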

[Qemu-devel] [PATCH] rename get_clock_realtime

2012-01-20 Thread Paolo Bonzini
get_clock_realtime accesses the host_clock, not the rt_clock.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-timer.c |2 +-
 qemu-timer.h |4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index cd026c6..4a14a6d 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -436,7 +436,7 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
             return cpu_get_clock();
         }
     case QEMU_CLOCK_HOST:
-        now = get_clock_realtime();
+        now = get_clock_host();
         last = clock->last;
         clock->last = now;
         if (now < last) {
diff --git a/qemu-timer.h b/qemu-timer.h
index de17f3b..b180fca 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -93,7 +93,7 @@ static inline int64_t get_ticks_per_sec(void)
 }
 
 /* real time host monotonic timer */
-static inline int64_t get_clock_realtime(void)
+static inline int64_t get_clock_host(void)
 {
     struct timeval tv;
 
@@ -131,7 +131,7 @@ static inline int64_t get_clock(void)
     {
         /* XXX: using gettimeofday leads to problems if the date
            changes, so it should be avoided. */
-        return get_clock_realtime();
+        return get_clock_host();
     }
 }
 #endif
-- 
1.7.7.1




[Qemu-devel] [PATCH] nseries: attach monitor powerdown request to menelaus

2012-01-20 Thread Paolo Bonzini
I noticed some unused code in the twl92230, probably from before
qdev-ification.  This patch makes the machine use the chip's pwrbtn
signal.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/nseries.c  |2 ++
 hw/twl92230.c |   21 +
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/hw/nseries.c b/hw/nseries.c
index d429dbd..c5b3184 100644
--- a/hw/nseries.c
+++ b/hw/nseries.c
@@ -204,6 +204,8 @@ static void n8x0_i2c_setup(struct n800_s *s)
                           qdev_get_gpio_in(s->cpu->ih[0],
                                            OMAP_INT_24XX_SYS_NIRQ));
 
+    qemu_system_powerdown = qdev_get_gpio_in(dev, 3);
+
     /* Attach a TMP105 PM chip (A0 wired to ground) */
     dev = i2c_create_slave(s->i2c, "tmp105", N8X0_TMP105_ADDR);
     qdev_connect_gpio_out(dev, 0, tmp_irq);
diff --git a/hw/twl92230.c b/hw/twl92230.c
index a75448f..6416752 100644
--- a/hw/twl92230.c
+++ b/hw/twl92230.c
@@ -61,9 +61,7 @@ typedef struct {
     } rtc;
     uint16_t rtc_next_vmstate;
     qemu_irq out[4];
-    qemu_irq *in;
     uint8_t pwrbtn_state;
-    qemu_irq pwrbtn;
 } MenelausState;
 
 static inline void menelaus_update(MenelausState *s)
@@ -186,14 +184,12 @@ static void menelaus_gpio_set(void *opaque, int line, int level)
 {
     MenelausState *s = (MenelausState *) opaque;
 
-    /* No interrupt generated */
-    s->inputs &= ~(1 << line);
-    s->inputs |= level << line;
-}
-
-static void menelaus_pwrbtn_set(void *opaque, int line, int level)
-{
-    MenelausState *s = (MenelausState *) opaque;
+    if (line < 3) {
+        /* No interrupt generated */
+        s->inputs &= ~(1 << line);
+        s->inputs |= level << line;
+        return;
+    }
 
     if (!s->pwrbtn_state && level) {
         s->status |= 1 << 11;  /* PSHBTN */
@@ -849,8 +845,9 @@ static int twl92230_init(i2c_slave *i2c)
     s->rtc.hz_tm = qemu_new_timer_ms(rt_clock, menelaus_rtc_hz, s);
     /* Three output pins plus one interrupt pin.  */
     qdev_init_gpio_out(&i2c->qdev, s->out, 4);
-    qdev_init_gpio_in(&i2c->qdev, menelaus_gpio_set, 3);
-    s->pwrbtn = qemu_allocate_irqs(menelaus_pwrbtn_set, s, 1)[0];
+
+    /* Three input pins plus one power-button pin.  */
+    qdev_init_gpio_in(&i2c->qdev, menelaus_gpio_set, 4);
 
     menelaus_reset(&s->i2c);
 
-- 
1.7.7.1
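
For readers skimming the diff, the merged handler boils down to a dispatch on the line number: lines 0..2 only latch their level, and line 3 acts as the power button. The following is a stand-alone sketch under that reading. PmicState, pmic_gpio_set and the macros are illustrative names rather than QEMU identifiers, and the real handler also updates the chip's interrupt state, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_GPIO_INPUTS 3            /* lines 0..2 are plain inputs */
#define PWRBTN_LINE     3            /* line 3 is the power button */
#define STATUS_PSHBTN   (1u << 11)   /* push-button status bit, as in the patch */

typedef struct {
    uint16_t inputs;       /* latched levels of the plain input lines */
    uint16_t status;       /* pending status bits */
    uint8_t  pwrbtn_state; /* last seen level of the power button */
} PmicState;

static void pmic_gpio_set(PmicState *s, int line, int level)
{
    assert(line >= 0 && line <= PWRBTN_LINE);
    level = !!level;

    if (line < NUM_GPIO_INPUTS) {
        /* Plain input: latch the level, no interrupt generated. */
        s->inputs = (uint16_t)(s->inputs & ~(1u << line));
        s->inputs = (uint16_t)(s->inputs | ((unsigned)level << line));
        return;
    }

    /* Power button: flag a press on the rising edge only. */
    if (!s->pwrbtn_state && level) {
        s->status |= STATUS_PSHBTN;
    }
    s->pwrbtn_state = (uint8_t)level;
}
```

Folding the button into the same handler is what lets the board code pick it up as GPIO input 3 instead of going through a separately allocated IRQ.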




Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Jan Kiszka
On 2012-01-20 11:39, Jamie Lokier wrote:
> Jan Kiszka wrote:
>> Usability. Users should not have to care about individual tick-based
>> clocks. They care about "my OS requires lost ticks compensation, yes or no".
>
> Conceivably an OS may require lost ticks compensation depending on
> boot options given to the OS telling it which clock sources to use.
>
> However I like the idea of a global default, which you can set and all
> the devices inherit it unless overridden in each device.

OK, this sounds like a good option: add per-device control but also
introduce global default. The latter can still be done later on.

The only problem is that we should already come up with the right,
generic control switch template. reinject=on|off, as I did it for now
for the PIT, is definitely not optimal.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Jan Kiszka
On 2012-01-20 11:25, Daniel P. Berrange wrote:
> On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
>> On 2012-01-20 11:14, Marcelo Tosatti wrote:
>>> On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
>>>> On 2012-01-19 18:53, Marcelo Tosatti wrote:
>>>>>> What problems does it cause, and in which scenarios? Can't they be
>>>>>> fixed?
>>>>>
>>>>> If the guest compensates for lost ticks, and KVM reinjects them, guest
>>>>> time advances faster than it should, to the extent where NTP fails to
>>>>> correct it. This is the case with RHEL4.
>>>>>
>>>>> But for example v2.4 kernel (or Windows with non-acpi HAL) do not
>>>>> compensate. In that case you want KVM to reinject.
>>>>>
>>>>> I don't know of any other way to fix this.
>>>>
>>>> OK, i see. The old unsolved problem of guessing what is being executed.
>>>>
>>>> Then the next question is how and where to control this. Conceptually,
>>>> there should rather be a global switch say "compensate for lost ticks of
>>>> periodic timers: yes/no" - instead of a per-timer knob. Didn't we
>>>> discuss something like this before?
>>>
>>> I don't see the advantage of a global control versus per device
>>> control (in fact it lowers flexibility).
>>
>> Usability. Users should not have to care about individual tick-based
>> clocks. They care about "my OS requires lost ticks compensation, yes or no".
>
> FYI, at the libvirt level we model policy against individual timers, for
> example:
>
>   <clock offset="localtime">
>     <timer name="rtc" tickpolicy="catchup" track="guest"/>
>     <timer name="pit" tickpolicy="delay"/>
>   </clock>

Are the various modes of tickpolicy fully specified somewhere?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
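
The double-compensation problem described in the quoted discussion is easy to see with a toy model. The sketch below is not QEMU or KVM code; guest_time_ticks and its parameters are made up for illustration, and it assumes that every delivered interrupt advances guest time by one tick and that a compensating guest adds exactly the number of ticks it detects as lost.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Guest-visible time, in ticks, after `total_ticks` timer periods of which
 * `lost` were not delivered on time.
 */
static int64_t guest_time_ticks(int64_t total_ticks, int64_t lost,
                                int host_reinjects, int guest_compensates)
{
    int64_t guest_time;

    assert(lost >= 0 && lost <= total_ticks);
    guest_time = total_ticks - lost;   /* one tick per delivered interrupt */
    if (host_reinjects) {
        guest_time += lost;            /* hypervisor replays missed interrupts */
    }
    if (guest_compensates) {
        guest_time += lost;            /* guest adds the gap it detected itself */
    }
    return guest_time;
}
```

Either mechanism alone brings the guest back to the true tick count. With both enabled the guest runs ahead by the number of lost ticks, which is the case where guest time advances faster than it should; with neither, it falls behind.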



Re: [Qemu-devel] [PATCH] rename get_clock_realtime

2012-01-20 Thread Jan Kiszka
On 2012-01-20 12:05, Paolo Bonzini wrote:
> get_clock_realtime accesses the host_clock, not the rt_clock.
>
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  qemu-timer.c |    2 +-
>  qemu-timer.h |    4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/qemu-timer.c b/qemu-timer.c
> index cd026c6..4a14a6d 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -436,7 +436,7 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>              return cpu_get_clock();
>          }
>      case QEMU_CLOCK_HOST:
> -        now = get_clock_realtime();
> +        now = get_clock_host();
>          last = clock->last;
>          clock->last = now;
>          if (now < last) {
> diff --git a/qemu-timer.h b/qemu-timer.h
> index de17f3b..b180fca 100644
> --- a/qemu-timer.h
> +++ b/qemu-timer.h
> @@ -93,7 +93,7 @@ static inline int64_t get_ticks_per_sec(void)
>  }
>
>  /* real time host monotonic timer */
> -static inline int64_t get_clock_realtime(void)
> +static inline int64_t get_clock_host(void)

It accesses the host realtime clock, so get_clock_host_realtime would be
optimal. In that light, the comment above should be fixed as well.

Jan

>  {
>      struct timeval tv;
>
> @@ -131,7 +131,7 @@ static inline int64_t get_clock(void)
>      {
>          /* XXX: using gettimeofday leads to problems if the date
>             changes, so it should be avoided. */
> -        return get_clock_realtime();
> +        return get_clock_host();
>      }
>  }
>  #endif

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] [PATCH v9 0/9] ARM: Samsung Exynos4210-based boards support.

2012-01-20 Thread Evgeny Voevodin
This set of patches adds support for Samsung S5PC210-based boards NURI and
SMDKC210.
Tested on Linux kernel v3.x series.

Usage:
The "-smp 2" option is mandatory for now.
To test the emulation models, launch a Linux v3.x kernel configured with the
exynos4_defconfig configuration.
This allows booting the kernel with an initrd.

To get framebuffer support in the kernel, enable the Samsung S3C framebuffer
driver in the configuration file.
Note: at present the mainline kernel does not support the framebuffer properly,
and the kernel must be patched slightly to make it work. This problem will be
solved soon by the Samsung kernel developers.

To get ethernet support for the smdkc210 board in the kernel, enable the
"SMSC LAN911x/LAN921x families embedded ethernet" driver in the configuration
file.
Note: the NURI board does not emulate an ethernet adapter since it is a mobile
device.


Examples:
Boot smdkc210 with NFS root support.
qemu-system-arm -kernel ./zImage -append "ip=dhcp root=/dev/nfs nfsroot=10.0.2.2:/srv/nfs/ rw" -M smdkc210 -smp 2

Boot smdkc210 with NFS root support and serial redirected to terminal
qemu-system-arm -kernel ./zImage -append "console=ttySAC0,115200n8 ip=dhcp root=/dev/nfs nfsroot=10.0.2.2:/srv/nfs/ rw" -serial stdio -M smdkc210 -smp 2

Boot NURI with initrd root support and serial redirected to terminal
qemu-system-arm -kernel ./zImage -append "console=ttySAC0,115200n8 root=/dev/ram rw" -serial stdio -M nuri -smp 2 -initrd ./rootfs.ext2

Changelog:
 v8->v9
 - exynos4210.c: secondary CPU bootloader memory region allocation is removed
   (it resides in already allocated IROM); removed hack memory region for
   secondary CPU boot loader (PMU device added); added l2x0 cache controller
 - exynos4210_pmu.c: PMU registers modelling device added to emulation. It is
   needed since the PMU contains the INFORM5 register, which is used to boot
   the secondary CPUs.
 - exynos4_boards.c: indentation fix
 - exynos4210_uart.c: indentation fix, BREAK event handling code added, fixed
   size of allocated registers region
 - exynos4210_gic.c: number of IRQs passed to gic_init() due to last mainline
   update.
 - lan9118.c: added VMSTATE fields due to last mainline update.
 v7->v8
 - exynos4_boards.c: lack of spaces fix
 - exynos4210_gic.c: lack of spaces fix
 - exynos4210_combiner.c: lack of spaces fix
 - exynos4210_uart.c: lack of spaces fix, indentation fix
 - exynos4210_mct.c: ULL suffix fix
 v6->v7
 - exynos4210_pwm.c: added usage of ptimer.h
 - exynos4210_mct.c: added usage of ptimer.h
 v5->v6
 - arm_boot.c, vexpress.c, realview.c: board should specify smp_bootreg_addr if
   its ncpu > 1
 - patch order changed, boot secondary CPU is included in exynos boards patch.
 - exynos4210_mct.c: usage of UINTX_MAX, removed excessive property list, fixed
   indentation, fixed comments
 - exynos4210_pwm.c: spaces and brackets in macros, removed excessive property
   list, fixed indentation
 - exynos4210_combiner.c: removed excessive reset, fixed indentation, fixed
   comments
 - exynos4210_gic.c: fixed indentation, fixed syntax
 - exynos4210_uart.c: fixed indentation, fixed syntax
 - exynos4210.c: fixed comments
 - Makefile.target: removed \
 - hw/exynos4210_fimd.c:
   * rebased against current master: all manipulations with physical pages are
     dropped and replaced with new memory API functions;
   * added three new members to the window's state: a MemoryRegionSection
     describing the section of system RAM containing the current framebuffer,
     a host pointer to the framebuffer data, and the framebuffer length;
   * mapping of the framebuffer is now performed only on framebuffer settings
     change instead of on every display update;
   * byte swap in a uint64 variable is now performed with the standard QEMU
     bswap64 function;
   * blencon register type changed to uint32_t;
   * fixed incorrect spelling of the word "palette";
   * if ... else statements in exynos4210_fimd_read() and
     exynos4210_fimd_write() are replaced with a switch() {} statement.
 v4->v5
 - hw/exynos4210_gic.c: Use memory aliases for CPU interface and Distributor.
   Excessive RW functions are removed.
 - hw/exynos4210_pwm.c and hw/exynos4210_mct.c: Saving of timers added.
 - hw/exynos4210_uart.c: VMSTATE version_id fixed.
 v3->v4
 - Split Exynos SOC and boards.
 - Temporarily removed SD and CMU support to post later.
 - Lan9118 remarks taken into account.
 - Secondary CPU bootloader remarks taken into account.
 - PWM remarks taken into account.
 - UART remarks taken into account.
 v2->v3
 - Reverted hw/arm_gic.c modification
 - Added IRQ Gate to Exynos4210 board.

Evgeny Voevodin (6):
  ARM: exynos4210: IRQ subsystem support.
  ARM: Samsung exynos4210-based boards emulation
  ARM: exynos4210: PWM support.
  ARM: 

[Qemu-devel] [PATCH v9 2/9] ARM: Samsung exynos4210-based boards emulation

2012-01-20 Thread Evgeny Voevodin
Add initial support of NURI and SMDKC210 boards

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 Makefile.target |3 +-
 hw/exynos4210.c |  193 +++
 hw/exynos4210.h |   40 +++
 hw/exynos4_boards.c |  143 +
 4 files changed, 378 insertions(+), 1 deletions(-)
 create mode 100644 hw/exynos4210.c
 create mode 100644 hw/exynos4_boards.c

diff --git a/Makefile.target b/Makefile.target
index 4ac257e..6199d44 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -339,7 +339,8 @@ obj-arm-y = integratorcp.o versatilepb.o arm_pic.o arm_timer.o
 obj-arm-y += arm_boot.o pl011.o pl031.o pl050.o pl080.o pl110.o pl181.o pl190.o
 obj-arm-y += versatile_pci.o
 obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o
-obj-arm-y += exynos4210_gic.o exynos4210_combiner.o
+obj-arm-y += exynos4210_gic.o exynos4210_combiner.o exynos4210.o
+obj-arm-y += exynos4_boards.o
 obj-arm-y += arm_l2x0.o
 obj-arm-y += arm_mptimer.o
 obj-arm-y += armv7m.o armv7m_nvic.o stellaris.o pl022.o stellaris_enet.o
diff --git a/hw/exynos4210.c b/hw/exynos4210.c
new file mode 100644
index 0000000..aff0081
--- /dev/null
+++ b/hw/exynos4210.c
@@ -0,0 +1,193 @@
+/*
+ *  Samsung exynos4210 SoC emulation
+ *
+ *  Copyright (c) 2011 Samsung Electronics Co., Ltd. All rights reserved.
+ *Maksim Kozlov m.koz...@samsung.com
+ *Evgeny Voevodin e.voevo...@samsung.com
+ *Igor Mitsyanko  i.mitsya...@samsung.com
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see http://www.gnu.org/licenses/.
+ *
+ */
+
+#include "boards.h"
+#include "sysemu.h"
+#include "sysbus.h"
+#include "arm-misc.h"
+#include "exynos4210.h"
+
+#define EXYNOS4210_CHIPID_ADDR         0x10000000
+
+/* External GIC */
+#define EXYNOS4210_EXT_GIC_CPU_BASE_ADDR    0x10480000
+#define EXYNOS4210_EXT_GIC_DIST_BASE_ADDR   0x10490000
+
+/* Combiner */
+#define EXYNOS4210_EXT_COMBINER_BASE_ADDR   0x10440000
+#define EXYNOS4210_INT_COMBINER_BASE_ADDR   0x10448000
+
+static uint8_t chipid_and_omr[] = { 0x11, 0x02, 0x21, 0x43,
+                                    0x09, 0x00, 0x00, 0x00 };
+
+Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
+        unsigned long ram_size)
+{
+    qemu_irq cpu_irq[4];
+    int n;
+    Exynos4210State *s = g_new(Exynos4210State, 1);
+    qemu_irq *irq_table;
+    qemu_irq *irqp;
+    qemu_irq gate_irq[EXYNOS4210_IRQ_GATE_NINPUTS];
+    unsigned long mem_size;
+    DeviceState *dev;
+    SysBusDevice *busdev;
+
+    for (n = 0; n < smp_cpus; n++) {
+        s->env[n] = cpu_init("cortex-a9");
+        if (!s->env[n]) {
+            fprintf(stderr, "Unable to find CPU %d definition\n", n);
+            exit(1);
+        }
+        /* Create PIC controller for each processor instance */
+        irqp = arm_pic_init_cpu(s->env[n]);
+
+        /*
+         * Get GICs gpio_in cpu_irq to connect a combiner to them later.
+         * Use only IRQ for a while.
+         */
+        cpu_irq[n] = irqp[ARM_PIC_CPU_IRQ];
+    }
+
+    /*** IRQs ***/
+
+    s->irq_table = exynos4210_init_irq(&s->irqs);
+    irq_table = s->irq_table;
+
+    /* IRQ Gate */
+    dev = qdev_create(NULL, "exynos4210.irq_gate");
+    qdev_init_nofail(dev);
+    /* Get IRQ Gate input in gate_irq */
+    for (n = 0; n < EXYNOS4210_IRQ_GATE_NINPUTS; n++) {
+        gate_irq[n] = qdev_get_gpio_in(dev, n);
+    }
+    busdev = sysbus_from_qdev(dev);
+    /* Connect IRQ Gate output to cpu_irq */
+    for (n = 0; n < smp_cpus; n++) {
+        sysbus_connect_irq(busdev, n, cpu_irq[n]);
+    }
+
+    /* Private memory region and Internal GIC */
+    dev = qdev_create(NULL, "a9mpcore_priv");
+    qdev_prop_set_uint32(dev, "num-cpu", smp_cpus);
+    qdev_init_nofail(dev);
+    busdev = sysbus_from_qdev(dev);
+    sysbus_mmio_map(busdev, 0, EXYNOS4210_SMP_PRIVATE_BASE_ADDR);
+    for (n = 0; n < smp_cpus; n++) {
+        sysbus_connect_irq(busdev, n, gate_irq[n * 2]);
+    }
+    for (n = 0; n < EXYNOS4210_INT_GIC_NIRQ; n++) {
+        s->irqs.int_gic_irq[n] = qdev_get_gpio_in(dev, n);
+    }
+
+    /* Cache controller */
+    sysbus_create_simple("l2x0", EXYNOS4210_L2X0_BASE_ADDR, NULL);
+
+    /* External GIC */
+    dev = qdev_create(NULL, "exynos4210.gic");
+    qdev_prop_set_uint32(dev, "num-cpu", smp_cpus);
+    qdev_init_nofail(dev);
+    busdev = sysbus_from_qdev(dev);
+    /* Map CPU interface */
+

[Qemu-devel] [PATCH v9 5/9] ARM: exynos4210: Added PMU register model.

2012-01-20 Thread Evgeny Voevodin
From: Maksim Kozlov m.koz...@samsung.com

This model just implements PMU registers as a bulk of memory.
The only reason of existence in such form is that secondary CPU
boot loader uses PMU INFORM5 register as a holding pen.

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 Makefile.target |1 +
 hw/exynos4210.c |9 +
 hw/exynos4210_pmu.c |  549 +++
 3 files changed, 559 insertions(+), 0 deletions(-)
 create mode 100644 hw/exynos4210_pmu.c

diff --git a/Makefile.target b/Makefile.target
index 1914870..6cddf0c 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -341,6 +341,7 @@ obj-arm-y += versatile_pci.o
 obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o
 obj-arm-y += exynos4210_gic.o exynos4210_combiner.o exynos4210.o
 obj-arm-y += exynos4_boards.o exynos4210_uart.o exynos4210_pwm.o
+obj-arm-y += exynos4210_pmu.o
 obj-arm-y += arm_l2x0.o
 obj-arm-y += arm_mptimer.o
 obj-arm-y += armv7m.o armv7m_nvic.o stellaris.o pl022.o stellaris_enet.o
diff --git a/hw/exynos4210.c b/hw/exynos4210.c
index ea5a1f8..2e2dbf0 100644
--- a/hw/exynos4210.c
+++ b/hw/exynos4210.c
@@ -52,6 +52,9 @@
 #define EXYNOS4210_EXT_COMBINER_BASE_ADDR   0x10440000
 #define EXYNOS4210_INT_COMBINER_BASE_ADDR   0x10448000
 
+/* PMU SFR base address */
+#define EXYNOS4210_PMU_BASE_ADDR            0x10020000
+
 static uint8_t chipid_and_omr[] = { 0x11, 0x02, 0x21, 0x43,
                                     0x09, 0x00, 0x00, 0x00 };
 
@@ -204,6 +207,12 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
     memory_region_add_subregion(system_mem, EXYNOS4210_DRAM0_BASE_ADDR,
             &s->dram0_mem);
 
+   /* PMU.
+    * The only reason of existence at the moment is that secondary CPU boot
+    * loader uses PMU INFORM5 register as a holding pen.
+    */
+    sysbus_create_simple("exynos4210.pmu", EXYNOS4210_PMU_BASE_ADDR, NULL);
+
     /* PWM */
     sysbus_create_varargs("exynos4210.pwm", EXYNOS4210_PWM_BASE_ADDR,
             irq_table[exynos4210_get_irq(22, 0)],
diff --git a/hw/exynos4210_pmu.c b/hw/exynos4210_pmu.c
new file mode 100644
index 0000000..235c3c3
--- /dev/null
+++ b/hw/exynos4210_pmu.c
@@ -0,0 +1,549 @@
+/*
+ *  exynos4210 Power Management Unit (PMU) Emulation
+ *
+ *  Copyright (C) 2011 Samsung Electronics Co Ltd.
+ *Maksim Kozlov m.koz...@samsung.com
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+/*
+ * This model just implements PMU registers as a bulk of memory. The only reason
+ * of existence in such form is that secondary CPU boot loader uses PMU INFORM5
+ * register as a holding pen.
+ */
+
+#include "sysbus.h"
+
+#undef DEBUG_PMU
+#undef DEBUG_PMU_EXTEND
+
+//#define DEBUG_PMU
+//#define DEBUG_PMU_EXTEND
+
+#define  PRINT_DEBUG(fmt, args...)  \
+        do {} while (0)
+#define  PRINT_DEBUG_EXTEND(fmt, args...) \
+        do {} while (0)
+
+#ifdef DEBUG_PMU
+
+#undef PRINT_DEBUG
+#define  PRINT_DEBUG(fmt, args...)  \
+        do { \
+            fprintf(stderr, "  [%s:%d]   "fmt, __func__, __LINE__, ##args); \
+        } while (0)
+
+#ifdef DEBUG_PMU_EXTEND
+
+#undef PRINT_DEBUG_EXTEND
+#define  PRINT_DEBUG_EXTEND(fmt, args...) \
+        do { \
+            fprintf(stderr, "  [%s:%d]   "fmt, __func__, __LINE__, ##args); \
+        } while (0)
+
+#endif /* EXTEND */
+#endif
+
+/*
+ *  Offsets for PMU registers
+ */
+#define OM_STAT                  0x0000 /* OM status register */
+#define RTC_CLKO_SEL             0x000C /* Controls RTCCLKOUT */
+#define GNSS_RTC_OUT_CTRL        0x0010 /* Controls GNSS_RTC_OUT */
+#define SYSTEM_POWER_DOWN_CTRL   0x0200 /* Decides whether system-level
+                                           low-power mode is used. */
+#define SYSTEM_POWER_DOWN_OPTION 0x0208 /* Sets control options for
+                                           CENTRAL_SEQ */
+#define SWRESET                  0x0400 /* Generate software reset */
+#define RST_STAT                 0x0404 /* Reset status register */
+#define WAKEUP_STAT              0x0600 /* Wakeup status register  */
+#define EINT_WAKEUP_MASK         0x0604 /* Configure External INTerrupt mask */
+#define WAKEUP_MASK              0x0608 /* Configure wakeup source mask */
+#define HDMI_PHY_CONTROL         0x0700 /* HDMI PHY control register */
+#define USBDEVICE_PHY_CONTROL    0x0704 /* USB Device PHY control register */
+#define 
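
As background for the "holding pen" wording in the commit message: a secondary CPU typically spins on a register until the boot CPU stores an entry address there. The sketch below models that with a plain register array, which is how the commit message describes the PMU model ("a bulk of memory"). PmuModel, PMU_NREGS, INFORM5_INDEX and the helper names are assumptions made for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PMU_NREGS     8   /* hypothetical register count */
#define INFORM5_INDEX 5   /* hypothetical slot used as the holding pen */

typedef struct {
    uint32_t reg[PMU_NREGS]; /* registers backed by plain memory */
} PmuModel;

static uint32_t pmu_read(const PmuModel *p, unsigned idx)
{
    assert(idx < PMU_NREGS);
    return p->reg[idx];
}

static void pmu_write(PmuModel *p, unsigned idx, uint32_t val)
{
    assert(idx < PMU_NREGS);
    p->reg[idx] = val;
}

/*
 * One poll of the holding pen as a secondary CPU would do it:
 * returns the entry address, or 0 if the CPU must keep waiting.
 */
static uint32_t holding_pen_poll(const PmuModel *p)
{
    return pmu_read(p, INFORM5_INDEX);
}
```

Since the register only has to remember the last value written, a memory-backed model is enough for this use even without any real power-management behaviour.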

[Qemu-devel] [PATCH v9 7/9] hw/lan9118: Add basic 16-bit mode support.

2012-01-20 Thread Evgeny Voevodin

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 hw/lan9118.c |  122 +++--
 1 files changed, 117 insertions(+), 5 deletions(-)

diff --git a/hw/lan9118.c b/hw/lan9118.c
index 9b199d0..d5e1551 100644
--- a/hw/lan9118.c
+++ b/hw/lan9118.c
@@ -235,6 +235,16 @@ typedef struct {
     int32_t rxp_offset;
     int32_t rxp_size;
     int32_t rxp_pad;
+
+    uint32_t write_word_prev_offset;
+    uint32_t write_word_n;
+    uint16_t write_word_l;
+    uint16_t write_word_h;
+    uint32_t read_word_prev_offset;
+    uint32_t read_word_n;
+    uint32_t read_long;
+
+    uint32_t mode_16bit;
 } lan9118_state;
 
 static const VMStateDescription vmstate_lan9118 = {
@@ -294,6 +304,14 @@ static const VMStateDescription vmstate_lan9118 = {
         VMSTATE_INT32(rxp_offset, lan9118_state),
         VMSTATE_INT32(rxp_size, lan9118_state),
         VMSTATE_INT32(rxp_pad, lan9118_state),
+        VMSTATE_UINT32(write_word_prev_offset, lan9118_state),
+        VMSTATE_UINT32(write_word_n, lan9118_state),
+        VMSTATE_UINT16(write_word_l, lan9118_state),
+        VMSTATE_UINT16(write_word_h, lan9118_state),
+        VMSTATE_UINT32(read_word_prev_offset, lan9118_state),
+        VMSTATE_UINT32(read_word_n, lan9118_state),
+        VMSTATE_UINT32(read_long, lan9118_state),
+        VMSTATE_UINT32(mode_16bit, lan9118_state),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -390,7 +408,7 @@ static void lan9118_reset(DeviceState *d)
     s->fifo_int = 0x48000000;
     s->rx_cfg = 0;
     s->tx_cfg = 0;
-    s->hw_cfg = 0x00050000;
+    s->hw_cfg = s->mode_16bit ? 0x00050000 : 0x00050004;
     s->pmt_ctrl &= 0x45;
     s->gpio_cfg = 0;
     s->txp->fifo_used = 0;
@@ -429,6 +447,9 @@ static void lan9118_reset(DeviceState *d)
     s->mac_mii_data = 0;
     s->mac_flow = 0;
 
+    s->read_word_n = 0;
+    s->write_word_n = 0;
+
     phy_reset(s);
 
     s->eeprom_writable = 0;
@@ -984,7 +1005,7 @@ static void lan9118_writel(void *opaque, target_phys_addr_t offset,
 {
     lan9118_state *s = (lan9118_state *)opaque;
     offset &= 0xff;
-
+
     //DPRINTF("Write reg 0x%02x = 0x%08x\n", (int)offset, val);
     if (offset >= 0x20 && offset < 0x40) {
         /* TX FIFO */
@@ -1034,7 +1055,7 @@ static void lan9118_writel(void *opaque, target_phys_addr_t offset,
             /* SRST */
             lan9118_reset(&s->busdev.qdev);
         } else {
-            s->hw_cfg = val & 0x003f300;
+            s->hw_cfg = (val & 0x003f300) | (s->hw_cfg & 0x4);
         }
         break;
     case CSR_RX_DP_CTRL:
@@ -1113,6 +1134,46 @@ static void lan9118_writel(void *opaque, target_phys_addr_t offset,
     lan9118_update(s);
 }
 
+static void lan9118_writew(void *opaque, target_phys_addr_t offset,
+                           uint32_t val)
+{
+    lan9118_state *s = (lan9118_state *)opaque;
+    offset &= 0xff;
+
+    if (s->write_word_prev_offset != (offset & ~0x3)) {
+        /* New offset, reset word counter */
+        s->write_word_n = 0;
+        s->write_word_prev_offset = offset & ~0x3;
+    }
+
+    if (offset & 0x2) {
+        s->write_word_h = val;
+    } else {
+        s->write_word_l = val;
+    }
+
+    //DPRINTF("Writew reg 0x%02x = 0x%08x\n", (int)offset, val);
+    s->write_word_n++;
+    if (s->write_word_n == 2) {
+        s->write_word_n = 0;
+        lan9118_writel(s, offset & ~3, s->write_word_l +
+                (s->write_word_h << 16), 4);
+    }
+}
+
+static void lan9118_16bit_mode_write(void *opaque, target_phys_addr_t offset,
+                                     uint64_t val, unsigned size)
+{
+    switch (size) {
+    case 2:
+        return lan9118_writew(opaque, offset, (uint32_t)val);
+    case 4:
+        return lan9118_writel(opaque, offset, val, size);
+    }
+
+    hw_error("lan9118_write: Bad size 0x%x\n", size);
+}
+
 static uint64_t lan9118_readl(void *opaque, target_phys_addr_t offset,
                               unsigned size)
 {
@@ -1149,7 +1210,7 @@ static uint64_t lan9118_readl(void *opaque, target_phys_addr_t offset,
     case CSR_TX_CFG:
         return s->tx_cfg;
     case CSR_HW_CFG:
-        return s->hw_cfg | 0x4;
+        return s->hw_cfg;
     case CSR_RX_DP_CTRL:
         return 0;
     case CSR_RX_FIFO_INF:
@@ -1187,12 +1248,60 @@ static uint64_t lan9118_readl(void *opaque, target_phys_addr_t offset,
     return 0;
 }
 
+static uint32_t lan9118_readw(void *opaque, target_phys_addr_t offset)
+{
+    lan9118_state *s = (lan9118_state *)opaque;
+    uint32_t val;
+
+    if (s->read_word_prev_offset != (offset & ~0x3)) {
+        /* New offset, reset word counter */
+        s->read_word_n = 0;
+        s->read_word_prev_offset = offset & ~0x3;
+    }
+
+    s->read_word_n++;
+    if (s->read_word_n == 1) {
+        s->read_long = lan9118_readl(s, offset & ~3, 4);
+    } else {
+        s->read_word_n = 0;
+    }
+
+    if (offset & 2) {
+        val = s->read_long >> 16;
+    } else {
+        val = s->read_long & 0xFFFF;
+    }
+
+    //DPRINTF("Readw reg 0x%02x, val 0x%x\n", (int)offset, val);
+ 
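
The write path in this patch pairs two 16-bit accesses to the same aligned 32-bit register and only then performs the 32-bit write. A stand-alone sketch of that pairing follows, with the device write replaced by a recorded value so it can be run in isolation. HalfWordPairer and pair_write16 are illustrative names, not part of lan9118.c.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t prev_offset; /* aligned offset of the access being assembled */
    uint32_t n;           /* halves collected so far (0 or 1) */
    uint16_t lo, hi;      /* low and high halves */
    uint32_t last_offset; /* offset of the last completed 32-bit write */
    uint32_t last_value;  /* value of the last completed 32-bit write */
    unsigned commits;     /* number of completed 32-bit writes */
} HalfWordPairer;

static void pair_write16(HalfWordPairer *s, uint32_t offset, uint16_t val)
{
    assert((offset & 1) == 0); /* 16-bit accesses are 2-byte aligned */

    if (s->prev_offset != (offset & ~3u)) {
        /* Different register: drop any half collected so far. */
        s->n = 0;
        s->prev_offset = offset & ~3u;
    }

    if (offset & 2) {
        s->hi = val;
    } else {
        s->lo = val;
    }

    if (++s->n == 2) {
        /* Both halves seen: perform the combined 32-bit write. */
        s->n = 0;
        s->last_offset = offset & ~3u;
        s->last_value = s->lo | ((uint32_t)s->hi << 16);
        s->commits++;
    }
}
```

Note that a half written to one register is discarded if the next access targets a different register, so a lone 16-bit write never reaches the device in this scheme.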

Re: [Qemu-devel] [PATCH] rename get_clock_realtime

2012-01-20 Thread Paolo Bonzini

On 01/20/2012 12:17 PM, Jan Kiszka wrote:
> On 2012-01-20 12:05, Paolo Bonzini wrote:
>> get_clock_realtime accesses the host_clock, not the rt_clock.
>>
>> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
>> ---
>>  qemu-timer.c |    2 +-
>>  qemu-timer.h |    4 ++--
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/qemu-timer.c b/qemu-timer.c
>> index cd026c6..4a14a6d 100644
>> --- a/qemu-timer.c
>> +++ b/qemu-timer.c
>> @@ -436,7 +436,7 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>>              return cpu_get_clock();
>>          }
>>      case QEMU_CLOCK_HOST:
>> -        now = get_clock_realtime();
>> +        now = get_clock_host();
>>          last = clock->last;
>>          clock->last = now;
>>          if (now < last) {
>> diff --git a/qemu-timer.h b/qemu-timer.h
>> index de17f3b..b180fca 100644
>> --- a/qemu-timer.h
>> +++ b/qemu-timer.h
>> @@ -93,7 +93,7 @@ static inline int64_t get_ticks_per_sec(void)
>>  }
>>
>>  /* real time host monotonic timer */
>> -static inline int64_t get_clock_realtime(void)
>> +static inline int64_t get_clock_host(void)
>
> It accesses the host realtime clock, so get_clock_host_realtime would be
> optimal. In that light, the comment above should be fixed as well.

Yeah, however, realtime is quite confusing because CLOCK_MONOTONIC is
part of the real-time clock API.  The code is much clearer than the
comment, I'll remove it completely.  v2 on the way.


Paolo
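
The naming confusion discussed above comes from there being two different host time sources. The sketch below contrasts them using only POSIX calls; the function names host_wallclock_ns and monotonic_ns are made up for this illustration and are not the QEMU helpers. The wall-clock variant is based on gettimeofday(), which the XXX comment in the patch context also refers to, and can jump when the date is changed. CLOCK_MONOTONIC does not jump, even though it is reached through the POSIX real-time clock API.

```c
#define _XOPEN_SOURCE 700 /* for clock_gettime() and gettimeofday() */
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Host wall-clock time in nanoseconds; follows changes to the system date. */
static int64_t host_wallclock_ns(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000000000LL + tv.tv_usec * 1000LL;
}

/* Monotonic time in nanoseconds; unaffected by changes to the system date. */
static int64_t monotonic_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

Code measuring intervals wants the second function; code that must track the host's calendar time wants the first, and then has to cope with the value going backwards.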




Re: [Qemu-devel] [PATCH] hw/9pfs: Update MAINTAINERS file

2012-01-20 Thread Aneesh Kumar K.V
On Thu, 19 Jan 2012 23:40:25 +0100, Andreas Färber <afaer...@suse.de> wrote:
> On 19.01.2012 18:27, Aneesh Kumar K.V wrote:
>> From: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
>> ---
>>  MAINTAINERS |    6 ++++--
>>  1 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index de2a9163..f9f131c 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -399,9 +399,11 @@ S: Supported
>>  F: hw/virtio*
>>
>>  virtio-9p
>> -M: Venkateswararao Jujjuri (JV) <jv...@linux.vnet.ibm.com>
>> +M: Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>
>>  S: Supported
>> -F: hw/virtio-9p*
>> +F: hw/9pfs/ fsdev/
>> +T: https://github.com/kvaneesh/QEMU
>
> Shouldn't this be "T: git://github.com/kvaneesh/QEMU.git your-9p-branch"?
> You can use the above Web URL with W: if you wish.
>
> Maybe update the title to 9pfs, too? (Not that I understand much of it,
> just spotting the apparent inconsistency.)
>
> Andreas

All the branches in that repo are used for 9p development. I use
for-upstream to push changes to anthony. So was not sure whether I
should specify for-upstream there.

 -aneesh




[Qemu-devel] [PATCH v2] rename get_clock_realtime

2012-01-20 Thread Paolo Bonzini
get_clock_realtime accesses the host_clock, not the rt_clock.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
v1-v2: remove incorrect comment

 qemu-timer.c |2 +-
 qemu-timer.h |5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index cd026c6..4a14a6d 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -436,7 +436,7 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
             return cpu_get_clock();
         }
     case QEMU_CLOCK_HOST:
-        now = get_clock_realtime();
+        now = get_clock_host();
         last = clock->last;
         clock->last = now;
         if (now < last) {
diff --git a/qemu-timer.h b/qemu-timer.h
index de17f3b..a61f209 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -92,8 +92,7 @@ static inline int64_t get_ticks_per_sec(void)
     return 1000000000LL;
 }
 
-/* real time host monotonic timer */
-static inline int64_t get_clock_realtime(void)
+static inline int64_t get_clock_host(void)
 {
     struct timeval tv;
 
@@ -131,7 +130,7 @@ static inline int64_t get_clock(void)
     {
         /* XXX: using gettimeofday leads to problems if the date
            changes, so it should be avoided. */
-        return get_clock_realtime();
+        return get_clock_host();
     }
 }
 #endif
-- 
1.7.7.1




[Qemu-devel] [PATCH v9 1/9] ARM: exynos4210: IRQ subsystem support.

2012-01-20 Thread Evgeny Voevodin

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 Makefile.target  |1 +
 hw/exynos4210.h  |   82 
 hw/exynos4210_combiner.c |  472 ++
 hw/exynos4210_gic.c  |  436 ++
 4 files changed, 991 insertions(+), 0 deletions(-)
 create mode 100644 hw/exynos4210.h
 create mode 100644 hw/exynos4210_combiner.c
 create mode 100644 hw/exynos4210_gic.c

diff --git a/Makefile.target b/Makefile.target
index 06d79b8..4ac257e 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -339,6 +339,7 @@ obj-arm-y = integratorcp.o versatilepb.o arm_pic.o arm_timer.o
 obj-arm-y += arm_boot.o pl011.o pl031.o pl050.o pl080.o pl110.o pl181.o pl190.o
 obj-arm-y += versatile_pci.o
 obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o
+obj-arm-y += exynos4210_gic.o exynos4210_combiner.o
 obj-arm-y += arm_l2x0.o
 obj-arm-y += arm_mptimer.o
 obj-arm-y += armv7m.o armv7m_nvic.o stellaris.o pl022.o stellaris_enet.o
diff --git a/hw/exynos4210.h b/hw/exynos4210.h
new file mode 100644
index 0000000..cef264b
--- /dev/null
+++ b/hw/exynos4210.h
@@ -0,0 +1,82 @@
+/*
+ *  Samsung exynos4210 SoC emulation
+ *
+ *  Copyright (c) 2011 Samsung Electronics Co., Ltd. All rights reserved.
+ *Maksim Kozlov m.koz...@samsung.com
+ *Evgeny Voevodin e.voevo...@samsung.com
+ *Igor Mitsyanko i.mitsya...@samsung.com
+ *
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see http://www.gnu.org/licenses/.
+ *
+ */
+
+
+#ifndef EXYNOS4210_H_
+#define EXYNOS4210_H_
+
+#include "qemu-common.h"
+#include "memory.h"
+
+#define EXYNOS4210_MAX_CPUS 2
+
+/*
+ * exynos4210 IRQ subsystem stub definitions.
+ */
+#define EXYNOS4210_IRQ_GATE_NINPUTS 8
+
+#define EXYNOS4210_MAX_INT_COMBINER_OUT_IRQ  64
+#define EXYNOS4210_MAX_EXT_COMBINER_OUT_IRQ  16
+#define EXYNOS4210_MAX_INT_COMBINER_IN_IRQ   \
+(EXYNOS4210_MAX_INT_COMBINER_OUT_IRQ * 8)
+#define EXYNOS4210_MAX_EXT_COMBINER_IN_IRQ   \
+(EXYNOS4210_MAX_EXT_COMBINER_OUT_IRQ * 8)
+
+#define EXYNOS4210_COMBINER_GET_IRQ_NUM(grp, bit)  ((grp)*8 + (bit))
+#define EXYNOS4210_COMBINER_GET_GRP_NUM(irq)   ((irq) / 8)
+#define EXYNOS4210_COMBINER_GET_BIT_NUM(irq) \
+((irq) - 8 * EXYNOS4210_COMBINER_GET_GRP_NUM(irq))
+
+/* IRQs number for external and internal GIC */
+#define EXYNOS4210_EXT_GIC_NIRQ (160-32)
+#define EXYNOS4210_INT_GIC_NIRQ 64
+
+typedef struct Exynos4210Irq {
+qemu_irq int_combiner_irq[EXYNOS4210_MAX_INT_COMBINER_IN_IRQ];
+qemu_irq ext_combiner_irq[EXYNOS4210_MAX_EXT_COMBINER_IN_IRQ];
+qemu_irq int_gic_irq[EXYNOS4210_INT_GIC_NIRQ];
+qemu_irq ext_gic_irq[EXYNOS4210_EXT_GIC_NIRQ];
+qemu_irq board_irqs[EXYNOS4210_MAX_INT_COMBINER_IN_IRQ];
+} Exynos4210Irq;
+
+/* Initialize exynos4210 IRQ subsystem stub */
+qemu_irq *exynos4210_init_irq(Exynos4210Irq *env);
+
+/* Initialize board IRQs.
+ * These IRQs contain splitted Int/External Combiner and External Gic IRQs */
+void exynos4210_init_board_irqs(Exynos4210Irq *s);
+
+/* Get IRQ number from exynos4210 IRQ subsystem stub.
+ * To identify IRQ source use internal combiner group and bit number
+ *  grp - group number
+ *  bit - bit number inside group */
+uint32_t exynos4210_get_irq(uint32_t grp, uint32_t bit);
+
+/*
+ * Get Combiner input GPIO into irqs structure
+ */
+void exynos4210_combiner_get_gpioin(Exynos4210Irq *irqs, DeviceState *dev,
+int ext);
+
+#endif /* EXYNOS4210_H_ */
diff --git a/hw/exynos4210_combiner.c b/hw/exynos4210_combiner.c
new file mode 100644
index 0000000..84e692b
--- /dev/null
+++ b/hw/exynos4210_combiner.c
@@ -0,0 +1,472 @@
+/*
+ * Samsung exynos4210 Interrupt Combiner
+ *
+ * Copyright (c) 2000 - 2011 Samsung Electronics Co., Ltd.
+ * All rights reserved.
+ *
+ * Evgeny Voevodin e.voevo...@samsung.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU 

Re: [Qemu-devel] [PATCH] rename get_clock_realtime

2012-01-20 Thread Jan Kiszka
On 2012-01-20 12:26, Paolo Bonzini wrote:
> On 01/20/2012 12:17 PM, Jan Kiszka wrote:
>> On 2012-01-20 12:05, Paolo Bonzini wrote:
>>> get_clock_realtime accesses the host_clock, not the rt_clock.
>>>
>>> Signed-off-by: Paolo Bonzini pbonz...@redhat.com
>>> ---
>>>   qemu-timer.c |2 +-
>>>   qemu-timer.h |4 ++--
>>>   2 files changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/qemu-timer.c b/qemu-timer.c
>>> index cd026c6..4a14a6d 100644
>>> --- a/qemu-timer.c
>>> +++ b/qemu-timer.c
>>> @@ -436,7 +436,7 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>>>   return cpu_get_clock();
>>>   }
>>>   case QEMU_CLOCK_HOST:
>>> -now = get_clock_realtime();
>>> +now = get_clock_host();
>>>   last = clock->last;
>>>   clock->last = now;
>>>   if (now < last) {
>>> diff --git a/qemu-timer.h b/qemu-timer.h
>>> index de17f3b..b180fca 100644
>>> --- a/qemu-timer.h
>>> +++ b/qemu-timer.h
>>> @@ -93,7 +93,7 @@ static inline int64_t get_ticks_per_sec(void)
>>>   }
>>>
>>>   /* real time host monotonic timer */
>>> -static inline int64_t get_clock_realtime(void)
>>> +static inline int64_t get_clock_host(void)
>>
>> It accesses the host realtime clock, so get_clock_host_realtime would be
>> optimal. In that light, the comment above should be fixed as well.
>
> Yeah, however, realtime is quite confusing because CLOCK_MONOTONIC is
> part of the real-time clock API.  The code is much clearer than the
> comment, I'll remove it completely.  v2 on the way.

There is CLOCK_MONOTONIC and CLOCK_REALTIME, and this function uses the
latter.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH] rename get_clock_realtime

2012-01-20 Thread Paolo Bonzini

On 01/20/2012 12:29 PM, Jan Kiszka wrote:

>> Yeah, however, realtime is quite confusing because CLOCK_MONOTONIC is
>> part of the real-time clock API.  The code is much clearer than the
>> comment, I'll remove it completely.  v2 on the way.
>
> There is CLOCK_MONOTONIC and CLOCK_REALTIME, and this function uses the
> latter.


Actually it uses gettimeofday, but I see what you mean.

But QEMU_CLOCK_REALTIME and rt_clock refer to CLOCK_MONOTONIC and are 
_not_ read by get_clock_realtime.  This actually is what prompted me to 
rename get_clock_realtime (see commit message).


Paolo



[Qemu-devel] [PATCH v9 3/9] ARM: exynos4210: UART support

2012-01-20 Thread Evgeny Voevodin
From: Maksim Kozlov m.koz...@samsung.com

Add basic support of exynos4210 UART

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
---
 Makefile.target  |2 +-
 hw/exynos4210.c  |   29 +++
 hw/exynos4210.h  |9 +
 hw/exynos4210_uart.c |  661 ++
 4 files changed, 700 insertions(+), 1 deletions(-)
 create mode 100644 hw/exynos4210_uart.c

diff --git a/Makefile.target b/Makefile.target
index 6199d44..c856de3 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -340,7 +340,7 @@ obj-arm-y += arm_boot.o pl011.o pl031.o pl050.o pl080.o pl110.o pl181.o pl190.o
 obj-arm-y += versatile_pci.o
 obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o
 obj-arm-y += exynos4210_gic.o exynos4210_combiner.o exynos4210.o
-obj-arm-y += exynos4_boards.o
+obj-arm-y += exynos4_boards.o exynos4210_uart.o
 obj-arm-y += arm_l2x0.o
 obj-arm-y += arm_mptimer.o
 obj-arm-y += armv7m.o armv7m_nvic.o stellaris.o pl022.o stellaris_enet.o
diff --git a/hw/exynos4210.c b/hw/exynos4210.c
index aff0081..3838b96 100644
--- a/hw/exynos4210.c
+++ b/hw/exynos4210.c
@@ -29,6 +29,18 @@
 
 #define EXYNOS4210_CHIPID_ADDR 0x10000000
 
+/* UART's definitions */
+#define EXYNOS4210_UART0_BASE_ADDR 0x13800000
+#define EXYNOS4210_UART1_BASE_ADDR 0x13810000
+#define EXYNOS4210_UART2_BASE_ADDR 0x13820000
+#define EXYNOS4210_UART3_BASE_ADDR 0x13830000
+#define EXYNOS4210_UART0_FIFO_SIZE 256
+#define EXYNOS4210_UART1_FIFO_SIZE 64
+#define EXYNOS4210_UART2_FIFO_SIZE 16
+#define EXYNOS4210_UART3_FIFO_SIZE 16
+/* Interrupt Group of External Interrupt Combiner for UART */
+#define EXYNOS4210_UART_INT_GRP 26
+
 /* External GIC */
 #define EXYNOS4210_EXT_GIC_CPU_BASE_ADDR    0x10480000
 #define EXYNOS4210_EXT_GIC_DIST_BASE_ADDR   0x10490000
@@ -189,5 +201,22 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
 memory_region_add_subregion(system_mem, EXYNOS4210_DRAM0_BASE_ADDR,
 &s->dram0_mem);
 
+/*** UARTs ***/
+exynos4210_uart_create(EXYNOS4210_UART0_BASE_ADDR,
+   EXYNOS4210_UART0_FIFO_SIZE, 0, NULL,
+irq_table[exynos4210_get_irq(EXYNOS4210_UART_INT_GRP, 0)]);
+
+exynos4210_uart_create(EXYNOS4210_UART1_BASE_ADDR,
+   EXYNOS4210_UART1_FIFO_SIZE, 1, NULL,
+irq_table[exynos4210_get_irq(EXYNOS4210_UART_INT_GRP, 1)]);
+
+exynos4210_uart_create(EXYNOS4210_UART2_BASE_ADDR,
+   EXYNOS4210_UART2_FIFO_SIZE, 2, NULL,
+irq_table[exynos4210_get_irq(EXYNOS4210_UART_INT_GRP, 2)]);
+
+exynos4210_uart_create(EXYNOS4210_UART3_BASE_ADDR,
+   EXYNOS4210_UART3_FIFO_SIZE, 3, NULL,
+irq_table[exynos4210_get_irq(EXYNOS4210_UART_INT_GRP, 3)]);
+
 return s;
 }
diff --git a/hw/exynos4210.h b/hw/exynos4210.h
index 621e75a..11368e8 100644
--- a/hw/exynos4210.h
+++ b/hw/exynos4210.h
@@ -119,4 +119,13 @@ uint32_t exynos4210_get_irq(uint32_t grp, uint32_t bit);
 void exynos4210_combiner_get_gpioin(Exynos4210Irq *irqs, DeviceState *dev,
 int ext);
 
+/*
+ * exynos4210 UART
+ */
+DeviceState *exynos4210_uart_create(target_phys_addr_t addr,
+int fifo_size,
+int channel,
+CharDriverState *chr,
+qemu_irq irq);
+
 #endif /* EXYNOS4210_H_ */
diff --git a/hw/exynos4210_uart.c b/hw/exynos4210_uart.c
new file mode 100644
index 0000000..01ccf5a
--- /dev/null
+++ b/hw/exynos4210_uart.c
@@ -0,0 +1,661 @@
+/*
+ *  Exynos4210 UART Emulation
+ *
+ *  Copyright (C) 2011 Samsung Electronics Co Ltd.
+ *Maksim Kozlov, m.koz...@samsung.com
+ *
+ *  This program is free software; you can redistribute it and/or modify it
+ *  under the terms of the GNU General Public License as published by the
+ *  Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful, but WITHOUT
+ *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ *  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ *  for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see http://www.gnu.org/licenses/.
+ *
+ */
+
+#include "sysbus.h"
+#include "sysemu.h"
+#include "qemu-char.h"
+
+#include "exynos4210.h"
+
+#undef DEBUG_UART
+#undef DEBUG_UART_EXTEND
+#undef DEBUG_IRQ
+#undef DEBUG_Rx_DATA
+#undef DEBUG_Tx_DATA
+
+#define DEBUG_UART 0
+#define DEBUG_UART_EXTEND 0
+#define DEBUG_IRQ 0
+#define DEBUG_Rx_DATA 0
+#define DEBUG_Tx_DATA 0
+
+#if DEBUG_UART
+#define  PRINT_DEBUG(fmt, args...)  \
+do { \
+fprintf(stderr, "  [%s:%d]   "fmt, __func__, __LINE__, ##args); 

[Qemu-devel] [PATCH v9 4/9] ARM: exynos4210: PWM support.

2012-01-20 Thread Evgeny Voevodin

Signed-off-by: Evgeny Voevodin e.voevo...@samsung.com
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 Makefile.target |2 +-
 hw/exynos4210.c |   12 ++
 hw/exynos4210_pwm.c |  413 +++
 3 files changed, 426 insertions(+), 1 deletions(-)
 create mode 100644 hw/exynos4210_pwm.c

diff --git a/Makefile.target b/Makefile.target
index c856de3..1914870 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -340,7 +340,7 @@ obj-arm-y += arm_boot.o pl011.o pl031.o pl050.o pl080.o pl110.o pl181.o pl190.o
 obj-arm-y += versatile_pci.o
 obj-arm-y += realview_gic.o realview.o arm_sysctl.o arm11mpcore.o a9mpcore.o
 obj-arm-y += exynos4210_gic.o exynos4210_combiner.o exynos4210.o
-obj-arm-y += exynos4_boards.o exynos4210_uart.o
+obj-arm-y += exynos4_boards.o exynos4210_uart.o exynos4210_pwm.o
 obj-arm-y += arm_l2x0.o
 obj-arm-y += arm_mptimer.o
 obj-arm-y += armv7m.o armv7m_nvic.o stellaris.o pl022.o stellaris_enet.o
diff --git a/hw/exynos4210.c b/hw/exynos4210.c
index 3838b96..ea5a1f8 100644
--- a/hw/exynos4210.c
+++ b/hw/exynos4210.c
@@ -29,6 +29,9 @@
 
 #define EXYNOS4210_CHIPID_ADDR 0x10000000
 
+/* PWM */
+#define EXYNOS4210_PWM_BASE_ADDR   0x139D0000
+
 /* UART's definitions */
 #define EXYNOS4210_UART0_BASE_ADDR 0x13800000
 #define EXYNOS4210_UART1_BASE_ADDR 0x13810000
@@ -201,6 +204,15 @@ Exynos4210State *exynos4210_init(MemoryRegion *system_mem,
 memory_region_add_subregion(system_mem, EXYNOS4210_DRAM0_BASE_ADDR,
 &s->dram0_mem);
 
+/* PWM */
+sysbus_create_varargs("exynos4210.pwm", EXYNOS4210_PWM_BASE_ADDR,
+irq_table[exynos4210_get_irq(22, 0)],
+irq_table[exynos4210_get_irq(22, 1)],
+irq_table[exynos4210_get_irq(22, 2)],
+irq_table[exynos4210_get_irq(22, 3)],
+irq_table[exynos4210_get_irq(22, 4)],
+NULL);
+
 /*** UARTs ***/
 exynos4210_uart_create(EXYNOS4210_UART0_BASE_ADDR,
EXYNOS4210_UART0_FIFO_SIZE, 0, NULL,
diff --git a/hw/exynos4210_pwm.c b/hw/exynos4210_pwm.c
new file mode 100644
index 0000000..29504d2
--- /dev/null
+++ b/hw/exynos4210_pwm.c
@@ -0,0 +1,413 @@
+/*
+ * Samsung exynos4210 Pulse Width Modulation Timer
+ *
+ * Copyright (c) 2000 - 2011 Samsung Electronics Co., Ltd.
+ * All rights reserved.
+ *
+ * Evgeny Voevodin e.voevo...@samsung.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see http://www.gnu.org/licenses/.
+ */
+
+#include "sysbus.h"
+#include "qemu-timer.h"
+#include "qemu-common.h"
+#include "ptimer.h"
+
+#include "exynos4210.h"
+
+//#define DEBUG_PWM
+
+#ifdef DEBUG_PWM
+#define DPRINTF(fmt, ...) \
+do { fprintf(stdout, "PWM: [%24s:%5d] " fmt, __func__, __LINE__, \
+## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while (0)
+#endif
+
+#define EXYNOS4210_PWM_TIMERS_NUM  5
+#define EXYNOS4210_PWM_REG_MEM_SIZE 0x50
+
+#define TCFG0        0x0000
+#define TCFG1        0x0004
+#define TCON 0x0008
+#define TCNTB0   0x000C
+#define TCMPB0   0x0010
+#define TCNTO0   0x0014
+#define TCNTB1   0x0018
+#define TCMPB1   0x001C
+#define TCNTO1   0x0020
+#define TCNTB2   0x0024
+#define TCMPB2   0x0028
+#define TCNTO2   0x002C
+#define TCNTB3   0x0030
+#define TCMPB3   0x0034
+#define TCNTO3   0x0038
+#define TCNTB4   0x003C
+#define TCNTO4   0x0040
+#define TINT_CSTAT   0x0044
+
+#define TCNTB(x)(0xC * (x))
+#define TCMPB(x)(0xC * (x) + 1)
+#define TCNTO(x)(0xC * (x) + 2)
+
+#define GET_PRESCALER(reg, x) (((reg) & (0xFF << (8 * (x)))) >> 8 * (x))
+#define GET_DIVIDER(reg, x) (1 << (((reg) & (0xF << (4 * (x)))) >> (4 * (x))))
+
+/*
+ * Attention! Timer4 doesn't have OUTPUT_INVERTER,
+ * so Auto Reload bit is not accessible by macros!
+ */
+#define TCON_TIMER_BASE(x)  (((x) ? 1 : 0) * 4 + 4 * (x))
+#define TCON_TIMER_START(x)         (1 << (TCON_TIMER_BASE(x) + 0))
+#define TCON_TIMER_MANUAL_UPD(x)    (1 << (TCON_TIMER_BASE(x) + 1))
+#define TCON_TIMER_OUTPUT_INV(x)    (1 << (TCON_TIMER_BASE(x) + 2))
+#define TCON_TIMER_AUTO_RELOAD(x)   (1 << (TCON_TIMER_BASE(x) + 3))
+#define TCON_TIMER4_AUTO_RELOAD     (1 << 22)
+
+#define TINT_CSTAT_STATUS(x)        (1 << (5 + (x)))
+#define TINT_CSTAT_ENABLE(x)  

Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Jan Kiszka
On 2012-01-20 12:45, Daniel P. Berrange wrote:
> On Fri, Jan 20, 2012 at 12:13:48PM +0100, Jan Kiszka wrote:
>> On 2012-01-20 11:25, Daniel P. Berrange wrote:
>>> On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
>>>> On 2012-01-20 11:14, Marcelo Tosatti wrote:
>>>>> On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
>>>>>> On 2012-01-19 18:53, Marcelo Tosatti wrote:
>>>>>>>> What problems does it cause, and in which scenarios? Can't they be
>>>>>>>> fixed?
>>>>>>>
>>>>>>> If the guest compensates for lost ticks, and KVM reinjects them, guest
>>>>>>> time advances faster than it should, to the extent where NTP fails to
>>>>>>> correct it. This is the case with RHEL4.
>>>>>>>
>>>>>>> But for example v2.4 kernel (or Windows with non-acpi HAL) do not
>>>>>>> compensate. In that case you want KVM to reinject.
>>>>>>>
>>>>>>> I don't know of any other way to fix this.
>>>>>>
>>>>>> OK, i see. The old unsolved problem of guessing what is being executed.
>>>>>>
>>>>>> Then the next question is how and where to control this. Conceptually,
>>>>>> there should rather be a global switch say "compensate for lost ticks of
>>>>>> periodic timers: yes/no" - instead of a per-timer knob. Didn't we
>>>>>> discuss something like this before?
>>>>>
>>>>> I don't see the advantage of a global control versus per device
>>>>> control (in fact it lowers flexibility).
>>>>
>>>> Usability. Users should not have to care about individual tick-based
>>>> clocks. They care about "my OS requires lost ticks compensation, yes or
>>>> no".
>>>
>>> FYI, at the libvirt level we model policy against individual timers, for
>>> example:
>>>
>>>   <clock offset="localtime">
>>>     <timer name="rtc" tickpolicy="catchup" track="guest"/>
>>>     <timer name="pit" tickpolicy="delay"/>
>>>   </clock>
>>
>> Are the various modes of tickpolicy fully specified somewhere?
>
> There are some (not all that great) docs here:
>
>   http://libvirt.org/formatdomain.html#elementsTime
>
> The meaning of the 4 policies are:
>
>   delay: continue to deliver at normal rate

What does this mean? The timer stops ticking until the guest accepts its
ticks again?

> catchup: deliver at higher rate to catchup
>   merge: ticks merged into 1 single tick
> discard: all missed ticks are discarded

But those interpretations aren't stated in the docs. That makes it hard
to map them on individual hypervisors - or model proper new hypervisor
interfaces accordingly.
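
For discussion purposes, one possible reading of the four one-line summaries above can be written down as code. This is only a sketch of that reading, not libvirt's or QEMU's specification; the enum and function names are invented here, and the "delay" case in particular is exactly the point still in question.

```c
#include <stdint.h>

/* Hypothetical labels mirroring the four tickpolicy values by name only. */
typedef enum {
    TICK_POLICY_DELAY,
    TICK_POLICY_CATCHUP,
    TICK_POLICY_MERGE,
    TICK_POLICY_DISCARD
} TickPolicy;

/* How many of the 'missed' ticks are still injected once the guest
 * accepts ticks again, under the reading sketched above. */
uint64_t ticks_to_replay(TickPolicy policy, uint64_t missed)
{
    switch (policy) {
    case TICK_POLICY_DELAY:    /* all of them, at the normal rate */
    case TICK_POLICY_CATCHUP:  /* all of them, at a higher rate */
        return missed;
    case TICK_POLICY_MERGE:    /* collapsed into a single tick */
        return missed ? 1 : 0;
    case TICK_POLICY_DISCARD:  /* dropped entirely */
    default:
        return 0;
    }
}

/* Only "catchup" delivers the backlog faster than the nominal period. */
int replay_is_accelerated(TickPolicy policy)
{
    return policy == TICK_POLICY_CATCHUP;
}
```

Under this reading, "delay" and "catchup" differ only in delivery rate, which is why the guest's clock would lag under the former and recover under the latter.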

 
 
> The original design rationale was here, though beware that some things
> changed between this design & the actual implementation libvirt has:
>
>   https://www.redhat.com/archives/libvir-list/2010-March/msg00304.html
>
> Regards,
> Daniel

Given that there is almost no tick compensation in QEMU yet (ignoring
the awful RTC hack for now), this is a good time to establish a useful
generic interface with the advent of the KVM device models.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [RFC 0/5]: QMP: add balloon-get-memory-stats command

2012-01-20 Thread Luiz Capitulino
On Thu, 19 Jan 2012 10:15:55 -0700
Eric Blake ebl...@redhat.com wrote:

> On 01/19/2012 08:56 AM, Luiz Capitulino wrote:
>> Long ago, commit 625a5be added the guest provided memory statistics to
>> the query-balloon command. Unfortunately, it also introduced a severe
>> bug: query-balloon would hang if the guest didn't respond. This, in turn,
>> would also cause a hang in libvirt.
>>
>> Because of that, we decided to disable the guest memory stats feature
>> (commit 11724ff).
>>
>> As we decided to let commands implement ad-hoc async mechanisms until we
>> get a proper way to do it, I decided to try to re-enable that feature.
>>
>> My idea is to have a command and an event. The command gets the process
>> started by sending a request to guest and returns. Later, when the guest
>> makes the memory stats info available, it's sent to the client by means
>> of a QMP event (please, take a look at patch 05/05 for full details).
>>
>> I'm not sure if that approach is good for libvirt though, so it would be
>> very helpful to get their input (Eric, I'm CC'ing you here, but feel free
>> to route this to someone else).
>
> [I went ahead and cc'd the libvirt list]
>
> Yes, libvirt can live with this approach.  And having this in parallel
> to a qemu-ga verb is nice, since, as it was pointed out, this would
> allow interaction with guests that have a balloon device but not a guest
> agent.
>
> You may want to read this thread [1], for thoughts on the impact of
> making another existing blocking command be extended into one that
> starts an async event and ends when an event is raised; libvirt can
> expose both a blocking and an asynchronous implementation to the user on
> top of the qemu model being just asynchronous.
> [1] https://www.redhat.com/archives/libvir-list/2012-January/msg00562.html
>
> Thinking aloud - do we need a means to poll the state of the
> balloon-stat query?

We could have a query-balloon-memory-stats command that returns the last
available stats (or none, if balloon-get-memory-stats wasn't issued), and
I think that it would be better to move the stats info from the event to
the query command too, this way the event would just signal that the stats
info are available.

I find that approach a bit more complicated though.

> On the one hand, if libvirtd issues the start
> command, then gets stopped, then the event occurs, then libvirtd is
> restarted, then libvirt won't know that the event was missed.  On the
> other hand, since this involves guest interaction, libvirt already has
> to assume that the guest may be malicious and refuse to report stats
> and/or report invalid stats, so libvirt would already have to be
> prepared to give up if no event has arrived in a fixed amount of time,
> and that also means that restarting libvirtd can just ignore any balloon
> query that was in flight before the restart.

Yes, there's no guarantee the event will be ever sent. If it doesn't
arrive after a fixed amount of time, the best thing to do is to issue
the start command again.
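
As a rough illustration of that client-side behaviour, here is a minimal sketch of the bookkeeping a management application could keep. Every name in it is hypothetical; it does not model how the command is actually sent or how libvirt is structured, only the "start, wait with a deadline, re-issue on timeout, tolerate unexpected events" logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical client-side state for one outstanding stats request. */
typedef struct {
    bool     pending;      /* a start command is waiting for its event */
    int64_t  deadline_ms;  /* when to stop waiting for that event */
    unsigned starts_sent;  /* how many start commands were issued */
} StatsRequest;

/* Issue (or re-issue) the start command.  Sending it is not modeled. */
void stats_request_start(StatsRequest *r, int64_t now_ms, int64_t timeout_ms)
{
    r->pending = true;
    r->deadline_ms = now_ms + timeout_ms;
    r->starts_sent++;
}

/* Called when the stats event arrives.  Returns true if it answers an
 * outstanding request, false if it was unexpected and can be ignored. */
bool stats_request_on_event(StatsRequest *r)
{
    bool expected = r->pending;

    r->pending = false;
    return expected;
}

/* Called periodically.  Re-issues the start command once the deadline has
 * passed without an event; returns true if a retry was sent. */
bool stats_request_poll(StatsRequest *r, int64_t now_ms, int64_t timeout_ms)
{
    if (r->pending && now_ms >= r->deadline_ms) {
        stats_request_start(r, now_ms, timeout_ms);
        return true;
    }
    return false;
}
```

With this shape, a restarted client simply begins with an empty StatsRequest, and any event from a request made before the restart is reported as unexpected.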

> So I guess I'm okay with just a start and an event, with no poll of the
> last-known guest response.  But it does mean that qemu has to gracefully
> handle if libvirt makes two start requests in a row without any
> intervening events, and conversely that libvirt has to be prepared for
> an event that happens even when libvirt doesn't remember triggering the
> start command.

There could be intervening events. Everything can happen between the
start command and the event (I/O Error, VM stop, etc). Libvirt has to be
prepared for that.

 
>> Another interesting point is that, there's another way of doing this and
>> it's using qemu-ga instead. That is, qemu-ga could read that information
>> from proc and return it. This is easier & simpler, as it doesn't involve
>> guest communication. We also could return a lot more information if needed.
>> The only disadvantage I can see is the dependency on qemu-ga...
>
> Most likely, we would want to teach libvirt to use both methods, and
> give the choice to the user on which approach to use when the guest
> supports both.
 




[Qemu-devel] [PATCH] m48t59: use rtc_clock for alarm timer

2012-01-20 Thread Paolo Bonzini
This lets the RTC get adjustments from the host NTP client.
The watchdog still uses the vm_clock.  The previous behavior is
available with -rtc clock=vm.

Cc: Andreas Färber afaer...@suse.de
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/m48t59.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/m48t59.c b/hw/m48t59.c
index c043996..fd5dc00 100644
--- a/hw/m48t59.c
+++ b/hw/m48t59.c
@@ -126,7 +126,7 @@ static void alarm_cb (void *opaque)
 /* Repeat once a second */
 next_time = 1;
 }
-qemu_mod_timer(NVRAM->alrm_timer, qemu_get_clock_ns(vm_clock) +
+qemu_mod_timer(NVRAM->alrm_timer, qemu_get_clock_ns(rtc_clock) +
 next_time * 1000);
 qemu_set_irq(NVRAM->IRQ, 0);
 }
@@ -687,7 +687,7 @@ static void m48t59_init_common(M48t59State *s)
 {
 s->buffer = g_malloc0(s->size);
 if (s->type == 59) {
-s->alrm_timer = qemu_new_timer_ns(vm_clock, &alarm_cb, s);
+s->alrm_timer = qemu_new_timer_ns(rtc_clock, &alarm_cb, s);
 s->wd_timer = qemu_new_timer_ns(vm_clock, &watchdog_cb, s);
 }
 qemu_get_timedate(&s->alarm, 0);
-- 
1.7.7.1




Re: [Qemu-devel] [RFC 0/5]: QMP: add balloon-get-memory-stats command

2012-01-20 Thread Luiz Capitulino
On Thu, 19 Jan 2012 15:50:06 -0600
Adam Litke a...@us.ibm.com wrote:

> On Thu, Jan 19, 2012 at 01:56:26PM -0200, Luiz Capitulino wrote:
>> Long ago, commit 625a5be added the guest provided memory statistics to
>> the query-balloon command. Unfortunately, it also introduced a severe
>> bug: query-balloon would hang if the guest didn't respond. This, in turn,
>> would also cause a hang in libvirt.
>>
>> Because of that, we decided to disable the guest memory stats feature
>> (commit 11724ff).
>>
>> As we decided to let commands implement ad-hoc async mechanisms until we
>> get a proper way to do it, I decided to try to re-enable that feature.
>>
>> My idea is to have a command and an event. The command gets the process
>> started by sending a request to guest and returns. Later, when the guest
>> makes the memory stats info available, it's sent to the client by means
>> of a QMP event (please, take a look at patch 05/05 for full details).
>>
>> I'm not sure if that approach is good for libvirt though, so it would be
>> very helpful to get their input (Eric, I'm CC'ing you here, but feel free
>> to route this to someone else).
>>
>> Another interesting point is that, there's another way of doing this and
>> it's using qemu-ga instead. That is, qemu-ga could read that information
>> from proc and return it. This is easier & simpler, as it doesn't involve
>> guest communication. We also could return a lot more information if needed.
>> The only disadvantage I can see is the dependency on qemu-ga...
>>
>>  QMP/qmp-events.txt  |   28 
>>  balloon.c   |   47 +--
>>  balloon.h   |7 ---
>>  hmp.c   |   25 +
>>  hw/virtio-balloon.c |   39 +++
>>  monitor.c   |3 +++
>>  monitor.h   |1 +
>>  qapi-schema.json|   42 ++
>>  qmp-commands.hx |6 ++
>>  9 files changed, 129 insertions(+), 69 deletions(-)
>>
>
> The patch series looks good.  Thanks for making the improvements.  Once it is
> upstream I can help out with the libvirt pieces.  I tested it with a Fedora-15
> VM and it worked out of the box :)  I just have one small nit.  Due to the way
> that the virtio stats queue works, the guest emits a stats event with bogus
> values when the balloon driver initializes (which gives the host control of the
> channel).  Your patches emit this initial event (which contains undefined
> values).  In my old code, we kept a boolean flag in the balloon device to record
> if stats have been requested and only if that was set would we raise the event.
> Without this, the guest can spam the host with an unlimited number of bogus
> events.

Yeah, I've seen this too. I'll fix it. We also need migration and HMP
support. Although the latter seems difficult as we don't have a way for
qemu subsystems to wait for events yet.
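
The flag Adam describes could look roughly like the sketch below. It is a standalone illustration with invented names (it is not the virtio-balloon device code): a counter stands in for raising the QMP event, and the event is only raised when a request is outstanding.

```c
#include <stdbool.h>

/* Hypothetical per-device state; names are illustrative only. */
typedef struct {
    bool     stats_requested;  /* set when the host asks for stats */
    unsigned events_raised;    /* stands in for emitting the QMP event */
} BalloonStatsGate;

/* Host side: the start command was issued. */
void balloon_stats_request(BalloonStatsGate *g)
{
    g->stats_requested = true;
}

/* Guest side: a stats buffer arrived.  Returns true if an event was
 * raised, false if the buffer was unsolicited and silently dropped
 * (for instance the one pushed when the guest driver initializes). */
bool balloon_stats_received(BalloonStatsGate *g)
{
    if (!g->stats_requested) {
        return false;
    }
    g->stats_requested = false;
    g->events_raised++;
    return true;
}
```

Clearing the flag on delivery means each request yields at most one event, so a guest that keeps pushing buffers cannot flood the client.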

> Tested-by: Adam Litke a...@us.ibm.com

Thanks!

 




[Qemu-devel] [PATCH 0/4] Use rtc_clock uniformly for ARM

2012-01-20 Thread Paolo Bonzini
This series uses rtc_clock uniformly in device models that provide RTC
functionality.  This will let users choose the desired semantics for
the clock.

This is most important with qtest, where -rtc clock=vm will provide
determinism and let you run tests that fake execution for large time
periods.  However, for consistency I'm switching also the two RTC
models that always used the vm_clock, m48t59 and pl031.  m48t59 is
not ARM so I'm sending it separately.

Patch 3 fixes an unrelated bug in the pl031 migration code.

Paolo Bonzini (4):
  rtc: add -rtc clock=rt
  arm: switch real-time clocks to rtc_clock
  pl031: rearm alarm timer upon load
  pl031: switch clock base to rtc_clock

 hw/omap1.c  |   10 +++---
 hw/pl031.c  |   76 ---
 hw/pxa2xx.c |   28 ++--
 hw/strongarm.c  |   10 +++---
 hw/twl92230.c   |9 +++---
 qemu-options.hx |7 +++--
 vl.c|2 +
 7 files changed, 85 insertions(+), 57 deletions(-)

-- 
1.7.7.1




[Qemu-devel] [PATCH 1/4] rtc: add -rtc clock=rt

2012-01-20 Thread Paolo Bonzini
This will let people use backwards-compatible semantics for devices that
will be affected by the following patch.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 qemu-options.hx |7 ---
 vl.c|2 ++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 6295cde..da311f0 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2360,7 +2360,7 @@ DEF("localtime", 0, QEMU_OPTION_localtime, "", QEMU_ARCH_ALL)
 DEF("startdate", HAS_ARG, QEMU_OPTION_startdate, "", QEMU_ARCH_ALL)
 
 DEF("rtc", HAS_ARG, QEMU_OPTION_rtc, \
-"-rtc [base=utc|localtime|date][,clock=host|vm][,driftfix=none|slew]\n" \
+"-rtc [base=utc|localtime|date][,clock=host|rt|vm][,driftfix=none|slew]\n" \
 "                set the RTC base and clock, enable drift fix for clock ticks (x86 only)\n",
 QEMU_ARCH_ALL)
 
@@ -2376,8 +2376,9 @@ format @code{2006-06-17T16:01:21} or @code{2006-06-17}. The default base is UTC.
 By default the RTC is driven by the host system time. This allows to use the
 RTC as accurate reference clock inside the guest, specifically if the host
 time is smoothly following an accurate external reference clock, e.g. via NTP.
-If you want to isolate the guest time from the host, even prevent it from
-progressing during suspension, you can set @option{clock} to @code{vm} instead.
+If you want to isolate the guest time from the host, you can set @option{clock}
+to @code{rt} instead.  To even prevent it from progressing during suspension,
+you can set it to @code{vm}.
 
 Enable @option{driftfix} (i386 targets only) if you experience time drift problems,
 specifically with Windows' ACPI HAL. This option will try to figure out how
diff --git a/vl.c b/vl.c
index ba55b35..6ad67a6 100644
--- a/vl.c
+++ b/vl.c
@@ -537,6 +537,8 @@ static void configure_rtc(QemuOpts *opts)
 if (value) {
 if (!strcmp(value, "host")) {
 rtc_clock = host_clock;
+} else if (!strcmp(value, "rt")) {
+rtc_clock = rt_clock;
 } else if (!strcmp(value, "vm")) {
 rtc_clock = vm_clock;
 } else {
-- 
1.7.7.1





[Qemu-devel] [PATCH 2/4] arm: switch real-time clocks to rtc_clock

2012-01-20 Thread Paolo Bonzini
This lets the user specify the desired semantics.  By default, the RTC
will follow adjustments from the host's NTP client.  -rtc clock=vm will
improve determinism with both icount and qtest.  Finally, the previous
behavior is available with -rtc clock=rt.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/omap1.c |   10 +-
 hw/pxa2xx.c|   28 ++--
 hw/strongarm.c |   10 +-
 hw/twl92230.c  |9 +
 4 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/hw/omap1.c b/hw/omap1.c
index 1aa5f23..e9b19a2 100644
--- a/hw/omap1.c
+++ b/hw/omap1.c
@@ -2888,7 +2888,7 @@ static void omap_rtc_reset(struct omap_rtc_s *s)
 s->pm_am = 0;
 s->auto_comp = 0;
 s->round = 0;
-s->tick = qemu_get_clock_ms(rt_clock);
+s->tick = qemu_get_clock_ms(rtc_clock);
 memset(&s->alarm_tm, 0, sizeof(s->alarm_tm));
 s->alarm_tm.tm_mday = 0x01;
 s->status = 1 << 7;
@@ -2909,7 +2909,7 @@ static struct omap_rtc_s *omap_rtc_init(MemoryRegion *system_memory,
 
 s->irq = timerirq;
 s->alarm = alarmirq;
-s->clk = qemu_new_timer_ms(rt_clock, omap_rtc_tick, s);
+s->clk = qemu_new_timer_ms(rtc_clock, omap_rtc_tick, s);
 
 omap_rtc_reset(s);
 
@@ -3497,9 +3497,9 @@ static void omap_lpg_tick(void *opaque)
 struct omap_lpg_s *s = opaque;
 
 if (s->cycle)
-qemu_mod_timer(s->tm, qemu_get_clock_ms(rt_clock) + s->period - s->on);
+qemu_mod_timer(s->tm, qemu_get_clock_ms(rtc_clock) + s->period - s->on);
 else
-qemu_mod_timer(s->tm, qemu_get_clock_ms(rt_clock) + s->on);
+qemu_mod_timer(s->tm, qemu_get_clock_ms(rtc_clock) + s->on);
 
 s->cycle = !s->cycle;
 printf("%s: LED is %s\n", __FUNCTION__, s->cycle ? "on" : "off");
@@ -3617,7 +3617,7 @@ static struct omap_lpg_s *omap_lpg_init(MemoryRegion *system_memory,
 struct omap_lpg_s *s = (struct omap_lpg_s *)
 g_malloc0(sizeof(struct omap_lpg_s));
 
-s->tm = qemu_new_timer_ms(rt_clock, omap_lpg_tick, s);
+s->tm = qemu_new_timer_ms(rtc_clock, omap_lpg_tick, s);
 
 omap_lpg_reset(s);
 
diff --git a/hw/pxa2xx.c b/hw/pxa2xx.c
index 6ddd500..35816a5 100644
--- a/hw/pxa2xx.c
+++ b/hw/pxa2xx.c
@@ -875,7 +875,7 @@ static inline void pxa2xx_rtc_int_update(PXA2xxRTCState *s)
 
 static void pxa2xx_rtc_hzupdate(PXA2xxRTCState *s)
 {
-int64_t rt = qemu_get_clock_ms(rt_clock);
+int64_t rt = qemu_get_clock_ms(rtc_clock);
 s->last_rcnr += ((rt - s->last_hz) << 15) /
 (1000 * ((s->rttr & 0xffff) + 1));
 s->last_rdcr += ((rt - s->last_hz) << 15) /
@@ -885,7 +885,7 @@ static void pxa2xx_rtc_hzupdate(PXA2xxRTCState *s)
 
 static void pxa2xx_rtc_swupdate(PXA2xxRTCState *s)
 {
-int64_t rt = qemu_get_clock_ms(rt_clock);
+int64_t rt = qemu_get_clock_ms(rtc_clock);
 if (s->rtsr & (1 << 12))
 s->last_swcr += (rt - s->last_sw) / 10;
 s->last_sw = rt;
@@ -893,7 +893,7 @@ static void pxa2xx_rtc_swupdate(PXA2xxRTCState *s)
 
 static void pxa2xx_rtc_piupdate(PXA2xxRTCState *s)
 {
-int64_t rt = qemu_get_clock_ms(rt_clock);
+int64_t rt = qemu_get_clock_ms(rtc_clock);
 if (s->rtsr & (1 << 15))
 s->last_swcr += rt - s->last_pi;
 s->last_pi = rt;
@@ -1019,16 +1019,16 @@ static uint64_t pxa2xx_rtc_read(void *opaque, target_phys_addr_t addr,
 case PIAR:
 return s->piar;
 case RCNR:
-return s->last_rcnr + ((qemu_get_clock_ms(rt_clock) - s->last_hz) << 15) /
+return s->last_rcnr + ((qemu_get_clock_ms(rtc_clock) - s->last_hz) << 15) /
 (1000 * ((s->rttr & 0xffff) + 1));
 case RDCR:
-return s->last_rdcr + ((qemu_get_clock_ms(rt_clock) - s->last_hz) << 15) /
+return s->last_rdcr + ((qemu_get_clock_ms(rtc_clock) - s->last_hz) << 15) /
 (1000 * ((s->rttr & 0xffff) + 1));
 case RYCR:
 return s->last_rycr;
 case SWCR:
 if (s->rtsr & (1 << 12))
-return s->last_swcr + (qemu_get_clock_ms(rt_clock) - s->last_sw) / 10;
+return s->last_swcr + (qemu_get_clock_ms(rtc_clock) - s->last_sw) / 10;
 else
 return s->last_swcr;
 default:
@@ -1168,14 +1168,14 @@ static int pxa2xx_rtc_init(SysBusDevice *dev)
 s->last_swcr = (tm.tm_hour << 19) |
 (tm.tm_min << 13) | (tm.tm_sec << 7);
 s->last_rtcpicr = 0;
-s->last_hz = s->last_sw = s->last_pi = qemu_get_clock_ms(rt_clock);
-
-s->rtc_hz    = qemu_new_timer_ms(rt_clock, pxa2xx_rtc_hz_tick,    s);
-s->rtc_rdal1 = qemu_new_timer_ms(rt_clock, pxa2xx_rtc_rdal1_tick, s);
-s->rtc_rdal2 = qemu_new_timer_ms(rt_clock, pxa2xx_rtc_rdal2_tick, s);
-s->rtc_swal1 = qemu_new_timer_ms(rt_clock, pxa2xx_rtc_swal1_tick, s);
-s->rtc_swal2 = qemu_new_timer_ms(rt_clock, pxa2xx_rtc_swal2_tick, s);
-s->rtc_pi    = qemu_new_timer_ms(rt_clock, pxa2xx_rtc_pi_tick,    s);
+s->last_hz = s->last_sw = s->last_pi = qemu_get_clock_ms(rtc_clock);
+
+s->rtc_hz    = qemu_new_timer_ms(rtc_clock, pxa2xx_rtc_hz_tick,    s);
+s->rtc_rdal1 = qemu_new_timer_ms(rtc_clock, 

[Qemu-devel] [PATCH 4/4] pl031: switch clock base to rtc_clock

2012-01-20 Thread Paolo Bonzini
This lets the user specify the desired semantics.  By default, the RTC
will follow adjustments from the host's NTP client, and will remain in
sync when the virtual machine is stopped.  The previous behavior, which
provides determinism with both icount and qtest, remains available with
-rtc clock=vm.

pl031 supports migration, so we need to convert the time
base from rtc_clock to vm_clock and back for backwards compatibility.

The device model is kind of broken, because it stores the offset with
second precision rather than nanosecond.  The alarm timer thus may be
off by up to one second.  Anyway, this is unrelated to this change.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/pl031.c |   38 ++
 1 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/hw/pl031.c b/hw/pl031.c
index 1ca4c23..1a4de99 100644
--- a/hw/pl031.c
+++ b/hw/pl031.c
@@ -13,6 +13,7 @@
 
 #include "sysbus.h"
 #include "qemu-timer.h"
+#include "sysemu.h"
 
 //#define DEBUG_PL031
 
@@ -38,6 +39,11 @@ typedef struct {
     QEMUTimer *timer;
     qemu_irq irq;
 
+    /* Needed to preserve the tick_count across migration, even if the
+     * absolute value of the rtc_clock is different on the source and
+     * destination.
+     */
+    uint32_t tick_offset_vmstate;
     uint32_t tick_offset;
 
     uint32_t mr;
@@ -68,27 +74,23 @@ static void pl031_interrupt(void * opaque)
 
 static uint32_t pl031_get_count(pl031_state *s)
 {
-    /* This assumes qemu_get_clock_ns returns the time since the machine was
-       created.  */
-    return s->tick_offset + qemu_get_clock_ns(vm_clock) / get_ticks_per_sec();
+    int64_t now = qemu_get_clock_ns(rtc_clock);
+    return s->tick_offset + now / get_ticks_per_sec();
 }
 
 static void pl031_set_alarm(pl031_state *s)
 {
-    int64_t now;
     uint32_t ticks;
 
-    now = qemu_get_clock_ns(vm_clock);
-    ticks = s->tick_offset + now / get_ticks_per_sec();
-
     /* The timer wraps around.  This subtraction also wraps in the same way,
        and gives correct results when alarm < now_ticks.  */
-    ticks = s->mr - ticks;
+    ticks = s->mr - pl031_get_count(s);
     DPRINTF("Alarm set in %ud ticks\n", ticks);
     if (ticks == 0) {
         qemu_del_timer(s->timer);
         pl031_interrupt(s);
     } else {
+        int64_t now = qemu_get_clock_ns(rtc_clock);
         qemu_mod_timer(s->timer, now + (int64_t)ticks * get_ticks_per_sec());
     }
 }
@@ -190,18 +192,29 @@ static int pl031_init(SysBusDevice *dev)
     sysbus_init_mmio(dev, &s->iomem);
 
     sysbus_init_irq(dev, &s->irq);
-    /* ??? We assume vm_clock is zero at this point.  */
     qemu_get_timedate(&tm, 0);
-    s->tick_offset = mktimegm(&tm);
+    s->tick_offset = mktimegm(&tm) - qemu_get_clock_ns(rtc_clock) / get_ticks_per_sec();
 
-    s->timer = qemu_new_timer_ns(vm_clock, pl031_interrupt, s);
+    s->timer = qemu_new_timer_ns(rtc_clock, pl031_interrupt, s);
     return 0;
 }
 
+static void pl031_pre_save(void *opaque)
+{
+    pl031_state *s = opaque;
+
+    /* tick_offset is base_time - rtc_clock base time.  Instead, we want to
+     * store the base time relative to the vm_clock for backwards-compatibility.  */
+    int64_t delta = qemu_get_clock_ns(rtc_clock) - qemu_get_clock_ns(vm_clock);
+    s->tick_offset_vmstate = s->tick_offset + delta / get_ticks_per_sec();
+}
+
 static int pl031_post_load(void *opaque, int version_id)
 {
     pl031_state *s = opaque;
 
+    int64_t delta = qemu_get_clock_ns(rtc_clock) - qemu_get_clock_ns(vm_clock);
+    s->tick_offset = s->tick_offset_vmstate - delta / get_ticks_per_sec();
     pl031_set_alarm(s);
     return 0;
 }
@@ -210,9 +223,10 @@ static const VMStateDescription vmstate_pl031 = {
     .name = "pl031",
     .version_id = 1,
     .minimum_version_id = 1,
+    .pre_save = pl031_pre_save,
     .post_load = pl031_post_load,
     .fields = (VMStateField[]) {
-        VMSTATE_UINT32(tick_offset, pl031_state),
+        VMSTATE_UINT32(tick_offset_vmstate, pl031_state),
         VMSTATE_UINT32(mr, pl031_state),
         VMSTATE_UINT32(lr, pl031_state),
         VMSTATE_UINT32(cr, pl031_state),
-- 
1.7.7.1
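
The time-base conversion that the commit message describes (rtc_clock to vm_clock on save, and back on load) can be tried outside QEMU. What follows is only a minimal standalone sketch, not the device code: the function names and `NS_PER_SEC` are made up for illustration, the clock readings are passed in as plain parameters, and it assumes both clocks count nanoseconds so that one RTC tick is one second.

```c
#include <stdint.h>

/* Assumption: clocks count nanoseconds, so one RTC tick (1 s) is 1e9 units. */
#define NS_PER_SEC INT64_C(1000000000)

/* Guest-visible counter: stored offset plus whole seconds of the RTC clock. */
static uint32_t sketch_get_count(uint32_t tick_offset, int64_t rtc_now_ns)
{
    return tick_offset + (uint32_t)(rtc_now_ns / NS_PER_SEC);
}

/* Save direction: re-express the offset relative to the VM clock. */
static uint32_t sketch_offset_to_vmstate(uint32_t tick_offset,
                                         int64_t rtc_now_ns,
                                         int64_t vm_now_ns)
{
    int64_t delta = rtc_now_ns - vm_now_ns;
    return tick_offset + (uint32_t)(delta / NS_PER_SEC);
}

/* Load direction: convert the saved value back using the local clocks. */
static uint32_t sketch_offset_from_vmstate(uint32_t saved,
                                           int64_t rtc_now_ns,
                                           int64_t vm_now_ns)
{
    int64_t delta = rtc_now_ns - vm_now_ns;
    return saved - (uint32_t)(delta / NS_PER_SEC);
}
```

With whole-second clock readings, saving on a source whose RTC clock reads 5000 s and loading on a destination whose RTC clock reads 9000 s (VM clock at 100 s on both sides) leaves the guest-visible count unchanged. With sub-second fractions the separate integer divisions can disagree by one tick, which matches the second-precision caveat in the commit message.
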



[Qemu-devel] [PATCH 3/4] pl031: rearm alarm timer upon load

2012-01-20 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 hw/pl031.c |   40 +---
 1 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/hw/pl031.c b/hw/pl031.c
index 2fb0c8e..1ca4c23 100644
--- a/hw/pl031.c
+++ b/hw/pl031.c
@@ -47,21 +47,6 @@ typedef struct {
     uint32_t is;
 } pl031_state;
 
-static const VMStateDescription vmstate_pl031 = {
-    .name = "pl031",
-    .version_id = 1,
-    .minimum_version_id = 1,
-    .fields = (VMStateField[]) {
-        VMSTATE_UINT32(tick_offset, pl031_state),
-        VMSTATE_UINT32(mr, pl031_state),
-        VMSTATE_UINT32(lr, pl031_state),
-        VMSTATE_UINT32(cr, pl031_state),
-        VMSTATE_UINT32(im, pl031_state),
-        VMSTATE_UINT32(is, pl031_state),
-        VMSTATE_END_OF_LIST()
-    }
-};
-
 static const unsigned char pl031_id[] = {
     0x31, 0x10, 0x14, 0x00, /* Device ID */
     0x0d, 0xf0, 0x05, 0xb1  /* Cell ID */
@@ -213,6 +198,31 @@ static int pl031_init(SysBusDevice *dev)
     return 0;
 }
 
+static int pl031_post_load(void *opaque, int version_id)
+{
+    pl031_state *s = opaque;
+
+    pl031_set_alarm(s);
+    return 0;
+}
+
+static const VMStateDescription vmstate_pl031 = {
+    .name = "pl031",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = pl031_post_load,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(tick_offset, pl031_state),
+        VMSTATE_UINT32(mr, pl031_state),
+        VMSTATE_UINT32(lr, pl031_state),
+        VMSTATE_UINT32(cr, pl031_state),
+        VMSTATE_UINT32(im, pl031_state),
+        VMSTATE_UINT32(is, pl031_state),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+
 static SysBusDeviceInfo pl031_info = {
     .init = pl031_init,
     .qdev.name = "pl031",
-- 
1.7.7.1





Re: [Qemu-devel] [PATCH v4 08/15] qmp: add block_job_cancel command

2012-01-20 Thread Luiz Capitulino
On Fri, 20 Jan 2012 09:30:45 +0100
Kevin Wolf kw...@redhat.com wrote:

 Am 20.01.2012 01:02, schrieb Eric Blake:
  On 01/06/2012 07:01 AM, Stefan Hajnoczi wrote:
  Add block_job_cancel, which stops an active block streaming operation.
  When the operation has been cancelled the new BLOCK_JOB_CANCELLED event
  is emitted.
 
  Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
  
  +++ b/hmp-commands.hx
  @@ -98,6 +98,20 @@ Set maximum speed for a background block operation.
   ETEXI
   
   {
  +    .name       = "block_job_cancel",
  +    .args_type  = "device:B",
  +    .params     = "device",
  +    .help       = "stop an active block streaming operation",
  +    .mhandler.cmd = hmp_block_job_cancel,
  +},
  +
  
  Looking at this from libvirt's perspective, would it be possible to give
  this a different name?  Then libvirt would know that if
  block_job_cancel_async exists, we have the official semantics; and if it
  doesn't exist, then we can try block_job_cancel as a fallback to see if
  we have the old blocking semantics.
  
  But by using the same name as the old unofficial blocking command, it's
  difficult to tell if we should expect an event, or whether completion of
  the command means completion of the cancel.
  
  On the other hand, I guess we can rely on completion of the command,
  followed by reading block job status to see if the job is still in
  flight, will tell us whether we need to worry about waiting for an event
  - if the job is complete (whether or not this command was the blocking
  variant), we are done; if the job is ongoing, we have the new semantics
  and can expect an event; and that only leaves the race of calling the
  command, then the job completes, then we query and see it done, then the
  event comes, where we just have to be ready to ignore an unexpected event.
 
 You're quoting the HMP part, is that intentional? You shouldn't be using
 this at all.
 
 Anyway, are there even any qemu versions out there that implement an
 older interface?
 
  +##
  +# @block_job_cancel:
  +#
  +# Stop an active block streaming operation.
  +#
  +# This command returns immediately after marking the active block 
  streaming
  +# operation for cancellation.  It is an error to call this command if no
  +# operation is in progress.
  +#
  +# The operation will cancel as soon as possible and then emit the
  +# BLOCK_JOB_CANCELLED event.  Before that happens the job is still 
  visible when
  +# enumerated using query-block-jobs.
  
  Is there any policy on _ vs - in command names?  It seems awkward to
  have block_job_cancel but query-block-jobs.
 
 block_job_cancel is HMP, whereas query-block-jobs is a QMP command. QMP
 uses - consistently. Not sure if HMP is consistent, but it tends to use _.

This very series broke QMP's consistency because it was designed when we
were following HMP's inconsistencies...
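
The fallback reasoning quoted earlier in this message (issue the cancel, then read the block job status to decide whether an event is still pending) can be written down as a tiny decision function. This is only an illustrative sketch of that reasoning: the enum, the function names, and the boolean input are hypothetical and are not part of libvirt or QEMU.

```c
#include <stdbool.h>

/* Hypothetical follow-up states for a management tool after cancelling. */
enum cancel_followup {
    CANCEL_DONE,           /* job already gone: nothing more to wait for */
    CANCEL_WAIT_FOR_EVENT  /* job still listed: expect a completion event */
};

/* Decide what to do once the cancel command itself has returned.
 * job_still_listed is the result of querying the block jobs afterwards. */
static enum cancel_followup after_cancel_returns(bool job_still_listed)
{
    return job_still_listed ? CANCEL_WAIT_FOR_EVENT : CANCEL_DONE;
}

/* A late event can still arrive after the job was seen as finished
 * (the race described in the quoted text); such an event is ignored. */
static bool event_is_expected(enum cancel_followup state)
{
    return state == CANCEL_WAIT_FOR_EVENT;
}
```
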



Re: [Qemu-devel] [SeaBIOS] Error booting from USB Storage Device in QEMU-KVM GIT MASTER

2012-01-20 Thread Dyweni - Qemu-Devel

Hi All,

I have good and bad news...

I tested QEMU-KVM using branches 'master'
(9501d0f1b6efc83f69d06b27a625bad71d30d58b) and 'uq/master'
(6a48ffaaa732b2142c1b5030178f2d4a0fa499fe). The SeaBIOS used was the version
included in those branches (no -L switch). Both branches failed to detect
the USB Flash Drive (error message: 'Unable to configure USB MSC device.').

I checked out SeaBIOS branch 'master'
(b3df857fe6d3fffb108379637ea4a456ce6e09ba) and passed that to QEMU-KVM
using the -L switch. With it, neither branch fails as badly. Both detect
the USB Flash Drive (message: 'USB MSC blksize=512 sectors=204800') and
then indicate they are booting from hard disk.

In order to see the console, I had to copy the following files into the
directory specified by the -L switch:
- vgabios-cirrus.bin
- pxe-rtl8139.rom
- vapic.bin

I also noticed one new regression: booting runs REALLY REALLY slow. The
upgrade from 0.14 w/ USB to git master w/ IDE slowed things down a little bit.
But this is orders of magnitude slower. While I'm waiting, I see one of my
cores at 100% usage.

---
Thanks,
Dyweni

On Thu, 19 Jan 2012 20:57:11 -0500, Kevin O'Connor wrote:


On Thu, Jan 19, 2012 at 06:57:59AM -0600, Dyweni - Qemu-Devel wrote:
Hi, I am unable to boot KVM using a usb flash drive. I'm using QEMU-KVM
built from GIT MASTER as of this morning. Here's my QEMU-KVM startup
options:

qemu-system-x86_64 -curses -m 512 -snapshot -device
piix3-usb-uhci -drive
id=usbflash,file=flash.img,if=none,boot=on,cache=writeback -device
usb-storage,drive=usbflash -net nic,macaddr=$(getmacpublic),vlan=0 -net
tap,vlan=0,ifname=$publictap,script=no -net
nic,macaddr=$(getmacprivate),vlan=1 -net
tap,vlan=1,ifname=$privatetap,script=no $*

I tried a modified version of the above, and it worked fine for me.

qemu-system-x86_64 -snapshot -L test -device piix3-usb-uhci -drive
id=usbflash,file=dos-drivec-new,if=none,cache=writeback -device
usb-storage,drive=usbflash -chardev stdio,id=seabios -device
isa-debugcon,iobase=0x402,chardev=seabios

What version of qemu are you using? There are a few known quirks in the
seabios code that were fixed recently, but I did not think they impacted
the qemu emulation.

-Kevin




Re: [Qemu-devel] bad USB tablet update rate on qemu-1.0

2012-01-20 Thread Erik Rull

Hi Andreas,

Andreas Färber wrote:

Hi Erik,

Am 19.01.2012 20:15, schrieb Erik Rull:

Andreas Färber wrote:

Am 19.01.2012 17:40, schrieb Erik Rull:

[...] there seems to be a
difference between the captured cursor for the native X-Windows window
and the VNC window that occurred somewhere between 0.14 and 1.0.


Then try `git bisect start v1.0 v0.14.0' to find out when exactly the
perceived behavior changed. :)



I just did a clone of the current qemu-kvm (which I use) and started to
bisect, but got an error, where I don't know how to proceed:
erik@debian:~/qemu-test/qemu-kvm$ git bisect good qemu-kvm-0.14.0
You need to start by git bisect start
Do you want me to do it for you [Y/n]?
erik@debian:~/qemu-test/qemu-kvm$ git bisect bad qemu-kvm-1.0
Bisecting: 2043 revisions left to test after this
fatal: Entry 'roms/seabios' not uptodate. Cannot merge.
erik@debian:~/qemu-test/qemu-kvm$


Hm, did you maybe previously do a `git submodule init'? You may need to
run `git submodule update' then (which may fail as recently for roms/SLOF).

Otherwise, generally when it does not compile you can try `git bisect
skip' to try a different commit.

Andreas


Updated to the latest git version from github.com and it's fine now. I did
a trial bisection by just marking commits good and bad at random. My test
system is available again on Monday. I will keep you updated.


Best regards,

Erik




Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Daniel P. Berrange
On Fri, Jan 20, 2012 at 01:00:06PM +0100, Jan Kiszka wrote:
 On 2012-01-20 12:45, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 12:13:48PM +0100, Jan Kiszka wrote:
  On 2012-01-20 11:25, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
  On 2012-01-20 11:14, Marcelo Tosatti wrote:
  On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
  On 2012-01-19 18:53, Marcelo Tosatti wrote:
  What problems does it cause, and in which scenarios? Can't they be
  fixed?
 
  If the guest compensates for lost ticks, and KVM reinjects them, guest
  time advances faster than it should, to the extent where NTP fails to
  correct it. This is the case with RHEL4.
 
  But for example a v2.4 kernel (or Windows with a non-ACPI HAL) does not
  compensate. In that case you want KVM to reinject.
 
  I don't know of any other way to fix this.
 
  OK, I see. The old unsolved problem of guessing what is being executed.
 
  Then the next question is how and where to control this. Conceptually,
  there should rather be a global switch saying "compensate for lost ticks
  of periodic timers: yes/no" - instead of a per-timer knob. Didn't we
  discuss something like this before?
 
  I don't see the advantage of a global control versus per device
  control (in fact it lowers flexibility).
 
  Usability. Users should not have to care about individual tick-based
  clocks. They care about "my OS requires lost ticks compensation, yes or
  no".
 
  FYI, at the libvirt level we model policy against individual timers, for
  example:
 
    <clock offset="localtime">
      <timer name="rtc" tickpolicy="catchup" track="guest"/>
      <timer name="pit" tickpolicy="delay"/>
    </clock>
 
  Are the various modes of tickpolicy fully specified somewhere?
  
  There are some (not all that great) docs here:
  
http://libvirt.org/formatdomain.html#elementsTime
  
  The meaning of the 4 policies are:
  
delay: continue to deliver at normal rate
 
 What does this mean? The timer stops ticking until the guest accepts its
 ticks again?

It means that the hypervisor will not attempt to do any compensation,
so the guest will see delays in its ticks being delivered & gradually
drift over time.

  catchup: deliver at higher rate to catchup
merge: ticks merged into 1 single tick
  discard: all missed ticks are discarded
 
 But those interpretations aren't stated in the docs. That makes it hard
 to map them on individual hypervisors - or model proper new hypervisor
 interfaces accordingly.

That's not a real problem; now that I notice they are missing from the docs,
I can just add them in.


Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH] m48t59: use rtc_clock for alarm timer

2012-01-20 Thread Andreas Färber
Am 20.01.2012 13:05, schrieb Paolo Bonzini:
 This lets the RTC get adjustments from the host NTP client.
 The watchdog still uses the vm_clock.  The previous behavior is
 available with -rtc clock=vm.
 

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com

Reviewed-by: Andreas Färber afaer...@suse.de
Cc: Blue

Andreas

 ---
  hw/m48t59.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/hw/m48t59.c b/hw/m48t59.c
 index c043996..fd5dc00 100644
 --- a/hw/m48t59.c
 +++ b/hw/m48t59.c
 @@ -126,7 +126,7 @@ static void alarm_cb (void *opaque)
          /* Repeat once a second */
          next_time = 1;
      }
 -    qemu_mod_timer(NVRAM->alrm_timer, qemu_get_clock_ns(vm_clock) +
 +    qemu_mod_timer(NVRAM->alrm_timer, qemu_get_clock_ns(rtc_clock) +
                      next_time * 1000);
      qemu_set_irq(NVRAM->IRQ, 0);
  }
 @@ -687,7 +687,7 @@ static void m48t59_init_common(M48t59State *s)
  {
      s->buffer = g_malloc0(s->size);
      if (s->type == 59) {
 -        s->alrm_timer = qemu_new_timer_ns(vm_clock, alarm_cb, s);
 +        s->alrm_timer = qemu_new_timer_ns(rtc_clock, alarm_cb, s);
          s->wd_timer = qemu_new_timer_ns(vm_clock, watchdog_cb, s);
      }
      qemu_get_timedate(&s->alarm, 0);

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [Spice-devel] Vioserial of Windows guest OS on Qemu 0.15

2012-01-20 Thread Vadim Rozenfeld
This code is slightly buggy.
Please try Yan's repository
at github (https://github.com/YanVugenfirer/).
I believe that the most critical changes
have been merged already by Yan into this
public repository.
I will ask to update binaries and sources at
fedoraproject site as well.

Best regards,
Vadim.


- Original Message -
From: Charles.Tsai-蔡清海-研究發展部 charles.t...@cloudena.com
To: Vadim Rozenfeld vroze...@redhat.com
Cc: Michael Roth mdr...@linux.vnet.ibm.com, Stefan Hajnoczi 
stefa...@gmail.com, spice-de...@lists.freedesktop.org, Alex Huang-黃必賢-研究發展部 
alex.hu...@cloudena.com, Alon Levy al...@redhat.com, qemu-devel 
qemu-devel@nongnu.org
Sent: Friday, January 20, 2012 3:25:51 AM
Subject: RE: [Qemu-devel] [Spice-devel] Vioserial of Windows guest OS on Qemu 
0.15

Vadim,

We downloaded the driver source from the following website.
===
wget --no-check-certificate 
https://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/src/virtio-win-prewhql-0.1-15-sources.zip

-Original Message-
From: Vadim Rozenfeld [mailto:vroze...@redhat.com]
Sent: Thursday, January 19, 2012 8:25 PM
To: Charles.Tsai-蔡清海-研究發展部
Cc: Michael Roth; Stefan Hajnoczi; spice-de...@lists.freedesktop.org; Alex 
Huang-黃必賢-研究發展部; Alon Levy; qemu-devel
Subject: Re: [Qemu-devel] [Spice-devel] Vioserial of Windows guest OS on Qemu 
0.15

Just to be sure that we are on the same page:
could you tell me about the origin of the source?
Is it the latest from the Yan's repository at github.com?

- Original Message -
From: Charles.Tsai-蔡清海-研究發展部 charles.t...@cloudena.com
To: Vadim Rozenfeld vroze...@redhat.com
Cc: Michael Roth mdr...@linux.vnet.ibm.com, Stefan Hajnoczi 
stefa...@gmail.com, spice-de...@lists.freedesktop.org, Alex Huang-黃必賢-研究發展部 
alex.hu...@cloudena.com, Alon Levy al...@redhat.com, qemu-devel 
qemu-devel@nongnu.org
Sent: Thursday, January 19, 2012 12:06:16 PM
Subject: RE: [Qemu-devel] [Spice-devel] Vioserial of Windows guest OS on Qemu 
0.15

Vadim,

We built it from the driver source. Up to this moment, we always use the same 
binary to test Qemu.


-Original Message-
From: Vadim Rozenfeld [mailto:vroze...@redhat.com]
Sent: Thursday, January 19, 2012 5:39 PM
To: Charles.Tsai-蔡清海-研究發展部
Cc: Michael Roth; Stefan Hajnoczi; spice-de...@lists.freedesktop.org; Alex 
Huang-黃必賢-研究發展部; Alon Levy; qemu-devel
Subject: RE: [Qemu-devel] [Spice-devel] Vioserial of Windows guest OS on Qemu 
0.15

On Thu, 2012-01-19 at 16:33 +0800, Charles.Tsai-蔡清海-研究發展部 wrote:
 Vadim,

 It is SMP system.
What about vioserial driver itself?
did you build it from sources or is
it one, available through RHEL channels?


 -Original Message-
 From: Vadim Rozenfeld [mailto:vroze...@redhat.com]
 Sent: Thursday, January 19, 2012 3:58 PM
 To: Charles.Tsai-蔡清海-研究發展部
 Cc: Michael Roth; Stefan Hajnoczi; spice-de...@lists.freedesktop.org;
 Alex Huang-黃必賢-研究發展部; Alon Levy; qemu-devel
 Subject: RE: [Qemu-devel] [Spice-devel] Vioserial of Windows guest OS
 on Qemu 0.15

 On Thu, 2012-01-19 at 09:41 +0800, Charles.Tsai-蔡清海-研究發展部 wrote:
  Vadim,
 
  I tested on Qemu 1.0.50 and found that the VioSerial driver had a problem
  installing on a 64-bit Win7 guest.
  During the driver installation, the system hung after the driver
  was installed. After I rebooted the guest OS, the Vioserial driver worked.
  The hang seems to occur only during the driver installation.
 
 On UP or SMP system?
 
  -Original Message-
  From: Vadim Rozenfeld [mailto:vroze...@redhat.com]
  Sent: Wednesday, January 18, 2012 4:57 AM
  To: Michael Roth
  Cc: Charles.Tsai-蔡清海-研究發展部; Stefan Hajnoczi;
  spice-de...@lists.freedesktop.org; Alex Huang-黃必賢-研究發展部; Alon Levy;
  qemu-devel
  Subject: Re: [Qemu-devel] [Spice-devel] Vioserial of Windows guest
  OS on Qemu 0.15
 
  On Mon, 2012-01-16 at 19:50 -0600, Michael Roth wrote:
   On 01/15/2012 08:02 PM, Charles.Tsai-蔡清海-研究發展部 wrote:
Vadim,
   
Thank you for your prompt reply. Here are the information for our test 
case.
   
   
1) we use the following command line to launch the guest OS
   
   
 /usr/bin/kvm -S -M pc-0.14 -enable-kvm -m 1024 -smp
 1,sockets=1,cores=1,threads=1 -name win_xp -uuid
 d9388815-ddd3-c38e-33c2-a9d5fcc7a775 -nodefconfig -nodefaults
 -chardev
 socket,id=charmonitor,path=/var/lib/libvirt/qemu/win_xp.monitor,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=readline
 -rtc base=localtime
 -device
 virtio-serial-pci,id=virtio-serial0,bus=pci.0,multifunction=on,addr=0x5.0x0
 -drive
 file=/media/Images/Windows-XP.img,if=none,id=drive-ide0-0-0,format=raw
 -device
 ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1
 -netdev tap,fd=17,id=hostnet0
 -device
 rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:e8:dc:b1,bus=pci.0,multifunction=on,addr=0x3.0x0
 -chardev pty,id=charserial0
-device 

Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Jan Kiszka
On 2012-01-20 13:42, Daniel P. Berrange wrote:
 On Fri, Jan 20, 2012 at 01:00:06PM +0100, Jan Kiszka wrote:
 On 2012-01-20 12:45, Daniel P. Berrange wrote:
 On Fri, Jan 20, 2012 at 12:13:48PM +0100, Jan Kiszka wrote:
 On 2012-01-20 11:25, Daniel P. Berrange wrote:
 On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
 On 2012-01-20 11:14, Marcelo Tosatti wrote:
 On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
 On 2012-01-19 18:53, Marcelo Tosatti wrote:
 What problems does it cause, and in which scenarios? Can't they be
 fixed?

 If the guest compensates for lost ticks, and KVM reinjects them, guest
 time advances faster than it should, to the extent where NTP fails to
 correct it. This is the case with RHEL4.

 But for example a v2.4 kernel (or Windows with a non-ACPI HAL) does not
 compensate. In that case you want KVM to reinject.

 I don't know of any other way to fix this.

 OK, I see. The old unsolved problem of guessing what is being executed.

 Then the next question is how and where to control this. Conceptually,
 there should rather be a global switch saying "compensate for lost ticks
 of periodic timers: yes/no" - instead of a per-timer knob. Didn't we
 discuss something like this before?

 I don't see the advantage of a global control versus per device
 control (in fact it lowers flexibility).

 Usability. Users should not have to care about individual tick-based
 clocks. They care about "my OS requires lost ticks compensation, yes or
 no".

 FYI, at the libvirt level we model policy against individual timers, for
 example:

   <clock offset="localtime">
     <timer name="rtc" tickpolicy="catchup" track="guest"/>
     <timer name="pit" tickpolicy="delay"/>
   </clock>

 Are the various modes of tickpolicy fully specified somewhere?

 There are some (not all that great) docs here:

   http://libvirt.org/formatdomain.html#elementsTime

 The meaning of the 4 policies are:

   delay: continue to deliver at normal rate

 What does this mean? The timer stops ticking until the guest accepts its
 ticks again?
 
 It means that the hypervisor will not attempt to do any compensation,
 so the guest will see delays in its ticks being delivered & gradually
 drift over time.

Still, is the logic as I described? Or what is the difference from discard?

 
 catchup: deliver at higher rate to catchup
   merge: ticks merged into 1 single tick
 discard: all missed ticks are discarded

 But those interpretations aren't stated in the docs. That makes it hard
 to map them on individual hypervisors - or model proper new hypervisor
 interfaces accordingly.
 
 That's not a real problem; now that I notice they are missing from the docs,
 I can just add them in.

TIA, but please be more verbose. The above descriptions only help if
you take real implementations of hypervisors as reference.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Daniel P. Berrange
On Fri, Jan 20, 2012 at 01:51:20PM +0100, Jan Kiszka wrote:
 On 2012-01-20 13:42, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 01:00:06PM +0100, Jan Kiszka wrote:
  On 2012-01-20 12:45, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 12:13:48PM +0100, Jan Kiszka wrote:
  On 2012-01-20 11:25, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
  On 2012-01-20 11:14, Marcelo Tosatti wrote:
  On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
  On 2012-01-19 18:53, Marcelo Tosatti wrote:
  What problems does it cause, and in which scenarios? Can't they be
  fixed?
 
  If the guest compensates for lost ticks, and KVM reinjects them,
  guest time advances faster than it should, to the extent where NTP
  fails to correct it. This is the case with RHEL4.
 
  But for example a v2.4 kernel (or Windows with a non-ACPI HAL) does not
  compensate. In that case you want KVM to reinject.
 
  I don't know of any other way to fix this.
 
  OK, I see. The old unsolved problem of guessing what is being
  executed.
 
  Then the next question is how and where to control this.
  Conceptually, there should rather be a global switch saying
  "compensate for lost ticks of periodic timers: yes/no" - instead of
  a per-timer knob. Didn't we discuss something like this before?
 
  I don't see the advantage of a global control versus per device
  control (in fact it lowers flexibility).
 
  Usability. Users should not have to care about individual tick-based
  clocks. They care about "my OS requires lost ticks compensation, yes
  or no".
 
  FYI, at the libvirt level we model policy against individual timers, for
  example:
 
    <clock offset="localtime">
      <timer name="rtc" tickpolicy="catchup" track="guest"/>
      <timer name="pit" tickpolicy="delay"/>
    </clock>
 
  Are the various modes of tickpolicy fully specified somewhere?
 
  There are some (not all that great) docs here:
 
http://libvirt.org/formatdomain.html#elementsTime
 
  The meaning of the 4 policies are:
 
delay: continue to deliver at normal rate
 
  What does this mean? The timer stops ticking until the guest accepts its
  ticks again?
  
  It means that the hypervisor will not attempt to do any compensation,
  so the guest will see delays in its ticks being delivered & gradually
  drift over time.
 
 Still, is the logic as I described? Or what is the difference from discard?

With 'discard', the delayed tick will be thrown away. In 'delay', the
delayed tick will still be injected to the guest, possibly well after
the intended injection time though, and there will be no attempt to
compensate by speeding up delivery of later ticks.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Jan Kiszka
On 2012-01-20 13:54, Daniel P. Berrange wrote:
 On Fri, Jan 20, 2012 at 01:51:20PM +0100, Jan Kiszka wrote:
 On 2012-01-20 13:42, Daniel P. Berrange wrote:
 On Fri, Jan 20, 2012 at 01:00:06PM +0100, Jan Kiszka wrote:
 On 2012-01-20 12:45, Daniel P. Berrange wrote:
 On Fri, Jan 20, 2012 at 12:13:48PM +0100, Jan Kiszka wrote:
 On 2012-01-20 11:25, Daniel P. Berrange wrote:
 On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
 On 2012-01-20 11:14, Marcelo Tosatti wrote:
 On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
 On 2012-01-19 18:53, Marcelo Tosatti wrote:
 What problems does it cause, and in which scenarios? Can't they be
 fixed?

 If the guest compensates for lost ticks, and KVM reinjects them,
 guest time advances faster than it should, to the extent where NTP
 fails to correct it. This is the case with RHEL4.

 But for example a v2.4 kernel (or Windows with a non-ACPI HAL) does not
 compensate. In that case you want KVM to reinject.

 I don't know of any other way to fix this.

 OK, I see. The old unsolved problem of guessing what is being
 executed.

 Then the next question is how and where to control this.
 Conceptually, there should rather be a global switch saying
 "compensate for lost ticks of periodic timers: yes/no" - instead of
 a per-timer knob. Didn't we discuss something like this before?

 I don't see the advantage of a global control versus per device
 control (in fact it lowers flexibility).

 Usability. Users should not have to care about individual tick-based
 clocks. They care about "my OS requires lost ticks compensation, yes
 or no".

 FYI, at the libvirt level we model policy against individual timers, for
 example:

   <clock offset="localtime">
     <timer name="rtc" tickpolicy="catchup" track="guest"/>
     <timer name="pit" tickpolicy="delay"/>
   </clock>

 Are the various modes of tickpolicy fully specified somewhere?

 There are some (not all that great) docs here:

   http://libvirt.org/formatdomain.html#elementsTime

 The meaning of the 4 policies are:

   delay: continue to deliver at normal rate

 What does this mean? The timer stops ticking until the guest accepts its
 ticks again?

 It means that the hypervisor will not attempt to do any compensation,
 so the guest will see delays in its ticks being delivered & gradually
 drift over time.

 Still, is the logic as I described? Or what is the difference from discard?
 
 With 'discard', the delayed tick will be thrown away. In 'delay', the
 delayed tick will still be injected to the guest, possibly well after
 the intended injection time though, and there will be no attempt to
 compensate by speeding up delivery of later ticks.

OK, let's see if I got it:

delay   - all lost ticks are replayed in a row once the guest accepts
  them again
catchup - lost ticks are gradually replayed at a higher frequency than
  the original tick
merge   - at most one tick is replayed once the guest accepts it again
discard - no lost ticks compensation
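
To make the four-way summary above concrete, here is a small sketch that encodes it as a function. It reflects only the tentative reading given in this message, which has not been confirmed at this point in the thread; it is not libvirt or KVM code, and the enum and function names are invented for illustration.

```c
/* Hypothetical names mirroring the four tickpolicy values discussed. */
enum tick_policy { TICK_DELAY, TICK_CATCHUP, TICK_MERGE, TICK_DISCARD };

/* How many of the `lost` ticks are eventually re-delivered to the guest
 * under the tentative interpretation summarised in this message. */
static unsigned replayed_ticks(enum tick_policy policy, unsigned lost)
{
    switch (policy) {
    case TICK_DELAY:    /* all lost ticks, replayed in a row */
    case TICK_CATCHUP:  /* all lost ticks, spread out at a higher rate */
        return lost;
    case TICK_MERGE:    /* at most one tick stands in for the backlog */
        return lost > 0 ? 1 : 0;
    case TICK_DISCARD:  /* no compensation at all */
        return 0;
    }
    return 0;
}
```

Under this reading, delay and catchup differ only in pacing, not in the number of ticks replayed, which is why they share a branch here.
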

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 00/14] softfloat: Use POSIX integer types - benchmarked

2012-01-20 Thread Peter Maydell
On 16 January 2012 00:46, Andreas Färber afaer...@suse.de wrote:
 For a loop count of 100,000 and 5 runs I got the following results:

  current:        138.9-204.1 Whetstone-MIPS
  [u]int*_t:      185.2-188.7 Whetstone-MIPS
  [u]int_fast*_t: 285.7-294.1 Whetstone-MIPS

  Toshiba AC100:  833.3-909.1 Whetstone-MIPS

 These results seem to indicate that the fast POSIX types are indeed
 somewhat faster, both compared to exact-size POSIX types and to the
 current state.

OTOH I did a run of scimark2 and got:
current tree:
**  **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov) **
**  **
Using   2.00 seconds min time per kenel.
Composite Score:   12.98
FFT Mflops: 7.66(N=1024)
SOR Mflops:19.49(100 x 100)
MonteCarlo: Mflops: 6.12
Sparse matmult  Mflops:15.34(N=1000, nz=5000)
LU  Mflops:16.28(M=100, N=100)

with patches (yours and mine):
**  **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to p...@nist.gov) **
**  **
Using   2.00 seconds min time per kenel.
Composite Score:   11.87
FFT Mflops: 7.12(N=1024)
SOR Mflops:17.66(100 x 100)
MonteCarlo: Mflops: 5.75
Sparse matmult  Mflops:14.03(N=1000, nz=5000)
LU  Mflops:14.81(M=100, N=100)

Hmmm...

-- PMM



Re: [Qemu-devel] qemu-kvm upstreaming: Do we need -no-kvm-pit and -no-kvm-pit-reinjection semantics?

2012-01-20 Thread Daniel P. Berrange
On Fri, Jan 20, 2012 at 02:02:03PM +0100, Jan Kiszka wrote:
 On 2012-01-20 13:54, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 01:51:20PM +0100, Jan Kiszka wrote:
  On 2012-01-20 13:42, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 01:00:06PM +0100, Jan Kiszka wrote:
  On 2012-01-20 12:45, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 12:13:48PM +0100, Jan Kiszka wrote:
  On 2012-01-20 11:25, Daniel P. Berrange wrote:
  On Fri, Jan 20, 2012 at 11:22:27AM +0100, Jan Kiszka wrote:
  On 2012-01-20 11:14, Marcelo Tosatti wrote:
  On Thu, Jan 19, 2012 at 07:01:44PM +0100, Jan Kiszka wrote:
  On 2012-01-19 18:53, Marcelo Tosatti wrote:
  What problems does it cause, and in which scenarios? Can't they 
  be
  fixed?
 
  If the guest compensates for lost ticks, and KVM reinjects them, guest
  time advances faster than it should, to the extent where NTP fails to
  correct it. This is the case with RHEL4.
 
  But for example v2.4 kernel (or Windows with non-acpi HAL) do not
  compensate. In that case you want KVM to reinject.
 
  I don't know of any other way to fix this.
 
  OK, i see. The old unsolved problem of guessing what is being 
  executed.
 
  Then the next question is how and where to control this. Conceptually,
  there should rather be a global switch, say "compensate for lost ticks of
  periodic timers: yes/no", instead of a per-timer knob. Didn't we
  discuss something like this before?
 
  I don't see the advantage of a global control versus per device
  control (in fact it lowers flexibility).
 
  Usability. Users should not have to care about individual tick-based
  clocks. They care about "my OS requires lost ticks compensation, yes
  or no".
 
  FYI, at the libvirt level we model policy against individual timers,
  for example:
 
    <clock offset="localtime">
      <timer name="rtc" tickpolicy="catchup" track="guest"/>
      <timer name="pit" tickpolicy="delay"/>
    </clock>
 
  Are the various modes of tickpolicy fully specified somewhere?
 
  There are some (not all that great) docs here:
 
http://libvirt.org/formatdomain.html#elementsTime
 
  The meaning of the 4 policies are:
 
delay: continue to deliver at normal rate
 
  What does this mean? The timer stops ticking until the guest accepts its
  ticks again?
 
  It means that the hypervisor will not attempt to do any compensation,
  so the guest will see delays in its ticks being delivered & gradually
  drift over time.
 
  Still, is the logic as I described? Or what is the difference to discard.
  
  With 'discard', the delayed tick will be thrown away. In 'delay', the
  delayed tick will still be injected to the guest, possibly well after
  the intended injection time though, and there will be no attempt to
  compensate by speeding up delivery of later ticks.
 
 OK, let's see if I got it:
 
 delay   - all lost ticks are replayed in a row once the guest accepts
   them again
 catchup - lost ticks are gradually replayed at a higher frequency than
   the original tick
 merge   - at most one tick is replayed once the guest accepts it again
 discard - no lost ticks compensation

Yes, I think that is a good understanding.
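
The four policies as summarized above can be sketched as a small model. This is only an illustration of the summary in this thread, not code from QEMU, KVM or libvirt; the enum and function names are hypothetical, and the model only counts how many lost ticks are eventually replayed, not when.

```c
#include <stdint.h>

/* Hypothetical names; models the four tick policies as summarized above. */
enum tick_policy { TICK_DELAY, TICK_CATCHUP, TICK_MERGE, TICK_DISCARD };

/*
 * Given `lost` ticks that could not be delivered while the guest was not
 * accepting them, return how many of them are eventually replayed.
 */
static uint32_t ticks_replayed(enum tick_policy policy, uint32_t lost)
{
    switch (policy) {
    case TICK_DELAY:    /* all lost ticks are replayed in a row */
    case TICK_CATCHUP:  /* all lost ticks are replayed, at a higher rate */
        return lost;
    case TICK_MERGE:    /* at most one tick is replayed */
        return lost > 0 ? 1 : 0;
    case TICK_DISCARD:  /* no lost ticks compensation */
    default:
        return 0;
    }
}
```

In this model 'delay' and 'catchup' differ only in timing, which a count cannot show: 'delay' replays the backlog back to back, while 'catchup' spreads it out at an increased tick frequency.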

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|




[Qemu-devel] [PATCH] PPC: booke206: Check for min/max TLB entry size

2012-01-20 Thread Alexander Graf
When setting a TLB entry, we need to check if the TLB we're putting it in
actually supports the given size. According to the 2.06 PowerPC ISA, a
value that's out of range results in the minimum page size for the TLB
to be used.

Signed-off-by: Alexander Graf ag...@suse.de

---

v1 -> v2:

  - fix min/max check

diff --git a/target-ppc/op_helper.c b/target-ppc/op_helper.c
index 6339c95..8cd0224 100644
--- a/target-ppc/op_helper.c
+++ b/target-ppc/op_helper.c
@@ -4228,6 +4228,7 @@ void helper_booke206_tlbwe(void)
 {
     uint32_t tlbncfg, tlbn;
     ppcmas_tlb_t *tlb;
+    uint32_t size_tlb, size_min, size_max;
 
     switch (env->spr[SPR_BOOKE_MAS0] & MAS0_WQ_MASK) {
     case MAS0_WQ_ALWAYS:
@@ -4273,6 +4274,16 @@ void helper_booke206_tlbwe(void)
         tlb->mas1 &= ~MAS1_IPROT;
     }
 
+    /* XXX only applies for MAV 1.0 */
+    size_tlb = (tlb->mas1 & MAS1_TSIZE_MASK) >> (MAS1_TSIZE_SHIFT + 1);
+    size_min = (tlbncfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT;
+    size_max = (tlbncfg & TLBnCFG_MAXSIZE) >> TLBnCFG_MAXSIZE_SHIFT;
+    if ((size_tlb > size_max) || (size_tlb < size_min)) {
+        /* set to min size */
+        tlb->mas1 &= ~MAS1_TSIZE_MASK;
+        tlb->mas1 |= size_min << (MAS1_TSIZE_SHIFT + 1);
+    }
+
     if (booke206_tlb_to_page_size(env, tlb) == TARGET_PAGE_SIZE) {
         tlb_flush_page(env, tlb->mas2 & MAS2_EPN_MASK);
     } else {
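
The range check in this patch boils down to the following rule, shown here as a standalone sketch. The helper name is hypothetical and the function works on already-extracted size fields; it is not the QEMU helper itself, which operates on MAS1 and TLBnCFG directly.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the size field to use for a TLB entry.
 * A requested size outside [size_min, size_max] falls back to the
 * minimum size of the TLB, as the commit message describes.
 */
static uint32_t clamp_tlb_size(uint32_t size_tlb, uint32_t size_min,
                               uint32_t size_max)
{
    if (size_tlb > size_max || size_tlb < size_min) {
        return size_min;    /* out of range: use the minimum page size */
    }
    return size_tlb;        /* in range: keep the requested size */
}
```

Note that an oversized request is not rounded down to the maximum; both too-small and too-large values end up at the minimum.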



Re: [Qemu-devel] [PATCH v12 4/4] arm: SoC model for Calxeda Highbank

2012-01-20 Thread Rob Herring
On 01/20/2012 02:47 AM, Peter Maydell wrote:
 On 19 January 2012 23:17, Rob Herring rob.herr...@calxeda.com wrote:
 On 01/19/2012 03:44 PM, Peter Maydell wrote:
 On 19 January 2012 21:31, Mark Langsdorf mark.langsd...@calxeda.com wrote:
 +highbank_binfo.board_id = 0xEC10100f; /* provided by deviceTree */

 Where does this number come from? It's not in
 http://www.arm.linux.org.uk/developer/machines/

 Is 3027 (==0xbd3) you?
 http://www.arm.linux.org.uk/developer/machines/list.php?id=3027


 Much of the data there is wrong as none of it is used. 0 or -1 is the
 right value as those are obviously meaningless. A highbank kernel will
 never be booted without devicetree and in that case this number is
 irrelevant. This is the legacy boot interface and qemu really needs to
 learn to boot with a separate dtb.
 
 Yeah, but the documentation even for DTB boot says we should pass
 in a machine number. If 0 or -1 are right then there should be
 some documentation that says so. I'll accept mailing list post
 from some authoritative person [eg Grant Likely] if necessary.

Kernel DT co-maintainer is not authoritative enough for you?

The documentation needs some clarification.

 But this is an ABI between boot loaders and the kernel so I don't
 want to just have something random that happens to work. (And in
 particular if -1 is the officially sanctioned number then we need
 to fix arm_boot to be able to pass values > 16 bits wide.)
 

Here's where the kernel sets the mach #. nr is from the database for
non-DT and ~0 for DT machines.

#define MACHINE_START(_type,_name)                      \
static const struct machine_desc __mach_desc_##_type    \
 __used                                                 \
 __attribute__((__section__(".arch.info.init"))) = {    \
        .nr             = MACH_TYPE_##_type,            \
        .name           = _name,

#define MACHINE_END                             \
};

#define DT_MACHINE_START(_name, _namestr)               \
static const struct machine_desc __mach_desc_##_name    \
 __used                                                 \
 __attribute__((__section__(".arch.info.init"))) = {    \
        .nr             = ~0,                           \
        .name           = _namestr,

In any case, the kernel ignores the value passed in if a valid dtb is
passed in.
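
To illustrate the width concern raised earlier in this thread: if the boot stub can only carry 16 bits of the machine number, ~0 does not survive the trip. The sketch below assumes exactly that limitation; the helper name is hypothetical and this is not the arm_boot code.

```c
#include <stdint.h>

/* Machine number used for DT machines in the macro quoted above. */
#define DT_MACHINE_NR (~0u)

/*
 * Hypothetical model of a boot stub that can only pass the low 16 bits
 * of the board ID to the kernel.
 */
static uint32_t passed_machine_nr(uint32_t board_id)
{
    return board_id & 0xffffu;
}
```

Under that assumption a registered number such as 3027 is passed unchanged, while ~0 arrives as 0xffff and 0xEC10100f arrives as 0x100f.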

Rob



[Qemu-devel] [PATCH 1/2] KVM: Update headers (except HIOR mess)

2012-01-20 Thread Alexander Graf
This patch is basically what ./scripts/update-linux-headers.sh against
upstream KVM's next branch outputs except that all the HIOR bits are
removed. These we have to update with the code that uses them.

Signed-off-by: Alexander Graf ag...@suse.de
---
 linux-headers/asm-powerpc/kvm.h  |9 ++-
 linux-headers/asm-powerpc/kvm_para.h |   41 -
 linux-headers/asm-s390/kvm.h |9 +++
 linux-headers/asm-x86/hyperv.h   |1 +
 linux-headers/asm-x86/kvm.h  |4 +++
 linux-headers/linux/kvm.h|   41 ++
 linux-headers/linux/kvm_para.h   |1 -
 linux-headers/linux/virtio_ring.h|6 ++--
 8 files changed, 100 insertions(+), 12 deletions(-)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index fb3fddc..1f0cb55 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -265,12 +265,9 @@ struct kvm_debug_exit_arch {
 struct kvm_guest_debug_arch {
 };
 
-#define KVM_REG_MASK            0x001f
-#define KVM_REG_EXT_MASK        0xffe0
-#define KVM_REG_GPR             0x0000
-#define KVM_REG_FPR             0x0020
-#define KVM_REG_QPR             0x0040
-#define KVM_REG_FQPR            0x0060
+/* definition of registers in kvm_run */
+struct kvm_sync_regs {
+};
 
 #define KVM_INTERRUPT_SET       -1U
 #define KVM_INTERRUPT_UNSET     -2U
diff --git a/linux-headers/asm-powerpc/kvm_para.h b/linux-headers/asm-powerpc/kvm_para.h
index ad58c90..c047a84 100644
--- a/linux-headers/asm-powerpc/kvm_para.h
+++ b/linux-headers/asm-powerpc/kvm_para.h
@@ -22,6 +22,16 @@
 
 #include <linux/types.h>
 
+/*
+ * Additions to this struct must only occur at the end, and should be
+ * accompanied by a KVM_MAGIC_FEAT flag to advertise that they are present
+ * (albeit not necessarily relevant to the current target hardware platform).
+ *
+ * Struct fields are always 32 or 64 bit aligned, depending on them being 32
+ * or 64 bit wide respectively.
+ *
+ * See Documentation/virtual/kvm/ppc-pv.txt
+ */
 struct kvm_vcpu_arch_shared {
         __u64 scratch1;
         __u64 scratch2;
@@ -33,11 +43,35 @@ struct kvm_vcpu_arch_shared {
         __u64 sprg3;
         __u64 srr0;
         __u64 srr1;
-        __u64 dar;
+        __u64 dar;              /* dear on BookE */
         __u64 msr;
         __u32 dsisr;
         __u32 int_pending;      /* Tells the guest if we have an interrupt */
         __u32 sr[16];
+        __u32 mas0;
+        __u32 mas1;
+        __u64 mas7_3;
+        __u64 mas2;
+        __u32 mas4;
+        __u32 mas6;
+        __u32 esr;
+        __u32 pir;
+
+        /*
+         * SPRG4-7 are user-readable, so we can only keep these consistent
+         * between the shared area and the real registers when there's an
+         * intervening exit to KVM.  This also applies to SPRG3 on some
+         * chips.
+         *
+         * This suffices for access by guest userspace, since in PR-mode
+         * KVM, an exit must occur when changing the guest's MSR[PR].
+         * If the guest kernel writes to SPRG3-7 via the shared area, it
+         * must also use the shared area for reading while in kernel space.
+         */
+        __u64 sprg4;
+        __u64 sprg5;
+        __u64 sprg6;
+        __u64 sprg7;
 };
 
 #define KVM_SC_MAGIC_R0         0x4b564d21 /* KVM! */
@@ -47,7 +81,10 @@ struct kvm_vcpu_arch_shared {
 
 #define KVM_FEATURE_MAGIC_PAGE  1
 
-#define KVM_MAGIC_FEAT_SR       (1 << 0)
+#define KVM_MAGIC_FEAT_SR               (1 << 0)
+
+/* MASn, ESR, PIR, and high SPRGs */
+#define KVM_MAGIC_FEAT_MAS0_TO_SPRG7    (1 << 1)
 
 
 #endif /* __POWERPC_KVM_PARA_H__ */
diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index 82b32a1..9acbde4 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -41,4 +41,13 @@ struct kvm_debug_exit_arch {
 struct kvm_guest_debug_arch {
 };
 
+#define KVM_SYNC_PREFIX (1UL << 0)
+#define KVM_SYNC_GPRS   (1UL << 1)
+#define KVM_SYNC_ACRS   (1UL << 2)
+/* definition of registers in kvm_run */
+struct kvm_sync_regs {
+   __u64 prefix;   /* prefix register */
+   __u64 gprs[16]; /* general purpose registers */
+   __u32 acrs[16]; /* access registers */
+};
 #endif
diff --git a/linux-headers/asm-x86/hyperv.h b/linux-headers/asm-x86/hyperv.h
index 5df477a..b80420b 100644
--- a/linux-headers/asm-x86/hyperv.h
+++ b/linux-headers/asm-x86/hyperv.h
@@ -189,5 +189,6 @@
 #define HV_STATUS_INVALID_HYPERCALL_CODE        2
 #define HV_STATUS_INVALID_HYPERCALL_INPUT       3
 #define HV_STATUS_INVALID_ALIGNMENT             4
+#define HV_STATUS_INSUFFICIENT_BUFFERS          19
 
 #endif
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 4d8dcbd..e7d1c19 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -321,4 +321,8 @@ struct kvm_xcrs {
__u64 padding[16];
 };
 
+/* definition of registers in kvm_run */
+struct kvm_sync_regs {
+};
+
 #endif /* _ASM_X86_KVM_H */
diff --git 

[Qemu-devel] [PATCH 2/2] PPC: KVM: Update HIOR code to new interface

2012-01-20 Thread Alexander Graf
Unfortunately the HIOR setting code slipped into upstream QEMU
before it was pulled into upstream KVM. And since Murphy is always
right, comments on the patches only emerged on the pull request
leading to changes in the interface.

So here's an update to the HIOR setting. While at it, I also relaxed
it a bit, since HV KVM already runs fine without it: 3.2 works just
fine with HV KVM even when HIOR is not set. We will only need this
when running PAPR in PR KVM.

Since we accidentally changed the ABI and API along the way, we have
to update the underlying kernel headers together with the code that
uses them, so as not to break bisectability.

Signed-off-by: Alexander Graf ag...@suse.de
---
 linux-headers/asm-powerpc/kvm.h |2 +-
 linux-headers/linux/kvm.h   |   37 -
 target-ppc/kvm.c|   10 +++---
 3 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index 1f0cb55..b921c3f 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -324,6 +324,6 @@ struct kvm_book3e_206_tlb_params {
__u32 reserved[8];
 };
 
-#define KVM_ONE_REG_PPC_HIOR   KVM_ONE_REG_PPC | 0x100
+#define KVM_REG_PPC_HIOR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1)
 
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 4847813..f6b5343 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -684,30 +684,33 @@ struct kvm_dirty_tlb {
 
 /* Available with KVM_CAP_ONE_REG */
 
-#define KVM_ONE_REG_GENERIC     0x0000000000000000ULL
+#define KVM_REG_ARCH_MASK       0xff00000000000000ULL
+#define KVM_REG_GENERIC         0x0000000000000000ULL
 
 /*
  * Architecture specific registers are to be defined in arch headers and
  * ORed with the arch identifier.
  */
-#define KVM_ONE_REG_PPC         0x1000000000000000ULL
-#define KVM_ONE_REG_X86         0x2000000000000000ULL
-#define KVM_ONE_REG_IA64        0x3000000000000000ULL
-#define KVM_ONE_REG_ARM         0x4000000000000000ULL
-#define KVM_ONE_REG_S390        0x5000000000000000ULL
+#define KVM_REG_PPC             0x1000000000000000ULL
+#define KVM_REG_X86             0x2000000000000000ULL
+#define KVM_REG_IA64            0x3000000000000000ULL
+#define KVM_REG_ARM             0x4000000000000000ULL
+#define KVM_REG_S390            0x5000000000000000ULL
+
+#define KVM_REG_SIZE_SHIFT      52
+#define KVM_REG_SIZE_MASK       0x00f0000000000000ULL
+#define KVM_REG_SIZE_U8         0x0000000000000000ULL
+#define KVM_REG_SIZE_U16        0x0010000000000000ULL
+#define KVM_REG_SIZE_U32        0x0020000000000000ULL
+#define KVM_REG_SIZE_U64        0x0030000000000000ULL
+#define KVM_REG_SIZE_U128       0x0040000000000000ULL
+#define KVM_REG_SIZE_U256       0x0050000000000000ULL
+#define KVM_REG_SIZE_U512       0x0060000000000000ULL
+#define KVM_REG_SIZE_U1024      0x0070000000000000ULL
 
 struct kvm_one_reg {
__u64 id;
-   union {
-   __u8 reg8;
-   __u16 reg16;
-   __u32 reg32;
-   __u64 reg64;
-   __u8 reg128[16];
-   __u8 reg256[32];
-   __u8 reg512[64];
-   __u8 reg1024[128];
-   } u;
+   __u64 addr;
 };
 
 /*
@@ -850,7 +853,7 @@ struct kvm_s390_ucas_mapping {
 /* Available with KVM_CAP_SW_TLB */
 #define KVM_DIRTY_TLB             _IOW(KVMIO,  0xaa, struct kvm_dirty_tlb)
 /* Available with KVM_CAP_ONE_REG */
-#define KVM_GET_ONE_REG           _IOWR(KVMIO, 0xab, struct kvm_one_reg)
+#define KVM_GET_ONE_REG           _IOW(KVMIO,  0xab, struct kvm_one_reg)
 #define KVM_SET_ONE_REG           _IOW(KVMIO,  0xac, struct kvm_one_reg)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU     (1 << 0)
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index ce8ac5b..50cfa02 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -740,6 +740,7 @@ void kvmppc_set_papr(CPUState *env)
     struct kvm_one_reg reg = {};
     struct kvm_sregs sregs = {};
     int ret;
+    uint64_t hior = env->spr[SPR_HIOR];
 
     cap.cap = KVM_CAP_PPC_PAPR;
     ret = kvm_vcpu_ioctl(env, KVM_ENABLE_CAP, &cap);
@@ -755,11 +756,14 @@ void kvmppc_set_papr(CPUState *env)
      * Once we have qdev CPUs, move HIOR to a qdev property and
      * remove this chunk.
      */
-    reg.id = KVM_ONE_REG_PPC_HIOR;
-    reg.u.reg64 = env->spr[SPR_HIOR];
+    reg.id = KVM_REG_PPC_HIOR;
+    reg.addr = (uintptr_t)&hior;
     ret = kvm_vcpu_ioctl(env, KVM_SET_ONE_REG, &reg);
     if (ret) {
-        goto fail;
+        fprintf(stderr, "Couldn't set HIOR. Maybe you're running an old \n"
+                        "kernel with support for HV KVM but no PAPR PR \n"
+                        "KVM in which case things will work. If they don't \n"
+                        "please update your host kernel!\n");
     }
 
 /* Set SDR1 so kernel space 
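
With the new interface the register ID itself carries the access size, so the value can be passed by address instead of through a union. A minimal sketch of how the size could be decoded from an ID, assuming the size field sits in bits 52-55 (as KVM_REG_SIZE_SHIFT 52 suggests) and holds log2 of the size in bytes. The macro and function names here are hypothetical stand-ins, not the kernel header.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the size field layout in the register ID. */
#define REG_SIZE_SHIFT 52
#define REG_SIZE_MASK  0x00f0000000000000ULL

/* Return the register size in bytes encoded in a 64-bit register ID. */
static uint64_t reg_size_bytes(uint64_t id)
{
    return 1ULL << ((id & REG_SIZE_MASK) >> REG_SIZE_SHIFT);
}
```

Under these assumptions an ID built as arch | U64 size | index decodes to 8 bytes, and the largest size code (7) decodes to 128 bytes, i.e. 1024 bits.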

Re: [Qemu-devel] virtual pc hash table vs physical pc hash table

2012-01-20 Thread Xin Tong
Maybe one of the reasons for having the virtual pc hash table is that
the pc does not need to be translated to a physical pc, which is what
the physical pc hash table is indexed by.


Xin


On Fri, Jan 20, 2012 at 1:24 AM, 陳韋任 che...@iis.sinica.edu.tw wrote:
 On Mon, Jan 02, 2012 at 07:11:41AM -0500, Xin Tong wrote:
 In qemu, there is a virtual pc hash table and a physical pc hash
 table. virtual pc hash table is used to find tbs until a context
 switch. and physical pc hash table keeps all the translated tb.
 virtual pc hash table is smaller, accessed with 12 bits and physical
 pc hash table is bigger, accessed with 15 bits. the size in the hash
 tables are the primary reasons for having 2 hash tables ?

  I think not. tb_find_fast uses the virtual pc as an index to search
 env->tb_jmp_cache, and checks whether the result from tb_jmp_cache is sane.
 If something goes wrong, it calls tb_find_slow, which uses the physical pc
 to do a slow search. The size should not be the reason for having 2 hash
 tables.
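
The fast/slow lookup described above can be sketched roughly as follows. This is a simplified model, not the QEMU implementation: the struct layout, the hash functions (plain masking with the 12-bit and 15-bit widths mentioned in this thread), the tb_insert helper and the offset_translate stand-in for the MMU walk are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define JMP_CACHE_BITS 12   /* virtual pc cache index width, per the thread */
#define PHYS_HASH_BITS 15   /* physical pc hash index width, per the thread */
#define JMP_CACHE_SIZE (1u << JMP_CACHE_BITS)
#define PHYS_HASH_SIZE (1u << PHYS_HASH_BITS)

/* Simplified translation block: just the two addresses and a chain link. */
struct tb {
    uint32_t virt_pc;
    uint32_t phys_pc;
    struct tb *phys_next;   /* next block in the same physical hash bucket */
};

struct tb_tables {
    struct tb *jmp_cache[JMP_CACHE_SIZE];   /* indexed by virtual pc */
    struct tb *phys_hash[PHYS_HASH_SIZE];   /* holds every translated block */
};

/* Stand-in for the guest MMU walk that turns a virtual pc into a physical pc. */
typedef uint32_t (*translate_fn)(uint32_t virt_pc);

/* Hypothetical translation used for illustration: a fixed offset. */
static uint32_t offset_translate(uint32_t virt_pc)
{
    return virt_pc + 0x100000u;
}

/* Register a block in the physical hash table. */
static void tb_insert(struct tb_tables *t, struct tb *tb)
{
    uint32_t h = tb->phys_pc & (PHYS_HASH_SIZE - 1);

    tb->phys_next = t->phys_hash[h];
    t->phys_hash[h] = tb;
}

/* Slow path: translate the pc, search the physical table, refill the cache. */
static struct tb *find_slow(struct tb_tables *t, uint32_t virt_pc,
                            translate_fn translate)
{
    uint32_t phys_pc = translate(virt_pc);
    struct tb *tb = t->phys_hash[phys_pc & (PHYS_HASH_SIZE - 1)];

    while (tb != NULL && (tb->phys_pc != phys_pc || tb->virt_pc != virt_pc)) {
        tb = tb->phys_next;
    }
    if (tb != NULL) {
        t->jmp_cache[virt_pc & (JMP_CACHE_SIZE - 1)] = tb;
    }
    return tb;
}

/* Fast path: no address translation, just one cache probe and a sanity check. */
static struct tb *find_fast(struct tb_tables *t, uint32_t virt_pc,
                            translate_fn translate)
{
    struct tb *tb = t->jmp_cache[virt_pc & (JMP_CACHE_SIZE - 1)];

    if (tb != NULL && tb->virt_pc == virt_pc) {
        return tb;
    }
    return find_slow(t, virt_pc, translate);
}
```

In this model the fast path never calls translate, which is the point made in this thread: the virtual pc cache saves the address translation, while the physical table remains the authoritative store.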

 Regards,
 chenwj

 --
 Wei-Ren Chen (陳韋任)
 Computer Systems Lab, Institute of Information Science,
 Academia Sinica, Taiwan (R.O.C.)
 Tel:886-2-2788-3799 #1667
 Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-01-20 Thread Xin Tong
On Fri, Jan 20, 2012 at 3:23 AM, 陳韋任 che...@iis.sinica.edu.tw wrote:
 1.  The control of gCR3 and hCR3 needs kernel access. While they can
 be set with a device module as what is done in kvm. Trapping into the
 kernel every time gCR3 is reseted might be too expensive.

  Why the control of gCR3 needs kernel access? Isn't gCR3 just a field of the
 CPUX86State? QEMU should have the control of it. Or you mean the trapping 
 thing?

I do not think gCR3 is a field in the CPUX86State. I think that in order
to change the guest CR3, we need to trap into the kernel as kvm does.

 2. After setting the gCR3 and hCR3. whatever memory references fall
 within the guest memory will be done correctly. However, memory
 references done by the host will be broken. Therefore, when we load
 the from the CPUstates, call to helpers for exits from the code cache,
 we need to change the paging mechanism back to non-nested. can this be
 done ? how expensive will this be ?

  Why the memeory references done by the host will be broken?

The CPUState lives in host memory. If nested paging is enabled, the guest
page table is walked first and then the host one. However, for memory
accesses to CPUState, we do not want the guest page table to be walked.


 Regards,
 chenwj

 --
 Wei-Ren Chen (陳韋任)
 Computer Systems Lab, Institute of Information Science,
 Academia Sinica, Taiwan (R.O.C.)
 Tel:886-2-2788-3799 #1667
 Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] [PATCH v12 4/4] arm: SoC model for Calxeda Highbank

2012-01-20 Thread Peter Maydell
On 20 January 2012 13:48, Rob Herring rob.herr...@calxeda.com wrote:
 Kernel DT co-maintainer is not authoritative enough for you?

Only if I recognise their name :-) [ie, sorry.]

 The documentation needs some clarification.

 But this is an ABI between boot loaders and the kernel so I don't
 want to just have something random that happens to work. (And in
 particular if -1 is the officially sanctioned number then we need
 to fix arm_boot to be able to pass values > 16 bits wide.)


 Here's where the kernel sets the mach #. nr is from the database for
 non-DT and ~0 for DT machines.

 #define MACHINE_START(_type,_name)                      \
 static const struct machine_desc __mach_desc_##_type    \
  __used                                                 \
  __attribute__((__section__(".arch.info.init"))) = {    \
         .nr             = MACH_TYPE_##_type,            \
         .name           = _name,

 #define MACHINE_END                             \
 };

 #define DT_MACHINE_START(_name, _namestr)               \
 static const struct machine_desc __mach_desc_##_name    \
  __used                                                 \
  __attribute__((__section__(".arch.info.init"))) = {    \
         .nr             = ~0,                           \
         .name           = _namestr,

 In any case, the kernel ignores the value passed in if a valid dtb is
 passed in.

I wonder if we should be passing in anything-except-minus-1,
since if you pass -1 and no DT then the kernel will fail
silently, whereas if you pass something else and no DT the
kernel will complain about the mismatch.

Even when we add a --dtb foo option to qemu, there's bound
to be a pile of user error where users pass in --kernel but
not --dtb.

-- PMM



[Qemu-devel] macvtap performance: good when writing from guest, abysmal when reading on guest (~ 700kB/s)

2012-01-20 Thread Lutz Vieweg

Hi,

I've been using qemu-kvm along with ordinary tap-devices and software bridges
for quite some time. When I recently noticed that a certain TCP connection 
between
a guest and a remote physical host was limited to ~ 80MB/s, I thought it would
be a good idea to check whether by using macvtap, instead, the performance
would get better.

So I setup a guest on a host that has a direct peer-to-peer 10G cable to
another host, and configured it to use a macvtap device.

Then I did some benchmarks, using nc on both sides, just reading from 
/dev/zero,
writing to /dev/null.

When the guest VM is writing into a TCP connection to the physical host 
(linux-3.1.6),
the performance is ~ 140MB/s - not great, but better than with ordinary
tap devices.

But to my big surprise, the performance when the physical host is writing,
and the guest VM is reading is abysmal, only ~ 700kB/s!
No bottleneck is obvious - the CPU usage and NIC utilization of both
the VM, its host, and the other host is all quite low.
strace on qemu process indicates that from time to time, there are pauses 
of ~ 0.5
seconds in between the many reads from /dev/tapX, but I am not sure whether
this is the whole reason for the bad performance.

Any ideas?

Or should I rather stay with ordinary tap/brctl, or try yet another
virtual NIC technique?

Regards,

Lutz Vieweg





Re: [Qemu-devel] [PATCH 2/4] Add cleanup function

2012-01-20 Thread Ryan Harper
* Eric Blake ebl...@redhat.com [2012-01-17 16:03]:
 On 01/16/2012 10:16 AM, Ryan Harper wrote:
   if test -z "$1" -o -z "$2"; then
   echo "Usage: $0 QEMU TEST1 [TEST2 ...]"
  +cleanup
   exit 1
 
  Is it worth using 'trap cleanup 0' to install the cleanup handler up
  front, instead of modifying all exit call sites?
  
  I thought about that, but it seemed to require switching to /bin/bash
 
 Not really.
 
  
  and I know Anthony had written the scripts carefully to be /bin/sh.
 
 POSIX requires /bin/sh to support 'trap cleanup 0', and I don't know of

I was using trap cleanup SIGINT; which /bin/sh didn't like:

(finalgravity) qemu-test % ./qemu-test 
~/work/git/qemu/x86_64-softmmu/qemu-system-x86_64 tests/virtio-serial.sh 
trap: SIGINT: bad trap

but with 0 instead, that seems to work.

 any counter-example shells that fail to do this.  There are non-POSIX
 shells where installing a trap 0 handler from inside a function body
 invokes the handler upon exiting the function, instead of exiting the
 overall script, but even Solaris /bin/sh knows how to correctly handle a
 trap 0 handler installed outside of any function calls.
 
 https://www.gnu.org/software/autoconf/manual/autoconf.html#trap
 
 -- 
 Eric Blake   ebl...@redhat.com+1-919-301-3266
 Libvirt virtualization library http://libvirt.org
 



-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ry...@us.ibm.com




[Qemu-devel] [Bug 919242] [NEW] qemu-img convert to VDI corrupts image

2012-01-20 Thread jbthiel
Public bug reported:

Hello,  thanks to all for the great work on qemu, an excellent
technology.

There appears to be a serious bug in qemu-img 1.0, yielding silent
corruption when converting an image to VDI format.  After conversion to
VDI, an image with WinNT4sp6 (NTFS) yields a boot failure (details
below) -- presumably due to some corruption, since the image works fine
as the source .vhd (from virtualPC6), and also when converted to QCOW2
or VMDK format.

TEST CASE:
OS X 10.6.8 on Intel i5
Qemu 1.0 from mac ports  (macports.org)
The source BaseDrive.vhd image is from VirtualPC6 (Mac)
$ qemu-img info BaseDrive.vhd
image: BaseDrive.vhd
file format: vpc
virtual size: 2.0G (2096898048 bytes)
disk size: 190M

The image has a fresh Windows NT4sp6 NTFS installation.  It's from VirtualPC6 
(Connectix)  inside a .vhdp package directory on OS X.  Convert via:
  qemu-img convert -f vpc -O vdi BaseDrive.vhd  BaseDrive.vdi

Now run the resulting vdi file with: 
  qemu-system-i386 -cpu pentium BaseDrive.vdi
On boot, NT4 crashes with
STOP: c000026c {Unable to Load Device Driver}
\??\C:\WINNT\system32\win32k.sys device driver could not be loaded.
Error Status was 0xc0000221

Both qemu 1.0, and VirtualBox 4.1.8 yield the same error on this VDI.

Conversion of the exact same image to QCOW2 or VMDK format yields a working 
image (ie. qemu and VirtualBox boot fine):
  qemu-img convert -f vpc -O qcow2 BaseDrive.vhd  BaseDrive.qcow2
  OR
  qemu-img convert -f vpc -O vmdk BaseDrive.vhd  BaseDrive.vmdk

Furthermore, I tested converting from raw, qcow2, and vmdk  to vdi, and
in all these cases the original format boots, but the converted VDI
fails to boot as above.

Along the way, I think I also tested a VDI natively created and
installed from VirtualBox, which did boot fine in qemu.  Thus the
problem appears to be not in qemu-system-i386 reading the VDI, rather in
the qemu-img conversion to VDI.


SEVERITY: CRITICAL
The severity of this bug is critical as it appears to produce a silently 
corrupted VDI image.  (which is presumably the cause of the boot failure; 
though I have not explicitly check-disked the resulting VDI image).  It also 
impedes easy inter-use between qemu and VirtualBox.

WORKAROUND:
The workaround is to use the VMDK format instead of VDI. 
VMDK is supported by both qemu and VirtualBox (and vmWare).


I can supply a test VHD/QCOW2/VMDK image if desired to reproduce the bug.   
(but it's large, 190M)

-- jbthiel

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/919242

Title:
  qemu-img convert to VDI corrupts image

Status in QEMU:
  New

Bug description:
  Hello,  thanks to all for the great work on qemu, an excellent
  technology.

  There appears to be a serious bug in qemu-img 1.0, yielding silent
  corruption when converting an image to VDI format.  After conversion
  to  VDI, an image with WinNT4sp6 (NTFS) yields a boot failure (details
  below) -- presumably due to some corruption, since the image works
  fine as the source .vhd (from virtualPC6), and also when converted to
  QCOW2 or VMDK format.

  TEST CASE:
  OS X 10.6.8 on Intel i5
  Qemu 1.0 from mac ports  (macports.org)
  The source BaseDrive.vhd image is from VirtualPC6 (Mac)
  $ qemu-img info BaseDrive.vhd
  image: BaseDrive.vhd
  file format: vpc
  virtual size: 2.0G (2096898048 bytes)
  disk size: 190M

  The image has a fresh Windows NT4sp6 NTFS installation.  It's from VirtualPC6 
(Connectix)  inside a .vhdp package directory on OS X.  Convert via:
qemu-img convert -f vpc -O vdi BaseDrive.vhd  BaseDrive.vdi

  Now run the resulting vdi file with: 
qemu-system-i386 -cpu pentium BaseDrive.vdi
  On boot, NT4 crashes with
  STOP: c000026c {Unable to Load Device Driver}
  \??\C:\WINNT\system32\win32k.sys device driver could not be loaded.
  Error Status was 0xc0000221

  Both qemu 1.0, and VirtualBox 4.1.8 yield the same error on this VDI.

  Conversion of the exact same image to QCOW2 or VMDK format yields a working 
image (ie. qemu and VirtualBox boot fine):
qemu-img convert -f vpc -O qcow2 BaseDrive.vhd  BaseDrive.qcow2
OR
qemu-img convert -f vpc -O vmdk BaseDrive.vhd  BaseDrive.vmdk

  Furthermore, I tested converting from raw, qcow2, and vmdk  to vdi,
  and in all these cases the original format boots, but the converted
  VDI fails to boot as above.

  Along the way, I think I also tested a VDI natively created and
  installed from VirtualBox, which did boot fine in qemu.  Thus the
  problem appears to be not in qemu-system-i386 reading the VDI, rather
  in the qemu-img conversion to VDI.

  
  SEVERITY: CRITICAL
  The severity of this bug is critical as it appears to produce a silently 
corrupted VDI image.  (which is presumably the cause of the boot failure; 
though I have not explicitly check-disked the resulting VDI image).  It also 
impedes easy 

Re: [Qemu-devel] [PATCH v12 4/4] arm: SoC model for Calxeda Highbank

2012-01-20 Thread Mark Langsdorf
On 01/20/2012 07:48 AM, Rob Herring wrote:
 On 01/20/2012 02:47 AM, Peter Maydell wrote:
 On 19 January 2012 23:17, Rob Herring rob.herr...@calxeda.com wrote:
 On 01/19/2012 03:44 PM, Peter Maydell wrote:
 On 19 January 2012 21:31, Mark Langsdorf mark.langsd...@calxeda.com 
 wrote:
 +highbank_binfo.board_id = 0xEC10100f; /* provided by deviceTree */

 Where does this number come from? It's not in
 http://www.arm.linux.org.uk/developer/machines/

 Is 3027 (==0xbd3) you?
 http://www.arm.linux.org.uk/developer/machines/list.php?id=3027


 Much of the data there is wrong as none of it is used. 0 or -1 is the
 right value as those are obviously meaningless. A highbank kernel will
 never be booted without devicetree and in that case this number is
 irrelevant. This is the legacy boot interface and qemu really needs to
 learn to boot with a separate dtb.

 Yeah, but the documentation even for DTB boot says we should pass
 in a machine number. If 0 or -1 are right then there should be
 some documentation that says so. I'll accept mailing list post
 from some authoritative person [eg Grant Likely] if necessary.
 
 Kernel DT co-maintainer is not authoritative enough for you?

Peter, is that sufficient for me to send in the patch with a
board_id of -1? Thanks.

--Mark Langsdorf
Calxeda, Inc.



Re: [Qemu-devel] [PATCH v12 4/4] arm: SoC model for Calxeda Highbank

2012-01-20 Thread Peter Maydell
On 20 January 2012 16:25, Mark Langsdorf mark.langsd...@calxeda.com wrote:
 On 01/20/2012 07:48 AM, Rob Herring wrote:
 On 01/20/2012 02:47 AM, Peter Maydell wrote:
 On 19 January 2012 23:17, Rob Herring rob.herr...@calxeda.com wrote:
 On 01/19/2012 03:44 PM, Peter Maydell wrote:
 On 19 January 2012 21:31, Mark Langsdorf mark.langsd...@calxeda.com 
 wrote:
 +    highbank_binfo.board_id = 0xEC10100f; /* provided by deviceTree */

 Where does this number come from? It's not in
 http://www.arm.linux.org.uk/developer/machines/

 Is 3027 (==0xbd3) you?
 http://www.arm.linux.org.uk/developer/machines/list.php?id=3027


 Much of the data there is wrong as none of it is used. 0 or -1 is the
 right value as those are obviously meaningless. A highbank kernel will
 never be booted without devicetree and in that case this number is
 irrelevant. This is the legacy boot interface and qemu really needs to
 learn to boot with a separate dtb.

 Yeah, but the documentation even for DTB boot says we should pass
 in a machine number. If 0 or -1 are right then there should be
 some documentation that says so. I'll accept mailing list post
 from some authoritative person [eg Grant Likely] if necessary.

 Kernel DT co-maintainer is not authoritative enough for you?

 Peter, is that sufficient for me to send in the patch with a
 board_id of -1? Thanks.

It's still not clear to me from this conversation if the right
answer is 0, -1 or anything that's not a valid board ID
and not -1 either...

-- PMM



[Qemu-devel] [Bug 919242] Re: qemu-img convert to VDI corrupts image

2012-01-20 Thread Stefan Weil
** Changed in: qemu
 Assignee: (unassigned) => Stefan Weil (ubuntu-weilnetz)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/919242

Title:
  qemu-img convert to VDI corrupts image

Status in QEMU:
  New

Bug description:
  Hello,  thanks to all for the great work on qemu, an excellent
  technology.

  There appears to be a serious bug in qemu-img 1.0, yielding silent
  corruption when converting an image to VDI format.  After conversion
  to  VDI, an image with WinNT4sp6 (NTFS) yields a boot failure (details
  below) -- presumably due to some corruption, since the image works
  fine as the source .vhd (from virtualPC6), and also when converted to
  QCOW2 or VMDK format.

  TEST CASE:
  OS X 10.6.8 on Intel i5
  Qemu 1.0 from mac ports  (macports.org)
  The source BaseDrive.vhd image is from VirtualPC6 (Mac)
  $ qemu-img info BaseDrive.vhd
  image: BaseDrive.vhd
  file format: vpc
  virtual size: 2.0G (2096898048 bytes)
  disk size: 190M

  The image has a fresh Windows NT4sp6 NTFS installation.  It's from VirtualPC6
  (Connectix) inside a .vhdp package directory on OS X.  Convert via:
    qemu-img convert -f vpc -O vdi BaseDrive.vhd  BaseDrive.vdi

  Now run the resulting vdi file with:
    qemu-system-i386 -cpu pentium BaseDrive.vdi
  On boot, NT4 crashes with
  STOP: c000026c {Unable to Load Device Driver}
  \??\C:\WINNT\system32\win32k.sys device driver could not be loaded.
  Error Status was 0xc0000221

  Both qemu 1.0, and VirtualBox 4.1.8 yield the same error on this VDI.

  Conversion of the exact same image to QCOW2 or VMDK format yields a working
  image (i.e. qemu and VirtualBox boot fine):
    qemu-img convert -f vpc -O qcow2 BaseDrive.vhd  BaseDrive.qcow2
  OR
    qemu-img convert -f vpc -O vmdk BaseDrive.vhd  BaseDrive.vmdk

  Furthermore, I tested converting from raw, qcow2, and vmdk  to vdi,
  and in all these cases the original format boots, but the converted
  VDI fails to boot as above.

  Along the way, I think I also tested a VDI natively created and
  installed from VirtualBox, which did boot fine in qemu.  Thus the
  problem appears to be not in qemu-system-i386 reading the VDI, rather
  in the qemu-img conversion to VDI.

  
  SEVERITY: CRITICAL
  The severity of this bug is critical, as it appears to produce a silently
  corrupted VDI image (which is presumably the cause of the boot failure,
  though I have not explicitly check-disked the resulting VDI image).  It also
  impedes easy inter-use between qemu and VirtualBox.

  WORKAROUND:
  The workaround is to use the VMDK format instead of VDI. 
  VMDK is supported by both qemu and VirtualBox (and vmWare).

  
  I can supply a test VHD/QCOW2/VMDK image if desired to reproduce the bug.   
(but it's large, 190M)

  -- jbthiel

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/919242/+subscriptions



Re: [Qemu-devel] [PATCH v12 4/4] arm: SoC model for Calxeda Highbank

2012-01-20 Thread Mark Langsdorf
On 01/20/2012 10:27 AM, Peter Maydell wrote:
 On 20 January 2012 16:25, Mark Langsdorf mark.langsd...@calxeda.com wrote:
 On 01/20/2012 07:48 AM, Rob Herring wrote:
 On 01/20/2012 02:47 AM, Peter Maydell wrote:
 On 19 January 2012 23:17, Rob Herring rob.herr...@calxeda.com wrote:
 On 01/19/2012 03:44 PM, Peter Maydell wrote:
 On 19 January 2012 21:31, Mark Langsdorf mark.langsd...@calxeda.com 
 wrote:
 +highbank_binfo.board_id = 0xEC10100f; /* provided by deviceTree */

 Where does this number come from? It's not in
 http://www.arm.linux.org.uk/developer/machines/

 Is 3027 (==0xbd3) you?
 http://www.arm.linux.org.uk/developer/machines/list.php?id=3027


 Much of the data there is wrong as none of it is used. 0 or -1 is the
 right value as those are obviously meaningless. A highbank kernel will
 never be booted without devicetree and in that case this number is
 irrelevant. This is the legacy boot interface and qemu really needs to
 learn to boot with a separate dtb.

 Yeah, but the documentation even for DTB boot says we should pass
 in a machine number. If 0 or -1 are right then there should be
 some documentation that says so. I'll accept mailing list post
 from some authoritative person [eg Grant Likely] if necessary.

 Kernel DT co-maintainer is not authoritative enough for you?

 Peter, is that sufficient for me to send in the patch with a
 board_id of -1? Thanks.
 
 It's still not clear to me from this conversation if the right
 answer is 0, -1 or anything that's not a valid board ID
 and not -1 either...

Quoting Rob from upthread:
0 or -1 is the right value as those are obviously meaningless.

The original code that Rob wrote had a board_id of -1. That's
the right answer.

--Mark Langsdorf
Calxeda, Inc.




Re: [Qemu-devel] [PATCH v12 4/4] arm: SoC model for Calxeda Highbank

2012-01-20 Thread Peter Maydell
On 20 January 2012 16:57, Mark Langsdorf mark.langsd...@calxeda.com wrote:
 On 01/20/2012 10:27 AM, Peter Maydell wrote:
 It's still not clear to me from this conversation if the right
 answer is 0, -1 or anything that's not a valid board ID
 and not -1 either...

 Quoting Rob from upthread:
 0 or -1 is the right value as those are obviously meaningless.

 The original code that Rob wrote had a board_id of -1. That's
 the right answer.

In that case you need a patch that causes arm_boot to actually
pass -1, not 0xffff.

(Also it would be nice if the kernel barfed if (id == -1 and
there's no appended device tree), but that's not a qemu thing.)

-- PMM
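
A rough illustration of the truncation discussed in this message, not code from the QEMU tree. The helper name is made up; it assumes the board ID reaches r1 as two 8-bit immediates (a mov for the low byte, an orr for the next byte), so only the low 16 bits survive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value the kernel would see in r1 if the board ID
 * is split across "mov r1, #lo" and "orr r1, r1, #hi << 8". */
static uint32_t board_id_via_mov_orr(int board_id)
{
    uint32_t lo = (uint32_t)board_id & 0xff;
    uint32_t hi = ((uint32_t)board_id >> 8) & 0xff;

    return lo | (hi << 8);
}
```

Under that assumption an ordinary machine number such as 3027 passes through unchanged, while -1 is cut down to its low 16 bits.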



[Qemu-devel] [PATCH] arm_boot: support board IDs more than 16 bits wide

2012-01-20 Thread Peter Maydell
Support passing a board ID value to the kernel in r1
that is more than 16 bits wide. This is needed to pass
the '-1 == invalid' value for boards which only support
device tree booting.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
This applies after the Calxeda patchset. Mark, I suggest you put it
in your patchset in the appropriate place.

 hw/arm_boot.c |   11 +--
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/arm_boot.c b/hw/arm_boot.c
index 8f73a29..01ca997 100644
--- a/hw/arm_boot.c
+++ b/hw/arm_boot.c
@@ -20,10 +20,10 @@
 /* The worlds second smallest bootloader.  Set r0-r2, then jump to kernel.  */
 static uint32_t bootloader[] = {
   0xe3a00000, /* mov r0, #0 */
-  0xe3a01000, /* mov r1, #0x?? */
-  0xe3811c00, /* orr r1, r1, #0x??00 */
-  0xe59f2000, /* ldr r2, [pc, #0] */
-  0xe59ff000, /* ldr pc, [pc, #0] */
+  0xe59f1004, /* ldr r1, [pc, #4] */
+  0xe59f2004, /* ldr r2, [pc, #4] */
+  0xe59ff004, /* ldr pc, [pc, #4] */
+  0, /* Board ID */
   0, /* Address of kernel args.  Set by integratorcp_init.  */
   0  /* Kernel entry point.  Set by integratorcp_init.  */
 };
@@ -289,8 +289,7 @@ void arm_load_kernel(CPUState *env, struct arm_boot_info *info)
         } else {
             initrd_size = 0;
         }
-        bootloader[1] |= info->board_id & 0xff;
-        bootloader[2] |= (info->board_id >> 8) & 0xff;
+        bootloader[4] = info->board_id;
         bootloader[5] = info->loader_start + KERNEL_ARGS_ADDR;
         bootloader[6] = entry;
         for (n = 0; n < sizeof(bootloader) / 4; n++) {
-- 
1.7.1
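
A small sketch, not part of the patch, of why the three "ldr ..., [pc, #4]" instructions pick up words 4, 5 and 6 of the stub. It relies on the ARM-state rule that pc reads as the address of the current instruction plus 8. The array name and both helpers are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stub laid out like the patched bootloader[] array. */
static uint32_t stub[] = {
    0xe3a00000, /* mov r0, #0 */
    0xe59f1004, /* ldr r1, [pc, #4] */
    0xe59f2004, /* ldr r2, [pc, #4] */
    0xe59ff004, /* ldr pc, [pc, #4] */
    0,          /* board ID */
    0,          /* address of kernel args */
    0,          /* kernel entry point */
};

/* Hypothetical helper: index of the word loaded by the pc-relative ldr at
 * word index insn_index (positive 12-bit immediate offset assumed). */
static unsigned literal_index(unsigned insn_index)
{
    uint32_t imm12 = stub[insn_index] & 0xfff;

    return (insn_index * 4 + 8 + imm12) / 4;
}

/* Hypothetical helper mirroring the three assignments in arm_load_kernel(). */
static void patch_stub(uint32_t board_id, uint32_t args, uint32_t entry)
{
    stub[4] = board_id;
    stub[5] = args;
    stub[6] = entry;
}
```

Because the board ID is now a full 32-bit literal, patch_stub((uint32_t)-1, ...) leaves all 32 bits set in the word that r1 is loaded from.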




[Qemu-devel] [PATCH v4 0/6] save/restore on Xen

2012-01-20 Thread Stefano Stabellini
Hi all,
this is the fourth version of the Xen save/restore patch series.
We have been discussing this issue for quite a while on #qemu and
qemu-devel:


http://marc.info/?l=qemu-devel&m=132346828427314&w=2
http://marc.info/?l=qemu-devel&m=132377734605464&w=2


A few different approaches were proposed to achieve the goal
of a working save/restore with upstream Qemu on Xen. However, after
prototyping some of them I came up with yet another solution, which I
think leads to the best results with the least code duplication and
ugliness.
I am far from saying that this patch series is an example of elegance and
simplicity, but it is closer to acceptable than anything else I have seen so
far.

What's new is that Qemu is going to keep track of its own physmap on
xenstore, so that Xen can be fully aware of the changes Qemu makes to
the guest's memory map at any time.
This is all handled by Xen or Xen support in Qemu internally and can be
used to solve our save/restore framebuffer problem.

From the Qemu common code POV, we still need to avoid saving the guest's
RAM when running on Xen, and we need to avoid resetting the videoram on
restore (that benefits the generic Qemu case too, because it
saves a few CPU cycles).


Changes in v4:

- keep a record of the MemoryRegion's name on xenstore;

- print a message when avoiding a memory allocation on restore.


This is the list of patches with a diffstat:

Anthony PERARD (4):
  vl.c: do not save the RAM state when Xen is enabled
  xen mapcache: check if memory region has moved.
  cirrus_vga: do not reset videoram on resume
  xen: change memory access behavior during migration.

Stefano Stabellini (2):
  Set runstate to INMIGRATE earlier
  xen: record physmap changes to xenstore

 hw/cirrus_vga.c |9 +++-
 vl.c|8 ++-
 xen-all.c   |  112 ++-
 xen-mapcache.c  |   22 +-
 xen-mapcache.h  |9 +++-
 5 files changed, 147 insertions(+), 13 deletions(-)


git://xenbits.xen.org/people/sstabellini/qemu-dm.git saverestore-4

Cheers,

Stefano



[Qemu-devel] [PATCH v4 2/6] xen mapcache: check if memory region has moved.

2012-01-20 Thread Stefano Stabellini
From: Anthony PERARD anthony.per...@citrix.com

This patch changes the xen_map_cache behavior. Before trying to map a guest
address, mapcache will look into the list of address ranges that have been moved
(physmap/set_memory). There is currently one memory region like this, the vram,
which is moved from where it is allocated to where the guest will look for it.

This helps to achieve a successful migration.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 xen-all.c  |   18 +-
 xen-mapcache.c |   22 +++---
 xen-mapcache.h |9 +++--
 3 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index c86ebf4..507d93d 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -225,6 +225,22 @@ static XenPhysmap *get_physmapping(XenIOState *state,
 return NULL;
 }
 
+static target_phys_addr_t xen_phys_offset_to_gaddr(target_phys_addr_t start_addr,
+                                                   ram_addr_t size, void *opaque)
+{
+    target_phys_addr_t addr = start_addr & TARGET_PAGE_MASK;
+    XenIOState *xen_io_state = opaque;
+    XenPhysmap *physmap = NULL;
+
+    QLIST_FOREACH(physmap, &xen_io_state->physmap, list) {
+        if (range_covers_byte(physmap->phys_offset, physmap->size, addr)) {
+            return physmap->start_addr;
+        }
+    }
+
+    return start_addr;
+}
+
 #if CONFIG_XEN_CTRL_INTERFACE_VERSION >= 340
 static int xen_add_to_physmap(XenIOState *state,
                               target_phys_addr_t start_addr,
@@ -964,7 +980,7 @@ int xen_hvm_init(void)
     }
 
     /* Init RAM management */
-    xen_map_cache_init();
+    xen_map_cache_init(xen_phys_offset_to_gaddr, state);
     xen_ram_init(ram_size);
 
     qemu_add_vm_change_state_handler(xen_hvm_change_state_handler, state);
diff --git a/xen-mapcache.c b/xen-mapcache.c
index 9fecc64..d9c995b 100644
--- a/xen-mapcache.c
+++ b/xen-mapcache.c
@@ -76,6 +76,9 @@ typedef struct MapCache {
     uint8_t *last_address_vaddr;
     unsigned long max_mcache_size;
     unsigned int mcache_bucket_shift;
+
+    phys_offset_to_gaddr_t phys_offset_to_gaddr;
+    void *opaque;
 } MapCache;
 
 static MapCache *mapcache;
@@ -89,13 +92,16 @@ static inline int test_bits(int nr, int size, const unsigned long *addr)
         return 0;
 }
 
-void xen_map_cache_init(void)
+void xen_map_cache_init(phys_offset_to_gaddr_t f, void *opaque)
 {
     unsigned long size;
     struct rlimit rlimit_as;
 
     mapcache = g_malloc0(sizeof (MapCache));
 
+    mapcache->phys_offset_to_gaddr = f;
+    mapcache->opaque = opaque;
+
     QTAILQ_INIT(&mapcache->locked_entries);
     mapcache->last_address_index = -1;
 
@@ -191,9 +197,14 @@ uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size,
                        uint8_t lock)
 {
     MapCacheEntry *entry, *pentry = NULL;
-    target_phys_addr_t address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;
-    target_phys_addr_t address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1);
+    target_phys_addr_t address_index;
+    target_phys_addr_t address_offset;
     target_phys_addr_t __size = size;
+    bool translated = false;
+
+tryagain:
+    address_index  = phys_addr >> MCACHE_BUCKET_SHIFT;
+    address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1);
 
     trace_xen_map_cache(phys_addr);
 
@@ -235,6 +246,11 @@ uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size,
     if(!test_bits(address_offset >> XC_PAGE_SHIFT, size >> XC_PAGE_SHIFT,
                 entry->valid_mapping)) {
         mapcache->last_address_index = -1;
+        if (!translated && mapcache->phys_offset_to_gaddr) {
+            phys_addr = mapcache->phys_offset_to_gaddr(phys_addr, size, mapcache->opaque);
+            translated = true;
+            goto tryagain;
+        }
         trace_xen_map_cache_return(NULL);
         return NULL;
     }
diff --git a/xen-mapcache.h b/xen-mapcache.h
index da874ca..70301a5 100644
--- a/xen-mapcache.h
+++ b/xen-mapcache.h
@@ -11,9 +11,13 @@
 
 #include <stdlib.h>
 
+typedef target_phys_addr_t (*phys_offset_to_gaddr_t)(target_phys_addr_t start_addr,
+                                                     ram_addr_t size,
+                                                     void *opaque);
 #ifdef CONFIG_XEN
 
-void xen_map_cache_init(void);
+void xen_map_cache_init(phys_offset_to_gaddr_t f,
+                        void *opaque);
 uint8_t *xen_map_cache(target_phys_addr_t phys_addr, target_phys_addr_t size,
                        uint8_t lock);
 ram_addr_t xen_ram_addr_from_mapcache(void *ptr);
@@ -22,7 +26,8 @@ void xen_invalidate_map_cache(void);
 
 #else
 
-static inline void xen_map_cache_init(void)
+static inline void xen_map_cache_init(phys_offset_to_gaddr_t f,
+  void *opaque)
 {
 }
 
-- 
1.7.2.5
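
A toy model of the lookup flow this patch introduces, offered as a sketch rather than the actual mapcache code. It assumes a flat array in place of the QLIST, a caller-supplied predicate in place of the valid_mapping bitmap test, and omits the TARGET_PAGE_MASK step. All type and function names are made up. As in xen_phys_offset_to_gaddr(), a hit returns the start address of the moved region, and the translation is attempted at most once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t addr_t;      /* stand-in for target_phys_addr_t */

typedef struct {
    addr_t phys_offset;       /* where the region was allocated */
    addr_t start_addr;        /* where the guest sees it */
    addr_t size;
} Physmap;

/* True if [offset, offset + len) contains byte. */
static bool covers_byte(addr_t offset, addr_t len, addr_t byte)
{
    return offset <= byte && byte - offset < len;
}

/* Guest start address of the moved region containing addr, else addr. */
static addr_t to_gaddr(const Physmap *maps, size_t n, addr_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (covers_byte(maps[i].phys_offset, maps[i].size, addr)) {
            return maps[i].start_addr;
        }
    }
    return addr;
}

/* Try addr as given; on a miss, translate once and try again. */
static bool lookup(const Physmap *maps, size_t n, addr_t addr,
                   bool (*mapped)(addr_t), addr_t *out)
{
    bool translated = false;

    for (;;) {
        if (mapped(addr)) {
            *out = addr;
            return true;
        }
        if (translated) {
            return false;     /* already retried once, give up */
        }
        addr = to_gaddr(maps, n, addr);
        translated = true;
    }
}

/* Example predicate: only [0xf0000000, 0xf0800000) is mappable. */
static bool demo_mapped(addr_t a)
{
    return a >= 0xf0000000u && a < 0xf0800000u;
}
```

The translated flag plays the same role as in the patch: it keeps a failed translation from looping back to tryagain forever.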




[Qemu-devel] [PATCH v4 1/6] vl.c: do not save the RAM state when Xen is enabled

2012-01-20 Thread Stefano Stabellini
From: Anthony PERARD anthony.per...@citrix.com

In the Xen case, the guest RAM is not handled by QEMU; it is saved by
the Xen tools.
So we simply avoid registering the RAM save state handler.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 vl.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index ba55b35..6f0435b 100644
--- a/vl.c
+++ b/vl.c
@@ -3270,8 +3270,10 @@ int main(int argc, char **argv, char **envp)
     default_drive(default_sdcard, snapshot, machine->use_scsi,
                   IF_SD, 0, SD_OPTS);
 
-    register_savevm_live(NULL, "ram", 0, 4, NULL, ram_save_live, NULL,
-                         ram_load, NULL);
+    if (!xen_enabled()) {
+        register_savevm_live(NULL, "ram", 0, 4, NULL, ram_save_live, NULL,
+                             ram_load, NULL);
+    }
 
     if (nb_numa_nodes > 0) {
         int i;
-- 
1.7.2.5




[Qemu-devel] [PATCH v4 5/6] xen: record physmap changes to xenstore

2012-01-20 Thread Stefano Stabellini
Write to xenstore any physmap changes so that the hypervisor can be
aware of them.
Read physmap changes from xenstore on boot.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 xen-all.c |   78 -
 1 files changed, 77 insertions(+), 1 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index 507d93d..bb66c82 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -63,7 +63,7 @@ static inline ioreq_t *xen_vcpu_ioreq(shared_iopage_t *shared_page, int vcpu)
 typedef struct XenPhysmap {
     target_phys_addr_t start_addr;
     ram_addr_t size;
-    MemoryRegion *mr;
+    char *name;
     target_phys_addr_t phys_offset;
 
     QLIST_ENTRY(XenPhysmap) list;
@@ -253,6 +253,7 @@ static int xen_add_to_physmap(XenIOState *state,
     XenPhysmap *physmap = NULL;
     target_phys_addr_t pfn, start_gpfn;
     target_phys_addr_t phys_offset = memory_region_get_ram_addr(mr);
+    char path[80], value[17];
 
     if (get_physmapping(state, start_addr, size)) {
         return 0;
@@ -291,6 +292,7 @@ go_physmap:
 
     physmap->start_addr = start_addr;
     physmap->size = size;
+    physmap->name = (char *)mr->name;
     physmap->phys_offset = phys_offset;
 
     QLIST_INSERT_HEAD(&state->physmap, physmap, list);
@@ -299,6 +301,30 @@ go_physmap:
                                    start_addr >> TARGET_PAGE_BITS,
                                    (start_addr + size) >> TARGET_PAGE_BITS,
                                    XEN_DOMCTL_MEM_CACHEATTR_WB);
+
+    snprintf(path, sizeof(path),
+            "/local/domain/0/device-model/%d/physmap/%"PRIx64"/start_addr",
+            xen_domid, (uint64_t)phys_offset);
+    snprintf(value, sizeof(value), "%"PRIx64, (uint64_t)start_addr);
+    if (!xs_write(state->xenstore, 0, path, value, strlen(value))) {
+        return -1;
+    }
+    snprintf(path, sizeof(path),
+            "/local/domain/0/device-model/%d/physmap/%"PRIx64"/size",
+            xen_domid, (uint64_t)phys_offset);
+    snprintf(value, sizeof(value), "%"PRIx64, (uint64_t)size);
+    if (!xs_write(state->xenstore, 0, path, value, strlen(value))) {
+        return -1;
+    }
+    if (mr->name) {
+        snprintf(path, sizeof(path),
+                "/local/domain/0/device-model/%d/physmap/%"PRIx64"/name",
+                xen_domid, (uint64_t)phys_offset);
+        if (!xs_write(state->xenstore, 0, path, mr->name, strlen(mr->name))) {
+            return -1;
+        }
+    }
+
     return 0;
 }
 
@@ -926,6 +952,55 @@ int xen_init(void)
     return 0;
 }
 
+static void xen_read_physmap(XenIOState *state)
+{
+    XenPhysmap *physmap = NULL;
+    unsigned int len, num, i;
+    char path[80], *value = NULL;
+    char **entries = NULL;
+
+    snprintf(path, sizeof(path),
+            "/local/domain/0/device-model/%d/physmap", xen_domid);
+    entries = xs_directory(state->xenstore, 0, path, &num);
+    if (entries == NULL)
+        return;
+
+    for (i = 0; i < num; i++) {
+        physmap = g_malloc(sizeof (XenPhysmap));
+        physmap->phys_offset = strtoull(entries[i], NULL, 16);
+        snprintf(path, sizeof(path),
+                "/local/domain/0/device-model/%d/physmap/%s/start_addr",
+                xen_domid, entries[i]);
+        value = xs_read(state->xenstore, 0, path, &len);
+        if (value == NULL) {
+            free(physmap);
+            continue;
+        }
+        physmap->start_addr = strtoull(value, NULL, 16);
+        free(value);
+
+        snprintf(path, sizeof(path),
+                "/local/domain/0/device-model/%d/physmap/%s/size",
+                xen_domid, entries[i]);
+        value = xs_read(state->xenstore, 0, path, &len);
+        if (value == NULL) {
+            free(physmap);
+            continue;
+        }
+        physmap->size = strtoull(value, NULL, 16);
+        free(value);
+
+        snprintf(path, sizeof(path),
+                "/local/domain/0/device-model/%d/physmap/%s/name",
+                xen_domid, entries[i]);
+        physmap->name = xs_read(state->xenstore, 0, path, &len);
+
+        QLIST_INSERT_HEAD(&state->physmap, physmap, list);
+    }
+    free(entries);
+    return;
+}
+
 int xen_hvm_init(void)
 {
     int i, rc;
@@ -998,6 +1073,7 @@ int xen_hvm_init(void)
     xen_be_register("console", &xen_console_ops);
     xen_be_register("vkbd", &xen_kbdmouse_ops);
     xen_be_register("qdisk", &xen_blkdev_ops);
+    xen_read_physmap(state);
 
     return 0;
 }
-- 
1.7.2.5
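
A standalone sketch of the key and value encoding used by this patch, with no xenstore calls. The helper names are made up; the path layout and the bare-hex value format are taken from the snprintf() calls in the diff. Unlike the patch, the path helper reports truncation, which is an addition for illustration only.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build ".../physmap/<phys_offset in hex>/<leaf>".
 * Returns 0 on success, -1 if the path did not fit in buf. */
static int physmap_path(char *buf, size_t len, int domid,
                        uint64_t phys_offset, const char *leaf)
{
    int n = snprintf(buf, len,
                     "/local/domain/0/device-model/%d/physmap/%" PRIx64 "/%s",
                     domid, phys_offset, leaf);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Values are stored as bare hex without "0x"; 16 digits plus the
 * terminating NUL fit the 17-byte buffer used in the patch. */
static void encode_value(char value[17], uint64_t v)
{
    snprintf(value, 17, "%" PRIx64, v);
}

/* Inverse of encode_value(), as done with strtoull() on restore. */
static uint64_t decode_value(const char *value)
{
    return strtoull(value, NULL, 16);
}
```

With a 16-digit offset and the longest leaf ("start_addr"), the path stays under the 80-byte buffer as long as the domain ID has at most a handful of digits.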




[Qemu-devel] [PATCH 18/20] kvm: x86: Add user space part for in-kernel i8259

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

Introduce the alternative 'kvm-i8259' device model that exploits KVM
in-kernel acceleration.

The PIIX3 initialization code is furthermore extended by KVM specific
IRQ route setup. GSI injection differs in KVM mode from the user space
model. As we can dispatch ISA-range IRQs to both IOAPIC and PIC inside
the kernel, we do not need to inject them separately. This is reflected
by a KVM-specific GSI handler.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile.target |2 +-
 hw/kvm/i8259.c  |  128 +++
 hw/pc.h |1 +
 hw/pc_piix.c|   52 --
 4 files changed, 178 insertions(+), 5 deletions(-)
 create mode 100644 hw/kvm/i8259.c

diff --git a/Makefile.target b/Makefile.target
index 1a63a1c..701073d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -233,7 +233,7 @@ obj-i386-y += vmport.o
 obj-i386-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/i8259.c b/hw/kvm/i8259.c
new file mode 100644
index 0000000..64bb5c2
--- /dev/null
+++ b/hw/kvm/i8259.c
@@ -0,0 +1,128 @@
+/*
+ * KVM in-kernel PIC (i8259) support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka  jan.kis...@siemens.com
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include "hw/i8259_internal.h"
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static void kvm_pic_get(PICCommonState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_pic_state *kpic;
+    int ret;
+
+    chip.chip_id = s->master ? KVM_IRQCHIP_PIC_MASTER : KVM_IRQCHIP_PIC_SLAVE;
+    ret = kvm_vm_ioctl(kvm_state, KVM_GET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(ret));
+        abort();
+    }
+
+    kpic = &chip.chip.pic;
+
+    s->last_irr = kpic->last_irr;
+    s->irr = kpic->irr;
+    s->imr = kpic->imr;
+    s->isr = kpic->isr;
+    s->priority_add = kpic->priority_add;
+    s->irq_base = kpic->irq_base;
+    s->read_reg_select = kpic->read_reg_select;
+    s->poll = kpic->poll;
+    s->special_mask = kpic->special_mask;
+    s->init_state = kpic->init_state;
+    s->auto_eoi = kpic->auto_eoi;
+    s->rotate_on_auto_eoi = kpic->rotate_on_auto_eoi;
+    s->special_fully_nested_mode = kpic->special_fully_nested_mode;
+    s->init4 = kpic->init4;
+    s->elcr = kpic->elcr;
+    s->elcr_mask = kpic->elcr_mask;
+}
+
+static void kvm_pic_put(PICCommonState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_pic_state *kpic;
+    int ret;
+
+    chip.chip_id = s->master ? KVM_IRQCHIP_PIC_MASTER : KVM_IRQCHIP_PIC_SLAVE;
+
+    kpic = &chip.chip.pic;
+
+    kpic->last_irr = s->last_irr;
+    kpic->irr = s->irr;
+    kpic->imr = s->imr;
+    kpic->isr = s->isr;
+    kpic->priority_add = s->priority_add;
+    kpic->irq_base = s->irq_base;
+    kpic->read_reg_select = s->read_reg_select;
+    kpic->poll = s->poll;
+    kpic->special_mask = s->special_mask;
+    kpic->init_state = s->init_state;
+    kpic->auto_eoi = s->auto_eoi;
+    kpic->rotate_on_auto_eoi = s->rotate_on_auto_eoi;
+    kpic->special_fully_nested_mode = s->special_fully_nested_mode;
+    kpic->init4 = s->init4;
+    kpic->elcr = s->elcr;
+    kpic->elcr_mask = s->elcr_mask;
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_SET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(ret));
+        abort();
+    }
+}
+
+static void kvm_pic_reset(DeviceState *dev)
+{
+    PICCommonState *s = DO_UPCAST(PICCommonState, dev.qdev, dev);
+
+    pic_reset_common(s);
+    s->elcr = 0;
+
+    kvm_pic_put(s);
+}
+
+static void kvm_pic_set_irq(void *opaque, int irq, int level)
+{
+    int delivered;
+
+    delivered = kvm_irqchip_set_irq(kvm_state, irq, level);
+    apic_report_irq_delivered(delivered);
+}
+
+static void kvm_pic_init(PICCommonState *s)
+{
+    memory_region_init_reservation(&s->base_io, "kvm-pic", 2);
+    memory_region_init_reservation(&s->elcr_io, "kvm-elcr", 1);
+}
+
+qemu_irq *kvm_i8259_init(ISABus *bus)
+{
+    i8259_init_chip("kvm-i8259", bus, true);
+    i8259_init_chip("kvm-i8259", bus, false);
+
+    return qemu_allocate_irqs(kvm_pic_set_irq, NULL, ISA_NUM_IRQS);
+}
+
+static PICCommonInfo kvm_i8259_info = {
+    .isadev.qdev.name  = "kvm-i8259",
+    .isadev.qdev.reset = kvm_pic_reset,
+    .init       = kvm_pic_init,
+    .pre_save   = kvm_pic_get,
+    .post_load  = kvm_pic_put,
+};
+
+static void kvm_pic_register(void)
+{
+    pic_qdev_register(&kvm_i8259_info);
+}
+
+device_init(kvm_pic_register)
diff --git a/hw/pc.h b/hw/pc.h
index ece069a..5e913db 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -64,6 +64,7 @@ bool parallel_mm_init(MemoryRegion *address_space,
 
 extern DeviceState *isa_pic;
 

[Qemu-devel] [PATCH 08/20] apic: Factor out base class for KVM reuse

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

The KVM in-kernel APIC model will reuse parts of the user space model
while providing the same frontend view to guest and most management
interfaces.

Factor out an APIC base class to encapsulate those parts that will be
shared by user space and KVM model. This class offers callback hooks for
init, base/tpr setting, and the external NMI delivery that will be
set via APICCommonInfo structure and implemented specifically in the
subclasses.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile.target|2 +-
 hw/apic.c  |  338 +++-
 hw/apic.h  |1 -
 hw/apic_common.c   |  252 ++
 hw/apic_internal.h |  112 +
 5 files changed, 406 insertions(+), 299 deletions(-)
 create mode 100644 hw/apic_common.c
 create mode 100644 hw/apic_internal.h

diff --git a/Makefile.target b/Makefile.target
index 0451b63..4446273 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -228,7 +228,7 @@ obj-y += device-hotplug.o
 # Hardware support
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o pc.o
-obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
+obj-i386-y += cirrus_vga.o sga.o apic_common.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
diff --git a/hw/apic.c b/hw/apic.c
index bec493b..387a469 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -16,53 +16,13 @@
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see http://www.gnu.org/licenses/
  */
-#include "hw.h"
+#include "apic_internal.h"
 #include "apic.h"
 #include "ioapic.h"
-#include "qemu-timer.h"
 #include "host-utils.h"
-#include "sysbus.h"
 #include "trace.h"
 #include "pc.h"
 
-/* APIC Local Vector Table */
-#define APIC_LVT_TIMER   0
-#define APIC_LVT_THERMAL 1
-#define APIC_LVT_PERFORM 2
-#define APIC_LVT_LINT0   3
-#define APIC_LVT_LINT1   4
-#define APIC_LVT_ERROR   5
-#define APIC_LVT_NB      6
-
-/* APIC delivery modes */
-#define APIC_DM_FIXED  0
-#define APIC_DM_LOWPRI 1
-#define APIC_DM_SMI    2
-#define APIC_DM_NMI    4
-#define APIC_DM_INIT   5
-#define APIC_DM_SIPI   6
-#define APIC_DM_EXTINT 7
-
-/* APIC destination mode */
-#define APIC_DESTMODE_FLAT 0xf
-#define APIC_DESTMODE_CLUSTER  1
-
-#define APIC_TRIGGER_EDGE  0
-#define APIC_TRIGGER_LEVEL 1
-
-#define APIC_LVT_TIMER_PERIODIC (1<<17)
-#define APIC_LVT_MASKED         (1<<16)
-#define APIC_LVT_LEVEL_TRIGGER  (1<<15)
-#define APIC_LVT_REMOTE_IRR     (1<<14)
-#define APIC_INPUT_POLARITY     (1<<13)
-#define APIC_SEND_PENDING       (1<<12)
-
-#define ESR_ILLEGAL_ADDRESS (1 << 7)
-
-#define APIC_SV_DIRECTED_IO (1<<12)
-#define APIC_SV_ENABLE      (1<<8)
-
-#define MAX_APICS 255
 #define MAX_APIC_WORDS 8
 
 /* Intel APIC constants: from include/asm/msidef.h */
@@ -75,43 +35,10 @@
 #define MSI_ADDR_DEST_ID_SHIFT 12
 #define MSI_ADDR_DEST_ID_MASK   0x00ffff0
 
-#define MSI_ADDR_SIZE   0x100000
-
-typedef struct APICState APICState;
-
-struct APICState {
-    SysBusDevice busdev;
-    MemoryRegion io_memory;
-    void *cpu_env;
-    uint32_t apicbase;
-    uint8_t id;
-    uint8_t arb_id;
-    uint8_t tpr;
-    uint32_t spurious_vec;
-    uint8_t log_dest;
-    uint8_t dest_mode;
-    uint32_t isr[8];  /* in service register */
-    uint32_t tmr[8];  /* trigger mode register */
-    uint32_t irr[8]; /* interrupt request register */
-    uint32_t lvt[APIC_LVT_NB];
-    uint32_t esr; /* error register */
-    uint32_t icr[2];
-
-    uint32_t divide_conf;
-    int count_shift;
-    uint32_t initial_count;
-    int64_t initial_count_load_time, next_time;
-    uint32_t idx;
-    QEMUTimer *timer;
-    int sipi_vector;
-    int wait_for_sipi;
-};
-
-static APICState *local_apics[MAX_APICS + 1];
-static int apic_irq_delivered;
+static APICCommonState *local_apics[MAX_APICS + 1];
 
-static void apic_set_irq(APICState *s, int vector_num, int trigger_mode);
-static void apic_update_irq(APICState *s);
+static void apic_set_irq(APICCommonState *s, int vector_num, int trigger_mode);
+static void apic_update_irq(APICCommonState *s);
 static void apic_get_delivery_bitmask(uint32_t *deliver_bitmask,
                                       uint8_t dest, uint8_t dest_mode);
 
@@ -151,7 +78,7 @@ static inline int get_bit(uint32_t *tab, int index)
     return !!(tab[i] & mask);
 }
 
-static void apic_local_deliver(APICState *s, int vector)
+static void apic_local_deliver(APICCommonState *s, int vector)
 {
     uint32_t lvt = s->lvt[vector];
     int trigger_mode;
@@ -185,7 +112,7 @@ static void apic_local_deliver(APICState *s, int vector)
 
 void apic_deliver_pic_intr(DeviceState *d, int level)
 {
-    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+APICCommonState *s = 

[Qemu-devel] [PATCH 19/20] kvm: x86: Add user space part for in-kernel IOAPIC

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

This introduces the KVM-accelerated IOAPIC model 'kvm-ioapic' and
extends the IRQ routing setup by the 0->2 redirection when needed.

The kvm-ioapic model has a property that allows defining its GSI base
for injecting interrupts into the kernel model. This will make it possible to
disentangle PIC and IOAPIC pins for chipsets that support more
sophisticated IRQ routes than the PIIX3. So far the base is kept at 0,
i.e. PIC and IOAPIC share pins 0..15.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile.target |2 +-
 hw/kvm/ioapic.c |  114 +++
 hw/pc_piix.c|   15 +++-
 3 files changed, 129 insertions(+), 2 deletions(-)
 create mode 100644 hw/kvm/ioapic.c

diff --git a/Makefile.target b/Makefile.target
index 701073d..98cb997 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -233,7 +233,7 @@ obj-i386-y += vmport.o
 obj-i386-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o kvm/i8259.o kvm/ioapic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/ioapic.c b/hw/kvm/ioapic.c
new file mode 100644
index 0000000..10ffdd4
--- /dev/null
+++ b/hw/kvm/ioapic.c
@@ -0,0 +1,114 @@
+/*
+ * KVM in-kernel IOPIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka  jan.kis...@siemens.com
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "hw/pc.h"
+#include "hw/ioapic_internal.h"
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+typedef struct KVMIOAPICState KVMIOAPICState;
+
+struct KVMIOAPICState {
+    IOAPICCommonState ioapic;
+    uint32_t kvm_gsi_base;
+};
+
+static void kvm_ioapic_get(IOAPICCommonState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_ioapic_state *kioapic;
+    int ret, i;
+
+    chip.chip_id = KVM_IRQCHIP_IOAPIC;
+    ret = kvm_vm_ioctl(kvm_state, KVM_GET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(ret));
+        abort();
+    }
+
+    kioapic = &chip.chip.ioapic;
+
+    s->id = kioapic->id;
+    s->ioregsel = kioapic->ioregsel;
+    s->irr = kioapic->irr;
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        s->ioredtbl[i] = kioapic->redirtbl[i].bits;
+    }
+}
+
+static void kvm_ioapic_put(IOAPICCommonState *s)
+{
+    struct kvm_irqchip chip;
+    struct kvm_ioapic_state *kioapic;
+    int ret, i;
+
+    chip.chip_id = KVM_IRQCHIP_IOAPIC;
+    kioapic = &chip.chip.ioapic;
+
+    kioapic->id = s->id;
+    kioapic->ioregsel = s->ioregsel;
+    kioapic->base_address = s->busdev.mmio[0].addr;
+    kioapic->irr = s->irr;
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        kioapic->redirtbl[i].bits = s->ioredtbl[i];
+    }
+
+    ret = kvm_vm_ioctl(kvm_state, KVM_SET_IRQCHIP, &chip);
+    if (ret < 0) {
+        fprintf(stderr, "KVM_GET_IRQCHIP failed: %s\n", strerror(ret));
+        abort();
+    }
+}
+
+static void kvm_ioapic_reset(DeviceState *dev)
+{
+    IOAPICCommonState *s = DO_UPCAST(IOAPICCommonState, busdev.qdev, dev);
+
+    ioapic_reset_common(dev);
+    kvm_ioapic_put(s);
+}
+
+static void kvm_ioapic_set_irq(void *opaque, int irq, int level)
+{
+    KVMIOAPICState *s = opaque;
+    int delivered;
+
+    delivered = kvm_irqchip_set_irq(kvm_state, s->kvm_gsi_base + irq, level);
+    apic_report_irq_delivered(delivered);
+}
+
+static void kvm_ioapic_init(IOAPICCommonState *s, int instance_no)
+{
+    memory_region_init_reservation(&s->io_memory, "kvm-ioapic", 0x1000);
+
+    qdev_init_gpio_in(&s->busdev.qdev, kvm_ioapic_set_irq, IOAPIC_NUM_PINS);
+}
+
+static IOAPICCommonInfo kvm_ioapic_info = {
+    .busdev.qdev.name  = "kvm-ioapic",
+    .busdev.qdev.size = sizeof(KVMIOAPICState),
+    .busdev.qdev.reset = kvm_ioapic_reset,
+    .busdev.qdev.props = (Property[]) {
+        DEFINE_PROP_UINT32("gsi_base", KVMIOAPICState, kvm_gsi_base, 0),
+        DEFINE_PROP_END_OF_LIST()
+    },
+    .init      = kvm_ioapic_init,
+    .pre_save  = kvm_ioapic_get,
+    .post_load = kvm_ioapic_put,
+};
+
+static void kvm_ioapic_register_device(void)
+{
+    ioapic_qdev_register(&kvm_ioapic_info);
+}
+
+device_init(kvm_ioapic_register_device)
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 297c04a..a285ad2 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -69,6 +69,15 @@ static void kvm_piix3_setup_irq_routing(bool pci_enabled)
     for (i = 8; i < 16; ++i) {
         kvm_irqchip_add_route(s, i, KVM_IRQCHIP_PIC_SLAVE, i - 8);
     }
+    if (pci_enabled) {
+        for (i = 0; i < 24; ++i) {
+            if (i == 0) {
+                kvm_irqchip_add_route(s, i, KVM_IRQCHIP_IOAPIC, 2);
+            } else if (i != 2) {
+                kvm_irqchip_add_route(s, i, KVM_IRQCHIP_IOAPIC, i);
+            }
+        }
+ 

[Qemu-devel] [PATCH 12/20] ioapic: Drop post-load irr initialization

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

As all devices undergo a reset prior to vmload, and the reset value of
irr is 0, we do not need to do this clearing for older vmstates
explicitly. Dropping this redundant code will also make KVM integration
a bit simpler.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/ioapic.c |   12 
 1 files changed, 0 insertions(+), 12 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index 27b07c6..0743af6 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -278,21 +278,9 @@ ioapic_mem_write(void *opaque, target_phys_addr_t addr, uint64_t val,
     }
 }
 
-static int ioapic_post_load(void *opaque, int version_id)
-{
-    IOAPICState *s = opaque;
-
-    if (version_id == 1) {
-        /* set sane value */
-        s->irr = 0;
-    }
-    return 0;
-}
-
 static const VMStateDescription vmstate_ioapic = {
     .name = "ioapic",
     .version_id = 3,
-    .post_load = ioapic_post_load,
     .minimum_version_id = 1,
     .minimum_version_id_old = 1,
     .fields = (VMStateField[]) {
-- 
1.7.6.4




Re: [Qemu-devel] Get only TCG code without execution

2012-01-20 Thread 陳韋任
> I was not talking about semantics of individual instructions but semantics
> of the whole multi-threaded program. Multi-threaded programs can lead to
> several different (most of which are unintended) states of the CPU. What
> states are possible is described in a mathematically rigorous definition of
> the ARM memory model. My task is to implement this memory model over TCG
> ops and then compare the results on several different (multi-threaded)
> litmus tests with the implementation of the memory model over ARM
> instructions. For the same task, I need QEMU to give me the TCG translation
> for code which it never branches into and hence, never needs to translate
> and execute (because ARM multiprocessors can perform speculative execution).

  Out of curiosity, what is the ARM memory model? According to Wikipedia [1],
it seems ARMv7 has the same memory model as IA64.

Regards,
chenwj

[1] http://en.wikipedia.org/wiki/Memory_ordering

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-01-20 Thread 陳韋任
> 1.  The control of gCR3 and hCR3 needs kernel access. While they can
> be set with a device module, as is done in kvm, trapping into the
> kernel every time gCR3 is reset might be too expensive.

  Why does the control of gCR3 need kernel access? Isn't gCR3 just a field of
CPUX86State? QEMU should have control of it. Or do you mean the trapping?

> 2. After setting gCR3 and hCR3, whatever memory references fall
> within the guest memory will be done correctly. However, memory
> references done by the host will be broken. Therefore, when we load
> from the CPU states or call helpers on exits from the code cache,
> we need to change the paging mechanism back to non-nested. Can this be
> done? How expensive will this be?

  Why would the memory references done by the host be broken?

Regards,
chenwj
 
-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] Get only TCG code without execution

2012-01-20 Thread 陳韋任
>> I was not talking about semantics of individual instructions but semantics
>> of the whole multi-threaded program. Multi-threaded programs can lead to
>> several different (most of which are unintended) states of the CPU. What
>> states are possible is described in a mathematically rigorous definition of
>> the ARM memory model. My task is to implement this memory model over TCG ops
>> and then compare the results on several different (multi-threaded) litmus
>> tests with the implementation of the memory model over ARM instructions.
>
> Some points to note:
>  * The current QEMU code has some known race conditions which can cause
>    crashes/hangs in heavily threaded programs in linux-user mode; see e.g.
>    https://bugs.launchpad.net/qemu/+bug/668799
>  * We don't really make a serious attempt at implementing the ARM memory
>    model in QEMU; our load/store exclusive implementation is pretty hopeless,
>    for instance
>  * In linux-user mode we basically just pass loads/stores/etc through as
>    host-cpu loads/stores, so you get whatever the host's memory model semantics
>    are, not what the guest CPU is supposed to do
>  * a combination of the above plus the fact we don't implement caches in
>    system emulation mode means that our implementation of all the barrier
>    insns is a simple no-op; you'll never see barriers at the TCG op level

  What is the load/store exclusive implementation? And as a general emulator,
QEMU shouldn't implement any architecture-specific memory model, right? My
understanding is that QEMU only needs to follow the guest's memory operations
when it translates the guest binary to TCG ops. When it translates TCG ops to
the host binary, it also has to be careful not to mess up the memory ordering.

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] Question about do_interrupt (target-i386/op_helper.c)

2012-01-20 Thread 陳韋任
> In cpu_exec() env is a local variable. In do_interrupt() it is the global
> variable (held in a specific register via asm(AREG0)). The two aren't
> necessarily the same value, hence the fiddling about.

  Do you mean we sync env with envl at this point?

Regards,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] [PATCH] iSCSI: add configuration variables for iSCSI

2012-01-20 Thread ronnie sahlberg
On Thu, Jan 19, 2012 at 11:17 PM, Kevin Wolf kw...@redhat.com wrote:
> On 18.12.2011 05:48, Ronnie Sahlberg wrote:
>> This patch adds configuration variables for iSCSI to set the
>> initiator-name to use when logging in to the target,
>> which type of header-digest to negotiate with the target,
>> and the username and password for CHAP authentication.
>>
>> This allows specifying an initiator-name either from the command line
>> -iscsi initiator-name=iqn.2004-01.com.example:test
>> or from a configuration file included with -readconfig
>> [iscsi]
>>   initiator-name = iqn.2004-01.com.example:test
>>   header-digest = CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
>>   user = CHAP username
>>   password = CHAP password
>>
>> The patch also updates the manpage and qemu-doc
>>
>> Signed-off-by: Ronnie Sahlberg ronniesahlb...@gmail.com
>
> So these options are global? What if I wanted to use two different
> setups for two different images?



Good point.
I will rework the patch so that it first checks for
[iscsi iqn.target.name]
and, if that is not found, falls back to checking for plain [iscsi].

That would allow one catch-all section for all targets, while still
making it possible to override it and use different settings on a
per-target basis.
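
Roughly, the lookup order would go like this. It is only a sketch: the
iscsi_section table and find_iscsi_section() are made-up names for
illustration, not the QEMU option API, and the real patch may well look
different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of the parsed [iscsi ...] sections.
 * id == NULL stands for the plain catch-all [iscsi] section. */
struct iscsi_section {
    const char *id;             /* target name, or NULL for [iscsi] */
    const char *initiator_name; /* value of initiator-name */
};

/* Return the section to use for a target: an exact per-target match wins,
 * otherwise fall back to the catch-all section, otherwise NULL. */
static const struct iscsi_section *
find_iscsi_section(const struct iscsi_section *sections, size_t count,
                   const char *target)
{
    const struct iscsi_section *fallback = NULL;
    size_t i;

    assert(sections != NULL || count == 0);
    assert(target != NULL);

    for (i = 0; i < count; i++) {
        if (sections[i].id == NULL) {
            /* remember the first catch-all section we see */
            if (fallback == NULL) {
                fallback = &sections[i];
            }
        } else if (strcmp(sections[i].id, target) == 0) {
            return &sections[i];
        }
    }
    return fallback;
}
```

With one [iscsi] section and one [iscsi iqn.target.name] section, a lookup
for iqn.target.name gets the specific settings and any other target gets
the catch-all ones.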

I will post an updated patch in a day or two.



regards
ronnie sahlberg



Re: [Qemu-devel] Removing indeterminism in qemu execution

2012-01-20 Thread batuzovk
> Hi developers,
>
> I'm debugging an operating system with QEMU and I have a race condition in
> the OS. The problem is that each time I run QEMU I get this error in a
> different place, so it makes impossible for gdb to debug it. My plan is to
> remove this indeterminism and be able to reproduce the same error in the
> same place every time. To do that:
>
> * The test is automated (there is no user IO)
> * I've passed the options -rtc base=2006-06-17,clock=vm,driftfixone
>   -icount 2 to QEMU
> * There is no use of KVM (the modules have been removed from the kernel)
>
> So even with that, in each execution I get a different error every time.
> Do you have any suggestions to make the execution identical each time is
> being run?
>
>
> Many thanks!!
>
> --
> Zeus Gómez Marmolejo
> Zet - The x86 (IA-32) open implementation
> http://zet.aluzina.org

Hello.

Actually any I/O (not only user I/O) can cause non-determinism: it is not
known when the data will be ready. Things become even more complicated if
you take into account the multi-threaded nature of QEMU. Threads communicate
with each other and you cannot predict context switches.

AFAIK there is no easy, guaranteed-to-work solution for your problem, but
there are some hard ones (e.g. vmware retrace, though it is not based on
QEMU). If your test case is really simple, you can try disabling as much
multi-threading as you can in QEMU and just hope for it to work.

-- 
Kirill Batuzov



[Qemu-devel] [PATCH 20/20] kvm: Activate in-kernel irqchip support

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

Make the basic in-kernel irqchip support selectable via
-machine ...,kernel_irqchip=on. Leave it off by default until it can
fully replace user space models.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 qemu-config.c   |4 
 qemu-options.hx |5 -
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index ecc88e8..b030205 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -531,6 +531,10 @@ static QemuOptsList qemu_machine_opts = {
             .name = "accel",
             .type = QEMU_OPT_STRING,
             .help = "accelerator list",
+        }, {
+            .name = "kernel_irqchip",
+            .type = QEMU_OPT_BOOL,
+            .help = "use KVM in-kernel irqchip",
         },
         { /* End of list */ }
 },
diff --git a/qemu-options.hx b/qemu-options.hx
index 6295cde..3a07ae8 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -31,7 +31,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
     "-machine [type=]name[,prop[=value][,...]]\n"
     "                selects emulated machine (-machine ? for list)\n"
     "                property accel=accel1[:accel2[:...]] selects accelerator\n"
-    "                supported accelerators are kvm, xen, tcg (default: tcg)\n",
+    "                supported accelerators are kvm, xen, tcg (default: tcg)\n"
+    "                kernel_irqchip=on|off controls accelerated irqchip support\n",
     QEMU_ARCH_ALL)
 STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
@@ -44,6 +45,8 @@ This is used to enable an accelerator. Depending on the target architecture,
 kvm, xen, or tcg can be available. By default, tcg is used. If there is more
 than one accelerator specified, the next one is used if the previous one fails
 to initialize.
+@item kernel_irqchip=on|off
+Enables in-kernel irqchip support for the chosen accelerator when available.
 @end table
 ETEXI
 
-- 
1.7.6.4




[Qemu-devel] [PATCH 02/20] hyper-v: initialize Hyper-V CPUID leaves.

2012-01-20 Thread Marcelo Tosatti
From: Vadim Rozenfeld vroze...@redhat.com

Signed-off-by: Marcelo Tosatti mtosa...@redhat.com
---
 target-i386/kvm.c |   65 +++-
 1 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 04e65c5..1f56492 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -29,6 +29,7 @@
 #include "hw/pc.h"
 #include "hw/apic.h"
 #include "ioport.h"
+#include "hyperv.h"
 
 //#define DEBUG_KVM
 
@@ -373,11 +374,16 @@ int kvm_arch_init_vcpu(CPUState *env)
     cpuid_i = 0;
 
     /* Paravirtualization CPUIDs */
-    memcpy(signature, "KVMKVMKVM\0\0\0", 12);
     c = &cpuid_data.entries[cpuid_i++];
     memset(c, 0, sizeof(*c));
     c->function = KVM_CPUID_SIGNATURE;
-    c->eax = 0;
+    if (!hyperv_enabled()) {
+        memcpy(signature, "KVMKVMKVM\0\0\0", 12);
+        c->eax = 0;
+    } else {
+        memcpy(signature, "Microsoft Hv", 12);
+        c->eax = HYPERV_CPUID_MIN;
+    }
     c->ebx = signature[0];
     c->ecx = signature[1];
     c->edx = signature[2];
@@ -388,6 +394,54 @@ int kvm_arch_init_vcpu(CPUState *env)
     c->eax = env->cpuid_kvm_features &
         kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
 
+    if (hyperv_enabled()) {
+        memcpy(signature, "Hv#1\0\0\0\0\0\0\0\0", 12);
+        c->eax = signature[0];
+
+        c = &cpuid_data.entries[cpuid_i++];
+        memset(c, 0, sizeof(*c));
+        c->function = HYPERV_CPUID_VERSION;
+        c->eax = 0x00001bbc;
+        c->ebx = 0x00060001;
+
+        c = &cpuid_data.entries[cpuid_i++];
+        memset(c, 0, sizeof(*c));
+        c->function = HYPERV_CPUID_FEATURES;
+        if (hyperv_relaxed_timing_enabled()) {
+            c->eax |= HV_X64_MSR_HYPERCALL_AVAILABLE;
+        }
+        if (hyperv_vapic_recommended()) {
+            c->eax |= HV_X64_MSR_HYPERCALL_AVAILABLE;
+            c->eax |= HV_X64_MSR_APIC_ACCESS_AVAILABLE;
+        }
+
+        c = &cpuid_data.entries[cpuid_i++];
+        memset(c, 0, sizeof(*c));
+        c->function = HYPERV_CPUID_ENLIGHTMENT_INFO;
+        if (hyperv_relaxed_timing_enabled()) {
+            c->eax |= HV_X64_RELAXED_TIMING_RECOMMENDED;
+        }
+        if (hyperv_vapic_recommended()) {
+            c->eax |= HV_X64_APIC_ACCESS_RECOMMENDED;
+        }
+        c->ebx = hyperv_get_spinlock_retries();
+
+        c = &cpuid_data.entries[cpuid_i++];
+        memset(c, 0, sizeof(*c));
+        c->function = HYPERV_CPUID_IMPLEMENT_LIMITS;
+        c->eax = 0x40;
+        c->ebx = 0x40;
+
+        c = &cpuid_data.entries[cpuid_i++];
+        memset(c, 0, sizeof(*c));
+        c->function = KVM_CPUID_SIGNATURE_NEXT;
+        memcpy(signature, "KVMKVMKVM\0\0\0", 12);
+        c->eax = 0;
+        c->ebx = signature[0];
+        c->ecx = signature[1];
+        c->edx = signature[2];
+    }
+
     has_msr_async_pf_en = c->eax & (1 << KVM_FEATURE_ASYNC_PF);
 
     cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
@@ -933,6 +987,13 @@ static int kvm_put_msrs(CPUState *env, int level)
             kvm_msr_entry_set(&msrs[n++], MSR_KVM_ASYNC_PF_EN,
                               env->async_pf_en_msr);
         }
+        if (hyperv_hypercall_available()) {
+            kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_GUEST_OS_ID, 0);
+            kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_HYPERCALL, 0);
+        }
+        if (hyperv_vapic_recommended()) {
+            kvm_msr_entry_set(&msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0);
+        }
     }
     if (env->mcg_cap) {
 if (env-mcg_cap) {
 int i;
-- 
1.7.6.4




[Qemu-devel] [PATCH 00/20] [PULL] qemu-kvm.git uq/master queue

2012-01-20 Thread Marcelo Tosatti
The following changes since commit 8c4ec5c0269bda18bb777a64b2008088d1c632dc:

  pxa2xx_keypad: fix unbalanced parenthesis. (2012-01-17 02:14:42 +0100)

are available in the git repository at:
  git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

Jan Kiszka (18):
  msi: Generalize msix_supported to msi_supported
  kvm: Move kvmclock into hw/kvm folder
  apic: Stop timer on reset
  apic: Inject external NMI events via LINT1
  apic: Introduce apic_report_irq_delivered
  apic: Factor out base class for KVM reuse
  apic: Open-code timer save/restore
  i8259: Completely privatize PicState
  i8259: Factor out base class for KVM reuse
  ioapic: Drop post-load irr initialization
  ioapic: Factor out base class for KVM reuse
  memory: Introduce memory_region_init_reservation
  kvm: Introduce core services for in-kernel irqchip support
  kvm: x86: Establish IRQ0 override control
  kvm: x86: Add user space part for in-kernel APIC
  kvm: x86: Add user space part for in-kernel i8259
  kvm: x86: Add user space part for in-kernel IOAPIC
  kvm: Activate in-kernel irqchip support

Vadim Rozenfeld (2):
  hyper-v: introduce Hyper-V support infrastructure.
  hyper-v: initialize Hyper-V CPUID leaves.

 Makefile.objs  |2 +-
 Makefile.target|8 +-
 configure  |1 +
 cpus.c |6 +-
 hw/apic.c  |  356 ++--
 hw/apic.h  |1 +
 hw/apic_common.c   |  302 ++
 hw/apic_internal.h |  115 +
 hw/i8259.c |  163 --
 hw/i8259_common.c  |  147 +
 hw/i8259_internal.h|   76 +
 hw/ioapic.c|  142 ++--
 hw/ioapic_common.c |  104 
 hw/ioapic_internal.h   |   97 +++
 hw/kvm/apic.c  |  138 
 hw/{kvmclock.c => kvm/clock.c} |4 +-
 hw/{kvmclock.h => kvm/clock.h} |0
 hw/kvm/i8259.c |  128 ++
 hw/kvm/ioapic.c|  114 +
 hw/msi.c   |8 +
 hw/msi.h   |2 +
 hw/msix.c  |9 +-
 hw/msix.h  |2 -
 hw/pc.c|   20 ++-
 hw/pc.h|8 +-
 hw/pc_piix.c   |   69 +++-
 kvm-all.c  |  154 +
 kvm-stub.c |5 +
 kvm.h  |   14 ++
 memory.c   |   36 
 memory.h   |   16 ++
 qemu-config.c  |4 +
 qemu-options.hx|5 +-
 sysemu.h   |1 -
 target-i386/cpuid.c|   14 ++
 target-i386/hyperv.c   |   64 +++
 target-i386/hyperv.h   |   43 +
 target-i386/kvm.c  |  114 +-
 trace-events   |2 +-
 vl.c   |1 -
 40 files changed, 1902 insertions(+), 593 deletions(-)
 create mode 100644 hw/apic_common.c
 create mode 100644 hw/apic_internal.h
 create mode 100644 hw/i8259_common.c
 create mode 100644 hw/i8259_internal.h
 create mode 100644 hw/ioapic_common.c
 create mode 100644 hw/ioapic_internal.h
 create mode 100644 hw/kvm/apic.c
 rename hw/{kvmclock.c => kvm/clock.c} (98%)
 rename hw/{kvmclock.h => kvm/clock.h} (100%)
 create mode 100644 hw/kvm/i8259.c
 create mode 100644 hw/kvm/ioapic.c
 create mode 100644 target-i386/hyperv.c
 create mode 100644 target-i386/hyperv.h



[Qemu-devel] [PATCH 07/20] apic: Introduce apic_report_irq_delivered

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

The in-kernel i8259 and IOAPIC backends for KVM will need this, so
encapsulate the shared bits.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/apic.c|   11 ---
 hw/apic.h|1 +
 trace-events |2 +-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index b9d733c..bec493b 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -413,6 +413,13 @@ static void apic_update_irq(APICState *s)
 }
 }
 
+void apic_report_irq_delivered(int delivered)
+{
+apic_irq_delivered += delivered;
+
+trace_apic_report_irq_delivered(apic_irq_delivered);
+}
+
 void apic_reset_irq_delivered(void)
 {
 trace_apic_reset_irq_delivered(apic_irq_delivered);
@@ -429,9 +436,7 @@ int apic_get_irq_delivered(void)
 
 static void apic_set_irq(APICState *s, int vector_num, int trigger_mode)
 {
-    apic_irq_delivered += !get_bit(s->irr, vector_num);
-
-    trace_apic_set_irq(apic_irq_delivered);
+    apic_report_irq_delivered(!get_bit(s->irr, vector_num));
 
     set_bit(s->irr, vector_num);
     if (trigger_mode)
diff --git a/hw/apic.h b/hw/apic.h
index a62d83b..8173d8a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
 void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
+void apic_report_irq_delivered(int delivered);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
 void cpu_set_apic_base(DeviceState *s, uint64_t val);
diff --git a/trace-events b/trace-events
index c18435b..5a260d6 100644
--- a/trace-events
+++ b/trace-events
@@ -95,9 +95,9 @@ cpu_get_apic_base(uint64_t val) "%016"PRIx64""
 apic_mem_readl(uint64_t addr, uint32_t val)  "%"PRIx64" = %08x"
 apic_mem_writel(uint64_t addr, uint32_t val) "%"PRIx64" = %08x"
 # coalescing
+apic_report_irq_delivered(int apic_irq_delivered) "coalescing %d"
 apic_reset_irq_delivered(int apic_irq_delivered) "old coalescing %d"
 apic_get_irq_delivered(int apic_irq_delivered) "returning coalescing %d"
-apic_set_irq(int apic_irq_delivered) "coalescing %d"
 
 # hw/cs4231.c
 cs4231_mem_readl_dreg(uint32_t reg, uint32_t ret) "read dreg %d: 0x%02x"
-- 
1.7.6.4




Re: [Qemu-devel] [PATCH v4 0/6] save/restore on Xen

2012-01-20 Thread Jan Kiszka
On 2012-01-20 18:20, Stefano Stabellini wrote:
> Hi all,
> this is the fourth version of the Xen save/restore patch series.
> We have been discussing this issue for quite a while on #qemu and
> qemu-devel:
>
>
> http://marc.info/?l=qemu-devel&m=132346828427314&w=2
> http://marc.info/?l=qemu-devel&m=132377734605464&w=2
>
>
> A few different approaches were proposed to achieve the goal
> of a working save/restore with upstream Qemu on Xen, however after
> prototyping some of them I came up with yet another solution, that I
> think leads to the best results with the least amount of code
> duplication and ugliness.
> Far from saying that this patch series is an example of elegance and
> simplicity, but it is closer to acceptable than anything else I have seen
> so far.
>
> What's new is that Qemu is going to keep track of its own physmap on
> xenstore, so that Xen can be fully aware of the changes Qemu makes to
> the guest's memory map at any time.
> This is all handled by Xen or Xen support in Qemu internally and can be
> used to solve our save/restore framebuffer problem.
>
> From the Qemu common code POV, we still need to avoid saving the guest's
> ram when running on Xen, and we need to avoid resetting the videoram on
> restore (that is a benefit to the generic Qemu case too, because it
> saves a few cpu cycles).

For my understanding: Refraining from the memset is required as the
already restored vram would then be overwritten? Or what is the ordering
of init, RAM restore, and initial device reset now?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] [PATCH 17/20] kvm: x86: Add user space part for in-kernel APIC

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

This introduces the alternative APIC device which makes use of KVM's
in-kernel device model. External NMI injection via LINT1 is emulated by
checking the current state of the in-kernel APIC, only injecting an NMI
into the VCPU if LINT1 is unmasked and configured to DM_NMI.
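
As a stand-alone illustration of that LINT1 check (a sketch only, not the
code from this patch): the macros below are local stand-ins that assume the
usual LVT layout, with the mask bit at bit 16 and the delivery mode in bits
10:8, where NMI is encoded as 4. lint1_delivers_nmi() is a hypothetical
helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the LVT bit layout; assumed, not taken from QEMU. */
#define LVT_MASKED      (1u << 16)  /* entry is masked */
#define LVT_DM_SHIFT    8           /* delivery mode lives in bits 10:8 */
#define LVT_DM_MASK     7u
#define DM_NMI          4u          /* delivery mode "NMI" */

/* True if an external NMI arriving on LINT1 should reach the VCPU. */
static bool lint1_delivers_nmi(uint32_t lvt)
{
    bool unmasked = !(lvt & LVT_MASKED);
    uint32_t delivery_mode = (lvt >> LVT_DM_SHIFT) & LVT_DM_MASK;

    return unmasked && delivery_mode == DM_NMI;
}

/* Small self-check of the three interesting cases. */
static void lint1_selfcheck(void)
{
    assert(lint1_delivers_nmi(DM_NMI << LVT_DM_SHIFT));
    assert(!lint1_delivers_nmi((DM_NMI << LVT_DM_SHIFT) | LVT_MASKED));
    assert(!lint1_delivers_nmi(0)); /* unmasked, but fixed delivery mode */
}
```

Only when this predicate holds would the NMI be passed on to the VCPU; in
every other case the event is dropped.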

MSI is not yet supported, so we disable this when the in-kernel model is
in use.

CC: Lai Jiangshan la...@cn.fujitsu.com
Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 Makefile.target   |2 +-
 hw/kvm/apic.c |  138 +
 hw/pc.c   |   15 --
 kvm.h |4 ++
 target-i386/kvm.c |   38 +++
 5 files changed, 191 insertions(+), 6 deletions(-)
 create mode 100644 hw/kvm/apic.c

diff --git a/Makefile.target b/Makefile.target
index 556942d..1a63a1c 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -233,7 +233,7 @@ obj-i386-y += vmport.o
 obj-i386-y += pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
 obj-i386-y += pc_piix.o
-obj-i386-$(CONFIG_KVM) += kvm/clock.o
+obj-i386-$(CONFIG_KVM) += kvm/clock.o kvm/apic.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/kvm/apic.c b/hw/kvm/apic.c
new file mode 100644
index 000..6300695
--- /dev/null
+++ b/hw/kvm/apic.c
@@ -0,0 +1,138 @@
+/*
+ * KVM in-kernel APIC support
+ *
+ * Copyright (c) 2011 Siemens AG
+ *
+ * Authors:
+ *  Jan Kiszka  jan.kis...@siemens.com
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ */
+#include "hw/apic_internal.h"
+#include "kvm.h"
+
+static inline void kvm_apic_set_reg(struct kvm_lapic_state *kapic,
+                                    int reg_id, uint32_t val)
+{
+    *((uint32_t *)(kapic->regs + (reg_id << 4))) = val;
+}
+
+static inline uint32_t kvm_apic_get_reg(struct kvm_lapic_state *kapic,
+                                        int reg_id)
+{
+    return *((uint32_t *)(kapic->regs + (reg_id << 4)));
+}
+
+void kvm_put_apic_state(DeviceState *d, struct kvm_lapic_state *kapic)
+{
+    APICCommonState *s = DO_UPCAST(APICCommonState, busdev.qdev, d);
+    int i;
+
+    memset(kapic, 0, sizeof(*kapic));
+    kvm_apic_set_reg(kapic, 0x2, s->id << 24);
+    kvm_apic_set_reg(kapic, 0x8, s->tpr);
+    kvm_apic_set_reg(kapic, 0xd, s->log_dest << 24);
+    kvm_apic_set_reg(kapic, 0xe, s->dest_mode << 28 | 0x0fffffff);
+    kvm_apic_set_reg(kapic, 0xf, s->spurious_vec);
+    for (i = 0; i < 8; i++) {
+        kvm_apic_set_reg(kapic, 0x10 + i, s->isr[i]);
+        kvm_apic_set_reg(kapic, 0x18 + i, s->tmr[i]);
+        kvm_apic_set_reg(kapic, 0x20 + i, s->irr[i]);
+    }
+    kvm_apic_set_reg(kapic, 0x28, s->esr);
+    kvm_apic_set_reg(kapic, 0x30, s->icr[0]);
+    kvm_apic_set_reg(kapic, 0x31, s->icr[1]);
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        kvm_apic_set_reg(kapic, 0x32 + i, s->lvt[i]);
+    }
+    kvm_apic_set_reg(kapic, 0x38, s->initial_count);
+    kvm_apic_set_reg(kapic, 0x3e, s->divide_conf);
+}
+
+void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic)
+{
+    APICCommonState *s = DO_UPCAST(APICCommonState, busdev.qdev, d);
+    int i, v;
+
+    s->id = kvm_apic_get_reg(kapic, 0x2) >> 24;
+    s->tpr = kvm_apic_get_reg(kapic, 0x8);
+    s->arb_id = kvm_apic_get_reg(kapic, 0x9);
+    s->log_dest = kvm_apic_get_reg(kapic, 0xd) >> 24;
+    s->dest_mode = kvm_apic_get_reg(kapic, 0xe) >> 28;
+    s->spurious_vec = kvm_apic_get_reg(kapic, 0xf);
+    for (i = 0; i < 8; i++) {
+        s->isr[i] = kvm_apic_get_reg(kapic, 0x10 + i);
+        s->tmr[i] = kvm_apic_get_reg(kapic, 0x18 + i);
+        s->irr[i] = kvm_apic_get_reg(kapic, 0x20 + i);
+    }
+    s->esr = kvm_apic_get_reg(kapic, 0x28);
+    s->icr[0] = kvm_apic_get_reg(kapic, 0x30);
+    s->icr[1] = kvm_apic_get_reg(kapic, 0x31);
+    for (i = 0; i < APIC_LVT_NB; i++) {
+        s->lvt[i] = kvm_apic_get_reg(kapic, 0x32 + i);
+    }
+    s->initial_count = kvm_apic_get_reg(kapic, 0x38);
+    s->divide_conf = kvm_apic_get_reg(kapic, 0x3e);
+
+    v = (s->divide_conf & 3) | ((s->divide_conf >> 1) & 4);
+    s->count_shift = (v + 1) & 7;
+
+    s->initial_count_load_time = qemu_get_clock_ns(vm_clock);
+    apic_next_timer(s, s->initial_count_load_time);
+}
+
+static void kvm_apic_set_base(APICCommonState *s, uint64_t val)
+{
+    s->apicbase = val;
+}
+
+static void kvm_apic_set_tpr(APICCommonState *s, uint8_t val)
+{
+    s->tpr = (val & 0x0f) << 4;
+}
+
+static void do_inject_external_nmi(void *data)
+{
+    APICCommonState *s = data;
+    CPUState *env = s->cpu_env;
+    uint32_t lvt;
+    int ret;
+
+    cpu_synchronize_state(env);
+
+    lvt = s->lvt[APIC_LVT_LINT1];
+    if (!(lvt & APIC_LVT_MASKED) && ((lvt >> 8) & 7) == APIC_DM_NMI) {
+        ret = kvm_vcpu_ioctl(env, KVM_NMI);
+        if (ret < 0) {
+            fprintf(stderr, "KVM: injection failed, NMI lost (%s)\n",
+                    strerror(-ret));
+        }
+    }
+}
+static void 

[Qemu-devel] [PATCH v4 4/6] cirrus_vga: do not reset videoram on resume

2012-01-20 Thread Stefano Stabellini
From: Anthony PERARD anthony.per...@citrix.com

When resuming we shouldn't set the videoram to 0xff considering that we
are about to read it from the savefile.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 hw/cirrus_vga.c |9 ++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index f7b1d3d..eec2fc0 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -32,6 +32,7 @@
 #include "console.h"
 #include "vga_int.h"
 #include "loader.h"
+#include "sysemu.h"
 
 /*
  * TODO:
@@ -2760,9 +2761,11 @@ static void cirrus_reset(void *opaque)
     }
     s->vga.cr[0x27] = s->device_id;
 
-    /* Win2K seems to assume that the pattern buffer is at 0xff
-       initially ! */
-    memset(s->vga.vram_ptr, 0xff, s->real_vram_size);
+    if (!runstate_check(RUN_STATE_INMIGRATE)) {
+        /* Win2K seems to assume that the pattern buffer is at 0xff
+           initially ! */
+        memset(s->vga.vram_ptr, 0xff, s->real_vram_size);
+    }
 
     s->cirrus_hidden_dac_lockindex = 5;
     s->cirrus_hidden_dac_data = 0;
-- 
1.7.2.5




[Qemu-devel] [PATCH 03/20] msi: Generalize msix_supported to msi_supported

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

Rename msix_supported to msi_supported and control MSI and MSI-X
activation this way. That was likely the original intention for this
flag, but MSI support came after MSI-X.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/msi.c  |8 
 hw/msi.h  |2 ++
 hw/msix.c |9 -
 hw/msix.h |2 --
 hw/pc.c   |4 ++--
 5 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/hw/msi.c b/hw/msi.c
index f214fcf..5d6ceb6 100644
--- a/hw/msi.c
+++ b/hw/msi.c
@@ -36,6 +36,9 @@
 
 #define PCI_MSI_VECTORS_MAX 32
 
+/* Flag for interrupt controller to declare MSI/MSI-X support */
+bool msi_supported;
+
 /* If we get rid of cap allocator, we won't need this. */
 static inline uint8_t msi_cap_sizeof(uint16_t flags)
 {
@@ -116,6 +119,11 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
 uint16_t flags;
 uint8_t cap_size;
 int config_offset;
+
+if (!msi_supported) {
+return -ENOTSUP;
+}
+
     MSI_DEV_PRINTF(dev,
                    "init offset: 0x%"PRIx8" vector: %"PRId8
                    " 64bit %d mask %d\n",
diff --git a/hw/msi.h b/hw/msi.h
index 5766018..3040bb0 100644
--- a/hw/msi.h
+++ b/hw/msi.h
@@ -24,6 +24,8 @@
 #include "qemu-common.h"
 #include "pci.h"
 
+extern bool msi_supported;
+
 bool msi_enabled(const PCIDevice *dev);
 int msi_init(struct PCIDevice *dev, uint8_t offset,
  unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask);
diff --git a/hw/msix.c b/hw/msix.c
index f47d26b..3835eaa 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -15,6 +15,7 @@
  */
 
 #include "hw.h"
+#include "msi.h"
 #include "msix.h"
 #include "pci.h"
 #include "range.h"
@@ -35,9 +36,6 @@
 #define MSIX_MAX_ENTRIES 32
 
 
-/* Flag for interrupt controller to declare MSI-X support */
-int msix_supported;
-
 /* Add MSI-X capability to the config space for the device. */
 /* Given a bar and its size, add MSI-X table on top of it
  * and fill MSI-X capability in the config space.
@@ -238,10 +236,11 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
               unsigned bar_nr, unsigned bar_size)
 {
     int ret;
+
     /* Nothing to do if MSI is not supported by interrupt controller */
-    if (!msix_supported)
+    if (!msi_supported) {
         return -ENOTSUP;
-
+    }
     if (nentries > MSIX_MAX_ENTRIES)
         return -EINVAL;
 
diff --git a/hw/msix.h b/hw/msix.h
index 7e04336..5aba22b 100644
--- a/hw/msix.h
+++ b/hw/msix.h
@@ -29,6 +29,4 @@ void msix_notify(PCIDevice *dev, unsigned vector);
 
 void msix_reset(PCIDevice *dev);
 
-extern int msix_supported;
-
 #endif
diff --git a/hw/pc.c b/hw/pc.c
index 85304cf..04304e0 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -36,7 +36,7 @@
 #include "elf.h"
 #include "multiboot.h"
 #include "mc146818rtc.h"
-#include "msix.h"
+#include "msi.h"
 #include "sysbus.h"
 #include "sysemu.h"
 #include "blockdev.h"
@@ -896,7 +896,7 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 apic_mapped = 1;
 }
 
-msix_supported = 1;
+msi_supported = true;
 
 return dev;
 }
-- 
1.7.6.4




[Qemu-devel] [PATCH 16/20] kvm: x86: Establish IRQ0 override control

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

KVM is forced to disable the IRQ0 override when we run with in-kernel
irqchip but without IRQ routing support of the kernel. Set the fwcfg
value correspondingly. This aligns us with qemu-kvm.
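
The rule can be modelled on its own as a small truth table. The following
is only an illustrative sketch: the three flags stand in for kvm_enabled(),
kvm_irqchip_in_kernel() and kvm_has_gsi_routing(), and the function names
are chosen here for the example rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-alone model of the decision: the override is only refused when KVM
 * is in use, the irqchip lives in the kernel, and IRQ routing is missing. */
static bool allows_irq0_override(bool kvm_on, bool irqchip_in_kernel,
                                 bool has_gsi_routing)
{
    return !kvm_on || !irqchip_in_kernel || has_gsi_routing;
}

/* Walk through the cases the commit message distinguishes. */
static void irq0_override_selfcheck(void)
{
    assert(allows_irq0_override(false, false, false)); /* no KVM at all */
    assert(allows_irq0_override(true, false, false));  /* user space irqchip */
    assert(allows_irq0_override(true, true, true));    /* routing available */
    assert(!allows_irq0_override(true, true, false));  /* forced off */
}
```

Only the last combination disables the override, which is the case the
fwcfg value has to reflect.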

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/pc.c|3 ++-
 kvm-all.c  |5 +
 kvm-stub.c |5 +
 kvm.h  |2 ++
 sysemu.h   |1 -
 vl.c   |1 -
 6 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 04304e0..38d787a 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -39,6 +39,7 @@
 #include "msi.h"
 #include "sysbus.h"
 #include "sysemu.h"
+#include "kvm.h"
 #include "blockdev.h"
 #include "ui/qemu-spice.h"
 #include "memory.h"
@@ -609,7 +610,7 @@ static void *bochs_bios_init(void)
     fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
     fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES, (uint8_t *)acpi_tables,
                      acpi_tables_len);
-    fw_cfg_add_bytes(fw_cfg, FW_CFG_IRQ0_OVERRIDE, &irq0override, 1);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_IRQ0_OVERRIDE, kvm_allows_irq0_override());
 
     smbios_table = smbios_get_table(&smbios_len);
     if (smbios_table)
diff --git a/kvm-all.c b/kvm-all.c
index fa9d92d..88f1156 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1307,6 +1307,11 @@ int kvm_has_gsi_routing(void)
     return kvm_check_extension(kvm_state, KVM_CAP_IRQ_ROUTING);
 }
 
+int kvm_allows_irq0_override(void)
+{
+    return !kvm_enabled() || !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
     if (!kvm_has_sync_mmu()) {
diff --git a/kvm-stub.c b/kvm-stub.c
index 06064b9..6c2b06b 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -78,6 +78,11 @@ int kvm_has_many_ioeventfds(void)
     return 0;
 }
 
+int kvm_allows_irq0_override(void)
+{
+    return 1;
+}
+
 void kvm_setup_guest_memory(void *start, size_t size)
 {
 }
diff --git a/kvm.h b/kvm.h
index dd2d4f0..ad430fd 100644
--- a/kvm.h
+++ b/kvm.h
@@ -53,6 +53,8 @@ int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
 
+int kvm_allows_irq0_override(void);
+
 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
 
diff --git a/sysemu.h b/sysemu.h
index ddef2bb..caff268 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -102,7 +102,6 @@ extern int vga_interface_type;
 extern int graphic_width;
 extern int graphic_height;
 extern int graphic_depth;
-extern uint8_t irq0override;
 extern DisplayType display_type;
 extern const char *keyboard_layout;
 extern int win2k_install_hack;
diff --git a/vl.c b/vl.c
index ba55b35..132c387 100644
--- a/vl.c
+++ b/vl.c
@@ -218,7 +218,6 @@ int no_reboot = 0;
 int no_shutdown = 0;
 int cursor_hide = 1;
 int graphic_rotate = 0;
-uint8_t irq0override = 1;
 const char *watchdog;
 QEMUOptionRom option_rom[MAX_OPTION_ROMS];
 int nb_option_roms;
-- 
1.7.6.4




[Qemu-devel] [PATCH 09/20] apic: Open-code timer save/restore

2012-01-20 Thread Marcelo Tosatti
From: Jan Kiszka jan.kis...@siemens.com

To enable migration between accelerated and non-accelerated APIC models,
we will need to handle the timer saving and restoring specially and can
no longer rely on the automatics of VMSTATE_TIMER. Specifically, the
accelerated model will not start any QEMUTimer.

This patch therefore factors out the generic bits into apic_next_timer
and uses a post-load callback to implement model-specific logic.

Signed-off-by: Jan Kiszka jan.kis...@siemens.com
---
 hw/apic.c  |   30 +++-
 hw/apic_common.c   |   54 ++-
 hw/apic_internal.h |3 ++
 3 files changed, 67 insertions(+), 20 deletions(-)
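
For readers following the arithmetic, here is a minimal standalone model
of the expiry computation that the patch moves into apic_next_timer().
It is a sketch under assumptions, not the QEMU code: struct
apic_timer_model and model_next_timer() are hypothetical names, and the
two LVT bits are reduced to plain booleans. An expiry of -1 means that
no timer is pending, which is the value a post-load hook would test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced copy of the timer-related APIC state. */
struct apic_timer_model {
    bool masked;                     /* LVT timer entry is masked */
    bool periodic;                   /* LVT timer is in periodic mode */
    uint32_t initial_count;
    int count_shift;                 /* divider expressed as a shift */
    int64_t initial_count_load_time;
    int64_t next_time;
    int64_t timer_expiry;            /* -1 when no expiry is pending */
};

/* Returns true and records the next expiry if a timer should be armed. */
static bool model_next_timer(struct apic_timer_model *s, int64_t current_time)
{
    uint64_t period = (uint64_t)s->initial_count + 1;
    int64_t d;

    s->timer_expiry = -1;
    if (s->masked) {
        return false;
    }

    /* Elapsed timer ticks since the initial count was loaded. */
    d = (current_time - s->initial_count_load_time) >> s->count_shift;

    if (s->periodic) {
        if (!s->initial_count) {
            return false;
        }
        /* Round up to the start of the next full period. */
        d = ((d / period) + 1) * period;
    } else {
        if (d >= s->initial_count) {
            return false;            /* one-shot timer already expired */
        }
        d = period;
    }
    s->next_time = s->initial_count_load_time + (d << s->count_shift);
    s->timer_expiry = s->next_time;
    return true;
}
```

Keeping the expiry in plain state rather than inside a QEMUTimer is what
lets a model without a QEMU timer carry the same migration data.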

diff --git a/hw/apic.c b/hw/apic.c
index 387a469..e59c964 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -521,25 +521,9 @@ static uint32_t apic_get_current_count(APICCommonState *s)
 
 static void apic_timer_update(APICCommonState *s, int64_t current_time)
 {
-    int64_t next_time, d;
-
-    if (!(s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED)) {
-        d = (current_time - s->initial_count_load_time) >>
-            s->count_shift;
-        if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
-            if (!s->initial_count)
-                goto no_timer;
-            d = ((d / ((uint64_t)s->initial_count + 1)) + 1) * ((uint64_t)s->initial_count + 1);
-        } else {
-            if (d >= s->initial_count)
-                goto no_timer;
-            d = (uint64_t)s->initial_count + 1;
-        }
-        next_time = s->initial_count_load_time + (d << s->count_shift);
-        qemu_mod_timer(s->timer, next_time);
-        s->next_time = next_time;
+    if (apic_next_timer(s, current_time)) {
+        qemu_mod_timer(s->timer, s->next_time);
     } else {
-    no_timer:
         qemu_del_timer(s->timer);
     }
 }
@@ -753,6 +737,15 @@ static void apic_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
     }
 }
 
+static void apic_post_load(APICCommonState *s)
+{
+    if (s->timer_expiry != -1) {
+        qemu_mod_timer(s->timer, s->timer_expiry);
+    } else {
+        qemu_del_timer(s->timer);
+    }
+}
+
 static const MemoryRegionOps apic_io_ops = {
     .old_mmio = {
         .read = { apic_mem_readb, apic_mem_readw, apic_mem_readl, },
@@ -776,6 +769,7 @@ static APICCommonInfo apic_info = {
     .set_base = apic_set_base,
     .set_tpr = apic_set_tpr,
     .external_nmi = apic_external_nmi,
+    .post_load = apic_post_load,
 };
 
 static void apic_register_devices(void)
diff --git a/hw/apic_common.c b/hw/apic_common.c
index eef977f..e05369c 100644
--- a/hw/apic_common.c
+++ b/hw/apic_common.c
@@ -93,6 +93,39 @@ void apic_deliver_nmi(DeviceState *d)
     info->external_nmi(s);
 }
 
+bool apic_next_timer(APICCommonState *s, int64_t current_time)
+{
+    int64_t d;
+
+    /* We need to store the timer state separately to support APIC
+     * implementations that maintain a non-QEMU timer, e.g. inside the
+     * host kernel. This open-coded state allows us to migrate between
+     * both models. */
+    s->timer_expiry = -1;
+
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_MASKED) {
+        return false;
+    }
+
+    d = (current_time - s->initial_count_load_time) >> s->count_shift;
+
+    if (s->lvt[APIC_LVT_TIMER] & APIC_LVT_TIMER_PERIODIC) {
+        if (!s->initial_count) {
+            return false;
+        }
+        d = ((d / ((uint64_t)s->initial_count + 1)) + 1) *
+            ((uint64_t)s->initial_count + 1);
+    } else {
+        if (d >= s->initial_count) {
+            return false;
+        }
+        d = (uint64_t)s->initial_count + 1;
+    }
+    s->next_time = s->initial_count_load_time + (d << s->count_shift);
+    s->timer_expiry = s->next_time;
+    return true;
+}
+
 void apic_init_reset(DeviceState *d)
 {
     APICCommonState *s = DO_UPCAST(APICCommonState, busdev.qdev, d);
@@ -120,7 +153,10 @@ void apic_init_reset(DeviceState *d)
     s->next_time = 0;
     s->wait_for_sipi = 1;
 
-    qemu_del_timer(s->timer);
+    if (s->timer) {
+        qemu_del_timer(s->timer);
+    }
+    s->timer_expiry = -1;
 }
 
 static void apic_reset_common(DeviceState *d)
@@ -203,12 +239,25 @@ static int apic_init_common(SysBusDevice *dev)
     return 0;
 }
 
+static int apic_dispatch_post_load(void *opaque, int version_id)
+{
+    APICCommonState *s = opaque;
+    APICCommonInfo *info =
+        DO_UPCAST(APICCommonInfo, busdev.qdev, s->busdev.qdev.info);
+
+    if (info->post_load) {
+        info->post_load(s);
+    }
+    return 0;
+}
+
 static const VMStateDescription vmstate_apic_common = {
     .name = "apic",
     .version_id = 3,
     .minimum_version_id = 3,
     .minimum_version_id_old = 1,
     .load_state_old = apic_load_old,
+    .post_load = apic_dispatch_post_load,
     .fields = (VMStateField[]) {
         VMSTATE_UINT32(apicbase, APICCommonState),
         VMSTATE_UINT8(id, APICCommonState),
@@ -228,7 +277,8 @@ static const VMStateDescription vmstate_apic_common = {
         VMSTATE_UINT32(initial_count, APICCommonState),
 

Re: [Qemu-devel] QEMU TB Unlinking due to interrupt.

2012-01-20 Thread 陳韋任
On Fri, Dec 09, 2011 at 07:08:43PM -0500, Xin Tong wrote:
> can you forward me the patch please. Peter. Also, as far as I
> understand, tb unlinking can only happen in 2 cases.
>
> 1. some other tbs being invalidated.
> 2. interrupt pending

  http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg03643.html

HTH,
chenwj

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



[Qemu-devel] [PATCH v4 3/6] Set runstate to INMIGRATE earlier

2012-01-20 Thread Stefano Stabellini
Set runstate to RUN_STATE_INMIGRATE as soon as we can on resume.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 vl.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/vl.c b/vl.c
index 6f0435b..bb0139f 100644
--- a/vl.c
+++ b/vl.c
@@ -2972,6 +2972,7 @@ int main(int argc, char **argv, char **envp)
                 break;
             case QEMU_OPTION_incoming:
                 incoming = optarg;
+                runstate_set(RUN_STATE_INMIGRATE);
                 break;
             case QEMU_OPTION_nodefaults:
                 default_serial = 0;
@@ -3468,7 +3469,6 @@ int main(int argc, char **argv, char **envp)
     }
 
     if (incoming) {
-        runstate_set(RUN_STATE_INMIGRATE);
         int ret = qemu_start_incoming_migration(incoming);
         if (ret < 0) {
             fprintf(stderr, "Migration failed. Exit code %s(%d), exiting.\n",
-- 
1.7.2.5




Re: [Qemu-devel] [PATCH] arm_boot: support board IDs more than 16 bits wide

2012-01-20 Thread Mark Langsdorf
On 01/20/2012 11:05 AM, Peter Maydell wrote:
> Support passing a board ID value to the kernel in r1
> that is more than 16 bits wide. This is needed to pass
> the '-1 == invalid' value for boards which only support
> device tree booting.
>
> Signed-off-by: Peter Maydell peter.mayd...@linaro.org
> ---
> This applies after the Calxeda patchset. Mark, I suggest you put it
> in your patchset in the appropriate place.

Thanks, very helpful.

I'm currently tracking down a bug in which linux/arch/arm/kernel/smp.c
scu_get_core_count() returns the value of the raw_readl of scu_base
+ SCU_CONFIG (0xfee4) as the value of smpboot[0]. It seems to be
bypassing the a9mpcore.c code entirely. I'm not sure what's happening
there.

--Mark Langsdorf
Calxeda, Inc.



Re: [Qemu-devel] Get only TCG code without execution

2012-01-20 Thread 陳韋任
On Fri, Jan 20, 2012 at 09:09:46AM +, Peter Maydell wrote:
> On 20 January 2012 06:12, 陳韋任 che...@iis.sinica.edu.tw wrote:
>>   Out of curiosity. What's ARM memory model? From the Wikipedia [1], it seems
>>  ARMv7 has the same memory model as IA64.
>
> The ARM memory model is the set of semantics for memory
> accesses as defined in the ARM Architecture Reference
> Manual (covering not just reordering but also exclusive
> accesses, alignment, barriers, etc). The manual devotes
> 50 pages to it so I'm not about to try to summarise it here :-)

  Seems the Wikipedia only lists the memory ordering part. ;)

>>  And as a general emulator, QEMU shouldn't implement any
>>  architecture-specific memory model, right?
>
> Wrong, at least in theory. Ideally QEMU should implement exactly
> the semantics required by the guest architecture memory model
> (it's allowed to be stricter than the architecture requires, of
> course), in the same way it should implement the semantics required
> by the guest architecture instruction set. A guest binary for ARM
> can rely on the memory ordering constraints imposed by the memory
> model just as much as it can rely on the fact that the ADD instruction
> adds two registers together. In practice, of course (a) this is an
> enormous amount of work and also slows the emulator down drastically
> and (b) guest binaries don't actually rely that much on the memory
> model. And the fairly strict memory model provided by x86 means that
> for x86 hosts we actually get most of the important bits of the guest
> memory model right anyway.

  AFAIK, LLVM defines its own memory model [1], which is inspired by the C++11
memory model. That's why I think that, instead of implementing each
architecture-specific memory model, QEMU should define a more general (strict) one.

  You said,

> guest binaries don't actually rely that much on the memory model.

I think the reason is that those guest binaries are single-threaded. The memory
model matters in the multi-threaded case. BTW, our binary translator can now
translate x86 binaries to ARM binaries, and ARM has a weaker memory model than x86.
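
To make the multi-threaded point concrete, here is a small message-passing
sketch in C11 atomics. It is an illustration under assumptions, not code
from QEMU or from our translator: publish() and try_consume() are
hypothetical names. The release store and the acquire load are what
guarantee that a consumer seeing the flag also sees the payload. With
plain stores and loads, x86 hardware would still keep this particular
ordering, while ARM would need explicit barriers.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data written by the producer */
static atomic_bool ready;  /* flag that publishes the payload */

/* Producer side: write the data, then publish it with a release store. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: returns true and fills *out only once the flag is seen. */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return false;
    }
    *out = payload;
    return true;
}
```

In a real program publish() and try_consume() would run on different
threads; a single-threaded guest never observes the difference, which
fits the observation quoted above.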
 
[1] http://llvm.org/docs/LangRef.html#memmodel

Regards,
chenwj

P.S. Happy Chinese New Year. :)

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj


