[Qemu-devel] target-unicore32: New ISA support for QEMU

2011-03-24 Thread Guan Xuetao
Hi,
I want to add new ISA (UniCore32) support for QEMU.
I have finished unicore32-linux-user development based on qemu-stable-0.14, 
 and will begin unicore32-softmmu work.

What do I need to do to get it merged into QEMU?

Thanks & Regards.

Guan Xuetao
2011/3/24





Re: [Qemu-devel] target-unicore32: New ISA support for QEMU

2011-03-24 Thread Roy Tam
Hi,

2011/3/24 Guan Xuetao g...@mprc.pku.edu.cn:
 Hi,
 I want to add new ISA (UniCore32) support for QEMU.
 I have finished unicore32-linux-user development based on qemu-stable-0.14,
  and will begin unicore32-softmmu work.

 What do I need to do to get it merged into QEMU?


Post the patch set, just as you did on the linux-kernel mailing list.

 Thanks & Regards.

 Guan Xuetao
 2011/3/24







RE: [Qemu-devel] target-unicore32: New ISA support for QEMU

2011-03-24 Thread Guan Xuetao


 -Original Message-
 From: Roy Tam [mailto:roy...@gmail.com]
 Sent: Thursday, March 24, 2011 2:40 PM
 To: Guan Xuetao
 Cc: qemu-devel@nongnu.org
 Subject: Re: [Qemu-devel] target-unicore32: New ISA support for QEMU
 
 Hi,
 
 2011/3/24 Guan Xuetao g...@mprc.pku.edu.cn:
  Hi,
  I want to add new ISA (UniCore32) support for QEMU.
  I have finished unicore32-linux-user development based on qemu-stable-0.14,
   and will begin unicore32-softmmu work.
 
  What do I need to do to get it merged into QEMU?
 
 
 Post the patch set, just as you did on the linux-kernel mailing list.
Thanks.

Then I will split the patch set into 3 parts:
Patch 1 for the target-unicore32 directory
Patch 2 for the linux-user/unicore32 directory
Patch 3 for the remaining modifications

All patches have passed the checkpatch.pl script.

Guan Xuetao





[Qemu-devel] [Bug 739088] Re: I/O errors after Save/Restore

2011-03-24 Thread Yongjie Ren
this bug is fixed in the latest qemu-kvm.git:
2c9bb5d4e5ae3b12ad71bd6a0c1b32003661f53a

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/739088

Title:
  I/O errors after Save/Restore

Status in QEMU:
  Fix Released

Bug description:
  qemu-kvm commit: b73357ecd2b14c057134cb71d29447b5b988c516
  ( Author: Marcelo Tosatti mtosa...@redhat.com  Date: Wed Mar 16 17:04:16 2011 -0300)
  kvm commit: a72e315c509376bbd1e121219c3ad9f23973923f

  After restoring from the saved image, some I/O errors appear in dmesg and
  the file system is read-only. I'm sure that the guest runs normally
  before saving. See the attached pictures for details.

  Steps to reproduce:

  1. Create a guest:
     qemu-img create -b /share/xvs/img/app/ia32e_SMP.img -f qcow2 \
       /root/test0320.img
     qemu-system-x86_64 -m 256 \
       -net nic,macaddr=00:16:3e:06:8a:08,model=rtl8139 \
       -net tap,script=/etc/kvm/qemu-ifup -hda /root/test0320.img
  2. Save the guest:
     on the qemu monitor: migrate exec:dd of=/root/test-save.img
  3. Quit qemu:
     q command on the qemu monitor
  4. Restore from the image just saved:
     qemu-system-x86_64 -m 256 \
       -net nic,macaddr=00:16:3e:06:8a:08,model=rtl8139 \
       -net tap,script=/etc/kvm/qemu-ifup -incoming=/roo/test-save.img
  5. Check dmesg in the restored guest; it shows some I/O errors. Running
     commands such as ps, touch or reboot then produces further I/O errors.



[Qemu-devel] [Bug 739092] Re: guest hangs when using network after live migration

2011-03-24 Thread Yongjie Ren
this bug is fixed in the latest qemu-kvm.git:
2c9bb5d4e5ae3b12ad71bd6a0c1b32003661f53a

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/739092

Title:
  guest hangs when using network after live migration

Status in QEMU:
  Fix Released

Bug description:
  qemu-kvm commit: b73357ecd2b14c057134cb71d29447b5b988c516
  ( Author: Marcelo Tosatti mtosa...@redhat.com  Date: Wed Mar 16 17:04:16 2011 -0300)
  kvm commit: a72e315c509376bbd1e121219c3ad9f23973923f

  The guest hangs when I run a command that uses the network, such as ssh
  or netstat, after live migration. ssh or netstat hangs even if I press
  Ctrl+C to interrupt. I also cannot connect to the guest using ssh GuestIP
  from the host, though the sshd is still running in the guest.

  Steps to reproduce:

  1. Start a tcp daemon for migration:
     qemu-system-x86_64 -m 256 -smp 4 -incoming tcp:localhost: -no-acpi \
       -net nic,macaddr=00:16:3e:63:d5:90,model=rtl8139 \
       -net tap,script=/etc/kvm/qemu-ifup -hda /root/lv.img
  2. Create a guest:
     qemu-system-x86_64 -m 256 -smp 4 -no-acpi \
       -net nic,macaddr=00:16:3e:63:d5:90,model=rtl8139 \
       -net tap,script=/etc/kvm/qemu-ifup -hda /root/lv.img
  3. Migrate:
     migrate tcp:localhost:
  4. Run a command in the guest:
     ssh root@192.168.1.177



Re: [Qemu-devel] [RFC] QCFG: a new mechanism to replace QemuOpts and option handling

2011-03-24 Thread Markus Armbruster
Anthony Liguori anth...@codemonkey.ws writes:

 On 03/22/2011 08:01 AM, Markus Armbruster wrote:
 Type checking macros are feasible (see [*] for an existence proof), but
 things do get hairy, and the resulting error messages can be less than
 clear at times.

 That just gives you a warning.  You can do much better things with
 __builtin_types_compatible_p.

Quote of existence proof != this is the best way to do it :)
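
For readers following along, here is a minimal sketch of the kind of
type-checking macro being discussed, built on __builtin_types_compatible_p
(a GCC/Clang extension). TYPE_MATCHES, CHECKED_PTR and example_counter are
hypothetical names made up for illustration; none of this is taken from QEMU
or from the proposal under review.

```c
#include <assert.h>

/*
 * Hypothetical helper: evaluates to 1 when the expression `expr` has a
 * type compatible with `type`, and to 0 otherwise.
 * __builtin_types_compatible_p is a GCC/Clang extension.
 */
#define TYPE_MATCHES(expr, type) \
    __builtin_types_compatible_p(__typeof__(expr), type)

/*
 * Hypothetical checked accessor: the build fails (negative array size)
 * when `ptr` is not a `type *`. Otherwise it evaluates to `ptr`.
 */
#define CHECKED_PTR(ptr, type) \
    ((void)sizeof(char[TYPE_MATCHES(ptr, type *) ? 1 : -1]), (ptr))

/* Example object; the static assertion turns a mismatch into a hard error. */
static int example_counter;
static_assert(TYPE_MATCHES(example_counter, int),
              "example_counter must be an int");
```

Passing a long * to CHECKED_PTR(p, int) stops the build rather than
producing a warning, although the resulting diagnostic is about an array
size, which illustrates the "less than clear" error messages mentioned above.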

 But neither really solves the problem I'm talking about.  I can go
 into it in depth but hopefully we both can agree that trying to build
 introspection macros is pretty difficult if not impossible :-)

Depends on what you need from them.

  However,
 this makes it very difficult to support things like lists of lists or
 anything else that would basically require a non-concrete type.
 Sounds like you want a more expressive type system than C's, and propose
 to get it by building your own DSL.  I'm not sure that's wise.

 It's an IDL.  IDL based RPCs are pretty common with C.  The IDL is
 purely declarative.

Still sounds like you want a more expressive type system than C's, and
propose to build it ourselves.

First step would be to define the type system rigorously.

 If you plan to expose these types in a library, you need to either
 explicitly pad each structure and make sure that the padding is
 updated correctly each time a new member is added.
 As long as the data description is data, writing a program to check that
 a new version is compatible with the old one shouldn't be hard.

 If you define the structs on your own, you need to either have a data
 description of the padding or be very careful doing it yourself.

With a program to check compatibility, you can easily make the build
fail when you break compatibility.
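
A rough sketch of such a checker, assuming (this is my assumption, not
something settled in this thread) that "compatible" means every old field
keeps its position, name and type, and new fields are only appended.
FieldDesc and desc_is_compatible are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical data description of one struct field. */
typedef struct FieldDesc {
    const char *name;
    const char *type;
} FieldDesc;

/*
 * Returns true when new_v keeps every field of old_v at the same index
 * with the same name and type. Appending fields is allowed; removing,
 * reordering or retyping fields is not.
 */
static bool desc_is_compatible(const FieldDesc *old_v, size_t n_old,
                               const FieldDesc *new_v, size_t n_new)
{
    assert(old_v != NULL && new_v != NULL);

    if (n_new < n_old) {
        return false;               /* a field was removed */
    }
    for (size_t i = 0; i < n_old; i++) {
        if (strcmp(old_v[i].name, new_v[i].name) != 0 ||
            strcmp(old_v[i].type, new_v[i].type) != 0) {
            return false;           /* renamed, reordered or retyped */
        }
    }
    return true;
}
```

A build step could run this over the previous and current descriptions and
exit non-zero when it returns false.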

  Alternatively, you
 can add an allocation function that automatically pads each structure
 transparently.
 Weaker than a comprehensive check, but could be good enough.

 qmp-gen.py creates qmp-types.[ch] to do exactly the above and also
 generates the type declaration so that you don't have to duplicate the
 type marshalling code and the type declaration.  Today, this is close
 to 2k LOCs so it's actually a significant amount of code.

 There is also the code that takes the input (via QCFG or QMP) and
 calls an appropriate C function with a strongly typed argument.  I've
 Not sure I got you here.  Perhaps an example could enlighten me :)

 void qapi_free_vnc_info(VncInfo *obj)
 {
     if (!obj) {
         return;
     }
     if (obj->has_host) {
         qemu_free(obj->host);
     }
     if (obj->has_family) {
         qemu_free(obj->family);
     }
     if (obj->has_service) {
         qemu_free(obj->service);
     }
     if (obj->has_auth) {
         qemu_free(obj->auth);
     }
     if (obj->has_clients) {
         qapi_free_vnc_client_info(obj->clients);
     }

     qapi_free_vnc_info(obj->next);
     qemu_free(obj);
 }

 It's pretty basic boiler plate code that could be written by hand, but
 why not generate it.  It actually all adds up pretty quickly in terms
 of SLOCs.

Since we have the data description anyway, why not

    for all members
        if member needs destruction
            destroy it

Just because we can generate boiler-plate code doesn't mean we should
have boiler-plate code.
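
To make the loop above concrete, here is a minimal sketch of a table-driven
destructor. It assumes every described member is an optional, heap-allocated
string guarded by a has_<name> flag; VncInfoSketch, MemberDesc, MEMBER and
free_described are hypothetical names, and plain free() stands in for
qemu_free(). Nested objects and list links are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a generated type such as VncInfo. */
typedef struct VncInfoSketch {
    bool has_host;
    char *host;
    bool has_family;
    char *family;
    bool has_service;
    char *service;
} VncInfoSketch;

/* Hypothetical description of one optional string member. */
typedef struct MemberDesc {
    size_t has_offset;  /* offset of the bool has_<name> flag */
    size_t ptr_offset;  /* offset of the char * member */
} MemberDesc;

#define MEMBER(type, name) \
    { offsetof(type, has_##name), offsetof(type, name) }

static const MemberDesc vnc_info_members[] = {
    MEMBER(VncInfoSketch, host),
    MEMBER(VncInfoSketch, family),
    MEMBER(VncInfoSketch, service),
};

/*
 * Frees every present member listed in desc, then obj itself.
 * Returns the number of members that were freed.
 */
static size_t free_described(void *obj, const MemberDesc *desc, size_t n)
{
    size_t freed = 0;

    if (!obj) {
        return 0;
    }
    assert(desc != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        bool has = *(const bool *)((const char *)obj + desc[i].has_offset);
        char **slot = (char **)((char *)obj + desc[i].ptr_offset);

        if (has) {
            free(*slot);
            freed++;
        }
    }
    free(obj);
    return freed;
}
```

One generic function then serves every described type; only the small
offset table is per-type.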

 The mechanism I described using the visitor pattern is really the
 right solution for vmstate.  The hard problems to solve for vmstate
 are:

 1) How do we support old versions in a robust way?  There are fancy
 things we could do once we have a proper visitor mechanism.  We could
 have special marshallers for old versions, we could treat the output
 of the visitor as an in memory tree and do XSLT style translations,
 etc.

 2) How do we generate the visitor for each device.  I don't think it's
 practical to describe devices in JSON.  It certainly simplifies the
 problem but it seems ugly to me.  I think we realistically need a C
 style IDL and adopt a style of treating it as a header.
 Now I'm confused.  Do you mean your JSON-based DSL won't cut it for
 vmstate?

 If yes, why is it wise to go down that route now?

 There are a few paths we could go.  We can describe devices in JSON.
 This makes VMState introspectable with all of the nice properties of
 everything else.  But the question is, do we describe the full device
 state and then use a separate mechanism to cull out the bits that can
 be recomputed.

 Do we only describe the guest visible state and treat that as a
 separate structure?  Is that embedded in the main state object or do
 we explicitly translate the main state object to this new type?

Requiring fields of the externally visible vmstate wire format to be
backed by members of the device state struct (whether they are right in
the outermost struct or not) creates a potentially troublesome tight
coupling between wire format and internal 

[Qemu-devel] Re: [0/27] Implement emulation of pSeries logical partitions (v4)

2011-03-24 Thread Alexander Graf

On 24.03.2011, at 05:41, David Gibson wrote:

 On Wed, Mar 23, 2011 at 10:29:04PM +0100, Alexander Graf wrote:
 
 On 23.03.2011, at 22:08, Benjamin Herrenschmidt wrote:
 
 On Wed, 2011-03-23 at 15:45 +0100, Alexander Graf wrote:
 
 What's the magic to start a guest? I tried passing a disk which SLOF
 didn't detect (obviously - there's no IDE there). I also tried running
 a kernel directly with -kernel which gave me no output. How are you
 usually running your images? 
 
 hrm... you using -M pseries right ?
 
 Yup
 
 so -kernel should work with a kernel that is compiled for the pseries
 platform (and it won't use SLOF).
 
 Yeah, that part works.
 
 
 a disk should work with SLOF if you use the default which is scsi (ie
 pseries machine sets that flag that tells qemu to default to scsi, which
 is then picked up by our vscsi).
 
 IE. You should be able to stick a distro ISO in the virtual CD-ROM and
 boot from that with SLOF. SLOF will read the qemu boot list (tho it only
 knows about c, d and n at that stage) and try them in order.
 
 That one doesn't. If I just pass in a disk w/o specifically saying
 it's a scsi disk I end up with no hard disk in the guest :).
 
 Um.. what exact command line are you using?  For me, both -hda file
 and a bare file on the command line work...

Ah, yes. It's the amount of RAM. I didn't pass in -m before which made SLOF not 
detect the SCSI adapter but crash before that.
So yes, please add a min_ram field.


Alex




Re: [Qemu-devel] [PATCH v7] rtl8139: add vlan support

2011-03-24 Thread Jason Wang

On 03/23/2011 07:11 AM, Benjamin Poirier wrote:

Hello,

Here is version 7 of my patchset to add vlan support to the emulated rtl8139
nic.

Changes since v6:
* added check against guest requesting tagging on frames with len < 12
* simplified tag extraction in receive function. dot1q_buf arg removed
  from rtl8139_do_receive(). Frame is linearized in transfer_frame()
  when loopback mode is on.
* added an entry to file header
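
For context on the len < 12 check: an 802.1Q tag sits right after the two
6-byte MAC addresses, so a frame shorter than 12 bytes has no place to put
it. The following is a standalone sketch of tag insertion under that rule;
it is not the code from this patchset, and vlan_insert_tag and the constants
are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDRS_LEN 12      /* destination + source MAC addresses */
#define VLAN_TAG_LEN  4       /* TPID (2 bytes) + TCI (2 bytes) */
#define VLAN_TPID     0x8100  /* 802.1Q tag protocol identifier */

/*
 * Copies frame into out with an 802.1Q tag inserted after the MAC
 * addresses. Returns the new length, or 0 when the frame is shorter
 * than 12 bytes or out is too small to hold the tagged frame.
 */
static size_t vlan_insert_tag(const uint8_t *frame, size_t len, uint16_t tci,
                              uint8_t *out, size_t out_size)
{
    assert(frame != NULL && out != NULL);

    if (len < MAC_ADDRS_LEN || out_size < len + VLAN_TAG_LEN) {
        return 0;
    }
    memcpy(out, frame, MAC_ADDRS_LEN);
    out[12] = VLAN_TPID >> 8;       /* tag fields are big-endian */
    out[13] = VLAN_TPID & 0xff;
    out[14] = tci >> 8;
    out[15] = tci & 0xff;
    memcpy(out + MAC_ADDRS_LEN + VLAN_TAG_LEN, frame + MAC_ADDRS_LEN,
           len - MAC_ADDRS_LEN);
    return len + VLAN_TAG_LEN;
}
```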

I've run the same tests as usual on linux and this time also freebsd 8.2, with
and without vlanhwtso in the latter case. Jason, you're right that loopback
mode is seldom used! It seems the bsd driver only uses it at probe time to
identify a defect in some 8169 [1,2] and even then, that check has been
disabled [3]. The linux driver doesn't support loopback mode (unless it's well
hidden.)

[1] 
http://lists.freebsd.org/pipermail/freebsd-emulation/2006-May/thread.html#2055
[2] 
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/re/if_re.c?rev=1.196;content-type=text%2Fplain
[3] http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/re/if_re.c#rev1.68

Changes since v5:
* moved all receive changes to add vlan tag extraction
* fixed checkpatch.pl style issues
* fixed bugs in receive case related to small buffers and loopback
  mode. Moved too small buffer code back where it used to be, though
  it is changed in content.

Changes since v4:
* removed alloca(), for real. Thanks to the reviewers for their
  patience. This patchset now has more versions than the vlan header
  has bytes!
* corrected the unlikely, debug printf and long lines, as per comments
* cleaned out ifdef's pertaining to ethernet checksum calculation.
  According to a comment since removed they were related to an
  optimization:
  RTL8139 provides frame CRC with received packet, this feature
  seems to be ignored by most drivers, disabled by default
  see commit ccf1d14

I've tested v5 using x86_64 host/guest with the usual procedure. I've also run
the clang analyzer on the qemu code base, just for fun.

Changes since v3:
* removed alloca() and #include <net/ethernet.h> as per comments
* reordered patches to put extraction before insertion. Extraction
  touches only the receive path but insertion touches both. The two
  patches are now needed to have vlan functionality.

I've tested v4 with x86_64 host/guest. I used the same testing procedure as
before. I've tested a plain configuration as well as one with tso + vlan
offload, successfully.

I had to hack around the Linux 8139cp driver to be able to enable tso on vlan
which leads me to wonder, can someone with access to the C+ spec or a real
card confirm that it can do tso and vlan offload at the same time? The patch
I used for the kernel is at https://gist.github.com/851895.

Changes since v2:
insertion:
* moved insertion later in the process, to handle tso
* use qemu_sendv_packet() to insert the tag for us
* added dot1q_buf parameter to rtl8139_do_receive() to avoid some
  memcpy() in loopback mode. Note that the code path through that
  function is unchanged when dot1q_buf is NULL.

extraction:
* reduced the amount of copying by moving the frame too short logic
  after the removal of the vlan tag (as is done in e1000.c for
  example). Unfortunately, that logic can no longer be shared between
  C+ and C mode.

I've posted v2 of these patches back in November
http://article.gmane.org/gmane.comp.emulators.qemu/84252

I've tested v3 on the following combinations of guest and hosts:
host: x86_64, guest: x86_64
host: x86_64, guest: ppc32
host: ppc32, guest: ppc32

Testing on the x86_64 host used '-net tap' and consisted of:
* making an http transfer on the untagged interface.
* ping -s 0-1472 to another host on a vlan.
* making an scp upload to another host on a vlan.

Testing on the ppc32 host used '-net socket' connected to an x86_64 qemu-kvm
running the virtio nic and consisted of:
* establishing an ssh connection between the two using an untagged interface.
* ping -s 0-1472 between the two using a vlan.
* making an scp transfer in both directions using a vlan.

All that was successful. Nevertheless, it doesn't exercise all code paths so
care is in order.

Please note that the lack of vlan support in rtl8139 has taken a few people
aback:
https://bugzilla.redhat.com/show_bug.cgi?id=516587
http://article.gmane.org/gmane.linux.network.general/14266

Thanks,
-Ben


Looks good to me.

Acked-by: Jason Wang jasow...@redhat.com



Re: [Qemu-devel] Re: [PATCH 00/14] lm32: Milkymist board support

2011-03-24 Thread Alexander Graf

On 16.03.2011, at 18:08, Alexander Graf wrote:

 On 03/07/2011 11:32 PM, Michael Walle wrote:
 This is the second (and last) patchset of the LatticeMico32 support. It
 adds almost complete support for the opensource and openhardware Milkymist
 One board [1].
 
 [1] http://www.milkymist.org/mmone.html
 
 
 From my side you get:
 
 Acked-by: Alexander Graf ag...@suse.de
 
 But it'd be great if the respective subsystem maintainers could also take a 
 look at code affecting them.

Ping? Enough time passed for people to comment now, no?


Alex




[Qemu-devel] [PATCH 0/3] unicore32: add unicore32-linux-user support for qemu 0.14

2011-03-24 Thread Guan Xuetao

The patch set adds new unicore32-linux-user support for qemu-stable-0.14
Patch 1 adds target-unicore32 directory
Patch 2 adds linux-user/unicore32 directory
Patch 3 adds necessary modifications for other files

Signed-off-by: Guan Xuetao g...@mprc.pku.edu.cn

---
GuanXuetao (3):
  unicore32: add target-unicore32 directory for unicore32-linux-user
support
  unicore32: add necessary headers in linux-user/unicore32 for unicore32
support
  unicore32: necessary modifications for other files to support
unicore32

 configure                                |   11 +-
 cpu-exec.c                               |   12 +-
 default-configs/unicore32-linux-user.mak |    1 +
 elf.h                                    |    2 +
 fpu/softfloat-specialize.h               |   10 +-
 linux-user/elfload.c                     |   74 ++
 linux-user/main.c                        |   89 ++-
 linux-user/qemu.h                        |    5 +-
 linux-user/syscall_defs.h                |   10 +-
 linux-user/unicore32/syscall.h           |   55 +
 linux-user/unicore32/syscall_nr.h        |  371 ++
 linux-user/unicore32/target_signal.h     |   26 +
 linux-user/unicore32/termbits.h          |    2 +
 target-unicore32/cpu.h                   |  184 +++
 target-unicore32/exec.h                  |   50 +
 target-unicore32/helper.c                |  546
 target-unicore32/helpers.h               |   70 +
 target-unicore32/op_helper.c             |  202 +++
 target-unicore32/translate.c             | 2110 ++
 19 files changed, 3817 insertions(+), 13 deletions(-)
 create mode 100644 default-configs/unicore32-linux-user.mak
 create mode 100644 linux-user/unicore32/syscall.h
 create mode 100644 linux-user/unicore32/syscall_nr.h
 create mode 100644 linux-user/unicore32/target_signal.h
 create mode 100644 linux-user/unicore32/termbits.h
 create mode 100644 target-unicore32/cpu.h
 create mode 100644 target-unicore32/exec.h
 create mode 100644 target-unicore32/helper.c
 create mode 100644 target-unicore32/helpers.h
 create mode 100644 target-unicore32/op_helper.c
 create mode 100644 target-unicore32/translate.c





[Qemu-devel] [PATCH 3/3] unicore32: necessary modifications for other files to support unicore32

2011-03-24 Thread Guan Xuetao
unicore32: necessary modifications for other files to support unicore32

Signed-off-by: Guan Xuetao g...@mprc.pku.edu.cn
---
 configure                                |   11 +++-
 cpu-exec.c                               |   12 -
 default-configs/unicore32-linux-user.mak |    1 +
 elf.h                                    |    2 +
 fpu/softfloat-specialize.h               |   10 ++--
 linux-user/elfload.c                     |   74 +
 linux-user/main.c                        |   89 +-
 linux-user/qemu.h                        |    5 +-
 linux-user/syscall_defs.h                |   10 ++-
 9 files changed, 201 insertions(+), 13 deletions(-)
 create mode 100644 default-configs/unicore32-linux-user.mak

diff --git a/configure b/configure
index 598e8e1..a6633cf 100755
--- a/configure
+++ b/configure
@@ -280,7 +280,7 @@ else
 fi
 
 case $cpu in
-  alpha|cris|ia64|m68k|microblaze|ppc|ppc64|sparc64)
+  alpha|cris|ia64|m68k|microblaze|ppc|ppc64|sparc64|unicore32)
 cpu=$cpu
   ;;
   i386|i486|i586|i686|i86pc|BePC)
@@ -793,6 +793,9 @@ case $cpu in
 hppa*)
host_guest_base=yes
;;
+unicore32*)
+   host_guest_base=yes
+   ;;
 esac
 
 [ -z "$guest_base" ] && guest_base="$host_guest_base"
@@ -1018,6 +1021,7 @@ sh4eb-linux-user \
 sparc-linux-user \
 sparc64-linux-user \
 sparc32plus-linux-user \
+unicore32-linux-user \
 
 fi
 # the following are Darwin specific
@@ -2495,7 +2499,7 @@ echo "docdir=$docdir" >> $config_host_mak
 echo "confdir=$confdir" >> $config_host_mak
 
 case $cpu in
-  i386|x86_64|alpha|cris|hppa|ia64|m68k|microblaze|mips|mips64|ppc|ppc64|s390|s390x|sparc|sparc64)
+  i386|x86_64|alpha|cris|hppa|ia64|m68k|microblaze|mips|mips64|ppc|ppc64|s390|s390x|sparc|sparc64|unicore32)
 ARCH=$cpu
   ;;
   armv4b|armv4l)
@@ -3007,6 +3011,9 @@ case $target_arch2 in
   s390x)
 target_phys_bits=64
   ;;
+  unicore32)
+target_phys_bits=32
+  ;;
   *)
 echo Unsupported target CPU
 exit 1
diff --git a/cpu-exec.c b/cpu-exec.c
index 8c9fb8b..130e0c3 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -262,6 +262,7 @@ int cpu_exec(CPUState *env1)
 env->cc_x = (env->sr >> 4) & 1;
 #elif defined(TARGET_ALPHA)
 #elif defined(TARGET_ARM)
+#elif defined(TARGET_UNICORE32)
 #elif defined(TARGET_PPC)
 #elif defined(TARGET_MICROBLAZE)
 #elif defined(TARGET_MIPS)
@@ -326,6 +327,8 @@ int cpu_exec(CPUState *env1)
 do_interrupt(env);
 #elif defined(TARGET_ARM)
 do_interrupt(env);
+#elif defined(TARGET_UNICORE32)
+do_interrupt(env);
 #elif defined(TARGET_SH4)
do_interrupt(env);
 #elif defined(TARGET_ALPHA)
@@ -363,7 +366,7 @@ int cpu_exec(CPUState *env1)
 }
 #if defined(TARGET_ARM) || defined(TARGET_SPARC) || defined(TARGET_MIPS) || \
 defined(TARGET_PPC) || defined(TARGET_ALPHA) || defined(TARGET_CRIS) || \
-defined(TARGET_MICROBLAZE)
+defined(TARGET_MICROBLAZE) || defined(TARGET_UNICORE32)
 if (interrupt_request & CPU_INTERRUPT_HALT) {
 env->interrupt_request &= ~CPU_INTERRUPT_HALT;
 env->halted = 1;
@@ -503,6 +506,12 @@ int cpu_exec(CPUState *env1)
 do_interrupt(env);
 next_tb = 0;
 }
+#elif defined(TARGET_UNICORE32)
+if (interrupt_request & CPU_INTERRUPT_HARD
+&& !(env->uncached_asr & ASR_I)) {
+do_interrupt(env);
+next_tb = 0;
+}
 #elif defined(TARGET_SH4)
 if (interrupt_request & CPU_INTERRUPT_HARD) {
 do_interrupt(env);
@@ -653,6 +662,7 @@ int cpu_exec(CPUState *env1)
 env->eflags = env->eflags | helper_cc_compute_all(CC_OP) | (DF & DF_MASK);
 #elif defined(TARGET_ARM)
 /* XXX: Save/restore host fpu exception state?.  */
+#elif defined(TARGET_UNICORE32)
 #elif defined(TARGET_SPARC)
 #elif defined(TARGET_PPC)
 #elif defined(TARGET_M68K)
diff --git a/default-configs/unicore32-linux-user.mak b/default-configs/unicore32-linux-user.mak
new file mode 100644
index 000..6aafd21
--- /dev/null
+++ b/default-configs/unicore32-linux-user.mak
@@ -0,0 +1 @@
+# Default configuration for unicore32-linux-user
diff --git a/elf.h b/elf.h
index 7067c90..876c1da 100644
--- a/elf.h
+++ b/elf.h
@@ -105,6 +105,8 @@ typedef int64_t  Elf64_Sxword;
 #define EM_H8_300H  47  /* Hitachi H8/300H */
 #define EM_H8S  48  /* Hitachi H8S */
 
+#define EM_UNICORE32    110     /* UniCore32 */
+
 /*
  * This is an interim value that we will use until the committee comes
  * up with a final number.
diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index eb644b2..839049d 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -30,7 +30,7 @@ these four paragraphs for those parts of this code that are 

[Qemu-devel] Re: [PATCH, RFC] virtio_blk: add cache control support

2011-03-24 Thread Christian Borntraeger
On 24.03.2011 04:05, Anthony Liguori wrote:
 ie. lguest and S/390 don't trap writes to config space.

 Or perhaps they should?  But we should be explicit about needing it...
 I don't think we ever operated on the assumption that config space writes 
 would trap.
 
 I don't think adding it is the right thing either because you can do byte 
 access to the config space which makes atomicity difficult.

There is the additional problem that s390 has no MMIO and, therefore,
there is no real HW support for trapping writes to an area. You can
use page faults, or read-only faults on newer systems, but this is 
expensive. In addition, page faults only deliver the page frame, but
not the offset within a page.
 
 Any reason not to use a control queue to negotiate dynamic features? 

Sounds reasonable.



[Qemu-devel] [PATCH 2/3] unicore32: add necessary headers in linux-user/unicore32 for unicore32 support

2011-03-24 Thread Guan Xuetao
unicore32: add necessary headers in linux-user/unicore32 for unicore32 support

Signed-off-by: Guan Xuetao g...@mprc.pku.edu.cn
---
 linux-user/unicore32/syscall.h       |   55 +
 linux-user/unicore32/syscall_nr.h    |  371 ++
 linux-user/unicore32/target_signal.h |   26 +++
 linux-user/unicore32/termbits.h      |    2 +
 4 files changed, 454 insertions(+), 0 deletions(-)
 create mode 100644 linux-user/unicore32/syscall.h
 create mode 100644 linux-user/unicore32/syscall_nr.h
 create mode 100644 linux-user/unicore32/target_signal.h
 create mode 100644 linux-user/unicore32/termbits.h

diff --git a/linux-user/unicore32/syscall.h b/linux-user/unicore32/syscall.h
new file mode 100644
index 000..010cdd8
--- /dev/null
+++ b/linux-user/unicore32/syscall.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2010-2011 GUAN Xue-tao
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef __UC32_SYSCALL_H__
+#define __UC32_SYSCALL_H__
+struct target_pt_regs {
+abi_ulong uregs[34];
+};
+
+#define UC32_REG_pc uregs[31]
+#define UC32_REG_lr uregs[30]
+#define UC32_REG_sp uregs[29]
+#define UC32_REG_ip uregs[28]
+#define UC32_REG_fp uregs[27]
+#define UC32_REG_26 uregs[26]
+#define UC32_REG_25 uregs[25]
+#define UC32_REG_24 uregs[24]
+#define UC32_REG_23 uregs[23]
+#define UC32_REG_22 uregs[22]
+#define UC32_REG_21 uregs[21]
+#define UC32_REG_20 uregs[20]
+#define UC32_REG_19 uregs[19]
+#define UC32_REG_18 uregs[18]
+#define UC32_REG_17 uregs[17]
+#define UC32_REG_16 uregs[16]
+#define UC32_REG_15 uregs[15]
+#define UC32_REG_14 uregs[14]
+#define UC32_REG_13 uregs[13]
+#define UC32_REG_12 uregs[12]
+#define UC32_REG_11 uregs[11]
+#define UC32_REG_10 uregs[10]
+#define UC32_REG_09 uregs[9]
+#define UC32_REG_08 uregs[8]
+#define UC32_REG_07 uregs[7]
+#define UC32_REG_06 uregs[6]
+#define UC32_REG_05 uregs[5]
+#define UC32_REG_04 uregs[4]
+#define UC32_REG_03 uregs[3]
+#define UC32_REG_02 uregs[2]
+#define UC32_REG_01 uregs[1]
+#define UC32_REG_00 uregs[0]
+#define UC32_REG_asr uregs[32]
+#define UC32_REG_ORIG_00 uregs[33]
+
+#define UC32_SYSCALL_BASE   0x90
+#define UC32_SYSCALL_ARCH_BASE  0xf
+#define UC32_SYSCALL_NR_set_tls (UC32_SYSCALL_ARCH_BASE + 5)
+
+#define UNAME_MACHINE "UniCore-II"
+
+#endif /* __UC32_SYSCALL_H__ */
diff --git a/linux-user/unicore32/syscall_nr.h b/linux-user/unicore32/syscall_nr.h
new file mode 100644
index 000..9c72d84
--- /dev/null
+++ b/linux-user/unicore32/syscall_nr.h
@@ -0,0 +1,371 @@
+/*
+ * This file contains the system call numbers for UniCore32 oldabi.
+ *
+ * Copyright (C) 2010-2011 GUAN Xue-tao
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#define TARGET_NR_restart_syscall  0
+#define TARGET_NR_exit             1
+#define TARGET_NR_fork             2
+#define TARGET_NR_read             3
+#define TARGET_NR_write            4
+#define TARGET_NR_open             5
+#define TARGET_NR_close            6
+#define TARGET_NR_waitpid          7
+#define TARGET_NR_creat            8
+#define TARGET_NR_link             9
+#define TARGET_NR_unlink           10
+#define TARGET_NR_execve           11
+#define TARGET_NR_chdir            12
+#define TARGET_NR_time             13
+#define TARGET_NR_mknod            14
+#define TARGET_NR_chmod            15
+#define TARGET_NR_lchown           16
+#define TARGET_NR_break            17
+/* 18 */
+#define TARGET_NR_lseek            19
+#define TARGET_NR_getpid           20
+#define TARGET_NR_mount            21
+#define TARGET_NR_umount           22
+#define TARGET_NR_setuid           23
+#define TARGET_NR_getuid           24
+#define TARGET_NR_stime            25
+#define TARGET_NR_ptrace           26
+#define TARGET_NR_alarm            27
+/* 28 */
+#define TARGET_NR_pause            29
+#define TARGET_NR_utime            30

[Qemu-devel] [PATCH 0/3] spicevmc - chardev: restore guest open / close (v2)

2011-03-24 Thread Hans de Goede
Hi All,

When we moved from the spicevmc device (which directly implemented a virtio
serial port) to doing spicevmc as a chardev backend, we lost the notification
to the spice server of the guest opening / closing the port. As a result, the
server does not fall back to server mouse mode when the agent inside the
guest stops / dies (for whatever reason), and the mouse stops working in
this scenario. This patch set fixes this regression.

Changes since v1:
-Replace return qemu_chr_guest_open(vcon->chr); with just
 qemu_chr_guest_open(vcon->chr);, since this is a void function. Likewise for
 close.

Regards,

Hans



[Qemu-devel] [PATCH 1/3] chardev: Allow frontends to notify backends of guest open / close

2011-03-24 Thread Hans de Goede
Some frontends know when the guest has opened the channel and is actively
listening to it, for example virtio-serial. This patch adds 2 new qemu-chardev
functions which can be used by frontends to signal guest open / close, and
allows interested backends to listen to this.

Signed-off-by: Hans de Goede hdego...@redhat.com
---
 qemu-char.c |   17 +
 qemu-char.h |4 
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index 31c9e79..7ec7196 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -476,6 +476,9 @@ static CharDriverState *qemu_chr_open_mux(CharDriverState *drv)
     chr->chr_write = mux_chr_write;
     chr->chr_update_read_handler = mux_chr_update_read_handler;
     chr->chr_accept_input = mux_chr_accept_input;
+    /* Frontend guest-open / -close notification is not supported with muxes */
+    chr->chr_guest_open = NULL;
+    chr->chr_guest_close = NULL;
 
     /* Muxes are always open on creation */
     qemu_chr_generic_open(chr);
@@ -2575,6 +2578,20 @@ void qemu_chr_set_echo(struct CharDriverState *chr, bool echo)
     }
 }
 
+void qemu_chr_guest_open(struct CharDriverState *chr)
+{
+    if (chr->chr_guest_open) {
+        chr->chr_guest_open(chr);
+    }
+}
+
+void qemu_chr_guest_close(struct CharDriverState *chr)
+{
+    if (chr->chr_guest_close) {
+        chr->chr_guest_close(chr);
+    }
+}
+
 void qemu_chr_close(CharDriverState *chr)
 {
     QTAILQ_REMOVE(&chardevs, chr, next);
diff --git a/qemu-char.h b/qemu-char.h
index 56d9954..d2f5e5f 100644
--- a/qemu-char.h
+++ b/qemu-char.h
@@ -65,6 +65,8 @@ struct CharDriverState {
 void (*chr_close)(struct CharDriverState *chr);
 void (*chr_accept_input)(struct CharDriverState *chr);
 void (*chr_set_echo)(struct CharDriverState *chr, bool echo);
+void (*chr_guest_open)(struct CharDriverState *chr);
+void (*chr_guest_close)(struct CharDriverState *chr);
 void *opaque;
 QEMUBH *bh;
 char *label;
@@ -78,6 +80,8 @@ CharDriverState *qemu_chr_open_opts(QemuOpts *opts,
 void (*init)(struct CharDriverState *s));
 CharDriverState *qemu_chr_open(const char *label, const char *filename, void (*init)(struct CharDriverState *s));
 void qemu_chr_set_echo(struct CharDriverState *chr, bool echo);
+void qemu_chr_guest_open(struct CharDriverState *chr);
+void qemu_chr_guest_close(struct CharDriverState *chr);
 void qemu_chr_close(CharDriverState *chr);
 void qemu_chr_printf(CharDriverState *s, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
-- 
1.7.3.2




[Qemu-devel] [PATCH 2/3] virtio-console: notify backend of guest open / close

2011-03-24 Thread Hans de Goede
Signed-off-by: Hans de Goede hdego...@redhat.com
---
 hw/virtio-console.c |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/hw/virtio-console.c b/hw/virtio-console.c
index c235b27..e635771 100644
--- a/hw/virtio-console.c
+++ b/hw/virtio-console.c
@@ -27,6 +27,22 @@ static ssize_t flush_buf(VirtIOSerialPort *port, const uint8_t *buf, size_t len)
     return qemu_chr_write(vcon->chr, buf, len);
 }
 
+/* Callback function that's called when the guest opens the port */
+static void guest_open(VirtIOSerialPort *port)
+{
+    VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port);
+
+    qemu_chr_guest_open(vcon->chr);
+}
+
+/* Callback function that's called when the guest closes the port */
+static void guest_close(VirtIOSerialPort *port)
+{
+    VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port);
+
+    qemu_chr_guest_close(vcon->chr);
+}
+
 /* Readiness of the guest to accept data on a port */
 static int chr_can_read(void *opaque)
 {
@@ -63,6 +79,8 @@ static int generic_port_init(VirtConsole *vcon, VirtIOSerialPort *port)
 qemu_chr_add_handlers(vcon->chr, chr_can_read, chr_read, chr_event,
   vcon);
 vcon->port.info->have_data = flush_buf;
+vcon->port.info->guest_open = guest_open;
+vcon->port.info->guest_close = guest_close;
 }
 return 0;
 }
-- 
1.7.3.2




[Qemu-devel] [PATCH 3/3] spice-chardev: listen to frontend guest open / close

2011-03-24 Thread Hans de Goede
Note the vmc_register_interface() in spice_chr_write is left in place
in case someone uses spice-chardev with a frontend which does not have
guest open / close notification.

Signed-off-by: Hans de Goede hdego...@redhat.com
---
 spice-qemu-char.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/spice-qemu-char.c b/spice-qemu-char.c
index 6134fe9..605c241 100644
--- a/spice-qemu-char.c
+++ b/spice-qemu-char.c
@@ -130,6 +130,18 @@ static void spice_chr_close(struct CharDriverState *chr)
 qemu_free(s);
 }
 
+static void spice_chr_guest_open(struct CharDriverState *chr)
+{
+SpiceCharDriver *s = chr->opaque;
+vmc_register_interface(s);
+}
+
+static void spice_chr_guest_close(struct CharDriverState *chr)
+{
+SpiceCharDriver *s = chr->opaque;
+vmc_unregister_interface(s);
+}
+
 static void print_allowed_subtypes(void)
 {
 const char** psubtype;
@@ -182,6 +194,8 @@ CharDriverState *qemu_chr_open_spice(QemuOpts *opts)
 chr->opaque = s;
 chr->chr_write = spice_chr_write;
 chr->chr_close = spice_chr_close;
+chr->chr_guest_open = spice_chr_guest_open;
+chr->chr_guest_close = spice_chr_guest_close;
 
 qemu_chr_generic_open(chr);
 
-- 
1.7.3.2
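
For readers following the series, here is a minimal sketch of the optional-callback pattern these three patches rely on. It is not the code from patch 1/3 (whose function bodies are not shown in this thread); the type CharBackend and every demo_*/backend_* name are hypothetical stand-ins, and the NULL check is an assumption about how a backend without guest open / close support would be handled.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for QEMU's CharDriverState. */
typedef struct CharBackend CharBackend;
struct CharBackend {
    void (*guest_open)(CharBackend *chr);   /* optional, may be NULL */
    void (*guest_close)(CharBackend *chr);  /* optional, may be NULL */
    int registered;                         /* demo state only */
};

/* Front-end side: notify the backend only if it installed a hook. */
void backend_guest_open(CharBackend *chr)
{
    if (chr->guest_open) {
        chr->guest_open(chr);
    }
}

void backend_guest_close(CharBackend *chr)
{
    if (chr->guest_close) {
        chr->guest_close(chr);
    }
}

/* Backend side: track whether the interface is currently registered. */
void demo_open(CharBackend *chr)
{
    chr->registered = 1;
}

void demo_close(CharBackend *chr)
{
    chr->registered = 0;
}
```

A backend that leaves both pointers NULL is simply never notified, which is why a fallback such as the registration left in spice_chr_write still makes sense for front ends that cannot report open / close.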




[Qemu-devel] [PATCH 3/3] Basic implementation of Sharp Zaurus SL-5500 collie PDA

2011-03-24 Thread Dmitry Eremin-Solenikov
Add very basic implementation of collie PDA emulation. The system lacks
LoCoMo and graphics/sound emulation. Linux kernel boots up to mounting
rootfs (theoretically it can be provided in pflash images).

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 hw/collie.c |   70 +++
 1 files changed, 70 insertions(+), 0 deletions(-)
 create mode 100644 hw/collie.c

diff --git a/hw/collie.c b/hw/collie.c
new file mode 100644
index 000..965fd13
--- /dev/null
+++ b/hw/collie.c
@@ -0,0 +1,70 @@
+/*
+ * SA-1110-based Sharp Zaurus SL-5500 platform.
+ *
+ * Copyright (C) 2011 Dmitry Eremin-Solenikov
+ *
+ * This code is licensed under GNU GPL v2.
+ */
+#include "hw.h"
+#include "sysbus.h"
+#include "boards.h"
+#include "devices.h"
+#include "strongarm.h"
+#include "arm-misc.h"
+#include "flash.h"
+#include "blockdev.h"
+
+static struct arm_boot_info collie_binfo = {
+.loader_start = SA_SDCS0,
+.ram_size = 0x2000,
+};
+
+static void collie_init(ram_addr_t ram_size,
+const char *boot_device,
+const char *kernel_filename, const char *kernel_cmdline,
+const char *initrd_filename, const char *cpu_model)
+{
+StrongARMState *s;
+DriveInfo *dinfo;
+ram_addr_t phys_flash;
+
+if (!cpu_model) {
+cpu_model = "sa1110";
+}
+
+s = sa1110_init(collie_binfo.ram_size, cpu_model);
+(void) s;
+
+phys_flash = qemu_ram_alloc(NULL, "collie.fl1", 0x0200);
+dinfo = drive_get(IF_PFLASH, 0, 0);
+pflash_cfi01_register(SA_CS0, phys_flash,
+dinfo ? dinfo->bdrv : NULL, (64 * 1024),
+512, 4, 0x00, 0x00, 0x00, 0x00, 0);
+
+phys_flash = qemu_ram_alloc(NULL, "collie.fl2", 0x0200);
+dinfo = drive_get(IF_PFLASH, 0, 1);
+pflash_cfi01_register(SA_CS1, phys_flash,
+dinfo ? dinfo->bdrv : NULL, (64 * 1024),
+512, 4, 0x00, 0x00, 0x00, 0x00, 0);
+
+sysbus_create_simple("scoop", 0x4080, NULL);
+
+collie_binfo.kernel_filename = kernel_filename;
+collie_binfo.kernel_cmdline = kernel_cmdline;
+collie_binfo.initrd_filename = initrd_filename;
+collie_binfo.board_id = 0x208;
+arm_load_kernel(s->env, &collie_binfo);
+}
+
+static QEMUMachine collie_machine = {
+.name = "collie",
+.desc = "Collie PDA (SA-1110)",
+.init = collie_init,
+};
+
+static void collie_machine_init(void)
+{
+qemu_register_machine(&collie_machine);
+}
+
+machine_init(collie_machine_init)
-- 
1.7.4.1




[Qemu-devel] [PATCH 1/3] arm: basic support for ARMv4/ARMv4T emulation

2011-03-24 Thread Dmitry Eremin-Solenikov
Currently target-arm/ assumes at least an ARMv5 core. Add support for
ARMv4/ARMv4T as well. This restricts the following instructions:

BX (v4T and later)

BKPT, BLX, CDP2, CLZ, LDC2, LDRD, MCRR, MCRR2, MRC2, MRRC, MRRC2, PLD,
QADD, QDADD, QDSUB, QSUB, STRD, SMLAxy, SMLALxy, SMLAWxy, SMULxy,
SMULWxy, STC2 (v5 and later)

All instructions that are v5TE and later are also bound to just v5, as
that's how it was before.

This patch does _not_ include disabling of cp15 access or the base-updated
data abort model (both would be required to emulate chips based on an
ARM7TDMI), because:
* no ARM7TDMI chips are currently emulated (or planned)
* those features aren't strictly necessary for my purposes (SA-1 core
  emulation).

Patch is heavily based on patch by Filip Navara filip.nav...@gmail.com
which in turn is based on work by Ulrich Hecht u...@suse.de and Vincent
Sanders vi...@kyllikki.org.

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 target-arm/cpu.h   |4 +++-
 target-arm/helper.c|   24 
 target-arm/translate.c |   25 ++---
 3 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 1ae7982..e247a7a 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -360,7 +360,9 @@ enum arm_features {
 ARM_FEATURE_M, /* Microcontroller profile.  */
 ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
 ARM_FEATURE_THUMB2EE,
-ARM_FEATURE_V7MP/* v7 Multiprocessing Extensions */
+ARM_FEATURE_V7MP,/* v7 Multiprocessing Extensions */
+ARM_FEATURE_V4T,
+ARM_FEATURE_V5,
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 78f3d39..49ff5cf 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -48,17 +48,23 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 env->cp15.c0_cpuid = id;
 switch (id) {
 case ARM_CPUID_ARM926:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_VFP);
 env->vfp.xregs[ARM_VFP_FPSID] = 0x41011090;
 env->cp15.c0_cachetype = 0x1dd20d2;
 env->cp15.c1_sys = 0x00090078;
 break;
 case ARM_CPUID_ARM946:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_MPU);
 env->cp15.c0_cachetype = 0x0f004006;
 env->cp15.c1_sys = 0x0078;
 break;
 case ARM_CPUID_ARM1026:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_VFP);
 set_feature(env, ARM_FEATURE_AUXCR);
 env->vfp.xregs[ARM_VFP_FPSID] = 0x410110a0;
@@ -67,6 +73,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 break;
 case ARM_CPUID_ARM1136_R2:
 case ARM_CPUID_ARM1136:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_VFP);
 set_feature(env, ARM_FEATURE_AUXCR);
@@ -79,6 +87,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c1_sys = 0x00050078;
 break;
 case ARM_CPUID_ARM11MPCORE:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_VFP);
@@ -91,6 +101,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c0_cachetype = 0x1dd20d2;
 break;
 case ARM_CPUID_CORTEXA8:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_V7);
@@ -113,6 +125,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 env->cp15.c1_sys = 0x00c50078;
 break;
 case ARM_CPUID_CORTEXA9:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_V7);
@@ -140,6 +154,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 env->cp15.c1_sys = 0x00c50078;
 break;
 case ARM_CPUID_CORTEXM3:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_THUMB2);
 set_feature(env, ARM_FEATURE_V7);
@@ -147,6 +163,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 set_feature(env, ARM_FEATURE_DIV);
 break;
 case ARM_CPUID_ANY: /* For userspace emulation.  */
+set_feature(env, ARM_FEATURE_V4T);
+
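
To illustrate the idea behind the new feature bits, here is a small self-contained sketch. It is not the patch's code: DemoCPU and all demo_* names are hypothetical, and only the two instruction gates named in the commit message (BX needs v4T, CLZ needs v5) are modelled. As in the hunks above, a newer core sets the older feature bits as well.

```c
#include <stdint.h>

/* Hypothetical feature indices, one bit each. */
enum demo_feature { DEMO_V4T, DEMO_V5, DEMO_V6 };

typedef struct {
    uint32_t features;  /* bit n set = feature n present */
} DemoCPU;

void demo_set_feature(DemoCPU *cpu, int feature)
{
    cpu->features |= UINT32_C(1) << feature;
}

int demo_has_feature(const DemoCPU *cpu, int feature)
{
    return (cpu->features >> feature) & 1;
}

/* Plain ARMv4: neither Thumb interworking nor v5 instructions. */
void demo_init_v4(DemoCPU *cpu)
{
    cpu->features = 0;
}

void demo_init_v4t(DemoCPU *cpu)
{
    cpu->features = 0;
    demo_set_feature(cpu, DEMO_V4T);
}

/* A v5 core implements everything v4T does, so both bits are set. */
void demo_init_v5(DemoCPU *cpu)
{
    cpu->features = 0;
    demo_set_feature(cpu, DEMO_V4T);
    demo_set_feature(cpu, DEMO_V5);
}

/* Decode-time gates: return 1 if the instruction may be translated. */
int demo_allow_bx(const DemoCPU *cpu)
{
    return demo_has_feature(cpu, DEMO_V4T);
}

int demo_allow_clz(const DemoCPU *cpu)
{
    return demo_has_feature(cpu, DEMO_V5);
}
```

Setting the older bits explicitly on every newer model keeps each gate a single-bit test, at the cost of a few repeated lines per CPU model.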

[Qemu-devel] Re: [PATCH 1/3] use kernel-provided para_features instead of statically coming up with new capabilities

2011-03-24 Thread Avi Kivity

On 03/18/2011 12:42 AM, Glauber Costa wrote:

According to Avi's comments over my last submission, I decided to take a
different, and more correct direction - we hope.

This patch is now using the features provided by KVM_GET_SUPPORTED_CPUID
directly to mask out features from guest-visible cpuid.

The old get_para_features() mechanism is kept for older kernels that do not
implement it.



+#ifdef CONFIG_KVM_PARA
+struct kvm_para_features {
+int cap;
+int feature;
+} para_features[] = {
+{ KVM_CAP_CLOCKSOURCE, KVM_FEATURE_CLOCKSOURCE },
+{ KVM_CAP_NOP_IO_DELAY, KVM_FEATURE_NOP_IO_DELAY },
+{ KVM_CAP_PV_MMU, KVM_FEATURE_MMU_OP },
+#ifdef KVM_CAP_ASYNC_PF
+{ KVM_CAP_ASYNC_PF, KVM_FEATURE_ASYNC_PF },
+#endif


Shouldn't the others get the same #ifdef treatment?

Yes, we depend on kernels of a certain age, but let's not add more
dependencies.



+{ -1, -1 }
+};


Since you use ARRAY_SIZE() later, you don't need the guard here.


+
+static int get_para_features(CPUState *env)
+{
+int i, features = 0;
+
+for (i = 0; i < ARRAY_SIZE(para_features) - 1; i++) {


So you can drop the - 1.


+if (kvm_check_extension(env->kvm_state, para_features[i].cap)) {
+features |= (1 << para_features[i].feature);
+}
+}
+
+return features;
+}
+#endif
+
+
  uint32_t kvm_arch_get_supported_cpuid(CPUState *env, uint32_t function,
uint32_t index, int reg)
  {


-#ifdef CONFIG_KVM_PARA
-struct kvm_para_features {
-int cap;
-int feature;
-} para_features[] = {
-{ KVM_CAP_CLOCKSOURCE, KVM_FEATURE_CLOCKSOURCE },
-{ KVM_CAP_NOP_IO_DELAY, KVM_FEATURE_NOP_IO_DELAY },
-{ KVM_CAP_PV_MMU, KVM_FEATURE_MMU_OP },
-#ifdef KVM_CAP_ASYNC_PF
-{ KVM_CAP_ASYNC_PF, KVM_FEATURE_ASYNC_PF },
-#endif
-{ -1, -1 }
-};
-
-static int get_para_features(CPUState *env)
-{
-int i, features = 0;
-
-for (i = 0; i < ARRAY_SIZE(para_features) - 1; i++) {
-if (kvm_check_extension(env->kvm_state, para_features[i].cap)) {
-features |= (1 << para_features[i].feature);
-}
-}
-#ifdef KVM_CAP_ASYNC_PF
-has_msr_async_pf_en = features & (1 << KVM_FEATURE_ASYNC_PF);
-#endif
-return features;
-}
-#endif
-


Oh.  The whole thing was copied wholesale.  Nevermind.

--
error compiling committee.c: too many arguments to function
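
The point about the { -1, -1 } guard can be shown with a small standalone sketch. The table contents and demo_check_extension() below are made up for illustration (a stand-in for kvm_check_extension(), not the real KVM call); only the loop shape mirrors the code under review.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical capability/feature pairs; the numbers are made up. */
struct cap_feature {
    int cap;      /* capability to probe */
    int feature;  /* bit position to set when the capability is present */
};

static const struct cap_feature cap_table[] = {
    { 10, 0 },
    { 11, 1 },
    { 12, 2 },
    /* no { -1, -1 } terminator: ARRAY_SIZE() already bounds the loop */
};

/* Stand-in for kvm_check_extension(): pretend only even caps exist. */
int demo_check_extension(int cap)
{
    return (cap % 2) == 0;
}

unsigned int demo_get_features(void)
{
    unsigned int features = 0;
    size_t i;

    /* Without a sentinel entry there is no "- 1" in the bound. */
    for (i = 0; i < ARRAY_SIZE(cap_table); i++) {
        if (demo_check_extension(cap_table[i].cap)) {
            features |= 1u << cap_table[i].feature;
        }
    }
    return features;
}
```

Either convention works on its own; mixing a sentinel with ARRAY_SIZE() is what forces the "- 1" and invites off-by-one mistakes when the table is edited later.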




[Qemu-devel] [PATCH 2/3] Implement basic part of SA-1110/SA-1100

2011-03-24 Thread Dmitry Eremin-Solenikov
Basic implementation of DEC/Intel SA-1100/SA-1110 chips emulation.
Implemented:
 - IRQs
 - GPIO
 - PPC
 - RTC
 - UARTs (no IrDA/etc.)
 - OST reused from pxa25x

Everything else is TODO (esp. PM/idle/sleep!) - see the todo in the
hw/strongarm.c

V2:
  * removed all strongarm variants except latest
  * dropped unused casts
  * fixed PIC vmstate
  * fixed new devices created with version_id = 1

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 Makefile.target |2 +
 hw/strongarm.c  | 1302 +++
 hw/strongarm.h  |   62 +++
 target-arm/cpu.h|3 +
 target-arm/helper.c |9 +
 5 files changed, 1378 insertions(+), 0 deletions(-)
 create mode 100644 hw/strongarm.c
 create mode 100644 hw/strongarm.h

diff --git a/Makefile.target b/Makefile.target
index 62b102a..abc2978 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -328,6 +328,8 @@ obj-arm-y += framebuffer.o
 obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o syborg_keyboard.o
 obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
 obj-arm-y += syborg_virtio.o
+obj-arm-y += strongarm.o
+obj-arm-y += collie.o
 
 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/strongarm.c b/hw/strongarm.c
new file mode 100644
index 000..f5e300c
--- /dev/null
+++ b/hw/strongarm.c
@@ -0,0 +1,1302 @@
+/*
+ * StrongARM SA-1100/SA-1110 emulation
+ *
+ * Copyright (C) 2011 Dmitry Eremin-Solenikov
+ *
+ * Largely based on StrongARM emulation:
+ * Copyright (c) 2006 Openedhand Ltd.
+ * Written by Andrzej Zaborowski bal...@zabor.org
+ *
+ * UART code based on QEMU 16550A UART emulation
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2008 Citrix Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+#include "sysbus.h"
+#include "strongarm.h"
+#include "qemu-error.h"
+#include "arm-misc.h"
+#include "sysemu.h"
+
+/*
+ TODO
+ - Implement cp15, c14 ?
+ - Implement cp15, c15 !!! (idle used in L)
+ - Implement idle mode handling/DIM
+ - Implement sleep mode/Wake sources
+ - Implement reset control
+ - Implement memory control regs
+ - PCMCIA handling
+ - Maybe support MBGNT/MBREQ
+ - DMA channels
+ - GPCLK
+ - IrDA
+ - MCP
+ - Enhance UART with modem signals
+ */
+
+static struct {
+target_phys_addr_t io_base;
+int irq;
+} sa_serial[] = {
+{ 0x8001, SA_PIC_UART1 },
+{ 0x8003, SA_PIC_UART2 },
+{ 0x8005, SA_PIC_UART3 },
+{ 0, 0 }
+};
+
+/* Interrupt Controller */
+typedef struct {
+SysBusDevice busdev;
+qemu_irq irq;
+qemu_irq fiq;
+
+uint32_t pending;
+uint32_t enabled;
+uint32_t is_fiq;
+uint32_t int_idle;
+} StrongARMPICState;
+
+#define ICIP    0x00
+#define ICMR    0x04
+#define ICLR    0x08
+#define ICFP    0x10
+#define ICPR    0x20
+#define ICCR    0x0c
+
+#define SA_PIC_SRCS 32
+
+
+static void strongarm_pic_update(void *opaque)
+{
+StrongARMPICState *s = opaque;
+
+/* FIXME: reflect DIM */
+qemu_set_irq(s->fiq, s->pending & s->enabled &  s->is_fiq);
+qemu_set_irq(s->irq, s->pending & s->enabled & ~s->is_fiq);
+}
+
+static void strongarm_pic_set_irq(void *opaque, int irq, int level)
+{
+StrongARMPICState *s = opaque;
+
+if (level) {
+s->pending |= 1 << irq;
+} else {
+s->pending &= ~(1 << irq);
+}
+
+strongarm_pic_update(s);
+}
+
+static uint32_t strongarm_pic_mem_read(void *opaque, target_phys_addr_t offset)
+{
+StrongARMPICState *s = opaque;
+
+switch (offset) {
+case ICIP:
+return s->pending & ~s->is_fiq & s->enabled;
+case ICMR:
+return s->enabled;
+case ICLR:
+return s->is_fiq;
+case ICCR:
+return s->int_idle == 0;
+case ICFP:
+return s->pending & s->is_fiq & s->enabled;
+case ICPR:
+return s->pending;
+default:
+printf("%s: Bad register offset 0x" TARGET_FMT_plx "\n",
+__func__, offset);
+return 0;
+}
+}
+
+static void strongarm_pic_mem_write(void *opaque, target_phys_addr_t offset,
+uint32_t value)
+{
+StrongARMPICState *s = opaque;
+
+switch (offset) {
+case ICMR:
+s->enabled = value;
+break;
+case ICLR:
+s->is_fiq = value;
+break;
+case ICCR:
+s->int_idle = (value & 1) ? 0 : ~0;
+break;
+default:
+printf("%s: Bad register offset 0x" TARGET_FMT_plx "\n",
+__func__, offset);
+break;
+}
+strongarm_pic_update(s);
+}
+
+static CPUReadMemoryFunc * const strongarm_pic_readfn[] = {
+strongarm_pic_mem_read,
+strongarm_pic_mem_read,
+strongarm_pic_mem_read,
+};
+
+static CPUWriteMemoryFunc * const strongarm_pic_writefn[] = {
+strongarm_pic_mem_write,
+strongarm_pic_mem_write,
+strongarm_pic_mem_write,
+};
+
+static int 
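
The interrupt controller logic in the hunk above boils down to three 32-bit masks. The following is a reduced, standalone sketch of that routing, assuming nothing beyond what the patch shows; DemoPIC and the demo_pic_* names are hypothetical, and the qemu_irq plumbing and the idle mask (DIM) are left out.

```c
#include <stdint.h>

/* Hypothetical, reduced model of the interrupt controller state. */
typedef struct {
    uint32_t pending;  /* raw source lines currently asserted */
    uint32_t enabled;  /* mask register: 1 = source may interrupt */
    uint32_t is_fiq;   /* level register: 1 = route source to FIQ */
} DemoPIC;

/* Raise or lower one of the 32 source lines. */
void demo_pic_set_irq(DemoPIC *s, int irq, int level)
{
    if (level) {
        s->pending |= UINT32_C(1) << irq;
    } else {
        s->pending &= ~(UINT32_C(1) << irq);
    }
}

/* Sources that are asserted, enabled and routed to IRQ. */
uint32_t demo_pic_irq_pending(const DemoPIC *s)
{
    return s->pending & s->enabled & ~s->is_fiq;
}

/* Sources that are asserted, enabled and routed to FIQ. */
uint32_t demo_pic_fiq_pending(const DemoPIC *s)
{
    return s->pending & s->enabled & s->is_fiq;
}
```

A source that is pending but masked shows up in neither result, which matches the patch's ICPR read returning the raw pending word while ICIP and ICFP return the filtered views.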

[Qemu-devel] Re: [PATCH 0/3] enable newer msr set for kvm

2011-03-24 Thread Avi Kivity

On 03/18/2011 12:42 AM, Glauber Costa wrote:

This patch is a follow up to an earlier one that aims to enable
kvmclock newer msr set. This time I'm doing it through a more sane
mechanism of consulting the kernel about the supported msr set.


Thanks, applied.

--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [PATCH 0/3] enable newer msr set for kvm

2011-03-24 Thread Avi Kivity

On 03/24/2011 12:37 PM, Avi Kivity wrote:

On 03/18/2011 12:42 AM, Glauber Costa wrote:

This patch is a follow up to an earlier one that aims to enable
kvmclock newer msr set. This time I'm doing it through a more sane
mechanism of consulting the kernel about the supported msr set.


Thanks, applied.



(to uq/master)

--
error compiling committee.c: too many arguments to function




[Qemu-devel] Re: [PATCH 2/3] Implement basic part of SA-1110/SA-1100

2011-03-24 Thread Juan Quintela
Dmitry Eremin-Solenikov dbarysh...@gmail.com wrote:
 Basic implementation of DEC/Intel SA-1100/SA-1110 chips emulation.
 Implemented:
  - IRQs
  - GPIO
  - PPC
  - RTC
  - UARTs (no IrDA/etc.)
  - OST reused from pxa25x

 Everything else is TODO (esp. PM/idle/sleep!) - see the todo in the
 hw/strongarm.c

 V2:
   * removed all strongarm variants except latest
   * dropped unused casts
   * fixed PIC vmstate
   * fixed new devices created with version_id = 1

 Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
 ---
  Makefile.target |2 +
  hw/strongarm.c  | 1302 
 +++
  hw/strongarm.h  |   62 +++
  target-arm/cpu.h|3 +
  target-arm/helper.c |9 +
  5 files changed, 1378 insertions(+), 0 deletions(-)
  create mode 100644 hw/strongarm.c
  create mode 100644 hw/strongarm.h

 diff --git a/Makefile.target b/Makefile.target
 index 62b102a..abc2978 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -328,6 +328,8 @@ obj-arm-y += framebuffer.o
  obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o syborg_keyboard.o
  obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
  obj-arm-y += syborg_virtio.o
 +obj-arm-y += strongarm.o
 +obj-arm-y += collie.o

You add collie.c on next patch.  This would break bisect.

Later, Juan.



Re: [Qemu-devel] [PATCH 0/3] spicevmc - chardev: restore guest open / close (v2)

2011-03-24 Thread Alon Levy
On Thu, Mar 24, 2011 at 11:12:01AM +0100, Hans de Goede wrote:
 Hi All,
 
 When we moved from the spicevmc device (which directly implemented a virtio
 serial port) to doing spicevmc as a chardev backend we lost the notification
 of the guest opening / closing the port to spice server. This causes the
 server to not fall back to server mouse mode when the agent inside the
 guest stops / dies (for whatever reason). Which causes the mouse to
 stop working in this scenario. This patch set fixes this regression.

Reviewed-by: Alon Levy al...@redhat.com

 
 Changes since v1:
 -Replace "return qemu_chr_guest_open(vcon->chr);" with just
  "qemu_chr_guest_open(vcon->chr);", since this is a void func. idem for close.
 
 Regards,
 
 Hans
 



[Qemu-devel] Re: [PATCH 2/3] Implement basic part of SA-1110/SA-1100

2011-03-24 Thread Dmitry Eremin-Solenikov
On 3/24/11, Juan Quintela quint...@redhat.com wrote:
 Dmitry Eremin-Solenikov dbarysh...@gmail.com wrote:
 Basic implementation of DEC/Intel SA-1100/SA-1110 chips emulation.
 Implemented:
  - IRQs
  - GPIO
  - PPC
  - RTC
  - UARTs (no IrDA/etc.)
  - OST reused from pxa25x

 Everything else is TODO (esp. PM/idle/sleep!) - see the todo in the
 hw/strongarm.c

 V2:
   * removed all strongarm variants except latest
   * dropped unused casts
   * fixed PIC vmstate
   * fixed new devices created with version_id = 1

 Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
 ---
  Makefile.target |2 +
  hw/strongarm.c  | 1302
 +++
  hw/strongarm.h  |   62 +++
  target-arm/cpu.h|3 +
  target-arm/helper.c |9 +
  5 files changed, 1378 insertions(+), 0 deletions(-)
  create mode 100644 hw/strongarm.c
  create mode 100644 hw/strongarm.h

 diff --git a/Makefile.target b/Makefile.target
 index 62b102a..abc2978 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -328,6 +328,8 @@ obj-arm-y += framebuffer.o
  obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o syborg_keyboard.o
  obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
  obj-arm-y += syborg_virtio.o
 +obj-arm-y += strongarm.o
 +obj-arm-y += collie.o

 You add collie.c on next patch.  This would break bisect.

Oops. It seems it sneaked in through some git rebases ...


-- 
With best wishes
Dmitry



Re: [Qemu-devel] OVMF, SeaBIOS non-CSM based legacy boot

2011-03-24 Thread Gleb Natapov
On Wed, Mar 23, 2011 at 03:32:41PM -0700, Jordan Justen wrote:
 2011/3/23 Gleb Natapov g...@redhat.com:
  On Tue, Mar 22, 2011 at 02:53:16PM -0700, Jordan Justen wrote:
  To support a boot override for UEFI, this full path would be needed.
  For the purposes of a UEFI boot override, could the user provide
  the partition and path info?
 
  How the user knows what to provide. In most cases this user will be
  management anyway. So the use case is like this: new HD is connected
  to a VM and user wants to boot whatever is installed there.
 
 Yeah, that sounds like something you or I might do, but not
 your average user.
 
I don't see any advanced use case here.

 But, a VM user is most likely not your average user I guess. :)
Management apps (such as libvirt) are VM users too and they may do
pretty advanced things. For instance create VM from preinstalled image.
Actually this is very common use case when VM is provisioned from
preinstalled disk template. 

 
  With legacy
  boot this is the matter of running MBR code, with UEFI user need to boot
  something else and browse file system hierarchy to find magic file to
  boot from?
  Sound like step backward even from legacy bios :)
 
 True, it is not a great scenario.
 
 But you can set up the boot options by browsing the filesystems in the
 firmware setup program.
 
For that I need to involve VM user. Not acceptable. OVMF should find
what to boot for him. User can't be trusted to handle such complex task.
Actually VM may never have a user in the common sense of the word:
person that sits against monitor and operate this particular guest.
May be store the path on a disk in a hidden partition somewhere?

  Is there some notion of default boot in UEFI?
 
 Only for removable media (CD, floppy, USB).  In that case
 /efi/boot/boot(ia32|x64).efi can be 'searched' for.
 
That's good. So we have problems only with hard disks.

 I don't think any UEFI OS installs it's OS loader on the hard disk at this 
 path.
 
How UEFI OS tells UEFI firmware where to boot from?

  I don't know that it matters what you call it (second stage loader?
  perhaps...).  One (arguable) issue with legacy boot process is that
  some 'magic' code must exist in the MBR.
  Legacy boot process has many issues but I wouldn't call MBR one of them.
 
 Tell that to my current system which I foolishly partitioned with GPT,
 and can't get grub/grub2 to work with. :)
 
That's grub/grub2 problem, no? Does grub support EFI at all?

  But let's not argue about that. I doubt we will be able to change UEFI now :)
 
 Yes, many things are frustratingly solidified in UEFI at this point.
 Some spec related, some install base related.
 
  This sounds like a tough to maintain solution.  For boot overrides,
  maybe the user can specify the path.
  User shouldn't know or care. He should be able to download raw disk
  image from internet and run it with qemu -hda image.raw and boot into
  whatever installed there if the image is bootable. It sounds like UEFI
  can't support such usage scenario! And I am not even talk about boot
  overrides in the above scenario.
 
 Yes, I can't think of a great 'user-friendly' solution to this, except
 for the VM to crack the fs and scan for boot loaders.  It might have
 'known' ones, or ask the user.  And this would only make sense for
 importing a disk image in a GUI.
 
 Importing VM's via a GUI these days often involves a lot more metadata
 than just a disk image, so that might be able to include the NV-Vars
 data.
 
Any direct invocation of qemu from command line can be thought of as
importing a disk image. This may be considered as advance use case,
but I think we still have to make it as usable as it is now.

For non-raw disk formats like qcow2 we may save path to boot image
somewhere in image metadata and provide interface to set it from a
guest, but for raw format this will not work.

  For the non-boot override case, we should add support for
  nv-variables, and use the path that the OS sets.
  That makes VM usage much less flexible then it is today. Disk images are
  not self contained any more. I have tens of images that I run inside
  different VMs from different hosts all of the time. It is unreasonable
  to expect that I will track additional images with nv-variables needed
  to boot from them.
 
 Hmm, I'm not sure what to say.  I guess you'd need to know your path, ie:
 /pci@i0cf8/ide@1,1/drive@1/disk@0:0,/path/abc.efi
 associated with each disk image.
 
I need to know only 0,/path/abc.efi part of it. The same image can
appear as different device depending on how VM was started. We need
to find the place to store this info at the image itself.

 By the way, today OVMF attempts to store NV-Var data in a file on the
 disk, but this cannot support variables at runtime.  (This is why I
 sent in the patch for using -pflash on x86/x86-64.)
 
And this file is always stored at the same location? If it is, then the
problem is solved! But what do you mean by this 

Re: [Qemu-devel] [PATCH v2] Do not delete BlockDriverState when deleting the drive

2011-03-24 Thread Markus Armbruster
Whoops, almost missed this.  Best to cc: me to avoid that.

Ryan Harper ry...@us.ibm.com writes:

 * Markus Armbruster arm...@redhat.com [2011-03-15 04:48]:
 Sorry for the long delay, I was out of action for a week.
 
 Ryan Harper ry...@us.ibm.com writes:
 
  When removing a drive from the host-side via drive_del we currently have 
  the
  following path:
 
  drive_del
  qemu_aio_flush()
  bdrv_close()
  drive_uninit()
  bdrv_delete()
 
  When we bdrv_delete() we end up qemu_free() the BlockDriverState pointer
  however, the block devices retain a copy of this pointer, see
  hw/virtio-blk.c:virtio_blk_init() where we s->bs = conf->bs.
 
  We now have a use-after-free situation.  If the guest continues to issue IO
  against the device, and we've reallocated the memory that the 
  BlockDriverState
  pointed at, then we will fail the bs->drv checks in the various bdrv_ 
  methods.
 
 "we will fail the bs->drv checks" is misleading, in my opinion.  Here's
 what happens:
 
 1. bdrv_close(bs) zaps bs->drv, which makes any subsequent I/O get
dropped.  Works as designed.
 
 2. drive_uninit() frees the bs.  Since the device is still connected to
bs, any subsequent I/O is a use-after-free.
 
The value of bs->drv becomes unpredictable on free.  As long as it
remains null, I/O still gets dropped.  I/O crashes or worse once that
changes.  Could be right on free, could be much later.
 
 If you respin anyway, please clarify your description.

 Sure.  I wasn't planning a new version, but I'll update and send anyhow
 as I didn't see it get included in pull from the block branch.
 
  To resolve this issue as simply as possible, we can choose to not actually
  delete the BlockDriverState pointer.  Since bdrv_close() handles setting 
  the drv
  pointer to NULL, we just need to remove the BlockDriverState from the QLIST
  that is used to enumerate the block devices.  This is currently handled 
  within
  bdrv_delete, so move this into its own function, bdrv_remove().
 
 Why do we remove the BlockDriverState from bdrv_states?  Because we want
 drive_del to make its *name* go away.
 
 Begs the question: is the code prepared for a BlockDriverState object
 that isn't on bdrv_states?  Turns out we're in luck: bdrv_new() already
 creates such objects when the device_name is empty.  This is used for
 internal BlockDriverStates such as COW backing files.  Your code makes
 device_name empty when taking the object off bdrv_states, so we're good.
 
 Begs yet another question: how does the behavior of a BlockDriverState
 change when it's taken off bdrv_states, and is that the behavior we
 want?  Changes:
 
 * bdrv_delete() no longer takes it off bdrv_states.  Good.
 
 * bdrv_close_all(), bdrv_commit_all() and bdrv_flush_all() no longer
   cover it.  Okay, because bdrv_close(), bdrv_commit() and bdrv_flush()
   do nothing anyway for closed BlockDriverStates.
 
 * info block and info blockstats no longer show it, because
   bdrv_info() and bdrv_info_stats() no longer see it.  Okay.
 
 * bdrv_find(), bdrv_next(), bdrv_iterate() no longer see it.  Impact?
   Please check their uses and report.

1  664  block-migration.c block_load
  bs = bdrv_find(device_name);

 - no longer see it.  This is fine since we can't migrate a block
 device that has been removed

2  562  blockdev.c do_commit
  bs = bdrv_find(device);

 - do_commit won't see it in either when calling bdrv_commit_all()
 Fine as you mention above.  If user specifies the device name
 we won't find it, that's OK because we can't commit data against
 a closed BlockDriverState.

3  587  blockdev.c do_snapshot_blkdev
  bs = bdrv_find(device);

- OK, cannot take a snapshot against a deleted BlockDriverState

4  662  blockdev.c do_eject
  bs = bdrv_find(filename);

- OK, cannot eject a deleted BlockDriverState; 

5  676  blockdev.c do_block_set_passwd
  bs = bdrv_find(qdict_get_str(qdict, device));

- OK, cannot set password a deleted BlockDriverState; 

6  701  blockdev.c do_change_block
  bs = bdrv_find(device);

- OK, cannot change the file/device of a deleted BlockDriverState; 

7  732  blockdev.c do_drive_del
  bs = bdrv_find(id);

- OK, cannot delete an already deleted Drive

8  783  blockdev.c do_block_resize
  bs = bdrv_find(device);

- OK, cannot resize a deleted Drive

9  312  hw/qdev-properties.c parse_drive
  bs = bdrv_find(str);

- Used when invoking qdev_prop_drive .parse method;  parse method is 
 invoked via
 qdev_device_add() which calls set_property() which invokes parse.  
 AFAICT, this is OK
 since we won't be going down the device add path worrying about a
 deleted block device.

Thanks for checking!

  The result is that we can now invoke drive_del, this closes the file 
  descriptors
  and sets BlockDriverState-drv to NULL which 
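
The approach discussed in this thread (close the state and take it off the list, but keep the memory alive so a still-attached device never follows a dangling pointer) can be sketched as below. This is an illustration only, not the block layer's code: DemoBlockState and the demo_* functions are hypothetical, and the -ENODEV return value is an arbitrary choice for the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for BlockDriverState. */
typedef struct DemoBlockState {
    const void *drv;               /* NULL once closed: I/O is refused */
    char name[32];                 /* empty = no longer user-visible */
    struct DemoBlockState *next;   /* link in the list of named states */
} DemoBlockState;

static DemoBlockState *demo_states;  /* head of the list */

void demo_register(DemoBlockState *bs)
{
    bs->next = demo_states;
    demo_states = bs;
}

/* Closing only drops the driver; the object itself stays valid. */
void demo_close(DemoBlockState *bs)
{
    bs->drv = NULL;
}

/* Unlink from the list and clear the name, but do NOT free the memory. */
void demo_remove(DemoBlockState *bs)
{
    DemoBlockState **p;

    for (p = &demo_states; *p; p = &(*p)->next) {
        if (*p == bs) {
            *p = bs->next;
            bs->next = NULL;
            break;
        }
    }
    bs->name[0] = '\0';
}

DemoBlockState *demo_find(const char *name)
{
    DemoBlockState *bs;

    for (bs = demo_states; bs; bs = bs->next) {
        if (strcmp(bs->name, name) == 0) {
            return bs;
        }
    }
    return NULL;
}

/* I/O entry point: a closed state fails cleanly instead of crashing. */
int demo_read(DemoBlockState *bs)
{
    if (!bs->drv) {
        return -ENODEV;
    }
    return 0;
}
```

After demo_close() and demo_remove() the name can no longer be looked up, yet a device that kept the pointer still gets a well-defined error on every request. The price is that the object is leaked until its last user lets go of it.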

Re: [Qemu-devel] OVMF, SeaBIOS non-CSM based legacy boot

2011-03-24 Thread Michal Suchanek
On 23 March 2011 23:32, Jordan Justen jljus...@gmail.com wrote:
 2011/3/23 Gleb Natapov g...@redhat.com:
 On Tue, Mar 22, 2011 at 02:53:16PM -0700, Jordan Justen wrote:
 To support a boot override for UEFI, this full path would be needed.
 For the purposes of a UEFI boot override, could the user provide
 the partition and path info?

 How the user knows what to provide. In most cases this user will be
 management anyway. So the use case is like this: new HD is connected
 to a VM and user wants to boot whatever is installed there.

 Yeah, that sounds like something you or I might do, but not
 your average user.

 But, a VM user is most likely not your average user I guess. :)

 With legacy
 boot this is the matter of running MBR code, with UEFI user need to boot
 something else and browse file system hierarchy to find magic file to
 boot from?
 Sound like step backward even from legacy bios :)

 True, it is not a great scenario.

 But you can set up the boot options by browsing the filesystems in the
 firmware setup program.

 Is there some notion of default boot in UEFI?

 Only for removable media (CD, floppy, USB).  In that case
 /efi/boot/boot(ia32|x64).efi can be 'searched' for.


To the contrary.

The distinction between removable and fixed media is quite blurry these days.

Is an eSATA disk removable? Or a SATA disk connected to AHCI
controller through an internal (but hotplug capable) connector?
And when I unplug the disk, put it into an enclosure and connect
through USB has it magically started to be removable then?

When installing EFI bootloaders it is suggested to create a small FAT
partition at the start of the disk and put the loader there, just
under this name.

The Apple EFI also understands some variables stored in their HFS
volumes but that's another story.

If you want to prepare a disk image for somebody to use you *can* put
the bootloader in the well-known location.

If you pass just the raw disk image to somebody else it is technically
removable and should use the removable path, and the user can attach
it through the USB emulation if they insist on making it clearly
removable.

Thanks

Michal



Re: [Qemu-devel] Re: [PATCH 3/3] raw-posix: Re-open host CD-ROM after media change

2011-03-24 Thread Kevin Wolf
Am 23.03.2011 21:50, schrieb Stefan Hajnoczi:
 On Wed, Mar 23, 2011 at 8:27 PM, Juan Quintela quint...@redhat.com wrote:
 Stefan Hajnoczi stefa...@linux.vnet.ibm.com wrote:
 +
 +    if (s->fd == -1) {
 +        s->fd = qemu_open(bs->filename, s->open_flags, 0644);

 Everything else in that file uses plain open(), not qemu_open(). The
 difference is basically that qemu_open() adds the O_CLOEXEC flag.

 I don't know if this one should be vanilla open or the other ones
 qemu_open().

 What do you think?
 
 raw_open_common() uses qemu_open().  That's why I used it.

And I think it's correct. There's no reason not to set O_CLOEXEC here.
Maybe some of the open() users need to be fixed.
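For readers unfamiliar with the flag, here is a minimal sketch of what an O_CLOEXEC-setting wrapper does. It is not the actual qemu_open() implementation; the name open_cloexec and the fcntl() fallback are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a qemu_open()-style wrapper: open a file and
 * make sure the descriptor is closed automatically across exec().
 */
static int open_cloexec(const char *path, int flags, mode_t mode)
{
    int fd;

    assert(path != NULL);
#ifdef O_CLOEXEC
    /* Atomic: no window in which a fork()+exec() could leak the fd. */
    fd = open(path, flags | O_CLOEXEC, mode);
#else
    /* Fallback for hosts without O_CLOEXEC; racy with a concurrent fork(). */
    fd = open(path, flags, mode);
    if (fd >= 0) {
        int fdflags = fcntl(fd, F_GETFD);

        if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) < 0) {
            close(fd);
            return -1;
        }
    }
#endif
    return fd;
}
```

Without the flag, a descriptor for the host CD-ROM would be inherited by any helper program that is fork()ed and exec()ed afterwards.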

 +        if (s->fd < 0) {
 +            return 0;
 +        }
 +    }
 +
 +    ret = (ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT) == CDS_DISC_OK);

 parens are not needed around ==.
 
 Yes, if you want I'll remove them.  I just did it for readability.

I like them.

Kevin



[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-03-24 Thread dum8d0g
Using Ubuntu Natty Narwhal, installed today (2011-03-24), I tried to take a
snapshot with libvirt. Here are the results, first with the natty versions
of qemu-kvm and libvirt, then with the slowsave packages (qemu-common and
qemu-kvm 0.12.5+noroms-0ubuntu7.2).

root@koberec:~# time virsh snapshot-create 1
Domain snapshot 1300968929 created

real    4m39.594s
user    0m0.000s
sys     0m0.020s
root@koberec:~# cd /storage/slowsave/
root@koberec:/storage/slowsave# dpkg -l | grep -E 'libvirt|qemu'

 
ii  libvirt-bin  0.8.8-1ubuntu5          the programs for the libvirt library
ii  libvirt0     0.8.8-1ubuntu5          library for interfacing with different virtualization systems
ii  qemu-common  0.14.0+noroms-0ubuntu3  qemu common functionality (bios, documentation, etc)
ii  qemu-kvm     0.14.0+noroms-0ubuntu3  Full virtualization on i386 and amd64 hardware
root@koberec:/storage/slowsave# dpkg -r qemu-common qemu-kvm

 
root@koberec:/storage/slowsave# dpkg -i qemu-common_0.12.5+noroms-0ubuntu7.2_all.deb qemu-kvm_0.12.5+noroms-0ubuntu7.2_amd64.deb
root@koberec:/storage/slowsave# pkill kvm; sleep 5; service libvirt-bin restart
root@koberec:/storage/slowsave# time virsh snapshot-create 1
Domain snapshot 1300969754 created

real    2m22.055s
user    0m0.000s
sys     0m0.010s
root@koberec:/storage/slowsave# qemu-img snapshot -l /storage/debian.qcow2 | tail -n 1
8 1300969754  57M 2011-03-24 08:29:14   00:03:37.652
root@koberec:/storage/slowsave# virsh console 1
Connected to domain vm
Escape character is ^]

Debian GNU/Linux 5.0 debian ttyS0

debian login: root
Password: 
Last login: Thu Mar 24 08:15:18 EDT 2011 on ttyS0
Linux debian 2.6.26-2-amd64 #1 SMP Thu Sep 16 15:56:38 UTC 2010 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
debian:~# free -m
             total       used       free     shared    buffers     cached
Mem:           561         39        521          0          4         16
-/+ buffers/cache:         19        542
Swap:          478          0        478
debian:~# 
root@koberec:/storage/slowsave# dd if=/dev/urandom of=/storage/emptyfile bs=1M count=40
40+0 records in
40+0 records out
41943040 bytes (42 MB) copied, 5.4184 s, 7.7 MB/s


I am not sure if my measurements are relevant to anything in here, but I hope so.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/524447

Title:
  virsh save is very slow

Status in libvirt virtualization API:
  Unknown
Status in QEMU:
  Fix Released
Status in “libvirt” package in Ubuntu:
  Invalid
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “libvirt” source package in Lucid:
  New
Status in “qemu-kvm” source package in Lucid:
  In Progress
Status in “libvirt” source package in Maverick:
  New
Status in “qemu-kvm” source package in Maverick:
  In Progress

Bug description:
  ==
  SRU Justification:
  1. impact: 'qemu save' is slow
  2. how addressed: a patch upstream fixes the case when a file does not 
announce when it is ready.
  3. patch: see the patch in linked bzr trees
  4. TEST CASE: see comment #4 for a specific recipe
  5. regression potential:  this patch only touches the vm save path.
  ==

  As reported here: http://www.redhat.com/archives/libvir-
  list/2009-December/msg00203.html

  virsh save is very slow - it writes the image at around 1MB/sec on
  my test system.

  (I think I saw a bug report for this issue on Fedora's bugzilla, but I
  can't find it now...)

  Confirmed under Karmic.



Re: [Qemu-devel] OVMF, SeaBIOS non-CSM based legacy boot

2011-03-24 Thread Gleb Natapov
On Thu, Mar 24, 2011 at 01:27:39PM +0100, Michal Suchanek wrote:
 On 23 March 2011 23:32, Jordan Justen jljus...@gmail.com wrote:
  2011/3/23 Gleb Natapov g...@redhat.com:
  On Tue, Mar 22, 2011 at 02:53:16PM -0700, Jordan Justen wrote:
  To support a boot override for UEFI, this full path would be needed.
  For the purposes of a UEFI boot override, could the user provide
  the partition & path info?
 
  How does the user know what to provide? In most cases this user will be
  management anyway. So the use case is like this: new HD is connected
  to a VM and user wants to boot whatever is installed there.
 
  Yeah, that sounds like something you or I might do, but not
  your average user.
 
  But, a VM user is most likely not your average user I guess. :)
 
  With legacy boot this is a matter of running the MBR code; with UEFI the
  user needs to boot something else and browse the file system hierarchy
  to find the magic file to boot from?
  Sounds like a step backward even from legacy BIOS :)
 
  True, it is not a great scenario.
 
  But you can set up the boot options by browsing the filesystems in the
  firmware setup program.
 
  Is there some notion of default boot in UEFI?
 
  Only for removable media (CD, floppy, USB).  In that case
  /efi/boot/boot(ia32|x64).efi can be 'searched' for.
 
 
 To the contrary.
 
 The distinction between removable and fixed media is quite blurry these days.
 
My thoughts exactly. It is doubly blurry in the virtualization world, where
moving any media from VM to VM is so easy.

 Is an eSATA disk removable? Or a SATA disk connected to AHCI
 controller through an internal (but hotplug capable) connector?
 And when I unplug the disk, put it into an enclosure and connect it
 through USB, has it magically become removable?
 
 When installing EFI bootloaders it is suggested to create a small FAT
 partition at the start of the disk and put the loader there, just
 under this name.
 
 The Apple EFI also understands some variables stored in their HFS
 volumes but that's another story.
 
 If you want to prepare a disk image for somebody to use you *can* put
 the bootloader in the well-known location.
 
 If you pass just the raw disk image to somebody else it is technically
 removable and should use the removable path, and the user can attach
 it through the USB emulation if they insist on making it clearly
 removable.
 
 Thanks
 
 Michal

--
Gleb.



[Qemu-devel] [PATCH] qemu-options.hx: fix spice tls-channel

2011-03-24 Thread Alon Levy
The tls-channel documentation is missing the cursor and smartcard channels,
as well as a note that the tunnel and smartcard channels are not always
available.
---
 qemu-options.hx |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 540f5c2..ebd98af 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -695,7 +695,7 @@ DEF(spice, HAS_ARG, QEMU_OPTION_spice,
 [,disable-ticketing][,tls-port=nr][,x509-dir=dir]\n
 [,x509-key-file=file][,x509-cert-file=file]\n
 [,x509-dh-key-file=file][,tls-ciphers=list]\n
-[,tls-channel=main|display|inputs|record|playback|tunnel]\n
+[,tls-channel=main|display|cursor|inputs|record|playback|tunnel|smartcard]\n
 [,image-compression=auto_glz|auto_lz|quic|glz|lz|off]\n
 [,jpeg-wan-compression=auto|never|always]\n
 [,streaming-video=off|all|filter]\n
@@ -743,13 +743,15 @@ The x509 file names can also be configured individually.
 @item tls-ciphers=list
 Specify which ciphers to use.
 
-@item tls-channel=[main|display|inputs|record|playback|tunnel]
-@item plaintext-channel=[main|display|inputs|record|playback|tunnel]
+@item tls-channel=[main|display|cursor|inputs|record|playback|tunnel|smartcard]
+@item plaintext-channel=[main|display|cursor|inputs|record|playback|tunnel|smartcard]
 Force specific channel to be used with or without TLS encryption.  The
 options can be specified multiple times to configure multiple
 channels.  The special name default can be used to set the default
 mode.  For channels which are not explicitly forced into one mode the
 spice client is allowed to pick tls/plaintext as he pleases.
+Note: tunnel and smartcard are compile time options for the underlying spice
+server library, they may not exist in your distribution's default package.
 
 @item image-compression=[auto_glz|auto_lz|quic|glz|lz|off]
 Configure image compression (lossless).
-- 
1.7.4.1




[Qemu-devel] Re: [Bug 658610] Re: Check whether images have write permissions

2011-03-24 Thread Serge Hallyn
Thanks, Anthony.  Does that mean that the bug should be 'Invalid',
or perhaps just 'Wontfix' (for the IDE case)?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/658610

Title:
  Check whether images have write permissions

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Confirmed

Bug description:
  KVM/Qemu should check whether the disk images have write permissions
  in order to prevent users from getting weird IO errors in their VMs
  without understanding what's happening.



[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-03-24 Thread Serge Hallyn
Thanks for that info.  That is unexpected.  Could you send the xml
description of the domain you were snapshotting, as well as the format
of the backing file (i.e. qemu-img info filename.img) and what
filesystem it is stored on (or whether it is LVM)?  I'd like to try to
reproduce it.

Since you are seeing this in natty, it seems certain that while your
symptom is the same as that in the original bug report, the cause is
different.  So it may be best to open a new bug report to track down the
new issue in natty.




[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-03-24 Thread Serge Hallyn
To be clear, please re-install the stock natty packages, do a virsh
snapshot-create, and then do 'ubuntu-bug libvirt-bin' to file a new bug.
Then please give the info I asked for in comment 25 in that bug.

Thanks!




[Qemu-devel] [Bug 658610] Re: Check whether images have write permissions

2011-03-24 Thread Anthony Liguori
We can always improve the information given to the user for something like
this, so I've marked this as wishlist.

** Changed in: qemu
   Importance: Undecided => Wishlist

** Changed in: qemu
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/658610

Title:
  Check whether images have write permissions

Status in QEMU:
  Confirmed
Status in “qemu-kvm” package in Ubuntu:
  Confirmed

Bug description:
  KVM/Qemu should check whether the disk images have write permissions
  in order to prevent users from getting weird IO errors in their VMs
  without understanding what's happening.



[Qemu-devel] [Bug 418112] Re: qemu-img should give reasons for failing

2011-03-24 Thread Serge Hallyn
** Changed in: qemu
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/418112

Title:
  qemu-img should give reasons for failing

Status in QEMU:
  Fix Released
Status in “qemu” package in Ubuntu:
  Won't Fix
Status in “qemu-kvm” package in Ubuntu:
  Fix Released

Bug description:
  Binary package hint: qemu

    $ kvm-img create -f qcow2 disks/t.img 4000M
    Formatting 'disks/t.img', fmt=qcow2, size=4096000 kB
    qemu-img: Error while formatting

  strace shows that the real error was Permission denied. It should
  say so, using strerror.

  ProblemType: Bug
  Architecture: i386
  Date: Mon Aug 24 13:31:45 2009
  DistroRelease: Ubuntu 9.10
  Package: qemu 0.10.6-1ubuntu1
  ProcEnviron:
   LANGUAGE=en_GB.UTF-8
   LC_COLLATE=C
   PATH=(custom, user)
   LANG=en_GB.UTF-8
   SHELL=/bin/bash
  ProcVersionSignature: Ubuntu 2.6.31-5.24-generic
  SourcePackage: qemu
  Uname: Linux 2.6.31-5-generic i686
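
The strerror suggestion in the bug description could look roughly like the following. This is a sketch only and not the fix that went into qemu-img; the helper name format_errno is made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format "<prefix>: <reason>" from an errno value so
 * the user sees why an operation failed, not only that it failed.
 */
static int format_errno(char *buf, size_t len, const char *prefix, int err)
{
    assert(buf != NULL && prefix != NULL);
    return snprintf(buf, len, "%s: %s", prefix, strerror(err));
}
```

Called with the prefix "qemu-img: Error while formatting" and EACCES, the message ends with the system's text for that error, which is typically "Permission denied".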



[Qemu-devel] [PATCH 05/17] s390x: enable CPU_QuadU

2011-03-24 Thread Alexander Graf
From: Ulrich Hecht u...@suse.de

S390x uses the QuadU type, so let's enable it.

Signed-off-by: Ulrich Hecht u...@suse.de
---
 cpu-all.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 87b0f86..5a26d7a 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -138,7 +138,7 @@ typedef union {
 uint64_t ll;
 } CPU_DoubleU;
 
-#ifdef TARGET_SPARC
+#if defined(TARGET_SPARC) || defined(TARGET_S390X)
 typedef union {
 float128 q;
 #if defined(HOST_WORDS_BIGENDIAN) \
-- 
1.6.0.2




[Qemu-devel] [PATCH 04/17] s390x: Enable nptl for s390x

2011-03-24 Thread Alexander Graf
From: Ulrich Hecht u...@suse.de

S390x user emulation can do nptl. Reflect this in the configure script.

Signed-off-by: Ulrich Hecht u...@suse.de
---
 configure |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index a166de0..6b421d8 100755
--- a/configure
+++ b/configure
@@ -3087,6 +3087,7 @@ case $target_arch2 in
 target_phys_bits=64
   ;;
   s390x)
+target_nptl=yes
 target_phys_bits=64
   ;;
   *)
-- 
1.6.0.2




[Qemu-devel] [PATCH 09/17] s390x: Dispatch interrupts to KVM or the real CPU

2011-03-24 Thread Alexander Graf
The KVM interrupt injection path is non-generic for now. So we need to push
knowledge of how to inject a device interrupt using KVM into the actual device
code.

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/s390-virtio-bus.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index d44eff2..bebe965 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -43,6 +43,8 @@
 do { } while (0)
 #endif
 
+#define VIRTIO_EXT_CODE   0x2603
+
 struct BusInfo s390_virtio_bus_info = {
     .name   = "s390-virtio",
     .size   = sizeof(VirtIOS390Bus),
@@ -304,9 +306,13 @@ static void virtio_s390_notify(void *opaque, uint16_t vector)
 {
     VirtIOS390Device *dev = (VirtIOS390Device*)opaque;
     uint64_t token = s390_virtio_device_vq_token(dev, vector);
+    CPUState *env = s390_cpu_addr2state(0);
 
-    /* XXX kvm dependency! */
-    kvm_s390_virtio_irq(s390_cpu_addr2state(0), 0, token);
+    if (kvm_enabled()) {
+        kvm_s390_virtio_irq(env, 0, token);
+    } else {
+        cpu_inject_ext(env, VIRTIO_EXT_CODE, 0, token);
+    }
 }
 
 static unsigned virtio_s390_get_features(void *opaque)
-- 
1.6.0.2




[Qemu-devel] [PATCH 01/17] Only build ivshmem when CONFIG_PCI && CONFIG_KVM

2011-03-24 Thread Alexander Graf
The ivshmem device depends on both PCI and KVM, not only on KVM. Reflect this
in the Makefile, so we don't get build errors on s390x.

Signed-off-by: Alexander Graf ag...@suse.de
---
 Makefile.target |8 +++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index f0df98e..17ad396 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -209,7 +209,13 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
 obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
 
 # Inter-VM PCI shared memory
-obj-$(CONFIG_KVM) += ivshmem.o
+CONFIG_IVSHMEM =
+ifeq ($(CONFIG_KVM), y)
+  ifeq ($(CONFIG_PCI), y)
+CONFIG_IVSHMEM = y
+  endif
+endif
+obj-$(CONFIG_IVSHMEM) += ivshmem.o
 
 # Hardware support
 obj-i386-y += vga.o
-- 
1.6.0.2




[Qemu-devel] [PATCH 08/17] s390x: Enable s390x-softmmu target

2011-03-24 Thread Alexander Graf
This patch adds some code paths for running s390x guest OSs without the
need for KVM.

Signed-off-by: Alexander Graf ag...@suse.de
---
 cpu-exec.c  |8 
 target-s390x/exec.h |   20 
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 34eaedc..b113f8b 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -316,6 +316,8 @@ int cpu_exec(CPUState *env1)
 do_interrupt(env);
 #elif defined(TARGET_M68K)
 do_interrupt(0);
+#elif defined(TARGET_S390X)
+do_interrupt(env);
 #endif
 env->exception_index = -1;
 #endif
@@ -524,6 +526,12 @@ int cpu_exec(CPUState *env1)
 do_interrupt(1);
 next_tb = 0;
 }
+#elif defined(TARGET_S390X) && !defined(CONFIG_USER_ONLY)
+if ((interrupt_request & CPU_INTERRUPT_HARD) &&
+(env->psw.mask & PSW_MASK_EXT)) {
+do_interrupt(env);
+next_tb = 0;
+}
 #endif
/* Don't use the cached interupt_request value,
   do_interrupt may have updated the EXITTB flag. */
diff --git a/target-s390x/exec.h b/target-s390x/exec.h
index f7893f3..6fe64a6 100644
--- a/target-s390x/exec.h
+++ b/target-s390x/exec.h
@@ -34,6 +34,26 @@ static inline int cpu_has_work(CPUState *env)
     return env->interrupt_request & CPU_INTERRUPT_HARD; // guess
 }
 
+static inline void regs_to_env(void)
+{
+}
+
+static inline void env_to_regs(void)
+{
+}
+
+static inline int cpu_halted(CPUState *env)
+{
+    if (!env->halted) {
+        return 0;
+    }
+    if (cpu_has_work(env)) {
+        env->halted = 0;
+        return 0;
+    }
+    return EXCP_HALTED;
+}
+
 static inline void cpu_pc_from_tb(CPUState *env, TranslationBlock* tb)
 {
     env->psw.addr = tb->pc;
-- 
1.6.0.2




[Qemu-devel] [PATCH 11/17] s390x: virtio machine storage keys

2011-03-24 Thread Alexander Graf
For emulation (and migration) we need to know about the guest's storage keys.
These are separate from actual RAM contents, so we need to allocate them in
parallel to RAM.

While touching the file, this patch also adjusts the hypercall function
to a new syntax that aligns better with tcg emulated code.
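
To illustrate the "in parallel to RAM" layout described above, here is a minimal standalone sketch with one key byte per guest page. The helper names and the SK_ macros are assumptions made for this example; only the one-byte-per-page sizing (ram_size / page size) comes from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_PAGE_BITS 12                     /* 4 KiB pages, assumed here */
#define SK_PAGE_SIZE (1ULL << SK_PAGE_BITS)

/* Allocate one zeroed key byte per page of guest RAM. */
static uint8_t *storage_keys_alloc(uint64_t ram_size)
{
    return calloc(ram_size / SK_PAGE_SIZE, sizeof(uint8_t));
}

/* Return a pointer to the key covering guest physical address addr. */
static uint8_t *storage_key_for(uint8_t *keys, uint64_t ram_size,
                                uint64_t addr)
{
    assert(keys != NULL && addr < ram_size);
    return &keys[addr >> SK_PAGE_BITS];
}
```

Because the array is indexed by page number rather than stored inside guest memory, it can be shared by all CPUs and saved separately on migration.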

Signed-off-by: Alexander Graf ag...@suse.de
---
 hw/s390-virtio.c |   21 +
 1 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/hw/s390-virtio.c b/hw/s390-virtio.c
index 850422f..be2c80c 100644
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -82,13 +82,12 @@ CPUState *s390_cpu_addr2state(uint16_t cpu_addr)
     return ipi_states[cpu_addr];
 }
 
-int s390_virtio_hypercall(CPUState *env)
+int s390_virtio_hypercall(CPUState *env, uint64_t mem, uint64_t hypercall)
 {
     int r = 0, i;
-    target_ulong mem = env->regs[2];
 
-    dprintf("KVM hypercall: %ld\n", env->regs[1]);
-    switch (env->regs[1]) {
+    dprintf("KVM hypercall: %ld\n", hypercall);
+    switch (hypercall) {
     case KVM_S390_VIRTIO_NOTIFY:
         if (mem > ram_size) {
             VirtIOS390Device *dev = s390_virtio_bus_find_vring(s390_bus,
@@ -128,8 +127,7 @@ int s390_virtio_hypercall(CPUState *env)
         break;
     }
 
-    env->regs[2] = r;
-    return 0;
+    return r;
 }
 
 /* PC hardware initialisation */
@@ -145,14 +143,9 @@ static void s390_init(ram_addr_t ram_size,
     ram_addr_t kernel_size = 0;
     ram_addr_t initrd_offset;
     ram_addr_t initrd_size = 0;
+    uint8_t *storage_keys;
     int i;
 
-    /* XXX we only work on KVM for now */
-
-    if (!kvm_enabled()) {
-        fprintf(stderr, "The S390 target only works with KVM enabled\n");
-        exit(1);
-    }
 
     /* get a BUS */
     s390_bus = s390_virtio_bus_init(ram_size);
@@ -161,6 +154,9 @@ static void s390_init(ram_addr_t ram_size,
     ram_addr = qemu_ram_alloc(NULL, "s390.ram", ram_size);
     cpu_register_physical_memory(0, ram_size, ram_addr);
 
+    /* allocate storage keys */
+    storage_keys = qemu_mallocz(ram_size / TARGET_PAGE_SIZE);
+
     /* init CPUs */
     if (cpu_model == NULL) {
         cpu_model = "host";
@@ -178,6 +174,7 @@ static void s390_init(ram_addr_t ram_size,
         ipi_states[i] = tmp_env;
         tmp_env->halted = 1;
         tmp_env->exception_index = EXCP_HLT;
+        tmp_env->storage_keys = storage_keys;
     }
 
     env->halted = 0;
-- 
1.6.0.2




[Qemu-devel] [PATCH 17/17] s390x: build s390x by default

2011-03-24 Thread Alexander Graf
This patch enables building of s390x-softmmu and s390x-linux-user
targets by default.

Signed-off-by: Alexander Graf ag...@suse.de
---
 configure|2 ++
 default-configs/s390x-linux-user.mak |1 +
 2 files changed, 3 insertions(+), 0 deletions(-)
 create mode 100644 default-configs/s390x-linux-user.mak

diff --git a/configure b/configure
index 6b421d8..6dd2363 100755
--- a/configure
+++ b/configure
@@ -997,6 +997,7 @@ sh4-softmmu \
 sh4eb-softmmu \
 sparc-softmmu \
 sparc64-softmmu \
+s390x-softmmu \
 
 fi
 # the following are Linux specific
@@ -1021,6 +1022,7 @@ sh4eb-linux-user \
 sparc-linux-user \
 sparc64-linux-user \
 sparc32plus-linux-user \
+s390x-linux-user \
 
 fi
 # the following are Darwin specific
diff --git a/default-configs/s390x-linux-user.mak b/default-configs/s390x-linux-user.mak
new file mode 100644
index 000..a243c99
--- /dev/null
+++ b/default-configs/s390x-linux-user.mak
@@ -0,0 +1 @@
+# Default configuration for s390x-linux-user
-- 
1.6.0.2




[Qemu-devel] [PATCH 00/17] s390x emulation support

2011-03-24 Thread Alexander Graf
We've had support for running s390x guests with KVM for a
while now. This patch set also enables support for running
s390x guests in system as well as linux-user mode in emulation!

Within this scope, I again want to stress that this is _not_
supposed to replace Hercules - the s390 emulator - in any way.
The only target supported by qemu is Linux. You can only run
Linux applications with linux-user emulation and Linux guest OSs
with the system emulation. All the device logic (and 24 bit mode)
for running legacy stuff is missing. Use Hercules for those!

I have successfully run the following guest OSs:

  - SUSE Linux Enterprise Server 11 SP1
  - Debian Lenny

Both of which work just fine on x86_64 and ppc hosts. Other hosts
should also work. The only thing that did not work for me is network.
Somehow networking only works with KVM enabled, so there is probably
some bug involved still.

Either way - rejoice! As with this patch set you can finally fulfill
your mainframe desires on your local workstation. And - most importantly -
finally test patches to virtio against s390!

For images, I'm hoping for Aurelien to provide Debian images that run
in qemu. Other distributions only provide S390x target support in their
enterprise variants, keeping me from redistributing images :(.

If you're trying to get things rolling yourself, make sure to use a
recent kernel that has support for the virtio architecture and virtio
console support - otherwise you won't see output.

The linux-user mode emulation only supports 64-bit binaries, so
running Debian binaries with it is out of the question for now. Use
the system emulation mode if you really need to run Debian binaries.


Alexander Graf (12):
  Only build ivshmem when CONFIG_PCI && CONFIG_KVM
  virtio: use generic name when possible
  s390x: Enable s390x-softmmu target
  s390x: Dispatch interrupts to KVM or the real CPU
  s390x: Adjust GDB stub
  s390x: virtio machine storage keys
  s390x: Prepare cpu.h for emulation
  s390x: helper functions for system emulation
  s390x: Implement opcode helpers
  s390x: Adjust internal kvm code
  s390x: translate engine for s390x CPU
  s390x: build s390x by default

Ulrich Hecht (5):
  s390x: Enable disassembler for s390x
  s390x: Enable nptl for s390x
  s390x: enable CPU_QuadU
  s390x: s390x-linux-user support
  linux-user: define a couple of syscalls for non-uid16 targets

 Makefile.target  |8 +-
 blockdev.c   |2 +-
 configure|3 +
 cpu-all.h|2 +-
 cpu-exec.c   |8 +
 default-configs/s390x-linux-user.mak |1 +
 disas.c  |6 +
 gdbstub.c|8 +-
 hw/s390-virtio-bus.c |   12 +-
 hw/s390-virtio.c |   21 +-
 hw/virtio-pci.c  |3 +
 linux-user/elfload.c |   18 +
 linux-user/main.c|   89 +
 linux-user/s390x/syscall.h   |   25 +
 linux-user/s390x/syscall_nr.h|  349 +++
 linux-user/s390x/target_signal.h |   26 +
 linux-user/s390x/termbits.h  |  283 ++
 linux-user/signal.c  |  314 +++
 linux-user/syscall.c |  143 +-
 linux-user/syscall_defs.h|   56 +-
 s390x.ld |  194 ++
 scripts/qemu-binfmt-conf.sh  |4 +-
 target-s390x/cpu.h   |  759 +-
 target-s390x/exec.h  |   20 +
 target-s390x/helper.c|  581 -
 target-s390x/helpers.h   |  151 +
 target-s390x/kvm.c   |   60 +-
 target-s390x/op_helper.c | 2891 +++-
 target-s390x/translate.c | 5161 +-
 vl.c |6 +-
 30 files changed, 11063 insertions(+), 141 deletions(-)
 create mode 100644 default-configs/s390x-linux-user.mak
 create mode 100644 linux-user/s390x/syscall.h
 create mode 100644 linux-user/s390x/syscall_nr.h
 create mode 100644 linux-user/s390x/target_signal.h
 create mode 100644 linux-user/s390x/termbits.h
 create mode 100644 s390x.ld
 create mode 100644 target-s390x/helpers.h




[Qemu-devel] [PATCH 10/17] s390x: Adjust GDB stub

2011-03-24 Thread Alexander Graf
Condition code (cc) computation is now lazy, so we need to trigger the
calculation manually when gdb wants to fetch the value. The variable was also
renamed, so a write from gdb now goes into a different field.
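
For readers new to the technique, lazy condition-code evaluation can be sketched as below: the emulator records which operation produced the flags and its operands, and only computes the 0..3 value when somebody asks for it. The enum, struct and function names are hypothetical and much simpler than the calc_cc() used in this series, which also encodes the computed value differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation tags; the real patch set defines its own. */
enum cc_op_sketch {
    CC_SK_STATIC,   /* cc already computed and stored in src */
    CC_SK_CMP_S64,  /* signed compare of src and dst */
    CC_SK_NZ        /* zero / non-zero test of dst */
};

struct cc_state {
    enum cc_op_sketch op;
    uint64_t src;
    uint64_t dst;
};

/* Compute the 0..3 condition code from the recorded operation. */
static uint32_t cc_calc(const struct cc_state *s)
{
    switch (s->op) {
    case CC_SK_STATIC:
        return (uint32_t)(s->src & 3);
    case CC_SK_CMP_S64:
        if ((int64_t)s->src == (int64_t)s->dst) {
            return 0;
        }
        return (int64_t)s->src < (int64_t)s->dst ? 1 : 2;
    case CC_SK_NZ:
        return s->dst != 0;
    }
    assert(!"unknown cc op");
    return 0;
}

/* Force the lazy state into a concrete value, as a debugger read would. */
static uint32_t cc_materialize(struct cc_state *s)
{
    uint32_t cc = cc_calc(s);

    s->op = CC_SK_STATIC;
    s->src = cc;
    return cc;
}
```

A register read from the debugger corresponds to cc_materialize(): the value is computed once and the state is rewritten so that later reads return it directly.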

Signed-off-by: Alexander Graf ag...@suse.de
---
 gdbstub.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index 1e9f931..f8b5d7e 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1431,7 +1431,11 @@ static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n)
             /* XXX */
             break;
         case S390_PC_REGNUM: GET_REGL(env->psw.addr); break;
-        case S390_CC_REGNUM: GET_REG32(env->cc); break;
+        case S390_CC_REGNUM:
+            env->cc_op = calc_cc(env, env->cc_op, env->cc_src, env->cc_dst,
+                                 env->cc_vr);
+            GET_REG32(env->cc_op);
+            break;
     }
 
     return 0;
@@ -1457,7 +1461,7 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n)
             /* XXX */
             break;
         case S390_PC_REGNUM: env->psw.addr = tmpl; break;
-        case S390_CC_REGNUM: env->cc = tmp32; r=4; break;
+        case S390_CC_REGNUM: env->cc_op = tmp32; r=4; break;
     }
 
     return r;
-- 
1.6.0.2




[Qemu-devel] [PATCH 02/17] virtio: use generic name when possible

2011-03-24 Thread Alexander Graf
We have two different virtio buses: pci and s390. The abstraction path
taken in qemu is to have generic aliases for each device type in the
architecture specific qdev devices.

So let's make use of these aliases whenever we can and define them
whenever we can.
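
As a rough illustration of how a generic alias resolves to a bus-specific device, consider the lookup below. It is a simplified model, not the qdev code: struct dev_info and dev_find() are hypothetical, and each table stands for the devices built into one target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device record; not the real qdev structure. */
struct dev_info {
    const char *name;   /* bus-specific name, e.g. "virtio-blk-pci" */
    const char *alias;  /* generic name, may be NULL */
};

/* Find a device by exact name first, then by alias. */
static const struct dev_info *dev_find(const struct dev_info *tbl, size_t n,
                                       const char *wanted)
{
    size_t i;

    assert(wanted != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, wanted) == 0) {
            return &tbl[i];
        }
    }
    for (i = 0; i < n; i++) {
        if (tbl[i].alias && strcmp(tbl[i].alias, wanted) == 0) {
            return &tbl[i];
        }
    }
    return NULL;
}
```

With an s390 table, asking for "virtio-blk" yields the entry named "virtio-blk-s390"; with a PCI table the same request would yield "virtio-blk-pci", which is why common code can use the generic name.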

Signed-off-by: Alexander Graf ag...@suse.de
---
 blockdev.c   |2 +-
 hw/s390-virtio-bus.c |2 ++
 hw/virtio-pci.c  |3 +++
 vl.c |6 +++---
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 0690cc8..bc598ed 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -503,7 +503,7 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi)
     case IF_VIRTIO:
         /* add virtio block device */
         opts = qemu_opts_create(qemu_find_opts("device"), NULL, 0);
-        qemu_opt_set(opts, "driver", "virtio-blk-pci");
+        qemu_opt_set(opts, "driver", "virtio-blk");
         qemu_opt_set(opts, "drive", dinfo->id);
         if (devaddr)
             qemu_opt_set(opts, "addr", devaddr);
diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index 784dc01..d44eff2 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -325,6 +325,7 @@ static const VirtIOBindings virtio_s390_bindings = {
 static VirtIOS390DeviceInfo s390_virtio_net = {
     .init = s390_virtio_net_init,
     .qdev.name = "virtio-net-s390",
+    .qdev.alias = "virtio-net",
     .qdev.size = sizeof(VirtIOS390Device),
     .qdev.props = (Property[]) {
         DEFINE_NIC_PROPERTIES(VirtIOS390Device, nic),
@@ -340,6 +341,7 @@ static VirtIOS390DeviceInfo s390_virtio_net = {
 static VirtIOS390DeviceInfo s390_virtio_blk = {
     .init = s390_virtio_blk_init,
     .qdev.name = "virtio-blk-s390",
+    .qdev.alias = "virtio-blk",
     .qdev.size = sizeof(VirtIOS390Device),
     .qdev.props = (Property[]) {
         DEFINE_BLOCK_PROPERTIES(VirtIOS390Device, block),
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 3911b09..96cfe8b 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -873,6 +873,7 @@ static PCIDeviceInfo virtio_info[] = {
         .qdev.reset = virtio_pci_reset,
     },{
         .qdev.name  = "virtio-net-pci",
+        .qdev.alias = "virtio-net",
         .qdev.size  = sizeof(VirtIOPCIProxy),
         .init   = virtio_net_init_pci,
         .exit   = virtio_net_exit_pci,
@@ -909,6 +910,7 @@ static PCIDeviceInfo virtio_info[] = {
         .qdev.reset = virtio_pci_reset,
     },{
         .qdev.name = "virtio-balloon-pci",
+        .qdev.alias = "virtio-balloon",
         .qdev.size = sizeof(VirtIOPCIProxy),
         .init  = virtio_balloon_init_pci,
         .exit  = virtio_exit_pci,
@@ -920,6 +922,7 @@ static PCIDeviceInfo virtio_info[] = {
     },{
 #ifdef CONFIG_VIRTFS
         .qdev.name = "virtio-9p-pci",
+        .qdev.alias = "virtio-9p",
         .qdev.size = sizeof(VirtIOPCIProxy),
         .init  = virtio_9p_init_pci,
         .qdev.props = (Property[]) {
diff --git a/vl.c b/vl.c
index b1a94aa..8d77e43 100644
--- a/vl.c
+++ b/vl.c
@@ -1572,7 +1572,7 @@ static int balloon_parse(const char *arg)
         /* create empty opts */
         opts = qemu_opts_create(qemu_find_opts("device"), NULL, 0);
     }
-    qemu_opt_set(opts, "driver", "virtio-balloon-pci");
+    qemu_opt_set(opts, "driver", "virtio-balloon");
     return 0;
 }
 
@@ -2450,12 +2450,12 @@ int main(int argc, char **argv, char **envp)
                          qemu_opt_get(opts, "path"),
                          qemu_opt_get(opts, "security_model"));
 
-                len = strlen("virtio-9p-pci,fsdev=,mount_tag=");
+                len = strlen("virtio-9p,fsdev=,mount_tag=");
                 len += 2*strlen(qemu_opt_get(opts, "mount_tag"));
                 arg_9p = qemu_malloc((len + 1) * sizeof(*arg_9p));
 
                 snprintf(arg_9p, (len + 1) * sizeof(*arg_9p),
-                         "virtio-9p-pci,fsdev=%s,mount_tag=%s",
+                         "virtio-9p,fsdev=%s,mount_tag=%s",
                          qemu_opt_get(opts, "mount_tag"),
                          qemu_opt_get(opts, "mount_tag"));
 
-- 
1.6.0.2




[Qemu-devel] [PATCH 12/17] s390x: Prepare cpu.h for emulation

2011-03-24 Thread Alexander Graf
We need to add some more logic to the CPU description to leverage emulation
of an s390x CPU. This patch adds all the required helpers, fields in CPUState
and constant definitions required for user and system emulation.

Signed-off-by: Alexander Graf ag...@suse.de
---
 target-s390x/cpu.h |  759 +--
 1 files changed, 729 insertions(+), 30 deletions(-)

diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index e47c372..54ecaa9 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -26,10 +26,24 @@
 #define CPUState struct CPUS390XState
 
 #include "cpu-defs.h"
+#define TARGET_PAGE_BITS 12
+
+#define TARGET_PHYS_ADDR_SPACE_BITS 64
+#define TARGET_VIRT_ADDR_SPACE_BITS 64
+
+#include "cpu-all.h"
 
 #include "softfloat.h"
 
-#define NB_MMU_MODES 2
+#define NB_MMU_MODES 3
+
+#define MMU_MODE0_SUFFIX _primary
+#define MMU_MODE1_SUFFIX _secondary
+#define MMU_MODE2_SUFFIX _home
+
+#define MMU_USER_IDX 1
+
+#define MAX_EXT_QUEUE 16
 
 typedef union FPReg {
 struct {
@@ -45,23 +59,58 @@ typedef union FPReg {
 uint64_t i;
 } FPReg;
 
+typedef struct PSW {
+uint64_t mask;
+uint64_t addr;
+} PSW;
+
+typedef struct ExtQueue {
+uint32_t code;
+uint32_t param;
+uint32_t param64;
+} ExtQueue;
+
 typedef struct CPUS390XState {
 uint64_t regs[16]; /* GP registers */
 
 uint32_t aregs[16];/* access registers */
 
 uint32_t fpc;  /* floating-point control register */
-FPReg fregs[16]; /* FP registers */
+CPU_DoubleU fregs[16]; /* FP registers */
 float_status fpu_status; /* passed to softfloat lib */
 
-struct {
-uint64_t mask;
-uint64_t addr;
-} psw;
+PSW psw;
 
-int cc; /* condition code (0-3) */
+uint32_t cc_op;
+uint64_t cc_src;
+uint64_t cc_dst;
+uint64_t cc_vr;
 
 uint64_t __excp_addr;
+uint64_t psa;
+
+uint32_t int_pgm_code;
+uint32_t int_pgm_ilc;
+
+uint32_t int_svc_code;
+uint32_t int_svc_ilc;
+
+uint64_t cregs[16]; /* control registers */
+
+int pending_int;
+ExtQueue ext_queue[MAX_EXT_QUEUE];
+
+/* reset does memset(0) up to here */
+
+int ext_index;
+int cpu_num;
+uint8_t *storage_keys;
+
+uint64_t tod_offset;
+uint64_t tod_basetime;
+QEMUTimer *tod_timer;
+
+QEMUTimer *cpu_timer;
 
 CPU_COMMON
 } CPUS390XState;
@@ -69,24 +118,174 @@ typedef struct CPUS390XState {
 #if defined(CONFIG_USER_ONLY)
 static inline void cpu_clone_regs(CPUState *env, target_ulong newsp)
 {
-if (newsp)
+if (newsp) {
 env->regs[15] = newsp;
+}
 env->regs[0] = 0;
 }
 #endif
 
-#define MMU_MODE0_SUFFIX _kernel
-#define MMU_MODE1_SUFFIX _user
-#define MMU_USER_IDX 1
+/* Interrupt Codes */
+/* Program Interrupts */
+#define PGM_OPERATION           0x0001
+#define PGM_PRIVILEGED          0x0002
+#define PGM_EXECUTE             0x0003
+#define PGM_PROTECTION          0x0004
+#define PGM_ADDRESSING          0x0005
+#define PGM_SPECIFICATION       0x0006
+#define PGM_DATA                0x0007
+#define PGM_FIXPT_OVERFLOW      0x0008
+#define PGM_FIXPT_DIVIDE        0x0009
+#define PGM_DEC_OVERFLOW        0x000a
+#define PGM_DEC_DIVIDE          0x000b
+#define PGM_HFP_EXP_OVERFLOW    0x000c
+#define PGM_HFP_EXP_UNDERFLOW   0x000d
+#define PGM_HFP_SIGNIFICANCE    0x000e
+#define PGM_HFP_DIVIDE          0x000f
+#define PGM_SEGMENT_TRANS       0x0010
+#define PGM_PAGE_TRANS          0x0011
+#define PGM_TRANS_SPEC          0x0012
+#define PGM_SPECIAL_OP          0x0013
+#define PGM_OPERAND             0x0015
+#define PGM_TRACE_TABLE         0x0016
+#define PGM_SPACE_SWITCH        0x001c
+#define PGM_HFP_SQRT            0x001d
+#define PGM_PC_TRANS_SPEC       0x001f
+#define PGM_AFX_TRANS           0x0020
+#define PGM_ASX_TRANS           0x0021
+#define PGM_LX_TRANS            0x0022
+#define PGM_EX_TRANS            0x0023
+#define PGM_PRIM_AUTH           0x0024
+#define PGM_SEC_AUTH            0x0025
+#define PGM_ALET_SPEC           0x0028
+#define PGM_ALEN_SPEC           0x0029
+#define PGM_ALE_SEQ             0x002a
+#define PGM_ASTE_VALID          0x002b
+#define PGM_ASTE_SEQ            0x002c
+#define PGM_EXT_AUTH            0x002d
+#define PGM_STACK_FULL          0x0030
+#define PGM_STACK_EMPTY         0x0031
+#define PGM_STACK_SPEC          0x0032
+#define PGM_STACK_TYPE          0x0033
+#define PGM_STACK_OP            0x0034
+#define PGM_ASCE_TYPE           0x0038
+#define PGM_REG_FIRST_TRANS     0x0039
+#define PGM_REG_SEC_TRANS       0x003a
+#define PGM_REG_THIRD_TRANS     0x003b
+#define PGM_MONITOR             0x0040
+#define PGM_PER 

[Qemu-devel] [PATCH 15/17] s390x: Adjust internal kvm code

2011-03-24 Thread Alexander Graf
We're now finally emulating an s390x CPU, so we can move quite some logic
from the kvm code out into generic CPU code.

This patch does this and adjusts the interfaces according to what the code
around now expects to be able to call.

Signed-off-by: Alexander Graf ag...@suse.de
---
 target-s390x/kvm.c |   60 ---
 1 files changed, 14 insertions(+), 46 deletions(-)

diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 6e94274..0fd4cbb 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -49,13 +49,6 @@
 #define DIAG_KVM_HYPERCALL  0x500
 #define DIAG_KVM_BREAKPOINT 0x501
 
-#define SCP_LENGTH  0x00
-#define SCP_FUNCTION_CODE   0x02
-#define SCP_CONTROL_MASK0x03
-#define SCP_RESPONSE_CODE   0x06
-#define SCP_MEM_CODE0x08
-#define SCP_INCREMENT   0x0a
-
 #define ICPT_INSTRUCTION0x04
 #define ICPT_WAITPSW0x1c
 #define ICPT_SOFT_INTERCEPT 0x24
@@ -182,8 +175,8 @@ int kvm_arch_process_irqchip_events(CPUState *env)
 return 0;
 }
 
-static void kvm_s390_interrupt_internal(CPUState *env, int type, uint32_t parm,
-uint64_t parm64, int vm)
+void kvm_s390_interrupt_internal(CPUState *env, int type, uint32_t parm,
+ uint64_t parm64, int vm)
 {
 struct kvm_s390_interrupt kvmint;
 int r;
@@ -218,7 +211,7 @@ void kvm_s390_virtio_irq(CPUState *env, int config_change, 
uint64_t token)
 token, 1);
 }
 
-static void kvm_s390_interrupt(CPUState *env, int type, uint32_t code)
+void kvm_s390_interrupt(CPUState *env, int type, uint32_t code)
 {
 kvm_s390_interrupt_internal(env, type, code, 0, 0);
 }
@@ -228,16 +221,16 @@ static void enter_pgmcheck(CPUState *env, uint16_t code)
 kvm_s390_interrupt(env, KVM_S390_PROGRAM_INT, code);
 }
 
-static void setcc(CPUState *env, uint64_t cc)
+static inline void setcc(CPUState *env, uint64_t cc)
 {
-env->kvm_run->psw_mask &= ~(3ul << 44);
+env->kvm_run->psw_mask &= ~(3ull << 44);
 env->kvm_run->psw_mask |= (cc & 3) << 44;
 
 env->psw.mask &= ~(3ul << 44);
 env->psw.mask |= (cc & 3) << 44;
 }
 
-static int sclp_service_call(CPUState *env, struct kvm_run *run, uint16_t 
ipbh0)
+static int kvm_sclp_service_call(CPUState *env, struct kvm_run *run, uint16_t 
ipbh0)
 {
 uint32_t sccb;
 uint64_t code;
@@ -247,35 +240,11 @@ static int sclp_service_call(CPUState *env, struct 
kvm_run *run, uint16_t ipbh0)
 sccb = env->regs[ipbh0 & 0xf];
 code = env->regs[(ipbh0 & 0xf0) >> 4];
 
-dprintf(sclp(0x%x, 0x%lx)\n, sccb, code);
-
-if (sccb  ~0x7ff8ul) {
-fprintf(stderr, KVM: invalid sccb address 0x%x\n, sccb);
-r = -1;
-goto out;
-}
-
-switch(code) {
-case SCLP_CMDW_READ_SCP_INFO:
-case SCLP_CMDW_READ_SCP_INFO_FORCED:
-stw_phys(sccb + SCP_MEM_CODE, ram_size >> 20);
-stb_phys(sccb + SCP_INCREMENT, 1);
-stw_phys(sccb + SCP_RESPONSE_CODE, 0x10);
-setcc(env, 0);
-
-kvm_s390_interrupt_internal(env, KVM_S390_INT_SERVICE,
-sccb & ~3, 0, 1);
-break;
-default:
-dprintf("KVM: invalid sclp call 0x%x / 0x%lx\n", sccb, code);
-r = -1;
-break;
-}
-
-out:
-if (r < 0) {
+r = sclp_service_call(env, sccb, code);
+if (r) {
 setcc(env, 3);
 }
+
 return 0;
 }
 
@@ -287,7 +256,7 @@ static int handle_priv(CPUState *env, struct kvm_run *run, 
uint8_t ipa1)
 dprintf(KVM: PRIV: %d\n, ipa1);
 switch (ipa1) {
 case PRIV_SCLP_CALL:
-r = sclp_service_call(env, run, ipbh0);
+r = kvm_sclp_service_call(env, run, ipbh0);
 break;
 default:
 dprintf(KVM: unknown PRIV: 0x%x\n, ipa1);
@@ -300,12 +269,10 @@ static int handle_priv(CPUState *env, struct kvm_run 
*run, uint8_t ipa1)
 
 static int handle_hypercall(CPUState *env, struct kvm_run *run)
 {
-int r;
-
 cpu_synchronize_state(env);
-r = s390_virtio_hypercall(env);
+env->regs[2] = s390_virtio_hypercall(env, env->regs[2], env->regs[1]);
 
-return r;
+return 0;
 }
 
 static int handle_diag(CPUState *env, struct kvm_run *run, int ipb_code)
@@ -450,7 +417,8 @@ static int handle_intercept(CPUState *env)
 int icpt_code = run->s390_sieic.icptcode;
 int r = 0;
 
-dprintf("intercept: 0x%x (at 0x%lx)\n", icpt_code, env->kvm_run->psw_addr);
+dprintf("intercept: 0x%x (at 0x%lx)\n", icpt_code,
+(long)env->kvm_run->psw_addr);
 switch (icpt_code) {
 case ICPT_INSTRUCTION:
 r = handle_instruction(env, run);
-- 
1.6.0.2




[Qemu-devel] [PATCH 07/17] linux-user: define a couple of syscalls for non-uid16 targets

2011-03-24 Thread Alexander Graf
From: Ulrich Hecht u...@suse.de

Quite a number of syscalls are only defined on systems with USE_UID16
defined; this patch defines them on other systems as well.

Fixes a large number of uid/gid-related testcases on the s390x target
(and most likely on other targets as well)

Signed-off-by: Ulrich Hecht u...@suse.de
---
 linux-user/syscall.c |  125 ++
 1 files changed, 105 insertions(+), 20 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ec7be9b..d15cdc1 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -326,7 +326,7 @@ static int sys_fchmodat(int dirfd, const char *pathname, 
mode_t mode)
   return (fchmodat(dirfd, pathname, mode, 0));
 }
 #endif
-#if defined(TARGET_NR_fchownat) && defined(USE_UID16)
+#if defined(TARGET_NR_fchownat)
 static int sys_fchownat(int dirfd, const char *pathname, uid_t owner,
 gid_t group, int flags)
 {
@@ -435,7 +435,7 @@ _syscall3(int,sys_faccessat,int,dirfd,const char 
*,pathname,int,mode)
 #if defined(TARGET_NR_fchmodat) && defined(__NR_fchmodat)
 _syscall3(int,sys_fchmodat,int,dirfd,const char *,pathname, mode_t,mode)
 #endif
-#if defined(TARGET_NR_fchownat) && defined(__NR_fchownat) && defined(USE_UID16)
+#if defined(TARGET_NR_fchownat) && defined(__NR_fchownat)
 _syscall5(int,sys_fchownat,int,dirfd,const char *,pathname,
   uid_t,owner,gid_t,group,int,flags)
 #endif
@@ -6819,18 +6819,35 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 case TARGET_NR_setfsgid:
 ret = get_errno(setfsgid(arg1));
 break;
+#else /* USE_UID16 */
+#if defined(TARGET_NR_fchownat) && defined(__NR_fchownat)
+case TARGET_NR_fchownat:
+if (!(p = lock_user_string(arg2)))
+goto efault;
+ret = get_errno(sys_fchownat(arg1, p, arg3, arg4, arg5));
+unlock_user(p, arg2, 0);
+break;
+#endif
 #endif /* USE_UID16 */
 
-#ifdef TARGET_NR_lchown32
+#if defined(TARGET_NR_lchown32) || !defined(USE_UID16)
+#if defined(TARGET_NR_lchown32)
 case TARGET_NR_lchown32:
+#else
+case TARGET_NR_lchown:
+#endif
 if (!(p = lock_user_string(arg1)))
 goto efault;
 ret = get_errno(lchown(p, arg2, arg3));
 unlock_user(p, arg1, 0);
 break;
 #endif
-#ifdef TARGET_NR_getuid32
+#if defined(TARGET_NR_getuid32) || (defined(TARGET_NR_getuid) && !defined(USE_UID16))
+#if defined(TARGET_NR_getuid32)
 case TARGET_NR_getuid32:
+#else
+case TARGET_NR_getuid:
+#endif
 ret = get_errno(getuid());
 break;
 #endif
@@ -6975,33 +6992,57 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 break;
 #endif
 
-#ifdef TARGET_NR_getgid32
+#if defined(TARGET_NR_getgid32) || (defined(TARGET_NR_getgid) && !defined(USE_UID16))
+#if defined(TARGET_NR_getgid32)
 case TARGET_NR_getgid32:
+#else
+case TARGET_NR_getgid:
+#endif
 ret = get_errno(getgid());
 break;
 #endif
-#ifdef TARGET_NR_geteuid32
+#if defined(TARGET_NR_geteuid32) || (defined(TARGET_NR_geteuid) && !defined(USE_UID16))
+#if defined(TARGET_NR_geteuid32)
 case TARGET_NR_geteuid32:
+#else
+case TARGET_NR_geteuid:
+#endif
 ret = get_errno(geteuid());
 break;
 #endif
-#ifdef TARGET_NR_getegid32
+#if defined(TARGET_NR_getegid32) || (defined(TARGET_NR_getegid) && !defined(USE_UID16))
+#if defined(TARGET_NR_getegid32)
 case TARGET_NR_getegid32:
+#else
+case TARGET_NR_getegid:
+#endif
 ret = get_errno(getegid());
 break;
 #endif
-#ifdef TARGET_NR_setreuid32
+#if defined(TARGET_NR_setreuid32) || !defined(USE_UID16)
+#if defined(TARGET_NR_setreuid32)
 case TARGET_NR_setreuid32:
+#else
+case TARGET_NR_setreuid:
+#endif
 ret = get_errno(setreuid(arg1, arg2));
 break;
 #endif
-#ifdef TARGET_NR_setregid32
+#if defined(TARGET_NR_setregid32) || !defined(USE_UID16)
+#if defined(TARGET_NR_setregid32)
 case TARGET_NR_setregid32:
+#else
+case TARGET_NR_setregid:
+#endif
 ret = get_errno(setregid(arg1, arg2));
 break;
 #endif
-#ifdef TARGET_NR_getgroups32
+#if defined(TARGET_NR_getgroups32) || !defined(USE_UID16)
+#if defined(TARGET_NR_getgroups32)
 case TARGET_NR_getgroups32:
+#else
+case TARGET_NR_getgroups:
+#endif
 {
 int gidsetsize = arg1;
 uint32_t *target_grouplist;
@@ -7025,8 +7066,12 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 }
 break;
 #endif
-#ifdef TARGET_NR_setgroups32
+#if defined(TARGET_NR_setgroups32) || !defined(USE_UID16)
+#if defined(TARGET_NR_setgroups32)
 case TARGET_NR_setgroups32:
+#else
+case TARGET_NR_setgroups:
+#endif
 {
 int gidsetsize = arg1;
 uint32_t *target_grouplist;
@@ -7046,18 +7091,30 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 }
 break;
 #endif
-#ifdef TARGET_NR_fchown32
+#if defined(TARGET_NR_fchown32) || !defined(USE_UID16)
+#if 

[Qemu-devel] [PATCH 06/17] s390x: s390x-linux-user support

2011-03-24 Thread Alexander Graf
From: Ulrich Hecht u...@suse.de

This patch adds support for running s390x binaries in the linux-user emulation
code.

Signed-off-by: Ulrich Hecht u...@suse.de
---
 linux-user/elfload.c |   18 ++
 linux-user/main.c|   89 ++
 linux-user/s390x/syscall.h   |   25 +++
 linux-user/s390x/syscall_nr.h|  349 ++
 linux-user/s390x/target_signal.h |   26 +++
 linux-user/s390x/termbits.h  |  283 ++
 linux-user/signal.c  |  314 ++
 linux-user/syscall.c |   18 ++-
 linux-user/syscall_defs.h|   56 ++-
 s390x.ld |  194 +
 scripts/qemu-binfmt-conf.sh  |4 +-
 11 files changed, 1368 insertions(+), 8 deletions(-)
 create mode 100644 linux-user/s390x/syscall.h
 create mode 100644 linux-user/s390x/syscall_nr.h
 create mode 100644 linux-user/s390x/target_signal.h
 create mode 100644 linux-user/s390x/termbits.h
 create mode 100644 s390x.ld

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index fe5410e..0b26ea2 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -793,6 +793,24 @@ static inline void init_thread(struct target_pt_regs *regs,
 
 #endif /* TARGET_ALPHA */
 
+#ifdef TARGET_S390X
+
+#define ELF_START_MMAP (0x200ULL)
+
+#define elf_check_arch(x) ( (x) == ELF_ARCH )
+
+#define ELF_CLASS  ELFCLASS64
+#define ELF_DATA   ELFDATA2MSB
+#define ELF_ARCH   EM_S390
+
+static inline void init_thread(struct target_pt_regs *regs, struct image_info *infop)
+{
+regs->psw.addr = infop->entry;
+regs->gprs[15] = infop->start_stack;
+}
+
+#endif /* TARGET_S390X */
+
 #ifndef ELF_PLATFORM
 #define ELF_PLATFORM (NULL)
 #endif
diff --git a/linux-user/main.c b/linux-user/main.c
index e651bfd..02788ba 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2624,6 +2624,86 @@ void cpu_loop (CPUState *env)
 }
 #endif /* TARGET_ALPHA */
 
+#ifdef TARGET_S390X
+void cpu_loop(CPUS390XState *env)
+{
+int trapnr;
+target_siginfo_t info;
+
+while (1) {
+trapnr = cpu_s390x_exec (env);
+
+if ((trapnr  0x) == EXCP_EXECUTE_SVC) {
+int n = trapnr  0x;
+env->regs[2] = do_syscall(env, n,
+   env->regs[2],
+   env->regs[3],
+   env->regs[4],
+   env->regs[5],
+   env->regs[6],
+   env->regs[7]);
+}
+else switch (trapnr) {
+case EXCP_INTERRUPT:
+/* just indicate that signals should be handled asap */
+break;
+case EXCP_DEBUG:
+{
+int sig;
+
+sig = gdb_handlesig (env, TARGET_SIGTRAP);
+if (sig) {
+info.si_signo = sig;
+info.si_errno = 0;
+info.si_code = TARGET_TRAP_BRKPT;
+queue_signal(env, info.si_signo, &info);
+}
+}
+break;
+case EXCP_SVC:
+{
+int n = env->int_svc_code;
+if (!n) n = env->regs[1];  /* syscalls > 255 */
+env->regs[2] = do_syscall(env, n,
+   env->regs[2],
+   env->regs[3],
+   env->regs[4],
+   env->regs[5],
+   env->regs[6],
+   env->regs[7]);
+}
+break;
+case EXCP_ADDR:
+{
+info.si_signo = SIGSEGV;
+info.si_errno = 0;
+/* XXX: check env->error_code */
+info.si_code = TARGET_SEGV_MAPERR;
+info._sifields._sigfault._addr = env->__excp_addr;
+queue_signal(env, info.si_signo, &info);
+}
+break;
+case EXCP_SPEC:
+{
+fprintf(stderr,"specification exception insn 0x%08x%04x\n", ldl(env->psw.addr), lduw(env->psw.addr + 4));
+info.si_signo = SIGILL;
+info.si_errno = 0;
+info.si_code = TARGET_ILL_ILLOPC;
+info._sifields._sigfault._addr = env->__excp_addr;
+queue_signal(env, info.si_signo, &info);
+}
+break;
+default:
+printf ("Unhandled trap: 0x%x\n", trapnr);
+cpu_dump_state(env, stderr, fprintf, 0);
+exit (1);
+}
+process_pending_signals (env);
+}
+}
+
+#endif /* TARGET_S390X */
+
 static void version(void)
 {
 printf("qemu-" TARGET_ARCH " version " QEMU_VERSION QEMU_PKGVERSION
@@ -3363,6 +3443,15 @@ int main(int argc, char **argv, char **envp)
env->regs[15] = regs->acr;
env->pc = regs->erp;
 }
+#elif defined(TARGET_S390X)
+{
+int i;
+for (i = 0; i < 16; i++) {
+ 

[Qemu-devel] [PATCH 03/17] s390x: Enable disassembler for s390x

2011-03-24 Thread Alexander Graf
From: Ulrich Hecht u...@suse.de

This patch enables the instruction disassembler when using an
S390x target.

Signed-off-by: Ulrich Hecht u...@suse.de
---
 disas.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/disas.c b/disas.c
index c76f36f..17b4ce4 100644
--- a/disas.c
+++ b/disas.c
@@ -215,6 +215,9 @@ void target_disas(FILE *out, target_ulong code, 
target_ulong size, int flags)
 disasm_info.mach = bfd_mach_cris_v32;
 print_insn = print_insn_crisv32;
 }
+#elif defined(TARGET_S390X)
+disasm_info.mach = bfd_mach_s390_64;
+print_insn = print_insn_s390;
 #elif defined(TARGET_MICROBLAZE)
 disasm_info.mach = bfd_arch_microblaze;
 print_insn = print_insn_microblaze;
@@ -414,6 +417,9 @@ void monitor_disas(Monitor *mon, CPUState *env,
 #elif defined(TARGET_SH4)
 disasm_info.mach = bfd_mach_sh4;
 print_insn = print_insn_sh;
+#elif defined(TARGET_S390X)
+disasm_info.mach = bfd_mach_s390_64;
+print_insn = print_insn_s390;
 #else
 monitor_printf(mon, "0x" TARGET_FMT_lx
": Asm output not supported on this arch\n", pc);
-- 
1.6.0.2




[Qemu-devel] [PATCH 13/17] s390x: helper functions for system emulation

2011-03-24 Thread Alexander Graf
When running system emulation, we need to traverse the MMU and
deliver interrupts according to the specification.

This patch implements those two pieces and in addition adjusts the CPU
initialization code to account for the new fields in CPUState.

Signed-off-by: Alexander Graf ag...@suse.de
---
 target-s390x/helper.c |  581 -
 1 files changed, 571 insertions(+), 10 deletions(-)

diff --git a/target-s390x/helper.c b/target-s390x/helper.c
index 4a5297b..d55a73a 100644
--- a/target-s390x/helper.c
+++ b/target-s390x/helper.c
@@ -2,6 +2,7 @@
  *  S/390 helpers
  *
  *  Copyright (c) 2009 Ulrich Hecht
+ *  Copyright (c) 2011 Alexander Graf
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public
@@ -25,27 +26,108 @@
 #include "exec-all.h"
 #include "gdbstub.h"
 #include "qemu-common.h"
+#include "qemu-timer.h"
 
+#if !defined(CONFIG_USER_ONLY)
 #include <linux/kvm.h>
 #include "kvm.h"
+#endif
+
+//#define S390_PTE_PRINTF_HACK
+//#define DEBUG_S390
+//#define DEBUG_S390_PTE
+//#define DEBUG_S390_STDOUT
+
+#ifdef DEBUG_S390
+#ifdef DEBUG_S390_STDOUT
+#define dprintf(fmt, ...) \
+do { fprintf(stderr, fmt, ## __VA_ARGS__); \
+ qemu_log(fmt, ##__VA_ARGS__); } while (0)
+#else
+#define dprintf(fmt, ...) \
+do { qemu_log(fmt, ## __VA_ARGS__); } while (0)
+#endif
+#else
+#define dprintf(fmt, ...) \
+do { } while (0)
+#endif
+
+#ifdef DEBUG_S390_PTE
+#define pte_dprintf dprintf
+#else
+#define pte_dprintf(fmt, ...) \
+do { } while (0)
+#endif
+
+#ifndef CONFIG_USER_ONLY
+static void s390x_tod_timer(void *opaque)
+{
+CPUState *env = opaque;
+
+env->pending_int |= INTERRUPT_TOD;
+cpu_interrupt(env, CPU_INTERRUPT_HARD);
+}
+
+static void s390x_cpu_timer(void *opaque)
+{
+CPUState *env = opaque;
+
+env->pending_int |= INTERRUPT_CPUTIMER;
+cpu_interrupt(env, CPU_INTERRUPT_HARD);
+}
+#endif
 
 CPUS390XState *cpu_s390x_init(const char *cpu_model)
 {
 CPUS390XState *env;
+#if !defined (CONFIG_USER_ONLY)
+struct tm tm;
+#endif
 static int inited = 0;
+static int cpu_num = 0;
 
 env = qemu_mallocz(sizeof(CPUS390XState));
 cpu_exec_init(env);
 if (!inited) {
 inited = 1;
+s390x_translate_init();
 }
 
+#if !defined(CONFIG_USER_ONLY)
+qemu_get_timedate(&tm, 0);
+env->tod_offset = TOD_UNIX_EPOCH +
+  (time2tod(mktimegm(&tm)) * 10ULL);
+env->tod_basetime = 0;
+env->tod_timer = qemu_new_timer(vm_clock, s390x_tod_timer, env);
+env->cpu_timer = qemu_new_timer(vm_clock, s390x_cpu_timer, env);
+#endif
 env->cpu_model_str = cpu_model;
+env->cpu_num = cpu_num++;
+env->ext_index = -1;
 cpu_reset(env);
 qemu_init_vcpu(env);
 return env;
 }
 
+#if defined(CONFIG_USER_ONLY)
+
+void do_interrupt (CPUState *env)
+{
+env->exception_index = -1;
+}
+
+int cpu_s390x_handle_mmu_fault (CPUState *env, target_ulong address, int rw,
+  int mmu_idx, int is_softmmu)
+{
+/* fprintf(stderr,"%s: address 0x%lx rw %d mmu_idx %d is_softmmu %d\n",
+__FUNCTION__, address, rw, mmu_idx, is_softmmu); */
+env->exception_index = EXCP_ADDR;
+env->__excp_addr = address; /* FIXME: find out how this works on a real machine */
+return 1;
+}
+
+#endif /* CONFIG_USER_ONLY */
+
 void cpu_reset(CPUS390XState *env)
 {
 if (qemu_loglevel_mask(CPU_LOG_RESET)) {
@@ -53,32 +135,511 @@ void cpu_reset(CPUS390XState *env)
 log_cpu_state(env, 0);
 }
 
-memset(env, 0, offsetof(CPUS390XState, breakpoints));
+memset(env, 0, offsetof(CPUS390XState, cpu_num));
 /* FIXME: reset vector? */
 tlb_flush(env, 1);
 }
 
-target_phys_addr_t cpu_get_phys_page_debug(CPUState *env, target_ulong addr)
+#ifndef CONFIG_USER_ONLY
+
+/* Ensure to exit the TB after this call! */
+static void trigger_pgm_exception(CPUState *env, uint32_t code, uint32_t ilc)
+{
+env->exception_index = EXCP_PGM;
+env->int_pgm_code = code;
+env->int_pgm_ilc = ilc;
+}
+
+static int trans_bits(CPUState *env, uint64_t mode)
+{
+int bits = 0;
+
+switch (mode) {
+case PSW_ASC_PRIMARY:
+bits = 1;
+break;
+case PSW_ASC_SECONDARY:
+bits = 2;
+break;
+case PSW_ASC_HOME:
+bits = 3;
+break;
+default:
+cpu_abort(env, "unknown asc mode\n");
+break;
+}
+
+return bits;
+}
+
+static void trigger_prot_fault(CPUState *env, target_ulong vaddr, uint64_t mode)
+{
+int ilc = ILC_LATER_INC_2;
+int bits = trans_bits(env, mode) | 4;
+
+dprintf("%s: vaddr=%016" PRIx64 " bits=%d\n", __FUNCTION__, vaddr, bits);
+#ifdef S390_PTE_PRINTF_HACK
+fprintf(stderr, "%s: vaddr=%016" PRIx64 " bits=%d\n", __FUNCTION__, vaddr, bits);
+#endif
+
+stq_phys(env->psa + offsetof(LowCore, trans_exc_code), vaddr | bits);
+trigger_pgm_exception(env, PGM_PROTECTION, ilc);
+}
+
+static void 

Re: [Qemu-devel] OVMF, SeaBIOS non-CSM based legacy boot

2011-03-24 Thread Jordan Justen
2011/3/24 Gleb Natapov g...@redhat.com:
> On Wed, Mar 23, 2011 at 03:32:41PM -0700, Jordan Justen wrote:
>> By the way, today OVMF attempts to store NV-Var data in a file on the
>> disk, but this cannot support variables at runtime.  (This is why I
>> sent in the patch for using -pflash on x86/x86-64.)
>>
> And this file is always stored at the same location? If it is, then the
> problem is solved! But what do you mean by "this cannot support
> variables at runtime"?

The variables can be set while the OS is running, and the OS has
exclusive control over the disk at that time.  Today in OVMF we set
variables into memory during this time, and hope that memory is still
around after a reset.  This does not provide realistic non-volatile
UEFI variable support.

What we really need is flash memory.  (See my 'hw/pc: Support system
flash memory' patch.)

But, there is nothing stopping us from also storing the variables on
the disk (during the firmware boot), and also using them as a backup.

Additionally, we can add yet another backup system of looking for
known os-loader executable paths.  This would be needed if a disk
image were ever to be transferred from a real machine to a VM image.
But, this would require firmware updates as new UEFI OS loader install
paths are added.  Also, let's hope no OS decides to generate a random
path for the OS loader. :)

-Jordan



Re: [Qemu-devel] [PATCH 05/17] s390x: enable CPU_QuadU

2011-03-24 Thread Peter Maydell
On 24 March 2011 15:58, Alexander Graf ag...@suse.de wrote:
> --- a/cpu-all.h
> +++ b/cpu-all.h
> @@ -138,7 +138,7 @@ typedef union {
>      uint64_t ll;
>  } CPU_DoubleU;
>
> -#ifdef TARGET_SPARC
> +#if defined(TARGET_SPARC) || defined(TARGET_S390X)
>  typedef union {
>      float128 q;
>  #if defined(HOST_WORDS_BIGENDIAN) \

I think this typedef should just be protected by
#ifdef CONFIG_SOFTFLOAT (because only softfloat defines
a float128 type) rather than by a set of TARGET_* checks.

If you do that then you can get rid of the || (defined(__arm__)
&& !defined(__VFP_FP__) && !defined(CONFIG_SOFTFLOAT)) part
of the inner ifdef, because it's always false.

-- PMM



Re: [Qemu-devel] [PATCH 05/17] s390x: enable CPU_QuadU

2011-03-24 Thread Alexander Graf

On 24.03.2011, at 17:52, Peter Maydell wrote:

> On 24 March 2011 15:58, Alexander Graf ag...@suse.de wrote:
>> --- a/cpu-all.h
>> +++ b/cpu-all.h
>> @@ -138,7 +138,7 @@ typedef union {
>>      uint64_t ll;
>>  } CPU_DoubleU;
>>
>> -#ifdef TARGET_SPARC
>> +#if defined(TARGET_SPARC) || defined(TARGET_S390X)
>>  typedef union {
>>      float128 q;
>>  #if defined(HOST_WORDS_BIGENDIAN) \
>
> I think this typedef should just be protected by
> #ifdef CONFIG_SOFTFLOAT (because only softfloat defines
> a float128 type) rather than by a set of TARGET_* checks.
>
> If you do that then you can get rid of the || (defined(__arm__)
> && !defined(__VFP_FP__) && !defined(CONFIG_SOFTFLOAT)) part
> of the inner ifdef, because it's always false.

I'm fairly indifferent on that one :).


Alex




Re: [Qemu-devel] [PATCH 14/17] s390x: Implement opcode helpers

2011-03-24 Thread Richard Henderson
On 03/24/2011 10:29 AM, Peter Maydell wrote:
> On 24 March 2011 15:58, Alexander Graf ag...@suse.de wrote:
>
> This is more random comments in passing than a thorough review; sorry.
>
>> +#if HOST_LONG_BITS == 64 && defined(__GNUC__)
>> +        /* assuming 64-bit hosts have __uint128_t */
>> +        __uint128_t dividend = (((__uint128_t)env->regs[r1]) << 64) |
>> +                               (env->regs[r1+1]);
>> +        __uint128_t quotient = dividend / divisor;
>> +        env->regs[r1+1] = quotient;
>> +        __uint128_t remainder = dividend % divisor;
>> +        env->regs[r1] = remainder;
>> +#else
>> +        /* 32-bit hosts would need special wrapper functionality - just abort if
>> +           we encounter such a case; it's very unlikely anyways. */
>> +        cpu_abort(env, "128 -> 64/64 division not implemented\n");
>> +#endif
>
> ...I'm still using a 32 bit system :-)

A couple of options:

(1) Steal code from gcc's __[u]divdi3 for implementing double-word division via
single-word division.  In this case, your single-word will be long long.

(2) Implement a simple bit reduction loop.  This is probably easiest.

(3) Reuse some of the softfloat code that manipulates 128bit quantities.  This
is probably the best option, particularly if the availability of __uint128
is taught to softfloat so that it doesn't always open-code stuff that the
compiler could take care of.
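
As an illustration of option (2), a shift-and-subtract ("bit reduction") loop could look roughly like the sketch below. This is not code from the patch: the helper name div128_by_64 and its hi/lo/rem parameters are made up for the example, and it assumes the divisor is non-zero and that hi < divisor so the quotient fits in 64 bits.

```c
#include <stdint.h>

/*
 * Divide the 128-bit value (hi:lo) by a 64-bit divisor with a
 * shift-and-subtract loop, using only 64-bit arithmetic.
 * Hypothetical helper, not QEMU code.
 * Preconditions (assumed, not checked): divisor != 0 and hi < divisor.
 * Returns the quotient and stores the remainder in *rem.
 */
static uint64_t div128_by_64(uint64_t hi, uint64_t lo, uint64_t divisor,
                             uint64_t *rem)
{
    uint64_t quotient = 0;
    uint64_t r = hi;    /* running remainder, always < divisor */
    int i;

    for (i = 63; i >= 0; i--) {
        /* Bit that falls out of r when we make room for the next bit. */
        uint64_t carry = r >> 63;

        /* Bring down the next dividend bit from the low word. */
        r = (r << 1) | ((lo >> i) & 1);

        /*
         * If carry is set, the true 65-bit remainder is larger than any
         * 64-bit divisor, so the subtraction is needed and the wrapped
         * 64-bit result is still correct.
         */
        if (carry || r >= divisor) {
            r -= divisor;
            quotient |= (uint64_t)1 << i;
        }
    }
    *rem = r;
    return quotient;
}
```

With such a helper, the 32-bit host path would pass regs[r1] as hi and regs[r1+1] as lo, then store the remainder and quotient back the same way the __uint128_t path does.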


r~



Re: [Qemu-devel] [PATCH 14/17] s390x: Implement opcode helpers

2011-03-24 Thread Peter Maydell
On 24 March 2011 15:58, Alexander Graf ag...@suse.de wrote:

This is more random comments in passing than a thorough review; sorry.

> +#if HOST_LONG_BITS == 64 && defined(__GNUC__)
> +        /* assuming 64-bit hosts have __uint128_t */
> +        __uint128_t dividend = (((__uint128_t)env->regs[r1]) << 64) |
> +                               (env->regs[r1+1]);
> +        __uint128_t quotient = dividend / divisor;
> +        env->regs[r1+1] = quotient;
> +        __uint128_t remainder = dividend % divisor;
> +        env->regs[r1] = remainder;
> +#else
> +        /* 32-bit hosts would need special wrapper functionality - just abort if
> +           we encounter such a case; it's very unlikely anyways. */
> +        cpu_abort(env, "128 -> 64/64 division not implemented\n");
> +#endif

...I'm still using a 32 bit system :-)

> +/* condition codes for binary FP ops */
> +static uint32_t set_cc_f32(float32 v1, float32 v2)
> +{
> +    if (float32_is_any_nan(v1) || float32_is_any_nan(v2)) {
> +        return 3;
> +    } else if (float32_eq(v1, v2, &env->fpu_status)) {
> +        return 0;
> +    } else if (float32_lt(v1, v2, &env->fpu_status)) {
> +        return 1;
> +    } else {
> +        return 2;
> +    }
> +}

Can you not use float32_compare_quiet() (returns a value
telling you if it's less/equal/greater/unordered)?
If not, needs a comment saying why you need to do it the hard way.
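
A rough sketch of that suggestion: one comparison yields a less/equal/greater/unordered result, and a switch maps it onto the condition codes used in the quoted helper (0 equal, 1 low, 2 high, 3 unordered). To keep the example self-contained it does not call softfloat; the enum relation type and the compare_quiet() function are stand-ins invented here, operating on host floats.

```c
#include <math.h>
#include <stdint.h>

/* Stand-in for a softfloat-style comparison result; names are illustrative. */
enum relation {
    REL_LESS = -1,
    REL_EQUAL = 0,
    REL_GREATER = 1,
    REL_UNORDERED = 2
};

/* Hypothetical quiet comparison on host floats (assumption: it plays the
 * role float32_compare_quiet() would play in the real helper). */
static enum relation compare_quiet(float a, float b)
{
    if (isnan(a) || isnan(b)) {
        return REL_UNORDERED;
    }
    if (a < b) {
        return REL_LESS;
    }
    if (a > b) {
        return REL_GREATER;
    }
    return REL_EQUAL;
}

/* Map the comparison result to the condition code values returned by the
 * quoted set_cc_f32(): 0 equal, 1 low, 2 high, 3 unordered. */
static uint32_t cc_from_relation(enum relation rel)
{
    switch (rel) {
    case REL_EQUAL:
        return 0;
    case REL_LESS:
        return 1;
    case REL_GREATER:
        return 2;
    default:
        return 3;
    }
}
```

The same mapping function could then be shared by the 32-, 64- and 128-bit compare helpers, since only the comparison call differs.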

> +/* negative absolute of 32-bit float */
> +uint32_t HELPER(lcebr)(uint32_t f1, uint32_t f2)
> +{
> +    env->fregs[f1].l.upper = float32_sub(float32_zero, env->fregs[f2].l.upper,
> +                                         &env->fpu_status);
> +    return set_cc_nz_f32(env->fregs[f1].l.upper);
> +}

Google suggests this is wrong:
http://publib.boulder.ibm.com/cgi-bin/bookmgr/BOOKS/DZ9AR006/19.4.10?SHELF=DT=19990630131355CASE=
says for lcebr that:
"The sign is inverted for any operand, including a QNaN or SNaN,
without causing an arithmetic exception."

but float32_sub will raise exceptions for NaNs. You want
float32_chs() (and similarly for the other types).
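
To show why a sign-change operation cannot raise an exception where a subtraction can, here is a minimal sketch on the raw IEEE 754 single-precision bit pattern. The function name f32_flip_sign is made up for the example; it only illustrates the bit-level idea and is not the softfloat implementation.

```c
#include <stdint.h>

/*
 * Invert the sign of an IEEE 754 single held as raw bits.
 * Only bit 31 changes, so the exponent and fraction (including any NaN
 * payload) are preserved and no arithmetic is performed that could
 * signal an exception. Hypothetical helper for illustration.
 */
static uint32_t f32_flip_sign(uint32_t bits)
{
    return bits ^ UINT32_C(0x80000000);
}
```

For example, 0x3f800000 (1.0) becomes 0xbf800000 (-1.0), and a signalling NaN such as 0x7fa00000 simply becomes 0xffa00000 with its payload untouched.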

> +/* convert 64-bit float to 128-bit float */
> +uint32_t HELPER(lcxbr)(uint32_t f1, uint32_t f2)

Wrong comment? Looks like another invert-sign op from
the online POO.

> +/* 128-bit FP compare RR */
> +uint32_t HELPER(cxbr)(uint32_t f1, uint32_t f2)
> +{
> +    CPU_QuadU v1;
> +    v1.ll.upper = env->fregs[f1].ll;
> +    v1.ll.lower = env->fregs[f1 + 2].ll;
> +    CPU_QuadU v2;
> +    v2.ll.upper = env->fregs[f2].ll;
> +    v2.ll.lower = env->fregs[f2 + 2].ll;
> +    if (float128_is_any_nan(v1.q) || float128_is_any_nan(v2.q)) {
> +        return 3;
> +    } else if (float128_eq(v1.q, v2.q, &env->fpu_status)) {
> +        return 0;
> +    } else if (float128_lt(v1.q, v2.q, &env->fpu_status)) {
> +        return 1;
> +    } else {
> +        return 2;
> +    }
> +}

float128_compare_quiet() again?

> +/* convert 32-bit float to 64-bit int */
> +uint32_t HELPER(cgebr)(uint32_t r1, uint32_t f2, uint32_t m3)
> +{
> +    float32 v2 = env->fregs[f2].l.upper;
> +    set_round_mode(m3);
> +    env->regs[r1] = float32_to_int64(v2, &env->fpu_status);
> +    return set_cc_nz_f32(v2);
> +}

Should this really be permanently setting the rounding mode
for future instructions as well as for the op it does itself?

> +/* load 32-bit FP zero */
> +void HELPER(lzer)(uint32_t f1)
> +{
> +    env->fregs[f1].l.upper = float32_zero;
> +}

Surely this is trivial enough to inline rather than
calling a helper function...

> +/* load 128-bit FP zero */
> +void HELPER(lzxr)(uint32_t f1)
> +{
> +    CPU_QuadU x;
> +    x.q = float64_to_float128(float64_zero, &env->fpu_status);

Yuck. Just define a float128_zero if we need one.

> +uint32_t HELPER(tceb)(uint32_t f1, uint64_t m2)
> +{
> +    float32 v1 = env->fregs[f1].l.upper;
> +    int neg = float32_is_neg(v1);
> +    uint32_t cc = 0;
> +
> +    HELPER_LOG("%s: v1 0x%lx m2 0x%lx neg %d\n", __FUNCTION__, (long)v1, m2, neg);
> +    if ((float32_is_zero(v1) && (m2 & (1 << (11-neg)))) ||
> +        (float32_is_infinity(v1) && (m2 & (1 << (5-neg)))) ||
> +        (float32_is_any_nan(v1) && (m2 & (1 << (3-neg)))) ||
> +        (float32_is_signaling_nan(v1) && (m2 & (1 << (1-neg))))) {
> +        cc = 1;
> +    } else if (m2 & (1 << (9-neg))) {
> +        /* assume normalized number */
> +        cc = 1;
> +    }
> +
> +    /* FIXME: denormalized? */
> +    return cc;
> +}

There's a float32_is_zero_or_denormal(); if you need a
float32_is_denormal() which is false for real zero we
could add it, I guess.
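
A float32_is_denormal() of the kind mentioned above does not appear in the quoted code, so the following is only a sketch of what such a predicate would test, written against the raw bit pattern. The name f32_is_denormal is hypothetical.

```c
#include <stdint.h>

/*
 * True for a denormal (subnormal) IEEE 754 single given as raw bits:
 * the 8-bit exponent field is zero and the 23-bit fraction is non-zero.
 * Unlike a zero-or-denormal test, this returns false for +0 and -0.
 * Hypothetical helper for illustration.
 */
static int f32_is_denormal(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xff;
    uint32_t fraction = bits & UINT32_C(0x007fffff);

    return exponent == 0 && fraction != 0;
}
```

The sign bit is ignored, so both positive and negative denormals are detected; the tceb helper would still use its neg flag to pick the matching mask bit.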

> +static inline uint32_t cc_calc_nabs_32(CPUState *env, int32_t dst)
> +{
> +    return !!dst;
> +}

Another candidate for inlining.

-- PMM



Re: [Qemu-devel] [PATCH 14/17] s390x: Implement opcode helpers

2011-03-24 Thread Alexander Graf

Am 24.03.2011 um 18:41 schrieb Richard Henderson r...@twiddle.net:

 On 03/24/2011 10:29 AM, Peter Maydell wrote:
 On 24 March 2011 15:58, Alexander Graf ag...@suse.de wrote:
 
 This is more random comments in passing than a thorough review; sorry.
 
 +#if HOST_LONG_BITS == 64 && defined(__GNUC__)
 +/* assuming 64-bit hosts have __uint128_t */
 +__uint128_t dividend = (((__uint128_t)env->regs[r1]) << 64) |
 +   (env->regs[r1+1]);
 +__uint128_t quotient = dividend / divisor;
 +env->regs[r1+1] = quotient;
 +__uint128_t remainder = dividend % divisor;
 +env->regs[r1] = remainder;
 +#else
 +/* 32-bit hosts would need special wrapper functionality - just abort if
 +   we encounter such a case; it's very unlikely anyways. */
 +cpu_abort(env, "128 -> 64/64 division not implemented\n");
 +#endif
 
 ...I'm still using a 32 bit system :-)
 
 A couple of options:
 
 (1) Steal code from gcc's __[u]divdi3 for implementing double-word division
     via single-word division.  In this case, your single-word will be long long.
 
 (2) Implement a simple bit reduction loop.  This is probably easiest.
 
 (3) Reuse some of the softfloat code that manipulates 128bit quantities.  This
is probably the best option, particularly if the availability of __uint128
is taught to softfloat so that it doesn't always open-code stuff that the
compiler could take care of.
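
For reference, option (2) could look roughly like the sketch below: a restoring shift-and-subtract loop that divides a 128-bit value, given as two 64-bit halves, by a 64-bit divisor. The name divu128_64() and its error convention are assumptions made for illustration and are not taken from the patch; in the helper above the high half would come from regs[r1] and the low half from regs[r1+1].

```c
#include <stdint.h>

/* Hypothetical helper: divide the 128-bit value (hi:lo) by a 64-bit divisor
 * using restoring shift-subtract division, one quotient bit per iteration.
 * Returns 0 on success, or -1 if the divisor is zero or the quotient would
 * not fit in 64 bits. */
static int divu128_64(uint64_t hi, uint64_t lo, uint64_t divisor,
                      uint64_t *quotient, uint64_t *remainder)
{
    uint64_t rem = hi;
    uint64_t quot = 0;
    int i;

    if (divisor == 0 || hi >= divisor) {
        return -1;
    }
    for (i = 0; i < 64; i++) {
        /* Bit that falls off the top of rem when shifting left. */
        uint64_t carry = rem >> 63;

        rem = (rem << 1) | (lo >> 63);
        lo <<= 1;
        quot <<= 1;
        if (carry || rem >= divisor) {
            rem -= divisor;
            quot |= 1;
        }
    }
    *quotient = quot;
    *remainder = rem;
    return 0;
}
```

The loop needs no 128-bit type, so it would work on 32-bit hosts as well, at the cost of 64 iterations per division.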

In all applications I've run so far this abort has _never_ fired. I'm not sure 
gcc even emits it.

So IMHO this abort doesn't hurt. Once we get a bug report of a user hitting it, 
we can think of ways to implement the missing bits. For now, it's not worse 
than a not implemented opcode (of which we still have quite a number).


Alex

 



Re: [Qemu-devel] [PATCH 14/17] s390x: Implement opcode helpers

2011-03-24 Thread Alexander Graf

On 24.03.2011 at 18:29, Peter Maydell peter.mayd...@linaro.org wrote:

 On 24 March 2011 15:58, Alexander Graf ag...@suse.de wrote:
 
 This is more random comments in passing than a thorough review; sorry.
 
 +#if HOST_LONG_BITS == 64 && defined(__GNUC__)
 +/* assuming 64-bit hosts have __uint128_t */
 +__uint128_t dividend = (((__uint128_t)env->regs[r1]) << 64) |
 +   (env->regs[r1+1]);
 +__uint128_t quotient = dividend / divisor;
 +env->regs[r1+1] = quotient;
 +__uint128_t remainder = dividend % divisor;
 +env->regs[r1] = remainder;
 +#else
 +/* 32-bit hosts would need special wrapper functionality - just abort if
 +   we encounter such a case; it's very unlikely anyways. */
 +cpu_abort(env, "128 -> 64/64 division not implemented\n");
 +#endif
 
 ...I'm still using a 32 bit system :-)
 
 +/* condition codes for binary FP ops */
 +static uint32_t set_cc_f32(float32 v1, float32 v2)
 +{
 +if (float32_is_any_nan(v1) || float32_is_any_nan(v2)) {
 +return 3;
 +} else if (float32_eq(v1, v2, &env->fpu_status)) {
 +return 0;
 +} else if (float32_lt(v1, v2, &env->fpu_status)) {
 +return 1;
 +} else {
 +return 2;
 +}
 +}
 
 Can you not use float32_compare_quiet() (returns a value
 telling you if it's less/equal/greater/unordered)?
 If not, needs a comment saying why you need to do it the hard way.

Phew - I really have (almost) no clue about fp. Those parts come from Uli. So I 
guess it's the easiest to just ask him :)
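
To make the float32_compare_quiet() suggestion concrete, here is a small self-contained sketch of mapping a four-way compare result to the condition code. It uses host floats and a hypothetical compare_quiet() as a stand-in for the softfloat call; the REL_* names and values are assumptions chosen for this example, not the softfloat definitions.

```c
#include <math.h>
#include <stdint.h>

/* Stand-ins for the relation values a quiet compare would return. */
enum { REL_LESS = -1, REL_EQUAL = 0, REL_GREATER = 1, REL_UNORDERED = 2 };

/* Hypothetical quiet compare on host floats, standing in for
 * float32_compare_quiet(); unordered means at least one operand is a NaN. */
static int compare_quiet(float a, float b)
{
    if (isunordered(a, b)) {
        return REL_UNORDERED;
    }
    if (a < b) {
        return REL_LESS;
    }
    if (a > b) {
        return REL_GREATER;
    }
    return REL_EQUAL;
}

/* Map the relation to the cc values used by set_cc_f32() in the patch:
 * 0 = equal, 1 = low, 2 = high, 3 = unordered. */
static uint32_t cc_from_relation(int rel)
{
    switch (rel) {
    case REL_EQUAL:
        return 0;
    case REL_LESS:
        return 1;
    case REL_GREATER:
        return 2;
    default:
        return 3;
    }
}
```

One compare call plus a switch replaces the chain of is_any_nan/eq/lt tests, and the same mapping could be shared by the 32-, 64- and 128-bit variants.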

 
 +/* negative absolute of 32-bit float */
 +uint32_t HELPER(lcebr)(uint32_t f1, uint32_t f2)
 +{
 +env->fregs[f1].l.upper = float32_sub(float32_zero, env->fregs[f2].l.upper,
 + &env->fpu_status);
 +return set_cc_nz_f32(env->fregs[f1].l.upper);
 +}
 
 Google suggests this is wrong:
 http://publib.boulder.ibm.com/cgi-bin/bookmgr/BOOKS/DZ9AR006/19.4.10?SHELF=DT=19990630131355CASE=
 says for lcebr that:
 The sign is inverted for any operand, including a QNaN or SNaN,
 without causing an arithmetic exception.
 
 but float32_sub will raise exceptions for NaNs. You want
 float32_chs() (and similarly for the other types).
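
The difference can be seen on the raw bit pattern. The sketch below is only an illustration of what a sign-change operation does; f32_bits_chs() is a hypothetical name used here, not the softfloat function itself.

```c
#include <stdint.h>

/* Flip the sign bit of an IEEE 754 binary32 bit pattern. No arithmetic is
 * performed, so NaN payloads are preserved and no exception flag can be
 * raised, which is the behaviour the quoted text asks for. */
static uint32_t f32_bits_chs(uint32_t bits)
{
    return bits ^ UINT32_C(0x80000000);
}
```

A signaling NaN such as 0x7fa00000 simply becomes 0xffa00000, whereas feeding it to a subtraction would signal an invalid-operation exception.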
 
 +/* convert 64-bit float to 128-bit float */
 +uint32_t HELPER(lcxbr)(uint32_t f1, uint32_t f2)
 
 Wrong comment? Looks like another invert-sign op from
 the online POO.
 
 +/* 128-bit FP compare RR */
 +uint32_t HELPER(cxbr)(uint32_t f1, uint32_t f2)
 +{
 +CPU_QuadU v1;
 +v1.ll.upper = env->fregs[f1].ll;
 +v1.ll.lower = env->fregs[f1 + 2].ll;
 +CPU_QuadU v2;
 +v2.ll.upper = env->fregs[f2].ll;
 +v2.ll.lower = env->fregs[f2 + 2].ll;
 +if (float128_is_any_nan(v1.q) || float128_is_any_nan(v2.q)) {
 +return 3;
 +} else if (float128_eq(v1.q, v2.q, &env->fpu_status)) {
 +return 0;
 +} else if (float128_lt(v1.q, v2.q, &env->fpu_status)) {
 +return 1;
 +} else {
 +return 2;
 +}
 +}
 
 float128_compare_quiet() again?
 
 +/* convert 32-bit float to 64-bit int */
 +uint32_t HELPER(cgebr)(uint32_t r1, uint32_t f2, uint32_t m3)
 +{
 +float32 v2 = env->fregs[f2].l.upper;
 +set_round_mode(m3);
 +env->regs[r1] = float32_to_int64(v2, &env->fpu_status);
 +return set_cc_nz_f32(v2);
 +}
 
 Should this really be permanently setting the rounding mode
 for future instructions as well as for the op it does itself?
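
If the rounding mode from m3 is meant to apply to this one conversion only, the usual pattern is to save the current mode, switch, convert and restore. The sketch below shows that pattern with hypothetical stand-in types (struct fpu_state, enum round_mode) and a toy conversion routine; none of these names come from the patch or from softfloat, and the toy conversion is only valid for small magnitudes.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the FPU status and its rounding modes. */
enum round_mode { ROUND_NEAREST_EVEN, ROUND_TO_ZERO, ROUND_UP, ROUND_DOWN };
struct fpu_state { enum round_mode rounding; };

/* Toy conversion that honours st->rounding. */
static int64_t toy_float_to_int64(float v, const struct fpu_state *st)
{
    int64_t t = (int64_t)v;            /* the cast truncates toward zero */
    float diff = v - (float)t;

    switch (st->rounding) {
    case ROUND_UP:
        return diff > 0 ? t + 1 : t;
    case ROUND_DOWN:
        return diff < 0 ? t - 1 : t;
    case ROUND_NEAREST_EVEN:
        if (diff > 0.5f || (diff == 0.5f && (t & 1))) {
            return t + 1;
        }
        if (diff < -0.5f || (diff == -0.5f && (t & 1))) {
            return t - 1;
        }
        return t;
    case ROUND_TO_ZERO:
    default:
        return t;
    }
}

/* Apply the per-instruction mode m3, then put the previous mode back so
 * that later instructions are not affected. */
static int64_t convert_with_mode(float v, enum round_mode m3,
                                 struct fpu_state *st)
{
    enum round_mode saved = st->rounding;
    int64_t result;

    st->rounding = m3;
    result = toy_float_to_int64(v, st);
    st->rounding = saved;
    return result;
}
```

Whether the instruction really should restore the mode is exactly the open question above; the sketch only shows what the scoped variant would look like.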
 
 +/* load 32-bit FP zero */
 +void HELPER(lzer)(uint32_t f1)
 +{
 +env->fregs[f1].l.upper = float32_zero;
 +}
 
 Surely this is trivial enough to inline rather than
 calling a helper function...
 
 +/* load 128-bit FP zero */
 +void HELPER(lzxr)(uint32_t f1)
 +{
 +CPU_QuadU x;
 +x.q = float64_to_float128(float64_zero, &env->fpu_status);
 
 Yuck. Just define a float128_zero if we need one.
 
 +uint32_t HELPER(tceb)(uint32_t f1, uint64_t m2)
 +{
 +float32 v1 = env->fregs[f1].l.upper;
 +int neg = float32_is_neg(v1);
 +uint32_t cc = 0;
 +
 +HELPER_LOG("%s: v1 0x%lx m2 0x%lx neg %d\n", __FUNCTION__, (long)v1, m2, neg);
 +if ((float32_is_zero(v1) && (m2 & (1 << (11-neg)))) ||
 +(float32_is_infinity(v1) && (m2 & (1 << (5-neg)))) ||
 +(float32_is_any_nan(v1) && (m2 & (1 << (3-neg)))) ||
 +(float32_is_signaling_nan(v1) && (m2 & (1 << (1-neg))))) {
 +cc = 1;
 +} else if (m2 & (1 << (9-neg))) {
 +/* assume normalized number */
 +cc = 1;
 +}
 +
 +/* FIXME: denormalized? */
 +return cc;
 +}
 
 There's a float32_is_zero_or_denormal(); if you need a
 float32_is_denormal() which is false for real zero we
 could add it, I guess.
 
 +static inline uint32_t cc_calc_nabs_32(CPUState *env, int32_t dst)
 +{
 +return !!dst;
 +}
 
 Another candidate for inlining.
 
 -- PMM



Re: [Qemu-devel] OVMF, SeaBIOS non-CSM based legacy boot

2011-03-24 Thread Gleb Natapov
On Thu, Mar 24, 2011 at 09:46:09AM -0700, Jordan Justen wrote:
 2011/3/24 Gleb Natapov g...@redhat.com:
  On Wed, Mar 23, 2011 at 03:32:41PM -0700, Jordan Justen wrote:
  By the way, today OVMF attempts to store NV-Var data in a file on the
  disk, but this cannot support variables at runtime.  (This is why I
  sent in the patch for using -pflash on x86/x86-64.)
 
  And is this file always stored at the same location? If it is, then the
  problem is solved! But what do you mean by "this cannot support
  variables at runtime"?
 
 The variables can be set while the OS is running, and the OS has
 exclusive control over the disk at that time.  Today in OVMF we set
 variables into memory during this time, and hope that memory is still
 around after a reset.  This does not provide realistic non-volatile
 UEFI variable support.
KVM preserves memory during reset, but we had better not rely on that.

 
 What we really need is flash memory.  (See my 'hw/pc: Support system
 flash memory' patch.)
Storing the boot file only in flash memory will require distributing the
flash image along with the disk image.

 
 But, there is nothing stopping us from also storing the variables on
 the disk (during the firmware boot), and also using them as a backup.
 
This will still require at least one reboot for variables to be saved in
a filesystem, but this is better than nothing.

 Additionally, we can add yet another backup system of looking for
 known os-loader executable paths.  This would be needed if a disk
 image were ever to be transferred from a real machine to a VM image.
 But, this would require firmware updates as new UEFI OS loader install
 paths are added.  Also, let's hope no OS decides to generate a random
 path for the OS loader. :)
 
Firmware updates in a VM are very easy, so not a big deal.

--
Gleb.



Re: [Qemu-devel] Re: [PATCH 3/3] raw-posix: Re-open host CD-ROM after media change

2011-03-24 Thread Stefan Hajnoczi
On Thu, Mar 24, 2011 at 12:42 PM, Kevin Wolf kw...@redhat.com wrote:
 On 23.03.2011 21:50, Stefan Hajnoczi wrote:
 On Wed, Mar 23, 2011 at 8:27 PM, Juan Quintela quint...@redhat.com wrote:
 Stefan Hajnoczi stefa...@linux.vnet.ibm.com wrote:
 +
 +    if (s->fd == -1) {
 +        s->fd = qemu_open(bs->filename, s->open_flags, 0644);

 Everything else in that file uses plain open(), not qemu_open().
 The difference is basically that qemu_open() adds the O_CLOEXEC flag.

 I don't know if this one should be vanilla open or the other ones
 qemu_open().

 What do you think?

 raw_open_common() uses qemu_open().  That's why I used it.

 And I think it's correct. There's no reason not to set O_CLOEXEC here.
 Maybe some of the open() users need to be fixed.

 +        if (s->fd < 0) {
 +            return 0;
 +        }
 +    }
 +
 +    ret = (ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT) == CDS_DISC_OK);

 parens are not needed around ==.

 Yes, if you want I'll remove them.  I just did it for readability.

 I like them.

I will have to #ifdef QUINTELA and #ifdef KWOLF :).  Seriously, I'll
leave the braces unless anyone feels really strongly either way.  This
passes checkpatch.pl BTW.

Stefan



[Qemu-devel] [PATCH] qemu-timer: Add and use new function qemu_timer_expired_ns

2011-03-24 Thread Stefan Weil
This simply moves code which is used three times
into a new function thus improving readability.

Signed-off-by: Stefan Weil w...@mail.berlios.de
---
 qemu-timer.c |   17 ++---
 1 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 50f1943..c3ad72a 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -177,6 +177,11 @@ struct qemu_alarm_timer {
 
 static struct qemu_alarm_timer *alarm_timer;
 
+static bool qemu_timer_expired_ns(QEMUTimer *timer_head, int64_t current_time)
+{
+return timer_head && (timer_head->expire_time <= current_time);
+}
+
 int qemu_alarm_pending(void)
 {
 return alarm_timer->pending;
@@ -438,10 +443,9 @@ static void qemu_mod_timer_ns(QEMUTimer *ts, int64_t expire_time)
 pt = &active_timers[ts->clock->type];
 for(;;) {
 t = *pt;
-if (!t)
-break;
-if (t->expire_time > expire_time)
+if (!qemu_timer_expired_ns(t, expire_time)) {
 break;
+}
 pt = &t->next;
 }
 ts->expire_time = expire_time;
@@ -478,9 +482,7 @@ int qemu_timer_pending(QEMUTimer *ts)
 
 int qemu_timer_expired(QEMUTimer *timer_head, int64_t current_time)
 {
-if (!timer_head)
-return 0;
-return (timer_head->expire_time <= current_time * timer_head->scale);
+return qemu_timer_expired_ns(timer_head, current_time * timer_head->scale);
 }
 
 static void qemu_run_timers(QEMUClock *clock)
@@ -495,8 +497,9 @@ static void qemu_run_timers(QEMUClock *clock)
 ptimer_head = &active_timers[clock->type];
 for(;;) {
 ts = *ptimer_head;
-if (!ts || ts->expire_time > current_time)
+if (!qemu_timer_expired_ns(ts, current_time)) {
 break;
+}
 /* remove timer from the list before calling the callback */
 *ptimer_head = ts->next;
 ts->next = NULL;
-- 
1.7.2.5




[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-03-24 Thread dum8d0g
In reply to question #26
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/741887

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/524447

Title:
  virsh save is very slow

Status in libvirt virtualization API:
  Unknown
Status in QEMU:
  Fix Released
Status in “libvirt” package in Ubuntu:
  Invalid
Status in “qemu-kvm” package in Ubuntu:
  Fix Released
Status in “libvirt” source package in Lucid:
  New
Status in “qemu-kvm” source package in Lucid:
  In Progress
Status in “libvirt” source package in Maverick:
  New
Status in “qemu-kvm” source package in Maverick:
  In Progress

Bug description:
  ==
  SRU Justification:
  1. impact: 'qemu save' is slow
  2. how addressed: a patch upstream fixes the case when a file does not 
announce when it is ready.
  3. patch: see the patch in linked bzr trees
  4. TEST CASE: see comment #4 for a specific recipe
  5. regression potential:  this patch only touches the vm save path.
  ==

  As reported here: http://www.redhat.com/archives/libvir-
  list/2009-December/msg00203.html

  virsh save is very slow - it writes the image at around 1MB/sec on
  my test system.

  (I think I saw a bug report for this issue on Fedora's bugzilla, but I
  can't find it now...)

  Confirmed under Karmic.



[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-03-24 Thread dum8d0g
In reply to question #25: everything is included in #27. Is it enough?




[Qemu-devel] Re: [PATCH 3/3] raw-posix: Re-open host CD-ROM after media change

2011-03-24 Thread Juan Quintela
Stefan Hajnoczi stefa...@gmail.com wrote:
 On Thu, Mar 24, 2011 at 12:42 PM, Kevin Wolf kw...@redhat.com wrote:
 On 23.03.2011 21:50, Stefan Hajnoczi wrote:
 On Wed, Mar 23, 2011 at 8:27 PM, Juan Quintela quint...@redhat.com wrote:
 Stefan Hajnoczi stefa...@linux.vnet.ibm.com wrote:
 +
 +    if (s->fd == -1) {
 +        s->fd = qemu_open(bs->filename, s->open_flags, 0644);

 Everything else in that file uses plain open(), not qemu_open().
 The difference is basically that qemu_open() adds the O_CLOEXEC flag.

 I don't know if this one should be vanilla open or the other ones
 qemu_open().

 What do you think?

 raw_open_common() uses qemu_open().  That's why I used it.

 And I think it's correct. There's no reason not to set O_CLOEXEC here.
 Maybe some of the open() users need to be fixed.

I didn't doubt that.  What I was trying to point out is that there are three
opens for cdrom/floppy in that file.  It makes sense for all of them to be
the same.  I guessed that the proper fix was to change all the others to
qemu_open(), but I just wanted to point out that it was inconsistent and
should be done one way or the other.

 +        if (s->fd < 0) {
 +            return 0;
 +        }
 +    }
 +
 +    ret = (ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT) == CDS_DISC_OK);

 parens are not needed around ==.

 Yes, if you want I'll remove them.  I just did it for readability.

 I like them.

 I will have to #ifdef QUINTELA and #ifdef KWOLF :).  Seriously, I'll
 leave the braces unless anyone feels really strongly either way.  This
 passes checkpatch.pl BTW.

In

x = (a == b);

braces are useless.  But my preference is not 'strong' enough to try to
force it on other people.

It appears that other people feel strongly about it, so let it be.

Later, Juan.



[Qemu-devel] [Bug 524447] Re: virsh save is very slow

2011-03-24 Thread Serge Hallyn
Yes, thanks.




[Qemu-devel] Re: [PATCH 01/17] Only build ivshmem when CONFIG_PCI CONFIG_KVM

2011-03-24 Thread Juan Quintela
Alexander Graf ag...@suse.de wrote:
 The ivshmem depends on PCI and KVM, not only KVM. Reflect this
 in the Makefile, so we don't get build errors on s390x.

 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  Makefile.target |8 +++-
  1 files changed, 7 insertions(+), 1 deletions(-)

 diff --git a/Makefile.target b/Makefile.target
 index f0df98e..17ad396 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -209,7 +209,13 @@ QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
  obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
  
  # Inter-VM PCI shared memory
 -obj-$(CONFIG_KVM) += ivshmem.o
 +CONFIG_IVSHMEM =
 +ifeq ($(CONFIG_KVM), y)
 +  ifeq ($(CONFIG_PCI), y)
 +CONFIG_IVSHMEM = y
 +  endif
 +endif
 +obj-$(CONFIG_IVSHMEM) += ivshmem.o

This shouldn't be here.  Proper place is at ./configure, or better yet
at defaults/x86_64-softmmu.mak

CONFIG_IVSHMEM=y

It is complicated though, because we depend on PCI and KVM.

Later, Juan.



[Qemu-devel] [PATCH 1/3] arm: basic support for ARMv4/ARMv4T emulation

2011-03-24 Thread Dmitry Eremin-Solenikov
Currently target-arm/ assumes at least ARMv5 core. Add support for
handling also ARMv4/ARMv4T. This changes the following instructions:

BX (v4T and later)

BKPT, BLX, CDP2, CLZ, LDC2, LDRD, MCRR, MCRR2, MRRC, MCRR, MRC2, MRRC,
MRRC2, PLD, QADD, QDADD, QDSUB, QSUB, STRD, SMLAxy, SMLALxy, SMLAWxy,
SMULxy, SMULWxy, STC2 (v5 and later)

All instructions that are v5TE and later are also bound to just v5, as
that's how it was before.
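
The gating described above boils down to a feature-bit test before an instruction is decoded. The following is a minimal sketch of that idea with a trimmed-down, hypothetical CPU model; CPUModel, has_feature(), insn_allowed() and the insn_class enum are illustrative names and not the actual target-arm code.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down model of the feature bits; only the two
 * flags added by this patch are shown. */
enum arm_features { ARM_FEATURE_V4T, ARM_FEATURE_V5 };

typedef struct {
    uint32_t features;
} CPUModel;

static void set_feature(CPUModel *cpu, int feature)
{
    cpu->features |= 1u << feature;
}

static int has_feature(const CPUModel *cpu, int feature)
{
    return (cpu->features & (1u << feature)) != 0;
}

/* Two representative instructions from the lists above. */
enum insn_class { INSN_BX, INSN_CLZ };

/* Decide whether an instruction may be decoded on this core; a decoder
 * would raise an undefined-instruction exception when this returns 0. */
static int insn_allowed(const CPUModel *cpu, enum insn_class insn)
{
    switch (insn) {
    case INSN_BX:
        return has_feature(cpu, ARM_FEATURE_V4T);
    case INSN_CLZ:
        return has_feature(cpu, ARM_FEATURE_V5);
    }
    return 0;
}
```

A plain ARMv4 core would set neither flag, an ARMv4T core only ARM_FEATURE_V4T, and every v5-or-later core both, which matches how the CPU models are updated in helper.c.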

This patch does _not_ include disabling of cp15 access and base-updated
data abort model (that will be required to emulate chips based on a
ARM7TDMI), because:
* no ARM7TDMI chips are currently emulated (or planned)
* those features aren't strictly necessary for my purposes (SA-1 core
  emulation).

Patch is heavily based on patch by Filip Navara filip.nav...@gmail.com
which in turn is based on work by Ulrich Hecht u...@suse.de and Vincent
Sanders vi...@kyllikki.org.

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 target-arm/cpu.h   |4 +++-
 target-arm/helper.c|   24 
 target-arm/translate.c |   25 ++---
 3 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 1ae7982..e247a7a 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -360,7 +360,9 @@ enum arm_features {
 ARM_FEATURE_M, /* Microcontroller profile.  */
 ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
 ARM_FEATURE_THUMB2EE,
-ARM_FEATURE_V7MP/* v7 Multiprocessing Extensions */
+ARM_FEATURE_V7MP,/* v7 Multiprocessing Extensions */
+ARM_FEATURE_V4T,
+ARM_FEATURE_V5,
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 78f3d39..49ff5cf 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -48,17 +48,23 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c0_cpuid = id;
 switch (id) {
 case ARM_CPUID_ARM926:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_VFP);
 env->vfp.xregs[ARM_VFP_FPSID] = 0x41011090;
 env->cp15.c0_cachetype = 0x1dd20d2;
 env->cp15.c1_sys = 0x00090078;
 break;
 case ARM_CPUID_ARM946:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_MPU);
 env->cp15.c0_cachetype = 0x0f004006;
 env->cp15.c1_sys = 0x0078;
 break;
 case ARM_CPUID_ARM1026:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_VFP);
 set_feature(env, ARM_FEATURE_AUXCR);
 env->vfp.xregs[ARM_VFP_FPSID] = 0x410110a0;
@@ -67,6 +73,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 break;
 case ARM_CPUID_ARM1136_R2:
 case ARM_CPUID_ARM1136:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_VFP);
 set_feature(env, ARM_FEATURE_AUXCR);
@@ -79,6 +87,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c1_sys = 0x00050078;
 break;
 case ARM_CPUID_ARM11MPCORE:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_VFP);
@@ -91,6 +101,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c0_cachetype = 0x1dd20d2;
 break;
 case ARM_CPUID_CORTEXA8:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_V7);
@@ -113,6 +125,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c1_sys = 0x00c50078;
 break;
 case ARM_CPUID_CORTEXA9:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_V7);
@@ -140,6 +154,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env->cp15.c1_sys = 0x00c50078;
 break;
 case ARM_CPUID_CORTEXM3:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_THUMB2);
 set_feature(env, ARM_FEATURE_V7);
@@ -147,6 +163,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 set_feature(env, ARM_FEATURE_DIV);
 break;
 case ARM_CPUID_ANY: /* For userspace emulation.  */
+set_feature(env, ARM_FEATURE_V4T);
+

[Qemu-devel] [PATCH 2/3] Implement basic part of SA-1110/SA-1100

2011-03-24 Thread Dmitry Eremin-Solenikov
Basic implementation of DEC/Intel SA-1100/SA-1110 chips emulation.
Implemented:
 - IRQs
 - GPIO
 - PPC
 - RTC
 - UARTs (no IrDA/etc.)
 - OST reused from pxa25x

Everything else is TODO (esp. PM/idle/sleep!) - see the TODO list in
hw/strongarm.c

V2:
  * removed all strongarm variants except latest
  * dropped unused casts
  * fixed PIC vmstate
  * fixed new devices created with version_id = 1

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 Makefile.target |1 +
 hw/strongarm.c  | 1301 +++
 hw/strongarm.h  |   62 +++
 target-arm/cpu.h|3 +
 target-arm/helper.c |9 +
 5 files changed, 1376 insertions(+), 0 deletions(-)
 create mode 100644 hw/strongarm.c
 create mode 100644 hw/strongarm.h

diff --git a/Makefile.target b/Makefile.target
index 62b102a..d071a4d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -328,6 +328,7 @@ obj-arm-y += framebuffer.o
 obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o syborg_keyboard.o
 obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
 obj-arm-y += syborg_virtio.o
+obj-arm-y += strongarm.o
 
 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/strongarm.c b/hw/strongarm.c
new file mode 100644
index 000..9f3df87
--- /dev/null
+++ b/hw/strongarm.c
@@ -0,0 +1,1301 @@
+/*
+ * StrongARM SA-1100/SA-1110 emulation
+ *
+ * Copyright (C) 2011 Dmitry Eremin-Solenikov
+ *
+ * Largely based on StrongARM emulation:
+ * Copyright (c) 2006 Openedhand Ltd.
+ * Written by Andrzej Zaborowski bal...@zabor.org
+ *
+ * UART code based on QEMU 16550A UART emulation
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2008 Citrix Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+#include "sysbus.h"
+#include "strongarm.h"
+#include "qemu-error.h"
+#include "arm-misc.h"
+#include "sysemu.h"
+
+/*
+ TODO
+ - Implement cp15, c14 ?
+ - Implement cp15, c15 !!! (idle used in L)
+ - Implement idle mode handling/DIM
+ - Implement sleep mode/Wake sources
+ - Implement reset control
+ - Implement memory control regs
+ - PCMCIA handling
+ - Maybe support MBGNT/MBREQ
+ - DMA channels
+ - GPCLK
+ - IrDA
+ - MCP
+ - Enhance UART with modem signals
+ */
+
+static struct {
+target_phys_addr_t io_base;
+int irq;
+} sa_serial[] = {
+{ 0x8001, SA_PIC_UART1 },
+{ 0x8003, SA_PIC_UART2 },
+{ 0x8005, SA_PIC_UART3 },
+{ 0, 0 }
+};
+
+/* Interrupt Controller */
+typedef struct {
+SysBusDevice busdev;
+qemu_irq irq;
+qemu_irq fiq;
+
+uint32_t pending;
+uint32_t enabled;
+uint32_t is_fiq;
+uint32_t int_idle;
+} StrongARMPICState;
+
+#define ICIP 0x00
+#define ICMR 0x04
+#define ICLR 0x08
+#define ICFP 0x10
+#define ICPR 0x20
+#define ICCR 0x0c
+
+#define SA_PIC_SRCS 32
+
+
+static void strongarm_pic_update(void *opaque)
+{
+StrongARMPICState *s = opaque;
+
+/* FIXME: reflect DIM */
+qemu_set_irq(s->fiq, s->pending & s->enabled &  s->is_fiq);
+qemu_set_irq(s->irq, s->pending & s->enabled & ~s->is_fiq);
+}
+
+static void strongarm_pic_set_irq(void *opaque, int irq, int level)
+{
+StrongARMPICState *s = opaque;
+
+if (level) {
+s->pending |= 1 << irq;
+} else {
+s->pending &= ~(1 << irq);
+}
+
+strongarm_pic_update(s);
+}
+
+static uint32_t strongarm_pic_mem_read(void *opaque, target_phys_addr_t offset)
+{
+StrongARMPICState *s = opaque;
+
+switch (offset) {
+case ICIP:
+return s->pending & ~s->is_fiq & s->enabled;
+case ICMR:
+return s->enabled;
+case ICLR:
+return s->is_fiq;
+case ICCR:
+return s->int_idle == 0;
+case ICFP:
+return s->pending & s->is_fiq & s->enabled;
+case ICPR:
+return s->pending;
+default:
+printf("%s: Bad register offset 0x" TARGET_FMT_plx "\n",
+__func__, offset);
+return 0;
+}
+}
+
+static void strongarm_pic_mem_write(void *opaque, target_phys_addr_t offset,
+uint32_t value)
+{
+StrongARMPICState *s = opaque;
+
+switch (offset) {
+case ICMR:
+s->enabled = value;
+break;
+case ICLR:
+s->is_fiq = value;
+break;
+case ICCR:
+s->int_idle = (value & 1) ? 0 : ~0;
+break;
+default:
+printf("%s: Bad register offset 0x" TARGET_FMT_plx "\n",
+__func__, offset);
+break;
+}
+strongarm_pic_update(s);
+}
+
+static CPUReadMemoryFunc * const strongarm_pic_readfn[] = {
+strongarm_pic_mem_read,
+strongarm_pic_mem_read,
+strongarm_pic_mem_read,
+};
+
+static CPUWriteMemoryFunc * const strongarm_pic_writefn[] = {
+strongarm_pic_mem_write,
+strongarm_pic_mem_write,
+strongarm_pic_mem_write,
+};
+
+static int 

[Qemu-devel] [PATCH 3/3] Basic implementation of Sharp Zaurus SL-5500 collie PDA

2011-03-24 Thread Dmitry Eremin-Solenikov
Add very basic implementation of collie PDA emulation. The system lacks
LoCoMo and graphics/sound emulation. Linux kernel boots up to mounting
rootfs (theoretically it can be provided in pflash images).

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 Makefile.target |1 +
 hw/collie.c |   70 +++
 2 files changed, 71 insertions(+), 0 deletions(-)
 create mode 100644 hw/collie.c

diff --git a/Makefile.target b/Makefile.target
index d071a4d..abc2978 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -329,6 +329,7 @@ obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o syborg_keyboard.o
 obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
 obj-arm-y += syborg_virtio.o
 obj-arm-y += strongarm.o
+obj-arm-y += collie.o
 
 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/collie.c b/hw/collie.c
new file mode 100644
index 000..965fd13
--- /dev/null
+++ b/hw/collie.c
@@ -0,0 +1,70 @@
+/*
+ * SA-1110-based Sharp Zaurus SL-5500 platform.
+ *
+ * Copyright (C) 2011 Dmitry Eremin-Solenikov
+ *
+ * This code is licensed under GNU GPL v2.
+ */
+#include "hw.h"
+#include "sysbus.h"
+#include "boards.h"
+#include "devices.h"
+#include "strongarm.h"
+#include "arm-misc.h"
+#include "flash.h"
+#include "blockdev.h"
+
+static struct arm_boot_info collie_binfo = {
+.loader_start = SA_SDCS0,
+.ram_size = 0x2000,
+};
+
+static void collie_init(ram_addr_t ram_size,
+const char *boot_device,
+const char *kernel_filename, const char *kernel_cmdline,
+const char *initrd_filename, const char *cpu_model)
+{
+StrongARMState *s;
+DriveInfo *dinfo;
+ram_addr_t phys_flash;
+
+if (!cpu_model) {
+cpu_model = "sa1110";
+}
+
+s = sa1110_init(collie_binfo.ram_size, cpu_model);
+(void) s;
+
+phys_flash = qemu_ram_alloc(NULL, "collie.fl1", 0x0200);
+dinfo = drive_get(IF_PFLASH, 0, 0);
+pflash_cfi01_register(SA_CS0, phys_flash,
+dinfo ? dinfo->bdrv : NULL, (64 * 1024),
+512, 4, 0x00, 0x00, 0x00, 0x00, 0);
+
+phys_flash = qemu_ram_alloc(NULL, "collie.fl2", 0x0200);
+dinfo = drive_get(IF_PFLASH, 0, 1);
+pflash_cfi01_register(SA_CS1, phys_flash,
+dinfo ? dinfo->bdrv : NULL, (64 * 1024),
+512, 4, 0x00, 0x00, 0x00, 0x00, 0);
+
+sysbus_create_simple("scoop", 0x4080, NULL);
+
+collie_binfo.kernel_filename = kernel_filename;
+collie_binfo.kernel_cmdline = kernel_cmdline;
+collie_binfo.initrd_filename = initrd_filename;
+collie_binfo.board_id = 0x208;
+arm_load_kernel(s->env, &collie_binfo);
+}
+
+static QEMUMachine collie_machine = {
+.name = "collie",
+.desc = "Collie PDA (SA-1110)",
+.init = collie_init,
+};
+
+static void collie_machine_init(void)
+{
+qemu_register_machine(&collie_machine);
+}
+
+machine_init(collie_machine_init)
-- 
1.7.4.1




[Qemu-devel] [PATCH 2/2] rbd: allow configuration of rados from the rbd filename

2011-03-24 Thread Josh Durgin
The new format is rbd:pool/image[@snapshot][:option1=value1[:option2=value2...]]
Each option is used to configure rados, and may be any Ceph option, or "conf".
The "conf" option specifies a Ceph configuration file to read.

This allows rbd volumes from more than one Ceph cluster to be used by
specifying different monitor addresses, as well as having different
logging levels or locations for different volumes.
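
As an illustration of the filename format only, here is a self-contained sketch of a parser that splits such a name into its parts. It is not the qemu_rbd_parsename() from this patch: parse_rbd_name() and copy_span() are hypothetical names, and unlike the patch this sketch only looks for '@' before the first ':' so that an '@' inside an option value is not taken as a snapshot marker.

```c
#include <stddef.h>
#include <string.h>

/* Copy the bytes in [start, end) into dst as a NUL-terminated string.
 * Returns -1 if the destination buffer is too small. */
static int copy_span(char *dst, size_t dst_len,
                     const char *start, const char *end)
{
    size_t n = (size_t)(end - start);

    if (n >= dst_len) {
        return -1;
    }
    memcpy(dst, start, n);
    dst[n] = '\0';
    return 0;
}

/* Hypothetical parser for
 *   rbd:pool/image[@snapshot][:option1=value1[:option2=value2...]]
 * snap and conf are set to "" when absent; conf receives the raw option
 * list. Returns 0 on success, -1 on malformed or oversized input. */
static int parse_rbd_name(const char *filename,
                          char *pool, size_t pool_len,
                          char *image, size_t image_len,
                          char *snap, size_t snap_len,
                          char *conf, size_t conf_len)
{
    const char *p, *slash, *at, *colon, *image_end;

    if (strncmp(filename, "rbd:", 4) != 0) {
        return -1;
    }
    p = filename + 4;
    slash = strchr(p, '/');
    if (!slash || slash == p || copy_span(pool, pool_len, p, slash) < 0) {
        return -1;
    }

    p = slash + 1;
    colon = strchr(p, ':');            /* start of the option list, if any */
    if (!colon) {
        colon = p + strlen(p);
    }
    at = memchr(p, '@', (size_t)(colon - p));
    image_end = at ? at : colon;
    if (image_end == p || copy_span(image, image_len, p, image_end) < 0) {
        return -1;
    }

    /* Snapshot name: everything between '@' and the option list. */
    if (copy_span(snap, snap_len, at ? at + 1 : colon, colon) < 0) {
        return -1;
    }
    /* Option list: everything after the first ':' following the image. */
    if (*colon == ':') {
        colon++;
    }
    return copy_span(conf, conf_len, colon, colon + strlen(colon));
}
```

For example, "rbd:pool/image@snap:conf=/tmp/ceph.conf:option1=value1" (a made-up name) would yield pool "pool", image "image", snapshot "snap" and the option string "conf=/tmp/ceph.conf:option1=value1", which a second pass could then split on ':' and '='.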

Signed-off-by: Josh Durgin josh.dur...@dreamhost.com
---
 block/rbd.c |   96 --
 1 files changed, 86 insertions(+), 10 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 5c3287e..b146fcb 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -22,13 +22,16 @@
 /*
  * When specifying the image filename use:
  *
- * rbd:poolname/devicename
+ * rbd:poolname/devicename[@snapshotname][:option1=value1[:option2=value2...]]
  *
  * poolname must be the name of an existing rados pool
  *
  * devicename is the basename for all objects used to
  * emulate the raw device.
  *
+ * Each option given is used to configure rados, and may be any Ceph option, or "conf".
+ * The "conf" option specifies a Ceph configuration file to read.
+ *
  * Metadata information (image size, ...) is stored in an
  * object with the name devicename.rbd.
  *
@@ -121,7 +124,8 @@ static int qemu_rbd_next_tok(char *dst, int dst_len,
 static int qemu_rbd_parsename(const char *filename,
   char *pool, int pool_len,
   char *snap, int snap_len,
-  char *name, int name_len)
+  char *name, int name_len,
+  char *conf, int conf_len)
 {
 const char *start;
 char *p, *buf;
@@ -133,28 +137,82 @@ static int qemu_rbd_parsename(const char *filename,
  buf = qemu_strdup(start);
 p = buf;
+*snap = '\0';
+*conf = '\0';
  ret = qemu_rbd_next_tok(pool, pool_len, p, '/', "pool name", &p);
 if (ret < 0 || !p) {
 ret = -EINVAL;
 goto done;
 }
-ret = qemu_rbd_next_tok(name, name_len, p, '@', "object name", &p);
-if (ret < 0) {
-goto done;
+
+if (strchr(p, '@')) {
+ret = qemu_rbd_next_tok(name, name_len, p, '@', "object name", &p);
+if (ret < 0) {
+goto done;
+}
+ret = qemu_rbd_next_tok(snap, snap_len, p, ':', "snap name", &p);
+} else {
+ret = qemu_rbd_next_tok(name, name_len, p, ':', "object name", &p);
 }
-if (!p) {
-*snap = '\0';
+if (ret < 0 || !p) {
 goto done;
 }
 -ret = qemu_rbd_next_tok(snap, snap_len, p, '\0', "snap name", &p);
+ret = qemu_rbd_next_tok(conf, conf_len, p, '\0', "configuration", &p);
  done:
 qemu_free(buf);
 return ret;
 }

+static int qemu_rbd_set_conf(rados_t cluster, const char *conf)
+{
+    char *p, *buf;
+    char name[RBD_MAX_CONF_NAME_SIZE];
+    char value[RBD_MAX_CONF_VAL_SIZE];
+    int ret = 0;
+
+    buf = qemu_strdup(conf);
+    p = buf;
+
+    while (p) {
+        ret = qemu_rbd_next_tok(name, sizeof(name), p, '=', "conf option name", &p);
+        if (ret < 0) {
+            break;
+        }
+
+        if (!p) {
+            error_report("conf option %s has no value", name);
+            ret = -EINVAL;
+            break;
+        }
+
+        ret = qemu_rbd_next_tok(value, sizeof(value), p, ':', "conf option value", &p);
+        if (ret < 0) {
+            break;
+        }
+
+        if (strncmp(name, "conf", strlen("conf"))) {
+            ret = rados_conf_set(cluster, name, value);
+            if (ret < 0) {
+                error_report("invalid conf option %s", name);
+                ret = -EINVAL;
+                break;
+            }
+        } else {
+            ret = rados_conf_read_file(cluster, value);
+            if (ret < 0) {
+                error_report("error reading conf file %s", value);
+                break;
+            }
+        }
+    }
+
+    qemu_free(buf);
+    return ret;
+}
+
 static int qemu_rbd_create(const char *filename, QEMUOptionParameter *options)
 {
 int64_t bytes = 0;
@@ -163,6 +221,7 @@ static int qemu_rbd_create(const char *filename, 
QEMUOptionParameter *options)
 char pool[RBD_MAX_POOL_NAME_SIZE];
 char name[RBD_MAX_IMAGE_NAME_SIZE];
 char snap_buf[RBD_MAX_SNAP_NAME_SIZE];
+char conf[RBD_MAX_CONF_SIZE];
 char *snap = NULL;
 rados_t cluster;
 rados_ioctx_t io_ctx;
@@ -170,7 +229,8 @@ static int qemu_rbd_create(const char *filename, 
QEMUOptionParameter *options)

     if (qemu_rbd_parsename(filename, pool, sizeof(pool),
                            snap_buf, sizeof(snap_buf),
-                           name, sizeof(name)) < 0) {
+                           name, sizeof(name),
+                           conf, sizeof(conf)) < 0) {
         return -EINVAL;
     }
 if (snap_buf[0] != '\0') {
@@ -209,6 +269,13 @@ static int qemu_rbd_create(const char *filename, 
QEMUOptionParameter *options)
 return -EIO;
 }
 +   

[Qemu-devel] [PATCH 1/2] rbd: use the higher level librbd instead of just librados

2011-03-24 Thread Josh Durgin
librbd stacks on top of librados to provide access
to rbd images.

Using librbd simplifies the qemu code, and allows
qemu to use new versions of the rbd format
with few (if any) changes.

Signed-off-by: Josh Durgin josh.dur...@dreamhost.com
Signed-off-by: Yehuda Sadeh yeh...@hq.newdream.net
---
 block/rbd.c   |  784 ++---
 block/rbd_types.h |   71 -
 configure |   33 +--
 3 files changed, 216 insertions(+), 672 deletions(-)
 delete mode 100644 block/rbd_types.h

diff --git a/block/rbd.c b/block/rbd.c
index 249a590..5c3287e 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -8,13 +8,14 @@
  *
  */

+#include <inttypes.h>
+
 #include "qemu-common.h"
 #include "qemu-error.h"

-#include "rbd_types.h"
 #include "block_int.h"

-#include <rados/librados.h>
+#include <rbd/librbd.h>



@@ -40,6 +41,12 @@

 #define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)

+#define RBD_MAX_CONF_NAME_SIZE 128
+#define RBD_MAX_CONF_VAL_SIZE 512
+#define RBD_MAX_CONF_SIZE 1024
+#define RBD_MAX_POOL_NAME_SIZE 128
+#define RBD_MAX_SNAP_NAME_SIZE 128
+
 typedef struct RBDAIOCB {
 BlockDriverAIOCB common;
 QEMUBH *bh;
@@ -48,7 +55,6 @@ typedef struct RBDAIOCB {
 char *bounce;
 int write;
 int64_t sector_num;
-int aiocnt;
 int error;
 struct BDRVRBDState *s;
 int cancelled;
@@ -59,7 +65,7 @@ typedef struct RADOSCB {
 RBDAIOCB *acb;
 struct BDRVRBDState *s;
 int done;
-int64_t segsize;
+int64_t size;
 char *buf;
 int ret;
 } RADOSCB;
@@ -69,25 +75,22 @@ typedef struct RADOSCB {
  typedef struct BDRVRBDState {
 int fds[2];
-rados_pool_t pool;
-rados_pool_t header_pool;
-char name[RBD_MAX_OBJ_NAME_SIZE];
-char block_name[RBD_MAX_BLOCK_NAME_SIZE];
-uint64_t size;
-uint64_t objsize;
+rados_t cluster;
+rados_ioctx_t io_ctx;
+rbd_image_t image;
+char name[RBD_MAX_IMAGE_NAME_SIZE];
 int qemu_aio_count;
+char *snap;
 int event_reader_pos;
 RADOSCB *event_rcb;
 } BDRVRBDState;
 -typedef struct rbd_obj_header_ondisk RbdHeader1;
-
 static void rbd_aio_bh_cb(void *opaque);
 -static int rbd_next_tok(char *dst, int dst_len,
-char *src, char delim,
-const char *name,
-char **p)
+static int qemu_rbd_next_tok(char *dst, int dst_len,
+ char *src, char delim,
+ const char *name,
+ char **p)
 {
 int l;
 char *end;
@@ -115,10 +118,10 @@ static int rbd_next_tok(char *dst, int dst_len,
 return 0;
 }
 -static int rbd_parsename(const char *filename,
- char *pool, int pool_len,
- char *snap, int snap_len,
- char *name, int name_len)
+static int qemu_rbd_parsename(const char *filename,
+  char *pool, int pool_len,
+  char *snap, int snap_len,
+  char *name, int name_len)
 {
 const char *start;
 char *p, *buf;
@@ -131,12 +134,12 @@ static int rbd_parsename(const char *filename,
 buf = qemu_strdup(start);
 p = buf;

-    ret = rbd_next_tok(pool, pool_len, p, '/', "pool name", &p);
+    ret = qemu_rbd_next_tok(pool, pool_len, p, '/', "pool name", &p);
     if (ret < 0 || !p) {
         ret = -EINVAL;
         goto done;
     }
-    ret = rbd_next_tok(name, name_len, p, '@', "object name", &p);
+    ret = qemu_rbd_next_tok(name, name_len, p, '@', "object name", &p);
     if (ret < 0) {
         goto done;
     }
@@ -145,123 +148,35 @@ static int rbd_parsename(const char *filename,
 goto done;
 }

-    ret = rbd_next_tok(snap, snap_len, p, '\0', "snap name", &p);
+    ret = qemu_rbd_next_tok(snap, snap_len, p, '\0', "snap name", &p);

 done:
 qemu_free(buf);
 return ret;
 }

-static int create_tmap_op(uint8_t op, const char *name, char **tmap_desc)
-{
-    uint32_t len = strlen(name);
-    uint32_t len_le = cpu_to_le32(len);
-    /* total_len = encoding op + name + empty buffer */
-    uint32_t total_len = 1 + (sizeof(uint32_t) + len) + sizeof(uint32_t);
-    uint8_t *desc = NULL;
-
-    desc = qemu_malloc(total_len);
-
-    *tmap_desc = (char *)desc;
-
-    *desc = op;
-    desc++;
-    memcpy(desc, &len_le, sizeof(len_le));
-    desc += sizeof(len_le);
-    memcpy(desc, name, len);
-    desc += len;
-    len = 0; /* no need for endian conversion for 0 */
-    memcpy(desc, &len, sizeof(len));
-    desc += sizeof(len);
-
-    return (char *)desc - *tmap_desc;
-}
-
-static void free_tmap_op(char *tmap_desc)
-{
-    qemu_free(tmap_desc);
-}
-
-static int rbd_register_image(rados_pool_t pool, const char *name)
-{
-    char *tmap_desc;
-    const char *dir = RBD_DIRECTORY;
-    int ret;
-
-    ret = create_tmap_op(CEPH_OSD_TMAP_SET, name, &tmap_desc);
-    if (ret < 0) {
-        return ret;
-    }
-
-    ret = rados_tmap_update(pool, dir, tmap_desc, ret);
-

[Qemu-devel] [0/27] Implement emulation of pSeries logical partitions (v5)

2011-03-24 Thread David Gibson
This patch series adds a pseries machine to qemu, allowing it to
emulate IBM pSeries logical partitions.  More specifically it
implements the interface defined by the PowerPC Architecture Platform
Requirements document (PAPR, or sPAPR for short).

Along the way we add a bunch of support for more modern ppc CPUs than
are currently supported.  It also makes some significant cleanups to
the translation code for hash page table based ppc MMUs.

Please apply.

---

Note that I haven't implemented (yet) a min_ram field in the machine
structure.  There are a number of places where the pseries platform
would benefit from more participation of the machine description in
command line validation.  I want to think a bit more about these
before sending some patches.  For now I've taken the simpler approach
of just adding a meaningful error message to the machine init function
if ram_size is too small.

Changes since v4 of this series:
 * Fix build breakages for powerpc targets other than ppc64 full system.
 * Since the pseries platform requires libfdt, only compile it when
   configured with --enable-fdt
 * Give an informative error if invoked with insufficient guest RAM to
   run the partition firmware.  Without this, giving insufficient RAM
   - such as qemu's default 64M - would lead to the firmware failing
   cryptically partway through boot.

Changes since v3 of this series:
 * Many, many checkpatch fixups
 * Integrated feedback from qemu-devel list
 * Added in-partition SLOF firmware

Changes since v2 of this series:
 * Assorted bugfixes and cleanups.

Changes since v1 of this series:
 * numerous coding style fixups
 * incorporated most review comments from initial version
 * moved to a wholly dynamic hypercall registration scheme
 * assorted other cleanups
 * many more patches implementing VIO devices




[Qemu-devel] [PATCH 01/27] Clean up PowerPC SLB handling code

2011-03-24 Thread David Gibson
Currently the SLB information when emulating a PowerPC 970 is
stored in a structure with the unhelpfully named fields 'tmp'
and 'tmp64'.  While the layout in these fields does match the
description of the SLB in the architecture document, it is not
convenient either for looking up the SLB, or for emulating the
slbmte instruction.

This patch, therefore, reorganizes the SLB entry structure to be
divided into the ESID related and VSID related fields, as
they are divided in instructions accessing the SLB.

In addition to making the code smaller and more readable, this will
make it easier to implement support for the 1TB segments used in more
recent PowerPC chips.
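
To illustrate why the ESID/VSID split makes lookup convenient, here is a minimal standalone sketch of a 256M-segment SLB lookup. It is not the code from the patch: slb_entry, slb_find() and SLB_NR are hypothetical names, and the value used for the valid bit is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEGMENT_SHIFT_256M  28
#define SEGMENT_MASK_256M   (~((1ULL << SEGMENT_SHIFT_256M) - 1))
#define SLB_ESID_V          0x0000000008000000ULL /* valid bit (assumed value) */
#define SLB_NR              4                     /* illustrative table size */

typedef struct {
    uint64_t esid;   /* effective segment ID plus valid bit */
    uint64_t vsid;   /* virtual segment ID plus attribute bits */
} slb_entry;

/* Return the valid entry covering eaddr, or NULL if there is none. */
static const slb_entry *slb_find(const slb_entry *slb, int nr, uint64_t eaddr)
{
    /* Build the ESID word a matching, valid entry would contain. */
    uint64_t esid = (eaddr & SEGMENT_MASK_256M) | SLB_ESID_V;
    int n;

    for (n = 0; n < nr; n++) {
        if (slb[n].esid == esid) {
            return &slb[n];
        }
    }
    return NULL;
}
```

Because the valid bit is folded into the comparison value, a single equality test rejects both non-matching and invalid entries.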

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/cpu.h   |   29 +++-
 target-ppc/helper.c|  178 ++--
 target-ppc/helper.h|1 -
 target-ppc/op_helper.c |9 +--
 4 files changed, 80 insertions(+), 137 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index deb8d7c..124bbbf 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -43,6 +43,8 @@
 # define TARGET_VIRT_ADDR_SPACE_BITS 64
 #endif
 
+#define TARGET_PAGE_BITS_16M 24
+
 #else /* defined (TARGET_PPC64) */
 /* PowerPC 32 definitions */
 #define TARGET_LONG_BITS 32
@@ -359,10 +361,31 @@ union ppc_tlb_t {
 
 typedef struct ppc_slb_t ppc_slb_t;
 struct ppc_slb_t {
-uint64_t tmp64;
-uint32_t tmp;
+uint64_t esid;
+uint64_t vsid;
 };
 
+/* Bits in the SLB ESID word */
+#define SLB_ESID_ESID   0xF000ULL
+#define SLB_ESID_V  0x0800ULL /* valid */
+
+/* Bits in the SLB VSID word */
+#define SLB_VSID_SHIFT  12
+#define SLB_VSID_SSIZE_SHIFT62
+#define SLB_VSID_B  0xc000ULL
+#define SLB_VSID_B_256M 0xULL
+#define SLB_VSID_VSID   0x3000ULL
+#define SLB_VSID_KS 0x0800ULL
+#define SLB_VSID_KP 0x0400ULL
+#define SLB_VSID_N  0x0200ULL /* no-execute */
+#define SLB_VSID_L  0x0100ULL
+#define SLB_VSID_C  0x0080ULL /* class */
+#define SLB_VSID_LP 0x0030ULL
+#define SLB_VSID_ATTR   0x0FFFULL
+
+#define SEGMENT_SHIFT_256M  28
+#define SEGMENT_MASK_256M   (~((1ULL << SEGMENT_SHIFT_256M) - 1))
+
 /*/
 /* Machine state register bits definition*/
 #define MSR_SF   63 /* Sixty-four-bit modehflags */
@@ -755,7 +778,7 @@ void ppc_store_sdr1 (CPUPPCState *env, target_ulong value);
 void ppc_store_asr (CPUPPCState *env, target_ulong value);
 target_ulong ppc_load_slb (CPUPPCState *env, int slb_nr);
 target_ulong ppc_load_sr (CPUPPCState *env, int sr_nr);
-void ppc_store_slb (CPUPPCState *env, target_ulong rb, target_ulong rs);
+int ppc_store_slb (CPUPPCState *env, target_ulong rb, target_ulong rs);
 #endif /* defined(TARGET_PPC64) */
 void ppc_store_sr (CPUPPCState *env, int srnum, target_ulong value);
 #endif /* !defined(CONFIG_USER_ONLY) */
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 4b49101..2094ca3 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -672,85 +672,36 @@ static inline int find_pte(CPUState *env, mmu_ctx_t *ctx, 
int h, int rw,
 }
 
 #if defined(TARGET_PPC64)
-static ppc_slb_t *slb_get_entry(CPUPPCState *env, int nr)
-{
-ppc_slb_t *retval = env-slb[nr];
-
-#if 0 // XXX implement bridge mode?
-if (env-spr[SPR_ASR]  1) {
-target_phys_addr_t sr_base;
-
-sr_base = env-spr[SPR_ASR]  0xf000;
-sr_base += (12 * nr);
-
-retval-tmp64 = ldq_phys(sr_base);
-retval-tmp = ldl_phys(sr_base + 8);
-}
-#endif
-
-return retval;
-}
-
-static void slb_set_entry(CPUPPCState *env, int nr, ppc_slb_t *slb)
-{
-ppc_slb_t *entry = env-slb[nr];
-
-if (slb == entry)
-return;
-
-entry-tmp64 = slb-tmp64;
-entry-tmp = slb-tmp;
-}
-
-static inline int slb_is_valid(ppc_slb_t *slb)
-{
-return (int)(slb-tmp64  0x0800ULL);
-}
-
-static inline void slb_invalidate(ppc_slb_t *slb)
-{
-slb-tmp64 = ~0x0800ULL;
-}
-
 static inline int slb_lookup(CPUPPCState *env, target_ulong eaddr,
  target_ulong *vsid, target_ulong *page_mask,
  int *attr, int *target_page_bits)
 {
-    target_ulong mask;
-    int n, ret;
+    uint64_t esid;
+    int n;
 
-    ret = -5;
     LOG_SLB("%s: eaddr " TARGET_FMT_lx "\n", __func__, eaddr);
-    mask = 0xULL; /* Avoid gcc warning */
+
+    esid = (eaddr & SEGMENT_MASK_256M) | SLB_ESID_V;
+
     for (n = 0; n < env->slb_nr; n++) {
-        ppc_slb_t *slb = slb_get_entry(env, n);
-
-LOG_SLB(%s: seg %d %016 PRIx64  %08
-PRIx32 \n, __func__, n, slb-tmp64, 

[Qemu-devel] [PATCH 06/27] Correct ppc popcntb logic, implement popcntw and popcntd

2011-03-24 Thread David Gibson
From: David Gibson d...@au1.ibm.com

qemu already includes support for the popcntb instruction introduced
in POWER5 (although it doesn't actually allow you to choose POWER5).

However, the logic is slightly incorrect: it will generate results
truncated to 32-bits when the CPU is in 32-bit mode.  This is not
normal for powerpc - generally arithmetic instructions on a 64-bit
powerpc cpu will generate full 64 bit results, it's just that only the
low 32 bits will be significant for condition codes.

This patch corrects this nit, which actually simplifies the code slightly.

In addition, this patch implements the popcntw and popcntd
instructions added in POWER7, in preparation for allowing POWER7 as an
emulated CPU.
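
The three instructions differ only in how far the partial sums are folded. A minimal standalone sketch, assuming the usual parallel bit-counting masks and using plain uint64_t rather than QEMU's target_ulong:

```c
#include <assert.h>
#include <stdint.h>

/* popcntb: count the set bits of each byte, result left in that byte. */
static uint64_t popcntb(uint64_t val)
{
    val = (val & 0x5555555555555555ULL) + ((val >> 1) & 0x5555555555555555ULL);
    val = (val & 0x3333333333333333ULL) + ((val >> 2) & 0x3333333333333333ULL);
    val = (val & 0x0f0f0f0f0f0f0f0fULL) + ((val >> 4) & 0x0f0f0f0f0f0f0f0fULL);
    return val;
}

/* popcntw: count the set bits of each 32-bit word. */
static uint64_t popcntw(uint64_t val)
{
    val = popcntb(val);
    val = (val & 0x00ff00ff00ff00ffULL) + ((val >> 8) & 0x00ff00ff00ff00ffULL);
    val = (val & 0x0000ffff0000ffffULL) + ((val >> 16) & 0x0000ffff0000ffffULL);
    return val;
}

/* popcntd: count the set bits of the whole 64-bit doubleword. */
static uint64_t popcntd(uint64_t val)
{
    val = popcntw(val);
    val = (val & 0x00000000ffffffffULL) + ((val >> 32) & 0x00000000ffffffffULL);
    return val;
}
```

Each function always operates on the full 64-bit value, which matches the point made above about not truncating results in 32-bit mode.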

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/cpu.h   |2 +
 target-ppc/helper.h|3 +-
 target-ppc/op_helper.c |   55 +++
 target-ppc/translate.c |   20 +
 4 files changed, 69 insertions(+), 11 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index f293f85..37dde39 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1505,6 +1505,8 @@ enum {
 PPC_DCRX   = 0x2000ULL,
 /* user-mode DCR access, implemented in PowerPC 460  */
 PPC_DCRUX  = 0x4000ULL,
+/* popcntw and popcntd instructions  */
+PPC_POPCNTWD   = 0x8000ULL,
 };
 
 /*/
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 2b4744d..7c02be9 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -38,10 +38,11 @@ DEF_HELPER_2(mulldo, i64, i64, i64)
 
 DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl)
 DEF_HELPER_FLAGS_1(popcntb, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl)
+DEF_HELPER_FLAGS_1(popcntw, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl)
 DEF_HELPER_2(sraw, tl, tl, tl)
 #if defined(TARGET_PPC64)
 DEF_HELPER_FLAGS_1(cntlzd, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl)
-DEF_HELPER_FLAGS_1(popcntb_64, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl)
+DEF_HELPER_FLAGS_1(popcntd, TCG_CALL_CONST | TCG_CALL_PURE, tl, tl)
 DEF_HELPER_2(srad, tl, tl, tl)
 #endif
 
diff --git a/target-ppc/op_helper.c b/target-ppc/op_helper.c
index aa2e8ba..b1b883d 100644
--- a/target-ppc/op_helper.c
+++ b/target-ppc/op_helper.c
@@ -499,6 +499,50 @@ target_ulong helper_srad (target_ulong value, target_ulong 
shift)
 }
 #endif
 
+#if defined(TARGET_PPC64)
+target_ulong helper_popcntb (target_ulong val)
+{
+    val = (val & 0x5555555555555555ULL) + ((val >>  1) &
+                                           0x5555555555555555ULL);
+    val = (val & 0x3333333333333333ULL) + ((val >>  2) &
+                                           0x3333333333333333ULL);
+    val = (val & 0x0f0f0f0f0f0f0f0fULL) + ((val >>  4) &
+                                           0x0f0f0f0f0f0f0f0fULL);
+    return val;
+}
+
+target_ulong helper_popcntw (target_ulong val)
+{
+    val = (val & 0x5555555555555555ULL) + ((val >>  1) &
+                                           0x5555555555555555ULL);
+    val = (val & 0x3333333333333333ULL) + ((val >>  2) &
+                                           0x3333333333333333ULL);
+    val = (val & 0x0f0f0f0f0f0f0f0fULL) + ((val >>  4) &
+                                           0x0f0f0f0f0f0f0f0fULL);
+    val = (val & 0x00ff00ff00ff00ffULL) + ((val >>  8) &
+                                           0x00ff00ff00ff00ffULL);
+    val = (val & 0x0000ffff0000ffffULL) + ((val >> 16) &
+                                           0x0000ffff0000ffffULL);
+    return val;
+}
+
+target_ulong helper_popcntd (target_ulong val)
+{
+    val = (val & 0x5555555555555555ULL) + ((val >>  1) &
+                                           0x5555555555555555ULL);
+    val = (val & 0x3333333333333333ULL) + ((val >>  2) &
+                                           0x3333333333333333ULL);
+    val = (val & 0x0f0f0f0f0f0f0f0fULL) + ((val >>  4) &
+                                           0x0f0f0f0f0f0f0f0fULL);
+    val = (val & 0x00ff00ff00ff00ffULL) + ((val >>  8) &
+                                           0x00ff00ff00ff00ffULL);
+    val = (val & 0x0000ffff0000ffffULL) + ((val >> 16) &
+                                           0x0000ffff0000ffffULL);
+    val = (val & 0x00000000ffffffffULL) + ((val >> 32) &
+                                           0x00000000ffffffffULL);
+    return val;
+}
+#else
 target_ulong helper_popcntb (target_ulong val)
 {
     val = (val & 0x55555555) + ((val >>  1) & 0x55555555);
@@ -507,12 +551,13 @@ target_ulong helper_popcntb (target_ulong val)
 return val;
 }
 
-#if defined(TARGET_PPC64)
-target_ulong helper_popcntb_64 (target_ulong val)
+target_ulong helper_popcntw (target_ulong val)
 {
-val = (val  0xULL) + ((val   1)  
0xULL);
-val = (val  0xULL) + ((val   2)  
0xULL);
-val = (val  

[Qemu-devel] [PATCH 02/27] Allow qemu_devtree_setprop() to take arbitrary values

2011-03-24 Thread David Gibson
From: David Gibson d...@au1.ibm.com

Currently qemu_devtree_setprop() expects the new property value to be
given as a uint32_t *.  While property values consisting of u32s are
common, in general they can have any bytestring value.

Therefore, this patch alters the function to take a void * instead,
allowing callers to easily give anything as the property value.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 device_tree.c |2 +-
 device_tree.h |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/device_tree.c b/device_tree.c
index 426a631..21be070 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -74,7 +74,7 @@ fail:
 }
 
 int qemu_devtree_setprop(void *fdt, const char *node_path,
- const char *property, uint32_t *val_array, int size)
+ const char *property, void *val_array, int size)
 {
 int offset;
 
diff --git a/device_tree.h b/device_tree.h
index f05c4e7..cecd98f 100644
--- a/device_tree.h
+++ b/device_tree.h
@@ -17,7 +17,7 @@
 void *load_device_tree(const char *filename_path, int *sizep);
 
 int qemu_devtree_setprop(void *fdt, const char *node_path,
- const char *property, uint32_t *val_array, int size);
+ const char *property, void *val_array, int size);
 int qemu_devtree_setprop_cell(void *fdt, const char *node_path,
   const char *property, uint32_t val);
 int qemu_devtree_setprop_string(void *fdt, const char *node_path,
-- 
1.7.1




[Qemu-devel] [PATCH 08/27] Parse SDR1 on mtspr instead of at translate time

2011-03-24 Thread David Gibson
On ppc machines with hash table MMUs, the special purpose register SDR1
contains both the base address and the encoded size of the (hashed) page tables.

At present, we interpret the SDR1 value within the address translation
path.  But because the encodings of the size for 32-bit and 64-bit are
different this makes for a confusing branch on the MMU type with a bunch
of curly shifts and masks in the middle of the translate path.

This patch cleans things up by moving the interpretation on SDR1 into the
helper function handling the write to the register.  This leaves a simple
pre-sanitized base address and mask for the hash table in the CPUState
structure which is easier to work with in the translation path.

This makes the translation path more readable.  It addresses the FIXME
comment currently in the mtsdr1 helper, by validating the SDR1 value during
interpretation.  Finally it opens the way for emulating a pSeries-style
partition where the hash table used for translation is not mapped into
the guest's RAM.
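
As a rough illustration of the idea, the decoding done at register-write time might look like the sketch below. The field masks and the "+ 18" size encoding reflect my reading of the architecture and should be treated as assumptions; htab_info, decode_sdr1_32(), decode_sdr1_64() and the limit of 28 are hypothetical, and hash is treated here as a byte offset into the table.

```c
#include <assert.h>
#include <stdint.h>

#define SDR_32_HTABORG   0xFFFF0000UL             /* assumed field masks */
#define SDR_32_HTABMASK  0x000001FFUL
#define SDR_64_HTABORG   0xFFFFFFFFFFFC0000ULL
#define SDR_64_HTABSIZE  0x000000000000001FULL

typedef struct {
    uint64_t htab_base;  /* pre-sanitized base address of the hash table */
    uint64_t htab_mask;  /* mask applied to the hash (table size - 1) */
} htab_info;

/* 32-bit MMUs: SDR1 holds an origin and a 9-bit mask extension. */
static htab_info decode_sdr1_32(uint32_t sdr1)
{
    htab_info h;

    h.htab_base = sdr1 & SDR_32_HTABORG;
    h.htab_mask = ((uint64_t)(sdr1 & SDR_32_HTABMASK) << 16) | 0xFFFF;
    return h;
}

/* 64-bit MMUs: SDR1 holds an origin and a log2-encoded size. */
static int decode_sdr1_64(uint64_t sdr1, htab_info *h)
{
    uint64_t htabsize = sdr1 & SDR_64_HTABSIZE;

    if (htabsize > 28) {
        return -1;              /* validate once, at write time */
    }
    h->htab_base = sdr1 & SDR_64_HTABORG;
    h->htab_mask = (1ULL << (htabsize + 18)) - 1;
    return 0;
}

/* The translation path then needs no MMU-specific shifts or masks. */
static uint64_t get_pgaddr(const htab_info *h, uint64_t hash)
{
    return h->htab_base | (hash & h->htab_mask);
}
```

The point of the design is visible in get_pgaddr(): once the base and mask are precomputed, the hot path is the same for both MMU families.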

Signed-off-by: David Gibson d...@au1.ibm.com
---
 monitor.c   |2 +-
 target-ppc/cpu.h|   11 +-
 target-ppc/helper.c |   80 ---
 target-ppc/kvm.c|2 +-
 target-ppc/machine.c|6 ++-
 target-ppc/translate.c  |2 +-
 target-ppc/translate_init.c |7 +---
 7 files changed, 63 insertions(+), 47 deletions(-)

diff --git a/monitor.c b/monitor.c
index 76a8207..f1a08dc 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3462,7 +3462,7 @@ static const MonitorDef monitor_defs[] = {
     { "asr", offsetof(CPUState, asr) },
 #endif
     /* Segment registers */
-    { "sdr1", offsetof(CPUState, sdr1) },
+    { "sdr1", offsetof(CPUState, spr[SPR_SDR1]) },
     { "sr0", offsetof(CPUState, sr[0]) },
     { "sr1", offsetof(CPUState, sr[1]) },
     { "sr2", offsetof(CPUState, sr[2]) },
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 37dde39..ead4566 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -359,6 +359,14 @@ union ppc_tlb_t {
 };
 #endif
 
+#define SDR_32_HTABORG 0xUL
+#define SDR_32_HTABMASK0x01FFUL
+
+#if defined(TARGET_PPC64)
+#define SDR_64_HTABORG 0xFFFCULL
+#define SDR_64_HTABSIZE0x001FULL
+#endif /* defined(TARGET_PPC64 */
+
 typedef struct ppc_slb_t ppc_slb_t;
 struct ppc_slb_t {
 uint64_t esid;
@@ -642,7 +650,8 @@ struct CPUPPCState {
 int slb_nr;
 #endif
 /* segment registers */
-target_ulong sdr1;
+target_phys_addr_t htab_base;
+target_phys_addr_t htab_mask;
 target_ulong sr[32];
 /* BATs */
 int nb_BATs;
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 7ca33cb..68d2d9c 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -788,20 +788,19 @@ int ppc_load_slb_vsid (CPUPPCState *env, target_ulong rb, 
target_ulong *rt)
 #endif /* defined(TARGET_PPC64) */
 
 /* Perform segment based translation */
-static inline target_phys_addr_t get_pgaddr(target_phys_addr_t sdr1,
-int sdr_sh,
-target_phys_addr_t hash,
-target_phys_addr_t mask)
+static inline target_phys_addr_t get_pgaddr(target_phys_addr_t htab_base,
+target_phys_addr_t htab_mask,
+target_phys_addr_t hash)
 {
-    return (sdr1 & ((target_phys_addr_t)(-1ULL) << sdr_sh)) | (hash & mask);
+    return htab_base | (hash & htab_mask);
 }
 
 static inline int get_segment(CPUState *env, mmu_ctx_t *ctx,
   target_ulong eaddr, int rw, int type)
 {
-target_phys_addr_t sdr, hash, mask, sdr_mask, htab_mask;
+target_phys_addr_t hash;
 target_ulong sr, vsid, vsid_mask, pgidx, page_mask;
-int ds, vsid_sh, sdr_sh, pr, target_page_bits;
+int ds, vsid_sh, pr, target_page_bits;
 int ret, ret2;
 
 pr = msr_pr;
@@ -826,8 +825,6 @@ static inline int get_segment(CPUState *env, mmu_ctx_t *ctx,
 ctx-eaddr = eaddr;
 vsid_mask = 0x3F80ULL;
 vsid_sh = 7;
-sdr_sh = 18;
-sdr_mask = 0x3FF80;
 } else
 #endif /* defined(TARGET_PPC64) */
 {
@@ -840,8 +837,6 @@ static inline int get_segment(CPUState *env, mmu_ctx_t *ctx,
 vsid = sr  0x00FF;
 vsid_mask = 0x01C0;
 vsid_sh = 6;
-sdr_sh = 16;
-sdr_mask = 0xFFC0;
 target_page_bits = TARGET_PAGE_BITS;
 LOG_MMU(Check segment v= TARGET_FMT_lx  %d  TARGET_FMT_lx  nip=
 TARGET_FMT_lx  lr= TARGET_FMT_lx
@@ -857,29 +852,26 @@ static inline int get_segment(CPUState *env, mmu_ctx_t 
*ctx,
         if (type != ACCESS_CODE || ctx->nx == 0) {
             /* Page address translation */
             /* Primary table address */
-            sdr = env->sdr1;
             pgidx = (eaddr & page_mask) >> target_page_bits;
 #if 

[Qemu-devel] [PATCH 13/27] Start implementing pSeries logical partition machine

2011-03-24 Thread David Gibson
This patch adds a pseries machine to qemu.  This aims to emulate a
logical partition on an IBM pSeries machine, compliant to the
PowerPC Architecture Platform Requirements (PAPR) document.

This initial version is quite limited: it implements a basic machine
and PAPR hypercall emulation.  So far only one hypercall is present -
H_PUT_TERM_CHAR - so that a (write-only) console is available.

Multiple CPUs are permitted, with SMP entry handled kexec() style.

The machine so far more resembles an old POWER4 style full system
partition rather than a modern LPAR, in that the guest manages the
page tables directly, rather than via hypercalls.

The machine requires qemu to be configured with --enable-fdt.  The
machine can (so far) only be booted with -kernel - i.e. no partition
firmware is provided.

Signed-off-by: David Gibson d...@au1.ibm.com
---
 Makefile.target  |4 +
 hw/spapr.c   |  313 ++
 hw/spapr.h   |  257 
 hw/spapr_hcall.c |   43 
 4 files changed, 617 insertions(+), 0 deletions(-)
 create mode 100644 hw/spapr.c
 create mode 100644 hw/spapr.h
 create mode 100644 hw/spapr_hcall.c

diff --git a/Makefile.target b/Makefile.target
index 62b102a..ccf090b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,6 +231,10 @@ obj-ppc-y += ppc_prep.o
 obj-ppc-y += ppc_oldworld.o
 # NewWorld PowerMac
 obj-ppc-y += ppc_newworld.o
+# IBM pSeries (sPAPR)i
+ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
+obj-ppc-y += spapr.o spapr_hcall.o
+endif
 # PowerPC 4xx boards
 obj-ppc-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
 obj-ppc-y += ppc440.o ppc440_bamboo.o
diff --git a/hw/spapr.c b/hw/spapr.c
new file mode 100644
index 000..0deea1b
--- /dev/null
+++ b/hw/spapr.c
@@ -0,0 +1,313 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * Copyright (c) 2004-2007 Fabrice Bellard
+ * Copyright (c) 2007 Jocelyn Mayer
+ * Copyright (c) 2010 David Gibson, IBM Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ */
+#include "sysemu.h"
+#include "qemu-char.h"
+#include "hw.h"
+#include "elf.h"
+
+#include "hw/boards.h"
+#include "hw/ppc.h"
+#include "hw/loader.h"
+
+#include "hw/spapr.h"
+
+#include <libfdt.h>
+
+#define KERNEL_LOAD_ADDR0x
+#define INITRD_LOAD_ADDR0x0280
+#define FDT_MAX_SIZE0x1
+
+#define TIMEBASE_FREQ   51200ULL
+
+#define MAX_CPUS32
+
+sPAPREnvironment *spapr;
+
+static void *spapr_create_fdt(int *fdt_size, ram_addr_t ramsize,
+  const char *cpu_model, CPUState *envs[],
+  sPAPREnvironment *spapr,
+  target_phys_addr_t initrd_base,
+  target_phys_addr_t initrd_size,
+  const char *kernel_cmdline)
+{
+void *fdt;
+uint64_t mem_reg_property[] = { 0, cpu_to_be64(ramsize) };
+uint32_t start_prop = cpu_to_be32(initrd_base);
+uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
+int i;
+char *modelname;
+
+#define _FDT(exp) \
+    do { \
+        int ret = (exp);                                           \
+        if (ret < 0) {                                             \
+            fprintf(stderr, "qemu: error creating device tree: %s: %s\n", \
+                    #exp, fdt_strerror(ret));                      \
+            exit(1);                                               \
+        }                                                          \
+    } while (0)
+
+    fdt = qemu_mallocz(FDT_MAX_SIZE);
+    _FDT((fdt_create(fdt, FDT_MAX_SIZE)));
+
+    _FDT((fdt_finish_reservemap(fdt)));
+
+    /* Root node */
+    _FDT((fdt_begin_node(fdt, "")));
+    _FDT((fdt_property_string(fdt, "device_type", "chrp")));
+    _FDT((fdt_property_string(fdt, "model", "qemu,emulated-pSeries-LPAR")));
+
+

[Qemu-devel] [PATCH 15/27] Virtual hash page table handling on pSeries machine

2011-03-24 Thread David Gibson
On pSeries logical partitions, excepting the old POWER4-style full system
partitions, the guest does not have direct access to the hardware page
table.  Instead, the pagetable exists in hypervisor memory, and the guest
must manipulate it with hypercalls.

However, our current pSeries emulation more closely resembles the old
style where the guest must set up and handle the pagetables itself.  This
patch converts it to act like a modern partition.

This involves two things: first, the hash translation path is modified to
permit the hash table to be stored externally to the emulated machine's
RAM.  The pSeries machine init code configures the CPUs to use this mode.

Secondly, we emulate the PAPR hypercalls for manipulating the external
hashed page table.
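
The sizing used for the external table in the diff below (pteg_shift = 17, size 1 << (pteg_shift + 7)) follows from the PTE group layout. A small sketch of the arithmetic; HPTE_SIZE and both helper names are assumptions introduced for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define HPTES_PER_GROUP 8   /* entries per PTE group, as in the patch */
#define HPTE_SIZE       16  /* bytes per 64-bit hash PTE (assumption) */

/* Size in bytes of a hash table holding 2^pteg_shift PTE groups. */
static uint64_t htab_size_bytes(unsigned pteg_shift)
{
    /* 8 * 16 = 128 = 2^7 bytes per group, hence the "+ 7" shift. */
    return (1ULL << pteg_shift) * HPTES_PER_GROUP * HPTE_SIZE;
}

/* Mask applied to a byte offset so that it stays inside the table. */
static uint64_t htab_mask_bytes(unsigned pteg_shift)
{
    return htab_size_bytes(pteg_shift) - 1;
}
```

For pteg_shift = 17 this gives 2^24 bytes, which is the 16MB mentioned in the code comment, and a mask of 0xFFFFFF.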

Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/spapr.c  |   35 ++-
 hw/spapr_hcall.c|  254 +++
 target-ppc/cpu.h|2 +
 target-ppc/helper.c |   36 ++--
 4 files changed, 315 insertions(+), 12 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index f3d6125..cd05d3f 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -52,12 +52,15 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
   sPAPREnvironment *spapr,
   target_phys_addr_t initrd_base,
   target_phys_addr_t initrd_size,
-  const char *kernel_cmdline)
+  const char *kernel_cmdline,
+  long hash_shift)
 {
 void *fdt;
 uint64_t mem_reg_property[] = { 0, cpu_to_be64(ramsize) };
 uint32_t start_prop = cpu_to_be32(initrd_base);
 uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
+    uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
+    char hypertas_prop[] = "hcall-pft\0hcall-term";
 int i;
 char *modelname;
 int ret;
@@ -145,6 +148,8 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
  * full emu, for kvm we should copy it from the host */
         _FDT((fdt_property_cell(fdt, "clock-frequency", 10)));
         _FDT((fdt_property_cell(fdt, "ibm,slb-size", env->slb_nr)));
+        _FDT((fdt_property(fdt, "ibm,pft-size",
+                           pft_size_prop, sizeof(pft_size_prop))));
         _FDT((fdt_property_string(fdt, "status", "okay")));
         _FDT((fdt_property(fdt, "64-bit", NULL, 0)));
 
@@ -160,6 +165,14 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 
 _FDT((fdt_end_node(fdt)));
 
+    /* RTAS */
+    _FDT((fdt_begin_node(fdt, "rtas")));
+
+    _FDT((fdt_property(fdt, "ibm,hypertas-functions", hypertas_prop,
+                       sizeof(hypertas_prop))));
+
+    _FDT((fdt_end_node(fdt)));
+
     /* vdevice */
     _FDT((fdt_begin_node(fdt, "vdevice")));
 
@@ -208,12 +221,13 @@ static void ppc_spapr_init(ram_addr_t ram_size,
const char *cpu_model)
 {
 CPUState *envs[MAX_CPUS];
-void *fdt;
+void *fdt, *htab;
 int i;
 ram_addr_t ram_offset;
 target_phys_addr_t fdt_addr;
 uint32_t kernel_base, initrd_base;
-long kernel_size, initrd_size;
+long kernel_size, initrd_size, htab_size;
+long pteg_shift = 17;
 int fdt_size;
 
 spapr = qemu_malloc(sizeof(*spapr));
@@ -250,6 +264,18 @@ static void ppc_spapr_init(ram_addr_t ram_size,
     ram_offset = qemu_ram_alloc(NULL, "ppc_spapr.ram", ram_size);
     cpu_register_physical_memory(0, ram_size, ram_offset);
 
+    /* allocate hash page table.  For now we always make this 16mb,
+     * later we should probably make it scale to the size of guest
+     * RAM */
+    htab_size = 1ULL << (pteg_shift + 7);
+    htab = qemu_mallocz(htab_size);
+
+    for (i = 0; i < smp_cpus; i++) {
+        envs[i]->external_htab = htab;
+        envs[i]->htab_base = -1;
+        envs[i]->htab_mask = htab_size - 1;
+    }
+
     spapr->vio_bus = spapr_vio_bus_init();
 
     for (i = 0; i < MAX_SERIAL_PORTS; i++) {
@@ -296,7 +322,8 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 
 /* Prepare the device tree */
     fdt = spapr_create_fdt(&fdt_size, ram_size, cpu_model, envs, spapr,
-   initrd_base, initrd_size, kernel_cmdline);
+   initrd_base, initrd_size, kernel_cmdline,
+   pteg_shift + 7);
 assert(fdt != NULL);
 
 cpu_physical_memory_write(fdt_addr, fdt, fdt_size);
diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index 7623969..5c2dd88 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -1,8 +1,253 @@
 #include "sysemu.h"
 #include "cpu.h"
 #include "qemu-char.h"
+#include "sysemu.h"
+#include "qemu-char.h"
+#include "exec-all.h"
 #include "hw/spapr.h"
 
+#define HPTES_PER_GROUP 8
+
+#define HPTE_V_SSIZE_SHIFT  62
+#define HPTE_V_AVPN_SHIFT   7
+#define HPTE_V_AVPN 0x3f80ULL
+#define HPTE_V_AVPN_VAL(x)  (((x) & HPTE_V_AVPN) >> HPTE_V_AVPN_SHIFT)
+#define 

[Qemu-devel] [PATCH 04/27] Implement PowerPC slbmfee and slbmfev instructions

2011-03-24 Thread David Gibson
For a 64-bit PowerPC target, qemu correctly implements translation
through the segment lookaside buffer.  Likewise it supports the
slbmte instruction which is used to load entries into the SLB.

However, it does not emulate the slbmfee and slbmfev instructions
which read SLB entries back into registers.  Because these are
only occasionally used in guests (mostly for debugging) we get
away with it.

However, given the recent SLB cleanups, it becomes quite easy to
implement these, and thereby allow, amongst other things, a guest
Linux to use xmon's command to dump the SLB.
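
With the SLB stored as separate ESID and VSID words, reading an entry back reduces to a bounds-checked array access. A simplified standalone sketch, not the helper code itself: cpu_state, slb_entry and the 64-entry array are hypothetical stand-ins for the QEMU structures, and the slot is taken from the low 12 bits of rb as in the diff below.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t esid;
    uint64_t vsid;
} slb_entry;

typedef struct {
    slb_entry slb[64];   /* illustrative capacity */
    int slb_nr;          /* number of entries the CPU model implements */
} cpu_state;

/* slbmfee-like read: returns 0 and stores the ESID word, or -1 if invalid. */
static int load_slb_esid(const cpu_state *env, uint64_t rb, uint64_t *rt)
{
    int slot = rb & 0xfff;

    if (slot >= env->slb_nr) {
        return -1;       /* slot number out of range for this CPU */
    }
    *rt = env->slb[slot].esid;
    return 0;
}

/* slbmfev-like read: same check, but returns the VSID word. */
static int load_slb_vsid(const cpu_state *env, uint64_t rb, uint64_t *rt)
{
    int slot = rb & 0xfff;

    if (slot >= env->slb_nr) {
        return -1;
    }
    *rt = env->slb[slot].vsid;
    return 0;
}
```

A caller would translate the -1 return into whatever exception the instruction is expected to raise; that part is left out here.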

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/cpu.h   |2 ++
 target-ppc/helper.c|   26 ++
 target-ppc/helper.h|2 ++
 target-ppc/op_helper.c |   20 
 target-ppc/translate.c |   31 ++-
 5 files changed, 80 insertions(+), 1 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 36ca342..f293f85 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -779,6 +779,8 @@ void ppc_store_asr (CPUPPCState *env, target_ulong value);
 target_ulong ppc_load_slb (CPUPPCState *env, int slb_nr);
 target_ulong ppc_load_sr (CPUPPCState *env, int sr_nr);
 int ppc_store_slb (CPUPPCState *env, target_ulong rb, target_ulong rs);
+int ppc_load_slb_esid (CPUPPCState *env, target_ulong rb, target_ulong *rt);
+int ppc_load_slb_vsid (CPUPPCState *env, target_ulong rb, target_ulong *rt);
 #endif /* defined(TARGET_PPC64) */
 void ppc_store_sr (CPUPPCState *env, int srnum, target_ulong value);
 #endif /* !defined(CONFIG_USER_ONLY) */
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 452a35c..b9621d2 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -774,6 +774,32 @@ int ppc_store_slb (CPUPPCState *env, target_ulong rb, target_ulong rs)
 
     return 0;
 }
+
+int ppc_load_slb_esid (CPUPPCState *env, target_ulong rb, target_ulong *rt)
+{
+    int slot = rb & 0xfff;
+    ppc_slb_t *slb = &env->slb[slot];
+
+    if (slot >= env->slb_nr) {
+        return -1;
+    }
+
+    *rt = slb->esid;
+    return 0;
+}
+
+int ppc_load_slb_vsid (CPUPPCState *env, target_ulong rb, target_ulong *rt)
+{
+    int slot = rb & 0xfff;
+    ppc_slb_t *slb = &env->slb[slot];
+
+    if (slot >= env->slb_nr) {
+        return -1;
+    }
+
+    *rt = slb->vsid;
+    return 0;
+}
 #endif /* defined(TARGET_PPC64) */
 
 /* Perform segment based translation */
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index d512cb0..1a69cf8 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -341,6 +341,8 @@ DEF_HELPER_FLAGS_0(tlbia, TCG_CALL_CONST, void)
 DEF_HELPER_FLAGS_1(tlbie, TCG_CALL_CONST, void, tl)
 #if defined(TARGET_PPC64)
 DEF_HELPER_FLAGS_2(store_slb, TCG_CALL_CONST, void, tl, tl)
+DEF_HELPER_1(load_slb_esid, tl, tl)
+DEF_HELPER_1(load_slb_vsid, tl, tl)
 DEF_HELPER_FLAGS_0(slbia, TCG_CALL_CONST, void)
 DEF_HELPER_FLAGS_1(slbie, TCG_CALL_CONST, void, tl)
 #endif
diff --git a/target-ppc/op_helper.c b/target-ppc/op_helper.c
index bf41627..bdb1f17 100644
--- a/target-ppc/op_helper.c
+++ b/target-ppc/op_helper.c
@@ -3753,6 +3753,26 @@ void helper_store_slb (target_ulong rb, target_ulong rs)
 }
 }
 
+target_ulong helper_load_slb_esid (target_ulong rb)
+{
+    target_ulong rt;
+
+    if (ppc_load_slb_esid(env, rb, &rt) < 0) {
+        helper_raise_exception_err(POWERPC_EXCP_PROGRAM, POWERPC_EXCP_INVAL);
+    }
+    return rt;
+}
+
+target_ulong helper_load_slb_vsid (target_ulong rb)
+{
+    target_ulong rt;
+
+    if (ppc_load_slb_vsid(env, rb, &rt) < 0) {
+        helper_raise_exception_err(POWERPC_EXCP_PROGRAM, POWERPC_EXCP_INVAL);
+    }
+    return rt;
+}
+
 void helper_slbia (void)
 {
 ppc_slb_invalidate_all(env);
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 3d265e3..0b6bfe7 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -4227,6 +4227,33 @@ static void gen_slbmte(DisasContext *ctx)
 #endif
 }
 
+static void gen_slbmfee(DisasContext *ctx)
+{
+#if defined(CONFIG_USER_ONLY)
+    gen_inval_exception(ctx, POWERPC_EXCP_PRIV_REG);
+#else
+    if (unlikely(!ctx->mem_idx)) {
+        gen_inval_exception(ctx, POWERPC_EXCP_PRIV_REG);
+        return;
+    }
+    gen_helper_load_slb_esid(cpu_gpr[rS(ctx->opcode)],
+                             cpu_gpr[rB(ctx->opcode)]);
+#endif
+}
+
+static void gen_slbmfev(DisasContext *ctx)
+{
+#if defined(CONFIG_USER_ONLY)
+    gen_inval_exception(ctx, POWERPC_EXCP_PRIV_REG);
+#else
+    if (unlikely(!ctx->mem_idx)) {
+        gen_inval_exception(ctx, POWERPC_EXCP_PRIV_REG);
+        return;
+    }
+    gen_helper_load_slb_vsid(cpu_gpr[rS(ctx->opcode)],
+                             cpu_gpr[rB(ctx->opcode)]);
+#endif
+}
 #endif /* defined(TARGET_PPC64) */
 
 /***  Lookaside buffer management  ***/
@@ -8300,7 +8327,9 @@ GEN_HANDLER2(mfsrin_64b, "mfsrin", 0x1F, 0x13, 0x14, 0x001F0001,
 GEN_HANDLER2(mtsr_64b, mtsr, 0x1F, 0x12, 0x06, 0x0010F801, 

[Qemu-devel] [PATCH 11/27] Support 1T segments on ppc

2011-03-24 Thread David Gibson
Traditionally, the segments used for the two-stage translation used on
powerpc MMUs were 256MB in size.  This was the only option on all hash
page table based 32-bit powerpc cpus, and on the earlier 64-bit hash page
table based cpus.  However, newer 64-bit cpus also permit 1TB segments.

This patch adds support for 1TB segment translation to the qemu code.
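
A rough sketch of the matching rule, for readers who want to see it outside the diff: an SLB entry matches an effective address if its ESID equals the address masked to the entry's own segment size.  The names below are hypothetical, and the bit positions (valid bit at bit 27, segment size in the top two bits of the VSID word) are my reading of the architecture rather than something spelled out in this message, so treat them as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEG_SHIFT_256M  28
#define SEG_SHIFT_1T    40
#define ESID_V          (1ULL << 27)            /* assumed "entry valid" bit */
#define VSID_B_MASK     0xc000000000000000ULL   /* assumed segment size field */
#define VSID_B_256M     0x0000000000000000ULL
#define VSID_B_1T       0x4000000000000000ULL

/* Does an SLB entry (esid, vsid words) cover effective address eaddr? */
static bool slb_entry_matches(uint64_t esid, uint64_t vsid, uint64_t eaddr)
{
    uint64_t want_256m = (eaddr & ~((1ULL << SEG_SHIFT_256M) - 1)) | ESID_V;
    uint64_t want_1t = (eaddr & ~((1ULL << SEG_SHIFT_1T) - 1)) | ESID_V;
    uint64_t size = vsid & VSID_B_MASK;

    /* The size field must agree with the mask used to build the ESID. */
    assert((VSID_B_256M & ~VSID_B_MASK) == 0 && (VSID_B_1T & ~VSID_B_MASK) == 0);
    return (esid == want_256m && size == VSID_B_256M)
        || (esid == want_1t && size == VSID_B_1T);
}
```

Comparing against both candidate ESIDs on every lookup keeps the loop simple; it relies on 1T entries never being stored on an MMU that lacks 1T support.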

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/cpu.h|7 +++
 target-ppc/helper.c |   50 ++
 2 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index fd2dfcd..10341b3 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -114,6 +114,7 @@ enum powerpc_mmu_t {
 POWERPC_MMU_601= 0x000A,
 #if defined(TARGET_PPC64)
 #define POWERPC_MMU_64   0x0001
+#define POWERPC_MMU_1TSEG0x0002
 /* 64 bits PowerPC MMU */
 POWERPC_MMU_64B= POWERPC_MMU_64 | 0x0001,
 /* 620 variant (no segment exceptions) */
@@ -382,9 +383,11 @@ struct ppc_slb_t {
 
 /* Bits in the SLB VSID word */
 #define SLB_VSID_SHIFT  12
+#define SLB_VSID_SHIFT_1T   24
 #define SLB_VSID_SSIZE_SHIFT    62
 #define SLB_VSID_B  0xc000ULL
 #define SLB_VSID_B_256M 0xULL
+#define SLB_VSID_B_1T   0x4000ULL
 #define SLB_VSID_VSID   0x3000ULL
 #define SLB_VSID_PTEM   (SLB_VSID_B | SLB_VSID_VSID)
 #define SLB_VSID_KS 0x0800ULL
@@ -398,6 +401,10 @@ struct ppc_slb_t {
 #define SEGMENT_SHIFT_256M  28
 #define SEGMENT_MASK_256M   (~((1ULL << SEGMENT_SHIFT_256M) - 1))
 
+#define SEGMENT_SHIFT_1T    40
+#define SEGMENT_MASK_1T     (~((1ULL << SEGMENT_SHIFT_1T) - 1))
+
+
 /*/
 /* Machine state register bits definition*/
 #define MSR_SF   63 /* Sixty-four-bit modehflags */
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index ae8001c..6712fce 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -675,19 +675,26 @@ static inline int find_pte(CPUState *env, mmu_ctx_t *ctx, int h, int rw,
 #if defined(TARGET_PPC64)
 static inline ppc_slb_t *slb_lookup(CPUPPCState *env, target_ulong eaddr)
 {
-    uint64_t esid;
+    uint64_t esid_256M, esid_1T;
     int n;
 
     LOG_SLB("%s: eaddr " TARGET_FMT_lx "\n", __func__, eaddr);
 
-    esid = (eaddr & SEGMENT_MASK_256M) | SLB_ESID_V;
+    esid_256M = (eaddr & SEGMENT_MASK_256M) | SLB_ESID_V;
+    esid_1T = (eaddr & SEGMENT_MASK_1T) | SLB_ESID_V;
 
     for (n = 0; n < env->slb_nr; n++) {
         ppc_slb_t *slb = &env->slb[n];
 
         LOG_SLB("%s: slot %d %016" PRIx64 " %016"
                     PRIx64 "\n", __func__, n, slb->esid, slb->vsid);
-        if (slb->esid == esid) {
+        /* We check for 1T matches on all MMUs here - if the MMU
+         * doesn't have 1T segment support, we will have prevented 1T
+         * entries from being inserted in the slbmte code. */
+        if (((slb->esid == esid_256M) &&
+             ((slb->vsid & SLB_VSID_B) == SLB_VSID_B_256M))
+            || ((slb->esid == esid_1T) &&
+                ((slb->vsid & SLB_VSID_B) == SLB_VSID_B_1T))) {
             return slb;
         }
     }
@@ -740,14 +747,20 @@ void ppc_slb_invalidate_one (CPUPPCState *env, uint64_t T0)
 int ppc_store_slb (CPUPPCState *env, target_ulong rb, target_ulong rs)
 {
     int slot = rb & 0xfff;
-    uint64_t esid = rb & ~0xfff;
     ppc_slb_t *slb = &env->slb[slot];
 
-    if (slot >= env->slb_nr) {
-        return -1;
+    if (rb & (0x1000 - env->slb_nr)) {
+        return -1; /* Reserved bits set or slot too high */
+    }
+    if (rs & (SLB_VSID_B & ~SLB_VSID_B_1T)) {
+        return -1; /* Bad segment size */
+    }
+    if ((rs & SLB_VSID_B) && !(env->mmu_model & POWERPC_MMU_1TSEG)) {
+        return -1; /* 1T segment on MMU that doesn't support it */
     }
 
-    slb->esid = esid;
+    /* Mask out the slot number as we store the entry */
+    slb->esid = rb & (SLB_ESID_ESID | SLB_ESID_V);
     slb->vsid = rs;
 
 LOG_SLB(%s: %d  TARGET_FMT_lx  -  TARGET_FMT_lx  = %016 PRIx64
@@ -799,6 +812,7 @@ static inline int get_segment(CPUState *env, mmu_ctx_t *ctx,
     if (env->mmu_model & POWERPC_MMU_64) {
         ppc_slb_t *slb;
         target_ulong pageaddr;
+        int segment_bits;
 
         LOG_MMU("Check SLBs\n");
         slb = slb_lookup(env, eaddr);
@@ -806,7 +820,14 @@ static inline int get_segment(CPUState *env, mmu_ctx_t *ctx,
             return -5;
         }
 
-        vsid = (slb->vsid & SLB_VSID_VSID) >> SLB_VSID_SHIFT;
+        if (slb->vsid & SLB_VSID_B) {
+            vsid = (slb->vsid & SLB_VSID_VSID) >> SLB_VSID_SHIFT_1T;
+            segment_bits = 40;
+        } else {
+            vsid = (slb->vsid & SLB_VSID_VSID) >> SLB_VSID_SHIFT;
+            segment_bits = 28;
+        }
+
 

[Qemu-devel] [PATCH 12/27] Add POWER7 support for ppc

2011-03-24 Thread David Gibson
This adds emulation support for the recent POWER7 cpu to qemu.  It's far
from perfect - it's missing a number of POWER7 features so far, including
any support for VSX or decimal floating point instructions.  However, it's
close enough to boot a kernel with the POWER7 PVR.

Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/ppc.c|   35 ++
 hw/ppc.h|1 +
 target-ppc/cpu.h|   16 ++
 target-ppc/helper.c |6 ++
 target-ppc/translate_init.c |  107 +++
 5 files changed, 165 insertions(+), 0 deletions(-)

diff --git a/hw/ppc.c b/hw/ppc.c
index b55a848..dabb816 100644
--- a/hw/ppc.c
+++ b/hw/ppc.c
@@ -247,6 +247,41 @@ void ppc970_irq_init (CPUState *env)
     env->irq_inputs = (void **)qemu_allocate_irqs(ppc970_set_irq, env,
                                                   PPC970_INPUT_NB);
 }
+
+/* POWER7 internal IRQ controller */
+static void power7_set_irq (void *opaque, int pin, int level)
+{
+    CPUState *env = opaque;
+    int cur_level;
+
+    LOG_IRQ("%s: env %p pin %d level %d\n", __func__,
+                env, pin, level);
+    cur_level = (env->irq_input_state >> pin) & 1;
+
+    switch (pin) {
+    case POWER7_INPUT_INT:
+        /* Level sensitive - active high */
+        LOG_IRQ("%s: set the external IRQ state to %d\n",
+                __func__, level);
+        ppc_set_irq(env, PPC_INTERRUPT_EXT, level);
+        break;
+    default:
+        /* Unknown pin - do nothing */
+        LOG_IRQ("%s: unknown IRQ pin %d\n", __func__, pin);
+        return;
+    }
+    if (level) {
+        env->irq_input_state |= 1 << pin;
+    } else {
+        env->irq_input_state &= ~(1 << pin);
+    }
+}
+
+void ppcPOWER7_irq_init (CPUState *env)
+{
+    env->irq_inputs = (void **)qemu_allocate_irqs(power7_set_irq, env,
+                                                  POWER7_INPUT_NB);
+}
 #endif /* defined(TARGET_PPC64) */
 
 /* PowerPC 40x internal IRQ controller */
diff --git a/hw/ppc.h b/hw/ppc.h
index 34f54cf..3ccf134 100644
--- a/hw/ppc.h
+++ b/hw/ppc.h
@@ -36,6 +36,7 @@ void ppc40x_irq_init (CPUState *env);
 void ppce500_irq_init (CPUState *env);
 void ppc6xx_irq_init (CPUState *env);
 void ppc970_irq_init (CPUState *env);
+void ppcPOWER7_irq_init (CPUState *env);
 
 /* PPC machines for OpenBIOS */
 enum {
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 10341b3..25d0658 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -119,6 +119,8 @@ enum powerpc_mmu_t {
 POWERPC_MMU_64B= POWERPC_MMU_64 | 0x0001,
 /* 620 variant (no segment exceptions) */
 POWERPC_MMU_620= POWERPC_MMU_64 | 0x0002,
+/* Architecture 2.06 variant   */
+POWERPC_MMU_2_06   = POWERPC_MMU_64 | POWERPC_MMU_1TSEG | 0x0003,
 #endif /* defined(TARGET_PPC64) */
 };
 
@@ -154,6 +156,8 @@ enum powerpc_excp_t {
 #if defined(TARGET_PPC64)
 /* PowerPC 970 exception model  */
 POWERPC_EXCP_970,
+/* POWER7 exception model   */
+POWERPC_EXCP_POWER7,
 #endif /* defined(TARGET_PPC64) */
 };
 
@@ -289,6 +293,8 @@ enum powerpc_input_t {
 PPC_FLAGS_INPUT_405,
 /* PowerPC 970 bus  */
 PPC_FLAGS_INPUT_970,
+/* PowerPC POWER7 bus   */
+PPC_FLAGS_INPUT_POWER7,
 /* PowerPC 401 bus  */
 PPC_FLAGS_INPUT_401,
 /* Freescale RCPU bus   */
@@ -1001,6 +1007,7 @@ static inline void cpu_clone_regs(CPUState *env, target_ulong newsp)
 #define SPR_HSPRG1            (0x131)
 #define SPR_HDSISR            (0x132)
 #define SPR_HDAR              (0x133)
+#define SPR_SPURR             (0x134)
 #define SPR_BOOKE_DBCR0       (0x134)
 #define SPR_IBCR              (0x135)
 #define SPR_PURR              (0x135)
@@ -1625,6 +1632,15 @@ enum {
     PPC970_INPUT_THINT      = 6,
     PPC970_INPUT_NB,
 };
+
+enum {
+    /* POWER7 input pins */
+    POWER7_INPUT_INT        = 0,
+    /* POWER7 probably has other inputs, but we don't care about them
+     * for any existing machine.  We can wire these up when we need
+     * them */
+    POWER7_INPUT_NB,
+};
 #endif
 
 /* Hardware exceptions definitions */
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 6712fce..278bee4 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -1200,6 +1200,7 @@ static inline int check_physical(CPUState *env, mmu_ctx_t *ctx,
 #if defined(TARGET_PPC64)
 case POWERPC_MMU_620:
 case POWERPC_MMU_64B:
+case POWERPC_MMU_2_06:
 /* Real address are 60 bits long */
 ctx-raddr = 0x0FFFULL;
 ctx-prot |= PAGE_WRITE;
@@ -1277,6 +1278,7 @@ int get_physical_address (CPUState *env, mmu_ctx_t *ctx, target_ulong eaddr,
 #if defined(TARGET_PPC64)
 case POWERPC_MMU_620:
 case POWERPC_MMU_64B:
+case POWERPC_MMU_2_06:
 #endif
 if (ret < 0) {
 /* We didn't match any BAT entry or don't 

[Qemu-devel] [PATCH 16/27] Implement hcall based RTAS for pSeries machines

2011-03-24 Thread David Gibson
On pSeries machines, operating systems can instantiate RTAS (Run-Time
Abstraction Services), a runtime component of the firmware which implements
a number of low-level, infrequently used operations.  On logical partitions
under a hypervisor, many of the RTAS functions require hypervisor
privilege.  For simplicity, therefore, hypervisor systems typically
implement the in-partition RTAS as just a tiny wrapper around a hypercall
which actually implements the various RTAS functions.

This patch implements such a hypercall based RTAS for our emulated pSeries
machine.  A tiny in-partition firmware calls a new hypercall, which
looks up available RTAS services in a table.
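
The table lookup can be pictured roughly as follows.  This is a standalone sketch, not the contents of hw/spapr_rtas.c: the token base, table size, callback signature and the "demo-sum" service are all made up for illustration, and real RTAS arguments live in guest memory rather than in host arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOKEN_BASE 0x2000           /* hypothetical first token value */
#define TOKEN_MAX  16               /* hypothetical table capacity */

typedef void (*rtas_fn)(uint32_t nargs, const uint32_t *args,
                        uint32_t nret, uint32_t *rets);

static struct {
    const char *name;
    rtas_fn fn;
} rtas_table[TOKEN_MAX];
static int rtas_count;

/* Add a service to the table; returns its token, or -1 if the table is full. */
static int rtas_register(const char *name, rtas_fn fn)
{
    assert(name != NULL && fn != NULL);
    if (rtas_count >= TOKEN_MAX) {
        return -1;
    }
    rtas_table[rtas_count].name = name;
    rtas_table[rtas_count].fn = fn;
    return TOKEN_BASE + rtas_count++;
}

/* What the hypercall does: map the token to a table slot and call it. */
static int rtas_dispatch(uint32_t token, uint32_t nargs, const uint32_t *args,
                         uint32_t nret, uint32_t *rets)
{
    if (token < TOKEN_BASE || token - TOKEN_BASE >= (uint32_t)rtas_count) {
        return -1;                  /* unknown service */
    }
    rtas_table[token - TOKEN_BASE].fn(nargs, args, nret, rets);
    return 0;
}

/* Made-up example service: rets[0] is the status, rets[1] the sum. */
static void rtas_demo_sum(uint32_t nargs, const uint32_t *args,
                          uint32_t nret, uint32_t *rets)
{
    if (nargs != 2 || nret != 2) {
        if (nret > 0) {
            rets[0] = (uint32_t)-3; /* parameter error */
        }
        return;
    }
    rets[0] = 0;
    rets[1] = args[0] + args[1];
}
```

Handing out tokens at registration time means the in-partition stub never needs to know service names; the guest learns the tokens from the device tree.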

Signed-off-by: David Gibson d...@au1.ibm.com
---
 Makefile|3 +-
 Makefile.target |2 +-
 configure   |4 +-
 hw/spapr.c  |   26 +++-
 hw/spapr.h  |   22 +++
 hw/spapr_hcall.c|   15 +
 hw/spapr_rtas.c |  131 +++
 pc-bios/spapr-rtas.bin  |  Bin 0 -> 20 bytes
 pc-bios/spapr-rtas/Makefile |   24 +++
 pc-bios/spapr-rtas/spapr-rtas.S |   36 +++
 10 files changed, 257 insertions(+), 6 deletions(-)
 create mode 100644 hw/spapr_rtas.c
 create mode 100644 pc-bios/spapr-rtas.bin
 create mode 100644 pc-bios/spapr-rtas/Makefile
 create mode 100644 pc-bios/spapr-rtas/spapr-rtas.S

diff --git a/Makefile b/Makefile
index 89e88b4..e0b3fea 100644
--- a/Makefile
+++ b/Makefile
@@ -213,7 +213,8 @@ pxe-ne2k_pci.bin pxe-pcnet.bin \
 pxe-rtl8139.bin pxe-virtio.bin \
 bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
 multiboot.bin linuxboot.bin \
-s390-zipl.rom
+s390-zipl.rom \
+spapr-rtas.bin
 else
 BLOBS=
 endif
diff --git a/Makefile.target b/Makefile.target
index cf12691..a53d99f 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -233,7 +233,7 @@ obj-ppc-y += ppc_oldworld.o
 obj-ppc-y += ppc_newworld.o
 # IBM pSeries (sPAPR)
 ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
-obj-ppc-y += spapr.o spapr_hcall.o spapr_vio.o
+obj-ppc-y += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
 obj-ppc-y += spapr_vty.o
 endif
 # PowerPC 4xx boards
diff --git a/configure b/configure
index 5a5827f..7d8d890 100755
--- a/configure
+++ b/configure
@@ -2461,7 +2461,9 @@ if test \( "$cpu" = "i386" -o "$cpu" = "x86_64" \) -a \
         "$softmmu" = yes ; then
   roms="optionrom"
 fi
-
+if test "$cpu" = "ppc64" ; then
+  roms="$roms spapr-rtas"
+fi
 
 echo Install prefix$prefix
 echo BIOS directory`eval echo $datadir`
diff --git a/hw/spapr.c b/hw/spapr.c
index cd05d3f..ff1eb3b 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -40,6 +40,7 @@
 #define KERNEL_LOAD_ADDR0x
 #define INITRD_LOAD_ADDR0x0280
 #define FDT_MAX_SIZE0x1
+#define RTAS_MAX_SIZE   0x1
 
 #define TIMEBASE_FREQ   51200ULL
 
@@ -53,6 +54,8 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t ramsize,
                               target_phys_addr_t initrd_base,
                               target_phys_addr_t initrd_size,
                               const char *kernel_cmdline,
+                              target_phys_addr_t rtas_addr,
+                              target_phys_addr_t rtas_size,
                               long hash_shift)
 {
     void *fdt;
@@ -195,6 +198,12 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t ramsize,
         exit(1);
     }
 
+    /* RTAS */
+    ret = spapr_rtas_device_tree_setup(fdt, rtas_addr, rtas_size);
+    if (ret < 0) {
+        fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
+    }
+
     _FDT((fdt_pack(fdt)));
 
     *fdt_size = fdt_totalsize(fdt);
@@ -224,11 +233,12 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 void *fdt, *htab;
 int i;
 ram_addr_t ram_offset;
-target_phys_addr_t fdt_addr;
+target_phys_addr_t fdt_addr, rtas_addr;
 uint32_t kernel_base, initrd_base;
-long kernel_size, initrd_size, htab_size;
+long kernel_size, initrd_size, htab_size, rtas_size;
 long pteg_shift = 17;
 int fdt_size;
+char *filename;
 
 spapr = qemu_malloc(sizeof(*spapr));
 cpu_ppc_hypercall = emulate_spapr_hypercall;
@@ -237,6 +247,8 @@ static void ppc_spapr_init(ram_addr_t ram_size,
  * 2GB, so that it can be processed with 32-bit code if
  * necessary */
 fdt_addr = MIN(ram_size, 0x8000) - FDT_MAX_SIZE;
+/* RTAS goes just below that */
+rtas_addr = fdt_addr - RTAS_MAX_SIZE;
 
 /* init CPUs */
 if (cpu_model == NULL) {
@@ -276,6 +288,14 @@ static void ppc_spapr_init(ram_addr_t ram_size,
         envs[i]->htab_mask = htab_size - 1;
     }
 
+    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, "spapr-rtas.bin");
+    rtas_size = load_image_targphys(filename, rtas_addr, ram_size - rtas_addr);
+    if (rtas_size < 0) {
+        hw_error("qemu: could not load LPAR rtas '%s'\n", filename);
+        exit(1);
+    }
+  

[Qemu-devel] [PATCH 22/27] Implement sPAPR Virtual LAN (ibmveth)

2011-03-24 Thread David Gibson
This patch implements the PAPR specified Inter Virtual Machine Logical
LAN; that is the virtual hardware used by the Linux ibmveth driver.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: David Gibson d...@au1.ibm.com

Conflicts:

Makefile.target
---
 Makefile.target |2 +-
 hw/spapr.c  |   21 ++-
 hw/spapr_llan.c |  521 +++
 hw/spapr_vio.h  |3 +
 4 files changed, 545 insertions(+), 2 deletions(-)
 create mode 100644 hw/spapr_llan.c

diff --git a/Makefile.target b/Makefile.target
index c795428..cd7bb41 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -234,7 +234,7 @@ obj-ppc-y += ppc_newworld.o
 # IBM pSeries (sPAPR)
 ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
 obj-ppc-y += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
-obj-ppc-y += xics.o spapr_vty.o
+obj-ppc-y += xics.o spapr_vty.o spapr_llan.o
 endif
 # PowerPC 4xx boards
 obj-ppc-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
diff --git a/hw/spapr.c b/hw/spapr.c
index 69759c5..18660dc 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -27,6 +27,7 @@
 #include "sysemu.h"
 #include "hw.h"
 #include "elf.h"
+#include "net.h"
 
 #include "hw/boards.h"
 #include "hw/ppc.h"
@@ -321,7 +322,7 @@ static void ppc_spapr_init(ram_addr_t ram_size,
     qemu_free(filename);
 
     /* Set up Interrupt Controller */
-    spapr->icp = xics_system_init(smp_cpus, envs, MAX_SERIAL_PORTS);
+    spapr->icp = xics_system_init(smp_cpus, envs, MAX_SERIAL_PORTS + nb_nics);
 
     /* Set up VIO bus */
     spapr->vio_bus = spapr_vio_bus_init();
@@ -333,6 +334,24 @@ static void ppc_spapr_init(ram_addr_t ram_size,
         }
     }
 
+    for (i = 0; i < nb_nics; i++, irq++) {
+        NICInfo *nd = &nd_table[i];
+
+        if (!nd->model) {
+            nd->model = qemu_strdup("ibmveth");
+        }
+
+        if (strcmp(nd->model, "ibmveth") == 0) {
+            spapr_vlan_create(spapr->vio_bus, 0x1000 + i, nd,
+                              xics_find_qirq(spapr->icp, irq), irq);
+        } else {
+            fprintf(stderr, "pSeries (sPAPR) platform does not support "
+                    "NIC model '%s' (only ibmveth is supported)\n",
+                    nd->model);
+            exit(1);
+        }
+    }
+
     if (kernel_filename) {
         uint64_t lowaddr = 0;
 
diff --git a/hw/spapr_llan.c b/hw/spapr_llan.c
new file mode 100644
index 000..1d83fd5
--- /dev/null
+++ b/hw/spapr_llan.c
@@ -0,0 +1,521 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * PAPR Inter-VM Logical Lan, aka ibmveth
+ *
+ * Copyright (c) 2010,2011 David Gibson, IBM Corporation.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ */
+#include "hw.h"
+#include "net.h"
+#include "hw/qdev.h"
+#include "hw/spapr.h"
+#include "hw/spapr_vio.h"
+
+#include <libfdt.h>
+
+#define ETH_ALEN        6
+#define MAX_PACKET_SIZE 65536
+
+/*#define DEBUG*/
+
+#ifdef DEBUG
+#define dprintf(fmt...) do { fprintf(stderr, fmt); } while (0)
+#else
+#define dprintf(fmt...)
+#endif
+
+/*
+ * Virtual LAN device
+ */
+
+typedef uint64_t vlan_bd_t;
+
+#define VLAN_BD_VALID0x8000ULL
+#define VLAN_BD_TOGGLE   0x4000ULL
+#define VLAN_BD_NO_CSUM  0x0200ULL
+#define VLAN_BD_CSUM_GOOD0x0100ULL
+#define VLAN_BD_LEN_MASK 0x00ffULL
+#define VLAN_BD_LEN(bd)  (((bd) & VLAN_BD_LEN_MASK) >> 32)
+#define VLAN_BD_ADDR_MASK0xULL
+#define VLAN_BD_ADDR(bd) ((bd) & VLAN_BD_ADDR_MASK)
+
+#define VLAN_VALID_BD(addr, len) (VLAN_BD_VALID | \
+                                  (((len) << 32) & VLAN_BD_LEN_MASK) |  \
+                                  (addr & VLAN_BD_ADDR_MASK))
+
+#define VLAN_RXQC_TOGGLE     0x80
+#define VLAN_RXQC_VALID      0x40
+#define VLAN_RXQC_NO_CSUM    0x02
+#define VLAN_RXQC_CSUM_GOOD  0x01
+
+#define VLAN_RQ_ALIGNMENT    16
+#define VLAN_RXQ_BD_OFF      0
+#define VLAN_FILTER_BD_OFF   8
+#define 

[Qemu-devel] [PATCH 19/27] Add PAPR H_VIO_SIGNAL hypercall and infrastructure for VIO interrupts

2011-03-24 Thread David Gibson
This patch adds infrastructure to support interrupts from PAPR virtual IO
devices.  This includes correctly advertising those interrupts in the
device tree, and implementing the H_VIO_SIGNAL hypercall, used to
enable and disable individual device interrupts.
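
As a rough model of the hypercall's semantics: the guest names a device by its unit address and passes a mode word, which is accepted only if it sets no bits outside that device's signal mask.  The sketch below uses hypothetical types and a flat device array in place of the VIO bus; the H_PARAMETER value of -4 is my understanding of the PAPR return codes and should be read as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define H_SUCCESS     0
#define H_PARAMETER  (-4)           /* assumed PAPR error code */

typedef struct {
    uint32_t reg;                   /* unit address of the device */
    uint64_t signal_mask;           /* mode bits this device supports */
    uint64_t signal_state;          /* mode currently selected by the guest */
} vio_dev_t;

/* Linear search standing in for the bus lookup by unit address. */
static vio_dev_t *vio_find_by_reg(vio_dev_t *devs, size_t n, uint32_t reg)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (devs[i].reg == reg) {
            return &devs[i];
        }
    }
    return NULL;
}

/* H_VIO_SIGNAL-like handler: validate, then record the requested mode. */
static long h_vio_signal_sketch(vio_dev_t *devs, size_t n,
                                uint64_t reg, uint64_t mode)
{
    vio_dev_t *dev = vio_find_by_reg(devs, n, (uint32_t)reg);

    if (!dev) {
        return H_PARAMETER;         /* no such unit */
    }
    if (mode & ~dev->signal_mask) {
        return H_PARAMETER;         /* unsupported mode bits */
    }
    dev->signal_state = mode;
    return H_SUCCESS;
}
```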

Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/spapr.c |2 +-
 hw/spapr_vio.c |   37 +
 hw/spapr_vio.h |6 ++
 3 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index b8244c9..0f6f40b 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -64,7 +64,7 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t ramsize,
     uint32_t start_prop = cpu_to_be32(initrd_base);
     uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
     uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
-    char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr";
+    char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt";
     uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
     int i;
     char *modelname;
diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index 10acb4c..605079c 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -105,6 +105,16 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
         }
     }
 
+    if (dev->qirq) {
+        uint32_t ints_prop[] = {cpu_to_be32(dev->vio_irq_num), 0};
+
+        ret = fdt_setprop(fdt, node_off, "interrupts", ints_prop,
+                          sizeof(ints_prop));
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     if (info->devnode) {
         ret = (info->devnode)(dev, fdt, node_off);
         if (ret < 0) {
@@ -140,6 +150,30 @@ void spapr_vio_bus_register_withprop(VIOsPAPRDeviceInfo *info)
     qdev_register(&info->qdev);
 }
 
+static target_ulong h_vio_signal(CPUState *env, sPAPREnvironment *spapr,
+                                 target_ulong opcode,
+                                 target_ulong *args)
+{
+    target_ulong reg = args[0];
+    target_ulong mode = args[1];
+    VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, reg);
+    VIOsPAPRDeviceInfo *info;
+
+    if (!dev) {
+        return H_PARAMETER;
+    }
+
+    info = (VIOsPAPRDeviceInfo *)dev->qdev.info;
+
+    if (mode & ~info->signal_mask) {
+        return H_PARAMETER;
+    }
+
+    dev->signal_state = mode;
+
+    return H_SUCCESS;
+}
+
 VIOsPAPRBus *spapr_vio_bus_init(void)
 {
     VIOsPAPRBus *bus;
@@ -156,6 +190,9 @@ VIOsPAPRBus *spapr_vio_bus_init(void)
     qbus = qbus_create(&spapr_vio_bus_info, dev, "spapr-vio");
     bus = DO_UPCAST(VIOsPAPRBus, bus, qbus);
 
+    /* hcall-vio */
+    spapr_register_hypercall(H_VIO_SIGNAL, h_vio_signal);
+
     for (qinfo = device_info_list; qinfo; qinfo = qinfo->next) {
         VIOsPAPRDeviceInfo *info = (VIOsPAPRDeviceInfo *)qinfo;
 
diff --git a/hw/spapr_vio.h b/hw/spapr_vio.h
index b164ad3..8a000c6 100644
--- a/hw/spapr_vio.h
+++ b/hw/spapr_vio.h
@@ -24,6 +24,9 @@
 typedef struct VIOsPAPRDevice {
     DeviceState qdev;
     uint32_t reg;
+    qemu_irq qirq;
+    uint32_t vio_irq_num;
+    target_ulong signal_state;
 } VIOsPAPRDevice;
 
 typedef struct VIOsPAPRBus {
@@ -33,6 +36,7 @@ typedef struct VIOsPAPRBus {
 typedef struct {
     DeviceInfo qdev;
     const char *dt_name, *dt_type, *dt_compatible;
+    target_ulong signal_mask;
     int (*init)(VIOsPAPRDevice *dev);
     void (*hcalls)(VIOsPAPRBus *bus);
     int (*devnode)(VIOsPAPRDevice *dev, void *fdt, int node_off);
@@ -43,6 +47,8 @@ extern VIOsPAPRDevice *spapr_vio_find_by_reg(VIOsPAPRBus *bus, uint32_t reg);
 extern void spapr_vio_bus_register_withprop(VIOsPAPRDeviceInfo *info);
 extern int spapr_populate_vdevice(VIOsPAPRBus *bus, void *fdt);
 
+extern int spapr_vio_signal(VIOsPAPRDevice *dev, target_ulong mode);
+
 void vty_putchars(VIOsPAPRDevice *sdev, uint8_t *buf, int len);
 void spapr_vty_create(VIOsPAPRBus *bus,
                       uint32_t reg, CharDriverState *chardev);
-- 
1.7.1




[Qemu-devel] [PATCH 25/27] Add a PAPR TCE-bypass mechanism for the pSeries machine

2011-03-24 Thread David Gibson
From: Ben Herrenschmidt b...@kernel.crashing.org

Usually, PAPR virtual IO devices use a virtual IOMMU mechanism, TCEs,
to mediate all DMA transfers.  While this is necessary for some sorts of
operation, it can be complex to program and slow for others.

This patch implements a mechanism for bypassing TCE translation, treating
IO addresses as plain (guest) physical memory addresses.  This has two
main uses:
 * Simple, but 64-bit aware programs like firmwares can use the VIO devices
without the complexity of TCE setup.
 * The guest OS can optionally use the TCE bypass to improve performance in
suitable situations.

The mechanism used is a per-device flag which disables TCE translation.
The flag is toggled with some (hypervisor-implemented) RTAS methods.
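
To make the effect of the flag concrete, here is a minimal sketch of address translation with and without bypass.  Everything in it is illustrative: the structure, the 4K page size, and in particular the TCE layout (page-aligned address plus a low "valid" bit) are assumptions chosen to keep the example short, not the format used by the real TCE tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_DMA_BYPASS 0x1
#define PAGE_SHIFT      12                          /* assumed 4K IO pages */
#define PAGE_MASK       ((1ULL << PAGE_SHIFT) - 1)
#define TCE_VALID       0x1ULL                      /* hypothetical valid bit */

typedef struct {
    uint32_t flags;
    const uint64_t *tce_table;      /* one entry per IO page */
    size_t tce_entries;
} dma_dev_t;

/* Translate an IO address to a guest physical address.
 * Returns 0 on success, -1 if the address has no valid mapping. */
static int dma_translate(const dma_dev_t *dev, uint64_t ioaddr, uint64_t *paddr)
{
    assert(dev != NULL && paddr != NULL);

    /* Bypass: the IO address already is a guest physical address. */
    if (dev->flags & FLAG_DMA_BYPASS) {
        *paddr = ioaddr;
        return 0;
    }

    uint64_t idx = ioaddr >> PAGE_SHIFT;
    if (idx >= dev->tce_entries) {
        return -1;
    }
    uint64_t tce = dev->tce_table[idx];
    if (!(tce & TCE_VALID)) {
        return -1;
    }
    *paddr = (tce & ~PAGE_MASK) | (ioaddr & PAGE_MASK);
    return 0;
}

/* What the RTAS method boils down to: set or clear the per-device flag. */
static void set_bypass(dma_dev_t *dev, int enable)
{
    if (enable) {
        dev->flags |= FLAG_DMA_BYPASS;
    } else {
        dev->flags &= ~(uint32_t)FLAG_DMA_BYPASS;
    }
}
```

The bypass check comes first, so a device with bypass enabled never touches its TCE table at all.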

Signed-off-by: Ben Herrenschmidt b...@kernel.crashing.org
Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/spapr_vio.c |   82 
 hw/spapr_vio.h |5 +++
 2 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index 8f14fcc..481a804 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -226,6 +226,12 @@ int spapr_tce_dma_write(VIOsPAPRDevice *dev, uint64_t taddr, const void *buf,
             (unsigned long long)taddr, size);
 #endif
 
+    /* Check for bypass */
+    if (dev->flags & VIO_PAPR_FLAG_DMA_BYPASS) {
+        cpu_physical_memory_write(taddr, buf, size);
+        return 0;
+    }
+
     while (size) {
         uint64_t tce;
         uint32_t lsize;
@@ -313,6 +319,12 @@ int spapr_tce_dma_read(VIOsPAPRDevice *dev, uint64_t taddr, void *buf,
             (unsigned long long)taddr, size);
 #endif
 
+    /* Check for bypass */
+    if (dev->flags & VIO_PAPR_FLAG_DMA_BYPASS) {
+        cpu_physical_memory_read(taddr, buf, size);
+        return 0;
+    }
+
     while (size) {
         uint64_t tce;
         uint32_t lsize;
@@ -513,6 +525,72 @@ int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
     return 0;
 }
 
+/* quiesce handling */
+
+static void spapr_vio_quiesce_one(VIOsPAPRDevice *dev)
+{
+    dev->flags &= ~VIO_PAPR_FLAG_DMA_BYPASS;
+
+    if (dev->rtce_table) {
+        size_t size = (dev->rtce_window_size >> SPAPR_VIO_TCE_PAGE_SHIFT)
+            * sizeof(VIOsPAPR_RTCE);
+        memset(dev->rtce_table, 0, size);
+    }
+
+    dev->crq.qladdr = 0;
+    dev->crq.qsize = 0;
+    dev->crq.qnext = 0;
+}
+
+static void rtas_set_tce_bypass(sPAPREnvironment *spapr, uint32_t token,
+                                uint32_t nargs, target_ulong args,
+                                uint32_t nret, target_ulong rets)
+{
+    VIOsPAPRBus *bus = spapr->vio_bus;
+    VIOsPAPRDevice *dev;
+    uint32_t unit, enable;
+
+    if (nargs != 2) {
+        rtas_st(rets, 0, -3);
+        return;
+    }
+    unit = rtas_ld(args, 0);
+    enable = rtas_ld(args, 1);
+    dev = spapr_vio_find_by_reg(bus, unit);
+    if (!dev) {
+        rtas_st(rets, 0, -3);
+        return;
+    }
+    if (enable) {
+        dev->flags |= VIO_PAPR_FLAG_DMA_BYPASS;
+    } else {
+        dev->flags &= ~VIO_PAPR_FLAG_DMA_BYPASS;
+    }
+
+    rtas_st(rets, 0, 0);
+}
+
+static void rtas_quiesce(sPAPREnvironment *spapr, uint32_t token,
+                         uint32_t nargs, target_ulong args,
+                         uint32_t nret, target_ulong rets)
+{
+    VIOsPAPRBus *bus = spapr->vio_bus;
+    DeviceState *qdev;
+    VIOsPAPRDevice *dev = NULL;
+
+    if (nargs != 0) {
+        rtas_st(rets, 0, -3);
+        return;
+    }
+
+    QLIST_FOREACH(qdev, &bus->bus.children, sibling) {
+        dev = (VIOsPAPRDevice *)qdev;
+        spapr_vio_quiesce_one(dev);
+    }
+
+    rtas_st(rets, 0, 0);
+}
+
 static int spapr_vio_busdev_init(DeviceState *qdev, DeviceInfo *qinfo)
 {
     VIOsPAPRDeviceInfo *info = (VIOsPAPRDeviceInfo *)qinfo;
@@ -591,6 +669,10 @@ VIOsPAPRBus *spapr_vio_bus_init(void)
     spapr_register_hypercall(H_SEND_CRQ, h_send_crq);
     spapr_register_hypercall(H_ENABLE_CRQ, h_enable_crq);
 
+    /* RTAS calls */
+    spapr_rtas_register("ibm,set-tce-bypass", rtas_set_tce_bypass);
+    spapr_rtas_register("quiesce", rtas_quiesce);
+
     for (qinfo = device_info_list; qinfo; qinfo = qinfo->next) {
         VIOsPAPRDeviceInfo *info = (VIOsPAPRDeviceInfo *)qinfo;
 
diff --git a/hw/spapr_vio.h b/hw/spapr_vio.h
index b7d0daa..841b043 100644
--- a/hw/spapr_vio.h
+++ b/hw/spapr_vio.h
@@ -48,6 +48,8 @@ typedef struct VIOsPAPR_CRQ {
 typedef struct VIOsPAPRDevice {
     DeviceState qdev;
     uint32_t reg;
+    uint32_t flags;
+#define VIO_PAPR_FLAG_DMA_BYPASS        0x1
     qemu_irq qirq;
     uint32_t vio_irq_num;
     target_ulong signal_state;
@@ -104,4 +106,7 @@ void spapr_vlan_create(VIOsPAPRBus *bus, uint32_t reg, NICInfo *nd,
 void spapr_vscsi_create(VIOsPAPRBus *bus, uint32_t reg,
                         qemu_irq qirq, uint32_t vio_irq_num);
 
+int spapr_tce_set_bypass(uint32_t unit, uint32_t enable);
+void spapr_vio_quiesce(void);
+
 #endif /* _HW_SPAPR_VIO_H */
-- 
1.7.1




[Qemu-devel] [PATCH 23/27] Implement PAPR CRQ hypercalls

2011-03-24 Thread David Gibson
This patch implements the infrastructure and hypercalls necessary for the
PAPR specified CRQ (Command Request Queue) mechanism.  This general
request queueing system is used by many of the PAPR virtual IO devices,
including the virtual scsi adapter.
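
The registration side of the mechanism is mostly argument checking, which can be sketched in isolation.  The sketch below is not the patch code: the structure and return codes are hypothetical, and the limits (at least one 4K page, at most 256M, 4K-aligned base, one queue per device) are the ones the handler in this patch appears to enforce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {                              /* hypothetical status codes */
    CRQ_OK = 0,
    CRQ_BAD_PARAM = -1,
    CRQ_BUSY = -2
};

typedef struct {
    uint64_t qladdr;                /* guest address of the queue */
    uint32_t qsize;                 /* queue length in bytes, 0 = none */
    uint32_t qnext;                 /* offset of the next free entry */
} crq_t;

/* Register a queue for a device after checking size and alignment. */
static int crq_register(crq_t *q, uint64_t addr, uint64_t len)
{
    assert(q != NULL);

    if (len < 0x1000 || len > 0x10000000) {
        return CRQ_BAD_PARAM;       /* too small or bigger than 256M */
    }
    if (addr & 0xfff) {
        return CRQ_BAD_PARAM;       /* not page aligned */
    }
    if (q->qsize) {
        return CRQ_BUSY;            /* a queue is already registered */
    }
    q->qladdr = addr;
    q->qsize = (uint32_t)len;
    q->qnext = 0;
    return CRQ_OK;
}

/* Drop the registration so that a new queue can be set up later. */
static void crq_free(crq_t *q)
{
    q->qladdr = 0;
    q->qsize = 0;
    q->qnext = 0;
}
```

Using a zero qsize as the "no queue" marker is what lets both the free path and the quiesce path reset a device with three plain assignments.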

Signed-off-by: Ben Herrenschmidt b...@kernel.crashing.org
Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/spapr.c   |2 +-
 hw/spapr_vio.c   |  160 ++
 hw/spapr_vio.h   |   12 
 target-ppc/kvm_ppc.h |   11 
 4 files changed, 184 insertions(+), 1 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 18660dc..02a3bbe 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -66,7 +66,7 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t ramsize,
     uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
     uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
     char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt"
-        "\0hcall-tce";
+        "\0hcall-tce\0hcall-vio";
     uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
     int i;
     char *modelname;
diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index 39d77ee..8f14fcc 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -28,6 +28,7 @@
 #include "hw/sysbus.h"
 #include "kvm.h"
 #include "device_tree.h"
+#include "kvm_ppc.h"
 
 #include "hw/spapr.h"
 #include "hw/spapr_vio.h"
@@ -359,6 +360,159 @@ uint64_t ldq_tce(VIOsPAPRDevice *dev, uint64_t taddr)
 return tswap64(val);
 }
 
+/*
+ * CRQ handling
+ */
+static target_ulong h_reg_crq(CPUState *env, sPAPREnvironment *spapr,
+                              target_ulong opcode, target_ulong *args)
+{
+    target_ulong reg = args[0];
+    target_ulong queue_addr = args[1];
+    target_ulong queue_len = args[2];
+    VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, reg);
+
+    if (!dev) {
+        hcall_dprintf("h_reg_crq on non-existent unit 0x"
+                      TARGET_FMT_lx "\n", reg);
+        return H_PARAMETER;
+    }
+
+    /* We can't grok a queue size bigger than 256M for now */
+    if (queue_len < 0x1000 || queue_len > 0x10000000) {
+        hcall_dprintf("h_reg_crq, queue size too small or too big (0x%llx)\n",
+                      (unsigned long long)queue_len);
+        return H_PARAMETER;
+    }
+
+    /* Check queue alignment */
+    if (queue_addr & 0xfff) {
+        hcall_dprintf("h_reg_crq, queue not aligned (0x%llx)\n",
+                      (unsigned long long)queue_addr);
+        return H_PARAMETER;
+    }
+
+    /* Check if device supports CRQs */
+    if (!dev->crq.SendFunc) {
+        return H_NOT_FOUND;
+    }
+
+
+    /* Already a queue ? */
+    if (dev->crq.qsize) {
+        return H_RESOURCE;
+    }
+    dev->crq.qladdr = queue_addr;
+    dev->crq.qsize = queue_len;
+    dev->crq.qnext = 0;
+
+    dprintf("CRQ for dev 0x" TARGET_FMT_lx " registered at 0x"
+            TARGET_FMT_lx "/0x" TARGET_FMT_lx "\n",
+            reg, queue_addr, queue_len);
+    return H_SUCCESS;
+}
+
+static target_ulong h_free_crq(CPUState *env, sPAPREnvironment *spapr,
+                               target_ulong opcode, target_ulong *args)
+{
+    target_ulong reg = args[0];
+    VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, reg);
+
+    if (!dev) {
+        hcall_dprintf("h_free_crq on non-existent unit 0x"
+                      TARGET_FMT_lx "\n", reg);
+        return H_PARAMETER;
+    }
+
+    dev->crq.qladdr = 0;
+    dev->crq.qsize = 0;
+    dev->crq.qnext = 0;
+
+    dprintf("CRQ for dev 0x" TARGET_FMT_lx " freed\n", reg);
+
+    return H_SUCCESS;
+}
+
+static target_ulong h_send_crq(CPUState *env, sPAPREnvironment *spapr,
+                               target_ulong opcode, target_ulong *args)
+{
+    target_ulong reg = args[0];
+    target_ulong msg_hi = args[1];
+    target_ulong msg_lo = args[2];
+    VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, reg);
+    uint64_t crq_mangle[2];
+
+    if (!dev) {
+        hcall_dprintf("h_send_crq on non-existent unit 0x"
+                      TARGET_FMT_lx "\n", reg);
+        return H_PARAMETER;
+    }
+    crq_mangle[0] = cpu_to_be64(msg_hi);
+    crq_mangle[1] = cpu_to_be64(msg_lo);
+
+    if (dev->crq.SendFunc) {
+        return dev->crq.SendFunc(dev, (uint8_t *)crq_mangle);
+    }
+
+    return H_HARDWARE;
+}
+
+static target_ulong h_enable_crq(CPUState *env, sPAPREnvironment *spapr,
+                                 target_ulong opcode, target_ulong *args)
+{
+    target_ulong reg = args[0];
+    VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, reg);
+
+    if (!dev) {
+        hcall_dprintf("h_enable_crq on non-existent unit 0x"
+                      TARGET_FMT_lx "\n", reg);
+        return H_PARAMETER;
+    }
+
+    return 0;
+}
+
+/* Returns negative error, 0 success, or positive: queue full */
+int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
+{
+    int rc;
+    uint8_t byte;
+
+    if (!dev->crq.qsize) {
+fprintf(stderr, spapr_vio_send_creq on 

[Qemu-devel] [PATCH 24/27] Implement PAPR virtual SCSI interface (ibmvscsi)

2011-03-24 Thread David Gibson
This patch implements the infrastructure and hypercalls necessary for
the PAPR specified Virtual SCSI interface.  This is the normal method
for providing (virtual) disks to PAPR partitions.

Signed-off-by: Ben Herrenschmidt b...@kernel.crashing.org
Signed-off-by: David Gibson d...@au1.ibm.com

Conflicts:

Makefile.target
---
 Makefile.target  |2 +-
 hw/ppc-viosrp.h  |  216 
 hw/spapr.c   |   11 +-
 hw/spapr_vio.h   |3 +
 hw/spapr_vscsi.c |  988 ++
 hw/srp.h |  240 +
 6 files changed, 1458 insertions(+), 2 deletions(-)
 create mode 100644 hw/ppc-viosrp.h
 create mode 100644 hw/spapr_vscsi.c
 create mode 100644 hw/srp.h

diff --git a/Makefile.target b/Makefile.target
index cd7bb41..565e1fb 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -234,7 +234,7 @@ obj-ppc-y += ppc_newworld.o
 # IBM pSeries (sPAPR)
 ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
 obj-ppc-y += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
-obj-ppc-y += xics.o spapr_vty.o spapr_llan.o
+obj-ppc-y += xics.o spapr_vty.o spapr_llan.o spapr_vscsi.o
 endif
 # PowerPC 4xx boards
 obj-ppc-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
diff --git a/hw/ppc-viosrp.h b/hw/ppc-viosrp.h
new file mode 100644
index 000..d8e365d
--- /dev/null
+++ b/hw/ppc-viosrp.h
@@ -0,0 +1,216 @@
+/*/
+/* srp.h -- SCSI RDMA Protocol definitions   */
+/*   */
+/* Written By: Colin Devilbis, IBM Corporation   */
+/*   */
+/* Copyright (C) 2003 IBM Corporation*/
+/*   */
+/* This program is free software; you can redistribute it and/or modify  */
+/* it under the terms of the GNU General Public License as published by  */
+/* the Free Software Foundation; either version 2 of the License, or */
+/* (at your option) any later version.   */
+/*   */
+/* This program is distributed in the hope that it will be useful,   */
+/* but WITHOUT ANY WARRANTY; without even the implied warranty of*/
+/* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the */
+/* GNU General Public License for more details.  */
+/*   */
+/* You should have received a copy of the GNU General Public License */
+/* along with this program; if not, write to the Free Software   */
+/* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA */
+/*   */
+/*   */
+/* This file contains structures and definitions for IBM RPA (RS/6000*/
+/* platform architecture) implementation of the SRP (SCSI RDMA Protocol) */
+/* standard.  SRP is used on IBM iSeries and pSeries platforms to send SCSI  */
+/* commands between logical partitions.  */
+/*   */
+/* SRP Information Units (IUs) are sent on a Command/Response Queue (CRQ)  */
+/* between partitions.  The definitions in this file are architected,*/
+/* and cannot be changed without breaking compatibility with other versions  */
+/* of Linux and other operating systems (AIX, OS/400) that talk this protocol*/
+/* between logical partitions*/
+/*/
+#ifndef PPC_VIOSRP_H
+#define PPC_VIOSRP_H
+
+#define SRP_VERSION "16.a"
+#define SRP_MAX_IU_LEN 256
+#define SRP_MAX_LOC_LEN 32
+
+union srp_iu {
+struct srp_login_req login_req;
+struct srp_login_rsp login_rsp;
+struct srp_login_rej login_rej;
+struct srp_i_logout i_logout;
+struct srp_t_logout t_logout;
+struct srp_tsk_mgmt tsk_mgmt;
+struct srp_cmd cmd;
+struct srp_rsp rsp;
+uint8_t reserved[SRP_MAX_IU_LEN];
+};
+
+enum viosrp_crq_formats {
+VIOSRP_SRP_FORMAT = 0x01,
+VIOSRP_MAD_FORMAT = 0x02,
+VIOSRP_OS400_FORMAT = 0x03,
+VIOSRP_AIX_FORMAT = 0x04,
+VIOSRP_LINUX_FORMAT = 0x06,
+VIOSRP_INLINE_FORMAT = 0x07
+};
+
+enum viosrp_crq_status {
+VIOSRP_OK = 0x0,
+VIOSRP_NONRECOVERABLE_ERR = 0x1,
+VIOSRP_VIOLATES_MAX_XFER = 0x2,
+VIOSRP_PARTNER_PANIC = 0x3,
+VIOSRP_DEVICE_BUSY = 0x8,
+VIOSRP_ADAPTER_FAIL = 0x10,
+VIOSRP_OK2 = 0x99,
+};
+
+struct viosrp_crq 

[Qemu-devel] [PATCH 21/27] Implement TCE translation for sPAPR VIO

2011-03-24 Thread David Gibson
This patch implements the necessary infrastructure and hypercalls for
sPAPR's TCE (Translation Control Entry) IOMMU mechanism.  This is necessary
for all virtual IO devices which do DMA (i.e. nearly all of them).
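
As a rough illustration of the translation step (a minimal sketch, not the
code in this patch): each page of a device's DMA window is assumed to have one
64-bit TCE holding a real page address, with permission bits in the low bits.
The names tce_translate, TCE_READ and TCE_WRITE, the 4 KiB page size and the
bit assignments are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCE_PAGE_SHIFT 12                        /* assumed 4 KiB TCE pages */
#define TCE_PAGE_SIZE  (1ULL << TCE_PAGE_SHIFT)
#define TCE_PAGE_MASK  (TCE_PAGE_SIZE - 1)
#define TCE_READ       1ULL                      /* assumed: bit 0 = read   */
#define TCE_WRITE      2ULL                      /* assumed: bit 1 = write  */

/*
 * Translate an I/O bus address through a flat TCE table.
 * Returns 0 and stores the real address in *real on success, or -1 if the
 * address is outside the window or the TCE lacks the requested permission.
 */
static int tce_translate(const uint64_t *table, uint64_t window_size,
                         uint64_t ioba, uint64_t access, uint64_t *real)
{
    uint64_t tce;

    assert(table != NULL && real != NULL);

    if (ioba >= window_size) {
        return -1;                               /* outside the DMA window */
    }

    tce = table[ioba >> TCE_PAGE_SHIFT];
    if ((tce & access) != access) {
        return -1;                               /* permission not granted */
    }

    /* Real page address from the TCE, byte offset from the bus address. */
    *real = (tce & ~TCE_PAGE_MASK) | (ioba & TCE_PAGE_MASK);
    return 0;
}
```

A DMA helper would call something like this once per page, since consecutive
bus pages need not map to consecutive real pages.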

Signed-off-by: Ben Herrenschmidt b...@kernel.crashing.org
Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/spapr.c |3 +-
 hw/spapr_vio.c |  238 
 hw/spapr_vio.h |   32 
 3 files changed, 272 insertions(+), 1 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index 56dba8e..69759c5 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -64,7 +64,8 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t ramsize,
     uint32_t start_prop = cpu_to_be32(initrd_base);
     uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
     uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
-    char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt";
+    char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt"
+        "\0hcall-tce";
     uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
     int i;
     char *modelname;
diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index 605079c..39d77ee 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -37,6 +37,7 @@
 #endif /* CONFIG_FDT */
 
 /* #define DEBUG_SPAPR */
+/* #define DEBUG_TCE */
 
 #ifdef DEBUG_SPAPR
 #define dprintf(fmt, ...) \
@@ -115,6 +116,28 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
 }
 }
 
+    if (dev->rtce_window_size) {
+        uint32_t dma_prop[] = {cpu_to_be32(dev->reg),
+                               0, 0,
+                               0, cpu_to_be32(dev->rtce_window_size)};
+
+        ret = fdt_setprop_cell(fdt, node_off, "ibm,#dma-address-cells", 2);
+        if (ret < 0) {
+            return ret;
+        }
+
+        ret = fdt_setprop_cell(fdt, node_off, "ibm,#dma-size-cells", 2);
+        if (ret < 0) {
+            return ret;
+        }
+
+        ret = fdt_setprop(fdt, node_off, "ibm,my-dma-window", dma_prop,
+                          sizeof(dma_prop));
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     if (info->devnode) {
         ret = (info->devnode)(dev, fdt, node_off);
         if (ret < 0) {
@@ -126,6 +149,216 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
 }
 #endif /* CONFIG_FDT */
 
+/*
+ * RTCE handling
+ */
+
+static void rtce_init(VIOsPAPRDevice *dev)
+{
+    size_t size = (dev->rtce_window_size >> SPAPR_VIO_TCE_PAGE_SHIFT)
+        * sizeof(VIOsPAPR_RTCE);
+
+    if (size) {
+        dev->rtce_table = qemu_mallocz(size);
+    }
+}
+
+static target_ulong h_put_tce(CPUState *env, sPAPREnvironment *spapr,
+                              target_ulong opcode, target_ulong *args)
+{
+    target_ulong liobn = args[0];
+    target_ulong ioba = args[1];
+    target_ulong tce = args[2];
+    VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, liobn);
+    VIOsPAPR_RTCE *rtce;
+
+    if (!dev) {
+        hcall_dprintf("spapr_vio_put_tce on non-existent LIOBN "
+                      TARGET_FMT_lx "\n", liobn);
+        return H_PARAMETER;
+    }
+
+    ioba &= ~(SPAPR_VIO_TCE_PAGE_SIZE - 1);
+
+#ifdef DEBUG_TCE
+    fprintf(stderr, "spapr_vio_put_tce on %s  ioba 0x" TARGET_FMT_lx
+            "  TCE 0x" TARGET_FMT_lx "\n", dev->qdev.id, ioba, tce);
+#endif
+
+    if (ioba >= dev->rtce_window_size) {
+        hcall_dprintf("spapr_vio_put_tce on out-of-boards IOBA 0x"
+                      TARGET_FMT_lx "\n", ioba);
+        return H_PARAMETER;
+    }
+
+    rtce = dev->rtce_table + (ioba >> SPAPR_VIO_TCE_PAGE_SHIFT);
+    rtce->tce = tce;
+
+    return H_SUCCESS;
+}
+
+int spapr_vio_check_tces(VIOsPAPRDevice *dev, target_ulong ioba,
+                         target_ulong len, enum VIOsPAPR_TCEAccess access)
+{
+    int start, end, i;
+
+    start = ioba >> SPAPR_VIO_TCE_PAGE_SHIFT;
+    end = (ioba + len - 1) >> SPAPR_VIO_TCE_PAGE_SHIFT;
+
+    for (i = start; i <= end; i++) {
+        if ((dev->rtce_table[i].tce & access) != access) {
+#ifdef DEBUG_TCE
+            fprintf(stderr, "FAIL on %d\n", i);
+#endif
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+int spapr_tce_dma_write(VIOsPAPRDevice *dev, uint64_t taddr, const void *buf,
+                        uint32_t size)
+{
+#ifdef DEBUG_TCE
+    fprintf(stderr, "spapr_tce_dma_write taddr=0x%llx size=0x%x\n",
+            (unsigned long long)taddr, size);
+#endif
+
+    while (size) {
+        uint64_t tce;
+        uint32_t lsize;
+        uint64_t txaddr;
+
+        /* Check if we are in bound */
+        if (taddr >= dev->rtce_window_size) {
+#ifdef DEBUG_TCE
+            fprintf(stderr, "spapr_tce_dma_write out of bounds\n");
+#endif
+            return H_DEST_PARM;
+        }
+        tce = dev->rtce_table[taddr >> SPAPR_VIO_TCE_PAGE_SHIFT].tce;
+
+        /* How much til end of page ? */
+        lsize = MIN(size, ((~taddr) & SPAPR_VIO_TCE_PAGE_MASK) + 1);
+
+        /* Check TCE */
+        if (!(tce & 2)) {
+

[Qemu-devel] [PATCH 18/27] Implement the PAPR (pSeries) virtualized interrupt controller (xics)

2011-03-24 Thread David Gibson
PAPR defines an interrupt control architecture which is logically divided
into ICP (Interrupt Control Presentation, each unit is responsible for
presenting interrupts to a particular interrupt server, i.e. CPU) and
ICS (Interrupt Control Source, each unit responsible for one or more
hardware interrupts as numbered globally across the system).  All PAPR
virtual IO devices expect to deliver interrupts via this mechanism.  In
Linux, this interrupt controller system is handled by the xics driver.

On pSeries systems, access to the interrupt controller is virtualized via
hypercalls and RTAS methods.  However, the virtualized interface is very
similar to the underlying interrupt controller hardware, and similar PICs
exist un-virtualized in some other systems.

This patch implements both the ICP and ICS sides of the PAPR interrupt
controller.  For now, only the hypercall virtualized interface is provided;
however, it would be relatively straightforward to graft an emulated
register interface onto the underlying interrupt logic if we want to add
a machine with a hardware ICS/ICP system in the future.

There are some limitations in this implementation: it is assumed for now
that only one instance of the ICS exists, although a full xics system can
have several, each responsible for a different group of hardware irqs.
ICP/ICS can handle both level-sensitive (LSI) and message signalled (MSI)
interrupt inputs.  For now, this implementation supports only MSI
interrupts, since that is used by PAPR virtual IO devices.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: David Gibson d...@au1.ibm.com
---
 Makefile.target |2 +-
 hw/spapr.c  |   26 +++
 hw/spapr.h  |2 +
 hw/xics.c   |  486 +++
 hw/xics.h   |   39 +
 5 files changed, 554 insertions(+), 1 deletions(-)
 create mode 100644 hw/xics.c
 create mode 100644 hw/xics.h

diff --git a/Makefile.target b/Makefile.target
index a53d99f..c795428 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -234,7 +234,7 @@ obj-ppc-y += ppc_newworld.o
 # IBM pSeries (sPAPR)
 ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
 obj-ppc-y += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
-obj-ppc-y += spapr_vty.o
+obj-ppc-y += xics.o spapr_vty.o
 endif
 # PowerPC 4xx boards
 obj-ppc-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
diff --git a/hw/spapr.c b/hw/spapr.c
index 21e3c86..b8244c9 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -34,6 +34,7 @@
 
 #include hw/spapr.h
 #include hw/spapr_vio.h
+#include hw/xics.h
 
 #include libfdt.h
 
@@ -64,6 +65,7 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
 uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
 char hypertas_prop[] = hcall-pft\0hcall-term\0hcall-dabr;
+uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
 int i;
 char *modelname;
 int ret;
@@ -125,6 +127,7 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 
 for (i = 0; i  smp_cpus; i++) {
 CPUState *env = envs[i];
+uint32_t gserver_prop[] = {cpu_to_be32(i), 0}; /* HACK! */
 char *nodename;
 uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
0x, 0x};
@@ -155,6 +158,9 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
                            pft_size_prop, sizeof(pft_size_prop))));
         _FDT((fdt_property_string(fdt, "status", "okay")));
         _FDT((fdt_property(fdt, "64-bit", NULL, 0)));
+        _FDT((fdt_property_cell(fdt, "ibm,ppc-interrupt-server#s", i)));
+        _FDT((fdt_property(fdt, "ibm,ppc-interrupt-gserver#s",
+                           gserver_prop, sizeof(gserver_prop))));
 
         if (envs[i]->mmu_model & POWERPC_MMU_1TSEG) {
             _FDT((fdt_property(fdt, "ibm,processor-segment-sizes",
@@ -176,6 +182,20 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 
 _FDT((fdt_end_node(fdt)));
 
+    /* interrupt controller */
+    _FDT((fdt_begin_node(fdt, "interrupt-controller@0")));
+
+    _FDT((fdt_property_string(fdt, "device_type",
+                              "PowerPC-External-Interrupt-Presentation")));
+    _FDT((fdt_property_string(fdt, "compatible", "IBM,ppc-xicp")));
+    _FDT((fdt_property_cell(fdt, "reg", 0)));
+    _FDT((fdt_property(fdt, "interrupt-controller", NULL, 0)));
+    _FDT((fdt_property(fdt, "ibm,interrupt-server-ranges",
+                       interrupt_server_ranges_prop,
+                       sizeof(interrupt_server_ranges_prop))));
+
+    _FDT((fdt_end_node(fdt)));
+
 /* vdevice */
 _FDT((fdt_begin_node(fdt, vdevice)));
 
@@ -183,6 +203,8 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 _FDT((fdt_property_string(fdt, compatible, IBM,vdevice)));
 _FDT((fdt_property_cell(fdt, #address-cells, 0x1)));
 _FDT((fdt_property_cell(fdt, #size-cells, 0x0)));
+

[Qemu-devel] [PATCH 17/27] Implement assorted pSeries hcalls and RTAS methods

2011-03-24 Thread David Gibson
This patch adds several small utility hypercalls and RTAS methods to
the pSeries platform emulation.  Specifically:

* 'display-character' rtas call

This just prints a character to the console; it's occasionally used
for early debug of the OS.  The support includes a hack to make this
RTAS call respond on the normal token value present on real hardware,
since some early debugging tools just assume this value without
checking the device tree.

* 'get-time-of-day' rtas call

This one just takes the host real time, converts to the PAPR described
format and returns it to the guest.

* 'power-off' rtas call

This one shuts down the emulated system.

* H_DABR hypercall

On pSeries, the DABR debug register is usually a hypervisor resource
and virtualized through this hypercall.  If the hypercall is not
present, Linux will under some circumstances attempt to manipulate the
DABR directly which will fail on this emulated machine.

This stub implementation is enough to stop that behaviour, although it
doesn't actually implement the requested DABR operations as yet.
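
To make the 'get-time-of-day' conversion described above concrete, here is a
minimal standalone sketch of how a broken-down host time could be laid out in
the eight return cells (status, year, month, day, hour, minute, second,
nanoseconds).  The function name fill_time_of_day and the plain uint32_t
array standing in for the guest's return buffer are assumptions made for the
example; the patch itself writes the cells with rtas_st().

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Fill the eight get-time-of-day return cells from a struct tm.
 * struct tm counts years from 1900 and months from 0, so both are
 * adjusted to calendar values here.
 */
static void fill_time_of_day(const struct tm *tm, uint32_t rets[8])
{
    assert(tm != NULL);

    rets[0] = 0;                                  /* status: success */
    rets[1] = (uint32_t)(tm->tm_year + 1900);     /* full year       */
    rets[2] = (uint32_t)(tm->tm_mon + 1);         /* month, 1..12    */
    rets[3] = (uint32_t)tm->tm_mday;
    rets[4] = (uint32_t)tm->tm_hour;
    rets[5] = (uint32_t)tm->tm_min;
    rets[6] = (uint32_t)tm->tm_sec;
    rets[7] = 0;                                  /* no nanoseconds  */
}
```

A caller would first check that the guest supplied room for all eight cells,
as the patch does with its nret check, before filling them in.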

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/spapr.c   |2 +-
 hw/spapr_hcall.c |   10 
 hw/spapr_rtas.c  |   69 ++
 3 files changed, 80 insertions(+), 1 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index ff1eb3b..21e3c86 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -63,7 +63,7 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 uint32_t start_prop = cpu_to_be32(initrd_base);
 uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
 uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
-    char hypertas_prop[] = "hcall-pft\0hcall-term";
+    char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr";
 int i;
 char *modelname;
 int ret;
diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index 594e27d..02ccafd 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -248,6 +248,13 @@ static target_ulong h_protect(CPUState *env, 
sPAPREnvironment *spapr,
 return H_SUCCESS;
 }
 
+static target_ulong h_set_dabr(CPUState *env, sPAPREnvironment *spapr,
+   target_ulong opcode, target_ulong *args)
+{
+/* FIXME: actually implement this */
+return H_HARDWARE;
+}
+
 static target_ulong h_rtas(sPAPREnvironment *spapr, target_ulong rtas_r3)
 {
 uint32_t token = ldl_phys(rtas_r3);
@@ -308,5 +315,8 @@ static void hypercall_init(void)
 spapr_register_hypercall(H_ENTER, h_enter);
 spapr_register_hypercall(H_REMOVE, h_remove);
 spapr_register_hypercall(H_PROTECT, h_protect);
+
+/* hcall-dabr */
+spapr_register_hypercall(H_SET_DABR, h_set_dabr);
 }
 device_init(hypercall_init);
diff --git a/hw/spapr_rtas.c b/hw/spapr_rtas.c
index 3f090f5..7226853 100644
--- a/hw/spapr_rtas.c
+++ b/hw/spapr_rtas.c
@@ -38,6 +38,58 @@
 #define TOKEN_BASE  0x2000
 #define TOKEN_MAX   0x100
 
+static void rtas_display_character(sPAPREnvironment *spapr,
+                                   uint32_t token, uint32_t nargs,
+                                   target_ulong args,
+                                   uint32_t nret, target_ulong rets)
+{
+    uint8_t c = rtas_ld(args, 0);
+    VIOsPAPRDevice *sdev = spapr_vio_find_by_reg(spapr->vio_bus, 0);
+
+    if (!sdev) {
+        rtas_st(rets, 0, -1);
+    } else {
+        vty_putchars(sdev, &c, sizeof(c));
+        rtas_st(rets, 0, 0);
+    }
+}
+
+static void rtas_get_time_of_day(sPAPREnvironment *spapr,
+                                 uint32_t token, uint32_t nargs,
+                                 target_ulong args,
+                                 uint32_t nret, target_ulong rets)
+{
+    struct tm tm;
+
+    if (nret != 8) {
+        rtas_st(rets, 0, -3);
+        return;
+    }
+
+    qemu_get_timedate(&tm, 0);
+
+    rtas_st(rets, 0, 0); /* Success */
+    rtas_st(rets, 1, tm.tm_year + 1900);
+    rtas_st(rets, 2, tm.tm_mon + 1);
+    rtas_st(rets, 3, tm.tm_mday);
+    rtas_st(rets, 4, tm.tm_hour);
+    rtas_st(rets, 5, tm.tm_min);
+    rtas_st(rets, 6, tm.tm_sec);
+    rtas_st(rets, 7, 0); /* we don't do nanoseconds */
+}
+
+static void rtas_power_off(sPAPREnvironment *spapr,
+                           uint32_t token, uint32_t nargs, target_ulong args,
+                           uint32_t nret, target_ulong rets)
+{
+    if (nargs != 2 || nret != 1) {
+        rtas_st(rets, 0, -3);
+        return;
+    }
+    qemu_system_shutdown_request();
+    rtas_st(rets, 0, 0);
+}
+
 static struct rtas_call {
 const char *name;
 spapr_rtas_fn fn;
@@ -59,6 +111,15 @@ target_ulong spapr_rtas_call(sPAPREnvironment *spapr,
 }
 }
 
+/* HACK: Some Linux early debug code uses RTAS display-character,
+ * but assumes the token value is 0xa (which it is on some real
+ * machines) without looking it up in the device tree.  This
+ * special case makes this work */
+if (token == 

[Qemu-devel] [PATCH 05/27] Implement missing parts of the logic for the POWER PURR

2011-03-24 Thread David Gibson
The PURR (Processor Utilization Resource Register) is a register found
on recent POWER CPUs.  The guts of implementing it at least enough to
get by are already present in qemu, however some of the helper
functions needed to actually wire it up are missing.

This patch adds the necessary glue, so that the PURR can be wired up
when we implement newer POWER CPU targets which include it.

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/helper.h |1 +
 target-ppc/op_helper.c  |7 +++
 target-ppc/translate_init.c |8 
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 1a69cf8..2b4744d 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -376,6 +376,7 @@ DEF_HELPER_0(load_601_rtcu, tl)
 #if !defined(CONFIG_USER_ONLY)
 #if defined(TARGET_PPC64)
 DEF_HELPER_1(store_asr, void, tl)
+DEF_HELPER_0(load_purr, tl)
 #endif
 DEF_HELPER_1(store_sdr1, void, tl)
 DEF_HELPER_1(store_tbl, void, tl)
diff --git a/target-ppc/op_helper.c b/target-ppc/op_helper.c
index bdb1f17..aa2e8ba 100644
--- a/target-ppc/op_helper.c
+++ b/target-ppc/op_helper.c
@@ -86,6 +86,13 @@ target_ulong helper_load_atbu (void)
 return cpu_ppc_load_atbu(env);
 }
 
+#if defined(TARGET_PPC64) && !defined(CONFIG_USER_ONLY)
+target_ulong helper_load_purr (void)
+{
+return (target_ulong)cpu_ppc_load_purr(env);
+}
+#endif
+
 target_ulong helper_load_601_rtcl (void)
 {
 return cpu_ppc601_load_rtcl(env);
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 7c08b1c..bca85d5 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -251,6 +251,14 @@ static void spr_write_atbu (void *opaque, int sprn, int 
gprn)
 {
 gen_helper_store_atbu(cpu_gpr[gprn]);
 }
+
+#if defined(TARGET_PPC64)
+__attribute__ (( unused ))
+static void spr_read_purr (void *opaque, int gprn, int sprn)
+{
+gen_helper_load_purr(cpu_gpr[gprn]);
+}
+#endif
 #endif
 
 #if !defined(CONFIG_USER_ONLY)
-- 
1.7.1




[Qemu-devel] [PATCH 10/27] Better factor the ppc hash translation path

2011-03-24 Thread David Gibson
Currently the path handling hash page table translation in get_segment()
has a mix of common and 32 or 64 bit specific code.  However the
division is not done terribly well which results in a lot of messy code
flipping between common and divided paths.

This patch improves the organization, consolidating several divided paths
into one.  This in turn allows simplification of some code in
get_segment(), removing a number of ugly interim variables.

This new factorization will also make it easier to add support for the 1T
segments added in newer CPUs.
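
For readers following the refactoring, the hash computed on the 64-bit path
for a 256 MiB segment can be sketched in isolation as below.  This is a
minimal illustration, not the patch's code: the function name hash_256m and
the SEGMENT_SHIFT_256M constant are assumptions, and, as the XXX comment in
the patch notes, the formula does not hold for 1 TB segments.

```c
#include <assert.h>
#include <stdint.h>

#define SEGMENT_SHIFT_256M 28   /* a 256 MiB segment spans 2^28 bytes */

/*
 * Primary hash for a 256 MiB segment: keep the address bits between the
 * page offset and the segment boundary, shift them down to a page index,
 * and XOR the result with the VSID.
 */
static uint64_t hash_256m(uint64_t vsid, uint64_t eaddr, unsigned page_bits)
{
    uint64_t pageaddr;

    assert(page_bits < SEGMENT_SHIFT_256M);

    pageaddr = eaddr & ((1ULL << SEGMENT_SHIFT_256M) - (1ULL << page_bits));
    return vsid ^ (pageaddr >> page_bits);
}
```

Passing the page size as a shift lets the same expression serve both 4 KiB
(page_bits = 12) and 16 MiB (page_bits = 24) pages, which is what lets the
patch drop the separate page_mask variable.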

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/cpu.h|1 +
 target-ppc/helper.c |   67 ++
 2 files changed, 20 insertions(+), 48 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index cee1057..fd2dfcd 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -386,6 +386,7 @@ struct ppc_slb_t {
 #define SLB_VSID_B              0xc000000000000000ULL
 #define SLB_VSID_B_256M         0x0000000000000000ULL
 #define SLB_VSID_VSID           0x3FFFFFFFFFFFF000ULL
+#define SLB_VSID_PTEM           (SLB_VSID_B | SLB_VSID_VSID)
 #define SLB_VSID_KS             0x0000000000000800ULL
 #define SLB_VSID_KP             0x0000000000000400ULL
 #define SLB_VSID_N              0x0000000000000200ULL /* no-execute */
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 0efa2a8..ae8001c 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -661,29 +661,15 @@ static inline int _find_pte(CPUState *env, mmu_ctx_t 
*ctx, int is_64b, int h,
 return ret;
 }
 
-static inline int find_pte32(CPUState *env, mmu_ctx_t *ctx, int h, int rw,
- int type, int target_page_bits)
-{
-return _find_pte(env, ctx, 0, h, rw, type, target_page_bits);
-}
-
-#if defined(TARGET_PPC64)
-static inline int find_pte64(CPUState *env, mmu_ctx_t *ctx, int h, int rw,
- int type, int target_page_bits)
-{
-return _find_pte(env, ctx, 1, h, rw, type, target_page_bits);
-}
-#endif
-
 static inline int find_pte(CPUState *env, mmu_ctx_t *ctx, int h, int rw,
int type, int target_page_bits)
 {
 #if defined(TARGET_PPC64)
 if (env-mmu_model  POWERPC_MMU_64)
-return find_pte64(env, ctx, h, rw, type, target_page_bits);
+return _find_pte(env, ctx, 1, h, rw, type, target_page_bits);
 #endif
 
-return find_pte32(env, ctx, h, rw, type, target_page_bits);
+return _find_pte(env, ctx, 0, h, rw, type, target_page_bits);
 }
 
 #if defined(TARGET_PPC64)
@@ -803,14 +789,16 @@ static inline int get_segment(CPUState *env, mmu_ctx_t 
*ctx,
   target_ulong eaddr, int rw, int type)
 {
 target_phys_addr_t hash;
-target_ulong sr, vsid, pgidx, page_mask;
+target_ulong vsid;
 int ds, pr, target_page_bits;
 int ret, ret2;
 
 pr = msr_pr;
+ctx-eaddr = eaddr;
 #if defined(TARGET_PPC64)
 if (env-mmu_model  POWERPC_MMU_64) {
 ppc_slb_t *slb;
+target_ulong pageaddr;
 
 LOG_MMU(Check SLBs\n);
 slb = slb_lookup(env, eaddr);
@@ -819,19 +807,24 @@ static inline int get_segment(CPUState *env, mmu_ctx_t 
*ctx,
 }
 
         vsid = (slb->vsid & SLB_VSID_VSID) >> SLB_VSID_SHIFT;
-        page_mask = ~SEGMENT_MASK_256M;
         target_page_bits = (slb->vsid & SLB_VSID_L)
             ? TARGET_PAGE_BITS_16M : TARGET_PAGE_BITS;
         ctx->key = !!(pr ? (slb->vsid & SLB_VSID_KP)
                       : (slb->vsid & SLB_VSID_KS));
         ds = 0;
         ctx->nx = !!(slb->vsid & SLB_VSID_N);
-        ctx->eaddr = eaddr;
+
+        pageaddr = eaddr & ((1ULL << 28) - (1ULL << target_page_bits));
+        /* XXX: this is false for 1 TB segments */
+        hash = vsid ^ (pageaddr >> target_page_bits);
+        /* Only 5 bits of the page index are used in the AVPN */
+        ctx->ptem = (slb->vsid & SLB_VSID_PTEM) | ((pageaddr >> 16) & 0x0F80);
 } else
 #endif /* defined(TARGET_PPC64) */
 {
+        target_ulong sr, pgidx;
+
         sr = env->sr[eaddr >> 28];
-        page_mask = 0x0FFFFFFF;
         ctx->key = (((sr & 0x20000000) && (pr != 0)) ||
                     ((sr & 0x40000000) && (pr == 0))) ? 1 : 0;
         ds = sr & 0x80000000 ? 1 : 0;
@@ -843,6 +836,9 @@ static inline int get_segment(CPUState *env, mmu_ctx_t *ctx,
  ir=%d dr=%d pr=%d %d t=%d\n,
 eaddr, (int)(eaddr  28), sr, env-nip, env-lr, (int)msr_ir,
 (int)msr_dr, pr != 0 ? 1 : 0, rw, type);
+        pgidx = (eaddr & ~SEGMENT_MASK_256M) >> target_page_bits;
+        hash = vsid ^ pgidx;
+        ctx->ptem = (vsid << 7) | (pgidx >> 10);
 }
 LOG_MMU(pte segment: key=%d ds %d nx %d vsid  TARGET_FMT_lx \n,
 ctx-key, ds, ctx-nx, vsid);
@@ -851,36 +847,12 @@ static inline int get_segment(CPUState *env, mmu_ctx_t 
*ctx,
 /* Check if instruction fetch is allowed, if needed */
 if (type != ACCESS_CODE || ctx-nx == 0) {
 /* Page address 

[Qemu-devel] [PATCH 09/27] Use hash more consistently in ppc mmu code

2011-03-24 Thread David Gibson
Currently, get_segment() has a variable called hash.  However it doesn't
(quite) get the hash value for the ppc hashed page table.  Instead it
gets the hash shifted - effectively the offset of the hash bucket within
the hash page table.

As well as being different from the normal use of plain hash in the
architecture documentation, this usage necessitates some awkward 32/64
dependent masks and shifts which clutter up the path in get_segment().

This patch alters the code to use raw hash values through get_segment()
including storing raw hashes instead of pte group offsets in the ctx
structure.  This cleans up the path noticeably.

This does necessitate 32/64 dependent shifts when the hash values are
taken out of the ctx structure and used, but those paths already have
32/64 bit variants so this is less awkward than it was in get_segment().
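
The distinction between a raw hash and a PTE group offset can be shown with a
small standalone sketch.  The function name pteg_offset and the
PTES_PER_GROUP constant are assumptions made for the example, and htab_mask
is assumed here to be a mask on the byte offset into the hash table, as the
helper added in this patch uses it.

```c
#include <assert.h>
#include <stdint.h>

#define PTES_PER_GROUP 8   /* each PTE group (PTEG) holds eight PTEs */

/*
 * Convert a raw hash value into a byte offset within the hashed page
 * table.  pte_size is 8 bytes for 32-bit PTEs and 16 bytes for 64-bit
 * PTEs, so a group occupies 64 or 128 bytes respectively.
 */
static uint64_t pteg_offset(uint64_t hash, unsigned pte_size,
                            uint64_t htab_mask)
{
    assert(pte_size == 8 || pte_size == 16);

    return (hash * pte_size * PTES_PER_GROUP) & htab_mask;
}
```

Keeping the raw hash in the context structure and scaling it only at lookup
time is what confines the 32/64-bit difference to the single pte_size
argument.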

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/cpu.h|5 ++-
 target-ppc/helper.c |   95 --
 2 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index ead4566..cee1057 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -367,6 +367,9 @@ union ppc_tlb_t {
 #define SDR_64_HTABSIZE0x001FULL
 #endif /* defined(TARGET_PPC64 */
 
+#define HASH_PTE_SIZE_32   8
+#define HASH_PTE_SIZE_64   16
+
 typedef struct ppc_slb_t ppc_slb_t;
 struct ppc_slb_t {
 uint64_t esid;
@@ -744,7 +747,7 @@ struct mmu_ctx_t {
 target_phys_addr_t raddr;  /* Real address  */
 target_phys_addr_t eaddr;  /* Effective address */
 int prot;  /* Protection bits   */
-target_phys_addr_t pg_addr[2]; /* PTE tables base addresses */
+target_phys_addr_t hash[2];/* Pagetable hash values */
 target_ulong ptem; /* Virtual segment ID | API  */
 int key;   /* Access key*/
 int nx;/* Non-execute area  */
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 68d2d9c..0efa2a8 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -567,21 +567,30 @@ static inline int get_bat(CPUState *env, mmu_ctx_t *ctx, 
target_ulong virtual,
 return ret;
 }
 
+static inline target_phys_addr_t get_pteg_offset(CPUState *env,
+                                                 target_phys_addr_t hash,
+                                                 int pte_size)
+{
+    return (hash * pte_size * 8) & env->htab_mask;
+}
+
 /* PTE table lookup */
-static inline int _find_pte(mmu_ctx_t *ctx, int is_64b, int h, int rw,
-int type, int target_page_bits)
+static inline int _find_pte(CPUState *env, mmu_ctx_t *ctx, int is_64b, int h,
+int rw, int type, int target_page_bits)
 {
-target_ulong base, pte0, pte1;
+target_phys_addr_t pteg_off;
+target_ulong pte0, pte1;
 int i, good = -1;
 int ret, r;
 
 ret = -1; /* No entry found */
-base = ctx-pg_addr[h];
+pteg_off = get_pteg_offset(env, ctx-hash[h],
+   is_64b ? HASH_PTE_SIZE_64 : HASH_PTE_SIZE_32);
 for (i = 0; i  8; i++) {
 #if defined(TARGET_PPC64)
 if (is_64b) {
-pte0 = ldq_phys(base + (i * 16));
-pte1 = ldq_phys(base + (i * 16) + 8);
+pte0 = ldq_phys(env-htab_base + pteg_off + (i * 16));
+pte1 = ldq_phys(env-htab_base + pteg_off + (i * 16) + 8);
 
 /* We have a TLB that saves 4K pages, so let's
  * split a huge page to 4k chunks */
@@ -592,17 +601,17 @@ static inline int _find_pte(mmu_ctx_t *ctx, int is_64b, 
int h, int rw,
 r = pte64_check(ctx, pte0, pte1, h, rw, type);
 LOG_MMU(Load pte from  TARGET_FMT_lx  =  TARGET_FMT_lx  
 TARGET_FMT_lx  %d %d %d  TARGET_FMT_lx \n,
-base + (i * 16), pte0, pte1, (int)(pte0  1), h,
+pteg_base + (i * 16), pte0, pte1, (int)(pte0  1), h,
 (int)((pte0  1)  1), ctx-ptem);
 } else
 #endif
 {
-pte0 = ldl_phys(base + (i * 8));
-pte1 =  ldl_phys(base + (i * 8) + 4);
+pte0 = ldl_phys(env-htab_base + pteg_off + (i * 8));
+pte1 =  ldl_phys(env-htab_base + pteg_off + (i * 8) + 4);
 r = pte32_check(ctx, pte0, pte1, h, rw, type);
 LOG_MMU(Load pte from  TARGET_FMT_lx  =  TARGET_FMT_lx  
 TARGET_FMT_lx  %d %d %d  TARGET_FMT_lx \n,
-base + (i * 8), pte0, pte1, (int)(pte0  31), h,
+pteg_base + (i * 8), pte0, pte1, (int)(pte0  31), h,
 (int)((pte0  6)  1), ctx-ptem);
 }
 switch (r) {
@@ -638,11 +647,13 @@ static inline int _find_pte(mmu_ctx_t *ctx, int is_64b, 
int h, int rw,
 if (pte_update_flags(ctx, pte1, ret, rw) == 1) {
 #if defined(TARGET_PPC64)

[Qemu-devel] [PATCH 14/27] Implement the bus structure for PAPR virtual IO

2011-03-24 Thread David Gibson
This extends the pseries (PAPR) machine to include a virtual IO bus
supporting the PAPR defined hypercall based virtual IO mechanisms.

So far only one VIO device is provided, the vty / vterm, providing
a full console (polled only, for now).

Signed-off-by: David Gibson d...@au1.ibm.com
---
 Makefile.target |5 +-
 hw/spapr.c  |   48 -
 hw/spapr.h  |3 +
 hw/spapr_vio.c  |  214 +++
 hw/spapr_vio.h  |   50 +
 hw/spapr_vty.c  |  150 ++
 6 files changed, 450 insertions(+), 20 deletions(-)
 create mode 100644 hw/spapr_vio.c
 create mode 100644 hw/spapr_vio.h
 create mode 100644 hw/spapr_vty.c

diff --git a/Makefile.target b/Makefile.target
index ccf090b..cf12691 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -231,9 +231,10 @@ obj-ppc-y += ppc_prep.o
 obj-ppc-y += ppc_oldworld.o
 # NewWorld PowerMac
 obj-ppc-y += ppc_newworld.o
-# IBM pSeries (sPAPR)i
+# IBM pSeries (sPAPR)
 ifeq ($(CONFIG_FDT)$(TARGET_PPC64),yy)
-obj-ppc-y += spapr.o spapr_hcall.o
+obj-ppc-y += spapr.o spapr_hcall.o spapr_vio.o
+obj-ppc-y += spapr_vty.o
 endif
 # PowerPC 4xx boards
 obj-ppc-y += ppc4xx_devs.o ppc4xx_pci.o ppc405_uc.o ppc405_boards.o
diff --git a/hw/spapr.c b/hw/spapr.c
index 0deea1b..f3d6125 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -25,7 +25,6 @@
  *
  */
 #include sysemu.h
-#include qemu-char.h
 #include hw.h
 #include elf.h
 
@@ -34,6 +33,7 @@
 #include hw/loader.h
 
 #include hw/spapr.h
+#include hw/spapr_vio.h
 
 #include libfdt.h
 
@@ -60,6 +60,7 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
 int i;
 char *modelname;
+int ret;
 
 #define _FDT(exp) \
 do { \
@@ -159,9 +160,30 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t 
ramsize,
 
 _FDT((fdt_end_node(fdt)));
 
+    /* vdevice */
+    _FDT((fdt_begin_node(fdt, "vdevice")));
+
+    _FDT((fdt_property_string(fdt, "device_type", "vdevice")));
+    _FDT((fdt_property_string(fdt, "compatible", "IBM,vdevice")));
+    _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
+    _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
+
+    _FDT((fdt_end_node(fdt)));
+
 _FDT((fdt_end_node(fdt))); /* close root node */
 _FDT((fdt_finish(fdt)));
 
+/* re-expand to allow for further tweaks */
+_FDT((fdt_open_into(fdt, fdt, FDT_MAX_SIZE)));
+
+ret = spapr_populate_vdevice(spapr->vio_bus, fdt);
+if (ret < 0) {
+fprintf(stderr, "couldn't setup vio devices in fdt\n");
+exit(1);
+}
+
+_FDT((fdt_pack(fdt)));
+
 *fdt_size = fdt_totalsize(fdt);
 
 return fdt;
@@ -177,21 +199,6 @@ static void emulate_spapr_hypercall(CPUState *env)
 env->gpr[3] = spapr_hypercall(env, env->gpr[3], &env->gpr[4]);
 }
 
-/* FIXME: hack until we implement the proper VIO console */
-static target_ulong h_put_term_char(CPUState *env, sPAPREnvironment *spapr,
-target_ulong opcode, target_ulong *args)
-{
-uint8_t buf[16];
-
-stq_p(buf, args[2]);
-stq_p(buf + 8, args[3]);
-
-qemu_chr_write(serial_hds[0], buf, args[1]);
-
-return 0;
-}
-
-
 /* pSeries LPAR / sPAPR hardware init */
 static void ppc_spapr_init(ram_addr_t ram_size,
const char *boot_device,
@@ -243,7 +250,13 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 ram_offset = qemu_ram_alloc(NULL, "ppc_spapr.ram", ram_size);
 cpu_register_physical_memory(0, ram_size, ram_offset);
 
-spapr_register_hypercall(H_PUT_TERM_CHAR, h_put_term_char);
+spapr->vio_bus = spapr_vio_bus_init();
+
+for (i = 0; i < MAX_SERIAL_PORTS; i++) {
+if (serial_hds[i]) {
+spapr_vty_create(spapr->vio_bus, i, serial_hds[i]);
+}
+}
 
 if (kernel_filename) {
 uint64_t lowaddr = 0;
@@ -276,7 +289,6 @@ static void ppc_spapr_init(ram_addr_t ram_size,
 initrd_base = 0;
 initrd_size = 0;
 }
-
 } else {
 fprintf(stderr, "pSeries machine needs -kernel for now");
 exit(1);
diff --git a/hw/spapr.h b/hw/spapr.h
index 685944b..06cca15 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -1,7 +1,10 @@
 #if !defined(__HW_SPAPR_H__)
 #define __HW_SPAPR_H__
 
+struct VIOsPAPRBus;
+
 typedef struct sPAPREnvironment {
+struct VIOsPAPRBus *vio_bus;
 } sPAPREnvironment;
 
 #define H_SUCCESS 0
diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
new file mode 100644
index 0000000..10acb4c
--- /dev/null
+++ b/hw/spapr_vio.c
@@ -0,0 +1,214 @@
+/*
+ * QEMU sPAPR VIO code
+ *
+ * Copyright (c) 2010 David Gibson, IBM Corporation d...@au1.ibm.com
+ * Based on the s390 virtio bus code:
+ * Copyright (c) 2009 Alexander Graf ag...@suse.de
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of 

[Qemu-devel] [PATCH 26/27] Implement PAPR VPA functions for pSeries shared processor partitions

2011-03-24 Thread David Gibson
Shared-processor partitions are those where a CPU is time-sliced between
partitions, rather than being permanently dedicated to a single
partition.  qemu emulated partitions, since they are just scheduled with
the qemu user process, behave mostly like shared processor partitions.

In order to better support shared processor partitions (splpar), PAPR
defines the VPA (Virtual Processor Area), a shared memory communication
channel between the hypervisor and partitions.  There are also two
additional shared memory communication areas for specialized purposes
associated with the VPA.

A VPA is not essential for operating an splpar, though it can be necessary
for obtaining accurate performance measurements in the presence of
runtime partition switching.

Most importantly, however, the VPA is a prerequisite for PAPR's H_CEDE
hypercall, which allows a partition OS to give up its shared processor
timeslices to other partitions when idle.

This patch implements the VPA and H_CEDE hypercalls in qemu.  We don't
implement any of the more advanced statistics which can be communicated
through the VPA.  However, this is enough to make normal pSeries kernels
do an effective power-save idle on an emulated pSeries, significantly
reducing the host load of a qemu emulated pSeries running an idle guest OS.
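
As a rough illustration of the sanity checks applied when a guest registers
a VPA (non-zero address, cache-line alignment, minimum size, no page
crossing), here is a minimal standalone sketch. The names check_vpa,
vpa_check, and the SKETCH_* constants are hypothetical and exist only for
this example; the real code in the diff below reads the size from guest
memory and returns PAPR hypercall status codes instead.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch only; the patch itself
 * returns H_SUCCESS, H_PARAMETER and H_HARDWARE. */
enum vpa_check {
    VPA_OK = 0,
    VPA_BAD_PARAMETER = -1,
    VPA_AT_ZERO = -2
};

#define SKETCH_VPA_MIN_SIZE 640u   /* mirrors VPA_MIN_SIZE in the patch */
#define SKETCH_PAGE_SIZE    4096u  /* the patch hard-codes 4096 */

/* Validate a candidate VPA address and size.
 * cache_line_size stands in for env->dcache_line_size. */
static enum vpa_check check_vpa(uint64_t vpa, uint16_t size,
                                uint32_t cache_line_size)
{
    /* The patch cannot cope with a VPA at logical address 0. */
    if (vpa == 0) {
        return VPA_AT_ZERO;
    }

    /* The VPA must be aligned to the data cache line size. */
    if (cache_line_size == 0 || (vpa % cache_line_size) != 0) {
        return VPA_BAD_PARAMETER;
    }

    /* The guest-declared size must cover at least the minimum VPA. */
    if (size < SKETCH_VPA_MIN_SIZE) {
        return VPA_BAD_PARAMETER;
    }

    /* First and last byte must fall within the same page. */
    if ((vpa / SKETCH_PAGE_SIZE) != ((vpa + size - 1) / SKETCH_PAGE_SIZE)) {
        return VPA_BAD_PARAMETER;
    }

    return VPA_OK;
}
```

Keeping the checks in one pure function like this makes them easy to
exercise without a guest; the patch instead inlines them in register_vpa().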

Signed-off-by: David Gibson d...@au1.ibm.com
---
 hw/spapr.c   |2 +-
 hw/spapr_hcall.c |  192 ++
 target-ppc/cpu.h |7 ++
 3 files changed, 200 insertions(+), 1 deletions(-)

diff --git a/hw/spapr.c b/hw/spapr.c
index d16e499..9d611a7 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -67,7 +67,7 @@ static void *spapr_create_fdt(int *fdt_size, ram_addr_t ramsize,
 uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
 uint32_t pft_size_prop[] = {0, cpu_to_be32(hash_shift)};
 char hypertas_prop[] = "hcall-pft\0hcall-term\0hcall-dabr\0hcall-interrupt"
-"\0hcall-tce\0hcall-vio";
+"\0hcall-tce\0hcall-vio\0hcall-splpar";
 uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
 int i;
 char *modelname;
diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index 02ccafd..6cc101d 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -4,6 +4,8 @@
 #include "sysemu.h"
 #include "qemu-char.h"
 #include "exec-all.h"
+#include "exec.h"
+#include "helper_regs.h"
 #include "hw/spapr.h"
 
 #define HPTES_PER_GROUP 8
@@ -255,6 +257,192 @@ static target_ulong h_set_dabr(CPUState *env, sPAPREnvironment *spapr,
 return H_HARDWARE;
 }
 
+#define FLAGS_REGISTER_VPA 0x2000ULL
+#define FLAGS_REGISTER_DTL 0x4000ULL
+#define FLAGS_REGISTER_SLBSHADOW   0x6000ULL
+#define FLAGS_DEREGISTER_VPA   0xa000ULL
+#define FLAGS_DEREGISTER_DTL   0xc000ULL
+#define FLAGS_DEREGISTER_SLBSHADOW 0xe000ULL
+
+#define VPA_MIN_SIZE           640
+#define VPA_SIZE_OFFSET        0x4
+#define VPA_SHARED_PROC_OFFSET 0x9
+#define VPA_SHARED_PROC_VAL    0x2
+
+static target_ulong register_vpa(CPUState *env, target_ulong vpa)
+{
+uint16_t size;
+uint8_t tmp;
+
+if (vpa == 0) {
+hcall_dprintf("Can't cope with registering a VPA at logical 0\n");
+return H_HARDWARE;
+}
+
+if (vpa % env->dcache_line_size) {
+return H_PARAMETER;
+}
+/* FIXME: bounds check the address */
+
+size = lduw_phys(vpa + 0x4);
+
+if (size < VPA_MIN_SIZE) {
+return H_PARAMETER;
+}
+
+/* VPA is not allowed to cross a page boundary */
+if ((vpa / 4096) != ((vpa + size - 1) / 4096)) {
+return H_PARAMETER;
+}
+
+env->vpa = vpa;
+
+tmp = ldub_phys(env->vpa + VPA_SHARED_PROC_OFFSET);
+tmp |= VPA_SHARED_PROC_VAL;
+stb_phys(env->vpa + VPA_SHARED_PROC_OFFSET, tmp);
+
+return H_SUCCESS;
+}
+
+static target_ulong deregister_vpa(CPUState *env, target_ulong vpa)
+{
+if (env->slb_shadow) {
+return H_RESOURCE;
+}
+
+if (env->dispatch_trace_log) {
+return H_RESOURCE;
+}
+
+env->vpa = 0;
+return H_SUCCESS;
+}
+
+static target_ulong register_slb_shadow(CPUState *env, target_ulong addr)
+{
+uint32_t size;
+
+if (addr == 0) {
+hcall_dprintf("Can't cope with SLB shadow at logical 0\n");
+return H_HARDWARE;
+}
+
+size = ldl_phys(addr + 0x4);
+if (size < 0x8) {
+return H_PARAMETER;
+}
+
+if ((addr / 4096) != ((addr + size - 1) / 4096)) {
+return H_PARAMETER;
+}
+
+if (!env->vpa) {
+return H_RESOURCE;
+}
+
+env->slb_shadow = addr;
+
+return H_SUCCESS;
+}
+
+static target_ulong deregister_slb_shadow(CPUState *env, target_ulong addr)
+{
+env->slb_shadow = 0;
+return H_SUCCESS;
+}
+
+static target_ulong register_dtl(CPUState *env, target_ulong addr)
+{
+uint32_t size;
+
+if (addr == 0) {
+hcall_dprintf("Can't cope with DTL at logical 0\n");
+return H_HARDWARE;
+}
+
+size = 

[Qemu-devel] [PATCH 07/27] Clean up slb_lookup() function

2011-03-24 Thread David Gibson
The slb_lookup() function, used in the ppc translation path, returns a
number of slb entry fields in reference parameters.  However, only one
of the two callers of slb_lookup() actually wants this information.

This patch, therefore, makes slb_lookup() return a simple pointer to the
located SLB entry (or NULL), and the caller which needs the fields can
extract them itself.
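
The shape of this refactoring can be sketched in isolation: the lookup
returns a pointer to the matching entry (or NULL), and only the caller that
needs derived fields computes them. Everything below (sketch_slb_t,
sketch_lookup, the SKETCH_* constants, the field layout) is a hypothetical
stand-in for illustration, not the actual QEMU types or masks.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_SLOTS   4
#define SKETCH_FIELD_MASK  0xFF00u  /* hypothetical field inside vsid */
#define SKETCH_FIELD_SHIFT 8

/* Hypothetical stand-in for ppc_slb_t. */
typedef struct {
    uint64_t esid;
    uint64_t vsid;
} sketch_slb_t;

/* Return the entry whose esid matches, or NULL if there is none. */
static sketch_slb_t *sketch_lookup(sketch_slb_t *table, uint64_t esid)
{
    for (int n = 0; n < SKETCH_NUM_SLOTS; n++) {
        if (table[n].esid == esid) {
            return &table[n];
        }
    }
    return NULL;
}

/* A caller that only needs the entry itself: no out-parameters to fill. */
static int sketch_clear(sketch_slb_t *table, uint64_t esid)
{
    sketch_slb_t *slb = sketch_lookup(table, esid);

    if (!slb) {
        return 0;
    }
    slb->vsid = 0;
    return 1;
}

/* A caller that needs a derived field extracts it from the entry. */
static int sketch_get_field(sketch_slb_t *table, uint64_t esid,
                            uint64_t *field)
{
    sketch_slb_t *slb = sketch_lookup(table, esid);

    if (!slb) {
        return -1;
    }
    *field = (slb->vsid & SKETCH_FIELD_MASK) >> SKETCH_FIELD_SHIFT;
    return 0;
}
```

Compared with returning several values through reference parameters, this
keeps the lookup signature small and avoids computing fields that one of the
two callers throws away.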

Signed-off-by: David Gibson d...@au1.ibm.com
---
 target-ppc/helper.c |   45 ++---
 1 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index b9621d2..7ca33cb 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -676,9 +676,7 @@ static inline int find_pte(CPUState *env, mmu_ctx_t *ctx, int h, int rw,
 }
 
 #if defined(TARGET_PPC64)
-static inline int slb_lookup(CPUPPCState *env, target_ulong eaddr,
- target_ulong *vsid, target_ulong *page_mask,
- int *attr, int *target_page_bits)
+static inline ppc_slb_t *slb_lookup(CPUPPCState *env, target_ulong eaddr)
 {
 uint64_t esid;
 int n;
@@ -693,19 +691,11 @@ static inline int slb_lookup(CPUPPCState *env, target_ulong eaddr,
 LOG_SLB("%s: slot %d %016" PRIx64 " %016"
 PRIx64 "\n", __func__, n, slb->esid, slb->vsid);
 if (slb->esid == esid) {
-*vsid = (slb->vsid & SLB_VSID_VSID) >> SLB_VSID_SHIFT;
-*page_mask = ~SEGMENT_MASK_256M;
-*attr = slb->vsid & SLB_VSID_ATTR;
-if (target_page_bits) {
-*target_page_bits = (slb->vsid & SLB_VSID_L)
-? TARGET_PAGE_BITS_16M
-: TARGET_PAGE_BITS;
-}
-return n;
+return slb;
 }
 }
 
-return -5;
+return NULL;
 }
 
 void ppc_slb_invalidate_all (CPUPPCState *env)
@@ -732,18 +722,13 @@ void ppc_slb_invalidate_all (CPUPPCState *env)
 
 void ppc_slb_invalidate_one (CPUPPCState *env, uint64_t T0)
 {
-target_ulong vsid, page_mask;
-int attr;
-int n;
 ppc_slb_t *slb;
 
-n = slb_lookup(env, T0, &vsid, &page_mask, &attr, NULL);
-if (n < 0) {
+slb = slb_lookup(env, T0);
+if (!slb) {
 return;
 }
 
-slb = &env->slb[n];
-
 if (slb->esid & SLB_ESID_V) {
 slb->esid &= ~SLB_ESID_V;
 
@@ -822,16 +807,22 @@ static inline int get_segment(CPUState *env, mmu_ctx_t *ctx,
 pr = msr_pr;
 #if defined(TARGET_PPC64)
 if (env->mmu_model & POWERPC_MMU_64) {
-int attr;
+ppc_slb_t *slb;
 
 LOG_MMU("Check SLBs\n");
-ret = slb_lookup(env, eaddr, &vsid, &page_mask, &attr,
- &target_page_bits);
-if (ret < 0)
-return ret;
-ctx->key = !!(pr ? (attr & SLB_VSID_KP) : (attr & SLB_VSID_KS));
+slb = slb_lookup(env, eaddr);
+if (!slb) {
+return -5;
+}
+
+vsid = (slb->vsid & SLB_VSID_VSID) >> SLB_VSID_SHIFT;
+page_mask = ~SEGMENT_MASK_256M;
+target_page_bits = (slb->vsid & SLB_VSID_L)
+? TARGET_PAGE_BITS_16M : TARGET_PAGE_BITS;
+ctx->key = !!(pr ? (slb->vsid & SLB_VSID_KP)
+  : (slb->vsid & SLB_VSID_KS));
 ds = 0;
-ctx->nx = !!(attr & SLB_VSID_N);
+ctx->nx = !!(slb->vsid & SLB_VSID_N);
 ctx->eaddr = eaddr;
 vsid_mask = 0x3F80ULL;
 vsid_sh = 7;
-- 
1.7.1