Re: [Qemu-devel] [PATCH v2] Don't allow multiwrites against a block device without underlying medium

2011-03-30 Thread Markus Armbruster
I think this (commit 301db7c2) should be cherry-picked into stable-0.14.

Ryan Harper ry...@us.ibm.com writes:

 If the block device has been closed, we no longer have a medium to submit
 IO against, check for this before submitting io.  This prevents a segfault
 further in the code where we dereference elements of the block driver.

 Signed-off-by: Ryan Harper ry...@us.ibm.com



Re: [Qemu-devel] [PATCH v4] Do not delete BlockDriverState when deleting the drive

2011-03-30 Thread Markus Armbruster
[Cc: Justin, because I feel it should go into stable-0.14 as well]

Ryan Harper ry...@us.ibm.com writes:

 When removing a drive from the host-side via drive_del we currently have
 the following path:

 drive_del
 qemu_aio_flush()
 bdrv_close()// zaps bs-drv, which makes any subsequent I/O get
 // dropped.  Works as designed
 drive_uninit()
 bdrv_delete()   // frees the bs.  Since the device is still connected to
 // bs, any subsequent I/O is a use-after-free.

 The value of bs-drv becomes unpredictable on free.  As long as it
 remains null, I/O still gets dropped, however it could become non-null
 at any point after the free resulting SEGVs or other QEMU state
 corruption.

 To resolve this issue as simply as possible, we can chose to not
 actually delete the BlockDriverState pointer.  Since bdrv_close()
 handles setting the drv pointer to NULL, we just need to remove the
 BlockDriverState from the QLIST that is used to enumerate the block
 devices.  This is currently handled within bdrv_delete, so move this
 into its own function, bdrv_make_anon().

 The result is that we can now invoke drive_del, this closes the file
 descriptors and sets BlockDriverState-drv to NULL which prevents futher
 IO to the device, and since we do not free BlockDriverState, we don't
 have to worry about the copy retained in the block devices.

 We also don't attempt to remove the qdev property since we are no longer
 deleting the BlockDriverState on drives with associated drives.  This
 also allows for removing Drives with no devices associated either.

 Reported-by: Markus Armbruster arm...@redhat.com
 Signed-off-by: Ryan Harper ry...@us.ibm.com

Acked-by: Markus Armbruster arm...@redhat.com



Re: [Qemu-devel] A question about QEMU on unix

2011-03-30 Thread Jes Sorensen
On 03/29/11 12:55, Bin (Bin) Shi wrote:
 Can QEMU run on QNX ?
 
 My machine is 
 
 Cpu - arm11
 
 Os - qnx6.5
 
 Does QEMU support my machine ?

Hi,

Do you mean if QEMU can emulate ARM11 and boot QNX that way, or do you
want to run QEMU on a QNX box?

I don't think QEMU has been ported to QNX, so it may require a bit of
work. I have no idea how hard it would be though. As for emulating
ARM11, I cannot say.

Cheers,
Jes



Re: [Qemu-devel] [PATCH v3 0/7] Let boards state maximum RAM limits in QEMUMachine struct

2011-03-30 Thread Jes Sorensen
On 03/29/11 16:08, Peter Maydell wrote:
 This primary aim of this patchset is to add a new 'max_ram' field to the
 QEMUMachine structure so that a board model can specify the maximum RAM it
 will accept.  We can then produce a friendly diagnostic message when the
 user tries to start qemu with a '-m' option asking for more RAM than that. 
 (Currently most of the ARM devboard models respond with an obscure guest
 crash when the guest tries to access RAM and finds device registers
 instead.)
 
 If no maximum size is specified we default to the old behaviour of
 do not impose any limit.
 
 The bulk of the patchset is knock-on cleanup as a result, in particular
 allowing QEMUMachine structs to be const and sun4m cleanup.

Hi Peter,

Sorry for not getting to this patch earlier.

I am a little concerned about this approach. It should work for simple
embedded boards, but for larger systems, it really ought to be a mask
rather than a max address. On NUMA systems you are bound to have holes,
and at times you might not even start from address 0. In these cases you
are likely to have device registers mapped in between the memory chunks.

Ideally I think it would be better to have a mask and then introduce a
is_valid_memory() kinda function to check it with.

I fear that by introducing a simple max limit like this, we are going to
hit problems later when we try to improve the NUMA support.

Cheers,
Jes



RE: [Qemu-devel] [PATCH 1/3] unicore32: add target-unicore32 directory for unicore32-linux-user support

2011-03-30 Thread Guan Xuetao


 -Original Message-
 From: Blue Swirl [mailto:blauwir...@gmail.com]
 Sent: Saturday, March 26, 2011 8:51 PM
 To: Guan Xuetao
 Cc: qemu-devel@nongnu.org
 Subject: Re: [Qemu-devel] [PATCH 1/3] unicore32: add target-unicore32 
 directory for unicore32-linux-user support
 
 On Thu, Mar 24, 2011 at 11:25 AM, Guan Xuetao g...@mprc.pku.edu.cn wrote:
  unicore32: add target-unicore32 directory for unicore32-linux-user support
 
  Signed-off-by: Guan Xuetao g...@mprc.pku.edu.cn
  ---
 
  diff --git a/target-unicore32/cpu.h b/target-unicore32/cpu.h
  new file mode 100644
  index 000..0688891
  --- /dev/null
  +++ b/target-unicore32/cpu.h
  @@ -0,0 +1,184 @@
  +/*
  + * UniCore32 virtual CPU header
  + *
  + * Copyright (C) 2010-2011 GUAN Xue-tao
  + *
  + * This program is free software; you can redistribute it and/or modify
  + * it under the terms of the GNU General Public License version 2 as
  + * published by the Free Software Foundation.
  + */
  +#ifndef __UC32_CPU_H__
  +#define __UC32_CPU_H__
 
 Please use CPU_UC32_H.
Applied.

  +typedef struct CPUState_UniCore32 {
  +/* Regs for current mode.  */
  +uint32_t regs[32];
  +/* Frequently accessed ASR bits are stored separately for efficiently.
  +   This contains all the other bits.  Use asr_{read,write} to access
  +   the whole ASR.  */
  +uint32_t uncached_asr;
  +uint32_t bsr;
  +
  +/* Banked registers.  */
  +uint32_t banked_bsr[6];
  +uint32_t banked_r29[6];
  +uint32_t banked_r30[6];
  +
  +/* asr flag cache for faster execution */
  +uint32_t CF; /* 0 or 1 */
  +uint32_t VF; /* V is the bit 31. All other bits are undefined */
  +uint32_t NF; /* N is bit 31. All other bits are undefined.  */
  +uint32_t ZF; /* Z set if zero.  */
  +
  +/* System control coprocessor (cp0) */
  +struct {
  +uint32_t c0_cpuid;
  +uint32_t c0_cachetype;
  +uint32_t c1_sys; /* System control register.  */
  +uint32_t c2_base; /* MMU translation table base.  */
  +uint32_t c3_faultstatus; /* Fault status registers.  */
  +uint32_t c4_faultaddr; /* Fault address registers.  */
  +uint32_t c5_cacheop; /* Cache operation registers.  */
  +uint32_t c6_tlbop; /* TLB operation registers. */
  +} cp0;
  +
  +/* Internal CPU feature flags.  */
  +uint32_t features;
 
 All CPUState fields up to breakpoints are cleared on reset by common
 code. I'd suppose CPU feature shouldn't change on reset, so please
 move this after CPU_COMMON.
Applied.

  +CPUState *uc32_cpu_init(const char *cpu_model);
  +int uc32_cpu_exec(CPUState *s);
  +int uc32_cpu_signal_handler(int host_signum, void *pinfo, void *puc);
  +int uc32_cpu_handle_mmu_fault(CPUState *env, target_ulong address, int rw,
  +  int mmu_idx, int is_softmuu);
  +void uc32_cpu_list(FILE *f, int (*cpu_fprintf)(FILE *f, const char *fmt, 
  ...));
 
 fprintf_function cpu_fprintf
Applied.

  +
  +void cpu_reset(CPUState *env)
  +{
  +uint32_t id;
  +
  +if (qemu_loglevel_mask(CPU_LOG_RESET)) {
  +qemu_log(CPU Reset (CPU %d)\n, env-cpu_index);
  +log_cpu_state(env, 0);
  +}
  +
  +id = env-cp0.c0_cpuid;
  +memset(env, 0, offsetof(CPUState, breakpoints));
 
 This is done in common code.
Applied.

 
  +if (id) {
  +cpu_reset_model_id(env, id);
 
 I don't think this should be done when resetting, but only at CPU init.
Applied.

  +CPUState *uc32_cpu_init(const char *cpu_model)
  +{
  +CPUState *env;
  +uint32_t id;
  +static int inited = 1;
  +
  +id = uc32_cpu_find_by_name(cpu_model);
  +if (id == 0) {
  +return NULL;
  +}
  +env = qemu_mallocz(sizeof(CPUState));
  +cpu_exec_init(env);
  +if (inited) {
  +inited = 0;
  +uc32_translate_init();
  +}
  +
  +env-cpu_model_str = cpu_model;
  +env-cp0.c0_cpuid = id;
  +cpu_reset(env);
 
 This is also performed by common code.
However, only I386/SPARC/PPC call cpu_reset() in linux-user/main.c.
Then the cpu_reset() codes had been merged in cpu_init() for unicore32.

  +
  +uint32_t asr_read(CPUState *env)
  +{
  +int ZF;
  +ZF = (env-ZF == 0);
  +return env-uncached_asr | (env-NF  0x8000) | (ZF  30) |
  +(env-CF  29) | ((env-VF  0x8000)  3);
  +}
  +
  +void asr_write(CPUState *env, uint32_t val, uint32_t mask)
  +{
  +if (mask  ASR_NZCV) {
  +env-ZF = (~val)  ASR_Z;
  +env-NF = val;
  +env-CF = (val  29)  1;
  +env-VF = (val  3)  0x8000;
  +}
  +
  +if ((env-uncached_asr ^ val)  mask  ASR_M) {
  +switch_mode(env, val  ASR_M);
  +}
  +mask = ~ASR_NZCV;
  +env-uncached_asr = (env-uncached_asr  ~mask) | (val  mask);
  +}
 
 Helpers like this and especially HELPER stuff below belong to op_helper.c.
 
 If a function is used by helpers and other parts, two sets of
 functions should be created. The 

RE: [Qemu-devel] A question about QEMU on unix

2011-03-30 Thread Bin (Bin) Shi
I want to run QEMU on a QNX box, to emulate an android system.


-Original Message-
From: Jes Sorensen [mailto:jes.soren...@redhat.com] 
Sent: Wednesday, March 30, 2011 3:41 PM
To: Bin (Bin) Shi
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] A question about QEMU on unix

On 03/29/11 12:55, Bin (Bin) Shi wrote:
 Can QEMU run on QNX ?
 
 My machine is 
 
 Cpu - arm11
 
 Os - qnx6.5
 
 Does QEMU support my machine ?

Hi,

Do you mean if QEMU can emulate ARM11 and boot QNX that way, or do you
want to run QEMU on a QNX box?

I don't think QEMU has been ported to QNX, so it may require a bit of
work. I have no idea how hard it would be though. As for emulating
ARM11, I cannot say.

Cheers,
Jes



Re: [Qemu-devel] [PATCH v3 0/7] Let boards state maximum RAM limits in QEMUMachine struct

2011-03-30 Thread Peter Maydell
On 30 March 2011 08:48, Jes Sorensen jes.soren...@redhat.com wrote:
 On 03/29/11 16:08, Peter Maydell wrote:
 This primary aim of this patchset is to add a new 'max_ram' field to the
 QEMUMachine structure so that a board model can specify the maximum RAM it
 will accept.

 I am a little concerned about this approach. It should work for simple
 embedded boards, but for larger systems, it really ought to be a mask
 rather than a max address.

It's not a maximum address, it's a maximum size. For instance
the RAM isn't contiguous on some of the ARM devboards.

 Ideally I think it would be better to have a mask and then introduce a
 is_valid_memory() kinda function to check it with.

The command line option doesn't provide any means of saying
put 64MB in this hole and another 128 over here and 32 there,
so this seems completely pointless to me. All we are trying
to do is validate what the user has asked for, so why have
a validation mechanism that can cope with impossible-to-request
arrangements?

 I fear that by introducing a simple max limit like this, we are going to
 hit problems later when we try to improve the NUMA support.

I think this is letting the best be the enemy of the good.

Even if you do want to have NUMA systems do more complex
things I think you should still have the simple maximum
size approach for the bulk of the supported boards which
don't need anything more complicated. So additional NUMA
features would augment, not replace this.

-- PMM



[Qemu-devel] [PATCHv2 0/3] unicore32: add unicore32-linux-user support for qemu 0.14

2011-03-30 Thread Guan Xuetao

The patch set adds new unicore32-linux-user support for qemu-stable-0.14
Patch 1 adds target-unicore32 directory
Patch 2 adds linux-user/unicore32 directory
Patch 3 adds necessary modifications for other files

V1 - V2: changed by advice from Blue Swirl


Guan Xuetao (3):
  unicore32: add target-unicore32 directory for unicore32-linux-user
support
  unicore32: add necessry headers in linux-user/unicore32 for unicore32
support
  unicore32: necessary modifications for other files to support
unicore32

 configure|   11 +-
 cpu-exec.c   |   12 +-
 default-configs/unicore32-linux-user.mak |1 +
 elf.h|2 +
 fpu/softfloat-specialize.h   |   10 +-
 linux-user/elfload.c |   74 ++
 linux-user/main.c|   89 ++-
 linux-user/qemu.h|5 +-
 linux-user/syscall_defs.h|   10 +-
 linux-user/unicore32/syscall.h   |   55 +
 linux-user/unicore32/syscall_nr.h|  371 ++
 linux-user/unicore32/target_signal.h |   26 +
 linux-user/unicore32/termbits.h  |2 +
 target-unicore32/cpu.h   |  182 +++
 target-unicore32/exec.h  |   50 +
 target-unicore32/helper.c|  488 +++
 target-unicore32/helper.h|   70 +
 target-unicore32/op_helper.c |  248 
 target-unicore32/translate.c | 2105 ++
 19 files changed, 3798 insertions(+), 13 deletions(-)
 create mode 100644 default-configs/unicore32-linux-user.mak
 create mode 100644 linux-user/unicore32/syscall.h
 create mode 100644 linux-user/unicore32/syscall_nr.h
 create mode 100644 linux-user/unicore32/target_signal.h
 create mode 100644 linux-user/unicore32/termbits.h
 create mode 100644 target-unicore32/cpu.h
 create mode 100644 target-unicore32/exec.h
 create mode 100644 target-unicore32/helper.c
 create mode 100644 target-unicore32/helper.h
 create mode 100644 target-unicore32/op_helper.c
 create mode 100644 target-unicore32/translate.c





[Qemu-devel] [PATCHv2 2/3] unicore32: add necessry headers in linux-user/unicore32 for unicore32 support

2011-03-30 Thread Guan Xuetao

Signed-off-by: Guan Xuetao g...@mprc.pku.edu.cn
---
 linux-user/unicore32/syscall.h   |   55 +
 linux-user/unicore32/syscall_nr.h|  371 ++
 linux-user/unicore32/target_signal.h |   26 +++
 linux-user/unicore32/termbits.h  |2 +
 4 files changed, 454 insertions(+), 0 deletions(-)
 create mode 100644 linux-user/unicore32/syscall.h
 create mode 100644 linux-user/unicore32/syscall_nr.h
 create mode 100644 linux-user/unicore32/target_signal.h
 create mode 100644 linux-user/unicore32/termbits.h

diff --git a/linux-user/unicore32/syscall.h b/linux-user/unicore32/syscall.h
new file mode 100644
index 000..010cdd8
--- /dev/null
+++ b/linux-user/unicore32/syscall.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2010-2011 GUAN Xue-tao
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef __UC32_SYSCALL_H__
+#define __UC32_SYSCALL_H__
+struct target_pt_regs {
+abi_ulong uregs[34];
+};
+
+#define UC32_REG_pc uregs[31]
+#define UC32_REG_lr uregs[30]
+#define UC32_REG_sp uregs[29]
+#define UC32_REG_ip uregs[28]
+#define UC32_REG_fp uregs[27]
+#define UC32_REG_26 uregs[26]
+#define UC32_REG_25 uregs[25]
+#define UC32_REG_24 uregs[24]
+#define UC32_REG_23 uregs[23]
+#define UC32_REG_22 uregs[22]
+#define UC32_REG_21 uregs[21]
+#define UC32_REG_20 uregs[20]
+#define UC32_REG_19 uregs[19]
+#define UC32_REG_18 uregs[18]
+#define UC32_REG_17 uregs[17]
+#define UC32_REG_16 uregs[16]
+#define UC32_REG_15 uregs[15]
+#define UC32_REG_14 uregs[14]
+#define UC32_REG_13 uregs[13]
+#define UC32_REG_12 uregs[12]
+#define UC32_REG_11 uregs[11]
+#define UC32_REG_10 uregs[10]
+#define UC32_REG_09 uregs[9]
+#define UC32_REG_08 uregs[8]
+#define UC32_REG_07 uregs[7]
+#define UC32_REG_06 uregs[6]
+#define UC32_REG_05 uregs[5]
+#define UC32_REG_04 uregs[4]
+#define UC32_REG_03 uregs[3]
+#define UC32_REG_02 uregs[2]
+#define UC32_REG_01 uregs[1]
+#define UC32_REG_00 uregs[0]
+#define UC32_REG_asruregs[32]
+#define UC32_REG_ORIG_00uregs[33]
+
+#define UC32_SYSCALL_BASE   0x90
+#define UC32_SYSCALL_ARCH_BASE  0xf
+#define UC32_SYSCALL_NR_set_tls (UC32_SYSCALL_ARCH_BASE + 5)
+
+#define UNAME_MACHINE UniCore-II
+
+#endif /* __UC32_SYSCALL_H__ */
diff --git a/linux-user/unicore32/syscall_nr.h 
b/linux-user/unicore32/syscall_nr.h
new file mode 100644
index 000..9c72d84
--- /dev/null
+++ b/linux-user/unicore32/syscall_nr.h
@@ -0,0 +1,371 @@
+/*
+ * This file contains the system call numbers for UniCore32 oldabi.
+ *
+ * Copyright (C) 2010-2011 GUAN Xue-tao
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#define TARGET_NR_restart_syscall   0
+#define TARGET_NR_exit  1
+#define TARGET_NR_fork  2
+#define TARGET_NR_read  3
+#define TARGET_NR_write 4
+#define TARGET_NR_open  5
+#define TARGET_NR_close 6
+#define TARGET_NR_waitpid   7
+#define TARGET_NR_creat 8
+#define TARGET_NR_link  9
+#define TARGET_NR_unlink10
+#define TARGET_NR_execve11
+#define TARGET_NR_chdir 12
+#define TARGET_NR_time  13
+#define TARGET_NR_mknod 14
+#define TARGET_NR_chmod 15
+#define TARGET_NR_lchown16
+#define TARGET_NR_break 17
+/* 18 */
+#define TARGET_NR_lseek 19
+#define TARGET_NR_getpid20
+#define TARGET_NR_mount 21
+#define TARGET_NR_umount22
+#define TARGET_NR_setuid23
+#define TARGET_NR_getuid24
+#define TARGET_NR_stime 25
+#define TARGET_NR_ptrace26
+#define TARGET_NR_alarm 27
+/* 28 */
+#define TARGET_NR_pause 29
+#define TARGET_NR_utime 30
+#define TARGET_NR_stty  31
+#define TARGET_NR_gtty  

Re: [Qemu-devel] GSoC: Improved image format compatibility

2011-03-30 Thread Stefan Hajnoczi
On Wed, Mar 30, 2011 at 5:50 AM, Stefan Weil w...@mail.berlios.de wrote:
 Am 29.03.2011 23:55, schrieb Lyu Mitnick:

 Hello all,

 I have used QEMU to assist me developing embedded system for 3 years. And
 I
 want to contribute to QEMU and participate Google Summer of Code this
 year.
 I have port binutils so I have some experienced with binary format. I
 noticed that there is no VHD support in qemu-img. A Virtual Hard Disk(VHD)
 is a virtual hard disk file format used in Microsoft virtual PC, Hyper-V,
 virtual box, xVM and Complete PC Backup  of vista and windows 7. I want to
 add support for VHD in GSoC. I am wondering whether this kind of idea is
 valuable to QEMU?? Or any another topic related image format is valuable
 to
 QEMU??

 thanks

 Mitnick

 Hello Mitnick,

 QEMU (including qemu-img) already supports the format vpc.
 The code in block/vpc.c also uses the name vhd, so I think
 vpc == vhd.

However, it is possible that vpc/vhd has new features that are not yet
supported in QEMU.  There is a listed project idea for updating image
formats on the GSoC wiki page:

http://wiki.qemu.org/Google_Summer_of_Code_2011#Improved_image_format_compatibility

You could assess QEMU's implementation in block/vpc.c and compare
against the latest public file format specification:

http://en.wikipedia.org/wiki/VHD_(file_format)

If you find there are new features or format changes that are not yet
supported in QEMU, this might make a good Improved image format
compatibility GSoC project.

Stefan



Re: [Qemu-devel] [PATCH v2 0/3] block: Correct size across CD-ROM media change

2011-03-30 Thread Markus Armbruster
Stefan Hajnoczi stefa...@linux.vnet.ibm.com writes:

 This patch series fixes two Linux host CD-ROM pass-through bugs in QEMU.

 After applying these patches it is possible to pass-through a Linux host 
 CD-ROM
 completely.  The guest can eject from software or the physical eject button 
 can
 be pressed on the drive.  The guest can detect this and newly inserted media
 are noticed.  There is no need to issue any QEMU monitor 'eject' or 'change'
 commands because the host CD-ROM is completely passed through.

Things can get confusing here, as eject is an overloaded term :)

Let me try to preempt such confusion.  We have three separate actions to
consider: OS opening and closing the tray, and QEMU monitor commands
eject and change, and the user inserting/removing media from a
physical tray.

On bare metal, OS open/close tray affects the physical tray the obvious
way.  The user can insert/remove media while the tray is open.

A virtual CD-ROM is backed by a QEMU block device (the things info
block shows).  The block device can be empty (seen by the gues OS as
no media), or it can be connected to a file.  Monitor commands eject
and change manipulate that connection.

Guest OS open/close tray affects the virtual tray the obvious way.  In
particular, if the OS opens, then closes the tray, it gets the same
media back, unless the user changed it[*].

Normally, a block device's file is an image file.  Monitor commands
eject and change are seen by the guest OS as media change.

Besides image files, we can also use host block devices.  This adds
another way to change media: The user can insert/remove physical media
while the physical tray is open.

Regardless, monitor commands eject and change still work, and are
still seen by the guest OS as media change.

 Patch details:

 The first is that the device size is cached even for removable devices and we
 never update it.  If a host CD-ROM is changed, then the size will be stale and
 reflect that of the last medium.

 The second is that Linux host CD-ROM pass-through requires that we re-open the
 device to refresh its size.  This is because the Linux CD-ROM driver only
 refreshes the size when the device is opened and furthermore has a bug that
 leads to stale sizes if the file descriptor is held across media change.

 I have also included a trace event for bdrv_set_locked() because it is useful
 information when debugging issues like these in the future.

 v2:
  * Clarify cdrom_is_inserted() comment as per Juan's suggestion

I plan to give these patches a swirl to see how they affect some other
CD-ROM weirdness I'm debugging.  Thanks!


[*] Used not to work, see commit 4be9762a.



[Qemu-devel] [PATCHv2 3/3] unicore32: necessary modifications for other files to support unicore32

2011-03-30 Thread Guan Xuetao

Signed-off-by: Guan Xuetao g...@mprc.pku.edu.cn
---
 configure|   11 +++-
 cpu-exec.c   |   12 -
 default-configs/unicore32-linux-user.mak |1 +
 elf.h|2 +
 fpu/softfloat-specialize.h   |   10 ++--
 linux-user/elfload.c |   74 +
 linux-user/main.c|   89 +-
 linux-user/qemu.h|5 +-
 linux-user/syscall_defs.h|   10 ++-
 9 files changed, 201 insertions(+), 13 deletions(-)
 create mode 100644 default-configs/unicore32-linux-user.mak

diff --git a/configure b/configure
index 598e8e1..a6633cf 100755
--- a/configure
+++ b/configure
@@ -280,7 +280,7 @@ else
 fi
 
 case $cpu in
-  alpha|cris|ia64|m68k|microblaze|ppc|ppc64|sparc64)
+  alpha|cris|ia64|m68k|microblaze|ppc|ppc64|sparc64|unicore32)
 cpu=$cpu
   ;;
   i386|i486|i586|i686|i86pc|BePC)
@@ -793,6 +793,9 @@ case $cpu in
 hppa*)
host_guest_base=yes
;;
+unicore32*)
+   host_guest_base=yes
+   ;;
 esac
 
 [ -z $guest_base ]  guest_base=$host_guest_base
@@ -1018,6 +1021,7 @@ sh4eb-linux-user \
 sparc-linux-user \
 sparc64-linux-user \
 sparc32plus-linux-user \
+unicore32-linux-user \
 
 fi
 # the following are Darwin specific
@@ -2495,7 +2499,7 @@ echo docdir=$docdir  $config_host_mak
 echo confdir=$confdir  $config_host_mak
 
 case $cpu in
-  
i386|x86_64|alpha|cris|hppa|ia64|m68k|microblaze|mips|mips64|ppc|ppc64|s390|s390x|sparc|sparc64)
+  
i386|x86_64|alpha|cris|hppa|ia64|m68k|microblaze|mips|mips64|ppc|ppc64|s390|s390x|sparc|sparc64|unicore32)
 ARCH=$cpu
   ;;
   armv4b|armv4l)
@@ -3007,6 +3011,9 @@ case $target_arch2 in
   s390x)
 target_phys_bits=64
   ;;
+  unicore32)
+target_phys_bits=32
+  ;;
   *)
 echo Unsupported target CPU
 exit 1
diff --git a/cpu-exec.c b/cpu-exec.c
index 8c9fb8b..130e0c3 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -262,6 +262,7 @@ int cpu_exec(CPUState *env1)
 env-cc_x = (env-sr  4)  1;
 #elif defined(TARGET_ALPHA)
 #elif defined(TARGET_ARM)
+#elif defined(TARGET_UNICORE32)
 #elif defined(TARGET_PPC)
 #elif defined(TARGET_MICROBLAZE)
 #elif defined(TARGET_MIPS)
@@ -326,6 +327,8 @@ int cpu_exec(CPUState *env1)
 do_interrupt(env);
 #elif defined(TARGET_ARM)
 do_interrupt(env);
+#elif defined(TARGET_UNICORE32)
+do_interrupt(env);
 #elif defined(TARGET_SH4)
do_interrupt(env);
 #elif defined(TARGET_ALPHA)
@@ -363,7 +366,7 @@ int cpu_exec(CPUState *env1)
 }
 #if defined(TARGET_ARM) || defined(TARGET_SPARC) || defined(TARGET_MIPS) || \
 defined(TARGET_PPC) || defined(TARGET_ALPHA) || defined(TARGET_CRIS) || \
-defined(TARGET_MICROBLAZE)
+defined(TARGET_MICROBLAZE) || defined(TARGET_UNICORE32)
 if (interrupt_request  CPU_INTERRUPT_HALT) {
 env-interrupt_request = ~CPU_INTERRUPT_HALT;
 env-halted = 1;
@@ -503,6 +506,12 @@ int cpu_exec(CPUState *env1)
 do_interrupt(env);
 next_tb = 0;
 }
+#elif defined(TARGET_UNICORE32)
+if (interrupt_request  CPU_INTERRUPT_HARD
+ !(env-uncached_asr  ASR_I)) {
+do_interrupt(env);
+next_tb = 0;
+}
 #elif defined(TARGET_SH4)
 if (interrupt_request  CPU_INTERRUPT_HARD) {
 do_interrupt(env);
@@ -653,6 +662,7 @@ int cpu_exec(CPUState *env1)
 env-eflags = env-eflags | helper_cc_compute_all(CC_OP) | (DF  DF_MASK);
 #elif defined(TARGET_ARM)
 /* XXX: Save/restore host fpu exception state?.  */
+#elif defined(TARGET_UNICORE32)
 #elif defined(TARGET_SPARC)
 #elif defined(TARGET_PPC)
 #elif defined(TARGET_M68K)
diff --git a/default-configs/unicore32-linux-user.mak 
b/default-configs/unicore32-linux-user.mak
new file mode 100644
index 000..6aafd21
--- /dev/null
+++ b/default-configs/unicore32-linux-user.mak
@@ -0,0 +1 @@
+# Default configuration for unicore32-linux-user
diff --git a/elf.h b/elf.h
index 7067c90..876c1da 100644
--- a/elf.h
+++ b/elf.h
@@ -105,6 +105,8 @@ typedef int64_t  Elf64_Sxword;
 #define EM_H8_300H  47  /* Hitachi H8/300H */
 #define EM_H8S  48  /* Hitachi H8S */
 
+#define EM_UNICORE32110 /* UniCore32 */
+
 /*
  * This is an interim value that we will use until the committee comes
  * up with a final number.
diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index eb644b2..839049d 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -30,7 +30,7 @@ these four paragraphs for those parts of this code that are 
retained.
 
 

[Qemu-devel] Re: virtio-blk.c handling of i/o which is not a 512 multiple

2011-03-30 Thread Stefan Hajnoczi
On Wed, Mar 30, 2011 at 9:15 AM, Conor Murphy
conor_murphy_v...@hotmail.com wrote:
 I'm trying to write a virtio-blk driver for Solaris. I've gotten it to the 
 point
 where Solaris can see the device and create a ZFS file system on it.

 However when I try and create a UFS filesystem on the device, the VM crashed
 with the error
 *** glibc detected *** /usr/bin/qemu-kvm: double free or corruption (!prev):
 0x7f2d38000a00 ***

This is a bug in QEMU.  A guest must not be able to trigger a crash.

 I can reproduce the problem with a simple dd, i.e.
 dd if=/dev/zero of=/dev/rdsk/c2d10p0 bs=5000 count=1

I think this a raw character device, which is why you're even able to
perform non-blocksize accesses?  Have you looked at how other drivers
(like the Xen pv blkfront) handle this?

 My driver will create a virtio-blk request with two elements in the sg list, 
 one
 for the first 4096 byes and the other for the remaining 904.

 From stepping through with gdb, virtio_blk_handle_write will sets n_sectors 
 to 9
 (5000 / 512). Later on the code, n_sectors is used the calculate the size of 
 the
 buffer required but 9 * 512 is too small and so when the request is process it
 ends up writing past the end of the buffer and I guest this triggers the glibc
 error.

We need to validate that (qiov-size % BDRV_SECTOR_SIZE) == 0 and
reject invalid requests.

 Is there a requirement for virtio-blk guest drivers that all i/o requests are
 sized in multiples of 512 bytes?

There is no strict requirement according to the virtio specification,
but maybe there should be:

http://ozlabs.org/~rusty/virtio-spec/virtio-spec-0.8.9.pdf

Stefan



Re: [Qemu-devel] [PATCH] net: Improve the warnings for dubious command line option combinations

2011-03-30 Thread Peter Maydell
Ping? It would be nice if this could be committed, it's blocking
the versatile express support patch.

thanks
-- PMM

On 22 March 2011 18:39, Peter Maydell peter.mayd...@linaro.org wrote:
 Improve the warnings we give if the user specified a combination of -net
 options which don't make much sense:
  * Fix a bug where we would only complain about the first VLAN having
   no NIC or no host network connection; we now diagnose this situation
   for all VLANs
  * Don't warn about anything if the config is the implicit default
   -net user -net nic rather than one specified by the user (this will
   only kick in for boards with no NIC or if CONFIG_SLIRP is not set)
  * Diagnose the case where the user asked for NICs which the board
   didn't instantiate (for example where the user asked for two NICs
   but the board only supports one)

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 The motivation for this patch is that I thought it made more sense
 to complain about unused NIC specifications in generic code than
 force every board to do it; see discussion of the vexpress patch
 http://patchwork.ozlabs.org/patch/85727/

  net.c |   34 +-
  1 files changed, 33 insertions(+), 1 deletions(-)

 diff --git a/net.c b/net.c
 index ddcca97..9d3aaf5 100644
 --- a/net.c
 +++ b/net.c
 @@ -1305,12 +1305,30 @@ void net_check_clients(void)
  {
     VLANState *vlan;
     VLANClientState *vc;
 -    int has_nic = 0, has_host_dev = 0;
 +    int has_nic, has_host_dev;
 +    int seen_nics = 0;
 +
 +    /* Don't warn about the default network setup that you get if
 +     * no command line -net options are specified. There are two
 +     * cases that we would otherwise complain about:
 +     * (1) board doesn't support a NIC but the implicit -net nic
 +     * requested one; we'd otherwise complain about more NICs being
 +     * specified than we support, and also that the vlan set up by
 +     * the implicit -net user didn't have any NICs connected to it
 +     * (2) CONFIG_SLIRP not set: we'd otherwise complain about the
 +     * implicit -net nic setting up a nic that wasn't connected to
 +     * anything.
 +     */
 +    if (default_net) {
 +        return;
 +    }

     QTAILQ_FOREACH(vlan, vlans, next) {
 +        has_nic = has_host_dev = 0;
         QTAILQ_FOREACH(vc, vlan-clients, next) {
             switch (vc-info-type) {
             case NET_CLIENT_TYPE_NIC:
 +                seen_nics++;
                 has_nic = 1;
                 break;
             case NET_CLIENT_TYPE_SLIRP:
 @@ -1330,12 +1348,26 @@ void net_check_clients(void)
                     vlan-id);
     }
     QTAILQ_FOREACH(vc, non_vlan_clients, next) {
 +        if (vc-info-type == NET_CLIENT_TYPE_NIC) {
 +            seen_nics++;
 +        }
         if (!vc-peer) {
             fprintf(stderr, Warning: %s %s has no peer\n,
                     vc-info-type == NET_CLIENT_TYPE_NIC ? nic : netdev,
                     vc-name);
         }
     }
 +    if (seen_nics != nb_nics) {
 +        /* Number of NICs requested by user on command line doesn't match
 +         * the number the model actually registered with us.
 +         * This will generally only happen for models of embedded boards
 +         * with no PCI bus or similar. PCI based machines can instantiate
 +         * all requested NICs as PCI devices but usually embedded boards
 +         * only have a single NIC.
 +         */
 +        fprintf(stderr, Warning: more nics requested than this machine 
 +                supports; some have been ignored\n);
 +    }
  }

  static int net_init_client(QemuOpts *opts, void *dummy)
 --
 1.7.1



Re: [Qemu-devel] [PATCH repost] qemu-img: Initial progress printing support

2011-03-30 Thread Stefan Hajnoczi
On Tue, Mar 29, 2011 at 9:51 AM,  jes.soren...@redhat.com wrote:
 diff --git a/qemu-common.h b/qemu-common.h
 index 7a96dd1..a3a4dde 100644
 --- a/qemu-common.h
 +++ b/qemu-common.h
 @@ -330,6 +330,11 @@ void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t 
 count);
  void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
                             size_t skip);

 +void qemu_init_progress(int enabled, float min_skip);

Please call it qemu_progress_init() to be consistent with the other
qemu_progress_*() functions.

 @@ -642,6 +648,9 @@ static int img_convert(int argc, char **argv)
         goto out;
     }

 +    qemu_init_progress(progress, 2.0);
 +    qemu_progress_print(0, 100);
 +
     bs = qemu_mallocz(bs_n * sizeof(BlockDriverState *));

     total_sectors = 0;
 @@ -773,6 +782,11 @@ static int img_convert(int argc, char **argv)
         }
         cluster_sectors = cluster_size  9;
         sector_num = 0;
 +
 +        nb_sectors = total_sectors - sector_num;

sector_num is always 0 here, why subtract it?

 +        local_progress = (float)100 /
 +            (nb_sectors / MIN(nb_sectors, (cluster_sectors)));

This seems to calculate the percentage increment for each iteration.
This increment value is wrong for the final iteration unless
nb_sectors is a multiple of cluster_sectors, so we'll overshoot 100%.

It would be nice if the caller didn't have to calculate progress
themselves.  How about:

void qemu_progress_begin(bool enabled, uint64_t max);
void qemu_progress_end(void);
void qemu_progress_add(uint64_t increment);

You set the maximum absolute value and then tell it how much progress
has been made each iteration:
qemu_progress_begin(true, total_sectors);
for (...) {
nbytes = ...;
qemu_progress_add(nbytes);
}
qemu_progress_end();

 +void qemu_init_progress(int enabled, float min_skip)
 +{
 +    state.enabled = enabled;
 +    if (!enabled) {
 +        return;
 +    }

This early return is not needed.

 +    state.min_skip = min_skip;
 +}
 +
 +void qemu_progress_end(void)
 +{
 +    progress_simple_end();
 +}
 +
 +void qemu_progress_print(float percent, int max)
 +{
 +    float current;
 +
 +    if (max == 0) {
 +        current = percent;
 +    } else {
 +        current = state.current + percent / 100 * max;
 +    }
 +    state.current = current;
 +
 +    if (current  (state.last_print + state.min_skip) ||
 +        (current == 100) || (current == 0)) {

Comparing against (float)100 may not match due to floating point representation.

 +        state.last_print = state.current;
 +        progress_simple_print();
 +    }
 +}
 +
 +int qemu_progress_get_current(void)
 +{
 +    return state.current;
 +}

This function is unused.

Stefan



Re: [Qemu-devel] [PATCH v2 0/3] block: Correct size across CD-ROM media change

2011-03-30 Thread Stefan Hajnoczi
On Wed, Mar 30, 2011 at 9:33 AM, Markus Armbruster arm...@redhat.com wrote:
 Stefan Hajnoczi stefa...@linux.vnet.ibm.com writes:

 This patch series fixes two Linux host CD-ROM pass-through bugs in QEMU.

 After applying these patches it is possible to pass-through a Linux host 
 CD-ROM
 completely.  The guest can eject from software or the physical eject button 
 can
 be pressed on the drive.  The guest can detect this and newly inserted media
 are noticed.  There is no need to issue any QEMU monitor 'eject' or 'change'
 commands because the host CD-ROM is completely passed through.

 Things can get confusing here, as eject is an overloaded term :)

 Let me try to preempt such confusion.  We have three separate actions to
 consider: OS opening and closing the tray, and QEMU monitor commands
 eject and change, and the user inserting/removing media from a
 physical tray.

 On bare metal, OS open/close tray affects the physical tray the obvious
 way.  The user can insert/remove media while the tray is open.

 A virtual CD-ROM is backed by a QEMU block device (the things info
 block shows).  The block device can be empty (seen by the gues OS as
 no media), or it can be connected to a file.  Monitor commands eject
 and change manipulate that connection.

 Guest OS open/close tray affects the virtual tray the obvious way.  In
 particular, if the OS opens, then closes the tray, it gets the same
 media back, unless the user changed it[*].

 Normally, a block device's file is an image file.  Monitor commands
 eject and change are seen by the guest OS as media change.

 Besides image files, we can also use host block devices.  This adds
 another way to change media: The user can insert/remove physical media
 while the physical tray is open.

 Regardless, monitor commands eject and change still work, and are
 still seen by the guest OS as media change.

I agree with your description.

These patches improve Linux host CD-ROM pass-through.  They do not
help ISO CD-ROM weirdness but I'm interested in seeing what CD-ROM
issues you're investigating.

Stefan



[Qemu-devel] Re: [PATCH 00/19] s390x emulation support

2011-03-30 Thread Peter Maydell
On 29 March 2011 14:29, Alexander Graf ag...@suse.de wrote:
 We've had support for running s390x guests with KVM for a
 while now. This patch set also enables support for running
 s390x guests in system as well as linux-user mode in emulation!

Is there an available ISA manual you can add a wiki link to?

http://wiki.qemu.org/Documentation/ISAManuals

(That page is a bit bare at the moment, I know...)

-- PMM



[Qemu-devel] Re: [PATCH 00/19] s390x emulation support

2011-03-30 Thread Alexander Graf

On 30.03.2011, at 12:23, Peter Maydell wrote:

 On 29 March 2011 14:29, Alexander Graf ag...@suse.de wrote:
 We've had support for running s390x guests with KVM for a
 while now. This patch set also enables support for running
 s390x guests in system as well as linux-user mode in emulation!
 
 Is there an available ISA manual you can add a wiki link to?
 
 http://wiki.qemu.org/Documentation/ISAManuals
 
 (That page is a bit bare at the moment, I know...)

Oh, nice page! Didn't know it existed :). Very useful. I added a link to a POP 
for z9 (new enough for the target we're emulating and available without 
registering with IBM) and also updated the PowerPC pieces.


Alex




Re: [Qemu-devel] [PATCH v3 0/7] Let boards state maximum RAM limits in QEMUMachine struct

2011-03-30 Thread Jes Sorensen
On 03/30/11 10:09, Peter Maydell wrote:
 On 30 March 2011 08:48, Jes Sorensen jes.soren...@redhat.com wrote:
 I am a little concerned about this approach. It should work for simple
 embedded boards, but for larger systems, it really ought to be a mask
 rather than a max address.
 
 It's not a maximum address, it's a maximum size. For instance
 the RAM isn't contiguous on some of the ARM devboards.

Right, but the fact that you can have holes makes it even more an issue.

 Ideally I think it would be better to have a mask and then introduce a
 is_valid_memory() kinda function to check it with.
 
 The command line option doesn't provide any means of saying
 put 64MB in this hole and another 128 over here and 32 there,
 so this seems completely pointless to me. All we are trying
 to do is validate what the user has asked for, so why have
 a validation mechanism that can cope with impossible-to-request
 arrangements?

You can on x86 using the -numa argument. When you use that, it is
completely pointless to have the max memory limit, to use your own words.

 I fear that by introducing a simple max limit like this, we are going to
 hit problems later when we try to improve the NUMA support.
 
 I think this is letting the best be the enemy of the good.
 
 Even if you do want to have NUMA systems do more complex
 things I think you should still have the simple maximum
 size approach for the bulk of the supported boards which
 don't need anything more complicated. So additional NUMA
 features would augment, not replace this.

The problem is we end up with two mechanisms that way which won't
necessarily live well together.

Since you have holes on ARM too, it makes sense IMHO to use a mask
exactly because you can then map that to max memory by simply adding up
the available holes. A linear number on the other hand is much harder to
validate against once you start populating holes explicitly.

Cheers,
Jes




[Qemu-devel] Re: [PATCH V12 00/17] Xen device model support

2011-03-30 Thread Alexander Graf

On 29.03.2011, at 20:27, anthony.per...@citrix.com wrote:

 From: Anthony PERARD anthony.per...@citrix.com
 
 Hi all,
 
 Here is the few change made since the v11:
  - All objects built for each target are now build only for i386 targets.
  - The compatibility macros have been replaced by static inline functions.
  - xen_platform_init replaced by a call to pci_create_simple.
  - Use of bitmap.h in xen mapcache
  - Xen Mapcache have a max cache limit now.
  - Compilation of x86_64 target fixed.
  - A ram_ptr leak is fixed in pci_add_option_rom.
 
 This series depends on the series Introduce machine QemuOpts.
 
 You can find a git tree here:
 
 git://xenbits.xen.org/people/aperard/qemu-dm.git qemu-dm-v12

Acked-by: Alexander Graf ag...@suse.de


Thanks a lot!

Alex




[Qemu-devel] [PATCH 3/3] Basic implementation of Sharp Zaurus SL-5500 collie PDA

2011-03-30 Thread Dmitry Eremin-Solenikov
Add very basic implementation of collie PDA emulation. The system lacks
LoCoMo and graphics/sound emulation. Linux kernel boots up to mounting
rootfs (theoretically it can be provided in pflash images).

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 Makefile.target |1 +
 hw/collie.c |   70 +++
 2 files changed, 71 insertions(+), 0 deletions(-)
 create mode 100644 hw/collie.c

diff --git a/Makefile.target b/Makefile.target
index d071a4d..abc2978 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -329,6 +329,7 @@ obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o 
syborg_keyboard.o
 obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
 obj-arm-y += syborg_virtio.o
 obj-arm-y += strongarm.o
+obj-arm-y += collie.o
 
 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/collie.c b/hw/collie.c
new file mode 100644
index 000..965fd13
--- /dev/null
+++ b/hw/collie.c
@@ -0,0 +1,70 @@
+/*
+ * SA-1110-based Sharp Zaurus SL-5500 platform.
+ *
+ * Copyright (C) 2011 Dmitry Eremin-Solenikov
+ *
+ * This code is licensed under GNU GPL v2.
+ */
+#include hw.h
+#include sysbus.h
+#include boards.h
+#include devices.h
+#include strongarm.h
+#include arm-misc.h
+#include flash.h
+#include blockdev.h
+
+static struct arm_boot_info collie_binfo = {
+.loader_start = SA_SDCS0,
+.ram_size = 0x2000,
+};
+
+static void collie_init(ram_addr_t ram_size,
+const char *boot_device,
+const char *kernel_filename, const char *kernel_cmdline,
+const char *initrd_filename, const char *cpu_model)
+{
+StrongARMState *s;
+DriveInfo *dinfo;
+ram_addr_t phys_flash;
+
+if (!cpu_model) {
+cpu_model = sa1110;
+}
+
+s = sa1110_init(collie_binfo.ram_size, cpu_model);
+(void) s;
+
+phys_flash = qemu_ram_alloc(NULL, collie.fl1, 0x0200);
+dinfo = drive_get(IF_PFLASH, 0, 0);
+pflash_cfi01_register(SA_CS0, phys_flash,
+dinfo ? dinfo-bdrv : NULL, (64 * 1024),
+512, 4, 0x00, 0x00, 0x00, 0x00, 0);
+
+phys_flash = qemu_ram_alloc(NULL, collie.fl2, 0x0200);
+dinfo = drive_get(IF_PFLASH, 0, 1);
+pflash_cfi01_register(SA_CS1, phys_flash,
+dinfo ? dinfo-bdrv : NULL, (64 * 1024),
+512, 4, 0x00, 0x00, 0x00, 0x00, 0);
+
+sysbus_create_simple(scoop, 0x4080, NULL);
+
+collie_binfo.kernel_filename = kernel_filename;
+collie_binfo.kernel_cmdline = kernel_cmdline;
+collie_binfo.initrd_filename = initrd_filename;
+collie_binfo.board_id = 0x208;
+arm_load_kernel(s-env, collie_binfo);
+}
+
+static QEMUMachine collie_machine = {
+.name = collie,
+.desc = Collie PDA (SA-1110),
+.init = collie_init,
+};
+
+static void collie_machine_init(void)
+{
+qemu_register_machine(collie_machine);
+}
+
+machine_init(collie_machine_init)
-- 
1.7.4.1




[Qemu-devel] [PATCH 2/3] Implement basic part of SA-1110/SA-1100

2011-03-30 Thread Dmitry Eremin-Solenikov
Basic implementation of DEC/Intel SA-1100/SA-1110 chips emulation.
Implemented:
 - IRQs
 - GPIO
 - PPC
 - RTC
 - UARTs (no IrDA/etc.)
 - OST reused from pxa25x

Everything else is TODO (esp. PM/idle/sleep!) - see the todo in the
hw/strongarm.c

V2:
  * removed all strongarm variants except latest
  * dropped unused casts
  * fixed PIC vmstate
  * fixed new devices created with version_id = 1

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 Makefile.target |1 +
 hw/strongarm.c  | 1301 +++
 hw/strongarm.h  |   62 +++
 target-arm/cpu.h|3 +
 target-arm/helper.c |9 +
 5 files changed, 1376 insertions(+), 0 deletions(-)
 create mode 100644 hw/strongarm.c
 create mode 100644 hw/strongarm.h

diff --git a/Makefile.target b/Makefile.target
index 62b102a..d071a4d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -328,6 +328,7 @@ obj-arm-y += framebuffer.o
 obj-arm-y += syborg.o syborg_fb.o syborg_interrupt.o syborg_keyboard.o
 obj-arm-y += syborg_serial.o syborg_timer.o syborg_pointer.o syborg_rtc.o
 obj-arm-y += syborg_virtio.o
+obj-arm-y += strongarm.o
 
 obj-sh4-y = shix.o r2d.o sh7750.o sh7750_regnames.o tc58128.o
 obj-sh4-y += sh_timer.o sh_serial.o sh_intc.o sh_pci.o sm501.o
diff --git a/hw/strongarm.c b/hw/strongarm.c
new file mode 100644
index 000..9f3df87
--- /dev/null
+++ b/hw/strongarm.c
@@ -0,0 +1,1301 @@
+/*
+ * StrongARM SA-1100/SA-1110 emulation
+ *
+ * Copyright (C) 2011 Dmitry Eremin-Solenikov
+ *
+ * Largely based on StrongARM emulation:
+ * Copyright (c) 2006 Openedhand Ltd.
+ * Written by Andrzej Zaborowski bal...@zabor.org
+ *
+ * UART code based on QEMU 16550A UART emulation
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2008 Citrix Systems, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ */
+#include sysbus.h
+#include strongarm.h
+#include qemu-error.h
+#include arm-misc.h
+#include sysemu.h
+
+/*
+ TODO
+ - Implement cp15, c14 ?
+ - Implement cp15, c15 !!! (idle used in L)
+ - Implement idle mode handling/DIM
+ - Implement sleep mode/Wake sources
+ - Implement reset control
+ - Implement memory control regs
+ - PCMCIA handling
+ - Maybe support MBGNT/MBREQ
+ - DMA channels
+ - GPCLK
+ - IrDA
+ - MCP
+ - Enhance UART with modem signals
+ */
+
+static struct {
+target_phys_addr_t io_base;
+int irq;
+} sa_serial[] = {
+{ 0x8001, SA_PIC_UART1 },
+{ 0x8003, SA_PIC_UART2 },
+{ 0x8005, SA_PIC_UART3 },
+{ 0, 0 }
+};
+
+/* Interrupt Controller */
+typedef struct {
+SysBusDevice busdev;
+qemu_irqirq;
+qemu_irqfiq;
+
+uint32_t pending;
+uint32_t enabled;
+uint32_t is_fiq;
+uint32_t int_idle;
+} StrongARMPICState;
+
+#define ICIP0x00
+#define ICMR0x04
+#define ICLR0x08
+#define ICFP0x10
+#define ICPR0x20
+#define ICCR0x0c
+
+#define SA_PIC_SRCS 32
+
+
+static void strongarm_pic_update(void *opaque)
+{
+StrongARMPICState *s = opaque;
+
+/* FIXME: reflect DIM */
+qemu_set_irq(s-fiq, s-pending  s-enabled   s-is_fiq);
+qemu_set_irq(s-irq, s-pending  s-enabled  ~s-is_fiq);
+}
+
+static void strongarm_pic_set_irq(void *opaque, int irq, int level)
+{
+StrongARMPICState *s = opaque;
+
+if (level) {
+s-pending |= 1  irq;
+} else {
+s-pending = ~(1  irq);
+}
+
+strongarm_pic_update(s);
+}
+
+static uint32_t strongarm_pic_mem_read(void *opaque, target_phys_addr_t offset)
+{
+StrongARMPICState *s = opaque;
+
+switch (offset) {
+case ICIP:
+return s-pending  ~s-is_fiq  s-enabled;
+case ICMR:
+return s-enabled;
+case ICLR:
+return s-is_fiq;
+case ICCR:
+return s-int_idle == 0;
+case ICFP:
+return s-pending  s-is_fiq  s-enabled;
+case ICPR:
+return s-pending;
+default:
+printf(%s: Bad register offset 0x TARGET_FMT_plx \n,
+__func__, offset);
+return 0;
+}
+}
+
+static void strongarm_pic_mem_write(void *opaque, target_phys_addr_t offset,
+uint32_t value)
+{
+StrongARMPICState *s = opaque;
+
+switch (offset) {
+case ICMR:
+s-enabled = value;
+break;
+case ICLR:
+s-is_fiq = value;
+break;
+case ICCR:
+s-int_idle = (value  1) ? 0 : ~0;
+break;
+default:
+printf(%s: Bad register offset 0x TARGET_FMT_plx \n,
+__func__, offset);
+break;
+}
+strongarm_pic_update(s);
+}
+
+static CPUReadMemoryFunc * const strongarm_pic_readfn[] = {
+strongarm_pic_mem_read,
+strongarm_pic_mem_read,
+strongarm_pic_mem_read,
+};
+
+static CPUWriteMemoryFunc * const strongarm_pic_writefn[] = {
+strongarm_pic_mem_write,
+strongarm_pic_mem_write,
+strongarm_pic_mem_write,
+};
+
+static int 

[Qemu-devel] [PATCH 1/3] arm: basic support for ARMv4/ARMv4T emulation

2011-03-30 Thread Dmitry Eremin-Solenikov
Currently target-arm/ assumes at least ARMv5 core. Add support for
handling also ARMv4/ARMv4T. This changes the following instructions:

BX(v4T and later)

BKPT, BLX, CDP2, CLZ, LDC2, LDRD, MCRR, MCRR2, MRRC, MCRR, MRC2, MRRC,
MRRC2, PLD QADD, QDADD, QDSUB, QSUB, STRD, SMLAxy, SMLALxy, SMLAWxy,
SMULxy, SMULWxy, STC2 (v5 and later)

All instructions that are v5TE and later are also bound to just v5, as
that's how it was before.

This patch doesn _not_ include disabling of cp15 access and base-updated
data abort model (that will be required to emulate chips based on a
ARM7TDMI), because:
* no ARM7TDMI chips are currently emulated (or planned)
* those features aren't strictly necessary for my purposes (SA-1 core
  emulation).

All v5 models are handled as they are v5T. Internally we still have a
check if the model is a v5(T) or v5TE, but as all emulated cores are
v5TE, those two cases are simply aliased (for now).

Patch is heavily based on patch by Filip Navara filip.nav...@gmail.com
which in turn is based on work by Ulrich Hecht u...@suse.de and Vincent
Sanders vi...@kyllikki.org.

Signed-off-by: Dmitry Eremin-Solenikov dbarysh...@gmail.com
---
 target-arm/cpu.h   |4 ++-
 target-arm/helper.c|   28 ++-
 target-arm/translate.c |   59 +++
 3 files changed, 79 insertions(+), 12 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 1ae7982..e247a7a 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -360,7 +360,9 @@ enum arm_features {
 ARM_FEATURE_M, /* Microcontroller profile.  */
 ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
 ARM_FEATURE_THUMB2EE,
-ARM_FEATURE_V7MP/* v7 Multiprocessing Extensions */
+ARM_FEATURE_V7MP,/* v7 Multiprocessing Extensions */
+ARM_FEATURE_V4T,
+ARM_FEATURE_V5,
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 78f3d39..44b5c17 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -48,17 +48,23 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 env-cp15.c0_cpuid = id;
 switch (id) {
 case ARM_CPUID_ARM926:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_VFP);
 env-vfp.xregs[ARM_VFP_FPSID] = 0x41011090;
 env-cp15.c0_cachetype = 0x1dd20d2;
 env-cp15.c1_sys = 0x00090078;
 break;
 case ARM_CPUID_ARM946:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_MPU);
 env-cp15.c0_cachetype = 0x0f004006;
 env-cp15.c1_sys = 0x0078;
 break;
 case ARM_CPUID_ARM1026:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_VFP);
 set_feature(env, ARM_FEATURE_AUXCR);
 env-vfp.xregs[ARM_VFP_FPSID] = 0x410110a0;
@@ -67,6 +73,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 break;
 case ARM_CPUID_ARM1136_R2:
 case ARM_CPUID_ARM1136:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_VFP);
 set_feature(env, ARM_FEATURE_AUXCR);
@@ -79,6 +87,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env-cp15.c1_sys = 0x00050078;
 break;
 case ARM_CPUID_ARM11MPCORE:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_VFP);
@@ -91,6 +101,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
 env-cp15.c0_cachetype = 0x1dd20d2;
 break;
 case ARM_CPUID_CORTEXA8:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_V7);
@@ -113,6 +125,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 env-cp15.c1_sys = 0x00c50078;
 break;
 case ARM_CPUID_CORTEXA9:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_V6K);
 set_feature(env, ARM_FEATURE_V7);
@@ -140,6 +154,8 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 env-cp15.c1_sys = 0x00c50078;
 break;
 case ARM_CPUID_CORTEXM3:
+set_feature(env, ARM_FEATURE_V4T);
+set_feature(env, ARM_FEATURE_V5);
 set_feature(env, ARM_FEATURE_V6);
 set_feature(env, ARM_FEATURE_THUMB2);
 set_feature(env, ARM_FEATURE_V7);
@@ -147,6 +163,8 @@ static void 

Re: [Qemu-devel] [PATCH repost] qemu-img: Initial progress printing support

2011-03-30 Thread Jes Sorensen
On 03/30/11 11:50, Stefan Hajnoczi wrote:
 On Tue, Mar 29, 2011 at 9:51 AM,  jes.soren...@redhat.com wrote:
 diff --git a/qemu-common.h b/qemu-common.h
 index 7a96dd1..a3a4dde 100644
 --- a/qemu-common.h
 +++ b/qemu-common.h
 @@ -330,6 +330,11 @@ void qemu_iovec_memset(QEMUIOVector *qiov, int c, 
 size_t count);
  void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
 size_t skip);

 +void qemu_init_progress(int enabled, float min_skip);
 
 Please call it qemu_progress_init() to be consistent with the other
 qemu_progress_*() functions.

Fixed

 @@ -773,6 +782,11 @@ static int img_convert(int argc, char **argv)
 }
 cluster_sectors = cluster_size  9;
 sector_num = 0;
 +
 +nb_sectors = total_sectors - sector_num;
 
 sector_num is always 0 here, why subtract it?

Copy paste from another place where I calculated it - fixed.

 +local_progress = (float)100 /
 +(nb_sectors / MIN(nb_sectors, (cluster_sectors)));
 
 This seems to calculate the percentage increment for each iteration.
 This increment value is wrong for the final iteration unless
 nb_sectors is a multiple of cluster_sectors, so we'll overshoot 100%.

 It would be nice if the caller didn't have to calculate progress
 themselves.  How about:

 void qemu_progress_begin(bool enabled, uint64_t max);
 void qemu_progress_end(void);
 void qemu_progress_add(uint64_t increment);

 You set the maximum absolute value and then tell it how much progress
 has been made each iteration:
 qemu_progress_begin(true, total_sectors);
 for (...) {
 nbytes = ...;
 qemu_progress_add(nbytes);
 }
 qemu_progress_end();

The overshoot is handled by checking for = 100 and not going above it.
I designed the API to try and handle the case where you can have a
function being a portion of the bigger picture. So basically you could
have this:

   part1() 20%
   part2() 50%
   part3() 30%

But in another situation you might only run and expect the following split:

   part2() 60%
   part3() 40%

In that case you need to be able to calculate the relative number.
Obviously it requires part2() and part3() are given an argument
specifying how big of the big scheme they are. If we do it the way you
suggest with a fixed increment, this option is not really doable.

 +void qemu_init_progress(int enabled, float min_skip)
 +{
 +state.enabled = enabled;
 +if (!enabled) {
 +return;
 +}
 
 This early return is not needed.

It was added if the function was going to get more complicated, but I
can pull it out.

 +void qemu_progress_print(float percent, int max)
 +{
 +float current;
 +
 +if (max == 0) {
 +current = percent;
 +} else {
 +current = state.current + percent / 100 * max;
 +}
 +state.current = current;
 +
 +if (current  (state.last_print + state.min_skip) ||
 +(current == 100) || (current == 0)) {
 
 Comparing against (float)100 may not match due to floating point 
 representation.

I'll add a check for = 100 setting current to 100, that should fix it.

 +int qemu_progress_get_current(void)
 +{
 +return state.current;
 +}
 
 This function is unused.

It was added to make the API more complete, but lets just pull it out.

Thanks for the input! I'll post a v2 shortly

Jes



[Qemu-devel] [PATCH] trivial: fix comment around qdev_init_nofail()

2011-03-30 Thread Michael Tokarev
in previous life qdev_init_nofail() used to call hw_error() which
did register dump and other scary things.  Now it calls
error_report() and does a regular exit(1).  Fix the comment
to match reality.

Signed-off-by: Michael Tokarev m...@tls.msk.ru
---
 hw/qdev.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index 1aa1ea0..2f4846e 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -358,7 +358,7 @@ int qdev_simple_unplug_cb(DeviceState *dev)
 return 0;
 }
 
-/* Like qdev_init(), but terminate program via hw_error() instead of
+/* Like qdev_init(), but terminate program via error_report() instead of
returning an error value.  This is okay during machine creation.
Don't use for hotplug, because there callers need to recover from
failure.  Exception: if you know the device's init() callback can't
-- 
1.7.2.5




[Qemu-devel] [PATCH v2] qemu-img: Initial progress printing support

2011-03-30 Thread Jes . Sorensen
From: Jes Sorensen jes.soren...@redhat.com

This adds the basic infrastructure for supporting progress output
on the command line, as well as progress support for qemu-img commands
'rebase' and 'convert'.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 Makefile.objs|2 +-
 qemu-common.h|4 ++
 qemu-img-cmds.hx |4 +-
 qemu-img.c   |   38 ++-
 qemu-progress.c  |   89 ++
 5 files changed, 132 insertions(+), 5 deletions(-)
 create mode 100644 qemu-progress.c

diff --git a/Makefile.objs b/Makefile.objs
index 42301fd..c980b25 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -14,7 +14,7 @@ oslib-obj-$(CONFIG_POSIX) += oslib-posix.o
 # block-obj-y is code used by both qemu system emulation and qemu-img
 
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o 
async.o
-block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
+block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o qemu-progress.o
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
diff --git a/qemu-common.h b/qemu-common.h
index 8ecb488..82e27c1 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -334,6 +334,10 @@ void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t 
count);
 void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count,
 size_t skip);
 
+void qemu_progress_init(int enabled, float min_skip);
+void qemu_progress_end(void);
+void qemu_progress_print(float percent, int max);
+
 /* Convert a byte between binary and BCD.  */
 static inline uint8_t to_bcd(uint8_t val)
 {
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 6c7176f..3072d38 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -28,7 +28,7 @@ STEXI
 ETEXI
 
 DEF(convert, img_convert,
-convert [-c] [-f fmt] [-O output_fmt] [-o options] [-s snapshot_name] 
filename [filename2 [...]] output_filename)
+convert [-c] [-p] [-f fmt] [-O output_fmt] [-o options] [-s 
snapshot_name] filename [filename2 [...]] output_filename)
 STEXI
 @item convert [-c] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s 
@var{snapshot_name}] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
 ETEXI
@@ -46,7 +46,7 @@ STEXI
 ETEXI
 
 DEF(rebase, img_rebase,
-rebase [-f fmt] [-u] -b backing_file [-F backing_fmt] filename)
+rebase [-f fmt] [-p] [-u] -b backing_file [-F backing_fmt] filename)
 STEXI
 @item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] 
@var{filename}
 ETEXI
diff --git a/qemu-img.c b/qemu-img.c
index 7e3cc4c..074388c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -77,6 +77,7 @@ static void help(void)
   match exactly. The image doesn't need a working backing 
file before\n
   rebasing in this case (useful for renaming the backing 
file)\n
  '-h' with or without a command shows this help and lists the 
supported formats\n
+ '-p' show progress of command (only certain commands)\n
\n
Parameters to snapshot subcommand:\n
  'snapshot' is the name of the snapshot to create, apply or 
delete\n
@@ -567,6 +568,7 @@ static int compare_sectors(const uint8_t *buf1, const 
uint8_t *buf2, int n,
 static int img_convert(int argc, char **argv)
 {
 int c, ret = 0, n, n1, bs_n, bs_i, compress, cluster_size, cluster_sectors;
+int progress = 0;
 const char *fmt, *out_fmt, *out_baseimg, *out_filename;
 BlockDriver *drv, *proto_drv;
 BlockDriverState **bs = NULL, *out_bs = NULL;
@@ -579,13 +581,14 @@ static int img_convert(int argc, char **argv)
 QEMUOptionParameter *out_baseimg_param;
 char *options = NULL;
 const char *snapshot_name = NULL;
+float local_progress;
 
 fmt = NULL;
 out_fmt = raw;
 out_baseimg = NULL;
 compress = 0;
 for(;;) {
-c = getopt(argc, argv, f:O:B:s:hce6o:);
+c = getopt(argc, argv, f:O:B:s:hce6o:p);
 if (c == -1) {
 break;
 }
@@ -620,6 +623,9 @@ static int img_convert(int argc, char **argv)
 case 's':
 snapshot_name = optarg;
 break;
+case 'p':
+progress = 1;
+break;
 }
 }
 
@@ -642,6 +648,9 @@ static int img_convert(int argc, char **argv)
 goto out;
 }
 
+qemu_progress_init(progress, 2.0);
+qemu_progress_print(0, 100);
+
 bs = qemu_mallocz(bs_n * sizeof(BlockDriverState *));
 
 total_sectors = 0;
@@ -773,6 +782,11 @@ static int img_convert(int argc, char **argv)
 }
 cluster_sectors = cluster_size  9;
 sector_num = 0;
+
+nb_sectors = total_sectors;
+local_progress = (float)100 /
+(nb_sectors / MIN(nb_sectors, (cluster_sectors)));
+
 for(;;) {
 int64_t bs_num;
 int remainder;
@@ -832,6 +846,7 @@ static int img_convert(int argc, char **argv)
  

Re: [Qemu-devel] [PATCH v2] qemu-img: Initial progress printing support

2011-03-30 Thread Stefan Hajnoczi
On Wed, Mar 30, 2011 at 1:16 PM,  jes.soren...@redhat.com wrote:
 From: Jes Sorensen jes.soren...@redhat.com

 This adds the basic infrastructure for supporting progress output
 on the command line, as well as progress support for qemu-img commands
 'rebase' and 'convert'.

 Signed-off-by: Jes Sorensen jes.soren...@redhat.com
 ---
  Makefile.objs    |    2 +-
  qemu-common.h    |    4 ++
  qemu-img-cmds.hx |    4 +-
  qemu-img.c       |   38 ++-
  qemu-progress.c  |   89 
 ++
  5 files changed, 132 insertions(+), 5 deletions(-)
  create mode 100644 qemu-progress.c

Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com



[Qemu-devel] [PATCH] exit if -drive specified is invalid instead of ignoring the wrong -drive

2011-03-30 Thread Michael Tokarev
This fixes the problem when qemu continues even if -drive specification
is somehow invalid, resulting in a mess.  Applicable for both current
master and for stable-0.14 (and the same issue exist 0.13 and 0.12 too).

The prob can actually be seriuos: when you start guest with two drives
and make an error in the specification of one of them, and the guest
has something like a raid array on the two drives, guest may start failing
that array or kick missing drives which may result in a mess - this is
what actually happened to me, I did't want a resync at all, and a resync
resulted in re-writing (and allocating) a 4TB virtual drive I used for
testing, which in turn resulted in my filesystem filling up and whole
thing failing badly.  Yes it was just testing VM, I experimented with
larger raid arrays, but the end result was quite, well, unexpected.

Signed-off-by: Michael Tokarev m...@tls.msk.ru
---
 vl.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/vl.c b/vl.c
index 8bcf2ae..3792afb 100644
--- a/vl.c
+++ b/vl.c
@@ -2098,7 +2098,9 @@ int main(int argc, char **argv, char **envp)
   HD_OPTS);
 break;
 case QEMU_OPTION_drive:
-drive_def(optarg);
+if (drive_def(optarg) == NULL) {
+exit(1);
+}
break;
 case QEMU_OPTION_set:
 if (qemu_set_option(optarg) != 0)
-- 
1.7.2.5




Re: [Qemu-devel] [PATCH] exit if -drive specified is invalid instead of ignoring the wrong -drive

2011-03-30 Thread Jes Sorensen
On 03/30/11 14:31, Michael Tokarev wrote:
 This fixes the problem when qemu continues even if -drive specification
 is somehow invalid, resulting in a mess.  Applicable for both current
 master and for stable-0.14 (and the same issue exist 0.13 and 0.12 too).
 
 The prob can actually be seriuos: when you start guest with two drives
 and make an error in the specification of one of them, and the guest
 has something like a raid array on the two drives, guest may start failing
 that array or kick missing drives which may result in a mess - this is
 what actually happened to me, I did't want a resync at all, and a resync
 resulted in re-writing (and allocating) a 4TB virtual drive I used for
 testing, which in turn resulted in my filesystem filling up and whole
 thing failing badly.  Yes it was just testing VM, I experimented with
 larger raid arrays, but the end result was quite, well, unexpected.
 
 Signed-off-by: Michael Tokarev m...@tls.msk.ru
 ---
  vl.c |4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/vl.c b/vl.c
 index 8bcf2ae..3792afb 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -2098,7 +2098,9 @@ int main(int argc, char **argv, char **envp)
HD_OPTS);
  break;
  case QEMU_OPTION_drive:
 -drive_def(optarg);
 +if (drive_def(optarg) == NULL) {
 +exit(1);
 +}

Looks good, however it would be nice if you added an error message here.

Cheers,
Jes




Re: [Qemu-devel] [PATCH v3 0/7] Let boards state maximum RAM limits in QEMUMachine struct

2011-03-30 Thread Peter Maydell
On 30 March 2011 11:51, Jes Sorensen jes.soren...@redhat.com wrote:
 On 03/30/11 10:09, Peter Maydell wrote:
 On 30 March 2011 08:48, Jes Sorensen jes.soren...@redhat.com wrote:
 I am a little concerned about this approach. It should work for simple
 embedded boards, but for larger systems, it really ought to be a mask
 rather than a max address.

 It's not a maximum address, it's a maximum size. For instance
 the RAM isn't contiguous on some of the ARM devboards.

 Right, but the fact that you can have holes makes it even more an issue.

Not really, typically they're just filled up in some particular
order (main RAM in one place and expansion RAM elsewhere).
Since the board init function is only passed a single ram_size
parameter that's all you can do anyhow.

 Ideally I think it would be better to have a mask and then introduce a
 is_valid_memory() kinda function to check it with.

I'm not sure what this mask would look like. You want megabyte
granularity, but even if you assume only 48 bit physical addresses
that would still be an alarmingly large number of bits. So I
guess you'd actually want a list of (start,length) tuples.

I strongly dislike this approach. I think it introduces unneeded
extra complexity, and it doesn't even address all the cases you
might want a generic validate RAM options mechanism to do (eg
boards which only allow RAM in 16MB increments, boards where you
must fill the base RAM first and then the expansion RAM, boards
where the RAM isn't at a specific physical address but can be
remapped under guest control, just to pick three). So it falls
awkwardly between the two stools of flexible and generic and
simple and straightforward.

 The command line option doesn't provide any means of saying
 put 64MB in this hole and another 128 over here and 32 there,

 You can on x86 using the -numa argument. When you use that, it is
 completely pointless to have the max memory limit, to use your own words.

(Is there any documentation on what you can do with the -numa
option? http://qemu.weilnetz.de/qemu-doc.html is decidedly laconic.)

 Since you have holes on ARM too, it makes sense IMHO to use a mask
 exactly because you can then map that to max memory by simply adding up
 the available holes. A linear number on the other hand is much harder to
 validate against once you start populating holes explicitly.

But I don't want to populate holes explicitly, and I don't imagine
most of the other boards want to populate holes explicitly
either.

The current QEMU design basically only controls the total amount
of RAM (as a command line option, and as a parameter to the board
init function), with NUMA being a an optional extra. Non-NUMA is
the common use case and you want the API used by the board code
to be straightforward to implement this common case. Adding a
simple maximum RAM field fits in with the existing design and
solves a genuine requirement for multiple board models.

Even if one system happens to have NUMA-specific requirements we
don't want to have to have every single devboard specify its RAM
limits in a complicated fashion for the benefit of the one NUMA
system. Just have the NUMA system not specify max_ram (so it gets
the default of zero, ie don't check) and have some NUMA-specific
QEMUMachine fields which default to system doesn't support NUMA.
Then NUMA aware board models can do the right thing, and non-NUMA
boards also do the right thing without having to think about
NUMA-related fields at all.
(Compare the way non-SMP boards don't have to worry about the
max_cpus field.)

-- PMM



Re: [Qemu-devel] [PATCH] exit if -drive specified is invalid instead of ignoring the wrong -drive

2011-03-30 Thread Michael Tokarev
30.03.2011 17:08, Jes Sorensen wrote:
 On 03/30/11 14:31, Michael Tokarev wrote:
 This fixes the problem when qemu continues even if -drive specification
 is somehow invalid, resulting in a mess.  Applicable for both current
 master and for stable-0.14 (and the same issue exist 0.13 and 0.12 too).

 The prob can actually be seriuos: when you start guest with two drives
 and make an error in the specification of one of them, and the guest
 has something like a raid array on the two drives, guest may start failing
 that array or kick missing drives which may result in a mess - this is
 what actually happened to me, I did't want a resync at all, and a resync
 resulted in re-writing (and allocating) a 4TB virtual drive I used for
 testing, which in turn resulted in my filesystem filling up and whole
 thing failing badly.  Yes it was just testing VM, I experimented with
 larger raid arrays, but the end result was quite, well, unexpected.

 Signed-off-by: Michael Tokarev m...@tls.msk.ru
 ---
  vl.c |4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)

 diff --git a/vl.c b/vl.c
 index 8bcf2ae..3792afb 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -2098,7 +2098,9 @@ int main(int argc, char **argv, char **envp)
HD_OPTS);
  break;
  case QEMU_OPTION_drive:
 -drive_def(optarg);
 +if (drive_def(optarg) == NULL) {
 +exit(1);
 +}
 
 Looks good, however it would be nice if you added an error message here.

It's already printed by drive_def():

 $ kvm -drive foo=bar
 kvm: -drive foo=bar: Invalid parameter 'foo'

I can add something like unable to initialize -drive, exiting
but to mee it looks sorta redundrand, no?

Thanks!

/mjt



Re: [Qemu-devel] [PATCH] exit if -drive specified is invalid instead of ignoring the wrong -drive

2011-03-30 Thread Jes Sorensen
On 03/30/11 15:22, Michael Tokarev wrote:
 30.03.2011 17:08, Jes Sorensen wrote:
  On 03/30/11 14:31, Michael Tokarev wrote:
  This fixes the problem when qemu continues even if -drive specification
  is somehow invalid, resulting in a mess.  Applicable for both current
  master and for stable-0.14 (and the same issue exist 0.13 and 0.12 too).
 
  The prob can actually be seriuos: when you start guest with two drives
  and make an error in the specification of one of them, and the guest
  has something like a raid array on the two drives, guest may start 
  failing
  that array or kick missing drives which may result in a mess - this is
  what actually happened to me, I did't want a resync at all, and a resync
  resulted in re-writing (and allocating) a 4TB virtual drive I used for
  testing, which in turn resulted in my filesystem filling up and whole
  thing failing badly.  Yes it was just testing VM, I experimented with
  larger raid arrays, but the end result was quite, well, unexpected.
 
  Signed-off-by: Michael Tokarev m...@tls.msk.ru
  ---
   vl.c |4 +++-
   1 files changed, 3 insertions(+), 1 deletions(-)
 
  diff --git a/vl.c b/vl.c
  index 8bcf2ae..3792afb 100644
  --- a/vl.c
  +++ b/vl.c
  @@ -2098,7 +2098,9 @@ int main(int argc, char **argv, char **envp)
 HD_OPTS);
   break;
   case QEMU_OPTION_drive:
  -drive_def(optarg);
  +if (drive_def(optarg) == NULL) {
  +exit(1);
  +}
  
  Looks good, however it would be nice if you added an error message here.
 It's already printed by drive_def():
 
  $ kvm -drive foo=bar
  kvm: -drive foo=bar: Invalid parameter 'foo'
 
 I can add something like unable to initialize -drive, exiting
 but to mee it looks sorta redundrand, no?

Ah in that case I agree - I didn't look deep enough in the code.

Acked-by: Jes Sorensen jes.soren...@redhat.com





Re: [Qemu-devel] [PATCH v3 0/7] Let boards state maximum RAM limits in QEMUMachine struct

2011-03-30 Thread Anthony Liguori

On 03/30/2011 08:22 AM, Peter Maydell wrote:

On 30 March 2011 11:51, Jes Sorensenjes.soren...@redhat.com  wrote:

On 03/30/11 10:09, Peter Maydell wrote:

On 30 March 2011 08:48, Jes Sorensenjes.soren...@redhat.com  wrote:

I am a little concerned about this approach. It should work for simple
embedded boards, but for larger systems, it really ought to be a mask
rather than a max address.

It's not a maximum address, it's a maximum size. For instance
the RAM isn't contiguous on some of the ARM devboards.

Right, but the fact that you can have holes makes it even more an issue.

Not really, typically they're just filled up in some particular
order (main RAM in one place and expansion RAM elsewhere).
Since the board init function is only passed a single ram_size
parameter that's all you can do anyhow.


FWIW, I don't think any static data is going to be perfect here.  A lot 
of boards have strict requirements on ram_size based on plausible 
combinations of DIMMs.  Arbitrary values up to ram_size is not good enough.


ram_size ought to be viewed as a hint, not a mechanism to allow common 
code to completely validate the passed in ram size parameter.  It's 
still up to the board to validate that the given ram size makes sense.


Regards,

Anthony Liguori


Ideally I think it would be better to have a mask and then introduce a
is_valid_memory() kinda function to check it with.

I'm not sure what this mask would look like. You want megabyte
granularity, but even if you assume only 48 bit physical addresses
that would still be an alarmingly large number of bits. So I
guess you'd actually want a list of (start,length) tuples.

I strongly dislike this approach. I think it introduces unneeded
extra complexity, and it doesn't even address all the cases you
might want a generic validate RAM options mechanism to do (eg
boards which only allow RAM in 16MB increments, boards where you
must fill the base RAM first and then the expansion RAM, boards
where the RAM isn't at a specific physical address but can be
remapped under guest control, just to pick three). So it falls
awkwardly between the two stools of flexible and generic and
simple and straightforward.


The command line option doesn't provide any means of saying
put 64MB in this hole and another 128 over here and 32 there,

You can on x86 using the -numa argument. When you use that, it is
completely pointless to have the max memory limit, to use your own words.

(Is there any documentation on what you can do with the -numa
option? http://qemu.weilnetz.de/qemu-doc.html is decidedly laconic.)


Since you have holes on ARM too, it makes sense IMHO to use a mask
exactly because you can then map that to max memory by simply adding up
the available holes. A linear number on the other hand is much harder to
validate against once you start populating holes explicitly.

But I don't want to populate holes explicitly, and I don't imagine
most of the other boards want to populate holes explicitly
either.

The current QEMU design basically only controls the total amount
of RAM (as a command line option, and as a parameter to the board
init function), with NUMA being a an optional extra. Non-NUMA is
the common use case and you want the API used by the board code
to be straightforward to implement this common case. Adding a
simple maximum RAM field fits in with the existing design and
solves a genuine requirement for multiple board models.

Even if one system happens to have NUMA-specific requirements we
don't want to have to have every single devboard specify its RAM
limits in a complicated fashion for the benefit of the one NUMA
system. Just have the NUMA system not specify max_ram (so it gets
the default of zero, ie don't check) and have some NUMA-specific
QEMUMachine fields which default to system doesn't support NUMA.
Then NUMA aware board models can do the right thing, and non-NUMA
boards also do the right thing without having to think about
NUMA-related fields at all.
(Compare the way non-SMP boards don't have to worry about the
max_cpus field.)

-- PMM






Re: [Qemu-devel] [PATCH v3 0/7] Let boards state maximum RAM limits in QEMUMachine struct

2011-03-30 Thread Jes Sorensen
On 03/30/11 15:22, Peter Maydell wrote:
 On 30 March 2011 11:51, Jes Sorensen jes.soren...@redhat.com
 wrote:
 Ideally I think it would be better to have a mask and then
 introduce a is_valid_memory() kinda function to check it with.
 
 I'm not sure what this mask would look like. You want megabyte 
 granularity, but even if you assume only 48 bit physical addresses 
 that would still be an alarmingly large number of bits. So I guess
 you'd actually want a list of (start,length) tuples.
 
 I strongly dislike this approach. I think it introduces unneeded 
 extra complexity, and it doesn't even address all the cases you might
 want a generic validate RAM options mechanism to do (eg boards
 which only allow RAM in 16MB increments, boards where you must fill
 the base RAM first and then the expansion RAM, boards where the RAM
 isn't at a specific physical address but can be remapped under guest
 control, just to pick three). So it falls awkwardly between the two
 stools of flexible and generic and simple and straightforward.

I thought about it a bit and obviously a simple mask isn't going to cut
it. What we should have is more like a list of valid ram locations which
we can use to calculate the allowable size for. The -m parameter is
really a shortcut for the real thing, and we shouldn't optimize for -m,
but rather the other way round.


 You can on x86 using the -numa argument. When you use that, it is 
 completely pointless to have the max memory limit, to use your own
 words.
 
 (Is there any documentation on what you can do with the -numa option?
 http://qemu.weilnetz.de/qemu-doc.html is decidedly laconic.)

I don't know of documentation for that unfortunately :( Eventually you
should be able to do anything with it that can happen in a real numa
system. Basically it should allow you to define nodes and assign CPUs
and memory to the given nodes, as well as PCI devices.

 Since you have holes on ARM too, it makes sense IMHO to use a mask 
 exactly because you can then map that to max memory by simply
 adding up the available holes. A linear number on the other hand is
 much harder to validate against once you start populating holes
 explicitly.
 
 But I don't want to populate holes explicitly, and I don't imagine 
 most of the other boards want to populate holes explicitly either.

Why not? You just mentioned the case where board holes can be populated
in different ways?

 The current QEMU design basically only controls the total amount of
 RAM (as a command line option, and as a parameter to the board init
 function), with NUMA being a an optional extra. Non-NUMA is the
 common use case and you want the API used by the board code to be
 straightforward to implement this common case. Adding a simple
 maximum RAM field fits in with the existing design and solves a
 genuine requirement for multiple board models.

The problem with the existing design is exactly that it was designed for
the simple case, and works badly for the case that matches the hardware
better. I suggest we do it right, and then provide functions to make -m
work easily. Otherwise we end up with bad hacks that are going to work
badly for the non simple case.

 Even if one system happens to have NUMA-specific requirements we 
 don't want to have to have every single devboard specify its RAM 
 limits in a complicated fashion for the benefit of the one NUMA 
 system. Just have the NUMA system not specify max_ram (so it gets the
 default of zero, ie don't check) and have some NUMA-specific 
 QEMUMachine fields which default to system doesn't support NUMA. 
 Then NUMA aware board models can do the right thing, and non-NUMA 
 boards also do the right thing without having to think about 
 NUMA-related fields at all. (Compare the way non-SMP boards don't
 have to worry about the max_cpus field.)

A simple system is basically a single-node NUMA system. So you refer to
NODE 0 per default if no explicit node is listed. Instead what we should
have is something like this

struct node_memregion {
uint64_t start0, end0;
uint64_t start1, end1;
uint64_t start2, end2;
uint64_t start3, end3;
}

Then have a max-nodes value per machine, and a pointer to a
node_memregion array. If you only have one node, which would be the
common case, then you populate just that specifying the holes you have
etc. A simple function can then calculate the maximum allowable memory
in a generic way.

Simple to define for all the simple boards.

Jes



Re: [Qemu-devel] [PATCH v3 0/7] Let boards state maximum RAM limits in QEMUMachine struct

2011-03-30 Thread Peter Maydell
On 30 March 2011 14:56, Anthony Liguori anth...@codemonkey.ws wrote:
 On 03/30/2011 08:22 AM, Peter Maydell wrote:
 Not really, typically they're just filled up in some particular
 order (main RAM in one place and expansion RAM elsewhere).
 Since the board init function is only passed a single ram_size
 parameter that's all you can do anyhow.

 FWIW, I don't think any static data is going to be perfect here.  A lot of
 boards have strict requirements on ram_size based on plausible combinations
 of DIMMs.  Arbitrary values up to ram_size is not good enough.

 ram_size ought to be viewed as a hint, not a mechanism to allow common code
 to completely validate the passed in ram size parameter.  It's still up to
 the board to validate that the given ram size makes sense.

Yes, I agree, so we shouldn't try to specify some complicated
set of static data that still won't be good enough.

I'm trying to make it easy for boards to avoid crashing horribly
when the user passes a bad value; that's all.

-- PMM



Re: [Qemu-devel] [PATCH] v3 revamp acpitable parsing and allow to specify complete (headerful) table

2011-03-30 Thread Gleb Natapov
On Tue, Mar 29, 2011 at 10:41:11PM +0400, Michael Tokarev wrote:
 [Cc'ing Gleb since he - it seems - wrote the original code]
 
 17.03.2011 13:00, Michael Tokarev wrote:
  This patch almost rewrites acpi_table_add() function
  (but still leaves it using old get_param_value() interface).
  The result is that it's now possible to specify whole table
  (together with a header) in an external file, instead of just
  data portion, with a new file= parameter, but at the same time
  it's still possible to specify header fields as before.
  
  Now with the checkpatch.pl formatting fixes, thanks to
  Stefan Hajnoczi for suggestions, with changes from
  Isaku Yamahata, and with my further refinements.
 
 Obviously after an almost complete rewrite, it's nice to have
 some way to ensure the new code does not break things.  I
 tested this by ensuring that the new code produces the same
 ACPI table as the old code did (modulo the new parameter).
 
 I did a few (not all) combinations of various subparameters
 (sig=foo rev=N oem_id=bar  oem_table_id=baz etc), booting
 linux with old and new code and comparing the resulting
 table from /sys/firmware/acpi/tables/foo.
 
 Later on, I took that fake table from within guest and
 specified it with -acpitable file=, and specified one or
 two extra fields and compared two tables - this file=
 plus added extra with the original way, replacing
 the added fields with their new values.  The result
 was the same.
 
 With this patch we're one step closer (and one step
 left still) to be able to run the same pre-installed
 windows image either natively or in kvm/qemu this
 way:
 
   kvm -hda /dev/sda -acpitable file=/sys/firmware/acpi/tables/SLIC
 
 (and choosing to boot windows installed on your hdd,
 obviously) -- windows activation/registration is not
 affected by running it in kvm - _provided_ both steps
 are completed, one is this acpitable thing and another
 is identification for seabios -- this one:
 http://article.gmane.org/gmane.comp.emulators.qemu/97348
 
  Signed-off-by: Michael Tokarev m...@tls.msk.ru
 
 Also
 Reviewed-by: Isaku Yamahata yamah...@valinux.co.jp
 
 Thanks!
 
   hw/acpi.c   |  285 
  +++
   qemu-options.hx |7 +-
   2 files changed, 168 insertions(+), 124 deletions(-)
  
  diff --git a/hw/acpi.c b/hw/acpi.c
  index 237526d..4cc0311 100644
  --- a/hw/acpi.c
  +++ b/hw/acpi.c
  @@ -15,6 +15,7 @@
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see 
  http://www.gnu.org/licenses/
*/
  +
   #include hw.h
   #include pc.h
   #include acpi.h
  @@ -24,17 +25,29 @@
   
   struct acpi_table_header
   {
  -char signature [4];/* ACPI signature (4 ASCII characters) */
  +uint16_t _length; /* our length, not actual part of the hdr */
  +  /* XXX why we have 2 length fields here? */
  +char sig[4];  /* ACPI signature (4 ASCII characters) */
   uint32_t length;  /* Length of table, in bytes, including 
  header */
   uint8_t revision; /* ACPI Specification minor version # */
   uint8_t checksum; /* To make sum of entire table == 0 */
  -char oem_id [6];   /* OEM identification */
  -char oem_table_id [8]; /* OEM table identification */
  +char oem_id[6];   /* OEM identification */
  +char oem_table_id[8]; /* OEM table identification */
   uint32_t oem_revision;/* OEM revision number */
  -char asl_compiler_id [4]; /* ASL compiler vendor ID */
  +char asl_compiler_id[4];  /* ASL compiler vendor ID */
   uint32_t asl_compiler_revision; /* ASL compiler revision number */
   } __attribute__((packed));
   
  +#define ACPI_TABLE_HDR_SIZE sizeof(struct acpi_table_header)
  +#define ACPI_TABLE_PFX_SIZE sizeof(uint16_t)  /* size of the extra prefix 
  */
  +
  +static const char dfl_hdr[ACPI_TABLE_HDR_SIZE] =
  +\0\0   /* fake _length (2) */
  +QEMU\0\0\0\0\1\0   /* sig (4), len(4), revno (1), csum (1) */
  +QEMUQEQEMUQEMU\1\0\0\0 /* OEM id (6), table (8), revno (4) */
  +QEMU\1\0\0\0   /* ASL compiler ID (4), version (4) */
  +;
  +
   char *acpi_tables;
   size_t acpi_tables_len;
   
  @@ -47,156 +60,182 @@ static int acpi_checksum(const uint8_t *data, int len)
   return (-sum)  0xff;
   }
   
  +/* like strncpy() but zero-fills the tail of destination */
  +static void strzcpy(char *dst, const char *src, size_t size)
  +{
  +size_t len = strlen(src);
  +if (len = size) {
  +len = size;
  +} else {
  +  memset(dst + len, 0, size - len);
  +}
  +memcpy(dst, src, len);
  +}
  +
  +/* XXX fixme: this function uses obsolete argument parsing interface */
   int acpi_table_add(const char *t)
   {
  -static const char *dfl_id = QEMUQEMU;
   char buf[1024], *p, *f;
  -struct acpi_table_header acpi_hdr;
   unsigned 

[Qemu-devel] Re: [PATCH 00/19] s390x emulation support

2011-03-30 Thread Peter Maydell
On 30 March 2011 11:46, Alexander Graf ag...@suse.de wrote:
 On 30.03.2011, at 12:23, Peter Maydell wrote:
 Is there an available ISA manual you can add a wiki link to?

 http://wiki.qemu.org/Documentation/ISAManuals

 (That page is a bit bare at the moment, I know...)

 Oh, nice page! Didn't know it existed :). Very useful. I
 added a link to a POP for z9 (new enough for the target
 we're emulating and available without registering with IBM)
 and also updated the PowerPC pieces.

Thanks. I've added links for Alpha, SPARC, HPPA and MIPS, plus
empty sections for the other architectures we support as host
or target but don't have docs links for yet (CRIS, LatticeMico32,
m68k, Microblaze, sh4).

-- PMM



Re: [Qemu-devel] [PATCH 3/3] vhost: roll our own cpu map variant

2011-03-30 Thread Michael S. Tsirkin
On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
 On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin m...@redhat.com wrote:
  vhost used cpu_physical_memory_map to get the
  virtual address for the ring, however,
  this will exit on an illegal RAM address.
  Since the addresses are guest-controlled, we
  shouldn't do that.
 
  Switch to our own variant that uses the vhost
  tables and returns an error instead of exiting.
 
 We should make all of QEMU more robust instead of just vhost.  Perhaps
 introduce cpu_physical_memory_map_nofail(...) that aborts like the
 current cpu_physical_memory_map() implementation and then make non-hw/
 users call that one.  hw/ users should check for failure.
 
 Stefan

Yea, well ... at least vhost-net wants to also check
it is given a ram address, not some other physical address.
We could generally replace the memory management in vhost-net
by some other logic, when that's done this one can
go away as well.

-- 
MST



Re: [Qemu-devel] [PATCH 1/3] arm: basic support for ARMv4/ARMv4T emulation

2011-03-30 Thread Peter Maydell
On 30 March 2011 12:41, Dmitry Eremin-Solenikov dbarysh...@gmail.com wrote:
 @@ -7172,10 +7210,7 @@ static void disas_arm_insn(CPUState * env, 
 DisasContext *s)
             }
             if (insn  (1  20)) {
                 /* Complete the load.  */
 -                if (rd == 15)
 -                    gen_bx(s, tmp);
 -                else
 -                    store_reg(s, rd, tmp);
 +                store_reg_from_load(env, s, rn, tmp);
             }
             break;
         case 0x08:

Shouldn't the argument to store_reg_from_load() be 'rd', not 'rn'?

I'm otherwise happy with this patch. Thanks for doing all the changes.

-- PMM


Re: [Qemu-devel] [PATCH 3/3] vhost: roll our own cpu map variant

2011-03-30 Thread Stefan Hajnoczi
On Wed, Mar 30, 2011 at 5:09 PM, Michael S. Tsirkin m...@redhat.com wrote:
 On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
 On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin m...@redhat.com wrote:
  vhost used cpu_physical_memory_map to get the
  virtual address for the ring, however,
  this will exit on an illegal RAM address.
  Since the addresses are guest-controlled, we
  shouldn't do that.
 
  Switch to our own variant that uses the vhost
  tables and returns an error instead of exiting.

 We should make all of QEMU more robust instead of just vhost.  Perhaps
 introduce cpu_physical_memory_map_nofail(...) that aborts like the
 current cpu_physical_memory_map() implementation and then make non-hw/
 users call that one.  hw/ users should check for failure.

 Stefan

 Yea, well ... at least vhost-net wants to also check
 it is given a ram address, not some other physical address.
 We could generally replace the memory management in vhost-net
 by some other logic, when that's done this one can
 go away as well.

Sounds like you do not want to refactor physical memory access for
non-vhost.  Fair enough but we have to do it sooner or later in order
to make all of QEMU more robust.  If vhost-net is protected but the
IDE CD-ROM and virtio-blk disk still have issues then we haven't
reached our goal yet.  Any way I can convince you to do a generic API?
:)

Stefan



Re: [Qemu-devel] MIPS64 user mode emulation Patch

2011-03-30 Thread Nathan Froyd
On Sat, Mar 26, 2011 at 11:58:37AM +0500, Khansa Butt wrote:
 Subject: [PATCH] MIPS64 user mode emulation in QEMU
  This patch adds support for Cavium Network's
  Octeon 57XX user mode instructions.  Octeon
  57xx is based on MIPS64.  So this patch is
  the first MIPS64 User Mode Emulation in QEMU
  This is the team(Khansa Butt, Ehsan-ul-Haq, Abdul Qadeer, Abdul Waheed)
  work of HPCNL Lab at KICS-UET Lahore.

Thanks for doing this.  As already noted, this patch should be split
into at least two patches: one to add Octeon-specific instructions in
target-mips/ and one adding the necessary linux-user bits.

 +extern int TARGET_OCTEON;

I don't think a global like this is the right way to go.  Perhaps the
elfload.c code should set a flag in image_info , which can then be used
to set a flag in CPUMIPSState later on.

If we must use a global variable, it should be declared in
target-mips/cpu.h.

 @@ -2013,7 +2024,8 @@ void cpu_loop(CPUMIPSState *env)
   env-active_tc.gpr[5],
   env-active_tc.gpr[6],
   env-active_tc.gpr[7],
 - arg5, arg6/*, arg7, arg8*/);
 + env-active_tc.gpr[8],
 + env-active_tc.gpr[9]/*, arg7, arg8*/);
  }
  if (ret == -TARGET_QEMU_ESIGRETURN) {
  /* Returning from a successful sigreturn syscall.

This change breaks O32 binaries; it needs to be done in a different way.

 diff --git a/linux-user/syscall.c b/linux-user/syscall.c
 index 499c4d7..47fef05 100644
 --- a/linux-user/syscall.c
 +++ b/linux-user/syscall.c
 @@ -7195,6 +7195,8 @@ abi_long do_syscall(void *cpu_env, int num, abi_long
 arg1,
  case TARGET_NR_set_thread_area:
  #if defined(TARGET_MIPS)
((CPUMIPSState *) cpu_env)-tls_value = arg1;
 +   /*tls entry is moved to k0 so that this can be used later*/
 +  ((CPUMIPSState *) cpu_env)-active_tc.gpr[26] = arg1;
ret = 0;
break;
  #elif defined(TARGET_CRIS)

I believe this is only correct for Octeon binaries; it's not how the
rest of the MIPS world works.  It therefore needs to be conditional on
Octeon-ness.

 --- a/target-mips/cpu.h
 +++ b/target-mips/cpu.h
 @@ -140,6 +140,20 @@ typedef struct mips_def_t mips_def_t;
  #define MIPS_FPU_MAX 1
  #define MIPS_DSP_ACC 4
 
 +typedef struct cavium_mul cavium_mul;
 +struct cavium_mul {
 + target_ulong MPL0;
 + target_ulong MPL1;
 + target_ulong MPL2;
 + target_ulong P0;
 + target_ulong P1;
 + target_ulong P2;
 +};
 +typedef struct cvmctl_register cvmctl_register;
 +struct cvmctl_register {
 + target_ulong cvmctl;
 +};

The indentation here needs to be fixed.  I don't think there's any
reason why these need to be defined outside TCState, either.

 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index cce77be..9c3d772 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -36,6 +36,15 @@
  #define GEN_HELPER 1
  #include helper.h
 
 +int TARGET_OCTEON;
 +#if defined(TARGET_MIPS64)
 +/*Macros for setting values of cvmctl registers*/
 +#define FUSE_START_BIT(cvmctl)(cvmctl | 0x8000)
 +#define KASUMI(cvmctl)(cvmctl | 0x2000)
 +#define IPPCI(cvmctl)(cvmctl | 0x380)
 +#define IPTI(cvmctl)(cvmctl | 0x70)
 +#endif

Please follow existing style; spaces between the comment and comment
markers (many examples in translate.c, not just here) and spaces between
the macro argument list and definition.

 @@ -779,7 +818,9 @@ static inline void gen_op_addr_add (DisasContext *ctx,
 TCGv ret, TCGv arg0, TCGv
 See the MIPS64 PRA manual, section 4.10. */
  if (((ctx-hflags  MIPS_HFLAG_KSU) == MIPS_HFLAG_UM) 
  !(ctx-hflags  MIPS_HFLAG_UX)) {
 -tcg_gen_ext32s_i64(ret, ret);
 +/*This function sign extend 32 bit value to 64 bit, was causing
 error
 +  when ld instruction came.Thats why it is commmented out*/
 +   /* tcg_gen_ext32s_i64(ret, ret);*/
  }
  #endif
  }

Um, no.  If you needed to comment this out, you have a bug someplace
else.  Don't paper over the bug here.

 +case OPC_VMULU:
 +case OPC_V3MULU:

These two are large enough that they should be done as out-of-line
helpers.

Also, since all these new instructions are Octeon-specific, there should
be checks emitted to ensure that they produce appropriate errors when
non-Octeon hardware is being simulated, similar in style to
check_mips_64.

  /* Arithmetic */
  static void gen_arith (CPUState *env, DisasContext *ctx, uint32_t opc,
 int rd, int rs, int rt)
  {
  const char *opn = arith;
 
 +target_ulong mask = 0xFF;

I don't think it's really necessary to have this, but if you feel it's
necessary, please move the declaration closer to the point of use.

 +#if defined(TARGET_MIPS64)
 +static void gen_seqsne (DisasContext *ctx, uint32_t opc,
 +int rd, int rs, int rt)
 +{
 +const char 

[Qemu-devel] CD-ROM bug round-up

2011-03-30 Thread Stefan Hajnoczi
Several of us have been investigating CD-ROM bugs.  Let's update each
other, make sure we're not duplicating effort, and see if we can help
each other make progress.

= Stefan =

Guests do not notice media change when using Linux host CD-ROM
pass-through for two reasons:
1. QEMU is caching the device size for removable media and never updating it.
2. On Linux hosts the /dev/cdrom file descriptor needs to be reopened
to refresh the underlying device size.

Patches have been posted to qemu-devel.

Libvirt does not recognize eject failure due to locked tray.  This
causes libvirt to get out-of-sync and believe the medium was
successfully changed.  A libvirt fix is in progress:
https://bugzilla.redhat.com/show_bug.cgi?id=689101

= Anthony =

Using 'change' or scripted commands without a delay between closing
the BlockDriverState and opening the new CD-ROM.  The guest does not
have a chance to poll the drive and determine a medium present -
medium not present transition followed by a medium not present -
medium present transition.

A fix for this does not exist yet and might be tricky since we need to
delay or wait for the guest to poll once before opening the new
CD-ROM.

= Amit =

Found that 2.6.38 and later guest kernels fail to report media change.
 The new in-kernel media change polling framework issues ATAPI
commands which are not implemented in hw/ide/core.c.

= Juan =

ATA HSM violation errors after live migration while CD-ROM read
operations.  This is probably a hw/ide/* or DMA migration bug.

= Markus =

?

Stefan



Re: [Qemu-devel] CD-ROM bug round-up

2011-03-30 Thread Gleb Natapov
On Wed, Mar 30, 2011 at 05:40:40PM +0100, Stefan Hajnoczi wrote:
 Several of us have been investigating CD-ROM bugs.  Let's update each
 other, make sure we're not duplicating effort, and see if we can help
 each other make progress.
 
 = Stefan =
 
 Guests do not notice media change when using Linux host CD-ROM
 pass-through for two reasons:
 1. QEMU is caching the device size for removable media and never updating it.
 2. On Linux hosts the /dev/cdrom file descriptor needs to be reopened
 to refresh the underlying device size.
 
 Patches have been posted to qemu-devel.
 
 Libvirt does not recognize eject failure due to locked tray.  This
 causes libvirt to get out-of-sync and believe the medium was
 successfully changed.  A libvirt fix is in progress:
 https://bugzilla.redhat.com/show_bug.cgi?id=689101
 
 = Anthony =
 
 Using 'change' or scripted commands without a delay between closing
 the BlockDriverState and opening the new CD-ROM.  The guest does not
 have a chance to poll the drive and determine a medium present -
 medium not present transition followed by a medium not present -
 medium present transition.
 
 A fix for this does not exist yet and might be tricky since we need to
 delay or wait for the guest to poll once before opening the new
 CD-ROM.
 
This sounds familiar and should have been fixed by commit
93c8cfd9e67a62711b86f4c93747566885eb7928

 = Amit =
 
 Found that 2.6.38 and later guest kernels fail to report media change.
  The new in-kernel media change polling framework issues ATAPI
 commands which are not implemented in hw/ide/core.c.
 
 = Juan =
 
 ATA HSM violation errors after live migration while CD-ROM read
 operations.  This is probably a hw/ide/* or DMA migration bug.
 
 = Markus =
 
 ?
 
 Stefan

--
Gleb.



Re: [Qemu-devel] [PATCH 3/3] vhost: roll our own cpu map variant

2011-03-30 Thread Michael S. Tsirkin
On Wed, Mar 30, 2011 at 05:26:22PM +0100, Stefan Hajnoczi wrote:
 On Wed, Mar 30, 2011 at 5:09 PM, Michael S. Tsirkin m...@redhat.com wrote:
  On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
  On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin m...@redhat.com 
  wrote:
   vhost used cpu_physical_memory_map to get the
   virtual address for the ring, however,
   this will exit on an illegal RAM address.
   Since the addresses are guest-controlled, we
   shouldn't do that.
  
   Switch to our own variant that uses the vhost
   tables and returns an error instead of exiting.
 
  We should make all of QEMU more robust instead of just vhost.  Perhaps
  introduce cpu_physical_memory_map_nofail(...) that aborts like the
  current cpu_physical_memory_map() implementation and then make non-hw/
  users call that one.  hw/ users should check for failure.
 
  Stefan
 
  Yea, well ... at least vhost-net wants to also check
  it is given a ram address, not some other physical address.
  We could generally replace the memory management in vhost-net
  by some other logic, when that's done this one can
  go away as well.
 
 Sounds like you do not want to refactor physical memory access for
 non-vhost.  Fair enough but we have to do it sooner or later in order
 to make all of QEMU more robust.  If vhost-net is protected but the
 IDE CD-ROM and virtio-blk disk still have issues then we haven't
 reached our goal yet.  Any way I can convince you to do a generic API?
 :)
 
 Stefan

If you are talking about splitting real ram from non ram
and creating a generic API for that, you don't need to convince me,
but I can't commit to implementing it right now.

-- 
MST



[Qemu-devel] [PATCH V1 0/8] Add TPM support to SeaBIOS

2011-03-30 Thread Stefan Berger
The following set of patches add TPM and Trusted Computing support to SeaBIOS.
In particular the patches add:

- a TPM driver for the Qemu's TPM TIS emulation (not yet in Qemu git)
- ACPI support for the TPM device (SSDT table)
- ACPI support for measurement logging (TCPA table)
- Support for initialzation of the TPM
- Support for the TCG BIOS extensions (1ah handler [ah = 0xbb])
  (used by trusted grub; http://trousers.sourceforge.net/grub.html)
- Static Root of Trusted for Measurement (SRTM) support
- Support for S3 resume (sends command to TPM upon resume)
- TPM-specific menu for controlling aspects of the TPM
- [An optional test suite for the TIS interface]

All implementations necessarily follow specifications.

Regards,
 Stefan



[Qemu-devel] [PATCH V1 4/8] Build the TCG BIOS extensions and TPM drivers.

2011-03-30 Thread Stefan Berger
This patch allows to configure the TCGBIOS extensions to be built
into SeaBIOS, depending on not COREBOOT being selected.

All TCG BIOS extensions are activated with CONFIG_TCGBIOS.

Add the two new code files (tcgbios.c, tpm_drivers.c) to be built.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 Makefile|2 +-
 src/Kconfig |8 
 2 files changed, 9 insertions(+), 1 deletion(-)

Index: seabios/Makefile
===
--- seabios.orig/Makefile
+++ seabios/Makefile
@@ -20,7 +20,7 @@ SRC16=$(SRCBOTH) system.c disk.c font.c
 SRC32FLAT=$(SRCBOTH) post.c shadow.c memmap.c coreboot.c boot.c \
   acpi.c smm.c mptable.c smbios.c pciinit.c optionroms.c mtrr.c \
   lzmadecode.c bootsplash.c jpeg.c usb-hub.c paravirt.c dev-i440fx.c \
-  pci_region.c
+  pci_region.c tcgbios.c tpm_drivers.c
 SRC32SEG=util.c output.c pci.c pcibios.c apm.c stacks.c
 
 cc-option = $(shell if test -z `$(1) $(2) -S -o /dev/null -xc \
Index: seabios/src/Kconfig
===
--- seabios.orig/src/Kconfig
+++ seabios/src/Kconfig
@@ -314,6 +314,14 @@ menu BIOS interfaces
 default n
 help
 Disable A20 on 16bit boot.
+
+config TCGBIOS
+depends on !COREBOOT
+bool TPM support and TCG BIOS extensions
+default y
+help
+Provide TPM support along with TCG BIOS extensions
+
 endmenu
 
 menu BIOS Tables




[Qemu-devel] [PATCH V1 5/8] Support for BIOS interrupt handler

2011-03-30 Thread Stefan Berger
This patch implements the TCG BIOS interrupt handler 1ah. It is for
example used by trusted grub.

This patch adds an implementation of SHA1 (following NIST specs., IETF RFC 3147
and Wikipedia) for speeding up measurements of code. Trusted Grub for example
makes use of this interface and measures (calculates SHA1) of the Linux kernel
and initrd. Those files can be rather large and hunting their bytes through
the TIS interface as part of the int handler commands invoked by trusted grub
does take quite some time due to the many vmexits the interface is creating
(one per byte).

There is also a threshold for the size of data to hash (100k) below which
the TPM is used and above the internal faster SHA1 algorithm is used.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 src/Kconfig   |8 
 src/clock.c   |9 
 src/stacks.c  |   14 +
 src/tcgbios.c |  714 ++
 src/tcgbios.h |3 
 5 files changed, 748 insertions(+)

Index: seabios/src/tcgbios.c
===
--- seabios.orig/src/tcgbios.c
+++ seabios/src/tcgbios.c
@@ -62,6 +62,9 @@ static const u8 GetCapability_OwnerAuth[
 #define RSDP_CAST(ptr)   ((struct rsdp_descriptor *)ptr)
 
 
+static u32 sha1(const u8 *data, u32 length, u8 *hash);
+
+
 /* helper functions */
 
 static inline void *input_buf32(struct bregs *regs)
@@ -522,4 +525,715 @@ err_exit:
 }
 
 
+static int
+isValidPcpes(struct pcpes *pcpes)
+{
+return (pcpes-eventtype != 0);
+}
+
+
+static u8 *
+get_lasa_last_ptr(u16 *entry_count, u8 **lasa_next)
+{
+struct pcpes *pcpes;
+u32 laml;
+u8 *lasa_base = get_lasa_base_ptr(laml);
+u8 *lasa_last = NULL;
+u8 *end = lasa_base + laml;
+u32 size;
+
+if (entry_count)
+*entry_count = 0;
+
+if (!lasa_base)
+return NULL;
+
+while (lasa_base  end) {
+pcpes = (struct pcpes *)lasa_base;
+if (!isValidPcpes(pcpes))
+break;
+if (entry_count)
+(*entry_count)++;
+size = pcpes-eventdatasize + offsetof(struct pcpes, event);
+lasa_last = lasa_base;
+lasa_base += size;
+}
+
+if (lasa_next)
+*lasa_next = lasa_base;
+
+return lasa_last;
+}
+
+
+/***
+  Calculation of SHA1 in SW
+
+  See: http://www.itl.nist.gov/fipspubs/fip180-1.htm
+   RFC3174, Wikipedia's SHA1 alogrithm description
+ **/
+typedef struct _sha1_ctx {
+u32 h[5];
+} sha1_ctx;
+
+
+static inline u32 rol(u32 val, u16 rol)
+{
+u32 res;
+
+__asm__ __volatile__ (rol %%cl, %%eax
+  : =a (res)
+  : a (val), c (rol));
+
+return res;
+}
+
+
+static inline u64 bswap_64(u64 val)
+{
+u32 hi = (u32)(val  32);
+u32 lo = (u32)val;
+
+__asm__ __volatile__ (bswap %%eax
+  : =a (lo)
+  : a (lo));
+
+__asm__ __volatile__ (bswap %%eax
+  : =a (hi)
+  : a (hi));
+
+return ((u64)lo  32 | hi);
+}
+
+
+static const u32 sha_ko[4] = { 0x5a827999,
+   0x6ed9eba1,
+   0x8f1bbcdc,
+   0xca62c1d6 };
+
+
+static void
+sha1_block(u32 *w, sha1_ctx *ctx)
+{
+u32 i;
+u32 a,b,c,d,e,f;
+u32 tmp;
+u32 idx;
+
+/* change endianess of given data */
+for (i = 0; i  16; i++)
+w[i] = htonl(w[i]);
+
+for (i = 16; i = 79; i++) {
+tmp = w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16];
+w[i] = rol(tmp,1);
+}
+
+a = ctx-h[0];
+b = ctx-h[1];
+c = ctx-h[2];
+d = ctx-h[3];
+e = ctx-h[4];
+
+for (i = 0; i = 79; i++) {
+if (i = 19) {
+f = (b  c) | ((b ^ 0x)  d);
+idx = 0;
+} else if (i = 39) {
+f = b ^ c ^ d;
+idx = 1;
+} else if (i = 59) {
+f = (b  c) | (b  d) | (c  d);
+idx = 2;
+} else {
+f = b ^ c ^ d;
+idx = 3;
+}
+
+tmp = rol(a, 5) +
+  f +
+  e +
+  sha_ko[idx] +
+  w[i];
+e = d;
+d = c;
+c = rol(b, 30);
+b = a;
+a = tmp;
+}
+
+ctx-h[0] += a;
+ctx-h[1] += b;
+ctx-h[2] += c;
+ctx-h[3] += d;
+ctx-h[4] += e;
+}
+
+
+static void
+sha1_do(sha1_ctx *ctx, const u8 *data32, u32 length)
+{
+u32 offset;
+u16 num;
+u32 bits = 0;
+u32 w[80];
+u64 tmp;
+
+/* treat data in 64-byte chunks */
+for (offset = 0; length - offset = 64; offset += 64) {
+memcpy(w, data32 + offset, 64);
+sha1_block((u32 *)w, ctx);
+bits += (64 * 8);
+}
+
+/* last block with less than 64 bytes */
+num = length - offset;
+bits += (num  3);
+
+

[Qemu-devel] [PATCH V1 1/8] Add an implementation for a TPM TIS driver

2011-03-30 Thread Stefan Berger
This patch adds an implementation of a TPM TIS driver for the TPM TIS
emulation supported by Qemu (patches posted, not in git yet). Usage of the
driver is broken up into several functions. The driver is cleanly separated
from the rest of the code through an interface holding pointers to the driver's
functions. A client using this driver first probes whether the TPM TIS
interface is available (probe function) and then invokes the interface
function to initialze the interface and send requests and receive responses.

Possible future extensions *could* include a virtio interface for the TPM
with a corresponding driver here.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 src/tpm_drivers.c |  215 ++
 src/tpm_drivers.h |   55 +
 2 files changed, 270 insertions(+)

Index: seabios/src/tpm_drivers.c
===
--- /dev/null
+++ seabios/src/tpm_drivers.c
@@ -0,0 +1,215 @@
+/*
+ *  Implementation of the TCG BIOS extension according to the specification
+ *  described in
+ *  
https://www.trustedcomputinggroup.org/specs/PCClient/TCG_PCClientImplementationforBIOS_1-20_1-00.pdf
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ *
+ * Copyright (C) IBM Corporation, 2006, 2010, 2011
+ *
+ * Author: Stefan Berger stef...@us.ibm.com
+ */
+
+#include config.h
+#include util.h
+#include tpm_drivers.h
+#include tcgbios.h
+
+/* if device is not there, return '0', '1' otherwise */
+static u32 tis_probe(void)
+{
+u32 rc = 0;
+u32 didvid = readl(TIS_REG(0, TIS_REG_DID_VID));
+
+if ((didvid != 0)  (didvid != 0x))
+rc = 1;
+
+return rc;
+}
+
+static u32 tis_init(void)
+{
+writeb(TIS_REG(0, TIS_REG_INT_ENABLE), 0);
+
+return 1;
+}
+
+static u32 tis_wait_sts(u8 locty, u32 time, u8 mask, u8 expect)
+{
+u32 rc = 1;
+
+while (time  0) {
+u8 sts = readb(TIS_REG(locty, TIS_REG_STS));
+if ((sts  mask) == expect) {
+rc = 0;
+break;
+}
+mssleep(1);
+time--;
+}
+return rc;
+}
+
+static u32 tis_activate(u8 locty)
+{
+u32 rc = 0;
+u8 acc;
+int l;
+
+if (!(readb(TIS_REG(locty, TIS_REG_ACCESS)) 
+  TIS_ACCESS_ACTIVE_LOCALITY)) {
+/* release locality in use top-downwards */
+for (l = 4; l = 0; l--)
+writeb(TIS_REG(l, TIS_REG_ACCESS),
+   TIS_ACCESS_ACTIVE_LOCALITY);
+}
+
+/* request access to locality */
+writeb(TIS_REG(locty, TIS_REG_ACCESS), TIS_ACCESS_REQUEST_USE);
+
+acc = readb(TIS_REG(locty, TIS_REG_ACCESS));
+if ((acc  TIS_ACCESS_ACTIVE_LOCALITY)) {
+writeb(TIS_REG(locty, TIS_REG_STS), TIS_STS_COMMAND_READY);
+rc = tis_wait_sts(locty, 1000,
+  TIS_STS_COMMAND_READY, TIS_STS_COMMAND_READY);
+}
+
+return rc;
+}
+
+static u32 tis_find_active_locality(void)
+{
+u8 locty;
+
+for (locty = 0; locty = 4; locty++) {
+if ((readb(TIS_REG(locty, TIS_REG_ACCESS)) 
+ TIS_ACCESS_ACTIVE_LOCALITY))
+return locty;
+}
+
+tis_activate(0);
+
+return 0;
+}
+
+static u32 tis_ready(void)
+{
+u32 rc = 0;
+u8 locty = tis_find_active_locality();
+
+writeb(TIS_REG(locty, TIS_REG_STS), TIS_STS_COMMAND_READY);
+rc = tis_wait_sts(locty, 1000,
+  TIS_STS_COMMAND_READY, TIS_STS_COMMAND_READY);
+
+return rc;
+}
+
+static u32 tis_senddata(const u8 *const data, u32 len)
+{
+u32 rc = 0;
+u32 offset = 0;
+u32 end = 0;
+u16 burst = 0;
+u32 ctr = 0;
+u8 locty = tis_find_active_locality();
+
+do {
+while (burst == 0  ctr  2000) {
+   burst = readl(TIS_REG(locty, TIS_REG_STS))  8;
+if (burst == 0) {
+mssleep(1);
+ctr++;
+}
+}
+
+if (burst == 0) {
+rc = TCG_RESPONSE_TIMEOUT;
+break;
+}
+
+while (1) {
+writeb(TIS_REG(locty, TIS_REG_DATA_FIFO), data[offset++]);
+burst--;
+
+if (burst == 0 || offset == len)
+break;
+}
+
+if (offset == len)
+end = 1;
+} while (end == 0);
+

[Qemu-devel] [PATCH V1 7/8] Add a menu for TPM control

2011-03-30 Thread Stefan Berger
This patch provides an addtional menu entry that enables the user to control
certain aspects of the TPM.

If a working TPM has been detected, the top level BIOS menu
will look like this:

Press F12 for boot menu.
Press F11 to TPM menu.

Upon pressing F11 the TPM menu will be shown:

1. Enable TPM
2. Disable TPM
3. Activate TPM
4. Deactivate TPM
5. Clear ownership
6. Allow installation of owner
7. Prevent installation of owner
Escape for previous menu.
TPM is enabled, active, does not have an owner but one can be installed.

The code for the above 7 menu items is part of this patch.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com


---
 src/boot.c|   10 -
 src/tcgbios.c |  576 +-
 2 files changed, 584 insertions(+), 2 deletions(-)

Index: seabios/src/tcgbios.c
===
--- seabios.orig/src/tcgbios.c
+++ seabios/src/tcgbios.c
@@ -1,4 +1,4 @@
-/*
+;/*
  *  Implementation of the TCG BIOS extension according to the specification
  *  described in
  *  
https://www.trustedcomputinggroup.org/specs/PCClient/TCG_PCClientImplementationforBIOS_1-20_1-00.pdf
@@ -95,6 +95,11 @@ static tcpa_state_t tcpa_state = {
 
 extern struct tpm_driver tpm_drivers[];
 
+typedef struct {
+u16 revision;
+u8  op;
+} tpm_bios_cfg_t;
+
 
 /
   Extensions for TCG-enabled BIOS
@@ -1633,4 +1638,573 @@ tcpa_measure_post(void *from, void *to)
 }
 
 
+
+static u32
+assert_physical_presence(void)
+{
+u32 rc = 0;
+u32 returnCode;
+
+rc = build_and_send_cmd(TPM_ORD_PhysicalPresence,
+PhysicalPresence_CMD_ENABLE,
+sizeof(PhysicalPresence_CMD_ENABLE),
+NULL, 10, returnCode);
+
+#ifdef DEBUG_TCGBIOS
+printf(Return code from TSC_PhysicalPresence(CMD_ENABLE) = 0x%08x\n,
+   returnCode);
+#endif
+if (rc || returnCode)
+goto err_exit;
+
+rc = build_and_send_cmd(TPM_ORD_PhysicalPresence,
+PhysicalPresence_PRESENT,
+sizeof(PhysicalPresence_PRESENT),
+NULL, 10, returnCode);
+
+#ifdef DEBUG_TCGBIOS
+printf(Return code from TSC_PhysicalPresence(PRESENT) = 0x%08x\n,
+   returnCode);
+#endif
+if (rc || returnCode)
+goto err_exit;
+
+return 0;
+
+err_exit:
+#ifdef DEBUG_TCGBIOS
+dprintf(1,TCGBIOS: TPM malfunctioning (line %d).\n, __LINE__);
+#endif
+tcpa_state.tpm_working = 0;
+if (rc)
+return rc;
+return TCG_TCG_COMMAND_ERROR;
+}
+
+
+static u32
+read_permanent_flags(char *buf, int buf_len)
+{
+u32 rc;
+u32 returnCode;
+struct tpm_res_getcap_perm_flags pf;
+
+memset(buf, 0x0, buf_len);
+
+rc = build_and_send_cmd(TPM_ORD_GetCapability,
+GetCapability_Permanent_Flags,
+sizeof(GetCapability_Permanent_Flags),
+(u8 *)pf,
+sizeof(struct tpm_res_getcap_perm_flags),
+returnCode);
+
+#ifdef DEBUG_TCGBIOS
+dprintf(1, TCGBIOS: Return code from TPM_GetCapability() = 0x%08x\n,
+returnCode);
+#endif
+
+if (rc || returnCode)
+goto err_exit;
+
+memcpy(buf, pf.perm_flags, buf_len);
+
+return 0;
+
+err_exit:
+#ifdef DEBUG_TCGBIOS
+dprintf(1,TCGBIOS: TPM malfunctioning (line %d).\n, __LINE__);
+#endif
+tcpa_state.tpm_working = 0;
+if (rc)
+return rc;
+return TCG_TCG_COMMAND_ERROR;
+}
+
+
+static u32
+read_has_owner(u8 *has_owner)
+{
+u32 rc;
+u32 returnCode;
+struct tpm_res_getcap_ownerauth oauth;
+
+rc = build_and_send_cmd(TPM_ORD_GetCapability,
+GetCapability_OwnerAuth,
+sizeof(GetCapability_OwnerAuth),
+(u8 *)oauth,
+sizeof(struct tpm_res_getcap_ownerauth),
+returnCode);
+
+#ifdef DEBUG_TCGBIOS
+dprintf(1, TCGBIOS: Return code from TPM_GetCapability() = 0x%08x\n,
+returnCode);
+#endif
+
+if (rc || returnCode)
+goto err_exit;
+
+*has_owner = oauth.flag;
+
+return 0;
+
+err_exit:
+#ifdef DEBUG_TCGBIOS
+dprintf(1,TCGBIOS: TPM malfunctioning (line %d).\n, __LINE__);
+#endif
+tcpa_state.tpm_working = 0;
+if (rc)
+return rc;
+return TCG_TCG_COMMAND_ERROR;
+}
+
+
+static u32
+disable_tpm(int disable, int verbose)
+{
+u32 rc;
+u32 returnCode;
+struct tpm_permanent_flags pf;
+
+rc = read_permanent_flags((char *)pf, sizeof(pf));
+if (rc)
+return rc;
+
+if (!!pf.flags[PERM_FLAG_IDX_DISABLE] == !!disable) {
+if (verbose)
+printf(TPM is already %s.\n,,
+   disable ? disabled : enabled);
+return 0;
+}
+
+rc = assert_physical_presence();
+

[Qemu-devel] [PATCH V1 3/8] Implementation of the TCG BIOS extensions

2011-03-30 Thread Stefan Berger
This patch implements the main part of the TCG BIOS extensions. It provides
the following functionality:

- initialization of the TCPA ACPI table used for logging of measurements
- initialization of the TPM by sending a sequence of commands to it
- proper setup of the TPM once the BIOS hands over control to the bootloader
- support for S3 resume; BIOS sends TPM_Startup(ST_STATE) to TPM
- a utility function called mssleep is added. It waits for a number
  of milliseconds before it returns. I had tried to build a function
  like that based on calc_future_time() and check_timer(), but those
  didn't work once in an S3 resume.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 src/boot.c|2 
 src/post.c|5 
 src/resume.c  |2 
 src/tcgbios.c |  525 ++
 src/tcgbios.h |  386 ++
 src/util.c|   18 +
 src/util.h|5 
 7 files changed, 943 insertions(+)

Index: seabios/src/tcgbios.c
===
--- /dev/null
+++ seabios/src/tcgbios.c
@@ -0,0 +1,525 @@
+/*
+ *  Implementation of the TCG BIOS extension according to the specification
+ *  described in
+ *  
https://www.trustedcomputinggroup.org/specs/PCClient/TCG_PCClientImplementationforBIOS_1-20_1-00.pdf
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ *
+ * Copyright (C) IBM Corporation, 2006,2010,2011
+ *
+ * Author: Stefan Berger stef...@us.ibm.com
+ */
+
+#include config.h
+
+#if CONFIG_TCGBIOS
+
+#include types.h
+#include tpm_drivers.h // tpm_drivers[]
+#include util.h // printf, get_keystroke
+#include tcgbios.h// tcpa_*, prototypes
+#include acpi.h  // RSDP_SIGNATURE, rsdt_descriptor
+#include smbios.h // smbios_entry_point
+
+
+//#define DEBUG_TCGBIOS
+
+static const u8 Startup_ST_CLEAR[2] = { 0x00, TPM_ST_CLEAR };
+static const u8 Startup_ST_STATE[2] = { 0x00, TPM_ST_STATE };
+
+static const u8 PhysicalPresence_CMD_ENABLE[2]  = { 0x00, 0x20 };
+static const u8 PhysicalPresence_CMD_DISABLE[2] = { 0x01, 0x00 };
+static const u8 PhysicalPresence_PRESENT[2] = { 0x00, 0x08 };
+static const u8 PhysicalPresence_NOT_PRESENT[2] = { 0x00, 0x10 };
+static const u8 PhysicalPresence_LOCK[2]= { 0x00, 0x04 };
+
+static const u8 CommandFlag_FALSE[1] = { 0x00 };
+static const u8 CommandFlag_TRUE[1]  = { 0x01 };
+
+static const u8 GetCapability_Permanent_Flags[12] = {
+0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x04,
+0x00, 0x00, 0x01, 0x08
+};
+
+static const u8 GetCapability_OwnerAuth[12] = {
+0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x04,
+0x00, 0x00, 0x01, 0x11
+};
+
+
+#define RSDP_CAST(ptr)   ((struct rsdp_descriptor *)ptr)
+
+
+/* helper functions */
+
+static inline void *input_buf32(struct bregs *regs)
+{
+return MAKE_FLATPTR(regs-es, regs-di);
+}
+
+static inline void *output_buf32(struct bregs *regs)
+{
+return MAKE_FLATPTR(regs-ds, regs-si);
+}
+
+
+typedef struct {
+u8tpm_probed:1;
+u8tpm_found:1;
+u8tpm_working:1;
+u8if_shutdown:1;
+u8tpm_driver_to_use:4;
+} tcpa_state_t;
+
+
+static tcpa_state_t tcpa_state = {
+.tpm_driver_to_use = TPM_INVALID_DRIVER,
+};
+
+extern struct tpm_driver tpm_drivers[];
+
+
+/
+  Extensions for TCG-enabled BIOS
+ ***/
+
+
+static u32
+is_tpm_present(void)
+{
+u32 rc = 0;
+unsigned int i;
+
+for (i = 0; i  TPM_NUM_DRIVERS; i++) {
+struct tpm_driver *td = tpm_drivers[i];
+if (td-probe() != 0) {
+td-init();
+tcpa_state.tpm_driver_to_use = i;
+rc = 1;
+break;
+}
+}
+
+return rc;
+}
+
+
+int
+has_working_tpm(void)
+{
+if (!tcpa_state.tpm_probed) {
+tcpa_state.tpm_probed = 1;
+tcpa_state.tpm_found = (is_tpm_present() != 0);
+tcpa_state.tpm_working = 1;
+}
+if (!tcpa_state.tpm_working)
+return 0;
+
+return tcpa_state.tpm_found;
+}
+
+
+static u8
+calc_checksum(const u8 *addr, u32 length)
+{
+u8 sum = 0;
+u32 ctr;
+
+for (ctr = 0; ctr  length; ctr++)
+sum += addr[ctr];
+

[Qemu-devel] [PATCH V1 6/8] Add measurement code to the BIOS

2011-03-30 Thread Stefan Berger
This patch adds invocactions of functions that measure various parts of the
code and data through various parts of the BIOS code. It follows TCG
specifications on what needs to be measured. It also adds the implementation
of the called functions.

Reference for what needs to be measured can be found in section 3.2.2++ in

http://www.trustedcomputinggroup.org/resources/pc_client_work_group_specific_implementation_specification_for_conventional_bios_specification_version_12


The first measurements are done once the ACPI tables have been initialized.
The whole 0xe and 0xf segment are measure for measuring the 'POST'.

Once booted into Linux, the current measurements produce the following logs
which can be found in /sys/kernel/security/tpm0/ascii_bios_measurements.
The below log also shows measurements from trusted grub.

 0 0f9a2ad992c04a24e081b2a49b984616c4358386 01 [POST CODE]
 1 3fb240d2a04085a4e84f81e4398e070ed5a18163 06 [SMBIOS]
 2 cc812353fc277c1fab99e0b721752a1392984566 06 [Option ROM]
 2 9dbd87163112e5670378abe4510491259a61f411 05 [Start Option ROM Scan]
 2 6f74e357331b8dee11bbad85f27bc66cb873106c 06 [Option ROM]
 2 5626eb7ac05c7231e46d7461e7d3839b03ae9fad 06 [Option ROM]
 4 c1e25c3f6b0dc78d57296aa2870ca6f782ccf80f 05 [Calling INT 19h]
 0 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 1 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 2 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 3 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 4 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 5 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 6 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 7 d9be6524a5f5047db5866813acf3277892a7a30a 04 []
 4 8cf2fe6c87d4d0b2998a43da630292e6d85ee8b6 05 [Booting BCV device 80h (HDD)]
 4 5dff94459a3e2d13a433ef94afdc306144565bf7 0d [IPL]
 5 d1b33afde65ad47502332af957c60f20c84c1edc 0e [IPL Partition Data]
 4 487ce764b527ccad17f1d04243d0136fa981e6c4 0d [IPL]
 4 91d285e4dead566324c8938a3cc75803f462d9a1 0d [IPL]
 4 8ba79ac98bb491524fef29defc724daaf6263d35 0d [IPL]
 4 c591c15b82e4ff30e7383a4ff1ef3b41b38521ac 06 []
 4 8cdc27ec545eda33fbba1e8b8dae4da5c7206972 04 [Grub Event Separator]
 5 8cdc27ec545eda33fbba1e8b8dae4da5c7206972 04 [Grub Event Separator]
 5 e8673b9e14b02dc12d8ccfd0176bca7a3de7fc3c 0e [IPL Partition Data]
 5 0163e375a0af7525c5dac1a8e74b277359e40d1d 1105 []
 8 4be30f67c3d48ab7f04d9c0fd07f06d4c68379be 1205 []
 8 54c83965978de9708d026016ecb0e70660e04388 1305 []
 5 2431ed60130faeaf3a045f21963f71cacd46a029 04 [OS Event Separator]
 8 2431ed60130faeaf3a045f21963f71cacd46a029 04 [OS Event Separator]
 8 f3973cae05d6e2055062119d6e6e1e077b7df876 1005 []


Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 src/boot.c   |5 
 src/cdrom.c  |   10 +
 src/optionroms.c |4 
 src/post.c   |5 
 src/tcgbios.c|  397 +++
 src/tcgbios.h|   36 
 6 files changed, 457 insertions(+)

Index: seabios/src/post.c
===
--- seabios.orig/src/post.c
+++ seabios/src/post.c
@@ -190,6 +190,9 @@ init_hw(void)
 void VISIBLE32FLAT
 startBoot(void)
 {
+tcpa_calling_int19h();
+tcpa_add_event_separators();
+
 // Clear low-memory allocations (required by PMM spec).
 memset((void*)BUILD_STACK_ADDR, 0, BUILD_EBDA_MINIMUM - BUILD_STACK_ADDR);
 
@@ -242,6 +245,8 @@ maininit(void)
 // Initialize tpm (after acpi tables were written)
 tcpa_acpi_init();
 tcpa_startup();
+tcpa_measure_post((void *)0xE, (void *)0xF);
+tcpa_smbios_measure();
 
 // Run vga option rom
 vga_setup();
Index: seabios/src/optionroms.c
===
--- seabios.orig/src/optionroms.c
+++ seabios/src/optionroms.c
@@ -14,6 +14,7 @@
 #include pci_ids.h // PCI_CLASS_DISPLAY_VGA
 #include boot.h // IPL
 #include paravirt.h // qemu_cfg_*
+#include tcgbios.h // tcpa_*
 
 
 /
@@ -134,6 +135,7 @@ is_valid_rom(struct rom_header *rom)
 if (CONFIG_OPTIONROMS_CHECKSUM)
 return 0;
 }
+tcpa_option_rom(FLATPTR_TO_SEG(rom), len);
 return 1;
 }
 
@@ -385,6 +387,8 @@ optionrom_setup(void)
 memset(sources, 0, sizeof(sources));
 u32 post_vga = RomEnd;
 
+tcpa_start_option_rom_scan();
+
 if (CONFIG_OPTIONROMS_DEPLOYED) {
 // Option roms are already deployed on the system.
 u32 pos = RomEnd;
Index: seabios/src/boot.c
===
--- seabios.orig/src/boot.c
+++ seabios/src/boot.c
@@ -533,6 +533,9 @@ boot_disk(u8 bootdrv, int checksig)
 }
 }
 
+tcpa_add_bootdevice(0, bootdrv);
+tcpa_ipl(IPL_BCV, bootseg, 0, 512); /* specs: 8.2.3 steps 4 and 5 */
+
 /* Canonicalize bootseg:bootip */
 u16 bootip = (bootseg  0x0fff)  4;
 bootseg = 0xf000;
@@ -620,6 +623,8 @@ do_boot(u16 seq_nr)
 break;
 }
 
+

[Qemu-devel] [PATCH V1 8/8] Optional tests for the TIS interface

2011-03-30 Thread Stefan Berger
This patch adds an optional test suite (CONFIG_TIS_TEST) for the TIS interface
to SeaBIOS. If compiled into the BIOS, it can be invoked through the
TPM-specific menu item 8.

1. Enable TPM
2. Disable TPM
3. Activate TPM
4. Deactivate TPM
5. Clear ownership
6. Allow installation of owner
7. Prevent installation of owner
8. TIS test

I would like to see this code become part of the SeaBIOS code base
but I understand that a test suite in a BIOS is not the right place...
Nevertheless, for testing the TIS emulation in Qemu, I am posting it here.
The test suite fills up the available BIOS space from 92.6% at the previous
patch to 98.4%.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 Makefile   |2 
 src/Kconfig|7 
 src/tcgbios.c  |   34 +-
 src/tis_test.c |  846 +
 src/tis_test.h |   52 +++
 5 files changed, 932 insertions(+), 9 deletions(-)

Index: seabios/Makefile
===
--- seabios.orig/Makefile
+++ seabios/Makefile
@@ -20,7 +20,7 @@ SRC16=$(SRCBOTH) system.c disk.c font.c
 SRC32FLAT=$(SRCBOTH) post.c shadow.c memmap.c coreboot.c boot.c \
   acpi.c smm.c mptable.c smbios.c pciinit.c optionroms.c mtrr.c \
   lzmadecode.c bootsplash.c jpeg.c usb-hub.c paravirt.c dev-i440fx.c \
-  pci_region.c tcgbios.c tpm_drivers.c
+  pci_region.c tcgbios.c tpm_drivers.c tis_test.c
 SRC32SEG=util.c output.c pci.c pcibios.c apm.c stacks.c
 
 cc-option = $(shell if test -z `$(1) $(2) -S -o /dev/null -xc \
Index: seabios/src/tcgbios.c
===
--- seabios.orig/src/tcgbios.c
+++ seabios/src/tcgbios.c
@@ -33,6 +33,9 @@
 #include acpi.h  // RSDP_SIGNATURE, rsdt_descriptor
 #include smbios.h // smbios_entry_point
 
+#ifdef CONFIG_TIS_TEST
+#include tis_test.h
+#endif
 
 //#define DEBUG_TCGBIOS
 
@@ -927,6 +930,10 @@ pass_through_to_tpm(struct pttti *pttti,
 iovec[1].data   = NULL;
 iovec[1].length = 0;
 
+#ifdef CONFIG_TIS_TEST
+locty = pttti-reserved;
+#endif
+
 rc = transmit(locty, iovec, pttto-tpmopout, resbuflen);
 if (rc)
 goto err_exit;
@@ -2022,26 +2029,29 @@ err_exit:
 }
 
 
-static void
+static int
 show_tpm_state(void)
 {
+int state = 0;
 struct tpm_permanent_flags pf;
 u8 has_owner;
 
 if (read_permanent_flags((char *)pf, sizeof(pf)) ||
 read_has_owner(has_owner))
-return;
+return ~0;
 
 printf(TPM is );
 
-if (pf.flags[PERM_FLAG_IDX_DISABLE])
+if (pf.flags[PERM_FLAG_IDX_DISABLE]) {
 printf(disabled);
-else
+state |= 1  PERM_FLAG_IDX_DISABLE;
+} else
 printf(enabled);
 
-if (pf.flags[PERM_FLAG_IDX_DEACTIVATED])
+if (pf.flags[PERM_FLAG_IDX_DEACTIVATED]) {
 printf(, deactivated);
-else
+state |= 1  PERM_FLAG_IDX_DEACTIVATED;
+} else
 printf(, active);
 
 if (has_owner)
@@ -2054,6 +2064,7 @@ show_tpm_state(void)
 printf(and an owner cannot be installed.\n);
 }
 
+return state;
 }
 
 
@@ -2152,7 +2163,7 @@ void
 tcpa_menu(void)
 {
 int show_menu = 1;
-int scan_code;
+int scan_code, state;
 u32 rc;
 tpm_bios_cfg_t cfg = {
 .revision = 1,
@@ -2172,9 +2183,12 @@ tcpa_menu(void)
5. Clear ownership\n
6. Allow installation of owner\n
7. Prevent installation of owner\n
+#ifdef CONFIG_TIS_TEST
+   8. TIS test\n
+#endif
Escape for previous menu.\n);
 show_menu = 0;
-show_tpm_state();
+state = show_tpm_state();
 }
 
 cfg.op = 0;
@@ -2194,6 +2208,10 @@ tcpa_menu(void)
 case 7 ... 8:
 cfg.op = scan_code + 1;
 break;
+#ifdef CONFIG_TIS_TEST
+case 9:
+tis_test(state);
+#endif
 default:
 continue;
 }
Index: seabios/src/tis_test.c
===
--- /dev/null
+++ seabios/src/tis_test.c
@@ -0,0 +1,846 @@
+/*
+ *  TIS interface tests
+ *
+ *  This library is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU Lesser General Public
+ *  License as published by the Free Software Foundation; either
+ *  version 2 of the License, or (at your option) any later version.
+ *
+ *  This library is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  Lesser General Public License for more details.
+ *
+ *  You should have received a copy of the GNU Lesser General Public
+ *  License along with this library; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+ *
+ * Copyright (C) IBM Corporation, 2006,2011
+ *
+ * Author: 

Re: [Qemu-devel] [PATCH 3/3] vhost: roll our own cpu map variant

2011-03-30 Thread Stefan Hajnoczi
On Wed, Mar 30, 2011 at 5:59 PM, Michael S. Tsirkin m...@redhat.com wrote:
 On Wed, Mar 30, 2011 at 05:26:22PM +0100, Stefan Hajnoczi wrote:
 On Wed, Mar 30, 2011 at 5:09 PM, Michael S. Tsirkin m...@redhat.com wrote:
  On Tue, Mar 29, 2011 at 11:53:54AM +0100, Stefan Hajnoczi wrote:
  On Mon, Mar 28, 2011 at 10:14 PM, Michael S. Tsirkin m...@redhat.com 
  wrote:
   vhost used cpu_physical_memory_map to get the
   virtual address for the ring, however,
   this will exit on an illegal RAM address.
   Since the addresses are guest-controlled, we
   shouldn't do that.
  
   Switch to our own variant that uses the vhost
   tables and returns an error instead of exiting.
 
  We should make all of QEMU more robust instead of just vhost.  Perhaps
  introduce cpu_physical_memory_map_nofail(...) that aborts like the
  current cpu_physical_memory_map() implementation and then make non-hw/
  users call that one.  hw/ users should check for failure.
 
  Stefan
 
  Yea, well ... at least vhost-net wants to also check
  it is given a ram address, not some other physical address.
  We could generally replace the memory management in vhost-net
  by some other logic, when that's done this one can
  go away as well.

 Sounds like you do not want to refactor physical memory access for
 non-vhost.  Fair enough but we have to do it sooner or later in order
 to make all of QEMU more robust.  If vhost-net is protected but the
 IDE CD-ROM and virtio-blk disk still have issues then we haven't
 reached our goal yet.  Any way I can convince you to do a generic API?
 :)

 Stefan

 If you are talking about splitting real ram from non ram
 and creating a generic API for that, you don't need to convince me,
 but I can't commit to implementing it right now.

Okay, userspace virtio will be able to use it as well in the future.

Stefan



Re: [Qemu-devel] CD-ROM bug round-up

2011-03-30 Thread Stefan Hajnoczi
On Wed, Mar 30, 2011 at 5:50 PM, Gleb Natapov g...@redhat.com wrote:
 On Wed, Mar 30, 2011 at 05:40:40PM +0100, Stefan Hajnoczi wrote:
 = Anthony =

 Using 'change' or scripted commands without a delay between closing
 the BlockDriverState and opening the new CD-ROM.  The guest does not
 have a chance to poll the drive and determine a medium present -
 medium not present transition followed by a medium not present -
 medium present transition.

 A fix for this does not exist yet and might be tricky since we need to
 delay or wait for the guest to poll once before opening the new
 CD-ROM.

 This sounds familiar and should have been fixed by commit
 93c8cfd9e67a62711b86f4c93747566885eb7928

Excellent, thanks for pointing this out.  We need to figure out
whether this mechanism is insufficient for some guests.

On a Windows-related note, I've found that Windows 2003 Server sends
START STOP UNIT to eject the medium without unlocking the tray first.
When the CD-ROM is passed through to a Linux host CD-ROM drive the
eject ioctl fails if the tray is locked.  I haven't investigated this
more yet.

Stefan



Re: [Qemu-devel] [PATCH] v3 revamp acpitable parsing and allow to specify complete (headerful) table

2011-03-30 Thread Michael Tokarev
30.03.2011 18:26, Gleb Natapov wrote:
 On Tue, Mar 29, 2011 at 10:41:11PM +0400, Michael Tokarev wrote:
 [Cc'ing Gleb since he - it seems - wrote the original code]

Thank you for looking into this.

 17.03.2011 13:00, Michael Tokarev wrote:
 This patch almost rewrites acpi_table_add() function
 (but still leaves it using old get_param_value() interface).
 The result is that it's now possible to specify whole table
 (together with a header) in an external file, instead of just
 data portion, with a new file= parameter, but at the same time
 it's still possible to specify header fields as before.

[]
 +if (get_param_value(buf, sizeof(buf), data, t)) {
 +has_header = false;
 +} else if (get_param_value(buf, sizeof(buf), file, t)) {
 +has_header = true;
 +} else {
 +has_header = false;
 +buf[0] = '\0';
 +}
 Shouldn't we print some kind of error if user specify data and file
 param? Otherwise looks good to me.

This whole thing needs to be replaced with proper option
parsing.  Currently, it completely ignores any unknown
(such as mistyped) parameters among other things, or you
can specify the same parameter twice without it noticing.
So it needs more work when we'll decide which route to
take with option parsing.

I can rewrite it using QemuOpts, but I didn't want to
because it appears their lifetime is very limited too.

And sure, I can check for conflicting data= and file=,
I just thought it's not really necessary having in mind
all the above :)

/mjt



[Qemu-devel] [PATCH V1 2/8] Provide ACPI SSDT table for TPM device + S3 resume support

2011-03-30 Thread Stefan Berger
This patch provides ACPI support for the TPM device. It probes for the TPM
device and only if a TPM device is found then the TPM's SSDT and TCPA table
are created. This patch also connects them to the RSDT.

Since the logging area in the TCPA table requires 64kb, the memory reserved
for ACPI tables (config.h) is increased to 96kb in case CONFIG_TCGBIOS
is enabled.

This patch requires the subsequent patch for it to compile and work.

The IRQ description in the TPM's SSDT is commented since it will be
'safer' to run the TPM in polling mode - the Linux TPM TIS driver for example
has too many issues when run in interrupt mode.

The description of the TCPA (client) table can be found here:

http://www.trustedcomputinggroup.org/resources/server_work_group_acpi_general_specification_version_10

The compiled SSDT description is also part of this patch.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 Makefile  |9 -
 src/acpi-tpm-ssdt.dsl |   22 ++
 src/acpi-tpm-ssdt.hex |   34 ++
 src/acpi.c|   41 +
 src/acpi.h|   20 
 src/config.h  |6 +-
 6 files changed, 130 insertions(+), 2 deletions(-)

Index: seabios/src/acpi-tpm-ssdt.dsl
===
--- /dev/null
+++ seabios/src/acpi-tpm-ssdt.dsl
@@ -0,0 +1,22 @@
+DefinitionBlock (
+acpi-tpm-ssdt.aml,// Output Filename
+SSDT, // Signature
+0x01,   // SSDT Compliance Revision
+BXPC, // OEMID
+BXSSDT,   // TABLE ID
+0x1 // OEM Revision
+)
+{
+/* TPM with emulated TPM TIS interface */
+Device (TPM) {
+Name (_HID, EisaID (PNP0C31))
+Name (_CRS, ResourceTemplate ()
+{
+Memory32Fixed (ReadWrite, 0xFED4, 0x5000)
+//IRQNoFlags () {11}
+})
+Method (_STA, 0, NotSerialized) {
+Return (0x0F)
+}
+}
+}
Index: seabios/src/acpi-tpm-ssdt.hex
===
--- /dev/null
+++ seabios/src/acpi-tpm-ssdt.hex
@@ -0,0 +1,34 @@
+/*
+ * 
+ * Intel ACPI Component Architecture
+ * ASL Optimizing Compiler version 20101013-64 [Nov 21 2010]
+ * Copyright (c) 2000 - 2010 Intel Corporation
+ * 
+ * Compilation of out/.dsl.i - Sun Mar 13 20:44:32 2011
+ * 
+ * C source code output
+ * AML code block contains 0x93 bytes
+ *
+ */
+unsigned char AmlCode_TPM[] =
+{
+0x53,0x53,0x44,0x54,0x93,0x00,0x00,0x00,  /* SSDT */
+0x01,0xC3,0x42,0x58,0x50,0x43,0x00,0x00,  /* 0008..BXPC.. */
+0x42,0x58,0x53,0x53,0x44,0x54,0x00,0x00,  /* 0010BXSSDT.. */
+0x01,0x00,0x00,0x00,0x49,0x4E,0x54,0x4C,  /* 0018INTL */
+0x13,0x10,0x10,0x20,0x5B,0x82,0x4D,0x06,  /* 0020... [.M. */
+0x54,0x50,0x4D,0x5F,0x08,0x5F,0x48,0x49,  /* 0028TPM_._HI */
+0x44,0x0C,0x41,0xD0,0x0C,0x31,0x08,0x5F,  /* 0030D.A..1._ */
+0x53,0x54,0x52,0x11,0x33,0x0A,0x30,0x45,  /* 0038STR.3.0E */
+0x00,0x6D,0x00,0x75,0x00,0x6C,0x00,0x61,  /* 0040.m.u.l.a */
+0x00,0x74,0x00,0x65,0x00,0x64,0x00,0x20,  /* 0048.t.e.d.  */
+0x00,0x54,0x00,0x50,0x00,0x4D,0x00,0x20,  /* 0050.T.P.M.  */
+0x00,0x54,0x00,0x49,0x00,0x53,0x00,0x20,  /* 0058.T.I.S.  */
+0x00,0x64,0x00,0x65,0x00,0x76,0x00,0x69,  /* 0060.d.e.v.i */
+0x00,0x63,0x00,0x65,0x00,0x00,0x00,0x08,  /* 0068.c.e */
+0x5F,0x43,0x52,0x53,0x11,0x14,0x0A,0x11,  /* 0070_CRS */
+0x86,0x09,0x00,0x01,0x00,0x00,0xD4,0xFE,  /* 0078 */
+0x00,0x50,0x00,0x00,0x22,0x00,0x08,0x79,  /* 0080.Py */
+0x00,0x14,0x09,0x5F,0x53,0x54,0x41,0x00,  /* 0088..._STA. */
+0xA4,0x0A,0x0F/* 0090...  */
+};
Index: seabios/Makefile
===
--- seabios.orig/Makefile
+++ seabios/Makefile
@@ -192,13 +192,20 @@ $(OUT)vgabios.bin: $(OUT)vgabios.bin.raw
$(Q)./tools/buildrom.py $ $@
 
 ### dsdt build rules
+src/acpi-tpm-ssdt.hex: src/acpi-tpm-ssdt.dsl
+   @echo Compiling TPM SSDT
+   $(Q)cpp -P $  $(OUT)$*.dsl.i
+   $(Q)iasl -tc -p $(OUT)$* $(OUT)$*.dsl.i
+   $(Q)cp $(OUT)$*.hex $@
+   $(Q)sed -i 's/AmlCode/AmlCode_TPM/' $@
+
 src/%.hex: src/%.dsl
@echo Compiling DSDT
$(Q)cpp -P $  $(OUT)$*.dsl.i
$(Q)iasl -tc -p $(OUT)$* $(OUT)$*.dsl.i
$(Q)cp $(OUT)$*.hex $@
 
-$(OUT)ccode32flat.o: src/acpi-dsdt.hex
+$(OUT)ccode32flat.o: src/acpi-dsdt.hex src/acpi-tpm-ssdt.hex
 
 ### Kconfig rules
 export HOSTCC := $(CC)
Index: seabios/src/acpi.c
===
--- seabios.orig/src/acpi.c
+++ seabios/src/acpi.c
@@ 

Re: [Qemu-devel] [PATCH 1/7] target-arm: Make Neon helper routines use correct FP status

2011-03-30 Thread Nathan Froyd
On Mon, Mar 28, 2011 at 03:15:08PM +0100, Peter Maydell wrote:
 On 14 March 2011 05:35, Nathan Froyd froy...@codesourcery.com wrote:
  Oh, right.  I am ambivalent as to whether passing env to such functions
  is the right thing to do or not.
 
 So did this amount to a request for a change to this patchset,
 or are you happy to let it pass?

I am happy to let it pass.

-Nathan



Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Peter Maydell
On 15 March 2011 11:56, Gleb Natapov g...@redhat.com wrote:
 Currently when rogue script kills QEMU process (using TERM/INT/HUP
 signal) it looks indistinguishable from system shutdown. Lets report
 that QEMU was killed and leave some clues about the killer identity.

Unfortunately this patch causes qemu to segfault when killed
via ^C (at least on my Ubuntu maverick system). This is because
it registers a signal handler with sigaction, but then later
the SDL library is initialised and it reinstalls our handler
with plain old signal:

ohandler = signal(SIGINT, SDL_HandleSIG);
if ( ohandler != SIG_DFL )
signal(SIGINT, ohandler);

This is clearly buggy but on the other hand SDL is pretty widely
deployed and it's the default QEMU video output method, so I think
we need to work around it :-(

The most straightforward fix is to get the signal number from
argument one and not to bother printing the PID that killed us.

-- PMM



Re: [Qemu-devel] GSoC: Improved image format compatibility

2011-03-30 Thread Lyu Mitnick
Hello Stefan,

I have spent a whole night to trace the code of block/vpc.c and compare
against the VHD specifications
(at
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=C2D03242-2FFB-48EF-A211-F0C44741109Eamp;displaylang=en
).
It seems that there isn't Fixed hard disk image and Differencing hard disk
image support now. I am wondering whether what I has
learned from tracing code is true or false? I also found some possibility of
optimization in block/vpc.c. Should I contact mentor
assigned to this project in private or post the whole idea on the mailing
list??

thanks very much

Mitnick

2011/3/30 Stefan Hajnoczi stefa...@gmail.com

 On Wed, Mar 30, 2011 at 5:50 AM, Stefan Weil w...@mail.berlios.de wrote:
  Am 29.03.2011 23:55, schrieb Lyu Mitnick:
 
  Hello all,
 
  I have used QEMU to assist me developing embedded system for 3 years.
 And
  I
  want to contribute to QEMU and participate Google Summer of Code this
  year.
  I have port binutils so I have some experienced with binary format. I
  noticed that there is no VHD support in qemu-img. A Virtual Hard
 Disk(VHD)
  is a virtual hard disk file format used in Microsoft virtual PC,
 Hyper-V,
  virtual box, xVM and Complete PC Backup  of vista and windows 7. I want
 to
  add support for VHD in GSoC. I am wondering whether this kind of idea is
  valuable to QEMU?? Or any another topic related image format is valuable
  to
  QEMU??
 
  thanks
 
  Mitnick
 
  Hello Mitnick,
 
  QEMU (including qemu-img) already supports the format vpc.
  The code in block/vpc.c also uses the name vhd, so I think
  vpc == vhd.

 However, it is possible that vpc/vhd has new features that are not yet
 supported in QEMU.  There is a listed project idea for updating image
 formats on the GSoC wiki page:


 http://wiki.qemu.org/Google_Summer_of_Code_2011#Improved_image_format_compatibility

 You could assess QEMU's implementation in block/vpc.c and compare
 against the latest public file format specification:

 http://en.wikipedia.org/wiki/VHD_(file_format)

 If you find there are new features or format changes that are not yet
 supported in QEMU, this might make a good Improved image format
 compatibility GSoC project.

 Stefan



[Qemu-devel] virtio-blk.c handling of i/o which is not a 512 multiple

2011-03-30 Thread Conor Murphy
Hi,

I'm trying to write a virtio-blk driver for Solaris. I've gotten it to the point
where Solaris can see the device and create a ZFS file system on it.

However when I try and create a UFS filesystem on the device, the VM crashed
with the error
*** glibc detected *** /usr/bin/qemu-kvm: double free or corruption (!prev):
0x7f2d38000a00 ***

I can reproduce the problem with a simple dd, i.e.
dd if=/dev/zero of=/dev/rdsk/c2d10p0 bs=5000 count=1

My driver will create a virtio-blk request with two elements in the sg list, one
for the first 4096 byes and the other for the remaining 904.

From stepping through with gdb, virtio_blk_handle_write will sets n_sectors to 
9
(5000 / 512). Later on the code, n_sectors is used the calculate the size of the
buffer required but 9 * 512 is too small and so when the request is process it
ends up writing past the end of the buffer and I guest this triggers the glibc
error.

Is there a requirement for virtio-blk guest drivers that all i/o requests are
sized in multiples of 512 bytes?

Thanks,
Conor




Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Gleb Natapov
On Wed, Mar 30, 2011 at 07:39:31PM +0100, Peter Maydell wrote:
 On 15 March 2011 11:56, Gleb Natapov g...@redhat.com wrote:
  Currently when rogue script kills QEMU process (using TERM/INT/HUP
  signal) it looks indistinguishable from system shutdown. Lets report
  that QEMU was killed and leave some clues about the killer identity.
 
 Unfortunately this patch causes qemu to segfault when killed
 via ^C (at least on my Ubuntu maverick system). This is because
 it registers a signal handler with sigaction, but then later
 the SDL library is initialised and it reinstalls our handler
 with plain old signal:
 
 ohandler = signal(SIGINT, SDL_HandleSIG);
 if ( ohandler != SIG_DFL )
 signal(SIGINT, ohandler);
 
I fixed this in SDL upstream.

 This is clearly buggy but on the other hand SDL is pretty widely
 deployed and it's the default QEMU video output method, so I think
 we need to work around it :-(
 
 The most straightforward fix is to get the signal number from
 argument one and not to bother printing the PID that killed us.
 
For debugging purposes pid is useful. We cam register signal handler
after SDL is initialized though (if waiting for SDL update is not an
option).

--
Gleb.



Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Anthony Liguori

On 03/30/2011 01:39 PM, Peter Maydell wrote:

On 15 March 2011 11:56, Gleb Natapovg...@redhat.com  wrote:

Currently when rogue script kills QEMU process (using TERM/INT/HUP
signal) it looks indistinguishable from system shutdown. Lets report
that QEMU was killed and leave some clues about the killer identity.

Unfortunately this patch causes qemu to segfault when killed
via ^C (at least on my Ubuntu maverick system). This is because
it registers a signal handler with sigaction, but then later
the SDL library is initialised and it reinstalls our handler
with plain old signal:

 ohandler = signal(SIGINT, SDL_HandleSIG);
 if ( ohandler != SIG_DFL )
 signal(SIGINT, ohandler);

This is clearly buggy but on the other hand SDL is pretty widely
deployed and it's the default QEMU video output method, so I think
we need to work around it :-(

The most straightforward fix is to get the signal number from
argument one and not to bother printing the PID that killed us.


Or just #ifdefing this to 0 for now and then once an SDL version is 
released that works correctly, adding an appropriate guard.


Regards,

Anthony Liguori


-- PMM






Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Gleb Natapov
On Wed, Mar 30, 2011 at 01:49:10PM -0500, Anthony Liguori wrote:
 On 03/30/2011 01:39 PM, Peter Maydell wrote:
 On 15 March 2011 11:56, Gleb Natapovg...@redhat.com  wrote:
 Currently when rogue script kills QEMU process (using TERM/INT/HUP
 signal) it looks indistinguishable from system shutdown. Lets report
 that QEMU was killed and leave some clues about the killer identity.
 Unfortunately this patch causes qemu to segfault when killed
 via ^C (at least on my Ubuntu maverick system). This is because
 it registers a signal handler with sigaction, but then later
 the SDL library is initialised and it reinstalls our handler
 with plain old signal:
 
  ohandler = signal(SIGINT, SDL_HandleSIG);
  if ( ohandler != SIG_DFL )
  signal(SIGINT, ohandler);
 
 This is clearly buggy but on the other hand SDL is pretty widely
 deployed and it's the default QEMU video output method, so I think
 we need to work around it :-(
 
 The most straightforward fix is to get the signal number from
 argument one and not to bother printing the PID that killed us.
 
 Or just #ifdefing this to 0 for now and then once an SDL version is
 released that works correctly, adding an appropriate guard.
 
I prefer to move signal init after SDL init.

--
Gleb.



Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Peter Maydell
On 30 March 2011 19:43, Gleb Natapov g...@redhat.com wrote:
 On Wed, Mar 30, 2011 at 07:39:31PM +0100, Peter Maydell wrote:
 Unfortunately this patch causes qemu to segfault when killed
 via ^C (at least on my Ubuntu maverick system). This is because
 it registers a signal handler with sigaction, but then later
 the SDL library is initialised and it reinstalls our handler
 with plain old signal:

         ohandler = signal(SIGINT, SDL_HandleSIG);
         if ( ohandler != SIG_DFL )
                 signal(SIGINT, ohandler);

 I fixed this in SDL upstream.

Cool, but we can't rely on the SDL we're linked against being
a bugfixed version.

 This is clearly buggy but on the other hand SDL is pretty widely
 deployed and it's the default QEMU video output method, so I think
 we need to work around it :-(

 The most straightforward fix is to get the signal number from
 argument one and not to bother printing the PID that killed us.

 For debugging purposes pid is useful. We cam register signal handler
 after SDL is initialized though (if waiting for SDL update is not an
 option).

I'm not convinced about the utility of printing the pid, personally.
Most programs get along fine without printing anything when
they receive a terminal signal. However you're right that it should
be straightforward enough to move the signal init. In fact it
looks like this got broken somewhere along the line, the
call to os_setup_signal_handling() is already preceded with a comment:

/* must be after terminal init, SDL library changes signal handlers */
os_setup_signal_handling();

...we just aren't actually after the terminal init any more :-(

-- PMM



Re: [Qemu-devel] virtio-blk.c handling of i/o which is not a 512 multiple

2011-03-30 Thread Christoph Hellwig
On Wed, Mar 30, 2011 at 08:48:18AM +, Conor Murphy wrote:
 Is there a requirement for virtio-blk guest drivers that all i/o requests are
 sized in multiples of 512 bytes?

Yes, like for any other block driver.  Of course this should not actually
crash qemu, but rather fail the request.

Does the patch below give you a correct error report?


Index: qemu/hw/virtio-blk.c
===
--- qemu.orig/hw/virtio-blk.c   2011-03-30 20:46:10.268665534 +0200
+++ qemu/hw/virtio-blk.c2011-03-30 20:49:45.655247322 +0200
@@ -290,6 +290,10 @@ static void virtio_blk_handle_write(Virt
 virtio_blk_rw_complete(req, -EIO);
 return;
 }
+if (req-qiov.size % req-dev-conf-logical_block_size) {
+virtio_blk_rw_complete(req, -EIO);
+return;
+}
 
 if (mrb-num_writes == 32) {
 virtio_submit_multiwrite(req-dev-bs, mrb);
@@ -317,6 +321,10 @@ static void virtio_blk_handle_read(VirtI
 virtio_blk_rw_complete(req, -EIO);
 return;
 }
+if (req-qiov.size % req-dev-conf-logical_block_size) {
+virtio_blk_rw_complete(req, -EIO);
+return;
+}
 
 acb = bdrv_aio_readv(req-dev-bs, sector, req-qiov,
  req-qiov.size / BDRV_SECTOR_SIZE,



Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Gleb Natapov
On Wed, Mar 30, 2011 at 07:53:41PM +0100, Peter Maydell wrote:
 On 30 March 2011 19:43, Gleb Natapov g...@redhat.com wrote:
  On Wed, Mar 30, 2011 at 07:39:31PM +0100, Peter Maydell wrote:
  Unfortunately this patch causes qemu to segfault when killed
  via ^C (at least on my Ubuntu maverick system). This is because
  it registers a signal handler with sigaction, but then later
  the SDL library is initialised and it reinstalls our handler
  with plain old signal:
 
          ohandler = signal(SIGINT, SDL_HandleSIG);
          if ( ohandler != SIG_DFL )
                  signal(SIGINT, ohandler);
 
  I fixed this in SDL upstream.
 
 Cool, but we can't rely on the SDL we're linked against being
 a bugfixed version.
 
  This is clearly buggy but on the other hand SDL is pretty widely
  deployed and it's the default QEMU video output method, so I think
  we need to work around it :-(
 
  The most straightforward fix is to get the signal number from
  argument one and not to bother printing the PID that killed us.
 
  For debugging purposes pid is useful. We cam register signal handler
  after SDL is initialized though (if waiting for SDL update is not an
  option).
 
 I'm not convinced about the utility of printing the pid, personally.
 Most programs get along fine without printing anything when
 they receive a terminal signal. However you're right that it should
Well qemu is a bit of special case. It is long running process that
takes huge amount of memory and, as suchm it becomes a target of various
monitoring script which, when configured incorrectly, start killing
perfectly valid guests. In addition killing of the guest looks exactly
like guest shutdown to management software because we call shutdow_request
in the signal handler.

 be straightforward enough to move the signal init. In fact it
 looks like this got broken somewhere along the line, the
 call to os_setup_signal_handling() is already preceded with a comment:
 
 /* must be after terminal init, SDL library changes signal handlers */
 os_setup_signal_handling();
 
 ...we just aren't actually after the terminal init any more :-(
 
Exactly. This should do the trick (not tested).


diff --git a/vl.c b/vl.c
index 192a240..93aaccf 100644
--- a/vl.c
+++ b/vl.c
@@ -3148,9 +3148,6 @@ int main(int argc, char **argv, char **envp)
 
 cpu_synchronize_all_post_init();
 
-/* must be after terminal init, SDL library changes signal handlers */
-os_setup_signal_handling();
-
 set_numa_modes();
 
 current_machine = machine;
@@ -3206,6 +3203,9 @@ int main(int argc, char **argv, char **envp)
 break;
 }
 
+/* must be after terminal init, SDL library changes signal handlers */
+os_setup_signal_handling();
+
 #ifdef CONFIG_VNC
 /* init remote displays */
 if (vnc_display) {

--
Gleb.



Re: [Qemu-devel] GSoC: Improved image format compatibility

2011-03-30 Thread Stefan Weil

Am 30.03.2011 20:40, schrieb Lyu Mitnick:

Hello Stefan,

I have spent a whole night to trace the code of block/vpc.c and 
compare against the VHD specifications
(at 
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=C2D03242-2FFB-48EF-A211-F0C44741109Eamp;displaylang=en 
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=C2D03242-2FFB-48EF-A211-F0C44741109Eamp;displaylang=en).
It seems that there isn't Fixed hard disk image and Differencing hard 
disk image support now. I am wondering whether what I has
learned from tracing code is true or false? I also found some 
possibility of optimization in block/vpc.c. Should I contact mentor
assigned to this project in private or post the whole idea on the 
mailing list??


thanks very much

Mitnick




Hello Mitnick,

asynchronous i/o is also missing in block/vpc.c.

All newer block drivers support asynchronous reads and writes,
only some old drivers don't.

So adding asynchronous i/o to at least some of the old drivers
would improve their usability.

Look for aio in block/*.c to get a starting point.

Regards
Stefan (W.)




[Qemu-devel] [PATCH] Don't limit node count by smp count

2011-03-30 Thread Sasha Levin
It is possible to create CPU-less NUMA nodes, node amount shouldn't be
limited by amount of CPUs.

Signed-off-by: Sasha Levin levinsasha...@gmail.com
---
 vl.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 8bcf2ae..8cc1aa8 100644
--- a/vl.c
+++ b/vl.c
@@ -3002,8 +3002,8 @@ int main(int argc, char **argv, char **envp)
 if (nb_numa_nodes  0) {
 int i;
 
-if (nb_numa_nodes  smp_cpus) {
-nb_numa_nodes = smp_cpus;
+if (nb_numa_nodes  MAX_NODES) {
+nb_numa_nodes = MAX_NODES;
 }
 
 /* If no memory size if given for any node, assume the default
case




Re: [Qemu-devel] [PATCH] Don't limit node count by smp count

2011-03-30 Thread Anthony Liguori

On 03/30/2011 02:14 PM, Sasha Levin wrote:

It is possible to create CPU-less NUMA nodes, node amount shouldn't be
limited by amount of CPUs.


But does this actually work in the code today and does it work with any 
guests?


Regards,

Anthony Liguori


Signed-off-by: Sasha Levinlevinsasha...@gmail.com
---
  vl.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 8bcf2ae..8cc1aa8 100644
--- a/vl.c
+++ b/vl.c
@@ -3002,8 +3002,8 @@ int main(int argc, char **argv, char **envp)
  if (nb_numa_nodes  0) {
  int i;

-if (nb_numa_nodes  smp_cpus) {
-nb_numa_nodes = smp_cpus;
+if (nb_numa_nodes  MAX_NODES) {
+nb_numa_nodes = MAX_NODES;
  }

  /* If no memory size if given for any node, assume the default
case







Re: [Qemu-devel] GSoC: Improved image format compatibility

2011-03-30 Thread Lyu Mitnick
Hello Stefan,

Let me summarize the ideas of Improved image format compatibility  now:

(1) add support of Fixed hard disk image into block/vpc.c
(2) add support of Differencing hard disk image into block/vpc.c
(3) add asynchronous i/o into block/vpc,c
(4) have some optimization of block/vpc.c

Would you mind to tell me the project containing four topics above is
suitable for GSoC of size and skill level??
And would anyone is interested in this project and being my mentor??

thanks very much

2011/3/31 Stefan Weil w...@mail.berlios.de

 Am 30.03.2011 20:40, schrieb Lyu Mitnick:

 Hello Stefan,

 I have spent a whole night to trace the code of block/vpc.c and compare
 against the VHD specifications
 (at
 http://www.microsoft.com/downloads/en/details.aspx?FamilyID=C2D03242-2FFB-48EF-A211-F0C44741109Eamp;displaylang=enhttp://www.microsoft.com/downloads/en/details.aspx?FamilyID=C2D03242-2FFB-48EF-A211-F0C44741109Edisplaylang=en
 http://www.microsoft.com/downloads/en/details.aspx?FamilyID=C2D03242-2FFB-48EF-A211-F0C44741109Eamp;displaylang=enhttp://www.microsoft.com/downloads/en/details.aspx?FamilyID=C2D03242-2FFB-48EF-A211-F0C44741109Edisplaylang=en
 ).

 It seems that there isn't Fixed hard disk image and Differencing hard disk
 image support now. I am wondering whether what I has
 learned from tracing code is true or false? I also found some possibility
 of optimization in block/vpc.c. Should I contact mentor
 assigned to this project in private or post the whole idea on the mailing
 list??

 thanks very much

 Mitnick



 Hello Mitnick,

 asynchronous i/o is also missing in block/vpc.c.

 All newer block drivers support asynchronous reads and writes,
 only some old drivers don't.

 So adding asynchronous i/o to at least some of the old drivers
 would improve their usability.

 Look for aio in block/*.c to get a starting point.

 Regards
 Stefan (W.)




[Qemu-devel] [PATCH V2 3/9] Add persistent state handling to TPM TIS frontend driver

2011-03-30 Thread Stefan Berger
This patch adds support for handling of persisten state to the TPM TIS
frontend.

The currently used buffer is determined (can only be in currently active
locality and either be a read or a write buffer) and only that buffer's content
is stored. The reverse is done when the state is restored from disk
where the buffer's content are copied into the currently used buffer.

I adapated the structure to those used by the Xen driver in order to
provide compatibility to existing state. For that I am adding Andreas
Niederl as an author to the file.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 hw/tpm_tis.c |  156 +++
 1 file changed, 156 insertions(+)

Index: qemu-git/hw/tpm_tis.c
===
--- qemu-git.orig/hw/tpm_tis.c
+++ qemu-git/hw/tpm_tis.c
@@ -6,6 +6,8 @@
  * Author: Stefan Berger stef...@us.ibm.com
  * David Safford saff...@us.ibm.com
  *
+ * Xen 4 support: Andrease Niederl andreas.niederl@iaik,tugraz.at
+ *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License as
  * published by the Free Software Foundation, version 2 of the
@@ -824,3 +826,157 @@ static int tpm_tis_init(ISADevice *dev)
 exit(1);
 }
 
+/* persistent state handling */
+
+static void tpm_tis_pre_save(void *opaque)
+{
+TPMState *s = opaque;
+uint8_t locty = s-active_locty;
+
+qemu_mutex_lock(s-state_lock);
+
+/* wait for outstanding requests to complete */
+if (IS_VALID_LOCTY(locty)  s-loc[locty].state == STATE_EXECUTION) {
+qemu_cond_wait(s-from_tpm_cond, s-state_lock);
+}
+
+#ifdef DEBUG_TIS_SR
+fprintf(stderr,tpm_tis: suspend: locty 0 : r_offset = %d, w_offset = 
%d\n,
+s-loc[0].r_offset,
+s-loc[0].w_offset);
+if (s-loc[0].r_offset) {
+tis_dump_state(opaque, 0);
+}
+#endif
+
+qemu_mutex_unlock(s-state_lock);
+
+/* copy current active read or write buffer into the buffer
+   written to disk */
+if (IS_VALID_LOCTY(locty)) {
+switch (s-loc[locty].state) {
+case STATE_RECEPTION:
+memcpy(s-buf,
+   s-loc[locty].w_buffer.buffer,
+   MIN(sizeof(s-buf),
+   s-loc[locty].w_buffer.size));
+s-offset = s-loc[locty].w_offset;
+break;
+case STATE_COMPLETION:
+memcpy(s-buf,
+   s-loc[locty].r_buffer.buffer,
+   MIN(sizeof(s-buf),
+   s-loc[locty].r_buffer.size));
+s-offset = s-loc[locty].r_offset;
+break;
+default:
+/* leak nothing */
+memset(s-buf, 0x0, sizeof(s-buf));
+break;
+}
+}
+
+tis_get_active_backend()-save_volatile_data();
+}
+
+
+static int tpm_tis_post_load(void *opaque,
+ int version_id __attribute__((unused)))
+{
+TPMState *s = opaque;
+
+uint8_t locty = s-active_locty;
+
+if (IS_VALID_LOCTY(locty)) {
+switch (s-loc[locty].state) {
+case STATE_RECEPTION:
+memcpy(s-loc[locty].w_buffer.buffer,
+   s-buf,
+   MIN(sizeof(s-buf),
+   s-loc[locty].w_buffer.size));
+s-loc[locty].w_offset = s-offset;
+break;
+case STATE_COMPLETION:
+memcpy(s-loc[locty].r_buffer.buffer,
+   s-buf,
+   MIN(sizeof(s-buf),
+   s-loc[locty].r_buffer.size));
+s-loc[locty].r_offset = s-offset;
+break;
+default:
+break;
+}
+}
+
+#ifdef DEBUG_TIS_SR
+fprintf(stderr,tpm_tis: resume : locty 0 : r_offset = %d, w_offset = 
%d\n,
+s-loc[0].r_offset,
+s-loc[0].w_offset);
+#endif
+
+return tis_get_active_backend()-load_volatile_data(s);
+}
+
+
+static const VMStateDescription vmstate_locty = {
+.name = loc,
+.version_id = 1,
+.minimum_version_id = 0,
+.minimum_version_id_old = 0,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT32(state   , TPMLocality),
+VMSTATE_UINT32(inte, TPMLocality),
+VMSTATE_UINT32(ints, TPMLocality),
+VMSTATE_UINT8 (access  , TPMLocality),
+VMSTATE_UINT8 (sts , TPMLocality),
+VMSTATE_END_OF_LIST(),
+}
+};
+
+
+static const VMStateDescription vmstate_tpm_tis = {
+.name = tpm,
+.version_id = 1,
+.minimum_version_id = 0,
+.minimum_version_id_old = 0,
+.pre_save  = tpm_tis_pre_save,
+.post_load = tpm_tis_post_load,
+.fields  = (VMStateField []) {
+VMSTATE_UINT32(irq_num   , TPMState),
+VMSTATE_UINT32(offset, TPMState),
+VMSTATE_BUFFER(buf   , TPMState),
+VMSTATE_UINT8 (  active_locty, TPMState),
+VMSTATE_UINT8 (aborting_locty, TPMState),
+VMSTATE_UINT8 

[Qemu-devel] [PATCH V2 5/9] Add a debug register

2011-03-30 Thread Stefan Berger
This patch uses the possibility to add a vendor-specific register and
adds a debug register useful for dumping the internal state. This register
is only active in a debug build (#define DEBUG_TIS).

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 hw/tpm_tis.c |   67 +++
 1 file changed, 67 insertions(+)

Index: qemu-git/hw/tpm_tis.c
===
--- qemu-git.orig/hw/tpm_tis.c
+++ qemu-git/hw/tpm_tis.c
@@ -42,6 +42,8 @@
 #define TIS_REG_DID_VID   0xf00
 #define TIS_REG_RID   0xf04
 
+/* vendor-specific registers */
+#define TIS_REG_DEBUG 0xf90
 
 #define STS_VALID(1  7)
 #define STS_COMMAND_READY(1  6)
@@ -356,6 +358,66 @@ static uint32_t tpm_data_read(TPMState *
 }
 
 
+#ifdef DEBUG_TIS
+static void tis_dump_state(void *opaque, target_phys_addr_t addr)
+{
+static const unsigned regs[] = {
+TIS_REG_ACCESS,
+TIS_REG_INT_ENABLE,
+TIS_REG_INT_VECTOR,
+TIS_REG_INT_STATUS,
+TIS_REG_INTF_CAPABILITY,
+TIS_REG_STS,
+TIS_REG_DID_VID,
+TIS_REG_RID,
+0xfff};
+int idx;
+uint8_t locty = locality_from_addr(addr);
+target_phys_addr_t base = addr  ~0xfff;
+TPMState *s = opaque;
+
+fprintf(stdout,
+tpm_tis: active locality  : %d\n
+tpm_tis: state of locality %d : %d\n
+tpm_tis: register dump:\n,
+s-active_locty,
+locty, s-loc[locty].state);
+
+for (idx = 0; regs[idx] != 0xfff; idx++) {
+fprintf(stdout, tpm_tis: 0x%04x : 0x%08x\n, regs[idx],
+tis_mem_readl(opaque, base + regs[idx]));
+}
+
+fprintf(stdout,
+tpm_tis: read offset   : %d\n
+tpm_tis: result buffer : ,
+s-loc[locty].r_offset);
+for (idx = 0;
+ idx  tpm_get_size_from_buffer(s-loc[locty].r_buffer);
+ idx++) {
+fprintf(stdout, %c%02x%s,
+s-loc[locty].r_offset == idx ? '' : ' ',
+s-loc[locty].r_buffer.buffer[idx],
+((idx  0xf) == 0xf) ? \ntpm_tis:  : );
+}
+fprintf(stdout,
+\n
+tpm_tis: write offset  : %d\n
+tpm_tis: request buffer: ,
+s-loc[locty].w_offset);
+for (idx = 0;
+ idx  tpm_get_size_from_buffer(s-loc[locty].w_buffer);
+ idx++) {
+fprintf(stdout, %c%02x%s,
+s-loc[locty].w_offset == idx ? '' : ' ',
+s-loc[locty].w_buffer.buffer[idx],
+((idx  0xf) == 0xf) ? \ntpm_tis:  : );
+}
+fprintf(stdout,\n);
+}
+#endif
+
+
 /*
  * Read a register of the TIS interface
  * See specs pages 33-63 for description of the registers
@@ -431,6 +493,11 @@ static uint32_t tis_mem_readl(void *opaq
 case TIS_REG_RID:
 val = TPM_RID;
 break;
+#ifdef DEBUG_TIS
+case TIS_REG_DEBUG:
+tis_dump_state(opaque, addr);
+break;
+#endif
 }
 
 qemu_mutex_unlock(s-state_lock);




[Qemu-devel] [PATCH V2 1/9] Support for TPM command line options

2011-03-30 Thread Stefan Berger
This patch adds support for TPM command line options.
The command line supported here (considering the libtpms based
backend) are

./qemu-... -tpm type=type,path=path to blockstorage file,

and

./qemu-... -tpm ?

where the latter works similar to -soundhw ? and shows a list of
available TPM backends (i.e., libtpms-based, Xen).

Only the 'type' is interpreted in arch_init.c. Using this parameter,
the backend is chosen, i.e., 'builtin' for the libtpms-based
builtin TPM. The interpretation of the other parameters along with
determining whether enough parameters were provided is pushed into
the backend driver, which needs to implement the interface function
'handle_options' and return true if the VM can be started or 'false'
if not enough or bad parameters were provided.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 arch_init.c |   77 
 arch_init.h |2 +
 hw/pc.h |7 +
 qemu-config.c   |   20 ++
 qemu-options.hx |   11 
 vl.c|   11 
 6 files changed, 128 insertions(+)

Index: qemu-git/hw/pc.h
===
--- qemu-git.orig/hw/pc.h
+++ qemu-git/hw/pc.h
@@ -6,6 +6,7 @@
 #include isa.h
 #include fdc.h
 #include net.h
+#include tpm_tis.h
 
 /* PC-style peripherals (also used by other machines).  */
 
@@ -128,6 +129,12 @@ void pc_register_ferr_irq(qemu_irq irq);
 void pc_cmos_set_s3_resume(void *opaque, int irq, int level);
 void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
+/* tpm_tis.c */
+extern bool has_tpm;
+const BackendTPMDriver *tis_set_backend_driver(const char *tpm_type);
+void tis_display_backend_drivers(FILE *);
+
+
 void pc_cpus_init(const char *cpu_model);
 void pc_memory_init(ram_addr_t ram_size,
 const char *kernel_filename,
Index: qemu-git/qemu-options.hx
===
--- qemu-git.orig/qemu-options.hx
+++ qemu-git/qemu-options.hx
@@ -1041,6 +1041,17 @@ Specify SMBIOS type 0 fields
 Specify SMBIOS type 1 fields
 ETEXI
 
+#ifndef _WIN32
+# ifdef CONFIG_TPM
+DEF(tpm, HAS_ARG, QEMU_OPTION_tpm, \
+
+-tpm type=type,path=path\n \
+enable a TPM with state from file in given path\n
+use -tpm ? to get a list of supported TPM types\n,
+QEMU_ARCH_I386)
+# endif
+#endif
+
 DEFHEADING()
 STEXI
 @end table
Index: qemu-git/vl.c
===
--- qemu-git.orig/vl.c
+++ qemu-git/vl.c
@@ -244,6 +244,8 @@ int nb_numa_nodes;
 uint64_t node_mem[MAX_NODES];
 uint64_t node_cpumask[MAX_NODES];
 
+bool has_tpm = false;
+
 static QEMUTimer *nographic_timer;
 
 uint8_t qemu_uuid[16];
@@ -2420,6 +2422,15 @@ int main(int argc, char **argv, char **e
 ram_size = value;
 break;
 }
+#ifdef CONFIG_TPM
+case QEMU_OPTION_tpm:
+if (!(tpm_available())) {
+printf(Option %s not supported for this target\n, 
popt-name);
+exit(1);
+}
+select_tpm(optarg);
+break;
+#endif
 case QEMU_OPTION_mempath:
 mem_path = optarg;
 break;
Index: qemu-git/qemu-config.c
===
--- qemu-git.orig/qemu-config.c
+++ qemu-git/qemu-config.c
@@ -451,6 +451,25 @@ QemuOptsList qemu_option_rom_opts = {
 },
 };
 
+static QemuOptsList qemu_tpm_opts = {
+.name = tpm,
+.head = QTAILQ_HEAD_INITIALIZER(qemu_tpm_opts.head),
+.desc = {
+{
+.name = type,
+.type = QEMU_OPT_STRING,
+.help = Type of TPM backend,
+},
+{
+.name = path,
+.type = QEMU_OPT_STRING,
+.help = Persitent storage for TPM state,
+},
+{ /* end of list */ }
+},
+};
+
+
 static QemuOptsList *vm_config_groups[32] = {
 qemu_drive_opts,
 qemu_chardev_opts,
@@ -465,6 +484,7 @@ static QemuOptsList *vm_config_groups[32
 qemu_trace_opts,
 #endif
 qemu_option_rom_opts,
+qemu_tpm_opts,
 NULL,
 };
 
Index: qemu-git/arch_init.c
===
--- qemu-git.orig/arch_init.c
+++ qemu-git/arch_init.c
@@ -41,6 +41,8 @@
 #include net.h
 #include gdbstub.h
 #include hw/smbios.h
+#include blockdev.h
+#include hw/tpm_tis.h
 
 #ifdef TARGET_SPARC
 int graphic_width = 1024;
@@ -726,3 +728,78 @@ int xen_available(void)
 return 0;
 #endif
 }
+
+int tpm_available(void) {
+#ifdef CONFIG_TPM
+return 1;
+#else
+return 0;
+#endif
+}
+
+#ifdef CONFIG_TPM
+
+#if defined (TARGET_I386) || defined (TARGET_X86_64)
+
+
+static int configure_tpm(QemuOpts *opts)
+{
+const char *value;
+const BackendTPMDriver *be;
+
+if (has_tpm) {
+fprintf(stderr,Only one TPM is 

[Qemu-devel] [PATCH V2 0/9] Qemu Trusted Platform Module (TPM) integration

2011-03-30 Thread Stefan Berger
The following series of patches adds a TPM (Trusted Platform Module) 
TIS (TPM Interface Spec) interface to Qemu and with that provides
means to access a backend implementing the actual TPM functionality.
This frontend enables for example Linux's TPM TIS (tpm_tis) driver.

I am also posting the implementation of a backend implementation that is based
on a library (libtpms) providing TPM functionality. This library is currently
undergoing further testing and is not commonly available, yet.
The main purpose of me posting the libtpms-based backend patches now is to
show an example of how to integrate a backend with this TIS frontend. The
frontend is independent of the code in the backend and could be checked-in
separately, though will be of limited use as long as no backend is provided.
The backend driver for Xen, however, should be adapted to work with this
frontend's extensive interface.

My testing is all based on the libtpms-based backend that provides support for
VM suspend/resume, migration and snapshotting. It uses QCoW2 as the file
format for storing its persistent state onto, which is necessary for support
of snapshotting. Using Linux as the OS along with some recently posted patches
for the Linux TPM TIS driver, suspend/resume works fine (using 'virsh
save/restore') along with hibernation and OS suspend (ACPI S3).

Proper support for the TPM requires support in the BIOS since the BIOS
needs to initialize the TPM upon machine start or issue commands to the TPM
when it resumes from suspend (ACPI S3). It also builds and connects the
necessary ACPI tables (SSDT for TPM device, TCPA table for logging) to the
ones that are built by a BIOS. To support this I have fairly extensive 
set of extensions for SeaBIOS that I posted to the SeaBIOS mailing list.

V2:
 - splitting some of the patches into smaller ones for easier review
 - fixes in individual patches

Regards,
Stefan




[Qemu-devel] [PATCH V2 8/9] Implementation of the libtpms-based backend

2011-03-30 Thread Stefan Berger
This patch provides the glue for the TPM TIS interface (frontend) to
the libtpms that provides the actual TPM functionality.

Some details:

This part of the patch provides support for the spawning of a thread
that will interact with the libtpms-based TPM. It expects a signal
from the frontend to wake and pick up the TPM command that is supposed
to be processed and delivers the response packet using a callback
function provided by the frontend.

The backend connectects itself to the frontend by filling out an interface
structure with pointers to the function implementing support for various
operations.

In this part a structure with callback functions with is registered with
libtpms. Those callback functions mostly deal with persistent storage.

The libtpms-based backend implements functionality to write into a 
Qemu block storage device rather than to plain files. With that we
can support VM snapshotting and we also get the possibility to use
encrypted QCoW2 for free. Thanks to Anthony for pointing this out.
The storage part of the driver has been split off into its own patch.

v2:
  - the directory's checksum is now calculated using zlib's crc32
  - fixes to adhere to the qemu coding style


Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 configure|3 
 hw/tpm_builtin.c |  424 ---
 hw/tpm_tis.h |   17 ++
 3 files changed, 423 insertions(+), 21 deletions(-)

Index: qemu-git/hw/tpm_tis.h
===
--- qemu-git.orig/hw/tpm_tis.h
+++ qemu-git/hw/tpm_tis.h
@@ -137,4 +137,21 @@ static inline void dumpBuffer(FILE *stre
 fprintf(stream, \n);
 }
 
+static inline void clear_sized_buffer(TPMSizedBuffer *tpmsb)
+{
+if (tpmsb-buffer) {
+   tpmsb-size = 0;
+   qemu_free(tpmsb-buffer);
+   tpmsb-buffer = NULL;
+}
+}
+
+static inline void set_sized_buffer(TPMSizedBuffer *tpmsb,
+uint8_t *buffer, uint32_t size)
+{
+clear_sized_buffer(tpmsb);
+tpmsb-size = size;
+tpmsb-buffer = buffer;
+}
+
 #endif /* _HW_TPM_TIS_H */
Index: qemu-git/hw/tpm_builtin.c
===
--- qemu-git.orig/hw/tpm_builtin.c
+++ qemu-git/hw/tpm_builtin.c
@@ -1,4 +1,5 @@
 /*
+ *  built-in TPM support using libtpms
  *
  *  Copyright (c) 2010, 2011 IBM Corporation
  *  Copyright (c) 2010, 2011 Stefan Berger
@@ -17,15 +18,32 @@
  * License along with this library; if not, see http://www.gnu.org/licenses/
  */
 
+#include blockdev.h
+#include block_int.h
 #include qemu-common.h
 #include hw/hw.h
 #include hw/tpm_tis.h
 #include hw/pc.h
 
+#include libtpms/tpm_library.h
+#include libtpms/tpm_error.h
+#include libtpms/tpm_memory.h
+#include libtpms/tpm_nvfilename.h
+#include libtpms/tpm_tis.h
+
+#include zlib.h
+
+
+/* following define will be removed once SeaBIOS has TPM support */
+
 //#define DEBUG_TPM
 //#define DEBUG_TPM_SR /* suspend - resume */
 
 
+#define SAVESTATE_TYPE 'S'
+#define PERMSTATE_TYPE 'P'
+#define VOLASTATE_TYPE 'V'
+
 /* data structures */
 
 typedef struct ThreadParams {
@@ -40,11 +58,17 @@ typedef struct ThreadParams {
 static QemuThread thread;
 
 static QemuMutex state_mutex; /* protects *_state below */
+static QemuCond bs_write_result_cond;
+static TPMSizedBuffer permanent_state = { .size = 0, .buffer = NULL, };
+static TPMSizedBuffer volatile_state  = { .size = 0, .buffer = NULL, };
+static TPMSizedBuffer save_state  = { .size = 0, .buffer = NULL, };
+static int pipefd[2] =  {-1, -1};
 
 static bool thread_terminate = false;
 static bool tpm_initialized = false;
 static bool had_fatal_error = false;
 static bool had_startup_error = false;
+static bool need_read_volatile = false;
 
 static ThreadParams tpm_thread_params;
 
@@ -58,12 +82,54 @@ static const unsigned char tpm_std_fatal
 static char dev_description[80];
 
 
+static int tpmlib_get_prop(enum TPMLIB_TPMProperty prop)
+{
+int result;
+
+TPM_RESULT res = TPMLIB_GetTPMProperty(prop, result);
+
+assert(res == TPM_SUCCESS);
+
+return result;
+}
+
+
+#if defined DEBUG_TPM || defined DEBUG_TPM_SR
+static unsigned int memsum(const unsigned char *buf, int len)
+{
+ int res = 0, i;
+
+ for (i = 0; i  len; i++) {
+  res += buf[i];
+ }
+
+ return res;
+}
+#endif
+
+
 /**
  * Start the TPM. If it had been started before, then terminate and start
  * it again.
  */
 static int startup_tpm(void)
 {
+TPM_RESULT res;
+
+if (tpm_initialized) {
+TPMLIB_Terminate();
+tpm_initialized = false;
+}
+
+res = TPMLIB_MainInit();
+if (res != TPM_SUCCESS) {
+#if defined DEBUG_TPM || defined DEBUG_TPM_SR
+fprintf(stderr,tpm: Error: Call to TPMLIB_MainInit() failed 
(rc=%d)\n,
+   res);
+#endif
+return 1;
+}
+
 tpm_initialized = true;
 
 #if defined DEBUG_TPM || defined DEBUG_TPM_SR
@@ -84,6 +150,13 @@ static int 

[Qemu-devel] [PATCH V2 4/9] Add tpm_tis driver to build process

2011-03-30 Thread Stefan Berger
The TPM interface (tpm_tis) needs to be explicitly enabled via 
./configure --enable-tpm. This restricts the building of the
TPM support to i386 and x86_64 targets since both backends I know
of, the Xen backend and the libtpms-based backend, will likely only
be available for these targets, at least initially. The list can be
easily extend. This measure prevents that one will end up with support
for a frontend but no available backend.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

Index:qemu/Makefile.target
===
---
 Makefile.target |7 +++
 configure   |   20 
 2 files changed, 27 insertions(+)

Index: qemu-git/Makefile.target
===
--- qemu-git.orig/Makefile.target
+++ qemu-git/Makefile.target
@@ -303,6 +303,13 @@ obj-sparc-y += cs4231.o eccmemctl.o sbi.
 
 # GRLIB
 obj-sparc-y += grlib_gptimer.o grlib_irqmp.o grlib_apbuart.o
+
+ifeq ($(TARGET_ARCH),$(filter $(TARGET_ARCH),i386 x86_64))
+
+obj-i386-$(CONFIG_TPM) += tpm_tis.o
+
+endif
+
 endif
 
 obj-arm-y = integratorcp.o versatilepb.o arm_pic.o arm_timer.o
Index: qemu-git/configure
===
--- qemu-git.orig/configure
+++ qemu-git/configure
@@ -175,6 +175,7 @@ trace_backend=nop
 trace_file=trace
 spice=
 rbd=
+tpm=no
 
 # parse CC options first
 for opt do
@@ -708,6 +709,8 @@ for opt do
   ;;
   --kerneldir=*) kerneldir=$optarg
   ;;
+  --enable-tpm) tpm=yes
+  ;;
   --with-pkgversion=*) pkgversion= ($optarg)
   ;;
   --disable-docs) docs=no
@@ -921,6 +924,7 @@ echoDefault
 echo   --disable-spice  disable spice
 echo   --enable-spice   enable spice
 echo   --enable-rbd enable building the rados block device (rbd)
+echo   --enable-tpm enables an emulated TPM
 echo 
 echo NOTE: The object files are built at the place where configure is 
launched
 exit 1
@@ -2540,6 +2544,7 @@ echo Trace output file $trace_file-pid
 echo spice support $spice
 echo rbd support   $rbd
 echo xfsctl support$xfs
+echo TPM support   $tpm
 
 if test $sdl_too_old = yes; then
 echo - Your SDL version is too old - please upgrade to have SDL support
@@ -3324,6 +3329,21 @@ if test $gprof = yes ; then
   fi
 fi
 
+if test $linux = yes  test $tpm = yes; then
+  has_tpm=0
+  if test $target_softmmu = yes ; then
+case $TARGET_BASE_ARCH in
+i386)
+  has_tpm=1
+;;
+esac
+  fi
+
+  if test $has_tpm = 1; then
+  echo CONFIG_TPM=y  $config_host_mak
+  fi
+fi
+
 linker_script=-Wl,-T../config-host.ld -Wl,-T,\$(SRC_PATH)/\$(ARCH).ld
 if test $target_linux_user = yes -o $target_bsd_user = yes ; then
   case $ARCH in




[Qemu-devel] [PATCH V2 2/9] Add TPM (frontend) hardware interface (TPM TIS) to Qemu

2011-03-30 Thread Stefan Berger
This patch adds the main code of the TPM frontend driver, the TPM TIS
interface, to Qemu. The code is largely based on my previous implementation
for Xen but has been significantly extended to meet the standard's
requirements, such as the support for changing of localities and all the
functionality of the available flags.

Communication with the backend (i.e., for Xen or the libtpms-based one)
is cleanly separated through an interface which the backend driver needs
to implement.

The TPM TIS driver's backend was previously chosen in the code added
to arch_init. The frontend holds a pointer to the chosen backend (interface).

Communication with the backend is largely based on signals and conditions.
Whenever the frontend has collected a complete packet, it will signal
the backend, which then starts processing the command. Once the result
has been returned, the backend invokes a callback function
(tis_tpm_receive_cb()).

The one tricky part is support for VM suspend while the TPM is processing
a command. In this case the frontend driver is waiting for the backend
to return the result of the last command before shutting down. It waits
on a condition for a signal from the backend, which is delivered in 
tis_tpm_receive_cb().

Testing the proper functioning of the different flags and localities 
cannot be done from user space when running in Linux for example, since
access to the address space of the TPM TIS interface is not possible. Also
the Linux driver itself does not exercise all functionality. So, for
testing there is a fairly extensive test suite as part of the SeaBIOS patches
since from within the BIOS one can have full access to all the TPM's registers.


Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 hw/pc.c  |3 
 hw/tpm_tis.c |  826 +++
 hw/tpm_tis.h |  140 ++
 3 files changed, 969 insertions(+)

Index: qemu-git/hw/tpm_tis.c
===
--- /dev/null
+++ qemu-git/hw/tpm_tis.c
@@ -0,0 +1,826 @@
+/*
+ * tpm_tis.c - QEMU emulator for a 1.2 TPM with TIS interface
+ *
+ * Copyright (C) 2006,2010 IBM Corporation
+ *
+ * Author: Stefan Berger stef...@us.ibm.com
+ * David Safford saff...@us.ibm.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ *
+ *
+ * Implementation of the TIS interface according to specs at
+ * 
https://www.trustedcomputinggroup.org/groups/pc_client/TCG_PCClientTPMSpecification_1-20_1-00_FINAL.pdf
+ *
+ */
+
+#include block.h
+#include hw/hw.h
+#include hw/pc.h
+#include hw/tpm_tis.h
+
+#include stdio.h
+
+//#define DEBUG_TIS
+
+/* whether the STS interrupt is supported */
+//#define RAISE_STS_IRQ
+
+/* tis registers */
+#define TIS_REG_ACCESS0x00
+#define TIS_REG_INT_ENABLE0x08
+#define TIS_REG_INT_VECTOR0x0c
+#define TIS_REG_INT_STATUS0x10
+#define TIS_REG_INTF_CAPABILITY   0x14
+#define TIS_REG_STS   0x18
+#define TIS_REG_DATA_FIFO 0x24
+#define TIS_REG_DID_VID   0xf00
+#define TIS_REG_RID   0xf04
+
+
+#define STS_VALID(1  7)
+#define STS_COMMAND_READY(1  6)
+#define STS_TPM_GO   (1  5)
+#define STS_DATA_AVAILABLE   (1  4)
+#define STS_EXPECT   (1  3)
+#define STS_RESPONSE_RETRY   (1  1)
+
+#define ACCESS_TPM_REG_VALID_STS (1  7)
+#define ACCESS_ACTIVE_LOCALITY   (1  5)
+#define ACCESS_BEEN_SEIZED   (1  4)
+#define ACCESS_SEIZE (1  3)
+#define ACCESS_PENDING_REQUEST   (1  2)
+#define ACCESS_REQUEST_USE   (1  1)
+#define ACCESS_TPM_ESTABLISHMENT (1  0)
+
+#define INT_ENABLED  (1  31)
+#define INT_DATA_AVAILABLE   (1  0)
+#define INT_STS_VALID(1  1)
+#define INT_LOCALITY_CHANGED (1  2)
+#define INT_COMMAND_READY(1  7)
+
+#ifndef RAISE_STS_IRQ
+
+# define INTERRUPTS_SUPPORTED (INT_LOCALITY_CHANGED | \
+   INT_DATA_AVAILABLE   | \
+   INT_COMMAND_READY)
+
+#else
+
+# define INTERRUPTS_SUPPORTED (INT_LOCALITY_CHANGED | \
+   INT_DATA_AVAILABLE   | \
+   INT_STS_VALID | \
+   INT_COMMAND_READY)
+
+#endif
+
+#define CAPABILITIES_SUPPORTED   ((1  4) |\
+  INTERRUPTS_SUPPORTED)
+
+#define TPM_DID  0x0001
+#define TPM_VID  0x0001
+#define TPM_RID  0x0001
+
+#define TPM_NO_DATA_BYTE 0xff
+
+/* prototypes */
+static uint32_t tis_mem_readl(void *opaque, target_phys_addr_t addr);
+
+
+static const BackendTPMDriver *bes[] = 

[Qemu-devel] [PATCH V2 7/9] Add a TPM backend skeleton implementation

2011-03-30 Thread Stefan Berger
This patch provides a TPM backend skelteon implementation. It doesn't do
anything but it compiles.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 Makefile.target  |5 
 hw/tpm_builtin.c |  372 +++
 hw/tpm_tis.c |3 
 3 files changed, 380 insertions(+)

Index: qemu-git/hw/tpm_builtin.c
===
--- /dev/null
+++ qemu-git/hw/tpm_builtin.c
@@ -0,0 +1,372 @@
+/*
+ *
+ *  Copyright (c) 2010, 2011 IBM Corporation
+ *  Copyright (c) 2010, 2011 Stefan Berger
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see http://www.gnu.org/licenses/
+ */
+
+#include qemu-common.h
+#include hw/hw.h
+#include hw/tpm_tis.h
+#include hw/pc.h
+
+//#define DEBUG_TPM
+//#define DEBUG_TPM_SR /* suspend - resume */
+
+
+/* data structures */
+
+typedef struct ThreadParams {
+TPMState *tpm_state;
+
+TPMRecvDataCB *recv_data_callback;
+} ThreadParams;
+
+
+/* local variables */
+
+static QemuThread thread;
+
+static QemuMutex state_mutex; /* protects *_state below */
+
+static bool thread_terminate = false;
+static bool tpm_initialized = false;
+static bool had_fatal_error = false;
+static bool had_startup_error = false;
+
+static ThreadParams tpm_thread_params;
+
+/* locality of the command being executed by libtpms */
+static uint8_t g_locty;
+
+static const unsigned char tpm_std_fatal_error_response[10] = {
+0x00, 0xc4, 0x00, 0x00, 0x00, 0x0A, 0x00, 0x00, 0x00, 0x09 /* TPM_FAIL */
+};
+
+static char dev_description[80];
+
+
+/**
+ * Start the TPM. If it had been started before, then terminate and start
+ * it again.
+ */
+static int startup_tpm(void)
+{
+tpm_initialized = true;
+
+#if defined DEBUG_TPM || defined DEBUG_TPM_SR
+fprintf(stderr,tpm: *** tpm startup was successful! ***\n);
+#endif
+
+return 0;
+}
+
+
+/*
+ * Start up the TPM before it sees the first command.
+ * We need to do this late since only now we will have the
+ * block storage encryption key and can read the previous
+ * TPM state. During 'reset' the key would not be available.
+ */
+static int late_startup_tpm(void)
+{
+int rc;
+
+rc  = startup_tpm();
+if (rc) {
+had_fatal_error = 1;
+return 1;
+}
+
+return 0;
+}
+
+
+static void terminate_tpm_thread(void)
+{
+if (!thread_terminate) {
+thread_terminate = true;
+
+qemu_mutex_lock  (tpm_thread_params.tpm_state-state_lock);
+qemu_cond_signal (tpm_thread_params.tpm_state-to_tpm_cond);
+qemu_mutex_unlock(tpm_thread_params.tpm_state-state_lock);
+
+qemu_thread_join(thread, NULL);
+memset(thread, 0, sizeof(thread));
+
+if (tpm_initialized) {
+tpm_initialized = false;
+}
+}
+}
+
+
+static void tpm_atexit(void)
+{
+terminate_tpm_thread();
+}
+
+
+static void *mainLoop(void *d)
+{
+int res = 0;
+ThreadParams *tParams = (ThreadParams *)d;
+uint32_t in_len, out_len;
+uint8_t *in, *out;
+uint32_t resp_size; /* total length of response */
+
+/* start command processing */
+while (!thread_terminate) {
+/* receive and handle commands */
+in_len = 0;
+do {
+#ifdef DEBUG_TPM
+fprintf(stderr,waiting for commands...\n);
+#endif
+
+if (thread_terminate) {
+break;
+}
+
+qemu_mutex_lock(tParams-tpm_state-state_lock);
+
+/* in case we were to slow and missed the signal, the
+   to_tpm_execute boolean tells us about a pending command */
+if (!tParams-tpm_state-to_tpm_execute) {
+qemu_cond_wait(tParams-tpm_state-to_tpm_cond,
+   tParams-tpm_state-state_lock);
+}
+
+tParams-tpm_state-to_tpm_execute = false;
+
+qemu_mutex_unlock(tParams-tpm_state-state_lock);
+
+if (thread_terminate) {
+break;
+}
+
+g_locty = tParams-tpm_state-command_locty;
+
+in = tParams-tpm_state-loc[g_locty].w_buffer.buffer;
+in_len = tParams-tpm_state-loc[g_locty].w_offset;
+
+if (!had_fatal_error) {
+
+out_len = tParams-tpm_state-loc[g_locty].r_buffer.size;
+
+#ifdef DEBUG_TPM
+fprintf(stderr,
+tpm: received %d bytes from VM in 

[Qemu-devel] [PATCH V2 9/9] Add block storage support for libtpms based TPM backend

2011-03-30 Thread Stefan Berger
This patch supports the storage of TPM persisten state.

The TPM creates state of varying size, depending for example how many
keys are loaded into it a a certain time. The worst-case sizes of
the different blobs the TPM can write have been pre-calculated and this
value is used to determine the minimum size of the Qcow2 image. It needs to
be 63kb. 'qemu-... -tpm ?' shows this number when this backend driver is
available.


The layout of the TPM's persistent data in the block storage is as follows:

The first sector (512 bytes) holds a primitive directory for the different
types of blobs that the TPM can write. This directory holds a revision
number, a checksum over its content, the number of entries, and the entries
themselves. The entries are described through their absolute offsets, their
maximum sizes, the number of currently valid bytes (the blobs infalte
and deflate) and what type of blob it is (see below for the types).

typedef struct BSDir {
uint16_t  rev;
uint32_t  checksum; 
uint32_t  num_entries;
BSEntry   entries[BS_DIR_MAX_NUM_ENTRIES];
} __attribute__((packed)) BSDir;


Their worst case sizes have been calculated and according to these sizes
the blobs are written at certain offsets into the blockstorage. Their offsets
are all aligned to sectors (512 byte boundaries).

The TPM provides three different blobs that are written into the storage:

- volatile state
- permanent state
- save state

The 'save state' is written when the VM suspends (ACPI S3) and read when it
resumes. This is done in concert with the BIOS where the BIOS needs to send
a command to the TPM upon resume (TPM_Startup(ST_STATE)), while the OS
issues the command TPM_SaveState().

The 'perment state' is written when the TPM receives a command that alters
its permenent state, i.e., when the a key is loaded into the TPM that
is expected to be there upon reboot of the machine / VM.

Volatile state is written when the frontend triggers it to do so, i.e.,
when the VM's state is written out during taking of a snapshot, migration
or suspension to disk (as in 'virsh save'). This state serves to resume
at the point where the TPM previously stopped but there is no need for it
after a machine reboot for example.

Tricky parts here are related to encrypted storage where certain operations
need to be deferred since the key for the storage only becomes available
much later than the time that the backend is instantiated.

The backend also tries to check for the validity of the block storage for
example. If the Qcow2 is not encrypted and the checksum is found to be
bad, the block storage directory will be initialized. 
In case the Qcow2 is encrypted, initialization will only be done if
the directory is found to be all 0s. In case the directory cannot be
checksummed correctly, but is not all 0s, it is assumed that the user
provided a wrong key. In this case I am not exiting qemu, but black-out
the TPM interface (returns 0xff in all memory location) due to a presumed
fatal error and let the VM run (without TPM functionality).

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 hw/tpm_builtin.c |  685 ++-
 1 file changed, 684 insertions(+), 1 deletion(-)

Index: qemu-git/hw/tpm_builtin.c
===
--- qemu-git.orig/hw/tpm_builtin.c
+++ qemu-git/hw/tpm_builtin.c
@@ -44,6 +44,33 @@
 #define PERMSTATE_TYPE 'P'
 #define VOLASTATE_TYPE 'V'
 
+#define ALIGN(VAL, SIZE) \
+  ( ( (VAL) + (SIZE) - 1 )  ~( (SIZE) - 1 ) )
+
+
+#define DIRECTORY_SIZEBDRV_SECTOR_SIZE
+
+#define PERMSTATE_DISK_OFFSET ALIGN(DIRECTORY_SIZE, BDRV_SECTOR_SIZE)
+#define PERMSTATE_DISK_SPACE \
+  ALIGN(tpmlib_get_prop(TPMPROP_TPM_MAX_NV_SPACE),\
+BDRV_SECTOR_SIZE)
+#define SAVESTATE_DISK_OFFSET (PERMSTATE_DISK_OFFSET + PERMSTATE_DISK_SPACE)
+#define SAVESTATE_DISK_SPACE \
+  ALIGN(tpmlib_get_prop(TPMPROP_TPM_MAX_SAVESTATE_SPACE),\
+BDRV_SECTOR_SIZE)
+#define VOLASTATE_DISK_OFFSET (SAVESTATE_DISK_OFFSET + SAVESTATE_DISK_SPACE)
+#define VOLASTATE_DISK_SPACE \
+  ALIGN(tpmlib_get_prop(TPMPROP_TPM_MAX_VOLATILESTATE_SPACE),\
+BDRV_SECTOR_SIZE)
+
+# define MINIMUM_BS_SIZE   ALIGN(ALIGN(VOLASTATE_DISK_OFFSET +\
+   VOLASTATE_DISK_SPACE,  \
+   BDRV_SECTOR_SIZE), \
+ 1024)
+
+#define MINIMUM_BS_SIZE_KB(int)(MINIMUM_BS_SIZE / 1024)
+
+
 /* data structures */
 
 typedef struct ThreadParams {
@@ -53,6 +80,37 @@ typedef struct ThreadParams {
 } ThreadParams;
 
 
+enum BSEntryType {
+BS_ENTRY_PERMSTATE,
+BS_ENTRY_SAVESTATE,
+BS_ENTRY_VOLASTATE,
+
+BS_ENTRY_LAST,
+};
+
+
+typedef struct BSEntry {
+enum BSEntryType type;
+uint64_t offset;
+uint32_t space;
+uint32_t blobsize;
+} __attribute__((packed)) 

[Qemu-devel] [PATCH V2 6/9] Implement qemu_thread_join function

2011-03-30 Thread Stefan Berger
This patch provides support for 'joining a thread' by wrapping the POSIX
pthread_join with qemu_thread_join.

Since the backend implementation is based on threads and I am stopping
and starting that thread during operations like 'snapshot resume', I
do use this functionality to synchronize with the TPM thread's termination
before terminating the TPM, creating a new one and loading previous state
from the time of the snapshot into the TPM.

Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com

---
 qemu-thread-posix.c |5 +
 qemu-thread.h   |1 +
 2 files changed, 6 insertions(+)

Index: qemu-git/qemu-thread-posix.c
===
--- qemu-git.orig/qemu-thread-posix.c
+++ qemu-git/qemu-thread-posix.c
@@ -147,3 +147,8 @@ void qemu_thread_exit(void *retval)
 {
 pthread_exit(retval);
 }
+
+int qemu_thread_join(QemuThread *thread, void **retval)
+{
+return pthread_join(thread-thread, retval);
+}
Index: qemu-git/qemu-thread.h
===
--- qemu-git.orig/qemu-thread.h
+++ qemu-git/qemu-thread.h
@@ -35,5 +35,6 @@ void qemu_thread_create(QemuThread *thre
 void qemu_thread_get_self(QemuThread *thread);
 int qemu_thread_is_self(QemuThread *thread);
 void qemu_thread_exit(void *retval);
+int qemu_thread_join(QemuThread *thread, void **retval);
 
 #endif




Re: [Qemu-devel] [PATCH] Don't limit node count by smp count

2011-03-30 Thread Sasha Levin
On Wed, 2011-03-30 at 14:28 -0500, Anthony Liguori wrote:
 On 03/30/2011 02:14 PM, Sasha Levin wrote:
  It is possible to create CPU-less NUMA nodes, node amount shouldn't be
  limited by amount of CPUs.
 
 But does this actually work in the code today and does it work with any 
 guests?
 

I've tested it on an Ubuntu desktop as a guest and it worked
fine, /sys/devices/system/node/has_cpu was as expected.

 Regards,
 
 Anthony Liguori
 
  Signed-off-by: Sasha Levinlevinsasha...@gmail.com
  ---
vl.c |4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
 
  diff --git a/vl.c b/vl.c
  index 8bcf2ae..8cc1aa8 100644
  --- a/vl.c
  +++ b/vl.c
  @@ -3002,8 +3002,8 @@ int main(int argc, char **argv, char **envp)
if (nb_numa_nodes  0) {
int i;
 
  -if (nb_numa_nodes  smp_cpus) {
  -nb_numa_nodes = smp_cpus;
  +if (nb_numa_nodes  MAX_NODES) {
  +nb_numa_nodes = MAX_NODES;
}
 
/* If no memory size if given for any node, assume the default
  case
 
 
 

-- 

Sasha.




Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Peter Maydell
2011/3/30 Gleb Natapov g...@redhat.com:
 On Wed, Mar 30, 2011 at 07:53:41PM +0100, Peter Maydell wrote:
 I'm not convinced about the utility of printing the pid, personally.
 Most programs get along fine without printing anything when
 they receive a terminal signal.

 Well qemu is a bit of special case. It is long running process that
 takes huge amount of memory and, as suchm it becomes a target of various
 monitoring script which, when configured incorrectly, start killing
 perfectly valid guests. In addition killing of the guest looks exactly
 like guest shutdown to management software because we call shutdow_request
 in the signal handler.

That sounds like a flaw in the communication protocol between
qemu and the management software, which would be better fixed
by having qemu communicate the reason for exit directly (ie
not just by printing to stderr), surely?

 Exactly. This should do the trick (not tested).

Looks good, and a test shows I don't get the segfault any more.

Reviewed-by: Peter Maydell peter.mayd...@linaro.org

although I guess you'll want to submit it with a sensible git
commit message :-)

-- PMM



[Qemu-devel] [PATCH] vl.c: Tidy up message printed when we exit on a signal

2011-03-30 Thread Peter Maydell
Tidy up the message printed when qemu exits due to a signal, so that
it's clearer where the message is coming from and that it's not just
stray debug output.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
 vl.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 5c9205f..4d9e503 100644
--- a/vl.c
+++ b/vl.c
@@ -1169,8 +1169,15 @@ int qemu_shutdown_requested(void)
 void qemu_kill_report(void)
 {
 if (shutdown_signal != -1) {
-fprintf(stderr, Got signal %d from pid %d\n,
- shutdown_signal, shutdown_pid);
+fprintf(stderr, qemu: terminating on signal %d, shutdown_signal);
+if (shutdown_pid == 0) {
+/* This happens for eg ^C at the terminal, so it's worth
+ * avoiding printing an odd message in that case.
+ */
+fputc('\n', stderr);
+} else {
+fprintf(stderr,  from pid %d\n, shutdown_pid);
+}
 shutdown_signal = -1;
 }
 }
-- 
1.7.1




Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread malc
On Wed, 30 Mar 2011, Peter Maydell wrote:

 On 15 March 2011 11:56, Gleb Natapov g...@redhat.com wrote:
  Currently when rogue script kills QEMU process (using TERM/INT/HUP
  signal) it looks indistinguishable from system shutdown. Lets report
  that QEMU was killed and leave some clues about the killer identity.
 
 Unfortunately this patch causes qemu to segfault when killed
 via ^C (at least on my Ubuntu maverick system). This is because
 it registers a signal handler with sigaction, but then later
 the SDL library is initialised and it reinstalls our handler
 with plain old signal:
 
 ohandler = signal(SIGINT, SDL_HandleSIG);
 if ( ohandler != SIG_DFL )
 signal(SIGINT, ohandler);
 
 This is clearly buggy but on the other hand SDL is pretty widely
 deployed and it's the default QEMU video output method, so I think
 we need to work around it :-(
 
 The most straightforward fix is to get the signal number from
 argument one and not to bother printing the PID that killed us.
 

Maybe using SDL_INIT_NOPARACHUTE is worth doing?

-- 
mailto:av1...@comtv.ru



Re: [Qemu-devel] [PATCH] Don't limit node count by smp count

2011-03-30 Thread Michael Roth

On 03/30/2011 02:14 PM, Sasha Levin wrote:

It is possible to create CPU-less NUMA nodes, node amount shouldn't be
limited by amount of CPUs.

Signed-off-by: Sasha Levinlevinsasha...@gmail.com
---
  vl.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 8bcf2ae..8cc1aa8 100644
--- a/vl.c
+++ b/vl.c
@@ -3002,8 +3002,8 @@ int main(int argc, char **argv, char **envp)
  if (nb_numa_nodes  0) {
  int i;

-if (nb_numa_nodes  smp_cpus) {
-nb_numa_nodes = smp_cpus;
+if (nb_numa_nodes  MAX_NODES) {
+nb_numa_nodes = MAX_NODES;
  }

  /* If no memory size if given for any node, assume the default
case




Did a quick test myself and it seems to work from what I can tell:

RHEL6 64-bit guest, QEMU master:

/home/mdroth/dev/kvm/qemu.git/x86_64-softmmu/qemu-system-x86_64 -smp 2 
-numa node,nodeid=1 -numa node,nodeid=2 -numa node,nodeid=3 -drive 
file=/home/mdroth/vm/rhel6_64_base.raw,snapshot=off,if=virtio -bios 
/home/mdroth/dev/kvm/qemu.git/pc-bios/bios.bin -vnc :1 -m 1024 --enable-kvm


...

[mdroth@vm0 ~]$ cat /sys/devices/system/node/node{0,1,2}/cpulist
0
1

[mdroth@vm0 ~]$ cat /sys/devices/system/node/node{0,1,2}/distance
10 20 20
20 10 20
20 20 10
[mdroth@vm0 ~]$ cat /sys/devices/system/node/node{0,1,2}/meminfo

Node 0 MemTotal: 343664 kB
Node 0 MemFree:  181384 kB
Node 0 MemUsed:  162280 kB
Node 0 Active:26640 kB
Node 0 Inactive:  52752 kB
Node 0 Active(anon):  12272 kB
Node 0 Inactive(anon):0 kB
Node 0 Active(file):  14368 kB
Node 0 Inactive(file):52752 kB
Node 0 Unevictable:   0 kB
Node 0 Mlocked:   0 kB
Node 0 Dirty:   116 kB
Node 0 Writeback: 0 kB
Node 0 FilePages: 67220 kB
Node 0 Mapped:12980 kB
Node 0 AnonPages: 12168 kB
Node 0 Shmem:   104 kB
Node 0 KernelStack: 736 kB
Node 0 PageTables: 2988 kB
Node 0 NFS_Unstable:  0 kB
Node 0 Bounce:0 kB
Node 0 WritebackTmp:  0 kB
Node 0 Slab:  53356 kB
Node 0 SReclaimable:   8896 kB
Node 0 SUnreclaim:44460 kB
Node 0 HugePages_Total: 0
Node 0 HugePages_Free:  0
Node 0 HugePages_Surp:  0

Node 1 MemTotal: 344064 kB
Node 1 MemFree:  292756 kB
Node 1 MemUsed:   51308 kB
Node 1 Active:12548 kB
Node 1 Inactive:  11296 kB
Node 1 Active(anon):   8120 kB
Node 1 Inactive(anon):4 kB
Node 1 Active(file):   4428 kB
Node 1 Inactive(file):11292 kB
Node 1 Unevictable:   0 kB
Node 1 Mlocked:   0 kB
Node 1 Dirty: 4 kB
Node 1 Writeback: 0 kB
Node 1 FilePages: 15776 kB
Node 1 Mapped: 3548 kB
Node 1 AnonPages:  6044 kB
Node 1 Shmem:56 kB
Node 1 KernelStack: 264 kB
Node 1 PageTables: 1904 kB
Node 1 NFS_Unstable:  0 kB
Node 1 Bounce:0 kB
Node 1 WritebackTmp:  0 kB
Node 1 Slab:  14696 kB
Node 1 SReclaimable:   1584 kB
Node 1 SUnreclaim:13112 kB
Node 1 HugePages_Total: 0
Node 1 HugePages_Free:  0
Node 1 HugePages_Surp:  0

Node 2 MemTotal: 360436 kB
Node 2 MemFree:  345112 kB
Node 2 MemUsed:   15324 kB
Node 2 Active:0 kB
Node 2 Inactive:  0 kB
Node 2 Active(anon):  0 kB
Node 2 Inactive(anon):0 kB
Node 2 Active(file):  0 kB
Node 2 Inactive(file):0 kB
Node 2 Unevictable:   0 kB
Node 2 Mlocked:   0 kB
Node 2 Dirty: 0 kB
Node 2 Writeback: 0 kB
Node 2 FilePages: 0 kB
Node 2 Mapped:4 kB
Node 2 AnonPages: 0 kB
Node 2 Shmem: 0 kB
Node 2 KernelStack:   8 kB
Node 2 PageTables:0 kB
Node 2 NFS_Unstable:  0 kB
Node 2 Bounce:0 kB
Node 2 WritebackTmp:  0 kB
Node 2 Slab:  10920 kB
Node 2 SReclaimable:   1200 kB
Node 2 SUnreclaim: 9720 kB
Node 2 HugePages_Total: 0
Node 2 HugePages_Free:  0
Node 2 HugePages_Surp:  0
[mdroth@vm0 ~]$





Re: [Qemu-devel] [PATCH 1/3] arm: basic support for ARMv4/ARMv4T emulation

2011-03-30 Thread Dmitry Eremin-Solenikov
On 3/30/11, Peter Maydell peter.mayd...@linaro.org wrote:
 On 30 March 2011 12:41, Dmitry Eremin-Solenikov dbarysh...@gmail.com
 wrote:
 @@ -7172,10 +7210,7 @@ static void disas_arm_insn(CPUState * env,
 DisasContext *s)
 }
 if (insn  (1  20)) {
 /* Complete the load.  */
 -if (rd == 15)
 -gen_bx(s, tmp);
 -else
 -store_reg(s, rd, tmp);
 +store_reg_from_load(env, s, rn, tmp);
 }
 break;
 case 0x08:

 Shouldn't the argument to store_reg_from_load() be 'rd', not 'rn'?

Yeah, it seems so :)
Please commit the fixed patch.

-- 
With best wishes
Dmitry



[Qemu-devel] [RFC PATCH] qed: add support for Copy-on-Read

2011-03-30 Thread Anthony Liguori
When creating an image using qemu-img, just pass '-o copy_on_read' and then
whenever QED reads from a backing file, it will write the block to the QED
file after the read completes ensuring that you only fetch from the backing
device once.

This is very useful for streaming images over a slow connection.

This isn't ready for merge yet as it's not playing nice with synchronize I/O.

I think it's fairly easy to do the same thing in qcow2 by just hooking adding
some logic after bdrv_aio_write() to call back into qcow2 with a synchronous
I/O write in the backing file case.   Thoughts on whether that would actually
work?

Signed-off-by: Anthony Liguori aligu...@us.ibm.com
---
 block/qed.c |   53 +
 block/qed.h |4 +++-
 2 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 75ae244..292833b 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -448,7 +448,8 @@ static int bdrv_qed_flush(BlockDriverState *bs)
 
 static int qed_create(const char *filename, uint32_t cluster_size,
   uint64_t image_size, uint32_t table_size,
-  const char *backing_file, const char *backing_fmt)
+  const char *backing_file, const char *backing_fmt,
+  bool copy_on_read)
 {
 QEDHeader header = {
 .magic = QED_MAGIC,
@@ -490,6 +491,9 @@ static int qed_create(const char *filename, uint32_t 
cluster_size,
 if (qed_fmt_is_raw(backing_fmt)) {
 header.features |= QED_F_BACKING_FORMAT_NO_PROBE;
 }
+if (copy_on_read) {
+header.compat_features |= QED_CF_COPY_ON_READ;
+}
 }
 
 qed_header_cpu_to_le(header, le_header);
@@ -523,6 +527,7 @@ static int bdrv_qed_create(const char *filename, 
QEMUOptionParameter *options)
 uint32_t table_size = QED_DEFAULT_TABLE_SIZE;
 const char *backing_file = NULL;
 const char *backing_fmt = NULL;
+bool copy_on_read = false;
 
 while (options  options-name) {
 if (!strcmp(options-name, BLOCK_OPT_SIZE)) {
@@ -539,6 +544,10 @@ static int bdrv_qed_create(const char *filename, 
QEMUOptionParameter *options)
 if (options-value.n) {
 table_size = options-value.n;
 }
+} else if (!strcmp(options-name, copy_on_read)) {
+if (options-value.n) {
+copy_on_read = true;
+}
 }
 options++;
 }
@@ -559,9 +568,14 @@ static int bdrv_qed_create(const char *filename, 
QEMUOptionParameter *options)
 qed_max_image_size(cluster_size, table_size));
 return -EINVAL;
 }
+if (copy_on_read  !backing_file) {
+fprintf(stderr,
+QED only supports Copy-on-Read with a backing file\n);
+return -EINVAL;
+}
 
 return qed_create(filename, cluster_size, image_size, table_size,
-  backing_file, backing_fmt);
+  backing_file, backing_fmt, copy_on_read);
 }
 
 typedef struct {
@@ -1085,6 +1099,27 @@ static void qed_aio_write_data(void *opaque, int ret,
 }
 
 /**
+ * Copy on read callback
+ *
+ * Write data from backing file to QED that's been read if CoR is enabled.
+ */
+static void qed_copy_on_read_cb(void *opaque, int ret)
+{
+QEDAIOCB *acb = opaque;
+BDRVQEDState *s = acb_to_s(acb);
+BlockDriverAIOCB *cor_acb;
+
+cor_acb = bdrv_aio_writev(s-bs,
+  acb-cur_pos / BDRV_SECTOR_SIZE,
+  acb-cur_qiov,
+  acb-cur_qiov.size / BDRV_SECTOR_SIZE,
+  qed_aio_next_io, acb);
+if (!cor_acb) {
+qed_aio_complete(acb, -EIO);
+}
+}
+
+/**
  * Read data cluster
  *
  * @opaque: Read request
@@ -1102,6 +1137,7 @@ static void qed_aio_read_data(void *opaque, int ret,
 BDRVQEDState *s = acb_to_s(acb);
 BlockDriverState *bs = acb-common.bs;
 BlockDriverAIOCB *file_acb;
+BlockDriverCompletionFunc *cb;
 
 /* Adjust offset into cluster */
 offset += qed_offset_into_cluster(s, acb-cur_pos);
@@ -1114,10 +1150,15 @@ static void qed_aio_read_data(void *opaque, int ret,
 
 qemu_iovec_copy(acb-cur_qiov, acb-qiov, acb-qiov_offset, len);
 
+cb = qed_aio_next_io;
+
 /* Handle backing file and unallocated sparse hole reads */
 if (ret != QED_CLUSTER_FOUND) {
+if ((s-header.compat_features  QED_CF_COPY_ON_READ)) {
+cb = qed_copy_on_read_cb;
+}
 qed_read_backing_file(s, acb-cur_pos, acb-cur_qiov,
-  qed_aio_next_io, acb);
+  cb, acb);
 return;
 }
 
@@ -1125,7 +1166,7 @@ static void qed_aio_read_data(void *opaque, int ret,
 file_acb = bdrv_aio_readv(bs-file, offset / BDRV_SECTOR_SIZE,
   acb-cur_qiov,
   acb-cur_qiov.size / BDRV_SECTOR_SIZE,
-

Re: [Qemu-devel] [PATCHv3] report that QEMU process was killed by a signal

2011-03-30 Thread Gleb Natapov
On Thu, Mar 31, 2011 at 01:35:31AM +0400, malc wrote:
 On Wed, 30 Mar 2011, Peter Maydell wrote:
 
  On 15 March 2011 11:56, Gleb Natapov g...@redhat.com wrote:
   Currently when rogue script kills QEMU process (using TERM/INT/HUP
   signal) it looks indistinguishable from system shutdown. Lets report
   that QEMU was killed and leave some clues about the killer identity.
  
  Unfortunately this patch causes qemu to segfault when killed
  via ^C (at least on my Ubuntu maverick system). This is because
  it registers a signal handler with sigaction, but then later
  the SDL library is initialised and it reinstalls our handler
  with plain old signal:
  
  ohandler = signal(SIGINT, SDL_HandleSIG);
  if ( ohandler != SIG_DFL )
  signal(SIGINT, ohandler);
  
  This is clearly buggy but on the other hand SDL is pretty widely
  deployed and it's the default QEMU video output method, so I think
  we need to work around it :-(
  
  The most straightforward fix is to get the signal number from
  argument one and not to bother printing the PID that killed us.
  
 
 Maybe using SDL_INIT_NOPARACHUTE is worth doing?
 
We do it already, but SDL mangles signal handlers anyway. The funny
thing is that parachute code actually correctly use sigaction when
available.

--
Gleb.