date:20151112

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Dr. David Alan Gilbert

* Li, Liang Z (liang.z...@intel.com) wrote:
> > >> >
> > >> > I use your new code:
> > >> > -
> > >> >unsigned long *p = ...
> > >> >if (p[0] || p[1] || p[2] || p[3]
> > >> >|| memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
> > >> >return BUFFER_NOT_ZERO;
> > >> >else
> > >> >return BUFFER_ZERO;
> > >> > ---
> > >> > and the result is almost the same.  I also tried the check 8, 16
> > >> > long data at the beginning, same result.
> > >>
> > >> Interesting...  Well, all I can say is that applaud you for testing
> > >> your hypothesis with the benchmark.
> > >>
> > >> Probably the setup cost of memcmp is too high, because the testing
> > >> loop is already very optimized.
> > >>
> > >> Please submit the AVX2 version if it helps!
> > 
> > I read the email in the wrong order.  Forget about my other email.
> > 
> > Sorry, Juan.
> > 
> 
> One thing I still can't understand, why the unit test in host environment 
> shows
> 'memcmp()' have better performance?

Are you aware of any program other than QEMU that also wants to do something
similar?  Finding whether a block of memory is zero, sounds like something
that would be useful in lots of places, I just can't think which ones.

Dave

> 
> Liang
> > 
> > >
> > > Yes, the AVX2 version really helps. I have already submitted it, could
> > > you help to review it?
> > >
> > > I am curious about the original intention to add the SSE2 Intrinsics,
> > > is the same reason?
> > >
> > > I even suspect the VM may impact the 'memcmp()' performance, is it
> > possible?
> > >
> > > Liang
> > >
> > >> Paolo
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH 3/4] block/gluster: using new qapi schema

2015-11-12 Thread Eric Blake

On 11/12/2015 03:22 AM, Prasanna Kumar Kalever wrote:
> this patch adds GlusterConf to qapi/block-core.json

Missing a vNN in the subject line.  I think we're up to v14?  But it
doesn't affect what 'git am' will do.

> 
> Signed-off-by: Prasanna Kumar Kalever 
> ---
>  block/gluster.c  | 104 
> +--
>  qapi/block-core.json |  60 +++--
>  2 files changed, 109 insertions(+), 55 deletions(-)
> 

Modulo Jeff's findings,

> diff --git a/block/gluster.c b/block/gluster.c
> index ededda2..615f28b 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c

> -typedef struct GlusterConf {
> -char *host;
> -int port;
> -char *volume;
> -char *path;
> -char *transport;
> -} GlusterConf;
> -
> -

So this is the struct being replaced by qapi BlockdevOptionsGluster.
/me jumps ahead to [1] in my review, before continuing here...

I'm back. Looks like your qapi struct matches this nicely, with the
possible exception of what happens if we try to avoid churn by
using/enforcing a 1-element array now rather than converting to array in
patch 4.

> @@ -143,8 +127,10 @@ static int parse_volume_options(GlusterConf *gconf, char 
> *path)
>   * file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
>   * file=gluster+rdma://1.2.3.4:24007/testvol/a.img
>   */
> -static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
> +static int qemu_gluster_parseuri(BlockdevOptionsGluster **pgconf,
> + const char *filename)

I'm not sure from looking at just the signature why you changed from
*gconf to **pgconf; maybe that sort of conversion would have been worth
mentioning in the commit message (a good rule of thumb - if the change
isn't blatantly obvious, then calling it out in the commit message will
prepare reviewers for it).

> @@ -190,13 +180,23 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, 
> const char *filename)
>  ret = -EINVAL;
>  goto out;
>  }
> -gconf->host = g_strdup(qp->p[0].value);
> +gconf->server->host = g_strdup(qp->p[0].value);
>  } else {
> -gconf->host = g_strdup(uri->server ? uri->server : "localhost");
> -gconf->port = uri->port;
> +gconf->server->host = g_strdup(uri->server ? uri->server : 
> "localhost");
> +if (uri->port) {
> +gconf->server->port = uri->port;
> +} else {
> +gconf->server->port = GLUSTER_DEFAULT_PORT;
> +}
> +gconf->server->has_port = true;
>  }
>  
> +*pgconf = gconf;

Okay, now I see where the change in signature comes into play - you want
to return a new allocation to the user, but only on success.  But I'm
still not necessarily convinced that you need it.  See more at [3] below.

> +
>  out:
> +if (ret < 0) {
> +qapi_free_BlockdevOptionsGluster(gconf);
> +}
>  if (qp) {
>  query_params_free(qp);
>  }
> @@ -204,14 +204,15 @@ out:
>  return ret;
>  }

Okay, this parseuri conversion is sane.  It will need tweaking in patch
4 to deal with gconf->server becoming a list rather than a single
server, but as long as both patches go in, we should be okay.

>  
> -static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char 
> *filename,
> -  Error **errp)
> +static struct glfs *qemu_gluster_init(BlockdevOptionsGluster **pgconf,
> +  const char *filename, Error **errp)
>  {
> -struct glfs *glfs = NULL;
> +struct glfs *glfs;

Jeff already spotted that the change here is spurious.

>  int ret;
>  int old_errno;
> +BlockdevOptionsGluster *gconf;
>  
> -ret = qemu_gluster_parseuri(gconf, filename);
> +ret = qemu_gluster_parseuri(, filename);
>  if (ret < 0) {
>  error_setg(errp, "Usage: file=gluster[+transport]://[host[:port]]/"
>   "volume/path[?socket=...]");
> @@ -224,8 +225,9 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, 
> const char *filename,
>  goto out;
>  }
>  
> -ret = glfs_set_volfile_server(glfs, gconf->transport, gconf->host,
> -gconf->port);
> +ret = glfs_set_volfile_server(glfs,
> +  
> GlusterTransport_lookup[gconf->server->transport],

Line longer than 80 characters; I might have used an intermediate const
char * variable to cut down on the length. But as long as it gets past
scripts/checkpatch.pl, I won't insist on a reformat.

> +  gconf->server->host, gconf->server->port);

Ouch - since you aren't validating that gconf->server->port fits in 16
bits, you may be passing something so large that it silently wraps around.

>  if (ret < 0) {
>  goto out;
>  }
> @@ -242,10 +244,10 @@ static struct glfs *qemu_gluster_init(GlusterConf 
> *gconf, const char *filename,
>

Re: [Qemu-devel] [PATCH 09/12] vl: Replace DT_NOGRAPHIC with MachineState field

2015-11-12 Thread Eduardo Habkost

On Thu, Nov 12, 2015 at 10:48:12AM +0100, Paolo Bonzini wrote:
> 
> 
> On 11/11/2015 20:09, Eduardo Habkost wrote:
> > All DisplayType values are just UI options that don't affect any
> > hardware emulation code, except for DT_NOGRAPHIC. Replace
> > DT_NOGRAPHIC with DT_NONE plus a new MachineState.nographic
> > field, so hardware emulation code don't need to use the
> > display_type variable.
> > 
> > Cc: Michael Walle 
> > Cc: Blue Swirl 
> > Cc: Mark Cave-Ayland 
> > Signed-off-by: Eduardo Habkost 
> 
> Can you add a QOM property too, so that "-machine graphics=yes|no" can
> be used?

I can, but I would like to clarify the expected semantics. With
the -machine option, we would have:

* -display, which affects only the display UI.
* -nographic, which affects:
  * The display UI;
  * Hardware emulation;
  * serial/paralllel/virtioconsole output redirection.
* -machine graphics=no, which would affect only hardware
  emulation.

Is that correct?

-- 
Eduardo

[Qemu-devel] [PATCH 3/3] qtest/ahci: use raw format when qemu-img is absent

2015-11-12 Thread John Snow

If we don't have the qemu-img tool, use the raw format
for tests and skip the high-sector LBA48 tests.

Signed-off-by: John Snow 
---
 tests/ahci-test.c | 41 -
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 6d9ac84..7a4e375 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -39,16 +39,16 @@
 #include "hw/pci/pci_ids.h"
 #include "hw/pci/pci_regs.h"
 
-/* Test-specific defines -- in MiB */
-#define TEST_IMAGE_SIZE_MB (200 * 1024)
-#define TEST_IMAGE_SECTORS ((TEST_IMAGE_SIZE_MB / AHCI_SECTOR_SIZE) \
-* 1024 * 1024)
+/* Test images sizes in MB */
+#define TEST_IMAGE_SIZE_MB_LARGE (200 * 1024)
+#define TEST_IMAGE_SIZE_MB_SMALL 64
 
 /*** Globals ***/
 static char tmp_path[] = "/tmp/qtest.XX";
 static char debug_path[] = "/tmp/qtest-blkdebug.XX";
 static bool ahci_pedantic;
 static const char *imgfmt;
+static unsigned test_image_size_mb;
 
 /*** Function Declarations ***/
 static void ahci_test_port_spec(AHCIQState *ahci, uint8_t port);
@@ -61,6 +61,11 @@ static void ahci_test_pmcap(AHCIQState *ahci, uint8_t 
offset);
 
 /*** Utilities ***/
 
+static size_t mb_to_sectors(size_t image_size_mb)
+{
+return (image_size_mb * 1024 * 1024) / AHCI_SECTOR_SIZE;
+}
+
 static void string_bswap16(uint16_t *s, size_t bytes)
 {
 g_assert_cmphex((bytes & 1), ==, 0);
@@ -901,7 +906,7 @@ static void ahci_test_max(AHCIQState *ahci)
 uint64_t nsect;
 uint8_t port;
 uint8_t cmd;
-uint64_t config_sect = TEST_IMAGE_SECTORS - 1;
+uint64_t config_sect = mb_to_sectors(test_image_size_mb) - 1;
 
 if (config_sect > 0xFF) {
 cmd = CMD_READ_MAX_EXT;
@@ -1480,7 +1485,7 @@ static uint64_t offset_sector(enum OffsetType ofst,
 return 1;
 case OFFSET_HIGH:
 ceil = (addr_type == ADDR_MODE_LBA28) ? 0xfff : 0x;
-ceil = MIN(ceil, TEST_IMAGE_SECTORS - 1);
+ceil = MIN(ceil, mb_to_sectors(test_image_size_mb) - 1);
 nsectors = buffsize / AHCI_SECTOR_SIZE;
 return ceil - nsectors + 1;
 default:
@@ -1562,8 +1567,9 @@ static void create_ahci_io_test(enum IOMode type, enum 
AddrMode addr,
 enum BuffLen len, enum OffsetType offset)
 {
 char *name;
-AHCIIOTestOptions *opts = g_malloc(sizeof(AHCIIOTestOptions));
+AHCIIOTestOptions *opts;
 
+opts = g_malloc(sizeof(AHCIIOTestOptions));
 opts->length = len;
 opts->address_type = addr;
 opts->io_type = type;
@@ -1575,6 +1581,13 @@ static void create_ahci_io_test(enum IOMode type, enum 
AddrMode addr,
buff_len_str[len],
offset_str[offset]);
 
+if ((addr == ADDR_MODE_LBA48) && (offset == OFFSET_HIGH) &&
+(mb_to_sectors(test_image_size_mb) <= 0xFFF)) {
+g_test_message("%s: skipped; test image too small", name);
+g_free(name);
+return;
+}
+
 qtest_add_data_func(name, opts, test_io_interface);
 g_free(name);
 }
@@ -1624,8 +1637,18 @@ int main(int argc, char **argv)
 /* Create a temporary image */
 fd = mkstemp(tmp_path);
 g_assert(fd >= 0);
-imgfmt = "qcow2";
-mkqcow2(tmp_path, TEST_IMAGE_SIZE_MB);
+if (have_qemu_img()) {
+imgfmt = "qcow2";
+test_image_size_mb = TEST_IMAGE_SIZE_MB_LARGE;
+mkqcow2(tmp_path, TEST_IMAGE_SIZE_MB_LARGE);
+} else {
+g_test_message("QTEST_QEMU_IMG not set or qemu-img missing; "
+   "skipping LBA48 high-sector tests");
+imgfmt = "raw";
+test_image_size_mb = TEST_IMAGE_SIZE_MB_SMALL;
+ret = ftruncate(fd, test_image_size_mb * 1024 * 1024);
+g_assert(ret == 0);
+}
 close(fd);
 
 /* Create temporary blkdebug instructions */
-- 
2.4.3

[Qemu-devel] [PATCH 0/3] qtest/ahci: skip qcow2 tests

2015-11-12 Thread John Snow

Skip tests that need qcow2 to function when qemu-img isn't
set or otherwise present.

When running under raw mode, we no longer properly test
LBA48 reads/writes beyond 137Gb. ahci-test will skip past
any //io//lba48//high test.

While we're at it, try to make failures due to missing qemu-img
binaries a little more apparent in case we find a way to bump
into them again.

John Snow (3):
  qtest/ahci: always specify image format
  libqos: add qemu-img presence check
  qtest/ahci: use raw format when qemu-img is absent

 tests/ahci-test.c | 90 ++-
 tests/libqos/libqos.c | 28 +---
 tests/libqos/libqos.h |  1 +
 3 files changed, 84 insertions(+), 35 deletions(-)

-- 
2.4.3

Re: [Qemu-devel] [PATCH 1/4] block/gluster: rename [server, volname, image] -> [host, volume, path]

2015-11-12 Thread Eric Blake

On 11/12/2015 03:22 AM, Prasanna Kumar Kalever wrote:
> this patch is very much be meaningful after next patch which adds multiple
> gluster servers support. After that,
> 
> an example is, in  'servers' tuple values we use 'server' variable for key

Awkward line break mid-sentence.  The commit message alone is not a
reason to hold up this patch, so maybe the maintainer can adjust it.

> 'host' in the code, it will be quite messy to have colliding names for
> variables, so to maintain better readability and makes it consistent with 
> other
> existing code as well as the input keys/options, this patch renames the
> following variables

I'd suggest replacing everything up to here with:

A future patch will add support for multiple gluster servers.  Our
existing terminology is a bit unusual in relation to what names are used
by other networked devices, and doesn't map very well to the terminology
we expect to use for multiple servers.  Therefore, rename the following
options:

> 'server'  -> 'host'
> 'image'   -> 'path'
> 'volname' -> 'volume'
> 
> Signed-off-by: Prasanna Kumar Kalever 
> Reviewed-by: Eric Blake 
> ---
>  block/gluster.c | 54 +++---
>  1 file changed, 27 insertions(+), 27 deletions(-)

R-b still stands on this one.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] tests/.gitignore: ignore test-blockjob-txn

2015-11-12 Thread John Snow

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256



On 11/12/2015 03:31 PM, Eric Blake wrote:
> On 11/12/2015 01:21 PM, John Snow wrote:
>> Reported-by: Eric Blake  Signed-off-by: John
>> Snow  --- tests/.gitignore | 1 + 1 file
>> changed, 1 insertion(+)
> 
> My version of this same patch at least called out the commit that 
> introduced the problem :) 
> https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg03142.html
>
> 
>> 
>> diff --git a/tests/.gitignore b/tests/.gitignore index
>> e96f569..de5e793 100644 --- a/tests/.gitignore +++
>> b/tests/.gitignore @@ -9,6 +9,7 @@ check-qom-proplist rcutorture 
>> test-aio test-bitops +test-blockjob-txn test-coroutine 
>> test-crypto-cipher test-crypto-hash
>> 
> 

Oh, I didn't realize you sent a _patch_, I just saw the part saying
"Hey, stuff gooofed up!"

ignore, NACK, etc

(Sorry, Eric -- I'm on autopilot right now.)

- --js
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQIcBAEBCAAGBQJWRPdnAAoJEH3vgQaq/DkOo+YQALWnjexfVyEC7I0kZaIzH/yh
voBG7XiApNPxPqq1X/C+UtGkM6XxIBAECoz2O5+eGnnw9OTcNkdQELcvERfcnG50
+ULvQIf+BeCrMV8fPZaq8uCvwYCIMOxUd2JH22QptYsetBIQHeVnARY8OltdJ1ET
gLRHFcBTYy/LFnQvuIqBNKcSTuiWTSxqHpsQHeDRv2a+0XTIgFCbsPLKKPpR0e7d
UdX1NSUuT0suIcgkDf34Vx1rEmlShFzmtrjVlFG8WFtzN0qrYOqn6SQ+1KxgAeR2
RzQeH8RUy7JgLIbKjARb+4Ya3wxKUFIFqZ3Y871uCG1eYC4BE/q3+Q3CdBM45w3E
DNLLk48efHHYumRXYMwS6dPo84T8s5XC76raWuMXLN8kbzj8F2fAgro1NXYfPlz0
NS81aw0HEDDsupvKafvnatF0neVaP1PCOT33ques2T/SIbzzLM9Eu/gNPaQAlBne
Uf7kV8NU38TW3dBKczzUVH//XeVo8dqsQiByGe4YXeMCC9vLHCBwtfxtnZXZtBth
mZ0JJnlIbnX43M8m0/+PsicFzGpddEOXhb4hgWhSRHxNThJQbALy36oFtKWkTCg4
nR/12rIlQQx/A1MfWtYflAAFxPNgGyp8B6Gu1A0utCUMIPti9oYgmROX7+32I7WZ
R71Xc2d3chK0ozyqh/8w
=eOua
-END PGP SIGNATURE-

Re: [Qemu-devel] [PATCH v2 0/7] vl: graphics stubs + #ifdef cleanup

2015-11-12 Thread Eduardo Habkost

On Thu, Nov 12, 2015 at 02:17:53PM -0700, Eric Blake wrote:
> On 11/12/2015 12:02 PM, Eduardo Habkost wrote:
> > Clean up the graphics initialization code to reduce the number of
> > 
> 
> Lame of git for eating lines that start with #ifdef.  But at least it
> doesn't matter on the cover letter :)

Maybe it's a git-publish bug, I will check. Thanks for noting! :)

-- 
Eduardo

[Qemu-devel] [PATCH] scsi: remove scsi_req_free prototype

2015-11-12 Thread Hervé Poussineau

Function has been deleted in ad2d30f79d3b0812f02c741be2189796b788d6d7.

Signed-off-by: Hervé Poussineau 
---
 include/hw/scsi/scsi.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index cdaf0f8..1915a73 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -250,7 +250,6 @@ SCSIRequest *scsi_req_alloc(const SCSIReqOps *reqops, 
SCSIDevice *d,
 SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
   uint8_t *buf, void *hba_private);
 int32_t scsi_req_enqueue(SCSIRequest *req);
-void scsi_req_free(SCSIRequest *req);
 SCSIRequest *scsi_req_ref(SCSIRequest *req);
 void scsi_req_unref(SCSIRequest *req);
 
-- 
2.1.4

[Qemu-devel] [PATCH 1/3] qtest/ahci: always specify image format

2015-11-12 Thread John Snow

Signed-off-by: John Snow 
---
 tests/ahci-test.c | 51 +--
 1 file changed, 29 insertions(+), 22 deletions(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 59d387c..6d9ac84 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -48,6 +48,7 @@
 static char tmp_path[] = "/tmp/qtest.XX";
 static char debug_path[] = "/tmp/qtest-blkdebug.XX";
 static bool ahci_pedantic;
+static const char *imgfmt;
 
 /*** Function Declarations ***/
 static void ahci_test_port_spec(AHCIQState *ahci, uint8_t port);
@@ -170,11 +171,11 @@ static AHCIQState *ahci_boot(const char *cli, ...)
 va_end(ap);
 } else {
 cli = "-drive if=none,id=drive0,file=%s,cache=writeback,serial=%s"
-",format=qcow2"
+",format=%s"
 " -M q35 "
 "-device ide-hd,drive=drive0 "
 "-global ide-hd.ver=%s";
-s = ahci_boot(cli, tmp_path, "testdisk", "version");
+s = ahci_boot(cli, tmp_path, "testdisk", imgfmt, "version");
 }
 
 return s;
@@ -1073,12 +1074,12 @@ static void test_flush_retry(void)
 
 prepare_blkdebug_script(debug_path, "flush_to_disk");
 ahci = ahci_boot_and_enable("-drive file=blkdebug:%s:%s,if=none,id=drive0,"
-"format=qcow2,cache=writeback,"
+"format=%s,cache=writeback,"
 "rerror=stop,werror=stop "
 "-M q35 "
 "-device ide-hd,drive=drive0 ",
 debug_path,
-tmp_path);
+tmp_path, imgfmt);
 
 /* Issue Flush Command and wait for error */
 port = ahci_port_select(ahci);
@@ -1108,10 +1109,10 @@ static void test_migrate_sanity(void)
 const char *uri = "tcp:127.0.0.1:1234";
 
 src = ahci_boot("-m 1024 -M q35 "
-"-hda %s ", tmp_path);
+"-drive if=ide,file=%s,format=%s ", tmp_path, imgfmt);
 dst = ahci_boot("-m 1024 -M q35 "
-"-hda %s "
-"-incoming %s", tmp_path, uri);
+"-drive if=ide,file=%s,format=%s "
+"-incoming %s", tmp_path, imgfmt, uri);
 
 ahci_migrate(src, dst, uri);
 
@@ -1132,10 +1133,11 @@ static void ahci_migrate_simple(uint8_t cmd_read, 
uint8_t cmd_write)
 const char *uri = "tcp:127.0.0.1:1234";
 
 src = ahci_boot_and_enable("-m 1024 -M q35 "
-   "-hda %s ", tmp_path);
+   "-drive if=ide,format=%s,file=%s ",
+   imgfmt, tmp_path);
 dst = ahci_boot("-m 1024 -M q35 "
-"-hda %s "
-"-incoming %s", tmp_path, uri);
+"-drive if=ide,format=%s,file=%s "
+"-incoming %s", imgfmt, tmp_path, uri);
 
 set_context(src->parent);
 
@@ -1190,12 +1192,12 @@ static void ahci_halted_io_test(uint8_t cmd_read, 
uint8_t cmd_write)
 prepare_blkdebug_script(debug_path, "write_aio");
 
 ahci = ahci_boot_and_enable("-drive file=blkdebug:%s:%s,if=none,id=drive0,"
-"format=qcow2,cache=writeback,"
+"format=%s,cache=writeback,"
 "rerror=stop,werror=stop "
 "-M q35 "
 "-device ide-hd,drive=drive0 ",
 debug_path,
-tmp_path);
+tmp_path, imgfmt);
 
 /* Initialize and prepare */
 port = ahci_port_select(ahci);
@@ -1256,20 +1258,20 @@ static void ahci_migrate_halted_io(uint8_t cmd_read, 
uint8_t cmd_write)
 prepare_blkdebug_script(debug_path, "write_aio");
 
 src = ahci_boot_and_enable("-drive file=blkdebug:%s:%s,if=none,id=drive0,"
-   "format=qcow2,cache=writeback,"
+   "format=%s,cache=writeback,"
"rerror=stop,werror=stop "
"-M q35 "
"-device ide-hd,drive=drive0 ",
debug_path,
-   tmp_path);
+   tmp_path, imgfmt);
 
 dst = ahci_boot("-drive file=%s,if=none,id=drive0,"
-"format=qcow2,cache=writeback,"
+"format=%s,cache=writeback,"
 "rerror=stop,werror=stop "
 "-M q35 "
 "-device ide-hd,drive=drive0 "
 "-incoming %s",
-tmp_path, uri);
+tmp_path, imgfmt, uri);
 
 set_context(src->parent);
 
@@ -1327,15 +1329,17 @@ static void test_flush_migrate(void)
 prepare_blkdebug_script(debug_path, "flush_to_disk");
 
 src =

Re: [Qemu-devel] [PATCH 3/4] block/gluster: using new qapi schema

2015-11-12 Thread Jeff Cody

On Thu, Nov 12, 2015 at 03:52:07PM +0530, Prasanna Kumar Kalever wrote:
> this patch adds GlusterConf to qapi/block-core.json
> 
> Signed-off-by: Prasanna Kumar Kalever 
> ---
>  block/gluster.c  | 104 
> +--
>  qapi/block-core.json |  60 +++--
>  2 files changed, 109 insertions(+), 55 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index ededda2..615f28b 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -11,6 +11,10 @@
>  #include "block/block_int.h"
>  #include "qemu/uri.h"
>  
> +#define GLUSTER_OPT_FILENAME"filename"
> +#define GLUSTER_DEFAULT_PORT24007
> +
> +
>  typedef struct GlusterAIOCB {
>  int64_t size;
>  int ret;
> @@ -29,15 +33,6 @@ typedef struct BDRVGlusterReopenState {
>  struct glfs_fd *fd;
>  } BDRVGlusterReopenState;
>  
> -typedef struct GlusterConf {
> -char *host;
> -int port;
> -char *volume;
> -char *path;
> -char *transport;
> -} GlusterConf;
> -
> -
>  static QemuOptsList qemu_gluster_create_opts = {
>  .name = "qemu-gluster-create-opts",
>  .head = QTAILQ_HEAD_INITIALIZER(qemu_gluster_create_opts.head),
> @@ -61,7 +56,7 @@ static QemuOptsList runtime_opts = {
>  .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
>  .desc = {
>  {
> -.name = "filename",
> +.name = GLUSTER_OPT_FILENAME,
>  .type = QEMU_OPT_STRING,
>  .help = "URL to the gluster image",
>  },
> @@ -70,18 +65,7 @@ static QemuOptsList runtime_opts = {
>  };
>  
>  
> -static void qemu_gluster_gconf_free(GlusterConf *gconf)
> -{
> -if (gconf) {
> -g_free(gconf->host);
> -g_free(gconf->volume);
> -g_free(gconf->path);
> -g_free(gconf->transport);
> -g_free(gconf);
> -}
> -}
> -
> -static int parse_volume_options(GlusterConf *gconf, char *path)
> +static int parse_volume_options(BlockdevOptionsGluster *gconf, char *path)
>  {
>  char *p, *q;
>  
> @@ -143,8 +127,10 @@ static int parse_volume_options(GlusterConf *gconf, char 
> *path)
>   * file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
>   * file=gluster+rdma://1.2.3.4:24007/testvol/a.img
>   */
> -static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
> +static int qemu_gluster_parseuri(BlockdevOptionsGluster **pgconf,
> + const char *filename)
>  {
> +BlockdevOptionsGluster *gconf;
>  URI *uri;
>  QueryParams *qp = NULL;
>  bool is_unix = false;
> @@ -155,20 +141,24 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, 
> const char *filename)
>  return -EINVAL;
>  }
>  
> +gconf = g_new0(BlockdevOptionsGluster, 1);
> +gconf->server = g_new0(GlusterServer, 1);
> +
>  /* transport */
>  if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
> -gconf->transport = g_strdup("tcp");
> +gconf->server->transport = GLUSTER_TRANSPORT_TCP;
>  } else if (!strcmp(uri->scheme, "gluster+tcp")) {
> -gconf->transport = g_strdup("tcp");
> +gconf->server->transport = GLUSTER_TRANSPORT_TCP;
>  } else if (!strcmp(uri->scheme, "gluster+unix")) {
> -gconf->transport = g_strdup("unix");
> +gconf->server->transport = GLUSTER_TRANSPORT_UNIX;
>  is_unix = true;
>  } else if (!strcmp(uri->scheme, "gluster+rdma")) {
> -gconf->transport = g_strdup("rdma");
> +gconf->server->transport = GLUSTER_TRANSPORT_RDMA;
>  } else {
>  ret = -EINVAL;
>  goto out;
>  }
> +gconf->server->has_transport = true;
>  
>  ret = parse_volume_options(gconf, uri->path);
>  if (ret < 0) {
> @@ -190,13 +180,23 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, 
> const char *filename)
>  ret = -EINVAL;
>  goto out;
>  }
> -gconf->host = g_strdup(qp->p[0].value);
> +gconf->server->host = g_strdup(qp->p[0].value);
>  } else {
> -gconf->host = g_strdup(uri->server ? uri->server : "localhost");
> -gconf->port = uri->port;
> +gconf->server->host = g_strdup(uri->server ? uri->server : 
> "localhost");
> +if (uri->port) {
> +gconf->server->port = uri->port;
> +} else {
> +gconf->server->port = GLUSTER_DEFAULT_PORT;
> +}
> +gconf->server->has_port = true;
>  }
>  
> +*pgconf = gconf;
> +
>  out:
> +if (ret < 0) {
> +qapi_free_BlockdevOptionsGluster(gconf);
> +}
>  if (qp) {
>  query_params_free(qp);
>  }
> @@ -204,14 +204,15 @@ out:
>  return ret;
>  }
>  
> -static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char 
> *filename,
> -  Error **errp)
> +static struct glfs *qemu_gluster_init(BlockdevOptionsGluster **pgconf,
> +

Re: [Qemu-devel] [PATCH 4/4] block/gluster: add support for multiple gluster servers

2015-11-12 Thread Jeff Cody

On Thu, Nov 12, 2015 at 03:52:08PM +0530, Prasanna Kumar Kalever wrote:
> This patch adds a way to specify multiple volfile servers to the gluster
> block backend of QEMU with tcp|rdma transport types and their port numbers.
> 
> Problem:
> 
> Currently VM Image on gluster volume is specified like this:
> 
> file=gluster[+tcp]://host[:port]/testvol/a.img
> 
> Assuming we have three hosts in trusted pool with replica 3 volume
> in action and unfortunately host (mentioned in the command above) went down
> for some reason, since the volume is replica 3 we now have other 2 hosts
> active from which we can boot the VM.
> 
> But currently there is no mechanism to pass the other 2 gluster host
> addresses to qemu.
> 
> Solution:
> 
> New way of specifying VM Image on gluster volume with volfile servers:
> (We still support old syntax to maintain backward compatibility)
> 
> Basic command line syntax looks like:
> 
> Pattern I:
>  -drive driver=gluster,
> volume=testvol,path=/path/a.raw,
> server.0.host=1.2.3.4,
>[server.0.port=24007,]
>[server.0.transport=tcp,]
> server.1.host=5.6.7.8,
>[server.1.port=24008,]
>[server.1.transport=rdma,] ...
> 
> Pattern II:
>  'json:{"driver":"qcow2","file":{"driver":"gluster",
>"volume":"testvol","path":"/path/a.qcow2",
>"server":[{tuple0},{tuple1}, ...{tupleN}]}}'
> 
>driver  => 'gluster' (protocol name)
>volume  => name of gluster volume where our VM image resides
>path=> absolute path of image in gluster volume
> 
>   {tuple}  => {"host":"1.2.3.4"[,"port":"24007","transport":"tcp"]}
> 
>host=> host address (hostname/ipv4/ipv6 addresses)
>port=> port number on which glusterd is listening. (default 24007)
>transport   => transport type used to connect to gluster management daemon,
>it can be tcp|rdma (default 'tcp')
> 
> Examples:
> 1.
>  -drive driver=qcow2,file.driver=gluster,
> file.volume=testvol,file.path=/path/a.qcow2,
> file.server.0.host=1.2.3.4,
> file.server.0.port=24007,
> file.server.0.transport=tcp,
> file.server.1.host=5.6.7.8,
> file.server.1.port=24008,
> file.server.1.transport=rdma
> 2.
>  'json:{"driver":"qcow2","file":{"driver":"gluster","volume":"testvol",
>  "path":"/path/a.qcow2","server":
>  [{"host":"1.2.3.4","port":"24007","transport":"tcp"},
>   {"host":"4.5.6.7","port":"24008","transport":"rdma"}] } }'
> 
> This patch gives a mechanism to provide all the server addresses, which are in
> replica set, so in case host1 is down VM can still boot from any of the
> active hosts.
> 
> This is equivalent to the backup-volfile-servers option supported by
> mount.glusterfs (FUSE way of mounting gluster volume)
> 
> Credits: Sincere thanks to Kevin Wolf  and
> "Deepak C Shetty"  for inputs and all their support
> 
> Signed-off-by: Prasanna Kumar Kalever 
> ---
>  block/gluster.c  | 288 
> ---
>  qapi/block-core.json |   4 +-
>  2 files changed, 252 insertions(+), 40 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 615f28b..ba209cf 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -12,6 +12,13 @@
>  #include "qemu/uri.h"
>  
>  #define GLUSTER_OPT_FILENAME"filename"
> +#define GLUSTER_OPT_VOLUME  "volume"
> +#define GLUSTER_OPT_PATH"path"
> +#define GLUSTER_OPT_HOST"host"
> +#define GLUSTER_OPT_PORT"port"
> +#define GLUSTER_OPT_TRANSPORT   "transport"
> +#define GLUSTER_OPT_SERVER_PATTERN  "server."
> +
>  #define GLUSTER_DEFAULT_PORT24007
>  
>  
> @@ -64,6 +71,46 @@ static QemuOptsList runtime_opts = {
>  },
>  };
>  
> +static QemuOptsList runtime_json_opts = {
> +.name = "gluster_json",
> +.head = QTAILQ_HEAD_INITIALIZER(runtime_json_opts.head),
> +.desc = {
> +{
> +.name = GLUSTER_OPT_VOLUME,
> +.type = QEMU_OPT_STRING,
> +.help = "name of gluster volume where VM image resides",
> +},
> +{
> +.name = GLUSTER_OPT_PATH,
> +.type = QEMU_OPT_STRING,
> +.help = "absolute path to image file in gluster volume",
> +},
> +{ /* end of list */ }
> +},
> +};
> +
> +static QemuOptsList runtime_tuple_opts = {
> +.name = "gluster_tuple",
> +.head = QTAILQ_HEAD_INITIALIZER(runtime_tuple_opts.head),
> +.desc = {
> +{
> +.name = GLUSTER_OPT_HOST,
> +.type = QEMU_OPT_STRING,
> +.help = "host address (hostname/ipv4/ipv6 addresses)",
> +},
> +{
> +.name = GLUSTER_OPT_PORT,
> +.type = QEMU_OPT_NUMBER,
> +.help = "port number on which glusterd is listening (default 
> 24007)",
> +},
> +

Re: [Qemu-devel] [PATCH for-2.5 v5 1/1] hw/misc: Add support for ADC controller in Xilinx Zynq 7000

2015-11-12 Thread Peter Maydell

On 12 November 2015 at 17:54, Peter Crosthwaite
 wrote:
> From: Guenter Roeck 
>
> Add support for the Xilinx XADC core used in Zynq 7000.
>
> References:
> - Zynq-7000 All Programmable SoC Technical Reference Manual
> - 7 Series FPGAs and Zynq-7000 All Programmable SoC XADC
>   Dual 12-Bit 1 MSPS Analog-to-Digital Converter
>
> Tested with Linux using QEMU machine xilinx-zynq-a9 with devicetree
> files zynq-zc702.dtb and zynq-zc706.dtb, and kernel configuration
> multi_v7_defconfig.
>
> Reviewed-by: Alistair Francis 
> Signed-off-by: Guenter Roeck 
> [ PC changes:
>   * Changed macro names to match TRM where possible
>   * Made programmers model macro scheme consistent
>   * Dropped XADC_ZYNQ_ prefix on local macros
>   * Fix ALM field width
>   * Update threshold-comparison interrupts in _update_ints()
>   * factored out DFIFO pushes into helper. Renamed to "push/pop"
>   * Changed xadc_reg to 10 bits and added OOB check.
>   * Reduced scope of MCTL reset to just stop channel coms.
>   * Added dummy read data to write commands
>   * Changed _ to - seperators in string names and filenames
>   * Dropped  in header comment
>   * Catchall'ed _update_ints() in _write handler.
>   * Minor whitespace changes.
>   * Use ZYNQ_XADC_FIFO_DEPTH instead of ARRAY_SIZE()
> ]
> Signed-off-by: Peter Crosthwaite 
> ---
> v5:
> Fix compile warning
> v4:
> Addressed Alistair review
> Minor whitespace changes
> Use ZYNQ_XADC_FIFO_DEPTH instead of ARRAY_SIZE()
> v3:
> See [PC changes] in commit message
> v2:
> Use extract32()
> Merge zynq_xadc_reset() and _zynq_xadc_reset() into one function
> Use "xlnx,zynq_xadc"
> Move device model to include/hw/misc/zynq_xadc.h
> irq -> qemu_irq
> xadc_dfifo_depth -> xadc_dfifo_entries
> Dropped unnecessary comments
> Merged zynq_xadc_realize() into zynq_xadc_init()
>
>  hw/arm/xilinx_zynq.c|   6 +
>  hw/misc/Makefile.objs   |   1 +
>  hw/misc/zynq-xadc.c | 302 
> 
>  include/hw/misc/zynq-xadc.h |  46 +++
>  4 files changed, 355 insertions(+)
>  create mode 100644 hw/misc/zynq-xadc.c
>  create mode 100644 include/hw/misc/zynq-xadc.h
>

Applied to master, thanks.

-- PMM

[Qemu-devel] [PATCH] tests/.gitignore: ignore test-blockjob-txn

2015-11-12 Thread John Snow

Reported-by: Eric Blake 
Signed-off-by: John Snow 
---
 tests/.gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/.gitignore b/tests/.gitignore
index e96f569..de5e793 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -9,6 +9,7 @@ check-qom-proplist
 rcutorture
 test-aio
 test-bitops
+test-blockjob-txn
 test-coroutine
 test-crypto-cipher
 test-crypto-hash
-- 
2.4.3

Re: [Qemu-devel] [PATCH] tests/.gitignore: ignore test-blockjob-txn

2015-11-12 Thread Eric Blake

On 11/12/2015 01:21 PM, John Snow wrote:
> Reported-by: Eric Blake 
> Signed-off-by: John Snow 
> ---
>  tests/.gitignore | 1 +
>  1 file changed, 1 insertion(+)

My version of this same patch at least called out the commit that
introduced the problem :)
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg03142.html

> 
> diff --git a/tests/.gitignore b/tests/.gitignore
> index e96f569..de5e793 100644
> --- a/tests/.gitignore
> +++ b/tests/.gitignore
> @@ -9,6 +9,7 @@ check-qom-proplist
>  rcutorture
>  test-aio
>  test-bitops
> +test-blockjob-txn
>  test-coroutine
>  test-crypto-cipher
>  test-crypto-hash
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 0/7] vl: graphics stubs + #ifdef cleanup

2015-11-12 Thread Eric Blake

On 11/12/2015 12:02 PM, Eduardo Habkost wrote:
> Clean up the graphics initialization code to reduce the number of
> 

Lame of git for eating lines that start with #ifdef.  But at least it
doesn't matter on the cover letter :)

> Changes v1 -> v2:
> * Patches 2-6: Move stub files to stubs/ui/
> * Patch 7: Move stubs to qemu-spice.h, as the header file
>   already had a separate section for !CONFIG_SPICE
> * Removed DT_NOGRAPHIC patches from the series (they will be sent
>   as a separate series)
> 
> Eduardo Habkost (7):
>   vl: Add DT_COCOA DisplayType value
>   stubs: VNC initialization stubs
>   stubs: curses_display_init() stub
>   stubs: SDL initialization stubs
>   stubs: cocoa_display_init() stub
>   stubs: gtk_display_init() stub
>   spice: Initialization stubs on qemu-spice.h
> 
>  include/sysemu/sysemu.h |  1 +
>  include/ui/console.h|  4 ++--
>  include/ui/qemu-spice.h | 13 +
>  stubs/Makefile.objs |  1 +
>  stubs/ui/Makefile.objs  |  5 +
>  stubs/ui/cocoa.c| 10 ++
>  stubs/ui/curses.c   | 10 ++
>  stubs/ui/gtk.c  | 10 ++
>  stubs/ui/sdl.c  | 17 +
>  stubs/ui/vnc.c  | 22 ++
>  vl.c| 37 ++---
>  11 files changed, 97 insertions(+), 33 deletions(-)
>  create mode 100644 stubs/ui/Makefile.objs
>  create mode 100644 stubs/ui/cocoa.c
>  create mode 100644 stubs/ui/curses.c
>  create mode 100644 stubs/ui/gtk.c
>  create mode 100644 stubs/ui/sdl.c
>  create mode 100644 stubs/ui/vnc.c
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] which machinetypes have an integrated/implied IDE controller?

2015-11-12 Thread Laine Stump

For a long time, libvirt assumed by default that all types of virtual 
machines had an integrated IDE controller named "ide" that wasn't 
specified on the qemu commandline. Since that caused problems 
specifically for the Q35 machine type (which has an *ahci* controller 
that it perplexingly calls "ide"), I added code to libvirt to only make 
that assumption for i440fx-based machinetypes, and to log an error and 
fail in all other cases where someone tried to create a disk attached to 
an IDE controller:


 http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=eadd757

(libvirt doesn't support explicitly adding IDE controllers on the qemu 
commandline, under the assumption that 1) there are lots of better ways 
to attach a disk and 2) nobody has asked for it up to now, and we don't 
want to encourage them to start using something that is slow and 
unnecessary).


But I just received an email from someone who informed me that the 
"ppc-beigeg3" machine type also has an IDE controller, and that 
additionally this is the *only* method of connecting a disk on this 
particular machine. So now I'm wondering how I can determine what other 
machinetypes have an integrated IDE controller, so that I can add them 
to this check. (I would also like to find out which qemu binary supports 
the "ppc-beigeg3" machinetype - I tried running "qemu-blah -M ?" for 
every qemu binary on my Fedora 22 system, and didn't see anything like 
that).

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Eric Blake

On 11/12/2015 12:56 PM, Dr. David Alan Gilbert wrote:

>> One thing I still can't understand, why the unit test in host environment 
>> shows
>> 'memcmp()' have better performance?

Have you tried running under a profiler, to see if there are hotspots or
at least get an idea of where the time is being spent?

> 
> Are you aware of any program other than QEMU that also wants to do something
> similar?  Finding whether a block of memory is zero, sounds like something
> that would be useful in lots of places, I just can't think which ones.

At least dd, cp, and probably several other utilities.  It would be nice
to post an RFE to glibc to see if they can come up with a dedicated
interface that is faster than memcmp(), although that still only helps
us when targetting a system new enough to have that interface.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH for-2.5] mac_dbdma: always initialize channel field in DBDMA_channel

2015-11-12 Thread Hervé Poussineau

dbdma_from_ch() uses channel field to return the right DBDMA object.
Previous code was working if guest OS was only using registered DMA channels.
However, it lead to QEMU crashes if guest OS was using unregistered DMA 
channels.

Signed-off-by: Hervé Poussineau 
---
 hw/misc/macio/mac_dbdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index 779683c..5ee8f02 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -557,7 +557,6 @@ void DBDMA_register_channel(void *dbdma, int nchan, 
qemu_irq irq,
 DBDMA_DPRINTF("DBDMA_register_channel 0x%x\n", nchan);
 
 ch->irq = irq;
-ch->channel = nchan;
 ch->rw = rw;
 ch->flush = flush;
 ch->io.opaque = opaque;
@@ -753,6 +752,7 @@ void* DBDMA_init (MemoryRegion **dbdma_mem)
 for (i = 0; i < DBDMA_CHANNELS; i++) {
 DBDMA_io *io = >channels[i].io;
 qemu_iovec_init(>iov, 1);
+s->channels[i].channel = i;
 }
 
 memory_region_init_io(>mem, NULL, _ops, s, "dbdma", 0x1000);
-- 
2.1.4

[Qemu-devel] [PATCH 2/3] libqos: add qemu-img presence check

2015-11-12 Thread John Snow

To allow tests to optionally exercise additional tests
that require the qemu-img tool that may not be present
in all builds.

Signed-off-by: John Snow 
---
 tests/libqos/libqos.c | 28 +++-
 tests/libqos/libqos.h |  1 +
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/tests/libqos/libqos.c b/tests/libqos/libqos.c
index 8d7c5a9..2d1a802 100644
--- a/tests/libqos/libqos.c
+++ b/tests/libqos/libqos.c
@@ -147,6 +147,23 @@ void migrate(QOSState *from, QOSState *to, const char *uri)
 set_context(to);
 }
 
+bool have_qemu_img(void)
+{
+char *rpath;
+const char *path = getenv("QTEST_QEMU_IMG");
+if (!path) {
+return false;
+}
+
+rpath = realpath(path, NULL);
+if (!rpath) {
+return false;
+} else {
+free(rpath);
+return true;
+}
+}
+
 void mkimg(const char *file, const char *fmt, unsigned size_mb)
 {
 gchar *cli;
@@ -155,13 +172,14 @@ void mkimg(const char *file, const char *fmt, unsigned 
size_mb)
 GError *err = NULL;
 char *qemu_img_path;
 gchar *out, *out2;
-char *abs_path;
+char *qemu_img_abs_path;
 
 qemu_img_path = getenv("QTEST_QEMU_IMG");
-abs_path = realpath(qemu_img_path, NULL);
-assert(qemu_img_path);
+g_assert(qemu_img_path);
+qemu_img_abs_path = realpath(qemu_img_path, NULL);
+g_assert(qemu_img_abs_path);
 
-cli = g_strdup_printf("%s create -f %s %s %uM", abs_path,
+cli = g_strdup_printf("%s create -f %s %s %uM", qemu_img_abs_path,
   fmt, file, size_mb);
 ret = g_spawn_command_line_sync(cli, , , , );
 if (err) {
@@ -183,7 +201,7 @@ void mkimg(const char *file, const char *fmt, unsigned 
size_mb)
 g_free(out);
 g_free(out2);
 g_free(cli);
-free(abs_path);
+free(qemu_img_abs_path);
 }
 
 void mkqcow2(const char *file, unsigned size_mb)
diff --git a/tests/libqos/libqos.h b/tests/libqos/libqos.h
index 492a651..ca14d2e 100644
--- a/tests/libqos/libqos.h
+++ b/tests/libqos/libqos.h
@@ -19,6 +19,7 @@ typedef struct QOSState {
 QOSState *qtest_vboot(QOSOps *ops, const char *cmdline_fmt, va_list ap);
 QOSState *qtest_boot(QOSOps *ops, const char *cmdline_fmt, ...);
 void qtest_shutdown(QOSState *qs);
+bool have_qemu_img(void);
 void mkimg(const char *file, const char *fmt, unsigned size_mb);
 void mkqcow2(const char *file, unsigned size_mb);
 void set_context(QOSState *s);
-- 
2.4.3

Re: [Qemu-devel] [PATCH 3/4] block/gluster: using new qapi schema

2015-11-12 Thread Eric Blake

On 11/12/2015 03:22 AM, Prasanna Kumar Kalever wrote:
> this patch adds GlusterConf to qapi/block-core.json
> 
> Signed-off-by: Prasanna Kumar Kalever 
> ---
>  block/gluster.c  | 104 
> +--
>  qapi/block-core.json |  60 +++--
>  2 files changed, 109 insertions(+), 55 deletions(-)

One more comment:

> -static int qemu_gluster_parseuri(GlusterConf *gconf, const char *filename)
> +static int qemu_gluster_parseuri(BlockdevOptionsGluster **pgconf,
> + const char *filename)
>  {
> +BlockdevOptionsGluster *gconf;
>  URI *uri;
>  QueryParams *qp = NULL;
>  bool is_unix = false;
> @@ -155,20 +141,24 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, 
> const char *filename)
>  return -EINVAL;

If we hit this early return, then *pgconf was never assigned...


> +static struct glfs *qemu_gluster_init(BlockdevOptionsGluster **pgconf,
> +  const char *filename, Error **errp)
>  {
> -struct glfs *glfs = NULL;
> +struct glfs *glfs;
>  int ret;
>  int old_errno;
> +BlockdevOptionsGluster *gconf;

but here, gconf is uninitialized,

>  
> -ret = qemu_gluster_parseuri(gconf, filename);
> +ret = qemu_gluster_parseuri(, filename);
>  if (ret < 0) {
>  error_setg(errp, "Usage: file=gluster[+transport]://[host[:port]]/"
>   "volume/path[?socket=...]");

which means we can goto out with it uninitialized...

> @@ -224,8 +225,9 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, 
> const char *filename,
>  goto out;
>  }
>  
> -ret = glfs_set_volfile_server(glfs, gconf->transport, gconf->host,
> -gconf->port);
> +ret = glfs_set_volfile_server(glfs,
> +  
> GlusterTransport_lookup[gconf->server->transport],
> +  gconf->server->host, gconf->server->port);
>  if (ret < 0) {
>  goto out;
>  }

...vs. here where we can goto out with it initialized.

So whatever solution you use to plug the leak must be careful to not
free uninitialized memory.  Easiest solution - initialize gconf to NULL
before qemu_gluster_parseuri (or else go back to a *gconf parameter
rather than **pgconf).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH for-2.5 1/2] input: Avoid CamelCase in InputEvent enums

2015-11-12 Thread Markus Armbruster

Eric Blake  writes:

> Our documentation states that we prefer 'lower-case', rather than
> 'CamelCase', for qapi enum values.  The InputButton and InputAxis
> enums violated this convention.  However, they are currently used
> primarily for generating code that is used internally; their only
> exposure through QMP is via the experimental 'x-input-send-event'
> command.  Since this is experimental, changing the QMP wire format
> for that command is acceptable.
>
> The existing c_enum_const() code in the generator for turning the
> enum names into C constants happens to munge both pre- and
> post-patch spellings to the same C code, which means making the
> change now touches very few files.  But we are considering a
> future patch which would change c_enum_const() to use
> c_name(V).upper() rather than camel_to_upper(), which would render
> 'WheelUp' as INPUT_BUTTON_WHEELUP instead of its current
> INPUT_BUTTON_WHEEL_UP.  Making the change to the enum values now
> will isolate these enums from any impact if the generator munging
> algorithm is changed.
>
> Note that SDL code uses the spelling WHEELUP rather than WHEEL_UP
> in its constants, but that shouldn't drive our decision.
>
> Fix a typo in the qapi docs for InputAxis while at it.
>
> CC: Gerd Hoffmann 
> Signed-off-by: Eric Blake 

Reviewed-by: Markus Armbruster 

I can take this through my tree if Gerd doesn't object.

Re: [Qemu-devel] [PATCH for-2.5 2/2] input: Promote 'input-send-event' to stable API

2015-11-12 Thread Markus Armbruster

Eric Blake  writes:

> We've had 'x-input-send-event' since 2.3, with no further
> changes to the interface other than tweaks in the previous patch
> to the spelling of the enum constants ('X' and 'WheelUp' changed
> to 'x' and 'wheel-up').
>
> What's more, changing the spelling of enum constants is not easy
> to introspect prior to 2.5; so a client that was relying on the
> experimental command can't easily tell which spelling is expected.
> But 'query-commands' works in all qemu versions that supported
> the command, so renaming the command now makes it an easy thing
> to determine which spelling of the enum values to use.
>
> Thus, it's time to promote this interface to stable.

The x- goes back to commit df5b2ad:

input: move input-send-event into experimental namespace

Ongoing discussions on how we are going to specify the console,
so tag the command as experiental so we can refine things in
the 2.3 development cycle.

Have we settled "how we are going to specify the console"?  If yes,
commit, please.  If no, I'm afraid the command should stay experimental.

[Qemu-devel] [PULL v2 02/12] e1000: Cosmetic and alignment fixes

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

This fixes some alignment and cosmetic issues. The changes are made
in order that the following patches in this series will look like
integral parts of the code surrounding them, while conforming to the
coding style. Although some changes in unrelated areas are also made.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c  | 166 
 hw/net/e1000_regs.h |   2 +-
 2 files changed, 89 insertions(+), 79 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 910de3a..da72776 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -41,20 +41,20 @@
 
 #ifdef E1000_DEBUG
 enum {
-DEBUG_GENERAL, DEBUG_IO,   DEBUG_MMIO, DEBUG_INTERRUPT,
-DEBUG_RX,  DEBUG_TX,   DEBUG_MDIC, DEBUG_EEPROM,
-DEBUG_UNKNOWN, DEBUG_TXSUM,DEBUG_TXERR,DEBUG_RXERR,
+DEBUG_GENERAL,  DEBUG_IO,   DEBUG_MMIO, DEBUG_INTERRUPT,
+DEBUG_RX,   DEBUG_TX,   DEBUG_MDIC, DEBUG_EEPROM,
+DEBUG_UNKNOWN,  DEBUG_TXSUM,DEBUG_TXERR,DEBUG_RXERR,
 DEBUG_RXFILTER, DEBUG_PHY,  DEBUG_NOTYET,
 };
-#define DBGBIT(x)  (1<>2)
+#define defreg(x)x = (E1000_##x>>2)
 enum {
-defreg(CTRL),  defreg(EECD),   defreg(EERD),   defreg(GPRC),
-defreg(GPTC),  defreg(ICR),defreg(ICS),defreg(IMC),
-defreg(IMS),   defreg(LEDCTL), defreg(MANC),   defreg(MDIC),
-defreg(MPC),   defreg(PBA),defreg(RCTL),   defreg(RDBAH),
-defreg(RDBAL), defreg(RDH),defreg(RDLEN),  defreg(RDT),
-defreg(STATUS),defreg(SWSM),   defreg(TCTL),   defreg(TDBAH),
-defreg(TDBAL), defreg(TDH),defreg(TDLEN),  defreg(TDT),
-defreg(TORH),  defreg(TORL),   defreg(TOTH),   defreg(TOTL),
-defreg(TPR),   defreg(TPT),defreg(TXDCTL), defreg(WUFC),
-defreg(RA),defreg(MTA),defreg(CRCERRS),defreg(VFTA),
-defreg(VET),defreg(RDTR),   defreg(RADV),   defreg(TADV),
+defreg(CTRL),defreg(EECD),defreg(EERD),defreg(GPRC),
+defreg(GPTC),defreg(ICR), defreg(ICS), defreg(IMC),
+defreg(IMS), defreg(LEDCTL),  defreg(MANC),defreg(MDIC),
+defreg(MPC), defreg(PBA), defreg(RCTL),defreg(RDBAH),
+defreg(RDBAL),   defreg(RDH), defreg(RDLEN),   defreg(RDT),
+defreg(STATUS),  defreg(SWSM),defreg(TCTL),defreg(TDBAH),
+defreg(TDBAL),   defreg(TDH), defreg(TDLEN),   defreg(TDT),
+defreg(TORH),defreg(TORL),defreg(TOTH),defreg(TOTL),
+defreg(TPR), defreg(TPT), defreg(TXDCTL),  defreg(WUFC),
+defreg(RA),  defreg(MTA), defreg(CRCERRS), defreg(VFTA),
+defreg(VET), defreg(RDTR),defreg(RADV),defreg(TADV),
 defreg(ITR),
 };
 
@@ -226,18 +226,18 @@ enum { NPHYWRITEOPS = ARRAY_SIZE(phyreg_writeops) };
 
 enum { PHY_R = 1, PHY_W = 2, PHY_RW = PHY_R | PHY_W };
 static const char phy_regcap[0x20] = {
-[PHY_STATUS] = PHY_R,  [M88E1000_EXT_PHY_SPEC_CTRL] = PHY_RW,
-[PHY_ID1] = PHY_R, [M88E1000_PHY_SPEC_CTRL] = PHY_RW,
-[PHY_CTRL] = PHY_RW,   [PHY_1000T_CTRL] = PHY_RW,
-[PHY_LP_ABILITY] = PHY_R,  [PHY_1000T_STATUS] = PHY_R,
-[PHY_AUTONEG_ADV] = PHY_RW,[M88E1000_RX_ERR_CNTR] = PHY_R,
-[PHY_ID2] = PHY_R, [M88E1000_PHY_SPEC_STATUS] = PHY_R,
+[PHY_STATUS]  = PHY_R, [M88E1000_EXT_PHY_SPEC_CTRL] = PHY_RW,
+[PHY_ID1] = PHY_R, [M88E1000_PHY_SPEC_CTRL] = PHY_RW,
+[PHY_CTRL]= PHY_RW,[PHY_1000T_CTRL] = PHY_RW,
+[PHY_LP_ABILITY]  = PHY_R, [PHY_1000T_STATUS]   = PHY_R,
+[PHY_AUTONEG_ADV] = PHY_RW,[M88E1000_RX_ERR_CNTR]   = PHY_R,
+[PHY_ID2] = PHY_R, [M88E1000_PHY_SPEC_STATUS]   = PHY_R,
 [PHY_AUTONEG_EXP] = PHY_R,
 };
 
 /*

[Qemu-devel] [PULL v2 04/12] e1000: Introduced an array to control the access to the MAC registers

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

The array of uint8_t's which is introduced here, contains access metadata
about the MAC registers: if a register is accessible, but partly implemented,
or if a register requires a certain compatibility flag in order to be
accessed. Currently, 6 hypothetical flags are supported (3 exist for e1000
so far) but in the future, if more than 6 flags will be needed, the datatype
of this array can simply be swapped for a larger one.

This patch is intended to solve the following current problems:

1) In a scenario of migration between different versions of QEMU, which
differ by the MAC registers implemented in them, some registers need not to
be active if a compatibility flag is set, in order to preserve the machine's
state perfectly for the older version. Checking this for each register
individually, would create a lot of clutter in the code.

2) Some registers are (or may be) only partly implemented (e.g.
placeholders that allow reading and writing, but lack other functions).
In such cases it is better to print a debug warning on read/write attempts.
As above, dealing with this functionality on a per-register level, would
require longer and more messy code.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c | 58 ++
 1 file changed, 46 insertions(+), 12 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 7088027..e079f25 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -142,6 +142,8 @@ typedef struct E1000State_st {
 uint32_t compat_flags;
 } E1000State;
 
+#define chkflag(x) (s->compat_flags & E1000_FLAG_##x)
+
 typedef struct E1000BaseClass {
 PCIDeviceClass parent_class;
 uint16_t phy_id2;
@@ -195,8 +197,7 @@ e1000_link_up(E1000State *s)
 static bool
 have_autoneg(E1000State *s)
 {
-return (s->compat_flags & E1000_FLAG_AUTONEG) &&
-   (s->phy_reg[PHY_CTRL] & MII_CR_AUTO_NEG_EN);
+return chkflag(AUTONEG) && (s->phy_reg[PHY_CTRL] & MII_CR_AUTO_NEG_EN);
 }
 
 static void
@@ -321,7 +322,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
 if (s->mit_timer_on) {
 return;
 }
-if (s->compat_flags & E1000_FLAG_MIT) {
+if (chkflag(MIT)) {
 /* Compute the next mitigation delay according to pending
  * interrupts and the current values of RADV (provided
  * RDTR!=0), TADV and ITR.
@@ -1258,6 +1259,18 @@ static void (*macreg_writeops[])(E1000State *, int, 
uint32_t) = {
 
 enum { NWRITEOPS = ARRAY_SIZE(macreg_writeops) };
 
+enum { MAC_ACCESS_PARTIAL = 1, MAC_ACCESS_FLAG_NEEDED = 2 };
+
+#define markflag(x)((E1000_FLAG_##x << 2) | MAC_ACCESS_FLAG_NEEDED)
+/* In the array below the meaning of the bits is: [f|f|f|f|f|f|n|p]
+ * f - flag bits (up to 6 possible flags)
+ * n - flag needed
+ * p - partially implenented */
+static const uint8_t mac_reg_access[0x8000] = {
+[RDTR]= markflag(MIT),[TADV]= markflag(MIT),
+[RADV]= markflag(MIT),[ITR] = markflag(MIT),
+};
+
 static void
 e1000_mmio_write(void *opaque, hwaddr addr, uint64_t val,
  unsigned size)
@@ -1266,9 +1279,20 @@ e1000_mmio_write(void *opaque, hwaddr addr, uint64_t val,
 unsigned int index = (addr & 0x1) >> 2;
 
 if (index < NWRITEOPS && macreg_writeops[index]) {
-macreg_writeops[index](s, index, val);
+if (!(mac_reg_access[index] & MAC_ACCESS_FLAG_NEEDED)
+|| (s->compat_flags & (mac_reg_access[index] >> 2))) {
+if (mac_reg_access[index] & MAC_ACCESS_PARTIAL) {
+DBGOUT(GENERAL, "Writing to register at offset: 0x%08x. "
+   "It is not fully implemented.\n", index<<2);
+}
+macreg_writeops[index](s, index, val);
+} else {/* "flag needed" bit is set, but the flag is not active */
+DBGOUT(MMIO, "MMIO write attempt to disabled reg. addr=0x%08x\n",
+   index<<2);
+}
 } else if (index < NREADOPS && macreg_readops[index]) {
-DBGOUT(MMIO, "e1000_mmio_writel RO %x: 0x%04"PRIx64"\n", index<<2, 
val);
+DBGOUT(MMIO, "e1000_mmio_writel RO %x: 0x%04"PRIx64"\n",
+   index<<2, val);
 } else {
 DBGOUT(UNKNOWN, "MMIO unknown write addr=0x%08x,val=0x%08"PRIx64"\n",
index<<2, val);
@@ -1281,11 +1305,21 @@ e1000_mmio_read(void *opaque, hwaddr addr, unsigned 
size)
 E1000State *s = opaque;
 unsigned int index = (addr & 0x1) >> 2;
 
-if (index < NREADOPS && macreg_readops[index])
-{
-return macreg_readops[index](s, index);
+if (index < NREADOPS && macreg_readops[index]) {
+if (!(mac_reg_access[index] & MAC_ACCESS_FLAG_NEEDED)
+|| (s->compat_flags & (mac_reg_access[index] >> 2))) {

[Qemu-devel] [PULL v2 01/12] slirp: Fix type casts and format strings in debug code

2015-11-12 Thread Jason Wang

From: Stefan Weil 

Casting pointers to long won't work on 64 bit Windows.
It is not needed with the right format strings.

Signed-off-by: Stefan Weil 
Signed-off-by: Jason Wang 
---
 slirp/bootp.c  | 12 +---
 slirp/if.c |  4 ++--
 slirp/ip_icmp.c|  4 ++--
 slirp/ip_input.c   | 10 +-
 slirp/ip_output.c  |  4 ++--
 slirp/mbuf.c   |  6 +++---
 slirp/misc.c   |  6 +++---
 slirp/sbuf.c   |  4 ++--
 slirp/socket.c | 18 +-
 slirp/tcp_input.c  | 14 +++---
 slirp/tcp_output.c |  2 +-
 slirp/tcp_subr.c   | 16 
 slirp/udp.c|  6 +++---
 13 files changed, 56 insertions(+), 50 deletions(-)

diff --git a/slirp/bootp.c b/slirp/bootp.c
index b7db9fa..1baaab1 100644
--- a/slirp/bootp.c
+++ b/slirp/bootp.c
@@ -23,6 +23,12 @@
  */
 #include 
 
+#if defined(_WIN32)
+/* Windows ntohl() returns an u_long value.
+ * Add a type cast to match the format strings. */
+# define ntohl(n) ((uint32_t)ntohl(n))
+#endif
+
 /* XXX: only DHCP is supported */
 
 #define LEASE_TIME (24 * 3600)
@@ -155,7 +161,7 @@ static void bootp_reply(Slirp *slirp, const struct bootp_t 
*bp)
 dhcp_decode(bp, _msg_type, _addr);
 DPRINTF("bootp packet op=%d msgtype=%d", bp->bp_op, dhcp_msg_type);
 if (preq_addr.s_addr != htonl(0L))
-DPRINTF(" req_addr=%08x\n", ntohl(preq_addr.s_addr));
+DPRINTF(" req_addr=%08" PRIx32 "\n", ntohl(preq_addr.s_addr));
 else
 DPRINTF("\n");
 
@@ -234,7 +240,7 @@ static void bootp_reply(Slirp *slirp, const struct bootp_t 
*bp)
 q += 4;
 
 if (bc) {
-DPRINTF("%s addr=%08x\n",
+DPRINTF("%s addr=%08" PRIx32 "\n",
 (dhcp_msg_type == DHCPDISCOVER) ? "offered" : "ack'ed",
 ntohl(daddr.sin_addr.s_addr));
 
@@ -302,7 +308,7 @@ static void bootp_reply(Slirp *slirp, const struct bootp_t 
*bp)
 } else {
 static const char nak_msg[] = "requested address not available";
 
-DPRINTF("nak'ed addr=%08x\n", ntohl(preq_addr.s_addr));
+DPRINTF("nak'ed addr=%08" PRIx32 "\n", ntohl(preq_addr.s_addr));
 
 *q++ = RFC2132_MSG_TYPE;
 *q++ = 1;
diff --git a/slirp/if.c b/slirp/if.c
index fb7acf8..8325a2a 100644
--- a/slirp/if.c
+++ b/slirp/if.c
@@ -53,8 +53,8 @@ if_output(struct socket *so, struct mbuf *ifm)
int on_fastq = 1;
 
DEBUG_CALL("if_output");
-   DEBUG_ARG("so = %lx", (long)so);
-   DEBUG_ARG("ifm = %lx", (long)ifm);
+   DEBUG_ARG("so = %p", so);
+   DEBUG_ARG("ifm = %p", ifm);
 
/*
 * First remove the mbuf from m_usedlist,
diff --git a/slirp/ip_icmp.c b/slirp/ip_icmp.c
index 9f1cb08..23b9f0f 100644
--- a/slirp/ip_icmp.c
+++ b/slirp/ip_icmp.c
@@ -125,7 +125,7 @@ icmp_input(struct mbuf *m, int hlen)
   Slirp *slirp = m->slirp;
 
   DEBUG_CALL("icmp_input");
-  DEBUG_ARG("m = %lx", (long )m);
+  DEBUG_ARG("m = %p", m);
   DEBUG_ARG("m_len = %d", m->m_len);
 
   /*
@@ -252,7 +252,7 @@ icmp_error(struct mbuf *msrc, u_char type, u_char code, int 
minsize,
   register struct mbuf *m;
 
   DEBUG_CALL("icmp_error");
-  DEBUG_ARG("msrc = %lx", (long )msrc);
+  DEBUG_ARG("msrc = %p", msrc);
   DEBUG_ARG("msrc_len = %d", msrc->m_len);
 
   if(type!=ICMP_UNREACH && type!=ICMP_TIMXCEED) goto end_error;
diff --git a/slirp/ip_input.c b/slirp/ip_input.c
index 880bdfd..7d436e6 100644
--- a/slirp/ip_input.c
+++ b/slirp/ip_input.c
@@ -80,7 +80,7 @@ ip_input(struct mbuf *m)
int hlen;
 
DEBUG_CALL("ip_input");
-   DEBUG_ARG("m = %lx", (long)m);
+   DEBUG_ARG("m = %p", m);
DEBUG_ARG("m_len = %d", m->m_len);
 
if (m->m_len < sizeof (struct ip)) {
@@ -232,9 +232,9 @@ ip_reass(Slirp *slirp, struct ip *ip, struct ipq *fp)
int i, next;
 
DEBUG_CALL("ip_reass");
-   DEBUG_ARG("ip = %lx", (long)ip);
-   DEBUG_ARG("fp = %lx", (long)fp);
-   DEBUG_ARG("m = %lx", (long)m);
+   DEBUG_ARG("ip = %p", ip);
+   DEBUG_ARG("fp = %p", fp);
+   DEBUG_ARG("m = %p", m);
 
/*
 * Presence of header sizes in mbufs
@@ -400,7 +400,7 @@ static void
 ip_enq(register struct ipasfrag *p, register struct ipasfrag *prev)
 {
DEBUG_CALL("ip_enq");
-   DEBUG_ARG("prev = %lx", (long)prev);
+   DEBUG_ARG("prev = %p", prev);
p->ipf_prev =  prev;
p->ipf_next = prev->ipf_next;
((struct ipasfrag *)(prev->ipf_next))->ipf_prev = p;
diff --git a/slirp/ip_output.c b/slirp/ip_output.c
index c82830f..1254d0d 100644
--- a/slirp/ip_output.c
+++ b/slirp/ip_output.c
@@ -60,8 +60,8 @@ ip_output(struct socket *so, struct mbuf *m0)
int len, off, error = 0;
 
DEBUG_CALL("ip_output");
-   DEBUG_ARG("so = %lx", (long)so);
-   DEBUG_ARG("m0 = %lx", (long)m0);
+   DEBUG_ARG("so = %p", so);
+   DEBUG_ARG("m0 = %p", m0);
 
ip = mtod(m, struct ip *);
/*
diff --git a/slirp/mbuf.c b/slirp/mbuf.c
index

[Qemu-devel] [PULL v2 11/12] net: netmap: Fix compilation issue

2015-11-12 Thread Jason Wang

From: Vincenzo Maffione 

Reorganization of struct NetClientOptions (commit e4ba22b) caused a
compilation failure of the netmap backend. This patch fixes the issue
by properly accessing the union field.

Reviewed-by: Eric Blake 
Reviewed-by: Markus Armbruster 
Signed-off-by: Vincenzo Maffione 
Signed-off-by: Jason Wang 
---
 net/netmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netmap.c b/net/netmap.c
index 508b829..4197a9c 100644
--- a/net/netmap.c
+++ b/net/netmap.c
@@ -439,7 +439,7 @@ int net_init_netmap(const NetClientOptions *opts,
 const char *name, NetClientState *peer, Error **errp)
 {
 /* FIXME error_setg(errp, ...) on failure */
-const NetdevNetmapOptions *netmap_opts = opts->netmap;
+const NetdevNetmapOptions *netmap_opts = opts->u.netmap;
 NetClientState *nc;
 NetmapPriv me;
 NetmapState *s;
-- 
2.1.4

Re: [Qemu-devel] [PATCH 04/10] snapshot: create bdrv_all_goto_snapshot helper

2015-11-12 Thread Fam Zheng

On Tue, 11/10 17:25, Denis V. Lunev wrote:
> to switch to snapshot on all loaded block drivers.
> 
> The patch also ensures proper locking.
> 
> Signed-off-by: Denis V. Lunev 
> Reviewed-by: Greg Kurz 
> CC: Juan Quintela 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> ---
>  block/snapshot.c | 20 
>  include/block/snapshot.h |  1 +
>  migration/savevm.c   | 15 +--
>  3 files changed, 26 insertions(+), 10 deletions(-)
> 
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 61a6ad1..9f07a63 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -403,3 +403,23 @@ int bdrv_all_delete_snapshot(const char *name, 
> BlockDriverState **first_bad_bs,
>  *first_bad_bs = bs;
>  return ret;
>  }
> +
> +
> +int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs)
> +{
> +int err = 0;
> +BlockDriverState *bs = NULL;
> +
> +while (err == 0 && (bs = bdrv_next(bs))) {
> +AioContext *ctx = bdrv_get_aio_context(bs);
> +
> +aio_context_acquire(ctx);
> +if (bdrv_can_snapshot(bs)) {
> +err = bdrv_snapshot_goto(bs, name);
> +}
> +aio_context_release(ctx);
> +}
> +
> +*first_bad_bs = bs;
> +return err;
> +}
> diff --git a/include/block/snapshot.h b/include/block/snapshot.h
> index d02d2b1..0a176c7 100644
> --- a/include/block/snapshot.h
> +++ b/include/block/snapshot.h
> @@ -84,5 +84,6 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState 
> *bs,
>  bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs);
>  int bdrv_all_delete_snapshot(const char *name, BlockDriverState 
> **first_bsd_bs,
>   Error **err);
> +int bdrv_all_goto_snapshot(const char *name, BlockDriverState 
> **first_bsd_bs);
>  
>  #endif
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 1157a6f..d18ff13 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1425,16 +1425,11 @@ int load_vmstate(const char *name)
>  /* Flush all IO requests so they don't interfere with the new state.  */
>  bdrv_drain_all();
>  
> -bs = NULL;
> -while ((bs = bdrv_next(bs))) {
> -if (bdrv_can_snapshot(bs)) {
> -ret = bdrv_snapshot_goto(bs, name);
> -if (ret < 0) {
> -error_report("Error %d while activating snapshot '%s' on 
> '%s'",
> - ret, name, bdrv_get_device_name(bs));
> -return ret;
> -}
> -}
> +ret = bdrv_all_goto_snapshot(name, );
> +if (ret < 0) {
> +error_report("Error %d while activating snapshot '%s' on '%s'",
> + ret, name, bdrv_get_device_name(bs));
> +return ret;

Maybe more friendlily strerror(ret)?

>  }
>  
>  /* restore the VM state */
> -- 
> 2.5.0
> 
>

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Paolo Bonzini

On 12/11/2015 03:49, Li, Liang Z wrote:
> I am very surprised about the live migration performance  result when
> I use your ' memeqzero4_paolo' instead of these SSE2 Intrinsics to
> check the zero pages.

What code were you using?  Remember I suggested using only unsigned long
checks, like

unsigned long *p = ...
if (p[0] || p[1] || p[2] || p[3]
|| memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
return BUFFER_NOT_ZERO;
else
return BUFFER_ZERO;

> The total live migration time increased about
> 8%!   Not decreased.  Although in the unit test your '
> memeqzero4_paolo'  has better performance, any idea?

You only tested the case of zero pages.  But real pages usually are not
zero, even if they have a few zero bytes at the beginning.  It's very
important to optimize the initial check before the memcmp call.

Paolo

Re: [Qemu-devel] [PATCH 0/3] Postcopy minor fixes

2015-11-12 Thread Dr. David Alan Gilbert

* Christian Borntraeger (borntrae...@de.ibm.com) wrote:
> Am 11.11.2015 um 15:02 schrieb Dr. David Alan Gilbert (git):
> > From: "Dr. David Alan Gilbert" 
> > 
> > Hi,
> >   These are three small fixes for the postcopy code;
> > the first two coming from Bharata's testing on Power, and
> > the last one being a text fixup that Eric asked for.
> > 
> > The first only affects setups with another iterable
> > device (e.g. Power's htab and block migration).
> >   (Symptom: An error about an unreasonable large package)
> > 
> > The second only affects systems with smaller target pages
> > than host pages and makes them properly map zero pages.
> >   (Symptom: Zero pages are fully mapped on the destination)
> > 
> > Dave
> > 
> > Dr. David Alan Gilbert (3):
> >   Finish non-postcopiable iterative devices before package
> >   Postcopy: Fix TP!=HP zero case
> >   migrate-start-postcopy: Improve text
> > 
> >  hmp-commands.hx |  4 +++-
> >  include/sysemu/sysemu.h |  2 +-
> >  migration/migration.c   | 10 --
> >  migration/ram.c |  2 +-
> >  migration/savevm.c  | 10 --
> >  qapi-schema.json|  3 ++-
> >  6 files changed, 23 insertions(+), 8 deletions(-)
> > 
> 
> When you are at it:
> I have another glitch in the help text:
> 
> (qemu) migrate_start_postcopy 
> Enable postcopy with migration_set_capability before the start of migration
>   
> this should be   migrate_set_capability

Oops, thanks!

Dave

> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH 09/12] vl: Replace DT_NOGRAPHIC with MachineState field

2015-11-12 Thread Paolo Bonzini



On 11/11/2015 20:09, Eduardo Habkost wrote:
> All DisplayType values are just UI options that don't affect any
> hardware emulation code, except for DT_NOGRAPHIC. Replace
> DT_NOGRAPHIC with DT_NONE plus a new MachineState.nographic
> field, so hardware emulation code don't need to use the
> display_type variable.
> 
> Cc: Michael Walle 
> Cc: Blue Swirl 
> Cc: Mark Cave-Ayland 
> Signed-off-by: Eduardo Habkost 

Can you add a QOM property too, so that "-machine graphics=yes|no" can
be used?

Paolo

> ---
>  hw/lm32/milkymist.c |  2 +-
>  hw/nvram/fw_cfg.c   |  6 --
>  hw/sparc/sun4m.c|  2 +-
>  include/hw/boards.h |  1 +
>  include/sysemu/sysemu.h |  1 -
>  vl.c| 12 ++--
>  6 files changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/lm32/milkymist.c b/hw/lm32/milkymist.c
> index e46283a..947c7db 100644
> --- a/hw/lm32/milkymist.c
> +++ b/hw/lm32/milkymist.c
> @@ -163,7 +163,7 @@ milkymist_init(MachineState *machine)
>  milkymist_memcard_create(0x60004000);
>  milkymist_ac97_create(0x60005000, irq[4], irq[5], irq[6], irq[7]);
>  milkymist_pfpu_create(0x60006000, irq[8]);
> -if (display_type != DT_NOGRAPHIC) {
> +if (!machine->nographic) {
>  milkymist_tmu2_create(0x60007000, irq[9]);
>  }
>  milkymist_minimac2_create(0x60008000, 0x3000, irq[10], irq[11]);
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 73b0a81..e42b198 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -24,6 +24,7 @@
>  #include "hw/hw.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/dma.h"
> +#include "hw/boards.h"
>  #include "hw/isa/isa.h"
>  #include "hw/nvram/fw_cfg.h"
>  #include "hw/sysbus.h"
> @@ -755,16 +756,17 @@ static void fw_cfg_machine_ready(struct Notifier *n, 
> void *data)
>  static void fw_cfg_init1(DeviceState *dev)
>  {
>  FWCfgState *s = FW_CFG(dev);
> +MachineState *machine = MACHINE(qdev_get_machine());
>  
>  assert(!object_resolve_path(FW_CFG_PATH, NULL));
>  
> -object_property_add_child(qdev_get_machine(), FW_CFG_NAME, OBJECT(s), 
> NULL);
> +object_property_add_child(OBJECT(machine), FW_CFG_NAME, OBJECT(s), NULL);
>  
>  qdev_init_nofail(dev);
>  
>  fw_cfg_add_bytes(s, FW_CFG_SIGNATURE, (char *)"QEMU", 4);
>  fw_cfg_add_bytes(s, FW_CFG_UUID, qemu_uuid, 16);
> -fw_cfg_add_i16(s, FW_CFG_NOGRAPHIC, (uint16_t)(display_type == 
> DT_NOGRAPHIC));
> +fw_cfg_add_i16(s, FW_CFG_NOGRAPHIC, (uint16_t)machine->nographic);
>  fw_cfg_add_i16(s, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
>  fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
>  fw_cfg_bootsplash(s);
> diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
> index 230dac9..d47f06a 100644
> --- a/hw/sparc/sun4m.c
> +++ b/hw/sparc/sun4m.c
> @@ -1017,7 +1017,7 @@ static void sun4m_hw_init(const struct sun4m_hwdef 
> *hwdef,
>  slavio_timer_init_all(hwdef->counter_base, slavio_irq[19], 
> slavio_cpu_irq, smp_cpus);
>  
>  slavio_serial_ms_kbd_init(hwdef->ms_kb_base, slavio_irq[14],
> -  display_type == DT_NOGRAPHIC, ESCC_CLOCK, 1);
> +  machine->nographic, ESCC_CLOCK, 1);
>  /* Slavio TTYA (base+4, Linux ttyS0) is the first QEMU serial device
> Slavio TTYB (base+0, Linux ttyS1) is the second QEMU serial device */
>  escc_init(hwdef->serial_base, slavio_irq[15], slavio_irq[15],
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 3e9a92c..1353f8a 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -120,6 +120,7 @@ struct MachineState {
>  char *firmware;
>  bool iommu;
>  bool suppress_vmdesc;
> +bool nographic;
>  
>  ram_addr_t ram_size;
>  ram_addr_t maxram_size;
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 0f4e520..f92a53c 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -139,7 +139,6 @@ typedef enum DisplayType
>  DT_SDL,
>  DT_COCOA,
>  DT_GTK,
> -DT_NOGRAPHIC,
>  DT_NONE,
>  } DisplayType;
>  
> diff --git a/vl.c b/vl.c
> index 57064ea..5d0228b 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2980,6 +2980,7 @@ int main(int argc, char **argv, char **envp)
>  int show_vnc_port = 0;
>  bool defconfig = true;
>  bool userconfig = true;
> +bool nographic = false;
>  const char *log_mask = NULL;
>  const char *log_file = NULL;
>  const char *trace_events = NULL;
> @@ -3226,7 +3227,8 @@ int main(int argc, char **argv, char **envp)
>  display_type = select_display(optarg);
>  break;
>  case QEMU_OPTION_nographic:
> -display_type = DT_NOGRAPHIC;
> +nographic = true;
> +display_type = DT_NONE;
>  break;
>  case QEMU_OPTION_curses:
>  #ifdef CONFIG_CURSES
> @@

Re: [Qemu-devel] [PATCH v13 3/3] block/gluster: add support for multiple gluster servers

2015-11-12 Thread Prasanna Kumar Kalever

On Tuesday, November 10, 2015 9:37:20 PM, Eric Blake wrote:
> 
> On 11/10/2015 02:09 AM, Prasanna Kumar Kalever wrote:
> > This patch adds a way to specify multiple volfile servers to the gluster
> > block backend of QEMU with tcp|rdma transport types and their port numbers.
> > 
> 
> [...]

[...]

> 
> Overall, I think we are probably on the right track for the QMP
> interface; but since blockdev-add is NOT stable yet for 2.5, it won't
> hurt to wait to get this in until 2.6, to make sure we have plenty of
> time; and it would also be nice to make sure we get nbd, nfs, rbd,
> sheepdog all supported in the same release; possibly by sharing common
> types instead of introducing GlusterServer as a one-off type.

We are hoping this to go in 2.5 which is really important for gluster
hyper-convergence release (next Feb).

Is there any possibility of getting exception for this patch ?

Thanks,
-Prasanna

> 
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 
>

[Qemu-devel] [PULL v2 12/12] net: netmap: use error_setg() helpers in place of error_report()

2015-11-12 Thread Jason Wang

From: Vincenzo Maffione 

This update was required to align error reporting of netmap backend
initialization to the modifications introduced by commit a30ecde.

Reviewed-by: Eric Blake 
Reviewed-by: Markus Armbruster 
Signed-off-by: Vincenzo Maffione 
Signed-off-by: Jason Wang 
---
 net/netmap.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/net/netmap.c b/net/netmap.c
index 4197a9c..5558368 100644
--- a/net/netmap.c
+++ b/net/netmap.c
@@ -90,7 +90,7 @@ pkt_copy(const void *_src, void *_dst, int l)
  * Open a netmap device. We assume there is only one queue
  * (which is the case for the VALE bridge).
  */
-static int netmap_open(NetmapPriv *me)
+static void netmap_open(NetmapPriv *me, Error **errp)
 {
 int fd;
 int err;
@@ -99,9 +99,8 @@ static int netmap_open(NetmapPriv *me)
 
 me->fd = fd = open(me->fdname, O_RDWR);
 if (fd < 0) {
-error_report("Unable to open netmap device '%s' (%s)",
-me->fdname, strerror(errno));
-return -1;
+error_setg_file_open(errp, errno, me->fdname);
+return;
 }
 memset(, 0, sizeof(req));
 pstrcpy(req.nr_name, sizeof(req.nr_name), me->ifname);
@@ -109,15 +108,14 @@ static int netmap_open(NetmapPriv *me)
 req.nr_version = NETMAP_API;
 err = ioctl(fd, NIOCREGIF, );
 if (err) {
-error_report("Unable to register %s: %s", me->ifname, strerror(errno));
+error_setg_errno(errp, errno, "Unable to register %s", me->ifname);
 goto error;
 }
 l = me->memsize = req.nr_memsize;
 
 me->mem = mmap(0, l, PROT_WRITE | PROT_READ, MAP_SHARED, fd, 0);
 if (me->mem == MAP_FAILED) {
-error_report("Unable to mmap netmap shared memory: %s",
-strerror(errno));
+error_setg_errno(errp, errno, "Unable to mmap netmap shared memory");
 me->mem = NULL;
 goto error;
 }
@@ -125,11 +123,11 @@ static int netmap_open(NetmapPriv *me)
 me->nifp = NETMAP_IF(me->mem, req.nr_offset);
 me->tx = NETMAP_TXRING(me->nifp, 0);
 me->rx = NETMAP_RXRING(me->nifp, 0);
-return 0;
+
+return;
 
 error:
 close(me->fd);
-return -1;
 }
 
 static void netmap_send(void *opaque);
@@ -438,9 +436,9 @@ static NetClientInfo net_netmap_info = {
 int net_init_netmap(const NetClientOptions *opts,
 const char *name, NetClientState *peer, Error **errp)
 {
-/* FIXME error_setg(errp, ...) on failure */
 const NetdevNetmapOptions *netmap_opts = opts->u.netmap;
 NetClientState *nc;
+Error *err = NULL;
 NetmapPriv me;
 NetmapState *s;
 
@@ -448,7 +446,9 @@ int net_init_netmap(const NetClientOptions *opts,
 netmap_opts->has_devname ? netmap_opts->devname : "/dev/netmap");
 /* Set default name for the port if not supplied. */
 pstrcpy(me.ifname, sizeof(me.ifname), netmap_opts->ifname);
-if (netmap_open()) {
+netmap_open(, );
+if (err) {
+error_propagate(errp, err);
 return -1;
 }
 /* Create the object. */
-- 
2.1.4

Re: [Qemu-devel] [PATCH v11 19/28] qapi: Change munging of CamelCase enum values

2015-11-12 Thread Gerd Hoffmann

  Hi,

> > If desired, I can prepare an alternate patch that adds the dash to the
> > qapi enum definition, to see what we think.
> 
> If Gerd is fine with the rename, let's do it.

No need to do so I think ...

> >> -[INPUT_BUTTON_WHEEL_UP] = SDL_BUTTON(SDL_BUTTON_WHEELUP),
> >> -[INPUT_BUTTON_WHEEL_DOWN] = SDL_BUTTON(SDL_BUTTON_WHEELDOWN),
> >> +[INPUT_BUTTON_WHEELUP] = SDL_BUTTON(SDL_BUTTON_WHEELUP),
> >> +[INPUT_BUTTON_WHEELDOWN] = SDL_BUTTON(SDL_BUTTON_WHEELDOWN),
> >
> > Since SDL already spells the names without space, it's not the end of
> > the world if we do likewise.
> 
> Good point.

This doesn't look too bad.  And even if x-input-send-event isn't
official api I'd prefer to not break it for such a minor cosmetic issue.

cheers,
  Gerd

[Qemu-devel] [PULL v2 10/12] e1000: Introducing backward compatibility command line parameter

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

This follows the previous patches, where support for migrating the
entire MAC registers' array, and some new MAC registers were introduced.

This patch introduces the e1000-specific boolean parameter
"extra_mac_registers", which is on by default. Setting it to off will
enable migration to older versions of QEMU, but will disable the read
and write access to the new registers, that were introduced since adding
the ability to migrate the entire MAC array.

Example for usage to enable backward compatibility and to disable the
new MAC registers:

qemu-system-x86_64 -device e1000,extra_mac_registers=off,... ...

As mentioned above, the default value is "on".

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c  | 2 ++
 include/hw/compat.h | 6 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 793286a..c877e06 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1854,6 +1854,8 @@ static Property e1000_properties[] = {
 compat_flags, E1000_FLAG_AUTONEG_BIT, true),
 DEFINE_PROP_BIT("mitigation", E1000State,
 compat_flags, E1000_FLAG_MIT_BIT, true),
+DEFINE_PROP_BIT("extra_mac_registers", E1000State,
+compat_flags, E1000_FLAG_MAC_BIT, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/compat.h b/include/hw/compat.h
index 93e71af..896a1b0 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -6,7 +6,11 @@
 .driver   = "virtio-blk-device",\
 .property = "scsi",\
 .value= "true",\
-},
+},{\
+.driver   = "e1000",\
+.property = "extra_mac_registers",\
+.value= "off",\
+ },
 
 #define HW_COMPAT_2_3 \
 {\
-- 
2.1.4

[Qemu-devel] [PULL v2 09/12] e1000: Implementing various counters

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

This implements the following Statistic registers (various counters)
according to Intel's specs:

TSCTC  GOTCL  GOTCH  GORCL  GORCH  MPRC   BPRC   RUCROC
BPTC   MPTC   PTC... PRC...

PLEASE NOTE: these registers will not be active, nor will migrate, until
a compatibility flag will be set (in the next patch in this series).

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c | 90 ++
 1 file changed, 85 insertions(+), 5 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 821fed4..793286a 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -37,6 +37,8 @@
 
 #include "e1000_regs.h"
 
+static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+
 #define E1000_DEBUG
 
 #ifdef E1000_DEBUG
@@ -182,7 +184,13 @@ enum {
 defreg(DC),  defreg(TNCRS),   defreg(SEC), defreg(CEXTERR),
 defreg(RLEC),defreg(XONRXC),  defreg(XONTXC),  defreg(XOFFRXC),
 defreg(XOFFTXC), defreg(RFC), defreg(RJC), defreg(RNBC),
-defreg(TSCTFC),  defreg(MGTPRC),  defreg(MGTPDC),  defreg(MGTPTC)
+defreg(TSCTFC),  defreg(MGTPRC),  defreg(MGTPDC),  defreg(MGTPTC),
+defreg(RUC), defreg(ROC), defreg(GORCL),   defreg(GORCH),
+defreg(GOTCL),   defreg(GOTCH),   defreg(BPRC),defreg(MPRC),
+defreg(TSCTC),   defreg(PRC64),   defreg(PRC127),  defreg(PRC255),
+defreg(PRC511),  defreg(PRC1023), defreg(PRC1522), defreg(PTC64),
+defreg(PTC127),  defreg(PTC255),  defreg(PTC511),  defreg(PTC1023),
+defreg(PTC1522), defreg(MPTC),defreg(BPTC)
 };
 
 static void
@@ -588,6 +596,16 @@ inc_reg_if_not_full(E1000State *s, int index)
 }
 }
 
+static inline void
+inc_tx_bcast_or_mcast_count(E1000State *s, const unsigned char *arr)
+{
+if (!memcmp(arr, bcast, sizeof bcast)) {
+inc_reg_if_not_full(s, BPTC);
+} else if (arr[0] & 1) {
+inc_reg_if_not_full(s, MPTC);
+}
+}
+
 static void
 grow_8reg_if_not_full(E1000State *s, int index, int size)
 {
@@ -602,6 +620,24 @@ grow_8reg_if_not_full(E1000State *s, int index, int size)
 s->mac_reg[index+1] = sum >> 32;
 }
 
+static void
+increase_size_stats(E1000State *s, const int *size_regs, int size)
+{
+if (size > 1023) {
+inc_reg_if_not_full(s, size_regs[5]);
+} else if (size > 511) {
+inc_reg_if_not_full(s, size_regs[4]);
+} else if (size > 255) {
+inc_reg_if_not_full(s, size_regs[3]);
+} else if (size > 127) {
+inc_reg_if_not_full(s, size_regs[2]);
+} else if (size > 64) {
+inc_reg_if_not_full(s, size_regs[1]);
+} else if (size == 64) {
+inc_reg_if_not_full(s, size_regs[0]);
+}
+}
+
 static inline int
 vlan_enabled(E1000State *s)
 {
@@ -639,12 +675,17 @@ fcs_len(E1000State *s)
 static void
 e1000_send_packet(E1000State *s, const uint8_t *buf, int size)
 {
+static const int PTCregs[6] = { PTC64, PTC127, PTC255, PTC511,
+PTC1023, PTC1522 };
+
 NetClientState *nc = qemu_get_queue(s->nic);
 if (s->phy_reg[PHY_CTRL] & MII_CR_LOOPBACK) {
 nc->info->receive(nc, buf, size);
 } else {
 qemu_send_packet(nc, buf, size);
 }
+inc_tx_bcast_or_mcast_count(s, buf);
+increase_size_stats(s, PTCregs, size);
 }
 
 static void
@@ -671,8 +712,11 @@ xmit_seg(E1000State *s)
 if (tp->tcp) {
 sofar = frames * tp->mss;
 stl_be_p(tp->data+css+4, ldl_be_p(tp->data+css+4)+sofar); /* seq */
-if (tp->paylen - sofar > tp->mss)
+if (tp->paylen - sofar > tp->mss) {
 tp->data[css + 13] &= ~9;/* PSH, FIN */
+} else if (frames) {
+inc_reg_if_not_full(s, TSCTC);
+}
 } else/* UDP */
 stw_be_p(tp->data+css+4, len);
 if (tp->sum_needed & E1000_TXD_POPTS_TXSM) {
@@ -702,6 +746,8 @@ xmit_seg(E1000State *s)
 inc_reg_if_not_full(s, TPT);
 grow_8reg_if_not_full(s, TOTL, s->tx.size);
 s->mac_reg[GPTC] = s->mac_reg[TPT];
+s->mac_reg[GOTCL] = s->mac_reg[TOTL];
+s->mac_reg[GOTCH] = s->mac_reg[TOTH];
 }
 
 static void
@@ -869,7 +915,6 @@ start_xmit(E1000State *s)
 static int
 receive_filter(E1000State *s, const uint8_t *buf, int size)
 {
-static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
 static const int mta_shift[] = {4, 3, 2, 0};
 uint32_t f, rctl = s->mac_reg[RCTL], ra[2], *rp;
 int isbcast = !memcmp(buf, bcast, sizeof bcast), ismcast = (buf[0] & 1);
@@ -887,10 +932,12 @@ receive_filter(E1000State *s, const uint8_t *buf, int 
size)
 }
 
 if (ismcast && (rctl & E1000_RCTL_MPE)) {  /* promiscuous mcast */
+inc_reg_if_not_full(s, MPRC);
 return 1;
 }
 
 if (isbcast && (rctl & E1000_RCTL_BAM)) {

Re: [Qemu-devel] [PATCH V3 4/6] ide: orphan all buffered requests on DMA cancel

2015-11-12 Thread Peter Lieven


Am 12.11.2015 um 09:27 schrieb Fam Zheng:

On Fri, 11/06 09:42, Peter Lieven wrote:

If the guests canceles a DMA request we can prematurely
invoke all callbacks of buffered requests and flag all them
as orphaned. Ideally this avoids the need for draining all
requests. For CDROM devices this works in 100% of all cases.

Signed-off-by: Peter Lieven 
---
  hw/ide/pci.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index d31ff88..a9e164e 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -240,6 +240,22 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
  /* Ignore writes to SSBM if it keeps the old value */
  if ((val & BM_CMD_START) != (bm->cmd & BM_CMD_START)) {
  if (!(val & BM_CMD_START)) {
+/* First invoke the callbacks of all buffered requests
+ * and flag those requests as orphaned. Ideally there
+ * are no unbuffered (Scatter Gather DMA Requests or
+ * write requests) pending and we can avoid to drain. */
+IDEBufferedRequest *req;
+IDEState *s = idebus_active_if(bm->bus);
+QLIST_FOREACH(req, >buffered_requests, list) {
+if (!req->orphaned) {
+#ifdef DEBUG_IDE
+printf("%s: invoking cb %p of buffered request %p with"
+   " -ECANCELED\n", __func__, req->original_cb, req);
+#endif
+req->original_cb(req->original_opaque, -ECANCELED);
+}
+req->orphaned = true;
+}

Why not use bdrv_aio_cancel or bdrv_aio_cancel_async with the aio returned by
bdrv_aio_cancel?


bdrv_aio_cancel would block until the request is completed, that wouldn't help 
if
the storage is no longer responsive.

The trick with the buffered request is that we can avoid waiting for the 
storage and
guarantee that a later completion on the storage won't corrupt guest memory.

Peter

Re: [Qemu-devel] [PATCH 2/2] i440fx: print an error message if user tries to enable iommu

2015-11-12 Thread Markus Armbruster

Bandan Das  writes:

> There's no indication of any sort that i440fx doesn't support
> "iommu=on""
>
> Signed-off-by: Bandan Das 
> ---
>  hw/pci-host/piix.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> index 7b2fbf9..f12593a 100644
> --- a/hw/pci-host/piix.c
> +++ b/hw/pci-host/piix.c
> @@ -301,6 +301,10 @@ static void i440fx_pcihost_realize(DeviceState *dev, 
> Error **errp)
>  static void i440fx_realize(PCIDevice *dev, Error **errp)
>  {
>  dev->config[I440FX_SMRAM] = 0x02;
> +
> +if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
> +fprintf(stderr, "i440fx doesn't support emulated iommu\n");
> +}
>  }
>  
>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,

error_report(), please.

If this is just a warning, please prefix the message with "warning: ".

If it isn't, exit(1).

Re: [Qemu-devel] [RFC PATCH v4 01/11] exec: Remove cpu from cpus list during cpu_exec_exit()

2015-11-12 Thread Zhu Guihua


Hi Bharata,

On 09/09/2015 03:56 PM, Bharata B Rao wrote:

On Wed, Sep 09, 2015 at 03:41:30PM +0800, Zhu Guihua wrote:

On 09/09/2015 01:52 PM, Bharata B Rao wrote:

On Fri, Sep 04, 2015 at 03:31:24PM +1000, David Gibson wrote:

On Thu, Aug 06, 2015 at 10:57:07AM +0530, Bharata B Rao wrote:

CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from cpu_exec_exit().

cpu_exec_init() is called from generic CPU::instance_finalize and some
archs like PowerPC call it from CPU unrealizefn. So ensure that we
dequeue the cpu only once.

Instead of introducing a new field CPUState.queued, I could have used
CPUState.cpu_index to check if the cpu is already dequeued from the list.
Since that doesn't work for CONFIG_USER_ONLY, I had to add a new field.

Signed-off-by: Bharata B Rao 

This seems reasonable to me, but I'm wondering how x86 cpu hotplug /
unplug is working without it.

x86 hotplug/unplug code currently resides in Zhu's git tree
(git://github.com/zhugh/qemu). They are removing the CPU from the list
explicitly in x86 CPU's instance_finalize routine.

Sorry, my git tree is git://github.com/zhuguihua/qemu

Now there was no progress about topology, so we don't know what will happen
in x86. I am not sure whether we will take this method finally.

Andreas had a presentation on this topic in KVM forum recently.

Andreas - do you have any updates on the topology and other aspects
of CPU hotplug so that we can align the CPU hotplug work in different
archs accordingly and hope to get it merged in 2.5 time frame ?


Do you update the patchset?

My work in x86 has stopped for a while, Maybe I can get some ideas from 
another

arch's worker.

Thanks,
Zhu

Re: [Qemu-devel] [PATCH for 2.5 v7 0/10] dataplane snapshot fixes

2015-11-12 Thread Fam Zheng

On Tue, 11/10 17:25, Denis V. Lunev wrote:
> Denis V. Lunev (10):
>   snapshot: create helper to test that block drivers supports snapshots
>   snapshot: return error code from bdrv_snapshot_delete_by_id_or_name
>   snapshot: create bdrv_all_delete_snapshot helper
>   snapshot: create bdrv_all_goto_snapshot helper
>   snapshot: create bdrv_all_find_snapshot helper
>   migration: drop find_vmstate_bs check in hmp_delvm
>   snapshot: create bdrv_all_create_snapshot helper
>   migration: reorder processing in hmp_savevm
>   migration: implement bdrv_all_find_vmstate_bs helper
>   migration: normalize locking in migration/savevm.c
> 
>  block/snapshot.c | 135 ++-
>  include/block/snapshot.h |  25 +-
>  migration/savevm.c   | 207 
> +++
>  3 files changed, 217 insertions(+), 150 deletions(-)

Reviewed-by: Fam Zheng

Re: [Qemu-devel] [RFC PATCH v4 01/11] exec: Remove cpu from cpus list during cpu_exec_exit()

2015-11-12 Thread Andreas Färber

Am 12.11.2015 um 10:30 schrieb Bharata B Rao:
> On Thu, Nov 12, 2015 at 05:11:02PM +0800, Zhu Guihua wrote:
>> Hi Bharata,
>>
>> On 09/09/2015 03:56 PM, Bharata B Rao wrote:
>>> On Wed, Sep 09, 2015 at 03:41:30PM +0800, Zhu Guihua wrote:
 On 09/09/2015 01:52 PM, Bharata B Rao wrote:
> On Fri, Sep 04, 2015 at 03:31:24PM +1000, David Gibson wrote:
>> On Thu, Aug 06, 2015 at 10:57:07AM +0530, Bharata B Rao wrote:
>>> CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
>>> should be removed from cpu_exec_exit().
>>>
>>> cpu_exec_init() is called from generic CPU::instance_finalize and some
>>> archs like PowerPC call it from CPU unrealizefn. So ensure that we
>>> dequeue the cpu only once.
>>>
>>> Instead of introducing a new field CPUState.queued, I could have used
>>> CPUState.cpu_index to check if the cpu is already dequeued from the 
>>> list.
>>> Since that doesn't work for CONFIG_USER_ONLY, I had to add a new field.
>>>
>>> Signed-off-by: Bharata B Rao 
>> This seems reasonable to me, but I'm wondering how x86 cpu hotplug /
>> unplug is working without it.
> x86 hotplug/unplug code currently resides in Zhu's git tree
> (git://github.com/zhugh/qemu). They are removing the CPU from the list
> explicitly in x86 CPU's instance_finalize routine.
 Sorry, my git tree is git://github.com/zhuguihua/qemu

 Now there was no progress about topology, so we don't know what will happen
 in x86. I am not sure whether we will take this method finally.
>>> Andreas had a presentation on this topic in KVM forum recently.
>>>
>>> Andreas - do you have any updates on the topology and other aspects
>>> of CPU hotplug so that we can align the CPU hotplug work in different
>>> archs accordingly and hope to get it merged in 2.5 time frame ?
>>
>> Do you update the patchset?
>>
>> My work in x86 has stopped for a while, Maybe I can get some ideas from
>> another
>> arch's worker.
> 
> My last version is here:
> https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg00650.html
> 
> I initally started with core level CPU hotplug, moved to socket level hotplug
> based on Andreas' patchset and then moved back again to core level hotplug.
> 
> I was a bit confused about how the generic semantics would evovle and hence
> the work got delayed. I wil be posting the next version of my patchset
> based on core level semantics soon.

What I recall as conclusion from the KVM Forum session and previous
discussions was that pseries would operate on core level (i.e.,
granularity of two SMT threads), whereas your first try was on thread
level and then on socket level.

Regards,
Andreas

> I am hoping that I should be able to get CPU hotplug/unplug included
> in QEMU-2.6 timeframe.

If there are preparatory patches ready for inclusion today, please point
me to them urgently.

Thanks,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton; HRB 21284 (AG Nürnberg)

Re: [Qemu-devel] [PULL v2 0/7] Block patches

2015-11-12 Thread Peter Maydell

On 11 November 2015 at 20:59, Marc-André Lureau
 wrote:
> Hi Peter
>
> On Wed, Nov 11, 2015 at 9:33 PM, Peter Maydell  
> wrote:
>> On 10 November 2015 at 18:41, Peter Maydell  wrote:
>>> On 9 November 2015 at 17:50, Marc-André Lureau  wrote:
 I can imagine a test starting a server thread and 2 qemu instances
 would take more than 5s on such configuration then.

 Could you try timing the test a few times to confirm this?
>>>
>>> petmay01@moonshot-dsg-11:~/qemu/build/all-a64$ time
>>> QTEST_QEMU_BINARY=i386-softmmu/qemu-system-i386
>>> QTEST_QEMU_IMG=qemu-img MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM %
>>> 255 + 1))} gtester -k --verbose -m=quick tests/ivshmem-test
>>> TEST: tests/ivshmem-test... (pid=10893)
>>>   /i386/ivshmem/single:OK
>>>   /i386/ivshmem/pair:  OK
>>>   /i386/ivshmem/server:OK
>>>   /i386/ivshmem/hotplug:   OK
>>>   /i386/ivshmem/memdev:OK
>>> PASS: tests/ivshmem-test
>>>
>>> real0m11.945s
>>> user0m11.020s
>>> sys 0m0.310s
>>>
>>> (almost all of the runtime seems to be in the "pair" subtest).
>>
>> This is now failing on practically every pull request I test.
>> Please post a patch to fix this test or disable it...
>
> This is the simplest patch I suggest for now.

That will still mean that trying the slow tests gives random
failures, so this still needs attention (ie raising the timeouts
to something that won't actually be hit), but I guess it will
solve my immediate problem. Can you send it to the list as
a proper patch, please?

thanks
-- PMM

Re: [Qemu-devel] [v2 1/2] cutils: add avx2 instruction optimization

2015-11-12 Thread Li, Liang Z

> 
> The main issue here is that you are not testing whether the compiler supports
> gnu_indirect_function.
> 
> I suggest that you start by moving the functions to util/buffer-zero.c
> 
> Then the structure should be something like
> 
> #ifdef CONFIG_HAVE_AVX2
> #include 
> #endif
> 
> ... define buffer_find_nonzero_offset_inner ...
> ... define can_use_buffer_find_nonzero_offset_inner ...
> 
> #if defined CONFIG_HAVE_GNU_IFUNC && defined CONFIG_HAVE_AVX2 ...
> define buffer_find_nonzero_offset_avx2 ...
> ... define can_use_buffer_find_nonzero_offset_avx2 ...
> ... define the indirect functions ...
> #else
> ... define buffer_find_nonzero_offset that just calls
> buffer_find_nonzero_offset_inner ...
> ... define can_use_buffer_find_nonzero_offset that just calls
> can_use_buffer_find_nonzero_offset_inner ...
> #endif
> 
> Thanks,
> 
> Paolo

Got it, thanks.

Liang

Re: [Qemu-devel] [PATCH v6 0/9] e1000: Various fixes and registers' implementation

2015-11-12 Thread Jason Wang



On 11/11/2015 09:52 PM, Leonid Bloch wrote:
> This series fixes issues with packet/octet counting in e1000's Statistic
> registers, fixes a bug in the packet address filtering procedure, and
> implements many MAC registers that were absent before, some Statistic
> counters among them.
>
> Besides this, the series introduces a parameter which, if set to "on"
> (default), will cause the entire MAC registers' array to migrate during
> live migration (please see patches #2 and #9 for details). The rational
> behind this is the ability to implement additional MAC registers in the
> future, without worrying about migration compatibility between future
> versions. For compatibility with previous versions, the above mentioned
> parameter can be set to "off".
>
> Also, a new array is introduced to control the access to the various MAC
> registers. This takes care of situations when a MAC register requires a
> certain parameter to be accessed, or is partially implemented, and
> requires a debug warning to be printed on access attempts.
>
> Additionally, several cosmetic changes are made.
>
> Differences v1-2:
> 
> * Wording of several commit messages corrected.
> * For trivially implemented Diagnostic registers, a debug message is
>   added on read/write attempts, alerting of incomplete implementation.
> * Following testing on a physical device, only the lower 16 bits can now
>   be read from AIT, and only the lower 4 - from FFMT*.
> * The grow_8reg_if_not_full function is rewritten.
> * inc_tx_bcast_or_mcast_count and increase_size_stats are now called
>   from within e1000_send_packet, to avoid code duplication.
>
> Differences v2-3:
> 
> * Minor rewordings of some commit messages (0002, 0003).
> * Live migration capability is added to the newly implemented registers.
>
> Differences v3-4:
> 
> * Introduction of the "full_mac_registers" parameter (see above).
> * Reversion of the live migration handling introduced in v3.
> * Small alignment changes in patch #1 to correspond with the following
>   patches.
>
> Differences v4-v5:
> 
> * Introduction of an array to control the access to the MAC registers.
> * Removal of the specific functions that warned of partial
>   implementation on read/write from patch 4.
> * Adequate changes to patches 4 and 8: mainly adding the registers
>   introduced there to the new array.
>
> Differences v5-v6:
> 
> * The access control array now does not contain an "always accessible"
>   bit. The assumption that a register is always accessible is based now
>   solely on the facts that it has a read or write handler, and it does
>   not require a flag to be set. That also makes place for 6 possible
>   flag bits in the access control array, instead of 5 in v5.
> * The support for backward compatibility, nor the new registers
>   introduced in this series, can not be turned on now until the last
>   patch in this series is applied. This is done to preserve
>   compatibility if bisection in-between the patches of this series will
>   be needed.
>
> The majority of these changes result from Jason Wang's review - thank
> you, Jason!

Applied in https://github.com/jasowang/qemu/commits/net with a minor
modification:

- Move the compat property from patch 3 to patch 9

Thanks

>
> Leonid Bloch (9):
>   e1000: Cosmetic and alignment fixes
>   e1000: Add support for migrating the entire MAC registers' array
>   e1000: Introduced an array to control the access to the MAC registers
>   e1000: Trivial implementation of various MAC registers
>   e1000: Fixing the received/transmitted packets' counters
>   e1000: Fixing the received/transmitted octets' counters
>   e1000: Fixing the packet address filtering procedure
>   e1000: Implementing various counters
>   e1000: Introducing backward compatibility command line parameter
>
>  hw/net/e1000.c  | 476 
> 
>  hw/net/e1000_regs.h |   8 +-
>  include/hw/compat.h |   4 +
>  3 files changed, 379 insertions(+), 109 deletions(-)
>

Re: [Qemu-devel] [PATCH 2/3] usb-mtp: Add support for inotify based file monitoring

2015-11-12 Thread Gerd Hoffmann

On Mo, 2015-11-09 at 18:12 -0500, Bandan Das wrote:
> Gerd Hoffmann  writes:
> 
> > On Di, 2015-11-03 at 19:00 -0500, Bandan Das wrote:
> >> +/* Add a new watch asap so as to not lose events
> >> */
> >
> > This comment sounds like there is a race ("asap").  There isn't one,
> > correct ordering (adding the watch before reading the directory) is
> 
> Hmm, seems like there's still a small window. We may not have even
> started processing the event because we are still processing the earlier
> ones.

> > enough to make sure you don't miss anything.  You might see create
> > events for objects already in the tree though, are you prepared to
> > handle that?
> 
> Oh, interesting.  Current version will happily add duplicate entries.
> I will add a check.

I think we are talking about the same thing here.
Things can run in parallel, like this:

process copying a file tree | qemu with usb-mtp
+--
create directory|
| inotify event #1 queued (dir)
| qemu fetches event #1
| qemu adds new inotify watch
copy file into new dir  |
| inotify event #2 queued (file)
| qemu reads new directory
| qemu finds the new file
| qemu fetches event #2

So, yes, the kernel can add new inotify events for the new watch before
qemu finished processing the old event (especially before you are done
reading the directory), and if you are hitting that the effect is that
you see a create event for the new file even though you already have it
in the tree.

But it is impossible that you miss the creation of the new file (this is
what I meant with "there is no race").

hope this clarifies,
  Gerd

Re: [Qemu-devel] [PATCH 1/4] spapr: Extend rtas-blob

2015-11-12 Thread Thomas Huth

On 11/11/15 18:15, Aravinda Prasad wrote:
> Extend rtas-blob to accommodate error log. Error log
> structure is saved in rtas space upon a machine check
> exception.
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  hw/ppc/spapr.c |4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 05926a3..b7b9e09 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1556,6 +1556,10 @@ static void ppc_spapr_init(MachineState *machine)
>  exit(1);
>  }
>  spapr->rtas_size = get_image_size(filename);
> +
> +/* Resize blob to accommodate error log. */
> +spapr->rtas_size = TARGET_PAGE_ALIGN(spapr->rtas_size);
> +
>  spapr->rtas_blob = g_malloc(spapr->rtas_size);
>  if (load_image_size(filename, spapr->rtas_blob, spapr->rtas_size) < 0) {
>  error_report("Could not load LPAR rtas '%s'", filename);

Sorry to say that, but this patch is horrible!

1) If the rtas blob ever gets bigger than 512 bytes, we will get
"random" corruption of the RTAS code later when an NMI occurs since the
mc log is blindly copied into the RTAS area later!
==> Please add an "assert(spapr->rtas_size < RTAS_ERRLOG_OFFSET)" at the
beginning of your patch.

2) Why resizing with TARGET_PAGE_ALIGN() ? In the very worst case, this
would not change the size at all (if the rtas_size is already a multiple
of PAGE_SIZE)
==> Please set the size to a proper value like
 RTAS_ERRLOG_OFFSET + sizeof(struct rtas_mc_log)
instead!

 Thomas

[Qemu-devel] [PULL v2 00/12] Net patches

2015-11-12 Thread Jason Wang

The following changes since commit 31e49ac192f782d594bbd04070fe79e800b7813f:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-2015' into 
staging (2015-11-11 18:23:08 +)

are available in the git repository at:

  https://github.com/jasowang/qemu.git tags/net-pull-request

for you to fetch changes up to 39bec4f38b028a2cff8c38f3455aef44d7b3b6c4:

  net: netmap: use error_setg() helpers in place of error_report() (2015-11-12 
15:31:52 +0800)


Changes from V1:
- no changes in the patches, V1 just misses the list


Leonid Bloch (9):
  e1000: Cosmetic and alignment fixes
  e1000: Add support for migrating the entire MAC registers' array
  e1000: Introduced an array to control the access to the MAC registers
  e1000: Trivial implementation of various MAC registers
  e1000: Fixing the received/transmitted packets' counters
  e1000: Fixing the received/transmitted octets' counters
  e1000: Fixing the packet address filtering procedure
  e1000: Implementing various counters
  e1000: Introducing backward compatibility command line parameter

Stefan Weil (1):
  slirp: Fix type casts and format strings in debug code

Vincenzo Maffione (2):
  net: netmap: Fix compilation issue
  net: netmap: use error_setg() helpers in place of error_report()

 hw/net/e1000.c  | 476 
 hw/net/e1000_regs.h |   8 +-
 include/hw/compat.h |   6 +-
 net/netmap.c|  24 +--
 slirp/bootp.c   |  12 +-
 slirp/if.c  |   4 +-
 slirp/ip_icmp.c |   4 +-
 slirp/ip_input.c|  10 +-
 slirp/ip_output.c   |   4 +-
 slirp/mbuf.c|   6 +-
 slirp/misc.c|   6 +-
 slirp/sbuf.c|   4 +-
 slirp/socket.c  |  18 +-
 slirp/tcp_input.c   |  14 +-
 slirp/tcp_output.c  |   2 +-
 slirp/tcp_subr.c|  16 +-
 slirp/udp.c |   6 +-
 17 files changed, 448 insertions(+), 172 deletions(-)

-- 
2.1.4

Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit

2015-11-12 Thread Thomas Huth

On 11/11/15 18:16, Aravinda Prasad wrote:
> Memory error such as bit flips that cannot be corrected
> by hardware are passed on to the kernel for handling.
> If the memory address in error belongs to guest then
> guest kernel is responsible for taking suitable action.
> Patch [1] enhances KVM to exit guest with exit reason
> set to KVM_EXIT_NMI in such cases.
> 
> This patch handles KVM_EXIT_NMI exit. If the guest OS
> has registered the machine check handling routine by
> calling "ibm,nmi-register", then the handler builds
> the error log and invokes the registered handler else
> invokes the handler at 0x200.
> 
> [1] http://marc.info/?l=kvm-ppc=144726114408289
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  target-ppc/kvm.c |   69 +++
>  target-ppc/kvm_ppc.h |   81 
> ++
>  2 files changed, 150 insertions(+)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 110436d..e2e5170 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run 
> *run)
>  ret = 0;
>  break;
>  
> +case KVM_EXIT_NMI:
> +DPRINTF("handle NMI exception\n");
> +ret = kvm_handle_nmi(cpu);
> +break;
> +
>  default:
>  fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>  ret = -1;
> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>  {
>  return data & 0x;
>  }
> +
> +int kvm_handle_nmi(PowerPCCPU *cpu)
> +{
> +struct rtas_mc_log mc_log;
> +CPUPPCState *env = >env;
> +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +
> +cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
> +
> +/* Properly set bits in MSR before we invoke the handler */
> +env->msr = 0;
> +
> +if (!(*pcc->interrupts_big_endian)(cpu)) {
> +env->msr |= (1ULL << MSR_LE);
> +}
> +
> +#ifdef TARGET_PPC64
> +env->msr |= (1ULL << MSR_SF);
> +#endif
> +
> +if (!spapr->guest_machine_check_addr) {
> +/*
> + * If OS has not registered with "ibm,nmi-register"
> + * jump to 0x200
> + */

Shouldn't you also check MSR_ME here first and enter checkstop when
machine checks are disabled?
Also I think you have to set up some more registers for machine check
interrupts, like SRR0 and SRR1?

> +env->nip = 0x200;
> +return 0;
> +}
> +
> +qemu_mutex_lock(>mc_in_progress);

Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
kvm_cpu_exec()), so if you would ever get one thread waiting for this
mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
because the other code would wait forever to get the BQL ==> Deadlock.

I think if you want to be able to handle multiple NMIs at once, you
likely need something like an error log per CPU instead. And if an NMI
happens one CPU while there is already a NMI handler running on the very
same CPU, you could likely simply track this with an boolean variable
and put the CPU into checkstop if this happens?

> +/* Set error log fields */
> +mc_log.r3 = env->gpr[3];
> +mc_log.err_log.byte0 = 0;
> +mc_log.err_log.byte1 =
> +(RTAS_SEVERITY_ERROR_SYNC << RTAS_ELOG_SEVERITY_SHIFT);
> +mc_log.err_log.byte1 |=
> +(RTAS_DISP_NOT_RECOVERED << RTAS_ELOG_DISPOSITION_SHIFT);
> +mc_log.err_log.byte2 =
> +(RTAS_INITIATOR_MEMORY << RTAS_ELOG_INITIATOR_SHIFT);
> +mc_log.err_log.byte2 |= RTAS_TARGET_MEMORY;
> +
> +if (env->spr[SPR_DSISR] & P7_DSISR_MC_UE) {
> +mc_log.err_log.byte3 = RTAS_TYPE_ECC_UNCORR;
> +} else {
> +mc_log.err_log.byte3 = 0;
> +}
> +
> +/* Handle all Host/Guest LE/BE combinations */
> +if (env->msr & (1ULL << MSR_LE)) {
> +mc_log.r3 = cpu_to_le64(mc_log.r3);
> +} else {
> +mc_log.r3 = cpu_to_be64(mc_log.r3);
> +}
> +
> +cpu_physical_memory_write(spapr->rtas_addr + RTAS_ERRLOG_OFFSET,
> +  _log, sizeof(mc_log));
> +
> +env->nip = spapr->guest_machine_check_addr;
> +env->gpr[3] = spapr->rtas_addr + RTAS_ERRLOG_OFFSET;
> +
> +return 0;
> +}
> diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
> index 5c1d334..1172735 100644
> --- a/target-ppc/kvm_ppc.h
> +++ b/target-ppc/kvm_ppc.h
> @@ -53,6 +53,87 @@ void kvmppc_hash64_free_pteg(uint64_t token);
>  void kvmppc_hash64_write_pte(CPUPPCState *env, target_ulong pte_index,
>   target_ulong pte0, target_ulong pte1);
>  bool kvmppc_has_cap_fixup_hcalls(void);
> +int kvm_handle_nmi(PowerPCCPU *cpu);
> +
> +/* Offset from rtas-base where error log is placed */
> +#define RTAS_ERRLOG_OFFSET   (0x200)

Why paranthesis here?

> +#define

Re: [Qemu-devel] [PATCH V3 2/6] block: add blk_abort_aio_request

2015-11-12 Thread Fam Zheng

On Fri, 11/06 09:42, Peter Lieven wrote:
> Signed-off-by: Peter Lieven 
> ---
>  block/block-backend.c  | 17 +
>  include/sysemu/block-backend.h |  3 +++
>  2 files changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 19fdaae..b13dc4e 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -627,8 +627,9 @@ static void error_callback_bh(void *opaque)
>  qemu_aio_unref(acb);
>  }
>  
> -static BlockAIOCB *abort_aio_request(BlockBackend *blk, BlockCompletionFunc 
> *cb,
> - void *opaque, int ret)
> +BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
> +   BlockCompletionFunc *cb,
> +   void *opaque, int ret)

Parameter list identation is off by one.

>  {
>  struct BlockBackendAIOCB *acb;
>  QEMUBH *bh;
> @@ -650,7 +651,7 @@ BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, 
> int64_t sector_num,
>  {
>  int ret = blk_check_request(blk, sector_num, nb_sectors);
>  if (ret < 0) {
> -return abort_aio_request(blk, cb, opaque, ret);
> +return blk_abort_aio_request(blk, cb, opaque, ret);
>  }
>  
>  return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
> @@ -710,7 +711,7 @@ BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t 
> sector_num,
>  {
>  int ret = blk_check_request(blk, sector_num, nb_sectors);
>  if (ret < 0) {
> -return abort_aio_request(blk, cb, opaque, ret);
> +return blk_abort_aio_request(blk, cb, opaque, ret);
>  }
>  
>  return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
> @@ -722,7 +723,7 @@ BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t 
> sector_num,
>  {
>  int ret = blk_check_request(blk, sector_num, nb_sectors);
>  if (ret < 0) {
> -return abort_aio_request(blk, cb, opaque, ret);
> +return blk_abort_aio_request(blk, cb, opaque, ret);
>  }
>  
>  return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
> @@ -732,7 +733,7 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
>BlockCompletionFunc *cb, void *opaque)
>  {
>  if (!blk_is_available(blk)) {
> -return abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
> +return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
>  }
>  
>  return bdrv_aio_flush(blk->bs, cb, opaque);
> @@ -744,7 +745,7 @@ BlockAIOCB *blk_aio_discard(BlockBackend *blk,
>  {
>  int ret = blk_check_request(blk, sector_num, nb_sectors);
>  if (ret < 0) {
> -return abort_aio_request(blk, cb, opaque, ret);
> +return blk_abort_aio_request(blk, cb, opaque, ret);
>  }
>  
>  return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
> @@ -787,7 +788,7 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned 
> long int req, void *buf,
>BlockCompletionFunc *cb, void *opaque)
>  {
>  if (!blk_is_available(blk)) {
> -return abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
> +return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
>  }
>  
>  return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque);
> diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
> index 9306a52..b5267a8 100644
> --- a/include/sysemu/block-backend.h
> +++ b/include/sysemu/block-backend.h
> @@ -180,5 +180,8 @@ int blk_save_vmstate(BlockBackend *blk, const uint8_t 
> *buf,
>  int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size);
>  int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz);
>  int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo);
> +BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
> +   BlockCompletionFunc *cb,
> +   void *opaque, int ret);
>  

Same here.

>  #endif
> -- 
> 1.9.1
> 
>

Re: [Qemu-devel] [PATCH for-2.5 2/2] input: Promote 'input-send-event' to stable API

2015-11-12 Thread Gerd Hoffmann

On Do, 2015-11-12 at 09:23 +0100, Markus Armbruster wrote:
> Eric Blake  writes:
> 
> > We've had 'x-input-send-event' since 2.3, with no further
> > changes to the interface other than tweaks in the previous patch
> > to the spelling of the enum constants ('X' and 'WheelUp' changed
> > to 'x' and 'wheel-up').
> >
> > What's more, changing the spelling of enum constants is not easy
> > to introspect prior to 2.5; so a client that was relying on the
> > experimental command can't easily tell which spelling is expected.
> > But 'query-commands' works in all qemu versions that supported
> > the command, so renaming the command now makes it an easy thing
> > to determine which spelling of the enum values to use.
> >
> > Thus, it's time to promote this interface to stable.
> 
> The x- goes back to commit df5b2ad:
> 
> input: move input-send-event into experimental namespace
> 
> Ongoing discussions on how we are going to specify the console,
> so tag the command as experiental so we can refine things in
> the 2.3 development cycle.
> 
> Have we settled "how we are going to specify the console"?  If yes,
> commit, please.  If no, I'm afraid the command should stay experimental.

Good question.  I don't think so.

IIRC the question was whenever we'll leave it as-is (console=),
or whenever we'll do something like display=,head= instead.

The latter would be consistent with how we are doing input routing, i.e.
grouping display and input devices to a seat for multiseat setups (see
docs/multiseat.txt for more details).

The consoles are already present in the qom tree
as /backend/console[] nodes, and they have device + head
children.  So qom users can map console= to
display=,head= and visa versa already.  So from a functionality
point of view it doesn't really matter, it is largely a matter of
taste ...

cheers,
  Gerd

Re: [Qemu-devel] [RFC PATCH v4 01/11] exec: Remove cpu from cpus list during cpu_exec_exit()

2015-11-12 Thread Zhu Guihua



On 11/12/2015 05:30 PM, Bharata B Rao wrote:

On Thu, Nov 12, 2015 at 05:11:02PM +0800, Zhu Guihua wrote:

Hi Bharata,

On 09/09/2015 03:56 PM, Bharata B Rao wrote:

On Wed, Sep 09, 2015 at 03:41:30PM +0800, Zhu Guihua wrote:

On 09/09/2015 01:52 PM, Bharata B Rao wrote:

On Fri, Sep 04, 2015 at 03:31:24PM +1000, David Gibson wrote:

On Thu, Aug 06, 2015 at 10:57:07AM +0530, Bharata B Rao wrote:

CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from cpu_exec_exit().

cpu_exec_init() is called from generic CPU::instance_finalize and some
archs like PowerPC call it from CPU unrealizefn. So ensure that we
dequeue the cpu only once.

Instead of introducing a new field CPUState.queued, I could have used
CPUState.cpu_index to check if the cpu is already dequeued from the list.
Since that doesn't work for CONFIG_USER_ONLY, I had to add a new field.

Signed-off-by: Bharata B Rao 

This seems reasonable to me, but I'm wondering how x86 cpu hotplug /
unplug is working without it.

x86 hotplug/unplug code currently resides in Zhu's git tree
(git://github.com/zhugh/qemu). They are removing the CPU from the list
explicitly in x86 CPU's instance_finalize routine.

Sorry, my git tree is git://github.com/zhuguihua/qemu

Now there was no progress about topology, so we don't know what will happen
in x86. I am not sure whether we will take this method finally.

Andreas had a presentation on this topic in KVM forum recently.

Andreas - do you have any updates on the topology and other aspects
of CPU hotplug so that we can align the CPU hotplug work in different
archs accordingly and hope to get it merged in 2.5 time frame ?

Do you update the patchset?

My work in x86 has stopped for a while, Maybe I can get some ideas from
another
arch's worker.

My last version is here:
https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg00650.html

I initally started with core level CPU hotplug, moved to socket level hotplug
based on Andreas' patchset and then moved back again to core level hotplug.

I was a bit confused about how the generic semantics would evovle and hence
the work got delayed. I wil be posting the next version of my patchset
based on core level semantics soon.

I am hoping that I should be able to get CPU hotplug/unplug included
in QEMU-2.6 timeframe.


Thanks for your reply. Look forward to your next version.

Regards,
Zhu

Re: [Qemu-devel] [PATCH 04/10] snapshot: create bdrv_all_goto_snapshot helper

2015-11-12 Thread Denis V. Lunev


On 11/12/2015 11:38 AM, Fam Zheng wrote:

On Tue, 11/10 17:25, Denis V. Lunev wrote:

to switch to snapshot on all loaded block drivers.

The patch also ensures proper locking.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Greg Kurz 
CC: Juan Quintela 
CC: Stefan Hajnoczi 
CC: Kevin Wolf 
---
  block/snapshot.c | 20 
  include/block/snapshot.h |  1 +
  migration/savevm.c   | 15 +--
  3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index 61a6ad1..9f07a63 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -403,3 +403,23 @@ int bdrv_all_delete_snapshot(const char *name, 
BlockDriverState **first_bad_bs,
  *first_bad_bs = bs;
  return ret;
  }
+
+
+int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs)
+{
+int err = 0;
+BlockDriverState *bs = NULL;
+
+while (err == 0 && (bs = bdrv_next(bs))) {
+AioContext *ctx = bdrv_get_aio_context(bs);
+
+aio_context_acquire(ctx);
+if (bdrv_can_snapshot(bs)) {
+err = bdrv_snapshot_goto(bs, name);
+}
+aio_context_release(ctx);
+}
+
+*first_bad_bs = bs;
+return err;
+}
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index d02d2b1..0a176c7 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -84,5 +84,6 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
  bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs);
  int bdrv_all_delete_snapshot(const char *name, BlockDriverState 
**first_bsd_bs,
   Error **err);
+int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bsd_bs);
  
  #endif

diff --git a/migration/savevm.c b/migration/savevm.c
index 1157a6f..d18ff13 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1425,16 +1425,11 @@ int load_vmstate(const char *name)
  /* Flush all IO requests so they don't interfere with the new state.  */
  bdrv_drain_all();
  
-bs = NULL;

-while ((bs = bdrv_next(bs))) {
-if (bdrv_can_snapshot(bs)) {
-ret = bdrv_snapshot_goto(bs, name);
-if (ret < 0) {
-error_report("Error %d while activating snapshot '%s' on '%s'",
- ret, name, bdrv_get_device_name(bs));
-return ret;
-}
-}
+ret = bdrv_all_goto_snapshot(name, );
+if (ret < 0) {
+error_report("Error %d while activating snapshot '%s' on '%s'",
+ ret, name, bdrv_get_device_name(bs));
+return ret;

Maybe more friendlily strerror(ret)?


  }
  
  /* restore the VM state */

--
2.5.0



we can. I think that this could be done for all such functions
as follow up patch. I'll do that separately and will send
with the next iteration if applicable.

Den

Re: [Qemu-devel] [Bug 739785] Re: qemu-i386 user mode can't fork (bash: fork: Invalid argument)

2015-11-12 Thread Justin Shafer

Wine works! =) Didn't know if you knew... no more old qemu.

You da man!

On Tue, Aug 6, 2013 at 3:33 AM, Peter Maydell 
wrote:

> Commits aa004f5f9 to 24cb36a61c (13 in total) are the patchset that fix
> this.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/739785
>
> Title:
>   qemu-i386 user mode can't fork (bash: fork: Invalid argument)
>
> Status in QEMU:
>   Fix Committed
> Status in “qemu” package in Debian:
>   Confirmed
>
> Bug description:
>   Good time of day everybody,
>
>   I have been trying to make usermode qemu on ARM with plugapps
>   (archlinux) with archlinux i386 chroot to work.
>
>   1. I installed arch linux in a virtuabox and created a chroot for it
> with mkarchroot. Transferred it to my pogo plug into /i386/
>   2. I comiled qemu-i386 static and put it into /i386/usr/bin/
>   ./configure --static --disable-blobs --disable-system
> --target-list=i386-linux-user
>   make
>
>   3. I also compiled linux kernel 2.6.38 with CONFIG_BINFMT_MISC=y and
> installed it.
>   uname -a
>   Linux Plugbox 2.6.38 #4 PREEMPT Fri Mar 18 22:19:10 CDT 2011 armv5tel
> Feroceon 88FR131 rev 1 (v5l) Marvell SheevaPlug Reference Board GNU/Linux
>
>   4. Added the following options into /etc/rc.local
>   /sbin/modprobe binfmt_misc
>   /bin/mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
>   echo
> ':qemu-i386:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff\xff:/usr/bin/qemu-i386:'
> >/proc/sys/fs/binfmt_misc/register
>
>   5. Also copied ld-linux.so.3 (actually ld-2.13.so because ld-
>   linux.so.3 is a link to that file) from /lib/ to /i386/lib/
>
>   6.Now i chroot into /i386 and I get this:
>   [root@Plugbox i386]# chroot .
>   [II aI hnve ao n@P /]# pacman -Suy
>   bash: fork: Invalid argument
>
>   7.I also downloaded linux-user-test-0.3 from qemu website and ran the
> test:
>   [root@Plugbox linux-user-test-0.3]# make
>   ./qemu-linux-user.sh
>   [qemu-i386]
>   ../qemu-0.14.0/i386-linux-user/qemu-i386 -L ./gnemul/qemu-i386 i386/ls
> -l dummyfile
>   BUG IN DYNAMIC LINKER ld.so: dl-version.c: 210: _dl_check_map_versions:
> Assertion `needed != ((void *)0)' failed!
>   make: *** [test] Error 127
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/qemu/+bug/739785/+subscriptions
>

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/739785

Title:
  qemu-i386 user mode can't fork (bash: fork: Invalid argument)

Status in QEMU:
  Fix Committed
Status in qemu package in Debian:
  Fix Released

Bug description:
  Good time of day everybody,

  I have been trying to make usermode qemu on ARM with plugapps
  (archlinux) with archlinux i386 chroot to work.

  1. I installed arch linux in a virtuabox and created a chroot for it with 
mkarchroot. Transferred it to my pogo plug into /i386/
  2. I comiled qemu-i386 static and put it into /i386/usr/bin/
  ./configure --static --disable-blobs --disable-system 
--target-list=i386-linux-user
  make

  3. I also compiled linux kernel 2.6.38 with CONFIG_BINFMT_MISC=y and 
installed it.
  uname -a
  Linux Plugbox 2.6.38 #4 PREEMPT Fri Mar 18 22:19:10 CDT 2011 armv5tel 
Feroceon 88FR131 rev 1 (v5l) Marvell SheevaPlug Reference Board GNU/Linux

  4. Added the following options into /etc/rc.local
  /sbin/modprobe binfmt_misc
  /bin/mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
  echo 
':qemu-i386:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff\xff:/usr/bin/qemu-i386:'
 >/proc/sys/fs/binfmt_misc/register

  5. Also copied ld-linux.so.3 (actually ld-2.13.so because ld-
  linux.so.3 is a link to that file) from /lib/ to /i386/lib/

  6.Now i chroot into /i386 and I get this:
  [root@Plugbox i386]# chroot .
  [II aI hnve ao n@P /]# pacman -Suy
  bash: fork: Invalid argument

  7.I also downloaded linux-user-test-0.3 from qemu website and ran the test:
  [root@Plugbox linux-user-test-0.3]# make
  ./qemu-linux-user.sh
  [qemu-i386]
  ../qemu-0.14.0/i386-linux-user/qemu-i386 -L ./gnemul/qemu-i386 i386/ls -l 
dummyfile
  BUG IN DYNAMIC LINKER ld.so: dl-version.c: 210: _dl_check_map_versions: 
Assertion `needed != ((void *)0)' failed!
  make: *** [test] Error 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/739785/+subscriptions

Re: [Qemu-devel] [PATCH v3 0/2] Fix compilation of netmap backend

2015-11-12 Thread Jason Wang



On 11/10/2015 05:47 PM, Vincenzo Maffione wrote:
> This patch series adds some fixes to the netmap net backend. It contains
> two changes:
> (1) Fix compilation issue of netmap.c introduced by the reorganization
> of struct NetClientOptions
> (2) Address the FIXME comment that was asking to use error_setg()
> variants in place of error_report()
>
> CHANGELOG:
> - removed dead return and use error_setg_file_open() in place
>   of error_setg_errno()
> - I noticed that net_init_netmap() has to return int, so I restored
>   the return statements in that function
>
> Vincenzo Maffione (2):
>   net: netmap: Fix compilation issue
>   net: netmap: use error_setg() helpers in place of error_report()
>
>  net/netmap.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>

I've picked this series in https://github.com/jasowang/qemu/commits/net

Thanks

Re: [Qemu-devel] [PATCH] virtio-9p: add savem handlers

2015-11-12 Thread Greg Kurz

On Thu, 22 Oct 2015 19:38:42 +0200
Greg Kurz  wrote:

> We don't support migration of mounted 9p shares. This is handled by a
> migration blocker.
> 
> One would expect, however, to be able to migrate if the share is unmounted.
> Unfortunately virtio-9p-device does not register savevm handlers at all !
> Migration succeeds and leaves the guest with a dangling device...
> 
> This patch simply registers migration handlers for virtio-9p-device. Whether
> migration is possible or not still depends on the migration blocker.
> 
> Signed-off-by: Greg Kurz 
> ---

Ping ?

> Michael, Aneesh,
> 
> This is the same patch minus the call to unregister_savevm() since we don't
> have an unrealize handler.
> 
> I decided to simply drop all the other patches. Hot-unplug support is totally
> missing and definitely needs more work. I'll try to come up with a solution
> in its own series.
> 
> Cheers.
> 
> --
> Greg
> 
> ---
>  hw/9pfs/virtio-9p-device.c |   11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 93a407c45926..e3abcfaffb2a 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -43,6 +43,16 @@ static void virtio_9p_get_config(VirtIODevice *vdev, 
> uint8_t *config)
>  g_free(cfg);
>  }
> 
> +static void virtio_9p_save(QEMUFile *f, void *opaque)
> +{
> +virtio_save(VIRTIO_DEVICE(opaque), f);
> +}
> +
> +static int virtio_9p_load(QEMUFile *f, void *opaque, int version_id)
> +{
> +return virtio_load(VIRTIO_DEVICE(opaque), f, version_id);
> +}
> +
>  static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
>  {
>  VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> @@ -130,6 +140,7 @@ static void virtio_9p_device_realize(DeviceState *dev, 
> Error **errp)
>  }
>  v9fs_path_free();
> 
> +register_savevm(dev, "virtio-9p", -1, 1, virtio_9p_save, virtio_9p_load, 
> s);
>  return;
>  out:
>  g_free(s->ctx.fs_root);
> 
>

Re: [Qemu-devel] [PATCH v11 19/28] qapi: Change munging of CamelCase enum values

2015-11-12 Thread Gerd Hoffmann

  Hi,

> The InputButton type has existed since 2.0; which is then part of the
> 'InputBtnEvent' struct, then the 'InputEvent' union, also since 2.0.  I
> can't easily tell if it was only used internally at that point,

Internal only.

> 'x-input-send-event' (since 2.2, but the x- prefix gives us freedom).

Yes, x-input-send-event is the only external usage.

cheers,
  Gerd

Re: [Qemu-devel] [RFC] ide: Don't use qemu_hw_version() for firmware revision

2015-11-12 Thread Markus Armbruster

Eduardo Habkost  writes:

> The IDEState.version field is used for firmware version
> information returned to the guest. Updating firmware information
> on QEMU upgrade is supposed to be acceptable, so IDE doesn't need
> the version compatibility magic of qemu_hw_version() and can use
> QEMU_VERSION directly.
>
> Signed-off-by: Eduardo Habkost 
> ---
> I'm sending this just to start a discussion about what exactly we
> are supposed to return to the guest on those IDE fields. Should
> we return:
>
> 1) Something that never changes and don't reveal QEMU version
>information (e.g. "QEMU")?
> 2) Something that is always the same depending on the
>machine-type (machine-type name? MachineClass.hw_version?)
> 3) Something that change every time QEMU is upgraded (QEMU_VERSION)?
> 4) Something else?
>
> This patch implements option (3).
>
> ---
>  hw/ide/core.c | 2 +-
>  hw/ide/internal.h | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 364ba21..1602707 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -2312,7 +2312,7 @@ int ide_init_drive(IDEState *s, BlockBackend *blk, 
> IDEDriveKind kind,
>  if (version) {
>  pstrcpy(s->version, sizeof(s->version), version);
>  } else {
> -pstrcpy(s->version, sizeof(s->version), qemu_hw_version());
> +pstrcpy(s->version, sizeof(s->version), QEMU_VERSION);

Is s->version migrated?

If no, live migration to a newer QEMU changes the version, doesn't it?
The "firmware upgrade is acceptable" argument doesn't apply there.  I
guess a spontaneous version change is relatively unlikely to cause
trouble, but why risk it?

>  }
>  
>  ide_reset(s);
> diff --git a/hw/ide/internal.h b/hw/ide/internal.h
> index e4629b0..a4277ce 100644
> --- a/hw/ide/internal.h
> +++ b/hw/ide/internal.h
> @@ -378,6 +378,7 @@ struct IDEState {
>  /* set for lba48 access */
>  uint8_t lba48;
>  BlockBackend *blk;
> +/* Firmware revision/version */
>  char version[9];
>  /* ATAPI specific */
>  struct unreported_events events;
> @@ -488,6 +489,7 @@ struct IDEDevice {
>  uint32_t unit;
>  BlockConf conf;
>  int chs_trans;
> +/* Firmware revision/version */
>  char *version;
>  char *serial;
>  char *model;

I'd put the comment to the right of version, to make it immediately
obvious it applies to just version.

Re: [Qemu-devel] safety of migration_bitmap_extend

2015-11-12 Thread Wen Congyang

On 11/04/2015 05:19 PM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (we...@cn.fujitsu.com) wrote:
>> On 11/04/2015 05:05 PM, Dr. David Alan Gilbert wrote:
>>> * Wen Congyang (we...@cn.fujitsu.com) wrote:
 On 11/03/2015 09:47 PM, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> "Dr. David Alan Gilbert"  wrote:
>>> Hi,
>>>   I'm trying to understand why migration_bitmap_extend is correct/safe;
>>> If I understand correctly, you're arguing that:
>>>
>>>   1) the migration_bitmap_mutex around the extend, stops any sync's 
>>> happening
>>>  and so no new bits will be set during the extend.
>>>
>>>   2) If migration sends a page and clears a bitmap entry, it doesn't
>>>  matter if we lose the 'clear' because we're copying it as
>>>  we extend it, because losing the clear just means the page
>>>  gets resent, and so the data is OK.
>>>
>>> However, doesn't (2) mean that migration_dirty_pages might be wrong?
>>> If a page was sent, the bit cleared, and migration_dirty_pages 
>>> decremented,
>>> then if we copy over that bitmap and 'set' that bit again then 
>>> migration_dirty_pages
>>> is too small; that means that either migration would finish too early,
>>> or more likely, migration_dirty_pages would wrap-around -ve and
>>> never finish.
>>>
>>> Is there a reason it's really safe?
>>
>> No.  It is reasonably safe.  Various values of reasonably.
>>
>> migration_dirty_pages should never arrive at values near zero.  Because
>> we move to the completion stage way before it gets a value near zero.
>> (We could have very, very bad luck, as in it is not safe).
>
> That's only true if we hit the qemu_file_rate_limit() in ram_save_iterate;
> if we don't hit the rate limit (e.g. because we're CPU or network limited
> to slower than the set limit) then I think ram_save_iterate will go all 
> the
> way to sending every page; if that happens it'll go once more
> around the main migration loop, and call the pending routine, and now get
> a -ve (very +ve) number of pending pages, so continuously do 
> ram_save_iterate
> again.
>
> We've had that type of bug before when we messed up the dirty-pages 
> calculation
> during hotplug.

 IIUC, migration_bitmap_extend() is called when migration is running, and 
 we hotplug
 a device.

 In this case, I think we hold the iothread mutex when 
 migration_bitmap_extend() is called.

 ram_save_complete() is also protected by the iothread mutex.

 So if migration_bitmap_extend() is called, the migration thread may be 
 blocked in
 migration_completion() and wait it. qemu_savevm_state_complete() will be 
 called after
 migration_completion() returns.
>>>
>>> But I don't think ram_save_iterate is protected by that lock, and my concern
>>> is that the dirty-pages calculation is wrong during the iteration phase, 
>>> and then
>>> the iteration phase will never exit and never try and get to 
>>> ram_save_complete.
>>
>> Yes, the dirty-pages may be wrong. But it is smaller, not larger than the 
>> exact value.
>> Why will the iteration phase never exit?
> 
> Imagine that migration_dirty_pages is slightly too small and we enter 
> ram_save_iterate;
> ram_save_iterate now sends *all* it's pages, it decrements 
> migration_dirty_pages for
> every page sent.  At the end of ram_save_iterate, migration_dirty_pages would 
> be negative.
> But migration_dirty_pages is *u*int64_t; so we exit ram_save_iterate,
> go around the main migration_thread loop again and call 
> qemu_savevm_state_pending, and
> it returns a very large number (because it's actually a negative number), so 
> we keep
> going around the loop, because it never gets smaller.

I don't know how to trigger the problem. I think store migration_dirty_pages in 
BitmapRcu
can fix this problem.

Thanks
Wen Congyang

> 
> Dave
> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Dave
>>>

 Thanks
 Wen Congyang

>
>> Now, do we really care if migration_dirty_pages is exact?  Not really,
>> we just use it to calculate if we should start the throotle or not.
>> That only test that each 1 second, so if we have written a couple of
>> pages that we are not accounting for, things should be reasonably safe.
>>
>> Once told that, I don't know why we didn't catch that problem during
>> review (yes, I am guilty here).  Not sure how to really fix it,
>> thought.  I think that the problem is more theoretical than real, but
>
> Dave
>
>> 
>>
>> Thanks, Juan.
>>
>>>
>>> Dave
>>>
>>> --
>>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>
> .
>

>>>

Re: [Qemu-devel] [PATCH v6 0/9] e1000: Various fixes and registers' implementation

2015-11-12 Thread Dmitry Fleytman


> On 12 Nov 2015, at 10:16 AM, Jason Wang  wrote:
> 
> 
> 
> On 11/11/2015 09:52 PM, Leonid Bloch wrote:
>> This series fixes issues with packet/octet counting in e1000's Statistic
>> registers, fixes a bug in the packet address filtering procedure, and
>> implements many MAC registers that were absent before, some Statistic
>> counters among them.
>> 
>> Besides this, the series introduces a parameter which, if set to "on"
>> (default), will cause the entire MAC registers' array to migrate during
>> live migration (please see patches #2 and #9 for details). The rational
>> behind this is the ability to implement additional MAC registers in the
>> future, without worrying about migration compatibility between future
>> versions. For compatibility with previous versions, the above mentioned
>> parameter can be set to "off".
>> 
>> Also, a new array is introduced to control the access to the various MAC
>> registers. This takes care of situations when a MAC register requires a
>> certain parameter to be accessed, or is partially implemented, and
>> requires a debug warning to be printed on access attempts.
>> 
>> Additionally, several cosmetic changes are made.
>> 
>> Differences v1-2:
>> 
>> * Wording of several commit messages corrected.
>> * For trivially implemented Diagnostic registers, a debug message is
>>  added on read/write attempts, alerting of incomplete implementation.
>> * Following testing on a physical device, only the lower 16 bits can now
>>  be read from AIT, and only the lower 4 - from FFMT*.
>> * The grow_8reg_if_not_full function is rewritten.
>> * inc_tx_bcast_or_mcast_count and increase_size_stats are now called
>>  from within e1000_send_packet, to avoid code duplication.
>> 
>> Differences v2-3:
>> 
>> * Minor rewordings of some commit messages (0002, 0003).
>> * Live migration capability is added to the newly implemented registers.
>> 
>> Differences v3-4:
>> 
>> * Introduction of the "full_mac_registers" parameter (see above).
>> * Reversion of the live migration handling introduced in v3.
>> * Small alignment changes in patch #1 to correspond with the following
>>  patches.
>> 
>> Differences v4-v5:
>> 
>> * Introduction of an array to control the access to the MAC registers.
>> * Removal of the specific functions that warned of partial
>>  implementation on read/write from patch 4.
>> * Adequate changes to patches 4 and 8: mainly adding the registers
>>  introduced there to the new array.
>> 
>> Differences v5-v6:
>> 
>> * The access control array now does not contain an "always accessible"
>>  bit. The assumption that a register is always accessible is based now
>>  solely on the facts that it has a read or write handler, and it does
>>  not require a flag to be set. That also makes place for 6 possible
>>  flag bits in the access control array, instead of 5 in v5.
>> * The support for backward compatibility, nor the new registers
>>  introduced in this series, can not be turned on now until the last
>>  patch in this series is applied. This is done to preserve
>>  compatibility if bisection in-between the patches of this series will
>>  be needed.
>> 
>> The majority of these changes result from Jason Wang's review - thank
>> you, Jason!
> 
> Applied in https://github.com/jasowang/qemu/commits/net 
>  with a minor
> modification:
> 
> - Move the compat property from patch 3 to patch 9

Thanks for your help, Jason!

> 
> Thanks
> 
>> 
>> Leonid Bloch (9):
>>  e1000: Cosmetic and alignment fixes
>>  e1000: Add support for migrating the entire MAC registers' array
>>  e1000: Introduced an array to control the access to the MAC registers
>>  e1000: Trivial implementation of various MAC registers
>>  e1000: Fixing the received/transmitted packets' counters
>>  e1000: Fixing the received/transmitted octets' counters
>>  e1000: Fixing the packet address filtering procedure
>>  e1000: Implementing various counters
>>  e1000: Introducing backward compatibility command line parameter
>> 
>> hw/net/e1000.c  | 476 
>> 
>> hw/net/e1000_regs.h |   8 +-
>> include/hw/compat.h |   4 +
>> 3 files changed, 379 insertions(+), 109 deletions(-)

Re: [Qemu-devel] [PATCH 1/1] configure: use appropriate code fragment for -fstack-protector checks

2015-11-12 Thread Markus Armbruster

Rodrigo Rebello  writes:

> The check for stack-protector support consisted in compiling and linking
> the test program below (output by function write_c_skeleton()) with the
> compiler flag -fstack-protector-strong first and then with
> -fstack-protector-all if the first one failed to work:
>
>   int main(void) { return 0; }
>
> This caused false positives when using certain toolchains in which the
> compiler accepted -fstack-protector-strong but no support was provided
> by the C library, since for this stack-protector variant the compiler
> emits canary code only for functions that meet specific conditions
> (local arrays, memory references to local variables, etc.) and the code
> fragment under test included none of them (hence no stack protection
> code generated, no link failure).
>
> This fix changes the test program used for -fstack-protector checks to
> include a function that meets conditions which cause the compiler to
> generate canary code in all variants.
>
> Signed-off-by: Rodrigo Rebello 
> ---
>  configure | 18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/configure b/configure
> index 46fd8bd..c3d9592 100755
> --- a/configure
> +++ b/configure
> @@ -1486,6 +1486,24 @@ for flag in $gcc_flags; do
>  done
>  
>  if test "$stack_protector" != "no"; then
> +  cat > $TMPC << EOF
> +void foo(const char *c);
> +
> +void foo(const char *c)
> +{
> +char arr[64], *p;
> +for (p = arr; *c; c++, p++) {
> +*p = *c;
> +}
> +}
> +
> +int main(void)
> +{
> +char c[] = "";
> +foo(c);

Why not simply foo("")?

Could the optimizer optimize away the pattern that triggers the canary?

To protect against that possibility, we could use

int main(int argc, char *argv[])
{
foo(argv[0]);
}

> +return 0;
> +}
> +EOF
>gcc_flags="-fstack-protector-strong -fstack-protector-all"
>sp_on=0
>for flag in $gcc_flags; do

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Li, Liang Z

> On 12/11/2015 03:49, Li, Liang Z wrote:
> > I am very surprised about the live migration performance  result when
> > I use your ' memeqzero4_paolo' instead of these SSE2 Intrinsics to
> > check the zero pages.
> 
> What code were you using?  Remember I suggested using only unsigned long
> checks, like
> 
>   unsigned long *p = ...
>   if (p[0] || p[1] || p[2] || p[3]
>   || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
>   return BUFFER_NOT_ZERO;
>   else
>   return BUFFER_ZERO;
> 



I use the following code:


bool memeqzero4_paolo(const void *data, size_t length)
{
const unsigned char *p = data;
unsigned long word;

if (!length)
return true;

/* Check len bytes not aligned on a word.  */
while (__builtin_expect(length & (sizeof(word) - 1), 0)) {
if (*p)
return false;
p++;
length--;
if (!length)
return true;
}

/* Check up to 16 bytes a word at a time.  */
for (;;) {
memcpy(, p, sizeof(word));
if (word)
return false;
p += sizeof(word);
length -= sizeof(word);
if (!length)
return true;
if (__builtin_expect(length & 15, 0) == 0)
break;
}

 /* Now we know that's zero, memcmp with self. */
 return memcmp(data, p, length) == 0;
}

> > The total live migration time increased about
> > 8%!   Not decreased.  Although in the unit test your '
> > memeqzero4_paolo'  has better performance, any idea?
> 
> You only tested the case of zero pages.  But real pages usually are not zero,
> even if they have a few zero bytes at the beginning.  It's very important to
> optimize the initial check before the memcmp call.
> 

In the unit test, I only test zero pages too, and the performance of  
'memeqzero4_paolo' is better.
But when merged into QEMU, it caused performance drop. Why?

> Paolo

Re: [Qemu-devel] [PATCH v11 00/12] Block replication for continuous checkpoints

2015-11-12 Thread Wen Congyang

ping...

On 11/03/2015 06:58 PM, Wen Congyang wrote:
> You can the detailed information about block replication from here:
> http://wiki.qemu.org/Features/BlockReplication
> 
> Usage:
> Please refer to docs/block-replication.txt
> 
> This patch series is based on the following patch series:
> 1. http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg03860.html
> 2. http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg06124.html
> 
> You can get the patch here:
> https://github.com/coloft/qemu/tree/wency/block-replication-v11
> 
> The newest framework will be sent later.
> 
> TODO:
> 1. Continuous block replication. It will be started after basic functions
>are accepted.
> 
> Changs Log:
> V11:
> 1. Reopen the backing file when starting blcok replication if it is not
>opened in R/W mode
> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>when opening backing file
> 3. Block the top BDS so there is only one block job for the top BDS and
>its backing chain.
> V10:
> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>reference.
> 2. Address the comments from Eric Blake
> V9:
> 1. Update the error messages
> 2. Rebase to the newest qemu
> 3. Split child add/delete support. These patches are sent in another patchset.
> V8:
> 1. Address Alberto Garcia's comments
> V7:
> 1. Implement adding/removing quorum child. Remove the option non-connect.
> 2. Simplify the backing refrence option according to Stefan Hajnoczi's 
> suggestion
> V6:
> 1. Rebase to the newest qemu.
> V5:
> 1. Address the comments from Gong Lei
> 2. Speed the failover up. The secondary vm can take over very quickly even
>if there are too many I/O requests.
> V4:
> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
> V3:
> 1: use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target uses the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu(use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
> 
> Wen Congyang (12):
>   unblock backup operations in backing file
>   Store parent BDS in BdrvChild
>   allow writing to the backing file
>   Backup: clear all bitmap when doing block checkpoint
>   Allow creating backup jobs when opening BDS
>   block: make bdrv_put_ref_bh_schedule() as a public API
>   docs: block replication's description
>   Add new block driver interfaces to control block replication
>   quorum: implement block driver interfaces for block replication
>   Implement new driver for block replication
>   support replication driver in blockdev-add
>   Add a new API to start/stop replication, do checkpoint to all BDSes
> 
>  block.c| 211 -
>  block/Makefile.objs|   3 +-
>  block/backup.c |  14 ++
>  block/quorum.c |  78 +++
>  block/replication.c| 550 
> +
>  blockdev.c |  37 +--
>  blockjob.c |  11 +
>  docs/block-replication.txt | 251 +
>  include/block/block.h  |  10 +
>  include/block/block_int.h  |  15 ++
>  include/block/blockjob.h   |  12 +
>  qapi/block-core.json   |  34 ++-
>  12 files changed, 1190 insertions(+), 36 deletions(-)
>  create mode 100644 block/replication.c
>  create mode 100644 docs/block-replication.txt
>

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Paolo Bonzini



On 12/11/2015 09:53, Li, Liang Z wrote:
>> On 12/11/2015 03:49, Li, Liang Z wrote:
>>> I am very surprised about the live migration performance  result when
>>> I use your ' memeqzero4_paolo' instead of these SSE2 Intrinsics to
>>> check the zero pages.
>>
>> What code were you using?  Remember I suggested using only unsigned long
>> checks, like
>>
>>  unsigned long *p = ...
>>  if (p[0] || p[1] || p[2] || p[3]
>>  || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
>>  return BUFFER_NOT_ZERO;
>>  else
>>  return BUFFER_ZERO;
>>
> 
> I use the following code:
> 
> 
> bool memeqzero4_paolo(const void *data, size_t length)
> {
>  ...
> }

The code you used is very generic and not optimized for the kind of data
you see during migration, hence the existing code in QEMU fares better.

>>> The total live migration time increased about
>>> 8%!   Not decreased.  Although in the unit test your '
>>> memeqzero4_paolo'  has better performance, any idea?
>>
>> You only tested the case of zero pages.  But real pages usually are not zero,
>> even if they have a few zero bytes at the beginning.  It's very important to
>> optimize the initial check before the memcmp call.
>>
> 
> In the unit test, I only test zero pages too, and the performance of  
> 'memeqzero4_paolo' is better.
> But when merged into QEMU, it caused performance drop. Why?

Because QEMU is not migrating zero pages only.

Paolo

Re: [Qemu-devel] [RFC PATCH v4 01/11] exec: Remove cpu from cpus list during cpu_exec_exit()

2015-11-12 Thread Bharata B Rao

On Thu, Nov 12, 2015 at 05:11:02PM +0800, Zhu Guihua wrote:
> Hi Bharata,
> 
> On 09/09/2015 03:56 PM, Bharata B Rao wrote:
> >On Wed, Sep 09, 2015 at 03:41:30PM +0800, Zhu Guihua wrote:
> >>On 09/09/2015 01:52 PM, Bharata B Rao wrote:
> >>>On Fri, Sep 04, 2015 at 03:31:24PM +1000, David Gibson wrote:
> On Thu, Aug 06, 2015 at 10:57:07AM +0530, Bharata B Rao wrote:
> >CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
> >should be removed from cpu_exec_exit().
> >
> >cpu_exec_init() is called from generic CPU::instance_finalize and some
> >archs like PowerPC call it from CPU unrealizefn. So ensure that we
> >dequeue the cpu only once.
> >
> >Instead of introducing a new field CPUState.queued, I could have used
> >CPUState.cpu_index to check if the cpu is already dequeued from the list.
> >Since that doesn't work for CONFIG_USER_ONLY, I had to add a new field.
> >
> >Signed-off-by: Bharata B Rao 
> This seems reasonable to me, but I'm wondering how x86 cpu hotplug /
> unplug is working without it.
> >>>x86 hotplug/unplug code currently resides in Zhu's git tree
> >>>(git://github.com/zhugh/qemu). They are removing the CPU from the list
> >>>explicitly in x86 CPU's instance_finalize routine.
> >>Sorry, my git tree is git://github.com/zhuguihua/qemu
> >>
> >>Now there was no progress about topology, so we don't know what will happen
> >>in x86. I am not sure whether we will take this method finally.
> >Andreas had a presentation on this topic in KVM forum recently.
> >
> >Andreas - do you have any updates on the topology and other aspects
> >of CPU hotplug so that we can align the CPU hotplug work in different
> >archs accordingly and hope to get it merged in 2.5 time frame ?
> 
> Do you update the patchset?
> 
> My work in x86 has stopped for a while, Maybe I can get some ideas from
> another
> arch's worker.

My last version is here:
https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg00650.html

I initally started with core level CPU hotplug, moved to socket level hotplug
based on Andreas' patchset and then moved back again to core level hotplug.

I was a bit confused about how the generic semantics would evovle and hence
the work got delayed. I wil be posting the next version of my patchset
based on core level semantics soon.

I am hoping that I should be able to get CPU hotplug/unplug included
in QEMU-2.6 timeframe.

Regards,
Bharata.

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Li, Liang Z

> >>> I am very surprised about the live migration performance  result
> >>> when I use your ' memeqzero4_paolo' instead of these SSE2 Intrinsics
> >>> to check the zero pages.
> >>
> >> What code were you using?  Remember I suggested using only unsigned
> >> long checks, like
> >>
> >>unsigned long *p = ...
> >>if (p[0] || p[1] || p[2] || p[3]
> >>|| memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
> >>return BUFFER_NOT_ZERO;
> >>else
> >>return BUFFER_ZERO;
> >>
> >
> > I use the following code:
> >
> >
> > bool memeqzero4_paolo(const void *data, size_t length) {
> >  ...
> > }
> 
> The code you used is very generic and not optimized for the kind of data you
> see during migration, hence the existing code in QEMU fares better.
> 

I migrate a 8GB RAM Idle guest,  I think most of it's pages are zero pages.

I use your new code:
-
unsigned long *p = ...
if (p[0] || p[1] || p[2] || p[3]
|| memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
return BUFFER_NOT_ZERO;
else
return BUFFER_ZERO;
---
and the result is almost the same.  I also tried the check 8, 16 long data at 
the beginning, 
same result.

> >>> The total live migration time increased about
> >>> 8%!   Not decreased.  Although in the unit test your '
> >>> memeqzero4_paolo'  has better performance, any idea?
> >>
> >> You only tested the case of zero pages.  But real pages usually are
> >> not zero, even if they have a few zero bytes at the beginning.  It's
> >> very important to optimize the initial check before the memcmp call.
> >>
> >
> > In the unit test, I only test zero pages too, and the performance of
> 'memeqzero4_paolo' is better.
> > But when merged into QEMU, it caused performance drop. Why?
> 
> Because QEMU is not migrating zero pages only.
> 
> Paolo

Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit

2015-11-12 Thread Thomas Huth

On 12/11/15 09:09, Thomas Huth wrote:
> On 11/11/15 18:16, Aravinda Prasad wrote:
>> Memory error such as bit flips that cannot be corrected
>> by hardware are passed on to the kernel for handling.
>> If the memory address in error belongs to guest then
>> guest kernel is responsible for taking suitable action.
>> Patch [1] enhances KVM to exit guest with exit reason
>> set to KVM_EXIT_NMI in such cases.
>>
>> This patch handles KVM_EXIT_NMI exit. If the guest OS
>> has registered the machine check handling routine by
>> calling "ibm,nmi-register", then the handler builds
>> the error log and invokes the registered handler else
>> invokes the handler at 0x200.
>>
>> [1] http://marc.info/?l=kvm-ppc=144726114408289
>>
>> Signed-off-by: Aravinda Prasad 
>> ---
>>  target-ppc/kvm.c |   69 +++
>>  target-ppc/kvm_ppc.h |   81 
>> ++
>>  2 files changed, 150 insertions(+)
>>
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 110436d..e2e5170 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -1665,6 +1665,11 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run 
>> *run)
>>  ret = 0;
>>  break;
>>  
>> +case KVM_EXIT_NMI:
>> +DPRINTF("handle NMI exception\n");
>> +ret = kvm_handle_nmi(cpu);
>> +break;
>> +
>>  default:
>>  fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
>>  ret = -1;
>> @@ -2484,3 +2489,67 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
>>  {
>>  return data & 0x;
>>  }
>> +
>> +int kvm_handle_nmi(PowerPCCPU *cpu)
>> +{
>> +struct rtas_mc_log mc_log;
>> +CPUPPCState *env = >env;
>> +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>> +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>> +
>> +cpu_synchronize_state(CPU(ppc_env_get_cpu(env)));
>> +
>> +/* Properly set bits in MSR before we invoke the handler */
>> +env->msr = 0;
>> +
>> +if (!(*pcc->interrupts_big_endian)(cpu)) {
>> +env->msr |= (1ULL << MSR_LE);
>> +}
>> +
>> +#ifdef TARGET_PPC64
>> +env->msr |= (1ULL << MSR_SF);
>> +#endif
>> +
>> +if (!spapr->guest_machine_check_addr) {
>> +/*
>> + * If OS has not registered with "ibm,nmi-register"
>> + * jump to 0x200
>> + */
> 
> Shouldn't you also check MSR_ME here first and enter checkstop when
> machine checks are disabled?
> Also I think you have to set up some more registers for machine check
> interrupts, like SRR0 and SRR1?
> 
>> +env->nip = 0x200;
>> +return 0;
>> +}
>> +
>> +qemu_mutex_lock(>mc_in_progress);
> 
> Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
> is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
> kvm_cpu_exec()),

In case you're looking for the calls, I just noticed that the
qemu_mutex_lock_iothread() have recently been pushed into
kvm_arch_handle_exit() itself.

> so if you would ever get one thread waiting for this
> mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock()
> because the other code would wait forever to get the BQL ==> Deadlock.
> 
> I think if you want to be able to handle multiple NMIs at once, you
> likely need something like an error log per CPU instead. And if an NMI
> happens one CPU while there is already a NMI handler running on the very
> same CPU, you could likely simply track this with an boolean variable
> and put the CPU into checkstop if this happens?

Ok, I now had a look into the LoPAPR spec, and if I've got that right,
you really have to serialize the NMIs in case they happen at multiple
CPUs at the same time. So I guess the best thing you can do here is
something like:

   while (spapr->mc_in_progress) {
   /*
* There is already another NMI in progress, thus we need
* to yield here to wait until it has been finsihed
*/
   qemu_mutex_unlock_iothread();
   usleep(10);
   qemu_mutex_lock_iothread();
   }
   spapr->mc_in_progress = true;

Also LoPAPR talks about 'subsequent processors report "fatal error
previously reported"', so maybe the other processors should report that
condition in this case?
And of course you've also got to check that the same CPU is not getting
multiple NMIs before the interlock function has been called again.

 Thomas

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Paolo Bonzini

On 12/11/2015 10:40, Li, Liang Z wrote:
> I migrate a 8GB RAM Idle guest,  I think most of it's pages are zero pages.
> 
> I use your new code:
> -
>   unsigned long *p = ...
>   if (p[0] || p[1] || p[2] || p[3]
>   || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
>   return BUFFER_NOT_ZERO;
>   else
>   return BUFFER_ZERO;
> ---
> and the result is almost the same.  I also tried the check 8, 16 long data at 
> the beginning, 
> same result.

Interesting...  Well, all I can say is that applaud you for testing your
hypothesis with the benchmark.

Probably the setup cost of memcmp is too high, because the testing loop
is already very optimized.

Please submit the AVX2 version if it helps!

Paolo

Re: [Qemu-devel] [PATCH v6 0/9] e1000: Various fixes and registers' implementation

2015-11-12 Thread Leonid Bloch

On Thu, Nov 12, 2015 at 10:16 AM, Jason Wang  wrote:
>
>
> On 11/11/2015 09:52 PM, Leonid Bloch wrote:
>> This series fixes issues with packet/octet counting in e1000's Statistic
>> registers, fixes a bug in the packet address filtering procedure, and
>> implements many MAC registers that were absent before, some Statistic
>> counters among them.
>>
>> Besides this, the series introduces a parameter which, if set to "on"
>> (default), will cause the entire MAC registers' array to migrate during
>> live migration (please see patches #2 and #9 for details). The rational
>> behind this is the ability to implement additional MAC registers in the
>> future, without worrying about migration compatibility between future
>> versions. For compatibility with previous versions, the above mentioned
>> parameter can be set to "off".
>>
>> Also, a new array is introduced to control the access to the various MAC
>> registers. This takes care of situations when a MAC register requires a
>> certain parameter to be accessed, or is partially implemented, and
>> requires a debug warning to be printed on access attempts.
>>
>> Additionally, several cosmetic changes are made.
>>
>> Differences v1-2:
>> 
>> * Wording of several commit messages corrected.
>> * For trivially implemented Diagnostic registers, a debug message is
>>   added on read/write attempts, alerting of incomplete implementation.
>> * Following testing on a physical device, only the lower 16 bits can now
>>   be read from AIT, and only the lower 4 - from FFMT*.
>> * The grow_8reg_if_not_full function is rewritten.
>> * inc_tx_bcast_or_mcast_count and increase_size_stats are now called
>>   from within e1000_send_packet, to avoid code duplication.
>>
>> Differences v2-3:
>> 
>> * Minor rewordings of some commit messages (0002, 0003).
>> * Live migration capability is added to the newly implemented registers.
>>
>> Differences v3-4:
>> 
>> * Introduction of the "full_mac_registers" parameter (see above).
>> * Reversion of the live migration handling introduced in v3.
>> * Small alignment changes in patch #1 to correspond with the following
>>   patches.
>>
>> Differences v4-v5:
>> 
>> * Introduction of an array to control the access to the MAC registers.
>> * Removal of the specific functions that warned of partial
>>   implementation on read/write from patch 4.
>> * Adequate changes to patches 4 and 8: mainly adding the registers
>>   introduced there to the new array.
>>
>> Differences v5-v6:
>> 
>> * The access control array now does not contain an "always accessible"
>>   bit. The assumption that a register is always accessible is based now
>>   solely on the facts that it has a read or write handler, and it does
>>   not require a flag to be set. That also makes place for 6 possible
>>   flag bits in the access control array, instead of 5 in v5.
>> * The support for backward compatibility, nor the new registers
>>   introduced in this series, can not be turned on now until the last
>>   patch in this series is applied. This is done to preserve
>>   compatibility if bisection in-between the patches of this series will
>>   be needed.
>>
>> The majority of these changes result from Jason Wang's review - thank
>> you, Jason!
>
> Applied in https://github.com/jasowang/qemu/commits/net with a minor
> modification:
>
> - Move the compat property from patch 3 to patch 9
>
> Thanks

Jason, thanks for your review! It was very helpful!
>
>>
>> Leonid Bloch (9):
>>   e1000: Cosmetic and alignment fixes
>>   e1000: Add support for migrating the entire MAC registers' array
>>   e1000: Introduced an array to control the access to the MAC registers
>>   e1000: Trivial implementation of various MAC registers
>>   e1000: Fixing the received/transmitted packets' counters
>>   e1000: Fixing the received/transmitted octets' counters
>>   e1000: Fixing the packet address filtering procedure
>>   e1000: Implementing various counters
>>   e1000: Introducing backward compatibility command line parameter
>>
>>  hw/net/e1000.c  | 476 
>> 
>>  hw/net/e1000_regs.h |   8 +-
>>  include/hw/compat.h |   4 +
>>  3 files changed, 379 insertions(+), 109 deletions(-)
>>
>

Re: [Qemu-devel] [PATCH 00/12] vl: graphics stubs + #ifdef cleanup, eliminate some global vars

2015-11-12 Thread Paolo Bonzini



On 11/11/2015 20:09, Eduardo Habkost wrote:
> * Clean up the graphics initialization code to reduce the
>   number of #ifdefs;
> * Remove the display_type == DT_NOGRAPHIC checks from hardware
>   emulation code;
> * Make the display_type global variable a local variable on
>   main();
> * Make the display_remote static variable a local variable on
>   main().
> 
> Eduardo Habkost (12):
>   vl: Add DT_COCOA DisplayType value
>   stubs: Add VNC initialization stubs
>   stubs: curses_display_init() stub
>   stubs: SDL initialization stubs
>   stubs: cocoa_display_init() stub
>   stubs: gtk_display_init() stub
>   stubs: spice initialization stubs
>   milkymist: Move DT_NOGRAPHIC check outside milkymist_tmu2_create()
>   vl: Replace DT_NOGRAPHIC with MachineState field
>   vl: Make display_type a local variable
>   vl: Move DisplayType typedef to vl.c
>   vl: Make display_remote a local variable
> 
>  hw/lm32/milkymist-hw.h  |  4 
>  hw/lm32/milkymist.c |  4 +++-
>  hw/nvram/fw_cfg.c   |  6 +++--
>  hw/sparc/sun4m.c|  2 +-
>  include/hw/boards.h |  1 +
>  include/sysemu/sysemu.h | 11 -
>  include/ui/console.h|  4 ++--
>  stubs/Makefile.objs |  5 
>  stubs/cocoa.c   | 10 
>  stubs/curses.c  | 10 
>  stubs/gtk.c | 10 
>  stubs/sdl.c | 17 +
>  stubs/spice.c   | 13 ++
>  stubs/vnc.c | 22 +
>  vl.c| 63 
> +++--
>  15 files changed, 122 insertions(+), 60 deletions(-)
>  create mode 100644 stubs/cocoa.c
>  create mode 100644 stubs/curses.c
>  create mode 100644 stubs/gtk.c
>  create mode 100644 stubs/sdl.c
>  create mode 100644 stubs/spice.c
>  create mode 100644 stubs/vnc.c

Interesting.  This wasn't how stubs were meant to be used, but I cannot
formulate any objection that makes sense. :)

However, please move the new files to stubs/ui/.

I'll review the DT_NOGRAPHIC changes shortly.

Paolo

Re: [Qemu-devel] [PATCH v13 3/3] block/gluster: add support for multiple gluster servers

2015-11-12 Thread Prasanna Kumar Kalever

On Tuesday, November 10, 2015 10:54:25 PM, Jeff Cody wrote:
> 
> On Tue, Nov 10, 2015 at 02:39:16PM +0530, Prasanna Kumar Kalever wrote:
> > This patch adds a way to specify multiple volfile servers to the gluster
> > block backend of QEMU with tcp|rdma transport types and their port numbers.
> > 
> > Problem:
> > 
> > Currently VM Image on gluster volume is specified like this:
> > 
> > file=gluster[+tcp]://host[:port]/testvol/a.img
> > 
> > Assuming we have three hosts in trusted pool with replica 3 volume
> > in action and unfortunately host (mentioned in the command above) went down
> > for some reason, since the volume is replica 3 we now have other 2 hosts
> > active from which we can boot the VM.
> > 
> > But currently there is no mechanism to pass the other 2 gluster host
> > addresses to qemu.
> > 
> > Solution:
> > 
> > New way of specifying VM Image on gluster volume with volfile servers:
> > (We still support old syntax to maintain backward compatibility)
> > 
> > Basic command line syntax looks like:
> > 
> > Pattern I:
> >  -drive driver=gluster,
> > volume=testvol,path=/path/a.raw,
> > servers.0.host=1.2.3.4,
> >[servers.0.port=24007,]
> >[servers.0.transport=tcp,]
> > servers.1.host=5.6.7.8,
> >[servers.1.port=24008,]
> >[servers.1.transport=rdma,] ...
> > 
> > Pattern II:
> >  'json:{"driver":"qcow2","file":{"driver":"gluster",
> >"volume":"testvol","path":"/path/a.qcow2",
> >"servers":[{tuple0},{tuple1}, ...{tupleN}]}}'
> > 
> >driver  => 'gluster' (protocol name)
> >volume  => name of gluster volume where our VM image resides
> >path=> absolute path of image in gluster volume
> > 
> >   {tuple}  => {"host":"1.2.3.4"[,"port":"24007","transport":"tcp"]}
> > 
> >host=> host address (hostname/ipv4/ipv6 addresses)
> >port=> port number on which glusterd is listening. (default
> >24007)
> >transport   => transport type used to connect to gluster management
> >daemon,
> >it can be tcp|rdma (default 'tcp')
> > 
> > Examples:
> > 1.
> >  -drive driver=qcow2,file.driver=gluster,
> > file.volume=testvol,file.path=/path/a.qcow2,
> > file.servers.0.host=1.2.3.4,
> > file.servers.0.port=24007,
> > file.servers.0.transport=tcp,
> > file.servers.1.host=5.6.7.8,
> > file.servers.1.port=24008,
> > file.servers.1.transport=rdma
> > 2.
> >  'json:{"driver":"qcow2","file":{"driver":"gluster","volume":"testvol",
> >  "path":"/path/a.qcow2","servers":
> >  [{"host":"1.2.3.4","port":"24007","transport":"tcp"},
> >   {"host":"4.5.6.7","port":"24008","transport":"rdma"}] } }'
> > 
> > This patch gives a mechanism to provide all the server addresses, which are
> > in
> > replica set, so in case host1 is down VM can still boot from any of the
> > active hosts.
> > 
> > This is equivalent to the backup-volfile-servers option supported by
> > mount.glusterfs (FUSE way of mounting gluster volume)
> > 
> > Credits: Sincere thanks to Kevin Wolf  and
> > "Deepak C Shetty"  for inputs and all their support
> > 
> > Signed-off-by: Prasanna Kumar Kalever 
> 
> 
> Previous versions of this commit mentioned that the new functionality
> is dependent on a recent fix in libgfapi.  This commit message is
> missing that line; does its absence mean that the new functionality is
> not dependent on any particular libgfapi version?
> 
> What happens if the new functionality is tried on the last stable
> libgfapi release?

Sorry for not removing this since long, actually the libgfapi fix is for 
defaults values
i.e. When glfs_set_volfile_server is invocated multiple times, only on the first
invocation gfapi code replace port 0 with 24007 and transport NULL with "tcp".

Any have to remove this dependency, I have put up code that will take care of 
defaults.

Thanks,
-prasanna 

Hence, replacing the parameters at the entry function is the right way.
> 
> Thanks!
> Jeff
> 
>

Re: [Qemu-devel] [PATCH V3 3/6] ide: add support for IDEBufferedRequest

2015-11-12 Thread Fam Zheng

On Fri, 11/06 09:42, Peter Lieven wrote:
> +BlockAIOCB *ide_buffered_readv(IDEState *s, int64_t sector_num,
> +   QEMUIOVector *iov, int nb_sectors,
> +   BlockCompletionFunc *cb, void *opaque)
> +{
> +BlockAIOCB *aioreq;
> +IDEBufferedRequest *req;
> +int c = 0;
> +
> +QLIST_FOREACH(req, >buffered_requests, list) {
> +c++;
> +}
> +if (c > MAX_BUFFERED_REQS) {
> +return blk_abort_aio_request(s->blk, cb, opaque, -EIO);
> +}
> +
> +req = g_new0(IDEBufferedRequest, 1);
> +req->original_qiov = iov;
> +req->original_cb = cb;
> +req->original_opaque = opaque;
> +req->iov.iov_base = qemu_blockalign(blk_bs(s->blk), iov->size);

Where is this bounce buffer freed?

> +req->iov.iov_len = iov->size;
> +qemu_iovec_init_external(>qiov, >iov, 1);
> +
> +aioreq = blk_aio_readv(s->blk, sector_num, >qiov, nb_sectors,
> +   ide_buffered_readv_cb, req);
> +
> +QLIST_INSERT_HEAD(>buffered_requests, req, list);
> +return aioreq;
> +}
> +
>  static void ide_sector_read(IDEState *s);
>  
>  static void ide_sector_read_cb(void *opaque, int ret)

Re: [Qemu-devel] [PULL 0/1] Block patches

2015-11-12 Thread Peter Maydell

On 11 November 2015 at 18:00, Jeff Cody  wrote:
> The following changes since commit 3c07587d49458341510360557c849e93e9afaf59:
>
>   Merge remote-tracking branch 'remotes/dgibson/tags/ppc-next-2015' into 
> staging (2015-11-11 09:34:18 +)
>
> are available in the git repository at:
>
>
>   g...@github.com:codyprime/qemu-kvm-jtc.git tags/block-pull-request
>
> for you to fetch changes up to c833d1e8f5e95762336a823a35ade65a2d0fe587:
>
>   gluster: allocate GlusterAIOCBs on the stack (2015-11-11 10:45:39 -0500)
>
> 
> Block patches
> 

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH V3 4/6] ide: orphan all buffered requests on DMA cancel

2015-11-12 Thread Fam Zheng

On Fri, 11/06 09:42, Peter Lieven wrote:
> If the guests canceles a DMA request we can prematurely
> invoke all callbacks of buffered requests and flag all them
> as orphaned. Ideally this avoids the need for draining all
> requests. For CDROM devices this works in 100% of all cases.
> 
> Signed-off-by: Peter Lieven 
> ---
>  hw/ide/pci.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> index d31ff88..a9e164e 100644
> --- a/hw/ide/pci.c
> +++ b/hw/ide/pci.c
> @@ -240,6 +240,22 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
>  /* Ignore writes to SSBM if it keeps the old value */
>  if ((val & BM_CMD_START) != (bm->cmd & BM_CMD_START)) {
>  if (!(val & BM_CMD_START)) {
> +/* First invoke the callbacks of all buffered requests
> + * and flag those requests as orphaned. Ideally there
> + * are no unbuffered (Scatter Gather DMA Requests or
> + * write requests) pending and we can avoid to drain. */
> +IDEBufferedRequest *req;
> +IDEState *s = idebus_active_if(bm->bus);
> +QLIST_FOREACH(req, >buffered_requests, list) {
> +if (!req->orphaned) {
> +#ifdef DEBUG_IDE
> +printf("%s: invoking cb %p of buffered request %p with"
> +   " -ECANCELED\n", __func__, req->original_cb, req);
> +#endif
> +req->original_cb(req->original_opaque, -ECANCELED);
> +}
> +req->orphaned = true;
> +}

Why not use bdrv_aio_cancel or bdrv_aio_cancel_async with the aio returned by
bdrv_aio_cancel?

Fam

>  /*
>   * We can't cancel Scatter Gather DMA in the middle of the
>   * operation or a partial (not full) DMA transfer would reach
> @@ -253,6 +269,9 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
>   * aio operation with preadv/pwritev.
>   */
>  if (bm->bus->dma->aiocb) {
> +#ifdef DEBUG_IDE
> +printf("%s: draining all remaining requests", __func__);
> +#endif
>  blk_drain_all();
>  assert(bm->bus->dma->aiocb == NULL);
>  }
> -- 
> 1.9.1
> 
>

[Qemu-devel] [PULL v2 06/12] e1000: Fixing the received/transmitted packets' counters

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

According to Intel's specs, these counters (as the other Statistic
registers) stick at 0x when this maximal value is reached.
Previously, they would reset after the max. value.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 767490c..57a61f6 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -580,6 +580,14 @@ putsum(uint8_t *data, uint32_t n, uint32_t sloc, uint32_t 
css, uint32_t cse)
 }
 }
 
+static inline void
+inc_reg_if_not_full(E1000State *s, int index)
+{
+if (s->mac_reg[index] != 0x) {
+s->mac_reg[index]++;
+}
+}
+
 static inline int
 vlan_enabled(E1000State *s)
 {
@@ -677,8 +685,8 @@ xmit_seg(E1000State *s)
 e1000_send_packet(s, tp->data, tp->size);
 }
 
-s->mac_reg[TPT]++;
-s->mac_reg[GPTC]++;
+inc_reg_if_not_full(s, TPT);
+s->mac_reg[GPTC] = s->mac_reg[TPT];
 n = s->mac_reg[TOTL];
 if ((s->mac_reg[TOTL] += s->tx.size) < n)
 s->mac_reg[TOTH]++;
@@ -1091,8 +1099,8 @@ e1000_receive_iov(NetClientState *nc, const struct iovec 
*iov, int iovcnt)
 }
 } while (desc_offset < total_size);
 
-s->mac_reg[GPRC]++;
-s->mac_reg[TPR]++;
+inc_reg_if_not_full(s, TPR);
+s->mac_reg[GPRC] = s->mac_reg[TPR];
 /* TOR - Total Octets Received:
  * This register includes bytes received in a packet from the  field through the  field, inclusively.
-- 
2.1.4

[Qemu-devel] [PULL v2 08/12] e1000: Fixing the packet address filtering procedure

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

Previously, if promiscuous unicast was enabled, a packet was received
straight away, even if it was a multicast or a broadcast packet. This
patch fixes that behavior, while making the filtering procedure a bit
more human-readable.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 9967b5d..821fed4 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -872,6 +872,7 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
 static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
 static const int mta_shift[] = {4, 3, 2, 0};
 uint32_t f, rctl = s->mac_reg[RCTL], ra[2], *rp;
+int isbcast = !memcmp(buf, bcast, sizeof bcast), ismcast = (buf[0] & 1);
 
 if (is_vlan_packet(s, buf) && vlan_rx_filter_enabled(s)) {
 uint16_t vid = be16_to_cpup((uint16_t *)(buf + 14));
@@ -881,14 +882,17 @@ receive_filter(E1000State *s, const uint8_t *buf, int 
size)
 return 0;
 }
 
-if (rctl & E1000_RCTL_UPE) // promiscuous
+if (!isbcast && !ismcast && (rctl & E1000_RCTL_UPE)) { /* promiscuous 
ucast */
 return 1;
+}
 
-if ((buf[0] & 1) && (rctl & E1000_RCTL_MPE))   // promiscuous mcast
+if (ismcast && (rctl & E1000_RCTL_MPE)) {  /* promiscuous mcast */
 return 1;
+}
 
-if ((rctl & E1000_RCTL_BAM) && !memcmp(buf, bcast, sizeof bcast))
+if (isbcast && (rctl & E1000_RCTL_BAM)) {  /* broadcast enabled */
 return 1;
+}
 
 for (rp = s->mac_reg + RA; rp < s->mac_reg + RA + 32; rp += 2) {
 if (!(rp[1] & E1000_RAH_AV))
-- 
2.1.4

[Qemu-devel] [PULL v2 05/12] e1000: Trivial implementation of various MAC registers

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

These registers appear in Intel's specs, but were not implemented.
These registers are now implemented trivially, i.e. they are initiated
with zero values, and if they are RW, they can be written or read by the
driver, or read only if they are R (essentially retaining their zero
values). For these registers no other procedures are performed.

For the trivially implemented Diagnostic registers, a debug warning is
produced on read/write attempts.

PLEASE NOTE: these registers will not be active, nor will migrate, until
a compatibility flag will be set (in a later patch in this series).

The registers implemented here are:

Transmit:
RW: AIT

Management:
RW: WUC WUS IPAVIP6AT*  IP4AT*  FFLT*   WUPM*   FFMT*   FFVT*

Diagnostic:
RW: RDFHRDFTRDFHS   RDFTS   RDFPC   PBM*TDFHTDFTTDFHS
TDFTS   TDFPC

Statistic:
RW: FCRUC
R:  RNBCTSCTFC  MGTPRC  MGTPDC  MGTPTC  RFC RJC SCC ECOL
LATECOL MCC COLCDC  TNCRS   SEC CEXTERR RLECXONRXC
XONTXC  XOFFRXC XOFFTXC

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c  | 97 +++--
 hw/net/e1000_regs.h |  6 
 2 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index e079f25..767490c 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -172,7 +172,17 @@ enum {
 defreg(TPR), defreg(TPT), defreg(TXDCTL),  defreg(WUFC),
 defreg(RA),  defreg(MTA), defreg(CRCERRS), defreg(VFTA),
 defreg(VET), defreg(RDTR),defreg(RADV),defreg(TADV),
-defreg(ITR),
+defreg(ITR), defreg(FCRUC),   defreg(TDFH),defreg(TDFT),
+defreg(TDFHS),   defreg(TDFTS),   defreg(TDFPC),   defreg(RDFH),
+defreg(RDFT),defreg(RDFHS),   defreg(RDFTS),   defreg(RDFPC),
+defreg(IPAV),defreg(WUC), defreg(WUS), defreg(AIT),
+defreg(IP6AT),   defreg(IP4AT),   defreg(FFLT),defreg(FFMT),
+defreg(FFVT),defreg(WUPM),defreg(PBM), defreg(SCC),
+defreg(ECOL),defreg(MCC), defreg(LATECOL), defreg(COLC),
+defreg(DC),  defreg(TNCRS),   defreg(SEC), defreg(CEXTERR),
+defreg(RLEC),defreg(XONRXC),  defreg(XONTXC),  defreg(XOFFRXC),
+defreg(XOFFTXC), defreg(RFC), defreg(RJC), defreg(RNBC),
+defreg(TSCTFC),  defreg(MGTPRC),  defreg(MGTPDC),  defreg(MGTPTC)
 };
 
 static void
@@ -1122,6 +1132,30 @@ mac_readreg(E1000State *s, int index)
 }
 
 static uint32_t
+mac_low4_read(E1000State *s, int index)
+{
+return s->mac_reg[index] & 0xf;
+}
+
+static uint32_t
+mac_low11_read(E1000State *s, int index)
+{
+return s->mac_reg[index] & 0x7ff;
+}
+
+static uint32_t
+mac_low13_read(E1000State *s, int index)
+{
+return s->mac_reg[index] & 0x1fff;
+}
+
+static uint32_t
+mac_low16_read(E1000State *s, int index)
+{
+return s->mac_reg[index] & 0x;
+}
+
+static uint32_t
 mac_icr_read(E1000State *s, int index)
 {
 uint32_t ret = s->mac_reg[ICR];
@@ -1223,18 +1257,37 @@ static uint32_t (*macreg_readops[])(E1000State *, int) 
= {
 getreg(RDH),  getreg(RDT),  getreg(VET),  getreg(ICS),
 getreg(TDBAL),getreg(TDBAH),getreg(RDBAH),getreg(RDBAL),
 getreg(TDLEN),getreg(RDLEN),getreg(RDTR), getreg(RADV),
-getreg(TADV), getreg(ITR),
+getreg(TADV), getreg(ITR),  getreg(FCRUC),getreg(IPAV),
+getreg(WUC),  getreg(WUS),  getreg(SCC),  getreg(ECOL),
+getreg(MCC),  getreg(LATECOL),  getreg(COLC), getreg(DC),
+getreg(TNCRS),getreg(SEC),  getreg(CEXTERR),  getreg(RLEC),
+getreg(XONRXC),   getreg(XONTXC),   getreg(XOFFRXC),  getreg(XOFFTXC),
+getreg(RFC),  getreg(RJC),  getreg(RNBC), getreg(TSCTFC),
+getreg(MGTPRC),   getreg(MGTPDC),   getreg(MGTPTC),
 
 [TOTH]= mac_read_clr8,  [TORH]= mac_read_clr8,
 [GPRC]= mac_read_clr4,  [GPTC]= mac_read_clr4,
 [TPT] = mac_read_clr4,  [TPR] = mac_read_clr4,
 [ICR] = mac_icr_read,   [EECD]= get_eecd,
 [EERD]= flash_eerd_read,
+[RDFH]= mac_low13_read, [RDFT]= mac_low13_read,
+[RDFHS]   = mac_low13_read, [RDFTS]   = mac_low13_read,
+[RDFPC]   = mac_low13_read,
+[TDFH]= mac_low11_read, [TDFT]= mac_low11_read,
+[TDFHS]   = mac_low13_read, [TDFTS]   = mac_low13_read,
+[TDFPC]   = mac_low13_read,
+[AIT] = mac_low16_read,
 
 [CRCERRS ... MPC]   = _readreg,
+[IP6AT ... IP6AT+3] = _readreg,[IP4AT ... IP4AT+6] = _readreg,
+[FFLT ... FFLT+6]   = _low11_read,
 [RA ... RA+31]  = _readreg,
+[WUPM ... WUPM+31]  = _readreg,
 [MTA ... MTA+127]   = _readreg,
 [VFTA ... VFTA+127] = _readreg,
+[FFMT ... FFMT+254]

[Qemu-devel] [PULL v2 03/12] e1000: Add support for migrating the entire MAC registers' array

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

This patch makes the migration of the entire array of MAC registers
possible during live migration. The entire array is just 128 KB long, so
practically no penalty should be felt when transmitting it, additionally
to the previously transmitted individual registers. The advantage here is
eliminating the need to introduce new vmstate subsections in the future,
when additional MAC registers will be implemented.

Backward compatibility is preserved by introducing a e1000-specific
boolean parameter (in a later patch), which will be on by default.
Setting it to off would enable migration to older versions of QEMU.

Additionally, this parameter will be used to control the access to the
extra MAC registers in the future.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index da72776..7088027 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -135,8 +135,10 @@ typedef struct E1000State_st {
 /* Compatibility flags for migration to/from qemu 1.3.0 and older */
 #define E1000_FLAG_AUTONEG_BIT 0
 #define E1000_FLAG_MIT_BIT 1
+#define E1000_FLAG_MAC_BIT 2
 #define E1000_FLAG_AUTONEG (1 << E1000_FLAG_AUTONEG_BIT)
 #define E1000_FLAG_MIT (1 << E1000_FLAG_MIT_BIT)
+#define E1000_FLAG_MAC (1 << E1000_FLAG_MAC_BIT)
 uint32_t compat_flags;
 } E1000State;
 
@@ -1380,6 +1382,13 @@ static bool e1000_mit_state_needed(void *opaque)
 return s->compat_flags & E1000_FLAG_MIT;
 }
 
+static bool e1000_full_mac_needed(void *opaque)
+{
+E1000State *s = opaque;
+
+return s->compat_flags & E1000_FLAG_MAC;
+}
+
 static const VMStateDescription vmstate_e1000_mit_state = {
 .name = "e1000/mit_state",
 .version_id = 1,
@@ -1395,6 +1404,17 @@ static const VMStateDescription vmstate_e1000_mit_state 
= {
 }
 };
 
+static const VMStateDescription vmstate_e1000_full_mac_state = {
+.name = "e1000/full_mac_state",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = e1000_full_mac_needed,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32_ARRAY(mac_reg, E1000State, 0x8000),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_e1000 = {
 .name = "e1000",
 .version_id = 2,
@@ -1474,6 +1494,7 @@ static const VMStateDescription vmstate_e1000 = {
 },
 .subsections = (const VMStateDescription*[]) {
 _e1000_mit_state,
+_e1000_full_mac_state,
 NULL
 }
 };
-- 
2.1.4

[Qemu-devel] [PULL v2 07/12] e1000: Fixing the received/transmitted octets' counters

2015-11-12 Thread Jason Wang

From: Leonid Bloch 

Previously, these 64-bit registers did not stick at their maximal
values when (and if) they reached them, as they should do, according to
the specs.

This patch introduces a function that takes care of such registers,
avoiding code duplication, making the relevant parts more compatible
with the QEMU coding style, while ensuring that in the unlikely case
of reaching the maximal value, the counter will stick there, as it
supposed to.

Signed-off-by: Leonid Bloch 
Signed-off-by: Dmitry Fleytman 
Signed-off-by: Jason Wang 
---
 hw/net/e1000.c | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 57a61f6..9967b5d 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -588,6 +588,20 @@ inc_reg_if_not_full(E1000State *s, int index)
 }
 }
 
+static void
+grow_8reg_if_not_full(E1000State *s, int index, int size)
+{
+uint64_t sum = s->mac_reg[index] | (uint64_t)s->mac_reg[index+1] << 32;
+
+if (sum + size < sum) {
+sum = ~0ULL;
+} else {
+sum += size;
+}
+s->mac_reg[index] = sum;
+s->mac_reg[index+1] = sum >> 32;
+}
+
 static inline int
 vlan_enabled(E1000State *s)
 {
@@ -637,7 +651,7 @@ static void
 xmit_seg(E1000State *s)
 {
 uint16_t len, *sp;
-unsigned int frames = s->tx.tso_frames, css, sofar, n;
+unsigned int frames = s->tx.tso_frames, css, sofar;
 struct e1000_tx *tp = >tx;
 
 if (tp->tse && tp->cptse) {
@@ -686,10 +700,8 @@ xmit_seg(E1000State *s)
 }
 
 inc_reg_if_not_full(s, TPT);
+grow_8reg_if_not_full(s, TOTL, s->tx.size);
 s->mac_reg[GPTC] = s->mac_reg[TPT];
-n = s->mac_reg[TOTL];
-if ((s->mac_reg[TOTL] += s->tx.size) < n)
-s->mac_reg[TOTH]++;
 }
 
 static void
@@ -1104,11 +1116,9 @@ e1000_receive_iov(NetClientState *nc, const struct iovec 
*iov, int iovcnt)
 /* TOR - Total Octets Received:
  * This register includes bytes received in a packet from the  field through the  field, inclusively.
+ * Always include FCS length (4) in size.
  */
-n = s->mac_reg[TORL] + size + /* Always include FCS length. */ 4;
-if (n < s->mac_reg[TORL])
-s->mac_reg[TORH]++;
-s->mac_reg[TORL] = n;
+grow_8reg_if_not_full(s, TORL, size+4);
 
 n = E1000_ICS_RXT0;
 if ((rdt = s->mac_reg[RDT]) < s->mac_reg[RDH])
-- 
2.1.4

Re: [Qemu-devel] [PATCH 3/4] spapr: Handle "ibm, nmi-register" and "ibm, nmi-interlock" RTAS calls

2015-11-12 Thread Thomas Huth

On 11/11/15 18:15, Aravinda Prasad wrote:
> This patch adds support in QEMU to handle "ibm,nmi-register"
> and "ibm,nmi-interlock" RTAS calls.
> 
> The machine check notification address is saved when the
> OS issues "ibm,nmi-register" RTAS call.
> 
> This patch also handles the case when multiple processors
> experience machine check at or about the same time by
> handling "ibm,nmi-interlock" call. In such cases, as per
> PAPR, subsequent processors serialize waiting for the first
> processor to issue the "ibm,nmi-interlock" call. The second
> processor waits till the first processor, which also
> received a machine check error, is done reading the error
> log. The first processor issues "ibm,nmi-interlock" call
> when the error log is consumed. This patch implements the
> releasing part of the error-log while subsequent patch
> (which builds error log) handles the locking part.
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  hw/ppc/spapr_rtas.c|   29 +
>  include/hw/ppc/spapr.h |8 +++-
>  2 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 9869bc9..fd4d2af 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -597,6 +597,31 @@ out:
>  rtas_st(rets, 0, rc);
>  }
>  
> +static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
> +  sPAPRMachineState *spapr,
> +  uint32_t token, uint32_t nargs,
> +  target_ulong args,
> +  uint32_t nret, target_ulong rets)
> +{
> +qemu_mutex_init(>mc_in_progress);
> +spapr->guest_machine_check_addr = rtas_ld(args, 1);
> +rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +}
> +
> +static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
> +   sPAPRMachineState *spapr,
> +   uint32_t token, uint32_t nargs,
> +   target_ulong args,
> +   uint32_t nret, target_ulong rets)
> +{
> +/*
> + * VCPU issuing "ibm,nmi-interlock" is done with NMI handling,
> + * hence unlock mc_in_progress.
> + */
> +qemu_mutex_unlock(>mc_in_progress);
> +rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +}

Maybe the interlock function should return an error if the nmi_register
function has not been called before? OTOH, RTAS is not supposed to do
excessive parameter checking, so this is maybe not worth the effort.

 Thomas

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Li, Liang Z

> On 12/11/2015 10:40, Li, Liang Z wrote:
> > I migrate a 8GB RAM Idle guest,  I think most of it's pages are zero pages.
> >
> > I use your new code:
> > -
> > unsigned long *p = ...
> > if (p[0] || p[1] || p[2] || p[3]
> > || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
> > return BUFFER_NOT_ZERO;
> > else
> > return BUFFER_ZERO;
> > ---
> > and the result is almost the same.  I also tried the check 8, 16 long
> > data at the beginning, same result.
> 
> Interesting...  Well, all I can say is that applaud you for testing your 
> hypothesis
> with the benchmark.
> 
> Probably the setup cost of memcmp is too high, because the testing loop is
> already very optimized.
> 
> Please submit the AVX2 version if it helps!

Yes, the AVX2 version really helps. I have already submitted it, could you help 
to review it?

I am curious about the original intention to add the SSE2 Intrinsics, is the 
same reason?

I even suspect the VM may impact the 'memcmp()' performance, is it possible?

Liang

> Paolo

Re: [Qemu-devel] [v2 1/2] cutils: add avx2 instruction optimization

2015-11-12 Thread Paolo Bonzini



On 10/11/2015 03:51, Liang Li wrote:
> buffer_find_nonzero_offset() is a hot function during live migration.
> Now it use SSE2 intructions for optimization. For platform supports
> AVX2 instructions, use the AVX2 instructions for optimization can help
> to improve the performance about 30% comparing to SSE2.
> Zero page check can be faster with this optimization, the test result
> shows that for an 8GB RAM idle guest, this patch can help to shorten
> the total live migration time about 6%.
> 
> This patch use the ifunc mechanism to select the proper function when
> running, for platform supports AVX2, excute the AVX2 instructions,
> else, excute the original code.
> 
> Signed-off-by: Liang Li 
> ---
>  include/qemu-common.h | 28 +++--
>  util/Makefile.objs|  2 ++
>  util/avx2.c   | 69 
> +++
>  util/cutils.c | 53 +--
>  4 files changed, 143 insertions(+), 9 deletions(-)
>  create mode 100644 util/avx2.c
> 
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index 2f74540..9fa7501 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -484,15 +484,29 @@ void qemu_hexdump(const char *buf, FILE *fp, const char 
> *prefix, size_t size);
>  #endif
>  
>  #define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
> -static inline bool
> -can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
> -{
> -return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> -   * sizeof(VECTYPE)) == 0
> -&& ((uintptr_t) buf) % sizeof(VECTYPE) == 0);
> -}
> +bool can_use_buffer_find_nonzero_offset(const void *buf, size_t len);
> +
>  size_t buffer_find_nonzero_offset(const void *buf, size_t len);
>  
> +extern bool
> +can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len);
> +
> +extern size_t buffer_find_nonzero_offset_avx2(const void *buf, size_t len);
> +
> +extern bool
> +can_use_buffer_find_nonzero_offset_inner(const void *buf, size_t len);
> +
> +extern size_t buffer_find_nonzero_offset_inner(const void *buf, size_t len);
> +
> +__asm__(".type can_use_buffer_find_nonzero_offset, \%gnu_indirect_function");
> +__asm__(".type buffer_find_nonzero_offset, \%gnu_indirect_function");
> +
> +
> +void *can_use_buffer_find_nonzero_offset_ifunc(void) \
> + __asm__("can_use_buffer_find_nonzero_offset");
> +
> +void *buffer_find_nonzero_offset_ifunc(void) \
> + __asm__("buffer_find_nonzero_offset");
>  /*
>   * helper to parse debug environment variables
>   */
> diff --git a/util/Makefile.objs b/util/Makefile.objs
> index d7cc399..6aacad7 100644
> --- a/util/Makefile.objs
> +++ b/util/Makefile.objs
> @@ -1,4 +1,5 @@
>  util-obj-y = osdep.o cutils.o unicode.o qemu-timer-common.o
> +util-obj-y += avx2.o
>  util-obj-$(CONFIG_POSIX) += compatfd.o
>  util-obj-$(CONFIG_POSIX) += event_notifier-posix.o
>  util-obj-$(CONFIG_POSIX) += mmap-alloc.o
> @@ -29,3 +30,4 @@ util-obj-y += qemu-coroutine.o qemu-coroutine-lock.o 
> qemu-coroutine-io.o
>  util-obj-y += qemu-coroutine-sleep.o
>  util-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
>  util-obj-y += buffer.o
> +avx2.o-cflags  := $(AVX2_CFLAGS)
> diff --git a/util/avx2.c b/util/avx2.c
> new file mode 100644
> index 000..0e6915a
> --- /dev/null
> +++ b/util/avx2.c
> @@ -0,0 +1,69 @@
> +#include "qemu-common.h"
> +
> +#ifdef __AVX2__
> +#include 
> +#define AVX2_VECTYPE__m256i
> +#define AVX2_SPLAT(p)   _mm256_set1_epi8(*(p))
> +#define AVX2_ALL_EQ(v1, v2) \
> +(_mm256_movemask_epi8(_mm256_cmpeq_epi8(v1, v2)) == 0x)
> +#define AVX2_VEC_OR(v1, v2) (_mm256_or_si256(v1, v2))
> +
> +inline bool
> +can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len)
> +{
> +return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> +   * sizeof(AVX2_VECTYPE)) == 0
> +&& ((uintptr_t) buf) % sizeof(AVX2_VECTYPE) == 0);
> +}
> +
> +size_t buffer_find_nonzero_offset_avx2(const void *buf, size_t len)
> +{
> +const AVX2_VECTYPE *p = buf;
> +const AVX2_VECTYPE zero = (AVX2_VECTYPE){0};
> +size_t i;
> +
> +assert(can_use_buffer_find_nonzero_offset_avx2(buf, len));
> +
> +if (!len) {
> +return 0;
> +}
> +
> +for (i = 0; i < BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR; i++) {
> +if (!AVX2_ALL_EQ(p[i], zero)) {
> +return i * sizeof(AVX2_VECTYPE);
> +}
> +}
> +
> +for (i = BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR;
> + i < len / sizeof(AVX2_VECTYPE);
> + i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
> +AVX2_VECTYPE tmp0 = AVX2_VEC_OR(p[i + 0], p[i + 1]);
> +AVX2_VECTYPE tmp1 = AVX2_VEC_OR(p[i + 2], p[i + 3]);
> +AVX2_VECTYPE tmp2 = AVX2_VEC_OR(p[i + 4], p[i + 5]);
> +AVX2_VECTYPE tmp3 = AVX2_VEC_OR(p[i + 6], p[i + 7]);
> +AVX2_VECTYPE tmp01 = AVX2_VEC_OR(tmp0, tmp1);
>

Re: [Qemu-devel] [PATCH 2/2] i440fx: print an error message if user tries to enable iommu

2015-11-12 Thread Bandan Das

Markus Armbruster  writes:

> Bandan Das  writes:
>
>> There's no indication of any sort that i440fx doesn't support
>> "iommu=on""
>>
>> Signed-off-by: Bandan Das 
>> ---
>>  hw/pci-host/piix.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
>> index 7b2fbf9..f12593a 100644
>> --- a/hw/pci-host/piix.c
>> +++ b/hw/pci-host/piix.c
>> @@ -301,6 +301,10 @@ static void i440fx_pcihost_realize(DeviceState *dev, 
>> Error **errp)
>>  static void i440fx_realize(PCIDevice *dev, Error **errp)
>>  {
>>  dev->config[I440FX_SMRAM] = 0x02;
>> +
>> +if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
>> +fprintf(stderr, "i440fx doesn't support emulated iommu\n");
>> +}
>>  }
>>  
>>  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
>
> error_report(), please.
>
> If this is just a warning, please prefix the message with "warning: ".

Sure will do. Yeah, it seems good enough just to print a message without
exiting.

> If it isn't, exit(1).

[Qemu-devel] [PATCH v2 1/2] q35: Check propery to determine if iommu is set

2015-11-12 Thread Bandan Das

The helper function machine_iommu() isn't necesary. We can
directly check for the property.

Signed-off-by: Bandan Das 
---
 hw/core/machine.c   | 5 -
 hw/pci-host/q35.c   | 2 +-
 include/hw/boards.h | 1 -
 3 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index f4db340..acca00d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -462,11 +462,6 @@ bool machine_usb(MachineState *machine)
 return machine->usb;
 }
 
-bool machine_iommu(MachineState *machine)
-{
-return machine->iommu;
-}
-
 bool machine_kernel_irqchip_allowed(MachineState *machine)
 {
 return machine->kernel_irqchip_allowed;
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index c81507d..1fb4707 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -506,7 +506,7 @@ static void mch_realize(PCIDevice *d, Error **errp)
  PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
 }
 /* Intel IOMMU (VT-d) */
-if (machine_iommu(current_machine)) {
+if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
 mch_init_dmar(mch);
 }
 }
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 3e9a92c..24eb6f0 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -33,7 +33,6 @@ MachineClass *find_default_machine(void);
 extern MachineState *current_machine;
 
 bool machine_usb(MachineState *machine);
-bool machine_iommu(MachineState *machine);
 bool machine_kernel_irqchip_allowed(MachineState *machine);
 bool machine_kernel_irqchip_required(MachineState *machine);
 int machine_kvm_shadow_mem(MachineState *machine);
-- 
2.5.0

[Qemu-devel] [PATCH v2 2/2] i440fx: print an error message if user tries to enable iommu

2015-11-12 Thread Bandan Das

There's no indication of any sort that i440fx doesn't support
"iommu=on""

Signed-off-by: Bandan Das 
---
 hw/pci-host/piix.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 7b2fbf9..ffcb846 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -34,6 +34,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/i386/ioapic.h"
 #include "qapi/visitor.h"
+#include "qemu/error-report.h"
 
 /*
  * I440FX chipset data sheet.
@@ -301,6 +302,10 @@ static void i440fx_pcihost_realize(DeviceState *dev, Error 
**errp)
 static void i440fx_realize(PCIDevice *dev, Error **errp)
 {
 dev->config[I440FX_SMRAM] = 0x02;
+
+if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
+error_report("warning: i440fx doesn't support emulated iommu\n");
+}
 }
 
 PCIBus *i440fx_init(const char *host_type, const char *pci_type,
-- 
2.5.0

Re: [Qemu-devel] [PATCH 3/4] block/gluster: using new qapi schema

2015-11-12 Thread Eric Blake

On 11/12/2015 03:22 AM, Prasanna Kumar Kalever wrote:
> this patch adds GlusterConf to qapi/block-core.json
> 
> Signed-off-by: Prasanna Kumar Kalever 
> ---
>  block/gluster.c  | 104 
> +--
>  qapi/block-core.json |  60 +++--
>  2 files changed, 109 insertions(+), 55 deletions(-)
> 

Pointing it out here for completeness, even though I first stumbled on
it when reviewing 4/4:

> @@ -190,13 +180,23 @@ static int qemu_gluster_parseuri(GlusterConf *gconf, 
> const char *filename)
>  ret = -EINVAL;
>  goto out;
>  }
> -gconf->host = g_strdup(qp->p[0].value);
> +gconf->server->host = g_strdup(qp->p[0].value);

This is abusing the 'host' field of GlusterServer to track a socket
path, and ignores the fact that port is meaningless for a
gluster+unix:// connection.

> @@ -224,8 +225,9 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, 
> const char *filename,
>  goto out;
>  }
>  
> -ret = glfs_set_volfile_server(glfs, gconf->transport, gconf->host,
> -gconf->port);
> +ret = glfs_set_volfile_server(glfs,
> +  
> GlusterTransport_lookup[gconf->server->transport],
> +  gconf->server->host, gconf->server->port);

At least gluster itself has the same overloaded abuse of terminology;
I'm hoping that a port of 0 is okay when requesting a "unix"
volfile_server.  [I don't know, because I didn't read the docs for
glfs_set_volfile_server()]

> +##
> +# @GlusterServer
> +#
> +# Details for connecting to a gluster server
> +#
> +# @host:   host address (hostname/ipv4/ipv6 addresses)
> +#
> +# @port:   #optional port number on which glusterd is listening
> +#   (default 24007)
> +#
> +# @transport:  #optional transport type used to connect to gluster management
> +#   daemon (default 'tcp')
> +#
> +# Since: 2.5
> +##
> +{ 'struct': 'GlusterServer',
> +  'data': { 'host': 'str',
> +'*port': 'int',
> +'*transport': 'GlusterTransport' } }

And my idea on patch 4/4 was that converting this from simple struct to
flat union might be a more realistic view of things (if transport is
'unix', there can't be a port; and rather than abusing the name 'host'
we could use the name 'socket'; similarly for 'rdma') - but without
additional qapi support, I don't know that we can have an optional
'transport' and still have a discriminated union in time for 2.5.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 2/3] usb-mtp: Add support for inotify based file monitoring

2015-11-12 Thread Bandan Das

Gerd Hoffmann  writes:

> On Mo, 2015-11-09 at 18:12 -0500, Bandan Das wrote:
>> Gerd Hoffmann  writes:
>> 
>> > On Di, 2015-11-03 at 19:00 -0500, Bandan Das wrote:
>> >> +/* Add a new watch asap so as to not lose events
>> >> */
>> >
>> > This comment sounds like there is a race ("asap").  There isn't one,
>> > correct ordering (adding the watch before reading the directory) is
>> 
>> Hmm, seems like there's still a small window. We may not have even
>> started processing the event because we are still processing the earlier
>> ones.
>
>> > enough to make sure you don't miss anything.  You might see create
>> > events for objects already in the tree though, are you prepared to
>> > handle that?
>> 
>> Oh, interesting.  Current version will happily add duplicate entries.
>> I will add a check.
>
> I think we are talking about the same thing here.
> Things can run in parallel, like this:
>
> process copying a file tree | qemu with usb-mtp
> +--
> create directory|
> | inotify event #1 queued (dir)
> | qemu fetches event #1
> | qemu adds new inotify watch
> copy file into new dir  |
> | inotify event #2 queued (file)
> | qemu reads new directory
> | qemu finds the new file
> | qemu fetches event #2
>
> So, yes, the kernel can add new inotify events for the new watch before

Maybe I am missing something but what if the watch on dir was
added by qemu _after_ the file (say file1) was copied to it.
Then, the kernel would generate events for file2, file3 and so on but
never a CREATE event for file1. Isn't that a possibility ? So, what I mean
by that comment is that add a watchpoint soon enough but it could be
possible that by the time the watch is added, a few files might have already
been copied and will not generate events.

> qemu finished processing the old event (especially before you are done
> reading the directory), and if you are hitting that the effect is that
> you see a create event for the new file even though you already have it
> in the tree.
>
> But it is impossible that you miss the creation of the new file (this is
> what I meant with "there is no race").
>
> hope this clarifies,
>   Gerd

[Qemu-devel] [PATCH v2 0/2] Minor cleanups when parsing the "iommu" option

2015-11-12 Thread Bandan Das

Small cleanup changes. The first removes the helper function by directly
checking the property and the second adds a error message if user tries
to use "-machine iommu=on" with i440fx.

v2:
2/2: use error_report for the warning message

Bandan Das (2):
  q35: Check propery to determine if iommu is set
  i440fx: print an error message if user tries to enable iommu

 hw/core/machine.c   | 5 -
 hw/pci-host/piix.c  | 5 +
 hw/pci-host/q35.c   | 2 +-
 include/hw/boards.h | 1 -
 4 files changed, 6 insertions(+), 7 deletions(-)

-- 
2.5.0

Re: [Qemu-devel] [PATCH 4/4] block/gluster: add support for multiple gluster servers

2015-11-12 Thread Eric Blake

On 11/12/2015 03:22 AM, Prasanna Kumar Kalever wrote:
> This patch adds a way to specify multiple volfile servers to the gluster
> block backend of QEMU with tcp|rdma transport types and their port numbers.
> 

> This patch gives a mechanism to provide all the server addresses, which are in
> replica set, so in case host1 is down VM can still boot from any of the
> active hosts.
> 
> This is equivalent to the backup-volfile-servers option supported by
> mount.glusterfs (FUSE way of mounting gluster volume)
> 
> Credits: Sincere thanks to Kevin Wolf  and
> "Deepak C Shetty"  for inputs and all their support
> 
> Signed-off-by: Prasanna Kumar Kalever 
> ---
>  block/gluster.c  | 288 
> ---
>  qapi/block-core.json |   4 +-
>  2 files changed, 252 insertions(+), 40 deletions(-)

All right - the diffstat is smaller this time around (288 is nicer than
468 lines changed in v13).  There's always a psychological barrier to
reviewing large patches, and breaking things into bite-sized chunks
helps even if the same amount of work is done overall.

> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 615f28b..ba209cf 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -12,6 +12,13 @@
>  #include "qemu/uri.h"
>  
>  #define GLUSTER_OPT_FILENAME"filename"
> +#define GLUSTER_OPT_VOLUME  "volume"
> +#define GLUSTER_OPT_PATH"path"
> +#define GLUSTER_OPT_HOST"host"
> +#define GLUSTER_OPT_PORT"port"
> +#define GLUSTER_OPT_TRANSPORT   "transport"
> +#define GLUSTER_OPT_SERVER_PATTERN  "server."
> +
>  #define GLUSTER_DEFAULT_PORT24007

Once again, I'm jumping to the interface first [1]


> @@ -131,6 +178,7 @@ static int qemu_gluster_parseuri(BlockdevOptionsGluster 
> **pgconf,
>   const char *filename)
>  {
>  BlockdevOptionsGluster *gconf;
> +GlusterServer *gsconf;
>  URI *uri;
>  QueryParams *qp = NULL;
>  bool is_unix = false;
> @@ -142,23 +190,24 @@ static int qemu_gluster_parseuri(BlockdevOptionsGluster 
> **pgconf,
>  }
>  
>  gconf = g_new0(BlockdevOptionsGluster, 1);
> -gconf->server = g_new0(GlusterServer, 1);
> +gconf->server = g_new0(GlusterServerList, 1);
> +gconf->server->value = gsconf = g_new0(GlusterServer, 1);
>  
>  /* transport */
>  if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
> -gconf->server->transport = GLUSTER_TRANSPORT_TCP;
> +gsconf->transport = GLUSTER_TRANSPORT_TCP;

Most of the changes here in parseuri could have been in patch 3/4 if we
weren't churning on the qapi definition.  But looks like your conversion
here is correct.

> -static struct glfs *qemu_gluster_init(BlockdevOptionsGluster **pgconf,
> -  const char *filename, Error **errp)
> +static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
> +   Error **errp)
>  {

I might have split the refactoring of qemu_gluster_glfs_init() into its
own patch, but not the end of the world the way it was done here.

>  struct glfs *glfs;
>  int ret;
>  int old_errno;
> -BlockdevOptionsGluster *gconf;
> -
> -ret = qemu_gluster_parseuri(, filename);
> -if (ret < 0) {
> -error_setg(errp, "Usage: file=gluster[+transport]://[host[:port]]/"
> - "volume/path[?socket=...]");
> -errno = -ret;
> -goto out;
> -}
> +GlusterServerList *server;
>  
>  glfs = glfs_new(gconf->volume);
>  if (!glfs) {
>  goto out;
>  }
>  
> -ret = glfs_set_volfile_server(glfs,
> -  
> GlusterTransport_lookup[gconf->server->transport],
> -  gconf->server->host, gconf->server->port);
> -if (ret < 0) {
> -goto out;
> +for (server = gconf->server; server; server = server->next) {

I still wonder if gconf->servers (and therefore servers.0, servers.1 in
the command line, instead of server.0, server.1) would have been a
better name for the list, but I don't know if it is worth repainting the
bikeshed at this point in time.  On the other hand, it's user-visible,
so once it gets released, we're stuck with the name, but up until then,
we can do a followup patch if anyone else has a strong opinion.

> +ret = glfs_set_volfile_server(glfs,
> +  
> GlusterTransport_lookup[server->value->transport],
> +  server->value->host, 
> server->value->port);

I asked in v13 if all initializations set the optional transport and
port.  See [3] below

> +if (ret < 0) {
> +goto out;
> +}
>  }
>  
>  /*
> @@ -244,10 +287,9 @@ static struct glfs 
> *qemu_gluster_init(BlockdevOptionsGluster **pgconf,
>  ret = glfs_init(glfs);
>

Re: [Qemu-devel] [PATCH 0/4] block/gluster: add support for multiple gluster servers

2015-11-12 Thread Eric Blake

[adding qemu-block]

On 11/12/2015 03:22 AM, Prasanna Kumar Kalever wrote:
> This release is rebased on qemu master branch.
> In this series of patches 1/4 and 2/4 are unchanged.

According to scripts/get-maintainer.pl, this series should have cc'd
qemu-bl...@nongnu.org.  I don't know if anyone on the block list missed
my reviews because they were only on qemu-devel; and it may matter to
other networked block devices that also need to implement structured
options for use in blockdev-add.

> 
> Prasanna Kumar Kalever (4):
>   block/gluster: rename [server, volname, image] -> [host, volume, path]
>   block/gluster: code cleanup
>   block/gluster: using new qapi schema
>   block/gluster: add support for multiple gluster servers
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 2/2] i440fx: print an error message if user tries to enable iommu

2015-11-12 Thread Eric Blake

On 11/12/2015 03:55 PM, Bandan Das wrote:
> There's no indication of any sort that i440fx doesn't support
> "iommu=on""
> 
> Signed-off-by: Bandan Das 
> ---
>  hw/pci-host/piix.c | 5 +
>  1 file changed, 5 insertions(+)
> 

> @@ -301,6 +302,10 @@ static void i440fx_pcihost_realize(DeviceState *dev, 
> Error **errp)
>  static void i440fx_realize(PCIDevice *dev, Error **errp)
>  {
>  dev->config[I440FX_SMRAM] = 0x02;
> +
> +if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
> +error_report("warning: i440fx doesn't support emulated iommu\n");

No trailing \n with error_report().

With that fixed (and perhaps maintainer can do it),
Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH V3 5/6] ide: enable buffered requests for ATAPI devices

2015-11-12 Thread Fam Zheng

On Fri, 11/06 09:42, Peter Lieven wrote:
> Signed-off-by: Peter Lieven 
> ---
>  hw/ide/atapi.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> index 29fd131..2f6d018 100644
> --- a/hw/ide/atapi.c
> +++ b/hw/ide/atapi.c
> @@ -190,8 +190,8 @@ static int cd_read_sector(IDEState *s, void *buf)
>  block_acct_start(blk_get_stats(s->blk), >acct,
>   4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>  
> -blk_aio_readv(s->blk, (int64_t)s->lba << 2, >qiov, 4,
> -  cd_read_sector_cb, s);
> +ide_buffered_readv(s, (int64_t)s->lba << 2, >qiov, 4,
> +   cd_read_sector_cb, s);
>  
>  s->status |= BUSY_STAT;
>  return 0;
> @@ -424,9 +424,9 @@ static void ide_atapi_cmd_read_dma_cb(void *opaque, int 
> ret)
>  s->bus->dma->iov.iov_len = n * 4 * 512;
>  qemu_iovec_init_external(>bus->dma->qiov, >bus->dma->iov, 1);
>  
> -s->bus->dma->aiocb = blk_aio_readv(s->blk, (int64_t)s->lba << 2,
> -   >bus->dma->qiov, n * 4,
> -   ide_atapi_cmd_read_dma_cb, s);
> +s->bus->dma->aiocb = ide_buffered_readv(s, (int64_t)s->lba << 2,
> +>bus->dma->qiov, n * 4,
> +ide_atapi_cmd_read_dma_cb, s);

IIRC the dma aiocb are still going to be drained in bmdma_cmd_writeb, so why do
we need the bounce buffer?

>  return;
>  
>  eot:
> -- 
> 1.9.1
> 
>

Re: [Qemu-devel] [v2 1/2] cutils: add avx2 instruction optimization

2015-11-12 Thread Juan Quintela

Paolo Bonzini  wrote:

>
> The main issue here is that you are not testing whether the compiler 
> supports gnu_indirect_function.
>
> I suggest that you start by moving the functions to util/buffer-zero.c
>
> Then the structure should be something like
>
> #ifdef CONFIG_HAVE_AVX2
> #include 
> #endif
>
> ... define buffer_find_nonzero_offset_inner ...
> ... define can_use_buffer_find_nonzero_offset_inner ...
>
> #if defined CONFIG_HAVE_GNU_IFUNC && defined CONFIG_HAVE_AVX2
> ... define buffer_find_nonzero_offset_avx2 ...
> ... define can_use_buffer_find_nonzero_offset_avx2 ...
> ... define the indirect functions ...
> #else
> ... define buffer_find_nonzero_offset that just calls
> buffer_find_nonzero_offset_inner ...
> ... define can_use_buffer_find_nonzero_offset that just calls
> can_use_buffer_find_nonzero_offset_inner ...
> #endif

My understanding for this was that glibc is better than hand made asm,
and paolo4_memzero (or whatever was it called) was the best approach.
And just remove SSE.  Have I missed something?


Later, Juan.

Re: [Qemu-devel] [PULL 42/57] Page request: Consume pages off the post-copy queue

2015-11-12 Thread Peter Maydell

On 10 November 2015 at 14:25, Juan Quintela  wrote:
> From: "Dr. David Alan Gilbert" 
>
> When transmitting RAM pages, consume pages that have been queued by
> MIG_RPCOMM_REQPAGE commands and send them ahead of normal page scanning.
>
> Note:
>   a) After a queued page the linear walk carries on from after the
> unqueued page; there is a reasonable chance that the destination
> was about to ask for other closeby pages anyway.
>
>   b) We have to be careful of any assumptions that the page walking
> code makes, in particular it does some short cuts on its first linear
> walk that break as soon as we do a queued page.
>
>   c) We have to be careful to not break up host-page size chunks, since
> this makes it harder to place the pages on the destination.
>
> Signed-off-by: Dr. David Alan Gilbert 
> Reviewed-by: Juan Quintela 
> Signed-off-by: Juan Quintela 

I've just discovered that this is causing 'make check' failures on
my OSX host (unfortunately something in my setup is causing
'make check' failures to not always cause a build failure, so I
didn't notice earlier):

manooth$ (make -C build/x86 -j8 && cd build/x86 &&
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
QTEST_QEMU_IMG=qemu-img MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM %
255 + 1))}  tests/ahci-test)
[...]

/x86_64/ahci/flush/simple: OK
/x86_64/ahci/flush/retry: OK
/x86_64/ahci/flush/migrate: qemu: qemu_mutex_lock: Invalid argument
qemu-system-x86_64:Broken pipe
 Not a migration stream
qemu-system-x86_64: load of migration failed: Invalid argument

thanks
-- PMM

[Qemu-devel] [PATCH v2] tests/vhost-user-bridge: implement logging of dirty pages

2015-11-12 Thread Victor Kaplansky

During migration devices continue writing to the guest's memory.
The writes has to be reported to QEMU. This change implements
minimal support in vhost-user-bridge required for successful
migration of a guest with virtio-net device.

Signed-off-by: Victor Kaplansky 
---
v2:
   - use log_guest_addr for used ring reported by qemu instead of
 translating.
   - use mmap_size and mmap_offset defined in new
 VHOST_USER_SET_LOG_BASE interface. See the patch
 "vhost-user: modify SET_LOG_BASE to pass mmap size and
 offset".
   - start logging dirty pages only after the appropriate feature
 is set by a VHOST_USER_GET_PROTOCOL_FEATURES request.
   - updated TODO list.

 tests/vhost-user-bridge.c | 169 ++
 1 file changed, 155 insertions(+), 14 deletions(-)

diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
index fa18ad5..8c1c997 100644
--- a/tests/vhost-user-bridge.c
+++ b/tests/vhost-user-bridge.c
@@ -13,16 +13,22 @@
 /*
  * TODO:
  * - main should get parameters from the command line.
- * - implement all request handlers.
+ * - implement all request handlers. Still not implemented:
+ *  vubr_set_protocol_features_exec()
+ *  vubr_get_queue_num_exec()
+ *  vubr_set_vring_enable_exec()
+ *  vubr_send_rarp_exec()
+ *  vubr_set_log_fd_exec()
  * - test for broken requests and virtqueue.
  * - implement features defined by Virtio 1.0 spec.
  * - support mergeable buffers and indirect descriptors.
- * - implement RESET_DEVICE request.
  * - implement clean shutdown.
  * - implement non-blocking writes to UDP backend.
  * - implement polling strategy.
  */
 
+#define _FILE_OFFSET_BITS 64
+
 #include 
 #include 
 #include 
@@ -166,6 +172,7 @@ typedef struct VubrVirtq {
 struct vring_desc *desc;
 struct vring_avail *avail;
 struct vring_used *used;
+uint64_t log_guest_addr;
 } VubrVirtq;
 
 /* Based on qemu/hw/virtio/vhost-user.c */
@@ -173,6 +180,9 @@ typedef struct VubrVirtq {
 #define VHOST_MEMORY_MAX_NREGIONS8
 #define VHOST_USER_F_PROTOCOL_FEATURES 30
 
+typedef uint8_t vhost_log_chunk_t;
+#define VHOST_LOG_PAGE 4096
+
 enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_MQ = 0,
 VHOST_USER_PROTOCOL_F_LOG_SHMFD = 1,
@@ -220,6 +230,11 @@ typedef struct VhostUserMemory {
 VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
 } VhostUserMemory;
 
+typedef struct VhostUserLog {
+uint64_t mmap_size;
+uint64_t mmap_offset;
+} VhostUserLog;
+
 typedef struct VhostUserMsg {
 VhostUserRequest request;
 
@@ -234,6 +249,7 @@ typedef struct VhostUserMsg {
 struct vhost_vring_state state;
 struct vhost_vring_addr addr;
 VhostUserMemory memory;
+VhostUserLog log;
 } payload;
 int fds[VHOST_MEMORY_MAX_NREGIONS];
 int fd_num;
@@ -265,8 +281,13 @@ typedef struct VubrDev {
 uint32_t nregions;
 VubrDevRegion regions[VHOST_MEMORY_MAX_NREGIONS];
 VubrVirtq vq[MAX_NR_VIRTQUEUE];
+int log_call_fd;
+uint64_t log_size;
+vhost_log_chunk_t *log_table;
 int backend_udp_sock;
 struct sockaddr_in backend_udp_dest;
+int ready;
+uint64_t features;
 } VubrDev;
 
 static const char *vubr_request_str[] = {
@@ -368,7 +389,12 @@ vubr_message_read(int conn_fd, VhostUserMsg *vmsg)
 
 rc = recvmsg(conn_fd, , 0);
 
-if (rc <= 0) {
+if (rc == 0) {
+vubr_die("recvmsg");
+fprintf(stderr, "Peer disconnected.\n");
+exit(1);
+}
+if (rc < 0) {
 vubr_die("recvmsg");
 }
 
@@ -395,7 +421,12 @@ vubr_message_read(int conn_fd, VhostUserMsg *vmsg)
 
 if (vmsg->size) {
 rc = read(conn_fd, >payload, vmsg->size);
-if (rc <= 0) {
+if (rc == 0) {
+vubr_die("recvmsg");
+fprintf(stderr, "Peer disconnected.\n");
+exit(1);
+}
+if (rc < 0) {
 vubr_die("recvmsg");
 }
 
@@ -465,12 +496,39 @@ vubr_virtqueue_kick(VubrVirtq *vq)
 }
 }
 
+
+static void
+vubr_log_page(uint8_t *log_table, uint64_t page)
+{
+DPRINT("Logged dirty guest page: %"PRId64"\n", page);
+log_table[page / 8] |= 1 << (page % 8);
+}
+
+static void
+vubr_log_write(VubrDev *dev, uint64_t address, uint64_t length)
+{
+uint64_t page;
+
+if (!(dev->features & VHOST_F_LOG_ALL) || !dev->log_table || !length) {
+return;
+}
+
+assert(dev->log_size >= ((address + length) / VHOST_LOG_PAGE / 8));
+
+page = address / VHOST_LOG_PAGE;
+while (page * VHOST_LOG_PAGE < address + length) {
+vubr_log_page(dev->log_table, page);
+page += VHOST_LOG_PAGE;
+}
+}
+
 static void
 vubr_post_buffer(VubrDev *dev, VubrVirtq *vq, uint8_t *buf, int32_t len)
 {
 struct vring_desc *desc   = vq->desc;
 struct vring_avail *avail = vq->avail;
 struct vring_used *used   = vq->used;
+uint64_t log_guest_addr =

Re: [Qemu-devel] [PULL v2 0/1] error: More error_setg() usage

2015-11-12 Thread Peter Maydell

On 12 November 2015 at 10:28, Peter Maydell  wrote:
> On 11 November 2015 at 17:59, Markus Armbruster  wrote:
>> v2: Indentation touched up
>>
>> The following changes since commit 3c07587d49458341510360557c849e93e9afaf59:
>>
>>   Merge remote-tracking branch 'remotes/dgibson/tags/ppc-next-2015' into 
>> staging (2015-11-11 09:34:18 +)
>>
>> are available in the git repository at:
>>
>>   git://repo.or.cz/qemu/armbru.git tags/pull-error-2015-11-11
>>
>> for you to fetch changes up to 455b0fde8c38a0794743e2e7c1a40018b7bee9f6:
>>
>>   error: More error_setg() usage (2015-11-11 18:56:26 +0100)
>>
>> 
>> error: More error_setg() usage
>>
>> 
>> Eric Blake (1):
>>   error: More error_setg() usage
>>
>>  block.c   |  3 +--
>>  docs/writing-qmp-commands.txt | 20 +---
>>  hw/i386/pc.c  |  6 +++---
>>  hw/net/rocker/rocker.c|  6 ++
>>  hw/net/rocker/rocker_of_dpa.c | 12 
>>  qom/object.c  |  4 ++--
>>  6 files changed, 21 insertions(+), 30 deletions(-)
>>
>> Eric Blake (1):
>>   error: More error_setg() usage
>
>
> Not clear whether this is the fault of these patches, but I get
> 'make check' failures on OSX:

...almost certainly not these patches. For some reason these
don't actually cause 'make check' to return non-zero, and
my "grep logs for warning/error etc" workflow wouldn't catch
them. I just happened to look at this log by hand.

thanks
-- PMM

Re: [Qemu-devel] [PATCH for-2.5 0/4] Expose ErrorClass through introspection

2015-11-12 Thread Markus Armbruster

Eric Blake  writes:

> I noticed that introspection was not documenting either
> qmp_capabilities nor the ErrorClass enum.  I think this is worth
> fixing for 2.5 when introspection is brand new, so that if we later
> extend the ErrorClass enum or add future capability negotiation (and
> in particular if such additions get backported in downstream builds),
> a client will be able to use introspection to learn whether the new
> features are supported, regardless of the qemu version.
>
> Note that this also adds qmp_capabilities to 'query-commands'.
>
> Yes, this is borderline, and you may decide that it doesn't deserve
> to be called a bug and should wait for 2.6.

Before I discuss the error class proposal in more detail, a preliminary
remark: error classes are a leftover from the days of "rich" error
objects, and any new use of an error class other than
ERROR_CLASS_GENERIC_ERROR is immediately suspect.  I'm not saying that
we won't add such uses anymore, just that there's a significant bar to
overcome, which we haven't for quite some time now.

I think I could be persuaded that a client might be able to use
knowledge on what error classes a specific command can produce.  Of
course, presence of an error class doesn't tell what actual error
conditions map to it, i.e. the client still needs to make assumptions on
the meaning of error classes.  Humans make those, too, but humans can
read the contract in the comments.

The value of a global list of error classes seems even more dubious,
though.  Existence of an error class by itself guarantees nothing.  How
would a client use the information?  Assume that existence of a class
implies a certain command uses it in a certain way?  That's an even
bigger jump than above.

I guess using the presence or absence of an error class as a witness for
a certain feature or behavior could work.  Seems practical when the
written contract guarantees the connection between the two (de jure
connection), or the commit that introduces the feature or behavior also
adds or removes the error class (de facto connecton).  This applies both
to a global list of error classes and to per-command lists.

Example 1: MigrationExpected

Before commit 1e99814 "qmp: handle stop/cont in INMIGRATE state",
cont could fail with error MigrationExpected.  Libvirt dealt with it
by trying again.

Commit 1e99814 made cont just work, and dropped the error class.
The error class was never used for anything else.

Exposing a global list of error classes like your patch does would
permit detecting the presence of this commit.  However, detecting it
is pointless: to deal with its absence, you have to loop on
MigrationExpected anyway, and that code works just fine with and
without the commit.

Example 2: Unwanted DeviceNotFound dropped

During the 2.3 development cycle, a few unwanted uses of
DeviceNotFound crept in.  Commits 5b347c5, f3cf80e, 6ec46ad backed
them out before the release.

For the sake of argument, ignore the fact that these unwanted
DeviceNotFound never made it to a release, and if they had, we
would've left them in, because taking them out would've been an ABI
break.

A client could use a per-command error class list to detect them,
but not a global error class list.  But what could it do with the
information?  If DeviceNotFound is there, the client can handle it
specially, if not, it can't, and I can't see how knowing it would
make a difference.

Example 3 (hypothetical): New error class to support a client's need

Say we discover that a client wants to handle a certain error
specially, and we decide to make that possible by changing its error
class from GenericError to something specific to that error.
Hypothetical, because changing an error's error class is an ABI
break, and we normally don't do that.

The client could then refrain from using the command in certain ways
unless it uses the specific error class for this error.

Detecting that by finding the error class in the global list of
error classes works only if the error class is new, and only works
as long as it doesn't get used for other things that then get
backported without the original use.

Detecting it by finding the error class in the command's list of
error classes would be less brittle.

Example 4: Use per-command error list to catch unwanted error class

If we declare a command's errors, we can detect undeclared errors at
run time.  This should help catching unwanted ones early (see
example 2).

Having to declare error classes may facilitate proper review of new
uses of funky error classes.

None of these examples is a particularly convincing use case.  Can you
think of better ones?

Finally, what happens if error class introspection misses 2.5 and makes
a later version?

If we add a global error class list like this patch does, a client has
to

Re: [Qemu-devel] [PATCH v11 19/28] qapi: Change munging of CamelCase enum values

2015-11-12 Thread Markus Armbruster

Gerd Hoffmann  writes:

>   Hi,
>
>> > If desired, I can prepare an alternate patch that adds the dash to the
>> > qapi enum definition, to see what we think.
>> 
>> If Gerd is fine with the rename, let's do it.
>
> No need to do so I think ...
>
>> >> -[INPUT_BUTTON_WHEEL_UP] = SDL_BUTTON(SDL_BUTTON_WHEELUP),
>> >> -[INPUT_BUTTON_WHEEL_DOWN] = SDL_BUTTON(SDL_BUTTON_WHEELDOWN),
>> >> +[INPUT_BUTTON_WHEELUP] = SDL_BUTTON(SDL_BUTTON_WHEELUP),
>> >> +[INPUT_BUTTON_WHEELDOWN] = SDL_BUTTON(SDL_BUTTON_WHEELDOWN),
>> >
>> > Since SDL already spells the names without space, it's not the end of
>> > the world if we do likewise.
>> 
>> Good point.
>> 
>> Even if we adopt SDL's spelling WHEELUP and WHEELDOWN, I'd still prefer
>> to downcase the QAPI names for consistency with the rest of QAPI.
>
> This doesn't look too bad.  And even if x-input-send-event isn't
> official api I'd prefer to not break it for such a minor cosmetic issue.

To slow our slide into a morass of inconsistency, I intend to make
qapi.py enforce naming conventions.  Involves a whitelist for existing
violators we can't or won't fix.  Naturally, I'd prefer to keep the list
as short as possible.

I feel these ones can and should be fixed, and the best time to fix them
is when we drop the x- from the command.

But if you insist on keeping the current names then, I'll live with the
extra whitelist entries.

Re: [Qemu-devel] [PATCH V3 0/6] ide: avoid main-loop hang on CDROM/NFS failure

2015-11-12 Thread Fam Zheng

On Fri, 11/06 09:42, Peter Lieven wrote:
> This series aims at avoiding a hanging main-loop if a vserver has a
> CDROM image mounted from a NFS share and that NFS share goes down.
> Typical situation is that users mount an CDROM ISO to install something
> and then forget to eject that CDROM afterwards.
> As a consequence this mounted CD is able to bring down the
> whole vserver if the backend NFS share is unreachable. This is bad
> especially if the CDROM itself is not needed anymore at this point.

If a storage backend is lost, would QEMU hang on guest reboot with this patch?
If so, just for understanding the problem, what is the use case this series
addresses?

The code looks good to me apart from the two questions I left, and that I
didn't fully understand the elementary transfer part.

Thanks,

Fam

Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

2015-11-12 Thread Juan Quintela

"Li, Liang Z"  wrote:
>> On 12/11/2015 10:40, Li, Liang Z wrote:
>> > I migrate a 8GB RAM Idle guest,  I think most of it's pages are zero pages.
>> >
>> > I use your new code:
>> > -
>> >unsigned long *p = ...
>> >if (p[0] || p[1] || p[2] || p[3]
>> >|| memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
>> >return BUFFER_NOT_ZERO;
>> >else
>> >return BUFFER_ZERO;
>> > ---
>> > and the result is almost the same.  I also tried the check 8, 16 long
>> > data at the beginning, same result.
>> 
>> Interesting...  Well, all I can say is that applaud you for testing
>> your hypothesis
>> with the benchmark.
>> 
>> Probably the setup cost of memcmp is too high, because the testing loop is
>> already very optimized.
>> 
>> Please submit the AVX2 version if it helps!

I read the email in the wrong order.  Forget about my other email.

Sorry, Juan.


>
> Yes, the AVX2 version really helps. I have already submitted it, could
> you help to review it?
>
> I am curious about the original intention to add the SSE2 Intrinsics,
> is the same reason?
>
> I even suspect the VM may impact the 'memcmp()' performance, is it possible?
>
> Liang
>
>> Paolo

Re: [Qemu-devel] [RFC PATCH v4 01/11] exec: Remove cpu from cpus list during cpu_exec_exit()

2015-11-12 Thread Bharata B Rao

On Thu, Nov 12, 2015 at 10:56:50AM +0100, Andreas Färber wrote:
>  
> > I am hoping that I should be able to get CPU hotplug/unplug included
> > in QEMU-2.6 timeframe.
> 
> If there are preparatory patches ready for inclusion today, please point
> me to them urgently.

Thanks. I do have some generic changes, but I will push them during 2.6
development.

Regards,
Bharata.

1 2 3 4 >

1 - 100 of 343 matches

Mail list logo