date:20160119

Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-19 Thread alvise rigo

On Mon, Jan 18, 2016 at 8:09 PM, Alex Bennée  wrote:
>
>
> Alex Bennée  writes:
>
> > alvise rigo  writes:
> >
> >> On Fri, Jan 15, 2016 at 4:25 PM, Alex Bennée  
> >> wrote:
> >>>
> >>> alvise rigo  writes:
> >>>
> >>>> On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée  
> >>>> wrote:
> >>>>>
> >>>>> alvise rigo  writes:
> >>>>>
> >>>>
> >>>> Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
> >>>> exist solely in aarch64.
> >>>> These instructions are purely emulated now and can potentially write
> >>>> 128 bits of data in a non-atomic fashion.
> >>>
> >>> Sure, but I doubt they are the reason for this hang as the kernel
> >>> doesn't use them.
> >>
> >> The kernel does use them for __cmpxchg_double in
> >> arch/arm64/include/asm/atomic_ll_sc.h.
> >
> > I take it back, if I'd have grepped for "ldxp" instead of "stxp" I would
> > have seen it, sorry about that ;-)
> >
> >> In any case, the normal exclusive instructions are also emulated in
> >> target-arm/translate-a64.c.
> >
> > I'll check on them on Monday. I'd assumed all the stuff was in the
> > helpers as I scanned through and missed the translate.c changes Fred
> > made. Hopefully that will be the last hurdle.
>
> I'm pleased to confirm you were right. I hacked up Fred's helper based
> solution for aarch64 including the ldxp/stxp stuff. It's not
> semantically correct because:
>
>   result = atomic_bool_cmpxchg(p, oldval, (uint8_t)newval) &&
>            atomic_bool_cmpxchg(&p[1], oldval2, (uint8_t)newval2);
>
> won't leave the system as it was before if the race causes the second

Exactly.

> cmpxchg to fail. I assume this won't be a problem in the LL/SC world as
> we'll be able to serialise all accesses to the exclusive page properly?

In LL/SC the idea would be to dedicate one ARM-specific helper (in
target-arm/helper-a64.c) to handle this case.
Once the helper grabbed the excl mutex, we are allowed to make 128
bits or bigger accesses.
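
To make the idea concrete, here is a minimal sketch of a paired
compare-and-store done under a single lock. It is not the MTTCG code:
excl_mutex and paired_cmpxchg() are hypothetical names, and the scheme
is only atomic with respect to other accesses that take the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the exclusive-access mutex discussed above. */
static pthread_mutex_t excl_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Compare-and-store a pair of 64-bit words as one unit.
 * Both words are compared before either is written, so a failed
 * comparison leaves memory exactly as it was.
 */
static bool paired_cmpxchg(uint64_t p[2],
                           uint64_t old_lo, uint64_t old_hi,
                           uint64_t new_lo, uint64_t new_hi)
{
    bool ok;

    pthread_mutex_lock(&excl_mutex);
    ok = (p[0] == old_lo) && (p[1] == old_hi);
    if (ok) {
        p[0] = new_lo;
        p[1] = new_hi;
    }
    pthread_mutex_unlock(&excl_mutex);
    return ok;
}
```

Unlike two chained cmpxchg calls, there is no window in which the first
word has been updated and the second has not.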

>
>
> See:
>
> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r2
>
> >
> > In the meantime if I'm not booting Jessie I can get MTTCG aarch64
> > working with an initrd based rootfs. Once I've gone through those I'm
> > planning on giving it a good stress test with -fsanitize=thread.
>
> My first pass with this threw up a bunch of errors with the RCU code
> like this:
>
> WARNING: ThreadSanitizer: data race (pid=15387)
>   Atomic write of size 4 at 0x7f59efa51d48 by main thread (mutexes: write 
> M172):
> #0 __tsan_atomic32_fetch_add  (libtsan.so.0+0x00058e8f)
> #1 call_rcu1 util/rcu.c:288 (qemu-system-aarch64+0x006c3bd0)
> #2 address_space_update_topology 
> /home/alex/lsrc/qemu/qemu.git/memory.c:806 
> (qemu-system-aarch64+0x001ed9ca)
> #3 memory_region_transaction_commit 
> /home/alex/lsrc/qemu/qemu.git/memory.c:842 
> (qemu-system-aarch64+0x001ed9ca)
> #4 address_space_init /home/alex/lsrc/qemu/qemu.git/memory.c:2136 
> (qemu-system-aarch64+0x001f1fa6)
> #5 memory_map_init /home/alex/lsrc/qemu/qemu.git/exec.c:2344 
> (qemu-system-aarch64+0x00196607)
> #6 cpu_exec_init_all /home/alex/lsrc/qemu/qemu.git/exec.c:2795 
> (qemu-system-aarch64+0x00196607)
> #7 main /home/alex/lsrc/qemu/qemu.git/vl.c:4083 
> (qemu-system-aarch64+0x001829aa)
>
>   Previous read of size 4 at 0x7f59efa51d48 by thread T1:
> #0 call_rcu_thread util/rcu.c:242 (qemu-system-aarch64+0x006c3d92)
> #1   (libtsan.so.0+0x000235f9)
>
>   Location is global 'rcu_call_count' of size 4 at 0x7f59efa51d48 
> (qemu-system-aarch64+0x010f1d48)
>
>   Mutex M172 (0x7f59ef6254e0) created at:
> #0 pthread_mutex_init  (libtsan.so.0+0x00027ee5)
> #1 qemu_mutex_init util/qemu-thread-posix.c:55 
> (qemu-system-aarch64+0x006ad747)
> #2 qemu_init_cpu_loop /home/alex/lsrc/qemu/qemu.git/cpus.c:890 
> (qemu-system-aarch64+0x001d4166)
> #3 main /home/alex/lsrc/qemu/qemu.git/vl.c:3005 
> (qemu-system-aarch64+0x001820ac)
>
>   Thread T1 (tid=15389, running) created by main thread at:
> #0 pthread_create  (libtsan.so.0+0x000274c7)
> #1 qemu_thread_create util/qemu-thread-posix.c:525 
> (qemu-system-aarch64+0x006ae04d)
> #2 rcu_init_complete util/rcu.c:320 (qemu-system-aarch64+0x006c3d52)
> #3 rcu_init util/rcu.c:351 (qemu-system-aarch64+0x0018e288)
> #4 __libc_csu_init  (qemu-system-aarch64+0x006c63ec)
>
>
> but I don't know how many are false positives so I'm going to look in more
> detail now.

Umm...I'm not very familiar with the sanitize option, I'll let you
follow this lead :).

alvise

>
> 
>
> --
> Alex Bennée

Re: [Qemu-devel] [PATCH COLO-Frame v13 34/39] net/filter-buffer: Add default filter-buffer for each netdev

2016-01-19 Thread Hailiang Zhang


Hi Jason,

Thanks for your review.

On 2016/1/19 11:19, Jason Wang wrote:



On 12/29/2015 03:09 PM, zhanghailiang wrote:

We add a default filter-buffer to each netdev (except vhost-net),
which will be used by COLO or Micro-checkpoint to buffer the VM's packets.
The name of the default filter-buffer is 'nop'.
By default, the default filter-buffer does not buffer any packets,
so it has no side effects on the netdev.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
Cc: Yang Hongyang 


This patch did three things:

1) the ability to enable or disable a netfilter
2) the ability to add a default filter
3) default filter attaching for filter-buffer

Better to split them into separate small patches.

And several questions:

For 1), I'm not sure this is really needed; we can in fact disable a
filter by removing it.


If we do it this way, do we also need to _enable_ the buffer filter by
adding it dynamically instead of attaching the default filter,
just like what we did in V10?
(In that series, you thought having a default filter might be better.
The main reason for that was to support
hot-add NIC. Since we don't support hot-add NIC during COLO,
it will be OK to add the default filter dynamically.)


For 2), Instead of a specific code just for filter buffer, I think we
need a generic method for an arbitrary filter to be used as default.


Good idea.


And if we can achieve 2), 3) is not needed any more.


---
v12:
- Skip vhost-net when add default filter
- Don't go through filter layer if the filter is disabled.
v11:
- New patch
---
  include/net/filter.h | 10 +++
  net/filter-buffer.c  | 82 
  net/filter.c |  6 +++-
  net/net.c| 12 
  4 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/include/net/filter.h b/include/net/filter.h
index 2deda36..40aa38c 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -56,6 +56,8 @@ struct NetFilterState {
  NetClientState *netdev;
  NetFilterDirection direction;
  char info_str[256];
+bool is_default;
+bool enabled;
  QTAILQ_ENTRY(NetFilterState) next;
  };

@@ -74,4 +76,12 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender,
  int iovcnt,
  void *opaque);

+static inline bool qemu_need_skip_netfilter(NetFilterState *nf)
+{
+return nf->enabled ? false : true;
+}
+
+void netdev_add_default_filter_buffer(const char *netdev_id,
+  NetFilterDirection direction,
+  Error **errp);
  #endif /* QEMU_NET_FILTER_H */
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 57be149..9cf3544 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -14,6 +14,13 @@
  #include "qapi/qmp/qerror.h"
  #include "qapi-visit.h"
  #include "qom/object.h"
+#include "net/net.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp-output-visitor.h"
+#include "qapi/qmp-input-visitor.h"
+#include "monitor/monitor.h"
+#include "qmp-commands.h"
+#include "net/vhost_net.h"

  #define TYPE_FILTER_BUFFER "filter-buffer"

@@ -102,6 +109,7 @@ static void filter_buffer_cleanup(NetFilterState *nf)
  static void filter_buffer_setup(NetFilterState *nf, Error **errp)
  {
  FilterBufferState *s = FILTER_BUFFER(nf);
+char *path = object_get_canonical_path_component(OBJECT(nf));

  /*
   * We may want to accept zero interval when VM FT solutions like MC
@@ -114,6 +122,14 @@ static void filter_buffer_setup(NetFilterState *nf, Error **errp)
  }

  s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
+nf->is_default = !strcmp(path, "nop");
+/*
+* For the default buffer filter, it will be disabled by default,
+* So it will not buffer any packets.
+*/
+if (nf->is_default) {
+nf->enabled = false;
+}
  if (s->interval) {
  timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL,
filter_buffer_release_timer, nf);
@@ -163,6 +179,72 @@ out:
  error_propagate(errp, local_err);
  }

+/*
+* This will be used by COLO or MC FT, for which they will need
+* to buffer the packets of VM's net devices, Here we add a default
+* buffer filter for each netdev. The name of default buffer filter is
+* 'nop'
+*/
+void netdev_add_default_filter_buffer(const char *netdev_id,
+  NetFilterDirection direction,
+  Error **errp)
+{


Need a more generic way to add an arbitrary filter as default. E.g
during netdev init, query if there's a default and do the initialization
there.



We call it in net_client_init1(); I couldn't find a better place to call it.
What's your suggestion?


+QmpOutputVisitor *qov;
+QmpInputVisitor *qiv;
+Visitor *ov, *iv;
+QObject *obj = NULL;
+QDict *qdict;
+void *dummy

Re: [Qemu-devel] [PATCH COLO-Frame v13 35/39] filter-buffer: Accept zero interval

2016-01-19 Thread Hailiang Zhang


On 2016/1/19 11:21, Jason Wang wrote:



On 12/29/2015 03:09 PM, zhanghailiang wrote:

For default buffer filter, its 'interval' value is zero,
so here we should accept zero interval.

Signed-off-by: zhanghailiang 
Reviewed-by: Yang Hongyang 
Cc: Jason Wang 
---
v12:
- Add Reviewed-by tag
v11:
- Add comment
v10:
- new patch
---
  net/filter-buffer.c | 10 --
  1 file changed, 10 deletions(-)

diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 9cf3544..8abac94 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -111,16 +111,6 @@ static void filter_buffer_setup(NetFilterState *nf, Error **errp)
  FilterBufferState *s = FILTER_BUFFER(nf);
  char *path = object_get_canonical_path_component(OBJECT(nf));

-/*
- * We may want to accept zero interval when VM FT solutions like MC
- * or COLO use this filter to release packets on demand.
- */


Better to move this into the commit log, where it gives the rationale
for the patch.



OK, I will fix it, thanks.


-if (!s->interval) {
-error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval",
-   "a non-zero interval");
-return;
-}
-
  s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
  nf->is_default = !strcmp(path, "nop");
  /*




Re: [Qemu-devel] [PATCH COLO-Frame v13 36/39] filter-buffer: Introduce a helper function to enable/disable default filter

2016-01-19 Thread Hailiang Zhang


On 2016/1/19 11:35, Jason Wang wrote:



On 12/29/2015 03:09 PM, zhanghailiang wrote:

The default buffer filter doesn't buffer packets by default,
but we need to buffer packets for COLO or Micro-checkpoint.
Here we add a helper function to enable/disable the filter's buffer
capability.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
Cc: Yang Hongyang 
---
v12:
- Rename the helper function to qemu_set_default_filters_status()
v11:
- New patch
---
  include/net/filter.h |  1 +
  include/net/net.h|  4 
  net/filter-buffer.c  | 19 +++
  net/net.c| 29 +
  4 files changed, 53 insertions(+)

diff --git a/include/net/filter.h b/include/net/filter.h
index 40aa38c..08aa604 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -84,4 +84,5 @@ static inline bool qemu_need_skip_netfilter(NetFilterState *nf)
  void netdev_add_default_filter_buffer(const char *netdev_id,
NetFilterDirection direction,
Error **errp);
+void qemu_set_default_filters_status(bool enable);
  #endif /* QEMU_NET_FILTER_H */
diff --git a/include/net/net.h b/include/net/net.h
index 7af3e15..5c65c45 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -125,6 +125,10 @@ NetClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
const char *client_str);
  typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
  void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
+typedef void (*qemu_netfilter_foreach)(NetFilterState *nf, void *opaque,
+   Error **errp);
+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+Error **errp);
  int qemu_can_send_packet(NetClientState *nc);
  ssize_t qemu_sendv_packet(NetClientState *nc, const struct iovec *iov,
int iovcnt);
diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 8abac94..90a50cc 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -169,6 +169,25 @@ out:
  error_propagate(errp, local_err);
  }

+static void set_default_filter_status(NetFilterState *nf,
+  void *opaque,
+  Error **errp)
+{
+if (!strcmp(object_get_typename(OBJECT(nf)), TYPE_FILTER_BUFFER)) {
+bool *status = opaque;
+
+if (nf->is_default) {
+nf->enabled = *status;
+}
+}
+}
+
+void qemu_set_default_filters_status(bool enable)
+{
+qemu_foreach_netfilter(set_default_filter_status,
+   &enable, NULL);
+}


The name of the function sounds like a generic helper, but in fact it passes a
type-specific function. Considering that 'enabled' is a generic property of
netfilter, we want more generic code here.



Got it, I will fix it.
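
One possible shape for a more generic version, as a rough sketch only:
the Filter type and the function names below are simplified,
hypothetical stand-ins and not the QEMU netfilter API. The callback
looks only at generic fields, so it does not need to check the filter
type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for NetFilterState. */
typedef struct Filter {
    bool is_default;
    bool enabled;
    struct Filter *next;
} Filter;

typedef void (*filter_foreach_fn)(Filter *f, void *opaque);

/* Walk every filter in the list and apply func to it. */
static void foreach_filter(Filter *head, filter_foreach_fn func, void *opaque)
{
    for (Filter *f = head; f != NULL; f = f->next) {
        func(f, opaque);
    }
}

/* Type-agnostic callback: touches only generic fields. */
static void set_default_status(Filter *f, void *opaque)
{
    const bool *enable = opaque;

    if (f->is_default) {
        f->enabled = *enable;
    }
}

/* Enable or disable every default filter, whatever its type. */
static void set_default_filters_status(Filter *head, bool enable)
{
    foreach_filter(head, set_default_status, &enable);
}
```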


+
  /*
  * This will be used by COLO or MC FT, for which they will need
  * to buffer the packets of VM's net devices, Here we add a default
diff --git a/net/net.c b/net/net.c
index fd53cfc..30946c5 100644
--- a/net/net.c
+++ b/net/net.c
@@ -259,6 +259,35 @@ static char *assign_name(NetClientState *nc1, const char *model)
  return g_strdup_printf("%s.%d", model, id);
  }

+void qemu_foreach_netfilter(qemu_netfilter_foreach func, void *opaque,
+Error **errp)
+{
+NetClientState *nc;
+NetFilterState *nf;
+
+QTAILQ_FOREACH(nc, &net_clients, next) {
+if (nc->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
+continue;
+}
+/* FIXME: Not support multiqueue */
+if (nc->queue_index > 1) {
+error_setg(errp, "%s: multiqueue is not supported", __func__);
+return;
+}


Do we really need this? Looks like netfilter_complete() has already
checked this.



Yes, this is useless, I will remove it.


+QTAILQ_FOREACH(nf, &nc->filters, next) {
+if (func) {
+Error *local_err = NULL;
+
+func(nf, opaque, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+}
+}
+}
+}


Need a separate patch for this helper.



OK, I will split it out in the next version, thanks.


+
  static void qemu_net_client_destructor(NetClientState *nc)
  {
  g_free(nc);




Re: [Qemu-devel] [PATCH 0/3] clean-includes script to add osdep.h to everything

2016-01-19 Thread Peter Maydell

On 19 January 2016 at 07:27, Markus Armbruster  wrote:
> Peter Maydell  writes:
>
>> On 11 January 2016 at 15:19, Daniel P. Berrange  wrote:
>>> I think even guest-agent code & tests could include it in order to
>>> get clean includes, even if they don't use any of the QEMU functions
>>> defined in it. So I think it's simplest to just say every .c file must
>>> use it and leave it at that.
>>
>> OK, let's assume that works.
>
> If it doesn't, we need a header with just configuration results that is
> included in every .c file first.  Just like config.h should be when
> using autoconf.

An example of the kind of code that I wasn't sure about is
the stuff in tests/tcg/mips/ -- this currently doesn't
include any QEMU headers that I can see and I don't think
they're even on the include path.

In any case I'll do the obvious stuff first and circle back
to the oddball standalone sources later.

thanks
-- PMM

Re: [Qemu-devel] [RE-RESEND PATCH] pci: Adjust PCI config limit based on bus topology

2016-01-19 Thread Marcel Apfelbaum


On 01/19/2016 01:06 AM, Alex Williamson wrote:

A conventional PCI bus does not support config space accesses above
the standard 256 byte configuration space.  PCIe-to-PCI bridges are
not permitted to forward transactions if the extended register address
field is non-zero and must handle it as an unsupported request (PCIe
bridge spec rev 1.0, 4.1.3, 4.1.4).  Therefore, we should not support
extended config space if there is a conventional bus anywhere on the
path to a device.

Signed-off-by: Alex Williamson 
---
Previous postings:
https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05384.html
https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg02422.html

  hw/pci/pci_host.c |   26 ++
  1 file changed, 26 insertions(+)

diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c
index 49f59a5..3a3e294 100644
--- a/hw/pci/pci_host.c
+++ b/hw/pci/pci_host.c
@@ -19,6 +19,7 @@
   */

  #include "hw/pci/pci.h"
+#include "hw/pci/pci_bridge.h"
  #include "hw/pci/pci_host.h"
  #include "hw/pci/pci_bus.h"
  #include "trace.h"
@@ -49,9 +50,29 @@ static inline PCIDevice *pci_dev_find_by_addr(PCIBus *bus, uint32_t addr)
  return pci_find_device(bus, bus_num, devfn);
  }

+static void pci_adjust_config_limit(PCIBus *bus, uint32_t *limit)
+{
+if (*limit > PCI_CONFIG_SPACE_SIZE) {
+if (!pci_bus_is_express(bus)) {
+*limit = PCI_CONFIG_SPACE_SIZE;
+return;
+}
+
+if (!pci_bus_is_root(bus)) {
+PCIDevice *bridge = pci_bridge_get_device(bus);
+pci_adjust_config_limit(bridge->bus, limit);
+}
+}
+}
+
  void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
uint32_t limit, uint32_t val, uint32_t len)
  {
+pci_adjust_config_limit(pci_dev->bus, &limit);
+if (limit <= addr) {
+return;
+}
+
  assert(len <= 4);
  /* non-zero functions are only exposed when function 0 is present,
   * allowing direct removal of unexposed functions.
@@ -70,6 +91,11 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, uint32_t addr,
  {
  uint32_t ret;

+pci_adjust_config_limit(pci_dev->bus, &limit);
+if (limit <= addr) {
+return ~0x0;
+}
+
  assert(len <= 4);
  /* non-zero functions are only exposed when function 0 is present,
   * allowing direct removal of unexposed functions.




Quick question: could we check the limit as part of pci_config_size?
Anyway, it looks OK to me.

Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel
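
For readers following the thread, the topology rule in the commit
message can be sketched in isolation as below. The Bus type and
effective_config_limit() are hypothetical simplifications, not the QEMU
PCI API; the sizes are the usual 256-byte conventional and 4096-byte
PCIe config spaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CONV_CONFIG_SIZE 256u    /* conventional PCI config space */
#define PCIE_CONFIG_SIZE 4096u   /* PCIe extended config space */

/* Hypothetical, simplified bus node. */
typedef struct Bus {
    bool is_express;
    struct Bus *parent;   /* NULL for the root bus */
} Bus;

/*
 * Walk from the device's bus up to the root. A single conventional
 * bus anywhere on the path clamps the limit to 256 bytes.
 */
static uint32_t effective_config_limit(const Bus *bus, uint32_t limit)
{
    for (; bus != NULL && limit > CONV_CONFIG_SIZE; bus = bus->parent) {
        if (!bus->is_express) {
            limit = CONV_CONFIG_SIZE;
        }
    }
    return limit;
}
```

This is the iterative form of the recursion in the patch; a limit that
is already at or below 256 bytes is returned unchanged.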

Re: [Qemu-devel] [PATCH v7] spec: add qcow2 bitmaps extension specification

2016-01-19 Thread Vladimir Sementsov-Ogievskiy


On 19.01.2016 00:16, Eric Blake wrote:

preserving semantics of those extra_data bytes).  We
have enough room for future extension, and that's good e


Ok, so, what should go to the spec? Current wording is ok? Just delete 
"Type-specific":


+
+20 - 23:extra_data_size
+Size of type-specific extra data.
+
+For now, as no extra data is defined, extra_data_size is
+reserved and must be zero.
+
+variable:   Extra data for the bitmap.
+




--
Best regards,
Vladimir
* now, @virtuozzo.com instead of @parallels.com. Sorry for this inconvenience.
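
As an illustration of how a reader of the format might enforce the
wording above, here is a small sketch. It assumes the field is stored
big-endian like other qcow2 fields; check_extra_data_size() and the
buffer layout are hypothetical and only the 20 - 23 offset comes from
the draft text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXTRA_DATA_SIZE_OFFSET 20  /* bytes 20 - 23, per the quoted draft */

/* Load a 32-bit big-endian value from a byte buffer. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Returns 0 if the entry is acceptable, -1 otherwise.
 * While no extra data is defined, extra_data_size must be zero.
 */
static int check_extra_data_size(const uint8_t *entry, size_t len)
{
    if (len < EXTRA_DATA_SIZE_OFFSET + 4) {
        return -1;   /* entry too short to contain the field */
    }
    return load_be32(entry + EXTRA_DATA_SIZE_OFFSET) == 0 ? 0 : -1;
}
```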

Re: [Qemu-devel] [RFC PATCH v2 00/10] Introduce Intel 82574 GbE Controller Emulation (e1000e)

2016-01-19 Thread Dmitry Fleytman


> On 19 Jan 2016, at 05:48 AM, Jason Wang  wrote:
> 
> 
> 
> On 01/19/2016 01:35 AM, Leonid Bloch wrote:
>> Hello All,
>> 
>> This series is the latest code of the e1000e device emulation being 
>> developed.
>> 
>> Changes since v1:
>> 
>> 1. Added support for all the device features:
>>  - Interrupt moderation.
>>  - RSS.
>>  - Multiqueue.
>> 2. Simulated exact PCI/PCIe configuration space layout.
>> 3. Made fixes needed to pass Microsoft's HW certification tests (HCK).
>> 
>> This series is still an RFC, because the following tasks are not done yet:
>> 
>> 1. See which code can be shared between this device and the existing e1000 
>> device.
>> 2. Rebase patches to the latest master (current base is v2.3.0).
>> 
>> Please share your thoughts,
>> Thanks, Dmitry.
> 
> Hi:
> 
> Do you have a public git tree for easier reviewing?


Hi,

Yes, see here: https://github.com/daynix/qemu-e1000e/commits/e1000e-v2 

Branch e1000e-v2.

~Dmitry

> 
> Thanks
> 
>> 
>> ===
>> 
>> Hello qemu-devel,
>> 
>> This patch series is an RFC for the new networking device emulation
>> we're developing for QEMU.
>> 
>> This new device emulates the Intel 82574 GbE Controller and works
>> with unmodified Intel e1000e drivers from the Linux/Windows kernels.
>> 
>> The status of the current series is "Functional Device Ready, work
>> on Extended Features in Progress".
>> 
>> More precisely, these patches represent a functional device, which
>> is recognized by the standard Intel drivers, and is able to transfer
>> TX/RX packets with CSO/TSO offloads, according to the spec.
>> 
>> Extended features not supported yet (work in progress):
>>  1. TX/RX Interrupt moderation mechanisms
>>  2. RSS
>>  3. Full-featured multi-queue (use of multiqueued network backend)
>> 
>> Also, there will be some code refactoring and performance
>> optimization efforts.
>> 
>> This series was tested on Linux (Fedora 22) and Windows (2012R2)
>> guests, using Iperf, with TX/RX and TCP/UDP streams, and various
>> packet sizes.
>> 
>> More thorough testing, including data streams with different MTU
>> sizes, and Microsoft Certification (HLK) tests, are pending missing
>> features' development.
>> 
>> See commit messages (esp. "net: Introduce e1000e device emulation")
>> for more information about the development approaches and the
>> architecture options chosen for this device.
>> 
>> This series is based upon v2.3.0 tag of the upstream QEMU repository,
>> and it will be rebased to latest before the final submission.
>> 
>> Please share your thoughts - any feedback is highly welcomed :)
>> 
>> Best Regards,
>> Dmitry Fleytman.
>> 
>> Dmitry Fleytman (10):
>>  msix: make msix_clr_pending() visible for clients
>>  pci: Introduce function for PCI PM capability creation
>>  pcie: Add support for PCIe CAP v1
>>  pcie: Introduce function for DSN capability creation
>>  net: Introduce Toeplitz hash calculator
>>  net: Add macros for ETH address tracing
>>  net_pkt: Name vmxnet3 packet abstractions more generic
>>  net_pkt: Extend packet abstraction as required by e1000e functionality
>>  e1000_regs: Add definitions for Intel 82574-specific bits
>>  net: Introduce e1000e device emulation
>> 
>> MAINTAINERS|   14 +
>> default-configs/pci.mak|1 +
>> hw/net/Makefile.objs   |5 +-
>> hw/net/e1000_regs.h|  353 -
>> hw/net/e1000e.c|  700 +
>> hw/net/e1000e_core.c   | 3453 
>> hw/net/e1000e_core.h   |  230 +++
>> hw/net/net_rx_pkt.c|  536 +++
>> hw/net/net_rx_pkt.h|  353 +
>> hw/net/net_tx_pkt.c|  627 
>> hw/net/net_tx_pkt.h|  191 +++
>> hw/net/vmxnet3.c   |   80 +-
>> hw/net/vmxnet_rx_pkt.c |  187 ---
>> hw/net/vmxnet_rx_pkt.h |  174 ---
>> hw/net/vmxnet_tx_pkt.c |  567 
>> hw/net/vmxnet_tx_pkt.h |  148 --
>> hw/pci/msix.c  |2 +-
>> hw/pci/pci.c   |   21 +
>> hw/pci/pcie.c  |   96 +-
>> include/hw/pci/msix.h  |1 +
>> include/hw/pci/pci.h   |2 +
>> include/hw/pci/pci_regs.h  |4 +
>> include/hw/pci/pcie.h  |5 +
>> include/hw/pci/pcie_regs.h |8 +-
>> include/net/checksum.h |   49 +-
>> include/net/eth.h  |  161 ++-
>> include/net/net.h  |5 +
>> net/checksum.c |7 +-
>> net/eth.c  |  410 +-
>> tests/Makefile |4 +-
>> trace-events   |  195 +++
>> 31 files changed, 7350 insertions(+), 1239 deletions(-)
>> create mode 100644 hw/net/e1000e.c
>> create mode 100644 hw/net/e1000e_core.c
>> create mode 100644 hw/net/e1000e_core.h
>> create mode 100644 hw/net/net_rx_pkt.c
>> create mode 100644 hw/net/net_rx_pkt.h
>> create mode 100644 hw/net/net_tx_pkt.c
>> create mode 100644 hw/net/net_tx_pkt.h
>> delete mode 100644 hw/net/vmxnet_rx_pkt.c
>>

Re: [Qemu-devel] [PATCH v1 1/1] arm_gic: Update ID registers based on revision

2016-01-19 Thread Peter Maydell

On 19 January 2016 at 01:33, Alistair Francis
 wrote:
> Update the GIC ID registers (registers above 0xfe0) based on the GIC
> revision instead of using the same values for all GIC implementations.
>
> Signed-off-by: Alistair Francis 
> Tested-by: Sören Brinkmann 
> ---
>
>  hw/intc/arm_gic.c | 29 ++---
>  1 file changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
> index 13e297d..f6bfa53 100644
> --- a/hw/intc/arm_gic.c
> +++ b/hw/intc/arm_gic.c
> @@ -31,8 +31,16 @@ do { fprintf(stderr, "arm_gic: " fmt , ## __VA_ARGS__); } while (0)
>  #define DPRINTF(fmt, ...) do {} while(0)
>  #endif
>
> -static const uint8_t gic_id[] = {
> -0x90, 0x13, 0x04, 0x00, 0x0d, 0xf0, 0x05, 0xb1
> +static const uint8_t gic_id_11mpcore[] = {
> +0x00, 0x00, 0x00, 0x00, 0x90, 0x13, 0x04, 0x00, 0x0d, 0xf0, 0x05, 0xb1
> +};
> +
> +static const uint8_t gic_id_gicv1[] = {
> +0x04, 0x00, 0x00, 0x00, 0x90, 0xb3, 0x1b, 0x00, 0x0d, 0xf0, 0x05, 0xb1
> +};
> +
> +static const uint8_t gic_id_gicv2[] = {
> +0x04, 0x00, 0x00, 0x00, 0x90, 0xb4, 0x2b, 0x00, 0x0d, 0xf0, 0x05, 0xb1
>  };
>
>  static inline int gic_get_current_cpu(GICState *s)
> @@ -689,7 +697,22 @@ static uint32_t gic_dist_readb(void *opaque, hwaddr offset, MemTxAttrs attrs)
>  if (offset & 3) {
>  res = 0;
>  } else {
> -res = gic_id[(offset - 0xfe0) >> 2];
> +switch (s->revision) {
> +case REV_11MPCORE:
> +res = gic_id_11mpcore[(offset - 0xfe0) >> 2];
> +break;
> +case 1:
> +res = gic_id_gicv1[(offset - 0xfe0) >> 2];
> +break;
> +case 2:
> +res = gic_id_gicv2[(offset - 0xfe0) >> 2];
> +break;
> +case REV_NVIC:
> +/* Shouldn't be able to get here */
> +abort();
> +default:
> +res = 0;
> +}
>  }
>  }
>  return res;

You've expanded the arrays to include the fd0...fdc values
(which is right) but the logic also needs to change to
make offset == 0xfd0..0xfdf go through this code path and
also to use the new indexing into the array.

thanks
-- PMM
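
A rough sketch of the indexing this implies, for illustration only:
gic_id_readb() is a hypothetical standalone helper, not the actual
gic_dist_readb() change, and it uses just the GICv2 table from the
patch. With the arrays now starting at 0xfd0, both the range check and
the index base have to move from 0xfe0 to 0xfd0.

```c
#include <assert.h>
#include <stdint.h>

/* 12 entries covering the word-aligned offsets 0xfd0..0xffc. */
static const uint8_t gic_id_gicv2[] = {
    0x04, 0x00, 0x00, 0x00, 0x90, 0xb4, 0x2b, 0x00, 0x0d, 0xf0, 0x05, 0xb1
};

/*
 * Byte read of the ID register block. Returns 0 outside the block
 * and for offsets that are not word-aligned.
 */
static uint32_t gic_id_readb(uint32_t offset)
{
    if (offset < 0xfd0 || offset > 0xfff || (offset & 3)) {
        return 0;
    }
    return gic_id_gicv2[(offset - 0xfd0) >> 2];
}
```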

Re: [Qemu-devel] [PATCH 1/1] nvdimm: disable balloon

2016-01-19 Thread Xiao Guangrong




On 01/18/2016 07:42 PM, Denis V. Lunev wrote:

From: Vladimir Sementsov-Ogievskiy 

NVDIMM is, for now, planned to be used as a backing store for a DAX filesystem
in the guest, and thus this memory is excluded from guest memory management
and LRUs.

In this case, libvirt running QEMU with a configured balloon almost
immediately inflates the balloon and effectively kills the guest, as
QEMU counts nvdimm as part of the RAM.



It looks good to me.

However, although it is not related to this patch, why not use the 'total memory'
reported by the guest instead? It is more precise, as a) BIOS and other components
will occupy available memory and b) the guest may limit the memory size it can use...


Counting dimm devices as part of the ram for ballooning was started from
patch
  virtio-balloon: Fix balloon not working correctly when hotplug memory

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Denis V. Lunev 
CC: Stefan Hajnoczi 
CC: Xiao Guangrong 
CC: "Michael S. Tsirkin" 
CC: Igor Mammedov 
CC: Eric Blake 
CC: Markus Armbruster 
---
The patch is submitted to start a discussion. It may be technically correct,
but for us the situation is a bit shady.

  hw/mem/nvdimm.c  | 4 
  hw/mem/pc-dimm.c | 7 ++-
  include/hw/mem/pc-dimm.h | 1 +
  qapi-schema.json | 5 -
  4 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 4fd397f..4f4d29a 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -27,9 +27,13 @@
  static void nvdimm_class_init(ObjectClass *oc, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(oc);
+PCDIMMDeviceClass *ddc = PC_DIMM_CLASS(oc);

  /* nvdimm hotplug has not been supported yet. */
  dc->hotpluggable = false;
+
+/* ballooning is not supported */
+ddc->in_ram = false;
  }

  static TypeInfo nvdimm_info = {
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index d5cdab2..e0f869d 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -164,6 +164,7 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque)
  MemoryDeviceInfo *info = g_new0(MemoryDeviceInfo, 1);
  PCDIMMDeviceInfo *di = g_new0(PCDIMMDeviceInfo, 1);
  DeviceClass *dc = DEVICE_GET_CLASS(obj);
+PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(obj);
  PCDIMMDevice *dimm = PC_DIMM(obj);

  if (dev->id) {
@@ -172,6 +173,7 @@ int qmp_pc_dimm_device_list(Object *obj, void *opaque)
  }
  di->hotplugged = dev->hotplugged;
  di->hotpluggable = dc->hotpluggable;
+di->in_ram = ddc->in_ram;
  di->addr = dimm->addr;
  di->slot = dimm->slot;
  di->node = dimm->node;
@@ -205,7 +207,9 @@ ram_addr_t get_current_ram_size(void)
  if (value) {
  switch (value->type) {
  case MEMORY_DEVICE_INFO_KIND_DIMM:
-size += value->u.dimm->size;
+if (value->u.dimm->in_ram) {
+size += value->u.dimm->size;
+}


Can we use "object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)" to filter out
NVDIMM device?


  break;
  default:
  break;
@@ -444,6 +448,7 @@ static void pc_dimm_class_init(ObjectClass *oc, void *data)
  dc->props = pc_dimm_properties;
  dc->desc = "DIMM memory module";

+ddc->in_ram = true;
  ddc->get_memory_region = pc_dimm_get_memory_region;
  }

diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index d83bf30..3bcb505 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -65,6 +65,7 @@ typedef struct PCDIMMDevice {
  typedef struct PCDIMMDeviceClass {
  /* private */
  DeviceClass parent_class;
+bool in_ram;

  /* public */
  MemoryRegion *(*get_memory_region)(PCDIMMDevice *dimm);
diff --git a/qapi-schema.json b/qapi-schema.json
index b3038b2..613b4d5 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3922,6 +3922,8 @@
  #
  # @hotpluggable: true if device if could be added/removed while machine is running
  #
+# @in-ram: true if device if should be counted in current ram size (since 2.6)
+#
  # Since: 2.1
  ##
  { 'struct': 'PCDIMMDeviceInfo',
@@ -3932,7 +3934,8 @@
  'node': 'int',
  'memdev': 'str',
  'hotplugged': 'bool',
-'hotpluggable': 'bool'
+'hotpluggable': 'bool',
+'in-ram': 'bool'


What is it used for?

Re: [Qemu-devel] [PATCH v8 00/35] qapi visitor cleanups (post-introspection cleanups subset E)

2016-01-19 Thread Markus Armbruster

Eric Blake  writes:

> Based on qemu.git master. Pending prerequisites:
> + Not a strong dependency, but for qapi-tests to consistently pass,
> I needed a race fixed:
> https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg01827.html
>
> Also available as a tag at this location:
> git fetch git://repo.or.cz/qemu/ericb.git qapi-cleanupv8e
>
> and will soon be part of my branch with the rest of the v5 series, at:
> http://repo.or.cz/qemu/ericb.git/shortlog/refs/heads/qapi
>
> v8 notes:
> Four new patches (13-16/35), plus rebasing on top of them, so that
> the code base now consistently passes a 'v, name' pair anywhere a
> visitor needs a name, rather than putting other arguments in between
> the pair. I got to have fun with Coccinelle :)  Also fix a bug in my
> changes to visit_next_list() (v7 29/31), so that 'make check' and
> qemu-iotests now pass at all points in the series.
>
> The parameter ordering changes have the potential to be a rebase
> magnet, so I'm hoping this series can go in relatively soon after
> Markus returns from break.
>
> I made good on my threat in v7 of writing a qapi-to-JSON output
> visitor, but that will remain a separate series based on this one
> (the only posting of that series so far now needs rebasing:
> https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg01760.html)
>
> 001/35:[] [--] 'qobject: Document more shortcomings in our number 
> handling'
> 002/35:[] [--] 'qapi: Avoid use of misnamed DO_UPCAST()'
> 003/35:[] [--] 'qapi: Drop dead dealloc visitor variable'
> 004/35:[] [--] 'hmp: Improve use of qapi visitor'
> 005/35:[] [--] 'vl: Improve use of qapi visitor'
> 006/35:[] [--] 'balloon: Improve use of qapi visitor'
> 007/35:[] [--] 'qapi: Improve generated event use of qapi visitor'
> 008/35:[] [--] 'qapi: Track all failures between visit_start/stop'
> 009/35:[] [--] 'qapi: Prefer type_int64 over type_int in visitors'
> 010/35:[] [--] 'qapi: Make all visitors supply uint64 callbacks'
> 011/35:[] [--] 'qapi: Consolidate visitor small integer callbacks'
> 012/35:[] [--] 'qapi: Don't cast Enum* to int*'
> 013/35:[down] 'qom: Use typedef for Visitor'

Applies cleanly until here.

> 014/35:[down] 'qapi: Swap visit_* arguments for consistent 'name' placement'

Doesn't apply.

You can either spin v9 addressing Marc-André's review, or you can rebase
v8 without changes somewhere I can pull, so I can review it properly.

[...]

Re: [Qemu-devel] [PATCH v16 00/14] vfio-pci: pass the aer error to guest

2016-01-19 Thread Chen Fan



On 01/17/2016 02:34 AM, Michael S. Tsirkin wrote:

On Tue, Jan 12, 2016 at 10:43:01AM +0800, Cao jin wrote:

From: Chen Fan 

Currently, when QEMU receives an error from the host AER report for a
vfio PCI passthrough device, it just terminates the guest. Usually the
user wants to know what error occurred rather than have the guest
stopped, so this series adds AER capability support for vfio devices,
passes the error to the guest, and lets the guest driver recover
from the error.

I would like to see a version of this patchset that doesn't
depend on pci core changes.
I think that if you make this simplifying assumption:

- all devices on same bus in guest are on same bus in host

then you can handle both reset and hotplug simply in function 0
since it will belong to vfio.

So we can have a version without pci core changes that simply assumes
this, and things will just work.


Now, if we wanted to enforce this limitation, I think the
cleanest way would be to add a callback in struct PCIDevice:

bool is_valid_function(PCIDevice *newfunction)

and call it as each function is added.
This way aer function can validate that each function
added shares the same bus.
And this way issues will be detected directly and not when
function 0 is added.
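
A minimal sketch of the callback idea above, using simplified stand-in types rather than QEMU's real PCIDevice and PCIBus. The names SketchBus, SketchDev, SketchSlot, aer_is_valid_function and slot_add_function are hypothetical, and the extra `self` argument is an assumption added so that the validator can compare against its own device:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for QEMU's PCIBus/PCIDevice. */
typedef struct SketchBus { int host_bus_num; } SketchBus;

typedef struct SketchDev SketchDev;
struct SketchDev {
    SketchBus *host_bus;  /* host bus the passed-through function lives on */
    /* Proposed hook: lets an existing function veto a new sibling. */
    bool (*is_valid_function)(SketchDev *self, SketchDev *newfunction);
};

#define SLOT_FUNCS 8

typedef struct SketchSlot {
    SketchDev *func[SLOT_FUNCS];
} SketchSlot;

/* AER-style validator: the new function must share our host bus. */
static bool aer_is_valid_function(SketchDev *self, SketchDev *newfunction)
{
    return self->host_bus == newfunction->host_bus;
}

/* Add a function to a slot, asking every existing function to validate it. */
static bool slot_add_function(SketchSlot *slot, int fn, SketchDev *dev)
{
    assert(fn >= 0 && fn < SLOT_FUNCS);
    if (slot->func[fn]) {
        return false;  /* function number already in use */
    }
    for (int i = 0; i < SLOT_FUNCS; i++) {
        SketchDev *cur = slot->func[i];
        if (cur && cur->is_valid_function &&
            !cur->is_valid_function(cur, dev)) {
            return false;  /* rejected now, not when function 0 is added */
        }
    }
    slot->func[fn] = dev;
    return true;
}
```

With this shape, adding functions 1 and 2 from the same host bus succeeds, and a function from a different host bus is rejected at the moment it is added.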

I would prefer this validation code to be a patch on top so we can merge
the functionality directly and avoid blocking it while we figure out the
best api to validate things.

I don't see why making guest topology match host would
ever be a problem, but if it's required to support
configurations where these differ, I'd like to see
an attempt to address that be split out, after aer
is supported.

Hi Michael,

That's a good idea. We should first simplify the implementation of the
AER function, without further impact on the PCI core code.

Thanks,
Chen





v15-v16:
10/14 and 11/14 are new; they introduce a reset sequence ID so that each
vfio device can tell whether it has already been reset for a given bus
reset. Other patches are unchanged.

v14-v15:
1. add device hot reset callback
2. add bus_in_reset for vfio devices to avoid doing the host bus reset
   multiple times

v13-v14:
1. for multifunction devices, require all functions to enable AER. (9/13)
2. since all affected functions receive the error signal, ignore
   functions on which no error occurred. (12/13)

v12-v13:
1. since multifunction hotplug is supported, add a callback to enable AER.
2. add pci device pre+post reset for AER host reset.

Chen Fan (14):
   vfio: extract vfio_get_hot_reset_info as a single function
   vfio: squeeze out vfio_pci_do_hot_reset for support bus reset
   pcie: modify the capability size assert
   vfio: make the 4 bytes aligned for capability size
   vfio: add pcie extanded capability support
   aer: impove pcie_aer_init to support vfio device
   vfio: add aer support for vfio device
   vfio: add check host bus reset is support or not
   add check reset mechanism when hotplug vfio device
   pci: introduce pci bus pre reset
   vfio: introduce last reset sequence id
   pcie_aer: expose pcie_aer_msg() interface
   vfio-pci: pass the aer error to guest
   vfio: add 'aer' property to expose aercap

  hw/pci-bridge/ioh3420.c|   2 +-
  hw/pci-bridge/xio3130_downstream.c |   2 +-
  hw/pci-bridge/xio3130_upstream.c   |   2 +-
  hw/pci/pci.c   |  42 +++
  hw/pci/pci_bridge.c|   3 +
  hw/pci/pcie.c  |   2 +-
  hw/pci/pcie_aer.c  |   6 +-
  hw/vfio/pci.c  | 616 +
  hw/vfio/pci.h  |   9 +
  include/hw/pci/pci.h   |   1 +
  include/hw/pci/pci_bus.h   |   8 +
  include/hw/pci/pcie_aer.h  |   3 +-
  12 files changed, 624 insertions(+), 72 deletions(-)

--
1.9.3





Re: [Qemu-devel] [PATCH v16 13/14] vfio-pci: pass the aer error to guest

2016-01-19 Thread Chen Fan



On 01/18/2016 06:45 PM, Marcel Apfelbaum wrote:

On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 

when the vfio device encounters an uncorrectable error in host,
the vfio_pci driver will signal the eventfd registered by this
vfio device, the results in the qemu eventfd handler getting


Maybe "the results in" -> resulting in


invoked.

this patch is to pass the error to guest and have the guest driver
recover from the error.


Maybe "Pass the error to... and let the ... "



Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 53 +++--
  1 file changed, 47 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index da4815e..efa5e01 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2553,18 +2553,59 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
  static void vfio_err_notifier_handler(void *opaque)
  {
  VFIOPCIDevice *vdev = opaque;
+PCIDevice *dev = &vdev->pdev;
+PCIEAERMsg msg = {
+.severity = 0,
+.source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
+};

  if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
  return;
  }

  /*
- * TBD. Retrieve the error details and decide what action
- * needs to be taken. One of the actions could be to pass
- * the error to the guest and have the guest driver recover
- * from the error. This requires that PCIe capabilities be
- * exposed to the guest. For now, we just terminate the
- * guest to contain the error.
+ * in case the real hardware configration has been changed,


configration -> configuration



+ * here we should recheck the bus reset capability.
+ */
+if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
+vfio_check_host_bus_reset(vdev)) {
+goto stop;
+}
+/*
+ * we should read the error details from the real hardware
+ * configuration spaces, here we only need to do is signaling
+ * to guest an uncorrectable error has occurred.
+ */
+if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
+dev->exp.aer_cap) {


Why do we need dev->exp.aer_cap check here? In patch 7/14 we fail the
device init process if this happens, right?


The FEATURE_ENABLE_AER property does not tell us whether the vfio device
actually has the AER capability, so we should check it here.




+uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
+uint32_t uncor_status;
+bool isfatal;
+
+uncor_status = vfio_pci_read_config(dev,
+   dev->exp.aer_cap + PCI_ERR_UNCOR_STATUS, 4);
+
+/*
+ * if we receive the error signal but not this device, we can


maybe "if the error is not emitted by this device..."


Thank you for your careful review of the poor English descriptions in the
patchset. I will fix them in the next version.

Thanks,
Chen




Thanks,
Marcel


+ * just ignore it.
+ */
+if (!(uncor_status & ~0UL)) {
+return;
+}
+
+isfatal = uncor_status & pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);

+
+msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
+ PCI_ERR_ROOT_CMD_NONFATAL_EN;
+
+pcie_aer_msg(dev, &msg);
+return;
+}
+
+stop:
+/*
+ * If the aer capability is not exposed to the guest. we just
+ * terminate the guest to contain the error.
   */

  error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected.  "
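
The two computations the handler above relies on can be sketched in isolation. This is a standalone illustration, not the patch's code: the helper names devfn, aer_source_id and aer_is_fatal are hypothetical, and the register values passed in are plain integers rather than reads from config space. The source ID packs the bus number into bits 15:8 and devfn into bits 7:0, and an error is treated as fatal when any pending uncorrectable status bit is also set in the severity register:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* devfn packs the slot number into bits 7:3 and the function into bits 2:0. */
static uint8_t devfn(unsigned slot, unsigned func)
{
    assert(slot < 32 && func < 8);
    return (uint8_t)((slot << 3) | func);
}

/* Source (requester) ID: bus number in bits 15:8, devfn in bits 7:0. */
static uint16_t aer_source_id(uint8_t bus_num, uint8_t dev_fn)
{
    return (uint16_t)((bus_num << 8) | dev_fn);
}

/*
 * An uncorrectable error is fatal if any bit pending in the status
 * register is also flagged in the severity register.
 */
static bool aer_is_fatal(uint32_t uncor_status, uint32_t uncor_sever)
{
    return (uncor_status & uncor_sever) != 0;
}
```

For example, function 1 of slot 3 on bus 5 yields source ID 0x0519, and a status of zero is never fatal regardless of the severity mask.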







Re: [Qemu-devel] [PATCH v16 10/14] pci: introduce pci bus pre reset

2016-01-19 Thread Chen Fan



On 01/15/2016 04:36 AM, Alex Williamson wrote:

On Tue, 2016-01-12 at 10:43 +0800, Cao jin wrote:

From: Chen Fan 

avoid repeat bus reset, here introduce a sequence ID for each time
bus hot reset, so each vfio device could know whether they've already
been reset for that sequence ID.

Signed-off-by: Chen Fan 
---
  hw/pci/pci.c | 13 +
  hw/pci/pci_bridge.c  |  3 +++
  include/hw/pci/pci.h |  1 +
  include/hw/pci/pci_bus.h |  3 +++
  4 files changed, 20 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index f6ca6ef..ceb72d5 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -91,6 +91,18 @@ static void pci_bus_unrealize(BusState *qbus,
Error **errp)
  vmstate_unregister(NULL, &vmstate_pcibus, bus);
  }
  
+void pci_bus_pre_reset(PCIBus *bus, uint32_t seqid)

+{
+PCIBus *sec;
+
+bus->in_reset = true;
+bus->reset_seqid = seqid;
+
+QLIST_FOREACH(sec, &bus->child, sibling) {
+pci_bus_pre_reset(sec, seqid);
+}
+}
+
  static bool pcibus_is_root(PCIBus *bus)
  {
  return !bus->parent_dev;
@@ -276,6 +288,7 @@ static void pcibus_reset(BusState *qbus)
  for (i = 0; i < bus->nirq; i++) {
  assert(bus->irq_count[i] == 0);
  }
+bus->in_reset = false;
  }
  
  static void pci_host_bus_register(PCIBus *bus, DeviceState *parent)

diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 40c97b1..c7f15a1 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -268,6 +268,9 @@ void pci_bridge_write_config(PCIDevice *d,
  newctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
  if (~oldctl & newctl & PCI_BRIDGE_CTL_BUS_RESET) {
  /* Trigger hot reset on 0->1 transition. */
+uint32_t seqid = s->sec_bus.reset_seqid++;

Doesn't this need to come from a global sequence ID?  Imagine the case
of a nested bus, the leaf bus is reset incrementing the sequence ID.
The devices on that bus store that sequence ID as they're reset.  The
parent bus is then reset, but all the devices on the leaf bus have
already been reset for that sequence ID and ignore the reset.
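
One way the global sequence ID could look, as a rough sketch only. Nothing here is from the patch: global_reset_seqid, next_reset_seqid, SketchVfioDev and dev_reset_once are hypothetical names, and reserving 0 to mean "never reset" is an assumption:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, not QEMU code: one counter shared by all buses. */
static uint32_t global_reset_seqid;

/* Return a fresh, non-zero ID per bus hot reset (0 means "never reset"). */
static uint32_t next_reset_seqid(void)
{
    if (++global_reset_seqid == 0) {
        global_reset_seqid = 1;  /* skip 0 on wrap-around */
    }
    return global_reset_seqid;
}

typedef struct SketchVfioDev {
    uint32_t last_reset_seqid;  /* ID of the last reset this device handled */
} SketchVfioDev;

/* Returns 1 if the device performs the reset, 0 if already done for seqid. */
static int dev_reset_once(SketchVfioDev *dev, uint32_t seqid)
{
    assert(seqid != 0);
    if (dev->last_reset_seqid == seqid) {
        return 0;
    }
    dev->last_reset_seqid = seqid;
    return 1;
}
```

Because the leaf bus reset and the later parent bus reset draw from the same counter, they get different IDs, so a device on the leaf bus does not mistake the parent reset for one it has already handled.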


+
+pci_bus_pre_reset(&s->sec_bus, seqid ? seqid : 1);

Does this work?  Seems like this would make devices ignore the second
bus reset after the VM is instantiated.  ie.  the first bus reset seqid
is 0, so we call pre_reset with 1, the second time we call it with 1
again.
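
The repeated value can be seen by isolating the expression. This is a standalone sketch; patch_seqid is a hypothetical wrapper around the two lines quoted from the patch, with the per-bus counter passed in by pointer:

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the patch: post-increment the counter, then map 0 to 1. */
static uint32_t patch_seqid(uint32_t *reset_seqid)
{
    assert(reset_seqid);
    uint32_t seqid = (*reset_seqid)++;
    return seqid ? seqid : 1;
}
```

Starting from a counter of 0, the first call reads 0 and passes 1, and the second call reads 1 and passes 1 again; only the third call moves on to 2.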


  qbus_reset_all(&s->sec_bus.qbus);

I'd be tempted to call qbus_walk_children() directly, it already has a
pre_busfn callback hook.

Hi Alex,

It looks like this would need a lot of changes to the PCI core code. As
Michael suggested in 00/14, maybe we should simplify the AER
implementation. What do you think of that?

Thanks,
Chen





  }
  }
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 379b6e1..b811279 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -381,6 +381,7 @@ void pci_bus_fire_intx_routing_notifier(PCIBus
*bus);
  void pci_device_set_intx_routing_notifier(PCIDevice *dev,
PCIINTxRoutingNotifier
notifier);
  void pci_device_reset(PCIDevice *dev);
+void pci_bus_pre_reset(PCIBus *bus, uint32_t seqid);
  
  PCIDevice *pci_nic_init_nofail(NICInfo *nd, PCIBus *rootbus,

 const char *default_model,
diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
index 7812fa9..dd6aaf1 100644
--- a/include/hw/pci/pci_bus.h
+++ b/include/hw/pci/pci_bus.h
@@ -40,6 +40,9 @@ struct PCIBus {
  int nirq;
  int *irq_count;
  
+bool in_reset;

+uint32_t reset_seqid;
+
  NotifierWithReturnList hotplug_notifiers;
  };
  




Re: [Qemu-devel] [PATCH] hw/misc: slavepci_passthru driver

2016-01-19 Thread Francesco Zuliani

Hi Alex,

On 01/18/2016 05:41 PM, Alex Williamson wrote:

On Mon, 2016-01-18 at 10:16 -0500, Marc-André Lureau wrote:

- Original Message -

Hi there,

I'd like to submit this new pci driver (hw/misc) for inclusion,
if you think it could be useful to others as well as ourselves.

The driver "worked for our needs" BUT we haven't done extensive
testing, and this is our first attempt to submit a patch, so I kindly
ask for extra forgiveness.

The "slavepci_passthru" driver is useful in the scenario described
below to implement a simplified passthru when the host CPU does not
support IOMMU and one is interested only in pci target-mode (slave
devices).

Let's CC Alex, who worked on the most recent framework for something related to
that (VFIO).

Embedded system CPUs (e.g. Atom, AMD G-Series) often lack the VT-d
extensions (IOMMU) needed to pass through pci peripherals to
the guest machine (i.e. the pci pass-thru feature cannot be used).

If one is only interested in using the pci board as a pci-target
(slave device), this driver mmap(s) the host-pci-bars into the guest
within a virtual pci-device.

What exactly do you mean by pci-target/slave device? Does this mean
that the device is not DMA capable, ie. cannot enable BusMaster?

Yes, exactly. Our approach can be used ONLY if one is NOT interested in
DMA-Capability (i.e. it is not possible to enable BusMaster)

This is useful in our case for debugging a system running a bare-metal
executable via the qemu gdbserver facility (i.e. the '-s' option in qemu).

Currently the driver assumes the custom pci card has four 32-bit bars
to be mapped (in current patch this is mandatory)

HowTo:
To use the new driver one shall:
- define two environment variables for assigning proper VID and DID to
associate to the guest pci card
- give the host pci bar address to map in the guest.

Example Usage:

Let us suppose that we have in the host a slave pci device with the
following 4 bars (i.e. output of lspci -v -s YOUR-CARD | grep Memory)
Memory at db80 (32-bit, non-prefetchable) [size=4K]
Memory at db90 (32-bit, non-prefetchable) [size=8K]
Memory at dba0 (32-bit, non-prefetchable) [size=4K]
Memory at dbb0 (32-bit, non-prefetchable) [size=4K]

We can map these bars in a guest-pci with VID=0xe33e DID=0x000a using

SLAVEPASSTHRU_VID="0xe33e" SLAVEPASSTHRU_DID="0xa" qemu-system-x86_64 \
YOUR-SET-OF-FLAGS \
-device slavepassthru,size1=4096,baseaddr1=0xdb90,size2=8192,baseaddr2=0xdba0,size3=4096,baseaddr3=0xdbd0,size4=4096,baseaddr4=0xdbe0

Please note that if your device has less than four bars you can give
the same size and baseaddress to the unused bars.

Those are some pretty serious usage restrictions and using /dev/mem is
really not practical. The resource files in pci-sysfs would even be a
better option.
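
For reference, mapping a BAR through a pci-sysfs resource file instead of /dev/mem could look roughly like the sketch below. It is not taken from the submitted patch; map_bar and unmap_bar are hypothetical helpers, and the caller is assumed to supply a path such as the device's resource0 file under /sys/bus/pci/devices/ together with the BAR size:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `size` bytes of a BAR via a pci-sysfs resource file (for example
 * the resource0 file of the device; the path is supplied by the caller).
 * Returns MAP_FAILED on error.
 */
static void *map_bar(const char *path, size_t size)
{
    assert(path != NULL && size > 0);

    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0) {
        return MAP_FAILED;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p;
}

/* Release a mapping obtained from map_bar(). */
static void unmap_bar(void *p, size_t size)
{
    if (p != MAP_FAILED) {
        munmap(p, size);
    }
}
```

Compared with /dev/mem, the mapping is tied to one specific device and BAR, so no host physical address needs to be passed on the command line.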

Ours was a quick hack to fulfill our needs. The approach via sysfs is
of course the right one, and we would implement it if this patch is of
interest.

I didn't see how IO and MMIO BARs get enabled on the
physical device or whether you support any kind of interrupt scheme.

In our case the IO space is not used.
The MMIO space is already enabled.

Our custom board does not have any interrupt and our quick hack
did not implement it.

I had never really intended QEMU use of this, but you might want to
consider vfio no-iommu mode:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/vfio/vfio.c?id=03a76b60f8ba27974e2d252bc555d2c103420e15

Using this taints the kernel, but maybe that's nothing you mind if
you're already letting QEMU access /dev/mem. The QEMU vfio-pci driver
would need to be modified to use the new device and of course it
wouldn't have IOMMU translation capabilities. That means that the
BusMaster bit should be protected and MSI/X capabilities should be hidden
from the VM. It seems more flexible and featureful than what you have
here. Thanks,

I was not aware of this interesting patch, I will study it to see if
it fits our use case.

Just for information: by "taint" you mean that "security" is broken, not
a licensing issue, am I right?

Thanks a lot for your time

Francesco Zuliani

Alex

Re: [Qemu-devel] [PATCH 2/2] migration/virtio: Remove simple .get/.put use

2016-01-19 Thread Dr. David Alan Gilbert

* Sascha Silbe (si...@linux.vnet.ibm.com) wrote:
> Dear David,
> 
> "Dr. David Alan Gilbert"  writes:
> 
> > +/* a variable length array (i.e. _type *_field) but we know the
> > + * length
> > + */
> > +#define VMSTATE_STRUCT_VARRAY_POINTER_KNOWN(_field, _state, _num, 
> > _version, _vmsd, _type) { \
> [...]
> 
> Thinking about it some more, wouldn't VMSTATE_STRUCT_ARRAY_POINTER be a
> better name? Like with VMSTATE_ARRAY, the size of the array is known at
> compile-time. It's just that you need to dereference it first, hence
> ..._POINTER. There's nothing variable about it at all.

It's all a bit confusing; but the only pattern I'd figured out was that the
things after the 'VARRAY_' part tended to be talking about the length of
the array rather than the contents.

> But keep in mind I don't understand the current naming scheme in the
> first place, e.g. VMSTATE_ARRAY_INT32_UNSAFE vs. VMSTATE_VARRAY_INT32,
> with both of them specifying VMS_VARRAY_INT32...

No, I don't really either; one for Juan or Amit to suggest if they
prefer one or the other.

Dave

> 
> Sascha
> -- 
> Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
> https://se-silbe.de/
> USt-IdNr. DE281696641
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
