On Mon, Jun 17, 2013 at 09:30:42AM -0400, Luiz Capitulino wrote: > On Mon, 17 Jun 2013 16:21:31 +0300 > "Michael S. Tsirkin" <m...@redhat.com> wrote: > > > On Mon, Jun 17, 2013 at 09:11:27AM -0400, Luiz Capitulino wrote: > > > On Fri, 14 Jun 2013 15:45:52 +0800 > > > Amos Kong <ak...@redhat.com> wrote: > > > > > > > Currently macvtap based macvlan device is working in promiscuous > > > > mode, we want to implement mac-programming over macvtap through > > > > Libvirt for better performance. > > > > > > > > Design: > > > > QEMU notifies Libvirt when rx-filter config is changed in guest, > > > > then Libvirt query the rx-filter information by a monitor command, > > > > and sync the change to macvtap device. Related rx-filter config > > > > of the nic contains main mac, rx-mode items and vlan table. > > > > > > > > This patch adds a QMP event to notify management of rx-filter change, > > > > and adds a monitor command for management to query rx-filter > > > > information. > > > > > > > > Test: > > > > If we repeatedly add/remove vlan, and change macaddr of vlan > > > > interfaces in guest by a loop script. > > > > > > > > Result: > > > > The events will flood the QMP client(management), management takes > > > > too much resource to process the events. > > > > > > > > Event_throttle API (set rate to 1 ms) can avoid the events to flood > > > > > > I doubt this is a valid value. Today, the three events that use the event > > > throttle API set the delay rate to 1000 ms. > > > > > > > QMP client, but it could cause an unexpected delay (~1ms), guests > > > > guests normally expect rx-filter updates immediately. > > > > > > What you mean by "immediately"? There's a round-trip to the host plus > > > all the stuff QEMU will execute to fulfil the request. And how did you > > > measure this, btw? > > > > > > What you have to do is is to measure your test-case in three different > > > scenarios: > > > > > > 1. Against upstream QEMU (ie. no patches) > > > 2. With the event throttle API > > > 3. With this patch > > > > > > Only then you'll be able which is better. > > > > > > I doubt there's a lot of value in splitting this patch in two and > > testing whether we can reproduce any problems with a partial patch. > > I didn't suggest that. Actually, I didn't even talk about code. > > > The > > problem of management not keeping up with event flood triggered by mac > > updates by guest might be purely theoretical, > > That's not the problem the flag thing is trying to solve. We have the > event throttle API for that. What the flag is trying to solve is _guest_ > latency due to event generation and it's not clear at all if this is a real > problem. > > > but a robust solution that > > does not have theoretical holes makes me sleep better at night, and it's a > > trivial amount of code. > > It has the drawback of coupling the query command with the event.
What specifically is the problem? I only see a fix for a problem, that might be theoretical, but I'm happier knowing I don't need to worry about it. > IMO, such > a coupling has to be justified with real data. It's a quality of implementation issue. No spec says that you should not delay RX filter updates for a very long time (yes 1000ms is very long), but if a NIC does it it's a low quality NIC, and it will cause trouble in the field. I don't think we need to spend time proving that with real data, it's just obvious. > > > > So > > > > Acked-by: Michael S. Tsirkin <m...@redhat.com> > > > > Any enhancements can be implemented > > by additional patches, to be applied on top. > > > > > > > > > > So we use a flag for each nic to avoid events flooding, the event > > > > is emitted once until the query command is executed. The flag > > > > implementation could not introduce unexpected delay. > > > > > > > > There maybe exist an uncontrollable delay if we let Libvirt do the > > > > real change, guests normally expect rx-filter updates immediately. > > > > But it's another separate issue, we can investigate it when the > > > > work in Libvirt side is done. > > > > > > > > Signed-off-by: Amos Kong <ak...@redhat.com> > > > > --- > > > > v2: add argument to filter mac-table info of single nic (Stefan) > > > > update the document, add event notification > > > > v3: rename to rx-filter, add main mac, avoid events flooding (MST) > > > > fix error process (Stefan), fix qmp interface (Eric) > > > > v4: process qerror in hmp, cleanup (Luiz) > > > > set flag for each device, add device path in event, add > > > > helper for g_strdup_printf (MST) > > > > fix qmp document (Eric) > > > > v5: add path in doc, define notify flag to unsigned (Eric) > > > > add vlan table (Jason), drop monitor cmd > > > > v6: return vids list, add test results/analysis to commitlog > > > > --- > > > > QMP/qmp-events.txt | 17 ++++++++ > > > > hw/net/virtio-net.c | 108 > > > > ++++++++++++++++++++++++++++++++++++++++++++++ > > > > include/monitor/monitor.h | 1 + > > > > include/net/net.h | 3 ++ > > > > monitor.c | 1 + > > > > net/net.c | 47 ++++++++++++++++++++ > > > > qapi-schema.json | 75 ++++++++++++++++++++++++++++++++ > > > > qmp-commands.hx | 63 +++++++++++++++++++++++++++ > > > > 8 files changed, 315 insertions(+) > > > > > > > > diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt > > > > index 24e804e9..39b6016 100644 > > > > --- a/QMP/qmp-events.txt > > > > +++ b/QMP/qmp-events.txt > > > > @@ -172,6 +172,23 @@ Data: > > > > }, > > > > "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } > > > > > > > > +NIC_RX_FILTER_CHANGED > > > > +----------------- > > > > + > > > > +The event is emitted once until the query command is executed, > > > > +the first event will always be emitted. > > > > + > > > > +Data: > > > > + > > > > +- "name": net client name (json-string) > > > > +- "path": device path (json-string) > > > > + > > > > +{ "event": "NIC_RX_FILTER_CHANGED", > > > > + "data": { "name": "vnet0", > > > > + "path": "/machine/peripheral/vnet0/virtio-backend" }, > > > > + "timestamp": { "seconds": 1368697518, "microseconds": 326866 } } > > > > +} > > > > + > > > > RESET > > > > ----- > > > > > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c > > > > index 1ea9556..4137959 100644 > > > > --- a/hw/net/virtio-net.c > > > > +++ b/hw/net/virtio-net.c > > > > @@ -21,6 +21,8 @@ > > > > #include "hw/virtio/virtio-net.h" > > > > #include "net/vhost_net.h" > > > > #include "hw/virtio/virtio-bus.h" > > > > +#include "qapi/qmp/qjson.h" > > > > +#include "monitor/monitor.h" > > > > > > > > #define VIRTIO_NET_VM_VERSION 11 > > > > > > > > @@ -192,6 +194,100 @@ static void > > > > virtio_net_set_link_status(NetClientState *nc) > > > > virtio_net_set_status(vdev, vdev->status); > > > > } > > > > > > > > +static void rxfilter_notify(NetClientState *nc) > > > > +{ > > > > + QObject *event_data; > > > > + VirtIONet *n = qemu_get_nic_opaque(nc); > > > > + > > > > + if (nc->rxfilter_notify_enabled) { > > > > + event_data = qobject_from_jsonf("{ 'name': %s, 'path': %s }", > > > > + n->netclient_name, > > > > + object_get_canonical_path(OBJECT(n->qdev))); > > > > + monitor_protocol_event(QEVENT_NIC_RX_FILTER_CHANGED, > > > > event_data); > > > > + qobject_decref(event_data); > > > > + > > > > + /* disable event notification to avoid events flooding */ > > > > + nc->rxfilter_notify_enabled = 0; > > > > + } > > > > +} > > > > + > > > > +static char *mac_strdup_printf(const uint8_t *mac) > > > > +{ > > > > + return g_strdup_printf("%.2x:%.2x:%.2x:%.2x:%.2x:%.2x", mac[0], > > > > + mac[1], mac[2], mac[3], mac[4], mac[5]); > > > > +} > > > > + > > > > +static RxFilterInfo *virtio_net_query_rxfilter(NetClientState *nc) > > > > +{ > > > > + VirtIONet *n = qemu_get_nic_opaque(nc); > > > > + RxFilterInfo *info; > > > > + strList *str_list = NULL; > > > > + strList *entry; > > > > + intList *int_list = NULL; > > > > + intList *int_entry; > > > > + int i, j; > > > > + > > > > + info = g_malloc0(sizeof(*info)); > > > > + info->name = g_strdup(nc->name); > > > > + info->promiscuous = n->promisc; > > > > + > > > > + if (n->nouni) { > > > > + info->unicast = RX_STATE_NONE; > > > > + } else if (n->alluni) { > > > > + info->unicast = RX_STATE_ALL; > > > > + } else { > > > > + info->unicast = RX_STATE_NORMAL; > > > > + } > > > > + > > > > + if (n->nomulti) { > > > > + info->multicast = RX_STATE_NONE; > > > > + } else if (n->allmulti) { > > > > + info->multicast = RX_STATE_ALL; > > > > + } else { > > > > + info->multicast = RX_STATE_NORMAL; > > > > + } > > > > + > > > > + info->broadcast_allowed = n->nobcast; > > > > + info->multicast_overflow = n->mac_table.multi_overflow; > > > > + info->unicast_overflow = n->mac_table.uni_overflow; > > > > + > > > > + info->main_mac = mac_strdup_printf(n->mac); > > > > + > > > > + for (i = 0; i < n->mac_table.first_multi; i++) { > > > > + entry = g_malloc0(sizeof(*entry)); > > > > + entry->value = mac_strdup_printf(n->mac_table.macs + i * > > > > ETH_ALEN); > > > > + entry->next = str_list; > > > > + str_list = entry; > > > > + } > > > > + info->unicast_table = str_list; > > > > + > > > > + str_list = NULL; > > > > + for (i = n->mac_table.first_multi; i < n->mac_table.in_use; i++) { > > > > + entry = g_malloc0(sizeof(*entry)); > > > > + entry->value = mac_strdup_printf(n->mac_table.macs + i * > > > > ETH_ALEN); > > > > + entry->next = str_list; > > > > + str_list = entry; > > > > + } > > > > + info->multicast_table = str_list; > > > > + > > > > + for (i = 0; i < MAX_VLAN >> 5; i++) { > > > > + for (j = 0; n->vlans[i] && j < 0x1f; j++) { > > > > + if (n->vlans[i] & (1U << j)) { > > > > + int_entry = g_malloc0(sizeof(*int_entry)); > > > > + int_entry->value = (i << 5) + j; > > > > + int_entry->next = int_list; > > > > + int_list = int_entry; > > > > + } > > > > + } > > > > + } > > > > + info->vlan_table = int_list; > > > > + > > > > + /* enable event notification after query */ > > > > + nc->rxfilter_notify_enabled = 1; > > > > + > > > > + return info; > > > > +} > > > > + > > > > static void virtio_net_reset(VirtIODevice *vdev) > > > > { > > > > VirtIONet *n = VIRTIO_NET(vdev); > > > > @@ -420,6 +516,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, > > > > uint8_t cmd, > > > > { > > > > uint8_t on; > > > > size_t s; > > > > + NetClientState *nc = qemu_get_queue(n->nic); > > > > > > > > s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on)); > > > > if (s != sizeof(on)) { > > > > @@ -442,6 +539,8 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, > > > > uint8_t cmd, > > > > return VIRTIO_NET_ERR; > > > > } > > > > > > > > + rxfilter_notify(nc); > > > > + > > > > return VIRTIO_NET_OK; > > > > } > > > > > > > > @@ -487,6 +586,7 @@ static int virtio_net_handle_mac(VirtIONet *n, > > > > uint8_t cmd, > > > > { > > > > struct virtio_net_ctrl_mac mac_data; > > > > size_t s; > > > > + NetClientState *nc = qemu_get_queue(n->nic); > > > > > > > > if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { > > > > if (iov_size(iov, iov_cnt) != sizeof(n->mac)) { > > > > @@ -495,6 +595,8 @@ static int virtio_net_handle_mac(VirtIONet *n, > > > > uint8_t cmd, > > > > s = iov_to_buf(iov, iov_cnt, 0, &n->mac, sizeof(n->mac)); > > > > assert(s == sizeof(n->mac)); > > > > qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac); > > > > + rxfilter_notify(nc); > > > > + > > > > return VIRTIO_NET_OK; > > > > } > > > > > > > > @@ -559,6 +661,8 @@ static int virtio_net_handle_mac(VirtIONet *n, > > > > uint8_t cmd, > > > > n->mac_table.multi_overflow = 1; > > > > } > > > > > > > > + rxfilter_notify(nc); > > > > + > > > > return VIRTIO_NET_OK; > > > > } > > > > > > > > @@ -567,6 +671,7 @@ static int virtio_net_handle_vlan_table(VirtIONet > > > > *n, uint8_t cmd, > > > > { > > > > uint16_t vid; > > > > size_t s; > > > > + NetClientState *nc = qemu_get_queue(n->nic); > > > > > > > > s = iov_to_buf(iov, iov_cnt, 0, &vid, sizeof(vid)); > > > > vid = lduw_p(&vid); > > > > @@ -584,6 +689,8 @@ static int virtio_net_handle_vlan_table(VirtIONet > > > > *n, uint8_t cmd, > > > > else > > > > return VIRTIO_NET_ERR; > > > > > > > > + rxfilter_notify(nc); > > > > + > > > > return VIRTIO_NET_OK; > > > > } > > > > > > > > @@ -1312,6 +1419,7 @@ static NetClientInfo net_virtio_info = { > > > > .receive = virtio_net_receive, > > > > .cleanup = virtio_net_cleanup, > > > > .link_status_changed = virtio_net_set_link_status, > > > > + .query_rx_filter = virtio_net_query_rxfilter, > > > > }; > > > > > > > > static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int > > > > idx) > > > > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h > > > > index 1a6cfcf..1942cc4 100644 > > > > --- a/include/monitor/monitor.h > > > > +++ b/include/monitor/monitor.h > > > > @@ -41,6 +41,7 @@ typedef enum MonitorEvent { > > > > QEVENT_BLOCK_JOB_READY, > > > > QEVENT_DEVICE_DELETED, > > > > QEVENT_DEVICE_TRAY_MOVED, > > > > + QEVENT_NIC_RX_FILTER_CHANGED, > > > > QEVENT_SUSPEND, > > > > QEVENT_SUSPEND_DISK, > > > > QEVENT_WAKEUP, > > > > diff --git a/include/net/net.h b/include/net/net.h > > > > index 43d85a1..30e4b04 100644 > > > > --- a/include/net/net.h > > > > +++ b/include/net/net.h > > > > @@ -49,6 +49,7 @@ typedef ssize_t (NetReceiveIOV)(NetClientState *, > > > > const struct iovec *, int); > > > > typedef void (NetCleanup) (NetClientState *); > > > > typedef void (LinkStatusChanged)(NetClientState *); > > > > typedef void (NetClientDestructor)(NetClientState *); > > > > +typedef RxFilterInfo *(QueryRxFilter)(NetClientState *); > > > > > > > > typedef struct NetClientInfo { > > > > NetClientOptionsKind type; > > > > @@ -59,6 +60,7 @@ typedef struct NetClientInfo { > > > > NetCanReceive *can_receive; > > > > NetCleanup *cleanup; > > > > LinkStatusChanged *link_status_changed; > > > > + QueryRxFilter *query_rx_filter; > > > > NetPoll *poll; > > > > } NetClientInfo; > > > > > > > > @@ -74,6 +76,7 @@ struct NetClientState { > > > > unsigned receive_disabled : 1; > > > > NetClientDestructor *destructor; > > > > unsigned int queue_index; > > > > + unsigned rxfilter_notify_enabled:1; > > > > }; > > > > > > > > typedef struct NICState { > > > > diff --git a/monitor.c b/monitor.c > > > > index 017411f..3b8117c 100644 > > > > --- a/monitor.c > > > > +++ b/monitor.c > > > > @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { > > > > [QEVENT_BLOCK_JOB_READY] = "BLOCK_JOB_READY", > > > > [QEVENT_DEVICE_DELETED] = "DEVICE_DELETED", > > > > [QEVENT_DEVICE_TRAY_MOVED] = "DEVICE_TRAY_MOVED", > > > > + [QEVENT_NIC_RX_FILTER_CHANGED] = "NIC_RX_FILTER_CHANGED", > > > > [QEVENT_SUSPEND] = "SUSPEND", > > > > [QEVENT_SUSPEND_DISK] = "SUSPEND_DISK", > > > > [QEVENT_WAKEUP] = "WAKEUP", > > > > diff --git a/net/net.c b/net/net.c > > > > index 43a74e4..33abffe 100644 > > > > --- a/net/net.c > > > > +++ b/net/net.c > > > > @@ -961,6 +961,53 @@ void print_net_client(Monitor *mon, NetClientState > > > > *nc) > > > > nc->info_str); > > > > } > > > > > > > > +RxFilterInfoList *qmp_query_rx_filter(bool has_name, const char *name, > > > > + Error **errp) > > > > +{ > > > > + NetClientState *nc; > > > > + RxFilterInfoList *filter_list = NULL, *last_entry = NULL; > > > > + > > > > + QTAILQ_FOREACH(nc, &net_clients, next) { > > > > + RxFilterInfoList *entry; > > > > + RxFilterInfo *info; > > > > + > > > > + /* only query rx-filter information of nic */ > > > > + if (nc->info->type != NET_CLIENT_OPTIONS_KIND_NIC) { > > > > + continue; > > > > + } > > > > + if (has_name && strcmp(nc->name, name) != 0) { > > > > + continue; > > > > + } > > > > + > > > > + if (nc->info->query_rx_filter) { > > > > + info = nc->info->query_rx_filter(nc); > > > > + entry = g_malloc0(sizeof(*entry)); > > > > + entry->value = info; > > > > + > > > > + if (!filter_list) { > > > > + filter_list = entry; > > > > + } else { > > > > + last_entry->next = entry; > > > > + } > > > > + last_entry = entry; > > > > + } else if (has_name) { > > > > + error_setg(errp, "net client(%s) doesn't support" > > > > + " rx-filter querying", name); > > > > + break; > > > > + } > > > > + } > > > > + > > > > + if (filter_list == NULL && !error_is_set(errp)) { > > > > + if (has_name) { > > > > + error_setg(errp, "invalid net client name: %s", name); > > > > + } else { > > > > + error_setg(errp, "no net client supports rx-filter > > > > querying"); > > > > + } > > > > + } > > > > + > > > > + return filter_list; > > > > +} > > > > + > > > > void do_info_network(Monitor *mon, const QDict *qdict) > > > > { > > > > NetClientState *nc, *peer; > > > > diff --git a/qapi-schema.json b/qapi-schema.json > > > > index 5ad6894..a69dc17 100644 > > > > --- a/qapi-schema.json > > > > +++ b/qapi-schema.json > > > > @@ -3624,3 +3624,78 @@ > > > > '*cpuid-input-ecx': 'int', > > > > 'cpuid-register': 'X86CPURegister32', > > > > 'features': 'int' } } > > > > + > > > > +## > > > > +# @RxState: > > > > +# > > > > +# Packets receiving state > > > > +# > > > > +# @normal: filter assigned packets according to the mac-table > > > > +# > > > > +# @none: don't receive any assigned packet > > > > +# > > > > +# @all: receive all assigned packets > > > > +# > > > > +## > > > > +{ 'enum': 'RxState', 'data': [ 'normal', 'none', 'all' ] } > > > > + > > > > +## > > > > +# @RxFilterInfo: > > > > +# > > > > +# Rx-filter information for a nic. > > > > +# > > > > +# @name: net client name > > > > +# > > > > +# @promiscuous: whether promiscuous mode is enabled > > > > +# > > > > +# @multicast: multicast receive state > > > > +# > > > > +# @unicast: unicast receive state > > > > +# > > > > +# @broadcast-allowed: whether to receive broadcast > > > > +# > > > > +# @multicast-overflow: multicast table is overflowed or not > > > > +# > > > > +# @unicast-overflow: unicast table is overflowed or not > > > > +# > > > > +# @main-mac: the main macaddr string > > > > +# > > > > +# @vlan-table: a list of active vlan id > > > > +# > > > > +# @unicast-table: a list of unicast macaddr string > > > > +# > > > > +# @multicast-table: a list of multicast macaddr string > > > > +# > > > > +# Since 1.6 > > > > +## > > > > + > > > > +{ 'type': 'RxFilterInfo', > > > > + 'data': { > > > > + 'name': 'str', > > > > + 'promiscuous': 'bool', > > > > + 'multicast': 'RxState', > > > > + 'unicast': 'RxState', > > > > + 'broadcast-allowed': 'bool', > > > > + 'multicast-overflow': 'bool', > > > > + 'unicast-overflow': 'bool', > > > > + 'main-mac': 'str', > > > > + 'vlan-table': ['int'], > > > > + 'unicast-table': ['str'], > > > > + 'multicast-table': ['str'] }} > > > > + > > > > +## > > > > +# @query-rx-filter: > > > > +# > > > > +# Return rx-filter information for all nics (or for the given nic). > > > > +# > > > > +# @name: #optional net client name > > > > +# > > > > +# Returns: list of @RxFilterInfo for all nics (or for the given nic). > > > > +# Returns an error if the given @name doesn't exist, or given > > > > +# nic doesn't support rx-filter querying, or no net client > > > > +# supports rx-filter querying > > > > +# > > > > +# Since: 1.6 > > > > +## > > > > +{ 'command': 'query-rx-filter', 'data': { '*name': 'str' }, > > > > + 'returns': ['RxFilterInfo'] } > > > > diff --git a/qmp-commands.hx b/qmp-commands.hx > > > > index 8cea5e5..aa00a94 100644 > > > > --- a/qmp-commands.hx > > > > +++ b/qmp-commands.hx > > > > @@ -2997,3 +2997,66 @@ Example: > > > > <- { "return": {} } > > > > > > > > EQMP > > > > + { > > > > + .name = "query-rx-filter", > > > > + .args_type = "name:s?", > > > > + .mhandler.cmd_new = qmp_marshal_input_query_rx_filter, > > > > + }, > > > > + > > > > +SQMP > > > > +query-rx-filter > > > > +--------------- > > > > + > > > > +Show rx-filter information. > > > > + > > > > +Returns a json-array of rx-filter information for all nics (or for the > > > > +given nic), returning an error if the given nic doesn't exist, or > > > > +given nic doesn't support rx-filter querying, or no net client > > > > +supports rx-filter querying. > > > > + > > > > +The query will clear the event notification flag of each nic, then qemu > > > > +will start to emit event to QMP monitor. > > > > + > > > > +Each array entry contains the following: > > > > + > > > > +- "name": net client name (json-string) > > > > +- "promiscuous": promiscuous mode is enabled (json-bool) > > > > +- "multicast": multicast receive state (one of 'normal', 'none', 'all') > > > > +- "unicast": unicast receive state (one of 'normal', 'none', 'all') > > > > +- "broadcast-allowed": allow to receive broadcast (json-bool) > > > > +- "multicast-overflow": multicast table is overflowed (json-bool) > > > > +- "unicast-overflow": unicast table is overflowed (json-bool) > > > > +- "main-mac": main macaddr string (json-string) > > > > +- "vlan-table": a json-array of active vlan id > > > > +- "unicast-table": a json-array of unicast macaddr string > > > > +- "multicast-table": a json-array of multicast macaddr string > > > > + > > > > +Example: > > > > + > > > > +-> { "execute": "query-rx-filter", "arguments": { "name": "vnet0" } } > > > > +<- { "return": [ > > > > + { > > > > + "promiscuous": true, > > > > + "name": "vnet0", > > > > + "main-mac": "52:54:00:12:34:56", > > > > + "unicast": "normal", > > > > + "vlan-table": [ > > > > + 4, > > > > + 0 > > > > + ], > > > > + "unicast-table": [ > > > > + ], > > > > + "multicast": "normal", > > > > + "multicast-overflow": false, > > > > + "unicast-overflow": false, > > > > + "multicast-table": [ > > > > + "01:00:5e:00:00:01", > > > > + "33:33:00:00:00:01", > > > > + "33:33:ff:12:34:56" > > > > + ], > > > > + "broadcast-allowed": false > > > > + } > > > > + ] > > > > + } > > > > + > > > > +EQMP > >