Yang Hongyang <yan...@cn.fujitsu.com> writes: > This filter is to buffer/release packets, this feature can be used > when using MicroCheckpointing, or other Remus like VM FT solutions, you
What's "Remus"? > can also use it to simulate the network delay. > It has an interval option, if supplied, this filter will release > packets by interval. Suggest "will delay packets by that time interval." Is interval really optional? > > Usage: > -netdev tap,id=bn0 > -object filter-buffer,id=f0,netdev=bn0,chain=in,interval=1000 > > NOTE: > the scale of interval is microsecond. Perhaps "interval is in microseconds". > > Signed-off-by: Yang Hongyang <yan...@cn.fujitsu.com> > --- > v11: add a fixme comment from Jason > v10: use NetQueue flush api to flush packets > sent_cb can not be called when we already return size > v9: adjustment due to the qapi change > v7: use QTAILQ_FOREACH_SAFE() when flush packets > v6: move the interval check earlier and some comment adjust > v5: remove dummy sent_cb > change interval type from int64 to uint32 > check interval!=0 when initialise > rename FILTERBUFFERState to FilterBufferState > v4: remove bh > pass the packet to next filter instead of receiver > v3: check packet's sender and sender->peer when flush it > --- > net/Makefile.objs | 1 + > net/filter-buffer.c | 170 > ++++++++++++++++++++++++++++++++++++++++++++++++++++ > qemu-options.hx | 18 ++++++ > vl.c | 7 ++- > 4 files changed, 195 insertions(+), 1 deletion(-) > create mode 100644 net/filter-buffer.c > > diff --git a/net/Makefile.objs b/net/Makefile.objs > index 914aec0..5fa2f97 100644 > --- a/net/Makefile.objs > +++ b/net/Makefile.objs > @@ -14,3 +14,4 @@ common-obj-$(CONFIG_SLIRP) += slirp.o > common-obj-$(CONFIG_VDE) += vde.o > common-obj-$(CONFIG_NETMAP) += netmap.o > common-obj-y += filter.o > +common-obj-y += filter-buffer.o > diff --git a/net/filter-buffer.c b/net/filter-buffer.c > new file mode 100644 > index 0000000..ef94e91 > --- /dev/null > +++ b/net/filter-buffer.c > @@ -0,0 +1,170 @@ > +/* > + * Copyright (c) 2015 FUJITSU LIMITED > + * Author: Yang Hongyang <yan...@cn.fujitsu.com> > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or > + * later. See the COPYING file in the top-level directory. > + */ > + > +#include "net/filter.h" > +#include "net/queue.h" > +#include "qemu-common.h" > +#include "qemu/timer.h" > +#include "qemu/iov.h" > +#include "qapi/qmp/qerror.h" > +#include "qapi-visit.h" > +#include "qom/object.h" > + > +#define TYPE_FILTER_BUFFER "filter-buffer" > + > +#define FILTER_BUFFER(obj) \ > + OBJECT_CHECK(FilterBufferState, (obj), TYPE_FILTER_BUFFER) > + > +struct FilterBufferState { > + NetFilterState parent_obj; > + > + NetQueue *incoming_queue; > + uint32_t interval; > + QEMUTimer release_timer; > +}; > +typedef struct FilterBufferState FilterBufferState; Again, not splitting the declaration is more concise. > + > +static void filter_buffer_flush(NetFilterState *nf) > +{ > + FilterBufferState *s = FILTER_BUFFER(nf); > + > + if (!qemu_net_queue_flush(s->incoming_queue)) { > + /* Unable to empty the queue, purge remaining packets */ > + qemu_net_queue_purge(s->incoming_queue, nf->netdev); > + } > +} This either flushes or purges incoming_queue, where "purge" means dropping packets. Correct? > + > +static void filter_buffer_release_timer(void *opaque) > +{ > + NetFilterState *nf = opaque; > + FilterBufferState *s = FILTER_BUFFER(nf); Style nit: blank line between declarations and statements, please. > + filter_buffer_flush(nf); Is purging correct here? > + timer_mod(&s->release_timer, > + qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval); Timer rearmed to fire again in s->interval microseconds. > +} > + > +/* filter APIs */ > +static ssize_t filter_buffer_receive_iov(NetFilterState *nf, > + NetClientState *sender, > + unsigned flags, > + const struct iovec *iov, > + int iovcnt, > + NetPacketSent *sent_cb) > +{ > + FilterBufferState *s = FILTER_BUFFER(nf); > + > + /* > + * we return size when buffer a packet, the sender will take it as > + * a already sent packet, so sent_cb should not be called later Humor me: when a comment has multiple sentences, start each one with a capital letter, and end it with punctuation. > + * FIXME: even if guest can't receive packet for some reasons. Filter > + * can still accept packet until its internal queue is full. > + */ I'm not sure I understand the comment. > + qemu_net_queue_append_iov(s->incoming_queue, sender, flags, > + iov, iovcnt, NULL); > + return iov_size(iov, iovcnt); > +} > + > +static void filter_buffer_cleanup(NetFilterState *nf) > +{ > + FilterBufferState *s = FILTER_BUFFER(nf); > + > + if (s->interval) { > + timer_del(&s->release_timer); > + } > + > + /* flush packets */ > + if (s->incoming_queue) { > + filter_buffer_flush(nf); I guess purging is correct here. > + g_free(s->incoming_queue); > + } > +} > + > +static void filter_buffer_setup(NetFilterState *nf, Error **errp) > +{ > + FilterBufferState *s = FILTER_BUFFER(nf); > + > + /* > + * this check should be dropped when there're VM FT solutions like MC > + * or COLO use this filter to release packets on demand. > + */ If you end a sentence with a period, you get to start it with a capital letter :) "there're" is odd. Perhaps something like "We may want to accept zero interval when VM FT solutions like MC or COLO use this filter to release packets on demand." > + if (!s->interval) { > + error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "interval", > + "a non-zero interval"); How can this happen? Doesn't filter_buffer_set_interval() catch zero intervals already? > + return; > + } > + > + s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf); > + if (s->interval) { Condition cannot be false. Same in filter_buffer_cleanup(). > + timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL, > + filter_buffer_release_timer, nf); > + timer_mod(&s->release_timer, > + qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval); Timer armed to fire in s->interval microseconds. Unless I misunderstand something, this doesn't actually delay each packet by s->interval microseconds, it batches packet delivery: all packets arriving in a given interval are delayed until the end of the interval. Correct? > + } > +} > + > +static void filter_buffer_class_init(ObjectClass *oc, void *data) > +{ > + NetFilterClass *nfc = NETFILTER_CLASS(oc); > + > + nfc->setup = filter_buffer_setup; > + nfc->cleanup = filter_buffer_cleanup; > + nfc->receive_iov = filter_buffer_receive_iov; > +} > + > +static void filter_buffer_get_interval(Object *obj, Visitor *v, void *opaque, > + const char *name, Error **errp) > +{ > + FilterBufferState *s = FILTER_BUFFER(obj); > + uint32_t value = s->interval; > + > + visit_type_uint32(v, &value, name, errp); > +} > + > +static void filter_buffer_set_interval(Object *obj, Visitor *v, void *opaque, > + const char *name, Error **errp) > +{ > + FilterBufferState *s = FILTER_BUFFER(obj); > + Error *local_err = NULL; > + uint32_t value; > + > + visit_type_uint32(v, &value, name, &local_err); > + if (local_err) { > + goto out; > + } > + if (!value) { > + error_setg(&local_err, "Property '%s.%s' doesn't take value '%" > + PRIu32 "'", object_get_typename(obj), name, value); What does it take? That's what the user wants to know... Perhaps "Property '%s.%s' requires a positive value". > + goto out; > + } > + s->interval = value; > + > +out: > + error_propagate(errp, local_err); > +} > + > +static void filter_buffer_init(Object *obj) > +{ > + object_property_add(obj, "interval", "int", > + filter_buffer_get_interval, > + filter_buffer_set_interval, NULL, NULL, NULL); > +} > + > +static const TypeInfo filter_buffer_info = { > + .name = TYPE_FILTER_BUFFER, > + .parent = TYPE_NETFILTER, > + .class_init = filter_buffer_class_init, > + .instance_init = filter_buffer_init, > + .instance_size = sizeof(FilterBufferState), > +}; > + > +static void register_types(void) > +{ > + type_register_static(&filter_buffer_info); > +} > + > +type_init(register_types); > diff --git a/qemu-options.hx b/qemu-options.hx > index 7e147b8..b09f97f 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -3642,6 +3642,24 @@ in PEM format, in filenames @var{ca-cert.pem}, > @var{ca-crl.pem} (optional), > @var{server-cert.pem} (only servers), @var{server-key.pem} (only servers), > @var{client-cert.pem} (only clients), and @var{client-key.pem} (only > clients). > > +@item -object > filter-buffer,id=@var{id},netdev=@var{netdevid}[,chain=@var{all|in|out}][,interval=@var{t}] > + > +Buffer network packets on netdev @var{netdevid}. > +If interval @var{t} provided, will release packets by interval. > +Interval scale: microsecond. > + > +If interval @var{t} not provided, you have to make sure the packets can be Are you sure omitting t works? filter_buffer_set_interval() rejects zero... > +released, either by manually remove this filter or call the release buffer > API, > +otherwise, the packets will be buffered forever. Use with caution. Please wrap your lines a bit earlier. > + > +chain @var{all|in|out} is an option that can be applied to any netfilter, > default is @option{all}. > + > +@option{all} means this filter will receive packets both sent to/from the > netdev > + > +@option{in} means this filter will receive packets sent to the netdev > + > +@option{out} means this filter will receive packets sent from the netdev > + I'd describe this like "filter is inserted in the receive / transmit queue". > @end table > > ETEXI > diff --git a/vl.c b/vl.c > index ec589e2..3cf89d5 100644 > --- a/vl.c > +++ b/vl.c > @@ -2794,7 +2794,12 @@ static bool object_create_initial(const char *type) > if (g_str_equal(type, "rng-egd")) { > return false; > } > - /* TODO: return false for concrete netfilters */ > + > + /* return false for concrete netfilters */ I find this comment useless, please drop it :) > + if (g_str_equal(type, "filter-buffer")) { > + return false; > + } > + > return true; > }