Re: [Qemu-devel] [PATCH v5 01/11] qdict: implement a qdict_crumple method for un-flattening a dict

Daniel P. Berrange Tue, 14 Jun 2016 04:43:35 -0700

On Thu, Jun 09, 2016 at 03:20:47PM +0200, Markus Armbruster wrote:
> I apologize for the lateness of this review.
> 
> "Daniel P. Berrange" <berra...@redhat.com> writes:
> 
> > The qdict_flatten() method will take a dict whose elements are
> > further nested dicts/lists and flatten them by concatenating
> > keys.
> >
> > The qdict_crumple() method aims to do the reverse, taking a flat
> > qdict, and turning it into a set of nested dicts/lists. It will
> > apply nesting based on the key name, with a '.' indicating a
> > new level in the hierarchy. If the keys in the nested structure
> > are all numeric, it will create a list, otherwise it will create
> > a dict.
> >
> > If the keys are a mixture of numeric and non-numeric, or the
> > numeric keys are not in strictly ascending order, an error will
> > be reported.
> >
> > As an example, a flat dict containing
> >
> >  {
> >    'foo.0.bar': 'one',
> >    'foo.0.wizz': '1',
> >    'foo.1.bar': 'two',
> >    'foo.1.wizz': '2'
> >  }
> >
> > will get turned into a dict with one element 'foo' whose
> > value is a list. The list elements will each in turn be
> > dicts.
> >
> >  {
> >    'foo': [
> >      { 'bar': 'one', 'wizz': '1' },
> >      { 'bar': 'two', 'wizz': '2' }
> >    ],
> >  }
> >
> > If the key is intended to contain a literal '.', then it must
> > be escaped as '..'. ie a flat dict
> >
> >   {
> >      'foo..bar': 'wizz',
> >      'bar.foo..bar': 'eek',
> >      'bar.hello': 'world'
> >   }
> >
> > Will end up as
> >
> >   {
> >      'foo.bar': 'wizz',
> >      'bar': {
> >         'foo.bar': 'eek',
> >         'hello': 'world'
> >      }
> >   }
> >
> > The intent of this function is that it allows a set of QemuOpts
> > to be turned into a nested data structure that mirrors the nesting
> > used when the same object is defined over QMP.
> >
> > Signed-off-by: Daniel P. Berrange <berra...@redhat.com>
>


> > +/**
> > + * qdict_split_flat_key:
> > + * @key: the key string to split
> > + * @prefix: non-NULL pointer to hold extracted prefix
> > + * @suffix: non-NULL pointer to hold extracted suffix
> > + *
> > + * Given a flattened key such as 'foo.0.bar', split it
> > + * into two parts at the first '.' separator. Allows
> > + * double dot ('..') to escape the normal separator.
> > + *
> > + * eg
> > + *    'foo.0.bar' -> prefix='foo' and suffix='0.bar'
> > + *    'foo..0.bar' -> prefix='foo.0' and suffix='bar'
> > + *
> > + * The '..' sequence will be unescaped in the returned
> > + * 'prefix' string. The 'suffix' string will be left
> > + * in escaped format, so it can be fed back into the
> > + * qdict_split_flat_key() key as the input later.
> 
> Why is the suffix strdup'ed then?

It doesn't need to be - i'll make it const.



> > +}
> > +
> > +
> > +/**
> > + * qdict_list_size:
> > + * @maybe_list: dict that may be only list elements
> 
> Huh?  How can a dictionary "be only list elements"?
> 
> Do you mean "the dictionary to test?"

I'll say "dict to search for keys representing list elements."

> > + *
> > + * Determine whether all keys in @maybe_list are
> > + * valid list elements. If they are all valid,
> > + * then this returns the number of elements. If
> > + * they all look like non-numeric keys, then returns
> > + * zero. If there is a mix of numeric and non-numeric
> > + * keys, then an error is set as it is both a list
> > + * and a dict at once.
> 
> This is well-defined only if empty @maybe_list is considered to have
> dict nature, not list nature.  Else, return value zero could be the
> length of the empty list or the special "has dict nature" value.
> 
> Please spell out behavior for empty @maybe_list.

Yep, empty list will imply qdict nature and so return zero.

> > + *
> > + * Returns: number of list elements, 0 if a dict, -1 on error
> 
> Awkward function name.  qdict_list_size_if_list() would be clear.
>
> But I'd simply turn this into a predicate qdict_is_list(), and have the
> caller use qdict_size() to get the number of elements.

I thought that qdict_size() would cause another iteration, but
I see now it just returns a cached size. So I'll switch to
qidct_is_list().

> > +static ssize_t qdict_list_size(QDict *maybe_list, Error **errp)
> > +{
> > +    const QDictEntry *entry, *next;
> > +    ssize_t len = 0;
> > +    ssize_t max = -1;
> > +    int is_list = -1;
> > +    int64_t val;
> > +
> > +    entry = qdict_first(maybe_list);
> > +    while (entry != NULL) {
> > +        next = qdict_next(maybe_list, entry);
> 
> Please keep the loop control in one place:
> 
>        for (entry = qdict_first(maybe_list); entry; entry = 
> qdict_next(entry)) {
> 
> I'd rename some variables for less verbiage:
> 
>        for (ent = qdict_first(dict); ent; ent = qdict_next(ent)) {
> 
> Your loop control also works when the loop body deletes @entry from
> @maybe_list.  Seeing such loop control in a function that isn't supposed
> to change the its argument makes the reviewer go "WTF?!?" :)

This pattern was left from an earlier version where all the code was in
one method. I'll change it to a for() loop now.

> 
> > +
> > +        if (qemu_strtoll(entry->key, NULL, 10, &val) == 0) {
> > +            if (is_list == -1) {
> > +                is_list = 1;
> > +            } else if (!is_list) {
> > +                error_setg(errp,
> > +                           "Cannot crumple a dictionary with a mix of list 
> > "
> > +                           "and non-list keys");
> > +                return -1;
> > +            }
> > +            len++;
> > +            if (val > max) {
> > +                max = val;
> > +            }
> > +        } else {
> > +            if (is_list == -1) {
> > +                is_list = 0;
> > +            } else if (is_list) {
> > +                error_setg(errp,
> > +                           "Cannot crumple a dictionary with a mix of list 
> > "
> > +                           "and non-list keys");
> > +                return -1;
> > +            }
> > +        }
> > +
> > +        entry = next;
> > +    }
> > +
> > +    if (is_list == -1) {
> > +        is_list = 0;
> 
> This can happen only when @maybe_list is empty.  Okay, but perhaps you'd
> like to assert(!qdict_size(maybe_list)).

Ok

> 
> > +    }
> > +
> > +    if (len != (max + 1)) {
> > +        error_setg(errp, "List indexes are not continuous, "
> > +                   "saw %zd elements but %zd largest index",
> > +                   len, max);
> > +        return -1;
> 
> contiguous?
> 
> What if we saw indexes 0, 2, 2?

That would imply that the dict had two entries with the same
key, which by definition is impossible with a dict data
structure.

> > +    }
> > +
> > +    return is_list ? len : 0;
> > +}
> > +
> > +/**
> > + * qdict_crumple:
> > + * @src: the original flat dictionary to crumple
> 
> "Flat" means all values are scalar.  Should we spell that out?

Yep, ok.

> > + * @recursive: true to recursively crumple nested dictionaries
> > + *
> > + * Takes a flat dictionary whose keys use '.' separator to
> > + * indicate nesting, and values are scalars, crumplings it
> 
> s/, crumplings/, and crumples/
> 
> > + * into a nested structure. If the @recursive parameter is
> > + * false, then only the first level of structure implied
> > + * by the keys will be crumpled. If @recursive is true,
> > + * then the input will be recursively crumpled  to expand
> > + * all levels of structure in the keys.
> > + *
> > + * To include a literal '.' in a key name, it must be escaped
> > + * as '..'
> > + *
> > + * For example, an input of:
> > + *
> > + * { 'foo.0.bar': 'one', 'foo.0.wizz': '1',
> > + *   'foo.1.bar': 'two', 'foo.1.wizz': '2' }
> > + *
> > + * will result in any output of:
> > + *
> > + * {
> > + *   'foo': [
> > + *      { 'bar': 'one', 'wizz': '1' },
> > + *      { 'bar': 'two', 'wizz': '2' }
> > + *   ],
> > + * }
> > + *
> > + * Returns: either a QDict or QList for the nested data structure
> 
> I think you should discuss how this can fail.

Will do.

> > +QObject *qdict_crumple(QDict *src, bool recursive, Error **errp)
> > +{
> > +    const QDictEntry *entry, *next;
> > +    QDict *two_level, *multi_level = NULL;
> > +    QObject *dst = NULL, *child;
> > +    ssize_t list_len;
> > +    size_t i;
> > +    char *prefix = NULL, *suffix = NULL;
> > +
> > +    two_level = qdict_new();
> > +    entry = qdict_first(src);
> > +
> > +    /* Step 1: split our totally flat dict into a two level dict */
> > +    while (entry != NULL) {
> > +        next = qdict_next(src, entry);
> 
> Again, keep the loop control in one place.
> 
> > +
> > +        if (qobject_type(entry->value) == QTYPE_QDICT ||
> > +            qobject_type(entry->value) == QTYPE_QLIST) {
> > +            error_setg(errp, "Value %s is not a scalar",
> > +                       entry->key);
> > +            goto error;
> > +        }
> > +
> > +        qdict_split_flat_key(entry->key, &prefix, &suffix);
> > +
> > +        child = qdict_get(two_level, prefix);
> > +        if (suffix) {
> > +            if (child) {
> > +                if (qobject_type(child) != QTYPE_QDICT) {
> > +                    error_setg(errp, "Key %s prefix is already set as a 
> > scalar",
> > +                               prefix);
> > +                    goto error;
> > +                }
> > +            } else {
> > +                child = QOBJECT(qdict_new());
> > +                qdict_put_obj(two_level, prefix, child);
> > +            }
> > +            qobject_incref(entry->value);
> > +            qdict_put_obj(qobject_to_qdict(child), suffix, entry->value);
> > +        } else {
> > +            if (child) {
> > +                error_setg(errp, "Key %s prefix is already set as a dict",
> > +                           prefix);
> > +                goto error;
> > +            }
> > +            qobject_incref(entry->value);
> > +            qdict_put_obj(two_level, prefix, entry->value);
> > +        }
> 
> Works, because we put only QDicts we've created ourselves (first
> qdict_put_obj() above) and values we got from @src (second
> qdict_put_obj()), and we fail when such a value isn't scalar.
> 
> > +
> > +        g_free(suffix);
> 
> As I suspected, qdict_split_flat_key() strdup'ing the suffix is useless.
> 
> > +        g_free(prefix);
> > +        suffix = prefix = NULL;
> 
> Dead stores.

No, they're not dead. The end of this method has a 'g_free(prefix)' so
we must be sure to clear this out to prevent a double-free if a later
codebase jumps to the error label.

> > +        entry = next;
> > +    }
> > +
> > +    /* Step 2: optionally process the two level dict recursively
> > +     * into a multi-level dict */
> > +    if (recursive) {
> > +        multi_level = qdict_new();
> > +        entry = qdict_first(two_level);
> > +        while (entry != NULL) {
> > +            next = qdict_next(two_level, entry);
> 
> Again, keep the loop control in one place.

OK

> 
> > +
> > +            if (qobject_type(entry->value) == QTYPE_QDICT) {
> > +                child = qdict_crumple(qobject_to_qdict(entry->value),
> > +                                      recursive, errp);
> > +                if (!child) {
> > +                    goto error;
> > +                }
> > +
> > +                qdict_put_obj(multi_level, entry->key, child);
> > +            } else {
> > +                qobject_incref(entry->value);
> > +                qdict_put_obj(multi_level, entry->key, entry->value);
> > +            }
> > +
> > +            entry = next;
> > +        }
> > +        QDECREF(two_level);
> > +    } else {
> > +        multi_level = two_level;
> > +    }
> > +    two_level = NULL;
> > +
> > +    /* Step 3: detect if we need to turn our dict into list */
> > +    list_len = qdict_list_size(multi_level, errp);
> > +    if (list_len < 0) {
> > +        goto error;
> > +    }
> > +
> > +    if (list_len) {
> > +        dst = QOBJECT(qlist_new());
> > +
> > +        for (i = 0; i < list_len; i++) {
> > +            char *key = g_strdup_printf("%zu", i);
> > +
> > +            child = qdict_get(multi_level, key);
> > +            g_free(key);
> > +            assert(child);
> 
> qdict_list_size() accepts as list index any (string) key qemu_strtoll()
> accepts.  If %zu formats it back into the same string, we'll find it
> here.  Else we die.  Please spell this out in the function contract.

OK.

> [*] I'm afraid we also die if qdict_list_size()'s "List indexes are not
> continuous" check gets fooled.  Suggest to drop that check, and replace
> this assertion by error_setg(errp, "Malformed list indexes").
> Admittedly not the nicest error message; perhaps you can come up with a
> better one.

We can't be fooled per my note earlier

> 
> > +
> > +            qobject_incref(child);
> > +            qlist_append_obj(qobject_to_qlist(dst), child);
> > +        }
> > +        QDECREF(multi_level);
> 
> Do we need
> 
>            multi_level = NULL;
> 
> here?

It isn't needed right now, since we're at the end of the method now
and nothing will touch this var again. Setting it to NULL is a
worthwhile safety net for future refactoring though.

> 
> > +    } else {
> > +        dst = QOBJECT(multi_level);
> > +    }
> > +
> > +    return dst;
> > +
> > + error:
> > +    g_free(suffix);
> > +    g_free(prefix);
> > +    QDECREF(multi_level);
> > +    QDECREF(two_level);
> > +    qobject_decref(dst);
> > +    return NULL;
> > +}
> > +
> > +
> >  /**
> >   * qdict_array_entries(): Returns the number of direct array entries if the
> >   * sub-QDict of src specified by the prefix in subqdict (or src itself for
> > diff --git a/tests/check-qdict.c b/tests/check-qdict.c
> > index a43056c..0d12f40 100644
> > --- a/tests/check-qdict.c
> > +++ b/tests/check-qdict.c
> > @@ -15,6 +15,7 @@
> >  #include "qapi/qmp/qint.h"
> >  #include "qapi/qmp/qdict.h"
> >  #include "qapi/qmp/qstring.h"
> > +#include "qapi/error.h"
> >  #include "qemu-common.h"

> > +static void qdict_crumple_test_bad_inputs(void)
> > +{
> > +    QDict *src;
> > +    Error *error = NULL;
> > +
> > +    src = qdict_new();
> > +    /* rule.0 can't be both a string and a dict */
> > +    qdict_put(src, "rule.0", qstring_from_str("fred"));
> > +    qdict_put(src, "rule.0.policy", qstring_from_str("allow"));
> > +
> > +    g_assert(qdict_crumple(src, true, &error) == NULL);
> > +    g_assert(error != NULL);
> > +    error_free(error);
> > +    error = NULL;
> > +    QDECREF(src);
> > +
> > +    src = qdict_new();
> > +    /* rule can't be both a list and a dict */
> > +    qdict_put(src, "rule.0", qstring_from_str("fred"));
> > +    qdict_put(src, "rule.a", qstring_from_str("allow"));
> > +
> > +    g_assert(qdict_crumple(src, true, &error) == NULL);
> > +    g_assert(error != NULL);
> > +    error_free(error);
> > +    error = NULL;
> > +    QDECREF(src);
> > +
> > +    src = qdict_new();
> > +    /* The input should be flat, ie no dicts or lists */
> > +    qdict_put(src, "rule.a", qdict_new());
> > +    qdict_put(src, "rule.b", qstring_from_str("allow"));
> > +
> > +    g_assert(qdict_crumple(src, true, &error) == NULL);
> > +    g_assert(error != NULL);
> > +    error_free(error);
> > +    error = NULL;
> > +    QDECREF(src);
> > +
> > +
> > +    src = qdict_new();
> > +    /* List indexes must not have gaps */
> > +    qdict_put(src, "rule.0", qdict_new());
> > +    qdict_put(src, "rule.3", qstring_from_str("allow"));
> > +
> > +    g_assert(qdict_crumple(src, true, &error) == NULL);
> > +    g_assert(error != NULL);
> > +    error_free(error);
> > +    error = NULL;
> > +    QDECREF(src);
> 
> Suggest to add test case
> 
>        /* List indexes must not have gaps (more creative version) */
>        qdict_put(src, "rule.0", ...);
>        qdict_put(src, "rule.2", ...);
>        qdict_put(src, "rule.2", ...);

That's surely impossible, as dict keys have to be unique.

> and
> 
>        /* List indexes must be in %zu format */
>        qdict_put(src, "rule.+0", ...);

Ok.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [PATCH v5 01/11] qdict: implement a qdict_crumple method for un-flattening a dict

Reply via email to