On 1/12/24 17:39, Frode Nordahl wrote:
> Hello, Ilya,
> 
> This is a partial response as I've come across something that might
> change the approach for this specific patch, and I thought it would be
> pertinent to disclose that discovery as soon as possible.
> 
> I'll respond to the rest as I verify the finding and continue working
> through your comments on this patch.

OK.

> 
> On Fri, Jan 12, 2024 at 1:26 AM Ilya Maximets <i.maxim...@ovn.org> wrote:
>>
>> On 1/10/24 20:29, Frode Nordahl wrote:
>>> Creating and destroying JSON objects may be time consuming.
>>>
>>> Add yielding counterparts of json_serialized_object_create() and
>>> json_destroy__() functions that make use of the cooperative
>>> multitasking module to yield during processing, allowing time
>>> sensitive tasks in other parts of the program to be completed
>>> during processing.
>>>
>>> Signed-off-by: Frode Nordahl <frode.nord...@canonical.com>
>>> ---
>>>  include/openvswitch/json.h | 16 ++++++++--
>>>  lib/json.c                 | 63 ++++++++++++++++++++++++++------------
>>>  2 files changed, 56 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/include/openvswitch/json.h b/include/openvswitch/json.h
>>> index 35b403c29..39528f140 100644
>>> --- a/include/openvswitch/json.h
>>> +++ b/include/openvswitch/json.h
>>> @@ -81,6 +81,7 @@ struct json *json_boolean_create(bool);
>>>  struct json *json_string_create(const char *);
>>>  struct json *json_string_create_nocopy(char *);
>>>  struct json *json_serialized_object_create(const struct json *);
>>> +struct json *json_serialized_object_create_with_yield(const struct json *);
>>>  struct json *json_integer_create(long long int);
>>>  struct json *json_real_create(double);
>>>
>>> @@ -137,7 +138,8 @@ struct json *json_from_stream(FILE *stream);
>>>
>>>  enum {
>>>      JSSF_PRETTY = 1 << 0,       /* Multiple lines with indentation, if 
>>> true. */
>>> -    JSSF_SORT = 1 << 1          /* Object members in sorted order, if 
>>> true. */
>>> +    JSSF_SORT = 1 << 1,         /* Object members in sorted order, if 
>>> true. */
>>> +    JSSF_YIELD = 1 << 2         /* Yield during processing */
>>
>> This is a public header, but we're referencing a functionality
>> of the internal library that users of a public header can't access.
> 
> The flags are publicly available through the use of the
> json_to_string() functions flags argument, which is why I thought it
> would be natural to put this here.
> 
> OTOH, I guess it would not be natural for anyone to use it other than
> through the `_with_yield()` front-end, so I see what you mean.
> 
> 
>> It might be better to create an internal header in lib/ and
>> define this as an internal flag with a higher value (bit 7?).
>> A macro or a single-value enum should be fine.
>>
>> Also, period at the end of a comment.
>>
>>>  };
>>>  char *json_to_string(const struct json *, int flags);
>>>  void json_to_ds(const struct json *, int flags, struct ds *);
>>> @@ -158,14 +160,22 @@ json_clone(const struct json *json_)
>>>      return json;
>>>  }
>>>
>>> -void json_destroy__(struct json *json);
>>> +void json_destroy__(struct json *json, bool yield);
>>
>> Same technically applies here, but this is an internal__ function,
>> so may be fine to keep it like that, as users are not supposed to
>> call it.
>>
>>>
>>>  /* Frees 'json' and everything it points to, recursively. */
>>>  static inline void
>>>  json_destroy(struct json *json)
>>>  {
>>>      if (json && !--json->count) {
>>> -        json_destroy__(json);
>>> +        json_destroy__(json, false);
>>> +    }
>>> +}
>>> +
>>> +static inline void
>>> +json_destroy_with_yield(struct json *json)
>>> +{
>>> +    if (json && !--json->count) {
>>> +        json_destroy__(json, true);
>>>      }
>>>  }
>>
>> This last function can be moved to the internal header.
>> Along with the prototype of the json_serialized_object_create_with_yield.
> 
> If we put these in a private header, ovsdb/monitor.c would have to
> include a "json-private.h", and that does not feel right to me?

Yeah, here we have a different type of 'private' header, that's
why I used the word 'internal' and not 'private'.  json-private.h
indeed sounds like something that is only for json-*.c to include,
but I had in mind a header that in just in lib/ and not include/,
i.e. available for internal users to include, but not exported to
users of a shared library.

I suppose we can just create lib/json.h and put this stuff there.
We do have lib/flow.h and include/openvswitch/flow.h, for example.

In this case, ovsdb/monitor.c can include internal json.h.

> 
> I'll try to find some way to stash this away though if you want to
> keep it away from the public interface.
> 
>> The name is getting very long...
>>
>>>
>>> diff --git a/lib/json.c b/lib/json.c
>>> index aded8bb01..78ebabb18 100644
>>> --- a/lib/json.c
>>> +++ b/lib/json.c
>>> @@ -24,6 +24,7 @@
>>>  #include <limits.h>
>>>  #include <string.h>
>>>
>>> +#include "cooperative-multitasking.h"
>>>  #include "openvswitch/dynamic-string.h"
>>>  #include "hash.h"
>>>  #include "openvswitch/shash.h"
>>> @@ -181,14 +182,26 @@ json_string_create(const char *s)
>>>      return json_string_create_nocopy(xstrdup(s));
>>>  }
>>>
>>> -struct json *
>>> -json_serialized_object_create(const struct json *src)
>>> +static struct json *
>>> +json_serialized_object_create__(const struct json *src, int flags)
>>>  {
>>>      struct json *json = json_create(JSON_SERIALIZED_OBJECT);
>>> -    json->string = json_to_string(src, JSSF_SORT);
>>> +    json->string = json_to_string(src, flags);
>>>      return json;
>>>  }
>>>
>>> +struct json *
>>> +json_serialized_object_create(const struct json *src)
>>> +{
>>> +    return json_serialized_object_create__(src, JSSF_SORT);
>>> +}
>>> +
>>> +struct json *
>>> +json_serialized_object_create_with_yield(const struct json *src)
>>> +{
>>> +    return json_serialized_object_create__(src, JSSF_SORT | JSSF_YIELD);
>>> +}
>>> +
>>>  struct json *
>>>  json_array_create_empty(void)
>>>  {
>>> @@ -360,20 +373,20 @@ json_integer(const struct json *json)
>>>      return json->integer;
>>>  }
>>>
>>> -static void json_destroy_object(struct shash *object);
>>> -static void json_destroy_array(struct json_array *array);
>>> +static void json_destroy_object(struct shash *object, bool yield);
>>> +static void json_destroy_array(struct json_array *array, bool yield);
>>>
>>>  /* Frees 'json' and everything it points to, recursively. */
>>>  void
>>> -json_destroy__(struct json *json)
>>> +json_destroy__(struct json *json, bool yield)
>>>  {
>>>      switch (json->type) {
>>>      case JSON_OBJECT:
>>> -        json_destroy_object(json->object);
>>> +        json_destroy_object(json->object, yield);
>>>          break;
>>>
>>>      case JSON_ARRAY:
>>> -        json_destroy_array(&json->array);
>>> +        json_destroy_array(&json->array, yield);
>>>          break;
>>>
>>>      case JSON_STRING:
>>> @@ -395,13 +408,16 @@ json_destroy__(struct json *json)
>>>  }
>>>
>>>  static void
>>> -json_destroy_object(struct shash *object)
>>> +json_destroy_object(struct shash *object, bool yield)
>>>  {
>>>      struct shash_node *node;
>>>
>>>      SHASH_FOR_EACH_SAFE (node, object) {
>>>          struct json *value = node->data;
>>>
>>> +        if (yield) {
>>> +            cooperative_multitasking_yield();
>>> +        }
>>
>> Should this be moved to the top of the function?
>>
>>>          json_destroy(value);
>>
>> Should be a conditional call to with_yield() version as well?
>> Inner objects can be huge.
>>
>> AFAIU the code here, only objects yield.  Other types of JSON
>> do not yield.  And that is fine.  However, if the object only
>> contains simple elements, yielding on each of them is likely
>> unnecessary.  If elements of the object are objects themselves,
>> then they will yield.
>>
>> For example, in case of a JSON-formatted database, we will yield
>> once per row in this case.  IIUC, current patch will yield once
>> per table, because an update is an array of table objects that
>> contain row objects.
>>
>> What do you think?
>>
>>>          shash_delete(object, node);
>>>      }
>>> @@ -410,12 +426,13 @@ json_destroy_object(struct shash *object)
>>>  }
>>>
>>>  static void
>>> -json_destroy_array(struct json_array *array)
>>> +json_destroy_array(struct json_array *array, bool yield)
>>>  {
>>>      size_t i;
>>>
>>>      for (i = 0; i < array->n; i++) {
>>> -        json_destroy(array->elems[i]);
>>> +        yield ? json_destroy_with_yield(array->elems[i]) :
>>> +                json_destroy(array->elems[i]);
>>
>> Please, use an if statement instead.
>>
>>>      }
>>>      free(array->elems);
>>>  }
>>> @@ -1525,7 +1542,7 @@ static void json_serialize_object(const struct shash 
>>> *object,
>>>                                    struct json_serializer *);
>>>  static void json_serialize_array(const struct json_array *,
>>>                                   struct json_serializer *);
>>> -static void json_serialize_string(const char *, struct ds *);
>>> +static void json_serialize_string(const char *, struct json_serializer *);
>>>
>>>  /* Converts 'json' to a string in JSON format, encoded in UTF-8, and 
>>> returns
>>>   * that string.  The caller is responsible for freeing the returned string,
>>> @@ -1598,7 +1615,7 @@ json_serialize(const struct json *json, struct 
>>> json_serializer *s)
>>>          break;
>>>
>>>      case JSON_STRING:
>>> -        json_serialize_string(json->string, ds);
>>> +        json_serialize_string(json->string, s);
>>>          break;
>>>
>>>      case JSON_SERIALIZED_OBJECT:
>>> @@ -1631,7 +1648,7 @@ json_serialize_object_member(size_t i, const struct 
>>> shash_node *node,
>>>          indent_line(s);
>>>      }
>>>
>>> -    json_serialize_string(node->name, ds);
>>> +    json_serialize_string(node->name, s);
>>>      ds_put_char(ds, ':');
>>>      if (s->flags & JSSF_PRETTY) {
>>>          ds_put_char(ds, ' ');
>>> @@ -1734,7 +1751,7 @@ static const char *chars_escaping[256] = {
>>>  };
>>>
>>>  static void
>>> -json_serialize_string(const char *string, struct ds *ds)
>>> +json_serialize_string(const char *string, struct json_serializer *s)
>>>  {
>>>      uint8_t c;
>>>      uint8_t c2;
>>> @@ -1742,26 +1759,32 @@ json_serialize_string(const char *string, struct ds 
>>> *ds)
>>>      const char *escape;
>>>      const char *start;
>>>
>>> -    ds_put_char(ds, '"');
>>> +    ds_put_char(s->ds, '"');
>>>      count = 0;
>>>      start = string;
>>>      while ((c = *string++) != '\0') {
>>> +        if (s->flags & JSSF_YIELD) {
>>> +            cooperative_multitasking_yield();
>>> +        }
>>
>> This looks like an overkill.  Here we're yielding per character
>> in every JSON string.  That's a lot.
>> Can we yield per object like it's done for a destruction instead?
> 
> Looking at this again, it looks like I have mixed up the code paths of
> the type JSON_SERIALIZED_OBJECT with JSON_STRING so this is most
> likely not needed.
> 
> That also explains one overrun source I was not able to identify
> before running out of time to send the initial patch set for review,
> sorry about that!
> 
> We most likely need a chunked version of ds_put_cstr() instead though,
> as it is what most likely steals the time instead [0]. Will get back
> as soon as I've verified.

I'm not convinced this is a problem.  Pre-serialized objects are
copied with a single memcpy() call.  It should be fast enough even
if the string is 100MB long.  I'm more concerned about not yielding
while serializing objects that have large sub-objects.

> 
> 0: 
> https://github.com/openvswitch/ovs/blob/60457a5e9ddc33809139e91b08634eacd766abb2/lib/json.c#L1605
> 
> --
> Frode Nordahl
> 
>>>          if (c >= ' ' && c != '"' && c != '\\') {
>>>              count++;
>>>          } else {
>>>              if (count) {
>>> -                ds_put_buffer(ds, start, count);
>>> +                ds_put_buffer(s->ds, start, count);
>>>                  count = 0;
>>>              }
>>>              start = string;
>>>              escape = chars_escaping[c];
>>>              while ((c2 = *escape++) != '\0') {
>>> -                ds_put_char(ds, c2);
>>> +                if (s->flags & JSSF_YIELD) {
>>> +                    cooperative_multitasking_yield();
>>> +                }
>>> +                ds_put_char(s->ds, c2);
>>>              }
>>>          }
>>>      }
>>>      if (count) {
>>> -        ds_put_buffer(ds, start, count);
>>> +        ds_put_buffer(s->ds, start, count);
>>>      }
>>> -    ds_put_char(ds, '"');
>>> +    ds_put_char(s->ds, '"');
>>>  }
>>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to