Re: [ovs-dev] [PATCH] ovsdb-data: Deduplicate string atoms.

Ilya Maximets Fri, 24 Sep 2021 04:01:18 -0700

On 9/24/21 11:12, Mark Gray wrote:
> On 23/09/2021 01:13, Ilya Maximets wrote:
>> ovsdb-server spends a lot of time cloning atoms for various reasons,
>> e.g. to create a diff of two rows or to clone a row to the transaction.
>> All atoms, except for strings, contains a simple value that could be
>> copied in efficient way, but duplicating strings every time has a
>> significant performance impact.
>>
>> Introducing a new reference-counted structure 'ovsdb_atom_string'
>> that allows to not copy strings every time, but just increase a
>> reference counter.
>>
>> Changes in ovsdb/ovsdb-idlc.in are a bit brute-force-like, but I
>> didn't manage to write anything better.  And this part is very hard
>> to understand and debug, so maybe it's better to keep it in this
>> more or less understandable shape.
>>
>> This change allows to increase transaction throughput in benchmarks
>> up to 2x for standalone databases and 3x for clustered databases, i.e.
>> number of transactions that ovsdb-server can handle per second.
>> It also noticeably reduces memory consumption of ovsdb-server.
>>
> 
> Great!
> 
>> Next step will be to consolidate this structure with json strings,
>> so we will not need to duplicate strings while converting database
>> objects to json and back.
>>
>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
>> ---
>>
>> There is a small conflict in lib/db-ctl-base.c with the 'set
>> optimizations' patch-set, but it can be easily resolved if
>> someone wants to try these patches together>
>>  lib/db-ctl-base.c      |  11 +--
>>  lib/ovsdb-cs.c         |   2 +-
>>  lib/ovsdb-data.c       |  45 ++++++-----
>>  lib/ovsdb-data.h       |  29 ++++++-
>>  lib/ovsdb-idl.c        |   3 +-
>>  ovsdb/ovsdb-idlc.in    | 179 ++++++++++++++++++++++++++++-------------
>>  ovsdb/ovsdb-server.c   |   6 +-
>>  ovsdb/ovsdb-util.c     |  16 ++--
>>  ovsdb/rbac.c           |   8 +-
>>  python/ovs/db/data.py  |  43 +++++++---
>>  python/ovs/db/types.py |   7 +-
>>  tests/test-ovsdb.c     |  14 ++--
>>  12 files changed, 244 insertions(+), 119 deletions(-)
>>
>> diff --git a/lib/db-ctl-base.c b/lib/db-ctl-base.c
>> index 77cc76a9f..713487797 100644
>> --- a/lib/db-ctl-base.c
>> +++ b/lib/db-ctl-base.c
>> @@ -247,15 +247,15 @@ record_id_equals(const union ovsdb_atom *name, enum 
>> ovsdb_atomic_type type,
>>                   const char *record_id)
>>  {
>>      if (type == OVSDB_TYPE_STRING) {
>> -        if (!strcmp(name->string, record_id)) {
>> +        if (!strcmp(name->s->string, record_id)) {
>>              return true;
>>          }
>>  
>>          struct uuid uuid;
>>          size_t len = strlen(record_id);
>>          if (len >= 4
>> -            && uuid_from_string(&uuid, name->string)
>> -            && !strncmp(name->string, record_id, len)) {
>> +            && uuid_from_string(&uuid, name->s->string)
>> +            && !strncmp(name->s->string, record_id, len)) {
>>              return true;
>>          }
>>  
>> @@ -318,11 +318,12 @@ get_row_by_id(struct ctl_context *ctx,
>>          if (!id->key) {
>>              name = datum->n == 1 ? &datum->keys[0] : NULL;
>>          } else {
>> -            const union ovsdb_atom key_atom
>> -                = { .string = CONST_CAST(char *, id->key) };
>> +            union ovsdb_atom key_atom = {
>> +                .s = ovsdb_atom_string_create(CONST_CAST(char *, id->key)) 
>> };
>>              unsigned int i = ovsdb_datum_find_key(datum, &key_atom,
>>                                                    OVSDB_TYPE_STRING);
>>              name = i == UINT_MAX ? NULL : &datum->values[i];
>> +            ovsdb_atom_destroy(&key_atom, OVSDB_TYPE_STRING);
>>          }
>>          if (!name) {
>>              continue;
>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
>> index 659d49dbf..fcb6fe1b3 100644
>> --- a/lib/ovsdb-cs.c
>> +++ b/lib/ovsdb-cs.c
>> @@ -1833,7 +1833,7 @@ server_column_get_string(const struct server_row *row,
>>  {
>>      ovs_assert(server_columns[index].type.key.type == OVSDB_TYPE_STRING);
>>      const struct ovsdb_datum *d = &row->data[index];
>> -    return d->n == 1 ? d->keys[0].string : default_value;
>> +    return d->n == 1 ? d->keys[0].s->string : default_value;
>>  }
>>  
>>  static bool
>> diff --git a/lib/ovsdb-data.c b/lib/ovsdb-data.c
>> index 1f491a98b..0ed8f01b6 100644
>> --- a/lib/ovsdb-data.c
>> +++ b/lib/ovsdb-data.c
>> @@ -74,7 +74,7 @@ ovsdb_atom_init_default(union ovsdb_atom *atom, enum 
>> ovsdb_atomic_type type)
>>          break;
>>  
>>      case OVSDB_TYPE_STRING:
>> -        atom->string = xmemdup("", 1);
>> +        atom->s = ovsdb_atom_string_create_nocopy(xmemdup("", 1));
>>          break;
>>  
>>      case OVSDB_TYPE_UUID:
>> @@ -136,7 +136,7 @@ ovsdb_atom_is_default(const union ovsdb_atom *atom,
>>          return atom->boolean == false;
>>  
>>      case OVSDB_TYPE_STRING:
>> -        return atom->string[0] == '\0';
>> +        return atom->s->string[0] == '\0';
> 
> perhaps a helper function would allow you to hide some of this from a
> client. e.g.
> 
> char *ovsdb_atom_string_get(struct ovsdb_atom)


I thought about this, but we have only a handful of places where it is
used.  Another thing is that I want to replace it with json object later
that will have a json_string() access method.  So, it should be fine for
now to not introduce a method that will be replaced anyway.

Also, it seems like it should not be used at all directly, but should be
accessed only through datum/atom APIs, even though it is used in a few
places directly in current code.  So, I don't feel like making it easy
to use.

What do you think?

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ovsdb-data: Deduplicate string atoms.

Reply via email to