Thanks, this patch set is a good way to incrementally work through these
changes.
I have looked at
v4-0001-pg_node_tree-Omit-serialization-of-fields-with-de.patch today.
Here are my thoughts:
I believe we had discussed offline to not omit enum fields with value 0
(WRITE_ENUM_FIELD). This is because the values of enum fields are
implementation artifacts, and this could be confusing for readers.
(This could be added as a squeeze-out-every-byte change later, but if
we're going to keep the format fit for human reading, I think we should
skip this.)
I have some concerns about the round-trippability of float values. If
we do, effectively, if (node->fldname != 0.0), then I think this would
also match negative zero, but when we read it back it, it would get
assigned positive zero. Maybe there are other edge cases like this.
Might be safer to not mess with this.
On the reading side, the macro nesting has gotten a bit out of hand. :)
We had talked earlier in the thread about the _DIRECT macros and you
said there were left over from something else you want to try, but I see
nothing else in this patch set uses this. I think this could all be
much simpler, like (omitting required punctuation)
#define READ_INT_FIELD(fldname, default)
if ((token = next_field(fldname, &length)))
local_node->fldname = atoi(token);
else
local_node->fldname = default;
where next_field() would
1. read the next token
2. if it is ":fldname", continue;
else rewind the read pointer and return NULL
3. read the next token and return that
Not only is this simpler, but it might also have better performance,
because we don't have separate pg_strtok_next() and pg_strtok() calls in
sequence.