This is an automated email from the ASF dual-hosted git repository.
rskraba pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/avro.git
The following commit(s) were added to refs/heads/master by this push:
new afe8fa1 AVRO-3028: Records encode fields that equal the default (#1065)
afe8fa1 is described below
commit afe8fa1adfbed7971c077338cd9441b14503507a
Author: Juan Cruz Viotti <[email protected]>
AuthorDate: Wed Jan 27 10:26:48 2021 -0400
AVRO-3028: Records encode fields that equal the default (#1065)
I believe that this is an important aspect to clarify as some other
serialization formats omit fields that equal their default for
space-efficiency reasons. I had to run a small experiment as I could not
find this information in the spec.
Signed-off-by: Juan Cruz Viotti <[email protected]>
---
doc/src/content/xdocs/spec.xml | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/doc/src/content/xdocs/spec.xml b/doc/src/content/xdocs/spec.xml
index 2cf3f8a..e65cf3b 100644
--- a/doc/src/content/xdocs/spec.xml
+++ b/doc/src/content/xdocs/spec.xml
@@ -104,14 +104,17 @@
for users (optional).</li>
<li><code>type:</code> a <a href="#schemas">schema</a>, as
defined above</li>
<li><code>default:</code> A default value for this
- field, used when reading instances that lack this
- field (optional). Permitted values depend on the
- field's schema type, according to the table below.
- Default values for union fields correspond to the
- first schema in the union. Default values for bytes
- and fixed fields are JSON strings, where Unicode
- code points 0-255 are mapped to unsigned 8-bit byte
- values 0-255.
+ field, only used when reading instances that lack
+ the field for schema evolution purposes. The
+ presence of a default value does not make the
+ field optional at encoding time. Permitted values
+ depend on the field's schema type, according to the
+ table below. Default values for union fields correspond
+ to the first schema in the union. Default values for bytes
+ and fixed fields are JSON strings, where Unicode
+ code points 0-255 are mapped to unsigned 8-bit byte
+ values 0-255. Avro encodes a field even if its
+ value is equal to its default.
<table class="right">
<caption>field default values</caption>
<tr><th>avro type</th><th>json
type</th><th>example</th></tr>
@@ -564,7 +567,7 @@
followed by the serialized string:
<source>02 02 61</source></li>
</ul>
- <p><em>NOTE</em>: Currently for C/C++ implementtions, the positions are practically an int, but theoretically a long.
+ <p><em>NOTE</em>: Currently for C/C++ implementations, the positions are practically an int, but theoretically a long.
In reality, we don't expect unions with 215M members </p>
</section>
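
The behaviour this commit documents can be illustrated with a small hand-rolled sketch of the Avro binary encoding. It is not the Avro library and not the experiment mentioned in the commit message; the FIELDS schema, the helper names, and the datum are all hypothetical. It only assumes the encoding rules from the spec: longs are zig-zag varints, strings are a long length followed by UTF-8 bytes, and a record is the concatenation of its fields in schema order.

```python
def zigzag_varint(n: int) -> bytes:
    """Encode an integer as Avro does for int/long: zig-zag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zig-zag mapping, valid for the 64-bit range
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as a long byte length followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data


# Hypothetical record schema, for illustration only: (name, type, default).
FIELDS = [
    ("count", "long", 0),
    ("name", "string", "a"),
]

ENCODERS = {"long": zigzag_varint, "string": encode_string}


def encode_record(datum: dict) -> bytes:
    """Write every field in schema order; the default is never consulted here."""
    out = bytearray()
    for name, type_name, _default in FIELDS:
        out += ENCODERS[type_name](datum[name])
    return bytes(out)


# Both values equal their defaults, yet both are still written.
encoded = encode_record({"count": 0, "name": "a"})
print(encoded.hex(" "))  # 00 02 61

# The union example from the second hunk: ["null","string"] holding "a"
# is the branch index 1 followed by the serialized string.
union_a = zigzag_varint(1) + encode_string("a")
print(union_a.hex(" "))  # 02 02 61
```

The record comes out as three bytes rather than zero, which is the point of the clarified wording: the default only matters to a reader resolving a schema that lacks the field.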