> The map of metadata key/value pairs begins with a long, then a number of
> string-key/bytes-value pairs. To be consistent with avro maps, should this
> be followed by a long of 0? The spec doesn't say explicitly, but if the
> header is described by an avro schema I would suspect yes.
>
Not sure if this is what you are talking about, but in the Python implementation (datafile.py) we define an Avro schema for the header:

"""
META_SCHEMA = schema.parse("""\
{"type": "record", "name": "org.apache.avro.file.Header",
 "fields" : [
   {"name": "magic", "type": {"type": "fixed", "name": "magic", "size": %d}},
   {"name": "meta", "type": {"type": "map", "values": "bytes"}},
   {"name": "sync", "type": {"type": "fixed", "name": "sync", "size": %d}}]}
""" % (MAGIC_SIZE, SYNC_SIZE))
"""

Also, some written container files should show up in https://issues.apache.org/jira/browse/AVRO-230 real soon now.

Thanks,
Jeff
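
P.S. To make the "long of 0" question concrete, here is a rough sketch of how a map of string keys to bytes values is laid out under the general Avro binary encoding: a block count, the key/value pairs, then a zero count that ends the map. The helper names (encode_long, encode_bytes, encode_string, encode_map) are made up for illustration and are not taken from datafile.py. It only shows what "consistent with avro maps" would mean if the header is written through the schema above; it is not a statement of what any particular writer emits.

```python
def encode_long(n):
    """Zigzag-encode a signed 64-bit value, then emit it as a varint."""
    z = (n << 1) ^ (n >> 63)          # zigzag: small magnitudes -> small codes
    out = bytearray()
    while z & ~0x7F:                  # more than 7 bits left
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(data):
    """Avro bytes: a long length followed by the raw bytes."""
    return encode_long(len(data)) + data


def encode_string(text):
    """Avro string: a long length followed by the UTF-8 bytes."""
    return encode_bytes(text.encode("utf-8"))


def encode_map(meta):
    """Avro map as a single block, terminated by a block count of zero."""
    out = bytearray()
    if meta:
        out += encode_long(len(meta))      # number of pairs in this block
        for key, value in meta.items():
            out += encode_string(key)
            out += encode_bytes(value)
    out += encode_long(0)                  # the trailing "long of 0"
    return bytes(out)
```

With this layout an empty map is just the single terminating byte, and a reader knows the map is finished when it reads a block count of zero.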