On Thu, Nov 19, 2009 at 10:28 AM, Doug Cutting <cutt...@apache.org> wrote:
> Matt Massie wrote:
>
>> I'd recommend we consider using the
>> approach more generally.
>>
>
> For stuff besides parsing it'd be more complicated to share test data.
>
> Perhaps we could have three files per schema: the schema, a data file
> containing binary instances of that schema and a data file containing json
> instances of that schema.  The generic tester could read the data files and
> check that their instances match.
>

The main thing we need is a standardized directory structure.  For example...

tests/
  pass/
    test_a/
      schema.json
      json_data/
        one_test_data.json
        two_test_data.json
        etc_test_data.json
      avro_data/
        binary_test_data.bin
        another_binary_test_data_file.bin
    test_b/
      ...
  fail/

Of course, there would be no data tests for invalid schemas.

Once we agree on a directory structure, we can all write our code once and
then focus on populating our test tree.

> Or we could standardize on a simple random datum generator.  Then we could
> have, for each schema, random seeds paired with binary instances.  Then one
> can generate data structure instances given the seed and check that they're
> equal to the instance read from the test file, etc.  This would be a
> stronger test.
>

Once we agree on the directory structure, we can have all the Avro
implementations generate test data any number of different ways.

+1 on the random seed idea.

-Matt
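
To make the "write our code once" point concrete, here is a minimal Python sketch of how a generic tester might enumerate the tree proposed above. It assumes the layout exactly as drawn (pass/ and fail/ groups, one schema.json per test, optional json_data/ and avro_data/ directories, .json and .bin extensions). The names `Case` and `discover_cases` are hypothetical and not part of any Avro implementation, and the sketch only locates files; decoding and comparing instances is left to each implementation.

```python
from pathlib import Path
from typing import Iterator, List, NamedTuple


class Case(NamedTuple):
    expectation: str          # "pass" or "fail"
    name: str                 # directory name, e.g. "test_a"
    schema: Path              # path to schema.json
    json_files: List[Path]    # JSON-encoded instances (empty for fail/)
    binary_files: List[Path]  # binary-encoded instances (empty for fail/)


def _list_files(directory: Path, pattern: str) -> List[Path]:
    """Return sorted matches, or an empty list if the directory is absent."""
    return sorted(directory.glob(pattern)) if directory.is_dir() else []


def discover_cases(root) -> Iterator[Case]:
    """Yield one Case per test directory under <root>/pass and <root>/fail."""
    root = Path(root)
    for expectation in ("pass", "fail"):
        group = root / expectation
        if not group.is_dir():
            continue
        for case_dir in sorted(p for p in group.iterdir() if p.is_dir()):
            schema = case_dir / "schema.json"
            if not schema.is_file():
                continue  # skip directories that are not test cases
            json_files: List[Path] = []
            binary_files: List[Path] = []
            # Invalid schemas have no data tests, so only pass/ carries data.
            if expectation == "pass":
                json_files = _list_files(case_dir / "json_data", "*.json")
                binary_files = _list_files(case_dir / "avro_data", "*.bin")
            yield Case(expectation, case_dir.name, schema,
                       json_files, binary_files)
```

A tester in any language could follow the same walk: for each pass/ case, parse the schema, decode every file under json_data/ and avro_data/, and check that the instances match; for each fail/ case, check that parsing the schema is rejected.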