On Thu, Nov 19, 2009 at 10:28 AM, Doug Cutting <cutt...@apache.org> wrote:

> Matt Massie wrote:
>
>> I'd recommend we consider using the
>> approach more generally.
>>
>
> For stuff besides parsing it'd be more complicated to share test data.
>

> Perhaps we could have three files per schema: the schema, a data file
> containing binary instances of that schema and a data file containing json
> instances of that schema.  The generic tester could read the data files
> and check that their instances match.
>

The main thing we need is a standardized directory structure.  For
example...

tests/
  pass/
    test_a/
      schema.json
      json_data/
        one_test_data.json
        two_test_data.json
        etc_test_data.json
      avro_data/
        binary_test_data.bin
        another_binary_test_data_file.bin
    test_b/
      ...
  fail/

Of course, there would be no data tests for invalid schemas.  Once we agree
on a directory structure, we can all write our code once and then focus on
populating our test tree.
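
To make the "write our code once" part concrete, here is a rough sketch in
Python of how a generic tester might walk a tree laid out as above.  The
function names (discover_cases, list_files) and the returned dictionary shape
are made up for illustration and are not part of any Avro implementation; the
sketch only finds the files and does no schema parsing or data comparison.

```python
import os


def list_files(directory):
    """Return sorted file paths in a directory, or [] if it does not exist."""
    if not os.path.isdir(directory):
        return []
    paths = [os.path.join(directory, name) for name in sorted(os.listdir(directory))]
    return [path for path in paths if os.path.isfile(path)]


def discover_cases(root):
    """Collect one entry per test directory under root/pass and root/fail.

    A directory counts as a test case only if it contains schema.json.
    The json_data/ and avro_data/ subdirectories are optional, so cases
    under fail/ simply end up with empty data lists.
    """
    cases = []
    for kind in ("pass", "fail"):
        kind_dir = os.path.join(root, kind)
        if not os.path.isdir(kind_dir):
            continue
        for name in sorted(os.listdir(kind_dir)):
            case_dir = os.path.join(kind_dir, name)
            schema_path = os.path.join(case_dir, "schema.json")
            if not os.path.isfile(schema_path):
                continue
            cases.append({
                "kind": kind,
                "name": name,
                "schema": schema_path,
                "json_data": list_files(os.path.join(case_dir, "json_data")),
                "avro_data": list_files(os.path.join(case_dir, "avro_data")),
            })
    return cases
```

Each implementation would then loop over discover_cases("tests") and plug in
its own schema parser and readers.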


> Or we could standardize on a simple random datum generator.  Then we could
> have, for each schema, random seeds paired with binary instances.  Then one
> can generate datastructure instances given the seed and check that they're
> equal to the instance read from the test file, etc.  This would be a
> stronger test.
>

Once we agree on the directory structure, we can have all the Avro
implementations generate test data any number of different ways.  +1 on the
random seed idea.
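
For the seed idea, a minimal sketch of what a seeded generator could look
like, again in Python.  The random_datum function is hypothetical and covers
only a handful of types (no unions, maps, enums, bytes, etc.).  It also leans
on Python's own random.Random, so the same seed reproduces the same datum only
within Python; for seeds to be shared across implementations we would first
have to agree on a PRNG and on the order in which values are drawn.

```python
import random
import string


def random_datum(schema, rng):
    """Generate a datum for a small, simplified subset of Avro schemas.

    schema is either a type name such as "int" or a dict with a "type" key;
    rng is a random.Random instance, so the result depends only on its seed.
    """
    if isinstance(schema, str):
        kind = schema
    elif isinstance(schema, dict):
        kind = schema["type"]
    else:
        raise ValueError("unsupported schema in this sketch: %r" % (schema,))

    if kind == "null":
        return None
    if kind == "boolean":
        return rng.random() < 0.5
    if kind == "int":
        return rng.randint(-2**31, 2**31 - 1)
    if kind == "long":
        return rng.randint(-2**63, 2**63 - 1)
    if kind == "string":
        length = rng.randint(0, 8)
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    if kind == "array":
        length = rng.randint(0, 4)
        return [random_datum(schema["items"], rng) for _ in range(length)]
    if kind == "record":
        # Fields are generated in declaration order so the draw order is fixed.
        return {field["name"]: random_datum(field["type"], rng)
                for field in schema["fields"]}
    raise ValueError("unsupported type in this sketch: %r" % (kind,))
```

A test would then call random_datum(schema, random.Random(seed)) and compare
the result with the instance read back from the matching binary file.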

-Matt
