On Wed, Oct 9, 2019 at 12:11 PM Andy Grove <andygrov...@gmail.com> wrote:

> I'm very interested in helping to find a solution to this because we really
> do need integration tests for Rust to make sure we're compatible with other
> implementations... there is also the ongoing CI dockerization work that I
> feel is related.
>
> I haven't looked at the current integration tests yet and would appreciate
> some pointers on how all of this works (do we have docs?) or where to start
> looking.
>
I have a test in my latest PR: https://github.com/apache/arrow/pull/5523
And here is the generated data:
https://github.com/apache/arrow-testing/pull/11
As for the program that generates this data, it's just a simple Java program.
I'm not sure whether we need to integrate it into Arrow.

>
> I imagine the integration test could follow the approach that Renjie is
> outlining where we call Java to generate some files and then call Rust to
> parse them?
>
> Thanks,
>
> Andy.
>
> On Tue, Oct 8, 2019 at 9:48 PM Renjie Liu <liurenjie2...@gmail.com> wrote:
>
> > Hi:
> >
> > I'm developing a Rust reader that reads Parquet into Arrow arrays.
> > To verify the correctness of this reader, I use the following approach:
> >
> >
> >    1. Define the schema with protobuf.
> >    2. Generate JSON data for this schema using another language with a
> >    more sophisticated implementation (e.g. Java).
> >    3. Generate Parquet data for this schema using another language with a
> >    more sophisticated implementation (e.g. Java).
> >    4. Write tests that read the JSON file and the Parquet file into memory
> >    (Arrow arrays), then compare the JSON data with the Arrow data.
> >
> > I think with this method we can guarantee the correctness of the Arrow
> > reader, because the JSON format is ubiquitous and its implementations
> > are more stable.
> >
> > Any comments are appreciated.
> >
>
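
For reference, the comparison in step 4 of the quoted approach could look
roughly like the sketch below. It is only an illustration under stated
assumptions: it uses plain std types rather than the arrow or parquet
crates, and first_mismatch is a hypothetical helper name, not something in
the PR. Each column is modelled as a slice of Option values, where None
stands for a null.

```rust
// Hypothetical helper, not part of the arrow or parquet crates.
// `expected` stands for one column decoded from the JSON file,
// `actual` for the same column decoded from the Parquet file.
// `None` models a null value in either column.
fn first_mismatch<T: PartialEq>(
    expected: &[Option<T>],
    actual: &[Option<T>],
) -> Option<usize> {
    // Compare row by row over the range both columns cover.
    let common = expected.len().min(actual.len());
    for i in 0..common {
        if expected[i] != actual[i] {
            return Some(i);
        }
    }
    // If one column is longer, the first extra row counts as a mismatch.
    if expected.len() != actual.len() {
        Some(common)
    } else {
        None
    }
}
```

A test would call this once per column and fail with the returned row
index, which makes it easier to locate the offending value in the
generated files.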


-- 
Renjie Liu
Software Engineer, MVAD
