For historical context, golden files were first introduced so that we could
verify backwards compatibility.  I think the preferred method is still to
do "live" testing (i.e. one implementation consumes JSON and outputs a
binary file, the second implementation reads the binary file and emits
JSON, and the two JSON documents are then checked for equality).
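
To make the "live" flow concrete, here is a rough sketch of the round trip.
Everything in it is an assumption for illustration: the two functions are
hypothetical stand-ins for two independent implementations, and the binary
layout is a toy format invented for this example, not Arrow IPC and not what
the integration tooling actually runs.

```python
import json
import struct

# Hypothetical stand-ins for two independent implementations. The binary
# layout (ndim, then shape, then float64 values) is a toy format made up
# for this sketch; it is NOT the Arrow IPC tensor format.


def impl_a_json_to_binary(doc: str) -> bytes:
    """'Implementation A': consume a JSON tensor and output binary."""
    tensor = json.loads(doc)
    shape, values = tensor["shape"], tensor["data"]
    header = struct.pack("<I", len(shape))
    header += struct.pack(f"<{len(shape)}q", *shape)
    return header + struct.pack(f"<{len(values)}d", *values)


def impl_b_binary_to_json(buf: bytes) -> str:
    """'Implementation B': read the binary and emit JSON."""
    (ndim,) = struct.unpack_from("<I", buf, 0)
    shape = list(struct.unpack_from(f"<{ndim}q", buf, 4))
    count = 1
    for dim in shape:
        count *= dim
    values = list(struct.unpack_from(f"<{count}d", buf, 4 + 8 * ndim))
    return json.dumps({"shape": shape, "data": values})


def live_round_trip_ok(doc: str) -> bool:
    """JSON -> binary (A) -> JSON (B), then compare the parsed JSON."""
    binary = impl_a_json_to_binary(doc)
    emitted = impl_b_binary_to_json(binary)
    return json.loads(emitted) == json.loads(doc)


if __name__ == "__main__":
    source = json.dumps({"shape": [2, 3],
                         "data": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
    print(live_round_trip_ok(source))
```

The comparison is done on the parsed JSON rather than on the raw strings, so
differences in whitespace or key order do not count as a mismatch.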

On Fri, Mar 19, 2021 at 9:51 AM Jorge Cardoso Leitão <
jorgecarlei...@gmail.com> wrote:

> Hi,
>
> Thanks a lot for bringing this up, Fernando. I had the same thought when I
> first looked at the tensor implementation in Rust. Now it is a bit more
> clear :)
>
> So, if I understood correctly, the direction would be to declare a
> "JSON-integration" equivalent for tensors, generate a set of "golden binary
> - json" files from C++, push them to the arrow-testing submodule, and then
> use them to compare implementations.
>
> Best,
> Jorge
>
>
>
> On Tue, Mar 16, 2021 at 5:02 PM Fernando Herrera <
> fernando.j.herr...@gmail.com> wrote:
>
> > Hi Wes,
> >
> > Thanks for the update. It would be interesting to add a centralized
> > plan for tensors in Arrow. It would make sharing data between packages
> > like numpy, ndarray, pytorch, and tensorflow really easy. Don't you think so?
> > Let me have a look at how the integration tests are created in Archery
> > so I can add some to start testing IPC in rust.
> >
> > Thanks
> >
> > On Tue, Mar 16, 2021 at 3:16 PM Wes McKinney <wesmck...@gmail.com>
> wrote:
> >
> > > hi Fernando — for clarity, there is no centralized planning in this
> > > project. If a volunteer wants to do something and there are no
> > > objections from other people, then they are free to go ahead and do
> > > it. If there aren't any Jira issues about adding integration tests, it
> > > would make sense to go ahead and open some and clarify the scope of
> > > what you would like to see get developed.
> > >
> > > On Tue, Mar 16, 2021 at 3:25 AM Antoine Pitrou <anto...@python.org>
> > wrote:
> > > >
> > > >
> > > > Hi Fernando,
> > > >
> > > > > Currently there are no explicit plans to do it, but that would
> > > > > certainly be useful if other implementations start implementing
> > > > > tensor IPC support.
> > > >
> > > > > One should start by defining a reference format (probably JSON),
> > > > > such as exists for other IPC types:
> > > > https://arrow.apache.org/docs/format/Integration.html
> > > >
> > > > Regards
> > > >
> > > > Antoine.
> > > >
> > > >
> > > > Le 16/03/2021 à 10:02, Fernando Herrera a écrit :
> > > > > Are there any plans to include integration testing for tensors in
> the
> > > > > pipeline?
> > > > >
> > > > > Thanks,
> > > > > Fernando
> > > > >
> > > > > On Mon, Mar 15, 2021 at 8:16 PM Antoine Pitrou <anto...@python.org
> >
> > > wrote:
> > > > >
> > > > >> On Mon, 15 Mar 2021 19:48:22 +0000
> > > > >> Fernando Herrera <fernando.j.herr...@gmail.com> wrote:
> > > > >>> Hi Neal,
> > > > >>>
> > > > >>> Thanks for the update and the link.
> > > > >>>
> > > > >>> I found that the project has these files for tensor checking
> > > > >>>
> > > > >>
> > >
> >
> https://github.com/apache/arrow-testing/tree/e8ce32338f2dfeca3a5126f7677bdee159604000/data/arrow-ipc-tensor-stream
> > > > >>>
> > > > >>> So, if I understand correctly, for any application to be
> compatible
> > > > >>> with C++ tensors it should be able to read these files. Am I
> > correct?
> > > > >>
> > > > >> No, these are invalid files found by fuzz testing, that used to
> > crash
> > > > >> the C++ IPC reader. More information here:
> > > > >> https://arrow.apache.org/docs/developers/cpp/fuzzing.html
> > > > >>
> > > > >> We don't have any reference files for integration testing of
> tensors
> > > > >> and sparse tensors currently.
> > > > >>
> > > > >> Regards
> > > > >>
> > > > >> Antoine.
> > > > >>
> > > > >>
> > > > >>
> > > > >
> > >
> >
>
