Hi Jacob,

Yes, this is exciting! My recommendation would be to fork apache/arrow, add a `julia` directory, copy the contents of https://github.com/JuliaData/Arrow.jl in there, and put up a pull request for review. Then we can discuss specifics there.

There will be a few other steps to do that I can think of:

* Drop the MIT license file since the code will have to be under the Apache-2 license. I believe there will have to be some sort of IP-related declarations to be made in order for the Arrow project to accept the code donation; I'll let others who've gone through that chime in with recommendations there.
* Every file will need a license note at the top as well; see examples of that throughout the arrow repository. If there are generated files or files that for whatever reason can't have the license header, add them to the list in `dev/release/rat_exclude_files.txt`.
* You may want to add a GitHub Actions job for the unit tests, or that can be done in a followup.

I'd recommend setting up the integration tests in a followup, personally, but others may disagree. Not all implementations in the project have integration tests at the moment, so while it is very valuable and strongly encouraged, it's not a blocker.

Neal

On Sun, Sep 13, 2020 at 12:33 PM Jacob Quinn <quinn.jac...@gmail.com> wrote:

> Hello all,
>
> Hopefully this email works (I'm not super familiar with using mailing lists
> like this).
>
> Over the past few weeks, I've been working on a pure Julia implementation
> to support serializing/deserializing the arrow format for Julia. The code
> in its current state can be found here:
> https://github.com/JuliaData/Arrow.jl.
>
> I believe the code has reached an initial beta-level quality and just
> finished writing the arrow <-> json integration testing code that archery
> expects. I haven't worked on actual archery integration yet, but it should
> just be a matter of adding a tester_julia.py file that knows how to invoke
> the test/integrationtest.jl file with similar arguments as the tester_go.py
> file.
>
> This email has a couple purposes:
> * Signal that the julia code is somewhat ready to be used/integrated in the
> main repo
> * Ask for advice/direction on actually integrating with the apache arrow
> github repository
>
> For the latter, in particular, I imagine keeping an initial PR as minimal
> as possible is desirable. I need to follow up with the core pkg devs for
> Julia, but I've been told it's possible/not hard to have a Julia package
> "live" inside a monorepo, but I just haven't figured out the details of
> what that means on the Julia General package registry side of things. But
> I'm happy to figure that out and shouldn't really affect the merging of
> Julia code into the apache arrow github.
>
> So my plan is roughly:
> * Fork/make a branch of the apache arrow repo
> * Add in the Julia code from the link I mentioned above
> * Add necessary files/integration in archery to run Julia integration tests
> alongside other languages
> * Do initial merge into apache arrow?
>
> If there are other initial requirements core devs would expect, just let me
> know, but I imagine that updating the implementation matrix, for example,
> can be done afterwards as follow up.
>
> Excited to have Julia more officially integrated here!
>
> Cheers,
>
> -Jacob
> https://github.com/quinnj
> https://twitter.com/quinn_jacobd
>