I went back and read the mailing list discussions from September about
the donation, and I would say we did not state clearly enough what the
donation and IP clearance meant for the future of the Julia codebase.
This is partly our fault: we have taken in 9 other code donations over
the last 5 years, and in all cases the developers understood that they
were to move their process to the Arrow repositories and
communications channels.

It did not occur to me at all that the code that you were putting in
the Arrow repository would get treated like a read-only fork that you
update periodically. If I had realized that, we wouldn't be in this
situation.

As a reminder about what Arrow and the ASF are all about: Community
over Code. We think that building a collaborative, open community that
works and plans together in public and makes decisions by consensus,
with clear meritocratic ("doers decide") governance, is the best way to
build this project. The concerns that you have around the timing and
frequency of releases for the Julia codebase are in my mind easy to
resolve, and if you had indicated that having a customized process for
Julia releases was a condition for your joining the community
wholeheartedly, we would have been happy to help. I think that the
benefits of common CI/CD infrastructure and opportunities to build
deeper integrations between the Julia implementation and the other
implementations (imagine... Julia kernels running in DataFusion?)
would outweigh the sense of "loss of control" from developing within a
larger project.

On Wed, Apr 7, 2021 at 12:16 AM Jacob Quinn <quinn.jac...@gmail.com> wrote:
>
> Responses inline below:
>
> On Tue, Apr 6, 2021 at 9:46 PM Jorge Cardoso Leitão <
> jorgecarlei...@gmail.com> wrote:
>
> > Hi,
> >
> > > you all did not attempt to work in the community for any meaningful
> > amount of time and
> > are choosing not to try based on the perception that it will create
> > unacceptable overhead for you
> >
> > It is not self-evident to me that Julia's community was sufficiently
> > informed about what they
> > had to give in in terms of process and release management when merging /
> > donating.
> >
>
> Yes, it was pretty unclear what the process was if we needed to do any kind
> of patch release. I know that has been sorted out better recently, but back
> in November, it didn't really seem like an option (i.e. independent
> language patch releases).
>
>
> > IMO this is a plausible explanation as to why the donation was made and
> > then later abandoned.
> >
> >
> I'll just note that the "abandonment" can only be a perception from the
> apache/arrow side of things, but as I mentioned above, I also tried to
> clearly state in the julia/Arrow/README that the development process would
> continue with the JuliaData/Arrow.jl repo as the main "dev" branch, with
> changes being upstreamed to the apache/arrow repo, which was followed
> through, having an upstream of commits right before the 3.0.0 release, and
> I was planning on doing the same soon for the 4.0.0 release. That is to
> say, the Julia implementation has continued progressing forward quite
> rapidly, IMO, but I can see that perhaps apache/arrow repo members may have
> viewed it as "abandoned".
>
>
> > I do not fully understand why the pain points Jacob mentioned were not
> > brought up to the mailing list sooner, though.
> >
>
> To be honest and frank, I didn't have pain points with the development
> process I outlined when the code was donated and as stated in the README.
> That was the process that made the donation possible and I imagined would
> work well going forward, and has, until this thread started and it was
> pointed out that this process isn't viable. The pain points were discussed
> with the initial code donation, but in my mind were resolved with the
> development process that was decided upon.
>
>
> > This made us unable to potentially take corrective measures. I think that
> > this is why everyone was taken a bit by surprise with this.
> >
> > Best,
> > Jorge
> >
> >
> > On Fri, Apr 2, 2021 at 10:18 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > hi Jacob — sorry to hear that. It's a bummer that you all did not
> > > attempt to work in the community for any meaningful amount of time and
> > > are choosing not to try based on the perception that it will create
> > > unacceptable overhead for you. I believe the benefits would outweigh
> > > the costs, but I suppose we will have to agree to disagree.
> > >
> > > Can you prepare a pull request to do the requisite repository surgery?
> > > I hope the development goes well in the future and look forward to
> > > seeing folks from the Julia ecosystem engaged here on growing the
> > > Arrow ecosystem.
> > >
> > > Thanks,
> > > Wes
> > >
> > > On Fri, Apr 2, 2021 at 3:03 PM Jacob Quinn <quinn.jac...@gmail.com>
> > wrote:
> > > >
> > > > Ok, I've had a chance to discuss with a few other Julia developers and
> > > > review various options. I think it's best to drop the Julia code from
> > the
> > > > physical apache/arrow repo. The extra overhead on development, release
> > > > process, and user issue reporting and PR contributing are too much in
> > > > addition to the technical challenges that we never resolved involving
> > > > including the past Arrow.jl release version git trees in the
> > apache/arrow
> > > > repo.
> > > >
> > > > We're still very much committed to working on the Julia implementation
> > > and
> > > > participating in the broader arrow community. I've enjoyed following
> > the
> > > > user/dev mailing lists and will continue to do so. We monitor format
> > > > proposals and try to implement new functionality as quickly as
> > possible.
> > > We
> > > > got the initial arrow flight proto code generated just last night in
> > > fact.
> > > > I'd still like to explore official integration with the archery test
> > > suite
> > > > to solidify the Julia implementation with integration tests; I think
> > that
> > > > would be very valuable for long-term confidence in the cross-language
> > > > support of the Julia implementation.
> > > >
> > > > We realize one of the main implications will probably be dropping Julia
> > > > from the list of "official implementations". We're encouraged by the
> > many
> > > > users who have already started using the Julia implementation and will
> > > > strive to maintain a high rate of issue responsiveness and feature
> > > > development to maintain project confidence. If there's a possibility of
> > > > being included somewhere as an "unofficial" or "semi-official"
> > > > implementation, we'd love to still be bundled with the broader arrow
> > > > project somehow, like, for example, showing how Julia integrates with
> > the
> > > > archery test suite, once the work there is done.
> > > >
> > > > Best,
> > > >
> > > > -Jacob
> > > >
> > > >
> > > >
> > > > On Tue, Mar 30, 2021 at 4:10 PM Wes McKinney <wesmck...@gmail.com>
> > > wrote:
> > > >
> > > > > Also, on the issue that there are no Julia-focused PMC members — note
> > > > > that I helped the JavaScript folks make their own independent
> > releases
> > > > > for quite a while: called the votes (e.g. [1]), helped get people to
> > > > > verify and vote on the releases. After a time, it was decided to stop
> > > > > releasing independently because there wasn't enough development
> > > > > activity to justify it.
> > > > >
> > > > > [1]: https://www.mail-archive.com/dev@arrow.apache.org/msg05971.html
> > > > >
> > > > > On Tue, Mar 30, 2021 at 4:54 PM Wes McKinney <wesmck...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > hi Jacob,
> > > > > >
> > > > > > On Tue, Mar 30, 2021 at 4:18 PM Jacob Quinn <
> > quinn.jac...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > I can comment as the primary apache arrow liaison for the
> > Arrow.jl
> > > > > > > repository and original code donator.
> > > > > > >
> > > > > > > > I apologize for the "surprise", but I commented a few times in
> > > > > > > > various places and put a snippet in the README
> > > > > > > > <https://github.com/apache/arrow/tree/master/julia/Arrow#difference-between-this-code-and-the-juliadataarrowjl-repository>
> > > > > > > > about the approach I wanted to take w/ the Julia implementation in
> > > > > > > > terms of keeping the JuliaData/Arrow.jl repository as a "dev branch"
> > > > > > > > of sorts of the apache/arrow code, upstreaming changes periodically.
> > > > > > > > There's even a script
> > > > > > > > <https://github.com/JuliaData/Arrow.jl/blob/main/scripts/update_apache_arrow_code.jl>
> > > > > > > > I wrote to mostly automate this upstreaming. I realize now that I
> > > > > > > > didn't consider the "Arrow PMC" position on this kind of setup or
> > > > > > > > seek to affirm that it would be ok to approach things like this.
> > > > > > >
> > > > > > > > The reality is that Julia users are very engrained to expect Julia
> > > > > > > > packages to live in a single stand-alone github repo, where issues
> > > > > > > > can be opened, and pull requests are welcome. It was hard and still
> > > > > > > > is hard to imagine "turning that off", since I believe we would lose
> > > > > > > > a lot of valuable bug reports and first-time contributions. This
> > > > > > > > isn't necessarily any fault of how the bug report/contribution
> > > > > > > > process is handled for the arrow project overall, though I'm also
> > > > > > > > aware that there's a desire to make it easier
> > > > > > > > <https://lists.apache.org/x/thread.html/r8817dfba08ef8daa210956db69d513fd27b7a751d28fb8f27e39cc7e@%3Cdev.arrow.apache.org%3E>
> > > > > > > > and it currently requires more and different effort than Julia users
> > > > > > > > are used to. I think it's more from how open, welcoming, and how
> > > > > > > > strong the culture is in Julia around encouraging community
> > > > > > > > contributions and the tight integration with github and its
> > > > > > > > open-source project management tools.
> > > > > > >
> > > > > >
> > > > > > Well, we are on track to having 1000 different people contribute to
> > > > > > the project and have over 12,000 issues, so I don't think there is
> > > > > > evidence that we are failing to attract new contributors or that
> > > > > > feature requests / bugs aren't being reported. The way that we work
> > > is
> > > > > > _different_, so adapting to the Apache process will require change.
> > > > > >
> > > > > > > Additionally, I was and still am concerned about the overall
> > > release
> > > > > > > process of the apache/arrow project. I know there have been
> > efforts
> > > > > there
> > > > > > > as well to make it easier for individual languages to release on
> > > their
> > > > > own
> > > > > > > cadence, but just anecdotally, the JuliaData/Arrow.jl has
> > > > > had/needed/wanted
> > > > > > > 10 patch and minor releases since the original code donation,
> > > whereas
> > > > > the
> > > > > > > apache/arrow project has had one (3.0.0). This leads to some of
> > the
> > > > > > > concerns I have with restricting development to just the
> > > apache/arrow
> > > > > > > repository: how exactly does the release process work for
> > > individual
> > > > > > > languages who may desire independent releases apart from the
> > > quarterly
> > > > > > > overall project releases? I think from the Rust thread I remember
> > > that
> > > > > you
> > > > > > > just need a group of language contributors to all agree, but what
> > > if
> > > > > I'm
> > > > > > > the only "active" Julia contributor? It's also unclear what the
> > > > > > > expectations are for actual development: with the original code
> > > > > donation
> > > > > > > PRs, I know Neal "reviewed" the PRs, but perhaps missed the
> > details
> > > > > around
> > > > > > > how I proposed development continue going forward. Is it required
> > > to
> > > > > have a
> > > > > > > certain number of reviews before merging? On the Julia side, I
> > can
> > > try
> > > > > to
> > > > > > > encourage/push for those who have contributed to the
> > > JuliaData/Arrow.jl
> > > > > > > repository to help review PRs to apache/arrow, but I also can't
> > > > > guarantee
> > > > > > > we would always have someone to review. It just feels pretty
> > > awkward
> > > > > if I
> > > > > > > keep needing to ping non-Julia people to "review" a PR to merge
> > it.
> > > > > Perhaps
> > > > > > > this is just a problem of the overall Julia implementation
> > > "smallness"
> > > > > in
> > > > > > > terms of contributors, but I'm not sure on the best answer here.
> > > > > > >
> > > > > >
> > > > > > Several things here:
> > > > > >
> > > > > > * If you want to do separate Julia releases, you are free to do
> > that,
> > > > > > but you have to follow the process (voting on the mailing list,
> > > > > > publishing GPG-signed source artifacts)
> > > > > > * If you had been working "in the community" since November, you
> > > would
> > > > > > probably already be a committer, so there is a bootstrapping here
> > > that
> > > > > > has failed to take place. In the meantime, we are more than happy
> > to
> > > > > > help you "earn your wings" (as a committer) as quickly as possible.
> > > > > > But from my perspective, I see a code donation and two other
> > commits,
> > > > > > which isn't enough to make a case for committership.
> > > > > >
> > > > > > > So in short, I'm not sure on the best path forward. I think
> > > strictly
> > > > > > > restricting development to the apache/arrow physical repository
> > > would
> > > > > > > actively hurt the progress of the Julia implementation, whereas
> > it
> > > > > *has*
> > > > > > > been progressing with increasing momentum since first released.
> > > There
> > > > > are
> > > > > > > posts on the Julia discourse forum, in the Julia slack and zulip
> > > > > > > communities, and quite a few issues/PRs being opened at the
> > > > > > > JuliaData/Arrow.jl repository. There have been several calls for
> > > arrow
> > > > > > > flight support, with a member from Julia Computing actually close
> > > to
> > > > > > > releasing a gRPC client
> > > > > > > <https://github.com/JuliaComputing/gRPCClient.jl> specifically
> > > > > > > to help with flight support. But in terms of actual committers,
> > > it's
> > > > > been
> > > > > > > primarily just myself, with a few minor contributions by others.
> > > > > > >
> > > > > > > I guess the big question that comes to mind is what are the hard
> > > > > > > requirements to be considered an "official implementation"? Does
> > > the
> > > > > code
> > > > > > > *have* to live in the same physical repo? Or if it passed the
> > > series of
> > > > > > > archery integration tests, would that be enough? I apologize for
> > my
> > > > > > > naivete/inexperience on all things "apache", but I imagine that's
> > > a big
> > > > > > > part of it: having official development/releases through the
> > > > > apache/arrow
> > > > > > > community, though again I'm not exactly sure on the formal
> > > processes
> > > > > here?
> > > > > > > I would like to keep Julia as an official implementation, but I'm
> > > also
> > > > > > > mostly carrying the maintainership alone at the moment and want
> > to
> > > be
> > > > > > > realistic with the future of the project.
> > > > > > >
> > > > > >
> > > > > > The critical matter is whether the development/maintenance work is
> > > > > > conducted by the "Arrow community" in accordance with the Apache
> > Way,
> > > > > > which is to say individuals collaborating with each other on Apache
> > > > > > channels (for communication and development) and avoiding the bad
> > > > > > patterns you see sometimes in other communities (e.g. inconsistent
> > > > > > openness).
> > > > > >
> > > > > > It's fine — really, no pressure — if you want to be independent and
> > > do
> > > > > > things your own way, you just have to be clear that you are
> > > > > > independent and not operating as part of the Apache Arrow
> > community.
> > > > > > You can't have it both ways, though. No hard feelings whatever you
> > > > > > decide, but the current "dump code over the wall occasionally"
> > > > > > approach but work on independent channels is not compatible.
> > Building
> > > > > > healthy open source communities is hard, but this way has been
> > shown
> > > > > > to work well, which is why I've spent the last 6 years working hard
> > > to
> > > > > > bring people together to build this project and ecosystem!
> > > > > >
> > > > > > If you want to maintain a test harness here to verify an
> > independent
> > > > > > Julia implementation, that's fine, too. I'm disappointed that
> > things
> > > > > > failed to bootstrap after the code donation, so I want to see if we
> > > > > > can course correct quickly or if not decide to go our separate
> > ways.
> > > > > >
> > > > > > Thanks,
> > > > > > Wes
> > > > > >
> > > > > > > I'm open to discussion and ideas on the best way forward.
> > > > > > >
> > > > > > > -Jacob
> > > > > > >
> > > > > > > On Tue, Mar 30, 2021 at 2:03 PM Wes McKinney <
> > wesmck...@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > hi folks,
> > > > > > > >
> > > > > > > > I was very surprised today to learn that the Julia Arrow
> > > > > > > > implementation has continued operating more or less like an
> > > > > > > > independent open source project since the code donation last
> > > > > November:
> > > > > > > >
> > > > > > > > https://github.com/JuliaData/Arrow.jl/commits/main
> > > > > > > >
> > > > > > > > There may have been a misunderstanding about what was expected
> > to
> > > > > > > > occur after the code donation, but it's problematic for a bunch
> > > of
> > > > > > > > reasons (IP lineage / governance / community development) to
> > have
> > > > > work
> > > > > > > > happening on the implementation "outside the community".
> > > > > > > >
> > > > > > > > In any case, what is done is done, so the Arrow PMC's position
> > on
> > > > > this
> > > > > > > > would be roughly to regard the work as a hard fork of what's in
> > > > > Apache
> > > > > > > > Arrow, which given its development activity is more or less
> > > inactive
> > > > > > > > [1]. (I had actually thought the project was simply inactive
> > > after
> > > > > the
> > > > > > > > code donation)
> > > > > > > >
> > > > > > > > The critical question now is, is there interest from Julia
> > > developers
> > > > > > > > in working "in the community", which is to say:
> > > > > > > >
> > > > > > > > * Having development discussions on ASF channels (mailing list,
> > > > > > > > GitHub, JIRA), planning and communicating in the open
> > > > > > > > * Doing all development in ASF GitHub repositories
> > > > > > > >
> > > > > > > > The answer to the question may be "no" (which is okay), but if
> > > that's
> > > > > > > > the case, I don't think we should be giving the impression that
> > > we
> > > > > > > > have an official Julia implementation that is developed and
> > > > > maintained
> > > > > > > > by the community (and so my argument would be unfortunately to
> > > drop
> > > > > > > > the donated code from the project).
> > > > > > > >
> > > > > > > > If the answer is "yes", there needs to be a hard commitment to
> > > move
> > > > > > > > development to Apache channels and not look back. We would also
> > > need
> > > > > > > > to figure out what to do to document and synchronize the new IP
> > > > > that's
> > > > > > > > been created since the code donation.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Wes
> > > > > > > >
> > > > > > > > [1]:
> > https://github.com/apache/arrow/commits/master/julia/Arrow
> > > > > > > >
> > > > >
> > >
> >
