The PR is updated to rename the directory to `connectors`. If there are no other objections, can we merge it?

On Mon, Mar 21, 2022 at 1:42 PM Alkis Evlogimenos <alkis.evlogime...@databricks.com> wrote:

> Unless there are objections, I will update the PR tonight to rename
> `external` to `connectors`.
>
> On Mon, Mar 21, 2022 at 12:36 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> How about renaming it to `connectors` if docker is the only exception and
>> will be moved out?
>>
>> On Sat, Mar 19, 2022 at 6:18 PM Alkis Evlogimenos
>> <alkis.evlogime...@databricks.com.invalid> wrote:
>>
>>> It looks like renaming the directory and moving components can be
>>> separate steps. If there is consensus that connectors will move out,
>>> should the directory be named misc for everything else until there is
>>> some direction for the remaining modules?
>>>
>>> On Fri, 18 Mar 2022 at 03:03 Jungtaek Lim <kabhwan.opensou...@gmail.com>
>>> wrote:
>>>
>>>> Avro reader is technically a connector. We eventually called data
>>>> source implementation "connector" as well; the package name in the
>>>> catalyst represents it.
>>>>
>>>> Docker is something I'm not sure fits with the name "external". It
>>>> probably deserves a top level directory now, since we start to release
>>>> an official docker image. That does not seem to be an experimental one.
>>>>
>>>> Except Docker, all modules in the external directory are "sort of"
>>>> connectors. Ganglia metric sink is an exception, but it is still a kind
>>>> of connector for Dropwizard.
>>>> (It might be interesting to see how many users are still using
>>>> kinesis-asl and ganglia-lgpl modules. We have had almost no updates for
>>>> DStream for several years.)
>>>>
>>>> If we agree with my proposal for docker, the remaining change is going
>>>> to be effectively a rename. I don't have a strong opinion, I just want
>>>> to avoid the external directory becoming/remaining a miscellaneous one.
>>>>
>>>> On Fri, Mar 18, 2022 at 10:04 AM Sean Owen <sro...@gmail.com> wrote:
>>>>
>>>>> I sympathize, but it might be less change to just rename the dir.
>>>>> There is more in there like the avro reader; it's kind of
>>>>> miscellaneous. I think we might want fewer rather than more top level
>>>>> dirs.
>>>>>
>>>>> On Thu, Mar 17, 2022 at 7:33 PM Jungtaek Lim
>>>>> <kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> We seem to just focus on how to avoid the conflict with the name
>>>>>> "external" used in bazel. Since we consider the possibility of
>>>>>> renaming, why not revisit the modules "external" contains?
>>>>>>
>>>>>> It looks like the kinds of modules the external directory contains
>>>>>> are 1) Docker 2) Connectors 3) Sink on Dropwizard metrics (only
>>>>>> ganglia here, and it seems to be just that Ganglia is LGPL)
>>>>>>
>>>>>> Would it make sense for each kind to get its own top directory? We
>>>>>> can probably give better generalized names, and as a side-effect we
>>>>>> will no longer have "external".
>>>>>>
>>>>>> On Fri, Mar 18, 2022 at 5:45 AM Dongjoon Hyun
>>>>>> <dongjoon.h...@gmail.com> wrote:
>>>>>>
>>>>>>> Thank you for posting this, Alkis.
>>>>>>>
>>>>>>> Before the question (1) and (2), I'm curious if the Apache Spark
>>>>>>> community has other downstreams using Bazel.
>>>>>>>
>>>>>>> To All. If there are some Bazel users with Apache Spark code, could
>>>>>>> you share your practice? If you are using renaming, what is your
>>>>>>> renamed directory name?
>>>>>>>
>>>>>>> Dongjoon.
>>>>>>>
>>>>>>> On Thu, Mar 17, 2022 at 11:56 AM Alkis Evlogimenos
>>>>>>> <alkis.evlogime...@databricks.com.invalid> wrote:
>>>>>>>
>>>>>>>> AFAIK there is not. `external` has been baked in bazel since the
>>>>>>>> beginning and there is no plan from bazel devs to attempt to fix
>>>>>>>> this
>>>>>>>> <https://github.com/bazelbuild/bazel/issues/4508#issuecomment-724055371>.
>>>>>>>>
>>>>>>>> On Thu, Mar 17, 2022 at 7:52 PM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Just checking - there is no way to tell bazel to look somewhere
>>>>>>>>> else for whatever 'external' means to it?
>>>>>>>>> It's a kinda big ugly change but it's not a functional change. If
>>>>>>>>> anything it might break some downstream builds that rely on the
>>>>>>>>> current structure too. But such is life for developers? I don't
>>>>>>>>> have a strong reason we can't.
>>>>>>>>>
>>>>>>>>> On Thu, Mar 17, 2022 at 1:47 PM Alkis Evlogimenos
>>>>>>>>> <alkis.evlogime...@databricks.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Spark devs.
>>>>>>>>>>
>>>>>>>>>> The Apache Spark repo has a top level external/ directory. This
>>>>>>>>>> is a reserved name for the bazel build system and it causes all
>>>>>>>>>> sorts of problems: some can be worked around and some cannot (for
>>>>>>>>>> some details on one that cannot, see
>>>>>>>>>> https://github.com/hedronvision/bazel-compile-commands-extractor/issues/30
>>>>>>>>>> ).
>>>>>>>>>>
>>>>>>>>>> Some forks of Apache Spark use bazel as a build system. It
>>>>>>>>>> would be nice if we can make this change in Apache Spark without
>>>>>>>>>> resorting to complex renames/merges whenever changes are pulled
>>>>>>>>>> from upstream.
>>>>>>>>>>
>>>>>>>>>> As such I proposed to rename the external/ directory to something
>>>>>>>>>> else [SPARK-38569
>>>>>>>>>> <https://issues.apache.org/jira/browse/SPARK-38569>]. I also
>>>>>>>>>> sent a tentative [PR-35874
>>>>>>>>>> <https://github.com/apache/spark/pull/35874>] that renames
>>>>>>>>>> external/ to vendor/.
>>>>>>>>>>
>>>>>>>>>> My questions to you are:
>>>>>>>>>> 1. Are there any objections to renaming external to X?
>>>>>>>>>> 2. Is vendor a good new name for external?
>>>>>>>>>>
>>>>>>>>>> Cheers,