Reopening this old thread to discuss whether we should change the heading text on the Arrow website (https://arrow.apache.org) to match this updated description in the GitHub repo.
I opened a Jira issue for this at https://issues.apache.org/jira/browse/ARROW-14086. Please share feedback here or in comments on the Jira issue.

On Sat, Jun 12, 2021 at 11:20 AM Joris Peeters <joris.mg.peet...@gmail.com> wrote:
>
> +1
>
> On Sat, Jun 12, 2021 at 2:56 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Thanks Kou! I have updated the description using .asf.yaml. Appreciate
> > everyone giving thought to this!
> >
> > On Thu, Jun 10, 2021 at 8:13 PM Sutou Kouhei <k...@clear-code.com> wrote:
> > >
> > > It seems that we can use .asf.yaml to set the description on
> > > GitHub:
> > >
> > >   https://cwiki.apache.org/confluence/display/INFRA/git+-+.asf.yaml+features#Git.asf.yamlfeatures-GitHubsettings
> > >
> > >   github:
> > >     description: "Apache Arrow is ..."
> > >
> > > In <CAJPUwMB6yjWnxyE0xsCVbfd=KxSqd6rp6VUFhvr0=eou5fz...@mail.gmail.com>
> > >   "Re: Long title on github page" on Thu, 10 Jun 2021 17:44:57 -0500,
> > >   Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > > I'll wait a day or two for more feedback to percolate and then ask
> > > > Infra to change the description on GitHub.
> > > >
> > > > On Thu, Jun 10, 2021 at 4:47 PM Adam Lippai <a...@rigo.sk> wrote:
> > > >>
> > > >> +1
> > > >>
> > > >> On Thu, Jun 10, 2021, 23:38 Antoine Pitrou <anto...@python.org> wrote:
> > > >>
> > > >> >
> > > >> > Sounds good enough to me.
> > > >> >
> > > >> >
> > > >> > On 10/06/2021 at 23:35, Wes McKinney wrote:
> > > >> > > I hate to reopen this can of worms again, but here is my effort to
> > > >> > > synthesize feedback:
> > > >> > >
> > > >> > > "Apache Arrow is a multi-language toolbox for accelerated data
> > > >> > > interchange and in-memory processing."
> > > >> > >
> > > >> > > On Thu, Jun 10, 2021 at 12:37 PM Dominik Moritz <domor...@apache.org> wrote:
> > > >> > >>
> > > >> > >> I thought there were some good suggestions in this thread. @Wes, did you
> > > >> > >> find a description you liked?
> > > >> > >>
> > > >> > >> On May 18, 2021 at 06:24:47, Adam Hooper <a...@adamhooper.com> wrote:
> > > >> > >>
> > > >> > >>> Poll question: why did you choose Arrow?
> > > >> > >>>
> > > >> > >>> Personally: I researched Arrow because it's a spec for IPC. (My requirement
> > > >> > >>> was: "wrap computations in a separate process.") I chose Arrow for its
> > > >> > >>> community and ecosystem -- in other words, because my peers chose it.
> > > >> > >>>
> > > >> > >>> I happen to use the compute kernel and Parquet capabilities every day; but
> > > >> > >>> they did not sway me at all. I would choose Arrow if it were nothing but
> > > >> > >>> this spec and this community. (I chose HTML, after all.)
> > > >> > >>>
> > > >> > >>> I see the *code* as one enormous proof that the *spec* is good, and as a
> > > >> > >>> collection of examples and best practices.
> > > >> > >>>
> > > >> > >>> ... so a great pitch to me would be: "Apache Arrow is a data format and
> > > >> > >>> toolbox for efficient in-memory processing."
> > > >> > >>>
> > > >> > >>> Enjoy life,
> > > >> > >>> Adam
> > > >> > >>>
> > > >> > >>> On Tue, May 18, 2021 at 2:38 AM Aldrin <akmon...@ucsc.edu.invalid> wrote:
> > > >> > >>>
> > > >> > >>> "Apache Arrow is a data processing library that also provides a uniform,
> > > >> > >>> efficient interface for data systems."
> > > >> > >>>
> > > >> > >>> This probably still isn't quite right, I imagine the bit about "for data
> > > >> > >>> systems" needs some addition (maybe "for transport between data systems")?
> > > >> > >>>
> > > >> > >>> My primary motivators:
> > > >> > >>>
> > > >> > >>>    - "A data processing library":
> > > >> > >>>       - Arrow provides many language bindings, but ultimately they're all
> > > >> > >>>         part of the same "library ecosystem", which I think is fine to
> > > >> > >>>         capture in "library"
> > > >> > >>>       - A main goal of Arrow is for processing to be fast, whatever that
> > > >> > >>>         processing may be
> > > >> > >>>    - "uniform, efficient interface for data systems":
> > > >> > >>>       - Arrow provides (or tries to) a cohesive ("uniform") interface for
> > > >> > >>>         data processing (although it has several APIs to do this)
> > > >> > >>>       - Also, IMO, a motivation for Arrow was a format and library to
> > > >> > >>>         facilitate processing, but that provided functions and
> > > >> > >>>         interfaces to easily translate into optimized data formats used
> > > >> > >>>         by disparate data systems (Cassandra, Hadoop, etc.).
> > > >> > >>>       - Arrow tries to be transparently zero-copy, which is part of the
> > > >> > >>>         interface for efficiency
> > > >> > >>>    - Arrow certainly has a data format, but that format is the crux of the
> > > >> > >>>      interface (IMO). However, it also makes using other formats easy (via
> > > >> > >>>      filesystem API and Parquet reader/writers, etc.). So, focusing on the
> > > >> > >>>      data format seems unnecessary in such a terse description.
> > > >> > >>>
> > > >> > >>> Aldrin Montana
> > > >> > >>> Computer Science PhD Student
> > > >> > >>> UC Santa Cruz
> > > >> > >>>
> > > >> > >>> On Mon, May 17, 2021 at 5:07 PM Weston Pace <weston.p...@gmail.com> wrote:
> > > >> > >>>
> > > >> > >>>> I'd avoid the word "structured" as it is somewhat ill-defined.
> > > >> > >>>>
> > > >> > >>>> On Mon, May 17, 2021 at 12:37 PM Mauricio Vargas
> > > >> > >>>> <mauri...@ursacomputing.com> wrote:
> > > >> > >>>>>
> > > >> > >>>>> more marketed:
> > > >> > >>>>> How about: "Apache Arrow is a format and language-agnostic library
> > > >> > >>>>> focused on efficient sharing and processing of structured data."
> > > >> > >>>>>
> > > >> > >>>>> On Mon, May 17, 2021 at 6:25 PM Micah Kornfield <emkornfi...@gmail.com>
> > > >> > >>>>> wrote:
> > > >> > >>>>>
> > > >> > >>>>>> How about: "Apache Arrow is a collection of specifications, cross
> > > >> > >>>>>> language libraries and applications focused on efficient sharing and
> > > >> > >>>>>> processing of structured data."
> > > >> > >>>>>>
> > > >> > >>>>>> On Mon, May 17, 2021 at 3:06 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >> > >>>>>>
> > > >> > >>>>>>> On Mon, May 17, 2021 at 4:58 PM Weston Pace <weston.p...@gmail.com>
> > > >> > >>>>>>> wrote:
> > > >> > >>>>>>>>
> > > >> > >>>>>>>>> “Apache Arrow is a format and compute kernel for in-memory data”
> > > >> > >>>>>>>>
> > > >> > >>>>>>>> I like this but no one ever knows what "in-memory" means (or they just
> > > >> > >>>>>>>> think 'data is always in memory'). How about...
> > > >> > >>>>>>>>
> > > >> > >>>>>>>> "Apache Arrow is a format and compute kernel for zero-copy processing
> > > >> > >>>>>>>> and sharing of data."
> > > >> > >>>>>>>>
> > > >> > >>>>>>>> or...
> > > >> > >>>>>>>>
> > > >> > >>>>>>>> "Apache Arrow is a format and compute kernel for processing and
> > > >> > >>>>>>>> sharing data without serialization overhead."
> > > >> > >>>>>>>
> > > >> > >>>>>>> A few issues with this:
> > > >> > >>>>>>>
> > > >> > >>>>>>> * Multiple PL aspect unclear (is it a single piece of software, or
> > > >> > >>>>>>>   multiple pieces of software?)
> > > >> > >>>>>>> * Development platform aspect unclear
> > > >> > >>>>>>>
> > > >> > >>>>>>> I see that some people don't like the word "platform".
> > > >> > >>>>>>> Some people come to this project and want to find an end-to-end
> > > >> > >>>>>>> application, rather than a developer toolkit that they can use to build
> > > >> > >>>>>>> applications. Perhaps we should be more explicit and use
> > > >> > >>>>>>> "computational development toolkit" instead of "platform".
> > > >> > >>>>>>>
> > > >> > >>>>>>>> Although marshalling[1] would probably be a more precise word it is
> > > >> > >>>>>>>> not as well known.
> > > >> > >>>>>>>>
> > > >> > >>>>>>>> [1] https://en.wikipedia.org/wiki/Marshalling_(computer_science)
> > > >> > >>>>>>>>
> > > >> > >>>>>>>> On Mon, May 17, 2021 at 9:36 AM Mauricio Vargas
> > > >> > >>>>>>>> <mauri...@ursacomputing.com> wrote:
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>> a few ideas
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>> github.com/apache/arrow - Apache Arrow is an efficient library for big data
> > > >> > >>>>>>>>> processing and sharing
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>> github.com/apache/arrow - Apache Arrow is a computational tool for
> > > >> > >>>>>>>>> processing, storing and sharing large datasets
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>> github.com/apache/arrow - Apache Arrow is a fast and simple library for
> > > >> > >>>>>>>>> big data analytics
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>> *github.com/apache/arrow <http://github.com/apache/arrow> - Apache Arrow is
> > > >> > >>>>>>>>> a powerful workhorse for analytic operations on modern hardware*
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>> On Mon, May 17, 2021 at 3:13 PM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> > > >> > >>>>>>>>>
> > > >> > >>>>>>>>>> Alright, well, whatever it is, it must fit into one breath. If the
> > > >> > >>>>>>>>>> high-concept pitch is successful, people will stick around for the full
> > > >> > >>>>>>>>>> pitch.
> > > >> > >>>>>>>>>>
> > > >> > >>>>>>>>>> Words such as “platform” and “enable” are noise. You say “platform”, they
> > > >> > >>>>>>>>>> start to say “what exactly do you mean by platform”, the elevator doors
> > > >> > >>>>>>>>>> open, and they’re gone.
> > > >> > >>>>>>>>>>
> > > >> > >>>>>>>>>> “Apache Arrow is a format and compute kernel for in-memory data”
> > > >> > >>>>>>>>>>
> > > >> > >>>>>>>>>>
> > > >> > >>>>>>>>>>> On May 17, 2021, at 12:03 PM, Eduardo Ponce <edponc...@gmail.com> wrote:
> > > >> > >>>>>>>>>>>
> > > >> > >>>>>>>>>>> One more suggestion for the bucket:
> > > >> > >>>>>>>>>>> "Apache Arrow is a computational platform for efficient in-memory data
> > > >> > >>>>>>>>>>> representation and processing."
> > > >> > >>>>>>>>>>>
> > > >> > >>>>>>>>>>> On Mon, May 17, 2021 at 2:49 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >> > >>>>>>>>>>>
> > > >> > >>>>>>>>>>>> I think less is better in the description, but unfortunately the
> > > >> > >>>>>>>>>>>> association of Arrow as being "just a data format" has been actively
> > > >> > >>>>>>>>>>>> harmful in some ways to community growth. We have a data format, yes,
> > > >> > >>>>>>>>>>>> but we are also creating a computational platform to go hand-in-hand
> > > >> > >>>>>>>>>>>> with the data format to make it easier to build fast applications that
> > > >> > >>>>>>>>>>>> use the data format.
> > > >> > >>>>>>>>>>>> So the description needs to capture both of these
> > > >> > >>>>>>>>>>>> ideas.
> > > >> > >>>>>>>>>>>>
> > > >> > >>>>>>>>>>>> On Mon, May 17, 2021 at 12:15 PM Julian Hyde <jhyde.apa...@gmail.com>
> > > >> > >>>>>>>>>>>> wrote:
> > > >> > >>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>> I think that the “cross-language development platform for” is noise.
> > > >> > >>>>>>>>>>>>> (I’m sure that JPEG developers think that JPEG is a “cross-language
> > > >> > >>>>>>>>>>>>> development platform” too. But it isn’t. It is an image format.)
> > > >> > >>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>> “Apache Arrow is a data format for efficient in-memory processing.”
> > > >> > >>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>> I’ll note that in marketing speak, we are developing a high-concept
> > > >> > >>>>>>>>>>>>> pitch [1] here. Every company needs a name, a brand, a high-concept
> > > >> > >>>>>>>>>>>>> pitch, and 3- or 4-sentence description. But every Apache project needs
> > > >> > >>>>>>>>>>>>> these too. It’s worth spending the time on the description, also, and
> > > >> > >>>>>>>>>>>>> then use them in all the places that we describe Arrow.
> > > >> > >>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>> Julian
> > > >> > >>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>> [1] https://www.growthink.com/content/whats-your-high-concept-pitch
> > > >> > >>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>> On May 17, 2021, at 7:38 AM, Eduardo Ponce <edponc...@gmail.com> wrote:
> > > >> > >>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>> I agree with Nate's and Brian's suggestions, but would like to add that we
> > > >> > >>>>>>>>>>>>>> can make it a one-liner for more conciseness and consistency with other
> > > >> > >>>>>>>>>>>>>> Apache projects.
> > > >> > >>>>>>>>>>>>>> Apologies if it seems I am going around the suggestions loop again.
> > > >> > >>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>> "Apache Arrow is a cross-language development platform enabling efficient
> > > >> > >>>>>>>>>>>>>> in-memory data processing and transport."
> > > >> > >>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>> On Mon, May 17, 2021 at 10:11 AM Brian Hulette <bhule...@apache.org> wrote:
> > > >> > >>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>> Thank you for bringing this up Dominik. I sampled some of the descriptions
> > > >> > >>>>>>>>>>>>>>> for other Apache projects I frequent, the ones with a meaningful
> > > >> > >>>>>>>>>>>>>>> description have a single sentence:
> > > >> > >>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>> github.com/apache/spark - Apache Spark - A unified analytics engine for
> > > >> > >>>>>>>>>>>>>>> large-scale data processing
> > > >> > >>>>>>>>>>>>>>> github.com/apache/beam - Apache Beam is a unified programming model for
> > > >> > >>>>>>>>>>>>>>> Batch and Streaming
> > > >> > >>>>>>>>>>>>>>> github.com/apache/avro - Apache Avro is a data serialization system
> > > >> > >>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>> Several others (Flink, Hadoop, ...) just have "[Mirror of] Apache <name>"
> > > >> > >>>>>>>>>>>>>>> as the description.
> > > >> > >>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>> +1 for Nate's suggestion "Apache Arrow is a cross-language development
> > > >> > >>>>>>>>>>>>>>> platform for in-memory data. It enables systems to process and transport
> > > >> > >>>>>>>>>>>>>>> data more efficiently."
> > > >> > >>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>> On Mon, May 17, 2021 at 5:23 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >> > >>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>> It's probably best for the description to limit mentions of specific
> > > >> > >>>>>>>>>>>>>>>> features. There are some high level features mentioned in the
> > > >> > >>>>>>>>>>>>>>>> description now ("computational libraries and zero-copy streaming
> > > >> > >>>>>>>>>>>>>>>> messaging and interprocess communication"), but now in 2021 since the
> > > >> > >>>>>>>>>>>>>>>> project has grown so much, it could leave people with a limited view
> > > >> > >>>>>>>>>>>>>>>> of what they might find here.
> > > >> > >>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>> On Mon, May 17, 2021 at 12:14 AM Mauricio Vargas
> > > >> > >>>>>>>>>>>>>>>> <mauri...@ursacomputing.com> wrote:
> > > >> > >>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>> How about
> > > >> > >>>>>>>>>>>>>>>>> 'Apache Arrow is a cross-language development platform for in-memory data.
> > > >> > >>>>>>>>>>>>>>>>> It enables systems to process and transport data efficiently, providing a
> > > >> > >>>>>>>>>>>>>>>>> simple and fast library for partitioning of large tables'?
> > > >> > >>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>> Sorry for the delay, long election day
> > > >> > >>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>> On Sun, May 16, 2021, 2:27 PM Nate Bauernfeind <natebauernfe...@deephaven.io>
> > > >> > >>>>>>>>>>>>>>>>> wrote:
> > > >> > >>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>> Suggestion: faster -> more efficiently
> > > >> > >>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>> "Apache Arrow is a cross-language development platform for in-memory
> > > >> > >>>>>>>>>>>>>>>>>> data. It enables systems to process and transport data more efficiently."
> > > >> > >>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>> On Sun, May 16, 2021 at 11:35 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > >> > >>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>> Here's what's there now:
> > > >> > >>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>> "Apache Arrow is a cross-language development platform for in-memory
> > > >> > >>>>>>>>>>>>>>>>>>> data. It specifies a standardized language-independent columnar memory
> > > >> > >>>>>>>>>>>>>>>>>>> format for flat and hierarchical data, organized for efficient
> > > >> > >>>>>>>>>>>>>>>>>>> analytic operations on modern hardware. It also provides computational
> > > >> > >>>>>>>>>>>>>>>>>>> libraries and zero-copy streaming messaging and interprocess
> > > >> > >>>>>>>>>>>>>>>>>>> communication…"
> > > >> > >>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>> How about something shorter like
> > > >> > >>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>> "Apache Arrow is a cross-language development platform for in-memory
> > > >> > >>>>>>>>>>>>>>>>>>> data.
> > > >> > >>>>>>>>>>>>>>>>>>> It enables systems to process and transport data faster."
> > > >> > >>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>> Suggestions / refinements from others welcome
> > > >> > >>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>> On Sat, May 15, 2021 at 9:12 PM Dominik Moritz <domor...@cmu.edu> wrote:
> > > >> > >>>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>>> Super minor issue but could someone make the description on GitHub
> > > >> > >>>>>>>>>>>>>>>>>>>> shorter?
> > > >> > >>>>>>>>>>>>>>>>>>>>
> > > >> > >>>>>>>>>>>>>>>>>>>> GitHub puts the description into the title of the page and makes it hard
> > > >> > >>>>>>>>>>>>>>>>>>>> to find it in URL autocomplete.
> > > >> > >>>
> > > >> > >>> --
> > > >> > >>> Adam Hooper
> > > >> > >>> +1-514-882-9694
> > > >> > >>> http://adamhooper.com