Hi Romain,

Cool! I would suggest that we proceed in one of two ways:

* Start merging R patches to master (what I would prefer)
* Merge patches into an r-devel branch while the R bindings initiative
  is in early stages

I don't really see any benefits to hiding early-stage code in a
branch; the README for R should clearly indicate that the API is
experimental.

I think it would be better for the code to start going into the Arrow
project (rather than staying in your personal branch) for a few
reasons:

* More opportunities for the community to participate
* More visible progress / transparency into what is going on
* You will earn karma in the Apache project and be on your way to
  becoming a committer
* Opportunities for code review from other C++ developers on use of the
  Arrow APIs, and opportunities for improvement
* Incremental IP / licensing oversight (this gets harder when the
  patches get bigger)
* Help with roadmapping / enumerating work to be done

On that last note, I would recommend beginning to liberally create
JIRAs as you think of things that need to be done to build first-class
R support for Arrow. JIRA is the simplest way to develop the roadmap
organically; it doesn't need to be anything formal.

Thanks!
Wes

On Tue, Mar 20, 2018 at 12:04 PM, Romain Francois <rom...@purrple.cat> wrote:
> Hello,
>
> Today is Tuesday, so that's the day I work on porting arrow to R. This week,
> I've continued some of the work from last week, still following the steps of
> the python front end as documented here:
> https://arrow.apache.org/docs/python/data.html#type-metadata
>
> Things are starting to materialize, and I try to give it an R feel.
>
>> int32()
> DataType(int32)
>>
>> float64()
> DataType(double)
>>
>> struct( x = int32(), y = float64(), d1 = date32() )
> StructType(struct<x: int32, y: double, d1: date32[day]>)
>>
>> schema( x = int32(), y = float64(), d1 = date32() )
> x: int32
> y: double
> d1: date32[day]
>
>
> This is not that interesting, but it sets a nice premise for the future.
>
> Quick ones:
> - are there examples of uses of pyarrow.union ?
> - how does pyarrow.array dispatches to the right array type ? And perhaps
> more generally, how do I know what's inside the function ?
>
>>>> pa.array([1, 2, None, 3])
> <pyarrow.lib.Int64Array object at 0x10db246d8>
> [
>   1,
>   2,
>   NA,
>   3
> ]
>>>>
>>>> pa.array
> <function pyarrow.lib.array>
>
>
> Romain
>
>
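
On the pa.array question in the quoted message: the general idea of choosing an array type from the values can be illustrated with a small pure-Python sketch. This is not pyarrow's actual implementation (pa.array lives in the compiled pyarrow.lib module, which is why its body is not visible from the prompt). Every name below (SketchArray, infer_type_name, make_array) is hypothetical, and the promotion rules are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SketchArray:
    """Hypothetical stand-in for a typed array: a type name plus values."""
    type_name: str
    values: List[Optional[Any]]


def infer_type_name(values: List[Optional[Any]]) -> str:
    """Scan the values once and pick a single type name for all of them.

    None is treated as a null and does not influence the inferred type.
    """
    seen = set()
    for v in values:
        if v is None:
            continue
        # bool must be tested before int, because bool is a subclass of int.
        if isinstance(v, bool):
            seen.add("bool")
        elif isinstance(v, int):
            seen.add("int64")
        elif isinstance(v, float):
            seen.add("double")
        elif isinstance(v, str):
            seen.add("string")
        else:
            raise TypeError(f"unsupported value: {v!r}")
    if not seen:
        return "null"  # empty input or only None values
    if len(seen) == 1:
        return next(iter(seen))
    if seen == {"int64", "double"}:
        return "double"  # assumed rule: mixed ints and floats widen to double
    raise TypeError(f"cannot unify types: {sorted(seen)}")


def make_array(values: List[Optional[Any]]) -> SketchArray:
    """Infer the type first, then build the array tagged with that type."""
    return SketchArray(infer_type_name(values), list(values))


if __name__ == "__main__":
    print(make_array([1, 2, None, 3]).type_name)
```

The two-step shape (infer a type, then construct the matching array) is the part that would carry over to an R front end; the specific rules here are placeholders.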