Hi Romain,

Cool! I would suggest that we proceed in one of two ways:

* Start merging R patches to master (what I would prefer)
* Merge patches into an r-devel branch while the R bindings initiative
is in early stages

I don't really see any benefits to hiding early-stage code in a
branch; the README for R should clearly indicate that the API is
experimental. I think it would be better for the code to start going
into the Arrow project (rather than staying in your personal branch)
for a few reasons:

* More opportunities for the community to participate
* More visible progress / transparency into what is going on
* You will earn karma in the Apache project and be on your way to
becoming a committer
* Opportunities for code review from other C++ developers on use of
the Arrow APIs, and opportunities for improvement
* Incremental IP / licensing oversight (this gets harder when the
patches get bigger)
* Help with roadmapping / enumerating work to be done

On that last note, I would recommend beginning to liberally create
JIRAs as you think of things that need to be done to build first-class
R support for Arrow. JIRA is the simplest way to develop the roadmap
organically; it doesn't need to be anything formal.

Thanks!
Wes

On Tue, Mar 20, 2018 at 12:04 PM, Romain Francois <rom...@purrple.cat> wrote:
> Hello,
>
> Today is Tuesday, so that's the day I work on porting arrow to R. This week, 
> I've continued some of the work from last week, still following the steps of 
> the python front end as documented here: 
> https://arrow.apache.org/docs/python/data.html#type-metadata 
>
> Things are starting to materialize, and I am trying to give it an R feel.
>
> > int32()
> DataType(int32)
> >
> > float64()
> DataType(double)
> >
> > struct( x = int32(), y = float64(), d1 = date32() )
> StructType(struct<x: int32, y: double, d1: date32[day]>)
> >
> > schema( x = int32(), y = float64(), d1 = date32() )
> x: int32
> y: double
> d1: date32[day]
>
>
> This is not that interesting, but it sets a nice premise for the future.
>
> Quick ones:
> - are there examples of uses of pyarrow.union?
> - how does pyarrow.array dispatch to the right array type? And perhaps
> more generally, how do I know what's inside the function?
>
> >>> pa.array([1, 2, None, 3])
> <pyarrow.lib.Int64Array object at 0x10db246d8>
> [
>   1,
>   2,
>   NA,
>   3
> ]
> >>>
> >>> pa.array
> <function pyarrow.lib.array>
>
>
> Romain
>
>
