/DL systems that can benefit from columnar format are mostly
> in Python.
> > > > 3. Simple operations, though they benefit from vectorization, might not be
> worth the data exchange overhead.
> > > >
> > > > So would an improved Pandas UDF API be g
2. ML/DL systems that can benefit from columnar format are mostly in
> > > Python.
> > > 3. Simple operations, though they benefit from vectorization, might not be worth
> > > the data exchange overhead.
> > >
> > > So would an improved Pandas UDF API be good enough?
amji
Sent: Friday, April 19, 2019 12:21 PM
To: Bryan Cutler
Cc: Dev
Subject: Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended
Columnar Processing Support
+1 (non-binding)
Sent from my iPhone
Pardon the dumb thumb typos :)
On Apr 19, 2019, at 10:30 AM, Bryan Cutler wrote:
+1 (n
RA. Main concerns here:
>>
>> 1. Exposing third-party Java APIs in Spark is risky. Arrow might have a
>> 1.0 release someday.
>> 2. ML/DL systems that can benefit from columnar format are mostly in
>> Python.
>> 3. Simple opera
the SPIP JIRA.
>>>>> On Sat, Apr 20, 2019 at 12:52 AM Xiangrui Meng <men...@gmail.com>
hat I did not join the discussion earlier! Hope it is not too
>> late :)
>> > >
>> > > On Fri, Apr 19, 2019 at 1:20 PM wrote:
>> > > +1 (non-binding) for better columnar data processing support.
>> > >
>> > >
>> > >
>>
mnar format are mostly in
> Python.
> > > 3. Simple operations, though they benefit from vectorization, might not be
> worth the data exchange overhead.
> > >
> > > So would an improved Pandas UDF API be good enough? For example,
> SPARK-26412 (UDF that takes an ite
oved Pandas UDF API be good enough? For example,
> > SPARK-26412 (UDF that takes an iterator of Arrow batches).
> >
> > Sorry that I did not join the discussion earlier! Hope it is not too late :)
> >
> > On Fri, Apr 19, 2019 at 1:20 PM
row might have a
>> 1.0 release someday.
>> > > 2. ML/DL systems that can benefit from columnar format are mostly in
>> Python.
>> > > 3. Simple operations, though they benefit from vectorization, might not be
>> worth the data exchange overhead.
>> > >
>>
of Arrow batches).
> > >
> > > Sorry that I did not join the discussion earlier! Hope it is not too
> late :)
> > >
> > > On Fri, Apr 19, 2019 at 1:20 PM wrote:
> > > +1 (non-binding) for better columnar data processing support.
> > >