Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

Thomas Graves Tue, 14 May 2019 11:00:11 -0700

Thanks for replying. I'll extend the vote until May 26th to allow feedback
from you and from others who haven't had time to look at it.


Tom

On Mon, May 13, 2019 at 4:43 PM Holden Karau <hol...@pigscanfly.ca> wrote:
>
> I’d like to ask that this vote period be extended. I’m interested, but I
> don’t have the cycles to review it in detail and make an informed vote until
> the 25th.
>
> On Tue, May 14, 2019 at 1:49 AM Xiangrui Meng <m...@databricks.com> wrote:
>>
>> My vote is 0. Since the updated SPIP focuses on ETL use cases, I don't feel 
>> strongly about it. I would still suggest doing the following:
>>
>> 1. Link the POC mentioned in Q4 so people can verify the POC result.
>> 2. List the public APIs we plan to expose in Appendix A. I did a quick check:
>> besides ColumnarBatch and ColumnarVector, we also need to make the following
>> public. People who are familiar with SQL internals should help assess the
>> risk.
>> * ColumnarArray
>> * ColumnarMap
>> * unsafe.types.CalendarInterval
>> * ColumnarRow
>> * UTF8String
>> * ArrayData
>> * ...
>> 3. I still feel using Pandas UDF as the mid-term success doesn't match the 
>> purpose of this SPIP. It does make some code cleaner. But I guess for ETL 
>> use cases, it won't bring much value.
>>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau

