I'd like to ask that this vote period be extended. I'm interested, but I don't have the cycles to review it in detail and make an informed vote until the 25th.

On Tue, May 14, 2019 at 1:49 AM Xiangrui Meng <m...@databricks.com> wrote:

> My vote is 0. Since the updated SPIP focuses on ETL use cases, I don't
> feel strongly about it. I would still suggest doing the following:
>
> 1. Link the POC mentioned in Q4. So people can verify the POC result.
> 2. List public APIs we plan to expose in Appendix A. I did a quick check.
> Besides ColumnarBatch and ColumnarVector, we also need to make the following
> public. People who are familiar with SQL internals should help assess the
> risk.
> * ColumnarArray
> * ColumnarMap
> * unsafe.types.CalendarInterval
> * ColumnarRow
> * UTF8String
> * ArrayData
> * ...
> 3. I still feel using Pandas UDF as the mid-term success doesn't match the
> purpose of this SPIP. It does make some code cleaner. But I guess for ETL
> use cases, it won't bring much value.
>
>

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau