I think you misunderstood the point of this SPIP. I responded to your comments in the SPIP JIRA.
On Sat, Apr 20, 2019 at 12:52 AM Xiangrui Meng <men...@gmail.com> wrote:

> I posted my comment in the JIRA
> <https://issues.apache.org/jira/browse/SPARK-27396?focusedCommentId=16822367&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16822367>.
> Main concerns here:
>
> 1. Exposing third-party Java APIs in Spark is risky. Arrow might have a 1.0
> release someday.
> 2. ML/DL systems that can benefit from a columnar format are mostly in
> Python.
> 3. Simple operations, though they benefit from vectorization, might not be
> worth the data exchange overhead.
>
> So would an improved Pandas UDF API be good enough? For example,
> SPARK-26412 <https://issues.apache.org/jira/browse/SPARK-26412> (UDF that
> takes an iterator of Arrow batches).
>
> Sorry, I should have joined the discussion earlier! Hope it is not too late :)
>
> On Fri, Apr 19, 2019 at 1:20 PM <tcon...@gmail.com> wrote:
>
>> +1 (non-binding) for better columnar data processing support.
>>
>> *From:* Jules Damji <dmat...@comcast.net>
>> *Sent:* Friday, April 19, 2019 12:21 PM
>> *To:* Bryan Cutler <cutl...@gmail.com>
>> *Cc:* Dev <dev@spark.apache.org>
>> *Subject:* Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended
>> Columnar Processing Support
>>
>> + (non-binding)
>>
>> Sent from my iPhone
>> Pardon the dumb thumb typos :)
>>
>> On Apr 19, 2019, at 10:30 AM, Bryan Cutler <cutl...@gmail.com> wrote:
>>
>> +1 (non-binding)
>>
>> On Thu, Apr 18, 2019 at 11:41 AM Jason Lowe <jl...@apache.org> wrote:
>>
>> +1 (non-binding). Looking forward to seeing better support for
>> processing columnar data.
>>
>> Jason
>>
>> On Tue, Apr 16, 2019 at 10:38 AM Tom Graves <tgraves...@yahoo.com.invalid>
>> wrote:
>>
>> Hi everyone,
>>
>> I'd like to call for a vote on SPARK-27396 - SPIP: Public APIs for
>> extended Columnar Processing Support. The proposal is to extend the
>> support to allow for more columnar processing.
>>
>> You can find the full proposal in the jira at:
>> https://issues.apache.org/jira/browse/SPARK-27396. There was also a
>> DISCUSS thread in the dev mailing list.
>>
>> Please vote as early as you can. I will leave the vote open until next
>> Monday (the 22nd), 2pm CST to give people plenty of time.
>>
>> [ ] +1: Accept the proposal as an official SPIP
>> [ ] +0
>> [ ] -1: I don't think this is a good idea because ...
>>
>> Thanks!
>> Tom Graves