Thanks a lot to Wes and Liya for the feedback.

Agreed that the parsing performance of CSV files is important. I just found a
benchmark of Java CSV libraries [1][2] which shows that FastCSV has a clear
advantage. In any case, I will run the benchmarks myself.


Thanks,
Ji Liu

[1] https://raw.githubusercontent.com/osiegmar/FastCSV/master/benchmark.png
[2] https://github.com/osiegmar/FastCSV


------------------------------------------------------------------
From:Fan Liya <[email protected]>
Send Time:July 19, 2019 (Friday) 10:14
To:dev <[email protected]>
Cc:Ji Liu <[email protected]>; Micah Kornfield <[email protected]>
Subject:Re: [DISCUSS][JAVA] Implement a CSV to Arrow adapter

Hi Ji,

Thanks for proposing this. A CSV adapter sounds like a useful feature.

Best,
Liya Fan

On Fri, Jul 19, 2019 at 12:31 AM Wes McKinney <[email protected]> wrote:
We wrote a custom reader in C++ since performance of parsing CSV files
 matters a lot -- we wanted to do multi-threaded execution of
 conversion steps, also. I don't know what the performance of
 commons-csv is but it might be worth doing some benchmarks to see.

 On Thu, Jul 18, 2019 at 4:35 AM Ji Liu <[email protected]> wrote:
 >
 > Hi all,
 >
 > It seems there is no adapter on the Java side to convert CSV data to Arrow
 > data, although C++ has one. We already have a JDBC adapter, an ORC adapter
 > and an Avro adapter (in progress), so I think an adapter for CSV would
 > also be nice.
 > After a brief discussion with @Micah Kornfield, Apache commons-csv [1] seems
 > to be an efficient CSV parser that we could potentially leverage, but I don't
 > know if there are better options. Any input and comments would be appreciated.
 >
 > Thanks,
 > Ji Liu
 >
 > [1] https://commons.apache.org/proper/commons-csv/
