[
https://issues.apache.org/jira/browse/HIVE-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203915#comment-14203915
]
Alon Goldshuv commented on HIVE-7777:
-------------------------------------
Either approach should work: adding OpenCSV-style parsing to LazySimpleSerde,
or adding type support to this new CSV serde.
IMO the deciding factor should be performance. If adding quote stripping to
LazySimpleSerde slows down simple unquoted parsing (e.g., because the parser
must examine its state after each byte instead of seeking quickly to the next
line terminator), I'd say the solution is best split into two separate serdes
(as proposed in this JIRA). If that isn't the case, a single serde (as
proposed by [~rstokes]) is more elegant and user-friendly. [~rstokes], can you
share some information on that, or share the code for your modified
LazySimpleSerde?
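
To make the performance concern concrete, here is a rough sketch of the two
scanning strategies. It is not taken from LazySimpleSerde or from the patch;
the class and method names (RecordScanSketch, plainRecordEnd, quotedRecordEnd)
are made up for illustration, and it works on a String rather than raw bytes
to keep it short. The plain path can jump to the next '\n' with a single
search, while the quote-aware path has to look at every character and carry
an "inside quotes" flag, because a '\n' inside a quoted field does not end
the record.

```java
public class RecordScanSketch {

    // Plain path (hypothetical helper): no parser state is needed, so a
    // single indexOf call finds the next line terminator. Returns the index
    // of the '\n', or text.length() if there is none.
    static int plainRecordEnd(String text, int from) {
        int nl = text.indexOf('\n', from);
        return nl >= 0 ? nl : text.length();
    }

    // Quote-aware path (hypothetical helper): every character is inspected
    // and the inQuotes flag is updated, since a '\n' between quotes belongs
    // to the field. A doubled quote ("") toggles the flag twice, which leaves
    // the state unchanged.
    static int quotedRecordEnd(String text, int from, char quote) {
        boolean inQuotes = false;
        for (int i = from; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;
            } else if (c == '\n' && !inQuotes) {
                return i;
            }
        }
        return text.length();
    }

    public static void main(String[] args) {
        // The first record contains a newline inside a quoted field.
        String text = "a,\"b\nc\",d\ne,f";
        System.out.println("plain:  " + plainRecordEnd(text, 0));
        System.out.println("quoted: " + quotedRecordEnd(text, 0, '"'));
    }
}
```

On input without quotes both methods return the same position, so the open
question is only how much the per-character state tracking costs on the
common unquoted case. That would need to be measured rather than assumed.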
> Add CSV Serde based on OpenCSV
> ------------------------------
>
> Key: HIVE-7777
> URL: https://issues.apache.org/jira/browse/HIVE-7777
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Ferdinand Xu
> Assignee: Ferdinand Xu
> Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7777.1.patch, HIVE-7777.2.patch, HIVE-7777.3.patch,
> HIVE-7777.patch, csv-serde-master.zip
>
>
> Hive has no official CSV serde, although an open source project exists on
> GitHub (https://github.com/ogrodnek/csv-serde). CSV is a very commonly used
> data format.