[ https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergio Peña updated HIVE-11131:
-------------------------------
    Attachment:     (was: HIVE-11131.4.patch)

> Get row information on DataWritableWriter once for better writing performance
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-11131
>                 URL: https://issues.apache.org/jira/browse/HIVE-11131
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 1.2.0
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>         Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch, 
> HIVE-11131.4.patch
>
>
> DataWritableWriter is a class used to write Hive records to Parquet files.
> Currently it retrieves everything it needs to parse a record, such as the
> schema and object inspector, every time a record is written (that is, on
> every write() call).
> We can make this class perform better by initializing the writers for each
> data type once and storing the corresponding object inspectors on each writer.
> The class assumes that subsequent records will have the same object
> inspectors and schema, so it does not need to check for changes. When
> a new schema is written, Parquet creates a new DataWritableWriter.
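
The idea described above can be illustrated with a minimal, self-contained
sketch. This is not the code from the attached patches and it does not use the
Hive or Parquet APIs: every name here (CachedRecordWriterSketch, Field,
FieldWriter, RecordWriter, schemaLookups) is hypothetical, and a plain
StringBuilder stands in for the Parquet record consumer. It only shows the
pattern of resolving the per-field writers once, at construction time, so that
write() no longer inspects the schema for each record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and types are illustrative, not Hive's.
final class CachedRecordWriterSketch {

    enum FieldType { INT, STRING }

    // Simplified stand-in for one column of a schema.
    static final class Field {
        final String name;
        final FieldType type;
        Field(String name, FieldType type) { this.name = name; this.type = type; }
    }

    // One writer per field; each keeps what it needs to handle its own type.
    interface FieldWriter {
        void write(Object value, StringBuilder out);
    }

    static final class RecordWriter {
        private final List<FieldWriter> writers = new ArrayList<>();
        private int schemaLookups = 0; // how many times the schema was inspected

        // The schema is examined here, once, not on every write() call.
        RecordWriter(List<Field> schema) {
            for (Field field : schema) {
                writers.add(createWriter(field));
            }
        }

        private FieldWriter createWriter(Field field) {
            schemaLookups++;
            switch (field.type) {
                case INT:
                    return (value, out) ->
                        out.append(field.name).append('=').append((Integer) value).append(';');
                case STRING:
                    return (value, out) ->
                        out.append(field.name).append("='").append((String) value).append("';");
                default:
                    throw new IllegalArgumentException("Unsupported type: " + field.type);
            }
        }

        // Assumes every row matches the schema passed to the constructor.
        String write(Object[] row) {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < writers.size(); i++) {
                writers.get(i).write(row[i], out);
            }
            return out.toString();
        }

        int schemaLookups() { return schemaLookups; }
    }

    public static void main(String[] args) {
        RecordWriter writer = new RecordWriter(Arrays.asList(
                new Field("id", FieldType.INT), new Field("name", FieldType.STRING)));
        System.out.println(writer.write(new Object[] {1, "a"}));
        System.out.println(writer.write(new Object[] {2, "b"}));
        System.out.println("schema lookups: " + writer.schemaLookups());
    }
}
```

In this sketch the lookup counter stays at the number of fields no matter how
many rows are written. A different schema would require constructing a new
RecordWriter, which mirrors the assumption stated in the description.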



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
