Jörn, please do update the wiki; we really need better SerDe documentation.

Getting write access is easy:

About This Wiki -- How to get permission to edit
<https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit>


-- Lefty


On Sun, May 13, 2018 at 10:18 AM Jörn Franke <jornfra...@gmail.com> wrote:

> AbstractSerDe has a method that returns very basic statistics about your
> file format (mainly the size of the data and the number of rows):
>
>
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/SerDeStats.java
>
> In the initialize method of your SerDe you can retrieve properties related
> to partitions and include this information in your file format, if needed
> (you don’t need to create folders etc. for partitions; Hive does this).
>
>
> On 13. May 2018, at 19:09, Elliot West <tea...@gmail.com> wrote:
>
> Hi Jörn,
>
> I’m curious to know how the SerDe framework provides the means to deal
> with partitions, table properties, and statistics? I was under the
> impression that these were in the domain of the metastore and I’ve not
> found anything in the SerDe interface related to these. I would appreciate
> it if you could point me in the direction of anything I’ve missed.
>
> Thanks,
>
> Elliot.
>
> On Sun, 13 May 2018 at 15:42, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> For the details you can check the source code, but in short a SerDe needs
>> to translate an object to a Hive object and vice versa. Usually this is
>> very simple (passing the object through, or creating a HiveDecimal, etc.).
>> It also provides an ObjectInspector that describes an object in more
>> detail (e.g. so that it can be processed by a UDF). For example, it can
>> tell you the precision and scale of an object. In the case of ORC it also
>> describes how a batch of objects (vectorized) can be mapped to Hive
>> objects and the other way around. Furthermore, it provides statistics and
>> the means to deal with partitions as well as table properties (!=
>> input/output format properties). Although this sounds complex, Hive
>> provides most of the functionality, so implementing a SerDe is easy most
>> of the time.
>>
>> > On 13. May 2018, at 16:34, 侯宗田 <zongtian...@icloud.com> wrote:
>> >
>> > Hello everyone,
>> >   I know the JSON SerDe turns the fields in a row into JSON format, and
>> > the CSV SerDe turns them into CSV format according to its
>> > serdeproperties. But I wonder what the ORC SerDe does when I choose
>> > STORED AS ORC. And why are there still escaper and separator entries in
>> > the ORC serdeproperties? The same goes for RC and Parquet. I think these
>> > formats are only about how data is stored and compressed by their
>> > respective input and output formats, but I don’t know what their SerDes
>> > do. Can anyone give a hint?
>>
>
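
As a rough illustration of the points made in this thread (table properties read once in initialize, rows translated into typed objects, basic row-count and size statistics), here is a minimal, dependency-free Java sketch. It does not extend Hive's real AbstractSerDe: the class SketchSerDe and the shapes of its methods are hypothetical and exist only for illustration. The keys "columns.types" and "field.delim" are table property names that Hive conventionally passes to a SerDe. Splitting the type list on ':' is a simplifying assumption that only holds for primitive types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical stand-in for a SerDe; it does NOT extend Hive's AbstractSerDe.
final class SketchSerDe {
    private List<String> columnTypes = new ArrayList<>();
    private String delimiter = ",";
    private long rowCount = 0;     // basic statistic: number of rows seen
    private long rawDataSize = 0;  // basic statistic: characters read

    // Plays the role of initialize(): read the table properties once.
    void initialize(Properties tbl) {
        String types = tbl.getProperty("columns.types", "");
        if (types.isEmpty()) {
            columnTypes = new ArrayList<>();
        } else {
            // Assumption: primitive types only, so ':' is a safe separator.
            columnTypes = Arrays.asList(types.split(":"));
        }
        delimiter = tbl.getProperty("field.delim", ",");
    }

    // Plays the role of deserialize(): one text line becomes one typed row.
    List<Object> deserialize(String line) {
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        List<Object> row = new ArrayList<>();
        for (int i = 0; i < columnTypes.size(); i++) {
            // Missing trailing fields become null.
            String raw = i < fields.length ? fields[i] : null;
            row.add(convert(raw, columnTypes.get(i)));
        }
        rowCount++;
        rawDataSize += line.length();
        return row;
    }

    // Only three cases are handled here; everything else stays a String.
    private static Object convert(String raw, String type) {
        if (raw == null) {
            return null;
        }
        switch (type) {
            case "int":
                return Integer.valueOf(raw);
            case "double":
                return Double.valueOf(raw);
            default:
                return raw;
        }
    }

    long getRowCount() {
        return rowCount;
    }

    long getRawDataSize() {
        return rawDataSize;
    }
}
```

With "columns.types" set to "int:string", deserializing the line "7,alice" yields a row holding an Integer and a String, and the counters advance by one row and seven characters. A real SerDe would additionally return an ObjectInspector describing those column types and would report its statistics through SerDeStats.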
