For the details you can check the source code, but in short a SerDe has to translate a stored object into a Hive object and vice versa. Usually this is very simple (just passing the object through, or creating a HiveDecimal, etc.). It also provides an ObjectInspector that describes an object in more detail (e.g. so that it can be processed by a UDF). For example, it can tell you the precision and scale of an object. In the case of ORC it also describes how a batch of objects (vectorized) can be mapped to Hive objects and back. Furthermore, it provides statistics and the means to deal with partitions as well as table properties (!= input/output format properties). Although this sounds complex, Hive provides most of the functionality, so implementing a SerDe is usually easy.
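To make the division of labour a bit more concrete, here is a minimal sketch in plain Java. It does not use Hive's real classes: ToyDecimalSerDe and ToyDecimalInspector are hypothetical stand-ins I made up for illustration, and representing the stored value as a String is an assumption. It only shows the idea of translating in both directions while an inspector object reports precision and scale.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ObjectInspector: it only describes the column.
final class ToyDecimalInspector {
    final int precision;
    final int scale;

    ToyDecimalInspector(int precision, int scale) {
        this.precision = precision;
        this.scale = scale;
    }
}

// Hypothetical stand-in for a SerDe handling a single decimal column.
final class ToyDecimalSerDe {
    private final ToyDecimalInspector inspector;

    ToyDecimalSerDe(int precision, int scale) {
        this.inspector = new ToyDecimalInspector(precision, scale);
    }

    // Lets the engine (or a UDF) ask what kind of object it will receive.
    ToyDecimalInspector getObjectInspector() {
        return inspector;
    }

    // Storage representation -> engine object, rounded to the declared scale.
    BigDecimal deserialize(String stored) {
        return new BigDecimal(stored.trim())
                .setScale(inspector.scale, RoundingMode.HALF_UP);
    }

    // Engine object -> storage representation.
    String serialize(BigDecimal value) {
        return value.setScale(inspector.scale, RoundingMode.HALF_UP)
                .toPlainString();
    }

    // Batch variant, loosely analogous to mapping a vector of values at once.
    List<BigDecimal> deserializeBatch(String[] stored) {
        List<BigDecimal> out = new ArrayList<>(stored.length);
        for (String s : stored) {
            out.add(deserialize(s));
        }
        return out;
    }
}
```

A real SerDe works against Hive's own types and inspectors rather than these toy classes, but the shape is the same: two translation directions plus a description of the data.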
> On 13. May 2018, at 16:34, 侯宗田 <[email protected]> wrote:
>
> Hello everyone,
> I know the JSON SerDe turns the fields in a row into JSON format, and the CSV
> SerDe turns them into CSV format according to their serdeproperties. But I
> wonder what the ORC SerDe does when I choose to store as the ORC file format.
> And why are there still escaper and separator entries in the ORC
> serdeproperties? The same goes for RC and Parquet. I think they are just about
> how the data is stored and compressed by their input and output formats
> respectively, but I don't know what their SerDes do. Can anyone give a hint?
