Jörn, please do update the wiki, we really need better SerDe documentation.
Getting write access is easy: About This Wiki -- How to get permission to edit
<https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit>

-- Lefty

On Sun, May 13, 2018 at 10:18 AM Jörn Franke <jornfra...@gmail.com> wrote:

> In AbstractSerDe you have a method that returns very basic stats related to
> your file format (mostly the size of the data, the number of rows, etc.):
>
> https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/SerDeStats.java
>
> In the initialize method of your SerDe you can retrieve properties related to
> partitions and include this information in your file format, if needed (you
> don’t need to create folders etc. for partitions - this is done by Hive).
>
> On 13. May 2018, at 19:09, Elliot West <tea...@gmail.com> wrote:
>
> Hi Jörn,
>
> I’m curious to know how the SerDe framework provides the means to deal
> with partitions, table properties, and statistics? I was under the
> impression that these were in the domain of the metastore and I’ve not
> found anything in the SerDe interface related to these. I would appreciate
> it if you could point me in the direction of anything I’ve missed.
>
> Thanks,
>
> Elliot.
>
> On Sun, 13 May 2018 at 15:42, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> For the details you can check the source code, but a SerDe needs to translate
>> an object to a Hive object and vice versa. Usually this is very simple
>> (simply passing the object through, or creating a HiveDecimal etc.). It also
>> provides an ObjectInspector that basically describes an object in more detail
>> (e.g. so that it can be processed by a UDF). For example, it can tell you the
>> precision and scale of an object. In the case of ORC it also describes how a
>> bunch of objects (vectorized) can be mapped to Hive objects and the other way
>> around. Furthermore, it provides statistics and provides means to deal with
>> partitions as well as table properties (!= input/output format properties).
>> Although it sounds complex, Hive provides most of the functionality, so
>> implementing a SerDe is easy most of the time.
>>
>> > On 13. May 2018, at 16:34, 侯宗田 <zongtian...@icloud.com> wrote:
>> >
>> > Hello everyone,
>> > I know the JSON SerDe turns the fields in a row into JSON format, and the
>> > CSV SerDe turns them into CSV format, according to their serdeproperties.
>> > But I wonder what the ORC SerDe does when I choose to store a table in the
>> > ORC file format, and why there are still escaper and separator entries in
>> > the ORC serdeproperties. The same goes for RC and Parquet. I think these
>> > formats are just about how the data is stored and compressed by their
>> > respective input and output formats, but I don’t know what their SerDe
>> > does. Can anyone give a hint?
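
To make the responsibilities discussed in this thread concrete, here is a minimal, self-contained Java sketch. It is not Hive's API: the class ToyDelimitedSerDe, its methods, and the property key "toy.separator" are hypothetical stand-ins, loosely modelled on the roles mentioned above (reading properties in initialize, translating between a stored record and a row object, and keeping basic statistics such as row count and data size). A real SerDe would extend AbstractSerDe and also expose an ObjectInspector, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.regex.Pattern;

// Toy illustration only; NOT the Hive SerDe API. All names are hypothetical.
class ToyDelimitedSerDe {
    private String separator = ",";
    private long rowCount = 0;     // number of records deserialized so far
    private long rawDataSize = 0;  // total size of those records, in characters

    // Analogue of initialize: read the table/SerDe properties once, before any rows.
    void initialize(Properties tableProperties) {
        separator = tableProperties.getProperty("toy.separator", ",");
    }

    // Analogue of deserialize: stored record -> row object (a list of column values).
    List<String> deserialize(String rawRecord) {
        rowCount++;
        rawDataSize += rawRecord.length();
        // Pattern.quote treats the separator literally; -1 keeps trailing empty columns.
        String[] fields = rawRecord.split(Pattern.quote(separator), -1);
        return new ArrayList<>(Arrays.asList(fields));
    }

    // Analogue of serialize: row object -> stored record.
    String serialize(List<String> row) {
        return String.join(separator, row);
    }

    // Analogue of the basic statistics a SerDe can report.
    long getRowCount() { return rowCount; }
    long getRawDataSize() { return rawDataSize; }
}
```

With "toy.separator" set to "|", deserializing the record "1|alice|3.5" yields three column values, and serializing that row gives back the same record. The point of the sketch is only that the separator is a property the SerDe reads at initialization time, independent of how the surrounding file format stores and compresses the bytes.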