In AbstractSerDe you have a method that returns very basic stats about your file format (mostly the size of the data, the number of rows, etc.):
https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/SerDeStats.java

In the initialize method of your SerDe you can retrieve properties related to
partitions and include this information in your file format, if needed (you
don’t need to create folders etc. for partitions; this is done by Hive).

> On 13. May 2018, at 19:09, Elliot West <[email protected]> wrote:
>
> Hi Jörn,
>
> I’m curious to know how the SerDe framework provides the means to deal with
> partitions, table properties, and statistics? I was under the impression that
> these were in the domain of the metastore and I’ve not found anything in the
> SerDe interface related to these. I would appreciate if you could point me in
> the direction of anything I’ve missed.
>
> Thanks,
>
> Elliot.
>
>> On Sun, 13 May 2018 at 15:42, Jörn Franke <[email protected]> wrote:
>> For the details you can check the source code, but a SerDe needs to
>> translate an object to a Hive object and vice versa. Usually this is very
>> simple (simply passing the object on or creating a HiveDecimal etc.). It
>> also provides an ObjectInspector that basically describes an object in more
>> detail (e.g. to be processed by a UDF). For example, it can tell you the
>> precision and scale of an object. In the case of ORC it also describes how
>> a bunch of objects (vectorized) can be mapped to Hive objects and the other
>> way around. Furthermore, it provides statistics and provides means to deal
>> with partitions as well as table properties (!= input/outputformat
>> properties). Although it sounds complex, Hive provides most of the
>> functionality, so implementing a SerDe is easy most of the time.
>>
>> > On 13. May 2018, at 16:34, 侯宗田 <[email protected]> wrote:
>> >
>> > Hello, everyone
>> > I know the JSON SerDe turns the fields in a row into JSON format, and the
>> > CSV SerDe turns them into CSV format, each with their serdeproperties.
>> > But I wonder what the ORC SerDe does when I choose to store as the ORC
>> > file format. And why are there still escaper and separator entries in the
>> > ORC serdeproperties? The same goes for RC and Parquet. I think these
>> > formats are just about how data is stored and compressed by their
>> > respective input and output formats, but I don’t know what their SerDe
>> > does. Can anyone give some hints?
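
As a rough illustration of the reply at the top of this thread (basic stats plus reading properties in initialize), here is a minimal self-contained Java sketch. It deliberately uses no Hive classes: SerDeSketch, ToySerDe and Stats are hypothetical stand-ins for the roles of AbstractSerDe and SerDeStats, and the property keys "columns" and "partition_columns" with their "," and "/" separators are assumptions about what Hive passes in, so check them against the Hive source before relying on them.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.regex.Pattern;

// Toy model only: it mirrors the shape of a SerDe but uses no Hive classes.
class SerDeSketch {

    // Hypothetical stand-in for SerDeStats: raw data size and row count.
    static final class Stats {
        long rawDataSize;
        long rowCount;
    }

    static final class ToySerDe {
        private final List<String> columns = new ArrayList<>();
        private final List<String> partitionColumns = new ArrayList<>();
        private final Stats stats = new Stats();

        // Plays the role of initialize: read the table properties handed in.
        // The keys and separators below are assumptions, not verified constants.
        void initialize(Properties tableProperties) {
            columns.addAll(split(tableProperties.getProperty("columns", ""), ","));
            partitionColumns.addAll(
                split(tableProperties.getProperty("partition_columns", ""), "/"));
        }

        // Plays the role of serialize: encode one row and update the stats.
        String serialize(List<String> row) {
            String line = String.join("\t", row);
            stats.rawDataSize += line.getBytes(StandardCharsets.UTF_8).length;
            stats.rowCount++;
            return line;
        }

        Stats getStats() { return stats; }
        List<String> getColumns() { return columns; }
        List<String> getPartitionColumns() { return partitionColumns; }
    }

    // Split a property value on a literal separator, dropping empty entries.
    private static List<String> split(String value, String separator) {
        List<String> parts = new ArrayList<>();
        for (String part : value.split(Pattern.quote(separator))) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("columns", "id,name");
        props.setProperty("partition_columns", "dt/country");

        ToySerDe serde = new ToySerDe();
        serde.initialize(props);
        serde.serialize(Arrays.asList("1", "alice"));
        serde.serialize(Arrays.asList("2", "bob"));

        System.out.println("partition columns: " + serde.getPartitionColumns());
        System.out.println("rows: " + serde.getStats().rowCount
            + ", raw bytes: " + serde.getStats().rawDataSize);
    }
}
```

The point of the sketch is only the division of labour: the SerDe reads properties once at initialization and keeps simple counters while it serializes, whereas creating partition folders stays with Hive. A real implementation would extend AbstractSerDe and return a SerDeStats object instead.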
