Hi,

> While using ORC file format, I would like to see in the logs that
> stripes and/or row-groups are being skipped based on my where clause.

There's no logging in the inner loop there.

> Is that info even outputted ? If so, what do I need to enable it ?

You can run the query twice with the following settings and compare the two summaries.

hive> set hive.tez.exec.print.summary=true;
hive> set hive.optimize.index.filter=false;
// run query
hive> set hive.optimize.index.filter=true;
// run query

You'll get numbers which indicate how much row-filtering is happening, since the input records count for the vertex tracks the actual records read off ORC.

For an example of what that does, see
<http://www.slideshare.net/Hadoop_Summit/orc-2015-faster-better-smaller/21>

If you have hive-1.2.0 builds, then you can also try setting the TBLPROPERTIES for orc.bloom.filter.columns to use the new bloom filter row indexes as well. For Strings, that should work much better than the current min-max index.

Cheers,
Gopal
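
P.S. For the bloom filter option, here is a rough sketch of what the table setup might look like. The table name web_logs, its columns, and the literal in the WHERE clause are made-up placeholders, and orc.bloom.filter.fpp is optional; treat this as an illustration under those assumptions, not a tuned configuration.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE web_logs (
  ts         TIMESTAMP,
  user_agent STRING,
  bytes_sent BIGINT
)
STORED AS ORC
TBLPROPERTIES (
  'orc.bloom.filter.columns' = 'user_agent',  -- comma-separated list of columns to index
  'orc.bloom.filter.fpp'     = '0.05'         -- optional: target false positive probability
);

-- Predicate pushdown has to be enabled for the ORC reader to consult the row indexes.
SET hive.optimize.index.filter=true;

-- An equality predicate on the indexed string column is the case bloom filters help with.
SELECT count(*) FROM web_logs WHERE user_agent = 'curl/7.35.0';
```

The bloom filters are built when the ORC files are written, so setting the property on an existing table only affects data written after the change; older files would need to be rewritten to pick it up.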