I was able to find the property with some digging around and experimentation. I never knew that ppd had anything to do with this property.
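
For anyone who finds this thread later, here is a minimal sketch of the session settings involved, assuming they are issued from the Hive CLI before the query runs. The table name, column names, and the "dummy" literal are taken from the quoted messages below and are only placeholders for your own schema. I have only verified this on the setup described in the thread, so treat it as a starting point rather than a definitive recipe.

```sql
-- hive.optimize.ppd was already true in my configuration; on its own it
-- did not produce an ORC pushdown predicate in the task logs.
SET hive.optimize.ppd=true;

-- This is the setting that made the predicate show up in the
-- OrcInputFormat log lines.
SET hive.optimize.index.filter=true;

-- Query from the thread; "test" and its columns are my table, and
-- "dummy" stands in for a real filter value.
SELECT sourceipv4address, sessionid, url
FROM test
WHERE sourceipv4address = "dummy";
```

After the job runs, the task syslog should contain an "ORC pushdown predicate: leaf-0 = ..." line instead of "No ORC pushdown predicate". The same property can presumably be set permanently in the Hive configuration file using the <property> form quoted below.
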
On Thu, Apr 3, 2014 at 7:23 PM, Stephen Sprague <sprag...@gmail.com> wrote:

> wow. good find. i hope these config settings are well documented and that
> you didn't have to spend a lot of time searching for that. Interesting that
> the default isn't true for this one.
>
>
> On Wed, Apr 2, 2014 at 11:00 PM, Abhay Bansal <abhaybansal.1...@gmail.com> wrote:
>
>> I was able to resolve the issue by setting "hive.optimize.index.filter"
>> to true.
>>
>> In the hadoop logs
>> syslog:2014-04-03 05:44:51,204 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included column ids = 3,8,13
>> syslog:2014-04-03 05:44:51,204 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included columns names = sourceipv4address,sessionid,url
>> syslog:2014-04-03 05:44:51,216 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: ORC pushdown predicate: leaf-0 = (EQUALS sourceipv4address 1809657989)
>>
>> I can now see the ORC pushdown predicate.
>>
>> Thanks,
>> -Abhay
>>
>>
>> On Thu, Apr 3, 2014 at 11:14 AM, Stephen Boesch <java...@gmail.com> wrote:
>>
>>> Hi Abhay,
>>> What is the DDL for your "test" table?
>>>
>>>
>>> 2014-04-02 22:36 GMT-07:00 Abhay Bansal <abhaybansal.1...@gmail.com>:
>>>
>>>> I am new to Hive, apologies for asking such a basic question.
>>>>
>>>> The following exercise was done with Hive 0.12 and Hadoop 0.20.203.
>>>>
>>>> I created an ORC file from Java and pushed it into a table with the
>>>> same schema. I checked the conf property
>>>> <property><name>hive.optimize.ppd</name><value>true</value></property>
>>>> which should ideally enable the ppd optimisation.
>>>>
>>>> I ran a query "select sourceipv4address,sessionid,url from test where
>>>> sourceipv4address="dummy";"
>>>>
>>>> Just to see if the ppd optimization is working I checked the hadoop
>>>> logs, where I found
>>>>
>>>> ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,913 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included column ids = 3,8,13
>>>> ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included columns names = sourceipv4address,sessionid,url
>>>> ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: *No ORC pushdown predicate*
>>>>
>>>> I am not sure which part of it I missed. Any help would be
>>>> appreciated.
>>>>
>>>> Thanks,
>>>> -Abhay