Hi, I think there is a bug for selecting partitions dynamically. There is no problem for sequence file tables. However for textfile tables it seems to not query it in the same way. In the logs there is no mapper assigned, see end of this mail, the first log is for a SEQUENCEFILE storage hive table and the second for a TEXTFILE storage one (they are not external, through external text file tables does not work either). I am using hive 0.9.
Regards, Étienne hive> SELECT MAX(PERIOD) FROM CONTINUOUS_SUM; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapred.reduce.tasks=<number> Starting Job = job_201208201551_0162, Tracking URL = http://rennes.local.net:50030/jobdetails.jsp?jobid=job_201208201551_0162 Kill Command = /home/hadoop/project/hadoop/hadoop-0.20.205.0/libexec/../bin/hadoop job -Dmapred.job.tracker=rennes.local.net:54311 -kill job_201208201551_0162 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2012-09-10 14:53:45,540 Stage-1 map = 0%, reduce = 0% 2012-09-10 14:53:54,622 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.3 sec . . . 2012-09-10 14:54:17,948 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 8.5 sec MapReduce Total cumulative CPU time: 8 seconds 500 msec Ended Job = job_201208201551_0162 MapReduce Jobs Launched: Job 0: Map: 1 Reduce: 1 Cumulative CPU: 8.5 sec HDFS Read: 279 HDFS Write: 18 SUCCESS Total MapReduce CPU Time Spent: 8 seconds 500 msec OK 20110401-20110408 ----------------------------------------------------------------------------------------------------------------------------------------------------------- hive> SELECT MAX(DT) FROM SUBSCRIBER; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapred.reduce.tasks=<number> Starting Job = job_201208201551_0163, Tracking URL = http://rennes.local.net:50030/jobdetails.jsp?jobid=job_201208201551_0163 Kill Command = /home/hadoop/project/hadoop/hadoop-0.20.205.0/libexec/../bin/hadoop job -Dmapred.job.tracker=rennes.local.net:54311 -kill job_201208201551_0163 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2012-09-10 14:55:03,443 Stage-1 map = 0%, reduce = 0% . . . 2012-09-10 14:55:20,597 Stage-1 map = 0%, reduce = 100%, Cumulative CPU 2.83 sec 2012-09-10 14:55:21,608 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.83 sec MapReduce Total cumulative CPU time: 2 seconds 830 msec Ended Job = job_201208201551_0163 MapReduce Jobs Launched: Job 0: Reduce: 1 Cumulative CPU: 2.83 sec HDFS Read: 0 HDFS Write: 3 SUCCESS Total MapReduce CPU Time Spent: 2 seconds 830 msec OK NULL Time taken: 30.118 seconds On 7 September 2012 19:54, Etienne Dumoulin <[email protected]>wrote: > Hi, > > I have an external table partitioned per day, I have currently one > partition. > > When I launch: > > SELECT max(day) from MY_EXTERNAL_TABLE GROUP BY DAY; > > It returns nothing. > > I would like to use this query to select a partition dynamically: > > SELECT ID, > PROPERTY > FROM MY_EXTERNAL_TABLE > WHERE day IN ( > SELECT max(day) > FROM MY_EXTERNAL_TABLE > WHERE day <= '2012-09-07' > GROUP BY DAY > ) > > Do I do something wrong? > Is there a way around? > > Thanks, > > Étienne >
