Re: Select dynamically a partition in an external table

Etienne Dumoulin Mon, 10 Sep 2012 07:24:17 -0700

Hi,

I think there is a bug for selecting partitions dynamically.
There is no problem for sequence file tables.
However for textfile tables it seems to not query it in the same way.
In the logs there is no mapper assigned, see end of this mail, the first
log is for a SEQUENCEFILE storage hive table and the second for a
TEXTFILE storage one (they are not external, through external text
file tables does not work either). I am using hive 0.9.


Regards,

Étienne


hive> SELECT MAX(PERIOD) FROM CONTINUOUS_SUM;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201208201551_0162, Tracking URL =
http://rennes.local.net:50030/jobdetails.jsp?jobid=job_201208201551_0162
Kill Command =
/home/hadoop/project/hadoop/hadoop-0.20.205.0/libexec/../bin/hadoop job
-Dmapred.job.tracker=rennes.local.net:54311 -kill job_201208201551_0162
Hadoop job information for Stage-1: number of mappers: 1; number of
reducers: 1
2012-09-10 14:53:45,540 Stage-1 map = 0%,  reduce = 0%
2012-09-10 14:53:54,622 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
2.3 sec
.
.
.
2012-09-10 14:54:17,948 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
8.5 sec
MapReduce Total cumulative CPU time: 8 seconds 500 msec
Ended Job = job_201208201551_0162
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 8.5 sec   HDFS Read: 279 HDFS
Write: 18 SUCCESS
Total MapReduce CPU Time Spent: 8 seconds 500 msec
OK
20110401-20110408

-----------------------------------------------------------------------------------------------------------------------------------------------------------

hive> SELECT MAX(DT) FROM SUBSCRIBER;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201208201551_0163, Tracking URL =
http://rennes.local.net:50030/jobdetails.jsp?jobid=job_201208201551_0163
Kill Command =
/home/hadoop/project/hadoop/hadoop-0.20.205.0/libexec/../bin/hadoop job
-Dmapred.job.tracker=rennes.local.net:54311 -kill job_201208201551_0163
Hadoop job information for Stage-1: number of mappers: 0; number of
reducers: 1
2012-09-10 14:55:03,443 Stage-1 map = 0%,  reduce = 0%
.
.
.
2012-09-10 14:55:20,597 Stage-1 map = 0%,  reduce = 100%, Cumulative CPU
2.83 sec
2012-09-10 14:55:21,608 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
2.83 sec
MapReduce Total cumulative CPU time: 2 seconds 830 msec
Ended Job = job_201208201551_0163
MapReduce Jobs Launched:
Job 0: Reduce: 1   Cumulative CPU: 2.83 sec   HDFS Read: 0 HDFS Write: 3
SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 830 msec
OK
NULL
Time taken: 30.118 seconds



On 7 September 2012 19:54, Etienne Dumoulin <[email protected]>wrote:

> Hi,
>
> I have an external table partitioned per day, I have currently one
> partition.
>
> When I launch:
>
> SELECT max(day) from MY_EXTERNAL_TABLE GROUP BY DAY;
>
> It returns nothing.
>
> I would like to use this query to select a partition dynamically:
>
>   SELECT ID,
>                  PROPERTY
>     FROM MY_EXTERNAL_TABLE
>     WHERE day IN (
>             SELECT max(day)
>             FROM MY_EXTERNAL_TABLE
>             WHERE day <= '2012-09-07'
>             GROUP BY DAY
>         )
>
> Do I do something wrong?
> Is there a way around?
>
> Thanks,
>
> Étienne
>

Re: Select dynamically a partition in an external table

Reply via email to