Hello,
I'm trying to construct a query which will do some simple counts over a
table of events, and insert them in to a summary table.
The query I came up with looks something like
INSERT OVERWRITE TABLE activity
PARTITION(date_utc)
select count(1) views,
from impressions
where dt >= date_sub(to_date(from_unixtime(unix_timestamp())),1)
and dt < to_date(from_unixtime(unix_timestamp()))
group by d_theme;
This seemed to work fine, but I realized it was doing a full table scan,
and not using the partitions in "views" appropriately.
I set hive.mapred.mode=strict and the above query fails to run.
If I change the expressions in the where clause to literals, it works as I
would expect.
Can anyone help me figure out how to do what I want?
I put together a test script to illustrate my problem:
set hive.mapred.mode=strict;
set hive.optimize.ppd=true;
DROP TABLE IF EXISTS test_where_expr;
CREATE TABLE test_where_expr (
ts int
)
PARTITIONED BY ( dt STRING );
explain SELECT * from test_where_expr WHERE dt =
to_date(date_sub(from_unixtime(unix_timestamp()),1));
explain SELECT * from test_where_expr WHERE dt = '2012-06-01';
This is the output:
hive> set hive.mapred.mode=strict;
hive> set hive.optimize.ppd=true;
hive>
> DROP TABLE IF EXISTS test_where_expr;
OK
Time taken: 3.405 seconds
hive>
> CREATE TABLE test_where_expr (
> ts int
> )
> PARTITIONED BY ( dt STRING );
OK
Time taken: 0.189 seconds
hive>
> explain SELECT * from test_where_expr
> WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1));
FAILED: Error in semantic analysis: No partition predicate found for Alias
"test_where_expr" Table "test_where_expr"
hive>
> explain SELECT * from test_where_expr
> WHERE dt = '2012-06-01';
OK
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME test_where_expr)))
(TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT
(TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (TOK_TABLE_OR_COL dt)
'2012-06-01'))))
STAGE DEPENDENCIES:
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-0
Fetch Operator
limit: -1
Time taken: 0.166 seconds
hive> hadoop@ip-10-68-190-136:~$ hive --version
Hive version 0.8.1.
hadoop@ip-10-68-190-136:~$ hadoop -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Thanks,
--Matthew Hooker