Hello,

I'm trying to construct a query which will do some simple counts over a
table of events, and insert them in to a summary table.

The query I came up with looks something like

INSERT OVERWRITE TABLE activity
PARTITION(date_utc)
select count(1) views,
from impressions
where dt >= date_sub(to_date(from_unixtime(unix_timestamp())),1)
and dt < to_date(from_unixtime(unix_timestamp()))
group by d_theme;

This seemed to work fine, but I realized it was doing a full table scan,
and not using the partitions in "views" appropriately.

I set hive.mapred.mode=strict and the above query fails to run.
If I change the expressions in the where clause to literals, it works as I
would expect.

Can anyone help me figure out how to do what I want?


I put together a test script to illustrate my problem:

set hive.mapred.mode=strict;
set hive.optimize.ppd=true;
DROP TABLE IF EXISTS test_where_expr;
CREATE TABLE test_where_expr (
    ts int
  )
  PARTITIONED BY ( dt STRING );
explain SELECT * from test_where_expr WHERE dt =
to_date(date_sub(from_unixtime(unix_timestamp()),1));
explain SELECT * from test_where_expr WHERE dt = '2012-06-01';


This is the output:

hive> set hive.mapred.mode=strict;
hive> set hive.optimize.ppd=true;
hive>
    > DROP TABLE IF EXISTS test_where_expr;
OK
Time taken: 3.405 seconds
hive>
    > CREATE TABLE test_where_expr (
    >     ts int
    >   )
    >   PARTITIONED BY ( dt STRING );
OK
Time taken: 0.189 seconds
hive>
    > explain SELECT * from test_where_expr
    > WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1));
FAILED: Error in semantic analysis: No partition predicate found for Alias
"test_where_expr" Table "test_where_expr"
hive>
    > explain SELECT * from test_where_expr
    > WHERE dt = '2012-06-01';
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME test_where_expr)))
(TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT
(TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (TOK_TABLE_OR_COL dt)
'2012-06-01'))))

STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1


Time taken: 0.166 seconds
hive> hadoop@ip-10-68-190-136:~$ hive --version
Hive version 0.8.1.
hadoop@ip-10-68-190-136:~$ hadoop -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)


Thanks,
--Matthew Hooker

Reply via email to