Re: Need help running query over "yesterday". getting "No partition predicate found" errors
Thanks, Mark. I was hoping the expressions could be evaluated at query time, but generating the query works just as well. On Mon, Jun 4, 2012 at 7:42 PM, Mark Grover wrote: > Hi Matthew, > unix_timestamp() is being executed on the Hive server in your case. > Therefore, as your query progresses, the timestamp returned by your > unix_timestamp() would change. Based on your usage, this would lead to > unsuspecting problems when this queries runs across UTC midnight. In any > case, I'd recommend that you get your Hive client code to get the present > unix_timestamp and generate your partition pruning constant(s) on the > client before you send your query to the server. When you do so, partition > pruning will occur as expected. > > Mark > > - Original Message - > From: "Matthew Hooker" > To: user@hive.apache.org > Sent: Monday, June 4, 2012 8:01:26 PM > Subject: Need help running query over "yesterday". getting "No partition > predicate found" errors > > Hello, > > > I'm trying to construct a query which will do some simple counts over a > table of events, and insert them in to a summary table. > > > The query I came up with looks something like > > > > INSERT OVERWRITE TABLE activity > PARTITION(date_utc) > select count(1) views, > from impressions > > where dt >= date_sub(to_date(from_unixtime(unix_timestamp())),1) > and dt < to_date(from_unixtime(unix_timestamp())) > group by d_theme; > > > This seemed to work fine, but I realized it was doing a full table scan, > and not using the partitions in "views" appropriately. > > > I set hive.mapred.mode=strict and the above query fails to run. > If I change the expressions in the where clause to literals, it works as I > would expect. > > > Can anyone help me figure out how to do what I want? > > > > > I put together a test script to illustrate my problem: > > > > set hive.mapred.mode=strict; > set hive.optimize.ppd=true; > DROP TABLE IF EXISTS test_where_expr; > CREATE TABLE test_where_expr ( > ts int > ) > PARTITIONED BY ( dt STRING ); > explain SELECT * from test_where_expr WHERE dt = > to_date(date_sub(from_unixtime(unix_timestamp()),1)); > explain SELECT * from test_where_expr WHERE dt = '2012-06-01'; > > > > > This is the output: > > > > hive> set hive.mapred.mode=strict; > hive> set hive.optimize.ppd=true; > hive> > > DROP TABLE IF EXISTS test_where_expr; > OK > Time taken: 3.405 seconds > hive> > > CREATE TABLE test_where_expr ( > > ts int > > ) > > PARTITIONED BY ( dt STRING ); > OK > Time taken: 0.189 seconds > hive> > > explain SELECT * from test_where_expr > > WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1)); > FAILED: Error in semantic analysis: No partition predicate found for Alias > "test_where_expr" Table "test_where_expr" > hive> > > explain SELECT * from test_where_expr > > WHERE dt = '2012-06-01'; > OK > ABSTRACT SYNTAX TREE: > (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME test_where_expr))) > (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT > (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (TOK_TABLE_OR_COL dt) > '2012-06-01' > > > STAGE DEPENDENCIES: > Stage-0 is a root stage > > > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: -1 > > > > > Time taken: 0.166 seconds > hive> hadoop@ip-10-68-190-136:~$ hive --version > Hive version 0.8.1. > hadoop@ip-10-68-190-136:~$ hadoop -version > java version "1.6.0_26" > Java(TM) SE Runtime Environment (build 1.6.0_26-b03) > Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) > > > > > > Thanks, > --Matthew Hooker >
Re: Need help running query over "yesterday". getting "No partition predicate found" errors
Hi Matthew, unix_timestamp() is being executed on the Hive server in your case. Therefore, as your query progresses, the timestamp returned by your unix_timestamp() would change. Based on your usage, this would lead to unsuspecting problems when this queries runs across UTC midnight. In any case, I'd recommend that you get your Hive client code to get the present unix_timestamp and generate your partition pruning constant(s) on the client before you send your query to the server. When you do so, partition pruning will occur as expected. Mark - Original Message - From: "Matthew Hooker" To: user@hive.apache.org Sent: Monday, June 4, 2012 8:01:26 PM Subject: Need help running query over "yesterday". getting "No partition predicate found" errors Hello, I'm trying to construct a query which will do some simple counts over a table of events, and insert them in to a summary table. The query I came up with looks something like INSERT OVERWRITE TABLE activity PARTITION(date_utc) select count(1) views, from impressions where dt >= date_sub(to_date(from_unixtime(unix_timestamp())),1) and dt < to_date(from_unixtime(unix_timestamp())) group by d_theme; This seemed to work fine, but I realized it was doing a full table scan, and not using the partitions in "views" appropriately. I set hive.mapred.mode=strict and the above query fails to run. If I change the expressions in the where clause to literals, it works as I would expect. Can anyone help me figure out how to do what I want? I put together a test script to illustrate my problem: set hive.mapred.mode=strict; set hive.optimize.ppd=true; DROP TABLE IF EXISTS test_where_expr; CREATE TABLE test_where_expr ( ts int ) PARTITIONED BY ( dt STRING ); explain SELECT * from test_where_expr WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1)); explain SELECT * from test_where_expr WHERE dt = '2012-06-01'; This is the output: hive> set hive.mapred.mode=strict; hive> set hive.optimize.ppd=true; hive> > DROP TABLE IF EXISTS test_where_expr; OK Time taken: 3.405 seconds hive> > CREATE TABLE test_where_expr ( > ts int > ) > PARTITIONED BY ( dt STRING ); OK Time taken: 0.189 seconds hive> > explain SELECT * from test_where_expr > WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1)); FAILED: Error in semantic analysis: No partition predicate found for Alias "test_where_expr" Table "test_where_expr" hive> > explain SELECT * from test_where_expr > WHERE dt = '2012-06-01'; OK ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME test_where_expr))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (TOK_TABLE_OR_COL dt) '2012-06-01' STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Time taken: 0.166 seconds hive> hadoop@ip-10-68-190-136:~$ hive --version Hive version 0.8.1. hadoop@ip-10-68-190-136:~$ hadoop -version java version "1.6.0_26" Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) Thanks, --Matthew Hooker
Need help running query over "yesterday". getting "No partition predicate found" errors
Hello, I'm trying to construct a query which will do some simple counts over a table of events, and insert them in to a summary table. The query I came up with looks something like INSERT OVERWRITE TABLE activity PARTITION(date_utc) select count(1) views, from impressions where dt >= date_sub(to_date(from_unixtime(unix_timestamp())),1) and dt < to_date(from_unixtime(unix_timestamp())) group by d_theme; This seemed to work fine, but I realized it was doing a full table scan, and not using the partitions in "views" appropriately. I set hive.mapred.mode=strict and the above query fails to run. If I change the expressions in the where clause to literals, it works as I would expect. Can anyone help me figure out how to do what I want? I put together a test script to illustrate my problem: set hive.mapred.mode=strict; set hive.optimize.ppd=true; DROP TABLE IF EXISTS test_where_expr; CREATE TABLE test_where_expr ( ts int ) PARTITIONED BY ( dt STRING ); explain SELECT * from test_where_expr WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1)); explain SELECT * from test_where_expr WHERE dt = '2012-06-01'; This is the output: hive> set hive.mapred.mode=strict; hive> set hive.optimize.ppd=true; hive> > DROP TABLE IF EXISTS test_where_expr; OK Time taken: 3.405 seconds hive> > CREATE TABLE test_where_expr ( > ts int > ) > PARTITIONED BY ( dt STRING ); OK Time taken: 0.189 seconds hive> > explain SELECT * from test_where_expr > WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1)); FAILED: Error in semantic analysis: No partition predicate found for Alias "test_where_expr" Table "test_where_expr" hive> > explain SELECT * from test_where_expr > WHERE dt = '2012-06-01'; OK ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME test_where_expr))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (TOK_TABLE_OR_COL dt) '2012-06-01' STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Time taken: 0.166 seconds hive> hadoop@ip-10-68-190-136:~$ hive --version Hive version 0.8.1. hadoop@ip-10-68-190-136:~$ hadoop -version java version "1.6.0_26" Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode) Thanks, --Matthew Hooker