I observed that the FilterOperator is applied only once if hive.optimize.ppd
is set to false. I think there is a bug in predicate pushdown, so I raised
HIVE-1538.
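
To compare the two plans, something like the following could be run in the
Hive CLI. This is only a sketch: it assumes the same input1 table and query
as in the quoted message below, and it shows no output because none was
captured for the hive.optimize.ppd=false case.

```sql
-- Turn predicate pushdown off for the current session.
set hive.optimize.ppd=false;
-- Per the observation above, this plan should contain one Filter Operator.
explain select * from input1 where input1.key != 10;

-- Turn predicate pushdown back on and compare with the quoted plan,
-- which contains two Filter Operators with the same predicate.
set hive.optimize.ppd=true;
explain select * from input1 where input1.key != 10;
```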

Thanks
Amareshwari

On 8/12/10 3:01 PM, "Amareshwari Sri Ramadasu" <amar...@yahoo-inc.com> wrote:

Hi,

I see that if a query has a WHERE clause, the FilterOperator is applied twice.
Can you tell me why this is done?
It seems the second operator always filters zero rows.

EXPLAIN output for a query with a WHERE clause:
hive> explain select * from input1 where input1.key != 10;
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_TABREF input1)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (!= (. (TOK_TABLE_OR_COL input1) key) 10))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        input1
          TableScan
            alias: input1
            Filter Operator
              predicate:
                  expr: (key <> 10)
                  type: boolean
              Filter Operator
                predicate:
                    expr: (key <> 10)
                    type: boolean
                Select Operator
                  expressions:
                        expr: key
                        type: int
                        expr: value
                        type: int
                  outputColumnNames: _col0, _col1
                  File Output Operator
                    compressed: false
                    GlobalTableId: 0
                    table:
                        input format: org.apache.hadoop.mapred.TextInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

  Stage: Stage-0
    Fetch Operator
      limit: -1

The Mapper logs show the same thing: the first FilterOperator does the
filtering, and the second operator always filters zero rows.

2010-08-12 14:33:22,149 INFO ExecMapper:
<MAP>Id =5
  <Children>
    <TS>Id =0
      <Children>
        <FIL>Id =1
          <Children>
            <FIL>Id =2
              <Children>
                <SEL>Id =3
                  <Children>
                    <FS>Id =4
                      <Parent>Id = 3 null<\Parent>
                    <\FS>
                  <\Children>
                  <Parent>Id = 2 null<\Parent>
                <\SEL>
              <\Children>
              <Parent>Id = 1 null<\Parent>
            <\FIL>
          <\Children>
          <Parent>Id = 0 null<\Parent>
        <\FIL>
      <\Children>
      <Parent>Id = 5 null<\Parent>
    <\TS>
  <\Children>
<\MAP>

2010-08-12 14:33:22,272 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 5 forwarding 1 rows
2010-08-12 14:33:22,272 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
2010-08-12 14:33:22,450 INFO ExecMapper: ExecMapper: processing 1 rows: used memory = 4417072
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 5 finished. closing...
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 5 forwarded 1 rows
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 1 rows
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 1 finished. closing...
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 1 forwarded 0 rows
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: FILTERED:1
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: PASSED:0
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 2 finished. closing...
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 2 forwarded 0 rows
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: FILTERED:0
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: PASSED:0
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 3 finished. closing...
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 3 forwarded 0 rows
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 4 finished. closing...
2010-08-12 14:33:22,450 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 4 forwarded 0 rows
2010-08-12 14:33:22,451 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://localhost:19000/tmp/hive-amarsri/hive_2010-08-12_14-33-14_470_1825337114959896683/_tmp.-ext-10001/000000_0
2010-08-12 14:33:22,451 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://localhost:19000/tmp/hive-amarsri/hive_2010-08-12_14-33-14_470_1825337114959896683/_tmp.-ext-10001/_tmp.000000_0
2010-08-12 14:33:22,454 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://localhost:19000/tmp/hive-amarsri/hive_2010-08-12_14-33-14_470_1825337114959896683/_tmp.-ext-10001/000000_0
2010-08-12 14:33:22,485 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 3 Close done
2010-08-12 14:33:22,485 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 2 Close done
2010-08-12 14:33:22,485 INFO org.apache.hadoop.hive.ql.exec.FilterOperator: 1 Close done
2010-08-12 14:33:22,485 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2010-08-12 14:33:22,485 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 5 Close done
2010-08-12 14:33:22,485 INFO ExecMapper: ExecMapper: processed 1 rows: used memory = 5135888

Thanks
Amareshwari
