Sergey Shelukhin created TEZ-1081:
-------------------------------------

             Summary: expose some basic statistics from 
org.apache.tez.runtime.api.Input (or similar)
                 Key: TEZ-1081
                 URL: https://issues.apache.org/jira/browse/TEZ-1081
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Sergey Shelukhin


Hive loads data from org.apache.tez.runtime.api.Input into mapjoin hashtables.
It would be useful to know in advance:
1) How many rows there are in the input (should be easy to add).
2) How many unique keys there are (even an approximation would help).
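
As a rough illustration of the two statistics requested above, the sketch below counts rows exactly and estimates the number of unique keys with linear counting over a fixed-size bitmap. Everything in it is an assumption made for illustration: the class InputStatsSketch, its methods, and the choice of linear counting are hypothetical and are not part of org.apache.tez.runtime.api or of Hive.

```java
import java.util.BitSet;

/**
 * Hypothetical statistics collector (not a Tez or Hive class).
 * Tracks an exact row count and an approximate distinct-key count.
 */
final class InputStatsSketch {
    private final BitSet bitmap;   // one bit per hash bucket
    private final int numBits;     // bitmap size m
    private long rowCount;         // exact number of observed rows

    InputStatsSketch(int numBits) {
        this.numBits = numBits;
        this.bitmap = new BitSet(numBits);
    }

    /** Record one row identified by its key. */
    void observe(Object key) {
        rowCount++;
        int h = mix(key.hashCode());
        bitmap.set(Math.floorMod(h, numBits));
    }

    long getRowCount() {
        return rowCount;
    }

    /**
     * Linear counting estimate: n is roughly -m * ln(V),
     * where V is the fraction of bits that are still zero.
     */
    long estimateDistinctKeys() {
        int zeros = numBits - bitmap.cardinality();
        if (zeros == 0) {
            // Bitmap is saturated: the true count is at least numBits.
            return numBits;
        }
        return Math.round(-numBits * Math.log((double) zeros / numBits));
    }

    /** 32-bit finalizer-style mixing to spread weak hashCode values. */
    private static int mix(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        h *= 0xc2b2ae35;
        h ^= (h >>> 16);
        return h;
    }
}
```

A producer could call observe(key) once per row while writing its output and publish getRowCount() and estimateDistinctKeys() alongside the data, so that a consumer can size a hashtable before loading. The estimate is only reliable while the bitmap is much larger than the number of distinct keys; a HyperLogLog-style sketch would be a better fit for large inputs.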



--
This message was sent by Atlassian JIRA
(v6.2#6252)
