Sergey Shelukhin created TEZ-1081:
-------------------------------------
Summary: expose some basic statistics from org.apache.tez.runtime.api.Input (or similar)
Key: TEZ-1081
URL: https://issues.apache.org/jira/browse/TEZ-1081
Project: Apache Tez
Issue Type: Improvement
Reporter: Sergey Shelukhin
Hive loads data from org.apache.tez.runtime.api.Input into mapjoin hashtables.
It would be useful to know in advance:
1) How many rows the input contains (should be easy to add).
2) How many unique keys it contains (even an approximation would do).
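
As a rough illustration of why these two numbers matter to a consumer such as Hive, here is a minimal Java sketch. The InputStatistics interface and its methods are hypothetical names invented for this example and are not part of the Tez API; the sketch only assumes that some statistics object could report a row count and an approximate distinct-key count, and shows how the key count could be used to pre-size a hashtable so that it does not rehash while loading.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical statistics view of an input; NOT an existing Tez interface. */
interface InputStatistics {
    /** Total number of rows the input will produce. */
    long getRowCount();

    /** Number of distinct keys in the input; may be an approximation. */
    long getApproxDistinctKeys();
}

final class MapJoinSizing {
    private MapJoinSizing() {}

    /**
     * Initial capacity large enough to hold expectedKeys entries at the
     * given load factor, clamped to the range [16, 2^30] that
     * java.util.HashMap effectively supports.
     */
    static int initialCapacity(long expectedKeys, float loadFactor) {
        long needed = (long) Math.ceil(expectedKeys / (double) loadFactor);
        return (int) Math.min(Math.max(needed, 16L), 1L << 30);
    }

    /** Builds an empty hashtable pre-sized from the (hypothetical) statistics. */
    static <K, V> Map<K, V> newHashTable(InputStatistics stats) {
        float loadFactor = 0.75f;
        int capacity = initialCapacity(stats.getApproxDistinctKeys(), loadFactor);
        return new HashMap<>(capacity, loadFactor);
    }
}
```

The row count could be used in a similar way, for example to estimate the memory needed for the values stored under each key.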