[
https://issues.apache.org/jira/browse/HIVE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839286#comment-13839286
]
Prasanth J commented on HIVE-5936:
----------------------------------
[~navis] HIVE-5369 does not distinguish 0 from -1. The reason is that I felt
even 0 (emptiness) is not very reliable. To make it more reliable, in HIVE-5369
I make an additional call to the filesystem to check the file size, which is
reliable (if the metastore reports 0 for a table that really is empty, the
filesystem will report a file size of 0 as well).
https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L93
Here I get the raw data size from the metastore. If it is not reliable, I fall
back to the total file size from the metastore. If the total file size is also
not reliable, I query the filesystem to get the file size. HIVE-5921 needs
some sort of data size (raw data size or file size) to estimate the number of
rows in the absence of any statistics (worst-case scenario). Since all the
statistics rules in HIVE-5369 need at least the basic statistics (row count and
data size), it is better to provide some statistics (accurate or estimated)
than to provide no statistics at all.
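
For illustration only, the fallback order described above could be sketched
roughly as follows. This is not the code in StatsUtils.java: the class name,
method names, and the LongSupplier standing in for the filesystem call are all
hypothetical, and the row estimate (data size divided by an assumed average row
size) is just one possible way to derive a row count from a data size.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of the fallback chain; not the actual Hive implementation.
public class DataSizeFallback {

    // Both 0 and -1 are treated as "not reliable", since 0 is not
    // distinguished from -1 in the approach described above.
    static boolean isReliable(long value) {
        return value > 0;
    }

    // Prefer raw data size from the metastore, then total file size from the
    // metastore, and only then ask the filesystem (modeled as a supplier so
    // the lookup happens only when both metastore values are unusable).
    static long resolveDataSize(long rawDataSize, long totalSize,
                                LongSupplier fileSystemSize) {
        if (isReliable(rawDataSize)) {
            return rawDataSize;
        }
        if (isReliable(totalSize)) {
            return totalSize;
        }
        return fileSystemSize.getAsLong();
    }

    // Assumed estimation rule: rows = dataSize / average row size, with at
    // least one row reported for any non-empty data.
    static long estimateRowCount(long dataSize, long assumedAvgRowSize) {
        if (dataSize <= 0 || assumedAvgRowSize <= 0) {
            return 0;
        }
        return Math.max(1, dataSize / assumedAvgRowSize);
    }

    public static void main(String[] args) {
        // Metastore reports -1 and 0, so the filesystem value is used.
        long size = resolveDataSize(-1, 0, () -> 4096L);
        System.out.println(size + " bytes, ~" + estimateRowCount(size, 100) + " rows");
    }
}
```

Passing the filesystem lookup as a supplier keeps the extra filesystem call
off the common path where the metastore values are usable.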
> analyze command failing to collect stats with counter mechanism
> ---------------------------------------------------------------
>
> Key: HIVE-5936
> URL: https://issues.apache.org/jira/browse/HIVE-5936
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Affects Versions: 0.13.0
> Reporter: Ashutosh Chauhan
> Assignee: Navis
> Attachments: HIVE-5936.1.patch.txt, HIVE-5936.2.patch.txt
>
>
> With counter mechanism, MR job is successful, but StatsTask on client fails
> with NPE.