[
https://issues.apache.org/jira/browse/HIVE-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822051#comment-13822051
]
Thejas M Nair commented on HIVE-5793:
-------------------------------------
Thanks for updating the docs!
In the default.xml description, can you also state whether the file lengths
are checked when hive.fetch.task.conversion.threshold is set to less than 0?
(From what I can see, they are not.)
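To make the question concrete, here is a minimal sketch of the behavior I would
expect the description to spell out. The class and method names are
hypothetical and this is not the actual Hive code; the exact comparison against
the threshold (inclusive here) is also an assumption.

```java
// Hypothetical sketch, not the actual Hive implementation.
class FetchThresholdSketch {

    /**
     * Decides whether the input is small enough for fetch task conversion.
     *
     * @param threshold        value of hive.fetch.task.conversion.threshold
     * @param totalInputLength total length in bytes of the input files
     */
    static boolean withinThreshold(long threshold, long totalInputLength) {
        if (threshold < 0) {
            // Negative threshold: the file lengths are not checked at all.
            return true;
        }
        // Assumption: input exactly at the threshold still qualifies.
        return totalInputLength <= threshold;
    }

    public static void main(String[] args) {
        System.out.println(withinThreshold(-1L, Long.MAX_VALUE));
        System.out.println(withinThreshold(1024L, 512L));
        System.out.println(withinThreshold(1024L, 4096L));
    }
}
```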
bq. This kind of information is data dependent and the user can't declare this
reliably while writing udf code. This should really be estimated using stats,
and then the optimizer needs to make a decision using that.
I am not sure how the UDAF output size can be determined using just the stats
of the input, unless you know what the UDAF is actually doing.
But you can determine whether the UDAF is a compacting one based on its return
type. If it returns a fixed-size type (primitive numeric types, boolean), then
you know that it will be compacting. But if the return type is a string,
binary, or complex type, then you need a hint (or some other way of estimating
the output size).
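As a rough illustration of the return-type rule above, something along these
lines could work. The enum and class below are hypothetical stand-ins made up
for this sketch; they are not Hive's ObjectInspector types, and the list of
kinds is not meant to be complete.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: classify a UDAF as compacting from its return type.
class CompactingUdafSketch {

    // Simplified stand-in for a UDAF's return type; not a Hive class.
    enum ReturnKind {
        BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, // fixed size
        STRING, BINARY,                                 // variable length
        LIST, MAP, STRUCT                               // complex types
    }

    // Return kinds whose size does not depend on the input data.
    private static final Set<ReturnKind> FIXED_SIZE = EnumSet.of(
            ReturnKind.BOOLEAN, ReturnKind.BYTE, ReturnKind.SHORT,
            ReturnKind.INT, ReturnKind.LONG, ReturnKind.FLOAT,
            ReturnKind.DOUBLE);

    /** A fixed-size return type means the UDAF is known to be compacting. */
    static boolean isCompacting(ReturnKind kind) {
        return FIXED_SIZE.contains(kind);
    }

    /** Otherwise a hint (or some other size estimate) is needed. */
    static boolean needsSizeHint(ReturnKind kind) {
        return !isCompacting(kind);
    }

    public static void main(String[] args) {
        for (ReturnKind kind : ReturnKind.values()) {
            System.out.println(kind + " compacting=" + isCompacting(kind)
                    + " needsHint=" + needsSizeHint(kind));
        }
    }
}
```

A real check would inspect the evaluator's output ObjectInspector rather than a
hand-written enum; the point is only that the decision needs no input stats
for the fixed-size case.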
> Update hive-default.xml.template for HIVE-4002
> ----------------------------------------------
>
> Key: HIVE-5793
> URL: https://issues.apache.org/jira/browse/HIVE-5793
> Project: Hive
> Issue Type: Improvement
> Components: Configuration
> Reporter: Navis
> Assignee: Navis
> Priority: Trivial
> Attachments: HIVE-5793.1.patch.txt, HIVE-5793.2.patch.txt
>
>
> Addressing
> https://issues.apache.org/jira/browse/HIVE-3990?focusedCommentId=13818388&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13818388