[
https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588515#comment-13588515
]
Phabricator commented on HIVE-4080:
-----------------------------------
ashutoshc has requested changes to the revision "HIVE-4080 [jira] Add Lead &
Lag UDAFs".
Some comments.
INLINE COMMENTS
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:533 Use
LEAD_FUNC_NAME and LAG_FUNC_NAME here to be consistent.
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:871 This
null check should be done outside of nesting if() block.
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:875 I don't
see a case where fInfo is != null but udafResolver could be. If so, than this
null check is redundant and should be removed.
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:890 We are
already know that we are dealing with lead/lag udaf, it must be of type
GenericUDAFResolver2.. no ?
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:866 I am
wondering if this function can be rewritten as following:
WindowFunctionInfo finfo = windowFunctions.get(name.toLowerCase());
if (finfo == null) { return null;}
f ( !name.toLowerCase().equals(LEAD_FUNC_NAME) &&
!name.toLowerCase().equals(LAG_FUNC_NAME) ) {
return getGenericUDAFEvaluator(name, argumentOIs, isDistinct, isAllColumns);
}
// this must be lead/lag UDAF
GenericUDAFResolver udafResolver = finfo.getfInfo().getGenericUDAFResolver();
GenericUDAFParameterInfo paramInfo = new SimpleGenericUDAFParameterInfo(
argumentOIs.toArray(), isDistinct, isAllColumns);
return ((GenericUDAFResolver2) udafResolver).getEvaluator(paramInfo);
If not, than I have specific questions. See next few comments.
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:1467 It will
be good to comments for this new boolean registerAsUDAF. Something like
following
There are certain UDAFs like lead/lag which we want as windowing functions,
but don't want them to appear in mFunctions.
Why?
Because....
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java:29 This
and GenericUDAFLead share lots of common code. It might be good to have an
abstract class for these two, just the way you have it in GenericUDFLeadLag.
ql/src/test/queries/clientpositive/leadlag_queries.q:20 I think currently we
don't support over clause on expressions. Once we do, it will be good to add
test like:
select p_retailprice - lead (p_retail) over (partition by p_mfgr) from part;
ql/src/test/queries/clientpositive/leadlag_queries.q:35 It will be good to
add a test which has both lead and lag in same query.
REVISION DETAIL
https://reviews.facebook.net/D8961
BRANCH
HIVE-4080
ARCANIST PROJECT
hive
To: JIRA, ashutoshc, hbutani
> Add Lead & Lag UDAFs
> --------------------
>
> Key: HIVE-4080
> URL: https://issues.apache.org/jira/browse/HIVE-4080
> Project: Hive
> Issue Type: Bug
> Components: PTF-Windowing
> Reporter: Harish Butani
> Assignee: Harish Butani
> Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch
>
>
> Currently we support Lead/Lag as navigation UDFs usable with Windowing.
> To be standard compliant we need to support Lead & Lag UDAFs.
> Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing
> is in play.
> Currently allow Lead/Lag expressions to appear in SelectLists even when they
> are not arguments to UDAFs. Support for this feature will probably be
> removed. Causes ambiguities when Query contains different partition clauses.
> Will provide more details with associated Jira to remove this feature.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira