[
https://issues.apache.org/jira/browse/HIVE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726475#comment-13726475
]
Edward Capriolo commented on HIVE-2482:
---------------------------------------
[~thejas] Thank you for your comment. I am going to both agree and disagree
with you, from my perspective on this issue.
* I use hive_test to test my UDFs https://github.com/edwardcapriolo/hive_test
* At one point we added a plugin developer kit to Hive which allowed
annotation-based testing of UDFs
Later this was removed. There were reports that it was flaky, and I was not
paying much attention at that time, but I probably would have advocated that
it not be removed.
Now, I do agree with you that we can get better coverage of some things outside
end-to-end tests, but believe it or not functions are not one of them.
Why do I say this? A few reasons:
* Most functions are not functional.
* They actually have state: conf at initialization, reusable objects shared
between calls to evaluate.
* UDAFs have entire aggregation buffer systems.
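To make the state point concrete, here is a minimal sketch in plain Java. It
is not Hive code: HexUdfSketch, its initialize() flag and its evaluate()
signature are hypothetical stand-ins that only mimic the pattern of
configuration captured at initialization plus one output object reused
between calls.

```java
// Hypothetical stand-in for a UDF; it does not extend or use any Hive class.
final class HexUdfSketch {
    // State captured once at initialization (stands in for job conf).
    private boolean upperCase;
    private boolean initialized;
    // Output object reused across evaluate() calls to avoid allocations.
    private final StringBuilder reused = new StringBuilder();

    void initialize(boolean upperCase) {
        this.upperCase = upperCase;
        this.initialized = true;
    }

    // Returns the SAME StringBuilder on every call; callers must copy it
    // if they need the value to survive the next call.
    CharSequence evaluate(byte[] input) {
        if (!initialized) {
            throw new IllegalStateException("initialize() was not called");
        }
        if (input == null) {
            return null;
        }
        reused.setLength(0); // forgetting this reset would leak the previous row
        for (byte b : input) {
            reused.append(String.format(upperCase ? "%02X" : "%02x", b & 0xff));
        }
        return reused;
    }
}
```

A test that only checks one call in isolation would miss the two things that
usually break here: calling evaluate() before initialize(), and stale
contents left in the reused object from the previous row.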
To your specific points:
1) Welcome to my life, I have been complaining about our test infrastructure
for years. Honestly, now that we have a build system we can test UDFs fairly
fast, and there is not a huge volume of them anyway.
2) That can be true. Again, I use hive_test and I am not against having unit +
end-to-end tests.
3) I agree with this to an extent, but even in a real unit test one still has
to write Assert.assertEquals( something, somethingElse ) so you still eyeball
something. From a review standpoint it's easier to eyeball the .out than tens
or hundreds of asserts.
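As a rough illustration of the .out style of checking, the sketch below
compares actual output lines against expected ones and reports the first
mismatch. It is plain Java, not the Hive q-test driver; GoldenOutputSketch
and firstDifference are hypothetical names.

```java
import java.util.List;

// Hypothetical helper; not part of Hive's test infrastructure.
final class GoldenOutputSketch {
    // Returns the 1-based number of the first line that differs,
    // or -1 when both outputs are identical.
    static int firstDifference(List<String> expected, List<String> actual) {
        int common = Math.min(expected.size(), actual.size());
        for (int i = 0; i < common; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i + 1;
            }
        }
        // Same prefix: either identical, or one output has extra lines.
        return expected.size() == actual.size() ? -1 : common + 1;
    }
}
```

The whole expected output is reviewed once as a single file, and the
comparison is one call instead of one assert per value.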
Again, I am not against having more traditional unit tests and writing code in
a functional style that is easier to document and reason about, but I think
that to cover all the corner cases of exceptions and cleaning out private state
properly, the unit tests will be uglier than the q tests.
I am talking on hive-dev about the project split up. This is one of the things
I want to do: move all the end-to-end tests to a final project and really step
up the unit-style testing.
There are lots of things we can do to make the tests faster:
* move all the UDFs into 1 big test :) save the overhead of launching multiple
tests
* optimize 'select udf(column) from table limit 1' <-- we should be able to
make that test scream
Anyway, unlike the past, where stuff like this sat on the queue forever, we now
have a build bot and I am dedicated to seeing patches reviewed and committed
fast (especially ones like this).
BTW at minimum there is show_functions.q, so every time you add a function you
at least have to touch that test.
> Convenience UDFs for binary data type
> -------------------------------------
>
> Key: HIVE-2482
> URL: https://issues.apache.org/jira/browse/HIVE-2482
> Project: Hive
> Issue Type: New Feature
> Affects Versions: 0.9.0
> Reporter: Ashutosh Chauhan
> Assignee: Mark Wagner
> Attachments: HIVE-2482.1.patch
>
>
> HIVE-2380 introduced binary data type in Hive. It will be good to have
> following udfs to make it more useful:
> * UDF's to convert to/from hex string
> * UDF's to convert to/from string using a specific encoding
> * UDF's to convert to/from base64 string
> * UDF's to convert to/from non-string types using a particular serde