[ https://issues.apache.org/jira/browse/HIVE-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718932#comment-13718932 ]
Jitendra Nath Pandey commented on HIVE-4822:
--------------------------------------------

bq. How does explain work with the vectorization engine?

'explain' continues to work as before and returns the same plan as in non-vector mode. Vectorization executes exactly the same query plan; only the implementation of the operators and expressions has been changed to run in vectorized fashion. However, we do plan to enhance 'explain' to also show which operators will be executed in vectorized mode. We will start working on it very soon and file a jira.

In the current implementation, we don't need 'explain' annotations on vectorized UDFs, because the vectorized UDFs are used only at run time. In the query planning stage only row-mode UDFs are used; at query execution time, if vectorization is possible, we switch to the corresponding vectorized UDFs. We adopted this approach to avoid any changes to the query planner for vectorization.

bq. Could we somehow hybrid some of our existing UDFS to work from both engines?

We will surely have to support the hybrid approach you are suggesting for UDFs that users have implemented, even though we will recommend that users re-implement their UDFs in vectorized fashion. For built-in Hive UDFs, however, it will almost always be better for performance to have a vectorized implementation. Eventually, we do want to have vectorized implementations for all built-in UDFs.

bq. Are we sure that functions that operate on doubles and floats are going to round exactly the same way?

We have used the same underlying Java libraries, so our results should match. In our testing we compare the results with the non-vector results to make sure.

bq. Do we have a wiki page or something where we are keeping track of what is currently supported using vectorization?

That's a good idea. I agree we should track this so that the community is aware. It will also help and encourage folks to identify areas to contribute.
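To make the run-time switch and the hybrid fallback described above concrete, here is a minimal sketch in Java. It is an illustration only: `RowFunction`, `BatchFunction`, and `ExecutionTimeResolver` are hypothetical names invented for this example, not Hive's actual classes, and plain `double[]` arrays stand in for a column batch. The plan refers only to the row-mode function; at execution time the resolver prefers a vectorized implementation if one is registered, and otherwise wraps the row-mode function so it can still be driven one batch at a time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical types for illustration; these are not Hive's actual classes.
interface RowFunction {
    // Row mode: one value per call.
    double evaluate(double value);
}

interface BatchFunction {
    // Vector mode: the first n values of a column per call.
    void evaluate(double[] in, double[] out, int n);
}

final class ExecutionTimeResolver {
    // Functions that have a hand-written vectorized implementation.
    private final Map<String, BatchFunction> vectorized = new HashMap<>();

    void registerVectorized(String name, BatchFunction fn) {
        vectorized.put(name, fn);
    }

    // The plan only names the row-mode function. At execution time, prefer a
    // vectorized implementation; otherwise fall back to wrapping the row-mode
    // function (the "hybrid" path for user-written UDFs).
    BatchFunction resolve(String name, RowFunction rowFn) {
        BatchFunction fn = vectorized.get(name);
        if (fn != null) {
            return fn;
        }
        return (in, out, n) -> {
            for (int i = 0; i < n; i++) {
                out[i] = rowFn.evaluate(in[i]);
            }
        };
    }
}
```

In this sketch the wrapped path still makes one call per value, which is why a native vectorized version would be preferred for built-in functions whenever one exists.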
> implement vectorized math functions
> -----------------------------------
>
>                 Key: HIVE-4822
>                 URL: https://issues.apache.org/jira/browse/HIVE-4822
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: vectorization-branch
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>             Fix For: vectorization-branch
>
>         Attachments: HIVE-4822.1.patch, HIVE-4822.4.patch,
>                      HIVE-4822.5-vectorization.patch
>
>
> Implement vectorized support for all the built-in math functions. This
> includes implementing the vectorized operation, and tying it all together in
> VectorizationContext so it runs end-to-end. These functions include:
> round(Col)
> Round(Col, N)
> Floor(Col)
> Ceil(Col)
> Rand(), Rand(seed)
> Exp(Col)
> Ln(Col)
> Log10(Col)
> Log2(Col)
> Log(base, Col)
> Pow(col, p), Power(col, p)
> Sqrt(Col)
> Bin(Col)
> Hex(Col)
> Unhex(Col)
> Conv(Col, from_base, to_base)
> Abs(Col)
> Pmod(arg1, arg2)
> Sin(Col)
> Asin(Col)
> Cos(Col)
> ACos(Col)
> Atan(Col)
> Degrees(Col)
> Radians(Col)
> Positive(Col)
> Negative(Col)
> Sign(Col)
> E()
> Pi()
> To reduce the total code volume, do an implicit type cast from non-double
> input types to double.
> Also, POSITIVE and NEGATIVE are syntactic sugar for unary + and unary -, so
> reuse code for those as appropriate.
> Try to call the function directly in the inner loop and avoid new() or
> expensive operations, as appropriate.
> Templatize the code where appropriate, e.g. all the unary functions of the form
> DOUBLE func(DOUBLE)
> can probably be done with a template.
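The implementation guidelines in the description (one shared shape for every `DOUBLE func(DOUBLE)`, an implicit cast from non-double input, and no allocation in the inner loop) could be sketched as follows. This is a hedged illustration, not the code in the attached patches: the class names `UnaryDoubleFunction`, `FuncSqrt`, and `FuncSin` are hypothetical, plain arrays stand in for column vectors, and an abstract base class is used here in place of whatever templating mechanism the patch actually adopts.

```java
// Hypothetical sketch; names do not correspond to Hive's actual classes.
abstract class UnaryDoubleFunction {
    // The only part that varies per function: DOUBLE func(DOUBLE).
    protected abstract double func(double value);

    // Apply func to the first n entries, writing into a caller-supplied
    // output array so the inner loop allocates nothing.
    final void evaluate(double[] in, double[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = func(in[i]);
        }
    }

    // Non-double input: cast each long to double instead of writing a
    // second implementation of every function.
    final void evaluate(long[] in, double[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = func((double) in[i]);
        }
    }
}

final class FuncSqrt extends UnaryDoubleFunction {
    @Override
    protected double func(double value) {
        return Math.sqrt(value);
    }
}

final class FuncSin extends UnaryDoubleFunction {
    @Override
    protected double func(double value) {
        return Math.sin(value);
    }
}
```

Under this shape, adding another unary function from the list is a few lines, and both the long and the double entry points call the same `java.lang.Math` routine once per value.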