Re: Finding HiveBaseResultSet have performance improvements and wanting to participate the Hive project

László Bodor Mon, 20 Nov 2023 00:28:20 -0800

Hi!

Thanks for your interest in the Hive project. For future reference, could
you please share the corresponding Hive Jira ticket for this issue?
Thanks!


Regards,
Laszlo Bodor


浪迹天涯 <1005131...@qq.com.invalid> wrote (on Mon, Nov 20, 2023,
8:51):

> Sorry to bother you. I found that this problem was fixed three years
> ago. You can ignore this email, thanks!
>
>
>
> Sent from my iPhone
>
>
> ------------------ Original ------------------
> From: It <1005131...@qq.com>
> Date: Sat, Nov 18, 2023 6:06 PM
> To: dev <dev@hive.apache.org>
> Subject: Re: Finding HiveBaseResultSet have performance improvements and
> wanting to participate the Hive project
>
>
>
> Dear Hive team,
>     I hope this email finds you well.
>     I am a user of the Hive open source project on GitHub and have been
> using it in my company’s project for some time. Recently, with the help of
> Arthas, I noticed that a piece of code in HiveBaseResultSet.java could be
> made noticeably faster. However, I could not find an open issue for this
> in the Hive project, so I am writing to ask how I can participate in the
> Hive project.
>     With the help of Arthas, I could see the full stack information of
> HiveBaseResultSet while our company’s query framework fetched results from
> the Hive database. I found that nearly 20% of the time was spent in
> String.split() and String.equalsIgnoreCase(). After checking the source
> code, I found the answer in the function “findColumn()”! Every time we call
> result.getXX(String columnName), such as getObject(String columnName),
> HiveBaseResultSet traverses normalizedColumnNames, a list that stores the
> mapping between columnName and columnIndex, and then uses the index to get
> the final value. It spends too much time looking up the columnName, which
> involves lowercasing, traversing, splitting columnName and columnIndex,
> and comparing values! So I decided to use a cache to store the mapping
> between columnName and columnIndex. It only spends time storing the mapping
> the first time; after that, looking up the index takes far less time,
> which improved performance on my project by nearly 20% 😉.
>     I am proud of improving my Java skills and would really like to
> contribute to the Hive project, but unfortunately I noticed that there is
> no button to create new issues. Please let me know if there is any way I
> can get involved in the Hive project.
>     Thank you for your time and consideration! It would be my great
> pleasure to discuss code with the Hive team.
>
>
> Sincerely,
> Followers Xrain_ts
>
>
>
> Sent from my iPhone
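
For readers of this thread, the caching idea described in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in HiveBaseResultSet and not the fix that was eventually merged: the class name ColumnIndexCache, its fields, and the error message are hypothetical, and the linear scan is reconstructed from the description above (match on either the full normalized name or the part after the last dot, ignoring case).

```java
import java.sql.SQLException;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Hive: caches column-name lookups so the
// linear scan over normalizedColumnNames runs at most once per distinct name.
final class ColumnIndexCache {
    private final List<String> normalizedColumnNames;
    // Lowercased requested name -> 1-based column index.
    private final Map<String, Integer> indexByName = new HashMap<>();

    ColumnIndexCache(List<String> normalizedColumnNames) {
        this.normalizedColumnNames = normalizedColumnNames;
    }

    // Returns the 1-based index of the column, as JDBC's findColumn does.
    int findColumn(String columnName) throws SQLException {
        String key = columnName.toLowerCase(Locale.ROOT);
        Integer cached = indexByName.get(key);
        if (cached != null) {
            return cached; // fast path: no scan, no per-column string work
        }
        int index = scan(columnName);
        indexByName.put(key, index);
        return index;
    }

    // Slow path: the case-insensitive linear scan described in the message.
    private int scan(String columnName) throws SQLException {
        for (int i = 0; i < normalizedColumnNames.size(); i++) {
            String full = normalizedColumnNames.get(i);
            // Part after the last '.', e.g. "t.id" -> "id".
            String shortName = full.substring(full.lastIndexOf('.') + 1);
            if (shortName.equalsIgnoreCase(columnName)
                    || full.equalsIgnoreCase(columnName)) {
                return i + 1;
            }
        }
        throw new SQLException("Could not find " + columnName);
    }
}
```

The sketch uses a plain HashMap and is therefore not thread-safe; it assumes one cache per result set, with column names that do not change for the lifetime of that result set.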
