[jira] [Created] (HIVE-27169) New Locked List to prevent configuration change at runtime without throwing error

2023-03-23 Thread Raghav Aggarwal (Jira)
Raghav Aggarwal created HIVE-27169:
--

 Summary: New Locked List to prevent configuration change at 
runtime without throwing error
 Key: HIVE-27169
 URL: https://issues.apache.org/jira/browse/HIVE-27169
 Project: Hive
  Issue Type: Improvement
Affects Versions: 4.0.0-alpha-2
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal


_*AIM*_

Create a new locked list called {{hive.conf.locked.list}} that contains a 
comma-separated list of configurations that cannot be changed at runtime. If 
someone tries to change one of them at runtime, a WARN log is shown in beeline 
itself.

 

_*How is it different from Restricted List?*_

When running an hql file or at runtime, if a configuration present in the 
restricted list is updated, an error is thrown and execution of the hql file 
does not proceed any further.

With the locked list, an attempt to update a locked configuration produces a 
WARN log on beeline, and execution of the hql file continues.
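
A minimal sketch of the two behaviours described above, in case it helps the 
discussion. This is not Hive's HiveConf API: the class LockedListSketch and its 
set/get/warnings methods are hypothetical names made up for illustration, and 
the message texts are assumptions. It only shows the intended contrast: a 
restricted key aborts, a locked key is skipped with a warning and execution 
goes on.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the actual Hive implementation.
public class LockedListSketch {
    private final Map<String, String> conf = new HashMap<>();
    private final Set<String> restricted;
    private final Set<String> locked;
    private final List<String> warnings = new ArrayList<>();

    public LockedListSketch(Set<String> restricted, Set<String> locked) {
        this.restricted = restricted;
        this.locked = locked;
    }

    /** Tries to set a config at runtime; returns true if the value was applied. */
    public boolean set(String key, String value) {
        if (restricted.contains(key)) {
            // Restricted list: fail hard, the caller's script stops here.
            throw new IllegalArgumentException(
                "Cannot modify " + key + " at runtime: it is in the restricted list");
        }
        if (locked.contains(key)) {
            // Locked list: keep the old value, record a warning, keep going.
            warnings.add("WARN: Cannot modify " + key
                + " at runtime: it is in the locked list");
            return false;
        }
        conf.put(key, value);
        return true;
    }

    public String get(String key) {
        return conf.get(key);
    }

    public List<String> warnings() {
        return warnings;
    }
}
```

With this sketch, a script that sets a locked key simply gets a warning for 
that statement and the remaining statements still run, while setting a 
restricted key raises an exception.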

 

_*Why is it required?*_

In organisations, admins want to enforce some configs that users shouldn't be 
able to change at runtime, without affecting users' existing hql scripts. 
The locked list is therefore useful: it does not allow users to change the 
value of particular configs, and it does not stop the execution of hql 
scripts.

 

_*NOTE*_: {{hive.conf.locked.list}} can be set only at the cluster level, and 
the Hive service needs to be restarted afterwards.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Release managers

2023-03-23 Thread Sai Hemanth Gantasala
Hi all,

I would like to volunteer for the 4.2.0 release.

Thanks,
Sai.

On Thu, Mar 23, 2023 at 2:47 PM Denys Kuzmenko wrote:

> Hi, I can take the following one: 4.1.0
>


[jira] [Created] (HIVE-27168) Use basename of the datatype when fetching partition metadata using partition filters

2023-03-23 Thread Sourabh Badhya (Jira)
Sourabh Badhya created HIVE-27168:
-

 Summary: Use basename of the datatype when fetching partition 
metadata using partition filters
 Key: HIVE-27168
 URL: https://issues.apache.org/jira/browse/HIVE-27168
 Project: Hive
  Issue Type: Bug
Reporter: Sourabh Badhya
Assignee: Sourabh Badhya


While fetching partition metadata using partition filters, we use the column 
type of the table directly. However, char/varchar types can carry extra 
information such as the length of the char/varchar column, and because of this 
extra information the fetch of partition metadata is skipped.

Solution: Use the basename of the column type while deciding on whether 
partition pruning can be done on the partitioned column.
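
A minimal sketch of what "basename of the column type" means here, assuming 
the qualified type is written as a name followed by a parenthesised argument 
list (e.g. varchar(10)). The helper TypeBaseName.baseName is a hypothetical 
name for illustration; the actual fix may rely on existing Hive type 
utilities instead.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not taken from the Hive code base.
public class TypeBaseName {
    /** Strips type parameters such as "(10)" and returns the bare type name. */
    public static String baseName(String columnType) {
        int paren = columnType.indexOf('(');
        String base = paren < 0 ? columnType : columnType.substring(0, paren);
        return base.trim().toLowerCase(Locale.ROOT);
    }
}
```

With this, varchar(10) and char(5) reduce to varchar and char, so a check 
against a list of supported base types is not thrown off by the length.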





[jira] [Created] (HIVE-27167) Upgrade guava version in standalone-metastore and storage-api module

2023-03-23 Thread Raghav Aggarwal (Jira)
Raghav Aggarwal created HIVE-27167:
--

 Summary: Upgrade guava version in standalone-metastore and 
storage-api module
 Key: HIVE-27167
 URL: https://issues.apache.org/jira/browse/HIVE-27167
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore, storage-api
Affects Versions: 4.0.0-alpha-2
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal


The guava version in standalone-metastore and storage-api (i.e. 19.0) is not in 
sync with the parent pom.xml (i.e. 22.0). 





Re: [DISCUSS] HIVE 4.0 GA Release Proposal

2023-03-23 Thread Denys Kuzmenko
Thanks, Sungwoo, for running the TPC-DS benchmark. Do we know if the same level 
of performance degradation was present in 4.0.0-alpha1?

All: please use the `hive-4.0.0-must` label in a ticket if you think it's a 
show-stopper for the release.


Re: Release managers

2023-03-23 Thread Denys Kuzmenko
Hi, I can take the following one: 4.1.0


Re: [DISCUSS] Incremental and cadence predictable release activity for HIVE

2023-03-23 Thread Denys Kuzmenko
Sorry for being late to the party. I think what Kirti proposes would be good 
for the project and end-users. As mentioned, we could start with 2-3 releases 
per year, and once we improve on the process and automation (CI/CD) we could 
reevaluate.  


Re: [DISCUSS] Incremental and cadence predictable release activity for HIVE

2023-03-23 Thread Kirti Ruge
Thanks Ayush, Thanks Stamatis for your valuable inputs. 

Let us give it a try and aim for at least 2 releases/year so that we can 
further evaluate the release strategy. I see another mail thread for 
interested devs who can volunteer as Release Managers for the upcoming 
releases: 
https://lists.apache.org/thread/2tfcocjdfc2w0mw4fsy736gvp4vqykqm 

Let us chime in there and give it a try.
Closing this thread for now.  

Thanks,
Kirti

> On 13-Mar-2023, at 5:13 PM, Stamatis Zampetakis wrote:
> 
> Hello,
> 
> I am not sure what a branch cut actually refers to. As I mentioned in the
> past I am not in favor of maintaining multiple release branches; the cost
> is high and the number of volunteers is simply not enough. I am willing to
> reconsider if things change in the near future.
> 
> Apart from that, having frequent releases from master is definitely great
> for consumers and good for the health of the project; two, three releases
> per year would be great but for this to happen we need volunteers (mostly
> release managers).
> 
> One thing that I have seen working well in other projects is to decide in
> advance the next 3-4 release managers. Maybe it's worth trying implementing
> this in Hive.
> 
> Best,
> Stamatis
> 
> On Sun, Mar 12, 2023 at 6:07 PM Ayush Saxena wrote:
> 
>> Hi Kirti,
>> Thanx for the initiative. This sounds very interesting, but I doubt if it
>> is that easy to incorporate. Sharing my thoughts:
>> 
>>   - Regarding "Unpredictable" : I don't think we are like doing very
>>   unpredictable releases. It should be a formal mail, like Release x.y.z
>>   and then the RM usually shares a potential Branch freeze date, then a
>>   margin number of days for blockers or critical tickets. And this entire
>>   process would be around a minimum of 1 month and usually will go around
>>   3 months.
>>   - Regarding "Regressions": Quicker releases doesn't certainly mean more
>>   stable releases.
>>   - Regarding half-baked features: We are mostly developing on master
>>   branch, we don't have a concept of feature branch(a lot of projects have
>>   that), So, if a bunch of features are running in parallel by different
>>   set of people, with a "fixed" date it is practically impossible to
>>   achieve, this thing needs to be negotiated b/w all of them.
>>   - Even if we pin a date, that ain't sufficient, we need volunteers who
>>   can take up the RM role, If we proceed with this we should decide the
>>   RM as well beforehand.
>>   - This timeline thing can get screwed up in case you hit a security
>>   issue: AFAIK you can't announce a CVE unless you have a release on all
>>   active release lines with the fix. So, in that case this schedule will
>>   get messed up and the RM, the dates would require to be renegotiated.
>>   - Sometimes you need to release early because a downstream project needs
>>   a fix, which blocks their way to upgrade Hive. Standard practice, almost
>>   All apache projects are concerned about each other and help others in
>>   upgrading, so in that case I am not sure holding them for a fixed date
>>   is cool or not
>>   - Mostly what I have observed, A release takes place when we have enough
>>   tickets to release, We don't want to just keep on releasing with just
>>   20-25 fixes, nor we want to push straight 800-900 fixes in one go. The
>>   number of fixes, the nature of fixes all should be taken in account while
>>   planning the release date.
>> 
>> 
>> In general: Good Idea, We should definitely encourage more frequent
>> releases, having a "strict" date or not is debatable.
>> 
>> -Ayush
>> 
>> On Sun, 12 Mar 2023 at 19:44, Kirti Ruge wrote:
>> 
>>> Hello HIVE Dev,
>>> 
>>> I would like to discuss/propose incremental and cadence predictable
>>> process for HIVE releases.
>>> 
>>> https://hive.apache.org/general/downloads/
>>> 
>>> Currently, our releases have a very random span in between, and those
>>> have sometimes caused problems like-
>>> 
>>> 1. All downstream and end users have unpredictable schedules because of
>>> upstream.
>>> 2. More chances of regression issues when there is an unplanned release
>>> date. As developers and release managers have to rush, this prevents us
>>> from focusing on having a proper regression-free release.
>>> 
>>> I would like to propose a branch cut twice a year to have two strict
>>> releases yearly. It would make release cadence predictable for end users
>>> and bring some disciplinary schedules for all users, including downstream
>>> projects.
>>> 
>>> Advantages of this approach-
>>> 
>>> 1. If we pin a branch cut date, features can be prioritized better so
>>> that no half-baked stuff goes into release.
>>> 2. Such Incremental release will help in better regression and reduce the
>>> burden from release management activity( result is reduced issues and
>>> problems with quality). It will eventually help to