Re: [ANNOUNCE] New Committer: Simhadri Govindappa

2024-04-22 Thread Kirti Ruge
Congratulations Simhadri!!!

> On 20-Apr-2024, at 11:36 AM, Akshat m  wrote:
> 
> Congratulations Simhadri!, very well deserved 
> 
> On Fri, Apr 19, 2024 at 3:37 PM Simhadri G wrote:
> Thanks again everyone :) 
> 
> On Fri, Apr 19, 2024, 2:15 AM Rajesh Balamohan wrote:
> Congratulations Simhadri. :)
> 
> ~Rajesh.B
> 
> On Fri, Apr 19, 2024 at 2:02 AM Aman Sinha wrote:
> Congrats Simhadri ! 
> 
> On Thu, Apr 18, 2024 at 12:25 PM Naveen Gangam wrote:
> Congrats Simhadri. Looking forward to many more contributions in the future.
> 
> On Thu, Apr 18, 2024 at 12:25 PM Sai Hemanth Gantasala wrote:
> Congratulations Simhadri  well deserved
> 
> On Thu, Apr 18, 2024 at 8:41 AM Pau Tallada wrote:
> Congratulations
> 
> On Thu, 18 Apr 2024 at 17:40, Alessandro Solimando wrote:
> Great news, Simhadri, very well deserved!
> 
> On Thu, 18 Apr 2024 at 15:07, Simhadri G wrote:
> Thanks everyone! 
> I really appreciate it, it means a lot to me :) 
> The Apache Hive project and its community have truly inspired me . I'm 
> grateful for the chance to contribute to such a remarkable project.
> 
> Thanks!
> Simhadri Govindappa
> 
> On Thu, Apr 18, 2024 at 6:18 PM Sankar Hariappan wrote:
> Congrats Simhadri!
> 
>  
> 
> -Sankar
> 
>  
> 
> From: Butao Zhang <butaozha...@163.com>
> Sent: Thursday, April 18, 2024 5:39 PM
> To: u...@hive.apache.org; dev <dev@hive.apache.org>
> Subject: [EXTERNAL] Re: [ANNOUNCE] New Committer: Simhadri Govindappa
> 
>  
> 
> Congratulations Simhadri !!!
> 
>  
> 
> Thanks.
> 
>  
> 
> From: user-return-28075-butaozhang1=163@hive.apache.org on behalf of Ayush Saxena <ayush...@gmail.com>
> Sent: Thursday, April 18, 2024 7:50 PM
> To: dev <dev@hive.apache.org>; u...@hive.apache.org
> Subject: [ANNOUNCE] New Committer: Simhadri Govindappa
> 
>  
> 
> Hi All,
> 
> Apache Hive's Project Management Committee (PMC) has invited Simhadri 
> Govindappa to become a committer, and we are pleased to announce that he has 
> accepted.
> 
>  
> 
> Please join me in congratulating him, Congratulations Simhadri, Welcome 
> aboard!!!
> 
>  
> 
> -Ayush Saxena
> 
> (On behalf of Apache Hive PMC)
> 
> 
> 
> -- 
> --
> Pau Tallada Crespí
> Departament de Serveis
> Port d'Informació Científica (PIC)
> Tel: +34 93 170 2729
> --
> 



Re: [VOTE] Release Apache Hive 4.0.0 (Release Candidate 0)

2024-03-28 Thread Kirti Ruge
+1 (non-binding)


I performed the following steps on a Mac M1:

- Built Hive 4.0.0 from source successfully.
- Verified checksums and signatures.
- Initialized the metastore with PostgreSQL.
- Started the metastore and HiveServer2.
- Ran some simple Hive queries via Beeline and checked them in the web UI
  (http://localhost:10002/).
- Built the Docker image and started the Hive services with Docker.
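
For anyone repeating the checksum step, here is a minimal Python sketch. It assumes the values published in the vote email are SHA-256 digests (they are 64 hex characters long); the function names are illustrative and not part of any Hive tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

It would be called as `checksum_matches("apache-hive-4.0.0-bin.tar.gz", "<digest from the vote email>")`. Signature verification against the KEYS file is a separate step that this sketch does not cover.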

Regards,
Kirti

> On 28-Mar-2024, at 3:41 PM, Zoltán Rátkai  
> wrote:
> 
> +1 (non-binding)
> 
> Performed on Mac M1:
> 
> - Verified checksums
> - Verified signature
> - Built from source
> - Build docker image (HADOOP_VERSION=3.3.6, TEZ_VERSION=0.10.3)
> - Started docker image
> - Checked web GUI is working (http://localhost:10002/)
> - Created a table and ran CRUD operations on Hive ACID table successfully
> on the Docker environment
> - Checked the executed queries via web GUI
> 
> Regards,
> 
> Zoltan Ratkai
> 
> On Thu, Mar 28, 2024 at 8:41 AM kokila narayanan <
> kokilanarayana...@gmail.com> wrote:
> 
>> +1 (non-binding)
>> 
>> 1. Verified checksums
>> 2. Verified signatures
>> 3. Built from source successfully
>> 4. Deployed and started binary tar with Hadoop 3.3.6 and Tez 0.10.3.
>> 5. Executed basic operations on ACID and external tables.
>> 
>> Regards,
>> Kokila
>> 
>> On Thu, Mar 28, 2024 at 12:30 PM Sourabh Badhya
>>  wrote:
>> 
>>> +1 (non-binding)
>>> 
>>> [1] Built from source successfully.
>>> [2] Verified checksums and signatures.
>>> [3] Built docker image with Apache Hadoop 3.3.6 and Apache Tez 0.10.3 and
>>> metastore using Postgres successfully.
>>> [4] Ran CRUD operations on Hive ACID, Iceberg tables and basic operations
>>> on Hive external tables successfully on the Docker environment.
>>> [5] Browsed the same executed queries via Hiveserver2 UI.
>>> 
>>> Thanks Denys for driving the release.
>>> 
>>> Regards,
>>> Sourabh Badhya
>>> 
>>> On Wed, Mar 27, 2024 at 10:38 PM Ayush Saxena wrote:
>>> 
>>>> +1 (Binding)
>>>> 
>>>> * Built from source
>>>> * Verified checksums
>>>> * Verified signature
>>>> * Verified all code files have ASF Header
>>>> * Validated the Notice & License files
>>>> * No code diff b/w git tag & src tar
>>>> * Ran some basic operations on Iceberg, ACID & External Tables (Hive on
>>>> Tez)
>>>> * Browsed through HS2 UI
>>>> * Built Docker image from source & tried some basic commands on the
>>>> docker environment.
>>>> * Skimmed over the contents of maven repo.
>>>> 
>>>> Thanx Denys for driving the release. Good Luck!!!
>>>> 
>>>> -Ayush
>>>> 
>>>> On Wed, 27 Mar 2024 at 21:05, Marta Kuczora wrote:
>>>> 
>>>>> +1 (binding)
>>>>> 
>>>>> Thanks a lot Denys for driving the release!
>>>>> 
>>>>> * Verified the checksum and signature [OK]
>>>>> 
>>>>> * Built Hive 4.0.0 from source [OK]
>>>>> 
>>>>> * Initialized metastore with MySQL [OK]
>>>>> 
>>>>> * Built package and ran metastore and hiveserver [OK]
>>>>> 
>>>>> * Deployed and started the binary tar with Hadoop 3.3.6 and Tez 0.10.3 [OK]
>>>>> 
>>>>> * Ran some simple Hive queries with external/acid/iceberg tables [OK]
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> Marta
>>>>> 
>>>>> On Tue, Mar 26, 2024 at 8:26 AM Denys Kuzmenko wrote:
>>>>> 
>>>>>> Hi Everyone,
>>>>>> 
>>>>>> We would like to thank everyone who has contributed to the project and
>>>>>> request the Hive PMC members to review and vote on this new release
>>>>>> candidate.
>>>>>> 
>>>>>> Apache Hive 4.0.0 RC-0 artifacts are available here:
>>>>>> https://people.apache.org/~dkuzmenko/apache-hive-4.0.0-rc0/
>>>>>> 
>>>>>> 
>>>>>> The checksums are as follows:
>>>>>> - 83eb88549ae88d3df6a86bb3e2526c7f4a0f21acafe21452c18071cee058c666
>>>>>> apache-hive-4.0.0-bin.tar.gz
>>>>>> - 4dbc9321d245e7fd26198e5d3dff95e5f7d0673d54d0727787d72956a1bca4f5
>>>>>> apache-hive-4.0.0-src.tar.gz
>>>>>> 
>>>>>> 
>>>>>> You can find the KEYS file here:
>>>>>> 
>>>>>> * https://downloads.apache.org/hive/KEYS
>>>>>> 
>>>>>> 
>>>>>> A staged Maven repository URL is:
>>>>>> 
>>>>>> https://repository.apache.org/content/repositories/orgapachehive-1127/
>>>>>> 
>>>>>> The git commit hash is:
>>>>>> 
>>>>>> https://github.com/apache/hive/commit/183f8cb41d3dbed961ffd27999876468ff06690c
>>>>>> 
>>>>>> 
>>>>>> This corresponds to the tag: release-4.0.0-rc0
>>>>>> * https://github.com/apache/hive/tree/release-4.0.0-rc0
>>>>>> 
>>>>>> The vote is open for the next 72 hours and passes if a majority of at
>>>>>> least three +1 PMC votes are cast.
>>>>>> 
>>>>>> (Only PMC members have binding votes, however, other community members
>>>>>> are encouraged to cast non-binding votes.)
>>>>>> 
>>>>>> 
>>>>>> [ ] +1 Release this package as Apache Hive 4.0.0
>>>>>> [ ] +0
>>>>>> [ ] -1 Do not release this because...
>>>>>> 
>>>>>> 
>>>>>> Please download, verify, and test.
>>>>>> 
>>>>>> 
>>>>>> Regards,
>>>>>> 
>>>>>> Denys



Re: Force coding style in hive precommit

2024-01-07 Thread Kirti Ruge
+1
It would improve maintainability and code reviews. Small indentation or 
styling issues sometimes eat up review cycle time, and we can easily avoid 
that by catching them before requesting a review.
Enforcing more rules around this would definitely help guarantee quality. We can 
integrate the check with git hooks. If we go ahead with this, I can work on 
getting it in place.
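
As a rough illustration of what a local hook could run before a commit, here is a small Python sketch. The two rules it checks (no tab characters in indentation, no trailing whitespace) are placeholder assumptions of mine; the real rules would have to come from the project's style definition, and the function name is hypothetical.

```python
def find_style_issues(source_text):
    """Return (line_number, message) pairs for lines breaking the assumed rules."""
    issues = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        # Everything before the first non-whitespace character is the indent.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            issues.append((number, "tab character in indentation"))
        if line != line.rstrip():
            issues.append((number, "trailing whitespace"))
    return issues
```

A hook script would call this on the content of each staged file and exit with a non-zero status when the returned list is not empty, so the commit is rejected before a PR is ever opened.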

Thanks,
Kirti

> On 08-Jan-2024, at 11:36 AM, Akshat m  wrote:
> 
> +1, We do have documentation around it as well:
> https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-CodingConventions
> so it makes sense to enforce it as well.
> 
> Right now we have a small section around this in documentation, We can also
> expand this to a new page and add more Java practices to it as well which
> are followed in the project while we are at this, Will be a great addition
> to Hive 4 documentation, I can pick it up.
> 
> I suggest we add this style check as a pre-commit git hook as well, so it
> is enforced when the author is committing locally as well, this can save
> the wait time for pre-commit failure in the PR for the author to realise
> the styling issues, ideally this should be taken care of with the ide style
> configuration but in case we miss it this would error out while
> committing the changes.
> 
> Regards,
> Akshat
> 
> On Sat, Jan 6, 2024 at 10:17 AM László Bodor 
> wrote:
> 
>> Hi All!
>> 
>> What do you think about forcing coding style in Hive precommit?
>> 
>> I remember, back in the old days, precommit printed some warnings in case
>> some coding style (formatting, indentation, naming convention, etc.)
>> problems were found in the patch, now it's simply not used, I guess since
>> we're using GitHub PRs.
>> 
>> For example: I remember I simply approved a PR a few months ago which
>> LGTM, and later just realized it's full of 4-spaces indentation, which is
>> wrong if we assume that code should be formatted according to the style
>> definition here:
>> https://github.com/apache/hive/blob/master/dev-support/eclipse-styles.xml
>> 
>> I have just attached an example of Tez PR to open minds and start a
>> conversation.
>> 
>> Regards,
>> Laszlo Bodor
>> 
>> 



Re: [DISCUSS] HIVE 4.0 GA Release Proposal

2023-05-09 Thread Kirti Ruge
I see that a few tickets, such as HIVE-26400, which is a major milestone, are
resolved.
Can we reevaluate the priorities of the other JIRAs so that we have clarity on a
GO/NO-GO for the 4.0.0 GA release and its timeline?



Thanks,
Kirti

On Sat, Mar 25, 2023 at 3:27 PM Stamatis Zampetakis 
wrote:

> Regarding correctness, I think it makes sense to change default values and
> possibly add a warning note when there's a known risk of wrong results.
> Needless to say that we should try to fix as many issues as possible; we
> still need volunteers to review open PRS.
>
> Performances regressions are trickier but if we have the query plans (CBO +
> full) along with logs (including task counters) for fast and slow execution
> we may be able to understand what happens. Don't hesitate to create Jira
> tickets with these information if available.
>
> Last regarding 4.0.0 blockers, I don't think we need a special label. The
> built-in and widely used priority "blocker" seems enough to capture the
> importance and urgency of a ticket.
> Since I am the release manager for the next release I will go over tickets
> marked as blockers and reevaluate priorities if necessary.
>
> Best,
> Stamatis
>
> On Thu, Mar 23, 2023, 10:27 AM Denys Kuzmenko 
> wrote:
>
> > Thanks, Sungwoo for running the TPC-DS benchmark. Do we know if the same
> > level of performance degradation was present in 4.0.0-alpha1?
> >
> > All: please use the `hive-4.0.0-must` label in a ticket if you think it's
> > a show-stopper for the release.
> >
>


[jira] [Created] (HIVE-27213) parquet logical decimal type to INT32 is not working while compute statistics

2023-04-03 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-27213:
-

 Summary: parquet logical decimal type to INT32 is not working 
while compute statistics
 Key: HIVE-27213
 URL: https://issues.apache.org/jira/browse/HIVE-27213
 Project: Hive
  Issue Type: Improvement
Reporter: KIRTI RUGE
 Attachments: test.parquet

[^test.parquet]

Steps to reproduce:

dfs ${system:test.dfs.mkdir} hdfs:///tmp/dwxtest/ws_sold_date_sk=2451825;
dfs -copyFromLocal ../../data/files/dwxtest.parquet 
hdfs:///tmp/dwxtest/ws_sold_date_sk=2451825;
dfs -ls hdfs:///tmp/dwxtest/ws_sold_date_sk=2451825/;


CREATE EXTERNAL TABLE `web_sales`(
`ws_sold_time_sk` int,
`ws_ship_date_sk` int,
`ws_item_sk` int,
`ws_bill_customer_sk` int,
`ws_bill_cdemo_sk` int,
`ws_bill_hdemo_sk` int,
`ws_bill_addr_sk` int,
`ws_ship_customer_sk` int,
`ws_ship_cdemo_sk` int,
`ws_ship_hdemo_sk` int,
`ws_ship_addr_sk` int,
`ws_web_page_sk` int,
`ws_web_site_sk` int,
`ws_ship_mode_sk` int,
`ws_warehouse_sk` int,
`ws_promo_sk` int,
`ws_order_number` bigint,
`ws_quantity` int,
`ws_wholesale_cost` decimal(7,2),
`ws_list_price` decimal(7,2),
`ws_sales_price` decimal(7,2),
`ws_ext_discount_amt` decimal(7,2),
`ws_ext_sales_price` decimal(7,2),
`ws_ext_wholesale_cost` decimal(7,2),
`ws_ext_list_price` decimal(7,2),
`ws_ext_tax` decimal(7,2),
`ws_coupon_amt` decimal(7,2),
`ws_ext_ship_cost` decimal(7,2),
`ws_net_paid` decimal(7,2),
`ws_net_paid_inc_tax` decimal(7,2),
`ws_net_paid_inc_ship` decimal(7,2),
`ws_net_paid_inc_ship_tax` decimal(7,2),
`ws_net_profit` decimal(7,2))
PARTITIONED BY (
`ws_sold_date_sk` int)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS PARQUET LOCATION 'hdfs:///tmp/dwxtest/';


MSCK REPAIR TABLE web_sales;


analyze table web_sales compute statistics for columns;

 

Error Stack:

 
{noformat}
analyze table web_sales compute statistics for columns;

], TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1678779198717__2_00_52_3:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file s3a://nfqe-tpcds-test/spark-tpcds/sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/web_sales/ws_sold_date_sk=2451825/part-00796-788bef86-2748-4e21-a464-b34c7e646c94-cfcafd2c-2abd-4067-8aea-f58cb1021b35.c000.snappy.parquet
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:351)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:280)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:84)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:70)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:70)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:40)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: java.io.IOException: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file s3a://nfqe-tpcds-test/spark-tpcds/sf1000-parquet/useDecimal=true,useDate=true,filterNull=false/web_sales/ws_sold_date_sk=2451825/part-00796-788bef86-2748-4e21-a464-b34c7e646c94-cfcafd2c-2abd-4067-8aea-f58cb1021b35.c000.snappy.parquet
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:164)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:83

Re: [DISCUSS] Incremental and cadence predictable release activity for HIVE

2023-03-23 Thread Kirti Ruge
Thanks Ayush, thanks Stamatis for your valuable inputs.

Let us try for at least 2 releases per year so that we can further evaluate 
our release strategy. I see another mail thread for interested devs who can 
volunteer as Release Managers for the next few releases:
https://lists.apache.org/thread/2tfcocjdfc2w0mw4fsy736gvp4vqykqm

Let us chime in there and give it a try.
Closing this thread for now.

Thanks,
Kirti

> On 13-Mar-2023, at 5:13 PM, Stamatis Zampetakis  wrote:
> 
> Hello,
> 
> I am not sure what a branch cut actually refers to. As I mentioned in the
> past I am not in favor of maintaining multiple release branches; the cost
> is high and the number of volunteers is simply not enough. I am willing to
> reconsider if things change in the near future.
> 
> Apart from that, having frequent releases from master is definitely great
> for consumers  and good for the health of the project; two, three releases
> per year would be great but for this to happen we need volunteers (mostly
> release managers).
> 
> One thing that I have seen working well in other projects is to decide in
> advance the next 3-4 release managers. Maybe it's worth trying implementing
> this in Hive.
> 
> Best,
> Stamatis
> 
> On Sun, Mar 12, 2023 at 6:07 PM Ayush Saxena  wrote:
> 
>> Hi Kirti,
>> Thanx for the initiative. This sounds very interesting, but I doubt if it
>> is that easy to incorporate. Sharing my thoughts:
>> 
>>   - Regarding "Unpredictable" : I don't think we are like doing very
>>   unpredictable releases. It should be a formal mail, like Release x.y.z
>> and
>>   then the RM usually shares a potential Branch freeze date, then a
>>   margin number of days for blockers or critical tickets. And this entire
>>   process would be around a minimum of 1 month and usually will go around
>> 3
>>   months.
>>   - Regarding "Regressions": Quicker releases doesn't certainly mean more
>>   stable releases.
>>   - Regarding half-baked features: We are mostly developing on master
>>   branch, we don't have a concept of feature branch(a lot of projects have
>>   that), So, if a bunch of features are running in parallel by different
>> set
>>   of people, with a "fixed" date it is practically impossible to achieve,
>>   this thing needs to be negotiated b/w all of them.
>>   - Even if we pin a date, that ain't sufficient, we need volunteers who
>>   can take up the RM role, If we proceed with this we should decide the
>> RM as
>>   well beforehand.
>>   - This timeline thing can get screwed up in case you hit a security
>>   issue: AFAIK you can't announce a CVE unless you have a release on all
>>   active release lines with the fix. So, in that case this schedule will
>> get
>>   messed up and the RM, the dates would require to be renegotiated.
>>   - Sometimes you need to release early because a downstream project needs
>>   a fix, which blocks their way to upgrade Hive. Standard practice, almost
>>   All apache projects are concerned about each other and help others in
>>   upgrading, so in that case I am not sure holding them for a fixed date
>> is
>>   cool or not
>>   - Mostly what I have observed, A release takes place when we have enough
>>   tickets to release, We don't want to just keep on releasing with just
>> 20-25
>>   fixes, nor we want to push straight 800-900 fixes in one go. The number
>> of
>>   fixes, the nature of fixes all should be taken in account while planning
>>   the release date.
>> 
>> 
>> In general: Good Idea, We should definitely encourage more frequent
>> releases, having a "strict" date or not is debatable.
>> 
>> -Ayush
>> 
>> On Sun, 12 Mar 2023 at 19:44, Kirti Ruge  wrote:
>> 
>>> Hello HIVE Dev,
>>> 
>>> I would like to discuss/propose incremental and cadence predictable
>>> process for HIVE releases.
>>> 
>>> https://hive.apache.org/general/downloads/
>>> 
>>> Currently, our releases have a very random span in between, and those
>> have
>>> sometimes caused problems like-
>>> 
>>> 1. All downstream and end users have unpredictable schedules because of
>>> upstream.
>>> 2. More chances of regression issues when there is an unplanned release
>>> date. As developers and release managers have to rush, this prevents us
>>> from focusing on having a proper regression-free release.
>>> 
>>> I would like to propose a bra

[jira] [Created] (HIVE-27166) Introduce Apache Commons DBUtils to handle boilerplate code

2023-03-22 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-27166:
-

 Summary: Introduce Apache Commons DBUtils to handle boilerplate 
code
 Key: HIVE-27166
 URL: https://issues.apache.org/jira/browse/HIVE-27166
 Project: Hive
  Issue Type: Improvement
Reporter: KIRTI RUGE


Apache Commons DbUtils is a small library that makes working with JDBC a lot 
easier.

The current scope of this Jira is to introduce the latest version of Apache 
Commons DbUtils for applicable methods in the TxnHandler and 
CompactionTxnHandler classes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] Incremental and cadence predictable release activity for HIVE

2023-03-12 Thread Kirti Ruge
Hello HIVE Dev,

I would like to discuss and propose an incremental process with a predictable 
cadence for HIVE releases.
 
https://hive.apache.org/general/downloads/

Currently, the intervals between our releases are irregular, and that has 
sometimes caused problems such as:

1. All downstream projects and end users have unpredictable schedules because 
of upstream.
2. More chances of regression issues when there is an unplanned release date. 
As developers and release managers have to rush, this prevents us from focusing 
on having a proper regression-free release.

I would like to propose a branch cut twice a year to have two strict releases 
yearly. It would make the release cadence predictable for end users and bring a 
disciplined schedule for all users, including downstream projects. 

Advantages of this approach:

1. If we pin a branch cut date, features can be prioritized better so that no 
half-baked stuff goes into a release.
2. Such incremental releases will help with regression testing and reduce the 
burden of release management (the result is fewer issues and problems with 
quality). It will eventually help to streamline release management activity.


Let me know your thoughts.

Thanks,
Kirti

Re: [DISCUSS] HIVE 4.0 GA Release Proposal

2023-03-12 Thread Kirti Ruge
Thanks Stamatis !!!
I see the JIRAs below marked with the label hive-4.0.0-must 
<https://issues.apache.org/jira/issues/?jql=labels+%3D+hive-4.0.0-must> and in 
unresolved status.

https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=564


Thanks,
Kirti

HIVE-26400 <https://issues.apache.org/jira/browse/HIVE-26400>
Provide docker images for Hive - PR in review:
https://github.com/apache/hive/pull/3448

HIVE-26537 <https://issues.apache.org/jira/browse/HIVE-26537>
Deprecate older APIs in the HMS - PR in review:
https://github.com/apache/hive/pull/3599

HIVE-26220 <https://issues.apache.org/jira/browse/HIVE-26220>
Shade & relocate dependencies in hive-exec to avoid conflicting with downstream 
projects

HIVE-26644 <https://issues.apache.org/jira/browse/HIVE-26644>
Introduce auto sizing in HMS - stale PR:
https://github.com/apache/hive/pull/3683



> On 10-Mar-2023, at 10:20 PM, Stamatis Zampetakis  wrote:
> 
> Hi Kirti,
> 
> Thanks for bringing up this topic.
> 
> The master branch already has many new features; we don't need to wait for
> more to cut a GA.
> 
> The main criterion for going GA is stability thus I would consider
> regressions as the only blockers for the release.
> 
> If I recall well the only regressions discovered so far are some problems
> with TPC-DS queries so basically HIVE-26654 [1].
> 
> I will let others chime in to include more tickets if necessary.
> 
> Best,
> Stamatis
> 
> [1] https://issues.apache.org/jira/browse/HIVE-26654
> 
> 
> On Wed, Mar 8, 2023 at 10:02 AM Kirti Ruge  wrote:
> 
>> Hello Hive Dev,
>> 
>> It has been about 6 months since Hive-4.0-alpha-2 was released in Nov 2022.
>> Would it be a good time to discuss about HIVE-4.0 GA  release to the
>> community ? Can we have discussion on the new features/jdk support versions
>> which we want to publish as part of 4.0 GA , timeframe of release.
>> 
>> 
>> Thanks,
>> Kirti



[DISCUSS] HIVE 4.0 GA Release Proposal

2023-03-08 Thread Kirti Ruge
Hello Hive Dev,

It has been about 6 months since Hive-4.0-alpha-2 was released in Nov 2022.
Would it be a good time to discuss the HIVE-4.0 GA release with the community? 
Can we have a discussion on the new features and JDK support versions that we 
want to publish as part of 4.0 GA, and on the timeframe of the release?


Thanks,
Kirti

[jira] [Created] (HIVE-27085) Revert Manual constructor from AbortCompactionResponseElement.java

2023-02-16 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-27085:
-

 Summary: Revert Manual constructor from 
AbortCompactionResponseElement.java
 Key: HIVE-27085
 URL: https://issues.apache.org/jira/browse/HIVE-27085
 Project: Hive
  Issue Type: Bug
Reporter: KIRTI RUGE








[jira] [Created] (HIVE-27079) OOM / high GC caused by showCompactions

2023-02-14 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-27079:
-

 Summary: OOM / high GC caused by showCompactions
 Key: HIVE-27079
 URL: https://issues.apache.org/jira/browse/HIVE-27079
 Project: Hive
  Issue Type: Improvement
Reporter: KIRTI RUGE


SHOW COMPACTIONS without a filter pulls a huge result set, and that might be a 
concern for the heap.

We need to check how we can add a scrollable result set with Beeline.
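
To illustrate the general idea of consuming a large result in bounded batches rather than materializing it all at once, here is a minimal Python sketch. It uses the standard DB-API `fetchmany` call with SQLite as a stand-in; the table name and schema are made up, and this is not how Beeline or the metastore is implemented.

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time while fetching them in fixed-size batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield from batch


# Stand-in data: an in-memory table with a made-up name and schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compactions_demo (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO compactions_demo (state) VALUES (?)",
    [("succeeded",)] * 2500,
)

cursor = conn.execute("SELECT id, state FROM compactions_demo ORDER BY id")
# At most batch_size rows are held by the generator at any point.
row_count = sum(1 for _ in iter_rows(cursor, batch_size=500))
```

The caller only ever holds one batch in memory, which is the property an unfiltered listing of compactions would need.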





Re: [EXTERNAL] [ANNOUNCE] New PMC Member: Stamatis Zampetakis

2023-01-14 Thread Kirti Ruge
Congratulations Stamatis !!!
Thanks,
Kirti

On Sat, 14 Jan 2023 at 4:22 AM, Ayush Saxena  wrote:

> Congratulations Stamatis
>
> -Ayush
>
>
> On 14-Jan-2023, at 1:41 AM, Alessandro Solimando <
> alessandro.solima...@gmail.com> wrote:
>
> 
>
> Congratulations Stamatis,
> very well deserved!
>
> Best regards,
> Alessandro
>
> On Fri 13 Jan 2023, 19:57 Chris Nauroth,  wrote:
>
>> Congratulations, Stamatis!
>>
>> Chris Nauroth
>>
>>
>> On Fri, Jan 13, 2023 at 10:46 AM Simhadri G 
>> wrote:
>>
>> > Congratulations Stamatis!
>> >
>> > On Sat, 14 Jan 2023, 00:12 Sankar Hariappan via user, <
>> > u...@hive.apache.org>
>> > wrote:
>> >
>> > > Congrats Stamatis! Well deserved one 
>> > >
>> > >
>> > >
>> > > Thanks,
>> > >
>> > > Sankar
>> > >
>> > >
>> > >
>> > > *From:* Naveen Gangam 
>> > > *Sent:* Saturday, January 14, 2023 12:03 AM
>> > > *To:* dev ; u...@hive.apache.org
>> > > *Cc:* zabe...@apache.org
>> > > *Subject:* [EXTERNAL] [ANNOUNCE] New PMC Member: Stamatis Zampetakis
>> > >
>> > >
>> > >
>> > > Hello Hive Community,
>> > >
>> > > Apache Hive PMC is pleased to announce that Stamatis Zampetakis has
>> > > accepted the Apache Hive PMC's invitation to become PMC Member, and is
>> > now
>> > > our newest PMC member. Please join me in congratulating Stamatis !!!
>> > >
>> > >
>> > >
>> > > He has been an active member in the hive community across many
>> aspects of
>> > > the project. Many thanks to Stamatis for all the contributions he has
>> > made
>> > > and looking forward to many more future contributions in the expanded
>> > role.
>> > >
>> > >
>> > >
>> > > Cheers,
>> > >
>> > > Naveen (on behalf of Hive PMC)
>> > >
>> >
>>
>


Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-20 Thread Kirti Ruge
Congratulations Ayush.

On Wed, 21 Dec 2022 at 12:15 AM, Chris Nauroth  wrote:

> Congratulations, Ayush!
>
> Chris Nauroth
>
>
> On Tue, Dec 20, 2022 at 10:02 AM Sai Hemanth Gantasala <
> saihema...@cloudera.com> wrote:
>
> > Congratulations Ayush, Very well deserved!!.
> >
> > On Mon, Dec 19, 2022 at 5:12 PM Naveen Gangam 
> > wrote:
> >
> >> Hello Hive Community,
> >> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted
> the
> >> Apache Hive PMC's invitation to become PMC Member, and is now our newest
> >> PMC member. Many thanks to Ayush for all the contributions he has made
> and
> >> looking forward to many more future contributions in the expanded role.
> >>
> >> Please join me in congratulating Ayush !!!
> >>
> >> Cheers,
> >> Naveen (on behalf of Hive PMC)
> >>
> >>
> >
>


[jira] [Created] (HIVE-26858) OOM / high GC caused by showCompactions & purgeCompactionHistory

2022-12-15 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26858:
-

 Summary: OOM / high GC caused by showCompactions & 
purgeCompactionHistory
 Key: HIVE-26858
 URL: https://issues.apache.org/jira/browse/HIVE-26858
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: KIRTI RUGE


If for some reason the housekeeper service wasn't running, it could cause an 
OOM when activated. showCompactions & purgeCompactionHistory load the complete 
history of events into the heap; that should be reviewed.
purgeCompactionHistory might be refactored with batching, or replaced with a 
single complex delete query instead of a `select` first and then a `delete`.
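
A minimal sketch of the batching idea, in Python with SQLite as a stand-in database. The table name, columns, and function name are hypothetical and do not reflect the metastore schema or the existing purgeCompactionHistory code.

```python
import sqlite3


def purge_in_batches(conn, cutoff, batch_size=500):
    """Delete history rows older than cutoff, one bounded batch per transaction."""
    deleted_total = 0
    while True:
        # The subquery selects at most batch_size ids, so no row data is
        # ever loaded into the client's memory.
        cur = conn.execute(
            "DELETE FROM compaction_history_demo WHERE id IN ("
            " SELECT id FROM compaction_history_demo"
            " WHERE end_time < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            return deleted_total
        deleted_total += cur.rowcount
```

Each iteration deletes a bounded number of rows inside the database, so neither the heap nor a single long transaction grows with the size of the history. Whether a LIMIT is accepted inside such a subquery depends on the database, so the statement would likely need per-backend variants.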





[jira] [Created] (HIVE-26826) Large table drop causes issues when interrupted

2022-12-09 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26826:
-

 Summary: Large table drop causes issues when interrupted
 Key: HIVE-26826
 URL: https://issues.apache.org/jira/browse/HIVE-26826
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Reporter: KIRTI RUGE


*Issue:*

1. Create a table with a large number of partitions (e.g. 10K partitions)

2. Drop this table (this will take a lot longer because of S3 & other calls)

3. Interrupt #2. In Beeline try "Ctrl+C", or cancel the query in Hue.

4. Since interrupt handling has issues in this codepath, it doesn't kill the 
query.

Other statements will start waiting from this point onwards, as the lockManager 
doesn't release the locks completely.





[jira] [Created] (HIVE-26825) Compactor: Cleaner shouldn't fetch table details again and again for partitioned tables

2022-12-08 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26825:
-

 Summary: Compactor: Cleaner shouldn't fetch table details again 
and again for partitioned tables
 Key: HIVE-26825
 URL: https://issues.apache.org/jira/browse/HIVE-26825
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Reporter: KIRTI RUGE


The Cleaner shouldn't fetch table/partition details for all of a table's 
partitions. When there are a large number of databases/tables, it takes a lot 
of time for the Initiator to complete its initial iteration, and the load on 
the DB also goes up.
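
One way to picture the improvement is a small per-run cache keyed by database and table, sketched below in Python. The class, the fetch callback, and the example names are hypothetical; this is not the Cleaner's actual code.

```python
class TableDetailsCache:
    """Memoize per-table lookups so each (db, table) pair is fetched only once."""

    def __init__(self, fetch_table):
        self._fetch_table = fetch_table  # callable(db, table) -> details
        self._cache = {}

    def get(self, db, table):
        key = (db, table)
        if key not in self._cache:
            self._cache[key] = self._fetch_table(db, table)
        return self._cache[key]


# Demo with a fake fetcher that records how often the backend is hit.
calls = []


def fake_fetch(db, table):
    calls.append((db, table))
    return {"db": db, "table": table}


cache = TableDetailsCache(fake_fetch)
for partition in ["p=1", "p=2", "p=3"]:
    cache.get("default", "web_sales")  # one backend lookup serves all partitions
```

A real implementation would also need to decide when cached entries become stale, for example by scoping the cache to a single cleaning cycle.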





[jira] [Created] (HIVE-26805) Cancel ongoing/working compaction requests

2022-12-02 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26805:
-

 Summary: Cancel ongoing/working compaction requests
 Key: HIVE-26805
 URL: https://issues.apache.org/jira/browse/HIVE-26805
 Project: Hive
  Issue Type: New Feature
Reporter: KIRTI RUGE








[jira] [Created] (HIVE-26804) Cancel Compactions in initiated state

2022-12-02 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26804:
-

 Summary: Cancel Compactions in initiated state
 Key: HIVE-26804
 URL: https://issues.apache.org/jira/browse/HIVE-26804
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Reporter: KIRTI RUGE








[jira] [Created] (HIVE-26803) Ability to cancel compactions

2022-12-02 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26803:
-

 Summary: Ability to cancel compactions
 Key: HIVE-26803
 URL: https://issues.apache.org/jira/browse/HIVE-26803
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Reporter: KIRTI RUGE








[jira] [Created] (HIVE-26764) Show compaction request should have all fields optional

2022-11-19 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26764:
-

 Summary: Show compaction request should have all filds optional
 Key: HIVE-26764
 URL: https://issues.apache.org/jira/browse/HIVE-26764
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0-alpha-2
Reporter: KIRTI RUGE








[jira] [Created] (HIVE-26666) Filter out compactions by id to minimise expense of db operations

2022-10-23 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26666:
-

 Summary: Filter out compactions by id to minimise expense of db 
operations
 Key: HIVE-26666
 URL: https://issues.apache.org/jira/browse/HIVE-26666
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: KIRTI RUGE


At present, compactions are filtered out in classes such as:

AlterTableCompactOperation

Cleaner

These should instead use the show compactions filter option provided by

https://issues.apache.org/jira/browse/HIVE-13353





[jira] [Created] (HIVE-26580) SHOW COMPACTIONS should support ordering and limiting functionality in filtering options

2022-09-30 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26580:
-

 Summary: SHOW COMPACTIONS should support ordering and limiting 
functionality in filtering options
 Key: HIVE-26580
 URL: https://issues.apache.org/jira/browse/HIVE-26580
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: KIRTI RUGE
 Fix For: 4.0.0


SHOW COMPACTIONS should support ordering by a defined table. It should also 
support limiting the number of fetched records.





[jira] [Created] (HIVE-26563) Add extra columns in Show Compactions output and sort the output

2022-09-26 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26563:
-

 Summary: Add extra columns in Show Compactions output and sort the 
output
 Key: HIVE-26563
 URL: https://issues.apache.org/jira/browse/HIVE-26563
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Affects Versions: 3.0.0
Reporter: KIRTI RUGE


SHOW COMPACTIONS needs reformatting in the following respects:

1. Add all of the following columns:

  host information, duration, next_txn_id, txn_id, commit_time, 
highest_write_id, cleaner start, tbl_properties

2. Sort the output so that the most recent element (either completed or in 
progress) is displayed first.
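
The requested ordering could be sketched as below, assuming each output row carries a start timestamp and an id. CompactionRow, its fields, and ShowCompactionsSort are hypothetical names for illustration, not the classes Hive uses; the choice of start time as the sort key is also an assumption, since the issue does not name one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical row type; the field names are illustrative only.
class CompactionRow {
    final long id;
    final String state;
    final long startTimeMillis;

    CompactionRow(long id, String state, long startTimeMillis) {
        this.id = id;
        this.state = state;
        this.startTimeMillis = startTimeMillis;
    }
}

class ShowCompactionsSort {
    // Returns a new list ordered newest first, regardless of state.
    // Rows with equal start times are ordered by id, highest first.
    static List<CompactionRow> mostRecentFirst(List<CompactionRow> rows) {
        List<CompactionRow> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator
                .comparingLong((CompactionRow r) -> r.startTimeMillis)
                .thenComparingLong(r -> r.id)
                .reversed());
        return sorted;
    }
}
```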





[jira] [Created] (HIVE-26481) Compaction failing with File does not exist error

2022-08-18 Thread KIRTI RUGE (Jira)
KIRTI RUGE created HIVE-26481:
-

 Summary: Compaction failing with File does not exist error
 Key: HIVE-26481
 URL: https://issues.apache.org/jira/browse/HIVE-26481
 Project: Hive
  Issue Type: Bug
Reporter: KIRTI RUGE


The compaction fails when the Cleaner tries to remove a missing directory from 
HDFS.

2022-08-05 18:56:38,873 INFO org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-0]: Starting cleaning for id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: null,initiatorId: null
2022-08-05 18:56:38,888 ERROR org.apache.hadoop.hive.ql.txn.compactor.Cleaner: [Cleaner-executor-thread-0]: Caught exception when cleaning, unable to complete cleaning of id:30,dbname:default,tableName:test_concur_compaction_minor,partName:null,state:�,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:4,errorMessage:null,workerId: null,initiatorId: null
java.io.FileNotFoundException: File hdfs://ns1/warehouse/tablespace/managed/hive/test_concur_compaction_minor/.hive-staging_hive_2022-08-05_18-56-37_115_5049319600695911622-37 does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
    at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
    at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144)
    at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332)
    at org.apache.hadoop.fs.FileSystem$5.hasNext(FileSystem.java:2309)
    at org.apache.hadoop.hive.ql.io.AcidUtils.getHdfsDirSnapshots(AcidUtils.java:1440)
    at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.removeFiles(Cleaner.java:287)
    at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.clean(Cleaner.java:214)
    at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.lambda$run$0(Cleaner.java:114)
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil$ThrowingRunnable.lambda$unchecked$0(CompactorUtil.java:54)
    at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
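
The trace shows the failure surfacing while a directory listing is iterated. As an illustration of the general idea only, the sketch below treats a directory that disappears before listing as empty instead of failing the whole run. It uses java.nio.file rather than the Hadoop FileSystem API so that it stays self-contained; TolerantListing and listOrEmpty are hypothetical names, and this is not the fix applied in Hive.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a listing helper that tolerates a vanished directory.
class TolerantListing {
    // Lists the entries of dir. A directory that no longer exists is treated
    // as empty; any other I/O failure is rethrown unchecked.
    static List<Path> listOrEmpty(Path dir) {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                entries.add(entry);
            }
        } catch (NoSuchFileException e) {
            // Removed by another process in the meantime: nothing left to clean.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }
}
```

Only the "does not exist" case is swallowed here; permission errors and other I/O failures still propagate, so real problems are not hidden.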


