[jira] [Created] (HIVE-26073) Fix TestReplicationScenarios.testIncrementalStatisticsMetrics

2022-03-24 Thread Peter Vary (Jira)
Peter Vary created HIVE-26073:
-

 Summary: Fix 
TestReplicationScenarios.testIncrementalStatisticsMetrics
 Key: HIVE-26073
 URL: https://issues.apache.org/jira/browse/HIVE-26073
 Project: Hive
  Issue Type: Task
Reporter: Peter Vary


The test is flaky:

http://ci.hive.apache.org/job/hive-flaky-check/546



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-26072) Enable vectorization for stats gathering (tablescan op)

2022-03-24 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-26072:
---

 Summary: Enable vectorization for stats gathering (tablescan op)
 Key: HIVE-26072
 URL: https://issues.apache.org/jira/browse/HIVE-26072
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajesh Balamohan


https://issues.apache.org/jira/browse/HIVE-24510 enabled vectorization for 
compute_bit_vector. 

However, vectorization of the tablescan operator used for stats gathering is 
still disabled by default.

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java#L2577]

We need to enable vectorization for this operator. This can significantly 
reduce runtimes of analyze statements on large tables.





[jira] [Created] (HIVE-26071) JWT authentication for Thrift over HTTP in HiveMetaStore

2022-03-24 Thread Sourabh Goyal (Jira)
Sourabh Goyal created HIVE-26071:


 Summary: JWT authentication for Thrift over HTTP in HiveMetaStore
 Key: HIVE-26071
 URL: https://issues.apache.org/jira/browse/HIVE-26071
 Project: Hive
  Issue Type: New Feature
  Components: Standalone Metastore
Reporter: Sourabh Goyal
Assignee: Sourabh Goyal


HIVE-25575 recently added support for JWT authentication in HS2. This Jira 
aims to add the same feature to HMS.





Re: [VOTE] Apache Hive 3.1.3 Release Candidate 1

2022-03-24 Thread Naveen Gangam
Thanks Stamatis. Let me look into this.

On Thu, Mar 24, 2022 at 5:42 AM Stamatis Zampetakis 
wrote:

> Thanks for pushing this forward Naveen.
>
> I checked the released sources in apache-hive-3.1.3-src and they contain
> modified LGPL files violating the ASF release policy.
> The problem is the same reported under HIVE-25665. I think the fix
> should be backported to branch-3 before moving forward with the release.
>
> -1 (non-binding)
>
> Best,
> Stamatis
>
> On Wed, Mar 23, 2022 at 9:47 PM Naveen Gangam wrote:
>
> > Apache Hive 3.1.3 Release Candidate 1 is available here:
> > https://people.apache.org/~ngangam/apache-hive-3.1.3-rc-1
> >
> > The checksums are these:
> > - *e0551a6fe328be5ff0fa16d275b65f43f56c35da66ac4e391e47d3e74d466b91*
> > apache-hive-3.1.3-bin.tar.gz
> >
> > - *ce35a179304055004023bec016518fcb40b2ce2b14238ab77aebec99815fde02*
> > apache-hive-3.1.3-src.tar.gz
> >
> >
> > Maven artifacts are available here:
> > https://repository.apache.org/content/repositories/orgapachehive-1112
> >
> > The tag release-3.1.3-rc1 has been applied to the source for this
> > release in github, you can see it at
> > https://github.com/apache/hive/tree/release-3.1.3-rc1
> >
> > The git commit hash is: cc050e40eb55f6c9f1aa08c00c1689f657747afb
> > https://github.com/apache/hive/commit/cc050e40eb55f6c9f1aa08c00c1689f657747afb
> >
> > Voting will conclude in 72 hours.
> >
> > Hive PMC Members: Please test and vote.
> >
> > Thanks.
> >
>


[VOTE] Apache Hive 4.0.0-alpha-1 Release Candidate 2

2022-03-24 Thread Peter Vary
Hi Team,

Apache Hive 4.0.0-alpha-1 Release Candidate 2 is available here:
https://people.apache.org/~pvary/apache-hive-4.0.0-alpha-1-rc2/ 


The checksums are these:
- 1e450197dbf847696b05042eb68b78b968064f1f1b369a7fb0b77a6329a27809  
apache-hive-4.0.0-alpha-1-bin.tar.gz
- a21a609ec2e30f8cc656242c545bb3a04de21c2a1eee90808648e3aa4bf3d04e  
apache-hive-4.0.0-alpha-1-src.tar.gz
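
Voters can check the downloaded tarballs against the checksums listed above. A minimal sketch, assuming the values are SHA-256 digests (they are 64 hex characters long, but this mail does not name the algorithm) and that the tarballs have been downloaded locally; `sha256_of` and `matches` are illustrative names, not part of any Hive tooling:

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large tarballs are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with the checksum published in the vote mail."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("apache-hive-4.0.0-alpha-1-src.tar.gz", "a21a609ec2e30f8cc656242c545bb3a04de21c2a1eee90808648e3aa4bf3d04e")` should return True for an intact download of the source tarball, provided the SHA-256 assumption holds.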

Maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehive-1113/ 


The tag 4.0.0-alpha-1-rc1 has been applied to the source for this release in 
github, you can see it at
https://github.com/apache/hive/tree/release-4.0.0-alpha-1-rc1 


The git commit hash is:
https://github.com/apache/hive/commit/357d4906f5c806d585fd84db57cf296e12e6049b 


Voting will conclude in 72 hours.

All interested parties: Please test.
Hive PMC Members: Please test and vote.

Thanks.

Re: [VOTE] Apache Hive 4.0.0-alpha-1 Release Candidate 1

2022-03-24 Thread Peter Vary
Thanks for the feedback Stamatis!

As discussed offline:
- I will close this vote as unsuccessful.
- I will create a new release candidate with the 
  derby.log/metastore_db/${test.tmp.dir} removed, but keep the other things 
  unchanged.
- I have created jiras to fix the other issues in the next release:
  - HIVE-26070: Remove the generated files from the source tarball
  - HIVE-26069: Remove unnecessary items from the .gitignore
  - HIVE-26068: Add README to the src tarball
  - HIVE-26067: Remove core directory from src

So I close this vote as unsuccessful.

Thanks,
Peter
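
The repo-versus-tarball comparison quoted below (`diff -qr hive apache-hive-4.0.0-alpha-1-src`) can be repeated before publishing the next candidate. A minimal sketch using only the Python standard library; `only_in` is an illustrative name, and unlike `diff -qr` it reports only entries present on one side, not files whose contents differ:

```python
import filecmp


def only_in(left, right, ignore=(".git",)):
    """Return diff -qr style 'Only in <dir>: <name>' lines for two trees."""
    lines = []

    def walk(cmp):
        # Entries that exist on exactly one side at this directory level.
        for name in sorted(cmp.left_only):
            lines.append(f"Only in {cmp.left}: {name}")
        for name in sorted(cmp.right_only):
            lines.append(f"Only in {cmp.right}: {name}")
        # Recurse into directories that exist on both sides.
        for name in sorted(cmp.subdirs):
            walk(cmp.subdirs[name])

    walk(filecmp.dircmp(left, right, ignore=list(ignore)))
    return lines
```

Calling `only_in("hive", "apache-hive-4.0.0-alpha-1-src")` on a git checkout and an extracted source tarball returns a list of lines in the same "Only in" form as the report quoted below; an empty list means no one-sided entries were found.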

> On 2022. Mar 23., at 23:26, Stamatis Zampetakis  wrote:
> 
> Ubuntu 20.04.4 LTS, jdk1.8.0_261, Apache Maven 3.6.3
> 
> * Checked signatures and checksums OK
> * Checked diff between repo and release sources (diff -qr hive
> apache-hive-4.0.0-alpha-1-src) KO
> * Built from git tag (mvn clean install -DskipTests -Pitests) OK
> * Built from release sources (mvn clean install -DskipTests -Pitests) OK
> 
> While comparing the content of the git repo with the release sources I
> noticed various differences. Most notable ones for which I cast a negative
> vote are listed below:
> 
> Only in apache-hive-4.0.0-alpha-1-src/common/src: gen
> Only in apache-hive-4.0.0-alpha-1-src/conf: hive-default.xml.template
> Only in apache-hive-4.0.0-alpha-1-src/itests/hive-unit: cmroot
> Only in apache-hive-4.0.0-alpha-1-src/ql: dependency-reduced-pom.xml
> Only in
> apache-hive-4.0.0-alpha-1-src/standalone-metastore/metastore-common/src/gen:
> version
> Only in
> apache-hive-4.0.0-alpha-1-src/standalone-metastore/metastore-server:
> derby.log
> Only in
> apache-hive-4.0.0-alpha-1-src/standalone-metastore/metastore-server:
> metastore_db
> Only in
> apache-hive-4.0.0-alpha-1-src/standalone-metastore/metastore-server/src: gen
> Only in apache-hive-4.0.0-alpha-1-src/streaming: ${test.tmp.dir}
> Only in hive/: README.md
> Only in hive/: core
> 
> The fact that derby.log and metastore_db appear in the released sources
> is definitely not normal.
> 
> Other than that I was surprised to see that itests sources are part of the
> released sources. I thought that the goal of keeping them separate was to
> avoid releasing them along with the main code. I checked previous releases
> and the directory is there so I suppose it is intentional to have them in
> apache-hive-4.0.0-alpha-1-src.tar.gz
> 
> For future votes, I think it is useful to include in the email a pointer to
> the PGP key that was used to sign the release. I knew where to find it but
> not sure if everyone does. I have to note that the key that was used to
> sign the release does not seem to be signed by any other member of the PMC;
> this is a bit problematic but not a blocker [1].
> 
> Last, I've seen that the released sources do not contain a README file with
> instructions or pointers on how to build the project.
> 
> -1 (non-binding)
> 
> Best,
> Stamatis
> 
> [1] https://www.apache.org/info/verification.html
> 
> 
> On Wed, Mar 23, 2022 at 11:45 AM Peter Vary 
> wrote:
> 
>> Hi Stamatis,
>> 
>> Here is the data you have suggested:
>> Commit hash: 357d4906f5c806d585fd84db57cf296e12e6049b
>> Checksums:
>> ff60286044d2f3faa8ad1475132cdcecf4ce9ed8faf1ed4e56a6753ebc3ab585
>> apache-hive-4.0.0-alpha-1-bin.tar.gz
>> 07f30371df5f624352fa1d0fa50fd981a4dec6d4311bb340bace5dd7247d3015
>> apache-hive-4.0.0-alpha-1-src.tar.gz
>> 
>> Also added it to the
>> https://cwiki.apache.org/confluence/display/Hive/HowToRelease wiki page
>> as well
>> 
>> Thanks,
>> Peter
>> 
>>> On 2022. Mar 22., at 18:22, Stamatis Zampetakis 
>> wrote:
>>> 
>>> Hi Peter,
>>> 
>>> Many thanks for rolling out the RC and for resolving many of the blocker
>>> issues that were remaining.
>>> 
>>> In general, it is a good practice to include the commit hash (which tags
>>> the release) and the checksum hashes of the release artifacts [1] in the
>>> vote email to minimize the chances of man-in-the-middle attacks and
>> voting
>>> on wrong packages.
>>> Can you please update this thread with those?
>>> 
>>> Best,
>>> Stamatis
>>> 
>>> [1] https://people.apache.org/~pvary/apache-hive-4.0.0-alpha-1-rc1/
>>> 
>>> 
>>> On Tue, Mar 22, 2022 at 5:00 PM Naveen Gangam
>>> wrote:
>>> 
>>>> I have been able to build and run a quick test. I have NOT verified the
>>>> signature. I was trying to run the HMS Checkin tests and got this. I
>>>> suspect these are not specific to the alpha-1 branch. But it is not a test
>>>> failure (although it appears like it should be)
>>>> *"mvn test
>>>> -Dtest.groups=org.apache.hadoop.hive.metastore.annotation.MetastoreCheckinTest"*
>>>> 
>>>> [*INFO*] Running org.apache.hadoop.hive.common.metrics.*TestLegacyMetrics*
>>>> 
>>>> [main] WARN org.apache.hadoop.hive.common.metrics.LegacyMetrics - Could not
>>>> find counter value for foo.n, returning null instead.
>>>> 
>>>> javax.management.AttributeNotFoundException: Key [foo.n] not found/tracked
>>>> 

[jira] [Created] (HIVE-26070) Remove the generated files from the source tarball

2022-03-24 Thread Peter Vary (Jira)
Peter Vary created HIVE-26070:
-

 Summary: Remove the generated files from the source tarball
 Key: HIVE-26070
 URL: https://issues.apache.org/jira/browse/HIVE-26070
 Project: Hive
  Issue Type: Task
Reporter: Peter Vary


We should discuss and decide if we would like to share the generated files in 
the source tarball.

There are 3 kinds of generated files:
 # Thrift - we keep them in git too, so I would vote for keeping them
 # package-info.java files:
 ** standalone-metastore/metastore-common/src/gen/version
 ** standalone-metastore/metastore-server/src/gen/version
 ** standalone-metastore/src/gen/version
 ** common/src/gen
 # Test Record generated for Kafka tests: kafka-handler/src/test/gen

I would vote for keeping 1, but removing 2 and 3 from the source tarball.





[jira] [Created] (HIVE-26069) Remove unnecessary items from the .gitignore

2022-03-24 Thread Peter Vary (Jira)
Peter Vary created HIVE-26069:
-

 Summary: Remove unnecessary items from the .gitignore
 Key: HIVE-26069
 URL: https://issues.apache.org/jira/browse/HIVE-26069
 Project: Hive
  Issue Type: Task
Reporter: Peter Vary


Currently .gitignore contains files which are generated by tests. These should 
either be removed by the tests that create them or, preferably, be created 
under the target directories.

Some of the ones that we have seen are:
- metastore_db
- derby.log
- datanucleus.log

It might also be worth exploring why the following directories are not showing 
up in {{git status}}:
- streaming/${test.tmp.dir}
- itests/hive-unit/cmroot
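
One way to explore this is `git check-ignore -v`, which prints the ignore rule (source file, line number, pattern) that matches a given path. A minimal sketch that wraps the command from Python; the helper names `parse_check_ignore_line` and `explain_ignored` are illustrative, and the sketch assumes `git` is on the PATH and that the paths are given relative to the repository root:

```python
import subprocess


def parse_check_ignore_line(line):
    """Split one 'source:linenum:pattern<TAB>path' line into its parts."""
    rule, _, path = line.rstrip("\n").partition("\t")
    source, linenum, pattern = rule.split(":", 2)
    return source, int(linenum), pattern, path


def explain_ignored(repo_dir, paths):
    """Return the matching ignore rule for each path that git ignores."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "check-ignore", "-v", "--", *paths],
        capture_output=True,
        text=True,
        check=False,  # git exits with 1 when none of the paths is ignored
    )
    return [parse_check_ignore_line(l) for l in result.stdout.splitlines() if l]
```

Paths that do not appear in the result are not matched by any ignore rule, so their absence from {{git status}} would need another explanation.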







[jira] [Created] (HIVE-26068) Add README to the src tarball

2022-03-24 Thread Peter Vary (Jira)
Peter Vary created HIVE-26068:
-

 Summary: Add README to the src tarball
 Key: HIVE-26068
 URL: https://issues.apache.org/jira/browse/HIVE-26068
 Project: Hive
  Issue Type: Task
Reporter: Peter Vary


We need to add the README to the src tarball.

It should contain information on how to build the project from source.





[jira] [Created] (HIVE-26067) Remove core directory from src

2022-03-24 Thread Peter Vary (Jira)
Peter Vary created HIVE-26067:
-

 Summary: Remove core directory from src
 Key: HIVE-26067
 URL: https://issues.apache.org/jira/browse/HIVE-26067
 Project: Hive
  Issue Type: Task
Reporter: Peter Vary


This directory is not used. For the only file there, we have an exact copy in 
{{org.apache.hive.hcatalog}}.





Re: [VOTE] Apache Hive 3.1.3 Release Candidate 1

2022-03-24 Thread Stamatis Zampetakis
Thanks for pushing this forward Naveen.

I checked the released sources in apache-hive-3.1.3-src and they contain
modified LGPL files violating the ASF release policy.
The problem is the same reported under HIVE-25665. I think the fix
should be backported to branch-3 before moving forward with the release.

-1 (non-binding)

Best,
Stamatis

On Wed, Mar 23, 2022 at 9:47 PM Naveen Gangam 
wrote:

> Apache Hive 3.1.3 Release Candidate 1 is available here:
> https://people.apache.org/~ngangam/apache-hive-3.1.3-rc-1
>
> The checksums are these:
> - *e0551a6fe328be5ff0fa16d275b65f43f56c35da66ac4e391e47d3e74d466b91*
> apache-hive-3.1.3-bin.tar.gz
>
> - *ce35a179304055004023bec016518fcb40b2ce2b14238ab77aebec99815fde02*
> apache-hive-3.1.3-src.tar.gz
>
>
> Maven artifacts are available here:
> https://repository.apache.org/content/repositories/orgapachehive-1112
>
> The tag release-3.1.3-rc1 has been applied to the source for this
> release in github, you can see it at
> https://github.com/apache/hive/tree/release-3.1.3-rc1
>
> The git commit hash is: cc050e40eb55f6c9f1aa08c00c1689f657747afb
> https://github.com/apache/hive/commit/cc050e40eb55f6c9f1aa08c00c1689f657747afb
>
> Voting will conclude in 72 hours.
>
> Hive PMC Members: Please test and vote.
>
> Thanks.
>


[jira] [Created] (HIVE-26066) Remove deprecated GenericUDAFComputeStats

2022-03-24 Thread Alessandro Solimando (Jira)
Alessandro Solimando created HIVE-26066:
---

 Summary: Remove deprecated GenericUDAFComputeStats
 Key: HIVE-26066
 URL: https://issues.apache.org/jira/browse/HIVE-26066
 Project: Hive
  Issue Type: Task
  Components: Statistics
Affects Versions: 4.0.0
Reporter: Alessandro Solimando


The function has been deprecated and it is currently not used (it is registered 
in the function registry and covered by some qtests, though).

As soon as we move to the next release cycle, the function can be removed.


