Re: [ANNOUNCE] New Committer: Stephen Carlin

2024-07-18 Thread Simhadri G
Congratulations Stephen!!

On Thu, Jul 18, 2024, 6:10 PM Denys Kuzmenko 
wrote:

> Congrats Stephen!
>
> On Thu, Jul 18, 2024 at 7:01 AM Akshat m  wrote:
>
>> Congratulations Stephen !
>>
>> Regards
>> Akshat
>>
>> On Thu, Jul 18, 2024 at 8:07 AM kokila narayanan <
>> kokilanarayana...@gmail.com> wrote:
>>
>>> Congratulations Stephen !!
>>>
>>> Regards,
>>> Kokila N
>>>
>>> On Thu, 18 Jul, 2024, 07:57 Naresh P R,  wrote:
>>>
>>>> Congratulations Stephen !!!
>>>> --
>>>> Regards,
>>>> Naresh P R
>>>>
>>>> On Wed, Jul 17, 2024 at 5:21 AM Stamatis Zampetakis
>>>> wrote:

> Hi All,
>
> Apache Hive's Project Management Committee (PMC) has invited Stephen
> Carlin to become a committer, and we are pleased to announce that he
> has accepted.
>
> Steve has been contributing to the project since 2019. He has improved
> many aspects of the project, notably the query compiler and the
> cost-based optimizer, enhancing performance and fixing multiple bugs.
>
> Stephen, welcome, thank you for your contributions, and we look forward
> to your further interactions with the community!
>
> Please review the guidelines for new committers [1] and take
> additional actions as needed.
>
> Stamatis Zampetakis (on behalf of the Apache Hive PMC)
>
> [1] https://cwiki.apache.org/confluence/display/Hive/HowToCommit
>



Re: apache/hive security vulnerabilities.

2024-06-19 Thread Simhadri G
Hi guys,

I checked for jackson-databind-2.4.0. It seems to be a transitive
dependency from htrace-core .
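
For anyone who wants to repeat the check, Maven's dependency tree with an include filter is one way to see which module pulls in an artifact. This is only a rough sketch: the helper names are made up for illustration, and it assumes Maven is installed and the command is run from the root of a source checkout. If the old Jackson classes are shaded inside another jar rather than declared as a dependency, the tree will not show them and the jar contents have to be inspected instead.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Hive or Maven.

# Build the "groupId:artifactId" pattern accepted by the
# maven-dependency-plugin -Dincludes filter.
artifact_filter() {
  printf '%s:%s' "$1" "$2"
}

# Print only the branches of the dependency tree that lead to the artifact.
# Must be run from a directory that contains a pom.xml.
show_dependency_path() {
  mvn dependency:tree -Dincludes="$(artifact_filter "$1" "$2")"
}

# Example (not executed here):
#   show_dependency_path com.fasterxml.jackson.core jackson-databind
```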

[image: image.png]


On Wed, Jun 19, 2024 at 8:29 PM Stamatis Zampetakis 
wrote:

> I am pretty sure that the old Jackson versions are shaded somewhere
> inside the jars of Hive dependencies. We probably need to inspect the
> contents of our binary distribution of Hive 4.0.0 and take corrective
> actions if needed.
>
> Best,
> Stamatis
>
> On Wed, Jun 19, 2024 at 4:35 PM Denys Kuzmenko 
> wrote:
> >
> > Hi Sreek,
> >
> > Oh, thanks! Ideally the Docker image should be built from Hive-4.0 branch
> > artifacts via the GH action. Let me check, I just hope it wasn't manually
> > uploaded.
>


Re: Hive 4.0 interview/podcast

2024-06-11 Thread Simhadri G
Great interview!!

Thanks Rich and Stamatis for the podcast!!

On Tue, Jun 11, 2024 at 7:46 PM Butao Zhang  wrote:

> It is a really valuable interview!!!
> Thanks Rich and Stamatis!!!
>
>
>
> Thanks,
> Butao Zhang
> ---- Replied Message ----
> From: Ayush Saxena
> Date: 6/11/2024 21:57
> Subject: Re: Hive 4.0 interview/podcast
> Hey Guys,
> I just watched the interview - It's awesome. Big thanx to Rich &
> Stamatis for putting this together!!!
>
> -Ayush
>
> On Tue, 11 Jun 2024 at 18:38, Rich Bowen  wrote:
>
>
> Hi, folks. Congratulations on the release of Hive 4.0. I've just published
> an interview with Stamatis Zampetakis, and it's live at
> https://youtu.be/7HX2MieyzW4 (video) and at https://wp.me/p8gHED-41k
> (just the audio).
>
> Thanks, Stamatis! It was good talking with you.
>
> --Rich
>
>


Re: [VOTE] Mark Hive 2.x EOL

2024-05-10 Thread Simhadri G
+1 (non-binding)


On Fri, May 10, 2024 at 2:34 PM Stamatis Zampetakis 
wrote:

> +1 (binding)
>
> On Fri, May 10, 2024 at 10:10 AM Denys Kuzmenko 
> wrote:
> >
> > +1 (binding)
>


[Discussion] HIVE-28211: Restore hive-exec:core jar

2024-04-25 Thread Simhadri G
Hi Everyone,

The hive-exec:core jar is used by Spark, Oozie, Hudi, and many other
projects. Removing the hive-exec:core jar has caused the following issues:

   - Spark : https://lists.apache.org/list?dev@hive.apache.org:lte=1M:joda
   - Oozie: https://lists.apache.org/thread/yld75ltf9y8d9q3cow3xqlg0fqyj6mkg
   - Hudi: apache/hudi#8147 <https://github.com/apache/hudi/issues/8147>
   - Apache IoTDB:
   https://lists.apache.org/thread/wdqsyj89w9cvyk1pyxr83hlxpg6zp1go

   - Guava: https://github.com/google/guava/issues/
   - joda-time:
   https://lists.apache.org/thread/sphgcvod3qx9wtc51ltpfyr8dpx9p294

I understand that there is prior discussion about why the hive-exec:core
jar was removed here:
https://lists.apache.org/thread/cwtxnffoqpwgmdtlc9hyor2cm22djpkg

We agreed that ultimately the hive-exec jar should be used over hive-exec:core,
but quite a few dependencies need to be shaded and relocated first:
https://issues.apache.org/jira/browse/HIVE-26220

Until we shade and relocate the dependencies in hive-exec, we should restore the
hive-exec:core jar. The intention is to provide a smoother transition from
hive-exec:core to the hive-exec jar for projects that depend on Hive.
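
As a rough illustration of what "not relocated" means in practice, a downstream project could list the jar contents and look for third-party classes that still sit at their original package path. The sketch below is an assumption-laden example: the function name and the jar file name are hypothetical, and it relies on the JDK `jar` tool being available for the listing.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the jar path in the example are assumptions.

# Read a jar listing on stdin (one entry per line, e.g. from `jar tf`) and
# print only the entries that start with the given package path.
entries_under() {
  awk -v prefix="$1" 'index($0, prefix) == 1'
}

# Example (not executed here): joda-time classes bundled at their original
# package path, i.e. classes that would clash with a project's own joda-time.
#   jar tf hive-exec-4.0.0.jar | entries_under org/joda/time/
```

An empty result for a given prefix would suggest the classes are either absent or relocated under a different package; entries printed as-is are the ones that can conflict on a shared classpath.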

Seeking input from the community and a way to move forward on this topic.

I apologize in advance if I have missed anything.

Thanks!

Simhadri G


Re: Re: [ANNOUNCE] New Committer: Simhadri Govindappa

2024-04-19 Thread Simhadri G
Thanks again everyone :)

On Fri, Apr 19, 2024, 2:15 AM Rajesh Balamohan 
wrote:

> Congratulations Simhadri. :)
>
> ~Rajesh.B
>
> On Fri, Apr 19, 2024 at 2:02 AM Aman Sinha  wrote:
>
>> Congrats Simhadri !
>>
>> On Thu, Apr 18, 2024 at 12:25 PM Naveen Gangam
>>  wrote:
>>
>>> Congrats Simhadri. Looking forward to many more contributions in the
>>> future.
>>>
>>> On Thu, Apr 18, 2024 at 12:25 PM Sai Hemanth Gantasala
>>>  wrote:
>>>
>>>> Congratulations Simhadri, well deserved
>>>>
>>>> On Thu, Apr 18, 2024 at 8:41 AM Pau Tallada  wrote:
>>>>
>>>>> Congratulations
>>>>>
>>>>> Message from Alessandro Solimando on
>>>>> Thu, 18 Apr 2024 at 17:40:
>>>>>
>>>>>> Great news, Simhadri, very well deserved!
>>>>>>
>>>>>> On Thu, 18 Apr 2024 at 15:07, Simhadri G 
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks everyone!
>>>>>>> I really appreciate it, it means a lot to me :)
>>>>>>> The Apache Hive project and its community have truly inspired me.
>>>>>>> I'm grateful for the chance to contribute to such a remarkable project.
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Simhadri Govindappa
>>>>>>>
>>>>>>> On Thu, Apr 18, 2024 at 6:18 PM Sankar Hariappan
>>>>>>>  wrote:
>>>>>>>
>>>>>>>> Congrats Simhadri!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -Sankar
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *From:* Butao Zhang 
>>>>>>>> *Sent:* Thursday, April 18, 2024 5:39 PM
>>>>>>>> *To:* u...@hive.apache.org; dev 
>>>>>>>> *Subject:* [EXTERNAL] Re: [ANNOUNCE] New Committer: Simhadri
>>>>>>>> Govindappa
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Congratulations Simhadri !!!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> *From:* user-return-28075-butaozhang1=163@hive.apache.org <
>>>>>>>> user-return-28075-butaozhang1=163@hive.apache.org> on behalf of Ayush
>>>>>>>> Saxena
>>>>>>>> *Sent:* Thursday, April 18, 2024 7:50 PM
>>>>>>>> *To:* dev ; u...@hive.apache.org <
>>>>>>>> u...@hive.apache.org>
>>>>>>>> *Subject:* [ANNOUNCE] New Committer: Simhadri Govindappa
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Apache Hive's Project Management Committee (PMC) has invited
>>>>>>>> Simhadri Govindappa to become a committer, and we are pleased to
>>>>>>>> announce that he has accepted.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Please join me in congratulating him, Congratulations Simhadri,
>>>>>>>> Welcome aboard!!!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -Ayush Saxena
>>>>>>>>
>>>>>>>> (On behalf of Apache Hive PMC)
>>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> --
>>>>> Pau Tallada Crespí
>>>>> Departament de Serveis
>>>>> Port d'Informació Científica (PIC)
>>>>> Tel: +34 93 170 2729
>>>>> --
>>>>>
>>>>>


Re: Re: [ANNOUNCE] New Committer: Simhadri Govindappa

2024-04-18 Thread Simhadri G
Thanks everyone!
I really appreciate it, it means a lot to me :)
The Apache Hive project and its community have truly inspired me. I'm
grateful for the chance to contribute to such a remarkable project.

Thanks!
Simhadri Govindappa

On Thu, Apr 18, 2024 at 6:18 PM Sankar Hariappan
 wrote:

> Congrats Simhadri!
>
>
>
> -Sankar
>
>
>
> *From:* Butao Zhang 
> *Sent:* Thursday, April 18, 2024 5:39 PM
> *To:* u...@hive.apache.org; dev 
> *Subject:* [EXTERNAL] Re: [ANNOUNCE] New Committer: Simhadri Govindappa
>
>
>
>
> Congratulations Simhadri !!!
>
>
>
> Thanks.
>
>
> --
>
> *From:* user-return-28075-butaozhang1=163@hive.apache.org <
> user-return-28075-butaozhang1=163@hive.apache.org> on behalf of Ayush Saxena <
> ayush...@gmail.com>
> *Sent:* Thursday, April 18, 2024 7:50 PM
> *To:* dev ; u...@hive.apache.org <
> u...@hive.apache.org>
> *Subject:* [ANNOUNCE] New Committer: Simhadri Govindappa
>
>
>
> Hi All,
>
> Apache Hive's Project Management Committee (PMC) has invited Simhadri
> Govindappa to become a committer, and we are pleased to announce that he
> has accepted.
>
>
>
> Please join me in congratulating him, Congratulations Simhadri, Welcome
> aboard!!!
>
>
>
> -Ayush Saxena
>
> (On behalf of Apache Hive PMC)
>


Re: Issue with joda-time library bundled in hive-exec:4.0.0

2024-04-18 Thread Simhadri G
Hi Cheng Pan,

There is a long-running Hive mailing list thread discussing this here:
https://lists.apache.org/thread/sxcrcf4v9j630tl9domp0bn4m33bdq0s


On Thu, Apr 18, 2024 at 11:15 AM Cheng Pan  wrote:

> Hi Ayush,
>
> > Hive is already in discussion of marking Hive-2.x EOL, so at very best
> > we would have one release and immediately after that we will announce it EOL
>
> Does the discussion happen in public? Is there an ETA for the final
> release of branch-2.3?
>
> Thanks,
> Cheng Pan
>
>
> > On Apr 17, 2024, at 18:03, Ayush Saxena  wrote:
> >
> > Thanks Cheng Pan for sharing the pointers. Do you have a list of issues
> > or pointers on the challenges Spark faces in moving to a higher Hive
> > version? I know upgrading libraries is quite challenging, but it is
> > inevitable.
> >
> > Hive is already in discussion of marking Hive-2.x EOL, so at the very best
> > we would have one release and immediately after that we will announce it
> > EOL. Maintaining a release line is quite an effort for us at Hive, and doing
> > it because other projects don't want to upgrade isn't a convincing reason
> > for most of us. The best we can do is address whatever issues we can for
> > Spark as part of the Hive code, and we would definitely need help/support
> > from the Spark side as well. Since the move is from 2.x to 4.x, it would be
> > a big change and would meet resistance on both sides.
> >
> > So it would be a great help if any pointers can be shared from the Spark
> > side for the move. If there is no help/interest from Spark, then we can't do
> > anything, and there is no need for Hive-2.x either in that case :-)
> >
> > -Ayush
> >
> > On Wed, 17 Apr 2024 at 15:00, Cheng Pan  wrote:
> > > … we are exploring ways to get Spark move from 2.3.9 to 4.0, Our
> > > initial hunch is that it would be quite challenging without a hive-exec
> > > slim jar …
> >
> > It should be challenging to upgrade Spark’s built-in Hive version.
> > Actually, we already did lots of work on branch-2.3 which focuses on CVE
> > reduction, for example, allowing Spark to upgrade Guava to modern versions
> > to get rid of Guava 14, it was tested with the latest Spark master
> > branch[1], maybe we need a release for 2.3.10 now.
> >
> > [1] https://github.com/apache/spark/pull/45372
> >
> > Thanks,
> > Cheng Pan
> >
> >
>
>


Re: [Blog] Apache Hive 4.0 Release blog for ASF M & P

2024-04-05 Thread Simhadri G
Looks great,  thanks Ayush! :)

On Fri, Apr 5, 2024, 8:54 PM Butao Zhang  wrote:

> Good job Ayush!
> Hope this can make more people know that Apache Hive 4.0 is really ready
> to be used!
>
>
> Thanks,
> Butao Zhang
> ---- Replied Message ----
> From: Ayush Saxena
> Date: 4/5/2024 19:55
> To: dev
> Subject: [Blog] Apache Hive 4.0 Release blog for ASF M & P
> Hi All,
>
> Have been talking to the ASF M & P team and they recognise that the 4.0
> release is a big milestone for our project.
>
> They are happy to have an entry for us in their news column, e.g.:
>
> https://news.apache.org/foundation/entry/apache-software-foundation-announces-apache-wicket-v10
>
> So I, along with Denys and Simhadri, and with tons of help from ChatGPT,
> have prepared a draft to share with them.
> The draft is here:
>
>
> https://docs.google.com/document/d/10Zu8pHvWNDRTqn7yvYqvU4-kw3Q1TXo7mGo5m5fUP2Y/edit
>
> If you have some feedback or concerns, please share with us.
>
> If you want some improvements or removals, let us know here & we will do
> that, or if you need write access to this page, just let me know.
>
> If nobody objects, I plan to send this to the team by Tuesday next week.
>
>
> -Ayush
>


Re: [VOTE] Release Apache Hive 4.0.0 (Release Candidate 0)

2024-03-27 Thread Simhadri G
Hi Everyone,

Thanks, Denys for driving the release.

+1 (non-binding)

I verified the following:

* Downloaded the source tarball, signature (.asc), and checksum: ✓ OK
* Imported GPG keys and verified the signature: ✓ OK

   1. Download KEYS and run: gpg --import /path/to/downloaded/KEYS
   2. Verify the signatures by running:
      gpg --verify ./apache-hive-4.0.0-src.tar.gz.asc ./apache-hive-4.0.0-src.tar.gz
      gpg --verify ./apache-hive-4.0.0-bin.tar.gz.asc ./apache-hive-4.0.0-bin.tar.gz

* Validated checksum and signature for the artifacts: ✓ OK
* Successfully built from source: ✓ OK
* Initialized meta scripts against MySQL: ✓ OK
* Confirmed successful standalone metastore setup with MySQL: ✓ OK
* Deployed and started HiveServer2 and Metastore with Hadoop 3.3.6 and Tez
0.10.3: ✓ OK
* Ran TPC-DS queries on Hive external tables with Tez and also executed a
few Hive Iceberg queries: ✓ OK
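
For the checksum step, something along these lines can be used to compare an artifact against the SHA-256 digests published in the vote email. It is only a sketch: the function name is made up, and it assumes GNU coreutils `sha256sum` is installed (on macOS, `shasum -a 256` prints the same format).

```shell
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical; assumes `sha256sum` exists.

# Succeed (exit status 0) if the SHA-256 of the file equals the expected
# hex digest, fail otherwise.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Example (not executed here), using the source tarball digest from the
# vote email:
#   verify_sha256 apache-hive-4.0.0-src.tar.gz \
#     4dbc9321d245e7fd26198e5d3dff95e5f7d0673d54d0727787d72956a1bca4f5 && echo OK
```

This only checks integrity of the download; the gpg --verify step is still needed to confirm who signed the artifact.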

Thanks!
Simhadri G

On Wed, Mar 27, 2024 at 7:12 AM Butao Zhang  wrote:

> +1 (non-binding)
>
>
> I checked:
> [x] Build the 4.0.0 source code successfully: mvn clean package
> -DskipTests -Pdist -Piceberg -Pitests
> [x] Deploy and start the binary tar against Hadoop3.3.6 & Tez0.10.3
> successfully.
> [x] Run some test SQLs, such as create acid table/ iceberg table, and do
> some basic/acid operations (insert/delete/update)
> [x] Enable Ranger authorization plugin, do some db/tbl permission check.
>
>
> Thanks Denys for driving the release. I'm excited to see the official
> upcoming release of version Hive 4.0.0!
>  Replied Message 
> | From | Denys Kuzmenko |
> | Date | 3/26/2024 15:26 |
> | To |  |
> | Subject | [VOTE] Release Apache Hive 4.0.0 (Release Candidate 0) |
> Hi Everyone,
>
> We would like to thank everyone who has contributed to the project and
> request the Hive PMC members to review and vote on this new release candidate.
>
> Apache Hive 4.0.0 RC-0 artifacts are available here:
> https://people.apache.org/~dkuzmenko/apache-hive-4.0.0-rc0/
>
>
> The checksums are as follows:
> - 83eb88549ae88d3df6a86bb3e2526c7f4a0f21acafe21452c18071cee058c666
> apache-hive-4.0.0-bin.tar.gz
> - 4dbc9321d245e7fd26198e5d3dff95e5f7d0673d54d0727787d72956a1bca4f5
> apache-hive-4.0.0-src.tar.gz
>
>
> You can find the KEYS file here:
>
> * https://downloads.apache.org/hive/KEYS
>
>
> A staged Maven repository URL is:
> https://repository.apache.org/content/repositories/orgapachehive-1127/
>
> The git commit hash is:
>
> https://github.com/apache/hive/commit/183f8cb41d3dbed961ffd27999876468ff06690c
>
>
> This corresponds to the tag: release-4.0.0-rc0
> * https://github.com/apache/hive/tree/release-4.0.0-rc0
>
> The vote is open for the next 72 hours and passes if a majority of at least
> three +1 PMC votes are cast.
>
> (Only PMC members have binding votes, however, other community members
> are encouraged to cast non-binding votes.)
>
>
> [ ] +1 Release this package as Apache Hive 4.0.0
> [ ] +0
> [ ] -1 Do not release this because...
>
>
> Please download, verify, and test.
>
>
> Regards,
>
> Denys
>


Retire https://apache.github.io sites

2024-03-13 Thread Simhadri G
Hi Everyone,

The revamped Hive website has been hosted at https://hive.apache.org/ for
more than a year now.

As a result, we would like to retire and disable the old Apache Hive websites
hosted via GitHub Pages at the following addresses:

   - https://apache.github.io/hive/
   - https://apache.github.io/hive-site/

This work is tracked in
https://issues.apache.org/jira/browse/HIVE-27953

Kindly let us know if there are any questions regarding this.

Thanks!
Simhadri G


Re: [Discuss] Enable Attachments for Hive mailing lists

2024-01-24 Thread Simhadri G
+1 from me.

It would be nice if we could attach design docs to the mail thread.

Thanks!
Simhadri G


On Tue, Jan 23, 2024 at 1:40 PM Stamatis Zampetakis 
wrote:

> +0
>
> I rarely open attachments from public mailing lists for security
> reasons (unless we are talking for known safe extensions).
>
> Moreover, I find it easier to glance through code if people share a
> link to a PR or code in GitHub than if I have to download and apply a
> patch locally.
>
> I understand that for some people this may be helpful so I am not
> opposing the change.
>
> Best,
> Stamatis
>
> On Mon, Jan 22, 2024 at 2:39 PM Attila Turoczy
>  wrote:
> >
> > +1 for me as well. We need it.
> >
> > -Attila
> >
> > On Mon, Jan 22, 2024 at 1:25 PM Ayush Saxena  wrote:
> >
> > > Hi All,
> > > As of now we don't allow having attachments on the hive mailing lists
> > > (apart from security ML), This prevents us from attaching
> patches/design
> > > doc or even screenshots of issues being reported on our mailing lists.
> > >
> > > A lot of projects allow that, I feel we should enable this for our Hive
> > > mailing lists as well for better dev experience.
> > >
> > > Let me know your thoughts!!!
> > >
> > > Obviously a +1 from me
> > >
> > > -Ayush
> > >
>


Re: 4.0 documentation - Confluence limitations?

2024-01-08 Thread Simhadri G
Hi Zsolt,

The current Hive website is built with Hugo, so +1 from me :)

We do have a few doc pages written for Hugo, for example:
https://hive.apache.org/developement/quickstart/

To add a new page, we need to add a new Markdown file in the correct
location in the hive-site repo, and Hugo will render it on the Hive
website.
For reference, there is a README section on how to add new pages as
well: https://github.com/apache/hive-site#to-add-new-content
We can definitely change the formatting/style of the docs as needed.
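
To give a feel for how small the step is, here is a rough sketch of creating such a Markdown file with minimal Hugo front matter. The helper name, the content directory, and the front matter fields are all assumptions for illustration; the README linked above is the reference for where pages actually go in hive-site.

```shell
#!/usr/bin/env bash
# Sketch only: the directory layout and front matter fields are assumptions;
# check the hive-site README for the real conventions.

# Create <content_dir>/<slug>.md with minimal Hugo front matter.
new_doc_page() {
  local content_dir="$1" slug="$2" title="$3"
  mkdir -p "$content_dir"
  cat > "$content_dir/$slug.md" <<EOF
---
title: "$title"
draft: false
---

Write the page body here in Markdown.
EOF
}

# Example (not executed here), from the root of a hive-site checkout:
#   new_doc_page content/docs my-new-page "My New Page"
#   hugo server   # preview the site locally
```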


Thanks!
Simhadri G

On Mon, Jan 8, 2024 at 3:04 PM Stamatis Zampetakis 
wrote:

> Hey Zsolt,
>
> There have been a few discussions in the past about moving the
> documentation from the wiki to the website and from what I recall
> people were more or less in favor of moving towards this direction.
> The main thing missing is volunteers that are willing to take on this
> migration step.
>
> Personally, I am very much in favor of going into this direction not
> only for solving namespacing issues but also for traceability purposes
> and facilitating doc contributions and reviews.
>
> Big +1 from me.
>
> Best,
> Stamatis
>
> On Mon, Jan 8, 2024 at 10:15 AM Zsolt Miskolczi
>  wrote:
> >
> > In confluence, page names should be unique in a given space. As I see,
> > Apache Hive has its own space.
> > And now comes the tricky part: with 4.0 documentation, we didn't create a
> > new space, just a 4.0 parent page. We create a copy of existing pages
> under
> > the umbrella of this page:
> > https://cwiki.apache.org/confluence/display/Hive/Apache+Hive+4.0.0
> >
> > The problem is the unique naming of pages: it would make sense to keep
> the
> > page names the same as in the older documents but unfortunately, we
> cannot.
> > So we try to create names that are almost the same, or just delay the
> > decisions.
> > Two examples:
> > - AdminManual Installation
> > <
> https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation>
> > became Manual Installation
> > <https://cwiki.apache.org/confluence/display/Hive/Manual+Installation>
> > - Hive Schema Tool
> > <https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool>became
> Copy
> > of Hive Schema Tool - [TODO: move it under a 4.0 admin manual page, find
> a
> > proper name]
> > <
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284790216
> >
> >
> > I feel multiple issues with that: Consistency is gone. And also, I'm not
> > sure how it can support search engines. Also, it can be confusing for
> > people who want to use the wiki pages.
> >
> > I was thinking about different solutions. Creating a Hive 4.0 space in
> > Confluence can solve the problem of page uniqueness. But doesn't address
> > the issue of searchability and ease of use.
> >
> > We can also keep the current one but in that case, it would be
> recommended
> > to figure out a great naming convention about the pages.
> >
> > At this point, my best idea is to move to an engine that has better
> offers
> > to document a software product. For example, Iceberg uses Hugo. It is a
> > markup-based engine, it can be kept in source control and pretty fast.
> > Example page: https://iceberg.apache.org/docs/1.4.1/.
> >
> >
> > What do you think of that?
> >
> > Thank you,
> > Zsolt
>


Re: Help with Docker Apache/Hive metastore using mysql remote database

2023-12-18 Thread Simhadri G
We can modify the Dockerfile to wget the necessary driver and copy it to
/opt/hive/lib/. This should make it work. The diff below shows the approach
with the PostgreSQL driver; for MySQL, the MySQL connector jar would be
fetched instead:


diff --git a/packaging/src/docker/Dockerfile b/packaging/src/docker/Dockerfile
--- a/packaging/src/docker/Dockerfile (revision dceaf810b32fc266e3e657fdaefcd4507f2191b5)
+++ b/packaging/src/docker/Dockerfile (date 1702897518609)
@@ -80,6 +80,9 @@

 ENV PATH=$HIVE_HOME/bin:$HADOOP_HOME/bin:$PATH

+RUN wget https://repo1.maven.org/maven2/org/postgresql/postgresql/42.5.1/postgresql-42.5.1.jar
+RUN cp /postgresql-42.5.1.jar /opt/hive/lib/
+
 COPY entrypoint.sh /
 COPY conf $HIVE_HOME/conf
 RUN chmod +x /entrypoint.sh
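
Since the question is about MySQL, an alternative that avoids rebuilding the image might be to download the MySQL connector on the host and bind-mount it into the container. This is an untested sketch: the connector version, the mount approach, and the helper name are assumptions, and the SERVICE_OPTS from the original command still need to be passed.

```shell
#!/usr/bin/env bash
# Sketch only: the connector version, the bind-mount approach, and the helper
# name are assumptions, not something confirmed in this thread.

# Maven Central URL of the MySQL Connector/J jar for a given version
# (coordinates com.mysql:mysql-connector-j).
mysql_connector_url() {
  local version="$1"
  printf 'https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/%s/mysql-connector-j-%s.jar' \
    "$version" "$version"
}

# Example (not executed here): fetch the jar, then mount it into the Hive
# lib directory of the container.
#   wget "$(mysql_connector_url 8.0.33)"
#   docker run -d -p 9083:9083 --env SERVICE_NAME=metastore --env DB_DRIVER=mysql \
#     -v "$PWD/mysql-connector-j-8.0.33.jar:/opt/hive/lib/mysql-connector-j-8.0.33.jar" \
#     --name metastore-standalone apache/hive:${HIVE_VERSION}
```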

On Mon, Dec 18, 2023, 12:59 PM Ayush Saxena  wrote:

> I think the similar problem is being chased as part of
> https://github.com/apache/hive/pull/4948
>
> On Mon, 18 Dec 2023 at 09:48, Sanjay Gupta  wrote:
> >
> >
> >
> >
> > Issue with Docker container using mysql RDBMS ( Failed to load driver)
> >
> > https://hub.docker.com/r/apache/hive
> >
> > According to readme
> >
> > Launch Standalone Metastore With External RDBMS
> (Postgres/Oracle/MySql/MsSql)
> >
> > I want to use MySQL
> >
> > I tried com.mysql.jdbc.Driver or com.mysql.cj.jdbc.Driver
> >
> > docker run -it -d -p 9083:9083 --env SERVICE_NAME=metastore --add-host=host.docker.internal:host-gateway \
> >  --env DB_DRIVER=mysql \
> >  --env SERVICE_OPTS="-Djavax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver -Djavax.jdo.option.ConnectionURL=jdbc:mysql://host.docker.internal:3306/hive?createDatabaseIfNotExist=true -Djavax.jdo.option.ConnectionUserName=hive -Djavax.jdo.option.ConnectionPassword=password" \
> >  --mount source=warehouse,target=/opt/hive/data/warehouse \
> >  --name metastore-standalone apache/hive:${HIVE_VERSION}
> >
> >
> > docker run -it -d -p 9083:9083 --env SERVICE_NAME=metastore --add-host=host.docker.internal:host-gateway \
> >  --env DB_DRIVER=mysql \
> >  --env SERVICE_OPTS="-Djavax.jdo.option.ConnectionDriverName=com.mysql.cj.jdbc.Driver -Djavax.jdo.option.ConnectionURL=jdbc:mysql://host.docker.internal:3306/hive?createDatabaseIfNotExist=true -Djavax.jdo.option.ConnectionUserName=hive -Djavax.jdo.option.ConnectionPassword=password" \
> >  --mount source=warehouse,target=/opt/hive/data/warehouse \
> >  --name metastore-standalone apache/hive:${HIVE_VERSION}
> >
> > Docker logs shows this for both drivers ( same error )
> >
> > docker logs f3
> > + : mysql
> > + SKIP_SCHEMA_INIT=false
> > + export HIVE_CONF_DIR=/opt/hive/conf
> > + HIVE_CONF_DIR=/opt/hive/conf
> > + '[' -d '' ']'
> > + export 'HADOOP_CLIENT_OPTS= -Xmx1G
> -Djavax.jdo.option.ConnectionDriverName=com.mysql.cj.jdbc.Driver
> -Djavax.jdo.option.ConnectionURL=jdbc:mysql://host.docker.internal:3306/hive?createDatabaseIfNotExist=true
> -Djavax.jdo.option.ConnectionUserName=hive
> -Djavax.jdo.option.ConnectionPassword=hive'
> > + HADOOP_CLIENT_OPTS=' -Xmx1G
> -Djavax.jdo.option.ConnectionDriverName=com.mysql.cj.jdbc.Driver
> -Djavax.jdo.option.ConnectionURL=jdbc:mysql://host.docker.internal:3306/hive?createDatabaseIfNotExist=true
> -Djavax.jdo.option.ConnectionUserName=hive
> -Djavax.jdo.option.ConnectionPassword=hive'
> > + [[ false == \f\a\l\s\e ]]
> > + initialize_hive
> > + /opt/hive/bin/schematool -dbType mysql -initSchema
> > SLF4J: Class path contains multiple SLF4J bindings.
> > SLF4J: Found binding in
> [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: Found binding in
> [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> > SLF4J: Actual binding is of type
> [org.apache.logging.slf4j.Log4jLoggerFactory]
> > Metastore connection URL:
> jdbc:mysql://host.docker.internal:3306/hive?createDatabaseIfNotExist=true
> > Metastore Connection Driver : com.mysql.cj.jdbc.Driver
> > Metastore connection User: hive
> > org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver
> > Underlying cause: java.lang.ClassNotFoundException :
> com.mysql.cj.jdbc.Driver
> > Use --verbose for detailed stacktrace.
> > *** schemaTool failed ***
> > + '[' 1 -eq 0 ']'
> > + echo 'Schema initialization failed!'
> > Schema initialization failed!
> > + exit 1
> >
> > Any idea why I am getting "Failed to load driver" for the MySQL DB?
> >
> > Doesn't the Docker container come with the MySQL driver?
> >
> > The Docker container exits, so I can't check whether the driver is already
> > installed.
> >
> > Let me know what I can do to make it work.
> >
> > --
> >
> >
> > Thanks
> > Sanjay Gupta
> >
>


Re: [VOTE] Release Apache Hive 4.0.0-beta-1 (Release Candidate 0)

2023-08-10 Thread Simhadri G
Hi Everyone,

Thanks, Stamatis for driving the release.

+1 (non-binding)

Verified the following:

* Download the source tarball, signature (.asc), and checksum (.sha512): OK
* Import GPG keys (download KEYS and run
  gpg --import /path/to/downloaded/KEYS.txt), then verify the signature by running
  gpg --verify ./apache-hive-4.0.0-beta-1-bin.tar.gz.asc ./apache-hive-4.0.0-beta-1-bin.tar.gz: OK
* Validated checksum and signature for the artifacts: OK
* Build from source successfully: OK
* Init meta scripts against MySQL: OK
* Successful standalone metastore setup with MySQL: OK
* Bring up HiveServer2 and Metastore, run some simple Hive queries and
  Iceberg queries using Tez: OK

Thanks!
Simhadri G



On Thu, Aug 10, 2023 at 12:30 PM Sourabh Badhya
 wrote:

> I was able to do the following with RC0 artifacts -
> * Verified checksum of sources and binaries.
> * Successfully built from source.
> * Successful metastore DB setup with Postgres.
> * Brought up HS2 and HMS successfully and ran CREATE, INSERT, SELECT, DROP
> queries for external tables and CREATE, INSERT, SELECT, DELETE, UPDATE,
> DROP queries for transactional and Iceberg tables using Tez.
>
> +1 (non-binding)
>
> Thanks Stamatis for driving the release.
>
> Regards,
> Sourabh Badhya
>
> On Wed, Aug 9, 2023 at 10:53 AM dengzhhu653  wrote:
>
> > +1 (non-binding), and thanks for driving the release.
> > * Verified signature/checksum of sources and binaries;
> > * Good rat check and source files;
> > * Build from source successfully;
> > * Inited meta scripts against Postgres;
> > * Bring up HiveServer2 and Metastore, run some simple queries using Tez: ok
> >
> > Thanks,
> > Zhihua
> > At 2023-08-07 21:55:30, "Stamatis Zampetakis" 
> wrote:
> > >Hi all,
> > >
> > >I have created a build for Apache Hive 4.0.0-beta-1 Release Candidate 0.
> > >
> > >Thanks to everyone who has contributed to this release.
> > >
> > >You can read the release notes here:
> > >
> https://github.com/apache/hive/blob/branch-4.0.0-beta-1/RELEASE_NOTES.txt
> > >
> > >The commit to be voted upon:
> > >
> >
> https://github.com/apache/hive/commit/d2310944e412b577a39687c7968b2e93eede8433
> > >
> > >Its hash is
> > >d2310944e412b577a39687c7968b2e93eede8433
> > >
> > >Tag:
> > >https://github.com/apache/hive/tree/release-4.0.0-beta-1-rc0
> > >
> > >The artifacts to be voted on are located here:
> > >https://people.apache.org/~zabetak/apache-hive-4.0.0-beta-1-rc0/
> > >
> > >The hashes of the artifacts are as follows:
> > >- 4114d8e9a523562c77237a8751dec9ed1bcbf6ccbe2e178d72f356ca4e65d466
> > >apache-hive-4.0.0-beta-1-bin.tar.gz
> > >- 8d157f4dcb9af5e48e51206a4046d1c11414fbc39583c84be31d609606136209
> > >apache-hive-4.0.0-beta-1-src.tar.gz
> > >
> > >A staged Maven repository is available for review at:
> > >https://repository.apache.org/content/repositories/orgapachehive-1119
> > >
> > >Release artifacts are signed with the following key:
> > >https://people.apache.org/keys/committer/zabetak.asc
> > >https://downloads.apache.org/hive/KEYS
> > >
> > >Please vote on releasing this package as Apache Hive 4.0.0-beta-1.
> > >
> > >The vote is open for the next 72 hours and passes if a majority of at
> > >least three +1 PMC votes are cast.
> > >
> > >[ ] +1 Release this package as Apache Hive 4.0.0-beta-1
> > >[ ]  0 I don't feel strongly about it, but I'm okay with the release
> > >[ ] -1 Do not release this package because...
> > >
> > >Here is my vote:
> > >+1 (binding)
> > >
> > >Best,
> > >Stamatis
> >
>


Re: Request for write access to Hive wiki

2023-08-02 Thread Simhadri G
Thanks Ayush!

On Wed, Aug 2, 2023 at 4:25 PM Ayush Saxena  wrote:

> Hi Simhadri,
> It is done.
>
> -Ayush
>
> On Wed, 2 Aug 2023 at 15:40, Simhadri G  wrote:
> >
> > Hi Everyone,
> >
> > I need to update the Hive column statistics page in the Hive wiki:
> > https://cwiki.apache.org/confluence/display/Hive/Column+Statistics+in+Hive
> > because of a PR.
> >
> > I kindly request write access to the Hive wiki (Confluence username:
> > simhadri064).
> >
> > Thanks!
> > Simhadri G
>


Request for write access to Hive wiki

2023-08-02 Thread Simhadri G
Hi Everyone,

I need to update the Hive column statistics page in the Hive wiki:
https://cwiki.apache.org/confluence/display/Hive/Column+Statistics+in+Hive
because of a PR.

I kindly request write access to the Hive wiki (Confluence username:
simhadri064).

Thanks!
Simhadri G


Re: [ANNOUNCE] New committer for Apache Hive: Alessandro Solimando

2023-02-08 Thread Simhadri G
Congratulations Alessandro!!

On Thu, 9 Feb 2023, 11:26 Ayush Saxena,  wrote:

> Congratulations Alessandro!!!
>
> -Ayush
>
> > On 09-Feb-2023, at 1:30 AM, Naveen Gangam  wrote:
> >
> > The Project Management Committee (PMC) for Apache Hive has invited
> > Alessandro Solimando (asolimando) to become a committer and is pleased
> > to announce that he has accepted.
> >
> > Contributions from Alessandro:
> > He has authored 30 patches for Hive and 18 for Apache Calcite, and has
> > done many code reviews for other contributors. He has vast experience and
> > knowledge in SQL compilation and optimization. His most recent work
> > added support for histogram-based column stats in Hive.
> >
> > https://issues.apache.org/jira/issues/?filter=12352498
> >
> > Being a committer enables easier contribution to the project since
> > there is no need to go via the patch submission process. This should
> > enable better productivity. A PMC member helps manage and guide the
> > direction of the project.
> >
> > Congratulations
> > Hive PMC
>


Re: [ANNOUNCE] New PMC Member: Laszlo Bodor

2023-01-28 Thread Simhadri G
Congratulations Laszlo Bodor! :)



On Sat, 28 Jan 2023, 20:26 Akshat m,  wrote:

> Congratulations Laszlo
>
> Regards,
> Akshat
>
> On Sat, Jan 28, 2023 at 3:03 AM Naveen Gangam
> wrote:
>
> > Hello Hive Community,
> > Apache Hive PMC is pleased to announce that Laszlo Bodor
> > (username:abstractdog) has accepted the Apache Hive PMC's invitation to
> > become PMC Member, and is now our newest PMC member. Please join me in
> > congratulating Laszlo !!!
> >
> > He has been an active member in the hive community across many aspects of
> > the project. Many thanks to Laszlo for all the contributions he has made
> > and looking forward to many more future contributions in the expanded
> > role.
> >
> > https://github.com/apache/hive/commits?author=abstractdog
> >
> > * 96 commits in master [2]
> > * 66 reviews in master [3]
> > * Reported 163 JIRAS [6]
> >
> > Cheers,
> > Naveen (on behalf of Hive PMC)
> >
>


Re: [EXTERNAL] [ANNOUNCE] New PMC Member: Stamatis Zampetakis

2023-01-13 Thread Simhadri G
Congratulations Stamatis!

On Sat, 14 Jan 2023, 00:12 Sankar Hariappan via user, 
wrote:

> Congrats Stamatis! Well deserved one 
>
>
>
> Thanks,
>
> Sankar
>
>
>
> *From:* Naveen Gangam 
> *Sent:* Saturday, January 14, 2023 12:03 AM
> *To:* dev ; u...@hive.apache.org
> *Cc:* zabe...@apache.org
> *Subject:* [EXTERNAL] [ANNOUNCE] New PMC Member: Stamatis Zampetakis
>
>
>
> Hello Hive Community,
>
> Apache Hive PMC is pleased to announce that Stamatis Zampetakis has
> accepted the Apache Hive PMC's invitation to become PMC Member, and is now
> our newest PMC member. Please join me in congratulating Stamatis !!!
>
>
>
> He has been an active member in the hive community across many aspects of
> the project. Many thanks to Stamatis for all the contributions he has made
> and looking forward to many more future contributions in the expanded role.
>
>
>
> Cheers,
>
> Naveen (on behalf of Hive PMC)
>


Re: Proposal: Revamp Apache Hive website.

2023-01-12 Thread Simhadri G
Hello Everyone,

Happy new year!

I am happy to announce that the new Apache Hive website[1] is finally up
and running.
It can be accessed here: https://hive.apache.org/

I would like to specially thank Stamatis, Ayush, Sai Heamanth for reviewing
the PR. Without their help, the new website would not have reached
completion.
I would also like to thank Owen O'Malley, Daniel Gruno,  Alessandro
Solimando and Pau Tallada for the help and feedback received during the
process.

Thank you,
Simhadri G

[1] https://hive.apache.org/
[2] HIVE-26565: https://issues.apache.org/jira/browse/HIVE-26565
[3] INFRA-24077: https://issues.apache.org/jira/browse/INFRA-24077

On Mon, Jan 9, 2023 at 4:56 PM Stamatis Zampetakis 
wrote:

> Hi everyone,
>
> Simhadri has been working hard to modernize the Hive website (HIVE-26565)
> for the past few months and I am quite happy with the results.
>
> I reviewed the respective PR [1] and will commit the changes in 24h unless
> there are objections.
>
> Best,
> Stamatis
>
> [1] https://github.com/apache/hive-site/pull/2
>
> On Wed, Oct 5, 2022 at 8:46 PM Simhadri G  wrote:
>
>> Thanks for the feedback Stamatis !
>>
>>- I have updated the PR to include a README.md file with instructions
>>to build and view the site locally after making any new changes. This will
>>help us preview the changes locally before pushing the commit. (Docker is
>>not required here.)
>>
>>- Github pages was used to share the new website with the community
>>and it will most likely not be necessary later on.
>>
>>- Regarding the role of GitHub Actions (gh-pages.yml):
>>
>>   - Whenever a PR is merged to the main branch, a GitHub action is
>>   triggered.
>>   - The GitHub action installs Hugo and builds the site with the new
>>   changes. Once the build is successful, Hugo generates a set of
>>   static files, and these files are automatically merged to the
>>   hive-site/asf-site branch by the GitHub Actions bot.
>>   - From here, to publish hive-site/asf-site to the project website
>>   sub-domain (hive.apache.org), we need to set up a configuration
>>   block called publish in the .asf.yaml file (
>>   https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Publishingabranchtoyourprojectwebsite).
>>   - We will need help from Apache Infra (gmcdonald
>>   <https://github.com/apache/hive-site/commits?author=gmcdonald> or
>>   Humbedooh
>>   <https://github.com/apache/hive-site/commits?author=Humbedooh>) to
>>   make sure that we have set this up correctly.
>>
>>- I agree with your suggestion to keep the changes around the
>>revamp as minimal as possible and not mix the content update with the
>>framework change. In this case, we can make the other changes
>>incrementally at a later stage.
>>
>>
>> Thanks!
>> Simhadri G
>>
>> On Wed, Oct 5, 2022 at 3:41 PM Stamatis Zampetakis 
>> wrote:
>>
>>> Thanks for staying on top of this Simhadri.
>>>
>>> I will try to help reviewing the PR once I get some time.
>>>
>>> What is not yet clear to me from this discussion or by looking at the PR
>>> is the workflow for making a change appear on the web (
>>> https://hive.apache.org/). Having a README which clearly states what
>>> needs to be done is a must.
>>>
>>> I also think it is quite important to have instructions and possibly
>>> docker images for someone to be able to test how the changes look locally
>>> before committing a change to the repo.
>>>
>>> Another point that needs clarification is the role of github pages. I am
>>> not sure why it is necessary at the moment and what exactly is the plan
>>> going forward. If I understand well, currently it is used to preview the
>>> changes but from my perspective we shouldn't need to commit something to
>>> the repo to understand if something breaks or not; preview should happen
>>> locally.
>>>
>>> I would suggest to keep the changes around the revamp as minimal as
>>> possible and not mix the content update with the framework change. As
>>> usual, smaller changes are easier to review and merge. It is definitely
>>> worth updating and improving the content but let's do it incrementally so
>>> that changes can get merged faster.
>>>
>>> The list of c

Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-19 Thread Simhadri G
Congratulations Ayush

On Tue, 20 Dec 2022, 06:42 Naveen Gangam,  wrote:

> Hello Hive Community,
> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
> Apache Hive PMC's invitation to become PMC Member, and is now our newest
> PMC member. Many thanks to Ayush for all the contributions he has made and
> looking forward to many more future contributions in the expanded role.
>
> Please join me in congratulating Ayush !!!
>
> Cheers,
> Naveen (on behalf of Hive PMC)
>
>


Re: Proposal: Revamp Apache Hive website.

2022-10-05 Thread Simhadri G
Thanks for the feedback Stamatis !

   - I have updated the PR to include a README.md file with instructions to
   build and view the site locally after making any new changes. This will
   help us preview the changes locally before pushing the commit. (Docker is
   not required here.)

   - Github pages was used to share the new website with the community and
   it will most likely not be necessary later on.

   - Regarding the role of GitHub Actions (gh-pages.yml):

      - Whenever a PR is merged to the main branch, a GitHub action is
      triggered.
      - The GitHub action installs Hugo and builds the site with the new
      changes. Once the build is successful, Hugo generates a set of
      static files, and these files are automatically merged to the
      hive-site/asf-site branch by the GitHub Actions bot.
      - From here, to publish hive-site/asf-site to the project website
      sub-domain (hive.apache.org), we need to set up a configuration
      block called publish in the .asf.yaml file (
      https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Publishingabranchtoyourprojectwebsite).
      - We will need help from Apache Infra (gmcdonald
      <https://github.com/apache/hive-site/commits?author=gmcdonald> or
      Humbedooh
      <https://github.com/apache/hive-site/commits?author=Humbedooh>) to
      make sure that we have set this up correctly.

   - I agree with your suggestion to keep the changes around the revamp
   as minimal as possible and not mix the content update with the framework
   change. In this case, we can make the other changes incrementally at a
   later stage.


Thanks!
Simhadri G

On Wed, Oct 5, 2022 at 3:41 PM Stamatis Zampetakis 
wrote:

> Thanks for staying on top of this Simhadri.
>
> I will try to help reviewing the PR once I get some time.
>
> What is not yet clear to me from this discussion or by looking at the PR
> is the workflow for making a change appear on the web (
> https://hive.apache.org/). Having a README which clearly states what
> needs to be done is a must.
>
> I also think it is quite important to have instructions and possibly
> docker images for someone to be able to test how the changes look locally
> before committing a change to the repo.
>
> Another point that needs clarification is the role of github pages. I am
> not sure why it is necessary at the moment and what exactly is the plan
> going forward. If I understand well, currently it is used to preview the
> changes but from my perspective we shouldn't need to commit something to
> the repo to understand if something breaks or not; preview should happen
> locally.
>
> I would suggest to keep the changes around the revamp as minimal as
> possible and not mix the content update with the framework change. As
> usual, smaller changes are easier to review and merge. It is definitely
> worth updating and improving the content but let's do it incrementally so
> that changes can get merged faster.
>
> The list of committers and PMC members for Hive can be found in the apache
> phonebook [1]. The list can easily get outdated so maybe we can consider
> adding links to [1] and/or github and other places instead of duplicating
> the content. Anyways, let's first deal with the revamp and discuss content
> changes later in separate JIRAs/PRs.
>
> Best,
> Stamatis
>
> [1] https://home.apache.org/phonebook.html?project=hive
>
> On Sun, Oct 2, 2022 at 2:41 AM Simhadri G  wrote:
>
>> Hello Everyone,
>>
>> I have raised the PR for the revamped Hive Website here:
>>  https://github.com/apache/hive-site/pull/2
>>
>> Could someone please help review this PR?
>>
>> Until the PR is merged, you can find the updated website here. Please
>> have a look; any feedback is most welcome :)
>> https://simhadri-g.github.io/hive-site/
>>
>> A few other things to note:
>>
>>- We will need help from someone who has write access to the hive-site
>>repo to update the GitHub workflow once the PR is merged.
>>- One more important question: while moving the .md file to the new
>>website, I came across this page (https://hive.apache.org/people.html),
>>which lists the current PMC members and committers of Hive. I noticed
>>that this list is not up to date; a lot of people seem to be missing
>>from it. Where can I find an up-to-date list of committers and PMC
>>members that I can refer to when updating the page?
>>- Lastly, I plan to add a few more sections to the homepage soon, one
>>of the sections I have in mind is to add an overview of all the apache
>>projects that use or integrate with apache hive... If there are an

Re: Proposal: Revamp Apache Hive website.

2022-10-01 Thread Simhadri G
Hello Everyone,

I have raised the PR for the revamped Hive Website here:
 https://github.com/apache/hive-site/pull/2

Could someone please help review this PR?

Until the PR is merged, you can find the updated website here. Please have
a look; any feedback is most welcome :)
https://simhadri-g.github.io/hive-site/

A few other things to note:

   - We will need help from someone who has write access to the hive-site
   repo to update the GitHub workflow once the PR is merged.
   - One more important question: while moving the .md file to the new
   website, I came across this page (https://hive.apache.org/people.html),
   which lists the current PMC members and committers of Hive. I noticed
   that this list is not up to date; a lot of people seem to be missing
   from it. Where can I find an up-to-date list of committers and PMC
   members that I can refer to when updating the page?
   - Lastly, I plan to add a few more sections to the homepage soon. One of
   the sections I have in mind is an overview of all the Apache projects
   that use or integrate with Apache Hive. If there are any other
   suggestions in addition to this, please let me know.


Thanks!
Simhadri G



On Sat, Sep 24, 2022 at 7:03 AM Simhadri G  wrote:

> Thanks everyone,
>
>  I will begin with creating the PR and share the link in this thread soon.
>
> Thanks
> Simhadri G
>
> On Sat, 24 Sep 2022, 04:52 Ayush Saxena,  wrote:
>
>> Thanx Everyone,
>> Almost a week has passed and we don't seem to have any objections to
>> starting the revamp task with the hive-site repo for now.
>>
>> Other things as mentioned can be followed up and we can try to ask folks
>> to establish a PMC consensus if the need be for the further migration tasks.
>>
>> Simhadri, would be good to create a Jira and link the PR and drop the
>> link here in the thread as well, so as people interested can drop
>> suggestions regarding the design and content of the website over there, for
>> anything else we can always come back here if we are blocked on something,
>> or if something more needs to be done in this context.
>>
>> -Ayush
>>
>> On 21-Sep-2022, at 6:35 PM, Stamatis Zampetakis 
>> wrote:
>>
>>
>> 
>> The javadocs are currently in svn and they can remain there for the
>> moment. Eventually, they could be moved to a hive-site repository and for
>> sure we don't want them in the main hive repo. I don't see an immediate
>> need to change the place where javadocs are stored but if needed we can
>> raise a JIRA ticket and continue the discussion there. It's not a good idea
>> to discuss under a closed issue/PR.
>>
>> The hive-site repo is always gonna be the place for storing the generated
>> website (html files etc). When you talk about moving back to the hive repo
>> I guess you refer to the source/markdown files. The decision to change the
>> process of publishing the website will probably require a PMC vote with
>> lazy consensus.
>>
>> I agree that we can start by updating the current setup. Then we can kick
>> off the discussion about moving the website sources to hive repo and start
>> publishing from there. I don't know if we need to move the javadocs, so we
>> can postpone this discussion till we hit an obstacle.
>>
>> Best,
>> Stamatis
>>
>> On Mon, Sep 19, 2022 at 12:01 PM Simhadri G 
>> wrote:
>>
>>> Thanks Owen, Stamatis, Ayush and Alessandro for the feedback.
>>>
>>>- Regarding the javadocs and the discussion about automatically
>>>building and deploying github-pages in the previous PR thread [1]
>>><https://github.com/apache/hive/pull/1410>:
>>>
>>>   - Apache Iceberg-docs ([2]
>>>   <https://iceberg.apache.org/javadoc/latest/>) has recently set up
>>>   a GitHub workflow ([3]
>>>   <https://github.com/apache/iceberg-docs/actions/runs/3062679467/jobs/4943928455>)
>>>   to publish the javadocs from a given javadocs dir ([4]
>>>   <https://github.com/apache/iceberg-docs/tree/main/javadoc>). I
>>>   think we can set up the same workflow for Hive javadocs.
>>>   - As Ayush and Stamatis have mentioned, I think over the past 2
>>>   years, Apache Infra has added support for GitHub Actions, and we can
>>>   confirm that from the Apache Iceberg/Calcite docs that are
>>>   currently using it.
>>>   - But I am not sure which branch or directory we will need to
>>>   put the Hive javadoc files in. This needs more discussion and we can
>>>   follow up on this(

Re: Proposal: Revamp Apache Hive website.

2022-09-23 Thread Simhadri G
Thanks everyone,

 I will begin with creating the PR and share the link in this thread soon.

Thanks
Simhadri G

On Sat, 24 Sep 2022, 04:52 Ayush Saxena,  wrote:

> Thanx Everyone,
> Almost a week has passed and we don't seem to have any objections to
> starting the revamp task with the hive-site repo for now.
>
> Other things as mentioned can be followed up and we can try to ask folks
> to establish a PMC consensus if the need be for the further migration tasks.
>
> Simhadri, would be good to create a Jira and link the PR and drop the link
> here in the thread as well, so as people interested can drop suggestions
> regarding the design and content of the website over there, for anything
> else we can always come back here if we are blocked on something, or if
> something more needs to be done in this context.
>
> -Ayush
>
> On 21-Sep-2022, at 6:35 PM, Stamatis Zampetakis  wrote:
>
>
> 
> The javadocs are currently in svn and they can remain there for the
> moment. Eventually, they could be moved to a hive-site repository and for
> sure we don't want them in the main hive repo. I don't see an immediate
> need to change the place where javadocs are stored but if needed we can
> raise a JIRA ticket and continue the discussion there. It's not a good idea
> to discuss under a closed issue/PR.
>
> The hive-site repo is always gonna be the place for storing the generated
> website (html files etc). When you talk about moving back to the hive repo
> I guess you refer to the source/markdown files. The decision to change the
> process of publishing the website will probably require a PMC vote with
> lazy consensus.
>
> I agree that we can start by updating the current setup. Then we can kick
> off the discussion about moving the website sources to hive repo and start
> publishing from there. I don't know if we need to move the javadocs, so we
> can postpone this discussion till we hit an obstacle.
>
> Best,
> Stamatis
>
> On Mon, Sep 19, 2022 at 12:01 PM Simhadri G  wrote:
>
>> Thanks Owen, Stamatis, Ayush and Alessandro for the feedback.
>>
>>- Regarding the javadocs and the discussion about automatically
>>building and deploying github-pages in the previous PR thread [1]
>><https://github.com/apache/hive/pull/1410>:
>>
>>   - Apache Iceberg-docs ([2]
>>   <https://iceberg.apache.org/javadoc/latest/>) has recently set up
>>   a GitHub workflow ([3]
>>   <https://github.com/apache/iceberg-docs/actions/runs/3062679467/jobs/4943928455>)
>>   to publish the javadocs from a given javadocs dir ([4]
>>   <https://github.com/apache/iceberg-docs/tree/main/javadoc>). I
>>   think we can set up the same workflow for Hive javadocs.
>>   - As Ayush and Stamatis have mentioned, I think over the past 2
>>   years, Apache Infra has added support for GitHub Actions, and we can
>>   confirm that from the Apache Iceberg/Calcite docs that are
>>   currently using it.
>>   - But I am not sure which branch or directory we will need to
>>   put the Hive javadoc files in. This needs more discussion and we can
>>   follow up on this ([5]
>>   <https://github.com/apache/hive/pull/1410#issuecomment-680111530>).
>>
>>- I am not aware of the procedure or the approvals we need to
>>move from the hive-site repo back to the main repository. We will need
>>help with this.
>>
>>- I was able to set up the GitHub action on the POC repo:
>>https://github.com/simhadri-g/hive-site/tree/new-site
>>
>>   - Any changes to this repo/new-site will automatically be reflected
>>   here once the GitHub workflow completes:
>>   https://simhadri-g.github.io/hive-site/
>>
>>- Considering the feedback, I think we can plan to do this in 3 phases:
>>for the first cut, I would like to update the website in the present
>>setup, followed by moving the javadocs to the hive-site repo; as for the
>>third phase, we can work on migrating from hive-site to the hive repo.
>>
>>- If everyone agrees, can we please go ahead with the first phase?
>>
>>
>> [1]https://github.com/apache/hive/pull/1410,
>> [2]https://iceberg.apache.org/javadoc/latest/
>> [3]
>> https://github.com/apache/iceberg-docs/actions/runs/3062679467/jobs/4943928455
>> [4]https://github.com/apache/iceberg-docs/tree/main/javadoc
>> [5]https://github.com/apache/hive/pull/1410#issuecomment-680111530
>> [6] https://github.com/apache/hive/pull/1410#issuecomment-680102815
>>
>>
>> Thanks!
>> Simhadri G
>>

Re: Proposal: Revamp Apache Hive website.

2022-09-19 Thread Simhadri G
Thanks Owen, Stamatis, Ayush and Alessandro for the feedback.

   - Regarding the javadocs and the discussion about automatically
   building and deploying github-pages in the previous PR thread [1]
   <https://github.com/apache/hive/pull/1410>:

      - Apache Iceberg-docs ([2]
      <https://iceberg.apache.org/javadoc/latest/>) has recently set up
      a GitHub workflow ([3]
      <https://github.com/apache/iceberg-docs/actions/runs/3062679467/jobs/4943928455>)
      to publish the javadocs from a given javadocs dir ([4]
      <https://github.com/apache/iceberg-docs/tree/main/javadoc>). I think
      we can set up the same workflow for Hive javadocs.
      - As Ayush and Stamatis have mentioned, I think over the past 2
      years, Apache Infra has added support for GitHub Actions, and we can
      confirm that from the Apache Iceberg/Calcite docs that are currently
      using it.
      - But I am not sure which branch or directory we will need to put
      the Hive javadoc files in. This needs more discussion and we can
      follow up on this ([5]
      <https://github.com/apache/hive/pull/1410#issuecomment-680111530>).

   - I am not aware of the procedure or the approvals we need to move
   from the hive-site repo back to the main repository. We will need help
   with this.

   - I was able to set up the GitHub action on the POC repo:
   https://github.com/simhadri-g/hive-site/tree/new-site

      - Any changes to this repo/new-site will automatically be reflected
      here once the GitHub workflow completes:
      https://simhadri-g.github.io/hive-site/

   - Considering the feedback, I think we can plan to do this in 3 phases:
   for the first cut, I would like to update the website in the present
   setup, followed by moving the javadocs to the hive-site repo; as for the
   third phase, we can work on migrating from hive-site to the hive repo.

   - If everyone agrees, can we please go ahead with the first phase?


[1] https://github.com/apache/hive/pull/1410
[2] https://iceberg.apache.org/javadoc/latest/
[3] https://github.com/apache/iceberg-docs/actions/runs/3062679467/jobs/4943928455
[4] https://github.com/apache/iceberg-docs/tree/main/javadoc
[5] https://github.com/apache/hive/pull/1410#issuecomment-680111530
[6] https://github.com/apache/hive/pull/1410#issuecomment-680102815


Thanks!
Simhadri G

On Mon, Sep 19, 2022 at 1:50 PM Alessandro Solimando <
alessandro.solima...@gmail.com> wrote:

> Hi everyone,
> thanks Simhadri for pushing this forward.
>
> I like the look and feel of the new website, and I agree with Stamatis
> that having the website sources in the Hive repo, and automatically
> publishing the site upon commits would be very beneficial.
>
> Best regards,
> Alessandro
>
> On Thu, 15 Sept 2022 at 23:11, Stamatis Zampetakis 
> wrote:
>
>> Hi all,
>>
>> It's great to see some effort in improving the website. The POC from
>> Simhadri looks really cool; I didn't check the content but I love the look
>> and feel.
>>
>> Now regarding the current process for modifying and updating the website
>> there is some info in this relatively recent thread [1].
>>
>> Moving forward, I would really like to have the source code of the
>> website (markdown etc) in the main repo of the project [2], and use GitHub
>> actions to automatically build and push the content to the site repo [3]
>> per commit basis.
>> This workflow is used in Apache Calcite and I find it extremely
>> convenient.
>>
>> Best,
>> Stamatis
>>
>> [1] https://lists.apache.org/thread/4b6x4d6z4tgnv4mo0ycg30y4dlt0msbd
>> [2] https://github.com/apache/hive
>> [3] https://github.com/apache/hive-site
>>
>> On Thu, Sep 15, 2022 at 10:50 PM Ayush Saxena  wrote:
>>
>>> Owen,
>>> I am not sure if I am catching you right, but the repository for the
>>> website has changed. We no longer use our main *hive.git* repository
>>> for the website; we are using the *hive-site* repository instead.
>>> The migration happened in January this year, I suppose.
>>>
>>> You can check the set of commits here from gmcdonald
>>> <https://github.com/apache/hive-site/commits?author=gmcdonald> and
>>> Humbedooh <https://github.com/apache/hive-site/commits?author=Humbedooh>:
>>> https://github.com/apache/hive-site/commits/main
>>>
>>> Now whatever you push to the main branch of hive-site (
>>> https://github.com/apache/hive-site) gets published on the *asf-site*
>>> branch by the buildbot (
>>> https://github.com/apache/hive-site/commits/asf-site).
>>>
>>> Simhadri's changes will be directed to the main branch of the hive-site
>>> repo and they will get auto published on the asf-

[jira] [Created] (HIVE-26429) Set default value of hive.txn.xlock.ctas to true and update lineage info for CTAS queries.

2022-07-26 Thread Simhadri G (Jira)
Simhadri G created HIVE-26429:
-

 Summary: Set default value of hive.txn.xlock.ctas to true and 
update lineage info for CTAS queries.
 Key: HIVE-26429
 URL: https://issues.apache.org/jira/browse/HIVE-26429
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26424) When decimal type has overflowed the specified precision it must throw an error/warning instead of succeeding with NULL entries

2022-07-22 Thread Simhadri G (Jira)
Simhadri G created HIVE-26424:
-

 Summary: When decimal type has overflowed the specified precision 
it must throw an error/warning instead of succeeding with NULL entries
 Key: HIVE-26424
 URL: https://issues.apache.org/jira/browse/HIVE-26424
 Project: Hive
  Issue Type: Bug
Reporter: Simhadri G


When the decimal type has overflowed the specified precision, it results in 
null entries as seen below:
{code:java}
0: jdbc:hive2://localhost:10001/> select cast(48932.19 AS DECIMAL(6,6));
+---+
|  _c0  |
+---+
| NULL  |
+---+
1 row selected (0.178 seconds){code}
 

This can be a significant issue when inserting a large amount of data from one 
table to another. This can result in entire columns having NULL entries, as 
seen below

 
{code:java}


0: jdbc:hive2://localhost:10001/> select * from t2;
+-----------+
|  t2.num   |
+-----------+
| 28367.81  |
| 49632.19  |
| NULL      |
| 28367.81  |
| 49632.19  |
| NULL      |
+-----------+
6 rows selected (0.202 seconds)

0: jdbc:hive2://localhost:10001/> create table t3(num decimal(20,10));

0: jdbc:hive2://localhost:10001/> insert into t3 select cast(t2.num as 
decimal(5,2)) from t2;
12 rows affected (40.97 seconds)


0: jdbc:hive2://localhost:10001/> select * from t3;
+-+
| t3.num  |
+-+
| NULL    |
| NULL    |
| NULL    |
| NULL    |
| NULL    |
| NULL    |
+-+
6 rows selected (0.205 seconds){code}
I think it would be better to throw an error like the one below instead of 
succeeding, similar to MySQL.
{code:java}
ERROR : Out of range value for column 'cast(num as decimal(5,2))' {code}
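
To make the overflow rule concrete, here is a minimal Python sketch of the two behaviours discussed in this issue. It is not Hive code: the function name `cast_decimal`, the `strict` flag, and the use of half-up rounding are assumptions made for illustration. The only rule it relies on is that DECIMAL(p, s) leaves p - s digits to the left of the decimal point, so DECIMAL(6,6) leaves none and 48932.19 cannot fit.

```python
from decimal import Decimal, ROUND_HALF_UP


def cast_decimal(value, precision, scale, strict=False):
    """Cast value to DECIMAL(precision, scale) (hypothetical helper).

    Returns None on overflow (the lenient behaviour reported above),
    or raises ValueError when strict=True (the behaviour proposed here).
    """
    # Round to `scale` fractional digits first, e.g. scale=2 -> quantum 0.01.
    quantum = Decimal(1).scaleb(-scale)
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)

    # Count digits left of the decimal point; at most precision - scale fit.
    digits = rounded.as_tuple().digits
    integer_digits = max(len(digits) - scale, 0)
    if integer_digits > precision - scale:
        if strict:
            raise ValueError(
                f"Out of range value for DECIMAL({precision},{scale}): {value}"
            )
        return None
    return rounded
```

With `strict=False` the sketch mirrors the silent NULL shown in the queries above; with `strict=True` it fails fast, which is the behaviour this issue asks for.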
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-20 Thread Simhadri G (Jira)
Simhadri G created HIVE-26244:
-

 Summary: Implementing locking for concurrent ctas
 Key: HIVE-26244
 URL: https://issues.apache.org/jira/browse/HIVE-26244
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HIVE-26215) Expose the MIN_HISTORY_LEVEL table through Hive sys database

2022-05-09 Thread Simhadri G (Jira)
Simhadri G created HIVE-26215:
-

 Summary:  Expose the MIN_HISTORY_LEVEL table  through Hive sys 
database 
 Key: HIVE-26215
 URL: https://issues.apache.org/jira/browse/HIVE-26215
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G


While we still (partially) use MIN_HISTORY_LEVEL for the cleaner, we should 
expose it as a sys table so we can see what might be blocking the Cleaner 
thread.

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HIVE-26009) Determine number of buckets for implicitly bucketed ACIDv2 tables

2022-03-07 Thread Simhadri G (Jira)
Simhadri G created HIVE-26009:
-

 Summary: Determine number of buckets for implicitly bucketed 
ACIDv2 tables 
 Key: HIVE-26009
 URL: https://issues.apache.org/jira/browse/HIVE-26009
 Project: Hive
  Issue Type: Improvement
Reporter: Simhadri G
Assignee: Simhadri G


Hive tries to set number of reducers equal to number of buckets here: 
[https://github.com/apache/hive/blob/9857c4e584384f7b0a49c34bc2bdf876c2ea1503/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L6958]
 

 

The numberOfBuckets for implicitly bucketed tables is set to -1 by default. 
When this is the case, it is left to Hive to estimate the number of reducers 
required for the job, based on the job input and configuration parameters.

[https://github.com/apache/hive/blob/9857c4e584384f7b0a49c34bc2bdf876c2ea1503/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3369]

 

This estimate is not optimal in all cases. In the worst case, it can result in 
a single reducer being launched, which can lead to a significant performance 
bottleneck.

 

Ideally, the number of reducers launched should equal the number of buckets, 
which is the case for explicitly bucketed tables.
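
The decision described above can be sketched roughly as follows. This is a simplified Python illustration, not the actual logic in SemanticAnalyzer or Utilities: the function name and all parameter names are hypothetical, and the size-based estimate is reduced to a ceiling division capped by a maximum.

```python
def pick_reducer_count(num_buckets, input_bytes, bytes_per_reducer, max_reducers):
    """Choose the reducer count for a write into a bucketed table (sketch)."""
    if num_buckets > 0:
        # Explicitly bucketed table: one reducer per bucket.
        return num_buckets
    # Implicitly bucketed table (num_buckets == -1): size-based estimate.
    estimate = -(-input_bytes // bytes_per_reducer)  # ceiling division
    return max(1, min(max_reducers, estimate))
```

Under these assumptions, an input smaller than `bytes_per_reducer` always yields a single reducer for an implicitly bucketed table, which is the bottleneck this issue describes.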

 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25471) Clear entries in Privilege table - sys.tbl_col_privs when privilege synchroniser is disabled to avoid stale permission.

2021-08-20 Thread Simhadri G (Jira)
Simhadri G created HIVE-25471:
-

 Summary: Clear entries in Privilege table - sys.tbl_col_privs  
when privilege synchroniser is disabled to avoid stale permission.
 Key: HIVE-25471
 URL: https://issues.apache.org/jira/browse/HIVE-25471
 Project: Hive
  Issue Type: Task
Reporter: Simhadri G
Assignee: Simhadri G






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24497) Node heartbeats from LLAP Daemon to the client are not matching leading to timeout.

2020-12-07 Thread Simhadri G (Jira)
Simhadri G created HIVE-24497:
-

 Summary: Node heartbeats from LLAP Daemon to the client are not 
matching leading to timeout.
 Key: HIVE-24497
 URL: https://issues.apache.org/jira/browse/HIVE-24497
 Project: Hive
  Issue Type: Sub-task
Reporter: Simhadri G
Assignee: Simhadri G


A node heartbeat contains info about all the tasks that were submitted to that 
LLAP Daemon. In cloud deployments, the client is not able to match these 
heartbeats due to differences in hostname and port.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23361) Optimising privilege synchroniser

2020-05-04 Thread Simhadri G (Jira)
Simhadri G created HIVE-23361:
-

 Summary: Optimising privilege synchroniser
 Key: HIVE-23361
 URL: https://issues.apache.org/jira/browse/HIVE-23361
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Simhadri G


The privilege synchronizer pulls the list of databases, tables and columns from 
the Hive Metastore. For each of these objects it fetches the privilege 
information and invokes the HMS API to refresh the privilege information in HMS. 
This patch stores the privilege information as a bit string. This is done to 
reduce the size of the tbl_col_privs table in the metastore.
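
As an illustration of the idea only, a bit-string encoding could look like the Python sketch below. The privilege names, their order, and the function name are assumptions made for this example; the actual layout used by the patch is not described here.

```python
# Hypothetical fixed ordering: bit i of the string corresponds to PRIVILEGES[i].
PRIVILEGES = ("SELECT", "INSERT", "UPDATE", "DELETE")


def encode_privileges(granted):
    """Pack a set of granted privilege names into a compact bit string."""
    return "".join("1" if name in granted else "0" for name in PRIVILEGES)
```

One row holding a short string such as "1010" can then stand in for several per-privilege rows, which is where the size reduction would come from.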



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23301) Optimising privilege synchroniser: UDF for updating privileges

2020-04-26 Thread Simhadri G (Jira)
Simhadri G created HIVE-23301:
-

 Summary: Optimising privilege synchroniser: UDF for updating 
privileges
 Key: HIVE-23301
 URL: https://issues.apache.org/jira/browse/HIVE-23301
 Project: Hive
  Issue Type: Improvement
  Components: Metastore, UDF
Affects Versions: 3.1.1
Reporter: Simhadri G
 Attachments: UDFSplitMapPrivs.patch

The privilege synchronizer pulls the list of databases, tables and columns from 
the Hive Metastore. For each of these objects it fetches the privilege 
information and invokes the HMS API to refresh the privilege information in HMS. 
The current UDF maps a bit string to privileges, based on whether each 
privilege is granted or not.
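
A rough Python sketch of such a mapping is shown below. It does not reproduce the attached UDFSplitMapPrivs.patch; the privilege names, their order, and the function name are assumptions made for illustration.

```python
# Hypothetical fixed ordering: bit i of the string corresponds to PRIVILEGES[i].
PRIVILEGES = ("SELECT", "INSERT", "UPDATE", "DELETE")


def decode_privileges(bits):
    """Return the privilege names whose bit is set to '1' in the bit string."""
    if len(bits) != len(PRIVILEGES):
        raise ValueError(f"expected {len(PRIVILEGES)} bits, got {len(bits)}")
    return [name for name, bit in zip(PRIVILEGES, bits) if bit == "1"]
```

For example, under this assumed ordering the string "1010" maps to SELECT and UPDATE.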



--
This message was sent by Atlassian Jira
(v8.3.4#803005)