[jira] [Created] (HIVE-26939) Hive LLAP Application Master fails to come up with Hadoop 3.3.4

2023-01-12 Thread Aman Raj (Jira)
Aman Raj created HIVE-26939:
---

 Summary: Hive LLAP Application Master fails to come up with Hadoop 3.3.4
 Key: HIVE-26939
 URL: https://issues.apache.org/jira/browse/HIVE-26939
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Aman Raj
Assignee: Aman Raj


When Hive built from the current OSS master branch tries to bring up the LLAP
Application Master, it fails with the following error:
{code:java}
Executing the launch command
INFO client.ServiceClient: Loading service definition from local FS: /var/lib/ambari-agent/tmp/llap-yarn-service_2023-01-10_07-56-46/Yarnfile
ERROR utils.JsonSerDeser: Exception while parsing json input stream
com.fasterxml.jackson.databind.exc.InvalidFormatException: Cannot deserialize value of type `org.apache.hadoop.yarn.service.api.records.PlacementScope` from String "NODE": not one of the values accepted for Enum class: [node, rack]
 at [Source: (org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream); line: 31, column: 22] (through reference chain: org.apache.hadoop.yarn.service.api.records.Service["components"]->java.util.ArrayList[0]->org.apache.hadoop.yarn.service.api.records.Component["placement_policy"]->org.apache.hadoop.yarn.service.api.records.PlacementPolicy["constraints"]->java.util.ArrayList[0]->org.apache.hadoop.yarn.service.api.records.PlacementConstraint["scope"])
	at com.fasterxml.jackson.databind.exc.InvalidFormatException.from(InvalidFormatException.java:67) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.DeserializationContext.weirdStringException(DeserializationContext.java:1851) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.DeserializationContext.handleWeirdStringValue(DeserializationContext.java:1079) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.EnumDeserializer._deserializeAltString(EnumDeserializer.java:339) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.EnumDeserializer._fromString(EnumDeserializer.java:214) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.EnumDeserializer.deserialize(EnumDeserializer.java:188) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:355) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:244) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:28) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:355) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:244) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:28) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)

Re: Proposal: Revamp Apache Hive website.

2023-01-12 Thread Chris Nauroth
That looks great! Thank you so much for your efforts, Simhadri!

Chris Nauroth


On Thu, Jan 12, 2023 at 2:52 AM Simhadri G wrote:

> Hello Everyone,
>
> Happy new year!
>
> I am happy to announce that the new Apache Hive website[1] is finally up
> and running.
> It can be accessed here: https://hive.apache.org/
>
> I would especially like to thank Stamatis, Ayush, and Sai Heamanth for
> reviewing the PR. Without their help, the new website would not have
> reached completion.
> I would also like to thank Owen O'Malley, Daniel Gruno, Alessandro
> Solimando, and Pau Tallada for the help and feedback received during the
> process.
>
> Thank you,
> Simhadri G
>
> [1] https://hive.apache.org/
> [2] HIVE-26565: https://issues.apache.org/jira/browse/HIVE-26565
> [3] INFRA-24077: https://issues.apache.org/jira/browse/INFRA-24077
>
> On Mon, Jan 9, 2023 at 4:56 PM Stamatis Zampetakis wrote:
>
>> Hi everyone,
>>
>> Simhadri has been working hard to modernize the Hive website (HIVE-26565)
>> for the past few months and I am quite happy with the results.
>>
>> I reviewed the respective PR [1] and will commit the changes in 24h
>> unless there are objections.
>>
>> Best,
>> Stamatis
>>
>> [1] https://github.com/apache/hive-site/pull/2
>>
>> On Wed, Oct 5, 2022 at 8:46 PM Simhadri G wrote:
>>
>>> Thanks for the feedback, Stamatis!
>>>
>>> - I have updated the PR to include a README.md file with
>>>   instructions to build and view the site locally after making any new
>>>   changes. This will help us preview the changes locally before pushing
>>>   the commit. (Docker is not required here.)
>>>
>>> - GitHub Pages was used to share the new website with the community,
>>>   and it will most likely not be necessary later on.
>>>
>>> - Regarding the role of GitHub Actions (gh-pages.yml):
>>>
>>>   - Whenever a PR is merged to the main branch, a GitHub Action is
>>>     triggered.
>>>   - The GitHub Action installs Hugo and builds the site with the new
>>>     changes. Once the build succeeds, the static files that Hugo
>>>     generates are automatically merged to the hive-site/asf-site
>>>     branch by the GitHub Actions bot.
>>>   - From here, to publish hive-site/asf-site to the project website
>>>     subdomain (hive.apache.org), we need to set up a configuration
>>>     block called publish in the .asf.yaml file (
>>>     https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Publishingabranchtoyourprojectwebsite).
>>>   - We will need help from Apache Infra (gmcdonald or Humbedooh) to
>>>     make sure that we have set this up correctly.
>>>
>>> - I agree with your suggestion to keep the changes around the
>>>   revamp as minimal as possible and not mix the content update with the
>>>   framework change. In this case, we can make the other changes
>>>   incrementally at a later stage.
>>>
>>>
>>> Thanks!
>>> Simhadri G
>>>
>>> On Wed, Oct 5, 2022 at 3:41 PM Stamatis Zampetakis wrote:
>>>
>>>> Thanks for staying on top of this, Simhadri.
>>>>
>>>> I will try to help review the PR once I get some time.
>>>>
>>>> What is not yet clear to me from this discussion or by looking at the
>>>> PR is the workflow for making a change appear on the web (
>>>> https://hive.apache.org/). Having a README which clearly states what
>>>> needs to be done is a must.
>>>>
>>>> I also think it is quite important to have instructions and possibly
>>>> docker images for someone to be able to test how the changes look locally
>>>> before committing a change to the repo.
>>>>
>>>> Another point that needs clarification is the role of github pages. I
>>>> am not sure why it is necessary at the moment and what exactly is the plan
>>>> going forward. If I understand well, currently it is used to preview the
>>>> changes but from my perspective we shouldn't need to commit something to
>>>> the repo to understand if something breaks or not; preview should happen
>>>> locally.
>>>>
>>>> I would suggest keeping the changes around the revamp as minimal as
>>>> possible and not mix the content update with the framework change. As
>>>> usual, smaller changes are easier to review and merge. It is definitely
>>>> worth updating and improving the content but let's do it incrementally so
>>>> that changes can get merged faster.
>>>>
>>>> The list of committers and PMC members for Hive can be found in the
>>>> apache phonebook [1]. The list can easily get outdated so maybe we can
>>>> consider adding links to [1] and/or github and other places instead of
>>>> duplicating the content. Anyways, let's first deal with the revamp and
>>>> discuss content changes

[jira] [Created] (HIVE-26938) Investigate SMB Map Join for FULL OUTER

2023-01-12 Thread John Sherman (Jira)
John Sherman created HIVE-26938:
---

 Summary: Investigate SMB Map Join for FULL OUTER
 Key: HIVE-26938
 URL: https://issues.apache.org/jira/browse/HIVE-26938
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: John Sherman
Assignee: John Sherman


HIVE-18908 added FULL OUTER Map Join support, but that work did not add support
for SMB Map Joins for FULL OUTER.

We should investigate whether we can safely support SMB Map Join for this
scenario and, if so, implement it.

The line linked below is where conversion currently gives up. If we modify it
to pass true as a second argument to getBigTableCandidates, enabling
isFullOuterJoinSupported, the conversion succeeds (but we still need to verify
that execution does the correct thing).

[https://github.com/apache/hive/blob/03ad025ada776c0d359124c6342615f1983c1a94/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L482]
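
To make the idea concrete, here is a rough, simplified sketch of how a boolean flag could gate big-table candidates for a two-table join. It is not Hive's code: the class BigTableCandidatesSketch, the JoinType enum, and the candidates method are hypothetical stand-ins, and the real getBigTableCandidates works on Hive's own join condition descriptors. The only behaviour taken from this issue is that FULL OUTER yields no candidate (so conversion gives up) unless the second argument is true.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified, hypothetical stand-in; NOT Hive's actual implementation. */
final class BigTableCandidatesSketch {
    private BigTableCandidatesSketch() {}

    /** Join types for a two-table join (positions 0 = left, 1 = right). */
    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    /** One-argument form: FULL OUTER support stays switched off. */
    static Set<Integer> candidates(JoinType type) {
        return candidates(type, false);
    }

    /**
     * Returns the table positions that may act as the streamed big table.
     * An empty set means the caller has to give up on the conversion.
     */
    static Set<Integer> candidates(JoinType type, boolean fullOuterSupported) {
        Set<Integer> result = new LinkedHashSet<>();
        switch (type) {
            case INNER:
                // Either side can be streamed.
                result.add(0);
                result.add(1);
                break;
            case LEFT_OUTER:
                // The preserved (left) side must be streamed.
                result.add(0);
                break;
            case RIGHT_OUTER:
                // The preserved (right) side must be streamed.
                result.add(1);
                break;
            case FULL_OUTER:
                // Only offered when the caller opts in via the flag.
                if (fullOuterSupported) {
                    result.add(0);
                    result.add(1);
                }
                break;
        }
        return result;
    }
}
```

In this sketch, candidates(JoinType.FULL_OUTER) returns an empty set, while candidates(JoinType.FULL_OUTER, true) returns both positions. Whether execution is then correct for SMB Map Join is exactly what this issue proposes to investigate.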



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Proposal: Revamp Apache Hive website.

2023-01-12 Thread Simhadri G
Hello Everyone,

Happy new year!

I am happy to announce that the new Apache Hive website[1] is finally up
and running.
It can be accessed here: https://hive.apache.org/

I would especially like to thank Stamatis, Ayush, and Sai Heamanth for reviewing
the PR. Without their help, the new website would not have reached
completion.
I would also like to thank Owen O'Malley, Daniel Gruno, Alessandro
Solimando, and Pau Tallada for the help and feedback received during the
process.

Thank you,
Simhadri G

[1] https://hive.apache.org/
[2] HIVE-26565: https://issues.apache.org/jira/browse/HIVE-26565
[3] INFRA-24077: https://issues.apache.org/jira/browse/INFRA-24077

On Mon, Jan 9, 2023 at 4:56 PM Stamatis Zampetakis wrote:

> Hi everyone,
>
> Simhadri has been working hard to modernize the Hive website (HIVE-26565)
> for the past few months and I am quite happy with the results.
>
> I reviewed the respective PR [1] and will commit the changes in 24h unless
> there are objections.
>
> Best,
> Stamatis
>
> [1] https://github.com/apache/hive-site/pull/2
>
> On Wed, Oct 5, 2022 at 8:46 PM Simhadri G wrote:
>
>> Thanks for the feedback, Stamatis!
>>
>> - I have updated the PR to include a README.md file with instructions
>>   to build and view the site locally after making any new changes. This
>>   will help us preview the changes locally before pushing the commit.
>>   (Docker is not required here.)
>>
>> - GitHub Pages was used to share the new website with the community,
>>   and it will most likely not be necessary later on.
>>
>> - Regarding the role of GitHub Actions (gh-pages.yml):
>>
>>   - Whenever a PR is merged to the main branch, a GitHub Action is
>>     triggered.
>>   - The GitHub Action installs Hugo and builds the site with the new
>>     changes. Once the build succeeds, the static files that Hugo
>>     generates are automatically merged to the hive-site/asf-site
>>     branch by the GitHub Actions bot.
>>   - From here, to publish hive-site/asf-site to the project website
>>     subdomain (hive.apache.org), we need to set up a configuration
>>     block called publish in the .asf.yaml file (
>>     https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Publishingabranchtoyourprojectwebsite).
>>   - We will need help from Apache Infra (gmcdonald or Humbedooh) to
>>     make sure that we have set this up correctly.
>>
>> - I agree with your suggestion to keep the changes around the
>>   revamp as minimal as possible and not mix the content update with the
>>   framework change. In this case, we can make the other changes
>>   incrementally at a later stage.
>>
>>
>> Thanks!
>> Simhadri G
>>
>> On Wed, Oct 5, 2022 at 3:41 PM Stamatis Zampetakis wrote:
>>
>>> Thanks for staying on top of this, Simhadri.
>>>
>>> I will try to help review the PR once I get some time.
>>>
>>> What is not yet clear to me from this discussion or by looking at the PR
>>> is the workflow for making a change appear on the web (
>>> https://hive.apache.org/). Having a README which clearly states what
>>> needs to be done is a must.
>>>
>>> I also think it is quite important to have instructions and possibly
>>> docker images for someone to be able to test how the changes look locally
>>> before committing a change to the repo.
>>>
>>> Another point that needs clarification is the role of github pages. I am
>>> not sure why it is necessary at the moment and what exactly is the plan
>>> going forward. If I understand well, currently it is used to preview the
>>> changes but from my perspective we shouldn't need to commit something to
>>> the repo to understand if something breaks or not; preview should happen
>>> locally.
>>>
>>> I would suggest keeping the changes around the revamp as minimal as
>>> possible and not mix the content update with the framework change. As
>>> usual, smaller changes are easier to review and merge. It is definitely
>>> worth updating and improving the content but let's do it incrementally so
>>> that changes can get merged faster.
>>>
>>> The list of committers and PMC members for Hive can be found in the
>>> apache phonebook [1]. The list can easily get outdated so maybe we can
>>> consider adding links to [1] and/or github and other places instead of
>>> duplicating the content. Anyways, let's first deal with the revamp and
>>> discuss content changes later in separate JIRAs/PRs.
>>>
>>> Best,
>>> Stamatis
>>>
>>> [1] https://home.apache.org/phonebook.html?project=hive
>>>
>>> On Sun, Oct 2, 2022 at 2:41 AM Simhadri G wrote:
>>>
>>>> Hello Everyone,
>>>>
>>>> I have raised the PR for the revamped Hive Website here:

[jira] [Created] (HIVE-26937) Batch events during incremental replication to avoid O.O.M

2023-01-12 Thread Rakshith C (Jira)
Rakshith C created HIVE-26937:
-

 Summary: Batch events during incremental replication to avoid O.O.M
 Key: HIVE-26937
 URL: https://issues.apache.org/jira/browse/HIVE-26937
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Rakshith C
Assignee: Rakshith C


 * Currently, Hive's incremental replication flow dumps all events read from
the notification log sequentially into the staging directory.
 * Repl Load loads all the event directories present in the staging directory
into a list and processes them.
 * This has caused out-of-memory (OOM) errors when the number of events is large.

Hence, this change introduces batching of events: Repl Dump dumps events in
batches, and Repl Load processes them batch by batch to avoid OOM.
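
A minimal sketch of the batching idea, assuming nothing about Hive's actual Repl Dump or Repl Load code: the class EventBatcher, the method processInBatches, and the batchSize parameter are hypothetical names introduced only for illustration. Events are consumed through an iterator and handed to a handler in fixed-size batches, so at most one batch is held in memory at a time rather than the complete list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative only; not Hive's replication code. */
final class EventBatcher {
    private EventBatcher() {}

    /**
     * Feeds events to the handler in batches of at most batchSize elements.
     * Only the batch currently being filled is kept in memory here.
     *
     * @return the number of batches handed to the handler
     */
    static <T> int processInBatches(Iterator<T> events, int batchSize,
                                    Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (events.hasNext()) {
            batch.add(events.next());
            if (batch.size() == batchSize) {
                handler.accept(batch);
                batches++;
                // Start a fresh list so the previous batch can be collected.
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            handler.accept(batch);
            batches++;
        }
        return batches;
    }
}
```

With seven events and a batch size of three, the handler in this sketch is called three times, with batches of three, three, and one event. The memory bound only holds if the iterator itself reads events lazily instead of materializing them all up front.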


