Re: Hive Custom UDF evaluate behavior when @UDFType is set

2018-05-14 Thread Jason Dere
It looks like there are 2 separate places where constant folding is occurring:


java.lang.Exception: Evaluate
    at com.protegrity.hive.udf.testUdf.evaluate(testUdf.java:38)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:145)
    at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:232)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:958)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1168)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:192)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:145)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10530)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10486)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3720)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3499)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:9011)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8966)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9812)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9705)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10141)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:286)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10152)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)


And


java.lang.Exception: Evaluate
    at com.protegrity.hive.udf.testUdf.evaluate(testUdf.java:38)
    at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.evaluateFunction(ConstantPropagateProcFactory.java:533)
    at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.foldExpr(ConstantPropagateProcFactory.java:238)
    at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory.access$000(ConstantPropagateProcFactory.java:92)
    at org.apache.hadoop.hive.ql.optimizer.ConstantPropagateProcFactory$ConstantPropagateSelectProc.process(ConstantPropagateProcFactory.java:735)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
    at org.apache.hadoop.hive.ql.optimizer.ConstantPropagate$ConstantPropagateWalker.walk(ConstantPropagate.java:147)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
    at org.apache.hadoop.hive.ql.optimizer.ConstantPropagate.transform(ConstantPropagate.java:117)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:182)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10207)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)



I don't think this is necessarily a bug; it is just how constants are
folded and propagated in Hive.

I'm actually surprised you did not hit this when running the query against
tables, unless the UDF was taking parameters based on the table's column
values (in which case there is no constant propagation).
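
To make the distinction between literal and column arguments concrete, here is a minimal, self-contained sketch. It is a toy model, not Hive's planner code: the class ConstantFoldingSketch, the Arg type, and the upper-casing evaluate() body are all hypothetical, and the rule it encodes (fold only when the function is marked deterministic and every argument is a literal) is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of compile-time constant folding; not Hive's actual planner code. */
public class ConstantFoldingSketch {

    /** Counts calls to the toy UDF body, standing in for the traces logged from evaluate(). */
    static int evaluateCalls = 0;

    /** An argument is either a literal known at compile time or a column resolved per row. */
    static final class Arg {
        final String literal; // null when the argument is a column reference
        final String column;  // null when the argument is a literal

        private Arg(String literal, String column) {
            this.literal = literal;
            this.column = column;
        }

        static Arg constant(String value) { return new Arg(value, null); }
        static Arg columnRef(String name) { return new Arg(null, name); }
        boolean isConstant() { return literal != null; }
    }

    /** Hypothetical single-argument UDF body: upper-cases its input. */
    static String evaluate(String input) {
        evaluateCalls++;
        return input.toUpperCase();
    }

    /**
     * Returns the folded literal when folding is possible, or null when the
     * call has to be deferred until rows are processed.
     */
    static String tryFold(boolean deterministic, List<Arg> args) {
        if (!deterministic) {
            return null; // assumption: non-deterministic functions are never folded
        }
        for (Arg arg : args) {
            if (!arg.isConstant()) {
                return null; // a column reference has no value at compile time
            }
        }
        return evaluate(args.get(0).literal);
    }

    public static void main(String[] ignored) {
        // All-literal call: evaluate() runs while the query is still being compiled.
        System.out.println(tryFold(true, Arrays.asList(Arg.constant("abc"))));
        // Column argument: nothing can be folded, so evaluate() is not called here.
        System.out.println(tryFold(true, Arrays.asList(Arg.columnRef("name"))));
        System.out.println("evaluate() calls at compile time: " + evaluateCalls);
    }
}
```

In this model only the all-literal call reaches evaluate() before any row is read, which matches the situation in the traces above where evaluate() is invoked from Driver.compile.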




From: PradeepKumar Yadav 
Sent: Sunday, May 13, 2018 11:44 PM
To: Jason Dere; user@hive.

org.apache.hadoop.hive.ql.metadata.HiveMetaStoreClientFactory

2018-05-14 Thread Elliot West
Hello,

I've been looking at Amazon's integration of their Glue service with Hive
in EMR and notice that they achieve this with:

   - An AWS Glue specific implementation of
   org.apache.hadoop.hive.ql.metadata.HiveMetaStoreClientFactory
   (com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory)
   - A configuration property to set the HMS client factory
   implementation: hive.metastore.client.factory.class
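
As a rough illustration of the mechanism in the list above (a factory implementation chosen by class name through a configuration key), here is a self-contained toy sketch. Only the property key comes from this message; the ClientFactory interface, both factory classes, and the reflective load() method are hypothetical and are not the EMR or Hive loading code.

```java
import java.util.Properties;

/** Toy model of a pluggable client factory chosen by a configuration key. */
public class ClientFactoryLookupSketch {

    /** Hypothetical stand-in for a metastore client factory interface. */
    public interface ClientFactory {
        String describe();
    }

    /** Hypothetical default, used when the property is not set. */
    public static class DefaultClientFactory implements ClientFactory {
        public String describe() { return "default metastore client"; }
    }

    /** Hypothetical alternative, playing the role of a vendor-specific factory. */
    public static class ExternalCatalogClientFactory implements ClientFactory {
        public String describe() { return "external catalog client"; }
    }

    /** Property key quoted in the message; everything else here is illustrative. */
    static final String FACTORY_KEY = "hive.metastore.client.factory.class";

    /** Instantiates the configured factory by class name, falling back to the default. */
    static ClientFactory load(Properties conf) {
        String className = conf.getProperty(FACTORY_KEY, DefaultClientFactory.class.getName());
        try {
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            return (ClientFactory) instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create factory " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(load(conf).describe());
        conf.setProperty(FACTORY_KEY, ExternalCatalogClientFactory.class.getName());
        System.out.println(load(conf).describe());
    }
}
```

The point of the pattern is that swapping the catalog backend needs only a configuration change, with no change to the calling code.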

However, despite searching on https://github.com/apache/hive, I cannot find
the o.a.h.h.q.m.HMSCF class, or the configuration key anywhere in vanilla
Apache Hive even though the packaging suggests it is not an Amazon addition
specific to EMR. I do find a similar class in
org.apache.hadoop.hive.ql.security.authorization.plugin however.

Can you tell me if this is an Amazon-specific construct, or are my
code-searching abilities failing me? If this has been added by Amazon to EMR
alone, are there any plans to add similar functionality back to the Apache
Hive project, and if so, when?

I'd be keen to see relevant GitHub links, JIRAs, etc.

Cheers,

Elliot.


Re: May 2018 Hive User Group Meeting

2018-05-14 Thread Sahil Takiar
Hello,

Yes, the meetup was recorded. We are in the process of getting it uploaded
to YouTube. Once it's publicly available, I will send out the link on this
email thread.

Thanks

--Sahil

On Mon, May 14, 2018 at 6:04 AM,  wrote:

> Hi,
>
>
>
> If you recorded the meeting, please share the link. I could not follow it
> online because of the schedule (I live in Spain).
>
>
>
> Kind Regards,
>
>
>
>
>
> *From:* Luis Figueroa [mailto:lef...@outlook.com]
> *Sent:* miércoles, 9 de mayo de 2018 18:01
> *To:* user@hive.apache.org
> *Cc:* d...@hive.apache.org
> *Subject:* Re: May 2018 Hive User Group Meeting
>
>
>
> Hey everyone,
>
>
>
> Was the meeting recorded by any chance?
>
> Luis
>
>
> On May 8, 2018, at 5:31 PM, Sahil Takiar  wrote:
>
> Hey Everyone,
>
>
>
> Almost time for the meetup! The live stream can be viewed on this link:
> https://live.lifesizecloud.com/extension/2000992219?token=067078ac-a8df-45bc-b84c-4b371ecbc719&name=&locale=en&meeting=Hive%20User%20Group%20Meetup
>
> The stream won't be live until the meetup starts.
>
> For those attending in person, there will be guest wifi:
>
> Login: HiveMeetup
> Password: ClouderaHive
>
>
>
> On Mon, May 7, 2018 at 12:48 PM, Sahil Takiar 
> wrote:
>
> Hey Everyone,
>
>
>
> The meetup is only a day away! Here is a link to all the abstracts we have
> compiled thus far. Several of you have asked about event streaming and
> recordings. The meetup will be both streamed live and recorded. We will
> post the links on this thread and on the meetup link tomorrow, closer to
> the start of the meetup.
>
>
>
> The meetup will be at Cloudera HQ - 395 Page Mill Rd. If you have any
> trouble getting into the building, feel free to post on the meetup link.
>
>
>
> Meetup Link: https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/
>
>
>
> On Wed, May 2, 2018 at 7:48 AM, Sahil Takiar 
> wrote:
>
> Hey Everyone,
>
>
>
> The agenda for the meetup has been set and I'm excited to say we have lots
> of interesting talks scheduled! Below is the final agenda; the full list of
> abstracts will be sent out soon. If you are planning to attend, please RSVP
> on the meetup link so we can get an accurate headcount of attendees
> (https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/).
>
>
> 6:30 - 7:00 PM Networking and Refreshments
>
> 7:00 PM - 8:20 PM Lightning Talks (10 min each) - 8 talks total
>
> · What's new in Hive 3.0.0 - Ashutosh Chauhan
>
> · Hive-on-Spark at Uber: Efficiency & Scale - Xuefu Zhang
>
> · Hive-on-S3 Performance: Past, Present, and Future - Sahil Takiar
>
> · Dali: Data Access Layer at LinkedIn - Adwait Tumbde
>
> · Parquet Vectorization in Hive - Vihang Karajgaonkar
>
> · ORC Column Level Encryption - Owen O’Malley
>
> · Running Hive at Scale @ Lyft - Sharanya Santhanam, Rohit Menon
>
> · Materialized Views in Hive - Jesus Camacho Rodriguez
>
> 8:30 PM - 9:00 PM Hive Metastore Panel
>
> · Moderator: Vihang Karajgaonkar
>
> · Participants:
>
> o Daniel Dai - Hive Metastore Caching
>
> o Alan Gates - Hive Metastore Separation
>
> o Rituparna Agrawal - Customer Use Cases & Pain Points of (Big) Metadata
>
> The Metastore panel will consist of a short presentation by each panelist
> followed by a Q&A session driven by the moderator.
>
>
>
> On Tue, Apr 24, 2018 at 2:53 PM, Sahil Takiar 
> wrote:
>
> We still have a few slots open for lightning talks, so if anyone is
> interested in giving a presentation, don't hesitate to reach out!
>
>
>
> If you are planning to attend the meetup, please RSVP on the Meetup link
> (https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/) so that
> we can get an accurate headcount for food.
>
>
>
> Thanks!
>
>
>
> --Sahil
>
>
>
> On Wed, Apr 11, 2018 at 5:08 PM, Sahil Takiar 
> wrote:
>
> Hi all,
>
> I'm happy to announce that the Hive community is organizing a Hive user
> group meeting in the Bay Area next month. The details can be found at
> https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/
>
>
> The format of this meetup will be slightly different from previous ones.
> There will be one hour dedicated to lightning talks, followed by a group
> discussion on the future of the Hive Metastore.
>
> We are inviting talk proposals from Hive users as well as developers at
> this time. Please contact either myself (takiar.sa...@gmail.com), Vihang
> Karajgaonkar (vih...@cloudera.com), or Peter Vary (pv...@cloudera.com)
> with proposals. We currently have 5 openings.
>
> Please let me know if you have any questions or suggestions.
>
> Thanks,
> Sahil
>
>
>
>
>
> --
>
> Sahil Takiar
>
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309
>

RE: May 2018 Hive User Group Meeting

2018-05-14 Thread roberto.tardio
Hi,

 

If you recorded the meeting, please share the link. I could not follow it
online because of the schedule (I live in Spain).

 

Kind Regards,

 

 

From: Luis Figueroa [mailto:lef...@outlook.com] 
Sent: miércoles, 9 de mayo de 2018 18:01
To: user@hive.apache.org
Cc: d...@hive.apache.org
Subject: Re: May 2018 Hive User Group Meeting

 

Hey everyone,  

 

Was the meeting recorded by any chance? 

Luis


On May 8, 2018, at 5:31 PM, Sahil Takiar <takiar.sa...@gmail.com> wrote:

Hey Everyone, 

 

Almost time for the meetup! The live stream can be viewed on this link:
https://live.lifesizecloud.com/extension/2000992219?token=067078ac-a8df-45bc-b84c-4b371ecbc719&name=&locale=en&meeting=Hive%20User%20Group%20Meetup

The stream won't be live until the meetup starts.

For those attending in person, there will be guest wifi:

Login: HiveMeetup
Password: ClouderaHive

 

On Mon, May 7, 2018 at 12:48 PM, Sahil Takiar <takiar.sa...@gmail.com> wrote:

Hey Everyone, 

 

The meetup is only a day away! Here is a link to all the abstracts we have
compiled thus far. Several of you have asked about event streaming and
recordings. The meetup will be both streamed live and recorded. We will post
the links on this thread and on the meetup link tomorrow, closer to the start
of the meetup.

 

The meetup will be at Cloudera HQ - 395 Page Mill Rd. If you have any trouble 
getting into the building, feel free to post on the meetup link.

 

Meetup Link: https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/

 

On Wed, May 2, 2018 at 7:48 AM, Sahil Takiar <takiar.sa...@gmail.com> wrote:

Hey Everyone,

 

The agenda for the meetup has been set and I'm excited to say we have lots of
interesting talks scheduled! Below is the final agenda; the full list of
abstracts will be sent out soon. If you are planning to attend, please RSVP on
the meetup link so we can get an accurate headcount of attendees
(https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/).


6:30 - 7:00 PM Networking and Refreshments 

7:00 PM - 8:20 PM Lightning Talks (10 min each) - 8 talks total

· What's new in Hive 3.0.0 - Ashutosh Chauhan

· Hive-on-Spark at Uber: Efficiency & Scale - Xuefu Zhang

· Hive-on-S3 Performance: Past, Present, and Future - Sahil Takiar

· Dali: Data Access Layer at LinkedIn - Adwait Tumbde

· Parquet Vectorization in Hive - Vihang Karajgaonkar

· ORC Column Level Encryption - Owen O’Malley

· Running Hive at Scale @ Lyft - Sharanya Santhanam, Rohit Menon

· Materialized Views in Hive - Jesus Camacho Rodriguez

8:30 PM - 9:00 PM Hive Metastore Panel

· Moderator: Vihang Karajgaonkar

· Participants:

o Daniel Dai - Hive Metastore Caching

o Alan Gates - Hive Metastore Separation

o Rituparna Agrawal - Customer Use Cases & Pain Points of (Big) Metadata

The Metastore panel will consist of a short presentation by each panelist 
followed by a Q&A session driven by the moderator.

 

On Tue, Apr 24, 2018 at 2:53 PM, Sahil Takiar <takiar.sa...@gmail.com> wrote:

We still have a few slots open for lightning talks, so if anyone is interested
in giving a presentation, don't hesitate to reach out!

 

If you are planning to attend the meetup, please RSVP on the Meetup link 
(https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/) so that we 
can get an accurate headcount for food.

 

Thanks!

 

--Sahil

 

On Wed, Apr 11, 2018 at 5:08 PM, Sahil Takiar <takiar.sa...@gmail.com> wrote:

Hi all,

I'm happy to announce that the Hive community is organizing a Hive user group 
meeting in the Bay Area next month. The details can be found at 
https://www.meetup.com/Hive-User-Group-Meeting/events/249641278/ 


The format of this meetup will be slightly different from previous ones. There 
will be one hour dedicated to lightning talks, followed by a group discussion 
on the future of the Hive Metastore.

We are inviting talk proposals from Hive users as well as developers at this
time. Please contact either myself (takiar.sa...@gmail.com), Vihang
Karajgaonkar (vih...@cloudera.com), or Peter Vary (pv...@cloudera.com) with
proposals. We currently have 5 openings.

Please let me know if you have any questions or suggestions.

Thanks,
Sahil





 

-- 

Sahil Takiar

Software Engineer
takiar.sa...@gmail.com   | (510) 673-0309





 
