How the ALTER statement affects reading ORC files written before a table schema change

2015-06-03 Thread Jie Zhang
Hi, Experts,

We have an application that creates an ORC table in Hive to store events. The
table schema needs to change dynamically, since new data adds new columns
(append only). We noticed the application needs to use ALTER TABLE to change
the table schema before loading the new data, in order to read the previous
records (written before the schema change) correctly.


For example:

create table t1 (c1 struct<f1:string, f2:int>) partitioned by (ds timestamp)
stored as orc;

load data inpath '/old/data/file/path' into table t1 partition
(ds='2015-06-02 00:00:00');

alter table t1 change c1 c1 struct<f1:string, f2:int, f3:string>;

load data inpath '/new/data/file/path' into table t1 partition
(ds='2015-06-02 00:00:00');

If ALTER is not executed, say, we just load both old and new ORC files into
another table with the final schema, the previous records cannot be read at
all; reads fail with a ClassCastException. May I know what magic the ALTER
statement does that lets the ORC reader read files written before the schema change?
Thanks very much for your help!


Jessica


RE: Top N query

2015-06-03 Thread Arpita SINGH
Is Inner Join supported by the Hive version you are using ?

Regards
Arpita Singh

From: Ravisankar Mani [mailto:rrav...@gmail.com]
Sent: Wednesday, June 03, 2015 11:29 AM
To: user@hive.apache.org
Subject: Top N query

Hi everyone,
   FAILED: SemanticException [Error 10019]: Line 9:5 OR not
supported in JOIN currently 'movieid_0'

This exception occurs while executing the following query, but I need to add
these conditions. Kindly help with another way to write the query. Please
refer to the following query:


SELECT
 IF(ISNOTNULL(SUM(recommend_ratings2.rating)), SUM(recommend_ratings2.rating), 0) AS rating
,recommend_ratings2.movieid AS movieid
FROM default.recommend_ratings2
INNER JOIN
 (SELECT recommend_ratings2.movieid AS movieid_0
 ,IF(ISNOTNULL(SUM(recommend_ratings2.rating)), SUM(recommend_ratings2.rating), 0) AS Measure_0
 FROM default.recommend_ratings2
 GROUP BY recommend_ratings2.movieid ORDER BY Measure_0 DESC LIMIT 2) T_0
ON (recommend_ratings2.movieid = T_0.movieid_0 OR
    ((recommend_ratings2.movieid IS NULL) AND (T_0.movieid_0 IS NULL)))

Regards
Ravi
This email and any attachments may contain confidential information and 
intellectual property (including copyright material). It is only for the use of 
the addressee(s) in accordance with any instructions contained within it. If 
you are not the addressee, you are prohibited from copying, forwarding, 
disclosing, saving or otherwise using it in any way. If you receive this email 
in error, please immediately advise the sender and delete it. Steria may 
monitor the content of emails within its network to ensure compliance with its 
policies and procedures. Emails are susceptible to alteration and their 
integrity (including origin) cannot be assured. Steria shall not be liable for 
any modification to a message, or for messages falsely sent.


MetaException(message:java.security.AccessControlException: Permission denied

2015-06-03 Thread patcharee

Hi,

I was running a Spark job to INSERT OVERWRITE a Hive table and got
Permission denied. My question is: why did the Spark job do the insert as
user 'hive', not as myself, the user who ran the job? How can I fix the problem?


val hiveContext = new HiveContext(sc)
import hiveContext.implicits._
hiveContext.sql("INSERT OVERWRITE TABLE 4dim ... ")


Caused by: MetaException(message:java.security.AccessControlException:
Permission denied: user=hive, access=WRITE,
inode=/apps/hive/warehouse/wrf_tables/4dim/zone=2/z=1/year=2009/month=1:patcharee:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:185)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6795)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6777)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6702)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:9529)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:1516)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1433)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_partition_result$alter_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_partition_result$alter_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_partition_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_partition(ThriftHiveMetastore.java:2033)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_partition(ThriftHiveMetastore.java:2018)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_partition(HiveMetaStoreClient.java:1091)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy37.alter_partition(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.alterPartition(Hive.java:469)
... 26 more


BR,
Patcharee


Group by query

2015-06-03 Thread Ravisankar Mani
Hi everyone,

   I have selected specific columns using aggregation and got: FAILED:
SemanticException [Error 10002]: Line 2:26 Invalid column reference
'contactid'. The exception occurs while executing the following query, but it
works in SQL Server without using GROUP BY. Kindly refer to the query:


SELECT
  adventurepersoncontacts.contactid as contactid
 ,adventurepersoncontacts.fullname as fullname
 ,adventurepersoncontacts.age as age
 ,adventurepersoncontacts.emailaddress as emailaddress
 ,adventurepersoncontacts.phoneno as phoneno
 ,adventurepersoncontacts.modifieddate as modifieddate
, COUNT(adventurepersoncontacts.contactid)  as contact
 FROM
 default.adventurepersoncontacts

Regards,
Ravisankar M R


Hive Query

2015-06-03 Thread Teja Kunapareddy
Ravi,

Use an IF condition; it works. Use ISNOTNULL, and give the replacement
value as the third parameter of the IF condition.

NOTE: IF treats a NULL condition the same as false.

IF(ISNOTNULL(SUM(columnname)), SUM(columnname), 0) AS sumVendor

Regards,

Teja



Re: Write access to wiki

2015-06-03 Thread Lefty Leverenz
Done.  Welcome to the Hive wiki team, Anant!

-- Lefty

On Tue, Jun 2, 2015 at 9:57 PM, anant nag nntna...@gmail.com wrote:

 Hi,

 I want to update the Hive Cli documentation with the information on how to
 use the feature added in HIVE-9664( Hive add jar command should be able to
 download and add jars from a repository). Could you please provide me write
 access to the wiki. My confluence username is *nntnag17*

 Thanks,

 --
 Anant Nag
 Software Engineer
 LinkedIn Corporation
 an...@linkedin.com



Re: Top N query

2015-06-03 Thread Ravisankar Mani
Hi Arpita,

            Inner join works perfectly, but the 'or' condition is not
supported in an inner join.

Regards
Ravi

On Wed, Jun 3, 2015 at 6:39 AM, Arpita SINGH arpita.si...@steria.co.in
wrote:

  Is Inner Join supported by the Hive version you are using ?



 Regards

 Arpita Singh



 *From:* Ravisankar Mani [mailto:rrav...@gmail.com]
 *Sent:* Wednesday, June 03, 2015 11:29 AM
 *To:* user@hive.apache.org
 *Subject:* Top N query



 Hi everyone,

FAILED: SemanticException [Error 10019]: Line 9:5 OR not
 supported in JOIN currently 'movieid_0'Exception occur while
 executing the following query. But i need the add the conditions. Kindly
 help other way to resolve the query. Please refer the following query



 SELECT
 IF(ISNOTNULL(SUM(recommend_ratings2.rating)),SUM(recommend_ratings2.rating),
 0 ) AS rating
 ,recommend_ratings2.movieid AS movieid
 FROM default.recommend_ratings2
 INNER JOIN
  (SELECT recommend_ratings2.movieid AS movieid_0
 ,IF(ISNOTNULL(SUM(recommend_ratings2.rating)),SUM(recommend_ratings2.rating),
 0 ) AS Measure_0 FROM default.recommend_ratings2
  GROUP BY recommend_ratings2.movieid  ORDER BY Measure_0 DESC limit 2 ) T_0
 ON ( recommend_ratings2.movieid = T_0.movieid_0 or
 ((recommend_ratings2.movieid IS NULL) AND
  (T_0.movieid_0 IS NULL)))

   Regards

 Ravi



Re: Null condition hive query

2015-06-03 Thread Bhagwan S. Soni
Why don't you try IF(column_name IS NULL, 0, column_name) AS column_name?
After this you can perform your aggregation on top of it.
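A hedged sketch of the same null-to-zero replacement in HiveQL, using COALESCE; the table and column names are borrowed from the quoted SQL Server query and may need adjusting:

```sql
-- COALESCE returns its first non-NULL argument, so a NULL SUM becomes 0
SELECT COALESCE(SUM(CAST(businessentityid AS DECIMAL(38,6))), 0) AS businessentityid,
       accountnumber
FROM vendor
GROUP BY accountnumber
ORDER BY 2 ASC;
```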

On Wed, Jun 3, 2015 at 11:16 AM, Ravisankar Mani rrav...@gmail.com wrote:

 Hi Everyone,

 I need to check whether a measure value is null or not. If it is null,
 then replace it with 0; otherwise return the same value.

 It is a sample sql query

  SELECT ISNULL(SUM(CAST([Purchasing].[Vendor].[BusinessEntityID] AS
 decimal(38,6))),0)
 AS [BusinessEntityID],
 [Purchasing].[Vendor].[AccountNumber] AS [AccountNumber]
 FROM [Purchasing].[Vendor]
 GROUP BY [Purchasing].[Vendor].[AccountNumber]  ORDER BY 2 ASC

 In SQL Server, if I give ISNULL(SUM(columnname), 0), it replaces the result
 with 0 whenever it is null.

 I have checked in Hive: ISNULL(columnname) only returns true or false, so
 how do I replace nulls with 0? Could you please help with this query?


 Regards.

 Ravi



Re: Group by query

2015-06-03 Thread Bhagwan S. Soni
Ravishankar,

The query below works fine without including a GROUP BY:
SELECT
 COUNT(adventurepersoncontacts.contactid)  as contact
 FROM
 default.adventurepersoncontacts

But when you want an aggregated column alongside non-aggregated columns, you
have to specify the level of aggregation.
Why don't you include the selected columns in the GROUP BY? I don't think it
will affect your output.
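The suggestion above can be sketched as follows; every non-aggregated column from the SELECT list is repeated in the GROUP BY:

```sql
SELECT adventurepersoncontacts.contactid AS contactid,
       adventurepersoncontacts.fullname AS fullname,
       adventurepersoncontacts.age AS age,
       adventurepersoncontacts.emailaddress AS emailaddress,
       adventurepersoncontacts.phoneno AS phoneno,
       adventurepersoncontacts.modifieddate AS modifieddate,
       COUNT(adventurepersoncontacts.contactid) AS contact
FROM default.adventurepersoncontacts
GROUP BY adventurepersoncontacts.contactid,
         adventurepersoncontacts.fullname,
         adventurepersoncontacts.age,
         adventurepersoncontacts.emailaddress,
         adventurepersoncontacts.phoneno,
         adventurepersoncontacts.modifieddate;
```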

Thanks,
Bhagwan

On Wed, Jun 3, 2015 at 4:05 PM, Ravisankar Mani rrav...@gmail.com wrote:

 Hi everyone,

I have selected specific column using aggregation . FAILED:
 SemanticException [Error 10002]: Line 2:26 Invalid column reference
 'contactid  .  Exception occur while executing following query. But it
 working in sql server without using group by. Kindly refer the query.


 SELECT
   adventurepersoncontacts.contactid as contactid
  ,adventurepersoncontacts.fullname as fullname
  ,adventurepersoncontacts.age as age
  ,adventurepersoncontacts.emailaddress as emailaddress
  ,adventurepersoncontacts.phoneno as phoneno
  ,adventurepersoncontacts.modifieddate as modifieddate
 , COUNT(adventurepersoncontacts.contactid)  as contact
  FROM
  default.adventurepersoncontacts

 Regards,
 Ravisankar M R



Re: HIve Joins vs Pig Joins

2015-06-03 Thread DU DU
Are Tez, CBO, and ORC considered in this? With so much effort invested in
Hive, especially from Hortonworks and Cloudera, I think Hive has a future,
with Hive on Spark and so on.

On Wed, Jun 3, 2015 at 12:06 AM, Ritesh Kumar Singh 
riteshoneinamill...@gmail.com wrote:

 Check this paper :
 Pig-Hive benchmarks
 http://www.ibm.com/developerworks/library/ba-pigvhive/pighivebenchmarking.pdf

 Discusses these things in details.

 Cheers,
 Ritesh

 On Tue, Jun 2, 2015 at 11:32 AM, sreejesh s sreejesh...@yahoo.com wrote:

 Hi,

 I have read a couple of articles that say Pig joins perform better compared
 to Hive joins. Is that true? If yes, could you please explain the reason.

 Thanks






-- 
Thanks,
Will


Re: Hive Query

2015-06-03 Thread gabriel balan

Hi


IF(ISNOTNULL(SUM(columnname)), SUM(columnname), 0) AS sumVendor

Or
*coalesce( sum(columnname),0) as ...*

   As explained here 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF,
   COALESCE(T v1, T v2, ...) Returns the first v that is not NULL, or NULL if 
all v's are NULL.


hth
GB

On 6/3/2015 2:23 AM, Teja Kunapareddy wrote:


Ravi,

Use an IF condition; it works.

Use ISNOTNULL, and give the replacement value as the third parameter of the IF
condition.

NOTE: IF treats a NULL condition the same as false.

IF(ISNOTNULL(SUM(columnname)), SUM(columnname), 0) AS sumVendor

Regards,

Teja



--
The statements and opinions expressed here are my own and do not necessarily 
represent those of Oracle Corporation.



Re: Hive Query

2015-06-03 Thread João Alves
Hey all,

Has anyone else found the coalesce function to be prone to some weird
behaviours?

e.g.1: Giving null when it shouldn’t.

e.g.2: I had to change a coalesce(v1,v2,v3) to coalesce(v1,v2,v3,null) (???) 
otherwise the query would crash!

Regards,
João Alves


 On 03 Jun 2015, at 17:00, gabriel balan gabriel.ba...@oracle.com wrote:
 
 Hi
 
 If(ISNOTNULL(sum(columnname), sum(columnname),0) as sumVendor
 Or
 coalesce( sum(columnname),0) as ...
 As explained here 
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF, 
 COALESCE(T v1, T v2, ...) Returns the first v that is not NULL, or NULL if 
 all v's are NULL.
 
 
 hth
 GB
 
 On 6/3/2015 2:23 AM, Teja Kunapareddy wrote:
 Ravi,
  
 Use if condition .. it works .. 
  
 UseISNOTNULL  instead, and give the replace field as third parameter for 
 if condition…
  
  
 NOTE: false and null are together in if condition
  
  
 If(ISNOTNULL(sum(columnname), sum(columnname),0) as sumVendor
  
 Regards,
 Teja
  
  
 
 -- 
 The statements and opinions expressed here are my own and do not necessarily 
 represent those of Oracle Corporation.



Re: HIve Joins vs Pig Joins

2015-06-03 Thread Alan Gates
These results are fairly old.  Both Hive and Pig have changed 
significantly since then.  Many of Hive's internal operators have been 
rewritten.  Both can now run over Tez or Spark.  So at this point it's 
hard to say which is better.  The best option is to develop some tests 
based on your use case and try both to see which works best for you.


Alan.


Ritesh Kumar Singh mailto:riteshoneinamill...@gmail.com
June 2, 2015 at 21:06
Check this paper :
Pig-Hive benchmarks 
http://www.ibm.com/developerworks/library/ba-pigvhive/pighivebenchmarking.pdf


Discusses these things in details.

Cheers,
Ritesh


sreejesh s mailto:sreejesh...@yahoo.com
June 1, 2015 at 23:02
Hi,

I have read a couple of articles that say Pig joins perform better
compared to Hive joins. Is that true? If yes, could you please
explain the reason.


Thanks




Re: Query in HQL

2015-06-03 Thread Ankit Bhatnagar
modify to say   'CustomerId' 



 On Wednesday, June 3, 2015 9:44 PM, Renuka Be renunalin...@gmail.com 
wrote:
   

 Hi Folks,

Here is my query:

select customerid as CustomerId  from Customers

It returns a table with the column name 'customerid'; the 'CustomerId' in the
query is not reflected in the output. What shall I do to display the column
name as 'CustomerId'?

Regards,
Renuka

  

Hive 0.14: impersonation not working without system property

2015-06-03 Thread Lonikar, Kiran
Hi,

While executing Hive queries through CliDriver, we are getting a
permission-denied error on the scratch dir when executing with impersonation.
The error goes away when the system property HADOOP_PROXY_USER is set to the
proxy/impersonated user name.

Attaching the sample program to reproduce the issue. This happens on HDP 2.2 
(did not try on CDH 5u4 which also has hive 0.14).

The sample program when executed without any cmd line parameters reproduces 
this issue. When a cmd line arg is provided, it sets the system property and 
the queries are executed without an error.

Has anyone seen this? Seems like a possible regression from 0.13 to 0.14. It 
does not happen on CDH 5u2 which has hive 0.13.

Kiran



HiveCliSample.java
Description: HiveCliSample.java


Re: Top N query

2015-06-03 Thread Ravisankar Mani
Hi gopal,

Thanks for your response.

I don't understand your solution. Could you please provide the exact
query?

Regards,
Ravi



On Wed, Jun 3, 2015 at 2:23 PM, Gopal Vijayaraghavan gop...@apache.org
wrote:

 Hi,

 This particular case can be handled by a special case of the inner
 equijoin.

  ( recommend_ratings2.movieid = T_0.movieid_0 or
 ((recommend_ratings2.movieid IS NULL) AND
  (T_0.movieid_0 IS NULL)))

 Can be rewritten as a null-safe equi-join (i.e where movieid =
 movieid_0, which is not SQL semantics).

 That however is a corner-case to workaround folks who want OR joins to
 just handle NULL == NULL as a match.

 Cheers,
 Gopal





Re: Group by query

2015-06-03 Thread Ravisankar Mani
Hi Bhagwan,

 Thanks for your response...

   Is it possible to use aggregated and non-aggregated columns in a
Hive query without using GROUP BY?
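One possible alternative, assuming a Hive version with windowing support (0.11 or later) that accepts an empty OVER() clause: an analytic COUNT returns the aggregate alongside non-aggregated columns without a GROUP BY, although it computes the global count once per row:

```sql
SELECT contactid, fullname, age, emailaddress, phoneno, modifieddate,
       COUNT(contactid) OVER () AS contact
FROM default.adventurepersoncontacts;
```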


Regards,
Ravi





On Wed, Jun 3, 2015 at 10:24 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
wrote:

 Ravishankar,

 below query will works fine without including group by
 SELECT
  COUNT(adventurepersoncontacts.contactid)  as contact
  FROM
  default.adventurepersoncontacts

 But when you are planning to get aggregated column with non-aggregated
 column you have to specify the level of aggregation.
 Why don't you include selected column in group by, I don't think so it
 will affect your output.

 Thanks,
 Bhagwan

 On Wed, Jun 3, 2015 at 4:05 PM, Ravisankar Mani rrav...@gmail.com wrote:

 Hi everyone,

I have selected specific column using aggregation . FAILED:
 SemanticException [Error 10002]: Line 2:26 Invalid column reference
 'contactid  .  Exception occur while executing following query. But it
 working in sql server without using group by. Kindly refer the query.


 SELECT
   adventurepersoncontacts.contactid as contactid
  ,adventurepersoncontacts.fullname as fullname
  ,adventurepersoncontacts.age as age
  ,adventurepersoncontacts.emailaddress as emailaddress
  ,adventurepersoncontacts.phoneno as phoneno
  ,adventurepersoncontacts.modifieddate as modifieddate
 , COUNT(adventurepersoncontacts.contactid)  as contact
  FROM
  default.adventurepersoncontacts

 Regards,
 Ravisankar M R





Query in HQL

2015-06-03 Thread Renuka Be
Hi Folks,

Here is my query,

select customerid as CustomerId  from Customers

It returns a table with the column name 'customerid'; the 'CustomerId' in the
query is not reflected in the output. What shall I do to display the column
name as 'CustomerId'?

Regards,
Renuka


Re: Query in HQL

2015-06-03 Thread Ankit Bhatnagar
My bad .. I looked up the code, and Hive really does transform everything to
lower case.


 On Wednesday, June 3, 2015 10:02 PM, Renuka Be renunalin...@gmail.com 
wrote:
   

 Hi Ankit,

Thanks for your reply. I tried as you suggested; it throws the error below.

FAILED: ParseException line 1:21 cannot recognize input near 'as'
''CustomerId'' 'from' in selection target

Query: select customerid as 'CustomerId' from Customers

Regards,
Renuka
On Thu, Jun 4, 2015 at 12:49 AM, Ankit Bhatnagar ank...@yahoo-inc.com wrote:

modify to say   'CustomerId' 



 On Wednesday, June 3, 2015 9:44 PM, Renuka Be renunalin...@gmail.com 
wrote:
   

 Hi Folks,
Here is my query,
select customerid as CustomerId  from Customers

It returns table with column name as 'customerid', ' CustomerId' in the query 
was not reflected into table. What shall i do to display column name as 
'CustomerId'.
Regards,Renuka

   



  

Re: Top N query

2015-06-03 Thread Gopal Vijayaraghavan
Hi,

This particular case can be handled by a special case of the inner
equijoin.

 ( recommend_ratings2.movieid = T_0.movieid_0 or
((recommend_ratings2.movieid IS NULL) AND
 (T_0.movieid_0 IS NULL)))

Can be rewritten as a null-safe equi-join (i.e where movieid =
movieid_0, which is not SQL semantics).

That however is a corner-case to workaround folks who want OR joins to
just handle NULL == NULL as a match.
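A hedged sketch of that rewrite, assuming the Hive version in use supports the null-safe equality operator `<=>` in join conditions (the outer aggregate also needs its own GROUP BY, which the original query omitted):

```sql
SELECT COALESCE(SUM(r.rating), 0) AS rating,
       r.movieid AS movieid
FROM default.recommend_ratings2 r
INNER JOIN (
  SELECT movieid AS movieid_0,
         COALESCE(SUM(rating), 0) AS measure_0
  FROM default.recommend_ratings2
  GROUP BY movieid
  ORDER BY measure_0 DESC
  LIMIT 2
) T_0
ON r.movieid <=> T_0.movieid_0   -- <=> also matches NULL with NULL
GROUP BY r.movieid;
```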

Cheers,
Gopal




When using ORC, how to confirm Stipes/row-groups are being skipped ?

2015-06-03 Thread chandan sheth
Hi Community,

While using ORC file format, I would like to see in the logs that stripes
and/or row-groups are being skipped based on my where clause.

I don't see skip messages in the logs.
Is that info even logged? If so, what do I need to do to enable it?
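Skipping only happens if predicate pushdown into the ORC reader is enabled in the first place; a hedged sketch (verify the setting name and default against your Hive version):

```sql
-- Push WHERE-clause predicates down into the ORC reader so that stripe
-- and row-group min/max statistics can be used to skip data
SET hive.optimize.index.filter=true;
```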

Thank you,

*Chandan*


connection pooling for hive JDBC client

2015-06-03 Thread McWhorter, David
Hello, I am working on an application that queries and interacts with hive 
using the JDBC API.  In many other cases, using a JDBC connection pool such as 
commons-dbcp or BoneCP or HikariCP is a recommended practice and results in 
much better performance.  All of the examples I’ve found of using accessing 
Hive through the JDBC API (such as 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBCClientSampleCode)
 use a raw JDBC connection to Hive directly.  A few questions:

  1.  Should hive work with connection pools such as commons-dbcp, BoneCP, or 
HikariCP?
 *   Note: I have tried both BoneCP and HikariCP, and cannot seem to get
either to work because HiveConnection.setReadOnly throws a
java.sql.SQLException saying “Method not supported”.  I am using Hive 0.14.0.
  2.  Do you recommended using a JDBC connection pool for interacting with hive 
from an application that will execute many repeated and concurrent 
queries/statements?

Also, just to be clear, I am not asking about configuring the Hive metastore to 
use a connection pool to connect to its underlying database (MySQL, PostgreSQL, 
etc), but about using a connection pool to interact with and query Hive through 
the JDBC api from an application.

Thank you,
David McWhorter

—

David McWhorter
Senior Developer, Foundations
Informatics and Technology Services
Office: 434.260.5232 | Mobile: 434.227.2551
david_mcwhor...@premierinc.com  |  Premier, Inc. (NASDAQ: PINC)


Re: current_date function in hive

2015-06-03 Thread Ayazur Rehman
Hi everyone,

Thanks for the responses. My problem is solved now.
I was not using a version of Hive that supports the current_date function.
Thanks, Gopal, for the UDF link.
I believe I will have to add the UDF to HiveServer2 in order to be able
to run the query using Oozie.

On Tue, Jun 2, 2015 at 1:30 PM, Gopal Vijayaraghavan gop...@apache.org
wrote:

 Hi,

  You may try to_date(FROM_UNIXTIME(UNIX_TIMESTAMP()))

 That would be a very bad idea for query correctness.

 The current_date UDF was introduced because of that particular
 anti-pattern.

 The unix_timestamp() is evaluated when a row is read, so each row gets a
 slightly different timestamp when doing ETLs. I have seen it give
 completely incorrect results when used near midnight (i.e. a query kicked off
 at 11:59, executed at 00:01 the next day).

 Plus, map-reduce failure handling mandates (logically) identical outputs on
 map task retries or speculation.

 I've recently pulled the current_timestamp() impl out into a reusable UDF
 so that I can use it in hive-1.0.x installs as well.

 https://github.com/t3rmin4t0r/current-timestamp

 You can use that UDF JAR with hive-1.0+, the constant folding will replace
 the value in-place during compilation.
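The contrast can be sketched as follows (`t` is a placeholder table name):

```sql
-- Anti-pattern: unix_timestamp() is evaluated per row at read time,
-- so rows read on either side of midnight get different dates
SELECT to_date(from_unixtime(unix_timestamp())) FROM t;

-- Preferred: current_date is constant-folded once at compile time
SELECT current_date FROM t;
```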

 Cheers,
 Gopal





-- 
Thanking You,
Ayazur Rehman
+1 720 917 8540