RE: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Xu, Cheng A
Congratulations!!!

From: @Sanjiv Singh [mailto:sanjiv.is...@gmail.com]
Sent: Tuesday, March 24, 2015 12:45 PM
To: user@hive.apache.org
Cc: d...@hive.apache.org; mmccl...@hortonworks.com; jxi...@apache.org; 
sergio.p...@cloudera.com
Subject: Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and 
Sergio Pena

Congratulations !!!

Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Mon, Mar 23, 2015 at 11:38 PM, Carl Steinbach <c...@apache.org> wrote:
The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and Sergio 
Pena committers on the Apache Hive Project.

Please join me in congratulating Jimmy, Matt, and Sergio.

Thanks.

- Carl




Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread @Sanjiv Singh
Congratulations !!!

Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Mon, Mar 23, 2015 at 11:38 PM, Carl Steinbach  wrote:

> The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and
> Sergio Pena committers on the Apache Hive Project.
>
> Please join me in congratulating Jimmy, Matt, and Sergio.
>
> Thanks.
>
> - Carl
>
>


Re: Delete ORC partition

2015-03-23 Thread Megha Garg
Thanks for correcting, that was a typo. My actual command is :-

 alter table my_tbl drop partition (date='2014-01-02 00%3A00%3A00.0') ;

On Tue, Mar 24, 2015 at 9:35 AM, Steve Howard 
wrote:

> Do you have a typo in the partition name?  There is a space in the list
> you have between day and hour, but not in your drop statement.  Also %3A is
> hex for the ":" character, but you don't have that in your partition name to
> get dropped.
>
> Sent from my iPad
>
> On Mar 23, 2015, at 11:46 PM, Megha Garg  wrote:
>
> I am not getting any error and my hive version is 0.13
>
> On Mon, Mar 23, 2015 at 8:57 PM, Alan Gates  wrote:
>
>> Are you getting an error or does the partition just not get deleted?  If
>> you get an error message can you share it?  What version of Hive are you
>> using?
>>
>> Alan.
>>
>>  
>>  Megha Garg 
>>  March 23, 2015 at 5:43
>> Hi,
>>
>> I am new to hive. I have created one ORC table with partitioning where my
>> partition looks like below:-
>>
>> *date=2014-01-01 00%3A00%3A00.0*
>> *date=2014-01-02 00%3A00%3A00.0*
>> *date=2014-01-03 00%3A00%3A00.0*
>>
>> I want to delete my second partition (date=2014-01-02 00%3A00%3A00.0) but
>> i am not able to do so. I am using the below query:-
>>
>> * alter table my_tbl drop  partition (date='2014-01-0200.00.00.0') ;*
>>
>> But it is not working.  How can i delete it?
>>
>>
>


Re: Delete ORC partition

2015-03-23 Thread Steve Howard
Do you have a typo in the partition name?  There is a space in the list you 
have between day and hour, but not in your drop statement.  Also %3A is hex for 
the ":" character, but you don't have that in your partition name to get dropped.
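To illustrate the %3A point: the listed partition names look like escaped directory names, where a character such as ":" is stored as a percent sign followed by its hex code. The sketch below is only an illustration, not Hive's own code. It decodes such a name back to the plain value that one would presumably use in the drop statement. The class and method names are hypothetical, and it assumes the original partition value was the timestamp-like string with real colons.

```java
// Hypothetical helper (not part of Hive): turns an escaped partition
// directory value back into the plain value.
final class PartitionNameDecoder {

    // Replaces every %XX escape (two hex digits) with the character it encodes.
    static String decode(String escaped) {
        StringBuilder sb = new StringBuilder(escaped.length());
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '%' && i + 2 < escaped.length()) {
                int hi = Character.digit(escaped.charAt(i + 1), 16);
                int lo = Character.digit(escaped.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    sb.append((char) (hi * 16 + lo));
                    i += 2; // skip the two hex digits just consumed
                    continue;
                }
            }
            sb.append(c); // anything else is copied unchanged
        }
        return sb.toString();
    }

    // Rebuilds the drop statement from the thread using the decoded value.
    static String dropStatement(String table, String column, String escapedValue) {
        return "alter table " + table + " drop partition ("
                + column + "='" + decode(escapedValue) + "');";
    }

    public static void main(String[] args) {
        System.out.println(
                dropStatement("my_tbl", "date", "2014-01-02 00%3A00%3A00.0"));
    }
}
```

Running it prints `alter table my_tbl drop partition (date='2014-01-02 00:00:00.0');`. Whether that statement removes the partition on Hive 0.13 is not confirmed anywhere in this thread.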

Sent from my iPad

> On Mar 23, 2015, at 11:46 PM, Megha Garg  wrote:
> 
> I am not getting any error and my hive version is 0.13
> 
>> On Mon, Mar 23, 2015 at 8:57 PM, Alan Gates  wrote:
>> Are you getting an error or does the partition just not get deleted?  If you 
>> get an error message can you share it?  What version of Hive are you using?
>> 
>> Alan.
>> 
>>>Megha Garg  March 23, 2015 at 5:43
>>> Hi,
>>> 
>>> I am new to hive. I have created one ORC table with partitioning where my 
>>> partition looks like below:-
>>> 
>>> date=2014-01-01 00%3A00%3A00.0
>>> date=2014-01-02 00%3A00%3A00.0
>>> date=2014-01-03 00%3A00%3A00.0
>>> 
>>> I want to delete my second partition (date=2014-01-02 00%3A00%3A00.0) but i 
>>> am not able to do so. I am using the below query:-
>>> 
>>>  alter table my_tbl drop  partition (date='2014-01-0200.00.00.0') ; 
>>> 
>>> But it is not working.  How can i delete it?
> 


Re: Delete ORC partition

2015-03-23 Thread Megha Garg
I am not getting any error and my hive version is 0.13

On Mon, Mar 23, 2015 at 8:57 PM, Alan Gates  wrote:

> Are you getting an error or does the partition just not get deleted?  If
> you get an error message can you share it?  What version of Hive are you
> using?
>
> Alan.
>
>   Megha Garg 
>  March 23, 2015 at 5:43
> Hi,
>
> I am new to hive. I have created one ORC table with partitioning where my
> partition looks like below:-
>
> *date=2014-01-01 00%3A00%3A00.0*
> *date=2014-01-02 00%3A00%3A00.0*
> *date=2014-01-03 00%3A00%3A00.0*
>
> I want to delete my second partition (date=2014-01-02 00%3A00%3A00.0) but
> i am not able to do so. I am using the below query:-
>
> * alter table my_tbl drop  partition (date='2014-01-0200.00.00.0') ;*
>
> But it is not working.  How can i delete it?
>
>


Re: Submitting via WebHCat won't put hive log into ATS?

2015-03-23 Thread Eugene Koifman
I'm working https://issues.apache.org/jira/browse/HIVE-10066 where I plan to 
make all of hive-site.xml be available to webhcat's LaunchMapper so that these 
settings don't have to be duplicated in templeton.hive.properties.

From: Lefty Leverenz <leftylever...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, March 23, 2015 at 5:37 PM
To: "user@hive.apache.org" <user@hive.apache.org>,
"d...@hive.apache.org" <d...@hive.apache.org>
Subject: Re: Submitting via WebHCat won't put hive log into ATS?

Should this be documented in the wiki?  (For example, in the WebHCat
Configuration Variables section, with a link from Error Logs in Getting
Started.)

-- Lefty

On Mon, Mar 23, 2015 at 8:00 PM, Xiaoyong Zhu <xiaoy...@microsoft.com> wrote:
Confirmed with Zhijie Shen from Hortonworks (big thanks for the help!) and it 
is actually caused by the fact that WebHCat does not know Hive configurations.
Assuming you have set the following properties for Hive:
[inline image listing the properties; not preserved in the archive]
Then putting these three properties into templeton.hive.properties would 
workaround this issue.

Xiaoyong

From: Xiaoyong Zhu 
[mailto:xiaoy...@microsoft.com]
Sent: Saturday, March 21, 2015 8:15 PM
To: user@hive.apache.org
Subject: Submitting via WebHCat won't put hive log into ATS?

Hi Hive experts,

I know that Hive is writing logs to ATS (application timeline server) when a 
hive script is executed from the CLI. However, when I try to submit Hive jobs 
from WebHCat, it seems that it does not write logs into ATS? How could I 
configure Hive to make the jobs submitted from WebHCat also log into ATS?

Thanks!

Xiaoyong




Re: Submitting via WebHCat won't put hive log into ATS?

2015-03-23 Thread Lefty Leverenz
Should this be documented in the wiki?  (For example, in the WebHCat
Configuration Variables section, with a link from Error Logs in Getting
Started.)

-- Lefty

On Mon, Mar 23, 2015 at 8:00 PM, Xiaoyong Zhu 
wrote:

>  Confirmed with Zhijie Shen from Hortonworks (big thanks for the help!)
> and it is actually caused by the fact that WebHCat does not know Hive
> configurations.
>
> Assuming you have set the following properties for Hive:
>
>  Then putting these three properties into *templeton.hive.properties *would
> workaround this issue.
>
>
>
> Xiaoyong
>
>
>
> *From:* Xiaoyong Zhu [mailto:xiaoy...@microsoft.com]
> *Sent:* Saturday, March 21, 2015 8:15 PM
> *To:* user@hive.apache.org
> *Subject:* Submitting via WebHCat won't put hive log into ATS?
>
>
>
> Hi Hive experts,
>
>
>
> I know that Hive is writing logs to ATS (application timeline server) when
> a hive script is executed from the CLI. However, when I try to submit Hive
> jobs from WebHCat, it seems that it does not write logs into ATS? How could
> I configure Hive to make the jobs submitted from WebHCat also log into ATS?
>
>
>
> Thanks!
>
>
>
> Xiaoyong
>
>
>


RE: Submitting via WebHCat won't put hive log into ATS?

2015-03-23 Thread Xiaoyong Zhu
Confirmed with Zhijie Shen from Hortonworks (big thanks for the help!) and it 
is actually caused by the fact that WebHCat does not know Hive configurations.
Assuming you have set the following properties for Hive:
[inline image listing the properties; not preserved in the archive]
Then putting these three properties into templeton.hive.properties would 
workaround this issue.

Xiaoyong

From: Xiaoyong Zhu [mailto:xiaoy...@microsoft.com]
Sent: Saturday, March 21, 2015 8:15 PM
To: user@hive.apache.org
Subject: Submitting via WebHCat won't put hive log into ATS?

Hi Hive experts,

I know that Hive is writing logs to ATS (application timeline server) when a 
hive script is executed from the CLI. However, when I try to submit Hive jobs 
from WebHCat, it seems that it does not write logs into ATS? How could I 
configure Hive to make the jobs submitted from WebHCat also log into ATS?

Thanks!

Xiaoyong



Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Vaibhav Gumashta
Congrats to all.

From: Sergey Shelukhin <ser...@hortonworks.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, March 23, 2015 at 12:52 PM
To: "user@hive.apache.org" <user@hive.apache.org>,
"d...@hive.apache.org" <d...@hive.apache.org>,
Matthew McCline <mmccl...@hortonworks.com>,
"jxi...@apache.org" <jxi...@apache.org>,
Sergio Pena <sergio.p...@cloudera.com>
Subject: Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and 
Sergio Pena

Congrats!

From: Carl Steinbach <c...@apache.org>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, March 23, 2015 at 10:08
To: "user@hive.apache.org" <user@hive.apache.org>,
"d...@hive.apache.org" <d...@hive.apache.org>,
Matthew McCline <mmccl...@hortonworks.com>,
"jxi...@apache.org" <jxi...@apache.org>,
Sergio Pena <sergio.p...@cloudera.com>
Subject: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio 
Pena

The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and Sergio 
Pena committers on the Apache Hive Project.

Please join me in congratulating Jimmy, Matt, and Sergio.

Thanks.

- Carl



Re: Querying Uniontype with Hive

2015-03-23 Thread Buntu Dev
Still trying to figure out if there is any way to query directly or if
there is any existing UDF to help query union type fields in HiveQL.

Thanks!

On Tue, Feb 24, 2015 at 10:43 AM, Buntu Dev  wrote:

> Hi,
>
> This might've been asked previously but I couldn't find any examples of
> how to query uniontype in Hive.
>
> I have this field in the table:
>
> `location`
> uniontype,boolean>
>
> How do I go about querying: "select location.latitiude, location.latitude
> from ..." since I get this error:
>
> . Operator is only supported on struct or list of struct types
>
>
> If this is currently not supported via Hive, would appreciate if anyone
> can throw some light on how to go about this outside of Hive?
>
>
> Thanks!
>


Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Sergey Shelukhin
Congrats!

From: Carl Steinbach <c...@apache.org>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, March 23, 2015 at 10:08
To: "user@hive.apache.org" <user@hive.apache.org>,
"d...@hive.apache.org" <d...@hive.apache.org>,
Matthew McCline <mmccl...@hortonworks.com>,
"jxi...@apache.org" <jxi...@apache.org>,
Sergio Pena <sergio.p...@cloudera.com>
Subject: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio 
Pena

The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and Sergio 
Pena committers on the Apache Hive Project.

Please join me in congratulating Jimmy, Matt, and Sergio.

Thanks.

- Carl



Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Alexander Pivovarov
Congrats to Matt, Jimmy and Sergio!

On Mon, Mar 23, 2015 at 11:30 AM, Chaoyu Tang  wrote:

> Congratulations to Jimmy and Sergio!
>
> On Mon, Mar 23, 2015 at 2:08 PM, Carl Steinbach  wrote:
>
>> The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and
>> Sergio Pena committers on the Apache Hive Project.
>>
>> Please join me in congratulating Jimmy, Matt, and Sergio.
>>
>> Thanks.
>>
>> - Carl
>>
>>
>


Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Chaoyu Tang
Congratulations to Jimmy and Sergio!

On Mon, Mar 23, 2015 at 2:08 PM, Carl Steinbach  wrote:

> The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and
> Sergio Pena committers on the Apache Hive Project.
>
> Please join me in congratulating Jimmy, Matt, and Sergio.
>
> Thanks.
>
> - Carl
>
>


Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Szehon Ho
Congrats guys!

On Mon, Mar 23, 2015 at 11:27 AM, Prasanth Jayachandran <
pjayachand...@hortonworks.com> wrote:

>  Congratulations everyone!
>
>  On Mar 23, 2015, at 11:26 AM, Chinna Rao Lalam <
> lalamchinnara...@gmail.com> wrote:
>
>  Congratulations to all...
>
> On Mon, Mar 23, 2015 at 11:38 PM, Carl Steinbach  wrote:
>
>>  The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and
>> Sergio Pena committers on the Apache Hive Project.
>>
>>  Please join me in congratulating Jimmy, Matt, and Sergio.
>>
>>  Thanks.
>>
>>  - Carl
>>
>>
>
>
>  --
>  Hope It Helps,
> Chinna
>
>
>


RE: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Mich Talebzadeh
Me too :)

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and 
Coherence Cache

 

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only, if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries 
or their employees, unless expressly so stated. It is the responsibility of the 
recipient to ensure that this email is virus free, therefore neither Peridale 
Ltd, its subsidiaries nor their employees accept any responsibility.

 

From: Xuefu Zhang [mailto:xzh...@cloudera.com] 
Sent: 23 March 2015 18:26
To: user@hive.apache.org
Cc: d...@hive.apache.org; mmccl...@hortonworks.com; jxi...@apache.org; Sergio 
Pena
Subject: Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and 
Sergio Pena

 

Congratulations to all!

--Xuefu

 

On Mon, Mar 23, 2015 at 11:08 AM, Carl Steinbach  wrote:

The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and Sergio 
Pena committers on the Apache Hive Project. 

 

Please join me in congratulating Jimmy, Matt, and Sergio.

 

Thanks.

 

- Carl

 

 



Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Prasanth Jayachandran
Congratulations everyone!

On Mar 23, 2015, at 11:26 AM, Chinna Rao Lalam <lalamchinnara...@gmail.com> wrote:

Congratulations to all...

On Mon, Mar 23, 2015 at 11:38 PM, Carl Steinbach <c...@apache.org> wrote:
The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and Sergio 
Pena committers on the Apache Hive Project.

Please join me in congratulating Jimmy, Matt, and Sergio.

Thanks.

- Carl




--
Hope It Helps,
Chinna



Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Chinna Rao Lalam
Congratulations to all...

On Mon, Mar 23, 2015 at 11:38 PM, Carl Steinbach  wrote:

> The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and
> Sergio Pena committers on the Apache Hive Project.
>
> Please join me in congratulating Jimmy, Matt, and Sergio.
>
> Thanks.
>
> - Carl
>
>


-- 
Hope It Helps,
Chinna


Re: [ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Xuefu Zhang
Congratulations to all!

--Xuefu

On Mon, Mar 23, 2015 at 11:08 AM, Carl Steinbach  wrote:

> The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and
> Sergio Pena committers on the Apache Hive Project.
>
> Please join me in congratulating Jimmy, Matt, and Sergio.
>
> Thanks.
>
> - Carl
>
>


[ANNOUNCE] New Hive Committers - Jimmy Xiang, Matt McCline, and Sergio Pena

2015-03-23 Thread Carl Steinbach
The Apache Hive PMC has voted to make Jimmy Xiang, Matt McCline, and Sergio
Pena committers on the Apache Hive Project.

Please join me in congratulating Jimmy, Matt, and Sergio.

Thanks.

- Carl


Re: Delete ORC partition

2015-03-23 Thread Alan Gates
Are you getting an error or does the partition just not get deleted?  If 
you get an error message can you share it?  What version of Hive are you 
using?


Alan.


Megha Garg 
March 23, 2015 at 5:43
Hi,

I am new to hive. I have created one ORC table with partitioning where 
my partition looks like below:-


*date=2014-01-01 00%3A00%3A00.0*
*date=2014-01-02 00%3A00%3A00.0*
*date=2014-01-03 00%3A00%3A00.0*

I want to delete my second partition (date=2014-01-02 00%3A00%3A00.0) 
but i am not able to do so. I am using the below query:-


* alter table my_tbl drop  partition (date='2014-01-0200.00.00.0') ;*

But it is not working.  How can i delete it?


Delete ORC partition

2015-03-23 Thread Megha Garg
Hi,

I am new to hive. I have created one ORC table with partitioning where my
partition looks like below:-

*date=2014-01-01 00%3A00%3A00.0*
*date=2014-01-02 00%3A00%3A00.0*
*date=2014-01-03 00%3A00%3A00.0*

I want to delete my second partition (date=2014-01-02 00%3A00%3A00.0) but i
am not able to do so. I am using the below query:-

* alter table my_tbl drop  partition (date='2014-01-0200.00.00.0') ;*

But it is not working.  How can i delete it?


Re:run hiveserver2 with no authentication mode.

2015-03-23 Thread Килеев Васли Славик
Hello!

I have answered your question on Stack Overflow.

You should configure your LDAP server and populate the user tree before 
attempting authentication.

My Hive server uses LDAP authentication too. I have already configured an LDAP 
server and it works fine. I could configure it for you too for a small fee.

> thanks, do you know how to configure hiveserver2 with an ldap service? i have 
> configured it but can't make it connect to the server.
> i have filed a ticket at this url: 
> http://stackoverflow.com/questions/29206038/hiveserver2-bind-ldap-authentication
> Could you plz take a look?
> 
> thanks
> 
> uknow.
>> From: slavakil...@yandex.ru
>> To: user@hive.apache.org
>> Subject: Re:run hiveserver2 with no authentication mode.
>> Date: Mon, 23 Mar 2015 10:50:23 +0300
>>
>> Hello!
>>
>> Try to use user root with blank password. It worked for me. So it should 
>> work for you as well.
>>
>> > could anyone answer my question?uknow.
>> >
>> > From: tenglinx...@outlook.com
>> > To: user@hive.apache.org
>> > Subject: run hiveserver2 with no authentication mode.
>> > Date: Wed, 18 Mar 2015 13:14:31 +
>> >
>> > Hi all,
>> >
>> > i am pretty new to hive and would like to consult one issue that whether 
>> > the hiveserver2 can run with no authentication mode. i have tried to 
>> > configure the hive-site.xml file as below(this is the only one hiveserver2 
>> > related in my config file):
>> > <property>
>> >   <name>hive.server2.authentication</name>
>> >   <value>none</value>
>> > </property>
>> >
>> > but i can't connect to server when trying attempts with beeline. and if no 
>> > username/password is specified, always there is an prompt asking for them.
>> > could anyone give any comment on this? does hive support no authentication 
>> > mode for hiveserver2? and whether there is a way to achieve this.
>> >
>> > thanks in advance.
>> >
>> > uknow.


RE: Bucket pruning

2015-03-23 Thread Mich Talebzadeh
Hi,

 

Can someone clarify whether hive approach to split follows what is essentially 
a UNIX/Linux command?

 

For example, the following command splits the compressed archive of 
a_large_file into pieces (sub files) of 250,000 bytes each, called hqlfile00, 
hqlfile01 and so forth

tar cz ./a_large_file | split -d -b 250000 - hqlfile

 

 

HTH

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and 
Coherence Cache

 

 

From: matshyeq [mailto:matsh...@gmail.com] 
Sent: 23 March 2015 10:41
To: user
Cc: Daniel Haviv
Subject: Re: Bucket pruning

 

To me there's practically very little difference between partitioning and 
bucketing (partitioning defines split criteria explicitly whereas bucketing 
somewhat implicitly). Hive however recognises the latter as a separate feature 
and handles the two in quite different ways.

 

There's already a feature request proposition to unify and bring the 
optimisations across (so it would address the "bucket pruning" issue I believe 
you're having):

 

https://issues.apache.org/jira/browse/HIVE-9523

 

Probably best if you vote for it so it gets some traction…

 

Regards 

~Maciek

 

On Fri, Mar 13, 2015 at 12:22 PM, cobby  wrote:

hi, thanks for the detailed response.
i will experiment with your suggested orc bloom filter solution.

it seems to me the obvious, most straight forward solution is to add support 
for hash partitioning. so i can do something like:

create table T()
partitioned by (x into num_partitions,..).

upon insert hash(x) determines which partition to put the record in. upon 
select, the query processor can now hash on x and scan only that partition 
(this optimization will probably work only on = and other discrete filtering 
but thats true for partitioning in general).
it seems all of this can be done early in the query plan phase and have no 
effect on underling infra.
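The hash-partitioning idea above can be sketched roughly as follows. This only illustrates the proposal and does not show how Hive assigns buckets or partitions. It uses Java's own hashCode() as a stand-in hash, and the class name, method name and four-partition layout are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "hash(x) picks the partition".
final class HashPartitionSketch {

    // Maps a key to a partition index in [0, numPartitions).
    static int partitionFor(Object key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Mask the sign bit so the remainder is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }

        // "Insert": the hash of x decides where each record is stored.
        for (String x : new String[] {"X", "Y", "Z", "W"}) {
            partitions.get(partitionFor(x, numPartitions)).add(x);
        }

        // "Select ... where x = 'X'": hash the literal, scan one partition only.
        int target = partitionFor("X", numPartitions);
        boolean found = partitions.get(target).contains("X");
        System.out.println("scanned partition " + target + " of "
                + numPartitions + ", found=" + found);
    }
}
```

As the message notes, this kind of pruning only helps equality-style filters. A range predicate on x would still have to look at every partition.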

regards,cobby.




> On 12 March 2015, at 23:05, Gopal Vijayaraghavan wrote:
>
> Hi,
>
> No and it's a shame because we're stuck on some compatibility details with
> this.
>
> The primary issue is the fact that the InputFormat is very generic and
> offers no way to communicate StorageDescriptor or bucketing.
>
> The split generation for something SequenceFileInputFormat lives inside
> MapReduce, where it has no idea about bucketing.
>
> So InputFormat.getSplits(conf) returns something relatively arbitrary,
> which contains a mixture of files when CombineInputFormat is turned on.
>
> I have implemented this twice so far for ORC (for custom Tez jobs, with
> huge wins) by using an MRv2 PathFilter over the regular OrcNewInputFormat
> implementation, by turning off combine input and using Tez grouping
> instead.
>
> But that has proved to be very fragile for a trunk feature, since with
> schema evolution of partitioned tables older partitions may be bucketed
> with a different count from a newer partition - so the StorageDescriptor
> for each partition has to be fetched across before we can generate a valid
> PathFilter.
>
> The SARGs are probably a better way to do this eventually as they can
> implement IN_BUCKET(1,2) to indicate 1 of 2 instead of the "0_1"
> PathFilter which is fragile.
>
>
> Right now, the most fool-proof solution we've hit upon was to apply the
> ORC bloom filter to the bucket columns, which is far safer as it does not
> care about the DDL - but does a membership check on the actual metadata &
> prunes deeper at the stripe-level if it is sorted as well.
>
> That is somewhat neat since this doesn't need any new options for querying
> - it automatically(*) kicks in for your query pattern.
>
> Cheers,
> Gopal
> (*) - conditions apply - there's a threshold for file-size for these
> filters to be evaluated during planning (to prevent HS2 from burning CPU).
>
>
> From:  Daniel Haviv 
> Reply-To:  "user@hive.apache.org" 
> Date:  Thursday, March 12, 2015 at 2:36 AM
> To:  "user@hive.apache.org" 
> Subject:  Bucket pruning
>
>
> Hi,
> We created a bucketed table and when we select in the following way:
> select *
> from testtble
> where bucket_col ='X';
>
> We observe that there all of the table is being read and not just the
> specific bucket.
>
> Does Hive support such a feature ?
>
>
> Thanks,
> Daniel
>
>

 



Re: Bucket pruning

2015-03-23 Thread matshyeq
To me there's practically very little difference between partitioning and
bucketing (partitioning defines split criteria explicitly whereas bucketing
somewhat implicitly). Hive however recognises the latter as a separate
feature and handles the two in quite different ways.

There's already a feature request proposition to unify and bring the
optimisations across (so it would address the "bucket pruning" issue I
believe you're having):

https://issues.apache.org/jira/browse/HIVE-9523

Probably best if you vote for it so it gets some traction…

Regards
~Maciek

On Fri, Mar 13, 2015 at 12:22 PM, cobby  wrote:

> hi, thanks for the detailed response.
> i will experiment with your suggested orc bloom filter solution.
>
> it seems to me the obvious, most straight forward solution is to add
> support for hash partitioning. so i can do something like:
>
> create table T()
> partitioned by (x into num_partitions,..).
>
> upon insert hash(x) determines which partition to put the record in. upon
> select, the query processor can now hash on x and scan only that partition
> (this optimization will probably work only on = and other discrete
> filtering but thats true for partitioning in general).
> it seems all of this can be done early in the query plan phase and have no
> effect on underling infra.
>
> regards,cobby.
>
>
>
> > On 12 March 2015, at 23:05, Gopal Vijayaraghavan wrote:
> >
> > Hi,
> >
> > No and it's a shame because we're stuck on some compatibility details
> > with this.
> >
> > The primary issue is the fact that the InputFormat is very generic and
> > offers no way to communicate StorageDescriptor or bucketing.
> >
> > The split generation for something SequenceFileInputFormat lives inside
> > MapReduce, where it has no idea about bucketing.
> >
> > So InputFormat.getSplits(conf) returns something relatively arbitrary,
> > which contains a mixture of files when CombineInputFormat is turned on.
> >
> > I have implemented this twice so far for ORC (for custom Tez jobs, with
> > huge wins) by using an MRv2 PathFilter over the regular OrcNewInputFormat
> > implementation, by turning off combine input and using Tez grouping
> > instead.
> >
> > But that has proved to be very fragile for a trunk feature, since with
> > schema evolution of partitioned tables older partitions may be bucketed
> > with a different count from a newer partition - so the StorageDescriptor
> > for each partition has to be fetched across before we can generate a
> > valid PathFilter.
> >
> > The SARGs are probably a better way to do this eventually as they can
> > implement IN_BUCKET(1,2) to indicate 1 of 2 instead of the "0_1"
> > PathFilter which is fragile.
> >
> >
> > Right now, the most fool-proof solution we've hit upon was to apply the
> > ORC bloom filter to the bucket columns, which is far safer as it does not
> > care about the DDL - but does a membership check on the actual metadata &
> > prunes deeper at the stripe-level if it is sorted as well.
> >
> > That is somewhat neat since this doesn't need any new options for
> > querying - it automatically(*) kicks in for your query pattern.
> >
> > Cheers,
> > Gopal
> > (*) - conditions apply - there's a threshold for file-size for these
> > filters to be evaluated during planning (to prevent HS2 from burning
> > CPU).
> >
> >
> > From:  Daniel Haviv 
> > Reply-To:  "user@hive.apache.org" 
> > Date:  Thursday, March 12, 2015 at 2:36 AM
> > To:  "user@hive.apache.org" 
> > Subject:  Bucket pruning
> >
> >
> > Hi,
> > We created a bucketed table and when we select in the following way:
> > select *
> > from testtble
> > where bucket_col ='X';
> >
> > We observe that there all of the table is being read and not just the
> > specific bucket.
> >
> > Does Hive support such a feature ?
> >
> >
> > Thanks,
> > Daniel
> >
> >
>


RE: run hiveserver2 with no authentication mode.

2015-03-23 Thread James Teng
Thanks. Do you know how to configure HiveServer2 with an LDAP service? I have 
configured it but can't make it connect to the server.
I have filed a question at this URL: 
http://stackoverflow.com/questions/29206038/hiveserver2-bind-ldap-authentication
Could you please take a look?
Thanks,
uknow.
> From: slavakil...@yandex.ru
> To: user@hive.apache.org
> Subject: Re:run hiveserver2 with no authentication mode.
> Date: Mon, 23 Mar 2015 10:50:23 +0300
> 
> Hello!
> 
> Try to use user root with blank password. It worked for me. So it should work 
> for you as well.
> 
> > could anyone answer my question?uknow.
> > 
> > From: tenglinx...@outlook.com
> > To: user@hive.apache.org
> > Subject: run hiveserver2 with no authentication mode.
> > Date: Wed, 18 Mar 2015 13:14:31 +
> > 
> > Hi all,
> > 
> > i am pretty new to hive and would like to consult one issue that whether 
> > the hiveserver2 can run with no authentication mode. i have tried to 
> > configure the hive-site.xml file as below(this is the only one hiveserver2 
> > related in my config file):
> > <property>
> >   <name>hive.server2.authentication</name>
> >   <value>none</value>
> > </property>
> > 
> > but i can't connect to server when trying attempts with beeline. and if no 
> > username/password is specified, always there is an prompt asking for them.
> > could anyone give any comment on this? does hive support no authentication 
> > mode for hiveserver2? and whether there is a way to achieve this.
> > 
> > thanks in advance.
> > 
> > uknow.
  

Re:run hiveserver2 with no authentication mode.

2015-03-23 Thread Килеев Васли Славик
Hello!

Try to use user root with blank password. It worked for me. So it should work 
for you as well.

> could anyone answer my question?uknow.
> 
> From: tenglinx...@outlook.com
> To: user@hive.apache.org
> Subject: run hiveserver2 with no authentication mode.
> Date: Wed, 18 Mar 2015 13:14:31 +
> 
> Hi all,
> 
> i am pretty new to hive and would like to consult one issue that whether the 
> hiveserver2 can run with no authentication mode. i have tried to configure 
> the hive-site.xml file as below(this is the only one hiveserver2 related in 
> my config file):
> <property>
>   <name>hive.server2.authentication</name>
>   <value>none</value>
> </property>
> 
> but i can't connect to server when trying attempts with beeline. and if no 
> username/password is specified, always there is an prompt asking for them.
> could anyone give any comment on this? does hive support no authentication 
> mode for hiveserver2? and whether there is a way to achieve this.
> 
> thanks in advance.
> 
> uknow.


FW: run hiveserver2 with no authentication mode.

2015-03-23 Thread James Teng
Could anyone answer my question?

uknow.

From: tenglinx...@outlook.com
To: user@hive.apache.org
Subject: run hiveserver2 with no authentication mode.
Date: Wed, 18 Mar 2015 13:14:31 +

Hi all,

I am pretty new to Hive and would like to ask whether HiveServer2 can run 
with no authentication. I have configured the hive-site.xml file as below 
(this is the only HiveServer2-related property in my config file):

<property>
  <name>hive.server2.authentication</name>
  <value>none</value>
</property>

But I can't connect to the server when I try with beeline, and if no 
username/password is specified, there is always a prompt asking for them.
Could anyone comment on this? Does Hive support a no-authentication mode for 
HiveServer2, and is there a way to achieve this?

Thanks in advance.

uknow.
  

RE: Executing HQL files from JAVA application.

2015-03-23 Thread Amal Gupta
Hey Mich,

Got any clues regarding the failure of the code that I sent?

I was going through the project and the code again, and I suspect mismatched 
dependencies are the culprit. I am currently re-aligning the dependencies with 
the POMs published on mvnrepository.com to see whether a particular 
configuration succeeds.
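
A quick way to check the dependency suspicion is to ask the running JVM which 
jar a class came from and whether it has the method that is reported missing. 
The NoSuchMethodError on org.apache.hive.jdbc.HiveStatement.hasMoreLogs() 
quoted further down in this message typically points to a hive-beeline jar 
that is newer than the hive-jdbc jar on the classpath, though that is an 
inference, not something confirmed in this thread. The sketch below uses only 
standard reflection; ClasspathCheck and its methods are hypothetical names.

```java
import java.security.CodeSource;

class ClasspathCheck {
    // True if the class can be loaded and has a public no-argument method with this name.
    public static boolean hasMethod(String className, String methodName) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            c.getMethod(methodName);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    // Returns the jar or directory the class was loaded from, if it can be determined.
    public static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "unknown (bootstrap class)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        String cls = "org.apache.hive.jdbc.HiveStatement";
        System.out.println(cls + " loaded from: " + locationOf(cls));
        System.out.println("hasMoreLogs() present: " + hasMethod(cls, "hasMoreLogs"));
    }
}
```

Run from inside the application (for example at the start of the tasklet), 
this shows which hive-jdbc jar actually wins on the classpath, which may 
differ from the one declared in the pom.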

Will keep you posted on my progress.  Thanks again for all the help that you 
are providing. :)

Regards,
Amal

From: Amal Gupta
Sent: Sunday, March 22, 2015 7:52 AM
To: user@hive.apache.org
Subject: RE: Executing HQL files from JAVA application.

Hi Mich,

:) A coincidence: I am new to Hive as well. The test script I am trying to 
execute contains a DROP and a CREATE statement.

Script :-
use test_db;
DROP TABLE IF EXISTS demoHiveTable;
CREATE EXTERNAL TABLE demoHiveTable (
    demoId string,
    demoName string
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE LOCATION '/hive/';


Java Code: -
Not sure whether this has an impact, but the code is part of a Spring Batch 
Tasklet that is triggered from the batch context. This tasklet runs in 
parallel with other tasklets.

    public RepeatStatus execute(StepContribution arg0, ChunkContext arg1)
            throws Exception {
        // BeeLine arguments: driver, JDBC URL, user, password, script file
        String[] args = {
            "-d", BeeLine.BEELINE_DEFAULT_JDBC_DRIVER,
            "-u", "jdbc:hive2://:1/test_db",
            "-n", "**", "-p", "**",
            "-f", "C://Work//test_hive.hql"
        };
        BeeLine beeline = new BeeLine();
        // Capture BeeLine's output and error streams in a single buffer
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        PrintStream beelineOutputStream = new PrintStream(os);
        beeline.setOutputStream(beelineOutputStream);
        beeline.setErrorStream(beelineOutputStream);
        beeline.begin(args, null);
        String output = os.toString("UTF8");
        System.out.println(output);
        return RepeatStatus.FINISHED;
    }

It would be great if you could share the piece of code that worked for you. 
Maybe it will give me some pointers on how to proceed.

Best Regards,
Amal

From: Mich Talebzadeh [mailto:m...@peridale.co.uk]
Sent: Sunday, March 22, 2015 2:58 AM
To: user@hive.apache.org
Subject: RE: Executing HQL files from JAVA application.

Hi Amal;

Coming from a relational database (Oracle, Sybase) background :) I always 
expect that a DDL statement like DROP TABLE has to run in its own transaction 
and cannot be combined with a DML statement.

Now I suspect that when you run a DROP TABLE IF EXISTS command like the one 
below in beeline, it works:

0: jdbc:hive2://rhes564:10010/default> drop table if exists mytest;
No rows affected (0.216 seconds)

That runs in its own transaction, so it works. However, I suspect that is not 
the case in Java. Can you share your Java code so we can see exactly what it 
is doing?
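
One way to test this suspicion is to take BeeLine out of the picture and send 
each statement of the script in its own call over plain JDBC. The following is 
a minimal sketch, assuming an already open java.sql.Connection to HiveServer2 
and a script with no semicolons inside string literals. HqlScriptRunner, 
splitStatements and run are hypothetical names, not part of Hive.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

class HqlScriptRunner {
    // Naive split on ';' that drops "--" comment lines and blank statements.
    // Assumption: no semicolons appear inside string literals.
    public static List<String> splitStatements(String script) {
        StringBuilder withoutComments = new StringBuilder();
        for (String line : script.split("\\r?\\n")) {
            if (!line.trim().startsWith("--")) {
                withoutComments.append(line).append('\n');
            }
        }
        List<String> statements = new ArrayList<>();
        for (String part : withoutComments.toString().split(";")) {
            String sql = part.trim();
            if (!sql.isEmpty()) {
                statements.add(sql);
            }
        }
        return statements;
    }

    // Sends each statement in its own execute() call on an open connection.
    public static void run(Connection conn, String script) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            for (String sql : splitStatements(script)) {
                System.out.println("Executing: " + sql);
                stmt.execute(sql);
            }
        }
    }

    public static void main(String[] args) {
        String script = "use test_db;\nDROP TABLE IF EXISTS demoHiveTable;\n";
        System.out.println(splitStatements(script));
    }
}
```

If the DROP succeeds this way but not through the embedded BeeLine call, the 
problem is more likely in how BeeLine is invoked or in the jars it runs with 
than in how the statements are grouped.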

Thanks,

Mich

http://talebzadehmich.wordpress.com

Publications due shortly:
Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and 
Coherence Cache

From: Amal Gupta [mailto:amal.gup...@aexp.com]
Sent: 21 March 2015 18:16
To: user@hive.apache.org
Subject: RE: Executing HQL files from JAVA application.

Hi Mich,

Thank you for your response. I was not aware of beeline. I have now included 
it in my app, and it looks like a much better solution going forward. In the 
last couple of hours I have tried to work with beeline but have been facing 
some issues.


1.   I was able to run the beeline command below from the command line on the 
remote server. This was successful.
beeline -u jdbc:hive2://:1/test_db 
org.apache.hive.jdbc.HiveDriver -n * -p ** -f 
/hive/scripts/demoHiveTable.hql


2.   Running the same from the Java app results in issues. My script contains 
a drop table for the demo table (DROP TABLE IF EXISTS demoHiveTable;), but the 
table is not dropped when running from Java. I see the following logs.

SLF4J: This version of SLF4J requires log4j version 1.2.12 or later. See also 
http://www.slf4j.org/codes.html#log4j_version
Exception in thread "Thread-1" java.lang.NoSuchMethodError: 
org.apache.hive.jdbc.HiveStatement.hasMoreLogs()Z
   at org.apache.hive.beeline.Commands$1.run(Commands.java:839)
   at java.lang.Thread.run(Thread.java:662)
Connecting to jdbc:hive2: ://:10