Re: Alter table is giving error

2012-11-27 Thread Dean Wampler
Right, your CREATE TABLE statement now points to your S3 location, so you
don't need to do anything else. However, queries will pull this data from
S3 every time, which will be a little slower, and you'll incur a small
charge for reading from S3. Parking data there is great when you only need
occasional access to it; for frequent access, an HDFS location is better.

As a side note, the error message tells you that you can't use an S3
location in a LOAD DATA statement. So if you ever define a managed/internal
table and want to populate it with S3 data, you'll have to copy the data
from S3 to your cluster first, then load it from there.

dean
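
For a managed table, that copy-then-load workflow might be sketched as follows (table name and paths are illustrative, and the initial copy happens outside Hive):

```sql
-- First copy the file down from S3 to HDFS, e.g. with:
--   hadoop fs -cp s3://location/someidexcel.csv /tmp/someidexcel.csv

-- A managed (internal) table, mirroring the thread's one-column example:
CREATE TABLE someidtable_managed (someid STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- LOAD DATA accepts "file" or "hdfs" paths, so this now succeeds:
LOAD DATA INPATH '/tmp/someidexcel.csv' INTO TABLE someidtable_managed;
```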


Re: Alter table is giving error

2012-11-27 Thread Mark Grover
Chunky,
You have an external table that points at the location s3://location/

No need to load the data. All files (or partition folders) under
s3://location/ should be available via the table.
Just run your queries on it.

LOAD DATA moves data from one HDFS location to another. You don't
need (or want) to do that in this case.

Mark
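
For instance, with the table from this thread, a plain query reads the S3 files in place, with no LOAD DATA step:

```sql
-- Reads directly from the files under s3://location/
SELECT someid
FROM someidtable
LIMIT 10;
```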


Re: Alter table is giving error

2012-11-27 Thread Chunky Gupta
Hi,

Now when I am trying to load a CSV file into any table I created, it's not
working.

I created a table:

CREATE EXTERNAL TABLE someidtable (
someid STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
LOCATION 's3://location/';

Then:

LOAD DATA INPATH 's3://location/someidexcel.csv' INTO TABLE someidtable;

It gives this error:
"Error in semantic analysis: Line 1:17 Invalid path
''s3n://location/someidexcel.csv'': only "file" or "hdfs" file systems
accepted"

Please help me in resolving this issue.
Thanks,
Chunky.


Re: Alter table is giving error

2012-11-07 Thread Chunky Gupta
Okay Mark, I will keep an eye on this JIRA.
Thanks again for helping.
Chunky.


Re: Alter table is giving error

2012-11-06 Thread Mark Grover
Chunky,
I just tried it myself. It turns out that the directory you are adding as
a partition has to be empty for msck repair to work. This is obviously
sub-optimal, and there is a JIRA in place
(https://issues.apache.org/jira/browse/HIVE-3231) to fix it.

So, I'd suggest you keep an eye out for that fix in the next version. In
the meantime, run msck after you create your partition directory but
before you populate it with data.

Mark
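
Concretely, that workaround ordering might look like this (table name and path are illustrative):

```sql
-- 1. Create the partition directory while it is still empty, e.g.:
--      hadoop fs -mkdir s3://my-location/data/dt=2012-11-07/
-- 2. Register it while the directory is still empty:
MSCK REPAIR TABLE logs;
-- 3. Only then copy the data files into the directory.
SHOW PARTITIONS logs;  -- should now include dt=2012-11-07
```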


Re: Alter table is giving error

2012-11-06 Thread Chunky Gupta
Hi Mark,
Sorry, I forgot to mention: I have also tried
msck repair table <table_name>;
and got the same output that I got from plain msck.
Do I need to do any other settings for this to work? I have set up Hadoop
and Hive from scratch on EC2.

Thanks,
Chunky.




Re: Alter table is giving error

2012-11-06 Thread Mark Grover
Chunky,
You should have run:
msck repair table <table_name>;

Sorry, I should have made it clear in my last reply. I have added an entry
to the Hive wiki for the benefit of others:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Recoverpartitions

Mark
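
Note the difference between the two forms: plain MSCK only reports partitions that exist on disk but are missing from the metastore, while the REPAIR form also adds them. A quick contrast (table name illustrative):

```sql
MSCK TABLE logs;         -- checks and reports missing partitions only
MSCK REPAIR TABLE logs;  -- also registers those partitions in the metastore
SHOW PARTITIONS logs;    -- verify they were added
```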


On Tue, Nov 6, 2012 at 9:55 PM, Chunky Gupta wrote:

> Hi Mark,
> I didn't get any error.
> I ran this on hive console:-
>  "msck table Table_Name;"
> It says Ok and showed the execution time as 1.050 sec.
> But when I checked partitions for table using
>   "show partitions Table_Name;"
> It didn't show me any partitions.
>
> Thanks,
> Chunky.
>
>
> On Tue, Nov 6, 2012 at 10:38 PM, Mark Grover wrote:
>
>> Glad to hear, Chunky.
>>
>> Out of curiosity, what errors did you get when using msck?
>>
>>
>> On Tue, Nov 6, 2012 at 5:14 AM, Chunky Gupta wrote:
>>
>>> Hi Mark,
>>> I tried msck, but it is not working for me. I have written a python
>>> script to partition the data individually.
>>>
>>> Thank you Edward, Mark and Dean.
>>> Chunky.
>>>
>>>
>>> On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover <
>>> grover.markgro...@gmail.com> wrote:
>>>
 Chunky,
 I have used "recover partitions" command on EMR, and that worked fine.

 However, take a look at https://issues.apache.org/jira/browse/HIVE-874.
 Seems like the msck command in Apache Hive does the same thing. Try it out
 and let us know how it goes.

 Mark

 On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo wrote:

> Recover partitions should work the same way for different file systems.
>
> Edward
>
> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler wrote:
> > Writing a script to add the external partitions individually is the
> > only way I know of.
> >
> > Sent from my rotary phone.
> >
> >
> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta wrote:
> >
> > Hi Dean,
> >
> > Actually I was having Hadoop and Hive cluster on EMR and I have S3
> > storage containing logs which updates daily and having partition with
> > date(dt). And I was using this recover partition.
> > Now I wanted to shift to EC2 and have my own Hadoop and Hive cluster.
> > So, what is the alternate of using recover partition in this case, if
> > you have any idea ?
> > I found one way of individually partitioning all dates, so I have to
> > write script for that to do so for all dates. Is there any easiest way
> > other than this ?
> >
> > Thanks,
> > Chunky
> >
> >
> >
> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler wrote:
> >>
> >> The RECOVER PARTITIONS is an enhancement added by Amazon to their
> >> version of Hive.
> >>
> >> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
> >>
> >> Chapter 21 of Programming Hive discusses this feature and other
> >> aspects of using Hive in EMR.
> >>
> >> dean
> >>
> >>
> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta
> >> <chunky.gu...@vizury.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and
> >>> Hive version 0.8.1 (I configured everything). I have created a table
> >>> using:
> >>>
> >>> CREATE EXTERNAL TABLE XXX ( YYY ) PARTITIONED BY ( ZZZ ) ROW FORMAT
> >>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION
> >>> 's3://my-location/data/';
> >>>
> >>> Now I am trying to recover partitions using:
> >>>
> >>> ALTER TABLE XXX RECOVER PARTITIONS;
> >>>
> >>> but I am getting this error: "FAILED: Parse Error: line 1:12 cannot
> >>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table
> >>> statement"
> >>>
> >>> Doing the same steps on a cluster setup on EMR with Hadoop version
> >>> 1.0.3 and Hive version 0.8.1 (configured by EMR) works fine.
> >>>
> >>> So is this a version issue, or am I missing some configuration
> >>> changes in the EC2 setup?
> >>> I am not able to find an exact solution for this problem on the
> >>> internet. Please help me.
> >>>
> >>> Thanks,
> >>> Chunky.
> >>>
> >>
> >> --
> >> Dean Wampler, Ph.D.
> >> thinkbiganalytics.com
> >> +1-312-339-1330

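The per-date script Chunky mentions (adding each external partition individually, as Dean suggests) might be sketched like this; the table name, bucket, and dt=YYYY-MM-DD layout are assumptions drawn from the thread:

```python
from datetime import date, timedelta

def add_partition_statements(table, bucket, start, end):
    """Emit one ALTER TABLE ... ADD PARTITION statement per day in [start, end]."""
    stmts = []
    d = start
    while d <= end:
        dt = d.isoformat()
        stmts.append(
            f"ALTER TABLE {table} ADD PARTITION (dt='{dt}') "
            f"LOCATION '{bucket}/dt={dt}/';"
        )
        d += timedelta(days=1)
    return stmts

# Print the statements; they could then be fed to the hive CLI, e.g.:
#   hive -e "$(python gen_partitions.py)"
for stmt in add_partition_statements("logs", "s3://my-location/data",
                                     date(2012, 11, 1), date(2012, 11, 3)):
    print(stmt)
```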

Re: Alter table is giving error

2012-11-06 Thread Chunky Gupta
Hi Mark,
I didn't get any error.
I ran this on the Hive console:
 "msck table Table_Name;"
It said OK and showed an execution time of 1.050 sec.
But when I checked the partitions for the table using
  "show partitions Table_Name;"
it didn't show any partitions.

Thanks,
Chunky.

 1.0.3 and
 >>> Hive version 0.8.1 (Configured by EMR), works fine.
 >>>
 >>> So is this a version issue or am I missing some configuration
 changes in
 >>> EC2 setup ?
 >>> I am not able to find exact solution for this problem on internet.
 Please
 >>> help me.
 >>>
 >>> Thanks,
 >>> Chunky.
 >>>
 >>>
 >>>
 >>
 >>
 >>
 >> --
 >> Dean Wampler, Ph.D.
 >> thinkbiganalytics.com
 >> +1-312-339-1330
 >>
 >>
 >

>>>
>>>
>>
>


Re: Alter table is giving error

2012-11-06 Thread Mark Grover
Glad to hear, Chunky.

Out of curiosity, what errors did you get when using msck?

On Tue, Nov 6, 2012 at 5:14 AM, Chunky Gupta wrote:

> Hi Mark,
> I tried msck, but it is not working for me. I have written a python script
> to partition the data individually.
>
> Thank you Edward, Mark and Dean.
> Chunky.
>
>
> On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover 
> wrote:
>
>> Chunky,
>> I have used "recover partitions" command on EMR, and that worked fine.
>>
>> However, take a look at https://issues.apache.org/jira/browse/HIVE-874. Seems
>> like the msck command in Apache Hive does the same thing. Try it out and let us
>> know how it goes.
>>
>> Mark
>>
>> On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo wrote:
>>
>>> Recover partitions should work the same way for different file systems.
>>>
>>> Edward
>>>
>>> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
>>>  wrote:
>>> > Writing a script to add the external partitions individually is the
>>> only way
>>> > I know of.
>>> >
>>> > Sent from my rotary phone.
>>> >
>>> >
>>> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta 
>>> wrote:
>>> >
>>> > Hi Dean,
>>> >
>>> > Actually I was having Hadoop and Hive cluster on EMR and I have S3
>>> storage
>>> > containing logs which updates daily and having partition with
>>> date(dt). And
>>> > I was using this recover partition.
>>> > Now I wanted to shift to EC2 and have my own Hadoop and Hive cluster.
>>> So,
>>> > what is the alternate of using recover partition in this case, if you
>>> have
>>> > any idea ?
>>> > I found one way of individually partitioning all dates, so I have to
>>> write
>>> > script for that to do so for all dates. Is there any easiest way other
>>> than
>>> > this ?
>>> >
>>> > Thanks,
>>> > Chunky
>>> >
>>> >
>>> >
>>> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler
>>> >  wrote:
>>> >>
>>> >> The RECOVER PARTITIONS is an enhancement added by Amazon to their
>>> version
>>> >> of Hive.
>>> >>
>>> >>
>>> >>
>>> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>>> >>
>>> >> 
>>> >>   Chapter 21 of Programming Hive discusses this feature and other
>>> aspects
>>> >> of using Hive in EMR.
>>> >> 
>>> >>
>>> >> dean
>>> >>
>>> >>
>>> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta
>>> >> wrote:
>>> >>>
>>> >>> Hi,
>>> >>>
>>> >>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and
>>> Hive
>>> >>> version 0.8.1 (I configured everything) . I have created a table
>>> using :-
>>> >>>
>>> >>> CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT
>>> >>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION
>>> 's3://my-location/data/';
>>> >>>
>>> >>> Now I am trying to recover partition using :-
>>> >>>
>>> >>> ALTER TABLE XXX RECOVER PARTITIONS;
>>> >>>
>>> >>> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot
>>> >>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table
>>> statement"
>>> >>>
>>> >>> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3
>>> and
>>> >>> Hive version 0.8.1 (Configured by EMR), works fine.
>>> >>>
>>> >>> So is this a version issue or am I missing some configuration
>>> changes in
>>> >>> EC2 setup ?
>>> >>> I am not able to find exact solution for this problem on internet.
>>> Please
>>> >>> help me.
>>> >>>
>>> >>> Thanks,
>>> >>> Chunky.
>>> >>>
>>> >>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Dean Wampler, Ph.D.
>>> >> thinkbiganalytics.com
>>> >> +1-312-339-1330
>>> >>
>>> >>
>>> >
>>>
>>
>>
>


Re: Alter table is giving error

2012-11-06 Thread Chunky Gupta
Hi Mark,
I tried msck, but it is not working for me. I have written a Python script
to add the partitions individually.

Thank you Edward, Mark and Dean.
Chunky.
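[Editor's note: the Python script itself is not shown in the thread. A minimal sketch of what such a script might look like, emitting one ALTER TABLE ... ADD PARTITION statement per day; the table name, S3 location, and dt=YYYY-MM-DD layout are assumptions for illustration:]

```python
from datetime import date, timedelta

def add_partition_ddl(table, base_location, start, end):
    """Yield one ALTER TABLE ... ADD PARTITION statement per day in
    [start, end]. Assumes data lives under <base_location>/dt=YYYY-MM-DD/."""
    day = start
    while day <= end:
        dt = day.isoformat()
        yield ("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (dt='%s') "
               "LOCATION '%s/dt=%s';" % (table, dt, base_location, dt))
        day += timedelta(days=1)

if __name__ == "__main__":
    # Print the statements; redirect to a file and run with: hive -f add_parts.hql
    for stmt in add_partition_ddl("XXX", "s3://my-location/data",
                                  date(2012, 11, 1), date(2012, 11, 5)):
        print(stmt)
```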

On Mon, Nov 5, 2012 at 11:08 PM, Mark Grover wrote:

> Chunky,
> I have used "recover partitions" command on EMR, and that worked fine.
>
> However, take a look at https://issues.apache.org/jira/browse/HIVE-874. Seems
> like the msck command in Apache Hive does the same thing. Try it out and let us
> know how it goes.
>
> Mark
>
> On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo wrote:
>
>> Recover partitions should work the same way for different file systems.
>>
>> Edward
>>
>> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
>>  wrote:
>> > Writing a script to add the external partitions individually is the
>> only way
>> > I know of.
>> >
>> > Sent from my rotary phone.
>> >
>> >
>> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta 
>> wrote:
>> >
>> > Hi Dean,
>> >
>> > Actually I was having Hadoop and Hive cluster on EMR and I have S3
>> storage
>> > containing logs which updates daily and having partition with date(dt).
>> And
>> > I was using this recover partition.
>> > Now I wanted to shift to EC2 and have my own Hadoop and Hive cluster.
>> So,
>> > what is the alternate of using recover partition in this case, if you
>> have
>> > any idea ?
>> > I found one way of individually partitioning all dates, so I have to
>> write
>> > script for that to do so for all dates. Is there any easiest way other
>> than
>> > this ?
>> >
>> > Thanks,
>> > Chunky
>> >
>> >
>> >
>> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler
>> >  wrote:
>> >>
>> >> The RECOVER PARTITIONS is an enhancement added by Amazon to their
>> version
>> >> of Hive.
>> >>
>> >>
>> >>
>> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>> >>
>> >> 
>> >>   Chapter 21 of Programming Hive discusses this feature and other
>> aspects
>> >> of using Hive in EMR.
>> >> 
>> >>
>> >> dean
>> >>
>> >>
>> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta 
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
>> >>> version 0.8.1 (I configured everything) . I have created a table
>> using :-
>> >>>
>> >>> CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT
>> >>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION
>> 's3://my-location/data/';
>> >>>
>> >>> Now I am trying to recover partition using :-
>> >>>
>> >>> ALTER TABLE XXX RECOVER PARTITIONS;
>> >>>
>> >>> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot
>> >>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table
>> statement"
>> >>>
>> >>> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3
>> and
>> >>> Hive version 0.8.1 (Configured by EMR), works fine.
>> >>>
>> >>> So is this a version issue or am I missing some configuration changes
>> in
>> >>> EC2 setup ?
>> >>> I am not able to find exact solution for this problem on internet.
>> Please
>> >>> help me.
>> >>>
>> >>> Thanks,
>> >>> Chunky.
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Dean Wampler, Ph.D.
>> >> thinkbiganalytics.com
>> >> +1-312-339-1330
>> >>
>> >>
>> >
>>
>
>


Re: Alter table is giving error

2012-11-05 Thread Mark Grover
Chunky,
I have used the "recover partitions" command on EMR, and that worked fine.

However, take a look at https://issues.apache.org/jira/browse/HIVE-874. Seems
like the msck command in Apache Hive does the same thing. Try it out and let us
know how it goes.

Mark

On Mon, Nov 5, 2012 at 7:56 AM, Edward Capriolo wrote:

> Recover partitions should work the same way for different file systems.
>
> Edward
>
> On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
>  wrote:
> > Writing a script to add the external partitions individually is the only
> way
> > I know of.
> >
> > Sent from my rotary phone.
> >
> >
> > On Nov 5, 2012, at 8:19 AM, Chunky Gupta 
> wrote:
> >
> > Hi Dean,
> >
> > Actually I was having Hadoop and Hive cluster on EMR and I have S3
> storage
> > containing logs which updates daily and having partition with date(dt).
> And
> > I was using this recover partition.
> > Now I wanted to shift to EC2 and have my own Hadoop and Hive cluster. So,
> > what is the alternate of using recover partition in this case, if you
> have
> > any idea ?
> > I found one way of individually partitioning all dates, so I have to
> write
> > script for that to do so for all dates. Is there any easiest way other
> than
> > this ?
> >
> > Thanks,
> > Chunky
> >
> >
> >
> > On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler
> >  wrote:
> >>
> >> The RECOVER PARTITIONS is an enhancement added by Amazon to their
> version
> >> of Hive.
> >>
> >>
> >>
> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
> >>
> >> 
> >>   Chapter 21 of Programming Hive discusses this feature and other
> aspects
> >> of using Hive in EMR.
> >> 
> >>
> >> dean
> >>
> >>
> >> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta 
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
> >>> version 0.8.1 (I configured everything) . I have created a table using
> :-
> >>>
> >>> CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT
> >>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
> >>>
> >>> Now I am trying to recover partition using :-
> >>>
> >>> ALTER TABLE XXX RECOVER PARTITIONS;
> >>>
> >>> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot
> >>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table
> statement"
> >>>
> >>> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3
> and
> >>> Hive version 0.8.1 (Configured by EMR), works fine.
> >>>
> >>> So is this a version issue or am I missing some configuration changes
> in
> >>> EC2 setup ?
> >>> I am not able to find exact solution for this problem on internet.
> Please
> >>> help me.
> >>>
> >>> Thanks,
> >>> Chunky.
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Dean Wampler, Ph.D.
> >> thinkbiganalytics.com
> >> +1-312-339-1330
> >>
> >>
> >
>


Re: Alter table is giving error

2012-11-05 Thread Edward Capriolo
Recover partitions should work the same way for different file systems.

Edward

On Mon, Nov 5, 2012 at 9:33 AM, Dean Wampler
 wrote:
> Writing a script to add the external partitions individually is the only way
> I know of.
>
> Sent from my rotary phone.
>
>
> On Nov 5, 2012, at 8:19 AM, Chunky Gupta  wrote:
>
> Hi Dean,
>
> Actually I was having Hadoop and Hive cluster on EMR and I have S3 storage
> containing logs which updates daily and having partition with date(dt). And
> I was using this recover partition.
> Now I wanted to shift to EC2 and have my own Hadoop and Hive cluster. So,
> what is the alternate of using recover partition in this case, if you have
> any idea ?
> I found one way of individually partitioning all dates, so I have to write
> script for that to do so for all dates. Is there any easiest way other than
> this ?
>
> Thanks,
> Chunky
>
>
>
> On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler
>  wrote:
>>
>> The RECOVER PARTITIONS is an enhancement added by Amazon to their version
>> of Hive.
>>
>>
>> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>>
>> 
>>   Chapter 21 of Programming Hive discusses this feature and other aspects
>> of using Hive in EMR.
>> 
>>
>> dean
>>
>>
>> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta 
>> wrote:
>>>
>>> Hi,
>>>
>>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
>>> version 0.8.1 (I configured everything) . I have created a table using :-
>>>
>>> CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT
>>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
>>>
>>> Now I am trying to recover partition using :-
>>>
>>> ALTER TABLE XXX RECOVER PARTITIONS;
>>>
>>> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot
>>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"
>>>
>>> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3 and
>>> Hive version 0.8.1 (Configured by EMR), works fine.
>>>
>>> So is this a version issue or am I missing some configuration changes in
>>> EC2 setup ?
>>> I am not able to find exact solution for this problem on internet. Please
>>> help me.
>>>
>>> Thanks,
>>> Chunky.
>>>
>>>
>>>
>>
>>
>>
>> --
>> Dean Wampler, Ph.D.
>> thinkbiganalytics.com
>> +1-312-339-1330
>>
>>
>


Re: Alter table is giving error

2012-11-05 Thread Dean Wampler
Writing a script to add the external partitions individually is the only way I 
know of. 

Sent from my rotary phone. 


On Nov 5, 2012, at 8:19 AM, Chunky Gupta  wrote:

> Hi Dean,
> 
> Actually I was having Hadoop and Hive cluster on EMR and I have S3 storage 
> containing logs which updates daily and having partition with date(dt). And I 
> was using this recover partition.
> Now I wanted to shift to EC2 and have my own Hadoop and Hive cluster. So, 
> what is the alternate of using recover partition in this case, if you have 
> any idea ? 
> I found one way of individually partitioning all dates, so I have to write 
> script for that to do so for all dates. Is there any easiest way other than 
> this ?
> 
> Thanks,
> Chunky
> 
> 
> 
> On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler 
>  wrote:
>> The RECOVER PARTITIONS is an enhancement added by Amazon to their version of 
>> Hive.
>> 
>> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>> 
>> 
>>   Chapter 21 of Programming Hive discusses this feature and other aspects of 
>> using Hive in EMR.
>> 
>> 
>> dean
>> 
>> 
>> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta  wrote:
>>> Hi,
>>> 
>>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive 
>>> version 0.8.1 (I configured everything) . I have created a table using :-
>>> 
>>> CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT DELIMITED 
>>> FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
>>> 
>>> Now I am trying to recover partition using :-
>>> 
>>> ALTER TABLE XXX RECOVER PARTITIONS;
>>> 
>>> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot 
>>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"
>>> 
>>> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3 and 
>>> Hive version 0.8.1 (Configured by EMR), works fine.
>>> 
>>> So is this a version issue or am I missing some configuration changes in 
>>> EC2 setup ?
>>> I am not able to find exact solution for this problem on internet. Please 
>>> help me.
>>> 
>>> Thanks,
>>> Chunky.
>> 
>> 
>> 
>> -- 
>> Dean Wampler, Ph.D.
>> thinkbiganalytics.com
>> +1-312-339-1330
> 


Re: Alter table is giving error

2012-11-05 Thread Chunky Gupta
Hi Dean,

Actually, I was running a Hadoop and Hive cluster on EMR, with S3 storage
containing logs that update daily, partitioned by date (dt), and I was
using this recover-partitions feature there.
Now I want to shift to EC2 and run my own Hadoop and Hive cluster. So
what is the alternative to recover partitions in this case, if you have
any idea?
I found one way, adding the partitions for all dates individually, but I
would have to write a script to do that for every date. Is there any
easier way than this?

Thanks,
Chunky



On Mon, Nov 5, 2012 at 6:28 PM, Dean Wampler <
dean.wamp...@thinkbiganalytics.com> wrote:

> The RECOVER PARTITIONS is an enhancement added by Amazon to their version
> of Hive.
>
>
> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html
>
> 
>   Chapter 21 of Programming Hive discusses this feature and other aspects
> of using Hive in EMR.
> 
>
> dean
>
>
> On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta wrote:
>
>> Hi,
>>
>> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
>> version 0.8.1 (I configured everything) . I have created a table using :-
>>
>> CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT
>> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
>>
>> Now I am trying to recover partition using :-
>>
>> ALTER TABLE XXX RECOVER PARTITIONS;
>>
>> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot
>> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"
>>
>> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3 and
>> Hive version 0.8.1 (Configured by EMR), works fine.
>>
>> So is this a version issue or am I missing some configuration changes in
>> EC2 setup ?
>> I am not able to find exact solution for this problem on internet. Please
>> help me.
>>
>> Thanks,
>> Chunky.
>>
>>
>>
>>
>
>
> --
> *Dean Wampler, Ph.D.*
> thinkbiganalytics.com
> +1-312-339-1330
>
>
>


Re: Alter table is giving error

2012-11-05 Thread Dean Wampler
RECOVER PARTITIONS is an enhancement added by Amazon to their version
of Hive.

http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html


  Chapter 21 of Programming Hive discusses this feature and other aspects
of using Hive in EMR.


dean
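
[Editor's note, a hedged aside: on stock Apache Hive, where this EMR-only syntax fails to parse, the closest equivalent appears to be MSCK REPAIR TABLE (see HIVE-874, mentioned elsewhere in this thread). A small Python sketch that emits the appropriate DDL per Hive flavor; the table name is illustrative:]

```python
# Hedged sketch: partition-discovery DDL differs by Hive flavor.
# "ALTER TABLE ... RECOVER PARTITIONS" parses only on Amazon EMR's Hive;
# stock Apache Hive uses "MSCK REPAIR TABLE" instead.
def recover_partitions_ddl(table, on_emr):
    if on_emr:
        return "ALTER TABLE %s RECOVER PARTITIONS;" % table
    return "MSCK REPAIR TABLE %s;" % table

print(recover_partitions_ddl("XXX", on_emr=False))
```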

On Mon, Nov 5, 2012 at 5:34 AM, Chunky Gupta wrote:

> Hi,
>
> I am having a cluster setup on EC2 with Hadoop version 0.20.2 and Hive
> version 0.8.1 (I configured everything) . I have created a table using :-
>
> CREATE EXTERNAL TABLE XXX ( YYY )PARTITIONED BY ( ZZZ )ROW FORMAT
> DELIMITED FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';
>
> Now I am trying to recover partition using :-
>
> ALTER TABLE XXX RECOVER PARTITIONS;
>
> but I am getting this error :- "FAILED: Parse Error: line 1:12 cannot
> recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"
>
> Doing same steps on a cluster setup on EMR with Hadoop version 1.0.3 and
> Hive version 0.8.1 (Configured by EMR), works fine.
>
> So is this a version issue or am I missing some configuration changes in
> EC2 setup ?
> I am not able to find exact solution for this problem on internet. Please
> help me.
>
> Thanks,
> Chunky.
>
>
>
>


-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330


Alter table is giving error

2012-11-05 Thread Chunky Gupta
Hi,

I have a cluster set up on EC2 with Hadoop version 0.20.2 and Hive
version 0.8.1 (I configured everything). I have created a table using:

CREATE EXTERNAL TABLE XXX ( YYY ) PARTITIONED BY ( ZZZ ) ROW FORMAT DELIMITED
FIELDS TERMINATED BY 'WWW' LOCATION 's3://my-location/data/';

Now I am trying to recover the partitions using:

ALTER TABLE XXX RECOVER PARTITIONS;

but I am getting this error: "FAILED: Parse Error: line 1:12 cannot
recognize input near 'XXX' 'RECOVER' 'PARTITIONS' in alter table statement"

Doing the same steps on a cluster set up on EMR with Hadoop version 1.0.3 and
Hive version 0.8.1 (configured by EMR) works fine.

So is this a version issue, or am I missing some configuration changes in the
EC2 setup?
I am not able to find an exact solution for this problem on the internet.
Please help me.

Thanks,
Chunky.