This is not a backward compatibility issue. Check HIVE-592 for details. Before 
this patch, a rename didn't change the name of the HDFS directory, so if you 
created a new table with the old name of the renamed table, both tables would 
point to the same directory, causing problems. HIVE-592 fixes this by renaming 
the directory correctly. So if you created all of your tables after the 
HIVE-592 patch went in, you should be fine.
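
As an illustration, a minimal sketch of the pre-HIVE-592 sequence that 
triggers the problem (table names here are hypothetical):

    ALTER TABLE old_tbl RENAME TO new_tbl;  -- before HIVE-592, the HDFS directory stayed at .../old_tbl
    CREATE TABLE old_tbl (a STRING);        -- the new table reuses that same .../old_tbl directory

From then on, both tables read and write the same files.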


________________________________
From: Bill Graham <billgra...@gmail.com>
Reply-To: <billgra...@gmail.com>
Date: Thu, 30 Jul 2009 13:09:03 -0700
To: Prasad Chakka <pcha...@facebook.com>
Cc: <hive-user@hadoop.apache.org>
Subject: Re: partitions not being created

I sent my last reply before seeing your latest email.

Thanks, that seems possible. I did initially create ApiUsageTemp using the most 
recent Hive release. Then, while working on a JIRA, I updated my Hive client and 
server to more recent builds from trunk.

If that could cause such a problem, though, it's troubling, since it implies 
that we can't upgrade Hive without possibly corrupting our metadata store.

I'll try again from scratch, though, and see if it works. Thanks.


On Thu, Jul 30, 2009 at 1:04 PM, Bill Graham <billgra...@gmail.com> wrote:
Prasad,

My setup is Hive client -> Hive Server (with local metastore) -> Hadoop. I was 
also suspecting metastore issues, so I've tried multiple times with newly 
created destination tables and I see the same thing happening.

I've already included in this thread all of the log info I've been able to 
find. Let me know if there's anywhere else I could look for clues.

I've included from the client:
- /tmp/$USER/hive.log

And from the Hive server:
- Stdout/err logs
- /tmp/$USER/hive_job_log*.txt

Is there anything else I should be looking at? None of the M/R logs show 
any exceptions or anything suspect.

Thanks for your time and insights on this issue, I appreciate it.

thanks,
Bill


On Thu, Jul 30, 2009 at 11:57 AM, Prasad Chakka <pcha...@facebook.com> wrote:
Bill,

The real error is happening on the Hive Metastore Server or the Hive Server 
(depending on the setup you are using). The error logs there should have a 
different stack trace. From the information below, I am guessing that the 
destination table's HDFS directories got created with some problems. Can you 
drop that table (and make sure that there is no corresponding HDFS directory 
for either the integer or the string type partitions that you created) and 
retry the query?
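
For example (a sketch; the warehouse path follows the table locations shown 
later in this thread):

    hive> DROP TABLE ApiUsage;
    $ hadoop fs -ls /user/hive/warehouse/apiusage   # should now report that the path does not exist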

If you don't want to drop the destination table then send me the logs on Hive 
Server.

Prasad



________________________________
From: Bill Graham <billgra...@gmail.com>
Reply-To: <billgra...@gmail.com>
Date: Thu, 30 Jul 2009 11:47:41 -0700

To: Prasad Chakka <pcha...@facebook.com>
Cc: <hive-user@hadoop.apache.org>
Subject: Re: partitions not being created

That file contains a similar error to the one in the Hive Server logs:

2009-07-30 11:44:21,095 WARN  mapred.JobClient 
(JobClient.java:configureCommandLineOptions(510)) - Use GenericOptionsParser 
for parsing the arguments. Applications should implement Tool for the same.
2009-07-30 11:44:48,070 WARN  mapred.JobClient 
(JobClient.java:configureCommandLineOptions(510)) - Use GenericOptionsParser 
for parsing the arguments. Applications should implement Tool for the same.
2009-07-30 11:45:27,796 ERROR metadata.Hive (Hive.java:getPartition(588)) - 
org.apache.thrift.TApplicationException: get_partition failed: unknown result
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:784)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:752)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:415)
        at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:579)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:466)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:335)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:241)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:122)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:165)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:258)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

2009-07-30 11:45:27,797 ERROR exec.MoveTask (SessionState.java:printError(279)) 
- Failed with exception org.apache.thrift.TApplicationException: get_partition 
failed: unknown result
org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.thrift.TApplicationException: get_partition failed: unknown result
        at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:589)
        at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:466)
        at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:335)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:241)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:122)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:165)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:258)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Caused by: org.apache.thrift.TApplicationException: get_partition failed: 
unknown result
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:784)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:752)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:415)
        at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:579)
        ... 16 more

2009-07-30 11:45:27,798 ERROR ql.Driver (SessionState.java:printError(279)) - 
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MoveTask

On Thu, Jul 30, 2009 at 11:33 AM, Prasad Chakka <pcha...@facebook.com> wrote:

The Hive logs go into /tmp/$USER/hive.log, not hive_job_log*.txt.


________________________________
From: Bill Graham <billgra...@gmail.com>
Reply-To: <billgra...@gmail.com>

Date: Thu, 30 Jul 2009 10:52:06 -0700
To: Prasad Chakka <pcha...@facebook.com>
Cc: <hive-user@hadoop.apache.org>, Zheng Shao <zsh...@gmail.com>


Subject: Re: partitions not being created

I'm trying to set a string to a string and I'm seeing this error. I also made 
an attempt where it was a string to an int, and I saw the same error.

The /tmp/$USER/hive_job_log*.txt file doesn't contain any exceptions, but I've 
included its output below. Only the Hive Server logs show the exceptions 
listed above. (Note that the table I'm loading from in this log output is 
ApiUsageSmall, which is identical to ApiUsageTemp. For some reason the data 
from ApiUsageTemp is now gone.)

QueryStart QUERY_STRING="INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = 
"20090518") SELECT `(requestDate)?+.+` FROM ApiUsageSmall WHERE requestDate = 
'2009/05/18'" QUERY_ID="app_20090730104242" TIME="1248975752235"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" 
TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TIME="1248975752235"
TaskProgress TASK_HADOOP_PROGRESS="2009-07-30 10:42:34,783 map = 0%,  reduce 
=0%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="Job 
Counters .Launched map tasks:1,Job Counters .Data-local map tasks:1" 
TASK_ID="Stage-1" QUERY_ID="app_20090730104242" 
TASK_HADOOP_ID="job_200906301559_0409" TIME="1248975754785"
TaskProgress ROWS_INSERTED="apiusage~296" TASK_HADOOP_PROGRESS="2009-07-30 
10:42:43,031 map = 40%,  reduce =0%" 
TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File 
Systems.HDFS bytes read:23019,File Systems.HDFS bytes written:19178,Job 
Counters .Rack-local map tasks:2,Job Counters .Launched map tasks:5,Job 
Counters .Data-local map 
tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:592,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:6,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:296,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce
 Framework.Map input records:302,Map-Reduce Framework.Map input 
bytes:23019,Map-Reduce Framework.Map output records:0" TASK_ID="Stage-1" 
QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" 
TIME="1248975763033"
TaskProgress ROWS_INSERTED="apiusage~1471" TASK_HADOOP_PROGRESS="2009-07-30 
10:42:44,068 map = 100%,  reduce =100%" 
TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File 
Systems.HDFS bytes read:114068,File Systems.HDFS bytes written:95275,Job 
Counters .Rack-local map tasks:2,Job Counters .Launched map tasks:5,Job 
Counters .Data-local map 
tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:2942,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:27,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:1471,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce
 Framework.Map input records:1498,Map-Reduce Framework.Map input 
bytes:114068,Map-Reduce Framework.Map output records:0" TASK_ID="Stage-1" 
QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" 
TIME="1248975764071"
TaskEnd ROWS_INSERTED="apiusage~1471" TASK_RET_CODE="0" 
TASK_HADOOP_PROGRESS="2009-07-30 10:42:44,068 map = 100%,  reduce =100%" 
TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File 
Systems.HDFS bytes read:114068,File Systems.HDFS bytes written:95275,Job 
Counters .Rack-local map tasks:2,Job Counters .Launched map tasks:5,Job 
Counters .Data-local map 
tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:2942,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:27,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:1471,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce
 Framework.Map input records:1498,Map-Reduce Framework.Map input 
bytes:114068,Map-Reduce Framework.Map output records:0" TASK_ID="Stage-1" 
QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" 
TIME="1248975764199"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.ConditionalTask" 
TASK_ID="Stage-4" QUERY_ID="app_20090730104242" TIME="1248975764199"
TaskEnd TASK_RET_CODE="0" 
TASK_NAME="org.apache.hadoop.hive.ql.exec.ConditionalTask" TASK_ID="Stage-4" 
QUERY_ID="app_20090730104242" TIME="1248975782277"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" 
QUERY_ID="app_20090730104242" TIME="1248975782277"
TaskEnd TASK_RET_CODE="1" TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" 
TASK_ID="Stage-0" QUERY_ID="app_20090730104242" TIME="1248975782473"
QueryEnd ROWS_INSERTED="apiusage~1471" QUERY_STRING="INSERT OVERWRITE TABLE 
ApiUsage PARTITION (dt = "20090518") SELECT `(requestDate)?+.+` FROM 
ApiUsageSmall WHERE requestDate = '2009/05/18'" QUERY_ID="app_20090730104242" 
QUERY_NUM_TASKS="2" TIME="1248975782474"



On Thu, Jul 30, 2009 at 10:09 AM, Prasad Chakka <pcha...@facebook.com> wrote:
Are you sure you are getting the same error even with the schema below (i.e. 
trying to set a string to an int column)? Can you give the full stack trace 
that you might see in /tmp/$USER/hive.log?


________________________________
From: Bill Graham <billgra...@gmail.com>
Reply-To: <hive-user@hadoop.apache.org>, <billgra...@gmail.com>

Date: Thu, 30 Jul 2009 10:02:54 -0700
To: Zheng Shao <zsh...@gmail.com>
Cc: <hive-user@hadoop.apache.org>


Subject: Re: partitions not being created


Based on these describe statements, is what I'm trying to do feasible? I'm 
basically trying to load the contents of ApiUsageTemp into ApiUsage, with the 
ApiUsageTemp.requestdate column becoming the ApiUsage.dt partition.
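
Concretely, what I'm running is the static-partition insert below (the same 
statement as in the job log quoted earlier, with the dt value as a string; the 
backtick expression selects every column except requestDate):

    INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = '20090518')
    SELECT `(requestDate)?+.+` FROM ApiUsageTemp
    WHERE requestDate = '2009/05/18';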


On Wed, Jul 29, 2009 at 9:28 AM, Bill Graham <billgra...@gmail.com> wrote:
Sure. The only difference I see is that ApiUsage has a dt partition instead of 
the requestdate column:

hive> describe extended ApiUsage;
OK
user    string
restresource    string
statuscode      int
requesthour     int
numrequests     string
responsetime    string
numslowrequests string
dt      string

Detailed Table Information      Table(tableName:apiusage, dbName:default, 
owner:grahamb, createTime:1248884801, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:user, type:string, comment:null), 
FieldSchema(name:restresource, type:string, comment:null), 
FieldSchema(name:statuscode, type:int, comment:null), 
FieldSchema(name:requesthour, type:int, comment:null), 
FieldSchema(name:numrequests, type:string, comment:null), 
FieldSchema(name:responsetime, type:string, comment:null), 
FieldSchema(name:numslowrequests, type:string, comment:null)], 
location:hdfs://xxxxxxx:9000/user/hive/warehouse/apiusage, 
inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{field.delim= , serialization.format= }), bucketCols:[], 
sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:dt, type:string, 
comment:null)], parameters:{})

Time taken: 0.277 seconds
hive> describe extended ApiUsageTemp;
OK
user    string
restresource    string
statuscode      int
requestdate     string
requesthour     int
numrequests     string
responsetime    string
numslowrequests string

Detailed Table Information      Table(tableName:apiusagetemp, dbName:default, 
owner:grahamb, createTime:1248466925, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:user, type:string, comment:null), 
FieldSchema(name:restresource, type:string, comment:null), 
FieldSchema(name:statuscode, type:int, comment:null), 
FieldSchema(name:requestdate, type:string, comment:null), 
FieldSchema(name:requesthour, type:int, comment:null), 
FieldSchema(name:numrequests, type:string, comment:null), 
FieldSchema(name:responsetime, type:string, comment:null), 
FieldSchema(name:numslowrequests, type:string, comment:null)], 
location:hdfs://xxxxxxx:9000/user/hive/warehouse/apiusage, 
inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{field.delim= , serialization.format= }), bucketCols:[], 
sortCols:[], parameters:{}), partitionKeys:[], 
parameters:{last_modified_time=1248826696, last_modified_by=app})

Time taken: 0.235 seconds



On Tue, Jul 28, 2009 at 9:03 PM, Zheng Shao <zsh...@gmail.com> wrote:
Can you send the output of these 2 commands?

describe extended ApiUsage;
describe extended ApiUsageTemp;


Zheng

On Tue, Jul 28, 2009 at 6:29 PM, Bill Graham <billgra...@gmail.com> wrote:
> Thanks for the tip, but it fails in the same way when I use a string.
>
> On Tue, Jul 28, 2009 at 6:21 PM, David Lerman <dler...@videoegg.com> wrote:
>>
>> >> hive> create table partTable (a string, b int) partitioned by (dt int);
>>
>> > INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = "20090518")
>> > SELECT `(requestDate)?+.+` FROM ApiUsageTemp WHERE requestDate =
>> > '2009/05/18'
>>
>> The table has an int partition column (dt), but you're trying to set a
>> string value (dt = "20090518").
>>
>> Try :
>>
>> create table partTable (a string, b int) partitioned by (dt string);
>>
>> and then do your insert.
>>
>
>



--
Yours,
Zheng









