The Hive logs go into /tmp/$USER/hive.log, not hive_job_log*.txt.

________________________________
From: Bill Graham <[email protected]>
Reply-To: <[email protected]>
Date: Thu, 30 Jul 2009 10:52:06 -0700
To: Prasad Chakka <[email protected]>
Cc: <[email protected]>, Zheng Shao <[email protected]>
Subject: Re: partitions not being created
I'm trying to set a string to a string and I'm seeing this error. I also had an attempt where it was a string to an int, and I saw the same error. The /tmp/$USER/hive_job_log*.txt file doesn't contain any exceptions, but I've included its output below. Only the Hive server logs show the exceptions listed above. (Note that the table I'm loading from in this log output is ApiUsageSmall, which is identical to ApiUsageTemp. For some reason the data from ApiUsageTemp is now gone.)

QueryStart QUERY_STRING="INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = "20090518") SELECT `(requestDate)?+.+` FROM ApiUsageSmall WHERE requestDate = '2009/05/18'" QUERY_ID="app_20090730104242" TIME="1248975752235"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TIME="1248975752235"
TaskProgress TASK_HADOOP_PROGRESS="2009-07-30 10:42:34,783 map = 0%, reduce =0%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="Job Counters .Launched map tasks:1,Job Counters .Data-local map tasks:1" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" TIME="1248975754785"
TaskProgress ROWS_INSERTED="apiusage~296" TASK_HADOOP_PROGRESS="2009-07-30 10:42:43,031 map = 40%, reduce =0%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File Systems.HDFS bytes read:23019,File Systems.HDFS bytes written:19178,Job Counters .Rack-local map tasks:2,Job Counters .Launched map tasks:5,Job Counters .Data-local map tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:592,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:6,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:296,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce Framework.Map input records:302,Map-Reduce Framework.Map input bytes:23019,Map-Reduce Framework.Map output records:0" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" TIME="1248975763033"
TaskProgress ROWS_INSERTED="apiusage~1471" TASK_HADOOP_PROGRESS="2009-07-30 10:42:44,068 map = 100%, reduce =100%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File Systems.HDFS bytes read:114068,File Systems.HDFS bytes written:95275,Job Counters .Rack-local map tasks:2,Job Counters .Launched map tasks:5,Job Counters .Data-local map tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:2942,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:27,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:1471,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce Framework.Map input records:1498,Map-Reduce Framework.Map input bytes:114068,Map-Reduce Framework.Map output records:0" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" TIME="1248975764071"
TaskEnd ROWS_INSERTED="apiusage~1471" TASK_RET_CODE="0" TASK_HADOOP_PROGRESS="2009-07-30 10:42:44,068 map = 100%, reduce =100%" TASK_NAME="org.apache.hadoop.hive.ql.exec.ExecDriver" TASK_COUNTERS="File Systems.HDFS bytes read:114068,File Systems.HDFS bytes written:95275,Job Counters .Rack-local map tasks:2,Job Counters .Launched map tasks:5,Job Counters .Data-local map tasks:3,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.PASSED:2942,org.apache.hadoop.hive.ql.exec.FilterOperator$Counter.FILTERED:27,org.apache.hadoop.hive.ql.exec.FileSinkOperator$TableIdEnum.TABLE_ID_1_ROWCOUNT:1471,org.apache.hadoop.hive.ql.exec.MapOperator$Counter.DESERIALIZE_ERRORS:0,Map-Reduce Framework.Map input records:1498,Map-Reduce Framework.Map input bytes:114068,Map-Reduce Framework.Map output records:0" TASK_ID="Stage-1" QUERY_ID="app_20090730104242" TASK_HADOOP_ID="job_200906301559_0409" TIME="1248975764199"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.ConditionalTask" TASK_ID="Stage-4" QUERY_ID="app_20090730104242" TIME="1248975764199"
TaskEnd TASK_RET_CODE="0" TASK_NAME="org.apache.hadoop.hive.ql.exec.ConditionalTask" TASK_ID="Stage-4" QUERY_ID="app_20090730104242" TIME="1248975782277"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" QUERY_ID="app_20090730104242" TIME="1248975782277"
TaskEnd TASK_RET_CODE="1" TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" QUERY_ID="app_20090730104242" TIME="1248975782473"
QueryEnd ROWS_INSERTED="apiusage~1471" QUERY_STRING="INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = "20090518") SELECT `(requestDate)?+.+` FROM ApiUsageSmall WHERE requestDate = '2009/05/18'" QUERY_ID="app_20090730104242" QUERY_NUM_TASKS="2" TIME="1248975782474"

On Thu, Jul 30, 2009 at 10:09 AM, Prasad Chakka <[email protected]> wrote:

Are you sure you are getting the same error even with the schema below (i.e. trying to set a string to an int column)? Can you give the full stack trace that you might see in /tmp/$USER/hive.log?

________________________________
From: Bill Graham <[email protected]>
Reply-To: <[email protected]>, <[email protected]>
Date: Thu, 30 Jul 2009 10:02:54 -0700
To: Zheng Shao <[email protected]>
Cc: <[email protected]>
Subject: Re: partitions not being created

Based on these describe statements, is what I'm trying to do feasible? I'm basically trying to load the contents of ApiUsageTemp into ApiUsage, with the ApiUsageTemp.requestdate column becoming the ApiUsage.dt partition.

On Wed, Jul 29, 2009 at 9:28 AM, Bill Graham <[email protected]> wrote:

Sure.
The only difference I see is that ApiUsage has a dt partition instead of the requestdate column:

hive> describe extended ApiUsage;
OK
user            string
restresource    string
statuscode      int
requesthour     int
numrequests     string
responsetime    string
numslowrequests string
dt              string

Detailed Table Information    Table(tableName:apiusage, dbName:default, owner:grahamb, createTime:1248884801, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:user, type:string, comment:null), FieldSchema(name:restresource, type:string, comment:null), FieldSchema(name:statuscode, type:int, comment:null), FieldSchema(name:requesthour, type:int, comment:null), FieldSchema(name:numrequests, type:string, comment:null), FieldSchema(name:responsetime, type:string, comment:null), FieldSchema(name:numslowrequests, type:string, comment:null)], location:hdfs://xxxxxxx:9000/user/hive/warehouse/apiusage, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{field.delim= , serialization.format= }), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:dt, type:string, comment:null)], parameters:{})
Time taken: 0.277 seconds

hive> describe extended ApiUsageTemp;
OK
user            string
restresource    string
statuscode      int
requestdate     string
requesthour     int
numrequests     string
responsetime    string
numslowrequests string

Detailed Table Information    Table(tableName:apiusagetemp, dbName:default, owner:grahamb, createTime:1248466925, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:user, type:string, comment:null), FieldSchema(name:restresource, type:string, comment:null), FieldSchema(name:statuscode, type:int, comment:null), FieldSchema(name:requestdate, type:string, comment:null), FieldSchema(name:requesthour, type:int, comment:null), FieldSchema(name:numrequests, type:string, comment:null), FieldSchema(name:responsetime, type:string, comment:null), FieldSchema(name:numslowrequests, type:string, comment:null)], location:hdfs://xxxxxxx:9000/user/hive/warehouse/apiusage, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{field.delim= , serialization.format= }), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], parameters:{last_modified_time=1248826696, last_modified_by=app})
Time taken: 0.235 seconds

On Tue, Jul 28, 2009 at 9:03 PM, Zheng Shao <[email protected]> wrote:

Can you send the output of these 2 commands?

describe extended ApiUsage;
describe extended ApiUsageTemp;

Zheng

On Tue, Jul 28, 2009 at 6:29 PM, Bill Graham <[email protected]> wrote:
> Thanks for the tip, but it fails in the same way when I use a string.
>
> On Tue, Jul 28, 2009 at 6:21 PM, David Lerman <[email protected]> wrote:
>>
>> >> hive> create table partTable (a string, b int) partitioned by (dt int);
>>
>> > INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = "20090518")
>> > SELECT `(requestDate)?+.+` FROM ApiUsageTemp WHERE requestDate =
>> > '2009/05/18'
>>
>> The table has an int partition column (dt), but you're trying to set a
>> string value (dt = "20090518").
>>
>> Try:
>>
>> create table partTable (a string, b int) partitioned by (dt string);
>>
>> and then do your insert.
>>
>
>

--
Yours,
Zheng
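[List archive note] The pattern discussed in this thread can be sketched end to end. This is a minimal HiveQL sketch, not the exact DDL from the thread: it assumes the column names shown in the describe output, a string-typed partition column (so the string literal in the PARTITION clause matches), and the static-partitioning model of Hive at the time, where each INSERT targets one explicit partition value.

```sql
-- The partition column's type must match the value in the PARTITION clause:
-- dt is declared STRING, so dt = '20090518' (a string literal) is valid.
CREATE TABLE ApiUsage (
  user            STRING,
  restResource    STRING,
  statusCode      INT,
  requestHour     INT,
  numRequests     STRING,
  responseTime    STRING,
  numSlowRequests STRING
)
PARTITIONED BY (dt STRING);

-- One static-partition insert per date value. The backquoted regex column
-- spec selects every column except requestDate; the partition value itself
-- comes from the PARTITION clause, not from the selected rows, so the WHERE
-- clause must restrict the input to the matching date.
INSERT OVERWRITE TABLE ApiUsage PARTITION (dt = '20090518')
SELECT `(requestDate)?+.+` FROM ApiUsageTemp
WHERE requestDate = '2009/05/18';
```

Mapping a source column such as requestdate onto the partition automatically would require one such INSERT per distinct date value under this model.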
