create table question
I use Hadoop 2.2.0 and Hive 0.13.0, and I want to create a table from an existing file. states.hql is as follows:

CREATE EXTERNAL TABLE states(abbreviation string, full_name string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 'tmp/states';

[hadoop@master ~]$ hadoop fs -ls
14/04/22 20:17:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2014-04-22 20:02 tmp
[hadoop@master ~]$ hadoop fs -put states.txt tmp/states
[hadoop@master ~]$ hadoop fs -ls tmp/states
14/04/22 20:17:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   2 hadoop supergroup        654 2014-04-22 20:02 tmp/states/states.txt

Then I execute states.hql:

[hadoop@master ~]$ hive -f states.hql
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in jar:file:/home/software/apache-hive-0.13.0-bin/lib/hive-common-0.13.0.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs://master:9000./tmp/states)

It raises the above error. Why does this happen, and how can I correct it? The Hive log shows:

2014-04-22 20:12:03,907 INFO [main]: exec.DDLTask (DDLTask.java:createTable(4074)) - Default to LazySimpleSerDe for table states
2014-04-22 20:12:05,147 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(624)) - 0: create_table: Table(tableName:states, dbName:default, owner:hadoop, createTime:1398222724, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:abbreviation, type:string, comment:null), FieldSchema(name:full_name, type:string, comment:null)], location:tmp/states, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format= , field.delim= }), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE}, viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)
2014-04-22 20:12:05,147 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(306)) - ugi=hadoop ip=unknown-ip-addr cmd=create_table: Table(tableName:states, dbName:default, owner:hadoop, createTime:1398222724, [same Table(...) descriptor as the previous line])
2014-04-22 20:12:05,196 ERROR [main]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - MetaException(message:java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs://master:9000./tmp/states
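A note on the failure above: the metastore resolves the relative LOCATION 'tmp/states' against the filesystem URI hdfs://master:9000, which yields the malformed hdfs://master:9000./tmp/states seen in the error. LOCATION must be an absolute path or a full URI. A hedged sketch of the correction — the /user/hadoop prefix below is assumed, since the file was uploaded to a path relative to the hadoop user's HDFS home directory:

```sql
-- sketch: absolute LOCATION, assuming the upload landed in /user/hadoop/tmp/states
CREATE EXTERNAL TABLE states(abbreviation string, full_name string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hadoop/tmp/states';
-- or, equivalently, a full URI:
-- LOCATION 'hdfs://master:9000/user/hadoop/tmp/states';
```

Verify the exact absolute path first with `hadoop fs -ls /user/hadoop/tmp/states` before pointing the table at it.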
question about hive sql
I use Hive under Hadoop 2.2.0. First I start Hive:

[hadoop@master sbin]$ hive
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/04/21 19:06:32 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2128d0:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/04/21 19:06:32 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2128d0:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
Logging initialized using configuration in jar:file:/home/software/hive-0.11.0/lib/hive-common-0.11.0.jar!/hive-log4j.properties
Hive history file=/tmp/hadoop/hive_job_log_hadoop_7623@master_201404211906_2069310090.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/software/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/software/hive-0.11.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Then I create a table:

hive> create table test(id STRING);
OK
Time taken: 17.277 seconds

Then I load some data into test:

hive> load data inpath 'a.txt' overwrite into table test;
Loading data to table default.test
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted /user/hive/warehouse/test
Table default.test stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 19, raw_data_size: 0]
OK
Time taken: 1.855 seconds
hive> select * from test;
OK
China
US
Australia
Time taken: 0.526 seconds, Fetched: 3 row(s)

Now I run a count. I expected the result to be 3, but the query fails. Why? What is wrong? I have been puzzled by this for several days. Could anyone tell me how to correct it?

hive> select count(*) from test;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1398132272370_0001, Tracking URL = http://master:8088/proxy/application_1398132272370_0001/
Kill Command = /home/software/hadoop-2.2.0/bin/hadoop job -kill job_1398132272370_0001
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-04-21 19:15:56,684 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1398132272370_0001 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive>

Error information from http://172.11.12.6:8088/cluster/app/application_1398132272370_0001:

User: hadoop
Name: select count(*) from test(Stage-1)
Application Type: MAPREDUCE
State: FAILED
FinalStatus: FAILED
Started: 21-Apr-2014 19:14:55
Elapsed: 57sec
Tracking URL: History
Diagnostics: Application application_1398132272370_0001 failed 2 times due to AM Container for appattempt_1398132272370_0001_02 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.n
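On diagnosing this: exit code 1 from the ApplicationMaster container means the real cause is in the container logs, not on the Hive console. A first step might be to pull the aggregated logs for the failed application — a sketch, assuming YARN log aggregation is enabled, using the application ID from the run above:

```shell
yarn logs -applicationId application_1398132272370_0001
```

On Hadoop 2 an AM that dies immediately at launch often turns out to be a classpath or configuration mismatch (for example yarn.application.classpath in yarn-site.xml not matching the installed Hadoop layout), but the container log is what would confirm that here.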
question about hive under hadoop
I use hive-0.11.0 under hadoop 2.2.0, as follows:

[hadoop@node1 software]$ hive
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/04/16 19:11:03 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@17a9eb9:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/04/16 19:11:03 WARN conf.Configuration: org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@17a9eb9:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
Logging initialized using configuration in jar:file:/home/software/hive-0.11.0/lib/hive-common-0.11.0.jar!/hive-log4j.properties
Hive history file=/tmp/hadoop/hive_job_log_hadoop_4933@node1_201404161911_2112956781.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/software/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/software/hive-0.11.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Then I create a table named ufodata, as follows:

hive> CREATE TABLE ufodata(sighted STRING, reported STRING,
    > sighting_location STRING, shape STRING, duration STRING,
    > description STRING COMMENT 'Free text description')
    > COMMENT 'The UFO data set.';
OK
Time taken: 1.588 seconds
hive> LOAD DATA INPATH '/tmp/ufo.tsv' OVERWRITE INTO TABLE ufodata;
Loading data to table default.ufodata
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted /user/hive/warehouse/ufodata
Table default.ufodata stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 75342464, raw_data_size: 0]
OK
Time taken: 1.483 seconds

Then I try to count the rows of ufodata:

hive> select count(*) from ufodata;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1397699833108_0002, Tracking URL = http://master:8088/proxy/application_1397699833108_0002/
Kill Command = /home/software/hadoop-2.2.0/bin/hadoop job -kill job_1397699833108_0002

I have two questions:
1. Why does the above command fail? What is wrong, and how can I solve it?
2. I quit Hive and reboot the computer:
hive> quit;
$ reboot
Then, back in Hive, I run:
hive> describe ufodata;
Table not found 'ufodata'
Where is my table? I am puzzled by this.
How can I resolve the above two questions? Thanks.

---
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s) is intended only for the use of the intended recipient and may be confidential and/or privileged of Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of this communication is not the intended recipient, unauthorized use, forwarding, printing, storing, disclosure or copying is strictly prohibited, and may be unlawful. If you have received this communication in error, please immediately notify the sender by return e-mail, and delete the original message and all copies from your system. Thank you.
---
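On the second question: with Hive's default embedded Derby metastore, the connection URL is relative (jdbc:derby:;databaseName=metastore_db;create=true), so a metastore_db directory is created in whatever working directory the CLI was started from. Start hive from a different directory after the reboot and the old metastore is simply not found, which looks exactly like "Table not found". A sketch of a hive-site.xml fragment pinning the metastore to an absolute path (the /home/hadoop/hive path below is only an example):

```xml
<!-- sketch: absolute Derby database path so the metastore location
     does not depend on the CLI's working directory -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/hadoop/hive/metastore_db;create=true</value>
</property>
```

Alternatively, always launching hive from the same directory would have the same effect; for multi-user setups a remote metastore (e.g. MySQL-backed) is the usual answer.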
Re: Hive install under hadoop
Oh, I set the wrong HIVE_HOME; now I have corrected it and Hive runs. Thanks.

----- Original Message -----
From: Shengjun Xin
To: user@hive.apache.org
Sent: Monday, April 14, 2014 4:48 PM
Subject: Re: Hive install under hadoop

Did you install hadoop correctly?

On Mon, Apr 14, 2014 at 4:22 PM, EdwardKing wrote:
I want to use hive in hadoop 2.2.0, so I execute the following steps:
[hadoop@master /]$ tar -xzf hive-0.11.0.tar.gz
[hadoop@master /]$ export HIVE_HOME=/home/software/hive
[hadoop@master /]$ export PATH=${HIVE_HOME}/bin:${PATH}
[hadoop@master /]$ hadoop fs -mkdir /tmp
[hadoop@master /]$ hadoop fs -mkdir /user/hive/warehouse
[hadoop@master /]$ hadoop fs -chmod g+w /tmp
[hadoop@master /]$ hadoop fs -chmod g+w /user/hive/warehouse
[hadoop@master /]$ hive
Error creating temp dir in hadoop.tmp.dir file:/home/software/temp due to Permission denied
How can I make the hive installation succeed? Thanks.

--
Regards
Shengjun
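For the archives, a sketch of what the HIVE_HOME correction likely looked like: the tarball in the quoted steps extracts into a version-suffixed directory, while HIVE_HOME was set to the unsuffixed /home/software/hive (the hive-0.11.0 path below is inferred from the jar paths in later logs, e.g. /home/software/hive-0.11.0/lib/hive-common-0.11.0.jar):

```shell
# tar -xzf hive-0.11.0.tar.gz creates ./hive-0.11.0, not ./hive,
# so HIVE_HOME must point at the version-suffixed directory:
export HIVE_HOME=/home/software/hive-0.11.0
export PATH=${HIVE_HOME}/bin:${PATH}
```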
Hive install under hadoop
I want to use Hive in Hadoop 2.2.0, so I execute the following steps:

[hadoop@master /]$ tar -xzf hive-0.11.0.tar.gz
[hadoop@master /]$ export HIVE_HOME=/home/software/hive
[hadoop@master /]$ export PATH=${HIVE_HOME}/bin:${PATH}
[hadoop@master /]$ hadoop fs -mkdir /tmp
[hadoop@master /]$ hadoop fs -mkdir /user/hive/warehouse
[hadoop@master /]$ hadoop fs -chmod g+w /tmp
[hadoop@master /]$ hadoop fs -chmod g+w /user/hive/warehouse
[hadoop@master /]$ hive
Error creating temp dir in hadoop.tmp.dir file:/home/software/temp due to Permission denied

How can I make the Hive installation succeed? Thanks.
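Note on the error above: the "Permission denied" is on the local filesystem, not HDFS — hadoop.tmp.dir points at file:/home/software/temp, which the user running hive cannot create or write. A minimal sketch of the kind of check and fix involved; the demo path below is a stand-in, and the real directory (/home/software/temp, from the error message) would typically also need chown to the hadoop user:

```shell
# stand-in for the hadoop.tmp.dir directory from the error message
dir=/tmp/demo_hadoop_tmp
# create it and grant write permission to owner and group
mkdir -p "$dir"
chmod 775 "$dir"
# confirm the current user can write there, as hive needs to
test -w "$dir" && echo "temp dir is writable"
```

Alternatively, hadoop.tmp.dir can be pointed at a directory the hadoop user already owns in core-site.xml.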