create table question

2014-04-22 Thread EdwardKing
I use Hadoop 2.2.0 and Hive 0.13.0, and I want to create a table from an existing 
file. states.hql is as follows:
CREATE EXTERNAL TABLE states(abbreviation string, full_name
string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 'tmp/states' ;


[hadoop@master ~]$ hadoop fs -ls
14/04/22 20:17:32 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - hadoop supergroup  0 2014-04-22 20:02 tmp

[hadoop@master ~]$ hadoop fs -put states.txt tmp/states
[hadoop@master ~]$ hadoop fs -ls tmp/states
14/04/22 20:17:19 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   2 hadoop supergroup        654 2014-04-22 20:02 tmp/states/states.txt


Then I execute states.hql
[hadoop@master ~]$ hive -f states.hql
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.reduce.tasks is 
deprecated. Instead, use mapreduce.job.reduces
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.reduce.speculative
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.min.split.size.per.node is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.node
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.input.dir.recursive is 
deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.min.split.size.per.rack is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use 
mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in 
jar:file:/home/software/apache-hive-0.13.0-bin/lib/hive-common-0.13.0.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:java.lang.IllegalArgumentException: 
java.net.URISyntaxException: Relative path in absolute URI: 
hdfs://master:9000./tmp/states)


It raises the error above; the detailed log follows. Why does this happen, and how do I correct it?
2014-04-22 20:12:03,907 INFO  [main]: exec.DDLTask 
(DDLTask.java:createTable(4074)) - Default to LazySimpleSerDe for table states
2014-04-22 20:12:05,147 INFO  [main]: metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(624)) - 0: create_table: Table(tableName:states, 
dbName:default, owner:hadoop, createTime:1398222724, lastAccessTime:0, 
retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:abbreviation, 
type:string, comment:null), FieldSchema(name:full_name, type:string, 
comment:null)], location:tmp/states, 
inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{serialization.format= , field.delim= }), bucketCols:[], 
sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], 
skewedColValues:[], skewedColValueLocationMaps:{}), 
storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE}, 
viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)
2014-04-22 20:12:05,147 INFO  [main]: HiveMetaStore.audit 
(HiveMetaStore.java:logAuditEvent(306)) - ugi=hadoop ip=unknown-ip-addr 
cmd=create_table: Table(tableName:states, dbName:default, owner:hadoop, 
createTime:1398222724, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:abbreviation, type:string, 
comment:null), FieldSchema(name:full_name, type:string, comment:null)], 
location:tmp/states, inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{serialization.format= , field.delim= }), bucketCols:[], 
sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], 
skewedColValues:[], skewedColValueLocationMaps:{}), 
storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE}, 
viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE) 
2014-04-22 20:12:05,196 ERROR [main]: metastore.RetryingHMSHandler 
(RetryingHMSHandler.java:invoke(143)) - 
MetaException(message:java.lang.IllegalArgumentException: 
java.net.URISyntaxException: Relative path in absolute URI: 
hdfs://master:9000./tmp/states
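
The error message points at the cause: LOCATION 'tmp/states' is a relative path, and 
when Hive resolves it against the warehouse URI it produces the malformed URI 
hdfs://master:9000./tmp/states. A LOCATION clause needs an absolute path or a fully 
qualified HDFS URI. Since "hadoop fs -put states.txt tmp/states" writes into the 
user's HDFS home directory, the file presumably landed in /user/hadoop/tmp/states, 
so a corrected states.hql would look something like this (the absolute path is an 
assumption based on the listing above):

CREATE EXTERNAL TABLE states(abbreviation string, full_name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/hadoop/tmp/states';

The fully qualified form LOCATION 'hdfs://master:9000/user/hadoop/tmp/states' should 
work equally well.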

question about hive sql

2014-04-21 Thread EdwardKing
I use Hive under Hadoop 2.2.0. First I start Hive:
[hadoop@master sbin]$ hive
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.input.dir.recursive is 
deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/21 19:06:32 INFO Configuration.deprecation: 
mapred.min.split.size.per.rack is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/21 19:06:32 INFO Configuration.deprecation: 
mapred.min.split.size.per.node is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.node
14/04/21 19:06:32 INFO Configuration.deprecation: mapred.reduce.tasks is 
deprecated. Instead, use mapreduce.job.reduces
14/04/21 19:06:32 INFO Configuration.deprecation: 
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.reduce.speculative
14/04/21 19:06:32 WARN conf.Configuration: 
org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2128d0:an attempt to 
override final parameter: mapreduce.job.end-notification.max.retry.interval;  
Ignoring.
14/04/21 19:06:32 WARN conf.Configuration: 
org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2128d0:an attempt to 
override final parameter: mapreduce.job.end-notification.max.attempts;  
Ignoring.
Logging initialized using configuration in 
jar:file:/home/software/hive-0.11.0/lib/hive-common-0.11.0.jar!/hive-log4j.properties
Hive history 
file=/tmp/hadoop/hive_job_log_hadoop_7623@master_201404211906_2069310090.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/software/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/software/hive-0.11.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Then I create a table:
hive> create table test(id STRING);
OK
Time taken: 17.277 seconds

Then I load some data into test:
hive> load data inpath 'a.txt' overwrite into table test;
Loading data to table default.test
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted /user/hive/warehouse/test
Table default.test stats: [num_partitions: 0, num_files: 1, num_rows: 0, 
total_size: 19, raw_data_size: 0]
OK
Time taken: 1.855 seconds

hive> select * from test;
OK
China
US
Australia
Time taken: 0.526 seconds, Fetched: 3 row(s)

Now I run the count command. I expected the result to be 3, but it fails. Why? 
Where is it wrong? I have been puzzled by this for several days. Could anyone tell 
me how to correct it?
hive> select count(*) from test;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1398132272370_0001, Tracking URL = 
http://master:8088/proxy/application_1398132272370_0001/
Kill Command = /home/software/hadoop-2.2.0/bin/hadoop job  -kill 
job_1398132272370_0001
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-04-21 19:15:56,684 Stage-1 map = 0%,  reduce = 0%
Ended Job = job_1398132272370_0001 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched: 
Job 0:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive> 


Error information under 
http://172.11.12.6:8088/cluster/app/application_1398132272370_0001
User:  hadoop
Name:  select count(*) from test(Stage-1)
Application Type:  MAPREDUCE
State:  FAILED
FinalStatus:  FAILED
Started:  21-Apr-2014 19:14:55
Elapsed:  57sec
Tracking URL:  History
Diagnostics:  
Application application_1398132272370_0001 failed 2 times due to AM Container 
for appattempt_1398132272370_0001_02 exited with exitCode: 1 due to: 
Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at 
org.apache.hadoop.yarn.server.n
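
A likely reason select * works while count(*) fails: a plain select * is served by 
a simple fetch from HDFS, whereas count(*) is the first statement here that actually 
launches a MapReduce job, so the failure lies in job execution rather than in the 
table itself. The job dies before any mapper starts (0 mappers, AM container exit 
code 1), so the Hive console shows nothing useful; the container logs usually do. 
Assuming log aggregation is enabled (otherwise the NodeManager's local container 
logs serve the same purpose), they can be pulled with:

[hadoop@master ~]$ yarn logs -applicationId application_1398132272370_0001

One common culprit for this symptom is a Hive build targeting MRv1 running against 
a YARN cluster, or a classpath problem in the YARN configuration; verifying that 
mapred-site.xml sets mapreduce.framework.name to yarn, and that 
yarn.application.classpath in yarn-site.xml covers the Hadoop jars, is a reasonable 
first step. This is a hypothesis from the symptoms, not a confirmed diagnosis.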

question about hive under hadoop

2014-04-16 Thread EdwardKing
I use hive-0.11.0 under Hadoop 2.2.0, as follows:
[hadoop@node1 software]$ hive
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.input.dir.recursive is 
deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/16 19:11:02 INFO Configuration.deprecation: 
mapred.min.split.size.per.rack is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/16 19:11:02 INFO Configuration.deprecation: 
mapred.min.split.size.per.node is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.node
14/04/16 19:11:02 INFO Configuration.deprecation: mapred.reduce.tasks is 
deprecated. Instead, use mapreduce.job.reduces
14/04/16 19:11:02 INFO Configuration.deprecation: 
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.reduce.speculative
14/04/16 19:11:03 WARN conf.Configuration: 
org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@17a9eb9:an attempt to 
override final parameter: mapreduce.job.end-notification.max.retry.interval;  
Ignoring.
14/04/16 19:11:03 WARN conf.Configuration: 
org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@17a9eb9:an attempt to 
override final parameter: mapreduce.job.end-notification.max.attempts;  
Ignoring.
Logging initialized using configuration in 
jar:file:/home/software/hive-0.11.0/lib/hive-common-0.11.0.jar!/hive-log4j.properties
Hive history 
file=/tmp/hadoop/hive_job_log_hadoop_4933@node1_201404161911_2112956781.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/software/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/software/hive-0.11.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]


Then I create a table named ufodata, as follows:
hive> CREATE TABLE ufodata(sighted STRING, reported STRING,
> sighting_location STRING,shape STRING, duration STRING,
> description STRING COMMENT 'Free text description')
> COMMENT 'The UFO data set.' ;
OK
Time taken: 1.588 seconds
hive> LOAD DATA INPATH '/tmp/ufo.tsv' OVERWRITE INTO TABLE ufodata;
Loading data to table default.ufodata
rmr: DEPRECATED: Please use 'rm -r' instead.
Deleted /user/hive/warehouse/ufodata
Table default.ufodata stats: [num_partitions: 0, num_files: 1, num_rows: 0, 
total_size: 75342464, raw_data_size: 0]
OK
Time taken: 1.483 seconds

Then I want to count the rows in ufodata, as follows:

hive> select count(*) from ufodata;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1397699833108_0002, Tracking URL = 
http://master:8088/proxy/application_1397699833108_0002/
Kill Command = /home/software/hadoop-2.2.0/bin/hadoop job  -kill 
job_1397699833108_0002

I have two questions:
1. Why does the above command fail? Where is it wrong, and how do I solve it?
2. When I quit Hive and reboot the computer with the following commands:
hive> quit;
$ reboot

and then run the following command in Hive:
hive> describe ufodata;
Table not found 'ufodata'

Where is my table? I am puzzled by this. How do I resolve these two questions?
 
Thanks
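
The second question matches the behavior of Hive's default embedded Derby metastore: 
unless configured otherwise, Hive creates a metastore_db directory under whatever 
directory the CLI was started from, so launching hive from a different directory 
after the reboot shows a fresh, empty metastore even though the table's files still 
sit in /user/hive/warehouse. Assuming that is the cause, one fix is to pin the Derby 
database to an absolute path in hive-site.xml (the path below is illustrative):

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/hadoop/metastore_db;create=true</value>
</property>

Alternatively, starting hive again from the directory where the original 
metastore_db was created should bring the table back. The first question looks like 
the same count(*) failure discussed in the previous thread, so the same 
container-log debugging applies.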






 


Re: Hive install under hadoop

2014-04-14 Thread EdwardKing
Oh, I set the wrong HIVE_HOME; now I have corrected it and Hive runs. Thanks. 
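
For anyone hitting the same error: the logs elsewhere in these threads show the Hive 
jars under /home/software/hive-0.11.0, so the correction was presumably pointing 
HIVE_HOME at the full unpack directory rather than /home/software/hive, along the 
lines of:

[hadoop@master /]$ export HIVE_HOME=/home/software/hive-0.11.0
[hadoop@master /]$ export PATH=${HIVE_HOME}/bin:${PATH}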
  - Original Message - 
  From: Shengjun Xin 
  To: user@hive.apache.org 
  Sent: Monday, April 14, 2014 4:48 PM
  Subject: Re: Hive install under hadoop


  Do you install hadoop correctly?




  On Mon, Apr 14, 2014 at 4:22 PM, EdwardKing  wrote:

I want to use hive in hadoop2.2.0, so I execute following steps:

[hadoop@master /]$ tar -xzf  hive-0.11.0.tar.gz
[hadoop@master /]$ export HIVE_HOME=/home/software/hive
[hadoop@master /]$ export PATH=${HIVE_HOME}/bin:${PATH}
[hadoop@master /]$ hadoop fs -mkdir /tmp
[hadoop@master /]$ hadoop fs -mkdir /user/hive/warehouse
[hadoop@master /]$ hadoop fs -chmod g+w /tmp
[hadoop@master /]$ hadoop fs -chmod g+w /user/hive/warehouse
[hadoop@master /]$ hive
Error creating temp dir in hadoop.tmp.dir file:/home/software/temp due to 
Permission denied
 
How can I make the Hive install succeed? Thanks.











  -- 

  Regards 

  Shengjun


Hive install under hadoop

2014-04-14 Thread EdwardKing
I want to use hive in hadoop2.2.0, so I execute following steps:

[hadoop@master /]$ tar -xzf  hive-0.11.0.tar.gz
[hadoop@master /]$ export HIVE_HOME=/home/software/hive
[hadoop@master /]$ export PATH=${HIVE_HOME}/bin:${PATH}
[hadoop@master /]$ hadoop fs -mkdir /tmp
[hadoop@master /]$ hadoop fs -mkdir /user/hive/warehouse
[hadoop@master /]$ hadoop fs -chmod g+w /tmp
[hadoop@master /]$ hadoop fs -chmod g+w /user/hive/warehouse
[hadoop@master /]$ hive
Error creating temp dir in hadoop.tmp.dir file:/home/software/temp due to 
Permission denied
 
How can I make the Hive install succeed? Thanks.
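
The message itself says that hadoop.tmp.dir points at file:/home/software/temp and 
the directory cannot be created. As the reply above notes, the root cause here 
turned out to be a wrong HIVE_HOME, but checking the directory directly is a quick 
sanity check in the meantime:

[hadoop@master /]$ ls -ld /home/software/temp
[hadoop@master /]$ mkdir -p /home/software/temp && chmod 755 /home/software/temp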
