Re: Setting s3 credentials in cloudera

2014-04-22 Thread Kishore kumar
With the same credentials I am able to download the s3 file to my local
filesystem.
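
For reference, the error below also names a second route: embedding the
credentials directly in the table's S3 URL. A minimal sketch with a
hypothetical bucket and table name (a secret key containing '/' must be
URL-escaped, e.g. as %2F):

hive> create external table s3_test (line string)
    > location 's3://ACCESS_KEY:SECRET_KEY@my-bucket/path/';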


On Tue, Apr 22, 2014 at 11:17 AM, Kishore kumar kish...@techdigita.in wrote:

 No, I am running it in the CLI.


 On Mon, Apr 21, 2014 at 8:43 PM, j.barrett Strausser 
 j.barrett.straus...@gmail.com wrote:

 You mention Cloudera. Are you trying to execute the query from HUE?  That
 requires altering the setting for HUE, not Hive.


 On Mon, Apr 21, 2014 at 11:12 AM, j.barrett Strausser 
 j.barrett.straus...@gmail.com wrote:

 Hope those aren't your actual credentials.


 On Mon, Apr 21, 2014 at 11:05 AM, Kishore kumar 
 kish...@techdigita.in wrote:

 I edited the Cluster-wide Configuration Safety Valve for core-site.xml
 in CM and specified the properties below, but the problem is still the same.

 <property>
   <name>fs.s3.awsAccessKeyId</name>
   <value>AKIAJNIM5P2SASWJPHSA</value>
 </property>

 <property>
   <name>fs.s3.awsSecretAccessKey</name>
   <value>BN1hkKD7JY4LGGNbjxmnFE0ehs12vXmP44GCKV2N</value>
 </property>


 FAILED: Error in metadata:
 MetaException(message:java.lang.IllegalArgumentException: AWS Access Key ID
 and Secret Access Key must be specified as the username or password
 (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or
 fs.s3.awsSecretAccessKey properties (respectively).)
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask

 Thanks,
 Kishore.


 On Mon, Apr 21, 2014 at 8:17 PM, Kishore kumar 
 kish...@techdigita.in wrote:

 I set the credentials from the Hive command line, but I am still getting
 the error. Please help me.

 hive> set fs.s3.awsAccessKeyId = x;
 hive> set fs.s3.awsSecretAccessKey = xxx;

 FAILED: Error in metadata:
 MetaException(message:java.lang.IllegalArgumentException: AWS Access Key 
 ID
 and Secret Access Key must be specified as the username or password
 (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or
 fs.s3.awsSecretAccessKey properties (respectively).)
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask

 Thanks,
 Kishore.



 On Mon, Apr 21, 2014 at 7:33 PM, Kishore kumar 
 kish...@techdigita.in wrote:

 Hi Experts,

 I am trying to create a table against my S3 file and hit the issue below.
 Where do I set these credentials in Cloudera Manager 4.8? After some
 research I found this link (
 http://community.cloudera.com/t5/Cloudera-Manager-Installation/AWS-Access-Key-ID-and-Secret-Access-Key-must-be-specified-as-the/td-p/495),
 but please explain clearly how to specify the values after editing the
 Cluster-wide Configuration Safety Valve for core-site.xml.

 -- Thanks,


 *Kishore *




 --
 https://github.com/bearrito
-- 

*Kishore Kumar*
ITIM


RE: Kerberized Hive | Remote Access using Keytab

2014-04-22 Thread Savant, Keshav
Hi All,

Can someone provide some information on the problem below?

Kind Regards,
Keshav C Savant

From: Savant, Keshav [mailto:keshav.c.sav...@fisglobal.com]
Sent: Friday, April 18, 2014 3:52 PM
To: user@hive.apache.org
Subject: Kerberized Hive | Remote Access using Keytab

Hi All,

I have successfully Kerberized CDH5 and Hive. Now I can do a kinit and then
issue Hive queries.

Next, I wanted to access Hive remotely from a standalone Java client using a
keytab file, so that kinit (or a credential prompt) can be avoided.

I have written Java code with the following lines (based on input from the
cdh-user Google group,
https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/S7nPFx0w90U)
to solve the above problem, but after that I am getting a GSS initiate failed
exception.

Configuration conf = new Configuration();
conf.addResource(new java.io.FileInputStream("/installer/hive_jdbc/core-site.xml")); // file placed at this path
SecurityUtil.login(conf, "/path/to/my/keytab/file/user.keytab", "user@domain");
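
Note that SecurityUtil.login expects configuration key names (it looks the
keytab path and principal up in the Configuration), not literal values, which
may explain the failure. A more direct route is UserGroupInformation; a
minimal sketch, with the HiveServer2 host, port, and realm below as
placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabHiveClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("hadoop.security.authentication", "kerberos");
    UserGroupInformation.setConfiguration(conf);
    // log in from the keytab directly; no kinit or ticket cache needed
    UserGroupInformation.loginUserFromKeytab("user@domain",
        "/path/to/my/keytab/file/user.keytab");

    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // principal= carries HiveServer2's service principal, not the client's
    Connection con = DriverManager.getConnection(
        "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@EXAMPLE.COM");
    System.out.println("Connected: " + !con.isClosed());
    con.close();
  }
}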

I have also posted the same problem on this URL:
https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/S7nPFx0w90U,
where sample code and logs are posted.

As per the Apache Hive wiki on this page
(https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBCClientSetupforaSecureCluster),
a valid ticket needs to be in the ticket cache for hitting a Kerberized Hive.
Can I bypass this and use a keytab for hitting Kerberized Hive from a
standalone Java program?

Kindly provide some input/pointers/examples to solve this.

Kind regards,
Keshav C Savant
_
The information contained in this message is proprietary and/or confidential. 
If you are not the intended recipient, please: (i) delete the message and all 
copies; (ii) do not disclose, distribute or use the message in any manner; and 
(iii) notify the sender immediately. In addition, please be aware that any 
message addressed to our domain is subject to archiving and review by persons 
other than the intended recipient. Thank you.

_
The information contained in this message is proprietary and/or confidential. 
If you are not the intended recipient, please: (i) delete the message and all 
copies; (ii) do not disclose, distribute or use the message in any manner; and 
(iii) notify the sender immediately. In addition, please be aware that any 
message addressed to our domain is subject to archiving and review by persons 
other than the intended recipient. Thank you.


Re: Hive 0.13.0 - IndexOutOfBounds Exception

2014-04-22 Thread Bryan Jeffrey
Prasanth,

The error seems to occur with just about any table.  I mocked up a very
simple table to illustrate the problem (including input data, etc.) to make
this easy to repeat.

hive> create table loading_data_0 (A smallint, B smallint) partitioned by
(range int) row format delimited fields terminated by '|' stored as
textfile;
hive> create table data (A smallint, B smallint) partitioned by (range int)
clustered by (A) sorted by (A, B) into 8 buckets stored as orc
tblproperties ("orc.compress" = "SNAPPY", "orc.index" = "true");
[root@server ~]# cat test.input
123|436
423|426
223|456
923|486
023|406
hive> load data inpath '/test.input' into table loading_data_0 partition
(range=123);

[root@server scripts]# hive -e "describe data;"
Logging initialized using configuration in
/opt/hadoop/latest-hive/conf/hive.log4j
OK
Time taken: 0.508 seconds
OK
a   smallint
b   smallint
range   int

# Partition Information
# col_name  data_type   comment

range   int
Time taken: 0.422 seconds, Fetched: 8 row(s)
[root@server scripts]# hive -e "describe loading_data_0;"
Logging initialized using configuration in
/opt/hadoop/latest-hive/conf/hive.log4j
OK
Time taken: 0.511 seconds
OK
a   smallint
b   smallint
range   int

# Partition Information
# col_name  data_type   comment

range   int
Time taken: 0.37 seconds, Fetched: 8 row(s)


[root@server scripts]# hive -e "set
hive.exec.dynamic.partition.mode=nonstrict; set hive.enforce.sorting =
true; set mapred.job.queue.name=orc_queue; explain insert into table data
partition (range) select * from loading_data_0;"
Logging initialized using configuration in
/opt/hadoop/latest-hive/conf/hive.log4j
OK
Time taken: 0.564 seconds
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: loading_data_0
Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE
Column stats: NONE
Select Operator
  expressions: a (type: smallint), b (type: smallint), range
(type: int)
  outputColumnNames: _col0, _col1, _col2
  Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE
Column stats: NONE
  Reduce Output Operator
key expressions: _col2 (type: int), -1 (type: int), _col0
(type: smallint), _col1 (type: smallint)
sort order: 
Map-reduce partition columns: _col2 (type: int)
Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE
Column stats: NONE
value expressions: _col0 (type: smallint), _col1 (type:
smallint), _col2 (type: int)
  Reduce Operator Tree:
Extract
  Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE
Column stats: NONE
  File Output Operator
compressed: false
Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE
Column stats: NONE
table:
input format:
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
output format:
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
name: data

  Stage: Stage-0
Move Operator
  tables:
  partition:
range
  replace: false
  table:
  input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
  output format:
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
  serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
  name: data

Time taken: 0.913 seconds, Fetched: 45 row(s)



 [root@server]# hive -e "set hive.exec.dynamic.partition.mode=nonstrict;
set hive.enforce.sorting = true; set mapred.job.queue.name=orc_queue;
insert into table data partition (range) select * from loading_data_0;"
Logging initialized using configuration in
/opt/hadoop/latest-hive/conf/hive.log4j
OK
Time taken: 0.513 seconds
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=number
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=number
In order to set a constant number of reducers:
  set mapreduce.job.reduces=number
Starting Job = job_1398130933303_1467, Tracking URL =
http://server:8088/proxy/application_1398130933303_1467/
Kill Command = /opt/hadoop/latest-hadoop/bin/hadoop job  -kill
job_1398130933303_1467
Hadoop job information for Stage-1: number of mappers: 1; number of
reducers: 1
2014-04-22 11:33:26,984 Stage-1 map = 0%,  reduce = 0%
2014-04-22 11:33:51,833 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_1398130933303_1467 with 

How to get hdfs zip files in localfilesystem

2014-04-22 Thread Kishore kumar
Hi Experts,

My Hive query result is stored in HDFS as 75 zipped files. I merged them and
got them into the local filesystem with '-getmerge', but I am unable to see
the data after I unzipped the file. Am I missing anything? Please help me.

-- 


*Kishore *


Re: How to get hdfs zip files in localfilesystem

2014-04-22 Thread Kishore kumar
I get the message below if I run this command to see the contents of the
merged file. Please help me with this; if I store the query result directly
in the local filesystem I am able to see the contents.

# less TheCombinedResultOfTheJob.txt

TheCombinedResultOfTheJob.txt may be a binary file.  See it anyway?  y
When I enter y, the result is:

0^A0^A0^A11^A0^A12^A416^A0.0
0^A0^A0^A11^A38^A12^A87^A0.0
0^A0^A0^A12^A53^A11^A1^A0.0
0^A0^A0^A12^A72^A11^A30^A0.0
0^A0^A0^A12^A357^A11^A12^A0.0
0^A0^A0^A12^A395^A11^A2^A0.0
0^A0^A0^A12^A547^A11^A9^A0.0



On Tue, Apr 22, 2014 at 8:32 PM, Kishore kumar kish...@techdigita.in wrote:

 Hi Experts,

 My Hive query result is stored in HDFS as 75 zipped files. I merged them and
 got them into the local filesystem with '-getmerge', but I am unable to see
 the data after I unzipped the file. Am I missing anything? Please help me.

 --


 *Kishore *



Re: Hive 0.13.0 - IndexOutOfBounds Exception

2014-04-22 Thread Bryan Jeffrey
Prasanth,

Was this additional information sufficient?  This is a large roadblock to
our adopting Hive 0.13.0.

Regards,

Bryan Jeffrey


On Tue, Apr 22, 2014 at 7:41 AM, Bryan Jeffrey bryan.jeff...@gmail.com wrote:

 Prasanth,

 The error seems to occur with just about any table.  I mocked up a very
 simple table to illustrate the problem (including input data, etc.) to make
 this easy to repeat.


Re: Hive 0.13.0 - IndexOutOfBounds Exception

2014-04-22 Thread Prasanth Jayachandran
Thanks Bryan. This is more than sufficient. As a workaround, can you try 
setting hive.optimize.sort.dynamic.partition=false and see if it helps? In the 
meantime, I will diagnose the issue.

Thanks
Prasanth Jayachandran

On Apr 22, 2014, at 10:36 AM, Bryan Jeffrey bryan.jeff...@gmail.com wrote:

 Prasanth,
 
 Was this additional information sufficient?  This is a large road block to 
 our adopting Hive 0.13.0.
 
 Regards,
 
 Bryan Jeffrey
 
 
 On Tue, Apr 22, 2014 at 7:41 AM, Bryan Jeffrey bryan.jeff...@gmail.com 
 wrote:
 Prasanth,
 
 The error seems to occur with just about any table.  I mocked up a very 
 simple table to illustrate the problem (including input data, etc.) to make 
 this easy to repeat.
 

Re: Hive 0.13.0 - IndexOutOfBounds Exception

2014-04-22 Thread Prasanth Jayachandran
Bryan,

This issue is related to https://issues.apache.org/jira/browse/HIVE-6883

The workaround for this issue is to disable the
hive.optimize.sort.dynamic.partition optimization by setting it to false.

We found this issue very late (towards the end of the 0.13 release), so the
fix wasn't included in Hive 0.13. It will go into the next patch release or
the next release. I will request a backport to the Hive 0.13 source as well.
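
Applied to your repro, that is just the following (a sketch; the setting is
per-session):

hive> set hive.optimize.sort.dynamic.partition=false;
hive> insert into table data partition (range) select * from loading_data_0;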

Thanks
Prasanth Jayachandran

On Apr 22, 2014, at 10:36 AM, Bryan Jeffrey bryan.jeff...@gmail.com wrote:

 Prasanth,
 
 Was this additional information sufficient?  This is a large road block to 
 our adopting Hive 0.13.0.
 
 Regards,
 
 Bryan Jeffrey
 
 
 On Tue, Apr 22, 2014 at 7:41 AM, Bryan Jeffrey bryan.jeff...@gmail.com 
 wrote:
 Prasanth,
 
 The error seems to occur with just about any table.  I mocked up a very 
 simple table to illustrate the problem (including input data, etc.) to make 
 this easy to repeat.
 

Re: Hive 0.13.0 - IndexOutOfBounds Exception

2014-04-22 Thread Bryan Jeffrey
Prasanth,

Thank you for the help.  It would not have occurred to me to look at
partition sort and order issues from that dump. I may just apply the patch
to my copy of 0.13.

Regards,

Bryan Jeffrey
On Apr 22, 2014 2:41 PM, Prasanth Jayachandran 
pjayachand...@hortonworks.com wrote:

 Bryan,

 This issue is related to https://issues.apache.org/jira/browse/HIVE-6883

 The workaround for this issue is to disable
 hive.optimize.sort.dynamic.partition optimization by setting it to false.

 We found this issue very late (towards the end of 0.13 release) and so
 wasn’t included in hive 0.13. It will go into the next patch release/next
 release. I will request for a backport to hive 0.13 source as well.

 Thanks
 Prasanth Jayachandran

 On Apr 22, 2014, at 10:36 AM, Bryan Jeffrey bryan.jeff...@gmail.com
 wrote:

 Prasanth,

 Was this additional information sufficient?  This is a large road block to
 our adopting Hive 0.13.0.

 Regards,

 Bryan Jeffrey


 On Tue, Apr 22, 2014 at 7:41 AM, Bryan Jeffrey bryan.jeff...@gmail.com wrote:

 Prasanth,

 The error seems to occur with just about any table.  I mocked up a very
 simple table to illustrate the problem (including input data, etc.) to make
 this easy to repeat.


Re: How to get hdfs zip files in localfilesystem

2014-04-22 Thread Subramanian, Sanjay (HQP)
This looks like the file has Ctrl-A (Hive's default field delimiter) as the separator.

Try this

CTRLA=$(echo -e "\x01"); sed 's/'${CTRLA}'/\t/g' TheCombinedResultOfTheJob.txt > TheCombinedResultOfTheJob.tsv; less TheCombinedResultOfTheJob.tsv
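
An equivalent one-liner with tr, which accepts the octal escape for Ctrl-A
directly (a sketch over the same file):

tr '\001' '\t' < TheCombinedResultOfTheJob.txt > TheCombinedResultOfTheJob.tsv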

Thanks
Warm Regards

Sanjay

From: Kishore kumar kish...@techdigita.in
Reply-To: user@hive.apache.org
Date: Tuesday, April 22, 2014 at 9:38 AM
To: user@hive.apache.org
Subject: Re: How to get hdfs zip files in localfilesystem

I get the message below if I run this command to see the contents of the
merged file. Please help me with this; if I store the query result directly
in the local filesystem I am able to see the contents.

# less TheCombinedResultOfTheJob.txt

TheCombinedResultOfTheJob.txt may be a binary file.  See it anyway?  y

When I enter y, the result is:

0^A0^A0^A11^A0^A12^A416^A0.0
0^A0^A0^A11^A38^A12^A87^A0.0
0^A0^A0^A12^A53^A11^A1^A0.0
0^A0^A0^A12^A72^A11^A30^A0.0
0^A0^A0^A12^A357^A11^A12^A0.0
0^A0^A0^A12^A395^A11^A2^A0.0
0^A0^A0^A12^A547^A11^A9^A0.0



On Tue, Apr 22, 2014 at 8:32 PM, Kishore kumar
kish...@techdigita.in wrote:
Hi Experts,

My Hive query result is stored in HDFS as 75 zipped files. I merged them and
got them into the local filesystem with '-getmerge', but I am unable to see
the data after I unzipped the file. Am I missing anything? Please help me.

--

Kishore






Finding Max of a column without using any Aggregation functions

2014-04-22 Thread Subramanian, Sanjay (HQP)
Hey guys

TABLE=STUDENT
COLUMN=SCORE

You want to find the max value in the column without using any aggregation
functions.

It's easy in an RDBMS context, but I was trying to get a solution in Hive
(clearly I have some spare time on my hands - LOL). Hive only supports
equi-joins, so the trick is to join on a dummy key and do the real comparison
in the WHERE clause: the self-join marks every score that is smaller than
some other score, and the scores left unmatched (frab.fra is null) are the max.

select
 nfr.score
from
 student nfr
left outer join
 (select
  a.score as fra,
  b.score as frb
 from
  (select
   '1' as dummy,
   score
  from
   student
  ) a

 join
  (select
   '1' as dummy,
   score
  from
   student
  ) b
 ON
  a.dummy = b.dummy
 where
  a.score < b.score
 ) frab
on
 frab.fra=nfr.score
where
 frab.fra is null

Thanks
Warm Regards

Sanjay



create table question

2014-04-22 Thread EdwardKing
I use Hadoop 2.2.0 and Hive 0.13.0. I want to create a table from an existing
file; states.hql is as follows:
CREATE EXTERNAL TABLE states(abbreviation string, full_name
string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 'tmp/states' ;


[hadoop@master ~]$ hadoop fs -ls
14/04/22 20:17:32 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - hadoop supergroup  0 2014-04-22 20:02 tmp

[hadoop@master ~]$ hadoop fs -put states.txt tmp/states
[hadoop@master ~]$ hadoop fs -ls tmp/states
14/04/22 20:17:19 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   2 hadoop supergroup654 2014-04-22 20:02 
tmp/states/states.txt


Then I execute states.hql
[hadoop@master ~]$ hive -f states.hql
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.reduce.tasks is 
deprecated. Instead, use mapreduce.job.reduces
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.min.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.reduce.tasks.speculative.execution is deprecated. Instead, use 
mapreduce.reduce.speculative
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.min.split.size.per.node is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.node
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.input.dir.recursive is 
deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.min.split.size.per.rack is deprecated. Instead, use 
mapreduce.input.fileinputformat.split.minsize.per.rack
14/04/22 20:11:47 INFO Configuration.deprecation: mapred.max.split.size is 
deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/04/22 20:11:47 INFO Configuration.deprecation: 
mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use 
mapreduce.job.committer.setup.cleanup.needed
Logging initialized using configuration in 
jar:file:/home/software/apache-hive-0.13.0-bin/lib/hive-common-0.13.0.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:java.lang.IllegalArgumentException: 
java.net.URISyntaxException: Relative path in absolute URI: 
hdfs://master:9000./tmp/states)


It raises the following error. Why, and how do I correct it?
2014-04-22 20:12:03,907 INFO  [main]: exec.DDLTask 
(DDLTask.java:createTable(4074)) - Default to LazySimpleSerDe for table states
2014-04-22 20:12:05,147 INFO  [main]: metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(624)) - 0: create_table: Table(tableName:states, 
dbName:default, owner:hadoop, createTime:1398222724, lastAccessTime:0, 
retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:abbreviation, 
type:string, comment:null), FieldSchema(name:full_name, type:string, 
comment:null)], location:tmp/states, 
inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{serialization.format= , field.delim= }), bucketCols:[], 
sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], 
skewedColValues:[], skewedColValueLocationMaps:{}), 
storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE}, 
viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)
2014-04-22 20:12:05,147 INFO  [main]: HiveMetaStore.audit 
(HiveMetaStore.java:logAuditEvent(306)) - ugi=hadoop ip=unknown-ip-addr 
cmd=create_table: Table(tableName:states, dbName:default, owner:hadoop, 
createTime:1398222724, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:abbreviation, type:string, 
comment:null), FieldSchema(name:full_name, type:string, comment:null)], 
location:tmp/states, inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{serialization.format= , field.delim= }), bucketCols:[], 
sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], 
skewedColValues:[], skewedColValueLocationMaps:{}), 
storedAsSubDirectories:false), partitionKeys:[], parameters:{EXTERNAL=TRUE}, 
viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE) 
2014-04-22 20:12:05,196 ERROR [main]: metastore.RetryingHMSHandler 
(RetryingHMSHandler.java:invoke(143)) - 
MetaException(message:java.lang.IllegalArgumentException: 
java.net.URISyntaxException: Relative path in absolute URI: 

Re: create table question

2014-04-22 Thread Shengjun Xin
In the HQL you set the relative path tmp/states; according to the error
message, you need to set an absolute path.
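
A sketch of the corrected DDL, assuming the data really sits under the hadoop
user's HDFS home directory (your plain hadoop fs -ls listed tmp, so tmp/states
resolves under /user/hadoop):

CREATE EXTERNAL TABLE states(abbreviation string, full_name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/hadoop/tmp/states';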


On Wed, Apr 23, 2014 at 11:23 AM, EdwardKing zhan...@neusoft.com wrote:

  I use Hadoop 2.2.0 and Hive 0.13.0. I want to create a table from an
 existing file; states.hql is as follows:
 CREATE EXTERNAL TABLE states(abbreviation string, full_name
 string)
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 LOCATION 'tmp/states' ;



Re: create table question

2014-04-22 Thread Sanjay Subramanian
For example, if your name node was hadoop_name_nodeIP:8020

(verify this through your browser: http://hadoop_name_nodeIP:50070)

Modified Create Table
==
CREATE EXTERNAL TABLE states(abbreviation string, full_name
string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 'hdfs://hp8300one:8020/tmp/states' ;




 From: Shengjun Xin s...@gopivotal.com
To: user@hive.apache.org 
Sent: Tuesday, April 22, 2014 8:58 PM
Subject: Re: create table question
 


In the HQL you set the relative path tmp/states; according to the error
message, you need to set an absolute path.




On Wed, Apr 23, 2014 at 11:23 AM, EdwardKing zhan...@neusoft.com wrote:

 
I use Hadoop 2.2.0 and Hive 0.13.0. I want to create a table from an existing
file; states.hql is as follows:

CREATE EXTERNAL TABLE states(abbreviation string, full_name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 'tmp/states' ;