Hi

I was playing with external tables in Hive, and I am confused because the 
concept of an external table as explained in the documentation does not seem 
to match what I see in practice.

Hive Version : 0.7

CREATE EXTERNAL TABLE IF NOT EXISTS learn.crime_external_native (
Orig_State String,
TypeofCrime String,
Crime String,
Year int,
Count int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://localhost:8020/user/srcdata';

CREATE EXTERNAL TABLE IF NOT EXISTS learn.crime_external_native_1
LIKE learn.crime_external_native
LOCATION '/user/crime_external_native_1';

LOAD DATA INPATH '/user/CrimeHDFS2.csv' INTO TABLE 
learn.crime_external_native_1;

This fails with the error:

"Path is not legal '/user/CrimeHDFS2.csv': Move from 
hdfs://0.0.0.0/user/CrimeHDFS2.csv to  
hdfs://localhost:8020/user/crime_external_native_1 is not valid.
Please check that values for params "default.fs.name" and 
"hive.metastore.warehosue.dir" do not conflict"

What am I doing wrong here?
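
My guess, and it is only a guess, is that the unqualified source path is being 
resolved against fs.default.name (hence the hdfs://0.0.0.0 prefix in the 
error), while the table location sits on hdfs://localhost:8020, so Hive 
refuses the move between what it treats as two different filesystems. As an 
untested sketch, I would expect fully qualifying the source path to avoid 
that mismatch:

-- untested sketch: qualify the source path so it resolves to the same
-- filesystem (hdfs://localhost:8020) as the table's LOCATION
LOAD DATA INPATH 'hdfs://localhost:8020/user/CrimeHDFS2.csv'
INTO TABLE learn.crime_external_native_1;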

Whereas when I load data from a local file into the external table, it works:

LOAD DATA LOCAL INPATH '/home/cloudera/CrimeHDFS2.csv' INTO TABLE 
learn.crime_external_native_1;

From the above I am making the following assumptions. Are they correct?


* When creating an EXTERNAL table in Hive, you have to specify an HDFS 
directory (not a data file name) that contains the source data files.

* The ROW FORMAT specified should match the data files in the external 
table's directory, so that when a new file is added you can query it 
directly, with no need for a LOAD command (see the sketch after this list).

* When you create an external table and load data into it from a local file, 
the file is copied to the external table's location, and when you drop the 
table the directory and data file are removed (which I feel contradicts the 
external table concept).
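
To illustrate the second and third assumptions, here is a minimal sketch of 
what I mean; the CrimeHDFS3.csv file name below is just a made-up example, 
and it assumes the two tables created above:

# drop a new comma-delimited file into the external table's directory
hadoop fs -put /home/cloudera/CrimeHDFS3.csv /user/srcdata/

-- the new rows should then show up directly, with no LOAD needed
SELECT * FROM learn.crime_external_native LIMIT 10;

-- whereas dropping the second table appears to remove its data as well
DROP TABLE learn.crime_external_native_1;

# check whether the data under the table's LOCATION survived the drop
hadoop fs -ls /user/crime_external_native_1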

Am I correct?

Thanks,
Kuldeep
