Arup Malakar created HCATALOG-554:
-------------------------------------
Summary: Loading data using HCatLoader() from a table on non-default namenode fails
Key: HCATALOG-554
URL: https://issues.apache.org/jira/browse/HCATALOG-554
Project: HCatalog
Issue Type: Bug
Affects Versions: 0.4, 0.5
Environment: Hadoop 0.23.3
hcatalog 0.4
hive 0.9
Reporter: Arup Malakar
Assignee: Arup Malakar
1. Create a Hive table:
{code}
CREATE TABLE small_table(
id int,
score int
)
stored as SequenceFile
location "viewfs:///database/small_table";
{code}
2. Prepare the input data (/tmp/data.csv):
{code}
1,32
2,235
3,32532
4,23
5,2
{code}
3. Load the data into the HCatalog table:
{code}
DATA = LOAD '/tmp/data.csv' using PigStorage(',') as (id:int, score:int);
store DATA into 'default.small_table' using org.apache.hcatalog.pig.HCatStorer();
{code}
4. Confirm that the data has been stored in the table:
{code}
hadoopqa@gsbl90385:/tmp$ hive -e "select * from default.small_table"
Logging initialized using configuration in file:/grid/0/homey/libexec/hive/conf/hive-log4j.properties
Hive history file=/homes/hadoopqa/hivelogs/hive_job_log_hadoopqa_201211212228_1532947518.txt
OK
1 32
2 235
3 32532
4 23
5 2
{code}
5. Now try to read the same table using HCatLoader():
{code}
a = load 'default.small_table' using org.apache.hcatalog.pig.HCatLoader();
dump a;
{code}
The exception seen is:
{code}
2012-11-21 22:30:50,087 [Thread-6] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:[email protected] (auth:KERBEROS) cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
2012-11-21 22:30:50,088 [Thread-6] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:select_in_pig.pig got an error while submitting
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:449)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:466)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:358)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1216)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1213)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
at java.lang.Thread.run(Thread.java:619)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
Caused by: java.io.IOException: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:164)
at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:164)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2190)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:84)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2224)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2206)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:305)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:98)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:187)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:251)
at org.apache.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:149)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
... 13 more
{code}
Here viewfs:///database/ resolves to the non-default namenode and is defined in
the client-side mount table.
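For reference, a client-side mount table entry of this kind might look like the following in core-site.xml. This is only a sketch: the target host nn2.example.com is a placeholder and not the actual value from this cluster. The property name follows the standard ViewFs pattern fs.viewfs.mounttable.<name>.link.<path>, where "default" is the mount table used when the viewfs URI has no authority.

```xml
<!-- Sketch only: nn2.example.com is a placeholder for the non-default namenode -->
<property>
  <name>fs.viewfs.mounttable.default.link./database</name>
  <value>hdfs://nn2.example.com:8020/database</value>
</property>
```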
Observation:
This issue seems similar to HCATALOG-553. HCatLoader() probably does not have
the right token for the non-default namenode.
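One further detail from the trace: the table location (viewfs:///database/small_table) has no authority, whereas the URI in the IOException carries a host:port authority. The minimal sketch below only makes that difference visible using java.net.URI. The class name ViewFsUriCheck is hypothetical, no Hadoop or HCatalog API is involved, and whether this difference is related to the failure is not established here.

```java
import java.net.URI;

// Illustration only: compares the URI stored as the table location with the
// URI reported in the IOException. ViewFsUriCheck is a hypothetical helper
// and does not use any Hadoop or HCatalog API.
final class ViewFsUriCheck {

    // Returns the authority (host:port) part of a URI, or null if it has none.
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    public static void main(String[] args) {
        String tableLocation = "viewfs:///database/small_table";
        String failingUri = "viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/";
        System.out.println("table location authority: " + authorityOf(tableLocation));
        System.out.println("failing URI authority:    " + authorityOf(failingUri));
    }
}
```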