[
https://issues.apache.org/jira/browse/HCATALOG-554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arup Malakar updated HCATALOG-554:
----------------------------------
Attachment: HCATALOG-554-trunk_0.patch
HCATALOG-554-branch_0.patch
This problem was caused by setting the wrong input path. The default file system
was used to build the fully qualified names for the input paths, which doesn't
work with viewfs. To fully qualify an input path, the file system the path
actually belongs to should be used rather than the default file system.
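As a rough sketch of the idea (not the literal patch; variable names here are illustrative, not from the attached patches), the change amounts to qualifying each input path against its own file system instead of the default one:

{code}
// Before (buggy): qualifies the path against the default file system,
// i.e. the default namenode, which fails for paths on a viewfs mount.
Path qualified = FileSystem.get(conf).makeQualified(inputPath);

// After (fixed): resolve the file system from the path itself, so a
// viewfs:// path is qualified by the viewfs file system.
FileSystem fs = inputPath.getFileSystem(conf);
Path qualified = fs.makeQualified(inputPath);
{code}

Both Path.getFileSystem(Configuration) and FileSystem.makeQualified(Path) are standard Hadoop APIs; the patch applies this pattern where HCatalog sets up its input paths.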
> Loading data using HCatLoader() from a table on non default namenode fails
> --------------------------------------------------------------------------
>
> Key: HCATALOG-554
> URL: https://issues.apache.org/jira/browse/HCATALOG-554
> Project: HCatalog
> Issue Type: Bug
> Affects Versions: 0.4, 0.5
> Environment: Hadoop 0.23.3
> hcatalog 0.4
> hive 0.9
> Reporter: Arup Malakar
> Assignee: Arup Malakar
> Attachments: HCATALOG-554-branch_0.patch, HCATALOG-554-trunk_0.patch
>
>
> 1. Create hive table:
> {code}
> CREATE TABLE small_table(
> id int,
> score int
> )
> stored as SequenceFile
> location "viewfs:///database/small_table";
> {code}
> 2. Data:
> {code}
> 1,32
> 2,235
> 3,32532
> 4,23
> 5,2
> {code}
> 3. Load data onto the HCatalog table:
> {code}
> DATA = LOAD '/tmp/data.csv' as (id:int, score:int);
> STORE DATA INTO 'default.small_table' USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> 4. Confirm that the load has been stored in the table:
> {code}
> hadoopqa@gsbl90385:/tmp$ hive -e "select * from default.small_table"
> Logging initialized using configuration in file:/grid/0/homey/libexec/hive/conf/hive-log4j.properties
> Hive history file=/homes/hadoopqa/hivelogs/hive_job_log_hadoopqa_201211212228_1532947518.txt
> OK
> 1 32
> 2 235
> 3 32532
> 4 23
> 5 2
> {code}
> 5. Now try to read the same table using HCatLoader():
> {code}
> a = load 'default.small_table' using org.apache.hcatalog.pig.HCatLoader();
> dump a;
> {code}
> Exception seen is:
> {code}
> 2012-11-21 22:30:50,087 [Thread-6] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:[email protected] (auth:KERBEROS) cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
> 2012-11-21 22:30:50,088 [Thread-6] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:select_in_pig.pig got an error while submitting
> org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:449)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:466)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:358)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1216)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1213)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1213)
>     at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
>     at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
>     at java.lang.Thread.run(Thread.java:619)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
> Caused by: java.io.IOException: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
>     at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
>     at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:164)
>     at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:164)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2190)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:84)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2224)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2206)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:305)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:98)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
>     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:187)
>     at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
>     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:251)
>     at org.apache.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:149)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
>     ... 13 more
> {code}
> Here viewfs:///database/ resolves to a non-default namenode and is defined in
> the client-side mount table.
> Observation:
> This issue seems similar to HCATALOG-553; HCatLoader() probably doesn't have
> the right token for the non-default namenode.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira