[
https://issues.apache.org/jira/browse/HCATALOG-554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arup Malakar updated HCATALOG-554:
----------------------------------
Attachment: HCATALOG-554-trunk_0.patch
HCATALOG-554-branch_0.patch
This problem was caused by setting the wrong input path. The default file system
was used to build the fully qualified names for the input paths, which doesn't
work with viewfs. To fully qualify an input path, the file system the path
actually belongs to should be used rather than the default file system.
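As a rough sketch of the idea (not the literal patch; variable names here are illustrative, not from the attached patches), the change amounts to qualifying each input path against its own file system instead of the default one:

{code}
// Before (buggy): qualifies the path against the default file system,
// i.e. the default namenode, which fails for paths on a viewfs mount.
Path qualified = FileSystem.get(conf).makeQualified(inputPath);

// After (fixed): resolve the file system from the path itself, so a
// viewfs:// path is qualified by the viewfs file system.
FileSystem fs = inputPath.getFileSystem(conf);
Path qualified = fs.makeQualified(inputPath);
{code}

Both Path.getFileSystem(Configuration) and FileSystem.makeQualified(Path) are standard Hadoop APIs; the patch applies this pattern where HCatalog sets up its input paths.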
> Loading data using HCatLoader() from a table on non default namenode fails
> --------------------------------------------------------------------------
>
> Key: HCATALOG-554
> URL: https://issues.apache.org/jira/browse/HCATALOG-554
> Project: HCatalog
> Issue Type: Bug
> Affects Versions: 0.4, 0.5
> Environment: Hadoop 0.23.3
> hcatalog 0.4
> hive 0.9
> Reporter: Arup Malakar
> Assignee: Arup Malakar
> Attachments: HCATALOG-554-branch_0.patch, HCATALOG-554-trunk_0.patch
>
>
> 1. Create hive table:
> {code}
> CREATE TABLE small_table(
> id int,
> score int
> )
> stored as SequenceFile
> location "viewfs:///database/small_table";
> {code}
> 2. Data:
> {code}
> 1,32
> 2,235
> 3,32532
> 4,23
> 5,2
> {code}
> 3. Load data onto the HCatalog table:
> {code}
> DATA = LOAD '/tmp/data.csv' as (id:int, score:int);
> STORE DATA INTO 'default.small_table' USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> 4. Confirm that the load has been stored in the table:
> {code}
> hadoopqa@gsbl90385:/tmp$ hive -e "select * from default.small_table"
> Logging initialized using configuration in file:/grid/0/homey/libexec/hive/conf/hive-log4j.properties
> Hive history file=/homes/hadoopqa/hivelogs/hive_job_log_hadoopqa_201211212228_1532947518.txt
> OK
> 1 32
> 2 235
> 3 32532
> 4 23
> 5 2
> {code}
> 5. Now try to read the same table using HCatLoader():
> {code}
> a = load 'default.small_table' using org.apache.hcatalog.pig.HCatLoader();
> dump a;
> {code}
> Exception seen is:
> {code}
> 2012-11-21 22:30:50,087 [Thread-6] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:[email protected] (auth:KERBEROS) cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
> 2012-11-21 22:30:50,088 [Thread-6] INFO org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:select_in_pig.pig got an error while submitting
> org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:449)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:466)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:358)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1216)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1213)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1213)
>     at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
>     at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
>     at java.lang.Thread.run(Thread.java:619)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
> Caused by: java.io.IOException: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
>     at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
>     at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:164)
>     at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:164)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2190)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:84)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2224)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2206)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:305)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:98)
>     at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
>     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:187)
>     at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
>     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:251)
>     at org.apache.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:149)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
>     ... 13 more
> {code}
> Here viewfs:///database/ resolves to a non-default namenode and is defined in
> the client-side mount table.
> Observation:
> This issue seems similar to HCATALOG-553; HCatLoader() probably doesn't have
> the right token for the non-default namenode.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira