Arup Malakar created HCATALOG-554:
-------------------------------------

             Summary: Loading data using HCatLoader() from a table on a non-default namenode fails
                 Key: HCATALOG-554
                 URL: https://issues.apache.org/jira/browse/HCATALOG-554
             Project: HCatalog
          Issue Type: Bug
    Affects Versions: 0.4, 0.5
         Environment: Hadoop 0.23.3
hcatalog 0.4
hive 0.9
            Reporter: Arup Malakar
            Assignee: Arup Malakar


1. Create hive table:
{code}
CREATE TABLE small_table(
  id                int,
  score             int
)
stored as SequenceFile
location "viewfs:///database/small_table";
{code}

2. Data:
{code}
1,32
2,235
3,32532
4,23
5,2
{code}
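The rows above can be staged for step 3 with a quick shell sketch. The local path /tmp/data.csv comes from the Pig LOAD statement in step 3; mirroring that path in HDFS is an assumption:

{code}
# Write the sample rows to the local path referenced by the Pig LOAD statement
printf '1,32\n2,235\n3,32532\n4,23\n5,2\n' > /tmp/data.csv

# Copying into the cluster requires a live Hadoop client; the HDFS target
# path mirroring the local one is an assumption:
# hadoop fs -put -f /tmp/data.csv /tmp/data.csv
{code}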

3. Load the data into the HCatalog table (PigStorage(',') is needed because the sample data is comma-separated; Pig's default LOAD delimiter is tab):
{code}
DATA = LOAD '/tmp/data.csv' using PigStorage(',') as (id:int, score:int);
store DATA into 'default.small_table' using org.apache.hcatalog.pig.HCatStorer();
{code}

4. Confirm that the data has been stored in the table:
{code}
hadoopqa@gsbl90385:/tmp$ hive -e "select * from default.small_table"
Logging initialized using configuration in 
file:/grid/0/homey/libexec/hive/conf/hive-log4j.properties
Hive history 
file=/homes/hadoopqa/hivelogs/hive_job_log_hadoopqa_201211212228_1532947518.txt
OK
1       32
2       235
3       32532
4       23
5       2 
{code}

5. Now try to read the same table using HCatLoader():

{code}
a = load 'default.small_table' using org.apache.hcatalog.pig.HCatLoader();
dump a;
{code}

The exception seen is:

{code}
2012-11-21 22:30:50,087 [Thread-6] ERROR 
org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException 
as:[email protected] (auth:KERBEROS) 
cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
2012-11-21 22:30:50,088 [Thread-6] INFO  
org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - 
PigLatin:select_in_pig.pig got an error while submitting
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: 
viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:449)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:466)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:358)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1216)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1213)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1213)
        at 
org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
        at 
org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
        at java.lang.Thread.run(Thread.java:619)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
Caused by: java.io.IOException: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
        at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
        at 
org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:164)
        at 
org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:164)
        at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2190)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:84)
        at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2224)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2206)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:305)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
        at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:98)
        at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
        at 
org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:187)
        at 
org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
        at 
org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:251)
        at 
org.apache.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:149)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
        ... 13 more
{code}

Here viewfs:///database/ resolves to the non-default namenode; the mapping is 
defined in the client-side mount table.
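For reference, a client-side mount table of this kind is usually declared in core-site.xml. A minimal sketch follows; the mount-table name "default" matches an authority-less viewfs:// URI, but the namenode host shown is an assumption, not taken from this cluster:

{code}
<property>
  <!-- Map /database under the viewfs root to a specific (non-default) namenode.
       The host below is a placeholder, not this cluster's actual namenode. -->
  <name>fs.viewfs.mounttable.default.link./database</name>
  <value>hdfs://some-other-namenode:8020/database</value>
</property>
{code}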

Observation:

This issue looks similar to HCATALOG-553; HCatLoader() most likely does not 
obtain a delegation token for the non-default namenode.

