Aritra Nayak created PHOENIX-5361:
-------------------------------------

             Summary: FileNotFoundException when the schema name is in lowercase
                 Key: PHOENIX-5361
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5361
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.13.0
         Environment: *Hadoop*: 2.6.0-cdh5.9.2

*Phoenix*: 4.13

*HBase*: 1.2.0-cdh5.9.2

*Java*: 8
            Reporter: Aritra Nayak


The table name (DUMMY_DATA) is in uppercase, but the schema name (s01) is in 
lowercase.

 

Steps to reproduce:
 # Create the Phoenix table:
{code:java}
CREATE TABLE IF NOT EXISTS "s01"."DUMMY_DATA"("id" BIGINT PRIMARY KEY,
    "firstName" VARCHAR, "lastName" VARCHAR);
{code}

 # Upload the CSV file to your preferred HDFS location:
{code:java}
/data/s01/DUMMY_DATA/1.csv
{code}

 # Run the hadoop jar command to bulk-load the data:
{code:java}
hadoop jar /opt/phoenix/phoenix4.13-cdh5.9.2-marin-1.5.1/phoenix4.13-cdh5.9.2-marin-1.5.1-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool --s \"\"s01\"\" --t DUMMY_DATA \
    --input /data/s01/DUMMY_DATA/1.csv --zookeeper zk-journalnode-lv-101:2181
{code}
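For reference, assuming a POSIX shell, the escaped \"\"s01\"\" above reaches the JVM as ""s01"". A trivial, hypothetical helper (not part of Phoenix) can double-check what the tool actually receives as its --s argument:
{code:java}
// Hypothetical helper (not part of Phoenix): print the raw arguments the JVM
// receives, to verify what CsvBulkLoadTool sees for --s after shell unescaping.
public class EchoArgs {
    public static void main(String[] args) {
        for (String arg : args) {
            System.out.println("[" + arg + "]");
        }
    }
}
{code}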
The load step then fails with the following error:
{code:java}
Exception in thread "main" java.io.FileNotFoundException: Bulkload dir /tmp/94ea4875-3453-4ed6-823d-3544ff05fd56/s01.DUMMY_DATA not found
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:194)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:289)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:393)
    at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:339)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:355)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:332)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:270)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:183)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:109)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}
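The load step is looking for /tmp/94ea4875-3453-4ed6-823d-3544ff05fd56/s01.DUMMY_DATA, which suggests the MapReduce job wrote its output under a differently named directory (my assumption, not verified). A small diagnostic sketch using the plain Hadoop FileSystem API, with the UUID path copied from the error above, would show what the working directory actually contains:
{code:java}
// Hypothetical diagnostic (not Phoenix code): list the bulk-load working
// directory from the exception above, to compare the directory name the job
// actually created against the one LoadIncrementalHFiles is looking for.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBulkloadDir {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path workDir = new Path("/tmp/94ea4875-3453-4ed6-823d-3544ff05fd56");
        for (FileStatus status : fs.listStatus(workDir)) {
            System.out.println(status.getPath());
        }
    }
}
{code}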

The MapReduce job reads 1,000,000 records but writes none, as the counters below show:

 
{code:java}
19/06/18 20:06:24 INFO mapreduce.Job: Counters: 50
    File System Counters
        FILE: Number of bytes read=20
        FILE: Number of bytes written=315801
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=41666811
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=39894
        Total time spent by all reduces in occupied slots (ms)=56216
        Total time spent by all map tasks (ms)=19947
        Total time spent by all reduce tasks (ms)=14054
        Total vcore-seconds taken by all map tasks=19947
        Total vcore-seconds taken by all reduce tasks=14054
        Total megabyte-seconds taken by all map tasks=40851456
        Total megabyte-seconds taken by all reduce tasks=57565184
    Map-Reduce Framework
        Map input records=1000000
        Map output records=0   <----- see here
        Map output bytes=0
        Map output materialized bytes=16
        Input split bytes=123
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=16
        Reduce input records=0
        Reduce output records=0
        Spilled Records=0
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=914
        CPU time spent (ms)=49240
        Physical memory (bytes) snapshot=2022809600
        Virtual memory (bytes) snapshot=8064647168
        Total committed heap usage (bytes)=3589275648
    Phoenix MapReduce Import
        Upserts Done=1000000
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=41666688
    File Output Format Counters
        Bytes Written=0
{code}
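A quick row count over the Phoenix JDBC driver can confirm that nothing landed in the table (a minimal sketch; the connection URL reuses the ZooKeeper quorum from the command above):
{code:java}
// Minimal verification sketch over the Phoenix JDBC driver: count the rows
// that actually landed in "s01"."DUMMY_DATA" after the bulk load.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CountRows {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:phoenix:zk-journalnode-lv-101:2181";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT COUNT(*) FROM \"s01\".\"DUMMY_DATA\"")) {
            rs.next();
            System.out.println("row count = " + rs.getLong(1)); // 0 in this case
        }
    }
}
{code}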
{color:#14892c}When the same steps (1-3) are followed with the uppercase schema name S01, the job passes and the data is successfully uploaded into the table.{color}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
