Hi Dhruv,

This is a bug in Phoenix, although it appears that your Hadoop
configuration is also somewhat unusual.

As far as I can see, your Hadoop configuration is set up to use the
local filesystem rather than HDFS. You can test this by running the
following command:

    hadoop dfs -ls /

If that command lists the contents of the root directory on your local
machine, then Hadoop is set up to use your local filesystem, not HDFS.
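
You can also check which default file system the hadoop command is
actually picking up (this assumes the hdfs client from your CDH
install is on the path):

    hdfs getconf -confKey fs.defaultFS

If that prints file:/// (the built-in default), the command is not
reading your cluster's core-site.xml.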

The bulk load tool currently uses Hadoop's configured default file
system to delete the (temporary) output directory. Because your setup
defaults to the local filesystem, while the paths you pass to the tool
point at HDFS, that delete fails with the "Wrong FS" error you're
seeing.
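
To illustrate the underlying Hadoop behaviour (this is a sketch of the
general FileSystem API pattern, not Phoenix's actual code): getting a
file system from the configuration returns the default one (file:/// in
your case), and asking it to touch an hdfs:// path fails its path
check, whereas resolving the file system from the path itself works:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WrongFsSketch {
        public static void main(String[] args) throws Exception {
            // In your setup fs.defaultFS resolves to file:///
            Configuration conf = new Configuration();
            Path outputDir = new Path("hdfs://cdh3.st.comany.org:8020/user/EXAMPLE");

            // The default file system is file:///, so deleting an hdfs:// path
            // through it throws IllegalArgumentException: Wrong FS: ..., expected: file:///
            FileSystem defaultFs = FileSystem.get(conf);
            // defaultFs.delete(outputDir, true);   // <- fails as in your log

            // Resolving the file system from the path's own scheme works
            // regardless of what fs.defaultFS is set to.
            FileSystem outputFs = outputDir.getFileSystem(conf);
            outputFs.delete(outputDir, true);
        }
    }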

The quick fix (if feasible) is to configure your hadoop command to use
HDFS as the default file system; a sample core-site.xml snippet is
below. However, could you also log this as a bug in JIRA
(https://issues.apache.org/jira/browse/PHOENIX)?
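
For the quick fix, something like this in the core-site.xml that your
hadoop command reads (on CDH typically under /etc/hadoop/conf) should
do it; the NameNode address below is taken from the paths in your
command, so adjust it if needed:

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://cdh3.st.comany.org:8020</value>
    </property>

After that, "hadoop dfs -ls /" should list the HDFS root rather than
your local root directory.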

- Gabriel


On Wed, Sep 23, 2015 at 2:45 PM, Dhruv Gohil <yourfrienddh...@gmail.com> wrote:
> Hi,
>     I am able to successfully use BulkLoadTool to load millions of rows in
> phoenixTable, but at end of each execution following error occurs. Need your
> inputs to make runs full green.
>
>     Following is minimal reproduction using EXAMPLE given in documentation.
> Env:
>     CDH 5.3
>     Hbase 0.98
>     Phoenix 4.2.2
>
>> Running BulkloadTool from one of the HDFS-CDH cluster machines.
>  Exactly following the instructions at:
> https://phoenix.apache.org/bulk_dataload.html
>
> HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase/conf hadoop jar
> phoenix-4.2.2-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool
> --table EXAMPLE --input
> hdfs://cdh3.st.comany.org:8020/data/example.csv --output
> hdfs://cdh3.st.comany.org:8020/user/ --zookeeper 10.10.10.1
>
>
> CsvBulkLoadTool - Import job on table=EXAMPLE failed due to
> exception:java.lang.IllegalArgumentException: Wrong FS:
> hdfs://cdh3.st.comany.org:8020/user/EXAMPLE, expected: file:///
>
> Job output log:
>
> 2015-09-23 17:54:09,633 [phoenix-2-thread-0] INFO
> org.apache.hadoop.mapreduce.Job - Job job_local1620809330_0001 completed
> successfully
> 2015-09-23 17:54:09,663 [phoenix-2-thread-0] INFO
> org.apache.hadoop.mapreduce.Job - Counters: 33
> File System Counters
> FILE: Number of bytes read=34018940
> FILE: Number of bytes written=34812676
> FILE: Number of read operations=0
> FILE: Number of large read operations=0
> FILE: Number of write operations=0
> HDFS: Number of bytes read=68
> HDFS: Number of bytes written=1241
> HDFS: Number of read operations=15
> HDFS: Number of large read operations=0
> HDFS: Number of write operations=5
> Map-Reduce Framework
> Map input records=2
> Map output records=6
> Map output bytes=330
> Map output materialized bytes=348
> Input split bytes=117
> Combine input records=0
> Combine output records=0
> Reduce input groups=2
> Reduce shuffle bytes=0
> Reduce input records=6
> Reduce output records=6
> Spilled Records=12
> Shuffled Maps =0
> Failed Shuffles=0
> Merged Map outputs=0
> GC time elapsed (ms)=0
> CPU time spent (ms)=0
> Physical memory (bytes) snapshot=0
> Virtual memory (bytes) snapshot=0
> Total committed heap usage (bytes)=736100352
> Phoenix MapReduce Import
> Upserts Done=2
> File Input Format Counters
> Bytes Read=34
> File Output Format Counters
> Bytes Written=1241
> 2015-09-23 17:54:09,664 [phoenix-2-thread-0] INFO
> org.apache.phoenix.mapreduce.CsvBulkLoadTool - Loading HFiles from
> hdfs://cdh3.st.comany.org:8020/user/EXAMPLE
> 2015-09-23 17:54:12,428 [phoenix-2-thread-0] WARN
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles - Skipping
> non-directory hdfs://cdh3.st.comany.org:8020/user/EXAMPLE/_SUCCESS
> 2015-09-23 17:54:16,233 [LoadIncrementalHFiles-0] INFO
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles - Trying to load
> hfile=hdfs://cdh3.st.comany.org:8020/user/EXAMPLE/M/1fe6f08c1a2f431ba87e08ce27ec813d
> first=\x80\x00\x00\x00\x00\x0009 last=\x80\x00\x00\x00\x00\x01\x092
> 2015-09-23 17:54:17,898 [phoenix-2-thread-0] INFO
> org.apache.phoenix.mapreduce.CsvBulkLoadTool - Incremental load complete for
> table=EXAMPLE
> 2015-09-23 17:54:17,899 [phoenix-2-thread-0] INFO
> org.apache.phoenix.mapreduce.CsvBulkLoadTool - Removing output directory
> hdfs://cdh3.st.comany.org:8020/user/EXAMPLE
> 2015-09-23 17:54:17,900 [phoenix-2-thread-0] ERROR
> org.apache.phoenix.mapreduce.CsvBulkLoadTool - Import job on table=EXAMPLE
> failed due to exception:java.lang.IllegalArgumentException: Wrong FS:
> hdfs://cdh3.st.comany.org:8020/user/EXAMPLE, expected: file:///
>
> --
> Dhruv
