The reduce partition information is stored in these partitions_XXXX files. See the 
code below:

HFileOutputFormat#configureIncrementalLoad:
        .....................
    Path partitionsPath = new Path(job.getWorkingDirectory(),
                                   "partitions_" + UUID.randomUUID());
    LOG.info("Writing partition information to " + partitionsPath);

    FileSystem fs = partitionsPath.getFileSystem(conf);
    writePartitions(conf, partitionsPath, startKeys);
        .....................
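As the snippet shows, each bulk-load job writes its partition split points to a fresh partitions_<UUID> file under the job's working directory, which is why they pile up under /user/root/ over many runs. If the leftovers need to be cleaned up, one option is to match the naming scheme and delete only the matches. A minimal sketch (this helper class and its UUID pattern are my own, not part of HBase; on a real cluster you would pair it with FileSystem.listStatus and FileSystem.delete):

```java
import java.util.UUID;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HBase): recognizes the
// "partitions_<UUID>" names that configureIncrementalLoad generates,
// so leftover files can be listed and deleted safely without
// touching anything else in the directory.
public class PartitionsFileFilter {

    // UUID.randomUUID() renders as lower-case hex in 8-4-4-4-12 groups.
    private static final Pattern PARTITIONS_NAME = Pattern.compile(
            "partitions_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}"
            + "-[0-9a-f]{4}-[0-9a-f]{12}");

    public static boolean isPartitionsFile(String name) {
        return PARTITIONS_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        // A freshly generated name matches; ordinary job output does not.
        System.out.println(isPartitionsFile("partitions_" + UUID.randomUUID()));
        System.out.println(isPartitionsFile("part-m-00000"));
    }
}
```

Only delete these files after the corresponding bulk-load job has finished; while the job is running, its partitions file is still in use by the partitioner.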

Hope it helps.

Jieshan
-----Original Message-----
From: Tao Xiao [mailto:xiaotao.cs....@gmail.com] 
Sent: Monday, December 16, 2013 6:48 PM
To: user@hbase.apache.org
Subject: Why so many unexpected files like partitions_xxxx are created?

I imported data into HBase via bulk load, but afterwards I found many 
unexpected files created in the HDFS directory /user/root/. They look like 
these:

/user/root/partitions_fd74866b-6588-468d-8463-474e202db070
/user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d
/user/root/partitions_fda37b8a-a882-4787-babc-8310a969f85c
/user/root/partitions_fdaca2f4-2792-41f6-b7e8-61a8a5677dea
/user/root/partitions_fdd55baa-3a12-493e-8844-a23ae83209c5
/user/root/partitions_fdd85a3c-9abe-45d4-a0c6-76d2bed88ea5
/user/root/partitions_fe133460-5f3f-4c6a-9fff-ff6c62410cc1
/user/root/partitions_fe29a2b0-b281-465f-8d4a-6044822d960a
/user/root/partitions_fe2fa6fa-9066-484c-bc91-ec412e48d008
/user/root/partitions_fe31667b-2d5a-452e-baf7-a81982fe954a
/user/root/partitions_fe3a5542-bc4d-4137-9d5e-1a0c59f72ac3
/user/root/partitions_fe6a9407-c27b-4a67-bb50-e6b9fd172bc9
/user/root/partitions_fe6f9294-f970-473c-8659-c08292c27ddd
... ...
... ...


It seems that they are HFiles, but I don't know why they were created here.

I bulk load data into HBase in the following way:

Firstly, I wrote a MapReduce program which only has map tasks. The map tasks 
read some text data and emit it as RowKey and KeyValue pairs. The following is 
my program:

        @Override
        protected void map(NullWritable NULL, GtpcV1SignalWritable signal,
                Context ctx) throws InterruptedException, IOException {
            String strRowkey = xxx;
            byte[] rowkeyBytes = Bytes.toBytes(strRowkey);

            rowkey.set(rowkeyBytes);

            part1.init(signal);
            part2.init(signal);

            KeyValue kv = new KeyValue(rowkeyBytes, Family_A, Qualifier_Q,
                    part1.serialize());
            ctx.write(rowkey, kv);

            kv = new KeyValue(rowkeyBytes, Family_B, Qualifier_Q,
                    part2.serialize());
            ctx.write(rowkey, kv);
        }


After the MR program finished, there were several HFiles in the output 
directory I specified.

Then I began to load these HFiles into HBase using the following command:
       hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
HFiles-Dir  MyTable

Finally, I could see that the data was indeed loaded into the HBase table.


But I could also see many unexpected files generated in the HDFS directory 
/user/root/, just as I mentioned at the beginning of this mail, and I did not 
specify any files to be produced in this directory.

What happened? Can anyone tell me what these files are and who produced them?

Thanks
