[ 
https://issues.apache.org/jira/browse/HBASE-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13808234#comment-13808234
 ] 

Alexandre Normand commented on HBASE-7744:
------------------------------------------

Sorry for the delay. I spent some time yesterday trying to add a test that 
would confirm that bulk load is broken in local mode on trunk. It's a hack, but 
it forces {{mapred.job.tracker}} to {{local}} and sets the filesystem to local 
as well. Even with all that in place, it still looks to me like this was fixed 
indirectly on trunk by other commits. Specifically, the key to the 0.94 fix is 
that the following gets executed when running in local mode:
{code}
      boolean localMode = "local".equals(conf.get("mapred.job.tracker"));
      if (localMode) {
        conf.set(TotalOrderPartitioner.PARTITIONER_PATH, partitionsPath.toString());
      }
{code}

On trunk, the behavior is similar, except that it runs in all cases (which is 
much cleaner, in my opinion):
{code}
  static void configurePartitioner(Job job, List<ImmutableBytesWritable> splitPoints)
      throws IOException {

    // create the partitions file
    FileSystem fs = FileSystem.get(job.getConfiguration());
    Path partitionsPath = new Path("/tmp", "partitions_" + UUID.randomUUID());
    ...
    job.setPartitionerClass(TotalOrderPartitioner.class);
    TotalOrderPartitioner.setPartitionFile(job.getConfiguration(), partitionsPath);
  }
{code}
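
To make the difference between the two snippets concrete, here is a minimal, 
self-contained sketch. It is not HBase or Hadoop code: a plain {{Map}} stands 
in for Hadoop's {{Configuration}}, and the class name, method names, and the 
{{sketch.partitioner.path}} key are hypothetical placeholders chosen only for 
illustration. It also does not model the symlink lookup that 0.94 relies on 
outside local mode.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in only: a Map plays the role of Hadoop's Configuration.
class PartitionerPathSketch {
    static final String JOB_TRACKER_KEY = "mapred.job.tracker";
    // Hypothetical key; NOT the real value of TotalOrderPartitioner.PARTITIONER_PATH.
    static final String PARTITIONER_PATH_KEY = "sketch.partitioner.path";

    // 0.94-style: record the partitions path in the config only in local mode.
    static void configureLocalOnly(Map<String, String> conf, String partitionsPath) {
        boolean localMode = "local".equals(conf.get(JOB_TRACKER_KEY));
        if (localMode) {
            conf.put(PARTITIONER_PATH_KEY, partitionsPath);
        }
    }

    // trunk-style: record the partitions path unconditionally.
    static void configureAlways(Map<String, String> conf, String partitionsPath) {
        conf.put(PARTITIONER_PATH_KEY, partitionsPath);
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put(JOB_TRACKER_KEY, "local");
        configureLocalOnly(local, "/tmp/partitions_example");
        System.out.println("local-only, local mode: " + local.get(PARTITIONER_PATH_KEY));

        Map<String, String> cluster = new HashMap<>();
        cluster.put(JOB_TRACKER_KEY, "jobtracker.example:8021"); // made-up address
        configureLocalOnly(cluster, "/tmp/partitions_example");
        System.out.println("local-only, cluster mode: " + cluster.get(PARTITIONER_PATH_KEY));

        configureAlways(cluster, "/tmp/partitions_example");
        System.out.println("always, cluster mode: " + cluster.get(PARTITIONER_PATH_KEY));
    }
}
```

With the unconditional variant, the path is present in the configuration 
whatever {{mapred.job.tracker}} says, which is why no special case for local 
mode is needed.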

I'd mark this as already fixed on trunk. I can attach my hacked-together test 
if anyone is curious (it basically just extends the existing 
{{IntegrationTestBulkLoad}} and overrides a few configuration settings to run 
in local mode).

Can anyone double-check my reasoning?

> Tools creating HFiles (Import, ImportTsv) don't run in local mode
> -----------------------------------------------------------------
>
>                 Key: HBASE-7744
>                 URL: https://issues.apache.org/jira/browse/HBASE-7744
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Nick Dimiduk
>            Assignee: Alexandre Normand
>            Priority: Minor
>         Attachments: HBASE-7744-0.94.6-v2.patch, HBASE-7744-0.94.patch, 
> HBASE-7744-trunk.patch, HBASE-7744-v1.patch
>
>
> This is mostly a developer pain point.
> HFileOutputFormat#configureIncrementalLoad depends on 
> DistributedCache#createSymlink to find the splits file when configuring TOP. 
> This symlink doesn't work when run in local mode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
