Re: local bulk loading?

Jean-Daniel Cryans Thu, 26 Apr 2012 15:32:40 -0700

Yep, same old problem that has been asked about a bunch of times on the user list :)


On Thu, Apr 26, 2012 at 3:29 PM, Dave Revell <[email protected]> wrote:
> Hi Doug,
>
> When I hit this problem, I concluded that HFileOutputFormat cannot be used
> in standalone mode since it requires DistributedCache, which doesn't work
> with the local job runner.
>
> So you're not the only one :(
>
> -Dave
>
> On Thu, Apr 26, 2012 at 1:52 PM, Doug Meil <[email protected]> wrote:
>
>>
>> Hi Devs-
>>
>> I'm coding up a local bulkloading example for the RefGuide but I've been
>> banging my head on this….
>>
>>
>>  WARN [Thread-8] (LocalJobRunner.java:295) - job_local_0001
>> java.lang.IllegalArgumentException: Can't read partitions file
>>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:552)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>> Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
>>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:372)
>>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>>     at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:751)
>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:296)
>>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:82)
>>
>> … does bulk loading work with the local job runner?  Obviously, you're not
>> going to run a production cluster off your laptop but it's nice to at least
>> be able to test your code.
>>
>> I know the DistributedCache doesn't work with the LocalJobRunner (and
>> TotalOrderPartitioner uses the DistributedCache) and then there's this log
>> message..
>>
>>
>>  WARN [main] (LocalJobRunner.java:134) - LocalJobRunner does not support symlinking into current working dir.
>>
>> … so I'm wondering how this actually works, if it does work locally.
>>
>> Coincidentally, this exact error is in the troubleshooting chapter..
>>
>> http://hbase.apache.org/book.html#trouble.mapreduce
>>
>> … but it came up in a different context.  There, the person asking thought
>> the job was running remotely, but it was really running locally.
>>
>> Doug Meil
>> Chief Software Architect, Explorys
>> [email protected]
>>
>>
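
Since the thread concludes that HFileOutputFormat (via TotalOrderPartitioner and the DistributedCache) does not work under the LocalJobRunner, one way to avoid the confusing "Can't read partitions file" error is to detect local mode before submitting the job. The following is only a minimal sketch under stated assumptions: LocalRunnerCheck and isLocalJobRunner are hypothetical names, not part of HBase or Hadoop, and a plain Map stands in for the job configuration so the example runs without Hadoop on the classpath. It assumes the Hadoop 1.x property "mapred.job.tracker", which falls back to "local" when unset.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of HBase or Hadoop. */
final class LocalRunnerCheck {

    // Hadoop 1.x property naming the JobTracker. When it is unset, or set to
    // "local", jobs run in-process under the LocalJobRunner.
    static final String JOB_TRACKER_KEY = "mapred.job.tracker";

    /** Returns true when the settings would select the local job runner. */
    static boolean isLocalJobRunner(Map<String, String> conf) {
        return "local".equals(conf.getOrDefault(JOB_TRACKER_KEY, "local"));
    }

    public static void main(String[] args) {
        // A plain map stands in for the job configuration (assumption).
        Map<String, String> conf = new HashMap<>();
        if (isLocalJobRunner(conf)) {
            System.err.println("Local job runner detected: the bulk-load job "
                    + "is likely to fail reading the partitions file.");
        }
    }
}
```

With a real job, the same check could presumably be made against the job's Configuration object before the bulk-load output format is set up, so the failure is reported up front instead of from inside the map task.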
