As far as I know, FileInputFormat.getSplits() computes the splits automatically from the number of input files and their blocks. BTW, what version of Hadoop/HBase are you using?
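To make the split arithmetic concrete, here is a rough, self-contained sketch of how I understand the old-API FileInputFormat.getSplits() to size its splits. It does not call Hadoop at all. The class name SplitEstimate, its method names, and the example file sizes are hypothetical. The max(minSize, min(goalSize, blockSize)) rule, the 1.1 slack factor, and the 64 MB block size are from my recollection of the 0.19-era code, so please check them against your version.

```java
public class SplitEstimate {
    // Assumed slack factor: the last split of a file may be up to 10% oversized.
    static final double SPLIT_SLOP = 1.1;

    // Assumed sizing rule: never below minSize, never above one block.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Estimates the number of splits (and so map tasks) for splittable files.
    // Simplification: zero-length files are skipped and compression is ignored.
    static int countSplits(long[] fileLengths, int requestedMaps,
                           long minSize, long blockSize) {
        long totalSize = 0;
        for (long len : fileLengths) {
            totalSize += len;
        }
        // The requested map count is only a hint used to derive a goal size.
        long goalSize = totalSize / (requestedMaps == 0 ? 1 : requestedMaps);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);

        int splits = 0;
        for (long len : fileLengths) {
            long remaining = len;
            // Splits never span files, so each non-empty file yields at least one.
            while ((double) remaining / splitSize > SPLIT_SLOP) {
                splits++;
                remaining -= splitSize;
            }
            if (remaining != 0) {
                splits++;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // assumed HDFS block size
        long[] tinyFiles = {100, 100, 100, 100, 100}; // hypothetical one-line files
        long[] bigFile = {200L * 1024 * 1024};        // hypothetical 200 MB file
        System.out.println(countSplits(tinyFiles, 2, 1, blockSize));
        System.out.println(countSplits(bigFile, 1, 1, blockSize));
    }
}
```

Under these assumptions, five 100-byte files with 2 requested maps come out as five splits, one per file, while a single 200 MB file is cut into block-sized pieces. That is why a couple of large input paths can produce hundreds of map tasks and a directory of tiny files produces one task per file.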
I tried that code (http://wiki.apache.org/hadoop/Hbase/MapReduce) on my cluster (Hadoop 0.19.1 and HBase 0.19.0). With 2 input paths, the job ran 274 map tasks.

Below is my changed code for v0.19.0.

---
public JobConf createSubmittableJob(String[] args) {
  JobConf c = new JobConf(getConf(), TestImport.class);
  c.setJobName(NAME);
  // args[0]: input path(s), args[1]: name of the target table
  FileInputFormat.setInputPaths(c, args[0]);
  c.set("input.table", args[1]);
  c.setMapperClass(InnerMap.class);
  // Map-only job: no reducers, and the job itself writes no output files
  c.setNumReduceTasks(0);
  c.setOutputFormat(NullOutputFormat.class);
  return c;
}

On Thu, Apr 23, 2009 at 6:19 PM, nguyenhuynh.mr <nguyenhuynh...@gmail.com> wrote:
> Edward J. Yoon wrote:
>
>> How do you to add input paths?
>>
>> On Wed, Apr 22, 2009 at 5:09 PM, nguyenhuynh.mr
>> <nguyenhuynh...@gmail.com> wrote:
>>
>>> Edward J. Yoon wrote:
>>>
>>>> Hi,
>>>>
>>>> In that case, The atomic unit of split is a file. So, you need to
>>>> increase the number of files. or Use the TextInputFormat as below.
>>>>
>>>> jobConf.setInputFormat(TextInputFormat.class);
>>>>
>>>> On Wed, Apr 22, 2009 at 4:35 PM, nguyenhuynh.mr
>>>> <nguyenhuynh...@gmail.com> wrote:
>>>>
>>>>> Hi all!
>>>>>
>>>>> I have a MR job use to import contents into HBase.
>>>>>
>>>>> The content is text file in HDFS. I used the maps file to store local
>>>>> path of contents.
>>>>>
>>>>> Each content has the map file. ( the map is a text file in HDFS and
>>>>> contain 1 line info).
>>>>>
>>>>> I created the maps directory used to contain map files. And the this
>>>>> maps directory used to input path for job.
>>>>>
>>>>> When i run job, the number map task is same number map files.
>>>>> Ex: I have 5 maps file -> 5 map tasks.
>>>>>
>>>>> Therefor, the map phase is slowly :(
>>>>>
>>>>> Why the map phase is slowly if the number map task large and the number
>>>>> map task is equal number of files?.
>>>>>
>>>>> * p/s: Run jobs with: 3 node: 1 server and 2 slaver
>>>>>
>>>>> Please help me!
>>>>> Thanks.
>>>>>
>>>>> Best,
>>>>> Nguyen.
>>>>>
>>>>
>>> Current, I use TextInputformat to set InputFormat for map phase.
>>>
>>
>> Thanks for your help!
> I use FileInputFormat to add input paths.
> Some thing like:
> FileInputFormat.setInputPath(new Path("dir"));
>
> The "dir" is a directory contains input files.
>
> Best,
> Nguyen
>

--
Best Regards, Edward J. Yoon
edwardy...@apache.org
http://blog.udanax.org