Take a look at CombineFileInputFormat.

DR
On 01/21/2011 09:24 AM, lei liu wrote:
There are two input directories, /user/test1/ and /user/test2/, and I want to join their contents. To do the join, I need to identify which directory the records handled by a mapper come from, so I use the code below in the mapper:

    private int tag = -1;

    @Override
    public void configure(JobConf conf) {
        try {
            this.conf = conf;
            // Example: conf.set("paths.to.alias", "0=/user/test1/,1=/user/test2/");
            String pathsToAliasStr = conf.get("paths.to.alias");
            String[] pathsToAlias = pathsToAliasStr.split(",");
            // Keep only the path component of the input file URI.
            Path fpath = new Path((new Path(conf.get("map.input.file"))).toUri().getPath());
            String path = fpath.toUri().toString();
            for (int i = 0; i < pathsToAlias.length; i++) {
                String[] pathToAlias = pathsToAlias[i].split("=");
                if (path.startsWith(pathToAlias[1])) {
                    // Identify which directory's content this map instance is handling.
                    tag = Integer.valueOf(pathToAlias[0].trim());
                }
            }
        } catch (Throwable e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
    }

So when the map method runs, all the content handled by that mapper is tagged as coming from the same directory. I would like to know whether one mapper instance only handles the content of one directory at a time.

Thanks,
LiuLei
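
The prefix lookup done in configure() can be pulled out into a small helper that is easy to check without a cluster. The following is only a sketch in plain Java: the class name PathTagResolver and the method tagFor are hypothetical and not part of Hadoop. It assumes the same "tag=directory" format as the paths.to.alias value shown above, and that the caller passes in the path component of the input file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): maps an input file path to the
// integer tag of the directory it lives under.
final class PathTagResolver {
    // Directory prefix -> tag, kept in the order given in the spec string.
    private final Map<String, Integer> prefixToTag = new LinkedHashMap<>();

    // spec uses the format from the message, e.g. "0=/user/test1/,1=/user/test2/".
    PathTagResolver(String spec) {
        for (String entry : spec.split(",")) {
            String[] parts = entry.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Bad entry: " + entry);
            }
            int tag = Integer.parseInt(parts[0].trim());
            String prefix = parts[1].trim();
            // Force a trailing slash so /user/test1 does not match /user/test10/...
            if (!prefix.endsWith("/")) {
                prefix = prefix + "/";
            }
            prefixToTag.put(prefix, tag);
        }
    }

    // Returns the tag of the first matching directory, or -1 if none matches.
    int tagFor(String inputPath) {
        for (Map.Entry<String, Integer> e : prefixToTag.entrySet()) {
            if (inputPath.startsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return -1;
    }
}
```

With the spec "0=/user/test1/,1=/user/test2/", tagFor("/user/test1/part-00000") returns 0, tagFor("/user/test2/part-00000") returns 1, and a path under any other directory returns -1. Stopping at the first match and normalising the trailing slash are design choices of this sketch, not something the original code does.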