There are two input directories, /user/test1/ and /user/test2/, and I want to
join their contents. To do the join, I need to identify which directory the
content handled by a mapper comes from, so I use the code below in the
mapper:

    private int tag = -1; // alias of the input directory; -1 if none matched

    @Override
    public void configure(JobConf conf) {
        try {
            this.conf = conf;
            // Example: conf.set("paths.to.alias", "0=/user/test1/,1=/user/test2/");
            String pathsToAliasStr = conf.get("paths.to.alias");
            String[] pathsToAlias = pathsToAliasStr.split(",");

            // Keep only the path part of the input file URI (drop scheme and host).
            String path = new Path(conf.get("map.input.file")).toUri().getPath();

            for (int i = 0; i < pathsToAlias.length; i++) {
                String[] pathToAlias = pathsToAlias[i].split("=");
                if (path.startsWith(pathToAlias[1])) {
                    // Record which directory this map instance is handling.
                    tag = Integer.parseInt(pathToAlias[0].trim());
                }
            }
        } catch (Throwable e) {
            e.printStackTrace();
            throw new RuntimeException(e);
        }
    }
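
The prefix matching in configure() can be checked outside Hadoop. Below is a
minimal sketch of the same lookup using only the JDK; the class name
PathTagResolver, the method name resolveTag, and the host namenode:8020 are
hypothetical names chosen for illustration, and it assumes the input file
value is a well-formed URI or a plain absolute path.

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop: mirrors the lookup done in configure().
final class PathTagResolver {

    private PathTagResolver() {
    }

    /**
     * Returns the alias whose directory prefix matches inputFile, or -1 if none does.
     * aliasSpec uses the same format as "paths.to.alias", e.g. "0=/user/test1/,1=/user/test2/".
     */
    static int resolveTag(String aliasSpec, String inputFile) {
        // Drop the scheme and authority (e.g. hdfs://namenode:8020), keep only the path.
        String path = URI.create(inputFile).getPath();
        for (String entry : aliasSpec.split(",")) {
            String[] aliasAndDir = entry.split("=", 2);
            if (aliasAndDir.length == 2 && path.startsWith(aliasAndDir[1].trim())) {
                return Integer.parseInt(aliasAndDir[0].trim());
            }
        }
        return -1; // no configured directory matched
    }
}
```

For example, resolveTag("0=/user/test1/,1=/user/test2/",
"hdfs://namenode:8020/user/test1/part-00000") returns 0, and a file under
neither directory returns -1.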

So when the map method runs, all the content handled by that mapper is
identified as coming from the same directory.

I want to know whether one mapper instance only handles the content of one
directory at a time.


Thanks

LiuLei
