[ 
https://issues.apache.org/jira/browse/ACCUMULO-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815670#comment-13815670
 ] 

Josh Elser commented on ACCUMULO-1854:
--------------------------------------

I was talking to Christopher about this tonight. He raised the good question 
of why we shouldn't just use the AccumuloMultiTableInputFormat. One point we 
came to was that making these changes would allow a single M/R job to talk to 
several separate Accumulo clusters instead of only one.

I did settle on a change that I'm not completely happy with: it relies on 
the fact that splits are generated serially by a single host. If they were 
generated in parallel, my approach would break. However, given that the 
InputFormat can't rely on getting the same Configuration object in each 
invocation of getSplits, the only other reliable approach I could come up with 
was to use something like HDFS, which has its own concurrency issues. 
Since that's not an issue now, I've punted on worrying about it.
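
To make the underlying collision concrete, here is a minimal sketch. It is not 
the patch itself: a plain Map stands in for the Hadoop Configuration, and the 
class name, the key name, and all three helper methods are made up for 
illustration. It only shows why two configurations written under one fixed key 
clobber each other, and how a per-instance suffix keeps them apart.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigCollisionSketch {
    // Hypothetical property name; the real input format uses its own keys.
    static final String FIXED_KEY = "sketch.inputformat.table";

    // Every caller writes to the same key, so the last writer wins.
    static void setFixed(Map<String, String> conf, String table) {
        conf.put(FIXED_KEY, table);
    }

    // Assumed alternative: suffix the key with an id that is unique per configured instance.
    static void setNamespaced(Map<String, String> conf, int id, String table) {
        conf.put(FIXED_KEY + "." + id, table);
    }

    static String getNamespaced(Map<String, String> conf, int id) {
        return conf.get(FIXED_KEY + "." + id);
    }

    public static void main(String[] args) {
        // Stand-in for the single Configuration shared by the whole Job.
        Map<String, String> conf = new HashMap<>();

        // Two inputs configured through the fixed key: the first one is lost.
        setFixed(conf, "tableA");
        setFixed(conf, "tableB");
        System.out.println("fixed key holds: " + conf.get(FIXED_KEY));

        // The same two inputs under distinct ids: both survive.
        setNamespaced(conf, 0, "tableA");
        setNamespaced(conf, 1, "tableB");
        System.out.println("id 0 holds: " + getNamespaced(conf, 0));
        System.out.println("id 1 holds: " + getNamespaced(conf, 1));
    }
}
```

The sketch does not address the hard part described above, namely how each 
instance learns which id is its own when getSplits may be handed a different 
Configuration object on each invocation.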

> Accumulo{Input,Output}Format can't handle multiple configurations
> -----------------------------------------------------------------
>
>                 Key: ACCUMULO-1854
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1854
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.4.4, 1.5.0
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.4.5, 1.5.1, 1.6.1
>
>
> I noticed that I was unable to properly use MultipleInputs (or any code which 
> uses a similar approach) with the AccumuloInputFormat class because of the 
> way it builds up information in the Configuration object.
> It would be useful to be able to have multiple instances of AIF (and AOF) 
> configured within one Job (Configuration).



--
This message was sent by Atlassian JIRA
(v6.1#6144)
