Are you configuring MultipleOutputs to emit Accumulo Key/Value objects? How are you ingesting the data into Accumulo?
public class MultipleOutputs <https://hadoop.apache.org/docs/r2.6.3/api/src-html/org/apache/hadoop/mapreduce/lib/output/MultipleOutputs.html#line.175><KEYOUT,VALUEOUT>

On Thu, May 5, 2022 at 9:46 AM Vincent Russell <[email protected]> wrote:

> Thank you for the reply mike.
>
> These are the counters that show up in the job history server for example.
> For instance in the accumulo docs:
>
> $ bin/tool.sh lib/accumulo-examples-simple.jar
> org.apache.accumulo.examples.simple.mapreduce.WordCount -i instance -z
> zookeepers --input /user/username/wc -t wordCount -u username -p
> password
>
> 11/02/07 18:20:11 INFO input.FileInputFormat: Total input paths to process : 1
> 11/02/07 18:20:12 INFO mapred.JobClient: Running job: job_201102071740_0003
> 11/02/07 18:20:13 INFO mapred.JobClient: map 0% reduce 0%
> 11/02/07 18:20:20 INFO mapred.JobClient: map 100% reduce 0%
> 11/02/07 18:20:22 INFO mapred.JobClient: Job complete: job_201102071740_0003
> 11/02/07 18:20:22 INFO mapred.JobClient: Counters: 6
> 11/02/07 18:20:22 INFO mapred.JobClient: Job Counters
> 11/02/07 18:20:22 INFO mapred.JobClient: Launched map tasks=1
> 11/02/07 18:20:22 INFO mapred.JobClient: Data-local map tasks=1
> 11/02/07 18:20:22 INFO mapred.JobClient: FileSystemCounters
> 11/02/07 18:20:22 INFO mapred.JobClient: HDFS_BYTES_READ=10487
> 11/02/07 18:20:22 INFO mapred.JobClient: Map-Reduce Framework
> 11/02/07 18:20:22 INFO mapred.JobClient: Map input records=255
> 11/02/07 18:20:22 INFO mapred.JobClient: Spilled Records=0
> 11/02/07 18:20:22 INFO mapred.JobClient: Map output records=1452
>
> The outputformat is a MultipleOutputs that write to disk.
>
> I guess my question is should we care about these counters? Do they mean
> anything with accumulo? It seems to suggest that mappers are not running
> local with the tservers.
>
> Thanks,
> Vincent
>
> On Thu, May 5, 2022 at 9:03 AM Mike Miller <[email protected]> wrote:
>
> > What do you mean by data local or rack-local map task? You don't have any
> > mappers running? What is your output format? What configuration are you
> > setting in the Job?
> >
> > On Wed, May 4, 2022 at 8:44 PM Vincent Russell <[email protected]>
> > wrote:
> >
> > > Hello,
> > >
> > > I am using map reduce with accumulo 2.0.1 and hadoop 3.3.1. After our map
> > > reduce jobs complete we take a look at the counters and the Data-local-map
> > > tasks and rack-local map tasks are both equal to 0. Do we probably have
> > > something misconfigured or is this expected? Or can this count not be
> > > configured properly with the AccumuloInputFormat?
> > >
> > > Thank you,
> > > Vincent
