Hi,
I am trying to run a bulk ingest to import data into Accumulo, but it
fails in the reduce task with the following error:
java.lang.IllegalStateException: Keys appended out-of-order. New key client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col3 [myVis] 9223372036854775807 false, previous key client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:col16 [myVis] 9223372036854775807 false
    at org.apache.accumulo.core.file.rfile.RFile$Writer.append(RFile.java:378)
Could this be caused by the order in which the writes are being done?
*-- Background*
The input is a tab-separated file. A sample row would look like:
Data1 Data2 Data3 Data4 Data5 … DataN
For each row, the map task parses the data into a Map<String, String>,
which will contain the following:
Col1 Data1
Col2 Data2
Col3 Data3
…
ColN DataN
An outputKey is then generated for this row in the format
*client@timeStamp@randomUUID*
Then, for each entry in the Map<String, String>, an outputValue is generated
in the format *ColN|DataN*
The outputKey and outputValue are written to Context.
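
To make the map-side logic concrete, here is a minimal sketch of the steps above in plain Java, without the Hadoop Mapper wrapper. The class and method names (RowSketch, parseRow, buildOutputKey, buildOutputValues) are hypothetical, and the yyyyMMddHHmmss timestamp pattern is an assumption inferred from the sample key rather than taken from the real job.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class RowSketch {
    // Parse one tab-separated row into Col1..ColN -> Data1..DataN.
    public static Map<String, String> parseRow(String line) {
        Map<String, String> columns = new LinkedHashMap<>();
        String[] fields = line.split("\t", -1); // -1 keeps trailing empty fields
        for (int i = 0; i < fields.length; i++) {
            columns.put("Col" + (i + 1), fields[i]);
        }
        return columns;
    }

    // Build the row key in the form client@timeStamp@randomUUID.
    public static String buildOutputKey(String client, Date when, UUID id) {
        // Assumption: yyyyMMddHHmmss, guessed from the sample key.
        String timeStamp = new SimpleDateFormat("yyyyMMddHHmmss").format(when);
        return client + "@" + timeStamp + "@" + id;
    }

    // Build one ColN|DataN value per map entry, in insertion order.
    public static List<String> buildOutputValues(Map<String, String> columns) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> e : columns.entrySet()) {
            values.add(e.getKey() + "|" + e.getValue());
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> columns = parseRow("Data1\tData2\tData3");
        System.out.println(buildOutputKey("client", new Date(), UUID.randomUUID()));
        System.out.println(buildOutputValues(columns));
    }
}
```

In the real mapper, each string from buildOutputValues would be wrapped in a Text and written to the Context together with the outputKey.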
The map phase completes successfully; however, the reduce task fails.
My ReduceClass is as follows:
    public static class ReduceClass extends Reducer<Text,Text,Key,Value> {
        public void reduce(Text key, Iterable<Text> keyValues, Context output)
                throws IOException, InterruptedException {
            // for each value belonging to the key
            for (Text keyValue : keyValues) {
                // split the keyValue into Col and Data
                String[] values = keyValue.toString().split("\\|");
                // Generate key
                Key outputKey = new Key(key, new Text("foo"),
                        new Text(values[0]), new Text("myVis"));
                // Generate value (whole byte array, so the length is in
                // bytes rather than in characters)
                Value outputValue = new Value(values[1].getBytes());
                // Write to context
                output.write(outputKey, outputValue);
            }
        }
    }
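
One way to test the ordering theory would be to buffer the values for a row and write them in sorted column order instead of arrival order. The sketch below is plain Java with no Hadoop or Accumulo types. The names SortedColumns and sortByColumn are hypothetical, and String ordering is assumed to match the byte-wise ordering of the column qualifiers, which holds for ASCII names like these.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedColumns {
    // Split each "Col|Data" value and collect the pairs in a TreeMap,
    // which iterates in lexicographic order of the column name.
    public static Map<String, String> sortByColumn(Iterable<String> keyValues) {
        Map<String, String> sorted = new TreeMap<>();
        for (String keyValue : keyValues) {
            // Limit of 2 so a '|' inside the data stays in the data part.
            String[] parts = keyValue.split("\\|", 2);
            sorted.put(parts[0], parts[1]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("Col2|b", "Col10|c", "Col1|a");
        // Lexicographic order puts Col10 before Col2.
        System.out.println(sortByColumn(values).keySet()); // [Col1, Col10, Col2]
    }
}
```

In the reducer above, each Text value would be converted with toString() before buffering, and the Key and Value would then be built while iterating over the sorted map.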
*-- Expected output*
I am expecting the contents of the Accumulo table to be as follows:
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col1 [myVis] Data1
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col2 [myVis] Data2
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col3 [myVis] Data3
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col4 [myVis] Data4
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:Col5 [myVis] Data5
…
client@20121206123059@0014efca-d8e8-492e-83cb-e5b6b7c49f7a foo:ColN [myVis] DataN
Thanks,
Andrew