Currently I can retrieve entries if I use MapFileOutputFormat via 
conf.setOutputFormat with no compression specified.  But I was trying to do 
this:

public void configure(JobConf jobConf) {
    ...
    this.writer = new MapFile.Writer(jobConf, fileSys, dirName,
        Text.class, Text.class, SequenceFile.CompressionType.BLOCK);
    ...
}

public void map(WritableComparable key, Writable value,
                OutputCollector output, Reporter reporter) throws IOException {
    ...
    writer.append(newkey, newvalue);
    ...
}
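
One thing I am also double-checking on my side (a guess, not a confirmed diagnosis): with BLOCK compression the writer buffers whole blocks in memory, so if the MapFile.Writer created in configure() is never closed when the task finishes, the buffered blocks and the index may never be flushed to the data file. A minimal sketch of the close hook, using the writer field from configure() above:

```java
// Sketch: flush and close the manually created MapFile.Writer when the
// mapper finishes; with BLOCK compression, unflushed blocks are lost.
public void close() throws IOException {
    if (this.writer != null) {
        this.writer.close();
        this.writer = null;
    }
}
```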

to use SequenceFile block compression. Then, in a separate class, I try to retrieve the output values:

public static void main(String[] args) throws Exception {
    ...
    conf.setInputFormat(org.apache.hadoop.mapred.SequenceFileInputFormat.class);
    ...
    MapFile.Reader[] readers =
        MapFileOutputFormat.getReaders(fileSys, inDataPath, defaults);
    Partitioner part =
        (Partitioner) ReflectionUtils.newInstance(conf.getPartitionerClass(), conf);
    Text entryValue = (Text) MapFileOutputFormat.getEntry(
        readers, part, new Text("mykey"), new Text());
    if (entryValue != null) {
        System.out.println("My Entry's Value: ");
        System.out.println(entryValue.toString());
    }
    for (MapFile.Reader reader : readers) {
        if (reader != null) {
            reader.close();
        }
    }
}

But when I use block compression, I no longer get a result from 
MapFileOutputFormat.getEntry. What am I doing wrong? And/or is there a way 
to make this work using conf.setOutputFormat(MapFileOutputFormat.class) and 
conf.setMapOutputCompressionType(SequenceFile.CompressionType.BLOCK)?
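
For that last part: if I read the old API correctly, conf.setMapOutputCompressionType controls the intermediate map output, not the job's final output, so it may not affect what MapFileOutputFormat writes at all. A hedged sketch of what I am considering instead, driving block compression through the JobConf rather than a hand-made writer (MyJob is a placeholder class name, and I have not verified this against my Hadoop version):

```java
JobConf conf = new JobConf(MyJob.class);  // MyJob is a placeholder
conf.setOutputFormat(MapFileOutputFormat.class);
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);

// Ask for the final job output to be written as block-compressed
// SequenceFiles; MapFileOutputFormat should pick these settings up
// when it creates its writers (untested assumption on my part).
FileOutputFormat.setCompressOutput(conf, true);
SequenceFileOutputFormat.setOutputCompressionType(conf,
    SequenceFile.CompressionType.BLOCK);
```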
