Hi,


I’d like to sink my data into HDFS using SequenceFileAsBinaryOutputFormat
with compression, and I found a way to do it from
https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/hadoop_compatibility.html.
The code works, but I’m curious: since it creates a MapReduce Job instance
here, would this Flink application create and run a MapReduce job
underneath? If so, will it hurt performance?



I tried to figure it out by looking at the logs, but couldn’t find a clue.
I hope someone can shed some light here. Thank you.



import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile.CompressionType;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileAsBinaryOutputFormat;

// Create a Hadoop Job instance to hold the output format configuration.
Job job = Job.getInstance();
HadoopOutputFormat<BytesWritable, BytesWritable> hadoopOF =
        new HadoopOutputFormat<>(new SequenceFileAsBinaryOutputFormat(), job);

// Enable block-level compression on the sequence file output.
hadoopOF.getConfiguration().set("mapreduce.output.fileoutputformat.compress", "true");
hadoopOF.getConfiguration().set("mapreduce.output.fileoutputformat.compress.type",
        CompressionType.BLOCK.toString());

// setOutputPath is a static method inherited from FileOutputFormat.
FileOutputFormat.setOutputPath(job, new Path("hdfs://..."));

dataset.output(hadoopOF);
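
For reference, the dataset above is a DataSet<Tuple2<BytesWritable, BytesWritable>>,
which is what the wrapped output format expects. Here is a simplified sketch of how
it might be built; the record contents and UTF-8 encoding are just placeholders, not
my real pipeline:

import java.nio.charset.StandardCharsets;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.io.BytesWritable;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// Placeholder input; the real records come from elsewhere.
DataSet<String> raw = env.fromElements("record-1", "record-2");

// SequenceFileAsBinaryOutputFormat consumes binary key/value pairs,
// so each record is converted into a Tuple2 of BytesWritable.
DataSet<Tuple2<BytesWritable, BytesWritable>> dataset =
        raw.map(new MapFunction<String, Tuple2<BytesWritable, BytesWritable>>() {
            @Override
            public Tuple2<BytesWritable, BytesWritable> map(String s) {
                byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
                return new Tuple2<>(new BytesWritable(bytes), new BytesWritable(bytes));
            }
        });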
