Hi,
I'm not sure whether this belongs on hive-dev or hive-user.
I have a folder with many small files.
I would like to reduce the number of files the way Hive merges its output files.
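To make the goal concrete, here is a minimal sketch of the result I am after, written against the local filesystem with plain java.nio. It does not use any Hive or Hadoop API. The class name SmallFileMerger, the merged-NNNNN output naming, and the targetBytes threshold are my own placeholders, not anything taken from Hive. It also assumes the files can simply be concatenated (e.g. plain text), which would not hold for every file format.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only: not part of Hive or Hadoop.
public class SmallFileMerger {

    /**
     * Concatenates the regular files directly under inputDir (in path order)
     * into files named merged-00000, merged-00001, ... under outputDir.
     * A new output file is started once the current one has reached
     * targetBytes. Assumes outputDir is a different directory from inputDir.
     */
    public static List<Path> merge(Path inputDir, Path outputDir, long targetBytes)
            throws IOException {
        List<Path> inputs;
        try (Stream<Path> listing = Files.list(inputDir)) {
            inputs = listing.filter(Files::isRegularFile)
                            .sorted()
                            .collect(Collectors.toList());
        }
        Files.createDirectories(outputDir);

        List<Path> outputs = new ArrayList<>();
        OutputStream out = null;
        long written = 0;
        try {
            for (Path input : inputs) {
                // Roll over to a new output file when the current one is full.
                if (out == null || written >= targetBytes) {
                    if (out != null) {
                        out.close();
                    }
                    Path target = outputDir.resolve(
                            String.format("merged-%05d", outputs.size()));
                    out = Files.newOutputStream(target);
                    outputs.add(target);
                    written = 0;
                }
                // Files.copy returns the number of bytes appended to 'out'.
                written += Files.copy(input, out);
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return outputs;
    }
}
```

For example, SmallFileMerger.merge(Paths.get("in"), Paths.get("out"), 128L * 1024 * 1024) would pack the files under "in" into outputs of roughly 128 MB each. What I am looking for is the equivalent of this, run as a job on the cluster.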
I tried to understand from the source of
org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1 how to use the API to
submit a job that merges output files.
I think I was able to identify:

    private void createMergeJob(FileSinkOperator fsOp, GenMRProcContext ctx,
                                String finalName)
        throws SemanticException

as the entry point to the logic that performs the operation, but I did not find
any documentation on how to use it.
Is there an example that demonstrates the use of this API call?