xiangfu0 commented on a change in pull request #7193:
URL: https://github.com/apache/pinot/pull/7193#discussion_r675884881
##########
File path:
pinot-tools/src/main/java/org/apache/pinot/tools/admin/command/SegmentProcessorFrameworkCommand.java
##########
@@ -72,25 +80,42 @@ public String description() {
@Override
public boolean execute()
throws Exception {
+ PluginManager.get().init();
SegmentProcessorFrameworkSpec segmentProcessorFrameworkSpec =
JsonUtils.fileToObject(new File(_segmentProcessorFrameworkSpec), SegmentProcessorFrameworkSpec.class);
File inputSegmentsDir = new File(segmentProcessorFrameworkSpec.getInputSegmentsDir());
File outputSegmentsDir = new File(segmentProcessorFrameworkSpec.getOutputSegmentsDir());
- if (!outputSegmentsDir.exists()) {
- if (!outputSegmentsDir.mkdirs()) {
- throw new RuntimeException(
- "Did not find output directory, and could not create it either: " + segmentProcessorFrameworkSpec
- .getOutputSegmentsDir());
+ File workingDir = new File(outputSegmentsDir, "tmp-" + UUID.randomUUID());
+ File untarredSegmentsDir = new File(workingDir, "untarred_segments");
+ FileUtils.forceMkdir(untarredSegmentsDir);
+ File[] segmentDirs = inputSegmentsDir.listFiles();
+ Preconditions
+ .checkState(segmentDirs != null && segmentDirs.length > 0, "Failed to find files under input segments dir: %s",
+ inputSegmentsDir.getAbsolutePath());
+ List<RecordReader> recordReaders = new ArrayList<>(segmentDirs.length);
+ for (File segmentDir : segmentDirs) {
+ String fileName = segmentDir.getName();
+
+ // Untar the segments if needed
+ if (!segmentDir.isDirectory()) {
+ if (fileName.endsWith(".tar.gz") || fileName.endsWith(".tgz")) {
+ segmentDir = TarGzCompressionUtils.untar(segmentDir, untarredSegmentsDir).get(0);
+ } else {
+ throw new IllegalStateException("Unsupported segment format: " + segmentDir.getAbsolutePath());
Review comment:
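Not directly related to this PR, but I think we may want a util function that
checks whether a file is in tar.gz format, based on its content rather than
its name.
E.g. the controller directory stores segment tar.gz files without an extension,
so the `.tar.gz` / `.tgz` suffix check above would reject them.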
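
A rough sketch of what such a check could look like, using only the JDK. The
class name `TarGzFormatCheck` and both method names are hypothetical and are
not existing Pinot utilities. It assumes a content-based check is wanted:
first the gzip magic bytes (0x1f 0x8b), then the `ustar` marker at offset 257
of the first decompressed tar header. Old v7 tar archives carry no such marker
and would be reported as not tar.gz, so this is a heuristic rather than a
definitive test.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.GZIPInputStream;
import java.util.zip.ZipException;

// Hypothetical helper, not part of Pinot: detects tar.gz by content, not by file name.
public final class TarGzFormatCheck {
  private static final int TAR_HEADER_SIZE = 512;
  private static final int USTAR_MAGIC_OFFSET = 257;
  private static final byte[] USTAR_MAGIC = "ustar".getBytes(StandardCharsets.US_ASCII);

  private TarGzFormatCheck() {
  }

  /** Returns true if the file is a regular file starting with the gzip magic bytes 0x1f 0x8b. */
  public static boolean isGzip(File file)
      throws IOException {
    if (!file.isFile()) {
      return false;
    }
    try (InputStream in = Files.newInputStream(file.toPath())) {
      return in.read() == 0x1f && in.read() == 0x8b;
    }
  }

  /** Returns true if the file is gzip and its first decompressed block looks like a ustar/GNU tar header. */
  public static boolean isTarGz(File file)
      throws IOException {
    if (!isGzip(file)) {
      return false;
    }
    byte[] header = new byte[TAR_HEADER_SIZE];
    int total = 0;
    try (InputStream in = new GZIPInputStream(new BufferedInputStream(Files.newInputStream(file.toPath())))) {
      // Read up to one full tar header; read() may return fewer bytes than requested.
      while (total < TAR_HEADER_SIZE) {
        int n = in.read(header, total, TAR_HEADER_SIZE - total);
        if (n < 0) {
          break;
        }
        total += n;
      }
    } catch (ZipException | EOFException e) {
      // Corrupt or truncated gzip stream: treat as not tar.gz.
      return false;
    }
    if (total < TAR_HEADER_SIZE) {
      return false;
    }
    for (int i = 0; i < USTAR_MAGIC.length; i++) {
      if (header[USTAR_MAGIC_OFFSET + i] != USTAR_MAGIC[i]) {
        return false;
      }
    }
    return true;
  }
}
```

With something like this, the branch above could call
`TarGzFormatCheck.isTarGz(segmentDir)` instead of matching on the file name,
at the cost of opening each candidate file once before untarring.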
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]