chenboat commented on a change in pull request #5221: Add a new server api for
download of segments.
URL: https://github.com/apache/incubator-pinot/pull/5221#discussion_r406550254
##########
File path:
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java
##########
@@ -175,4 +183,43 @@ public String getCrcMetadataForTable(
}
}
}
+
+ // TODO Add access control similar to PinotSegmentUploadDownloadRestletResource for segment download.
+ @GET
+ @Produces(MediaType.APPLICATION_OCTET_STREAM)
+ @Path("/segments/{tableNameWithType}/{segmentName}")
+ @ApiOperation(value = "Download a segment", notes = "Download a segment in zipped tar format")
+ public Response downloadSegment(
+ @ApiParam(value = "Name of the table with type REALTIME OR OFFLINE", required = true, example = "myTable_OFFLINE") @PathParam("tableNameWithType") String tableNameWithType,
+ @ApiParam(value = "Name of the segment", required = true) @PathParam("segmentName") @Encoded String segmentName,
+ @Context HttpHeaders httpHeaders)
+ throws Exception {
+ LOGGER.info("Received a request to download segment {} for table {}", segmentName, tableNameWithType);
+ TableDataManager tableDataManager = checkGetTableDataManager(tableNameWithType);
+ SegmentDataManager segmentDataManager = tableDataManager.acquireSegment(segmentName);
+ if (segmentDataManager == null) {
+ throw new WebApplicationException(
+ String.format("Table %s segment %s does not exist", tableNameWithType, segmentName),
+ Response.Status.NOT_FOUND);
+ }
+ try {
+ String tableDir = tableDataManager.getTableDataDir();
+ // TODO Limit the number of concurrent downloads of segments because compression is an expensive operation.
+ String tarFilePath = TarGzCompressionUtils.createTarGzOfDirectory(tableDir + File.separator + segmentName);
Review comment:
I would go with (1) for now. (2) is along the same lines as the performance
optimization we left as a TODO here. The deleteOnExit call has already been
added. Using the segmentTarDir config looks like extra protection against
polluting the server disk.
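
To make the trade-off concrete, here is a rough sketch of the idea, not the code in this PR. It assumes a dedicated tar directory (the `segmentTarDir` parameter stands in for whatever the config resolves to), marks the tarball with `deleteOnExit` as a safety net, and uses a semaphore as one possible way to address the concurrency TODO. The class name, the `Compressor` interface, and the permit count of 2 are all hypothetical; the compression step itself is left pluggable rather than calling any Pinot utility.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Semaphore;

// Hypothetical helper, for illustration only; not part of Pinot.
final class SegmentDownloadSketch {
  // Assumed cap on concurrent tarball creations (compression is expensive).
  private static final Semaphore DOWNLOAD_PERMITS = new Semaphore(2);

  /** Stand-in for the real compression step. */
  interface Compressor {
    void compress(Path segmentDir, Path outputFile) throws IOException;
  }

  static File createTempTarball(Path segmentTarDir, Path segmentDir, Compressor compressor)
      throws IOException {
    // Reject the request instead of queueing when all permits are taken.
    if (!DOWNLOAD_PERMITS.tryAcquire()) {
      throw new IllegalStateException("Too many concurrent segment downloads");
    }
    try {
      // Keep tarballs out of the table data directory.
      Files.createDirectories(segmentTarDir);
      Path tarFile =
          Files.createTempFile(segmentTarDir, segmentDir.getFileName() + "-", ".tar.gz");
      // Safety net: remove the file at JVM shutdown if nothing deleted it earlier.
      tarFile.toFile().deleteOnExit();
      compressor.compress(segmentDir, tarFile);
      return tarFile.toFile();
    } finally {
      DOWNLOAD_PERMITS.release();
    }
  }
}
```

Note that `deleteOnExit` only runs at JVM shutdown, so a long-running server would still want to delete the tarball once the response has been streamed; the dedicated directory just bounds where leftovers can accumulate.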