mcvsubbu commented on a change in pull request #5221: Add a new server api for download of segments.
URL: https://github.com/apache/incubator-pinot/pull/5221#discussion_r407801772
 
 

 ##########
 File path: pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java
 ##########
 @@ -175,4 +183,43 @@ public String getCrcMetadataForTable(
       }
     }
   }
+
+  // TODO Add access control similar to PinotSegmentUploadDownloadRestletResource for segment download.
+  @GET
+  @Produces(MediaType.APPLICATION_OCTET_STREAM)
+  @Path("/segments/{tableNameWithType}/{segmentName}")
+  @ApiOperation(value = "Download a segment", notes = "Download a segment in zipped tar format")
+  public Response downloadSegment(
+      @ApiParam(value = "Name of the table with type REALTIME OR OFFLINE", required = true, example = "myTable_OFFLINE") @PathParam("tableNameWithType") String tableNameWithType,
+      @ApiParam(value = "Name of the segment", required = true) @PathParam("segmentName") @Encoded String segmentName,
+      @Context HttpHeaders httpHeaders)
+      throws Exception {
+    LOGGER.info("Received a request to download segment {} for table {}", segmentName, tableNameWithType);
+    TableDataManager tableDataManager = checkGetTableDataManager(tableNameWithType);
+    SegmentDataManager segmentDataManager = tableDataManager.acquireSegment(segmentName);
+    if (segmentDataManager == null) {
+      throw new WebApplicationException(
+          String.format("Table %s segment %s does not exist", tableNameWithType, segmentName),
+          Response.Status.NOT_FOUND);
+    }
+    try {
+      String tableDir = tableDataManager.getTableDataDir();
+      // TODO Limit the number of concurrent downloads of segments because compression is an expensive operation.
+      String tarFilePath = TarGzCompressionUtils.createTarGzOfDirectory(tableDir + File.separator + segmentName);
 
 Review comment:
   The segmentTarDir may be configured on a high-performance disk (e.g. SSD), as opposed to the disk used for log messages (boot/HDD), so we should definitely use it. Please go ahead and change the code to write the tar file under a unique name, and add a note that performance may deteriorate if more than one replica asks to download segments.
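
   To illustrate the unique-name suggestion, here is a rough sketch using only the JDK. `SegmentTarPaths` and `uniqueTarFile` are hypothetical names, not existing Pinot code, and how the server obtains its configured segment tar directory is assumed rather than shown. The idea is only that each request gets its own file name, so two replicas downloading the same segment do not write to the same tar file.

```java
import java.io.File;
import java.util.UUID;

// Hypothetical helper for illustration only; not part of Pinot.
final class SegmentTarPaths {
  private SegmentTarPaths() {
  }

  // Returns a path under segmentTarDir that is unique per call, so concurrent
  // download requests for the same segment write to different tar files.
  // The file itself is not created here.
  static File uniqueTarFile(File segmentTarDir, String tableNameWithType, String segmentName) {
    String fileName = tableNameWithType + "_" + segmentName + "_" + UUID.randomUUID() + ".tar.gz";
    return new File(segmentTarDir, fileName);
  }

  public static void main(String[] args) {
    // Assumption: a directory under java.io.tmpdir stands in for the configured segment tar directory.
    File segmentTarDir = new File(System.getProperty("java.io.tmpdir"), "segmentTar");
    System.out.println(uniqueTarFile(segmentTarDir, "myTable_OFFLINE", "mySegment_0"));
  }
}
```

   The caller would still need to delete the tar file once the response has been streamed, since every request now leaves its own file behind.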
