mcvsubbu commented on a change in pull request #5221: Add a new server api for 
download of segments.
URL: https://github.com/apache/incubator-pinot/pull/5221#discussion_r406536533
 
 

 ##########
 File path: pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java
 ##########
 @@ -175,4 +183,43 @@ public String getCrcMetadataForTable(
       }
     }
   }
+
+  // TODO Add access control similar to PinotSegmentUploadDownloadRestletResource for segment download.
+  @GET
+  @Produces(MediaType.APPLICATION_OCTET_STREAM)
+  @Path("/segments/{tableNameWithType}/{segmentName}")
+  @ApiOperation(value = "Download a segment", notes = "Download a segment in zipped tar format")
+  public Response downloadSegment(
+      @ApiParam(value = "Name of the table with type REALTIME OR OFFLINE", required = true, example = "myTable_OFFLINE") @PathParam("tableNameWithType") String tableNameWithType,
+      @ApiParam(value = "Name of the segment", required = true) @PathParam("segmentName") @Encoded String segmentName,
+      @Context HttpHeaders httpHeaders)
+      throws Exception {
+    LOGGER.info("Received a request to download segment {} for table {}", segmentName, tableNameWithType);
+    TableDataManager tableDataManager = checkGetTableDataManager(tableNameWithType);
+    SegmentDataManager segmentDataManager = tableDataManager.acquireSegment(segmentName);
+    if (segmentDataManager == null) {
+      throw new WebApplicationException(
+          String.format("Table %s segment %s does not exist", tableNameWithType, segmentName),
+          Response.Status.NOT_FOUND);
+    }
+    try {
+      String tableDir = tableDataManager.getTableDataDir();
+      // TODO Limit the number of concurrent downloads of segments because compression is an expensive operation.
+      String tarFilePath = TarGzCompressionUtils.createTarGzOfDirectory(tableDir + File.separator + segmentName);
 
 Review comment:
   You have two options here:
   (1) Create a temporary tar file with a unique name, and delete it in the
finally block as soon as the segment has been sent. You can also mark it
deleteOnExit in case something goes wrong during tarring or serving. In this
case, use the segmentTarDir config as the location for the file, since it is
served only once, to another host.
   (2) Create a semi-permanent tar file with a non-unique name so that it can
be reused when another request comes in for the same segment. In this case, you
need to make sure that the file is not overwritten when two requests arrive for
the same segment. I think the permanent tar file should be created in
tableDataDir, as you have done. But you do need to synchronize the creation of
the file and check whether another thread has already created it, no?
   
   Which technique do you want to go with?
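
   To make the trade-off concrete, here is a rough sketch of both options using only JDK file APIs. It is not Pinot code: `SegmentTarSketch`, `Archiver`, `Sender`, and both method names are hypothetical, and `Archiver` merely stands in for whatever tar/gzip call the resource ends up using. Treat it as an illustration under those assumptions, not as the intended implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Sketch of the two tar-file strategies. None of these names are Pinot APIs. */
final class SegmentTarSketch {

  /** Hypothetical stand-in for the tar/gzip step: writes the archive to target. */
  interface Archiver {
    void writeArchive(String segmentName, Path target) throws IOException;
  }

  /** Hypothetical stand-in for streaming the file back to the requester. */
  interface Sender {
    void send(Path tarFile) throws IOException;
  }

  // One lock per target file, so requests for different segments do not block
  // each other. The map is never pruned in this sketch.
  private static final ConcurrentMap<String, Object> LOCKS = new ConcurrentHashMap<>();

  /** Option (1): uniquely named temp file, deleted right after it is served. */
  static void serveWithTempFile(Path segmentTarDir, String segmentName,
      Archiver archiver, Sender sender) {
    Path tmp = null;
    try {
      tmp = Files.createTempFile(segmentTarDir, segmentName + "-", ".tar.gz");
      tmp.toFile().deleteOnExit(); // fallback if the finally block never runs
      archiver.writeArchive(segmentName, tmp);
      sender.send(tmp);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      if (tmp != null) {
        try {
          Files.deleteIfExists(tmp);
        } catch (IOException ignored) {
          // deleteOnExit above is the fallback
        }
      }
    }
  }

  /** Option (2): fixed file name, created at most once, reused afterwards. */
  static Path getOrCreateSharedTar(Path tableDataDir, String segmentName,
      Archiver archiver) {
    Path finalPath = tableDataDir.resolve(segmentName + ".tar.gz");
    Object lock = LOCKS.computeIfAbsent(finalPath.toString(), k -> new Object());
    synchronized (lock) {
      if (Files.exists(finalPath)) {
        return finalPath; // another request already built it
      }
      try {
        // Build under a temporary name, then rename, so a reader never sees
        // a half-written archive under the final name.
        Path partial = Files.createTempFile(tableDataDir, segmentName + "-", ".partial");
        try {
          archiver.writeArchive(segmentName, partial);
          Files.move(partial, finalPath, StandardCopyOption.ATOMIC_MOVE);
        } finally {
          Files.deleteIfExists(partial); // no-op after a successful move
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      return finalPath;
    }
  }
}
```

   Option (1) needs no locking but pays the compression cost on every request. Option (2) compresses once per segment, but it then needs the lock shown here plus some way to invalidate the tar file when the segment is replaced or deleted, which this sketch does not attempt.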
