ArafatKhan2198 commented on code in PR #9994:
URL: https://github.com/apache/ozone/pull/9994#discussion_r3108757758
##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/ContainerEndpoint.java:
##########
@@ -450,6 +458,65 @@ private Response getUnhealthyContainersFromSchema(
return Response.ok(response).build();
}
+ /**
+ * Export all unhealthy containers into a CSV file by streaming the results directly
+ * from the database without holding them in the JVM heap.
+ *
+ * @param state The container state to filter by, or null for all.
+ * @param limit The maximum number of records to return, 0 for unlimited.
+ * @return {@link Response} containing the CSV StreamingOutput.
+ */
+ @GET
+ @Path("/unhealthy/export")
+ @Produces("text/csv")
+ public Response exportUnhealthyContainers(
+ @QueryParam("state") String state,
+ @DefaultValue("0") @QueryParam(RECON_QUERY_LIMIT) int limit) {
+
+ ContainerSchemaDefinition.UnHealthyContainerStates internalState = null;
+ if (StringUtils.isNotEmpty(state)) {
+ try {
+ internalState = ContainerSchemaDefinition.UnHealthyContainerStates.valueOf(state);
+ } catch (IllegalArgumentException e) {
+ throw new WebApplicationException(e, Response.Status.BAD_REQUEST);
+ }
+ }
+
+ final ContainerSchemaDefinition.UnHealthyContainerStates filterState = internalState;
+
+ StreamingOutput stream = outputStream -> {
+ try (Cursor<UnhealthyContainersRecord> cursor =
+ containerHealthSchemaManager.getUnhealthyContainersCursor(filterState, limit)) {
+
+ PrintWriter writer = new PrintWriter(new OutputStreamWriter(outputStream, StandardCharsets.UTF_8));
Review Comment:
Hey @devmadhuu, that's a very fair point about Excel and other spreadsheets
capping out at ~1M rows. A single 4M-row CSV will definitely get truncated for
users trying to open it directly.

I can add a prevContainerId parameter to this streaming endpoint.
If a user selects "Complete Export", the UI can automatically trigger
multiple sequential downloads of 500K records each (e.g., missing_part1.csv,
missing_part2.csv), passing the last container ID of each part as
prevContainerId for the next request, until the end is reached.

This gives the user exactly what they need without adding heavy async job
infrastructure.
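
Roughly, the chunking loop could look like the sketch below. It is only an
illustration under assumptions: fetchChunk is a hypothetical in-memory stand-in
for a call to the export endpoint with prevContainerId and limit, container IDs
are assumed to be positive and returned in ascending order, and the file naming
simply follows the missing_part1.csv example above. None of these names are
existing Recon APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class ChunkedExportSketch {

  // Hypothetical stand-in for one request to the export endpoint:
  // returns up to `limit` IDs strictly greater than prevContainerId.
  static List<Long> fetchChunk(List<Long> sortedIds, long prevContainerId, int limit) {
    return sortedIds.stream()
        .filter(id -> id > prevContainerId)
        .limit(limit)
        .collect(Collectors.toList());
  }

  // Client-side loop: request one part at a time, using the last container ID
  // of the previous part as the cursor, and stop on an empty or short part.
  static List<String> exportAll(List<Long> sortedIds, String state, int chunkSize) {
    List<String> fileNames = new ArrayList<>();
    long prevContainerId = 0L; // assumption: real container IDs are > 0
    int part = 1;
    while (true) {
      List<Long> chunk = fetchChunk(sortedIds, prevContainerId, chunkSize);
      if (chunk.isEmpty()) {
        break; // nothing left to export
      }
      fileNames.add(state.toLowerCase(Locale.ROOT) + "_part" + part + ".csv");
      part++;
      prevContainerId = chunk.get(chunk.size() - 1);
      if (chunk.size() < chunkSize) {
        break; // short part means this was the last one
      }
    }
    return fileNames;
  }

  public static void main(String[] args) {
    List<Long> ids = new ArrayList<>();
    for (long i = 1; i <= 12; i++) {
      ids.add(i);
    }
    // 12 records in parts of 5 yield three files.
    exportAll(ids, "MISSING", 5).forEach(System.out::println);
  }
}
```

When the total is an exact multiple of the part size, this loop makes one extra
request that comes back empty; that keeps the server stateless at the cost of a
single cheap call.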
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]