rdblue commented on code in PR #9695:
URL: https://github.com/apache/iceberg/pull/9695#discussion_r1520444991
##########
open-api/rest-catalog-open-api.yaml:
##########
@@ -2068,6 +2162,145 @@ components:
           items:
             $ref: '#/components/schemas/PartitionStatisticsFile'
+    PlanTask:
+      description:
+        A JSON object that contains information provided by the server,
+        to be utilized by clients for distributed planning, should be supplied
+        as is for input in PlanTable operation.
+      type: object
+
+    FileScanTask:
+      type: object
+      required:
+        - schema
+        - spec
+        - start
+        - length
+        - data-file
+      properties:
+        data-file:
+          $ref: '#/components/schemas/ContentFile'
+        partition:
+          type: object
+          additionalProperties:
+            type: string
+        size-bytes:
+          type: number
+        start:
+          type: number
+        length:
+          type: number
+        estimated-rows-count:
+          type: number
+        delete-files:
+          type: array
+          items:
+            $ref: '#/components/schemas/ContentFile'
+        schema:
+          $ref: '#/components/schemas/Schema'
Review Comment:
Yeah, if this is based on what gets serialized to workers in Java
frameworks, we don't need it. Those tasks carry the table schema and spec so
that the worker can interpret the partition tuple on the task side. But this
use case assumes that the caller has access to the table.
If we wanted to make it possible for the caller to skip loading the table
(which we may choose to do in a later update to this API), then we would send
the schema and spec once per request rather than on each task.
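A rough sketch of what that later shape might look like, purely as an illustration and not something proposed in this PR. The `PlanTableResponse` name and the `file-scan-tasks` property are hypothetical, and a single spec is shown for simplicity even though a scan could in principle touch files written with more than one spec:

```yaml
# Hypothetical sketch only: names are illustrative, not part of the PR.
PlanTableResponse:              # assumed name for the response object
  type: object
  required:
    - file-scan-tasks
  properties:
    schema:                     # table schema, sent once per response
      $ref: '#/components/schemas/Schema'
    spec:                       # partition spec, sent once per response
      $ref: '#/components/schemas/PartitionSpec'
    file-scan-tasks:            # tasks no longer repeat schema or spec
      type: array
      items:
        $ref: '#/components/schemas/FileScanTask'
```

With a shape like this, `schema` and `spec` would drop out of the `required` list on `FileScanTask`, and a caller that already has the table loaded could ignore the top-level copies.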
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]