sungwy commented on code in PR #1669:
URL: https://github.com/apache/iceberg-python/pull/1669#discussion_r1960582838
##########
pyiceberg/io/pyarrow.py:
##########
@@ -1573,11 +1561,16 @@ def _table_from_scan_task(task: FileScanTask) -> pa.Table:
tables = [f.result() for f in completed_futures if f.result()]
+ arrow_schema = schema_to_pyarrow(self._projected_schema, include_field_ids=False)
+
if len(tables) < 1:
- return pa.Table.from_batches([], schema=schema_to_pyarrow(self._projected_schema, include_field_ids=False))
+ return pa.Table.from_batches([], schema=arrow_schema)
result = pa.concat_tables(tables, promote_options="permissive")
+ if property_as_bool(self._io.properties, PYARROW_USE_LARGE_TYPES_ON_READ, False):
Review Comment:
Should we update this to align with the current default value?
```suggestion
if property_as_bool(self._io.properties, PYARROW_USE_LARGE_TYPES_ON_READ, True):
```
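To illustrate why the fallback matters, here is a minimal sketch. It assumes a simplified stand-in for `property_as_bool` (the name `property_as_bool_sketch` is hypothetical, and the property key string is an assumption, not copied from the code base). The point is that the third argument only takes effect when the user has not set the property, so a `False` fallback here would silently diverge from a default of `True` used elsewhere.

```python
from typing import Dict

# Assumed property key, for illustration only.
USE_LARGE_TYPES_KEY = "pyarrow.use-large-types-on-read"


def property_as_bool_sketch(properties: Dict[str, str], name: str, default: bool) -> bool:
    """Hypothetical, simplified stand-in for property_as_bool.

    Returns the default when the property is absent; otherwise
    interprets the string value "true" (case-insensitive) as True.
    """
    value = properties.get(name)
    if value is None:
        return default
    return value.strip().lower() == "true"


# Property not set: the result is whatever fallback the call site passes.
unset_with_false = property_as_bool_sketch({}, USE_LARGE_TYPES_KEY, False)
unset_with_true = property_as_bool_sketch({}, USE_LARGE_TYPES_KEY, True)

# Property set explicitly: the fallback is ignored.
explicit_false = property_as_bool_sketch({USE_LARGE_TYPES_KEY: "false"}, USE_LARGE_TYPES_KEY, True)
```

With the property unset, the two call sites disagree purely because of the fallback argument, which is the mismatch the suggestion is meant to avoid.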
##########
pyiceberg/io/pyarrow.py:
##########
@@ -1655,19 +1646,16 @@ class ArrowProjectionVisitor(SchemaWithPartnerVisitor[pa.Array, Optional[pa.Arra
_file_schema: Schema
_include_field_ids: bool
_downcast_ns_timestamp_to_us: bool
- _use_large_types: bool
def __init__(
self,
file_schema: Schema,
downcast_ns_timestamp_to_us: bool = False,
include_field_ids: bool = False,
- use_large_types: bool = True,
Review Comment:
I've always dreaded the process of updating a code base that doesn't have a
properly defined list of public classes 😞
But would removing this argument require a deprecation notice first?
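For reference, one possible shape for such a notice is sketched below. It uses only the standard library `warnings` module; the class name `ProjectionVisitorSketch` and the warning text are hypothetical and not taken from the PR. The idea is to keep accepting the keyword for one release cycle, ignore its value, and warn callers who still pass it.

```python
import warnings
from typing import Optional


class ProjectionVisitorSketch:
    """Hypothetical stand-in for a visitor whose keyword is being retired."""

    def __init__(
        self,
        downcast_ns_timestamp_to_us: bool = False,
        include_field_ids: bool = False,
        use_large_types: Optional[bool] = None,  # retained only to emit a warning
    ) -> None:
        if use_large_types is not None:
            # Warn at the caller's line (stacklevel=2); the value itself is ignored.
            warnings.warn(
                "use_large_types is deprecated and will be removed in a future release",
                DeprecationWarning,
                stacklevel=2,
            )
        self._downcast_ns_timestamp_to_us = downcast_ns_timestamp_to_us
        self._include_field_ids = include_field_ids


# Callers that still pass the old keyword keep working but see a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ProjectionVisitorSketch(use_large_types=True)
```

Using `None` as the sentinel distinguishes "not passed" from an explicit `True` or `False`, so callers who never used the argument see no warning.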
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]