samredai opened a new pull request #4263: URL: https://github.com/apache/iceberg/pull/4263
This adds functionality for converting a value to bytes and vice-versa augmented by the `PrimitiveType` subclass provided. Functionally this is a direct port of [conversion.py](https://github.com/apache/iceberg/blob/master/python_legacy/iceberg/api/types/conversions.py) in the python legacy library. The main changes are: 1. I replaced the large dictionaries containing `<type>: <lambda function>` pairs with [@singledispatch](https://docs.python.org/3/library/functools.html#functools.singledispatch) 2. I removed the [TypeID](https://github.com/apache/iceberg/blob/master/python_legacy/iceberg/api/types/type.py#L24) enum and instead just used the types directly. Overall this is more lines of source code but I'll summarize my reasonings for the above two changes: 1. I believe the intention behind the lambda maps was to allow for generic `from_...` and `to_...` functions that contain dynamic logic based on the argument type, similar to constructor overloading in Java. I feel that using `@singledispatch` achieves that more directly by creating a generic function and then registering functions based on the first argument's type. This also allows stacking the register decorators for types that share the same logic (such as `IntegerType` and `DateType`) instead of defining that logic multiple times. Also it's a stable feature added as early as python 3.4 so I believe it's future-proof. 2. The `TypeID` enum in the legacy python client carries some information in the enum value, for example `BOOLEAN = {"java_class": "Boolean.class", "python_class": bool, "id": 1}`. These dictionary values weren't used in `conversions.py` but it looks like they're used in other places such as the partition spec. I removed it because this also felt like it may be adding some unnecessary indirection. If anyone feels this should be put back I'm ok with that but I would suggest that we see where the relationships in these dictionaries are required and localize these mappings to that area of the source code (maybe with a simple `PY_TO_JAVA_CLASS = {...}` object and a `get_class_id(...)` function). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
