samredai opened a new pull request #4263:
URL: https://github.com/apache/iceberg/pull/4263


   This adds functionality for converting a value to bytes and vice-versa 
augmented by the `PrimitiveType` subclass provided.
   
   Functionally this is a direct port of 
[conversion.py](https://github.com/apache/iceberg/blob/master/python_legacy/iceberg/api/types/conversions.py)
 in the python legacy library. The main changes are:
   1. I replaced the large dictionaries containing `<type>: <lambda function>` 
pairs with 
[@singledispatch](https://docs.python.org/3/library/functools.html#functools.singledispatch)
   2. I removed the 
[TypeID](https://github.com/apache/iceberg/blob/master/python_legacy/iceberg/api/types/type.py#L24)
 enum and instead just used the types directly.
   
   Overall this is more lines of source code but I'll summarize my reasonings 
for the above two changes:
   
   1. I believe the intention behind the lambda maps was to allow for generic 
`from_...` and `to_...` functions that contain dynamic logic based on the 
argument type, similar to constructor overloading in Java. I feel that using 
`@singledispatch` achieves that more directly by creating a generic function 
and then registering functions based on the first argument's type. This also 
allows stacking the register decorators for types that share the same logic 
(such as `IntegerType` and `DateType`) instead of defining that logic multiple 
times. Also it's a stable feature added as early as python 3.4 so I believe 
it's future-proof.
   2. The `TypeID` enum in the legacy python client carries some information in 
the enum value, for example `BOOLEAN = {"java_class": "Boolean.class", 
"python_class": bool, "id": 1}`. These dictionary values weren't used in 
`conversions.py` but it looks like they're used in other places such as the 
partition spec. I removed it because this also felt like it may be adding some 
unnecessary indirection. If anyone feels this should be put back I'm ok with 
that but I would suggest that we see where the relationships in these 
dictionaries are required and localize these mappings to that area of the 
source code (maybe with a simple `PY_TO_JAVA_CLASS = {...}` object and a 
`get_class_id(...)` function).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to