[I] Fields with mixed datatypes [iceberg-python]

via GitHub Sat, 10 Aug 2024 12:33:07 -0700


jayceslesar opened a new issue, #1037:
URL: https://github.com/apache/iceberg-python/issues/1037


   ### Question
   
   How would I go about using a field with mixed datatypes? Is that 
recommended, or even possible? I am a fan of tall-tidy data and am wondering 
how to properly go about the following.
   
   ```py
   from pydantic import BaseModel
   from datetime import datetime
   import pyarrow as pa
   
   from pyiceberg.catalog.sql import SqlCatalog
   
   
   class Message(BaseModel):
       system: str
       node: str
       message_name: str
       signal: str
       bus: str
       timestamp: datetime
       value: int | float | bool | str
   
       @staticmethod
       def to_pyarrow_schema():
           return pa.schema([
               pa.field('system', pa.string()),
               pa.field('node', pa.string()),
               pa.field('message_name', pa.string()),
               pa.field('signal', pa.string()),
               pa.field('bus', pa.string()),
               pa.field('timestamp', pa.timestamp('s', tz='UTC')),
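               # Sparse union of the four possible value types; this is the
               # field that create_table rejects.
               pa.field('value', pa.union([
                   pa.field("value", pa.int32()),
                   pa.field("value", pa.float64()),
                   pa.field("value", pa.bool_()),
                   pa.field("value", pa.string()),
               ], mode=pa.lib.UnionMode_SPARSE)),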
           ])
   
   catalog = SqlCatalog(
       "default",
       **{
           "uri": "my_uri/catalog",
       },
   )
   
   catalog.create_table(
       identifier="default.messages",
       schema=Message.to_pyarrow_schema(),
   )
   ```
   Right now it throws an error
   `TypeError: Expected primitive type, got: <class 'pyarrow.lib.SparseUnionType'>`
   which makes sense as what I am attempting isn't supported.
   
   Should I be using a string type and casting in my queries?
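
To make the string-column idea concrete, here is a minimal sketch of what I have in mind. It assumes `value` becomes a plain `pa.string()` field and that a second string field (called `value_type` here, a name I made up) records the original type so queries know how to cast. `encode_value` and `decode_value` are hypothetical helpers, not part of pyiceberg or pyarrow, and the signal names are sample data.

```python
# Sketch only: store every value as text plus a type tag, and cast on read.
# encode_value / decode_value and the "value_type" column are assumptions
# made for illustration; they are not pyiceberg or pyarrow APIs.


def encode_value(value):
    """Return (value_type, value_text) for an int, float, bool, or str."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool", "true" if value else "false"
    if isinstance(value, int):
        return "int", str(value)
    if isinstance(value, float):
        # repr() keeps enough digits for float(text) to round-trip.
        return "float", repr(value)
    if isinstance(value, str):
        return "str", value
    raise TypeError(f"unsupported value type: {type(value).__name__}")


# One caster per type tag, used when reading rows back.
_DECODERS = {
    "bool": lambda text: text == "true",
    "int": int,
    "float": float,
    "str": str,
}


def decode_value(value_type, value_text):
    """Cast the stored text back to the Python type named by value_type."""
    return _DECODERS[value_type](value_text)


# Hypothetical sample signals, one per supported type.
samples = [("speed", 42), ("temp", 21.5), ("door_open", True), ("gear", "P")]

rows = []
for signal, raw in samples:
    value_type, value_text = encode_value(raw)
    rows.append({"signal": signal, "value_type": value_type, "value": value_text})

# Casting on read recovers the original values.
decoded = [decode_value(r["value_type"], r["value"]) for r in rows]
```

With this layout the table only contains primitive string columns, at the cost of losing typed comparisons and column statistics on `value` unless each query casts it first.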



