[ 
https://issues.apache.org/jira/browse/ARROW-15547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488109#comment-17488109
 ] 

Joris Van den Bossche commented on ARROW-15547:
-----------------------------------------------

Can you provide a reproducible code example of the issue you encounter? 

With the data you have provided so far, the function works fine for me. Note 
that the resulting table contains no decimal columns, because pyarrow does not 
infer a decimal type from plain Python numbers:

{code}
In [3]: null = None

In [4]: data = [{"accounted_at":  .... # data as provided above

In [6]: create_dataframe(data)
Out[6]: 
pyarrow.Table
booked_by: string
invoice_recipient_id: string
created_at: string
due_date: string
lines: list<item: struct<amount: double, commission: double, commissionUnit: string, description: string, soldPrice: double, type: string>>
  child 0, item: struct<amount: double, commission: double, commissionUnit: string, description: string, soldPrice: double, type: string>
      child 0, amount: double
      child 1, commission: double
      child 2, commissionUnit: string
      child 3, description: string
      child 4, soldPrice: double
      child 5, type: string
deleted_at: null
internal_code: string
type: string
id: string
payment_term: string
franchise_id: string
teamleader_id: string
created_by: string
parent_id: null
sent_by: string
accounted_at: string
recipient_emails: null
booked_at: string
status: string
description: string
sent_at: string
{code}
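
For a reproducer, it may help to check which Python types the values actually 
have before they reach pa.table. pyarrow infers a decimal type when a column 
holds decimal.Decimal objects, not when it holds int or float values. The 
sketch below uses only the standard library and rests on an assumption the 
report does not confirm: that the rows were parsed from JSON. The names raw and 
precision_and_scale are illustrative and not part of pyarrow.

```python
import json
from decimal import Decimal

# Trimmed-down "lines" entries from the report (assumed to arrive as JSON text).
raw = (
    '[{"amount": 7800, "soldPrice": 260000, "commission": 3},'
    ' {"amount": 92.55, "soldPrice": 3702.02, "commission": 2.5}]'
)


def precision_and_scale(value: Decimal) -> tuple:
    """Return (significant digits, digits after the decimal point) of a finite Decimal."""
    _sign, digits, exponent = value.as_tuple()
    scale = max(0, -exponent)
    return max(len(digits), scale), scale


# Default parsing yields only int and float objects, so no decimals are involved.
plain = json.loads(raw)
plain_types = sorted({type(v).__name__ for row in plain for v in row.values()})

# With parse_float=Decimal, fractional numbers become Decimal objects while
# whole numbers stay int, so one column can mix values of different precision.
as_decimal = json.loads(raw, parse_float=Decimal)
decimal_types = sorted({type(v).__name__ for row in as_decimal for v in row.values()})

print(plain_types)                                     # ['float', 'int']
print(decimal_types)                                   # ['Decimal', 'int']
print(precision_and_scale(as_decimal[1]["amount"]))    # (4, 2)
print(precision_and_scale(as_decimal[1]["soldPrice"])) # (6, 2)
```

If the real input contains Decimal values like these, including how they are 
constructed in the example would make the issue reproducible.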

> Regression: Decimal type inference
> ----------------------------------
>
>                 Key: ARROW-15547
>                 URL: https://issues.apache.org/jira/browse/ARROW-15547
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 6.0.1
>            Reporter: Charley Guillaume
>            Priority: Major
>
> While trying to ingest data with pyarrow 6.0.1 using this function:
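> {code:python}
> import pyarrow as pa
>
> def create_dataframe(list_dict: list) -> pa.Table:
>     # Collect the union of keys across all rows.
>     fields = set()
>     for d in list_dict:
>         fields = fields.union(d.keys())
>     # Build one column per field; rows missing a key contribute None.
>     dataframe = pa.table({f: [row.get(f) for row in list_dict] for f in fields})
>     return dataframe
> {code}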
> I had the following error:
> {code}
> pyarrow.lib.ArrowInvalid: Decimal type with precision 7 does not fit into precision inferred from first array element: 8
> {code}
> After downgrading to v4.0.1 the error was gone.
>
> The data looked like this:
> {noformat}
> [{"accounted_at": "2022-01-31T22:55:25.702000+00:00", "booked_at": 
> "2022-01-27T09:24:17.539000+00:00", "booked_by": 
> "7b3ce009-728d-4fbc-9120-00fa8c1c8655", "created_at": 
> "2022-01-27T09:08:22.306000+00:00", "created_by": 
> "7b3ce009-728d-4fbc-9120-00fa8c1c8655", "deleted_at": null, "description": 
> "description of the record", "due_date": "2022-02-10T00:00:00+00:00", 
> "franchise_id": "9a2858c4-5c71-43d3-b28f-2352de47ff9f", "id": 
> "ba3f6d3a-12f4-4d78-acc5-2e59ca384c1e", "internal_code": "A.2022 / 9", 
> "invoice_recipient_id": "7169cef9-9cb2-461f-a38f-a4d1ce3ca1c3", "lines": 
> [{"type": "property", "amount": 7800, "soldPrice": 260000, "commission": 3, 
> "description": "Honoraires de l'agence", "commissionUnit": "PERCENT"}], 
> "parent_id": null, "payment_term": "14-days", "recipient_emails": null, 
> "sent_at": null, "sent_by": null, "status": "booked", "teamleader_id": 
> "xxx-yyy-www-zzz", "type": "out"}, {"accounted_at": null, "booked_at": 
> "2022-01-05T09:23:03.274000+00:00", "booked_by": 
> "8a91a22d-ddb9-491a-bc2d-c06ff3f256b4", "created_at": 
> "2022-01-05T09:21:32.503000+00:00", "created_by": 
> "8a91a22d-ddb9-491a-bc2d-c06ff3f256b4", "deleted_at": null, "description": 
> "Description content", "due_date": "2022-02-04T00:00:00+00:00", 
> "franchise_id": "929d47a3-c30f-404b-aaff-c96cff1bdd10", "id": 
> "828cd056-6aa7-4cea-9c94-ffa2db4498df", "internal_code": "BXC22 / 3", 
> "invoice_recipient_id": "5f90aa24-4c32-401d-927c-db9d4a9f90bf", "lines": 
> [{"type": "property", "amount": 92.55, "soldPrice": 3702.02, "commission": 
> 2.5, "description": "description2", "commissionUnit": "PERCENT"}], 
> "parent_id": null, "payment_term": "30-days", "recipient_emails": null, 
> "sent_at": "2022-01-05T09:27:34.077000+00:00", "sent_by": 
> "8a91a22d-ddb9-491a-bc2d-c06ff3f256b4", "status": "credited", 
> "teamleader_id": "xxx-yzyzy-zzz-www", "type": "out"}]{noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
