Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-08-15 Thread via GitHub


alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2291105584

   > Right now DataFusion doesn't support struct evolution very well. Imagine 
you have a struct named `customData` with a field `someOptionEnabled` in one 
Parquet file, and later you add a new field `newAddedOption` to the 
`customData` struct in another Parquet file. Currently, when you try to `SELECT 
* FROM table`, you get this error:
   > 
   > ```
   > {"message":"Failed to collect DataFrame batches: Plan(\"Cannot cast file 
schema field customData of type Struct([Field { name: 
\\\"someOptionEnabled\\\", data_type: Boolean, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }]) to table schema field of type 
Struct([Field { name: \\\"someOptionEnabled\\\", data_type: Boolean, nullable: 
true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: 
\\\"newAddedOption\\\", data_type: Float64, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }])\")","status":"error"}
   > ```
   > 
   > Feels like we should handle this more gracefully. cc @alamb
   
   I agree
   
   > I'm happy to make contributions if someone can point me to the right 
places to look.
   
   My suggestion is to start by filing a ticket with a self-contained 
reproducer (either Rust code or SQL) that shows what you are trying to do.
   
   This would likely become part of the tests for any code improvement we make, 
and it would give other contributors enough detail to point you to the right 
place in the code.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-08-13 Thread via GitHub


TheBuilderJR commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2287085675

   Right now DataFusion doesn't support struct evolution very well. Imagine you 
have a struct named `customData` with a field `someOptionEnabled` in one Parquet 
file, and later you add a new field `newAddedOption` to the 
`customData` struct in another Parquet file. Currently, when you try to `SELECT 
* FROM table`, you get this error:
   
   ```
   {"message":"Failed to collect DataFrame batches: Plan(\"Cannot cast file 
schema field customData of type Struct([Field { name: 
\\\"someOptionEnabled\\\", data_type: Boolean, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }]) to table schema field of type 
Struct([Field { name: \\\"someOptionEnabled\\\", data_type: Boolean, nullable: 
true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: 
\\\"newAddedOption\\\", data_type: Float64, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }])\")","status":"error"}
   ```
   
   Feels like we should handle this more gracefully. cc @alamb 
   
   I'm happy to make contributions if someone can point me to the right places 
to look.
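
The behaviour being asked for can be sketched with a toy model. This is only an illustration under stated assumptions: `DataType`, `Field`, and `can_evolve` below are hypothetical stand-ins written for this example, not the Arrow or DataFusion API, and the rule (a table field missing from the file is acceptable if it is nullable, so it could be filled with nulls) is one plausible policy, not what DataFusion implements.

```rust
// Toy model of struct schema evolution. These types are illustrative
// stand-ins and are NOT the Arrow or DataFusion types of the same name.

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Boolean,
    Float64,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.to_string(), data_type, nullable }
    }
}

/// Checks whether data of type `file` could be adapted to type `table`.
///
/// Assumed policy (hypothetical):
/// - non-struct types must match exactly;
/// - a table field missing from the file is allowed only if it is nullable,
///   because it would have to be filled with nulls;
/// - a nullable file field cannot feed a non-nullable table field;
/// - file fields that the table does not mention are ignored.
fn can_evolve(file: &DataType, table: &DataType) -> Result<(), String> {
    match (file, table) {
        (DataType::Struct(file_fields), DataType::Struct(table_fields)) => {
            for tf in table_fields {
                match file_fields.iter().find(|ff| ff.name == tf.name) {
                    Some(ff) => {
                        if ff.nullable && !tf.nullable {
                            return Err(format!(
                                "field `{}` is nullable in the file but not in the table",
                                tf.name
                            ));
                        }
                        // Recurse so nested structs follow the same rule.
                        can_evolve(&ff.data_type, &tf.data_type)?;
                    }
                    // Missing but nullable: could be filled with nulls.
                    None if tf.nullable => {}
                    None => {
                        return Err(format!(
                            "file is missing non-nullable field `{}`",
                            tf.name
                        ));
                    }
                }
            }
            Ok(())
        }
        (f, t) if f == t => Ok(()),
        (f, t) => Err(format!("cannot cast {:?} to {:?}", f, t)),
    }
}
```

With the two schemas from the error message (a file struct containing only `someOptionEnabled`, and a table struct that also has a nullable `newAddedOption: Float64`), `can_evolve` returns `Ok(())` under this policy. If `newAddedOption` were declared non-nullable, it would return an error instead, since there would be no value to fill in for the older file.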





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-07-14 Thread via GitHub


alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2227389021

   > I think #11445 is related to this epic
   
   Thank you -- added





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-07-13 Thread via GitHub


Throne3d commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2227085035

   I think #11445 is related to this epic





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-07-12 Thread via GitHub


goldmedal commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2225878481

   I added an issue to check for duplicate or null field names in structs: 
https://github.com/apache/datafusion/issues/11438





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-05-25 Thread via GitHub


alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2131214534

   > I added an issue to support recursive unnest: #10660. I think it should 
belong to this epic
   
   Added





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-05-24 Thread via GitHub


duongcongtoai commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2130882122

   I added an issue to support recursive unnest: 
https://github.com/apache/datafusion/issues/10660





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-04-30 Thread via GitHub


alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2085321547

   > I created a ticket: #10264
   
   Thank you. I added this to the list in the ticket description





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-04-27 Thread via GitHub


duongcongtoai commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2080458071

   I created a ticket: https://github.com/apache/datafusion/issues/10264





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-04-27 Thread via GitHub


alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2080448944

   > Hi, I think unnest support for structs can be an item in this epic, right?
   
   That would make sense to me -- is there a ticket that describes what this 
means?





Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]

2024-04-26 Thread via GitHub


toaiduongdh commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2080361039

   Hi, I think unnest support for structs can be an item in this epic, right?

