Re: [I] [EPIC] Improved support for nested / structured types (`Struct` , `List`, `ListArray`, and other Composite types) [datafusion]
alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2291105584

> Right now datafusion doesn't support struct evolution very well. Imagine you have a struct named `customData` with field `someOptionEnabled` in one parquet file, later down the line you add a new field `newAddedOption` to the `customData` struct in another parquet file. Currently when you try and `SELECT * FROM table` you'll get this error:
>
> ```
> {"message":"Failed to collect DataFrame batches: Plan(\"Cannot cast file schema field customData of type Struct([Field { name: \\\"someOptionEnabled\\\", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) to table schema field of type Struct([Field { name: \\\"someOptionEnabled\\\", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \\\"newAddedOption\\\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])\")","status":"error"}
> ```
>
> Feels like we should handle this more gracefully. cc @alamb

I agree

> I'm happy to make contributions if someone can point me to the right places to look.

My suggestion is to start with filing a ticket with a self contained reproducer (either rust code or SQL) that shows what you are trying to do. This would likely become part of the test of any code improvement we make, as well as providing some more detail for other contributors to help point to the right place in the code

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
TheBuilderJR commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2287085675

Right now datafusion doesn't support struct evolution very well. Imagine you have a struct named `customData` with field `someOptionEnabled` in one parquet file, later down the line you add a new field `newAddedOption` to the `customData` struct in another parquet file. Currently when you try and `SELECT * FROM table` you'll get this error:

```
{"message":"Failed to collect DataFrame batches: Plan(\"Cannot cast file schema field customData of type Struct([Field { name: \\\"someOptionEnabled\\\", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) to table schema field of type Struct([Field { name: \\\"someOptionEnabled\\\", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \\\"newAddedOption\\\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])\")","status":"error"}
```

Feels like we should handle this more gracefully. cc @alamb

I'm happy to make contributions if someone can point me to the right places to look.
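To make the "handle this more gracefully" request concrete, here is a minimal std-only Rust sketch of the kind of compatibility rule being asked for. It is not DataFusion or Arrow code: `DataType`, `Field`, `can_evolve`, `old_custom_data` and `new_custom_data` are hypothetical names invented for illustration, and the rule itself (a file struct may be read as a table struct when the table only adds nullable fields) is an assumption about what graceful handling could mean, not a description of what the project does or decided.

```rust
// Hypothetical, std-only model of a struct-evolution check.
// These types are NOT the Arrow/DataFusion API; they only mirror the
// field names and types that appear in the error message above.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Boolean,
    Float64,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
}

impl Field {
    pub fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.to_string(), data_type, nullable }
    }
}

/// Assumed rule: data written with `file` can be read as `table` when
/// every file field exists in the table struct with a compatible type,
/// and every table field missing from the file is nullable (so it could
/// be filled with NULLs). Extra fields in the file are rejected here;
/// a real implementation might choose to drop them instead.
pub fn can_evolve(file: &DataType, table: &DataType) -> bool {
    match (file, table) {
        (DataType::Struct(file_fields), DataType::Struct(table_fields)) => {
            // Each file field must map onto a table field of the same name.
            let file_ok = file_fields.iter().all(|f| {
                table_fields
                    .iter()
                    .find(|t| t.name == f.name)
                    .map_or(false, |t| can_evolve(&f.data_type, &t.data_type))
            });
            // Table fields absent from the file must be nullable.
            let missing_ok = table_fields
                .iter()
                .filter(|t| !file_fields.iter().any(|f| f.name == t.name))
                .all(|t| t.nullable);
            file_ok && missing_ok
        }
        // Non-struct types must match exactly in this sketch.
        (a, b) => a == b,
    }
}

/// `customData` as written in the older parquet file.
pub fn old_custom_data() -> DataType {
    DataType::Struct(vec![Field::new("someOptionEnabled", DataType::Boolean, true)])
}

/// `customData` after `newAddedOption` was added.
pub fn new_custom_data() -> DataType {
    DataType::Struct(vec![
        Field::new("someOptionEnabled", DataType::Boolean, true),
        Field::new("newAddedOption", DataType::Float64, true),
    ])
}
```

Under this assumed rule, `can_evolve(&old_custom_data(), &new_custom_data())` accepts the case from the report (old file, newer table schema), while the reverse direction is rejected because the file carries a field the table does not know about.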
alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2227389021

> I think #11445 is related to this epic

Thank you -- added
Throne3d commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2227085035

I think #11445 is related to this epic
goldmedal commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2225878481

I added an issue to check for duplicate or null field names in structs: https://github.com/apache/datafusion/issues/11438
alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2131214534

> I added an issue to support recursive unnest: #10660, I think it should belong to this epic

Added
duongcongtoai commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2130882122

I added an issue to support recursive unnest: https://github.com/apache/datafusion/issues/10660
alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2085321547

> I created a ticket: #10264

Thank you. I added this to the list in the ticket description
duongcongtoai commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2080458071

I created a ticket: https://github.com/apache/datafusion/issues/10264
alamb commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2080448944

> Hi, I think unnest support for struct can be an item in this epic, right?

That would make sense to me -- is there a ticket that describes what this means?
toaiduongdh commented on issue #2326:
URL: https://github.com/apache/datafusion/issues/2326#issuecomment-2080361039

Hi, I think unnest support for struct can be an item in this epic, right?