adriangb commented on issue #42069: URL: https://github.com/apache/arrow/issues/42069#issuecomment-2696006993
> The main pitfall of using an extension type for this is the storage type is meaningless to users. They need to have special libraries to interpret the bytes if pulled into a system that doesn't understand the variant extension type.

FWIW two tricks we've employed with relatively good results:

- Add metadata to the Field to say "treat this as json" and when serializing out it gets converted to plain utf8 json.
- Keep this optimized data as a duplicate/private column so that `select json_column` pulls through the plain json data but `where json_data->'foo' = 1` gets rewritten to `where variant_get(_private_variant_col, 'foo') == 1`

The latter is particularly helpful to keep backward compatibility.
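
To make the second trick concrete, here is a rough sketch of the predicate rewrite in Python. It is only an illustration under loud assumptions: `rewrite_predicate`, `PUBLIC_COL`, and `PRIVATE_COL` are hypothetical names, and the rewrite is done with a regex over the SQL text, whereas a real engine would presumably rewrite the parsed expression tree. It emits `=` rather than `==` simply because it leaves the comparison operator untouched.

```python
import re

# Hypothetical names for illustration only: the user-facing JSON column
# and the private column holding the optimized variant encoding.
PUBLIC_COL = "json_data"
PRIVATE_COL = "_private_variant_col"

# Matches json_data->'key' with a single-quoted key (no escaped quotes).
_ARROW_ACCESS = re.compile(rf"\b{re.escape(PUBLIC_COL)}\s*->\s*'([^']*)'")


def rewrite_predicate(sql: str) -> str:
    """Replace json_data->'key' with variant_get(_private_variant_col, 'key').

    Plain references to the public column (e.g. in a SELECT list) are left
    alone, so they keep returning the plain JSON text.
    """
    return _ARROW_ACCESS.sub(rf"variant_get({PRIVATE_COL}, '\1')", sql)


if __name__ == "__main__":
    print(rewrite_predicate("where json_data->'foo' = 1"))
    print(rewrite_predicate("select json_data"))
```

Because only the `->` accesses are redirected, existing queries that select the JSON column keep working unchanged, which is the backward-compatibility point made above.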
