[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-19256:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is not resolved. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the JIRA guidelines 
(https://cwiki.apache.org/confluence/display/Hive/HowToContribute) the 
fixVersion should be set only when the issue is resolved/closed.

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Components: Hive
> Affects Versions: 3.1.0
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: HIVE-19256.patch, HIVE-19256_1.patch
>
>
> We use this UDF a lot in our org. This UDF takes an object and a Hive schema 
> and makes sure the output object matches the schema completely. In some 
> respects it is similar to the {{named_struct}} UDF, which can be used to 
> select columns from a struct, but it is more general since it works not only 
> on structs but on all Hive data types (except union). The schema can also 
> specify certain valid type conversions (int -> double, etc.).
> One scenario where this is quite useful is making sure that the Hive view 
> created with a specific schema will have columns which will always match that 
> schema. In Hive today when a view is created, new nested columns from the 
> underlying table can leak out from the view, even though the user never 
> wanted this behavior. Note that this leaking of columns is only for nested 
> columns and not for top level columns, so in that regard this behavior of 
> Hive is inconsistent.
> Sample usage of the UDF
> {code}
> // Returns data which matches the given schema. Extra columns which are not
> // part of the schema are removed.
> generic_project(col, "struct>>")
> // If the input column had a struct with field 'a' as int, 'a' would be
> // type cast to double.
> generic_project(col, "struct")
> {code}
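
The projection behaviour described above can be illustrated with a small standalone sketch. This is not the code from the attached patches and does not use Hive's type-string syntax: the `project` function, the dictionary-based schema encoding, and the choice to return null (`None`) for fields missing from the input are all assumptions made purely for illustration.

```python
# Hypothetical schema encoding (an assumption, not Hive's type strings):
#   primitive -> "int" | "double" | "string"
#   struct    -> {"struct": {field_name: field_schema, ...}}
#   array     -> {"array": element_schema}
#   map       -> {"map": (key_schema, value_schema)}

PRIMITIVE_CASTS = {"int": int, "double": float, "string": str}


def project(value, schema):
    """Return a copy of `value` reshaped to match `schema`."""
    if value is None:
        # Assumption: nulls (and fields missing from the input) stay null.
        return None
    if isinstance(schema, str):
        # Primitive: apply the conversion named by the schema (e.g. int -> double).
        return PRIMITIVE_CASTS[schema](value)
    if "struct" in schema:
        # Keep only the fields named in the schema; extra input fields are dropped.
        return {
            name: project(value.get(name), field_schema)
            for name, field_schema in schema["struct"].items()
        }
    if "array" in schema:
        return [project(item, schema["array"]) for item in value]
    if "map" in schema:
        key_schema, value_schema = schema["map"]
        return {
            project(k, key_schema): project(v, value_schema)
            for k, v in value.items()
        }
    raise ValueError(f"unsupported schema: {schema!r}")


# A row whose nested struct has gained a field the view author never asked for.
row = {"a": 1, "b": {"c": "x", "leaked": 7}, "extra": True}
schema = {"struct": {"a": "double", "b": {"struct": {"c": "string"}}}}
result = project(row, schema)
```

In this sketch the top-level `extra` field and the nested `leaked` field are both dropped, and `a` is converted from an integer to a float, which mirrors the two sample calls in the description.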



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-06-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-19256:
---
Fix Version/s: (was: 3.1.0)
   3.2.0

Deferring this to 3.2.0 since the branch for 3.1.0 has been cut off.

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Components: Hive
> Affects Versions: 3.1.0
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Fix For: 3.2.0
>
> Attachments: HIVE-19256.patch, HIVE-19256_1.patch
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Affects Version/s: 3.1.0

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Components: Hive
> Affects Versions: 3.1.0
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19256.patch, HIVE-19256_1.patch
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Fix Version/s: 3.1.0
  Component/s: Hive

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Components: Hive
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19256.patch, HIVE-19256_1.patch
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Attachment: HIVE-19256_1.patch

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: HIVE-19256.patch, HIVE-19256_1.patch
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Status: Patch Available  (was: Open)

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: HIVE-19256.patch, HIVE-19256_1.patch
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Attachment: (was: generic_project.diff)

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: HIVE-19256.patch
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Attachment: generic_project.diff

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: HIVE-19256.patch, generic_project.diff
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Attachment: (was: generic_project.diff)

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: HIVE-19256.patch, generic_project.diff
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Attachment: HIVE-19256.patch

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: HIVE-19256.patch, generic_project.diff
>
>


[jira] [Updated] (HIVE-19256) UDF which shapes the input data according to the specified schema

2018-04-20 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-19256:
---
Attachment: generic_project.diff

> UDF which shapes the input data according to the specified schema
> -
>
> Key: HIVE-19256
> URL: https://issues.apache.org/jira/browse/HIVE-19256
> Project: Hive
> Issue Type: New Feature
> Reporter: Ratandeep Ratti
> Assignee: Ratandeep Ratti
> Priority: Major
> Attachments: generic_project.diff
>
>