[jira] [Updated] (SPARK-5302) Add support for SQLContext "partition" columns
[ https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-5302:
-----------------------------
    Assignee: Cheng Lian

> Add support for SQLContext "partition" columns
> ----------------------------------------------
>
>                 Key: SPARK-5302
>                 URL: https://issues.apache.org/jira/browse/SPARK-5302
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Bob Tiernay
>            Assignee: Cheng Lian
>             Fix For: 1.4.0
>
>
> For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to
> support a virtual column that maps to part of the file path, similar to
> what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}
> where {{dt}} is a column of type {{TEXT}}).
> The API could allow the user to type the column using an appropriate
> {{DataType}} instance. This new field could be addressed in SQL statements
> much the same as is done in Hive.
> As a consequence, partitions could be pruned when executing a query, and
> there would be no need to materialize a column in each logical partition
> that is already encoded in the path name. Furthermore, this would provide
> a nice interop and migration strategy for Hive users who may one day use
> {{SQLContext}} directly.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
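Editor's illustration of the request above: a minimal sketch, in plain Python rather than Spark, of how a virtual column might be derived from {{key=value}} directory segments in a file path and given a user-chosen type. The function name parse_partition_columns and the types argument are hypothetical and are not part of any Spark or Hive API; the sketch also assumes every {{key=value}} path segment is a partition directory.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Callable, Dict, Optional


def parse_partition_columns(
    path: str,
    types: Optional[Dict[str, Callable[[str], object]]] = None,
) -> Dict[str, object]:
    """Return the columns encoded as key=value segments of `path`.

    `types` maps a column name to a converter; columns without an
    entry are kept as plain strings. (Hypothetical helper, for
    illustration only.)
    """
    types = types or {}
    columns: Dict[str, object] = {}
    for segment in PurePosixPath(path).parts:
        key, sep, raw = segment.partition("=")
        if not sep or not key:
            continue  # not a key=value segment, e.g. "data" or "clicks"
        convert = types.get(key, str)
        columns[key] = convert(raw)
    return columns


# Untyped: the value stays a string.
untyped = parse_partition_columns("/data/clicks/dt=2015-01-01/")

# Typed: the caller asks for `dt` to be parsed as a date.
typed = parse_partition_columns(
    "/data/clicks/dt=2015-01-01/", types={"dt": date.fromisoformat}
)
```

Passing the converter per column mirrors the idea in the description that the user supplies the column's type, while the value itself comes from the path and is never stored in the data files.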
[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns
[ https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bob Tiernay updated SPARK-5302:
-------------------------------
    Description:
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions.
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, partitions could be pruned when executing a query, and there would be no need to materialize a column in each logical partition that is already encoded in the path name. Furthermore, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.

  (was: For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions.
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.)

> Add support for SQLContext partition columns
> --------------------------------------------
>
>                 Key: SPARK-5302
>                 URL: https://issues.apache.org/jira/browse/SPARK-5302
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Bob Tiernay
>
> For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to
> support a virtual column that maps to part of the file path, similar to
> what is done in Hive for partitions.
> The API could allow the user to type the column using an appropriate
> {{DataType}} instance. This new field could be addressed in SQL statements
> much the same as is done in Hive.
> As a consequence, partitions could be pruned when executing a query, and
> there would be no need to materialize a column in each logical partition
> that is already encoded in the path name. Furthermore, this would provide
> a nice interop and migration strategy for Hive users who may one day use
> {{SQLContext}} directly.
[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns
[ https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bob Tiernay updated SPARK-5302:
-------------------------------
    Description:
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}).
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, partitions could be pruned when executing a query, and there would be no need to materialize a column in each logical partition that is already encoded in the path name. Furthermore, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.

  (was: For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions.
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, partitions could be pruned when executing a query, and there would be no need to materialize a column in each logical partition that is already encoded in the path name. Furthermore, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.)

> Add support for SQLContext partition columns
> --------------------------------------------
>
>                 Key: SPARK-5302
>                 URL: https://issues.apache.org/jira/browse/SPARK-5302
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Bob Tiernay
>
> For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to
> support a virtual column that maps to part of the file path, similar to
> what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}).
> The API could allow the user to type the column using an appropriate
> {{DataType}} instance. This new field could be addressed in SQL statements
> much the same as is done in Hive.
> As a consequence, partitions could be pruned when executing a query, and
> there would be no need to materialize a column in each logical partition
> that is already encoded in the path name. Furthermore, this would provide
> a nice interop and migration strategy for Hive users who may one day use
> {{SQLContext}} directly.
[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns
[ https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bob Tiernay updated SPARK-5302:
-------------------------------
    Description:
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} where {{dt}} is a field of type {{TEXT}}).
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, partitions could be pruned when executing a query, and there would be no need to materialize a column in each logical partition that is already encoded in the path name. Furthermore, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.

  was:
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}).
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, partitions could be pruned when executing a query, and there would be no need to materialize a column in each logical partition that is already encoded in the path name. Furthermore, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.

> Add support for SQLContext partition columns
> --------------------------------------------
>
>                 Key: SPARK-5302
>                 URL: https://issues.apache.org/jira/browse/SPARK-5302
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Bob Tiernay
>
> For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to
> support a virtual column that maps to part of the file path, similar to
> what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}
> where {{dt}} is a field of type {{TEXT}}).
> The API could allow the user to type the column using an appropriate
> {{DataType}} instance. This new field could be addressed in SQL statements
> much the same as is done in Hive.
> As a consequence, partitions could be pruned when executing a query, and
> there would be no need to materialize a column in each logical partition
> that is already encoded in the path name. Furthermore, this would provide
> a nice interop and migration strategy for Hive users who may one day use
> {{SQLContext}} directly.
[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns
[ https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bob Tiernay updated SPARK-5302:
-------------------------------
    Description:
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} where {{dt}} is a column of type {{TEXT}}).
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, partitions could be pruned when executing a query, and there would be no need to materialize a column in each logical partition that is already encoded in the path name. Furthermore, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.

  was:
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support a virtual column that maps to part of the file path, similar to what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} where {{dt}} is a field of type {{TEXT}}).
The API could allow the user to type the column using an appropriate {{DataType}} instance. This new field could be addressed in SQL statements much the same as is done in Hive.
As a consequence, partitions could be pruned when executing a query, and there would be no need to materialize a column in each logical partition that is already encoded in the path name. Furthermore, this would provide a nice interop and migration strategy for Hive users who may one day use {{SQLContext}} directly.

> Add support for SQLContext partition columns
> --------------------------------------------
>
>                 Key: SPARK-5302
>                 URL: https://issues.apache.org/jira/browse/SPARK-5302
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Bob Tiernay
>
> For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to
> support a virtual column that maps to part of the file path, similar to
> what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}
> where {{dt}} is a column of type {{TEXT}}).
> The API could allow the user to type the column using an appropriate
> {{DataType}} instance. This new field could be addressed in SQL statements
> much the same as is done in Hive.
> As a consequence, partitions could be pruned when executing a query, and
> there would be no need to materialize a column in each logical partition
> that is already encoded in the path name. Furthermore, this would provide
> a nice interop and migration strategy for Hive users who may one day use
> {{SQLContext}} directly.
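Editor's illustration of the pruning benefit mentioned in the description: a rough sketch, in plain Python and not tied to Spark's actual implementation, of how a filter on a path-encoded column could discard whole directories before any file is read. The helper names partition_values and prune, and the two extra directories alongside the {{dt=2015-01-01}} example, are made up for the illustration.

```python
from typing import Callable, Dict, Iterable, List


def partition_values(path: str) -> Dict[str, str]:
    """Collect key=value path segments as string-valued columns."""
    segments = path.strip("/").split("/")
    return dict(seg.split("=", 1) for seg in segments if "=" in seg)


def prune(
    paths: Iterable[str],
    predicate: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Keep only the directories whose partition values satisfy the filter."""
    return [p for p in paths if predicate(partition_values(p))]


# Hypothetical directory listing; only the first path appears in the issue.
paths = [
    "/data/clicks/dt=2015-01-01/",
    "/data/clicks/dt=2015-01-02/",
    "/data/clicks/dt=2015-01-03/",
]

# Equivalent in spirit to: SELECT ... WHERE dt >= '2015-01-02'
# ISO-8601 dates compare correctly as strings.
selected = prune(paths, lambda cols: cols.get("dt", "") >= "2015-01-02")
```

Because the predicate only looks at values taken from the directory names, the first directory is excluded without opening any of its files, which is the saving the description refers to.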