[jira] [Updated] (SPARK-5302) Add support for SQLContext "partition" columns

2015-09-16 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-5302:
-
Assignee: Cheng Lian

> Add support for SQLContext "partition" columns
> --
>
> Key: SPARK-5302
> URL: https://issues.apache.org/jira/browse/SPARK-5302
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Reporter: Bob Tiernay
>Assignee: Cheng Lian
> Fix For: 1.4.0
>
>
> For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to 
> support a virtual column that maps to part of the file path, similar to 
> what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} 
> where {{dt}} is a column of type {{TEXT}}). 
> The API could allow the user to type the column using an appropriate 
> {{DataType}} instance. This new field could be addressed in SQL statements 
> much the same as is done in Hive. 
> As a consequence, pruning of partitions could be possible when executing a 
> query and also remove the need to materialize a column in each logical 
> partition that is already encoded in the path name. Furthermore, this would 
> provide a nice interop and migration strategy for Hive users who may one day 
> use {{SQLContext}} directly.
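
The request above can be illustrated with a small standalone sketch. This is not Spark code and not the implementation that resolved this issue; it only shows the idea of deriving a typed virtual column from Hive-style `key=value` path segments and using it to prune directories before any file is read. The names `CONVERTERS`, `parse_partition_columns` and `prune_partitions` are hypothetical, and the converter table is an assumed stand-in for Spark SQL {{DataType}} instances.

```python
from datetime import date
from typing import Any, Callable, Dict, List

# Hypothetical converters standing in for Spark SQL DataType instances.
CONVERTERS: Dict[str, Callable[[str], Any]] = {
    "string": str,
    "int": int,
    "date": date.fromisoformat,
}


def parse_partition_columns(path: str, schema: Dict[str, str]) -> Dict[str, Any]:
    """Extract typed partition values from key=value segments of a path."""
    values: Dict[str, Any] = {}
    for segment in path.strip("/").split("/"):
        key, sep, raw = segment.partition("=")
        # Only segments of the form key=value whose key is declared in the
        # schema become virtual columns; other segments are ignored.
        if sep and key in schema:
            values[key] = CONVERTERS[schema[key]](raw)
    return values


def prune_partitions(
    paths: List[str],
    schema: Dict[str, str],
    predicate: Callable[[Dict[str, Any]], bool],
) -> List[str]:
    """Keep only the directories whose partition values satisfy the predicate."""
    return [p for p in paths if predicate(parse_partition_columns(p, schema))]


paths = [
    "/data/clicks/dt=2015-01-01/",
    "/data/clicks/dt=2015-01-02/",
    "/data/clicks/dt=2015-01-03/",
]
schema = {"dt": "date"}  # the user chooses the type of the virtual column

# Equivalent in spirit to: SELECT ... WHERE dt >= '2015-01-02'
selected = prune_partitions(paths, schema, lambda row: row["dt"] >= date(2015, 1, 2))
print(selected)
```

Because the value of `dt` comes from the directory name alone, the first directory is skipped without opening any of its files, and the data files do not need to store a `dt` column.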



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns

2015-01-17 Thread Bob Tiernay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Tiernay updated SPARK-5302:
---
Description: For {{SQLContext}} (not {{HiveContext}}) it would be very 
convenient to support a virtual column that maps to part of the file path, 
similar to what is done in Hive for partitions. The API could allow the user to 
type the column using an appropriate {{DataType}} instance. This new field 
could be addressed in SQL statements much the same as is done in Hive. As a 
consequence, pruning of partitions could be possible when executing a query and 
also remove the need to materialize a column in each logical partition that is 
already encoded in the path name. Furthermore, this would provide a nice 
interop and migration strategy for Hive users who may one day use 
{{SQLContext}} directly.  (was: For {{SQLContext}} (not {{HiveContext}}) it 
would be very convenient to support a virtual column that maps to part of the 
file path, similar to what is done in Hive for partitions. The API could 
allow the user to type the column using an appropriate {{DataType}} instance. 
This new field could be addressed in SQL statements much the same as is done in 
Hive. As a consequence, this would provide a nice interop and migration 
strategy for Hive users who may one day use {{SQLContext}} directly.)

 Add support for SQLContext partition columns
 --

 Key: SPARK-5302
 URL: https://issues.apache.org/jira/browse/SPARK-5302
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Bob Tiernay

 For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to 
 support a virtual column that maps to part of the file path, similar to 
 what is done in Hive for partitions. The API could allow the user to type the 
 column using an appropriate {{DataType}} instance. This new field could be 
 addressed in SQL statements much the same as is done in Hive. As a 
 consequence, pruning of partitions could be possible when executing a query 
 and also remove the need to materialize a column in each logical partition 
 that is already encoded in the path name. Furthermore, this would provide a 
 nice interop and migration strategy for Hive users who may one day use 
 {{SQLContext}} directly.






[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns

2015-01-17 Thread Bob Tiernay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Tiernay updated SPARK-5302:
---
Description: For {{SQLContext}} (not {{HiveContext}}) it would be very 
convenient to support a virtual column that maps to part of the file path, 
similar to what is done in Hive for partitions (e.g. 
{{/data/clicks/dt=2015-01-01/}}). The API could allow the user to type the 
column using an appropriate {{DataType}} instance. This new field could be 
addressed in SQL statements much the same as is done in Hive. As a consequence, 
pruning of partitions could be possible when executing a query and also remove 
the need to materialize a column in each logical partition that is already 
encoded in the path name. Furthermore, this would provide a nice interop and 
migration strategy for Hive users who may one day use {{SQLContext}} directly.  
(was: For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to 
support a virtual column that maps to part of the file path, similar to 
what is done in Hive for partitions. The API could allow the user to type the 
column using an appropriate {{DataType}} instance. This new field could be 
addressed in SQL statements much the same as is done in Hive. As a consequence, 
pruning of partitions could be possible when executing a query and also remove 
the need to materialize a column in each logical partition that is already 
encoded in the path name. Furthermore, this would provide a nice interop and 
migration strategy for Hive users who may one day use {{SQLContext}} directly.)

 Add support for SQLContext partition columns
 --

 Key: SPARK-5302
 URL: https://issues.apache.org/jira/browse/SPARK-5302
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Bob Tiernay

 For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to 
 support a virtual column that maps to part of the file path, similar to 
 what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}). 
 The API could allow the user to type the column using an appropriate 
 {{DataType}} instance. This new field could be addressed in SQL statements 
 much the same as is done in Hive. As a consequence, pruning of partitions 
 could be possible when executing a query and also remove the need to 
 materialize a column in each logical partition that is already encoded in the 
 path name. Furthermore, this would provide a nice interop and migration 
 strategy for Hive users who may one day use {{SQLContext}} directly.






[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns

2015-01-17 Thread Bob Tiernay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Tiernay updated SPARK-5302:
---
Description: 
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support 
a virtual column that maps to part of the file path, similar to what is 
done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} where {{dt}} 
is a field of type {{TEXT}}). 

The API could allow the user to type the column using an appropriate 
{{DataType}} instance. This new field could be addressed in SQL statements much 
the same as is done in Hive. 

As a consequence, pruning of partitions could be possible when executing a 
query and also remove the need to materialize a column in each logical 
partition that is already encoded in the path name. Furthermore, this would 
provide a nice interop and migration strategy for Hive users who may one day 
use {{SQLContext}} directly.

  was:For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to 
support a virtual column that maps to part of the file path, similar to 
what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}}). The 
API could allow the user to type the column using an appropriate {{DataType}} 
instance. This new field could be addressed in SQL statements much the same as 
is done in Hive. As a consequence, pruning of partitions could be possible when 
executing a query and also remove the need to materialize a column in each 
logical partition that is already encoded in the path name. Furthermore, this 
would provide a nice interop and migration strategy for Hive users who may one 
day use {{SQLContext}} directly.


 Add support for SQLContext partition columns
 --

 Key: SPARK-5302
 URL: https://issues.apache.org/jira/browse/SPARK-5302
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Bob Tiernay

 For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to 
 support a virtual column that maps to part of the file path, similar to 
 what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} 
 where {{dt}} is a field of type {{TEXT}}). 
 The API could allow the user to type the column using an appropriate 
 {{DataType}} instance. This new field could be addressed in SQL statements 
 much the same as is done in Hive. 
 As a consequence, pruning of partitions could be possible when executing a 
 query and also remove the need to materialize a column in each logical 
 partition that is already encoded in the path name. Furthermore, this would 
 provide a nice interop and migration strategy for Hive users who may one day 
 use {{SQLContext}} directly.






[jira] [Updated] (SPARK-5302) Add support for SQLContext partition columns

2015-01-17 Thread Bob Tiernay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Tiernay updated SPARK-5302:
---
Description: 
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support 
a virtual column that maps to part of the file path, similar to what is 
done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} where {{dt}} 
is a column of type {{TEXT}}). 

The API could allow the user to type the column using an appropriate 
{{DataType}} instance. This new field could be addressed in SQL statements much 
the same as is done in Hive. 

As a consequence, pruning of partitions could be possible when executing a 
query and also remove the need to materialize a column in each logical 
partition that is already encoded in the path name. Furthermore, this would 
provide a nice interop and migration strategy for Hive users who may one day 
use {{SQLContext}} directly.

  was:
For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to support 
a virtual column that maps to part of the file path, similar to what is 
done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} where {{dt}} 
is a field of type {{TEXT}}). 

The API could allow the user to type the column using an appropriate 
{{DataType}} instance. This new field could be addressed in SQL statements much 
the same as is done in Hive. 

As a consequence, pruning of partitions could be possible when executing a 
query and also remove the need to materialize a column in each logical 
partition that is already encoded in the path name. Furthermore, this would 
provide a nice interop and migration strategy for Hive users who may one day 
use {{SQLContext}} directly.


 Add support for SQLContext partition columns
 --

 Key: SPARK-5302
 URL: https://issues.apache.org/jira/browse/SPARK-5302
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Reporter: Bob Tiernay

 For {{SQLContext}} (not {{HiveContext}}) it would be very convenient to 
 support a virtual column that maps to part of the file path, similar to 
 what is done in Hive for partitions (e.g. {{/data/clicks/dt=2015-01-01/}} 
 where {{dt}} is a column of type {{TEXT}}). 
 The API could allow the user to type the column using an appropriate 
 {{DataType}} instance. This new field could be addressed in SQL statements 
 much the same as is done in Hive. 
 As a consequence, pruning of partitions could be possible when executing a 
 query and also remove the need to materialize a column in each logical 
 partition that is already encoded in the path name. Furthermore, this would 
 provide a nice interop and migration strategy for Hive users who may one day 
 use {{SQLContext}} directly.


