[ https://issues.apache.org/jira/browse/KUDU-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452632#comment-16452632 ]
Grant Henke commented on KUDU-1676:
-----------------------------------

Thanks for the comment [~ivan.vergiliev]. I didn't realize there was a Jira for this. This is fixed via [e94556e|https://github.com/apache/kudu/commit/e94556e]. The unit test added in [DefaultSourceTest.scala|https://github.com/apache/kudu/commit/e94556e#diff-e35eedde7d54f5aac66eb41b13e7efee] is a brief example.

> Spark DDL needs elegant way to specify range partitioning
> ---------------------------------------------------------
>
>                 Key: KUDU-1676
>                 URL: https://issues.apache.org/jira/browse/KUDU-1676
>             Project: Kudu
>          Issue Type: New Feature
>          Components: spark
>    Affects Versions: 1.0.0
>            Reporter: Mladen Kovacevic
>            Assignee: Mladen Kovacevic
>            Priority: Major
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> To define partition column splits, you need a PartialRow object. These are
> easy to create when you have the Schema object. But since your table schema
> in Spark is defined with a StructType instead of a Schema, it's cumbersome
> to define a new Schema object that exactly duplicates the StructType
> version, only to get a PartialRow, set the range partition split values on
> it, and then pass it to the addSplitRow() call on your CreateTableOptions.
> We need an elegant way for the Spark API to handle specifying range
> partition attributes without having to drop into the Java API in Spark.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
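For context, the boilerplate the issue description complains about looks roughly like the following Scala sketch against the Kudu Java client API (Schema, PartialRow, CreateTableOptions). The table columns and the split value here are hypothetical, chosen only to illustrate the pattern; on the Spark side the same table would already be described by a StructType, so this Schema is pure duplication:

```scala
import scala.collection.JavaConverters._

import org.apache.kudu.{ColumnSchema, Schema, Type}
import org.apache.kudu.client.CreateTableOptions

// Duplicate the table definition that Spark already holds as a StructType.
// Column names and types here are illustrative, not from the issue.
val idCol  = new ColumnSchema.ColumnSchemaBuilder("id", Type.INT64).key(true).build()
val valCol = new ColumnSchema.ColumnSchemaBuilder("value", Type.STRING).build()
val schema = new Schema(List(idCol, valCol).asJava)

// A PartialRow for a split point can only be obtained from the Schema --
// this is the step that forces the drop into the Java API.
val split = schema.newPartialRow()
split.addLong("id", 1000L)  // hypothetical split boundary

// Attach the split row to the table-creation options.
val options = new CreateTableOptions()
  .setRangePartitionColumns(List("id").asJava)
  .addSplitRow(split)
```

These options would then be passed to the table-creation call, which is exactly the kind of ceremony the fix referenced above lets Spark users avoid.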