[ https://issues.apache.org/jira/browse/KUDU-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452518#comment-16452518 ]
Ivan Vergiliev edited comment on KUDU-1676 at 4/25/18 3:45 PM:
---------------------------------------------------------------

This is now possible using the following call:

{{val kuduSchema = kuduContext.createSchema(schema)}}

where `schema` is a Spark `StructType` schema.

> Spark DDL needs elegant way to specify range partitioning
> ---------------------------------------------------------
>
>                 Key: KUDU-1676
>                 URL: https://issues.apache.org/jira/browse/KUDU-1676
>             Project: Kudu
>          Issue Type: New Feature
>          Components: spark
>    Affects Versions: 1.0.0
>            Reporter: Mladen Kovacevic
>            Assignee: Mladen Kovacevic
>            Priority: Major
>  Original Estimate: 96h
>  Remaining Estimate: 96h
>
> To define partition column splits, you need a PartialRow object. These are easy to create when you have the Schema object. But since your table schema in Spark is defined with a StructType instead of a Schema, it's cumbersome to define a new Schema object that exactly duplicates the StructType version, only to get a PartialRow, set the range partition split values, and pass them to addSplitRow() on your CreateTableOptions.
> We need an elegant way for the Spark API to handle specifying range partition attributes without having to drop into the Java API from Spark.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)