[ https://issues.apache.org/jira/browse/KUDU-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348352#comment-15348352 ]
Andy Grove edited comment on KUDU-1493 at 6/24/16 2:41 PM:
-----------------------------------------------------------
One application could write many DataFrames with different column orderings to
the same table. A read should always return the columns in the order that you
specify in your projection. If you don't provide a projection, I would expect
the columns to be returned in the order in which they are defined in the Kudu
schema. As far as I know, this is the current behavior, and in my opinion it is
correct.
If you rely on column ordering, you should apply a projection to the RDD that
you read from Kudu, e.g. "SELECT c, b, a FROM kudu_table" if using Spark SQL,
rather than "SELECT * FROM kudu_table".
Databases usually make no guarantees about row or column ordering unless you
are explicit in your query.
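To illustrate the difference between an explicit projection and "SELECT *",
here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a
stand-in for Kudu and Spark SQL, so it shows the general SQL behavior rather
than anything specific to the Kudu Spark connector. The table definition, the
key on (a, c), and the sample values are assumptions made up for the example.

```python
import sqlite3

# Assumption: sqlite3 stands in for Kudu/Spark SQL; the table layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kudu_table (a INTEGER, b TEXT, c REAL, PRIMARY KEY (a, c))"
)

# Write with a column ordering that differs from the table definition.
conn.execute("INSERT INTO kudu_table (c, a, b) VALUES (?, ?, ?)", (2.5, 1, "x"))


def column_names(cursor):
    """Return the result-set column names in the order the query produced them."""
    return [description[0] for description in cursor.description]


# No projection: columns come back in the order the table defines them.
star_columns = column_names(conn.execute("SELECT * FROM kudu_table"))

# Explicit projection: columns come back in the order the query asks for.
projected_columns = column_names(conn.execute("SELECT c, b, a FROM kudu_table"))
projected_row = conn.execute("SELECT c, b, a FROM kudu_table").fetchone()

print(star_columns)
print(projected_columns)
print(projected_row)
```

The write order has no effect on what the reader sees; only the projection
does, which is why code that depends on column positions should name its
columns explicitly.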
> Spark read fails if key columns are not leading columns
> -------------------------------------------------------
>
> Key: KUDU-1493
> URL: https://issues.apache.org/jira/browse/KUDU-1493
> Project: Kudu
> Issue Type: Bug
> Components: spark
> Affects Versions: 0.9.0
> Reporter: Tom White
> Assignee: Andy Grove
>
> If the Spark dataframe schema is (A, B, C) then reading will fail if the Kudu
> keys are (A, C). Keys (A, B) work fine.