[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-05-18 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290078#comment-15290078
 ] 

Apache Spark commented on SPARK-14463:
--

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/13184

> read.text broken for partitioned tables
> ---
>
> Key: SPARK-14463
> URL: https://issues.apache.org/jira/browse/SPARK-14463
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Michael Armbrust
> Assignee: Jurriaan Pruis
> Priority: Critical
> Fix For: 2.0.0
>
>
> Strongly typing the return values of {{read.text}} as {{Dataset\[String]}} 
> breaks when trying to load a partitioned table (or any table where the path 
> looks partitioned)
> {code}
> Seq((1, "test"))
>   .toDF("a", "b")
>   .write
>   .format("text")
>   .partitionBy("a")
>   .save("/home/michael/text-part-bug")
> sqlContext.read.text("/home/michael/text-part-bug")
> {code}
> {code}
> org.apache.spark.sql.AnalysisException: Try to map struct 
> to Tuple1, but failed as the number of fields does not line up.
>  - Input schema: struct
>  - Target schema: struct;
>   at 
> org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.org$apache$spark$sql$catalyst$encoders$ExpressionEncoder$$fail$1(ExpressionEncoder.scala:265)
>   at 
> org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.validate(ExpressionEncoder.scala:279)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:197)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:168)
>   at org.apache.spark.sql.Dataset$.apply(Dataset.scala:57)
>   at org.apache.spark.sql.Dataset.as(Dataset.scala:357)
>   at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:450)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-05-18 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15288690#comment-15288690
 ] 

Apache Spark commented on SPARK-14463:
--

User 'jurriaan' has created a pull request for this issue:
https://github.com/apache/spark/pull/13104


[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-05-13 Thread Jurriaan Pruis (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283044#comment-15283044
 ] 

Jurriaan Pruis commented on SPARK-14463:


Actually, this functionality is broken (explicitly disabled) in Spark 2.0's 
text datasource. See https://github.com/apache/spark/pull/13104 for a fix.


[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-05-04 Thread Jurriaan Pruis (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271207#comment-15271207
 ] 

Jurriaan Pruis commented on SPARK-14463:


Any idea if https://issues.apache.org/jira/browse/SPARK-14343 is somehow 
related?


[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-18 Thread Reynold Xin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246222#comment-15246222
 ] 

Reynold Xin commented on SPARK-14463:
-

This is just a problem with the text method because it returns Dataset[String]. 
I think we can disable partitioning in this case.

If they want to load a two-level folder, they can use a glob:
{code}
read.text("/path/to/data/*/*")
{code}

If users want to use partitioning, they can still use
{code}
format("text").load("...")
{code}
which returns a DataFrame rather than a Dataset[String].
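The two options above can be sketched end to end as follows. This is only an illustration under stated assumptions, not the code from either pull request: it assumes a Spark 2.x or later build where `SparkSession` is available (the issue description uses `sqlContext`), a local master, and a temporary directory in place of the path from the issue. The object name `TextPartitionSketch` and the `a=*` glob are made up for the example.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object TextPartitionSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session on Spark 2.x+; the thread itself uses sqlContext.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("text-partition-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical scratch location standing in for /home/michael/text-part-bug.
    val path = Files.createTempDirectory("text-part-bug").resolve("table").toString

    // Same write as in the issue description: column "a" becomes the directory a=1.
    Seq((1, "test")).toDF("a", "b")
      .write.format("text").partitionBy("a").save(path)

    // Untyped read: returns a DataFrame, so partition discovery is free to
    // add a column next to "value".
    val withPartitions: DataFrame = spark.read.format("text").load(path)
    withPartitions.printSchema()

    // Glob read: point at the partition directories directly, keep only the
    // line contents, and type them as String.
    val linesOnly: Dataset[String] =
      spark.read.format("text").load(s"$path/a=*").select("value").as[String]
    linesOnly.show()

    spark.stop()
  }
}
```

Selecting `value` explicitly before `.as[String]` keeps the typed read independent of whether any partition column was discovered.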



[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-18 Thread Jurriaan Pruis (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246211#comment-15246211
 ] 

Jurriaan Pruis commented on SPARK-14463:


Why? I guess this can be quite useful, at least when reading them (I've got 
some partitioned text files and want to be able to quickly filter them based 
on the partitions before processing them any further). This kind of worked with 
Spark 1.6.x, but I ran into problems when trying to work with the partition 
values themselves (SPARK-14343).


[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-13 Thread Cheng Lian (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240640#comment-15240640
 ] 

Cheng Lian commented on SPARK-14463:


Should we simply throw an exception when the text data source is used together 
with partitioning?


[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-13 Thread Reynold Xin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239630#comment-15239630
 ] 

Reynold Xin commented on SPARK-14463:
-

The problem is that the return type is Dataset[String], and as a result we 
can't really add another field to what the text method returns.



[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-13 Thread Cheng Lian (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239617#comment-15239617
 ] 

Cheng Lian commented on SPARK-14463:


It seems that this is because {{buildReader()}} doesn't append partition 
columns the way other data sources do.


[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-07 Thread Reynold Xin (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231035#comment-15231035
 ] 

Reynold Xin commented on SPARK-14463:
-

Oh, I guess we should drop the partition value?



[jira] [Commented] (SPARK-14463) read.text broken for partitioned tables

2016-04-07 Thread Michael Armbrust (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230946#comment-15230946
 ] 

Michael Armbrust commented on SPARK-14463:
--

[~rxin]

