[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-04 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/15979
admittedly the result looks weird. it really should be:
+---++
|key|count(1)|
+---++
| null| 1|
| [1,1]|
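For comparison, here is a minimal sketch of the same grouping on plain Scala collections, with no Spark involved. The names `OptionKeyCounts` and `countByKey` are hypothetical and do not come from the thread. It only illustrates the semantics the comment appears to expect, where the `None` key forms its own group; it says nothing about how Spark encodes the key.

```scala
object OptionKeyCounts {
  // Hypothetical helper: group rows by their second field and count each group.
  // The key type K may itself be an Option of a tuple, as in the thread's example.
  def countByKey[K, V](rows: Seq[(V, K)]): Map[K, Int] =
    rows.groupBy(_._2).map { case (key, group) => key -> group.size }

  def main(args: Array[String]): Unit = {
    // Same input as the snippet quoted in the thread.
    val rows: Seq[(String, Option[(Int, Int)])] =
      Seq(("a", Some((1, 1))), ("a", None))
    countByKey(rows).foreach { case (key, n) => println(s"$key -> $n") }
  }
}
```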

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-04 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/15979
spark 2.0.x does not have mapValues. but this works:
scala> Seq(("a", Some((1, 1))), ("a", None)).toDS.groupByKey(_._2).count.show
+---++
|

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-04 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/15979
Yes it worked before
On Dec 4, 2016 02:33, "Wenchen Fan" wrote:
> val x: Dataset[(String, Option[(String, String)])] = ...
>

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15979
```
val x: Dataset[(String, Option[(String, String)])] = ...
x.groupByKey(_._1).mapValues(_._2).agg(someAgg)
```
Did it work before? Please see the discussion in the JIRA:

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-03 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/15979 this means anything that uses an encoder can no longer use Option[_ <: Product]. encoders are not just used for the top level Dataset creation. Dataset.groupByKey[K] requires an
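The point about `groupByKey` needing an encoder for its key type can be sketched as follows. This is an illustration under assumptions, not code from the PR: it assumes Spark SQL is on the classpath, and the names `OptionKeySketch` and `countByOptionKey` are hypothetical. Wrapping the `Option[(Int, Int)]` key in `Tuple1` is one possible way to keep the top-level key type a `Product` with the `Option` nested inside it; whether the unwrapped key `_._2` is accepted depends on the Spark version in use.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object OptionKeySketch {
  // Hypothetical helper: counts rows per Option[(Int, Int)] key.
  // groupByKey needs an implicit Encoder for the key type, so the key is
  // wrapped in Tuple1 to make the top-level key a Product rather than an Option.
  def countByOptionKey(spark: SparkSession): Dataset[(Tuple1[Option[(Int, Int)]], Long)] = {
    import spark.implicits._
    val ds = Seq[(String, Option[(Int, Int)])](("a", Some((1, 1))), ("a", None)).toDS()
    ds.groupByKey(row => Tuple1(row._2)).count()
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("option-key-sketch")
      .getOrCreate()
    try countByOptionKey(spark).show()
    finally spark.stop()
  }
}
```

The trade-off of this sketch is that downstream code sees `Tuple1[Option[...]]` keys and has to unwrap them with `._1`.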

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 Also backported to branch-2.1. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15979 Sounds good.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 @rxin Shall we backport this to branch-2.1? I think it's relatively safe.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 Merging to master. Thanks!

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test PASSed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69411/ Test PASSed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69411/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69411/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test FAILed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69403/ Test FAILed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69403/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69403/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15979 retest this please

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test FAILed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69394/ Test FAILed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69394 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69394/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 Good to merge pending Jenkins. Thanks!

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69394 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69394/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15979 retest this please

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69387/ Test FAILed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test FAILed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69387/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15979 FWIW I don't think we should call it nonflat.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 My only concern is that "non-flat" type is neither intuitive nor a well-known term. In fact, this PR only prevents `Option[T <: Product]` from being a top-level Dataset type. How about just calling them

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 LGTM, merging to master. Thanks!

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test PASSed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69328/ Test PASSed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69328/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69328 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69328/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15979 retest this please

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test FAILed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69326/ Test FAILed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69326 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69326/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69326 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69326/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-28 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15979 looks good. @liancheng want to double check?

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15979 retest it please

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15979 "non-flat type" means "complex type", i.e. array, seq, map, product, etc.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15979 What does "non-flat type" mean?

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test PASSed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69000/ Test PASSed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69000 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69000/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69000 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69000/consoleFull)** for PR 15979 at commit

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15979 cc @yhuai @liancheng