Re: frustration with field names in Dataset

2017-02-02 Thread Koert Kuipers
great, it's an easy fix. i will create a jira and pull request.

On Thu, Feb 2, 2017 at 2:13 PM, Michael Armbrust wrote:
> That might be reasonable. At least I can't think of any problems with
> doing that.
>
> On Thu, Feb 2, 2017 at 7:39 AM, Koert Kuipers

Re: frustration with field names in Dataset

2017-02-02 Thread Michael Armbrust
That might be reasonable. At least I can't think of any problems with doing that.

On Thu, Feb 2, 2017 at 7:39 AM, Koert Kuipers wrote:
> since a dataset is a typed object you ideally don't have to think about
> field names.
>
> however there are operations on Dataset that

Re: frustration with field names in Dataset

2017-02-02 Thread Koert Kuipers
another example is if i have a Dataset[(K, V)] and i want to repartition it by the key K. repartition requires a Column, which means i am suddenly back to worrying about duplicate field names. i would like to be able to say:

dataset.repartition(dataset(0))

On Thu, Feb 2, 2017 at 10:39 AM, Koert
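For readers following along, here is a minimal sketch of what repartitioning a tuple Dataset by its key looks like with the Column-based API as it stands. It assumes a local SparkSession and sample data invented for illustration. `RepartitionByKey` and `byFirstField` are hypothetical names, not part of Spark, and the positional `dataset(0)` wished for above is not an existing call. The helper approximates it by looking up the first field's name, so it still goes through a name and does not remove the duplicate-field-name concern.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object RepartitionByKey {
  // Hypothetical helper: repartition by the first field, whatever it is named.
  // It resolves the Column by name, so duplicate field names remain a problem.
  def byFirstField[T](ds: Dataset[T]): Dataset[T] =
    ds.repartition(ds(ds.columns.head))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("repartition-by-key")
      .getOrCreate()
    import spark.implicits._

    // Sample data invented for this sketch.
    val ds: Dataset[(String, Int)] = Seq(("a", 1), ("b", 2), ("a", 3)).toDS()

    // Tuple fields are named _1, _2, so the key must be referenced by name.
    val byName = ds.repartition(ds("_1"))

    // Closest approximation of a positional reference.
    val byPosition = byFirstField(ds)

    println(byName.count())
    println(byPosition.collect().toSeq)

    spark.stop()
  }
}
```

The result stays a typed Dataset[(String, Int)] either way; only the partitioning expression has to drop down to field names.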

frustration with field names in Dataset

2017-02-02 Thread Koert Kuipers
since a dataset is a typed object you ideally don't have to think about field names.

however there are operations on Dataset that require you to provide a Column, for example joinWith (and joinWith returns a strongly typed Dataset, not DataFrame). once you have to provide a Column you are
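To make the joinWith case concrete, here is a minimal sketch under stated assumptions: a local SparkSession, sample data invented for illustration, and hypothetical names `JoinWithByKey` and `joinOnKey`. Both inputs are tuple Datasets, so both call their key `_1`. The sketch aliases each side so the join condition can tell the two apart.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.col

object JoinWithByKey {
  // Hypothetical helper: inner-join two tuple Datasets on their first field.
  // Both sides name that field _1, so each side is aliased to disambiguate.
  def joinOnKey[K, V, W](
      left: Dataset[(K, V)],
      right: Dataset[(K, W)]): Dataset[((K, V), (K, W))] =
    left.as("l").joinWith(right.as("r"), col("l._1") === col("r._1"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("joinwith-by-key")
      .getOrCreate()
    import spark.implicits._

    // Sample data invented for this sketch.
    val left: Dataset[(String, Int)] = Seq(("a", 1), ("b", 2)).toDS()
    val right: Dataset[(String, String)] = Seq(("a", "x"), ("c", "y")).toDS()

    // The result is a typed Dataset of pairs, not a DataFrame.
    val joined = joinOnKey(left, right)
    joined.collect().foreach(println)

    spark.stop()
  }
}
```

The output type is fully typed, yet the condition is written against field names and aliases.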