Hey,

Using arrays is probably a convenient way to do so.
I think the groupBy you described currently works only for tuples. To group on a field of the array, you would need to create a key extractor for it and pass that to groupBy.

Actually, we have some use cases like this for streaming, so we are thinking of writing a wrapper for the array types that would behave as you described.

Regards,
Gyula

> On 21 Oct 2014, at 14:03, Martin Neumann <[email protected]> wrote:
>
> Hej,
>
> I have a csv file with 54 columns each of them is string (for now). I need
> to group and sort them on field 15.
>
> Whats the best way to load the data into Flink?
> There is no Tuple54 (and the <> would look awful anyway with 54 times
> String in it).
> My current Idea is to write a Mapper and split the string to Arrays of
> Strings would grouping and sorting work on this?
>
> So can I do something like this or does that only work on tuples:
> Dataset<String[]> ds;
> ds.groupBy(15).sort(20. ANY)
>
> cheers Martin
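
To make the key extractor idea from the reply concrete, here is a minimal sketch in plain Java. It deliberately does not use the Flink API: `ArrayGrouping`, `fieldKey`, and `groupAndSort` are hypothetical names invented for this illustration, and a `java.util.function.Function` stands in for whatever key extractor type Flink's groupBy accepts. It only shows the shape of the approach, assuming each CSV line has already been split into a `String[]`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

class ArrayGrouping {

    // Hypothetical helper: builds a key extractor that returns the field at `index`.
    static Function<String[], String> fieldKey(int index) {
        return row -> row[index];
    }

    // Groups rows by the key from `groupKey`, then sorts each group by `sortKey`.
    // This runs in memory on a List; it is not a distributed Flink operator.
    static Map<String, List<String[]>> groupAndSort(
            List<String[]> rows,
            Function<String[], String> groupKey,
            Function<String[], String> sortKey) {
        Map<String, List<String[]>> groups = new TreeMap<>();
        for (String[] row : rows) {
            groups.computeIfAbsent(groupKey.apply(row), k -> new ArrayList<>()).add(row);
        }
        for (List<String[]> group : groups.values()) {
            group.sort(Comparator.comparing(sortKey));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Three-column rows keep the example short; the CSV in question has 54.
        List<String[]> rows = Arrays.asList(
                "a,x,3".split(","),
                "b,y,1".split(","),
                "a,z,2".split(","));

        // Group on field 0 and sort within each group on field 2.
        Map<String, List<String[]>> grouped = groupAndSort(rows, fieldKey(0), fieldKey(2));

        grouped.forEach((key, group) -> {
            for (String[] row : group) {
                System.out.println(key + " -> " + String.join(",", row));
            }
        });
    }
}
```

For the 54-column file, the same pattern would use `fieldKey(15)` for grouping and `fieldKey(20)` for sorting. Note that the fields are compared as strings here, so numeric columns would need a parsing key extractor instead.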
