I am not sure how you should go about that; let’s wait for some feedback from the others.
Until then you can always map the array to (array, keyfield) and use groupBy(1).

> On 21 Oct 2014, at 14:17, Martin Neumann <[email protected]> wrote:
>
> Hej,
>
> Unfortunately .sort() cannot take a key extractor, would I have to do the
> sort myself then?
>
> cheers Martin
>
> On Tue, Oct 21, 2014 at 2:08 PM, Gyula Fora <[email protected]> wrote:
>
>> Hey,
>>
>> Using arrays is probably a convenient way to do so.
>>
>> I think the way you described the groupBy only works for tuples now. To do
>> the grouping on the array field, you would need to create a key extractor
>> for this and pass that to groupBy.
>>
>> Actually we have some use-cases like this for streaming so we are thinking
>> of writing a wrapper for the array types that would behave as you described.
>>
>> Regards,
>> Gyula
>>
>>> On 21 Oct 2014, at 14:03, Martin Neumann <[email protected]> wrote:
>>>
>>> Hej,
>>>
>>> I have a csv file with 54 columns each of them is string (for now). I need
>>> to group and sort them on field 15.
>>>
>>> Whats the best way to load the data into Flink?
>>> There is no Tuple54 (and the <> would look awful anyway with 54 times
>>> String in it).
>>> My current Idea is to write a Mapper and split the string to Arrays of
>>> Strings would grouping and sorting work on this?
>>>
>>> So can I do something like this or does that only work on tuples:
>>> Dataset<String[]> ds;
>>> ds.groupBy(15).sort(20. ANY)
>>>
>>> cheers Martin
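
A rough sketch of the workaround mentioned at the top of this reply (pair each array with its key field, then group on the key). This is plain Java collections code that only illustrates the idea; it does not use the Flink API. The names KeyedRows, Keyed, attachKey and groupAndSort are made up for this example, and the tiny two-column rows stand in for the 54-column CSV lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names come from Flink.
final class KeyedRows {

    // Hypothetical stand-in for a 2-field tuple: (array, keyfield).
    static final class Keyed {
        final String[] row;
        final String key;

        Keyed(String[] row, String key) {
            this.row = row;
            this.key = key;
        }
    }

    // The "map" step: pair every row with the value of its key column.
    static List<Keyed> attachKey(List<String[]> rows, int keyField) {
        List<Keyed> out = new ArrayList<>();
        for (String[] row : rows) {
            out.add(new Keyed(row, row[keyField]));
        }
        return out;
    }

    // The "group on the key" step, followed by a sort inside each group
    // on another column of the original array.
    static Map<String, List<String[]>> groupAndSort(List<Keyed> keyed, int sortField) {
        Map<String, List<String[]>> groups = new TreeMap<>();
        for (Keyed k : keyed) {
            groups.computeIfAbsent(k.key, unused -> new ArrayList<>()).add(k.row);
        }
        for (List<String[]> group : groups.values()) {
            group.sort(Comparator.comparing((String[] r) -> r[sortField]));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"a", "z"},
                new String[] {"b", "y"},
                new String[] {"a", "x"});

        // Group on column 0, sort each group on column 1.
        Map<String, List<String[]>> groups = groupAndSort(attachKey(rows, 0), 1);
        for (Map.Entry<String, List<String[]>> e : groups.entrySet()) {
            for (String[] row : e.getValue()) {
                System.out.println(e.getKey() + " -> " + Arrays.toString(row));
            }
        }
    }
}
```

For the CSV from the original question, the key column would be 15 and the sort column whichever field the rows should be ordered by. Whether the same shape carries over to a Flink tuple holding a String[] is exactly the open point in this thread, so treat it as an assumption until others confirm.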
