Yes, for sorted groups you need to use POJOs or Tuples.
I think you have to split the input lines manually, with a mapper.
How about having the mapper return a TupleN<...> with only the fields you
need?

If you need all fields, you could also use a Tuple2<String, String[]> where
the first position is the sort key.
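
A rough sketch of what the mapper body could look like, written in plain Java
so it runs without Flink on the classpath. Map.Entry stands in for
Tuple2<String, String[]>, and the names KeyedRows, toKeyedRow and sortByKey are
made up for illustration. It also assumes a simple CSV without quoted commas.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Flink: a plain-Java stand-in for the mapper.
class KeyedRows {

    // Split one CSV line and pair the whole row with the field used as key.
    // Assumption: fields never contain quoted commas.
    static Map.Entry<String, String[]> toKeyedRow(String line, int keyIndex) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty columns
        if (keyIndex < 0 || keyIndex >= fields.length) {
            throw new IllegalArgumentException(
                "line has only " + fields.length + " fields");
        }
        return new SimpleImmutableEntry<>(fields[keyIndex], fields);
    }

    // Key every line, then order the rows by key (lexicographic String order),
    // which mimics sorting on the first tuple position.
    static List<Map.Entry<String, String[]>> sortByKey(List<String> lines, int keyIndex) {
        List<Map.Entry<String, String[]>> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(toKeyedRow(line, keyIndex));
        }
        rows.sort(Map.Entry.comparingByKey());
        return rows;
    }
}
```

In the real job, the body of toKeyedRow would sit inside the mapper, and the
grouping and sorting would then be done on tuple position 0 instead of in
sortByKey.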



On Tue, Oct 21, 2014 at 2:20 PM, Gyula Fora <gyf...@apache.org> wrote:

> I am not sure how you should go about that; let’s wait for some feedback
> from the others.
>
> Until then you can always map the array to (array, keyfield) and use
> groupBy(1).
>
>
> > On 21 Oct 2014, at 14:17, Martin Neumann <mneum...@spotify.com> wrote:
> >
> > Hej,
> >
> > Unfortunately, .sort() cannot take a key extractor. Would I have to do the
> > sort myself then?
> >
> > cheers Martin
> >
> > On Tue, Oct 21, 2014 at 2:08 PM, Gyula Fora <gyf...@apache.org> wrote:
> >
> >> Hey,
> >>
> >> Using arrays is probably a convenient way to do so.
> >>
> >> I think the way you described the groupBy only works for tuples now. To do
> >> the grouping on the array field, you would need to create a key extractor
> >> for this and pass that to groupBy.
> >>
> >> Actually we have some use-cases like this for streaming so we are thinking
> >> of writing a wrapper for the array types that would behave as you described.
> >>
> >> Regards,
> >> Gyula
> >>
> >>> On 21 Oct 2014, at 14:03, Martin Neumann <mneum...@spotify.com> wrote:
> >>>
> >>> Hej,
> >>>
> >>> I have a CSV file with 54 columns, each of them a string (for now). I need
> >>> to group and sort them on field 15.
> >>>
> >>> What's the best way to load the data into Flink?
> >>> There is no Tuple54 (and the <> would look awful anyway with 54 times
> >>> String in it).
> >>> My current idea is to write a Mapper and split each line into an array of
> >>> Strings. Would grouping and sorting work on this?
> >>>
> >>> So can I do something like this or does that only work on tuples:
> >>> DataSet<String[]> ds;
> >>> ds.groupBy(15).sort(20, ANY)
> >>>
> >>> cheers Martin
> >>
> >>
>
>
