Ok, thanks Fabian.
From: Fabian Hueske [mailto:fhue...@gmail.com]
Sent: Tuesday, October 11, 2016 1:12 AM
To: user@flink.apache.org
Subject: Re: Wordindex conversation.
Hi,
you can do it like this:
1) you have to split each label record of the main dataset into separate
records:
(0,List(a, b, c, d, e, f, g)) -> (0, a), (0, b), (0, c), ..., (0, g)
(1,List(b, c, f, a, g)) -> (1, b), (1, c), ..., (1, g)
2) join the word index dataset with the split main dataset:
Data
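
A minimal sketch of steps 1 and 2, written with plain Scala collections rather than the Flink API so that it runs on its own. The object and value names (WordIndexSketch, mainData, split, indexByWord, joined) are made up for illustration, and the map lookup only stands in for a join on the word field. In a Flink DataSet program the same shape would presumably be a flatMap followed by a join keyed on the word. The sketch stops after the join and produces (label, index) pairs.

```scala
object WordIndexSketch {
  // Main dataset: (label, words), copied from the example records.
  val mainData: Seq[(Int, List[String])] = Seq(
    (0, List("a", "b", "c", "d", "e", "f", "g")),
    (1, List("b", "c", "f", "a", "g"))
  )

  // Word index: (index, word), the same shape as the wordIndex records.
  val wordIndex: Seq[(Long, String)] =
    Seq("a", "b", "c", "d", "e", "f", "g").zipWithIndex.map {
      case (word, i) => (i.toLong, word)
    }

  // Step 1: split each (label, words) record into one (label, word) record per word.
  val split: Seq[(Int, String)] =
    mainData.flatMap { case (label, words) => words.map(w => (label, w)) }

  // Step 2: join the split records with the word index on the word.
  // A lookup map stands in for the join here (assumption: the index fits in memory).
  val indexByWord: Map[String, Long] =
    wordIndex.map { case (i, word) => (word, i) }.toMap

  // Words missing from the index are dropped, as an inner join would do.
  val joined: Seq[(Int, Long)] =
    split.flatMap { case (label, w) => indexByWord.get(w).map(i => (label, i)) }

  def main(args: Array[String]): Unit =
    joined.foreach(println)
}
```

For label 1 this yields the indexes 1, 2, 5, 0, 6, in the order of the words in the original list.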
Hi,
I have a MainDataset of (Label, WordList) records:
(0,List(a, b, c, d, e, f, g))
(1,List(b, c, f, a, g))
and a wordIndex dataset (created with .zipWithIndex):
wordIndex> (0,a)
wordIndex> (1,b)
wordIndex> (2,c)
wordIndex> (3,d)
wordIndex> (4,e)
wordIndex> (5,f)
wordIndex> (6,g)
H