Re: Custom keyBy(), look for similaties

2016-06-08 Thread Chesnay Schepler
the idea behind key-selectors is to extract a property on which you can do equality comparisons. let's get one question out of the way first: is your scoring algorithm transitive? as in, if A==B and B==C, is it a given that A==C? because if not, there's just no way to group (=partition) the data,
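
To make the transitivity question concrete, here is a minimal sketch in plain Java. It assumes a normalized Levenshtein score as a stand-in for the scoring algorithm (the thread does not say which algorithm is used) and borrows the 0.80 threshold from the original question. The class name, method names, and sample strings are all hypothetical.

```java
// Hypothetical stand-in for the scoring algorithm: normalized Levenshtein similarity.
class SimilarityTransitivity {

    // Threshold taken from the question in this thread ("higher than 0.80").
    static final double THRESHOLD = 0.80;

    // Classic dynamic-programming edit distance, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Score in [0, 1]: 1.0 means identical strings.
    static double score(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(a, b) / maxLen;
    }

    // Two strings count as "equal" when the score is strictly above the threshold.
    static boolean similar(String a, String b) {
        return score(a, b) > THRESHOLD;
    }

    public static void main(String[] args) {
        String a = "Jon Locke";
        String b = "Jon Lock";
        String c = "Jon Loc";
        System.out.println("A~B: " + similar(a, b));
        System.out.println("B~C: " + similar(b, c));
        System.out.println("A~C: " + similar(a, c));
    }
}
```

With this particular score, A matches B and B matches C, but A does not match C, so there is no consistent way to put the three strings into groups. Other scoring algorithms may behave differently, which is why the question has to be answered for the algorithm actually in use.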

Re: Custom keyBy(), look for similaties

2016-06-07 Thread iñaki williams
Thanks for your answer, Ufuk. However, I have been reading about KeySelector and I don't completely understand how it works with my idea. I am using an algorithm that gives me a score between different strings. My idea is: if the score is higher than 0.80, for example, then those two strings

Re: Custom keyBy(), look for similaties

2016-06-06 Thread Ufuk Celebi
Hey Iñaki, you can use the KeySelector as described here: https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/common/index.html#specifying-keys But you only have a local view of the current element, e.g. the library you use to determine the similarity has to know the similarities upfront
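
As a rough illustration of the "local view" point: a key has to be computed from one element alone, without looking at any other element. One way to stay within that constraint, sketched below in plain Java, is to map each name to a canonical form so that spelling variants collapse to the same key. The `canonicalKey` helper, its normalization rules, and the sample inputs are assumptions for illustration only. In a Flink job this logic would sit inside the `getKey` method of a `KeySelector`.

```java
import java.util.Locale;

// Hypothetical helper: derives a key from a single element, with no knowledge of other elements.
class CanonicalNameKey {

    // Lowercase the name and collapse every run of non-letter, non-digit characters
    // into a single space, so that casing and punctuation differences disappear.
    static String canonicalKey(String name) {
        return name
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
    }

    public static void main(String[] args) {
        // Both variants are expected to land on the same key.
        System.out.println(canonicalKey("John Locke"));
        System.out.println(canonicalKey("  JOHN   Locke. "));
    }
}
```

This only catches variants that a fixed normalization rule can remove. It does not implement a pairwise similarity score, which needs two elements at once and therefore cannot be expressed as a key on its own.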

Custom keyBy(), look for similaties

2016-06-06 Thread iñaki williams
Hi guys, I am using Flink in my project (with Java) and I have a question. Is it possible to modify the keyBy method in order to key by similarities and not by the exact name? Example: I receive 2 DataStreams; in the first one, the name of the field that I want to key by is "John Locke", w