Yunfeng Zhou created FLINK-27742:
------------------------------------

             Summary: Fix Compatibility Issues Between Flink ML Operators.
                 Key: FLINK-27742
                 URL: https://issues.apache.org/jira/browse/FLINK-27742
             Project: Flink
          Issue Type: Bug
          Components: Library / Machine Learning
    Affects Versions: ml-2.0.0
            Reporter: Yunfeng Zhou

StringIndexer and LogisticRegression in Flink ML cannot be connected in a pipeline. The reason is that StringIndexer outputs its label column as integers, while LogisticRegression only accepts input data whose labels are doubles.

To make Flink ML stages compatible with each other, the following changes need to be made:
- For stages that can only accept double-typed inputs, update their implementation to accept any numerical type.
- For stages that generate labels as integers, make them return labels as doubles.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
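
The first proposed change (accept any numerical type where a double is expected) could look roughly like the sketch below. It is not taken from the Flink ML code base and uses no Flink API: the class LabelCoercion and the method toDouble are hypothetical names, and the only assumption is that a stage reads its label as a boxed java.lang.Object before using it.

```java
// Hypothetical helper, not part of Flink ML: widens any numeric label to double.
public class LabelCoercion {

    /**
     * Converts a label of any numeric type (Integer, Long, Float, Double, ...)
     * to a primitive double. Non-numeric or null labels are rejected.
     */
    public static double toDouble(Object label) {
        if (label instanceof Number) {
            return ((Number) label).doubleValue();
        }
        String type = (label == null) ? "null" : label.getClass().getName();
        throw new IllegalArgumentException("Label must be numeric, got: " + type);
    }

    public static void main(String[] args) {
        // An integer label, such as one produced by an indexing stage.
        System.out.println(toDouble(1));
        // Long and double labels go through the same path.
        System.out.println(toDouble(2L));
        System.out.println(toDouble(3.5));
    }
}
```

With a conversion of this kind at the input side, a stage no longer depends on whether the upstream stage emitted integers or doubles. The second proposed change (emitting doubles in the first place) would still help stages that have not been updated.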