GitHub user jingyimei opened a pull request:
https://github.com/apache/madlib/pull/274
Handling special characters in MLP and Encode Categorical Variables
This PR handles special characters and Unicode in column names and column
values in the MLP and Encode Categorical Variables modules. It also updates
the install check test cases to cover these scenarios.
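For readers unfamiliar with why special characters in column names and values need care, here is a minimal sketch of standard SQL quoting. It is not the code from this PR: the helper names `quote_ident` and `quote_literal` and the sample column names are hypothetical. It also assumes the database treats backslashes in string literals literally (standard-conforming strings).

```python
def quote_ident(name: str) -> str:
    """Quote a column name as a SQL delimited identifier (hypothetical helper)."""
    # Double any embedded double quotes, then wrap the name in double quotes.
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote a column value as a SQL string literal (hypothetical helper).

    Assumes backslashes are not escape characters in string literals.
    """
    # Double any embedded single quotes, then wrap the value in single quotes.
    return "'" + value.replace("'", "''") + "'"


if __name__ == "__main__":
    # Illustrative column names containing special characters and Unicode.
    for column in ['rings', 'se$x', 'len"gth', 'ße,x']:
        print(quote_ident(column))
    # Illustrative column value containing a single quote.
    print(quote_literal("M'M"))
```

Without quoting of this kind, a name such as `len"gth` or a value such as `M'M` would end a delimited identifier or string literal early when spliced into a generated query.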
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/madlib/madlib one_hot_encoding_fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/madlib/pull/274.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #274
----
commit 1f97cd5a9c9d118e9024049c466e0e6cf44dcdd2
Author: Arvind Sridhar <asridhar@...>
Date: 2018-05-24T00:02:43Z
Encode categorical variables: handling special characters
JIRA: MADLIB-1238
JIRA: MADLIB-1243
This commit deals with special characters in column names and column
values. It also adds install check test cases to cover these scenarios.
commit 7d70ac24fbd679c0e5d58ac09bd536e6cc887790
Author: Jingyi Mei <jmei@...>
Date: 2018-05-30T21:16:13Z
MLP: handling special characters
JIRA: MADLIB-1238
This commit deals with special characters in column names and column
values within tables passed into the multilayer perceptron (MLP). We
also added install check cases to cover these scenarios.
----