Nandish Jayaram created MADLIB-1245:
---------------------------------------
Summary: Randomize data after standardization
Key: MADLIB-1245
URL: https://issues.apache.org/jira/browse/MADLIB-1245
Project: Apache MADlib
Issue Type: Improvement
Components: Module: Utilities
Reporter: Nandish Jayaram

The functions `utils_ind_var_scales` and `utils_ind_var_scales_grouping` in `convex.utils_regularization` standardize the input data, which is then fed to the underlying gradient descent solver. Randomizing the order of the data usually works well with gradient descent. The current functions create a temp table containing the standardized version of the input data, but its rows are not randomly distributed. Can we distribute them randomly?

This might affect multiple modules, so every affected module must be tested thoroughly to ensure this change is acceptable.
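
To make the request concrete, here is a minimal sketch of "standardize, then randomize row order" in plain Python. It is not MADlib's implementation: MADlib does this work in SQL inside the database, and the function name `standardize_and_shuffle`, the in-memory list-of-rows input, and the use of the population standard deviation are all assumptions made for illustration.

```python
import random
import statistics


def standardize_and_shuffle(rows, seed=None):
    """Scale each column to zero mean and unit std, then shuffle the rows.

    Hypothetical helper for illustration only; not part of MADlib.
    """
    if not rows:
        return []

    # Compute per-column statistics over the whole input.
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population std; a constant column gets scale 1.0 to avoid division by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]

    scaled = [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

    # Shuffle after standardization, so the values are unchanged
    # and only the order in which a solver would see the rows differs.
    rng = random.Random(seed)
    rng.shuffle(scaled)
    return scaled
```

Shuffling after scaling changes only the row order, not the standardized values, which suggests one check for the affected modules: compare the standardized rows as an unordered set before and after the change.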