Imran Younus created SYSTEMML-1401:
--------------------------------------

Summary: Data mismatch problem with Cox Predict script
Key: SYSTEMML-1401
URL: https://issues.apache.org/jira/browse/SYSTEMML-1401
Project: SystemML
Issue Type: Bug
Components: Algorithms
Environment:
Reporter: Imran Younus

The Cox predict script internally sorts the input (test) data set by time. This is necessary to calculate the cumulative hazard function, but it creates a serious problem for the user: all results returned by the predict script are sorted by time while the input data is not, so the user has no way of matching the input rows with the predictions.

There are two possible solutions to this problem:

1) Restore the original order inside the predict script before returning the final results, so that the order of the predictions matches the order of the input data exactly.

2) Add the sorted time column to the final output to let the user know which prediction corresponds to which time value. This may be easier to implement, but I think it is not an ideal solution because, in case of ties in the time values, the user will still have a problem matching the input with the predictions.
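
To illustrate the first option, here is a minimal Python sketch of the idea, not the actual DML code of the predict script. It remembers the sort permutation, runs a time-ordered computation, and then scatters the results back into the input order. The names `predict_in_input_order` and `toy_cumulative` are hypothetical, and `toy_cumulative` is only a running sum standing in for the cumulative hazard calculation.

```python
from itertools import accumulate
from typing import Callable, List, Sequence


def toy_cumulative(sorted_times: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a computation that needs time-sorted rows.

    It returns a running sum of the times; it is NOT a cumulative hazard.
    """
    return list(accumulate(sorted_times))


def predict_in_input_order(
    times: Sequence[float],
    predict_sorted: Callable[[Sequence[float]], List[float]],
) -> List[float]:
    """Run predict_sorted on time-sorted data, then restore the input order."""
    # order[k] is the original row index of the k-th smallest time.
    # Python's sort is stable, so tied times keep their input order.
    order = sorted(range(len(times)), key=lambda i: times[i])

    sorted_times = [times[i] for i in order]
    sorted_preds = predict_sorted(sorted_times)

    # Scatter each prediction back to the row it was computed for.
    restored: List[float] = [0.0] * len(times)
    for k, original_row in enumerate(order):
        restored[original_row] = sorted_preds[k]
    return restored


if __name__ == "__main__":
    # Rows 0 and 2 share the same time value (a tie).
    times = [5.0, 2.0, 5.0, 1.0]
    print(predict_in_input_order(times, toy_cumulative))
```

Because the permutation is tracked per row rather than per time value, tied times cause no ambiguity: each prediction lands on the row it was computed for, which is the weakness of the second option noted above.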