[ 
https://issues.apache.org/jira/browse/SYSTEMML-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924804#comment-15924804
 ] 

Imran Younus commented on SYSTEMML-1401:
----------------------------------------

[~prithvianight] [~mboehm7] [~ae2015] [~a1singh]
Please have a look at this. This is important for R4ML.


> Data mismatch problem with Cox Predict script
> ---------------------------------------------
>
>                 Key: SYSTEMML-1401
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1401
>             Project: SystemML
>          Issue Type: Bug
>          Components: Algorithms
>         Environment: 
>            Reporter: Imran Younus
>
> The Cox predict script internally sorts the input/test data set by time. 
> This is necessary to calculate the cumulative hazard function, but it creates 
> a serious problem for the user: all results returned from the predict script 
> are sorted by time while the input data is not, so the user has no way of 
> matching the input data with the predictions.
> There are two possible solutions to this problem:
> 1) Restore the original order inside the predict script before returning 
> the final results, so that the order of the predictions matches the order 
> of the input data exactly.
> 2) Add the sorted time column to the final output to let the user know 
> which prediction corresponds to which time value. This may be easier to 
> implement, but I think it is not an ideal solution because, in case of ties 
> in time values, the user will still have a problem matching the input with 
> the predictions.
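
Option 1 above can be sketched roughly as follows. This is plain Python for illustration only, not DML and not the actual Cox predict script; the names predict_in_input_order and toy_cumulative are hypothetical, and the running sum is only a stand-in for the cumulative hazard computation. The idea is to remember the sort permutation and scatter the predictions back through it before returning.

```python
def predict_in_input_order(times, predict_sorted):
    """Run a predictor that needs time-sorted input, then undo the sort."""
    # order[k] is the original row index of the k-th smallest time.
    # sorted() is stable, so rows with tied times keep their input order.
    order = sorted(range(len(times)), key=lambda i: times[i])
    sorted_times = [times[i] for i in order]

    # The predictor sees the data sorted by time, as the script requires.
    sorted_preds = predict_sorted(sorted_times)

    # Scatter each prediction back to the row it came from.
    restored = [None] * len(times)
    for k, i in enumerate(order):
        restored[i] = sorted_preds[k]
    return restored


def toy_cumulative(sorted_times):
    # Hypothetical stand-in for the cumulative hazard: a running sum.
    total = 0.0
    out = []
    for t in sorted_times:
        total += t
        out.append(total)
    return out


times = [5.0, 2.0, 9.0, 2.0]  # note the tie at t = 2.0
preds = predict_in_input_order(times, toy_cumulative)
```

Because the mapping goes through row indices rather than time values, the two rows tied at t = 2.0 each get their own prediction back, which is the case option 2 cannot handle.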



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
