[ 
https://issues.apache.org/jira/browse/MAHOUT-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615897#action_12615897
 ] 

Ted Dunning commented on MAHOUT-65:
-----------------------------------


The user story I have in mind is one I encounter at least once a week.

I have data in a matrix that represents test conditions, clicks and impressions.

I build a model using some or all of the test conditions, including or excluding 
interactions and some simple functions of the test conditions.  To do this, the 
modeling software has to convert my data matrix into a so-called design matrix. 
Each column that contains categorical data has to be converted to a 1 of n-1 
binary encoding (n-1 indicator columns for n distinct values), but other values 
need to be carried along intact.  The columns containing the binary encoding 
need to be named according to the ordinal values that they represent.
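
As a rough illustration of that conversion (a sketch only, not anything in the attached patch), the Python below builds a design matrix from rows of named values.  The function name design_matrix, the "column=value" label format, and the choice of the first sorted value as the reference level that gets no column are all assumptions made for the example.

```python
def design_matrix(rows, columns, categorical):
    """Return (labels, matrix) for a list of row dicts.

    rows:        list of dicts mapping column name -> value
    columns:     column names to include, in output order
    categorical: set of column names that hold categorical data
    """
    # Sorted distinct values of each categorical column. The first value is
    # treated as the reference level (an assumption) and gets no column,
    # so n distinct values produce n-1 indicator columns.
    levels = {c: sorted({r[c] for r in rows}) for c in columns if c in categorical}

    # Indicator columns are named after the value they stand for;
    # continuous columns keep their own name.
    labels = []
    for c in columns:
        if c in categorical:
            labels.extend(f"{c}={v}" for v in levels[c][1:])
        else:
            labels.append(c)

    matrix = []
    for r in rows:
        out = []
        for c in columns:
            if c in categorical:
                out.extend(1.0 if r[c] == v else 0.0 for v in levels[c][1:])
            else:
                out.append(float(r[c]))  # carried along intact
        matrix.append(out)
    return labels, matrix


# Hypothetical test conditions, invented for the example.
rows = [
    {"layout": "A", "color": "red", "position": 1},
    {"layout": "B", "color": "blue", "position": 2},
    {"layout": "C", "color": "red", "position": 3},
]
labels, matrix = design_matrix(rows, ["layout", "color", "position"], {"layout", "color"})
```

Here labels comes out as layout=B, layout=C, color=red, position, which is exactly the kind of naming that needs to travel with the matrix.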

The result is a set of rows that represent possible coefficient values.  These 
are, of course, labelled with the design matrix columns.

Sometimes I will get new data.  It won't have the result columns and design 
variables may be shifted around somewhat, but the column labels will be the 
same.  I need to compute predictions for the new data so I have to create a new 
design matrix and multiply it by the coefficients.  This has to be done by 
matching labels.
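
To make "matching labels" concrete, here is a minimal Python sketch of the multiplication step.  It assumes the coefficients are held as a mapping from design matrix label to value and that labels look like "layout=B"; the function name predict and the sample numbers are invented for the illustration.  It computes only the product of the design matrix and the coefficients, with no link function applied.

```python
def predict(labels, matrix, coefficients):
    """Multiply a labeled design matrix by labeled coefficients.

    labels:       column labels of the new design matrix, in its own order
    matrix:       list of rows, each a list of floats aligned with labels
    coefficients: dict mapping design matrix label -> fitted coefficient
    """
    # Look columns up by label, so the column order of the new data
    # does not have to match the order used when the model was fit.
    index = {label: j for j, label in enumerate(labels)}

    missing = [label for label in coefficients if label not in index]
    if missing:
        raise KeyError(f"design matrix lacks columns: {missing}")

    return [
        sum(coef * row[index[label]] for label, coef in coefficients.items())
        for row in matrix
    ]


# Hypothetical fitted coefficients, keyed by design matrix label.
coefficients = {"layout=B": 0.5, "layout=C": -0.25, "position": 0.1}

# New data whose columns arrive in a different order.
new_labels = ["position", "layout=C", "layout=B"]
new_matrix = [
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
predictions = predict(new_labels, new_matrix, coefficients)
```

A multiply by position would silently pair position with the layout=B coefficient here; the label lookup is what prevents that.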

This user story requires 

- labels

- data type (categorical or continuous)

- the names of the values of the ordinal types

- matrix multiply by label instead of just by column position.

i.e. all of the capabilities that Karl mentioned and that I included in my list.
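
One way to picture the per-column metadata this list implies is sketched below in Python.  The class name ColumnSpec, its fields, and the "column=value" label format are hypothetical and chosen only for the illustration; none of them come from Mahout or from the attached patch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ColumnSpec:
    """Hypothetical description of one column of the data matrix."""
    label: str                    # the column's name
    kind: str                     # "categorical" or "continuous"
    levels: Tuple[str, ...] = ()  # names of the values, categorical only

    def __post_init__(self):
        if self.kind not in ("categorical", "continuous"):
            raise ValueError(f"unknown data type: {self.kind}")
        if self.kind == "continuous" and self.levels:
            raise ValueError("continuous columns have no named values")

    def design_labels(self) -> List[str]:
        """Labels of the design matrix columns this column expands into."""
        if self.kind == "continuous":
            return [self.label]
        # The first level is taken as the reference (an assumption),
        # so n named values yield n-1 labeled indicator columns.
        return [f"{self.label}={v}" for v in self.levels[1:]]


layout = ColumnSpec("layout", "categorical", ("A", "B", "C"))
position = ColumnSpec("position", "continuous")
```

With the label, the data type and the value names kept together like this, both the design matrix construction and the multiply by label can be driven from the same description.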

> Add Element Labels to Vectors and Matrices
> ------------------------------------------
>
>                 Key: MAHOUT-65
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-65
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Matrix
>            Reporter: Jeff Eastman
>         Attachments: MAHOUT-65.patch
>
>
> Many applications can benefit by accessing elements in vectors and matrices 
> using String labels in addition to numeric indices. Investigate adding such a 
> capability.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
