Re: [jira] [Commented] (MAHOUT-1551) Add document to describe how to use mlp with command line

2014-07-14 Thread Yexi Jiang
Hi Felix, you are correct, the current implementation is a simple
online/stochastic gradient descent network using back-propagation for
optimization. The user can set the number of layers, the number of neurons in
each layer, and various parameters (such as the learning rate,
regularization weight, etc.). The CLI version simplifies some parameters
because basic users do not need that many.

Regards,
Yexi


2014-07-14 7:36 GMT-07:00 Felix Schüler (JIRA) j...@apache.org:


 [
 https://issues.apache.org/jira/browse/MAHOUT-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060688#comment-14060688
 ]

 Felix Schüler commented on MAHOUT-1551:
 ---

 Ted, thanks for the feedback!
 As far as we understand it, the implementation is a simple
 online/stochastic gradient descent using backpropagation to calculate the
 gradients of the error function. Weights are then updated with a fixed
 learning rate that never changes. As we (I say 'we' because I am
 working on it with someone else for a university class) described in
 MAHOUT-1388, the CLI version only performs a fixed number of n iterations,
 where n is the size of the training set. So each example is fed into the network
 once, which in the case of a dataset as small as the iris dataset does not lead
 to acceptable performance. The unit test for the mlp iterates 2000 times
 through the dataset to achieve good performance, but as far as we can
 tell, stopping does not depend on learning progress or weight updates, even though
 regularization is implemented.
 We could add this information to the implementation section of the
 documentation.
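
 A minimal sketch of what repeated passes (epochs) over a small training set
 could look like against the MLP's Java API. The class name and the
 train(Vector) call follow the MAHOUT-1265 API outline quoted later in this
 digest and are assumptions for illustration, not a description of what the
 CLI currently does.

 import java.util.List;
 import org.apache.mahout.math.Vector;

 // Hypothetical sketch: feed every instance to the network several times
 // instead of the single pass the CLI performs today.
 public class EpochTrainer {
   public static void train(MultilayerPerceptron mlp, List<Vector> instances, int epochs) {
     for (int epoch = 0; epoch < epochs; epoch++) {
       for (Vector instance : instances) {
         mlp.train(instance);   // one stochastic gradient step per instance
       }
     }
   }
 }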

 As for the DSL, we are very tempted to implement the MLP or a more general
 neural network framework. We will think about it and see if we can find the
 time.

  Add document to describe how to use mlp with command line
  -
 
  Key: MAHOUT-1551
  URL: https://issues.apache.org/jira/browse/MAHOUT-1551
  Project: Mahout
   Issue Type: Documentation
   Components: Classification, CLI, Documentation
 Affects Versions: 0.9
 Reporter: Yexi Jiang
   Labels: documentation
  Fix For: 1.0
 
  Attachments: README.md
 
 
  Add documentation about the usage of multi-layer perceptron in command
 line.



 --
 This message was sent by Atlassian JIRA
 (v6.2#6252)



[jira] [Commented] (MAHOUT-1551) Add document to describe how to use mlp with command line

2014-07-14 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061011#comment-14061011
 ] 

Yexi Jiang commented on MAHOUT-1551:


[~fschueler], you are correct, the current implementation is a simple 
online/stochastic gradient descent network using back-propagation for 
optimization. The user can set the number of layers, the number of neurons in each 
layer, and various parameters (such as the learning rate, regularization 
weight, etc.). The CLI version simplifies some parameters because basic users 
do not need that many.
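
A minimal sketch of how such a configuration might look through the Java API.
The constructor and setter names below follow the API outline quoted from
MAHOUT-1265 later in this digest; treat them as assumptions rather than the
exact released interface.

import java.net.URI;

// Hypothetical configuration sketch; layer sizes include the input and output layers.
int[] layerSizes = new int[] {4, 8, 3};   // e.g. iris: 4 features, one hidden layer, 3 classes
URI modelLocation = URI.create("/tmp/mlp.model");

MultilayerPerceptron mlp = new MultilayerPerceptron(layerSizes, modelLocation);
mlp.setLearningRate(0.1)
   .setMomentum(0.1)
   .setRegularization(0.01)
   .setCostFunction("SquaredError")
   .setSquashingFunction("Sigmoid");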

Regards,
Yexi

 Add document to describe how to use mlp with command line
 -

 Key: MAHOUT-1551
 URL: https://issues.apache.org/jira/browse/MAHOUT-1551
 Project: Mahout
  Issue Type: Documentation
  Components: Classification, CLI, Documentation
Affects Versions: 0.9
Reporter: Yexi Jiang
  Labels: documentation
 Fix For: 1.0

 Attachments: README.md


 Add documentation about the usage of multi-layer perceptron in command line.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: [jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-05-18 Thread Yexi Jiang
The code has been uploaded to the review board.


2014-05-17 23:27 GMT-07:00 Sebastian Schelter (JIRA) j...@apache.org:


 [
 https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001021#comment-14001021]

 Sebastian Schelter commented on MAHOUT-1388:
 

 [~yxjiang] what's the status here?

  Add command line support and logging for MLP
  
 
  Key: MAHOUT-1388
  URL: https://issues.apache.org/jira/browse/MAHOUT-1388
  Project: Mahout
   Issue Type: Improvement
   Components: Classification
 Affects Versions: 1.0
 Reporter: Yexi Jiang
 Assignee: Suneel Marthi
   Labels: mlp, sgd
  Fix For: 1.0
 
  Attachments: Mahout-1388.patch, Mahout-1388.patch
 
 
  The user should have the ability to run the Perceptron from the command
 line.
  There are two programs to execute the MLP: training and labeling. The
 first one takes the data as input and outputs the model; the second one
 takes the model and unlabeled data as input and outputs the results.
  The parameters for training are as follows:
  
  --input -i (input data)
  --skipHeader -sk // whether to skip the first row, this parameter is
 optional
  --labels -labels // the labels of the instances, separated by
 whitespace. Take the iris dataset for example, the labels are 'setosa
 versicolor virginica'.
  --model -mo  // in training mode, this is the location to store the
 model (if the specified location already holds a model, that model will be
 updated through incremental learning); in labeling mode, this is the location
 to store the result
  --update -u // whether to incrementally update the model; if this
 parameter is not given, the model is trained from scratch
  --output -o   // this is only useful in labeling mode
  --layersize -ls (no. of units per layer) // use whitespace-separated
 numbers to indicate the number of neurons in each layer (including the
 input layer and output layer), e.g. '5 3 2'.
  --squashingFunction -sf // currently only supports Sigmoid
  --momentum -m
  --learningrate -l
  --regularizationweight -r
  --costfunction -cf   // the type of cost function,
  
  For example, to train a 3-layer (input, hidden, and output) MLP
 with a 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight,
 the parameters would be:
  mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o
 /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01
  This command would read the training data from /tmp/training-data.csv
 and write the trained model to /tmp/model.model.
  The parameters for labeling are as follows:
  -
  --input -i // input file path
  --columnRange -cr // the range of columns used as features, starting from 0
 and separated by whitespace, e.g. 0 5
  --format -f // the format of input file, currently only supports csv
  --model -mo // the file path of the model
  --output -o // the output path for the results
  -
  If a user needs to use an existing model, they can use the following
 command:
  mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result
  Moreover, we should be providing default values if the user does not
 specify any.



 --
 This message was sent by Atlassian JIRA
 (v6.2#6252)




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computing and Information Sciences,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-05-18 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001158#comment-14001158
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~ssc] The code is available at https://reviews.apache.org/r/16700/.

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch, Mahout-1388.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-05-18 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001294#comment-14001294
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~ssc] Sure, where should I add the documentation to?

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch, Mahout-1388.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAHOUT-1551) Add document to describe how to use mlp with command line

2014-05-18 Thread Yexi Jiang (JIRA)
Yexi Jiang created MAHOUT-1551:
--

 Summary: Add document to describe how to use mlp with command line
 Key: MAHOUT-1551
 URL: https://issues.apache.org/jira/browse/MAHOUT-1551
 Project: Mahout
  Issue Type: Documentation
  Components: Classification, CLI, Documentation
Affects Versions: 0.9
Reporter: Yexi Jiang


Add documentation about the usage of multi-layer perceptron in command line.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-04-19 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975002#comment-13975002
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~smarthi] Could you please re-assign this issue to me?

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch, Mahout-1388.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-04-18 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974113#comment-13974113
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~ssc] Thanks, I will work on it.

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch, Mahout-1388.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-04-18 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974127#comment-13974127
 ] 

Yexi Jiang commented on MAHOUT-1388:


Do you mean that the MLP needs to be reimplemented in a way that works with 
Spark? The current implementation of the MLP is not a Hadoop version.

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch, Mahout-1388.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-04-18 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974718#comment-13974718
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~ssc] For the comment 'duplicate code, this has already been implemented in 
the other class', could you please also point out which class has implemented 
the method that extracts the string from the command line? I checked the 
o.a.m.commons package, but didn't find the method I need.

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch, Mahout-1388.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1510) Goodbye MapReduce

2014-04-15 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969557#comment-13969557
 ] 

Yexi Jiang commented on MAHOUT-1510:


What kind of algorithms are acceptable in the future?

 Goodbye MapReduce
 -

 Key: MAHOUT-1510
 URL: https://issues.apache.org/jira/browse/MAHOUT-1510
 Project: Mahout
  Issue Type: Task
  Components: Documentation
Reporter: Sebastian Schelter
 Fix For: 1.0


 We should prominently state on the website that we reject any future MR 
 algorithm contributions (but still maintain and bugfix what we have so far).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1510) Goodbye MapReduce

2014-04-15 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969777#comment-13969777
 ] 

Yexi Jiang commented on MAHOUT-1510:


Great, is it necessary to port all of the old algorithms to the Scala DSL form?

 Goodbye MapReduce
 -

 Key: MAHOUT-1510
 URL: https://issues.apache.org/jira/browse/MAHOUT-1510
 Project: Mahout
  Issue Type: Task
  Components: Documentation
Reporter: Sebastian Schelter
 Fix For: 1.0


 We should prominently state on the website that we reject any future MR 
 algorithm contributions (but still maintain and bugfix what we have so far).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2014-04-15 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970006#comment-13970006
 ] 

Yexi Jiang commented on MAHOUT-1265:


Hi, [~barsik], according to 
[MAHOUT-1510|https://issues.apache.org/jira/browse/MAHOUT-1510], Mahout no 
longer accepts proposals for MR algorithms.

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: machine_learning, neural_network
 Fix For: 0.9

 Attachments: MAHOUT-1265.patch, Mahout-1265-17.patch


 Design of multilayer perceptron
 1. Motivation
 A multilayer perceptron (MLP) is a kind of feed-forward artificial neural 
 network, which is a mathematical model inspired by biological neural 
 networks. The multilayer perceptron can be used for various machine learning 
 tasks such as classification and regression. It would be helpful to have it 
 included in Mahout.
 2. API
 The design goal of the API is to make the MLP easy to use and to keep 
 the implementation details transparent to the user.
 The following is example code showing how a user works with the MLP.
 -
 //  set the parameters
 double learningRate = 0.5;
 double momentum = 0.1;
 int[] layerSizeArray = new int[] {2, 5, 1};
 String costFuncName = "SquaredError";
 String squashingFuncName = "Sigmoid";
 //  the location to store the model; if there is already an existing model at 
 //  the specified location, the MLP will throw an exception
 URI modelLocation = ...
 MultilayerPerceptron mlp = new MultilayerPerceptron(layerSizeArray, 
 modelLocation);
 mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...);
 //  the user can also load an existing model with a given URI and update the 
 //  model with new training data; if there is no existing model at the specified 
 //  location, an exception will be thrown
 /*
 MultilayerPerceptron mlp = new MultilayerPerceptron(learningRate, 
 regularization, momentum, squashingFuncName, costFuncName, modelLocation);
 */
 URI trainingDataLocation = …
 //  the details of training are transparent to the user; it may run on a 
 //  single machine or in a distributed environment
 mlp.train(trainingDataLocation);
 //  the user can also train the model with one training instance at a time, in a 
 //  stochastic gradient descent way
 Vector trainingInstance = ...
 mlp.train(trainingInstance);
 //  prepare the input feature
 Vector inputFeature …
 //  the semantic meaning of the output result is defined by the user
 //  in the general case, the dimension of the output vector is 1 for regression and 
 //  two-class classification
 //  the dimension of the output vector is n for n-class classification (n > 2)
 Vector outputVector = mlp.output(inputFeature); 
 -
 3. Methodology
 The output calculation can be easily implemented with a feed-forward approach. 
 Also, single-machine training is straightforward. The following describes 
 how to train the MLP in a distributed way with batch gradient descent. The 
 workflow is illustrated in the figure below.
 https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720
 For distributed training, each training iteration is divided into two 
 steps: the weight update calculation step and the weight update step. The 
 distributed MLP can only be trained in a batch-update approach.
 3.1 The partial weight update calculation step:
 This step trains the MLP in a distributed way. Each task gets a copy of the MLP 
 model and calculates the weight update on a partition of the data.
 Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), where 
 D denotes the training set, d denotes a training instance, t_d denotes the 
 class label and y_d denotes the output of the MLP. Also, suppose the sigmoid 
 function is used as the squashing function, 
 squared error is used as the cost function, 
 t_i denotes the target value for the ith dimension of the output layer, 
 o_i denotes the actual output for the ith dimension of the output layer, 
 l denotes the learning rate,
 w_{ij} denotes the weight between the jth neuron in the previous layer and the 
 ith neuron in the next layer. 
 The weight of each edge is updated as 
 \Delta w_{ij} = l \cdot \frac{1}{m} \cdot \delta_j \cdot o_i, 
 where \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) for the 
 output layer, and \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_k \delta_k w_{jk} 
 for a hidden layer. 
 It is easy to see that \delta_j can be rewritten as 
 \delta_j = -\sum_{i = 1}^{k} \sum_{m_i} o_j^{(m_i)} (1 - o_j^{(m_i)}) 
 (t_j^{(m_i)} - o_j^{(m_i)})
 The above equation

Re: [jira] [Assigned] (MAHOUT-1388) Add command line support and logging for MLP

2014-03-24 Thread Yexi Jiang
The patch is already available.


2014-03-23 1:01 GMT-04:00 Suneel Marthi (JIRA) j...@apache.org:


  [
 https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

 Suneel Marthi reassigned MAHOUT-1388:
 -

 Assignee: Suneel Marthi

  Add command line support and logging for MLP
  
 
  Key: MAHOUT-1388
  URL: https://issues.apache.org/jira/browse/MAHOUT-1388
  Project: Mahout
   Issue Type: Improvement
   Components: Classification
 Affects Versions: 1.0
 Reporter: Yexi Jiang
 Assignee: Suneel Marthi
   Labels: mlp, sgd
  Fix For: 1.0
 
  Attachments: Mahout-1388.patch, Mahout-1388.patch
 
 



 --
 This message was sent by Atlassian JIRA
 (v6.2#6252)




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: [jira] [Comment Edited] (MAHOUT-1426) GSOC 2013 Neural network algorithms

2014-03-19 Thread Yexi Jiang
Hi, Ted,

I am currently working on that issue with Suneel.

Yexi


2014-03-19 19:44 GMT-04:00 Ted Dunning ted.dunn...@gmail.com:

 On Wed, Mar 19, 2014 at 3:19 PM, Maciej Mazur maciejmaz...@gmail.com
 wrote:

  I'm not going to propose this project.
  Now this issue can be closed.
 

 Proposing the downpour would be a good thing to do.

 It won't be that difficult.

 Please don't take my comments as discouraging.




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1441) Add documentation for Spectral KMeans to Mahout Website

2014-03-09 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925231#comment-13925231
 ] 

Yexi Jiang commented on MAHOUT-1441:


I wrote a summary for [Mahout-1177: Reform and simplify the clustering 
APIs|https://issues.apache.org/jira/browse/MAHOUT-1177] last year. It can be 
found 
[here|https://docs.google.com/document/d/10RocKzS_FBZTIScqTI3Gl2tfeR8vXabPMCGNpZe07m8/edit].

I hope this document is useful.

 Add documentation for Spectral KMeans to Mahout Website
 ---

 Key: MAHOUT-1441
 URL: https://issues.apache.org/jira/browse/MAHOUT-1441
 Project: Mahout
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.0
Reporter: Suneel Marthi
Assignee: Shannon Quinn
 Fix For: 1.0


 Need to update the Website with Design, user guide and any relevant 
 documentation for Spectral KMeans clustering.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: [jira] [Commented] (MAHOUT-1441) Add documentation for Spectral KMeans to Mahout Website

2014-03-09 Thread Yexi Jiang
Sebastian, Currently I am working on other things. If this issue is not
urgent, please assign it to me.


2014-03-09 12:28 GMT-04:00 Sebastian Schelter (JIRA) j...@apache.org:


 [
 https://issues.apache.org/jira/browse/MAHOUT-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925234#comment-13925234]

 Sebastian Schelter commented on MAHOUT-1441:
 

 Yexi, could you create a tutorial from this writeup to teach people how
 (and why) to use Mahout's spectral clustering?

 It would be great to have something similar to
 https://mahout.apache.org/users/recommender/userbased-5-minutes.html which was 
 the result of MAHOUT-1438

  Add documentation for Spectral KMeans to Mahout Website
  ---
 
  Key: MAHOUT-1441
  URL: https://issues.apache.org/jira/browse/MAHOUT-1441
  Project: Mahout
   Issue Type: Bug
   Components: Documentation
 Affects Versions: 1.0
 Reporter: Suneel Marthi
 Assignee: Shannon Quinn
  Fix For: 1.0
 
 
  Need to update the Website with Design, user guide and any relevant
 documentation for Spectral KMeans clustering.



 --
 This message was sent by Atlassian JIRA
 (v6.2#6252)




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1441) Add documentation for Spectral KMeans to Mahout Website

2014-03-09 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925270#comment-13925270
 ] 

Yexi Jiang commented on MAHOUT-1441:


[~ssc], Currently I am working on other things. If this issue is not urgent, 
please assign it to me. 

[~smarthi], An advantage of spectral clustering is that it performs clustering 
on the correlation metric space (a.k.a. the similarity graph). It performs well on 
data points that cannot be well clustered by clustering algorithms 
that work directly on the original metric space and group the data points into 
convex shapes. I'm not sure whether the Reuters dataset can reflect this 
advantage of spectral clustering. Or do we need to show a more representative 
dataset in the website example, like the ones shown in the experiment section 
of this [paper|http://ai.stanford.edu/~ang/papers/nips01-spectral.pdf]?

 Add documentation for Spectral KMeans to Mahout Website
 ---

 Key: MAHOUT-1441
 URL: https://issues.apache.org/jira/browse/MAHOUT-1441
 Project: Mahout
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.0
Reporter: Suneel Marthi
Assignee: Shannon Quinn
 Fix For: 1.0


 Need to update the Website with Design, user guide and any relevant 
 documentation for Spectral KMeans clustering.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Mahout 1.0 goals

2014-03-04 Thread Yexi Jiang
Sebastian,

In one of my recent projects, I used Naive Bayes for classification, so
I wrote up this algorithm. You can find the document at

https://docs.google.com/document/d/1h7N0GmIKe-KG64uulPMPzkp00nowM2-HDQ48c4PIhbc/edit?usp=sharing
.

Feedback is welcome.


2014-03-04 3:57 GMT-05:00 Sebastian Schelter ssc.o...@googlemail.com:

 Yexi, could you do a small write-up, analogously to what I proposed for
 Giorgio. Make sure to pick a different algorithm though.

 --sebastian
 Am 03.03.2014 16:54 schrieb Yexi Jiang yexiji...@gmail.com:

  I'm also happy to help.
 
 
  2014-03-03 10:29 GMT-05:00 Giorgio Zoppi giorgio.zo...@gmail.com:
 
   I would like to help in the api creation. How do I start for being
   productive with mahout?
   Best Regards,
   Giorgio
  
  
   2014-02-28 1:37 GMT+01:00 Ted Dunning ted.dunn...@gmail.com:
  
I would like to start a conversation about where we want Mahout to be
  for
1.0.  Let's suspend for the moment the question of how to achieve the
goals.  Instead, let's converge on what we really would like to have
   happen
and after that, let's talk about means that will get us there.
   
Here are some goals that I think would be good in the area of
 numerics,
classifiers and clustering:
   
- runs with or without Hadoop
   
- runs with or without map-reduce
   
- includes (at least), regularized generalized linear models,
 k-means,
random forest, distributed random forest, distributed neural networks
   
- reasonably competitive speed against other implementations
 including
graphlab, mlib and R.
   
- interactive model building
   
- models can be exported as code or data
   
- simple programming model
   
- programmable via Java or R
   
- runs clustered or not
   
   
What does everybody think?
   
  
  
  
   --
   Quiero ser el rayo de sol que cada día te despierta
   para hacerte respirar y vivir en me.
   Favola -Moda.
  
 
 
 
  --
  --
  Yexi Jiang,
  ECS 251,  yjian...@cs.fiu.edu
  School of Computer and Information Science,
  Florida International University
  Homepage: http://users.cis.fiu.edu/~yjian004/
 




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: Mahout 1.0 goals

2014-03-03 Thread Yexi Jiang
I'm also happy to help.


2014-03-03 10:29 GMT-05:00 Giorgio Zoppi giorgio.zo...@gmail.com:

 I would like to help in the api creation. How do I start for being
 productive with mahout?
 Best Regards,
 Giorgio


 2014-02-28 1:37 GMT+01:00 Ted Dunning ted.dunn...@gmail.com:

  I would like to start a conversation about where we want Mahout to be for
  1.0.  Let's suspend for the moment the question of how to achieve the
  goals.  Instead, let's converge on what we really would like to have
 happen
  and after that, let's talk about means that will get us there.
 
  Here are some goals that I think would be good in the area of numerics,
  classifiers and clustering:
 
  - runs with or without Hadoop
 
  - runs with or without map-reduce
 
  - includes (at least), regularized generalized linear models, k-means,
  random forest, distributed random forest, distributed neural networks
 
  - reasonably competitive speed against other implementations including
  graphlab, mlib and R.
 
  - interactive model building
 
  - models can be exported as code or data
 
  - simple programming model
 
  - programmable via Java or R
 
  - runs clustered or not
 
 
  What does everybody think?
 



 --
 Quiero ser el rayo de sol que cada día te despierta
 para hacerte respirar y vivir en me.
 Favola -Moda.




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: [jira] [Comment Edited] (MAHOUT-1426) GSOC 2013 Neural network algorithms

2014-02-27 Thread Yexi Jiang
Peng,

Can you provide more details about your thought?

Regards,


2014-02-27 16:00 GMT-05:00 peng pc...@uowmail.edu.au:

 That should be easy. But that defeats the purpose of using mahout, as there
 are already enough implementations of single-node backpropagation (in which
 case a GPU is much faster).

 Yexi:

 Regarding downpour SGD and sandblaster, may I suggest that the
 implementation would be better off without a parameter server? It's obviously a single
 point of failure and, in terms of bandwidth, a bottleneck. I heard that
 MLlib on top of Spark has a functional implementation (I have never read or tested
 it), and it's possible to build the workflow on top of YARN. None of those
 frameworks has a heterogeneous topology.

 Yours Peng


 On Thu 27 Feb 2014 09:43:19 AM EST, Maciej Mazur (JIRA) wrote:


  [ https://issues.apache.org/jira/browse/MAHOUT-1426?page=
 com.atlassian.jira.plugin.system.issuetabpanels:comment-
 tabpanelfocusedCommentId=13913488#comment-13913488 ]

 Maciej Mazur edited comment on MAHOUT-1426 at 2/27/14 2:41 PM:
 ---

 I've read the papers. I didn't think about a distributed network. I had in
 mind a network that will fit into memory, but will require a significant amount
 of computation.

 I understand that there are better options for neural networks than map
 reduce.
 How about a non-map-reduce version?
 I see that you think it is something that would make sense. (Doing a
 non-map-reduce neural network in Mahout would be of substantial
 interest.)
 Do you think it would be a valuable contribution?
 Is there a need for this type of algorithm?
 I am thinking about multi-threaded batch gradient descent with pretraining (RBM
 and/or Autoencoders).

 I have looked into these old JIRAs. The RBM patch was withdrawn.
 I would rather withdraw that patch, because by the time I
 implemented it I didn't know that the learning algorithm is not suited for
 MR, so I think there is no point including the patch.
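
 A minimal sketch of multi-threaded batch gradient descent as described above,
 using a plain least-squares model as a stand-in for the network so the example
 stays short; the class and helper names are hypothetical.

 import java.util.ArrayList;
 import java.util.List;
 import java.util.concurrent.Callable;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
 import java.util.concurrent.Future;

 public class ParallelBatchGd {

   // Gradient of 0.5*(w.x - y)^2 for one example; features in x[0..n-1], label in x[n].
   static double[] gradient(double[] w, double[] x) {
     double pred = 0.0;
     for (int i = 0; i < w.length; i++) pred += w[i] * x[i];
     double err = pred - x[w.length];
     double[] g = new double[w.length];
     for (int i = 0; i < w.length; i++) g[i] = err * x[i];
     return g;
   }

   static void train(double[] w, List<double[]> data, int epochs, double rate, int threads)
       throws InterruptedException, ExecutionException {
     ExecutorService pool = Executors.newFixedThreadPool(threads);
     int chunk = (data.size() + threads - 1) / threads;
     for (int epoch = 0; epoch < epochs; epoch++) {
       List<Callable<double[]>> tasks = new ArrayList<>();
       for (int s = 0; s < data.size(); s += chunk) {
         List<double[]> part = data.subList(s, Math.min(s + chunk, data.size()));
         tasks.add(() -> {            // each thread sums gradients over its chunk
           double[] sum = new double[w.length];
           for (double[] x : part) {
             double[] g = gradient(w, x);
             for (int i = 0; i < w.length; i++) sum[i] += g[i];
           }
           return sum;
         });
       }
       double[] total = new double[w.length];
       for (Future<double[]> f : pool.invokeAll(tasks)) {   // wait for all partial sums
         double[] g = f.get();
         for (int i = 0; i < w.length; i++) total[i] += g[i];
       }
       // One batch update per epoch, after all partial gradients are combined.
       for (int i = 0; i < w.length; i++) w[i] -= rate * total[i] / data.size();
     }
     pool.shutdown();
   }
 }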


 was (Author: maciejmazur):
 I've read the papers. I didn't think about distributed network. I had in
 mind network that will fit into memory, but will require significant amount
 of computations.

 I understand that there are better options for neural networks than map
 reduce.
 How about non-map-reduce version?
 I see that you think it is something that would make a sense.
 Do you think it will be a valueable contribution?
 Is there a need for this type of algorithm?
 I think about multi-threded batch gradient descent with pretraining (RBM
 or/and Autoencoders).

 I have looked into these old JIRAs. RBM patch was withdrawn.
 I would rather like to withdraw that patch, because by the time i
 implemented it i didn't know that the learning algorithm is not suited for
 MR, so I think there is no point including the patch.

  GSOC 2013 Neural network algorithms
 ---

  Key: MAHOUT-1426
  URL: https://issues.apache.org/jira/browse/MAHOUT-1426
  Project: Mahout
   Issue Type: Improvement
   Components: Classification
 Reporter: Maciej Mazur

 I would like to ask about possibilites of implementing neural network
 algorithms in mahout during GSOC.
 There is a classifier.mlp package with neural network.
 I can see neither RBM nor Autoencoder in these classes.
 There is only one word about Autoencoders in NeuralNetwork class.
 As far as I know Mahout doesn't support convolutional networks.
 Is it a good idea to implement one of these algorithms?
 Is it a reasonable amount of work?
 How hard is it to get GSOC in Mahout?
 Did anyone succeed last year?




 --
 This message was sent by Atlassian JIRA
 (v6.1.5#6160)




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: [jira] [Comment Edited] (MAHOUT-1426) GSOC 2013 Neural network algorithms

2014-02-27 Thread Yexi Jiang
Hi, Peng,

Do you mean the MultilayerPerceptron? There are three 'train' methods, and
only one (the one without the trackingKey and groupKey parameters) is
implemented. In the current implementation, they are not used.

Regards,
Yexi


2014-02-27 19:31 GMT-05:00 Ted Dunning ted.dunn...@gmail.com:

 Generally for training models like this, there is an assumption that fault
 tolerance is not particularly necessary because the low risk of failure
 trades against algorithmic speed.  For a reasonably small chance of failure,
 simply re-running the training is just fine.  If there is a high risk of
 failure, simply checkpointing the parameter server is sufficient to allow
 restarts without redundancy.

 Sharding the parameters is quite possible and is reasonable when the
 parameter vector exceeds tens or hundreds of millions of parameters, but isn't
 likely to be necessary below that.

 The asymmetry is similarly not a big deal.  The traffic to and from the
 parameter server isn't enormous.


 Building something simple and working first is a good thing.
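
 A minimal, generic sketch of the checkpointing idea mentioned above (not
 Mahout code; the file format, path handling, and interval are arbitrary
 choices for illustration):

 import java.io.DataOutputStream;
 import java.io.FileOutputStream;
 import java.io.IOException;

 // Periodically dump the parameter vector so training can restart from the
 // last checkpoint instead of relying on a redundant parameter server.
 public class ParameterCheckpointer {
   private final String path;
   private final int interval;   // checkpoint every N updates
   private int updates = 0;

   public ParameterCheckpointer(String path, int interval) {
     this.path = path;
     this.interval = interval;
   }

   public void maybeCheckpoint(double[] parameters) throws IOException {
     if (++updates % interval != 0) {
       return;
     }
     try (DataOutputStream out = new DataOutputStream(new FileOutputStream(path))) {
       out.writeInt(parameters.length);
       for (double p : parameters) {
         out.writeDouble(p);
       }
     }
   }
 }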


 On Thu, Feb 27, 2014 at 3:56 PM, peng pc...@uowmail.edu.au wrote:

  With pleasure! The original downpour paper proposes a parameter server from
  which subnodes download shards of the old model and upload gradients. So if the
  parameter server is down, the process has to be delayed; it also requires
  that all model parameters be stored and atomically updated on (and
  fetched from) a single machine, imposing asymmetric HDD and bandwidth
  requirements. This design is necessary only because each -=delta operation
  has to be atomic, which cannot be ensured across the network (e.g. on HDFS).

  But it doesn't mean that the operation cannot be decentralized: parameters
  can be sharded across multiple nodes, and multiple accumulator instances can
  handle parts of the vector subtraction. This should be easy if you create a
  buffer for the stream of gradients and allocate proper numbers of producers
  and consumers on each machine to make sure it doesn't overflow. Obviously
  this is far from the MR framework, but at least it can be made homogeneous and
  slightly faster (because sparse data can be distributed in a way that
  minimizes overlap, so gradients don't have to cross the
  network that frequently).

  If we instead use a centralized architecture, then there must be >=1 backup
  parameter server for mission-critical training.
 
  Yours Peng
 
  e.g. we can simply use a producer/consumer pattern
 
  If we use a producer/consumer pattern for all gradients,
 
  On Thu 27 Feb 2014 05:09:52 PM EST, Yexi Jiang wrote:
 
  Peng,
 
  Can you provide more details about your thought?
 
  Regards,
 
 
  2014-02-27 16:00 GMT-05:00 peng pc...@uowmail.edu.au:
 
   That should be easy. But that defeats the purpose of using mahout as
  there
  are already enough implementations of single node backpropagation (in
  which
  case GPU is much faster).
 
  Yexi:
 
  Regarding downpour SGD and sandblaster, may I suggest that the
  implementation better has no parameter server? It's obviously a single
  point of failure and in terms of bandwidth, a bottleneck. I heard that
  MLlib on top of Spark has a functional implementation (never read or
 test
  it), and its possible to build the workflow on top of YARN. Non of
 those
  framework has an heterogeneous topology.
 
  Yours Peng
 
 
  On Thu 27 Feb 2014 09:43:19 AM EST, Maciej Mazur (JIRA) wrote:
 
 
[ https://issues.apache.org/jira/browse/MAHOUT-1426?page=
  com.atlassian.jira.plugin.system.issuetabpanels:comment-
  tabpanelfocusedCommentId=13913488#comment-13913488 ]
 
  Maciej Mazur edited comment on MAHOUT-1426 at 2/27/14 2:41 PM:
  ---
 
  I've read the papers. I didn't think about distributed network. I had
 in
  mind network that will fit into memory, but will require significant
  amount
  of computations.
 
  I understand that there are better options for neural networks than
 map
  reduce.
  How about non-map-reduce version?
  I see that you think it is something that would make a sense. (Doing a
  non-map-reduce neural network in Mahout would be of substantial
  interest.)
  Do you think it will be a valueable contribution?
  Is there a need for this type of algorithm?
  I think about multi-threded batch gradient descent with pretraining
 (RBM
  or/and Autoencoders).
 
  I have looked into these old JIRAs. RBM patch was withdrawn.
  I would rather like to withdraw that patch, because by the time i
  implemented it i didn't know that the learning algorithm is not suited
  for
  MR, so I think there is no point including the patch.
 
 
  was (Author: maciejmazur):
  I've read the papers. I didn't think about distributed network. I had
 in
  mind network that will fit into memory, but will require significant
  amount
  of computations.
 
  I understand that there are better options for neural networks than
 map
  reduce.
  How about non-map-reduce version

Re: [jira] [Created] (MAHOUT-1426) GSOC 2013 Neural network algorithms

2014-02-25 Thread Yexi Jiang
Since the training methods for neural networks generally require a lot of
iterations, they are not perfectly suited to implementation in MapReduce style.

Currently, the NeuralNetwork is implemented as an online learning model and
the training is conducted via stochastic gradient descent.

Moreover, the current version of NeuralNetwork is mainly used for supervised
learning, so there is no RBM or Autoencoder.

Regards,
Yexi


2014-02-25 10:34 GMT-05:00 Maciej Mazur (JIRA) j...@apache.org:

 Maciej Mazur created MAHOUT-1426:
 

  Summary: GSOC 2013 Neural network algorithms
  Key: MAHOUT-1426
  URL: https://issues.apache.org/jira/browse/MAHOUT-1426
  Project: Mahout
   Issue Type: Improvement
   Components: Classification
 Reporter: Maciej Mazur


 I would like to ask about possibilites of implementing neural network
 algorithms in mahout during GSOC.

 There is a classifier.mlp package with neural network.
 I can see neither RBM nor Autoencoder in these classes.
 There is only one word about Autoencoders in NeuralNetwork class.
 As far as I know Mahout doesn't support convolutional networks.

 Is it a good idea to implement one of these algorithms?
 Is it a reasonable amount of work?



 --
 This message was sent by Atlassian JIRA
 (v6.1.5#6160)




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1426) GSOC 2013 Neural network algorithms

2014-02-25 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911865#comment-13911865
 ] 

Yexi Jiang commented on MAHOUT-1426:


I totally agree with you. From the algorithmic perspective, RBM and Autoencoder 
is proved to be very effective for feature learning. When training multi-level 
neural network, it is usually necessary to stack the RBMs or Autoencoders to 
learn the representative features first.

1. If the training dataset is large.
It is true that if the training data is huge, the online version can be slow, as 
it is not a parallel implementation. If we implement the algorithm in a MapReduce 
way, the data can be read in parallel. No matter whether we use stochastic 
gradient descent, mini-batch gradient descent, or full batch gradient descent, we 
need to train the model over many iterations. In practice, we need one job for 
each iteration. It is known that the start-up time of a Hadoop job is 
considerable, so the overhead can be even higher than the actual computing time. 
For example, if we use stochastic gradient descent, after each partition reads 
one data instance, we need to update and synchronize the model. IMHO, BSP is more 
effective than MapReduce in such a scenario.
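
To make the overhead point concrete, a rough back-of-the-envelope estimate; the 
30-second job start-up time and the iteration count are illustrative assumptions, 
not measurements:

T_{total} \approx N_{iter} \cdot (T_{startup} + T_{compute})

With N_{iter} = 1000 and T_{startup} = 30 s, the start-up cost alone is 
1000 \cdot 30 s \approx 8.3 hours, regardless of how small T_{compute} per 
iteration is.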

2. If the model is large.
If the model is large, we need to partition the model and store it in a 
distributed manner; you can find a solution in a related NIPS paper 
(http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/large_deep_networks_nips2012.pdf).

In this case, the distributed system needs to be heterogeneous, since different 
nodes may have different tasks (parameter storage or computing). It is difficult 
to design an algorithm to conduct such work in MapReduce style, as each task is 
considered to be homogeneous in MapReduce. 
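
As a minimal illustration of that heterogeneity, the two kinds of nodes could 
look like the sketch below; the interfaces are purely hypothetical and are not 
Mahout or DistBelief code.

// Hypothetical sketch of the two heterogeneous roles described above.
class ParameterServerRoles {
  // A parameter node only stores and serves one shard of the weights.
  interface ParameterShard {
    double[] fetchWeights(int layer);                // read current weights
    void pushGradient(int layer, double[] gradient); // apply an update
  }

  // A worker node only computes gradients on its own data partition,
  // reading and writing weights through the parameter shards.
  interface Worker {
    void step(Iterable<double[]> miniBatch, ParameterShard[] shards);
  }
}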

Actually, according to the talk on tera-scale deep learning 
(http://static.googleusercontent.com/media/research.google.com/en/us/archive/unsupervised_learning_talk_2012.pdf),
even BSP is not quite suitable, since errors can always happen in a large-scale 
distributed system. In their implementation, they built an asynchronous 
computing framework to conduct the large-scale learning.

In summary, implementing a MapReduce version of NeuralNetwork is OK, but compared 
with the more suitable computing frameworks, it is not very efficient.




 GSOC 2013 Neural network algorithms
 ---

 Key: MAHOUT-1426
 URL: https://issues.apache.org/jira/browse/MAHOUT-1426
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Reporter: Maciej Mazur

 I would like to ask about the possibilities of implementing neural network 
 algorithms in Mahout during GSoC.
 There is a classifier.mlp package with a neural network.
 I can't see either RBM or Autoencoder in these classes.
 There is only one mention of Autoencoders in the NeuralNetwork class.
 As far as I know, Mahout doesn't support convolutional networks.
 Is it a good idea to implement one of these algorithms?
 Is it a reasonable amount of work?
 How hard is it to get GSoC in Mahout?
 Did anyone succeed last year?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-02-02 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889097#comment-13889097
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~smarthi] I have revised the code, could you please have a look at the code at 
the review board? https://reviews.apache.org/r/16700/

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch, Mahout-1388.patch


 The user should have the ability to run the Perceptron from the command line.
 There are two programs to execute the MLP: one for training and one for 
 labeling. The first takes the data as input and outputs the model; the second 
 takes the model and unlabeled data as input and outputs the results.
 The parameters for training are as follows:
 
 --input -i (input data)
 --skipHeader -sk // whether to skip the first row, this parameter is optional
 --labels -labels // the labels of the instances, separated by whitespace. 
 Take the iris dataset for example, the labels are 'setosa versicolor 
 virginica'.
 --model -mo  // in training mode, this is the location to store the model (if 
 the specified location has an existing model, it will update the model 
 through incremental learning), in labeling mode, this is the location to 
 store the result
 --update -u // whether to incrementally update the model; if this parameter is 
 not given, the model is trained from scratch
 --output -o   // this is only useful in labeling mode
 --layersize -ls (no. of units per hidden layer) // use whitespace-separated 
 numbers to indicate the number of neurons in each layer (including the input 
 layer and output layer), e.g. '5 3 2'.
 --squashingFunction -sf // currently only supports Sigmoid
 --momentum -m 
 --learningrate -l
 --regularizationweight -r
 --costfunction -cf   // the type of cost function,
 
 For example, train a 3-layer (including input, hidden, and output) MLP with 
 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the 
 parameter would be:
 mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o 
 /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01
 This command would read the training data from /tmp/training-data.csv and 
 write the trained model to /tmp/model.model.
 The parameters for labeling are as follows:
 -
 --input -i // input file path
 --columnRange -cr // the range of columns used as features, starting from 0 and 
 separated by whitespace, e.g. 0 5
 --format -f // the format of input file, currently only supports csv
 --model -mo // the file path of the model
 --output -o // the output path for the results
 -
 If a user needs to use an existing model, they can use the following command:
 mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result
 Moreover, we should be providing default values if the user does not specify 
 any. 
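
As a complement to the CLI, a minimal sketch of labeling a single CSV record 
programmatically; the constructor and method names follow the API outlined in 
MAHOUT-1265 and may differ from the committed code.

// Sketch only: load an existing model and label one CSV record.
import java.net.URI;
import org.apache.mahout.classifier.mlp.MultilayerPerceptron;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;

public class LabelOneRecord {
  public static void main(String[] args) {
    // Per the MAHOUT-1265 design sketch, this constructor loads an existing
    // model and throws if no model exists at the given location.
    URI modelLocation = URI.create("/tmp/model.model");
    MultilayerPerceptron mlp = new MultilayerPerceptron(
        0.1, 0.01, 0.1, "Sigmoid", "SquaredError", modelLocation);

    // One unlabeled iris record: four numeric features from a CSV line.
    String[] fields = "5.1,3.5,1.4,0.2".split(",");
    double[] features = new double[fields.length];
    for (int i = 0; i < fields.length; i++) {
      features[i] = Double.parseDouble(fields[i]);
    }

    // The interpretation of the output vector (e.g. one score per label)
    // is defined by how the model was trained.
    Vector output = mlp.output(new DenseVector(features));
    System.out.println(output);
  }
}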



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Mahout 0.9 Release - Call for Volunteers

2014-01-16 Thread Yexi Jiang
Got the same error.

Regards,
Yexi


2014/1/16 Chameera Wijebandara chameerawijeband...@gmail.com

 Hi Suneel,

 It's still getting a 404 error.

 Thanks,
 Chameera


 On Thu, Jan 16, 2014 at 7:11 PM, Suneel Marthi suneel_mar...@yahoo.com
 wrote:

  Here's the new URL for Mahout 0.9 Release:
 
 
 https://repository.apache.org/content/repositories/orgapachemahout-1001/org/apache/mahout/mahout-buildtools/0.9/
 
  For those volunteering to test this, some of the things to be verified:
 
  a) Verify that u can unpack the release (tar or zip)
  b) Verify u r able to compile the distro
  c)  Run through the unit tests: mvn clean test
  d) Run the example scripts under $MAHOUT_HOME/examples/bin. Please run
  through all the different options in each script.
 
 
 
  Committers and PMC members:
  ---
 
  Need at least 3 +1 votes from this group for the Release to pass.
 
 
  Thanks and Regards.




 --
 Thanks,
 Chameera




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: Mahout 0.9 Release - Call for Volunteers

2014-01-16 Thread Yexi Jiang
Tested on my Mac and a server with Ubuntu 12.04 LTS.

All tests passed.

[INFO]


[INFO] Reactor Summary:

[INFO]

[INFO] Mahout Build Tools  SUCCESS [1.964s]

[INFO] Apache Mahout . SUCCESS [0.400s]

[INFO] Mahout Math ... SUCCESS
[1:53.067s]

[INFO] Mahout Core ... SUCCESS
[9:09.716s]

[INFO] Mahout Integration  SUCCESS
[1:04.662s]

[INFO] Mahout Examples ... SUCCESS [3.331s]

[INFO] Mahout Release Package  SUCCESS [0.000s]

[INFO] Mahout Math/Scala wrappers  SUCCESS [11.356s]

[INFO]


[INFO] BUILD SUCCESS

[INFO]


Regards,
Yexi

2014/1/16 Sotiris Salloumis i...@eprice.gr

 From Unix you should try the following with wget or curl; make sure the email
 client does not wrap the URL when you copy it


http://repository.apache.org/content/repositories/orgapachemahout-1002/org/a
 pache/mahout/mahout-distribution/0.9/mahout-distribution-0.9-src.tar.gz

 Above link via Google url shortener for easy copy/paste
http://goo.gl/gX6xGz


 Regards
 Sotiris

 -Original Message-
 From: Yexi Jiang [mailto:yexiji...@gmail.com]
 Sent: Thursday, January 16, 2014 5:59 PM
 To: mahout
 Cc: Suneel Marthi; u...@mahout.apache.org; priv...@mahout.apache.org
 Subject: Re: Mahout 0.9 Release - Call for Volunteers

 Got the same error.

 Regards,
 Yexi


 2014/1/16 Chameera Wijebandara chameerawijeband...@gmail.com

  Hi Suneel,
 
  It's still getting a 404 error.
 
  Thanks,
  Chameera
 
 
  On Thu, Jan 16, 2014 at 7:11 PM, Suneel Marthi
  suneel_mar...@yahoo.com
  wrote:
 
   Here's the new URL for Mahout 0.9 Release:
  
  
  https://repository.apache.org/content/repositories/orgapachemahout-100
  1/org/apache/mahout/mahout-buildtools/0.9/
  
   For those volunteering to test this, some of the things to be
verified:
  
   a) Verify that u can unpack the release (tar or zip)
   b) Verify u r able to compile the distro
   c)  Run through the unit tests: mvn clean test
   d) Run the example scripts under $MAHOUT_HOME/examples/bin. Please
   run through all the different options in each script.
  
  
  
   Committers and PMC members:
   ---
  
   Need at least 3 +1 votes from this group for the Release to pass.
  
  
   Thanks and Regards.
 
 
 
 
  --
  Thanks,
  Chameera
 



 --
 --
 Yexi Jiang,
 ECS 251,  yjian...@cs.fiu.edu
 School of Computer and Information Science, Florida International
University
 Homepage: http://users.cis.fiu.edu/~yjian004/




--
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-01-07 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864531#comment-13864531
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~smarthi] When I submitted the patch to the review board, I got the following 
error:


The file 
'https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java'
 (r1556303) could not be found in the repository


However, I checked this URL and the file exists. I'm not sure what causes this 
error.


 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch


 The user should have the ability to run the Perceptron from the command line.
 There are two programs to execute MLP, the training and labeling. The first 
 one takes the data as input and outputs the model, the second one takes the 
 model and unlabeled data as input and outputs the results.
 The parameters for training are as follows:
 
 --input -i (input data)
 --skipHeader -sk // whether to skip the first row, this parameter is optional
 --labels -labels // the labels of the instances, separated by whitespace. 
 Take the iris dataset for example, the labels are 'setosa versicolor 
 virginica'.
 --model -mo  // in training mode, this is the location to store the model (if 
 the specified location has an existing model, it will update the model 
 through incremental learning), in labeling mode, this is the location to 
 store the result
 --update -u // whether to incremental update the model, if this parameter is 
 not given, train the model from scratch
 --output -o   // this is only useful in labeling mode
 --layersize -ls (no. of units per hidden layer) // use whitespace separated 
 number to indicate the number of neurons in each layer (including input layer 
 and output layer), e.g. '5 3 2'.
 --squashingFunction -sf // currently only supports Sigmoid
 --momentum -m 
 --learningrate -l
 --regularizationweight -r
 --costfunction -cf   // the type of cost function,
 
 For example, train a 3-layer (including input, hidden, and output) MLP with 
 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the 
 parameter would be:
 mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o 
 /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01
 This command would read the training data from /tmp/training-data.csv and 
 write the trained model to /tmp/model.model.
 The parameters for labeling are as follows:
 -
 --input -i // input file path
 --columnRange -cr // the range of column used for feature, start from 0 and 
 separated by whitespace, e.g. 0 5
 --format -f // the format of input file, currently only supports csv
 --model -mo // the file path of the model
 --output -o // the output path for the results
 -
 If a user needs to use an existing model, they can use the following command:
 mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result
 Moreover, we should be providing default values if the user does not specify 
 any. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2014-01-07 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864681#comment-13864681
 ] 

Yexi Jiang commented on MAHOUT-1388:


The base I used is 
-
https://svn.apache.org/repos/asf/mahout/trunk
-

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch


 The user should have the ability to run the Perceptron from the command line.
 There are two programs to execute MLP, the training and labeling. The first 
 one takes the data as input and outputs the model, the second one takes the 
 model and unlabeled data as input and outputs the results.
 The parameters for training are as follows:
 
 --input -i (input data)
 --skipHeader -sk // whether to skip the first row, this parameter is optional
 --labels -labels // the labels of the instances, separated by whitespace. 
 Take the iris dataset for example, the labels are 'setosa versicolor 
 virginica'.
 --model -mo  // in training mode, this is the location to store the model (if 
 the specified location has an existing model, it will update the model 
 through incremental learning), in labeling mode, this is the location to 
 store the result
 --update -u // whether to incremental update the model, if this parameter is 
 not given, train the model from scratch
 --output -o   // this is only useful in labeling mode
 --layersize -ls (no. of units per hidden layer) // use whitespace separated 
 number to indicate the number of neurons in each layer (including input layer 
 and output layer), e.g. '5 3 2'.
 --squashingFunction -sf // currently only supports Sigmoid
 --momentum -m 
 --learningrate -l
 --regularizationweight -r
 --costfunction -cf   // the type of cost function,
 
 For example, train a 3-layer (including input, hidden, and output) MLP with 
 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the 
 parameter would be:
 mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o 
 /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01
 This command would read the training data from /tmp/training-data.csv and 
 write the trained model to /tmp/model.model.
 The parameters for labeling are as follows:
 -
 --input -i // input file path
 --columnRange -cr // the range of column used for feature, start from 0 and 
 separated by whitespace, e.g. 0 5
 --format -f // the format of input file, currently only supports csv
 --model -mo // the file path of the model
 --output -o // the output path for the results
 -
 If a user needs to use an existing model, they can use the following command:
 mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result
 Moreover, we should be providing default values if the user does not specify 
 any. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (MAHOUT-1388) Add command line support and logging for MLP

2013-12-29 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1388:
---

Status: Patch Available  (was: Open)

This patch should be applied after applying the patch in MAHOUT-1265.

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch


 The user should have the ability to run the Perceptron from the command line.
 There are two programs to execute MLP, the training and labeling. The first 
 one takes the data as input and outputs the model, the second one takes the 
 model and unlabeled data as input and outputs the results.
 The parameters for training are as follows:
 
 --input -i (input data)
 --skipHeader -sk // whether to skip the first row, this parameter is optional
 --labels -labels // the labels of the instances, separated by whitespace. 
 Take the iris dataset for example, the labels are 'setosa versicolor 
 virginica'.
 --model -mo  // in training mode, this is the location to store the model (if 
 the specified location has an existing model, it will update the model 
 through incremental learning), in labeling mode, this is the location to 
 store the result
 --update -u // whether to incremental update the model, if this parameter is 
 not given, train the model from scratch
 --output -o   // this is only useful in labeling mode
 --layersize -ls (no. of units per hidden layer) // use whitespace separated 
 number to indicate the number of neurons in each layer (including input layer 
 and output layer), e.g. '5 3 2'.
 --squashingFunction -sf // currently only supports Sigmoid
 --momentum -m 
 --learningrate -l
 --regularizationweight -r
 --costfunction -cf   // the type of cost function,
 
 For example, train a 3-layer (including input, hidden, and output) MLP with 
 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the 
 parameter would be:
 mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o 
 /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01
 This command would read the training data from /tmp/training-data.csv and 
 write the trained model to /tmp/model.model.
 The parameters for labeling are as follows:
 -
 --input -i // input file path
 --columnRange -cr // the range of column used for feature, start from 0 and 
 separated by whitespace, e.g. 0 5
 --format -f // the format of input file, currently only supports csv
 --model -mo // the file path of the model
 --output -o // the output path for the results
 -
 If a user needs to use an existing model, they can use the following command:
 mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result
 Moreover, we should be providing default values if the user does not specify 
 any. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (MAHOUT-1388) Add command line support and logging for MLP

2013-12-29 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1388:
---

Attachment: Mahout-1388.patch

This patch should be applied after applying the patch in MAHOUT-1265.

 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
  Labels: mlp, sgd
 Fix For: 1.0

 Attachments: Mahout-1388.patch


 The user should have the ability to run the Perceptron from the command line.
 There are two programs to execute MLP, the training and labeling. The first 
 one takes the data as input and outputs the model, the second one takes the 
 model and unlabeled data as input and outputs the results.
 The parameters for training are as follows:
 
 --input -i (input data)
 --skipHeader -sk // whether to skip the first row, this parameter is optional
 --labels -labels // the labels of the instances, separated by whitespace. 
 Take the iris dataset for example, the labels are 'setosa versicolor 
 virginica'.
 --model -mo  // in training mode, this is the location to store the model (if 
 the specified location has an existing model, it will update the model 
 through incremental learning), in labeling mode, this is the location to 
 store the result
 --update -u // whether to incremental update the model, if this parameter is 
 not given, train the model from scratch
 --output -o   // this is only useful in labeling mode
 --layersize -ls (no. of units per hidden layer) // use whitespace separated 
 number to indicate the number of neurons in each layer (including input layer 
 and output layer), e.g. '5 3 2'.
 --squashingFunction -sf // currently only supports Sigmoid
 --momentum -m 
 --learningrate -l
 --regularizationweight -r
 --costfunction -cf   // the type of cost function,
 
 For example, train a 3-layer (including input, hidden, and output) MLP with 
 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the 
 parameter would be:
 mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o 
 /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01
 This command would read the training data from /tmp/training-data.csv and 
 write the trained model to /tmp/model.model.
 The parameters for labeling is as follows:
 -
 --input -i // input file path
 --columnRange -cr // the range of column used for feature, start from 0 and 
 separated by whitespace, e.g. 0 5
 --format -f // the format of input file, currently only supports csv
 --model -mo // the file path of the model
 --output -o // the output path for the results
 -
 If a user need to use an existing model, it will use the following command:
 mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result
 Moreover, we should be providing default values if the user does not specify 
 any. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (MAHOUT-1388) Add command line support and logging for MLP

2013-12-27 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1388:
---

Description: 
The user should have the ability to run the Perceptron from the command line.

There are two programs to execute the MLP: one for training and one for 
labeling. The first takes the data as input and outputs the model; the second 
takes the model and unlabeled data as input and outputs the results.

The parameters for training are as follows:

--input -i (input data)
--skipHeader -sk // whether to skip the first row, this parameter is optional
--labels -labels // the labels of the instances, separated by whitespace. Take 
the iris dataset for example, the labels are 'setosa versicolor virginica'.
--model -mo  // in training mode, this is the location to store the model (if 
the specified location has an existing model, it will update the model through 
incremental learning), in labeling mode, this is the location to store the 
result
--update -u // whether to incrementally update the model; if this parameter is 
not given, the model is trained from scratch
--output -o   // this is only useful in labeling mode
--layersize -ls (no. of units per hidden layer) // use whitespace-separated 
numbers to indicate the number of neurons in each layer (including the input 
layer and output layer), e.g. '5 3 2'.
--squashingFunction -sf // currently only supports Sigmoid
--momentum -m 
--learningrate -l
--regularizationweight -r
--costfunction -cf   // the type of cost function,

For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 
learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter 
would be:

mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o 
/tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01

This command would read the training data from /tmp/training-data.csv and write 
the trained model to /tmp/model.model.
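
For orientation, -l, -m and -r correspond to the usual stochastic gradient 
descent update with momentum and L2 regularization; a generic form (the exact 
formula used by the implementation may differ) is

\Delta w_t = -l \cdot (\partial E / \partial w + r \cdot w) + m \cdot \Delta w_{t-1},
w \leftarrow w + \Delta w_t

where l is the learning rate, r the regularization weight, and m the momentum.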


The parameters for labeling are as follows:
-
--input -i // input file path
--columnRange -cr // the range of columns used as features, starting from 0 and 
separated by whitespace, e.g. 0 5
--format -f // the format of input file, currently only supports csv
--model -mo // the file path of the model
--output -o // the output path for the results
-

If a user needs to use an existing model, they can use the following command:
mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result

Moreover, we should be providing default values if the user does not specify 
any. 
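
The description does not spell out how --labels maps onto the network output. A 
common scheme, sketched below under the assumption of one output neuron per 
label (a one-hot target vector), is shown for illustration only; the actual 
trainer may encode labels differently.

// Hypothetical helper: one-hot encode a class label for an MLP whose output
// layer has one neuron per label, in the order given on the command line.
import java.util.Arrays;
import java.util.List;

public class LabelEncoding {
  static double[] oneHot(List<String> labels, String label) {
    double[] target = new double[labels.size()];
    int index = labels.indexOf(label);
    if (index < 0) {
      throw new IllegalArgumentException("Unknown label: " + label);
    }
    target[index] = 1.0;
    return target;
  }

  public static void main(String[] args) {
    List<String> labels = Arrays.asList("setosa", "versicolor", "virginica");
    // "versicolor" -> [0.0, 1.0, 0.0]
    System.out.println(Arrays.toString(oneHot(labels, "versicolor")));
  }
}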

  was:
The user should have the ability to run the Perceptron from the command line.

There are two modes for MLP, the training and labeling, the first one takes the 
data as input and outputs the model, the second one takes the model and 
unlabeled data as input and outputs the results.

The parameters are as follows:

--mode -mo // train or label
--input -i (input data)
--model -mo  // in training mode, this is the location to store the model (if 
the specified location has an existing model, it will update the model through 
incremental learning), in labeling mode, this is the location to store the 
result
--output -o   // this is only useful in labeling mode
--layersize -ls (no. of units per hidden layer) // use comma separated number 
to indicate the number of neurons in each layer (including input layer and 
output layer)
--momentum -m 
--learningrate -l
--regularizationweight -r
--costfunction -cf   // the type of cost function,

For example, train a 3-layer (including input, hidden, and output) MLP with 
Minus_Square cost function, 0.1 learning rate, 0.1 momentum rate, and 0.01 
regularization weight, the parameter would be:

mlp -mo train -i /tmp/training-data.csv -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 
0.1 -r 0.01 -cf minus_squared

This command would read the training data from /tmp/training-data.csv and write 
the trained model to /tmp/model.model.

If a user need to use an existing model, it will use the following command:
mlp -mo label -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result

Moreover, we should be providing default values if the user does not specify 
any. 


 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
  Labels: mlp, sgd
 Fix For: 1.0


 The user should have the ability to run

[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP

2013-12-25 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856671#comment-13856671
 ] 

Yexi Jiang commented on MAHOUT-1388:


[~smarthi] OK, I'll add it. Currently, it only supports CSV.



 Add command line support and logging for MLP
 

 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
  Labels: mlp, sgd
 Fix For: 1.0


 The user should have the ability to run the Perceptron from the command line.
 There are two modes for MLP, the training and labeling, the first one takes 
 the data as input and outputs the model, the second one takes the model and 
 unlabeled data as input and outputs the results.
 The parameters are as follows:
 
 --mode -mo // train or label
 --input -i (input data)
 --model -mo  // in training mode, this is the location to store the model (if 
 the specified location has an existing model, it will update the model 
 through incremental learning), in labeling mode, this is the location to 
 store the result
 --output -o   // this is only useful in labeling mode
 --layersize -ls (no. of units per hidden layer) // use comma separated number 
 to indicate the number of neurons in each layer (including input layer and 
 output layer)
 --momentum -m 
 --learningrate -l
 --regularizationweight -r
 --costfunction -cf   // the type of cost function,
 
 For example, train a 3-layer (including input, hidden, and output) MLP with 
 Minus_Square cost function, 0.1 learning rate, 0.1 momentum rate, and 0.01 
 regularization weight, the parameter would be:
 mlp -mo train -i /tmp/training-data.csv -o /tmp/model.model -ls 5,3,1 -l 0.1 
 -m 0.1 -r 0.01 -cf minus_squared
 This command would read the training data from /tmp/training-data.csv and 
 write the trained model to /tmp/model.model.
 If a user needs to use an existing model, they can use the following command:
 mlp -mo label -i /tmp/unlabel-data.csv -m /tmp/model.model -o 
 /tmp/label-result
 Moreover, we should be providing default values if the user does not specify 
 any. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (MAHOUT-1388) Add command line support and logging for MLP

2013-12-23 Thread Yexi Jiang (JIRA)
Yexi Jiang created MAHOUT-1388:
--

 Summary: Add command line support and logging for MLP
 Key: MAHOUT-1388
 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
 Project: Mahout
  Issue Type: Improvement
  Components: Classification
Affects Versions: 1.0
Reporter: Yexi Jiang
 Fix For: 1.0


The user should have the ability to run the Perceptron from the command line.

There are two modes for the MLP, training and labeling: the first takes the 
data as input and outputs the model; the second takes the model and unlabeled 
data as input and outputs the results.

The parameters are as follows:

--mode -mo // train or label
--input -i (input data)
--model -mo  // in training mode, this is the location to store the model (if 
the specified location has an existing model, it will update the model through 
incremental learning), in labeling mode, this is the location to store the 
result
--output -o   // this is only useful in labeling mode
--layersize -ls (no. of units per hidden layer) // use comma-separated numbers 
to indicate the number of neurons in each layer (including the input layer and 
output layer)
--momentum -m 
--learningrate -l
--regularizationweight -r
--costfunction -cf   // the type of cost function,

For example, train a 3-layer (including input, hidden, and output) MLP with 
Minus_Square cost function, 0.1 learning rate, 0.1 momentum rate, and 0.01 
regularization weight, the parameter would be:

mlp -mo train -i /tmp/training-data.csv -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 
0.1 -r 0.01 -cf minus_squared

This command would read the training data from /tmp/training-data.csv and write 
the trained model to /tmp/model.model.

If a user needs to use an existing model, they can use the following command:
mlp -mo label -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result

Moreover, we should be providing default values if the user does not specify 
any. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron

2013-12-19 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1265:
---

Attachment: Mahout-1265-17.patch

This is version 17 of the patch.

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: MAHOUT-1265.patch, Mahout-1265-13.patch, 
 Mahout-1265-17.patch


 Design of multilayer perceptron
 1. Motivation
 A multilayer perceptron (MLP) is a kind of feed-forward artificial neural 
 network, which is a mathematical model inspired by the biological neural 
 network. The multilayer perceptron can be used for various machine learning 
 tasks such as classification and regression. It would be helpful to include 
 it in Mahout.
 2. API
 The design goal of the API is to facilitate the usage of the MLP for users, 
 and to make the implementation details transparent to the user.
 The following is example code showing how a user uses the MLP.
 -
 //  set the parameters
 double learningRate = 0.5;
 double momentum = 0.1;
 int[] layerSizeArray = new int[] {2, 5, 1};
 String costFuncName = "SquaredError";
 String squashingFuncName = "Sigmoid";
 //  the location to store the model, if there is already an existing model at 
 the specified location, MLP will throw exception
 URI modelLocation = ...
 MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, 
 modelLocation);
 mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...);
 //  the user can also load an existing model with given URI and update the 
 model with new training data, if there is no existing model at the specified 
 location, an exception will be thrown
 /*
 MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, 
 regularization, momentum, squashingFuncName, costFuncName, modelLocation);
 */
 URI trainingDataLocation = ...
 //  the details of training are transparent to the user; it may run on a 
 single machine or in a distributed environment
 mlp.train(trainingDataLocation);
 //  user can also train the model with one training instance in stochastic 
 gradient descent way
 Vector trainingInstance = ...
 mlp.train(trainingInstance);
 //  prepare the input feature
 Vector inputFeature = ...
 //  the semantic meaning of the output result is defined by the user
 //  in the general case, the dimension of the output vector is 1 for 
 regression and two-class classification
 //  the dimension of the output vector is n for n-class classification (n > 2)
 Vector outputVector = mlp.output(inputFeature); 
 -
 3. Methodology
 The output calculation can be easily implemented with a feed-forward approach. 
 Also, the single-machine training is straightforward. The following describes 
 how to train the MLP in a distributed way with batch gradient descent. The 
 workflow is illustrated in the figure below.
 https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720
 For the distributed training, each training iteration is divided into two 
 steps, the weight update calculation step and the weight update step. The 
 distributed MLP can only be trained in a batch-update approach.
 3.1 The partial weight update calculation step:
 This step trains the MLP in a distributed fashion. Each task will get a copy 
 of the MLP model, and calculate the weight update with a partition of data.
 Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), 
 where D denotes the training set, d denotes a training instance, t_d denotes 
 the class label and y_d denotes the output of the MLP. Also, suppose the 
 sigmoid function is used as the squashing function, 
 squared error is used as the cost function, 
 t_i denotes the target value for the ith dimension of the output layer, 
 o_i denotes the actual output for the ith dimension of the output layer, 
 l denotes the learning rate, and 
 w_{ij} denotes the weight between the jth neuron in the previous layer and 
 the ith neuron in the next layer. 
 The weight of each edge is updated as 
 \Delta w_{ij} = l \cdot \frac{1}{m} \cdot \delta_j \cdot o_i, 
 where \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) 
 for the output layer, and 
 \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_k \delta_k w_{jk} 
 for a hidden layer. 
 It is easy to see that \delta_j can be rewritten as 
 \delta_j = -\sum_{i = 1}^k \sum_{m_i} o_j^{(m_i)} (1 - o_j^{(m_i)}) 
 (t_j^{(m_i)} - o_j^{(m_i)}), 
 which indicates that \delta_j can be divided into k parts.
 So for the implementation, each mapper can calculate part of \delta_j with a 
 given partition of data, and then store the result
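
To illustrate the per-partition computation above, a minimal sketch of the 
output-layer contribution for one data partition, written with plain Java 
arrays; it is illustrative only and is not the Mahout mapper code.

// Accumulate, for one data partition, the output-layer sum
// sum_m o_j^(m) * (1 - o_j^(m)) * (t_j^(m) - o_j^(m)) for each output j.
// A reducer would add these partial sums across partitions and apply the
// sign, the learning rate l, and the 1/m factor from the formulas above.
public class PartialUpdateSketch {
  static double[] partialOutputDeltas(double[][] outputs, double[][] targets) {
    int numOutputs = outputs[0].length;
    double[] partial = new double[numOutputs];
    for (int m = 0; m < outputs.length; m++) {      // instances in partition
      for (int j = 0; j < numOutputs; j++) {        // output-layer neurons
        double o = outputs[m][j];
        partial[j] += o * (1 - o) * (targets[m][j] - o);
      }
    }
    return partial;
  }
}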

[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-12-19 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13853303#comment-13853303
 ] 

Yexi Jiang commented on MAHOUT-1265:


Great. I am thinking about a MapReduce version of the MLP. It may take a 
non-trivial amount of time.

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
Assignee: Suneel Marthi
  Labels: machine_learning, neural_network
 Fix For: 0.9

 Attachments: MAHOUT-1265.patch, Mahout-1265-17.patch


 Design of multilayer perceptron
 1. Motivation
 A multilayer perceptron (MLP) is a kind of feed-forward artificial neural 
 network, which is a mathematical model inspired by the biological neural 
 network. The multilayer perceptron can be used for various machine learning 
 tasks such as classification and regression. It would be helpful to include 
 it in Mahout.
 2. API
 The design goal of the API is to facilitate the usage of the MLP for users, 
 and to make the implementation details transparent to the user.
 The following is example code showing how a user uses the MLP.
 -
 //  set the parameters
 double learningRate = 0.5;
 double momentum = 0.1;
 int[] layerSizeArray = new int[] {2, 5, 1};
 String costFuncName = "SquaredError";
 String squashingFuncName = "Sigmoid";
 //  the location to store the model, if there is already an existing model at 
 the specified location, MLP will throw exception
 URI modelLocation = ...
 MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, 
 modelLocation);
 mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...);
 //  the user can also load an existing model with given URI and update the 
 model with new training data, if there is no existing model at the specified 
 location, an exception will be thrown
 /*
 MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, 
 regularization, momentum, squashingFuncName, costFuncName, modelLocation);
 */
 URI trainingDataLocation = ...
 //  the details of training are transparent to the user; it may run on a 
 single machine or in a distributed environment
 mlp.train(trainingDataLocation);
 //  user can also train the model with one training instance in stochastic 
 gradient descent way
 Vector trainingInstance = ...
 mlp.train(trainingInstance);
 //  prepare the input feature
 Vector inputFeature = ...
 //  the semantic meaning of the output result is defined by the user
 //  in the general case, the dimension of the output vector is 1 for 
 regression and two-class classification
 //  the dimension of the output vector is n for n-class classification (n > 2)
 Vector outputVector = mlp.output(inputFeature); 
 -
 3. Methodology
 The output calculation can be easily implemented with a feed-forward approach. 
 Also, the single-machine training is straightforward. The following describes 
 how to train the MLP in a distributed way with batch gradient descent. The 
 workflow is illustrated in the figure below.
 https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720
 For the distributed training, each training iteration is divided into two 
 steps, the weight update calculation step and the weight update step. The 
 distributed MLP can only be trained in a batch-update approach.
 3.1 The partial weight update calculation step:
 This step trains the MLP in a distributed fashion. Each task will get a copy 
 of the MLP model, and calculate the weight update with a partition of data.
 Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), 
 where D denotes the training set, d denotes a training instance, t_d denotes 
 the class label and y_d denotes the output of the MLP. Also, suppose the 
 sigmoid function is used as the squashing function, 
 squared error is used as the cost function, 
 t_i denotes the target value for the ith dimension of the output layer, 
 o_i denotes the actual output for the ith dimension of the output layer, 
 l denotes the learning rate, and 
 w_{ij} denotes the weight between the jth neuron in the previous layer and 
 the ith neuron in the next layer. 
 The weight of each edge is updated as 
 \Delta w_{ij} = l \cdot \frac{1}{m} \cdot \delta_j \cdot o_i, 
 where \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) 
 for the output layer, and 
 \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_k \delta_k w_{jk} 
 for a hidden layer. 
 It is easy to see that \delta_j can be rewritten as 
 \delta_j = -\sum_{i = 1}^k \sum_{m_i} o_j^{(m_i)} (1 - o_j^{(m_i)}) 
 (t_j^{(m_i)} - o_j^{(m_i)}), 
 which indicates that \delta_j can be divided into k parts.
 So

Re: Review Request 13406: mahout-1265: add multilayer perceptron.

2013-12-18 Thread Yexi Jiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13406/
---

(Updated Dec. 19, 2013, 12:31 a.m.)


Review request for mahout and Ted Dunning.


Changes
---

I have formatted the code to make it look better.


Repository: mahout


Description
---

mahout-1265: add multilayer perceptron. For details, please refer to 
https://issues.apache.org/jira/browse/MAHOUT-1265.


Diffs (updated)
-

  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/13406/diff/


Testing
---

Please see the corresponding test cases


Thanks,

Yexi Jiang



Re: Review Request 13406: mahout-1265: add multilayer perceptron.

2013-12-17 Thread Yexi Jiang


 On Dec. 17, 2013, 7:16 p.m., Suneel Marthi wrote:
  General comment: Fix the Javadocs in code, using IntelliJ should identify 
  most of these issues.

Thank you very much for your patient review! I learned a lot and will not make 
these mistakes in future issues.


- Yexi


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13406/#review30556
---


On Dec. 9, 2013, 7:57 p.m., Yexi Jiang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/13406/
 ---
 
 (Updated Dec. 9, 2013, 7:57 p.m.)
 
 
 Review request for mahout and Ted Dunning.
 
 
 Repository: mahout
 
 
 Description
 ---
 
 mahout-1265: add multilayer perceptron. For details, please refer to 
 https://issues.apache.org/jira/browse/MAHOUT-1265.
 
 
 Diffs
 -
 
   
 https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java
  PRE-CREATION 
   
 https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java
  PRE-CREATION 
   
 https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java
  PRE-CREATION 
   
 https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java
  PRE-CREATION 
   
 https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/13406/diff/
 
 
 Testing
 ---
 
 Please see the corresponding test cases
 
 
 Thanks,
 
 Yexi Jiang
 




Re: Review Request 13406: mahout-1265: add multilayer perceptron.

2013-12-17 Thread Yexi Jiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13406/
---

(Updated Dec. 17, 2013, 8:18 p.m.)


Review request for mahout and Ted Dunning.


Changes
---

Thank you very much for your patient review! 
The code has been revised according to the comments.
I learned a lot and will not make these mistakes in future issues.


Repository: mahout


Description
---

mahout-1265: add multilayer perceptron. For details, please refer to 
https://issues.apache.org/jira/browse/MAHOUT-1265.


Diffs (updated)
-

  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java
 PRE-CREATION 
  
https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/13406/diff/


Testing
---

Please see the corresponding test cases


Thanks,

Yexi Jiang



[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-12-17 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851223#comment-13851223
 ] 

Yexi Jiang commented on MAHOUT-1265:


I have applied the patch to my local code base and tested it. It works without 
any error.

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: MAHOUT-1265.patch, Mahout-1265-13.patch


 Design of multilayer perceptron
 1. Motivation
 A multilayer perceptron (MLP) is a kind of feed-forward artificial neural 
 network, which is a mathematical model inspired by the biological neural 
 network. The multilayer perceptron can be used for various machine learning 
 tasks such as classification and regression. It would be helpful to include 
 it in Mahout.
 2. API
 The design goal of the API is to facilitate the usage of the MLP for users, 
 and to make the implementation details transparent to the user.
 The following is example code showing how a user uses the MLP.
 -
 //  set the parameters
 double learningRate = 0.5;
 double momentum = 0.1;
 int[] layerSizeArray = new int[] {2, 5, 1};
 String costFuncName = "SquaredError";
 String squashingFuncName = "Sigmoid";
 //  the location to store the model, if there is already an existing model at 
 the specified location, MLP will throw exception
 URI modelLocation = ...
 MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, 
 modelLocation);
 mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...);
 //  the user can also load an existing model with given URI and update the 
 model with new training data, if there is no existing model at the specified 
 location, an exception will be thrown
 /*
 MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, 
 regularization, momentum, squashingFuncName, costFuncName, modelLocation);
 */
 URI trainingDataLocation = ...
 //  the details of training are transparent to the user; it may run on a 
 single machine or in a distributed environment
 mlp.train(trainingDataLocation);
 //  user can also train the model with one training instance in stochastic 
 gradient descent way
 Vector trainingInstance = ...
 mlp.train(trainingInstance);
 //  prepare the input feature
 Vector inputFeature = ...
 //  the semantic meaning of the output result is defined by the user
 //  in the general case, the dimension of the output vector is 1 for 
 regression and two-class classification
 //  the dimension of the output vector is n for n-class classification (n > 2)
 Vector outputVector = mlp.output(inputFeature); 
 -
 3. Methodology
 The output calculation can be easily implemented with a feed-forward approach. 
 Also, the single-machine training is straightforward. The following describes 
 how to train the MLP in a distributed way with batch gradient descent. The 
 workflow is illustrated in the figure below.
 https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720
 For the distributed training, each training iteration is divided into two 
 steps, the weight update calculation step and the weight update step. The 
 distributed MLP can only be trained in a batch-update approach.
 3.1 The partial weight update calculation step:
 This step trains the MLP in a distributed fashion. Each task will get a copy 
 of the MLP model, and calculate the weight update with a partition of data.
 Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), 
 where D denotes the training set, d denotes a training instance, t_d denotes 
 the class label and y_d denotes the output of the MLP. Also, suppose the 
 sigmoid function is used as the squashing function, 
 squared error is used as the cost function, 
 t_i denotes the target value for the ith dimension of the output layer, 
 o_i denotes the actual output for the ith dimension of the output layer, 
 l denotes the learning rate, and 
 w_{ij} denotes the weight between the jth neuron in the previous layer and 
 the ith neuron in the next layer. 
 The weight of each edge is updated as 
 \Delta w_{ij} = l \cdot \frac{1}{m} \cdot \delta_j \cdot o_i, 
 where \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) 
 for the output layer, and 
 \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_k \delta_k w_{jk} 
 for a hidden layer. 
 It is easy to see that \delta_j can be rewritten as 
 \delta_j = -\sum_{i = 1}^k \sum_{m_i} o_j^{(m_i)} (1 - o_j^{(m_i)}) 
 (t_j^{(m_i)} - o_j^{(m_i)}), 
 which indicates that \delta_j can be divided into k parts.
 So for the implementation, each mapper can calculate part

Re: Mahout 0.9 release

2013-12-16 Thread Yexi Jiang
 deferred for last 2 Mahout releases.
 
  M-1319, M-1328, M-1347, M-1350 - Suneel
 
 
  M-1265 - Multi Layer Perceptron, Yexi please look at my comments on
 Reviewboard.
 
  M-1273 - Kun Yung, Ted, defer this to next release ???
 
 
 
  M-1312, M-1256 - Stevo, could u take one of them
 
 
  On Thursday, November 28, 2013 5:01 AM, Isabel Drost-Fromm 
 isa...@apache.org wrote:
 
  On Wed, 27 Nov 2013 14:23:11 -0800
   (PST)
  Suneel Marthi suneel_mar...@yahoo.com wrote:
  Below are the Open issues for 0.9:-
 
  This looks like we should be targeting Dec. 9th as code freeze to me.
  What do you all think?
 
 
  Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 - All
  related to Wiki updates, missing Wiki documentation and Wiki
  migration to new CMS.  Isabel's working on M-1245 (migrating to new
  CMS). Could some of the others be consolidated with
 that?
 
  I believe MAHOUT-1245
  essentially is ready to be published - all I want
  before notifying INFRA to
  switch to the new cms based site is one other
  person to take at least a brief look.
 
  For MAHOUT-1304 - Sebastian, can you please check that the cms based
  site actually does fit on 1280px? We can close this issue then.
 
  MAHOUT-1305 - I think this should be turned into a task to actually
  delete most of the pages that have been migrated to the new CMS (almost
  all of them). Once 1245 is shipped, it would be great if a few more
  people could lend a hand in getting this done.
 
  MAHOUT-1307 - Can be closed once switched to CMS
 
  MAHOUT-1326 - This really relates to the old
  Confluence export plugin
  we once have been using to generate static pages out of our wiki that
  is no longer active. Unless anyone on the Mahout dev list
  knows how to
  fully
   delete all exported static pages we should file an issue with
  INFRA to ask for help getting those deleted. They definitely are
  confusing to users.
 
 
 
  M-1286 - Peng and ssc, we had talked about this during the last
  hangout. Can this be included in 0.9?
 
  M-1030 - Andrew Musselman? Any updates on this? It's important that we
  fix this for 0.9
 
  M-1319, M-1328,
M-1347,
  M-1364 - Suneel
 
  M-1273 - Kun Yung, remember talking about this in one of the earlier
  hangouts; can't recall what was decided?
 
  M-1312, M-1256 - Dan Filimon (or Stevo??)
 
  M-996  someone could pick
   this up (if its still relevant with present
  codebase i.e.)
 
  I think this can move to the next release - according to the
  contributor and Sebastian the patch is rather hacky and there for
  illustration purposes only. I'd rather see some more thought go into
  that instead of pushing to have this in 0.9.
 
 
  M-1265 Yexi had submitted a patch for this, it would be good if
  this
  could go in as part of 0.9
 
  M-1288 Solr Recommender - Pat Ferrell
 
  M-1285: Any takers for this?
 
  Would be nice to have - in particular if someone on dev@ (not
  necessarily a committer) wants to get started with the code base.
  Otherwise I'd say fix for next release
   if time gets short.
 
 
  M-1356: Isabel's started on this, Stevo could u review this?
 
  We definitely can punt that for the next release or even thereafter. It
  would be great if someone who has some knowledge of Java security
  policies would take a look. The implication of not fixing this
 
  essentially is that in case someone commits test code that writes
  outside of target or to some globally shared directory we might end up
  having randomly failing tests due to the parallel setup again. But as
  these will occur shortly after the commit it should be easy enough to
  find the code change that caused the breakage.
 
 
 
  M-1329: Support for Hadoop 2
 
  Is that truly feasible within a week?
 
 
  M-1366:  Stevo, Isabel 
 
  This should be done as part of the release process by release manager
  at the latest.
 
 
  M-1261:
  Sebastian???
 
  M-1309, M-1310, M-1311, M-1316 - all related to running Mahout on
  Windows ??
 
  I'm not aware of us supporting Windows.
 
 
  M-1350 - Any takers?? (Stevo??)
 
  To me this looks like a broken classpath on the user side. Without a
  patch to at least re-produce the issue I wouldn't spend too much time
 
  on this.
 
 
  Isabel
 




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: Mahout 0.9 release

2013-12-16 Thread Yexi Jiang
I have updated the code based on the previous feedback. I am now waiting to
hear whether the code is shippable.


2013/12/16 Suneel Marthi suneel_mar...@yahoo.com

 Waiting on u to provide an updated patch based on the feedback on
 Reviewboard?




   On Monday, December 16, 2013 4:14 PM, Yexi Jiang yexiji...@gmail.com
 wrote:
  What about M-1265?


 2013/12/16 Suneel Marthi suneel_mar...@yahoo.com

 It's time to freeze the trunk this week; here's the status of the JIRAs:-

 Suneel
 --
 M-1319 - Patch available, would appreciate if someone could review/test
 the patch before I commit to trunk.

 Pat
 -
 M-1288 Solr Recommender

 Pat, I see that you have the code in ur Github repo, could u create a
 patch that could be merged into Mahout trunk.

 Frank
 
 M-1364 (Upgrade to Lucene 4.6) - Patch available.
 Grant, do u have cycles to review this patch?


 Gokhan

 --

 M-1354 (Support for Hadoop 2.x) - Patch available.
 Gokhan, any updates on this.





 On Sunday, December 8, 2013 6:23 PM, Suneel Marthi 
 suneel_mar...@yahoo.com wrote:

 We need to freeze the trunk this coming week in preparation for 0.9
 release, below are the pending JIRAs:-

 Wiki (not a show stopper for 0.9)

 -
 M-1245, M-1304, M-1305, M-1307, M-1326


 Suneel
 ---
 M-1319 (i can work on this tomorrow)

 M-1265 (Multi Layer Perceptron) -


 Need to be merged into trunk, the code's available for review on
 ReviewBoard.
 It would help if another set of eyes reviewed the test cases (Isabel,
 Stevo.. ?)


 Pat

 
 M-1288 Solr Recommender
 (What's the status of this Pat, this needs to be in 0.9 Release.)

 Stevo
 ---
 M-1366 (this can be at time of 0.9 Release and has no impact on trunk)

 Frank
 
 M-1364 (Upgrade to Lucene 4.6) - Patch available.
   It would be nice to have this go in 0.9

 The patch worked for me Frank, I agree that this needs to be reviewed by
 someone who's more familiar with Lucene.

 Gokhan

 --

 M-1354 (Support for Hadoop 2.x) - Patch available.
 This is targeted for 1.0. The patch worked for me on Hadoop 1.2.1; it
 would be good if someone could try the patch on a Hadoop 2.x instance.

 Others
 --
 M-1371 - This was reported on @user and a patch was submitted. If we don't
 hear from the author within this week, this can be deferred to 1.0





 On Tuesday, December 3, 2013 8:13 PM, Suneel Marthi 
 suneel_mar...@yahoo.com wrote:

 JIRAs Update for 0.9 release:-

 Wiki - Isabel, Sebastian and other volunteers
 -
 M-1245, M-1304, M-1305, M-1307, M-1326

 Suneel
 ---
 M-1319
 M-1242 (Patch available to be committed to trunk)

 Pat
 ---
 M-1288 Solr Recommender

 Yexi, Suneel
 ---
 M-1265 - Multi Layer Perceptron

 Stevo, Isabel
 -
 M-1366

 Andrew
 --
 M-1030, M-1349

 Ted
 --
 M-1368 (Patch available to be committed to trunk)











 On Sunday, December 1, 2013 7:57 AM, Suneel Marthi 
 suneel_mar...@yahoo.com wrote:

 Open JIRAs for 0.9 release :-

 Wiki - Isabel, Sebastian and other volunteers
 -

 M-1245, M-1304, M-1305, M-1307, M-1326

 Suneel
 ---
 M-1319, M-1328

 Pat
 ---
 M-1288 Solr Recommender

 Sebastian, Peng
 
 M-1286

 Yexi, Suneel
 ---
 M-1265 - Multi Layer Perceptron
 Ted, do u have cycles to review this, the patch's up on Reviewboard.

 Stevo, Isabel
 -
 M-1366 - Please delete old releases from mirroring system
 M-1345 - Enable Randomized testing for all modules

 Andrew
 --
 M-1030

 Open Issues (any takers for these ???)
 
 M-1242
 M-1349






 On Friday, November 29, 2013 12:07 PM, Sebastian Schelter 
 ssc.o...@googlemail.com wrote:

 On 29.11.2013 17:59, Suneel Marthi wrote:
  Open JIRAs for 0.9:
 
  Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 -
 related to Wiki updates.
  Definitely appreciate more hands here to review/update the wiki
 
  M-1286 - Peng and
   Sebastian, no updates on this. Can this be included in 0.9?

 I will look into this over the weekend!


 
  M-1030 - Andrew Musselman
 
  M-1319, M-1328 -  Suneel
 
  M-1347 - Suneel, patch has been committed to trunk.
 
  M-1265 - I have been working with Yexi on this. Ted, would u have time
 to review this; the code's on Reviewboard.
 
  M-1288 - Solr Recommender, Pat Ferrel
 
  M-1345: Isabel, Frank. I think we are good on this patch. Isabel, could
 u commit this to trunk?
 
  M-1312: Stevo, could u look at this?
 
  M-1349: Any takers for this??
 
  Others: Spectral Kmeans clustering documentation (Shannon)
 
 
 
 
  On Thursday,
  November 28, 2013 10:38 AM, Suneel Marthi suneel_mar...@yahoo.com
 wrote:
 
  Adding Mahout-1349 to the list of JIRAs.
 
 
 
 
 
  On Thursday, November 28, 2013 10:37 AM, Suneel Marthi 
 suneel_mar...@yahoo.com wrote:
 
  Update on Open JIRAs for 0.9

[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron

2013-12-09 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1265:
---

Attachment: Mahout-1265-11.patch

This is the final version of the patch. It has been reviewed by [~smarthi].

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: Mahout-1265-11.patch, Mahout-1265-6.patch, 
 mahout-1265.patch


 Design of multilayer perceptron
 1. Motivation
 A multilayer perceptron (MLP) is a kind of feed-forward artificial neural 
 network, a mathematical model inspired by biological neural 
 networks. The multilayer perceptron can be used for various machine learning 
 tasks such as classification and regression, so it would be helpful to 
 include it in Mahout.
 2. API
 The design goal of the API is to make the MLP easy to use and to keep the 
 implementation details transparent to the user.
 The following example code shows how a user would use the MLP.
 -
 //  set the parameters
 double learningRate = 0.5;
 double momentum = 0.1;
 int[] layerSizeArray = new int[] {2, 5, 1};
 String costFuncName = "SquaredError";
 String squashingFuncName = "Sigmoid";
 //  the location to store the model; if there is already an existing model at 
 //  the specified location, the MLP will throw an exception
 URI modelLocation = ...
 MultilayerPerceptron mlp = new MultilayerPerceptron(layerSizeArray, 
 modelLocation);
 mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...);
 //  the user can also load an existing model from a given URI and update the 
 //  model with new training data; if there is no existing model at the specified 
 //  location, an exception will be thrown
 /*
 MultilayerPerceptron mlp = new MultilayerPerceptron(learningRate, 
 regularization, momentum, squashingFuncName, costFuncName, modelLocation);
 */
 URI trainingDataLocation = …
 //  the details of training are transparent to the user; it may run on a 
 //  single machine or in a distributed environment
 mlp.train(trainingDataLocation);
 //  the user can also train the model one instance at a time, in a 
 //  stochastic gradient descent fashion
 Vector trainingInstance = ...
 mlp.train(trainingInstance);
 //  prepare the input feature
 Vector inputFeature = ...
 //  the semantic meaning of the output result is defined by the user
 //  in the general case, the dimension of the output vector is 1 for regression and 
 //  two-class classification
 //  the dimension of the output vector is n for n-class classification (n > 2)
 Vector outputVector = mlp.output(inputFeature); 
 -
 3. Methodology
 The output calculation can easily be implemented with a feed-forward approach, 
 and single-machine training is straightforward. The following describes how to 
 train the MLP in a distributed way with batch gradient descent. The 
 workflow is illustrated in the figure below.
 https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720
 For distributed training, each training iteration is divided into two 
 steps: the weight update calculation step and the weight update step. The 
 distributed MLP can only be trained in a batch-update fashion.
 3.1 The partial weight update calculation step:
 This step trains the MLP in a distributed fashion. Each task gets a copy of the 
 MLP model and calculates the weight update with its partition of the data.
 Suppose the training error is E(w) = 1/2 \sum_{d \in D} cost(t_d, y_d), where 
 D denotes the training set, d denotes a training instance, t_d denotes the 
 class label and y_d denotes the output of the MLP. Also suppose that the sigmoid 
 function is used as the squashing function, 
 squared error is used as the cost function, 
 t_i denotes the target value for the i-th dimension of the output layer, 
 o_i denotes the actual output for the i-th dimension of the output layer, 
 l denotes the learning rate, and 
 w_{ij} denotes the weight between the j-th neuron in the previous layer and the 
 i-th neuron in the next layer. 
 The weight of each edge is updated as 
 \Delta w_{ij} = l * (1/m) * \delta_j * o_i, 
 where \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * (t_j^{(m)} - o_j^{(m)}) 
 for the output layer, and \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * 
 \sum_k \delta_k * w_{jk} for a hidden layer. 
 It is easy to see that \delta_j can be rewritten as 
 \delta_j = - \sum_{i = 1}^{k} \sum_{m_i} o_j^{(m_i)} * (1 - o_j^{(m_i)}) 
 * (t_j^{(m_i)} - o_j^{(m_i)}), 
 which indicates that \delta_j can be divided into k parts.
 So for the implementation, each mapper can calculate part of \delta_j
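
To make the partial-update step above concrete, here is a minimal, self-contained
Java sketch of the idea in section 3.1: each data partition (what a mapper would
see) accumulates its share of the \delta_j * o_i contributions, and the partial
sums are then combined and scaled by l * (1/m) in a single batch update. The class
and method names are made up for illustration; this is not the code from the
MAHOUT-1265 patch, and the per-instance backpropagation is left as a placeholder.

import java.util.List;

// Illustration only: partial weight updates computed per data partition,
// then combined and applied in one batch step, as in section 3.1 above.
public class PartialUpdateSketch {

  // A "mapper" sums the per-instance contributions (delta_j * o_i) over its partition.
  static double[][] partialUpdate(List<double[]> partition, double[][] weights) {
    double[][] acc = new double[weights.length][weights[0].length];
    for (double[] instance : partition) {
      double[][] contribution = deltaTimesOutput(instance, weights);
      for (int i = 0; i < acc.length; i++) {
        for (int j = 0; j < acc[i].length; j++) {
          acc[i][j] += contribution[i][j];
        }
      }
    }
    return acc;
  }

  // A "reducer" combines the partials and applies Delta w_ij = l * (1/m) * sum.
  static void applyUpdates(double[][] weights, List<double[][]> partials,
                           double learningRate, int numInstances) {
    for (double[][] partial : partials) {
      for (int i = 0; i < weights.length; i++) {
        for (int j = 0; j < weights[i].length; j++) {
          weights[i][j] += learningRate / numInstances * partial[i][j];
        }
      }
    }
  }

  // Placeholder: a real implementation would run feed-forward and backpropagation
  // here to compute the delta_j * o_i terms for a single training instance.
  static double[][] deltaTimesOutput(double[] instance, double[][] weights) {
    return new double[weights.length][weights[0].length];
  }
}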

Re: [jira] [Commented] (MAHOUT-1307) Distinguish implemented algorithms from algorithms which may be implemented in the future in algorithms page

2013-12-08 Thread Yexi Jiang
It seems that some of the info on that page is outdated.


2013/12/8 Ajay Bhat (JIRA) j...@apache.org


 [
 https://issues.apache.org/jira/browse/MAHOUT-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842449#comment-13842449]

 Ajay Bhat commented on MAHOUT-1307:
 ---

 Hi [~yamakatu], I'd like to help with this issue. But it seems I don't
 have permission to edit the page?

  Distinguish implemented algorithms from algorithms which may be
 implemented in the future in algorithms page
 
 
 
  Key: MAHOUT-1307
  URL: https://issues.apache.org/jira/browse/MAHOUT-1307
  Project: Mahout
   Issue Type: Documentation
   Components: Website
 Affects Versions: 0.8
 Reporter: yamakatu
 Priority: Minor
  Fix For: 0.9
 
 
  On the Mahout algorithms web page
  (https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms),
  the algorithms that may be implemented in the future are easy to
 confuse with the algorithms that are already implemented,
  and I think it is difficult to tell them apart intuitively.
  I think the two groups should be distinguished more clearly.



 --
 This message was sent by Atlassian JIRA
 (v6.1#6144)




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron

2013-12-07 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1265:
---

Attachment: Mahout-1265-6.patch

This is the final version of the patch. It has been reviewed by [~smarthi].

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: Mahout-1265-6.patch, mahout-1265.patch



Re: Mahout 0.9 release

2013-12-02 Thread Yexi Jiang
 fit on 1280px? We can close this issue then.
   
MAHOUT-1305 - I think this should be turned into a task to actually
delete most of the pages that have been migrated to the new CMS
 (almost
all of them). Once 1245 is shipped, it would be great if a few more
people could lend a hand in getting this done.
   
MAHOUT-1307 - Can be closed once switched to CMS
   
MAHOUT-1326 - This really relates to the old Confluence export plugin
that we once used to generate static pages out of our wiki and that is
no longer active. Unless anyone on the Mahout dev list knows how to
fully delete all exported static pages, we should file an issue with
INFRA to ask for help getting those deleted. They definitely are
confusing to users.
   
   
   
M-1286 - Peng and ssc, we had talked about this during the last
hangout. Can this be included in 0.9?
   
M-1030 - Andrew Musselman? Any updates on this, its important that
 we
fix this for 0.9
   
M-1319, M-1328,
  M-1347, M-1364 - Suneel
   
M-1273 - Kun Yung, remember talking about this in one of the earlier
hangouts; can't recall what was decided?
   
M-1312, M-1256 - Dan Filimon (or Stevo??)
   
M-996  someone could pick
 this up (if its still relevant with present
codebase i.e.)
   
I think this can move to the next release - according to the
contributor and Sebastian the patch is rather hacky and there for
illustration purposes only. I'd rather see some more thought go into
that instead of pushing to have this in 0.9.
   
   
M-1265 Yexi had submitted a patch for this, it would be good if this
could go in as part of 0.9
   
M-1288 Solr Recommender - Pat Ferrell
   
M-1285: Any takers for this?
   
Would be nice to have - in particular if someone on dev@ (not
necessarily a committer) wants to get started with the code base.
Otherwise I'd say fix for next release
 if time gets short.
   
   
M-1356: Isabel's started on this, Stevo could u review this?
   
We definitely can punt that for the next release or even thereafter.
 It
would be great if someone who has some knowledge of Java security
policies would take a look. The implication of not fixing this
essentially is that in case someone commits test code that writes
outside of target or to some globally shared directory we might end
 up
having randomly failing tests due to the parallel setup again. But as
these will occur shortly after the commit it should be easy enough to
find the code change that caused the breakage.
   
   
   
M-1329: Support for Hadoop 2
   
Is that truly feasible
 within a week?
   
   
M-1366:  Stevo, Isabel 
   
This should be done as part of the release process by release manager
at the latest.
   
   
M-1261: Sebastian???
   
M-1309, M-1310, M-1311, M-1316 - all related to running Mahout on
Windows ??
   
I'm not aware of us supporting Windows.
   
   
M-1350 - Any takers?? (Stevo??)
   
To me this looks like a broken classpath on the user side. Without a
patch to at least re-produce the issue I wouldn't spend too much time
   
on this.
   
   
Isabel
   
  
  
  
 




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: Review Request 13406: mahout-1265: add multilayer perceptron.

2013-12-02 Thread Yexi Jiang
Suneel,

If this does not work, which location is the safe place to put the
temporary file?

Regards,
Yexi


2013/12/2 Suneel Marthi suneel.mar...@gmail.com

 Yexi,

 The tests have to be redone in light of recent
 changes for m-1345.  We shouldn't be writing to /tmp
 anymore which is gonna fail the tests.

 More later.
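
 As an illustration of the kind of change being asked for here (this is only a
 sketch using JUnit 4's TemporaryFolder rule, not the actual test code from the
 patch), test output can go into a per-test directory that JUnit creates and
 cleans up, instead of a shared location like /tmp:

 import java.io.File;
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.TemporaryFolder;

 public class ModelStorageTestSketch {

   // JUnit creates a fresh directory for each test and deletes it afterwards,
   // so nothing is ever written to a shared location such as /tmp.
   @Rule
   public TemporaryFolder testDir = new TemporaryFolder();

   @Test
   public void modelIsWrittenUnderTestDir() throws Exception {
     File modelFile = testDir.newFile("mlp.model");
     // ... train the network and persist it to modelFile here ...
   }
 }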

 Sent from my iPhone

 On Dec 2, 2013, at 6:25 PM, Yexi Jiang yexiji...@gmail.com wrote:

   This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/13406/
   Review request for mahout and Ted Dunning.
 By Yexi Jiang.

 *Updated Dec. 2, 2013, 11:25 p.m.*
 Changes

 I have updated the code according to the comments.

   *Repository: * mahout
 Description

 mahout-1265: add multilayer perceptron. For details, please refer to 
 https://issues.apache.org/jira/browse/MAHOUT-1265.

   Testing

 Please see the corresponding test cases

   Diffs (updated)

-

 https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java
(PRE-CREATION)
-

 https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java
(PRE-CREATION)
-

 https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java
(PRE-CREATION)
-

 https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java
(PRE-CREATION)
-

 https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java
(PRE-CREATION)

 View Diff https://reviews.apache.org/r/13406/diff/




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-11-28 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13834959#comment-13834959
 ] 

Yexi Jiang commented on MAHOUT-1265:


OK, I'll revise it accordingly.

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



Re: Mahout 0.9 release

2013-11-28 Thread Yexi Jiang
I am working on M-1265.


2013/11/28 Suneel Marthi suneel_mar...@yahoo.com

 Update on Open JIRAs for 0.9:

 Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 - all
 related to Wiki updates, please see Isabel's updates.

 M-1286 - Peng and Sebastian, we had talked about this during the last
 hangout. Can this be included in 0.9?

 M-1030 - Andrew Musselman, it's critical that we get this into 0.9; it's been
 deferred for the last 2 Mahout releases.

 M-1319, M-1328, M-1347, M-1350 - Suneel


 M-1265 - Multi Layer Perceptron, Yexi please look at my comments on
 Reviewboard.

 M-1273 - Kun Yung, Ted, defer this to next release ???



 M-1312, M-1256 - Stevo, could u take one of them

 On Thursday, November 28, 2013 5:01 AM, Isabel Drost-Fromm 
 isa...@apache.org wrote:

 On Wed, 27 Nov 2013 14:23:11 -0800 (PST)
 Suneel Marthi suneel_mar...@yahoo.com wrote:
  Below are the Open issues for 0.9:-

 This looks like we should be targeting Dec. 9th as code freeze to me.
 What do you all think?


  Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 - All
  related to Wiki updates, missing Wiki documentation and Wiki
  migration to new CMS.  Isabel's working on M-1245 (migrating to new
  CMS). Could some of the others be consolidated with that?

 I believe MAHOUT-1245 essentially is ready to be published - all I want
 before notifying INFRA to
  switch to the new cms based site is one other
 person to take at least a brief look.

 For MAHOUT-1304 - Sebastian, can you please check that the cms based
 site actually does fit on 1280px? We can close this issue then.

 MAHOUT-1305 - I think this should be turned into a task to actually
 delete most of the pages that have been migrated to the new CMS (almost
 all of them). Once 1245 is shipped, it would be great if a few more
 people could lend a hand in getting this done.

 MAHOUT-1307 - Can be closed once switched to CMS

 MAHOUT-1326 - This really relates to the old Confluence export plugin
 that we once used to generate static pages out of our wiki and that is
 no longer active. Unless anyone on the Mahout dev list knows how to
 fully delete all exported static pages, we should file an issue with
 INFRA to ask for help getting those deleted. They definitely are
 confusing to users.



  M-1286 - Peng and ssc, we had talked about this during the last
  hangout. Can this be included in 0.9?
 
  M-1030 - Andrew Musselman? Any updates on this, its important that we
  fix this for 0.9
 
  M-1319, M-1328,
   M-1347, M-1364 - Suneel
 
  M-1273 - Kun Yung, remember talking about this in one of the earlier
  hangouts; can't recall what was decided?
 
  M-1312, M-1256 - Dan Filimon (or Stevo??)
 
  M-996  someone could pick this up (if its still relevant with present
  codebase i.e.)

 I think this can move to the next release - according to the
 contributor and Sebastian the patch is rather hacky and there for
 illustration purposes only. I'd rather see some more thought go into
 that instead of pushing to have this in 0.9.


  M-1265 Yexi had submitted a patch for this, it would be good if this
  could go in as part of 0.9
 
  M-1288 Solr Recommender - Pat Ferrell
 
  M-1285: Any takers for this?

 Would be nice to have - in particular if someone on dev@ (not
 necessarily a committer) wants to get started with the code base.
 Otherwise I'd say fix for next release if time gets short.


  M-1356: Isabel's started on this, Stevo could u review this?

 We definitely can punt that for the next release or even thereafter. It
 would be great if someone who has some knowledge of Java security
 policies would take a look. The implication of not fixing this
 essentially is that in case someone commits test code that writes
 outside of target or to some globally shared directory we might end up
 having randomly failing tests due to the parallel setup again. But as
 these will occur shortly after the commit it should be easy enough to
 find the code change that caused the breakage.



  M-1329: Support for Hadoop 2

 Is that truly feasible within a week?


  M-1366:  Stevo, Isabel 

 This should be done as part of the release process by release manager
 at the latest.


  M-1261: Sebastian???
 
  M-1309, M-1310, M-1311, M-1316 - all related to running Mahout on
  Windows ??

 I'm not aware of us supporting Windows.


  M-1350 - Any takers?? (Stevo??)

 To me this looks like a broken classpath on the user side. Without a
 patch to at least re-produce the issue I wouldn't spend too much time

 on this.


 Isabel




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-976) Implement Multilayer Perceptron

2013-11-18 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825355#comment-13825355
 ] 

Yexi Jiang commented on MAHOUT-976:
---

[MAHOUT-1265|https://issues.apache.org/jira/browse/MAHOUT-1265] is actually a 
new implementation of the MLP based on Ted's comments. For example, users can 
freely configure each layer by setting the number of neurons and the squashing 
function. Users can also choose the cost function and set parameters such as 
the learning rate, momentum weight, and so on.
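
As a rough sketch of that style of per-layer configuration (the class and method
names below are placeholders chosen for illustration, not necessarily the API in
the MAHOUT-1265 patch):

// configure the network layer by layer (illustrative names only)
NeuralNetwork net = new NeuralNetwork();
net.addLayer(4, "Sigmoid");          // input layer: 4 neurons
net.addLayer(8, "Sigmoid");          // hidden layer: 8 neurons, sigmoid squashing
net.addLayer(3, "Sigmoid");          // output layer: 3 neurons for a 3-class problem
net.setCostFunction("SquaredError");
net.setLearningRate(0.05);
net.setMomentumWeight(0.1);
net.setRegularizationWeight(0.01);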

 Implement Multilayer Perceptron
 ---

 Key: MAHOUT-976
 URL: https://issues.apache.org/jira/browse/MAHOUT-976
 Project: Mahout
  Issue Type: New Feature
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
Priority: Minor
  Labels: multilayer, networks, neural, perceptron
 Fix For: Backlog

 Attachments: MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch, 
 MAHOUT-976.patch

   Original Estimate: 80h
  Remaining Estimate: 80h

 Implement a multi layer perceptron
  * via Matrix Multiplication
  * Learning by Backpropagation; implementing tricks by Yann LeCun et al.: 
 Efficient Backprop
  * arbitrary number of hidden layers (also 0 - just the linear model)
  * connections between adjacent layers only 
  * different cost and activation functions (different activation function in 
 each layer) 
  * test of backprop by gradient checking 
  * normalization of the inputs (storable) as part of the model
  
 First:
  * implement stochastic gradient descent like the gradient machine
  * simple gradient descent incl. momentum
 Later (new jira issues):  
  * Distributed Batch learning (see below)  
  * Stacked (Denoising) Autoencoder - Feature Learning
  * advanced cost minimization like 2nd-order methods, conjugate gradient, etc.
 Distribution of learning can be done by (batch learning):
  1 Partitioning of the data into x chunks 
  2 Learning the weight changes as matrices in each chunk
  3 Combining the matrices and updating the weights - back to 2
 Maybe this procedure can be done with random parts of the chunks (distributed 
 quasi online learning). 
 Batch learning with delta-bar-delta heuristics for adapting the learning 
 rates.
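
 As a side note on the delta-bar-delta heuristic mentioned above (Jacobs-style
 adaptive learning rates), a minimal sketch of the update rule looks like this;
 the constants and array layout are illustrative only, not taken from any patch:

 // Each weight keeps its own learning rate: grown additively while the gradient
 // sign stays consistent with its running average, shrunk multiplicatively when
 // the sign flips.
 static void deltaBarDeltaStep(double[] weights, double[] gradients,
                               double[] rates, double[] deltaBars) {
   final double kappa = 0.01;   // additive increase
   final double phi = 0.1;      // multiplicative decrease factor
   final double theta = 0.7;    // averaging factor for the gradient history
   for (int i = 0; i < weights.length; i++) {
     double g = gradients[i];
     if (g * deltaBars[i] > 0) {
       rates[i] += kappa;
     } else if (g * deltaBars[i] < 0) {
       rates[i] *= (1.0 - phi);
     }
     deltaBars[i] = (1.0 - theta) * g + theta * deltaBars[i];
     weights[i] -= rates[i] * g;
   }
 }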
  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-10-05 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787462#comment-13787462
 ] 

Yexi Jiang commented on MAHOUT-1265:


Is there any news on this?

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



Re: You are invited to Apache Mahout meet-up

2013-08-22 Thread Yexi Jiang
A great event. I wish I were in the Bay area.


2013/8/22 Shannon Quinn squ...@gatech.edu

 I'm only sorry I'm not in the Bay area. Sounds great!


 On 8/22/13 3:38 AM, Stevo Slavić wrote:

 Retweeted meetup invite. Have fun!

 Kind regards,
 Stevo Slavic.


 On Thu, Aug 22, 2013 at 8:34 AM, Ted Dunning ted.dunn...@gmail.com
 wrote:

  Very cool.

 Would love to see folks turn out for this.


 On Wed, Aug 21, 2013 at 9:38 PM, Ellen Friedman
 b.ellen.fried...@gmail.com wrote:

  The Apache Mahout user group has been re-activated. If you are in the
 Bay
 Area in California, join us on Aug 27 (Redwood City).

 Sebastian Schelter will be the main speaker, talking about new
 directions
 with Mahout recommendation. Grant Ingersoll, Ted Dunning and I will be there
 to do a short introduction for the meet-up and an update on the 0.8 release.

 Here's the link to rsvp: http://bit.ly/16K32hg

 Hope you can come, and please spread the word.

 Ellen





-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-08-07 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732836#comment-13732836
 ] 

Yexi Jiang commented on MAHOUT-1265:


Is there anyone who can review the code?
The sample code for using it can be seen in the test cases.

[~tdunning] Could you please give any comments?

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-08-07 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733110#comment-13733110
 ] 

Yexi Jiang commented on MAHOUT-1265:


[~smarthi] Done, please refer to https://reviews.apache.org/r/13406/. Thank you.

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



Re: Hangout on Monday

2013-08-05 Thread Yexi Jiang
Can anyone join?


2013/8/5 DB Tsai dbt...@dbtsai.com

 Can we get the google hangout link? Kun and I don't get the invitation.

 Sincerely,

 DB Tsai
 ---
 Web: http://www.dbtsai.com
 Phone : +1-650-383-8392


 On Mon, Aug 5, 2013 at 3:54 PM, Ted Dunning ted.dunn...@gmail.com wrote:
  Yes.  Max of 10.
 
 
  On Mon, Aug 5, 2013 at 3:53 PM, Nyoman Ribeka nyoman.rib...@gmail.com
 wrote:
 
  I think hangouts have a maximum of 10 participants. Watching on YouTube
  means you're passively participating.
 
 
  On Mon, Aug 5, 2013 at 6:51 PM, Sebastian Schelter s...@apache.org
 wrote:
 
   Is the link only for watching or also for participation? Never did a
   hangout before :)
  
   2013/8/5 Andrew Musselman andrew.mussel...@gmail.com
  
Can't make it alas
   
   
On Mon, Aug 5, 2013 at 3:12 PM, Michael Kun Yang 
 kuny...@stanford.edu
wrote:
   
 what's the addr of the hangout?


 On Sun, Aug 4, 2013 at 10:37 AM, Peng Cheng pc...@uowmail.edu.au
 
wrote:

  Nice, I'll be there.
 
 
  On 13-08-03 02:51 PM, Andrew Musselman wrote:
 
  Sounds good
 
 
  On Sat, Aug 3, 2013 at 12:04 AM, Ted Dunning 
  ted.dunn...@gmail.com
   
  wrote:
 
   Yes.  1600 PDT
 
  I got that right in the linked doc, just not on the more
  important
 email.
 
 
 
 
  On Fri, Aug 2, 2013 at 3:30 PM, Andrew Psaltis 
  andrew.psal...@webtrends.com
 
  wrote:
  On 8/2/13 4:42 PM, Ted Dunning ted.dunn...@gmail.com
 wrote:
 
   Let's have the hangout at 1600 on Monday, August 5th.
 
  Maybe asking the obvious here so I apologize for the spam. The timezone
  is PDT, correct?
 
 
 
 
 

   
  
 
 
 
  --
  Thanks,
 
  -Nyoman Ribeka
 




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: Hangout on Monday

2013-08-05 Thread Yexi Jiang
Hi, Ted,

I added you on google plus.


2013/8/5 Suneel Marthi suneel_mar...@yahoo.com

 Grant had setup a biweekly/weekly Google Doodle for Mahout meetups.

 We had only one of them sometime in July with no technical issues.

 I could see and hear you guys talk on today's hangout but it just wouldn't
 allow me to join in.

 Suggest that we should be using that going forward, there is no need for
 the meeting host to add the rest of the team to his/her circles that way.


 Regards,
 Suneel


 As our Google+ circle of knowledge expands, so does the circumference of
 darkness surrounding it - Albert Einstein



 
  From: Ted Dunning ted.dunn...@gmail.com
 To: Mahout Dev List dev@mahout.apache.org
 Sent: Monday, August 5, 2013 8:16 PM
 Subject: Re: Hangout on Monday


 Peng,

 It looks like you are not actually on google plus.  I have you in my Mahout
 circle under your iowa email address, but I am unable to add you to a
 hangout.


 On Mon, Aug 5, 2013 at 5:07 PM, Peng Cheng pc...@uowmail.edu.au wrote:

  So buggy, the program acts as if I'm in the meeting (showing a push-to-talk
  button), but it doesn't do anything.
 
 
  On 13-08-05 08:02 PM, Ted Dunning wrote:
 
  Hangouts clearly do not work the way I thought they did.  The URL that I
  sent out was for the archived version of the meeting.
 
 
  On Mon, Aug 5, 2013 at 5:00 PM, Peng Cheng pc...@uowmail.edu.au
 wrote:
 
   Strange, I didn't see any invitation.
 
 
  On 13-08-05 06:54 PM, Ted Dunning wrote:
 
   Just sent invite to Mahout dev list.
 
 
  On Mon, Aug 5, 2013 at 3:53 PM, Ted Dunning ted.dunn...@gmail.com
  wrote:
 
It is for both.
 
  If you have g+ installed you can participate.  If not, you can watch.
 
 
 
  On Mon, Aug 5, 2013 at 3:51 PM, Sebastian Schelter s...@apache.org
  wrote:
 
Is the link only for watching or also for participation? Never did
 a
 
  hangout before :)
 
  2013/8/5 Andrew Musselman andrew.mussel...@gmail.com
 
Can't make it alas
 
 
  On Mon, Aug 5, 2013 at 3:12 PM, Michael Kun Yang 
  kuny...@stanford.edu
 
   wrote:
  what's the addr of the hangout?
 
 
  On Sun, Aug 4, 2013 at 10:37 AM, Peng Cheng pc...@uowmail.edu.au
 
 
   wrote:
 
   Nice, I'll be there.
 
 
  On 13-08-03 02:51 PM, Andrew Musselman wrote:
 
Sounds good
 
 
  On Sat, Aug 3, 2013 at 12:04 AM, Ted Dunning 
 
   ted.dunn...@gmail.com
 
wrote:
 
 Yes.  1600 PDT
 
   I got that right in the linked doc, just not on the more
  important
 
   email.
 
 
  On Fri, Aug 2, 2013 at 3:30 PM, Andrew Psaltis 
  andrew.psal...@webtrends.com
 
wrote:
 
  On 8/2/13 4:42 PM, Ted Dunning ted.dunn...@gmail.com
 wrote:
 
 Let's have the hangout at 1600 on Monday, August 5th.
  Maybe asking the obvious here so I apologize for the spam. The timezone
  is PDT, correct?
 
 
 
 
 
 
 
 




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron

2013-08-03 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1265:
---

Attachment: mahout-1265.patch

[~tdunning] I have finished a workable single-machine version of 
MultilayerPerceptron (based on NeuralNetwork). It supports the requirements 
you mentioned above. It allows users to customize each layer, including its size 
and squashing function. It also allows users to specify different loss 
functions for the model. Moreover, users can store the trained model and 
reload it for later use. Finally, users can extract the weights of each 
layer from a trained model. This approach allows users to train and stack a 
deep-learning neural network layer by layer. If this single-machine version 
passes the review, I will begin to work on the map-reduce version based on it.
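
A rough sketch of that store / reload / extract workflow (the method names here
are placeholders for illustration; the actual API is in the patch on ReviewBoard):

// train a single-machine MLP and persist it (illustrative names only)
MultilayerPerceptron mlp = new MultilayerPerceptron(new int[] {64, 32, 10}, modelLocation);
for (Vector instance : trainingInstances) {
  mlp.train(instance);             // online/stochastic updates
}
mlp.writeModel();                  // store the trained model at modelLocation

// later: reload the model and extract the learned weights of a layer, e.g. to
// reuse them as the lower layers of a deeper, stacked network
MultilayerPerceptron reloaded = new MultilayerPerceptron(modelLocation);
Matrix firstLayerWeights = reloaded.getWeightsByLayer(0);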

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron

2013-08-03 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1265:
---

Status: Patch Available  (was: Open)

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-08-03 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728756#comment-13728756
 ] 

Yexi Jiang commented on MAHOUT-1265:


[~tdunning] The test cases cover three datasets: the simple XOR problem, the 
Cancer dataset (2-class classification), and the Iris dataset (3-class 
classification). For the latter two datasets, the classification accuracy is 
above 90%.

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-08-03 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728769#comment-13728769
 ] 

Yexi Jiang commented on MAHOUT-1265:


The MLP is implemented on top of NeuralNetwork. The NeuralNetwork is more 
general in terms of functionality (it can be used for regression, 
classification, dimensionality reduction, etc.) and architecture (linear 
regression and logistic regression can be expressed as two-level neural 
networks, an autoencoder as a three-level neural network; I have heard that 
even the SVM can be modeled as a type of neural network, but I am not sure). 

In my opinion, the NeuralNetwork I implemented is a suitable starting point for 
deep learning, as one way to implement deep nets is to stack autoencoders.
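
To make the point about architecture concrete, here is a minimal sketch of how 
logistic regression could be expressed as a two-level network with the 
MultilayerPerceptron API proposed in MAHOUT-1265; the "CrossEntropy" cost name 
and the concrete parameter values are assumptions for illustration only, not 
confirmed parts of the patch.
-
//  hypothetical sketch: logistic regression as a two-level network, using the
//  builder-style API proposed in MAHOUT-1265; the cost function name and the
//  parameter values below are illustrative assumptions
int numFeatures = 4;
int[] layerSizeArray = new int[] {numFeatures, 1};  // input layer -> one sigmoid output unit
URI modelLocation = ...
MultilayerPerceptron logistic = new MultilayerPerceptron(layerSizeArray, modelLocation);
logistic.setLearningRate(0.1).setMomentum(0.0).setRegularization(0.01).setCostFunction("CrossEntropy").setSquashingFunction("Sigmoid");
//  prepare the input feature
Vector inputFeature = ...
//  the single output dimension can be read as the predicted probability of the positive class
Vector outputVector = logistic.output(inputFeature);
-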

 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network
 Attachments: mahout-1265.patch



[jira] [Created] (MAHOUT-1265) Add Multilayer Perceptron

2013-06-18 Thread Yexi Jiang (JIRA)
Yexi Jiang created MAHOUT-1265:
--

 Summary: Add Multilayer Perceptron 
 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang


Design of multilayer perceptron


1. Motivation
A multilayer perceptron (MLP) is a kind of feed-forward artificial neural 
network, a mathematical model inspired by biological neural networks. The 
multilayer perceptron can be used for various machine learning tasks such as 
classification and regression, so it would be helpful to include it in Mahout.

2. API

The design goal of the API is to make the MLP easy to use and to keep the 
implementation details transparent to the user.

The following example code shows how a user works with the MLP.
-
//  set the parameters
double learningRate = 0.5;
double momentum = 0.1;
double regularization = 0.01;
int[] layerSizeArray = new int[] {2, 5, 1};
String costFuncName = “SquaredError”;
String squashingFuncName = “Sigmoid”;
//  the location to store the model; if there is already an existing model at
//  the specified location, the MLP will throw an exception
URI modelLocation = ...
MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, 
regularization, momentum, squashingFuncName, costFuncName, layerSizeArray, 
modelLocation);

//  the user can also load an existing model from a given URI and update the
//  model with new training data; if there is no existing model at the specified
//  location, an exception will be thrown
/*
MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, 
regularization, momentum, squashingFuncName, costFuncName, modelLocation);
*/

URI trainingDataLocation = …
//  the details of training are transparent to the user; training may run on a
//  single machine or in a distributed environment
mlp.train(trainingDataLocation);

//  the user can also train the model with one training instance at a time, in a
//  stochastic gradient descent fashion
Vector trainingInstance = ...
mlp.train(trainingInstance);

//  prepare the input feature
Vector inputFeature …
//  the semantic meaning of the output result is defined by the user
//  in the general case, the dimension of the output vector is 1 for regression
//  and two-class classification
//  the dimension of the output vector is n for n-class classification (n > 2)
Vector outputVector = mlp.output(inputFeature); 
-


3. Methodology

The output calculation can easily be implemented with a feed-forward approach, 
and single-machine training is straightforward. The following describes how to 
train the MLP in a distributed way with batch gradient descent. The workflow is 
illustrated in the figure below.


https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720

For distributed training, each training iteration is divided into two steps: 
the weight update calculation step and the weight update step. The distributed 
MLP can only be trained in a batch-update fashion.


3.1 The partial weight update calculation step:
This step trains the MLP in a distributed fashion. Each task gets a copy of the 
MLP model and calculates the weight update from its partition of the data.

Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), 
where D denotes the training set, d denotes a training instance, t_d denotes 
the class label, and y_d denotes the output of the MLP. Also suppose that the 
sigmoid function is used as the squashing function, 
squared error is used as the cost function, 
t_j denotes the target value of the jth neuron of the output layer, 
o_j denotes the actual output of the jth neuron, 
l denotes the learning rate, 
m denotes the number of training instances, and 
w_{ij} denotes the weight between the ith neuron in the previous layer and the 
jth neuron in the next layer. 

The weight of each edge is updated as 

\Delta w_{ij} = \frac{l}{m} \, \delta_j \, o_i, 

where \delta_j = - \sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) 
for the output layer, and 
\delta_j = - \sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_{k} \delta_k w_{jk} 
for a hidden layer (the superscript (m) indexes training instances). 

It is easy to see that \delta_j can be rewritten as 

\delta_j = - \sum_{p = 1}^{k} \sum_{m \in D_p} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}), 

where D_1, ..., D_k are k disjoint partitions of the training set D. 

The above equation shows that \delta_j can be divided into k partial sums.

So for the implementation, each mapper can calculate its part of \delta_j from 
its partition of the data and then store the result at a specified location.


3.2 The model update step:

After the k parts of \delta_j have been calculated, a separate program can be 
used to merge them into one and update the weight matrices.

This program loads the partial results produced in the weight update 
calculation step and updates the weight matrices accordingly. 
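
To make the two steps above concrete, the following is a minimal, self-contained 
sketch (not part of the attached patch) of the partial gradient calculation and 
the merge/update step for the output layer; the class and method names are 
illustrative assumptions, and the MapReduce plumbing is omitted.
-
import java.util.List;

// Illustrative sketch of the two-step distributed update described in 3.1 and
// 3.2 (not the actual MAHOUT-1265 code). Each "mapper" computes a partial
// gradient matrix for the output layer from its data partition; the merge step
// sums the partial matrices and applies the averaged weight update.
// The sign convention follows the text: delta_j = -o_j * (1 - o_j) * (t_j - o_j).
public final class DistributedUpdateSketch {

  // Section 3.1: partial gradient computed from one data partition.
  static double[][] partialGradient(double[][] prevOutputs,  // o_i per instance
                                    double[][] outputs,      // o_j per instance
                                    double[][] targets) {    // t_j per instance
    int outSize = outputs[0].length;
    int inSize = prevOutputs[0].length;
    double[][] gradient = new double[outSize][inSize];
    for (int m = 0; m < outputs.length; m++) {
      for (int j = 0; j < outSize; j++) {
        double o = outputs[m][j];
        double deltaJ = -o * (1 - o) * (targets[m][j] - o);
        for (int i = 0; i < inSize; i++) {
          gradient[j][i] += deltaJ * prevOutputs[m][i];
        }
      }
    }
    return gradient;
  }

  // Section 3.2: merge the k partial gradients and update the weight matrix.
  static void mergeAndUpdate(double[][] weights, List<double[][]> partials,
                             double learningRate, int totalInstances) {
    for (int j = 0; j < weights.length; j++) {
      for (int i = 0; i < weights[j].length; i++) {
        double sum = 0;
        for (double[][] g : partials) {
          sum += g[j][i];
        }
        // Delta w_{ij} = (l / m) * delta_j * o_i, applied as a gradient descent step.
        weights[j][i] -= learningRate / totalInstances * sum;
      }
    }
  }
}
-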


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly

[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron

2013-06-18 Thread Yexi Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yexi Jiang updated MAHOUT-1265:
---

Description: 
Design of multilayer perceptron


1. Motivation
A multilayer perceptron (MLP) is a kind of feed-forward artificial neural 
network, a mathematical model inspired by biological neural networks. The 
multilayer perceptron can be used for various machine learning tasks such as 
classification and regression, so it would be helpful to include it in Mahout.

2. API

The design goal of the API is to make the MLP easy to use and to keep the 
implementation details transparent to the user.

The following example code shows how a user works with the MLP.
-
//  set the parameters
double learningRate = 0.5;
double momentum = 0.1;
int[] layerSizeArray = new int[] {2, 5, 1};
String costFuncName = “SquaredError”;
String squashingFuncName = “Sigmoid”;
//  the location to store the model; if there is already an existing model at
//  the specified location, the MLP will throw an exception
URI modelLocation = ...
MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, 
modelLocation);
mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...);

//  the user can also load an existing model from a given URI and update the
//  model with new training data; if there is no existing model at the specified
//  location, an exception will be thrown
/*
MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, 
regularization, momentum, squashingFuncName, costFuncName, modelLocation);
*/

URI trainingDataLocation = …
//  the details of training are transparent to the user; training may run on a
//  single machine or in a distributed environment
mlp.train(trainingDataLocation);

//  the user can also train the model with one training instance at a time, in a
//  stochastic gradient descent fashion
Vector trainingInstance = ...
mlp.train(trainingInstance);

//  prepare the input feature
Vector inputFeature …
//  the semantic meaning of the output result is defined by the user
//  in the general case, the dimension of the output vector is 1 for regression
//  and two-class classification
//  the dimension of the output vector is n for n-class classification (n > 2)
Vector outputVector = mlp.output(inputFeature); 
-


3. Methodology

The output calculation can easily be implemented with a feed-forward approach, 
and single-machine training is straightforward. The following describes how to 
train the MLP in a distributed way with batch gradient descent. The workflow is 
illustrated in the figure below.


https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720

For distributed training, each training iteration is divided into two steps: 
the weight update calculation step and the weight update step. The distributed 
MLP can only be trained in a batch-update fashion.


3.1 The partial weight update calculation step:
This step trains the MLP in a distributed fashion. Each task gets a copy of the 
MLP model and calculates the weight update from its partition of the data.

Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), 
where D denotes the training set, d denotes a training instance, t_d denotes 
the class label, and y_d denotes the output of the MLP. Also suppose that the 
sigmoid function is used as the squashing function, 
squared error is used as the cost function, 
t_j denotes the target value of the jth neuron of the output layer, 
o_j denotes the actual output of the jth neuron, 
l denotes the learning rate, 
m denotes the number of training instances, and 
w_{ij} denotes the weight between the ith neuron in the previous layer and the 
jth neuron in the next layer. 

The weight of each edge is updated as 

\Delta w_{ij} = \frac{l}{m} \, \delta_j \, o_i, 

where \delta_j = - \sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) 
for the output layer, and 
\delta_j = - \sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_{k} \delta_k w_{jk} 
for a hidden layer (the superscript (m) indexes training instances). 

It is easy to see that \delta_j can be rewritten as 

\delta_j = - \sum_{p = 1}^{k} \sum_{m \in D_p} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}), 

where D_1, ..., D_k are k disjoint partitions of the training set D. 

The above equation shows that \delta_j can be divided into k partial sums.

So for the implementation, each mapper can calculate its part of \delta_j from 
its partition of the data and then store the result at a specified location.


3.2 The model update step:

After the k parts of \delta_j have been calculated, a separate program can be 
used to merge them into one and update the weight matrices.

This program loads the partial results produced in the weight update 
calculation step and updates the weight matrices accordingly. 


  was:
Design of multilayer perceptron


1. Motivation
A multilayer perceptron (MLP) is a kind of feed forward artificial neural 
network, which is a mathematical model inspired

[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron

2013-06-18 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13686880#comment-13686880
 ] 

Yexi Jiang commented on MAHOUT-1265:


Ted,

{quote}
I would suggest that a more fluid API would be helpful to people. For instance, 
each layer might be an object which could be composed together to build a model 
which
is then trained.
{quote}

It seems that you are suggesting a more general neural network, not just the MLP.
An MLP is a kind of feed-forward neural network whose topology is fixed.
It usually consists of several layers, and every pair of neurons in adjacent 
layers is connected.
Therefore, specifying the size of each layer is enough to determine the topology 
of an MLP.

It would be good to first define a generic neural network and then build an MLP 
on top of it in the way you describe. An advantage is that the generic neural 
network can be reused to build other types of neural networks in the future, 
e.g. an autoencoder for dimensionality reduction, a recurrent neural network 
for sequential data, or possibly deep nets, etc.


{quote}
Secondly, it seems like it would be good to have different kinds of loss 
function and
regularizations.
{quote}

Yes, the MLP would allow the user to specify different loss functions, squashing 
functions, and regularization schemes.


{quote}
Also, regarding things like momentum, do you have an idea that this really 
needs to be
commonly adjusted? or is there a way to set a good default?
{quote}

As far as I know, there is no empirical way to set a good default momentum 
weight; a good value depends on the concrete problem. As for the learning 
rate, a good approach is to use a decaying learning rate, as sketched below.
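
A minimal, self-contained sketch of such a decaying learning rate (illustrative 
only, not part of the patch); the 1/(1 + decay * t) form and the constants are 
assumptions chosen for the example.
-
// Minimal sketch of a decaying learning rate schedule (illustrative only,
// not part of the MAHOUT-1265 patch). The 1/(1 + decay * t) form and the
// constants are assumptions chosen for the example.
public final class LearningRateSchedule {
  private final double initialRate;
  private final double decay;

  public LearningRateSchedule(double initialRate, double decay) {
    this.initialRate = initialRate;
    this.decay = decay;
  }

  /** Learning rate to use at iteration t (t = 0, 1, 2, ...). */
  public double rateAt(long t) {
    return initialRate / (1.0 + decay * t);
  }

  public static void main(String[] args) {
    LearningRateSchedule schedule = new LearningRateSchedule(0.5, 0.01);
    for (long t = 0; t <= 1000; t += 250) {
      System.out.printf("iteration %d: learning rate %.4f%n", t, schedule.rateAt(t));
    }
  }
}
-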




 Add Multilayer Perceptron 
 --

 Key: MAHOUT-1265
 URL: https://issues.apache.org/jira/browse/MAHOUT-1265
 Project: Mahout
  Issue Type: New Feature
Reporter: Yexi Jiang
  Labels: machine_learning, neural_network


[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-11 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680476#comment-13680476
 ] 

Yexi Jiang commented on MAHOUT-975:
---

There are multiple problems (not only bugs) with the GradientMachine (based on 
Ted's revised version). If there is no time to pay attention to this issue, 
please ignore it until next week (when 0.8 is released).

1) The GradientMachine is a special case of a MultiLayerPerceptron (MLP) that 
contains only one hidden layer. Is it necessary to keep it if the 
MultiLayerPerceptron is in the plan?

2) hiddenToOutput does not seem correct. The squashing (activation) function 
should also be applied to the output layer (see [1][2][3][4]). Therefore, the 
range of the output for each node (neuron) in the output layer is (0, 1) if the 
sigmoid function is used, or (-1, 1) if the tanh function is used.

3) There are several problems with the training method. In updateRanking, it is 
not clear which weight update strategy is used; it claims to be 
back-propagation, but it is not implemented that way. 

3.1) It seems that only part of the outputWeights are updated (the weights 
associated with the good output node and the weights associated with the worst 
output node; again, this is OK for a two-class problem).
For back-propagation, all the weights between the last hidden layer and the 
output layer should be updated (a minimal sketch of this update is given after 
the reference list below). Did the original designer intentionally design it 
like that, and can its correctness be guaranteed?

In back-propagation, the delta of each node should be calculated first, and the 
weights are then adjusted based on the corresponding deltas. 
However, in the implemented code, 
   
3.2) The GradientMachine (and the MLP) can actually also be used for regression 
and prediction. The 'train' method of OnlineLearner restricts its power.

4) The corresponding test cases are not sufficient to verify the correctness of 
the implementation.

5) Once all the previous problems have been fixed, it will be time to consider 
the necessity of a map-reduce version of the algorithm.
 

Reference:
[1] Tom Mitchel. Machine Learning. Chapter 4.
[2] Jiawei Han. Data Mining Concepts and Technologies. Chapter 6.
[3] Stanford Unsupervised Feature Learning and Deep Learning tutorial. 
http://ufldl.stanford.edu/wiki/index.php/Neural_Networks. Section Neural 
Network.
[4] Christopher Bishop. Neural Networks for Pattern Recognition. Chapter 4.
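
As referenced in point 3.1 above, the following is a minimal, self-contained 
sketch (illustrative only, not the GradientMachine code) of the per-node delta 
computation and the update of all output-layer weights, assuming a sigmoid 
squashing function on the output layer and squared-error loss.
-
// Illustrative sketch (not the GradientMachine implementation) of the standard
// back-propagation update for the output layer with a sigmoid squashing
// function and squared-error loss: compute a delta per output node, then update
// ALL weights between the last hidden layer and the output layer.
public final class OutputLayerBackpropSketch {

  /**
   * @param hiddenActivations activations a_j of the last hidden layer
   * @param outputActivations activations o_i of the output layer (after sigmoid)
   * @param targets           target values t_i
   * @param outputWeights     outputWeights[i][j] connects hidden node j to output node i
   */
  static void updateOutputWeights(double[] hiddenActivations, double[] outputActivations,
                                  double[] targets, double[][] outputWeights,
                                  double learningRate) {
    for (int i = 0; i < outputActivations.length; i++) {
      double o = outputActivations[i];
      // error term; equals -(dE/dz_i) for squared error with a sigmoid output
      double delta = o * (1 - o) * (targets[i] - o);
      for (int j = 0; j < hiddenActivations.length; j++) {
        outputWeights[i][j] += learningRate * delta * hiddenActivations[j];
      }
    }
  }
}
-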



 Bug in Gradient Machine  - Computation of the gradient
 --

 Key: MAHOUT-975
 URL: https://issues.apache.org/jira/browse/MAHOUT-975
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
 Fix For: Backlog

 Attachments: GradientMachine2.java, GradientMachine.patch, 
 MAHOUT-975.patch


 The initialisation to compute the gradient descent weight updates for the 
 output units seems to be wrong:
  
 In the comment: dy / dw is just w since  y = x' * w + b.
 This is wrong. dy/dw is x (ignoring the indices). The same initialisation is 
 done in the code.
 Check by using neural network terminology:
 The gradient machine is a specialized version of a multi layer perceptron 
 (MLP).
 In a MLP the gradient for computing the weight change for the output units 
 is:
 dE / dw_ij = dE / dz_i * dz_i / dw_ij with z_i = sum_j (w_ij * a_j)
 here: i index of the output layer; j index of the hidden layer
 (d stands for the partial derivatives)
 here: z_i = a_i (no squashing in the output layer)
 with the special loss (cost function) E = 1 - a_g + a_b = 1 - z_g + z_b
 with
 g: index of the output unit with target value +1 (positive class)
 b: random output unit with target value 0
 =>
 dE / dw_gj = dE/dz_g * dz_g/dw_gj = -1 * a_j (a_j: activity of the hidden 
 unit j)
 dE / dw_bj = dE/dz_b * dz_b/dw_bj = +1 * a_j (a_j: activity of the hidden 
 unit j)
 That is the same result the comment would give if it were corrected:
 dy /dw = x (x is here the activation of the hidden unit) * (-1) for weights to
 the output unit with target value +1.
 
 In neural network implementations it's common to compute the gradient
 numerically to test the implementation. This can be done by:
 dE/dw_ij = (E(w_ij + epsilon) - E(w_ij - epsilon)) / (2 * epsilon)
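
The numerical check described above can be sketched as follows (illustrative 
only, not Mahout code); the toy error function and the epsilon value are 
assumptions for the example.
-
import java.util.function.ToDoubleFunction;

// Minimal sketch of the numerical gradient check described above: perturb one
// weight by +/- epsilon and compare the finite difference with the analytically
// computed gradient.
public final class GradientCheckSketch {

  /** Central finite difference dE/dw_ij ~= (E(w+eps) - E(w-eps)) / (2*eps). */
  static double numericalGradient(double[] weights, int index,
                                  ToDoubleFunction<double[]> error, double eps) {
    double original = weights[index];
    weights[index] = original + eps;
    double plus = error.applyAsDouble(weights);
    weights[index] = original - eps;
    double minus = error.applyAsDouble(weights);
    weights[index] = original;                 // restore the weight
    return (plus - minus) / (2 * eps);
  }

  public static void main(String[] args) {
    // Toy example: E(w) = 0.5 * (t - w.x)^2 with fixed x and t, so dE/dw_i = -(t - w.x) * x_i.
    double[] x = {1.0, 2.0};
    double t = 1.0;
    double[] w = {0.3, -0.1};
    ToDoubleFunction<double[]> error = weights -> {
      double y = weights[0] * x[0] + weights[1] * x[1];
      return 0.5 * (t - y) * (t - y);
    };
    double y = w[0] * x[0] + w[1] * x[1];
    for (int i = 0; i < w.length; i++) {
      double analytic = -(t - y) * x[i];
      double numeric = numericalGradient(w, i, error, 1e-6);
      System.out.printf("w[%d]: analytic %.6f numeric %.6f%n", i, analytic, numeric);
    }
  }
}
-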

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-11 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680673#comment-13680673
 ] 

Yexi Jiang commented on MAHOUT-975:
---

The size of goodLabels in updateRanking is always 1, so it seems there is no 
need to use a loop.
Also, the existing test case cannot be passed; an 
ArrayIndexOutOfBoundsException is thrown.

 Bug in Gradient Machine  - Computation of the gradient
 --

 Key: MAHOUT-975
 URL: https://issues.apache.org/jira/browse/MAHOUT-975
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
 Fix For: Backlog

 Attachments: GradientMachine2.java, GradientMachine.patch, 
 MAHOUT-975.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-10 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679582#comment-13679582
 ] 

Yexi Jiang commented on MAHOUT-975:
---

[~smarthi] Sure, I'd like to try it. Is the deadline end of this week?

 Bug in Gradient Machine  - Computation of the gradient
 --

 Key: MAHOUT-975
 URL: https://issues.apache.org/jira/browse/MAHOUT-975
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
 Fix For: 0.8

 Attachments: GradientMachine.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-10 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679837#comment-13679837
 ] 

Yexi Jiang commented on MAHOUT-975:
---

[~smarthi] When I apply this patch, the source code cannot be compiled. One of 
the error is that hiddenActivations cannot be resolved. Another error is that 
the class Functions.NEGATE is misspell as Function.NEGATE.



 Bug in Gradient Machine  - Computation of the gradient
 --

 Key: MAHOUT-975
 URL: https://issues.apache.org/jira/browse/MAHOUT-975
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
 Fix For: 0.8

 Attachments: GradientMachine.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-10 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679837#comment-13679837
 ] 

Yexi Jiang edited comment on MAHOUT-975 at 6/10/13 8:07 PM:


[~smarthi] When I apply this patch, the source code cannot be compiled. One of 
the error is that hiddenActivations cannot be resolved. Another error is that 
the class Functions.NEGATE is misspelled as Function.NEGATE.





  was (Author: yxjiang):
[~smarthi] When I apply this patch, the source code cannot be compiled. One 
of the error is that hiddenActivations cannot be resolved. Another error is 
that the class Functions.NEGATE is misspell as Function.NEGATE.


  
 Bug in Gradient Machine  - Computation of the gradient
 --

 Key: MAHOUT-975
 URL: https://issues.apache.org/jira/browse/MAHOUT-975
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
 Fix For: 0.8

 Attachments: GradientMachine.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [jira] [Comment Edited] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-10 Thread Yexi Jiang
OK, I will try to update the source code to the latest version.


2013/6/10 Yexi Jiang (JIRA) j...@apache.org


 [
 https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679837#comment-13679837]

 Yexi Jiang edited comment on MAHOUT-975 at 6/10/13 8:07 PM:
 

 [~smarthi] When I apply this patch, the source code cannot be compiled.
 One of the error is that hiddenActivations cannot be resolved. Another
 error is that the class Functions.NEGATE is misspelled as Function.NEGATE.





   was (Author: yxjiang):
 [~smarthi] When I apply this patch, the source code cannot be
 compiled. One of the error is that hiddenActivations cannot be resolved.
 Another error is that the class Functions.NEGATE is misspell as
 Function.NEGATE.



  Bug in Gradient Machine  - Computation of the gradient
  --
 
  Key: MAHOUT-975
  URL: https://issues.apache.org/jira/browse/MAHOUT-975
  Project: Mahout
   Issue Type: Bug
   Components: Classification
 Affects Versions: 0.7
 Reporter: Christian Herta
 Assignee: Ted Dunning
  Fix For: 0.8
 
  Attachments: GradientMachine.patch
 
 

 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA
 administrators
 For more information on JIRA, see: http://www.atlassian.com/software/jira




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-10 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679866#comment-13679866
 ] 

Yexi Jiang commented on MAHOUT-975:
---

[~smarthi] OK, I will work directly on the latest version of the code.

 Bug in Gradient Machine  - Computation of the gradient
 --

 Key: MAHOUT-975
 URL: https://issues.apache.org/jira/browse/MAHOUT-975
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
 Fix For: 0.8

 Attachments: GradientMachine.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient

2013-06-10 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680104#comment-13680104
 ] 

Yexi Jiang commented on MAHOUT-975:
---

[~smarthi] Do I still need to work on this?

 Bug in Gradient Machine  - Computation of the gradient
 --

 Key: MAHOUT-975
 URL: https://issues.apache.org/jira/browse/MAHOUT-975
 Project: Mahout
  Issue Type: Bug
  Components: Classification
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
 Fix For: 0.8

 Attachments: GradientMachine2.java, GradientMachine.patch, 
 MAHOUT-975.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-976) Implement Multilayer Perceptron

2013-06-09 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679119#comment-13679119
 ] 

Yexi Jiang commented on MAHOUT-976:
---

No feedback?

 Implement Multilayer Perceptron
 ---

 Key: MAHOUT-976
 URL: https://issues.apache.org/jira/browse/MAHOUT-976
 Project: Mahout
  Issue Type: New Feature
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
Priority: Minor
  Labels: multilayer, networks, neural, perceptron
 Fix For: Backlog

 Attachments: MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch, 
 MAHOUT-976.patch

   Original Estimate: 80h
  Remaining Estimate: 80h

 Implement a multi layer perceptron
  * via Matrix Multiplication
  * Learning by Backpropagation; implementing tricks by Yann LeCun et al.: 
 Efficient BackProp
  * arbitrary number of hidden layers (also 0  - just the linear model)
  * connection between proximate layers only 
  * different cost and activation functions (different activation function in 
 each layer) 
  * test of backprop by gradient checking 
  * normalization of the inputs (storeable) as part of the model
  
 First:
  * implement stochastic gradient descent like the gradient machine
  * simple gradient descent incl. momentum
 Later (new jira issues):  
  * Distributed Batch learning (see below)  
  * Stacked (Denoising) Autoencoder - Feature Learning
  * advanced cost minimization like 2nd order methods, conjugate gradient etc.
 Distribution of learning can be done by (batch learning):
  1 Partitioning of the data into x chunks 
  2 Learning the weight changes as matrices in each chunk
  3 Combining the matrices and updating the weights - back to 2
 Maybe this procedure can be done with random parts of the chunks (distributed 
 quasi-online learning). 
 Batch learning with delta-bar-delta heuristics for adapting the learning 
 rates.
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAHOUT-976) Implement Multilayer Perceptron

2013-06-07 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678668#comment-13678668
 ] 

Yexi Jiang commented on MAHOUT-976:
---

Hi, 

I read the source code from the patch files (all four versions) and have 
the following questions.

1) It seems that the source code has not fully implemented the distributed MLP.

Based on my understanding, the algorithm designer intends to make the 
implemented MLP generic enough that it can be used in both single-machine and 
distributed scenarios.

For the single-machine scenario, the user can easily reuse the algorithm by 
writing code similar to the test cases. But for the distributed version, the 
user has to implement a mapper to load the training data, then create an MLP 
instance inside the mapper and train it with the incoming data. Moreover, the 
user has to come up with a solution to merge the MLP weight updates from each 
mapper instance, which is not trivial.

Therefore, it seems that the current implementation is no more than a 
single-machine version of the MLP.



2) The dimension of the target Vector fed to trainOnline is always 1. This is 
because 'actual' is always an integer, and there is no post-processing to turn 
it into a multi-class target vector.

The following is the call sequence.
train -> trainOnline -> getDerivativeOfTheCostWithoutRegularization -> 
getOutputDeltas -> AbstractVector.assign(Vector v, DoubleDoubleFunction f)

The assign method checks whether the size of v equals this.size. In the MLP 
scenario, it checks whether the size of the output layer equals the size of the 
class label vector.

And the following is the related code.
--
public void train(long trackingKey, String groupKey, int actual,
  Vector instance) {
// training with one pattern
Vector target = new DenseVector(1);
target.setQuick(0, (double) actual);
trainOnline(instance, target);
  }
--

The reason it passes the test cases is that the test cases only create an MLP 
with an output layer of size 1.

So I am wondering whether the argument list of train should be changed, or 
whether the 'actual' argument should be transformed internally, as sketched 
below.
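
For illustration, a minimal sketch of the internal transformation I have in 
mind (not part of any patch; the helper name is an assumption):
--
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;

//  hypothetical helper: turn the integer 'actual' label into a multi-class
//  (one-hot) target vector whose dimension matches the output layer
static Vector toTargetVector(int actual, int numOutputUnits) {
  Vector target = new DenseVector(numOutputUnits);
  target.setQuick(actual, 1.0);   // 1.0 at the class index, 0.0 elsewhere
  return target;
}

//  train(...) could then call
//  trainOnline(instance, toTargetVector(actual, numOutputUnits));
--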



I have implemented a BSP-based distributed MLP, and the code has already been 
committed to the Apache Hama machine learning package. The BSP version is not 
difficult to adapt to the MapReduce framework. If it is OK, I can change my 
existing code and contribute it to Mahout.



 Implement Multilayer Perceptron
 ---

 Key: MAHOUT-976
 URL: https://issues.apache.org/jira/browse/MAHOUT-976
 Project: Mahout
  Issue Type: New Feature
Affects Versions: 0.7
Reporter: Christian Herta
Assignee: Ted Dunning
Priority: Minor
  Labels: multilayer, networks, neural, perceptron
 Fix For: Backlog

 Attachments: MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch, 
 MAHOUT-976.patch

   Original Estimate: 80h
  Remaining Estimate: 80h

 Implement a multi-layer perceptron
  * via Matrix Multiplication
  * Learning by Backpropagation; implementing tricks by Yann LeCun et al.:
 Efficient Backprop
  * arbitrary number of hidden layers (also 0 - just the linear model)
  * connection between proximate layers only
  * different cost and activation functions (a different activation function in
 each layer)
  * test of backprop by gradient checking (a minimal sketch follows below)
  * normalization of the inputs (storable) as part of the model
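
As a side note on the gradient-checking item above, a minimal sketch of a
central finite-difference check is shown below; the cost-function interface,
epsilon, and tolerance are illustrative assumptions rather than anything from
the patch.

--
import java.util.function.ToDoubleFunction;

// Sketch only: compare an analytically computed (backprop) gradient against a
// central finite difference of the cost function.
public final class GradientCheck {

  static boolean check(ToDoubleFunction<double[]> cost,
                       double[] weights,
                       double[] analyticGradient,
                       double epsilon,
                       double tolerance) {
    for (int i = 0; i < weights.length; i++) {
      double original = weights[i];
      weights[i] = original + epsilon;
      double costPlus = cost.applyAsDouble(weights);
      weights[i] = original - epsilon;
      double costMinus = cost.applyAsDouble(weights);
      weights[i] = original;                                   // restore
      double numerical = (costPlus - costMinus) / (2.0 * epsilon);
      double scale = Math.max(1e-12, Math.abs(numerical) + Math.abs(analyticGradient[i]));
      if (Math.abs(numerical - analyticGradient[i]) / scale > tolerance) {
        return false;                                          // mismatch at weight i
      }
    }
    return true;
  }
}
--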
  
 First:
  * implementation of stochastic gradient descent, like the gradient machine
  * simple gradient descent incl. momentum
 Later (new jira issues):
  * Distributed Batch learning (see below)
  * Stacked (Denoising) Autoencoder - Feature Learning
  * advanced cost minimization like 2nd-order methods, conjugate gradient etc.
 Distribution of learning can be done by (batch learning; a minimal sketch of
 steps 1-3 follows below):
  1 Partitioning of the data into x chunks
  2 Learning the weight changes as matrices in each chunk
  3 Combining the matrices and updating the weights - back to 2
 Maybe this procedure can be done with random parts of the chunks (distributed
 quasi-online learning).
 Batch learning with delta-bar-delta heuristics for adapting the learning
 rates.
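
A minimal single-machine sketch of steps 1-3 above; Model, TrainingExample,
computeWeightDeltas(), addInPlace(), and applyDeltas() are hypothetical
placeholders, and a distributed version would run step 2 in parallel, e.g. one
mapper per chunk.

--
// Sketch only: batch learning over pre-partitioned chunks (all names hypothetical).
static void batchTrain(Model model, List<List<TrainingExample>> chunks, int epochs) {
  for (int epoch = 0; epoch < epochs; epoch++) {
    double[][] combined = null;                          // accumulator for step 3
    for (List<TrainingExample> chunk : chunks) {         // step 1: data split into chunks
      double[][] deltas = computeWeightDeltas(model, chunk);   // step 2: per-chunk deltas
      combined = (combined == null) ? deltas : addInPlace(combined, deltas);
    }
    applyDeltas(model, combined, 1.0 / chunks.size());   // step 3: average, update, repeat
  }
}
--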
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Really want to contribute to mahout

2013-06-03 Thread Yexi Jiang
Certainly, I always keep an eye on the issue tracker. It is not easy to
find an open issue; most of them are assigned shortly after they are created.


2013/6/2 Ted Dunning ted.dunn...@gmail.com

 Yexi,

 It is really good that you just spoke up.  The density based clustering
 issue that you filed didn't find a fertile audience, that is true.

 Can you provide a pointer to the other issue?




 On Sat, Jun 1, 2013 at 9:06 PM, Yexi Jiang yexiji...@gmail.com wrote:

  Hi,
 
  I have been on the mailing list for a while and intend to contribute my
  code to mahout. However, I tried two issues but didn't get permission to
  work on them.
 
  I'm wondering how I can contribute to mahout. As I am a graduate student
  working on data mining, I really want to do something to make mahout
  better.
 
  Regards,
  Yexi
 




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: Algorithms for categorical data

2013-06-02 Thread Yexi Jiang
Do you already have one implemented?


2013/6/2 Florents Tselai tse...@dmst.aueb.gr

 I've noticed (correct me if I'm wrong) that mahout lacks algorithms
 specialized in clustering data with categorical attributes.

 Would the community be interested in the implementation of algorithms like
 ROCK http://www.cis.upenn.edu/~sudipto/mypapers/categorical.pdf ?

 I'm currently working on this area (applied-research project) and I'd like
 to have my code open-sourced.




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


Re: Algorithms for categorical data

2013-06-02 Thread Yexi Jiang
You mean you are testing a single-machine version?


2013/6/2 Florents Tselai tse...@dmst.aueb.gr

 Not yet.

 I'm currently experimenting with various implementations in Python.


 On Sun, Jun 2, 2013 at 9:43 PM, Yexi Jiang yexiji...@gmail.com wrote:

  Do you already have one implemented?
 
 
  2013/6/2 Florents Tselai tse...@dmst.aueb.gr
 
   I've noticed (correct me if I'm wrong) that mahout lacks algorithms
   specialized in clustering data with categorical attributes.
  
   Would the community be interested in the implementation of algorithms
  like
   ROCK http://www.cis.upenn.edu/~sudipto/mypapers/categorical.pdf ?
  
   I'm currently working on this area (applied-research project) and I'd
  like
   to have my code open-sourced.
  
 
 
 
  --
  --
  Yexi Jiang,
  ECS 251,  yjian...@cs.fiu.edu
  School of Computer and Information Science,
  Florida International University
  Homepage: http://users.cis.fiu.edu/~yjian004/
 




-- 
--
Yexi Jiang,
ECS 251,  yjian...@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/


[jira] [Commented] (MAHOUT-1206) Add density-based clustering algorithms to mahout

2013-06-01 Thread Yexi Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13672147#comment-13672147
 ] 

Yexi Jiang commented on MAHOUT-1206:


Are there still no comments?

 Add density-based clustering algorithms to mahout
 -

 Key: MAHOUT-1206
 URL: https://issues.apache.org/jira/browse/MAHOUT-1206
 Project: Mahout
  Issue Type: Improvement
Reporter: Yexi Jiang
  Labels: clustering

 The existing clustering algorithms (k-means, fuzzy k-means, Dirichlet
 clustering, and spectral clustering) cluster data by assuming that the data
 can be grouped into regular hyper-spheres or ellipsoids. However, in practice,
 not all data can be clustered in this way.
 To enable data to be clustered into arbitrary shapes, clustering algorithms
 like DBSCAN, BIRCH, and CLARANS
 (http://en.wikipedia.org/wiki/Cluster_analysis#Density-based_clustering) have
 been proposed.
 It would be good to implement one or more of these clustering algorithms to
 enrich the clustering library.
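
For illustration, a minimal single-machine sketch of the DBSCAN idea
referenced above (not a proposed Mahout API): points with at least minPts
neighbours within distance eps become core points, and clusters grow by
density-reachability, so their shapes are not restricted to spheres or
ellipsoids.

--
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch only: naive O(n^2) DBSCAN over dense points.
public final class SimpleDbscan {
  static final int NOISE = -1;
  static final int UNVISITED = 0;

  /** Returns a cluster label per point: -1 = noise, >0 = cluster id. */
  public static int[] cluster(double[][] points, double eps, int minPts) {
    int[] labels = new int[points.length];
    int clusterId = 0;
    for (int i = 0; i < points.length; i++) {
      if (labels[i] != UNVISITED) continue;
      List<Integer> neighbors = regionQuery(points, i, eps);
      if (neighbors.size() < minPts) { labels[i] = NOISE; continue; }
      clusterId++;
      labels[i] = clusterId;
      Deque<Integer> seeds = new ArrayDeque<>(neighbors);
      while (!seeds.isEmpty()) {
        int j = seeds.poll();
        if (labels[j] == NOISE) labels[j] = clusterId;   // promote to border point
        if (labels[j] != UNVISITED) continue;
        labels[j] = clusterId;
        List<Integer> jNeighbors = regionQuery(points, j, eps);
        if (jNeighbors.size() >= minPts) seeds.addAll(jNeighbors);   // j is also core
      }
    }
    return labels;
  }

  private static List<Integer> regionQuery(double[][] points, int p, double eps) {
    List<Integer> result = new ArrayList<>();
    for (int q = 0; q < points.length; q++) {
      double sum = 0;
      for (int d = 0; d < points[p].length; d++) {
        double diff = points[p][d] - points[q][d];
        sum += diff * diff;
      }
      if (Math.sqrt(sum) <= eps) result.add(q);
    }
    return result;
  }
}
--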

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Really want to contribute to mahout

2013-06-01 Thread Yexi Jiang
Hi,

I have been on the mailing list for a while and intend to contribute my code
to mahout. However, I tried two issues but didn't get permission to
work on them.

I'm wondering how I can contribute to mahout. As I am a graduate student
working on data mining, I really want to do something to make mahout
better.

Regards,
Yexi