Re: [jira] [Commented] (MAHOUT-1551) Add document to describe how to use mlp with command line
Hi Felix, you are correct: the current implementation is a simple online/stochastic gradient descent network that uses back-propagation for optimization. The user can set the number of levels, the number of neurons in each level, and various parameters (such as the learning rate, regularization weight, etc.). The CLI version simplifies some parameters because basic users do not need that many parameters. Regards, Yexi

2014-07-14 7:36 GMT-07:00 Felix Schüler (JIRA) j...@apache.org: [ https://issues.apache.org/jira/browse/MAHOUT-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060688#comment-14060688 ] Felix Schüler commented on MAHOUT-1551: --- Ted, thanks for the feedback! As far as we understand it, the implementation is a simple online/stochastic gradient descent using backpropagation to calculate the gradients of the error function. Weights are then updated with a fixed learning rate that never changes. As we (I always say 'we' because I am working on it with someone else for a university class) have described in MAHOUT-1388, the CLI version only performs a fixed number of n iterations, where n is the size of the training set. So each example is fed into the network once, which in the case of a dataset as small as the iris dataset does not lead to acceptable performance. The unit test for the mlp iterates 2000 times through the dataset to achieve good performance, but as far as we can tell, stopping does not depend on learning or weight updates even though regularization is implemented. We could add this information to the implementation section of the documentation. As for the DSL, we are very tempted to implement the MLP or a more general neural network framework. We will think about it and see if we can find the time.

Add document to describe how to use mlp with command line - Key: MAHOUT-1551 URL: https://issues.apache.org/jira/browse/MAHOUT-1551 Project: Mahout Issue Type: Documentation Components: Classification, CLI, Documentation Affects Versions: 0.9 Reporter: Yexi Jiang Labels: documentation Fix For: 1.0 Attachments: README.md Add documentation about the usage of multi-layer perceptron in command line. -- This message was sent by Atlassian JIRA (v6.2#6252)
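To make the fixed-learning-rate, single-pass point above concrete, the following is a minimal, self-contained sketch of online SGD with a fixed learning rate on a single sigmoid unit. It is plain Java and not the Mahout MLP itself; the class name and the toy data are made up. The epochs constant plays the role of the 2000 passes used in the unit test, and setting it to 1 corresponds to the single pass the CLI currently performs.

    import java.util.Random;

    /** Minimal sketch: online SGD with a fixed learning rate on one sigmoid unit. */
    public class FixedRateSgdSketch {
      public static void main(String[] args) {
        // Toy two-feature, binary-label data standing in for a small dataset such as iris.
        double[][] x = { {0.1, 0.2}, {0.9, 0.8}, {0.2, 0.1}, {0.8, 0.9} };
        double[] t = { 0, 1, 0, 1 };

        double[] w = new double[] {0.0, 0.0};
        double bias = 0.0;
        double learningRate = 0.1;   // fixed, never decayed, as described above
        int epochs = 2000;           // a single pass (epochs = 1) underfits a tiny dataset

        Random rnd = new Random(42);
        for (int e = 0; e < epochs; e++) {
          int i = rnd.nextInt(x.length);            // stochastic: pick one example at a time
          double a = bias + w[0] * x[i][0] + w[1] * x[i][1];
          double o = 1.0 / (1.0 + Math.exp(-a));    // sigmoid squashing function
          double delta = (o - t[i]) * o * (1 - o);  // squared-error gradient at the output
          w[0] -= learningRate * delta * x[i][0];
          w[1] -= learningRate * delta * x[i][1];
          bias -= learningRate * delta;
        }
        System.out.println("w = [" + w[0] + ", " + w[1] + "], bias = " + bias);
      }
    }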
[jira] [Commented] (MAHOUT-1551) Add document to describe how to use mlp with command line
[ https://issues.apache.org/jira/browse/MAHOUT-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061011#comment-14061011 ] Yexi Jiang commented on MAHOUT-1551: [~fschueler], you are correct: the current implementation is a simple online/stochastic gradient descent network that uses back-propagation for optimization. The user can set the number of levels, the number of neurons in each level, and various parameters (such as the learning rate, regularization weight, etc.). The CLI version simplifies some parameters because basic users do not need that many parameters. Regards, Yexi

Add document to describe how to use mlp with command line - Key: MAHOUT-1551 URL: https://issues.apache.org/jira/browse/MAHOUT-1551 Project: Mahout Issue Type: Documentation Components: Classification, CLI, Documentation Affects Versions: 0.9 Reporter: Yexi Jiang Labels: documentation Fix For: 1.0 Attachments: README.md Add documentation about the usage of multi-layer perceptron in command line. -- This message was sent by Atlassian JIRA (v6.2#6252)
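For reference, the two knobs named above (learning rate and regularization weight) combine in a standard L2-regularized SGD weight update as in the small generic sketch below. This is an illustration of the usual formula only, not an excerpt from the Mahout code; the class and method names are made up.

    /** Generic L2-regularized SGD update for a single weight (illustration only). */
    public final class WeightUpdateSketch {
      private WeightUpdateSketch() {}

      static double update(double weight, double gradient,
                           double learningRate, double regularizationWeight) {
        // Gradient of (error + 0.5 * lambda * w^2) with respect to w.
        return weight - learningRate * (gradient + regularizationWeight * weight);
      }

      public static void main(String[] args) {
        // For example, learning rate 0.1 and regularization weight 0.01,
        // matching the values in the CLI example quoted later in this thread.
        System.out.println(update(0.5, 0.2, 0.1, 0.01));
      }
    }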
Re: [jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
The code has been uploaded to review board. 2014-05-17 23:27 GMT-07:00 Sebastian Schelter (JIRA) j...@apache.org: [ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001021#comment-14001021] Sebastian Schelter commented on MAHOUT-1388: [~yxjiang] what's the status here? Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252) -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computing and Information Sciences, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
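As a small illustration of how the proposed '--layersize' and '--labels' values above could be turned into a network topology and a label list, here is a hedged sketch in plain Java. The option formats follow the proposal quoted above; the values, class name, and validation comment are illustrative, and the actual parsing code in Mahout may differ.

    import java.util.Arrays;
    import java.util.List;

    /** Sketch: turning the proposed CLI option values into a topology and label list. */
    public class MlpCliValuesSketch {
      public static void main(String[] args) {
        String layerSpec = "4 8 3";                         // input, hidden, output sizes
        String labelSpec = "setosa versicolor virginica";   // iris labels from the example above

        int[] layerSizes = Arrays.stream(layerSpec.trim().split("\\s+"))
            .mapToInt(Integer::parseInt)
            .toArray();
        List<String> labels = Arrays.asList(labelSpec.trim().split("\\s+"));

        // Per the MAHOUT-1265 design, an n-class problem (n > 2) uses n output units,
        // so for the three iris labels the last layer size would typically be 3.
        System.out.println(Arrays.toString(layerSizes) + " -> " + labels);
      }
    }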
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001158#comment-14001158 ] Yexi Jiang commented on MAHOUT-1388: [~ssc] The code is available at https://reviews.apache.org/r/16700/. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001294#comment-14001294 ] Yexi Jiang commented on MAHOUT-1388: [~ssc] Sure, where should I add the documentation to? Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (MAHOUT-1551) Add document to describe how to use mlp with command line
Yexi Jiang created MAHOUT-1551: -- Summary: Add document to describe how to use mlp with command line Key: MAHOUT-1551 URL: https://issues.apache.org/jira/browse/MAHOUT-1551 Project: Mahout Issue Type: Documentation Components: Classification, CLI, Documentation Affects Versions: 0.9 Reporter: Yexi Jiang Add documentation about the usage of multi-layer perceptron in command line. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975002#comment-13975002 ] Yexi Jiang commented on MAHOUT-1388: [~smarthi] Could you please re-assign this issue to me? Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974113#comment-13974113 ] Yexi Jiang commented on MAHOUT-1388: [~ssc] Thanks, I will work on it. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974127#comment-13974127 ] Yexi Jiang commented on MAHOUT-1388: Do you mean that the MLP needs to be reimplemented in the way to work with spark? The current implementation of MLP is not a hadoop version. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13974718#comment-13974718 ] Yexi Jiang commented on MAHOUT-1388: [~ssc] For the comment 'duplicate code, this has already been implemented in the other class', could you please also point out which class has implemented the method that extract the string from the command line? I checked o.a.m.commons package, but didn't find the method I need. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1510) Goodbye MapReduce
[ https://issues.apache.org/jira/browse/MAHOUT-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969557#comment-13969557 ] Yexi Jiang commented on MAHOUT-1510: What kind of algorithms are acceptable in the future? Goodbye MapReduce - Key: MAHOUT-1510 URL: https://issues.apache.org/jira/browse/MAHOUT-1510 Project: Mahout Issue Type: Task Components: Documentation Reporter: Sebastian Schelter Fix For: 1.0 We should prominently state on the website that we reject any future MR algorithm contributions (but still maintain and bugfix what we have so far). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1510) Goodbye MapReduce
[ https://issues.apache.org/jira/browse/MAHOUT-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13969777#comment-13969777 ] Yexi Jiang commented on MAHOUT-1510: Great, is it necessary to port all of the old algorithms to scala DSL form? Goodbye MapReduce - Key: MAHOUT-1510 URL: https://issues.apache.org/jira/browse/MAHOUT-1510 Project: Mahout Issue Type: Task Components: Documentation Reporter: Sebastian Schelter Fix For: 1.0 We should prominently state on the website that we reject any future MR algorithm contributions (but still maintain and bugfix what we have so far). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13970006#comment-13970006 ] Yexi Jiang commented on MAHOUT-1265: Hi, [~barsik], according to [MAHOUT-1510|https://issues.apache.org/jira/browse/MAHOUT-1510], mahout no longer accept the proposal of MR algorithm. Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: machine_learning, neural_network Fix For: 0.9 Attachments: MAHOUT-1265.patch, Mahout-1265-17.patch Design of multilayer perceptron 1. Motivation A multilayer perceptron (MLP) is a kind of feed forward artificial neural network, which is a mathematical model inspired by the biological neural network. The multilayer perceptron can be used for various machine learning tasks such as classification and regression. It is helpful if it can be included in mahout. 2. API The design goal of API is to facilitate the usage of MLP for user, and make the implementation detail user transparent. The following is an example code of how user uses the MLP. - // set the parameters double learningRate = 0.5; double momentum = 0.1; int[] layerSizeArray = new int[] {2, 5, 1}; String costFuncName = “SquaredError”; String squashingFuncName = “Sigmoid”; // the location to store the model, if there is already an existing model at the specified location, MLP will throw exception URI modelLocation = ... MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, modelLocation); mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...); // the user can also load an existing model with given URI and update the model with new training data, if there is no existing model at the specified location, an exception will be thrown /* MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, modelLocation); */ URI trainingDataLocation = … // the detail of training is transparent to the user, it may running in a single machine or in a distributed environment mlp.train(trainingDataLocation); // user can also train the model with one training instance in stochastic gradient descent way Vector trainingInstance = ... mlp.train(trainingInstance); // prepare the input feature Vector inputFeature … // the semantic meaning of the output result is defined by the user // in general case, the dimension of output vector is 1 for regression and two-class classification // the dimension of output vector is n for n-class classification (n 2) Vector outputVector = mlp.output(inputFeature); - 3. Methodology The output calculation can be easily implemented with feed-forward approach. Also, the single machine training is straightforward. The following will describe how to train MLP in distributed way with batch gradient descent. The workflow is illustrated as the below figure. https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720 For the distributed training, each training iteration is divided into two steps, the weight update calculation step and the weight update step. The distributed MLP can only be trained in batch-update approach. 3.1 The partial weight update calculation step: This step trains the MLP distributedly. Each task will get a copy of the MLP model, and calculate the weight update with a partition of data. 
Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label, and y_d denotes the output of the MLP. Also, suppose the sigmoid function is used as the squashing function, the squared error is used as the cost function, t_i denotes the target value for the ith dimension of the output layer, o_i denotes the actual output for the ith dimension of the output layer, l denotes the learning rate, and w_{ij} denotes the weight between the jth neuron in the previous layer and the ith neuron in the next layer. The weight of each edge is updated as

\Delta w_{ij} = l \cdot \frac{1}{m} \cdot \delta_j \cdot o_i,

where, for the output layer,

\delta_j = - \sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}),

and, for a hidden layer,

\delta_j = - \sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_k \delta_k w_{jk}.

It is easy to see that \delta_j can be rewritten as

\delta_j = - \sum_{i = 1}^{k} \sum_{m_i} o_j^{(m_i)} (1 - o_j^{(m_i)}) (t_j^{(m_i)} - o_j^{(m_i)})

The above equation
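A minimal sketch of the two-step batch update described in section 3.1 follows, assuming each of several partitions has already computed a partial weight-update matrix for one layer (the per-instance \delta_j * o_i terms summed over that partition's data). It uses plain Java arrays and made-up numbers; the real implementation would operate on Mahout matrices inside the distributed tasks.

    /** Sketch: aggregate partial weight updates from data partitions (one layer shown). */
    public class BatchUpdateSketch {

      // Step 2 of the workflow: combine the partial updates and apply them with learning rate l.
      static void applyAveragedUpdates(double[][] weights, double[][][] partialUpdates,
                                       double learningRate, int totalInstances) {
        for (int i = 0; i < weights.length; i++) {
          for (int j = 0; j < weights[i].length; j++) {
            double sum = 0.0;
            for (double[][] partial : partialUpdates) {   // one entry per partition
              sum += partial[i][j];
            }
            // Delta w_ij = l * (1 / m) * (sum over all instances of delta_j * o_i);
            // here the partials hold the per-partition sums of the error gradient.
            weights[i][j] -= learningRate * sum / totalInstances;
          }
        }
      }

      public static void main(String[] args) {
        double[][] weights = { {0.1, 0.2}, {0.3, 0.4} };
        double[][][] partials = {
            { {0.01, 0.02}, {0.03, 0.04} },   // partial update from partition 1
            { {0.02, 0.01}, {0.00, 0.02} }    // partial update from partition 2
        };
        applyAveragedUpdates(weights, partials, 0.1, 100);
        System.out.println(java.util.Arrays.deepToString(weights));
      }
    }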
Re: [jira] [Assigned] (MAHOUT-1388) Add command line support and logging for MLP
The patch is already available. 2014-03-23 1:01 GMT-04:00 Suneel Marthi (JIRA) j...@apache.org: [ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel] Suneel Marthi reassigned MAHOUT-1388: - Assignee: Suneel Marthi Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.2#6252) -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: [jira] [Comment Edited] (MAHOUT-1426) GSOC 2013 Neural network algorithms
Hi, Ted, I am currently working on that issue with Suneel. Yexi 2014-03-19 19:44 GMT-04:00 Ted Dunning ted.dunn...@gmail.com: On Wed, Mar 19, 2014 at 3:19 PM, Maciej Mazur maciejmaz...@gmail.com wrote: I'm not going to propose this project. Now this issue can be closed. Proposing the downpour would be a good thing to do. It won't be that difficult. Please don't take my comments as discouraging. -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-1441) Add documentation for Spectral KMeans to Mahout Website
[ https://issues.apache.org/jira/browse/MAHOUT-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925231#comment-13925231 ] Yexi Jiang commented on MAHOUT-1441: I did some summary for [Mahout-1177: Reform and simplify the clustering APIs|https://issues.apache.org/jira/browse/MAHOUT-1177] last year. It can be found [here|https://docs.google.com/document/d/10RocKzS_FBZTIScqTI3Gl2tfeR8vXabPMCGNpZe07m8/edit]. Hope this document is useful. Add documentation for Spectral KMeans to Mahout Website --- Key: MAHOUT-1441 URL: https://issues.apache.org/jira/browse/MAHOUT-1441 Project: Mahout Issue Type: Bug Components: Documentation Affects Versions: 1.0 Reporter: Suneel Marthi Assignee: Shannon Quinn Fix For: 1.0 Need to update the Website with Design, user guide and any relevant documentation for Spectral KMeans clustering. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: [jira] [Commented] (MAHOUT-1441) Add documentation for Spectral KMeans to Mahout Website
Sebastian, Currently I am working on other things. If this issue is not urgent, please assign it to me. 2014-03-09 12:28 GMT-04:00 Sebastian Schelter (JIRA) j...@apache.org: [ https://issues.apache.org/jira/browse/MAHOUT-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925234#comment-13925234] Sebastian Schelter commented on MAHOUT-1441: Yexi, could you create a tutorial from this writeup to teach people how (and why) to use Mahout's spectral clustering? It would be great to have something similar to https://mahout.apache.org/users/recommender/userbased-5-minutes.htmlwhich was the result of MAHOUT-1438 Add documentation for Spectral KMeans to Mahout Website --- Key: MAHOUT-1441 URL: https://issues.apache.org/jira/browse/MAHOUT-1441 Project: Mahout Issue Type: Bug Components: Documentation Affects Versions: 1.0 Reporter: Suneel Marthi Assignee: Shannon Quinn Fix For: 1.0 Need to update the Website with Design, user guide and any relevant documentation for Spectral KMeans clustering. -- This message was sent by Atlassian JIRA (v6.2#6252) -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-1441) Add documentation for Spectral KMeans to Mahout Website
[ https://issues.apache.org/jira/browse/MAHOUT-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13925270#comment-13925270 ] Yexi Jiang commented on MAHOUT-1441: [~ssc], Currently I am working on other things. If this issue is not urgent, please assign it to me. [~smarthi], An advantage of spectral clustering is that it performs the clustering on the correlation metric space (a.k.a. the similarity graph). It performs well on data points that cannot be clustered well by algorithms that work directly on the original metric space and find only convex-shaped clusters. I'm not sure whether the Reuters dataset can reflect this advantage of spectral clustering. Or do we need to show a more representative dataset in the website example, like the ones used in the experiment section of this [paper|http://ai.stanford.edu/~ang/papers/nips01-spectral.pdf]?

Add documentation for Spectral KMeans to Mahout Website --- Key: MAHOUT-1441 URL: https://issues.apache.org/jira/browse/MAHOUT-1441 Project: Mahout Issue Type: Bug Components: Documentation Affects Versions: 1.0 Reporter: Suneel Marthi Assignee: Shannon Quinn Fix For: 1.0 Need to update the Website with Design, user guide and any relevant documentation for Spectral KMeans clustering. -- This message was sent by Atlassian JIRA (v6.2#6252)
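For readers of the future tutorial, the algorithm in the linked Ng/Jordan/Weiss paper can be summarized in a few lines. This is a standard recap of normalized spectral clustering for reference only; Mahout's Spectral KMeans is in this family, but its exact implementation details should be checked against the code.

    % Normalized spectral clustering (Ng, Jordan, Weiss, NIPS 2001), reference sketch.
    % Given points s_1, ..., s_n and a desired number of clusters k:
    \begin{enumerate}
      \item Affinity matrix: $A_{ij} = \exp\!\big(-\|s_i - s_j\|^2 / (2\sigma^2)\big)$ for $i \neq j$, $A_{ii} = 0$.
      \item Normalize: $L = D^{-1/2} A D^{-1/2}$, where $D$ is diagonal with $D_{ii} = \sum_j A_{ij}$.
      \item Let $X \in \mathbb{R}^{n \times k}$ hold the $k$ largest eigenvectors of $L$ as columns,
            and renormalize each row of $X$ to unit length to obtain $Y$.
      \item Run k-means on the rows of $Y$; assign $s_i$ to the cluster of row $i$.
    \end{enumerate}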
Re: Mahout 1.0 goals
Sebastian, In one of my recent projects, I used the Naive Bayes for classification, so I gave a write-up on this algorithm. You can find the document at https://docs.google.com/document/d/1h7N0GmIKe-KG64uulPMPzkp00nowM2-HDQ48c4PIhbc/edit?usp=sharing . Feedbacks are welcome. 2014-03-04 3:57 GMT-05:00 Sebastian Schelter ssc.o...@googlemail.com: Yexi, could you do a small write-up, analogously to what I proposed for Giorgio. Make sure to pick a different algorithm though. --sebastian Am 03.03.2014 16:54 schrieb Yexi Jiang yexiji...@gmail.com: I'm also happy to help. 2014-03-03 10:29 GMT-05:00 Giorgio Zoppi giorgio.zo...@gmail.com: I would like to help in the api creation. How do I start for being productive with mahout? Best Regards, Giorgio 2014-02-28 1:37 GMT+01:00 Ted Dunning ted.dunn...@gmail.com: I would like to start a conversation about where we want Mahout to be for 1.0. Let's suspend for the moment the question of how to achieve the goals. Instead, let's converge on what we really would like to have happen and after that, let's talk about means that will get us there. Here are some goals that I think would be good in the area of numerics, classifiers and clustering: - runs with or without Hadoop - runs with or without map-reduce - includes (at least), regularized generalized linear models, k-means, random forest, distributed random forest, distributed neural networks - reasonably competitive speed against other implementations including graphlab, mlib and R. - interactive model building - models can be exported as code or data - simple programming model - programmable via Java or R - runs clustered or not What does everybody think? -- Quiero ser el rayo de sol que cada día te despierta para hacerte respirar y vivir en me. Favola -Moda. -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/ -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: Mahout 1.0 goals
I'm also happy to help. 2014-03-03 10:29 GMT-05:00 Giorgio Zoppi giorgio.zo...@gmail.com: I would like to help in the api creation. How do I start for being productive with mahout? Best Regards, Giorgio 2014-02-28 1:37 GMT+01:00 Ted Dunning ted.dunn...@gmail.com: I would like to start a conversation about where we want Mahout to be for 1.0. Let's suspend for the moment the question of how to achieve the goals. Instead, let's converge on what we really would like to have happen and after that, let's talk about means that will get us there. Here are some goals that I think would be good in the area of numerics, classifiers and clustering: - runs with or without Hadoop - runs with or without map-reduce - includes (at least), regularized generalized linear models, k-means, random forest, distributed random forest, distributed neural networks - reasonably competitive speed against other implementations including graphlab, mlib and R. - interactive model building - models can be exported as code or data - simple programming model - programmable via Java or R - runs clustered or not What does everybody think? -- Quiero ser el rayo de sol que cada día te despierta para hacerte respirar y vivir en me. Favola -Moda. -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: [jira] [Comment Edited] (MAHOUT-1426) GSOC 2013 Neural network algorithms
Peng, Can you provide more details about your thought? Regards, 2014-02-27 16:00 GMT-05:00 peng pc...@uowmail.edu.au: That should be easy. But that defeats the purpose of using mahout as there are already enough implementations of single node backpropagation (in which case GPU is much faster). Yexi: Regarding downpour SGD and sandblaster, may I suggest that the implementation better has no parameter server? It's obviously a single point of failure and in terms of bandwidth, a bottleneck. I heard that MLlib on top of Spark has a functional implementation (never read or test it), and its possible to build the workflow on top of YARN. Non of those framework has an heterogeneous topology. Yours Peng On Thu 27 Feb 2014 09:43:19 AM EST, Maciej Mazur (JIRA) wrote: [ https://issues.apache.org/jira/browse/MAHOUT-1426?page= com.atlassian.jira.plugin.system.issuetabpanels:comment- tabpanelfocusedCommentId=13913488#comment-13913488 ] Maciej Mazur edited comment on MAHOUT-1426 at 2/27/14 2:41 PM: --- I've read the papers. I didn't think about distributed network. I had in mind network that will fit into memory, but will require significant amount of computations. I understand that there are better options for neural networks than map reduce. How about non-map-reduce version? I see that you think it is something that would make a sense. (Doing a non-map-reduce neural network in Mahout would be of substantial interest.) Do you think it will be a valueable contribution? Is there a need for this type of algorithm? I think about multi-threded batch gradient descent with pretraining (RBM or/and Autoencoders). I have looked into these old JIRAs. RBM patch was withdrawn. I would rather like to withdraw that patch, because by the time i implemented it i didn't know that the learning algorithm is not suited for MR, so I think there is no point including the patch. was (Author: maciejmazur): I've read the papers. I didn't think about distributed network. I had in mind network that will fit into memory, but will require significant amount of computations. I understand that there are better options for neural networks than map reduce. How about non-map-reduce version? I see that you think it is something that would make a sense. Do you think it will be a valueable contribution? Is there a need for this type of algorithm? I think about multi-threded batch gradient descent with pretraining (RBM or/and Autoencoders). I have looked into these old JIRAs. RBM patch was withdrawn. I would rather like to withdraw that patch, because by the time i implemented it i didn't know that the learning algorithm is not suited for MR, so I think there is no point including the patch. GSOC 2013 Neural network algorithms --- Key: MAHOUT-1426 URL: https://issues.apache.org/jira/browse/MAHOUT-1426 Project: Mahout Issue Type: Improvement Components: Classification Reporter: Maciej Mazur I would like to ask about possibilites of implementing neural network algorithms in mahout during GSOC. There is a classifier.mlp package with neural network. I can't see neighter RBM nor Autoencoder in these classes. There is only one word about Autoencoders in NeuralNetwork class. As far as I know Mahout doesn't support convolutional networks. Is it a good idea to implement one of these algorithms? Is it a reasonable amount of work? How hard is it to get GSOC in Mahout? Did anyone succeed last year? 
-- This message was sent by Atlassian JIRA (v6.1.5#6160) -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: [jira] [Comment Edited] (MAHOUT-1426) GSOC 2013 Neural network algorithms
Hi, Peng, Do you mean the MultilayerPerceptron? There are three 'train' method, and only one (the one without the parameters trackingKey and groupKey) is implemented. In current implementation, they are not used. Regards, Yexi 2014-02-27 19:31 GMT-05:00 Ted Dunning ted.dunn...@gmail.com: Generally for training models like this, there is an assumption that fault tolerance is not particularly necessary because the low risk of failure trades against algorithmic speed. For reasonably small chance of failure, simply re-running the training is just fine. If there is high risk of failure, simply checkpointing the parameter server is sufficient to allow restarts without redundancy. Sharding the parameter is quite possible and is reasonable when the parameter vector exceed 10's or 100's of millions of parameters, but isn't likely much necessary below that. The asymmetry is similarly not a big deal. The traffic to and from the parameter server isn't enormous. Building something simple and working first is a good thing. On Thu, Feb 27, 2014 at 3:56 PM, peng pc...@uowmail.edu.au wrote: With pleasure! the original downpour paper propose a parameter server from which subnodes download shards of old model and upload gradients. So if the parameter server is down, the process has to be delayed, it also requires that all model parameters to be stored and atomically updated on (and fetched from) a single machine, imposing asymmetric HDD and bandwidth requirement. This design is necessary only because each -=delta operation has to be atomic. Which cannot be ensured across network (e.g. on HDFS). But it doesn't mean that the operation cannot be decentralized: parameters can be sharded across multiple nodes and multiple accumulator instances can handle parts of the vector subtraction. This should be easy if you create a buffer for the stream of gradient, and allocate proper numbers of producers and consumers on each machine to make sure it doesn't overflow. Obviously this is far from MR framework, but at least it can be made homogeneous and slightly faster (because sparse data can be distributed in a way to minimize their overlapping, so gradients doesn't have to go across the network that frequent). If we instead using a centralized architect. Then there must be =1 backup parameter server for mission critical training. Yours Peng e.g. we can simply use a producer/consumer pattern If we use a producer/consumer pattern for all gradients, On Thu 27 Feb 2014 05:09:52 PM EST, Yexi Jiang wrote: Peng, Can you provide more details about your thought? Regards, 2014-02-27 16:00 GMT-05:00 peng pc...@uowmail.edu.au: That should be easy. But that defeats the purpose of using mahout as there are already enough implementations of single node backpropagation (in which case GPU is much faster). Yexi: Regarding downpour SGD and sandblaster, may I suggest that the implementation better has no parameter server? It's obviously a single point of failure and in terms of bandwidth, a bottleneck. I heard that MLlib on top of Spark has a functional implementation (never read or test it), and its possible to build the workflow on top of YARN. Non of those framework has an heterogeneous topology. Yours Peng On Thu 27 Feb 2014 09:43:19 AM EST, Maciej Mazur (JIRA) wrote: [ https://issues.apache.org/jira/browse/MAHOUT-1426?page= com.atlassian.jira.plugin.system.issuetabpanels:comment- tabpanelfocusedCommentId=13913488#comment-13913488 ] Maciej Mazur edited comment on MAHOUT-1426 at 2/27/14 2:41 PM: --- I've read the papers. 
I didn't think about distributed network. I had in mind network that will fit into memory, but will require significant amount of computations. I understand that there are better options for neural networks than map reduce. How about non-map-reduce version? I see that you think it is something that would make a sense. (Doing a non-map-reduce neural network in Mahout would be of substantial interest.) Do you think it will be a valueable contribution? Is there a need for this type of algorithm? I think about multi-threded batch gradient descent with pretraining (RBM or/and Autoencoders). I have looked into these old JIRAs. RBM patch was withdrawn. I would rather like to withdraw that patch, because by the time i implemented it i didn't know that the learning algorithm is not suited for MR, so I think there is no point including the patch. was (Author: maciejmazur): I've read the papers. I didn't think about distributed network. I had in mind network that will fit into memory, but will require significant amount of computations. I understand that there are better options for neural networks than map reduce. How about non-map-reduce version
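To make the producer/consumer idea discussed above concrete, here is a very small sketch in plain Java: worker threads produce gradient shards into a bounded queue, and a single accumulator thread consumes them and applies them to its shard of the parameters, with no central parameter server involved. The class name, shard size, and gradient values are made up, and there is no failure handling or network transport; it only illustrates the bounded-buffer pattern.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    /** Sketch: bounded queue between gradient producers and a per-shard accumulator. */
    public class GradientQueueSketch {
      public static void main(String[] args) throws InterruptedException {
        final int shardSize = 4;
        final double learningRate = 0.01;
        final double[] parameterShard = new double[shardSize];

        // Bounded buffer so producers block instead of overflowing the consumer.
        final BlockingQueue<double[]> gradients = new ArrayBlockingQueue<>(1024);

        Thread accumulator = new Thread(() -> {
          try {
            while (!Thread.currentThread().isInterrupted()) {
              double[] g = gradients.take();               // consume one gradient shard
              for (int i = 0; i < shardSize; i++) {
                parameterShard[i] -= learningRate * g[i];  // apply the update locally
              }
            }
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
        accumulator.start();

        // A real producer would compute gradients from its partition of the training data.
        for (int step = 0; step < 100; step++) {
          gradients.put(new double[] {0.1, -0.2, 0.05, 0.0});
        }
        accumulator.interrupt();
        accumulator.join();
        System.out.println(java.util.Arrays.toString(parameterShard));
      }
    }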
Re: [jira] [Created] (MAHOUT-1426) GSOC 2013 Neural network algorithms
Since the training methods for neural networks typically require many iterations, MapReduce is not a perfect fit for implementing them. Currently, the NeuralNetwork is implemented as an online learning model and the training is conducted via stochastic gradient descent. Moreover, the current version of NeuralNetwork is mainly used for supervised learning, so there is no RBM or Autoencoder. Regards, Yexi

2014-02-25 10:34 GMT-05:00 Maciej Mazur (JIRA) j...@apache.org: Maciej Mazur created MAHOUT-1426: Summary: GSOC 2013 Neural network algorithms Key: MAHOUT-1426 URL: https://issues.apache.org/jira/browse/MAHOUT-1426 Project: Mahout Issue Type: Improvement Components: Classification Reporter: Maciej Mazur I would like to ask about the possibilities of implementing neural network algorithms in mahout during GSOC. There is a classifier.mlp package with a neural network. I can't see either RBM or Autoencoder in these classes. There is only one word about Autoencoders in the NeuralNetwork class. As far as I know Mahout doesn't support convolutional networks. Is it a good idea to implement one of these algorithms? Is it a reasonable amount of work? -- This message was sent by Atlassian JIRA (v6.1.5#6160) -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-1426) GSOC 2013 Neural network algorithms
[ https://issues.apache.org/jira/browse/MAHOUT-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911865#comment-13911865 ] Yexi Jiang commented on MAHOUT-1426: I totally agree with you. From the algorithmic perspective, RBMs and Autoencoders have proven to be very effective for feature learning. When training a multi-level neural network, it is usually necessary to stack the RBMs or Autoencoders to learn the representative features first.

1. If the training dataset is large. It is true that if the training data is huge, the online version will be slow, as it is not a parallel implementation. If we implement the algorithm in the MapReduce way, the data can be read in parallel. No matter whether we use stochastic gradient descent, mini-batch gradient descent, or full-batch gradient descent, we need to train the model over many iterations. In practice, we need one job for each iteration. It is known that the start-up time of a Hadoop job is significant; therefore, the overhead can be even higher than the actual computing time. For example, if we use stochastic gradient descent, after each partition reads one data instance, we need to update and synchronize the model. IMHO, BSP is more effective than MapReduce in such a scenario.

2. If the model is large. If the model is large, we need to partition the model and store it in a distributed manner; you can find a solution in a related NIPS paper (http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/large_deep_networks_nips2012.pdf). In this case, the distributed system needs to be heterogeneous, since different nodes may have different tasks (parameter storage or computing). It is difficult to design an algorithm to conduct such work in the MapReduce style, as each task is considered homogeneous in MapReduce. Actually, according to the talk on Tera-scale deep learning (http://static.googleusercontent.com/media/research.google.com/en/us/archive/unsupervised_learning_talk_2012.pdf), even BSP is not quite suitable, since errors can always happen in a large-scale distributed system. In their implementation, they built an asynchronous computing framework to conduct the large-scale learning.

In summary, implementing a MapReduce version of NeuralNetwork is OK, but compared with more suitable computing frameworks it is not as efficient.

GSOC 2013 Neural network algorithms --- Key: MAHOUT-1426 URL: https://issues.apache.org/jira/browse/MAHOUT-1426 Project: Mahout Issue Type: Improvement Components: Classification Reporter: Maciej Mazur I would like to ask about the possibilities of implementing neural network algorithms in mahout during GSOC. There is a classifier.mlp package with a neural network. I can't see either RBM or Autoencoder in these classes. There is only one word about Autoencoders in the NeuralNetwork class. As far as I know Mahout doesn't support convolutional networks. Is it a good idea to implement one of these algorithms? Is it a reasonable amount of work? How hard is it to get GSOC in Mahout? Did anyone succeed last year? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
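The job-per-iteration overhead argument above corresponds to a driver loop like the one sketched below, which launches one Hadoop job per gradient-descent iteration and therefore pays the job start-up cost every time. GradientMapper and GradientSumReducer are empty placeholder classes (a real implementation would emit and sum per-weight partial gradients), and the weight broadcast/collection between iterations is omitted; only the loop structure is the point.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    /** Sketch: one Hadoop job per gradient-descent iteration, hence one start-up cost each. */
    public class IterativeGradientDescentDriver {

      /** Hypothetical mapper: a real one would emit per-weight partial gradients. */
      public static class GradientMapper extends Mapper<LongWritable, Text, LongWritable, Text> { }

      /** Hypothetical reducer: a real one would sum the partial gradients per weight. */
      public static class GradientSumReducer extends Reducer<LongWritable, Text, LongWritable, Text> { }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        int iterations = 100;
        for (int i = 0; i < iterations; i++) {
          Job job = Job.getInstance(conf, "gradient-descent-iteration-" + i);
          job.setJarByClass(IterativeGradientDescentDriver.class);
          job.setMapperClass(GradientMapper.class);
          job.setReducerClass(GradientSumReducer.class);
          FileInputFormat.addInputPath(job, new Path(args[0]));                    // training data
          FileOutputFormat.setOutputPath(job, new Path(args[1] + "/iteration-" + i));
          if (!job.waitForCompletion(true)) {        // each call pays the full job start-up time
            throw new IllegalStateException("Iteration " + i + " failed");
          }
          // The driver would read the summed gradients here and update the weights
          // before launching the next iteration's job.
        }
      }
    }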
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889097#comment-13889097 ] Yexi Jiang commented on MAHOUT-1388: [~smarthi] I have revised the code, could you please have a look at the code at the review board? https://reviews.apache.org/r/16700/ Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch, Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Mahout 0.9 Release - Call for Volunteers
Got the same error. Regards, Yexi 2014/1/16 Chameera Wijebandara chameerawijeband...@gmail.com Hi Suneel, Still it getting 404 error. Thanks, Chameera On Thu, Jan 16, 2014 at 7:11 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: Here's the new URL for Mahout 0.9 Release: https://repository.apache.org/content/repositories/orgapachemahout-1001/org/apache/mahout/mahout-buildtools/0.9/ For those volunteering to test this, some of the things to be verified: a) Verify that u can unpack the release (tar or zip) b) Verify u r able to compile the distro c) Run through the unit tests: mvn clean test d) Run the example scripts under $MAHOUT_HOME/examples/bin. Please run through all the different options in each script. Committers and PMC members: --- Need atleast 3 +1 votes from this group for the Release to pass. Thanks and Regards. -- Thanks, Chameera -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: Mahout 0.9 Release - Call for Volunteers
Tested on my mac and a server with ubuntu 12.04 LTS. All tests passed. [INFO] [INFO] Reactor Summary: [INFO] [INFO] Mahout Build Tools SUCCESS [1.964s] [INFO] Apache Mahout . SUCCESS [0.400s] [INFO] Mahout Math ... SUCCESS [1:53.067s] [INFO] Mahout Core ... SUCCESS [9:09.716s] [INFO] Mahout Integration SUCCESS [1:04.662s] [INFO] Mahout Examples ... SUCCESS [3.331s] [INFO] Mahout Release Package SUCCESS [0.000s] [INFO] Mahout Math/Scala wrappers SUCCESS [11.356s] [INFO] [INFO] BUILD SUCCESS [INFO] Regards, Yexi 2014/1/16 Sotiris Salloumis i...@eprice.gr From unix you should try the following with wget or curl, make sure during copy the email client will not wrap it up http://repository.apache.org/content/repositories/orgapachemahout-1002/org/a pache/mahout/mahout-distribution/0.9/mahout-distribution-0.9-src.tar.gz Above link via Google url shortener for easy copy/paste http://goo.gl/gX6xGz Regards Sotiris -Original Message- From: Yexi Jiang [mailto:yexiji...@gmail.com] Sent: Thursday, January 16, 2014 5:59 PM To: mahout Cc: Suneel Marthi; u...@mahout.apache.org; priv...@mahout.apache.org Subject: Re: Mahout 0.9 Release - Call for Volunteers Got the same error. Regards, Yexi 2014/1/16 Chameera Wijebandara chameerawijeband...@gmail.com Hi Suneel, Still it getting 404 error. Thanks, Chameera On Thu, Jan 16, 2014 at 7:11 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: Here's the new URL for Mahout 0.9 Release: https://repository.apache.org/content/repositories/orgapachemahout-100 1/org/apache/mahout/mahout-buildtools/0.9/ For those volunteering to test this, some of the things to be verified: a) Verify that u can unpack the release (tar or zip) b) Verify u r able to compile the distro c) Run through the unit tests: mvn clean test d) Run the example scripts under $MAHOUT_HOME/examples/bin. Please run through all the different options in each script. Committers and PMC members: --- Need atleast 3 +1 votes from this group for the Release to pass. Thanks and Regards. -- Thanks, Chameera -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/ -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864531#comment-13864531 ] Yexi Jiang commented on MAHOUT-1388: [~smarthi] When I submit the patch to the review board, I got the following error: The file 'https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java' (r1556303) could not be found in the repository However, I checked this url, the file exists. I'm not sure what causes this error. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864681#comment-13864681 ] Yexi Jiang commented on MAHOUT-1388: The base I used is - https://svn.apache.org/repos/asf/mahout/trunk - Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1388: --- Status: Patch Available (was: Open) This patch should be applied after apply patch in Mahout-1265. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1388: --- Attachment: Mahout-1388.patch This patch should be applied after apply patch in Mahout-1265. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Labels: mlp, sgd Fix For: 1.0 Attachments: Mahout-1388.patch The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1388: --- Description: The user should have the ability to run the Perceptron from the command line. There are two programs to execute MLP, the training and labeling. The first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters for training are as follows: --input -i (input data) --skipHeader -sk // whether to skip the first row, this parameter is optional --labels -labels // the labels of the instances, separated by whitespace. Take the iris dataset for example, the labels are 'setosa versicolor virginica'. --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --update -u // whether to incremental update the model, if this parameter is not given, train the model from scratch --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use whitespace separated number to indicate the number of neurons in each layer (including input layer and output layer), e.g. '5 3 2'. --squashingFunction -sf // currently only supports Sigmoid --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -i /tmp/training-data.csv -labels setosa versicolor virginica -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. The parameters for labeling is as follows: - --input -i // input file path --columnRange -cr // the range of column used for feature, start from 0 and separated by whitespace, e.g. 0 5 --format -f // the format of input file, currently only supports csv --model -mo // the file path of the model --output -o // the output path for the results - If a user need to use an existing model, it will use the following command: mlp -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. was: The user should have the ability to run the Perceptron from the command line. There are two modes for MLP, the training and labeling, the first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters are as follows: --mode -mo // train or label --input -i (input data) --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --output -o // this is only useful in labeling mode --layersize -ls (no. 
of units per hidden layer) // use comma separated number to indicate the number of neurons in each layer (including input layer and output layer) --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with Minus_Square cost function, 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -mo train -i /tmp/training-data.csv -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 -cf minus_squared This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. If a user need to use an existing model, it will use the following command: mlp -mo label -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Labels: mlp, sgd Fix For: 1.0 The user should have the ability to run
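For reference, the sigmoid squashing function selected by --squashingFunction and one common form of the squared-error cost behind the minus_squared option can be sketched in a few lines of plain Java (illustrative only, not the Mahout implementation):
-
public class SquashingAndCostSketch {

  /** Sigmoid squashing function: maps any activation into (0, 1). */
  static double sigmoid(double x) {
    return 1.0 / (1.0 + Math.exp(-x));
  }

  /** Squared-error cost for one output dimension: 0.5 * (target - output)^2. */
  static double squaredError(double target, double output) {
    double diff = target - output;
    return 0.5 * diff * diff;
  }

  public static void main(String[] args) {
    double activation = 0.3;             // weighted sum arriving at a neuron
    double output = sigmoid(activation); // squashed output, about 0.574
    System.out.println("output = " + output + ", cost vs target 1.0 = "
        + squaredError(1.0, output));
  }
}
-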
[jira] [Commented] (MAHOUT-1388) Add command line support and logging for MLP
[ https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856671#comment-13856671 ] Yexi Jiang commented on MAHOUT-1388: [~smarthi] OK, I'll add it. Currently, it only supports CSV. Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Labels: mlp, sgd Fix For: 1.0 The user should have the ability to run the Perceptron from the command line. There are two modes for MLP, the training and labeling, the first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters are as follows: --mode -mo // train or label --input -i (input data) --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use comma separated number to indicate the number of neurons in each layer (including input layer and output layer) --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with Minus_Square cost function, 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -mo train -i /tmp/training-data.csv -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 -cf minus_squared This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. If a user need to use an existing model, it will use the following command: mlp -mo label -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (MAHOUT-1388) Add command line support and logging for MLP
Yexi Jiang created MAHOUT-1388: -- Summary: Add command line support and logging for MLP Key: MAHOUT-1388 URL: https://issues.apache.org/jira/browse/MAHOUT-1388 Project: Mahout Issue Type: Improvement Components: Classification Affects Versions: 1.0 Reporter: Yexi Jiang Fix For: 1.0 The user should have the ability to run the Perceptron from the command line. There are two modes for MLP, the training and labeling, the first one takes the data as input and outputs the model, the second one takes the model and unlabeled data as input and outputs the results. The parameters are as follows: --mode -mo // train or label --input -i (input data) --model -mo // in training mode, this is the location to store the model (if the specified location has an existing model, it will update the model through incremental learning), in labeling mode, this is the location to store the result --output -o // this is only useful in labeling mode --layersize -ls (no. of units per hidden layer) // use comma separated number to indicate the number of neurons in each layer (including input layer and output layer) --momentum -m --learningrate -l --regularizationweight -r --costfunction -cf // the type of cost function, For example, train a 3-layer (including input, hidden, and output) MLP with Minus_Square cost function, 0.1 learning rate, 0.1 momentum rate, and 0.01 regularization weight, the parameter would be: mlp -mo train -i /tmp/training-data.csv -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 -cf minus_squared This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model. If a user need to use an existing model, it will use the following command: mlp -mo label -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result Moreover, we should be providing default values if the user does not specify any. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1265: --- Attachment: Mahout-1265-17.patch The version 17. Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: MAHOUT-1265.patch, Mahout-1265-13.patch, Mahout-1265-17.patch Design of multilayer perceptron 1. Motivation A multilayer perceptron (MLP) is a kind of feed forward artificial neural network, which is a mathematical model inspired by the biological neural network. The multilayer perceptron can be used for various machine learning tasks such as classification and regression. It is helpful if it can be included in mahout. 2. API The design goal of API is to facilitate the usage of MLP for user, and make the implementation detail user transparent. The following is an example code of how user uses the MLP. - // set the parameters double learningRate = 0.5; double momentum = 0.1; int[] layerSizeArray = new int[] {2, 5, 1}; String costFuncName = “SquaredError”; String squashingFuncName = “Sigmoid”; // the location to store the model, if there is already an existing model at the specified location, MLP will throw exception URI modelLocation = ... MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, modelLocation); mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...); // the user can also load an existing model with given URI and update the model with new training data, if there is no existing model at the specified location, an exception will be thrown /* MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, modelLocation); */ URI trainingDataLocation = … // the detail of training is transparent to the user, it may running in a single machine or in a distributed environment mlp.train(trainingDataLocation); // user can also train the model with one training instance in stochastic gradient descent way Vector trainingInstance = ... mlp.train(trainingInstance); // prepare the input feature Vector inputFeature … // the semantic meaning of the output result is defined by the user // in general case, the dimension of output vector is 1 for regression and two-class classification // the dimension of output vector is n for n-class classification (n 2) Vector outputVector = mlp.output(inputFeature); - 3. Methodology The output calculation can be easily implemented with feed-forward approach. Also, the single machine training is straightforward. The following will describe how to train MLP in distributed way with batch gradient descent. The workflow is illustrated as the below figure. https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720 For the distributed training, each training iteration is divided into two steps, the weight update calculation step and the weight update step. The distributed MLP can only be trained in batch-update approach. 3.1 The partial weight update calculation step: This step trains the MLP distributedly. Each task will get a copy of the MLP model, and calculate the weight update with a partition of data. 
Suppose the training error is E(w) = ½ \sum_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label, and y_d denotes the output of the MLP. Also, suppose the sigmoid function is used as the squashing function, squared error is used as the cost function, t_i denotes the target value for the ith dimension of the output layer, o_i denotes the actual output for the ith dimension of the output layer, l denotes the learning rate, and w_{ij} denotes the weight between the jth neuron in the previous layer and the ith neuron in the next layer. The weight of each edge is updated as \Delta w_{ij} = l * (1/m) * \delta_j * o_i, where \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * (t_j^{(m)} - o_j^{(m)}) for the output layer, and \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * \sum_k \delta_k * w_{jk} for a hidden layer. It follows that \delta_j can be rewritten as \delta_j = - \sum_{i = 1}^{k} \sum_{m_i} o_j^{(m_i)} * (1 - o_j^{(m_i)}) * (t_j^{(m_i)} - o_j^{(m_i)}). The above equation indicates that \delta_j can be divided into k parts. So for the implementation, each mapper can calculate part of \delta_j with a given partition of data, and then store the result
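The decomposition above is what makes the batch update parallelizable: each partition contributes an independent partial sum of the per-instance terms, and the partial sums only need to be added before the weight update is applied. A minimal self-contained Java sketch of that structure (hypothetical values, a single output neuron, sigmoid squashing and squared error as assumed above, sign conventions simplified):
-
public class PartialDeltaSketch {

  /** Per-instance term from the formula above: o * (1 - o) * (t - o). */
  static double instanceDelta(double output, double target) {
    return output * (1.0 - output) * (target - output);
  }

  /** One mapper's share: sum the contributions of its data partition. */
  static double partialDelta(double[][] partition) { // rows of {output, target}
    double sum = 0.0;
    for (double[] row : partition) {
      sum += instanceDelta(row[0], row[1]);
    }
    return sum;
  }

  public static void main(String[] args) {
    // two partitions of a tiny "batch"; the values are made up for illustration
    double[][] partition1 = {{0.8, 1.0}, {0.3, 0.0}};
    double[][] partition2 = {{0.6, 1.0}};
    double delta = partialDelta(partition1) + partialDelta(partition2);
    // the weight update then follows Delta w_ij = l * (1/m) * delta_j * o_i
    double learningRate = 0.1;
    int m = 3;       // total number of instances in the batch
    double oi = 0.5; // output of the connected neuron in the previous layer
    double deltaWij = learningRate * (1.0 / m) * delta * oi;
    System.out.println("delta_j = " + delta + ", delta w_ij = " + deltaWij);
  }
}
-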
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13853303#comment-13853303 ] Yexi Jiang commented on MAHOUT-1265: Great. I am thinking a mapreduce version of MLP. It may take a non-trivial amount of time. Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Assignee: Suneel Marthi Labels: machine_learning, neural_network Fix For: 0.9 Attachments: MAHOUT-1265.patch, Mahout-1265-17.patch Design of multilayer perceptron 1. Motivation A multilayer perceptron (MLP) is a kind of feed forward artificial neural network, which is a mathematical model inspired by the biological neural network. The multilayer perceptron can be used for various machine learning tasks such as classification and regression. It is helpful if it can be included in mahout. 2. API The design goal of API is to facilitate the usage of MLP for user, and make the implementation detail user transparent. The following is an example code of how user uses the MLP. - // set the parameters double learningRate = 0.5; double momentum = 0.1; int[] layerSizeArray = new int[] {2, 5, 1}; String costFuncName = “SquaredError”; String squashingFuncName = “Sigmoid”; // the location to store the model, if there is already an existing model at the specified location, MLP will throw exception URI modelLocation = ... MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, modelLocation); mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...); // the user can also load an existing model with given URI and update the model with new training data, if there is no existing model at the specified location, an exception will be thrown /* MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, modelLocation); */ URI trainingDataLocation = … // the detail of training is transparent to the user, it may running in a single machine or in a distributed environment mlp.train(trainingDataLocation); // user can also train the model with one training instance in stochastic gradient descent way Vector trainingInstance = ... mlp.train(trainingInstance); // prepare the input feature Vector inputFeature … // the semantic meaning of the output result is defined by the user // in general case, the dimension of output vector is 1 for regression and two-class classification // the dimension of output vector is n for n-class classification (n 2) Vector outputVector = mlp.output(inputFeature); - 3. Methodology The output calculation can be easily implemented with feed-forward approach. Also, the single machine training is straightforward. The following will describe how to train MLP in distributed way with batch gradient descent. The workflow is illustrated as the below figure. https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720 For the distributed training, each training iteration is divided into two steps, the weight update calculation step and the weight update step. The distributed MLP can only be trained in batch-update approach. 3.1 The partial weight update calculation step: This step trains the MLP distributedly. Each task will get a copy of the MLP model, and calculate the weight update with a partition of data. 
Suppose the training error is E(w) = ½ \sum_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label, and y_d denotes the output of the MLP. Also, suppose the sigmoid function is used as the squashing function, squared error is used as the cost function, t_i denotes the target value for the ith dimension of the output layer, o_i denotes the actual output for the ith dimension of the output layer, l denotes the learning rate, and w_{ij} denotes the weight between the jth neuron in the previous layer and the ith neuron in the next layer. The weight of each edge is updated as \Delta w_{ij} = l * (1/m) * \delta_j * o_i, where \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * (t_j^{(m)} - o_j^{(m)}) for the output layer, and \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * \sum_k \delta_k * w_{jk} for a hidden layer. It follows that \delta_j can be rewritten as \delta_j = - \sum_{i = 1}^{k} \sum_{m_i} o_j^{(m_i)} * (1 - o_j^{(m_i)}) * (t_j^{(m_i)} - o_j^{(m_i)}). The above equation indicates that \delta_j can be divided into k parts. So
Re: Review Request 13406: mahout-1265: add multilayer perceptron.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13406/ --- (Updated Dec. 19, 2013, 12:31 a.m.) Review request for mahout and Ted Dunning. Changes --- I have formatted the code to make it looks better. Repository: mahout Description --- mahout-1265: add multilayer perceptron. For details, please refer to https://issues.apache.org/jira/browse/MAHOUT-1265. Diffs (updated) - https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java PRE-CREATION Diff: https://reviews.apache.org/r/13406/diff/ Testing --- Please see the corresponding test cases Thanks, Yexi Jiang
Re: Review Request 13406: mahout-1265: add multilayer perceptron.
On Dec. 17, 2013, 7:16 p.m., Suneel Marthi wrote: General comment: Fix the Javadocs in code, using IntelliJ should identify most of these issues. Thank you very much for your patient review! I learnt a lot and will not make these mistakes in future issues. - Yexi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13406/#review30556 --- On Dec. 9, 2013, 7:57 p.m., Yexi Jiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13406/ --- (Updated Dec. 9, 2013, 7:57 p.m.) Review request for mahout and Ted Dunning. Repository: mahout Description --- mahout-1265: add multilayer perceptron. For details, please refer to https://issues.apache.org/jira/browse/MAHOUT-1265. Diffs - https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java PRE-CREATION Diff: https://reviews.apache.org/r/13406/diff/ Testing --- Please see the corresponding test cases Thanks, Yexi Jiang
Re: Review Request 13406: mahout-1265: add multilayer perceptron.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13406/ --- (Updated Dec. 17, 2013, 8:18 p.m.) Review request for mahout and Ted Dunning. Changes --- Thank you very much for your patient review! The code has been revised according to the comments. I learnt a lot and will not make these mistakes in future issues. Repository: mahout Description --- mahout-1265: add multilayer perceptron. For details, please refer to https://issues.apache.org/jira/browse/MAHOUT-1265. Diffs (updated) - https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java PRE-CREATION https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java PRE-CREATION Diff: https://reviews.apache.org/r/13406/diff/ Testing --- Please see the corresponding test cases Thanks, Yexi Jiang
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851223#comment-13851223 ] Yexi Jiang commented on MAHOUT-1265: I have applied the patch to my local code base and tested it. It works without any error. Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: MAHOUT-1265.patch, Mahout-1265-13.patch Design of multilayer perceptron 1. Motivation A multilayer perceptron (MLP) is a kind of feed forward artificial neural network, which is a mathematical model inspired by the biological neural network. The multilayer perceptron can be used for various machine learning tasks such as classification and regression. It is helpful if it can be included in mahout. 2. API The design goal of API is to facilitate the usage of MLP for user, and make the implementation detail user transparent. The following is an example code of how user uses the MLP. - // set the parameters double learningRate = 0.5; double momentum = 0.1; int[] layerSizeArray = new int[] {2, 5, 1}; String costFuncName = “SquaredError”; String squashingFuncName = “Sigmoid”; // the location to store the model, if there is already an existing model at the specified location, MLP will throw exception URI modelLocation = ... MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, modelLocation); mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...); // the user can also load an existing model with given URI and update the model with new training data, if there is no existing model at the specified location, an exception will be thrown /* MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, modelLocation); */ URI trainingDataLocation = … // the detail of training is transparent to the user, it may running in a single machine or in a distributed environment mlp.train(trainingDataLocation); // user can also train the model with one training instance in stochastic gradient descent way Vector trainingInstance = ... mlp.train(trainingInstance); // prepare the input feature Vector inputFeature … // the semantic meaning of the output result is defined by the user // in general case, the dimension of output vector is 1 for regression and two-class classification // the dimension of output vector is n for n-class classification (n 2) Vector outputVector = mlp.output(inputFeature); - 3. Methodology The output calculation can be easily implemented with feed-forward approach. Also, the single machine training is straightforward. The following will describe how to train MLP in distributed way with batch gradient descent. The workflow is illustrated as the below figure. https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720 For the distributed training, each training iteration is divided into two steps, the weight update calculation step and the weight update step. The distributed MLP can only be trained in batch-update approach. 3.1 The partial weight update calculation step: This step trains the MLP distributedly. Each task will get a copy of the MLP model, and calculate the weight update with a partition of data. 
Suppose the training error is E(w) = ½ \sum_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label, and y_d denotes the output of the MLP. Also, suppose the sigmoid function is used as the squashing function, squared error is used as the cost function, t_i denotes the target value for the ith dimension of the output layer, o_i denotes the actual output for the ith dimension of the output layer, l denotes the learning rate, and w_{ij} denotes the weight between the jth neuron in the previous layer and the ith neuron in the next layer. The weight of each edge is updated as \Delta w_{ij} = l * (1/m) * \delta_j * o_i, where \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * (t_j^{(m)} - o_j^{(m)}) for the output layer, and \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * \sum_k \delta_k * w_{jk} for a hidden layer. It follows that \delta_j can be rewritten as \delta_j = - \sum_{i = 1}^{k} \sum_{m_i} o_j^{(m_i)} * (1 - o_j^{(m_i)}) * (t_j^{(m_i)} - o_j^{(m_i)}). The above equation indicates that \delta_j can be divided into k parts. So for the implementation, each mapper can calculate part
Re: Mahout 0.9 release
deferred for last 2 Mahout releases. M-1319, M-1328, M-1347, M-1350 - Suneel M-1265 - Multi Layer Perceptron, Yexi please look at my comments on Reviewboard. M-1273 - Kun Yung, Ted, defer this to next release ??? M-1312, M-1256 - Stevo, could u take one of them On Thursday, November 28, 2013 5:01 AM, Isabel Drost-Fromm isa...@apache.org wrote: On Wed, 27 Nov 2013 14:23:11 -0800 (PST) Suneel Marthi suneel_mar...@yahoo.com wrote: Below are the Open issues for 0.9:- This looks like we should be targeting Dec. 9th as code freeze to me. What do you all think? Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 - All related to Wiki updates, missing Wiki documentation and Wiki migration to new CMS. Isabel's working on M-1245 (migrating to new CMS). Could some of the others be consolidated with that? I believe MAHOUT-1245 essentially is ready to be published - all I want before notifying INFRA to switch to the new cms based site is one other person to take at least a brief look. For MAHOUT-1304 - Sebastian, can you please check that the cms based site actually does fit on 1280px? We can close this issue then. MAHOUT-1305 - I think this should be turned into a task to actually delete most of the pages that have been migrated to the new CMS (almost all of them). Once 1245 is shipped, it would be great if a few more people could lend a hand in getting this done. MAHOUT-1307 - Can be closed once switched to CMS MAHOUT-1326 - This really relates to the old Confluence export plugin we once have been using to generate static pages out of our wiki that is no longer active. Unless anyone on the Mahout dev list knows how to fully delete all exported static pages we should file an issue with INFRA to ask for help getting those deleted. They definitely are confusing to users. M-1286 - Peng and ssc, we had talked about this during the last hangout. Can this be included in 0.9? M-1030 - Andrew Musselman? Any updates on this, its important that we fix this for 0.9 M-1319, M-1328, M-1347, M-1364 - Suneel M-1273 - Kun Yung, remember talking about this in one of the earlier hangouts; can't recall what was decided? M-1312, M-1256 - Dan Filimon (or Stevo??) M-996 someone could pick this up (if its still relevant with present codebase i.e.) I think this can move to the next release - according to the contributor and Sebastian the patch is rather hacky and there for illustration purposes only. I'd rather see some more thought go into that instead of pushing to have this in 0.9. M-1265 Yexi had submitted a patch for this, it would be good if this could go in as part of 0.9 M-1288 Solr Recommender - Pat Ferrell M-1285: Any takers for this? Would be nice to have - in particular if someone on dev@ (not necessarily a committer) wants to get started with the code base. Otherwise I'd say fix for next release if time gets short. M-1356: Isabel's started on this, Stevo could u review this? We definitely can punt that for the next release or even thereafter. It would be great if someone who has some knowledge of Java security policies would take a look. The implication of not fixing this essentially is that in case someone commits test code that writes outside of target or to some globally shared directory we might end up having randomly failing tests due to the parallel setup again. But as these will occur shortly after the commit it should be easy enough to find the code change that caused the breakage. M-1329: Support for Hadoop 2 Is that truly feasable within a week? 
M-1366: Stevo, Isabel This should be done as part of the release process by release manager at the latest. M-1261: Sebastian??? M-1309, M-1310, M-1311, M-1316 - all related to running Mahout on Windows ?? I'm not aware of us supporting Windows. M-1350 - Any takers?? (Stevo??) To me this looks like a broken classpath on the user side. Without a patch to at least re-produce the issue I wouldn't spend too much time on this. Isabel -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: Mahout 0.9 release
I have updated the code based on the previous feedback. I am now waiting to know whether the code is shipable. 2013/12/16 Suneel Marthi suneel_mar...@yahoo.com Waiting on u to provide an updated patch based on the feedback on Reviewboard? On Monday, December 16, 2013 4:14 PM, Yexi Jiang yexiji...@gmail.com wrote: What about M-1265? 2013/12/16 Suneel Marthi suneel_mar...@yahoo.com Its time to freeze trunk the this week, here's the status of JIRAs:- Suneel -- M-1319 - Patch available, would appreciate if someone could review/test the patch before I commit to trunk. Pat - M-1288 Solr Recommender Pat, I see that you have the code in ur Github repo, could u create a patch that could be merged into Mahout trunk. Frank M-1364 (Upgrade to Lucene 4.6) - Patch available. Grant, do u have cycles to review this patch? Gokhan -- M-1354 (Support for Hadoop 2.x) - Patch available. Gokhan, any updates on this. On Sunday, December 8, 2013 6:23 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: We need to freeze the trunk this coming week in preparation for 0.9 release, below are the pending JIRAs:- Wiki (not a show stopper for 0.9) - M-1245, M-1304, M-1305, M-1307, M-1326 Suneel --- M-1319 (i can work on this tomorrow) M-1265 (Multi Layer Perceptron) - Need to be merged into trunk, the code's available for review on ReviewBoard. It would help if another set of eyes reviewed the test cases (Isabel, Stevo.. ?) Pat M-1288 Solr Recommender (What's the status of this Pat, this needs to be in 0.9 Release.) Stevo --- M-1366 (this can be at time of 0.9 Release and has no impact on trunk) Frank M-1364 (Upgrade to Lucene 4.6) - Patch available. It would be nice to have this go in 0.9 The patch worked for me Frank, I agree that this needs to be reviewed by someone who's more familiar with Lucene. Gokhan -- M-1354 (Support for Hadoop 2.x) - Patch available. This is targeted for 1.0. The patch worked for me on Hadoop 1.2.1, it would be good if someone could try the patch on hadoop 2.x instance. Others -- M-1371 - This was reported on @user and a patch was submitted. If we don't hear from the author within this week, this can be deferred to 1.0 On Tuesday, December 3, 2013 8:13 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: JIRAs Update for 0.9 release:- Wiki - Isabel, Sebastian and other volunteers - M-1245, M-1304, M-1305, M-1307, M-1326 Suneel --- M-1319 M-1242 (Patch available to be committed to trunk) Pat --- M-1288 Solr Recommender Yexi, Suneel --- M-1265 - Multi Layer Perceptron Stevo, Isabel - M-1366 Andrew -- M-1030, M-1349 Ted -- M-1368 (Patch available to be committed to trunk) On Sunday, December 1, 2013 7:57 AM, Suneel Marthi suneel_mar...@yahoo.com wrote: Open JIRAs for 0.9 release :- Wiki - Isabel, Sebastian and other volunteers - M-1245, M-1304, M-1305, M-1307, M-1326 Suneel --- M-1319, M-1328 Pat --- M-1288 Solr Recommender Sebastian, Peng M-1286 Yexi, Suneel --- M-1265 - Multi Layer Perceptron Ted, do u have cycles to review this, the patch's up on Reviewboard. Stevo, Isabel - M-1366 - Please delete old releases from mirroring system M-1345 - Enable Randomized testing for all modules Andrew -- M-1030 Open Issues (any takers for these ???) M-1242 M-1349 On Friday, November 29, 2013 12:07 PM, Sebastian Schelter ssc.o...@googlemail.com wrote: On 29.11.2013 17:59, Suneel Marthi wrote: Open JIRAs for 0.9: Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 - related to Wiki updates. Definitely appreciate more hands here to review/update the wiki M-1286 - Peng and Sebastian, no updates on this. 
Can this be included in 0.9? I will look into this over the weekend! M-1030 - Andrew Musselman M-1319, M-1328 - Suneel M-1347 - Suneel, patch has been committed to trunk. M-1265 - I have been working with Yexi on this. Ted, would u have time to review this; the code's on Reviewboard. M-1288 - Sole Recommender, Pat Ferrel M-1345: Isabel, Frank. I think we are good on this patch. Isabel, could u commit this to trunk? M-1312: Stevo, could u look at this? M-1349: Any takers for this?? Others: Spectral Kmeans clustering documentation (Shannon) On Thursday, November 28, 2013 10:38 AM, Suneel Marthi suneel_mar...@yahoo.com wrote: Adding Mahout-1349 to the list of JIRAs . On Thursday, November 28, 2013 10:37 AM, Suneel Marthi suneel_mar...@yahoo.com wrote: Update on Open JIRAs for 0.9
[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1265: --- Attachment: Mahout-1265-11.patch This is the final version of the patch. It has been reviewed by [~smarthi]. Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: Mahout-1265-11.patch, Mahout-1265-6.patch, mahout-1265.patch Design of multilayer perceptron 1. Motivation A multilayer perceptron (MLP) is a kind of feed forward artificial neural network, which is a mathematical model inspired by the biological neural network. The multilayer perceptron can be used for various machine learning tasks such as classification and regression. It is helpful if it can be included in mahout. 2. API The design goal of API is to facilitate the usage of MLP for user, and make the implementation detail user transparent. The following is an example code of how user uses the MLP. - // set the parameters double learningRate = 0.5; double momentum = 0.1; int[] layerSizeArray = new int[] {2, 5, 1}; String costFuncName = “SquaredError”; String squashingFuncName = “Sigmoid”; // the location to store the model, if there is already an existing model at the specified location, MLP will throw exception URI modelLocation = ... MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, modelLocation); mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...); // the user can also load an existing model with given URI and update the model with new training data, if there is no existing model at the specified location, an exception will be thrown /* MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, modelLocation); */ URI trainingDataLocation = … // the detail of training is transparent to the user, it may running in a single machine or in a distributed environment mlp.train(trainingDataLocation); // user can also train the model with one training instance in stochastic gradient descent way Vector trainingInstance = ... mlp.train(trainingInstance); // prepare the input feature Vector inputFeature … // the semantic meaning of the output result is defined by the user // in general case, the dimension of output vector is 1 for regression and two-class classification // the dimension of output vector is n for n-class classification (n 2) Vector outputVector = mlp.output(inputFeature); - 3. Methodology The output calculation can be easily implemented with feed-forward approach. Also, the single machine training is straightforward. The following will describe how to train MLP in distributed way with batch gradient descent. The workflow is illustrated as the below figure. https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720 For the distributed training, each training iteration is divided into two steps, the weight update calculation step and the weight update step. The distributed MLP can only be trained in batch-update approach. 3.1 The partial weight update calculation step: This step trains the MLP distributedly. Each task will get a copy of the MLP model, and calculate the weight update with a partition of data. 
Suppose the training error is E(w) = ½ \sum_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label, and y_d denotes the output of the MLP. Also, suppose the sigmoid function is used as the squashing function, squared error is used as the cost function, t_i denotes the target value for the ith dimension of the output layer, o_i denotes the actual output for the ith dimension of the output layer, l denotes the learning rate, and w_{ij} denotes the weight between the jth neuron in the previous layer and the ith neuron in the next layer. The weight of each edge is updated as \Delta w_{ij} = l * (1/m) * \delta_j * o_i, where \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * (t_j^{(m)} - o_j^{(m)}) for the output layer, and \delta_j = - \sum_{m} o_j^{(m)} * (1 - o_j^{(m)}) * \sum_k \delta_k * w_{jk} for a hidden layer. It follows that \delta_j can be rewritten as \delta_j = - \sum_{i = 1}^{k} \sum_{m_i} o_j^{(m_i)} * (1 - o_j^{(m_i)}) * (t_j^{(m_i)} - o_j^{(m_i)}). The above equation indicates that \delta_j can be divided into k parts. So for the implementation, each mapper can calculate part of \delta_j
Re: [jira] [Commented] (MAHOUT-1307) Distinguish implemented algorithms from algorithms which may be implemented in the future in algorithms page
It seems that some of the info on that page is out-dated. 2013/12/8 Ajay Bhat (JIRA) j...@apache.org [ https://issues.apache.org/jira/browse/MAHOUT-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842449#comment-13842449] Ajay Bhat commented on MAHOUT-1307: --- Hi [~yamakatu], I'd like to help with this issue. But it seems I don't have permission to edit the page? Distinguish implemented algorithms from algorithms which may be implemented in the future in algorithms page Key: MAHOUT-1307 URL: https://issues.apache.org/jira/browse/MAHOUT-1307 Project: Mahout Issue Type: Documentation Components: Website Affects Versions: 0.8 Reporter: yamakatu Priority: Minor Fix For: 0.9 In case of the description of the Mahout algorithms web page, (https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms) the algorithms which may be implemented in the future are easy to be confused with the already implemented algorithms, and I think that it is difficult to recognize both intuitively. I think that both algorithms should be distinguished more clearly. -- This message was sent by Atlassian JIRA (v6.1#6144) -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1265: --- Attachment: Mahout-1265-6.patch

This is the final version of the patch. It has been reviewed by [~smarthi].

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: Mahout-1265-6.patch, mahout-1265.patch
Re: Review Request 13406: mahout-1265: add multilayer perceptron.
Suneel, If this does not work, which location is the safe place to put the temporary file? Regards, Yexi 2013/12/2 Suneel Marthi suneel.mar...@gmail.com Yexi, The tests have to be redone in light of recent changes for m-1345. We shouldn't be writing to /tmp anymore which is gonna fail the tests. More later. Sent from my iPhone On Dec 2, 2013, at 6:25 PM, Yexi Jiang yexiji...@gmail.com wrote: This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13406/ Review request for mahout and Ted Dunning. By Yexi Jiang. *Updated Dec. 2, 2013, 11:25 p.m.* Changes I have updated the code according to the comments. *Repository: * mahout Description mahout-1265: add multilayer perceptron. For details, please refer to https://issues.apache.org/jira/browse/MAHOUT-1265. Testing Please see the corresponding test cases Diffs (updated) - https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/MultilayerPerceptron.java (PRE-CREATION) - https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetwork.java (PRE-CREATION) - https://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/classifier/mlp/NeuralNetworkFunctions.java (PRE-CREATION) - https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestMultilayerPerceptron.java (PRE-CREATION) - https://svn.apache.org/repos/asf/mahout/trunk/core/src/test/java/org/apache/mahout/classifier/mlp/TestNeuralNetwork.java (PRE-CREATION) View Diff https://reviews.apache.org/r/13406/diff/ -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
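As a side note on the /tmp question above, one common way to keep such tests hermetic is to let JUnit manage a temporary directory and write the serialized model there. The following is a generic JUnit 4 sketch under that assumption; it is not the actual Mahout test harness or the changes made for M-1345, and the class and file names are illustrative only.

-
import java.io.File;
import java.io.IOException;

import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

import static org.junit.Assert.assertTrue;

public class ModelLocationTest {

  // JUnit creates this folder before each test and deletes it afterwards,
  // so nothing is ever written to a shared location such as /tmp.
  @Rule
  public final TemporaryFolder tempFolder = new TemporaryFolder();

  @Test
  public void modelFileIsWrittenUnderManagedTempDir() throws IOException {
    File modelFile = tempFolder.newFile("mlp-model.bin");
    // ... train the network and serialize it to modelFile.getAbsolutePath() ...
    assertTrue(modelFile.getAbsolutePath().startsWith(tempFolder.getRoot().getAbsolutePath()));
  }
}
-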
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13834959#comment-13834959 ] Yexi Jiang commented on MAHOUT-1265: OK, I'll revise it accordingly.

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
Re: Mahout 0.9 release
I am working on M-1265.

2013/11/28 Suneel Marthi suneel_mar...@yahoo.com

Update on Open JIRAs for 0.9: Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 - all related to Wiki updates, please see Isabel's updates. M-1286 - Peng and Sebastian, we had talked about this during the last hangout. Can this be included in 0.9? M-1030 - Andrew Musselman, it's critical that we get this into 0.9; it's been deferred for the last 2 Mahout releases. M-1319, M-1328, M-1347, M-1350 - Suneel. M-1265 - Multi Layer Perceptron, Yexi please look at my comments on Reviewboard. M-1273 - Kun Yung, Ted, defer this to next release??? M-1312, M-1256 - Stevo, could u take one of them?

On Thursday, November 28, 2013 5:01 AM, Isabel Drost-Fromm isa...@apache.org wrote:

On Wed, 27 Nov 2013 14:23:11 -0800 (PST) Suneel Marthi suneel_mar...@yahoo.com wrote: Below are the open issues for 0.9:

This looks like we should be targeting Dec. 9th as code freeze to me. What do you all think?

Mahout-1245, Mahout-1304, Mahout-1305, Mahout-1307, Mahout-1326 - All related to Wiki updates, missing Wiki documentation and Wiki migration to the new CMS. Isabel's working on M-1245 (migrating to new CMS). Could some of the others be consolidated with that? I believe MAHOUT-1245 essentially is ready to be published - all I want before notifying INFRA to switch to the new CMS-based site is one other person to take at least a brief look. For MAHOUT-1304 - Sebastian, can you please check that the CMS-based site actually does fit on 1280px? We can close this issue then. MAHOUT-1305 - I think this should be turned into a task to actually delete most of the pages that have been migrated to the new CMS (almost all of them). Once 1245 is shipped, it would be great if a few more people could lend a hand in getting this done. MAHOUT-1307 - Can be closed once switched to CMS. MAHOUT-1326 - This really relates to the old Confluence export plugin we once have been using to generate static pages out of our wiki that is no longer active. Unless anyone on the Mahout dev list knows how to fully delete all exported static pages, we should file an issue with INFRA to ask for help getting those deleted. They definitely are confusing to users.

M-1286 - Peng and ssc, we had talked about this during the last hangout. Can this be included in 0.9? M-1030 - Andrew Musselman? Any updates on this? It's important that we fix this for 0.9. M-1319, M-1328, M-1347, M-1364 - Suneel. M-1273 - Kun Yung, remember talking about this in one of the earlier hangouts; can't recall what was decided? M-1312, M-1256 - Dan Filimon (or Stevo??). M-996 - someone could pick this up (if it's still relevant with the present codebase, i.e.). I think this can move to the next release - according to the contributor and Sebastian the patch is rather hacky and there for illustration purposes only. I'd rather see some more thought go into that instead of pushing to have this in 0.9. M-1265 - Yexi had submitted a patch for this; it would be good if this could go in as part of 0.9. M-1288 Solr Recommender - Pat Ferrell. M-1285 - Any takers for this? Would be nice to have - in particular if someone on dev@ (not necessarily a committer) wants to get started with the code base. Otherwise I'd say fix for next release if time gets short. M-1356 - Isabel's started on this, Stevo could u review this? We definitely can punt that for the next release or even thereafter. It would be great if someone who has some knowledge of Java security policies would take a look.

The implication of not fixing this essentially is that in case someone commits test code that writes outside of target or to some globally shared directory, we might end up having randomly failing tests due to the parallel setup again. But as these will occur shortly after the commit, it should be easy enough to find the code change that caused the breakage. M-1329: Support for Hadoop 2 - Is that truly feasible within a week? M-1366: Stevo, Isabel - This should be done as part of the release process by the release manager at the latest. M-1261: Sebastian??? M-1309, M-1310, M-1311, M-1316 - all related to running Mahout on Windows?? I'm not aware of us supporting Windows. M-1350 - Any takers?? (Stevo??) To me this looks like a broken classpath on the user side. Without a patch to at least reproduce the issue I wouldn't spend too much time on this.

Isabel

--
--
Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-976) Implement Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825355#comment-13825355 ] Yexi Jiang commented on MAHOUT-976: --- [MAHOUT-1265|https://issues.apache.org/jira/browse/MAHOUT-1265] is actually a new implementation of MLP based on Ted's comments. For example, users can freely configure each layer by setting the number of neurons and the squashing function. Users can also set the kind of cost function and parameters such as the learning rate, momentum weight, and so on.

Implement Multilayer Perceptron --- Key: MAHOUT-976 URL: https://issues.apache.org/jira/browse/MAHOUT-976 Project: Mahout Issue Type: New Feature Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Priority: Minor Labels: multilayer, networks, neural, perceptron Fix For: Backlog Attachments: MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch Original Estimate: 80h Remaining Estimate: 80h

Implement a multilayer perceptron
* via matrix multiplication
* learning by backpropagation; implementing tricks by Yann LeCun et al.: Efficient Backprop
* arbitrary number of hidden layers (also 0 - just the linear model)
* connections between proximate layers only
* different cost and activation functions (different activation function in each layer)
* test of backprop by gradient checking
* normalization of the inputs (storeable) as part of the model

First:
* implementation of stochastic gradient descent, like gradient machine
* simple gradient descent incl. momentum

Later (new JIRA issues):
* distributed batch learning (see below)
* Stacked (Denoising) Autoencoder - feature learning
* advanced cost minimization like 2nd order methods, conjugate gradient, etc.

Distribution of learning can be done by (batch learning):
1. Partitioning of the data into x chunks
2. Learning the weight changes as matrices in each chunk
3. Combining the matrices and updating the weights - back to 2

Maybe this procedure can be done with random parts of the chunks (distributed quasi-online learning). Batch learning with delta-bar-delta heuristics for adapting the learning rates.

-- This message was sent by Atlassian JIRA (v6.1#6144)
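The "test of backprop by gradient checking" item above is worth illustrating. Below is a minimal, self-contained sketch (plain Java, not from any Mahout patch; all values are illustrative) that compares the analytic backprop gradient of a single sigmoid unit with squared-error cost against a central-difference estimate, which is the essence of a gradient check.

-
// Gradient checking for one sigmoid unit with squared-error cost:
//   E(w) = 0.5 * (sigmoid(w . x) - t)^2
// Analytic (backprop) gradient: dE/dw_i = (o - t) * o * (1 - o) * x_i
public class GradientCheckSketch {

  static double sigmoid(double z) {
    return 1.0 / (1.0 + Math.exp(-z));
  }

  static double cost(double[] w, double[] x, double t) {
    double z = 0.0;
    for (int i = 0; i < w.length; i++) {
      z += w[i] * x[i];
    }
    double o = sigmoid(z);
    return 0.5 * (o - t) * (o - t);
  }

  public static void main(String[] args) {
    double[] w = {0.3, -0.7, 0.1};
    double[] x = {1.0, 0.5, -2.0};
    double t = 1.0;
    double eps = 1e-6;

    double z = 0.0;
    for (int i = 0; i < w.length; i++) {
      z += w[i] * x[i];
    }
    double o = sigmoid(z);

    for (int i = 0; i < w.length; i++) {
      // analytic gradient from the backprop rule
      double analytic = (o - t) * o * (1.0 - o) * x[i];

      // numerical gradient via central differences
      double[] wPlus = w.clone();
      double[] wMinus = w.clone();
      wPlus[i] += eps;
      wMinus[i] -= eps;
      double numeric = (cost(wPlus, x, t) - cost(wMinus, x, t)) / (2.0 * eps);

      System.out.printf("dE/dw[%d]: analytic=%.8f numeric=%.8f%n", i, analytic, numeric);
    }
  }
}
-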
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787462#comment-13787462 ] Yexi Jiang commented on MAHOUT-1265: Is there any news?

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
Re: You are invited to Apache Mahout meet-up
A great event. I wish I were in Bay area. 2013/8/22 Shannon Quinn squ...@gatech.edu I'm only sorry I'm not in the Bay area. Sounds great! On 8/22/13 3:38 AM, Stevo Slavić wrote: Retweeted meetup invite. Have fun! Kind regards, Stevo Slavic. On Thu, Aug 22, 2013 at 8:34 AM, Ted Dunning ted.dunn...@gmail.com wrote: Very cool. Would love to see folks turn out for this. On Wed, Aug 21, 2013 at 9:38 PM, Ellen Friedman b.ellen.fried...@gmail.com**wrote: The Apache Mahout user group has been re-activated. If you are in the Bay Area in California, join us on Aug 27 (Redwood City). Sebastian Schelter will be the main speaker, talking about new directions with Mahout recommendation. Grant Ingersoll, Ted Dunning and I be there to do a short introduction for the meet-up and update on the 0.8 release. Here's the link to rsvp: http://bit.ly/16K32hg Hope you can come, and please spread the word. Ellen -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732836#comment-13732836 ] Yexi Jiang commented on MAHOUT-1265: Is there anyone who can review the code? The sample code for using it can be seen in the test cases. [~tdunning] Could you please give any comments?

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733110#comment-13733110 ] Yexi Jiang commented on MAHOUT-1265: [~smarthi] Done, please refer to https://reviews.apache.org/r/13406/. Thank you.

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
Re: Hangout on Monday
Can anyone join? 2013/8/5 DB Tsai dbt...@dbtsai.com Can we get the google hangout link? Kun and I don't get the invitation. Sincerely, DB Tsai --- Web: http://www.dbtsai.com Phone : +1-650-383-8392 On Mon, Aug 5, 2013 at 3:54 PM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. Max of 10. On Mon, Aug 5, 2013 at 3:53 PM, Nyoman Ribeka nyoman.rib...@gmail.com wrote: I think hangout have a maximum of 10 participants. Watching the youtube means you're passively participating. On Mon, Aug 5, 2013 at 6:51 PM, Sebastian Schelter s...@apache.org wrote: Is the link only for watching or also for participation? Never did a hangout before :) 2013/8/5 Andrew Musselman andrew.mussel...@gmail.com Can't make it alas On Mon, Aug 5, 2013 at 3:12 PM, Michael Kun Yang kuny...@stanford.edu wrote: what's the addr of the hangout? On Sun, Aug 4, 2013 at 10:37 AM, Peng Cheng pc...@uowmail.edu.au wrote: Nice, I'll be there. On 13-08-03 02:51 PM, Andrew Musselman wrote: Sounds good On Sat, Aug 3, 2013 at 12:04 AM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. 1600 PDT I got that right in the linked doc, just not on the more important email. On Fri, Aug 2, 2013 at 3:30 PM, Andrew Psaltis andrew.psal...@webtrends.com wrote: On 8/2/13 4:42 PM, Ted Dunning ted.dunn...@gmail.com wrote: Let's have the hangout at 1600 on Monday, August 5th. Maybe asking the obvious here so I apologize for the spam. The timezone is PDT, correct? -- Thanks, -Nyoman Ribeka -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: Hangout on Monday
Hi, Ted, I added you on google plus. 2013/8/5 Suneel Marthi suneel_mar...@yahoo.com Grant had setup a biweekly/weekly Google Doodle for Mahout meetups. We had only one of them sometime in July with no technical issues. I could see and hear you guys talk on today's hangout but it just wouldn't allow me to join in. Suggest that we should be using that going forward, there is no need for the meeting host to add the rest of the team to his/her circles that way. Regards, Suneel As our Google+ circle of knowledge expands, so does the circumference of darkness surrounding it - Albert Einstein From: Ted Dunning ted.dunn...@gmail.com To: Mahout Dev List dev@mahout.apache.org Sent: Monday, August 5, 2013 8:16 PM Subject: Re: Hangout on Monday Peng, It looks like you are not actually on google plus. I have you in my Mahout circle under your iowa email address, but I am unable to add you to a hangout. On Mon, Aug 5, 2013 at 5:07 PM, Peng Cheng pc...@uowmail.edu.au wrote: So buggy, the program act as i'm in the meeting (showing a push to talk button), but it doesn't do anything. On 13-08-05 08:02 PM, Ted Dunning wrote: Hangouts clearly do not work the way I thought they did. The URL that I sent out was for the arhcived version of the meeting. On Mon, Aug 5, 2013 at 5:00 PM, Peng Cheng pc...@uowmail.edu.au wrote: Strange, I didn't see any invitation. On 13-08-05 06:54 PM, Ted Dunning wrote: Just sent invite to Mahout dev list. On Mon, Aug 5, 2013 at 3:53 PM, Ted Dunning ted.dunn...@gmail.com wrote: It is for both. If you have g+ installed you can participate. If not, you can watch. On Mon, Aug 5, 2013 at 3:51 PM, Sebastian Schelter s...@apache.org wrote: Is the link only for watching or also for participation? Never did a hangout before :) 2013/8/5 Andrew Musselman andrew.mussel...@gmail.com Can't make it alas On Mon, Aug 5, 2013 at 3:12 PM, Michael Kun Yang kuny...@stanford.edu wrote: what's the addr of the hangout? On Sun, Aug 4, 2013 at 10:37 AM, Peng Cheng pc...@uowmail.edu.au wrote: Nice, I'll be there. On 13-08-03 02:51 PM, Andrew Musselman wrote: Sounds good On Sat, Aug 3, 2013 at 12:04 AM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. 1600 PDT I got that right in the linked doc, just not on the more important email. On Fri, Aug 2, 2013 at 3:30 PM, Andrew Psaltis andrew.psal...@webtrends.com wrote: On 8/2/13 4:42 PM, Ted Dunning ted.dunn...@gmail.com wrote: Let's have the hangout at 1600 on Monday, August 5th. Maybe asking the obvious here so I apologize for the spam. The timezone is PDT, correct? -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1265: --- Attachment: mahout-1265.patch

[~tdunning] I have finished a workable single-machine version of MultilayerPerceptron (based on NeuralNetwork). It supports the requirements you mentioned above. It allows users to customize each layer, including the size and the squashing function. It also allows users to specify different loss functions for the model. Moreover, it allows the user to store the trained model and reload it for later use. Finally, it allows users to extract the weights of each layer from a trained model. This approach allows users to train and stack a deep-learning neural network layer by layer. If this single-machine version passes the review, I will begin to work on the map-reduce version based on it.

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1265: --- Status: Patch Available (was: Open)

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728756#comment-13728756 ] Yexi Jiang commented on MAHOUT-1265:

[~tdunning] The test cases cover three datasets: the simple XOR problem, the Cancer dataset (2-class classification), and the Iris dataset (3-class classification). For the latter two datasets, the classification accuracy is more than 90%.

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
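To give a feel for what the XOR case checks, here is a minimal, self-contained Java sketch (illustrative only, not the actual TestMultilayerPerceptron code) that feeds the four XOR patterns through a hand-wired 2-2-1 sigmoid network and verifies the outputs; the real test would instead train the network and assert on its accuracy.

-
// Feed-forward check of a hand-wired 2-2-1 sigmoid network that solves XOR.
// Hidden unit 1 acts roughly as OR, hidden unit 2 as AND, and the output unit
// computes "OR and not AND", which is XOR.
public class XorForwardSketch {

  static double sigmoid(double z) {
    return 1.0 / (1.0 + Math.exp(-z));
  }

  static double xorNet(double x1, double x2) {
    double h1 = sigmoid(20.0 * x1 + 20.0 * x2 - 10.0);   // ~ OR(x1, x2)
    double h2 = sigmoid(20.0 * x1 + 20.0 * x2 - 30.0);   // ~ AND(x1, x2)
    return sigmoid(20.0 * h1 - 40.0 * h2 - 10.0);        // ~ OR and not AND
  }

  public static void main(String[] args) {
    double[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
    double[] expected = {0, 1, 1, 0};
    for (int i = 0; i < inputs.length; i++) {
      double out = xorNet(inputs[i][0], inputs[i][1]);
      boolean ok = Math.abs(out - expected[i]) < 0.01;
      System.out.printf("XOR(%.0f, %.0f) -> %.4f (expected %.0f, %s)%n",
          inputs[i][0], inputs[i][1], out, expected[i], ok ? "ok" : "MISMATCH");
    }
  }
}
-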
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728769#comment-13728769 ] Yexi Jiang commented on MAHOUT-1265:

The MLP is implemented on top of NeuralNetwork. The NeuralNetwork is more general both in functionality (it can be used for regression, classification, dimensionality reduction, etc.) and in architecture (linear regression and logistic regression are two-level neural networks, and an autoencoder is a three-level neural network; I have heard that even the SVM can be modeled as a type of neural network, but I'm not sure). In my opinion, the NeuralNetwork I implemented is a suitable starting point for deep learning, as one implementation of deep nets is based on stacking autoencoders.

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network Attachments: mahout-1265.patch
https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720 For the distributed training, each training iteration is divided into two steps, the weight update calculation step and the weight update step. The distributed MLP can only be trained in batch-update approach. 3.1 The partial weight update calculation step: This step trains the MLP distributedly. Each task will get a copy of the MLP model, and calculate the weight update with a partition of data. Suppose the training error is E(w) = ½ \sigma_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label and y_d denotes the output of the MLP. Also, suppose sigmoid function is used as the squashing function, squared error is used as the cost function, t_i denotes the target value for the ith dimension of the output layer, o_i denotes the actual output for the ith dimension of the output layer, l denotes the learning rate, w_{ij} denotes the weight between the jth neuron in previous layer and the ith neuron in the next layer. The weight of each edge is updated as \Delta w_{ij} = l * 1 / m * \delta_j * o_i, where \delta_j = - \sigma_{m} * o_j
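To make the point about architectural generality concrete, here is a minimal sketch of how such a generic network could be configured either as logistic regression or as an autoencoder. The class and method names (NeuralNetwork, addLayer, setCostFunction) and the string constants are illustrative assumptions, not the committed Mahout API:
-
int numFeatures = 4;   // dimensionality of the input
int codeSize = 2;      // size of the compressed representation

// Logistic regression: input layer plus a single sigmoid output with cross-entropy cost.
NeuralNetwork logistic = new NeuralNetwork();
logistic.addLayer(numFeatures, false, "Identity"); // input layer
logistic.addLayer(1, true, "Sigmoid");             // output layer
logistic.setCostFunction("CrossEntropy");

// Autoencoder: input -> smaller hidden code -> reconstruction of the input.
NeuralNetwork autoencoder = new NeuralNetwork();
autoencoder.addLayer(numFeatures, false, "Identity");
autoencoder.addLayer(codeSize, false, "Sigmoid");   // hidden code layer
autoencoder.addLayer(numFeatures, true, "Sigmoid"); // reconstruction layer
autoencoder.setCostFunction("SquaredError");
-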
[jira] [Created] (MAHOUT-1265) Add Multilayer Perceptron
Yexi Jiang created MAHOUT-1265: -- Summary: Add Multilayer Perceptron Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang

Design of multilayer perceptron

1. Motivation
A multilayer perceptron (MLP) is a kind of feed-forward artificial neural network, a mathematical model inspired by biological neural networks. The multilayer perceptron can be used for various machine learning tasks such as classification and regression. It would be helpful to include it in Mahout.

2. API
The design goal of the API is to make the MLP easy to use and to keep the implementation details transparent to the user. The following example code shows how a user would use the MLP.
-
// set the parameters
double learningRate = 0.5;
double momentum = 0.1;
double regularization = 0.01;
int[] layerSizeArray = new int[] {2, 5, 1};
String costFuncName = "SquaredError";
String squashingFuncName = "Sigmoid";
// the location to store the model; if there is already an existing model at the specified location, the MLP will throw an exception
URI modelLocation = ...
MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, layerSizeArray, modelLocation);
// the user can also load an existing model with a given URI and update the model with new training data; if there is no existing model at the specified location, an exception will be thrown
/* MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, modelLocation); */
URI trainingDataLocation = …
// the details of training are transparent to the user; it may run on a single machine or in a distributed environment
mlp.train(trainingDataLocation);
// the user can also train the model with one training instance at a time, in a stochastic gradient descent fashion
Vector trainingInstance = ...
mlp.train(trainingInstance);
// prepare the input feature
Vector inputFeature = …
// the semantic meaning of the output is defined by the user
// in the general case, the dimension of the output vector is 1 for regression and two-class classification
// the dimension of the output vector is n for n-class classification (n > 2)
Vector outputVector = mlp.output(inputFeature);
-

3. Methodology
The output calculation can easily be implemented with the feed-forward approach. Also, single-machine training is straightforward. The following describes how to train the MLP in a distributed way with batch gradient descent. The workflow is illustrated in the figure below.

https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720

For distributed training, each training iteration is divided into two steps: the weight update calculation step and the weight update step. The distributed MLP can only be trained in a batch-update fashion.

3.1 The partial weight update calculation step: This step trains the MLP in a distributed fashion. Each task gets a copy of the MLP model and calculates the weight update with a partition of the data. Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label and y_d denotes the output of the MLP. Also, suppose the sigmoid function is used as the squashing function and the squared error is used as the cost function; t_i denotes the target value for the i-th dimension of the output layer, o_i denotes the actual output for the i-th dimension of the output layer, l denotes the learning rate, and w_{ij} denotes the weight between the j-th neuron in the previous layer and the i-th neuron in the next layer. The weight of each edge is updated as \Delta w_{ij} = l \cdot \frac{1}{m} \cdot \delta_j \cdot o_i, where \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) for the output layer and \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_k \delta_k w_{jk} for the hidden layers. It follows that \delta_j can be rewritten as \delta_j = -\sum_{i=1}^{k} \sum_{m_i} o_j^{(m_i)} (1 - o_j^{(m_i)}) (t_j^{(m_i)} - o_j^{(m_i)}). The above equation indicates that \delta_j can be divided into k parts, so each mapper can calculate its part of \delta_j from its partition of the data and store the result at a specified location.

3.2 The model update step: After the k parts of \delta_j have been calculated, a separate program merges them into one result and updates the weight matrices. This program loads the results calculated in the weight update calculation step and updates the weight matrices.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
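The two-step batch update in 3.1 and 3.2 can be sketched in plain Java. This is only an illustration of the scheme described above; the helper backpropagateOneInstance and the single array-based weight matrix (one layer shown for brevity) are assumptions, not existing Mahout code:
-
import java.util.List;

// Step 1 (runs once per data partition, e.g. inside a mapper):
// sum the per-instance gradients for this partition.
static double[][] partialGradient(double[][] weights, List<double[]> inputs, List<double[]> targets) {
  double[][] acc = new double[weights.length][weights[0].length];
  for (int n = 0; n < inputs.size(); n++) {
    // assumed helper: one forward/backward pass, returns d cost / d w for this instance
    double[][] g = backpropagateOneInstance(weights, inputs.get(n), targets.get(n));
    for (int i = 0; i < acc.length; i++) {
      for (int j = 0; j < acc[i].length; j++) {
        acc[i][j] += g[i][j];
      }
    }
  }
  return acc;
}

// Step 2 (merge program): combine the k partial results and apply one batch update,
// i.e. w_ij <- w_ij - l * (1/m) * (sum of the partial gradients), with m training instances in total.
static void applyUpdate(double[][] weights, List<double[][]> partials, long m, double l) {
  for (int i = 0; i < weights.length; i++) {
    for (int j = 0; j < weights[i].length; j++) {
      double sum = 0.0;
      for (double[][] p : partials) {
        sum += p[i][j];
      }
      weights[i][j] -= l * sum / m;
    }
  }
}
-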
[jira] [Updated] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yexi Jiang updated MAHOUT-1265: --- Description:

Design of multilayer perceptron

1. Motivation
A multilayer perceptron (MLP) is a kind of feed-forward artificial neural network, a mathematical model inspired by biological neural networks. The multilayer perceptron can be used for various machine learning tasks such as classification and regression. It would be helpful to include it in Mahout.

2. API
The design goal of the API is to make the MLP easy to use and to keep the implementation details transparent to the user. The following example code shows how a user would use the MLP.
-
// set the parameters
double learningRate = 0.5;
double momentum = 0.1;
int[] layerSizeArray = new int[] {2, 5, 1};
String costFuncName = "SquaredError";
String squashingFuncName = "Sigmoid";
// the location to store the model; if there is already an existing model at the specified location, the MLP will throw an exception
URI modelLocation = ...
MultilayerPerceptron mlp = new MultiLayerPerceptron(layerSizeArray, modelLocation);
mlp.setLearningRate(learningRate).setMomentum(momentum).setRegularization(...).setCostFunction(...).setSquashingFunction(...);
// the user can also load an existing model with a given URI and update the model with new training data; if there is no existing model at the specified location, an exception will be thrown
/* MultilayerPerceptron mlp = new MultiLayerPerceptron(learningRate, regularization, momentum, squashingFuncName, costFuncName, modelLocation); */
URI trainingDataLocation = …
// the details of training are transparent to the user; it may run on a single machine or in a distributed environment
mlp.train(trainingDataLocation);
// the user can also train the model with one training instance at a time, in a stochastic gradient descent fashion
Vector trainingInstance = ...
mlp.train(trainingInstance);
// prepare the input feature
Vector inputFeature = …
// the semantic meaning of the output is defined by the user
// in the general case, the dimension of the output vector is 1 for regression and two-class classification
// the dimension of the output vector is n for n-class classification (n > 2)
Vector outputVector = mlp.output(inputFeature);
-

3. Methodology
The output calculation can easily be implemented with the feed-forward approach. Also, single-machine training is straightforward. The following describes how to train the MLP in a distributed way with batch gradient descent. The workflow is illustrated in the figure below.

https://docs.google.com/drawings/d/1s8hiYKpdrP3epe1BzkrddIfShkxPrqSuQBH0NAawEM4/pub?w=960h=720

For distributed training, each training iteration is divided into two steps: the weight update calculation step and the weight update step. The distributed MLP can only be trained in a batch-update fashion.

3.1 The partial weight update calculation step: This step trains the MLP in a distributed fashion. Each task gets a copy of the MLP model and calculates the weight update with a partition of the data. Suppose the training error is E(w) = \frac{1}{2} \sum_{d \in D} cost(t_d, y_d), where D denotes the training set, d denotes a training instance, t_d denotes the class label and y_d denotes the output of the MLP. Also, suppose the sigmoid function is used as the squashing function and the squared error is used as the cost function; t_i denotes the target value for the i-th dimension of the output layer, o_i denotes the actual output for the i-th dimension of the output layer, l denotes the learning rate, and w_{ij} denotes the weight between the j-th neuron in the previous layer and the i-th neuron in the next layer. The weight of each edge is updated as \Delta w_{ij} = l \cdot \frac{1}{m} \cdot \delta_j \cdot o_i, where \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) (t_j^{(m)} - o_j^{(m)}) for the output layer and \delta_j = -\sum_{m} o_j^{(m)} (1 - o_j^{(m)}) \sum_k \delta_k w_{jk} for the hidden layers. It follows that \delta_j can be rewritten as \delta_j = -\sum_{i=1}^{k} \sum_{m_i} o_j^{(m_i)} (1 - o_j^{(m_i)}) (t_j^{(m_i)} - o_j^{(m_i)}). The above equation indicates that \delta_j can be divided into k parts, so each mapper can calculate its part of \delta_j from its partition of the data and store the result at a specified location.

3.2 The model update step: After the k parts of \delta_j have been calculated, a separate program merges them into one result and updates the weight matrices. This program loads the results calculated in the weight update calculation step and updates the weight matrices.

was: Design of multilayer perceptron 1. Motivation A multilayer perceptron (MLP) is a kind of feed forward artificial neural network, which is a mathematical model inspired
[jira] [Commented] (MAHOUT-1265) Add Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13686880#comment-13686880 ] Yexi Jiang commented on MAHOUT-1265:

Ted,

{quote} I would suggest that a more fluid API would be helpful to people. For instance, each layer might be an object which could be composed together to build a model which is then trained. {quote}

It seems that you are suggesting a more general neural network, not just the MLP. An MLP is a kind of feed-forward neural network whose topology is fixed: it usually consists of several layers, and every pair of neurons in adjacent layers is connected. Therefore, specifying the size of each layer is enough to determine the topology of an MLP. It would be good to first define a generic neural network and then build the MLP on top of it, in the way you described. An advantage is that the generic neural network can be reused to build other types of neural networks in the future, e.g. an autoencoder for dimensionality reduction, a recurrent neural network for sequence mining, or possibly deep nets.

{quote} Secondly, it seems like it would be good to have different kinds of loss function and regularizations. {quote}

Yes, the MLP will allow the user to specify different loss functions, squashing functions, and regularizations.

{quote} Also, regarding things like momentum, do you have an idea that this really needs to be commonly adjusted? or is there a way to set a good default? {quote}

As far as I know, there is no empirical way to set a good default momentum weight; a good value depends on the concrete problem. As for the learning rate, a good approach is to enable a decaying learning rate.

Add Multilayer Perceptron -- Key: MAHOUT-1265 URL: https://issues.apache.org/jira/browse/MAHOUT-1265 Project: Mahout Issue Type: New Feature Reporter: Yexi Jiang Labels: machine_learning, neural_network
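For reference, a layer-object API along the lines Ted suggests could look roughly like the sketch below. All of the class and method names here (Layer, NetworkBuilder, costFunction, etc.) are hypothetical illustrations of the idea, not existing or proposed Mahout classes:
-
MultilayerPerceptron mlp = new NetworkBuilder()
    .addLayer(new Layer(2, "Identity"))   // input layer
    .addLayer(new Layer(5, "Sigmoid"))    // hidden layer
    .addLayer(new Layer(1, "Sigmoid"))    // output layer
    .costFunction("SquaredError")
    .learningRate(0.5)
    .momentum(0.1)
    .regularization(0.01)
    .build();
-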
[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
[ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680476#comment-13680476 ] Yexi Jiang commented on MAHOUT-975: ---

There are multiple problems (not only bugs) with the GradientMachine (based on Ted's revised version). If there is no time to look at this issue now, please ignore it until next week (when 0.8 is released).

1) The GradientMachine is a special case of the MultiLayerPerceptron (MLP) that contains only one hidden layer. Is it necessary to keep it if the MultiLayerPerceptron is planned?

2) hiddenToOutput does not seem correct. The squashing (activation) function should also be applied to the output layer (see [1][2][3][4]). Therefore, the range of the output of each node (neuron) in the output layer is (0, 1) if the Sigmoid function is used, or (-1, 1) if the Tanh function is used.

3) There are several problems with the training method. In updateRanking, it is unclear which weight update strategy is used; it claims to be back-propagation, but it is not implemented that way.

3.1) It seems that only part of the output weights are updated (the weights associated with the good output node and the weights associated with the worst output node; again, this is OK for a two-class problem). For back-propagation, all the weights between the last hidden layer and the output layer should be updated. Did the original designer intentionally design it like that, and can its correctness be guaranteed? In back-propagation, the delta of each node should be calculated first, and the weights of each node adjusted based on the corresponding delta; however, the implemented code does not follow this procedure.

3.2) The GradientMachine (and the MLP) can actually also be used for regression and prediction. The 'train' method of OnlineLearner restricts its power.

4) The corresponding test case is not sufficient to verify the correctness of the implementation.

5) Once all the previous problems have been fixed, it will be time to consider whether a map-reduce version of the algorithm is necessary.

Reference:
[1] Tom Mitchell. Machine Learning. Chapter 4.
[2] Jiawei Han. Data Mining: Concepts and Techniques. Chapter 6.
[3] Stanford Unsupervised Feature Learning and Deep Learning tutorial. http://ufldl.stanford.edu/wiki/index.php/Neural_Networks. Section Neural Network.
[4] Christopher Bishop. Neural Networks for Pattern Recognition. Chapter 4.

Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: Backlog Attachments: GradientMachine2.java, GradientMachine.patch, MAHOUT-975.patch

The initialisation that computes the gradient descent weight updates for the output units seems to be wrong. In the comment: "dy / dw is just w since y = x' * w + b". This is wrong: dy/dw is x (ignoring the indices). The same initialisation is done in the code.

Check by using neural network terminology: the gradient machine is a specialized version of a multilayer perceptron (MLP). In an MLP the gradient for computing the weight change of the output units is dE/dw_ij = dE/dz_i * dz_i/dw_ij, with z_i = sum_j (w_ij * a_j); here i is the index of the output layer, j is the index of the hidden layer (d stands for the partial derivative), and z_i = a_i (no squashing in the output layer). With the special loss (cost function) E = 1 - a_g + a_b = 1 - z_g + z_b, where g is the index of the output unit with target value +1 (positive class) and b is a random output unit with target value 0, it follows that dE/dw_gj = dE/dz_g * dz_g/dw_gj = -1 * a_j and dE/dw_bj = dE/dz_b * dz_b/dw_bj = +1 * a_j (a_j: activity of the hidden unit j). That is the same as if the comment were correct: dy/dw = x (x is here the activation of the hidden unit) times (-1) for weights to the output unit with target value +1.

In neural network implementations it is common to compute the gradient numerically to test the implementation. This can be done by: dE/dw_ij = (E(w_ij + epsilon) - E(w_ij - epsilon)) / (2 * epsilon).

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
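A minimal Java sketch of the numerical gradient check mentioned at the end of the report, assuming a hypothetical lossAt(weights, input, target) helper that runs a forward pass and returns the cost:
-
// Central-difference estimate of dE/dw_ij, to be compared against the analytic
// gradient from back-propagation; the two should agree to within a small tolerance
// (e.g. 1e-6) if the implementation is correct.
static double numericalGradient(double[][] weights, int i, int j,
                                double[] input, double[] target, double eps) {
  double original = weights[i][j];
  weights[i][j] = original + eps;
  double lossPlus = lossAt(weights, input, target);   // assumed helper: forward pass + cost
  weights[i][j] = original - eps;
  double lossMinus = lossAt(weights, input, target);
  weights[i][j] = original;                           // restore the weight
  return (lossPlus - lossMinus) / (2 * eps);
}
-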
[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
[ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680673#comment-13680673 ] Yexi Jiang commented on MAHOUT-975: --- The size of goodLabels in updateRanking is always 1, so it seems there is no need to use a loop. Also, the existing test case does not pass: an ArrayIndexOutOfBoundsException is thrown. Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: Backlog Attachments: GradientMachine2.java, GradientMachine.patch, MAHOUT-975.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
[ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679582#comment-13679582 ] Yexi Jiang commented on MAHOUT-975: --- [~smarthi] Sure, I'd like to try it. Is the deadline end of this week? Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: 0.8 Attachments: GradientMachine.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
[ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679837#comment-13679837 ] Yexi Jiang commented on MAHOUT-975: --- [~smarthi] When I apply this patch, the source code cannot be compiled. One of the errors is that hiddenActivations cannot be resolved. Another is that the class Functions.NEGATE is misspelled as Function.NEGATE. Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: 0.8 Attachments: GradientMachine.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
[ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679837#comment-13679837 ] Yexi Jiang edited comment on MAHOUT-975 at 6/10/13 8:07 PM: [~smarthi] When I apply this patch, the source code cannot be compiled. One of the errors is that hiddenActivations cannot be resolved. Another is that the class Functions.NEGATE is misspelled as Function.NEGATE. was (Author: yxjiang): [~smarthi] When I apply this patch, the source code cannot be compiled. One of the error is that hiddenActivations cannot be resolved. Another error is that the class Functions.NEGATE is misspell as Function.NEGATE. Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: 0.8 Attachments: GradientMachine.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [jira] [Comment Edited] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
OK, I will try to update the source code to the latest version. 2013/6/10 Yexi Jiang (JIRA) j...@apache.org [ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679837#comment-13679837] Yexi Jiang edited comment on MAHOUT-975 at 6/10/13 8:07 PM: [~smarthi] When I apply this patch, the source code cannot be compiled. One of the errors is that hiddenActivations cannot be resolved. Another is that the class Functions.NEGATE is misspelled as Function.NEGATE. was (Author: yxjiang): [~smarthi] When I apply this patch, the source code cannot be compiled. One of the error is that hiddenActivations cannot be resolved. Another error is that the class Functions.NEGATE is misspell as Function.NEGATE. Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: 0.8 Attachments: GradientMachine.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
[ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679866#comment-13679866 ] Yexi Jiang commented on MAHOUT-975: --- [~smarthi] OK, I will work directly on the latest version of the code. Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: 0.8 Attachments: GradientMachine.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAHOUT-975) Bug in Gradient Machine - Computation of the gradient
[ https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13680104#comment-13680104 ] Yexi Jiang commented on MAHOUT-975: --- [~smarthi] Do I still need to work on this? Bug in Gradient Machine - Computation of the gradient -- Key: MAHOUT-975 URL: https://issues.apache.org/jira/browse/MAHOUT-975 Project: Mahout Issue Type: Bug Components: Classification Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Fix For: 0.8 Attachments: GradientMachine2.java, GradientMachine.patch, MAHOUT-975.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAHOUT-976) Implement Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679119#comment-13679119 ] Yexi Jiang commented on MAHOUT-976: --- No feedback?

Implement Multilayer Perceptron --- Key: MAHOUT-976 URL: https://issues.apache.org/jira/browse/MAHOUT-976 Project: Mahout Issue Type: New Feature Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Priority: Minor Labels: multilayer, networks, neural, perceptron Fix For: Backlog Attachments: MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch Original Estimate: 80h Remaining Estimate: 80h

Implement a multilayer perceptron
* via matrix multiplication
* learning by backpropagation; implementing tricks by Yann LeCun et al.: "Efficient Backprop"
* arbitrary number of hidden layers (also 0 - just the linear model)
* connections between proximate layers only
* different cost and activation functions (a different activation function in each layer)
* test of backprop by gradient checking
* normalization of the inputs (storeable) as part of the model

First:
* implement stochastic gradient descent like the gradient machine
* simple gradient descent incl. momentum

Later (new JIRA issues):
* distributed batch learning (see below)
* stacked (denoising) autoencoder - feature learning
* advanced cost minimization like 2nd-order methods, conjugate gradient, etc.

Distribution of learning can be done by (batch learning):
1. Partitioning the data into x chunks
2. Learning the weight changes as matrices in each chunk
3. Combining the matrices and updating the weights - back to 2

Maybe this procedure can be done with random parts of the chunks (distributed quasi-online learning). Batch learning with delta-bar-delta heuristics for adapting the learning rates.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
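For the last point, a minimal sketch of the delta-bar-delta heuristic: each weight gets its own learning rate, which grows additively while the gradient sign stays consistent and shrinks multiplicatively when it flips. The constants kappa, phi and theta are illustrative defaults (not values proposed in the issue), and the flat arrays stand in for the real weight matrices:
-
// w: weights; grad: current batch gradient; rate: per-weight learning rates;
// barDelta: exponentially smoothed past gradients.
static void deltaBarDeltaUpdate(double[] w, double[] grad, double[] rate, double[] barDelta,
                                double kappa, double phi, double theta) {
  for (int i = 0; i < w.length; i++) {
    if (grad[i] * barDelta[i] > 0) {
      rate[i] += kappa;              // consistent sign: grow the rate additively
    } else if (grad[i] * barDelta[i] < 0) {
      rate[i] *= phi;                // sign flip: shrink the rate multiplicatively (0 < phi < 1)
    }
    w[i] -= rate[i] * grad[i];       // gradient descent step with the per-weight rate
    barDelta[i] = (1 - theta) * grad[i] + theta * barDelta[i]; // update smoothed gradient history
  }
}
-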
[jira] [Commented] (MAHOUT-976) Implement Multilayer Perceptron
[ https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678668#comment-13678668 ] Yexi Jiang commented on MAHOUT-976: --- Hi, I read the source code from the patch files (all four versions) and have the following questions.

1) It seems that the source code does not fully implement the distributed MLP. Based on my understanding, the algorithm designer intends to make the implemented MLP generic enough so that it can be used both in a single-machine scenario and in a distributed scenario. For the single-machine scenario, the user can easily reuse the algorithm by writing code similar to the test cases. But for the distributed version, the user has to implement the mapper to load all the training data, then create an MLP instance inside the mapper and train it with the incoming data. Moreover, the user has to come up with a solution to merge the MLP weight updates from each mapper instance, which is not trivial. Therefore, it seems that the current implementation does no more than a single-machine version of the MLP.

2) The dimension of the target Vector fed to trainOnline is always 1. This is because 'actual' is always an integer, and there is no post-processing to turn it into a multi-class vector. The following is the call sequence: train -> trainOnline -> getDerivativeOfTheCostWithoutRegularization -> getOutputDeltas -> AbstractVector.assign(Vector v, DoubleDoubleFunction f). The assign method checks whether the size of v equals this.size; in the MLP scenario, it checks whether the size of the output layer equals the size of the class label vector. The following is the related code.
--
public void train(long trackingKey, String groupKey, int actual, Vector instance) {
  // training with one pattern
  Vector target = new DenseVector(1);
  target.setQuick(0, (double) actual);
  trainOnline(instance, target);
}
--
The reason it passes the test cases is that the test cases create the MLP with an output layer of size 1. So, I am wondering whether the argument list of train should be changed, or whether the argument 'actual' should be transformed internally (one possible transformation is sketched after this message).

I have implemented a BSP-based distributed MLP, and the code has already been committed to the Apache Hama machine learning package. The BSP version is not difficult to adapt to the map-reduce framework. If it is OK, I can adapt my existing code and contribute it to Mahout.

Implement Multilayer Perceptron --- Key: MAHOUT-976 URL: https://issues.apache.org/jira/browse/MAHOUT-976 Project: Mahout Issue Type: New Feature Affects Versions: 0.7 Reporter: Christian Herta Assignee: Ted Dunning Priority: Minor Labels: multilayer, networks, neural, perceptron Fix For: Backlog Attachments: MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch Original Estimate: 80h Remaining Estimate: 80h

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
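One possible internal transformation of 'actual' into a target vector whose size matches the output layer, following point 2 above. This is only a sketch: layerSizeArray is an assumed field of the enclosing class, and the method signature simply mirrors the snippet quoted in the comment.
-
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;

public void train(long trackingKey, String groupKey, int actual, Vector instance) {
  int numClasses = layerSizeArray[layerSizeArray.length - 1]; // size of the output layer (assumed field)
  Vector target = new DenseVector(numClasses);
  if (numClasses == 1) {
    target.setQuick(0, (double) actual);   // regression / two-class case, as in the original snippet
  } else {
    target.setQuick(actual, 1.0);          // one-hot encoding for n-class classification
  }
  trainOnline(instance, target);
}
-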
Re: Really want to contribute to mahout
Certainly, I always keep an eye on the issue tracker. It is not easy to find an open issue; most of them are assigned shortly after they are created. 2013/6/2 Ted Dunning ted.dunn...@gmail.com Yexi, It is really good that you just spoke up. The density based clustering issue that you filed didn't find a fertile audience, that is true. Can you provide a pointer to the other issue? On Sat, Jun 1, 2013 at 9:06 PM, Yexi Jiang yexiji...@gmail.com wrote: Hi, I have joined the mailing list for a while and intend to contribute my code to mahout. However, I tried two issues but didn't get the permission to work on them. I'm wondering how can I contribute to mahout. As I am a graduate student working on data mining, I'm really want to do something to make mahout better. Regards, Yexi -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: Algorithms for categorical data
Do you already have one implemented? 2013/6/2 Florents Tselai tse...@dmst.aueb.gr I've noticed (correct me if I'm wrong) that mahout lacks algorithms specialized in clustering data with categorical attributes. Would the community be interested in the implementation of algorithms like ROCK http://www.cis.upenn.edu/~sudipto/mypapers/categorical.pdf ? I'm currently working on this area (applied-research project) and I'd like to have my code open-sourced. -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
Re: Algorithms for categorical data
You mean you are testing a single-machine version? 2013/6/2 Florents Tselai tse...@dmst.aueb.gr Not yet. I'm currently experimenting with various implementation in Python. On Sun, Jun 2, 2013 at 9:43 PM, Yexi Jiang yexiji...@gmail.com wrote: Do you already have one implemented? 2013/6/2 Florents Tselai tse...@dmst.aueb.gr I've noticed (correct me if I'm wrong) that mahout lacks algorithms specialized in clustering data with categorical attributes. Would the community be interested in the implementation of algorithms like ROCK http://www.cis.upenn.edu/~sudipto/mypapers/categorical.pdf ? I'm currently working on this area (applied-research project) and I'd like to have my code open-sourced. -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/ -- -- Yexi Jiang, ECS 251, yjian...@cs.fiu.edu School of Computer and Information Science, Florida International University Homepage: http://users.cis.fiu.edu/~yjian004/
[jira] [Commented] (MAHOUT-1206) Add density-based clustering algorithms to mahout
[ https://issues.apache.org/jira/browse/MAHOUT-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13672147#comment-13672147 ] Yexi Jiang commented on MAHOUT-1206: Still no comments? Add density-based clustering algorithms to mahout - Key: MAHOUT-1206 URL: https://issues.apache.org/jira/browse/MAHOUT-1206 Project: Mahout Issue Type: Improvement Reporter: Yexi Jiang Labels: clustering The existing clustering algorithms (k-means, fuzzy k-means, Dirichlet clustering, and spectral clustering) cluster data by assuming that the data can be grouped into regular hyperspheres or ellipsoids. However, in practice, not all data can be clustered in this way. To enable data to be clustered into arbitrary shapes, clustering algorithms like DBSCAN, BIRCH, and CLARANS (http://en.wikipedia.org/wiki/Cluster_analysis#Density-based_clustering) have been proposed. It would be good to implement one or more of these clustering algorithms to enrich the clustering library. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Really want to contribute to mahout
Hi, I have joined the mailing list for a while and intend to contribute my code to Mahout. However, I tried two issues but didn't get the permission to work on them. I'm wondering how I can contribute to Mahout. As I am a graduate student working on data mining, I really want to do something to make Mahout better. Regards, Yexi