Hi, guys. I saw that you are calling for application programs on Hama in the *Future Plans* section of the Hama programming wiki ( https://issues.apache.org/jira/secure/attachment/12528218/ApacheHamaBSPProgrammingmodel.pdf ). I am very interested in machine learning, and I plan to implement neural networks (e.g. a multilayer perceptron with backpropagation) on Hama. Hama seems to be a nice tool for training large-scale neural networks. Especially for networks with a large structure (many hidden layers and many neurons), I find that Hama Graph provides a good solution. We can regard each neuron in the neural network (NN) as a vertex in Hama Graph, and the links between neurons as edges in the graph. The training process can then be regarded as updating the weights of the edges between vertices. However, I encountered a problem in the current Hama Graph implementation.
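To make the neuron-as-vertex idea above concrete, here is a minimal sketch in plain Java. It does not use any Hama classes; `Neuron`, `Edge`, `fire()` and `NeuronGraphSketch` are hypothetical names I made up for illustration, and the weights are arbitrary. It only shows a forward pass through a tiny 2-2-1 network where the weights live on the edges.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only: none of these classes are part of Hama.

// An edge carries the trainable weight of one link between two neurons.
class Edge {
    final Neuron target;
    double weight; // mutable, because training would update it

    Edge(Neuron target, double weight) {
        this.target = target;
        this.weight = weight;
    }
}

// A neuron plays the role of a graph vertex.
class Neuron {
    final List<Edge> out = new ArrayList<>(); // links to the next layer
    double input;  // weighted sum received from upstream neurons
    double output; // value this neuron sends downstream

    void connect(Neuron target, double weight) {
        out.add(new Edge(target, weight));
    }

    // Rough analogue of one per-vertex compute step:
    // activate, then send the weighted output along every outgoing edge.
    void fire(boolean isInputNeuron) {
        output = isInputNeuron ? input : 1.0 / (1.0 + Math.exp(-input));
        for (Edge e : out) {
            e.target.input += e.weight * output;
        }
    }
}

class NeuronGraphSketch {
    // Builds a 2-2-1 network with arbitrary weights and runs one forward pass.
    static double forward(double x1, double x2) {
        Neuron i1 = new Neuron(), i2 = new Neuron();
        Neuron h1 = new Neuron(), h2 = new Neuron();
        Neuron o = new Neuron();

        i1.connect(h1, 0.5); i1.connect(h2, 0.5);
        i2.connect(h1, 0.5); i2.connect(h2, 0.5);
        h1.connect(o, 1.0);  h2.connect(o, -1.0);

        i1.input = x1;
        i2.input = x2;

        // One layer per step, similar in spirit to one superstep per layer.
        i1.fire(true);  i2.fire(true);
        h1.fire(false); h2.fire(false);
        o.fire(false);
        return o.output;
    }

    public static void main(String[] args) {
        System.out.println(forward(1.0, 0.0));
    }
}
```

In this sketch the sample values (`x1`, `x2`) are passed in directly as method arguments. That is exactly the part I do not know how to do inside Hama Graph.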
Let me explain. As you may know, the training process of many machine learning algorithms requires feeding many training samples into the model one by one. Usually, more training samples lead to more accurate models. However, as far as I know, the only file input interface provided by Hama Graph is the one for the graph structure. Sadly, it is hard to read and distribute the training samples at run time, because users can only implement their computing logic by overriding a few key functions such as compute() in the Vertex class. So, does Hama Graph provide any flexible file reading interface that users can call at run time? Thanks in advance. Walker.
