Hello Ryan,

I am thinking of writing two proposals so that one can be picked easily: if I do the DAG network, I will implement only the first four models, and if I do models only, I will implement all six of them.
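To make the adjacency-list idea concrete, here is a rough sketch of what I have in mind. DAGNetwork and all of its members are placeholder names of my own, not existing mlpack classes, and LayerType stands in for whatever layer base class the refactor ends up with:

    // Rough sketch only: DAGNetwork is a placeholder, not an existing
    // mlpack class.  The container owns the adjacency list; the layers
    // themselves are unmodified.
    #include <cstddef>
    #include <vector>

    template<typename LayerType /* e.g. the refactored Layer class */>
    class DAGNetwork
    {
     public:
      // Add a layer; the returned index identifies it in the graph.
      size_t Add(LayerType* layer)
      {
        layers.push_back(layer);
        children.emplace_back();
        parents.emplace_back();
        return layers.size() - 1;
      }

      // Add a directed edge: the output of `from` feeds into `to`.
      void Connect(const size_t from, const size_t to)
      {
        children[from].push_back(to);
        parents[to].push_back(from);
      }

      // Forward() would visit `layers` in topological order, combining
      // the outputs of parents[i] before calling layers[i]->Forward();
      // Backward() would walk the reverse order.

     private:
      std::vector<LayerType*> layers;
      std::vector<std::vector<size_t>> children;  // successors of i
      std::vector<std::vector<size_t>> parents;   // predecessors of i
    };

The point of the sketch is that the graph structure lives entirely in the container, so no layer has to store pointers to its neighbours.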
And thanks for the suggestion. I will send out a Google Doc containing both proposals.

Regards,
Shubham Agrawal

On Tue, Apr 12, 2022 at 8:36 PM Ryan Curtin <r...@ratml.org> wrote:
> On Thu, Apr 07, 2022 at 05:44:17PM +0530, Shubham Agrawal wrote:
> > Sorry for the late reply.
> >
> > My thought was to represent the DAG as an adjacency list. Storing
> > pointers to the next and previous layers in the layer itself is
> > required for the forward and backward passes; that's why I am trying
> > to use two utility layers to handle the start and end points. I think
> > I have provided some pseudocode for these passes, but I haven't
> > thought about anything too specific for now. We can also set up a
> > meeting to discuss this.
>
> Do you mean that you plan to modify the Layer class? That shouldn't be
> necessary. You should instead just need to hold the adjacency list in
> the class that holds all the layers. No modification should be needed
> to any of the layers themselves.
>
> > About the models list, I have selected some candidate models:
> >
> > 1. AlexNet - https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
> > 2. SqueezeNet - https://arxiv.org/pdf/1602.07360.pdf
> > 3. VGG 11, 13, 16, 19 - https://arxiv.org/pdf/1409.1556.pdf
> > 4. Xception - https://arxiv.org/pdf/1610.02357.pdf
> > 5. PolyNet - https://arxiv.org/pdf/1611.05725.pdf
> > 6. NASNet - https://arxiv.org/pdf/1707.07012.pdf
> >
> > NASNet and PolyNet can't be retrained on mlpack for now because of
> > the missing GPU support, and even on a GPU they take a long time to
> > train.
>
> Sorry that I don't have the context, but if your plan is to implement
> all six of these as well as the DAG network in one project, that's
> great---but do be aware that you may spend more time than you expect
> debugging the memory handling of the DAG network implementation. It's
> important that we avoid data copies, so some amount of time should go
> into that.
>
> You can take a look at, e.g., the implementations of the memory
> handling functions in MultiLayer (in #2777); there is one function to
> allocate memory for each of the forward/backward/gradient passes.
> Maybe you have already seen that, but in any case, the complexity will
> be a lot higher in the DAG case. :)
>
> I hope this is helpful!
>
> Thanks,
>
> Ryan
>
> --
> Ryan Curtin    | "Give a man a gun and he thinks he's Superman.
> r...@ratml.org | Give him two and he thinks he's God." - Pang
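P.S. Thanks for the pointer to the MultiLayer memory handling. To check my understanding of the "no data copies" requirement, here is a rough sketch of the kind of pre-allocation I believe you mean; the function and parameter names are placeholders of my own, not the actual API from #2777. The idea is to allocate one buffer and point each layer's output matrix into it with Armadillo's advanced (non-copying) constructor, so the forward pass writes its results without copies:

    // Rough sketch only: names are placeholders, not the MultiLayer API.
    #include <armadillo>
    #include <vector>

    void InitializeForwardMemory(const size_t batchSize,
                                 const std::vector<size_t>& outputDims,
                                 arma::mat& buffer,
                                 std::vector<arma::mat>& layerOutputs)
    {
      // Total number of elements all layer outputs need for this batch.
      size_t total = 0;
      for (const size_t dim : outputDims)
        total += dim * batchSize;
      buffer.set_size(total, 1);

      // Reserve up front so the vector never reallocates; a reallocation
      // would copy the alias matrices and break the aliasing.
      layerOutputs.clear();
      layerOutputs.reserve(outputDims.size());

      size_t offset = 0;
      for (const size_t dim : outputDims)
      {
        // copy_aux_mem = false, strict = true: each matrix is a view
        // into `buffer`, not a copy of it.
        layerOutputs.emplace_back(buffer.memptr() + offset, dim,
            batchSize, false, true);
        offset += dim * batchSize;
      }
    }

For the DAG case I expect this to be harder still, since a layer with several parents needs its input assembled from outputs that may not sit contiguously in the buffer.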