Hello Konstantin,

we will look at the proposal once we get a chance. Do you think you could upload
the draft to Google Docs? That makes it easier to write comments; maybe there
is also an option to comment on a PDF?

Thanks,
Marcus

> On 20 Mar 2017, at 20:16, Сидоров Константин <[email protected]> wrote:
> 
> Hello Marcus,
> I have submitted my application. I would be very interested in your feedback 
> on it, and especially I would like to know precisely what should be written 
> in the abstract. I have written something there, but I am really unsure if 
> it's correct or otherwise appropriate.
>  
> --
> Thanks in advance,
> Konstantin.
>  
> 15.03.2017, 18:30, "Marcus Edel" <[email protected]>:
>> Hello Konstantin,
>>  
>>  
>>> Of course, it's a good idea - I have implemented it. Updated gist:
>>> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
>>> However, I don't completely understand what the Backward method returns. I
>>> have written my understanding of it in the code comments - but I strongly
>>> doubt that I'm right. Can you review that comment and explain what it
>>> returns?
>>  
>> The Forward function computes the output of the implemented module using the
>> current parameters and the given input.
>>  
>> The Backward function computes the gradient of the implemented module with
>> respect to its own input.
>>  
>> The Gradient function computes the gradient of the implemented module with
>> respect to its own parameters. However, a bunch of modules don't have any
>> parameters so they don't implement this function.
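To make the three functions concrete, here is a minimal stand-alone sketch. This is not mlpack's actual layer API: std::vector<double> stands in for the Armadillo types mlpack uses, and the module itself (an elementwise scaling y = w * x) is invented purely to show what Forward, Backward, and Gradient each compute:

```cpp
#include <cstddef>
#include <vector>

// Illustrative module: y = w * x (elementwise scaling by one parameter w).
class ScaleLayer
{
 public:
  explicit ScaleLayer(double w) : w(w) {}

  // Forward: compute the output from the given input and current parameter.
  void Forward(const std::vector<double>& input, std::vector<double>& output)
  {
    output.resize(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
      output[i] = w * input[i];
  }

  // Backward: gradient of the module w.r.t. its own input; it maps the error
  // signal dL/dy coming from the next layer to dL/dx = w * dL/dy.
  void Backward(const std::vector<double>& gy, std::vector<double>& gx)
  {
    gx.resize(gy.size());
    for (std::size_t i = 0; i < gy.size(); ++i)
      gx[i] = w * gy[i];
  }

  // Gradient: gradient w.r.t. the module's own parameter,
  // dL/dw = sum_i input_i * (dL/dy)_i.
  double Gradient(const std::vector<double>& input,
                  const std::vector<double>& gy)
  {
    double gw = 0.0;
    for (std::size_t i = 0; i < input.size(); ++i)
      gw += input[i] * gy[i];
    return gw;
  }

 private:
  double w;
};
```

A parameterless module (e.g. a plain activation) would simply omit the Gradient function, as described above.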
>>  
>> Also, I noticed you described the REINFORCE version of HAM. I think it would
>> be easier to go with the fully differentiable HAM, what do you think? I've
>> implemented the model from the "Recurrent Models of Visual Attention" paper,
>> which also uses the REINFORCE method. I'll have to update the code slightly,
>> but if you opt for the REINFORCE HAM, it might help to take a look at what I
>> did.
>>  
>>  
>>> I looked in the core/tree directory
>>> (https://github.com/mlpack/mlpack/tree/master/src/mlpack/core/tree). There
>>> are some tree structures, but I can't see a good way to adapt them to the
>>> HAM architecture - and even if there is one, I think it will still be more
>>> convenient to implement the gist from my previous letter. The reason is
>>> that the tree structures in core/tree were created with different problems
>>> in mind and maintain problem-specific information, which is definitely not
>>> going to help, because it will cost both execution and developer time.
>>> Maybe I was just unable to find the right tree structure - if so, give me a
>>> link to it, because (as you mentioned) having a ready and tested
>>> implementation is helpful.
>>  
>> I agree. I'll see if I have some time to play with some ideas that use the
>> existing tree structures and let you know what I find out. But as I said,
>> I'm not against implementing our own structure; on the other hand, it would
>> make sense to reuse some code if that turns out to be a good idea.
>>  
>> Let me know what you think.
>>  
>> Thanks,
>> Marcus
>>  
>>> On 14 Mar 2017, at 18:23, Сидоров Константин <[email protected]> wrote:
>>>  
>>> Hello Marcus,
>>>  
>>>  
>>> The addition is simple: all we have to do is provide a minimal function
>>> set (Forward, Backward, and Gradient); for more information, take a look
>>> at the existing layers. What do you think about that? I think this would
>>> allow us to use a bunch of already existing features.
>>> Of course, it's a good idea - I have implemented it. Updated gist:
>>> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
>>> However, I don't completely understand what the Backward method returns. I
>>> have written my understanding of it in the code comments - but I strongly
>>> doubt that I'm right. Can you review that comment and explain what it
>>> returns?
>>>  
>>> That's right, the HAM module uses a fixed memory size, and the gist looks
>>> clean and minimal. However, I would also take a look at the existing mlpack
>>> implementations. If we implement some tree for the HAM model, we have to
>>> make sure it's well tested, and that sometimes takes more time than the
>>> actual implementation; if we can reuse some existing code, it's already
>>> tested. But as I said, if it turns out that implementing some specific
>>> structure is a better way, we can do that. There is no need to use code in
>>> a totally different way than it was designed for.
>>> I looked in the core/tree directory
>>> (https://github.com/mlpack/mlpack/tree/master/src/mlpack/core/tree).
>>> There are some tree structures, but I can't see a good way to adapt them
>>> to the HAM architecture - and even if there is one, I think it will still
>>> be more convenient to implement the gist from my previous letter. The
>>> reason is that the tree structures in core/tree were created with
>>> different problems in mind and maintain problem-specific information,
>>> which is definitely not going to help, because it will cost both execution
>>> and developer time. Maybe I was just unable to find the right tree
>>> structure - if so, give me a link to it, because (as you mentioned) having
>>> a ready and tested implementation is helpful.
>>>  
>>> --
>>> Best Regards,
>>> Konstantin.
>>>  
>>> 13.03.2017, 18:25, "Marcus Edel" <[email protected]>:
>>>> Hello Konstantin,
>>>>  
>>>>  
>>>>> gist @ GitHub is a great idea, so the API (with changes in the task
>>>>> definitions) is now here:
>>>>> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
>>>> Great, now if you modify the gist it automatically creates a new revision 
>>>> and we
>>>> can switch between revisions.
>>>>  
>>>>  
>>>>> Regarding the NTM and HAM API, I was thinking that it might be a good
>>>>> idea to implement the models as layers; that way someone could reuse the
>>>>> implementation inside other architectures. Sorry, I didn't quite
>>>>> understand your idea - can you restate it?
>>>>  
>>>> The HAM module was designed with the intention to be generic enough so 
>>>> that it
>>>> can be used as a building block of larger neural architectures. So, for 
>>>> example,
>>>> I have this recurrent neural network:
>>>>  
>>>> RNN<MeanSquaredError<> > model(rho);
>>>> model.Add<IdentityLayer<> >();
>>>> model.Add<Linear<> >(inputSize, 20);
>>>> model.Add<LSTM<> >(20, 7, rho);
>>>> model.Add<Linear<> >(7, outputSize);
>>>> model.Add<SigmoidLayer<> >();
>>>>  
>>>> instead of using the LSTM, I'd like to use the HAM module or the NTM. It
>>>> would be great if we could design the classes so that they can be
>>>> integrated into the current infrastructure. The addition is simple: all
>>>> we have to do is provide a minimal function set (Forward, Backward, and
>>>> Gradient); for more information, take a look at the existing layers. What
>>>> do you think about that? I think this would allow us to use a bunch of
>>>> already existing features.
>>>>  
>>>>  
>>>>> Also, I haven't completely thought this through, but maybe it's possible
>>>>> to use the existing binary or decision tree for the HAM model; that
>>>>> would save us a lot of work if it's manageable. In the HAM model, memory
>>>>> has a fixed size. In this situation, as a competitive programmer, I
>>>>> would write the code using std::vector as storage for the data and int
>>>>> (size_t, actually) instead of node pointers. Gist showcasing this idea:
>>>>> https://gist.github.com/partobs-mdp/411df153a6067008d27c255ebd0fb0cb. As
>>>>> you can see, the code is rather small (without STL iterator support,
>>>>> though) and quite easy to implement quickly and properly (the kind of
>>>>> thing that is all-important in such competitions).
>>>>  
>>>> That's right, the HAM module uses a fixed memory size, and the gist looks
>>>> clean and minimal. However, I would also take a look at the existing
>>>> mlpack implementations. If we implement some tree for the HAM model, we
>>>> have to make sure it's well tested, and that sometimes takes more time
>>>> than the actual implementation; if we can reuse some existing code, it's
>>>> already tested. But as I said, if it turns out that implementing some
>>>> specific structure is a better way, we can do that. There is no need to
>>>> use code in a totally different way than it was designed for.
>>>>  
>>>> Let us know what you think.
>>>>  
>>>> Best,
>>>> Marcus
>>>>  
>>>>  
>>>>> On 10 Mar 2017, at 18:54, Сидоров Константин <[email protected]> wrote:
>>>>>  
>>>>> Hello Marcus,
>>>>>  
>>>>> gist @ GitHub is a great idea, so the API (with changes in the task
>>>>> definitions) is now here:
>>>>> https://gist.github.com/partobs-mdp/1da986071430000c97a7021723cc90d5
>>>>>  
>>>>> Regarding the NTM and HAM API, I was thinking that it might be a good
>>>>> idea to implement the models as layers; that way someone could reuse the
>>>>> implementation inside other architectures.
>>>>> Sorry, I didn't quite understand your idea - can you restate it?
>>>>>  
>>>>> Also, I haven't completely thought this through, but maybe it's possible
>>>>> to use the existing binary or decision tree for the HAM model; that
>>>>> would save us a lot of work if it's manageable.
>>>>> In the HAM model, memory has a fixed size. In this situation, as a
>>>>> competitive programmer, I would write the code using std::vector as
>>>>> storage for the data and int (size_t, actually) instead of node
>>>>> pointers. Gist showcasing this idea:
>>>>> https://gist.github.com/partobs-mdp/411df153a6067008d27c255ebd0fb0cb.
>>>>> As you can see, the code is rather small (without STL iterator support,
>>>>> though) and quite easy to implement quickly and properly (the kind of
>>>>> thing that is all-important in such competitions).
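The index-based layout described here can be sketched as follows. This is a hypothetical illustration, not the code in the gist: a complete binary tree over a fixed-size memory, stored in a single std::vector, with size_t index arithmetic (children of node i at 2*i+1 and 2*i+2) replacing node pointers. The internal nodes aggregate sums here, purely as a placeholder for whatever join operation the HAM model actually needs:

```cpp
#include <cstddef>
#include <vector>

// Fixed-size complete binary tree over `leaves` memory cells, stored flat.
// Node i has children 2*i+1 and 2*i+2; leaves occupy indices
// [leaves - 1, 2 * leaves - 2]. Internal nodes hold sums (placeholder join).
class FixedTree
{
 public:
  explicit FixedTree(std::size_t leaves)
      : leaves(leaves), data(2 * leaves - 1, 0.0) {}

  // Write a leaf and update the aggregates on the path back to the root.
  void Update(std::size_t leaf, double value)
  {
    std::size_t i = leaves - 1 + leaf;  // flat index of the leaf node
    data[i] = value;
    while (i > 0)
    {
      i = (i - 1) / 2;                  // parent index
      data[i] = data[2 * i + 1] + data[2 * i + 2];
    }
  }

  double Root() const { return data[0]; }

 private:
  std::size_t leaves;
  std::vector<double> data;
};
```

An update touches only the O(log n) nodes on the leaf-to-root path, which matches the logarithmic memory-access pattern the HAM paper relies on.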
>>>>>  
>>>>> Right now, since I have a detailed version of the API, I have an idea:
>>>>> switch to writing the application. Do you mind if I send you a draft of
>>>>> my application for review?
>>>>>  
>>>>> --
>>>>> Thanks in advance,
>>>>> Konstantin
>>>>>  
>>>>> 10.03.2017, 18:39, "Marcus Edel" <[email protected]>:
>>>>>> Hello Konstantin,
>>>>>>  
>>>>>> you put some really good thoughts into the first draft. Do you think
>>>>>> switching to a GitHub gist or something similar would make it easier to
>>>>>> share the code?
>>>>>>
>>>>>> About the tasks, I think we should put all task-specific parameters
>>>>>> into the constructor instead of passing them to the Evaluate function.
>>>>>> If we provide a unified function like Evaluate for each task, it's
>>>>>> probably easier to use. Besides that, the task API looks really clean.
>>>>>>  
>>>>>> Regarding the NTM and HAM API, I was thinking that it might be a good
>>>>>> idea to implement the models as layers; that way someone could reuse
>>>>>> the implementation inside other architectures. I like the idea of
>>>>>> implementing Highway Networks, but instead of implementing that idea
>>>>>> first, I would recommend implementing it if there is time left and
>>>>>> proposing it as an improvement. Also, I haven't completely thought this
>>>>>> through, but maybe it's possible to use the existing binary or decision
>>>>>> tree for the HAM model; that would save us a lot of work if it's
>>>>>> manageable.
>>>>>>  
>>>>>> Let me know what you think.
>>>>>>  
>>>>>> Thanks,
>>>>>> Marcus
>>>>>>  
>>>>>>> On 9 Mar 2017, at 12:09, Сидоров Константин <[email protected]> wrote:
>>>>>>>  
>>>>>>> Hello Marcus,
>>>>>>> As a continuation of the API discussion, I've made a C++ header file
>>>>>>> with the ideas we've previously discussed, plus the ideas you pointed
>>>>>>> out in your last letter; I attach it to this letter. Speaking about
>>>>>>> non-NTM augmented models, I'm inclined towards choosing Hierarchical
>>>>>>> Attentive Memory, because I have some experience in reinforcement
>>>>>>> learning and the idea by itself seems very interesting.
>>>>>>> However, I have the feeling that I've somewhat messed up the HAM code
>>>>>>> (partly because I haven't quite understood the paper yet, partly
>>>>>>> because I'm having a hard time getting back into the C++ world). For
>>>>>>> this reason, can you review the API / C++ declaration file sketch?
>>>>>>>  
>>>>>>> --
>>>>>>> Thanks in advance,
>>>>>>> Konstantin
>>>>>>>  
>>>>>>> 08.03.2017, 16:57, "Marcus Edel" <[email protected]>:
>>>>>>>> Hello Konstantin,
>>>>>>>>  
>>>>>>>>  
>>>>>>>>> Something like this:
>>>>>>>>> https://thesundayprogrammer.wordpress.com/2016/01/27/neural-networks-using-mlpack/,
>>>>>>>>> but written for the up-to-date mlpack version rather than the
>>>>>>>>> version from last year. Of course, I tried hard to google it, but
>>>>>>>>> failed :)
>>>>>>>>  
>>>>>>>> I'm working on a tutorial; for now, you can take a look at the unit
>>>>>>>> test cases, where different architectures are tested on simple
>>>>>>>> problems:
>>>>>>>>  
>>>>>>>> https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/feedforward_network_test.cpp
>>>>>>>> https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/recurrent_network_test.cpp
>>>>>>>> https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/convolutional_network_test.cpp
>>>>>>>>  
>>>>>>>> Let me know if I should translate the model used from the blog post.
>>>>>>>>  
>>>>>>>>  
>>>>>>>>> I think it's time to sum up the discussed API: we create the
>>>>>>>>> namespace mlpack::ann::augmented and do all the work in it. For
>>>>>>>>> instance, we create an NTM class with Train() / Evaluate() methods
>>>>>>>>> (standard for mlpack). Also, we create classes for evaluating models
>>>>>>>>> (for instance, CopyTask). As an idea: what do you think about
>>>>>>>>> creating yet another namespace for such tasks (for instance, the
>>>>>>>>> full class name would be mlpack::ann::augmented::tasks::CopyTask)?
>>>>>>>>> Earlier I offered to inherit them from BaseBenchmark, but now I see
>>>>>>>>> it's not the only way. Right now, I'm more inclined to think that
>>>>>>>>> not using inheritance is better because of the argument about
>>>>>>>>> virtual functions. What do you think?
>>>>>>>>  
>>>>>>>> I agree, using a subnamespace of augmented is a good idea; it makes
>>>>>>>> things clearer. Also, I think we can use another template parameter
>>>>>>>> for e.g. the Controller, which probably makes it easier to switch
>>>>>>>> between a feed-forward and an LSTM controller. There are some other
>>>>>>>> places where we could use templates to generalize the API.
>>>>>>>>  
>>>>>>>>  
>>>>>>>>> I'm interested in continuing the discussion and getting deeper into
>>>>>>>>> the project. By the way, sorry for the delay with the first API
>>>>>>>>> idea. The reason is that I'm a freshman student, and my university
>>>>>>>>> studies often get quite stressful. However, right now I'm trying to
>>>>>>>>> submit all of my coursework and finish the semester early, with the
>>>>>>>>> goal of fully transitioning to GSoC work :)
>>>>>>>>  
>>>>>>>> No worries, take all the time you need for your studies; there is
>>>>>>>> plenty of time in which we can discuss ideas. Just one note /
>>>>>>>> recommendation: take a look at other augmented memory models as well,
>>>>>>>> because the NTM alone is probably not enough work for GSoC.
>>>>>>>>  
>>>>>>>> Thanks,
>>>>>>>> Marcus
>>>>>>>>  
>>>>>>>>  
>>>>>>>>> On 7 Mar 2017, at 16:47, Сидоров Константин <[email protected]> wrote:
>>>>>>>>>  
>>>>>>>>> Hello Marcus,
>>>>>>>>>  
>>>>>>>>> Regarding the BaseAugmented class, we should avoid inheritance (or,
>>>>>>>>> at least, virtual inheritance) in favor of templates, because
>>>>>>>>> virtual functions incur runtime overhead. In this case I don't see a
>>>>>>>>> compelling reason to introduce a BaseAugmented class; maybe I missed
>>>>>>>>> something? We could do:
>>>>>>>>>  
>>>>>>>>> Code:
>>>>>>>>> =====
>>>>>>>>>  
>>>>>>>>> class CopyTask
>>>>>>>>> {
>>>>>>>>>  public:
>>>>>>>>>   template<typename ModelType>
>>>>>>>>>   void Evaluate(ModelType& model)
>>>>>>>>>   {
>>>>>>>>>     // Call model.Train() using the generated copy-task data.
>>>>>>>>>     // Call model.Evaluate() on held-out copy-task data.
>>>>>>>>>   }
>>>>>>>>> };
>>>>>>>>>  
>>>>>>>>> Usage:
>>>>>>>>> ======
>>>>>>>>>  
>>>>>>>>> NTM ntm;
>>>>>>>>> CopyTask task;
>>>>>>>>> task.Evaluate(ntm);
>>>>>>>>>  
>>>>>>>>> What do you think about the design?
>>>>>>>>> 
>>>>>>>>> This is definitely an excellent idea :) Actually, this shows that I
>>>>>>>>> have forgotten most of C++ - here, I forgot that after using the
>>>>>>>>> template keyword I can call any functions I like (just like in
>>>>>>>>> Python, actually). So, it's about time to refresh my C++ - and
>>>>>>>>> thanks for your advice on resources, since my C++ experience was
>>>>>>>>> biased towards competitive programming, which mostly ignores these
>>>>>>>>> language features.
>>>>>>>>>  
>>>>>>>>> - Can you show me a working example of mlpack::ann::FFN compatible
>>>>>>>>> with the current upstream C++ API?
>>>>>>>>>  
>>>>>>>>> Not sure what you mean, can you elaborate more?
>>>>>>>>> 
>>>>>>>>> Something like this:
>>>>>>>>> https://thesundayprogrammer.wordpress.com/2016/01/27/neural-networks-using-mlpack/,
>>>>>>>>> but written for the up-to-date mlpack version, rather than the
>>>>>>>>> version from last year. Of course, I tried hard to google it, but
>>>>>>>>> failed :)
>>>>>>>>>  
>>>>>>>>> I think it's time to sum up the discussed API: we create the
>>>>>>>>> namespace mlpack::ann::augmented and do all the work in it. For
>>>>>>>>> instance, we create an NTM class with Train() / Evaluate() methods
>>>>>>>>> (standard for mlpack). Also, we create classes for evaluating models
>>>>>>>>> (for instance, CopyTask).
>>>>>>>>> As an idea: what do you think about creating yet another namespace
>>>>>>>>> for such tasks (for instance, the full class name would be
>>>>>>>>> mlpack::ann::augmented::tasks::CopyTask)? Earlier I offered to
>>>>>>>>> inherit them from BaseBenchmark, but now I see it's not the only
>>>>>>>>> way. Right now, I'm more inclined to think that not using
>>>>>>>>> inheritance is better because of the argument about virtual
>>>>>>>>> functions. What do you think?
>>>>>>>>>  
>>>>>>>>> I'm interested in continuing the discussion and getting deeper into
>>>>>>>>> the project. By the way, sorry for the delay with the first API
>>>>>>>>> idea. The reason is that I'm a freshman student, and my university
>>>>>>>>> studies often get quite stressful. However, right now I'm trying to
>>>>>>>>> submit all of my coursework and finish the semester early, with the
>>>>>>>>> goal of fully transitioning to GSoC work :)
>>>>>>>>>  
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>> Konstantin.
>>>>>>>>>  
>>>>>>>>> 07.03.2017, 17:54, "Marcus Edel" <[email protected]>:
>>>>>>>>>> Hello Konstantin,
>>>>>>>>>>  
>>>>>>>>>> thanks for getting back.
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>> Since we have to make several different augmented RNN models, I
>>>>>>>>>>> think it is a good idea to make a namespace mlpack::ann::augmented
>>>>>>>>>>> and a class BaseAugmented (it will be useful for benchmarking
>>>>>>>>>>> later).
>>>>>>>>>>  
>>>>>>>>>> Putting different networks under a unified namespace is a good
>>>>>>>>>> idea. Regarding the BaseAugmented class, we should avoid
>>>>>>>>>> inheritance (or, at least, virtual inheritance) in favor of
>>>>>>>>>> templates, because virtual functions incur runtime overhead. In
>>>>>>>>>> this case I don't see a compelling reason to introduce a
>>>>>>>>>> BaseAugmented class; maybe I missed something? We could do:
>>>>>>>>>>  
>>>>>>>>>> Code:
>>>>>>>>>> =====
>>>>>>>>>>  
>>>>>>>>>> class CopyTask
>>>>>>>>>> {
>>>>>>>>>>  public:
>>>>>>>>>>   template<typename ModelType>
>>>>>>>>>>   void Evaluate(ModelType& model)
>>>>>>>>>>   {
>>>>>>>>>>     // Call model.Train() using the generated copy-task data.
>>>>>>>>>>     // Call model.Evaluate() on held-out copy-task data.
>>>>>>>>>>   }
>>>>>>>>>> };
>>>>>>>>>>  
>>>>>>>>>> Usage:
>>>>>>>>>> ======
>>>>>>>>>>  
>>>>>>>>>> NTM ntm;
>>>>>>>>>> CopyTask task;
>>>>>>>>>> task.Evaluate(ntm);
>>>>>>>>>>  
>>>>>>>>>> What do you think about the design?
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>> P.S. Some questions that arose while trying to get to grips with
>>>>>>>>>>> mlpack:
>>>>>>>>>>> - What resources can you recommend to brush up on
>>>>>>>>>>> *advanced/modern* C++? (e.g., templates, && in functions)
>>>>>>>>>>  
>>>>>>>>>> There are some really nice books that might help to refresh your 
>>>>>>>>>> knowledge:
>>>>>>>>>>  
>>>>>>>>>> - "Modern C++ Design, Generic Programming and Design Patterns 
>>>>>>>>>> Applied" by Andrei Alexandrescu
>>>>>>>>>> - "Effective C++" by Scott Meyers
>>>>>>>>>> - "Effective STL" by Scott Meyers
>>>>>>>>>>  
>>>>>>>>>> There are also some references on http://mlpack.org/gsoc.html
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>> - Can you show me a working example of mlpack::ann::FFN
>>>>>>>>>>> compatible with the current upstream C++ API?
>>>>>>>>>>  
>>>>>>>>>> Not sure what you mean, can you elaborate more?
>>>>>>>>>>  
>>>>>>>>>> I hope this is helpful, let us know if you have any more questions.
>>>>>>>>>>  
>>>>>>>>>> Thanks,
>>>>>>>>>> Marcus
>>>>>>>>>>  
>>>>>>>>>>> On 5 Mar 2017, at 10:12, Сидоров Константин <[email protected]> wrote:
>>>>>>>>>>>  
>>>>>>>>>>> Hello Marcus!
>>>>>>>>>>> Right now, I'm thinking quite a lot about the "Augmented RNNs"
>>>>>>>>>>> project. For example, I'm trying to convert the ideas from the
>>>>>>>>>>> project description into something more concrete and
>>>>>>>>>>> mlpack-specific. Here is my first shot at a C++ API for the NTM.
>>>>>>>>>>> Disclaimer: I haven't written C++ for ~10 months. My last
>>>>>>>>>>> experience of C++ coding was our (Russian) national programming
>>>>>>>>>>> olympiad (April '16), after which I've been coding almost
>>>>>>>>>>> exclusively in Python.
>>>>>>>>>>> Since we have to make several different augmented RNN models, I
>>>>>>>>>>> think it is a good idea to make a namespace mlpack::ann::augmented
>>>>>>>>>>> and a class BaseAugmented (it will be useful for benchmarking
>>>>>>>>>>> later).
>>>>>>>>>>> Inside it, we can define class NTM< OutputLayerType,
>>>>>>>>>>> InitializationRuleType > : BaseAugmented with the standard mlpack
>>>>>>>>>>> interface:
>>>>>>>>>>> - template<typename NetworkType, typename Lambda> NTM(NetworkType
>>>>>>>>>>> controller, arma::mat &memory, Lambda similarity)
>>>>>>>>>>> - Predict(arma::mat &predictors, arma::mat &responses)
>>>>>>>>>>> - Train(const arma::mat &predictors, const arma::mat &responses)
>>>>>>>>>>> For benchmarking, we can define class
>>>>>>>>>>> mlpack::ann::augmented::BaseBenchmark with the interface:
>>>>>>>>>>> - BaseBenchmark() = 0
>>>>>>>>>>> - Evaluate() = 0 (the method that accepts task parameters as
>>>>>>>>>>> arguments and runs the augmented model)
>>>>>>>>>>> As an example, the API of CopyBenchmark : BaseBenchmark:
>>>>>>>>>>> - CopyBenchmark()
>>>>>>>>>>> - Evaluate(BaseAugmented model, int maxLength = 5, int repeats = 1)
>>>>>>>>>>> // repeats is a parameter that converts copy to the repeat-copy
>>>>>>>>>>> task.
>>>>>>>>>>> So, that's some kind of API. I would be interested to discuss and 
>>>>>>>>>>> analyze it with you.
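The bullets above might translate into a header sketch like the following. This is purely illustrative: Matrix is a stand-in for arma::mat so the sketch compiles on its own, the empty bodies are placeholders, and Evaluate's integer return value (the number of generated test sequences) is an invented detail:

```cpp
#include <cstddef>

// Stand-in for arma::mat so this sketch is self-contained.
struct Matrix {};

namespace mlpack { namespace ann { namespace augmented {

template<typename OutputLayerType, typename InitializationRuleType>
class NTM
{
 public:
  // Controller network, external memory, and similarity measure, matching
  // the constructor bullet above.
  template<typename NetworkType, typename Lambda>
  NTM(NetworkType /* controller */, Matrix& /* memory */,
      Lambda /* similarity */) { }

  void Train(const Matrix& /* predictors */, const Matrix& /* responses */) { }
  void Predict(Matrix& /* predictors */, Matrix& /* responses */) { }
};

// Benchmark taking the model as a template parameter, i.e. the
// inheritance-free alternative to BaseBenchmark.
class CopyBenchmark
{
 public:
  // repeats > 1 turns copy into the repeat-copy task; the return value
  // (number of generated test sequences) is a placeholder detail.
  template<typename ModelType>
  int Evaluate(ModelType& /* model */, int maxLength = 5, int repeats = 1)
  {
    // Generate (repeat-)copy data, call model.Train(), then model.Predict().
    return maxLength * repeats;
  }
};

} } } // namespace mlpack::ann::augmented
```

Because Evaluate is templated on ModelType, any model exposing Train/Predict works without a common base class.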
>>>>>>>>>>>  
>>>>>>>>>>> P.S. Some questions that arose while trying to get to grips with
>>>>>>>>>>> mlpack:
>>>>>>>>>>> - What resources can you recommend to brush up on
>>>>>>>>>>> *advanced/modern* C++? (e.g., templates, && in functions)
>>>>>>>>>>> - Can you show me a working example of mlpack::ann::FFN
>>>>>>>>>>> compatible with the current upstream C++ API?
>>>>>>>>>>>  
>>>>>>>>>>> 28.02.2017, 16:43, "Marcus Edel" <[email protected]>:
>>>>>>>>>>>> Hello Konstantin,
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>> My name is Konstantin Sidorov, and I am an undergraduate student
>>>>>>>>>>>>> at Astrakhan State University (Russia). I’m glad to know that
>>>>>>>>>>>>> mlpack was accepted into GSoC ’17 – as a side note,
>>>>>>>>>>>>> congratulations :)
>>>>>>>>>>>>  
>>>>>>>>>>>> thanks and welcome!
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>> I’m already fairly familiar with deep learning. For example,
>>>>>>>>>>>>> recently I implemented optimality tightening from “Learning to
>>>>>>>>>>>>> play in a day” (https://arxiv.org/abs/1611.01606) for AgentNet
>>>>>>>>>>>>> (“Deep Reinforcement Learning library for humans”,
>>>>>>>>>>>>> https://github.com/yandexdataschool/AgentNet).
>>>>>>>>>>>>  
>>>>>>>>>>>> Sounds really interesting, the "Learning to Play in a Day" paper 
>>>>>>>>>>>> is on my
>>>>>>>>>>>> reading list, looks like I should move it up.
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>> Of course, at such an early stage I have no detailed plan of
>>>>>>>>>>>>> what (and how) to do – only some ideas. In the beginning, for
>>>>>>>>>>>>> example, I’m planning to implement NTMs as described in the
>>>>>>>>>>>>> arXiv paper and to implement *reusable* benchmarking code (e.g.,
>>>>>>>>>>>>> copy, repeat copy, n-grams). I would like to discuss this
>>>>>>>>>>>>> project more thoroughly if possible. In addition, this is my
>>>>>>>>>>>>> first participation in GSoC, so excuse me in advance if I’ve
>>>>>>>>>>>>> done something inappropriate.
>>>>>>>>>>>>  
>>>>>>>>>>>> Implementing the NTM tasks from the paper so that they can be
>>>>>>>>>>>> used for other models as well is a great idea. In fact, you see a
>>>>>>>>>>>> lot of other papers that at least reuse the copy task. There are
>>>>>>>>>>>> a bunch of other interesting tasks that could be implemented,
>>>>>>>>>>>> like the MNIST pen stroke classification task recently introduced
>>>>>>>>>>>> by Edwin D. de Jong in his "Incremental Sequence Learning" paper.
>>>>>>>>>>>> The Stanford Natural Language Inference task proposed by Samuel
>>>>>>>>>>>> R. Bowman et al. in "A large annotated corpus for learning
>>>>>>>>>>>> natural language inference" can also be transformed into a
>>>>>>>>>>>> long-term dependency task, which might be interesting.
>>>>>>>>>>>>  
>>>>>>>>>>>> Regarding the project itself, take a look at other models as
>>>>>>>>>>>> well; depending on the model you choose, I think there is some
>>>>>>>>>>>> time left for another model. Also, about the implementation:
>>>>>>>>>>>> mlpack's architecture is kind of different from Theano's graph
>>>>>>>>>>>> construction and compilation workflow, but if you managed to work
>>>>>>>>>>>> with Theano you shouldn't have a problem.
>>>>>>>>>>>>  
>>>>>>>>>>>> If you like we can discuss any details over the mailing list and 
>>>>>>>>>>>> brainstorm some
>>>>>>>>>>>> ideas, discuss an initial class design, etc.
>>>>>>>>>>>>  
>>>>>>>>>>>> I hope this is helpful, let us know if you have any more questions.
>>>>>>>>>>>>  
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Marcus
>>>>>>>>>>>>  
>>>>>>>>>>>>> On 28 Feb 2017, at 07:06, Сидоров Константин <[email protected]> wrote:
>>>>>>>>>>>>>  
>>>>>>>>>>>>> Hello Marcus,
>>>>>>>>>>>>> My name is Konstantin Sidorov, and I am an undergraduate student
>>>>>>>>>>>>> at Astrakhan State University (Russia). I’m glad to know that
>>>>>>>>>>>>> mlpack was accepted into GSoC ’17 – as a side note,
>>>>>>>>>>>>> congratulations :)
>>>>>>>>>>>>> I’m interested in working on the project “Augmented Recurrent
>>>>>>>>>>>>> Neural Networks”. I’m already fairly familiar with deep
>>>>>>>>>>>>> learning. For example, recently I implemented optimality
>>>>>>>>>>>>> tightening from “Learning to play in a day”
>>>>>>>>>>>>> (https://arxiv.org/abs/1611.01606) for AgentNet (“Deep
>>>>>>>>>>>>> Reinforcement Learning library for humans”,
>>>>>>>>>>>>> https://github.com/yandexdataschool/AgentNet). Here is the
>>>>>>>>>>>>> merged pull request:
>>>>>>>>>>>>> https://github.com/yandexdataschool/AgentNet/pull/88.
>>>>>>>>>>>>> As you can see, I’m quite familiar with deep learning and
>>>>>>>>>>>>> Theano. Even though my main field of interest is RL, I would be
>>>>>>>>>>>>> very interested in doing something new – that is why I’ve chosen
>>>>>>>>>>>>> “Augmented RNNs”.
>>>>>>>>>>>>> Of course, at such an early stage I have no detailed plan of
>>>>>>>>>>>>> what (and how) to do – only some ideas. In the beginning, for
>>>>>>>>>>>>> example, I’m planning to implement NTMs as described in the
>>>>>>>>>>>>> arXiv paper and to implement *reusable* benchmarking code (e.g.,
>>>>>>>>>>>>> copy, repeat copy, n-grams).
>>>>>>>>>>>>> I would like to discuss this project more thoroughly if
>>>>>>>>>>>>> possible. In addition, this is my first participation in GSoC,
>>>>>>>>>>>>> so excuse me in advance if I’ve done something inappropriate.
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>> Konstantin.
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> mlpack mailing list
>>>>>>>>>>>>> [email protected] <mailto:[email protected]>
>>>>>>>>>>>>> http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
>>>>>>>>>>>  
>>>>>>>>>>> --
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Konstantin.
>>>>>>>>>  
>>>>>>>  
>>>>>>>  
>>>>>>> -- 
>>>>>>> Сидоров Константин
>>>>>>> Phone: +7 917 095-31-27
>>>>>>> E-mail: [email protected]
>>>>>>>  
>>>>>>> <api.hpp>
>>>>>  
>>>>>  
>>>>> -- 
>>>>> Сидоров Константин
>>>>> Phone: +7 917 095-31-27
>>>>> E-mail: [email protected]
>>>  
>>> -- 
>>> Сидоров Константин
>>> Phone: +7 917 095-31-27
>>> E-mail: [email protected]
>>>  
>  
>  
> -- 
> Сидоров Константин
> Phone: +7 917 095-31-27
> E-mail: [email protected]
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
