Hi Ryan.

>> My suggestion is to add another overload:
>>
>>   HyperParameterOptimizer<...> h(data, datasetInfo, labels);
>>
>> This is because I consider the dataset information, which encodes the
>> types of dimensions, to be a part of the dataset. Not all machine
>> learning methods support a DatasetInfo object; I believe that it is only
>> DecisionTree and HoeffdingTree at the moment (maybe there is one more I
>> forgot).
>
> There are pros and cons of such a design. Advantage: for some users it can be
> more natural to pass datasetInfo into the constructor rather than into the
> method Optimize. Disadvantages: 1) we need to double the number of
> constructors for HyperParameterOptimizer, as well as for the cross-validation
> classes KFoldCV and SimpleCV (4 in total - weighted/non-weighted learning +
> presence/absence of datasetInfo parameter); 2) we need to double the number
> of considered cases in the implementation of the method Evaluate of
> cross-validation classes (4 in total again - weighted/non-weighted learning +
> presence/absence of datasetInfo parameter); 3) I’m not sure it can be
> refactored in some way, so the same probably will be true for new
> cross-validation classes.
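To make disadvantage 1) concrete, here is a minimal sketch of the four constructor combinations. Everything in it is an assumption for illustration: ToyCV, ToyDatasetInfo, and the Matrix/Labels/Weights aliases are hypothetical stand-ins, not the real KFoldCV or SimpleCV interfaces and not mlpack types.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real data types (not mlpack classes).
using Matrix = std::vector<std::vector<double>>;
using Labels = std::vector<std::size_t>;
using Weights = std::vector<double>;
struct ToyDatasetInfo { std::vector<bool> isCategorical; };

// Toy cross-validation class: one constructor per combination of
// (with/without DatasetInfo) x (with/without weights), four in total.
class ToyCV
{
 public:
  // Case 0: no DatasetInfo, non-weighted learning.
  ToyCV(Matrix dataIn, Labels labelsIn) :
      data(std::move(dataIn)), labels(std::move(labelsIn)) { }

  // Case 1: no DatasetInfo, weighted learning.
  ToyCV(Matrix dataIn, Labels labelsIn, Weights weightsIn) :
      data(std::move(dataIn)), labels(std::move(labelsIn)),
      weights(std::move(weightsIn)), hasWeights(true)
  {
    assert(weights.size() == labels.size());
  }

  // Case 2: DatasetInfo given, non-weighted learning.
  ToyCV(Matrix dataIn, ToyDatasetInfo infoIn, Labels labelsIn) :
      data(std::move(dataIn)), info(std::move(infoIn)),
      labels(std::move(labelsIn)), hasInfo(true) { }

  // Case 3: DatasetInfo given, weighted learning.
  ToyCV(Matrix dataIn, ToyDatasetInfo infoIn, Labels labelsIn,
        Weights weightsIn) :
      data(std::move(dataIn)), info(std::move(infoIn)),
      labels(std::move(labelsIn)), weights(std::move(weightsIn)),
      hasInfo(true), hasWeights(true)
  {
    assert(weights.size() == labels.size());
  }

  // An Evaluate() method would need the same four-way branching; this toy
  // version only reports which of the four cases is active (0 to 3).
  int Case() const { return (hasInfo ? 2 : 0) + (hasWeights ? 1 : 0); }

 private:
  Matrix data;
  ToyDatasetInfo info;
  Labels labels;
  Weights weights;
  bool hasInfo = false;
  bool hasWeights = false;
};
```

Each of the four call shapes (data, labels), (data, labels, weights), (data, info, labels), and (data, info, labels, weights) selects a different constructor, and the same split would repeat in every new cross-validation class unless it can be factored out.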

I would just like to ask whether I need to clarify anything in my response. I
look forward to seeing what you think about the problem: should we provide
additional constructors (for HyperParameterOptimizer and the cross-validation
classes), or should we change constructor signatures as we discussed in
https://github.com/mlpack/mlpack/issues/929?

Best regards,

Kirill Mishchenko

> On 28 Apr 2017, at 08:09, Kirill Mishchenko <[email protected]> wrote:
>
> Hi Ryan.
>
>> My suggestion is to add another overload:
>>
>>   HyperParameterOptimizer<...> h(data, datasetInfo, labels);
>>
>> This is because I consider the dataset information, which encodes the
>> types of dimensions, to be a part of the dataset. Not all machine
>> learning methods support a DatasetInfo object; I believe that it is only
>> DecisionTree and HoeffdingTree at the moment (maybe there is one more I
>> forgot).
>
> There are pros and cons of such a design. Advantage: for some users it can be
> more natural to pass datasetInfo into the constructor rather than into the
> method Optimize. Disadvantages: 1) we need to double the number of
> constructors for HyperParameterOptimizer, as well as for the cross-validation
> classes KFoldCV and SimpleCV (4 in total - weighted/non-weighted learning +
> presence/absence of datasetInfo parameter); 2) we need to double the number
> of considered cases in the implementation of the method Evaluate of
> cross-validation classes (4 in total again - weighted/non-weighted learning +
> presence/absence of datasetInfo parameter); 3) I’m not sure it can be
> refactored in some way, so the same probably will be true for new
> cross-validation classes.
>
>> But now, we have C++11
>> and rvalue references, so we can do a redesign here to work around at
>> least the first issue: we can have the optimizers hold 'FunctionType',
>> and allow the user to pass in a 'FunctionType&&' and then use the move
>> constructor.
>
> I’m not sure it’s possible since we don’t know the type of the template
> parameter FunctionType until we initialize it in the body of the method
> Optimize.
>
>> Thanks again for the discussion,
>
> My pleasure.
>
> Best regards,
>
> Kirill Mishchenko
>
>> On 26 Apr 2017, at 20:17, Ryan Curtin <[email protected]> wrote:
>>
>> On Wed, Apr 26, 2017 at 11:24:18AM +0500, Kirill Mishchenko wrote:
>>> Hi Ryan.
>>>
>>>> The key problem, like you said, is that we don't know what AuxType
>>>> should be so we can't call its constructor. But maybe we can adapt
>>>> things a little bit:
>>>>
>>>>   template<typename AuxType, typename... Args>
>>>>   struct Holder /* needs a better name */
>>>>   {
>>>>     // This typedef allows us access to the type we need to construct.
>>>>     typedef AuxType Aux;
>>>>
>>>>     // These are the parameters we will use.
>>>>     std::tuple<Args...> args;
>>>>
>>>>     Holder(Args... argsIn) { /* put argsIn into args */ }
>>>>   };
>>>>
>>>> Then we could use this together with the Bind() class when calling an
>>>> optimizer:
>>>>
>>>>   std::array<double, 3> param3s = { 1.0, 2.0, 4.0 };
>>>>   std::array<double, 2> auxParam1s = { 1.0, 3.0 };
>>>>   std::array<double, 4> auxParam2s = { 4.0, 5.0, 6.0, 8.0 };
>>>>   auto results = tuner.Optimize<GridSearch>(Bind(param1), Bind(param2),
>>>>       param3s, Holder<AuxType>(auxParam1s, auxParam2s));
>>>>
>>>> Like most of my other code ideas, this is a very basic sketchup, but I
>>>> think it can work. Let me know what you think or if there is some
>>>> detail I did not think about enough that will make the idea fail. :)
>>>
>>> I think this approach is quite implementable.
>>> Moreover, we should be able to provide support of Bind for aux
>>> parameters:
>>>
>>>   std::array<double, 3> param3s = { 1.0, 2.0, 4.0 };
>>>   double auxParam1 = 1.0;
>>>   std::array<double, 4> auxParam2s = { 4.0, 5.0, 6.0, 8.0 };
>>>   auto results = tuner.Optimize<GridSearch>(Bind(param1), Bind(param2),
>>>       param3s, Holder<AuxType>(Bind(auxParam1), auxParam2s));
>>
>> Yeah, that seems like it will work. It might be worth spending some
>> time thinking about what would be the easiest for the user to
>> understand, but in either case the general implementation will be the
>> same.
>>
>>>> Sure; I think maybe we should allow the user to pass in a DatasetInfo
>>>> with the training data and labels, to keep things simple.
>>>
>>> Can you clarify a bit more what you mean here?
>>
>> Yeah, my impression is that the user creates the hyperparameter
>> optimizer like this:
>>
>>   HyperParameterOptimizer<...> h(data, labels);
>>
>> My suggestion is to add another overload:
>>
>>   HyperParameterOptimizer<...> h(data, datasetInfo, labels);
>>
>> This is because I consider the dataset information, which encodes the
>> types of dimensions, to be a part of the dataset. Not all machine
>> learning methods support a DatasetInfo object; I believe that it is only
>> DecisionTree and HoeffdingTree at the moment (maybe there is one more I
>> forgot).
>>
>>>>   // move optimizer type to class template parameter
>>>>   HyperParameterOptimizer<SoftmaxRegression<>, Accuracy, KFoldCV, SA> h;
>>>>
>>>>   h.Optimizer().Tolerance() = 1e-5;
>>>>   h.Optimizer().MoveCtrlSweep() = 3;
>>>>
>>>>   h.Optimize(…);
>>>
>>> In this approach we need to construct an optimizer before the method
>>> Optimize (of HyperParamOptimizer(Tuner) in the example above) is
>>> called, and it can be very problematic for two reasons.
>>>
>>> 1. We don’t know what FunctionType object (which wraps cross
>>> validation) to optimize since it depends on what we pass to the method
>>> Optimize (in particular, it depends on whether or not we bind some
>>> arguments).
>>>
>>> 2. In the case of GridSearch we also don’t know sets of values for
>>> parameters before calling the method Optimize. Recall that we pass
>>> these sets of values during construction of a GridSearch object.
>>
>> Right, I see what you mean. At the current time the mlpack optimizers
>> expect a 'FunctionType&' to be passed to the optimizer, and this
>> reference is held internally. However, that design decision was made
>> before C++11 and was intended to avoid copies. But now, we have C++11
>> and rvalue references, so we can do a redesign here to work around at
>> least the first issue: we can have the optimizers hold 'FunctionType',
>> and allow the user to pass in a 'FunctionType&&' and then use the move
>> constructor.
>>
>> In that way, you could create an optimizer without having access to the
>> instantiated FunctionType.
>>
>> I can see a few ways to solve the second issue after that change is
>> done. But in either case, the goal from my end would be to avoid a big
>> long call to Optimize() that has Bind(), Holder<>(), and
>> OptimizerArg() types all in it. I think the idea of passing optimizer
>> arguments after the arguments to the machine learning algorithm and
>> marking them all with OptimizerArg() might be confusing for users, and
>> it's easier if they can directly modify the parameters of the optimizer.
>>
>>>> If that's correct, then it might be nice to implement some additional
>>>> idea such as when the user passes a 'math::Range<double> lambda', the
>>>> search will be over all possible values of lambda within the given
>>>> range.
>>>> (One can simply modify the objective value to be DBL_MAX when
>>>> outside the bounds of the given lambda, or we can consider visiting how
>>>> optimizers can work in a constrained context.)
>>>
>>> I think this behaviour should be handled by optimizers since we
>>> are supposed to call them only once. I guess we have already touched on
>>> this feature in the discussion about simulated annealing.
>>
>> I agree; at the current time we don't have any support for constrained
>> optimizers though. Whatever you end up implementing for GridSearch
>> might be a good start, since technically grid search is a special case
>> of constrained optimization.
>>
>>> In the light of what we have discussed recently I think it is worth
>>> revisiting what can be implemented as a GSoC project, and when. <...>
>>
>> I agree with the changes that you have proposed.
>>
>> Thanks again for the discussion, I think the ideas here are getting
>> really mature. I think that there is some cool functionality that will
>> be possible with these modules that isn't possible in any other machine
>> learning library. For instance, even just hyperparameter search over
>> continuous variables isn't very well supported by other toolkits, and
>> would be a really nice thing to showcase for mlpack.
>>
>> Ryan
>>
>> --
>> Ryan Curtin    | "You can think about it... but don't do it."
>> [email protected] |   - Sheriff Justice
>
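
For reference, the Holder idea quoted above could be filled in roughly as follows. This is only a sketch under assumptions: MakeHolder, Construct, ConstructImpl, ToyAux, and HolderExample are hypothetical names introduced here, not existing mlpack code, and it relies on C++14 for std::index_sequence. A helper function is used because writing Holder<AuxType>(a, b) directly does not let the compiler deduce Args.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Stores constructor arguments for an AuxType that is built later.
template<typename AuxType, typename... Args>
struct Holder
{
  // This typedef gives access to the type we need to construct.
  typedef AuxType Aux;

  // These are the parameters we will use.
  std::tuple<Args...> args;

  explicit Holder(Args... argsIn) : args(std::move(argsIn)...) { }
};

// Hypothetical helper: the caller names only AuxType, Args are deduced.
template<typename AuxType, typename... Args>
Holder<AuxType, Args...> MakeHolder(Args... args)
{
  return Holder<AuxType, Args...>(std::move(args)...);
}

// Unpack the stored tuple into AuxType's constructor.
template<typename AuxType, typename... Args, std::size_t... I>
AuxType ConstructImpl(const std::tuple<Args...>& t,
                      std::index_sequence<I...>)
{
  return AuxType(std::get<I>(t)...);
}

// Build the AuxType from a Holder once the arguments are known.
template<typename AuxType, typename... Args>
AuxType Construct(const Holder<AuxType, Args...>& h)
{
  return ConstructImpl<AuxType>(h.args, std::index_sequence_for<Args...>());
}

// Hypothetical auxiliary type taking two parameters.
struct ToyAux
{
  double a;
  double b;
  ToyAux(double aIn, double bIn) : a(aIn), b(bIn) { }
};

// Usage: hold one candidate value per parameter, construct ToyAux later.
inline void HolderExample()
{
  auto h = MakeHolder<ToyAux>(1.0, 4.0);
  ToyAux aux = Construct(h);
  assert(aux.a == 1.0 && aux.b == 4.0);
}
```

In a grid search, the stored tuple would hold sets of values rather than single values, and the search would call something like Construct once per combination; that part is left out of the sketch.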
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
