One more thing: in defense of R's honor, I ran optimx instead of optim
(after dividing the IV by its max, as I did for optim). This time I did
not use L-BFGS-B with lower bounds; instead I used Nelder-Mead (no
bounds). First, it was faster: in a loop across 10 different IVs, BFGS
took 6.14 sec while Nelder-Mead took just 3.9 sec. Second, the solutions
were better: every Nelder-Mead fit beat the corresponding L-BFGS-B fit,
and every one beat the Excel Solver solution. The improvements were
small, but still, it's nice!
Dimitri
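In code, that run looks roughly like the following. This is a minimal
sketch only: myfunc and IV stand in for the objective function and input
vector discussed in this thread (neither is shown here), and myfunc is
assumed to read the rescaled IV.

  library(optimx)                     # CRAN/R-forge package providing optimx()

  IV <- IV / max(IV)                  # rescale the input vector to [0, 1]
  fit.nm <- optimx(par = c(1, 1), fn = myfunc,
                   method = "Nelder-Mead")   # unconstrained; no bounds needed
  fit.nm                              # one row of results per method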
On Mon, Nov 14, 2011 at 5:26 PM, Dimitri Liakhovitski
<dimitri.liakhovit...@gmail.com> wrote:
> Just to provide some closure:
>
> I ended up dividing the IV by its max, so that the input vector (IV)
> is now between zero and one. I still used optim:
>   myopt <- optim(fn = myfunc, par = c(1, 1), method = "L-BFGS-B", lower = c(0, 0))
> I was able to get a great fit: in 3 cases out of 10 I beat Excel
> Solver, and in 7 cases I lost to Excel, but by really tiny margins
> (generally less than 1% of Excel's fit value).
>
> Thank you, everybody!
> Dimitri
>
> On Fri, Nov 11, 2011 at 10:28 AM, John C Nash <nas...@uottawa.ca> wrote:
>> Some tips:
>>
>> 1) Excel did not, as far as I can determine, find a solution. No
>> point seems to satisfy the KKT conditions. (There is a function kktc
>> in optfntools on the R-forge "optimizer" project; it is called by
>> optimx.)
>>
>> 2) Scaling the input vector is a good idea, given its seemingly wide
>> range of values - assuming this can be done. If the function depends
>> on the relative values in the input vector rather than on their
>> magnitude, that may explain the trouble: optimizers will struggle if
>> the scale factor for this vector is implicitly one of the
>> optimization parameters.
>>
>> 3) If you can supply the gradient function, you will almost certainly
>> be able to do better, especially in verifying that you have a
>> minimum, i.e., a null gradient and a positive definite Hessian. When
>> you have a gradient function, kktc uses the Jacobian of the gradient
>> to get the Hessian, avoiding one level of digit cancellation.
>>
>> JN
>>
>> On 11/11/2011 10:20 AM, Dimitri Liakhovitski wrote:
>>> Thank you very much to everyone who replied!
>>> As I mentioned, I am not a mathematician, so sorry for any naive
>>> comments/questions.
>>> I intuitively understand what you mean by scaling. While the
>>> solution space for the first parameter (.alpha) is relatively
>>> compact (probably between 0 and 2), the second one (.beta) is "all
>>> over the place", because it is a function of the IV (input vector).
>>> And that is probably my main challenge: I am trying to write a
>>> routine for the many different IVs I might face (their values may be
>>> in the hundreds, thousands, or millions). Should I rescale the IV
>>> somehow (e.g., by dividing it by its max), or should I do something
>>> with the parameter .beta inside my function?
>>>
>>> So far, I have written a loop over many different starting points
>>> for both parameters. Then I take the betas around the best solution
>>> so far, split that range into smaller steps (as new starting
>>> points), and optimize again from each of them. What disappoints me
>>> is that even my best solution (a minimized value of 336) was still
>>> worse than the Solver solution!
>>>
>>> And I am trying to prove to everyone here that we should use R, not
>>> Excel :-)
>>>
>>> Thanks again for your help, guys!
>>> Dimitri
>>>
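The multi-start scheme described just above might look roughly like
this. It is a sketch only: the starting grids are invented, and the
poster's myfunc is again assumed.

  ## coarse grid of starting points; these ranges are made up
  starts <- expand.grid(alpha = seq(0.1, 2, by = 0.5),
                        beta  = 10^(0:6))   # betas spanning several magnitudes
  fits <- lapply(seq_len(nrow(starts)), function(i)
    optim(par = as.numeric(starts[i, ]), fn = myfunc,
          method = "L-BFGS-B", lower = c(0, 0)))
  ## keep the best fit, then build a finer beta grid around best$par[2]
  ## and rerun the loop from those new starting points
  best <- fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]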
>>> On Fri, Nov 11, 2011 at 9:10 AM, John C Nash <nas...@uottawa.ca> wrote:
>>>> I won't requote all the other msgs, but the latest (and possibly a
>>>> bit glitchy) version of optimx on R-forge
>>>>
>>>> 1) finds that some methods wander into domains where the user
>>>> function fails try() (the new optimx wraps try() around all
>>>> function calls); this includes L-BFGS-B;
>>>>
>>>> 2) reports that the scaling is such that you really might not
>>>> expect to get a good solution;
>>>>
>>>> and then
>>>>
>>>> 3) actually gets a better result than
>>>>
>>>>> xlf <- myfunc(c(0.888452533990788, 94812732.0897449))
>>>>> xlf
>>>> [1] 334.607
>>>>
>>>> with Kelley's variant of Nelder-Mead (from the dfoptim package):
>>>>
>>>>> myoptx
>>>>   method                         par       fvalues fns grs itns conv  KKT1  KKT2 xtimes  meths
>>>> 4 LBFGSB                      NA, NA 8.988466e+307  NA NULL NULL 9999    NA    NA   0.01 LBFGSB
>>>> 2 Rvmmin            0.1, 200186870.6      25593.83  20    1 NULL    0 FALSE FALSE   0.11 Rvmmin
>>>> 3 bobyqa 6.987875e-01, 2.001869e+08      1933.229  44   NA NULL    0 FALSE FALSE   0.24 bobyqa
>>>> 1   nmkb 8.897590e-01, 9.470163e+07      334.1901 204   NA NULL    0 FALSE FALSE   1.08   nmkb
>>>>
>>>> But do note the terrible scaling. It is hardly surprising that this
>>>> function does not work. I'll have to delve deeper to see what the
>>>> scaling setup should be, because of the way the function setup
>>>> involves some of the data. (optimx includes parscale on all
>>>> methods.)
>>>>
>>>> However, the original poster DID include code, so it was easy to do
>>>> a quick check. Good for him.
>>>>
>>>> JN
>>>>
>>>>> ## Comparing this solution to the Excel Solver solution:
>>>>> myfunc(c(0.888452533990788, 94812732.0897449))
>>>>>
>>>>> --
>>>>> Dimitri Liakhovitski
>>>>> marketfusionanalytics.com
>
> --
> Dimitri Liakhovitski
> marketfusionanalytics.com

--
Dimitri Liakhovitski
marketfusionanalytics.com
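On JN's parscale remark above: a minimal sketch of what parameter
scaling could look like via optim's control$parscale, again assuming
the hypothetical myfunc. The starting values echo the magnitudes in the
table above; the parscale entries themselves are illustrative.

  ## optim works internally with par/parscale, so both rescaled
  ## parameters are of order one and steps are comparably sized
  fit <- optim(par = c(0.9, 9.5e7), fn = myfunc, method = "BFGS",
               control = list(parscale = c(1, 1e8)))
  fit$par; fit$value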