Re: [R] ML optimization question--unidimensional unfolding scaling
Alternatively, just type debug(optim) before using it, then step through it by hitting Enter repeatedly. When you're done, do undebug(optim).

Andy

> From: Spencer Graves
>
> Have you looked at the code for optim? If you type optim at the prompt without parentheses, R will list the code. You can copy that into a script file and walk through it line by line to figure out what it does. By doing this, you should be able to find a place in the iteration where you can test both branches of each bifurcation and pick one -- or keep a list of however many you want and follow them all more or less simultaneously, pruning the ones that seem too implausible. Then you can alternate between a piece of the optim code, bifurcating and pruning, adjusting each and printing intermediate progress reports to help you understand what it's doing and how you might want to modify it.
>
> With a bit more effort, you can get the official source code with comments. To do that, I think you go to www.r-project.org -> CRAN -> (select a local mirror) -> Software: R sources. From there, just download "The latest release: R-2.2.0.tar.gz".
>
> For more detailed help, I suggest you try to think of the simplest possible toy problem that still contains one of the issues you find most difficult. Then send that to this list. If readers can copy a few lines of R code from your email into R and try a couple of things in less than a minute, I think you might get more useful replies quicker.
>
> Best Wishes,
> Spencer Graves
>
> Peter Muhlberger wrote:
>
> > Hi Spencer:
> >
> > Thanks for your interest! Also, the posting guide was helpful.
> >
> > I think my problem might be solved if I could find a way to terminate nlm or optim runs from within the user-given minimization function they call. Optimization is unconstrained.
> >
> > I'm essentially using normal-like curves that translate observed values on a set of variables (one curve per variable) into latent unfolded values. The observed values are on the Y-axis; the latent values (hence parameters to be estimated) are on the X-axis. The problem is that there are two points into which an observed value can map on a curve--one on either side of the curve mean. Only one of these values actually will be optimal for all observed variables, but it's easy to show that most estimation methods will get stuck on the non-optimal value if they find that one first. Moving away from that point, the likelihood gets a whole lot worse before the routine will 'see' the optimal point on the other side of the normal curve.
> >
> > SANN might work, but I kind of wonder how useful it'd be in estimating hundreds of parameters--thanks to that latent scale.
> >
> > My (possibly harebrained) thought for how to estimate this unfolding using some gradient-based method would be to run through some iterations and then check to see whether a better solution exists on the 'other side' of the normal curves. If it does, replace those parameters with the better ones. Because this causes the likelihood to jump, I'd probably have to start the estimation process over again (maybe). But I see no way from within the minimization function called by nlm or optim to tell nlm or optim to terminate its current run. I could make the algorithm recursive, but that eats up resources and will probably have to be terminated with an error.
> >
> > Peter
> >
> > On 10/11/05 11:11 PM, Spencer Graves [EMAIL PROTECTED] wrote:
> >
> > > There may be a few problems where ML (or more generally Bayes) fails to give sensible answers, but they are relatively rare.
> > >
> > > What is your likelihood? How many parameters are you trying to estimate?
> > >
> > > Are you using constrained or unconstrained optimization? If constrained, I suggest you remove the constraints by appropriate transformation. When considering alternative transformations, I consider (a) what makes physical sense, and (b) which transformation produces a log likelihood that is closer to being parabolic.
> > >
> > > How are you calling optim? Have you tried SANN as well as Nelder-Mead, BFGS, and CG? If you are using constrained optimization, I suggest you move the constraints to Inf by appropriate transformation and use the other methods, as I just suggested.
> > >
> > > If you would still like more suggestions from this group, please provide more detail -- but as tersely as possible. The posting guide is, I believe, quite useful (www.R-project.org/posting-guide.html).
> > >
> > > spencer graves
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
> --
> Spencer Graves, PhD
> Senior Development Engineer
> PDF Solutions, Inc.
> 333 West San Carlos Street Suite 700
> San Jose, CA 95110, USA
> [EMAIL PROTECTED]
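One way to approach Peter's question about stopping an optim run from inside the objective function is to signal a custom condition there and catch it with tryCatch around the optim call. The following is only a minimal sketch under stated assumptions: the objective is a toy function, not the unfolding likelihood, and the names make_objective, run_with_early_exit, and the "earlyExit" condition class are hypothetical, chosen for illustration.

```r
# Build an objective that aborts the optimizer after `max_evals` calls.
# The toy function below is a placeholder, not the unfolding likelihood.
make_objective <- function(max_evals) {
  n_evals <- 0
  best <- list(par = NULL, value = Inf)
  function(par) {
    n_evals <<- n_evals + 1
    value <- sum((par^2 - 1)^2) + 0.1 * sum(par)
    if (value < best$value) {
      # `par + 0` forces a fresh copy, in case the optimizer reuses the
      # vector it passes in.
      best <<- list(par = par + 0, value = value)
    }
    if (n_evals >= max_evals) {
      # Signal a custom condition that carries the best point seen so far.
      cond <- structure(
        list(message = "early exit requested", call = NULL, best = best),
        class = c("earlyExit", "error", "condition")
      )
      stop(cond)
    }
    value
  }
}

# Run optim, but return the best point so far if the objective bails out.
run_with_early_exit <- function(start, max_evals) {
  tryCatch(
    optim(start, make_objective(max_evals), method = "BFGS"),
    earlyExit = function(cond) cond$best
  )
}

res <- run_with_early_exit(start = c(0.5, 0.5), max_evals = 5)
print(res$par)
print(res$value)
```

After the handler returns, the caller is free to alter the parameters (for example, move some of them to the other side of their curves) and call optim again from the new starting point, which avoids recursion inside the objective.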
Re: [R] ML optimization question--unidimensional unfolding scaling
Hi, Andy and Peter:

That's interesting. I still like the idea of making my own local copy, because I can more easily add comments and test ideas while working through the code. I haven't used debug, but I think I should try it, because some things occur when running a function that don't occur when I walk through it line by line, e.g., parsing the call and ... arguments.

Two more comments on the original question:

1. What is the structure of your data? Have you considered techniques for Multidimensional Scaling (MDS)? It seems that your problem is just a univariate analogue of the MDS problem. For metric MDS from a complete distance matrix, the solution is relatively straightforward computation of eigenvalues and vectors from a matrix computed from the distance matrix, and there is software widely available for the nonmetric MDS problem. For a terse introduction to that literature, see Venables and Ripley (2002) Modern Applied Statistics with S, 4th ed. (Springer, distance methods in sec. 11.1, pp. 306-308).

2. If you don't have a complete distance matrix, might it be feasible to approach the problem starting small and building larger, i.e., start with 3 nodes, then add a fourth, etc.?

spencer graves
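The metric MDS computation Spencer describes (eigenvalues and eigenvectors of a matrix derived from a complete distance matrix) is available in base R as cmdscale in the stats package. A minimal sketch, assuming made-up latent positions on a single dimension rather than Peter's data:

```r
# Hypothetical one-dimensional latent positions for six items.
latent <- c(-2, -1, 0, 0.5, 1.5, 3)

# Complete Euclidean distance matrix between the items.
d <- dist(latent)

# Classical (metric) MDS in one dimension; eig = TRUE also returns
# the eigenvalues used in the solution.
fit <- cmdscale(d, k = 1, eig = TRUE)
recovered <- fit$points[, 1]

# Positions are recovered only up to a shift and a reflection, so the
# correlation with the true positions may come out as +1 or -1.
print(cor(recovered, latent))
```

The sign ambiguity is inherent to the method: a distance matrix carries no information about which end of the scale is which.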
Re: [R] ML optimization question--unidimensional unfolding scaling
Hi Spencer:

Just realized I may have misunderstood your comments about branching--you may have been thinking about a restart. Sorry if I misrepresented them. See below:

On 11/3/05 11:03 AM, Spencer Graves [EMAIL PROTECTED] wrote:

> Hi, Andy and Peter:
>
> That's interesting. I still like the idea of making my own local copy, because I can more easily add comments and test ideas while working through the code. I haven't used debug, but I think I should try it, because some things occur when running a function that don't occur when I walk through it line by line, e.g., parsing the call and ... arguments.

Debug's handy tho I think it is line by line.

> Two more comments on the original question:
>
> 1. What is the structure of your data? Have you considered techniques for Multidimensional Scaling (MDS)? It seems that your problem is just a univariate analogue of the MDS problem. For metric MDS from a complete distance matrix, the solution is relatively straightforward computation of eigenvalues and vectors from a matrix computed from the distance matrix, and there is software widely available for the nonmetric MDS problem. For a terse introduction to that literature, see Venables and Ripley (2002) Modern Applied Statistics with S, 4th ed. (Springer, distance methods in sec. 11.1, pp. 306-308).

I was looking for something on MDS in R, that'll be handy!

The data structure is a set of variables (say about 6) that I have reason to believe measure an underlying dimension. I suspect that several of the variables are unfolding--that is, they have their highest value for some point on the scale and fall off w/ distance from that point in either direction. The degree of fall-off may vary depending on the variable. Some seem to fall off very rapidly, others not. A couple variables probably monotonically increase w/ the underlying scale, so they don't unfold.

I can construct a distance matrix consisting of distances between these variables. Do you think MDS might be able to handle an arrangement like this, w/ some values folded about a scale point and with drop-off varying between variables? The distances between the variables do not map in any straightforward way into distances on the underlying scale because of folding and non-linearity.

> 2. If you don't have a complete distance matrix, might it be feasible to approach the problem starting small and building larger, i.e., start with 3 nodes, then add a fourth, etc.?

Not sure I follow, but I do have a complete distance matrix of distances between the variables.

> spencer graves

Thanks,
Peter
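For anyone who wants a toy problem along the lines Peter describes, the data structure can be simulated in a few lines. This is only an illustrative sketch: the Gaussian-shaped response curve, the logistic curves for the monotone variables, the peak and width values, and the noise level are all assumptions made here, not details taken from the thread.

```r
set.seed(42)
n <- 200
theta <- rnorm(n)  # hypothetical values on the underlying scale

# Single-peaked ("unfolding") response: highest at `peak`, falling off
# on both sides at a rate controlled by `width`.
unfold <- function(theta, peak, width) {
  exp(-(theta - peak)^2 / (2 * width^2))
}

X <- cbind(
  v1 = unfold(theta, peak = -1.0, width = 0.5),  # rapid fall-off
  v2 = unfold(theta, peak =  0.0, width = 1.5),  # slow fall-off
  v3 = unfold(theta, peak =  0.5, width = 0.8),
  v4 = unfold(theta, peak =  1.0, width = 0.5),
  v5 = plogis(2 * theta),                        # monotone, no unfolding
  v6 = plogis(theta)                             # monotone, no unfolding
)
X <- X + matrix(rnorm(n * 6, sd = 0.05), n, 6)   # measurement noise

# Complete distance matrix between the six variables (the columns of X).
D <- dist(t(X))
print(round(as.matrix(D), 2))
```

Because theta is known in the simulation, any candidate method (MDS on D, or a likelihood fit with optim) can be checked against the true scale before it is applied to real data.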