Re: Interpreting mutliple regression Beta is only way?
On 4 Feb 2002 16:14:11 -0800, [EMAIL PROTECTED] (Wuzzy) wrote:

> > In biostatistical studies, either version of beta is pretty worthless. Generally speaking.
>
> If I may be permitted to infer a reason: if you have
>
>     bodyweight = -a(drug) - b(exercise) + food
>
> then the standardized coefficients will affect bodyweight, but they will also affect each other. They would only be useful if drug intake were perfectly independent of exercise and food in the population.

Okay, that is a bit of a reason, which applies widely; but that is not what I was pointing at when I wrote that line. I was over-stating. But many of the results in clinical research are 'barely statistically significant', and when that is so, then (in my opinion) the 95% Confidence Interval does not add much to the statement of the test result, and the point estimate of the effect (beta, or a mean-difference) is pretty loose, too.

I want to beat a 5% alpha. And I want a result to have a 50% interval (say) that is actually interesting, and well above the measurement jitters. -- That is an *essential* requirement for being coherent, if you want to describe observational studies. And it is a pretty good idea for writing about randomized-controlled studies, too.

> If they are not independent but partially collinear (0.5), is it possible, using linear regression, to know whether the drug is strong enough (colloquially speaking) to recommend? I assume that it would be impossible, as a change in drug cannot be separated from a change in exercise in the population. I.e., people are exercising and taking the drug, so it is impossible to distinguish which one is beneficial.

You can look at the zero-order effect (raw correlation) as well as the partial effect after controlling for other potential predictors: one at a time, or in sets. If your Drug always shows the same prediction, regardless of how you test it, that is a pretty good sign.

> I've heard of ridge regression; I will try to investigate this area more. I will probably figure it out with time, hehe.

Ridge regression can be decomposed into a combination of the p-variate regression, averaged with the 1-variable regressions, and all the i-variable regressions that lie in between. There is not much gain from using Ridge if you avoid suppressor relationships from the start -- all those cases where a beta is the opposite sign from its zero-order beta.

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html

=
Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at http://jse.stat.ncsu.edu/
=
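The remark about ridge regression and suppressor relationships can be illustrated with a small sketch. It assumes two standardized predictors, so that ridge can be written in correlation form as (R + lambda*I) beta = r. The function name `ridge_betas` and the correlations 0.2, 0.6 and 0.5 are made up for illustration; none of them come from the thread or from any data set.

```python
def ridge_betas(r1, r2, r12, lam):
    """Standardized ridge coefficients for two predictors.

    Solves (R + lam*I) beta = r in correlation form, where
    R = [[1, r12], [r12, 1]] and r = [r1, r2] holds the zero-order
    correlations of each predictor with the outcome.
    lam = 0 gives the ordinary (partial) standardized betas.
    """
    d = 1.0 + lam
    det = d * d - r12 * r12
    return (d * r1 - r12 * r2) / det, (d * r2 - r12 * r1) / det


# Hypothetical correlations, chosen so that the first predictor
# (say, drug) is a suppressor case: its partial beta is negative
# although its zero-order correlation with the outcome is positive.
r_drug, r_exercise, r_between = 0.2, 0.6, 0.5

for lam in (0.0, 0.5, 1.0, 10.0):
    b_drug, b_exercise = ridge_betas(r_drug, r_exercise, r_between, lam)
    print(f"lambda={lam:5.1f}  drug={b_drug:+.3f}  exercise={b_exercise:+.3f}")
```

With these made-up numbers the drug coefficient is negative at lambda = 0 and turns positive once lambda exceeds 0.5. As lambda grows, the coefficients shrink toward zero while their ratio approaches that of the zero-order correlations, which is the direction of the averaging described above. If no coefficient has the wrong sign to begin with, the penalty has much less to correct.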
Re: Interpreting mutliple regression Beta is only way?
> In biostatistical studies, either version of beta is pretty worthless. Generally speaking.

If I may be permitted to infer a reason: if you have

    bodyweight = -a(drug) - b(exercise) + food

then the standardized coefficients will affect bodyweight, but they will also affect each other. They would only be useful if drug intake were perfectly independent of exercise and food in the population.

If they are not independent but partially collinear (0.5), is it possible, using linear regression, to know whether the drug is strong enough (colloquially speaking) to recommend? I assume that it would be impossible, as a change in drug cannot be separated from a change in exercise in the population. I.e., people are exercising and taking the drug, so it is impossible to distinguish which one is beneficial.

I've heard of ridge regression; I will try to investigate this area more. I will probably figure it out with time, hehe.
Re: Interpreting mutliple regression Beta is only way?
On 18 Jan 2002 16:55:11 -0800, [EMAIL PROTECTED] (Wuzzy) wrote:

> Rich Ulrich [EMAIL PROTECTED] wrote in message
>
> Thanks Rich, most informative. I am trying to determine a method of comparing apples to oranges - it seems an important thing to try to do; perhaps it is impossible.

Well, I do think there is a narrow, numerical answer about two particular tests, in a particular sample. But there are a lot of side-issues before you get to an answer that is very useful. See below.

> I am trying to determine which is better, glycemic index or carbohydrate total, in predicting glycemic load (Glycemic load = glycemic index * carbohydrate).
>
> My results as a matrix:
>
>            GI load   GI      Carb
>   GI load  1.000
>   GI        .533    1.000
>   Carb      .858     .124   1.000
>
> So it seems that carb affects GI load more than GI does, but this is on ALL foods (nobody eats ALL foods, so I cannot extrapolate to the human diet). But I don't think you're allowed to do this kind of comparison, as Carb and GI are totally different values.
>
> I suspected that you would be allowed to make the comparison if you use Betas, i.e. measure how many standard deviation changes of GI and Carb it requires. If it takes a bigger standard deviation of Carb, then you could say that it is more likely that carb has a bigger effect on glycemic load.
>
> You seem to suggest that even using standard deviation changes, you cannot compare apples to oranges. Which sounds right but is disappointing.

There is a narrow, numerical thing with an answer. For instance, if you are adding A+B=C, then two *independent* components of C affect C in proportion to their variances. Your two components don't have much correlation (.13) -- that is, they are nearly independent -- so that would work out.

But you actually have an *exact* relationship, as a product. The one that matters in this case is the one that has the higher coefficient of variation -- the larger standard deviation (and variance) when expressed as logs. In a sum of two numbers, the one that is more varying will contribute more.

You can state that A or B is bigger, for *this* particular sample.

Now, I imagine a sample could be a) all healthy; b) all with a 'definite' diagnosis of some relevant disease; c) all in the category of needing a diagnosis. Or you could have some mixture of the above; for some specified ages, etc.

Now, what is it you are trying to decide? - that will help determine a relevant sampling.

When you say that a number is more important, are you trying to say that the measure you have says a lot? - or that *improving* that particular number would gain you more, because the number you have is a lousy one? -- I can point out that if the two variables matter exactly the same in some physiological sense, then you can have two opposite conclusions about which is important, if one of them is *measured* much more poorly (much more inherent error; measurement error) than the other.

And, you are reporting on tests, and a number. You don't get to conclude that glucose matters more than carbohydrates if you only know that your 50 mg glucose test is more pertinent than your 50 mg c. test, as collected in your particular sample.

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
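The point about products and logs can be sketched numerically. Taking logs turns load = GI * carbohydrate into a sum, log(load) = log(GI) + log(carb), so the variance of log(load) splits into the two log-variances plus twice their covariance. The data below are synthetic and purely illustrative (not real food values), and the helper names `variance` and `covariance` are ad hoc.

```python
import math
import random


def variance(xs):
    """Sample variance with an n-1 denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    """Sample covariance with an n-1 denominator."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


rng = random.Random(1)
n = 500
# Synthetic values only (assumption, not real food data): GI spread over
# a modest range, carbohydrate spread over a much wider relative range.
gi = [rng.uniform(30.0, 100.0) for _ in range(n)]
carb = [rng.lognormvariate(2.5, 1.0) for _ in range(n)]
load = [g * c for g, c in zip(gi, carb)]  # load = GI * carbohydrate, as in the post

log_gi = [math.log(g) for g in gi]
log_carb = [math.log(c) for c in carb]
log_load = [math.log(v) for v in load]

var_gi = variance(log_gi)
var_carb = variance(log_carb)
cov_gc = covariance(log_gi, log_carb)
var_load = variance(log_load)

print(f"var(log GI)   = {var_gi:.3f}")
print(f"var(log carb) = {var_carb:.3f}")
print(f"2*cov         = {2 * cov_gc:.3f}")
print(f"var(log load) = {var_load:.3f}")
```

In this synthetic sample the carbohydrate term has the larger log-variance, so it accounts for most of the variation in log(load). That statement holds for this sample only; a different selection of foods could reverse it, which is the caveat about sampling made above.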
Re: Interpreting mutliple regression Beta is only way?
Rich Ulrich [EMAIL PROTECTED] wrote in message

Thanks Rich, most informative. I am trying to determine a method of comparing apples to oranges - it seems an important thing to try to do; perhaps it is impossible.

I am trying to determine which is better, glycemic index or carbohydrate total, in predicting glycemic load (Glycemic load = glycemic index * carbohydrate).

My results as a matrix:

           GI load   GI      Carb
  GI load  1.000
  GI        .533    1.000
  Carb      .858     .124   1.000

So it seems that carb affects GI load more than GI does, but this is on ALL foods (nobody eats ALL foods, so I cannot extrapolate to the human diet). But I don't think you're allowed to do this kind of comparison, as Carb and GI are totally different values.

I suspected that you would be allowed to make the comparison if you use Betas, i.e. measure how many standard deviation changes of GI and Carb it requires. If it takes a bigger standard deviation of Carb, then you could say that it is more likely that carb has a bigger effect on glycemic load.

You seem to suggest that even using standard deviation changes, you cannot compare apples to oranges. Which sounds right but is disappointing.
Re: Interpreting mutliple regression Beta is only way?
Wuzzy [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...

> Rich Ulrich [EMAIL PROTECTED] wrote in message
>
> Thanks Rich, most informative. I am trying to determine a method of comparing apples to oranges - it seems an important thing to try to do; perhaps it is impossible.
>
> I am trying to determine which is better, glycemic index or carbohydrate total, in predicting glycemic load (Glycemic load = glycemic index * carbohydrate).
>
> My results as a matrix:
>
>            GI load   GI      Carb
>   GI load  1.000
>   GI        .533    1.000
>   Carb      .858     .124   1.000
>
> So it seems that carb affects GI load more than GI does, but this is on ALL foods (nobody eats ALL foods, so I cannot extrapolate to the human diet). But I don't think you're allowed to do this kind of comparison, as Carb and GI are totally different values.
>
> I suspected that you would be allowed to make the comparison if you use Betas, i.e. measure how many standard deviation changes of GI and Carb it requires. If it takes a bigger standard deviation of Carb, then you could say that it is more likely that carb has a bigger effect on glycemic load.
>
> You seem to suggest that even using standard deviation changes, you cannot compare apples to oranges. Which sounds right but is disappointing.

The glycaemic index is calculated as the area under the blood glucose curve for the two hours (or 3 hours for diabetics) after ingesting enough of a food to include 50 grams of carbohydrate, divided by the same area after ingesting 50 grams of pure glucose, expressed as a percentage. In some cases a reference food other than glucose is used.

If the area under the curve is the glycaemic load you are studying, I would expect the model

    Glycemic load = glycemic index * carbohydrate

to fit the data very well when the carbohydrate content is near 50 gm, provided all the glycaemic indices have been calculated on the same basis.

Using correlations or beta coefficients as you are doing is appropriate when linear relationships are involved, but not to test for goodness of fit to this model. What would be of interest would be a plot of the difference between the predicted glycaemic load and the observed value, against carbohydrate, especially for carbohydrate values far from 50 gm. If I have a meal mainly of eggs or meat, the total carbohydrate content is very low, so the glycaemic load calculated from the formula may be wrong.

One difficulty with the whole Glycaemic Index approach is that there is not, as far as I know, any way of calculating the glycaemic load from foods like cheese, eggs and meat. If the body needs glucose, it will be made from fat and protein foods. It is not surprising that it could be hard to persuade volunteers to ingest 8500 grams of processed cheese, containing 50 gm of carbohydrate, in order to determine its glycaemic index :-)

I would like to see another index constructed giving the glycaemic load produced by 100 gm of each food, rather than the load produced by that amount of food which contains 50 gm of carbohydrate.
Re: Interpreting mutliple regression Beta is only way?
On 16 Jan 2002 11:33:15 -0800, [EMAIL PROTECTED] (Wuzzy) wrote:

> If your beta coefficients are on different scales (say you want to know whether temperature or pressure is affecting your bread baking more), is the way to do this to use Beta coefficients calculated as
>
>     Beta = beta * SDx / SDy

Something like that ... is called the standardized beta, and every OLS regression program gives them.

> ... It seems like the Beta coefficients are rarely cited in studies, and it seems to me worthless to know beta (small b), as you are not allowed to compare them because they are on different scales.

In biostatistical studies, either version of beta is pretty worthless. Generally speaking.

What you have is prediction that is barely better than chance. The p-values tell you which is more powerful within this one equation. The zero-level correlation tells you how they are related, alone. -- If these two indicators are not similar, then you have something complicated going on, with confounding taking place, or joint-prediction, and no single number will show it all.

- When prediction is enough better-than-chance to be really interesting, then the raw units are probably interesting, too.

[ ... ]

> Is there a way of converting this standardized coefficient to a correlation coefficient (on a scale of -1 to +1)? It would be useful to do this, as you want to know the correlation coefficient of temperature after factoring out pressure.

I think you are looking for simple answers that can't exist, even though there *is* a partial-r, and the beta in regression *is* a partial-beta.

The main use I have found for the standardized (partial) beta is the simple check against confounding, etc. If 'beta' is similar to the zero-order r, for all variables, then there must be pretty good independence among the predictors, and interpretation doesn't hide any big surprises. If it is half-size, I look for shared prediction. If it is in the wrong direction or far too big (these conditions happen at the same time, for pairs of variables), then gross confounding exists.

Hope this helps.

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
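The check described above, comparing each standardized beta with its zero-order r, can be sketched in a few lines. This is only an illustration under stated assumptions: the bread-baking data are synthetic, the variable names (`temp`, `pressure`, `quality`) and helper functions are invented here, and the two-predictor OLS is solved directly from the normal equations rather than with a statistics package.

```python
import random


def centered(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))


def raw_slopes(x1, x2, y):
    """OLS slopes b1, b2 for y ~ x1 + x2 (intercept absorbed by centering)."""
    c1, c2, cy = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
    s1y, s2y = dot(c1, cy), dot(c2, cy)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def sd(xs):
    c = centered(xs)
    return (dot(c, c) / (len(xs) - 1)) ** 0.5


def corr(xs, ys):
    cx, cy = centered(xs), centered(ys)
    return dot(cx, cy) / (dot(cx, cx) * dot(cy, cy)) ** 0.5


rng = random.Random(0)
n = 200
# Synthetic bread-baking data (hypothetical): temperature in degrees and
# pressure in kPa, so the raw slopes are on very different scales.
temp = [rng.gauss(200.0, 15.0) for _ in range(n)]
pressure = [101.0 + 0.02 * (t - 200.0) + rng.gauss(0.0, 0.5) for t in temp]
quality = [0.05 * t + 1.5 * p + rng.gauss(0.0, 1.0) for t, p in zip(temp, pressure)]

b_temp, b_pres = raw_slopes(temp, pressure, quality)
beta_temp = b_temp * sd(temp) / sd(quality)      # Beta = b * SDx / SDy
beta_pres = b_pres * sd(pressure) / sd(quality)

for name, beta, x in (("temp", beta_temp, temp), ("pressure", beta_pres, pressure)):
    print(f"{name:9s} standardized beta={beta:+.3f}  zero-order r={corr(x, quality):+.3f}")
```

Reading the output follows the rule of thumb in the post: a beta close to its zero-order r suggests nearly independent predictors, a beta about half the size of r suggests shared prediction, and a beta with the wrong sign or a much larger magnitude points to confounding.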