[Please keep r-help in the cc: list]

I don't quite know how to interpret the difference between specifying effort as an offset vs. as weights; I would have to spend more time thinking about it/working through it than I have available at the moment.

I don't know that specifying effort as weights is *wrong*, but I don't know that it's right or what it is doing: if I were the reviewer of a paper (for example) I would require you to explain what the difference is and convince me that it was appropriate. (Furthermore, "I want to do it this way because it gives me significant effects" is automatically suspicious.)
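
One mechanical consequence is easy to see, though -- here is a quick sketch with simulated data (not your data set): in glm()/glm.nb(), prior weights multiply each observation's contribution to the log-likelihood, so weights on the scale of your Effort values act roughly like a 250-fold increase in sample size, whereas an offset only shifts the linear predictor.

## quick sketch, simulated data only (not the original data set)
library(MASS)
set.seed(1)
Effort <- round(runif(100, 240, 260))
x <- rnorm(100)
Catch <- rnbinom(100, mu = Effort * exp(0.5 * x) / 250, size = 2)
## offset model: log E(Catch) = log(Effort) + b0 + b1*x
m_off <- glm.nb(Catch ~ x + offset(log(Effort)))
## weights model: each observation's log-likelihood contribution is
## multiplied by its Effort (~250)
m_wt  <- glm.nb(Catch ~ x, weights = Effort)
cbind(offset  = coef(summary(m_off))[, "Std. Error"],
      weights = coef(summary(m_wt))[, "Std. Error"])

The weighted fit's standard errors come out roughly sqrt(Effort) times smaller, which is one more reason the "significant" coefficients deserve suspicion.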

This would be a good question for CrossValidated (https://stats.stackexchange.com); you could try posting it there (I would be interested in the answer!).

  cheers
    Ben Bolker


On 2023-10-30 8:19 p.m., 유준택 wrote:
Dear Mr. Bolker,

Thank you for the fast response.

I also know that a Poisson (or negative binomial) regression with glm is generally modelled using an offset variable.

In this case, when a weights term is used instead of the offset, it gives me significant coefficients for the covariates. I understand that the weights argument for exponential-family distributions in glm affects the variance of the response variable.

I was just wondering whether my first model is completely wrong, and whether the use of an offset variable is valid when the response variable is not proportional to the offset variable, as in my dataset.

Sincerely,


Joon-Taek

On Sun, Oct 29, 2023 at 3:25 AM, Ben Bolker <bbol...@gmail.com> wrote:

        Using an offset of log(Effort) as in your second model is the more
    standard way to approach this problem; it corresponds to assuming that
    catch is strictly proportional to effort. Adding log(Effort) as a
    covariate (as illustrated below) tests whether a power-law model (catch
    propto Effort^(b+1), b != 0) is a better description of the data. (In
    this case it is not, although the confidence intervals on b are very
    wide, indicating that we have very little information -- this is not
    surprising, since the proportional range of effort is very small
    (246-258) in this data set.)

        In general you should *not* check overdispersion of the raw data
    (i.e., the *marginal distribution* of the data); you should check
    overdispersion of a fitted (e.g. Poisson) model, as below.

        cheers
         Ben Bolker


    edata <- data.frame(Catch, Effort, xx1, xx2, xx3)

    ## graphical exploration

    library(ggplot2); theme_set(theme_bw())
    library(tidyr)
    edata_long <- edata |>
        pivot_longer(names_to = "var", cols = -c("Catch", "Effort"))
    ggplot(edata_long, aes(value, Catch)) +
        geom_point(alpha = 0.2, aes(size = Effort)) +
        facet_wrap(~ var, scales = "free_x") +
        geom_smooth(method = "glm",
                    method.args = list(family = "quasipoisson"))
    #

    library(MASS)
    ## NB model assuming catch strictly proportional to effort (offset)
    g1 <- glm.nb(Catch ~ xx1 + xx2 + xx3 + offset(log(Effort)), data = edata)
    ## add log(Effort) as a covariate to test the power-law alternative
    g2 <- update(g1, . ~ . + log(Effort))
    ## Poisson fit with offset, used to test for overdispersion
    g0 <- glm(Catch ~ xx1 + xx2 + xx3 + offset(log(Effort)), data = edata,
              family = poisson)
    performance::check_overdispersion(g0)
    summary(g1)
    summary(g2)
    options(digits = 3)
    confint(g2)
    summary(g1)
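
        On the point above about marginal vs. fitted-model overdispersion,
    here is a quick sketch with simulated data (not your data set): the raw
    counts look heavily overdispersed simply because they depend on a
    covariate, while the correctly specified Poisson fit should show little
    or no overdispersion.

    ## marginal vs. conditional overdispersion, simulated data only
    set.seed(101)
    x <- runif(200)
    y <- rpois(200, lambda = exp(1 + 3 * x))
    c(mean = mean(y), var = var(y))         ## variance far exceeds the mean
    fit <- glm(y ~ x, family = poisson)
    performance::check_overdispersion(fit)  ## dispersion ratio should be near 1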



    On 2023-10-28 3:30 a.m., 유준택 wrote:
     > Colleagues,
     >
     >
     >
     > I have a dataset that includes five variables.
     >
     > - Catch: the catch number counted in some species (ind.)
     >
     > - Effort: fishing effort (the number of fishing vessels)
     >
     > - xx1, xx2, xx3: some environmental factors
     >
     > Based on an overdispersion test on the "Catch" variable, I modelled
     > it with a negative binomial GLM. The "Effort" variable showed a
     > gradually decreasing trend during the study period. I was able to get
     > the results I wanted when I included "Effort" as a weights term in
     > the negative binomial regression, as follows:
     >
     >
     >
     > library(qcc)
     > library(MASS)   # for glm.nb
     >
     > Catch=c(25,2,7,6,75,5,1,4,66,15,9,25,40,8,7,4,36,11,1,14,141,9,74,38,126,3)
     >
     > Effort=c(258,258,258,258,258,258,258,254,252,252,252,252,252,252,252,252,252,252,252,248,246,246,246,246,246,246)
     >
     > xx1=c(0.8,0.5,1.2,0.5,1.1,1.1,1.0,0.6,0.9,0.5,1.2,0.6,1.2,0.7,1.0,0.6,1.6,0.7,0.8,0.6,1.7,0.9,1.1,0.5,1.4,0.5)
     >
     > xx2=c(1.7,1.6,2.7,2.6,1.5,1.5,2.8,2.5,1.7,1.9,2.2,2.4,1.6,1.4,3.0,2.4,1.4,1.5,2.2,2.3,1.7,1.7,1.9,1.9,1.4,1.4)
     >
     > xx3=c(188,40,2,10,210,102,117,14,141,28,48,15,220,115,10,14,320,20,3,10,400,150,145,160,460,66)
     >
     > #
     >
     > edata <- data.frame(Catch, Effort, xx1, xx2, xx3)
     >
     > #
     >
     > qcc.overdispersion.test(edata$Catch, type="poisson")
     >
     > #
     >
     > summary(glm.nb(Catch~xx1+xx2+xx3, weights=Effort, data=edata))
     >
     > summary(glm.nb(Catch~xx1+xx2+xx3+offset(log(Effort)), data=edata))
     >
     >
     >
     > I am not sure whether the application of the weights argument to the
     > negative binomial regression is correct. Also, I wonder if there is a
     > better way of doing this. Can anyone help?
     >


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
