subject:"\[R\] Modelling"

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

2021-03-18 Thread Luigi Marongiu

Oh, I got it! I was sending the fluorescence instead of the cycles x.
Thank you

```
desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
max_fluorescence = 11839.8, back_fluorescence
= -138.864) , 1:45)
```

On Wed, Mar 17, 2021 at 8:58 PM Duncan Murdoch  wrote:
>
> On 17/03/2021 12:37 p.m., Luigi Marongiu wrote:
> > sorry, I don't get it...
>
> Modify your rutledge function to print x, and you'll see the values of
> high printed.  x should be 1:45.
>
> Duncan Murdoch
>
> >
> > On Wed, Mar 17, 2021 at 2:35 PM Duncan Murdoch  
> > wrote:
> >>
> >> On 17/03/2021 6:59 a.m., Luigi Marongiu wrote:
> >>> yes, but in `rutledge` I model y as `y = (M / ( 1 + exp(-(x-m)/s)) ) +
> >>> B`, with x being 1:45. Isn't that the equivalent of what I fed Desmos
> >>> with? Tx
> >>
> >> No, it's not.
> >>
> >> Duncan Murdoch
> >>
> >>>
> >>> On Wed, Mar 17, 2021 at 11:31 AM Duncan Murdoch
> >>>  wrote:
> 
>  On 17/03/2021 5:41 a.m., Luigi Marongiu wrote:
> > Hello,
> > I have a dataset from a polymerase chain reaction. I am using the
> > equation given by Rutledge
> > (https://pubmed.ncbi.nlm.nih.gov/15601990/) but the profile I get in R
> > does not match the data. I ran the same thing in Desmos and instead
> > the profile is correct (attached).
> > Why do I not get the same matching model as in Desmos? I believe the
> > formula in R is the same as the one in Desmos, and I am using the same
> > parameters.
> > Is there a procedure to debug models?
> > Thanks
> >
> > Here is the code:
> > ```
> > high <- c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39,
> >   -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 
> > 15.01,
> >   75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 
> > 5059.94,
> >   6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 
> > 10077.19,
> >   10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 
> > 11684.96,
> >   11781.77, 11863.35, 11927.44, 11980.81, 12021.88)
> > plot(1:45, high, type = "l")
> > rutledge <- function(p, x) {
> >  m = p$half_fluorescence
> >  s = p$slope
> >  M = p$max_fluorescence
> >  B = p$back_fluorescence
> >  y = (M / ( 1 + exp(-(x-m)/s)) ) + B
> >  return(y)
> > }
> > desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
> >max_fluorescence = 11839.8, back_fluorescence
> > = -138.864) , high)
> >
> > points(1:45, desmos, type="l", col="blue")
> 
> 
>  In your calculation of desmos, you are using the Y variable for x in the
>  formula.  Calculate it this way instead:
> 
>  desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
>   max_fluorescence = 11839.8, 
>  back_fluorescence
>  = -138.864) , 1:45)
> 
>  Duncan Murdoch
> 
> >>>
> >>>
> >>
> >
> >
>


-- 
Best regards,
Luigi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

2021-03-17 Thread Duncan Murdoch


On 17/03/2021 12:37 p.m., Luigi Marongiu wrote:

sorry, I don't get it...


Modify your rutledge function to print x, and you'll see the values of 
high printed.  x should be 1:45.


Duncan Murdoch



On Wed, Mar 17, 2021 at 2:35 PM Duncan Murdoch  wrote:


On 17/03/2021 6:59 a.m., Luigi Marongiu wrote:

yes, but in `rutledge` I model y as `y = (M / ( 1 + exp(-(x-m)/s)) ) +
B`, with x being 1:45. Isn't that the equivalent of what I fed Desmos
with? Tx


No, it's not.

Duncan Murdoch



On Wed, Mar 17, 2021 at 11:31 AM Duncan Murdoch
 wrote:


On 17/03/2021 5:41 a.m., Luigi Marongiu wrote:

Hello,
I have a dataset from a polymerase chain reaction. I am using the
equation given by Rutledge
(https://pubmed.ncbi.nlm.nih.gov/15601990/) but the profile I get in R
does not match the data. I ran the same thing in Desmos and instead
the profile is correct (attached).
Why do I not get the same matching model as in Desmos? I believe the
formula in R is the same as the one in Desmos, and I am using the same
parameters.
Is there a procedure to debug models?
Thanks

Here is the code:
```
high <- c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39,
  -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01,
  75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94,
  6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19,
  10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 11684.96,
  11781.77, 11863.35, 11927.44, 11980.81, 12021.88)
plot(1:45, high, type = "l")
rutledge <- function(p, x) {
 m = p$half_fluorescence
 s = p$slope
 M = p$max_fluorescence
 B = p$back_fluorescence
 y = (M / ( 1 + exp(-(x-m)/s)) ) + B
 return(y)
}
desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
   max_fluorescence = 11839.8, back_fluorescence
= -138.864) , high)

points(1:45, desmos, type="l", col="blue")



In your calculation of desmos, you are using the Y variable for x in the
formula.  Calculate it this way instead:

desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
 max_fluorescence = 11839.8, back_fluorescence
= -138.864) , 1:45)

Duncan Murdoch











__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

2021-03-17 Thread Luigi Marongiu

sorry, I don't get it...

On Wed, Mar 17, 2021 at 2:35 PM Duncan Murdoch  wrote:
>
> On 17/03/2021 6:59 a.m., Luigi Marongiu wrote:
> > yes, but in `rutledge` I model y as `y = (M / ( 1 + exp(-(x-m)/s)) ) +
> > B`, with x being 1:45. Isn't that the equivalent of what I fed Desmos
> > with? Tx
>
> No, it's not.
>
> Duncan Murdoch
>
> >
> > On Wed, Mar 17, 2021 at 11:31 AM Duncan Murdoch
> >  wrote:
> >>
> >> On 17/03/2021 5:41 a.m., Luigi Marongiu wrote:
> >>> Hello,
> >>> I have a dataset from a polymerase chain reaction. I am using the
> >>> equation given by Rutledge
> >>> (https://pubmed.ncbi.nlm.nih.gov/15601990/) but the profile I get in R
> >>> does not match the data. I ran the same thing in Desmos and instead
> >>> the profile is correct (attached).
> >>> Why do I not get the same matching model as in Desmos? I believe the
> >>> formula in R is the same as the one in Desmos, and I am using the same
> >>> parameters.
> >>> Is there a procedure to debug models?
> >>> Thanks
> >>>
> >>> Here is the code:
> >>> ```
> >>> high <- c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39,
> >>>  -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 
> >>> 15.01,
> >>>  75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 
> >>> 5059.94,
> >>>  6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19,
> >>>  10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 
> >>> 11684.96,
> >>>  11781.77, 11863.35, 11927.44, 11980.81, 12021.88)
> >>> plot(1:45, high, type = "l")
> >>> rutledge <- function(p, x) {
> >>> m = p$half_fluorescence
> >>> s = p$slope
> >>> M = p$max_fluorescence
> >>> B = p$back_fluorescence
> >>> y = (M / ( 1 + exp(-(x-m)/s)) ) + B
> >>> return(y)
> >>> }
> >>> desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
> >>>   max_fluorescence = 11839.8, back_fluorescence
> >>> = -138.864) , high)
> >>>
> >>> points(1:45, desmos, type="l", col="blue")
> >>
> >>
> >> In your calculation of desmos, you are using the Y variable for x in the
> >> formula.  Calculate it this way instead:
> >>
> >> desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
> >> max_fluorescence = 11839.8, back_fluorescence
> >>= -138.864) , 1:45)
> >>
> >> Duncan Murdoch
> >>
> >
> >
>


-- 
Best regards,
Luigi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

2021-03-17 Thread Duncan Murdoch


On 17/03/2021 6:59 a.m., Luigi Marongiu wrote:

yes, but in `rutledge` I model y as `y = (M / ( 1 + exp(-(x-m)/s)) ) +
B`, with x being 1:45. Isn't that the equivalent of what I fed Desmos
with? Tx


No, it's not.

Duncan Murdoch



On Wed, Mar 17, 2021 at 11:31 AM Duncan Murdoch
 wrote:


On 17/03/2021 5:41 a.m., Luigi Marongiu wrote:

Hello,
I have a dataset from a polymerase chain reaction. I am using the
equation given by Rutledge
(https://pubmed.ncbi.nlm.nih.gov/15601990/) but the profile I get in R
does not match the data. I ran the same thing in Desmos and instead
the profile is correct (attached).
Why do I not get the same matching model as in Desmos? I believe the
formula in R is the same as the one in Desmos, and I am using the same
parameters.
Is there a procedure to debug models?
Thanks

Here is the code:
```
high <- c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39,
 -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01,
 75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94,
 6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19,
 10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 11684.96,
 11781.77, 11863.35, 11927.44, 11980.81, 12021.88)
plot(1:45, high, type = "l")
rutledge <- function(p, x) {
m = p$half_fluorescence
s = p$slope
M = p$max_fluorescence
B = p$back_fluorescence
y = (M / ( 1 + exp(-(x-m)/s)) ) + B
return(y)
}
desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
  max_fluorescence = 11839.8, back_fluorescence
= -138.864) , high)

points(1:45, desmos, type="l", col="blue")



In your calculation of desmos, you are using the Y variable for x in the
formula.  Calculate it this way instead:

desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
max_fluorescence = 11839.8, back_fluorescence
   = -138.864) , 1:45)

Duncan Murdoch






__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

2021-03-17 Thread Luigi Marongiu

yes, but in `rutledge` I model y as `y = (M / ( 1 + exp(-(x-m)/s)) ) +
B`, with x being 1:45. Isn't that the equivalent of what I fed Desmos
with? Tx

On Wed, Mar 17, 2021 at 11:31 AM Duncan Murdoch
 wrote:
>
> On 17/03/2021 5:41 a.m., Luigi Marongiu wrote:
> > Hello,
> > I have a dataset from a polymerase chain reaction. I am using the
> > equation given by Rutledge
> > (https://pubmed.ncbi.nlm.nih.gov/15601990/) but the profile I get in R
> > does not match the data. I ran the same thing in Desmos and instead
> > the profile is correct (attached).
> > Why do I not get the same matching model as in Desmos? I believe the
> > formula in R is the same as the one in Desmos, and I am using the same
> > parameters.
> > Is there a procedure to debug models?
> > Thanks
> >
> > Here is the code:
> > ```
> > high <- c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39,
> > -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01,
> > 75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94,
> > 6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19,
> > 10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 
> > 11684.96,
> > 11781.77, 11863.35, 11927.44, 11980.81, 12021.88)
> > plot(1:45, high, type = "l")
> > rutledge <- function(p, x) {
> >m = p$half_fluorescence
> >s = p$slope
> >M = p$max_fluorescence
> >B = p$back_fluorescence
> >y = (M / ( 1 + exp(-(x-m)/s)) ) + B
> >return(y)
> > }
> > desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
> >  max_fluorescence = 11839.8, back_fluorescence
> > = -138.864) , high)
> >
> > points(1:45, desmos, type="l", col="blue")
>
>
> In your calculation of desmos, you are using the Y variable for x in the
> formula.  Calculate it this way instead:
>
> desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
>max_fluorescence = 11839.8, back_fluorescence
>   = -138.864) , 1:45)
>
> Duncan Murdoch
>


-- 
Best regards,
Luigi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

2021-03-17 Thread Duncan Murdoch


On 17/03/2021 5:41 a.m., Luigi Marongiu wrote:

Hello,
I have a dataset from a polymerase chain reaction. I am using the
equation given by Rutledge
(https://pubmed.ncbi.nlm.nih.gov/15601990/) but the profile I get in R
does not match the data. I ran the same thing in Desmos and instead
the profile is correct (attached).
Why do I not get the same matching model as in Desmos? I believe the
formula in R is the same as the one in Desmos, and I am using the same
parameters.
Is there a procedure to debug models?
Thanks

Here is the code:
```
high <- c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39,
-14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01,
75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94,
6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19,
10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 11684.96,
11781.77, 11863.35, 11927.44, 11980.81, 12021.88)
plot(1:45, high, type = "l")
rutledge <- function(p, x) {
   m = p$half_fluorescence
   s = p$slope
   M = p$max_fluorescence
   B = p$back_fluorescence
   y = (M / ( 1 + exp(-(x-m)/s)) ) + B
   return(y)
}
desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
 max_fluorescence = 11839.8, back_fluorescence
= -138.864) , high)

points(1:45, desmos, type="l", col="blue")



In your calculation of desmos, you are using the Y variable for x in the 
formula.  Calculate it this way instead:


desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
  max_fluorescence = 11839.8, back_fluorescence
 = -138.864) , 1:45)

Duncan Murdoch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] modelling 4-parameter curve in R does not match data - how to proceed?

2021-03-17 Thread Luigi Marongiu

Hello,
I have a dataset from a polymerase chain reaction. I am using the
equation given by Rutledge
(https://pubmed.ncbi.nlm.nih.gov/15601990/) but the profile I get in R
does not match the data. I ran the same thing in Desmos and instead
the profile is correct (attached).
Why do I not get the same matching model as in Desmos? I believe the
formula in R is the same as the one in Desmos, and I am using the same
parameters.
Is there a procedure to debug models?
Thanks

Here is the code:
```
high <- c(120.64, 66.14, 34.87, 27.11, 8.87, -5.8, 4.52, -7.16, -17.39,
   -14.29, -20.26, -14.99, -21.05, -20.64, -8.03, -21.56, -1.28, 15.01,
   75.26, 191.76, 455.09, 985.96, 1825.59, 2908.08, 3993.18, 5059.94,
   6071.93, 6986.32, 7796.01, 8502.25, 9111.46, 9638.01, 10077.19,
   10452.02, 10751.81, 11017.49, 11240.37, 11427.47, 11570.07, 11684.96,
   11781.77, 11863.35, 11927.44, 11980.81, 12021.88)
plot(1:45, high, type = "l")
rutledge <- function(p, x) {
  m = p$half_fluorescence
  s = p$slope
  M = p$max_fluorescence
  B = p$back_fluorescence
  y = (M / ( 1 + exp(-(x-m)/s)) ) + B
  return(y)
}
desmos <- rutledge(list(half_fluorescence = 27.1102, slope = 2.76798,
max_fluorescence = 11839.8, back_fluorescence
= -138.864) , high)

points(1:45, desmos, type="l", col="blue")
```

-- 
Best regards,
Luigi
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling non-Negative Time Series

2016-02-01 Thread Giorgio Garziano


https://cran.r-project.org/web/packages/tsintermittent/tsintermittent.pdf


Best,

--
GG



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modelling non-Negative Time Series

2016-01-31 Thread Lorenzo Isella


Dear All,
I am struggling to develop a model to forecast the daily expenses from
a bank account.
The daily time series consists (obviously) of non-negative numbers
which can be zero in the days when no money is taken from the bank
account.
To give you an idea of the kind of series I am dealing with, please
have a look at

myts<-structure(c(5.5, 0, 126.93, 0, 0, 0, 0, 10, 0, 135.34, 0, 0,
0, 0, 98.21, 0, 112.38, 0, 0, 0, 0, 0, 1373.77, 151.83, 26.66,
205.5, 129.33, 172.5, 0, 10, 131.09, 0, 0, 0, 0, 0, 689, 0, 0,
0, 0, 0, 0, 60.6, 183, 98.21, 0, 0, 0, 0, 1433.79, 175.89, 0,
0, 0, 200, 134.33, 98.26, 112.21, 0, 0, 0, 0, 0, 0, 112.31, 0,
0, 0, 0, 120, 0, 350, 0, 0, 98.21, 0, 0, 0, 113.24, 0, 0, 0,
0, 15, 696.65, 321.87, 929, 210.58, 0, 0, 10), .Tsp = c(16563,
16654, 1), class = "ts")

(the time origin is a bit funny, but what matters is that I have daily
data).

Do you know any R package to handle this kind of series? I think I am
outside the domain of the ARIMA approach.
I experimented with acp and tscount (to see if I could treat the
series as an autoregressive Poisson series), but I did not get very
far.
Any suggestion is appreciated.
Cheers

Lorenzo

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modelling categorical variables

2015-09-08 Thread Jessica Lavabre

Hi,

I am a beginner with statistics and R and have no clue on how to model my
data. I have collected information on seed traps (ID) that includes the
habitat type (Hab) and different measures of distances. Also I have applied
a modularity analysis, so that the seeds traps are grouped into modules. My
dataset is as follow:



*IDHab  ModuleDistEdgeMeanDist1MeanDist2MeanDist3
F48F   A 21.768   24.941  6.033
27.642F50F   E 35.666**   60.505  149.927 *
*   48.582   F52F   B** 12.243**   103.041
72.908  *
*102.375N02N  B **58.681**
129.59  127.344   *
* 131.383   N17N  B** 62.829**   72.827 **
76.736  *
*77.644  N22N  B** 89.207**   78.719  **
75.005   *
*   81.176N33N  A** 23.288**   35.48**
25.317 *
* 36.931N40N  B** 36.734**   62.234 **
30.68   *
* 61.885N47N  E  **   60.443**   66.367  **
150.892 **   55.097   *

I am looking for a way to analyze if there is any correlation between the
Module classification and the other variables. My difficulties here are:
1 - is there a way to model my data where Module is the response variable
(something like Module~Hab*DistEdge*MeanDist1) ? If so, which model should
I use (I only have a bit of experience with glm) and which distribution?
2 - Is that a problem if I have different types of predictor variable
(factor and numerical)?

Any help would be greatly appreciated,

-- Jessica Lavabre-Micas

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling categorical variables

2015-09-08 Thread PIKAL Petr

Hi

You probably wont get many answers because:

1 -you post in HTML (post in plain text)
2 -you provide data which are unreadable (copy output of dput(yourdata) instead)
3 -you ask statistical question which are rarely answered here (they are better 
suited to stackexchange list)

Regarding your models nothing prevents you to test any of them - lm, glm, ...
Or go through some available documents on CRAN like e.g.

Using R for Data Analysis and Graphics - Introduction, Examples and Commentary” 
by John Maindonald (PDF, data sets and scripts are available at JM's homepage).
“Practical Regression and Anova using R” by Julian Faraway (PDF, data sets and 
scripts are available at the book homepage).

among many others to learn how to use R for modelling.

Cheers
Petr


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jessica
> Lavabre
> Sent: Tuesday, September 08, 2015 11:01 AM
> To: r-help@r-project.org
> Subject: [R] Modelling categorical variables
>
> Hi,
>
> I am a beginner with statistics and R and have no clue on how to model
> my data. I have collected information on seed traps (ID) that includes
> the habitat type (Hab) and different measures of distances. Also I have
> applied a modularity analysis, so that the seeds traps are grouped into
> modules. My dataset is as follow:
>
>
>
> *IDHab  ModuleDistEdgeMeanDist1MeanDist2MeanDist3
> F48F   A 21.768   24.941  6.033
> 27.642F50F   E 35.666**   60.505
> 149.927 *
> *   48.582   F52F   B** 12.243**   103.041
> 72.908  *
> *102.375N02N  B **58.681**
> 129.59  127.344   *
> * 131.383   N17N  B** 62.829**   72.827 **
> 76.736  *
> *77.644  N22N  B** 89.207**   78.719  **
> 75.005   *
> *   81.176N33N  A** 23.288**   35.48**
> 25.317 *
> * 36.931N40N  B** 36.734**   62.234 **
> 30.68   *
> * 61.885N47N  E  **   60.443**   66.367  **
> 150.892 **   55.097   *
>
> I am looking for a way to analyze if there is any correlation between
> the Module classification and the other variables. My difficulties here
> are:
> 1 - is there a way to model my data where Module is the response
> variable (something like Module~Hab*DistEdge*MeanDist1) ? If so, which
> model should I use (I only have a bit of experience with glm) and which
> distribution?
> 2 - Is that a problem if I have different types of predictor variable
> (factor and numerical)?
>
> Any help would be greatly appreciated,
>
> -- Jessica Lavabre-Micas
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a jsou určeny 
pouze jeho adresátům.
Jestliže jste obdržel(a) tento e-mail omylem, informujte laskavě neprodleně 
jeho odesílatele. Obsah tohoto emailu i s přílohami a jeho kopie vymažte ze 
svého systému.
Nejste-li zamýšleným adresátem tohoto emailu, nejste oprávněni tento email 
jakkoliv užívat, rozšiřovat, kopírovat či zveřejňovat.
Odesílatel e-mailu neodpovídá za eventuální škodu způsobenou modifikacemi či 
zpožděním přenosu e-mailu.

V případě, že je tento e-mail součástí obchodního jednání:
- vyhrazuje si odesílatel právo ukončit kdykoliv jednání o uzavření smlouvy, a 
to z jakéhokoliv důvodu i bez uvedení důvodu.
- a obsahuje-li nabídku, je adresát oprávněn nabídku bezodkladně přijmout; 
Odesílatel tohoto e-mailu (nabídky) vylučuje přijetí nabídky ze strany příjemce 
s dodatkem či odchylkou.
- trvá odesílatel na tom, že příslušná smlouva je uzavřena teprve výslovným 
dosažením shody na všech jejích náležitostech.
- odesílatel tohoto emailu informuje, že není oprávněn uzavírat za společnost 
žádné smlouvy s výjimkou případů, kdy k tomu byl písemně zmocněn nebo písemně 
pověřen a takové pověření nebo plná moc byly adresátovi tohoto emailu případně 
osobě, kterou adresát zastupuje, předloženy nebo jejich existence je adresátovi 
či osobě jím zastoupené známá.

This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not author

[R] Modelling categorical and non-categorical datasets using Artifical Neural Networks

2013-05-29 Thread Shane Carey

Hi,

I have to do some Radon modelling and I have categorical and non
categorical datasets. I have been considering Artificial Neural Networks to
do this. I was wondering has anybody done anything like this before and
have you any advice before I start and where there might be some good
tutorials on this type of modelling.

Thanks

-- 
Shane

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling categorical and non-categorical datasets using Artifical Neural Networks

2013-05-29 Thread Rich Shepard


On Wed, 29 May 2013, Shane Carey wrote:


I have to do some Radon modelling and I have categorical and non
categorical datasets. I have been considering Artificial Neural Networks
to do this. I was wondering has anybody done anything like this before and
have you any advice before I start and where there might be some good
tutorials on this type of modelling.


Shane,

  What questions do you want to answer?

Rich

--
Richard B. Shepard, Ph.D.  |  Have knowledge, will travel.
Applied Ecosystem Services, Inc.   |
http://www.appl-ecosys.com Voice: 503-667-4517  Fax: 503-667-8863

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling and R misconceptions; was: package installtion

2011-11-17 Thread Uwe Ligges

This is hopeless, since you never seem to listen to our advice, 
therefore this will be my very last try:


So you actually need local advice, both for statistical concepts and R 
related. No statistics software can estimate effects of variables that 
you observed to be constant (e.g. 0) all the time. If any software does, 
please delete it a once from your machine.
Instead, ask a local statistician for advice on your problem. You 
certainly want to show the data and your model to the local expert - 
since you don't show us. And then you want to ask for local R course 
since reading the documentation seems not to help. Applying mtrace() in 
a non exiting object shows this straight away.


Uwe Ligges






On 17.11.2011 15:49, Scott Raynaud wrote:

I believe the problem is a column of zeroes in my x matrix.  I have tried the 
suggestions in the documentation,
so now to try to confirm the probelm I'd like to run debug.  Here's where I 
think the problem is:

###~~  Fitting the model using lmer funtion~~###
(fitmodel- lmer(modelformula,data,family=binomial(link=logit),nAGQ=1))
mtrace(fitmodel)

I added the mtrace to catch the error, but get the following:

Error in mtrace(fitmodel) : Can't find fitmodel

How can I debug this?


- Original Message -
From: Rolf Turnerrolf.tur...@xtra.co.nz
To: Scott Raynaudscott.rayn...@yahoo.com
Cc: r-help@r-project.orgr-help@r-project.org
Sent: Wednesday, November 16, 2011 6:04 PM
Subject: Re: [R] package installtion

On 17/11/11 05:37, Scott Raynaud wrote:

That might be an option if it weren't my most important predictor.  I'm 
thinking my best bet is to use MLWin for the estimation since it will properly 
set fixed effects
   to 0.  All my other sample size simulation programs use SAS PROC IML which I 
don't have/can't afford.  I like R since it's free, but I can't work around the 
problem
I'm currently having.


This is the ``push every possible button until you get a result and to hell 
with what
anything actually means'' approach to statistics.  The probability of getting a
*meaningful* result from this approach is close to zero.

Why don't you try to *understand* what is going on, rather than wildly throwing
every possible piece of software at the problem until one such piece runs?

 cheers,

 Rolf Turner


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling and R misconceptions; was: package installtion

2011-11-17 Thread Scott Raynaud

My responses are in brackets below, plus a final note after the main text.

- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Scott Raynaud scott.rayn...@yahoo.com
Cc: r-help@r-project.org r-help@r-project.org
Sent: Thursday, November 17, 2011 9:16 AM
Subject: Re: [R] modelling and R misconceptions; was: package installtion

This is hopeless [That's a matter of perception-even concentration camp 
prisoners 
found a way to hope (see Viktor Frankl)], since you never [never is a strong 
word 
and many times leads to cognitive errors] seem to listen to our 
advice [It's possible that I misunderstood your recommedations (more likely), 
or that you communicated poorly (less likely)], 
therefore this will be my very last try:

So you actually need local advice [Yes I need advice-that's why I post here!], 
both for statistical concepts and R related [I don't claim to be a statistical 
genius, 
but I can hold my own.  Now, R is a different matter].  No statistics software 
can estimate effects of variables that you observed to be constant (e.g. 0) 
all the time [I think you misuderstood my intentions-I never wanted to estimate 
effects that are 0 all of the time]. If any software does, 
please delete it a once from your machine.
Instead, ask a local statistician for advice on your problem. You 
certainly want to show the data and your model to the local expert - 
since you don't show us. [I gave a detailed explanation in a previous post 
which I repeat here:

|OK, I'm using William Browne's MLPowSim to create an R script which will 
simulate samples for estimation of sample size in mixed models.  I have subjects
| nested in hospitals with hospitals treated as random and all of my covariates 
at level 1.  My outcome is death, so it's binary and I'll have a fixed and 
|random intercept.  My interest is in the relation of the covariates to the 
outcome.  
| 
|My most important variable is gestational age (GA) which my investigators 
divide thusly: 23-24, 25-26, 27-28, 29-30 and 31-32.  I have recoded the
| dummies for GA in the script according to the MLPowSim instructions to a 
random multinomial variable:
| 
|   macpred-rmultinom(n2,1,c(.1031,.1482,.2385,.4404,.0698)) 
|   x[,3]-macpred[1,][l2id]
|   x[,4]-macpred[2,][l2id]
|   x[,5]-macpred[3,][l2id]
|   x[,6]-macpred[4,][l2id]
|
|GA 23-24 is the reference with p=.0698.  I started with a structured sampling 
scheme of 20, 60, 100, 120 and 140 level 2 units.  My level 2 units have 
|different sizes.  So at 20 I had 5 hospitals with 100 patients, 4 with 280, 3 
with 460, 3 with 640, 3 with 820 and 2 with 1000.  Thus, at 60 hospitals, I 
have 15, 
|12, 9, 9, 9, 6 with the same cell sample sizes.
| 
|According to the MLPowSim documentation, with small probablities it's possible 
to have a column of zeroes in the X matrix if there are not many units in 
|the random factor.  R will choke on this but MLWin sets the associated fixed 
effects to 0.  When R choked, I increased from 20 to 60 as my minimum as 
|suggested in the MLPowSim documentation.  Still no luck.

Since this is a simulation, I assume once and a while that by chance a 
coefficient could be 0. 
In fact, Browne mentions as much in his documentation.  There is a bit more to 
my simulation, 
but I thought I'd try to keep it as simple as possible, at least at the outset.]

And then you want to ask for local R course 
since reading the documentation seems not to help [You got that right!]. 
Applying mtrace() in 
a non exiting object shows this straight away.

Uwe Ligges

Apparently I misuderstood the prupose of mtrace after reading the 
documentation-I thought it was 
to debug problems of the sort I've encountered.  Michael Weylandt provided 
appropriate direction 
in the previous post for which I am grateful.

Not all of us can be intellectual superstars.  That's why we ask for help.  
This much I did read and understand
from the R posting guide:

Responding to other posts: 
* Rudeness and ad hominem comments are not acceptable. Brevity is OK. 
It's a good lesson to learn.

On 17.11.2011 15:49, Scott Raynaud wrote:
 I believe the problem is a column of zeroes in my x matrix.  I have tried the 
 suggestions in the documentation,
 so now to try to confirm the probelm I'd like to run debug.  Here's where I 
 think the problem is:

 ###~~      Fitting the model using lmer funtion    ~~###
 (fitmodel- lmer(modelformula,data,family=binomial(link=logit),nAGQ=1))
 mtrace(fitmodel)

 I added the mtrace to catch the error, but get the following:

 Error in mtrace(fitmodel) : Can't find fitmodel

 How can I debug this?

 - Original Message -
 From: Rolf Turnerrolf.tur...@xtra.co.nz
 To: Scott Raynaudscott.rayn...@yahoo.com
 Cc: r-help@r-project.orgr-help@r-project.org
 Sent: Wednesday, November 16, 2011 6:04 PM
 Subject: Re: [R] package installtion

 On 17/11/11 05:37, Scott Raynaud wrote

[R] Modelling poisson distribution with variance structure

2010-08-04 Thread Karen Moore

I'm dealing with count data that's nested and has spatial dependence.
I ran a glmm in lmer with a random factor for nestedness. Spatial dependence
seems to have been accommodated by model. However I can't add a variance
strcuture to this model (to accommodate heterogeneity).

Is there a model that can have a poisson distribution *AND*  a variance
structure *AND* have AIC in output (for model comparison and selection)?
Some we've looked at that can't:

   - glmmPQL  - can add structures BUT can't have AIC (you can calculate it
   but it doesn't give correct AIC with this model)
   - glmm in lme4 (lmer)  - won't allow variance structure
   - gls -  can add variance but can't have Poisson

Thanks so much,

Karen Moore
PhD Researcher,
FORESTBIO,
Department of Botany,
Trinity College Dublin
Ireland

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling poisson distribution with variance structure

2010-08-04 Thread Ben Bolker

Karen Moore kmoore at tcd.ie writes:

 
 I'm dealing with count data that's nested and has spatial dependence.
 I ran a glmm in lmer with a random factor for nestedness. Spatial dependence
 seems to have been accommodated by model. However I can't add a variance
 strcuture to this model (to accommodate heterogeneity).
 
 Is there a model that can have a poisson distribution *AND*  a variance
 structure *AND* have AIC in output (for model comparison and selection)?
 Some we've looked at that can't:
 
- glmmPQL  - can add structures BUT can't have AIC (you can calculate it
but it doesn't give correct AIC with this model)
- glmm in lme4 (lmer)  - won't allow variance structure
- gls -  can add variance but can't have Poisson


  [Any further discussion should probably go to 
r-sig-mixed-mod...@r-project.org ...]

  I'm not sure I know what you mean by Poisson + variance structure --
if the data are really Poisson (not overdispersed in some way), then
the variance structure is completely defined.  If you want to deal
with overdispersion, and have a well-defined AIC, you may be able
to add a per-observation random effect in lme4.  Alternatively,
you could just use a weights= argument in glmmPQL to set some sensible
mean-variance relationship, overlooking the fact that the data
are discrete and positive rather than being normally distributed
with an equivalent variance structure.  http://glmm.wikidot.com/faq
may also be useful.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling survival with time-dependent covariates

2010-07-02 Thread Terry Therneau

 1-Would informing the algorithm coxph which samples represents the same
person (through the use of an Id for example) improve the ?efficiency?
of the estimated model? And if so, how should i do that? Using strata()?

 No, it makes no change. The reason is that the (start, stop] is just a
trick.  At each death time the program needs to figure out what the
covariates are for everyone else at that time; the start,stop lets it
pick the right line for each subject.  As long as there are no overlaps,
i.e. (0,20], (15, 50], then there is only one copy of the person, and no
'correlated data' issue.  (Overlap is wierd -- it corresponds to two
copies of me being in the room at the same time.)
 If there are multiple events for a subject, then there is correlation
(via a different mechanism), and addition of a cluster() term is needed.

2- He later suggests ?accommodating non-proportional hazards by building
interactions between covariates and time into the Cox regression model?
as follows:
 
 coxph(Surv(start, stop, arrest.time) ~fin + age + age:stop + prio, ...

This trick ONLY works if 
  a. the data set has been artificially divided (as your example has)
into small uniform time increments, the same for each subject.
  b. the form of the non-ph is actally a linear change in beta over
time.  Use cox.zph on the original model to look at this.  When I see
non-ph (the plot from cox.zph is not horizontal) life is rarely so
simple.

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modelling survival with time-dependent covariates

2010-07-01 Thread Ben Rhelp

Hi all,

I am looking at the tutorial/appendix from John Fox on “Cox 
Proportional-Hazards Regression for Survival Data” available here:
http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
I am particularly interested in modelling survival with time-dependent 
covariates (Section 4).
 
The data look like this:
  Rossi.2[1:50,]
start
stop arrest.time week arrest fin age race wexp mar paro prio educ employed
0 1 0 20 1 0 27 1 0 0 1 3 3 0
1 2 0 20 1 0 27 1 0 0 1 3 3 0
...
18 19 0 20 1 0 27 1 0 0 1 3 3 0
19 20 1 20 1 0 27 1 0 0 1 3 3 0
0 1 0 17 1 0 18 1 0 0 1 8 4 0
1 2 0 17 1 0 18 1 0 0 1 8 4 0
...
15 16 0 17 1 0 18 1 0 0 1 8 4 0
16 17 1 17 1 0 18 1 0 0 1 8 4 0
0 1 0 25 1 0 19 0 1 0 1 13 3 0
1 2 0 25 1 0 19 0 1 0 1 13 3 0
...
3.13 12 13 0 25 1 0 19
0 1 0 1 13 3 0
 
John suggests the following model:
mod.allison.2 - coxph(Surv(start, stop, arrest.time) ~
+ fin + age + race + wexp + mar + paro + prio + employed,
+ data=Rossi.2)
 1-Would informing the algorithm coxph which samples represents the same person 
(through the use of an Id for example) improve the “efficiency” of the 
estimated model? And if so, how should i do that? Using strata()?
 
2- He later suggests “accommodating non-proportional hazards by building 
interactions between covariates and time into the Cox regression model” as 
follows:
 
mod.allison.5
- coxph(Surv(start, stop, arrest.time) ~
+   fin + age + age:stop + prio,
+   data=Rossi.2)
 
I have read quite a lot of documentation to understand the meaning of “age + 
age:stop” in the formula, but I am unsure of what it means. If I wanted to  
visualise these variables which are entering the model, would it be something 
like:
data.frame(Rossi.2$age,Rossi.2$age %in% Rossi.2$stop)
 
I hope this make sense. Thanks for your help,
Ben




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modelling Crystal Growth

2010-06-25 Thread mplar1


Dear all,

I would like to hear from anyone who has experience using R to simulate and
visualise the formation and growth of crystals.

Thank you.

mpl
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Modelling-Crystal-Growth-tp2268746p2268746.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modelling

2009-09-24 Thread Ashta

Dear R-users,

Suppose I have the followin g sample of data,

 0   1   2  4  3
 1   2   1  3  1
 1   3   3  4  1
 0   1   2  1  2
 1   4   1  4  2
  1   2   2  1  1

The first variable is the response variable where 0 is defective and 1
normal. The other four factors( x1,x2,x3,x4) that influence the outcome.

I want to fit a binomial model in  R . I want also to rder the factors based
on their degree of  influence the outcome.  How do I do this  in R.

thanks in advance

Ashta

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling an incomplete Poisson distribution ?

2009-04-19 Thread Emmanuel Charpentier

Dear Ben,

Le samedi 18 avril 2009 à 23:37 +, Ben Bolker a écrit :
 Emmanuel Charpentier charpent at bacbuc.dyndns.org writes:
 
  
  I forgot to add that yes, I've done my homework, and that it seems to me
  that answers pointing to zero-inflated Poisson (and negative binomial)
  are irrelevant ; I do not have a mixture of distributions but only part
  of one distribution, or, if you'll have it, a zero-deflated Poisson.
  
  An answer by Brian Ripley
  (http://finzi.psych.upenn.edu/R/Rhelp02/archive/11029.html) to a similar
  question leaves me a bit dismayed : if it is easy to compute the
  probability function of this zero-deflated RV (off the top of my head,
  Pr(X=x)=e^-lambda.lambda^x/(x!.(1-e^-lambda))), and if I think that I'm
  (still) able to use optim to ML-estimate lambda, using this to
  (correctly) model my problem set and test it amounts to re-writing some
  (large) part of glm. Furthermore, I'd be a bit embarrassed to test it
  cleanly (i. e. justifiably) : out of the top of my head, only the
  likelihood ration test seems readily applicable to my problem. Testing
  contrasts in my covariates ... hum !
  
  So if someone has written something to that effect, I'd be awfully glad
  to use it. A not-so-cursory look at the existing packages did not ring
  any bells to my (admittedly untrained) ears...
  
  Of course, I could also bootstrap the damn thing and study the
  distribution of my contrasts. I'd still been hard pressed to formally
  test hypotheses on factors...
  
 
   I would call this a truncated Poisson distribution, related
 to hurdle models.  You could probably use the hurdle function
 in the pscl package to do this, by ignoring the fitting to
 the zero part of the model.  On the other hand, it might break
 if there are no zeros at all (adding some zeros would be a
 pretty awful hack/workaround).

Indeed ... 

   If you defined a dtpoisson() for the distribution of the
 truncated Poisson model, you could probably also use bbmle
 with the formula interface and the parameters argument.

This I was not aware of... Thank you for the tip !

By the way, I never delved into stats4, relying (erroneously) on its
one-liner description in CRAN, which is (more or less) an implementation
of stats with S4 classes. Therefore mle escaped me also... May I suggest
a better one-liner ? (This goes also for bbmle...)

   The likelihood ratio test seems absolutely appropriate for
 this case.  Why not?

Few data, therefore a bit far from the asymptotic condition...

Anyway, a better look at my data after discovering a bag'o bugs in the
original files led me to reconsider my model : I wanted to assess, among
other things, the effect of a boolean (really a two-class variable).
After the *right* graph, I now tend to think that my distribution is
indeed zero-deflated Poisson in one group and ... zero-deflated negative
binomial in the other (still might be zero-deflated Poisson with very
small mean, hard to say graphically...). Which gives me some insights on
the mechanisms at work (yippeee !!) but will require some nonparametric
beast for assessing central value differences (yes, this still has a
meaning...), such as bootstrap.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modelling an incomplete Poisson distribution ?

2009-04-18 Thread Emmanuel Charpentier

Dear list,

I have the following problem : I want to model a series of observations
of a given hospital activity on various days under various conditions.
among my outcomes (dependent variables) is the number of patients for
which a certain procedure is done. The problem is that, when no relevant
patient is hospitalized on said day, there is no observation (for which
the number of patients item would be 0). 

My goal is to model this number of patients as a function of the
various conditions described by my independant variables, mosty of
them observed but uncontrolled, some of them unobservable (random
effects). I am tempted to model them along the lines of :

glm(NoP~X+Y+..., data=MyData, family=poisson(link=log))

or (accounting for some random effects) :

lmer(NoP~X+Y+(X|Center)), data=Mydata, family=poisson(link=log))

While the preliminary analysis suggest that (the right part of) a
Poisson distribution might be reasonable for all real observations, the
lack of observations with count==0 bothers me.

Is there a way to cajole glm (and lmer, by the way) into modelling these
data to an incomplete Poisson model, i. e. with unobserved 0
values ?

Sincerely,

Emmanuel Charpentier

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling an incomplete Poisson distribution ?

2009-04-18 Thread Emmanuel Charpentier

I forgot to add that yes, I've done my homework, and that it seems to me
that answers pointing to zero-inflated Poisson (and negative binomial)
are irrelevant ; I do not have a mixture of distributions but only part
of one distribution, or, if you'll have it, a zero-deflated Poisson.

An answer by Brian Ripley
(http://finzi.psych.upenn.edu/R/Rhelp02/archive/11029.html) to a similar
question leaves me a bit dismayed : if it is easy to compute the
probability function of this zero-deflated RV (off the top of my head,
Pr(X=x)=e^-lambda.lambda^x/(x!.(1-e^-lambda))), and if I think that I'm
(still) able to use optim to ML-estimate lambda, using this to
(correctly) model my problem set and test it amounts to re-writing some
(large) part of glm. Furthermore, I'd be a bit embarrassed to test it
cleanly (i. e. justifiably) : out of the top of my head, only the
likelihood ration test seems readily applicable to my problem. Testing
contrasts in my covariates ... hum !

So if someone has written something to that effect, I'd be awfully glad
to use it. A not-so-cursory look at the existing packages did not ring
any bells to my (admittedly untrained) ears...

Of course, I could also bootstrap the damn thing and study the
distribution of my contrasts. I'd still been hard pressed to formally
test hypotheses on factors...

Any ideas ?

Emmanuel Charpentier

Le samedi 18 avril 2009 à 19:28 +0200, Emmanuel Charpentier a écrit :
 Dear list,
 
 I have the following problem : I want to model a series of observations
 of a given hospital activity on various days under various conditions.
 among my outcomes (dependent variables) is the number of patients for
 which a certain procedure is done. The problem is that, when no relevant
 patient is hospitalized on said day, there is no observation (for which
 the number of patients item would be 0). 
 
 My goal is to model this number of patients as a function of the
 various conditions described by my independant variables, mosty of
 them observed but uncontrolled, some of them unobservable (random
 effects). I am tempted to model them along the lines of :
 
 glm(NoP~X+Y+..., data=MyData, family=poisson(link=log))
 
 or (accounting for some random effects) :
 
 lmer(NoP~X+Y+(X|Center)), data=Mydata, family=poisson(link=log))
 
 While the preliminary analysis suggest that (the right part of) a
 Poisson distribution might be reasonable for all real observations, the
 lack of observations with count==0 bothers me.
 
 Is there a way to cajole glm (and lmer, by the way) into modelling these
 data to an incomplete Poisson model, i. e. with unobserved 0
 values ?
 
 Sincerely,
 
   Emmanuel Charpentier
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling an incomplete Poisson distribution ?

2009-04-18 Thread Ben Bolker

Emmanuel Charpentier charpent at bacbuc.dyndns.org writes:

 
 I forgot to add that yes, I've done my homework, and that it seems to me
 that answers pointing to zero-inflated Poisson (and negative binomial)
 are irrelevant ; I do not have a mixture of distributions but only part
 of one distribution, or, if you'll have it, a zero-deflated Poisson.
 
 An answer by Brian Ripley
 (http://finzi.psych.upenn.edu/R/Rhelp02/archive/11029.html) to a similar
 question leaves me a bit dismayed : if it is easy to compute the
 probability function of this zero-deflated RV (off the top of my head,
 Pr(X=x)=e^-lambda.lambda^x/(x!.(1-e^-lambda))), and if I think that I'm
 (still) able to use optim to ML-estimate lambda, using this to
 (correctly) model my problem set and test it amounts to re-writing some
 (large) part of glm. Furthermore, I'd be a bit embarrassed to test it
 cleanly (i. e. justifiably) : out of the top of my head, only the
 likelihood ration test seems readily applicable to my problem. Testing
 contrasts in my covariates ... hum !
 
 So if someone has written something to that effect, I'd be awfully glad
 to use it. A not-so-cursory look at the existing packages did not ring
 any bells to my (admittedly untrained) ears...
 
 Of course, I could also bootstrap the damn thing and study the
 distribution of my contrasts. I'd still been hard pressed to formally
 test hypotheses on factors...
 

  I would call this a truncated Poisson distribution, related
to hurdle models.  You could probably use the hurdle function
in the pscl package to do this, by ignoring the fitting to
the zero part of the model.  On the other hand, it might break
if there are no zeros at all (adding some zeros would be a
pretty awful hack/workaround).

  If you defined a dtpoisson() for the distribution of the
truncated Poisson model, you could probably also use bbmle
with the formula interface and the parameters argument.

  The likelihood ratio test seems absolutely appropriate for
this case.  Why not?

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling probabilities instead of binary data with logisticregression

2009-03-25 Thread ONKELINX, Thierry

Hi Joris,

glm() handles proportions but will give you a warning (and not an error)
about non-integer values. So if you get an error then there should be
something wrong with the syntax, model or data. Can you provide us with
a reproducible example?

Cheers,

Thierry




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
thierry.onkel...@inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens joris meys
Verzonden: dinsdag 24 maart 2009 20:30
Aan: R-help Mailing List
Onderwerp: [R] modelling probabilities instead of binary data with
logisticregression

Dear all,

I have a dataset where I reduced the dimensionality, and now I have a
response variable with probabilities/proportions between 0 and 1. I
wanted
to do a logistic regression on those, but the function glm refuses to do
that with non-integer values in the response. I also tried lrm, but that
one
interpretes the probabilities as different levels and gives for every
level
a different intercept. Not exactly what I want...

Is there a way to specify that the response variable should be
interpreted
as a probability?

Kind regards
Joris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling probabilities instead of binary data with logisticregression

2009-03-25 Thread joris meys

Hi Thierry,

You're very right and I was very wrong. It's a warning indeed, and the
outcome is close to what I become if I calculate the logit transformation
myself (although not exactly the same, but I guess that has to do with the
correction the logit function does when p=0 or p=1).

I should have paid more attention to the course of categorical ;-)
Kind regards
Joris

On Wed, Mar 25, 2009 at 10:32 AM, ONKELINX, Thierry 
thierry.onkel...@inbo.be wrote:

 Hi Joris,

 glm() handles proportions but will give you a warning (and not an error)
 about non-integer values. So if you get an error then there should be
 something wrong with the syntax, model or data. Can you provide us with
 a reproducible example?

 Cheers,

 Thierry


 
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek / Research Institute for Nature
 and Forest
 Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
 methodology and quality assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium
 tel. + 32 54/436 185
 thierry.onkel...@inbo.be
 www.inbo.be

 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to
 say what the experiment died of.
 ~ Sir Ronald Aylmer Fisher

 The plural of anecdote is not data.
 ~ Roger Brinner

 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of
 data.
 ~ John Tukey

 -Oorspronkelijk bericht-
 Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 Namens joris meys
 Verzonden: dinsdag 24 maart 2009 20:30
 Aan: R-help Mailing List
 Onderwerp: [R] modelling probabilities instead of binary data with
 logisticregression

 Dear all,

 I have a dataset where I reduced the dimensionality, and now I have a
 response variable with probabilities/proportions between 0 and 1. I
 wanted
 to do a logistic regression on those, but the function glm refuses to do
 that with non-integer values in the response. I also tried lrm, but that
 one
 interpretes the probabilities as different levels and gives for every
 level
 a different intercept. Not exactly what I want...

 Is there a way to specify that the response variable should be
 interpreted
 as a probability?

 Kind regards
 Joris

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver
 weer
 en binden het INBO onder geen enkel beding, zolang dit bericht niet
 bevestigd is
 door een geldig ondertekend document. The views expressed in  this message
 and any annex are purely those of the writer and may not be regarded as
 stating
 an official position of INBO, as long as the message is not confirmed by a
 duly
 signed document.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] modelling probabilities instead of binary data with logistic regression

2009-03-24 Thread joris meys

Dear all,

I have a dataset where I reduced the dimensionality, and now I have a
response variable with probabilities/proportions between 0 and 1. I wanted
to do a logistic regression on those, but the function glm refuses to do
that with non-integer values in the response. I also tried lrm, but that one
interpretes the probabilities as different levels and gives for every level
a different intercept. Not exactly what I want...

Is there a way to specify that the response variable should be interpreted
as a probability?

Kind regards
Joris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling probabilities instead of binary data with logistic regression

2009-03-24 Thread joris meys

Thank you all for the very fast answers.

My proportions come from a factor analysis on a number of binary variables,
in order to avoid having to fit 12 logistic regressions on the same dataset.
By scaling the obtained scores to 0 and 1, I get weighted averages of the
response combinations I'm interested in.

I tried the betareg function, but that one can't deal with probabilities 0
and 1 unfortunately. I'll have to manually do the logit transformation, I'm
afraid. Thanks for the help.

Kind regards
Joris

On Tue, Mar 24, 2009 at 8:48 PM, Kjetil Halvorsen 
kjetilbrinchmannhalvor...@gmail.com wrote:

 You did'nt say how your proportions have arisen! If each corresonds to one
 observation, you could simply simulate
 indicator variables with those proportions as prob's, fit glm, repeat many
 times, and
 average results!

 More seriously, you could transform the proportions to logits
 logit - log(p/(1-p))
 and fit a linear regression.

 Kjetil

 On Tue, Mar 24, 2009 at 3:30 PM, joris meys jorism...@gmail.com wrote:

 Dear all,

 I have a dataset where I reduced the dimensionality, and now I have a
 response variable with probabilities/proportions between 0 and 1. I wanted
 to do a logistic regression on those, but the function glm refuses to do
 that with non-integer values in the response. I also tried lrm, but that
 one
 interpretes the probabilities as different levels and gives for every
 level
 a different intercept. Not exactly what I want...

 Is there a way to specify that the response variable should be interpreted
 as a probability?

 Kind regards
 Joris

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling disease spread

2008-02-14 Thread francogrex


Thanks Marcel,
In addition to your program and the reference to simecol, someone had
replied to my private email pointing out RLadyBug: An R package for
stochastic epidemic models which is on CRAN and which seems one of the most
relevant. I write it here as a reference for users doing a future search in
this archive.
Regards.


Bio7 wrote:
 
 The simecol package is maybe what you want.
  http://hhbio.wasser.tu-dresden.de/projects/simecol/
 http://hhbio.wasser.tu-dresden.de/projects/simecol/ 
 Another possibility is to use a program i've written.
 Here is a flash presentation maybe also interesting for you.
 
 http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/animaterplot.htm
 http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/animaterplot.htm
  
 

-- 
View this message in context: 
http://www.nabble.com/Modelling-disease-spread-tp15459834p15480798.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modelling disease spread

2008-02-13 Thread francogrex


I was at a lecture the other day and I saw a presentation of very neat
(short) animation modeling epidemic disease spread over a map region. When I
ask what software they used they mentioned SAS. Do you know if there are
equivalent resources in R to model the spread of disease with animation
output? My search in R-help and google didn't lead to any document (though I
found a couple of documents for SAS). However my intuition telles me there
must be a similar program in R. Thanks
-- 
View this message in context: 
http://www.nabble.com/Modelling-disease-spread-tp15459834p15459834.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Modelling disease spread

2008-02-13 Thread Bio7


The simecol package is maybe what you want.
http://hhbio.wasser.tu-dresden.de/projects/simecol/
http://hhbio.wasser.tu-dresden.de/projects/simecol/ 

Another possibility is to use a program i've written.
Here is a flash presentation maybe also interesting for you.
http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/animaterplot.htm
http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/animaterplot.htm
 

With kind regards
Marcel

-- 
View this message in context: 
http://www.nabble.com/Modelling-disease-spread-tp15459834p15460869.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

Re: [R] modelling 4-parameter curve in R does not match data - how to proceed?

[R] modelling 4-parameter curve in R does not match data - how to proceed?

Re: [R] Modelling non-Negative Time Series

[R] Modelling non-Negative Time Series

[R] Modelling categorical variables

Re: [R] Modelling categorical variables

[R] Modelling categorical and non-categorical datasets using Artifical Neural Networks

Re: [R] Modelling categorical and non-categorical datasets using Artifical Neural Networks

Re: [R] modelling and R misconceptions; was: package installtion

Re: [R] modelling and R misconceptions; was: package installtion

[R] Modelling poisson distribution with variance structure

Re: [R] Modelling poisson distribution with variance structure

Re: [R] Modelling survival with time-dependent covariates

[R] Modelling survival with time-dependent covariates

[R] Modelling Crystal Growth

[R] Modelling

Re: [R] Modelling an incomplete Poisson distribution ?

[R] Modelling an incomplete Poisson distribution ?

Re: [R] Modelling an incomplete Poisson distribution ?

Re: [R] Modelling an incomplete Poisson distribution ?

Re: [R] modelling probabilities instead of binary data with logisticregression

Re: [R] modelling probabilities instead of binary data with logisticregression

[R] modelling probabilities instead of binary data with logistic regression

Re: [R] modelling probabilities instead of binary data with logistic regression

Re: [R] Modelling disease spread

[R] Modelling disease spread

Re: [R] Modelling disease spread

32 matches

Site Navigation

Mail list logo

Footer information