Re: [R-sig-eco] Using NADA and rkt for seasonal Mann Kendall tests for censored datasets

Philippi, Tom Thu, 24 Jul 2014 08:29:08 -0700

Tim--

When I had decreasing MDL over a monitoring period, I looked at the
distribution of the values collected under the later, lower MDL (MDL2) that
were <MDL1 (including BDL2), then used that distribution (estimated via
kerneling) to simulate/impute values under the earlier MDL1.  In my
application, log-normal was not a reasonable fit for those values.  In your
application log-normal may be fine, but I suggest plotting the pdf of those
values to be sure.  My approach can require pretty large sample sizes,
especially if you have to account for seasons and sites/regions.


If you haven't looked at it and rejected it for your application, you may
want to consider the EnvStats package as an alternative to rkt (and
possibly to NADA).  For my applications, I needed the additional tests for
heterogeneity it provides.

Also, if you need to do the equivalent of stl decomposition (graphing a
repeating seasonal component plus temporal trend), if you have irregular
times or missing values, or just don't have symmetric sinusoidal seasonal
patterns (for cosinor), Gavin Simpson has some nice work using GAMMs to fit
the data.
http://www.fromthebottomoftheheap.net/2014/05/09/modelling-seasonal-data-with-gam/

I hope that this helps a bit.
Tom 2

On Thu, Jul 24, 2014 at 8:59 AM, clarkbar <[email protected]> wrote:

> I have recently put together some code for running trend analyses on
> censored
> datasets with shifting Method Detection Limits (MDLs). I have applied the
> NADA package to determine the log mean and log standard deviation; these
> are
> then used to generate log-normal random numbers which then replace the <MDL
> observations. From this, I apply the rkt package for seasonal Mann-Kendall
> trend test to determine.
> I know the NADA package has cenken, but that has limited covariate and
> seasonal capabilities, and I believe my approach below is more
> conservative.
>
> I'm sorry if the code is sloppy, but I'm pretty new to all this.. going on
> nearly 2 months now.
> I would really appreciate your feedback and ideas. I think my method is
> similar to &lt;a
> href=&quot;
> http://dugi-doc.udg.edu/bitstream/handle/10256/708/Olea.pdf?sequence=1&quot
> ;>Olea's,
> but I could never get ILR to cooperate with me.
>
> Thanks in advance.
>
> #example data creation - probably not the best way to do this...
> test<-matrix(c(rep(1:7,each=20,1),rep(1:2,each=70),
>          rep(rlnorm(10,-5.6,.2),2),
>          rep(rlnorm(10,-5.6,.3)*1.2,2),
>          rep(rlnorm(10,-5.6,.3)*1.22,2),
>          rep(rlnorm(10,-5.6,.2)*1.25,2),
>          rep(rlnorm(10,-5.6,.3)*1.3,2),
>          rep(rlnorm(10,-5.6,.35)*1.4,2),
>          rep(rlnorm(10,-5.6,.35)*1.4,2)),
>          ncol=3)
> colnames(test)<-c('Year',"Season","Obs")
>
> test<-as.data.frame(test)
> with(test,rkt(Year,Obs,Season,correct=T,rep='a'))
> with(test,plot(Obs~Year))
>
> #function below
> require("NADA")
> require("rkt")
> MDL_fix<-function(value,detection,year,block,CV,month,plot_title)
>   #Observed Value vector, Detection Limit Vector,Block Vector for seasonal
> MannKendall,covariate,month,Plot title
> {
>   YearMonth=year+month/12
>   #using NADA package, determine log-mean of and log-standard deviation of
> the observed valued
>   #if(mean(value)<1) {#determine if log or -log should be used (cenfit has
> issues with negatives)
>   #lmeanvalue<-{sum(summary(cenmle(value,value<detection))[[9]][1:2])}
>   #lsdvalue&lt;-{summary(cenmle(value,value&lt;detection))[[6]]}
>   #}
>   #if(mean(value)>=1) {#determine if log or -log should be used (cenfit has
> issues with negatives)
>   if(length(which(value<detection))/length(value)>.50){
>     return("ERROR: TOO MANY POINTS BELOW MDL")}
>   lmeanvalue<-{(median(cenfit((log(value)),value<detection))[[1]])}
>     lsdvalue&lt;-{(sd(cenfit(log(value),value&lt;detection)))[[1]]}
>   #}
>   if(length(which(value&lt;detection))/length(value)>.05){ #only pursue MDL
> fix if >5% of data censored, otherewise: do 1/2 MDL method
>     #create matrix to fill with Mann-Kendall Output
>     rkt_1<-matrix(1,runs,12)
>     colnames(rkt_1)<- c('2-sided
> p-value','Score',"Slope","Var(Score)",'2-sided
> p-value_corrected','Var(Score)_corrected',
>                         'Partial Score', 'Partial 2-sided p-value',
> "Partial
> Var(Score)", "Partial 2-sided p-value_corrected"
>                         , 'Partial Var(Score)_corrected',"Tau")
>     for(i in seq(1:runs)) {
> {
>   xj<-1:length(value) #create random number vector
>   MDL_fixed<-1:length(value) #create final value vector
>   for(j in seq(1:length(value))) {
> {
>   if(value[j]<detection[j]) #only pursue number generation if the observed
> value is below the detection limit
> {
>     #create a random number (distributed log-normally based on lmean and
> lsd) that is below the detection limit
>     repeat {
>       xj[j]&lt;-rlnorm(1,lmeanvalue,lsdvalue)
>       if(xj[j]&lt;detection[j]) {break}
>     }
>   }
> }
>     MDL_fixed[j]&lt;-ifelse(value[j]&lt;detection[j],xj[j],value[j])
> #replace values below the detection limit with generated number
>   }
> }
> if(!missing(CV)&amp; !missing(block)) { #if covariate and block present, do
> seasonal mann-kendall
>
> rkt_1[i,]&lt;-as.vector(unlist(rkt(year,MDL_fixed,block,CV,correct=T,rep=&quot;a&quot;)))
> #run MK for each random number generation
> }
> if(missing(block) | missing(CV)){  #if covariate and block mising, do
> mann-kendall
>
>
> rkt_1[i,]&lt;-as.vector(unlist(rkt(year,MDL_fixed,correct=F,rep=&quot;a&quot;)))
> #run MK for each random number generation
> }
>     }
> #plot final iteration of MDL_fixed
> plot(YearMonth,MDL_fixed,
>
> log='y',main=as.character(substitute(plot_title)),xlab=&quot;Year&quot;,ylab='Concentration(ppm)')
> yaxis&lt;-10^(par('usr')[c(3,4)]) #grab axis size
> #draw first quartile slope and intercept
>
> abline(a=median(value[which(value>detection)])-fivenum(rkt_1[,"Slope"])[2]*median(year[which(value>detection)]),b=fivenum(rkt_1[,"Slope"])[2],
> untf=TRUE, col="red")
> #draw median slope and intercept
>
> abline(a=median(value[which(value>detection)])-fivenum(rkt_1[,"Slope"])[3]*median(year[which(value>detection)]),b=fivenum(rkt_1[,"Slope"])[3],
> untf=TRUE, col="blue")
> #draw third quartile slope and intercept
>
> abline(a=median(value[which(value>detection)])-fivenum(rkt_1[,"Slope"])[4]*median(year[which(value>detection)]),b=fivenum(rkt_1[,"Slope"])[4],
> untf=TRUE, col="green")
> par(new=T) #hold on
> #plot detection limit
> plot(YearMonth,detection, ylim=yaxis,
> log='y',type='l',axes=F,xlab='',ylab='',lty=3)
> par(new=F) #hold off
> return(rkt_1) #return matrix of iterations
>   }
> else{
>   value_half_MDL<-ifelse(value>detection,value,0.5*detection)
>   if(!missing(CV)& !missing(block)) {#if covariate and block present, do
> seasonal mann-kendall
>
>
> rkt_noMDL=as.vector(unlist(rkt(year,value_half_MDL,block,CV,correct=T,rep="a")))#run
> MK w/o num geneartion
>   }
>   if(missing(CV)| missing(block)) {#if covariate and block mising, do
> mann-kendall
>
> rkt_noMDL=as.vector(unlist(rkt(year,value_half_MDL,correct=F,rep="a")))#run
> MK w/o num geneartion
>   }
>   #plot value over time
>   plot(YearMonth,value_half_MDL,
>
> log='y',main=as.character(substitute(plot_title)),xlab="Year",ylab='Concentration(ppm)')
>   yaxis<-10^(par('usr')[c(3,4)])#grab axis size
>   par(new=T)
>   plot(YearMonth,detection, ylim=yaxis,
> log='y',type='l',axes=F,xlab='',ylab='',lty=3)
>   par(new=F)
>
>
> abline(a=median(value[which(value>detection)])-rkt_noMDL[3]*median(year[which(value>detection)]),b=rkt_noMDL[3],
> untf=TRUE, col="red")
>
>   return(rkt_noMDL) #return vector of Mann Kendall
> }
> } ##MDL fixing code for log-normal
>
> runs=100
>
> summary(with(test,MDL_fix(Obs,detection=c(rep(.005,80),rep(.002,60)),Year,Season,,month=rep(12,140)),"hello"))
>
>
>
> --
> View this message in context:
> http://r-sig-ecology.471788.n2.nabble.com/Using-NADA-and-rkt-for-seasonal-Mann-Kendall-tests-for-censored-datasets-tp7578989.html
> Sent from the r-sig-ecology mailing list archive at Nabble.com.
>
> _______________________________________________
> R-sig-ecology mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>



-- 
-------------------------------------------
Tom Philippi
Quantitative Ecologist & Data Therapist
Inventory and Monitoring Program
National Park Service
[email protected]
http://science.nature.nps.gov/im/monitor

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-ecology mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] Using NADA and rkt for seasonal Mann Kendall tests for censored datasets

Reply via email to