Re: [R] Installing dyplr on Linux requires a ton of chasing down dependencies

2019-10-14 Thread Collin Lynch
Adam, while I am not familiar with that particular variant of Linux,
it sounds like a package manager mismatch: the Ubuntu package
looks for specific libraries that are named differently on your
system.  If you can run a GUI then you have some form of X, but the
libraries may be named differently.  It looks like Pop uses apt, so you
might try installing R with apt-get and see if that works, or consider
compiling from source.

Collin Lynch.

On Mon, Oct 14, 2019 at 7:36 AM Dirk Eddelbuettel  wrote:
>
>
> Adam,
>
> You may find this blog post and the video instructive:
>
>   http://dirk.eddelbuettel.com/blog/2019/06/09#022_rocker_and_ppas
>
> It illustrates how 'installing tidyverse' (or rstan) can be a single command
> and done in under two minutes on Linux, with the appropriate distribution and
> settings.  In short:  some have binaries prebuilt, some don't.
>
> My blog has a few posts in the 'r4' section on that as well as on other
> approaches to this.
>
>   http://dirk.eddelbuettel.com/blog/code/r4/
>
> Now, you chose a somewhat non-standard distro. The price of that choice may
> indeed be that you have to install everything (R/CRAN-related) from source.
>
> Hope this helps, Dirk
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
ArgLab & Center for Educational Informatics
Department of Computer Science
North Carolina State University

https://research.csc.ncsu.edu/arglab/people/cflynch.html



Re: [R] PythonInR. Python script in R with parameters required. Download satellite images from NASA

2015-08-01 Thread Collin Lynch
Magi, is there a reason that you need to run the script via R? If your
plan is to download the data via Python and then process it with R, you
might consider using the rpy2 package to link them.  This would allow
you to call the downloading code from Python and then have Python feed
the data to R.
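
If the downloader has to be started with command-line arguments, one option is to launch it from Python as a separate process, passing each argument as its own list item.  What follows is only a minimal sketch, assuming Python 3.7 or later; run_script is a hypothetical helper name, and the demonstration uses a throwaway script that echoes its arguments rather than the real downloader:

```python
import os
import subprocess
import sys
import tempfile


def run_script(script_path, args):
    """Run a Python script as a child process and return its standard output.

    Each argument is a separate list item, so no shell quoting is needed.
    """
    result = subprocess.run(
        [sys.executable, script_path, *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Demonstration with a temporary script that just prints its arguments.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "echo_args.py")
    with open(demo, "w") as handle:
        handle.write("import sys\nprint(' '.join(sys.argv[1:]))\n")
    print(run_script(demo, ["-e", "user@example.com", "-o", "ORDER-ID", "-d", "ESPA"]))
```

For the real downloader the first argument would be the path to download_espa_order.py, and the list would carry the -e, -o and -d values from your message.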

Collin.

On Wed, Jul 29, 2015 at 12:29 PM, Magi Franquesa
 wrote:
> Hello,
>
> I'm trying to execute a Python script within R (3.2.1 x64) with the
> PythonInR package. I would like to download an order of satellite images
> from NASA using a Python script (
> http://landsat.usgs.gov/documents/espa_bulk_downloader_v1.0.0.zip) but I
> have had no success. I first run the pyExecfile command with the *feedparser.py*
> script and then the *download_espa_order.py* giving the required parameters
> (my email account and the order number). Here is the code:
>
> setwd("C:/Python27")
> install.packages("PythonInR")
> library(PythonInR)
> pyConnect(pythonExePath="C:/Python27/python.exe")
> pyIsConnected()
> # autodetectPython("C:/Python27/python.exe")
>
> pyExecfile("C:/Landsat/feedparser.py")
> pyExecfile("C:/Landsat/download_espa_order.py" -e "magifranqu...@gmail.com"
> -o "magifranqu...@gmail.com-07222015-120911" -d "C:/Landsat/ESPA")
>
> and I get this error:
>
> Error: unexpected string constant in
> "pyExecfile("C:/Landsat/download_espa_order.py" -e
> "magifranqu...@gmail.com""
>
> The code "C:/Landsat/download_espa_order.py" -e
> "magifranqu...@gmail.com" -o "magifranqu...@gmail.com-07222015-120911"
> -d "C:/Landsat/ESPA" runs ok when I use it within
> system console.
>
> I would appreciate it if someone could help me solve this problem.
>
> Thank you
>



Re: [R] VIF threshold implying multicollinearity

2015-07-27 Thread Collin Lynch
No, actually it is quite a good paper! :)

On Mon, Jul 27, 2015 at 8:14 AM, John Kane  wrote:

> +1
> I, originally,  read it as a stringent criticism of the first paper.
>
> John Kane
> Kingston ON Canada
>
>
> > -Original Message-
> > From: r.tur...@auckland.ac.nz
> > Sent: Mon, 27 Jul 2015 15:12:43 +1200
> > To: cfly...@ncsu.edu
> > Subject: Re: [R] VIF threshold implying multicollinearity
> >
> >
> > On 27/07/15 13:36, Collin Lynch wrote:
> >
> >> The following sources discuss the issues generally and may be a goof
> >> pointer to the literature ...
> >
> > 
> >
> > I think that the foregoing merits fortune status! :-)
> >
> > cheers,
> >
> > Rolf
> >
> > --
> > Technical Editor ANZJS
> > Department of Statistics
> > University of Auckland
> > Phone: +64-9-373-7599 ext. 88276
> >
>
>
>
>



Re: [R] VIF threshold implying multicollinearity

2015-07-26 Thread Collin Lynch
The following sources discuss the issues generally and may be a goof
pointer to the literature on VIF.  Particularly the Schroeder paper.
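
For context on where thresholds such as 4 or 10 come from: the VIF for predictor j is defined from the R^2 obtained by regressing that predictor on the remaining ones, so each rule of thumb corresponds to a fixed value of that R^2.  This is the standard textbook definition rather than something quoted from the references below:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad
\mathrm{VIF}_j = 4 \iff R_j^2 = 0.75, \qquad
\mathrm{VIF}_j = 10 \iff R_j^2 = 0.90
```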

@article{Yi:Evaluation,
   AUTHOR = {Youjae Yi},
   TITLE  = {On the Evaluation of Main Effects in Multiplicative
 Regression Models.},
   JOURNAL = {Journal of the Market Research Society},
   VOLUME  = {31},
   NUMBER  = {1},
   MONTH   = {January},
   YEAR= {1989},
   PAGES   = {133-138}
}


@article{Gordon:Issues,
   AUTHOR  = {Robert A. Gordon},
   TITLE   = {Issues in Multiple Regression},
   JOURNAL = {American Journal of Sociology},
   VOLUME  = {73},
   NUMBER  = {5},
   MONTH   = {March},
   YEAR= {1968},
   PAGES   = {592-616}
}


@misc{Lynch:Multicollinearity,
   author = {Scott M. Lynch},
   title  = {Multicollinearity},
   year   = {2003},
   url    = {\url{http://www.princeton.edu/~slynch/soc504/multicollinearity.pdf}},
   note   = "[Online; accessed 11-October-2013]"
 }


@article{Schroeder:Multicollinearity,
   AUTHOR  = {Mary Ann Schroeder
   and Janice Lander
   and Stacey Levine-Silverman},
   TITLE   = {Diagnosing and Dealing with Multicollinearity},
   JOURNAL = {Western Journal of Nursing Research},
   VOLUME  = {12},
   NUMBER  = {2},
   YEAR= {1990},
   PAGES   = {175-187}
}


@book{Afifi:Computer,
  AUTHOR= {A. Afifi and V. Clark},
  TITLE = {Computer-aided Multivariate Analysis},
  PUBLISHER = {Wadsworth, Belmont California},
  YEAR  = {1984}
}


On Sun, Jul 26, 2015 at 5:00 PM, Wensui Liu  wrote:

> Dear All
> I have a general question about VIF.
> While there are multiple rules of thumb about the threshold value of
> VIF, e.g. 4 or 10, implying multicollinearity, I am wondering if
> anyone can point me to some literature supporting these rules of
> thumb.
>
> Thank you so much!
> wensui
>
>



Re: [R] Speeding up code?

2015-07-15 Thread Collin Lynch
Hi Ignacio, if I am reading your code correctly, the top while loop
selects a random set of names from the original set, uses unique() to
reduce it, and then iterates until you have filled your quota.
Ultimately this is a very inefficient way of sampling without
replacement.  Why not just sample without replacement directly rather
than looping and calling unique()?  Or, if the set of possible names is
small enough, why not just shuffle it and then pull the first n items
off?
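
Both ideas can be sketched in a few lines of Python (R's sample() with replace = FALSE behaves the same way as the first function).  This is only an illustration, assuming the candidate names are already unique; the pool and the function names below are made up:

```python
import random


def draw_unique(pool, n, seed=None):
    """Draw n distinct items from pool in one call (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(pool, n)


def shuffle_and_take(pool, n, seed=None):
    """Shuffle a copy of pool and keep the first n items."""
    rng = random.Random(seed)
    shuffled = list(pool)
    rng.shuffle(shuffled)
    return shuffled[:n]


names = ["Name%03d" % i for i in range(100)]  # hypothetical pool of unique names
picked = draw_unique(names, 10, seed=1)
print(len(picked), len(set(picked)))  # both counts agree: no duplicates
```

Either way there is no need for a retry loop, because no duplicates can be produced in the first place.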

Best,
Collin.

On Wed, Jul 15, 2015 at 11:15 PM, Ignacio Martinez 
wrote:

> Hi R-Help!
>
> I'm hoping that some of you may give me some tips that could make my code
> more efficient. More precisely, I would like to make the answer to my
> Stack Overflow question
> <http://stackoverflow.com/questions/31137940/randomly-assign-teachers-to-classrooms-imposing-restrictions>
> more efficient.
>
> This is the code:
>
> library(dplyr)
> library(randomNames)
> library(geosphere)
> set.seed(7142015)
> # Define Parameters
> n.Schools <- 20
> first.grade<-3
> last.grade<-5
> n.Grades <-last.grade-first.grade+1
> n.Classrooms <- 20 # THIS IS WHAT I WANTED TO BE ABLE TO CHANGE
> n.Teachers <- (n.Schools*n.Grades*n.Classrooms)/2 # Two classrooms per teacher
> # Define Random names function:
> gen.names <- function(n, which.names = "both", name.order = "last.first"){
>   names <- unique(randomNames(n=n, which.names = which.names,
>                               name.order = name.order))
>   need <- n - length(names)
>   while(need>0){
>     names <- unique(c(randomNames(n=need, which.names = which.names,
>                                   name.order = name.order), names))
>     need <- n - length(names)
>   }
>   return(names)
> }
> # Generate n.Schools names
> gen.schools <- function(n.schools) {
>   School.ID <-
> paste0(gen.names(n = n.schools, which.names = "last"), ' School')
>   School.long <- rnorm(n = n.schools, mean = 21.7672, sd = 0.025)
>   School.lat <- rnorm(n = n.schools, mean = 58.8471, sd = 0.025)
>   School.RE <- rnorm(n = n.schools, mean = 0, sd = 1)
>   Schools <-
> data.frame(School.ID, School.lat, School.long, School.RE) %>%
> mutate(School.ID = as.character(School.ID)) %>%
> rowwise() %>%  mutate (School.distance = distHaversine(
>   p1 = c(School.long, School.lat),
>   p2 = c(21.7672, 58.8471), r = 3961
> ))
>   return(Schools)}
>
> Schools <- gen.schools(n.schools = n.Schools)
> # Generate Grades
> Grades <- c(first.grade:last.grade)
> # Generate n.Classrooms
>
> Classrooms <- LETTERS[1:n.Classrooms]
> # Group schools and grades
>
> SchGr <- outer(paste0(Schools$School.ID, '-'), paste0(Grades, '-'),
>                FUN="paste")
> # head(SchGr)
> # Group SchGr and Classrooms
>
> SchGrClss <- outer(SchGr, paste0(Classrooms, '-'),
>                    FUN="paste")
> # head(SchGrClss)
> # These are the combination of  School-Grades-Classroom
> SchGrClssTmp <- as.matrix(SchGrClss, ncol=1, nrow=length(SchGrClss) )
> SchGrClssEnd <- as.data.frame(SchGrClssTmp)
> # Assign n.Teachers (2 classroom in a given school-grade)
> Allpairs <- as.data.frame(t(combn(SchGrClssTmp, 2)))
> AllpairsTmp <- paste(Allpairs$V1, Allpairs$V2, sep=" ")
>
> library(stringr)
> separoPairs <- as.data.frame(str_split(string = AllpairsTmp, pattern =
> "-"))
> separoPairs <- as.data.frame(t(separoPairs))
> row.names(separoPairs) <- NULL
> separoPairs <- separoPairs %>% select(-V7)  %>%  #Drops empty column
>   mutate(V1=as.character(V1), V4=as.character(V4), V2=as.numeric(V2),
> V5=as.numeric(V5)) %>% mutate(V4 = trimws(V4, which = "both"))
>
> separoPairs[120,]$V4
> # Only the rows with V1=V4 and V2=V5 are valid
> validPairs <- separoPairs %>% filter(V1==V4 & V2==V5) %>% select(V1, V2,
> V3, V6)
> # Generate n.Teachers
>
> gen.teachers <- function(n.teachers){
>   Teacher.ID <- gen.names(n = n.teachers, name.order = "last.first")
>   Teacher.exp <- runif(n = n.teachers, min = 1, max = 30)
>   Teacher.Other <- sample(c(0,1), replace = T, prob = c(0.5, 0.5),
> size = n.teachers)
>   Teacher.RE <- rnorm(n = n.teachers, mean = 0, sd = 1)
>   Teachers <- data.frame(Teacher.ID, Teacher.exp, Teacher.Other,
> Teacher.RE)
>   return(Teachers)}
> Teachers <- gen.teachers(n.teachers = n.Teachers) %>%
>   mutate(Teacher.ID = as.character(Teacher.ID))
> # Randomly assign n.Teachers teachers to the "ValidPairs"
> TmpAssignments <- validPairs[sample(1:nrow(validPairs), n.Teachers), ]
> Assignments <- cbind.data.frame(Teachers$Teacher.ID, TmpAssignments)
> names(Assignments) <- c("Teacher.ID", "School.ID", "Grade", "Class_1",
> "Class_2")
> # Tidy Data
> library(tidyr)
> TeacherClassroom <- Assignments %>%
>   gather(x, Classroom, Class_1,Class_2) %>%
>   select(-x) %>%
>   mutate(Teacher.ID = as.character(Teacher.ID))
> # Merge
> DF_Classrooms <- TeacherClassroom %>% full_join(Teachers,
> by="Teacher.ID") %>% full_join(Schools, by="School.ID")
> rm(list=setdiff(ls(), "DF_Classrooms")) # Clean the work space!
>
> *I want to end up with the same*  'DF_Classrooms *data frame* but getting
> there in a more 

Re: [R] Genetic algorithm workflow Problem..!! Is it right or wrong ??

2015-05-22 Thread Collin Lynch
Rashmi, I think that this might be beyond the scope of this list as it
is focused on issues with the R language specifically.  It does not
look like you have any R errors although we would need to see some
output to be sure.

With respect to the general GA workflow it appears that you are doing
it right although the exact function of your fitness operator is not
clear to me.  I recommend looking at "An Introduction to Genetic
Algorithms" by Melanie Mitchell.  That is a good resource for more
general GA advice.

Sincerely,
    Collin Lynch.



On Fri, May 22, 2015 at 3:09 AM, Rashmi Naik k  wrote:
> I'm actually looking for a way to use a genetic algorithm to optimize
> product prices in an e-commerce store, and I'm doing it in R.
>
> I'm using the GA package in R and below is my data set.
>
> library(GA)
> dataset---input1
>
>      Production.Cost Product.Price Product.Quality Delivery.Time
> [1,]        871.1901        879.99         1139.99        895.13
> [2,]        296.9901        299.99          329.95        329.73
>      After.Sales.Service Sellers.Reputation Selling.Price
> [1,]             1029.98           986.2725          1066
> [2,]              334.99           323.6650           321
>
>
> -
> and this is my fitness function
>
> f <-function(x) x * runif(1)
>  fitness <-function(x) f(x)
> --
> and i have generated a random population
>
> pop<-gareal_Population(rp,input1)
> -
> i have used roulette wheel selection
>
> sel<-gareal_rwSelection(rp,pop$population)
> ---
>
> I have Used Single point Crossover
>
> cross<-gareal_spCrossover(rp,sel$population)
> --
>
> I don't know whether I'm doing it right or not, and I don't know how to
> optimize a product's price using a genetic algorithm.
> Can anybody advise me on this and point me in the right direction?
>
> Any suggestions are welcome. Please do help me with this.
>



Re: [R] Need online version of R help pages

2015-04-17 Thread Collin Lynch
Hi Paul, a quick search turned up these:

http://astrostatistics.psu.edu/datasets/R/html/index.html
http://finzi.psych.upenn.edu/
http://r.789695.n4.nabble.com/Online-R-documentation-td1009656.html

Are they what you are looking for?

Collin.

On Thu, Apr 16, 2015 at 5:02 PM, paul  wrote:
> The help for the cygwin port of R is buggy and hides random lines of
> text.  Consequently, I've been relying on Google, but it is often not
> clear how directly relevant the info is for the specific command that
> I'm using.  For example, reshape is complicated, and has more than 1
> version.
>
> Is there an online version of the help pages?
>
> I tried looking for html versions of the help pages by ferreting
> through the R.home() subtree.  Haven't found them so far.  There are
> package pages in subdirectories /html/00Index.html, but they
> just contain links to html files that don't reside in my R.home()
> subtree.  There are also subdirectories /help, but they
> contain pages that I don't recognize (*.rds, *.rdb, *.rdx).
>
> Getting desperate here, and realizing how the web is not in any way a
> substitute for locally available help pages that you can be confident
> are right for your installation.
>



Re: [R] Kruskal-Wallace power calculations.

2015-04-03 Thread Collin Lynch
Thank you very much Greg, I will give that a try.

Best,
Collin.

On Fri, Apr 3, 2015 at 1:43 PM, Greg Snow <538...@gmail.com> wrote:
> Here is some sample code:
>
> ## Simulation function to create data, analyze it using
> ## kruskal.test, and return the p-value
> ## change rexp to change the simulation distribution
>
> simfun <- function(means, k=length(means), n=rep(50,k)) {
>   mydata <- lapply( seq_len(k), function(i) {
>     rexp(n[i], 1) - 1 + means[i]
>   })
>   kruskal.test(mydata)$p.value
> }
>
> # simulate under the null to check proper sizing
> B <- 1
> out1 <- replicate(B, simfun(rep(3,4)))
> hist(out1)
> mean( out1 <= 0.05 )
> binom.test( sum(out1 <= 0.05), B, p=0.05)
>
> ### Now simulate for power
>
> B <- 1
> out2 <- replicate(B, simfun( c(3,3,3.2,3.3)))
> hist(out2)
> mean( out2 <= 0.05 )
> binom.test( sum(out2 <= 0.05), B, p=0.05 )
>
> This simulates from a continuous exponential (skewed) and shifts to
> get the means (shifted location is a common assumption, though not
> required for the actual test).
>
> On Thu, Apr 2, 2015 at 8:19 PM, Collin Lynch  wrote:
>> Thank you Jim, I did see those (though not my typo :) and am still
>> pondering the warning about post-hoc analyses.
>>
>> The situation that I am in is that I have a set of individuals who
>> have been assigned a course grade.  We have then clustered these
>> individuals into about 50 communities using standard community
>> detection algorithms with the goal of determining whether community
>> membership affects one of their grades.  We are using the KW test as
>> the grade data is strongly non-normal and my coauthors preferred KW as
>> an alternative.
>>
>> The two issues that I am struggling with are: 1) whether the post-hoc
>> power analysis would be useful; and 2) how to code the simulation
>> studies that are described in:
>> http://onlinelibrary.wiley.com/doi/10.1002/bimj.4710380510/abstract
>>
>>
>> Problem #1 is of course beyond the scope of this e-mail list though I
>> would welcome anyone's suggestions on that point.  I am not sure that
>> I buy the arguments against it offered here:
>>
>> http://graphpad.com/support/faq/why-it-is-not-helpful-to-compute-the-power-of-an-experiment-to-detect-the-difference-actually-observed-why-is-post-hoc-power-analysis-futile/
>>
>> It seems that the rationale boils down to "you didn't find it so you
>> couldn't find it" but that does not tell me how far off I was from the
>> goal.  I am still perusing the articles the author cites however.
>>
>>
>> With respect to question #2 I am trying to lay my hands on the article
>> and did find this old r-help discussion:
>> http://r.789695.n4.nabble.com/Power-of-Kruskal-Wallis-Test-td4671188.html
>> however I am not sure how to adapt the simulation studies that it
>> links to to my current problem.  The links it leads to focus on
>> mixed-effects models.  This may be more of a pure stats question and
>> not suited for this list but I thought I'd ask in the hopes that
>> anyone had any more specific KW code or knew of a good tutorial for
>> the right kinds of simulation studies.
>>
>> Thank you,
>> Collin.
>>
>>
>>
>>
>> On Thu, Apr 2, 2015 at 6:35 PM, Jim Lemon  wrote:
>>> Hi Collin,
>>> Have a look at this:
>>>
>>> http://stats.stackexchange.com/questions/70643/power-analysis-for-kruskal-wallis-or-mann-whitney-u-test-using-r
>>>
>>> Although, thinking about it, this might have constituted your "perusal of
>>> the literature".
>>>
>>> Plus it always looks better when you spell the names properly
>>>
>>> Jim
>>>
>>>
>>> On Fri, Apr 3, 2015 at 2:23 AM, Jeff Newmiller 
>>> wrote:
>>>>
>>>> Please stop... you are acting like a broken record, and are also posting
>>>> in HTML format. Please read the Posting Guide and demonstrate that you have
>>>> used a search engine on this topic before posting again.
>>>>
>>>> ---
>>>> Jeff NewmillerThe .   .  Go
>>>> Live...
>>>> DCN:Basics: ##.#.   ##.#.  Live
>>>> Go...
>>>>   Live:   OO#.. Dead: OO#..  Playing
>>>> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
>>>> /Software/Emb

Re: [R] Kruskal-Wallace power calculations.

2015-04-02 Thread Collin Lynch
Thank you Jim, I did see those (though not my typo :) and am still
pondering the warning about post-hoc analyses.

The situation that I am in is that I have a set of individuals who
have been assigned a course grade.  We have then clustered these
individuals into about 50 communities using standard community
detection algorithms with the goal of determining whether community
membership affects one of their grades.  We are using the KW test as
the grade data is strongly non-normal and my coauthors preferred KW as
an alternative.

The two issues that I am struggling with are: 1) whether the post-hoc
power analysis would be useful; and 2) how to code the simulation
studies that are described in:
http://onlinelibrary.wiley.com/doi/10.1002/bimj.4710380510/abstract


Problem #1 is of course beyond the scope of this e-mail list though I
would welcome anyone's suggestions on that point.  I am not sure that
I buy the arguments against it offered here:

http://graphpad.com/support/faq/why-it-is-not-helpful-to-compute-the-power-of-an-experiment-to-detect-the-difference-actually-observed-why-is-post-hoc-power-analysis-futile/

It seems that the rationale boils down to "you didn't find it so you
couldn't find it" but that does not tell me how far off I was from the
goal.  I am still perusing the articles the author cites however.


With respect to question #2 I am trying to lay my hands on the article
and did find this old r-help discussion:
http://r.789695.n4.nabble.com/Power-of-Kruskal-Wallis-Test-td4671188.html
however I am not sure how to adapt the simulation studies that it
links to to my current problem.  The links it leads to focus on
mixed-effects models.  This may be more of a pure stats question and
not suited for this list but I thought I'd ask in the hopes that
anyone had any more specific KW code or knew of a good tutorial for
the right kinds of simulation studies.

Thank you,
Collin.




On Thu, Apr 2, 2015 at 6:35 PM, Jim Lemon  wrote:
> Hi Collin,
> Have a look at this:
>
> http://stats.stackexchange.com/questions/70643/power-analysis-for-kruskal-wallis-or-mann-whitney-u-test-using-r
>
> Although, thinking about it, this might have constituted your "perusal of
> the literature".
>
> Plus it always looks better when you spell the names properly
>
> Jim
>
>
> On Fri, Apr 3, 2015 at 2:23 AM, Jeff Newmiller 
> wrote:
>>
>> Please stop... you are acting like a broken record, and are also posting
>> in HTML format. Please read the Posting Guide and demonstrate that you have
>> used a search engine on this topic before posting again.
>>
>> ---
>> Jeff Newmiller
>> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
>>
>> ---
>> Sent from my phone. Please excuse my brevity.
>>
>> On April 2, 2015 7:25:20 AM PDT, Collin Lynch  wrote:
>> >Greetings, I am working on a project where we are applying the
>> >Kruskal-Wallace test to some factor data to evaluate their correlation
>> >with
>> >existing grade data.  I know that the grade data is nonnormal therefore
>> >we
>> >cannot rely on ANOVA or a similar parametric test.  What I would like
>> >to
>> >find is a mechanism for making power calculations for the KW test given
>> >the
>> >nonparametric assumptions.  My perusal of the literature has suggested
>> >that
>> >a simulation would be the best method.
>> >
>> >Can anyone point me to good examples of such simulations for KW in R?
>> >And
>> >does anyone have a favourite package for generating simulated data or
>> >conducting such tests?
>> >
>> >Thank you,
>> >Collin.
>> >
>>
>
>



[R] Kruskal-Wallace power calculations.

2015-04-02 Thread Collin Lynch
Greetings, I am working on a project where we are applying the
Kruskal-Wallace test to some factor data to evaluate their correlation with
existing grade data.  I know that the grade data is nonnormal therefore we
cannot rely on ANOVA or a similar parametric test.  What I would like to
find is a mechanism for making power calculations for the KW test given the
nonparametric assumptions.  My perusal of the literature has suggested that
a simulation would be the best method.

Can anyone point me to good examples of such simulations for KW in R?  And
does anyone have a favourite package for generating simulated data or
conducting such tests?

Thank you,
Collin.



[R] Kruskal-Wallace power tests.

2015-04-01 Thread Collin Lynch
Greetings, I am working on a project where we are applying the
Kruskal-Wallace test to some factor data to evaluate their correlation with
existing grade data.  I know that the grade data is nonnormal therefore we
cannot rely on ANOVA or a similar parametric test.  What I would like to
find is a mechanism for making power calculations for the KW test given the
nonparametric assumptions.  My perusal of the literature has suggested that
a simulation would be the best method.

Can anyone point me to good examples of such simulations for KW in R?  And
does anyone have a favourite package for generating simulated data or
conducting such tests?

Thank you,
Collin.



Re: [R] R and Python

2015-03-01 Thread Collin Lynch
I recommend rpy2.  http://rpy.sourceforge.net/rpy2.html

It provides direct access to a running R instance with full support for R
functions including package loading.  It has some minor issues with
graphics drivers making it best for programmatic and not interactive use
but it is excellent for munging data in python and then passing it off to R
for calculations.

Collin.

On Sun, Mar 1, 2015 at 10:17 AM, Sarah Goslee 
wrote:

> You mean like rPython? Or rpy? Or rpy2?
>
> Googling R Python is a great place to start.
>
> Sarah
>
> On Sun, Mar 1, 2015 at 9:41 AM, linda.s  wrote:
> > Is there any good example codes of integrating R and Python?
> > Thanks.
> > Linda
> >
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
>



Re: [R] need help with excel data

2015-01-21 Thread Collin Lynch
It is good to know R is up to the task, and I have to agree with Ista and
Jeff that if you are more comfortable in R you should use it.  By way of
comparison, the Python code would look something like what is below.  You
would need to tweak the regular expression to fit your needs, but if you
are just learning Python then sticking with R might be a better choice.

   Best,
   Collin.

import csv
import re

# Expected layout: text, a space, then numbers such as "12 3*4, 5*6".
# Adjust the pattern to match your data.
Pattern = re.compile(
    r"(?P<Text>\S+) (?P<Numbers>[0-9]+ [0-9]+\*[0-9]+,? [0-9]+\*[0-9]+)$")

with open("Sheet.csv", "r", newline="") as In, \
     open("Out.csv", "w", newline="") as Out:
    Reader = csv.DictReader(In)
    Writer = csv.DictWriter(Out, ["Val", "Text", "Numbers"])
    Writer.writeheader()

    for D in Reader:
        Match = Pattern.match(D["Col2Name"])
        if Match is None:
            continue  # skip rows that do not fit the pattern
        Writer.writerow({"Val": D["Col1Name"],
                         "Text": Match.group("Text"),
                         "Numbers": Match.group("Numbers")})
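
Before running a pattern over the whole file it can help to try it on a single line.  The following is a minimal sketch of that check; split_cell is a hypothetical helper, and the sample string is invented purely to show the layout the expression expects (text, a space, then numbers of the form N N*N, N*N):

```python
import re

# Same layout the script above expects: text, a space, then "N N*N, N*N".
PATTERN = re.compile(
    r"(?P<Text>\S+) (?P<Numbers>[0-9]+ [0-9]+\*[0-9]+,? [0-9]+\*[0-9]+)$")


def split_cell(cell):
    """Return (text, numbers) for a matching cell, or None if it does not fit."""
    match = PATTERN.match(cell)
    if match is None:
        return None
    return match.group("Text"), match.group("Numbers")


sample = "Товар 12 3*4, 5*6"  # hypothetical cell contents
print(split_cell(sample))
print(split_cell("no numbers here"))
```

Returning None for a non-matching cell makes it easy to count how many rows the pattern misses before deciding whether it needs loosening.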

On Wed, Jan 21, 2015 at 9:58 PM, Ista Zahn  wrote:

> I agree, R will be fine for this. Not being as expert with regex as
> Jeff I would tend to do this in a few steps, something like
>
> library(XLConnect)
> DF <- readWorksheetFromFile( "exampX.xlsx", sheet="examp" )
> library(stringi)
> ## insert a marker between the text and the numbers
> txt <- stri_replace_all_regex(DF[[2]], "([^\\d]{2,})(\\d+ )", "$1|||$2")
> ## separate the text from the numbers
> stringNums <- stri_split_fixed(txt, "|||", 2, simplify = TRUE)
> ## split the numbers apart
> nums <- stri_split_regex(stringNums[, 2], "[^\\d]+", n = 5, simplify=TRUE)
> ## put it all back together
> extracted <- data.frame(DF[, 1], stringNums[, 1], apply(nums, 2,
> as.numeric))
> ## put the names back
> names(extracted) <- c(names(DF)[1], paste(names(DF)[2], 1:6, sep = "_"))
>
> Best,
> Ista
>
> On Wed, Jan 21, 2015 at 8:02 PM, Jeff Newmiller
>  wrote:
> > I think R is quite capable of doing this. You would have to learn a
> > comparable number of fiddly bits to accomplish this in R, Python or Perl.
> >
> > That is not to say that learning Perl or Python is a bad idea... but in
> > terms of "shortest path" I think they are of comparable complexity. All
> > three languages support regular expressions, which would be the key bit of
> > knowledge to acquire regardless of which tool you use.
> >
> > Other fiddly bits might involve handling the cyrillic strings as data,
> > though you did not convey a desire to retain that information.
> >
> > One way (not extracting cyrillic text):
> >
> > library(XLConnect)
> > DF <- readWorksheetFromFile( "exampX.xlsx", sheet="examp" )
> > pattern <- "^.*(\\d+) *\\* *(\\d+)[^\\d]*(\\d+) *\\* *(\\d+).*$"
> > idx <- grep( pattern, DF[[2]] )
> > dta <- sub( pattern, "\\1,\\2,\\3,\\4", DF[[2]][idx])
> > dtamatrix <- apply( do.call( rbind, strsplit( dta, "," ) )
> >                   , 2
> >                   , as.numeric
> >                   )
> > extracted <- data.frame( V1=DF[[1]][idx], dtamatrix )
> >
> >
> > On Wed, 21 Jan 2015, Collin Lynch wrote:
> >
> >> Dr. Polanski, I would recommend something else.  Given the messy nature of
> >> your data I would suggest using a language like Python or Perl to extract
> >> it to an appropriate format.  Python has good regular expression support
> >> and unicode support.  If you can save your data as a csv file or even text
> >> line by line then it would be possible to write some code to read the file,
> >> match the lines with a simple regular expression, and then spit them back
> >> out as a csv file which you could read into R.
> >>
> >> I realize that this means learning a new language or finding someone with
> >> the requisite skills but I would recommend that over attempting to use R's
> >> text processing.
> >>
> >>Collin.
> >>
> >> On Wed, Jan 21, 2015 at 3:31 PM, Dr Polanski wrote:
> >>
> >>> Hi all!
> >>>
> >>> Sorry to bother you, I am trying to learn some R via coursera courses and
> >>> other internet sources yet haven't managed to go far
> >>>
> >>> And now I need to do some, I hope, not too difficult things, which I think
> >>> R can do, yet have no idea how to make it do so
> >>>
> >>> I have a big set of data (empirical) 

Re: [R] need help with excel data

2015-01-21 Thread Collin Lynch
Dr. Polanski, I would recommend something else.  Given the messy nature of
your data I would suggest using a language like Python or Perl to extract
it to an appropriate format.  Python has good regular expression support
and unicode support.  If you can save your data as a csv file or even text
line by line then it would be possible to write some code to read the file,
match the lines with a simple regular expression, and then spit them back
out as a csv file which you could read into R.

I realize that this means learning a new language or finding someone with
the requisite skills but I would recommend that over attempting to use R's
text processing.

Collin.

On Wed, Jan 21, 2015 at 3:31 PM, Dr Polanski  wrote:

> Hi all!
>
> Sorry to bother you, I am trying to learn some R via coursera courses and
> other internet sources yet haven’t managed to go far
>
> And now I need to do some, I hope, not too difficult things, which I think
> R can do, yet have no idea how to make it do so
>
> I have a big set of data (empirical) which was obtained by my colleagues
> and store at not convenient  way - all of the data in two cells of an excel
> table
> an example of the data is in the attached file (the link)
>
>
> https://drive.google.com/file/d/0B64YMbf_hh5BS2tzVE9WVmV3bFU/view?usp=sharing
>
> so the first column has a number and the second has a whole vector (I
> guess it is) which looks like
> «some words in Cyrillic (the length varies)» and then the set of numbers
> «12*23 34*45» (another problem is that sometimes it is «12*23, 34*56»)
>
> And the number of rows is about 3000 so it is impossible to do manually
>
> what I need to have at the end is to have it separately in different excel
> cells
> - what is written in words - |  12  | 23 | 34 | 45 |
>
> Do you think it is possible to do so using R (or something else?)
>
> Thank you very much in advance and sorry for asking for help and so stupid
> question, the problem is - I am trying and yet haven’t even managed to
> install openSUSE onto my laptop - only Ubuntu! :)
>
>
> Thank you very much!

Re: [R] Huge Dataset Dates Span two Lines

2015-01-08 Thread Collin Lynch
You might consider using something other than R to clean the file and even
to load it.  I regularly use python to preprocess data for R and often feed
it to R directly via the rpy2 interface.  If the dates are delimited by
some feature (e.g. ") you could potentially use the python csv library to
load it directly and then either send that or dump it in a clean form.
Alternatively you could use a simple script to iterate over the file and to
remove newlines in that case without doing any other processing.  A regular
expression of the form:

"([0-9]{4,4}-[0-9]{2,2}-[0-9]{2,2})( )+(\n?)([0-9]{2,2}:[0-9]{2,2}:[0-9]{2,2})"

should match the date/time strings even with an embedded newline.  You
could use this to detect those cases in the file and then to replace the
newline with a whitespace character.
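
As a sketch of that cleanup step in Python: the pattern below is a slightly
more tolerant variant of the one above (it also accepts a line break with no
space before it), and the function name rejoin_timestamps is hypothetical.
It assumes the break always falls exactly between the date and the time.

```python
import re

# A date at the end of one line followed by a time at the start of the next.
# Optional spaces or tabs (and a carriage return) around the break are allowed.
SPLIT_STAMP = re.compile(
    r"(\d{4}-\d{2}-\d{2})[ \t]*\r?\n[ \t]*(\d{2}:\d{2}:\d{2})"
)


def rejoin_timestamps(text):
    """Replace the line break inside a split date/time with a single space."""
    return SPLIT_STAMP.sub(r"\1 \2", text)
```

For a file of many gigabytes the text would need to be processed in chunks
(taking care that a chunk does not end right after a date) rather than read
into memory in one piece.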

On Thu, Jan 8, 2015 at 1:20 PM, DVL  wrote:

> I'm trying to import a many gigabyte .txt file to analyze. It is asterisk
> delimited. I'm having an issue with the date field in the dataset. In the
> first 165 lines dates are listed as :
> YYYY-MM-DD HH:MM:SS
>
> Then on the 166th line and in other places the date spans two lines:
> YYYY-MM-DD
> HH:MM:SS
>
> This causes a problem because R thinks it has reached the end of a row in
> the table. How can I solve this?
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Huge-Dataset-Dates-Span-two-Lines-tp4701523.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Parsing Google Finance page data?

2014-11-20 Thread Collin Lynch
If you do not need a pure R solution, you might also find it helpful to
blend languages.  For scraping and munging tasks such as this I generally
turn to python to do extraction then feed data to R for analysis via rpy.

On Thu, Nov 20, 2014 at 8:57 PM, Spencer Graves <
spencer.gra...@structuremonitoring.com> wrote:

>   The Ecfun package includes functions written to scrape data from web
> pages.  See, e.g., readUShouse, readUSsenate, readUSstateAbbreviations.
> They use getURL{RCurl} and readHTMLTable{XML}.
>
>
>   Hope this helps.
>
>
>   Spencer Graves
>
>
>
> On 11/20/2014 5:42 PM, Matt Considine wrote:
>
>> Hi,
>> I'm wondering if anyone can point me to code to parse data on Google
>> Finance pages, i.e. parse the results of a URL request such as this
>>   http://www.google.com/finance?q=apple
>>
>> I know how to return the contents of the page; it's figuring out the best
>> tools to parse it that I'm interested in and hopefully someone has already
>> done this.
>>
>> (For what it is worth, the only info I am looking for are the ticker,
>> exchange, currency and "Mkt Cap" datapoint)
>>
>> Thanks in advance for any help - scraping is not my strong suit.
>> Matt
>>
>>


Re: [R] GAM using penalized regression splines with 4 degress o.f.

2014-02-19 Thread Collin Lynch
Hi Katharina, what gam package are you using?  With mgcv you can inspect
the results of the output variables to check whether the fixed field is
true which would indicate whether the df is fixed or floating.  I'm not
sure if this is applicable to what you want.

My rough translation of your German error message, however, suggests that
the problem may stem from the lagBYmean variable.  It looks like it
is complaining that the number of items to replace is not a multiple of
the replacement length (Ersetzungslänge; 4?).  It may be that you don't
have a sufficient distribution for the df that you want.

Hope that helps,
Collin.

On Wed, 19 Feb 2014, Katharina Mersmann wrote:

> Dear R-Users,
>
> I am fairly new to R and got in trouble by understanding how to run a GAM
> using penalized regression splines with 4 degress of freedom (even by
> reading the R Documentation).
>
>
>
> I tried the following:
>
> >  gamreg1.2<-gam(num_FCRlong ~ s(GDP,df=4)+s(cupol_GDPpCapita,df=4)
>
> +  +s(fort_budget,df=4)+s(fort_percentDebt,
> df=4)
>
> +  +s(linpol_primSurplus,df=4)+s(Inflation,df=4)
>
> +  +s(lagBYmean,df=4),
>
> +  data = data.plm)
>
> Fehler in o[, i] <- junk$o :
>
>   Anzahl der zu ersetzenden Elemente ist kein Vielfaches der Ersetzungslänge
>
>
>
> If I remove
>
>  +s(lagBYmean,df=4),
>
> It works.
>
>
>
> So I do not really understand why this is the case and what I am actually
> doing wrong.
>
> Further I am even not sure, if this is the right code at all.
>
> Will I receive run a GAM using penalized regression splines with 4 degress
> of freedom by this code?
>
>
>
> Thanks for your help and have a nice day!
> Katie


Re: [R] rJava works on R-32bit but fails in R 64bit

2014-01-15 Thread Collin Lynch
I'll echo this and expand.  Hui, it is possible, indeed likely, that you
are running a 32-bit version of Java.  In that case the error may be
attributed to an architecture mismatch between R and Java.

You can check your Java version by running "java -version" at the command
prompt.  If it is 64-bit it will tell you.  If it does not say 64-Bit then
you have the 32-bit version and will need to install the 64-bit one.
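
The same check can be scripted.  This is a minimal sketch that only looks for
the "64-Bit" marker in the version banner, as described above; the function
names are made up for illustration, and java_version_text assumes that a
"java" executable is on the PATH.

```python
import subprocess


def is_64bit_java(version_text):
    """True when the `java -version` banner advertises a 64-bit VM."""
    return "64-Bit" in version_text


def java_version_text():
    """Run `java -version`; the banner is normally written to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr + result.stdout
```

Calling is_64bit_java(java_version_text()) from the same prompt that starts
64-bit R shows whether the Java found first on the PATH matches it.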

Best,
Collin.

On Tue, 14 Jan 2014, Jeff Newmiller wrote:

> Post plain text per the posting guide?
> Install the 64bit version of the Java Runtime?
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
>
> Hui Du  wrote:
> >
> >Hi All,
> >
> >I have R 64bit and R-32 bit installed in my windows 7.
> >
> >For 64 bit, the version info is
> >
> >> R.Version()
> >$platform
> >[1] "x86_64-w64-mingw32"
> >
> >$arch
> >[1] "x86_64"
> >
> >$os
> >[1] "mingw32"
> >
> >$system
> >[1] "x86_64, mingw32"
> >
> >$status
> >[1] ""
> >
> >$major
> >[1] "3"
> >
> >$minor
> >[1] "0.2"
> >
> >$year
> >[1] "2013"
> >
> >$month
> >[1] "09"
> >
> >$day
> >[1] "25"
> >
> >$`svn rev`
> >[1] "63987"
> >
> >$language
> >[1] "R"
> >
> >$version.string
> >[1] "R version 3.0.2 (2013-09-25)"
> >
> >$nickname
> >[1] "Frisbee Sailing"
> >
> >While trying to load 'rJava', I got the following error
> >
> >> library('rJava')
> >Error : .onLoad failed in loadNamespace() for 'rJava', details:
> >  call: fun(libname, pkgname)
> >error: No CurrentVersion entry in Software/JavaSoft registry! Try
> >re-installing Java and make sure R and Java have matching
> >architectures.
> >Error: package or namespace load failed for 'rJava'
> >
> >However it work in my R-32 bit
> >> R.Version()
> >$platform
> >[1] "i386-w64-mingw32"
> >
> >$arch
> >[1] "i386"
> >
> >$os
> >[1] "mingw32"
> >
> >$system
> >[1] "i386, mingw32"
> >
> >$status
> >[1] ""
> >
> >$major
> >[1] "3"
> >
> >$minor
> >[1] "0.2"
> >
> >$year
> >[1] "2013"
> >
> >$month
> >[1] "09"
> >
> >$day
> >[1] "25"
> >
> >$`svn rev`
> >[1] "63987"
> >
> >$language
> >[1] "R"
> >
> >$version.string
> >[1] "R version 3.0.2 (2013-09-25)"
> >
> >$nickname
> >[1] "Frisbee Sailing"
> >
> >> library('rJava')
> >>
> >
> >
> >Does somebody know how to fix the problem in R-64 bit? Many thanks.
> >
> >HXD
> >


[R] Power calculations for Wilcox.test

2013-12-16 Thread Collin Lynch
Greetings, I'm working on some analyses where I need to calculate wilcox
tests for paired samples.  In my current literature search I've found a
few papers on sample size determination for the wilcox test notably:

Sample Size Determination for Some Common Nonparametric Tests
Gottfried E. Noether
Journal of the American Statistical Association

http://www.jstor.org.pitt.idm.oclc.org/stable/2289477

My question is: are there any implementations of power calculations for
the wilcox test in R based either on Noether's methods for sample size or
another method?
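
For reference, the kind of calculation in question can be sketched in a few
lines.  As I read Noether's paper, the approximation for the one-sample
(paired) signed-rank test is N = (z_alpha + z_beta)^2 / (3 (p' - 1/2)^2),
where p' = P(X + X' > 0) for two independent paired differences X and X'.
The function name below is hypothetical, the test is taken as one-sided, and
the formula should be checked against the paper before being relied upon.

```python
import math
from statistics import NormalDist


def noether_signed_rank_n(p_prime, alpha=0.05, power=0.80):
    """Approximate sample size for a one-sided Wilcoxon signed-rank test.

    p_prime is the assumed value of P(X + X' > 0) under the alternative.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha)   # critical value for the one-sided level
    z_beta = z(power)          # quantile matching the desired power
    return math.ceil((z_alpha + z_beta) ** 2 / (3.0 * (p_prime - 0.5) ** 2))
```

With p' = 0.7, a one-sided level of 0.05 and power 0.80 this gives 52 pairs.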

Thanks,
Collin.



Re: [R] GAM Assumption Tests

2013-12-05 Thread Collin Lynch
Hi Mike, I recently had this issue and didn't find any package that
implemented these tests directly for the gam object.  I found it simplest
just to pull the residuals from it and run tests like shapiro.test
directly.

Best,
Collin.

On Thu, 5 Dec 2013, Mike.lang wrote:

> Dear all,
>
> currently I set up a GAM for my dataset (~32k records). I assume a normal
> distribution, constant variance and no correlation effects.
>
> With gam.check() it is possible to check those assumptions graphically. But
> is there also any option to do quantitative tests like the Wald-Test,
> shapiro-wilk test or VIF?
>
>
> Looking forward to your responses!
>
> Best
> Mike
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/GAM-Assumption-Tests-tp4681670.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] more rpy2 questions...mostly R

2013-11-17 Thread Collin Lynch
Erin, at first glance I would say that this is an R error.  When rpy2
detects an error in R it passes it through as a Python exception (here an
RRuntimeError) like this.  It appears that your R code is not loading the
requisite packages as it cannot find them.  That, I suspect, is what is
causing your coordinates error.

I would first run R and confirm that you can load the requisite libraries
in it before depending upon them in rpy2.  In general I find that rpy2 is
a bad interface for debugging R code and you need to run it in R first to
get the problems out.
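
When the R output has already been captured on the Python side, one quick way
to see which packages R could not find is to scan the text for the "there is
no package called" warnings.  This is only a sketch: missing_packages is a
made-up helper, and it relies on the English wording of the warning.

```python
import re

# R reports a missing package as: there is no package called 'name'
# The quote characters vary with locale and encoding, so accept any one.
NO_PACKAGE = re.compile(
    r"there is no package called\s+\W?([A-Za-z][A-Za-z0-9.]*)"
)


def missing_packages(r_output):
    """Return the package names R could not find, in order, without repeats."""
    seen = []
    for name in NO_PACKAGE.findall(r_output):
        if name not in seen:
            seen.append(name)
    return seen
```

An empty list means that no such warning was found in the text, not that every
package loaded correctly.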

Best,
Collin.

On Sun, 17 Nov 2013, Erin Hodgess wrote:

> Hello again!
>
> I'm using python, rpy2, and R for a project.  It's actually pretty
> interesting.  Anyhow, I pass in an R file to the python program.  However,
> I am getting the following errors, which seem more like R errors(?):
>
> Loading required package: gstat
> Loading required package: automap
> Error in coordinates(x.df) <- ~x + y :
>   could not find function "coordinates<-"
> In addition: Warning messages:
> 1: In library(package, lib.loc = lib.loc, character.only = TRUE,
> logical.return = TRUE,  :
>   there is no package called ‘gstat’
> 2: In library(package, lib.loc = lib.loc, character.only = TRUE,
> logical.return = TRUE,  :
>   there is no package called ‘automap’
> Traceback (most recent call last):
>   File "reg4.py", line 38, in <module>
>     spat1.spat1(file1=file1,file2=file2)
>   File "/usr/local/lib/python2.7/site-packages/rpy2/robjects/functions.py",
> line 86, in __call__
> return super(SignatureTranslatedFunction, self).__call__(*args,
> **kwargs)
>   File "/usr/local/lib/python2.7/site-packages/rpy2/robjects/functions.py",
> line 35, in __call__
> res = super(Function, self).__call__(*new_args, **new_kwargs)
> rpy2.rinterface.RRuntimeError: Error in coordinates(x.df) <- ~x + y :
>   could not find function "coordinates<-"
>
> I've checked and gstat and automap are both there.  Here is the actual R
> code:
> spat1 <- function(file1,file2) {
> require(gstat)
> require(automap)
> x.df <- read.table(file=file1,header=TRUE)
> coordinates(x.df) <- ~x+y
> zz <- scan(file=file2,what="character")
> proj4string(x.df) <- zz
> png(file="map1.png")
> u <- autoKrige(x.df@data[,1]~1,x.df)
> plot(u)
> return(u)
> }
>
>
> I know that there are a few people who have used both rpy2, python and R,
> so I thought I'd launch this out.
>
> Thanks in advance for any help.
>
> Sincerely,
> Erin
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>


Re: [R] Manually setting coefficients in an lm.

2013-11-12 Thread Collin Lynch
> You can use the "offset" function as part of a formula in "lm" (and
> other model fitting functions) to set a specific slope or set of
> slopes.  Using this up front will give you the correct residuals,
> standard errors, etc.  This is better than trying to modify a fitted
> regression object.

Great, I'll try that out.  Thanks.
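
To illustrate the idea behind the offset: fixing a slope amounts to moving
that term to the left-hand side and fitting the remaining terms by least
squares.  The sketch below does this by hand for y = a + b*x + c*z with b
fixed, which corresponds to something like lm(y ~ z + offset(b * x)) in R.
The function and variable names are made up for illustration.

```python
from statistics import mean


def fit_with_fixed_slope(x, z, y, fixed_b):
    """Least squares for y = a + fixed_b*x + c*z, estimating only a and c."""
    # Subtract the fixed term from the response, as an offset does.
    adjusted = [yi - fixed_b * xi for xi, yi in zip(x, y)]
    z_bar = mean(z)
    adj_bar = mean(adjusted)
    sxy = sum((zi - z_bar) * (ai - adj_bar) for zi, ai in zip(z, adjusted))
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    c = sxy / sxx              # slope of the free predictor
    a = adj_bar - c * z_bar    # intercept
    return a, c
```

Because the residuals come from the constrained fit itself, the standard
errors of the remaining coefficients are meaningful, which is not the case
when coefficients are overwritten in an existing fitted object.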

Collin.

>
> On Tue, Nov 12, 2013 at 9:33 AM,   wrote:
> > Greetings, I'm working on a project where I want to hand-tailor an lm.
> > Specifically I want to construct an lm with an existing formula and then
> > hand tailor the coefficients myself.  Is there an established method for
> > that other than manipulating the $coefficients values?
> >
> >  Thank you,
> > Collin.
> >
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>



Re: [R] Manually setting coefficients in an lm.

2013-11-12 Thread Collin Lynch
> Even if this is possible, won't all the other estimates (i.e., standard error 
> of betas) produced be junk since they aren't derived from the associated 
> estimators?

That was actually my primary concern.  It looks like the offset is the
solution.

Thanks.
Collin.
>
> Michael
>
> >Collin.
> >


Re: [R] How to replace NA's data with some value

2013-11-12 Thread Collin Lynch
Dila, take a look at this:

http://r.789695.n4.nabble.com/How-to-replace-all-lt-NA-gt-values-in-a-data-frame-with-another-not-0-value-td2125458.html
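
The replacement step described in the question can be sketched as follows.
This assumes that each row holds the five station values followed by the
target value, that a missing target is represented by None, and that the
weights are already known; fill_missing_target and the weights in any usage
are hypothetical, and the weighted sum is simply the formula from the
question, not a full implementation of the normal ratio method.

```python
def fill_missing_target(rows, weights):
    """Replace a missing target (None) by the weighted sum of the stations."""
    filled = []
    for *stations, target in rows:
        if target is None:
            # Only missing targets are estimated; observed ones are kept.
            target = sum(w * s for w, s in zip(weights, stations))
        filled.append((*stations, target))
    return filled
```

In R the same logic is an assignment restricted to the rows where
is.na(target) is TRUE.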

Does that help?
Best,
Collin.

On Tue, 12 Nov 2013, dila radi wrote:

> Hi all,
>
> I have a data set with missing value. I would like to estimate those
> missing value by using normal ratio method.
> Below is part of my data:
>
>   AS    BL   Serdang   Jhr    Phg   Target station
>    0    0.0    12.8     0.0   23.7     0.0
>    6    0.0    81.7     0.2    0.0     NA
>    0    1.5    60.9     0.0    0.0    15.5
>    1   13.0    56.8    17.5   32.8     6.4
>    4    3.0    66.4     2.0    0.3     NA
>
> Now I want to replace those NA's,  with the estimation values by using this
> formula:
> weight$v6
> <-(weight1*AS)+(weight2*BL)+(weight3*Serdang)+(weight4*Jhr)+(weight5*Phg);
> Targetstation
>
> but I still could not replace the NA's. My problem is, how do I replace
> those NA's with another value?
>
> Thank you so much for your help and attention.
>
> Regards,
> Dila
>


Re: [R] Merging two dataframes with a condition involving variables of both dataframes

2013-11-07 Thread Collin Lynch
You might need to implement it as a nested pair of for loops using rbind.
In essence, iterate over the rows in df1 and each time look for the matching
row in df2.  If none is found then add the df1 row by itself to the
result.  If one is found then remove it from df2 and add the combined row to
the result with rbind.  Once done, append all rows that remain in df2.
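
The steps above can be sketched in Python to make the control flow explicit.
Rows are plain values (dicts in any usage), matches is whatever condition
links the two tables, and the function name is made up for illustration.
The same structure carries over to R with for loops and rbind.

```python
def merge_with_condition(rows1, rows2, matches):
    """Full outer join of two row lists using an arbitrary match condition."""
    remaining = list(rows2)      # rows of the second table not yet paired
    result = []
    for r1 in rows1:
        partner = next((r2 for r2 in remaining if matches(r1, r2)), None)
        if partner is None:
            result.append((r1, None))      # no match: keep the row by itself
        else:
            remaining.remove(partner)      # each row of rows2 is used once
            result.append((r1, partner))
    # Whatever is left in the second table had no match in the first.
    result.extend((None, r2) for r2 in remaining)
    return result
```

Like the loop described above, this takes time proportional to the product of
the two table sizes.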

This would likely be slower than a sql-based method but is essentially the
same algorithm.

You can find advice on for-loops in R here:
http://paleocave.sciencesortof.com/2013/03/writing-a-for-loop-in-r/

Best,
Collin.



Re: [R] Nonnormal Residuals and GAMs

2013-11-06 Thread Collin Lynch
> The default functional link for mgcv::gam is "log", so I doubt that
> your theoretical understanding applies to GAMs in general. When Simon
> Wood wrote his book on GAMs his first chapter was on linear models, his
> second chapter was on generalized linear models at which point he had
> written over 100 pages, and only then did he "introduce" GAMs. I think
> you need to follow the same progression, and this forum is not the
> correct one for statistics education. Perhaps pose your follow-up
> questions to CrossValidated.com

David, thank you for your advice, has the default changed for mgcv::gam?
Based upon the help pages for the version I have (1.7-27) I had thought
that the default family was gaussian() with link "identity".

In any event I will look again at Simon Wood's book and consider
CrossValidated in the future.

Best,
Collin.



[R] Nonnormal Residuals and GAMs

2013-11-06 Thread Collin Lynch
Greetings, my question is more algorithmic than practical.  What I am
trying to determine is, are the GAM algorithms used in the mgcv package
affected by nonnormally-distributed residuals?

As I understand the theory of linear models, the Gauss-Markov theorem
guarantees that least-squares regression is optimal over all unbiased
estimators if the data meet the conditions of linearity, homoscedasticity,
independence, and normally-distributed residuals.  Absent the last
requirement it is still optimal, but only over unbiased linear estimators.

What I am trying to determine is whether or not it is necessary to check
for normally-distributed errors in a GAM from mgcv.  I know that the
unsmoothed terms, if any, will be fitted by ordinary least-squares but I
am unsure whether the default Penalized Iteratively Reweighted Least
Squares method used in the package is also based upon this assumption or
falls under any analogue to the Gauss-Markov Theorem.

Thank you in advance for any help.

Sincerely,
Collin Lynch.



Re: [R] an rpy2, R cgi type question

2013-10-31 Thread Collin Lynch
Not terribly verbose.  Python has a cgitb module that you can enable (call
cgitb.enable() near the top of the script) to get verbose Python output in
the browser.  Leave it off for deployment as it is a security hole, but it
is useful now.
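
On Python versions that ship it, "import cgitb; cgitb.enable()" is the
one-line way to do this (the module was removed from the standard library in
Python 3.13).  The same effect can be approximated by hand with the traceback
module, as in the sketch below; run_cgi and handler are hypothetical names,
and the sketch assumes that the handler has not yet written any output when
it fails.

```python
import traceback


def run_cgi(handler):
    """Return None on success, or a plain-text error page on failure."""
    try:
        handler()
        return None
    except Exception:
        # Send the traceback to the browser instead of a bare 500 error.
        return "Content-Type: text/plain\r\n\r\n" + traceback.format_exc()
```

A script would call page = run_cgi(main) and write page to standard output
when it is not None.  As with cgitb, this exposes internals to anyone who can
reach the page, so it belongs in debugging only.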

Collin.

On Thu, 31 Oct 2013, Erin Hodgess wrote:

> Hi again:
>
> Here is the web output:
>
> Internal Server Error
>
> The server encountered an internal error or misconfiguration and was unable
> to complete your request.
>
> Please contact the server administrator, webmas...@erinm.info and inform
> them of the time the error occurred, and anything you might have done that
> may have caused the error.
>
> More information about this error may be available in the server error log.
>
> Additionally, a 404 Not Found error was encountered while trying to use an
> ErrorDocument to handle the request.
> I did indeed check permissions and they seem to be in order.
>
> Thanks,
> Erin
>
>
>
> On Wed, Oct 30, 2013 at 10:51 PM, Collin Lynch  wrote:
>
> > Erin can you share the internal error details?
> >
> > As a first guess are the files executable by all?  CGI requires world rwx.
> >
> > Best,
> > Collin.
> >
> > On Wed, 30 Oct 2013, Erin Hodgess wrote:
> >
> > > Hi again.
> > >
> > > I'm putting together a little project with R, python, and a website.  So
> > I
> > > have an HTML file, a py file, an R file.
> > >
> > > Here is the HTML file:
> > > 
> > >  Integrate
> > >  Differentiate
> > >  Graph
> > > Function 
> > > 
> > > 
> > >
> > > Now the radio4.py file:
> > >
> > > # Import modules for CGI handling
> > > import cgi, cgitb
> > > from sympy import *
> > > import sys
> > >
> > > from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage as
> > > STAP
> > > with open("bz2.R","r") as f:
> > > string=''.join(f.readlines())
> > > etest = STAP(string,"etest")
> > > etest.etest(500)
> > >
> > >
> > > # Create instance of FieldStorage
> > > form = cgi.FieldStorage()
> > >
> > > # Get data from fields
> > > if form.getvalue('subject'):
> > >subject = form.getvalue('subject')
> > > else:
> > >subject = "Not set"
> > >
> > > if form.getvalue('func1'):
> > >func1 = form.getvalue('func1')
> > > else:
> > >func1 = "Not entered"
> > >
> > >
> > >
> > >
> > >
> > > print "Content-type:text/html\r\n\r\n"
> > > print ""
> > > print ""
> > > print "Test Project"
> > > print ""
> > > print ""
> > > print " Selected Action is %s" % subject
> > > print " output function is %s" % func1
> > > print ""
> > > print ""
> > >
> > >
> > > Finally, the bz2.R file:
> > >
> > > etest <- function(n=100) {
> > > y <- rnorm(n)
> > > pdf(file="lap1.png")
> > > plot(y)
> > > dev.off()
> > > }
> > >
> > >
> > > The radio4.py file is in a cgi-bin directory, along with the bz2.R file.
> > >
> > > I keep getting the Internal server error.
> > >
> > > Thanks for any help.
> > >
> > > Sincerely,
> > > Erin
> > >
> > > This is R version 3.0.2 and Python 2.7.5
> > >
> > > --
> > > Erin Hodgess
> > > Associate Professor
> > > Department of Computer and Mathematical Sciences
> > > University of Houston - Downtown
> > > mailto: erinm.hodg...@gmail.com
> > >
> > >
> >
> >
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>



Re: [R] an rpy2, R cgi type question

2013-10-30 Thread Collin Lynch
Erin can you share the internal error details?

As a first guess, are the files executable by all?  CGI scripts generally
need to be readable and executable by the web server user (e.g. mode 755).
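
A quick way to check this from Python, as a sketch: the helpers below only
test whether "other" users have both read and execute permission on a file,
which is one common requirement.  The exact permissions a server needs depend
on its configuration, and both function names are made up for illustration.

```python
import os
import stat


def mode_allows_world_run(mode):
    """True when the permission bits grant read and execute to 'other'."""
    needed = stat.S_IROTH | stat.S_IXOTH
    return (mode & needed) == needed


def world_can_run(path):
    """Apply the same check to a file on disk."""
    return mode_allows_world_run(os.stat(path).st_mode)
```

For example, world_can_run("radio4.py") run inside the cgi-bin directory
should be True for a script with mode 755.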

Best,
Collin.

On Wed, 30 Oct 2013, Erin Hodgess wrote:

> Hi again.
>
> I'm putting together a little project with R, python, and a website.  So I
> have an HTML file, a py file, an R file.
>
> Here is the HTML file:
> 
>  Integrate
>  Differentiate
>  Graph
> Function 
> 
> 
>
> Now the radio4.py file:
>
> # Import modules for CGI handling
> import cgi, cgitb
> from sympy import *
> import sys
>
> from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage as
> STAP
> with open("bz2.R","r") as f:
>     string=''.join(f.readlines())
> etest = STAP(string,"etest")
> etest.etest(500)
>
>
> # Create instance of FieldStorage
> form = cgi.FieldStorage()
>
> # Get data from fields
> if form.getvalue('subject'):
>subject = form.getvalue('subject')
> else:
>subject = "Not set"
>
> if form.getvalue('func1'):
>func1 = form.getvalue('func1')
> else:
>func1 = "Not entered"
>
>
>
>
>
> print "Content-type:text/html\r\n\r\n"
> print ""
> print ""
> print "Test Project"
> print ""
> print ""
> print " Selected Action is %s" % subject
> print " output function is %s" % func1
> print ""
> print ""
>
>
> Finally, the bz2.R file:
>
> etest <- function(n=100) {
> y <- rnorm(n)
> pdf(file="lap1.png")
> plot(y)
> dev.off()
> }
>
>
> The radio4.py file is in a cgi-bin directory, along with the bz2.R file.
>
> I keep getting the Internal server error.
>
> Thanks for any help.
>
> Sincerely,
> Erin
>
> This is R version 3.0.2 and Python 2.7.5
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] rpy2 and user defined functions from R

2013-10-30 Thread Collin Lynch
I don't believe that rpy2 will load a saved workspace.  When I have worked
with this I always load my functions by sourcing an R file separately:

R.r['source']("MyFuncs.r")
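
The sourcing step can be wrapped in a small helper.  This is a sketch under
assumptions: the function name source_r_file and the optional r argument are
hypothetical, and rpy2 is imported lazily so the helper can be defined on a
machine without R.  It relies only on the rpy2 idiom of indexing robjects.r
by the name of an R function.

```python
def source_r_file(path, r=None):
    """Source an R script so the functions it defines become visible to rpy2.

    `r` defaults to rpy2's embedded R instance; passing another object with
    the same indexing behaviour is an assumption made here for testability.
    """
    if r is None:
        import rpy2.robjects as robjects  # requires rpy2 and a working R
        r = robjects.r
    r["source"](path)  # equivalent to calling source("path") in R
    return r
```

After source_r_file("MyFuncs.r"), a function such as buzz defined in that
file should be reachable by key, as in R.r['buzz'](3).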


Best,
Collin.

On Wed, 30 Oct 2013, Erin Hodgess wrote:

> Here we go:
>
> > buzz
> function(x) {
> y <- x + pi
> return(y)
> }
> > q()
> Save workspace image? [y/n/c]: python
> Save workspace image? [y/n/c]: y
> root@erinminfo [/home/erinminf/public_html]# python
> Python 2.7.5 (default, Sep 11 2013, 02:14:06)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-54)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import rpy2.robjects as R
> >>> R.r.buzz(3)
> Traceback (most recent call last):
>   File "", line 1, in 
>   File "/usr/local/lib/python2.7/site-packages/rpy2/robjects/__init__.py",
> line 213, in __getattribute__
> raise orig_ae
> AttributeError: 'R' object has no attribute 'buzz'
> >>> R.r['buzz'](3)
> Traceback (most recent call last):
>   File "", line 1, in 
>   File "/usr/local/lib/python2.7/site-packages/rpy2/robjects/__init__.py",
> line 216, in __getitem__
> res = _globalenv.get(item)
> LookupError: 'buzz' not found
> >>>
> root@erinminfo [/home/erinminf/public_html]#
>
>
> On Wed, Oct 30, 2013 at 10:16 AM, Collin Lynch  wrote:
>
> > Erin, one question, can you access the defined functions by key?
> >
> > In lieu of:
> > > x = R.r.buzz(3)
> >
> > Can you do:
> >   x = R.r['buzz'](3)
> >
> >
> > Alternatively if you need only one or two custom functions have you
> > considered just defining them via python as in:
> >
> > PStr = """
> > function(LM) {
> >   S <- summary(LM);
> >   print(S$fstatistic);
> >   F <- S$fstatistic;
> >   P <- pf(F[1], F[2], F[3], lower=FALSE);
> >   return(P);
> > }
> > """
> > r_LMPValFunc = robjects.r(PStr)
> >
> > Best,
> > Collin.
> >
> >
> > On Tue, 29 Oct 2013, Erin Hodgess wrote:
> >
> > > Hello again!
> > >
> > > I'm using python with a module rpy2 to call functions from R.
> > >
> > > It works fine on built in R functions like rnorm.
> > >
> > > However, I would like to access user-defined functions as well.  For
> > those
> > > of you who use this, I have:
> > >
> > > import rpy2.robjects as R
> > > > R object has no attribute buzz
> > >
> > > (user defined function of buzz)
> > >
> > > This is on a Centos 5 machine with R-3.0.2 and python of 2.7.5.
> > >
> > > Thanks for any help.
> > > Sincerely,
> > > Erin
> > >
> > >
> > >
> > > --
> > > Erin Hodgess
> > > Associate Professor
> > > Department of Computer and Mathematical Sciences
> > > University of Houston - Downtown
> > > mailto: erinm.hodg...@gmail.com
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>



Re: [R] rpy2 and user defined functions from R

2013-10-30 Thread Collin Lynch
Erin, one question, can you access the defined functions by key?

In lieu of:
> x = R.r.buzz(3)

Can you do:
  x = R.r['buzz'](3)


Alternatively if you need only one or two custom functions have you
considered just defining them via python as in:

import rpy2.robjects as robjects

PStr = """
function(LM) {
  S <- summary(LM);
  print(S$fstatistic);
  F <- S$fstatistic;
  P <- pf(F[1], F[2], F[3], lower.tail=FALSE);
  return(P);
}
"""
r_LMPValFunc = robjects.r(PStr)

Best,
Collin.


On Tue, 29 Oct 2013, Erin Hodgess wrote:

> Hello again!
>
> I'm using python with a module rpy2 to call functions from R.
>
> It works fine on built in R functions like rnorm.
>
> However, I would like to access user-defined functions as well.  For those
> of you who use this, I have:
>
> import rpy2.robjects as R
> R object has no attribute buzz
>
> (user defined function of buzz)
>
> This is on a Centos 5 machine with R-3.0.2 and python of 2.7.5.
>
> Thanks for any help.
> Sincerely,
> Erin
>
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] fitdistr: was Heteroscedasity...

2013-10-28 Thread Collin Lynch
Hello again, first off thank you for your suggestion Mr. Rigby, I'll take
a look at the GAMLSS package.

I have a (slightly) related followup question regarding the 'fitdistr'
function.  I was examining my data, a sample of which is attached, using
this function and I am confused about the interpretation of the loglik
result.

Based upon my experience I understand that the log-likelihood value returned
from this function should be in the range (-Inf, 0], with values closer to 0
being the best choice.  However when I test this data I get the following
results:

fitdistr(E, "normal")$loglik
[1] 11.15125

fitdistr(E + 1, "lognormal")$loglik
[1] -0.8575117

fitdistr(E, "exponential")$loglik
[1] -73.18107

Is this a sign of error, in my system or on my part?  And if not am I
correct in interpreting lognormal as the best choice?
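
A positive log-likelihood is not in itself a sign of error: for a continuous
distribution the likelihood is a product of density values, and a density can
exceed 1 when the data are tightly concentrated.  The sketch below shows this
for the normal case using the closed-form maximum-likelihood fit.  The
function name normal_loglik and the sample values are made up for
illustration and are not the attached data.

```python
import math


def normal_loglik(xs):
    """Log-likelihood of a normal model evaluated at its ML estimates.

    Uses the ML variance (divide by n), for which the log-likelihood
    simplifies to -n/2 * (log(2*pi*var) + 1).
    """
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


tight = [0.10, 0.11, 0.09, 0.10, 0.12]   # small spread: densities above 1
wide = [0.0, 10.0, 20.0, 30.0, 40.0]     # large spread: densities below 1
print(normal_loglik(tight) > 0, normal_loglik(wide) < 0)
```

Higher values still indicate a better fit, whatever the sign.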

    Thank you in advance,
Collin Lynch.



[R] Heteroscedasticity and mgcv.

2013-10-26 Thread Collin Lynch
I have a two part question one about statistical theory and the other
about implementations in R.  Thank you for all help in advance.

(1) Am I correct in understanding that Heteroscedasticity is a problem for
Generalized Additive Models as it is for standard linear models?  I am
asking particularly about the GAMs as implemented in the mgcv package.
Based upon my online search it seems that some forms of penalized splines
can address heteroscedasticity while others cannot and I'm not sure what
is true of the methods used in mgcv.

(2) Assuming that heteroscedasticity is a problem for the mgcv GAMs, can
anyone recommend a good test implementation?  I am familiar with the
ncvTest method implemented in the car package but that applies only to
lms.

Thank you,
    Collin Lynch.



Re: [R] Statistical power of correlations.

2012-05-07 Thread Collin Lynch
Thank you Arun!

Collin.

On Mon, 7 May 2012, arun wrote:

> Hi Collin,
>
> Look in the package 'pwr' for 'pwr.r.test'.
>
> A.K.
>
> - Original Message -
> From: Collin Lynch
> To: r-help@r-project.org
> Cc:
> Sent: Monday, May 7, 2012 1:44 AM
> Subject: [R] Statistical power of correlations.
>
> My apologies for the statistical naivete of my question but...
>
> Is there an established method for calculating the statistical power of a
> correlation test?  And if so is there a method in R for it?
>
> Thank you,
> Collin Lynch.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Statistical power of correlations.

2012-05-07 Thread Collin Lynch
Great thanks Peter!

Collin.

On Mon, 7 May 2012, peter dalgaard wrote:

>
> On May 7, 2012, at 07:44 , Collin Lynch wrote:
>
> > My apologies for the statistical naivete of my question but...
> >
> > Is there an established method for calculating the statistical power of a
> > correlation test?  And if so is there a method in R for it?
>
> There's a pwr.r.test in the "pwr" package. This is based on the Z transform, 
> which makes quite good sense to me.
>
> >
> > Thank you,
> > Collin Lynch.
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>
>
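
The Z-transform idea mentioned in the quoted reply can be sketched in a few
lines.  This is the plain Fisher z normal approximation for a two-sided test
of zero correlation, not necessarily the exact formula that pwr.r.test
implements; the function name correlation_power is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate power of a two-sided test of rho = 0.

    Fisher's z = atanh(r) is treated as normal with standard error
    1 / sqrt(n - 3), so n must be greater than 3.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = atanh(r) * sqrt(n - 3)       # standardized true effect
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)
```

For instance, correlation_power(0.3, 84) comes out close to 0.8 under this
approximation.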



[R] Statistical power of correlations.

2012-05-06 Thread Collin Lynch
My apologies for the statistical naivete of my question but...

Is there an established method for calculating the statistical power of a
correlation test?  And if so is there a method in R for it?

Thank you,
Collin Lynch.



Re: [R] Proper power computation for one-sided binomial tests.

2008-09-25 Thread Collin Lynch
On 23.09.2008 at 23:57, Peter Dalgaard wrote:

> For this kind of problem I'd go directly for the binomial
> distribution. If the actual probability is 0, this is essentially
> deterministic and you can look at
>
> > binom.test(0,99,p=.03, alt="less")
>

> This means that you don't sample from the p=.03 population?
> Note that there is a 5 per cent chance to have 0 failures in 99
> trials with p=.03.

In this case I am given the p=.03 as static value.  So my goal is to
compare the sample statistic against that.  I am essentially checking the
real population against an a-priori hypothesis of p=.03 or rather whatever
we set it at (long story).
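
The 5 per cent figure quoted above follows directly from the binomial
distribution: with rate p, the chance of seeing no events in n trials is
(1 - p)^n.  A minimal sketch, with hypothetical function names, that also
finds the smallest n at which zero events rejects the hypothesised rate:

```python
import math


def prob_zero_events(n, p):
    """Probability of observing no events in n independent trials at rate p."""
    return (1 - p) ** n


def min_sample_size(p, alpha=0.05):
    """Smallest n for which zero events has probability at most alpha."""
    return math.ceil(math.log(alpha) / math.log(1 - p))


print(min_sample_size(0.03))         # sample size needed at p = .03
print(prob_zero_events(99, 0.03))    # just under 0.05
```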

Collin.



Re: [R] Proper power computation for one-sided binomial tests.

2008-09-24 Thread Collin Lynch
Thank you Peter.  That is incredibly helpful, and much much smaller!

Best,
Collin.



[R] Proper power computation for one-sided binomial tests.

2008-09-23 Thread Collin Lynch
Hi, I am trying to determine the best way to compute the power for a
one-sample one-sided binomial test.  Specifically I need to sample a
population of individuals and ask whether a sample rate of 0% is
compatible with a minimum threshold of 3% and how many samples are needed.

I have made use of power.prop.test but I am not sure if a) that is the
correct (or best) function to use and b) if the output is quite right.

Here is a sample run:
> power.prop.test(p1=0, p2=0.03, sig.level=0.05, power=0.90, alt="one.sided")

     Two-sample comparison of proportions power calculation

              n = 279.3004
             p1 = 0
             p2 = 0.03
      sig.level = 0.05
          power = 0.9
    alternative = one.sided

NOTE: n is number in *each* group

This is an attempt to test whether a sample of 0% occurrence is compatible
with an a-priori probability of 3% at the specified significance levels.

My questions are those above, and, as a followup whether the caveat about
n being the number in each group means that I need to sample twice that
number in a single group.  I don't believe so but I want to be sure.
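
Since power.prop.test reports a two-sample comparison, a one-sample question
like this one can also be examined with an exact binomial calculation.  The
sketch below is an illustration under assumptions: the function names are
hypothetical, the test rejects the hypothesised rate p0 when the observed
count is at or below a critical value, and power is evaluated at a true
rate p1.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_power_less(n, p0, p1, alpha=0.05):
    """Exact power of the one-sided (lower-tail) binomial test.

    The critical value c is the largest count with P(X <= c | p0) <= alpha;
    the power is then P(X <= c | p1).  Returns 0.0 if no count rejects.
    """
    c = -1
    while c + 1 <= n and binom_cdf(c + 1, n, p0) <= alpha:
        c += 1
    if c < 0:
        return 0.0
    return binom_cdf(c, n, p1)
```

With p0 = 0.03 and a true rate of 0, exact_power_less(99, 0.03, 0.0) is 1.0
while exact_power_less(98, 0.03, 0.0) is 0.0, so the single group needs 99
individuals under these assumptions.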

Thanks in advance,
    Collin Lynch.
