Re: [R] Error running lda example from Help File (MASS library )
I don't run into any problems when running the examples from the lda help file.

> sessionInfo()
R version 2.10.0 Patched (2009-11-09 r50375)
i386-pc-mingw32

locale:
[1] LC_COLLATE=Chinese_People's Republic of China.936
[2] LC_CTYPE=Chinese_People's Republic of China.936
[3] LC_MONETARY=Chinese_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese_People's Republic of China.936

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] MASS_7.3-3 ASRR_0.0-1 ASAtable_0.0-1 QCA3_0.0-3

loaded via a namespace (and not attached):
[1] car_1.2-16 tools_2.10.0

2009/11/15 Greg Riddick :
> Hello all,
>
> I'm trying to run lda() from the MASS library but the Help example generates
> the following error:
>
> #Code from example in lda Help file
>
> # Resulting Error
> >Error in if (targetlist[i] == stringname) { : argument is of length zero
>
> My Current R Installation:
> MacOSX: 10.5.8
> R: 2.10.0
>
> --
> Gregory Riddick, PhD.
> CRTA Research Fellow
>
> National Institutes of Health
> National Cancer Institute, Neuro-Oncology Branch
> http://home.ccr.cancer.gov/nob/
>
> 37 Convent Drive
> Building 37, Room 1142
> Bethesda, MD 20892-8202
>
> Phone: 301-443-2490
> Fax: 240-396-5920

--
Wincent Ronggui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Weighted descriptives by levels of another variables
In addition to using the survey package (and its svyby function), I've found that many of the 'weighted' functions, such as wtd.mean, work well with the plyr package. For example:

library(Hmisc)  # wtd.mean, cut2
library(plyr)   # ddply
wtdmean <- function(df) wtd.mean(df$obese, df$sampwt)
ddply(mydata, ~cut2(age, c(2, 6, 12, 16)), 'wtdmean')

hth,
david freedman

Andrew Miles-2 wrote:
>
> I've noticed that R has a number of very useful functions for obtaining descriptive statistics on groups of variables, including summary {stats}, describe {Hmisc}, and describe {psych}, but none that I have found is able to provide weighted descriptives of subsets of a data set (e.g. descriptives for both males and females for age, where accurate results require use of sampling weights).
>
> Does anybody know of a function that does this?
>
> What I've looked at already:
>
> I have looked at describe.by {psych}, which will give descriptives by levels of another variable (e.g. mean ages of males and females), but does not accept sample weights.
>
> I have also looked at describe {Hmisc}, which allows for weights, but has no functionality for subdivision.
>
> I tried using a by() function with describe {Hmisc}:
>
> by(cbind(my, variables, here), division.variable, describe, weights=weight.variable)
>
> but found that this returns an error message stating that the variables to be described and the weights variable are not the same length:
>
> Error in describe.vector(xx, nam[i], exclude.missing = exclude.missing, :
>   length of weights must equal length of x
> In addition: Warning message:
> In present & !is.na(weights) :
>   longer object length is not a multiple of shorter object length
>
> This happens because the by() function passes down a subset of the variables to be described to describe(), but not a subset of the weights variable. describe() then searches whatever data set is attached in order to find the weights variable, but this is in its original (i.e. not subsetted) form.
>
> Here is an example using the ChickWeight dataset that comes in the "datasets" package.
>
> data(ChickWeight)
> attach(ChickWeight)
> library(Hmisc)
> # this gives descriptive data on the variables "Time" and "Chick" by levels of "Diet"
> by(cbind(Time, Chick), Diet, describe)
> # trying to add weights, however, does not work for reasons described above
> wgt=rnorm(length(Chick), 12, 1)
> by(cbind(Time, Chick), Diet, describe, weights=wgt)
>
> Again, my question is, does anybody know of a function that combines both the ability to provide weighted descriptives with the ability to subdivide by the levels of some other variable?
>
> Andrew Miles
> Department of Sociology
> Duke University
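A minimal base-R sketch of the same split-apply-combine idea, for readers who do not have Hmisc or plyr installed. The data frame `mydata` and its values are invented here to mirror the column names in the reply (`age`, `obese`, `sampwt`); `cut()` stands in for `cut2()` and `weighted.mean()` for `wtd.mean()`, so bin labels and NA handling may differ from the original call.

```r
# Hypothetical data mirroring the column names used in the reply above.
set.seed(1)
mydata <- data.frame(
  age    = sample(2:15, 200, replace = TRUE),
  obese  = rbinom(200, 1, 0.2),
  sampwt = runif(200, 0.5, 2)
)

# Bin age into [2,6), [6,12), [12,16); cut() is a base-R stand-in for Hmisc::cut2().
mydata$agegrp <- cut(mydata$age, breaks = c(2, 6, 12, 16), right = FALSE)

# Split the rows by group so that each piece carries its own weights,
# then compute one weighted mean per piece.
pieces <- split(mydata, mydata$agegrp)
wtd_by_group <- data.frame(
  agegrp  = names(pieces),
  wtdmean = unname(sapply(pieces, function(d) weighted.mean(d$obese, d$sampwt)))
)
print(wtd_by_group)
```

The key point is that the weights live in the same data frame as the variable being summarised, so they are subset together.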
Re: [R] A combinatorial optimization problem: finding the best permutation of a complex vector
Hi,

It was pointed out to me by Hans Borchers that the timing I reported in my previous email (0.3 sec) for solving the LSAP with N = 1000 was too optimistic, because "X" and "Y" were equivalent up to a permutation.

To test this, I ran a few more experiments with different random variate distributions for X and Y. In all the experiments, I took N = 500. Execution was fastest when X and Y had the same or similar distributions: generally around 8-9 seconds. The more the distributions differ, the longer it takes. For example, when I took the real and imaginary parts of X from a t-distribution with 3 degrees of freedom, and Y from a uniform distribution on (0, 1), the execution times were around 80-90 seconds.

library(clue)  # provides solve_LSAP
n <- 500
x <- rt(n, df=3) + 1i * rt(n, df=3)
y <- runif(n) + 1i * runif(n)
Cmat <- outer(x, y, FUN=function(x,y) Mod(x - y))
system.time(ans <- solve_LSAP(Cmat, maximum=FALSE))

When I increased N to 1000, the time was about 1400 seconds!

Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvarad...@jhmi.edu

----- Original Message -----
From: Ravi Varadhan
Date: Saturday, November 14, 2009 10:53 am
Subject: Re: [R] A combinatorial optimization problem: finding the best permutation of a complex vector
To: "Charles C. Berry"
Cc: r-help@r-project.org

> Hi,
>
> I have solved the problem that I had posed before. Here is a statement of the problem:
>
> "I have a complex-valued vector X in C^n. Given another complex-valued vector Y in C^n, I want to find a permutation of Y, say, Y*, that minimizes ||X - Y*||, the distance between X and Y*."
>
> I was talking to Professor Moody T. Chu, who is a well-known numerical analyst from NC State Univ, and he pointed out that this problem is an instance of the classical "linear sum assignment problem (LSAP)" in discrete mathematics. Once this was revealed to me, it didn't take me long to find various algorithms (e.g. the Hungarian algorithm) and codes (C, Fortran, Matlab) for solving this problem. I also looked in the CRAN task view on optimization and found that an LSAP solver is present in the "clue" package. Thanks to Kurt Hornik for this package.
>
> So, here is an illustration of the "amazing" power of mathematics:
>
> n <- 1000
>
> x <- rt(n, df=3) + 1i * rt(n, df=3) # this is the target vector to be matched
>
> y <- x[sample(n)] # this is the vector to be permuted
>
> # Note: I have chosen a random permutation of the target so that I know the answer is "x" itself
> # and the minimum distance is zero
>
> Cmat <- outer(x, y, FUN=function(x,y) Mod(x - y))
>
> require(clue)
>
> ans <- solve_LSAP(Cmat, maximum=FALSE) # We are minimizing the linear sum
>
> dist <- function(x, y) sqrt(sum(Mod(x - y)^2))
>
> dist(x, y[c(ans)])
>
> This is remarkable. It takes only about 0.3 seconds to solve this difficult combinatorial problem!
>
> Best,
> Ravi.
>
> Ravi Varadhan, Ph.D.
> Assistant Professor,
> Division of Geriatric Medicine and Gerontology
> School of Medicine
> Johns Hopkins University
>
> Ph. (410) 502-2619
> email: rvarad...@jhmi.edu
>
> ----- Original Message -----
> From: "Charles C. Berry"
> Date: Thursday, November 12, 2009 2:20 pm
> Subject: Re: [R] A combinatorial optimization problem: finding the best permutation of a complex vector
> To: Ravi Varadhan
> Cc: r-help@r-project.org
>
> > On Thu, 12 Nov 2009, Ravi Varadhan wrote:
> >
> > > Hi,
> > >
> > > I have a complex-valued vector X in C^n. Given another complex-valued vector Y in C^n, I want to find a permutation of Y, say, Y*, that minimizes ||X - Y*||, the distance between X and Y*.
> > >
> > > Note that this problem can be trivially solved for "Real" vectors, since real numbers possess the ordering property. Complex numbers, however, do not possess this property. Hence the challenge.
> > >
> > > The obvious solution is to enumerate all the permutations of Y and pick out the one that yields the smallest distance. This, however, is only feasible for small n. I would like to be able to solve this for n as large as 100 - 1000, in which cases the permutation approach is infeasible.
> > >
> > > I am looking for algorithms, possibly iterative, that can provide "good" approximate solutions that are not necessarily optimal for high-dimensional vectors. I can do random sampling, but this can be very inefficient in high-dimensional problems. I am looking for efficient algorithms because this
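The remark that the real-valued case is trivial can be checked numerically. The sketch below is an illustration only: it compares sort-based matching against brute-force enumeration for a small real-valued example (n = 6, so 720 permutations). The helper `all_perms` and the name `dist2` are made up for this example. Note also that minimising the Euclidean norm ||X - Y*|| is equivalent to minimising the sum of squared moduli, whereas the cost matrix in the messages above uses the unsquared `Mod(x - y)`, which minimises the sum of moduli; for complex vectors the two criteria need not select the same permutation.

```r
# All permutations of a vector, as a list (hypothetical helper; small n only).
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Euclidean distance between two vectors (works for real or complex input).
dist2 <- function(x, y) sqrt(sum(Mod(x - y)^2))

set.seed(42)
n <- 6
x <- rnorm(n)
y <- runif(n)

# Brute force: evaluate the distance for every permutation of y.
perms <- all_perms(seq_len(n))
best_brute <- min(sapply(perms, function(p) dist2(x, y[p])))

# Sort-based matching: pair the k-th smallest y with the k-th smallest x.
y_star <- numeric(n)
y_star[order(x)] <- sort(y)
best_sort <- dist2(x, y_star)

print(c(brute = best_brute, sorted = best_sort))
```

Brute force costs n! evaluations, which is why it stops being usable long before n = 100.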
Re: [R] Weighted descriptives by levels of another variables
Have you reviewed the survey package functions?

--
David

On Nov 14, 2009, at 5:31 PM, Andrew Miles wrote:

> I've noticed that R has a number of very useful functions for obtaining descriptive statistics on groups of variables, including summary {stats}, describe {Hmisc}, and describe {psych}, but none that I have found is able to provide weighted descriptives of subsets of a data set (e.g. descriptives for both males and females for age, where accurate results require use of sampling weights).
>
> Does anybody know of a function that does this?
>
> What I've looked at already:
>
> I have looked at describe.by {psych}, which will give descriptives by levels of another variable (e.g. mean ages of males and females), but does not accept sample weights.
>
> I have also looked at describe {Hmisc}, which allows for weights, but has no functionality for subdivision.
>
> I tried using a by() function with describe {Hmisc}:
>
> by(cbind(my, variables, here), division.variable, describe, weights=weight.variable)
>
> but found that this returns an error message stating that the variables to be described and the weights variable are not the same length:
>
> Error in describe.vector(xx, nam[i], exclude.missing = exclude.missing, :
>   length of weights must equal length of x
> In addition: Warning message:
> In present & !is.na(weights) :
>   longer object length is not a multiple of shorter object length
>
> This happens because the by() function passes down a subset of the variables to be described to describe(), but not a subset of the weights variable. describe() then searches whatever data set is attached in order to find the weights variable, but this is in its original (i.e. not subsetted) form.
>
> Here is an example using the ChickWeight dataset that comes in the "datasets" package.
>
> data(ChickWeight)
> attach(ChickWeight)
> library(Hmisc)
> # this gives descriptive data on the variables "Time" and "Chick" by levels of "Diet"
> by(cbind(Time, Chick), Diet, describe)
> # trying to add weights, however, does not work for reasons described above
> wgt=rnorm(length(Chick), 12, 1)
> by(cbind(Time, Chick), Diet, describe, weights=wgt)
>
> Again, my question is, does anybody know of a function that combines both the ability to provide weighted descriptives with the ability to subdivide by the levels of some other variable?
>
> Andrew Miles
> Department of Sociology
> Duke University

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] re move row if the column "date_abandoned" has a date in it
On Nov 14, 2009, at 5:24 PM, frenchcr wrote:

> I tried the following but it does the opposite of what i want:
>
> new_data5 <- subset(new_data4, date_abandoned > "0101")
>
> I want to remove the rows with dates and leave just the rows without a date.
> This removes all the rows that dont have a date in the date_abandoned column
> ...on a positive note, as i did this next...
>
> dim(new_data5)
> [1] 263 80
>
> i now know that i have 263 dates in that column :)
>
> I want to remove the 263 rows with dates and leave just the rows without a date.

Come on, frenchcr. Stop making us guess. Give us enough information to work with. You asked for something which I construed as saying you wanted dates greater than the first day of the year 101. You did not address this question. What do you get with

str(new_data4)

and

summary(new_data4$date_abandoned)

? In order to know what sort of comparison to use, we need to know what the data looks like. Even better if you offered the output from:

small <- head(new_data4, 20)
dump("small", file = "")

--
David

> David Winsemius wrote:
>>
>> On Nov 14, 2009, at 1:21 PM, frenchcr wrote:
>>
>>> I want to go through a column in data called
>>
>> Bad name for a data.frame. Fortunes, "dog" and all that.
>>
>>> date_abandoned (data["date_abandoned"]) and remove all the rows that
>>> have numbers greater than 1,010,000.
>>
>> Are you doing archeology? Given what you say next I wondered what
>> range you were really asking for.
>>
>>> The dates are in the format 20091114 so i'm just going to treat them as
>>> numbers for clean up purposes.
>>>
>>> I know that i use subset but not sure how to proceed from there.
>>
>> subdata <- subset(data, date_abandoned > "0101"()
>>
>> The problem with > "101" is that your specified minimum point had
>> an insufficient number of "places" to be in MMDD format.
>>
>> --
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
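Until the structure of the column is posted, any concrete answer is a guess. As a sketch under two stated assumptions (that rows without a date hold NA or an empty string in `date_abandoned`, and that dates are stored as text such as "20091114"), the rows without a date could be kept as follows. The data frame below is invented; the real `new_data4` was never shown in the thread.

```r
# Invented example data standing in for the real new_data4.
new_data4 <- data.frame(
  id = 1:6,
  date_abandoned = c("20091114", NA, "", "20080301", NA, "19991231"),
  stringsAsFactors = FALSE
)

# Assumption: a row "has a date" if the entry is neither NA nor empty.
has_date <- !is.na(new_data4$date_abandoned) & nzchar(new_data4$date_abandoned)

# Keep only the rows WITHOUT a date by negating the condition.
no_date <- new_data4[!has_date, ]

print(table(has_date))  # counts of rows with and without a date
print(no_date)
```

For context, subset() keeps rows where the condition is TRUE and drops rows where it evaluates to NA, which would be consistent with the earlier attempt returning only the dated rows.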
[R] Error running lda example from Help File (MASS library )
Hello all,

I'm trying to run lda() from the MASS library but the Help example generates the following error:

#Code from example in lda Help file

# Resulting Error
>Error in if (targetlist[i] == stringname) { : argument is of length zero

My Current R Installation:
MacOSX: 10.5.8
R: 2.10.0

--
Gregory Riddick, PhD.
CRTA Research Fellow

National Institutes of Health
National Cancer Institute, Neuro-Oncology Branch
http://home.ccr.cancer.gov/nob/

37 Convent Drive
Building 37, Room 1142
Bethesda, MD 20892-8202

Phone: 301-443-2490
Fax: 240-396-5920
Re: [R] when vectorising does not work: silent function fail?
> Also, you probably get less data copying by using a for() or while() loop
> than by using apply() in this context.

Why might there be less data copying with for() and while() than with apply()?

> Finally, the overhead of formula parsing and model matrix construction
> repeated thousands of times probably dominates this computation; if it isn't
> just a one-off it would probably be worth a lower-level implementation.

Does "lower-level implementation" mean coding this outside of R?

Thanks!

Juliet
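One reading of "lower-level implementation" that stays inside R (an interpretation, not necessarily what the earlier poster had in mind) is to build the design matrix once and call the fitting routine lm.fit() directly, instead of calling lm() with a formula on every iteration. The sketch below uses simulated data; the sizes and variable names are arbitrary choices for illustration.

```r
set.seed(1)
n <- 50
X <- cbind(Intercept = 1, x = rnorm(n))   # design matrix, built once
Y <- matrix(rnorm(n * 200), nrow = n)      # 200 simulated responses, one per column

# Loop with the low-level fitter: no formula parsing, no model frame construction.
slopes_fit <- numeric(ncol(Y))
for (j in seq_len(ncol(Y))) {
  slopes_fit[j] <- lm.fit(X, Y[, j])$coefficients[["x"]]
}

# The same slopes through the formula interface, for comparison.
slopes_lm <- apply(Y, 2, function(y) coef(lm(y ~ X[, "x"]))[[2]])

print(all.equal(slopes_fit, slopes_lm))
```

Wrapping each version in system.time() gives a rough idea of how much of the run time goes to the formula machinery on a given machine.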
Re: [R] re move row if the column "date_abandoned" has a date in it
I tried the following but it does the opposite of what I want:

new_data5 <- subset(new_data4, date_abandoned > "0101")

I want to remove the rows with dates and leave just the rows without a date. This removes all the rows that don't have a date in the date_abandoned column ...on a positive note, as I did this next...

dim(new_data5)
[1] 263 80

I now know that I have 263 dates in that column :)

I want to remove the 263 rows with dates and leave just the rows without a date.

David Winsemius wrote:
>
> On Nov 14, 2009, at 1:21 PM, frenchcr wrote:
>
>> I want to go through a column in data called
>
> Bad name for a data.frame. Fortunes, "dog" and all that.
>
>> date_abandoned (data["date_abandoned"]) and remove all the rows that
>> have numbers greater than 1,010,000.
>
> Are you doing archeology? Given what you say next I wondered what
> range you were really asking for.
>
>> The dates are in the format 20091114 so i'm just going to treat them as
>> numbers for clean up purposes.
>>
>> I know that i use subset but not sure how to proceed from there.
>
> subdata <- subset(data, date_abandoned > "0101"()
>
> The problem with > "101" is that your specified minimum point had
> an insufficient number of "places" to be in MMDD format.
>
> --
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
[R] Weighted descriptives by levels of another variables
I've noticed that R has a number of very useful functions for obtaining descriptive statistics on groups of variables, including summary {stats}, describe {Hmisc}, and describe {psych}, but none that I have found is able to provide weighted descriptives of subsets of a data set (e.g. descriptives for both males and females for age, where accurate results require use of sampling weights).

Does anybody know of a function that does this?

What I've looked at already:

I have looked at describe.by {psych}, which will give descriptives by levels of another variable (e.g. mean ages of males and females), but does not accept sample weights.

I have also looked at describe {Hmisc}, which allows for weights, but has no functionality for subdivision.

I tried using a by() function with describe {Hmisc}:

by(cbind(my, variables, here), division.variable, describe, weights=weight.variable)

but found that this returns an error message stating that the variables to be described and the weights variable are not the same length:

Error in describe.vector(xx, nam[i], exclude.missing = exclude.missing, :
  length of weights must equal length of x
In addition: Warning message:
In present & !is.na(weights) :
  longer object length is not a multiple of shorter object length

This happens because the by() function passes down a subset of the variables to be described to describe(), but not a subset of the weights variable. describe() then searches whatever data set is attached in order to find the weights variable, but this is in its original (i.e. not subsetted) form.

Here is an example using the ChickWeight dataset that comes in the "datasets" package.

data(ChickWeight)
attach(ChickWeight)
library(Hmisc)
# this gives descriptive data on the variables "Time" and "Chick" by levels of "Diet"
by(cbind(Time, Chick), Diet, describe)
# trying to add weights, however, does not work for reasons described above
wgt=rnorm(length(Chick), 12, 1)
by(cbind(Time, Chick), Diet, describe, weights=wgt)

Again, my question is, does anybody know of a function that combines both the ability to provide weighted descriptives with the ability to subdivide by the levels of some other variable?

Andrew Miles
Department of Sociology
Duke University
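The length mismatch described above suggests one possible workaround: store the weights in the same data frame as the variables, so that by() subsets them together. The sketch below uses base weighted.mean() rather than Hmisc's describe(), so it returns only a weighted mean per group; the column name `wgt` and the uniform weights are choices made for this illustration.

```r
data(ChickWeight)
set.seed(123)

# Keep the weights in the data frame so that by() subsets them with the rows.
cw <- as.data.frame(ChickWeight)
cw$wgt <- runif(nrow(cw), 0.5, 1.5)

# Each call receives the rows of one Diet, including that Diet's weights.
wtd_time <- by(cw, cw$Diet, function(d) weighted.mean(d$Time, d$wgt))
print(wtd_time)
```

Because the function receives a whole data frame per group, the variable and its weights always have the same length.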
Re: [R] Best advice for connect R and Octave
Charlie,

Thank you very much for your reply. I also read this earlier and noticed that the package was contributed in 2002 and has not been updated since, so I am afraid it has not been maintained and has long since been left behind by changes in both R and Octave.

Thus, I guess my search will continue for a way to access Octave's frequency domain analysis techniques (bode, nyquist, root locus, etc.) from within R, or I may have to access R from within Octave. I have not really looked at loading R into Octave on Windows just yet, but I guess that is the next thing to consider.

Thank you again for your reply and insights.

----- Original Message -----
From: cls59
To: r-help@r-project.org
Sent: Sat, November 14, 2009 4:29:18 PM
Subject: Re: [R] Best advice for connect R and Octave

Jason Rupert wrote:
>
> I see at one time there was a package called ROctave. I tried to install
> that package:
>
>> install.packages("ROctave")
> --- Please select a CRAN mirror for use in this session ---
> Warning message:
> In getDependencies(pkgs, dependencies, available, lib) :
>   package ‘ROctave’ is not available
>
> Unfortunately it appears that the package is no longer available. By any
> chance is there another package or series of steps that need to be
> followed to allow R to interface with Octave on the Windows platform (not
> using Cygwin)?
>

ROctave appears to be an Omegahat package-- and a rough one at that as it has not been loaded onto the Omegahat CRAN-style server. You can find info at:

http://www.omegahat.org/ROctave/

- Charlie

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
Re: [R] Best advice for connect R and Octave
See http://www.omegahat.org/ROctave/ which offers the source package for download and some documentation that is not too promising.

Good luck,
Uwe Ligges

Jason Rupert wrote:
> Uwe,
>
> Thank you for the quick response, but I think I'm missing what is being suggested about the Omegahat site. I think I may be overlooking something about that site. I tried:
>
> install.packages(ROctave, repos = "http://www.omegahat.org/R")
> Error in install.packages(ROctave, repos = "http://www.omegahat.org/R") :
>   object 'ROctave' not found
>
> Results were similar to trying the CRAN site, so can you provide any additional hints (I think the caffeine may be fogging my understanding of your previous hint).
>
> I would also be willing to try other alternatives for accessing Octave functionality within R. I am not locked into any approach at this point, but would really like, if possible, to stay in the R environment, while also needing access to some of the frequency domain plotting and analysis capability offered by Octave, e.g. bode, nyquist, root locus, etc. Given the other analysis and plotting capabilities within R, I would likely be switching between the two programs quite a bit, so having access to data in a common workspace would really help workflow.
>
> Thanks again and I guess no more Diet Coke for me today...C'est la vie...
>
> ----- Original Message -----
> From: Uwe Ligges
> To: Jason Rupert
> Cc: R-help@r-project.org
> Sent: Sat, November 14, 2009 4:17:03 PM
> Subject: Re: [R] Best advice for connect R and Octave
>
> - It has never been on CRAN.
> - A quick Google search suggests it is on Omegahat.
>
> Uwe Ligges
>
> Jason Rupert wrote:
>> I see at one time there was a package called ROctave. I tried to install that package:
>>
>> install.packages("ROctave")
>> --- Please select a CRAN mirror for use in this session ---
>> Warning message:
>> In getDependencies(pkgs, dependencies, available, lib) :
>>   package ‘ROctave’ is not available
>>
>> Unfortunately it appears that the package is no longer available. By any chance is there another package or series of steps that need to be followed to allow R to interface with Octave on the Windows platform (not using Cygwin)?
>>
>> Ideally the interface would allow R to make Octave calls. I am using Octave Version 3.2.3 installed from http://octave.sourceforge.net/.
>>
>> For example I would like to call the bode function in Octave from R:
>>
>> L = tf2sys(3e4 * [0.0025 0.1 1], [0.01 1.03 3.03 3.01 1]);
>> bode(L);
>>
>> Thanks for any feedback and insights.
Re: [R] Best advice for connect R and Octave
Uwe,

Thank you for the quick response, but I think I'm missing what is being suggested about the Omegahat site. I may be overlooking something about that site. I tried:

    > install.packages(ROctave, repos = "http://www.omegahat.org/R")
    Error in install.packages(ROctave, repos = "http://www.omegahat.org/R") :
      object 'ROctave' not found

Results were similar to trying the CRAN site, so can you provide any additional hints? (I think the caffeine may be fogging my understanding of your previous hint.)

I would also be willing to try other alternatives for accessing Octave functionality within R. I am not locked into any approach at this point. I would really like to stay in the R environment if possible, but I also need some of the frequency-domain plotting and analysis capability offered by Octave, e.g. bode, nyquist, root locus, etc. Given the other analysis and plotting capabilities within R, I would likely be switching between the two programs quite a bit, so having access to data in a common workspace would really help workflow.

Thanks again, and I guess no more Diet Coke for me today... C'est la vie...

----- Original Message -----
From: Uwe Ligges
To: Jason Rupert
Cc: R-help@r-project.org
Sent: Sat, November 14, 2009 4:17:03 PM
Subject: Re: [R] Best advice for connect R and Octave

- It has never been on CRAN.
- A quick Google search suggests it is on Omegahat.

Uwe Ligges

Jason Rupert wrote:
> I see at one time there was a package called ROctave. I tried to install that package:
>
>     > install.packages("ROctave")
>     --- Please select a CRAN mirror for use in this session ---
>     Warning message:
>     In getDependencies(pkgs, dependencies, available, lib) :
>       package ‘ROctave’ is not available
>
> Unfortunately it appears that the package is no longer available. By any chance is there another package or series of steps that need to be followed to allow R to interface with Octave on the Windows platform (not using Cygwin)?
>
> Ideally the interface would allow R to make Octave calls. I am using Octave Version 3.2.3 installed from http://octave.sourceforge.net/.
>
> For example I would like to call the bode function in Octave from R:
>
>     L = tf2sys(3e4 * [0.0025 0.1 1], [0.01 1.03 3.03 3.01 1]);
>     bode(L);
>
> Thanks for any feedback and insights.
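A side note on the error message in the message above: "object 'ROctave' not found" is what R typically reports when a bare name is evaluated and no object of that name exists. The package name probably needs to be passed as a quoted string. The sketch below only illustrates that difference in base R; it installs nothing, and the commented-out call is an assumption about what the intended command looked like, not a confirmed working recipe.

```r
# A bare name is looked up as a variable; a quoted name is just a string.
bare_result <- tryCatch(
  is.character(ROctave),               # ROctave is not defined in this session
  error = function(e) conditionMessage(e)
)
quoted_result <- is.character("ROctave")

print(bare_result)     # the error text, which mentions ROctave
print(quoted_result)   # TRUE

# The install call would therefore take the name as a string (not run here;
# according to the thread the package may not be on the Omegahat
# CRAN-style repository at all):
# install.packages("ROctave", repos = "http://www.omegahat.org/R")
```

Quoting the name only removes the "object not found" error; whether the repository actually serves the package is a separate question.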
Re: [R] Best advice for connect R and Octave
There is also this lexicon, which might be sufficient to allow you to rewrite the Octave routine in R:

http://cran.r-project.org/doc/contrib/R-and-octave.txt

On Sat, Nov 14, 2009 at 5:29 PM, cls59 wrote:
> Jason Rupert wrote:
>> I see at one time there was a package called ROctave. I tried to install that package:
>>
>>     > install.packages("ROctave")
>>     --- Please select a CRAN mirror for use in this session ---
>>     Warning message:
>>     In getDependencies(pkgs, dependencies, available, lib) :
>>       package ‘ROctave’ is not available
>>
>> Unfortunately it appears that the package is no longer available. By any chance is there another package or series of steps that need to be followed to allow R to interface with Octave on the Windows platform (not using Cygwin)?
>
> ROctave appears to be an Omegahat package, and a rough one at that, as it has not been loaded onto the Omegahat CRAN-style server. You can find info at:
>
> http://www.omegahat.org/ROctave/
>
> - Charlie
>
> -----
> Charlie Sharpsteen
> Undergraduate
> Environmental Resources Engineering
> Humboldt State University
Re: [R] Best advice for connect R and Octave
Jason Rupert wrote:
> I see at one time there was a package called ROctave. I tried to install that package:
>
>     > install.packages("ROctave")
>     --- Please select a CRAN mirror for use in this session ---
>     Warning message:
>     In getDependencies(pkgs, dependencies, available, lib) :
>       package ‘ROctave’ is not available
>
> Unfortunately it appears that the package is no longer available. By any chance is there another package or series of steps that need to be followed to allow R to interface with Octave on the Windows platform (not using Cygwin)?

ROctave appears to be an Omegahat package, and a rough one at that, as it has not been loaded onto the Omegahat CRAN-style server. You can find info at:

http://www.omegahat.org/ROctave/

- Charlie

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
Re: [R] Best advice for connect R and Octave
- It has never been on CRAN.
- A quick Google search suggests it is on Omegahat.

Uwe Ligges

Jason Rupert wrote:
> I see at one time there was a package called ROctave. I tried to install that package:
>
>     > install.packages("ROctave")
>     --- Please select a CRAN mirror for use in this session ---
>     Warning message:
>     In getDependencies(pkgs, dependencies, available, lib) :
>       package ‘ROctave’ is not available
>
> Unfortunately it appears that the package is no longer available. By any chance is there another package or series of steps that need to be followed to allow R to interface with Octave on the Windows platform (not using Cygwin)?
>
> Ideally the interface would allow R to make Octave calls. I am using Octave Version 3.2.3 installed from http://octave.sourceforge.net/.
>
> For example I would like to call the bode function in Octave from R:
>
>     L = tf2sys(3e4 * [0.0025 0.1 1], [0.01 1.03 3.03 3.01 1]);
>     bode(L);
>
> Thanks for any feedback and insights.
Re: [R] naive "collinear" weighted linear regression
Mauricio O Calvao wrote:
> Unfortunately you eschewed answering objectively any of my questions; I insist they do make sense. Don't mention the data are perfect; this does not help to make any progress in understanding the choice of convenient summary info the lm method provides, as compared to what, in my humble opinion and in this specific particular case, it should provide: the covariance matrix of the estimated coefficients...

The point is that R (as well as almost all other mainstream statistical software) assumes that a "weight" means that the variance of the corresponding observation is the general variance divided by the weight factor. The general variance is still determined from the residuals, and if they are zero to machine precision, well, there you go.

I suspect you get closer to the mark with glm, which allows you to assume that the dispersion is known:

    > summary(glm(y~x,family="gaussian"),dispersion=0.3^2)

or

    > summary(glm(y~x,family="gaussian",weights=1/error^2),dispersion=1)

-- 
Peter Dalgaard
Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, University of Copenhagen
PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
(p.dalga...@biostat.ku.dk)
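The known-dispersion suggestion can be sketched as follows. This is a minimal illustration, not a transcript from the thread: the x values are an assumption (x is never shown; 1:4 is simply consistent with intercept 0 and slope 2), and the comparison with the matrix formula (X'WX)^{-1} is added here only as a cross-check.

```r
# Assumed data: x is not given in the thread; 1:4 reproduces slope 2, intercept 0.
x     <- c(1, 2, 3, 4)
y     <- c(2, 4, 6, 8)
error <- rep(0.3, 4)           # stated standard deviation of each y_i

fit <- glm(y ~ x, family = gaussian, weights = 1 / error^2)

# With the dispersion fixed at 1, the weights are treated as exact inverse
# variances instead of being rescaled by the (here vanishing) residuals.
s_known <- summary(fit, dispersion = 1)
se_glm  <- s_known$coefficients[, "Std. Error"]

# Cross-check against the textbook formula (X' W X)^{-1}.
X <- cbind("(Intercept)" = 1, x = x)
W <- diag(1 / error^2)
cov_direct <- solve(t(X) %*% W %*% X)
se_direct  <- sqrt(diag(cov_direct))

print(se_glm)      # roughly 0.37 (intercept) and 0.13 (slope)
print(se_direct)   # the same values from the matrix formula
```

Under these assumptions the standard errors are of the order 0.1 to 0.4, which is the kind of figure the other programs mentioned in the thread are said to report.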
Re: [R] naive "collinear" weighted linear regression
On Nov 14, 2009, at 1:50 PM, Mauricio O Calvao wrote:

> David Winsemius writes:
>
>> Which means those x, y, and "error" figures did not come from an experiment, but rather from theory???
>
> The fact is I am trying to compare the results of:
> (1) lm under R and
> (2) the Java applet at http://omnis.if.ufrj.br/~carlos/applets/reta/reta.html
> (3) the Fit method of the ROOT system used by CERN,
> (4) the Numerical Recipes functions for weighted linear regression
>
> The three last ones all provide, for the "fake" data set I furnished in my first post in this thread, the same results; particularly they give errors or uncertainties in the estimated coefficients of intercept and slope which, as seems intuitive, are not zero at all, but of the order 0.1 or 0.2, whereas the method lm under R issues a "Std. Error", which is zero. Independently of terminology, which sure is of utmost importance, the data I provided should give rise to a best fit straight line with intercept zero and slope 2, but also with non-vanishing errors associated with them. How do I get this from lm?
>
> I only want, for instance, calculation of the so-called covariance matrix for the estimated coefficients, as given, for instance, in Equation (2.3.2) of the second edition of Draper and Smith, "Applied regression analysis"; this is a standard statistical result, right? So why does R not directly provide it as a summary from an lm object???

It's really not that difficult to get the variance covariance matrix. What is not so clear is why you think differential weighting of a set that has a perfect fit should give meaningfully different results than a fit that has no weights.

    ?lm
    ?vcov

    > y <- c(2,4,6,8) # response vect
    > fit_mod <- lm(y~x,weights=1/error^2)
    Error in eval(expr, envir, enclos) : object 'error' not found
    > error <- c(0.3,0.3,0.3,0.3)
    > fit_mod <- lm(y~x,weights=1/error^2)
    > vcov(fit_mod)
                  (Intercept)             x
    (Intercept)  2.396165e-30 -7.987217e-31
    x           -7.987217e-31  3.194887e-31

Numerically those are effectively zero.

    > fit_mod <- lm(y~x)
    > vcov(fit_mod)
                (Intercept) x
    (Intercept)           0 0
    x                     0 0

-- David.

>>> Of course the best fit coefficients should be 0 for the intercept and 2 for the slope. Furthermore, it seems completely plausible (or not?) that, since the y_i have associated non-vanishing ``errors'' (dispersions), there should be corresponding non-vanishing ``errors'' associated to the best fit coefficients, right?
>>>
>>> When I try:
>>>
>>>     fit_mod <- lm(y~x,weights=1/error^2)
>>>
>>> I get
>>>
>>>     Warning message:
>>>     In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>>>       extra arguments weigths are just disregarded.
>>
>> (Actually the weights are for adjusting for sampling, and I do not see any sampling in your "design".)
>>
>>> Keeping on, despite the warning message, which I did not quite understand, when I type:
>>>
>>>     summary(fit_mod)
>>>
>>> I get
>>>
>>>     Call:
>>>     lm(formula = y ~ x, weigths = 1/error^2)
>>>
>>>     Residuals:
>>>              1          2          3          4
>>>     -5.067e-17  8.445e-17 -1.689e-17 -1.689e-17
>>>
>>>     Coefficients:
>>>                  Estimate Std. Error   t value Pr(>|t|)
>>>     (Intercept) 0.000e+00  8.776e-17 0.000e+00        1
>>>     x           2.000e+00  3.205e-17 6.241e+16   <2e-16 ***
>>>     ---
>>>     Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>
>>>     Residual standard error: 7.166e-17 on 2 degrees of freedom
>>>     Multiple R-squared: 1, Adjusted R-squared: 1
>>>     F-statistic: 3.895e+33 on 1 and 2 DF, p-value: < 2.2e-16
>>>
>>> Naively, should not the column Std. Error be different from zero?? What I have in mind, and sure is not what Std. Error means, is that if I carried out a large simulation, assuming each response y_i a Gaussian random variable with mean y_i and standard deviation 2*error=0.6, and then making an ordinary least squares fitting of the slope and intercept, I would end up with a mean for these simulated coefficients which should be 2 and 0, respectively,
>>
>> Well, not precisely 2 and 0, but rather something very close ... i.e., within "experimental error". Please note that numbers in the range of 10e-17 are effectively zero from a numerical analysis perspective.
>>
>> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>>
>>     > .Machine$double.eps ^ 0.5
>>     [1] 1.490116e-08
>
> I know this all too well and it is obviously a trivial supernewbie issue, which I have already overcome a long time ago...
>
>>> and, that's the point, a non-vanishing standard deviation for these fitted coefficients, right?? This somehow is what I expected should be an estimate or, at least, a good indicator, of the degree of uncertainty which I should assign to the fitted coefficients; it seems to me these deviations, thus calculated as a result of the simulation, will certainly not be zero (or 3e-17, for that matter). So this Std. Error does not provide what I, naively, think should be given as a measure of the uncertainties or errors in the f
[R] Best way to model colonization: logistic regression vs. Poisson regression, or perhaps some other technique.
Please forgive a stats question.

I am trying to model colonization with a bacterium, with the aims of quantifying the overall colonization rate as well as determining risk factors for colonization in an in-hospital setting. Risk factors to be measured include type of contact with a patient (e.g. feeding vs. obtaining vital signs vs. wound care, etc.), length of contact, use of antibiotics, etc.

I believe there are two techniques that can be used to model colonization: logistic regression or Poisson regression. Either method can give me a measure of likelihood of transmission (the OR from logistic regression or a true rate from Poisson regression; n.b. the OR from logistic regression can be converted to a probability). Can anyone suggest reasons for using one technique or the other?

Thank you for your help and understanding about posting a stats question to the R listserver.

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] re move row if the column "date_abandoned" has a date in it
On Nov 14, 2009, at 1:21 PM, frenchcr wrote:

> I want to go through a column in data called

Bad name for a data.frame. Fortunes, "dog" and all that.

> date_abandoned, data["date_abandoned"], and remove all the rows that have numbers greater than 1,010,000.

Are you doing archeology? Given what you say next I wondered what range you were really asking for.

> The dates are in the format 20091114 so i'm just going to treat them as numbers for clean up purposes.
>
> I know that i use subset but not sure how to proceed from there.

    subdata <- subset(data, date_abandoned > "0101"()

The problem with > "101" is that your specified minimum point had an insufficient number of "places" to be in MMDD format.

-- 
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
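A minimal sketch of the subset() approach discussed in this thread, under stated assumptions: the data frame `records`, its `id` column, and the cutoff value are all hypothetical (the thread's data frame is called `data` and its intended cutoff is unclear). It keeps rows whose YYYYMMDD number is at or below the cutoff and also keeps rows with no date at all.

```r
# Hypothetical data frame standing in for the poster's `data`.
records <- data.frame(
  id             = 1:5,
  date_abandoned = c(20091114, 19991231, NA, 20010315, 20100101)
)

cutoff <- 20091231   # hypothetical cutoff, written as YYYYMMDD

# Numeric comparison works because YYYYMMDD values sort chronologically.
# Rows with NA are kept explicitly; subset() drops rows whose condition is NA.
kept <- subset(records, is.na(date_abandoned) | date_abandoned <= cutoff)

# Converting to Date afterwards makes later date arithmetic safer.
kept$date_abandoned <- as.Date(as.character(kept$date_abandoned),
                               format = "%Y%m%d")
print(kept)
```

The comparison is done on numbers, not on quoted strings, so both sides need the full eight digits for the ordering to be meaningful.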
[R] re move row if the column "date_abandoned" has a date in it
I want to go through a column in data called date_abandoned, data["date_abandoned"], and remove all the rows that have numbers greater than 1,010,000.

The dates are in the format 20091114, so I'm just going to treat them as numbers for cleanup purposes.

I know that I should use subset, but I'm not sure how to proceed from there.
Re: [R] naive "collinear" weighted linear regression
David Winsemius comcast.net> writes: > > Which means those x, y, and "error" figures did not come from an > experiment, but rather from theory??? > The fact is I am trying to compare the results of: (1) lm under R and (2) the Java applet at http://omnis.if.ufrj.br/~carlos/applets/reta/reta.html (3) the Fit method of the ROOT system used by CERN, (4) the Numerical Recipes functions for weighted linear regression The three last ones all provide, for the "fake" data set I furnished in my first post in this thread, the same results; particularly they give erros or uncertainties in the estimated coefficients of intercept and slope which, as seems intuitive, are not zero at all, but of the order 0.1 or 0.2, whereas the method lm under R issues a "Std. Error", which is zero. Independently of terminology, which sure is of utmost importance, the data I provided should give rise to a best fit straight line with intercept zero and slope 2, but also with non-vanishing errors associated with them. How do I get this from lm I only want, for instance, calculation of the so-called covariance matrix for the estimated coefficients, as given, for instance, in Equation (2.3.2) of the second edition of Draper and Smith, "Applied regression analysis"; this is a standard statistical result, right? So why does R not directly provide it as a summary from an lm object??? > > > > Of course the best fit coefficients should be 0 for the intercept > > and 2 for the slope. Furthermore, it seems completely plausible (or > > not?) that, since the y_i have associated non-vanishing > > ``errors'' (dispersions), there should be corresponding non- > > vanishing ``errors'' associated to the best fit coefficients, right? > > > > When I try: > > > > > fit_mod <- lm(y~x,weights=1/error^2) > > > > I get > > > > Warning message: > > In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : > > extra arguments weigths are just disregarded. 
> > (Actually the weights are for adjusting for sampling, and I do not > see any sampling in your "design".) > > > > > Keeping on, despite the warning message, which I did not quite > > understand, when I type: > > > > > summary(fit_mod) > > > > I get > > > > Call: > > lm(formula = y ~ x, weigths = 1/error^2) > > > > Residuals: > > 1 2 3 4 > > -5.067e-17 8.445e-17 -1.689e-17 -1.689e-17 > > > > Coefficients: > > Estimate Std. Error t value Pr(>|t|) > > (Intercept) 0.000e+00 8.776e-17 0.000e+001 > > x 2.000e+00 3.205e-17 6.241e+16 <2e-16 *** > > --- > > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > > > > Residual standard error: 7.166e-17 on 2 degrees of freedom > > Multiple R-squared: 1, Adjusted R-squared: 1 > > F-statistic: 3.895e+33 on 1 and 2 DF, p-value: < 2.2e-16 > > > > > > Naively, should not the column Std. Error be different from zero?? > > What I have in mind, and sure is not what Std. Error means, is that > > if I carried out a large simulation, assuming each response y_i a > > Gaussian random variable with mean y_i and standard deviation > > 2*error=0.6, and then making an ordinary least squares fitting of > > the slope and intercept, I would end up with a mean for these > > simulated coefficients which should be 2 and 0, respectively, > > Well, not precisely 2 and 0, but rather something very close ... i.e, > within "experimental error". Please note that numbers in the range of > 10e-17 are effectively zero from a numerical analysis perspective. > > http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f > > > .Machine$double.eps ^ 0.5 > [1] 1.490116e-08 I know this all too well and it is obviously a trivial supernewbie issue, which I have already overcome a long time ago... > > > and, that's the point, a non-vanishing standard deviation for these > > fitted coefficients, right?? 
This somehow is what I expected should > > be an estimate or, at least, a good indicator, of the degree of > > uncertainty which I should assign to the fitted coefficients; it > > seems to me these deviations, thus calculated as a result of the > > simulation, will certainly not be zero (or 3e-17, for that matter). > > So this Std. Error does not provide what I, naively, think should be > > given as a measure of the uncertainties or errors in the fitted > > coefficients... > > You are trying to impose an error structure on a data situation that > you constructed artificially to be perfect. > > > > > What am I not getting right?? > > That if you input "perfection" into R's linear regression program, you > get appropriate warnings? > > > > > Thanks and sorry for the naive and non-expert question! > > You are a Professor of physics, right? You do experiments, right? You > replicate them. S0 perhaps I'm the one who should be puzzled. Unfortunately you eschewed answering objectively any of my
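The simulation described in this message can be sketched in a few lines. Everything numeric here is an assumption made for illustration: x = 1:4 (x is never shown in the thread), a noise standard deviation of 0.3 for each response, an arbitrary seed, and 2000 replications.

```r
set.seed(1)                    # arbitrary seed, for reproducibility only
x      <- c(1, 2, 3, 4)        # assumed; x is not shown in the thread
mean_y <- 2 * x                # the "perfect" responses 2, 4, 6, 8
sd_y   <- 0.3                  # assumed noise level for each y_i

# Refit the straight line on many noisy copies of the data.
n_sim <- 2000
coefs <- replicate(n_sim, {
  y_sim <- rnorm(length(x), mean = mean_y, sd = sd_y)
  coef(lm(y_sim ~ x))
})

sim_mean <- rowMeans(coefs)        # close to intercept 0 and slope 2
sim_sd   <- apply(coefs, 1, sd)    # spread of the fitted coefficients

# Theoretical counterpart for known noise: sd_y^2 * (X'X)^{-1}
X         <- cbind(1, x)
theory_sd <- sqrt(diag(sd_y^2 * solve(t(X) %*% X)))

print(sim_mean)
print(sim_sd)
print(theory_sd)
```

The spread of the simulated coefficients reflects the noise level that was put in by hand, whereas the "Std. Error" column from lm is estimated from the residuals of a single fit, which are zero for perfectly collinear data.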
Re: [R] vignettes: .png graphics or pre-compiled .pdf
Yes, I also wish Sweave could give us more flexible options, e.g. it should not be difficult to free the graphics device specification as an R function (pdf, png, CairoPDF, ...) instead of just letting us set pdf=T/F and eps=T/F. If we don't want to hack the Sweave code, we may also rewrite it as a package. This has been in my mind for a long time. Regards, Yihui -- Yihui Xie Phone: 515-294-6609 Web: http://yihui.name Department of Statistics, Iowa State University 3211 Snedecor Hall, Ames, IA 2009/11/14 Michael Friendly : > Thanks, Yihui > Your solution, for png(), only looks dirty because you had to hack the Sweave > code. > It would be nice to have png() support included directly. > > Yihui Xie wrote: >> >> I was reminded that the attachments were blocked by the list, so I >> send these links again: >> >> http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.Rnw >> http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.r >> http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.pdf >> >> Regards, >> Yihui >> -- >> Yihui Xie >> Phone: 515-294-6609 Web: http://yihui.name >> Department of Statistics, Iowa State University >> 3211 Snedecor Hall, Ames, IA >> >> >> >> On Fri, Nov 13, 2009 at 8:31 PM, Yihui Xie wrote: >> >>> >>> Hi Michael, >>> >>> I have a dirty solution as attached to use png() for Sweave. >>> >>> HTH. >>> >>> Regards, >>> Yihui >>> -- >>> Yihui Xie >>> Phone: 515-294-6609 Web: http://yihui.name >>> Department of Statistics, Iowa State University >>> 3211 Snedecor Hall, Ames, IA >>> >>> >>> On Fri, Nov 13, 2009 at 10:02 AM, Michael Friendly >>> wrote: >>> In a package I'm working on there is a vignette with a number of graphs that result in huge .pdf files, so the .pdf for the vignette is around 17 Mb. If these graphs are converted to .png, and the .tex file is compiled with pdflatex, the resulting .pdf is ~1 Mb. I'm reluctant to put the .Rnw file into the package as is, generating the huge .pdf for the vignette. 
I first tried installing the smaller .pdf file in the package by itself (no .Rnw) together with a file inst/doc/index.html as recommended in 'Writing R Extensions.' However, when the package is installed, vignette() can't find it > > vignette(package="Guerry") > no vignettes found > > vignette("MultiSpat") > Warning message: vignette 'MultiSpat' *not* found Alternatively, is there a way to generate .png graphs from the .Rnw file so that those are used in building the .pdf for the package? AFAICS, \SweaveOpts{} offers only the choices of eps/pdf = {TRUE/FALSE}. -Michael -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. >> >> > > > -- > Michael Friendly Email: frien...@yorku.ca Professor, Psychology Dept. > York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 > 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html > Toronto, ONT M3J 1P3 CANADA > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] setting contrasts for a logistic regression
w_poet wrote:
> Hi everyone,
>
> I'm doing a logistic regression with an ordinal variable. I'd like to set the contrasts on the ordinal variable. However, when I set the contrasts, they work for ordinary linear regression (lm), but not logistic regression (lrm):
>
>     ddist = datadist(bin.time, exp.loc)
>     options(datadist='ddist')
>     contrasts(exp.loc) = contr.treatment(3, base = 3, contrasts = TRUE)
>     lrm.loc = lrm(bin.time ~ exp.loc, data = Dataset)
>
> In this case, lrm still uses exp.loc = 1 as the base, at least in terms of notation, even though I set exp.loc = 3 as the base. Is there a way to set contrasts for lrm?
>
> Thanks for any advice,
> Stephen

In the Design package and its replacement, the rms package, the package wants control of the contrasts used during fitting. But one should not, in my view, be too concerned with this, as after-the-fit contrasts are simple to get using the contrast.rms or contrast.Design functions. They use the philosophy that getting predicted values is the safest way to go because you don't need to keep track of contrasts/coding. The summary and plot functions in rms and Design are also helpful here.

Frank

-- 
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
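The remark that predicted values do not depend on the coding can be illustrated in base R. Note the substitution: this sketch uses glm() with a binomial family as a stand-in for lrm(), and relevel() instead of contrasts(), so it shows the general principle and not the Design/rms interface. The data are simulated and purely hypothetical.

```r
set.seed(42)                         # arbitrary seed
exp.loc  <- factor(sample(1:3, 120, replace = TRUE))
bin.time <- rbinom(120, 1, prob = c(0.3, 0.5, 0.7)[as.integer(exp.loc)])
dat <- data.frame(bin.time, exp.loc)

# Default treatment coding: level 1 is the reference.
fit1 <- glm(bin.time ~ exp.loc, family = binomial, data = dat)

# Same model with level 3 as the reference.
dat$exp.loc3 <- relevel(dat$exp.loc, ref = "3")
fit3 <- glm(bin.time ~ exp.loc3, family = binomial, data = dat)

# The coefficients answer different questions under the two codings,
# but the fitted probabilities agree.
print(coef(fit1))
print(coef(fit3))
max_diff <- max(abs(fitted(fit1) - fitted(fit3)))
print(max_diff)
```

Any contrast of interest can then be formed from predicted values after the fit, which is the approach recommended in the reply above.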
[R] formatting dates in axis labels (ggplot2)
I'm having trouble figuring out how to format Date variables when used as axis labels in graphs. The particular case here is an attempt to re-create Nightingale's coxcomb graph with ggplot2, where I'd like the months to be labeled as "Mar 1885", "Apr 1885", using a date format of "%b %Y" applied to label the dates, or really anything other than "1885-03-01". I know the solution has to do with formatting the dates, while preserving their status as an ordered factor, but I don't know how to do that. Here's a subset of the data: Night1 <- structure(list(Date = structure(c(-42278, -42248, -42217, -42187, -42156, -42125, -42095, -42064, -42034, -42003, -41972, -42278, -42248, -42217, -42187, -42156, -42125, -42095, -42064, -42034, -42003, -41972, -42278, -42248, -42217, -42187, -42156, -42125, -42095, -42064, -42034, -42003, -41972), class = "Date"), Cause = c("Disease", "Disease", "Disease", "Disease", "Disease", "Disease", "Disease", "Disease", "Disease", "Disease", "Disease", "Wounds", "Wounds", "Wounds", "Wounds", "Wounds", "Wounds", "Wounds", "Wounds", "Wounds", "Wounds", "Wounds", "Other", "Other", "Other", "Other", "Other", "Other", "Other", "Other", "Other", "Other", "Other"), Deaths = c(1.4, 6.2, 4.7, 150, 328.5, 312.2, 197, 340.6, 631.5, 1022.8, 822.8, 0, 0, 0, 0, 0.4, 32.1, 51.7, 115.8, 41.7, 30.7, 16.3, 7, 4.6, 2.5, 9.6, 11.9, 27.7, 50.1, 42.8, 48, 120, 140.1), Regime = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L ), .Label = c("Before", "After"), class = c("ordered", "factor" ))), .Names = c("Date", "Cause", "Deaths", "Regime"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L), class = "data.frame") > str(Night1) 'data.frame': 33 obs. of 4 variables: $ Date :Class 'Date' num [1:33] -42278 -42248 -42217 -42187 -42156 ... 
$ Cause : chr "Disease" "Disease" "Disease" "Disease" ... $ Deaths: num 1.4 6.2 4.7 150 328.5 ... $ Regime: Ord.factor w/ 2 levels "Before"<"After": 1 1 1 1 1 1 1 1 1 1 ... > Here are a few things I've tried, some of which give errors and others of which simply give the wrong graph library(ggplot2) cxc1 <- ggplot(Night1, aes(x = factor(Date), y=Deaths, fill = Cause)) + # do it as a stacked bar chart first geom_bar(width = 1, position="identity", color="black") + # set scale so area ~ Deaths scale_y_sqrt() # A coxcomb plot = bar chart + polar coordinates cxc1 + coord_polar(start=3*pi/2) + opts(title="Causes of Mortality in the Army in the East") + xlab("") # why doesn't this work? cxc1 <- cxc1 + scale_x_date(format="%b %Y", major="months") cxc1 stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this. OK, I tried formatting Date first, in different ways. Each time, I get a graphical result, but I don't know how to use format() for dates to make the result ordered as normal dates, rather than as character strings. Night1$dt1 <- format(Night1$Date, "%b %Y") cxc1 <- ggplot(Night1, aes(x = factor(dt1), y=Deaths, fill = Cause)) + geom_bar(width = 1, position="identity", color="black") + scale_y_sqrt() cxc1 + coord_polar(start=3*pi/2) + opts(title="Causes of Mortality in the Army in the East") + xlab("") -Michael -- Michael Friendly Email: frien...@yorku.ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
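One possible way to get "%b %Y" labels that still sort in calendar order is to build the factor levels from the sorted dates instead of letting factor() sort the label text alphabetically. This is a sketch in base R only, using three of the date values from the post in a deliberately scrambled order; it is not a confirmed fix for the scale_x_date() call.

```r
# A few of the dates from the post, deliberately out of order.
d <- structure(c(-42217, -42278, -42248, -42278), class = "Date")

labels_chr <- format(d, "%b %Y")

# Levels come from the sorted unique dates, so the factor follows calendar
# order and not the alphabetical order of the label text.
lev <- format(sort(unique(d)), "%b %Y")
dt1 <- factor(labels_chr, levels = lev, ordered = TRUE)

print(levels(dt1))
print(as.integer(dt1))   # position of each date in calendar order
```

With the full data, the same idea would give `Night1$dt1`, which could then be mapped with `aes(x = dt1, ...)` in place of `factor(dt1)`. Month abbreviations produced by "%b" depend on the session locale.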
[R] Best advice for connect R and Octave
I see at one time there was a package called ROctave. I tried to install that package:

    > install.packages("ROctave")
    --- Please select a CRAN mirror for use in this session ---
    Warning message:
    In getDependencies(pkgs, dependencies, available, lib) :
      package ‘ROctave’ is not available

Unfortunately it appears that the package is no longer available. By any chance is there another package or series of steps that need to be followed to allow R to interface with Octave on the Windows platform (not using Cygwin)?

Ideally the interface would allow R to make Octave calls. I am using Octave Version 3.2.3 installed from http://octave.sourceforge.net/.

For example I would like to call the bode function in Octave from R:

    L = tf2sys(3e4 * [0.0025 0.1 1], [0.01 1.03 3.03 3.01 1]);
    bode(L);

Thanks for any feedback and insights.
Re: [R] Problems by saving Rprofile.site under vista
You may have to run R as Administrator (right-click, choose "Run as administrator") to make these kinds of changes. After you have things the way you like them, run R in the usual way by clicking on the icon.

Charles Annis, P.E.
charles.an...@statisticalengineering.com
561-352-9699
http://www.StatisticalEngineering.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of anna_l
Sent: Friday, November 13, 2009 11:46 AM
To: r-help@r-project.org
Subject: [R] Problems by saving Rprofile.site under vista

Hello, I am trying to save some changes I have made to Rprofile.site under Vista, and it doesn't let me save the file, saying that it can't create the file (Rprofile.site) and that I should check the file path or the file name.
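Before editing, it may help to check from inside R where the site profile lives and whether the current session can write to it. The sketch below assumes the default location under R's `etc` directory; a per-user `.Rprofile` in the home directory is shown as an alternative that usually needs no administrator rights. The R documentation notes that `file.access()` can be unreliable on Windows, so treat the result as a hint only.

```r
# Default location of the site-wide profile (assumed, not taken from the thread).
site_file <- file.path(R.home("etc"), "Rprofile.site")

# Per-user profile, usually editable without elevated rights.
user_file <- file.path(path.expand("~"), ".Rprofile")

# file.access() returns 0 when the requested access (mode 2 = write) is allowed.
site_writable <- file.exists(site_file) &&
  file.access(site_file, mode = 2) == 0

cat("Site profile:", site_file, "\n")
cat("Writable by this session:", site_writable, "\n")
cat("Per-user alternative:", user_file, "\n")
```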
Re: [R] vignettes: .png graphics or pre-compiled .pdf
Thanks, Yihui Your solution, for png(), only looks dirty because you had to hack the Sweave code. It would be nice to have png() support included directly. Yihui Xie wrote: I was reminded that the attachments were blocked by the list, so I send these links again: http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.Rnw http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.r http://yihui.name/en/wp-content/uploads/2009/11/Sweave2.pdf Regards, Yihui -- Yihui Xie Phone: 515-294-6609 Web: http://yihui.name Department of Statistics, Iowa State University 3211 Snedecor Hall, Ames, IA On Fri, Nov 13, 2009 at 8:31 PM, Yihui Xie wrote: Hi Michael, I have a dirty solution as attached to use png() for Sweave. HTH. Regards, Yihui -- Yihui Xie Phone: 515-294-6609 Web: http://yihui.name Department of Statistics, Iowa State University 3211 Snedecor Hall, Ames, IA On Fri, Nov 13, 2009 at 10:02 AM, Michael Friendly wrote: In a package I'm working on there is a vignette with a number of graphs that result in huge .pdf files, so the .pdf for the vignette is around 17 Mb. If these graphs are converted to .png, and the .tex file is compiled with pdflatex, the resulting .pdf is ~1 Mb. I'm reluctant to put the .Rnw file into the package as is, generating the huge .pdf for the vignette. I first tried installing the smaller .pdf file in the package by itself (no .Rnw) together with a file inst/doc/index.html as recommended in 'Writing R Extensions.' However, when the package is installed, vignette() can't find it vignette(package="Guerry") no vignettes found vignette("MultiSpat") Warning message: vignette 'MultiSpat' *not* found Alternatively, is there a way to generate .png graphs from the .Rnw file so that those are used in building the .pdf for the package? AFAICS, \SweaveOpts{} offers only the choices of eps/pdf = {TRUE/FALSE}. -Michael -- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. 
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA -- Michael Friendly Email: frien...@yorku.ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] refactoring in R
Hi Peng, I'm wondering which eclipse I shall download to use with StatET. Would you please let me know? http://www.eclipse.org/downloads/ This depends on your needs other than R programming. As you can see, there are Eclipse Packages targeted at C/C++ developers, PHP developers, Java EE developers etc. Whatever Eclipse Package you choose, Eclipse is a highly modular (component-based) platform and you can install (or remove) any plug-ins you want after installation. The StatET plug-ins will be the first additional ones you want to install. http://www.walware.de/it/statet/installation.mframe For general use, I would just take Eclipse Classic. HTH, Tobias On Sat, Nov 14, 2009 at 5:46 AM, Tobias Verbeke wrote: Hi Peng, Some of the refactoring methods I identified back then were integrated into Eclipse/StatET in the mean time. StatET by the way contains some extensions that were not in the original proposal on that website. For the announcement of the latest release, see http://lists.r-forge.r-project.org/pipermail/statet-user/2009-September/000208.html It is advisable to use it with the rJava version referenced in the announcement (as rJava 0.8.* had some non-backwards-compatible API changes). A new StatET version (a.o. adapted to rJava 0.8.x) is likely to be released on short notice. If you want to keep up to date, there is a dedicated mailing list at https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/statet-user HTH, Tobias P.S. The refactoring methods are available under the Source menu, and there is one [simple rename] made available as a QuickFix (Ctrl+1). Peng Yu wrote: I found the examples of how to change the code for each refactoring activity. Are there tools that can help automate this process? On Fri, Nov 13, 2009 at 9:16 PM, milton ruser wrote: Hi Peng, If that information is preliminary, so I guess you have a more clear problem and may be you are able to state a minimally reproducible code/example with what you really need. 
Bests milton On Fri, Nov 13, 2009 at 9:30 PM, Peng Yu wrote: I'm wondering if there are some tips for refactoring in R. I found the following website, which is still preliminary. Is there any program that can help me do refactoring in R? http://www.r-developer.org/projects/show/refactoring __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R and Python
There appear to be win32 binaries for the current release of rpy2. L. On Nov 4, 4:21 pm, Gabor Grothendieck wrote: As far as I know, the latest versions of neither RSPython nor rpy2 support Windows. For accessing SymPy (which is a Python computer algebra system) from R, rSymPy went with Jython. It's slower than CPython, particularly at startup, but it should work on all platforms. See http://rsympy.googlecode.com The latest version is 0.1-4. If Ruby is an option, see the rinruby project for accessing R from Ruby. There is a paper on it at jstatsoft.org. If Java is an option, see the Rserve package for accessing R from Java. In the other direction, the rJava package can be used to access Java from R. On Mon, Nov 2, 2009 at 11:37 AM, Ryan Sheftel wrote: I am a long-time user of R for financial time series analysis (xts, zoo, etc.) and for my next project I am thinking of adding the Python language to the mix. The reason for adding Python is primarily its non-statistical capabilities. So my question is: what have people's experiences been with interop between R and Python? I see there are two items, rPy and RSPython. It looks like rPy makes it possible to call R code from Python, and RSPython goes both ways. My needs would be to use Python to drive R to get its extensive time series and stats, and also to get to Python objects from inside R. I searched the list archives and it looks like many people use rPy, but what about RSPython for calling Python from R? It looks like rPy only goes from Python into R, and RSPython has not been updated since 2005? Other messages in the archives state that RSPython only works on Unix? Would I be foolish to build anything mission critical on RSPython? Is there a better way to get at Python from R? Thanks as always. 
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cleanse columns and unwanted rows
The full code and error message i get is... > cleanse <- function(a){ + data1<-a + for (i in 1:dim(data1)[1]) + { + if (data1[i,"legal_status"] == "Private"){ + data1[i,"legal_status"]<-data1[-i,] + if (data1[i,"legal_status"] == "Private (Op"){ + data1[i,"legal_status"]<-data1[-i,] + if (data1[i,"legal_status"] == "Unknown"){ + data1[i,"legal_status"]<-data1[-i,] + } +} + } + } + return(data1) + } > new_data<-cleanse(data) Error in if (data1[i, "legal_status"] == "Private (Op") { : missing value where TRUE/FALSE needed In addition: There were 50 or more warnings (use warnings() to see the first 50) > frenchcr wrote: > > hello folks, > > Im trying to clean out a large file with data i dont need. > The column im manipulating in the file is called "legal_status" > There are three kinds of rows i want to remove. Those that have "Private", > "Private (Op", or "Unknown" in the legal_status column. > > > I wrote this code but i get errors and it says im missing a TRUE/ False > thingy...im lost...heres the code... > > > > cleanse <- function(a){ > data1<-a > > for (i in 1:dim(data1)[1]) > { > if (data1[i,"legal_status"] == "Private") > { > data1[i,"legal_status"]<-data1[-i,"legal_status"] > } > if (data1[i,"legal_status"] == "Private (Op"){ > data1[i,"legal_status"]<-data1[-i,"legal_status"] > } > if (data1[i,"legal_status"] == "Unknown"){ > data1[i,"legal_status"]<-data1[-i,"legal_status"] > } > } > > return(data1) > } > new_data<-cleanse(data) > > > > > Any ideas? > -- View this message in context: http://old.nabble.com/cleanse-columns-and-unwanted-rows-tp26342169p26350857.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
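The error message quoted above typically means that the condition handed to `if` evaluated to `NA`, for instance because `legal_status` is missing in some row or was overwritten with a mismatched replacement inside the loop. The following is only a minimal sketch with made-up values (not the poster's data) of why `==` can fail there while `%in%` does not:

```r
# Hypothetical stand-in for one column of the poster's data
legal_status <- c("Private", NA, "Public")

# `==` propagates NA, and if() cannot branch on NA
cmp <- legal_status == "Private"          # TRUE NA FALSE
err <- tryCatch(
  if (cmp[2]) "drop" else "keep",
  error = function(e) conditionMessage(e) # the error text, instead of stopping
)

# %in% treats NA as "no match", so it is safe in if() and for subsetting
safe <- legal_status %in% "Private"       # TRUE FALSE FALSE
```

Whether missing values are really the cause in the poster's file cannot be told from the thread; the sketch only shows the mechanism behind the message.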
Re: [R] cleanse columns and unwanted rows
The solution is much simpler (thanks Phil!) new_data = data[!data$"legal status" %in% c("Private","Private (Op","Unknown"),] ...works nicely. frenchcr wrote: > > hello folks, > > Im trying to clean out a large file with data i dont need. > The column im manipulating in the file is called "legal_status" > There are three kinds of rows i want to remove. Those that have "Private", > "Private (Op", or "Unknown" in the legal_status column. > > > I wrote this code but i get errors and it says im missing a TRUE/ False > thingy...im lost...heres the code... > > > > cleanse <- function(a){ > data1<-a > > for (i in 1:dim(data1)[1]) > { > if (data1[i,"legal_status"] == "Private") > { > data1[i,"legal_status"]<-data1[-i,"legal_status"] > } > if (data1[i,"legal_status"] == "Private (Op"){ > data1[i,"legal_status"]<-data1[-i,"legal_status"] > } > if (data1[i,"legal_status"] == "Unknown"){ > data1[i,"legal_status"]<-data1[-i,"legal_status"] > } > } > > return(data1) > } > new_data<-cleanse(data) > > > > > Any ideas? > -- View this message in context: http://old.nabble.com/cleanse-columns-and-unwanted-rows-tp26342169p26350874.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
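Note that the one-liner above refers to a column named `"legal status"` (with a space), while the original question calls it `legal_status`; which one is right depends on the actual file. The sketch below assumes the underscore spelling and uses made-up rows to show how the vectorised filter replaces the whole loop:

```r
# Made-up data; the column name legal_status is taken from the original question
data <- data.frame(
  id = 1:6,
  legal_status = c("Public", "Private", "Private (Op", "Unknown", NA, "Charity"),
  stringsAsFactors = FALSE
)

unwanted <- c("Private", "Private (Op", "Unknown")

# Keep rows whose legal_status is not one of the unwanted values.
# Rows with NA are kept, because NA %in% unwanted is FALSE.
new_data <- data[!(data$legal_status %in% unwanted), ]

new_data$id   # 1 5 6
```

If rows with a missing status should go as well, adding `& !is.na(data$legal_status)` to the condition would be one way to do it.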
Re: [R] Silently loading an R package.
Yihui Xie wrote: please read the 'Details' section of ?require To suppress messages during the loading of packages use 'suppressPackageStartupMessages': this will suppress all messages from R itself but not necessarily all those from package authors. Regards, Yihui Thank you so much... Guillaume Yziquel. -- Yihui Xie Phone: 515-294-6609 Web: http://yihui.name Department of Statistics, Iowa State University 3211 Snedecor Hall, Ames, IA On Fri, Nov 13, 2009 at 6:02 PM, Guillaume Yziquel wrote: Hello. I've been working on a binding between OCaml and R (i.e. calling R from OCaml, mostly). See below for a taste of it. I'm currently wondering how to load a given R package silently. I tried require(xts, quietly = TRUE) but I still get some ugly output. Is it possible to suppress this output on stdout? All the best, Guillaume Yziquel. yziq...@seldon:~$ ocaml-batteries Objective Caml version 3.11.1 (Batteries Included; type '#help;;') # #require "R.interpreter";; # R.sexp "require(xts)";; Le chargement a nécessité le package : xts Le chargement a nécessité le package : zoo Attachement du package : 'zoo' The following object(s) are masked from package:base : as.Date.numeric xts now requires a valid TZ variable to be set no TZ var is set, setting to TZ=GMT - : R.sexp = # -- Guillaume Yziquel http://yziquel.homelinux.org/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
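The advice above can be sketched as a small wrapper. `load_quietly` is a hypothetical helper name, not something from base R or from the thread, and `stats` is used only because it ships with every R installation:

```r
# Hypothetical helper: attach a package with as little console output as possible
load_quietly <- function(pkg) {
  suppressWarnings(
    suppressPackageStartupMessages(
      require(pkg, character.only = TRUE, quietly = TRUE)
    )
  )
}

ok <- load_quietly("stats")   # "stats" ships with R, so this should succeed
```

As the quoted help text says, this only covers output that is signalled as a startup message (plus warnings here). Text that a package writes directly with `cat()` or `print()` would still appear and would need something like `capture.output()` around the call.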
Re: [R] A combinatorial optimization problem: finding the best permutation of a complex vector
Hi, I have solved the problem that I had posed before. Here is a statement of the problem: "I have a complex-valued vector X in C^n. Given another complex-valued vector Y in C^n, I want to find a permutation of Y, say, Y*, that minimizes ||X - Y*||, the distance between X and Y*. " I was talking to Professor Moody T. Chu, who is a well-known numerical analyst from NC State Univ, and he pointed out that this problem is an instance of the classical "linear sum assignment problem (LSAP)" in discrete mathematics. Once this was revealed to me, it didn't take me long to find out the existence of various algorithms (e.g. Hungarian algorithm) and codes (C, Fortran, Matlab) for solving this problem. I also looked in the CRAN task view on optimization and found that the LSAP solver is present in the "clue" package. Thanks to Kurt Hornik for this package. So, here is an illustration of the "amazing" power of mathematics: n <- 1000 x <- rt(n, df=3) + 1i * rt(n, df=3) # this is the target vector to be matched y <- x[sample(n)] # this is the vector to be permuted # Note: I have chosen a random permutation of the target so that I know the answer is "x" itself # and the minimum distance is zero Cmat <- outer(x, y, FUN=function(x,y) Mod(x - y)) require(clue) ans <- solve_LSAP(Cmat, maximum=FALSE) # We are minimizing the linear sum dist <- function(x, y) sqrt(sum(Mod(x - y)^2)) dist(x, y[c(ans)]) This is remarkable. It takes only about 0.3 seconds to solve this difficult combinatorial problem! Best, Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: "Charles C. 
Berry" Date: Thursday, November 12, 2009 2:20 pm Subject: Re: [R] A combinatorial optimization problem: finding the best permutation of a complex vector To: Ravi Varadhan Cc: r-help@r-project.org > On Thu, 12 Nov 2009, Ravi Varadhan wrote: > > > > > Hi, > > > > I have a complex-valued vector X in C^n. Given another > complex-valued > > vector Y in C^n, I want to find a permutation of Y, say, Y*, that > > minimizes ||X - Y*||, the distance between X and Y*. > > > > Note that this problem can be trivially solved for "Real" vectors, > since > > real numbers possess the ordering property. Complex numbers, > however, do > > not possess this property. Hence the challenge. > > > > The obvious solution is to enumerate all the permutations of Y and > pick > > out the one that yields the smallest distance. This, however, is > only > > feasible for small n. I would like to be able to solve this for n > as > > large as 100 - 1000, in which cases the permutation approach is > > infeasible. > > > > I am looking for algorithms, possibly iterative, that can provide a > > > "good" approximate solutions that are not necessarily optimal for > > high-dimensional vectors. I can do random sampling, but this can be > very > > inefficient in high-dimensional problems. I am looking for > efficient > > algorithms because this step has to be performed in each iteration > of an > > "outer" algorithm. > > > > Are there any clever adaptive algorithms out there? > > > > I do not know. > > But would you settle for a not-so-clever adaptive heuristic? > > If so, see below. 
> > > > > Here is an example illustrating the problem: > > > > require(e1071) > > > > n <- 8 > > x <- runif(n) + 1i * runif(n) > > y <- runif(n) + 1i * runif(n) > > > > dist <- function(x, y) sqrt(sum(Mod(x - y)^2)) > > > > perms <- permutations(n) > > dim(perms) # [1] 40320 8 > > tmp <- apply(perms, 1, function(ord) dist(x, y[ord])) > > z <- y[perms[which.min(tmp), ]] # exact solution > > dist(x, z) > > > > # an aproximate random-sampling approach > > nsamp <- 1 > > perms <- t(replicate(nsamp, sample(1:n, size=n, replace=FALSE))) > > tmp <- apply(perms, 1, function(ord) dist(x, y[ord])) > > z.app <- y[perms[which.min(tmp), ]] # approximate solution > > dist(x, z.app) > > > > The heuristic is to use a stochastic greedy updates. Here is a very > simple > one: > > swap.samp <- > function(index) { > sub.ind <- sample(seq(along=index),2) > index[sub.ind]<- rev(sub.ind) > index > } > > > z.app <- y > z.cand <- y > > for (i in 1:100) z.cand <- > if( dist(x,z.app) < dist(x,z.cand) ) { > > z.app[swap.samp(1:8)] > > } else { > z.app <- z.cand > z.cand[swap.samp(1:8)] > } > > On your toy example, this usually finds the min(dist(x,z.app)) in < > 100 > trials. > > Note that when > > z.diff <- z.app != z.cand > > dist(x[ z.diff ],z.app[ z.diff ])^2 - dist(x[ z.diff ],z.cand[ > z.diff ])^2 > > equals > > dist(x,z.app)^2 - dist(x,z.cand)^2 > > so you could vectorize the above to randomly pair up all the poi
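The identity noted above (a swap changes only the terms of the positions involved) suggests a simple local search. The following is a sketch of that idea in base R, not Charles Berry's code: it sweeps over all pairs deterministically instead of sampling them, and `dist2` and `swap_search` are names made up for this illustration. It can stop in a local optimum, so it is a heuristic and not a replacement for an exact assignment solver.

```r
# Squared distance between two complex vectors
dist2 <- function(x, y) sum(Mod(x - y)^2)

# Hypothetical helper: improve the matching of y to x by pairwise swaps
swap_search <- function(x, y, max_sweeps = 100) {
  n <- length(x)
  ord <- seq_len(n)                      # current permutation of y
  for (sweep in seq_len(max_sweeps)) {
    improved <- FALSE
    for (i in seq_len(n - 1)) {
      for (j in (i + 1):n) {
        # Only the two affected terms of the sum need to be compared
        before <- Mod(x[i] - y[ord[i]])^2 + Mod(x[j] - y[ord[j]])^2
        after  <- Mod(x[i] - y[ord[j]])^2 + Mod(x[j] - y[ord[i]])^2
        if (after < before - 1e-12) {
          ord[c(i, j)] <- ord[c(j, i)]   # accept the swap
          improved <- TRUE
        }
      }
    }
    if (!improved) break                 # no swap helps any more
  }
  ord
}

set.seed(42)
n <- 8
x <- runif(n) + 1i * runif(n)
y <- runif(n) + 1i * runif(n)
ord <- swap_search(x, y)
c(start = dist2(x, y), after_swaps = dist2(x, y[ord]))
```

Each accepted swap strictly lowers the squared distance, so the search cannot cycle; the `1e-12` tolerance is an arbitrary guard against floating-point ties.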
Re: [R] refactoring in R
I'm wondering which eclipse I shall download to use with StatET. Would you please let me know? http://www.eclipse.org/downloads/ On Sat, Nov 14, 2009 at 5:46 AM, Tobias Verbeke wrote: > Hi Peng, > > Some of the refactoring methods I identified back > then were integrated into Eclipse/StatET in the > mean time. > > StatET by the way contains some extensions that were > not in the original proposal on that website. > > For the announcement of the latest release, see > > http://lists.r-forge.r-project.org/pipermail/statet-user/2009-September/000208.html > > It is advisable to use it with the rJava version > referenced in the announcement (as rJava 0.8.* > had some non-backwards-compatible API changes). > > A new StatET version (a.o. adapted to rJava 0.8.x) > is likely to be released on short notice. > > If you want to keep up to date, there is a dedicated > mailing list at > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/statet-user > > HTH, > Tobias > > P.S. The refactoring methods are available under the Source menu, > and there is one [simple rename] made available as a QuickFix (Ctrl+1). > > Peng Yu wrote: >> >> I found the examples of how to change the code for each refactoring >> activity. Are there tools that can help automate this process? >> >> On Fri, Nov 13, 2009 at 9:16 PM, milton ruser >> wrote: >>> >>> Hi Peng, >>> >>> If that information is preliminary, so I guess you >>> have a more clear problem and may be you are able to >>> state a minimally reproducible code/example with >>> what you really need. >>> >>> Bests >>> >>> milton >>> >>> >>> On Fri, Nov 13, 2009 at 9:30 PM, Peng Yu wrote: I'm wondering if there are some tips for refactoring in R. I found the following website, which is still preliminary. Is there any program that can help me do refactoring in R? 
http://www.r-developer.org/projects/show/refactoring __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] linear model and by()
On Nov 13, 2009, at 11:49 AM, Sam Albers wrote: Hello R list, snipped answered question Sorry to not use your data but it's not in a form that lends itself very well to quick testing. If you had included the input commands I might have tried it. No problem not use my data. For future reference, would it have been easier to attach a .csv file and then include the appropriate read.csv command? I realized that the easier one makes it to help, the easier it is to get a response. The Posting Guide suggests you include dump("x", file=stdout()) I have a simple x object in my workspace: > dump("x", file=stdout()) x <- c(0, 0, 1, 1, 1, 1) After reading again the Posting Guide, I am not sure about csv attachments, but I can try a simple test. I am removing your address (which probably would have accepted the attachment) and leaving only the r-help address. If you see an attachment then csv's are accepted. snipped earlier data -- * Sam Albers Geography Program David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
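To illustrate why `dump()` output is convenient for helpers, here is a small round trip using the same `x` as in the message. It writes to a temporary file instead of `stdout()` only so that the result can be read back in the same session:

```r
x <- c(0, 0, 1, 1, 1, 1)

# Write R code that recreates x; a temporary file stands in for the email body
f <- tempfile(fileext = ".R")
dump("x", file = f, envir = environment())

# A reader can recreate the object by sourcing that code
env <- new.env()
source(f, local = env)
same <- identical(env$x, x)   # TRUE if the round trip preserved the object

# dput() is an alternative that prints only the code for the value
dput(x)
```

Pasting the printed code into a message lets others rebuild the object without any attachment.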
[R] a problem about GPD distribution fit
If I want to fit my data using gpd(data) from an extreme value theory package, how can I fit the lower tail of my data? The gpd function seems to have only an upper threshold, so to fit the lower tail I have to use gpd(-data). Can I fit the lower tail using just gpd(data)? Thank you! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
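Negating the data, as the poster describes, works because the lower tail of a sample is the upper tail of its negative. The sketch below uses base R only and simulated values; it does not call any `gpd()` function, since the thread does not say which package is meant or whether it offers a lower-tail option. The threshold choice is an arbitrary assumption.

```r
set.seed(1)
data <- rnorm(1000)               # made-up sample standing in for the poster's data

neg <- -data                      # lower tail of data == upper tail of -data
u <- quantile(neg, 0.95)          # hypothetical threshold: 95th percentile of -data
exceedances <- neg[neg > u] - u   # positive excesses that a GPD would be fitted to

# Results obtained for -data map back to data by negation,
# e.g. a tail quantile q of -data corresponds to -q for data.
length(exceedances)
```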
Re: [R] refactoring in R
Hi Peng, Some of the refactoring methods I identified back then were integrated into Eclipse/StatET in the mean time. StatET by the way contains some extensions that were not in the original proposal on that website. For the announcement of the latest release, see http://lists.r-forge.r-project.org/pipermail/statet-user/2009-September/000208.html It is advisable to use it with the rJava version referenced in the announcement (as rJava 0.8.* had some non-backwards-compatible API changes). A new StatET version (a.o. adapted to rJava 0.8.x) is likely to be released on short notice. If you want to keep up to date, there is a dedicated mailing list at https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/statet-user HTH, Tobias P.S. The refactoring methods are available under the Source menu, and there is one [simple rename] made available as a QuickFix (Ctrl+1). Peng Yu wrote: I found the examples of how to change the code for each refactoring activity. Are there tools that can help automate this process? On Fri, Nov 13, 2009 at 9:16 PM, milton ruser wrote: Hi Peng, If that information is preliminary, so I guess you have a more clear problem and may be you are able to state a minimally reproducible code/example with what you really need. Bests milton On Fri, Nov 13, 2009 at 9:30 PM, Peng Yu wrote: I'm wondering if there are some tips for refactoring in R. I found the following website, which is still preliminary. Is there any program that can help me do refactoring in R? http://www.r-developer.org/projects/show/refactoring __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R and HDF5 Question
Hi, You can also read the hdf5 files with the rgdal package. This loads them into sp-objects, see the sp-package for more info. In the archives of the r-sig-geo mailing list there have been some other people (including myself :)) that have asked this question: https://stat.ethz.ch/pipermail/r-sig-geo/2009-January/004828.html http://markmail.org/message/ypsr77vl3qscq72f#query:r-sig-geo%20read%20hdf5+page:1+mid:ivkt5qxroeh3z646+state:results http://www.mail-archive.com/r-sig-...@stat.math.ethz.ch/msg01871.html cheers, Paul Scott MacDonald wrote: That did it, boy do I feel silly. Thanks! On Fri, Nov 13, 2009 at 10:16 PM, Berwin A Turlach wrote: G'day Scott, On Fri, 13 Nov 2009 09:52:43 -0700 Scott MacDonald wrote: I am trying to load an hdf5 file into R and running into some problems. It's a while that I used hdf5 files and that package in R, but: This builds fine. The library seems to load without issue, but no data is returned when I try to load a file: > library(hdf5) > hdf5load("test.h5") > NULL Is NULL the return of the hdf5load command or are you typing it on the command line? Anyway, .hdf5 files can contain several objects, just as R's .rda file. load() will load an .rda file and put all objects in that file into the workspace. Likewise, hdf5load() loads an hdf5 file and puts all objects in that file into the workspace. Yet, osx:data scott$ h5dump test.h5 HDF5 "test.h5" { GROUP "/" { DATASET "dset" { DATATYPE H5T_STD_I32LE DATASPACE SIMPLE { ( 31 ) / ( 31 ) } DATA { (0): 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, (14): 16384, 32768, 65536, 131072, 262144, 524288, 1048576, 2097152, (22): 4194304, 8388608, 16777216, 33554432, 67108864, 134217728, (28): 268435456, 536870912, 1073741824 } } } } Any thoughts? Did you try an ls() after the hdf5load() command? If the hdf5load() command was successfull, an ls() should show you that an object with name "dset" is now in your workspace; if I read the output above correctly. HTH. 
Cheers, Berwin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Drs. Paul Hiemstra Department of Physical Geography Faculty of Geosciences University of Utrecht Heidelberglaan 2 P.O. Box 80.115 3508 TC Utrecht Phone: +3130 274 3113 Mon-Tue Phone: +3130 253 5773 Wed-Fri http://intamap.geo.uu.nl/~paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Change working directory
On Fri, Nov 13, 2009 at 6:24 PM, Don MacQueen wrote: > In R for Macintosh, there is a Preferences setting that will do this. > You can also drag and drop a file onto the R icon and I believe it will > change the working directory to the directory that contains the file. > > On unix-like systems, using the command line, it's whatever directory you > start R in. > > I don't use R on Windows, so I don't know there, but I imagine there may be > a preferences setting, or perhaps the drag and drop method works. Or maybe > create a shortcut in the directory you want to be the working directory? On Windows if you start R using Rgui.bat from the command line then it will start in whatever directory you were in (as in UNIX). Rgui.bat is a single file with no dependencies so just copy it to any directory on your path. It will automatically find R using the registry and then start it up. Rgui.bat is found in the batchfiles distribution whose home page is: http://batchfiles.googlecode.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Change working directory
anna_l wrote: Hello, I am using setwd() to change the working directory but I have to enter it everytime I open R, is there a way to set this permanently as a working directory? Thanx =^D Hi Anna, I create a .First function that is run when the session starts that looks like this: .First<-function () { options(editor="nedit",show.signif.stars=FALSE) source("SelectAnalysis.R") } This runs the file "SelectAnalysis.R" that looks like this: cat("(B)ullying\n(P)alatability\nR\n") answer<-toupper(readline("Enter the letter corresponding to the project - ")) if(answer == "B") setwd("/home/jim/research/bullying/R") if(answer == "P") setwd("/home/jim/research/palatability_heavydrink/R") if(answer == "R") setwd("/home/jim/R") print(list.files(pattern="[.]R")) I can then select whatever analysis I happen to be working on with a single letter (and newline) and see all of the ".R" and ".Rdata" files in that directory. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
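For a single fixed directory, a non-interactive variant of the `.First` approach above might look like this. `set_default_wd` is a made-up helper name and the path in the comment is only a placeholder:

```r
# Hypothetical helper: switch to `path` only if it exists, and report the outcome
set_default_wd <- function(path) {
  if (dir.exists(path)) {
    setwd(path)
    TRUE
  } else {
    warning("directory not found, working directory unchanged: ", path)
    FALSE
  }
}

# In a startup file such as .Rprofile one might then write (placeholder path):
# .First <- function() set_default_wd("~/projects/analysis")
```

Checking that the directory exists first avoids an error at startup if the folder has been moved or renamed.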
Re: [R] shrink list by mathed entries
On 14.11.2009, at 03:58, David Winsemius wrote: On Nov 13, 2009, at 11:19 AM, soeren.vo...@eawag.ch wrote: a <- c("Mama", "Papa", "Papa; Mama", "", "Sammy; Mama; Papa") a <- strsplit(a, "; ") mama <- rep(F, length(a)) mama[sapply(a, function(x) { sum(x=="Mama") }, simplify=T) > 0] <- T [...] ... produces the variables "mama" and "papa" correctly. But how do I remove all "Mama" list entries [...] Maybe you should explain what you were trying to do? Perhaps: > a[!mama] [...] I would sidestep that confusing sequence of logical assignments and just do this: > a[ -grep("Mama", a) ] [...] Explanation of what I want to do: This code is PHP, maybe rather crude but it works the way I want it and explains my goal: #!/usr/bin/php $strings = array("Mama", "Papa", "Papa; Mama", "", "Sammy; Mama; Papa", "Josh", "Mama"); $vars = array("Mama", "Papa", "Sammy"); $i=0; foreach($strings as $line){ $line = explode("; ", $line); $matches = array_intersect($line, $vars); $diffs = array_diff($line, $vars); foreach($matches as $match){ eval("\$$match"."["."$i"."] = 1;"); // no easier way } foreach($diffs as $diff){ $others[$i] = $diff; } $i++; } print_r($Mama); // array with elements 0, 2, 4, and 6 set to "1" print_r($Papa); // array with elements 1, 2, and 4, set to "1" print_r($Sammy); // array with element 4 set to "1" print_r($others); // array with elements 3 set to "", and 5 set to "Josh" ?> Sören __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
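One possible R rendering of the PHP logic above, offered as a sketch and not as the answer given on the list. It builds one logical column per name instead of separate variables, and positions are 1-based where PHP's are 0-based:

```r
strings <- c("Mama", "Papa", "Papa; Mama", "", "Sammy; Mama; Papa", "Josh", "Mama")
vars    <- c("Mama", "Papa", "Sammy")

parts <- strsplit(strings, "; ", fixed = TRUE)

# One logical column per name in vars: TRUE where that name occurs in the entry
ind <- sapply(vars, function(v) vapply(parts, function(p) v %in% p, logical(1)))

# Everything in an entry that is not one of vars
others <- lapply(parts, setdiff, vars)

which(ind[, "Mama"])    # 1 3 5 7
which(ind[, "Papa"])    # 2 3 5
which(ind[, "Sammy"])   # 5
others[[6]]             # "Josh"
```

One difference from the PHP version: the empty string splits to `character(0)` in R, so `others[[4]]` is empty instead of holding `""`.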
[R] setting contrasts for a logistic regression
Hi everyone, I'm doing a logistic regression with an ordinal variable. I'd like to set the contrasts on the ordinal variable. However, when I set the contrasts, they work for ordinary linear regression (lm), but not logistic regression (lrm): ddist = datadist(bin.time, exp.loc) options(datadist='ddist') contrasts(exp.loc) = contr.treatment(3, base = 3, contrasts = TRUE) lrm.loc = lrm(bin.time ~ exp.loc, data = Dataset) In this case, lrm still uses exp.loc = 1 as the base, at least in terms of notation, even though I set exp.loc = 3 as the base. Is there a way to set contrasts for lrm? Thanks for any advice, Stephen -- View this message in context: http://old.nabble.com/setting-contrasts-for-a-logistic-regression-tp26347831p26347831.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
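A workaround sometimes used when a fitting function ignores the `contrasts` attribute is to reorder the factor so that the desired reference level comes first. The sketch below demonstrates this with base R's `glm()` on simulated data; whether `lrm()` picks its reference level the same way is an assumption that this thread does not confirm, and the variable `exp.loc3` is introduced here only for illustration.

```r
set.seed(123)
# Made-up data standing in for bin.time and exp.loc
exp.loc  <- factor(sample(1:3, 200, replace = TRUE))
bin.time <- rbinom(200, 1, prob = c(0.3, 0.5, 0.7)[as.integer(exp.loc)])

# Make level "3" the first (reference) level
exp.loc3 <- relevel(exp.loc, ref = "3")

fit <- glm(bin.time ~ exp.loc3, family = binomial)
names(coef(fit))   # coefficients for levels 1 and 2, each relative to level 3
```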
Re: [R] R and HDF5 Question
That did it, boy do I feel silly. Thanks! On Fri, Nov 13, 2009 at 10:16 PM, Berwin A Turlach wrote: > G'day Scott, > > On Fri, 13 Nov 2009 09:52:43 -0700 > Scott MacDonald wrote: > > > I am trying to load an hdf5 file into R and running into some > > problems. > > It's a while that I used hdf5 files and that package in R, but: > > > This builds fine. The library seems to load without issue, but no > > data is returned when I try to load a file: > > > > > library(hdf5) > > > hdf5load("test.h5") > > > NULL > > Is NULL the return of the hdf5load command or are you typing it on the > command line? > > Anyway, .hdf5 files can contain several objects, just as R's .rda > file. load() will load an .rda file and put all objects in that file > into the workspace. Likewise, hdf5load() loads an hdf5 file and puts > all objects in that file into the workspace. > > > Yet, > > > > osx:data scott$ h5dump test.h5 HDF5 "test.h5" { GROUP > > "/" { DATASET "dset" { DATATYPE H5T_STD_I32LE DATASPACE SIMPLE > > { ( 31 ) / ( 31 ) } DATA { (0): 1, 2, 4, 8, 16, 32, 64, 128, 256, > > 512, 1024, 2048, 4096, 8192, (14): 16384, 32768, 65536, 131072, > > 262144, 524288, 1048576, 2097152, (22): 4194304, 8388608, 16777216, > > 33554432, 67108864, 134217728, (28): 268435456, 536870912, > > 1073741824 } } } } > > > > Any thoughts? > > Did you try an ls() after the hdf5load() command? If the hdf5load() > command was successfull, an ls() should show you that an object with > name "dset" is now in your workspace; if I read the output above > correctly. > > HTH. > > Cheers, > >Berwin > > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
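Berwin's comparison with `.rda` files can be tried without any HDF5 library. This sketch uses base R's `save()` and `load()` as the analogy he describes (it does not call `hdf5load()`), with a vector equal to the 31 values shown in the `h5dump` output:

```r
# A dataset like the one shown by h5dump: the 31 powers of two
dset <- as.integer(2^(0:30))

f <- tempfile(fileext = ".rda")
save(dset, file = f)

# load() returns (invisibly) only the NAMES of the restored objects;
# the objects themselves appear in the target environment
env <- new.env()
loaded <- load(f, envir = env)

loaded      # "dset"
ls(env)     # "dset"
```

So a call that seems to return nothing useful may still have created objects; listing the workspace with `ls()` is the quickest check.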