[R] define variables from rows of a matrix
I have the following matrix and wish to define two variables based on its rows:

A=matrix(0,5,5)
A[1,]=c(30,20,100,120,90)
A[2,]=c(40,30,20,50,100)
A[3,]=c(50,50,40,30,30)
A[4,]=c(30,20,40,50,50)
A[5,]=c(30,50,NA,NA,100)

A
     [,1] [,2] [,3] [,4] [,5]
[1,]   30   20  100  120   90
[2,]   40   30   20   50  100
[3,]   50   50   40   30   30
[4,]   30   20   40   50   50
[5,]   30   20   NA   NA  100

I want to define two variables:

X is the first column in each row that is equal to 20. For example, for the 1st row I need X=2; 2nd row, X=3; 3rd row, X>5; 4th row, X=2; 5th row, X=NA.

Y is then the first column in each row that is equal to 100, provided a 20 has been reached before it. For example, for the 1st row, Y=3; 2nd row, Y=5; 3rd row, Y=NA; 4th row, Y>5; 5th row, Y=NA.

The matrix may contain NA as well. How can I define these two variables quickly? (When X>5 or Y>5, we can arbitrarily assign the value 6, and this is different from being NA.)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to quickly convert a data.frame into a structure of lists
Hi

Something to get you started: ?as.list

A data.frame can be regarded as a 2-dimensional array of list vectors.

df = data.frame(a=1:2,b=2:1,c=4:5,d=9:10)
as.list(df[,1:3])
$a
[1] 1 2

$b
[1] 2 1

$c
[1] 4 5

See also http://cran.ms.unimelb.edu.au/doc/contrib/Burns-unwilling_S.pdf

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
ARMIDALE NSW 2351
Email: home mac...@northnet.com.au

At 10:58 10/08/2011, you wrote:
> Hello,
>
> This is my first project in R, so I'm trying to work 'the R way', but it still feels awkward sometimes.
>
> The problem that I'm facing right now is that I need to convert a data.frame into a structure of lists. The data.frame has columns in the order of tens (I need to focus on only three of them) and rows in the order of millions. So it's quite a big dataset.
>
> Let's say that the columns of interest are A, B and C. I need to take the data.frame and construct a structure of lists where I have a list for every level of A, those lists all contain lists for every level of B, and the 'b-lists' contain all the values of C that match the corresponding levels of A and B.
>
> So, I should be able to write something like this:
> MyData@list_structure$x_level_of_A$y_level_of_B
> and get a vector of the values of C that were on rows where A=x_level_of_A and B=y_level_of_B.
>
> My first attempt was to use two nested lapply functions running something like this:
>
> list_structure <- lapply(levels(A), function(x) {
>   as.character(x) = lapply(levels(B), function(y) {
>     as.character(y) = C[A==x & B==y]
>   })
> })
>
> The real code was not quite as simple, but I managed to have it work, and it worked well on my first dataset (where A and B had only a few levels). I was quite happy... but the nested loops killed me on a second dataset where A had several thousand levels. So I tried something else.
>
> My second attempt was to go through every row of the data.frame and append the value to the appropriate vector. I first initialized a structure of lists ending with NULL vectors, then I did something like this:
>
> for (i in 1:nrow(DataFrame)) {
>   eval( substitute(
>     append(MyData@list_structure$a_value$b_value, c_value),
>     list(a_value=as.character(DF$A[i]), b_value=as.character(DF$B[i]), c_value=as.character(DF$C[i]))
>   ) )
> }
>
> This works... but way too slowly for my purpose.
>
> I would like to know if there is a better road to take to do this transformation, or if there is a way of speeding up one of the two solutions that I have tried.
>
> Thank you very much for your help!
>
> (And in your replies, please remember that this is my first project in R, so don't hesitate to state the obvious if it seems like I am missing it!)
>
> Frederic
>
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-quickly-convert-a-data-frame-into-a-structure-of-lists-tp3731746p3731746.html
> Sent from the R help mailing list archive at Nabble.com.
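One possible way to build the nested structure described in the question without explicit loops is to split twice. This is only a minimal sketch on made-up data: the data frame `df` and its values are invented for illustration, and only the column names A, B and C come from the question.

```r
# Toy data; only the column names A, B and C come from the question.
df <- data.frame(
  A = factor(c("a1", "a1", "a2", "a2", "a1")),
  B = factor(c("b1", "b2", "b1", "b1", "b1")),
  C = c(10, 20, 30, 40, 50)
)

# Split the rows by level of A, then split each piece's C values by level of B.
nested <- lapply(split(df[c("B", "C")], df$A),
                 function(d) split(d$C, d$B))

nested$a1$b1   # C values on rows where A == "a1" and B == "b1"
```

Because B is a factor, every level of B appears in each inner list, with an empty vector for combinations that do not occur; `split(d$C, d$B, drop = TRUE)` would omit those. Whether this is fast enough on millions of rows would need to be timed on the real data.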
Re: [R] Can R handle a matrix with 8 billion entries?
On Tue, 9 Aug 2011, Peter Langfelder wrote:

>> Assuming you need the full distance matrix at one time (which you do not for hierarchical clustering, itself a highly dubious method for more than a few hundred points).
>
> Apologies if this hijacks the thread, but why is hierarchical clustering highly dubious for more than a few hundred points?

That is off-topic for R-help: see the posting guide.

> Peter

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[R] subqueries in sqlQuery function (package RODBC)
Hi R users.

Sorry for the missing example, and if the question is too general, but I am wondering whether it is possible to execute subqueries in the function sqlQuery (package RODBC) with an open connection to Excel or SQL Server 2000. I couldn't find any example of this. And if it is possible, what would be the correct syntax for this query:

SELECT ct,COUNT(*) as n FROM (SELECT COUNT(*) AS ct FROM children GROUP BY family_id) AS x GROUP BY ct;

sqlQuery(connecton, CORRECT SYNTAX )

(This query is an example from the book Data Manipulation with R, Phil Spector, page 47.)

Thanks for any help

Andrija
Re: [R] glmnet
Hi Andra.

I wonder how you came to be using LASSO without knowing what lambda is. I'd advise you to read up on it. In the help (?glmnet) you can find several paper references, but for a more gentle introduction, you can read http://www-stat.stanford.edu/~tibs/ElemStatLearn/

In a nutshell, though: lambda is the parameter that balances the weight given to the penalty. The bigger it is, the more 'pressure' there is on the coefficients to be small (or better yet: disappear).

The way you use LASSO is: you look at a reasonable set of lambda values (this is e.g. done by glmnet) and calculate some measure of success for each lambda value (e.g. misclassification, AUC, ...), generally by using cross-validation (as is provided by cv.glmnet: read its help).

Having this measure of success (say the AUC) for each lambda in your reasonable set allows you to pick the optimal one (lambda.min) or, to avoid happenstance peaks, a more conservative and parsimonious one (lambda.1se), after which you can rerun your lasso with this selected lambda on the full dataset, to find the variables in your model.

Finally, to avoid downward bias, you could run a normal glm with only the variables selected in the previous step.

Good luck!

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Andra Isan
Sent: Wednesday 10 August 2011 5:59
To: r-help@r-project.org
Subject: [R] glmnet

> Hi All,
>
> I have been trying to use the glmnet package to do LASSO linear regression. My x data is a matrix of n_row by n_col and y is a vector of size n_row corresponding to the vector data. The number n_col is much larger than the number n_row. I do the following:
>
> fits = glmnet(x, y, family="multinomial")
>
> I have been following this article: http://cran.r-project.org/web/packages/glmnet/glmnet.pdf page 8, but there are some unclear parts that I don't understand. The lambda variable only returns 100 and I don't know exactly what lambda represents.
>
> So, basically I would like to know how to get the coefficient weights and what exactly lambda is? How can I see the difference between predicted values and observed values? If there is sample code that helps me to understand how to use these, that would be great.
>
> Thanks a lot,
> Andra
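The workflow described in the reply above (cross-validate over a lambda path, pick lambda.min or lambda.1se, read off the coefficients, compare predictions with observations) might look roughly like the following. This is a sketch on simulated data, not the poster's data: the dimensions, the seed and the true coefficients are invented, and `family = "gaussian"` is assumed because the question speaks of linear regression. The code only runs if the glmnet package is installed.

```r
# Sketch on simulated data; all numbers here are made up for illustration.
if (requireNamespace("glmnet", quietly = TRUE)) {
  library(glmnet)
  set.seed(1)
  n <- 50; p <- 200                        # many more columns than rows
  x <- matrix(rnorm(n * p), n, p)
  y <- x[, 1] - 2 * x[, 2] + rnorm(n)      # only two columns truly matter

  # Cross-validated lasso over glmnet's default path of lambda values
  cvfit <- cv.glmnet(x, y, family = "gaussian")
  print(c(lambda.min = cvfit$lambda.min, lambda.1se = cvfit$lambda.1se))

  # Coefficients at the more conservative lambda; most are exactly zero
  b <- as.matrix(coef(cvfit, s = "lambda.1se"))
  print(b[b[, 1] != 0, , drop = FALSE])

  # Predicted versus observed values on the training data
  pred <- predict(cvfit, newx = x, s = "lambda.1se")
  print(head(cbind(observed = y, predicted = as.vector(pred))))
}
```

The "100" mentioned in the question is most likely the length of the default lambda path that glmnet fits, one model per lambda value, rather than a single value of lambda.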
Re: [R] simple plot question
Hi,

thanks a lot for pointing me at conditional plotting! I have to confess that I'm still not really convinced that this philosophy holds true in each and every situation, especially when there appears to be a consensus in the literature (even if it is not optimal) to depict such data as requested!

For the conventional plot: thanks for the newline trick!

I will play around with the ggplot library and dive a little deeper into the literature, perhaps I have looked at the wrong papers!

Best
Maxim

2011/8/10 R. Michael Weylandt <michael.weyla...@gmail.com>:
> Hi Maxim,
>
> I notice no one has replied to you (on list at least) so I'll take a stab at answering your question and giving some productive advice.
>
> I believe the axis command will do what you want with a little tweaking. It certainly lines things up for me.
>
> x <- data.frame(cell=paste("line",c("a","a","b","b")), treat=paste("treat",c(1,2,1,2)), value=c(4,3,8,11))
> # Next time please provide data that can be directly entered
> plot(x$value, xaxt="n")
> axis(1, at=1:4, label=paste(as.character(x$cell), "\n", as.character(x$treat), sep=""))
>
> That said, I'd recommend against it. This sort of data with a bivariate+categorical x-axis really isn't best viewed in this manner: in fact, it's not really well viewed in this manner at all. Rather, I'd strongly suggest that you use some sort of conditional plotting: either R's built-in coplot() function or (even better) the ggplot2 or lattice libraries. These two packages are truly outstanding and are both well documented on the web, but for just a silly little taste, try this:
>
> library(lattice)
> x <- data.frame(x1 = sample(1:6,25,replace=T), x2 = sample(1:6,25,replace=T))
> x <- data.frame(x, y = x$x1 + x$x2 + runif(25)*3)
> with(x, xyplot(y~x1|x2))
>
> # Compare to this plot where no information can be gleaned
> plot(x$y, xaxt="n")
> axis(1, at=1:25, label=paste(x$x1, "\n", x$x2, sep=""))
>
> Hopefully this shows you how the idea of conditioning on an independent variable can yield a more easily interpreted graph. There are many great examples of these two packages and I'd highly recommend them for this sort of plot.
>
> Hope this helps,
>
> Michael Weylandt
>
> On Tue, Aug 9, 2011 at 4:48 PM, Maxim <deeeperso...@googlemail.com> wrote:
>> Hi,
>>
>> please excuse the most likely very trivial question, but I have no idea where to find related information:
>>
>> I am trying to reproduce the very simple plotting behavior of Excel within R but have no clue how to get where I want.
>>
>> I have tab-delimited data like
>>
>> cell    treatment  value
>> line a  treat1     4
>> line a  treat2     3
>> line b  treat1     8
>> line b  treat2     11
>>
>> I'd like to have a plot (barplot) that specifies 2 scales on the x-axis (cell and treatment condition). In future this might become more complex, so basically I'd like to have a table/matrix as x-axis! Where do I have to look for working examples? I really spent a lot of time studying graph galleries.
>>
>> Wanted: the same look that you get when marking the above data within Excel and selecting barplot! I have no clue what my search term should look like in order to find the necessary information.
>>
>> The only thing I can get to work is to generate a second x-axis at position 3:
>>
>> read.delim(file='test') -> x
>> plot(x$value, xaxt="n")
>> axis(3,1:4,x$treatment)
>> axis(1,1:4,x$cell)
>>
>> Not nice, but ok! Unfortunately this does not work with barplot as the axis does not align with the bars!
>>
>> plot(x$value, xaxt="n", beside=T)
>>
>> Any help is appreciated!
>>
>> Regards
>> Maxim
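On the point that the axis does not align with the bars: barplot() returns the x positions of the bar midpoints, and those can be passed to axis(). The sketch below re-enters the four rows from the question by hand (instead of reading the file 'test') and shows two base-graphics options; the variable names `mids` and `grouped` are introduced here only for illustration.

```r
# The four rows from the question, typed in directly.
x <- data.frame(
  cell      = c("line a", "line a", "line b", "line b"),
  treatment = c("treat1", "treat2", "treat1", "treat2"),
  value     = c(4, 3, 8, 11)
)

# Option 1: barplot() returns the bar midpoints; use them as axis positions
# and put treatment and cell on two text lines under each bar.
mids <- barplot(x$value, ylab = "value")
axis(1, at = mids, labels = paste(x$treatment, x$cell, sep = "\n"),
     tick = FALSE, line = 0.5)

# Option 2: reshape to a treatment-by-cell table and let barplot() group
# the bars by cell, with a legend for treatment.
grouped <- xtabs(value ~ treatment + cell, data = x)
barplot(grouped, beside = TRUE, legend.text = TRUE)
```

Option 2 is probably closer to what Excel draws for this kind of table, since the bars are grouped by cell line rather than labelled one by one.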
Re: [R] [Solved] Re: lavaan: how to analyse residuals of a latent variable
> Maybe this kind of usage of lavaan is not very common, but in order to help others in my situation, is this documented somewhere? My understanding of latent variable analysis is indeed limited, but I did not understand that lavaan worked like this when I read the documentation.

This is not specific to lavaan; the same strategy would work in other (commercial) software as well. But of course, lavaan needs better documentation. If only there was more time...

Yves Rosseel
http://lavaan.org
Re: [R] subqueries in sqlQuery function (package RODBC)
In what sense is this a 'subquery'? It is just an SQL command (write it on one line, no terminating ;, which is not part of the query).

On Wed, 10 Aug 2011, andrija djurovic wrote:

> Hi R users.
>
> sorry for missing example and if question is to general but I am wondering if it is possible to execute subqueries in function sqlQuery (package RODBC) with opened connection with Excel or SQL server 2000. I couldn't find any example of this. And if it is possible what should be a correct syntax for this query:
>
> SELECT ct,COUNT(*) as n FROM (SELECT COUNT(*) AS ct FROM children GROUP BY family_id) AS x GROUP BY ct;
>
> sqlQuery(connecton, CORRECT SYNTAX )
>
> (This query is an example from book Data Manipulation with R, Phil Spector, page 47)
>
> Thanks for any help
>
> Andrija

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
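Following the advice above, the query can be passed to sqlQuery() as a single character string without the trailing semicolon. In the sketch below, paste() is used only to keep the source readable; the result is still one line of SQL. The function name `count_family_sizes` is made up for illustration, and `channel` is assumed to be a connection already opened with RODBC (for example via odbcConnect()); nothing here is executed against a database.

```r
# Build the query as one string: paste() joins the pieces with single spaces,
# so no newline and no terminating semicolon ends up in the SQL.
query <- paste(
  "SELECT ct, COUNT(*) AS n",
  "FROM (SELECT COUNT(*) AS ct FROM children GROUP BY family_id) AS x",
  "GROUP BY ct"
)

# Hypothetical wrapper; `channel` must be an open RODBC connection.
count_family_sizes <- function(channel) {
  RODBC::sqlQuery(channel, query)
}
```

Whether the derived-table syntax is accepted depends on the ODBC driver behind the connection (Excel and SQL Server may differ), so this should be read as the R side of the call only.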
Re: [R] define variables from rows of a matrix
Hi,

I was confused by your printed last row of A, which, unlike its definition, contains a 20. Anyway, how about this:

y <- x <- rep(NA, nrow(A))
# it's not clear whether multiple values of 100 can occur in a single
# row, and what to do when 100 is found before and after 20, so you may
# alter the indexing as you need:
ind <- subset(merge(which(A==20,arr.ind=T), which(A==100,arr.ind=T), by="row", all.x=T), col.x<col.y | is.na(col.y))
x[ind$row] <- ind$col.x
y[ind$row] <- ind$col.y
x
y

which places an NA in x and y for all no-shows. I did not get the logic behind setting x[3] to >5 but x[5] to NA (same for y[3], y[4]), so this is left for you to implement.

cheers.

On 10.08.2011 08:15, gallon li wrote:
> I have a following matrix and wish to define a variable based the variable
>
> A=matrix(0,5,5)
> A[1,]=c(30,20,100,120,90)
> A[2,]=c(40,30,20,50,100)
> A[3,]=c(50,50,40,30,30)
> A[4,]=c(30,20,40,50,50)
> A[5,]=c(30,50,NA,NA,100)
>
> A
>      [,1] [,2] [,3] [,4] [,5]
> [1,]   30   20  100  120   90
> [2,]   40   30   20   50  100
> [3,]   50   50   40   30   30
> [4,]   30   20   40   50   50
> [5,]   30   20   NA   NA  100
>
> I want to define two variables:
>
> X is the first column in each row that is equal to 20, for example, for the first row, I need X=2; 2nd row, X=3; 3rd row, X>5; 4th row, X=2; 5th row, X=NA;
>
> Y is then the first column in each row that is equal to 100 if before this a 20 has been reached, for example, for the first row, Y=3; 2nd row, Y=5; 3rd row, Y=NA; 4th row, Y>5; 5th row, Y=NA.
>
> the matrix may involve NA as well.
>
> How can I define these two variables quickly? (When X>5 or Y>5, we can arbitrarily assign a value 6, and this is different from being NA)

--
Eik Vettorazzi
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
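For the part left open in the reply (telling ">5" apart from NA), one possible reading of the question is: a row with no hit gets NA if it contains any NA, and 6 (one past the last column) otherwise; Y is only searched after the column of the first 20. The sketch below implements that reading. The rule itself is an assumption inferred from the poster's five examples, the helper `first_hit` is a name made up here, and the matrix follows the code definition (row 5 has 50, not 20, in column 2).

```r
A <- rbind(c(30, 20, 100, 120,  90),
           c(40, 30,  20,  50, 100),
           c(50, 50,  40,  30,  30),
           c(30, 20,  40,  50,  50),
           c(30, 50,  NA,  NA, 100))

# First column >= `from` whose entry equals `value`.
# Assumed rule for "no hit": NA if the row contains an NA,
# otherwise ncol + 1 (the poster's arbitrary value 6).
first_hit <- function(row, value, from = 1L) {
  idx <- which(row == value)      # which() drops NA comparisons
  idx <- idx[idx >= from]
  if (length(idx) > 0) return(idx[1])
  if (anyNA(row)) NA_integer_ else length(row) + 1L
}

X <- apply(A, 1, first_hit, value = 20)

# Y is only defined when a 20 was actually found in the row.
Y <- vapply(seq_len(nrow(A)), function(i) {
  if (is.na(X[i]) || X[i] > ncol(A)) return(NA_integer_)
  first_hit(A[i, ], 100, from = X[i] + 1L)
}, integer(1))

X   # 2 3 6 2 NA
Y   # 3 5 NA 6 NA
```

The results agree with the values the poster listed for all five rows, but a different rule could match those same examples, so the NA handling should be checked against the real data.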
Re: [R] Different approach to set up Cohen Kappa
The kappa2() function in the irr library takes an n x 2 matrix as input, where the two columns are the ratings by two raters.

Let x and y below be the ratings of the two raters:

library(irr)
x <- sample(c(0,1,2),100,replace=T)
x
o <- sample(c(0,0,0,1),100,replace=T)
y <- x+o
y

# Then kappa is computed as:
kappa2(cbind(x,y))

Otherwise, your post suggests that you should start with the basics and pick up an R manual to get acquainted with R.

HTH,
Daniel

gavfung wrote:
> Hi, I just started learning R, and one of the most frequent things that I need to calculate in my psychology lab is Cohen's kappa, and I figure being able to do inter-rater reliability is a great way for me to explore R.
>
> There are two different scenarios in which I need help. (By the way, I have installed irr and concord.)
>
> 1) In the lab I am working in, we go through transcripts and find certain words to code, and we usually compare the codes between two raters. In this case, the two raters can agree (both raters coded apple with A), have a mismatch (one rater coded apple with A but another coded it as B) or a miss (a rater coded apple with A but the other rater did not code it at all). There are several codes with similar procedures and as the codes are tallied together, a chart is constructed, similar to the one attached ( http://r.789695.n4.nabble.com/file/n3732320/example1.xls example1.xls ) titled example1.xls. The maroon color represents the cells with data, and the lavender cells are just the totals in each row/column.
>
> My question in this case is: how would I calculate Cohen's kappa for the cells shaded in maroon?
>
> 2) I helped run participants over the summer for a psych summer internship and after coding them, I will enter the data as shown in the attachment titled example2.xls ( http://r.789695.n4.nabble.com/file/n3732320/example2.xls example2.xls ). There is also another research assistant who entered the data and I want to find a way to check whether we are reliable or not, and want to calculate reliability for the following: TimeI, TimeA, TriesI, and TriesA. Once again, I would need to convert the Excel file into csv, but aside from that, I am lost as to what I need to do.
>
> Any help is appreciated! Thank you!

--
View this message in context: http://r.789695.n4.nabble.com/Different-approach-to-set-up-Cohen-Kappa-tp3732320p3732388.html
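For the first scenario, where the data already sit in a tallied rater-by-rater table rather than as one row per coded item, Cohen's kappa can be computed directly from the cell counts as (po - pe) / (1 - pe), with po the observed agreement and pe the agreement expected by chance from the margins. The function below is a base-R sketch of that formula; the name `cohen_kappa` and the counts in `tab` are invented for illustration, and treating "not coded" as one more category of the table would be a modelling choice, not something the thread settles.

```r
# Cohen's kappa from a square agreement table (rows: rater 1, columns: rater 2).
cohen_kappa <- function(tab) {
  tab <- as.matrix(tab)
  stopifnot(nrow(tab) == ncol(tab))
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                       # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2   # chance agreement
  (po - pe) / (1 - pe)
}

# Made-up 2 x 2 tally: 20 + 15 agreements, 5 + 10 disagreements.
tab <- matrix(c(20, 10, 5, 15), nrow = 2,
              dimnames = list(rater1 = c("A", "B"), rater2 = c("A", "B")))
cohen_kappa(tab)   # 0.4
```

Only the data cells should be passed in; the row and column totals (the lavender cells in the attached chart) must be left out, because the function recomputes the margins itself.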
[R] round() a data frame containing 'character' variables?
Dear all

It is difficult to use round(..., digits=2) on a data frame since one has to first take care to remove non-numeric variables such as 'character' or 'factor':

> head(round(iris, 2))
Error in Math.data.frame(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, :
  non-numeric variable in data frame: Species
> head(round(iris[1:4], 2))
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1          5.1         3.5          1.4         0.2
2          4.9         3.0          1.4         0.2
3          4.7         3.2          1.3         0.2
4          4.6         3.1          1.5         0.2
5          5.0         3.6          1.4         0.2
6          5.4         3.9          1.7         0.4

Is there an elegant way to use round() on a data frame containing 'character' variables without removing them? Thank you

Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
Re: [R] round() a data frame containing 'character' variables?
One approach is the following:

numVars <- sapply(iris, is.numeric)
iris[numVars] <- lapply(iris[numVars], round, digits = 2)
head(iris)

You can also put it in one lapply() call if you like.

I hope it helps.

Best,
Dimitris

On 8/10/2011 11:34 AM, Liviu Andronic wrote:
> Dear all
>
> It is difficult to use round(..., digits=2) on a data frame since one has to first take care to remove non-numeric variables such as 'character' or 'factor':
>
> > head(round(iris, 2))
> Error in Math.data.frame(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, :
>   non-numeric variable in data frame: Species
> > head(round(iris[1:4], 2))
>   Sepal.Length Sepal.Width Petal.Length Petal.Width
> 1          5.1         3.5          1.4         0.2
> 2          4.9         3.0          1.4         0.2
> 3          4.7         3.2          1.3         0.2
> 4          4.6         3.1          1.5         0.2
> 5          5.0         3.6          1.4         0.2
> 6          5.4         3.9          1.7         0.4
>
> Is there an elegant way to use round() on a data frame containing 'character' variables without removing them? Thank you
>
> Liviu

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
Re: [R] round() a data frame containing 'character' variables?
Hello

On Wed, Aug 10, 2011 at 11:41 AM, Dimitris Rizopoulos <d.rizopou...@erasmusmc.nl> wrote:
> One approach is the following:
>
> numVars <- sapply(iris, is.numeric)
> iris[numVars] <- lapply(iris[numVars], round, digits = 2)
> head(iris)

That's interesting, but still doesn't do what I need. Since it's a read-only View() operation I would like to avoid at all cost modifying the original data frame. And since my data frames are relatively big, I would like to avoid generating unnecessary copies. Basically I would need round() to ignore objects that it knows it cannot handle. Any other ideas? Thanks

Liviu
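One way to keep the original data frame untouched is to wrap the approach from the earlier reply in a function: the assignment then happens on the function's local copy and the rounded version is returned as a new object. This is a sketch, not something proposed in the thread; `round_numeric` is a name invented here and the small data frame `d` is made up for the demonstration.

```r
# Return a copy of `df` with numeric columns rounded; other columns are
# passed through unchanged. The caller's data frame is not modified.
round_numeric <- function(df, digits = 2) {
  num <- vapply(df, is.numeric, logical(1))
  df[num] <- lapply(df[num], round, digits = digits)
  df
}

d <- data.frame(id = c("a", "b"), v = c(1.23456, 2.34567))
round_numeric(d)   # rounded copy for display
d                  # original values are still there
```

In an interactive session this could be used as `View(round_numeric(iris))`. It does not avoid a copy altogether, since the rounded numeric columns have to live somewhere, but as far as I understand R's copy semantics the untouched columns need not be duplicated.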
[R] plot 3d info in 2d
Hi Listers,

Is it possible to produce an ordination plot in 2d, where bubbles represent the location of sites (this part is easy enough) and the size of the bubbles is proportional to the site's location in 3d space (I am stuck on this option)? So sites that are very near the 2d plane of the xy axes would be larger, while sites that are actually further away in 3d space would be proportionally smaller.

Any help/advice appreciated

Andy

--
Andrew Halford Ph.D
Associate Research Scientist
Marine Laboratory
University of Guam
Ph: +1 671 734 2948
Re: [R] plot 3d info in 2d
On 08/10/2011 10:02 AM, Andrew Halford wrote:
> Hi Listers,
>
> Is it possible to produce an ordination plot in 2d, where bubbles represent the location of sites (this part is easy enough) and the size of the bubbles is proportional to the sites location in 3d space (I am stuck on this option). So sites that are very near the 2d plane of the xy axes would be larger while sites that are actually further away in 3 d space would be proportionally smaller.
>
> any help/advice appreciated
>
> Andy

Hi Andy!

I think ggplot2 would be the package I would use to do this kind of plot. However, without commented, minimal, self-contained, reproducible code I cannot provide an example of how to do it.

cheers,
Paul

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
Re: [R] plot 3d info in 2d
Very easy if you note that cex in plot can be a vector.

Example:

x <- runif(100)
y <- runif(100)
z <- runif(100)

# shift and scale z for convenience (the scaling is based on range 'cos we know this is in [0,1];
# your mileage may vary but the principle is the same)
z.scaled <- 0.05 + (z-min(z))/diff(range(z))

plot(x, y, cex=2*z.scaled) # Symbol size increases linearly with z

You can add a key by giving legend() a list of three or four cex values and corresponding distances in z, if you like.

But ggplot (as a previous poster indicated) is also a natural way to do this, and adds a nice key for you if you map a variable to an aesthetic.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Andrew Halford
Sent: 10 August 2011 11:03
To: r-help@r-project.org
Subject: [R] plot 3d info in 2d

> Hi Listers,
>
> Is it possible to produce an ordination plot in 2d, where bubbles represent the location of sites (this part is easy enough) and the size of the bubbles is proportional to the sites location in 3d space (I am stuck on this option). So sites that are very near the 2d plane of the xy axes would be larger while sites that are actually further away in 3 d space would be proportionally smaller.
>
> any help/advice appreciated
>
> Andy
>
> --
> Andrew Halford Ph.D
> Associate Research Scientist
> Marine Laboratory
> University of Guam
> Ph: +1 671 734 2948
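The key mentioned in the reply above can be drawn with legend() and its pt.cex argument, which sets the symbol size of each legend entry. The following sketch repeats the scaled-cex plot and adds such a key at three z values; the seed, the choice of quantiles and the names `key_z` and `key_cex` are assumptions made here for illustration.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
x <- runif(100)
y <- runif(100)
z <- runif(100)

# Same shift-and-scale as in the reply: symbol size grows linearly with z.
z.scaled <- 0.05 + (z - min(z)) / diff(range(z))
plot(x, y, cex = 2 * z.scaled)

# Key: three reference z values drawn with exactly the cex used in the plot.
key_z   <- quantile(z, c(0, 0.5, 1))
key_cex <- 2 * (0.05 + (key_z - min(z)) / diff(range(z)))
legend("topright", legend = round(key_z, 2), pch = 1,
       pt.cex = key_cex, title = "z")
```

The original question asks for sites near the xy plane to be drawn larger; under that reading the scaling would be applied to something like `max(abs(z)) - abs(z)` instead of `z`, which is a guess at the intent rather than something the thread confirms.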
Re: [R] subqueries in sqlQuery function (package RODBC)
By 'subquery' I meant a command inside a command (in my example, two SELECT commands). It works as you proposed; I had thought that in this case (subqueries) I would need a different syntax for the sqlQuery function, combining the SQL query and paste().

But now I have another problem, and again sorry if it is too general and basic, but I just can't find the right option to set. Namely, when I import a table from SQL Server into R, columns that are defined in the SQL table as char (with leading zeros, as 001, 002, ...) are imported as integers. Could you please guide me to the options that should be set to solve this problem?

On Wed, Aug 10, 2011 at 10:16 AM, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote:
> In what sense is this a 'subquery'? It is just an SQL command (write it on one line, no terminating ;, which is not part of the query).
>
> On Wed, 10 Aug 2011, andrija djurovic wrote:
>
>> Hi R users.
>>
>> sorry for missing example and if question is to general but I am wondering if it is possible to execute subqueries in function sqlQuery (package RODBC) with opened connection with Excel or SQL server 2000. I couldn't find any example of this. And if it is possible what should be a correct syntax for this query:
>>
>> SELECT ct,COUNT(*) as n FROM (SELECT COUNT(*) AS ct FROM children GROUP BY family_id) AS x GROUP BY ct;
>>
>> sqlQuery(connecton, CORRECT SYNTAX )
>>
>> (This query is an example from book Data Manipulation with R, Phil Spector, page 47)
>>
>> Thanks for any help
>>
>> Andrija
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
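On the leading-zeros question: as far as I understand the RODBC documentation, character columns fetched by sqlQuery() are passed through a read.table-style type conversion unless as.is is set, which would explain why "001" arrives as the integer 1. The sketch below first shows that conversion in isolation with base R, then a hypothetical wrapper that asks RODBC to leave character columns alone. The function name `fetch_as_text` and the table name `my_table` are placeholders, `channel` is assumed to be an open RODBC connection, and the wrapper is not executed here.

```r
# What a read.table-style conversion does to zero-padded codes:
codes <- c("001", "002", "010")
type.convert(codes, as.is = TRUE)   # becomes integer 1, 2, 10

# Hypothetical wrapper: as.is = TRUE is handed on by sqlQuery() to
# sqlGetResults(), so columns returned as character stay character.
fetch_as_text <- function(channel, table = "my_table") {
  RODBC::sqlQuery(channel, paste("SELECT * FROM", table), as.is = TRUE)
}
```

With as.is = TRUE other character-typed columns (dates, for instance) are presumably left as character too and would need explicit conversion afterwards; as.is can also be given per column, in the same forms read.table accepts.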
Re: [R] How to get the date of specific value within a zoo object?
Xts is an extension of zoo that has some other nice features: character subsetting, periodic apply functions, good built in time conversions, etc. More importantly for my work, the authors put a lot of work into making sure it plays well with all of R's many ts classes so I almost always start any project by changing input from any source to xts so I don't have to think about the occasional inconsistencies. If you arent working with multiple data sources and don't need the extended functionality, there's absolutely nothing wrong with zoo - it's also a great package. Michael Weylandt PS - more than anything, I was somewhat distracted when answering and couldn't get your code to work so I falsely assumed it was a zoo problem: so I just wrote an example I knew worked in xts - when I thought for a moment and saw your bug, I realized the same trick would work in zoo. On Aug 10, 2011, at 1:53 AM, Richard Ma xuanlong...@uts.edu.au wrote: Hi Michael, Thanks for your kindly help. Problem solved! Just curious why you prefer xts rather than zoo? Is xts more powerful? BTW, It's my mistake that incorrectly type the code. ;-) Cheers, Richard R. Michael Weylandt lt;michael.weyla...@gmail.comgt; wrote: I'd suggest you look into the xts class and write require(xts) xts = as.xts(1:5,Sys.Date()+1:5) time(xts)[xts==3] By the way, your code isn't pastable for me: not sure why. Michael Weylandt - Richard Ma PhD student, Ecology Remote Sensing Climate Change Cluster, Department of Environment Science University of Technology, Sydney http://everydropr.wordpress.com -- View this message in context: http://r.789695.n4.nabble.com/How-to-get-the-date-of-specific-value-within-a-zoo-object-tp3731885p3732108.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] How to quickly convert a data.frame into a structure of lists
I would use the tapply function (which is designed for the case in which data exists for most pairs of the levels of A and B) or the reshape::sparseby function, or something else in the reshape package. These won't give you exactly the structure you were asking for, but they will separate the data properly.

By the way, it's a good idea when posting a question to post a simple example; then other solutions can be illustrated on the same example. It doesn't need to contain millions of rows.

Duncan Murdoch

On 11-08-09 8:58 PM, Frederic F wrote:

Hello,

This is my first project in R, so I'm trying to work 'the R way', but it still feels awkward sometimes. The problem that I'm facing right now is that I need to convert a data.frame into a structure of lists. The data.frame has columns in the order of tens (I need to focus on only three of them) and rows in the order of millions. So it's quite a big dataset.

Let's say that the columns of interest are A, B and C. I need to take the data.frame and construct a structure of lists where I have a list for every level of A, those lists all contain lists for every level of B, and the 'b-lists' contain all the values of C that match the corresponding levels of A and B. So, I should be able to write something like this:

    MyData@list_structure$x_level_of_A$y_level_of_B

and get a vector of the values of C that were on rows where A=x_level_of_A and B=y_level_of_B.

My first attempt was to use two nested lapply functions running something like this:

    list_structure <- lapply(levels(A), function(x) {
        as.character(x) = lapply(levels(B), function(y) {
            as.character(y) = C[A==x & B==y]
        })
    })

The real code was not quite as simple, but I managed to have it work, and it worked well on my first dataset (where A and B had only a few levels). I was quite happy... but the nested loops killed me on a second dataset where A had several thousand levels. So I tried something else.

My second attempt was to go through every row of the data.frame and append the value to the appropriate vector. I first initialized a structure of lists ending with NULL vectors, then I did something like this:

    for (i in 1:nrow(DataFrame)) {
        eval(substitute(
            append(MyData@list_structure$a_value$b_value, c_value),
            list(a_value=as.character(DF$A[i]),
                 b_value=as.character(DF$B[i]),
                 c_value=as.character(DF$C[i]))
        ))
    }

This works... but way too slowly for my purpose. I would like to know if there is a better road to take to do this transformation, or if there is a way of speeding up one of the two solutions that I have tried.

Thank you very much for your help! (And in your replies, please remember that this is my first project in R, so don't hesitate to state the obvious if it seems like I am missing it!)

Frederic
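As an illustration of the suggestion to let R do the grouping, here is a minimal base-R sketch on a toy data frame. The data and the names df, nested and counts are invented for the example; split() is used here as one possible way to get the nested structure from the question, and tapply() is shown alongside it as the matrix-style alternative. Whether this is fast enough on millions of rows is not something the sketch can promise.

```r
# Toy data standing in for the real data frame (column names follow the post)
df <- data.frame(A = c("a1", "a1", "a2", "a2", "a2"),
                 B = c("b1", "b2", "b1", "b1", "b2"),
                 C = c(10, 20, 30, 40, 50))

# Split once on A, then split C by B inside each piece:
# one pass per level of A instead of one pass per (A, B) pair
nested <- lapply(split(df, df$A), function(d) split(d$C, d$B))
nested$a2$b1   # values of C on rows where A == "a2" and B == "b1"

# tapply() gives a cross-tabulated view of the same grouping
counts <- tapply(df$C, list(df$A, df$B), length)
counts
```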
Re: [R] plot 3d info in 2d
On 08/10/2011 10:02 AM, Andrew Halford wrote:

Hi Listers,

Is it possible to produce an ordination plot in 2d, where bubbles represent the location of sites (this part is easy enough) and the size of the bubbles is proportional to the site's location in 3d space (I am stuck on this option)? So sites that are very near the 2d plane of the xy axes would be larger, while sites that are actually further away in 3d space would be proportionally smaller.

Any help/advice appreciated.

Andy

Plotting the dataset which was proposed by S. Ellison using ggplot2 is done in this fashion:

    library(ggplot2)
    theme_set(theme_bw())
    dat = data.frame(x = runif(100), y = runif(100), z = runif(100))
    ggplot(aes(x = x, y = y, size = z), data = dat) +
        geom_point(color = 'lightblue')

    # Using log(z) instead of z
    ggplot(aes(x = x, y = y, size = z), data = dat) +
        geom_point(color = 'lightblue') +
        scale_size_continuous(trans = 'log')

    # Alternatively, making the color of the point dependent on the value of z
    ggplot(aes(x = x, y = y, color = z), data = dat) +
        geom_point(size = 6) +
        scale_color_gradient(low = 'white', high = 'blue')

cheers,
Paul

--
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
[R] Source Code glm() question
Dear List,

I'm fairly new to R. I'd like to see how glm() uses the argument family in fitting a model. Specifically, I'd like to see how a glm with a gamma family is fitted.

Thanks for any help,
Axel.
Re: [R] Source Code glm() question
Just type glm at the prompt.

    > glm
    function (formula, family = gaussian, data, weights, subset,
        na.action, start = NULL, etastart, mustart, offset,
        control = list(...), model = TRUE, method = "glm.fit",
        x = FALSE, y = TRUE, contrasts = NULL, ...)
    {
        call <- match.call()
        if (is.character(family))
            family <- get(family, mode = "function", envir = parent.frame())
        if (is.function(family))
            family <- family()

and so on.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Axel Urbiz
Sent: Wednesday, 10 August 2011 13:16
To: R-help@r-project.org
Subject: [R] Source Code glm() question

Dear List,

I'm fairly new in R. I'd like to see how glm() uses the argument family in fitting a model. Specifically, I'd like to see how a glm with a gamma family is fitted.

Thanks for any help,
Axel.
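To go one step beyond printing glm itself, the family argument can be inspected directly. The sketch below uses only base R; the object name fam is made up. It shows the pieces a Gamma family object carries (link, inverse link, variance function), which is what the fitting routine named in the method argument, glm.fit, consults during its iterations.

```r
# A family object is a list of functions and labels
fam <- Gamma(link = "inverse")   # "inverse" is the default link for Gamma
fam$family       # name of the family
fam$link         # name of the link
fam$linkinv(2)   # inverse link at eta = 2, i.e. mu = 1/eta
fam$variance(3)  # variance function at mu = 3, i.e. mu^2

# The numerical work is done in glm.fit; printing it shows where
# family$variance, family$linkinv and friends are called
head(deparse(stats::glm.fit), 5)
```

Typing Gamma or glm.fit at the prompt prints the full source in the same way that typing glm does.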
Re: [R] glmnet
On 08/10/2011 03:00 AM, Nick Sabbe wrote:

Finally, to avoid downward bias, you could run a normal glm with only the variables selected in the previous step.

At the cost, of course, of introducing upward bias.

--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky
Re: [R] reflecting a PCA biplot
On Tue, 2011-08-09 at 22:57 +1000, Andrew Halford wrote:

Hi Listers,

I am trying to reflect a PCA biplot in the x-axis (i.e. PC1) but am not having much success. In theory I believe all I need to do is multiply the site and species scores for PC1 by -1, which would effectively flip the biplot.

I am creating a blank plot using the plot command and accessing the results from a call to rda. I then use calls to scores to obtain separate site and species coordinates, and I have worked out how to multiply the appropriate PC1 scores by -1 to create the site and species scores I want. However, I am not sure how to change the call to plot, which accesses the results of the call to rda to draw the blank plot. The coordinates it is accessing are for the unreflected ordination, and this does not match the new site and species scores that I have.

    fish.pca <- rda(fish.hel)
    fish.site <- scores(fish.pca, display="sites", scaling=3)
    fish.spp <- scores(fish.pca, display="species", scaling=3)
    fish.site[,"PC1"] <- -1*(fish.site[,"PC1"])
    fish.spp[,"PC1"] <- -1*(fish.spp[,"PC1"])
    graph <- plot(fish.pca, display=c("sites","species"), type="n", scaling=3)
    # how do I get the plot to draw up the blank display based on the
    # reversed site and species scores?

Do you mean something like...?

    require(vegan)
    data(dune)
    mod <- rda(dune)
    si.scrs <- scores(mod, display = "sites", scaling = 3, choices = 1:2)
    sp.scrs <- scores(mod, display = "species", scaling = 3, choices = 1:2)
    si.scrs[, "PC1"] <- -1 * si.scrs[, "PC1"]
    sp.scrs[, "PC1"] <- -1 * sp.scrs[, "PC1"]
    xlim <- range(0, si.scrs[, 1], sp.scrs[, 1])
    ylim <- range(0, si.scrs[, 2], sp.scrs[, 2])
    plot(si.scrs[,1], si.scrs[,2], ylim = ylim, xlim = xlim,
         ylab = "PC2", xlab = "PC1", cex = 0.7, asp = 1)
    abline(h = 0, lty = "dotted")
    abline(v = 0, lty = "dotted")
    points(sp.scrs[,1], sp.scrs[,2], col = "red", pch = 3, cex = 0.7)
    box()

For non-standard plotting of ordination objects, our advice has always been to build the plot up from lower-level plotting functions rather than the specific methods supplied with vegan.

A comparison (not quite the same, I know, but close enough):

    layout(matrix(1:2, ncol = 2))
    plot(mod, display=c("sites","species"), type = "p", scaling=3,
         main = "Original")
    plot(si.scrs[,1], si.scrs[,2], ylim = ylim, xlim = xlim,
         ylab = "PC2", xlab = "PC1", cex = 0.7, asp = 1,
         main = "Flipped PC1")
    abline(h = 0, lty = "dotted")
    abline(v = 0, lty = "dotted")
    points(sp.scrs[,1], sp.scrs[,2], col = "red", pch = 3, cex = 0.7)
    box()
    layout(1)

If you want type = "t", the default for plot.cca, then use the same coordinates but extract the relevant labels from the two scores objects:

    plot(si.scrs[,1], si.scrs[,2], ylim = ylim, xlim = xlim,
         ylab = "PC2", xlab = "PC1", cex = 0.7, asp = 1,
         type = "n") ## no plotting this time
    abline(h = 0, lty = "dotted")
    abline(v = 0, lty = "dotted")
    ## add site and species scores as labels
    text(si.scrs[,1], si.scrs[,2], labels = rownames(si.scrs), cex = 0.7)
    text(sp.scrs[,1], sp.scrs[,2], labels = rownames(sp.scrs),
         col = "red", cex = 0.7)
    box()

HTH

G

Any help appreciated.

cheers
Andy

--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
[R] igraph - designing graph plot by attributes
Hi,

I'm working on some social networks and I managed to create the graphs with labels and edge weights, but I would also like to change the size of the vertices according to the age of the persons in the network, and the shape according to their gender.

For the age, I have people with ages between 20 and 78, and I would like to have 4 categories (sizes): 20-35, 36-50, 50-65, >65. I have entered the ages as attributes of the vertices from a table, so they are included in the graph, but how do I change the size in the plot? And the same for gender, with different shapes (circle and square maybe).

Thanks in advance and regards,
Andreea.
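One possible way to prepare the sizes and shapes is sketched below in base R. The ages, genders, size values and all variable names are invented for the example, and the class boundaries are one reading of the categories in the question. The two resulting vectors are in the form that igraph's plot accepts through its vertex.size and vertex.shape arguments, assuming the vertices are in the same order as the vectors.

```r
# Hypothetical vertex attributes, one entry per person
age    <- c(22, 41, 57, 78, 35, 66)
gender <- c("F", "M", "F", "M", "M", "F")

# Four age classes; intervals are closed on the right, so 35 falls in the first
age_class <- cut(age, breaks = c(19, 35, 50, 65, Inf),
                 labels = c("20-35", "36-50", "51-65", ">65"))

# One plotting size per class (arbitrary values), looked up by class index
size_by_class <- c(8, 12, 16, 20)
v_size  <- size_by_class[as.integer(age_class)]
v_shape <- ifelse(gender == "F", "circle", "square")

data.frame(age, age_class, v_size, gender, v_shape)

# With igraph loaded and a graph g, something along these lines should then work:
#   plot(g, vertex.size = v_size, vertex.shape = v_shape)
```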
[R] Floats in Microsoft Basic format
Hi all,

I need to convert a floating point value from Microsoft Basic format to IEEE format. Is there a simple way to achieve this in R, or do I have to write my own function (e.g. convert the C code below)?

thanks
t

    #include <string.h>        /* for strncpy */

    int _fmsbintoieee(float *src4, float *dest4)
    {
        unsigned char *msbin = (unsigned char *)src4;
        unsigned char *ieee = (unsigned char *)dest4;
        unsigned char sign = 0x00;
        unsigned char ieee_exp = 0x00;
        int i;

        /* MS Binary Format                         */
        /* byte order =>    m3 | m2 | m1 | exponent */
        /* m1 is most significant byte => sbbb|bbbb */
        /* m3 is the least significant byte         */
        /*      m = mantissa byte                   */
        /*      s = sign bit                        */
        /*      b = bit                             */

        sign = msbin[2] & 0x80;      /* 1000|0000b  */

        /* IEEE Single Precision Float Format       */
        /*    m3        m2        m1     exponent   */
        /* mmmm|mmmm mmmm|mmmm emmm|mmmm seee|eeee  */
        /*      s = sign bit                        */
        /*      e = exponent bit                    */
        /*      m = mantissa bit                    */

        for (i=0; i<4; i++) ieee[i] = 0;

        /* any msbin w/ exponent of zero = zero */
        if (msbin[3] == 0) return 0;

        ieee[3] |= sign;

        /* MBF is bias 128 and IEEE is bias 127. ALSO, MBF places   */
        /* the decimal point before the assumed bit, while          */
        /* IEEE places the decimal point after the assumed bit.     */

        ieee_exp = msbin[3] - 2;    /* actually, msbin[3]-1-128+127 */

        /* the first 7 bits of the exponent in ieee[3] */
        ieee[3] |= ieee_exp >> 1;

        /* the one remaining bit in first bin of ieee[2] */
        ieee[2] |= ieee_exp << 7;

        /* 0111|1111b : mask out the msbin sign bit */
        ieee[2] |= msbin[2] & 0x7f;

        ieee[1] = msbin[1];
        ieee[0] = msbin[0];

        return 0;
    }
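A hand-written R version is short enough that it may be the simplest route. The sketch below is not a translation of the bit shuffling above; it decodes the four bytes arithmetically, using the layout described in the C comments (bytes in the order m3, m2, m1, exponent; exponent bias 128; binary point before the implied leading bit). The function name mbf_to_double is made up for the example.

```r
# Decode one 4-byte Microsoft Binary Format single into an R numeric.
# b: raw (or integer) vector of length 4 in file order: m3, m2, m1, exponent
mbf_to_double <- function(b) {
  b <- as.integer(b)
  if (b[4] == 0L) return(0)                    # exponent byte 0 means zero
  sign <- if (bitwAnd(b[3], 0x80L) != 0L) -1 else 1
  # 23 stored mantissa bits, read as a fraction in [0, 1)
  frac <- (bitwAnd(b[3], 0x7FL) * 65536 + b[2] * 256 + b[1]) / 2^23
  # value = 0.1fff... (binary) * 2^(e - 128) = (1 + frac) * 2^(e - 129)
  sign * (1 + frac) * 2^(b[4] - 129)
}

mbf_to_double(as.raw(c(0x00, 0x00, 0x00, 0x81)))  # 1
mbf_to_double(as.raw(c(0x00, 0x00, 0x20, 0x84)))  # 10
mbf_to_double(as.raw(c(0x00, 0x00, 0x80, 0x80)))  # -0.5
```

To use it on a file, the bytes could be read with readBin(con, what = "raw", n = 4) for each value and passed to the function.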
[R] Using ANCOVA in R
Hello,

I have a problem with using the following design with ANCOVA in R. There are two groups (control + treatment), each with ten subjects. The subjects show a response that is monitored over time (four time points). For a single given subject, the response can be analysed with linear regression with time as the independent variable.

The question is how the response differs between the two groups. It is to be expected that the slope of the response differs between the two groups, while the intercept itself is of minor interest.

I have tried a linear model as in

    g <- lm(response ~ time + subject + group)

but somehow I think that this is not correct, as it would not show a difference in the slope of the response.

Any kind of help would be greatly appreciated.

--
January
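A difference in slopes corresponds to a time-by-group interaction term, which the additive model above does not contain. The sketch below simulates data with the design described in the post (all numbers and names are invented) and fits response ~ time * group; the coefficient time:grouptreatment then estimates how much the treatment slope differs from the control slope. Note that a plain lm ignores the fact that each subject is measured four times, so a mixed model (for instance nlme::lme with a random effect for subject) is probably closer to what the design needs; this is only meant to show where the slope difference appears.

```r
set.seed(42)
n_subj <- 20

# Four time points for each of 20 subjects; first 10 control, last 10 treatment
dat <- expand.grid(time = 1:4, subject = factor(1:n_subj))
dat$group <- factor(ifelse(as.integer(dat$subject) <= 10, "control", "treatment"))

# Simulated truth: slope 1.0 in control, 1.5 in treatment, subject-specific intercepts
subj_eff <- rnorm(n_subj, sd = 0.5)
slope <- ifelse(dat$group == "treatment", 1.5, 1.0)
dat$response <- 2 + subj_eff[as.integer(dat$subject)] + slope * dat$time +
  rnorm(nrow(dat), sd = 0.3)

# time * group expands to time + group + time:group
fit <- lm(response ~ time * group, data = dat)
coef(summary(fit))["time:grouptreatment", ]
```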
[R] Different approach to set up Cohen Kappa
Hi,

I just started learning R. One of the things I most often need to calculate in my psychology lab is Cohen's kappa, and I figure that being able to do inter-rater reliability is a great way for me to explore R. There are two different scenarios in which I need help. (By the way, I have installed irr and concord.)

1) In the lab I am working in, we go through transcripts and find certain words to code, and we usually compare the codes between two raters. In this case, the two raters can agree (both raters coded apple with A), have a mismatch (one rater coded apple with A but the other coded it as B) or a miss (one rater coded apple with A but the other rater did not code it at all). There are several codes with similar procedures, and as the codes are tallied together, a chart is constructed, similar to the one attached (http://r.789695.n4.nabble.com/file/n3732320/example1.xls), titled example1.xls. The maroon color represents the cells with data, and the lavender cells are just the totals in each row/column. My question in this case is: how would I calculate Cohen's kappa for the cells shaded in maroon?

2) I helped run participants over the summer for a psych summer internship, and after coding them I will enter the data as shown in the attachment titled example2.xls (http://r.789695.n4.nabble.com/file/n3732320/example2.xls). Another research assistant also entered the data, and I want to find a way to check whether we are reliable or not. I want to calculate reliability for the following: TimeI, TimeA, TriesI, and TriesA. Once again, I would need to convert the Excel file into csv, but aside from that, I am lost as to what I need to do.

Any help is appreciated! Thank you!
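For the first scenario, kappa can be computed directly from the tallied agreement table. The base-R sketch below uses invented counts and a made-up function name, cohen_kappa; it applies the usual formula kappa = (po - pe) / (1 - pe), where po is the observed agreement and pe the agreement expected from the row and column totals. One assumption worth stating: a "miss" would have to be tallied as its own category (for example "none") for it to enter the table. The irr package mentioned in the post provides kappa2() for two columns of raw ratings.

```r
# Cohen's kappa from a square agreement table (rows: rater 1, columns: rater 2)
cohen_kappa <- function(tab) {
  tab <- as.matrix(tab)
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                       # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2   # chance agreement
  (po - pe) / (1 - pe)
}

# Invented tally: 20 items coded A by both, 15 coded B by both, 15 disagreements
ratings <- matrix(c(20, 5, 10, 15), nrow = 2,
                  dimnames = list(rater1 = c("A", "B"), rater2 = c("A", "B")))
cohen_kappa(ratings)   # 0.4

# Starting from per-item codes instead of a tally: cross-tabulate first
r1 <- factor(c("A", "A", "B", "B", "A"), levels = c("A", "B"))
r2 <- factor(c("A", "B", "B", "B", "A"), levels = c("A", "B"))
cohen_kappa(table(r1, r2))
```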
Re: [R] R-help Digest, Vol 102, Issue 10
We are on vacation until 20 August and will not be answering e-mails. In urgent cases, please contact Stefanie von Felten steffi.vonfel...@oikostat.ch
[R] varpart with repeated measures
Dear all,

I would like to do variance partitioning of a community dissimilarity matrix (Y) using 4 explanatory tables:

X1 = environmental characteristics
X2 = species traits related to dispersal
X3 = species characteristics (abundances and richness)
X4 = xy spatial coordinates

The problem is that I have a repeated measures design (longitudinal data). I treated Time (a factor with 4 levels) as a fixed factor in the main analysis (relating the community dissimilarity matrix to environmental characteristics) and checked for sphericity etc. Now I would like to see what proportion of variation is due to these 4 explanatory tables, but I am not sure how to deal with the spatial coordinates (X4) without having to pool all data across dates. Should I just repeat the xy data 4 times in the table? Should I do pcnm with X4 before using it in varpart?

I would greatly appreciate any help,
Vesna
[R] function runif in for loop
Hello,

I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input, and X should be defined as a random number between two values. Therefore I want to use the function runif like:

    X <- runif(1, Xa, Xb)

but it seems that runif doesn't allow the use of vectors. So I think I have to calculate the new vector X by using a for loop. I tried

    for (i in 1:length(lT)) T <- runif(1, lT, uT)

but that doesn't work. What is the correct for-loop to create this new vector/variable? Can I then use that function as an input for MCMCregress?

thank you
Johannes
Re: [R] round() a data frame containing 'character' variables?
The function format() might serve your needs.

    > format(head(iris), digits=1)
      Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    1            5           4            1         0.2  setosa
    2            5           3            1         0.2  setosa
    3            5           3            1         0.2  setosa
    4            5           3            2         0.2  setosa
    5            5           4            1         0.2  setosa
    6            5           4            2         0.4  setosa

Jean

`·.,, (((º `·.,, (((º `·.,, (((º
Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409 USA

From: Liviu Andronic landronim...@gmail.com
To: r-help@r-project.org Help r-help@r-project.org
Date: 08/10/2011 04:37 AM
Subject: [R] round() a data frame containing 'character' variables?
Sent by: r-help-boun...@r-project.org

Dear all

It is difficult to use round(..., digits=2) on a data frame since one has to first take care to remove non-numeric variables such as 'character' or 'factor':

    > head(round(iris, 2))
    Error in Math.data.frame(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, :
      non-numeric variable in data frame: Species
    > head(round(iris[1:4], 2))
      Sepal.Length Sepal.Width Petal.Length Petal.Width
    1          5.1         3.5          1.4         0.2
    2          4.9         3.0          1.4         0.2
    3          4.7         3.2          1.3         0.2
    4          4.6         3.1          1.5         0.2
    5          5.0         3.6          1.4         0.2
    6          5.4         3.9          1.7         0.4

Is there an elegant way to use round() on a data frame containing 'character' variables without removing them?

Thank you
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
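Note that format() returns character columns. If the rounded values need to stay numeric, another option is to round only the numeric columns and leave the rest untouched. The helper below is a small sketch in base R; the name round_numeric is made up for the example.

```r
# Round every numeric column of a data frame, keep all other columns as they are
round_numeric <- function(df, digits = 2) {
  is_num <- vapply(df, is.numeric, logical(1))
  df[is_num] <- lapply(df[is_num], round, digits = digits)
  df
}

head(round_numeric(iris, digits = 0))
```

The factor column Species passes through unchanged, and the numeric columns can still be used in calculations afterwards.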
[R] rbind/cbind
Dear list,

I wonder if there is a better way to have rbind/cbind/append create the first element (if it is empty) instead of doing the following in a loop?

    for (i in 1:10) {
        if (i == 1) {
            aRow = SomeExpression(i)
        } else {
            aRow = rbind(aRow, SomeExpression(i))
        }
    }

Thanks
Anthony
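Two common ways around the special case for the first iteration are sketched below in base R. The function some_expression is a hypothetical stand-in for SomeExpression from the question. The first variant collects the pieces in a list and binds them once, which also avoids repeatedly copying a growing object; the second relies on rbind() ignoring NULL arguments.

```r
# Hypothetical stand-in for SomeExpression(i): returns one row as a named vector
some_expression <- function(i) c(id = i, square = i^2)

# Variant 1: build all rows, then bind them in a single call
rows <- lapply(1:10, some_expression)
aRow <- do.call(rbind, rows)
dim(aRow)

# Variant 2: start from NULL, so no if/else is needed inside the loop
acc <- NULL
for (i in 1:10) acc <- rbind(acc, some_expression(i))

# Both approaches hold the same values
all(acc == aRow)
```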
Re: [R] Plotting Ellipses and Points of Matching Colors in an Ordination
Gabe,

Since you didn't provide a small example of your data, I can't test out your code. However, I used an example from the ?ordiellipse help page to draw different colored ellipses (using the show.groups= argument) with labels (using the label= argument). Hope this helps.

    library(vegan)
    data(dune)
    data(dune.env)
    mod <- cca(dune ~ Management, dune.env)
    attach(dune.env)
    plot(mod, type="n")
    points(mod, display="sites", pch=as.numeric(Management),
           col=as.numeric(Management))
    groupz <- sort(unique(Management))
    for(i in seq(groupz)) {
        ordiellipse(mod, Management, kind="se", conf=0.95, label=T,
                    font=2, cex=1.5, col=i, show.groups=groupz[i])
    }

Jean

`·.,, (((º `·.,, (((º `·.,, (((º
Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409 USA

From: Gabriel Yospin yosp...@gmail.com
To: r-help@r-project.org
Date: 08/09/2011 10:34 PM
Subject: [R] Plotting Ellipses and Points of Matching Colors in an Ordination
Sent by: r-help-boun...@r-project.org

Hello, R-Help -

I am trying to plot the results of an ordination from package vegan. The tricky part for me right now is getting the colors of the ellipses denoting the 95% confidence intervals of the group centroids to match the colors of the points for those same groups.

From an earlier post, I saw the code to make a plot of the ordination using different colors for my different groups. My functional code is below:

    library(vegan)
    comvars <- read.csv("commvars9trim2.csv")
    com.mds <- metaMDS(comvars, trace=FALSE)
    firefactors <- read.csv("commvars9factors.csv")
    plot(com.mds, type = "n")
    points(com.mds, dis="si", pch = as.numeric(firefactors$comm4),
           col = as.numeric(firefactors$comm4))

This code will generate the ordination and plot it, using the factor levels from firefactors$comm4 to determine the colors and characters to use in the plot.

What I would like to do next is plot the ellipses denoting the 95% confidence intervals of the group centroids, with colors matching those of the points. The following piece of code plots the centroids:

    with(firefactors, ordiellipse(com.mds, comm4, kind = "se", conf = 0.95,
         col = as.numeric(firefactors$comm4)), label = TRUE)

But the above code makes all of the ellipses blue. Blue is the fourth color in my default palette(), and the first value returned by as.numeric(firefactors$comm4) is 4. I assume that's not a coincidence, but I could be wrong.

I have also tried using:

    plot(com.mds, display = "sites", type = "p")
    with(firefactors, ordiellipse(com.mds, comm4, kind = "se", conf = 0.95,
         col = as.numeric(firefactors$comm4)))

But that code also gives me blue ellipses.

Finally, I'd like to label the ellipses. The only way I've found to do that is by using the ordispider() function. Is there any way to make ordispider draw no lines? I've tried

    with(firefactors, ordispider(com.mds, comm4, col = "green3",
         label = TRUE, lty = 3))

My two questions:

1. How do I make the ellipse color match the color of the points each ellipse represents?
2. How do I label those ellipses, without drawing the dashed lines as per ordispider()?

Many thanks,
Gabe

--
Gabriel I. Yospin
Center for Ecology Evolutionary Biology
Bridgham Lab
University of Oregon
Eugene, OR 97403-5289
Re: [R] Loops for repetitive task
Hi:

Try this:

    ## Function that takes a data frame as input and outputs a data frame:
    chrSumm <- function(d) {
        # d is a data frame
        colnames(d) <- c("chr","start","end","base1","base2",
                         "totalreads","methylation","strand")
        TR <- nrow(d)
        RG1 <- sum(d['totalreads'] >= 1)
        percent <- TR/RG1
        methylSumm <- summary(d$methylation)
        names(methylSumm) <- c('Min', 'Q1', 'Median', 'Mean', 'Q3', 'Max')
        data.frame(TR, RG1, percent, as.data.frame(as.list(methylSumm)))
    }

    # Read the data files into a list and apply the function to each
    # file in turn, resulting in a data frame

    # vector of file names
    files <- c('chr1.out.txt', 'chr2.out.txt')

    # use lapply() to read files into a list
    filelist <- lapply(files, read.table, header = FALSE)

    # Use the ldply() function from the plyr package to
    # process the list and return a data frame
    library('plyr')
    ldply(filelist, chrSumm)

    # Result from your example:
    > ldply(filelist, chrSumm)
      TR RG1 percent  Min     Q1 Median    Mean     Q3  Max
    1  4   4     1.0 0.04 0.0475   0.07 0.07500 0.0975 0.12
    2  3   2     1.5 0.00 0.0150   0.03 0.03667 0.0550 0.08

HTH,
Dennis

On Tue, Aug 9, 2011 at 9:31 PM, a217 aj...@case.edu wrote:

Hello,

I have an R script that I use as a template to perform a task for multiple files (in this case, multiple chromosomes). What I would like to do is to utilize a simple loop to parse through each chromosome number so that I don't have to type the same code over and over again in the R console.

I've tried using:

    for(i in 1:22){
        etc..
    }

and replacing each chromosome number with [[i]], but that did not seem to work.

Below is the script I have. Basically everywhere you see a '2' I would like there to be an 'i' so that the script can be applied in a general sense.

    ### Code ###
    chr2.data<-read.table(file="chr2.out.txt", header=F)
    colnames(chr2.data)<-c("chr","start","end","base1","base2","totalreads","methylation","strand")
    splc2<-split(chr2.data, paste(chr2.data$chr))
    chr2.df<-as.data.frame(t(sapply(splc2, function(x)
        list(TR=NROW(x[['totalreads']]),
             RG1=sum(x[['totalreads']]>=1),
             percent=(NROW(x[['totalreads']]>=1)/sum(x[['totalreads']]))))))
    chr2.df.summ<-as.data.frame(t(sapply(splc2, function(x) summary(x$methylation))))
    chr2.summ<-cbind(chr2.df,chr2.df.summ)

Here are some sample input files in case you'd like to test the code:

    # chr1.out.txt
    chr1 100 159 104 104 1 0.05 +
    chr1 100 159 145 145 1 0.04 +
    chr1 200 260 205 205 1 0.12 +
    chr1 500 750 600 600 1 0.09 +

    # chr2.out.txt
    chr2 100 200 105 105 1 0.03 +
    chr2 100 200 110 110 1 0.08 +
    chr2 300 400 350 350 0 0 +

The code works perfectly fine just typing everything out by hand, but that is very inefficient given that there are 24 chromosomes for each dataset. I am just looking for any suggestions as to how I can write a general version of this code.
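For comparison, the same idea can be written without plyr by building each file name from the chromosome number. The sketch below is self-contained: it first writes the two sample files from the post into a temporary directory, then loops over the numbers with lapply(). The function name summarise_chr and the reduced set of summary columns are choices made for the example, not part of either script above.

```r
# Write the two sample files from the post into a temporary directory
tmp <- tempdir()
writeLines(c("chr1 100 159 104 104 1 0.05 +",
             "chr1 100 159 145 145 1 0.04 +",
             "chr1 200 260 205 205 1 0.12 +",
             "chr1 500 750 600 600 1 0.09 +"),
           file.path(tmp, "chr1.out.txt"))
writeLines(c("chr2 100 200 105 105 1 0.03 +",
             "chr2 100 200 110 110 1 0.08 +",
             "chr2 300 400 350 350 0 0 +"),
           file.path(tmp, "chr2.out.txt"))

cols <- c("chr", "start", "end", "base1", "base2",
          "totalreads", "methylation", "strand")

# Summarise one chromosome; the file name is built from the number i
summarise_chr <- function(i, dir) {
  d <- read.table(file.path(dir, sprintf("chr%d.out.txt", i)),
                  header = FALSE, col.names = cols)
  data.frame(chr = i,
             TR = nrow(d),                      # total rows
             RG1 = sum(d$totalreads >= 1),      # rows with at least one read
             mean_methylation = mean(d$methylation))
}

# One row per chromosome; replace 1:2 by 1:22 for the full set
res <- do.call(rbind, lapply(1:2, summarise_chr, dir = tmp))
res
```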
[R] column names issue with read.csv
Dear List,

I wonder why, when using read.csv(), a column name that starts with a digit (e.g. a stock symbol such as 0001.HK) automatically gets an X prepended, so that it becomes X0001.HK. Now I have to manually do a loop and use substring() to remove the X character. Any advice?

Thanks
Anthony
Re: [R] function runif in for loop
Johannes,

You have the loop set up right, you just need to add indexing to refer to the looping variable, i.

    lT <- sample(1:10)
    uT <- sample(21:30)
    X <- numeric(length(lT))
    for (i in 1:length(lT)) X[i] <- runif(1, lT[i], uT[i])
    X

Note that I changed the name of the result from T to X, because T has special meaning in R.

Jean

`·.,, (((º `·.,, (((º `·.,, (((º
Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409 USA

From: Johannes Radinger jradin...@gmx.at
To: r-help@r-project.org
Date: 08/10/2011 07:23 AM
Subject: [R] function runif in for loop
Sent by: r-help-boun...@r-project.org

Hello,

I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input, and X should be defined as a random number between two values. Therefore I want to use the function runif like:

    X <- runif(1, Xa, Xb)

but it seems that runif doesn't allow the use of vectors. So I think I have to calculate the new vector X by using a for loop. I tried

    for (i in 1:length(lT)) T <- runif(1, lT, uT)

but that doesn't work. What is the correct for-loop to create this new vector/variable? Can I then use that function as an input for MCMCregress?

thank you
Johannes
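As a side note to the loop above, runif() does accept vectors for its min and max arguments, so the loop can be replaced by a single call as long as the number of draws is requested explicitly. A minimal sketch, reusing the lT and uT names from the reply (the seed is only there to make the example repeatable):

```r
set.seed(1)
lT <- sample(1:10)    # lower bounds
uT <- sample(21:30)   # upper bounds

# One draw per (lower, upper) pair: n must equal the length of the bounds
X <- runif(length(lT), min = lT, max = uT)
X
```

The original attempt runif(1, lT, uT) asks for a single value, which is why only one number came back.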
Re: [R] column names issue with read.csv
Anthony,

See ?make.names for a description of valid names. Here's an excerpt: "A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. ... The character "X" is prepended if necessary."

There is an argument in read.csv() called check.names. Try setting this to FALSE and see if that works.

    read.csv(file="Something.csv", check.names=FALSE)

Jean

Jean V. Adams Statistician U.S. Geological Survey Great Lakes Science Center 223 East Steinfest Road Antigo, WI 54409 USA

From: Anthony Ching Ho Ng anthony.ch...@gmail.com To: r-help@r-project.org Date: 08/10/2011 08:18 AM Subject: [R] column names issue with read.csv Sent by: r-help-boun...@r-project.org

> Dear List, I wonder why when using read.csv(), if the column name contains a numeric i.e. a stock symbol 0001.HK, it will automatically insert an X character to the column names -> X0001.HK. Now I have to manually do a loop and use substring() to remove the X character. Any advice? Thanks Anthony
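The effect of check.names can be seen in a minimal sketch. It assumes a tiny in-memory CSV (the `csv_text` string and its two symbols are made up for illustration) instead of a real file:

```r
# Hypothetical CSV content; the header names start with a digit
csv_text <- "0001.HK,0005.HK\n10.5,80.1\n10.7,79.9\n"

# Default behaviour: names are passed through make.names(), so X is prepended
with_check <- read.csv(text = csv_text)
names(with_check)

# check.names = FALSE keeps the header exactly as written in the file
without_check <- read.csv(text = csv_text, check.names = FALSE)
names(without_check)

# Non-syntactic names need [[ ]] or backticks when accessed
without_check[["0001.HK"]]
```

Keeping the raw names means `without_check$0001.HK` will not parse; `[[ ]]` or backticks are needed instead, which may or may not be a price worth paying.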
Re: [R] How to quickly convert a data.frame into a structure of lists
To borrow shamelessly from one of the prominent helpers on this list: "What is the problem you're trying to solve?" (attribution: Jim Holtman)

I have the sense you want to do something over many subsets of your data frame. If so, breaking things up into lists of lists of lists is not necessarily productive, nor may it be necessary to use loops explicitly, depending on the nature of what you want to do. If you're more explicit about the nature of your task, it's entirely possible that there may be a nice 'R way' to do it. Read the posting guide and if at all possible, provide a small, reproducible example that demonstrates what you want to accomplish. (See ?dput to learn how to transmit data by e-mail.)

HTH, Dennis

On Tue, Aug 9, 2011 at 5:58 PM, Frederic F fournier.frede...@gmail.com wrote:
> Hello, This is my first project in R, so I'm trying to work 'the R way', but it still feels awkward sometimes. The problem that I'm facing right now is that I need to convert a data.frame into a structure of lists. The data.frame has columns in the order of tens (I need to focus on only three of them) and rows in the order of millions. So it's quite a big dataset. Let say that the columns of interest are A, B and C. I need to take the data.frame and construct a structure of list where I have a list for every level of A, those list all contain lists for every levels of B, and the 'b-lists' contains all the values of C that match the corresponding levels of A and B. So, I should be able to write something like this:
>     MyData@list_structure$x_level_of_A$y_level_of_B
> and get a vector of the values of C that were on rows where A=x_level_of_A and B=y_level_of_B.
>
> My first attempt was to use two imbricated lapply functions running something like this:
>     list_structure <- lapply(levels(A), function(x) {
>       as.character(x) = lapply(levels(B), function(y) {
>         as.character(y) = C[A==x & B==y]
>       })
>     })
> The real code was not quite as simple, but I managed to have it work, and it worked well on my first dataset (where A and B had only few levels). I was quite happy... but the imbricated loops killed me on a second dataset where A had several thousand levels. So I tried something else.
>
> My second attempt was to go through every row of the data.frame and append the value to the appropriate vector. I first initialized a structure of lists ending with NULL vector, then I did something like this:
>     for (i in 1:nrow(DataFrame)) {
>       eval(
>         substitute(
>           append(MyData@list_structure$a_value$b_value, c_value),
>           list(a_value=as.character(DF$A[i]),
>                b_value=as.character(DF$B[i]),
>                c_value=as.character(DF$C[i]))
>         )
>       )
>     }
> This works... but way too slowly for my purpose. I would like to know if there is a better road to take to do this transformation. Or, if there is a way of speeding one of the two solutions that I have tried. Thank you very much for your help! (And in your replies, please remember that this is my first project in R, so don't hesitate to state the obvious if it seems like I am missing it!) Frederic
> -- View this message in context: http://r.789695.n4.nabble.com/How-to-quickly-convert-a-data-frame-into-a-structure-of-lists-tp3731746p3731746.html Sent from the R help mailing list archive at Nabble.com.
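One loop-free way to get the nested structure described in the question is to let split() do the grouping. This is only a sketch, not something proposed in the thread: the toy data frame `DF` and its level names are made up, and how it scales to millions of rows has not been checked here.

```r
# Hypothetical toy data standing in for the real data frame
DF <- data.frame(
  A = factor(c("a1", "a1", "a2", "a2", "a2")),
  B = factor(c("b1", "b2", "b1", "b1", "b2")),
  C = c(10, 20, 30, 40, 50)
)

# Split the rows by A once, then split C by B inside each piece.
# Because B is a factor, every level of B appears in every sub-list.
list_structure <- lapply(split(DF, DF$A), function(d) split(d$C, d$B))

# Values of C on rows where A == "a2" and B == "b1"
list_structure$a2$b1

# A flat alternative: one list element per A.B combination
flat <- split(DF$C, list(DF$A, DF$B))
flat[["a2.b1"]]
```

The flat version avoids nesting altogether, which may be enough if the lists are only ever accessed by an (A, B) pair.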
Re: [R] Floats in Microsoft Basic format
On 10/08/2011 5:58 AM, taraxacum wrote:
> Hi all, I need to convert a floating point value from Microsoft Basic format to IEEE format. There's a simple way to achieve this in R or I have to write my own function? (e.g. convert the C code below)

You'll need to write your own function. It can be very similar to what you have below, except:
- you'll need to produce double rather than single (easy, just produce a single and assign to a double, C does the conversion for you)
- you need to change the args to your function to some that R understands.

Duncan Murdoch

> thanks t
>
>     #include <string.h>        /* for strncpy */
>
>     int _fmsbintoieee(float *src4, float *dest4)
>     {
>        unsigned char *msbin = (unsigned char *)src4;
>        unsigned char *ieee = (unsigned char *)dest4;
>        unsigned char sign = 0x00;
>        unsigned char ieee_exp = 0x00;
>        int i;
>
>        /* MS Binary Format                         */
>        /* byte order =>    m3 | m2 | m1 | exponent */
>        /* m1 is most significant byte => sbbb|bbbb */
>        /* m3 is the least significant byte         */
>        /*      m = mantissa byte                   */
>        /*      s = sign bit                        */
>        /*      b = bit                             */
>
>        sign = msbin[2] & 0x80;      /* 1000|0000b  */
>
>        /* IEEE Single Precision Float Format       */
>        /*    m3        m2        m1     exponent   */
>        /* mmmm|mmmm mmmm|mmmm emmm|mmmm seee|eeee  */
>        /*          s = sign bit                    */
>        /*          e = exponent bit                */
>        /*          m = mantissa bit                */
>
>        for (i=0; i<4; i++) ieee[i] = 0;
>
>        /* any msbin w/ exponent of zero = zero */
>        if (msbin[3] == 0) return 0;
>
>        ieee[3] |= sign;
>
>        /* MBF is bias 128 and IEEE is bias 127. ALSO, MBF places */
>        /* the decimal point before the assumed bit, while        */
>        /* IEEE places the decimal point after the assumed bit.   */
>
>        ieee_exp = msbin[3] - 2;    /* actually, msbin[3]-1-128+127 */
>
>        /* the first 7 bits of the exponent in ieee[3] */
>        ieee[3] |= ieee_exp >> 1;
>
>        /* the one remaining bit in first bin of ieee[2] */
>        ieee[2] |= ieee_exp << 7;
>
>        /* 0111|1111b : mask out the msbin sign bit */
>        ieee[2] |= msbin[2] & 0x7f;
>
>        ieee[1] = msbin[1];
>        ieee[0] = msbin[0];
>
>        return 0;
>     }
>
> -- View this message in context: http://r.789695.n4.nabble.com/Floats-in-Microsoft-Basic-format-tp3732456p3732456.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] function runif in for loop
On 10/08/2011 7:28 AM, Johannes Radinger wrote:
> Hello, I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input and X should be defined as a random number between two values. Therefore I want to use the function runif like:
>     X <- runif(1, Xa, Xb)
> but it seems that runif doesn't allow to use vectors. So I think I've to calculate the new vector X by using a for loop.

runif() does allow vectors. Assuming Xa and Xb are vectors of length n, then

    X <- runif(n, Xa, Xb)

will work. (Xa and Xb don't both have to be vectors; values will be recycled as necessary.)

Duncan Murdoch

> I tried
>     for (i in 1:length(lT)) T <- runif(1, lT, uT)
> but that doesn't work. What is the correct for-loop function to create this new vector/variable? Can I use that function then as an input for MCMCregress? thank you Johannes
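A short sketch of the vectorised call, reusing the lT and uT bounds from earlier in the thread (the seed and the recycling example `Y` are additions for illustration only):

```r
set.seed(1)            # only so the sketch is repeatable
lT <- sample(1:10)     # lower bounds, one per draw
uT <- sample(21:30)    # upper bounds, one per draw

# One call draws a value for every (lower, upper) pair
X <- runif(length(lT), lT, uT)
X

# Recycling: a scalar lower bound and a length-2 upper bound
# give upper limits 1, 10, 1, 10, 1 for the five draws
Y <- runif(5, 0, c(1, 10))
Y
```

Each element of X lies between the matching elements of lT and uT, so no explicit loop is needed.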
Re: [R] rbind/cbind
On Aug 10, 2011, at 9:08 AM, Anthony Ching Ho Ng wrote:
> Dear list, I wonder if there a better way to have rbind/cbind/append to create the first element (if it is empty) instead of doing the following in a loop?
>     for (i in 1:10) {
>       if (i == 1) {
>         aRow = SomeExpression(i)
>       } else {
>         aRow = rbind(aRow, SomeExpression(i))
>       }
>     }

Generally one is advised not to use rbind in this manner but rather to pre-allocate aRow to the size needed and then to add information by rows using "[". For a matrix this might be:

    aRow <- matrix(NA, ncol=3, nrow=10)
    for (i in 1:10) {
      aRow[i, ] <- SomeExpression(i)
    }

-- David Winsemius, MD West Hartford, CT
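The pre-allocation advice can be sketched as follows. SomeExpression() is never defined in the thread, so the version below is a hypothetical stand-in that returns three numbers per call:

```r
# Hypothetical stand-in for SomeExpression(): one row of 3 values per i
SomeExpression <- function(i) c(i, i^2, i^3)

n <- 10
# Allocate the full result once, then fill it row by row
aRow <- matrix(NA_real_, nrow = n, ncol = 3)
for (i in seq_len(n)) {
  aRow[i, ] <- SomeExpression(i)   # index with i, not a fixed row
}
aRow[4, ]
```

Filling an existing matrix avoids copying the whole object on every iteration, which is what repeated rbind() calls do.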
Re: [R] rbind/cbind
Can't you use sapply?

    sapply(seq_len(10), function(i){SomeExpression(i)})

Best regards, Thierry

-----Original message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of David Winsemius Sent: Wednesday 10 August 2011 15:50 To: Anthony Ching Ho Ng CC: r-help@r-project.org Subject: Re: [R] rbind/cbind

> On Aug 10, 2011, at 9:08 AM, Anthony Ching Ho Ng wrote:
>> Dear list, I wonder if there a better way to have rbind/cbind/append to create the first element (if it is empty) instead of doing the following in a loop?
>>     for (i in 1:10) {
>>       if (i == 1) {
>>         aRow = SomeExpression(i)
>>       } else {
>>         aRow = rbind(aRow, SomeExpression(i))
>>       }
>>     }
>
> Generally one is advised not to use rbind in this manner but rather to pre-allocate aRow to the size needed and then to add information by rows using "[". For a matrix this might be:
>     aRow <- matrix(NA, ncol=3, nrow=10)
>     for (i in 1:10) {
>       aRow[i, ] <- SomeExpression(i)
>     }
> -- David Winsemius, MD West Hartford, CT
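One detail worth keeping in mind with the sapply() route: when each call returns a vector, sapply() binds the results as columns, so the result is the transpose of what the rbind loop builds. A sketch, again with a hypothetical SomeExpression():

```r
# Hypothetical stand-in for SomeExpression(): one row of 3 values per i
SomeExpression <- function(i) c(i, i^2, i^3)

# sapply() puts each result in a column: a 3 x 10 matrix
by_col <- sapply(seq_len(10), SomeExpression)
dim(by_col)

# Transpose to get one row per i, matching the rbind loop
by_row <- t(by_col)

# Alternative that binds rows directly
by_row2 <- do.call(rbind, lapply(seq_len(10), SomeExpression))
dim(by_row2)
```

Either form removes the need for the special case on the first iteration.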
[R] studentized and standarized residuals
Hi, I must be doing something silly here, because I can't get the studentised and standardised residuals from R output of a linear model to agree with what I think they should be from equation form. Thanks in advance, Jennifer

    x = seq(1,10)
    y = x + rnorm(10)
    mod = lm(y~x)
    rstandard(mod)
    residuals(mod)/(summary(mod)$sigma)
    rstudent(mod)
    residuals(mod)/(summary(mod)$sigma*sqrt(1-lm.influence(mod)$hat))

-- View this message in context: http://r.789695.n4.nabble.com/studentized-and-standarized-residuals-tp3732997p3732997.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Floats in Microsoft Basic format
On Wed, Aug 10, 2011 at 2:34 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 10/08/2011 5:58 AM, taraxacum wrote: Hi all, I need to convert a floating point value from Microsoft Basic format to IEEE format. There's a simple way to achieve this in R or I have to write my own function? (e.g. convert the C code below) You'll need to write your own function. It can be very similar to what you have below, except: - you'll need to produce double rather than single (easy, just produce a single and assign to a double, C does the conversion for you) - you need to change the args to your function to some that R understands. Wouldn't it be easier, given a working C implementation, to write a wrapper that calls the C code? R isn't built for bit-twiddling, C very definitely is. Also, I imagine you may have a large number of these to change, in which case you could do it all with a single call to a C function that loops over the numbers from an R vector. See the R docs for how to load compiled C into R. Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
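For small inputs, the decoding can also be sketched in plain R rather than through a C wrapper. This is not what either reply proposes; it is an alternative under the assumption that each value arrives as four bytes in the order m3, m2, m1, exponent, as described in the C comments. The function name `mbf_to_double` is made up for illustration.

```r
# Sketch: decode one 4-byte Microsoft Binary Format single into an R double.
# Assumed byte order (as in the C comments): m3, m2, m1, exponent.
mbf_to_double <- function(bytes) {
  b <- as.integer(bytes)           # also accepts a raw vector from readBin()
  stopifnot(length(b) == 4L, all(b >= 0L & b <= 255L))

  # Any value with a zero exponent byte is zero
  if (b[4] == 0L) return(0)

  # Sign is the top bit of m1; the other 7 bits belong to the mantissa
  sign <- if (bitwAnd(b[3], 0x80L) != 0L) -1 else 1
  mantissa <- bitwAnd(b[3], 0x7fL) * 65536 + b[2] * 256 + b[1]

  # MBF bias is 128 with the point before the implied bit,
  # so the value is (1 + m / 2^23) * 2^(exponent - 129)
  sign * (1 + mantissa / 2^23) * 2^(b[4] - 129)
}

mbf_to_double(c(0x00, 0x00, 0x00, 0x81))   # 1
mbf_to_double(c(0x00, 0x00, 0x20, 0x84))   # 10
mbf_to_double(c(0x00, 0x00, 0x80, 0x81))   # -1
```

For a long vector of values, the compiled route suggested in the thread is likely to be faster; the sketch is mainly useful for checking a handful of numbers.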
[R] useR! 2012 Nashville TN: Your Input Needed
So far we have received over 70 responses to our survey. If you have NOT responded already and are likely to attend useR! 2012, please take the extremely short survey today. The link is below. More information about Nashville may be seen at http://biostat.mc.vanderbilt.edu/UseR-2012 In a day or two the conference web address will be http://www.r-project.org/useR-2012/ Thanks! -- The 2012 R User Conference - useR! 2012 - will be held in Nashville Tennessee USA, June 12-15, 2012 on the campus of Vanderbilt University. We would like to begin estimating the number of attendees, their area of interest, and the number seeking hotel vs. lower-cost housing. If you are likely to attend useR! 2012 please go to the following link to answer two questions: https://redcap.vanderbilt.edu/surveys/?s=2wiqWo There are many fun things to do in Nashville -- Music City USA -- around the time of the meeting. Vanderbilt is 2.2 miles (3.5 km) from the center of the action. I hope to see many of you at useR! 2011 at the University of Warwick in Coventry UK in just over a week. If anyone knows of another site to which I should send this announcement please e-mail me your suggestion. Frank -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Floats in Microsoft Basic format
On 10/08/2011 10:16 AM, Barry Rowlingson wrote: On Wed, Aug 10, 2011 at 2:34 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 10/08/2011 5:58 AM, taraxacum wrote: Hi all, I need to convert a floating point value from Microsoft Basic format to IEEE format. There's a simple way to achieve this in R or I have to write my own function? (e.g. convert the C code below) You'll need to write your own function. It can be very similar to what you have below, except: - you'll need to produce double rather than single (easy, just produce a single and assign to a double, C does the conversion for you) - you need to change the args to your function to some that R understands. Wouldn't it be easier, given a working C implementation, to write a wrapper that calls the C code? R isn't built for bit-twiddling, C very definitely is. That's actually what I was suggesting. Duncan Murdoch Also, I imagine you may have a large number of these to change, in which case you could do it all with a single call to a C function that loops over the numbers from an R vector. See the R docs for how to load compiled C into R. Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Reading XML files masquerading as XL files
R version 2.13.1 OS X (or Windows)

Colleagues, I received a number of files with a .xls extension. These files open in XL and, by all appearances, are XL files. However, it appears to me that the files are actually XML:

    > readLines(dir()[16])[1:10]
     [1] "<?xml version=\"1.0\"?>"
     [2] "<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\""
     [3] "xmlns:o=\"urn:schemas-microsoft-com:office:office\""
     [4] "xmlns:x=\"urn:schemas-microsoft-com:office:excel\""
     [5] "xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\""
     [6] "xmlns:html=\"http://www.w3.org/TR/REC-html40\">"
     [7] "<DocumentProperties xmlns=\"urn:schemas-microsoft-com:office:office\">"
     [8] "<Version>12.0</Version>"
     [9] "</DocumentProperties>"
    [10] "<OfficeDocumentSettings xmlns=\"urn:schemas-microsoft-com:office:office\">"

I had initially tried to read the files using read.xls (gdata) but that failed (not surprisingly). I could open each Excel file, then save as csv, then use read.csv. However, there are many files so I would love to have a solution that does not require this brute force approach. Are there any packages that would allow me to read these files without the additional steps?

Dennis

Dennis Fisher MD P < (The "P Less Than" Company) Phone: 1-866-PLessThan (1-866-753-7784) Fax: 1-866-PLessThan (1-866-753-7784) www.PLessThan.com
Re: [R] studentized and standarized residuals
On 08/10/2011 10:03 AM, Jen wrote:
> Hi, I must be doing something silly here, because I can't get the studentised and standardised residuals from r output of a linear model to agree with what I think they should be from equation form.
>     x = seq(1,10)
>     y = x + rnorm(10)
>     mod = lm(y~x)
>     rstandard(mod)
>     residuals(mod)/(summary(mod)$sigma)
>     rstudent(mod)
>     residuals(mod)/(summary(mod)$sigma*sqrt(1-lm.influence(mod)$hat))

The terms "studentized" and "standardized" are sometimes used differently by different authors and software packages. In R, the standardized residuals are based on your second calculation above. The studentized residuals are similar, but involve estimating sigma in a way that leaves out the ith data point when calculating the ith residual (some authors call these the "studentized deleted residuals" or the "externally studentized residuals"). There is a closed form expression, but it is somewhat bulky.

-- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky
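The two definitions can be checked numerically. The sketch below reuses the model from the question (the seed is added only for repeatability) and also writes out one common form of the closed-form expression, with n observations and p estimated coefficients:

```r
set.seed(42)                 # only so the sketch is repeatable
x <- 1:10
y <- x + rnorm(10)
mod <- lm(y ~ x)

infl <- lm.influence(mod)
res  <- residuals(mod)
h    <- infl$hat             # leverages
s    <- summary(mod)$sigma   # residual standard error from the full fit

# Standardized (internally studentized): full-data sigma plus leverage
std_manual <- res / (s * sqrt(1 - h))
all.equal(std_manual, rstandard(mod))

# Studentized (externally): sigma re-estimated without observation i
stud_manual <- res / (infl$sigma * sqrt(1 - h))
all.equal(stud_manual, rstudent(mod))

# Closed form from the standardized residuals r
n <- length(res)
p <- length(coef(mod))
r <- rstandard(mod)
stud_closed <- r * sqrt((n - p - 1) / (n - p - r^2))
all.equal(stud_closed, rstudent(mod))
```

The leave-one-out sigma comes straight from lm.influence(), so no model needs to be refitted n times.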
Re: [R] function runif in for loop
Following the suggestion by Duncan Murdoch, this should work for you.

    X <- runif(length(lT), lT, uT)

Jean

From: Johannes Radinger jradin...@gmx.at To: Jean V Adams jvad...@usgs.gov Cc: r-help@r-project.org Date: 08/10/2011 08:40 AM Subject: Re: [R] function runif in for loop

> Jean, thank you for your answer. especially the line X <- numeric(length(lT)) helped me a lot. Anyway, in my case I'd like to get a dynamic variable or better a function for X. I mean if i try to call X I'd like that this drawing of random number is performed. In the case now if I call X several times I'll always get the same random numbers. I thought about something like:
>     X <- for (i in 1:length(lT)) runif(1, lT[i], uT[i])
> So that I can use X as a variable for multiple runs and each run new random values are used. thank you Johannes
>
> -------- Original message -------- Date: Wed, 10 Aug 2011 08:19:07 -0500 From: Jean V Adams jvad...@usgs.gov To: Johannes Radinger jradin...@gmx.at CC: r-help@r-project.org Subject: Re: [R] function runif in for loop
>
>> Johannes, You have the loop set up right, you just need to add indexing to refer to the looping variable, i.
>>     lT <- sample(1:10)
>>     uT <- sample(21:30)
>>     X <- numeric(length(lT))
>>     for (i in 1:length(lT)) X[i] <- runif(1, lT[i], uT[i])
>>     X
>> Note that I changed the name of the result from T to X, because T has special meaning in R. Jean
>>
>> Jean V. Adams Statistician U.S. Geological Survey Great Lakes Science Center 223 East Steinfest Road Antigo, WI 54409 USA
>>
>> From: Johannes Radinger jradin...@gmx.at To: r-help@r-project.org Date: 08/10/2011 07:23 AM Subject: [R] function runif in for loop Sent by: r-help-boun...@r-project.org
>>
>>> Hello, I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input and X should be defined as a random number between two values. Therefore I want to use the function runif like:
>>>     X <- runif(1, Xa, Xb)
>>> but it seems that runif doesn't allow to use vectors. So I think I've to calculate the new vector X by using a for loop. I tried
>>>     for (i in 1:length(lT)) T <- runif(1, lT, uT)
>>> but that doesn't work. What is the correct for-loop function to create this new vector/variable? Can I use that function then as an input for MCMCregress? thank you Johannes
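The request for an X that is redrawn on every use can be met by wrapping the call in a function. This is a sketch only: the name `draw_X` and the fixed bounds are made up for illustration, and whether MCMCregress can take such a function as input is not addressed in the thread.

```r
lT <- 1:10     # lower bounds (illustrative values)
uT <- 21:30    # upper bounds (illustrative values)

# Each call performs a fresh draw, one value per (lower, upper) pair
draw_X <- function() runif(length(lT), lT, uT)

X1 <- draw_X()
X2 <- draw_X()

# Two calls give different vectors, both inside the bounds
identical(X1, X2)
range(X1)
```

A for loop has no value in R that can be assigned, so `X <- for (...)` leaves X as NULL; a function is the usual way to defer the computation.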
Re: [R] studentized and standarized residuals
Thanks Patrick - at least I know I wasn't being too silly :-) Jen -- View this message in context: http://r.789695.n4.nabble.com/studentized-and-standarized-residuals-tp3732997p3733173.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] join columns
Dear R-help, I wonder if you could give me some suggestions on how to do a union join of two data frames as follows: union the columns, joining the common ones, and insert a 0 where a value is missing. I made a function to perform this, and I know it may not be very well written, but it works. Any suggestions are welcome, many thanks. Anthony

    q1 = data.frame(a=1,b=2,c=3,row.names="q1")
       a b c
    q1 1 2 3

    q2 = data.frame(d=4,b=1,a=4,row.names="q2")
       d b a
    q2 4 1 4

    myJoinColumns(q1,q2)
       a b c d
    q1 1 2 3 0
    q2 4 1 0 4

    myJoinColumns <- function(q1,q2){
      allNames = sort(union(colnames(q1),colnames(q2)))
      for (i in 1:length(allNames)){
        t1 = which(colnames(q1) == allNames[i])
        t2 = which(colnames(q2) == allNames[i])
        if (length(t1) == 1){ sec1 = q1[,t1] } else { sec1 = 0 }
        if (length(t2) == 1){ sec2 = q2[,t2] } else { sec2 = 0 }
        if (i == 1){
          qTable = matrix(c(sec1,sec2))
        } else {
          qTable = cbind(qTable,c(sec1,sec2))
        }
      }
      colnames(qTable) = allNames
      rownames(qTable) = c("q1","q2")
      qTable
    }
Re: [R] join columns
On Aug 10, 2011, at 11:04 AM, Anthony Ching Ho Ng wrote:
> Dear R-help, I wonder if you could give me some suggestions in how to do a union join of two data frames as follow: - union join the common column, and insert a 0 if one is missing. I made a function to perform the following, and I know it may not that quite welly written, but it works. Any suggestions are welcome, many thanks. Anthony
>     q1 = data.frame(a=1,b=2,c=3,row.names="q1")
>        a b c
>     q1 1 2 3
>     q2 = data.frame(d=4,b=1,a=4,row.names="q2")
>        d b a
>     q2 4 1 4
>     myJoinColumns(q1,q2)
>        a b c d
>     q1 1 2 3 0
>     q2 4 1 0 4

    temp <- merge(q1, q2, all=TRUE)
    temp[is.na(temp)] <- 0
    temp
      a b c d
    1 1 2 3 0
    2 4 1 0 4

>     myJoinColumns <- function(q1,q2){
>       allNames = sort(union(colnames(q1),colnames(q2)))
>       for (i in 1:length(allNames)){
>         t1 = which(colnames(q1) == allNames[i])
>         t2 = which(colnames(q2) == allNames[i])
>         if (length(t1) == 1){ sec1 = q1[,t1] } else { sec1 = 0 }
>         if (length(t2) == 1){ sec2 = q2[,t2] } else { sec2 = 0 }
>         if (i == 1){
>           qTable = matrix(c(sec1,sec2))
>         } else {
>           qTable = cbind(qTable,c(sec1,sec2))
>         }
>       }
>       colnames(qTable) = allNames
>       rownames(qTable) = c("q1","q2")
>       qTable
>     }

David Winsemius, MD West Hartford, CT
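The merge() answer, written out as a self-contained sketch with the two one-row data frames from the question:

```r
q1 <- data.frame(a = 1, b = 2, c = 3, row.names = "q1")
q2 <- data.frame(d = 4, b = 1, a = 4, row.names = "q2")

# all = TRUE keeps rows from both inputs (a full outer join);
# the join key defaults to the shared columns a and b
temp <- merge(q1, q2, all = TRUE)

# Columns that exist in only one input come back as NA; replace with 0
temp[is.na(temp)] <- 0
temp
```

Note that merge() discards the original row names (q1, q2), so they would have to be carried as an ordinary column if they are still needed afterwards.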
Re: [R] Using ANCOVA in R
January,

It looks like you will need an interaction effect, perhaps

    g <- lm(response ~ subject + group*time)

Please see the ancova function in the HH package.

    install.packages("HH")  ## if necessary
    library(HH)
    ?ancova

Rich

On Wed, Aug 10, 2011 at 5:15 AM, January Weiner january.wei...@gmail.com wrote:
> Hello, I have a problem with using the following design with ANCOVA in R. There are two groups (control + treatment), each with ten subjects. The subjects show a response that is monitored over time (four time points). For a single given subject, the response can be analysed with linear regression with time as the independent variable. The question is, how does the response differ between the two groups. It is to be expected that the slope of the response differs between the two groups, while the intercept itself is of minor interest. I have tried a linear model as in
>     g <- lm(response ~ time + subject + group)
> but somehow I think that this is not correct, as it would not show a difference in the slope of the response. Any kind of help would be greatly appreciated. -- January
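What the interaction term buys can be sketched on simulated data. Everything below is made up for illustration: the data frame `d`, the assumed true slopes (1 for control, 2 for treatment) and the noise level. For brevity the sketch also leaves out the subject term from the formula above, so it ignores the correlation between repeated measurements on the same subject.

```r
set.seed(123)   # only so the sketch is repeatable

# 20 subjects x 4 time points; subjects 1-10 control, 11-20 treatment
d <- expand.grid(time = 1:4, subject = 1:20)
d$group <- factor(ifelse(d$subject <= 10, "control", "treatment"))

# Assumed true slopes: 1 in the control group, 2 in the treatment group
slope <- ifelse(d$group == "treatment", 2, 1)
d$response <- 5 + slope * d$time + rnorm(nrow(d), sd = 0.5)

# group * time expands to group + time + group:time
g <- lm(response ~ group * time, data = d)
coef(g)

# The interaction coefficient estimates the difference in slopes
coef(g)[["grouptreatment:time"]]
```

Without the group:time term the model forces both groups to share one slope, which is why the additive formula in the question cannot show a slope difference.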
Re: [R] round() a data frame containing 'character' variables?
Hello

On Wed, Aug 10, 2011 at 2:31 PM, Jean V Adams jvad...@usgs.gov wrote:
> The function format() might serve your needs.

This looks very promising, but yields some strange results. See below:

    > x <- data.frame(a=rnorm(10), b=rnorm(10), c=rnorm(10), d=letters[1:10])
    > x
                 a           b            c d
    1   0.54114449 -0.11195580  1.526279364 a
    2   3.27109063  0.50848249 -0.215760332 b
    3  -0.27064475 -1.04749725  0.082319811 c
    4  -0.06638611 -0.58600572  0.004148253 d
    5  -0.06170739 -0.37885203  0.689125494 e
    6   0.53211363 -0.09150913 -0.463972307 f
    7  -0.43314431 -0.28981614 -0.973410994 g
    8   0.52137857 -1.15077343  0.163120205 h
    9  -1.39581552  1.27378389  0.136708313 i
    10  0.06348058 -0.00369746 -0.570214119 j

    > format(x, digits=2)  ##it displays 3 or 4 digits, instead of the required 2
            a       b       c d
    1   0.541 -0.1120  1.5263 a
    2   3.271  0.5085 -0.2158 b
    3  -0.271 -1.0475  0.0823 c
    4  -0.066 -0.5860  0.0041 d
    5  -0.062 -0.3789  0.6891 e
    6   0.532 -0.0915 -0.4640 f
    7  -0.433 -0.2898 -0.9734 g
    8   0.521 -1.1508  0.1631 h
    9  -1.396  1.2738  0.1367 i
    10  0.063 -0.0037 -0.5702 j

    > format(x, digits=2, nsmall=1, scientific=7)  ##no change when setting related arguments
            a       b       c d
    1   0.541 -0.1120  1.5263 a
    2   3.271  0.5085 -0.2158 b
    3  -0.271 -1.0475  0.0823 c
    4  -0.066 -0.5860  0.0041 d
    5  -0.062 -0.3789  0.6891 e
    6   0.532 -0.0915 -0.4640 f
    7  -0.433 -0.2898 -0.9734 g
    8   0.521 -1.1508  0.1631 h
    9  -1.396  1.2738  0.1367 i
    10  0.063 -0.0037 -0.5702 j

    > round(x[1:3], digits=2)  ##works as expected
           a     b     c
    1   0.54 -0.11  1.53
    2   3.27  0.51 -0.22
    3  -0.27 -1.05  0.08
    4  -0.07 -0.59  0.00
    5  -0.06 -0.38  0.69
    6   0.53 -0.09 -0.46
    7  -0.43 -0.29 -0.97
    8   0.52 -1.15  0.16
    9  -1.40  1.27  0.14
    10  0.06  0.00 -0.57

Any ideas why format() and round() give such different results? Can format() be set to behave similarly to round()?

Regards Liviu
[R] [R-pkgs] New Matrix and lme4: Must reinstall lme4 if got new Matrix
We have released to CRAN a new version of the (recommended) package Matrix, and of package lme4 yesterday. Anyone who gets the new version of Matrix *MUST* re-install lme4 -- if (s)he is using lme4 at all. Technical details about that further below. The fact that yesterday's version number of Matrix is 0.9995875-2, indicates that Matrix' version is indeed approaching 1.0 (*), and I'd declare this version as release candidate for 1.0. As a recommended package, Matrix is part of every R distribution, our aim is to release Matrix_1.0-0 (or higher) with the next non-patch release of R, i.e., R-2.14.0 somewhere in October. For this reason, we are asking R useRs, programmeRs and provideRs, to ``hash at'' the package, trying to find problems / bugs, badly lacking (or wrong) documentation, etc, and report it to us (to 'Matrix-authors@...' or possibly R-devel@...), so we can prepare a shiny sparkling Matrix_1.0-0 in time. Thank you in advance for such a contribution to the Free Software universe. Martin Maechler and Doug Bates (and Ben Bolker for lme4). -- -- -- - Why must lme4 be re-installed as well ? [Technical !] This is because lme4 has a 'LinkingTo: Matrix' in its DESCRIPTION and indeed, lme4's C code is using part of Matrix' C code. The change in Matrix: Part of the C-level interface in Matrix is now (again, after several years) using the standard CHOLMOD typedef of UFlong. This means that the C API of Matrix should behave conformly with what other instances of CHOLMOD export. (*) For some, it may be amusing to read https://stat.ethz.ch/pipermail/r-packages/2008/000911.html which was the last announcement of Matrix on R-packages. ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] studentized and standarized residuals
Dear Jen,

Actually you can check out what R does by looking at the source.

    # first type the name of the function
    > rstandard
    function (model, ...)
    UseMethod("rstandard")
    <environment: namespace:stats>

    # ?methods will list you the corresponding functions
    > methods(rstandard)
    [1] rstandard.glm rstandard.lm

    # choose rstandard.lm
    > rstandard.lm
    function (model, infl = lm.influence(model, do.coef = FALSE),
        sd = sqrt(deviance(model)/df.residual(model)), ...)
    {
        res <- infl$wt.res/(sd * sqrt(1 - infl$hat))
        res[is.infinite(res)] <- NaN
        res
    }

    # in case the function is not visible,
    # you can use package-name:::function-name to display it
    > stats:::rstandard.lm

Best, Denes

> Thanks Patrick - at least I know I wasn't being too silly :-) Jen
> -- View this message in context: http://r.789695.n4.nabble.com/studentized-and-standarized-residuals-tp3732997p3733173.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] join columns
Try this:

  merge(q1, q2, all = TRUE)

On Wed, Aug 10, 2011 at 12:04 PM, Anthony Ching Ho Ng anthony.ch...@gmail.com wrote:
> Dear R-help,
>
> I wonder if you could give me some suggestions on how to do a union join of two data frames as follows: union join the common columns, and insert a 0 if one is missing. I wrote a function to perform this, and I know it may not be very well written, but it works. Any suggestions are welcome, many thanks.
>
> Anthony
>
>   q1 = data.frame(a=1,b=2,c=3,row.names="q1")
>      a b c
>   q1 1 2 3
>   q2 = data.frame(d=4,b=1,a=4, row.names="q2")
>      d b a
>   q2 4 1 4
>   myJoinColumns(q1,q2)
>      a b c d
>   q1 1 2 3 0
>   q2 4 1 0 4
>
>   myJoinColumns <- function(q1,q2){
>     allNames = sort(union(colnames(q1),colnames(q2)))
>     for (i in 1:length(allNames)){
>       t1 = which(colnames(q1) == allNames[i])
>       t2 = which(colnames(q2) == allNames[i])
>       if (length(t1) == 1){ sec1 = q1[,t1] } else { sec1 = 0 }
>       if (length(t2) == 1){ sec2 = q2[,t2] } else { sec2 = 0 }
>       if (i == 1){
>         qTable = matrix(c(sec1,sec2))
>       }else{
>         qTable = cbind(qTable,c(sec1,sec2))
>       }
>     }
>     colnames(qTable) = allNames
>     rownames(qTable) = c("q1","q2")
>     qTable
>   }

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
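Note that merge() fills the missing cells with NA rather than 0 and does not keep row names. A minimal sketch of one way to get the table from the question, assuming it is acceptable to carry the row names along in a temporary column (called name here, a name chosen only for this illustration):

```r
q1 <- data.frame(a = 1, b = 2, c = 3, row.names = "q1")
q2 <- data.frame(d = 4, b = 1, a = 4, row.names = "q2")

# Carry the row names along as an ordinary column, since merge() drops them.
q1$name <- rownames(q1)
q2$name <- rownames(q2)

# Full outer join on all shared columns (a, b and name).
m <- merge(q1, q2, all = TRUE)

# Restore the row names and drop the helper column.
rownames(m) <- m$name
m$name <- NULL

# Columns missing from one input come back as NA; turn them into 0.
m[is.na(m)] <- 0

# Order columns alphabetically, as myJoinColumns() does.
m <- m[, sort(names(m))]
m
```

Unlike myJoinColumns(), this returns a data frame rather than a matrix; as.matrix(m) converts it if a matrix is needed.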
Re: [R] round() a data frame containing 'character' variables?
As it says in ?format, the digits argument specifies "... how many significant digits are to be used ... enough decimal places will be used so that the smallest (in magnitude) number has this many significant digits ..."

In your example, the last value in column a is 0.06348058, which is written as 0.063 to two significant digits. There is no way (that I know of) to make the format() function do the same sort of thing as round(). If digits won't meet your needs, you could try something like this

  data.frame(lapply(x, function(y) if(is.numeric(y)) round(y, 2) else y))

Jean

`·.,, (((º `·.,, (((º `·.,, (((º
Jean V. Adams
Statistician
U.S. Geological Survey Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409 USA

Liviu Andronic landronim...@gmail.com wrote on 08/10/2011 10:26:43 AM:
> Hello
>
> On Wed, Aug 10, 2011 at 2:31 PM, Jean V Adams jvad...@usgs.gov wrote:
> > The function format() might serve your needs.
>
> This looks very promising, but yields some strange results. See below:
>
>   x <- data.frame(a=rnorm(10), b=rnorm(10), c=rnorm(10), d=letters[1:10])
>   x
>                a           b            c d
>   1   0.54114449 -0.11195580  1.526279364 a
>   2   3.27109063  0.50848249 -0.215760332 b
>   3  -0.27064475 -1.04749725  0.082319811 c
>   4  -0.06638611 -0.58600572  0.004148253 d
>   5  -0.06170739 -0.37885203  0.689125494 e
>   6   0.53211363 -0.09150913 -0.463972307 f
>   7  -0.43314431 -0.28981614 -0.973410994 g
>   8   0.52137857 -1.15077343  0.163120205 h
>   9  -1.39581552  1.27378389  0.136708313 i
>   10  0.06348058 -0.00369746 -0.570214119 j
>
>   format(x, digits=2) ##it displays 3 or 4 digits, instead of the required 2
>           a       b       c d
>   1   0.541 -0.1120  1.5263 a
>   2   3.271  0.5085 -0.2158 b
>   3  -0.271 -1.0475  0.0823 c
>   4  -0.066 -0.5860  0.0041 d
>   5  -0.062 -0.3789  0.6891 e
>   6   0.532 -0.0915 -0.4640 f
>   7  -0.433 -0.2898 -0.9734 g
>   8   0.521 -1.1508  0.1631 h
>   9  -1.396  1.2738  0.1367 i
>   10  0.063 -0.0037 -0.5702 j
>
>   format(x, digits=2, nsmall=1, scientific=7) ##no change when setting related arguments
>           a       b       c d
>   1   0.541 -0.1120  1.5263 a
>   2   3.271  0.5085 -0.2158 b
>   3  -0.271 -1.0475  0.0823 c
>   4  -0.066 -0.5860  0.0041 d
>   5  -0.062 -0.3789  0.6891 e
>   6   0.532 -0.0915 -0.4640 f
>   7  -0.433 -0.2898 -0.9734 g
>   8   0.521 -1.1508  0.1631 h
>   9  -1.396  1.2738  0.1367 i
>   10  0.063 -0.0037 -0.5702 j
>
>   round(x[1:3], digits=2) ##works as expected
>          a     b     c
>   1   0.54 -0.11  1.53
>   2   3.27  0.51 -0.22
>   3  -0.27 -1.05  0.08
>   4  -0.07 -0.59  0.00
>   5  -0.06 -0.38  0.69
>   6   0.53 -0.09 -0.46
>   7  -0.43 -0.29 -0.97
>   8   0.52 -1.15  0.16
>   9  -1.40  1.27  0.14
>   10  0.06  0.00 -0.57
>
> Any ideas why format() and round() give such different results? Can format() be set to behave similarly to round()?
>
> Regards
> Liviu
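The lapply() suggestion above can be tried on a small mixed data frame. This is a minimal sketch using three of the values from column a of the example; the formatC() call is an addition not mentioned in the thread, shown as one possible way to get a fixed number of decimals for display:

```r
x <- data.frame(a = c(0.54114449, 3.27109063, 0.06348058),
                d = c("a", "b", "j"))

# Round only the numeric columns; leave everything else untouched.
rounded <- data.frame(lapply(x, function(y) if (is.numeric(y)) round(y, 2) else y))
rounded$a

# For display with a fixed number of decimals, formatC() returns strings.
shown <- formatC(x$a, format = "f", digits = 2)
shown
```

The difference is that round() returns numbers, which can still be used in calculations, while formatC() and format() return character strings meant only for printing.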
[R] RExcel
Hi list,

I used to work with RExcel in Excel 2003. Now, in Excel 2007, I tried the same RExcel, but it did not work. I got R version 12. I downloaded and installed the latest version of RExcel, 3.2.0, from http://sunsite.univie.ac.at/rcom/. It has added the RExcel add-ins, but when I click on starting R in the add-ins, I get the following sequence of errors:

  SCtools not available
  SCTools can not be loaded.
  could not start Rserver
  There seems to be no R process connected to Excel

I used to install Rsrv200.exe, and do not know if I still need to install it in this version.

Any help, please,
thanks,
Alireza
[R] spfeml error message
Hello,

I am trying to learn spatial panel data analysis (newbie). I have R version 2.13.1 and I downloaded the spml package required for the spatial panel data analysis. However, when I tried the analysis, I got the following error message:

  could not find function spfeml

Can somebody help me with this? Thanks in advance.

-Dar
Re: [R] How to quickly convert a data.frame into a structure of lists
Hello Duncan,

Here is a small example to illustrate what I am trying to do.

  # Example data.frame
  df=data.frame(A=c("a","a","b","b"), B=c("X","X","Y","Z"), C=c(1,2,3,4))
  #   A B C
  # 1 a X 1
  # 2 a X 2
  # 3 b Y 3
  # 4 b Z 4

  ### First way of getting the list structure (ls1) using imbricated lapply loops:
  # Get the structure and populate it:
  ls1<-lapply(levels(df$A), function(levelA) {
    lapply(levels(df$B), function(levelB) {df$C[df$A==levelA & df$B==levelB]})
  })
  # Apply the names:
  names(ls1)<-levels(df$A)
  for (i in 1:length(ls1)) {names(ls1[[i]])<-levels(df$B)}
  # Result:
  ls1$a$X
  # [1] 1 2
  ls1$b$Z
  # [1] 4

The data.frame will always be 'complete', i.e., there will be a value in every row for every column. I want to produce a structure like this one quickly (I aim at something below 10 seconds) for a dataset containing between 1 and 2 million rows.

I hope that this helps clarify things.

Thanks for your help,
Frederic

-- View this message in context: http://r.789695.n4.nabble.com/How-to-quickly-convert-a-data-frame-into-a-structure-of-lists-tp3731746p3733073.html
Re: [R] instal tar.gz package on windows
Hi Duncan,

I have tried to install a tar.gz package following your instructions (https://stat.ethz.ch/pipermail/r-help/2008-August/169599.html) but without success. Here are the steps I followed: I installed the latest version of Rtools and ran Rcmd INSTALL rJava_0.8-8.tar.gz and got the error message attached in errorLog.txt. It seems that there is a problem with JRI but I'm really stuck. Do you have any idea of what it could be?

Thanks!

P:\Desktop>Rcmd INSTALL rJava_0.8-8.tar.gz
* installing to library 'C:/Program Files/R/R-2.13.1/library'
* installing *source* package 'rJava' ...
Generate Windows-specific files (src/jvm-w32) ...
cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-213~1.1/etc/i386/Makeconf
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-213~1.1/etc/i386/Makeconf
  CYGWIN environment variable option nodosfilewarning turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
make: Entering directory `/cygdrive/c/DOCUME~1/lippelan/LOCALS~1/Temp/RtmpHz0yZg/R.INSTALL6d282489/rJava/src/jvm-w32'
dlltool --input-def jvm.def --kill-at --dllname jvm.dll --output-lib libjvm.dll.a
gcc -O2 -c -o findjava.o findjava.c
gcc -s -o findjava.exe findjava.o
make: Leaving directory `/cygdrive/c/DOCUME~1/lippelan/LOCALS~1/Temp/RtmpHz0yZg/R.INSTALL6d282489/rJava/src/jvm-w32'
Find Java...
JAVA_HOME=C:/PROGRA~1/Java/JRE16~1.0_1
=== Building JRI ===
JAVA_HOME=C:/PROGRA~1/Java/JRE16~1.0_1
R_HOME=C:/PROGRA~1/R/R-213~1.1
Creating Makefiles ...
Configuration done.
make -C src JRI.jar
make[1]: Entering directory `/cygdrive/c/DOCUME~1/lippelan/LOCALS~1/Temp/RtmpHz0yZg/R.INSTALL6d282489/rJava/jri/src'
C:/PROGRA~1/Java/JRE16~1.0_1/bin/javac -target 1.4 -source 1.4 -d . ../Mutex.java ../RBool.java ../RConsoleOutputStream.java ../REXP.java ../RFactor.java ../RList.java ../RMainLoopCallbacks.java ../RVector.java ../Rengine.java ../package-info.java
make[1]: C:/PROGRA~1/Java/JRE16~1.0_1/bin/javac: Command not found
make[1]: *** [org/rosuda/JRI/Rengine.class] Error 127
make[1]: Leaving directory `/cygdrive/c/DOCUME~1/lippelan/LOCALS~1/Temp/RtmpHz0yZg/R.INSTALL6d282489/rJava/jri/src'
make: *** [src/JRI.jar] Error 2
WARNING: JRI could NOT be built
Set IGNORE=1 if you want to build rJava anyway.
ERROR: configuration failed for package 'rJava'
* removing 'C:/Program Files/R/R-2.13.1/library/rJava'
* restoring previous 'C:/Program Files/R/R-2.13.1/library/rJava'
Re: [R] function runif in for loop
Jean,

thank you for your answer. The line X <- numeric(length(lT)) in particular helped me a lot. Anyway, in my case I'd like to get a dynamic variable, or better, a function for X. I mean that whenever I call X, the drawing of random numbers should be performed again. As it is now, if I call X several times I'll always get the same random numbers. I thought about something like:

  X <- for (i in 1:length(lT)) runif(1, lT[i], uT[i])

so that I can use X as a variable for multiple runs, with new random values used in each run.

thank you
Johannes

-------- Original Message --------
Date: Wed, 10 Aug 2011 08:19:07 -0500
From: Jean V Adams jvad...@usgs.gov
To: Johannes Radinger jradin...@gmx.at
CC: r-help@r-project.org
Subject: Re: [R] function runif in for loop

> Johannes,
>
> You have the loop set up right, you just need to add indexing to refer to the looping variable, i.
>
>   lT <- sample(1:10)
>   uT <- sample(21:30)
>   X <- numeric(length(lT))
>   for (i in 1:length(lT)) X[i] <- runif(1, lT[i], uT[i])
>   X
>
> Note that I changed the name of the result from T to X, because T has special meaning in R.
>
> Jean
>
> `·.,, (((º `·.,, (((º `·.,, (((º
> Jean V. Adams, Statistician, U.S. Geological Survey Great Lakes Science Center, 223 East Steinfest Road, Antigo, WI 54409 USA
>
> From: Johannes Radinger jradin...@gmx.at
> To: r-help@r-project.org
> Date: 08/10/2011 07:23 AM
> Subject: [R] function runif in for loop
> Sent by: r-help-boun...@r-project.org
>
> > Hello,
> >
> > I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input and X should be defined as a random number between two values. Therefore I want to use the function runif like: X <-(1, Xa, Xb) but it seems that runif doesn't allow the use of vectors. So I think I have to calculate the new vector X by using a for loop. I tried
> >
> >   for (i in 1:length(lT)) T<-runif(1,lT,uT)
> >
> > but that doesn't work. What is the correct for-loop function to create this new vector/variable? Can I use that function then as an input for MCMCregress?
> >
> > thank you
> > Johannes
Re: [R] function runif in for loop
-------- Original Message --------
Date: Wed, 10 Aug 2011 09:38:38 -0400
From: Duncan Murdoch murdoch.dun...@gmail.com
To: Johannes Radinger jradin...@gmx.at
CC: r-help@r-project.org
Subject: Re: [R] function runif in for loop

> On 10/08/2011 7:28 AM, Johannes Radinger wrote:
> > Hello,
> >
> > I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input and X should be defined as a random number between two values. Therefore I want to use the function runif like: X<-(1, Xa, Xb) but it seems that runif doesn't allow the use of vectors. So I think I have to calculate the new vector X by using a for loop.
>
> runif() does allow vectors. Assuming Xa and Xb are vectors of length n, then
>
>   X <- runif(n, Xa, Xb)

Thank you very much, I just got confused by the n, as I always wanted only one value between Xa and Xb. So far my solution based on your suggestion is:

  X <- runif(length(Xa), Xa, Xb)

/johannes

> will work. (Xa and Xb don't both have to be vectors; values will be recycled as necessary.)
>
> Duncan Murdoch
>
> > I tried
> >
> >   for (i in 1:length(lT)) T<-runif(1,lT,uT)
> >
> > but that doesn't work. What is the correct for-loop function to create this new vector/variable? Can I use that function then as an input for MCMCregress?
> >
> > thank you
> > Johannes
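The vectorised call and the explicit loop from the thread can be put side by side. A minimal sketch, with lT and uT generated as in Jean's reply and set.seed() added here only to make the run repeatable:

```r
set.seed(42)            # only to make the sketch repeatable
lT <- sample(1:10)      # lower bounds
uT <- sample(21:30)     # upper bounds

# One draw per (lower, upper) pair; min and max are vectorised.
X <- runif(length(lT), min = lT, max = uT)

# The explicit loop gives a result of the same shape.
X_loop <- numeric(length(lT))
for (i in seq_along(lT)) X_loop[i] <- runif(1, lT[i], uT[i])

# Every draw lies between its own bounds.
all(X >= lT & X <= uT)
all(X_loop >= lT & X_loop <= uT)
```

The two vectors hold different random values because they come from separate draws, but element i of each is drawn from the interval between lT[i] and uT[i].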
Re: [R] How to quickly convert a data.frame into a structure of lists
Hello Denis,

> To borrow shamelessly from one of the prominent helpers on this list: What is the problem you're trying to solve? (attribution: Jim Holtman)

I'm trying to connect two sets of legacy R tools: the output of the first one can be transformed into a data.frame without loss of information, and the input of the second one takes the form of a structure of lists.

> It's entirely possible that there may be a nice 'R way' to do it. Read the posting guide and if at all possible, provide a small, reproducible example that demonstrates what you want to accomplish.

Here is the first way I attacked the problem, illustrated on a tiny dataset (unfortunately this way does not work quickly enough on a real dataset):

  df=data.frame(A=c("a","a","b","b"), B=c("X","X","Y","Z"), C=c(1,2,3,4))
  # Get the structure and populate it:
  ls1<-lapply(levels(df$A), function(levelA) {
    lapply(levels(df$B), function(levelB) {df$C[df$A==levelA & df$B==levelB]})
  })
  # Get the names:
  names(ls1)<-levels(df$A)
  for (i in 1:length(ls1)) {names(ls1[[i]])<-levels(df$B)}
  # Results:
  ls1$a$X
  # [1] 1 2
  ls1$b$Z
  # [1] 4

Thanks for your help,
Frederic

-- View this message in context: http://r.789695.n4.nabble.com/How-to-quickly-convert-a-data-frame-into-a-structure-of-lists-tp3731746p3733114.html
Re: [R] scan(file, encoding=?)
Hi,

this command gives you all possible encoding options on your platform:

  iconvlist()

hope it answers your question,
T

-- View this message in context: http://r.789695.n4.nabble.com/scan-file-encoding-tp840838p3733327.html
Re: [R] RExcel
Don't they have their own support mailing list? You should review their documentation for specifics.

-- David.

On Aug 10, 2011, at 12:24 PM, Dr. Alireza Zolfaghari wrote:
> Hi list,
>
> I used to work with RExcel in Excel 2003. Now, in Excel 2007, I tried the same RExcel, but it did not work. I got R version 12. I downloaded and installed the latest version of RExcel, 3.2.0, from http://sunsite.univie.ac.at/rcom/. It has added the RExcel add-ins, but when I click on starting R in the add-ins, I get the following sequence of errors:
>
>   SCtools not available
>   SCTools can not be loaded.
>   could not start Rserver
>   There seems to be no R process connected to Excel
>
> I used to install Rsrv200.exe, and do not know if I still need to install it in this version.
>
> Any help, please,
Re: [R] How to quickly convert a data.frame into a structure of lists
On 10/08/2011 10:30 AM, Frederic F wrote:
> Hello Duncan,
>
> Here is a small example to illustrate what I am trying to do.
>
>   # Example data.frame
>   df=data.frame(A=c("a","a","b","b"), B=c("X","X","Y","Z"), C=c(1,2,3,4))
>   #   A B C
>   # 1 a X 1
>   # 2 a X 2
>   # 3 b Y 3
>   # 4 b Z 4
>
>   ### First way of getting the list structure (ls1) using imbricated lapply loops:
>   # Get the structure and populate it:
>   ls1<-lapply(levels(df$A), function(levelA) {
>     lapply(levels(df$B), function(levelB) {df$C[df$A==levelA & df$B==levelB]})
>   })
>   # Apply the names:
>   names(ls1)<-levels(df$A)
>   for (i in 1:length(ls1)) {names(ls1[[i]])<-levels(df$B)}
>   # Result:
>   ls1$a$X
>   # [1] 1 2
>   ls1$b$Z
>   # [1] 4
>
> The data.frame will always be 'complete', i.e., there will be a value in every row for every column. I want to produce a structure like this one quickly (I aim at something below 10 seconds) for a dataset containing between 1 and 2 million rows.

I don't know what the timing would be like for your real data, but this does look like by() would work:

  ls1 <- by(df$C, df[,1:2], identity)

When I repeat the rows of df a million times each, this finishes in a few seconds. It would definitely be slower if there were more levels of A or B.

Now ls1 will be a matrix whose entries are the subsets of C that you want, so you can see your two results with slightly different syntax:

  ls1[["a", "X"]]
  [1] 1 2
  ls1[["b","Z"]]
  [1] 4

Duncan Murdoch
Re: [R] instal tar.gz package on windows
On 10/08/2011 11:59 AM, Lippel, Anna wrote:
> Hi Duncan,
>
> I have tried to install a tar.gz package following your instructions (https://stat.ethz.ch/pipermail/r-help/2008-August/169599.html) but without success. Here are the steps I followed: I installed the latest version of Rtools and ran Rcmd INSTALL rJava_0.8-8.tar.gz and got the error message attached in errorLog.txt. It seems that there is a problem with JRI but I'm really stuck. Do you have any idea of what it could be?

The important error message is:

  make[1]: C:/PROGRA~1/Java/JRE16~1.0_1/bin/javac: Command not found
  make[1]: *** [org/rosuda/JRI/Rengine.class] Error 127

So you don't have the Java compiler, or don't have it where that package was looking for it. But I don't know why you wouldn't just install the binary version; why do you want to compile it yourself?

Duncan Murdoch
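A quick way to check the diagnosis above from within R is to look for javac directly. This is a minimal sketch using only base R functions; the message text is illustrative, and the install call is commented out because it needs a CRAN mirror:

```r
# Where does the environment say Java lives, and is the compiler on the PATH?
java_home <- Sys.getenv("JAVA_HOME")   # "" when the variable is not set
javac     <- Sys.which("javac")        # "" when no javac is found

if (!nzchar(javac)) {
  message("No javac on the PATH: a JRE alone has no compiler; a JDK is needed to build JRI.")
}

# Installing the pre-built binary avoids compilation altogether
# (needs a CRAN mirror, so it is commented out here):
# install.packages("rJava")
```

In the log from the question, JAVA_HOME points at a JRE directory, which is consistent with javac not being found there.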
Re: [R] function runif in for loop
On 10/08/2011 9:40 AM, Johannes Radinger wrote:
> Jean, thank you for your answer. The line X <- numeric(length(lT)) in particular helped me a lot. Anyway, in my case I'd like to get a dynamic variable, or better, a function for X. I mean that whenever I call X, the drawing of random numbers should be performed again. As it is now, if I call X several times I'll always get the same random numbers.

Such things do exist in R, but they aren't easy to set up. Why not just make X be a function explicitly? That is,

  X <- function() runif(length(lT), lT, uT)

Then use X() to call the function where you were previously using X.

Duncan Murdoch

> I thought about something like:
>
>   X <- for (i in 1:length(lT)) runif(1, lT[i], uT[i])
>
> so that I can use X as a variable for multiple runs, with new random values used in each run.
>
> thank you
> Johannes
>
> -------- Original Message --------
> Date: Wed, 10 Aug 2011 08:19:07 -0500
> From: Jean V Adams jvad...@usgs.gov
> To: Johannes Radinger jradin...@gmx.at
> CC: r-help@r-project.org
> Subject: Re: [R] function runif in for loop
>
> > Johannes,
> >
> > You have the loop set up right, you just need to add indexing to refer to the looping variable, i.
> >
> >   lT <- sample(1:10)
> >   uT <- sample(21:30)
> >   X <- numeric(length(lT))
> >   for (i in 1:length(lT)) X[i] <- runif(1, lT[i], uT[i])
> >   X
> >
> > Note that I changed the name of the result from T to X, because T has special meaning in R.
> >
> > Jean
> >
> > `·.,,(((º`·.,,(((º`·.,,(((º
> > Jean V. Adams, Statistician, U.S. Geological Survey Great Lakes Science Center, 223 East Steinfest Road, Antigo, WI 54409 USA
> >
> > From: Johannes Radinger jradin...@gmx.at
> > To: r-help@r-project.org
> > Date: 08/10/2011 07:23 AM
> > Subject: [R] function runif in for loop
> > Sent by: r-help-boun...@r-project.org
> >
> > > Hello,
> > >
> > > I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input and X should be defined as a random number between two values. Therefore I want to use the function runif like: X<-(1, Xa, Xb) but it seems that runif doesn't allow the use of vectors. So I think I have to calculate the new vector X by using a for loop. I tried
> > >
> > >   for (i in 1:length(lT)) T<-runif(1,lT,uT)
> > >
> > > but that doesn't work. What is the correct for-loop function to create this new vector/variable? Can I use that function then as an input for MCMCregress?
> > >
> > > thank you
> > > Johannes
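Duncan's suggestion of making X a function can be tried directly. A minimal sketch, with fixed bounds lT and uT chosen here for illustration:

```r
lT <- 1:10     # lower bounds (illustrative values)
uT <- 21:30    # upper bounds (illustrative values)

# X is a function: every call draws a fresh vector of random numbers.
X <- function() runif(length(lT), lT, uT)

first  <- X()
second <- X()

# Two calls give two independent sets of draws.
identical(first, second)
```

Because X reads lT and uT when it is called rather than when it is defined, changing the bounds later also changes what subsequent calls to X() return.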
Re: [R] Can R handle a matrix with 8 billion entries?
You might want to look into the packages bigmemory and biganalytics.

Corey

On Tue, Aug 9, 2011 at 8:38 PM, Chris Howden ch...@trickysolutions.com.au wrote:
> Hi,
>
> I'm trying to do a hierarchical cluster analysis in R with a Big Data set. I'm running into problems using the dist() function.
>
> I've been looking at a few threads about R's memory and have read the memory limits section in R help. However, I'm no computer expert, so I'm hoping I've misunderstood something and R can handle my Big Data set, somehow. Although at the moment I think my dataset is simply too big and there is no way around it, I'd like to be proved wrong!
>
> My data set has 90523 rows of data and 24 columns. My understanding is that this means the distance matrix has a minimum of 90523^2 elements, which is 8194413529. That roughly translates as 8GB of memory being required (if I assume each entry requires 1 bit). I only have 4GB on a 32bit build of Windows and R, so there is no way that's going to work.
>
> So then I thought of getting access to a more powerful computer, and maybe using cloud computing. However, the R memory limit help mentions "On all builds of R, the maximum length (number of elements) of a vector is 2^31 - 1 ~ 2*10^9". Now, as the distance matrix I require has more elements than this, does this mean it's too big for R no matter what I do?
>
> Any ideas would be welcome. Thanks.
>
> Chris Howden
> Founding Partner
> Tricky Solutions
> Tricky Solutions 4 Tricky Problems
> Evidence Based Strategic Development, IP Commercialisation and Innovation, Data Analysis, Modelling and Training
> (mobile) 0410 689 945
> (fax / office)
> ch...@trickysolutions.com.au

--
*The mark of a successful man is one that has spent an entire day on the bank of a river without feeling guilty about it.*
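The size estimate in the question can be redone as a short calculation. This is a sketch under two assumptions that are not stated in the thread: that the distances are stored as 8-byte doubles, and that only the lower triangle is kept, which is how a dist object is laid out:

```r
n <- 90523                      # rows in the data set (from the question)

# A dist object stores only the lower triangle: n * (n - 1) / 2 distances.
n_pairs <- n * (n - 1) / 2
n_pairs                         # about 4.1e9 entries

# Each distance is a double, i.e. 8 bytes.
gib <- n_pairs * 8 / 2^30
gib                             # a little over 30 GiB

# Compare with the vector length limit quoted from the R help page.
n_pairs > 2^31 - 1
```

So even the lower triangle alone is both longer than the vector limit quoted in the question and far larger than 4GB of memory, which supports looking at approaches that avoid forming the full distance matrix.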
Re: [R] How to quickly convert a data.frame into a structure of lists
Hi Frederic,

shouldn't there be a result for the 3rd row as well, e.g. ls1$b$Y? Maybe this will do what you want?

  dtf<-within(dtf,index<-factor(A:B))
  tapply(dtf$C,dtf$index,list)

Hth.

On 10.08.2011 16:30, Frederic F wrote:
> Hello Duncan,
>
> Here is a small example to illustrate what I am trying to do.
>
>   # Example data.frame
>   df=data.frame(A=c("a","a","b","b"), B=c("X","X","Y","Z"), C=c(1,2,3,4))
>   #   A B C
>   # 1 a X 1
>   # 2 a X 2
>   # 3 b Y 3
>   # 4 b Z 4
>
>   ### First way of getting the list structure (ls1) using imbricated lapply loops:
>   # Get the structure and populate it:
>   ls1<-lapply(levels(df$A), function(levelA) {
>     lapply(levels(df$B), function(levelB) {df$C[df$A==levelA & df$B==levelB]})
>   })
>   # Apply the names:
>   names(ls1)<-levels(df$A)
>   for (i in 1:length(ls1)) {names(ls1[[i]])<-levels(df$B)}
>   # Result:
>   ls1$a$X
>   # [1] 1 2
>   ls1$b$Z
>   # [1] 4
>
> The data.frame will always be 'complete', i.e., there will be a value in every row for every column. I want to produce a structure like this one quickly (I aim at something below 10 seconds) for a dataset containing between 1 and 2 million rows.
>
> I hope that this helps clarify things.
>
> Thanks for your help,
> Frederic

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
Re: [R] How to quickly convert a data.frame into a structure of lists
I was going to suggest

  AB <- df[c("A","B")]
  ls2 <- array(split(df$C, AB), dim=sapply(AB, nlevels), dimnames=sapply(AB, levels))

which produces a matrix very similar to what Duncan's by() call produces

  ls1 <- by(df$C, df[,1:2], identity)

E.g.,

  ls2[["a","X"]]
  [1] 1 2
  ls1[["a","X"]]
  [1] 1 2
  ls1[["a","Y"]] # by assigns NULL to unoccupied slots
  NULL
  ls2[["a","Y"]] # split gives the same type to all slots, copied from input
  numeric(0)

They both are quick because they use split() to avoid the repeated evaluations of

  bigVector[ anotherBigVector == scalar ]

that your nested (not imbricated) loops do. If you really need to convert the matrix to a list of lists that will probably be a quick transformation.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Wednesday, August 10, 2011 9:43 AM
To: Frederic F
Cc: r-help@r-project.org
Subject: Re: [R] How to quickly convert a data.frame into a structure of lists

> On 10/08/2011 10:30 AM, Frederic F wrote:
> > Hello Duncan,
> >
> > Here is a small example to illustrate what I am trying to do.
> >
> >   # Example data.frame
> >   df=data.frame(A=c("a","a","b","b"), B=c("X","X","Y","Z"), C=c(1,2,3,4))
> >   #   A B C
> >   # 1 a X 1
> >   # 2 a X 2
> >   # 3 b Y 3
> >   # 4 b Z 4
> >
> >   ### First way of getting the list structure (ls1) using imbricated lapply loops:
> >   # Get the structure and populate it:
> >   ls1<-lapply(levels(df$A), function(levelA) {
> >     lapply(levels(df$B), function(levelB) {df$C[df$A==levelA & df$B==levelB]})
> >   })
> >   # Apply the names:
> >   names(ls1)<-levels(df$A)
> >   for (i in 1:length(ls1)) {names(ls1[[i]])<-levels(df$B)}
> >   # Result:
> >   ls1$a$X
> >   # [1] 1 2
> >   ls1$b$Z
> >   # [1] 4
> >
> > The data.frame will always be 'complete', i.e., there will be a value in every row for every column. I want to produce a structure like this one quickly (I aim at something below 10 seconds) for a dataset containing between 1 and 2 million rows.
>
> I don't know what the timing would be like for your real data, but this does look like by() would work:
>
>   ls1 <- by(df$C, df[,1:2], identity)
>
> When I repeat the rows of df a million times each, this finishes in a few seconds. It would definitely be slower if there were more levels of A or B.
>
> Now ls1 will be a matrix whose entries are the subsets of C that you want, so you can see your two results with slightly different syntax:
>
>   ls1[["a", "X"]]
>   [1] 1 2
>   ls1[["b","Z"]]
>   [1] 4
>
> Duncan Murdoch
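For readers who need the nested list form (ls$a$X) rather than the matrix form, here is a minimal sketch that applies split() twice. It is not taken from the thread, and its speed on millions of rows with thousands of levels of A has not been measured here:

```r
# Factors are created explicitly; since R 4.0.0 data.frame() no longer
# converts strings to factors by default.
df <- data.frame(A = factor(c("a", "a", "b", "b")),
                 B = factor(c("X", "X", "Y", "Z")),
                 C = c(1, 2, 3, 4))

# Split the rows once by A, then split each piece of C by B.
nested <- lapply(split(df, df$A), function(d) split(d$C, d$B))

nested$a$X   # values of C where A == "a" and B == "X"
nested$b$Z   # values of C where A == "b" and B == "Z"
nested$a$Y   # empty: no rows with A == "a" and B == "Y"
```

As with the split()-based array above, unoccupied combinations come back as empty numeric vectors rather than NULL, because every level of B is kept within each level of A.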
Re: [R] function runif in for loop
Duncan et al.: Inline below.

On Wed, Aug 10, 2011 at 9:48 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 10/08/2011 9:40 AM, Johannes Radinger wrote:

Jean, thank you for your answer. Especially the line X <- numeric(length(lT)) helped me a lot. Anyway, in my case I'd like to get a dynamic variable, or better a function, for X. I mean that when I call X, I'd like the drawing of random numbers to be performed. As it stands, if I call X several times I'll always get the same random numbers.

Such things do exist in R, but they aren't easy to set up.

Well... how about:

  > X <- function() runif(1)
  > class(X) <- c("wizz", class(X))
  > print.wizz <- function(x){y <- x(); print(y); y}
  > X
  [1] 0.875768
  > X
  [1] 0.955208
  > X
  [1] 0.1150938
  > z <- X
  > z
  [1] 0.3760085
  > z <- X
  > z
  [1] 0.1506062

Cheers,
Bert

Why not just make X be a function explicitly? That is,

  X <- function() runif(length(lT), lT, uT)

Then use X() to call the function where you were previously using X.

Duncan Murdoch

I thought about something like:

  X <- for (i in 1:length(lT)) runif(1, lT[i], uT[i])

so that I can use X as a variable for multiple runs, and each run uses new random values.

thank you
Johannes

-------- Original Message --------
Date: Wed, 10 Aug 2011 08:19:07 -0500
From: Jean V Adams jvad...@usgs.gov
To: Johannes Radinger jradin...@gmx.at
CC: r-help@r-project.org
Subject: Re: [R] function runif in for loop

Johannes,

You have the loop set up right, you just need to add indexing to refer to the looping variable, i.

  lT <- sample(1:10)
  uT <- sample(21:30)
  X <- numeric(length(lT))
  for (i in 1:length(lT)) X[i] <- runif(1, lT[i], uT[i])
  X

Note that I changed the name of the result from T to X, because T has special meaning in R.

Jean

`·.,,(((º `·.,,(((º `·.,,(((º
Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409 USA

From: Johannes Radinger jradin...@gmx.at
To: r-help@r-project.org
Date: 08/10/2011 07:23 AM
Subject: [R] function runif in for loop
Sent by: r-help-boun...@r-project.org

Hello,

I'd like to perform a regression using MCMCregress (MCMCpack). One variable therefore should be a function rather than a variable: I want to use X as an input, and X should be defined as a random number between two values. Therefore I want to use the function runif like:

  X <- runif(1, Xa, Xb)

but it seems that runif doesn't allow the use of vectors. So I think I have to calculate the new vector X by using a for loop. I tried

  for (i in 1:length(lT)) T <- runif(1,lT,uT)

but that doesn't work. What is the correct for-loop function to create this new vector/variable? Can I then use that function as an input for MCMCregress?

thank you
Johannes

--
Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions.
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
[R] Histograms in R
Hi everyone,

I'm plotting a histogram in R, and within that histogram I need to show the percentage of another variable (percentage of Mutation_Status) within the bins plotted in the histogram. I don't know how to do that!

Data:

Validation_Status  Mutation_Status  TvarRatio
Wildtype           None             0.08
Wildtype           None             0.08
Wildtype           None             0.08
Wildtype           None             0.08
Wildtype           None             0.080139373
Wildtype           None             0.080152672
Wildtype           None             0.080213904
Wildtype           None             0.080357143
Wildtype           None             0.080357143
Wildtype           None             0.080357143
Wildtype           None             0.08045977
Wildtype           None             0.1
Wildtype           LOH              0.1
Wildtype           None             0.1
Wildtype           None             0.1
Wildtype           None             0.1
Wildtype           None             0.1
Wildtype           None             0.1
Wildtype           Somtatic         0.1
Wildtype           None             0.100558659
Wildtype           None             0.100591716
Wildtype           None             0.101010101
Wildtype           None             0.101123596
Wildtype           Gline            0.10133
Wildtype           None             0.101369863
Wildtype           None             0.101449275
Wildtype           None             0.101522843
Wildtype           None             0.101604278
Wildtype           None             0.102040816
Wildtype           Gline            0.102040816
Wildtype           None             0.102362205
Wildtype           None             0.102459016
Wildtype           None             0.102564103
Wildtype           None             0.102702703
Wildtype           None             0.102739726
Wildtype           None             0.102803738
Valid              Somatic          0.102941176
Wildtype           None             0.102941176

--
View this message in context: http://r.789695.n4.nabble.com/Histograms-in-R-tp3733644p3733644.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Histograms in R
Perhaps you could shade the bars as appropriate? I'm not going to use your data because it's not easily paste-able, but how about this:

  x = rnorm(100)
  y = sample(c("A","B"), 100, replace=TRUE, prob=c(0.7,0.3))
  d = data.frame(level = x, status = y)
  n = 10 # Number of bins
  breaks = quantile(d$level, (0:n)/n)
  #breaks = with(d, hist(level, breaks=n, plot=FALSE)$breaks)
  # rightmost.closed keeps the maximum in the last bin instead of a bin of its own
  breaksAssign = findInterval(d$level, breaks, rightmost.closed=TRUE)
  # share of "B" in each bin, in bin order (one value per bar)
  percentB = tapply(d$status=="B", breaksAssign, mean)
  percentB = gray(percentB/max(percentB)) # Many other color functions available as well.
  with(d, hist(level, breaks = breaks, col = percentB))

If you want to use hist()'s smart choice of bins (which I'd recommend), you can call the hist command once with plot=FALSE and get the breaks from there, i.e.,

  breaks = with(d, hist(level, breaks=n, plot=FALSE)$breaks)

There's probably a smarter way to do all this, but this does seem to work...

Hope this helps,
Michael Weylandt

On Wed, Aug 10, 2011 at 1:16 PM, lt2 l...@bcm.edu wrote:

Hi everyone, I'm plotting a histogram in R, and within that histogram I need to show the percentage of another variable (percentage of Mutation_Status) within the bins plotted in the histogram. I don't know how to do that!
Re: [R] function runif in for loop
On 10/08/2011 1:16 PM, Bert Gunter wrote:

Duncan et al.: Inline below.

On Wed, Aug 10, 2011 at 9:48 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

Such things do exist in R, but they aren't easy to set up.

Well... how about:

  > X <- function() runif(1)
  > class(X) <- c("wizz", class(X))
  > print.wizz <- function(x){y <- x(); print(y); y}
  > X
  [1] 0.875768
  > X
  [1] 0.955208
  > X
  [1] 0.1150938
  > z <- X
  > z
  [1] 0.3760085
  > z <- X
  > z
  [1] 0.1506062

That sort of looks as if it works, but it doesn't: X is still a function, and only printing it draws a number, so arithmetic on it fails.

  for (i in 1:3) {
    r <- X + 1
    print(r)
  }

Duncan Murdoch
Re: [R] Histograms in R
Assuming your data is in a data.frame called df, try this:

  attach(df)
  TR.groups <- cut(TvarRatio, seq(0.07, 0.11, 0.01))
  m <- table(Mutation_Status, TR.groups)
  mut.no <- dim(m)[1]
  barplot(m, col=seq(mut.no), xlab="TvarRatio", ylab="Frequency")
  legend("topleft", dimnames(m)[[1]], fill=seq(mut.no), title="Mutation_Status")

Jean

`·.,, (((º `·.,, (((º `·.,, (((º
Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409 USA

r-help-boun...@r-project.org wrote on 08/10/2011 12:16:51 PM:

Hi everyone, I'm plotting a histogram in R, and within that histogram I need to show the percentage of another variable (percentage of Mutation_Status) within the bins plotted in the histogram. I don't know how to do that!
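The barplot above shows counts per bin; the original question asked for percentages within each bin. A minimal sketch of that step follows, using made-up data in place of the poster's table (the data frame `d` and its values are assumptions, not from the thread). It borrows hist()'s break points, cross-tabulates status against bin, and converts each column to percentages with prop.table().

```r
set.seed(1)  # made-up data standing in for the poster's table
d <- data.frame(
  TvarRatio = runif(200, 0.08, 0.11),
  Mutation_Status = sample(c("None", "LOH", "Somatic", "Germline"), 200,
                           replace = TRUE, prob = c(0.7, 0.1, 0.1, 0.1))
)

# Bin the ratio with the same breaks hist() would choose, then cross-tabulate.
brks   <- hist(d$TvarRatio, plot = FALSE)$breaks
bins   <- cut(d$TvarRatio, brks, include.lowest = TRUE)
counts <- table(d$Mutation_Status, bins)

# margin = 2 normalises each column, i.e. each bin sums to 100 percent.
pct <- 100 * prop.table(counts, margin = 2)

barplot(pct, col = seq_len(nrow(pct)), xlab = "TvarRatio",
        ylab = "Percent of bin", legend.text = rownames(pct))
```

An empty bin would give NaN in its column (0 divided by 0), so sparse data may need wider breaks.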
[R] Opposite of paste function
Dear All,

I have a variable vn:

  > vn
  [1] "V300" "V376"

What I want to get is 300 376, without the V, from the vn variable. Could you help me with this issue?

Thank you,
Soyeon
Re: [R] Opposite of paste function
On Wed, Aug 10, 2011 at 11:22 AM, Soyeon Kim yunni0...@gmail.com wrote:

Dear All, I have a variable vn:

  > vn
  [1] "V300" "V376"

What I want to get is 300 376

  as.numeric(substring(vn, 2))

HTH
Peter
Re: [R] Opposite of paste function
or,

  gsub('V','',vn)

On 8/10/2011 2:23 PM, Peter Langfelder wrote:

  as.numeric(substring(vn, 2))

HTH
Peter
Re: [R] Opposite of paste function
The "See Also" section of the help page for paste lists the functions you can use for this.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Soyeon Kim
Sent: Wednesday, August 10, 2011 2:22 PM
To: r-help@r-project.org
Subject: [R] Opposite of paste function
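The suggestions in this thread can be compared side by side. The sketch below is illustrative only; the result names (`by_position`, `by_prefix`, `digits_only`) are made up. Note that gsub() on its own returns character strings, so as.numeric() is still needed to get numbers.

```r
vn <- c("V300", "V376")

# Drop the first character, whatever it is.
by_position <- as.numeric(substring(vn, 2))

# Drop a leading "V" only; other strings are left as they are.
by_prefix <- as.numeric(sub("^V", "", vn))

# Keep only the digits, wherever the letters sit.
digits_only <- as.numeric(gsub("[^0-9]", "", vn))

by_position
by_prefix
digits_only
```

The three agree on this input; they differ once a name has a different prefix or letters in other positions.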
Re: [R] function runif in for loop
Duncan: Yup, you're right. Can't assign, just print.

-- Bert

On Wed, Aug 10, 2011 at 11:02 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

That sort of looks as if it works, but it doesn't:

  for (i in 1:3) {
    r <- X + 1
    print(r)
  }

Duncan Murdoch

--
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
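For reference, a minimal sketch of the two approaches discussed in this thread, with made-up bounds. The first is Duncan Murdoch's explicit function. The second uses an active binding via makeActiveBinding() from base R; that this is the mechanism alluded to by "such things do exist in R" is an assumption. The names `X_fun`, `r1` and `r2` are illustrative.

```r
lT <- 1:10    # lower bounds (made-up values)
uT <- 21:30   # upper bounds (made-up values)

# Explicit function: call X_fun() wherever fresh draws are needed.
# runif() is vectorised over min and max, so no loop is required.
X_fun <- function() runif(length(lT), lT, uT)

# Active binding: X is used like a variable, but the function is re-run
# every time X is read, so arithmetic on X works.
makeActiveBinding("X", function() runif(length(lT), lT, uT), environment())

r1 <- X + 1
r2 <- X + 1   # differs from r1 because X was drawn again
```

Whether either form can be passed to MCMCregress as the thread's original poster wanted is not settled in the thread and is not tested here.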
[R] convert 'list' to 'vector'?
Dear all

How does one convert a non-symmetric list to a vector? See below:

  > x <- list()
  > x[[1]] <- letters[1:5]
  > x[[2]] <- letters[6:10]
  > x[[3]] <- letters[11:12]
  > x
  [[1]]
  [1] "a" "b" "c" "d" "e"

  [[2]]
  [1] "f" "g" "h" "i" "j"

  [[3]]
  [1] "k" "l"

  > paste(x)
  [1] "c(\"a\", \"b\", \"c\", \"d\", \"e\")" "c(\"f\", \"g\", \"h\", \"i\", \"j\")"
  [3] "c(\"k\", \"l\")"
  > as.vector(x)
  [[1]]
  [1] "a" "b" "c" "d" "e"

  [[2]]
  [1] "f" "g" "h" "i" "j"

  [[3]]
  [1] "k" "l"

  > simplify2array(x)
  [[1]]
  [1] "a" "b" "c" "d" "e"

  [[2]]
  [1] "f" "g" "h" "i" "j"

  [[3]]
  [1] "k" "l"

What I would need to get instead is:

  > letters[1:12]
   [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l"

Any ideas?

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
[R] choosing selective data with permutations
Hello,

I am an R beginner and hoping to obtain some hints or suggestions about using permutations to sort a data set I have. Here is an example dataset:

  Ind1 11 00 12 15 28
  Ind2 21 33 22 67 52
  Ind3 22 45 21 22 56
  Ind4 11 25 74 77 42
  Ind5 41 32 67 45 22

This will be read into a variable using read.table. What I want to do is permute these individuals and every time pick 3 individuals and write them to a new variable. I want to do this 100 times, so that in the end I will have 100 tables containing data for 3 individuals each. The data (for individuals) itself is not to be permuted, only the selection of individuals.

I am guessing this is probably trivial to do, but I would appreciate any advice on this matter.

Thank you.
Vikram
Re: [R] convert 'list' to 'vector'?
unlist()

Michael Weylandt

On Wed, Aug 10, 2011 at 2:58 PM, Liviu Andronic landronim...@gmail.com wrote:

Dear all

How does one convert a non-symmetric list to a vector?
[R] xtable - caption missing with float=FALSE
Hi,

For some reason I'm finding that my table caption is disappearing if I print xtable output with the floating argument set to FALSE. Below is a very simple Sweave file that produces two tables: the first has no caption and the second has a caption (if you want to see it: http://www.zevross.com/temp/test.pdf). Does anyone know what I can do to fix this?

Zev

(I'm using Windows 7, 64 bit, R 2.12.2)

  % begin Rnw file
  \documentclass[a4paper]{article}
  \begin{document}
  <<results=tex, echo=FALSE>>=
  library(xtable)
  atable<-data.frame(a=1:10, b=rnorm(10))
  print(xtable(atable, caption="FLOAT"), floating=FALSE)
  print(xtable(atable, caption="NO FLOAT"), floating=TRUE)
  @
  \end{document}

  Sweave("test.Rnw")
  texi2dvi("test.tex", pdf=TRUE, clean=TRUE)

--
Zev Ross
ZevRoss Spatial Analysis
120 N Aurora, Suite 3A
Ithaca, NY 14850
607-277-0004 (phone)
866-877-3690 (fax, toll-free)
z...@zevross.com
Re: [R] convert 'list' to 'vector'?
Check function unlist().

Best,
Dimitris

On 8/10/2011 8:58 PM, Liviu Andronic wrote:

Dear all

How does one convert a non-symmetric list to a vector?

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
Re: [R] convert 'list' to 'vector'?
unlist(x)

r-help-boun...@r-project.org wrote on 08/10/2011 01:58:57 PM:

Dear all

How does one convert a non-symmetric list to a vector?
Re: [R] choosing selective data with permutations
To pick random elements to sample, you can just use the sample function:

  sample(1:5,3,replace=T/F) # pick true or false as needed for your data.

If you replicate this, you should have no problem.

  replicate(100,function() return(sample(1:5,3,replace=T/F)))

This will be plenty fast, but if you get into very large scale bootstrapping, you might want to vectorize the whole thing. That's easily done with replace = T (just take 3*100 samples and then convert the output vector to a matrix), but I'm not sure it's quite as easy with replace = F.

Hope this helps,
Michael

On Wed, Aug 10, 2011 at 2:37 PM, Vikram Chhatre crypticline...@gmail.com wrote:

What I want to do is permute these individuals and every time pick 3 individuals and write them to a new variable. I want to do this 100 times, so that in the end I will have 100 tables containing data for 3 individuals each.
[R] Clustering Large Applications..sort of
Hello all,

I am using the clustering functions in R to work with large masses of binary time series data, but the clustering functions do not seem able to handle a practical problem of this size.

'hclust' is good in that I do not want to make assumptions about the number of clusters present, but given the computational resources and time it needs, it is not functionally good enough for this size of problem. k-means works fine, but it assumes the number of clusters within the data, which is not realistic. The silhouette functions in 'Pam' and 'Clara' and (if I remember correctly) 'cluster' seem to perform really badly, judging from very thorough experiments on generated data with known clusters.

I am left, then, with either theoretical abstractions such as pruning hclust trees with minimal spanning trees, or perhaps hand-rolling a hierarchical k-medoids that works extremely efficiently and without assumptions about the number of clusters.

Does anybody have suggestions for libraries I have missed, or suggestions in general? Note: this is not a question for 'Bigkmeans' unless there also exists a 'findbigkmeansnumberofclusters' function.

Thank you in advance for your assistance,
Ken
Re: [R] convert 'list' to 'vector'?
On Wed, Aug 10, 2011 at 9:02 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

unlist()

Thanks all! This is perfect, and very R-ish: never where a novice would expect it to be.

Cheers
Liviu
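A short sketch of what unlist() does with the list from the question, plus one detail that may surprise: with a named list, unlist() constructs element names unless use.names = FALSE is given. The second list `y` is made up for illustration.

```r
x <- list(letters[1:5], letters[6:10], letters[11:12])

# Flatten the ragged list into one character vector.
v <- unlist(x)
v

# With a named list, names are built from the list names plus a counter;
# use.names = FALSE returns the bare values.
y <- list(first = 1:2, second = 3:5)
unlist(y)
unlist(y, use.names = FALSE)
```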
Re: [R] choosing selective data with permutations
Sorry, that second line of code won't work: do it in 2.

f <- function() {return(sample(1:5,3,replace=T/F))}
replicate(100,f())

Michael

On Wed, Aug 10, 2011 at 3:06 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

To pick random elements to sample, you can just use the sample function

sample(1:5,3,replace=T/F) # pick true or false as needed for your data.

If you replicate this, you should have no problem.

replicate(100,function() return(sample(1:5,3,replace=T/F)))

This will be plenty fast, but if you get into very large scale bootstrapping, you might want to vectorize the whole thing. That's easily done with replace = T (just take 3*100 samples and then convert the output vector to a matrix) but I'm not sure it's quite as easy with replace = F.

Hope this helps,
Michael

On Wed, Aug 10, 2011 at 2:37 PM, Vikram Chhatre crypticline...@gmail.com wrote:

Hello,

I am an R beginner and hoping to obtain some hints or suggestions about using permutations to sort a data set I have. Here is an example dataset:

Ind1 11 00 12 15 28
Ind2 21 33 22 67 52
Ind3 22 45 21 22 56
Ind4 11 25 74 77 42
Ind5 41 32 67 45 22

This will be read into a variable using read.table. What I want to do is permute these individuals and every time pick 3 individuals and write them to a new variable. I want to do this 100 times so that in the end I will have 100 tables containing data for 3 individuals each. The data (for individuals) itself is not to be permuted, rather the selection of individuals.

I am guessing this is probably trivial to do. But I would appreciate any advice on this matter.

Thank you.
Vikram
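A minimal sketch of the approach described above, assuming the data have been read with read.table so that each individual is one row. replicate() re-evaluates its expression on every repetition, so the sampling call can be passed directly without a wrapper function. The object names dat, subsets and idx are made up for this illustration.

```r
# Example data from the question, with the individual IDs as row names.
dat <- read.table(text = "
Ind1 11 00 12 15 28
Ind2 21 33 22 67 52
Ind3 22 45 21 22 56
Ind4 11 25 74 77 42
Ind5 41 32 67 45 22", row.names = 1)

set.seed(42)  # only so that the draws are reproducible

# Draw 3 distinct rows, 100 times. simplify = FALSE keeps the result as a
# list of data frames instead of trying to collapse it into an array.
subsets <- replicate(100,
                     dat[sample(nrow(dat), 3, replace = FALSE), ],
                     simplify = FALSE)

length(subsets)    # number of tables
dim(subsets[[1]])  # rows and columns of one table

# Vectorized variant for sampling WITH replacement, as suggested above:
# take 3 * 100 row indices at once and fold them into a 3 x 100 matrix,
# one column per repetition.
idx <- matrix(sample(nrow(dat), 3 * 100, replace = TRUE), nrow = 3)
```

Each element of subsets can then be written out or analysed separately, e.g. subsets[[1]].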
Re: [R] xtable - caption missing with float=FALSE
On Aug 10, 2011, at 2:02 PM, Zev Ross wrote:

Hi,

For some reason I'm finding that my table caption is disappearing if I print xtable output with the floating argument set to FALSE. Below is a very simple Sweave file that produces two tables: the first has no caption and the second has a caption (if you want to see it http://www.zevross.com/temp/test.pdf). Does anyone know what I can do to fix this?

Zev

(I'm using Windows 7, 64 bit, R 2.12.2)

% begin Rnw file
\documentclass[a4paper]{article}
\begin{document}
<<results=tex, echo=FALSE>>=
library(xtable)
atable <- data.frame(a=1:10, b=rnorm(10))
print(xtable(atable, caption="FLOAT"), floating=FALSE)
print(xtable(atable, caption="NO FLOAT"), floating=TRUE)
@
\end{document}

Sweave("test.Rnw")
texi2dvi("test.tex", pdf=TRUE, clean=TRUE)

Hi,

If you compare the output of the two commands, the 'floating = TRUE' variant places the tabular environment within a table environment. The tabular environment does not support the \caption command, table does, hence no caption if the tabular is not contained within a float (floating = FALSE).

HTH,
Marc Schwartz
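One possible workaround, sketched here under assumptions rather than taken from the thread: the LaTeX packages 'caption' and 'capt-of' provide \captionof, which sets a caption outside a float. The helper caption_nonfloat below is hypothetical (it is not part of xtable); it wraps the lines of a tabular environment, and the tab vector is a hand-written stand-in for what print(xtable(...), floating=FALSE) writes.

```r
# Hypothetical helper, not part of xtable: wrap the lines of a tabular
# environment so that a caption is typeset without a float.
# Assumes the LaTeX preamble loads 'caption' or 'capt-of' for \captionof.
caption_nonfloat <- function(tabular_lines, caption) {
  c("\\begin{center}",
    sprintf("\\captionof{table}{%s}", caption),
    tabular_lines,
    "\\end{center}")
}

# Hand-written stand-in for the non-floating xtable output.
tab <- c("\\begin{tabular}{rr}",
         " a & b \\\\",
         " 1 & 0.5 \\\\",
         "\\end{tabular}")

out <- caption_nonfloat(tab, "NO FLOAT")
cat(out, sep = "\n")  # in a results=tex chunk this lands in the .tex file
```

The center environment only keeps the caption and the table together visually; any non-float grouping would do.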
Re: [R] convert 'list' to 'vector'?
On Aug 10, 2011, at 3:10 PM, Liviu Andronic wrote:

On Wed, Aug 10, 2011 at 9:02 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

unlist()

Thanks all! This is perfect, and very R-ish: never where a novice would expect it to be.

Well, since `unlist` is linked in the See Also on the help page for `list`, I can only hope you meant that in complete jest. As the Posting Guide says: ... sometimes `read the manual' is the appropriate response.

--
David Winsemius, MD
West Hartford, CT
Re: [R] How to quickly convert a data.frame into a structure of lists
Here is code to transform the matrix that by() or array(split()) produces, along with an example of the speed of the various approaches. Using split(), either directly or via by() or tapply(), saves a lot of time.

f0 <- function(df) {
   # original code with typos fixed.
   list_structure <- lapply(levels(df$A), function(levelA) {
      lapply(levels(df$B), function(levelB) {df$C[df$A==levelA & df$B==levelB]})
   })
   # Apply the names:
   names(list_structure) <- levels(df$A)
   for (i in 1:length(list_structure)) {
      names(list_structure[[i]]) <- levels(df$B)
   }
   list_structure
}

f0a <- function(df) {
   # slightly faster version of f0, taking repeated
   # calculations out of loops.
   A <- df$A
   B <- df$B
   C <- df$C
   levelsA <- structure(levels(A), names=levels(A))
   levelsB <- structure(levels(B), names=levels(B))
   lapply(levelsA, function(levelA) {
      tmpA <- A == levelA # this is responsible for most of the savings
      lapply(levelsB, function(levelB) {C[tmpA & B==levelB]})
   })
}

f1 <- function(df) {
   # DM's code
   by(df$C, df[,1:2], identity)
}

f2 <- function(df) {
   # WD's code
   AB <- df[c("A","B")]
   array(split(df$C, AB), dim=sapply(AB, nlevels), dimnames=sapply(AB, levels))
}

matrix2ListOfRows <- function(mat) {
   # convert a matrix to a list of its rows, converting dimnames to names.
   retval <- structure(as.vector(mat), names=rep(colnames(mat), each=nrow(mat)))
   retval <- split(retval, row(mat))
   names(retval) <- rownames(mat)
   retval
}

The test involves 10^5 rows of data with 26 levels for A and 200 for B.
> r200 <- as.character(as.roman(1:200))
> set.seed(1)
> df <- data.frame(A=factor(sample(letters, size=1e5, replace=TRUE), levels=letters),
+                  B=factor(sample(r200, size=1e5, replace=TRUE), levels=r200),
+                  C=1:1e5)
> system.time(ls0 <- f0(df))
   user  system elapsed
  74.08    2.34   76.60
> system.time(ls0a <- f0a(df))
   user  system elapsed
  43.09    0.44   43.73
> all.equal(ls0, ls0a)
[1] TRUE
> system.time(ls2 <- matrix2ListOfRows(f2(df)))
   user  system elapsed
   0.09    0.02    0.11
> all.equal(ls0, ls2)
[1] TRUE
> system.time(ls1 <- matrix2ListOfRows(f1(df)))
   user  system elapsed
   0.69    0.00    0.69
> all.equal(ls0, ls1)
[1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
Sent: Wednesday, August 10, 2011 10:05 AM
To: Duncan Murdoch; Frederic F
Cc: r-help@r-project.org
Subject: Re: [R] How to quickly convert a data.frame into a structure of lists

I was going to suggest

AB <- df[c("A","B")]
ls2 <- array(split(df$C, AB), dim=sapply(AB, nlevels), dimnames=sapply(AB, levels))

which produces a matrix very similar to what Duncan's by() call produces

ls1 <- by(df$C, df[,1:2], identity)

E.g.,

> ls2[["a","X"]]
[1] 1 2
> ls1[["a","X"]]
[1] 1 2
> ls1[["a","Y"]] # by assigns NULL to unoccupied slots
NULL
> ls2[["a","Y"]] # split gives the same type to all slots, copied from input
numeric(0)

They both are quick because they use split() to avoid the repeated evaluations of bigVector[ anotherBigVector == scalar ] that your nested (not imbricated) loops do. If you really need to convert the matrix to a list of lists that will probably be a quick transformation.
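For illustration, the nested list can also be built with two levels of split() directly, skipping the intermediate matrix. This is a sketch on the four-row example data frame from the thread, not the code that was timed above, and the name nested is made up here. A and B are created as factors explicitly, because recent versions of R no longer convert strings to factors by default.

```r
df <- data.frame(A = factor(c("a", "a", "b", "b")),
                 B = factor(c("X", "X", "Y", "Z")),
                 C = c(1, 2, 3, 4))

# Split the row indices by A once, then split the matching values of C by B.
# Subsetting a factor keeps all of its levels and split() keeps empty levels,
# so every A/B combination gets a slot, as in the array(split()) version.
nested <- lapply(split(seq_len(nrow(df)), df$A), function(idx) {
  split(df$C[idx], df$B[idx])
})

nested$a$X  # values of C on rows where A == "a" and B == "X"
nested$b$Z
nested$a$Y  # unoccupied combination: an empty numeric vector
```

Each row is visited once per split() call, which avoids the repeated bigVector[anotherBigVector == scalar] scans of the nested lapply() version.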
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Wednesday, August 10, 2011 9:43 AM
To: Frederic F
Cc: r-help@r-project.org
Subject: Re: [R] How to quickly convert a data.frame into a structure of lists

On 10/08/2011 10:30 AM, Frederic F wrote:

Hello Duncan,

Here is a small example to illustrate what I am trying to do.

# Example data.frame
df=data.frame(A=c("a","a","b","b"), B=c("X","X","Y","Z"), C=c(1,2,3,4))
#   A B C
# 1 a X 1
# 2 a X 2
# 3 b Y 3
# 4 b Z 4

### First way of getting the list structure (ls1) using imbricated lapply loops:
# Get the structure and populate it:
ls1<-lapply(levels(df$A), function(levelA) {
   lapply(levels(df$B), function(levelB) {df$C[df$A==levelA & df$B==levelB]})
})
# Apply the names:
names(list_structure)<-levels(df$A)
for (i in 1:length(list_structure)) {names(list_structure[[i]])<-levels(df$B)}
# Result:
ls1$a$X
# [1] 1 2
ls1$b$Z
# [1] 4

The data.frame will always be 'complete', i.e., there will be a value in every row for every column. I want to produce a structure like this one quickly (I aim at something below 10 seconds) for a dataset containing between 1 and 2 millions of rows.

I don't know what the timing would be like for your real data, but this does look like by() would work:

ls1 <- by(df$C, df[,1:2], identity)

When I repeat the rows of df a million times
Re: [R] convert 'list' to 'vector'?
On Wed, Aug 10, 2011 at 9:32 PM, David Winsemius dwinsem...@comcast.net wrote:

Thanks all! This is perfect, and very R-ish: never where a novice would expect it to be.

Well, since `unlist` is linked in the See Also on the help page for `list`, I can only hope you meant that in complete jest.

More or less. I would have expected that to transform a 'list' into a 'vector' I should look into 'as.vector' (or its See Also), and I would have never guessed to look for 'unlist'. R documentation is sometimes (often?) hard to parse, and when learning R more often than not you're looking in the wrong place. But yes, it was intended as humour (although I did expect to get grilled).

As the Posting Guide says: ... sometimes `read the manual' is the appropriate response.

I did, but I was on the wrong track. It actually hasn't occurred to me to check ?list, but See Also in both ?as.vector and ?simplify2array does not link to 'unlist'. Since these are the two places where I turned to in the first place, and I have also played extensively with sapply(..., simplify=...) arguments, and there was nothing obvious in their respective See Also, I figured that I did my homework reasonably well.

Regards
Liviu
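A short illustration of why as.vector() is the wrong track here: a list already counts as a vector in R (of mode "list"), so as.vector() hands it back unchanged, while unlist() flattens it into an atomic vector. The objects l and v are made up for this example.

```r
l <- list(a = 1, b = 2:3, c = 4)

# A list is already a vector, so as.vector() leaves it a list.
is.list(as.vector(l))

# unlist() concatenates the components into one atomic vector. Components
# with more than one element get a numeric suffix added to their name.
v <- unlist(l)
v
is.atomic(v)

# use.names = FALSE drops the names.
unlist(l, use.names = FALSE)
```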
[R] font (charter)
dear R-experts---can someone please refer me to the latest installation instructions for graphics fonts in R (the pdf device)? (I would like to install the Charter font from the texlive 2011 distribution under OSX.)

sincerely,
/iaw

Ivo Welch (ivo.we...@gmail.com)
Re: [R] Clustering Large Applications..sort of
Try the flow cytometry clustering functions in Bioconductor.

-thomas

On Thu, Aug 11, 2011 at 7:07 AM, Ken Hutchison vicvoncas...@gmail.com wrote:

Hello all,

I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library 'hclust' is good (though it may be sub par for this size of problem, thus doubly poor for this application) in that I do not want to make assumptions about the number of clusters present, also due to computational resources and time hclust is not functionally good enough; furthermore k-means works fine assuming the number of clusters within the data, which is not realistic. The silhouette functions in 'Pam' and 'Clara' and (if I remember correctly) 'cluster' seem to be really bad through very thorough experimentation of data generation with known clusters. I am left then with either theoretical abstractions such as pruning hclust trees with minimal spanning trees or perhaps hand-rolling a hierarchical k-medoids which works extremely efficiently and without cluster number assumptions.

Anybody have any suggestions as to possible libraries which I have missed or suggestions in general? Note: this is not a question for 'Bigkmeans' unless there exists a 'findbigkmeansnumberofclusters' function also.

Thank you in advance for your assistance,
Ken

--
Thomas Lumley
Professor of Biostatistics
University of Auckland