Re: [R] transpose dataset to PC-ORD?

2006-05-23 Thread Dave Roberts
Daniel, I can help somewhat, I think. PC-ORD also allows data input in what it calls database format, where each row is sample, taxon, abundance. There are as many rows per sample as there are non-zero species, and only three columns. To get your taxon data.frame (currently samples as rows,
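A minimal sketch of the reshaping described above, assuming a small made-up abundance table with samples as rows and taxa as columns. The object names (taxa, long) and the data are hypothetical, and this is only one possible way to reach the three-column layout, not necessarily the one given in the rest of the message.

```r
# Hypothetical community table: samples as rows, taxa as columns
taxa <- data.frame(sp1 = c(3, 0), sp2 = c(0, 5), sp3 = c(1, 2),
                   row.names = c("plotA", "plotB"))

m  <- as.matrix(taxa)
nz <- which(m != 0, arr.ind = TRUE)   # row/column positions of non-zero cells

# One row per non-zero abundance: sample, taxon, abundance
long <- data.frame(sample    = rownames(m)[nz[, "row"]],
                   taxon     = colnames(m)[nz[, "col"]],
                   abundance = m[nz],
                   stringsAsFactors = FALSE)

# Sort so that all rows for a sample sit together
long <- long[order(long$sample), ]
rownames(long) <- NULL
print(long)
```

The result could then be written out with write.table() or write.csv(); the exact file layout PC-ORD expects is not covered here.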

Re: [R] Concave Hull?

2006-05-04 Thread Dave Roberts
Those are pretty interesting approaches Ted. An alternative is to establish a maximum length for any segment and delete all segments longer than that. Then, find the shortest connected path such that no segments are longer than your threshold. In many cases this would result in the same

Re: [R] Fortran code

2006-04-13 Thread Dave Roberts
Sotiris, R is generally fairly graceful about FORTRAN on Linux; Windows is another matter. For example, R on Linux will allow you to write to the R console as a file device without using the special I/O routines often needed in R. There are many very complicated FORTRAN routines currently

Re: [R] Ordination of feature film data question

2006-03-13 Thread Dave Roberts
In addition to the references from Professor Ripley, you might be interested in the R packages and pages maintained by ecologists for such work (even if you're doing movies). Packages labdsv, vegan, and ade4 all have a broad variety of distance/dissimilarity indices and numerous alternative

Re: [R] 3-d splinefun

2006-02-10 Thread Dave Roberts
Colby, Function surf() in package labdsv uses the gam() function from mgcv to do this in conjunction with akima. You might want to look at that routine for an idea. Currently it fits the gam as gam(z ~ s(x) + s(y)), but it's possible in the mgcv version of gam to fit gam(z ~ s(x,y)) as

Re: [R] collating columns

2006-01-30 Thread Dave Roberts
You can just use data.frame(). If (using your example) your data frames are called first and second, you could use new <- data.frame(first$A,second$Z,first$B,second$Y,first$C,second$X...) followed by names(new) <- c('A','Z','B','Y','C','X'). If you have an enormous number of columns that's a pain,
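A short sketch of the data.frame() approach above, assuming two made-up data frames named first and second as in the message. The index-based variant at the end is an illustration added here for the many-columns case; it is an assumption, not the continuation of the original reply.

```r
# Hypothetical inputs with the column names used in the message
first  <- data.frame(A = 1:3,   B = 4:6,   C = 7:9)
second <- data.frame(X = 11:13, Y = 14:16, Z = 17:19)

# Explicit collation, column by column
new <- data.frame(first$A, second$Z, first$B, second$Y, first$C, second$X)
names(new) <- c("A", "Z", "B", "Y", "C", "X")

# Possible shortcut (assumption): bind everything, then interleave by index
n    <- ncol(first)
both <- cbind(first, second[, ncol(second):1])    # columns A B C Z Y X
idx  <- as.vector(rbind(1:n, (n + 1):(2 * n)))    # 1 4 2 5 3 6
new2 <- both[, idx]
print(names(new2))
```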

Re: [R] understanding patterns in categorical vs. continuous data

2006-01-26 Thread Dave Roberts
You might prefer boxplot(insolation~veg_type) as a graphic. That will give you quantiles. To get the actual numeric values you could use for (i in levels(veg_type)) { print(i); print(quantile(insolation[veg_type==i])) } See ?quantile for more help. Dylan Beaudette wrote: Greetings, I have a
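The per-group quantiles suggested in the reply can be sketched as follows, assuming simulated data; the values of insolation and the three vegetation types are invented for illustration. Inside a for loop the result of quantile() is not auto-printed, so it is wrapped in print(). The tapply() call is an alternative added here, not part of the original message.

```r
set.seed(1)
# Hypothetical data: 30 insolation values across three vegetation types
veg_type   <- factor(rep(c("forest", "grass", "shrub"), each = 10))
insolation <- runif(30, min = 100, max = 300)

# Loop over the factor levels, printing the five default quantiles for each
for (i in levels(veg_type)) {
  print(i)
  print(quantile(insolation[veg_type == i]))
}

# Alternative (assumption): collect the same quantiles in one call
q <- tapply(insolation, veg_type, quantile)
```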

Re: [R] combining variables with PCA

2006-01-25 Thread Dave Roberts
Christian, One of the arguments to prcomp() is retx, with a default value of TRUE. As explained in the help file, if retx is TRUE the prcomp object returned by the function contains the projection of the original data along the principal components (which many of us call scores). Thus
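A minimal sketch of where the scores live, assuming random data in a made-up data frame called dat. With retx = TRUE (the default) the projected data are stored in the x component of the returned object; the manual projection is included only as a check.

```r
set.seed(42)
# Hypothetical data: 20 observations on three variables
dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))

pca    <- prcomp(dat)     # retx = TRUE by default
scores <- pca$x           # projection of the data onto the principal components
pc1    <- scores[, 1]     # first-axis scores, one value per observation

# Check: centred data multiplied by the rotation matrix gives the same scores
manual <- scale(dat, center = TRUE, scale = FALSE) %*% pca$rotation
```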

Re: [R] Clustering function

2006-01-17 Thread Dave Roberts
Norman, You're missing a step. You need to convert the data file into a 'dist' object, which is either a distance or dissimilarity matrix. This is typically done by function dist(), but may also be done by other functions which produce dist objects, like daisy() in package cluster,
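The missing step can be sketched like this, assuming random data and assuming hclust() as the clustering function (the original question does not say which function was being called). The names dat, d, hc and groups are hypothetical.

```r
set.seed(1)
# Hypothetical data: 10 samples described by 4 variables
dat <- matrix(rnorm(40), nrow = 10,
              dimnames = list(paste0("s", 1:10), NULL))

d  <- dist(dat)                        # 'dist' object, Euclidean by default
hc <- hclust(d, method = "average")    # cluster the dist object, not the raw data
groups <- cutree(hc, k = 3)            # cut the tree into three clusters
```

Passing the raw matrix or data frame directly to hclust() would fail, which is the usual symptom of skipping the dist() step.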

Re: [R] Age of an object?

2005-12-14 Thread Dave Roberts
This would be extraordinarily helpful, but I have not thought of a graceful way to do it. Everything in R now has a class attribute, but a timestamp for such simple things as vectors seems like overkill. On the other hand, those of us writing packages could implement this pretty easily for

Re: [R] R is GNU S, not C.... [was how to get or store .....]

2005-12-07 Thread Dave Roberts
Well, this has been an interesting thread. I guess my own perspective is warped, having never been a C programmer. My native languages are FORTRAN, Python, and R, all of which accept (or demand) a linefeed as a terminator, rather than a semicolon, and two of which are very particular about

Re: [R] retrieve most abundant species by sample unit

2005-11-09 Thread Dave Roberts
Graham, It's relatively easily done, especially the first one. Let's suppose your veg data frame is called veg. Then dom1 <- apply(veg,1,which.max) returns a vector with the column number of the species with the highest abundance for each sample (if there are ties, it returns the first one). If
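A small sketch of the which.max approach, assuming a made-up veg data frame with three samples and three species. Looking up the species name from the column number is an extra step added here for illustration.

```r
# Hypothetical vegetation table: samples as rows, species as columns
veg <- data.frame(sp1 = c(5, 0, 2),
                  sp2 = c(1, 7, 2),
                  sp3 = c(0, 3, 1),
                  row.names = c("p1", "p2", "p3"))

# Column number of the most abundant species in each sample
dom1 <- apply(veg, 1, which.max)

# Translate column numbers into species names
dom1_name <- names(veg)[dom1]
print(data.frame(sample = rownames(veg), dominant = dom1_name))
```

In sample p3 the first two species tie at 2, so which.max picks the first of them, sp1.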

Re: [R] elements in a matrix to a vector

2005-11-09 Thread Dave Roberts
Mike, It's not clear which way you are doing it now, but this works:
x <- matrix(c(0,2,0,0,0,4,3,0,0),nrow=3)
x
     [,1] [,2] [,3]
[1,]    0    0    3
[2,]    2    0    0
[3,]    0    4    0
y <- as.vector(t(x))
z <- y[y!=0]
z
[1] 3 2 4
Dave Mike Jones wrote: hi all, i'm

Re: [R] how to convert strings back to values?

2005-11-09 Thread Dave Roberts
Eszter, I suspect the problem is different than you think. It's possible that when you read in the data it assumed that the first column was data, not row names, and so when you transpose the first column becomes the first row. Since it is alpha, all the columns become factors rather than

Re: [R] Stress in multidimensional scaling

2005-11-04 Thread Dave Roberts
cmdscale calculates an eigenanalysis of the dissimilarity matrix, and does not employ stress per se. Rather, it attempts to maximize variability along axes. If you call the cmdscale() function with eig=TRUE it returns a list object with the coordinates called points and the eigenvalues
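A brief sketch of the eig=TRUE call, assuming random data and a Euclidean distance matrix; the object names are hypothetical. The proportion-of-variation line is one common way to summarise the eigenvalues and is an addition here, not something taken from the message.

```r
set.seed(1)
# Hypothetical data: 15 samples, 4 variables
dat <- matrix(rnorm(60), nrow = 15)
d   <- dist(dat)

pco    <- cmdscale(d, k = 2, eig = TRUE)   # returns a list when eig = TRUE
coords <- pco$points                       # ordination coordinates, 15 x 2
eigs   <- pco$eig                          # eigenvalues

# Share of the positive eigenvalue total carried by the first two axes
prop <- eigs[1:2] / sum(eigs[eigs > 0])
print(round(prop, 3))
```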

Re: [R] help : matrix row/column random mixing

2005-11-02 Thread Dave Roberts
There are several bootstrap packages available at CRAN that probably provide an elegant solution, but simply permuting the matrix is pretty easy:
data <- matrix(1:100,nrow=5) # matrix of 5 rows and 20 columns
x <- data[sample(1:5),] # permute the rows
y <- x[,sample(1:20)] # permute the

Re: [R] replacing a factor value in a data frame

2005-10-28 Thread Dave Roberts
AT XX TT CC NA NA TT
8 TT XX TT AC AG AG TT
9 AT XX TT CC AG NA TT
10 TT XX TT CC GG GG TT
Notice that the instances of 'CC' in tmp$V7 did not change. HTH, Dave Roberts Federico Calboli wrote: Hi All, I have the following problem, that's driving me mad. I have a dataframe

Re: [R] read data from pdf file

2005-10-21 Thread Dave Roberts
neck a few times. It does not work with acroread (the Linux Acrobat Reader program) however. Dave Roberts Thomas Schönhoff wrote: Hi, 2005/10/21, Thomas Schönhoff [EMAIL PROTECTED]: Hello again, 2005/10/21, Thomas Schönhoff [EMAIL PROTECTED]: 2005/10/21, Ted Harding [EMAIL PROTECTED

Re: [R] Descriptive statistics for tables

2005-09-30 Thread Dave Roberts
If I understand the request, he wants to take a large number of matrices of identical size and stack them into a three-dimensional array, and then calculate statistics on the third dimension. If the multiple arrays have object names they can be combined into a 3-d array a <-
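A minimal sketch of the stacking idea, assuming three small made-up matrices named m1, m2 and m3. Building the array with array(c(...), dim = ...) is one possible construction and an assumption here; the construction used in the original message is cut off above.

```r
# Hypothetical inputs: three matrices of identical size (2 x 3)
m1 <- matrix(1:6, nrow = 2)
m2 <- m1 * 2
m3 <- m1 * 3

# Stack them along a third dimension
a <- array(c(m1, m2, m3), dim = c(nrow(m1), ncol(m1), 3))

# Statistics across the third dimension, one value per cell
cell_mean <- apply(a, c(1, 2), mean)
cell_sd   <- apply(a, c(1, 2), sd)
print(cell_mean)
```

Because the margin c(1, 2) keeps rows and columns, each statistic comes back as a matrix with the same shape as the inputs.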

Re: [R] CART for 0/1 data

2005-09-23 Thread Dave Roberts
Martin, If the data are actually coded 0/1, the tree function would probably interpret them as integers and try a regression instead of a classification. If the dependent variable is called var, try x <- tree(factor(var)~species)

Re: [R] CART for 0/1 data

2005-09-23 Thread Dave Roberts
0.0 1.0 0.0 0.0 ) * I'll try again with a larger dataset and see if it's a memory limitation. Dave Roberts Martin Wegmann wrote: On Friday 23 September 2005 17:08, Dave Roberts wrote: Martin

Re: [R] CART for 0/1 data

2005-09-23 Thread Dave Roberts
locations, but I think it should work. Good luck, Dave Roberts Martin Wegmann wrote: On Friday 23 September 2005 17:08, Dave Roberts wrote: Martin, If the data are actually coded 0/1, the tree function would probably interpret them as integers and try a regression instead

Re: [R] indicator value in labdsv

2005-09-19 Thread Dave Roberts
? Perhaps something altogether different will work better. Thanks, Dave Roberts On Mon, 2005-09-19 at 09:41 +0200, [EMAIL PROTECTED] wrote: Hi, I'm trying to find out what threshold of indicator value in labdsv should be used to accept a species as an indicator one? So far I assumed that indval

Re: [R] indicator value in labdsv

2005-09-19 Thread Dave Roberts
. Thanks, Dave Roberts On Mon, 2005-09-19 at 09:41 +0200, [EMAIL PROTECTED] wrote: Hi, I'm trying to find out what threshold of indicator value in labdsv should be used to accept a species as an indicator one? So far I assumed that indval=0.5 is high enough to avoid any mistakes