Re: [R] User R to create MySQL database and table

2010-05-23 Thread Prof Brian Ripley

On Sat, 22 May 2010, Waverley @ Palo Alto wrote:


Hi,

I am thinking about using R to create a database, then create table in
MySQL server.  Can I do that using RMySQL package?


Maybe: it is done by SQL commands which you can use *if* you have the 
correct privileges.


However, this is R-help, not R-sig-db and discussion of non-R 
programming questions in detail would not be appropriate.  If you do 
want to follow up on R-sig-db, do first study the R posting guide and 
provide the 'at a minimum' information requested.




I am familiar with RMySQL, and in the online help most of the sample
code assumes the database exists and transact with the table inside
the database.

Can someone provide me some sample code to create a database and
table?  Specifically create a database first, then create a table
inside the database.

Thanks a lot in advance.

--
Waverley @ Palo Alto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595
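
For readers of the archive, here is a minimal sketch of the "SQL commands" route mentioned in the reply. It assumes the DBI and RMySQL packages are installed and that the account has the CREATE privilege. The helper names (schema_statements, create_db_and_table), the database and table names, and the two-column table layout are all hypothetical and only serve as illustration.

```r
# Build the SQL statements (hypothetical schema: an integer key and a numeric value).
schema_statements <- function(db, table) {
  c(
    sprintf("CREATE DATABASE IF NOT EXISTS `%s`", db),
    sprintf(
      "CREATE TABLE IF NOT EXISTS `%s`.`%s` (id INT PRIMARY KEY, value DOUBLE)",
      db, table
    )
  )
}

# Connect without naming a database, then send each statement in turn.
# Nothing here touches a server until the function is actually called.
create_db_and_table <- function(user, password, host = "localhost",
                                db = "demo_db", table = "demo_table") {
  con <- DBI::dbConnect(RMySQL::MySQL(),
                        user = user, password = password, host = host)
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  for (sql in schema_statements(db, table)) {
    DBI::dbExecute(con, sql)
  }
  invisible(TRUE)
}

# Example call (commented out because it needs a running MySQL server):
# create_db_and_table(user = "me", password = "secret")
```

Whether the statements succeed depends entirely on the privileges granted to the connecting account, as the reply points out.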



[R] Selecting first 7 elements

2010-05-23 Thread Kang Min
Hi,

I have a list of 100, each list has 20 elements, and I would like to
select the first 7 elements in each list.
Let's take the alphabet as an example.

x <- lapply(1:100, function(i) sample(LETTERS))

I tried x[[1:7]], but it doesn't work. Can anyone enlighten me on how
to do such selections?

Thank you.

Kang Min



Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'

2010-05-23 Thread sedm1000

Sorry - I figured this to be a more commonly defined error than anything
specific to the data/function...  Thanks for looking at this.

The data and function are below. Creating a single line of the data.frame at
a time will work (i.e. fold(s))

For multiple line data.frames, an error is generated. Ideally I would like
to record the output from fold(sq) in a two column data.frame, whether it
requires reading in the data to fold one line at a time or in bulk.

> library(GeneRfold)
> s <- "ATTATGCATCGACTAGCATCACTAG"
> fold(s)
[[1]]
[1] .....

[[2]]
[1] -2.3


> sq <- data.frame(c("ATGTGTGATATGCATGTACAGCATCGAC",
+   "ACTAGCACTAGCATCAGCTGTAGATAGA",
+   "ACTAGCATCGACATCATCGACATGATAG",
+   "CATCGACTACGACTACGTAGATAGATAG",
+   "ATCAGCACTACGACACATAGATAGAATA"))

> fold(sq)
Error in fold(sq) : 
  STRING_ELT() can only be applied to a 'character vector', not a 'list'

> struct <- t(as.data.frame(sapply(sq[,1], fold, t=37)))

Error in FUN(X[[1L]], ...) :
 STRING_ELT() can only be applied to a 'character vector', not a 'integer'


> dput(fold, file="fred123")

function (s, t = 37) 
{
    .Call("foldR", s, t, PACKAGE = "GeneRfold")
}

> dput(sq)
structure(list(c..ATGTGTGATATGCATGTACAGCATCGACACTAGCACTAGCATCAGCTGTAGATAGA...
= structure(c(4L, 
1L, 2L, 5L, 3L), .Label = c("ACTAGCACTAGCATCAGCTGTAGATAGA",
"ACTAGCATCGACATCATCGACATGATAG", 
"ATCAGCACTACGACACATAGATAGAATA", "ATGTGTGATATGCATGTACAGCATCGAC", 
"CATCGACTACGACTACGTAGATAGATAG"), class = "factor")), .Names =
"c..ATGTGTGATATGCATGTACAGCATCGACACTAGCACTAGCATCAGCTGTAGATAGA...",
row.names = c(NA, 
-5L), class = "data.frame")

> dput(s)
"ATTATGCATCGACTAGCATCACTAG"

> sessionInfo()
R version 2.11.0 (2010-04-22)
i386-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] GeneRfold_1.6.0 GeneR_2.18.0

loaded via a namespace (and not attached):
[1] tools_2.11.0

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Error-in-FUN-X-1L-STRING-ELT-can-only-be-applied-to-a-character-vector-not-a-integer-tp2226811p2227512.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Increasing the maximum number of rows

2010-05-23 Thread Wu Gong

Might there be a limit ?

> c <- matrix(1:10000, ncol=200)
> dim(c)
[1]  50 200
> c <- matrix(1:1000000000, ncol=200)
Error: cannot allocate vector of size 3.7 Gb


-
A R learner.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Increasing-the-maximum-number-of-rows-tp2226950p2227578.html
Sent from the R help mailing list archive at Nabble.com.



[R] need help in understanding R code, and maybe some math

2010-05-23 Thread john smith
Hi,
I am trying to implement Higham's algorithm for correcting a non positive
definite covariance matrix.
I found this code in R:
http://projects.cs.kent.ac.uk/projects/cxxr/trac/browser/trunk/src/library/Recommended/Matrix/R/nearPD.R?rev=637

I managed to understand most of it, the only line I really don't understand
is this one:
X <- tcrossprod(Q * rep(d[p], each=nrow(Q)), Q)

This line is supposed to calculate the matrix product Q*D*Q^T, Q is an n by
m matrix and R is a diagonal n by n matrix. What does this mean?
I also don't understand the meaning of a cross product between matrices, I
only know it between vectors.

Thanks,
Barisdad.



[R] importing columns as factors

2010-05-23 Thread Caitlin Sadowski
I have a large csv table I am trying to read into R. I would like each
column to be of type factor. However, most columns have only numeral
entries (e.g. likert scales), so are automatically imported as type
numeric. Is there a way to convert ALL columns to be of type factor,
without having to convert each column manually?

Cheers,

Caitlin



Re: [R] Selecting first 7 elements

2010-05-23 Thread Erik Iverson

Kang Min wrote:

Hi,

I have a list of 100, each list has 20 elements, and I would like to
select the first 7 elements in each list.
Let's take the alphabet as an example.

x <- lapply(1:100, function(i) sample(LETTERS))

I tried x[[1:7]], but it doesn't work. Can anyone enlighten me on how
to do such selections?


"[" is a function, and you want to use it on each element of the list, so...

lapply(x, "[", c(1:7))



Re: [R] Selecting first 7 elements

2010-05-23 Thread Erik Iverson


"[" is a function, and you want to use it on each element of the list, 
so...


lapply(x, "[", c(1:7))


and the call to c() is of course not necessary, since ":" will generate a 
vector.
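
A short runnable sketch of the suggestion above, using the poster's own example data. The variable names first7 and first7_head are illustrative only. Note that x[[1:7]] fails because "[[" with a vector index is applied recursively (element 1, then element 2 of that, and so on), which is not the same as taking the first seven entries of every list element.

```r
set.seed(1)
x <- lapply(1:100, function(i) sample(LETTERS))

# Apply the subsetting function "[" to every element, keeping positions 1 to 7.
first7 <- lapply(x, `[`, 1:7)

# head() gives the same result and may read more naturally.
first7_head <- lapply(x, head, 7)

length(first7)        # one entry per original list element
lengths(first7)[1:3]  # each entry now has 7 letters
```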



[R] How sample without replacement on more than one variables?

2010-05-23 Thread thmsfuller...@gmail.com
Hello All,

sample() only samples from one variable x. But I'm interested in sampling
from more than one variable without replacement.

Suppose I have 3 vectors x, y, z. I want to draw samples from all
three vectors such that the combination of the three elements in each
draw is not the same as in any previous draw. I could use expand.grid to
generate all combinations of the three vectors. But when the number of
vectors is large and the number of elements in some vectors is
large, it will be infeasible to do so.

If you know of a method for sampling from more than one variable,
would you please let me know? Thank you!

-- 
Tom



Re: [R] importing columns as factors

2010-05-23 Thread Erik Iverson

Caitlin Sadowski wrote:

I have a large csv table I am trying to read into R. I would like each
column to be of type factor. However, most columns have only numeral
entries (e.g. likert scales), so are automatically imported as type
numeric. Is there a way to convert ALL columns to be of type factor,
without having to convert each column manually?


It's in the help file for ?read.csv, the colClasses argument:

colClasses: character.  A vector of classes to be assumed for the
  columns.  Recycled as necessary, or if the character vector
  is named, unspecified values are taken to be ‘NA’.
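
As a small illustration of the colClasses argument quoted above: because the value is recycled, a single "factor" applies to every column. The inline CSV text and the column names q1 to q3 are made up for the example; a real call would pass a file name instead of text=.

```r
csv_text <- "q1,q2,q3\n1,5,3\n2,4,3\n1,5,2"

# Read every column as a factor at import time.
survey <- read.csv(text = csv_text, colClasses = "factor")
str(survey)

# Alternative: convert all columns after a default import.
survey_num <- read.csv(text = csv_text)
survey_num[] <- lapply(survey_num, factor)
str(survey_num)
```

The second form is handy when the data frame already exists; assigning into survey_num[] keeps it a data frame rather than turning it into a plain list.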



Re: [R] Selecting first 7 elements

2010-05-23 Thread Kang Min
Thanks a lot, it works!

On May 23, 3:10 pm, Erik Iverson er...@ccbr.umn.edu wrote:
  "[" is a function, and you want to use it on each element of the list,
  so...

  lapply(x, "[", c(1:7))

 and the call to c() is of course not necessary, since ":" will generate a 
 vector.




Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'

2010-05-23 Thread Erik Iverson

Hello,

sedm1000 wrote:

Sorry - I figured that this to be a more common defined error than anything
specific to the data/function...  Thanks for looking at this.

The data and function are below. Creating a single line of the data.frame at
a time will work (i.e. fold(s))

For multiple line data.frames, an error is generated. Ideally I would like
to record the output from fold(sq) in a two column data.frame, whether it
requires reading in the data to fold one line at a time or in bulk.


> library(GeneRfold)
> s <- "ATTATGCATCGACTAGCATCACTAG"
> fold(s)

[[1]]
[1] .....

[[2]]
[1] -2.3



> sq <- data.frame(c("ATGTGTGATATGCATGTACAGCATCGAC",
+   "ACTAGCACTAGCATCAGCTGTAGATAGA",
+   "ACTAGCATCGACATCATCGACATGATAG",
+   "CATCGACTACGACTACGTAGATAGATAG",
+   "ATCAGCACTACGACACATAGATAGAATA"))

> fold(sq)
Error in fold(sq) : 
  STRING_ELT() can only be applied to a 'character vector', not a 'list'

> struct <- t(as.data.frame(sapply(sq[,1], fold, t=37)))
Error in FUN(X[[1L]], ...) :
 STRING_ELT() can only be applied to a 'character vector', not a 'integer'



This appears to be a Bioconductor package, so if this doesn't help, I'd ask on 
the specific bioconductor mailing list.  I don't have the package installed, so 
take the following advice with that in mind.


Did you look at the str(sq) ?  It is not a character vector, it is a factor, so 
you might need to convert or see stringsAsFactors in ?options.


Try

lapply(sq[, 1], function(x) fold(as.character(x)))

If that doesn't work, try the other list.

Good luck,
Erik
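
The factor-versus-character point can be checked without GeneRfold installed. The sketch below uses fold_stub, a hypothetical stand-in that only insists on character input the way the error message suggests the C code does; it is not the real fold(). stringsAsFactors = TRUE is set explicitly to mimic the default of R 2.x.

```r
seqs <- c("ATGTGTGATATGCATGTACAGCATCGAC",
          "ACTAGCACTAGCATCAGCTGTAGATAGA")
sq <- data.frame(seqs, stringsAsFactors = TRUE)

class(sq[, 1])   # a factor, stored internally as integer codes

# Hypothetical stand-in for fold(): rejects anything that is not character.
fold_stub <- function(s) {
  stopifnot(is.character(s))
  nchar(s)
}

# Convert the factor column back to character before applying the function.
chars <- as.character(sq[, 1])
res <- vapply(chars, fold_stub, integer(1), USE.NAMES = FALSE)

# Collect input and output side by side, as the original poster wanted.
out <- data.frame(sequence = chars, result = res, stringsAsFactors = FALSE)
out
```

With the real package, fold() returns a list of two values per sequence, so the result column would have to be built from those pieces instead.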



Re: [R] How sample without replacement on more than one variables?

2010-05-23 Thread Erik Iverson

thmsfuller...@gmail.com wrote:

Hello All,

sample() only sample on one variable x. But I'm interested in sampling
more than one variable without replacement.

Suppose I have 3 vectors x, y, z. I want to draw samples from all
three vectors such that the combination of the three elements in each
draw is not the same as any previous draws. I could use expand.grid to
generate a vector out of the three vectors. But when the number of
vectors are large and the number of elements in some vectors are
large, it will be infeasible to do so.

If you know there is a method on sampling on more than one variables,
would you please let me know? Thank you!



Can you give a reproducible example?  The method you suggested is the most 
reasonable one, but since it will not work in large cases, I suppose you'll have 
to draw independently from each vector one at a time, then somehow concatenate 
the results, perhaps as a character vector, even if the vectors are, say, 
integers.  Then repeat this process, checking each time whether your new 
combination is already %in% the vector of combinations drawn so far.


There may be a much better way, too, see if anyone else responds.

Also, you'll have to think about what a unique sample is.

If

x <- 1:3

y <- 2:4 ,

is x = 2, y = 3 the same as x = 3, y = 2?

Good luck,
Erik
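
One possible way to avoid both the full expand.grid() table and repeated rejection checks is sketched below. It is only an illustration under stated assumptions: the function name sample_combinations is hypothetical, every vector is assumed to contain distinct values, order matters (x = 2, y = 3 is treated as different from x = 3, y = 2), and the total number of combinations must stay within what sample.int() accepts. The idea is to sample linear indices without replacement and decode each index into one position per vector.

```r
sample_combinations <- function(vectors, n) {
  sizes <- lengths(vectors)
  total <- prod(sizes)            # number of possible combinations
  stopifnot(n <= total)

  # Distinct zero-based linear indices into the (never built) full grid.
  idx <- sample.int(total, n) - 1

  # Decode each linear index into one position per vector (mixed radix).
  out <- vector("list", length(vectors))
  for (k in seq_along(vectors)) {
    out[[k]] <- vectors[[k]][idx %% sizes[k] + 1]
    idx <- idx %/% sizes[k]
  }
  names(out) <- names(vectors)
  as.data.frame(out, stringsAsFactors = FALSE)
}

set.seed(42)
draws <- sample_combinations(list(x = 1:3, y = 2:4, z = c("a", "b")), 10)
draws
```

Because distinct linear indices decode to distinct combinations, no draw can repeat an earlier one.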



Re: [R] How sample without replacement on more than one variables?

2010-05-23 Thread dusadrian

This might help, depending on your exact needs:
> v1 <- sample(letters[1:2], 10, replace=TRUE)
> v2 <- sample(letters[3:4], 10, replace=TRUE)
> v3 <- sample(letters[5:6], 10, replace=TRUE)
> aa <- data.frame(v1=v1, v2=v2, v3=v3)
> aa
   v1 v2 v3
1   a  d  e
2   a  d  e
3   a  c  e
4   b  d  e
5   b  d  f
6   a  c  f
7   a  c  f
8   a  c  f
9   a  c  e
10  b  c  e
> bb <- unique(aa)
> bb
   v1 v2 v3
1   a  d  e
3   a  c  e
4   b  d  e
5   b  d  f
6   a  c  f
10  b  c  e

You can sample from the bb dataframe, or from the corresponding rows of
the aa dataframe that are unique (1, 3, 4, 5, 6 and 10) which can be
obtained via rownames(bb).

Hth,
Adrian
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-sample-without-replacement-on-more-than-one-variables-tp2227665p2227683.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Increasing the maximum number of rows

2010-05-23 Thread jim holtman
You are trying to create an object with 1G elements.  Given that these
are integers, this will require about 4GB of space.  If you are
running on a 32-bit system, which has a total physical limit of 2-3GB
depending on what options you are running (at least on Windows), then
you have exceeded the limits.  It is a good idea to limit your largest
object to about 25% of physical memory in case copies have to be made
during some of the analysis.
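
The arithmetic behind that estimate can be reproduced in a few lines. This is only a back-of-the-envelope sketch; the variable names are illustrative, and the 4 bytes per element applies to R integer vectors (doubles take 8).

```r
n_elements <- 1e9          # "1G elements", as in the reply above
bytes_per_integer <- 4     # R stores each integer in 4 bytes

size_gib <- n_elements * bytes_per_integer / 2^30
round(size_gib, 1)         # 3.7, the figure reported in the error message

# A smaller object can be measured directly.
small <- integer(1e6)
print(object.size(small), units = "Mb")
```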


On Sat, May 22, 2010 at 10:31 PM, Wu Gong gho...@gmail.com wrote:

 Might there be a limit ?

 > c <- matrix(1:10000, ncol=200)
 > dim(c)
 [1]  50 200
 > c <- matrix(1:1000000000, ncol=200)
 Error: cannot allocate vector of size 3.7 Gb


 -
 A R learner.
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Increasing-the-maximum-number-of-rows-tp2226950p2227578.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] need help in understanding R code, and maybe some math

2010-05-23 Thread Peter Ehlers

On 2010-05-23 0:56, john smith wrote:

Hi,
I am trying to implement Higham's algorithm for correcting a non positive
definite covariance matrix.
I found this code in R:
http://projects.cs.kent.ac.uk/projects/cxxr/trac/browser/trunk/src/library/Recommended/Matrix/R/nearPD.R?rev=637

I managed to understand most of it, the only line I really don't understand
is this one:
X <- tcrossprod(Q * rep(d[p], each=nrow(Q)), Q)

This line is supposed to calculate the matrix product Q*D*Q^T, Q is an n by
m matrix and R is a diagonal n by n matrix. What does this mean?
I also don't understand the meaning of a cross product between matrices, I
only know it between vectors.


You could have a look at the help page for crossprod which
gives the definitions of crossprod and tcrossprod.

Perhaps this will help:

Q <- matrix(1:12, ncol=3)
v <- rep(1:3, each=nrow(Q))
Q
v
Q * v
(Q * v) %*% t(Q)
tcrossprod(Q * v, Q)

 -Peter Ehlers



Thanks,
Barisdad.



Re: [R] Increasing the maximum number of rows

2010-05-23 Thread Tal Galili
Hello Jim,
It sounds like a good time to go read about the packages
bigmemory
and/or
ff

Best,
Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Sun, May 23, 2010 at 12:31 PM, jim holtman jholt...@gmail.com wrote:

 You are trying to create an object with 1G elements.  Given that these
 are integers, this will require about 4GB of space.  If you are
 running on a 32-bit system, which has a total physical limit of 2-3GB
 depending on what options you are running (at least on Windows), then
 you have exceeded the limits.  It is a good idea to limit your largest
 object to about 25% of physical memory in case copies have to be made
 during some of the analysis.


 On Sat, May 22, 2010 at 10:31 PM, Wu Gong gho...@gmail.com wrote:
 
  Might there be a limit ?
 
  > c <- matrix(1:10000, ncol=200)
  > dim(c)
  [1]  50 200
  > c <- matrix(1:1000000000, ncol=200)
  Error: cannot allocate vector of size 3.7 Gb
 
 
  -
  A R learner.
  --
  View this message in context:
 http://r.789695.n4.nabble.com/Increasing-the-maximum-number-of-rows-tp2226950p2227578.html
  Sent from the R help mailing list archive at Nabble.com.
 
 



 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?





Re: [R] How sample without replacement on more than one variables?

2010-05-23 Thread Bernardo Rangel Tura
On Sun, 2010-05-23 at 00:56 -0700, dusadrian wrote:
 This might help, depending on your exact needs:
  v1 <- sample(letters[1:2], 10, replace=TRUE)
  v2 <- sample(letters[3:4], 10, replace=TRUE)
  v3 <- sample(letters[5:6], 10, replace=TRUE)
  aa <- data.frame(v1=v1, v2=v2, v3=v3)

And now it is simple: sample the rows of the data frame
aa[sample(1:nrow(aa),3),]


-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil



[R] Re : Re : Re : Nomogram with multiple interactions (package rms)

2010-05-23 Thread Marc Carpentier
Thanks for the answer.
Unfortunately, I'm not yet skilled enough to do such a thing. I had a look at 
the code and I'll try to understand it, as a good exercise.
I thought about sending fake fit objects to nomogram(), derived from the 
original one:
- original: f2 <- cph(Surv(d.time,death) ~ 
sex*(rcs(cholesterol,4)+blood.pressure))
- manually derived:
* fMale: with coef rcs(cholesterol,4) and blood.pressure from f2, no sex effect
* fFemale: with aggregated coef sex:rcs(cholesterol,4) for cholesterol and 
sex:blood.pressure for BP and an obligatory sex effect.
But I failed to fool your function. Had to try though...

Marc





----- Original Message -----
From: Frank E Harrell Jr f.harr...@vanderbilt.edu
To: Marc Carpentier marc.carpent...@ymail.com
Cc: r-help-request Mailing List r-help@r-project.org
Sent: Thu, 20 May 2010, 15:30:27
Subject: Re: Re : Re : [R] Nomogram with multiple interactions (package rms)

On 05/20/2010 01:42 AM, Marc Carpentier wrote:
 Thank you for your responses, but I don't think you're right about the doc...
 I carefully looked at it before posting and ran the examples, looked in the 
 Vanderbilt Biostat doc, and just looked again at example(nomogram):
 1st example: categorical*continuous: two axes for each sex
 f <- lrm(y ~ lsp(age,50)+sex*rcs(cholesterol,4)+blood.pressure)

Hi Marc,

My apologies; I misread my own example.  This will take some digging 
into the code.  If you have time to do this before I do, code change 
suggestions welcomed.

Frank



 2nd: continuous*continuous: one age axis for each specified value of 
 cholesterol
 g <- lrm(y ~ sex + rcs(age,3)*rcs(cholesterol,3))

 3rd and 4th: categorical*continuous: two axes for each sex (4th with fun)
 f <- psm(Surv(d.time,death) ~ sex*age, dist='lognormal')

 5th: categorical*continuous: two axes for each sex (with fun)
 g <- lrm(Y ~ age+rcs(cholesterol,4)*sex)

 I'm desperately trying to represent a case of 
 categorical*(continuous+continuous):
 f2 <- cph(Surv(d.time,death) ~ sex*(rcs(cholesterol,4)+blood.pressure))
 The best solution I can think of is to draw one nomogram for each sex:
 Assuming 'male' is the ref level of sex:
 1st nomogram: one axis for rcs(cholesterol,4), one axis for blood.pressure
 2nd nomogram: one axis for sex:rcs(cholesterol,4), one axis for 
 sex:blood.pressure, both shifted because of the effect of sex itself.
 (I drew it badly in my previous mail)
 I didn't see any example of this adjustment of a nomogram to 'male' or 
 'female'...
 I hope I gave a clearer explanation and I'm not wrong about this unmentioned 
 case.

 Marc




 ----- Original Message -----
 From: Frank E Harrell Jr f.harr...@vanderbilt.edu
 To: Marc Carpentier marc.carpent...@ymail.com
 Cc: r-help-request Mailing List r-help@r-project.org
 Sent: Thu, 20 May 2010, 00:55:32
 Subject: Re: Re : [R] Nomogram with multiple interactions (package rms)

 On 05/19/2010 04:36 PM, Marc Carpentier wrote:
 I'm sorry. I don't understand the omit solution, and maybe I misled you 
 with my explanation.

 With the data from the f example of nomogram():
 Let's declare:
 f2 <- cph(Surv(d.time,death) ~ sex*(age+blood.pressure))
 I guess the best (and maybe the only) way to represent it with a nomogram is 
 to plot two nomograms (I couldn't find better).
 Is there a way to have :

 Nomogram1 : male :
 - points 1-100 ---
 - age (men) ---
 - blood.pressure (men) ---
 - linear predictor ---

 And nomogram2 : female :
 - points 1-100 ---
 - age (female) ---
 - blood.pressure (female) ---
 - linear predictor ---

 As I said, I tried and failed (nomogram() still wants me to define
 interact=list(...)) with:
 plot(nomogram(f2, adj.to=list(sex="male"))) #and female for the other one

 Marc

 I think the documentation tells you how to do this.  But you failed to
 look at the output from example(nomogram).  In one of the examples two
 continuous predictors have two axes each, with male and female in close
 proximity.  Or maybe I'm just missing your point.

 Frank




 ----- Original Message -----
 From: Frank E Harrell Jr f.harr...@vanderbilt.edu
 To: Marc Carpentier marc.carpent...@ymail.com; r-help-request Mailing 
 List r-help@r-project.org
 Sent: Wed, 19 May 2010, 22:28:51
 Subject: Re: [R] Nomogram with multiple interactions (package rms)

 On 05/19/2010 03:17 PM, Marc Carpentier wrote:
 Dear list, I'm facing the following problem: a Cox model with my sex
 variable interacting with several continuous variables:
 cph(S~sex*(x1+x2+x3)). And I'd like to make a nomogram. I know it's a
 bit tricky and one might argue that a nomogram is not a good
 choice... I could use the parameter
 interact=list(sex=(male,female),x1=c(a,b,c))... but with rcs or
 pol transformations of x1, x2 and x3, the choice of the
 categorization (a,b,c,...) is arbitrary and the nomogram not so
 useful... Considering that sex is the problem, I thought I could draw
 two nomograms, one for each 

Re: [R] need help in understanding R code, and maybe some math

2010-05-23 Thread Douglas Bates
On Sun, May 23, 2010 at 5:09 AM, Peter Ehlers ehl...@ucalgary.ca wrote:
 On 2010-05-23 0:56, john smith wrote:

 Hi,
 I am trying to implement Higham's algorithm for correcting a non positive
 definite covariance matrix.
 I found this code in R:

 http://projects.cs.kent.ac.uk/projects/cxxr/trac/browser/trunk/src/library/Recommended/Matrix/R/nearPD.R?rev=637

 I managed to understand most of it, the only line I really don't
 understand
 is this one:
 X <- tcrossprod(Q * rep(d[p], each=nrow(Q)), Q)

 This line is supposed to calculate the matrix product Q*D*Q^T, Q is an n
 by
 m matrix and R is a diagonal n by n matrix. What does this mean?
 I also don't understand the meaning of a cross product between matrices, I
 only know it between vectors.

In the original S language, on which R is based, the function named
crossprod was used for what statisticians view as the cross-product of
the columns of a matrix, such as a multivariate data matrix or a model
matrix.  That is

crossprod(X) := X'X

This is a special case of the cross-product of the columns of two
matrices with the same number of rows

crossprod(X, Y) := X'Y

The tcrossprod function was introduced more recently to mean the
crossprod of the transpose of X.  That is

tcrossprod(X) := crossprod(t(X)) := X %*% t(X)

These definitions are unrelated to the cross-product of vectors used
in Physics and related disciplines.

The reason for creating such functions is that these are common
operations in statistical computing and it helps to know the special
structure (e.g. the result of crossprod(X) or tcrossprod(X) is a
symmetric, positive semidefinite matrix).
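
To connect these definitions to the line from nearPD that was asked about, here is a small numerical check. The matrix Q and the vector d are arbitrary values chosen for illustration; they are not taken from nearPD. Multiplying Q elementwise by rep(d, each=nrow(Q)) scales column j of Q by d[j], which equals Q %*% diag(d), so the tcrossprod call yields Q D Q' without ever forming the diagonal matrix.

```r
Q <- matrix(1:12, ncol = 3)   # 4 x 3
d <- c(2, 5, 7)               # one diagonal entry per column of Q

# Column j of Q multiplied by d[j]; same as Q %*% diag(d).
scaled <- Q * rep(d, each = nrow(Q))

X1 <- tcrossprod(scaled, Q)       # scaled %*% t(Q)
X2 <- Q %*% diag(d) %*% t(Q)      # explicit Q D Q'

all.equal(X1, X2)
isSymmetric(X1)
```

Avoiding diag(d) saves a matrix product, which matters when Q is large.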

 You could have a look at the help page for crossprod which
 gives the definitions of crossprod and tcrossprod.

 Perhaps this will help:

 Q <- matrix(1:12, ncol=3)
 v <- rep(1:3, each=nrow(Q))
 Q
 v
 Q * v
 (Q * v) %*% t(Q)
 tcrossprod(Q * v, Q)

  -Peter Ehlers


 Thanks,
 Barisdad.





Re: [R] Regression with sparse matricies

2010-05-23 Thread Douglas Bates
As Frank mentioned in his reply, expecting to estimate tens of
thousands of fixed-effects parameters in a logistic regression is
optimistic.  You could start with a generalized linear mixed model
instead

library(lme4)
fm1 <- glmer(resp ~ 1 + (1|f1) + (1|f2) + (1|f1:f2), mydata, binomial)

If you have difficulty with that it might be best to switch the
discussion to the r-sig-mixed-mod...@r-project.org mailing list.

On Sat, May 22, 2010 at 2:19 PM, Robin Jeffries rjeffr...@ucla.edu wrote:
 I would like to run a logistic regression on some factor variables (main
 effects and eventually an interaction) that are very sparse. I have a
 moderately large dataset, ~100k observations with 1500 factor levels for one
 variable (x1) and 600 for another (X2), creating ~19000 levels for the
 interaction (X1:X2).

 I would like to take advantage of the sparseness in these factors to avoid
 using GLM. Actually glm is not an option given the size of the design
 matrix.

 I have looked through the Matrix package as well as other packages without
 much help.

 Is there some option, some modification of glm, some way that it will
 recognize a sparse matrix and avoid large matrix inversions?

 -Robin





Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'

2010-05-23 Thread David Winsemius


On May 23, 2010, at 3:27 AM, Erik Iverson wrote:


Hello,

sedm1000 wrote:
Sorry - I figured that this to be a more common defined error than  
anything

specific to the data/function...  Thanks for looking at this.
The data and function are below. Creating a single line of the  
data.frame at

a time will work (i.e. fold(s))
For multiple line data.frames, an error is generated. Ideally I  
would like
to record the output from fold(sq) in a two column data.frame,  
whether it

requires reading in the data to fold one line at a time or in bulk.

> library(GeneRfold)
> s <- "ATTATGCATCGACTAGCATCACTAG"
> fold(s)

[[1]]
[1] .....
[[2]]
[1] -2.3

> sq <- data.frame(c("ATGTGTGATATGCATGTACAGCATCGAC",
+   "ACTAGCACTAGCATCAGCTGTAGATAGA",
+   "ACTAGCATCGACATCATCGACATGATAG",
+   "CATCGACTACGACTACGTAGATAGATAG",
+   "ATCAGCACTACGACACATAGATAGAATA"))

> fold(sq)


Building on Erik's comments, perhaps trying:

> sq <- data.frame(s1 = c("ATGTGTGATATGCATGTACAGCATCGAC",
+   "ACTAGCACTAGCATCAGCTGTAGATAGA",
+   "ACTAGCATCGACATCATCGACATGATAG",
+   "CATCGACTACGACTACGTAGATAGATAG",
+   "ATCAGCACTACGACACATAGATAGAATA"), stringsAsFactors=FALSE)

stringsAsFactors=FALSE leaves the character vector unfactored.

> str(sq)
'data.frame':   5 obs. of  1 variable:
 $ s1: chr  "ATGTGTGATATGCATGTACAGCATCGAC" "ACTAGCACTAGCATCAGCTGTAGATAGA"
  "ACTAGCATCGACATCATCGACATGATAG" "CATCGACTACGACTACGTAGATAGATAG" ...


Passing sq would still be passing a list. You probably want just the  
first and only column.


> str(sq$s1)
 chr [1:5] "ATGTGTGATATGCATGTACAGCATCGAC" "ACTAGCACTAGCATCAGCTGTAGATAGA" ...


fold(sq$s1)   # passing a character vector, which is what the error
              # message says is needed.


--
David.
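
The factor-versus-character point can be reproduced without GeneRfold. The
following is only a minimal sketch: needs_character() is a hypothetical
stand-in (not part of GeneRfold) for a function that, like fold(), accepts
only a character vector, and the two sequences are made up. stringsAsFactors
is set explicitly because its default has differed between R versions.

```r
# Two short example sequences (illustrative only)
seqs <- c("ATGTGTGATATG", "ACTAGCACTAGC")

sq_factor <- data.frame(s1 = seqs, stringsAsFactors = TRUE)
sq_char   <- data.frame(s1 = seqs, stringsAsFactors = FALSE)

class(sq_factor$s1)  # factor: integer codes plus levels
class(sq_char$s1)    # character

# Hypothetical stand-in for a function that insists on character input
needs_character <- function(x) {
  stopifnot(is.character(x))
  nchar(x)
}

# Convert the factor column before applying the function element by element
res <- vapply(as.character(sq_factor$s1), needs_character,
              integer(1), USE.NAMES = FALSE)
res
```

Applying the function to one element at a time also gives one result per
sequence, which is the shape needed for a two-column data.frame.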

Error in fold(sq) :
  STRING_ELT() can only be applied to a 'character vector', not a 'list'

struct <- t(as.data.frame(sapply(sq[,1], fold, t=37)))

Error in FUN(X[[1L]], ...) :
  STRING_ELT() can only be applied to a 'character vector', not a 'integer'


This appears to be a Bioconductor package, so if this doesn't help,
I'd ask on the specific Bioconductor mailing list.  I don't have the
package installed, so take the following advice with that in mind.


Did you look at str(sq)?  It is not a character vector, it is a
factor, so you might need to convert or see stringsAsFactors in
?options.


Try

lapply(sq[, 1], function(x) fold(as.character(x)))

If that doesn't work, try the other list.

Good luck,
Erik



David Winsemius, MD
West Hartford, CT



Re: [R] Re : Re : Re : Nomogram with multiple interactions (package rms)

2010-05-23 Thread Frank E Harrell Jr

On 05/23/2010 06:29 AM, Marc Carpentier wrote:

Thanks for the answer.
Unfortunately, I'm not yet skilled enough to do such a thing. I had a look at
the code and I'll try to understand it, as a good exercise.
I thought about sending fake fit objects to nomogram() derived from the
original one:
- original: f2 <- cph(Surv(d.time,death) ~
  sex*(rcs(cholesterol,4)+blood.pressure))
- manually derived:
* fMale: with coef rcs(cholesterol,4) and blood.pressure from f2, no sex effect
* fFemale: with aggregated coef sex:rcs(cholesterol,4) for cholesterol and
  sex:blood.pressure for BP and an obligatory sex effect.
But I failed to fool your function. Had to try though...

Marc


Marc,

Although this feature should really be implemented or fixed in 
nomogram(), you can always use ols to predict (with an R^2 of 1.0) the 
linear predictor from predict(cph fit) setting a variable to a constant 
in the newdata argument to predict, and not using that variable to 
predict the linear predictor.  Then you can make a nomogram from the ols 
model.


Frank
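
The idea above can be illustrated with base R alone. This is only a sketch
under stated assumptions: lm() stands in for both cph() and ols(), and the
data, coefficients and variable names (d, fit, d_male, approx_male) are
simulated or made up for illustration. With rms, the same steps would use the
cph fit, ols() and nomogram() instead.

```r
set.seed(1)
n <- 200
d <- data.frame(
  sex = factor(sample(c("female", "male"), n, replace = TRUE)),
  age = rnorm(n, 50, 10),
  bp  = rnorm(n, 120, 15)
)
d$y <- 0.03 * d$age + 0.01 * d$bp +
  (d$sex == "male") * (0.5 + 0.02 * d$age) + rnorm(n)

# Full model: sex interacts with both continuous predictors
fit <- lm(y ~ sex * (age + bp), data = d)

# Hold sex constant at "male" and compute the linear predictor
d_male <- d
d_male$sex <- factor(rep("male", n), levels = levels(d$sex))
d_male$lp  <- predict(fit, newdata = d_male)

# Re-express that predictor using only the remaining variables
approx_male <- lm(lp ~ age + bp, data = d_male)

# Residuals are numerically zero, i.e. an R^2 of 1
max(abs(resid(approx_male)))
coef(approx_male)
```

Repeating the last three steps with sex held at "female" gives the second
model, so that one nomogram per sex can be drawn from models with no
interaction terms.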








----- Original Message -----
From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
To: Marc Carpentier <marc.carpent...@ymail.com>
Cc: r-help-request Mailing List <r-help@r-project.org>
Sent: Thu, 20 May 2010, 15:30:27
Subject: Re: Re : Re : [R] Nomogram with multiple interactions (package rms)

On 05/20/2010 01:42 AM, Marc Carpentier wrote:

Thank you for your responses, but I don't think you're right about the doc...
I carefully looked at it before posting and ran the examples, looked in the
Vanderbilt Biostat doc, and just looked again at example(nomogram):
1st example: categorical*continuous: two axes for each sex
f <- lrm(y ~ lsp(age,50)+sex*rcs(cholesterol,4)+blood.pressure)


Hi Marc,

My apologies; I misread my own example.  This will take some digging
into the code.  If you have time to do this before I do, code change
suggestions welcomed.

Frank




2nd: continuous*continuous: one age axis for each specified value of
cholesterol
g <- lrm(y ~ sex + rcs(age,3)*rcs(cholesterol,3))

3rd and 4th: categorical*continuous: two axes for each sex (4th with fun)
f <- psm(Surv(d.time,death) ~ sex*age, dist='lognormal')

5th: categorical*continuous: two axes for each sex (with fun)
g <- lrm(Y ~ age+rcs(cholesterol,4)*sex)

I'm desperately trying to represent a case of
categorical*(continuous+continuous):
f2 <- cph(Surv(d.time,death) ~ sex*(rcs(cholesterol,4)+blood.pressure))
The best solution I can think of is to draw one nomogram for each sex.
Assuming 'male' is the ref level of sex:
1st nomogram: one axis for rcs(cholesterol,4), one axis for blood.pressure
2nd nomogram: one axis for sex:rcs(cholesterol,4), one axis for
sex:blood.pressure, both shifted because of sex's own effect.
(I drew it badly in my previous mail)
I didn't see any example of this adjustment of the nomogram to 'male' or
'female'...

I hope I gave a clearer explanation and I'm not wrong about this unmentioned
case.

Marc




----- Original Message -----
From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
To: Marc Carpentier <marc.carpent...@ymail.com>
Cc: r-help-request Mailing List <r-help@r-project.org>
Sent: Thu, 20 May 2010, 00:55:32
Subject: Re: Re : [R] Nomogram with multiple interactions (package rms)

On 05/19/2010 04:36 PM, Marc Carpentier wrote:

I'm sorry. I don't understand the omit solution, and maybe I misled you with
my explanation.

With the data from the f example of nomogram():
Let's declare:
f2 <- cph(Surv(d.time,death) ~ sex*(age+blood.pressure))
I guess the best (and maybe the only) way to represent it with a nomogram is to
plot two nomograms (I couldn't find better).
Is there a way to have :

Nomogram1 : male :
- points 1-100 ---
- age (men) ---
- blood.pressure (men) ---
- linear predictor ---

And nomogram2 : female :
- points 1-100 ---
- age (female) ---
- blood.pressure (female) ---
- linear predictor ---

As I said, I tried and failed (nomogram() still wants me to define
interact=list(...)) with:
plot(nomogram(f2, adj.to=list(sex='male'))) # and 'female' for the other one

Marc


I think the documentation tells you how to do this.  But you failed to
look at the output from example(nomogram).  In one of the examples two
continuous predictors have two axes each, with male and female in close
proximity.  Or maybe I'm just missing your point.

Frank





----- Original Message -----
From: Frank E Harrell Jr <f.harr...@vanderbilt.edu>
To: Marc Carpentier <marc.carpent...@ymail.com>; r-help-request Mailing
List <r-help@r-project.org>
Sent: Wed, 19 May 2010, 22:28:51
Subject: Re: [R] Nomogram with multiple interactions (package rms)

On 05/19/2010 03:17 PM, Marc Carpentier wrote:

Dear list, I'm facing the following problem : A cox model with my sex
variable interacting with several continuous variables :
cph(S~sex*(x1+x2+x3)) And I'd like to make a nomogram. I know it's a
bit tricky and one mights 

Re: [R] Re : Indexing array to 1000

2010-05-23 Thread David Winsemius


On May 22, 2010, at 10:48 PM, Mohan L wrote:


Dear All,

I have an array something like this:


avglog

  January  February     March     April       May      June      July    August September
    60102     83397     56774     48785     49010     40572     38175     47037     51402

The class of the avglog array:

class(avglog)

[1] "array"


str(avglog)

 num [1:9(1d)] 60102 83397 56774 48785 49010 ...
 - attr(*, "dimnames")=List of 1
  ..$ : chr [1:9] "January" "February" "March" "April" ...

I have to normalize this avglog array to 1000. I mean, I need to divide
1000 by avglog[1], multiply all the elements in the array by the result,
and plot a graph of Month vs Index. To achieve this I am using the code
below. I feel there may be a simpler way to do this.


This would accomplish those two goals in two lines:

 plot(normedavglog <- 1000*avglog/avglog[1], xaxt="n")
 axis(1, at=1:9, labels=names(avglog))

--
David.
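
For anyone who wants to try this without the original object, here is a
self-contained sketch. It rebuilds avglog from the values printed in the
question and produces the Month/Index table directly; the names index and
trend follow the question, while the Name column is an addition made for
illustration.

```r
# Rebuild the 1-d array from the values shown in the question
avglog <- array(
  c(60102, 83397, 56774, 48785, 49010, 40572, 38175, 47037, 51402),
  dim = 9,
  dimnames = list(c("January", "February", "March", "April", "May",
                    "June", "July", "August", "September"))
)

# Scale so that the first month equals 1000
index <- 1000 * avglog / avglog[1]

# Month/Index table, without going through matrix() and cbind()
trend <- data.frame(Month = seq_along(index),
                    Name  = names(index),
                    Index = as.vector(index))
trend
```

Because arithmetic on the array is vectorized, the intermediate value,
day1Avg and ID objects are not needed.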




value <- matrix(avglog)

value

       [,1]
 [1,] 60102
 [2,] 83397
 [3,] 56774
 [4,] 48785
 [5,] 49010
 [6,] 40572
 [7,] 38175
 [8,] 47037
 [9,] 51402


day1Avg <- value[1]
day1Avg

[1] 60102

ID <- (1000/day1Avg)
ID

[1] 0.01663838

index <- value*ID
index

           [,1]
 [1,] 1000.0000
 [2,] 1387.5911
 [3,]  944.6275
 [4,]  811.7034
 [5,]  815.4471
 [6,]  675.0524
 [7,]  635.1702
 [8,]  782.6195
 [9,]  855.2461


monthcount <- length(avglog)

Month <- c(1:monthcount)

trend <- cbind(Month, c(index))

colnames(trend) <- c("Month", "Index")

trend

      Month     Index
 [1,]     1 1000.0000
 [2,]     2 1387.5911
 [3,]     3  944.6275
 [4,]     4  811.7034
 [5,]     5  815.4471
 [6,]     6  675.0524
 [7,]     7  635.1702
 [8,]     8  782.6195
 [9,]     9  855.2461

any help will be greatly appreciated.

Thanks  Rg
Mohan L

[[alternative HTML version deleted]]



David Winsemius, MD
West Hartford, CT



[R] Subsetting with a list of vectors

2010-05-23 Thread Kang Min
Hi,

I have a dataset that looks like the one below.

data
plot  plantno.  species
H     31        ABC
D     2         DEF
Y     54        GFE
E     12        ERF
Y     98        FVD
H     4         JKU
J     7         JFG
A     55        EGD
.     .         .
.     .         .
.     .         .

I want to select rows belonging to 7 random plots for 100 times.
(There are 50 plots in total)
So I created a list of 100 vectors, each vector has 7 elements.

samp <- lapply(1:100, function(i) sample(LETTERS))
samp2 <- lapply(samp, "[", 1:7)

How can I select the 26 plots from 'data' using 'samp'?

samp3 <- sample(LETTERS, 7)
samp4 <- subset(data, plot %in% samp3) # this works
samp5 <- subset(data, plot %in% samp2[[1]]) # this works as well

But I used a for loop to get it to select 7 plots 100 times:

for (i in nrow(samp2)) {
  samp6 <- subset(data, plot %in% samp2[[i]])
} # this doesn't work


Am I missing something, or is there a better solution?

Thanks.
Kang Min



Re: [R] Subsetting with a list of vectors

2010-05-23 Thread jim holtman
try this:

> x <- read.table(textConnection("plot plantno. species
+ H  31 ABC
+ D  2   DEF
+ Y  54 GFE
+ E  12 ERF
+ Y  98 FVD
+ H  4   JKU
+ J   7   JFG
+ A  55 EGD"), header=TRUE, as.is=TRUE)
> closeAllConnections()
> # choose 10 groups of 3 samples
> choice <- lapply(1:10, function(.dummy){
+     x[sample(nrow(x),3),]
+ })
>
> choice
[[1]]
  plot plantno. species
3    Y       54     GFE
8    A       55     EGD
4    E       12     ERF

[[2]]
  plot plantno. species
8    A       55     EGD
2    D        2     DEF
6    H        4     JKU

[[3]]
  plot plantno. species
8    A       55     EGD
5    Y       98     FVD
4    E       12     ERF





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Subsetting with a list of vectors

2010-05-23 Thread Kang Min
Thanks, but what I want is not 100 groups of 7 samples. Let's say in
my samp2 I get

[[1]] D H K S E U O
[[2]] H S R V A L B
etc...

I want to select all rows from 'data' containing D H K S E
U O first, then H S R V A L B and so on.
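
What is described here, one subset of rows per element of samp2, can be
sketched with lapply(). The data frame dat below is a small stand-in for
'data' built from the rows shown in the first post, and the draw sizes
(5 draws of 3 plots) are reduced for illustration. The loop in the first
post does nothing because nrow() of a list is NULL, so the body never runs;
it would also overwrite samp6 on every pass.

```r
# Small stand-in for 'data' (rows taken from the first post)
dat <- data.frame(
  plot    = c("H", "D", "Y", "E", "Y", "H", "J", "A"),
  plantno = c(31, 2, 54, 12, 98, 4, 7, 55),
  species = c("ABC", "DEF", "GFE", "ERF", "FVD", "JKU", "JFG", "EGD"),
  stringsAsFactors = FALSE
)

set.seed(42)
# 5 draws of 3 plot labels each (100 draws of 7 in the post)
samp2 <- lapply(1:5, function(i) sample(unique(dat$plot), 3))

# One subset of rows per draw, kept together in a list
subsets <- lapply(samp2, function(p) dat[dat$plot %in% p, ])

length(subsets)   # one data frame per element of samp2
subsets[[1]]      # all rows whose plot is in samp2[[1]]
```

If a for loop is preferred, iterating over seq_along(samp2) and storing each
result in its own list slot gives the same structure.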





Re: [R] Subsetting with a list of vectors

2010-05-23 Thread David Winsemius


On May 23, 2010, at 10:00 AM, Kang Min wrote:


Hi,

I have a dataset that looks like the one below.

data
plot  plantno.  species
H     31        ABC
D     2         DEF
Y     54        GFE
E     12        ERF
Y     98        FVD
H     4         JKU
J     7         JFG
A     55        EGD
.     .         .
.     .         .
.     .         .

I want to select rows belonging to 7 random plots for 100 times.


So you should be thinking about a function that will do what you want  
exactly once and then wrapping it in replicate().




(There are 50 plots in total)
So I created a list of 100 vectors, each vector has 7 elements.

samp <- lapply(1:100, function(i) sample(LETTERS))


Please. Minimal!!!   5 samples should be enough for testing.


samp2 <- lapply(samp, "[", 1:7)

How can I select the 26 plots from 'data' using 'samp'?

samp3 <- sample(LETTERS, 7)


You do not want to sample from LETTERS but rather from the vector of  
data named plot. Otherwise you will not be creating a representative  
sample. And ... plot is a really crappy name for a column. Try to  
avoid naming your columns with names that are common functions.  
Confusion of the humans reading your code is the predictable result,  
and occasional confusion of the R interpreter also may occur.


[After reading your reply to Holtman Or maybe you do want to  
sample from LETTERS. The fix would be obvious.]



samp4 <- subset(data, plot %in% samp3) # this works


So this is what you want to do once:

samp1 <- function() subset(data, plot %in% sample(data$plot, 7) )

samp10 <- replicate(10, samp1())

samp10[,1] will be one sampled subset. (samp10 is now an array of lists.)

Unfortunately, I noticed that even with the minimal data example you
provided (not in reproducible form, unfortunately) I was getting 7
or 8 samples, and realized that using letters to subset was creating
some overlaps whenever H was sampled. So this is safer:


samp1 <- function() data[ sample(1:nrow(data), 7 ),]
samp5 <- replicate(5, samp1() )
for (i in 1:5) print(samp5[,i])

Then I noticed your reply to Holtman, so perhaps you really do want
the first solution. Just so you understand it might not be
statistically correct.


--
David.
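
A variation on the replicate() idea that may be easier to work with keeps each
draw as a whole data frame. This is only a sketch: dat is a stand-in for
'data' built from the rows shown in the question, one_draw() is a helper
named here for illustration, and the plot labels are drawn from
unique(dat$plot) so that each draw covers a fixed number of distinct plots.

```r
# Stand-in for 'data' (rows taken from the question)
dat <- data.frame(
  plot    = c("H", "D", "Y", "E", "Y", "H", "J", "A"),
  plantno = c(31, 2, 54, 12, 98, 4, 7, 55),
  species = c("ABC", "DEF", "GFE", "ERF", "FVD", "JKU", "JFG", "EGD"),
  stringsAsFactors = FALSE
)

set.seed(1)

# Do the job once: draw k distinct plot labels, keep every row in those plots
one_draw <- function(d, k) d[d$plot %in% sample(unique(d$plot), k), ]

# Repeat it; simplify = FALSE keeps each result as a data frame in a list
draws <- replicate(5, one_draw(dat, 3), simplify = FALSE)

sapply(draws, nrow)                                # row counts can differ
sapply(draws, function(d) length(unique(d$plot)))  # always 3 plots
```

The row counts differ between draws because plots H and Y each own two rows
in this small example.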




samp5 <- subset(data, plot %in% samp2[[1]]) # this works as well

But I used a for loop to get it to select 7 plots 100 times:

for (i in nrow(samp2)) {
  samp6 <- subset(data, plot %in% samp2[[i]])
} # this doesn't work

Am I missing something, or is there a better solution?

Thanks.
Kang Min



David Winsemius, MD
West Hartford, CT



[R] creating a reverse geometric sequence

2010-05-23 Thread Erik Iverson

Hello,

Can anyone think of a non-iterative way to generate a decreasing geometric 
sequence in R?


For example, for a hypothetical function dg, I would like:

 dg(20)
[1] 20 10 5 2 1

where I am using integer division by 2 to get each subsequent value in the 
sequence.



There is of course:

dg <- function(x) {
  res <- integer()
  while (x >= 1) {
    res <- c(res, x)
    x <- x %/% 2
  }
  res
}

 dg(20)
[1] 20 10  5  2  1

This implementation of 'dg' uses an iterative 'while' loop.  I'm simply
wondering if there is a way to vectorize this process?


Thanks,
Erik



Re: [R] creating a reverse geometric sequence

2010-05-23 Thread Duncan Murdoch

Erik Iverson wrote:

Hello,

Can anyone think of a non-iterative way to generate a decreasing geometric 
sequence in R?


For example, for a hypothetical function dg, I would like:

  dg(20)
[1] 20 10 5 2 1

where I am using integer division by 2 to get each subsequent value in the 
sequence.



There is of course:

dg <- function(x) {
  res <- integer()
  while (x >= 1) {
    res <- c(res, x)
    x <- x %/% 2
  }
  res
}

  dg(20)
[1] 20 10  5  2  1

This implementation of 'dg' uses an iterative 'while' loop.  I'm simply
wondering if there is a way to vectorize this process?
  

Something like this should work, at least for integer bases:

base <- 2
len <- ceiling(log(x, base))
floor(x/base^(seq_len(len)-1))


Duncan Murdoch



[R] plotCI overlay

2010-05-23 Thread Rick Reiss

I'm using the plotCI function and I'd like to overlay additional means
with CIs onto an existing plotCI-created plot in a different color.  Is
this possible?  Thanks.

Rick



Re: [R] creating a reverse geometric sequence

2010-05-23 Thread Dan Davison
Erik Iverson er...@ccbr.umn.edu writes:

 Hello,

 Can anyone think of a non-iterative way to generate a decreasing
 geometric sequence in R?

 For example, for a hypothetical function dg, I would like:

 dg(20)
 [1] 20 10 5 2 1

 where I am using integer division by 2 to get each subsequent value in
 the sequence.


 There is of course:

 dg <- function(x) {
   res <- integer()
   while (x >= 1) {
     res <- c(res, x)
     x <- x %/% 2
   }
   res
 }

 dg(20)
 [1] 20 10  5  2  1

 This implementation of 'dg' uses an iterative 'while' loop.  I'm
 simply wondering if there is a way to vectorize this process?

Hi Erik,

How about

dg - function(x) {
maxi - floor(log(x)/log(2))
floor(x / (2^(0:maxi)))
}

I don't think the remainders cause a problem.

Dan
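
The claim about the remainders can be checked against the original loop. The
sketch below is an illustration only: dg_loop and dg_vec are names chosen
here, and log2() is used in place of log(x)/log(2). The number of terms is
floor(log2(x)) + 1, which also covers exact powers of two.

```r
# Iterative version from the original question
dg_loop <- function(x) {
  res <- integer()
  while (x >= 1) {
    res <- c(res, x)
    x <- x %/% 2
  }
  res
}

# Vectorized version: divide x by 1, 2, 4, ... in one step
dg_vec <- function(x) {
  k <- floor(log2(x))   # number of halvings before reaching 1
  x %/% 2^(0:k)
}

dg_vec(20)   # 20 10 5 2 1

# Compare both versions for every start value from 1 to 1000
same <- vapply(1:1000,
               function(x) isTRUE(all.equal(dg_loop(x), dg_vec(x))),
               logical(1))
all(same)
```

The two agree because repeated integer halving gives the same result as a
single integer division by the corresponding power of two.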


 Thanks,
 Erik



Re: [R] plotCI overlay

2010-05-23 Thread Ben Bolker
Rick Reiss rreiss at exponent.com writes:

 
 
 I'm using the plotCI function and I'd like to overlay additional means
 with CIs onto an existing plotCI-created plot in a different color.  Is
 this possible?  Thanks.
 
 Rick
 
 

  Assuming you mean the one from the plotrix package: use add=TRUE

e.g.:

library(plotrix)
plotCI(1:5,1:5,1,xlim=c(0,6))
plotCI((1:5)+0.2,rep(4,5),0.5,col=2,add=TRUE)



Re: [R] creating a reverse geometric sequence

2010-05-23 Thread Ben Bolker
Erik Iverson eriki at ccbr.umn.edu writes:

 Can anyone think of a non-iterative way to generate a decreasing geometric 
 sequence in R?

Reduce("%/%", rep(2,4), init=20, accumulate=TRUE)
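
To use this for an arbitrary start value, the number of halvings has to be
computed rather than hard-coded as 4. A minimal sketch, where dg_reduce is a
name chosen here for illustration; note that Reduce() still iterates
internally, so this is compact rather than truly vectorized.

```r
dg_reduce <- function(x) {
  steps <- floor(log2(x))   # halvings needed to get from x down to 1
  Reduce(`%/%`, rep(2, steps), init = x, accumulate = TRUE)
}

dg_reduce(20)   # 20 10 5 2 1
dg_reduce(1)    # 1
```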



Re: [R] creating a reverse geometric sequence

2010-05-23 Thread David Winsemius


On May 23, 2010, at 1:43 PM, Erik Iverson wrote:


Hello,

Can anyone think of a non-iterative way to generate a decreasing  
geometric sequence in R?


For example, for a hypothetical function dg, I would like:

 dg(20)
[1] 20 10 5 2 1

where I am using integer division by 2 to get each subsequent value  
in the sequence.


 dg <- function(ratio, len) (ratio)^( 0:(len-1) )

 20*dg(.5, 20)
 [1] 2.000000e+01 1.000000e+01 5.000000e+00 2.500000e+00 1.250000e+00 6.250000e-01
 [7] 3.125000e-01 1.562500e-01 7.812500e-02 3.906250e-02 1.953125e-02 9.765625e-03
[13] 4.882812e-03 2.441406e-03 1.220703e-03 6.103516e-04 3.051758e-04 1.525879e-04
[19] 7.629395e-05 3.814697e-05

 (20*dg(.5, 20))[1:19] / (20*dg(.5, 20))[2:20]
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2




There is of course:

dg <- function(x) {
  res <- integer()
  while (x >= 1) {
    res <- c(res, x)
    x <- x %/% 2
  }
  res
}

 dg(20)
[1] 20 10  5  2  1

This implementation of 'dg' uses an iterative 'while' loop.  I'm
simply wondering if there is a way to vectorize this process?


Thanks,
Erik



David Winsemius, MD
West Hartford, CT



[R] order issue

2010-05-23 Thread Zoppoli, Gabriele (NIH/NCI) [G]
Hi everybody, this is a real dummy thing.

I sorted a matrix based on a given column, and what I get is right, until it
comes to columns with both negative and positive values; then order() orders
everything from max to min within the negative values, and then AGAIN from
max to min within the positive values!!!

Why isn't everything ordered from max to min, and that's it?

Thank you!!!

Attached is the txt file I use; try:

x=x[order(x[,2]),]

What I get is:

print(x)


          Product        A        B   Tissue
44  ME:MDA_MB_435  -0.1915 -0.16744 Melanoma
17     CNS:SNB_75 -0.23183  1.03945      CNS
37       LE:K_562 -0.58218   1.8581 Leukemia
43    ME:MALME_3M -0.67327 -1.33493 Melanoma
49    ME:UACC_257 -0.72431 -1.84753 Melanoma
42         ME:M14 -0.73942 -0.73904 Melanoma
40          LE:SR -0.93541  2.95346 Leukemia
25      CO:SW_620 -1.53265 -1.35446    Colon
63      RE:CAKI_1 -2.48443  0.43245    Renal
39   LE:RPMI_8226 -2.59561  -1.9448 Leukemia
26        LC:A549 -2.66221  0.71215     Lung
61        RE:A498 -2.89402  0.93287    Renal
9       BR:HS578T -2.94118   1.1217   Breast
34    LC:NCI_H522 -2.94381   0.3859     Lung
66       RE:TK_10 -2.95281  1.26245    Renal
52 OV:NCI_ADR_RES -3.04456  0.17046  Ovarian
57     OV:SK_OV_3 -3.04477  2.15405  Ovarian
53     OV:OVCAR_3  -3.0705 -0.31743  Ovarian
14     CNS:SF_295 -3.09348 -1.00095      CNS
54     OV:OVCAR_4 -3.13137 -0.47497  Ovarian
36       LE:HL_60 -3.16745 -3.16745 Leukemia
38      LE:MOLT_4 -3.20055 -1.72841 Leukemia
11  BR:MDA_MB_231 -3.24907  1.58326   Breast
59        PR:PC_3 -3.36612  1.39328 Prostate
19     CO:HCT_116 -3.39764  0.43061    Colon
12        BR:T47D -3.41228  1.13818   Breast
22      CO:HCT_15 -3.45342  0.16357    Colon
64     RE:RXF_393 -3.49615  2.59144    Renal
28      LC:HOP_62  -3.4968  0.67884     Lung
60       RE:786_0  -3.5086  1.75056    Renal
35    LE:CCRF_CEM -3.54526 -2.09262 Leukemia
29      LC:HOP_92 -3.60636  0.87116     Lung
21    CO:HCC_2998 -3.61457 -0.32362    Colon
13     CNS:SF_268 -3.63916  2.54378      CNS
20     CO:COLO205 -3.64656  0.54344    Colon
56     OV:OVCAR_8 -3.66053  -0.9594  Ovarian
24        CO:KM12 -3.68703  2.19991    Colon
55     OV:OVCAR_5  -3.7852  2.43038  Ovarian
8       BR:BT_549 -3.80239 -0.43099   Breast
15     CNS:SF_539 -3.86184  1.39114      CNS
65       RE:SN12C -3.90776  0.85244    Renal
31     LC:NCI_H23 -3.91625 -1.14955     Lung
62        RE:ACHN -3.96246 -0.62365    Renal
67       RE:UO_31 -3.99791 -1.09215    Renal
10        BR:MCF7 -4.00187  1.46303   Breast
51      OV:IGROV1 -4.02758  2.04324  Ovarian
23        CO:HT29 -4.11624 -0.02799    Colon
41     ME:LOXIMVI  -4.2572  0.37259 Melanoma
32   LC:NCI_H322M -4.28534  1.66783     Lung
27        LC:EKVX -4.32847  1.66042     Lung
58      PR:DU_145 -4.33961  1.57548 Prostate
30    LC:NCI_H226 -4.37408 -0.22311     Lung
33    LC:NCI_H460   0.0042  -0.6023     Lung
18       CNS:U251  0.01263  1.66389      CNS
16     CNS:SNB_19  0.16583  0.03737      CNS
45       ME:MDA_N  0.21077  0.05502 Melanoma
50     ME:UACC_62  0.52503   0.1605 Melanoma
46    ME:SK_MEL_2  0.55255  -1.6667 Melanoma
47   ME:SK_MEL_28   1.7425  1.45266 Melanoma
48    ME:SK_MEL_5  1.74749 -1.47817 Melanoma

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of 
Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

Re: [R] order issue

2010-05-23 Thread Jim Holtman
do 'str' on your object to see if you have factors where you think you  
have numerics.
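
The listing is consistent with the column being sorted as text rather than as
numbers, although only str() on the real object can confirm that. The sketch
below uses a handful of values copied from the post; the names vals, f and
num are chosen here for illustration. It shows why a factor does not sort
numerically and how to convert it back.

```r
# A few values from the post, stored as text
vals <- c("-0.1915", "0.0042", "-4.37408", "1.74749", "-0.23183")

# As a factor (or character), sorting follows the labels, not the numbers
f <- factor(vals)
as.numeric(f)                        # level codes, not the values

# Convert via character to recover the actual numbers
num <- as.numeric(as.character(f))

num[order(num)]                      # increasing numeric order
num[order(num, decreasing = TRUE)]   # max to min
```

For a data frame, the same conversion applied to the sort column (column A in
the listing) before calling order() should give a single run from min to max.
A matrix that also holds the Product and Tissue text is character throughout,
so the numeric columns need converting there as well.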


What is the problem you are trying to solve?

Sent from my iPhone.

On May 23, 2010, at 17:39, Zoppoli, Gabriele (NIH/NCI) [G] zoppo...@mail.nih.gov 
 wrote:



Hi everybody, this is a real dummy thing.

I sorted a matrix based on a given column, and what I get is right,  
until it comes to columns of negative and positive values; than,  
order orders everything from max to min in the negative values,  
and then AGAIN from max to min in the positive values!!!


Why isn't everything order from max to min, and that's it?

Thank you!!!

Attached is the txt file I use; try:

x=x[order(x[,2]),]

What I get is:

print(x)


          Product        A        B   Tissue
44  ME:MDA_MB_435  -0.1915 -0.16744 Melanoma
17     CNS:SNB_75 -0.23183  1.03945      CNS
37       LE:K_562 -0.58218   1.8581 Leukemia
43    ME:MALME_3M -0.67327 -1.33493 Melanoma
49    ME:UACC_257 -0.72431 -1.84753 Melanoma
42         ME:M14 -0.73942 -0.73904 Melanoma
40          LE:SR -0.93541  2.95346 Leukemia
25      CO:SW_620 -1.53265 -1.35446    Colon
63      RE:CAKI_1 -2.48443  0.43245    Renal
39   LE:RPMI_8226 -2.59561  -1.9448 Leukemia
26        LC:A549 -2.66221  0.71215     Lung
61        RE:A498 -2.89402  0.93287    Renal
9       BR:HS578T -2.94118   1.1217   Breast
34    LC:NCI_H522 -2.94381   0.3859     Lung
66       RE:TK_10 -2.95281  1.26245    Renal
52 OV:NCI_ADR_RES -3.04456  0.17046  Ovarian
57     OV:SK_OV_3 -3.04477  2.15405  Ovarian
53     OV:OVCAR_3  -3.0705 -0.31743  Ovarian
14     CNS:SF_295 -3.09348 -1.00095      CNS
54     OV:OVCAR_4 -3.13137 -0.47497  Ovarian
36       LE:HL_60 -3.16745 -3.16745 Leukemia
38      LE:MOLT_4 -3.20055 -1.72841 Leukemia
11  BR:MDA_MB_231 -3.24907  1.58326   Breast
59        PR:PC_3 -3.36612  1.39328 Prostate
19     CO:HCT_116 -3.39764  0.43061    Colon
12        BR:T47D -3.41228  1.13818   Breast
22      CO:HCT_15 -3.45342  0.16357    Colon
64     RE:RXF_393 -3.49615  2.59144    Renal
28      LC:HOP_62  -3.4968  0.67884     Lung
60       RE:786_0  -3.5086  1.75056    Renal
35    LE:CCRF_CEM -3.54526 -2.09262 Leukemia
29      LC:HOP_92 -3.60636  0.87116     Lung
21    CO:HCC_2998 -3.61457 -0.32362    Colon
13     CNS:SF_268 -3.63916  2.54378      CNS
20     CO:COLO205 -3.64656  0.54344    Colon
56     OV:OVCAR_8 -3.66053  -0.9594  Ovarian
24        CO:KM12 -3.68703  2.19991    Colon
55     OV:OVCAR_5  -3.7852  2.43038  Ovarian
8       BR:BT_549 -3.80239 -0.43099   Breast
15     CNS:SF_539 -3.86184  1.39114      CNS
65       RE:SN12C -3.90776  0.85244    Renal
31     LC:NCI_H23 -3.91625 -1.14955     Lung
62        RE:ACHN -3.96246 -0.62365    Renal
67       RE:UO_31 -3.99791 -1.09215    Renal
10        BR:MCF7 -4.00187  1.46303   Breast
51      OV:IGROV1 -4.02758  2.04324  Ovarian
23        CO:HT29 -4.11624 -0.02799    Colon
41     ME:LOXIMVI  -4.2572  0.37259 Melanoma
32   LC:NCI_H322M -4.28534  1.66783     Lung
27        LC:EKVX -4.32847  1.66042     Lung
58      PR:DU_145 -4.33961  1.57548 Prostate
30    LC:NCI_H226 -4.37408 -0.22311     Lung
33    LC:NCI_H460   0.0042  -0.6023     Lung
18       CNS:U251  0.01263  1.66389      CNS
16     CNS:SNB_19  0.16583  0.03737      CNS
45       ME:MDA_N  0.21077  0.05502 Melanoma
50     ME:UACC_62  0.52503   0.1605 Melanoma
46    ME:SK_MEL_2  0.55255  -1.6667 Melanoma
47   ME:SK_MEL_28   1.7425  1.45266 Melanoma
48    ME:SK_MEL_5  1.74749 -1.47817 Melanoma

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology,  
University of Genova, Genova, Italy

Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov


Re: [R] order issue

2010-05-23 Thread Zoppoli, Gabriele (NIH/NCI) [G]
I tried this, but it doesn't change the division between negative and 
positive values (see that you have first positive from max to min, and then 
negative from min to max, as if order considered only the absolute values...)

   Product        hsa.miR.204 hsa.miR.210 Tissue
48 ME:SK_MEL_5    1.74749     -1.47817    Melanoma
47 ME:SK_MEL_28   1.7425      1.45266     Melanoma
46 ME:SK_MEL_2    0.55255     -1.6667     Melanoma
50 ME:UACC_62     0.52503     0.1605      Melanoma
45 ME:MDA_N       0.21077     0.05502     Melanoma
16 CNS:SNB_19     0.16583     0.03737     CNS
18 CNS:U251       0.01263     1.66389     CNS
33 LC:NCI_H460    0.0042      -0.6023     Lung
30 LC:NCI_H226    -4.37408    -0.22311    Lung
58 PR:DU_145      -4.33961    1.57548     Prostate
27 LC:EKVX        -4.32847    1.66042     Lung
32 LC:NCI_H322M   -4.28534    1.66783     Lung
41 ME:LOXIMVI     -4.2572     0.37259     Melanoma
23 CO:HT29        -4.11624    -0.02799    Colon
51 OV:IGROV1      -4.02758    2.04324     Ovarian
10 BR:MCF7        -4.00187    1.46303     Breast
67 RE:UO_31       -3.99791    -1.09215    Renal
62 RE:ACHN        -3.96246    -0.62365    Renal
31 LC:NCI_H23     -3.91625    -1.14955    Lung
65 RE:SN12C       -3.90776    0.85244     Renal
15 CNS:SF_539     -3.86184    1.39114     CNS
8  BR:BT_549      -3.80239    -0.43099    Breast
55 OV:OVCAR_5     -3.7852     2.43038     Ovarian
24 CO:KM12        -3.68703    2.19991     Colon
56 OV:OVCAR_8     -3.66053    -0.9594     Ovarian
20 CO:COLO205     -3.64656    0.54344     Colon
13 CNS:SF_268     -3.63916    2.54378     CNS
21 CO:HCC_2998    -3.61457    -0.32362    Colon
29 LC:HOP_92      -3.60636    0.87116     Lung
35 LE:CCRF_CEM    -3.54526    -2.09262    Leukemia
60 RE:786_0       -3.5086     1.75056     Renal
28 LC:HOP_62      -3.4968     0.67884     Lung
64 RE:RXF_393     -3.49615    2.59144     Renal
22 CO:HCT_15      -3.45342    0.16357     Colon
12 BR:T47D        -3.41228    1.13818     Breast
19 CO:HCT_116     -3.39764    0.43061     Colon
59 PR:PC_3        -3.36612    1.39328     Prostate
11 BR:MDA_MB_231  -3.24907    1.58326     Breast
38 LE:MOLT_4      -3.20055    -1.72841    Leukemia
36 LE:HL_60       -3.16745    -3.16745    Leukemia
54 OV:OVCAR_4     -3.13137    -0.47497    Ovarian
14 CNS:SF_295     -3.09348    -1.00095    CNS
53 OV:OVCAR_3     -3.0705     -0.31743    Ovarian
57 OV:SK_OV_3     -3.04477    2.15405     Ovarian
52 OV:NCI_ADR_RES -3.04456    0.17046     Ovarian
66 RE:TK_10       -2.95281    1.26245     Renal
34 LC:NCI_H522    -2.94381    0.3859      Lung
9  BR:HS578T      -2.94118    1.1217      Breast
61 RE:A498        -2.89402    0.93287     Renal
26 LC:A549        -2.66221    0.71215     Lung
39 LE:RPMI_8226   -2.59561    -1.9448     Leukemia
63 RE:CAKI_1      -2.48443    0.43245     Renal
25 CO:SW_620      -1.53265    -1.35446    Colon
40 LE:SR          -0.93541    2.95346     Leukemia
42 ME:M14         -0.73942    -0.73904    Melanoma
49 ME:UACC_257    -0.72431    -1.84753    Melanoma
43 ME:MALME_3M    -0.67327    -1.33493    Melanoma
37 LE:K_562       -0.58218    1.8581      Leukemia
17 CNS:SNB_75     -0.23183    1.03945     CNS
44 ME:MDA_MB_435  -0.1915     -0.16744    Melanoma

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of 
Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

From: Jorge Ivan Velez [jorgeivanve...@gmail.com]
Sent: Sunday, May 23, 2010 6:09 PM
To: Zoppoli, Gabriele (NIH/NCI) [G]
Subject: Re: [R] order issue

Hi Gabriele,

Take a look at the decreasing argument in ?order

> xx <- c(1,3,4,10, 5,2,3,8,9)
> xx
[1]  1  3  4 10  5  2  3  8  9
> xx[order(xx)]
[1]  1  2  3  3  4  5  8  9 10
> xx[order(xx, decreasing = TRUE)]
[1] 10  9  8  5  4  3  3  2  1

HTH,
Jorge


On Sun, May 23, 2010 at 5:39 PM, Zoppoli, Gabriele (NIH/NCI) [G] 
zoppo...@mail.nih.gov wrote:
Hi everybody, this is a real dummy thing.

I sorted a matrix based on a given column, and what I get is right, until it 
comes to columns of negative and positive values; than, order orders 
everything from max to min in the negative values, and then AGAIN from max to 
min in the positive values!!!

Why isn't everything order from max to min, and that's it?

Thank you!!!

Attached is the txt file I use; try:

x=x[order(x[,2]),]


Re: [R] order issue

2010-05-23 Thread Ted Harding
On 23-May-10 21:39:06, Zoppoli, Gabriele (NIH/NCI) [G] wrote:
 Hi everybody, this is a real dummy thing.
 
 I sorted a matrix based on a given column, and what I get is right,
 until it comes to columns of negative and positive values; than,
 order orders everything from max to min in the negative values, and
 then AGAIN from max to min in the positive values!!!
 
 Why isn't everything order from max to min, and that's it?
 Thank you!!!
 
 Attached is the txt file I use; try:
 
 x=x[order(x[,2]),]
 

Somewhat strange indeed! The only further question I can think of
is to ask what x looked like before you re-ordered it.
Using the x.txt file you supplied, I get:

  x <- read.table("x.txt")
  str(x)
  # 'data.frame':   60 obs. of  4 variables:
  #  $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30
  #    36 42 35 33 18 56 32 ...
  #  $ A      : num  -0.192 -0.232 -0.582 -0.673 -0.724 ...
  #  $ B      : num  -0.167 1.039 1.858 -1.335 -1.848 ...
  #  $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4
  #    ...


so x[,2] and x[,3] are indeed numeric. Then (similar to yours):

  X <- x[order(x[,2]),]
  print(X)
  #           Product        A        B   Tissue
  # 30    LC:NCI_H226 -4.37408 -0.22311     Lung
  # 58      PR:DU_145 -4.33961  1.57548 Prostate
  # 27        LC:EKVX -4.32847  1.66042     Lung
  # 32   LC:NCI_H322M -4.28534  1.66783     Lung
  # 41     ME:LOXIMVI -4.25720  0.37259 Melanoma
  # 23        CO:HT29 -4.11624 -0.02799    Colon
  # 51      OV:IGROV1 -4.02758  2.04324  Ovarian
  # 10        BR:MCF7 -4.00187  1.46303   Breast
  # 67       RE:UO_31 -3.99791 -1.09215    Renal
  # 62        RE:ACHN -3.96246 -0.62365    Renal
  # 31     LC:NCI_H23 -3.91625 -1.14955     Lung
  # 65       RE:SN12C -3.90776  0.85244    Renal
  

Re: [R] order issue

2010-05-23 Thread Zoppoli, Gabriele (NIH/NCI) [G]
This is what I get:

str(x)

 chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:60] "48" "47" "46" "50" ...
  ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"

It doesn't make much sense to me...

I would like to have the second column ordered from min to max, or from max to 
min (with the argument decreasing=TRUE), but order seems to reorder 
everything without treating negative numbers as smaller than positive ones...


Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of 
Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

From: Jim Holtman [jholt...@gmail.com]
Sent: Sunday, May 23, 2010 6:07 PM
To: Zoppoli, Gabriele (NIH/NCI) [G]
Cc: R help
Subject: Re: [R] order issue

do 'str' on your object to see if you have factors where you think you
have numerics.
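
A small illustration of why that check matters. The values below are made up for the example (they are not read from x.txt). Calling as.numeric() directly on a factor returns its internal level codes rather than the numbers it displays, so the usual idiom is to go through as.character() first:

```r
# Hypothetical values stored as a factor, as can happen when a column
# that looks numeric is read in together with stray text.
f <- factor(c("-0.19", "-4.37", "0.0042", "1.74"))

codes <- as.numeric(f)                # level codes (1..4), not the data
vals  <- as.numeric(as.character(f))  # the numbers the factor displays

vals[order(vals)]                     # ascending numeric order
vals[order(vals, decreasing = TRUE)]  # descending numeric order
```

str(f) reports "Factor w/ 4 levels", which is the hint to look for in the str() output of the real object.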

What is the problem you are trying to solve?

Sent from my iPhone.

On May 23, 2010, at 17:39, Zoppoli, Gabriele (NIH/NCI) [G] 
zoppo...@mail.nih.gov
  wrote:

 Hi everybody, this is a real dummy thing.

 I sorted a matrix based on a given column, and what I get is right,
 until it comes to columns of negative and positive values; than,
 order orders everything from max to min in the negative values,
 and then AGAIN from max to min in the positive values!!!

 Why isn't everything order from max to min, and that's it?

 Thank you!!!

 Attached is the txt file I use; try:

 x=x[order(x[,2]),]


Re: [R] order issue

2010-05-23 Thread Zoppoli, Gabriele (NIH/NCI) [G]
crazy stuff!!! I tried to reload the txt file, and now it's working...

this is the original (attached)

thanks!

Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of 
Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

From: Ted Harding [ted.hard...@manchester.ac.uk]
Sent: Sunday, May 23, 2010 6:31 PM
To: Zoppoli, Gabriele (NIH/NCI) [G]
Cc: R help
Subject: RE: [R] order issue

On 23-May-10 21:39:06, Zoppoli, Gabriele (NIH/NCI) [G] wrote:
 Hi everybody, this is a real dummy thing.

 I sorted a matrix based on a given column, and what I get is right,
 until it comes to columns of negative and positive values; than,
 order orders everything from max to min in the negative values, and
 then AGAIN from max to min in the positive values!!!

 Why isn't everything order from max to min, and that's it?
 Thank you!!!

 Attached is the txt file I use; try:

 x=x[order(x[,2]),]


Re: [R] order issue

2010-05-23 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Zoppoli, 
 Gabriele (NIH/NCI) [G]
 Sent: Sunday, May 23, 2010 3:44 PM
 To: ted.hard...@manchester.ac.uk
 Cc: R-help@r-project.org
 Subject: Re: [R] order issue
 
 crazy stuff!!! I tried to reload the txt file, and now it's working...

When you reloaded the txt file (with what function?) it
probably was made into a data.frame, with some columns
factors or characters and some columns numerics.  It looks
like your original problem arose after you converted that
data.frame into a matrix, all of whose columns must be
the same (character in this case).  Sorting character
representations of numbers is different than sorting the
numbers as numbers.
   > sort(c(1, 0.05, 0., -0.10, -2))
  [1] -2.00 -0.10  0.00  0.05  1.00
   > sort(as.character(c(1, 0.05, 0., -0.10, -2)))
  [1] "-0.1" "-2"   "0"    "0.05" "1"

Use str(x) again to see if this is what is happening. 
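
The same effect on a matrix can be sketched as follows. This is only an illustration: the four rows are borrowed from the table in this thread, while the two-column layout and the object names are assumptions. method = "radix" is used so that the character ordering is done in the C locale and does not depend on the session's collation settings.

```r
# Toy all-character matrix, as produced when a data.frame with text
# columns is converted to a matrix.
m <- cbind(Product = c("ME:MDA_MB_435", "LC:NCI_H226",
                       "LC:NCI_H460", "ME:SK_MEL_5"),
           A = c("-0.1915", "-4.37408", "0.0042", "1.74749"))

# Ordering the strings: compared character by character, so "-0..."
# sorts before "-4...", and all "-" strings sort before the digits.
chr_ord <- order(m[, "A"], method = "radix")

# Ordering the numbers those strings represent.
num_ord <- order(as.numeric(m[, "A"]))

m[chr_ord, ]   # -0.1915, -4.37408, 0.0042, 1.74749
m[num_ord, ]   # -4.37408, -0.1915, 0.0042, 1.74749
```

The character ordering reproduces the pattern in the original report: negatives in order of increasing magnitude, followed by the positives.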

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com   

 

Re: [R] order issue

2010-05-23 Thread David Winsemius


On May 23, 2010, at 6:32 PM, Zoppoli, Gabriele (NIH/NCI) [G] wrote:


This is what I get:

str(x)

chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
- attr(*, "dimnames")=List of 2
 ..$ : chr [1:60] "48" "47" "46" "50" ...
 ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"

It doesn't make much sense to me...


 How did you bring that text file into R? Both Ted and I are getting:

 > str(x)
'data.frame':   60 obs. of  4 variables:
 $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30 36 42 35 33 18 56 32 ...
 $ A      : num  -0.192 -0.232 -0.582 -0.673 -0.724 ...
 $ B      : num  -0.167 1.039 1.858 -1.335 -1.848 ...
 $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4 ...


Your x is a  60 x 4 matrix of all character elements.

If I try:
x[ order(as.character(x[,2])),]

I get the same behavior as you describe.

--
David.




I would like to have the second column ordered from max to min, or  
from min to max (with the argument decreasing=TRUE), but order  
seems to reorder everything without considering negative number as  
smaller than positive ones...



Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology,  
University of Genova, Genova, Italy

Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov


[R] Split-plot design in GLM with only fixed factors.

2010-05-23 Thread Ivan Allaman

Good evening gentlemen!


I have a split-plot experiment in a randomized block design where my
response is a binomial variable. I wonder if there is any way to calculate
the probabilities for my factors while accounting for the design's error
terms, of which there are two in this case. I have looked at various threads
here and elsewhere, and unfortunately none of them answers my problem
directly, even though it is very simple. I am not interested in estimating
variance components, so I see no reason to use functions like LM, as
suggested in most topics. If anyone knows a way to calculate the
probabilities for my factors as we do in an LM model, i.e., declaring the
error terms so that the probability for the factor of interest is not
biased, I'd be grateful.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Split-plot-design-in-GLM-with-only-fixed-factors-tp2228126p2228126.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] order issue

2010-05-23 Thread Zoppoli, Gabriele (NIH/NCI) [G]
after read.delim:

'data.frame':   60 obs. of  4 variables:
 $ Cell       : Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 23 51 20 25 34 16 44 3 60 55 ...
 $ hsa-miR-204: num  -4.37 -4.34 -4.33 -4.29 -4.26 ...
 $ hsa-miR-210: num  -0.223 1.575 1.66 1.668 0.373 ...
 $ Tissue     : Factor w/ 9 levels "Breast","CNS",..: 5 8 5 5 6 3 7 1 9 9 ...

before:

 chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:60] "48" "47" "46" "50" ...
  ..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"

It looks like the issue is that, after the first time I read.delim-ed the txt 
file, I removed the first three rows by doing

x=x[-c(1:3),]

because the first three rows were characters (parameters like probe name, 
chromosomal position etc.)

So maybe R remembers that the columns used to be character and not numeric... 
How would you tell R (sorry for the naive wording, but I've learnt R by myself 
over time and I misuse some words; I hope it's clear anyway) that a matrix is 
all numeric? Doing as.numeric(x) turns everything into one long column of 
numbers, but loses the matrix structure...
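
One way this conversion could be done, sketched on a toy matrix. The object names and the three rows are assumptions made for illustration, not the real x. Assigning mode(x) <- "numeric" converts the elements while keeping dim and dimnames, whereas a plain as.numeric(x) drops them. For a table that mixes text and numbers, a data frame with one type per column tends to be the simpler structure.

```r
# Toy all-character matrix holding only the two measurement columns.
m <- cbind(A = c("-0.1915", "-4.37408", "0.0042"),
           B = c("-0.16744", "-0.22311", "-0.6023"))

num <- m
mode(num) <- "numeric"   # elements converted; dim and dimnames kept

# Mixed table: keep it as a data frame and convert column by column.
df <- data.frame(Product = c("p1", "p2", "p3"), m,
                 stringsAsFactors = FALSE)
df$A <- as.numeric(df$A)
df$B <- as.numeric(df$B)

df[order(df$A), ]        # rows sorted by A as numbers
```

Columns that hold names such as Product or Tissue cannot go into a numeric matrix at all, which is why the data frame route is shown as well.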

Thank you all guys! You're really precious!




Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of 
Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD

Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppo...@mail.nih.gov

From: William Dunlap [wdun...@tibco.com]
Sent: Sunday, May 23, 2010 7:05 PM
To: Zoppoli, Gabriele (NIH/NCI) [G]; ted.hard...@manchester.ac.uk
Cc: R-help@r-project.org
Subject: RE: [R] order issue

 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of Zoppoli,
 Gabriele (NIH/NCI) [G]
 Sent: Sunday, May 23, 2010 3:44 PM
 To: ted.hard...@manchester.ac.uk
 Cc: R-help@r-project.org
 Subject: Re: [R] order issue

 crazy stuff!!! I tried to reload the txt file, and now it's working...

When you reloaded the txt file (with what function?) it
probably was made into a data.frame, with some columns
factors or characters and some columns numerics.  It looks
like your original problem arose after you converted that
data.frame into a matrix, all of whose columns must be
the same (character in this case).  Sorting character
representations of numbers is different than sorting the
numbers as numbers.
   > sort(c(1, 0.05, 0., -0.10, -2))
  [1] -2.00 -0.10  0.00  0.05  1.00
   > sort(as.character(c(1, 0.05, 0., -0.10, -2)))
  [1] "-0.1" "-2"   "0"    "0.05" "1"

Use str(x) again to see if this is what is happening.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



[R] sum of certain length

2010-05-23 Thread Roslina Zakaria
Hi r-users,
 
I have the data below.  I would like to obtain the weekly rainfall sum.  That 
is, I would like to find the sum for day 1 to day 7, day 8 to day 14, and so on.
   year month day rain
1  1922 1   1  0.0
2  1922 1   2  0.0
3  1922 1   3  0.0
4  1922 1   4  0.0
5  1922 1   5  0.0
6  1922 1   6  0.0
7  1922 1   7  0.0
8  1922 1   8  6.6
9  1922 1   9  1.5
10 1922 1  10  0.0
11 1922 1  11  0.0
12 1922 1  12  4.8
13 1922 1  13 14.7
14 1922 1  14  0.0
15 1922 1  15  0.0
16 1922 1  16  0.0
17 1922 1  17  0.0
18 1922 1  18  0.0
19 1922 1  19  0.0
20 1922 1  20  0.8
21 1922 1  21  0.0
22 1922 1  22  0.0
23 1922 1  23  0.0
24 1922 1  24  0.0
25 1922 1  25  0.0
26 1922 1  26  0.0
27 1922 1  27  0.0
28 1922 1  28  0.0
29 1922 1  29  0.0
30 1922 1  30  0.0
31 1922 1  31  0.0
32 1922 2   1  0.0
33 1922 2   2  0.0
34 1922 2   3  0.0
35 1922 2   4  0.0
36 1922 2   5  0.0
37 1922 2   6  0.0
38 1922 2   7  0.0
39 1922 2   8  0.0
40 1922 2   9  0.0

Thank you.


  


[R] retrieve path analysis coefficients (package agricolae)

2010-05-23 Thread Zack Holden
Dear list,
I'd like to use path.analysis in the package agricolae in batch format on
many files, retrieving the path coefficients for each run and appending them
to a table. I don't see any posts in the help files about this package or
the path.analysis package. I've tried creating an object out of the call to
path.analysis, but no matter what I try, the function automatically prints
the result. I'll be grateful for any assistance.

Thanks in advance,
Zack



Re: [R] sum of certain length

2010-05-23 Thread Bill.Venables
This is one way to do it.  Suppose your data is in the file rainfall.txt, as 
set out below.  Then

> dat <- read.table("rainfall.txt", header = TRUE)
> dat <- within(dat, {
+   date <- as.Date(paste(year, month, day, sep = "-"))
+   week <- factor(as.numeric(date - date[1]) %/% 7)
+ })
> wRain <- with(dat, tapply(rain, week, sum))
> wRain
   0    1    2    3    4    5 
 0.0 27.6  0.8  0.0  0.0  0.0 
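
For readers without a rainfall.txt to hand, here is a self-contained sketch
of the same idea. It rebuilds the 40 posted rows inline (an assumption made
only so the example runs on its own) and groups them into 7-day blocks
counted from the first date.

```r
# Rebuild the posted data: 31 days of January and 9 days of February 1922,
# all zero rainfall except the five days listed in the question.
dat <- data.frame(
  year  = 1922,
  month = c(rep(1, 31), rep(2, 9)),
  day   = c(1:31, 1:9),
  rain  = 0
)
dat$rain[c(8, 9, 12, 13, 20)] <- c(6.6, 1.5, 4.8, 14.7, 0.8)

# Week index = whole number of 7-day blocks elapsed since the first date.
dat$date <- as.Date(paste(dat$year, dat$month, dat$day, sep = "-"))
dat$week <- as.numeric(dat$date - dat$date[1]) %/% 7

# Sum rainfall within each week; the last week here is a partial one.
wRain <- tapply(dat$rain, dat$week, sum)
wRain
```

Because the grouping is driven by dates rather than row positions, a
missing day shifts nothing: the remaining rows still land in the right week.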
 

 



[R] library location and error messages when loading packages

2010-05-23 Thread Daisy Englert Duursma
Hello,

I am running R on a server that several people share. Previously we
all had separate libraries for R.
I have now set up R so that everyone on the server shares the same
library: I downloaded the latest version of R and installed it on the
main drive of our server, in the Program Files folder (obvious
enough).

I changed the environment variables in the advanced system settings
so that R_LIBS is C:\\RLIBRARY, and restarted the server.

The commands:
> .libPaths()
[1] "C:\\RLIBRARY"                    "C:/PROGRA~1/R/R-211~1.0/library"

> .Library
[1] "C:/PROGRA~1/R/R-211~1.0/library"


When I try to load several packages, R says it cannot load them
(although many packages do work). So I tried to install the packages
again (I deleted the old ones and downloaded zip files of the new
ones), and this is what happened:

> library(Hmisc)
Error in library.dynam(lib, package, package.lib) :
  shared library 'cluster' not found
Error: package/namespace load failed for 'Hmisc'

> utils:::menuInstallPkgs()
trying URL 
'http://cran.ms.unimelb.edu.au/bin/windows/contrib/2.11/cluster_1.12.3.zip'
Content type 'application/zip' length 340188 bytes (332 Kb)
opened URL
downloaded 332 Kb
package 'cluster' successfully unpacked and MD5 sums checked
The downloaded packages are in
   C:\Users\Daisy Englert\AppData\Local\Temp\2\RtmpWGZV31\downloaded_packages


> library(cluster)
Error in get(Info[i, 1], envir = env) :
  internal error -3 in R_decompress1
Error: package/namespace load failed for 'cluster'


**But I can fix this by setting lib.loc:

> library(cluster, lib.loc = "C:/PROGRA~1/R/R-211~1.0/library")

**Unfortunately this is not where the updated packages went; the
updated packages went to C:\\RLIBRARY. I have messed something up
and I do not know how to fix it.

Any advice would be welcome.


Thanks,
Daisy

-- 
Daisy Englert Duursma

Room E8C156
Dept. Biological Sciences
Macquarie University  NSW  2109
Australia



Re: [R] Error in FUN(X[[1L]], ...) : STRING_ELT() can only be applied to a 'character vector', not a 'integer'

2010-05-23 Thread sedm1000

Thanks for your time with this. Erik's solution works best to deal with the
input... I'll try to reshape the output back into the appropriate columns.

David, fold(sq$s1) only outputs the result for the first sequence in the
list I'm afraid. The 'fold' function doesn't deal well with spaces...

Thanks again.


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Error-in-FUN-X-1L-STRING-ELT-can-only-be-applied-to-a-character-vector-not-a-integer-tp2226811p2228157.html
Sent from the R help mailing list archive at Nabble.com.



[R] high-dimensional contingency table

2010-05-23 Thread Claudia Rodriguez
Dear Friends,
I am just starting to use R. On this occasion I want to construct a
high-dimensional contingency table, because I want to create a mosaic plot
with the vcd package.
My table is in this format:

    año ac.rep    cat.gru conteos
1  2005      R    parejas     253
2  2005      N    parejas      23
3  2006      R    parejas     347
4  2006      N    parejas      39
5  2007      R    parejas     266
6  2007      N    parejas      83
7  2005      R solitarios      53
8  2005      N solitarios       1
9  2006      R solitarios     109
10 2006      N solitarios       8
11 2007      R solitarios      85
12 2007      N solitarios      34
13 2005      R      trios      29
14 2005      N      trios       1
15 2006      R      trios      62
16 2006      N      trios      19
17 2007      R      trios      48
18 2007      N      trios       3

How can I do this?
I read the help for the mosaic command and found that its input data look
like a high-dimensional contingency table (for example the Titanic data),
but I was unable to convert my table into that form.
Thank you very much!!!
With best wishes

-- 
Claudia I. Rodríguez-Flores
Maestra en Ciencias Biológicas
Laboratorio de Ecología, UBIPRO
UNAM FES-Iztacala
52-55-56231130



[R] AM/PM strptime %p failing 2.11.0 WinXP

2010-05-23 Thread Samuel Dennis
I am attempting to import dates in the following format to R:
5/20/2010 6:45:32 PM

Unfortunately I am unable to get the AM/PM conversion (%p) to work
correctly under either 2.11.0 or 2.8.1.
> strptime("5/20/2010 6:45:32 PM", "%m/%d/%Y %I:%M:%S %p")
[1] NA

but
> strptime("5/20/2010 6:45:32", "%m/%d/%Y %I:%M:%S")
[1] "2010-05-20 06:45:32"

showing that the problem is with %p.

I could only find one previous mention of this issue in the archives (
http://tolstoy.newcastle.edu.au/R/e2/help/06/11/6272.html) , which provided
no solution beyond upgrading R (which I have done), and just suggested it
was a problem with that particular installation of R and Windows.

What could I do to get this function working on my Windows XP machine?

Thank you,

Samuel Dennis
sjdenn...@gmail.com



Re: [R] AM/PM strptime %p failing 2.11.0 WinXP

2010-05-23 Thread Mario Valle
I know it is not very useful to you, but on Vista with 2.11.patched it 
works:

> strptime("5/20/2010 6:45:32 PM", "%m/%d/%Y %I:%M:%S %p")
[1] "2010-05-20 18:45:32"
> strptime("5/20/2010 6:45:32", "%m/%d/%Y %I:%M:%S")
[1] "2010-05-20 06:45:32"

 sessionInfo()
R version 2.11.0 Patched (2010-04-26 r51822)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

Maybe you can try to set the LANGUAGE to English.
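
One way to try that suggestion, sketched under the assumption that the
failure comes from the time locale: ?strptime describes %p as the AM/PM
indicator "in the locale", so a locale without AM/PM strings could plausibly
give NA. The snippet switches LC_TIME to the portable "C" locale for the
parse and then restores the previous setting. Whether this cures the
Windows XP case reported above is not confirmed in this thread.

```r
# Remember the current time locale so it can be restored afterwards.
old_lc_time <- Sys.getlocale("LC_TIME")

# The "C" locale uses the English AM/PM indicators.
invisible(Sys.setlocale("LC_TIME", "C"))

# tz = "UTC" is an arbitrary choice here, made to keep the result
# independent of the local time zone.
x <- strptime("5/20/2010 6:45:32 PM", "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
format(x, "%Y-%m-%d %H:%M:%S")

# Put the original locale back.
invisible(Sys.setlocale("LC_TIME", old_lc_time))
```

Comparing Sys.getlocale("LC_TIME") on the failing and working machines
would show whether the two sessions really differ in this respect.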
Good luck!
mario


--
Ing. Mario Valle
Data Analysis and Visualization Group| 
http://www.cscs.ch/~mvalle

Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82



Re: [R] SAS for R-users

2010-05-23 Thread Thomas Levine
Thanks for the suggestions! This will keep me busy for a while.

Tom

2010/5/15 Muenchen, Robert A (Bob) muenc...@utk.edu:
 Thomas Levine wrote:
Bob Muenchen says that 'Ralph O’Brien says that
in a few years there will be so many students
graduating knowing mainly R that [he]’ll need to
write, “SAS for R Users.” That’ll be the day!'

 Heh! I quite agree. I've had a few people write me saying they had used my 
 book R for SAS and SPSS Users to learn SAS, but I certainly didn't aim for 
 that when writing it. For R programmers wanting to learn SAS, here's what I 
 recommend:

 1. Read the text of the free version of R for SAS and SPSS Users at 
 http://r4stats.com. That version has extremely short explanations of the 
 differences by topic. Most of the explanation about R is in the form of 
 comments in the R programs, which you can skip of course. The SAS programs 
 will give you an idea of the basics. The book version adds lots of 
 explanation but it's all about R, so skip that.

 2. Read The Little SAS Book 
 http://www.amazon.com/Little-SAS-Book-Primer-Third/dp/1590473337/ref=sr_1_1?ie=UTF8&s=books&qid=1273963558&sr=8-1

 This is a quick and easy read that covers the basics well.

 3. Read SAS and R 
 http://www.amazon.com/SAS-Management-Statistical-Analysis-Graphics/dp/1420070576/ref=sr_1_1?ie=UTF8&s=books&qid=1273963594&sr=1-1

 SAS and R is a good book that covers both SAS and R. The explanations are 
 very brief but well written. That brevity allows it to cover a lot of ground.

 4. For in-depth topics, the SAS documentation is well written and all online: 
 http://support.sas.com/documentation/index.html

 Although the SAS manuals are online, knowing what to look up is the challenge 
 for an R user. That's where 1 and 3 will help.

 Get ready for a whole different kind of world!

 Cheers,
 Bob

 =
  Bob Muenchen (pronounced Min'-chen), Manager
  Research Computing Support
  Voice: (865) 974-5230
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research,
  News:  http://oit.utk.edu/research/news.php
  Feedback: http://oit.utk.edu/feedback/
 =







Re: [R] high-dimensional contingency table

2010-05-23 Thread David Winsemius


On May 23, 2010, at 8:41 PM, Claudia Rodriguez wrote:


Dear Friends,
I am just starting to use R. On this occasion I want to construct a
high-dimensional contingency table, because I want to create a mosaic
plot with the vcd package.
My table is in this format:

    año ac.rep    cat.gru conteos
1  2005      R    parejas     253
2  2005      N    parejas      23
3  2006      R    parejas     347
4  2006      N    parejas      39
5  2007      R    parejas     266
6  2007      N    parejas      83
7  2005      R solitarios      53
8  2005      N solitarios       1
9  2006      R solitarios     109
10 2006      N solitarios       8
11 2007      R solitarios      85
12 2007      N solitarios      34
13 2005      R      trios      29
14 2005      N      trios       1
15 2006      R      trios      62
16 2006      N      trios      19
17 2007      R      trios      48
18 2007      N      trios       3

How can I do this?
I read the help for the mosaic command and found that its input data
look like a high-dimensional contingency table (for example the Titanic
data), but I was unable to convert my table into that form.


mosaic's help page says you need to supply a data.frame or a
contingency table. Given that you do not have separate records for
each individual, but rather have counts in the last column, you can
use xtabs to create a table object. Note the help page of xtabs says:

## xtabs() <-> as.data.frame.table()

You need to tell xtabs which column has the counts.

xtabs(conteos ~ ., dta)
mosaic(cat.gru ~ año, data = xtabs(conteos ~ ., dta))
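
A self-contained sketch of those two calls, for anyone who wants to run
them. The data frame `dta` is rebuilt inline from the posted table; the
column name `anio` is an ASCII stand-in for `año` (an assumption made to
sidestep encoding problems), and the vcd call is guarded so the rest still
runs when vcd is not installed.

```r
# Rebuild the posted table: 3 years x 2 activity codes x 3 group types.
dta <- data.frame(
  anio    = rep(rep(c(2005, 2006, 2007), each = 2), times = 3),
  ac.rep  = rep(c("R", "N"), times = 9),
  cat.gru = rep(c("parejas", "solitarios", "trios"), each = 6),
  conteos = c(253, 23, 347, 39, 266, 83,
               53,  1, 109,  8,  85, 34,
               29,  1,  62, 19,  48,  3)
)

# The left-hand side names the count column; the right-hand side names the
# classifying factors (equivalent to conteos ~ . for this data frame).
tab <- xtabs(conteos ~ anio + ac.rep + cat.gru, data = dta)
ftable(tab)

# Mosaic plot of group type by year, only if vcd is available.
if (requireNamespace("vcd", quietly = TRUE)) {
  vcd::mosaic(cat.gru ~ anio, data = tab)
}
```

The result is a three-way table, so any pair of factors can be put in the
mosaic formula without rebuilding it.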

--
David.


Thank you very much!!!
With best wishes

--
Claudia I. Rodríguez-Flores
Maestra en Ciencias Biológicas
Laboratorio de Ecología, UBIPRO
UNAM FES-Iztacala
52-55-56231130



David Winsemius, MD
West Hartford, CT



[R] ROC curve

2010-05-23 Thread Changbin Du
HI, Dear R community,

I would like to know how to select the optimal decision threshold from a
ROC curve. Which threshold gives the highest accuracy?

Thanks!

-- 
Sincerely,
Changbin
--



Re: [R] sum of certain length

2010-05-23 Thread Gabor Grothendieck
If the days are consecutive with no missing rows then the dates don't
need to be calculated and it could be represented as a ts series with
a frequency of 7.  Just aggregate it down to a frequency of 1:

   rain <- ts(dat$rain, freq = 7)
   aggregate(rain, 1)

If there are missing rows (or even if there are none missing) you could
use zoo.  This time let's use dates:

   library(zoo)
   rain <- with(dat, zoo(rain, as.Date(paste(year, month, day, sep = "-"))))
   week <- 7 * (as.numeric(time(rain) - start(rain)) %/% 7) + start(rain) + 6
   aggregate(rain, week)

Each point in the aggregated series is associated with the date of the
end of its week.
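
A small base-R sketch of the first (ts) approach above, with the 40 posted
rainfall values rebuilt inline so it runs on its own. One thing to be aware
of, as far as I can tell from how aggregate() treats ts objects: only
complete blocks of 7 are summed, so the trailing 5 days of the posted data
are left out of the result.

```r
# The posted rainfall series: 40 consecutive days, mostly zero.
rain_values <- numeric(40)
rain_values[c(8, 9, 12, 13, 20)] <- c(6.6, 1.5, 4.8, 14.7, 0.8)

# Treat each block of 7 consecutive observations as one "period".
rain <- ts(rain_values, frequency = 7)

# Aggregate down to one value per period, summing by default.
weekly <- aggregate(rain, nfrequency = 1)
weekly
```

With 40 observations this gives 5 weekly totals; the date-based zoo version
keeps the partial sixth week instead.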

