[R] How to change the order of columns in a data frame?

2012-02-17 Thread Joel Fürstenberg-Hägg
Dear all,
 
I have a data frame in which the columns need to be ordered. The first column X 
is at the right position, but the remaining columns X1-Xn should be ordered 
like this: X1, X2, X3 etc instead of like below.
 
> colnames(pos1)
 [1] "X"   "X1"  "X10" "X11" "X12" "X13" "X14" "X15" "X16" "X17" "X18" "X19"
[13] "X2"  "X20" "X3"  "X4"  "X5"  "X6"  "X7"  "X8"  "X9"

> pos1[1:5,1:5]
      X       X1       X10       X11       X12
1 100.5 7949.469 18509.064  8484.969 17401.056
2 101.5 3080.058  7794.691  3211.323  8211.058
3 102.5 1854.347  4347.571  1783.846  4827.338
4 103.5 2064.441  8421.746  2012.536  8363.785
5 104.5 9650.402 26637.926 10730.647 27053.421
 
I am trying to first change the first column name to something without an X and 
save the names as a vector. I would then remove the X from each name and use the 
vector for renaming the columns. Then columns 2-n could be ordered, I hope...

colnames(pos)[1] <- "Mass"
columnNames <- colnames(pos)
 
Do any of you have an idea how to do this, or is there perhaps a smoother 
solution?
Would it be easier to solve it if the contents of the first column were 
extracted and used as row names instead?
 
Best regards,
 
Joel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to change the order of columns in a data frame?

2012-02-17 Thread Joel Fürstenberg-Hägg
It does not work when using more variables, and my data frames usually
contain about a thousand columns...
 
Best,
 
Joel
 
> fakedata <- data.frame(A=c(0,0,0), X1=c(1,1,1), X6=c(6,6,6),
+ X7=c(7,7,7), X3=c(3,3,3), X4=c(4,4,4), X9=c(9,9,9), X2=c(2,2,2),
+ X8=c(8,8,8), X5=c(5,5,5))
> fakedata
  A X1 X6 X7 X3 X4 X9 X2 X8 X5
1 0  1  6  7  3  4  9  2  8  5
2 0  1  6  7  3  4  9  2  8  5
3 0  1  6  7  3  4  9  2  8  5
> pos <- colnames(fakedata)[2:ncol(fakedata)]
> pos
[1] "X1" "X6" "X7" "X3" "X4" "X9" "X2" "X8" "X5"
> pos <- c(1, 1+as.numeric(gsub("X", "", pos)))
> pos
 [1]  1  2  7  8  4  5 10  3  9  6
> fakedata[,  pos]
  A X1 X9 X2 X7 X3 X5 X6 X8 X4
1 0  1  9  2  7  3  5  6  8  4
2 0  1  9  2  7  3  5  6  8  4
3 0  1  9  2  7  3  5  6  8  4

 Sarah Goslee sarah.gos...@gmail.com 17-02-2012 14:36 
> fakedata <- data.frame(A=c(0,0,0), X2=c(2,2,2), X1=c(1,1,1),
+ X3=c(3,3,3))
> fakedata
  A X2 X1 X3
1 0  2  1  3
2 0  2  1  3
3 0  2  1  3
> pos <- colnames(fakedata)[2:ncol(fakedata)]
> pos <- c(1, 1+as.numeric(gsub("X", "", pos)))
> fakedata[,  pos]
  A X1 X2 X3
1 0  1  2  3
2 0  1  2  3
3 0  1  2  3


Sarah




-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] How to change the order of columns in a data frame?

2012-02-17 Thread Joel Fürstenberg-Hägg
@Alfredo
 
The X is removed, but the reordering does not work:
 
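> colnames(df)[1] <- "Mass"
> columnNames <- colnames(df)
> colnames(df)
 [1] "Mass" "X1"   "X10"  "X11"  "X12"  "X13"  "X14"  "X15"  "X16"  "X17"  "X18"
[12] "X19"  "X2"   "X20"  "X3"   "X4"   "X5"   "X6"   "X7"   "X8"   "X9"
 
> colnames(df) <- gsub("X","",colnames(df))
> colnames(df)
 [1] "Mass" "1"    "10"   "11"   "12"   "13"   "14"   "15"   "16"   "17"   "18"
[12] "19"   "2"    "20"   "3"    "4"    "5"    "6"    "7"    "8"    "9"

> df <- df[,colnames(df)]
> colnames(df)
 [1] "Mass" "1"    "10"   "11"   "12"   "13"   "14"   "15"   "16"   "17"   "18"
[12] "19"   "2"    "20"   "3"    "4"    "5"    "6"    "7"    "8"    "9"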
 
Best,
 
Joel
 
 Alfredo Alessandrini caveneb...@gmail.com 17-02-2012 14:40 
Hi Joel,

to replace the colnames:

colnames(dataframe) <- gsub("X","",colnames(dataframe))

to order by colnames:

dataframe <- dataframe[,colnames(dataframe)]



Alfredo




Re: [R] How to change the order of columns in a data frame?

2012-02-17 Thread Joel Fürstenberg-Hägg
@ Jim
 
That would work for just a few columns, but I will have around 1000 of
them so I need something more generic.
 
best,
 
Joel

 jim holtman jholt...@gmail.com 17-02-2012 14:44 
pos2 <- pos1[, c("X", "X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8",
                 "X9", "X10", "X11", "X12",
                 "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X20")]
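
For a large number of columns, the vector of names does not have to be typed out. A minimal sketch, assuming the columns after the first are named X1 to Xn with no gaps; the data frame built here is a made-up stand-in, not the poster's data:

```r
# Hypothetical stand-in: first column X, then X1..X12 in alphabetical order
n <- 12
cols <- c("X", sort(paste0("X", 1:n)))
pos1 <- as.data.frame(matrix(0, nrow = 2, ncol = n + 1,
                             dimnames = list(NULL, cols)))

# Build the wanted column order programmatically and index by name
wanted <- c("X", paste0("X", seq_len(n)))
pos2 <- pos1[, wanted]
colnames(pos2)
```

If some of the numbered columns are missing, indexing by name fails, so sorting on the numeric part of the names is the safer route.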





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] How to change the order of columns in a data frame?

2012-02-17 Thread Joel Fürstenberg-Hägg
Thank you Sarah, now it works!!

 Sarah Goslee sarah.gos...@gmail.com 17-02-2012 15:13 
Sorry, it should be:
> fakedata[, order(pos)]
  A X1 X2 X3 X4 X5 X6 X7 X8 X9
1 0  1  2  3  4  5  6  7  8  9
2 0  1  2  3  4  5  6  7  8  9
3 0  1  2  3  4  5  6  7  8  9

Using order also ensures that non-sequential column ids will work:

> fakedata <- data.frame(A=c(0,0,0), X1=c(1,1,1), X6=c(6,6,6),
+ X7=c(7,7,7), X3=c(3,3,3), X4=c(4,4,4), X9=c(9,9,9), X2=c(2,2,2),
+ X8=c(8,8,8))
> pos <- colnames(fakedata)[2:ncol(fakedata)]
> pos <- c(1, 1+as.numeric(gsub("X", "", pos)))
> fakedata
  A X1 X6 X7 X3 X4 X9 X2 X8
1 0  1  6  7  3  4  9  2  8
2 0  1  6  7  3  4  9  2  8
3 0  1  6  7  3  4  9  2  8
> fakedata[, order(pos)]
  A X1 X2 X3 X4 X6 X7 X8 X9
1 0  1  2  3  4  6  7  8  9
2 0  1  2  3  4  6  7  8  9
3 0  1  2  3  4  6  7  8  9

Sarah
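
The same idea can be written so that the first column is kept in place explicitly and only the numbered columns are sorted. This is a sketch of a variant, not the code posted above; it assumes every column after the first is named X followed by a number, and the small data frame is invented for illustration:

```r
# Invented example with gaps and two-digit ids
fakedata <- data.frame(A = 0, X10 = 10, X2 = 2, X1 = 1, X21 = 21)

# Numeric part of the names of columns 2..n
id <- as.numeric(sub("^X", "", colnames(fakedata)[-1]))

# Keep column 1 first, then the remaining columns by increasing id
sorted <- fakedata[, c(1, 1 + order(id))]
colnames(sorted)
```

Sorting the names themselves would compare them as text, which is why X10 lands before X2 in the original data frame.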




-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org



Re: [R] Normality tests on groups of rows in a data frame, grouped based on content in other columns

2011-10-31 Thread Joel Fürstenberg-Hägg
Hi Dennis,
 
Thanks for your prompt response.
 
Best,
 
Joel

 Dennis Murphy djmu...@gmail.com 30-10-2011 21:11 
Hi:

Here are a few ways (untested, so caveat emptor):

# plyr package
library('plyr')
ddply(df, .(Plant, Tissue, Gene), summarise, ntest =
shapiro.test(ExpressionLevel)$p.value)

# data.table package
library('data.table')
dt <- data.table(df, key = c('Plant', 'Tissue', 'Gene'))
dt[, list(ntest = shapiro.test(ExpressionLevel)$p.value), by = key(dt)]

# aggregate() function
aggregate(ExpressionLevel ~ Plant + Tissue + Gene, data = df, FUN =
function(x) shapiro.test(x)$p.value)

# doBy package:
library('doBy')
summaryBy(ExpressionLevel ~ Plant + Tissue + Gene, data = df, FUN =
function(x) shapiro.test(x)$p.value)

There are others, too...

HTH,
Dennis
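
A base-R sketch of the same split-and-apply idea, using only split() and sapply(). The data are simulated here because the real data frame is not available, and the Rep column and group sizes are assumptions made for the example. Note that shapiro.test() needs at least three values per group and returns a list, so the statistic and p-value are extracted explicitly:

```r
set.seed(1)
# Simulated stand-in: 5 replicates for each Plant/Tissue/Gene combination
df <- expand.grid(Rep = 1:5, Plant = c("p1", "p2"),
                  Tissue = c("t1", "t2"), Gene = c("g1", "g2", "g3"))
df$ExpressionLevel <- rlnorm(nrow(df))

# One numeric vector per group
groups <- split(df$ExpressionLevel, df[c("Plant", "Tissue", "Gene")],
                drop = TRUE)

# One row per group: group size, W statistic and p-value
res <- t(sapply(groups, function(x) {
  st <- shapiro.test(x)
  c(n = length(x), W = unname(st$statistic), p.value = st$p.value)
}))
res
```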



[R] Normality tests on groups of rows in a data frame, grouped based on content in other columns

2011-10-30 Thread Joel Fürstenberg-Hägg
Dear R users,

I have a data frame in the form below, on which I would like to make normality 
tests on the values in the ExpressionLevel column.

> head(df)
  ID Plant Tissue Gene ExpressionLevel
1  1    p1     t1   g1          366.53
2  2    p1     t1   g2            0.57
3  3    p1     t1   g3           11.81
4  4    p1     t2   g1          498.43
5  5    p1     t2   g2            2.14
6  6    p1     t2   g3            7.85

I would like to make the tests on every group according to the content of the 
Plant, Tissue and Gene columns.

My first problem is how to run a function for all these sub groups.
I first thought of making subsets:

group1 <- subset(df, Plant=="p1" & Tissue=="t1" & Gene=="g1")
group2 <- subset(df, Plant=="p1" & Tissue=="t1" & Gene=="g2")
group3 <- subset(df, Plant=="p1" & Tissue=="t1" & Gene=="g3")
group4 <- subset(df, Plant=="p1" & Tissue=="t2" & Gene=="g1")
group5 <- subset(df, Plant=="p1" & Tissue=="t2" & Gene=="g2")
group6 <- subset(df, Plant=="p1" & Tissue=="t2" & Gene=="g3") # etc...

But that would be very time consuming and I would like to be able to use the 
code for other data frames...
I have also tried to store these in a list, which I am looping through, running 
the tests, something like this:

alist=list(group1, group2, group3, group4, group5, group6)
for(i in alist)
{
  print(shapiro.test(i$ExpressionLevel))
  print(pearson.test(i$ExpressionLevel))
  print(pearson.test(i$ExpressionLevel, adjust=FALSE))
}

But there must be an easier and more elegant way of doing this... I found the 
example below at 
http://stackoverflow.com/questions/4716152/why-do-r-objects-not-print-in-a-function-or-a-for-loop.
I think it might be used for printing the results, but I do not know how 
to adapt it to my data frame, since the functions are applied to several columns 
instead of certain rows in one column.

DF <- data.frame(A = rnorm(100), B = rlnorm(100))

obj2 <- lapply(DF, shapiro.test)

tab2 <- lapply(obj2, function(x) c(W = unname(x$statistic), p.value = x$p.value))
tab2 <- data.frame(do.call(rbind, tab2))
printCoefmat(tab2, has.Pvalue = TRUE)

Finally, I have found several different functions for testing for normality, 
but which one(s) should I choose? As far as I can see in the help files they 
only differ in the minimum number of samples required.

Thanks in advance!

Kind regards,

Joel








[R] NA Replacement by lowest value?

2010-01-28 Thread Joel Fürstenberg-Hägg

Hi all,

I need to replace missing values in a matrix by 10 % of the lowest available 
value in the matrix. I've got a function I've used earlier to replace negative 
values by the lowest value, in a data frame, but I'm not sure how to modify 
it...

# Change negative values to a small value, close to zero
nonNeg = as.data.frame(apply(orig.df, 2, function(col)
{
   min.val = min(col[col > 0]) # Smallest positive value in the column
   col[col < 0] = (min.val / 10)
   col # Return the modified column
}))

I think this is how to start, but the NA replacement part doesn't work...

newMatrix = as.matrix(apply(oldMatrix, 2, function(col)
{
   min.val = min(oldMatrix, na.rm = TRUE) # Find the smallest value in the dataset
   col[col == NA] = (min.val / 10) # Doesn't work...
   col # Return the modified column
}))

Do any of you have any suggestions?

Best regards,

Joel

 
  


Re: [R] NA Replacement by lowest value?

2010-01-28 Thread Joel Fürstenberg-Hägg

Thanks a lot Paul!!

 

Best,

 

Joel
 
 Date: Thu, 28 Jan 2010 10:48:37 +0100
 From: p.hiems...@geo.uu.nl
 To: joel_furstenberg_h...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] NA Replacement by lowest value?
 
 Joel Fürstenberg-Hägg wrote:
  col[col == NA] = (min.val / 10) # Doesn't work...
  
 use is.na(col) to find the NA's.
 
 cheers,
 Paul
 
 -- 
 Drs. Paul Hiemstra
 Department of Physical Geography
 Faculty of Geosciences
 University of Utrecht
 Heidelberglaan 2
 P.O. Box 80.115
 3508 TC Utrecht
 Phone: +3130 274 3113 Mon-Tue
 Phone: +3130 253 5773 Wed-Fri
 http://intamap.geo.uu.nl/~paul
 
  


Re: [R] NA Replacement by lowest value?

2010-01-28 Thread Joel Fürstenberg-Hägg

Hi Jim,

 

That's what Paul suggested too, works great!

 

Best,

 

Joel
 
 Date: Thu, 28 Jan 2010 20:57:57 +1100
 From: j...@bitwrit.com.au
 To: joel_furstenberg_h...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] NA Replacement by lowest value?
 
 On 01/28/2010 08:35 PM, Joel Fürstenberg-Hägg wrote:
 
 
 Hi Joel,
 
 You probably want to use:
 
 col[is.na(col)] <- min.val/10
 
 Jim
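
Since the minimum is taken over the whole matrix, the replacement can also be done in one step without apply(). A minimal sketch with an invented matrix; comparing with `== NA` always yields NA, which is why is.na() is needed:

```r
# Invented example matrix with two missing values
oldMatrix <- matrix(c(5, NA, 2, 8, NA, 4), nrow = 2)

# Smallest available value in the whole matrix, ignoring NAs
min.val <- min(oldMatrix, na.rm = TRUE)

# Logical indexing with is.na() replaces every missing cell at once
newMatrix <- oldMatrix
newMatrix[is.na(newMatrix)] <- min.val / 10
newMatrix
```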
 
  


[R] Remove part of string in colname and calculate mean for columns groups

2010-01-15 Thread Joel Fürstenberg-Hägg

Hi all,

 

I have two questions. First, I wonder how to remove a part of the column names 
in a matrix? I would like to remove the _ACCX or _NAX part below. Is there 
a method where the _ as well as all characters after it can be removed?

 

> dim(exprdata)
[1]  88 512

> colnames(exprdata[,c(1:20)])
 [1] "Akita_ACC1" "Akita_ACC2" "Akita_ACC3" "Akita_ACC4" "Alc.0_ACC1" "Alc.0_ACC2" "Alc.0_ACC3"
 [8] "Alc.0_ACC4" "Alc.0_ACC5" "Bl.1_ACC1"  "Bl.1_ACC2"  "Bl.1_ACC3"  "Bl.1_ACC4"  "Bla.1_ACC1"
[15] "Bla.1_ACC2" "Bla.1_ACC3" "Bla.1_ACC4" "Blh.1_ACC1" "Blh.1_ACC2" "Blh.1_ACC3"

 

 

Secondly, I would like to calculate the mean of each column group in the 
matrix, for instance all columns beginning with Akita, and save all new 
columns as a new matrix. 

 

For instance, use:

 

> head(exprdata[,c(1:4)])
            Akita_ACC1 Akita_ACC2 Akita_ACC3 Akita_ACC4
A15-101       6.668931         NA         NA         NA
A122001-101  10.562564  11.706395  11.608989   8.289093
A128001-101  14.946749   8.112625   8.176438  10.104254
A133001-101   5.186679   6.089870   4.119589   3.168841
A133003-101         NA         NA  19.825480   2.587695
A134001-101   3.259402   4.835642   4.679607   4.490254

 

To get something like:

 

                Akita
A15-101      6.668931
A122001-101  10.54176
A128001-101  10.10425
A133001-101  3.168841
A133003-101  2.587695
A134001-101  4.490254

 

 

However, the column groups are of different sizes (3-10 columns) so I guess 
I'll need a method based on the column names.

 

Anyone who can help me?

 

Best regards,

 

Joel
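
Both steps can be sketched in a few lines: sub() with a regular expression strips the underscore and everything after it, and tapply() inside apply() averages each row within the resulting groups. The matrix below is invented, and na.rm = TRUE is an assumption about how missing values should be treated:

```r
# Invented example: two groups of unequal size, one missing value
exprdata <- matrix(c(1, 2, NA, 4, 5, 6, 7, 8, 9, 10), nrow = 2,
                   dimnames = list(c("r1", "r2"),
                                   c("Akita_ACC1", "Akita_ACC2",
                                     "Alc.0_ACC1", "Alc.0_ACC2", "Alc.0_ACC3")))

# Drop the first "_" and all characters after it
groups <- sub("_.*$", "", colnames(exprdata))

# For every row, average the values within each column group
groupMeans <- t(apply(exprdata, 1, function(r)
  tapply(r, groups, mean, na.rm = TRUE)))
groupMeans
```

A group that is entirely NA in a row gives NaN with this approach.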
  

[R] How to delete matrix rows based on NA frequency?

2010-01-15 Thread Joel Fürstenberg-Hägg

Hi all,

 

I would like to remove rows from a matrix, based on the frequency of missing 
values. If there are more than 10 % missing values, the row should be deleted.

 

I use the following to calculate the frequencies, thereby getting a new matrix 
with the frequencies:

 

freqNA=rowMeans(is.na(exprdata))

 

But is there a shorter way to remove the rows based on (1-freqNA)0.1 than 
looping through the whole matrix using a for loop?

 

 

All the best,

 

Joel
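
No loop is needed, because the vector of frequencies can be turned into a logical row index directly. A minimal sketch with an invented matrix, assuming rows with more than 10 % missing values are the ones to drop:

```r
# Invented example: 3 rows, 20 columns
exprdata <- matrix(1:60 / 10, nrow = 3,
                   dimnames = list(c("a", "b", "c"), NULL))
exprdata["b", 1] <- NA      # 1 of 20 missing  =  5 %
exprdata["c", 1:3] <- NA    # 3 of 20 missing  = 15 %

# Fraction of missing values per row
freqNA <- rowMeans(is.na(exprdata))

# Keep rows with at most 10 % missing; drop = FALSE keeps a matrix
filtered <- exprdata[freqNA <= 0.1, , drop = FALSE]
rownames(filtered)
```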
  

[R] How to calculate the row wise means for grouped columns in matrix?

2010-01-15 Thread Joel Fürstenberg-Hägg

Hi all,

 

I want to calculate the row wise mean of groups of columns in a matrix M. All 
columns belonging to the same group have the same column name. My idea is to 
create a new vector V containing these column names, but after first removing 
the duplicates. Then I would calculate the means using for instance rowMeans() 
and, by comparing the column names of M with the vector V, get the indices 
of the columns to use.

 

What do you think, is it a good idea or not? If yes, any suggestions how to do 
it? If no, is there any alternative solution that might work better?

 

 

 

All the best,

 

Joel
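
The idea described above can be sketched as follows. The matrix M is invented for illustration, and V is built with unique() as proposed:

```r
# Invented example: columns sharing a name belong to the same group
M <- matrix(c(1, 3, 5, 7, 10, 20), nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "a", "b")))

# Column names without duplicates
V <- unique(colnames(M))

# For each name in V, average the matching columns row by row
means <- sapply(V, function(g)
  rowMeans(M[, colnames(M) == g, drop = FALSE]))
means
```

drop = FALSE matters for groups with a single column, where the subset would otherwise collapse to a vector and rowMeans() would fail.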
  


[R] Exchange NAs for mean

2009-12-17 Thread Joel Fürstenberg-Hägg

Hi all,

 

I have a matrix (X) with observations as rows and parameters as columns. I'm 
trying to replace all missing values in a column with the column mean using the 
code below, but so far nothing happens with the NAs... Can anyone see where 
the problem is?

N <- nrow(X) # Calculate number of rows = 108
p <- ncol(X) # Calculate number of columns = 88

# Replace by columnwise mean
for (i in colnames(X)) # Do for all columns in the matrix
{
   for (j in rownames(X)) # Go through all rows
   {
      if(is.na(X[j,i])) # Search for missing value in the given position
      {
         X[j,i]=mean(X[1:p, i]) # Change missing value to the mean of the column
      }
   }
}

 

All the best,

 

Joel
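
Two things in the loop are likely to cause this: mean() returns NA whenever its input contains an NA unless na.rm = TRUE is given, and 1:p indexes rows using the number of columns. A loop-free sketch with an invented matrix:

```r
# Invented example: column means ignoring NAs are 2 and 5
X <- matrix(c(1, NA, 3, 4, 6, NA), nrow = 3)

# Column means computed without the missing values
colMeansNoNA <- colMeans(X, na.rm = TRUE)

# Row/column positions of the missing cells
idx <- which(is.na(X), arr.ind = TRUE)

# Fill each missing cell with the mean of its own column
X[idx] <- colMeansNoNA[idx[, "col"]]
X
```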

 


 
  

[R] Get rid of automatic title from Anova plots

2009-12-10 Thread Joel Fürstenberg-Hägg

Hi all,

 

I'm having a problem when putting the four plots (Residuals vs Fitted, Normal 
Q-Q, Scale-Location, and Residuals vs Leverage) from an Anova using 
plot(aov()) into a pdf. A string "aov(df[,i] ~ cat)" is added as the main 
title, so when I use mtext() the two titles overlap. Does anyone know 
how to get rid of that title? I've tried plot(aov(df[,i] ~ cat), main="") but 
without success.


 

pdf("Anova plots 0809.pdf", height=10, width=10)
par(mfrow=c(2,2), oma=c(2,2,4,2))


# Anova test for LT50 with different anthocyanin scores
df=fieldTrial[idx0809, c(31:32)]
cat=fieldTrial[idx0809, c(14)]
for (i in colnames(df))
{ 
   print(i)
   print(summary(aov(df[,i] ~ cat)))
   plot(aov(df[,i] ~ cat))
   mtext(text=paste("Anova test for", i, "with different anthocyanin scores"), 
         cex=1.5, side=3, outer=TRUE)
}
dev.off()

 

Best regards,

 

Joel
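
The text written above the four panels comes from the sub.caption argument of plot.lm(), the method that also handles aov fits, rather than from main. A minimal sketch, assuming that passing an empty string suppresses it; the built-in PlantGrowth data and the temporary file name stand in for the field trial data:

```r
# Stand-in model on a built-in data set
fit <- aov(weight ~ group, data = PlantGrowth)

out <- tempfile(fileext = ".pdf")
pdf(out, height = 10, width = 10)
par(mfrow = c(2, 2), oma = c(2, 2, 4, 2))

# sub.caption = "" removes the automatic model-call caption
plot(fit, sub.caption = "")
mtext("Anova test for weight by group", cex = 1.5, side = 3, outer = TRUE)
dev.off()
```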
  

[R] Multiple regression script

2009-12-07 Thread Joel Fürstenberg-Hägg

Hi all,

I'm doing multiple linear regression for a data set. However, it takes a lot of 
time, as I would like to check every possible combination of factors, evaluate 
the results based for instance on their p values, and then choose the best 
regression model.

So, I wonder if anyone might have a script for that? Or if not, do you have 
some suggestions how to create such a script?

I've been told there is a similar function in SAS, but I'm not sure how it 
works. Furthermore, I'm not sure how to deal with the evaluation of the 
results. Are there any other criteria I should consider, such as R squared?

All the best,

Joel
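
One way to sketch such a script in base R is to enumerate every subset of predictors with combn() and fit each model with lm(). The example below uses the built-in mtcars data and four arbitrarily chosen predictors as stand-ins, and it ranks models by AIC and adjusted R squared instead of p values, which is a substitution made for this illustration. The number of models doubles with every added predictor, so this only scales to a modest number of them:

```r
predictors <- c("wt", "hp", "qsec", "drat")   # stand-in predictor names

# All non-empty subsets of the predictors (2^4 - 1 = 15)
subsets <- unlist(lapply(seq_along(predictors), function(k)
  combn(predictors, k, simplify = FALSE)), recursive = FALSE)

# Fit one linear model per subset
fits <- lapply(subsets, function(v)
  lm(reformulate(v, response = "mpg"), data = mtcars))

# Collect comparison criteria in one table, best AIC first
tab <- data.frame(
  model = sapply(subsets, paste, collapse = " + "),
  adjR2 = sapply(fits, function(f) summary(f)$adj.r.squared),
  AIC   = sapply(fits, AIC)
)
tab <- tab[order(tab$AIC), ]
head(tab)
```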
  

[R] Decision trees with factors and numericals

2009-11-24 Thread Joel Fürstenberg-Hägg

Hi all,

 

Does any of you know how to make a decision tree when the data set contains 
factors and numericals?

I've got a data frame with 3 columns, where y and x1 are numerical and x2 
contains factors. Is it possible to use the rpart package, and in that case 
how? Otherwise, is there another alternative?

 

This is what I've tried so far

 

 rpart(LT50_NA ~ Raf + Antho, data=decTreeNA, method=anova) # Have tried 
 method=class as well
Error in as.character(x) : 
  cannot coerce type 'closure' to vector of type 'character'
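
A likely reading of this error is that method was given as the bare name anova, which in R is a function (a closure), instead of the character string "anova". The sketch below uses made-up data in place of decTreeNA, so the column values are assumptions; only the quoting of method is the point.

```r
library(rpart)

# Hypothetical stand-in for decTreeNA: a numeric response,
# one numeric predictor and one factor predictor
set.seed(1)
decTreeNA <- data.frame(
  Raf   = runif(60, 0, 3),
  Antho = factor(sample(c("low", "mid", "high"), 60, replace = TRUE))
)
decTreeNA$LT50_NA <- -9 - 2 * (decTreeNA$Raf > 1.5) + rnorm(60, sd = 0.2)

# method must be a character string; factors and numeric columns
# can be mixed freely on the right-hand side of the formula
fit <- rpart(LT50_NA ~ Raf + Antho, data = decTreeNA, method = "anova")
print(fit)
```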

 

Best regards,

 

Joel
  


[R] From R to LaTeX to pdf?

2009-11-24 Thread Joel Fürstenberg-Hägg

Hi all,

 

Anyone experienced in the LaTeX format?

 

I'm trying to use the xtable package to create nice anova tables, but how do I 
produce a PDF from the resulting LaTeX table? I've tried WinShell and 
MiKTeX, but I couldn't get either of them working...

 

Here's an example of the output in R:

 

% latex table generated in R 2.9.2 by xtable 1.5-6 package
% Tue Nov 24 14:17:32 2009
\begin{tabular}{lrrrrr}
  \hline
 & Df & Sum Sq & Mean Sq & F value & Pr($>$F) \\ 
  \hline
cat & 2 & 40.50 & 20.25 & 6.66 & 0.0019 \\ 
  Residuals & 107 & 325.13 & 3.04 &  &  \\ 
   \hline
\end{tabular}
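
A tabular fragment like this cannot be compiled on its own; it has to sit inside a complete LaTeX document. The following is a minimal sketch, assuming a LaTeX installation with pdflatex on the PATH. The file name is a placeholder, and the table is typed in by hand here, although captured print(xtable(...)) output could be used instead.

```r
# The tabular fragment; backslashes are doubled inside R strings
table_lines <- c(
  "\\begin{tabular}{lrrrrr}",
  "  \\hline",
  " & Df & Sum Sq & Mean Sq & F value & Pr($>$F) \\\\",
  "  \\hline",
  "cat & 2 & 40.50 & 20.25 & 6.66 & 0.0019 \\\\",
  "Residuals & 107 & 325.13 & 3.04 & & \\\\",
  "  \\hline",
  "\\end{tabular}"
)

# Wrap the fragment in a minimal document (placeholder file name)
tex_file <- file.path(tempdir(), "anova_table.tex")
writeLines(c("\\documentclass{article}",
             "\\begin{document}",
             table_lines,
             "\\end{document}"),
           tex_file)

# Compile only if pdflatex is available; the PDF lands in tempdir()
if (nzchar(Sys.which("pdflatex"))) {
  old <- setwd(tempdir())
  tryCatch(tools::texi2pdf(tex_file, clean = TRUE),
           error = function(e) message("LaTeX run failed: ", conditionMessage(e)),
           finally = setwd(old))
}
```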

 

Best regards,

 

Joel
  


[R] Error in text.rpart(fit) : fit is not a tree, just a root

2009-11-24 Thread Joel Fürstenberg-Hägg

Hi all,

 

I've tried to make a decision tree for the following data set:

 

    Level         X Response
279     C 2.4728646   -9.445
341     B 0.5986398   -9.413
343     B 1.1786271   -9.413
384     D 1.4797870   -9.413
390     C 2.0364569   -9.133
391     D 0.9365739   -9.133
452     A 1.2858741  -11.480
455     C 1.3256245   -9.413
510     C 0.5758865   -9.413
537     D 1.9289431   -9.413
540     C 1.8646144   -9.413
554     B 1.3903752  -10.080

 

Using these commands:

 

> fit=rpart(Response ~ X + Level, data=decTree, method="anova", 
+ control=rpart.control(minsplit=1))
> 
> printcp(fit) # display cp table

Regression tree:
rpart(formula = Response ~ X + Level, data = decTree, method = "anova", 
    control = rpart.control(minsplit = 1))

Variables actually used in tree construction:
character(0)

Root node error: 4.4697/12 = 0.37247

n= 12 

    CP nsplit rel error
1 0.01      0         1

I don't get a tree...

 

 plot(fit) # plot decision tree  
Error in plot.rpart(fit) : fit is not a tree, just a root
 text(fit) # label the decision tree plot  
Error in text.rpart(fit) : fit is not a tree, just a root

 

Can anyone tell me what's going wrong and give a hint how to solve it?
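
The plot() and text() errors simply report that the fitted object has no splits. The sketch below checks for that before plotting and tries an explicit minbucket and a smaller cp. Those control values are assumptions, and nothing in the thread confirms that they produce splits on this data.

```r
library(rpart)

# Data retyped from the post
decTree <- data.frame(
  Level = factor(c("C", "B", "B", "D", "C", "D", "A", "C", "C", "D", "C", "B")),
  X = c(2.4728646, 0.5986398, 1.1786271, 1.4797870, 2.0364569, 0.9365739,
        1.2858741, 1.3256245, 0.5758865, 1.9289431, 1.8646144, 1.3903752),
  Response = c(-9.445, -9.413, -9.413, -9.413, -9.133, -9.133,
               -11.480, -9.413, -9.413, -9.413, -9.413, -10.080)
)

# Assumed settings: explicit minbucket, and cp below the default of 0.01
fit <- rpart(Response ~ X + Level, data = decTree, method = "anova",
             control = rpart.control(minsplit = 2, minbucket = 1, cp = 0.001))

# A fit whose $frame has a single row is only a root; plot() fails on it
if (nrow(fit$frame) > 1) {
  pdf(file.path(tempdir(), "tree.pdf"))
  plot(fit)
  text(fit)
  dev.off()
} else {
  message("No split found; the tree is just a root.")
}
```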

 

Best regards,

 

Joel

 

 
  


[R] 3D plot, rotatable and with adjustable symbols

2009-11-19 Thread Joel Fürstenberg-Hägg

Hi all,

 

I've tried to make a 3D plot, but have run into some problems.

 

I'd like to have a plot that I can rotate interactively using the mouse, which 
is possible using plot3d() from the R.basic package. However, I would like to 
change the symbols used for the points, but there's no pch argument in plot3d().

 

If I use the scatterplot3d package, I'm able to change this, but not able to 
rotate the plot interactively.

 

Does anyone know a solution to this? Maybe another package is better?

 

Best regards,

 

Joel
  


[R] Trellis settings get lost when printing to pdf

2009-11-13 Thread Joel Fürstenberg-Hägg

Hi all,

 

I've got some problems when changing the trellis settings for lattice 
plots. The plots look exactly as I want them to when calling show.settings(), as 
well as when plotting them in the graphics window. But when printing to a PDF 
file, none of the settings are used!? Does anyone know what might have 
happened? Once the trellis settings have been changed, they should remain 
in the new state until you close R, right?

 

# Change settings for the boxplot appearance
new.dot=trellis.par.get("box.dot")
new.rectangle=trellis.par.get("box.rectangle")
new.umbrella=trellis.par.get("box.umbrella")
new.symbol=trellis.par.get("plot.symbol")
new.strip.background=trellis.par.get("strip.background")
new.strip.shingle=trellis.par.get("strip.shingle")
new.dot$pch="|"
new.dot$col="black"
new.rectangle$col="black"
new.rectangle$fill="grey65"
new.umbrella$col="black"
new.umbrella$lty=1 # Continuous line, not dotted
new.symbol$col="black"
new.strip.background$col="grey87" # Background colour in the upper label
new.strip.shingle$col="black" # Border colour around the upper label
trellis.par.set(box.dot=new.dot, box.rectangle=new.rectangle, 
box.umbrella=new.umbrella, plot.symbol=new.symbol, 
strip.background=new.strip.background, strip.shingle=new.strip.shingle)
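
A possible explanation is that lattice keeps its settings per graphics device, so values set while the screen device is active do not carry over to a pdf() device opened later. The sketch below sidesteps this by attaching the settings to the plot itself through par.settings. The data frame and the file name are placeholders.

```r
library(lattice)

# Hypothetical data standing in for the real data frame
set.seed(1)
d <- data.frame(g = gl(3, 20, labels = c("a", "b", "c")), y = rnorm(60))

# The same settings as above, collected in one list
my.theme <- list(
  box.dot       = list(pch = "|", col = "black"),
  box.rectangle = list(col = "black", fill = "grey65"),
  box.umbrella  = list(col = "black", lty = 1),
  plot.symbol   = list(col = "black")
)

out <- file.path(tempdir(), "boxplot.pdf")
pdf(out)
# par.settings travels with the plot object, whichever device is open
print(bwplot(y ~ g, data = d, par.settings = my.theme))
dev.off()
```

An alternative under the same assumption is to call trellis.par.set(my.theme) after pdf() has been opened rather than before.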

 

Best regards,

 

Joel
  


Re: [R] Loadings and scores from fastICA?

2009-11-12 Thread Joel Fürstenberg-Hägg

Ok, so then the S gives the individual components, good. Thanks Tony!

But what about the principal components in the PCA plot, how are they 
calculated?

And is the linear mixing matrix A really the same as the loadings/weights? 
There must be different loadings for the PCA and the ICA, right?

Best regards,

Joel

 Date: Wed, 11 Nov 2009 14:29:06 -0700
 From: tpl...@acm.org
 To: joel_furstenberg_h...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] Loadings and scores from fastICA?
 
 The help for fastICA says:
 
  The data matrix X is considered to be a linear combination of
  non-Gaussian (independent) components i.e. X = SA where columns of
  S contain the independent components and A is a linear mixing
  matrix.
 
 The value of fastICA is a list with components S (the estimated source 
 matrix) and A (the estimated mixing matrix).  Are these what you want?
 
 -- Tony Plate
 
 Joel Fürstenberg-Hägg wrote:
  Hi all,
  
   
  
  Does anyone know how to get the independent components and loadings from an 
  Independent Component Analysis (ICA), as well as principal components and 
  loadings from a Principal Component Analysis (PCA) using the fastICA 
  package? Or perhaps there's another way to do ICAs in R?
  
  
  Below is an example from the fastICA manual 
  (http://cran.r-project.org/web/packages/fastICA/fastICA.pdf)
  
   
  
  if(require(MASS))
  {
   x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 1), 2, 
  2))
   x1 <- mvrnorm(n = 1000, mu = c(-1, 2), Sigma = matrix(c(10, 3, 3, 1), 
  2, 2))
   X <- rbind(x, x1)
   a <- fastICA(X, 2, alg.typ = "deflation", fun = "logcosh", alpha = 1, 
  method = "R", row.norm = FALSE, maxit = 200, tol = 0.0001, verbose = TRUE)
   par(mfrow = c(1, 3))
   plot(a$X, main = "Pre-processed data")
   plot(a$X%*%a$K, main = "PCA components")
   plot(a$S, main = "ICA components")
  }
  
   
  
  Best regards,
  
   
  
  Joel



[R] prcomp() PCA vs fastICA() PCA?

2009-11-11 Thread Joel Fürstenberg-Hägg

Hi all,

 

I wonder what the difference is between the function prcomp and the PCA 
plotting method used in example 3 from the fastICA package. They give totally 
different plots. The reason for asking is that I've used prcomp earlier, but 
now I need to do an ICA, and I guess I cannot compare the PCA plot from prcomp 
with the ICA plot if the two PCA plots look different?

 

Does anyone know anything about this? Maybe there's a different approach 
that's better?

 

 

if(require(MASS))

{
   x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
   x1 <- mvrnorm(n = 1000, mu = c(-1, 2), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
   X <- rbind(x, x1)
   a <- fastICA(X, 2, alg.typ = "deflation", fun = "logcosh", alpha = 1,
   method = "R", row.norm = FALSE, maxit = 200,
   tol = 0.0001, verbose = TRUE)
   par(mfrow = c(1, 3))
   plot(a$X, main = "Pre-processed data")
   plot(a$X%*%a$K, main = "PCA components")
   plot(a$S, main = "ICA components")
}


PC=prcomp (X, center=T, scale=T)
hcl=hclust(dist(df))
plot(PC$x[,1],PC$x[,2], main="PCA components (prcomp)")
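
One plausible source of the difference, going by the fastICA documentation's description of K as a pre-whitening matrix: a$X %*% a$K gives principal components rescaled to unit variance, while prcomp(X, scale=T) standardises the columns before rotating and leaves the component variances unequal. The sketch below illustrates the two scalings with prcomp alone, on made-up data, so it is an illustration of the idea rather than a reproduction of fastICA's internals.

```r
set.seed(1)
# Correlated two-column data (a placeholder for X in the post)
X <- matrix(rnorm(2000), ncol = 2) %*% matrix(c(3, 1, 0, 0.5), 2, 2)

pc_raw    <- prcomp(X, center = TRUE, scale. = FALSE)  # centred only
pc_scaled <- prcomp(X, center = TRUE, scale. = TRUE)   # columns standardised first

# Whitened scores: every principal component rescaled to unit variance
whitened <- sweep(pc_raw$x, 2, pc_raw$sdev, "/")

apply(pc_raw$x, 2, var)   # equals pc_raw$sdev^2, unequal across components
apply(whitened, 2, var)   # 1 for every component
pc_scaled$sdev            # differs again, because the input was standardised
```

Plots of pc_raw$x, pc_scaled$x and whitened show the same kind of structure at different axis scales, which may be why the two PCA plots do not look alike.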

 

 

Best regards,

 

Joel
  


[R] Loadings and scores from fastICA?

2009-11-11 Thread Joel Fürstenberg-Hägg

Hi all,

 

Does anyone know how to get the independent components and loadings from an 
Independent Component Analysis (ICA), as well as principal components and 
loadings from a Principal Component Analysis (PCA) using the fastICA package? Or 
perhaps there's another way to do ICAs in R?


Below is an example from the fastICA manual 
(http://cran.r-project.org/web/packages/fastICA/fastICA.pdf)

 

if(require(MASS))
{
 x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(10, 3, 3, 1), 2, 2))
 x1 <- mvrnorm(n = 1000, mu = c(-1, 2), Sigma = matrix(c(10, 3, 3, 1), 2, 
2))
 X <- rbind(x, x1)
 a <- fastICA(X, 2, alg.typ = "deflation", fun = "logcosh", alpha = 1, 
method = "R", row.norm = FALSE, maxit = 200, tol = 0.0001, verbose = TRUE)
 par(mfrow = c(1, 3))
 plot(a$X, main = "Pre-processed data")
 plot(a$X%*%a$K, main = "PCA components")
 plot(a$S, main = "ICA components")
}

 

Best regards,

 

Joel
  


[R] PCA with two response variables

2009-11-04 Thread Joel Fürstenberg-Hägg

Hi all,

 

I'm new to PCA in R, so this might be a basic thing, but I cannot find 
anything on the net about it.

I need to make a PCA plot with two response variables (df$resp1 and df$resp2) 
against eight metabolites (df$met1, df$met2, ...) and I don't have a clue how 
to do it. I've only used the simplest PCAs before, like this:

 

pcaObj=prcomp(t(df[idx, c(40:47)]))

biplot(pcaObj)

 

Does anyone know how to do this?

 

Best regards,

 

Joel
  


[R] Change negative values in column

2009-11-03 Thread Joel Fürstenberg-Hägg

Hi all,

 

I'm trying to write a script that changes all negative values in a data frame 
column to a small positive value, based on the minimum value of the column.

However, I get the following error:



Error in if (x[i] < 0) { : argument is of length zero





Also, I would like the minimum to be the smallest of the non-negative values...

 

 

Aa_non_neg=(fieldTrial0809$Aa) # Copy column from data frame to manipulate

 

nonNegative = function(x)
{
   minimum=min(x) # Should only use positive minimum!
   for (i in x)
   {

  if(x[i]<0) # Found a negative value
  {
 x[i]=minimum/10 # Change to a new non-negative value
  }
   }
}

 

nonNegative(Aa_non_neg) # Apply function on column
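
The error arises because for (i in x) loops over the values of x rather than its positions, so x[i] is indexed by a data value; a value between 0 and 1 truncates to index 0 and yields an empty vector. The function also never returns the modified x. A vectorised sketch follows. It assumes "smallest non-negative value" means the smallest strictly positive value (a zero would otherwise give a replacement of 0), and the example vector is made up.

```r
nonNegative <- function(x) {
  # Smallest strictly positive value, ignoring NAs
  minimum <- min(x[x > 0], na.rm = TRUE)
  # Replace every negative entry at once; NAs are left untouched
  x[!is.na(x) & x < 0] <- minimum / 10
  x  # return the modified copy
}

Aa <- c(0.5, -1.2, 2.0, -0.3, 0.8)  # hypothetical column values
Aa_non_neg <- nonNegative(Aa)
Aa_non_neg
```

The result has to be assigned, as in the last lines, because R functions work on a copy of their argument.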
  


[R] Exclude rows in xyplot

2009-10-27 Thread Joel Fürstenberg-Hägg

Hi all,

I'm searching for a way to exclude outliers from my dataset while making 
xyplots. When plotting with pairs(), I exclude specific rows of my data frame 
and save the selection as a variable, which I later use as a row index:

# Discard outliers and save the selection as idx
idx=with(fieldTrial0809, which(Pro>0 & Pro<0.95 & Fum>0 & Fum<0.4 & Mal>0.1 & 
Mal<2.5 & Glc>2 & Glc<20 & Fru>1 & Fru<30 & Raf>0 & Raf<3 & Suc>1 & Suc<14))

# Plot the numerical and ranked columns pairwise using pairs()
pdf("ranked.pdf", height=20, width=20)
pairs(fieldTrial0809[idx, c(21,26,30,32,34,36,38,40,42,44,46,48,50)], 
main="Ranked", col="blue", pch="°", gap=0.2)
dev.off()

Now I'm trying to make xyplots to compare the results from three different 
categories:

# Plot Pro against Glc for each of the three categories
xyplot(Pro ~ Glc | Categories_BBCH_ID, data=fieldTrial0809, pch="°", 
layout=c(1, 3), aspect=1, index.cond=list(3:1))

I would like to exclude outliers as above. I've found that limits can be used 
in a similar manner to xlim and ylim, but although I've read about them I 
don't understand how to apply them here. Can anyone help me?
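
One way to reuse the same row selection, sketched on made-up data: pass the subsetted data frame to xyplot() through its data argument. The column values and category labels below are assumptions standing in for fieldTrial0809.

```r
library(lattice)

set.seed(1)
# Hypothetical stand-in for fieldTrial0809
d <- data.frame(
  Pro = runif(90, -0.1, 1.2),
  Glc = runif(90, 0, 25),
  Categories_BBCH_ID = gl(3, 30, labels = c("early", "mid", "late"))
)

# Row indices to keep, built the same way as idx for pairs()
idx <- with(d, which(Pro > 0 & Pro < 0.95 & Glc > 2 & Glc < 20))

# Only the kept rows reach the plot
p <- xyplot(Pro ~ Glc | Categories_BBCH_ID, data = d[idx, ],
            layout = c(1, 3), aspect = 1, index.cond = list(3:1))
# print(p) draws the plot on the current device
```

xyplot() also has a subset argument that takes a logical condition directly, which avoids building idx first.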

All the best,

Joel


  


[R] Print several xyplots to the same page in a pdf file

2009-10-27 Thread Joel Fürstenberg-Hägg

Hello everybody,

I'm using xyplot from the lattice package to make several graphs like the ones 
below. However, when printing to a PDF file I only get the three grouped panels 
of one plot on each page, which gives me a huge number of pages... Is it 
possible to put them all, or at least more than one, on the same page, for 
instance three groups beside each other as columns?

...
xyplot(Pro ~ Glc | Categories_BBCH_ID, data=fieldTrial0809, pch="°", 
layout=c(1, 3), aspect=1, index.cond=list(3:1))
xyplot(Pro ~ Raf | Categories_BBCH_ID, data=fieldTrial0809, pch="°", 
layout=c(1, 3), aspect=1, index.cond=list(3:1))
xyplot(Pro ~ Suc | Categories_BBCH_ID, data=fieldTrial0809, pch="°", 
layout=c(1, 3), aspect=1, index.cond=list(3:1))
xyplot(Fum ~ Aa | Categories_BBCH_ID, data=fieldTrial0809, pch="°", layout=c(1, 
3), aspect=1, index.cond=list(3:1))
xyplot(Fum ~ Pro | Categories_BBCH_ID, data=fieldTrial0809, pch="°", 
layout=c(1, 3), aspect=1, index.cond=list(3:1))
etc...
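
A sketch of one approach, assuming the plots are stored as objects first: the print method for lattice plots takes split and more arguments that place several plots on one page. The data frame and output file name are placeholders.

```r
library(lattice)

set.seed(1)
# Hypothetical stand-in for fieldTrial0809
d <- data.frame(Pro = runif(90), Glc = runif(90), Raf = runif(90), Suc = runif(90),
                Categories_BBCH_ID = gl(3, 30))

p1 <- xyplot(Pro ~ Glc | Categories_BBCH_ID, data = d,
             layout = c(1, 3), aspect = 1, index.cond = list(3:1))
p2 <- xyplot(Pro ~ Raf | Categories_BBCH_ID, data = d,
             layout = c(1, 3), aspect = 1, index.cond = list(3:1))
p3 <- xyplot(Pro ~ Suc | Categories_BBCH_ID, data = d,
             layout = c(1, 3), aspect = 1, index.cond = list(3:1))

out <- file.path(tempdir(), "three_per_page.pdf")
pdf(out, width = 12, height = 8)
# split = c(column, row, number of columns, number of rows)
print(p1, split = c(1, 1, 3, 1), more = TRUE)
print(p2, split = c(2, 1, 3, 1), more = TRUE)
print(p3, split = c(3, 1, 3, 1), more = FALSE)  # more = FALSE ends the page
dev.off()
```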

All the best,

Joel
  


[R] Change positions of columns in data frame

2009-10-23 Thread Joel Fürstenberg-Hägg

Hi all,

Probably a simple question, but I just can't find a simple answer in the older 
threads or anywhere else.

I've added some new vectors as columns in a data frame using cbind(). As 
they're all put as the last columns in the data frame, I would like to move 
them to specific positions. How do you change the position of a column in 
a data frame?

I know I can use 
fieldTrial0809=data.frame(Sample_ID=as.factor(fieldTrial0809$Sample_ID), 
Plant_ID=as.factor(fieldTrial0809$Plant_ID), ...) to create a new data frame 
with the given columns in the specified order, but there must be an easier 
way..?
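
Indexing the data frame with a vector of column names or positions returns the columns in that order, so a reordered copy can be assigned back. A minimal sketch with a made-up data frame (the Rank column and all values are assumptions):

```r
df <- data.frame(Sample_ID = 1:3,
                 Plant_ID  = c("a", "b", "c"),
                 Weight    = c(76.1, 77.0, 78.1))
df <- cbind(df, Rank = c(2, 3, 1))  # the new column lands last

# Reorder by name: Rank moves to second position
by_name <- df[, c("Sample_ID", "Rank", "Plant_ID", "Weight")]

# The same reordering by position
by_pos <- df[, c(1, 4, 2, 3)]

names(by_name)
```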

All the best,

Joel
  


[R] Automatization of non-linear regression

2009-10-22 Thread Joel Fürstenberg-Hägg

Hi everybody,

 

I'm using the method described here to make a non-linear regression:

 

http://www.apsnet.org/education/advancedplantpath/topics/Rmodules/Doc1/05_Nonlinear_regression.html

 

> ## Input the data that include the variables time, plant ID, and severity
> time <- c(seq(0,10),seq(0,10),seq(0,10))
> plant <- c(rep(1,11),rep(2,11),rep(3,11))
> 
> ## Severity represents the number of
> ## lesions on the leaf surface, standardized
> ## as a proportion of the maximum
> severity <- c(
+ 42,51,59,64,76,93,106,125,149,171,199,
+ 40,49,58,72,84,103,122,138,162,187,209,
+ 41,49,57,71,89,112,146,174,218,250,288)/288
> data1 <- data.frame(
+ cbind(
+ time,
+ plant,
+ severity
+ )
+ )
> 
> ## Plot severity versus time
> ## to see the relationship between
> ## the two variables for each plant
> plot(
+ data1$time,
+ data1$severity,
+ xlab="Time",
+ ylab="Severity",
+ type="n"
+ )
> text(
+ data1$time,
+ data1$severity,
+ data1$plant
+ )
> title(main="Graph of severity vs time")
> 
> getInitial(
+ severity ~ SSlogis(time, alpha, xmid, scale),
+ data = data1
+ )
    alpha      xmid     scale 
 2.212468 12.506960  4.572391 
> 
> 
> ## Using the initial parameters above,
> ## fit the data with a logistic curve.
> para0.st <- c(
+ alpha=2.212,
+ beta=12.507/4.572, # beta in our model is xmid/scale
+ gamma=1/4.572 # gamma (or r) is 1/scale
+ )
> 
> fit0 <- nls(
+ severity~alpha/(1+exp(beta-gamma*time)),
+ data1,
+ start=para0.st,
+ trace=T
+ )
0.1621433 :  2.212 2.7355643 0.2187227 
0.1621427 :  2.2124095 2.7352979 0.2187056 
> 
> ## Plot to see how the model fits the data; plot the
> ## logistic curve on a scatter plot
> plot(
+ data1$time,
+ data1$severity,
+ type="n"
+ )
> 
> text(
+ data1$time,
+ data1$severity,
+ data1$plant
+ )
> 
> title(main="Graph of severity vs time")
> 
> curve(
+ 2.21/(1+exp(2.74-0.22*x)),
+ from=time[1],
+ to=time[11],
+ add=TRUE
+ )


As you can see, I have to do some work manually, such as typing in the numbers 
used to calculate alpha, beta and gamma. I wonder if you might have an 
idea how to automate this? I suppose it should be possible to save the output 
from getInitial() and reach the elements via an index or something, but how?

 

I guess a similar approach could be used for the values of fit0?

 

Or even better, the variables alpha, beta and gamma could be used right away, 
for instance in curve(), instead of adding the values manually. But just 
replacing the values with the variables (alpha instead of 2.21 etc.) doesn't 
seem to work. What is the reason for that? Any solution?
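
A sketch of how the manual steps might be chained, using the data from the session above. getInitial() returns a named numeric vector, so its elements can be picked out by name, and coef() does the same for the fitted model. A likely reason that a bare alpha fails in curve() is that alpha, beta and gamma only exist as names inside para0.st and the fit, not as variables of their own (beta and gamma are also base R functions). The helper name logistic is an assumption.

```r
time     <- rep(0:10, times = 3)
plant    <- rep(1:3, each = 11)
severity <- c(42, 51, 59, 64, 76, 93, 106, 125, 149, 171, 199,
              40, 49, 58, 72, 84, 103, 122, 138, 162, 187, 209,
              41, 49, 57, 71, 89, 112, 146, 174, 218, 250, 288) / 288
data1 <- data.frame(time, plant, severity)

# Named vector with elements alpha, xmid and scale
init <- getInitial(severity ~ SSlogis(time, alpha, xmid, scale), data = data1)

# Build the start values from the stored result instead of typing them in
para0.st <- c(alpha = init[["alpha"]],
              beta  = init[["xmid"]] / init[["scale"]],
              gamma = 1 / init[["scale"]])

fit0 <- nls(severity ~ alpha / (1 + exp(beta - gamma * time)),
            data = data1, start = para0.st)

# Fitted parameters by name, wrapped in a function that curve() can draw
est <- coef(fit0)
logistic <- function(x) est[["alpha"]] / (1 + exp(est[["beta"]] - est[["gamma"]] * x))
logistic(0:10)
# On an existing plot: curve(logistic(x), from = 0, to = 10, add = TRUE)
```

Wrapping these steps in a function that takes the data frame as its argument would allow the same procedure to be repeated over many data sets with lapply() or a for loop.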

 

A last, general but somewhat related question: if I set a variable inside a 
function, such as para0.st <- c(alpha=2.212, ...), is it just stored locally, or 
can it be used globally? I mean, can I use the variable anywhere (for instance in 
curve()) or just in the function where it was created? I'm asking because I'm 
used to Java, where the lifetime of local variables only extends to the 
closing brace, while global variables can be reached everywhere.
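
In R, a variable assigned with <- inside a function is local to that call; code typed at the top level of a script or session creates global variables. A small sketch with made-up names:

```r
make_start <- function() {
  local_start <- c(alpha = 2.212)  # exists only while make_start() runs
  local_start                      # the value is handed back to the caller
}

result <- make_start()   # keep the returned value to use it elsewhere
result[["alpha"]]

exists("local_start")    # FALSE here: the local variable is gone after the call
```

Braces alone, as in a for loop or an if block, do not create a new scope in R; only functions do.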

 

The reason for automating this is that I'll have to repeat the procedure more 
than a hundred times, while making overview pairwise plots of my data, with 
both this logarithmic regression and several others (exponential, monomolecular, 
logistic, Gompertz and Weibull).

 

 

Wish you all the best,

 

Joel
  


[R] Division of data frame and deletion of values from column

2009-10-16 Thread Joel Fürstenberg-Hägg

Hi all,

I guess this might be an easy question, but I've searched multiple help pages 
without finding any answer... so now I put my trust in you!

I have a data frame (36 variables and 556 observations). One column is a factor 
with three levels, and I would like to divide the data frame into three new ones 
based on those levels, so that each new data frame has a single value in that 
column. The reason is that I will later create plots and do statistical analyses 
on these data frames, and I don't want that factor affecting the results.
   ID Weight Age_days ...
1  18   76.1      106
2  19   77.0      175
3  20   78.1      121
4  21   78.2      121
5  22   78.8      106
6  23   76.3      106
.
.
.

I also have another column, a factor with several levels, of which I would like 
to exclude one (get NA instead).

   ID Weight Age_days Value_ID ...
1  18   76.1      106     high
2  19   77.0      175      low
3  20   78.1      121   middle
4  21   78.2      121     high
5  22   78.8      106     high
6  23   76.3      106   number   <-- exclude
7  24   76.9      175      low
.
.
.

I really hope someone could help me, though you might think it's too easy...
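
Both steps can be sketched in base R on the rows shown above. For brevity the same column, Value_ID, is used for the NA replacement and for the split, which is an assumption; in the real data these are two different columns.

```r
df <- data.frame(
  ID       = 18:24,
  Weight   = c(76.1, 77.0, 78.1, 78.2, 78.8, 76.3, 76.9),
  Age_days = c(106, 175, 121, 121, 106, 106, 175),
  Value_ID = factor(c("high", "low", "middle", "high", "high", "number", "low"))
)

# Set one level to NA, then drop the now unused level
df$Value_ID[df$Value_ID == "number"] <- NA
df$Value_ID <- droplevels(df$Value_ID)

# One data frame per remaining level, returned as a named list;
# rows with NA in the splitting column are left out
parts <- split(df, df$Value_ID)
names(parts)
parts$high
```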

Best regards,

Joel
  


[R] Plot overview xy plots from data frame?

2009-10-14 Thread Joel Fürstenberg-Hägg

Hi,

I've got a data frame (556 rows and 36 columns) from which I need to create 
several xy plots and print them to a PDF, in order to detect outliers and trends 
in the data. 16 of the columns contain numerical values, and I would like to 
create graphs for all combinations. It can be done manually, but creating 256 
plots by hand takes time... I guess I have to iterate through the data frame, 
but I'm not used to doing that in R. Below I've written down my thoughts, trying 
to combine my knowledge of Java and R, just to give you the idea: 

pdf("FieldTrial0809Overview.pdf")

int colWidth = fieldTrial[0].length;


for(i=0, i<colWidth, i++)
{
for(j=0, j<colWidth, j++)
{
String colI=get.fieldTrial$i;
String colJ=get.fieldTrial$j;

plot(fieldTrial$i~fieldTrial$j, main=colI + " vs " + colJ, xlab=colI, 
ylab=colJ)
}
}

dev.off()

Does anyone know how to solve this?
Do I have to copy the 16 numerical columns to a new dataset? They are 
not grouped, and there are 20 additional non-numerical columns in the data frame.
By the way, can the iterations be done in R, or do you have to combine it with, 
for instance, Perl?
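
The iteration can be done in R itself. The sketch below, on a small made-up data frame, picks out the numeric columns by name, draws a pairs() overview, and then loops over all ordered pairs of different columns. The column names and the output file name are assumptions.

```r
set.seed(1)
# Hypothetical stand-in for fieldTrial, with one non-numeric column
fieldTrial <- data.frame(Pro = runif(20), Glc = runif(20), Raf = runif(20),
                         Site = gl(2, 10, labels = c("north", "south")))

# Names of the numeric columns; the others are skipped, no copy is needed
num_cols <- names(fieldTrial)[sapply(fieldTrial, is.numeric)]

out <- file.path(tempdir(), "FieldTrialOverview.pdf")
pdf(out)
pairs(fieldTrial[num_cols])  # all combinations on a single page
for (i in num_cols) {
  for (j in num_cols) {
    if (i != j) {            # skip a column plotted against itself
      plot(fieldTrial[[j]], fieldTrial[[i]],
           main = paste(i, "vs", j), xlab = j, ylab = i)
    }
  }
}
dev.off()
```

Each plot() call starts a new page in the PDF, so the loop produces one page per pair after the pairs() overview.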

All the best,

Joel
  