[R] Rounding error in seq(...)

2009-09-30 Thread Michael Knudsen
Hi,

Today I was flabbergasted to see something that looks like a rounding
error in the very basic seq function in R.

> a = seq(0.1,0.9,by=0.1)
> a
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
> a[1] == 0.1
[1] TRUE
> a[2] == 0.2
[1] TRUE
> a[3] == 0.3
[1] FALSE

It turns out that the alternative

 a = (1:9)/10

works just fine. Are there any good guides out there on how to deal
with issues like this? I am normally aware of rounding errors, but it
really surprised me to see that an elementary function like seq would
behave in this way.

Thanks,
Michael Knudsen

-- 
Michael Knudsen
micknud...@gmail.com
http://sites.google.com/site/micknudsen/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rounding error in seq(...)

2009-09-30 Thread Michael Knudsen
On Wed, Sep 30, 2009 at 8:44 PM, Duncan Murdoch murd...@stats.uwo.ca wrote:

 Why?  You asked for an increment of 1 in the second case (which is exactly
 represented in R), then divided by 10, so you'll get the same as 0.3 gives
 you.  In the seq() case you asked for an increment of a number close to but
 not equal to 1/10 (because 1/10 is not exactly representable in R), so you
 got something different.

Well, the problem is that I don't know how seq is implemented. I just
assumed that it wouldn't behave like this.
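
A minimal sketch of why the two constructions differ. It assumes, following Duncan's explanation rather than a reading of the R sources, that seq() with a fractional by builds each element as from + k*by; the name manual is introduced here only for illustration.

```r
# Assumption: seq(from, to, by) computes from + k * by for k = 0, 1, 2, ...
a <- seq(0.1, 0.9, by = 0.1)
b <- (1:9) / 10                 # exact integers, then one division per element

manual <- 0.1 + (0:8) * 0.1     # the assumed construction, written out by hand

# Print 17 significant digits to expose the difference in the third element.
print(sprintf("%.17g", a[3]))
print(sprintf("%.17g", b[3]))

print(a[3] == 0.3)              # FALSE, as reported in the original post
print(b[3] == 0.3)
print(identical(a, manual))     # expected to be TRUE only if the assumption holds
```

The point is that 0.1 itself is not exactly representable, so any multiple of it carries the representation error along, while 3/10 is rounded only once.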

-- 
Michael Knudsen
micknud...@gmail.com
http://sites.google.com/site/micknudsen/



Re: [R] Rounding error in seq(...)

2009-09-30 Thread Michael Knudsen
On Wed, Sep 30, 2009 at 8:40 PM, Michael Knudsen micknud...@gmail.com wrote:

 a = seq(0.1,0.9,by=0.1)
 a
 [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
 a[1] == 0.1
 [1] TRUE
 a[2] == 0.2
 [1] TRUE
 a[3] == 0.3
 [1] FALSE

A friend of mine just pointed out a possible solution:

> a = seq(0.1,0.9,by=0.1)
> a[3]==0.3
[1] FALSE
> all.equal(a[3],0.3)
[1] TRUE

The all.equal function checks whether two objects are equal up to a
small numerical tolerance.
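
A short sketch of tolerant comparison in practice. all.equal() returns either TRUE or a character description of the difference, so it is usually wrapped in isTRUE(); the helper near() and its tolerance are hypothetical, added only to show an element-wise variant.

```r
a <- seq(0.1, 0.9, by = 0.1)

# all.equal() returns TRUE or a description of the difference,
# so wrap it in isTRUE() before using it as a condition.
print(isTRUE(all.equal(a[3], 0.3)))
print(isTRUE(all.equal(1, 1.1)))

# Hypothetical element-wise helper with an explicit tolerance.
near <- function(x, y, tol = 1e-8) abs(x - y) < tol
print(which(near(a, 0.3)))
```

The element-wise form is handy inside which() or subsetting, where all.equal() (which compares whole objects) does not fit.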

-- 
Michael Knudsen
micknud...@gmail.com
http://sites.google.com/site/micknudsen/



Re: [R] Use of R in Schools

2009-09-19 Thread Michael Knudsen
On Sat, Sep 19, 2009 at 6:28 AM, John Maindonald
john.maindon...@anu.edu.au wrote:

 I am looking for information on experimentation with the use
 of R in the teaching of statistics and science in schools.  Any
 leads would be very welcome.  I am certain that there is such
 experimentation.

I read this paper

http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000482

some days ago. It's quite interesting, and it links to some excellent
slides that look great as templates for making your own R course.

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] R on Multi Core

2009-09-13 Thread Michael Knudsen
On Fri, Sep 11, 2009 at 10:05 PM, Noah Silverman
n...@smartmediacorp.com wrote:

 Is there a version of R that would take advantage of BOTH cores??

Well, if your job is parallelizable, it's actually fairly easy. When I
discovered the package 'foreach', I wrote the following piece
completely overwhelmed by enthusiasm:

http://lifeofknudsen.blogspot.com/2009/07/most-of-work-i-do-in-r-has-to-do-with.html

It's very basic, but maybe you'll find it useful.

Best,
Michael Knudsen

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] how to determine if a variable is already set?

2009-09-13 Thread Michael Knudsen
On Fri, Sep 11, 2009 at 7:15 PM, carol white wht_...@yahoo.com wrote:

 It might be a primitive question but how it is possible to determine if a 
 variable is initialized in an environment?

What about this?

> "x" %in% ls()
[1] FALSE
> x = 41
> "x" %in% ls()
[1] TRUE
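
An alternative sketch using exists(), which takes the variable name as a string and can be pointed at a specific environment. The environment env is created here only for the demonstration.

```r
env <- new.env()

# inherits = FALSE restricts the lookup to env itself,
# ignoring enclosing environments.
print(exists("x", envir = env, inherits = FALSE))

assign("x", 41, envir = env)
print(exists("x", envir = env, inherits = FALSE))

# The ls() approach works on a given environment too.
print("x" %in% ls(env))
```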

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] A matrix calculation

2009-08-23 Thread Michael Knudsen
On Sun, Aug 23, 2009 at 8:53 PM, Bogasobogaso.christo...@gmail.com wrote:

 No no, I actually want following result :

 7,   14,   21, 6,   13,   20, 5,   12,   19,

How about this?

x = c()
for (i in 7:1) x = c(x,mat[i,])

Guess that would do the trick.
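
The same result can be had without a loop. This sketch assumes mat is a 7 x 3 matrix filled column by column with 1..21, which is a guess that matches the requested output.

```r
# Assumed layout: 7 rows, 3 columns, filled column by column with 1..21.
mat <- matrix(1:21, nrow = 7)

# The loop from the post.
x_loop <- c()
for (i in 7:1) x_loop <- c(x_loop, mat[i, ])

# Vectorised: reverse the rows, transpose, and flatten column by column.
x_vec <- as.vector(t(mat[7:1, ]))

print(x_vec[1:9])
print(identical(x_loop, x_vec))
```

Growing a vector with c() inside a loop copies it on every iteration, so the vectorised form is preferable for large matrices.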

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Package read large file

2009-08-19 Thread Michael Knudsen
On Wed, Aug 19, 2009 at 10:51 AM, Mohamed
Lajnefmohamed.laj...@inserm.fr wrote:

 I am looking for packages that could read large files in R?
 any suggestions are welcome.

As already pointed out by Jim, your question is not very specific. My
wild guess is that you probably have some memory issues -- if that is
the case, maybe the package bigmemory can alleviate your pain.

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Plot(x,y)

2009-08-18 Thread Michael Knudsen
On Sun, Aug 16, 2009 at 9:19 PM, malcolm
Crouchmalcolm.crouc...@gmail.com wrote:

 plot(V6,V5, col=red)
 or
 plot(V6,V5)

It seems that V5 and V6 are column names in your data matrix. If your
matrix is called data, you should use

plot(x$V6,x$V5,col="red")

instead.

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] graph label greek symbol failure

2009-08-18 Thread Michael Knudsen
On Mon, Aug 17, 2009 at 12:51 PM, e-letter inp...@gmail.com wrote:

 I have tried to add the delta (δ) symbol to the y axis label and the
 result is D, using the command:

 ...ylab=δt...

Try ylab = expression(delta*t) instead.

Best,
Michael

--
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Plot(x,y)

2009-08-18 Thread Michael Knudsen
On Tue, Aug 18, 2009 at 6:59 PM, David Winsemiusdwinsem...@comcast.net wrote:

 ITYM:

 plot(data$V6, data$V5, col=red)

Yup! My mistake.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] randomForest question--problem with ntree

2009-08-14 Thread Michael Knudsen
On Thu, Aug 13, 2009 at 11:11 PM, Mary Puttmp...@mail.med.upenn.edu wrote:

Hi Mary,

 I would like to use a random Forest model to get an idea about which 
 variables from a dataset may have some prognostic significance in a smallish 
 study. The default for the number of trees seems to be 500. I tried changing 
 the default to ntree=2000 or ntree=200 and the results appear identical. Have 
 changed mtry from mtry=5 to mtry=6 successfully. Have seen same problem on 
 both a Windows machine and our linux system running 2.8 and 2.9.

I don't think it's correct to call it a problem; it's more likely a
feature! Take a look at Breiman's paper (in the Machine Learning
journal), where he introduces random forests. I read it recently, and
somewhere he explicitly mentions that ntree often may be set very low
without lowering the performance.

The random forest algorithm is very robust and apparently 500 trees
are usually more than enough. Therefore you don't get better results
by using 2000 trees, and often it doesn't affect the performance if
you use fewer trees (e.g. 200).

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] randomForest question--problem with ntree

2009-08-14 Thread Michael Knudsen
On Fri, Aug 14, 2009 at 1:43 PM, Mary Puttmp...@mail.med.upenn.edu wrote:

 I'm not calling it a problem that the answer converges--i.e. that the 
 algorithm is stable. but if you look at the example even though I've asked 
 for 2000 or 200 tress, ntree=2000 or ntree=200, it still gives me 500 trees 
 according to the output and identical results when you set the seed before 
 the call. While results are expected to be similar they should not be 
 identical if the number of trees was actuallly changed.

Oops! You have written n.tree instead of ntree.
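
A sketch of how a misspelled argument can go unnoticed. grow_forest() is a hypothetical stand-in, not the randomForest API; it only assumes that the real function accepts extra arguments through "..." and ignores the ones it does not recognise.

```r
# Hypothetical stand-in for a fitting function with a "..." argument.
grow_forest <- function(x, ntree = 500, ...) {
  extra <- list(...)
  if (length(extra) > 0) {
    warning("ignored arguments: ", paste(names(extra), collapse = ", "))
  }
  ntree   # the number of trees that would actually be grown
}

print(grow_forest(1:10, ntree = 2000))
# "n.tree" is not a prefix of "ntree", so it falls into "..." and
# the default of 500 is used.
print(suppressWarnings(grow_forest(1:10, n.tree = 2000)))
```

With the same seed and the same effective ntree, identical results are exactly what one would expect.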

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



[R] Output to screen and file at the same time

2009-08-13 Thread Michael Knudsen
Hello!

Using the sink function, output from R may be written to a file
instead of the screen. I would really like to write my output to a
file while running an R script and at the same time view the output
live on my screen. Is there a way to do that?
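
One possible approach, sketched here under the assumption that only ordinary output (not messages or warnings) needs to be captured, is the split argument of sink(). The log path is a hypothetical temporary file.

```r
log_file <- tempfile(fileext = ".txt")   # hypothetical log path

sink(log_file, split = TRUE)   # split = TRUE: write to the file and the console
print(summary(c(1, 2, 3, 10)))
cat("finished\n")
sink()                         # stop diverting output

# The file now holds the same text that appeared on screen.
print(readLines(log_file))
```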

Thanks,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Matrix addition function

2009-08-13 Thread Michael Knudsen
On Thu, Aug 13, 2009 at 11:35 AM, Lina Rusyteliner...@yahoo.co.uk wrote:

Hi Lina,

 What function can I use for matrices addition? I couldn’t find any 
 information about it in the manual or in the internet.
 (A+B suits, when the number of matrixes is small, function sum() doesn’t suit 
 for matrices addition, because it sums all variables in the matrices and 
 produces as an answer single number, not a matrix).

I don't know of any function doing that, but you could easily write
one yourself. Suppose that X is a list of matrices. Then you could
e.g. do as follows:

matrixSum = function(X)
{
   N = length(X)
   if (N==1) return(X[[1]])
   else return(matrixSum(X[1:(N-1)]) + X[[N]])
}

I guess that one should do the trick.
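
A non-recursive alternative, sketched with made-up matrices: Reduce() folds a binary function over a list, so applying it with `+` sums the matrices element-wise.

```r
# Three made-up 2 x 2 matrices of equal dimensions.
X <- list(matrix(1, 2, 2), matrix(2, 2, 2), matrix(3, 2, 2))

# Reduce() computes (X[[1]] + X[[2]]) + X[[3]].
total <- Reduce(`+`, X)
print(total)
```

This also avoids deep recursion when the list is long.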

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] downsampling

2009-07-24 Thread Michael Knudsen
On Fri, Jul 24, 2009 at 9:32 AM, Jan Wienerjan.wie...@tuebingen.mpg.de wrote:

 x=sample(1:5, 115, replace=TRUE)

 How do I downsample this vector to 100 entries? Are there any R functions or 
 packages that provide such functionality.

What exactly do you mean by downsampling? Do you just want to sample
100 random entries from x?

sample(sample(1:5,115,replace=TRUE),100,replace=FALSE)

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Re placing null values (#NULL!)

2009-07-23 Thread Michael Knudsen
On Wed, Jul 22, 2009 at 10:18 PM, Josh Rollj_r...@hotmail.com wrote:

 In `[-.factor`(`*tmp*`, Props_$pct_vacant == #NULL!, value = 0) :
   invalid factor level, NAs generated

 Thats what made me wonder if the #NULL! value was being treated
 differently than a typical NULL value.  Thoughts?

I would like to test it myself, but I haven't succeeded in copying
your data to a text file readable by R. If you can email it to me,
I'll give it a try.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Multi-line comments?

2009-07-22 Thread Michael Knudsen
On Wed, Jul 22, 2009 at 4:30 PM, Mark Knechtmarkkne...@gmail.com wrote:

   I wanted to comment out 20 lines that I'm moving to a function but
 didn't want to delete them. Is there no defined way to get around
 using a # on each of the 20 lines?

Just like you, I have been longing for that myself. It seems that the
answer is negative, so I have ended up using

if (1==0)
{
   # code goes here
}

although it is not really nice to look at.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Multi-line comments?

2009-07-22 Thread Michael Knudsen
On Wed, Jul 22, 2009 at 5:27 PM, Erik Iversoneiver...@nmdp.org wrote:

 What editor are you all using to write R code?  Many will have ways of doing 
 what you want, e.g., comment-region (bound  by default to M-; through 
 comment-dwim) in Emacs.

Cool! I'm using Xcode, and I have just realized that cmd+/ will make a
block comment. By default it adds '//' instead of '#', but I guess
that it can be fixed somehow.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



[R] Find multiple elements in a vector

2009-07-22 Thread Michael Knudsen
Hi,

Given a vector, say

x=sample(0:9,10)
x
[1] 0 6 3 5 1 9 7 4 8 2

I can find the location of an element by

which(x==2)
[1] 10

but what if I want to find the location of more than one number? I could do

c(which(x==2),which(x==3))

but isn't there something more streamlined? My first guess was

y=c(2,3)
which(x==y)
integer(0)

which doesn't work. I haven't found any clue in the R manual.

Thanks!

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Find multiple elements in a vector

2009-07-22 Thread Michael Knudsen
On Wed, Jul 22, 2009 at 9:37 PM, Chuck Clelandcclel...@optonline.net wrote:

  How about this?

 which(x %in% c(2,3))

Thanks to you all! I had never thought about using %in% in this context.
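
A small sketch contrasting the approaches on the vector from the original question. The use of match() is an addition here, shown for the case where the position of each searched value is wanted in the order of the search vector.

```r
x <- c(0, 6, 3, 5, 1, 9, 7, 4, 8, 2)   # the vector from the original post
y <- c(2, 3)

# == recycles y along x (2, 3, 2, 3, ...), so this is a pairwise comparison.
print(which(x == y))

# %in% asks, for each element of x, whether it occurs anywhere in y.
print(which(x %in% y))

# match() gives the position of each element of y, in the order of y.
print(match(y, x))
```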

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] heatmap plot

2009-07-21 Thread Michael Knudsen
2009/7/21 Markus Mühlbacher muehli...@yahoo.com:

 So just that I understand right. x and y are the scalings of the x and y axis 
 and the matrix represents the color of the points at each gridpoint?

Precisely! Try ?image for more details.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Merging lot of zoo objects

2009-07-21 Thread Michael Knudsen
On Tue, Jul 21, 2009 at 9:07 AM, RON70ron_michae...@yahoo.com wrote:

 I have 100 price data series like price1, price2, price3, . All
 are zoo objects. Now I want to merge all them together. Obviously I can do
 this using merge(price1, price2, price3, ). However as I have lot
 of price series (almost 1000) above systax is very tiresome. Is there any
 other way on doing to in one-go?

How did you get the names price1, price2, ..., price100 in the first
place? Did you write 100 lines of code? If you had stored the objects
in a list, such that

priceN = list_of_prices[[N]]

you could easily define a recursive function to do the job for you.
Would it be difficult for you to read the data into a list?

When dealing with only a few sets, numbering objects as you do is no
problem, but for many objects it can become very cumbersome.
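
A sketch of the list-based approach. To stay self-contained it uses small data frames keyed by date as stand-ins for the zoo series; with real zoo objects the analogous call would presumably be do.call(merge, list_of_prices), which is an assumption not tested here.

```r
# Made-up stand-ins for the price series.
dates  <- as.Date("2009-07-01") + 0:2
price1 <- data.frame(date = dates, price1 = c(10, 11, 12))
price2 <- data.frame(date = dates, price2 = c(20, 21, 22))
price3 <- data.frame(date = dates, price3 = c(30, 31, 32))

# Collect the numbered objects into a list by name.
list_of_prices <- mget(paste0("price", 1:3), envir = environment())

# Fold a two-argument merge over the whole list.
merged <- Reduce(function(a, b) merge(a, b, by = "date"), list_of_prices)
print(merged)
```

mget() rescues objects that already exist under numbered names; reading the data straight into a list avoids that step altogether.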

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] writte file doubt

2009-07-21 Thread Michael Knudsen
On Tue, Jul 21, 2009 at 2:28 PM, Jose Narillos de
Santosnarillosdesan...@gmail.com wrote:

 Hi I wrotte this function but when I get the tmp.xls file  it shows data
 in a rare way. I mean not appears a matrix with 2000 rows and 100 columns.
 Can anyone help me, guide me?

Short answer: ?write

 write(tmp,file="tmp.xls")

You have to add an ncolumns option like
write(tmp,file="tmp.xls",ncolumns=100). The default is five columns.
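
One caveat worth sketching: write() emits a matrix in column-major order, so the matrix generally has to be transposed first for each line of the file to hold one matrix row. The small matrix and the temporary file below are stand-ins for the poster's 2000 x 100 data and tmp.xls.

```r
tmp <- matrix(1:12, nrow = 3)        # small stand-in for the real matrix
out <- tempfile(fileext = ".txt")    # hypothetical output path

# Transpose so that each line of the file holds one row of tmp.
write(t(tmp), file = out, ncolumns = ncol(tmp))
back <- as.matrix(read.table(out))
print(dim(back))

# write.table() keeps the matrix layout without any transposing.
write.table(tmp, file = out, row.names = FALSE, col.names = FALSE)
back2 <- as.matrix(read.table(out))
print(dim(back2))
```

Note that either call produces a plain text file; the .xls extension alone does not make it an Excel workbook.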

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] heatmap plot

2009-07-21 Thread Michael Knudsen
2009/7/21 Markus Mühlbacher muehli...@yahoo.com:

 I tried to add white to the colors, but this did not change my problem. Still 
 the values of the diagonal seem to be different from those occurring in the 
 matrix. Or in other words all squares of the diagonal should have to SAME 
 color!

If you can send me the matrix as a text file -- ready to import in R
-- I can give it a try.

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Building a big.matrix using foreach

2009-07-20 Thread Michael Knudsen
On Sun, Jul 19, 2009 at 2:29 PM, Jay Emersonjayemer...@gmail.com wrote:

Hi Jay!

 foreach(i=1:nrow(x),.combine=c) %dopar% f(x[i,])

That was also my first guess, but it doesn't seem to work. Here is a
trivial example using a regular matrix instead of a big.matrix. The
outcome is the same.

m = matrix(0,nrow=5,ncol=5)
foreach (i=1:5) %dopar% { m[i,] = rnorm(5) }

Since I didn't include the .combine option, a list containing five
independent rnorm(5) is returned. However, the matrix m is not
changed. If I replace %dopar% with %do%, everything works fine (but
not in parallel, of course).

Another thing is: The reason why I want to use big.matrix is, of
course, that my data set is too big to store in a regular matrix.
However, it seems that no matter how you run foreach, it will always
return something (a list, a vector, or...), and that will end up
having the same dimension as the big.matrix. If the returned object
can't be a big.matrix, I'm bound to run out of memory anyway.

 should work, essentially applying the functin f() to the rows of x?  But
 perhaps I misunderstand you.  Please feel free to email me or Mike
 (michael.k...@yale.edu) directoy with questions about bigmemory, we are very
 interested in applications of it to real problems.

My actual problem is the following: I have a big data set of
observations, and I have a distance measure on this set. I would like
to calculate all pairwise distances and store them in a big.matrix. My
hope was to be able to build the matrix row by row in a parallel way.

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Building a big.matrix using foreach

2009-07-20 Thread Michael Knudsen
On Sun, Jul 19, 2009 at 4:02 PM, Michael Kanekaneplusp...@gmail.com wrote:

Hi Mike,

 desc = describe(x)
 foreach (i=1:nrow(x), .combine=c, .packages='bigmemory') %dopar%
 {
   x = attach.big.matrix(desc)
   f(x[i,])
 }

Thanks! The shared.big.matrix was exactly what I needed. It still
remains for me, though, to check if I run into memory problems anyway.
It doesn't seem as if there's a "don't return anything" option in the
foreach function (also mentioned in my previous post in this thread).

Best,
Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Building a big.matrix using foreach

2009-07-20 Thread Michael Knudsen
On Mon, Jul 20, 2009 at 10:23 AM, Michael Knudsenmicknud...@gmail.com wrote:

 Thanks! The shared.big.matrix was exactly what I needed. It still
 remains for me, though, to check if I run into memory problems anyway.
 It doesn't seem as if there's a don't return anything option in the
 foreach function (also mentioned in my previous post in this thread).

Oops! The following code made R go bananas and left the system dead
for fifteen minutes. Trying a simple 'ls' in a terminal resulted in
something like "Segmentation fault. Too many files open."

distances = shared.big.matrix(nrow=length(x),ncol=length(x),type="double",init=0)
desc = describe(distances)

foreach (i=1:(length(x)-1)) %dopar%
{
   first_x = x[[i]]
   these_distances = numeric(length(x))

   for (j in (i+1):length(x))
   {
      second_x = x[[j]]
      these_distances[j] = as.numeric(ks.test(first_x,second_x)$statistic)
   }

   y = attach.big.matrix(desc)
   y[i,] = these_distances
}

The error message from R was:

*** caught segfault ***
address (nil), cause 'memory not mapped'

/Michael

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] what is meaning of the bubbles in boxplots?

2009-07-20 Thread Michael Knudsen
On Mon, Jul 20, 2009 at 5:18 AM, Jie TANGtotang...@gmail.com wrote:

  Can anyone tell me what the correct meaning of these bubbles?and how to
 remove it?

Have a look at the Wikipedia entry on box plots:

http://en.wikipedia.org/wiki/Boxplot

Why do you want to remove the bubbles? They are outliers and part of your data.
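
A sketch for inspecting or hiding those points, assuming the default rule in which points lying more than 1.5 times the box length beyond the box are drawn individually. The data are made up.

```r
x <- c(1:10, 50)   # made-up data with one extreme value

# boxplot.stats() returns the numbers behind the plot; $out holds the
# points that would be drawn individually.
bs <- boxplot.stats(x)
print(bs$out)

pdf(NULL)                      # null device; drop this line to plot on screen
boxplot(x, outline = FALSE)    # draw the box and whiskers without those points
invisible(dev.off())
```

Hiding the points changes only the picture; the values remain in the data and still influence any statistics computed from it.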

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] I might be dumb : a simple question about foreach

2009-07-20 Thread Michael Knudsen
On Mon, Jul 20, 2009 at 2:48 PM, Olivier
ETERRADOSSIolivier.eterrado...@ema.fr wrote:

  x <- foreach(i = 1:3) %do% sqrt(i)

 and get :

 Erreur dans sqrt(i) : indice hors limites ( i.e. error in sqrt(i) : index
 out of bounds)

I once got similar errors because I didn't encapsulate the part after
%do% or %dopar% in curly brackets. Try

x <- foreach(i = 1:3) %do% { sqrt(i) }

I should say, however, that in this particular case, your original
code evaluates without errors on my computer (Mac OSX 10.5.7 with R
2.9.1).

By the way, remember to use %dopar% instead of %do%, if you want to
take advantage of multiple cores. While being totally ecstatic after
discovering foreach, I wrote the following (very simple) guide:

http://lifeofknudsen.blogspot.com/2009/07/most-of-work-i-do-in-r-has-to-do-with.html

Maybe you'll find it useful, maybe not.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] I might be dumb : a simple question about foreach

2009-07-20 Thread Michael Knudsen
On Mon, Jul 20, 2009 at 3:14 PM, Olivier
ETERRADOSSIolivier.eterrado...@ema.fr wrote:

 Do you suggest some Windows related behaviour ?

I haven't used Windows for more than ten years, so unfortunately I
have no clue whatsoever. Maybe there are some Windows experts here who
can help you. America is slowly waking up now, so cross your fingers
:-)

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] mahalanobis distance

2009-07-20 Thread Michael Knudsen
On Mon, Jul 20, 2009 at 3:08 PM, ekinakoglue...@ims.metu.edu.tr wrote:

 Error in solve.default(cov, ...) :
  system is computationally singular: reciprocal condition number =
 1.65972e-18

Try calculating the determinant of the S matrix:

 det(S)
[1] 2.825397e-06

It's very close to zero, and I guess that the matrix is therefore
considered non-invertible by R. Recall that S must be invertible

http://en.wikipedia.org/wiki/Mahalanobis_distance

to work as a covariance matrix in the definition of the Mahalanobis distance.
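
The quoted error message reports a reciprocal condition number rather than a determinant, so it may be more informative to look at rcond(). The sketch below uses made-up matrices to show that a small determinant alone does not make solve() fail.

```r
# Small determinant, but perfectly well conditioned.
small_det <- diag(0.01, 4)
print(det(small_det))
print(rcond(small_det))
inv <- solve(small_det)        # inverts without complaint

# Two identical columns: exactly singular, so solve() signals an error.
singular <- matrix(c(1, 2, 1, 2), nrow = 2)
print(rcond(singular))
result <- tryCatch(solve(singular), error = function(e) conditionMessage(e))
print(result)
```

A reciprocal condition number near machine precision, as in the quoted 1.65972e-18, usually points to (nearly) linearly dependent columns.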

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] heatmap plot

2009-07-20 Thread Michael Knudsen
2009/7/20 Markus Mühlbacher muehli...@yahoo.com:

 What is my mistake?

I don't know about the heatmap function, but I have often used 'image'
with 'heat.colors' without any problems. There is a nice example here:

http://addictedtor.free.fr/graphiques/graphcode.php?graph=20

It should be fairly easy to fit your data into that one. I guess that
this should work:

x = 1:nrow(activity.matrix)
y = 1:ncol(activity.matrix)
image(x, y, activity.matrix, col=heat.colors(100))
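
A self-contained sketch of the call with made-up data. image() expects x to have one value per row of the matrix and y one value per column, which is why nrow() and ncol() are used below rather than length(), which would count every cell.

```r
set.seed(1)
activity.matrix <- matrix(runif(36), nrow = 6)   # made-up data for illustration
diag(activity.matrix) <- 0

x <- 1:nrow(activity.matrix)   # one coordinate per row
y <- 1:ncol(activity.matrix)   # one coordinate per column

pdf(NULL)   # null device; drop this line to plot on screen
image(x, y, activity.matrix, col = heat.colors(100))
invisible(dev.off())
```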

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] heatmap plot

2009-07-20 Thread Michael Knudsen
2009/7/20 Markus Mühlbacher muehli...@yahoo.com:

 Gives the attached image. Again I am missing the white diagonal. Is there 
 some kind of sorting that I do not consider?

Maybe col=c("white",heat.colors(100)) will do the trick?

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



[R] kmeans.big.matrix

2009-07-20 Thread Michael Knudsen
Hi,

I'm playing around with the 'bigmemory' package, and I have finally
managed to create some really big matrices. However, only now I
realize that there may not be functions made for what I want to do
with the matrices...

I would like to perform a cluster analysis based on a big.matrix.
Googling around I have found indications that a certain
kmeans.big.matrix() function should exist. It is mentioned, among
other places, in this document:

http://www.stat.yale.edu/~jay/662/bm-nojss.pdf

Unfortunately, on my computer the following happens:

 require(bigmemory)
Loading required package: bigmemory
 kmeans.big.matrix
Error: object 'kmeans.big.matrix' not found

Does anybody know how to get the kmeans.big.matrix() function? Are
there other cluster algorithms out there ready to accept a big.matrix
as input?

Thanks!

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] mahalanobis distance

2009-07-20 Thread Michael Knudsen
On Mon, Jul 20, 2009 at 9:37 PM, ekinakoglue...@ims.metu.edu.tr wrote:

 Could you please help me with a pseudo matrix of 4x4
 that is gonna work with mahalanobis?

Hmmm ... I have been trying some different matrices myself now, but I
keep getting the same error. Even if det(S) is very far from zero.
Maybe I just don't get the point of the mahalanobis() function in R.
It looks quite weird to me :-(
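
For reference, a sketch of a call that does run, using made-up data with four variables so that the covariance matrix is 4 x 4. Note that mahalanobis() returns squared distances, and that the third argument is the covariance matrix itself, not its inverse.

```r
set.seed(42)
X <- matrix(rnorm(200), ncol = 4)   # 50 made-up observations of 4 variables
S <- cov(X)                         # 4 x 4 sample covariance matrix
center <- colMeans(X)

d2 <- mahalanobis(X, center, S)     # squared distances, one per row of X

# The same quantity for the first row, written out from the definition.
v <- X[1, ] - center
by_hand <- as.numeric(t(v) %*% solve(S) %*% v)
print(all.equal(d2[1], by_hand))
```

A covariance matrix estimated from fewer observations than variables, or from variables that are exact linear combinations of each other, will be singular and reproduce the error.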

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Re placing null values (#NULL!)

2009-07-18 Thread Michael Knudsen
On Fri, Jul 17, 2009 at 10:37 PM, PDXRuggerj_r...@hotmail.com wrote:

 So i need to replace the the #NULL! with 0.  I have tried:

 Props_pct_vacant-Props_pct_vacant[Props_$pct_vacant !=#NULL!]

Try this instead:

Props_$pct_vacant[which(Props_$pct_vacant=="#NULL!")] = 0
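
If the column was read in as a factor, which the "invalid factor level" warning elsewhere in this thread suggests, assigning 0 directly will produce NAs. A sketch of one way around that, using a made-up stand-in for the poster's data frame:

```r
# Hypothetical stand-in: the column was read as a factor.
Props_ <- data.frame(pct_vacant = factor(c("0.12", "#NULL!", "0.30")))

# Go through character, replace the marker, then convert to numeric.
vals <- as.character(Props_$pct_vacant)
vals[vals == "#NULL!"] <- "0"
Props_$pct_vacant <- as.numeric(vals)

print(Props_$pct_vacant)
```

Whether "#NULL!" should become 0 or NA depends on what the marker means in the source spreadsheet.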

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



[R] Building a big.matrix using foreach

2009-07-18 Thread Michael Knudsen
Hi there!

I have become a big fan of the 'foreach' package allowing me to do a
lot of stuff in parallel. For example, evaluating the function f on
all elements in a vector x is easily accomplished:

foreach(i=1:length(x),.combine=c) %dopar% f(x[i])

Here the .combine=c option tells foreach to combine output using the
c()-function. That is, to return it as a vector.

Today I discovered the 'bigmemory' package, and I would like to
construct a big.matrix in a parallel fashion row by row. To use foreach
I see no other way than to come up with a substitute for c in the
.combine option. I have checked out the big.matrix manual, but I can't
find a function suitable for just that.

Actually, I wouldn't even know how to do it for a usual matrix. Any clues?
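
For an ordinary matrix, one possible approach is to let each iteration return one row and stack the rows with rbind. The sketch below uses base R only so that it runs without extra packages; f is a made-up row-producing function. In foreach the same idea would be written with .combine=rbind in place of .combine=c. Whether this carries over to a big.matrix is not something this sketch addresses.

```r
# Hypothetical function returning one row (a numeric vector) per index.
f <- function(i) c(i, i^2, i^3)

# Compute the rows independently, then stack them into a matrix.
rows <- lapply(1:4, f)
m <- do.call(rbind, rows)
```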

Thanks!

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Transformation of data!

2009-07-17 Thread Michael Knudsen
On Fri, Jul 17, 2009 at 10:29 AM, Andriy Fetsunfet...@googlemail.com wrote:

 I want to perform some sort of transformation on all the
 elements in the matrix I have posted and that I have only presented
 those 3 elements as an example of how the transformation will affect
 those 3 elements.

 Do you see the problem now?

Hmmm ... I don't think so. If you have a function f, you can apply it
to all elements in a vector x by simply typing f(x). If you want to
arrange it in two columns as you suggest, you could do

cbind(1:length(x),f(x)),

but I really can't see why you would ever want to do that.

 Here the proposed by R-guy solution

  mydata <- data.frame(W21)

Where is your function, and what is W21?

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] dot plot with several points for 2 categories

2009-07-17 Thread Michael Knudsen
On Fri, Jul 17, 2009 at 7:17 PM, jaregisuck...@mpi-cbg.de wrote:

 I'm trying to wean myself off the very limited capabilities of Excel and Oo.
 Currently, I am trying to make a plot showing several values for 2
 categories in a dot blot (see
 http://www.nabble.com/file/p24538360/Picture%2B1.png Picture+1.png  except
 that the x axis should contain the category not a number, which was the only
 way to coax Excel into displaying a plot like this).

Let y1 be a vector containing the values in the first category, and
let y2 contain those of the second. Then you could do it like this:

x1 = rep(1,times=length(y1))
x2 = rep(2,times=length(y2))
plot(c(x1,x2),c(y1,y2),xaxt="n")
axis(side=1,at=c(1,2),labels=c("label1","label2"))

It looks like a hack, but it should work.
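
A runnable version of the same idea with made-up values for y1 and y2. The xlim setting and the axis titles are additions for readability, and pdf(NULL) is used only so that the sketch runs without opening a window. The last call shows stripchart(), a base graphics function that draws this kind of plot directly from a named list.

```r
# Hypothetical measurements for the two categories.
y1 <- c(2.1, 2.5, 3.0, 2.8)
y2 <- c(3.9, 4.2, 3.5)

x1 <- rep(1, times = length(y1))
x2 <- rep(2, times = length(y2))

pdf(NULL)  # null device; drop this line to draw on screen
plot(c(x1, x2), c(y1, y2), xaxt = "n", xlim = c(0.5, 2.5),
     xlab = "category", ylab = "value")
axis(side = 1, at = c(1, 2), labels = c("label1", "label2"))

# Alternative: one call, with the category names taken from the list.
stripchart(list(label1 = y1, label2 = y2), vertical = TRUE, pch = 1)
invisible(dev.off())
```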

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] loading a file in R in mac OS X

2009-07-16 Thread Michael Knudsen
On Thu, Jul 16, 2009 at 1:32 PM, caballojamespi...@hotmail.com wrote:

 Error in file(file, "r") : cannot open the connection
 In addition: Warning message:
 In file(file, "r") :
 cannot open file 'c:\harddrivename\users\username\desktop\schools.txt':
 No such file or directory.

 I'm guessing that there's a different way to enter the path name on a
 macintosh?

There is no such thing as a c-drive on a Mac. Try this location instead:

~/Desktop/schools.txt

and note that Mac is case-sensitive. The tilde (~) refers to your home
directory. Alternatively you could write:

/Users/username/Desktop/schools.txt
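
A short sketch of how the path could be built and checked from within R before reading the file. It assumes the file is called schools.txt and sits on the Desktop, as in the question.

```r
# Build the path piece by piece; file.path() inserts the separators.
path <- file.path("~", "Desktop", "schools.txt")

# path.expand() replaces the tilde with the home directory.
full <- path.expand(path)

# Check that the file is really there before calling read.table().
found <- file.exists(full)
```

If found is FALSE, printing full shows exactly where R is looking.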

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] loading a file in R in mac OS X

2009-07-16 Thread Michael Knudsen
On Thu, Jul 16, 2009 at 3:04 PM, Marc Schwartzmarc_schwa...@me.com wrote:

 Actually, by default, the OSX HFS+ file system is not case sensitive:

Sorry. I just took that for granted, as Mac (at least in a terminal)
is very similar to Linux.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] loading a file in R in mac OS X

2009-07-16 Thread Michael Knudsen
On Thu, Jul 16, 2009 at 2:44 PM, Marc Schwartzmarc_schwa...@me.com wrote:

 If you are using the OSX GUI (R.app) you may also want to review the OSX
 FAQ:

If you use the GUI, you may also just want to hit cmd+d and browse to
your preferred working directory. If you set the working directory to
Desktop, you can just type

read.table("schools.txt",...)

I would, however, suggest that you move your files to a directory
specifically dedicated to your R project in order not to clutter up
your desktop.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Transformation of data!

2009-07-16 Thread Michael Knudsen
On Thu, Jul 16, 2009 at 10:09 PM, Andriy Fetsun fet...@googlemail.com wrote:

   [1]  0.00e+00  1.89e-04  3.933000e-05  1.701501e-04  2.040456e-04
   [6]  3.119242e-04  2.545665e-04  1.893930e-03  1.303112e-03  9.880183e-04
  [11]  1.504378e-03  1.549246e-03  5.877690e-04  4.771359e-04  8.528219e-04

That is a vector of length 15.

 How is it possible to transform the data to get a vector as following

 10   0.017511063
 11   0.017819918
 12   0.017944472

That looks like a 3x2 matrix. How do you get that from the vector above?

--
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] The greatest common divisor between more than two integers

2009-07-15 Thread Michael Knudsen
On Wed, Jul 15, 2009 at 8:55 AM, Atte Tenkanenatte...@utu.fi wrote:

 Do somebody know if there is a function in R which  computes the greatest 
 common divisor between several (more than two) integers?

Is there a function for computing the greatest common divisor of *two*
numbers? I can't find one, but assume that there is such a function,
and call it gcd. Then you could define a recursive function to do the
job. Something like

new_gcd = function(v)
{
   # gcd is the assumed two-argument function mentioned above
   if (length(v)==2) return(gcd(v[1],v[2]))
   else return(gcd(v[1],new_gcd(v[2:length(v)])))
}

where v is a vector containing the numbers you want to calculate the
greatest common divisor of.
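
A self-contained sketch along the same lines. Since no built-in two-argument gcd was found, the one below is a hand-written Euclidean algorithm, and Reduce() takes the place of the explicit recursion. The names gcd and gcd_all are made up for this example.

```r
# Euclidean algorithm for two integers (hypothetical helper).
gcd <- function(a, b) {
  while (b != 0) {
    r <- a %% b
    a <- b
    b <- r
  }
  abs(a)
}

# Fold the two-argument gcd over the whole vector.
gcd_all <- function(v) Reduce(gcd, v)

gcd_all(c(12, 18, 24))
```

Unlike the recursive version, this also works for a vector of length one.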

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Nested for loops

2009-07-14 Thread Michael Knudsen
On Tue, Jul 14, 2009 at 8:03 AM, Moshe Olshanskym_olshan...@yahoo.com wrote:

 Make it
 for (i in 1:9)

Thanks. That's also how I solved the problem myself. I just somehow
think it makes my code look rather clumsy and opaque. Maybe I just
have to get used to these kinds of nasty tricks.

 This is not the general solution, (...)

What do you mean? It looks like a very general solution to me.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Nested for loops

2009-07-14 Thread Michael Knudsen
On Tue, Jul 14, 2009 at 8:20 AM, Michael Knudsenmicknud...@gmail.com wrote:

 What do you mean? It looks like a very general solution to me.

Just got an email suggesting using the functions col and row. For example

temp = matrix(c(1:36),nrow=6)
which(col(temp)>row(temp))

This gives the indices (in the matrix viewed as a vector) of the
above-diagonal entries.
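
A small sketch of how these indices can be used. The arr.ind=TRUE variant, which returns (row, col) pairs instead of vector indices, is an addition here and may be handier when the loop body needs both i and j.

```r
temp <- matrix(1:36, nrow = 6)

# Vector indices of the entries strictly above the diagonal.
idx <- which(col(temp) > row(temp))
above <- temp[idx]

# The same entries as (row, col) pairs, one pair per line.
pairs <- which(col(temp) > row(temp), arr.ind = TRUE)
```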

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Nested for loops

2009-07-14 Thread Michael Knudsen
On Tue, Jul 14, 2009 at 1:56 PM, David Winsemiusdwinsem...@comcast.net wrote:

 temp[ upper.tri(temp) ]
  [1]  7 13 14 19 20 21 25 26 27 28 31 32 33 34 35

Thanks! I didn't know about that function; it certainly makes things a
lot easier. For example, until now I have used the following, homemade
expression

(1:N^2)[which((1:N^2)!=seq(0,(N-1)*N,by=N)+(1:N))]

to get the indices of the non-diagonal entries of a matrix :-)
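
For comparison, a sketch of the same off-diagonal indices obtained with row() and col(), checked against the homemade expression for one made-up value of N.

```r
N <- 4
m <- matrix(0, N, N)

# Indices of all entries that are not on the diagonal.
offdiag <- which(row(m) != col(m))

# The homemade expression from above, for comparison.
homemade <- (1:N^2)[which((1:N^2) != seq(0, (N - 1) * N, by = N) + (1:N))]
```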

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Nested for loops

2009-07-14 Thread Michael Knudsen
On Tue, Jul 14, 2009 at 2:29 PM, Gabor
Grothendieckggrothendi...@gmail.com wrote:

 seq. <- function(from, to) seq(from = from, length = max(0, to - from + 1))

Really nice! Thank you!
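
A sketch of how the quoted helper could be dropped into the nested loops. The counter and the value of n are made up, and length.out is spelled out in full here instead of the abbreviated length.

```r
# Helper as quoted above: an empty sequence when from > to.
seq. <- function(from, to) seq(from = from, length.out = max(0, to - from + 1))

n <- 4
visited <- 0
for (i in 1:n) {
  for (j in seq.(i + 1, n)) {
    visited <- visited + 1  # stands in for the real loop body
  }
}
```

For i equal to n the inner sequence is empty, so the inner loop body is simply skipped.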

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



Re: [R] Combine two matricies

2009-07-13 Thread Michael Knudsen
On Mon, Jul 13, 2009 at 11:31 AM, Tom Liptrot tomlipt...@hotmail.com wrote:

 I wish to combine these two into one matrix using the values from x where x 
 has values, and values from a where x has NA's, giving a new matrix which 
 would look like this:

This should do the trick:

x[which(is.na(x))]=a[which(is.na(x))]
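
A tiny worked example with made-up matrices. Logical indexing is used directly here, since the which() calls are not strictly needed.

```r
# Hypothetical inputs: x has gaps, a supplies the fallback values.
x <- matrix(c(1, NA, 3, NA), nrow = 2)
a <- matrix(c(10, 20, 30, 40), nrow = 2)

# Fill the NA positions of x with the corresponding entries of a.
x[is.na(x)] <- a[is.na(x)]
```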

--
Michael Knudsen
micknud...@gmail.com
http://www.google.com/profiles/micknudsen



Re: [R] graph: axis label font

2009-07-13 Thread Michael Knudsen
On Mon, Jul 13, 2009 at 5:31 PM, serbringbracard...@email.it wrote:

 excuse me for my english, i am using R on windows and i have to do several
 graphs with axis labels and the axis text thicks has a specified font type,
 (Arial) and a specified font size. How can i do these? Thank you in advance

Interesting question. I didn't know the answer, so I tried to look
it up. There might be some help towards the bottom of this page:

http://www.statmethods.net/advgraphs/parameters.html

It seems to be specific for Windows, so I can't test it myself.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/



[R] Nested for loops

2009-07-13 Thread Michael Knudsen
Hi,

I have spent some time locating a quite subtle (at least in my
opinion) bug in my code. I want two nested for loops traversing the
above-diagonal part of a square matrix. In pseudocode it would be
something like

for i = 1 to 10
{
   for j = i+1 to 10
   {
  // do something
   }
}

However, trying to do the same in R, my first try was

for (i in 1:10)
{
   for (j in (i+1):10)
   {
      # do something
   }
}

but there's a problem here. For i=10, the last for loop is over 11:10.
Usually programming languages would regard what corresponds to 11:10 as
empty, but A:B with A bigger than B is in R interpreted as the numbers
from B to A in reverse order.
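
The behaviour can be checked directly, and one possible guard is seq_len(), which returns an empty vector when asked for zero elements. The counter and n below are made up for illustration.

```r
down <- 11:10        # counts down: 11, 10
empty <- seq_len(0)  # integer(0), so a loop over it never runs

n <- 10
visited <- 0
for (i in 1:n) {
  # i + seq_len(n - i) is i+1, ..., n and is empty when i equals n
  for (j in i + seq_len(n - i)) {
    visited <- visited + 1
  }
}
```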

Is there a clever way to make nested loops like the one above in R?

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.