Re: [R] Replace split with regex for speed ?

2011-03-18 Thread rex.dwyer
That's a good solution, but if you're really, really sure that the timestamps 
are in the format you gave, it's quite a bit faster to use substr and paste, 
because you don't have to do any searching in the string.
HTH
Rex

> x = rep("09:30:00.000.633",100)
> system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
   user  system elapsed
   0.87    0.00    0.88
> system.time(y<-sub("\\.(\\d+)$", "\\1", x))
   user  system elapsed
   1.65    0.00    1.65
> system.time(y<-sub("\\.(\\d+)$", "\\1", x))
   user  system elapsed
   1.65    0.00    1.66
> system.time(y<-paste(substr(x,1,12),substr(x,14,16),sep=""))
   user  system elapsed
   0.88    0.00    0.89
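
For anyone who wants to check that the two approaches agree before timing them, here is a minimal sketch. The vector `ts` is just a couple of sample timestamps in the HH:MM:SS.MMM.UUU format from the original question; the fixed positions 12 and 14 only hold if every timestamp has exactly that layout.

```r
# Two sample timestamps in the assumed fixed-width format HH:MM:SS.MMM.UUU
ts <- c("09:30:00.000.245", "09:30:00.026.370")

# Regex version: drop the last "." before the trailing digits
by.regex <- sub("\\.(\\d+)$", "\\1", ts)

# Fixed-position version: characters 1-12 plus characters 14-16
by.substr <- paste(substr(ts, 1, 12), substr(ts, 14, 16), sep = "")

print(by.regex)
print(by.substr)
```

The substr version silently gives wrong answers if the layout ever varies, so the regex is the safer choice when the format is not guaranteed.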


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Henrique Dallazuanna
Sent: Friday, March 18, 2011 8:32 AM
To: rivercode
Cc: r-help@r-project.org
Subject: Re: [R] Replace split with regex for speed ?

Try this:

 sub("\\.(\\d+)$", "\\1", ts)


On Thu, Mar 17, 2011 at 11:01 PM, rivercode aqua...@gmail.com wrote:

 Have timestamp in format HH:MM:SS.MMM.UUU and need to remove the last . so
 it is in format HH:MM:SS.MMMUUU.

 What is the fastest way to do this, since it has to be repeated on millions
 of rows. Should I use regex ?

 Currently doing it with a string split, which is slow:

 > head(ts)
 [1] "09:30:00.000.245" "09:30:00.000.256" "09:30:00.000.633" "09:30:00.001.309"
 "09:30:00.003.635" "09:30:00.026.370"


  ts = strsplit(ts, ".", fixed = TRUE)
  # Remove last . from timestamp, from HH:MM:SS.MMM.UUU to HH:MM:SS.MMMUUU
  ts = lapply(ts, function(x) { paste(x[1], ".", x[2], x[3], sep="") } )
  ts = unlist(ts)

 Thanks,
 Chris

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Replace-split-with-regex-for-speed-tp3386098p3386098.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O





message may contain confidential information. If you are not the designated 
recipient, please notify the sender immediately, and delete the original and 
any copies. Any use of the message by you is prohibited. 


Re: [R] Singularity problem

2011-03-16 Thread rex.dwyer
Feng,
Your matrix is *not* (practically) singular; its inverse is.
The message said that the *system* was singular, not the matrix.
Remember Cramer's Rule:  x_i = |A_i| / |A|
The really, really large determinant of your matrix is going to appear in the 
denominator of your solutions, so, essentially, you get underflow. Try working 
out the entire solution with Cramer's Rule if you still don't see the problem.  
solve doesn't really use Cramer's Rule, but it will give you a feel for the 
issue.
HTH
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Berend Hasselman
Sent: Wednesday, March 16, 2011 1:33 PM
To: r-help@r-project.org
Subject: Re: [R] Singularity problem


Peter Langfelder wrote:

 On Wed, Mar 16, 2011 at 8:28 AM, Feng Li <m...@feng.li> wrote:
 Dear R,

 If I have remembered correctly, a square matrix is singular if and only if
 its determinant is zero. I am a bit confused by the following code error.
 Can someone give me a hint?

 > a <- matrix(c(1e20,1e2,1e3,1e3),2)
 > det(a)
 [1] 1e+23
 > solve(a)
 Error in solve.default(a) :
  system is computationally singular: reciprocal condition number = 1e-17


 You are right, a matrix is mathematically singular iff its determinant
 is zero. However, this condition is useless in practice since in
 practice one cares about the matrix being computationally singular,
 i.e. so close to singular that it cannot be inverted using the
 standard precision of real numbers. And that's what your matrix is
 (and the error message you got says so).

 You can write your matrix as

 a = 1e20 * matrix (c(1, 1e-18, 1e-17, 1e-17), 2, 2)

 Compared to the first element, all of the other elements are nearly
 zero, so the matrix is numerically nearly singular even though the
 determinant is 1e23. A better measure of how numerically unstable the
 inversion of a matrix is is the condition number which IIRC is
 something like the largest eigenvalue divided by the smallest
 eigenvalue.


svd(a) indicates the problem.

largest singular value / smallest singular value = 1e17  (condition number)
--> reciprocal condition number = 1e-17
and the standard solve can't handle that.

(pivoted) QR decomposition does help. And so does SVD.
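
To make the point about the reciprocal condition number concrete, here is a small sketch using the matrix from the original post. It relies only on base R: `rcond()` for the estimate, and the documented `tol` argument of `solve()`. Lowering `tol` is my own illustration, not something proposed in the thread, and it only turns off the safety check; it does not make the problem better conditioned.

```r
a <- matrix(c(1e20, 1e2, 1e3, 1e3), 2)

det(a)     # about 1e23, so the matrix is not mathematically singular
rcond(a)   # estimated reciprocal condition number, about 1e-17

# solve() refuses because rcond(a) is below its default tolerance
# (.Machine$double.eps); try() captures the error seen in the thread.
res <- try(solve(a), silent = TRUE)

# Lowering the tolerance lets the solver proceed anyway.
ainv <- solve(a, tol = 1e-20)
a %*% ainv   # close to the identity for this particular matrix
```

Whether the result is trustworthy depends on the matrix; here the trouble is mostly bad scaling, so the inverse happens to come out fine.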

Berend

--
View this message in context: 
http://r.789695.n4.nabble.com/Singularity-problem-tp3382093p3382465.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] (no subject)

2011-03-15 Thread rex.dwyer
Hi Jon,  I read your question differently.  Is this the answer?  - Rex

> ch=scan(stdin(),what=character(0),n=1)
1: f
Read 1 item
> ch
[1] "f"


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Bert Gunter
Sent: Tuesday, March 15, 2011 10:09 AM
To: Jonathan P Daily
Cc: r-help@r-project.org
Subject: Re: [R] (no subject)

?strsplit

x <- "ThisIsaString"
y <- strsplit(x,"")

This gives a list, which you can convert to a vector by unlist(y)
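
A short sketch of the strsplit/unlist combination described above, using the same example string:

```r
x <- "ThisIsaString"

# strsplit() returns a list with one element per input string
y <- strsplit(x, "")

# unlist() flattens it to a character vector of single characters
chars <- unlist(y)
print(chars)
```

Since `x` has length one here, `y[[1]]` would give the same vector as `unlist(y)`.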

Incidentally, you could have found out about strsplit via R's
help.search("character string") (or similar) or even googling "R
string function". Please use R's native Help capabilities before
posting to the list.

(I will grant that the unlist()  trick may not be that easy to find).

Also, it's often worthwhile searching the Help archives first. Peter
Dalgaard answered this same question here a day or two ago.

Cheers,
Bert



On Tue, Mar 15, 2011 at 6:35 AM, Jonathan P Daily jda...@usgs.gov wrote:
 I was wondering if there is a way to read in a single keystroke at a
 time in R as a string, akin to ncurses-style interfaces. I looked into
 readLines, readChar, etc. using stdin, but these all require the use of an
 end of line. Has anyone ever had need to do this or have any ideas on how
 to do this?

 Thanks,
 Jon

 PS I apologize if this double-sends, but I am having mail client issues.
 --
 Jonathan P. Daily
 Technician - USGS Leetown Science Center
 11649 Leetown Road
 Kearneysville WV, 25430
 (304) 724-4480
 Is the room still a room when its empty? Does the room,
  the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly





--
Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] How to read only specified columns from a data file

2011-03-15 Thread rex.dwyer
I think you need to read an introduction to R.
For starters, read.table returns its results as a value, which you are not 
saving.
The probable answer to your question:
Read the whole file with read.table, and select columns you need, e.g.:
tab <- read.table(myfile, skip=2)[,1:5]
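
Here is a self-contained sketch of that, plus the colClasses route the original post was reaching for. The file contents and the column count of 8 are made up for illustration. The key detail is that read.table skips a column when its colClasses entry is the string "NULL" (in quotes), while NA lets it guess the type.

```r
# Hypothetical 8-column file with two lines to skip at the top
myfile <- tempfile(fileext = ".txt")
writeLines(c("header line 1", "header line 2",
             "1 2 3 4 5 6 7 8",
             "9 10 11 12 13 14 15 16"), myfile)

# Read everything, then keep the first five columns
tab <- read.table(myfile, skip = 2)[, 1:5]

# Or drop the unwanted columns while reading
ncols <- 8
mycols <- c(rep(NA, 5), rep("NULL", ncols - 5))
tab2 <- read.table(myfile, skip = 2, colClasses = mycols)

print(tab2)
```

The second form avoids holding the unwanted columns in memory, which may matter for a file with several hundred columns.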

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Luis Ridao
Sent: Tuesday, March 15, 2011 11:53 AM
To: r-help@r-project.org
Subject: [R] How to read only specified columns from a data file

R-help,

I'm trying to read a data file with plenty of columns.
I just need the first 5 but it does not work by doing something like:

 mycols <- rep(NULL, 430) ; mycols[c(1:4)] <- NA
 read.table(myfile, skip=2, colClasses=mycols)

Any suggestions?

Thanks in advance



Re: [R] Does R have a const object?

2011-03-15 Thread rex.dwyer
Cheer up! R is a step closer to that concept than the old FORTRAN compilers 
that couldn't even guarantee that 37 was a constant if used repeatedly in a 
subroutine call.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Uwe Ligges
Sent: Tuesday, March 15, 2011 2:23 PM
To: xiagao1982
Cc: r-help
Subject: Re: [R] Does R have a const object?



On 15.03.2011 15:53, xiagao1982 wrote:
 Hi, all,

 Does R have a const object concept like which is in C++ language? I want to 
 set some data frames as constant to avoid being modified unintentionally. 
 Thanks!


Although there is almost never a No in R, the best short answer is: No.

Best,
Uwe  Ligges





 xiagao1982
 2011-03-15



Re: [R] *Building* a covariance matrix efficiently

2011-03-14 Thread rex.dwyer
Tjerk,
This is just a pseudo code outline of what you need to do:

M = matrix(0, number of variables, number of variables)
V = rep(0, number of variables)
N = 0
While (more observations to read) {
   X <- next observation
   V <- V + X
   M <- M + outer(X,X)
   N <- N+1
}
Compute covariance matrix from elements of V, M, and N

You just need to refer to the formula defining covariance.
Outlook seems to think all my variables should be upper case.
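
A runnable sketch of the outline above, under some assumptions: the matrix `Q` stands in for the binary file (each row plays the role of one chunk), and the last step uses the usual sample covariance identity S = (M - V V'/N) / (N - 1). The sizes are arbitrary.

```r
set.seed(1)
Q <- matrix(rnorm(1000), ncol = 5)   # stand-in for the data read chunk by chunk
p <- ncol(Q)

M <- matrix(0, p, p)   # running sum of outer products
V <- numeric(p)        # running sum of observations
N <- 0                 # number of observations seen

for (i in seq_len(nrow(Q))) {
  X <- Q[i, ]              # "next observation"
  V <- V + X
  M <- M + outer(X, X)
  N <- N + 1
}

# Covariance from the accumulated sums
S <- (M - outer(V, V) / N) / (N - 1)
```

Only M, V and N are kept between observations, so memory use does not grow with the number of rows. This one-pass formula can lose precision when the means are large relative to the spread, so it is worth comparing against cov() on a sample of the data.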

HTH
Rex


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tsjerk Wassenaar
Sent: Monday, March 14, 2011 10:14 AM
To: R-help
Subject: [R] *Building* a covariance matrix efficiently

deaRs,

I want to build a covariance matrix out of the data from a binary
file, that I can read in chunk by chunk, with each chunk containing a
single observation vector X. I wonder how to do that most efficiently,
avoiding the calculation of the full symmetric matrices XX'. The
trivial non-optimal approach boils down to something like:

Q <- matrix(rnorm(10),ncol=200)
M <- matrix(0,ncol=200,nrow=200)
for (i in 1:nrow(Q))
  M <- M + tcrossprod(Q[i,])

I would appreciate pointers to help me fill this lacuna in my R skills :)

Cheers,

Tsjerk

--
Tsjerk A. Wassenaar, Ph.D.

post-doctoral researcher
Molecular Dynamics Group
* Groningen Institute for Biomolecular Research and Biotechnology
* Zernike Institute for Advanced Materials
University of Groningen
The Netherlands



Re: [R] *Building* a covariance matrix efficiently

2011-03-14 Thread rex.dwyer
Tsjerk,
It seems to me that memory and not time is your big efficiency problem, and 
I've showed you how to avoid storing your entire input.
If you want to avoid doing each multiplication twice, you can replace the 
outer with a function that computes each product only once and accumulate 
sums of those products.
iii = matrix(c(rep(1:200,200),rep(1:200,each=200)), ncol=2)
iii = iii[ iii[,1]<=iii[,2], ]

and at each step

v2 = v2+X[iii[,1]] * X[iii[,2]]

instead of M.
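
A small worked version of this idea, with assumptions spelled out: `p` is 4 rather than 200 to keep it readable, `Q` is stand-in data, and the final step that rebuilds the full symmetric matrix from `v2` is my addition for checking the result.

```r
set.seed(1)
p <- 4
Q <- matrix(rnorm(40), ncol = p)   # stand-in data, one observation per row

# Index pairs (i, j) with i <= j: the upper triangle including the diagonal
iii <- matrix(c(rep(1:p, p), rep(1:p, each = p)), ncol = 2)
iii <- iii[iii[, 1] <= iii[, 2], ]

v2 <- numeric(nrow(iii))           # one running sum per index pair
for (i in seq_len(nrow(Q))) {
  X <- Q[i, ]
  v2 <- v2 + X[iii[, 1]] * X[iii[, 2]]
}

# Rebuild the symmetric matrix of summed products from the triangle
M <- matrix(0, p, p)
M[iii] <- v2
M[iii[, 2:1]] <- v2
```

This does p(p+1)/2 multiplications per observation instead of p^2, at the cost of the indexing overhead, so it is not obvious that it is faster in practice than outer().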

I would imagine that the internal cov does this anyway, as you are not the 
first person to notice this symmetry, so I'm not sure of the point of this 
exercise.

PS: I actually know how to spell and pronounce Tsjerk, but Tsj is not a very 
familiar pattern for my fingers.

From: Tsjerk Wassenaar [mailto:tsje...@gmail.com]
Sent: Monday, March 14, 2011 1:41 PM
To: Dwyer Rex USRE
Subject: Re: RE: [R] *Building* a covariance matrix efficiently


Hi Rex,

Thanks for the reply. But it doesn't solve the issues of redundant calculations 
due to symmetry, both in the outer product and in the summation.

Cheers,

Tsjerk (correct spelling, really)
On Mar 14, 2011 5:44 PM, <rex.dw...@syngenta.com> wrote:



Re: [R] minimum distance between line segments

2011-03-11 Thread rex.dwyer
I like Thomas's idea as a quick practical solution.  Here is one more little 
variation just in case you really do have millions of these distances.  Pick 
point P1 on line segment L1 (e.g., an endpoint).  Pick 101 evenly spaced points 
on line segment L2.  Find the nearest to P1 and call it P2.  Now go back to L1 
and pick a new P1.  Alternate until the distance stops dropping.  I think it is 
probably a theorem that three iterations suffice.  So, you could get by with 
303 distance calculations instead of 10,201.  This might also be interesting to 
some because it defines a function that returns a function.


# returns a function mapping any a in [0,1] to a point between p1 and p2.
line = function(p1,p2) function(a) (1-a)*p1 + a*p2
line1 = line(c(pi,12),c(7,-3))  # one line
line2 = line(c(0.1,5),c(3,7*sqrt(2)))  # another line

print(line1(1))  # one endpoint
print(line1(0))  # the other
print(line1(0.5))  # midpoint

d2 = function(p1,p2) sum((p1-p2)^2)  # squared distance

# parameter a for the point on some.line nearest to some.point.
nearest = function(some.point,some.line) {
  seq = (0:100)/100
  return(seq[which.min(sapply(seq, function(a) d2(some.point,some.line(a))))])
}

plot.seg = function (some.line,...) lines(rbind(some.line(0),some.line(1)),...)


a = 0
b = nearest(line1(a),line2)
a = nearest(line2(b),line1)
b = nearest(line1(a),line2)
plot(c(-5,15),c(-5,15),asp=1,type="n",main=sqrt(d2(line1(a),line2(b))))
plot.seg(line1,col="black")
plot.seg(line2,col="blue")
plot.seg(line(line1(a),line2(b)),col="red")


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Thomas Lumley
Sent: Thursday, March 10, 2011 2:54 PM
To: Mike Marchywka
Cc: r-help@r-project.org; darcy.web...@gmail.com
Subject: Re: [R] minimum distance between line segments

On Fri, Mar 11, 2011 at 2:46 AM, Mike Marchywka marchy...@hotmail.com wrote:

 
 Date: Wed, 9 Mar 2011 10:55:46 +1300
 From: darcy.web...@gmail.com
 To: r-help@r-project.org
 Subject: [R] minimum distance between line segments

 Dear R helpers,

 I think that this may be a bit of a math question as the more I
 consider it, the harder it seems. I am trying to come up with a way to
 work out the minimum distance between line segments. For instance,
 consider 20 random line segments:

 x1 <- runif(20)
 y1 <- runif(20)
 x2 <- runif(20)
 y2 <- runif(20)

 plot(x1, y1, type = "n")
 segments(x1, y1, x2, y2)

 Initially I thought the solution to this problem was to work out the
 distance between midpoints (it quickly became apparent that this is
 totally wrong when looking at the plot). So, I thought that perhaps
 finding the minimum distance between each of the lines' endpoints AND
 their midpoints would be a good proxy for this, so I set up a loop
 that uses Pythagoras to work out these 9 distances and find the
 minimum. But, this solution is obviously flawed as well (sometimes
 lines actually intersect, sometimes the minimum distances are less
 etc). Any help/direction on this one would be much appreciated.


There are two possibilities:

If the segments cross, the minimum distance is where they cross, obviously.

If they don't cross, the minimum distance is from one of the four
endpoints to the closest point on the other segment.  The closest
point on the other segment is either the nearest endpoint of the other
segment or the closest point on the infinite line that extends the
other segment.

That gives a small set of possibilities to work with.
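
The case analysis above can be sketched in a few lines of R. This is only an illustration for 2-D segments given as endpoint pairs; the function names (`point_seg_dist`, `segs_cross`, `seg_dist`) are made up here, and the crossing test is the standard cross-product orientation check rather than anything from the thread.

```r
# Distance from point p to the segment from a to b
point_seg_dist <- function(p, a, b) {
  ab <- b - a
  len2 <- sum(ab^2)
  # position of the closest point on the infinite line, as a fraction of ab
  t <- if (len2 == 0) 0 else sum((p - a) * ab) / len2
  t <- min(1, max(0, t))          # clamp to the segment, i.e. fall back to an endpoint
  sqrt(sum((p - (a + t * ab))^2))
}

# TRUE if segment p1-p2 properly crosses segment q1-q2
segs_cross <- function(p1, p2, q1, q2) {
  cross <- function(u, v) u[1] * v[2] - u[2] * v[1]
  d1 <- cross(p2 - p1, q1 - p1)
  d2 <- cross(p2 - p1, q2 - p1)
  d3 <- cross(q2 - q1, p1 - q1)
  d4 <- cross(q2 - q1, p2 - q1)
  (d1 * d2 < 0) && (d3 * d4 < 0)
}

# Minimum distance between the two segments
seg_dist <- function(p1, p2, q1, q2) {
  if (segs_cross(p1, p2, q1, q2)) return(0)
  min(point_seg_dist(p1, q1, q2), point_seg_dist(p2, q1, q2),
      point_seg_dist(q1, p1, p2), point_seg_dist(q2, p1, p2))
}

seg_dist(c(0, 0), c(1, 0), c(0, 1), c(1, 1))   # two parallel unit segments
```

Segments that merely touch or overlap are not counted as "crossing" by the strict test, but they still come out with distance 0 because one endpoint then lies on the other segment.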

If you're not doing this for millions of segments and you don't need
very high accuracy, however, taking lots of points from each segment
and computing pairwise distances by brute force is likely to be easier

peri<-function(xstart,ystart,xend,yend){

line1x<-seq(xstart[1],xend[1],length=98)
line1y<-seq(ystart[1],yend[1],length=98)

line2x<-seq(xstart[2],xend[2],length=100)
line2y<-seq(ystart[2],yend[2],length=100)

   distsq<-outer(1:98,1:100, function(i,j)
      (line1x[i]-line2x[j])^2+(line1y[i]-line2y[j])^2)

closest<-which(distsq==min(distsq),arr.ind=TRUE)


rbind(c(line1x[closest[1]],line1y[closest[1]]),c(line2x[closest[2]],line2y[closest[2]]))

}

   -thomas


--
Thomas Lumley
Professor of Biostatistics
University of Auckland


Re: [R] minimum distance between line segments

2011-03-11 Thread rex.dwyer
I think I need to retract the part about 3 iterations...  not true if, e.g., 
the segments intersect and the angle is small.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of rex.dw...@syngenta.com
Sent: Friday, March 11, 2011 2:37 PM
To: tlum...@uw.edu; marchy...@hotmail.com
Cc: r-help@r-project.org; darcy.web...@gmail.com
Subject: Re: [R] minimum distance between line segments


Re: [R] using lapply

2011-03-10 Thread rex.dwyer
But no one answered Kushan's question about performance implications of 
for-loop vs lapply.
With apologies to George Orwell:
for-loops BAAD, no loops GOOD.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Uwe Ligges
Sent: Thursday, March 10, 2011 4:38 AM
To: Arun Kumar Saha
Cc: r-help@r-project.org
Subject: Re: [R] using lapply



On 10.03.2011 08:30, Arun Kumar Saha wrote:
 On reply to the post
 http://r.789695.n4.nabble.com/using-lapply-td3345268.html

Hmmm, can you please reply to the original post and quote it?
Your mail was not recognized to be in the same thread as the message of
the original poster (and hence I wasted time to answer it again).

Thanks,
Uwe Ligges




 Dear Kushan, this may be a good start:

 ## assuming 'instr.list' is  your list object and you are applying
 my.strat() function on each element of that list, you can use lapply
 function as
 lapply(instr.list, function(x) return(my.strat(x)))

 Here the result will again be a list, with the same length as
 your original list 'instr.list'.

 Instead if the returned object for my.strat() function is a single number
 then you might want to create a vector instead list, in that case just use
 'sapply'
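
A minimal sketch of the lapply/sapply difference described above. Both `instr.list` and `my.strat` are stand-ins invented for the example; the real objects from the original post are not shown in this thread.

```r
# Hypothetical stand-ins for the poster's list and function
instr.list <- list(a = 1:3, b = 4:6)
my.strat <- function(x) sum(x)

# lapply always returns a list of the same length as its input
res.list <- lapply(instr.list, my.strat)

# sapply simplifies to a vector when each result is a single number
res.vec <- sapply(instr.list, my.strat)

str(res.list)
print(res.vec)
```

The anonymous wrapper `function(x) return(my.strat(x))` is not needed; passing `my.strat` directly does the same thing.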

 HTH

 Arun,



Re: [R] Extracting only odd columns from a matrix

2011-03-09 Thread rex.dwyer
Or, if X1 Y1 X2 Y2... are really your column names
m[, grep("X",colnames(m)) ]
or
m[, grepl("X",colnames(m)) ]
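
A quick sketch on a toy matrix, assuming the columns really are named X1, Y1, X2, Y2, and so on. The `^` anchor in the pattern is my addition so that only names starting with X match.

```r
# Toy matrix with alternating X/Y column names
m <- matrix(1:12, nrow = 2,
            dimnames = list(NULL, c("X1", "Y1", "X2", "Y2", "X3", "Y3")))

by.name <- m[, grepl("^X", colnames(m))]   # select by column name
by.pos  <- m[, seq(1, ncol(m), by = 2)]    # select by odd position

print(by.name)
```

Selecting by name keeps working if the column order ever changes, whereas the seq() version depends on the X columns sitting in the odd positions.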

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Dimitris Rizopoulos
Sent: Wednesday, March 09, 2011 9:36 AM
To: Nixon, Matthew
Cc: r-help@R-project.org
Subject: Re: [R] Extracting only odd columns from a matrix

one way is using the seq(), e.g., say 'm' is your matrix, then try:

m[, seq(1, ncol(m), by = 2)]


I hope it helps.

Best,
Dimitris


On 3/9/2011 3:20 PM, Nixon, Matthew wrote:
 Hi,

 This might seem like a simple question but at the moment I am stuck for 
 ideas. The columns of my matrix in which some data is stored are of this form:

 X1 Y1 X2 Y2 X3 Y3 ... Xn Yn

 with n~100. I would like to look at just the X values (i.e. odd column 
 numbers). Is there an easy way to loop round extracting only these columns?

 Any help would be appreciated.

 Thank you.


--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/



Re: [R] Complex sampling?

2011-03-09 Thread rex.dwyer
It sounds like you want a bunch of random permutations of 1:7.
Try order(runif(7))
If you need, say, 10 of them:
as.vector(sapply(1:10,function(i) order(runif(7))))
Is it more complicated than that?
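
For reference, a runnable version of the two lines above, together with the equivalent `sample()` idiom from base R (my addition, not from the original message). The seed is only there to make the run repeatable.

```r
set.seed(42)

# One random permutation of 1:7
perm <- order(runif(7))

# Ten permutations laid end to end
perms <- as.vector(sapply(1:10, function(i) order(runif(7))))

# Same idea using sample(), which permutes 1:7 directly
perms2 <- as.vector(replicate(10, sample(7)))

print(perm)
```

Each block of seven values contains every number from 1 to 7 exactly once.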


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Hosack, Michael
Sent: Wednesday, March 09, 2011 1:02 PM
To: r-help@R-project.org
Subject: [R] Complex sampling?

 -Original Message-
 From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
 On Behalf Of Hosack, Michael
 Sent: Wednesday, March 09, 2011 7:34 AM
 To: r-help at R-project.org
 Subject: [R] Complex sampling?

 R users,

 I am trying to generate a randomized weekday survey schedule that ensures
 even coverage of weekdays in
 the sample, where the distribution of variable DOW is random with respect
 to WEEK. To accomplish this I need
 to randomly sample without replacement two weekdays per week for each of
 27 weeks (only 5 are shown).

This seems simple enough, sampling without replacement.

However,
 I need to sample from a sequence (3:7) that needs to be completely
 depleted and replenished until the
 final selection is made. Here is an example of what I want to do,
 beginning at WEEK 1. I would prefer to do
 this without using a loop, if possible.

 sample frame: [3,4,5,6,7] -- [4,5,6] -- [4],[1,2,3,(4),5,6] --
 [1,2,4,5,6] -- for each WEEK in dataframe

OK, now you have me completely lost.  Sorry, but I have no clue as to what you 
just did here.  It looks like you are trying to describe some 
transformation/algorithm but I don't follow it.



I could not reply to this email because it had not been delivered to my inbox, 
so I had to copy it from the forum.
I apologize for the confusion, this would take less than a minute to explain in 
conversation but an hour
to explain well in print. Two DOW_NUMs will be selected randomly without 
replacement from the vector 3:7 for each WEEK. When this vector is reduced to a 
single integer that integer will be selected and the vector will be restored 
and a single integer will then be selected that differs from the prior selected 
integer (i.e. cannot sample the same day twice in the same week). This process 
will be repeated until two DOW_NUM have been assigned for each WEEK. That 
process is what I attempted to illustrate in my original message. This is 
beyond my current coding capabilities.
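
One possible reading of the procedure described above, written as a plain loop, is sketched below. The function name sample_days and the output columns DAY1/DAY2 are made up for illustration, and the refill rule (restore the pool only once it is empty) is an assumption about what was meant:

```r
# Sketch: draw 2 distinct days per week from a pool that is used up
# completely before it is restored (names here are hypothetical).
sample_days <- function(n_weeks, days = 3:7) {
  pool <- days
  out <- matrix(NA_integer_, nrow = n_weeks, ncol = 2)
  for (w in seq_len(n_weeks)) {
    picked <- integer(0)
    while (length(picked) < 2) {
      if (length(pool) == 0) pool <- days        # restore the depleted pool
      cand <- setdiff(pool, picked)              # never the same day twice in a week
      pick <- cand[sample.int(length(cand), 1)]  # index draw avoids sample()'s length-1 surprise
      picked <- c(picked, pick)
      pool <- setdiff(pool, pick)
    }
    out[w, ] <- picked
  }
  data.frame(WEEK = seq_len(n_weeks), DAY1 = out[, 1], DAY2 = out[, 2])
}

set.seed(1)
sched <- sample_days(27)
head(sched)
```

Because every pass through the pool uses each weekday exactly once, the day counts over the whole schedule can differ by at most one. This does use a loop, which the original request hoped to avoid, but it keeps the depletion rule easy to check.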




 Randomly sample 2 DOW_NUM without replacement from each WEEK ( () = no two
 identical DOW_NUM can be sampled
 in the same WEEK)

 sample = {3,7}, {5,6}, {4,3}, {1,5}, -- for each WEEK in dataframe


So, are you sampling from [3,4,5,6,7], or [1,2,4,5,6], or ...?  Can you show an 
'example' of what you would like to end up given your data below?


 Thank you,

 Mike


    DATE       DOW DOW_NUM WEEK
 2  2011-05-02 Mon       3    1
 3  2011-05-03 Tue       4    1
 4  2011-05-04 Wed       5    1
 5  2011-05-05 Thu       6    1
 6  2011-05-06 Fri       7    1
 9  2011-05-09 Mon       3    2
 10 2011-05-10 Tue       4    2
 11 2011-05-11 Wed       5    2
 12 2011-05-12 Thu       6    2
 13 2011-05-13 Fri       7    2
 16 2011-05-16 Mon       3    3
 17 2011-05-17 Tue       4    3
 18 2011-05-18 Wed       5    3
 19 2011-05-19 Thu       6    3
 20 2011-05-20 Fri       7    3
 23 2011-05-23 Mon       3    4
 24 2011-05-24 Tue       4    4
 25 2011-05-25 Wed       5    4
 26 2011-05-26 Thu       6    4
 27 2011-05-27 Fri       7    4
 30 2011-05-30 Mon       3    5
 31 2011-05-31 Tue       4    5
 32 2011-06-01 Wed       5    5
 33 2011-06-02 Thu       6    5
 34 2011-06-03 Fri       7    5

 DF <-
 structure(list(DATE = structure(c(15096, 15097, 15098, 15099,
 15100, 15103, 15104, 15105, 15106, 15107, 15110, 15111, 15112,
 15113, 15114, 15117, 15118, 15119, 15120, 15121, 15124, 15125,
 15126, 15127, 15128), class = "Date"), DOW = c("Mon", "Tue",
 "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri", "Mon",
 "Tue", "Wed", "Thu", "Fri", "Mon", "Tue", "Wed", "Thu", "Fri",
 "Mon", "Tue", "Wed", "Thu", "Fri"), DOW_NUM = c(3, 4, 5, 6, 7,
 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7, 3, 4, 5, 6, 7),
 WEEK = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
 4, 4, 4, 4, 5, 5, 5, 5, 5)), .Names = c("DATE", "DOW", "DOW_NUM",
 "WEEK"), row.names = c(2L, 3L, 4L, 5L, 6L, 9L, 10L, 11L, 12L,
 13L, 16L, 17L, 18L, 19L, 20L, 23L, 24L, 25L, 26L, 27L, 30L, 31L,
 32L, 33L, 34L), class = "data.frame")


Dan

Daniel Nordlund
Bothell, WA USA


Re: [R] rowSums - am I getting something wrong?

2011-03-07 Thread rex.dwyer
Hi Thomas,
Several of us explained this in different ways just last week, so you might 
search the archive.  Floating point numbers are an approximate representation 
of real numbers.  Most values that can be written exactly as decimal fractions 
(0.1, 0.3, 0.6) cannot be represented exactly in binary.  So the sum 
0.6+0.3+0.1 is NOT "clearly" 1.0.
You can use signif and round to overcome this:
> a = seq(0,1,0.1)
> a
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a[7]-0.6
[1] 1.110223e-16

> 1-(a[4]+a[7]+a[2])
[1] -2.220446e-16
> b = rev(seq(1,0,-0.1))
> b
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a-b
 [1] 0.000000e+00 2.775558e-17 5.551115e-17 1.110223e-16 1.110223e-16
 [6] 0.000000e+00 1.110223e-16 1.110223e-16 0.000000e+00 0.000000e+00
[11] 0.000000e+00
> round(a-b,10)
 [1] 0 0 0 0 0 0 0 0 0 0 0
> round(a,10)-round(b,10)
 [1] 0 0 0 0 0 0 0 0 0 0 0

The first commandment of floating point programming is
THOU SHALT NOT TEST WHETHER TWO FP NUMBERS ARE EQUAL
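
Applied to the matrix in the question, a tolerance-based filter might look like the sketch below. The tolerance 1e-9 and the names tol, keep and m2 are assumptions chosen for illustration; pick a tolerance that suits the scale of your own data:

```r
# Rebuild the 1331 x 3 matrix from the question.
a <- seq(0, 1, 0.1)
m <- cbind(rep(a, 121), rep(a, 11, each = 11), rep(a, each = 121))

tol  <- 1e-9                       # assumed tolerance, not a universal constant
keep <- rowSums(m) <= 1 + tol      # sums within tol of 1 count as "not greater than 1"
m2   <- m[keep, , drop = FALSE]

m[161, ]                              # 0.6 0.3 0.1
keep[161]                             # this row is now retained
isTRUE(all.equal(sum(m[161, ]), 1))   # all.equal() compares with a tolerance
```

all.equal() is the usual base-R way to ask "equal up to rounding error" for a single comparison; the explicit tol is used for the row filter because the test there is an inequality.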
HTH
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of thomas.salve...@syngenta.com
Sent: Monday, March 07, 2011 2:09 AM
To: r-help@r-project.org
Subject: [R] rowSums - am I getting something wrong?

I am trying to construct a data set with some sequences for example:

a = seq(0,1,0.1)

m = matrix(nrow = 1331, ncol = 3)
m[,1] = rep(a,121)
m[,2] = rep(a,11,each = 11)
m[,3] = rep(a,1,each = 121)

I realize that there may be better ways of doing this, but this approach 
demonstrates the problem I'm having.

I then want to get the sum of the rows and delete any row with a sum of greater 
than 1.  But have a problem with rows containing any combination of the values 
0.6, 0.3 and 0.1 as the sum of these is clearly 1, but a request for which rows 
have a sum greater than 1 will return rows with these values.  Row 161 is the 
first row containing these values:

[161,]  0.6  0.3  0.1

which(rowSums(m) > 1)

 [53]  119  120  121  132  142  143  152  153  154  161  162

As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though 
I haven't checked every value in the matrix)

If I try the following:

q=rowSums(m)
which(q > 1)

[53]  119  120  121  132  142  143  152  153  154  161  162

But if I add and subtract 1 from this:

q=q+1
q=q-1
which(q > 1)

[53]  119  120  121  132  142  143  152  153  154  162

What exactly is going on here?  I don't have the problem with other 
combinations (eg 0.7, 0.2, 0.1).  I assume that there is something about the 
data format that I don't understand, but if I make a data frame of the matrix I 
found the same effect.

Any help would be great

Tom








Re: [R] generate 3 distinct random samples without replacement

2011-03-07 Thread rex.dwyer
Cesar, I think your basic misconception is that you believe 'sample' returns a 
list of indices into the original vector.  It does not; it returns actual 
elements of the vector:

> sample(runif(100),3)
[1] 0.4492988 0.0336069 0.6948440

I'm not sure why you keep resetting the seed, but if it's important, replace
d2 <- d1[-i]
with
d2 <- setdiff(d1,i)

Otherwise Duncan's suggestion is much nicer:
s = sample(d1,300,replace=FALSE)
s1 = sort(s[1:100])
s2 = sort(s[101:200])
s3 = sort(s[201:300])
If what you actually need are indices into the original vector, replace d1 with 
length(d1).

(When you say 'distinct', I'm assuming you mean 'disjoint'.)
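
Put together as one runnable piece, the single-draw approach could be sketched like this. The seed value and the use of split() to cut the draw into three groups are illustrative choices, not part of the original suggestion:

```r
d1 <- 1:7375
set.seed(7)                               # one seed is enough for reproducibility
s  <- sample(d1, 300, replace = FALSE)    # 300 distinct elements of d1
groups <- split(s, rep(1:3, each = 100))  # three disjoint chunks of 100

D <- data.frame(a = sort(groups[[1]]),
                b = sort(groups[[2]]),
                c = sort(groups[[3]]))

# Sanity checks: 100 rows, and no value shared between columns.
nrow(D)
length(unique(unlist(D))) == 300
```

Since the 300 values are drawn without replacement in a single call, the three columns cannot overlap, whatever the seed.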

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Duncan Murdoch
Sent: Monday, March 07, 2011 3:52 PM
To: Cesar Hincapié
Cc: r-help@r-project.org
Subject: Re: [R] generate 3 distinct random samples without replacement

On 07/03/2011 2:17 PM, Cesar Hincapié wrote:
 Hello:

 I wonder if I could get a little help with random sampling in R.

 I have a vector of length 7375.  I would like to draw 3 distinct random 
 samples, each of length 100 without replacement.  I have tried the following:

 d1 <- 1:7375

 set.seed(7)
 i <- sample(d1, 100, replace=F)
 s1 <- sort(d1[i])
 s1

 d2 <- d1[-i]
 set.seed(77)
 j <- sample(d2, 100, replace=F)
 s2 <- sort(d2[j])
 s2

 d3 <- d2[-j]
 set.seed(777)
 k <- sample(d3, 100, replace=F)
 s3 <- sort(d3[k])
 s3

 D <- data.frame(a=s1,b=s2,c=s3)


 However, s2 is only 97 elements long, and s3, only 96 long.

 I would appreciate any suggestions on a better approach.
 I'm also curious to know why my second and third samples are less than 100 
 elements in length.

If you want 3 non-overlapping, non-repeating samples of 100, why not
draw one sample of 300, and take 3 subsets of it?

The reason you were finding shorter samples is because you were using j
and k as indices into vectors d2 and d3 that didn't have enough
elements, and then you sorted the result, losing the NAs.  For example,

d2 <- 1:10
d2[10:12]
sort(d2[10:12])

See ?sort for an explanation of how to keep NA values when you sort.
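
A small sketch of what happens in that example, using the na.last argument of sort() (the variable name x is only for illustration):

```r
d2 <- 1:10
x  <- d2[10:12]                    # 10 NA NA: positions 11 and 12 do not exist
length(x)                          # 3
length(sort(x))                    # 1, sort() drops NA by default
length(sort(x, na.last = TRUE))    # 3, NAs are kept and placed last
```

This is why the sorted samples came out shorter than 100: every out-of-range index produced an NA, and sort() silently removed it.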

Duncan Murdoch

 Thanks for your time and consideration,

 Cesar A. Hincapié, DC, MHSc

 Research Fellow, Division of Health Care and Outcomes Research, Toronto 
 Western Research Institute
 PhD Candidate in Epidemiology, Dalla Lana School of Public Health, University 
 of Toronto
 e. cesar.hinca...@utoronto.ca







Re: [R] questions about using loop, while and next

2011-03-04 Thread rex.dwyer
Carrie,
If your while-loop condition depends only on dt, and you don't change dt in 
your loop, your loop won't terminate.
The only thing inside your loop is next.
Perhaps you mean to write:
temp=rep(NA, 10)
for(i in 1:10)
{
  dt=sum(rbinom(10, 5, 0.5))
  while (dt < 25) {
    dt=sum(rbinom(10, 5, 0.5))
  }
  temp[i]=dt
}
It doesn't look like you understand next.  Try reading the help with ?"next" 
-- the quotes are necessary in this case.
If you still don't understand next, you should be able to program without it 
with appropriate if's.
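
If it helps, the same redraw-until-accepted logic can also be written with repeat and break, which avoids generating dt in two places. This is only a sketch of an equivalent form; the seed is there so the run is reproducible:

```r
set.seed(1)
temp <- rep(NA_integer_, 10)
for (i in 1:10) {
  repeat {
    dt <- sum(rbinom(10, 5, 0.5))   # draw a candidate
    if (dt >= 25) break             # accept it, otherwise draw again
  }
  temp[i] <- dt
}
temp
```

Note that break leaves only the inner repeat loop, so the for loop carries on with the next i.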
HTH
Rex


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Carrie Li
Sent: Friday, March 04, 2011 12:10 AM
To: r-help@r-project.org
Subject: [R] questions about using loop, while and next

Hello R helpers,

I have a quick question about loop and next

In my loop, I have some random generation of data, but if the data doesn't
meet some condition, then I want it to go next, and generate data again for
next round.

# just an example..
# i want to generate the data again, if the sum is smaller than 25
temp=rep(NA, 10)
for(i in 1:10)
{
dt=sum(rbinom(10, 5, 0.5))
while (dt < 25) next
temp[i]=dt
}

I also tried while(dt < 25) {i=i+1}
But it doesn't seem right to me, since it running nonstop. Any solutions ?
Thanks for helps!
Carrie--



Re: [R] R usage survey

2011-03-04 Thread rex.dwyer
You still don't say what organization you are associated with.  Your domain 
name and e-mail address give no hint. How do we know that Harsh Singhal is 
even a real person?  An e-mail address at a university (for example) would go a 
long way to establish that.  Gmail doesn't cut it for me.
The preponderance of evidence is that you're just a naïve person who would give 
your own information to anyone who asked.  On the other hand, it's possible 
that you are conducting industrial espionage by recording IP addresses and 
associating use cases with companies.  In my opinion, the onus is on you to 
show your bona fides, and you haven't done it.
That's all I have to say...


From: Harsh [mailto:singhal...@gmail.com]
Sent: Friday, March 04, 2011 4:19 AM
To: bill.venab...@csiro.au
Cc: Dwyer Rex USRE; r-help@r-project.org
Subject: Re: [R] R usage survey

The R usage survey http://goo.gl/jw1ig has been updated with the 
following changes:

Addition of -
Disclaimer :
This data will not be used for any commercial purposes
Do not include any personally identifiable information
Contact: Harsh Singhal (singhalblr AT gmail DOT com) for any queries

Removal of -
Name field

My primary purpose in conducting this survey is -
- Find multiple use cases for various R packages
- Understand the nature of work when R is being used in Academia / Commercial 
settings
- The kind of technologies that are being used in conjunction with R 
(popularity of usage of Python with R, and what purpose does using Python solve)

The outcome of this analysis will be published on my blog (in the process of 
being created).
There is absolutely no commercial purpose behind collecting this information 
and as earlier stated, this information will not be shared with personally 
identifiable information.

Thank you once again Mr. Dwyer and Mr. Venables for raising very important 
questions.

I thank the R users who have already filled in the survey 
http://goo.gl/jw1ig and request more to do so.

Regards,
Harsh Singhal







On Fri, Mar 4, 2011 at 7:41 AM, bill.venab...@csiro.au wrote:
No.  That's not answering the question.  ALL surveys are for collecting 
information.

The substantive issue is what purpose do you have in seeking this information 
in the first place and what are you going to do with it when you get it?

Do you have some commercial purpose in mind?  If so, what is it?

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Harsh
Sent: Friday, 4 March 2011 1:13 AM
To: rex.dw...@syngenta.com
Cc: r-help@r-project.org
Subject: Re: [R] R usage survey

Hi Rex and useRs,

The purpose of the survey has been mentioned on the survey link 
http://goo.gl/jw1ig
but I will also reproduce it here.
- Geographical distribution of R users
- Application areas where R is being used
- Supporting technology being used along with R
- Academic background distribution of R users

The potential personally identifiable information such as name and employer
name are optional fields. Actually all the fields in the survey are
optional.

Some of the analysis output(s) could be along the lines of :-
- Usage statistics of various R packages
- Distribution of R users across countries/cities
- Mapping various applications to packages
- Text Mining of the responses to create informative word clouds

Personally, I am excited about the kind of data I will receive through this
survey and the various insights that could be derived. As already mentioned,
the results will be shared with the community.

Thank you Rex for raising an important point. It is indeed necessary for me
to personally assure the user community that the results will be shared in a
manner that will not contain any personally identifiable information.

Those who wish to gain access to the raw data will be provided with all the
fields but not the name and employer name fields.

Just out of curiosity : It is possible to get name, employer name, location,
usage information and academic background details when searching for R users
on LinkedIn and the many R related groups there.
Does this also provide potential opportunities for misuse and outrageous
analyses, since almost anyone can get onto LinkedIn and access user profiles
?

Thank you for your interest and support.
Regards,
Harsh












On Thu, Mar 3, 2011 at 8:02 PM, rex.dw...@syngenta.com wrote:

 Harsh, Suitably analyzed for whose purposes?  One man's suitable is
 another's outrageous. That's why people want to see the gowns at the
 Oscars.  Under what auspices are you conducting this survey?  What do you
 intend to do with it?  You don't give any assurance that the results you
 post won't have personally identifiable information. I don't get the
 impression that you know much about survey design.

 

Re: [R] R usage survey

2011-03-04 Thread rex.dwyer
Harsh, not to worry, but you were wrong to assert that I engaged in any name 
calling, let alone constant name calling.
I also didn't and don't claim to be an authority on survey design.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Harsh
Sent: Friday, March 04, 2011 4:13 PM
To: Ista Zahn
Cc: r-help@r-project.org
Subject: Re: [R] R usage survey

Rex, Please accept my apologies for my inappropriate and utterly juvenile 
remarks.
I got carried away by what I thought was criticism and was quick to respond in 
a scathing manner.
I do accept and apologize for my inability in understanding what was 
essentially being asked of me.

Thanks to Ista and other members for clarifying what I failed to understand.

I'm now aware that I must submit to appropriately answering questions from 
potential respondents of the survey.

I must reiterate that  This survey is not sponsored or approved by any 
organization or company. The purpose of the survey is to satisfy my personal 
curiosity regarding R usage patterns. Results will be posted to a publicly 
available weblog; the data will not be used for any other purpose.
(Thanks Ista for wording this out. I couldn't have done it better)

Regards,
Harsh Singhal
http://in.linkedin.com/in/harshsinghal








On Sat, Mar 5, 2011 at 2:11 AM, Ista Zahn iz...@psych.rochester.edu wrote:

 On Fri, Mar 4, 2011 at 3:20 PM, Harsh singhal...@gmail.com wrote:
  Hi Ista, Spencer and Greg,
 snip
  The information being collected is purely out of personal interest
  and I have mentioned this earlier.

 No, I don't think you did actually. This is the key thing we wanted to
 know up-front, and it's a shame that it took the better part of the
 day before we finally understand why you are conducting the survey.

  There is no commercial interest involved.
 
  Is it possible that I am interested in this sort of information to
  better understand R's usage patterns ? In doing so, the survey I am
  conducting would seem an appropriate way for my requirements.
 
  And how does belittling someone on a mailing list help ?
 
  If anyone wants the kind of information I am collecting, are there
  suggestions of better ways of finding it besides the method that I
  have adopted ? Sure I could scrape the data of LinkedIn pages, or
  find other ways of doing it, but I found this suitable.
 
 
 
  On Sat, Mar 5, 2011 at 1:27 AM, Spencer Graves
  spencer.gra...@structuremonitoring.com wrote:
 
   Most surveys done in the US today are done during election season,
  to determine how to package candidates to attract votes.  Officials
  elected under such circumstances spend half their time in office
  servicing the bribes that they accepted to pay for the surveys and
  the resulting advertising (and the other half soliciting more
  bribes er contributions for their next campaign).  The best
  reference on this I know is Thomas Ferguson (1995) Golden Rule
  (U. Chicago Pr.).  It's by now somewhat old but is still cited by
  leading researchers.
 
 
   People have a right to be cautious of surveys, because too
  rarely today are surveys used for legitimate scientific purposes.
  Most often, they are used to defraud the public into doing things
  that are contrary to their best interests.
 
 
   Spencer Graves
 
 
  On 3/4/2011 11:37 AM, Ista Zahn wrote:
 
  Now hold on a second Harsh! I was fairly neutral up to this point,
  but this response is totally uncalled for. The problem is that
  despite repeated requests you never clarified the purpose of your
  research!
  That is all you were asked to do, but rather than responding to
  this inquiry in a straightforward and honest manner you kept
  dodging the question. The most charitable explanation is that you
  just don't understand what information you were being asked to
  provide, which is frustrating but understandable; your last
  response on the other hand is completely out of line. Research
  participants have a right to know the purpose for which their data
  is being collected, and as a researcher you have a responsibility
  to tell them.
 
  Rex, thank you for generating this discussion. When I first saw
  Harsh's original email I was just getting ready to fill out the
  survey. When I saw your response I delayed. Boy am I glad I did!
 
  Best,
  Ista
 
  On Fri, Mar 4, 2011 at 2:20 PM, Harsh singhal...@gmail.com wrote:
 
  Rex,
  You're just paranoid and I'm in no way answerable to you. Your
  constant name calling presupposes your own naivete.
 
  The survey has a disclaimer and those who wish to respond can do
  so at their own discretion.
 
  Judging by the nature (and number) of respondents, there seem to
  be a lot of highly qualified people who have no qualms about
  sharing information regarding their R usage patterns.
 
  You can believe what you want and can continue to spin your
  imaginative tales of industrial espionage while assuming a 

Re: [R] Developing a web crawler

2011-03-03 Thread rex.dwyer
Perl seems like a 10x better choice for the task, but try looking at the 
examples in ?strsplit to get started.
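
As a rough starting point in base R, word frequencies can be had from gsub, strsplit and table. The html string below is a made-up stand-in for whatever getURL returns, and stripping tags with a regular expression is a crude assumption that will not cope with scripts, comments or entities on a real page:

```r
# Hypothetical page content standing in for the result of getURL().
html <- "<html><body><p>Kindle is great. The Kindle screen is sharp.</p></body></html>"

text  <- gsub("<[^>]+>", " ", html)                     # crude tag stripping
words <- tolower(unlist(strsplit(text, "[^A-Za-z]+")))  # split on anything that is not a letter
words <- words[nzchar(words)]                           # drop empty strings left by the split
freq  <- sort(table(words), decreasing = TRUE)

freq["kindle"]   # how often one word occurs
head(freq)       # most frequent words first
```

For anything beyond a quick count, a proper HTML parser would be a safer choice than the regular expression used here.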

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of antujsrv
Sent: Thursday, March 03, 2011 4:23 AM
To: r-help@r-project.org
Subject: [R] Developing a web crawler

Hi,

I wish to develop a web crawler in R. I have been using the functionalities
available under the RCurl package.
I am able to extract the html content of the site but i don't know how to go
about analyzing the html formatted document.
I wish to know the frequency of a word in the document. I am only acquainted
with analyzing data sets.
So how should i go about analyzing data that is not available in table
format.

Few chunks of code that i wrote:
w <-
getURL("http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B003DZ1Y8Q/ref=dp_reviewsanchor#FullQuotes")
write.table(w,"test.txt")
t <- readLines(w)

readLines also didnt prove out to be of any help.

Any help would be highly appreciated. Thanks in advance.


--
View this message in context: 
http://r.789695.n4.nabble.com/Developing-a-web-crawler-tp3332993p3332993.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Greek character and R

2011-03-03 Thread rex.dwyer
mytitle = parse(text=paste("expression(paste(delta^13,'C Station ',",i,"))"))
title(mytitle)

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Filoche
Sent: Thursday, March 03, 2011 8:16 AM
To: r-help@r-project.org
Subject: [R] Greek character and R

Dear R users.

In a loop, I set the title of my graph with :

mytitle = expression(paste(delta^13,'C Station ', i))
title(mytitle)

However, instead of using value of i, it will literally use i character.

Any one know the way to concatenate the value of i to the mathematical
expression?

With regards,
Phil



--
View this message in context: 
http://r.789695.n4.nabble.com/Greek-character-and-R-tp304p304.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R usage survey

2011-03-03 Thread rex.dwyer
Harsh, Suitably analyzed for whose purposes?  One man's suitable is 
another's outrageous. That's why people want to see the gowns at the Oscars.  
Under what auspices are you conducting this survey?  What do you intend to do 
with it?  You don't give any assurance that the results you post won't have 
personally identifiable information. I don't get the impression that you know 
much about survey design.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Harsh
Sent: Thursday, March 03, 2011 5:53 AM
To: r-help@r-project.org
Subject: [R] R usage survey

Hi R users,
I request members of the R community to consider filling a short survey
regarding the use of R.
The survey can be found at http://goo.gl/jw1ig

Please accept my apologies for posting here for a non-technical reason.

The data collected will be suitably analyzed and I'll post a link to the
results in the coming weeks.

Thank you all for your interest and for sharing your R usage information.

Regards,
Harsh Singhal



Re: [R] Greek character and R

2011-03-03 Thread rex.dwyer
Eval it.  This works at my house:

plot(0)
title(eval(parse(text=paste("expression(paste(delta^13,'C Station ',",i,"))"))))
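
An alternative that avoids building and parsing a string is bquote(), which substitutes the value of anything wrapped in .( ) into an otherwise unevaluated expression. The helper name station_title is made up for this sketch:

```r
# Build a plotmath label with the current value of i substituted in.
station_title <- function(i) bquote(delta^13 * "C Station " * .(i))

lbl <- station_title(3)
lbl
# In the plotting loop this would be used as:
#   plot(0); title(station_title(i))
```

In plotmath, * juxtaposes its two sides without a space, so the label reads as delta with superscript 13 followed by "C Station 3".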

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Filoche
Sent: Thursday, March 03, 2011 9:39 AM
To: r-help@r-project.org
Subject: Re: [R] Greek character and R

Hi and ty for the answer.

However, it's not working. It will print expression(d13C Station 1).

Thank for any help,
Phil

--
View this message in context: 
http://r.789695.n4.nabble.com/Greek-character-and-R-tp304p467.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R usage survey

2011-03-03 Thread rex.dwyer

Just out of curiosity : It is possible to get name, employer name, location, 
usage information and academic background details when searching for R users on 
LinkedIn and the many R related groups there.
Does this also provide potential opportunities for misuse and outrageous  
analyses, since almost anyone can get onto LinkedIn and access user profiles ?
[Dwyer Rex USRE] That's a no-brainer: YES!

On Thu, Mar 3, 2011 at 8:02 PM, rex.dw...@syngenta.com wrote:
Harsh, Suitably analyzed for whose purposes?  One man's suitable is 
another's outrageous. That's why people want to see the gowns at the Oscars.  
Under what auspices are you conducting this survey?  What do you intend to do 
with it?  You don't give any assurance that the results you post won't have 
personally identifiable information. I don't get the impression that you know 
much about survey design.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Harsh
Sent: Thursday, March 03, 2011 5:53 AM
To: r-help@r-project.org
Subject: [R] R usage survey

Hi R users,
I request members of the R community to consider filling a short survey
regarding the use of R.
The survey can be found at http://goo.gl/jw1ig

Please accept my apologies for posting here for a non-technical reason.

The data collected will be suitably analyzed and I'll post a link to the
results in the coming weeks.

Thank you all for your interest and for sharing your R usage information.

Regards,
Harsh Singhal
   [[alternative HTML version deleted]]

__
R-help@r-project.orgmailto:R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




message may contain confidential information. If you are not the designated 
recipient, please notify the sender immediately, and delete the original and 
any copies. Any use of the message by you is prohibited.






Re: [R] Floating points and floor() ?

2011-03-03 Thread rex.dwyer
Hi Michael,

In floating point calculation, 1.0-.9 is not exactly 0.1.  This is easily seen 
by subtracting.
> (1.0-.9)-0.1
[1] -2.775558e-17
> (1.0-.9)==0.1
[1] FALSE

David is right, you can't correct this.  You can only compensate by taking 
care that you never, ever test whether 2 FP numbers are equal, because they 
almost never are.  You must always ask whether the difference is small.

> round(1.0-.9-.1,15)==0
[1] TRUE

Unfortunately, most of us forget this rule once in a while and write a loop 
like while (x!=0)... that won't terminate.
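
A minimal sketch of the "ask whether the difference is small" rule, in case it helps.  The tolerance 1e-9 and the names d and eps are choices made here for illustration only; all.equal() is base R and compares within its own default tolerance.

```r
d   <- 1.0 - 0.9
eps <- 1e-9                     # assumed tolerance; choose one suited to your data

d == 0.1                        # FALSE: exact comparison fails
isTRUE(all.equal(d, 0.1))       # TRUE: equal within all.equal's default tolerance
abs(d - 0.1) < eps              # TRUE: explicit "is the difference small?" test

floor(100 * d)                  # 9, as in the original question
floor(100 * d + eps)            # 10: nudge upward before flooring
```

The nudge before floor() is only reasonable when the values are known to sit very close to whole numbers; otherwise round() to a sensible number of digits first.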
HTH
Rex



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Folkes, Michael
Sent: Thursday, March 03, 2011 9:24 PM
To: r-help@r-project.org
Subject: [R] Floating points and floor() ?

Perhaps somebody could clarify for me if the following is a floating
point matter or otherwise, and how am I to correct for it?

> floor(100*.1)
[1] 10

> 100*(1.0-.9)
[1] 10

> floor(100*(1-0.9))
[1] 9


Thanks!
Michael
___
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada
V9T-6N7
Ph (250) 756-7264 Fax (250) 756-7053  michael.fol...@dfo-mpo.gc.ca


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] inefficient ifelse() ?

2011-03-02 Thread rex.dwyer
Hi Ivo,
It might be useful for you to study the examples below.
The key from a programming language point of view is that functions like ifelse 
are functions of whole vectors, not elements of vectors.  You either evaluate 
an argument or you don't; you don't evaluate only part of an argument.  (Somebody 
correct me if I'm wrong.)
As you can see from the examples, if there are no TRUEs or no FALSEs in the 
condition, the corresponding arm is not evaluated, but if there are some of 
each, both must be evaluated.  This is a property of the entire condition vector.  
You can see all this if you type ifelse (not ?ifelse, just ifelse) and look at 
the definition.
If you want to operate on elements of vectors, you need to use subsetting, e.g.:
s = rep(NA,length(t)); b=t%%2==0; s[b]=g(t[b]); s[!b]=f(t[!b])
I agree that it might be counterintuitive for a beginner, but so is 0! = 0^0 = 1, 
and both follow from first principles (e.g. n! = n(n-1)!).
Counterintuitive is not the same as incorrect, and correct is not the 
same as efficient.  :)
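
Here is the subsetting one-liner spelled out as a small runnable sketch.  The definitions of f and g were never posted, so the ones here are guesses that merely reproduce the printed output in the examples (f doubles, g triples, and each announces its argument).

```r
# Hypothetical stand-ins for f and g, inferred from the printed examples.
f <- function(x) { cat("f for", x, "\n"); 2 * x }
g <- function(x) { cat("g for", x, "\n"); 3 * x }

t <- 1:10
b <- t %% 2 == 0                 # condition, computed once

s <- rep(NA_real_, length(t))    # result vector, filled in by subset
s[b]  <- g(t[b])                 # g sees only the even elements
s[!b] <- f(t[!b])                # f sees only the odd elements
s
```

Each function is now called on just the elements it applies to, which is what the proposed %if% %else% construct was after.  Note that an NA in the condition would need separate handling, since NA is not allowed as a logical subscript in an assignment.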
HTH
Rex

> t = 1:30
> ifelse(t%%2==0,g(t),f(t))
g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 
28 29 30
f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 
28 29 30
 [1]  2  6  6 12 10 18 14 24 18 30 22 36 26 42 30 48 34 54 38 60 42 66 46 72 50
[26] 78 54 84 58 90

> t = 2*(1:30)
> ifelse(t%%2==0,g(t),f(t))
g for 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 
54 56 58 60
 [1]   6  12  18  24  30  36  42  48  54  60  66  72  78  84  90  96 102 108 114
[20] 120 126 132 138 144 150 156 162 168 174 180

> t = 2*(1:30)+1
> ifelse(t%%2==0,g(t),f(t))
f for 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 
55 57 59 61
 [1]   6  10  14  18  22  26  30  34  38  42  46  50  54  58  62  66  70  74  78
[20]  82  86  90  94  98 102 106 110 114 118 122

> t = rep(c(1,2,NA),3)
> ifelse(t%%2==0,g(t),f(t))
g for 1 2 NA 1 2 NA 1 2 NA
f for 1 2 NA 1 2 NA 1 2 NA
[1]  2  6 NA  2  6 NA  2  6 NA

> t = rep(NA,10)
> ifelse(t%%2==0,g(t),f(t))
 [1] NA NA NA NA NA NA NA NA NA NA

> t=1:30
> ifelse(c(TRUE,FALSE,FALSE,TRUE),g(t),f(t))
g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 
28 29 30
f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 
28 29 30
[1]  3  4  6 12


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of ivo welch
Sent: Tuesday, March 01, 2011 5:20 PM
To: William Dunlap
Cc: r-help
Subject: Re: [R] inefficient ifelse() ?

yikes.  you are asking me too much.

thanks everybody for the information.  I learned something new.

my suggestion would be for the much smarter language designers (than
I) to offer us more or less blissfully ignorant users another
vector-related construct in R.  It could perhaps be named %if% %else%,
analogous to if else (with naming inspired by %in%, and with
evaluation only of relevant parts [just as if else for scalars]), with
different outcomes in some cases, but with the advantage of typically
evaluating only half as many conditions as the ifelse() vector
construct.  %if% %else% may work only in a subset of cases, but when
it does work, it would be nice to have.  it would probably be my first
goto function, with ifelse() use only as a fallback.

of course, I now know how to fix my specific issue.  I was just
surprised that my first choice, ifelse(), was not as optimized as I
had thought.

best,

/iaw


On Tue, Mar 1, 2011 at 5:13 PM, William Dunlap wdun...@tibco.com wrote:
 An ifelse-like function that only evaluated
 what was needed would be fine, but it would
 have to be different from ifelse itself.  The
 trick is to come up with a good parameterization.

 E.g., how would it deal with things like
   ifelse(is.na(x), mean(x, na.rm=TRUE), x)
 or
   ifelse(x>1, log(x), runif(length(x),-1,0))
 or
   ifelse(x>1, log(x), -seq_along(x))
 Would it reject such things?  Deciding that the
 x in mean(x,na.rm=TRUE) should be replaced by
 x[is.na(x)] would be wrong.  Deciding that
 runif(length(x)) should be replaced by runif(sum(x>1))
 seems a bit much to expect.  Replacing seq_along(x) with
 seq_len(sum(x>1)) is wrong.  It would be better to
 parameterize the new function so it wouldn't have to
 think about those cases.

 Would you want it to depend only on a logical
 vector or perhaps also on a factor (a vectorized
 switch/case function)?

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com

 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of ivo welch
 Sent: Tuesday, March 01, 2011 12:36 PM
 To: Henrique Dallazuanna
 Cc: r-help
 Subject: Re: [R] inefficient ifelse() ?

 thanks, Henrique.  did you mean

 as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)),
 list(f, g))))   ?

 otherwise, you get a matrix.

 its a good solution, but unfortunately I don't think this can be used
 to redefine 

Re: [R] clustering problem

2011-03-02 Thread rex.dwyer
Don't you expect it to be a lot faster if you cluster 20 items instead of 25000?
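
To make the point concrete, here is a small sketch using the built-in USArrests data from the question (50 rows, 4 columns) in place of the 25000 x 20 expression matrix.  The names hc_rows and hc_cols are made up for illustration.  dist() works on rows, so transposing changes what gets clustered, not just how long it takes.

```r
# dist() computes distances between ROWS, so:
hc_rows <- hclust(dist(USArrests), "average")      # clusters the 50 states
hc_cols <- hclust(dist(t(USArrests)), "average")   # clusters the 4 variables

row_groups <- cutree(hc_rows, k = 3)   # one label per row
col_groups <- cutree(hc_cols, k = 3)   # one label per column

length(row_groups)   # 50
length(col_groups)   # 4
```

If row clusters are what you want, the row dendrogram has to be computed on the untransposed matrix; the column dendrogram cannot supply them.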

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Maxim
Sent: Wednesday, March 02, 2011 4:08 PM
To: r-help@r-project.org
Subject: [R] clustering problem

Hi,

I have a gene expression experiment with 20 samples and 25000 genes each.
I'd like to perform clustering on these. It turned out to become much faster
when I transform the underlying matrix with t(matrix). Unfortunately then
I'm not anymore able to use cutree to access individual clusters. In general
I do something like this:

hc <- hclust(dist(USArrests), "ave")

library(RColorBrewer)
library(gplots)
clrno=3
cols <- rainbow(clrno, alpha = 1)
clstrs <- cutree(hc, k=clrno)
ccols <- cols[as.vector(clstrs)]
heatcol <- colorRampPalette(c(3,1,2), bias = 1.0)(32)
heatmap.2(as.matrix(USArrests), Rowv=as.dendrogram(hc),col=heatcol,
trace="none",RowSideColors=ccols)

Nice, I can access 3 main clusters with cutree. But what about a situation
when I perform hclust like

hc <- hclust(dist(t(USArrests)), "ave")

which I have to do in order to speed up the clustering process. This I can
plot with:

heatmap.2(as.matrix(USArrests), Colv=as.dendrogram(hc),col=heatcol,
trace="none")

But where do I find information about the clustering that was applied to the
rows?
cutree(hc, k=clrno) delivers the clustering on the columns, so what can I do
to access the levels for the rows?
I guess the solution is easy, but after hours of playing around I thought it
might be a good time to contact the mailing list!

Maxim

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] merge( , by='row.names') slowness

2011-03-02 Thread rex.dwyer


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of dms
Sent: Wednesday, March 02, 2011 3:16 PM
To: r-help@r-project.org
Subject: [R] merge( , by='row.names') slowness

I noticed that joining two data.frames in R with the merge function
using by='row.names' slows things down substantially compared to
joining on a common index column.

Using a dataframe size of ~10,000 rows: it's as slow as 10 minutes in
the by='row.names' case versus merely 1 second using an index column.
Beyond the 10^6 range, it's unusably slow.


n <- 5
a <- data.frame(id=as.character(1:10^n), x=rnorm(10^n))
rownames(a) <- a$id
b <- data.frame(id=as.character(1:10^n + 10^(n-1)), y=rnorm(10^n))
rownames(b) <- b$id

date()
fast <- merge(a, b,  all=T)
date()
slow <- merge(a, b, all=T, by='row.names')
date()


Has anybody else noticed this?
_

Hi DMS,
Well, first off, they don't give the same answer... in fact, not even the same 
dimension.
Even so, from looking at merge.data.frame, it's not immediately obvious what 
would make a difference of this magnitude.
The answer might be buried in the internal merge.

Here for n=3:
> system.time(print(dim(merge(a,b,all=T))))
[1] 1100    3
   user  system elapsed
   0.01    0.00    0.01
> system.time(print(dim(merge(a,b,all=T,by=1))))
[1] 1100    3
   user  system elapsed
   0.01    0.00    0.02
> system.time(print(dim(merge(a,b,all=T,by=0))))
[1] 1100    5
   user  system elapsed
   3.26    0.00    3.17
> system.time(print(dim(merge(a,b,all=T,by="row.names"))))
[1] 1100    5
   user  system elapsed
   3.17    0.00    3.17
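
A possible workaround, offered only as a sketch and not something anyone in the thread proposed: copy the row names into an ordinary column and merge on that, so the merge goes through the column path that was fast in the timings above.  The names a2, b2 and key are made up here, and no timings are claimed.

```r
n <- 2                                   # small n, just to show the shapes
a <- data.frame(id = as.character(1:10^n), x = rnorm(10^n),
                stringsAsFactors = FALSE)
rownames(a) <- a$id
b <- data.frame(id = as.character(1:10^n + 10^(n - 1)), y = rnorm(10^n),
                stringsAsFactors = FALSE)
rownames(b) <- b$id

by_col <- merge(a, b, all = TRUE)                     # joins on the shared "id"
by_rn  <- merge(a, b, all = TRUE, by = "row.names")   # joins on row names

dim(by_col)   # 110 rows, 3 columns: id, x, y
dim(by_rn)    # 110 rows, 5 columns: Row.names, id.x, x, id.y, y

# Workaround sketch: turn the row names into an explicit key column.
a2 <- data.frame(key = rownames(a), x = a$x, stringsAsFactors = FALSE)
b2 <- data.frame(key = rownames(b), y = b$y, stringsAsFactors = FALSE)
by_key <- merge(a2, b2, by = "key", all = TRUE)
dim(by_key)   # 110 rows, 3 columns: key, x, y
```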


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Explained variance for ICA

2011-03-01 Thread rex.dwyer
You determine the variance explained by *any* unit vector by taking its inner 
product with the data points, then finding the variance of the results.  In the 
case of FastICA, the variance explained by the ICs collectively is exactly the 
same as the variance explained by the principal components (collectively) from 
which they are derived.
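
A minimal sketch of the inner-product recipe in base R, on made-up data.  The matrix X, the direction u and all other names are assumptions for illustration; nothing here calls the fastICA package.  The check at the end uses prcomp() to confirm that the recipe reproduces the usual per-component variances.

```r
set.seed(1)
# Made-up data: 200 observations of 3 correlated variables.
X  <- matrix(rnorm(200 * 3), ncol = 3) %*%
      matrix(c(2, 0, 0,  1, 1, 0,  0, 0.5, 0.2), nrow = 3)
Xc <- scale(X, center = TRUE, scale = FALSE)     # center the columns

u <- c(1, 1, 0)
u <- u / sqrt(sum(u^2))                          # make it a unit vector

proj  <- as.vector(Xc %*% u)                     # inner product with each data point
var_u <- var(proj)                               # variance along u
total <- sum(apply(Xc, 2, var))                  # total variance in the data
var_u / total                                    # fraction explained by u

# Sanity check against PCA: the same recipe applied to the first
# principal direction gives the first component's variance.
pc <- prcomp(Xc)
v1 <- pc$rotation[, 1]
var(as.vector(Xc %*% v1))                        # matches pc$sdev[1]^2
```

For a full set of orthonormal directions the individual variances add up to the total, which is why the collective figure for the ICs matches that of the principal components they were derived from.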
HTH
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Pavel Goldstein
Sent: Tuesday, March 01, 2011 1:24 AM
To: r-help@r-project.org
Subject: [R] Explained variance for ICA

Hello,
I am thinking of using the FastICA package to cluster microarray data,
but one question stops me: can I find out how much variance each
component (or all components together) explains?
I will be very thankful for the help.

Thanks,

Pavel

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Is there any Command showing correlation of all variables in a dataset?

2011-03-01 Thread rex.dwyer
?cor  answers that question.  If Housing is a dataframe, cor(Housing) should do 
it.  Surprisingly, ??correlation doesn't point you to ?cor.
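
One caveat, shown as a sketch: cor() needs numeric columns, and I believe the Housing data frame also contains yes/no factor columns, in which case cor(Housing) would stop with an error.  The example below uses the built-in iris data (which has one factor column) so that it runs without Ecdat; the names num and cm are made up.

```r
num <- sapply(iris, is.numeric)       # TRUE for numeric columns only
cm  <- cor(iris[, num])               # all pairwise correlations at once
round(cm, 2)
```

The same two lines with Housing in place of iris should give the full correlation matrix of its numeric variables, which can then be passed to a LaTeX table generator.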

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of JoonGi
Sent: Tuesday, March 01, 2011 5:41 AM
To: r-help@r-project.org
Subject: [R] Is there any Command showing correlation of all variables in a 
dataset?


Thanks in advance.

I want to derive correlations of variables in a dataset

Specifically

library(Ecdat)
data(Housing)
attach(Housing)
cor(lotsize, bathrooms)

This code gives only the correlation between two variables.
But I want to examine all the combinations of variables in this dataset.
And I will finally make a table in Latex.

How can I test correlations for all combinations of variables?
with one simple command?


--
View this message in context: 
http://r.789695.n4.nabble.com/Is-there-any-Command-showing-correlation-of-all-variables-in-a-dataset-tp3329599p3329599.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Finding pairs with least magnitude difference from mean

2011-03-01 Thread rex.dwyer
No, that's not what I meant, but maybe I didn't understand the question.
What I suggested would involve sorting y, not x: sort the *distances*.
If you want to minimize the sd of a subset of numbers, you sort the numbers and 
find a subset that is clumped together.
If the numbers are a function of pairs, you compute the function for all pairs 
of numbers, and find a subset that's clumped together.
Anyway, it's an idea, not a theorem, so proof is left as an exercise for the 
esteemed reader.
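
For what it's worth, here is a rough sketch of the "sort the pair sums and look for a clump" idea, using the x values James posted.  It is a heuristic only: it looks at windows of five adjacent sums and keeps those in which no number is used twice, so it may find nothing at all (best_sd stays Inf) and it is not guaranteed to find the true minimum.  All names are made up for illustration.

```r
x <- c(-0.44305959, -0.26707077,  0.07121266,  0.44123714, -1.10323616,
       -0.19712807,  0.20679494, -0.98629992,  0.97191659, -0.77561593)

idx <- t(combn(length(x), 2))          # all choose(10,2) = 45 index pairs
s   <- x[idx[, 1]] + x[idx[, 2]]       # pair sums (the 2*mean(x) shift is irrelevant to sd)
o   <- order(s)
idx <- idx[o, ]                        # sort the pairs by their sums
s   <- s[o]

best_sd <- Inf
best_w  <- NULL
for (i in 1:(length(s) - 4)) {
  w <- i:(i + 4)                                     # five adjacent sums
  if (anyDuplicated(as.vector(idx[w, ])) == 0) {     # each number used only once?
    if (sd(s[w]) < best_sd) {
      best_sd <- sd(s[w])
      best_w  <- w
    }
  }
}
best_sd                                # Inf if no adjacent window is a valid pairing
```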

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Hans W Borchers
Sent: Monday, February 28, 2011 2:17 PM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Finding pairs with least magnitude difference from mean

 rex.dwyer at syngenta.com writes:

 James,
 It seems the 2*mean(x) term is irrelevant if you are seeking to
 minimize sd. Then you want to sort the distances from smallest to
 largest. Then it seems clear that your five values will be adjacent in
 the list, since if you have a set of five adjacent values, exchanging
 any of them for one further away in the list will increase the sd. The
 only problem I see with this is that you can't use a number more than
 once. In any case, you need to compute the best five pairs beginning
 at position i in the sorted list, for 1<=i<=choose(n,2), then take the
 max over all i.
 There's no R in my answer such as you'd notice, but I hope it helps just
 the same.
 Rex

You probably mean something like the following:

x <- rnorm(10)
y <- outer(x, x, "+") - (2 * mean(x))

o - order(x)
sd(c(y[o[1],o[10]], y[o[2],o[9]], y[o[3],o[8]], y[o[4],o[7]], y[o[5],o[6]]))

This seems reasonable, though you would have to supply a more stringent
argument. I did two tests and it works alright.

--Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] help

2011-02-28 Thread rex.dwyer
Generally, you can save your excel spreadsheet as comma-separated values, and 
then read with read.csv function:   ?read.csv
Or, tab-separated values and use read.delim.
Then look at ?barplot
Possibly you would like to read the Intro to R on the CRAN website.  Go to 
www.r-project.org , find Documentation in the menu on the left, and click on 
Manuals.
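
A minimal sketch using the numbers from your table, typed in directly so it runs without a file.  The object names (diet, tab) and the file name mentioned in the comment are made up for illustration.

```r
# Counts from the question: rows are Diet, columns are Binger yes/no.
diet <- matrix(c(24, 134,
                  9,  52,
                 23,  72,
                 12,  15),
               ncol = 2, byrow = TRUE,
               dimnames = list(Diet   = c("None", "Healthy", "Unhealthy", "Dangerous"),
                               Binger = c("Yes", "No")))
tab <- as.table(diet)

addmargins(tab)                        # frequency table with row and column totals

barplot(t(tab), beside = TRUE, legend.text = TRUE,
        xlab = "Diet", ylab = "Count") # grouped bars, one group per diet

# From Excel you would instead save as CSV and read it, for example:
#   d <- read.csv("diet.csv")          # hypothetical file name
```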

HTH
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Laura Clasemann
Sent: Monday, February 28, 2011 10:03 AM
To: r-help@r-project.org
Subject: [R] help


Hi,

I was wondering if anyone could provide me with help in entering the  dataset  
below into R? I've been having a hard time in trying to figure out how to 
assemble it into both a frequency table and a bar graph within R. I've been 
trying to present the way I had the data arranged, as below, in Excel 
Spreadsheet into R. I am uncertain what the correct commands and exact 
techniques are into getting it correctly organized in R. Any help would be 
greatly appreciated! Thank you!








Diet        Binger-Yes  Binger-No  Total
None                24        134    158
Healthy              9         52     61
Unhealthy           23         72     95
Dangerous           12         15     27

Laura

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Error

2011-02-28 Thread rex.dwyer
I have to agree that it's pretty hard to take something that works and figure 
out why it doesn't work :)
The only other suggestion is that sometimes I find that this sort of error goes 
away if I add drop=FALSE to the subsetting, and, if so,  that usually lets me 
figure out why.
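
A tiny illustration of what drop=FALSE changes, on a made-up data frame (not the poster's data):

```r
DF <- data.frame(a = 1:3, b = 4:6)

class(DF[, 1])                  # "integer": a single column collapses to a vector
class(DF[, 1, drop = FALSE])    # "data.frame": the structure is kept

dim(DF[, 1])                    # NULL
dim(DF[, 1, drop = FALSE])      # 3 1
```

Code written for a data frame (for example, lapply over its columns) behaves quite differently when it is silently handed a bare vector, which is the kind of surprise that tends to show up only on the real data.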

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Duncan Murdoch
Sent: Sunday, February 27, 2011 5:52 AM
To: mathijsdevaan
Cc: r-help@r-project.org
Subject: Re: [R] Error

On 11-02-26 8:26 AM, mathijsdevaan wrote:
 Mean doesn't work either... I understand that the message "replacement has 0
 items, need 37597770" implies that the function is not returning any values,
 but I don't understand why then this is not the case in the example.

 DF = data.frame(read.table(textConnection("A  B  C  D  E
 1 1  a  1999  1  0
 2 1  b  1999  0  1
 3 1  c  1999  0  1
 4 1  d  1999  1  0
 5 2  c  2001  1  0
 6 2  d  2001  0  1
 7 3  a  2004  0  1
 8 3  b  2004  0  1
 9 3  d  2004  0  1
 10 4  b  2001  1  0
 11 4  c  2001  1  0
 12 4  d  2001  0  1"),head=TRUE,stringsAsFactors=FALSE))

 DF = DF[order(DF$B,DF$C),]

 #first option - works fine in example and my target data frame
 DF$F = ave(DF$D,DF$B, FUN = function(x) cumsum(x)-x)
 DF$G = ave(DF$E,DF$B, FUN = function(x) cumsum(x)-x)

 #second option - works fine in example but not in my target data frame
 foo <- function(x)
   {
   unlist(lapply(x, FUN = function(z) cumsum(z) - z))
   }
 n <- ave(DF[,c(4:5)],DF$B,FUN = foo)

 Why is this second option not working in my target data frame (which is much
 bigger than the example)?

Presumably something is different about it.  I don't see how you expect
people to debug your problem when you don't show it.

If you want help, you need to give us an example that shows the error.
Start with your large dataset, and shrink it as much as possible, but
not so much that the error goes away.  (I suspect when you do this,
you'll end up seeing your error yourself.  But maybe not.)

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Finding pairs with least magnitude difference from mean

2011-02-28 Thread rex.dwyer
James,
It seems the 2*mean(x) term is irrelevant if you are seeking to minimize sd.  
Then you want to sort the distances from smallest to largest.  Then it seems 
clear that your five values will be adjacent in the list, since if you have a 
set of five adjacent values, exchanging any of them for one further away in the 
list will increase the sd.  The only problem I see with this is that you can't 
use a number more than once.  In any case, you need to compute the best five 
pairs beginning at position i in the sorted list, for 1<=i<=choose(n,2), then 
take the max over all i.
There's no R in my answer such as you'd notice, but I hope it helps just the same.
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Hans W Borchers
Sent: Saturday, February 26, 2011 6:43 AM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Finding pairs with least magnitude difference from mean

 I have what I think is some kind of linear programming question.
 Basically, what I want to figure out is if I have a vector of numbers,

  > x <- rnorm(10)
  > x
  [1] -0.44305959 -0.26707077  0.07121266  0.44123714 -1.10323616
 -0.19712807  0.20679494 -0.98629992  0.97191659 -0.77561593

  > mean(x)
 [1] -0.2081249

 Using each number only once, I want to find the set of five pairs
 where the magnitude of the differences between the mean(x) and each
 pairs sum is least.

  > y <- outer(x, x, "+") - (2 * mean(x))

 With this matrix, if I put together a combination of pairs which uses
 each number only once, the sum of the corresponding numbers is 0.

 For example, compare the SD between this set of 5 pairs
  > sd(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
 [1] 1.007960

 versus this hand-selected, possibly lowest SD combination of pairs
  > sd(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
 [1] 0.2367030

Your selection is not bad, as only about 0.4% of all possible distinct
combinations have a smaller value -- the minimum is 0.1770076, for example
[10 7 9 5 8 4 6 2 3 1].

(1) combinat() from the 'combinations' package seems slow, try instead the
permutations() function from 'e1071'.

(2) Yes, except your vector is getting much larger in which case brute force
is no longer feasible.

(3) This is not a linear programming, but a combinatorial optimization task.
You could try optim() with the SANN method, or some mixed-integer linear
program (e.g., lpSolve, Rglpk, Rsymphony) by intelligently using binary
variables to define the sets.

This does not mean that some specialized approach might not be more
appropriate.
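
For ten numbers the brute-force test is small enough to do directly in base R: there are only 9*7*5*3*1 = 945 ways to split ten items into five pairs.  The sketch below enumerates them recursively and keeps the pairing with the smallest sd.  The function names (pairings, pair_sd) are made up for illustration, and the x values are those printed in the question, so the result should agree with the minimum reported above only up to rounding.

```r
x <- c(-0.44305959, -0.26707077,  0.07121266,  0.44123714, -1.10323616,
       -0.19712807,  0.20679494, -0.98629992,  0.97191659 , -0.77561593)

# All ways to split the indices in idx (even length) into unordered pairs.
# Each result is a matrix with one pair per row.
pairings <- function(idx) {
  if (length(idx) == 2) return(list(matrix(idx, nrow = 1)))
  first <- idx[1]
  out <- list()
  for (partner in idx[-1]) {
    rest <- setdiff(idx, c(first, partner))
    for (p in pairings(rest)) {
      out[[length(out) + 1]] <- rbind(c(first, partner), p)
    }
  }
  out
}

# sd of the pair sums; subtracting 2*mean(x) would not change it.
pair_sd <- function(p) sd(x[p[, 1]] + x[p[, 2]])

all_p   <- pairings(seq_along(x))
sds     <- vapply(all_p, pair_sd, numeric(1))
best    <- all_p[[which.min(sds)]]
best_sd <- min(sds)

length(all_p)     # 945
best              # the five pairs with the smallest sd
best_sd

# The hand-selected pairing from the question, for comparison.
hand    <- rbind(c(3, 1), c(6, 2), c(10, 4), c(9, 5), c(8, 7))
hand_sd <- pair_sd(hand)
```

The count grows as (n-1)*(n-3)*...*1, so this stops being practical somewhere in the low twenties, which is where the optimization approaches in point (3) come in.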

--Hans Werner

 I believe that if I could test all the various five pair combinations,
 the combination with the lowest SD of values from the table would give
 me my answer.  I believe I have 3 questions regarding my problem.

 1) How can I find all the 5 pair combinations of my 10 numbers so that
 I can perform a brute force test of each set of combinations?  I
 believe there are 45 different pairs (i.e. choose(10,2)). I found
 combinations from the {Combinations} package but I can't figure out
 how to get it to provide pairs.

 2) Will my brute force strategy of testing the SD of each of these 5
 pair combinations actually give me the answer I'm searching for?

 3) Is there a better way of doing this?  Probably something to do with
 real linear programming, rather than this method I've concocted.

 Thanks for any help you can provide regarding my question.

 Best regards,

 James


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Calculate probabilty

2011-02-25 Thread rex.dwyer
Are you clear about the question you are asking?  Do you want to know whether 
there are 6 balls or at least 6 balls?  (It sounds like "at least".)  Do you 
want to know whether there are at least 6 balls in the first box, or at least 6 
balls in exactly one box or at least 6 balls in at least one box?

This is the probability that there are exactly 6 balls in the first box:
> dbinom(6,142,1/491)
[1] 5.53366e-07

This is the probability that there are MORE THAN 6 balls in the first box 
(NOT at least 6):
> 1-pbinom(6,142,1/491)
[1] 2.272026e-08
> sum(sapply(7:142, function(i) dbinom(i,142,1/491)))
[1] 2.272026e-08
> 1-sum(sapply(0:6, function(i) dbinom(i,142,1/491)))
[1] 2.272026e-08

This is the probability that there are at least 6 balls in the first box:
> 1-pbinom(5,142,1/491)
[1] 5.760862e-07

You can get all this from ?dbinom, but it's pretty confusing that the argument n 
and the italic n in the details are totally different things: italic n = 
argument size.  (Likewise, italic p = argument prob, not argument p.)

Questions about more than one box are a little harder since the boxes are not 
independent.
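
If the question turns out to be "at least 6 balls in at least one box", two rough handles are available without working out the exact joint distribution.  This is a sketch with made-up names: an upper bound obtained by adding up the single-box probability over all 491 boxes, and a simulation that simply throws the balls many times.  The event is so rare that the simulated estimate will be very noisy at this number of repetitions.

```r
boxes <- 491
balls <- 142

# P(at least 6 in one particular box), same value as 1 - pbinom(5, ...) above.
p_first <- pbinom(5, balls, 1 / boxes, lower.tail = FALSE)

# Union bound: P(some box has >= 6) <= boxes * P(a given box has >= 6).
upper <- boxes * p_first

# Simulation: drop the balls into boxes at random and look at the fullest box.
set.seed(1)
reps <- 20000
hit  <- replicate(reps, {
  counts <- tabulate(sample.int(boxes, balls, replace = TRUE), nbins = boxes)
  max(counts) >= 6
})
mean(hit)        # crude estimate; expect mostly zeros at this sample size
upper
```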

HTH,
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Fabrice Tourre
Sent: Thursday, February 24, 2011 3:51 PM
To: r-help@r-project.org
Subject: [R] Calculate probabilty

Hi List,

I have a question to calculate probability using R.

There are 491 boxes and 142 balls. The balls are put into the boxes at
random. How do I calculate the probability that there are six or more in
one box?

I have tried:

dbinom(6,142,1/491)

1-pbinom(6,142,1/491)

But I think I am unclear about dbinom and pbinom.

Thank you very much in advance.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Error

2011-02-25 Thread rex.dwyer
Does it work for FUN=mean?  If yes, you need to print out the results of f 
before you return them to find the anomalous value.
BTW "Error" is not a very good subject line.  I don't see many posts from 
people reporting how well things are going :)


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of mathijsdevaan
Sent: Friday, February 25, 2011 9:31 AM
To: r-help@r-project.org
Subject: [R] Error


Hi, I am running the following script for a different (much larger data
frame):

DF = data.frame(read.table(textConnection("A  B  C  D  E
1 1  a  1999  1  0
2 1  b  1999  0  1
3 1  c  1999  0  1
4 1  d  1999  1  0
5 2  c  2001  1  0
6 2  d  2001  0  1
7 3  a  2004  0  1
8 3  b  2004  0  1
9 3  d  2004  0  1
10 4  b  2001  1  0
11 4  c  2001  1  0
12 4  d  2001  0  1"),head=TRUE,stringsAsFactors=FALSE))
DF <- DF[order(DF$B,DF$C),]  # order by developer_id and year
f <- function(x)
{
unlist(lapply(x, FUN = function(z) cumsum(z) - z))
}
DF <- cbind(DF[,c(1:3)],ave(DF[, c(4:5)],DF$B, FUN = f))

I get the following error:

Error in `[<-.data.frame`(`*tmp*`, i, , value = integer(0)) :
  replacement has 0 items, need 37597770
In addition: Warning message:
In max(i) : no non-missing arguments to max; returning -Inf

The dimensions of the data frame are (5,108), so the last line of the
script becomes:

DF <- cbind(DF[,c(1:3)],ave(DF[, c(4:108)],DF$B, FUN = f))

Any idea how to solve this problem? Thanks!


--
View this message in context: 
http://r.789695.n4.nabble.com/Error-tp3324531p3324531.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Gaps in plotting temporal data.

2011-02-24 Thread rex.dwyer
If you're in a hurry, it's way easier than that:

t <- c(1,2,3,7,8,9,11,12,13)
x <- rnorm(length(t))

new.t <- min(t):max(t)
new.x <- NULL
new.x[t-min(t)+1] <- x

plot(new.t, new.x, type='l')

This is wastes max(t)-min(t)-length(t)+1 vector entries, but presumably you 
won't be wasting a lot of real estate along the horizontal axis of a plot.  If 
you do, you're going to need to fix that anyway.
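
A slightly more explicit sketch of the same idea, assuming the times are integers (or can be mapped to integer slots). Pre-filling with NA instead of starting from NULL is only a stylistic choice; n.wasted is a name made up for illustration:

```r
t <- c(1, 2, 3, 7, 8, 9, 11, 12, 13)
x <- rnorm(length(t))

# One slot per integer time step; unobserved steps stay NA.
new.t <- min(t):max(t)
new.x <- rep(NA_real_, length(new.t))
new.x[t - min(t) + 1] <- x

# Number of "wasted" entries: max(t) - min(t) - length(t) + 1
n.wasted <- sum(is.na(new.x))

# With type = "l" the line is broken wherever a value is NA.
plot(new.t, new.x, type = "l")
```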



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Duncan Murdoch
Sent: Thursday, February 24, 2011 8:45 AM
To: Christos Delivorias
Cc: r-help@r-project.org
Subject: Re: [R] Gaps in plotting temporal data.

On 24/02/2011 7:38 AM, Christos Delivorias wrote:
 I'm trying to plot some temporal data that have some gaps in them. You
 can see the plot here: http://www.tiikoni.com/tis/view/?id=da222e2.

 The problem is that during the time gaps in the TS the line plot is
 interpolated over the gap and I don't want it to. I've tried
 interleaving the gaps with an NA flag, but there are around 1
 data-points sorted from multiple files, that makes it difficult to add
 the NA flag manually. If it's not possible to define the behaviour of
 the plot() function, is there another plot I can use, e.g. zoo, that will
 allow me to not have the lines drawn between the gaps?

Any software is going to have the same problem you had:  how do you
define a gap?  If the definition is something simple like time
difference greater than X, then it will be fairly easy:  use diff() to
find all the time differences in the sorted times, and wherever those
exceed X, insert a new data point with an NA value.  For example,

t <- c(1,2,3,7,8,9,11,12,13)
x <- rnorm(length(t))
d <- diff(t)
gap <- which(d > 1.5)
if (length(gap)) {
   newT <- (t[gap] + t[gap+1])/2
   t <- c(t, newT)
   x <- c(x, rep(NA, length(newT)))
   o <- order(t)
   t <- t[o]
   x <- x[o]
}
plot(t, x, type='l')
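
The same steps can be wrapped in a small function so the threshold is a parameter. This is only a sketch; insert_gaps and max.gap are names invented here, not part of any package:

```r
# Hypothetical helper: put an NA halfway through every gap wider than max.gap.
insert_gaps <- function(t, x, max.gap) {
  o <- order(t)
  t <- t[o]
  x <- x[o]
  gap <- which(diff(t) > max.gap)
  if (length(gap)) {
    newT <- (t[gap] + t[gap + 1]) / 2
    t <- c(t, newT)
    x <- c(x, rep(NA, length(newT)))
    o <- order(t)
    t <- t[o]
    x <- x[o]
  }
  data.frame(t = t, x = x)
}

g <- insert_gaps(c(1, 2, 3, 7, 8, 9, 11, 12, 13), rnorm(9), max.gap = 1.5)
plot(g$t, g$x, type = "l")
```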

Duncan Murdoch



Re: [R] Running code sequentially from separate scripts (but not functions)

2011-02-24 Thread rex.dwyer
You don't need to write functions to source files:

source("code1.R")
source("code2.R")
source("code3.R")

When you source a file with a bunch of function definitions, the definitions 
are just assignment statements:

f <- function (x)...
g <- function (x,y,z) ...

Did you think you would break your computer if you just tried this to see if it 
worked?  :)
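
For anyone who wants to try it without touching real files, here is a self-contained sketch. It writes the three example scripts from the question into a temporary directory (the file locations are placeholders) and sources them in order:

```r
# Write the three example scripts to a temporary directory.
dir <- tempdir()
writeLines("a = 3; b = 5; c = a + b", file.path(dir, "code1.R"))
writeLines("d = 10; e = d - c",       file.path(dir, "code2.R"))
writeLines("result = e/a",            file.path(dir, "code3.R"))

# source() evaluates each file in the global environment by default,
# so later scripts see the variables created by earlier ones.
for (f in file.path(dir, c("code1.R", "code2.R", "code3.R"))) source(f)

result
```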



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Dimitri Liakhovitski
Sent: Thursday, February 24, 2011 10:22 AM
To: r-help
Subject: [R] Running code sequentially from separate scripts (but not functions)

Hello!

I am wondering if it's possible to run - in sequence - code that is
stored in several R scripts.
For example:

Script in the file code1.r contains the code:
a = 3; b = 5; c = a + b

Script in the file code2.r contains the code:
d = 10; e = d - c

Script in the file code3.r contains the code:
result=e/a

I understand that I could write those 3 scripts as 3 functions and
source them from another script.
But maybe there is a way of having them run one by one as such?

Thanks a lot!

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com



Re: [R] weighted Voronoi diagrams

2011-02-24 Thread rex.dwyer
One way to do Dirichlet tessellations is through the dual Delaunay 
triangulation: map each point (x,y) to the point (x,y,x^2+y^2)  (I think, it's 
been a while) and then find the convex hull of these points in 3 dimensions.  
You can do the Voronoi diagram of circles (the power diagram) by mapping 
(x,y,r) to (x,y,x^2+y^2-r^2).  I would try assigning an r to each point so that 
the area of the circle (proportional to r^2) is proportional to the size of the 
subject tree.  You will need to scale the r^2 values so that none of the 
polygons disappear.  See papers by Aurenhammer and/or Edelsbrunner from around 
1990. I've been out of the field for a long time, so there may be more recent 
stuff.  Hopefully, there is an R package for convex hull in three dimensions.  
Possibly, there is someone in the CS department at UMn who does computational 
geometry who could assist you.  Maybe someone else knows of an R package that 
will do exactly what you want.
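
To get a feel for the weighting before hunting for a convex hull package, here is a brute-force sketch. It does not use the lifting construction; it just labels grid points by smallest power distance (squared distance minus r^2), which is the rule the circle diagram is based on. The tree data and the names trees and power_dist are invented for illustration:

```r
# Hypothetical tree data: coordinates and a weight r scaled from tree size.
trees <- data.frame(x = c(2, 7, 5), y = c(2, 3, 8), r = c(1, 2, 0.5))

# Power distance from a query point (px, py) to every weighted site.
power_dist <- function(px, py, sites) {
  (px - sites$x)^2 + (py - sites$y)^2 - sites$r^2
}

# Label each grid point with the index of the site of smallest power distance.
grid <- expand.grid(x = seq(0, 10, by = 0.25), y = seq(0, 10, by = 0.25))
grid$cell <- mapply(function(px, py) which.min(power_dist(px, py, trees)),
                    grid$x, grid$y)

# Approximate share of the plot assigned to each tree.
table(grid$cell) / nrow(grid)
```

The midpoint between trees 1 and 2, (4.5, 2.5), falls in the cell of tree 2 because tree 2 has the larger r, which is the shift of the bisector asked about.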
HTH
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Tuomas Aakala
Sent: Thursday, February 24, 2011 10:35 AM
To: r-help@r-project.org
Subject: [R] weighted Voronoi diagrams

Dear R-users,

Does anyone know how to do weighted Voronoi diagrams (Dirichlet
tessellation) in R? To be more specific, I have a set of coordinates for
tree locations on a plot, and I'm looking for a way to do the
tessellation so that the polygon size for each tree depends on the size
of the subject tree, and the size of its neighbors. So, the location of
the bisection between two trees would not necessarily be at the
midpoint, but determined by the tree sizes. I have looked through the
options in tripack and deldir-packages, but editing the functions in
those packages is beyond my skills.

Thanks,

Tuomas



Re: [R] How to find points of intersection

2011-02-22 Thread rex.dwyer
How is the curve defined?  If the curve is y=f(x) and the line is y=mx+b, you 
look for the roots of f(x)-mx-b.
?polyroot
?uniroot
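
A minimal sketch with uniroot, using sin(x) as a stand-in curve and y = 0.5x as the line (both are assumptions chosen only to have something to run). For a polynomial curve, polyroot on the coefficients of f(x)-mx-b would be the analogous call.

```r
f <- function(x) sin(x)            # example curve
m <- 0.5; b <- 0                   # example line y = m*x + b
h <- function(x) f(x) - m * x - b  # zero exactly where curve and line meet

# uniroot needs an interval on which h changes sign: h(1) > 0, h(3) < 0.
r <- uniroot(h, interval = c(1, 3))
c(x = r$root, y = m * r$root + b)
```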

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of FMH
Sent: Tuesday, February 22, 2011 6:28 AM
To: r-help@r-project.org
Subject: [R] How to find points of intersection

Dear All,

I'm looking for an appropriate way in R to compute/estimate points of
intersection between a line and a curve and would really appreciate any
suggestions or ideas.

Thank you,
Fir






Re: [R] Discrepancies in run times

2011-02-22 Thread rex.dwyer
My surmise would be that you have not analyzed the situation correctly, and you 
are making a false assumption about your code.  Since you can't show the code, 
it's pretty hard to figure out what that is.  I think you're going to have to 
produce a simple example that you can share that has the same behavior.  My 
guess is that you will answer your own question as you try to do that.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Sébastien Bihorel
Sent: Tuesday, February 22, 2011 12:16 PM
To: r-help@r-project.org
Subject: [R] Discrepancies in run times

Dear R-users,

I am in the process of creating new custom functions and am quite puzzled by
some discrepancies in execution time when I run some R scripts that call
those new functions. So here is the situation:
- let's assume I have created two custom functions, called myg and myf;
- myg is mostly a plotting function, which makes heavy use of grid and
lattice functions;
- myf is a function that massages data, opens and closes graph devices, and
passes the data to myg:
  * myf contains loops and sub-loops which subset the data into the little
pieces necessary for plotting purposes;
  * the innermost loop in myf contains two calls to myg, one in section A
of the code, one in section B of the code;
  * both sections can be turned on and off based upon an input of the myf
function;
  * both sections pass the same data to myg, except for some graph
settings;
  * all graph devices opened in section A are closed before section B starts,
and all graph devices opened in section B are closed before the next
iteration of the inner loop.

Running a script passing a particular set of data to myf and turning on both
section A and B takes around 9 minutes (~3 combined minutes for Section A,
~6 combined minutes for Section B). The results of R CMD Rprof indicate
that most of the execution time is used by print (see extract below).
     %      total        %       self
 total    seconds     self    seconds    name
  99.4     545.84      0.0       0.00    myf
  95.2     522.70      0.0       0.06    myg
  90.7     498.20      0.0       0.02    standardGeneric
  90.6     497.70      0.0       0.00    print
  90.5     497.32      5.3      29.06    printFunction
  90.5     497.32      0.0       0.00    print.trellis
  62.7     344.58     62.6     343.96    lattice.setStatus
...
     %       self        %      total
  self    seconds    total    seconds    name
  62.6     343.96     62.7     344.58    lattice.setStatus
   5.6      30.86      7.6      41.60    .Call.graphics
   5.3      29.06     90.5     497.32    printFunction
   3.5      19.12      3.7      20.22    .Call
   2.3      12.52      2.3      12.52    $
   1.3       7.18      1.9      10.42    match
   1.3       6.98      1.3       6.98    dev.off
...

Running another script passing the same set of data to myf and turning
section A on and section B off takes around 3 minutes. The results of R CMD
Rprof also indicate that most of the execution time is used by print (see
extract below).
     %      total        %       self
 total    seconds     self    seconds    name
  98.1     177.16      0.0       0.00    myf
  93.3     168.40      0.0       0.00    myg
  85.0     153.50      0.1       0.10    standardGeneric
  84.7     152.94      0.0       0.02    print
  84.6     152.74      0.0       0.00    print.trellis
  84.6     152.72      4.5       8.04    printFunction
  51.3      92.66     51.3      92.58    lattice.setStatus
...
     %       self        %      total
  self    seconds    total    seconds    name
  51.3      92.58     51.3      92.66    lattice.setStatus
   8.5      15.34     10.7      19.36    .Call.graphics
   6.6      11.96      6.9      12.50    .Call
   4.5       8.04     84.6     152.72    printFunction
   3.4       6.14      3.4       6.14    $
   2.1       3.72      3.0       5.40    match
   0.8       1.46      0.8       1.46    dev.off
...
Running another script passing the same set of data to myf and turning
section A off and section B on takes around 3 minutes. The results of R CMD
Rprof also indicate that most of the execution time is used by print (see
extract below).
     %      total        %       self
 total    seconds     self    seconds    name
  98.1     175.00      0.0       0.00    myf
  90.7     161.82      0.0       0.00    myg
  86.8     154.90      0.0       0.06    standardGeneric
  86.5     154.32      0.0       0.02    print
  86.4     154.16      4.0       7.18    printFunction
  86.4     154.16      0.0       0.00    print.trellis
  52.6      93.76     52.5      93.62    lattice.setStatus
...
     %       self        %      total
  self    seconds    total    seconds    name
  52.5      93.62     52.6      93.76    lattice.setStatus
   8.6      15.28     10.9      19.40    .Call.graphics
   4.2       7.58      4.5       7.98    .Call
   4.0       7.18     86.4     154.16    printFunction
   3.1       5.56      3.1 

Re: [R] How to find points of intersection between harmonic function and a line

2011-02-22 Thread rex.dwyer
How is the curve represented?  That's more important than its 
organ-of-origin. If you have values of y=f(x) at discrete time points, then 
y-(x+2) will change sign sometimes... the intersection point is at some time x' 
in between.  Am I missing something subtle here?
You could interpolate the time more precisely in many different ways, e.g., a 
spline -- read help(spline).
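
A sketch of the sign-change idea with plain linear interpolation between the two bracketing samples. The sampled wave below is simulated (an assumption standing in for the recorded heartbeat), and x.cross / y.cross are names made up here:

```r
# Simulated samples standing in for the recorded wave.
x <- seq(0, 10, by = 0.01)
y <- 6 + 5 * sin(2 * x)

d <- y - (x + 2)                        # curve minus line
i <- which(d[-1] * d[-length(d)] < 0)   # sign change between samples i and i+1

# Linear interpolation of the crossing time inside each bracketing interval.
x.cross <- x[i] - d[i] * (x[i + 1] - x[i]) / (d[i + 1] - d[i])
y.cross <- x.cross + 2
cbind(x.cross, y.cross)
```

A sample that lands exactly on the line (d equal to zero) is not caught by the strict sign test and would need a separate check.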

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of FMH
Sent: Tuesday, February 22, 2011 1:21 PM
To: r-help@r-project.org
Cc: lig...@statistik.tu-dortmund.de
Subject: [R] How to find points of intersection between harmonic function and a 
line

Hi,

Sorry for the very short explanation about the problem of intersection.

I have a wave function monitored from the heart beat in a particular interval of

times. Apart fom that, there is a line with positive slope (e.g: y = x+2) which
lies across the wave and intersect on a number of points. My problem is i have
no exact equation for such a complex harmonic wave produced by the heart pulse
and so, cannot manage to find the intersection points.


Therefore, i would be very grateful if someone could give some ideas or might
suggest any packages in  R that can assist me to do so.

Thank you,
Fir






Re: [R] Building an array from matrix blocks

2011-02-21 Thread rex.dwyer
Well, you can lose B by just adding to X in the first for-loop, can't you?
for (...) X <- X + A[...]

But if you want elegance, you could try:

X = Reduce("+", lapply(1:(p+1), function(i) A[i:(n-p-1+i),i:(n-p-1+i)]))

I imagine someone can be even more eleganter than this.
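
A runnable check of the Reduce approach against the n = 6, p = 3 example from the question; m is just shorthand introduced here for the block size n-p:

```r
n <- 6; p <- 3
A <- matrix(1:(n^2), n, n, byrow = TRUE)
m <- n - p   # block size

# Sum of the (p+1) diagonal m-by-m blocks, without building the array B.
X <- Reduce(`+`, lapply(0:p, function(k) A[(1:m) + k, (1:m) + k]))
X
```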

rad

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Eduardo de Oliveira Horta
Sent: Saturday, February 19, 2011 9:49 AM
To: r-help
Subject: [R] Building an array from matrix blocks

Hello,

I've googled for a while and couldn't find anything on this topic: say
I have a matrix A and want to build matrices B1, B2,... using blocks
from A (or equivalently an array B with B[,,i] being a block from A),
and that I must sum the B[,,i]'s.

I've come up with this rather non-elegant code:

 n = 6
 p = 3

 A <- matrix(1:(n^2), n, n, byrow=TRUE)

 B <- array(0, c(n-p, n-p, p+1))
 for (i in 1:(p+1)) B[,,i] <- A[i:(n-p-1+i), i:(n-p-1+i)]

 X <- matrix(0, n-p, n-p)
 for (i in 1:(p+1)) X <- X + B[,,i]
 A
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    3    4    5    6
[2,]    7    8    9   10   11   12
[3,]   13   14   15   16   17   18
[4,]   19   20   21   22   23   24
[5,]   25   26   27   28   29   30
[6,]   31   32   33   34   35   36
 B
, , 1

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    7    8    9
[3,]   13   14   15

, , 2

     [,1] [,2] [,3]
[1,]    8    9   10
[2,]   14   15   16
[3,]   20   21   22

, , 3

     [,1] [,2] [,3]
[1,]   15   16   17
[2,]   21   22   23
[3,]   27   28   29

, , 4

     [,1] [,2] [,3]
[1,]   22   23   24
[2,]   28   29   30
[3,]   34   35   36

 X
     [,1] [,2] [,3]
[1,]   46   50   54
[2,]   70   74   78
[3,]   94   98  102

Note that the blocks B[,,i] are obtained by sweeping the diagonal of
A. I wonder if there is a better and faster way to achieve this using
block matrix operations for instance. Actually what matters most for
me is getting to the matrix X, so if it is possible to do this without
having to construct the array B it would be ok as well...

Interesting observation:

 system.time(for (j in 1:1) {X <- matrix(0, n-p, n-p); for (i in 1:(p+1)) 
 X <- X + B[,,i]})
   user  system elapsed
   0.27    0.00    0.26
 system.time(for (j in 1:1) {X <- apply(B,c(1,2),sum)})
   user  system elapsed
   1.82    0.02    1.86

Thanks in advance, and best regards,

Eduardo Horta

 sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-mingw32

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Revobase_4.2.0   RevoScaleR_1.1-1 lattice_0.19-13

loaded via a namespace (and not attached):
[1] grid_2.11.1   pkgXMLBuilder_1.0 revoIpe_1.0   tools_2.11.1
[5] XML_3.1-0



Re: [R] question about generics

2011-02-21 Thread rex.dwyer
?InternalMethods
?S3groupGeneric
?S4groupGeneric
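
There is no single is.Generic, but the three kinds can be tested separately. A sketch, assuming a reasonably recent R (utils::isS3stdGeneric does not exist in old versions such as the 2.x series):

```r
library(methods)

# S3: a closure whose body calls UseMethod() is a standard S3 generic.
s3 <- unname(utils::isS3stdGeneric(print))

# Internal generics are primitives that dispatch in C code (see ?InternalMethods).
prim <- is.primitive(length)

# S4: isGeneric() takes the function name as a character string.
s4 <- isGeneric("show")

c(S3 = s3, primitive = prim, S4 = s4)
```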



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Erin Hodgess
Sent: Friday, February 18, 2011 10:45 PM
To: R help
Subject: [R] question about generics

Dear R People:

Is there a way to determine which functions are generics, please?  I
looked for something like is.Generic, but no luck.

Thanks in advance!

Sincerely,
Erin


--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



Re: [R] speed up the code

2011-02-18 Thread rex.dwyer
Yes, remove the call to intersect, and rely on the results of match to tell you 
whether there is an overlap.  If there are any matches,  all(is.na(index)) will 
be false.  Read help for match.

?match
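
A sketch of what that could look like. fun_activation2 is a name invented here, and the rewrite assumes neither s_k nor s_hat contains duplicates (true for sorted samples without replacement, as in the posted code); with duplicates, intersect and match would count differently:

```r
fun_activation2 <- function(s_k, s_hat, x_k, s_hat_len) {
  index <- match(s_hat, s_k)      # positions in s_k of the elements of s_hat
  index <- index[!is.na(index)]   # keep only the elements that were found
  if (length(index) == 0) return(0)
  round(sum(x_k[index]) * length(index) / (s_hat_len * length(s_k)), 3)
}

s_k <- c(3, 28, 35, 40)
x_k <- c(0.10, 0.20, 0.30, 0.40)
fun_activation2(s_k, c(28, 34, 35), x_k, 3)
```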


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Hui Du
Sent: Wednesday, February 16, 2011 6:29 PM
To: r-help@r-project.org
Subject: [R] speed up the code


Hi All,

The following is a snippet of my code. It works fine but it is very slow. Is it 
possible to speed it up by using different data structure or better solution? 
For 4 runs, it takes 8 minutes now. Thanks a lot



fun_activation = function(s_k, s_hat, x_k, s_hat_len)
{
    common = intersect(s_k, s_hat);
    if(length(common) != 0)
    {
        index = match(common, s_k);
        round(sum(x_k[index]) * length(common) / (s_hat_len * length(s_k)), 3);
    }
    else
    {
        0;
    }
}

fun_x = function(a)
{
    round(runif(length(a), 0, 1), 2);
}

symbol_len = 50;
PHI_set = 1:symbol_len;

# M (the matrix dimension) is not defined in this snippet.
S = matrix(replicate(M * M, sort(sample(PHI_set, sample(symbol_len, 1)))), M, M);
X = matrix(mapply(fun_x, S), M, M);

S_hat = c(28, 34, 35)
S_hat_len = length(S_hat);

S_hat_matrix = matrix(list(S_hat), M, M);

system.time(
    for(I in 1:4)
    {
        A = matrix(mapply(fun_activation, S, S_hat_matrix, X, S_hat_len), M, M);
    }
)



HXD




Re: [R] sort a 3 dimensional array across third dimension ?

2011-02-18 Thread rex.dwyer
Although I suggested to someone else that for-loops be avoided, they are not in 
the inner loop in this code, and it's probably easier to understand than some 
sort of apply:

a = array(round(100*runif(60)),dim=c(3,4,5))
a
for (i in 1:dim(a)[1])
 for (j in 1:dim(a)[2])
  a[i,j,] = sort(a[i,j,])
a

Is that what you want?

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Maas James Dr (MED)
Sent: Friday, February 18, 2011 8:01 AM
To: r-help@r-project.org
Subject: [R] sort a 3 dimensional array across third dimension ?

I'm attempting to sort a 3 dimensional array that looks like this
 x
, , 1
     [,1] [,2]
[1,]    9    9
[2,]    7    9
, , 2
     [,1] [,2]
[1,]    6    5
[2,]    4    6
, , 3
     [,1] [,2]
[1,]    2    1
[2,]    3    2

Such that it ends up like this 
 y
, , 1
     [,1] [,2]
[1,]    2    1
[2,]    3    2
, , 2
     [,1] [,2]
[1,]    6    5
[2,]    4    6
, , 3
     [,1] [,2]
[1,]    9    9
[2,]    7    9

I think this is sorting across the third dimension but several attempts using 
either the sort or apply functions have not worked.  Any and all suggestions 
most welcome.  Thanks

J

===
Dr. Jim Maas
University of East Anglia




Re: [R] sort a 3 dimensional array across third dimension ?

2011-02-18 Thread rex.dwyer
I was going to say:

The problem with for-loops (as best I understand it) is that the R code gets 
interpreted over and over; what you normally want to do is design the 
computation so that you jump into the internals of R and stay there.  But the 
inner loop is in the R internals of the sort in this case. If the third 
dimension is even just moderately large, the cost of interpretation is small 
relative to the cost of the sort.

But my quick experiments don't exactly bear that out:

 foo = runif(1)
 system.time(for (i in 1:1000) sort(foo))
   user  system elapsed
   1.60    0.00    1.61
 system.time(for (i in 1:1000) for (j in 1:1) k=k+1)
   user  system elapsed
   7.52    0.00    7.54

I imagine I could find a prettier way with various flavors of apply, if my 
employer didn't have other things for me to do.

Maybe someone else can explain why the for loop is so slow that the overhead to 
increment the index is greater than sorting 1 doubles.  I know it used to 
be even slower in splus than in R.
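
For what it's worth, one apply flavor might look like the sketch below. It is shorter, but apply still loops internally, so it should not be assumed to be faster than the explicit loops; b and a2 are names used only for this comparison.

```r
a <- array(round(100 * runif(60)), dim = c(3, 4, 5))

# apply() puts the sorted vectors along the FIRST dimension (5 x 3 x 4),
# so aperm() is needed to move that dimension back to the end.
b <- aperm(apply(a, c(1, 2), sort), c(2, 3, 1))

# Reference result from the explicit double loop.
a2 <- a
for (i in 1:dim(a)[1])
  for (j in 1:dim(a)[2])
    a2[i, j, ] <- sort(a[i, j, ])

all(b == a2)
```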


-Original Message-
From: Maas James Dr (MED) [mailto:j.m...@uea.ac.uk]
Sent: Friday, February 18, 2011 10:06 AM
To: Dwyer Rex USRE; r-help@r-project.org
Subject: RE: sort a 3 dimensional array across third dimension ?

Hi Rex,

Thanks, this is exactly what I want but have to do it with many big arrays ... 
thus if there were a way to do it with a vectorized function would it not be a 
lot more efficient?

Much appreciated!

J

Subject: RE: sort a 3 dimensional array across third dimension ?

Although I suggested to someone else that for-loops be avoided, they are
not in the inner loop in this code, and it's probably easier to
understand than some sort of apply:

a = array(round(100*runif(60)),dim=c(3,4,5))
a
for (i in 1:dim(a)[1])
 for (j in 1:dim(a)[2])
  a[i,j,] = sort(a[i,j,])
a

Is that what you want?

Subject: [R] sort a 3 dimensional array across third dimension ?

I'm attempting to sort a 3 dimensional array that looks like this
 x
, , 1
     [,1] [,2]
[1,]    9    9
[2,]    7    9
, , 2
     [,1] [,2]
[1,]    6    5
[2,]    4    6
, , 3
     [,1] [,2]
[1,]    2    1
[2,]    3    2

Such that it ends up like this 
 y
, , 1
     [,1] [,2]
[1,]    2    1
[2,]    3    2
, , 2
     [,1] [,2]
[1,]    6    5
[2,]    4    6
, , 3
     [,1] [,2]
[1,]    9    9
[2,]    7    9

I think this is sorting across the third dimension but several attempts
using either the sort or apply functions have not worked.  Any and all
suggestions most welcome.  Thanks

J

===
Dr. Jim Maas
University of East Anglia







Re: [R] calling pairs of variables into a function

2011-02-17 Thread rex.dwyer
Try putting d,e,f in a list:
xxx = list(d,e,f)

for (i in 1:length(xxx))
  for (j in 1:length(xxx))
    if (i!=j) bigfunction(xxx[[i]],xxx[[j]])
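
If the results need to be kept, a variant that records which pair produced what may be handy. This is a sketch only: the three data frames and bigfunction below are stand-ins, and tables / pairs are names made up here.

```r
# Stand-ins for the three imported tables and the real function.
d <- data.frame(v = 1:3); e <- data.frame(v = 4:6); f <- data.frame(v = 7:9)
bigfunction <- function(a, b) sum(a$v) - sum(b$v)

tables <- list(d = d, e = e, f = f)

# All ordered pairs (first, second) with first != second.
pairs <- expand.grid(first = names(tables), second = names(tables),
                     stringsAsFactors = FALSE)
pairs <- pairs[pairs$first != pairs$second, ]

pairs$result <- unname(mapply(
  function(i, j) bigfunction(tables[[i]], tables[[j]]),
  pairs$first, pairs$second))
pairs
```

Adding more files then only means adding entries to the list.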


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of squamous
Sent: Wednesday, February 16, 2011 7:11 PM
To: r-help@r-project.org
Subject: [R] calling pairs of variables into a function


Hi,
I have imported three text files into R using read.table.  Their variables
are called d, e and f.

I want to run a function on all the possible combinations of these three
files.  The only way I know how to do that is like this:

bigfunction(d,e)
bigfunction(d,f)
bigfunction(e,d)
bigfunction(e,f)
bigfunction(f,e)
bigfunction(f,d)

Is there an easier way?  I will have five files later on, so it would be
useful to know!  I'd imagine I can use a loop somehow, and I have installed
a package (gregmisc) so that typing permutations(3,2) gives all the possible
pairs of three numbers, but I don't know how to combine these things to make
it work.
--
View this message in context: 
http://r.789695.n4.nabble.com/calling-pairs-of-variables-into-a-function-tp3309993p3309993.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] covar

2011-02-17 Thread rex.dwyer
I hate to sound like David "Have You Read The Posting Guide?" Winsemius, but 
there's no way for anyone to know what you are trying to accomplish here 
without a lot more information.  You don't show us the output you expect and 
the output you got.  I would have expected relatedness to be on a scale from 0 
to 1, but it's clear that you'll get values > 1 in this program.
To use R effectively, you need to rephrase your computation as a matrix 
computation.  People generally use R at least partly to avoid debugging index 
computations in for-loops.  For-loops are also much slower than the 
corresponding matrix operations in R.  If you want to use for-loops, you can 
always put in some prints and trace what's going on, just like the old days!

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Val
Sent: Wednesday, February 16, 2011 3:14 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] covar

Hi all,

I want to construct relatedness among individuals and have a look at the
following script.

#
rm(list=ls())

N=5
id   = c(1:N)
dad  = c(0,0,0,3,3)
mom  = c(0,0,2,1,1)
sex  = c(2,2,1,2,2) # 1= M and 2=F

A=diag(nrow = N)
for(i in 1:N){
  for(j in i:N)  {
    ss = dad[j]
    dd = mom[j]
    sx = sex[j]
    if( ss > 0 && dd > 0 )
    {
      if(i == j)
        { A[i,i] = 1 + 0.5*A[ss,dd] }
      else
        { A[i,j] = A[i,ss] + 0.5*(A[i,dd])
          A[j,i] = A[i,j] }
    }
  } # inner for loop
} # outer for loop
A

If the sex is male ( sex=1)  then I want to set A[i,i]=0.5*A[ss,dd]
If it is female ( sex=2) then A[i,i] = 1 + 0.5*A[ss,dd]


How do I do it ?

I tried several cases but it did not work for me. Your assistance is
highly appreciated in advance.

Thanks



Re: [R] String manipulation

2011-02-16 Thread rex.dwyer
A quick way to do this is to replace \d and \D with the character classes 
[0-9.] and [^0-9.].  This assumes that there is no scientific notation and that 
there is nothing like 123.45.678 in the string.  You did not account for a 
leading minus sign.
The book Mastering Regular Expressions is probably worth the expense if you are 
going to be doing a lot of this, even though similar content can be gleaned 
online.
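
A base-R sketch of the character-class idea, without gsubfn. split_runs is a name invented here; like the advice above, it assumes no scientific notation, no minus signs, and no strings like 123.45.678:

```r
split_runs <- function(s) {
  # Alternating runs of [0-9.] (numbers, possibly decimal) and everything else.
  regmatches(s, gregexpr("[0-9.]+|[^0-9.]+", s))[[1]]
}

split_runs("ABCFR34564IJVEOJC3434.36453")
split_runs("ABCFR34564.354IJVEOJC3434.36453")
```

The numeric pieces can then be converted with as.numeric where needed.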

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Megh Dal
Sent: Sunday, February 13, 2011 4:42 PM
To: Gabor Grothendieck
Cc: r-help@r-project.org
Subject: Re: [R] String manipulation

Hi Gabor, thanks (and Jim as well) for your suggestion. However this is not
working properly for the following string:

 MyString <- "ABCFR34564IJVEOJC3434.36453"
 strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
[1] "ABCFR"   "34564"   "IJVEOJC" "3434"

The 4th group is a decimal number here, but its fractional part is not
captured.

The same kind of unintended result occurs here as well:

 MyString <- "ABCFR34564.354IJVEOJC3434.36453"
 strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
[1] "ABCFR"   "34564"   "."       "354"     "IJVEOJC" "3434"    "."
[8] "36453"
Can you please tell me how I can modify that?

Thanks,


On Sun, Feb 13, 2011 at 11:10 PM, Gabor Grothendieck 
ggrothendi...@gmail.com wrote:

  On Sun, Feb 13, 2011 at 10:27 AM, Megh Dal megh700...@gmail.com wrote:
  Please consider following string:
 
  MyString <- "ABCFR34564IJVEOJC3434"
 
  Here you see that, there are 4 groups in above string. 1st and 3rd groups
  are for english letters and 2nd and 4th for numeric. Given a string, how
 can
  I separate out those 4 groups?
 

 Try this.  \\D+ and \\d+ match non-digits and digits respectively.
  The portions within parentheses are captures and passed to the c
 function.  It returns a list with a component for each element of
 MyString.  Like R's split it returns a list with a component per
 element of MyString but MyString only has one element so we get its
 contents using  [[1]].

  library(gsubfn)
  strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", c)[[1]]
 [1] "ABCFR"   "34564"   "IJVEOJC" "3434"

 Alternately we could convert the relevant portions to numbers at the
 same time.  ~ list(...) is interpreted as a  function whose body is
 the right hand side of the ~ and whose arguments are the free
 variables, i.e. s1, s2, s3 and s4.

 strapply(MyString, "(\\D+)(\\d+)(\\D+)(\\d+)", ~ list(s1,
 as.numeric(s2), s3, as.numeric(s4)))[[1]]

 See http://gsubfn.googlecode.com for more.

 --
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com




Re: [R] string parsing

2011-02-16 Thread rex.dwyer
It's only awfully inefficient if it's a bottleneck.  You're not doing this 
more than once per item fetched from the network, and the time is insignificant 
relative to the fetch.  If it were somehow in your inner loop, it would be 
worth worrying about, but your purpose is to eliminate Ms and Bs so that you'll 
never ever see them again. If performance is a problem, look at your inner 
loop, not here.
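
One way to check whether the two substitutions matter is simply to time them. The sketch below restates parse.num from the quoted message and runs it on a vector much longer than a single quote fetch would return; the vector length is an arbitrary choice for illustration, and no particular timing is claimed.

```r
# parse.num as posted in the thread: turn a trailing B or M into an exponent.
parse.num <- function(s) as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)))

caps <- c("200.5B", "19.1M", "226.6B", "886.4M", "142.4B", "276.4M", "55.823B")
parse.num(caps)

# Arbitrary stress test: 70,000 strings instead of 7.
big <- rep(caps, 10000)
system.time(parsed <- parse.num(big))
```

If the elapsed time is small next to the network fetch, the double gsub is not where the effort should go.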


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Mike Marchywka
Sent: Tuesday, February 15, 2011 9:01 PM
To: s...@gnu.org; r-h...@stat.math.ethz.ch
Subject: Re: [R] string parsing








 To: r-h...@stat.math.ethz.ch
 From: s...@gnu.org
 Date: Tue, 15 Feb 2011 17:20:11 -0500
 Subject: [R] string parsing

 I am trying to get stock metadata from Yahoo finance (or maybe there is
 a better source?)

search this for yahoo,

http://cran.r-project.org/web/packages/quantmod/quantmod.pdf

as a perennial page scraper, I was amazed this existed :)


 here is what I did so far:

 yahoo.url <- "http://finance.yahoo.com/d/quotes.csv?f=j1jka2&s="
 stocks <- c("IBM","NOIZ","MSFT","LNN","C","BODY","F") # just some samples
 socket <- url(paste(yahoo.url, sep="", paste(stocks, collapse="+")), open="r")
 data <- read.csv(socket, header = FALSE)
 close(socket)
 data is now:
        V1     V2     V3        V4
 1  200.5B 116.00 166.25   4965150
 2   19.1M   3.75   5.47      8521
 3  226.6B  22.73  31.58  57127000
 4  886.4M  30.80  74.54    226690
 5  142.4B   3.21   5.15 541804992
 6  276.4M  11.98  21.30    149656
 7 55.823B   9.75  18.97  89369000

 now I need to do this:

 -- convert 55.823B to 55e9 and 19.1M to 19e6

 parse.num <- function (s) { as.numeric(gsub("M$","e6",gsub("B$","e9",s))); }
 data[1] <- lapply(data[1],parse.num);

 seems like awfully inefficient (two regexp substitutions),
 is there a better way?

 -- iterate over stocks  data at the same time and put the results into
 a hash table:
 for (i in 1:length(stocks)) cache[[stocks[i]]] <- data[i,];

 I do get the right results,
 but I am wondering if I am doing it the right R way.
 E.g., the hash table value is a data frame.
 A structure(record?) seems more appropriate.

 thanks!

 --
 Sam Steingold (http://sds.podval.org/) on CentOS release 5.3 (Final)
 http://pmw.org.il http://ffii.org http://camera.org http://honestreporting.com
 http://iris.org.il http://mideasttruth.com http://thereligionofpeace.com
 I haven't lost my mind -- it's backed up on tape somewhere.



Re: [R] A Math question

2011-02-16 Thread rex.dwyer
If y'all want to discuss this more, do it somewhere else, please.
This has little to do with R except that both depend on Peano's Axioms.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of jlu...@ria.buffalo.edu
Sent: Tuesday, February 15, 2011 12:46 PM
To: Kjetil Halvorsen
Cc: r-help@r-project.org; r-help-boun...@r-project.org; Maithula Chandrashekhar
Subject: Re: [R] A Math question

Kjetil et al,
Unlike finite sums, infinite sums are not commutative. To have
commutativity, one must have absolute summability, that is, the sum of the
absolute values of the terms must be finite.  If one has absolute
summability, the infinite sum exists and is unique. This sum is not
absolutely summable and thus undefined.   If one does not require
commutativity, then the order of the summation must be specified.  The
order is often implicitly assumed to be the order of the integers. The sum
of the negative integers is negative infinity, the sum of the positive
integers  is infinity, and the sum of these two sums is undefined.
However, Riemann's rearrangement theorem shows that the terms can be
re-ordered to yield any sum whatsoever.  In particular, if one creates
pairs of terms consisting of  a positive integer and its negative, then
the infinite sum is zero.   So the unique sum is undefined; otherwise the
sum depends on the order of addition.
Joe
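
A small numeric illustration of the order dependence, written as a sketch rather than a proof. The two orderings below are hypothetical choices. Note that the running totals of the individual terms never settle, because the terms do not shrink toward zero; it is only the grouped pairs (n plus -n) that each sum to zero.

```r
# Ordering 1: each positive integer followed by its negative: 1, -1, 2, -2, ...
n <- 1:5
paired <- as.vector(rbind(n, -n))
cumsum(paired)    # running totals return to 0 after every pair

# Ordering 2 (hypothetical): two positives, then one negative: 1, 2, -1, 3, 4, -2, ...
pos <- 1:10
neg <- -(1:5)
skewed <- as.vector(rbind(pos[c(TRUE, FALSE)], pos[c(FALSE, TRUE)], neg))
cumsum(skewed)    # running totals drift upward instead
```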



David Winsemius dwinsem...@comcast.net
Sent by: r-help-boun...@r-project.org
02/15/2011 09:17 AM

To
Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com
cc
r-help@r-project.org, Maithula Chandrashekhar
m.chandrashekhar1...@gmail.com
Subject
Re: [R] A Math question







On Feb 14, 2011, at 7:33 PM, Kjetil Halvorsen wrote:

 or even better:

 http://mathoverflow.net/

I beg to differ. That is designated in its FAQ as expecting research
level questions, while the forum I offered is labeled as "Welcome to
Q&A for people studying math at any level and professionals in related
fields." I don't think the proffered question could be considered
research level.


 On Sun, Feb 13, 2011 at 8:02 PM, David Winsemius dwinsem...@comcast.net

  wrote:

 On Feb 13, 2011, at 4:47 PM, Maithula Chandrashekhar wrote:

 Dear all, I admit this is not anything to do R and even with
 Statistics perhaps. Strictly speaking this is a math related
 question.
 However I have some reasonable feeling that experts here would
 come up
 with some elegant suggestion to my question.

 Here my question is: What is sum of all Integers? I somewhere heard
 that it is Zero as positive and negative integers will just cancel
 each other out. However want to know is it correct?

 There are more appropriate places to pose such questions:
 http://math.stackexchange.com/


David Winsemius, MD
West Hartford, CT



Re: [R] cycle in a directed graph

2011-02-11 Thread rex.dwyer
If the graph has n nodes and is represented by an adjacency matrix, you can 
square the matrix (log_2 n)+1 times.  Then you can multiply the matrix 
element-wise by its transpose.  The positive entries in the 7th row will tell 
you all nodes sharing a cycle with node 7.  This assumes all edge weights are 
positive.
Are you sure we're not doing your graph theory homework?  You asked about MSTs 
yesterday.
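
The squaring recipe can be sketched as follows. Two details here are assumptions added for the sketch, not part of the message above: the identity is added before squaring so that walks shorter than the final power are kept, and the matrix is clamped to 0/1 after each product so the entries cannot overflow. The 5-node graph is made up for illustration.

```r
# Hypothetical graph: 1 -> 2 -> 3 -> 1 is a cycle; 3 -> 4 -> 5 is a tail.
n <- 5
A <- matrix(0, n, n)
A[cbind(c(1, 2, 3, 3, 4), c(2, 3, 1, 4, 5))] <- 1

R <- diag(n) + A                          # self-loops keep shorter walks
for (k in seq_len(ceiling(log2(n)) + 1)) {
  R <- (R %*% R > 0) * 1                  # square, then clamp to 0/1
}

mutual <- R * t(R)                        # 1 iff i reaches j and j reaches i

# Nodes that share a cycle with node i (excluding i itself).
shares.cycle <- function(i) setdiff(which(mutual[i, ] > 0), i)
shares.cycle(1)
shares.cycle(4)
```

A node with an empty result lies on no cycle unless it has a self-loop, which would show up as a nonzero diagonal entry of A.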

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of amir
Sent: Friday, February 11, 2011 10:11 AM
To: r-help@r-project.org
Subject: [R] cycle in a directed graph

Hi,

I have a directed graph and wants to find is there any cycle in it? If
it is, which nodes or edges are in the cycle.
Is there any way to find the cycle in a directed graph in R?

Regards,
Amir



Re: [R] Generate multivariate normal data with a random correlation matrix

2011-02-10 Thread rex.dwyer
If you want a random correlation matrix, why not just generate random data and 
accept the correlation matrix that you get?  The standard normal distribution 
in k dimensions is (hyper)spherically symmetric.  If you generate k standard 
normal N(0,1) variates, you have a point in k-space with direction uniformly 
distributed on the (k-1)sphere and Gaussian magnitude.  If you generate k such, 
you have a random linear transformation with all sorts of desirable symmetries. 
 So, if you generate a kxk matrix of standard normal variates, and another nxk 
standard normal variates, and multiply the two matrices to get n points in k 
space, that seems to be a pretty good definition of random correlation to me.  
I'm sure you can decompose the kxk matrix to get the theoretical distribution, 
maybe by multiplying it by its transpose and doing an SVD; I'd have to think 
about that part.
... unless you have a particular distribution of correlation matrices in mind 
to begin with, which doesn't seem to be the case.
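
A rough sketch of the construction described above, with arbitrary sizes and an arbitrary seed. It also computes the population correlation implied by the random k-by-k matrix: if each row of Z is standard normal and X = Z %*% M, then the covariance of a row of X is t(M) %*% M, which cov2cor rescales to a correlation matrix.

```r
set.seed(1)                        # arbitrary seed, for repeatability only
k <- 5
n <- 1000

M <- matrix(rnorm(k * k), k, k)    # random linear transformation
Z <- matrix(rnorm(n * k), n, k)    # n standard normal points in k-space
X <- Z %*% M                       # n correlated observations

sample.cor <- cor(X)                   # the correlation matrix you "accept"
theory.cor <- cov2cor(t(M) %*% M)      # population correlation implied by M

round(sample.cor - theory.cor, 2)      # sampling error only
```

Because t(M) %*% M is positive semi-definite by construction, the resulting correlation matrix is always valid, whatever the size of k.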


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Szumiloski, John
Sent: Wednesday, February 09, 2011 11:30 AM
To: r-help@r-project.org
Cc: Rick DeShon
Subject: Re: [R] Generate multivariate normal data with a random correlation 
matrix

The knee jerk thought I had was to express the correlation matrix as a generic 
Choleski decomposition, then randomly populate the triangular decomposed 
matrix.  When you remultiply, you can simply rescale to 1s on the diagonals.  
Then rmnorm as usual.

In R, see ?chol

If you want to get fancy, you could look at the random distribution you would 
use for the triangular matrix and play with that, including different 
distributions for different elements, elements' distributions being conditional 
on values of previously randomized elements, etc.
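
The Cholesky-style idea might look like the sketch below. The choice of standard normal entries for the triangular factor is an assumption made for illustration; any other distribution could be substituted as suggested above.

```r
set.seed(2)                        # arbitrary seed
d <- 50

# Randomly populate a lower-triangular factor (diagonal included).
L <- matrix(0, d, d)
L[lower.tri(L, diag = TRUE)] <- rnorm(d * (d + 1) / 2)

S <- L %*% t(L)                    # remultiply: positive semi-definite by construction
cormat <- cov2cor(S)               # rescale to 1s on the diagonal

range(cormat[lower.tri(cormat)])   # spread of the off-diagonal correlations
min(eigen(cormat, symmetric = TRUE)$values)
```

The resulting cormat can then be handed to rmnorm as the varcov argument.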

John

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Rick DeShon
Sent: Wednesday, 09 February, 2011 11:06 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] Generate multivariate normal data with a random correlation matrix

Hi All.

I'd like to generate a sample of n observations from a k dimensional 
multivariate normal distribution with a random correlation matrix.

My solution:
The lower (or upper) triangle of the correlation matrix has n.tri=(d/2)(d+1)-d 
entries.
Take a uniform sample of n.tri possible correlations (runif(n.tri,-.99,.99)).
Populate a triangle of the matrix with the sampled correlations.
Mirror the triangle to populate the other triangle, forming a symmetric matrix, 
cormat.
Sample n observations from a multivariate normal distribution with mean 
vector=0 and varcov=cormat.


Problem:
This approach violates the triangle inequality property of correlation 
matrices.  So, the matrix I've constructed is certainly a valid matrix but it 
is not a valid correlation matrix and it blows up when you submit it to a 
random number generator such as rmnorm.  With a small matrix you sometimes get 
lucky and generate a valid correlation matrix but as you increase d the 
probability of obtaining a valid correlation matrix drops off quickly.

So, any ideas on how to construct a correlation matrix with random entries that 
cover the range (or most of the range) of the correlation [-1,1]?

Here's the code I've used that won't work.

library(mnormt)
n <- 1000
d <- 50

n.tri <- ((d*(d+1))/2)-d
r     <- runif(n.tri, min=-.5, max=.5)

cormat <- diag(d)
count1=1
for (i in 1:d){
   for (j in 1:d){
      if (i<j) {
         cormat[i,j]=r[count1]
         cormat[j,i]=cormat[i,j]
         count1=count1+1
      }
   }
}
eigen(cormat) # if negative eigenvalue, then the matrix violates the triangle inequality

x <- rmnorm(n, rep(0, d), cormat)  # Sample the data



Thanks in advance,

Rick DeShon


Re: [R] Calculating rowMeans from different columns in each row?

2011-02-10 Thread rex.dwyer


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Marine Andersson
Sent: Thursday, February 10, 2011 3:53 PM
To: r-help@r-project.org
Subject: [R] Calculating rowMeans from different columns in each row?

Hello!

I have a dataset like this:

X1   X2   X3   X4   X5X6X7X8
1 2  2 1 2  3   2  6
2 3  2 5 7  9   1  3
19 12 6 1  1   3  6

The columns X1-X6 contains ordinary numeric values.

X7 contains the number of the first column that the rowMeans should be 
calculated from and
X8 contains the last column that should be included in the rowMeans.

when I try

test <- (df[,df$X7:df$X8])

the rowMeans are calculated based on the values in the X7 and X8 in the first 
row only.

Thanks in advance!

/Marine
__
[Dwyer Rex USRE]


Well, if you print df$X7:df$X8, you'll see why... you can't : together two 
vectors:

> c(1,2,3):c(8,10,12)
[1] 1 2 3 4 5 6 7 8
Warning messages:
1: In c(1, 2, 3):c(8, 10, 12) :
  numerical expression has 3 elements: only the first used
2: In c(1, 2, 3):c(8, 10, 12) :
  numerical expression has 3 elements: only the first used


So try: apply(df,1, function(v) {n=length(v); mean(v[v[n-1]:v[n]]) })
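
As a quick check of the apply() one-liner, here is a sketch on the first two rows of the posted data (the third row is left out because its columns are ambiguous in the archive). Each row is passed to the function as a vector v, whose last two elements give the first and last column to average.

```r
df <- data.frame(X1 = c(1, 2), X2 = c(2, 3), X3 = c(2, 2), X4 = c(1, 5),
                 X5 = c(2, 7), X6 = c(3, 9), X7 = c(2, 1), X8 = c(6, 3))

# Row 1 averages columns 2..6; row 2 averages columns 1..3.
row.means <- apply(df, 1, function(v) {
  n <- length(v)
  mean(v[v[n - 1]:v[n]])
})
row.means
```

This relies on X7 and X8 being the last two columns of the data frame.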









Re: [R] A question on Duplicating

2011-02-09 Thread rex.dwyer
ab = paste(a,b,sep=";~;~;~")
flag = length(ab)==length(unique(ab))

This should work unless you use 3 consecutive winking elephants in other places 
in your program.
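
Worth noting when trying this on the example from the question: a = c("a","b","c","a") and b = c("m","n","o","m") contain a repeated pair, so length(ab) is 4 while length(unique(ab)) is 3. If the intent is "every repeated value in a is matched by the same value in b", one possible variant (an interpretation of the question, not something stated in the thread) compares the number of distinct pairs with the number of distinct values in a:

```r
# TRUE when each distinct value of a is always paired with the same value of b.
# The function name is hypothetical.
consistent.dups <- function(a, b, sep = ";~;~;~") {
  ab <- paste(a, b, sep = sep)
  length(unique(ab)) == length(unique(a))
}

a <- c("a", "b", "c", "a")
consistent.dups(a, c("m", "n", "o", "m"))   # duplicates of a line up in b
consistent.dups(a, c("m", "n", "o", "x"))   # second "a" is paired with "x"
```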



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Nipesh Bajaj
Sent: Wednesday, February 09, 2011 10:12 AM
To: r-help@r-project.org
Subject: [R] A question on Duplicating

Hello, I am struggling to accomplish an idea, which is as follows:

I have a vector, say: a <- c("a", "b", "c", "a") and another: b <- c("m",
"n", "o", "m"). The lengths of those 2 vectors are always the same. The task
is to check for duplicates in the vector 'a' and then to check whether there
are duplicates in the same places of 'b'. If not, flag a FALSE.

In the above example this holds, hence TRUE. However, in general how can I
implement this?

Can somebody please help me?

Thanks,



Re: [R] strange behavior of panel.abline inside a for-loop

2011-02-08 Thread rex.dwyer
Dear Marius,
Try this:

plot.list = lapply(1:10,
  function(i) xyplot(i~i, type="p", xlim=c(0,11), panel=function(...) {
    panel.xyplot(...); panel.abline(v=i)})
)
plot.list[[3]]

I imagine it will work for Mr Luftjammer, too.
Rex
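
The difference between the two versions can be seen without lattice. In the sketch below (function names are made up), every closure created in the for loop looks up the same variable i when it is finally called, by which time the loop has finished; each closure created through lapply has its own i. The force(i) call is a precaution so the sketch does not depend on how a given R version evaluates lapply's arguments.

```r
# Closures built in a for loop share one 'i'.
fs.loop <- vector("list", 3)
for (i in 1:3) fs.loop[[i]] <- function() i

# Closures built via lapply each get their own 'i'.
fs.apply <- lapply(1:3, function(i) { force(i); function() i })

fs.loop[[1]]()    # sees the final value of the loop variable
fs.apply[[1]]()   # sees the value it was created with
```

The panel function in the original loop behaves like fs.loop: it is only run when the plot is printed, after i has reached 10.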

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Marius Hofert
Sent: Tuesday, February 08, 2011 6:37 AM
To: Help R
Subject: [R] strange behavior of panel.abline inside a for-loop

Dear expeRts,

I would like to create a list of lattice xyplots. Here is the minimal example:

library(lattice)
plot.list <- vector("list", 10)
for(i in 1:10){
    plot.list[[i]] <- xyplot(i~i, type="p", xlim=c(0,11), panel=function(...){
        panel.xyplot(...)
        panel.abline(v=i)
    })
}
plot.list[[3]]

As  you can see, the vertical line is *always* printed at x=10 [and not at 
x=i]. Why?

Cheers,

Marius


Re: [R] pass nrow(x) to dots in function(x){plot(x,...)}

2011-02-04 Thread rex.dwyer
Hi Marianne,
The quick-and-dirty solution is to add one character and make ns global:
ns <<- nrow(x)
Poor practice, but OK temporarily if you're just debugging.

This is an issue of scope.  You are assuming dynamic scope, whereas R uses 
static scope.  'ns' was not defined when you said paste("n=",ns); it doesn't 
matter what its value is later.  Even though R delays evaluation of the 
argument until it is first needed, if it ends up being evaluated, the result is 
the same as if you evaluated it in the environment where it appeared in the 
text.  You can do something like this:

myfun <- function(x, title.fun, ...) {
  ns <- nrow(x)
  title = title.fun(ns)
  barplot(x , main=title, ... ) }

myfun(m1, title.fun=function(n) paste("n = ",n) )

Then the paste isn't evaluated until title.fun is *called*.  If you don't want 
to always supply title.fun, you give a default:
myfun <- function(x, title.fun=paste, ...) { ...
or
myfun <- function(x, title.fun=function(...){}, ...) { ...
or
myfun <- function(x, title.fun=function(...){main}, main="", ...) { ... # (I think)


Rex
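
Put together, the title.fun approach might look like the sketch below. The default that returns NULL (so barplot draws no title) and the invisible return of the title are choices made for this sketch, not part of the suggestion above.

```r
# title.fun receives the row count and builds the title inside myfun,
# where 'ns' actually exists.
myfun <- function(x, title.fun = function(n) NULL, ...) {
  ns <- nrow(x)
  title <- title.fun(ns)
  barplot(x, main = title, ...)
  invisible(title)               # returned only so the title can be inspected
}

m1 <- matrix(1:20, 4)
```

It would be called as myfun(m1, title.fun = function(n) paste("n =", n)), or simply myfun(m1) for an untitled plot.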

---
Message: 4
Date: Wed, 2 Feb 2011 11:51:50 +
From: Marianne Promberger marianne.promber...@kcl.ac.uk
To: r-help@r-project.org r-help@r-project.org
Subject: [R] pass nrow(x) to dots in function(x){plot(x,...)}
Message-ID: 20110202115150.GD8598@lauren
Content-Type: text/plain; charset=us-ascii

Dear Rers,

I have a function to barplot() a matrix, eg

myfun <- function(x, ...) { barplot(x , ... )}

(The real function is more complicated, it does things to the matrix first.)

So I can do:

m1 <- matrix(1:20,4)
myfun(m1)
myfun(m1, main="My title")

I'd like to be able to add the number of rows of the matrix passed to
the function to the ... argument, eg

myfun(m1, main=paste("n=",ns))

where 'ns' would be nrow(m1)

I've tried this but it doesn't work:

myfun <- function(x, ...) {
  ns <- nrow(x)
  barplot(x , ... ) }

myfun(m1, main=paste("n = ",ns) )

ns is not found

So, basically, how do I assign an object inside a function that I can
then access in the dots when executing the function?

Many thanks

Marianne


--
Marianne Promberger PhD, King's College London
http://promberger.info
R version 2.12.0 (2010-10-15)
Ubuntu 9.04


