Re: [R] Dicrete Laplace distribution

2010-03-11 Thread Moshe Olshansky
Dear Nicolette, You can always use the brute force solution which works for every discrete distribution with a finite number of states: let p0,p1,...,pK be the probabilities of 0,1,...,K (such that they sum up to 1). Let P <- c(p0,p1,...,pK) and P1 <- c(cumsum(P),1) Now let x = runif() (uniform in
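
A minimal sketch of the brute-force (inverse-CDF) idea in the snippet above, assuming a hypothetical three-state distribution. The vector `p`, the name `breaks` and the use of `findInterval` are illustrative choices, not necessarily what the truncated message went on to suggest.

```r
# Hypothetical probabilities of the states 0, 1, 2 (must sum to 1)
p <- c(0.2, 0.5, 0.3)

# Interior cumulative probabilities act as breakpoints on (0, 1)
breaks <- cumsum(p)[-length(p)]

set.seed(1)
u <- runif(10000)              # uniform draws on (0, 1)

# findInterval counts how many breakpoints are <= u, which yields states 0..K
x <- findInterval(u, breaks)

table(x) / length(x)           # empirical frequencies, should be close to p
```

Dropping the last cumulative value avoids relying on `cumsum(p)` ending at exactly 1 in floating point.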

Re: [R] Optimise huge data.frame construction

2010-02-24 Thread Moshe Olshansky
Hi Daniele, One possibility would be to make two runs. In the first run you are not building the matrix but just calculating the number of rows you need (in a loop). Then you allocate such matrix (only once) and fill it in the second run. Regards, Moshe. --- On Wed, 24/2/10, Daniele Amberti

Re: [R] reading surfer files

2010-02-23 Thread Moshe Olshansky
Check read.table (?read.table). --- On Wed, 24/2/10, RagingJim wrote: From: RagingJim Subject: [R] reading surfer files To: Received: Wednesday, 24 February, 2010, 3:23 PM To the R experts, I am currently playing

Re: [R] Goodness of fit test for count data

2010-02-22 Thread Moshe Olshansky
You can compute the conditional probability that your variable equals k given that it is non-zero. For example, if X has Poisson distribution with parameter lambda then P(X=k | X!=0) = P(X=k)/(1-P(X=0)) = exp(-lambda)/(1-exp(-lambda))*lambda^k/k! Now you can find lambda for which the sum of
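
As a rough illustration of the conditional (zero-truncated) Poisson probability above, the following sketch defines it as a function and checks that it sums to 1. The function name `dztpois` and the value of `lambda` are assumptions made for the example.

```r
# Zero-truncated Poisson pmf: P(X = k | X != 0) for k = 1, 2, ...
# (dztpois is a hypothetical helper name, not a base R function)
dztpois <- function(k, lambda) {
  dpois(k, lambda) / (1 - exp(-lambda))
}

lambda <- 2                       # hypothetical parameter value
total <- sum(dztpois(1:200, lambda))
total                             # numerically equal to 1
```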

Re: [R] Quadprog help

2010-02-22 Thread Moshe Olshansky
Hi Sergio, Having singular Dmat is certainly a problem. I can see two possibilities: 1) try to eliminate X1,...,X9, so that you are left with P1,...,P6 only. 2) if you can not do this, add eps*X1^2+...+eps*X9^2 to your matrix Dmat so that it is positive definite (eps is a small positive number).

Re: [R] Integral of function of dnorm

2010-02-17 Thread Moshe Olshansky
Yes, this can be easily computed analytically (even though my result is a bit different). --- On Fri, 12/2/10, wrote: From: Subject: Re: [R] Integral of function of dnorm To: Greg Snow

Re: [R] difftimes; histogram; memory problems

2010-02-15 Thread Moshe Olshansky
Hi Jonathan, If minDate = min(Condition1) - max(Condition2) and maxDate = max(Condition1) - min(Condition2) then all your differences would be between minDate and maxDate, and hopefully this is not a very big range (unless you are going many thousands of years into the past or the future). So

Re: [R] Question about rank() function

2010-02-10 Thread Moshe Olshansky
Hi, I believe that the reason is that even though the first 4 elements of your fmodel look equal (when rounded to 4 decimal places) they are actually not. To check this try fmodel[1:4]-fmodel[1] --- On Thu, 11/2/10, Something Something wrote: From: Something Something

Re: [R] Resampling a grid to coarsen its resolution

2010-02-09 Thread Moshe Olshansky
One possibility I can see is to replace - by NA and use mean with na.rm=TRUE. --- On Wed, 10/2/10, Steve Murray wrote: From: Steve Murray Subject: [R] Resampling a grid to coarsen its resolution To: Received: Wednesday,

Re: [R] Polynomial equation

2010-01-07 Thread Moshe Olshansky
Hi Chris, You can use lm with poly (see ?lm, ?poly). If x and y are your arrays of points and you wish to fit a polynomial of degree 4, say, enter: model <- lm(y~poly(x,4,raw=TRUE)) and then summary(model) The raw=TRUE causes poly to use 1,x,x^2,x^3,... instead of orthogonal polynomials (which are
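
A small, self-contained sketch of the `lm` plus `poly` approach on simulated data. The data-generating polynomial and the noise level are assumptions chosen for the example.

```r
set.seed(42)
x <- seq(-2, 2, length.out = 50)
# Hypothetical "true" curve: 1 + 2*x - 0.5*x^4, plus a little noise
y <- 1 + 2 * x - 0.5 * x^4 + rnorm(50, sd = 0.01)

# raw = TRUE fits coefficients of 1, x, x^2, x^3, x^4 directly
model <- lm(y ~ poly(x, 4, raw = TRUE))
cf <- unname(coef(model))
cf                                  # intercept first, then powers 1..4

# Fitted trend line at new x values, e.g. for adding to a plot with lines()
xnew <- data.frame(x = seq(-2, 2, length.out = 200))
trend <- predict(model, newdata = xnew)
```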

Re: [R] Polynomial equation

2010-01-07 Thread Moshe Olshansky
unable to find out the equation of the trendline from the summary table. Besides, how do I fit the trendline on the graph? I intend to put the first column of data onto x axis and the second column onto y axis. Are they the x and y in your example? Many thanks, Chris Moshe Olshansky-2

[R] Confidence intervals - a statistical question, nothing to do with R

2009-11-18 Thread Moshe Olshansky
Dear list, I have r towns, T1,...,Tr where town i has population Ni. For each town I randomly sampled Mi individuals and found that Ki of them have a certain property. So Pi = Ki/Mi is an unbiased estimate of the proportion of people in town i having that property and the weighted average of

Re: [R] Kolmogorov smirnov test

2009-10-12 Thread Moshe Olshansky
Hi Roslina, I believe that you can ignore the warning. Alternatively, you may add a very small random noise to pairs with ties, i.e. something like xobs[which(duplicated(xobs))] <- xobs[which(duplicated(xobs))] + 1.0e-6*sd(xobs)*rnorm(length(which(duplicated(xobs)))) Regards, Moshe. --- On

Re: [R] keeping all rows with the same values, and not only unique ones

2009-09-24 Thread Moshe Olshansky
test[which(test[,"total"] %in% needed),] --- On Fri, 25/9/09, Dimitri Liakhovitski wrote: From: Dimitri Liakhovitski Subject: [R] keeping all rows with the same values, and not only unique ones To: R-Help List Received: Friday, 25

Re: [R] Basic population dynamics

2009-09-01 Thread Moshe Olshansky
Assuming that at the end all of them are dead, you can do the following: sum(deaths)-cumsum(deaths) Regards, Moshe. --- On Wed, 2/9/09, Frostygoat wrote: From: Frostygoat Subject: [R] Basic population dynamics To: Received:
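
A tiny worked example of the expression above, with a hypothetical `deaths` vector (one entry per time period):

```r
deaths <- c(3, 5, 2)                      # hypothetical deaths per period
alive  <- sum(deaths) - cumsum(deaths)    # survivors at the end of each period
alive
```

Here the initial population is taken to be `sum(deaths)`, which is exactly the "all of them are dead at the end" assumption.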

Re: [R] Help on efficiency/vectorization

2009-08-27 Thread Moshe Olshansky
You can do for (i in 1:ncol(x)) {names <- rownames(x)[which(x[,i]==1)];eval(parse(text=paste("V",i,".ind<-names",sep="")));} --- On Thu, 27/8/09, Steven Kang wrote: From: Steven Kang Subject: [R] Help on efficiency/vectorization To:

Re: [R] Submit a R job to a server

2009-08-26 Thread Moshe Olshansky
Hi Deb, Based on your last note (and after briefly looking at Rserve) I believe that you should install R with all the packages you need on the server and then use it like you are using any workstation, i.e. log in to it and do whatever you need. Regards, Moshe. --- On Thu, 27/8/09,

Re: [R] expanding 1:12 months to Jan:Dec

2009-08-20 Thread Moshe Olshansky
One possible (but not very elegant) solution is: aa <- paste(1:12,":10:2009",sep="") dd <- as.Date(aa,format="%m:%d:%Y") mon <- format(dd,"%b") mon [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec" --- On Thu, 20/8/09, Liviu Andronic wrote: From: Liviu Andronic

Re: [R] Principle components analysis on a large dataset

2009-08-20 Thread Moshe Olshansky
Hi Misha, Since PCA is a linear procedure and you have only 6000 observations, you do not need 68000 variables. Using any 6000 of your variables so that the resulting 6000x6000 matrix is non-singular will do. You can choose these 6000 variables (columns) randomly, hoping that the resulting

Re: [R] extra .

2009-08-20 Thread Moshe Olshansky
My guess is that 6. comes for 6.0 - something which comes from programming languages where 6 represents 6 as integer while 6. (or 6.0) represents 6 as floating point number. --- On Fri, 21/8/09, kfcnhl wrote: From: kfcnhl Subject: [R]

Re: [R] feature weighting in randomForest

2009-08-17 Thread Moshe Olshansky
Hi Tim, As far as I know you can not weigh predictors (and I believe that you really should not). You may weigh classes (and, in a sense, cases), but this is an entirely different issue. --- On Wed, 5/8/09, Häring, Tim (LWF) wrote: From: Häring, Tim (LWF)

Re: [R] System is computationally singular and scale of covariates

2009-08-16 Thread Moshe Olshansky
Hi, What do you mean by outer product? If you have two vectors, say x and y, of length n and you define matrix A by A(i,j) = x(i)*y(j) then your matrix has rank one and it is VERY singular (in exact arithmetic). Is this what you mean by outer product? --- On Sun, 16/8/09, Stephan Lindner

Re: [R] problem about t test

2009-08-13 Thread Moshe Olshansky
You could do the following: y <- apply(dat,1,function(a) t.test(a[1:10],a[11:30])$p.value) This will produce an array of 2 p-values. --- On Fri, 14/8/09, Gina Liao wrote: From: Gina Liao Subject: [R] problem about t test To:

Re: [R] Solutions of equation systems

2009-08-13 Thread Moshe Olshansky
Is your system of equations linear? --- On Fri, 14/8/09, Moreno Mancosu wrote: From: Moreno Mancosu Subject: [R] Solutions of equation systems To: Received: Friday, 14 August, 2009, 2:29 AM Hello all! Maybe it's a newbie

Re: [R] Counting the number of non-NA values per day

2009-08-12 Thread Moshe Olshansky
Try tempFun <- function(x) sum(!is.na(x)) nonZeros <- aggregate(pollution["pol"],format(pollution["date"],"%Y-%j"), FUN = tempFun) --- On Wed, 12/8/09, Tim Chatterton wrote: From: Tim Chatterton Subject: [R] Counting the number of non-NA values per

Re: [R] Re : How to Import Excel file into R 2.9.0 version

2009-08-12 Thread Moshe Olshansky
Alternatively download the xlsReadWrite package from install it and proceed as in older versions of R. --- On Tue, 11/8/09, Inchallah Yarab wrote: From: Inchallah Yarab Subject: [R] Re : How to Import Excel file

Re: [R] Matrix Integral

2009-08-12 Thread Moshe Olshansky
Hi, Is your matrix K symmetric? If yes, there is an analytical solution. --- On Sat, 1/8/09, nhawrylyshyn wrote: From: nhawrylyshyn Subject: [R] Matrix Integral To: Received: Saturday, 1 August, 2009, 12:15 AM

Re: [R] Sampling of non-overlapping intervals of variable length

2009-07-19 Thread Moshe Olshansky
Another possibility, if the total length of your intervals is small in comparison to the big interval is to choose the starting points of all your intervals randomly and to dismiss the entire set if some of the intervals overlap. Most probably you will not have too many such cases (assuming,

Re: [R] searching for elements

2009-07-15 Thread Moshe Olshansky
?outer --- On Thu, 16/7/09, Chyden Finance wrote: From: Chyden Finance Subject: [R] searching for elements To: Received: Thursday, 16 July, 2009, 3:00 AM Hello! For the past three years, I have been using R extensively in my

Re: [R] Nested for loops

2009-07-14 Thread Moshe Olshansky
Make it for (i in 1:9) This is not the general solution, but in your case when i=10 you do not want to do anything. --- On Tue, 14/7/09, Michael Knudsen wrote: From: Michael Knudsen Subject: [R] Nested for loops To: Received:

Re: [R] ifultools on ppc debian

2009-07-14 Thread Moshe Olshansky
Hi Stephen, The error message clearly says what is wrong. Big Endian and Little Endian are two ways of storing data (most often double precision numbers) in memory. A double precision number occupies two blocks of 4 bytes each. On Big Endian machines (most machines which are not Intel) if

Re: [R] Grouping data in dataframe

2009-07-14 Thread Moshe Olshansky
Try ?aggregate --- On Wed, 15/7/09, Timo Schneider wrote: From: Timo Schneider Subject: [R] Grouping data in dataframe To: Received: Wednesday, 15 July, 2009, 1:56 PM Hello,

Re: [R] averaging two matrices whilst ignoring missing values

2009-07-13 Thread Moshe Olshansky
One (awkward) way to do this is: x <- matrix(c(c(test),c(test2)),ncol=2) y <- rowMeans(x,na.rm=TRUE) testave <- matrix(y,nrow=nrow(test)) --- On Tue, 14/7/09, Tish Robertson wrote: From: Tish Robertson Subject: [R] averaging two matrices
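
A short sketch of the same idea on two small matrices. The matrices `test` and `test2` below are made-up inputs of equal dimensions.

```r
# Two hypothetical 2x2 matrices with missing values
test  <- matrix(c(1, NA, 3, 4),   nrow = 2)
test2 <- matrix(c(3, 2,  NA, NA), nrow = 2)

# Stack the two matrices as columns, one row per cell
x <- cbind(c(test), c(test2))

# Average each cell, ignoring NAs, and restore the original shape
testave <- matrix(rowMeans(x, na.rm = TRUE), nrow = nrow(test))
testave
```

A cell that is NA in both matrices would come out as NaN with this approach.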

Re: [R] how to keep row name if there is only one row selected from a data frame

2009-07-12 Thread Moshe Olshansky
Try A[1,,drop=FALSE] - see help("[") --- On Mon, 13/7/09, Weiwei Shi wrote: From: Weiwei Shi Subject: [R] how to keep row name if there is only one row selected from a data frame To: Received: Monday,

Re: [R] Extracting a column name in loop?

2009-07-09 Thread Moshe Olshansky
If df is your dataframe then names(df) contains the column names and so names(df)[i] is the name of i-th column. --- On Thu, 9/7/09, mister_bluesman wrote: From: mister_bluesman Subject: [R] Extracting a column name in loop? To:

Re: [R] print() to file?

2009-07-09 Thread Moshe Olshansky
One possibility is to use sink (see ?sink). --- On Thu, 9/7/09, Steve Jaffe wrote: From: Steve Jaffe Subject: [R] print() to file? To: Received: Thursday, 9 July, 2009, 5:03 AM I'd like to write some objects (eg arrays) to a

Re: [R] Substituting numerical values using `apply'

2009-07-09 Thread Moshe Olshansky
Let M be your matrix. Do the following: B <- t(matrix(colnames(M),nrow=ncol(M),ncol=nrow(M))) B[M==0] <- NA --- On Thu, 9/7/09, Olivella wrote: From: Olivella Subject: [R] Substituting numerical values using `apply' To: Received:

Re: [R] naming of columns in R dataframe consisting of mixed data (alphanumeric and numeric)

2009-07-09 Thread Moshe Olshansky
Hi Mary, Your data.frame has just one column (not 2)! You can check this by dim(tresult2). What appears to you to be the first column (names) are indeed rownames. If you really want to have two columns do something like tresult2 <- cbind(colnames(tresult),data.frame(t(tresult),row.names=NULL))

Re: [R] Counting the number of cycles in a temperature test

2009-07-07 Thread Moshe Olshansky
Hi Antje, Are your measurements taken every minute (i.e. 30 minutes correspond to 30 consecutive values)? How fast is your transition? If you had 30 minutes of upper temperature, then 1000 minutes of room temperature and then 30 minutes of lower temperature - would you count this as a cycle?

Re: [R] Uncorrelated random vectors

2009-07-07 Thread Moshe Olshansky
As mentioned by somebody before, there is no problem for the normal case - use mvrnorm function from MASS package with any mu and make Sigma be any diagonal matrix (with strictly positive diagonal). Note that even though all the correlations are 0, the SAMPLE correlations won't be 0. If you

Re: [R] a really simple question on polynomial multiplication

2008-10-15 Thread Moshe Olshansky
One way is to use convolution (?convolve): If A(x) = a_p*x^p + ... + a_1*x + a_0 and B(x) = b_q*x^q + ... + b_1*x + b_0 and if C(x) = A(x)*B(x) = c_(p+q)*x^(p+q) + ... + c_0 then c = convolve(a,rev(b),type="open") where c is the vector (c_(p+q),...,c_0), a is (a_p,...,a_0) and b is (b_q,...,b_0).
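
A quick check of the convolution recipe on a product that is easy to verify by hand, (x + 2)(x + 3) = x^2 + 5x + 6. The result is stored in `cc` (a name chosen here to avoid masking the base function `c`).

```r
# Coefficients in descending powers of x
a <- c(1, 2)     # A(x) = x + 2
b <- c(1, 3)     # B(x) = x + 3

# convolve() works via the FFT, so the result carries tiny rounding errors
cc <- convolve(a, rev(b), type = "open")
round(cc, 10)    # coefficients of A(x)*B(x), highest power first
```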

Re: [R] runs of heads when flipping a coin

2008-10-09 Thread Moshe Olshansky
First of all, we must define what is a run of length r: is it a tail, then EXACTLY r heads and a tail again or is it AT LEAST r heads. Let's assume that we are looking for a run of EXACTLY r heads (and we toss the coin n times). Let X[1],X[2],...,X[n-r+1] be random variables such that Xi = 1 if

Re: [R] ordering problem

2008-10-07 Thread Moshe Olshansky
Try AA <- apply(A,1,function(x) paste(x,collapse="")) and work with AA. --- On Tue, 30/9/08, Jose Luis Aznarte M. [EMAIL PROTECTED] wrote: From: Jose Luis Aznarte M. [EMAIL PROTECTED] Subject: [R] ordering problem To: [EMAIL PROTECTED] Received: Tuesday, 30 September, 2008, 8:43 PM Hi

Re: [R] design question on piping multiple data sets from 1 file into R

2008-09-24 Thread Moshe Olshansky
I think that you can use read.csv with nrows and skip arguments (see ?read.table). --- On Mon, 22/9/08, DS [EMAIL PROTECTED] wrote: From: DS [EMAIL PROTECTED] Subject: [R] design question on piping multiple data sets from 1 file into R To: Received: Monday, 22

Re: [R] sort a data matrix by all the values and keep the names

2008-09-22 Thread Moshe Olshansky
One possibility is: x <- data.frame(x1=c(1,7),x2=c(4,6),x3=c(8,2)) names <- t(matrix(rep(names(x),times=nrow(x)),nrow=ncol(x))) m <- as.matrix(x) ind <- order(m) df <- data.frame(name=names[ind],value=m[ind]) df name value 1 x1 1 2 x3 2 3 x2 4 4 x2 6 5 x1 7 6 x3

Re: [R] perl expression question

2008-09-22 Thread Moshe Olshansky
Hi Mark, stock <- "/opt/limsrv/mark/research/equity/projects/testDL/stock_data/fhdb/US/BLC.NYSE" gsub(".*/([^/]+)$", "\\1",stock) [1] "BLC.NYSE" --- On Tue, 23/9/08, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] [EMAIL PROTECTED] Subject: [R] perl expression question To:

Re: [R] help on sampling from the truncated normal/gamma distribution on the far end (probability is very low)

2008-09-18 Thread Moshe Olshansky
Hi Sonia, If I did not make a mistake, the conditional distribution of X given that X 0 is very close to exponential distribution with parameter lambda = 40, so you can sample from this distribution. --- On Mon, 15/9/08, Daniel Davis [EMAIL PROTECTED] wrote: From: Daniel Davis [EMAIL

Re: [R] help on sampling from the truncated normal/gamma distribution on the far end (probability is very low)

2008-09-18 Thread Moshe Olshansky
Well, I made a mistake - your lambda should be 400 and not 40!!! --- On Thu, 18/9/08, Moshe Olshansky [EMAIL PROTECTED] wrote: From: Moshe Olshansky [EMAIL PROTECTED] Subject: Re: [R] help on sampling from the truncated normal/gamma distribution on the far end (probability is very low

Re: [R] inserting values for null

2008-09-17 Thread Moshe Olshansky
Hi Ramya, Assuming that the problem is well defined (i.e. the values in col1 of the data.frames are unique and every value in D.F.sub.2[,1] appears also in D.F1[,1]) you can do the following: ind <- match(D.F.sub.2[,1],D.F1[,1]) D.F1[ind,] <- D.F.sub.2 --- On Thu, 18/9/08, Rajasekaramya [EMAIL

Re: [R] database table merging tips with R

2008-09-11 Thread Moshe Olshansky
One possibility is as follows: If r$userid is your array of (2000) ID's then s <- paste(r$userid,sep=",") s <- paste("select t.userid, x, y, z from largetable t where t.userid in (",s,")",sep="") and finally d <- sqlQuery(connection,s) Regards, Moshe. --- On Fri, 12/9/08, Avram Aelony [EMAIL PROTECTED]

Re: [R] database table merging tips with R

2008-09-11 Thread Moshe Olshansky
Just a small correction: start with s <- paste(r$userid,collapse=",") and not s <- paste(r$userid,sep=",") --- On Fri, 12/9/08, Moshe Olshansky [EMAIL PROTECTED] wrote: From: Moshe Olshansky [EMAIL PROTECTED] Subject: Re: [R] database table merging tips with R To: [EMAIL PROTECTED], Avram
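
To see why `collapse` rather than `sep` is needed, here is a sketch that only builds the query string (no database connection is opened). The `userid` values are hypothetical; the table and column names are those quoted in the thread.

```r
userid <- c(101, 205, 307)                 # hypothetical IDs

# collapse joins the elements of ONE vector into a single string
ids <- paste(userid, collapse = ",")

# sep only separates the different arguments of paste()
query <- paste("select t.userid, x, y, z from largetable t where t.userid in (",
               ids, ")", sep = "")
query
```

With `sep = ","` in the first call, `paste` would return a vector of three separate strings instead of one comma-separated list.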

Re: [R] densities with overlapping area of 0.35

2008-09-08 Thread Moshe Olshansky
Let X be normally distributed with mean 0 and let f be its density. Now the density of X+a will be f shifted right by a. Since the density is symmetric around the mean it follows that the area of overlap of the two densities is exactly P(X>a) + P(X<-a). So if X~N(0,1), we want P(X>a) + P(X<-a) =

Re: [R] densities with overlapping area of 0.35

2008-09-08 Thread Moshe Olshansky
Just a correction: if we take X+2a then everything is OK (the curves intersect at a), so a = 0.9345893 is correct but one must take X ~ N(0,1) and Y ~N(2*a,1). --- On Tue, 9/9/08, Moshe Olshansky [EMAIL PROTECTED] wrote: From: Moshe Olshansky [EMAIL PROTECTED] Subject: Re: [R] densities
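
A numerical check of the corrected statement, assuming unit variances and a target overlap of 0.35 as in the thread title. The names `target`, `f` and `overlap` are introduced only for this sketch.

```r
target <- 0.35

# Solve P(X > a) + P(X < -a) = target for X ~ N(0,1)
a <- qnorm(1 - target / 2)

# Pointwise minimum of the N(0,1) and N(2a,1) densities
f <- function(x) pmin(dnorm(x), dnorm(x, mean = 2 * a))

# The two curves cross at x = a, so integrate each smooth piece separately
overlap <- integrate(f, -Inf, a)$value + integrate(f, a, Inf)$value

c(a = a, overlap = overlap)
```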

Re: [R] intercept of 3D line? (Orthogonal regression)

2008-09-01 Thread Moshe Olshansky
I do not see why you can not use regression even in this case. To make things more simple suppose that the exact model is: y = a + b*x, i.e. y1 = a + b*x1 ... yn = a + b*xn But you can not observe y and x. Instead you observe ui = xi + ei (i=1,...,n) and vi = yi + di (i=1,...,n) Now you have

Re: [R] Integrate a 1-variable function with 1 parameter (Jose L. Romero)

2008-08-27 Thread Moshe Olshansky
This can be done analytically: after changing a variable (2*t - t) and some scaling we need to compute f(x) = integral from 0 to 20 of (t^x*exp(-t))dt/factorial(x) f(0) = int from 0 to 20 of exp(-t)dt = 1 - exp(-20) and integration by parts yields (for x=1,2,3,...) f(x) =

Re: [R] Problem with Integrate for NEF-HS distribution

2008-08-26 Thread Moshe Olshansky
If you look at your sech(pi*x/2) you can write it as sech(pi*x/2) = 2*exp(pi*x/2)/(1 + exp(pi*x)) For x < -15, exp(pi*x) < 10^-20, so for this interval you can replace sech(pi*x/2) by 2*exp(pi*x/2) and so the integral from -Inf to -15 (or even -10 - depends on your accuracy requirements) can be

Re: [R] Finding a probability

2008-08-26 Thread Moshe Olshansky
Your commands are correct and the interpretation is that the probability that a normal random variable with mean 1454.190 and standard deviation 162.6301 achieves a value of 417 or less is 8.99413e-11 --- On Wed, 27/8/08, rr400 [EMAIL PROTECTED] wrote: From: rr400 [EMAIL PROTECTED] Subject:

Re: [R] Igraph library: How to calculate APSP (shortest path matrix) matrix for a subset list of nodes.

2008-08-25 Thread Moshe Olshansky
I was too optimistic - the complexity is O(E*log(V)) where V is the number of nodes, but since log(25000) < 20 this is still reasonable. --- On Mon, 25/8/08, Moshe Olshansky [EMAIL PROTECTED] wrote: From: Moshe Olshansky [EMAIL PROTECTED] Subject: Re: [R] Igraph library: How to calculate APSP

Re: [R] paste: multiple collapse arguments?

2008-08-25 Thread Moshe Olshansky
One possibility is: y <- rep(" ",6) y[6] <- y[c(2,4)] <- "\n" res <- paste(paste(x,y,sep=""),collapse="") --- On Tue, 26/8/08, remko duursma [EMAIL PROTECTED] wrote: From: remko duursma [EMAIL PROTECTED] Subject: [R] paste: multiple collapse arguments? To: Received: Tuesday, 26

Re: [R] deconvolution: Using the output and a IRF to get the input

2008-08-24 Thread Moshe Olshansky
Hi Wolf, Without noise you could use FFT, i.e. FFT of a convolution is the product of the individual FFTs and so you get the FFT of your input signal and using inverse FFT you get the signal itself. When there is noise you must experiment. You may want to filter the response before doing FFT.

Re: [R] Igraph library: How to calculate APSP (shortest path matrix) matrix for a subset list of nodes.

2008-08-24 Thread Moshe Olshansky
As far as I know/remember, if your graph is connected and contains E edges then you can find the shortest distance from any particular vertex to all other vertices in O(E) operations. You can repeat this procedure starting from every node (out of the 500). If you have 100,000 edges this will

Re: [R] Help Regarding 'integrate'

2008-08-21 Thread Moshe Olshansky
The phenomenon is most likely caused by numerical errors. I do not know how 'integrate' works but numerical integration over a very long interval does not look like a good idea to me. I would do the following: f1 <- function(x){ return(dchisq(x,9,77)*((13.5/x)^5)*exp(-13.5/x)) } f2 <- function(y){

Re: [R] Null and Alternate hypothesis for Significance test

2008-08-21 Thread Moshe Olshansky
Hi Nitin, I believe that you can not have null hypothesis to be that A and B come from different distributions. Asymptotically (as both sample sizes go to infinity) KS test has power 1, i.e. it will reject H0:A=B for any case where A and B have different distributions. To work with a finite

Re: [R] How I can read the binary file with different type?

2008-08-21 Thread Moshe Olshansky
Hi Miao, I can write a function which takes an integer and produces a float number whose binary representation equals to that of the integer, but this would be an awkward solution. So if nobody suggests anything better I will write such a function for you, but let's wait for a better solution.

Re: [R] Random sequence of days?

2008-08-20 Thread Moshe Olshansky
How about d[sample(length(d),10)] --- On Wed, 20/8/08, Lauri Nikkinen [EMAIL PROTECTED] wrote: From: Lauri Nikkinen [EMAIL PROTECTED] Subject: [R] Random sequence of days? To: [EMAIL PROTECTED] Received: Wednesday, 20 August, 2008, 4:04 PM Dear list, I tried to find a solution for this

Re: [R] Conversion - lowercase to Uppercase letters

2008-08-19 Thread Moshe Olshansky
Use toupper or tolower (see ?toupper, ?tolower) --- On Wed, 20/8/08, suman Duvvuru [EMAIL PROTECTED] wrote: From: suman Duvvuru [EMAIL PROTECTED] Subject: [R] Conversion - lowercase to Uppercase letters To: Received: Wednesday, 20 August, 2008, 2:19 PM I would like to

Re: [R] matrix row product and cumulative product

2008-08-18 Thread Moshe Olshansky
Hi Jeff, If I understand correctly, the overhead of a loop is that at each iteration the command must be interpreted, and this time is independent of the number of rows N. So if N is small this overhead may be very significant but when N is large this should be very small compared to the time

Re: [R] A doubt about lm and the meaning of its summary

2008-08-18 Thread Moshe Olshansky
Hi Alberto, Please disregard my previous note - I probably had a black-out!!! --- On Tue, 19/8/08, Alberto Monteiro [EMAIL PROTECTED] wrote: From: Alberto Monteiro [EMAIL PROTECTED] Subject: [R] A doubt about lm and the meaning of its summary To: Received: Tuesday, 19

Re: [R] Vectorization of duration of the game in the gambler ruin's problem

2008-08-15 Thread Moshe Olshansky
Hi Jose, If you are only interested in the expected duration, the problem can be solved analytically - no simulation is needed. Let P be the probability to get (and then 1-P is the probability to lose all the money) when starting with This probability P is well

Re: [R] ignoring zeros or converting to NA

2008-08-13 Thread Moshe Olshansky
Since 0 can be represented exactly as a floating point number, there is no problem with something like x[x==0]. What you can not rely on is something like 0.1+0.2 == 0.3 to be TRUE. --- On Thu, 14/8/08, Roland Rau [EMAIL PROTECTED] wrote: From: Roland Rau [EMAIL PROTECTED] Subject: Re: [R]
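
The point about exact and inexact comparisons can be illustrated with a few lines; the vector `x` is a made-up example.

```r
# 0.1, 0.2 and 0.3 are not exactly representable in binary floating point
exact_cmp <- (0.1 + 0.2 == 0.3)                     # FALSE
tol_cmp   <- isTRUE(all.equal(0.1 + 0.2, 0.3))      # TRUE (compares with tolerance)

# 0 is exactly representable, so testing x == 0 is safe
x <- c(1.5, 0, 2, 0)
x[x == 0] <- NA
x
```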

Re: [R] missing TRUE/FALSE error in conditional construct

2008-08-13 Thread Moshe Olshansky
The problem is that if x is either NA or NaN then x != 0 is NA (and not FALSE or TRUE) and the function is.nan tests for a NaN but not for NA, i.e. is.nan(NA) returns FALSE. You can do something like: mat_zeroless[! mat != 0] - mat[! mat != 0] --- On Thu, 14/8/08,

Re: [R] Covariance matrix

2008-08-07 Thread Moshe Olshansky
Just interchange rows 2 and 3 and then columns 2 and 3 of the original covariance matrix. --- On Fri, 8/8/08, Zhang Yanwei - Princeton-MRAm [EMAIL PROTECTED] wrote: From: Zhang Yanwei - Princeton-MRAm [EMAIL PROTECTED] Subject: [R] Covariance matrix To:

Re: [R] simulate data based on partial correlation matrix

2008-08-05 Thread Moshe Olshansky
Hi Benjamin, Creating 0 correlations is easier and always possible, but creating arbitrary correlations can be done as well (when possible - see below). Suppose that x1,x2,x3,x4 have mean 0 and suppose that the desired correlations are r = (r1,r2,r3,r4). Let A be an orthogonal 4x4 matrix such

Re: [R] stats question

2008-07-31 Thread Moshe Olshansky
Hello Jason, You are not specific enough. What do you mean by significant difference? Let's assume that indeed the incidence in A is 6% and in B is 10% and we are looking for Na and Nb such that with probability of at least 80% the mean of Nb sample from B will be at least, say, 0.03 (=3%)

Re: [R] Code to calculate internal rate of return

2008-07-31 Thread Moshe Olshansky
You can use uniroot (see ?uniroot). As an example, suppose you have a $100 bond which pays 3% every half year (6% coupon) and lasts for 4 years. Suppose that it now sells for $95. In such a case your time intervals are 0,0.5,1,...,4 and the payoffs are: -95,3,3,...,3,103. To find internal rate
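
A minimal sketch of the bond example with `uniroot`, assuming annual effective compounding (discount factor `(1 + r)^t`) and a search interval of 0 to 100 percent; both are choices made for this illustration rather than details given in the snippet.

```r
times <- seq(0, 4, by = 0.5)           # payment times in years
cash  <- c(-95, rep(3, 7), 103)        # pay 95 now, seven coupons, final coupon + face

# Net present value as a function of the annual rate r
npv <- function(r) sum(cash / (1 + r)^times)

# The internal rate of return is the root of npv
irr <- uniroot(npv, interval = c(0, 1), tol = 1e-10)$root
irr
```

`uniroot` needs `npv` to change sign over the interval, which holds here because the undiscounted cash flows sum to a positive number while heavy discounting makes the value negative.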

Re: [R] cutting out numbers from vectors

2008-07-31 Thread Moshe Olshansky
This is something that is easier done in C than in R (to the best of my very limited knowledge). To do this in R you could do something like: x <- "082-232-232-1" y <- unlist(strsplit(x,"")) i <- which(y != "0")[1]-1 paste(y[-(1:i)],collapse="") [1] "82-232-232-1" --- On Fri, 1/8/08, calundergrad

Re: [R] Grouping Index of Matrix Based on Certain Condition

2008-07-31 Thread Moshe Olshansky
y - 2 - (x[,1] x[,2]) you can also do cbind(x,y) if you wish. --- On Fri, 1/8/08, Gundala Viswanath [EMAIL PROTECTED] wrote: From: Gundala Viswanath [EMAIL PROTECTED] Subject: [R] Grouping Index of Matrix Based on Certain Condition To: [EMAIL PROTECTED] Received: Friday, 1 August,

Re: [R] cutting out numbers from vectors

2008-07-31 Thread Moshe Olshansky
Yes, this is how it should be done! --- On Fri, 1/8/08, Christos Hatzis [EMAIL PROTECTED] wrote: From: Christos Hatzis [EMAIL PROTECTED] Subject: Re: [R] cutting out numbers from vectors To: 'calundergrad' [EMAIL PROTECTED], Received: Friday, 1 August, 2008, 2:11 PM

Re: [R] Urgent

2008-07-30 Thread Moshe Olshansky
Hi Yunlei, Is your problem constrained or not? If it is unconstrained and your matrix is not positive definite, the minimum is unbounded (unless you are extremely lucky and the matrix is positive semi-definite and the vector which multiplies the unknowns is exactly perpendicular to all the

Re: [R] Sampling two exponentials

2008-07-30 Thread Moshe Olshansky
I am not sure that this is well defined. For a multivariate normal distribution (which is well defined), the covariance matrix (and the means vector) fully determine the distribution. In the exponential case, what is multivariate (bivariate) exponential distribution? I believe that knowing

Re: [R] product of successive rows

2008-07-29 Thread Moshe Olshansky
Assuming that the number of rows is even and that your matrix is A, element-wise product of pairs of rows can be calculated as A[seq(1,nrow(A),by=2),]*A[seq(2,nrow(A),by=2),] --- On Mon, 28/7/08, rcoder [EMAIL PROTECTED] wrote: From: rcoder [EMAIL PROTECTED] Subject: [R] product of
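
A small worked example of the row-pair product on a made-up 4x2 matrix. `drop = FALSE` is added here as a precaution so that the result stays a matrix even when `A` has only two rows.

```r
A <- matrix(1:8, nrow = 4)     # rows: (1,5), (2,6), (3,7), (4,8)

odd  <- seq(1, nrow(A), by = 2)
even <- seq(2, nrow(A), by = 2)

# Row 1 * row 2, row 3 * row 4, element by element
prods <- A[odd, , drop = FALSE] * A[even, , drop = FALSE]
prods
```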

Re: [R] finding a faster way to do an iterative computation

2008-07-29 Thread Moshe Olshansky
Try abs(outer(xk,x,"-")) (see ?outer) --- On Wed, 30/7/08, dxc13 [EMAIL PROTECTED] wrote: From: dxc13 [EMAIL PROTECTED] Subject: [R] finding a faster way to do an iterative computation To: Received: Wednesday, 30 July, 2008, 4:12 AM useR's, I am trying trying to

Re: [R] Chi-square parameter estimation

2008-07-29 Thread Moshe Olshansky
If v is your vector of sample variances (and assuming that their distribution is chi-square) you can define f(df) = sum(dchisq(v,df,log=TRUE)) and now you need to maximize f, which can be done using any optimization function (like optim). --- On Sat, 26/7/08, Julio Rojas [EMAIL PROTECTED]
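
A sketch of the maximum-likelihood idea on simulated data. Since there is a single parameter, the example uses `optimize` rather than `optim`; the simulated `v`, the true value of 5 degrees of freedom and the search interval are all assumptions of the example.

```r
set.seed(123)
v <- rchisq(500, df = 5)       # stand-in for the vector of sample variances

# Log-likelihood of the data as a function of the degrees of freedom
loglik <- function(df) sum(dchisq(v, df, log = TRUE))

# One-dimensional maximisation over a hypothetical search interval
fit <- optimize(loglik, interval = c(0.1, 50), maximum = TRUE)
df_hat <- fit$maximum
df_hat                          # should be near the true value 5
```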

Re: [R] Constrained coefficients in lm (correction)

2008-07-23 Thread Moshe Olshansky
This problem can be easily solved analytically: we want to minimize sum(res(i) - a*st(i) - b*mod(i))^2 subject to a+b=1, a,b>=0, so we want to minimize f(a) = sum((res(i)-mod(i)) - a*(st(i)-mod(i)))^2 for 0<=a<=1. Define Xi = res(i) - mod(i), Yi = st(i) - mod(i), then f(a) = sum(Xi - a*Yi)^2 f(0)

Re: [R] spectral decomposition for near-singular pd matrices

2008-07-16 Thread Moshe Olshansky
How large is your matrix? Are the very small eigenvalues well separated? If your matrix is not very small and the lower eigenvalues are clustered, this may be a really hard problem! You may need a special purpose algorithm and/or higher precision arithmetic. If your matrix is A and there

Re: [R] spectral decomposition for near-singular pd matrices

2008-07-16 Thread Moshe Olshansky
Kapat [EMAIL PROTECTED] Subject: Re: [R] spectral decomposition for near-singular pd matrices To: [EMAIL PROTECTED] Received: Thursday, 17 July, 2008, 10:56 AM Moshe Olshansky m_olshansky at writes: How large is your matrix? Right now I am looking at sizes between 30x30

Re: [R] rounding

2008-07-10 Thread Moshe Olshansky
The problem is that neither 0.55 nor 2.55 are exact machine numbers (the computer uses binary representation), so it may happen that the machine representation of 0.55 is slightly less than 0.55 while the machine representation of 2.55 is slightly above 2.55. --- On Fri, 11/7/08, Korn, Ed

Re: [R] rounding

2008-07-10 Thread Moshe Olshansky
is below 255, so that x is less than 2.55 and should have been rounded to 2.5. --- On Fri, 11/7/08, Moshe Olshansky [EMAIL PROTECTED] wrote: From: Moshe Olshansky [EMAIL PROTECTED] Subject: Re: [R] rounding To: [EMAIL PROTECTED], Korn, Ed (NIH/NCI) [E] [EMAIL PROTECTED] Received: Friday, 11 July

Re: [R] number of effective tests

2008-07-10 Thread Moshe Olshansky
It looks like SR, SU and ST are strongly correlated to each other, as well as DR, DU and DT. You can try to do PCA on your 6 variables, pick the first 2 principal components as your new variables and use them for regression. --- On Fri, 11/7/08, Georg Ehret [EMAIL PROTECTED] wrote: From:

Re: [R] Sum(Random Numbers)=100

2008-07-08 Thread Moshe Olshansky
Karanth wrote: On 2008-7-8, at 下午2:39, Moshe Olshansky wrote: If they are really random you can not expect their sum to be 100. However, it is not difficult to get that given that the sum of n independent Poisson random variables equals N, any individual one has the conditional

Re: [R] multiplication question

2008-07-07 Thread Moshe Olshansky
The answer to your first question is sum(x)*sum(y) - sum(x*y) and for the second one x %*% R %*% y - sum(x*y*diag(R)) --- On Thu, 3/7/08, Murali Menon [EMAIL PROTECTED] wrote: From: Murali Menon [EMAIL PROTECTED] Subject: [R] multiplication question To: [EMAIL PROTECTED] Received:
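
The original question is cut off, so the following sketch assumes it asked for the sums over all i != j of x[i]*y[j] and of x[i]*R[i,j]*y[j], which is what the two closed forms compute. Random inputs are used only to compare the closed forms against a direct evaluation.

```r
set.seed(7)
n <- 5
x <- rnorm(n)
y <- rnorm(n)
R <- matrix(rnorm(n * n), n, n)

off <- row(R) != col(R)                    # TRUE for off-diagonal positions

# Direct evaluation over all pairs with i != j
direct1 <- sum(outer(x, y)[off])
direct2 <- sum((outer(x, y) * R)[off])

# Closed forms from the message
closed1 <- sum(x) * sum(y) - sum(x * y)
closed2 <- drop(x %*% R %*% y) - sum(x * y * diag(R))

c(direct1 - closed1, direct2 - closed2)    # both differences are ~0
```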

Re: [R] odd dnorm behaviour (?)

2008-07-07 Thread Moshe Olshansky
dnorm() computes the density, so it may be > 1; pnorm() computes the distribution function. --- On Tue, 8/7/08, Mike Lawrence [EMAIL PROTECTED] wrote: From: Mike Lawrence [EMAIL PROTECTED] Subject: Re: [R] odd dnorm behaviour (?) To: Rhelp [EMAIL PROTECTED] Received: Tuesday, 8 July, 2008,

Re: [R] Lots of huge matrices, for-loops, speed

2008-07-06 Thread Moshe Olshansky
Another possibility is to use explicit formula, i.e. if you are doing linear regression like y = a*x + b then the explicit formulae are: a = (meanXY - meanX*meanY)/(meanX2 - meanX^2) b = (meanY*meanX2 - meanX*meanXY)/(meanX2 - meanX^2) where meanX is mean(x), meanXY is mean(x*y), meanX2 is
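
A brief check of the explicit formulae against `lm` on simulated data. The data and the true coefficients (slope 2, intercept 1) are assumptions of the example, and `meanX2` is taken to be `mean(x^2)`.

```r
set.seed(3)
x <- runif(100)
y <- 2 * x + 1 + rnorm(100, sd = 0.1)

meanX  <- mean(x)
meanY  <- mean(y)
meanXY <- mean(x * y)
meanX2 <- mean(x^2)

# Slope a and intercept b of y = a*x + b from the closed-form expressions
a <- (meanXY - meanX * meanY) / (meanX2 - meanX^2)
b <- (meanY * meanX2 - meanX * meanXY) / (meanX2 - meanX^2)

# Compare with lm(), which reports the intercept first
rbind(explicit = c(b, a), lm = unname(coef(lm(y ~ x))))
```

Because only means are involved, the same expressions can be applied to many series at once (for example with `colMeans` on matrices), which is where the speed gain over repeated `lm` calls would come from.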

Re: [R] Lots of huge matrices, for-loops, speed

2008-07-06 Thread Moshe Olshansky
matrices, for-loops, speed To: [EMAIL PROTECTED] Cc:, Zarza [EMAIL PROTECTED] Received: Monday, 7 July, 2008, 9:40 AM On 7/07/2008, at 11:05 AM, Moshe Olshansky wrote: Another possibility is to use explicit formula, i.e. if you are doing linear regression like y = a*x + b

Re: [R] Plot Mixtures of Synthetically Generated Gamma Distributions

2008-07-06 Thread Moshe Olshansky
I know very little about graphics, so my primitive and brute force solution would be plot(density(x[1:30]),col="blue");lines(density(x[31:60]),col="red");lines(density(x[61:90]),col="green") --- On Mon, 7/7/08, Gundala Viswanath [EMAIL PROTECTED] wrote: From: Gundala Viswanath [EMAIL PROTECTED]

Re: [R] A regression problem using dummy variables

2008-07-01 Thread Moshe Olshansky
Do you have a reason to treat all 3 levels together and not have a separate regression for each level? --- On Tue, 1/7/08, rlearner309 [EMAIL PROTECTED] wrote: From: rlearner309 [EMAIL PROTECTED] Subject: [R] A regression problem using dummy variables To: Received:

Re: [R] help_transformation

2008-06-26 Thread Moshe Olshansky
Let F be the distribution function of Y, PSI the standard normal distribution and IPSI its inverse. Let f(x) = IPSI(F(x)). It is not difficult to see that f(Y) has standard normal distribution. So you can replace F with the empirical distribution and IPSI is the qnorm function of R. --- On

Re: [R] [SPAM] - constructing arbitrary (positive definite) covariance matrix - Found word(s) list error in the Text body

2008-06-26 Thread Moshe Olshansky
If the main diagonal element of matrix A is 1 and the off diagonal element is a then for any vector x we get that t(x)*A*x = (1-a)*sum(x^2) + a*(sum(x))^2. If we want A to be positive (semi)definite we need this expression to be positive (non-negative) for any x != 0. Since sum(x)^2/sum(x^2) <= n

Re: [R] Measuring Goodness of a Matrix

2008-06-24 Thread Moshe Olshansky
What do you mean by A similar to X? Do you mean norm of the difference, similar eigenvalues/vectors, anything else? --- On Wed, 25/6/08, Gundala Viswanath [EMAIL PROTECTED] wrote: From: Gundala Viswanath [EMAIL PROTECTED] Subject: [R] Measuring Goodness of a Matrix To: [EMAIL PROTECTED]

Re: [R] Pairwise Partitioning of a Vector

2008-06-23 Thread Moshe Olshansky
One possibility is: x <- c(30.9, 60.1 , 70.0 , 73.0 , 75.0 , 83.9 , 93.1 , 97.6 , 98.8 , 113.9) for (i in 1:9) assign(paste("PAIR",i,sep=""),list(part1 = x[1:i],part2 = x[-(1:i)])) --- On Mon, 23/6/08, Gundala Viswanath [EMAIL PROTECTED] wrote: From: Gundala Viswanath [EMAIL PROTECTED]
