[R] encoding accents and tildes in R on Mac OS X

2008-08-10 Thread Carlos Cuartas
Hello,
In R under Mac OS X 10.5.4 I've had problems reading a data.frame whose
character columns include tildes and accents.
For instance, Floreña is changed to Flore\x96a and Ranchería is changed to 
Rancher\x92a.
The code is:
section<-read.table('Sectiondic.txt',sep='\t',header=T,stringsAsFactors=F,encoding=" ")
I've tried changing the "encoding" argument but could not find a solution.
Any suggestions?

Thanks a lot

Carlos Cuartas
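
The escaped bytes quoted above are consistent with a file saved in Mac OS Roman, where 0x96 is ñ and 0x92 is í. A minimal sketch, assuming that this is indeed the encoding of the file: declare it through the fileEncoding argument of read.table, which re-encodes the input (the encoding argument only marks strings as Latin-1 or UTF-8 and does not convert them). The temporary file and its column names are invented for illustration, and the encoding name "MACINTOSH" is an assumption that depends on the platform's iconv; iconvlist() shows the names that are available.

```r
# Build a small tab-separated file containing Mac OS Roman bytes
# (0x96 = n with tilde, 0x92 = i with acute accent); a stand-in for Sectiondic.txt.
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("name\tcode\nFlore\x96a\t1\nRancher\x92a\t2\n"), tmp)

# fileEncoding converts the input from the declared encoding to the
# native one while the file is read.
section <- read.table(tmp, sep = "\t", header = TRUE,
                      stringsAsFactors = FALSE, fileEncoding = "MACINTOSH")
print(section$name)
```

If the accents still come out wrong, the file is probably in a different encoding and another name from iconvlist() may be worth trying.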


  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mode value

2008-09-06 Thread Carlos Morales
Hello everyone,


I would like to know whether there is a function to calculate the mode of a 
set of values, or whether I have to write one myself.


Thanks so much
Carlos
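
Base R has no function for the statistical mode (mode() returns the storage mode of an object). A minimal sketch of one way to write it with table(); the name stat_mode is hypothetical, and ties are handled by returning every most-frequent value:

```r
stat_mode <- function(x) {
  counts <- table(x)
  modes <- names(counts)[counts == max(counts)]
  # table() keeps the values as character names; convert back for numeric input
  if (is.numeric(x)) as.numeric(modes) else modes
}

stat_mode(c(1, 2, 2, 3, 3, 3))
stat_mode(c("a", "b", "b", "a"))
```

Returning all tied values is a design choice; taking the first element of the result gives a single mode if that is preferred.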






[R] Problem with read.table in Windows and Linux

2008-10-03 Thread Carlos Morales
Hello everyone,

I'm trying to open the same file under Linux and Windows. Under Windows 
everything is OK, but when I try to do it under Linux I get an error and I 
don't know why. This is the error:

Error in make.names(col.names,unique=TRUE):
string multibyte 1 invalid

why?

Under Windows I use:
zz.info<-read.table(file("C:/Documents and Settings/Administrador/Desktop/carlos/aCGH/aCGH/examples/Anal_sin_norm_1Mb86_Segmentos3.txt","r+"),header=TRUE,sep="\t",dec=".")
 
and under Linux:
 
zz.info<-read.table(file("/home/carlos/Desktop/Anal_sin_norm_1Mb86_Segmentos3.txt","r+"),header=TRUE,sep="\t",dec=".")
 
Why do I have problems under Linux? If you need the text file, let me know.
 
 
Thanks so much for your help
 
Carlos
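
An "invalid multibyte string" error from make.names() typically appears when a file whose column names contain non-ASCII characters in a single-byte encoding is read in a UTF-8 locale, which is the usual default on Linux. A minimal sketch, assuming (the message does not say so) that the file was written on Windows in Latin-1: check the header with validUTF8() and declare the encoding on the connection. The file contents and column names below are invented.

```r
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("Regi\xf3n\tvalor\nA\t1.5\n"), tmp)

# The header is not valid UTF-8, which is what trips make.names() in a UTF-8 locale
header_line <- readLines(tmp, n = 1, warn = FALSE)
header_is_utf8 <- validUTF8(header_line)

# Declaring the encoding on the connection lets R convert the text while reading
zz.info <- read.table(file(tmp, encoding = "latin1"),
                      header = TRUE, sep = "\t", dec = ".")
```

Opening the connection in "r+" mode, as in the calls above, is not needed for reading; read.table opens and closes an unopened connection itself.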






[R] Error in X11

2008-10-06 Thread Carlos Morales
Hello everyone,

I'm trying to draw a plot under Linux. When I type X11() I get the 
following error:
Error in X11(d$display, d$width, d$height, d$pointsize, 
d$gamma, d$colortype,  :
 unable to start device X11cairo

 
Why? What must I do to fix it? Thanks so much
 
Carlos






[R] problem with plotting table

2007-11-27 Thread Carlos Gershenson
Hi all,

Let us have:

x<-1:10
y<-x/2
plot(table(x), type="p")
points(table(y), pch=2)


Why does the last command plot the values of table(y) using the x  
coordinates of table(x)?
Am I doing something wrong?
How can I plot the points of table(y) at their proper positions?

#this problem also occurs with:
plot(table(y), type="p")
points(table(x), pch=2)


Thank you very much,
Carlos
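
A minimal sketch that sidesteps the table methods by passing explicit coordinates; it is not necessarily the solution that was suggested on the list. The x positions are recovered from the names of each table, and the variable names x_pos and y_pos are made up for this example.

```r
x <- 1:10
y <- x / 2
tx <- table(x)
ty <- table(y)

# The observed values are stored as the names of the table; the counts are its entries
x_pos <- as.numeric(names(tx))
y_pos <- as.numeric(names(ty))

plot(x_pos, as.vector(tx), type = "p", xlab = "value", ylab = "count",
     xlim = range(c(x_pos, y_pos)))
points(y_pos, as.vector(ty), pch = 2)
```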



Re: [R] problem with plotting table

2007-11-28 Thread Carlos Gershenson
Thank you very much Duncan, that did the trick.

Thank you also Gavin and Bernardo for your feedback.

Best regards,
Carlos



[R] Writing multiple matrices

2008-12-11 Thread Carlos López

Hello all :)

I have a for loop in which each cycle creates a matrix object, 
let's say X, that I would like to write to a file
with the write.table function. I would like to write as many 
matrices as there are cycles; that is, I would like
to use the loop variable, let's say y, as in: for (y 
in 1:100)


and then, when I write the matrix to a file, I would like to produce files 
with different names, for example


my_matrix_1.dat

my_matrix_2.dat

my_matrix_3.dat
.
.
.
my_matrix_100.dat


is there any way to do this with the write.table function?

Thank you very much in advance
Carlos
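
A minimal sketch of one way to do this: build the file name from the loop variable with paste() and pass it to write.table. The output directory (a temporary one), the placeholder matrix and the reduced loop range are assumptions made so that the example runs on its own.

```r
out_dir <- tempdir()   # assumption: write to a temporary directory
for (y in 1:3) {       # 1:100 in the question
  X <- matrix(y * (1:6), nrow = 2)   # placeholder for the real matrix
  file_name <- file.path(out_dir, paste("my_matrix_", y, ".dat", sep = ""))
  write.table(X, file = file_name, row.names = FALSE, col.names = FALSE)
}
written <- list.files(out_dir, pattern = "^my_matrix_[0-9]+\\.dat$")
```

sprintf("my_matrix_%03d.dat", y) would give zero-padded names that sort in numeric order, if that matters.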


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



Re: [R] character count

2008-12-12 Thread Carlos Cuartas
Hi Ista,
one way could be:

ncharacters<-unlist(lapply(x,function(x)nchar(gsub(' ','',x))))
ncharacters




From: Ista Zahn 
To: r-help@r-project.org
Sent: Friday, December 12, 2008 10:31:10 AM
Subject: [R] character count

Dear list,
I have a variable that consists of typed responses. I wish to compute
a variable equal to the number of characters in the original variable.
For example:

> x <- c("convert this to 32 because it has 32 characters", "this one has 22 characters", "12 characters")

[Some magic function here]

> x
[1] 32 22 12

Any ideas?



Re: [R] Useful books for learning the R software and the S programming language

2009-01-12 Thread Carlos Guerra

Robert,

I have Peter's book and I think it can be a very good place to start  
from... despite the discount... :)


If you like spatial analysis you can look for Roger Bivand et  
al., "Applied Spatial Data Analysis with R". If you are into something  
else, try the "Use R!" collection from Springer... you may find  
something not that "pricey" that you can use.


Best regards
Carlos

On 2009/01/12, at 22:07, Peter Dalgaard wrote:


Robert Wilk wrote:

any useful books for learning the R statistical software?
are they pricey?


Many. "Useful" depends on the reader, though, so look around. Here's  
a starting point


http://www.r-project.org/doc/bib/R-books.html

(modesty should forbid me to point at item 18 on the list and the  
fact that Amazon US has it currently 19% discounted)


In general R books are cheaper than statistical monographs, but more  
expensive than the large market computer science books.


and if the books recommended focus on S, how compatible will they  
be for

someone learning R?


Such books are strongly outnumbered by now. One important book from  
that group is Venables+Ripley's Modern Applied Statistics with S, which  
explicitly addresses R issues.



thank you in advance for your help.
P.S.
specialized survey statistical procedures? Is R good at that?


Not R in itself, but the "survey" package for it is rumoured to be  
state of the art, and its author has a book on it in its final stages.



--
  O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
 c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)  
35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45)  
35327907




[R] number of Mondays

2009-01-15 Thread Carlos Hernandez

Dear all,
I'm trying to calculate the number of Mondays, Tuesdays, etc. that each  
month within a date range has. I have time series data that spans 60  
months and I want to calculate the number of Mondays, Tuesdays, Wednesdays,  
etc. in each month. (I want to control for weekly seasonality but my  
data is monthly.)


Is there an easy way to do this in R? Or is there a package I could  
use? I did a quick search in the help files and R sites but could  
not find any answers.


I appreciate any hint you could give,

thanks.

Carlos



Re: [R] number of Mondays

2009-01-15 Thread Carlos Hernandez

Indeed, I overlooked weekdays.

Thank you all for your replies!


On Jan 15, 2009, at 21:23 , Prof Brian Ripley wrote:


Or for those not allergic to reading help, see ?weekdays .

Just how hard do you have to work to miss that?  E.g. ??day works.

On Thu, 15 Jan 2009, Peter Dalgaard wrote:


Carlos Hernandez wrote:

Dear all,
I'm trying to calculate the number of Mondays, Tuesdays, etc. that each
month within a date range has. I have time series data that spans 60
months and I want to calculate the number of Mondays, Tuesdays, Wednesdays,
etc. in each month. (I want to control for weekly seasonality but my
data is monthly.)

Is there an easy way to do this in R? Or is there a package I could use?
I did a quick search in the help files and R sites but could not find
any answers.

I appreciate any hint you could give,


This is where POSIXlt objects are useful:


unlist(unclass(as.POSIXlt(ISOdate(1959,3,11))))

  sec   min  hour  mday   mon  year  wday  yday isdst
    0     0    12    11     2    59     3    69     0

Which means that I was born on a Wednesday (wday==3) in March  
(mon==2)

(some of the fields count from 0 and others, like mday, from 1;
presumably some UNIX vendor back in the Stone Age got their
implementation turned into a standard...).

This allows you to do stuff like:



dd <- seq(Sys.Date(),as.Date("2009-3-11"),1)
dd <- as.POSIXlt(dd)
with(dd, table(mon,wday))

 wday
mon 0 1 2 3 4 5 6
0 2 2 2 2 3 3 3
1 4 4 4 4 4 4 4
2 2 2 2 2 1 1 1

which I think is pretty much what you were looking for.
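
The same idea can be sketched for a span of 60 months, keeping the year in the row label so that the same month of different years is not pooled. The date range below is hypothetical; wday counts from 0 (Sunday) to 6 (Saturday).

```r
dd <- seq(as.Date("2004-01-01"), as.Date("2008-12-31"), by = "day")
lt <- as.POSIXlt(dd)

# rows: calendar month ("YYYY-MM"); columns: wday, 0 = Sunday ... 6 = Saturday
counts <- table(month = format(dd, "%Y-%m"), wday = lt$wday)
head(counts, 3)
```

as.data.frame.matrix(counts) turns the result into a data frame with one row per month, which may be easier to merge with monthly series.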



thanks.

Carlos




--
 O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)  
35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45)  
35327907





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595




[R] permutation of the rows of a matrix that minimizes the distance to another matrix

2008-04-24 Thread Carlos Soares
Hi!
I have 2 matrices of numbers m1 and m2 with the same number of columns 
and rows. I would like to compute m2', the permutation of the rows of m2 
such that the distance (e.g., sum(m1-m2') or sum((m1-m2')^2))  is 
minimized. Do you know of any function/algorithm to obtain such a 
permutation?
Best regards,
Carlos
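
A sketch of one way to look at this, under the assumption that the squared distance is the criterion (sum(m1-m2') is the same for every row permutation, since permuting rows does not change the sum of all entries). With cost[i, j] defined as the squared distance between row i of m1 and row j of m2, the task is a linear assignment problem. The brute-force search below is only meant for small matrices and all function names are hypothetical; for larger problems, solve_LSAP() in the clue package (Hungarian method) is, as far as I know, the usual tool.

```r
row_cost <- function(m1, m2) {
  n <- nrow(m1)
  cost <- matrix(0, n, n)
  for (i in seq_len(n)) for (j in seq_len(n)) {
    cost[i, j] <- sum((m1[i, ] - m2[j, ])^2)
  }
  cost
}

# All permutations of a vector, built recursively (feasible for small n only)
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

best_row_permutation <- function(m1, m2) {
  cost <- row_cost(m1, m2)
  n <- nrow(cost)
  perms <- all_perms(seq_len(n))
  totals <- vapply(perms, function(p) sum(cost[cbind(seq_len(n), p)]), numeric(1))
  perms[[which.min(totals)]]
}

m1 <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3)
m2 <- m1[c(3, 1, 2), ]          # m1 with its rows shuffled
p <- best_row_permutation(m1, m2)
m2_perm <- m2[p, ]              # rows of m2 reordered to match m1
```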



[R] van der Corput sequences

2008-05-23 Thread Carlos Ungil
Alberto,

I think the functions below do what you want:

> vanDerCorput(12,6)
 [1] 0.1667 0.3333 0.5000 0.6667 0.8333 0.0278
 [7] 0.1944 0.3611 0.5278 0.6944 0.8611 0.0556

Regards,

Carlos

number2digits=function(n,base){
  #first digit in output is the least significant
  digit=n%%base
  if (n
[...]
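
Because the functions above are cut off in the archive, the following is an independent sketch of the van der Corput construction (the radical inverse of 1, 2, ..., n in a given base), not the code that was posted. The name van_der_corput and its argument order are assumptions.

```r
van_der_corput <- function(n, base = 2) {
  vapply(seq_len(n), function(k) {
    value <- 0
    denom <- base
    # Peel off the digits of k, least significant first, and mirror
    # them around the radix point
    while (k > 0) {
      value <- value + (k %% base) / denom
      k <- k %/% base
      denom <- denom * base
    }
    value
  }, numeric(1))
}

round(van_der_corput(12, 6), 4)
```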


[R] Datasets in R

2008-05-29 Thread Carlos López
I'm trying to find datasets whose residuals, after applying 
the lm function, show non-normality, non-linearity, and heteroscedasticity, 
so that I can use them to illustrate
those cases in the linear regression model. Can you give any advice on 
which datasets would be appropriate? I can't use the ones in the alr3 
package because those have
already been seen in class.

Thank you very much :-)
natorro
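
If no suitable real dataset turns up, one alternative is to simulate data with a known violation. A minimal sketch, with all sample sizes, coefficients and error distributions chosen arbitrarily for illustration:

```r
set.seed(123)
n <- 200
x <- runif(n, 1, 10)

y_het  <- 2 + 3 * x + rnorm(n, sd = 0.5 * x)   # error spread grows with x
y_nl   <- 2 + 0.5 * x^2 + rnorm(n)             # true relation is quadratic
y_skew <- 2 + 3 * x + (rexp(n) - 1)            # skewed, non-normal errors

res_het  <- residuals(lm(y_het ~ x))
res_nl   <- residuals(lm(y_nl ~ x))
res_skew <- residuals(lm(y_skew ~ x))

op <- par(mfrow = c(1, 3))
plot(x, res_het, main = "heteroscedastic")
plot(x, res_nl, main = "non-linear")
qqnorm(res_skew, main = "non-normal")
qqline(res_skew)
par(op)
```

Simulated data have the advantage that the class can be told exactly which assumption was broken and by how much.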



[R] left-aligned title?

2008-06-16 Thread Carlos Gershenson

Hi,

I am trying to insert a letter in a plot corner outside the plotting  
area. Thus, "legend" and "text" don't seem to work. "title" does the  
trick, but I cannot find a way of moving it from the center to the  
left corner... I already tried with a few parameters from par, but  
title does not take them.


Would anyone have an idea on how to pull this one off?

Thank you very much,

Carlos Gershenson
http://homepages.vub.ac.be/~cgershen/
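
A minimal sketch of two base-graphics options, assuming a single default plot: title() accepts the graphical parameter adj (0 = left, 1 = right) when it is passed in the call itself, and mtext() writes into a margin at a chosen side, line and adj. The letter "A" and the margin line are arbitrary choices.

```r
# Option 1: left-justify the main title
plot(1:10)
title(main = "A", adj = 0)

# Option 2: write the label into the top margin, flush left
plot(1:10)
mtext("A", side = 3, line = 1, adj = 0, font = 2)
```

With mtext(), a negative value of adj or the at argument can push the letter further towards the corner of the figure region.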



[R] Constrained regression

2008-03-02 Thread Carlos Alzola
Dear list members,

I am trying to get information on how to fit a linear regression with
constrained parameters. Specifically, I have 8 predictors; their
coefficients should all be non-negative and add up to 1. I understand it is
a quadratic programming problem but I have no experience in the subject. I
searched the archives but the results were inconclusive. 

Could someone provide suggestions and references to the literature, please?

Thank you very much.

Carlos

Carlos Alzola
[EMAIL PROTECTED]
(703) 242-6747 




[R] Constrained regression

2008-03-05 Thread Carlos Alzola
I would like to acknowledge the answers I received from Tom Filloon, Mike
Cheung and Berwin Turlach.
Berwin's response was exactly what I needed: use solve.QP from the quadprog
package in R. S-Plus has the equivalent function solveQP in the NuOpt
module.

Berwin's response is below.

G'day Carlos,

On Mon, Mar 3, 2008 at 11:52 AM
Carlos Alzola <[EMAIL PROTECTED]> wrote:

>  I am trying to get information on how to fit a linear regression with 
> constrained parameters. Specifically, I have 8 predictors; their 
> coefficients should all be non-negative and add up to 1. I understand 
> it is a quadratic programming problem but I have no experience in the 
> subject. I searched the archives but the results were inconclusive.
>
>  Could someone provide suggestions and references to the literature, 
> please?

A suggestion:

> library(MASS)   ## to access the Boston data
> designmat <- model.matrix(medv~., data=Boston)
> Dmat <- crossprod(designmat, designmat)
> dvec <- crossprod(designmat, Boston$medv)
> Amat <- cbind(1, diag(NROW(Dmat)))
> bvec <- c(1, rep(0,NROW(Dmat)))
> meq <- 1
> library(quadprog)
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
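
For reference, the problem handed to solve.QP in the code above can be written out as follows. This is a sketch of the formulation as read from the code; the symbols are introduced here and are not part of the original message: X stands for designmat, y for Boston$medv and \beta for the coefficient vector.

```latex
\begin{aligned}
\min_{\beta}\quad & \tfrac{1}{2}\,\beta^{\top} D \beta - d^{\top}\beta,
  \qquad D = X^{\top}X,\quad d = X^{\top}y \\
\text{subject to}\quad & \sum_{j} \beta_j = 1
  \quad \text{(first column of Amat, an equality because meq = 1)} \\
& \beta_j \ge 0 \ \text{for all } j
  \quad \text{(remaining columns of Amat)}
\end{aligned}
```

Since \tfrac{1}{2}\lVert y - X\beta \rVert^2 = \tfrac{1}{2}\beta^{\top}X^{\top}X\beta - y^{\top}X\beta + \tfrac{1}{2}y^{\top}y, minimizing this objective is the same as minimizing the residual sum of squares under the constraints.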

The solution seems to contain values that are, for all practical purposes,
actually zero:

> res$solution
 [1]  4.535581e-16  2.661931e-18  1.016929e-01 -1.850699e-17
 [5]  1.458219e-16 -3.892418e-15  8.544939e-01  0.000000e+00
 [9]  2.410742e-16  2.905722e-17 -5.700600e-20 -4.227261e-17
[13]  4.381328e-02 -3.723065e-18

So perhaps better:

> zapsmall(res$solution)
 [1] 0.0000000 0.0000000 0.1016929 0.0000000 0.0000000 0.0000000
 [7] 0.8544939 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[13] 0.0438133 0.0000000

So the estimates seem to follow the constraints.

And the unconstrained solution is:

> res$unconstrainted.solution
 [1]  3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02
 [5]  2.686734e+00 -1.776661e+01  3.809865e+00  6.922246e-04
 [9] -1.475567e+00  3.060495e-01 -1.233459e-02 -9.527472e-01
[13]  9.311683e-03 -5.247584e-01

which seems to coincide with what lm() thinks it should be:

> coef(lm(medv~., Boston))
  (Intercept)          crim            zn         indus          chas 
 3.645949e+01 -1.080114e-01  4.642046e-02  2.055863e-02  2.686734e+00 
          nox            rm           age           dis           rad 
-1.776661e+01  3.809865e+00  6.922246e-04 -1.475567e+00  3.060495e-01 
          tax       ptratio         black         lstat 
-1.233459e-02 -9.527472e-01  9.311683e-03 -5.247584e-01 

So there seem to be no numeric problems.  Otherwise we could have done
something else (e.g calculate the QR factorization of the design matrix, say
X, and give the R factor to solve.QP, instead of calculating X'X and giving
that one to solve.QP).

If the intercept is not supposed to be included in the set of constrained
estimates, then something like the following can be done:

> Amat[1,] <- 0
> res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
> zapsmall(res$solution)
 [1] 6.073972 0.000000 0.109124 0.000000 0.000000 0.000000 0.863421
 [8] 0.000000 0.000000 0.000000 0.000000 0.000000 0.027455 0.000000

Of course, since after the first command in that last block the second
column of Amat contains only zeros
> Amat[,2]
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
we might as well have removed it (and the corresponding entry in bvec)
> Amat <- Amat[, -2]
> bvec <- bvec[-2]
before calling solve.QP().

Note, the Boston data set was only used to illustrate how to fit such
models, I do not want to imply that these models are sensible for these
data. :-)

Hope this helps.

Cheers,

Berwin


Carlos Alzola
[EMAIL PROTECTED]
(703) 242-6747 




[R] Problem with script

2008-10-26 Thread Carlos López
e_counter] <- x_list[random_sq]
 y_list[range_counter] <- y_list[random_sq] + 1
 myImagePlot(mexico_matrix + random_matrix)
   }
 }
 
 
 if (aleatorio < 0.50){

   if(random_matrix[x_list[random_sq]+1, y_list[random_sq]] == 0) {
 random_matrix[x_list[random_sq] + 1, y_list[random_sq]] <- 1
 flag <- 1
 range_counter <- range_counter + 1
 x_list[range_counter] <- x_list[random_sq] + 1
 y_list[range_counter] <- y_list[random_sq]
 myImagePlot(mexico_matrix + random_matrix)
   }
 }
 
 if (aleatorio < 0.25) {

   if(random_matrix[x_list[random_sq]-1, y_list[random_sq]] == 0 ) {
 random_matrix[x_list[random_sq] - 1, y_list[random_sq]] <- 1
 flag <- 1
 range_counter <- range_counter + 1
 x_list[range_counter] <- x_list[random_sq] - 1
 y_list[range_counter] <- y_list[random_sq]
 myImagePlot(mexico_matrix + random_matrix)
   }
 }
 
 
   }

 }
}

sum(random_matrix)


#---


I do the "sum(random_matrix)" just to check whether the number of zeroes 
needed has been set (it has never reached 100 so far :-()


I'm not sure what I am doing wrong. I've been working with this file on 
Windows and Mac at the same time in a collaboration.

Any help will be greatly appreciated.

Carlos
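
Only part of the script survives in the archive, so the following is a guess rather than a diagnosis. In the visible fragment the tests aleatorio < 0.50 and aleatorio < 0.25 are separate if statements, so a draw below 0.25 satisfies both and can trigger two moves. A sketch of picking exactly one direction per draw with cut(); the function name and the direction labels are hypothetical:

```r
pick_direction <- function(u) {
  # u is a uniform draw in [0, 1); cut() assigns it to exactly one of four bins
  as.character(cut(u, breaks = c(0, 0.25, 0.5, 0.75, 1),
                   labels = c("left", "right", "down", "up"),
                   right = FALSE))
}

aleatorio <- runif(1)
direction <- pick_direction(aleatorio)
```

A single switch(direction, ...) can then hold the four update blocks, so that only one of them runs for each random draw.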




[R] About Pareto distribution

2008-10-27 Thread Carlos López
Hi again :-) I finally was able to fix the program, thank you all very 
much for your help :-)


Now I have another problem and I don't know whether it is possible to solve it with 
R. I have a data set and,
because it consists of salaries, I suspect it comes from a Pareto 
distribution. My questions

are:

1. Is there any hypothesis test in R to check whether data come from a 
Pareto distribution?


2. How could I estimate the parameters with R? Are there any functions 
for this?


Thank you very much again :)
Carlos
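
A minimal sketch in base R, assuming a Pareto type I distribution with scale x_m and shape alpha. The maximum-likelihood estimates have closed forms (the sample minimum, and n divided by the sum of log(x / x_m)), and ks.test() can compare the data with the fitted distribution. The simulated salaries and the helper ppareto are invented for the example.

```r
set.seed(1)
x_m <- 1000
alpha <- 2.5
# Simulate by inverting the CDF F(x) = 1 - (x_m / x)^alpha
salaries <- x_m * runif(500)^(-1 / alpha)

# Maximum-likelihood estimates for the Pareto type I parameters
xm_hat <- min(salaries)
alpha_hat <- length(salaries) / sum(log(salaries / xm_hat))

# Pareto type I distribution function, used as the reference for the test
ppareto <- function(q, xm, alpha) ifelse(q < xm, 0, 1 - (xm / q)^alpha)
ks <- ks.test(salaries, ppareto, xm = xm_hat, alpha = alpha_hat)
```

Because the parameters are estimated from the same data, the p-value reported by ks.test() is only approximate; a plot of log(1 - ecdf) against log(x), which should be roughly linear for Pareto data, is a useful visual check alongside it.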



[R] How to plot with different colours

2008-11-02 Thread Carlos Morales
Hello everyone,

I'm trying to plot 3600 points. My idea is that if a value is higher than 0.35 
the point should appear in green, if it is smaller than -0.35 it should 
appear in red, and if it is between -0.35 and 0.35 it should appear 
in yellow. I have tried many things but have not managed it. Any 
ideas?

Thanks so much
Carlos Morales Diego
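
A minimal sketch: plot() accepts a vector of colours, one per point, so the colours can be computed from the values with nested ifelse() calls. The simulated values stand in for the real data.

```r
set.seed(42)
values <- runif(3600, -1, 1)   # placeholder for the 3600 real values

point_col <- ifelse(values > 0.35, "green",
                    ifelse(values < -0.35, "red", "yellow"))
plot(values, col = point_col, pch = 20)
```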






[R] licensing of R packages

2008-11-14 Thread Carlos Ungil

I know the standard answer to this kind of question is "get legal
advice from a lawyer", but I would like to hear the (hopefully
informed) opinion of other people.

I would say that, according to the FSF's interpretation of the GPL,
any R code using GPL packages can be distributed legally only using
GPL-compatible licenses.

http://www.gnu.org/licenses/gpl-faq.html#IfInterpreterIsGPL
> Another similar and very common case is to provide libraries with the
> interpreter which are themselves interpreted. For instance, Perl comes
> with many Perl modules, and a Java implementation comes with many Java
> classes. These libraries and the programs that call them are always
> dynamically linked together.  
>
> A consequence is that if you choose to use GPL'd Perl modules or Java
> classes in your program, you must release the program in a
> GPL-compatible way, regardless of the license used in the Perl or Java
> interpreter that the combined Perl or Java program will run on.

If the reasoning above applies to R as it does to Perl, all R code
would be affected given that core packages like "base" are GPL.

The interpretation of the R Foundation (the copyright holder in this
case) seems more relaxed, but I wonder what is the intent of other
people distributing R packages under the GPL. Maybe some of them would
protest if R code using their package was distributed under a
non-GPL-compatible license. For example, I would expect the authors of
the GNU Scientific Library to defend that any package using "gsl" (a
wrapper on their GPL library) should be published under a
GPL-compatible license, being a derivative work (the FSF thinks so).

Another question is if that "strict" interpretation of the GPL could
be actually enforced, of course. Coming back to the GSL example, it
seems a more flagrant violation of the license is already happening:
http://www.numerit.com/gsl.htm (apparently the publisher of that
product thinks that linking to a GPL dll doesn't impose any obligation
to him, but the usual view of the FSF is quite the opposite; I just
found that page by chance, I don't know anything else about that
particular case).

I've noticed that this question was posed in r-devel a couple of years ago,
I'm surprised it didn't provoke more than one reply:
https://stat.ethz.ch/pipermail/r-devel/2006-September/042715.html

Cheers,

Carlos

PS: By the way, I think FAQ 2.11 should be fixed: it states that "R is
released under the GNU General Public License (GPL)", without
specifying the version and linking to
http://www.gnu.org/copyleft/gpl.html (GPLv3). However, the COPYING
file in the R directory corresponds to GPL2.


-- 
View this message in context: 
http://www.nabble.com/licensing-of-R-packages-tp20497391p20497391.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] licensing of R packages

2008-11-14 Thread Carlos Ungil


Prof Brian Ripley wrote:
> 
> I'm not going into the original question except to point out that R is 
> licensed under GPL-2 and the quote was from the GPL-3 FAQ.  As FSF 
> themselves insist, the two licences are incompatible.
> 

Let me quote the corresponding section in the GPL2 FAQ, then:

http://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html#IfInterpreterIsGPL
> Another similar and very common case is to provide libraries with the
> interpreter which are themselves interpreted. For instance, Perl comes
> with many Perl modules, and a Java implementation comes with many Java
> classes. These libraries and the programs that call them are always
> dynamically linked together.
>
> A consequence is that if you choose to use GPL'd Perl modules or Java
> classes in your program, you must release the program in a
> GPL-compatible way, regardless of the license used in the Perl or Java
>  interpreter that the combined Perl or Java program will run on.

Core R packages included in the R distribution are in fact "GPL (>= 2)" [*],
but choosing GPLv2 or GPLv3 seems to make no difference in regard to the
issue being discussed (again, according to the interpretation given by the
FSF). 

Regards,

Carlos

[*] this is not the case for all the recommended packages in the
distribution
-- 
View this message in context: 
http://www.nabble.com/licensing-of-R-packages-tp20497391p20503264.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] licensing of R packages

2008-11-14 Thread Carlos Ungil


Barry Rowlingson wrote:
> 
> This misconception of the license terms comes about because of the
> use of the word 'use'. If I distribute a short C program that has a
> call in it to a function that has the same name as something in the
> GSL, does my C program use the GSL? No. Maybe it _mentions_ the GSL,
> but the GPL has no problems with that.

Maybe the GPL has no problems with that, but GSL authors will have. For
example, regarding a similar situation one of the GSL authors commented:

[http://sourceware.org/ml/gsl-discuss/2001-q4/msg00033.html]
> Any distributed code which refers to GSL functions should be licensed
> to the end-user under the GPL.  The intent of the GPL is that we make
> our code free to other people if they do the same for us --- two-way
> cooperation.  The current R-quant license is not a free software
> license so there should not be anything distributed under that license
> which directly refers to GSL functions.


Barry Rowlingson wrote:
> 
> I'm distributing my C program, and not the GPL-covered code, so I can
> license it how I like.
> 

And the copyright owners have recourse to legal action if they think there
is a license violation. Again, I don't know what a court would decide, but
if you want to test the limits of the GPL license I would avoid challenging
a GNU project :-)

Cheers,

Carlos

-- 
View this message in context: 
http://www.nabble.com/licensing-of-R-packages-tp20497391p20504401.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] licensing of R packages

2008-11-14 Thread Carlos Ungil


Duncan Murdoch-2 wrote:
> 
> The way to lose a GPL lawsuit is to incorporate GPL'd code into your own 
> project, and then not follow the GPL when you redistribute.  There's 
> evidence of that.
> 
> But I've never heard of anyone linking to but not distributing GPL'd 
> code and being sued for it, let alone losing lawsuits over it.  That's 
> evidence enough for me that it is a safe thing to do.
> 

The LGPL covers the first point as well as the GPL does (I think), while
explicitly allowing dynamic linking (so the second point is clearly not a
problem). The FSF encourages using the GPL (and not the LGPL) precisely to
make libraries available only to GPL projects. It's not surprising therefore
that the GPL license scares people off.

The safest option is to take the GPL at face value and accept the FSF
interpretation, but depending on the jurisdiction, the details of the
situation, and the level of risk-aversion, people might decide to do
otherwise. The concept of "derivative work" is not really well defined, see
for example http://www.rosenlaw.com/html/GPL.PDF

To give another example related to R, the FSF's view of the world
is that RPy, because it links dynamically to libR.so (or R.dll, etc), has to
be distributed with a GPL (-compatible?) license. And the same restriction
applies in turn to a python program using RPy (again, according to the FSF;
because RPy and the "derivative work" are dynamically linked by the python
interpreter).

However, not everyone shares that view:

http://mail.python.org/pipermail/python-list/2005-January/304974.html
> On the basis of these clauses, the legal advice to us was that merely 
> including "import rpy" and making calls to RPy-wrapped R functions does 
> not invoke the provisions of the GPL because these calls only relate to 
> run-time linking, which is not covered by the GPL. However, combining 
> GPLed source code or static linking would invoke the GPL provisions. 
> []
> IANAL, and the above constitutes mangled paraphrasing of carefully 
> worded formal legal advice, the scope of which was restricted to 
> Australian law. However, the sections of the GPL quoted above are pretty 
> unambiguous.
> The other, informal advice, was to ignore the FAQs and other opinions on 
> the FSF web site regarding interpretation of the GPL - it's only the 
> license text which counts.

Cheers,

Carlos
-- 
View this message in context: 
http://www.nabble.com/licensing-of-R-packages-tp20497391p20509444.html
Sent from the R help mailing list archive at Nabble.com.



[R] applying a function to another function

2008-12-03 Thread Carlos Cuartas
Hello,
I wrote a function (sec_conop) whose arguments are syndic,
well and wellconop. 
 
sec_conop(syndic='01syndic.txt',well='well-1.csv',wellconop='well-1.dat');closeAllConnections()
 
This function takes “well” and “syndic”, matches them against each
other, and then applies some transformations. The result is exported to
“wellconop”.
I will apply this function to one hundred different “wells”.
Therefore, for each well I use, the “wellconop” argument will change too.
For instance, if “well” is now well-2.csv, the call will be 

sec_conop(syndic='01syndic.txt',well='well-2.csv',wellconop='well-2.dat');closeAllConnections()
 
I am trying to apply this function automatically to all the
“wells” I have, but I cannot find a way.
The last thing I tried, for three different “wells”, was:

wells<-data.frame(funct=rep('sec_conop(',3),syndic=c('01syndic.txt','01syndic.txt','01syndic.txt'),well=c('well-1.csv','well-2.csv','well-3-1.csv'),wellconop=c('well-1.dat','well-2.dat','well-3.dat'))
 
funct_3wells<-paste(wells$funct,"'",wells$syndic,"'", ","
,"'", wells$well,"'",
"," , "'" ,wells$wellconop,"'",")",";","closeAllConnections()",sep='')
 
lapply(funct_3wells,as.formula)
 
This approach only partially works, because the results in “wellconop”
are truncated. 
Does anyone have any suggestions?
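
One way to run the same call over many wells is to keep the file names in plain character vectors and loop over them, rather than pasting code together as text. The sketch below is only an illustration: the real sec_conop() is not shown in this message, so a stand-in with the same three arguments is defined here, and the output files are written to a temporary directory.

```r
# Hypothetical stand-in for the real sec_conop(); it only records its inputs
# so that the loop below can be run as-is.
sec_conop <- function(syndic, well, wellconop) {
  writeLines(paste(syndic, well), wellconop)
  invisible(wellconop)
}

out.dir <- tempdir()                      # assumption: any writable folder
wells   <- sprintf("well-%d.csv", 1:3)    # input names, one per well
outputs <- file.path(out.dir, sub("\\.csv$", ".dat", wells))

# Call the function once per well, pairing each input with its output name.
for (i in seq_along(wells)) {
  sec_conop(syndic = "01syndic.txt", well = wells[i], wellconop = outputs[i])
}
```

When the output names cannot be derived from the input names, two explicit vectors of equal length work the same way. For a real folder, list.files(pattern = "\\.csv$") is one way to build the vector of inputs.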

Thanks in advance

Carlos


  


[R] applying a function several times

2008-12-05 Thread Carlos Cuartas
I am sorry. I am not sure whether the mail I sent to this list earlier was rejected 
because of its header (subject).
I've changed it, since the first one may not have been appropriate.

I wrote a function (sec_conop) whose arguments are syndic,
well and wellconop. 

sec_conop(syndic='01syndic.txt',well='well-1.csv',wellconop='well-1.dat');closeAllConnections()

This function takes “well” and “syndic”, matches them against each
other, and then applies some transformations. The result is exported to
“wellconop”.

I will apply this function to one hundred different “wells”.
Therefore, for each well I use, the “wellconop” argument will change too.
For instance, if “well” is now well-2.csv, the call will be 

sec_conop(syndic='01syndic.txt',well='well-2.csv',wellconop='well-2.dat');closeAllConnections()

I am trying to apply this function automatically to all the
“wells” I have, but I cannot find a way.
The last thing I tried, for three different “wells”, was:

wells<-data.frame(funct=rep('sec_conop(',3),syndic=c('01syndic.txt','01syndic.txt','01syndic.txt'),well=c('well-1.csv','well-2.csv','well-3-1.csv'),wellconop=c('well-1.dat','well-2.dat','well-3.dat'))

funct_3wells<-paste(wells$funct,"'",wells$syndic,"'", ","
,"'", wells$well,"'",
"," , "'" ,wells$wellconop,"'",")",";","closeAllConnections()",sep='')

lapply(funct_3wells,as.formula)

This approach only partially works, because the results in “wellconop”
are truncated. 
Does anyone have any suggestions?

Thanks in advance

Carlos


  


[R] Error in var(x, na.rm = na.rm) : no complete element pairs

2009-02-22 Thread Carlos Morales
Hello all,

I'm trying to calculate the standard deviation using the function 
sd(x,na.rm=TRUE), and I get this error:  Error in var(x, na.rm = na.rm) : no 
complete element pairs . Why does this happen? What can I do to solve it? x is a 
list of three numbers which I take from a table.
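
A possible first check, sketched below under assumptions, is whether x really is a numeric vector with at least two non-missing values once NAs are dropped. The values and object names here are hypothetical; the table from the message is not available.

```r
# Hypothetical values standing in for the three cells taken from the table.
x <- list(4.2, 5.1, 3.9)

# sd() expects a numeric vector, so flatten the list first.
v <- as.numeric(unlist(x))

# At least two non-missing values are needed for a standard deviation.
n.ok <- sum(!is.na(v))
s <- if (n.ok >= 2) sd(v, na.rm = TRUE) else NA_real_

# If the column was read as text or as a factor, convert via character;
# entries that are not numbers become NA.
f  <- factor(c("4.2", "5.1", "n/a"))
v2 <- suppressWarnings(as.numeric(as.character(f)))
sum(!is.na(v2))   # how many usable values remain
```

If fewer than two usable values remain after the conversion, the problem is in the data taken from the table rather than in sd() itself.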


Thanks so much from Spain
Carlos Morales Diego






[R] Error in var(x, na.rm = na.rm) : no complete element pairs

2009-02-22 Thread Carlos Morales


Hello all,

I'm trying to calculate the standard deviation with sd(x,na.rm=TRUE) and I 
don't know why I get this error: Error in var(x, na.rm = na.rm) : no complete 
element pairs. I have been looking for information 
about this error but have found nothing. Why does it happen? What can I do to fix it? Thanks 
so much from Spain


Carlos Morales Diego






[R] Error in var(x, na.rm = na.rm) : no complete element pairs

2009-03-05 Thread Carlos Morales

Hello,

I still get the error quoted in the Subject field. I leave 
the code here and I hope you can help me with this:

filter.clones<-function(zz.info,crom.info) 
{ 
clones.info<-zz.info 
 
cat("Removing clones which have a flag lower than 0\n") 
ord <- order(clones.info$Flags) 
clones.info<- clones.info[ ord, ] 
#for(j in 1:nrow(clones.info)) 
#{ 
del<-0 
#print(j) 
del<-which(as.numeric(clones.info$Flags)<0) 
if (length(del)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del,] 
#eliminados.info<-clones.info[del,] 
#if(j==1) 
#{ 
#j<-0 
#} 
} 
#} 
## Remove yeast, flies, etc. 
#for(j in 1:nrow(clones.info)) 
#{ 
 
del1<-0 
del1<-grep("mix",clones.info$Name) 
if (length(del1)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del1,] 
} 
#} 
#for(j in 1:nrow(clones.info)) 
#{ 
 
del2<-0 
del2<-grep("fly",clones.info$Name) 
if (length(del2)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del2,] 
} 
#} 
#for(j in 1:nrow(clones.info)) 
#{ 
 
del3<-0 
del3<-grep("pombe",clones.info$Name) 
if (length(del3)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del3,] 
} 
#} 
#for(j in 1:nrow(clones.info)) 
#{ 
 
del4<-0 
del4<-grep("DMSO",clones.info$Name) 
if (length(del4)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del4,] 
} 
#} 
# Remove the clones that are joined by a + or a minus sign 
#for(j in 1:nrow(clones.info)) 
#{ 
del5<-0 
del5<-grep("[+]",clones.info$Name) 
if (length(del5)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del5,] 
} 
#} 
#for(j in 1:nrow(clones.info)) 
#{ 
 
del6<-0 
del6<-grep("[-]",clones.info$Name) 
if(length(del6)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del6,] 
} 
#} 
#for(j in 1:nrow(clones.info)) 
#{ 
 
del7<-0 
del7<-grep("rep",clones.info$Name) 
if(length(del7)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del7,] 
} 
#} 
del8<-0 
del8<-grep("REP",clones.info$Name) 
if(length(del8)!=0) 
{ 
#print(j) 
clones.info<-clones.info[-del8,] 
} 
 
 
 
#cat("Numero de clones:",NROW(clones.info$Name),"\n") 
#chroms.info<-croms.info(PruebaDefinitiva.obj) 
#cat("Reordering the chromosomes\n") 
#ord <- order(chroms.info$picked_off_as_SI_name) 
#chroms.info<- chroms.info[ ord, ] 
 
 
#ord <- order(PruebaDefinitiva.obj$crom.info$picked_off_as_SI_name) 
##crom.info <- crom.info[ ord, ] 
 
nrow(clones.info) 
#a<-PruebaDefinitiva.obj$zz.info 
#PruebaDefinitiva.obj$zz.info<-0 
#PruebaDefinitiva.obj$zz.info<-clones.info 
#PruebaDefinitiva.obj$zz.info 
clones.info 
 
 
cat("Reordering the chromosomes\n") 
ord <- order(crom.info$picked_off_as_SI_name) 
crom.info<- crom.info[ ord, ] 
#arch.info<-cbind(arch.info,000) 
#names(arch.info)[NCOL(arch.info)]<-"Cromosomas" 
 
clones2.info<-clones.info 
clones2.info<-cbind(clones2.info,000) 
names(clones2.info)[NCOL(clones2.info)]<-"Cromosomas" 
clones2.info 
 
 
## Add a column with the chromosomes 
#ncol(arch.info) 
#arch.info<-arch.info 
#arch.info<-cbind(arch.info,000) 
#names(arch.info)[NCOL(arch.info)]<-"Cromosomas" 
ord <- order(clones2.info$Name) 
clones2.info<- clones2.info[ ord, ] 
 
for(i in 1:nrow(clones2.info)) 
{ 
cat("Processing clone ",i,"\n") 

find<-match(clones2.info$Name[i],crom.info$picked_off_as_SI_name,nomatch=0) 
print(find) 
if((length(find)!=0) &&(find!=0)) 
{ 
 
clones2.info$Cromosomas[i]<-paste(crom.info$current_chromosome[find]) 
} 
find<-0 
} 
 
del1<-0 
del1<-grep("X",clones2.info$Cromosomas) 
if (length(del1)!=0) 
{ 
#print(j) 
clones2.info<-clones2.info[-del1,] 
} 
 
del1<-0 
del1<-grep("Y",clones2.info$Cromosomas) 
if (length(del1)!=0) 
{ 
#print(j) 
clones2.info<-clones2.info[-del1,] 
} 
 
del1<-0 
del1<-grep("Un_",clones2.info$Cromosomas) 
if (length(del1)!=0) 
{ 
#print(j) 
clones2.info<-clones2.info[-del1,] 
} 
 
del1<-0 
del1<-grep("DR",clones2.info$Cromosomas) 
if (length(del1)!=0) 
{ 
#print(j) 

Re: [R] Append to a csv file

2009-04-22 Thread Carlos Cuartas
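Maybe each data frame you add during the loop includes the column names.
Note that write.csv does not let you change col.names (and, in current
versions of R, append); such attempts are ignored with a warning. I would use
write.table(mydata, file = "data.csv", sep = ",", append = TRUE, col.names = FALSE)
Hope that helps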
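
A minimal sketch of the loop, assuming the header should be written only once: the file name, the loop and the contents of mydata are made up for illustration.

```r
out.file <- tempfile(fileext = ".csv")   # assumption: stands in for "data.csv"

for (i in 1:3) {
  # Hypothetical data frame produced at each iteration.
  mydata <- data.frame(iter = i, value = i^2)

  # Write the header only when the file does not exist yet.
  first <- !file.exists(out.file)
  write.table(mydata, file = out.file, sep = ",",
              append = !first, col.names = first, row.names = FALSE)
}

result <- read.csv(out.file)   # read everything back as one table
```

Writing the column names only on the first pass keeps a single header line in the file and avoids the warning about appending column names.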

Carlos





To: r-help@r-project.org
Sent: Monday, April 20, 2009 4:39:48 PM
Subject: [R]  Append to a csv file


I am looping over a data set and at each iteration I create a data frame
“mydata”
that I want to save in a .csv file. I want all the results to be
saved in the same file, and this is how I do it:

write.csv(mydata, file= “data.csv”=F, append=T) . The csv file looks fine
but I always get the following warning message


Warning messages:
1: In write.table(mydata, file =”data.csv”,  ... :
  appending column names to file


Does anyone see why R prints this warning message?




-- 
View this message in context: 
http://www.nabble.com/Append-to-a-csv-file-tp23145471p23145471.html
Sent from the R help mailing list archive at Nabble.com.



[R] Logistic regression and R

2009-07-30 Thread Carlos López

Hello everybody :-)

I have some data that I want to model with a logistic regression. Most 
of the independent variables are numeric and the single dependent variable is 
categorical. I was thinking that I could apply a logistic regression 
using glm, but I wanted to deepen my knowledge of this, so I did 
some reading and found the "iris" dataset. Now I would like to ask two 
things. First, do you know of any bibliography where I can read more about 
logistic regression and R, so that I can better understand and interpret the 
output? Second, what can I do when some of my independent 
variables are not numerical but categorical, i.e. mixed 
(categorical and numerical)? Can I still use a logistic regression?
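
As an illustration of the second point, glm() accepts numeric and categorical predictors in the same formula as long as the categorical ones are stored as factors. The data below are simulated and every variable name is invented for this sketch.

```r
set.seed(1)
n <- 200

# Simulated predictors: one numeric, one categorical (stored as a factor).
d <- data.frame(
  age   = rnorm(n, mean = 50, sd = 10),
  group = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Simulated binary outcome that depends on both predictors.
eta <- -5 + 0.1 * d$age + ifelse(d$group == "b", 0.8, 0)
d$y <- rbinom(n, size = 1, prob = plogis(eta))

# Logistic regression; R expands the factor into dummy variables itself.
fit <- glm(y ~ age + group, family = binomial, data = d)

summary(fit)     # coefficients are on the log-odds scale
exp(coef(fit))   # the same coefficients as odds ratios
```

The first factor level ("a" here) acts as the reference category, so the coefficients for the other levels are read relative to it.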


Thank you very much!!! :-D



[R] extract and replace columns of matrices stored in a list

2009-09-03 Thread Carlos Hernandez
Dear All,
I created a list (of length Z) in the following way:

my.array <- vector("list", Z)

then i assigned a matrix (of T rows by N columns) in each of the elements of
the list my.array in the following way:

my.array[[i]] <- matrix.data   ##( matrix.data has dimensions TxN, and i
repeated this command for i from 1 to Z, the matrix.data contains only
numeric data)

and
1. i would like to extract all the third columns of each of the Z matrices
stored in my.array (such that i get a new list only with the 3rd columns of
each matrix in the elements of a new list)

2. i would like to know how could i replace all the 3rd columns of each
matrix in my.array if i have a second matrix (size ZxT) with these columns.

is there a simple way to do these tasks? i appreciate any hints or advice.
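
Both tasks can be sketched with lapply(), under the assumption that row i of the second matrix holds the new third column for the i-th matrix in the list. The sizes and values below are made up, and T_rows is used instead of T because T is also shorthand for TRUE in R.

```r
Z <- 4; T_rows <- 5; N <- 3

# Build a small list of Z matrices, each T_rows x N.
my.array <- vector("list", Z)
for (i in seq_len(Z)) my.array[[i]] <- matrix(i, nrow = T_rows, ncol = N)

# 1. Extract the third column of every matrix into a new list.
third <- lapply(my.array, function(m) m[, 3])

# 2. Replace the third column of matrix i with row i of a Z x T_rows matrix.
new.cols <- matrix(seq_len(Z * T_rows), nrow = Z, ncol = T_rows)
my.array2 <- lapply(seq_len(Z), function(i) {
  m <- my.array[[i]]
  m[, 3] <- new.cols[i, ]
  m
})
```

Looping over the index i rather than over the list elements is what lets each matrix be paired with its own row of new.cols.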

Carlos



Re: [R] extract and replace columns of matrices stored in a list

2009-09-03 Thread Carlos Hernandez
On Thu, Sep 3, 2009 at 4:34 PM, Henrique Dallazuanna wrote:

> Try this:
>
> #1
> lapply(my.array, '[', , 3)
>
>
this works! thank you a lot!


> #2
> newThirdColumn <- sample(3)
> lapply(my.array, replace, list = 7:9, values = newThirdColumn)
>
>
i did not understand this last line, so far i couldn't make it work.

 would it be easier to replace the values (the third column of each matrix
in my.array) using an array like in #1?

thank you for your reply!
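
The list = 7:9 in the suggestion quoted above probably assumes 3 x 3 matrices: R stores a matrix column by column, so in a 3 x 3 matrix the positions 7, 8 and 9 are exactly the third column. A small sketch of that idea, with made-up values:

```r
m <- matrix(1:9, nrow = 3)

# Linear positions of the cells that sit in column 3.
idx3 <- which(col(m) == 3)

# replace() returns a copy of m with those positions overwritten.
m.new <- replace(m, idx3, c(10, 20, 30))

# For a matrix with nr rows, the third column sits at (2*nr + 1):(3*nr).
nr <- 5; nc <- 4
m2  <- matrix(0, nrow = nr, ncol = nc)
idx <- (2 * nr + 1):(3 * nr)
m2.new <- replace(m2, idx, 1:5)
```

For matrices with T rows, the index range therefore has to be adjusted to (2*T + 1):(3*T), which may be why the line did not work unchanged.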


> On Thu, Sep 3, 2009 at 11:16 AM, Carlos Hernandez 
> wrote:
>
>> Dear All,
>> I created a list (of length Z) in the following way:
>>
>> my.array <- vector("list", Z)
>>
>> then i assigned a matrix (of T rows by N columns) in each of the elements
>> of
>> the list my.array in the following way:
>>
>> my.array[[i]] <- matrix.data   ##( matrix.data has dimensions TxN, and i
>> repeated this command for i from 1 to Z, the matrix.data contains only
>> numeric data)
>>
>> and
>> 1. i would like to extract all the third columns of each of the Z matrices
>> stored in my.array (such that i get a new list only with the 3rd columns
>> of
>> each matrix in the elements of a new list)
>>
>> 2. i would like to know how could i replace all the 3rd columns of each
>> matrix in my.array if i have a second matrix (size ZxT) with these
>> columns.
>>
>> is there a simple way to do these tasks? i appreciate any hints or advice.
>>
>> Carlos
>>
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>



[R] SAS vs. R in web application

2009-09-08 Thread Carlos Alzola
Good evening,

I have been asked to investigate the pros and cons of using SAS vs. R in a web 
application. Either SAS or R would be the engine used to make some very simple 
calculations and to produce graphs, preferably in png format.

The advantages of R are pretty obvious as there would be no licensing issues. 
The only drawback I can see is that when calling it in batch (using R CMD 
BATCH), a DOS window appears. Thus I have some basic questions:

a) Is it possible to have R operate in the background without the DOS window 
appearing? How?
b) Is it correct that there will be no licensing issues?
c) What would be an efficient way to run it? I am thinking of having R running 
in the client's local machine and upload the results to a central server.

If using SAS, would the model described in c) above be the best way to design 
it, or would it be better to upload the raw data to the server and have SAS 
perform the calculations there. Would this option require a multi-user SAS 
license? (I know, I should check with SAS Institute, but I thought I'd ask 
anyway. Someone in the list may have done something similar).

Thanks in advance for any suggestions.

Carlos Alzola 


[R] more efficient vectorization of a function ?

2009-09-09 Thread Carlos Hernandez
dear All,
i'm using the following two functions:

share.vector <- function (vec1)
{
  vec1 <- vec1 - max(vec1,na.rm=TRUE) -0.1  ## this line avoids overflow
  vec1 <- exp(vec1)
  vec2 <- vec1/(1+sum(vec1,na.rm=TRUE))
  vec2
}

share.matrix <- function (mat1)
{
  out1 <- apply(mat1,2,share.vector)
  return(out1)
}

vec1 is a vector (of numeric data, usually small numbers), mat1 is a matrix
with many vec1's

is there another way to program them such that they are more efficient (in
terms of time)?

i appreciate any hints or advice.
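
One possible direction, offered as a sketch rather than a guaranteed speed-up, is to do the exponentiation and the division on the whole matrix at once with sweep() and colSums(). The function name share.matrix.fast and the test matrix are invented here; the two original functions are repeated so the comparison can be run.

```r
# Original functions from the message, repeated for comparison.
share.vector <- function(vec1) {
  vec1 <- vec1 - max(vec1, na.rm = TRUE) - 0.1
  vec1 <- exp(vec1)
  vec1 / (1 + sum(vec1, na.rm = TRUE))
}
share.matrix <- function(mat1) apply(mat1, 2, share.vector)

# Whole-matrix variant: only the column maxima still go through apply().
share.matrix.fast <- function(mat1) {
  col.max <- apply(mat1, 2, max, na.rm = TRUE)
  e <- exp(sweep(mat1, 2, col.max + 0.1))          # subtract per column, then exp
  sweep(e, 2, 1 + colSums(e, na.rm = TRUE), "/")   # divide by 1 + column sum
}

set.seed(42)
mat1 <- matrix(rnorm(2000), nrow = 20)
mat1[3, 5] <- NA                                   # keep one NA to test na.rm

same <- isTRUE(all.equal(share.matrix(mat1), share.matrix.fast(mat1)))
```

Whether this is faster depends on the matrix size, so it is worth timing both versions with system.time() on real data.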

best regards,
Carlos



Re: [R] could not find function "Varcov" after upgrade of R?

2009-09-12 Thread Carlos Alzola

Did you type library(Hmisc,T) before loading Design?

Carlos

--
From: "David Freedman" <3.14da...@gmail.com>
Sent: Saturday, September 12, 2009 8:26 AM
To: 
Subject: Re: [R] could not find function "Varcov" after upgrade of R?



I've had the same problem with predict.Design, and have sent an email to 
the

maintainer of the Design package at Vanderbilt University.  I wasn't even
able to run the examples given on the help page of predict.Design - I
received the same error about Varcov that you did.

I *think* it's a problem with the package, rather than R 2.9.2, and I hope
the problem will soon be fixed.  I was able to use predict.Design with 
2.9.2

until I updated the Design package a few days ago.

david freedman


zhu yao wrote:


I uses the Design library.

take this example:

library(Design)
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n,
  rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
Srv <- Surv(dt,e)

f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
cox.zph(f, "rank") # tests of PH
anova(f)
# Error in anova.Design(f) : could not find function "Varcov"



Yao Zhu
Department of Urology
Fudan University Shanghai Cancer Center
No. 270 Dongan Road, Shanghai, China


2009/9/12 Ronggui Huang 


I cannot reproduce the problem you mentioned.

>  ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
>  trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
>   group <- gl(2,10,20, labels=c("Ctl","Trt"))
>   weight <- c(ctl, trt)
>   anova(lm.D9 <- lm(weight ~ group))
> sessionInfo()
R version 2.9.2 (2009-08-24)
i386-pc-mingw32

locale:
LC_COLLATE=Chinese (Simplified)_People's Republic of
China.936;LC_CTYPE=Chinese (Simplified)_People's Republic of
China.936;LC_MONETARY=Chinese (Simplified)_People's Republic of
China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_People's Republic
of China.936

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

2009/9/12 zhu yao :
> After upgrading R to 2.9.2, I can't use the anova() fuction.
> It says "could not find function "Varcov" ".
> What's wrong with my computer? Help needed, thanks!
>
> Yao Zhu
> Department of Urology
> Fudan University Shanghai Cancer Center
> No. 270 Dongan Road, Shanghai, China
>
>



--
HUANG Ronggui, Wincent
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html







--
View this message in context: 
http://www.nabble.com/could-not-find-function-%22Varcov%22-after-upgrade-of-R--tp25412881p25414017.html

Sent from the R help mailing list archive at Nabble.com.



[R] fitting a curve to data points

2008-10-06 Thread Carlos "Guâno" Grohmann
Hello all. This is likely to be a silly question, but I have a set of
data points and I want to fit a curve to it, like this:
http://www.igc.usp.br/pessoais/guano/temp/curve.png.

which function should I use?

many thanks

Carlos

-- 
+---+
  Carlos Henrique Grohmann - Guano
  Geologist M.Sc  - Doctorate Student at IGc-USP - Brazil
Linux User #89721  - carlos dot grohmann at gmail dot com
+---+
_
"Good morning, doctors. I have taken the liberty of removing Windows
95 from my hard drive."
--The winning entry in a "What were HAL's first words" contest judged
by 2001: A SPACE ODYSSEY creator Arthur C. Clarke

Can't stop the signal.



[R] truehist?

2007-09-24 Thread Carlos "Guâno" Grohmann
Hello,
After a long time, I needed the truehist function, but my system
couldn't find it. I tried to install the package MAAS, but I couldn't
find it either! Did something happen?

Carlos

-- 
+---+
  Carlos Henrique Grohmann - Guano
  Visiting Researcher at Kingston University London - UK
  Geologist M.Sc  - Doctorate Student at IGc-USP - Brazil
Linux User #89721  - carlos dot grohmann at gmail dot com
+---+
_
"Good morning, doctors. I have taken the liberty of removing Windows
95 from my hard drive."
--The winning entry in a "What were HAL's first words" contest judged
by 2001: A SPACE ODYSSEY creator Arthur C. Clarke

Can't stop the signal.



Re: [R] truehist?

2007-09-26 Thread Carlos "Guâno" Grohmann
Oh yes.

I did search for help, but I just didn't read carefully: I read
"MAAS" instead of MASS.

Carlos



On 9/25/07, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
> Carlos "Guâno" Grohmann wrote:
> > Hello,
> > After a long time, I needed the truehist function, but my system
> > couldn't found it. I tried to install the package MAAS, but I couldn't
> > found it! Something happened?
> >
> > Carlos
> >
> >
> It's in MASS (sic).
>
> help.search("truehist") would have told you.
>
>
> --
>O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
> ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907
>
>
>


-- 
+---+
  Carlos Henrique Grohmann - Guano
  Visiting Researcher at Kingston University London - UK
  Geologist M.Sc  - Doctorate Student at IGc-USP - Brazil
Linux User #89721  - carlos dot grohmann at gmail dot com
+---+
_
"Good morning, doctors. I have taken the liberty of removing Windows
95 from my hard drive."
--The winning entry in a "What were HAL's first words" contest judged
by 2001: A SPACE ODYSSEY creator Arthur C. Clarke

Can't stop the signal.



[R] Help Problems formatting date and using Regul function

2007-10-11 Thread João Carlos Santos
Hello,

 

My problem is with the format of the date. My time series looks like this:

 

2006070100  1244 6162

2006070101  1221 6060

2006070102  1214 6060

2006070103  1194 5959

2006070104  1182 5858

2006070105  1178 5858

2006070106  1176 5858

2006070107  1173 5858

2006070108  1179 5859

2006070109  1246 6162



 

When I attempt to format the time like this:

 

A <- read.table("file", sep="\t", col.names=c("date", "my1", "my2", "my3"))

temp <- as.Date(A$date, format="%Y%m%d%H")

temp

I get

  [1] "4403-05-21" "4403-05-22" "4403-05-23" "4403-05-24" "4403-05-25"

  [6] "4403-05-26" "4403-05-27" "4403-05-28" "4403-05-29" "4403-05-30"
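
A sketch of one way to parse such stamps, assuming they encode year, month, day and hour as YYYYMMDDHH: convert the number to text first, then parse it with strptime(), which keeps the hour (as.Date() alone would drop it). The vector below holds three of the values from the listing; the time zone is an assumption.

```r
stamp <- c(2006070100, 2006070101, 2006070109)   # values from the listing above

# Turn the numbers into text, then parse year, month, day and hour.
txt  <- sprintf("%.0f", stamp)
when <- as.POSIXct(strptime(txt, format = "%Y%m%d%H", tz = "UTC"))

# Calendar dates, if only the day is needed.
days <- as.Date(when)

# Hours elapsed since the first observation, a plain numeric time axis.
hours <- as.numeric(difftime(when, when[1], units = "hours"))
```

A numeric axis such as hours may be easier to pass to functions that expect regularly spaced numeric times than the raw YYYYMMDDHH values, which jump at every change of day.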

 

Another problem is with regul: I am using the variables created when reading 
the data, but the regulation is not possible

 

REGUL

Ts.regul<-regul(A$date, y=A$my2, xmin=2006070100, n=800, units="hours", 
frequency=1,

deltat=1/3600, datemin=NULL, dateformat="m/d/Y", tol=NULL,

tol.type="both", methods="linear", rule=1, f=0, periodic=FALSE,

window=(2006080316 - 2006070100)/(800 - 1), split=100, specs=NULL)

 

I think that if the date question is resolved, the regul function will work too.

 

Can someone help me? I have only just started to use R.

 

 

Thanks for the help in advance,

 

João Santos




[R] code optimization problem ... using or not using "which" function

2009-05-29 Thread Juan Carlos Laguardia
hello all,

I have two data sets that share certain fields of interest
(facility, unit, date), which I want to match up so that I can extract
information from one data set and store it in the other.

my first initial idea  (which I know is bad) goes like this:

##  capacity  and new_trayloc are datasets in example code:

for (i in seq_len(nrow(new_trayloc))) {


theshifts<-which(as.Date(capacity$shift_dt) == new_trayloc$admit_dt[i] &
  as.character(capacity$unit)==as.character(new_trayloc$UNIT_1[i]) &
  as.character(capacity$fac_id)==as.character(new_trayloc$ORIG_FAC_ID[i]))


thenightshifts<-which(as.Date(capacity$shift_dt) == new_trayloc$admit_dt[i]-1 &
  as.character(capacity$unit)==as.character(new_trayloc$UNIT_1[i]) &
  as.character(capacity$fac_id)==as.character(new_trayloc$ORIG_FAC_ID[i]))


. obtain information by using theshifts and thenightshifts objects
and store in new_trayloc

}

. by doing a system.time on the entire for loop for 5 iterations, i
get a time of
   user  system elapsed
  25.66    1.04   26.72

That seems really bad... and plus, i need to run it for over 100,000 iterations.

Any suggestions in either the way I match the fields, or my approach
to my problem?
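
One approach that may help is to build a single text key per row (facility, unit, date), index the capacity rows by that key once, and then look up every admission by key instead of scanning the whole capacity table inside the loop. The sketch below uses small invented data; the beds column and the make.key helper are hypothetical.

```r
# Invented stand-ins for the two data sets.
capacity <- data.frame(
  fac_id   = c("F1", "F1", "F1", "F2"),
  unit     = c("ICU", "ICU", "ICU", "ER"),
  shift_dt = as.Date(c("2009-05-01", "2009-05-02", "2009-05-02", "2009-05-02")),
  beds     = c(10, 12, 11, 8),
  stringsAsFactors = FALSE
)
new_trayloc <- data.frame(
  ORIG_FAC_ID = c("F1", "F2"),
  UNIT_1      = c("ICU", "ER"),
  admit_dt    = as.Date(c("2009-05-02", "2009-05-03")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one text key per (facility, unit, date) combination.
make.key <- function(fac, unit, dt) {
  paste(fac, unit, format(as.Date(dt)), sep = "|")
}

# Index the capacity rows by key once, outside any loop.
rows.by.key <- split(seq_len(nrow(capacity)),
                     make.key(capacity$fac_id, capacity$unit, capacity$shift_dt))

# Keys for the admission day and for the day before.
day.key   <- make.key(new_trayloc$ORIG_FAC_ID, new_trayloc$UNIT_1,
                      new_trayloc$admit_dt)
night.key <- make.key(new_trayloc$ORIG_FAC_ID, new_trayloc$UNIT_1,
                      new_trayloc$admit_dt - 1)

# Lists of matching capacity rows, one element per row of new_trayloc
# (NULL where nothing matches).
theshifts      <- unname(rows.by.key[day.key])
thenightshifts <- unname(rows.by.key[night.key])

# Example of storing a summary back in new_trayloc.
new_trayloc$day_beds <- sapply(theshifts, function(idx) sum(capacity$beds[idx]))
```

If each key matches at most one capacity row, match() on the keys or merge() on the three columns would be simpler alternatives.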


Cheers,
Juan Carlos



[R] longitudinal analysis with latent construct

2009-06-16 Thread Carlos Santos Jr.

Folks,

I need to test a model that has one predictor (a construct with three 
indicators) influencing four other variables. Something like what I try to show 
below.

 X --->   Y1, Y2, Y3, Y4
  /  |  \
 i1  i2  i3

Also, each variable was measured at 5 points in time. So, I'd like to model 
their change, in a longitudinal fashion (mixed model?).

I know it would be too much to find a script that does it all, but I thought 
that maybe you guys have a reference for me to read and learn the steps I'll 
have to follow to perform this analysis. Any guidance would be very welcome as 
I'm trying to migrate from EQS to R and thus have no experience with R yet.

Thanks in advance,
Carlos Santos Jr.




Re: [R] Paste in a FOR loop

2008-12-31 Thread Carlos J. Gil Bellosta
?eval

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] Package installation

2008-12-31 Thread Carlos J. Gil Bellosta
Why don't you post your message in the Bioconductor list? People there
will be able to help you better.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2008-12-31 at 08:00 -0500, jianying...@med.unc.edu wrote:
> Dear all,
> 
> I tried to install bioconductor package using biocLite(). I got warning that 
> c:/program file/R../library is not writable. In any rate, after the 
> downloading/installing, I looked for packages like "affy" "gcrma" etc. but I 
> did not seem them in the "library" folder. However, when I try to load the 
> library, it worked.
> 
> Does anybody have any experience? Where are those packages installed?
> 
> P.S. I am using window vista
> 
> Thanks.
> 
> Jianying
> 
> - Original Message -
> From: "Carlos J. Gil Bellosta" 
> Date: Wednesday, December 31, 2008 6:30 am
> Subject: Re: [R] Paste in a FOR loop
> To: Michael Pearmain 
> Cc: r-help@r-project.org
> 
> > ?eval
> > 
> > Carlos J. Gil Bellosta
> > http://www.datanalytics.com
> > 



Re: [R] How do I plot multiple XY plots on the same graph

2009-01-01 Thread Carlos J. Gil Bellosta
Hello,

You can use plot for the first plot and points for the subsequent ones.

Points will add new points to the existing plot reusing the axes,
labels, etc.
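
A small sketch of this, with made-up data: the only extra step is to give the first plot an x range wide enough for all the data sets, because points() does not rescale the axes.

```r
# Two made-up data sets that share the same y values.
x1 <- c(1, 2, 3, 4)
x2 <- c(2, 4, 6, 8)
y  <- c(10, 20, 30, 40)

# Compute a common x range so the second set is not cut off.
xr <- range(c(x1, x2))

plot(x1, y, xlim = xr, pch = 16, col = "blue", xlab = "x", ylab = "y")
points(x2, y, pch = 17, col = "red")      # added to the existing axes
legend("topleft", legend = c("set 1", "set 2"),
       pch = c(16, 17), col = c("blue", "red"))
```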

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2008-12-31 at 15:36 -0800, George Chen wrote:
> Hello,
> I have multiple data sets I would like to plot on the same XY scatterplot.
> The data sets have in common the same Y values.
> Could somebody tell me how to do this?
> I tried par(T=new)   (I think this was it) but it literally overlays plots on 
> each other.
> 
> Thanks in advance.
> 
> George
> 



Re: [R] the first and last observation for each subject

2009-01-02 Thread Carlos J. Gil Bellosta
Hello,

First, order your data by ID and time.

The columns you want in your output dataframe are then

unique(ID),

tapply( x, ID, function( z ) z[ 1 ] )

and

tapply( y, ID, function( z ) z[ length( z ) ] - z[ 1 ] )
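
Put together on the data from the original post, this could look as follows (a sketch; the names dat and res are just for illustration):

```r
# Data as given in the question.
dat <- data.frame(
    ID   = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
    x    = c(10, 10, 10, 12, 12, 12, 12, 5, 5),
    y    = c(20, 30, 40, 23, 25, 28, 38, 10, 15),
    time = c(0, 1, 2, 0, 1, 2, 3, 0, 2)
)

# Order by ID and time so that "first" and "last" are well defined.
dat <- dat[order(dat$ID, dat$time), ]

res <- data.frame(
    ID = unique(dat$ID),
    x  = as.vector(tapply(dat$x, dat$ID, function(z) z[1])),
    y  = as.vector(tapply(dat$y, dat$ID, function(z) z[length(z)] - z[1]))
)
res
```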

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Fri, 2009-01-02 at 17:20 +0800, gallon li wrote:
> I have the following data
> 
> ID x y time
> 1  10 20 0
> 1  10 30 1
> 1 10 40 2
> 2 12 23 0
> 2 12 25 1
> 2 12 28 2
> 2 12 38 3
> 3 5 10 0
> 3 5 15 2
> .
> 
> x is time invariant, ID is the subject id number, y is changing over time.
> 
> I want to find out the difference between the first and last observed y
> value for each subject and get a table like
> 
> ID x y
> 1 10 20
> 2 12 15
> 3 5 5
> ..
> 
> Is there any easy way to generate the data set?
> 


Re: [R] the first and last observation for each subject

2009-01-02 Thread Carlos J. Gil Bellosta
Hello,

Is it truly

y=max(y)-min(y)

what you want below?
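
The two only agree when y is monotone within a subject. A made-up example where they differ:

```r
# Hypothetical subject whose y goes up and then down again.
y <- c(20, 45, 40)

range.diff <- max(y) - min(y)        # spread of y
last.first <- y[length(y)] - y[1]    # last observation minus first one

c(range.diff, last.first)
```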

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Fri, 2009-01-02 at 13:16 -0500, Stavros Macrakis wrote:
> I think there's a pretty simple solution here, though probably not the
> most efficient:
> 
> t(sapply(split(a,a$ID),
>          function(q) with(q,c(ID=unique(ID),x=unique(x),y=max(y)-min(y)))))
> 
> Using 'unique' instead of min or [[1]] has the advantage that if x is
> in fact not time-invariant, this gives an error rather than silently
> ignoring inconsistencies.
> 
> Trying to package up this idiom into a function leads to:
> 
> select <-
>   function(df, groupby, selection)
>{
>  pf <- parent.frame()
>  fields <- substitute(selection)
>  t(sapply(split(df,eval(substitute(groupby),df,enclos=pf)),
>  function(q) eval(fields,q,enclos=pf)))  }
> 
> which I admit is rather ugly (and does no error-checking), but it does work:
> 
> > select(a,ID,list(min(ID),unique(x),max(y)-min(y)))
>   [,1] [,2] [,3]
> 1 1    10   20
> 2 2    12   15
> 3 3    5    5
> 
> Perhaps some of the more experienced people on the list could show me
> how to write this more cleanly.
> 
>-s
> 
> 
> On Fri, Jan 2, 2009 at 4:20 AM, gallon li  wrote:
> > I have the following data
> >
> > ID x y time
> > 1  10 20 0
> > 1  10 30 1
> > 1 10 40 2
> > 2 12 23 0
> > 2 12 25 1
> > 2 12 28 2
> > 2 12 38 3
> > 3 5 10 0
> > 3 5 15 2
> > .
> >
> > x is time invariant, ID is the subject id number, y is changing over time.
> >
> > I want to find out the difference between the first and last observed y
> > value for each subject and get a table like
> >
> > ID x y
> > 1 10 20
> > 2 12 15
> > 3 5 5
> > ..
> >
> > Is there any easy way to generate the data set?
> >


Re: [R] Equivalent of match for data.frame

2009-01-03 Thread Carlos J. Gil Bellosta
Hello,

Why not something like

lapply(mydf, function(x) match(myarg, x) )

?
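
If what is needed is the index of the row as a whole, rather than positions column by column, a different idiom (not the lapply call above) is to collapse each row into a single key and match the keys. A sketch, where row.key is a hypothetical helper and the separator is assumed not to occur in the data:

```r
a <- rep(1:4, each = 10)
b <- rep(1:10, times = 4)
mydf <- data.frame(a, b)
myarg <- mydf[12, ]   # the row to look for (a = 2, b = 2)

# Paste all columns of each row into one string per row.
row.key <- function(df) do.call(paste, c(df, sep = "\r"))

# Position of the first row of mydf equal to myarg.
row.index <- match(row.key(myarg), row.key(mydf))
row.index
```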

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Sat, 2009-01-03 at 07:24 -0500, Sébastien wrote:
> Dear R-users,
> 
> I am translating a S script into R and having some troubles with the 
> match function. This function appears to work with vector and data.frame 
> in S, but not in R, e.g.:
> a <- rep((1:4), each = 10)
> b <- rep((1:10), times = 4)
> mydf <- data.frame(a,b)
> myarg <- mydf[1,]
> match(myarg, mydf)
> 
> # S returns 1 but R returns NA NA
> 
> I guess one could use match(interaction(myarg), interaction(mydf)) to do 
> the job but I was just wondering if there was a more direct function.
> 
> Thanks,
> 
> Sebastien
> 


Re: [R] if statement

2009-01-05 Thread Carlos J. Gil Bellosta
Hello,

If you do

C <- A
C[A > X & A < Y] <- 0

you get what it seems you want.
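
For instance, on a short made-up vector (a plain numeric vector is used here instead of the ts object, for brevity):

```r
A <- c(0.1, 0.6, 0.79, 0.9, -0.3, 0.5)
X <- 0.5
Y <- 0.8

# Copy A and zero out the values strictly between X and Y.
C <- A
C[A > X & A < Y] <- 0
C
```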

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On Mon, 2009-01-05 at 03:41 -0800, Shruthi Jayaram wrote:
> A <- ts(rnorm(120), freq=12, start=c(1992,8))
> X <- 0.5
> Y <- 0.8



Re: [R] Changing Matrix Header

2009-01-06 Thread Carlos J. Gil Bellosta
Hello,

colnames( dat ) <- NULL

will do the trick.
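
A small sketch that rebuilds a matrix like the one in the question and drops its column names:

```r
# A 3 x 10 matrix whose columns are all named "A", as in the question.
dat <- matrix(0, nrow = 3, ncol = 10, dimnames = list(NULL, rep("A", 10)))
dat[2, 10] <- 1
dat[3, 10] <- 2

# Removing the names makes print() fall back to [,1], [,2], ...
colnames(dat) <- NULL
dat
```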

Carlos J. Gil Bellosta
http://www.datanalytics.com

On Tue, 2009-01-06 at 17:14 +0900, Gundala Viswanath wrote:
> Dear all,
> 
> I have the following matrix.
> 
> > dat
>  A A A A A A A A A A
> [1,] 0 0 0 0 0 0 0 0 0 0
> [2,] 0 0 0 0 0 0 0 0 0 1
> [3,] 0 0 0 0 0 0 0 0 0 2
> 
> How can I change it into:
>  [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]   [,9]   [,10]
> [1,] 0 0 0  0 0 0 0 0 0   0
> [2,] 0 0 0  0 0 0 0 0 0   1
> [3,] 0 0 0  0 0 0 0 0 02
> 
> 
> I tried:
> 
> > as.matrix(x)
> 
> But failed.
> 
> 
> - Gundala Viswanath
> Jakarta - Indonesia
> 


Re: [R] rbind for matrices - rep argument

2009-01-07 Thread Carlos J. Gil Bellosta

On Wed, 2009-01-07 at 16:22 +0100, Niccolò Bassani wrote:
> Dear R users,I'm facing a trivial problem, but I really can't solve it. I've
> tried a dozen of codes, but I can't get the result I want.
> The question is: I have a dataframe like this one
> 
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    1    2    3    4    5
> [2,]    2    5    5    4    9
> [3,]    1    6    8    1    2
> [4,]    8    6    4    1    5
> 
> made up of decimal numbers, of course.
> I want to append this dataframe to itself a number x of times, i.e. 3. That
> is I want a dataframe like this
> 
> 
>       [,1] [,2] [,3] [,4] [,5]
>  [1,]    1    2    3    4    5
>  [2,]    2    5    5    4    9
>  [3,]    1    6    8    1    2
>  [4,]    8    6    4    1    5
>  [5,]    1    2    3    4    5
>  [6,]    2    5    5    4    9
>  [7,]    1    6    8    1    2
>  [8,]    8    6    4    1    5
>  [9,]    1    2    3    4    5
> [10,]    2    5    5    4    9
> [11,]    1    6    8    1    2
> [12,]    8    6    4    1    5
> 
> I'm searching for an "authomatic" way to do this (I've already used the
> rbind re-writing x times the name of the frame...), as it must enter a
> function where one argument is exactly the number x of times to repeat this
> frame.
> 
> Any ideas??
> Thanks in advance!

Hello,

If your matrix is

kk <- matrix( 1:16, 4, 4)

You can do

kkk <- lapply( 1:5, function(x) kk )
do.call(rbind, kkk)

You can write your code in a single line, though. I used 5 here as an
example. You can build a function along these lines with an
arbitrary argument if need be.
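
Such a function could look like this (stack.copies is just a name made up for the sketch):

```r
# Stack 'times' copies of the matrix m on top of each other.
stack.copies <- function(m, times) {
    do.call(rbind, lapply(seq_len(times), function(i) m))
}

kk  <- matrix(1:16, 4, 4)
out <- stack.copies(kk, 3)
dim(out)

# An alternative that gives the same result by repeating row indices.
out2 <- kk[rep(seq_len(nrow(kk)), times = 3), ]
```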

Carlos J. Gil Bellosta
http://www.datanalytics.com



> Niccolò
> 


Re: [R] VCOV Source Code

2009-01-08 Thread Carlos J. Gil Bellosta
Hello,

You can do

stats:::vcov.lm

to see the source code for that particular method. To see which
methods are available for vcov, write

methods("vcov")
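
For example, in a fresh session (the model below uses the cars dataset shipped with R, purely as an illustration):

```r
# List the vcov methods visible in the current session.
methods("vcov")

# Two equivalent ways of getting at the method for lm objects.
f1 <- stats:::vcov.lm
f2 <- getS3method("vcov", "lm")

# The generic dispatches to that method for an lm fit.
fit <- lm(dist ~ speed, data = cars)
vcov(fit)
```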

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2009-01-07 at 21:37 -0600, Yang Wan wrote:
> Dear R Help,
> 
> I wonder how to show the source code of the [vcov] command.  Usually,
> the source code is shown after typing the command name and pressing
> Enter. But for [vcov], it shows
> 
> function (object, ...) 
> UseMethod("vcov")
> 
> I appreciate your help.  Best wishes.
> 
> Christina
> 


Re: [R] Dataframe with unequal rows

2009-01-08 Thread Carlos J. Gil Bellosta
Hello,

You are not very precise there. Do you mean that the rows in your text
file do not all have the same number of separators (commas, in your
case)?
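
If that is the case, one possible sketch is to read the file as plain lines and count the commas in each of them. The character vector below is a made-up stand-in for the result of readLines on your file:

```r
# Stand-in for: lines <- readLines("yourfile.txt")
lines <- c("a,b,c", "d,e", "f", "g,h,i,j")

# gregexpr returns -1 when there is no match, so count positive positions.
n.commas <- sapply(gregexpr(",", lines, fixed = TRUE), function(m) sum(m > 0))
n.commas
```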

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Thu, 2009-01-08 at 04:38 -0500, rahul-a.agar...@ubs.com wrote:
> I have a data frame with unequal rows length separated by comma.I
> have to read the data first and then calculate number of comma in each
> row...how can I do that
> 
> Regards Rahul
> 


Re: [R] VaR-Monte carlo Simulation, Historic simulation, Variance-Covariance Simulation

2009-01-08 Thread Carlos J. Gil Bellosta
Yes, there are: replicate and quantile are your friends.

You will find better support in the R-Finance list, though.
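
As a rough sketch of how replicate and quantile fit together, under strong assumptions: the returns below are simulated rather than real, they are taken to be jointly normal, and the 5% level and all variable names are only illustrative.

```r
set.seed(1)
# Hypothetical daily returns for equity, bonds and forex (250 days).
returns  <- matrix(rnorm(750, mean = 0, sd = 0.01), ncol = 3,
                   dimnames = list(NULL, c("equity", "bonds", "forex")))
exposure <- c(100, 100, 100)
alpha    <- 0.05

# Historical simulation: apply past returns to the current exposures.
pnl.hist <- as.vector(returns %*% exposure)
var.hist <- -unname(quantile(pnl.hist, alpha))

# Variance-covariance (normal) approximation.
mu        <- colMeans(returns)
sigma     <- cov(returns)
pf.sd     <- sqrt(drop(exposure %*% sigma %*% exposure))
var.param <- -(sum(exposure * mu) + qnorm(alpha) * pf.sd)

# Monte Carlo: draw correlated normal returns and revalue the portfolio.
R       <- chol(sigma)   # upper triangular, t(R) %*% R equals sigma
sim.pnl <- replicate(10000, sum(exposure * (mu + drop(rnorm(3) %*% R))))
var.mc  <- -unname(quantile(sim.pnl, alpha))

c(historical = var.hist, parametric = var.param, monte.carlo = var.mc)
```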

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Thu, 2009-01-08 at 01:36 -0800, Maithili Shiva wrote:
> Dear R helpers
> 
> Suppose I have a portfolio of securities with exposure to Equity, Bonds and 
> Forex (say $ 100 each). 
> 
> Is there any function in R that will help me calculate Value at Risk (VaR) 
> using Monte carlo Simulation , Historic simulation and Variance - Covariance 
> Simulation.
> 
> 
> With regards
> 
> Maithili
> 


Re: [R] R in the NY Times

2009-01-08 Thread Carlos J. Gil Bellosta
On Thu, 2009-01-08 at 10:42 -0600, Stas Kolenikov wrote:
> A really good measure for R will be the total # of the downloads of
> r-base for all platforms from all CRAN mirrors (and I would expect
> that # can be found from the servers' logs). 

Hello,

You are overlooking that many of us download R directly from our Linux
distribution repositories. 

Besides, given the free nature of R, some of us install it on several
computers, even, in my case, briefly on somebody else's computer if I
have an urgent task to solve. Of course, I would never do (or be able to
do) this with SAS...

So, the number of downloads from CRAN servers seems like a lousy proxy
for the total number of users of R.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] R in the NY Times

2009-01-08 Thread Carlos J. Gil Bellosta
On Thu, 2009-01-08 at 13:52 -0600, Marc Schwartz wrote:
> Reading the posts on SAS-L since yesterday via Google RSS, where the
> NYT
> article was also posted, some have noted that SAS itself offers online
> support forums (http://support.sas.com/forums/index.jspa). From a
> quick
> review, it looks like the SAS.com forums date back to perhaps early
> 2006, thus possibly accounting for some of the leveling of the posts
> on
> SAS-L recently.

Hello,

Not only that: the corporate intranet of SAS (sections of which are
sometimes open to external consultants for certain products) also
contains forums with an uneven traffic flow. These will certainly absorb
part of the traffic that would otherwise hit lists like SAS-L.

In fact, in my five years' experience working (also) as a SAS consultant,
I have never posted to SAS-L. However, I have posted (or had my requests
posted by other SAS employees) on these forums.

Having said that, I should also add that R represents a threat to SAS
(which has not stood for Statistical Analysis System for a long time
already) in a business segment that very doubtfully accounts for more
than 5-10% of their revenue. They have to sell about 1000 licenses of
SAS/BASE and SAS/STAT in order to match the annual revenues from a
single license for a single "solution" in a single top tier bank.

It is quite amusing, though, to browse SAS's internal marketing
documentation (to which I had access some time ago) on "how to
compete" against R. The SAS salesperson's statement in the article seems
to have been extracted verbatim from it. 

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] Boxplot from matrices

2009-01-11 Thread Carlos J. Gil Bellosta
Hello,

The following code may help you:

> my.matrix <- matrix( rnorm(16), ncol = 4 )
> boxplot( my.matrix ~ col(  my.matrix ) )
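
For a matrix built as in your message (one column per group), it may be enough to pass the matrix itself to boxplot, which then draws one box per column. A sketch with random numbers standing in for test$V3:

```r
# Stand-in for matrix(c(test$V3), ncol = 8): 40 values in 8 columns.
y <- matrix(rnorm(40), ncol = 8)

# One box per column; the statistics are returned invisibly.
b <- boxplot(y, xlab = "column", ylab = "value")
```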

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



On Sun, 2009-01-11 at 05:23 -0800, johnhj wrote:
> Hii,
> 
> I will create boxplots from matrices. I have the following data sets:
> 5.0  1.78  2.99  2.019 0
> 10.0  1.79  3.00  1.744 0
> 15.0  1.78  2.98  1.936 0
> 20.0  1.78  2.99  1.975 0
> 25.0  1.73  2.91  3.591 0
> 30.0  1.79  3.00  1.966 0
> 35.0  1.79  3.00  2.451 0
> 40.0  1.79  3.00  1.853 0
> 45.0  1.79  3.00  2.077 0
> 50.0  1.79  3.00  1.943 0
> 55.0  1.79  3.00  2.608 0
> 60.0  1.79  3.00  1.790 0
> 65.0  1.79  3.00  1.893 0
> 70.0  1.79  3.00  2.079 0
> 75.0  1.77  2.97  2.200 0
> 80.0  1.79  3.01  1.868 0
> 85.0  1.78  2.99  2.179 0
> 90.0  1.70  2.85  2.305 0
> 95.0  1.71  2.87  1.854 0
> 100.0  1.79  3.00  2.362 0
> 105.0  1.79  3.00  3.634 0
> 110.0  1.79  3.00  1.578 0
> 115.0  1.79  3.00  1.835 0
> 120.0  1.79  3.00  2.359 0
> 125.0  1.79  3.00  2.542 0
> 130.0  1.76  2.95  2.620 0
> 135.0  1.79  3.00  4.181 0
> 140.0  1.79  3.00  1.375 0
> 145.0  1.79  3.00  2.872 0
> 150.0  1.79  3.00  3.002 0
> 155.0  1.79  3.00  3.712 0
> 160.0  1.79  3.01  3.175 0
> 165.0  1.79  3.00  2.821 0
> 170.0  1.79  3.00  3.320 0.078
> 175.0  1.79  3.00  2.076 0
> 180.0  1.77  2.97  2.186 0
> 185.0  1.78  2.99  4.652 0
> 190.0  1.79  3.01  2.051 0
> 195.0  1.79  3.00  1.922 0
> 200.0  1.79  3.00  1.945 0
> 
> The first thing I do is, to run the command
> y<-matrix(c(test$V3),ncol=8)
> to divide the third column in 8 matrices to create 8 boxplots.
> The I run the command
> w<-summary(y) 
> to get the values min, max, mean, median, 1.Quan, 3.Quan
> 
> My problem is, I cann't run the plot command to create the 8 boxplots in a
> graph...
> The command 
> plot(y)
> gives me an error..
> 
> Can anybody help me to create the boxplot from matrices in a graph ?
> 
> greetings,
> j
>



Re: [R] How to reference previous row?

2009-01-12 Thread Carlos J. Gil Bellosta
Hello,

The solution to your problem will seem far easier if you think about it
in a different way. For instance, you may want to consider the extra
dummy column

previous.first.value <- c( NA, as.character(first)[ - length(first) ] )

Then you can "horizontally" compare first with its previous value.
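
A sketch of the whole computation on the data from the question. The zeros for the first group are an assumption taken from the expected output in the post (an initial value of 0 for third); the names prev and changed are illustrative:

```r
X <- data.frame(first  = c("A", "A", "B", "B", "B", "C", "C"),
                second = 1:7,
                stringsAsFactors = FALSE)

# Value of 'first' in the previous row (NA for the very first row).
prev <- c(NA, X$first[-nrow(X)])

# TRUE wherever a new run of 'first' starts.
changed <- is.na(prev) | X$first != prev

# Carry the value of 'second' at the start of each run forward.
X$third <- X$second[changed][cumsum(changed)]

# Assumption: the first run starts from 0, as in the expected output.
X$third[cumsum(changed) == 1] <- 0L
X
```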

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Mon, 2009-01-12 at 18:57 +1100, Heston Capital wrote:
> I am trying to write some code where the factor references its
> previous value, but can't find a solution searching through the
> archive.
> 
> > X
>   first second
> 1 A  1
> 2 A  2
> 3 B  3
> 4 B  4
> 5 B  5
> 6 C  6
> 7 C  7
> 
> I need a third column, in pseudo code-
>  If value of first=previous value of first:
>   third=previous value of third
>  else third = second
> 
> So the third column would look like:
> 0
> 0
> 3
> 3
> 3
> 6
> 6
> 
> 
> Thanks!
> 


Re: [R] meaning of asymmetric on help page for intersect

2009-01-13 Thread Carlos J. Gil Bellosta
Hello,

The symmetric set difference of A and B is the set of elements in A or B
but not in A intersection B, i.e., ( (A U B) \ (A intersection B) ).

The asymmetric set difference of A and B is the set of elements of A
except those in B, i.e., (A \ B).
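
In R terms, with two small made-up sets:

```r
A <- c(1, 2, 3, 4)
B <- c(3, 4, 5)

# Asymmetric differences: the order of the arguments matters.
a.minus.b <- setdiff(A, B)
b.minus.a <- setdiff(B, A)

# Symmetric difference: (A U B) \ (A intersection B).
sym.diff <- setdiff(union(A, B), intersect(A, B))
```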

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] Comparing elements for equality

2009-01-13 Thread Carlos J. Gil Bellosta
Hello,

You could build your output dataframe along the following lines:

foo <- function(x) length( unique(x) ) == 1

results <- data.frame(
freq = tapply( dat$id,   dat$id, length ),
var1 = tapply( dat$var1, dat$id, foo ),
var2 = tapply( dat$var2, dat$id, foo )
)
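
Run on the example data from the question, with an explicit id column added (a sketch; as.vector just drops the names that tapply attaches):

```r
dat <- data.frame(id   = c(1, 1, 2, 2, 2),
                  var1 = c(10, 10, 20, 20, 25),
                  var2 = c("foo", "foo", "foo", "foobar", "foo"))

# TRUE if all elements of x are the same.
foo <- function(x) length(unique(x)) == 1

results <- data.frame(
    id   = sort(unique(dat$id)),
    freq = as.vector(tapply(dat$id,   dat$id, length)),
    var1 = as.vector(tapply(dat$var1, dat$id, foo)),
    var2 = as.vector(tapply(dat$var2, dat$id, foo))
)
results
```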

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Tue, 2009-01-13 at 14:17 -0500, Doran, Harold wrote:
> Suppose I have a dataframe as follows:
> 
> dat <- data.frame(id = c(1,1,2,2,2), var1 = c(10,10,20,20,25), var2 =
> c('foo', 'foo', 'foo', 'foobar', 'foo'))
> 
> Now, if I were to subset by id, such as:
> 
> > subset(dat, id==1)
>   id var1 var2
> 1  1   10  foo
> 2  1   10  foo
> 
> I can see that the elements in var1 are exactly the same and the
> elements in var2 are exactly the same. However,
> 
> > subset(dat, id==2)
>   id var1   var2
> 3  2   20foo
> 4  2   20 foobar
> 5  2   25foo
> 
> Shows the elements are not the same for either variable in this
> instance. So, what I am looking to create is a data frame that would be
> like this
> 
> id  freq  var1   var2
> 1   2     TRUE   TRUE
> 2   3     FALSE  FALSE
> 
> Where freq is the number of times the ID is repeated in the dataframe. A
> TRUE appears in the cell if all elements in the column are the same for
> the ID and FALSE otherwise. It is insignificant which values differ for
> my problem.
> 
> The way I am thinking about tackling this is to loop through the ID
> variable and compare the values in the various columns of the dataframe.
> The problem I am encountering is that I don't think all.equal or
> identical are the right functions in this case.
> 
> So, say I was wanting to compare the elements of var1 for id ==1. I
> would have
> 
> x <- c(10,10)
> 
> Of course, the following works
> 
> > all.equal(x[1], x[2])
> [1] TRUE
> 
> As would a similar call to identical. However, what if I only have a
> vector of values (or if the column consists of names) that I want to
> assess for equality when I am trying to automate a process over
> thousands of cases? As in the example above, the vector may contain only
> two values or it may contain many more. The number of values in the
> vector differ by id.
> 
> Any thoughts?
> 
> Harold
> 


Re: [R] Howto access object of object

2009-01-14 Thread Carlos J. Gil Bellosta
Hello, 

Use "@" instead of "$" to extract slots from an S4 object.
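
A toy illustration (the class toy.csr below is made up for the example and is not the real "matrix.csr" class):

```r
library(methods)

# A made-up S4 class with two slots named like those in your object.
setClass("toy.csr", representation(ra = "numeric", ja = "integer"))
x <- new("toy.csr", ra = c(0.99, 1, 1), ja = c(1L, 1L, 1L))

x@ra              # direct slot access
slot(x, "ra")     # the same, with the slot name given as a string
slotNames(x)      # which slots are there
```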

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2009-01-14 at 17:07 +0900, Gundala Viswanath wrote:
> Dear all,
> 
> I have the following object:
> 
> > print(x)
> An object of class "matrix.csr"
> Slot "ra":
>  [1] 0.992056718 1.0 1.0 1.0 1.0 1.0
>  [7] 1.0 1.0 1.0 1.0 1.0 1.0
> [13] 1.0 1.0 1.0 1.0 1.0 1.0
> [19] 1.0 1.0 1.0 0.002647761 0.000882587 0.000882587
> [25] 0.000882587 0.000882587 0.000882587 0.000882587
> 
> Slot "ja":
>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
> 
> Slot "ia":
>   [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 22 22 
> 22
>  [26] 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 
> 22
>  [51] 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 
> 22
>  [76] 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 22 
> 22
> 
> Slot "dimension":
> [1] 100   1
> 
> __ END__
> 
> How can I acces "Slot 'ra'" only?
> 
> I tried
> 
> print(x$ra)
> 
> but fail.
> 
> - Gundala Viswanath
> Jakarta - Indonesia
> 


Re: [R] Vectorization of three embedded loops

2009-01-14 Thread Carlos J. Gil Bellosta
Hello,

I believe that your bottleneck lies in this piece of code:

sum<-c();
for(j in 1:length(val)){
sum[j]<-euc[rownames(start.b)[i],val[j]]
}

In order to speed up your code, there are two alternatives:

1) Try to reorder the euc matrix so that the sum vector corresponds to
(part of) a row or column of euc.

2) For each i value, create a matrix with the coordinates corresponding
to ( rownames(start.b)[i], val[j] ) and index the matrix by this matrix
in order to create sum. This will be easiest if you can reorder euc in a
way that accessing its elements will be easy (and then you would be back
into (1)).

Creating a variable sum as c() and increasing its size in a loop is one
of the easiest ways to uselessly burn your CPU.
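
To illustrate the idea on a toy problem (6 random points instead of your 1853 samples; candidates plays the role of rownames(start.b)), the two inner loops can be replaced by indexing euc with two vectors of names and taking row minima:

```r
set.seed(42)
ids <- paste0("s", 1:6)
pts <- matrix(rnorm(12), ncol = 2, dimnames = list(ids, NULL))
euc <- as.matrix(dist(pts))       # distance matrix with row/column names

val        <- c("s1", "s2")       # samples selected so far
candidates <- setdiff(ids, val)   # samples still available

# Loop version, with the result vectors preallocated instead of grown.
loop.min <- numeric(length(candidates))
for (i in seq_along(candidates)) {
    d <- numeric(length(val))
    for (j in seq_along(val)) d[j] <- euc[candidates[i], val[j]]
    loop.min[i] <- min(d)
}

# Vectorized version: extract the submatrix once, then take row minima.
vec.min <- apply(euc[candidates, val, drop = FALSE], 1, min)
```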

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2009-01-14 at 10:32 +0300, Thomas Terhoeven-Urselmans wrote:
> Dear R-programmer,
> 
> I wrote an adapted implementation of the Kennard-Stone algorithm for  
> sample selection of multivariate data (R 2.7.1 under MacBook Pro,  
> Processor 2.2 GHz Intel Core 2 Duo, Memory 2 GB 667 MHZ DDR2 SDRAM).
> I used for the heart of the script three embedded loops. This makes it  
> especially for huge datasets very slow. For a datamatrix of 1853*1853  
> and the selection of 556 samples needed computation time of more than  
> 24 hours.
> I did some research on vecotrization, but I could not figure out how  
> to do it better/faster. Which ways are there to replace the time  
> consuming loops?
> 
> Here are some information:
> 
> # val.n<-24;
> # start.b<-matrix(nrow=1812, ncol=20);
> # val is a vector of the rownames of 22 in an earlier step chosen  
> extrem samples;
> # euc<-matrix(nrow=1853, ncol=1853); [contains the Euclidean  
> distance calculations]
> 
> The following calculation of the system.time was for the selection of  
> two samples:
> system.time(KEN.STO(val.n,start.b,val.start,euc))
> user  system elapsed
>   25.294  13.262  38.927
> 
> The function:
> 
> KEN.STO<-function(val.n,start.b,val,euc){
> 
> for(k in 1:val.n){
> sum.dist<-c();
> for(i in 1:length(start.b[,1])){
>   sum<-c();
>   for(j in 1:length(val)){
>   sum[j]<-euc[rownames(start.b)[i],val[j]]
>   }
>   sum.dist[i]<-min(sum);
>   }
> bla<-rownames(start.b)[which(sum.dist==max(sum.dist))]
> val<-c(val,bla[1]);
> start.b<-start.b[-(which(match(rownames(start.b),val[length(val)])!="NA")),];
> if(length(val)>=val.n)break;
> }
> return(val);
> }
> 
> Regards,
> 
> Thomas
> 
> Dr. Thomas Terhoeven-Urselmans
> Post-Doc Fellow
> Soil infrared spectroscopy
> World Agroforestry Center (ICRAF) 


Re: [R] Obtain numbers from vector of NAs and numbers

2009-01-14 Thread Carlos J. Gil Bellosta
Hello,

new.dat <- dat[ ! is.na(dat) ]

should do the trick.
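
For example, on a short made-up vector:

```r
dat <- c(NA, NA, 9, NA, 17, 18, NA, 13)

# Keep only the positions that are not NA.
new.dat <- dat[!is.na(dat)]
new.dat
```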

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2009-01-14 at 19:32 +0900, Gundala Viswanath wrote:
> Dear all,
> 
> I have this set of vectors generated via a loop.
> 
> > for (i in 1:nrow(dat)) {
> +  print(dat$v1) }
> 
>  [1] NA NA NA NA NA NA NA NA NA NA  9 NA NA NA NA NA NA NA NA NA
>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA 18 NA NA NA NA NA NA
>  [1] NA NA NA NA NA  8 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1] 17 18 NA 13 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1] NA NA NA NA  9 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1]  5  6  7 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1]  9 10 11 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1] 13 14  3 17 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1]  2  1 14 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1]  3  3 13 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>  [1]  4  4 16 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> 
> What I want to do is to extract only integer (i.e. every numbers except NA)
> yielding 1 single vector that contain all.
> 
> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20  9 18  8 17 18
> [26] 13  9  5  6  7  9 10 11 13 14  3 17  2  1 14  3  3 13  4  4 16
> 
> Is there a quick way to do it?
> 
> I tried "grep("[0-9]", vect)" but fail.
> 
> - Gundala Viswanath
> Jakarta - Indonesia
> 


Re: [R] Value Lookup from File without Slurping

2009-01-16 Thread Carlos J. Gil Bellosta
On Fri, 2009-01-16 at 18:02 +0900, Gundala Viswanath wrote:
> Dear all,
> 
> I have a repository file (let's call it repo.txt)
>  that contain two columns like this:
> 
> # tag  value
> AAA0.2
> AAT0.3
> AAC   0.02
> AAG   0.02
> ATA0.3
> ATT   0.7
> 
> Given another query vector
> 
> > qr <- c("AAC", "ATT")
> 
> I would like to find the corresponding value for each query above,
> yielding:
> 
> 0.02
> 0.7
> 
> However, I want to avoid slurping whole repo.txt into an object (e.g. hash).
> Is there any ways to do that?
> 
> The reason I want to do that because repo.txt is very2 large size
> (milions of lines,
> with tag length > 30 bp),  and my PC memory is too small to keep it.
> 
> - Gundala Viswanath
> Jakarta - Indonesia

Hello,

You can always store your repo.txt in a database, say, SQLite, and
select only the values you want via an SQL query.

That way, you avoid loading the full file into memory.
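
A sketch of what this could look like, assuming the DBI and RSQLite packages are installed (the code is skipped otherwise). The table and index names are made up, and the small data frame stands in for repo.txt, which in practice would be imported once, in chunks or with the sqlite3 command line tool:

```r
# Assumes DBI and RSQLite are available; does nothing if they are not.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {

    db.file <- tempfile(fileext = ".sqlite")
    con <- DBI::dbConnect(RSQLite::SQLite(), db.file)

    # Toy stand-in for the contents of repo.txt.
    repo <- data.frame(tag   = c("AAA", "AAT", "AAC", "AAG", "ATA", "ATT"),
                       value = c(0.2, 0.3, 0.02, 0.02, 0.3, 0.7))
    DBI::dbWriteTable(con, "repo", repo)
    DBI::dbExecute(con, "CREATE INDEX idx_tag ON repo (tag)")

    # Fetch only the rows for the tags in the query vector.
    qr  <- c("AAC", "ATT")
    sql <- paste0("SELECT tag, value FROM repo WHERE tag IN (",
                  paste(rep("?", length(qr)), collapse = ", "), ")")
    hits <- DBI::dbGetQuery(con, sql, params = as.list(qr))

    # SQL does not guarantee row order, so reorder to follow qr.
    hits <- hits[match(qr, hits$tag), ]
    DBI::dbDisconnect(con)
}
```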

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] problem with applying where condition

2009-01-19 Thread Carlos J. Gil Bellosta
Hello,

You can merge both tables first and then select the rows and columns you
want. Do it the other way around if your tables are too big.

All you need can be found at

?merge
?subset
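
With the two tables from your message typed in as data frames, a sketch of the merge-then-select route:

```r
table1 <- data.frame(empid = 101:103,
                     name  = c("kiran", "ram", "pavan"),
                     dep   = c("solutions", "testing", "database"))
table2 <- data.frame(empid = 101:103,
                     month = "Dec",
                     sal   = c(9500, 9800, 8500))

# Join on the common key, then keep the row and columns of interest.
both      <- merge(table1, table2, by = "empid")
kiran.sal <- subset(both, name == "kiran", select = c(empid, sal))
kiran.sal
```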

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Tue, 2009-01-20 at 11:03 +0530, venkata kirankumar wrote:
> Hi all,
> 
> I am a biggener in R-Project
> 
> I got one problem with applying *where condition*
> like
> 
> if 2 tables like
> table1:
> 
> empid  name   dep
> 101    kiran  solutions
> 102    ram    testing
> 103    pavan  database
> 
> table2:
> 
> empid  month  sal
> 101    Dec    9500
> 102    Dec    9800
> 103    Dec    8500
> 
> in first table i have to take *empid* with using the *name(kiran)*
> and after getting that *empid* i have to get *sal *with using that *empid*
> 
> can any one suggest how can I acheave this
> 
> 
> Thanks & regards;
> kiran
> 



Re: [R] Merging tables

2009-01-20 Thread Carlos J. Gil Bellosta
Hello,

Use merge. 
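
A minimal sketch, assuming the key column is called SampleID in the first
table and Sample in the second (both column names and the toy data are
made up for illustration):

```r
# Toy stand-ins for table 1 (all samples) and table 2 (samples of interest)
table1 <- data.frame(SampleID = c("S1", "S2", "S3", "S4"),
                     value    = c(0.5, 1.2, 0.9, 2.1))
table2 <- data.frame(Sample = c("S2", "S4"),
                     group  = c("control", "treated"))

# by.x / by.y name the key column in each table; the default (inner) join
# keeps only the samples present in both tables
both <- merge(table1, table2, by.x = "SampleID", by.y = "Sample")

# Alternative: filter table 1 against the list of IDs taken from table 2
filtered <- table1[table1$SampleID %in% table2$Sample, ]
```

The merged result carries the columns of both tables, while the filtered
one keeps only the columns of table 1.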

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Tue, 2009-01-20 at 13:41 +0000, Dry, Jonathan R wrote:
> I am relatively new to R and am trying to do some basic data manipulation.  
> Basically I have a table (csv - table 1) of data for a set of samples (rows), 
> and a second table (table 2) of information about a subset of samples of 
> particular interest.  I want to pull out the data from table 1 for the 
> samples in table 2, either by:
> * Merging the two tables based on a common identifier (SampleID - may 
> have a different header in the two tables), and filter for overlapping 
> entries (preferred approach)
> * OR filter table 1 for entries where SampleID matches to one in a list 
> taken from table 2
> 
> Any help would be gratefully received.
> 
> --
> AstraZeneca UK Limited is a company incorporated in Engl...{{dropped:21}}
> 



Re: [R] from matrix to data.frame

2009-01-20 Thread Carlos J. Gil Bellosta
Hello, 

The columns in your output dataframe are the following vectors:

X1: rownames(a)[as.vector( row(a) )]
X2: colnames(a)[as.vector( col(a) )]
X3: as.vector( a )
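
Put together, this could look like the following sketch, which assumes
the matrix a from the question and uses the row names for X1 (the name
long is only illustrative):

```r
# Matrix from the question, with row and column names set at creation
a <- matrix(1:25, nrow = 5,
            dimnames = list(as.character(1:5), LETTERS[1:5]))

# row(a) and col(a) give the row/column index of every cell, in the same
# column-major order that as.vector(a) uses
long <- data.frame(
  X1 = rownames(a)[as.vector(row(a))],
  X2 = colnames(a)[as.vector(col(a))],
  X3 = as.vector(a),
  stringsAsFactors = FALSE
)
head(long)
```

as.data.frame(as.table(a)) produces a similar long layout in one call,
with different column names.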

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On Tue, 2009-01-20 at 15:10 +0100, Antje wrote:
> Hello,
> 
> I have a question how to reshape a given matrix to a data frame.
> 
> # --
>  > a <- matrix(1:25, nrow=5)
>  > a
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    1    6   11   16   21
> [2,]    2    7   12   17   22
> [3,]    3    8   13   18   23
> [4,]    4    9   14   19   24
> [5,]    5   10   15   20   25
> 
>  > colnames(a) <- LETTERS[1:5]
>  > rownames(a) <- as.character(1:5)
>  > a
>   A  B  C  D  E
> 1 1  6 11 16 21
> 2 2  7 12 17 22
> 3 3  8 13 18 23
> 4 4  9 14 19 24
> 5 5 10 15 20 25
> 
> # ---
> 
> This is an example on how my matrix looks like.
> Now, I'd like to reshape the data that I get a data frame with three columns:
> 
> - the row name of the entry (X1)
> - the column name of the entry (X2)
> - the entry itself (X3)
> 
> like:
> 
> X1  X2  X3
> 1   A   1
> 2   A   2
> 3   A   3
> 
> 1   B   6
> 2   B   7
> 
> 5   E   25
> 
> How would you solve this problem in an elegant way?
> 
> Antje
> 



Re: [R] Does anyone has this paper in pdf?

2009-01-23 Thread Carlos J. Gil Bellosta
> Note there is a link to issues dealing with teaching material, and I'd
> imagine colleagues at your institution are likely to have the same access
> rights, so technically its just as easy to send them a link to download a
> paper themselves.

Hello,

On this point, I remember taking courses at university in which the
professor was not allowed to make and distribute copies of certain
articles for the students. However we were free to go and make the
copies (legally) ourselves. 

There seems to be a point beyond which capitalism introduces more
inefficiencies into the market than it solves.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] Mode (statistics) in R?

2009-01-26 Thread Carlos J. Gil Bellosta
Hello,

You can try ?table. 
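
For instance, a minimal sketch built on table (stat_mode is a made-up
helper name, not a base R function; it returns every value tied for the
highest count, as character strings):

```r
# Most frequent value(s) of x, based on the counts returned by table()
stat_mode <- function(x) {
  counts <- table(x)
  names(counts)[which(counts == max(counts))]
}

x <- c(2, 3, 3, 5, 5, 5, 7)
stat_mode(x)
```

Wrap the result in as.numeric() if the data are numeric and a number is
needed back.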

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On Mon, 2009-01-26 at 05:28 -0800, Jason Rupert wrote:
> Hopefully this is a pretty simple question:
> 
> Is there a function in R that calculates the "mode" of a sample? That is, I 
> would like to be able to determine the value that occurs the most frequently 
> in a data set. 
> 
> I tried the default R "mode" function, but it appears to provide a storage 
> type or something else. 
> 
> I tried the RSeek and some R documentation that I downloaded, but nothing 
> seems to mention calculating the "mode". 
> 
> Thanks again.
> 
> 
> 
> 
>   



Re: [R] D'Hondt method

2009-02-04 Thread Carlos J. Gil Bellosta
Hello,

I believe that a "productionized" version of the following would do:

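dHont <- function( candidates, votes, seats ){
    # One row per (candidate, divisor) pair: votes / 1, votes / 2, ..., votes / seats
    tmp <- data.frame(
        candidates = rep( candidates, each = seats ),
        scores = as.vector(sapply( votes, function(x) x / 1:seats ))
    )
    # The 'seats' highest quotients win one seat each
    tmp <- tmp$candidates[order( - tmp$scores )] [1:seats]
    # Seats per candidate
    table(tmp)
}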


> votes <- sample(1:10000, 5)
> votes
[1]  448 7685 5445  482 6266
> dHont(letters[1:5], votes, 10 )
tmp
a b c d e 
0 4 3 0 3 

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2009-02-04 at 12:16 +0100, Thomas Steiner wrote:
> Is there an R function to calculate the seats in parliament given the
> total number of seats and the votes for each party -- for different
> methods including the D'Hondt method?
> http://en.wikipedia.org/wiki/D%27Hondt_method
> Thanks,
> thomas
> 



[R] R Interface Coming to SAS/IML Studio

2009-02-05 Thread Carlos J. Gil Bellosta
Hello,

I thought this link could be of interest to the list. 

http://support.sas.com/rnd/app/studio/Rinterface2.html

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] How do you apply a function to each variable in a data frame?

2008-11-03 Thread Carlos J. Gil Bellosta
Data frames are lists themselves.

Something like

do.call( rbind, lapply( my.data.frame, quantile, probs=c(0.1,0.9)) )

should work.
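
A small worked sketch of that call, assuming a toy data frame with two
numeric columns (the column names u and v are made up):

```r
# Toy data frame; each column is one element of the underlying list
my.data.frame <- data.frame(u = 1:11, v = seq(0, 100, by = 10))

# lapply applies quantile to every column and returns a list of named
# vectors; do.call(rbind, ...) stacks them into one row per variable
q <- do.call(rbind, lapply(my.data.frame, quantile, probs = c(0.1, 0.9)))
print(q)
```

The result is a matrix with one row per variable and one column per
requested probability.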

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Mon, 2008-11-03 at 07:03 -0800, zerfetzen wrote:
> I want to apply a more complicated function than what I use in my example,
> but the idea is the same:
> 
> Suppose you have a data frame named x and you want a function applied to
> each variable, we'll just use the quantile function for this example.  I'm
> trying all sorts of apply functions, but not having luck.  My best guess
> would be:
> 
> sapply(x, FUN=quantile)



Re: [R] CROSSTABULATION

2008-11-13 Thread Carlos J. Gil Bellosta
Hello...

Which code are you using to perform the breakup into the three classes?
Can you be more specific on that?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Thu, 2008-11-13 at 09:57 +0000, Sohail wrote:
> I want to form a 3x3 crosstabulation for the signs of two vectors (i.e.
> Negative, Zero, Positive). The problem is that I am simulating the data so
> for some iterations one of the categories is absent. Thus the resulting
> table shrinks to 3x2. I want it to be 3x3 with zero column corresponding to
> the missing category. Moreover, I have tried but failed to give the
> dimension names.



Re: [R] SAS Institute Adding Support for R

2009-02-13 Thread Carlos J. Gil Bellosta
On Thu, 2009-02-12 at 21:32 -0300, milton ruser wrote:
> Dear all,
> I was wondering how many of R's capabilities SAS Institute could incorporate
> into SAS.
> 
> Cheers
> 
> miltinho
> brazil


Most likely, as many as they currently provide for Tomcat, Postgres,
Apache or other open source products they bundle along with their
solutions.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] everybody loves R...

2009-02-20 Thread Carlos J. Gil Bellosta
Hello,

I do not know of any such "community of R users in Spain and Latin America"
but it sounds like a great idea. 

A number of R official documents have already been translated into
Spanish, but enhancing local language support on basic documentation
would facilitate adoption of the language by universities and other
institutions currently working under a "Spanish only" restriction.

Adding a directory of local providers of coding and consultancy
resources would also increase the speed of adoption of R in the
industry, for sure.

Please, do contact me so that we can develop the idea further.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Fri, 2009-02-20 at 09:59 +0000, UsuarioR España wrote:
> Hi all
> 
> This topic is very interesting to me as I was planning to do something
> similar, but in Spanish. In my opinion, with the existing infrastructure
> in English, new resources are not necessary. However, local language
> support, and, in particular, in Spanish, is rather weak.
> 
> I don't actually know if there is something already done like "community of R
> users in Spain and Latin America" but your opinions, ideas, and offers
> of collaboration to create it, would be very useful for me if finally we
> decide to do something.
> 
> Best regards
> 
> > Date: Thu, 19 Feb 2009 23:42:31 +0100
> > From: waclaw.marcin.kusnierc...@idi.ntnu.no
> > To: landronim...@gmail.com
> > CC: r-help@r-project.org
> > Subject: Re: [R] everybody loves R...
> > 
> > Liviu Andronic wrote:
> > > On Thu, Feb 19, 2009 at 5:29 PM, Gábor Csárdi  wrote:
> > >   
> > >> I don't want to be mean, I really like wikidot, but isn't it a better
> > >> solution to use the R wiki instead?
> > >>
> > >> http://wiki.r-project.org/rwiki/doku.php
> > >>
> > >> 
> > > Or even to contribute to existing well-structured sites such as
> > > Quick-R [1]? It would avoid doubling efforts, and dispersing similar
> > > information across too many places.
> > >
> > >   
> > 
> > well, if the purpose is to have the message 'everybody loves r' imposed
> > on as many as possible, dispersing similar information across places is
> > one way to go ;)
> > 
> > vQ
> > 



Re: [R] modifying a built in function from the stats package (fixing arima)

2009-03-03 Thread Carlos J. Gil Bellosta
Hello,

I do not think that is the way to go. If you believe that your algorithm
is better than the existing one, talk to the author of the package and
discuss the improvement. The whole community will benefit.

If you want to tune the existing function and tailor it to your needs,
you have several ways to go, among them:

1) Copy the existing function into a new file, edit it and load it via
source.

2) Download the source package and modify it for your own purposes.
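
A rough sketch of option 1, under the assumption that the copied function
still needs to see objects that are internal to the stats namespace (a
likely cause of errors such as 'object "R_TSconv" not found'). Here dump
and source stand in for the manual editing step, and my_arima and scratch
are made-up names:

```r
# Write the source of arima to a file; this is where manual edits would go
tmp <- tempfile(fileext = ".R")
dump("arima", file = tmp, envir = asNamespace("stats"))

# Load the (edited) copy into a scratch environment so that the original
# stats::arima is not masked
scratch <- new.env()
source(tmp, local = scratch)
my_arima <- scratch$arima

# Point the copy back at the stats namespace so that non-exported helpers
# used inside the function body can be found
environment(my_arima) <- asNamespace("stats")

# Quick check on a dataset shipped with R
fit <- my_arima(lh, order = c(1, 0, 0))
print(class(fit))
```

The environment() assignment is the step that a plain copy-and-paste into
a new file leaves out.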

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Tue, 2009-03-03 at 18:20 +0100, Marc Vinyes wrote:
> Dear members of the list,
> 
> I'm a beginner in R and I'm having some trouble with: "Error in
> optim(init[mask], armafn, method = "BFGS", hessian = TRUE, control =
> optim.control,  :
>   non-finite finite-difference value [8]"
> 
> when running "arima".
> 
> I've seen that some people have come across the same problem:
> https://stat.ethz.ch/pipermail/r-help/2008-August/169660.html
> 
> So I'd like to modify the code of arima to change the optimization function
> with another one that handles these problems automatically , however I don't
> find the way to do it and
> http://tolstoy.newcastle.edu.au/R/e6/help/09/01/2476.html points out a way
> that doesn't work for me:
> 
> * If I type edit(arima) and I modify it, changes are not saved,
> * If I copy the code and I save it like a different function, I get the hard
> error: "Error in Delta %+% c(1, -1) : object "R_TSconv" not found"
> 
> Anybody can give me a hint? I miss matlab's easy way of doing this ("edit
> function.m").
> 
> Thanks in advance
> 
> MarC (AleaSoft)
> 



Re: [R] Fast Fourier Transform w.r.t. CreditRisk+

2009-03-05 Thread Carlos J. Gil Bellosta
Hello,

You have a link on the subject here:

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1122844

The author has extra literature and code on the subject. 

Also, there was a thread in R-SIG-Finance list on the subject a few
months ago.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On Thu, 2009-03-05 at 03:48 -0800, Maithili Shiva wrote:
> Dear R Helpers,
> 
> Is there any literature available (including R code) on the Fast Fourier Transform 
> being used in CreditRisk+? I need to learn how to apply the Fast Fourier 
> Transform. I agree I am too vague in my question and sincerely apologize for 
> that, but I do not know where to start with this 
> particular assignment. I tried to search Google for CRAN and Fast Fourier 
> Transform, but I got something for FFT image. Basically I need to understand 
> what the Fast Fourier Transform is and how it is used in CreditRisk+.
> 
> With regards and thanking you in advance
> 
> Maithili
> 



[R] Describing clusters

2009-03-18 Thread Carlos J. Gil Bellosta
Dear R-helpers,

I am writing to the list in order to inquire whether there exists any
R package or program that will help me "describe" clusters.

The situation is as follows:

1) I create some clusters (say, with any clustering method in R).

2) I want to "describe" and assign some kind of "label" to each of them.

In order to "label" each cluster, I want to compare the distribution
of the variables in each cluster with respect to the distribution of
the variables in the original dataset. I would like to do it
graphically, if possible. In this way I could review this output and
say: this cluster corresponds to, say, "older patients who were not
treated before", etc.

I am aware this is not sound scientific practice, but I am asked to do
something like that. I have some ideas about how to do it, but I would
like to know if I am walking on a well trodden path.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com





Re: [R] I have a question about a programa in R

2009-03-22 Thread Carlos J. Gil Bellosta
Hello,

And what exactly is your problem? 

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] How to: Read Multi-filtes and sort to different files

2009-03-25 Thread Carlos J. Gil Bellosta
Dear Mr. Li,

To make things simpler, you could place the files corresponding to
different stations in different directories. Then:

1) I would loop over the directories.

2) I would use dir and loop through the resulting vector (that would
contain the file names).

3) I would use read.table with parameters skip (to skip the header) and
the header option set to true.

4) I would aggregate the resulting files in a single big file.

There are ways to do that. Some involve using for loops; you can also
use lapply to loop over files and rbind if you feel confident with a
command similar to

do.call( rbind, lapply( dir(), read.table, skip = 1, header = TRUE ) )

I have not been able to test the expression above and it may not even
parse in R but it is close to something that should work.
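
A self-contained sketch of steps 2 to 4 for a single station, assuming
each file starts with a station line followed by a header line, as in
the example from the question. The directory layout, file contents and
object names below are made up for illustration:

```r
# Hypothetical layout: one directory per station, created here in tempdir()
station_dir <- file.path(tempdir(), "ST2")
dir.create(station_dir, showWarnings = FALSE)
writeLines(c("Station 2", "date time wind CO2", "2009 10:30 2 3"),
           file.path(station_dir, "ST2.20090321.txt"))
writeLines(c("Station 2", "date time wind CO2", "2009 10:40 4 5"),
           file.path(station_dir, "ST2.20090322.txt"))

# List the data files, read each one (skipping the station line) and
# stack the resulting data frames row-wise
files  <- dir(station_dir, pattern = "\\.txt$", full.names = TRUE)
pieces <- lapply(files, read.table, skip = 1, header = TRUE)
st2    <- do.call(rbind, pieces)

# Save the combined data for this station
write.table(st2, file.path(station_dir, "ST2.master"),
            row.names = FALSE, quote = FALSE)
```

Looping the same block over the station directories would give one
master file per station.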

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Wed, 2009-03-25 at 14:30 -0700, Qianfeng Li wrote:
> new R user has a question: 
> 
> I have several hundreds of .txt files from  different monitoring sites over 
> several years. 
> (1) each site has a unique name (such as: ST2.20090321.txt = Station 2,
> 2009 March 21 data; ST3.20090322 = Station 3, 2009 March 22 data). 
> (2) different sites have different file headers, but for the same site, the 
> header is the same.
> for example: 
> Station 2 
> date time wind CO2 
> 2009 10:30 2 3 
> station 3
> data time solar NO
> 2009 10:20 4 5
> 
> Question: 
> How to write a "R" program to read all these files, and combine the data from 
> each station to one file (such as: ST2.master will save all the data from 
> station 2, and ST1.master will save all the data from station 1) ? 
> 
> 
> Thanks a million times!
> 
> Jeff 
> 
> 
>   



[R] "Overloading" some non-dispatched S3 methods for new classes

2009-05-09 Thread Carlos J. Gil Bellosta
Hello,

I am building a package that creates a new kind of object not unlike a
dataframe. However, it is not an extension of a dataframe, as the data
themselves reside elsewhere. It only contains "metadata".

I would like to be able to retrieve data from my objects such as the
number of rows, the number of columns, the colnames, etc.

I --quite naively-- thought that ncol, nrow, colnames, etc. would be
dispatched, so I would only need to create a, say, ncol.myclassname
function so as to be able to invoke "ncol" directly and transparently.

However, it is not the case. The only alternative I can think about is
to create decorated versions of ncol, nrow, etc. to avoid naming
conflicts. But I would still prefer my package users to be able to use
the undecorated function names.

Do I have a chance?
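
For what it is worth, nrow and ncol are defined in base R as dim(x)[1L]
and dim(x)[2L], and dim is an internal generic that does dispatch on S3
classes; colnames likewise goes through dimnames for anything that is
not a data frame. A minimal sketch, assuming a hypothetical class
remote_table that stores only metadata (all names below are made up):

```r
# Constructor for a metadata-only object; the data would live elsewhere
new_remote_table <- function(n_rows, col_names) {
  structure(list(n_rows = n_rows, col_names = col_names),
            class = "remote_table")
}

# S3 methods for the internal generics that nrow/ncol/colnames rely on
dim.remote_table      <- function(x) c(x$n_rows, length(x$col_names))
dimnames.remote_table <- function(x) list(NULL, x$col_names)

rt <- new_remote_table(1000L, c("id", "age", "income"))
nrow(rt)
ncol(rt)
colnames(rt)
```

With these two methods in place, users can call the undecorated
functions directly.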

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



[R] Segmentation fault in package rJava on CentOS server

2009-05-13 Thread Carlos J. Gil Bellosta
Hello,

I just installed rJava on

[r...@ug13 ~]# R --version
R version 2.9.0 (2009-04-17)

running on a

[r...@ug13 ~]# cat /etc/redhat-release
CentOS release 5.3 (Final)

This is the output of

[r...@ug13 ~]# R CMD javareconf
Java interpreter : /usr/bin/java
Java version     : 1.4.2_18
Java home path   : /usr/java/j2sdk1.4.2_18/jre
Java compiler    : /usr/bin/javac
Java headers gen.: /usr/bin/javah
Java archive tool: /usr/bin/jar
Java library path: $(JAVA_HOME)/lib/i386/client:$(JAVA_HOME)/lib/i386:$(JAVA_HOME)/../lib/i386
JNI linker flags : -L$(JAVA_HOME)/lib/i386/client -L$(JAVA_HOME)/lib/i386 -L$(JAVA_HOME)/../lib/i386 -ljvm
JNI cpp flags    : -I$(JAVA_HOME)/../include -I$(JAVA_HOME)/../include/linux

Package rJava got properly installed (there were a number of warnings,
though, in the installation process). However,

> library(rJava)
> .jinit("")

 *** caught segfault ***
address 0xc, cause 'memory not mapped'

Traceback:
 1: .External("RinitJVM", boot.classpath, parameters, PACKAGE = "rJava")
 2: .jinit("")

Whenever I try to interact with Java from R --I am interested in the
RJDBC package--, I get the same segmentation fault at the .jinit call.
In particular, when .jinit calls RinitJVM.

Any ideas?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



[R] Bug in truncgof package?

2009-05-31 Thread Carlos J. Gil Bellosta
Dear R-helpers,

I was testing the truncgof CRAN package, found something that looked
like a bug, and did my job: contacted the maintainer. But he did not
reply, so I am resending my query here.

I installed package truncgof and ran the example for function ad.test. I
got the following output:

set.seed(123)
treshold <- 10
xc  <- rlnorm(100, 2, 2)# complete sample
xt <- xc[xc >= treshold]# left truncated sample
ad.test(xt, "plnorm", list(meanlog = 2, sdlog = 2), H = 10)


Supremum Class Anderson-Darling Test

data:  xt 
AD = 3.124, p-value = 0.12
alternative hypothesis: two.sided 

treshold = 10, simulations: 100


So I cannot reject the hypothesis (at a standard significance level) that
the original sample comes from a lognormal distribution (as is the
case).

But let us try to iterate on this example:

set.seed( 123 )
treshold <- 10

foo <- function(){
  xc  <- rlnorm(100, 2, 2) # complete sample
  xt <- xc[xc >= treshold] # left truncated sample
  ks.test(xt, "plnorm", list(meanlog = 2, sdlog = 2), H = 10)$p.value
}

results <- replicate( 100, foo() )


Then:

> table( results )
results
   0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09  0.1 0.11 0.16 0.18 0.19  0.2 
  25    7    9    3    1    2    3    4    1    1    2    2    1    1    3    2 
0.21 0.22 0.26 0.27 0.28  0.3 0.31 0.32 0.33 0.36 0.38  0.4 0.44 0.49 0.54 0.55 
   2    2    1    3    1    2    1    1    1    2    1    2    1    1    2    1 
0.56 0.57 0.62  0.7 0.76 0.78 0.96 0.98 
   1    2    1    1    1    1    1    1 


That is, in 45% of the cases, you would reject the H_0 hypothesis,
which happens to be true, at the "standard" 5% significance level.
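
For comparison, a sketch of the same kind of check with the ordinary
Kolmogorov-Smirnov test from the stats package on complete (untruncated)
lognormal samples. This does not use truncgof at all; it only
illustrates that a well-calibrated test should reject a true null
hypothesis in roughly 5% of the replications:

```r
set.seed(123)

# p-values of stats::ks.test under a true null hypothesis
p_values <- replicate(
  1000,
  stats::ks.test(rlnorm(100, 2, 2), "plnorm", meanlog = 2, sdlog = 2)$p.value
)

# Share of replications rejected at the 5% level; expected to be near 0.05
rejection_rate <- mean(p_values < 0.05)
rejection_rate
```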

Do you think this behaviour is buggy? If so, given that the maintainer
does not seem to be contactable, what would be the next step to take?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



[R] Recursive partitioning algorithms in R vs. alia

2009-06-19 Thread Carlos J. Gil Bellosta
Dear R-helpers,

I had a conversation with a guy working in a "business intelligence"
department at a major Spanish bank. They rely on recursive partitioning
methods to rank customers according to certain criteria. 

They use both SAS EM and Salford Systems' CART. I have used package
rpart in the past, but I could not provide any kind of feature comparison
or the like as I have no access to any installation of the first two
proprietary products.

Has anybody experience with them? Is there any public benchmark
available? Is there any very good --although solely technical-- reason
to pay hefty software licences? How would the algorithms implemented in
rpart compare to those in SAS and/or CART?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] Large Stata file Import in R

2009-06-29 Thread Carlos J. Gil Bellosta
Hello,

You are dealing with two different problems at the same time: importing
Stata data and importing a relatively big file.

Can you try to export your data to a text file first and then import
it from there directly?

Secondly, problems concerning reading big files with R occur quite often
and there are plenty of discussions and workarounds described in
previous posts. 

I am the author of a new package aimed at reading files column-wise. It
is quite frugal with memory as the data resides mostly on R dumped files
of the objects representing the rows of your data.

You can install and test it via

install.packages("colbycol",repos="http://R-Forge.R-project.org")

Comments and bug reports are more than welcome!

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com




On Mon, 2009-06-29 at 15:50 +0100, saurav pathak wrote:
> Hi
> 
> I am using Stata 10 and I need to import a data set in stata 10 to R, I have
> saved the dataset in lower versions of Stata as well by using saveold
> command in Stata.
> 
> My RAM is 4gb and the stata file is 600MB, I am getting an error message
> which says :
> 
> "Error: cannot allocate vector of size 3.4 Mb
> In addition: There were 50 or more warnings (use warnings() to see the first
> 50)"
> 
> Thus far I have already tried the following
> 
> 1. By right clicking on the R icon I have used --max-mem-size=1000M in the
> "target" under "properties" of the R icon
> 2. I have used library(foreign) at the command prompt
> 3. then I use trialfile <- read.dta("C:/filename.dta")
>  Here I get an error for a Stata data file that is 600MB in size; however, with
> data sets in Stata 10 and Stata 8 of about 200KB, I have successfully
> been able to import the Stata file into R
> 
> I am therefore confused whether there is a problem with the version of my Stata
> file (which should not be the case, as the smaller files of both versions
> work fine) or whether it is a size issue.
> 
> It's pretty important for me, kindly address this question
> Thanks
> Saurav
> 
>



Re: [R] Aggregate, max and time of max

2009-07-24 Thread Carlos J. Gil Bellosta
Hello,

I believe that

by( data.ex, data.ex[,c(3,4)], function(x) x[which.max(x[,1]),] )

does what you want. Then,

do.call( rbind, by( data.ex, data.ex[,c(3,4)], function(x)
x[which.max(x[,1]),] ) )

looks somewhat nicer.
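
A self-contained sketch of the call above, assuming a smaller simulated
data set than in the original post (3 subjects instead of 20) and with
data.ex built from data; the names peaks and expected are made up:

```r
set.seed(1)
data    <- expand.grid(time = 1:6, subject = 1:3,
                       treatment = c("placebo", "drug"))
data.ex <- cbind(y = rnorm(nrow(data), 5, 1), data)

# Columns 3 and 4 are subject and treatment; for every combination keep
# the whole row where y (column 1) is largest, so time is retained
peaks <- do.call(rbind,
                 by(data.ex, data.ex[, c(3, 4)],
                    function(x) x[which.max(x[, 1]), ]))

# Cross-check against the peak values computed with aggregate
expected <- aggregate(y ~ subject + treatment, data = data.ex, FUN = max)
print(peaks)
```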

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Fri, 2009-07-24 at 13:06 -0400, Afshartous, David wrote:
> All,
> 
> For data consisting of serial measurements on subjects, one may use the
> aggregate function to say compute the peak response for each subject for
> each design condition.  Is there a way to alter this or another one-liner to
> also retain the time at which the peak occurred and thus avoid
> doing this via a loop?  I suppose one could attempt to employ the split
> function but that's probably no simpler than employing a loop.  Sample code
> below:
> 
> 
> data = expand.grid(time = seq(1,6), subject = seq(1,20), treatment =
> c("placebo", "drug"))
> data.ex = cbind(y = rnorm(dim(data)[1], 5, 1), data)
> 
> data.peak = aggregate(data.ex[c(1)], data.ex[c(3,4)], max)
> ## this provides the peak of each subject on each treatment, but time is
> ## lost.  Including time in the statement doesn't help clearly as then
> ## the peak of all the times will be calculated
> 
> 
> David
> 
> --
> David Afshartous, Ph.D.
> Research Assistant Professor
> University of Miami, Miller School of Medicine
> Division of Clinical Pharmacology
> 1500 N.W. 12th Avenue, 15th Floor West
> Miami, Florida 33136
> 
> E-mail: afs...@med.miami.edu
> Phone: +1 305-243-1549
> 


Re: [R] Question about rpart decision trees (being used to predict customer churn)

2009-08-01 Thread Carlos J. Gil Bellosta
Hello,

If you do

my.tree <- rpart(cancel ~ experience)

and then you check

my.tree$frame

you will note that the complexity parameter there is 0. 

Check ?rpart.object to get a description of what this output means. But
essentially, you will not be able to break the leaf unless you set a
complexity parameter below that value, that is, never.

You may need to go into the internals of the function (and the C code)
in order to understand how this parameter is calculated. It looks like
an oddity to me, and it is worth trying to understand why.
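
A minimal sketch of the check described above, assuming the rpart package is installed and using the data from the original question (with the line wrap in the cancel definition repaired):

```r
library(rpart)  # recommended package, shipped with standard R installations

experience <- as.factor(c(rep("good", 90), rep("bad", 10)))
cancel     <- as.factor(c(rep("no", 85), rep("yes", 5),
                          rep("no", 5),  rep("yes", 5)))

my.tree <- rpart(cancel ~ experience)

# One row per node; the "complexity" column is the value discussed above
my.tree$frame

# Number of nodes in the fitted tree (a single row means root only)
nrow(my.tree$frame)
```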

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


P.S.: Note that there is a bug in your submitted code that requires some
hand fixing.



On Sun, 2009-07-26 at 11:37 -0700, Robert Smith wrote:
> Hi,
> 
> I am using rpart decision trees to analyze customer churn. I am finding that
> the decision trees created are not effective because they are not able to
> recognize factors that influence churn. I have created an example situation
> below. What do I need to do to for rpart to build a tree with the variable
> experience? My guess is that this would happen if rpart used the loss matrix
> while creating the tree.
> 
> > experience <- as.factor(c(rep("good",90), rep("bad",10)))
> > cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5),
> rep("yes",5)))
> > table(experience, cancel)
>   cancel
> experience no yes
>   bad   5   5
>   good 85   5
> > rpart(cancel ~ experience)
> n= 100
> node), split, n, loss, yval, (yprob)
>   * denotes terminal node
> 1) root 100 10 no (0.900 0.100) *
> 
> I tried the following commands with no success.
> rpart(cancel ~ experience, control=rpart.control(cp=.0001))
> rpart(cancel ~ experience, parms=list(split='information'))
> rpart(cancel ~ experience, parms=list(split='information'),
> control=rpart.control(cp=.0001))
> rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2,
> ncol=2)))
> 
> Thanks a lot for your help.
> 
> Best regards,
> Robert
> 


Re: [R] Question about rpart decision trees (being used to predict customer churn)

2009-08-02 Thread Carlos J. Gil Bellosta
Hello,

Isn't it totally counter-intuitive that if you penalize the error less
the tree finds it?

See:

library(rpart)

experience <- as.factor(c(rep("good",90), rep("bad",10)))
cancel <- as.factor(c(rep("no",85), rep("yes",5),
                      rep("no",5), rep("yes",5)))

foo <- function( i ){
    tmp <- rpart(cancel ~ experience,
                 parms=list(loss=matrix(c(0,i,1,0), byrow=TRUE, nrow=2)))
    nrow( tmp$frame )
}

sapply( 1:20, foo )

The output I get is:

 [1] 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 1

So, something unexpected happens after penalization exceeds 16... Should
it be?

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



On Sun, 2009-08-02 at 08:41 +1000, Graham Williams wrote:
> Hi Robert,
> 
> Perhaps try a less extreme loss matrix:
> 
> rpart(cancel ~ experience, parms=list(loss=matrix(c(0,5,1,0), byrow=TRUE,
> nrow=2)))
> 
> Output from Rattle:
> 
> Summary of the Tree model for Classification (built using rpart):
> 
> n= 100
> 
> node), split, n, loss, yval, (yprob)
>   * denotes terminal node
> 
> 1) root 100 50 no (0.9000 0.1000)
>   2) experience=good 90 25 no (0.9444 0.0556) *
>   3) experience=bad 10  5 yes (0.5000 0.5000) *
> 
> Classification tree:
> rpart(formula = cancel ~ ., data = crs$dataset, method = "class",
> parms = list(loss = matrix(c(0, 5, 1, 0), byrow = TRUE, nrow = 2)),
> control = rpart.control(cp = 0.0001, usesurrogate = 0, maxsurrogate =
> 0))
> 
> Variables actually used in tree construction:
> [1] experience
> 
> Root node error: 50/100 = 0.5
> 
> n= 100
> 
>       CP nsplit rel error xerror xstd
> 1 0.4000      0       1.0    1.0 0.30
> 2 0.0001      1       0.6    0.6 0.22
> 
> TRAINING DATA Error Matrix - Counts
> 
>  Actual
> Predicted no yes
>   no  85   5
>   yes  5   5
> 
> 
> TRAINING DATA Error Matrix - Percentages
> 
>  Actual
> Predicted no yes
>   no  85   5
>   yes  5   5
> 
> Time taken: 0.01 secs
> 
> Generated by Rattle 2009-08-02 08:24:50 gjw
> ==
> 


Re: [R] how to avoid a script from hanging up

2009-08-02 Thread Carlos J. Gil Bellosta
Hello,

Something you can do is save your strings in an external text file
(using cat, for instance). That way, you would not need much
memory while extracting your data.

Once you have extracted it, you can always have a look at your external
file to see if it is too big, what to do with it, etc.

You can even consider saving your data into a database if need be.
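
A rough sketch of that idea, under stated assumptions: flaky_query and steady_query are made-up stand-ins for the real database calls, the identifiers are invented, and a temporary file plays the role of the results file. Each result is appended with cat() as soon as it arrives, errors are trapped with tryCatch(), and a rerun skips whatever is already on disk:

```r
out_file <- tempfile(fileext = ".txt")   # stand-in for a real results file
ids <- sprintf("id%03d", 1:10)           # hypothetical identifiers to query

# Hypothetical query that fails for one identifier, mimicking a remote error
flaky_query <- function(id) {
  if (id == "id004") stop("simulated remote failure")
  paste0("SEQ_", id)
}
# Hypothetical query that always succeeds, used for the rerun
steady_query <- function(id) paste0("SEQ_", id)

run_batch <- function(ids, out_file, query) {
  # Identifiers already written in a previous run are skipped
  done <- if (file.exists(out_file)) {
    sub("\t.*$", "", readLines(out_file))
  } else {
    character(0)
  }
  for (id in setdiff(ids, done)) {
    res <- tryCatch(query(id), error = function(e) NA_character_)
    if (!is.na(res)) {
      # Append immediately so nothing accumulates in memory
      cat(id, "\t", res, "\n", sep = "", file = out_file, append = TRUE)
    }
  }
  invisible(out_file)
}

run_batch(ids, out_file, flaky_query)
n_first <- length(readLines(out_file))    # id004 is missing after the first run

run_batch(ids, out_file, steady_query)
n_second <- length(readLines(out_file))   # the rerun only fetches what was missing
```

Since the file itself records progress, killing and restarting the R process loses at most the query that was in flight.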

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Sun, 2009-08-02 at 20:02 +0200, mau...@alice.it wrote:
> I am submitting this problem to the  R forum , rather than the Bioconductor 
> forum, because its nature is closer to programming style than any  
> Bioinformatic contents.
> I have implemented an R script to extract many strings through querying 3 
> Bioinformatic databases in the same loop cycle. Ideally, the script should 
> perform as many cycles as necessary to extract all available data of interest.
> Inevitably it triggers a BioMart exception after running many cycles in a 
> row. The exception seems to be independent of the script instructions because 
> if I restart the script from the point where it got interrupted then it runs 
> for another while, extracting also the data where the exception occurred with 
> no problem at all.
> Sometimes, though, the script does not respond any more, it hangs up, even if 
> no exception has apparently occurred, and the only way to regain control is 
> to kill the R process. This way I lose memory of how many data have been 
> processed and stored to disk files (unless I manually count them ... there 
> are thousands ..). If I restart the script then it restarts processing the 
> data strings from scratch. I guess it may be a memory problem as the task 
> manager (Windows/XP) shows that the hung-up R script is taking more than 70% 
> of the available RAM.
> I wonder whether there is any system command to make the script self-aware of 
> its memory requirements and running time.
> Ideally the script should be able to trap the exception and be sensitive to 
> its current RAM / CPU time requirements, self-exit after freezing and saving 
> the current program status so that when rerun it would not restart from 
> scratch but rather pick up from where it exited.
> Maybe this is asking too much from a non-compiled language ?
> 
> Thank you in advance,
> Maura 
> 
> 


[R] subset of a matrix

2009-08-27 Thread Carlos Gonzalo Merino Mendez
Hello everyone, I would appreciate any help with the following.

My dataset is a list containing matrices. So if you type e.g.

data[[1]]

you get something like:

       [,1] [,2]
361a   A    T
456b   A    G
72145a T    G


As you can see my rows have names which are character strings containing 
numbers and letters. I want something similar to a histogram, per column. i.e. 
I want to know how many times I have a single repeat character in a column and 
how many times I have a twice repeated character and so on. Maybe there is an 
easy way to do this, but I wrote my own code which works perfectly, so don't 
bother to correct it unless extremely necessary. I write down the code so you 
know exactly what I'm trying to do:

table <- vector()

for (i in (1:length(data))){

for (j in (1:length(data[[i]][1,]))){

t <- table(data[[i]][,j])

table <- c(table, t)
}}

ncount <- table[names(table) != "-"] #this line is necessary to eliminate "-" 
characters which should not be included in the analysis

sfs <- table (ncount)

And with this code I get something like:

 1   2   3   4   5   6   7   8   9  10 

542 125  98  49  47  41  26  31  22  18  

which is what I'm looking for.


Now comes THE problem:

As I said before my rows have names. Each name is unique. I want to apply my 
analysis to a subset of rows in each matrix, namely all rows whose names start 
with 3, all that start with 4, all that start with 721. In most cases only the 
first character is important, but since I have names of different length, in 
some cases I need the first three characters to differentiate the groups. I 
want to integrate this into the loop so that I get a vector (such as the one 
called "table" in my code) for each subset analyzed.

I tried using the subset function, but I couldn't figure out how to use it, 
because it's intended to use row values to define the subset, not row names. 

I hope someone can help me out, but please bear in mind I am really new at R 
and most commands and parameters are really unfamiliar to me.

Thanks.


  


Re: [R] subset of a matrix

2009-08-27 Thread Carlos Gonzalo Merino Mendez
Hi Milton,

Thanks for trying to help anyway.





From: milton ruser 

Cc: r-help@r-project.org
Sent: Thursday, August 27, 2009 6:48:41 PM
Subject: Re: [R] subset of a matrix


Hi Carlos,
 
I think I made a wrong suggestion. Sorry about that.
I was thinking that having row names of the same length would help you with 
the data handling. Is that true? If so, I can try to suggest another automatic 
way for you to do it.
 
 
bests
 
milton


 
On Thu, Aug 27, 2009 at 12:39 PM, milton ruser  wrote:

Hi Carlos,
> 
>how about this step first:
> 
>rownames(mydata)<-gsub("361a","00361a",rownames(mydata))
>rownames(mydata)<-gsub("456a","00456a",rownames(mydata))
>
>good luck
> milton
> 

>


Re: [R] subset of a matrix

2009-08-27 Thread Carlos Gonzalo Merino Mendez
Hi Henrique,

I tried your code. I simply copied and pasted it because I have no idea how it 
works. What I get is the total number of A's and T's and all other characters, 
which was not my intention. Maybe I need to make some modifications to your 
script before I can apply it within my script? Can you explain what you are 
using those commands for?

Thanks for the help anyway.

Cheers,

Carlos





From: Henrique Dallazuanna 

Cc: r-help@r-project.org
Sent: Thursday, August 27, 2009 7:00:45 PM
Subject: Re: [R] subset of a matrix

Try this:

lapply(data,
       function(r)
           lapply(split(r,
                        substr(sprintf("%05d",
                                       as.numeric(gsub("[a-z]", "", row.names(r)))),
                               1, 3)),
                  table))




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



  


[R] regexp help needed.

2009-08-28 Thread Carlos Gonzalo Merino Mendez
Hi,

I posted yesterday with a problem in a script. I still have the same problem, 
but I think I found a better way to explain my problem.

I have a vector of character strings. Each string is unique, including numbers 
and letters. In the real world they represent a list of codes, so each position 
in the string has a meaning to me. I want to make a subset of the vector using 
"wildcards". So for example, take all the strings that start with 3. 

Any ideas how to do that?

Thanks for any help.



  


Re: [R] regexp help needed.

2009-08-28 Thread Carlos Gonzalo Merino Mendez
Thanks, this is what I was looking for.





From: Henrique Dallazuanna 

Cc: r-help@r-project.org
Sent: Friday, August 28, 2009 1:42:25 PM
Subject: Re: [R] regexp help needed.

See this example:

Str <- c("345asd", "31qwe", "234tyu", "40kjhg")
grep("^3", Str, value = TRUE)
split(Str, substr(Str, 1, 1))
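
Applied to the row names of a matrix, as in the earlier "subset of a matrix" thread, the same idea might look like the sketch below. The toy matrix m, its fourth row name and the object names m3, m721 and counts3 are made up for illustration:

```r
# Toy matrix with row names like those in the earlier thread (made-up data)
m <- matrix(c("A", "A", "T", "A",
              "T", "G", "G", "T"),
            ncol = 2,
            dimnames = list(c("361a", "456b", "72145a", "3902c"), NULL))

# Rows whose names start with "3"; drop = FALSE keeps a matrix even for one row
m3 <- m[grepl("^3", rownames(m)), , drop = FALSE]

# Same for a three-character prefix
m721 <- m[grepl("^721", rownames(m)), , drop = FALSE]

# Per-column counts of each character within the subset
counts3 <- lapply(seq_len(ncol(m3)), function(j) table(m3[, j]))
```

grepl() returns a logical vector, so it can index the rows directly; grep() with value = TRUE would return the matching names instead.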




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



  


Re: [R] DOE in R?

2009-09-06 Thread Carlos J. Gil Bellosta
Hello,

This is your starting point:

http://cran.r-project.org/web/views/ExperimentalDesign.html

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Thu, 2009-09-03 at 17:38 -0700, B_miner wrote:
> Hello!
> 
> 
> This is not a topic I am well versed in but required to become well versed
> in...I welcome any assistance!
> 
> Using R, I want to create an optimal design for an experiment. I'll be
> analyzing the results with logistic regression or some generalized linear
> model. I am thinking that the algdesign package can help (but no idea where
> to start?).
> 
> I'm presenting an example here that I have seen the answer to (in SAS) in
> order to make sure I would have gotten it *right*.
> 
> There are 5 factors: 4 are quantitative with three levels each and 1 is
> qualitative with two levels.
> 
> Factor and levels:
> 
> Intro: 0, 1.99, 2.99
> Duration: 6, 9 ,12
> GOTO: 3.99, 4.99, 5.99
> Fee: 0, 15, 45
> Color: Red, White
> 
> 
> In order to screen these factors, I would want to get a design where I could
> evaluate all main effects, all first order interactions and the squared
> terms of Intro, Duration, GOTO and FEE (for example Intro*Intro).
> 
> Looking for the D-optimal design.
> 
> Is this something that  R can provide?
> 
> 
> These are, according to the SAS paper I read, the following:
> Obs intro duration goto fee color
> 1 0.00 6 3.99 0 WHITE
> 2 0.00 6 3.99 45 RED
> 3 0.00 6 5.99 0 RED
> 4 0.00 6 5.99 45 WHITE
> 5 0.00 9 3.99 45 RED
> 6 0.00 9 4.99 15 WHITE
> 7 0.00 9 5.99 0 RED
> 8 0.00 12 3.99 0 RED
> 9 0.00 12 3.99 45 WHITE
> 10 0.00 12 5.99 0 WHITE
> 11 0.00 12 5.99 45 RED
> 12 0.00 12 5.99 45 WHITE
> 13 1.99 6 3.99 15 RED
> 14 1.99 6 4.99 45 WHITE
> 15 1.99 6 5.99 0 WHITE
> 16 1.99 9 5.99 45 RED
> 17 1.99 12 3.99 0 WHITE
> 18 1.99 12 5.99 15 RED
> 19 2.99 6 3.99 0 WHITE
> 20 2.99 6 3.99 45 WHITE
> 21 2.99 6 4.99 0 RED
> 22 2.99 6 5.99 15 WHITE
> 23 2.99 6 5.99 45 RED
> 24 2.99 9 3.99 0 RED
> 25 2.99 12 3.99 15 WHITE
> 26 2.99 12 3.99 45 RED
> 27 2.99 12 4.99 0 WHITE
> 28 2.99 12 4.99 45 RED
> 29 2.99 12 5.99 0 RED
> 30 2.99 12 5.99 45 WHITE
> 
>



Re: [R] DOE in R?

2009-09-08 Thread Carlos J. Gil Bellosta
Hello,

Maybe you want something like this:

desD2 <- optFederov(~(Intro+Duration+GOTO+Fee+Color)^2,dat, nTrials =
30)

In any case, both SAS's proc optex and R's optFederov implement a
non-exhaustive search algorithm and nothing guarantees that the final
design will be the same. 

However, you can compare SAS's and R's D values to see how far each
design is from the "optimal" one.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



On Sun, 2009-09-06 at 20:33 -0400, b miner wrote:
> I attempted to use the package algdesign. I used the following code.
> However, the results were very much not matching the reference I noted
> (which is located at
> http://www2.sas.com/proceedings/sugi31/196-31.pdf). Instead of 30
> design points, I received 25 and those that were returned, only half
> or so matched the reference. I am inexperienced with optimal designs
> so I don't know if I am doing something wrong, the package is not the
> correct one for the task or a combination. Here is the code in case
> anyone has any input:
> 
> #CODE:
> 
> runif(1) #for a bug in program (assumes random seed object exists)
> 
> dat<-gen.factorial(c(3,3,3,3,2),varNames=c("Intro","Duration","GOTO","Fee","Color"))
> dat #show design plan
> 
> desD<-optFederov(~Intro+Duration+GOTO+Fee+Color
> +quad(Intro)+quad(Duration)+quad(GOTO)+quad(Fee)+Intro*Duration
> +Intro*GOTO+Intro*Fee+Intro*Color+Duration*GOTO+Duration*Fee
> +Duration*Color+GOTO*Fee+GOTO*Color
> +Fee*Color,dat,crit="D",maxIteration=1000,eval=TRUE)
> 
> #D
> desD$D
> 
> #design
> desD$design
> design<-desD$design
> 
> 
> 
> 


Re: [R] DOE in R?

2009-09-09 Thread Carlos J. Gil Bellosta
Hello,

I think I misread the link you sent me yesterday. In any case, the
reason why SAS generates a design with 30 trials whereas R produces
one with 25 is that, if the number of trials is not specified in the
proc/function call, the two systems use different defaults: the number
of columns in the design matrix plus 10 for SAS, and plus 5 for R. You
can check the proc optex and optFederov documentation for details.

Beyond that, SAS implements several optimization methods, Fedorov's
among them, whereas R's optFederov implements just (one version of)
Fedorov's. Finally, the non-deterministic nature of the search for
optimality (and the existence of local optima) may lead to
discrepancies between the outputs. You will have to work through the
different proc/function inputs to make sure you are running an
analogous algorithm on both systems.
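
As a sketch of how the number of trials could be fixed explicitly, assuming the AlgDesign package is installed: the model formula is the one from the quoted code, while the seed and the name desD30 are arbitrary choices made here for illustration.

```r
library(AlgDesign)  # CRAN package providing gen.factorial() and optFederov()

set.seed(123)  # arbitrary seed; the search starts from a random design

# Full 3x3x3x3x2 candidate set, as in the quoted code (coded, centred levels)
dat <- gen.factorial(c(3, 3, 3, 3, 2),
                     varNames = c("Intro", "Duration", "GOTO", "Fee", "Color"))

# Same model as in the quoted code, but with the number of trials set
# explicitly instead of relying on the default
desD30 <- optFederov(~ Intro + Duration + GOTO + Fee + Color +
                       quad(Intro) + quad(Duration) + quad(GOTO) + quad(Fee) +
                       Intro*Duration + Intro*GOTO + Intro*Fee + Intro*Color +
                       Duration*GOTO + Duration*Fee + Duration*Color +
                       GOTO*Fee + GOTO*Color + Fee*Color,
                     data = dat, nTrials = 30, criterion = "D")

nrow(desD30$design)  # number of runs in the selected design
desD30$D             # D value, for comparison across runs or software
```

Even with the same number of trials, the selected rows need not coincide with the SAS design, for the reasons given above; the D values are the more meaningful thing to compare.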

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com




On Sun, 2009-09-06 at 20:33 -0400, b miner wrote:
> I attempted to use the package algdesign. I used the following code.
> However, the results were very much not matching the reference I noted
> (which is located at
> http://www2.sas.com/proceedings/sugi31/196-31.pdf). Instead of 30
> design points, I received 25 and those that were returned, only half
> or so matched the reference. I am inexperienced with optimal designs
> so I dont know if I am doing something wrong, the package is not the
> correct one for the task or a combination. Here is the code in case
> anyone has any input:
> 
> #CODE:
> 
> runif(1) #for a bug in program (assumes random seed object exists)
> 
> dat<-gen.factorial(c(3,3,3,3,2),varNames=c("Intro","Duration","GOTO","Fee","Color"))
> dat #show design plan
> 
> desD<-optFederov(~Intro+Duration+GOTO+Fee+Color
> +quad(Intro)+quad(Duration)+quad(GOTO)+quad(Fee)+Intro*Duration
> +Intro*GOTO+Intro*Fee+Intro*Color+Duration*GOTO+Duration*Fee
> +Duration*Color+GOTO*Fee+GOTO*Color
> +Fee*Color,dat,crit="D",maxIteration=1000,eval=TRUE)
> 
> #D
> desD$D
> 
> #design
> desD$design
> design<-desD$design
> 
> 
> 
> 
> > Subject: Re: [R] DOE in R?
> > From: c...@datanalytics.com
> > To: b_mi...@live.com
> > CC: r-help@r-project.org
> > Date: Sun, 6 Sep 2009 14:57:36 +0200
> > 
> > Hello,
> > 
> > This is your starting point:
> > 
> > http://cran.r-project.org/web/views/ExperimentalDesign.html
> > 
> > Best regards,
> > 
> > Carlos J. Gil Bellosta
> > http://www.datanalytics.com
> > 
> > 
> > On Thu, 2009-09-03 at 17:38 -0700, B_miner wrote:
> > > Hello!
> > > 
> > > 
> > > This is not a topic I am well versed in but required to become
> well versed
> > > in...I welcome any assistance!
> > > 
> > > Using R, I want to create an optimal design for an experiment.
> I'll be
> > > analyzing the results with logistic regression or some generalized
> linear
> > > model. I am thinking that the algdesign package can help (but no
> idea where
> > > to start?).
> > > 
> > > > I'm presenting an example here that I have seen the answer to (in SAS)
> > > > in order to make sure I would have gotten it *right*.
> > > 
> > > > There are 5 factors: 4 are quantitative with three levels each and 1 is
> > > > qualitative with two levels.
> > > 
> > > Factor and levels:
> > > 
> > > Intro: 0, 1.99, 2.99
> > > Duration: 6, 9 ,12
> > > GOTO: 3.99, 4.99, 5.99
> > > Fee: 0, 15, 45
> > > Color: Red, White
> > > 
> > > 
> > > > In order to screen these factors, I would want to get a design where I
> > > > could evaluate all main effects, all first order interactions and the
> > > > squared terms of Intro, Duration, GOTO and FEE (for example Intro*Intro).
> > > 
> > > Looking for the D-optimal design.
> > > 
> > > Is this something that R can provide?
> > > 
> > > 
> > > > According to the SAS paper I read, the design points are the following:
> > > > Obs  intro  duration  goto  fee  color
> > > >   1   0.00         6  3.99    0  WHITE
> > > >   2   0.00         6  3.99   45  RED
> > > >   3   0.00         6  5.99    0  RED
> > > >   4   0.00         6  5.99   45  WHITE
> > > >   5   0.00         9  3.99   45  RED
> > > >   6   0.00         9  4.99   15  WHITE
> > > >   7   0.00         9  5.99    0  RED
> > > >   8   0.00        12  3.99    0  RED
> > > >   9   0.00        12  3.99   45  WHITE
> > > >  10   0.00        12  5.99    0  WHITE
> > > >  11   0.00        12  5.99   45  RED
> > > >  12   0.00        12  5.99   45  WHITE
> > > >  13   1.99         6  3.99   15  RED
> > > >  14   1.99         6  4.99   45  WHITE
> > > 15

Re: [R] R Memory Usage Concerns

2009-09-15 Thread Carlos J. Gil Bellosta
Hello,

I do not know whether my package "colbycol" may help you. It can help
you read files that would not otherwise fit into memory.
Internally, as the name indicates, data is read into R column by
column.

IO times increase but you need just a fraction of "intermediate memory"
to read the files.
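
The column-by-column idea can be sketched in base R. This is not colbycol's interface or its actual implementation: read_by_column is a hypothetical helper that relies only on read.table's documented convention that a colClasses entry of "NULL" skips that column. The file is scanned once per column, which is where the extra IO time comes from.

```r
# Hypothetical helper (not part of colbycol): read a delimited file
# one column at a time, then assemble the columns into a data frame.
read_by_column <- function(path, sep = "\t") {
  # Read the header plus a single row, just to learn the column names.
  header <- names(read.table(path, sep = sep, header = TRUE, nrows = 1,
                             check.names = FALSE))

  columns <- lapply(seq_along(header), function(i) {
    # "NULL" makes read.table skip a column; NA lets it guess the type.
    classes <- rep("NULL", length(header))
    classes[i] <- NA
    read.table(path, sep = sep, header = TRUE, colClasses = classes,
               check.names = FALSE, stringsAsFactors = FALSE)[[1]]
  })

  names(columns) <- header
  data.frame(columns, stringsAsFactors = FALSE, check.names = FALSE)
}

# Usage on a small made-up file.
demo_path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:3, x = c(0.5, 1.5, 2.5), label = c("a", "b", "c")),
            demo_path, sep = "\t", row.names = FALSE, quote = FALSE)
demo <- read_by_column(demo_path)
```

Only one column is parsed per pass, so the working set during parsing should stay closer to a single column than to the whole table, at the cost of re-reading the file for each column.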

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


On Tue, 2009-09-15 at 00:10 -0700, Evan Klitzke wrote:
> On Mon, Sep 14, 2009 at 10:01 PM, Henrik Bengtsson wrote:
> > As already suggested, you're (much) better off if you specify colClasses, e.g.
> >
> > tab <- read.table("~/20090708.tab", colClasses=c("factor", "double", "double"));
> >
> > Otherwise, R has to load all the data, make a best guess of the column
> > classes, and then coerce (which requires a copy).
> 
> Thanks Henrik, I tried this as well as a variant that another user
> sent me privately. When I tell R the colClasses, it does a much better
> job of allocating memory (ending up with 96M of RSS memory, which
> isn't great but is definitely acceptable).
> 
> A couple of notes I made from testing some variants, if anyone else is
> interested:
>  * giving it an nrows argument doesn't help it allocate less memory
> (just a guess, but maybe because it's trying the powers-of-two
> allocation strategy in both cases)
>  * there's no difference in memory usage between telling it a column
> is "numeric" vs "double"
>  * when telling it the types in advance, loading the table is much, much faster
> 
> Maybe if I gather some more fortitude in the future, I'll poke around
> at the internals, since I'm still curious where the extra memory is
> going. Is that just the overhead of allocating a full object for each
> value (i.e. rather than just a double[] or whatever)?
>
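
On the last question above, a small check suggests the answer is no: a numeric column is held as one contiguous vector of doubles, not as one object per value. The sketch below is only an illustration; the file is a made-up stand-in for the one discussed in the thread, and the variable names are mine. It declares colClasses up front as suggested above and then looks at how the columns are stored.

```r
# Made-up tab-separated file: one factor-like column, two numeric columns.
path <- tempfile(fileext = ".tab")
n <- 1000
write.table(
  data.frame(key = sample(letters, n, replace = TRUE), a = runif(n), b = runif(n)),
  path, sep = "\t", row.names = FALSE, col.names = FALSE, quote = FALSE
)

# Declaring the classes up front means read.table does not have to guess
# the column types and then convert.
tab <- read.table(path, sep = "\t",
                  colClasses = c("factor", "numeric", "numeric"))

# Underlying storage of each column: the factor is integer codes,
# the numeric columns are plain double vectors.
storage_modes <- vapply(tab, typeof, character(1))

# A double vector costs about 8 bytes per element plus a small fixed header.
x <- numeric(1e6)
bytes_per_element <- as.numeric(object.size(x)) / length(x)
```

So whatever extra memory remains after loading is more likely to come from temporary copies made during parsing than from per-value overhead in the finished data frame.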
