date:20110704

Re: [R] Prevent 'R CMD check' from reporting "NA"/"NA_character_" missmatch?

2011-07-04 Thread Johannes Graumann

Prof Brian Ripley wrote:

> On Mon, 4 Jul 2011, Johannes Graumann wrote:
> 
>> Hello,
>>
>> I'm writing a package and am running 'R CMD check' on it.
>>
>> Is there any way to make 'R CMD check' not warn about a mismatch between
>> 'NA_character_' (in the function definition) and 'NA' (in the
>> documentation)?
> 
> Be consistent.  Why do you want incorrect documentation of your
> package?  (The circumstances here are not clear: normally 1 vs 1L
> and similar are not reported if they are the only errors.)
> 
> And please do note the posting guide
> 
> - this is not really the correct list
> - you were asked to give an actual example with output.
> 

Taken to R-devel. Thanks. Joh
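
For context, a minimal sketch of why the two spellings are not interchangeable: NA is a logical constant while NA_character_ is a character one, so a default written one way in the code and the other way in the documentation differs when deparsed. The function f below is hypothetical, and how 'R CMD check' compares the two internally is an assumption here, not something stated in the thread.

```r
# NA and NA_character_ are different constants with different types.
type_plain <- typeof(NA)             # logical
type_chr   <- typeof(NA_character_)  # character
same_value <- identical(NA, NA_character_)

# Hypothetical function whose default is the character NA.
f <- function(x = NA_character_) x

# The default deparses with its full name, so documentation that shows
# 'x = NA' does not match the definition literally.
default_text <- deparse(formals(f)$x)
print(c(type_plain, type_chr, default_text))
print(same_value)
```

Writing the same constant in both the function definition and the usage section of the help file keeps the two in agreement.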

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Create factor variable by groups

2011-07-04 Thread Mateus Rabello

Hi, suppose that I have the following data.frame:

  cnae4 cnpj 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 Y 
  24996 10020470 1 1 2 12 16 21 17 51 43 19 183 
  24996 10020470 69 91 79 92 91 77 90 96 98 108 891 
  36145 10020470 0 0 0 0 2 83 112 97 91 144 529 
  4 1002 5 20 60 0 0 0 0 5 20 1000 1110 


I would like to create a new variable X that indicates which line, within each 
value of the cnpj variable, has the highest value of Y. For instance, within 
cnpj = 10020470, the second line has the largest value of Y (891). For cnpj = 
1002 it is trivial (1110). Then, my new data.frame would become:

  cnae4 cnpj 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 Y X 
  24996 10020470 1 1 2 12 16 21 17 51 43 19 183 FALSE 
  24996 10020470 69 91 79 92 91 77 90 96 98 108 891 TRUE 
  36145 10020470 0 0 0 0 2 83 112 97 91 144 529 FALSE 
  4 1002 5 20 60 0 0 0 0 5 20 1000 1110 TRUE 


Notice that for every value of the variable cnpj, only one line will have X = 
TRUE. 

Then, I would like to create a variable Z that is the sum of variable Y, also 
by variable cnpj. Thus, if cnpj = 10020470, Z = 183 + 891 + 529 = 1603, and for 
cnpj = 1002, Z = 1110. These sums can easily be done with tapply or aggregate, 
but those would collapse lines with equal cnpj and I don't want that. I would 
like to achieve a data.frame like the following:

  cnae4 cnpj 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 Y X Z 
  24996 10020470 1 1 2 12 16 21 17 51 43 19 183 FALSE 1603 
  24996 10020470 69 91 79 92 91 77 90 96 98 108 891 TRUE 1603 
  36145 10020470 0 0 0 0 2 83 112 97 91 144 529 FALSE 1603 
  4 1002 5 20 60 0 0 0 0 5 20 1000 1110 TRUE 1110 


In the end I will eliminate all lines with X = FALSE. 


Thank you and sorry for the long question.

Mateus Rabello
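
One possible approach, sketched under the assumption that the data frame is called df (a hypothetical name) and with the year columns omitted for brevity: ave() returns a value for every row rather than one per group, so no lines are collapsed.

```r
# Reduced copy of the example data; the year columns are left out.
df <- data.frame(
  cnae4 = c(24996, 24996, 36145, 4),
  cnpj  = c(10020470, 10020470, 10020470, 1002),
  Y     = c(183, 891, 529, 1110)
)

# Z: group total of Y, repeated on every row of the group.
df$Z <- ave(df$Y, df$cnpj, FUN = sum)

# X: TRUE only for the first row holding the group maximum of Y,
# so each cnpj gets exactly one TRUE even when there are ties.
df$X <- as.logical(ave(df$Y, df$cnpj,
                       FUN = function(y) seq_along(y) == which.max(y)))

# Keep only the rows flagged TRUE.
result <- df[df$X, ]
print(result)
```

Because Z is computed before the filtering step, the group totals survive on the rows that are kept.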

Re: [R] loop in optim

2011-07-04 Thread EdBo

Hi

I have reworked my likelihood function and it is now working (the code
is below).

Could you help me correct my loop?

I want optim to estimate al_j, au_j, sigma_j and b_j separately for data
points 0 to 20, 21 to 40 and 41 to 60.

The final result should have 4 columns, one for each of the estimates, and 4
rows, one for each block of data points (0 to 20, 21 to 40, 41 to 60).

#likelihood function
a=read.table("D:/hope.txt",header=T)
attach(a)
a
llik = function(x)
{
  al_j=x[1]; au_j=x[2]; sigma_j=x[3]; b_j=x[4]
  # probability mass between the two thresholds, used when R_j == 0
  p_mid = pnorm(au_j, mean=b_j*a$R_m, sd=sqrt(sigma_j^2)) -
          pnorm(al_j, mean=b_j*a$R_m, sd=sqrt(sigma_j^2))
  sum(na.rm=T,
      ifelse(a$R_j < 0,
             log(1/(2*pi*(sigma_j^2))) -
               (1/(2*(sigma_j^2))*(a$R_j+al_j-b_j*a$R_m))^2,
      ifelse(a$R_j > 0,
             log(1/(2*pi*(sigma_j^2))) -
               (1/(2*(sigma_j^2))*(a$R_j+au_j-b_j*a$R_m))^2,
             log(ifelse(p_mid > 0, p_mid, 1))))
  )
}
start.par = c(-0.01,0.01,0.1,1) 
out1 = optim(llik, par=start.par, method="Nelder-Mead")
out1
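
A rough sketch of the looping part only: the objective takes the data block as an extra argument, optim() is run once per block of 20 rows, and the parameter vectors are stacked into a matrix with one row per block. The simulated data and the plain normal likelihood negll are stand-ins assumed for illustration; they are not the model in llik above.

```r
set.seed(42)
# Simulated stand-in for the data: 60 rows, i.e. three blocks of 20.
a <- data.frame(R_m = rnorm(60, sd = 0.01))
a$R_j <- 1.2 * a$R_m + rnorm(60, sd = 0.005)

# Stand-in objective: negative log-likelihood of R_j ~ N(b * R_m, sigma^2).
# 'd' is the block of rows to use; sigma is kept positive via exp().
negll <- function(par, d) {
  b <- par[1]
  sigma <- exp(par[2])
  -sum(dnorm(d$R_j, mean = b * d$R_m, sd = sigma, log = TRUE))
}

n <- 20
runs <- nrow(a) %/% n
out <- matrix(NA_real_, nrow = runs, ncol = 2,
              dimnames = list(NULL, c("b", "log_sigma")))

for (i in seq_len(runs)) {
  rows <- (n * (i - 1) + 1):(n * i)   # rows 1-20, 21-40, 41-60
  fit <- optim(par = c(1, log(0.01)), fn = negll, d = a[rows, ],
               method = "Nelder-Mead")
  out[i, ] <- fit$par                 # one row of estimates per block
}
print(out)
```

The point of the sketch is that the data are subset inside the loop and optim() is called again for each block; slicing the result of a single optim() call does not give per-block estimates.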

--
View this message in context: 
http://r.789695.n4.nabble.com/loop-in-optim-tp3643230p3645031.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Protecting R code

2011-07-04 Thread Duncan Murdoch


On 11-07-04 3:47 AM, Vaishali Sadaphal wrote:

> Hi All,
>
> I need to give my R code to my client to use. I would like to protect the
> logic/algorithms that have been coded in R. This means that I would not
> like anyone to be able to read the code.


R is an open source project, so providing ways for you to do this is not 
one of our goals.


If I were your client I would have asked for the source code for 
whatever you're doing; if your client isn't savvy enough to do that, you 
should provide it and explain why it is useful, and what your client is 
and isn't allowed to do with it.


If you think your client will steal from you, then you should find 
another client.


Duncan Murdoch



> I am searching for ways to protect R code. I would like to create a .exe
> kind of file which could be executed without using R or requiring R to
> be installed. I would not like the R code to be loaded in R. This is
> because, after R loads a function, if you type the function name at the
> command prompt, you can see the complete code. I would not like to give
> this type of access to the R code.
>
> I explored the option of creating a .bat file (using command: R CMD BAT) and
> byte code (using command: compile). These are not useful since they open
> R, load these functions and then the R code is visible.
>
> Is there any other way to protect the R code which would help me package
> all my files/source files and give me an executable file which would be
> run without opening R? Another problem is that R is freely downloadable.
> Is it somehow possible to protect the code from being loaded in R and
> being seen?
>
> Thanks
> --
> Vaishali



Re: [R] Stuck ...can't get sapply and xmlTreeParse working

2011-07-04 Thread jim holtman

Probably this is what you want; convert the first column of 'new.add'
to character and then use in the sapply.

Now it seems to work in that data is read in, but the new error is
that "f" is not defined.  What is it supposed to be?

> x <- as.character(new.add[[1]])
> z <- sapply(x, hm)
Error in f$zpid <- sapply(getNodeSet(zdoc, "//result/zpid"), xmlValue) :
  object 'f' not found

Enter a frame number, or 0 to exit

1: sapply(x, hm)
2: lapply(X, FUN, ...)
3: FUN(c("10+PACER+LN&citystatezip=East+Norriton%2C+PA",
"141+ROSEMONT+AVE&citystatezip=Norristown%2C+PA", "6



On Mon, Jul 4, 2011 at 8:42 PM, eric  wrote:
> Can't seem to get the code below working. It gets stuck on line 24 inside the
> function hm; comments show the line in question. The function hm is called
> by sapply and is at the bottom of the code. Other stuff above line 24 works
> correctly including the first couple of lines of the function hm. Should I
> be using a different apply function or am I doing something wrong with
> xmlTreeParse ?
>
>
> library(XML)
> url.montco <-
> "http://webapp.montcopa.org/sherreal/salelist.asp?saledate=07/27/2011";
> tbl <-data.frame(readHTMLTable(url.montco))[, c(3,5,6,8,9)]
> tbl <-tbl[2: length(tbl[,1]),]
> names(tbl) <- c("Address", "Township", "Parcel", "SaleDate", "Costs");
> rownames(tbl) <- NULL
> v <- gregexpr("( aka )|( AKA )",tbl$Address)
> s <-sapply(v, function(x) max(unlist(x)))
> tbl$Address <- substring(tbl$Address, ifelse(s== -1, 0, s+4), 1)
> tbl$Cost <- gsub(',', '', tbl$Costs)
> temp <- strsplit(tbl$Cost, "\\$")
> temp <- do.call(rbind, temp)  # create a matrix
> mode(temp) <- 'numeric'
> tbl$Debt <- round(temp[, 2]/1000,2)
> tbl$Court <- round(temp[, 3]/1000,2)
> z <- data.frame(substr(tbl$SaleDate,regexpr("[A-Za-z]", tbl$SaleDate),
> regexpr("[0-9]", tbl$SaleDate,)-1)) ; names(z) <- "Action"
> y <- data.frame(substr(tbl$SaleDate,regexpr("[0-9]", tbl$SaleDate),2011)) ;
> names(y) <- "ActionDate"
> tbl <-cbind(tbl[, c(1,2,3,7,8)],z,y)
> new.add <- paste(tbl$Address,"&citystatezip=",tbl$Township,"%2C+PA", sep='')
> new.add <- sub("^( )+","", new.add)
> new.add <-data.frame(gsub("( )+",'+', new.add)); names(new.add) <-
> "ParseAddress"
> hm <- function(x) {
>  url.zill
> <-paste("http://www.zillow.com/webservice/GetDeepSearchResults.htm?zws-id=X1-ZWz1bup03e49vv_5kvb6&address=",x,
> sep="")
>  ## problem line is next #
>  zdoc <-xmlTreeParse(url.zill, useInternalNode=TRUE, isURL=TRUE)
>  # problem line above  ##
>  f$zpid <- sapply(getNodeSet(zdoc, "//result/zpid"), xmlValue)
>  f$zest.low <-sapply(getNodeSet(zdoc, "//valuationRange/low"), xmlValue)
>  f$zest <- sapply(getNodeSet(zdoc, "//zestimate/amount"), xmlValue)
>  rm(zdoc)
>  return(f)
> }
> j <-sapply(new.add, FUN=hm)
> print(zest)
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Stuck-can-t-get-sapply-and-xmlTreeParse-working-tp3644894p3644894.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


[R] Fw: volcano plot.r

2011-07-04 Thread Ungku Akashah

Hello.

My name is Akashah. I work at a metabolic laboratory. From my study, I found that 
a volcano plot can help a lot in my section. 
I have already studied the volcano plot and obtained the code to run in R. 
Unfortunately, there may be something wrong with the code, because no graph 
appears, yet no error (blue text) is shown on the R console. Below is the code 
for the volcano plot; I hope somebody can help me solve the problem.





#    volcano_plot.r
#
#    Author:    Amsha Nahid, Jairus Bowne, Gerard Murray
#    Purpose:    Produces a volcano plot
#
#    Input:    Data matrix as specified in Data-matrix-format.pdf
#    Output:    Plots log2(fold change) vs log10(t-test P-value)
#
#    Notes:    Group value for control must be alphanumerically first
#              Script will return an error if there are more than 2 groups

#
#    Load the data matrix
#
# Read in the .csv file
data<-read.csv("file:///Users/nadya/Desktop/praktikal UTM/TASKS1/RT BE EMS 300-399.csv", sep=",", row.names=1, header=TRUE)
# Get groups information
groups<-data[,1]
# Get levels for groups
grp_levs<-levels(groups)
if (length(levels(groups)) > 2)
    print("Number of groups is greater than 2!") else {

    #
    #    Split the matrix by group
    #
    new_mats<-c()
    for (ii in 1:length(grp_levs))
        new_mats[ii]<-list(data[which(groups==levels(groups)[ii]),])
    
    #
    #    Calculate the means
    #
    # For each matrix, calculate the averages per column
    submeans<-c()
    # Preallocate a matrix for the means
    means<-matrix(
        nrow = 2,
        ncol = length(colnames(data[,-1])),
        dimnames = list(grp_levs,colnames(data[,-1]))
        )
    # Calculate the means for each variable per sample
    for (ii in 1:length(new_mats))
        {submeans[ii]<-list(apply(new_mats[[ii]][,-1],2,mean,na.rm=TRUE))
        means[ii,]<-submeans[[ii]]}
    
    #
    #    Calculate the fold change
    #
    folds<-matrix(
        nrow=length(means[,1]),
        ncol=length(means[1,]),
        dimnames=list(rownames(means),colnames(means))
        )
    for (ii in 1:length(means[,1]))
        for (jj in 1:length(means[1,]))
            folds[ii,jj]<-means[ii,jj]/means[1,jj]
    
    #
    #    t-test P value data
    #
    
    pvals<-matrix(nrow=ncol(data[,-1]),ncol=1,dimnames=list(colnames(data[-1]),"P-Value"))
    
    #
    #    Perform the t-Test
    #
    for(ii in 1:nrow(pvals)) {
        pvals[ii,1]<-t.test(new_mats[[1]][,ii+1],new_mats[[2]][,ii+1])$p.value
        }
    
    m<-length(pvals)
    x_range<-range(c(
        min(
            range(log2(folds[2,])),
            range(c(-1.5,1.5))
            ),
        max(
            range(log2(folds[2,])),
            range(c(-1.5,1.5))
            )
        ))
    y_range<-range(c(
        min(range(-log10(pvals)),
            range(c(0,2))
            ),
        max(range(-log10(pvals)),
            range(c(0,2))
            )
        ))
    
    #
    #    Plot data
    #
    # Define a function, since it's rather involved
    volcano_plot<-function(fold, pval)
        {plot(x_range,                                 # x-dim 
            y_range,                                   # y-dim
            type="n",                                  # empty plot
            xlab="log2 Fold Change",                   # x-axis title
            ylab="-log10 t-Test P-value",              # y-axis title
            main="Volcano Plot",                       # plot title
            )
            abline(h=-log10(0.05),col="green",lty="44")# horizontal line at P=0.05
            abline(v=c(-1,1),col="violet",lty="1343")  # vertical lines at 2-fold
            # Plot points based on their values:
            for (ii in 1:m)
                # If it's below 0.05, we're not overly interested: purple.
                if (-log10(pvals[ii])>(-log10(0.05))) {
                    # Otherwise, more checks;
                    # if it's greater than 2-fold decrease: blue
                    if (log2(folds[2,][ii])>(-1)) {
                        # If it's significant but didn't change much: orange
                        if (log2(folds[2,][ii])<1) {
                            points(
                                log2(folds[2,][ii]),
                                -log10(pvals[ii]),
                                col="orange",
                                pch=20
                                )
                            # Otherwise, greater than 2-fold increase: red
                            } else {
                                points(
                                    log2(folds[2,][ii]), 
                                    -log10(pvals[ii]),
                                    col="red",
                                    pch=20
                                    )
                            }
                        } else {
                            points(
                                log2(folds[2,][ii]),

Re: [R] Stuck ...can't get sapply and xmlTreeParse working

2011-07-04 Thread jim holtman

The value of 'url.zill' is a vector of 407 character strings:

Browse[1]> str(url.zill)
 chr [1:407] 
"http://www.zillow.com/webservice/GetDeepSearchResults.htm?zws-id=X1-ZWz1bup03e49vv_5kvb6&address=10+PACER+LN&citystatezip=East+";|
__truncated__ ...

Isn't it supposed to be just a single file name?
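
A small sketch of what is going on, with made-up addresses and a placeholder URL (both assumptions): sapply() over a data frame calls the function once per column and hands it the whole column, whereas sapply() over a character vector calls it once per address. The list f is also created inside the function before it is filled.

```r
# Two made-up addresses stored as a one-column data frame, as in the post.
new.add <- data.frame(ParseAddress = c("10+PACER+LN", "141+ROSEMONT+AVE"))

# Placeholder for the URL construction; no request is sent here.
make_url <- function(x) {
  paste("http://example.invalid/search?address=", x, sep = "")
}

# Over the data frame: one call, and x is the entire column.
urls_per_call_df <- sapply(new.add, function(x) length(make_url(x)))

# Over a character vector: one call per address, one URL each.
addresses <- as.character(new.add[[1]])
urls_per_call_vec <- sapply(addresses, function(x) length(make_url(x)))

# Results are collected in a list created inside the function.
hm_sketch <- function(x) {
  f <- list()
  f$url <- make_url(x)
  f
}
res <- lapply(addresses, hm_sketch)
print(urls_per_call_df)
print(urls_per_call_vec)
```

In the data-frame case the function builds every URL in a single call, which matches the vector of 407 strings seen in the browser.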




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


[R] Bad Confirmation String

2011-07-04 Thread Matilda E. Gogos

I just signed up for R-help 2 days ago, and received an *Invalid
confirmation string:*2beb5b0883f29fae71a80fcb30324117ec9ece94 when
confirming subscription. I am receiving e-mails in my inbox with about 40%
of individual e-mails saying across the top "this message may not have been
sent by: n...@somee-mail.com." Is this cause for concern?  Should these
messages be reported as phishing, and should I mark messages received in my
inbox with the message "this message may not have been sent by. . ." as
spam?


-- 
Matilda Gogos
matildaelizabe...@gmail.com


[R] Seasonality of time series

2011-07-04 Thread Martin B.

Dear all, 

I have a time series of 10-day tropical rainfall data with a typical rainy
and dry season. Is there a way to extract seasonal information with R, like
the day of the start and end of each rainy season for each year? 


Martin Brandt 
University of Vienna

--
View this message in context: 
http://r.789695.n4.nabble.com/Seasonality-of-time-series-tp3644985p3644985.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] [R-SIG-Finance] FinCenter in timeSeries with "merge", "cbind" and "rbind"

2011-07-04 Thread David Winsemius



On Jul 4, 2011, at 6:47 PM, Kenneth Roy Cabrera Torres wrote:


> Hi R users:
>
> When I try to merge or bind (cbind or rbind) two series,
> both with a "FinCenter" other than GMT, the
> result is "GMT", not the original financial center.


It's not in the help(cbind.timeSeries) page but looking at the  
function  (and at the documentation for the timeSeries class) you see  
that there is a "zone" argument and that it is "GMT" by default. So  
why don't you add something meaningful to your code (... that is not  
presented in a reproducible manner for testing.)


It is listed in the documentation as zone="", but in the cbind  
function call, the code is zone="GMT". It appears that the "zone"  
argument might be for input and the FinCenter might be for output in  
the help page for timeDate, but I think that aspect of the various  
parts of the documentation is rather vague and might do with a bit of  
clarification.


--

David.


> What am I doing wrong?
>
> ##
> require(timeSeries)
>
> getRmetricsOptions("myFinCenter")
> setRmetricsOptions(myFinCenter = "America/Bogota")
> getRmetricsOptions("myFinCenter")
>
> fechas <- format(timeCalendar(2010, sample(12, 6)))
> datos <- matrix(round(rnorm(6), 3))
> t1 <- sort(timeSeries(datos, fechas, units = "A"))
> t1
>
> fechas <- format(timeCalendar(2010, sample(12, 6)))
> datos <- matrix(round(rnorm(6), 3))
> t2 <- sort(timeSeries(datos, fechas, units = "B"))
> t2
>
> merge(t1,t2)
> cbind(t1,t2)
> rbind(t1,t2)
> ##
>
> Thank you for your help.
>
> Kenneth



David Winsemius, MD
West Hartford, CT


[R] Stuck ...can't get sapply and xmlTreeParse working

2011-07-04 Thread eric

Can't seem to get the code below working. It gets stuck on line 24 inside the
function hm; comments show the line in question. The function hm is called
by sapply and is at the bottom of the code. Other stuff above line 24 works
correctly including the first couple of lines of the function hm. Should I
be using a different apply function or am I doing something wrong with
xmlTreeParse ? 


library(XML)
url.montco <-
"http://webapp.montcopa.org/sherreal/salelist.asp?saledate=07/27/2011";
tbl <-data.frame(readHTMLTable(url.montco))[, c(3,5,6,8,9)]
tbl <-tbl[2: length(tbl[,1]),]
names(tbl) <- c("Address", "Township", "Parcel", "SaleDate", "Costs");
rownames(tbl) <- NULL
v <- gregexpr("( aka )|( AKA )",tbl$Address)
s <-sapply(v, function(x) max(unlist(x)))
tbl$Address <- substring(tbl$Address, ifelse(s== -1, 0, s+4), 1)
tbl$Cost <- gsub(',', '', tbl$Costs) 
temp <- strsplit(tbl$Cost, "\\$")  
temp <- do.call(rbind, temp)  # create a matrix
mode(temp) <- 'numeric'
tbl$Debt <- round(temp[, 2]/1000,2) 
tbl$Court <- round(temp[, 3]/1000,2)
z <- data.frame(substr(tbl$SaleDate,regexpr("[A-Za-z]", tbl$SaleDate),
regexpr("[0-9]", tbl$SaleDate,)-1)) ; names(z) <- "Action"
y <- data.frame(substr(tbl$SaleDate,regexpr("[0-9]", tbl$SaleDate),2011)) ;
names(y) <- "ActionDate"
tbl <-cbind(tbl[, c(1,2,3,7,8)],z,y)
new.add <- paste(tbl$Address,"&citystatezip=",tbl$Township,"%2C+PA", sep='')
new.add <- sub("^( )+","", new.add)
new.add <-data.frame(gsub("( )+",'+', new.add)); names(new.add) <-
"ParseAddress"
hm <- function(x) {
  url.zill
<-paste("http://www.zillow.com/webservice/GetDeepSearchResults.htm?zws-id=X1-ZWz1bup03e49vv_5kvb6&address=",x,
sep="")
  ## problem line is next #
  zdoc <-xmlTreeParse(url.zill, useInternalNode=TRUE, isURL=TRUE)
  # problem line above  ##
  f$zpid <- sapply(getNodeSet(zdoc, "//result/zpid"), xmlValue)
  f$zest.low <-sapply(getNodeSet(zdoc, "//valuationRange/low"), xmlValue)
  f$zest <- sapply(getNodeSet(zdoc, "//zestimate/amount"), xmlValue)
  rm(zdoc)
  return(f)
}
j <-sapply(new.add, FUN=hm)
print(zest)

--
View this message in context: 
http://r.789695.n4.nabble.com/Stuck-can-t-get-sapply-and-xmlTreeParse-working-tp3644894p3644894.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] "Low Pain" Unicode Characters in pdf graph?

2011-07-04 Thread Liviu Andronic

On Sun, May 15, 2011 at 3:06 PM, ivo welch  wrote:
> Dear R-experts---is there a relatively low-pain way to get unicode
> characters into a plot to a pdf device?
>
Have you tried the Cairo package or cairo_pdf()? Both make use of
Cairo, which uses UTF-8 and automatically embeds fonts.

Regards
Liviu
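
A minimal sketch of the cairo_pdf() route, assuming R was built with cairo support (checked with capabilities()) and that an installed font covers the characters used; the file name is arbitrary.

```r
out_file <- file.path(tempdir(), "unicode-plot.pdf")

if (capabilities("cairo")) {
  cairo_pdf(out_file, width = 5, height = 4)
  # Unicode escapes: Greek alpha, beta and a right arrow.
  plot(1:10, main = "\u03b1 and \u03b2 \u2192 UTF-8 title")
  dev.off()
}
```

Whether a given glyph actually shows up depends on the fonts available on the machine, so it is worth opening the resulting file to check.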


[R] FinCenter in timeSeries with "merge", "cbind" and "rbind"

2011-07-04 Thread Kenneth Roy Cabrera Torres

Hi R users:

When I try to merge or bind (cbind or rbind) two series,
both with a "FinCenter" other than GMT, the
result is "GMT", not the original financial center.

What am I doing wrong?

##
require(timeSeries)

getRmetricsOptions("myFinCenter")
setRmetricsOptions(myFinCenter = "America/Bogota")
getRmetricsOptions("myFinCenter")

fechas <- format(timeCalendar(2010, sample(12, 6)))
datos <- matrix(round(rnorm(6), 3))
t1 <- sort(timeSeries(datos, fechas, units = "A"))
t1
 
fechas <- format(timeCalendar(2010, sample(12, 6)))
datos <- matrix(round(rnorm(6), 3))
t2 <- sort(timeSeries(datos, fechas, units = "B"))
t2

merge(t1,t2)
cbind(t1,t2)
rbind(t1,t2)
##

Thank you for your help.

Kenneth


Re: [R] modification of cross-validations in rpart

2011-07-04 Thread Weidong Gu

One way around hacking rpart is to write code that draws the K-fold samples
by sampling unit outside rpart, then builds trees on the training sets and
summarizes scores on the test sets.

Weidong Gu
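
That suggestion could look roughly like the following, assuming a hypothetical data frame d with a fish identifier column and simulated values: folds are assigned to whole fish, so all observations of a fish end up on the same side of each split.

```r
library(rpart)

set.seed(1)
n_fish <- 20   # sampling units
n_obs  <- 10   # repeated observations per fish
d <- data.frame(
  fish = rep(seq_len(n_fish), each = n_obs),
  X1 = rnorm(n_fish * n_obs),
  X2 = rnorm(n_fish * n_obs),
  X3 = rnorm(n_fish * n_obs)
)
# Response with a per-fish effect, so observations within a fish are correlated.
d$Y <- 2 * (d$X1 > 0) + rnorm(n_fish)[d$fish] + rnorm(nrow(d), sd = 0.5)

# Assign each fish (not each row) to one of K folds.
K <- 5
fold_of_fish <- sample(rep(seq_len(K), length.out = n_fish))
d$fold <- fold_of_fish[d$fish]

# Fit on K-1 folds, score on the held-out fish; xval = 0 turns off
# rpart's own row-wise cross-validation.
cv_mse <- sapply(seq_len(K), function(k) {
  train <- d[d$fold != k, ]
  test  <- d[d$fold == k, ]
  fit <- rpart(Y ~ X1 + X2 + X3, data = train, xval = 0)
  mean((test$Y - predict(fit, newdata = test))^2)
})
print(cv_mse)
```

Repeating the loop over a grid of cp values would give a unit-wise counterpart of the cross-validated error that rpart reports, which could then guide the choice of tree size.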

On Mon, Jul 4, 2011 at 9:22 AM, Katerine Goyer  wrote:
> Hello,
>
> I am using the rpart function (from the rpart package) to do a regression
> tree that would describe the behaviour of a fish species according to
> several environmental variables. For each fish (sampling unit), I have
> repeated observations of the response variable, which means that the data
> are not independent. Normally, in this case, V-fold cross-validation needs
> to be modified to prevent over-optimistic predictions of error rates by
> cross-validation and overestimation of the tree size. A way to overcome
> this problem is by selecting only whole sampling units in our subsets of
> cross-validation. My problem is that I don’t know how to perform this
> modification of the cross-validation process in the rpart function.
>
> Is there a way to do this modification in rpart or is there any other
> function I could use that would consider interdependence in the response
> variable?
>
> Here is an example of the code I am using (“Y” being the response variable
> and “data.env” being a data frame of the environmental variables):
>
> Tree = rpart(Y ~ X1 + X2 + X3,xval=100,data=data.env)
>
> Thanks
>
> Katerine

Re: [R] R CMD SHLIB with ifort

2011-07-04 Thread Jeff Newmiller

I don't have any idea about linking Fortran to R, but if I did I am sure I 
would want complete command lines and error messages before I looked at your 
problem.
---
Jeff Newmiller The . . Go Live...
DCN: Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Ted Rosenbaum  wrote:

Hi,
I am running Linux (64 bit) R, compiled with the GNU compilers. I am looking
to compile a Fortran program with ifort (the Intel compiler) and then be
able to import that as a library into my version of R. I am using the
flags FC=ifort and SHLIB_FCLD=ifort, however ifort does not seem to
recognize the libR.so file.

Does anyone have experience with this or know of a way to load an
ifort-compiled file into R?

Thanks!

_

Ted Rosenbaum
Graduate Student
Yale University
Department of Economics


[R] R CMD SHLIB with ifort

2011-07-04 Thread Ted Rosenbaum

Hi,
I am running Linux (64 bit) R, compiled with the GNU compilers.  I am looking
to compile a Fortran program with ifort (the Intel compiler) and then be
able to import that as a library into my version of R.  I am using the
flags  FC=ifort and SHLIB_FCLD=ifort, however ifort does not seem to
recognize the libR.so file.

Does anyone have experience with this or know of a way to load an
ifort-compiled file into R?

Thanks!

---
Ted Rosenbaum
Graduate Student
Yale University
Department of Economics


Re: [R] loop in optim

2011-07-04 Thread Joshua Wiley

Hi Edward,

At least for me, your llik() function returns Inf for the starting
values specified, so optim() never gets to estimate anything.  You
need to alter llik() or find starting parameters that work before
worrying about getting the for loop working.
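
To make that check concrete, here is a minimal sketch of the overall
pattern: confirm the objective is finite at the starting values, then call
optim() once per block of 20 rows and store each fit's parameters as one
row of the result. The data and the objective (negll, a plain Gaussian
regression likelihood) are made-up stand-ins chosen for illustration; they
are not your model.

```r
## Hypothetical toy data standing in for the data frame `a`
set.seed(1)
a <- data.frame(R_m = rnorm(60, sd = 0.02))
a$R_j <- 0.5 * a$R_m + rnorm(60, sd = 0.03)

## Stand-in objective (an assumption, NOT the model from the post):
## negative Gaussian log-likelihood of R_j = alpha + beta * R_m + error
negll <- function(par, dat) {
  sigma <- par[3]
  if (sigma <= 0) return(Inf)          # keep sigma in its valid range
  -sum(dnorm(dat$R_j, mean = par[1] + par[2] * dat$R_m,
             sd = sigma, log = TRUE))
}

start.par <- c(0, 1, 0.05)
n <- 20
runs <- nrow(a) %/% n                  # number of complete blocks
out <- matrix(NA_real_, nrow = runs, ncol = length(start.par),
              dimnames = list(NULL, c("alpha", "beta", "sigma")))

for (i in seq_len(runs)) {
  block <- a[(n * (i - 1) + 1):(n * i), ]
  ## optim() needs a finite objective at the starting values
  stopifnot(is.finite(negll(start.par, block)))
  fit <- optim(par = start.par, fn = negll, dat = block,
               method = "Nelder-Mead")
  out[i, ] <- fit$par                  # one row of estimates per block
}
out
```

The point of the design is that optim() is called inside the loop on each
subset, rather than indexing into a single optim() result afterwards.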

Cheers,

Josh

On Mon, Jul 4, 2011 at 2:34 AM, EdBo  wrote:
> Hi
>
> May you help me correct my loop function.
>
> I want optim to estimates al_j; au_j; sigma_j;  b_j by looking at 0 to 20,
> 21 to 40, 41 to 60 data points.
>
> The final result should have 4 columns of each of the estimates AND 4 rows
> of each of 0 to 20, 21 to 40, 41 to 60.
>
> ###MY code is
>
> n=20
> runs=4
> out=matrix(0,nrow=runs)
>
> llik = function(x)
>   {
>    al_j=x[1]; au_j=x[2]; sigma_j=x[3];  b_j=x[4]
>    sum(na.rm=T,
>        ifelse(a$R_j< 0, -log(1/(2*pi*(sigma_j^2)))-
>                           (1/(2*(sigma_j^2))*(a$R_j+al_j-b_j*a$R_m))^2,
>         ifelse(a$R_j>0 , -log(1/(2*pi*(sigma_j^2)))-
>                           (1/(2*(sigma_j^2))*(a$R_j+au_j-b_j*a$R_m))^2,
>
> -log(pnorm(au_j,mean=b_j*a$R_m,sd=sqrt(sigma_j^2))-
>                           pnorm(au_j,mean=b_j*a$R_m,sd=sqrt(sigma_j^2)
>
>       )
>
>   }
>
> start.par = c(0, 0, 0.01, 1)
> out1 = optim(llik, par=start.par, method="Nelder-Mead")
>
>
> for (i in 1: runs)
> {
>  index_start=20*(i-1)+1
>  index_end= 20*i
>  out[i]=out1[index_start:index_end]
> }
> out
>
>
> Thank you in advance
>
> Edward
> UCT
> My data
>
> R_j             R_m
> -0.0625         0.002320654
> 0               -0.004642807
> 0.0     0.005936332
> 0.032258065     0.001060848
> 0               0.007114057
> 0.015625        0.005581558
> 0               0.002974794
> 0.015384615     0.004215271
> 0.060606061     0.005073116
> 0.028571429     -0.006001279
> 0               -0.002789594
> 0.01389     0.00770633
> 0               0.000371663
> 0.02739726      -0.004224228
> -0.04           0.008362539
> 0               -0.010951605
> 0               0.004682924
> 0.01389     0.011839993
> -0.01369863     0.004210383
> -0.02778    -0.04658949
> 0               0.00987272
> -0.057142857    -0.062203157
> -0.03030303     -0.119177639
> 0.09375         0.077054642
> 0               -0.022763619
> -0.057142857    0.050408775
> 0               0.024706076
> -0.03030303     0.004043701
> 0.0625          0.004951088
> 0               -0.005968731
> 0               -0.038292548
> 0               0.013381097
> 0.014705882     0.006424728
> -0.014492754    -0.020115626
> 0               -0.004837891
> -0.029411765    -0.022054654
> 0.03030303      0.008936428
> 0.044117647     8.16925E-05
> 0               -0.004827246
> -0.042253521    0.004653096
> -0.014705882    -0.004222151
> 0.029850746     0.000107267
> -0.028985507    -0.001783206
> 0.029850746     -0.006372981
> 0.014492754     0.005492374
> -0.028571429    -0.009005846
> 0               0.001031683
> 0.044117647     0.002800551
>
>
>
>
>
>
>
>
>
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/loop-in-optim-tp3643230p3643230.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] placing multiple rows in a single row

2011-07-04 Thread David Winsemius

On Jul 4, 2011, at 2:32 PM, Annemarie Verkerk wrote:

Dear people from the R help list,

I have a question that I can't get my head around to start  
answering, that is why I am writing to the list.

I have data in a format like this (tabs might look weird):

John   A1  1 0 1
John   A2  1 1 1
John   A3  1 0 0
Mary   A1  1 0 1
Mary   A2  0 0 1
Mary   A3  1 1 0
Peter  A1  1 0 0
Peter  A2  0 0 1
Peter  A3  1 1 1
Josh   A1  1 0 0
Josh   A2
Josh   A3  0 0 0

I want to convert it into a format where variable rows from a single  
subject are placed behind each other, but with the different scores  
still matching up (i.e., it needs to be able to cope with missing  
data, as for Josh's A2 score).

John   A1 1 0 1  A2 1 1 1  A3 1 0 0
Mary   A1 1 0 1  A2 0 0 1  A3 1 1 0
Peter  A1 1 0 0  A2 0 0 1  A3 1 1 1
Josh   A1 1 0 0  A2        A3 0 0 0

Preferably, the row identification would become the header of the  
new table, something like this:

        A11 A12 A13 A21 A22 A23 A31 A32 A33
John      1   0   1   1   1   1   1   0   0
Mary      1   0   1   0   0   1   1   1   0
Peter     1   0   0   0   0   1   1   1   1
Josh      1   0   0               0   0   0

Probably, this has been addressed before - I just don't know how to  
search for the answer with the right search terms.

Any help is appreciated, even just a link to a page where this is  
addressed!

There is a reshape function in the stats package that nobody except
Phil Spector seems to understand, and then there are the reshape and
reshape2 packages that everybody seems to get. (I don't understand why
the classification variables are on the left-hand side, though.
Positionally it makes some sense, but logically it does not connect
with how I understand the process.)

require(reshape2)
# entered your data with default names V1 V2 V3 V4 V5
> nam123
      V1 V2 V3 V4 V5
1   John A1  1  0  1
2   John A2  1  1  1
3   John A3  1  0  0
4   Mary A1  1  0  1
5   Mary A2  0  0  1
6   Mary A3  1  1  0
7  Peter A1  1  0  0
8  Peter A2  0  0  1
9  Peter A3  1  1  1
10  Josh A1  1  0  0
11  Josh A2 NA NA NA
12  Josh A3  0  0  0

> nams.mlt <- melt(nam123, id.vars=c("V1", "V2"))

> str(nams.mlt)
'data.frame':   36 obs. of  4 variables:
 $ V1      : Factor w/ 4 levels "John","Josh",..: 1 1 1 3 3 3 4 4 4 2 ...
 $ V2      : Factor w/ 3 levels "A1","A2","A3": 1 2 3 1 2 3 1 2 3 1 ...
 $ variable: Factor w/ 3 levels "V3","V4","V5": 1 1 1 1 1 1 1 1 1 1 ...
 $ value   : int  1 1 1 1 0 1 1 0 1 1 ...

> dcast(nams.mlt, V1+V2 ~ variable)
      V1 V2 V3 V4 V5
1   John A1  1  0  1
2   John A2  1  1  1
3   John A3  1  0  0
4   Josh A1  1  0  0
5   Josh A2 NA NA NA
6   Josh A3  0  0  0
7   Mary A1  1  0  1
8   Mary A2  0  0  1
9   Mary A3  1  1  0
10 Peter A1  1  0  0
11 Peter A2  0  0  1
12 Peter A3  1  1  1
> dcast(nams.mlt, V1 ~ V2+variable)
     V1 A1_V3 A1_V4 A1_V5 A2_V3 A2_V4 A2_V5 A3_V3 A3_V4 A3_V5
1  John     1     0     1     1     1     1     1     0     0
2  Josh     1     0     0    NA    NA    NA     0     0     0
3  Mary     1     0     1     0     0     1     1     1     0
4 Peter     1     0     0     0     0     1     1     1     1

You can always change the names of the dataframe if you want, and in  
this case it would be a simple sub() operation. Personally I would  
substitute "." rather than "".
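
For comparison, here is a minimal sketch of the same long-to-wide step done
with stats::reshape() instead of reshape2, followed by a sub() renaming of
the kind mentioned above. The score columns are called s1, s2, s3 here
purely for illustration (an assumption, not names from the original data),
so that the renamed headers come out as A1.1, A1.2, and so on.

```r
## Data as entered above, with the three score columns renamed s1..s3
## (hypothetical names chosen for this sketch)
nam123 <- data.frame(
  V1 = rep(c("John", "Mary", "Peter", "Josh"), each = 3),
  V2 = rep(c("A1", "A2", "A3"), times = 4),
  s1 = c(1, 1, 1, 1, 0, 1, 1, 0, 1, 1, NA, 0),
  s2 = c(0, 1, 0, 0, 0, 1, 0, 0, 1, 0, NA, 0),
  s3 = c(1, 1, 0, 1, 1, 0, 0, 1, 1, 0, NA, 0)
)

## One row per subject; columns are named <score>.<row id>, e.g. s1.A1
wide <- reshape(nam123, idvar = "V1", timevar = "V2", direction = "wide")

## Swap the two parts of each name: s1.A1 becomes A1.1
names(wide) <- sub("^s([0-9])\\.(A[0-9])$", "\\2.\\1", names(wide))
wide
```

Josh's missing A2 scores stay in place as NA, so the columns still line up
across subjects.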

--

David Winsemius, MD
West Hartford, CT


Re: [R] How to merge two files

2011-07-04 Thread Joshua Wiley

Dear Albert,

Here is one way:

tmp.scores <- readLines("~/scores.txt")
tmp.seq <- readLines("~/seq.txt")
tmp.seq <- strsplit(gsub("N", "", tmp.seq), "")[[1]]
genedat <- data.frame(Sequence = tmp.seq, Scores = as.numeric(tmp.scores))
## Yields
> genedat
   Sequence Scores
1         A   0.80
2         T   0.70
3         T   0.30
4         A   0.50
5         A   0.60
6         A   0.50
7         G   0.01
8         G   0.90
9         G   0.30
10        C   0.80
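
If you want to try this without creating the two files first, a
self-contained sketch of the same idea is below. The inputs are typed in
directly, and the length check is an extra safeguard I am adding here, not
something required by the approach.

```r
## Inputs typed in directly instead of read with readLines()
seq_line <- "NNATTAAAGGGC"
scores   <- c(0.8, 0.7, 0.3, 0.5, 0.6, 0.5, 0.01, 0.9, 0.3, 0.8)

## Drop every N, then split the remaining string into single letters
bases <- strsplit(gsub("N", "", seq_line, fixed = TRUE), "")[[1]]

## Added safeguard: one score per remaining base
stopifnot(length(bases) == length(scores))

genedat <- data.frame(Sequence = bases, Scores = scores)
genedat
```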

Hope this helps,

Josh

2011/7/4 albert coster :
> Dear all,
>
> I have two files :
>
> seq.txt: NNATTAAAGGGC
>
> scores.txt :
>
> 0.8
> 0.7
> 0.3
> 0.5
> 0.6
> 0.5
> 0.01
> 0.9
> 0.3
> 0.8
>
> I want output as following
>
> A 0.8
> T 0.7
> T 0.3
> A 0.5
> A 0.6
> A 0.5
> G 0.01
> G 0.9
> G 0.3
> C 0.8
>
> Where N are deleted and only A/T/G/C are appearing in a column.
>
> Thanks
>
> Albert
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


[R] clustering based on most significant pvalues does not separate the groups!

2011-07-04 Thread pguilha

Hi all,

I have some microarray data on 40 samples that fall into two groups. I have
a value for 480k probes for each of those samples. I performed a t test
(rowttests) on each row (giving the indices of the columns for each group),
then used p.adjust() to adjust the p-values for the number of tests
performed. I then selected only the probes with adjusted p-value <= 0.05. I
end up with roughly 2000 probes to do the clustering on, but using pvclust
and hclust, the samples do not split up into the two groups. I would have
imagined that, using only those values that are significantly different
between the two groups, the clustering should surely reflect that?

Please, what am I missing?

Thanks!

Paul

PS: I am hoping I have just thought this through in the wrong way and there
is a simple explanation, but can provide the code I am using for clustering
if necessary!



--
View this message in context: 
http://r.789695.n4.nabble.com/clustering-based-on-most-significant-pvalues-does-not-separate-the-groups-tp3644249p3644249.html
Sent from the R help mailing list archive at Nabble.com.


[R] placing multiple rows in a single row

2011-07-04 Thread Annemarie Verkerk


Dear people from the R help list,

I have a question that I can't get my head around to start answering, 
that is why I am writing to the list.


I have data in a format like this (tabs might look weird):

John   A1  1 0 1
John   A2  1 1 1
John   A3  1 0 0
Mary   A1  1 0 1
Mary   A2  0 0 1
Mary   A3  1 1 0
Peter  A1  1 0 0
Peter  A2  0 0 1
Peter  A3  1 1 1
Josh   A1  1 0 0
Josh   A2
Josh   A3  0 0 0

I want to convert it into a format where variable rows from a single 
subject are placed behind each other, but with the different scores 
still matching up (i.e., it needs to be able to cope with missing data, 
as for Josh's A2 score).


John   A1 1 0 1  A2 1 1 1  A3 1 0 0
Mary   A1 1 0 1  A2 0 0 1  A3 1 1 0
Peter  A1 1 0 0  A2 0 0 1  A3 1 1 1
Josh   A1 1 0 0  A2        A3 0 0 0

Preferably, the row identification would become the header of the new 
table, something like this:


        A11 A12 A13 A21 A22 A23 A31 A32 A33
John      1   0   1   1   1   1   1   0   0
Mary      1   0   1   0   0   1   1   1   0
Peter     1   0   0   0   0   1   1   1   1
Josh      1   0   0               0   0   0

Probably, this has been addressed before - I just don't know how to 
search for the answer with the right search terms.


Any help is appreciated, even just a link to a page where this is addressed!

Thank you!
Annemarie

--
Annemarie Verkerk, MA
Evolutionary Processes in Language and Culture (PhD student)
Max Planck Institute for Psycholinguistics
P.O. Box 310, 6500AH Nijmegen, The Netherlands
+31 (0)24 3521 185
http://www.mpi.nl/research/research-projects/evolutionary-processes


[R] How to merge two files

2011-07-04 Thread albert coster

Dear all,

I have two files :

seq.txt: NNATTAAAGGGC

scores.txt :

0.8
0.7
0.3
0.5
0.6
0.5
0.01
0.9
0.3
0.8

I want output as following

A 0.8
T 0.7
T 0.3
A 0.5
A 0.6
A 0.5
G 0.01
G 0.9
G 0.3
C 0.8

Where N are deleted and only A/T/G/C are appearing in a column.

Thanks

Albert

Re: [R] Protecting R code

2011-07-04 Thread Steve Lianoglou

On Mon, Jul 4, 2011 at 2:04 PM, Barry Rowlingson
 wrote:
> On Mon, Jul 4, 2011 at 5:48 PM, Vaishali Sadaphal wrote:
>
>>
>> Hey All,
>>
>> Thank you so much for quick replies.
>> Looks like translation to C/C++ is the only robust option. Do you think
>> there exists any ready-made R to C translator?
>>
>>
>  No, I think they are normally all born without the R to C translation
> skills and acquire them through a long process of going to school and
> college and spending long long hours studying R and C...
>
>  I suggest that if your code is so commercially sensitive that you want it
> written in C, then hire a C programmer to do it. Money well spent.

Also -- check out Rcpp:
http://cran.r-project.org/web/packages/Rcpp/index.html

It will ease some of the R <--> C(++) bridging pain, but also provides
things like "sugar":
http://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-sugar.pdf

Which may make writing your code in C++ a bit easier. YMMV, of course.

The library is also GPL though, so I'm not sure what that will make
your end code. I guess you'll just be linking to it at the end of the
day, but I'm not sure what the prevailing wisdom is these days about
whether or not that restricts the license of your code.

HTH,

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] Rpad library

2011-07-04 Thread David Winsemius



On Jul 4, 2011, at 12:45 PM, ATANU wrote:

can anyone help me with a well documented tutorial on Rpad package? I need to
do HTML programming in R. Can anyone help me with a tutorial?


Trivial Google searching produces this link:

http://rpad.googlecode.com/svn-history/r76/Rpad_homepage/index.html



--
View this message in context: 
http://r.789695.n4.nabble.com/Rpad-library-tp3644041p3644041.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT


Re: [R] SOLVED: superimposing different plot types in lattice panel.superpose

2011-07-04 Thread genghis

Thank you very much Dennis, that's wonderful.  I tried it first without
latticeExtra and it didn't work, so yes, that package is the key.

Best,

John

--
View this message in context: 
http://r.789695.n4.nabble.com/superimposing-different-plot-types-in-lattice-panel-superpose-tp3642808p3644145.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Protecting R code

2011-07-04 Thread Barry Rowlingson

On Mon, Jul 4, 2011 at 5:48 PM, Vaishali Sadaphal  wrote:

>
> Hey All,
>
> Thank you so much for quick replies.
> Looks like translation to C/C++ is the only robust option. Do you think
> there exists any ready-made R to C translator?
>
>
 No, I think they are normally all born without the R to C translation
skills and acquire them through a long process of going to school and
college and spending long long hours studying R and C...

 I suggest that if your code is so commercially sensitive that you want it
written in C, then hire a C programmer to do it. Money well spent.

Barry


Re: [R] I need help for creating a "timevar"

2011-07-04 Thread Joshua Wiley

Hi Karen,

As long as your IDENTITY column goes up in order, this should work:

## example data
dat <- data.frame(IDENTITY = rep(101:103, 3:1), EVENT = "Event")
dat$TIMEVAR <- unlist(with(dat, tapply(EVENT, IDENTITY, seq_along)))
## Result
dat

See ?tapply and ?seq_along for some documentation
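
If the IDENTITY column is not sorted, a variant with ave() should still
work, because ave() writes each group's result back into the original row
positions. A minimal sketch with made-up, deliberately unsorted data (the
values below are assumptions for illustration only), followed by the
reshape to wide format:

```r
## Made-up example data, deliberately NOT sorted by IDENTITY
dat <- data.frame(IDENTITY = c(102, 101, 101, 103, 102, 101),
                  EVENT    = c("a", "b", "c", "d", "e", "f"))

## Number the events within each person, in order of appearance
dat$TIMEVAR <- ave(seq_len(nrow(dat)), dat$IDENTITY, FUN = seq_along)
dat

## With the counter in place, reshape() can go from long to wide
wide <- reshape(dat, idvar = "IDENTITY", timevar = "TIMEVAR",
                direction = "wide")
wide
```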

Hope this helps,

Josh

On Mon, Jul 4, 2011 at 6:41 AM, kbr  wrote:
> Hi all!
>
> I have data in „Long“ format which I would like to reshape to „Wide“. I know
> that one possibility is the „reshape“ command, which needs a „timevar“.
>
> Data look as follows: There are approx. 3000 persons („IDENTITY“) and, for
> each person, there are between 2 and 20 events („EVENT“).  For now, there's
> one row for each event (9506 rows)
>
> http://r.789695.n4.nabble.com/file/n3643658/Screenshot-2.png
>
> What is missing is the „timevar“ (SPSS calls it „INDEX“), which numbers the
> events WITHIN each person (right column).
>
> I managed to number the events from 1 to 9506 with the seq-command, first
> writing the number of rows in nEVENT:
>> number <-seq(file=event, 1, nEVENT, b=1)
> Yet, I didn't manage to do so for each individual separately. I guess it
> would be possible with the „split“ command, but I can't figure out how to
> apply it.
>
> Can anyone give me a hint?
> Thank you!
>
> Karen
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/I-need-help-for-creating-a-timevar-tp3643658p3643658.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] How to build a matrix of number of appearance?

2011-07-04 Thread jim holtman

Here is another way:

> xx <- data.frame(P = sample(5, 100, TRUE), M = sample(5, 100, TRUE), id = 1:100)
> require(data.table)
> xx <- data.table(xx)  # convert to data.table
> count <- xx[
+ , list(count = length(id))
+ , by = list(M, P)
+   ]
> str(count)
Classes ‘data.table’ and 'data.frame':  24 obs. of  3 variables:
 $ M: int  1 1 1 1 1 2 2 2 2 2 ...
 $ P: int  1 2 3 4 5 1 2 3 4 5 ...
 $ count: int  5 4 3 2 9 3 3 6 3 7 ...
> count
   M P count
   1 1 5
   1 2 4
   1 3 3
   1 4 2
   1 5 9
   2 1 3
   2 2 3
   2 3 6
   2 4 3
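
If a member-by-provider matrix is what is wanted, base R's table() is
another option. A minimal sketch, assuming column names MemberID and
ProviderID as in the question and using a tiny made-up stand-in for qq1:

```r
## Tiny made-up stand-in for the claims table qq1
qq1 <- data.frame(
  MemberID   = c(23, 23, 23, 7, 7, 15),
  ProviderID = c(345, 345, 12, 345, 12, 12)
)

## Rows are members, columns are providers, cells are co-occurrence counts
counts <- table(MemberID = qq1$MemberID, ProviderID = qq1$ProviderID)
counts

## How often provider 345 appears together with member 23
counts["23", "345"]

## The same counts as an ordinary data frame, one column per provider
counts.df <- as.data.frame.matrix(counts)
```

Only providers that actually occur in qq1 get a column, which matches the
sum(qq1$ProviderID==i)>0 condition in the question.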


On Mon, Jul 4, 2011 at 5:48 AM, UriB  wrote:
> I have a matrix of claims at year1 that I get simply by
>
> claims<-read.csv(file="Claims.csv")
> qq1<-claims[claims$Year=="Y1",]
>
> I have MemberID and ProviderID for every claim in qq1 both are integers
>
> An example for the type of questions that I want to answer is
> how many times ProviderID number 345 appears together with MemberID 23 in
> the table qq1
>
> In order to answer these questions for every possible ProviderId and every
> possible MemberID
> I would like to have a matrix that has first column as memberID when every
> memberID in qq1 appears only once and columns that have number of appearance
> of ProviderID==i for every i that has
> sum(qq1$ProviderID==i)>0
>
> My question is if there is a simple way to do it in R
> Thanks in Advance
>
> Uri
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/How-to-build-a-matrix-of-number-of-appearance-tp3643248p3643248.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


Re: [R] For help in R coding

2011-07-04 Thread David Winsemius



On Jul 4, 2011, at 1:29 PM, Bansal, Vikas wrote:


Dear sir,

I have one more problem. Sorry to disturb you again.

I have a data frame like this:

Col1 Col2 Col3 Col4
   1    0    1    4
   0    0    0    2
   4    2    0    0
   1    5    0    0
   0    0    4    3
   0    0    0    2
   0    0    0    0
   1    1    0    5

I want to delete all rows that have more than two 0s. In the input above,
row 2 has 3 zeros, row 6 has 3 zeros and row 7 has 4 zeros, so I want to
exclude them and get this output:

Col1 Col2 Col3 Col4
   1    0    1    4
   4    2    0    0
   1    5    0    0
   0    0    4    3
   1    1    0    5

Can you please tell me how to code for this problem?


I am having a difficult time figuring out why this is not an obvious  
application for `apply` and "[" using logical indexing. I suggest you  
do some more self-study with the introductory material that you will  
find here:


http://cran.r-project.org/other-docs.html

(It also is a frequently asked question, so searching the archives for  
worked examples should also be considered.)
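
For the record, a minimal sketch of the apply-plus-logical-indexing idea,
using the eight rows from the question typed in by hand (the name n_zero is
just a label chosen for this sketch):

```r
## The eight rows from the question
dat <- data.frame(
  Col1 = c(1, 0, 4, 1, 0, 0, 0, 1),
  Col2 = c(0, 0, 2, 5, 0, 0, 0, 1),
  Col3 = c(1, 0, 0, 0, 4, 0, 0, 0),
  Col4 = c(4, 2, 0, 0, 3, 2, 0, 5)
)

## Count the zeros in each row, then keep rows with at most two of them
n_zero <- apply(dat, 1, function(r) sum(r == 0))
kept <- dat[n_zero <= 2, ]
kept
```

rowSums(dat == 0) would give the same counts without apply().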


Using search terms "delete all rows with =="

http://search.r-project.org/cgi-bin/namazu.cgi?query=delete+rows+with+all+%3D%3D&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=Rhelp10&idxname=Rhelp02


--
David.








Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: David Winsemius [dwinsem...@comcast.net]
Sent: Monday, July 04, 2011 2:02 AM
To: Bansal, Vikas
Cc: Dennis Murphy; r-help@r-project.org
Subject: Re: [R] For help in R coding

On Jul 3, 2011, at 6:10 PM, Bansal, Vikas wrote:


So I want to code so that it will give the output like this-

DATA FRAME (Input)


Editing the task so it is reproducible:

dat <- read.table(textConnection(' col3 col9
  T  .a,g,,
  A.t,t,,
  A.,c,c,
  C .,a,,,
  G .,t,t,t
  A .c,,g,^!.
  A  .g,ggg.^!,
  A  .$,.,
  C  a,g,,t,
  T  ,.,^!.
  T   ,$.,."'), header=TRUE,
stringsAsFactors=FALSE)


output

AC GT
1 0  14
4 0  02
4  2 00
1  5 00
0  0 43


It's also possible to apply the logic that Gabor Grothendieck offered
at the beginning of this thread:

dat[, "newcol"] <- apply(dat, 1, function(x) gsub("\\,|\\." ,x[1],
x[2])  )
# ... and the obvious repetition for C.G.T


dat[,"A"] <- nchar( gsub("[^aA]", "", dat[ , "newcol"] ))
dat

   col3   col9 newcol A
1 T .a,g,, TaTgTT 1
2 A .t,t,, AtAtAA 4
3 A .,c,c, AAcAcA 4
4 C .,a,,, CCaCCC 1
5 G.,t,t,tGGtGtGt 0
6 A  .c,,g,^!.  AcAAgA^!A 5
7 A .g,ggg.^!, AgAgggA^!A 4
8 A  .$,.,  A$AAA 8
9 Ca,g,,t,aCgCCtC 1
10T ,.,^!. TTT^!T 0
11T ,$.,." T$TTT" 0

I am deeply in debt to Gabor Grothendieck. He taught me all I know
regarding regex. The man is a master at patterns.

--

David Winsemius, MD
West Hartford, CT



David Winsemius, MD
West Hartford, CT


Re: [R] Wrong environment when evaluating and expression?

2011-07-04 Thread Joshua Wiley

Thanks Gabor, that makes sense now.

In case anyone else runs into something similar, I ended up just
passing a character string of the formula so it could be coerced to a
formula in the correct environment.
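
A minimal sketch of that workaround, using plain lm() on mtcars rather than
the mice with() method, so treat it as an illustration of the idea and not
the exact code I ran. The helper name fit_one is made up for this example.

```r
## The difference Gabor pointed out, in two lines
class(quote(mpg ~ hp))   # "call": an unevaluated call to `~`
class(mpg ~ hp)          # "formula": evaluated, carries an environment

## Hypothetical helper: take the formula as a string and coerce it
## inside the function, so its environment is the function's own frame
fit_one <- function(f_chr, data) {
  f <- as.formula(f_chr)
  lm(f, data = data)
}

## A list of models from a vector of formula strings
fits <- lapply(c("mpg ~ hp", "mpg ~ hp + wt"), fit_one, data = mtcars)
lapply(fits, coef)
```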

Thanks again,

Josh

On Mon, Jul 4, 2011 at 4:26 AM, Gabor Grothendieck
 wrote:
> On Mon, Jul 4, 2011 at 4:11 AM, Joshua Wiley  wrote:
>> Hi All,
>>
>> I have constructed two expressions (e1 & e2).  I can see that they are
>> not identical, but I cannot figure out how they differ.
>>
>> ###
>> dat <- mtcars
>> e1 <- expression(with(data = dat, lm(mpg ~ hp)))
>> e2 <- as.expression(substitute(with(data = dat, lm(f)), list(f = mpg ~ hp)))
>>
>> str(e1)
>> str(e2)
>> all.equal(e1, e2)
>> identical(e1, e2) # false
>>
>> eval(e1)
>> eval(e2)
>> 
>>
>> The context is trying to use a list of formulae to generate several
>> models from a multiply imputed dataset.  The package I am using (mice)
>> has methods for with() and that is how I can (easily) get the pooled
>> results.  Passing the formula directly does not work, so I was trying
>> to generate the entire call and evaluate it as if I had typed it at
>> the console, but I am missing something (probably rather silly).
>>
>
> In e1, mpg ~ hp is a call object, but in e2 it's a formula with an environment:
>
>> e1[[1]][[3]][[2]]
> mpg ~ hp
>> e2[[1]][[3]][[2]]
> mpg ~ hp
>>
>> class(e1[[1]][[3]][[2]])
> [1] "call"
>> class(e2[[1]][[3]][[2]])
> [1] "formula"
>>
>> environment(e2[[1]][[3]][[2]])
> <environment: R_GlobalEnv>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] For help in R coding

2011-07-04 Thread Bansal, Vikas

Dear sir,

I have one more problem. Sorry to disturb you again.

I have a data frame like this:

Col1 Col2 Col3 Col4
   1    0    1    4
   0    0    0    2
   4    2    0    0
   1    5    0    0
   0    0    4    3
   0    0    0    2
   0    0    0    0
   1    1    0    5

I want to delete all rows that have more than two 0s. In the input above,
row 2 has 3 zeros, row 6 has 3 zeros and row 7 has 4 zeros, so I want to
exclude them and get this output:

Col1 Col2 Col3 Col4
   1    0    1    4
   4    2    0    0
   1    5    0    0
   0    0    4    3
   1    1    0    5

Can you please tell me how to code for this problem?






Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: David Winsemius [dwinsem...@comcast.net]
Sent: Monday, July 04, 2011 2:02 AM
To: Bansal, Vikas
Cc: Dennis Murphy; r-help@r-project.org
Subject: Re: [R] For help in R coding

On Jul 3, 2011, at 6:10 PM, Bansal, Vikas wrote:

>> So I want to code so that it will give the output like this-
>>
>> DATA FRAME (Input)

Editing the task so it is reproducible:

dat <- read.table(textConnection(' col3 col9
   T  .a,g,,
   A.t,t,,
   A.,c,c,
   C .,a,,,
   G .,t,t,t
   A .c,,g,^!.
   A  .g,ggg.^!,
   A  .$,.,
   C  a,g,,t,
   T  ,.,^!.
   T   ,$.,."'), header=TRUE,
stringsAsFactors=FALSE)

>> output
>>
>> AC GT
>> 1 0  14
>> 4 0  02
>> 4  2 00
>> 1  5 00
>> 0  0 43

It's also possible to apply the logic that Gabor Grothendieck offered
at the beginning of this thread:

dat[, "newcol"] <- apply(dat, 1, function(x) gsub("\\,|\\." ,x[1],
x[2])  )
# ... and the obvious repetition for C.G.T

 > dat[,"A"] <- nchar( gsub("[^aA]", "", dat[ , "newcol"] ))
 > dat
col3   col9 newcol A
1 T .a,g,, TaTgTT 1
2 A .t,t,, AtAtAA 4
3 A .,c,c, AAcAcA 4
4 C .,a,,, CCaCCC 1
5 G.,t,t,tGGtGtGt 0
6 A  .c,,g,^!.  AcAAgA^!A 5
7 A .g,ggg.^!, AgAgggA^!A 4
8 A  .$,.,  A$AAA 8
9 Ca,g,,t,aCgCCtC 1
10T ,.,^!. TTT^!T 0
11T ,$.,." T$TTT" 0

I am deeply in debt to Gabor Grothendieck. He taught me all I know
regarding regex. The man is a master at patterns.

--

David Winsemius, MD
West Hartford, CT


[R] Rpad library

2011-07-04 Thread ATANU

can anyone help me with a well documented tutorial on Rpad package? I need to
do HTML programming in R. Can anyone help me with a tutorial?

--
View this message in context: 
http://r.789695.n4.nabble.com/Rpad-library-tp3644041p3644041.html
Sent from the R help mailing list archive at Nabble.com.


[R] forecast: bias in sampling from seasonal Arima model?

2011-07-04 Thread Nicolas Chapados

Dear all,

I stumbled upon what appears to be a troublesome issue when sampling from an
ARIMA model (from Rob Hyndman's excellent 'forecast' package) that contains
a seasonal AR component.

Here's how to reproduce the issue.  (I'm using R 2.9.2 with forecast 2.19;
see sessionInfo() below).

First some data:

> x <- c(
 0.132475,  0.143119,  0.108104,  0.247291,  0.029510, -0.119591, -0.133313,
-0.098128,  0.192698,  0.110328,  0.163671, -0.004925, -0.239209, -0.055122,
-0.051121,  0.154108,  0.008665, -0.074702,  0.066534, -0.098728, -0.068668,
 0.150935, -0.022547,  0.028625,  0.107092, -0.065396, -0.253247, -0.115240,
-0.113535, -0.064191, -0.006032,  0.039233,  0.129013, -0.068462,  0.022398,
-0.052427, -0.005586,  0.011447, -0.022667, -0.120536, -0.234398, -0.164087,
-0.177160, -0.120624, -0.025104,  0.001144, -0.193424, -0.260674, -0.036976,
-0.009590, -0.004920,  0.130545,  0.120527,  0.041121, -0.123321,  0.023836,
-0.188418,  0.015807, -0.056012,  0.000496,  0.051806, -0.067574,  0.012775,
 0.244083,  0.148857,  0.013874,  0.235252,  0.151935,  0.036986,  0.134482,
-0.003359, -0.019422,  0.086195,  0.206569,  0.123565,  0.070835, -0.183189,
-0.046513,  0.071920, -0.038360,  0.135293,  0.054746, -0.280340,  0.110638,
 0.009729,  0.115541,  0.021397,  0.097835, -0.028434, -0.218416,  0.044552,
 0.442563,  0.084317,  0.044149,  0.201100,  0.076112, -0.134955,  0.023870,
 0.077111,  0.085490,  0.023154,  0.099757, -0.026509, -0.189839,  0.026614,
 0.184916, -0.007266,  0.081276,  0.312526,  0.051199, -0.104707, -0.004206,
 0.062440,  0.126385, -0.018100,  0.092513,  0.186459, -0.170184, -0.126168,
 0.122739,  0.097495,  0.008633, -0.034519,  0.187264, -0.153409,  0.009440,
 0.150561,  0.067744,  0.045129,  0.230831, -0.079700, -0.162694, -0.044251,
-0.007663,  0.048986,  0.065724,  0.159706,  0.040067, -0.059949,  0.024810,
-0.154852,  0.018080,  0.165935,  0.203050,  0.011035, -0.232585, -0.162248,
-0.104872, -0.062516, -0.089766,  0.100304,  0.142170, -0.144969, -0.032500,
-0.002131,  0.165890,  0.107629,  0.075752,  0.119003,  0.095955,  0.039842,
 0.081208,  0.348529,  0.145694, -0.210700,  0.384966, -0.054503,  0.293329,
 0.184295,  0.368986,  0.135270,  0.124917,  0.185286, -0.252088, -0.169708,
-0.010204,  0.021934,  0.003572,  0.180148,  0.075836, -0.232065, -0.127255,
-0.147122,  0.056163,  0.067004,  0.217810,  0.074513, -0.167389,  0.172578,
-0.148127,  0.057025,  0.042623,  0.094214,  0.047004, -0.345453, -0.265104,
-0.082897,  0.052705, -0.067002,  0.191941,  0.010989, -0.298567, -0.162841,
 0.043773,  0.185459,  0.126305,  0.383101,  0.092747, -0.368453, -0.325097,
 0.029564, -0.015390,  0.013807,  0.152062, -0.047015, -0.429245, -0.097742,
 0.104502, -0.007547, -0.000245,  0.062830,  0.030093, -0.381043, -0.267704,
-0.125930, -0.032264, -0.041657,  0.040073,  0.084431, -0.276316, -0.305253,
-0.019942,  0.045390,  0.046090,  0.145700,  0.069920, -0.210079,  0.050967,
 0.042283,  0.248840,  0.007883,  0.203171,  0.050722, -0.109773, -0.110301,
-0.095433,  0.071133,  0.023793,  0.192476,  0.057746)

First, a CORRECT model, containing a seasonal MA component but no seasonal
AR component.  After estimation, I forecast for 1 time-step, and I take the
mean of sampling 1 times from the same model:

> my.arima1 <- Arima(x, order=c(3,0,0), seasonal=list(order=c(0,0,2),
period=7), include.mean=FALSE)
> forecast(my.arima1, 1)
    Point Forecast      Lo 80     Hi 80     Lo 95     Hi 95
251    -0.03143283 -0.1882245 0.1253589 -0.271225 0.2083594
> set.seed(1827) ; mean(sapply(seq_len(1), function(i)
as.numeric(simulate(my.arima1, 1)) ))
[1] -0.03258454

The results ("Point Forecast" versus the output of mean()) are identical up
to some sampling error.

Now the INCORRECT model arises from adding one seasonal AR component:

> my.arima2 <- Arima(x, order=c(3,0,0), seasonal=list(order=c(1,0,2),
period=7), include.mean=FALSE)
> forecast(my.arima2, 1)
    Point Forecast     Lo 80       Hi 80      Lo 95      Hi 95
251     -0.1848579 -0.322421 -0.04729492 -0.3952424 0.02552655
> set.seed(1827) ; mean(sapply(seq_len(1), function(i)
as.numeric(simulate(my.arima2, 1)) ))
[1] -0.05416299

Here the results are substantially different (-0.18 versus -0.05), and the
latter does not change much if I take a much bigger sample.

Did anybody encounter this in the past?  Is this a bug?

For reference, here are the results of sessionInfo():
> sessionInfo()
R version 2.9.2 (2009-08-24)
x86_64-pc-linux-gnu

locale:
C

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
 [1] KernSmooth_2.23-3  digest_0.4.2       forecast_2.19      fracdiff_1.3-2
 [5] tseries_0.10-12    zoo_1.4-0          quadprog_1.4-11    glmnet_1.7
 [9] Matrix_0.999375-30 biglm_0.7          DBI_0.2-4          inline_0.3.7
[13] XML_2.6-0          timeSeries_2110.86 timeDate_2110.87   RODBC_1.3-2
[17] reshape_0.8.3      plyr_0.1.9         MASS_7.2-48        nnet_7.2-48

Re: [R] Protecting R code

2011-07-04 Thread Mike Marchywka

Put it on rapache or otherwise server but this seems like a waste depending on 
what you are doing 

Server side is only good way but making c++ may be interesting test
Sent from my Verizon Wireless BlackBerry

-Original Message-
From: Vaishali Sadaphal 
Date: Mon, 4 Jul 2011 16:48:13 
To: 
Cc: ; 
Subject: Re: [R] Protecting R code

Hey All,

Thank you so much for quick replies.
Looks like translation to C/C++ is the only robust option. Do you think 
there exists any ready-made R to C translator?

Thanks
--
Vaishali

Vaishali Paithankar Sadaphal
Tata Consultancy Services
Mailto: vaishali.sadap...@tcs.com
Website: http://www.tcs.com

Experience certainty.   IT Services
    Business Solutions
    Outsourcing

From:
Spencer Graves 
To:
Barry Rowlingson 
Cc:
Vaishali Sadaphal , r-help@r-project.org
Date:
07/04/2011 08:42 PM
Subject:
Re: [R] Protecting R code

Hello:

On 7/4/2011 7:41 AM, Barry Rowlingson wrote:
> On Mon, Jul 4, 2011 at 8:47 AM, Vaishali Sadaphal
>   wrote:
>> Hi All,
>>
>> I need to give my R code to my client to use. I would like to protect 
the
>> logic/algorithms that have been coded in R. This means that I would not
>> like anyone to be able to read the code.
>    At some point the R code has to be run. Which means it has to be
> read by an interpreter that can handle R code. Which means, unless you
> rewrite the interpreter, the R code must exist as such.
>
>   Even if you could compile R into C code into machine code and
> distribute a .exe file, its still possible in theory to
> reverse-engineer it and get something like the original back - the
> original logic if not the original names of the variables and
> functions.
>
>   You could rewrite the interpreter to only run encrypted, signed code
> that requires a decryption key, but you still have to give the user
> the decryption key at some point in order to get the plaintext code.
> Again, its an obfuscation problem of hiding the key somewhere, and
> hence is going to fail.
>
>   It all depends on how much expense you want to go to in order to make
> the expense of circumventing your solution more than its worth. Tell
> me how much that is, and I will tell you the solution.
>
>   For total security[1], you need to run the code on servers YOU
> control, and only give access via a network API. You can do this with
> RServe or any of the HTTP-based systems like Rapache.

   An organization I know that encrypted R code started with making 
it available only on their servers.  This was maybe four years ago.  I'm 
not sure what they do now, but I think they have since lost their major 
proponents of R internally and have probably translated all the code 
they wanted to sell into a compiled language in a way that didn't 
require R at all.

   Spencer

>
> Barry
>
> [1] Except of course servers can be hacked or socially-engineered
> into. For total security, disconnect your machine from the network and
> from any power supply.

Re: [R] Protecting R code

2011-07-04 Thread Vaishali Sadaphal

Hey All,

Thank you so much for quick replies.
Looks like translation to C/C++ is the only robust option. Do you think 
there exists any ready-made R to C translator?

Thanks
--
Vaishali

Vaishali Paithankar Sadaphal
Tata Consultancy Services
Mailto: vaishali.sadap...@tcs.com
Website: http://www.tcs.com

Experience certainty.   IT Services
Business Solutions
Outsourcing

From:
Spencer Graves 
To:
Barry Rowlingson 
Cc:
Vaishali Sadaphal , r-help@r-project.org
Date:
07/04/2011 08:42 PM
Subject:
Re: [R] Protecting R code

Hello:

On 7/4/2011 7:41 AM, Barry Rowlingson wrote:
> On Mon, Jul 4, 2011 at 8:47 AM, Vaishali Sadaphal
>   wrote:
>> Hi All,
>>
>> I need to give my R code to my client to use. I would like to protect 
the
>> logic/algorithms that have been coded in R. This means that I would not
>> like anyone to be able to read the code.
>At some point the R code has to be run. Which means it has to be
> read by an interpreter that can handle R code. Which means, unless you
> rewrite the interpreter, the R code must exist as such.
>
>   Even if you could compile R into C code into machine code and
> distribute a .exe file, its still possible in theory to
> reverse-engineer it and get something like the original back - the
> original logic if not the original names of the variables and
> functions.
>
>   You could rewrite the interpreter to only run encrypted, signed code
> that requires a decryption key, but you still have to give the user
> the decryption key at some point in order to get the plaintext code.
> Again, its an obfuscation problem of hiding the key somewhere, and
> hence is going to fail.
>
>   It all depends on how much expense you want to go to in order to make
> the expense of circumventing your solution more than its worth. Tell
> me how much that is, and I will tell you the solution.
>
>   For total security[1], you need to run the code on servers YOU
> control, and only give access via a network API. You can do this with
> RServe or any of the HTTP-based systems like Rapache.

   An organization I know that encrypted R code started with making 
it available only on their servers.  This was maybe four years ago.  I'm 
not sure what they do now, but I think they have since lost their major 
proponents of R internally and have probably translated all the code 
they wanted to sell into a compiled language in a way that didn't 
require R at all.

   Spencer

>
> Barry
>
> [1] Except of course servers can be hacked or socially-engineered
> into. For total security, disconnect your machine from the network and
> from any power supply.

[R] I need help for creating a "timevar"

2011-07-04 Thread kbr

Hi all!

I have data in „Long“ format which I would like to reshape to „Wide“. I know
that one possibility is the „reshape“ command, which needs a „timevar“.

Data look as follows: There are approx. 3000 persons („IDENTITY“) and, for
each person, there are between 2 and 20 events („EVENT“).  For now, there's
one row for each event (9506 rows)

http://r.789695.n4.nabble.com/file/n3643658/Screenshot-2.png 

What is missing is the „timevar“ (SPSS calls it „INDEX“), which numbers the
events WITHIN each person (right column).

I managed to number the events from 1 to 9506 with the seq-command, first
writing the number of rows in nEVENT:
> number <-seq(file=event, 1, nEVENT, b=1) 
Yet, I didn't manage to do so for each individual separately. I guess it
would be possible with the „split“ command, but I can't figure out how to
apply it.

Can anyone give me a hint?
Thank you!

Karen
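
One possible way to build such an index, sketched on made-up data because the
original file is not available: ave() can run seq_along() separately within
each person, which yields a 1, 2, 3, ... counter that restarts for every
IDENTITY. The data frame `events` and its values are hypothetical; only the
column names IDENTITY, EVENT and INDEX follow the post.

```r
# Hypothetical toy data standing in for the poster's long-format file.
events <- data.frame(
  IDENTITY = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  EVENT    = c("a", "b", "c", "a", "d", "b", "c", "e", "f"),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each level of the grouping variable, so
# seq_along() numbers the rows 1, 2, ... separately for every person.
events$INDEX <- ave(seq_len(nrow(events)), events$IDENTITY, FUN = seq_along)

# With INDEX as the timevar, reshape() can spread the events into columns.
wide <- reshape(events, idvar = "IDENTITY", timevar = "INDEX",
                direction = "wide")
print(wide)
```

This assumes the rows are already in the desired order within each person;
if not, sorting the data frame first would be needed.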

--
View this message in context: 
http://r.789695.n4.nabble.com/I-need-help-for-creating-a-timevar-tp3643658p3643658.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Protecting R code

2011-07-04 Thread Spencer Graves


Hello:

On 7/4/2011 7:41 AM, Barry Rowlingson wrote:
> On Mon, Jul 4, 2011 at 8:47 AM, Vaishali Sadaphal
>   wrote:
>> Hi All,
>>
>> I need to give my R code to my client to use. I would like to protect the
>> logic/algorithms that have been coded in R. This means that I would not
>> like anyone to be able to read the code.
>
>    At some point the R code has to be run. Which means it has to be
> read by an interpreter that can handle R code. Which means, unless you
> rewrite the interpreter, the R code must exist as such.
>
>   Even if you could compile R into C code into machine code and
> distribute a .exe file, its still possible in theory to
> reverse-engineer it and get something like the original back - the
> original logic if not the original names of the variables and
> functions.
>
>   You could rewrite the interpreter to only run encrypted, signed code
> that requires a decryption key, but you still have to give the user
> the decryption key at some point in order to get the plaintext code.
> Again, its an obfuscation problem of hiding the key somewhere, and
> hence is going to fail.
>
>   It all depends on how much expense you want to go to in order to make
> the expense of circumventing your solution more than its worth. Tell
> me how much that is, and I will tell you the solution.
>
>   For total security[1], you need to run the code on servers YOU
> control, and only give access via a network API. You can do this with
> RServe or any of the HTTP-based systems like Rapache.

   An organization I know that encrypted R code started with making 
it available only on their servers.  This was maybe four years ago.  I'm 
not sure what they do now, but I think they have since lost their major 
proponents of R internally and have probably translated all the code 
they wanted to sell into a compiled language in a way that didn't 
require R at all.

   Spencer

>
> Barry
>
> [1] Except of course servers can be hacked or socially-engineered
> into. For total security, disconnect your machine from the network and
> from any power supply.


[R] modification of cross-validations in rpart

2011-07-04 Thread Katerine Goyer








Hello,

I am using the rpart function (from the rpart package) to fit a regression
tree that would describe the behaviour of a fish species according to several
environmental variables. For each fish (sampling unit), I have repeated
observations of the response variable, which means that the data are not
independent. Normally, in this case, V-fold cross-validation needs to be
modified to prevent over-optimistic predictions of error rates by
cross-validation and overestimation of the tree size. A way to overcome this
problem is to select only whole sampling units in the cross-validation
subsets. My problem is that I don't know how to perform this modification of
the cross-validation process in the rpart function.

Is there a way to do this modification in rpart, or is there any other
function I could use that would take the interdependence in the response
variable into account?

Here is an example of the code I am using (Y being the response variable and
data.env being a data frame of the environmental variables):

Tree = rpart(Y ~ X1 + X2 + X3, xval=100, data=data.env)

Thanks

Katerine
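
A minimal sketch of one way this might be done, under stated assumptions: the
help page for xpred.rpart says that xval may be an explicit vector of integers
defining the cross-validation groups, and rpart itself appears to accept the
same form (one label per row). Everything below is made-up illustration: the
fish identifier `fish_id`, the simulated data, and the choice of 5 folds are
hypothetical, and Y is stored inside data.env here for convenience.

```r
library(rpart)

set.seed(42)

# Hypothetical toy data: 20 fish, 10 repeated observations per fish.
n_fish  <- 20
n_obs   <- 10
fish_id <- rep(seq_len(n_fish), each = n_obs)

data.env <- data.frame(
  X1 = rnorm(n_fish * n_obs),
  X2 = rnorm(n_fish * n_obs),
  X3 = rnorm(n_fish * n_obs)
)
# Response with a fish-level shift, so rows from the same fish are correlated.
data.env$Y <- 2 * (data.env$X1 > 0) + rnorm(n_fish)[fish_id] +
  rnorm(n_fish * n_obs, sd = 0.5)

# Assign each fish (not each row) to a fold, then expand to one label per row.
# Labels run from 1 to n_folds, and all rows of a fish share the same label.
n_folds      <- 5
fold_of_fish <- sample(rep(seq_len(n_folds), length.out = n_fish))
xgroups      <- fold_of_fish[fish_id]

# Assumption: a vector with one fold label per observation is accepted as xval.
Tree <- rpart(Y ~ X1 + X2 + X3, data = data.env, xval = xgroups)
printcp(Tree)
```

With folds built this way, no fish contributes rows to both the training and
the held-out part of a cross-validation split, which is the modification
described in the question.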


  

Re: [R] Protecting R code

2011-07-04 Thread Spencer Graves


On 7/4/2011 7:28 AM, Uwe Ligges wrote:
> On 04.07.2011 09:47, Vaishali Sadaphal wrote:
>> Hi All,
>>
>> I need to give my R code to my client to use. I would like to protect the
>> logic/algorithms that have been coded in R. This means that I would not
>> like anyone to be able to read the code.
>>
>> I am searching for ways to protect R code. I would like to create a .exe
>> kind of file which could be executed without using R or requiring to
>> install R. I would not like the R code to be loaded in R. This is so
>> because, after R loads a function, if you type the function name on the
>> command prompt, you can see the complete code. I would not like to give
>> this type of access to the R code.
>>
>> I explored the option of creating .bat file (using command: R CMD BAT) and
>> byte code (using command: compile). These are not useful since they open
>> R, load these functions and then the R code is visible.
>>
>> Is there any other way to protect the R code which would help me package
>> all my files/source files and give me an executable file which would be
>> run without opening R? Another problem is that R is freely downloadable.
>> Is it somehow possible to protect the code from being loaded in R and
>> being seen.
>
> H, R is open source software under the GPL (which is infective)
> and designed as such. Good luck it is almost impossible to hide the
> source code in R. And people who tried to generate C based binary
> packages found those can only be used under a small subset of
> platforms with few versions of R.
>
> Since R is distributed under the GPL: When you write code and make it
> available to others, you should be aware of this fact that you may
> have to distribute the sources under GPL as well - under some
> circumstances your lawyer can explain much better than I.



  Linux is distributed under the GPL, and people distribute 
software implemented in Linux without having to release their source 
code.  There are different versions of the GPL.  You should read them 
carefully and consult with an attorney.  However, if you honestly read 
the GPL verbiage, you may find that you know more than your attorney -- 
but you still need the attorney.  I'm not an attorney and I haven't read 
GPL verbiage in a while, but as I recall a key issue is whether your 
code is your creation or a modification of some other GPL code.  If the 
latter, you could lose in court if challenged.



  I see two options:


1.  Write the proprietary portion of your code in a 
compiled language like C, C++, or Fortran, and link from R to your 
compiled subroutines.  If you do not already write R packages, I 
strongly urge you to first learn how to produce and use R packages.  
Documentation on "Creating R Packages" is available from any standard 
CRAN mirror.  I suggest you create separate R packages (with different 
names) complete with documentation for your internal only version in R 
only and for your public version that uses compiled code.  This allows 
you to prototype your new ideas quickly in R before you spend the money 
to convert them to compiled code.  It also encourages you to build test 
cases in a way that increases software quality.  Then you can distribute 
the public R package in its standard compiled format, which your users 
can install using the standard procedure to "Install package(s) from 
local zip file" (available on the "Packages" menu in Rgui).  This is 
arguably the cleanest legally, because then it's clear that your 
proprietary code has an existence independent of R.  You can distribute 
your package with an appropriate end user license agreement and 
instructions for how to install R and any CRAN packages you use plus 
your own code.



2.  You can write something to encrypt your R code.  I know 
someone who has done this.  However, the legal status is not as clean as 
if you wrote your proprietary algorithm in a compiled language, because 
if someone with a larger budget for attorneys wants to take you to court 
demanding your source code, you might lose.  I doubt if that would 
happen, but I'm not an attorney, so I don't know.  I do know that people 
often lose legal battles just because their opponents have much better 
attorneys.  The advantage of this is that you could then distribute your 
latest changes immediately after you get them working.  Another 
disadvantage is that your code will have to decrypt the R code prior to 
running it, which means that your code might still be available to 
anyone clever enough to interrupt your code while it's running.  Thus, 
it's not as secure as writing compiled code, in addition to not having 
as strong a claim to having an existence independent of R.  You could 
also combine this with the first, where your latest release would 
encrypt your latest enhancements while you are working to translate 
those into compiled code.



  Few people with university appointments have to worry about these 
issues, because they get paid for generating new knowledge and sharing 
it wi

Re: [R] How to build a matrix of number of appearance?

2011-07-04 Thread David Winsemius



On Jul 4, 2011, at 5:48 AM, UriB wrote:

> I have a matrix of claims at year1 that I get simply by
>
> claims<-read.csv(file="Claims.csv")
> qq1<-claims[claims$Year=="Y1",]
>
> I have MemberID and ProviderID for every claim in qq1 both are integers
>
> An example for the type of questions that I want to answer is
> how many times ProviderID number 345 appears together with MemberID 23 in
> the table qq1
>
> In order to answer these questions for every possible ProviderId and every
> possible MemberID
> I would like to have a matrix that has first column as memberID when every
> memberID in qq1 appears only once and columns that have number of appearance
> of ProviderID==i for every i that has
> sum(qq1$ProviderID==i)>0
>
> My question is if there is a simple way to do it in R


A really quick way of finding this would be:

as.data.frame ( xtabs(  ~ ProviderID +MemberID, data= qq1) )

--
David Winsemius, MD
West Hartford, CT
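
The xtabs() suggestion above can be tried on a small made-up table. This is
only a sketch: the values in `qq1` below are hypothetical and stand in for the
poster's claims data. It shows the cross-tabulation itself, a lookup of one
member/provider pair, and both the wide (one row per member) and long forms.

```r
# Hypothetical stand-in for the poster's qq1 (claims in year Y1).
qq1 <- data.frame(
  MemberID   = c(23, 23, 23, 41, 41, 57),
  ProviderID = c(345, 345, 900, 345, 900, 900)
)

# Rows are members, columns are providers, cells are claim counts.
counts <- xtabs(~ MemberID + ProviderID, data = qq1)

# Number of claims where ProviderID 345 appears together with MemberID 23;
# the table is indexed by the IDs as character labels.
counts["23", "345"]

# Wide layout: one row per MemberID, one column per ProviderID.
wide <- as.data.frame.matrix(counts)

# Long layout: one row per (MemberID, ProviderID) pair with its Freq.
long <- as.data.frame(counts)
```

Only providers that occur at least once in qq1 get a column, which matches
the sum(qq1$ProviderID==i)>0 condition in the question.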


Re: [R] Protecting R code

2011-07-04 Thread Barry Rowlingson

On Mon, Jul 4, 2011 at 8:47 AM, Vaishali Sadaphal
 wrote:
> Hi All,
>
> I need to give my R code to my client to use. I would like to protect the
> logic/algorithms that have been coded in R. This means that I would not
> like anyone to be able to read the code.

  At some point the R code has to be run. Which means it has to be
read by an interpreter that can handle R code. Which means, unless you
rewrite the interpreter, the R code must exist as such.

 Even if you could compile R into C code into machine code and
distribute a .exe file, it's still possible in theory to
reverse-engineer it and get something like the original back - the
original logic if not the original names of the variables and
functions.

 You could rewrite the interpreter to only run encrypted, signed code
that requires a decryption key, but you still have to give the user
the decryption key at some point in order to get the plaintext code.
Again, it's an obfuscation problem of hiding the key somewhere, and
hence is going to fail.

 It all depends on how much expense you want to go to in order to make
the expense of circumventing your solution more than it's worth. Tell
me how much that is, and I will tell you the solution.

 For total security[1], you need to run the code on servers YOU
control, and only give access via a network API. You can do this with
RServe or any of the HTTP-based systems like Rapache.

Barry

[1] Except of course servers can be hacked or socially-engineered
into. For total security, disconnect your machine from the network and
from any power supply.


Re: [R] Protecting R code

2011-07-04 Thread Uwe Ligges




On 04.07.2011 09:47, Vaishali Sadaphal wrote:
> Hi All,
>
> I need to give my R code to my client to use. I would like to protect the
> logic/algorithms that have been coded in R. This means that I would not
> like anyone to be able to read the code.
>
> I am searching for ways to protect R code. I would like to create a .exe
> kind of file which could be executed without using R or requiring to
> install R. I would not like the R code to be loaded in R. This is so
> because, after R loads a function, if you type the function name on the
> command prompt, you can see the complete code. I would not like to give
> this type of access to the R code.
>
> I explored the option of creating .bat file (using command: R CMD BAT) and
> byte code (using command: compile). These are not useful since they open
> R, load these functions and then the R code is visible.
>
> Is there any other way to protect the R code which would help me package
> all my files/source files and give me an executable file which would be
> run without opening R? Another problem is that R is freely downloadable.
> Is it somehow possible to protect the code from being loaded in R and
> being seen.

H, R is open source software under the GPL (which is infective) and 
designed as such. Good luck it is almost impossible to hide the source 
code in R. And people who tried to generate C based binary packages 
found those can only be used under a small subset of platforms with few 
versions of R.

Since R is distributed under the GPL: When you write code and make it 
available to others, you should be aware of this fact that you may have 
to distribute the sources under GPL as well - under some circumstances 
your lawyer can explain much better than I.

> Thanks
> --
> Vaishali
> =-=-=
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain
> confidential or privileged information.

If it is "confidential or privileged information", you should not send 
it to a mailing list where the archives are published.

Best,
Uwe Ligges







Re: [R] cumulative incidence plot vs survival plot

2011-07-04 Thread rgeskus

Note that most of the nonparametric and semi-parametric competing risks
analyses can be performed within the survival package. This includes
nonparametric estimation of cause-specific cumulative incidence curves and
the log-rank type test. It suffices to create a weighted data set as
explained in Geskus, Biometrics 67, p. 39-49, 2011. 

Ronald Geskus
Academic Medical Center
Amsterdam, the Netherlands

>> Hi, I am wondering if anyone can explain to me if cumulative incidence
>> (CI) is 

> The cumulative incidence curve and the KM are not the same, when there
> are multiple outcomes.  See the "etype" argument to survfit, which is
> used to create CI curves (?survfit.formula).  For testing differences
> between CI curves use the cmprsk library from Gray; it can also draw
> curves, but the survfit routine has a lot more flexibility.


--
View this message in context: 
http://r.789695.n4.nabble.com/cumulative-incidence-plot-vs-survival-plot-tp3628772p3643659.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Unusual graph- modified wind rose perhaps?

2011-07-04 Thread John Kane

Very pretty Thierry,

I was wondering if ggplot2 could do something like it but my knowledge of 
ggplot2 is far too little to attempt it myself.  

I'm going to really have to spend some time on that code.

Thanks

--- On Mon, 7/4/11, ONKELINX, Thierry  wrote:

> From: ONKELINX, Thierry 
> Subject: RE: [R] Unusual graph-  modified wind rose perhaps?
> To: "John Kane" , "r-help@r-project.org" 
> 
> Received: Monday, July 4, 2011, 8:14 AM
> Dear John,
> 
> You can get pretty close with ggplot2.
> 
> Best regards,
> 
> Thierry
> 
> library(ggplot2)
> dataset <- data.frame(Name = LETTERS[1:26])
> dataset$Score <- runif(nrow(dataset))
> dataset$Category <- cut(dataset$Score, breaks = c(-Inf, 0.33, 0.66, Inf),
>                         labels = c("Bad", "Neutral", "Good"))
> # order the names by score so the bars grow around the circle
> dataset$Name <- factor(dataset$Name,
>                        levels = dataset$Name[order(dataset$Score)])
> dataset$Location <- as.numeric(dataset$Name)
> 
> # stat = "identity" plots the Score values instead of row counts
> ggplot(dataset, aes(x = Name, y = Score, fill = Category)) +
>     geom_bar(stat = "identity") + coord_polar()
> 
> 
> # with some extra tweaking: pad both ends with empty bars
> dataset <- rbind(dataset, 
>     data.frame(
>         Location = c(max(dataset$Location) + seq_len(max(dataset$Location) / 2),
>                      min(dataset$Location) - seq_len(max(dataset$Location) / 2)),
>         Name = "",
>         Score = 0,
>         Category = "Good"
>     )
> )
> ggplot(dataset, aes(x = Location, y = Score, fill = Category)) +
>     geom_bar(stat = "identity") +
>     coord_polar(start = pi, direction = -1) +
>     scale_fill_manual(values = c(Good = "green", Neutral = "grey", Bad = "red")) +
>     theme_bw() +
>     scale_x_continuous("", breaks = dataset$Location,
>                        labels = as.character(dataset$Name))
> 
> 
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek
> team Biometrie & Kwaliteitszorg
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> 
> Research Institute for Nature and Forest
> team Biometrics & Quality Assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> 
> tel. + 32 54/436 185
> thierry.onkel...@inbo.be
> www.inbo.be
> 
> To call in the statistician after the experiment is done
> may be no more than asking him to perform a post-mortem
> examination: he may be able to say what the experiment died
> of.
> ~ Sir Ronald Aylmer Fisher
> 
> The plural of anecdote is not data.
> ~ Roger Brinner
> 
> The combination of some data and an aching desire for an
> answer does not ensure that a reasonable answer can be
> extracted from a given body of data.
> ~ John Tukey
>  
> 
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> > On behalf of John Kane
> > Sent: Monday 4 July 2011 13:22
> > To: r-help@r-project.org
> > Subject: [R] Unusual graph- modified wind rose perhaps?
> > 
> > 
> > In an OpenOffice.org forum someone was asking if the spreadsheet could graph
> > this http://www.elmundo.es/elmundosalud/documentos/2011/06/leche.html
> > 
> > I didn't think it could. :)
> > 
> > I don't think I've ever seen exactly this layout. Does anyone know if there is
> > anything in R that does a graph like this or that can be adapted to do it?
> > 
> > Unfortunately my Spanish is non-existent so I am not sure how effective the
> > graph is in achieving whatever it's supposed to do.  A dot chart might be as
> > effective but it is a flashy graphic.
> > 
> > Thanks
> > 

Re: [R] extracting data

2011-07-04 Thread Ana Kolar

Thanks! That works well.

Best,

Ana
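
For readers without MatchIt installed, the row-name approach from Peter
Ehlers quoted further down in this message can be checked on a toy data
frame. This is only a sketch: the data frame `A`, its columns and its row
names are made up, and `B` is taken as an arbitrary subset of rows.

```r
# Hypothetical original data set A with explicit row names.
A <- data.frame(
  id    = 1:6,
  treat = c(1, 0, 1, 0, 0, 1),
  row.names = paste0("r", 1:6)
)

# B: some subset extracted from A (row names are carried over by subsetting).
B <- A[c(1, 3, 4), ]

# Row names present in A but not in B identify the remaining rows.
idx <- setdiff(rownames(A), rownames(B))
C   <- A[idx, ]
print(C)
```

This relies on B keeping the row names it had in A, which is what the
working answer in this thread suggests is the case for match.data() output.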



>
>From: Peter Ehlers 
>To: Ana Kolar 
>Cc: Sarah Goslee ; R 
>Sent: Tuesday, 28 June 2011, 19:37
>Subject: Re: [R] extracting data
>
>On 2011-06-28 09:54, Ana Kolar wrote:
>> Hi Sarah,
>>
>> Thank you for your response. Here is a toy example:
>>
>>
>> library(MatchIt)
>> data(lalonde)
>>
>> A<-lalonde
>> f<-treat ~ age + I(age^2) + educ + I(educ^2) + black + hispan +
>>      married + nodegree + re74 + I(re74^2) + re75 + I(re75^2)
>> m<-"nearest"
>> m.out.base<- matchit(formula=f, data=A, method=m)
>>
>> B<- match.data(m.out.base)
>>
>> An<- nrow(A)
>> Bn<- nrow(B)
>>
>> Cn<- An - Bn
>> C<- ??
>
>Can't you just use
>
>  idx <- setdiff(rownames(A), rownames(B))
>  C <- A[idx, ]
>
>Peter Ehlers
>
>>
>>
>>
>>
>>> 
>>> From: Sarah Goslee
>>> To: Ana Kolar
>>> Cc: R
>>> Sent: Tuesday, 28 June 2011, 18:44
>>> Subject: Re: [R] extracting data
>>>
>>> Hi Ana,
>>>
>>> On Tue, Jun 28, 2011 at 12:28 PM, Ana Kolar  wrote:
>>>> Let's say I have an original data set which is called A and data extracted
>>>> from this original data set, called B. Based on these A and B data set I
>>>> would like to get data set C which includes all the remaining data from
>>>> the data set A after we exclude data of the data set B.
>>>>
>>>> Any idea how to do this?
>>>
>>> Yes. Several.
>>>
>>> But to know which one to suggest, I need to know more about your data.
>>>
>>> How about a toy example, so the list members can see your index
>>> variables, etc? Or how you created the subset B, and why you can't
>>> just use the opposite of that procedure?
>>>
>>> Sarah
>>> --
>>> Sarah Goslee
>>> http://www.functionaldiversity.org
>>>
>>>
>>>
>>     [[alternative HTML version deleted]]
>>
>
>
>
>
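
Peter's suggestion above can be sketched on toy data. The data frame, its row names, and the subset below are made up for illustration; they are not the lalonde/MatchIt objects from the thread. The approach relies on B keeping the row names of the rows it was taken from, which holds for ordinary subsetting; that match.data() does the same is an assumption based on the thread.

```r
# Toy stand-in for the original data set A (hypothetical values)
A <- data.frame(id = 1:6,
                x = c(10, 20, 30, 40, 50, 60),
                row.names = paste0("r", 1:6))

# B is a subset of A; subsetting keeps the original row names
B <- A[c(2, 5), ]

# Row names present in A but not in B
idx <- setdiff(rownames(A), rownames(B))

# C holds the rows of A that were not selected into B
C <- A[idx, ]

print(rownames(C))
```

If the row names were not preserved, a key column shared by A and B would be needed instead.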

Re: [R] writeLines + foreach/doMC

2011-07-04 Thread Mario Valle

Read something about parallel processing and how I/O should be done by a 
single process.
Suggestion: write a different file from each thread then combine the 
results with cat or similar.

Hope it helps
 mario
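
A rough sketch of the suggestion above, under some assumptions: it uses mclapply() from the base 'parallel' package instead of foreach/doMC so that it runs without extra packages, make_fields() is a hypothetical stand-in for gettable1(), and the sequence ids are invented. Each worker writes only to its own part file; a single process then writes the header and concatenates the parts.

```r
library(parallel)

# Hypothetical stand-in for gettable1(): the fields for one sequence id
make_fields <- function(id) c(id, nchar(id), toupper(id))

ids <- c("seqA", "seqBB", "seqCCC", "seqD")   # invented ids
out_dir <- tempfile("parts_")
dir.create(out_dir)

# Forking is not available on Windows, so fall back to one worker there
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L
chunks <- split(ids, rep_len(seq_len(n_workers), length(ids)))

# Each worker builds its lines and writes them to its own file
part_files <- mclapply(seq_along(chunks), function(k) {
  part <- file.path(out_dir, sprintf("part_%03d.txt", k))
  lines <- vapply(chunks[[k]],
                  function(id) paste(make_fields(id), collapse = "\t"),
                  character(1))
  writeLines(lines, part)
  part
}, mc.cores = n_workers)

# A single process writes the header and appends the part files
final <- file.path(out_dir, "combined.txt")
con <- file(final, open = "wt")
writeLines(paste("ReadID", "Len", "Seq", sep = "\t"), con)
for (p in unlist(part_files)) writeLines(readLines(p), con)
close(con)

combined <- readLines(final)
print(combined)
```

The rows come out grouped by worker rather than in the original order; if the order matters, sort afterwards or keep an index column.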

On 04-Jul-11 11:58, Ramzi TEMANNI wrote:

> Hi,
> I'm processing sequencing data, trying to collapse the locations of each
> unique sequence and write the results to a file (storing them in a table
> would require at least 10GB of memory),
> so I wrote a function that, given a sequence id, provides the line to
> be stored:
>
> library(doMC) # load library
> registerDoMC(12) # set the number of CPUs
>
> # open connection
> fileConn<-file(paste(fq_file,"_SeqID.txt",sep=""),open = "at")
> # write header
> writeLines(paste("ReadID","Freq","Seq","LOC_UG","Nb_UG_Seq",sep="\t"), fileConn)
> foreach(i=1:length(uniq.Seq)) %dopar% # for each unique sequence
> {
> # write the results line
> writeLines(paste(gettable1(uniq.Seq[i]),collapse="   "), fileConn)
> }
> close(fileConn)
>
> The code executes fine, but the problem is that some lines are weird.
> The header and a lot of lines are OK:
> ReadIDFreqSeqLOC_UGNb_UG_Seq
> HWI-EA332_0036:5:16:9530:21025#ATGC/1     2
> X_10130:489:+,X_10130:489:+   2
> HWI-EA332_0036:5:117:6674:4940#ATGC/1      1
> X:432:-,X:432:-   2
> HWI-EA332_0036:5:62:15592:7375#ATGC/1      2
> X_22660:253:+,X_22660:253:+   2
> HWI-EA332_0036:5:110:14349:8422#ATGC/1      4
> X_13806:399:+,X_13806:399:+,X_27263:481:+,X_27263:481:+   4
> Others look weird:
> HWI-EA332_0036:5:17:1400ReadIDFreqSeqLOC_UGNb_UG_Seq
> HWI-EA332_0036:5:61:7734:4201ReadIDFreqSeqLOC_UGNb_UG_Seq
> HWI-EA332_0036:5:117:5361:10666#ATGReadIDFreqSeqLOC_UG
> Nb_UG_Seq
> HWI-EA332_0036:5:115:7421:20664#ATGC/1   GATCReadIDFreqSeq
> LOC_UGNb_UG_Seq
> HWI-EA332_0036:5:175:95:-   2
> HWI-EA332_0036:5JCVI_35536:444:+   2
> X   1  X_22484:571:-,X_22484:571:-   2
>
> Is this because one process starts writing before another has finished?
> Is there a way to solve this problem?
> Any suggestions would be greatly appreciated.
> Thanks and have a nice day.
>
>
> Best,
> Ramzi TEMANNI
> http://www.linkedin.com/in/ramzitemanni



--
Ing. Mario Valle
Data Analysis and Visualization Group| http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82


Re: [R] Prevent 'R CMD check' from reporting "NA"/"NA_character_" missmatch?

2011-07-04 Thread Prof Brian Ripley


On Mon, 4 Jul 2011, Johannes Graumann wrote:

> Hello,
>
> I'm writing a package and am running 'R CMD check' on it.
>
> Is there any way to make 'R CMD check' not warn about a mismatch between
> 'NA_character_' (in the function definition) and 'NA' (in the
> documentation)?


Be consistent   Why do you want incorrect documentation of your 
package?  (It is not clear of the circumstances here: normally 1 vs 1L 
and similar are not reported if they are the only errors.)


And please do note the posting guide

- this is not really the correct list
- you were asked to give an actual example with output.
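
For context, a small illustration of why the two spellings are not interchangeable. The function f below is hypothetical, and the sketch only shows that the constants differ in type; that this is what the check was reporting is presumed from the question, since no actual output was posted.

```r
# Hypothetical function whose default is the character-typed missing value
f <- function(x = NA_character_) x

typeof(NA)             # "logical"
typeof(NA_character_)  # "character"

# The two constants are not the same object
identical(NA, NA_character_)  # FALSE

# How the default in the definition is spelled when deparsed
default_text <- deparse(formals(f)$x)
print(default_text)
```

Writing the same default in the usage section of the Rd file as in the function definition keeps the two consistent.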

--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] wavelets

2011-07-04 Thread Mike Marchywka


> From: jdnew...@dcn.davis.ca.us
> Date: Mon, 4 Jul 2011 00:45:41 -0700
> To: tyagi...@gmail.com; r-help@r-project.org
> Subject: Re: [R] wavelets
>
> Study the topic more carefully, I suppose. My understanding is that wavelets 
> do not in themselves compress anything, but because they sort out the 
> interesting data from the uninteresting data, it can be easy to toss the 
> uninteresting data (lossy data compression). Perhaps you should understand 
> better what your Matlab library is doing.
> ---
> Jeff Newmiller The . . Go Live...
> DCN: Basics: ##.#. ##.#. Live Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
>
> user123  wrote:
>
> I'm new to the topic of wavelets. When I tried to use the mra function in the
> wavelets package, the data is not getting compressed, e.g. if the original
> data has 500 values, the output data also has the same.
> However in MATLAB, depending on the level of decomposition, the data gets
> compressed.
> How do I implement this in R?



Can you post some code? You can always compress into one value, of course, by
turning the bytes into a single character string; what you want to look at is
entropy. I posted some example code before, and I remember it took effort to
avoid the subsampling. mra is probably multi-resolution analysis, and I'd
suppose you want all the samples.
You probably need paper and pencil at this point, however.
 




>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/wavelets-tp3642973p3642973.html
> Sent from the R help mailing list archive at Nabble.com.
>

[R] writeLines + foreach/doMC

2011-07-04 Thread Ramzi TEMANNI

Hi,
I'm processing sequencing data, trying to collapse the locations of each
unique sequence and write the results to a file (storing them in a table
would require at least 10GB of memory),
so I wrote a function that, given a sequence id, provides the line to
be stored:

library(doMC) # load library
registerDoMC(12) # set the number of CPUs

# open connection
fileConn<-file(paste(fq_file,"_SeqID.txt",sep=""),open = "at")
# write header
writeLines(paste("ReadID","Freq","Seq","LOC_UG","Nb_UG_Seq",sep="\t"), fileConn)
foreach(i=1:length(uniq.Seq)) %dopar% # for each unique sequence
{
# write the results line
writeLines(paste(gettable1(uniq.Seq[i]),collapse="   "), fileConn)
}
close(fileConn)

The code executes fine, but the problem is that some lines are weird.
The header and a lot of lines are OK:
ReadIDFreqSeqLOC_UGNb_UG_Seq
HWI-EA332_0036:5:16:9530:21025#ATGC/1     2
X_10130:489:+,X_10130:489:+   2
HWI-EA332_0036:5:117:6674:4940#ATGC/1      1
X:432:-,X:432:-   2
HWI-EA332_0036:5:62:15592:7375#ATGC/1      2
X_22660:253:+,X_22660:253:+   2
HWI-EA332_0036:5:110:14349:8422#ATGC/1      4
X_13806:399:+,X_13806:399:+,X_27263:481:+,X_27263:481:+   4
Others look weird:
HWI-EA332_0036:5:17:1400ReadIDFreqSeqLOC_UGNb_UG_Seq
HWI-EA332_0036:5:61:7734:4201ReadIDFreqSeqLOC_UGNb_UG_Seq
HWI-EA332_0036:5:117:5361:10666#ATGReadIDFreqSeqLOC_UG
Nb_UG_Seq
HWI-EA332_0036:5:115:7421:20664#ATGC/1   GATCReadIDFreqSeq
LOC_UGNb_UG_Seq
HWI-EA332_0036:5:175:95:-   2
HWI-EA332_0036:5JCVI_35536:444:+   2
X   1  X_22484:571:-,X_22484:571:-   2

Is this because one process starts writing before another has finished?
Is there a way to solve this problem?
Any suggestions would be greatly appreciated.
Thanks and have a nice day.


Best,
Ramzi TEMANNI
http://www.linkedin.com/in/ramzitemanni


[R] How to build a matrix of number of appearance?

2011-07-04 Thread UriB

I have a matrix of claims at year 1 that I get simply by

claims<-read.csv(file="Claims.csv")
qq1<-claims[claims$Year=="Y1",]

I have a MemberID and a ProviderID for every claim in qq1; both are integers.

An example of the type of question that I want to answer is:
how many times does ProviderID 345 appear together with MemberID 23 in
the table qq1?

To answer this for every possible ProviderID and every possible MemberID,
I would like a matrix whose first column is MemberID, with every
MemberID in qq1 appearing only once, and whose remaining columns give the
number of appearances of ProviderID==i for every i that has
sum(qq1$ProviderID==i)>0

Is there a simple way to do this in R?
Thanks in advance,

Uri

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-build-a-matrix-of-number-of-appearance-tp3643248p3643248.html
Sent from the R help mailing list archive at Nabble.com.
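
One possible approach to the question above, sketched with base R's table(). The claims data below are invented for illustration; only the column names MemberID and ProviderID come from the post.

```r
# Invented stand-in for qq1 (one row per claim)
qq1 <- data.frame(MemberID   = c(23, 23, 23, 7, 7),
                  ProviderID = c(345, 345, 12, 345, 99))

# Cross-tabulate: one row per MemberID, one column per ProviderID that occurs
counts <- table(MemberID = qq1$MemberID, ProviderID = qq1$ProviderID)

# How often ProviderID 345 appears together with MemberID 23
n_23_345 <- counts["23", "345"]
print(n_23_345)

# The same counts as a data frame with MemberID as the first column
wide <- as.data.frame.matrix(counts)
wide <- cbind(MemberID = rownames(wide), wide)
print(wide)
```

Only ProviderIDs that actually occur get a column, which matches the sum(qq1$ProviderID==i)>0 condition. With very many members and providers the full table can get large.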


[R] Protecting R code

2011-07-04 Thread Vaishali Sadaphal

Hi All,

I need to give my R code to my client to use. I would like to protect the 
logic/algorithms that have been coded in R. This means that I would not 
like anyone to be able to read the code.

I am searching for ways to protect R code. I would like to create a .exe 
kind of file which could be executed without using R or requiring to 
install R. I would not like the R code to be loaded in R. This is so 
because, after R loads a function, if you type the function name on the 
command prompt, you can see the complete code. I would not like to give 
this type of access to the R code.

I explored the option of creating .bat file (using command: R CMD BAT) and 
byte code (using command: compile). These are not useful since they open 
R, load these functions and then the R code is visible.

Is there any other way to protect the R code which would help me package 
all my files/source files and give me an executable file which would be 
run without opening R? Another problem is that R is freely downloadable. 
Is it somehow possible to protect the code from being loaded in R and 
being seen. 

Thanks
--
Vaishali

[R] loop in optim

2011-07-04 Thread EdBo

Hi

Could you help me correct my loop?

I want optim to estimate al_j, au_j, sigma_j and b_j using data points 0 to 20,
21 to 40, and 41 to 60 in turn.

The final result should have 4 columns, one for each of the estimates, and 4 rows,
one for each of 0 to 20, 21 to 40, 41 to 60.

### My code is

n=20
runs=4
out=matrix(0,nrow=runs)

llik = function(x) 
   { 
al_j=x[1]; au_j=x[2]; sigma_j=x[3];  b_j=x[4]
sum(na.rm=T,
ifelse(a$R_j< 0, -log(1/(2*pi*(sigma_j^2)))-
   (1/(2*(sigma_j^2))*(a$R_j+al_j-b_j*a$R_m))^2, 
 ifelse(a$R_j>0 , -log(1/(2*pi*(sigma_j^2)))-
   (1/(2*(sigma_j^2))*(a$R_j+au_j-b_j*a$R_m))^2,
 
-log(pnorm(au_j,mean=b_j*a$R_m,sd=sqrt(sigma_j^2))-
   pnorm(au_j,mean=b_j*a$R_m,sd=sqrt(sigma_j^2)))))

   )

   } 

start.par = c(0, 0, 0.01, 1) 
out1 = optim(llik, par=start.par, method="Nelder-Mead")


for (i in 1: runs)
{
 index_start=20*(i-1)+1
 index_end= 20*i
 out[i]=out1[index_start:index_end]
}
out


Thank you in advance

Edward
UCT
My data

R_j           R_m
-0.0625       0.002320654
0             -0.004642807
0.0           0.005936332
0.032258065   0.001060848
0             0.007114057
0.015625      0.005581558
0             0.002974794
0.015384615   0.004215271
0.060606061   0.005073116
0.028571429   -0.006001279
0             -0.002789594
0.01389       0.00770633
0             0.000371663
0.02739726    -0.004224228
-0.04         0.008362539
0             -0.010951605
0             0.004682924
0.01389       0.011839993
-0.01369863   0.004210383
-0.02778      -0.04658949
0             0.00987272
-0.057142857  -0.062203157
-0.03030303   -0.119177639
0.09375       0.077054642
0             -0.022763619
-0.057142857  0.050408775
0             0.024706076
-0.03030303   0.004043701
0.0625        0.004951088
0             -0.005968731
0             -0.038292548
0             0.013381097
0.014705882   0.006424728
-0.014492754  -0.020115626
0             -0.004837891
-0.029411765  -0.022054654
0.03030303    0.008936428
0.044117647   8.16925E-05
0             -0.004827246
-0.042253521  0.004653096
-0.014705882  -0.004222151
0.029850746   0.000107267
-0.028985507  -0.001783206
0.029850746   -0.006372981
0.014492754   0.005492374
-0.028571429  -0.009005846
0             0.001031683
0.044117647   0.002800551

--
View this message in context: 
http://r.789695.n4.nabble.com/loop-in-optim-tp3643230p3643230.html
Sent from the R help mailing list archive at Nabble.com.
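
A sketch of one way the loop in the post above could be arranged: call optim() once per block of rows and store the fitted parameters in one row of the result matrix, instead of indexing into a single optim() result. The objective used here is a plain Gaussian regression negative log-likelihood on simulated data, purely as a placeholder; it is not the poster's likelihood, and negll, its three parameters, and the simulated values are all assumptions.

```r
set.seed(42)

# Simulated stand-in for the poster's data frame `a` (60 rows)
a <- data.frame(R_m = rnorm(60, sd = 0.02))
a$R_j <- 0.001 + 1.2 * a$R_m + rnorm(60, sd = 0.01)

# Placeholder objective: negative log-likelihood of R_j ~ N(alpha + beta * R_m, sigma^2)
# par = c(alpha, beta, log(sigma)); the data block is passed in explicitly
negll <- function(par, d) {
  mu <- par[1] + par[2] * d$R_m
  sigma <- exp(par[3])
  -sum(dnorm(d$R_j, mean = mu, sd = sigma, log = TRUE))
}

n <- 20                      # rows per block
runs <- nrow(a) %/% n        # number of complete blocks
out <- matrix(NA_real_, nrow = runs, ncol = 3,
              dimnames = list(paste0("rows_", n * (seq_len(runs) - 1) + 1,
                                     "_", n * seq_len(runs)),
                              c("alpha", "beta", "log_sigma")))

for (i in seq_len(runs)) {
  idx <- (n * (i - 1) + 1):(n * i)          # rows 1-20, 21-40, 41-60
  fit <- optim(par = c(0, 1, log(0.01)), fn = negll, d = a[idx, ],
               method = "Nelder-Mead")
  out[i, ] <- fit$par                       # one row of estimates per block
}

print(out)
```

The key changes relative to the post are that the objective takes the data block as an argument, optim() is called inside the loop, and the result matrix has one column per parameter. It is also worth checking fit$convergence for each block before trusting the estimates.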


Re: [R] Copying to R a rectangular array from a Java class

2011-07-04 Thread Sanketh

Hi, can you please tell me how to retrieve a two-dimensional String array,
as with sapply?

--
View this message in context: 
http://r.789695.n4.nabble.com/Copying-to-R-a-rectangular-array-from-a-Java-class-tp3486167p3643223.html
Sent from the R help mailing list archive at Nabble.com.


[R] Prevent 'R CMD check' from reporting "NA"/"NA_character_" missmatch?

2011-07-04 Thread Johannes Graumann

Hello,

I'm writing a package and am running 'R CMD check' on it.

Is there any way to make 'R CMD check' not warn about a mismatch between 
'NA_character_' (in the function definition) and 'NA' (in the 
documentation)?

Thanks for any help.

Sincerely, Joh


Re: [R] Unusual graph- modified wind rose perhaps?

2011-07-04 Thread Jim Lemon


On 07/04/2011 09:21 PM, John Kane wrote:
>
> In an OpenOffice.org forum someone was asking if the spreadsheet could graph
> this http://www.elmundo.es/elmundosalud/documentos/2011/06/leche.html
>
> I didn't think it could. :)
>
> I don't think I've ever seen exactly this layout. Does anyone know if there is
> anything in R that does a graph like this or that can be adapted to do it?
>
> Unfortunately my Spanish is non-existent so I am not sure how effective the
> graph is in achieving whatever it's supposed to do.  A dot chart might be as
> effective but it is a flashy graphic.


Hi John,
It's a bit like the function I am working on after the request by 
Patrick Jemison (radial.pie), and I thought the interactive fill when 
you sweep the pointer over it is pretty neat. I don't think I'll go that 
far. I'll let you know when I've got a working function.


Jim


Re: [R] Unusual graph- modified wind rose perhaps?

2011-07-04 Thread ONKELINX, Thierry

Dear John,

You can get pretty close with ggplot2.

Best regards,

Thierry

library(ggplot2)
dataset <- data.frame(Name = LETTERS[1:26])
dataset$Score <- runif(nrow(dataset))
dataset$Category <- cut(dataset$Score, breaks = c(-Inf, 0.33, 0.66, Inf),
                        labels = c("Bad", "Neutral", "Good"))
# order the names by score so the bars grow around the circle
dataset$Name <- factor(dataset$Name,
                       levels = dataset$Name[order(dataset$Score)])
dataset$Location <- as.numeric(dataset$Name)

# stat = "identity" plots the Score values themselves rather than counts
ggplot(dataset, aes(x = Name, y = Score, fill = Category)) +
  geom_bar(stat = "identity") +
  coord_polar()


# with some extra tweaking: pad with empty bars so the data fill half the circle
n <- max(dataset$Location)
dataset <- rbind(dataset,
  data.frame(
    Location = c(n + seq_len(n / 2), min(dataset$Location) - seq_len(n / 2)),
    Name = "",
    Score = 0,
    Category = "Good"
  )
)
ggplot(dataset, aes(x = Location, y = Score, fill = Category)) +
  geom_bar(stat = "identity") +
  coord_polar(start = pi, direction = -1) +
  # map each category to a colour
  scale_fill_manual(values = c(Good = "green", Neutral = "grey", Bad = "red")) +
  theme_bw() +
  scale_x_continuous("", breaks = dataset$Location,
                     labels = as.character(dataset$Name))


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey


> -----Original message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On behalf of John Kane
> Sent: Monday, 4 July 2011 13:22
> To: r-help@r-project.org
> Subject: [R] Unusual graph- modified wind rose perhaps?
> 
> 
> In an OpenOffice.org forum someone was asking if the spreadsheet could graph
> this http://www.elmundo.es/elmundosalud/documentos/2011/06/leche.html
> 
> I didn't think it could. :)
> 
> I don't think I've ever seen exactly this layout. Does anyone know if there is
> anything in R that does a graph like this or that can be adapted to do it?
> 
> Unfortunately my Spanish is non-existent so I am not sure how effective the
> graph is in achieving whatever it's supposed to do.  A dot chart might be as
> effective but it is a flashy graphic.
> 
> Thanks
> 

Re: [R] Wrong environment when evaluating and expression?

2011-07-04 Thread Gabor Grothendieck

On Mon, Jul 4, 2011 at 4:11 AM, Joshua Wiley  wrote:
> Hi All,
>
> I have constructed two expressions (e1 & e2).  I can see that they are
> not identical, but I cannot figure out how they differ.
>
> ###
> dat <- mtcars
> e1 <- expression(with(data = dat, lm(mpg ~ hp)))
> e2 <- as.expression(substitute(with(data = dat, lm(f)), list(f = mpg ~ hp)))
>
> str(e1)
> str(e2)
> all.equal(e1, e2)
> identical(e1, e2) # false
>
> eval(e1)
> eval(e2)
> 
>
> The context is trying to use a list of formulae to generate several
> models from a multiply imputed dataset.  The package I am using (mice)
> has methods for with() and that is how I can (easily) get the pooled
> results.  Passing the formula directly does not work, so I was trying
> to generate the entire call and evaluate it as if I had typed it at
> the console, but I am missing something (probably rather silly).
>

In e1, mpg ~ hp is a call object, but in e2 it's a formula with an environment:

> e1[[1]][[3]][[2]]
mpg ~ hp
> e2[[1]][[3]][[2]]
mpg ~ hp
>
> class(e1[[1]][[3]][[2]])
[1] "call"
> class(e2[[1]][[3]][[2]])
[1] "formula"
>
> environment(e2[[1]][[3]][[2]])
<environment: R_GlobalEnv>

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
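
The difference described above can be reproduced, along with one possible workaround, in a short sketch. The e3 variant is an addition for illustration and was not part of the thread: quoting the formula before handing it to substitute() inserts the unevaluated call rather than a formula object carrying its own environment.

```r
dat <- mtcars

e1 <- expression(with(data = dat, lm(mpg ~ hp)))

# list(f = mpg ~ hp) evaluates the formula, so a formula object is inserted
e2 <- as.expression(substitute(with(data = dat, lm(f)), list(f = mpg ~ hp)))

# quote() keeps the formula as an unevaluated call, as in e1
e3 <- as.expression(substitute(with(data = dat, lm(f)),
                               list(f = quote(mpg ~ hp))))

cls <- c(e1 = class(e1[[1]][[3]][[2]]),
         e2 = class(e2[[1]][[3]][[2]]),
         e3 = class(e3[[1]][[3]][[2]]))
print(cls)

# e1 and e3 evaluate to the same fit
fit1 <- eval(e1[[1]])
fit3 <- eval(e3[[1]])
same_fit <- isTRUE(all.equal(coef(fit1), coef(fit3)))
print(same_fit)
```

When the formulae are stored in a list, the same idea applies if each element is kept as a quoted call; whether this is enough for the with() methods in mice has not been checked here.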


[R] Unusual graph- modified wind rose perhaps?

2011-07-04 Thread John Kane


In an OpenOffice.org forum someone was asking if the spreadsheet could graph 
this http://www.elmundo.es/elmundosalud/documentos/2011/06/leche.html 

I didn't think it could. :)   

I don't think I've ever seen exactly this layout. Does anyone know if there is 
anything in R that does a graph like this or that can be adapted to do it?

Unfortunately my Spanish is non-existent so I am not sure how effective the 
graph is in achieving whatever it's supposed to do.  A dot chart might be as 
effective but it is a flashy graphic.

Thanks


[R] [R-pkgs] rgdal 0.7-1 release

2011-07-04 Thread Roger Bivand

A new release of rgdal, a package providing bindings for the Geospatial 
Data Abstraction Library for reading and writing spatial data, has reached 
CRAN.


This release changes the error handling mechanisms, and is more fully 
described in a posting on R-sig-geo:


https://stat.ethz.ch/pipermail/r-sig-geo/2011-July/012126.html

If any users observe unexpected behaviour following update, please revert 
to the 0.6-* series, and report with full details to the package 
maintainer. Extensive checking has been carried out, and no unexpected 
behaviour observed, but it is not feasible to check all possible use 
cases, especially erroneous use cases, hence this message.


--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: roger.biv...@nhh.no

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages


[R] Wrong environment when evaluating and expression?

2011-07-04 Thread Joshua Wiley

Hi All,

I have constructed two expressions (e1 & e2).  I can see that they are
not identical, but I cannot figure out how they differ.

###
dat <- mtcars
e1 <- expression(with(data = dat, lm(mpg ~ hp)))
e2 <- as.expression(substitute(with(data = dat, lm(f)), list(f = mpg ~ hp)))

str(e1)
str(e2)
all.equal(e1, e2)
identical(e1, e2) # false

eval(e1)
eval(e2)


The context is trying to use a list of formulae to generate several
models from a multiply imputed dataset.  The package I am using (mice)
has methods for with() and that is how I can (easily) get the pooled
results.  Passing the formula directly does not work, so I was trying
to generate the entire call and evaluate it as if I had typed it at
the console, but I am missing something (probably rather silly).

Thanks,

Josh


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] RWinEdt problem

2011-07-04 Thread Uwe Ligges




On 04.07.2011 06:56, Simon Knapp wrote:
> ... I just tried fiddling with the appearance settings, and when I uncheck
> "Custom Colors" under "Document Tabs", the file names reappear, though I
> don't get the coloring I am used to (red for modified, green for
> unmodified).

Right. That is fixed in RWinEdt 1.8-3 (i.e. custom coloring disabled) 
which has been uploaded to CRAN during the weekend. A Windows binary 
will be created shortly.

Uwe

> Thanks again,
> Simon Knapp
>
> On Mon, Jul 4, 2011 at 2:48 PM, Simon Knapp  wrote:
>
>> Hi R Helpers,
>>
>> I am a long time RWinEdt user and have just acquired a new laptop. I have
>> installed RWinEdt and things are going smoothly except for one small glitch
>> - file names are not appearing on the document tabs. When I use WinEdt (as
>> opposed to RWinEdt), they are appearing. Can anyone offer any advice on
>> this?
>>
>> Thanks in advance,
>> Simon Knapp
>>
>> OS: windows7
>> Arch: 64 bit
>> R version: 2.13.0 (2011-04-13)
>> WinEdt version: 5.14 (build 20050701)




Re: [R] wavelets

2011-07-04 Thread Jeff Newmiller

Study the topic more carefully, I suppose. My understanding is that wavelets do 
not in themselves compress anything, but because they sort out the interesting 
data from the uninteresting data, it can be easy to toss the uninteresting data 
(lossy data compression). Perhaps you should understand better what your Matlab 
library is doing.
---
Jeff Newmiller The . . Go Live...
DCN: Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
--- 
Sent from my phone. Please excuse my brevity.
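
To make the point above concrete, here is a small base-R sketch using a hand-written one-level Haar transform. It does not use the wavelets package, and haar_step() and haar_inverse() are illustrative helpers, not that package's API. It shows that the transform itself yields half as many approximation coefficients (the "compressed" view), while a multi-resolution style reconstruction maps them back to a series of the original length, which would explain output as long as the input.

```r
# One level of the Haar transform: n values in, n/2 approximation
# coefficients and n/2 detail coefficients out
haar_step <- function(x) {
  stopifnot(length(x) %% 2 == 0)
  odd  <- x[seq(1, length(x), by = 2)]
  even <- x[seq(2, length(x), by = 2)]
  list(approx = (odd + even) / sqrt(2),
       detail = (odd - even) / sqrt(2))
}

# Inverse of haar_step(): rebuilds a full-length series
haar_inverse <- function(approx, detail) {
  x <- numeric(2 * length(approx))
  x[seq(1, length(x), by = 2)] <- (approx + detail) / sqrt(2)
  x[seq(2, length(x), by = 2)] <- (approx - detail) / sqrt(2)
  x
}

set.seed(1)
x <- cumsum(rnorm(500))          # 500 values, as in the question
lvl1 <- haar_step(x)

# Full-length smooth and detail components, in the spirit of an MRA
zeros   <- numeric(length(lvl1$approx))
smooth1 <- haar_inverse(lvl1$approx, zeros)
detail1 <- haar_inverse(zeros, lvl1$detail)

sizes <- c(original = length(x),
           coefficients = length(lvl1$approx),
           smooth = length(smooth1))
print(sizes)

# The two components add back up to the original series
exact <- isTRUE(all.equal(smooth1 + detail1, x))
print(exact)
```

Keeping only lvl1$approx (and discarding the details) is the lossy step that actually reduces the amount of data.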

user123  wrote:

> I'm new to the topic of wavelets. When I tried to use the mra function in the
> wavelets package, the data is not getting compressed, e.g. if the original
> data has 500 values, the output data also has the same.
> However in MATLAB, depending on the level of decomposition, the data gets
> compressed.
> How do I implement this in R?
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/wavelets-tp3642973p3642973.html
> Sent from the R help mailing list archive at Nabble.com.


Re: [R] Poisson GLM with a logged dependent variable...just asking for trouble?

2011-07-04 Thread ONKELINX, Thierry

Dear Mark,

I think you want glm(DV ~ log10(IV), family=poisson)
Note that the poisson family uses the log-link by default. Hence you don't need 
to log-transform DV yourself.

Best regards,

Thierry
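
A short simulated illustration of the advice above. The data, sample size, and coefficient values are invented; only the model formula comes from the reply. The counts stay on their original scale and the log enters through the link function.

```r
set.seed(123)

# Invented data: a positive non-integer covariate and integer counts
IV <- runif(200, min = 1, max = 1000)
DV <- rpois(200, lambda = exp(0.5 + 0.8 * log10(IV)))

# Poisson GLM on the untransformed counts; the log link is the default
fit <- glm(DV ~ log10(IV), family = poisson)

link_name <- family(fit)$link
print(link_name)
print(coef(fit))

# Fitted means are exp() of the linear predictor, i.e. the log is in the link
consistent <- isTRUE(all.equal(unname(fitted(fit)),
                               unname(exp(predict(fit, type = "link")))))
print(consistent)
```

Because DV is left as integer counts, the non-integer warning from the original post does not arise.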


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

> -----Original message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On behalf of Mark Na
> Sent: Friday, 1 July 2011 23:10
> To: r-help@r-project.org
> Subject: [R] Poisson GLM with a logged dependent variable...just asking for
> trouble?
> 
> Dear R-helpers,
> 
> I'm using a GLM with poisson errors to model integer count data as a function
> of one non-integer covariate.
> 
> The model formula is: log(DV) ~ glm(log(IV,10),family=poisson).
> 
> I'm getting a warning because the logged DV is no longer an integer.
> 
> I have three questions:
> 
> 1) Can I ignore the warning, or is logging the DV (resulting in
> non-integers) a serious violation of the Poisson error structure?
> 
> 2) If the answer to #1 is "no, don't ignore it, it's serious" then can I use a
> quasipoisson error structure instead (does not give the same warning) and if
> so are there any pitfalls to using the quasipoisson model? Are there any
> better alternatives for count data where the counts must be logged? Or,
> should I just abandon logging the DV? In that case, how could I compare the
> fit of a Poisson model (without logging the DV) to that of a GLM with normal
> errors (with a logged DV). AIC would not be valid because the DVs are
> different, right?
> 
> 3) The quasipoisson model doesn't return an AIC value. Why, and is there
> anything I can do to calculate AIC manually, that would allow me to compare
> this model to other models?
> 
> Many thanks in advance for your help!
> 
> Cheers, Mark
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] superimposing different plot types in lattice panel.superpose

2011-07-04 Thread Dennis Murphy

Hi:

Here's one way out, if I read your intention properly:

library(lattice)
library(latticeExtra)   # supplies the "+" method for trellis objects

plot1 <- xyplot(y ~ x, data = modelresults, groups = model, type = 'l',
                xlim = c(0, 10), ylim = c(0.5, 3.5))
plot2 <- xyplot(y ~ x, data = data, groups = model, type = 'p',
                pch = 16, xlim = c(0, 10), ylim = c(0.5, 3.5))
plot1 + plot2

You need latticeExtra to combine the plots with "+"; I had it loaded
with lattice when I did the plot.
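
If a single xyplot() call is preferred over layering two plots, one possible
sketch is to build a six-level grouping factor from 'which' and 'model' and
let lattice assign one plot type per group via distribute.type = TRUE. The
names obs, grp and cols are made up for this illustration, the colours are
arbitrary, and the data are the toy data from the original post.

```r
library(lattice)

## Toy "model results" and "data" from the original post
modelresults <- data.frame(x = 1:9, y = rep(c(1, 3, 1), 3),
                           model = rep(c("a", "b", "c"), each = 3))
obs <- data.frame(x = c(2, 5, 8), y = rep(1.5, 3),
                  model = c("a", "b", "c"))

## make.groups() stacks the data frames and adds a factor 'which'
combined <- make.groups(modelresults = modelresults, data = obs)

## One group per (which, model) pair; 'which' varies fastest, so the levels
## alternate: modelresults.a, data.a, modelresults.b, data.b, ...
combined$grp <- interaction(combined$which, combined$model)

cols <- c("blue", "red", "darkgreen")   # one colour per model (arbitrary)

p <- xyplot(y ~ x, data = combined, groups = grp,
            type = rep(c("l", "p"), times = 3),  # line, point, line, point, ...
            distribute.type = TRUE,              # one type per group
            col = rep(cols, each = 2),           # same colour within a model
            pch = 16, xlim = c(0, 10), ylim = c(0.5, 3.5))
print(p)
```

This relies on the level order of the interaction, so it is worth checking
levels(combined$grp) before trusting the colours and types.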

HTH,
Dennis

On Sun, Jul 3, 2011 at 6:52 PM, genghis wrote:
> I would like to plot 3 best-fit models in a single panel of a lattice plot,
> superimposed on 3 corresponding datasets in the same panel.  My goal is to
> show the models as lines of 3 different colors, and the data as points whose
> colors correspond to the model colors.  In essence, I have two levels of
> grouping: 1) model vs. data, and 2) model number.  Since there is only one
> “groups” variable, I have tried to deal with the additional grouping level
> by subsetting the data inside a custom panel function (basically a hack),
> but something about the way parameters are passed in panel.superpose (I
> think) is making it hard to show both points and lines.
>
> My question is very similar to a previous post:
> http://www.ask.com/web?q=r%20panel.superpose%20bwplot%20sim%20actual&o=15527&l=dis&prt=NIS&chn=retail&geo=US&ver=18
> , but the questioner in that case was using bwplot, which automatically
> makes a separate plot for every level of the categorical variable, so they
> didn’t face the two-level grouping problem, and I have been unable to figure
> out how to adapt their answer.  Another approach I tried was to put my model
> function inside of my custom panel function so that the analysis occurred
> there, but I couldn’t get it to subset the x and y data appropriately.
>
> In the toy problem below I want to plot each inverted V (“model”) as LINES,
> with a single POINT (“data”) in the center of each inverted V, the same
> color as the inverted V.  The code runs, but I can’t seem to mix lines and
> points.  In my real problem (stable isotopes with ellipses superimposed on
> data) there will be additional panels but I am creating only one panel here,
> for simplicity.
>
> #generate test "model results"
> x<-1:9
> y<-rep(c(1,3,1),3)
> model<-c(rep("a",3),rep("b",3),rep("c",3))
> modelresults<-data.frame(x,y,model)
>
> #generate test "data"
> x<-c(2,5,8)
> y<-rep(1.5,3)
> model<-c("a","b","c")
> data<-data.frame(x,y,model)
>
> #combine them into one data set
> combined<-make.groups(modelresults,data)
>
> #custom panel function
> panel.dualplot <- function(...) {
>    panel.xyplot(x[which="modelresults"],y[which="modelresults"],...)
>    panel.points(x[which="data"],y[which="data"],...)
> }
>
> #main call to xyplot
> xyplot(y ~ x, data=combined,
> type = "l",
> panel = panel.superpose,
> groups = model,
> panel.groups = panel.dualplot,
> )
>
>
> I’d be very grateful for any suggestions.
>
> Thanks!
>
> John
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/superimposing-different-plot-types-in-lattice-panel-superpose-tp3642808p3642808.html
> Sent from the R help mailing list archive at Nabble.com.
