Re: [R] How to apply a function to subsets of a data frame *and* obtain a data frame again?

2011-08-17 Thread Nick Sabbe
You might want to look at package plyr and use ddply.
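
As a rough sketch of what that could look like for the data in your mail (the column name edf and the anonymous function are my own choices, not something plyr prescribes): ddply splits df by Group, applies the function to each piece and binds the pieces back into one data frame. The ave() line is a base R alternative that needs no extra package and keeps the original row order.

```r
set.seed(1)
df <- data.frame(Group = rep(c("Group1", "Group2", "Group3"), each = 10),
                 Value = c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30, 30), ]
edf <- function(x) ecdf(x)(x)  # empirical distribution function evaluated at x

# plyr (only run if the package is installed): split by Group, add the
# column to each piece, bind the pieces together. The result is sorted by Group.
if (requireNamespace("plyr", quietly = TRUE)) {
  df2 <- plyr::ddply(df, "Group", function(piece) cbind(piece, edf = edf(piece$Value)))
  print(head(df2))
}

# Base R: ave() applies edf within each Group and returns the values in the
# original row order, so they can be added as a column directly.
df$edf <- ave(df$Value, df$Group, FUN = edf)
print(head(df))
```

Which of the two is nicer is mostly a matter of taste; ddply becomes more attractive once the per-group function returns more than one column.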

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Marius Hofert
 Sent: Wednesday 17 August 2011 12:42
 To: Help R
 Subject: [R] How to apply a function to subsets of a data frame *and*
 obtain a data frame again?
 
 Dear all,
 
 First, let's create some data to play around:
 
 set.seed(1)
 (df <- data.frame(Group=rep(c("Group1","Group2","Group3"), each=10),
                   Value=c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30,30),])
 
 ## Now we need the empirical distribution function:
 edf <- function(x) ecdf(x)(x) # empirical distribution function evaluated at x
 
 ## The big question is how one can apply the empirical distribution function to
 ## each subset of df determined by Group, so how to apply it to Group1, then
 ## to Group2, and finally to Group3. You might suggest (?) to use tapply:
 
 (edf. <- tapply(df$Value, df$Group, FUN=edf))
 
 ## That's correct. But typically, one would like to obtain not only the values,
 ## but a data.frame containing the original information and the new (edf-)values.
 ## What's a simple way to get this? (one would be required to first sort df
 ## according to Group, then paste the values computed by edf to the sorted df;
 ## seems a bit tedious).
 ## A solution I have is the following (but I would like to know if there is a
 ## simpler one):
 
 (edf.. <- do.call(rbind, lapply(unique(df$Group), function(strg){
     subdata <- subset(df, Group==strg) # sub-data
     subdata <- cbind(subdata, edf=edf(subdata$Value))
 })) )
 
 
 Cheers,
 
 Marius
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] glmnet

2011-08-10 Thread Nick Sabbe
Hi Andra.

I wonder how you come about trying to use LASSO without knowing what lambda
is. I'd advise you to read up on it. In the help (?glmnet) you can find
several paper references, but for a more gentle introduction, you can read
http://www-stat.stanford.edu/~tibs/ElemStatLearn/

In a nutshell, though: lambda is the parameter that balances the weight
given to the penalty. The bigger this one is, the more 'pressure' there is
on the coefficients to be small (or better yet: disappear).
The way you use LASSO is: you look at a reasonable set of lambda values
(this is e.g. done by glmnet), calculate some measure of success with each
lambda value (e.g.: misclassification, AUC,...), generally by using
crossvalidation (as is provided by cv.glmnet: read its help).

Having this measure of success (say the AUC) for each lambda in your
reasonable set allows you to pick the most optimal (lambda.min) or, to avoid
happenstance peaks, a more conservative and parsimonious one (lambda.1se),
after which you can rerun your lasso with this selected lambda on the full
dataset, to find the variables in your model.

Finally, to avoid downward bias, you could run a normal glm with only the
variables selected in the previous step.
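
To make the steps above a bit more concrete, here is a minimal sketch on simulated data (the data, the binomial family and the choice of 5 folds are assumptions for the sake of the example, not something taken from your problem). It cross-validates over the lambda sequence, looks at lambda.min and lambda.1se, reads off the nonzero coefficients and then refits an ordinary glm on the selected variables.

```r
if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(42)
  n <- 200; p <- 20
  x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
  y <- rbinom(n, 1, plogis(2 * x[, 1] - 2 * x[, 2]))  # only V1 and V2 matter

  # cross-validate over glmnet's own lambda sequence, scoring each lambda by AUC
  cvfit <- glmnet::cv.glmnet(x, y, family = "binomial",
                             type.measure = "auc", nfolds = 5)
  print(c(lambda.min = cvfit$lambda.min, lambda.1se = cvfit$lambda.1se))

  # coefficients at the more conservative lambda; a zero means "left out"
  b <- as.matrix(coef(cvfit, s = "lambda.1se"))
  selected <- setdiff(rownames(b)[b[, 1] != 0], "(Intercept)")
  print(selected)

  # optional unpenalised refit using only the selected variables
  if (length(selected) > 0) {
    refit <- glm(y ~ ., family = binomial,
                 data = data.frame(y = y, x[, selected, drop = FALSE]))
    print(coef(refit))
  }
}
```

plot(cvfit) shows the cross-validated measure against log(lambda), which is usually the quickest way to get a feeling for what lambda does.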

Good luck!


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Andra Isan
 Sent: Wednesday 10 August 2011 5:59
 To: r-help@r-project.org
 Subject: [R] glmnet
 
 Hi All,
 I have been trying to use the glmnet package to do LASSO linear regression.
 My x data is a matrix n_row by n_col and y is a vector of size n_row
 corresponding to the vector data. The number of n_col is much
 larger than the number of n_row. I do the following:
 
 fits = glmnet(x, y, family="multinomial")
 
 I have been following this article:
 http://cran.r-project.org/web/packages/glmnet/glmnet.pdf page 8,
 but there are some unclear parts that I don't understand. The lambda
 variable only returns 100 and I don't know exactly what lambda
 represents. So, basically I would like to know how to get the
 coefficient weights and what exactly lambda is. How can I see the
 difference between predicted values and observed values?
 If there is sample code that helps me to understand how to use these,
 that would be great.
 Thanks a lot,
 Andra
 


Re: [R] counting columns that fulfill specific criteria

2011-06-24 Thread Nick Sabbe
Hello Paul.

You could try something like
perc <- apply(pwdiff, 1, function(currow){
    mean(abs(currow) > t, na.rm=TRUE)*100
})

I haven't tested this, as you did not provide a sample pwdiff. You should
probably check ?apply for more info.

Two suggestions: probably best not to name any variable t, as this is also
the function for transposing a matrix, and could end up being confusing at
the least. Second: for most practical purposes, it's better to leave out the
*100.
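
A small self-contained sketch of the same idea, on a made-up matrix since I don't have yours (the name thr for the threshold and the simulated values are assumptions). It leaves out the *100 and shows that rowMeans gives the same answer without an explicit function.

```r
set.seed(1)
pwdiff <- matrix(rnorm(48 * 780), nrow = 48)  # made-up stand-in for the real matrix
pwdiff[sample(length(pwdiff), 500)] <- NA     # sprinkle in some missing values
thr <- 1.5                                    # threshold; not called t, to avoid confusion with t()

# fraction of non-missing entries per row whose absolute value exceeds thr
frac_apply <- apply(pwdiff, 1, function(currow) mean(abs(currow) > thr, na.rm = TRUE))

# the same thing, vectorised
frac_rowmeans <- rowMeans(abs(pwdiff) > thr, na.rm = TRUE)

# if a one-column matrix of percentages is really needed
perc <- matrix(100 * frac_apply, ncol = 1)
```

Note that with na.rm=TRUE the denominator is the number of non-missing entries in the row, not the total number of columns.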

Good luck,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of pguilha
 Sent: Friday 24 June 2011 13:15
 To: r-help@r-project.org
 Subject: [R] counting columns that fulfill specific criteria
 
 Hi,
 
 I have a matrix (pwdiff in the example below) with ~48 rows and 780 columns.
 For each row, I want to get the percentage of columns that have an absolute
 value above a certain threshold t. I then want to allocate that percentage
 to matrix 'perc' in the corresponding row. Below is my attempt at doing
 this, but it does not work: I get 'replacement has length zero'. Any help
 would be much appreciated!!
 
 perc<-matrix(c(1:nrow(pwdiff)))
 for (x in 1:nrow(pwdiff))
 perc[x]<-(((ncol(pwdiff[,abs(pwdiff[x,]>=t)]))/ncol(pwdiff))*100)
 
 I should add that my data has NAs in some rows and not others (but I do not
 want to just ignore rows that have NAs)
 
 Thanks!
 
 Paul
 
 --
 View this message in context: http://r.789695.n4.nabble.com/counting-
 columns-that-fulfill-specific-criteria-tp3622265p3622265.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] problem (and solution) to rle on vector with NA values

2011-06-23 Thread Nick Sabbe
Hello Cormac.

Not having thoroughly checked whether your code actually works, the behavior
of rle you describe is the one documented (check the details of ?rle) and
makes sense as the missingness could have different reasons.
As such, changing this type of behavior would probably break a lot of
existing code that is built on top of rle.

There are other peculiarities and disputabilities about some base R
functions (the order of the arguments for sample trips me every time), but
unless the argument is really strong or a downright bug, I doubt people will
be willing to change this. Perhaps making the new behavior optional (through
a new parameter na.action or similar, with the default the original
behavior) is an option?

Feel free to run your own version of rle in any case. I suggest you rename
it, though, as it may cause problems for some packages.
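
For what it's worth, a separately named version could be as small as the sketch below (the name rle_na is made up, and I have only tried it on the example from your mail). Instead of masking the NAs with a sentinel value, it treats two neighbours as different when exactly one of them is NA, or when both are non-missing and unequal.

```r
rle_na <- function(x) {
  n <- length(x)
  if (n == 0L)
    return(structure(list(lengths = integer(), values = x), class = "rle"))
  a <- x[-1L]
  b <- x[-n]
  # TRUE where a run ends: NA-status differs, or both present and unequal
  differs <- (is.na(a) != is.na(b)) | (!is.na(a) & !is.na(b) & a != b)
  i <- c(which(differs), n)
  structure(list(lengths = diff(c(0L, i)), values = x[i]), class = "rle")
}

rv <- c(1, 1, NA, NA, 3, 3, 3)
print(rle(rv))     # base behaviour: each NA is its own run
print(rle_na(rv))  # NAs collapsed into one run
```

Because the result still has class "rle", inverse.rle() and the print method keep working on it.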


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Cormac Long
 Sent: Thursday 23 June 2011 15:44
 To: r-help@r-project.org
 Subject: [R] problem (and solution) to rle on vector with NA values
 
 Hello there R-help,
 
 I'm not sure if this should be posted here - so apologies if this is
 the case.
 I've found a problem while using rle and am proposing a solution to the
 issue.
 
 Description:
 I ran into a niggle with rle today when working with vectors with NA
 values
 (using R 2.31.0 on Windows 7 x64). It transpires that a run of NA
 values
 is not encoded in the same way as a run of other values. See the
 following
 example as an illustration:
 
 Example:
 The example
     rv<-c(1,1,NA,NA,3,3,3);rle(rv)
 Returns
     Run Length Encoding
       lengths: int [1:4] 2 1 1 3
       values : num [1:4] 1 NA NA 3
 not
     Run Length Encoding
       lengths: int [1:3] 2 2 3
       values : num [1:3] 1 NA 3
 as I expected. This caused my code to fail later (unsurprising).
 
 Analysis:
 The problem stems from the test
          y <- x[-1L] != x[-n]
 in line 7 of the rle function body. In this test, NA values return
 logical NA
 values, not TRUE/FALSE (again, unsurprising).
 
 Resolution:
 I modified the rle function code as included below. As far as I tested,
 this
 modification appears safe. The convoluted construction of naMaskVal
 should guarantee that the NA masking value is always different from
 any value in the vector and should be safe regardless of the input
 vector
 form (a raw vector is not handled since the NA values do not apply
 here).
 
 rle<-function (x)
 {
     if (!is.vector(x) && !is.list(x))
         stop("'x' must be an atomic vector")
     n <- length(x)
     if (n == 0L)
         return(structure(list(lengths = integer(), values = x),
             class = "rle"))
 
     #### BEGIN NEW SECTION PART 1 ####
     naRepFlag<-FALSE
     # defined outside the if() so that it also exists when x has no NAs
     IS_LOGIC<-typeof(x)=="logical"
     if(any(is.na(x))){
         naRepFlag<-TRUE
 
         if(IS_LOGIC){
             x<-as.integer(x)
             naMaskVal<-2L
         }else if(typeof(x)=="character"){
             naMaskVal<-paste(sample(c(letters,LETTERS,0:9),32,replace=TRUE),collapse="")
         }else{
             naMaskVal<-max(0,abs(x[!is.infinite(x)]),na.rm=TRUE)+1
         }
 
         x[which(is.na(x))]<-naMaskVal
     }
     #### END NEW SECTION PART 1 ####
 
     y <- x[-1L] != x[-n]
     i <- c(which(y), n)
 
     #### BEGIN NEW SECTION PART 2 ####
     if(naRepFlag)
         x[which(x==naMaskVal)]<-NA
 
     if(IS_LOGIC)
         x<-as.logical(x)
     #### END NEW SECTION PART 2 ####
 
     structure(list(lengths = diff(c(0L, i)), values = x[i]),
         class = "rle")
 }
 
 Conclusion:
 I think that the proposed code modification is an improvement on the
 existing
 implementation of rle. Is it impertinent to suggest this R-modification
 to the
 gurus at R?
 
 Best wishes (in flame-war trepidation),
 Dr. Cormac Long.
 


Re: [R] Rcpp and Object Factories

2011-06-09 Thread Nick Sabbe
You might want to send this message to the Rcpp mailing list at:
Rcpp-devel mailing list
rcpp-de...@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

It will improve your chances of getting a swift (if not helpful) reply.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Michael King
 Sent: Thursday 9 June 2011 21:03
 To: r-help@r-project.org
 Subject: [R] Rcpp and Object Factories
 
 Hello,
 I'm not exactly sure how to ask this question, but let me give it a
 shot...
 
 Is it possible (easy) to use Rcpp Modules in conjunction with object
 factories? For example
 
 what I am trying to do is something like this:
 
 // c++ classes
 
 class Foo {
   public:
     void do_something() {};
 };
 
 class Foo_Factory {
   public:
     Foo * create_foo() {
       return new Foo();
     }
 };
 
 ## R Code
 
 library(Rcpp)
 ff <- Module("Foo_Factory")
 foo <- ff$create_foo()
 foo$do_something()
 
 It appears after scouring some message boards that it is doable via
 boost
 python, but i'm not literate enough yet about how this works to know if
 the
 same logic holds for R.
 
 Thanks for the help.
 
 -Mike King
 


Re: [R] Simple order() data frame question.

2011-05-12 Thread Nick Sabbe
Try
(df1[order(-df1[,2]),])
Putting the minus inside the [ (as in df1[,-2]) leaves out that column (in
this case column 2) instead of negating its values.
See ?"[".
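
A short sketch of the variants, on a data frame built like yours (made-up values, and aa is made a factor explicitly). Since the unary minus does not work on a factor, the reverse ordering by aa uses decreasing=TRUE or xtfrm() instead.

```r
set.seed(1)
df1 <- data.frame(aa = factor(letters[1:10]), bb = rnorm(10))

# descending by bb: negate the values that order() gets to see
desc_bb  <- df1[order(-df1[, 2]), ]
# the same, without the minus
desc_bb2 <- df1[order(df1$bb, decreasing = TRUE), ]

# descending by the factor aa: a factor cannot be negated directly
desc_aa  <- df1[order(df1$aa, decreasing = TRUE), ]
desc_aa2 <- df1[order(-xtfrm(df1$aa)), ]

# what the minus inside [ ] does: it drops column 2, so only aa is left
without_bb <- df1[, -2]
print(without_bb)
```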

HTH.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of John Kane
Sent: Thursday 12 May 2011 14:33
To: R R-help
Subject: [R] Simple order() data frame question.

Clearly, I don't understand what order() is doing and as usual the help for
order seems to only confuse me more. For some reason I just don't follow the
examples there. I must be missing something about the data frame sort there
but what?

I originally wanted to reverse-order my data frame df1 (see below) by aa (a
factor) but since this was not working I decided to simplify and order by bb
to see what was happening!!

I'm obviously doing something stupid but what?

(df1 <- data.frame(aa=letters[1:10],
 bb=rnorm(10)))
# Order in ascending order by bb
(df1[order(df1[,2]),] ) # seems to work fine

# Order in descending order by bb.
(df1[order(df1[,-2]),])  # does not seem to work


===
 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252  LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C                    LC_TIME=English_Canada.1252

attached base packages:
 [1] grid  grDevices datasets  splines   graphics  stats tcltk
utils methods   base 

other attached packages:
[1] ggplot2_0.8.9   proto_0.3-9.2   reshape_0.8.4   plyr_1.5.2
svSocket_0.9-51 TinnR_1.0.3 R2HTML_2.2 
[8] Hmisc_3.8-3 survival_2.36-9

loaded via a namespace (and not attached):
[1] cluster_1.13.3  lattice_0.19-26 svMisc_0.9-61   tools_2.13.0   




Re: [R] Looping over graphs in igraph

2011-05-06 Thread Nick Sabbe
Hi Danielle.

You appear to have two problems:
1) getting the data into R
Because I don't have the file at hand, I'm going to simulate reading it
through a text connection
orgdata<-textConnection(paste(
    "Graph ID | Vertex1 | Vertex2 | weight\n1 | Alice | Bob | 2\n",
    "1 | Alice | Chris | 1\n1 | Alice | Jane | 2\n1 | Bob | Jane | 2\n",
    "1 | Chris | Jane | 3\n2 | Alice | Tom | 2\n2 | Alice | Kate | 1\n",
    "2 | Kate | Tom | 3\n2 | Tom | Mike | 2", sep=""))
dfr<-read.table(orgdata, header=TRUE, sep="|", as.is=TRUE, strip.white=TRUE)

For you, this would probably be more like
dfr<-read.table("somepath/fileOfInterest.csv", header=TRUE, sep="|",
    as.is=TRUE, strip.white=TRUE)

2) performing actions per graph id
require(igraph)
result<-sapply(unique(dfr$Graph.ID), function(curID){
    #There may be more elegant ways of creating the graphs per ID, but it works
    curDfr<-dfr[dfr$Graph.ID==curID,]
    g<-graph.edgelist(as.matrix(curDfr[,c("Vertex1", "Vertex2")]))
    g<-set.edge.attribute(g, "weight", value=curDfr$weight)
    #return whatever information you're interested in, based on graph object g
    #for now I'm just returning edge and vertex counts
    return(c(v=vcount(g), e=ecount(g)))
})
colnames(result)<-unique(dfr$Graph.ID)
print(result)
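
Since you also want the per-graph results back as columns next to the original rows, here is a sketch of that part in base R only (no igraph needed to run it). The two computed columns, n_vertices and n_edges, are just placeholders for whatever you get out of the community detection; the pattern is split, compute per piece, bind back together.

```r
dfr <- read.table(text = paste(
  "Graph ID | Vertex1 | Vertex2 | weight",
  "1 | Alice | Bob | 2",
  "1 | Alice | Chris | 1",
  "1 | Alice | Jane | 2",
  "1 | Bob | Jane | 2",
  "1 | Chris | Jane | 3",
  "2 | Alice | Tom | 2",
  "2 | Alice | Kate | 1",
  "2 | Kate | Tom | 3",
  "2 | Tom | Mike | 2",
  sep = "\n"), header = TRUE, sep = "|", as.is = TRUE, strip.white = TRUE)

# split() gives one data frame per graph id; the column is called Graph.ID
# because read.table turns "Graph ID" into a syntactically valid name
per_graph <- lapply(split(dfr, dfr$Graph.ID), function(cur) {
  # placeholders: replace with results computed on the igraph object
  cur$n_vertices <- length(unique(c(cur$Vertex1, cur$Vertex2)))
  cur$n_edges <- nrow(cur)
  cur
})

result <- do.call(rbind, per_graph)
rownames(result) <- NULL
print(result)
```

With a couple of thousand small graphs this should still be fast enough; write.table(result, ...) then gets you back to a file.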

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Danielle Li
Sent: Thursday 5 May 2011 22:25
To: r-help@r-project.org
Subject: [R] Looping over graphs in igraph

Hi,


I'm trying to do some basic social network analysis with igraph in R, but
I'm new to R and haven't been able to find documentation on a couple basic
things:

I want to run igraph's community detection algorithms on a couple thousand
small graphs but don't know how to automate igraph looking at multiple
graphs described in a single csv file.  My data look like something in ncol
format, but with an additional column that has an ID for which graph the
edge belongs in:


Graph ID | Vertex1 | Vertex2 | weight

1 | Alice | Bob | 2

1 | Alice | Chris | 1

1 | Alice | Jane | 2

1 | Bob | Jane | 2

1 | Chris | Jane | 3


2 | Alice | Tom | 2

2 | Alice | Kate | 1

2 | Kate | Tom | 3

2 | Tom | Mike | 2


so on and so forth for about 2000 graph IDs, each with about 20-40
vertices.  I've tried using the split command but it doesn't recognize my
graph id: (object 'graphid' not found)--this may just be because I don't
know how to classify a column of a csv as an object.


Ultimately, I want to run community detection on each graph separately--to
look only at the edges when the graph identifier is 1, make calculations on
that graph, then do it again for 2 and so forth.  I suspect that this isn't
related to igraph specifically--I just don't know the equivalent command in
R for what in pseudo Stata code would read as:


forvalues i of 1/N {

temp_graph=subrows of the main csv file for which graphid==`i'

cs`i' = leading.eigenvector.community.step(temp_graph)
convert cs`i'$membership into a column in the original csv

}


I want the output to look something like:


Graph ID | Vertex1 | Vertex2 | weight | Vertex 1 membership | Vertex 2
membership | # of communities in the graph


1 | Alice | Bob | 2 | A | B | 2

1 | Alice | Chris | 1 | A | B | 2

1 | Alice | Jane | 2 | A | B | 2

1 | Bob | Jane | 2 | B | B | 2

1 | Chris | Jane | 3 | B | B | 2


 2 | Alice | Tom | 2 | A | B | 3

2 | Alice | Kate | 1 | A | C | 3

2 | Kate | Tom | 3 |  C | B | 3

2 | Tom | Mike | 2 | B | C | 3


Here, the graphs are treated completely separately so that community A in
graph 1 need not have anything to do with community A in graph 2.


I would really appreciate any ideas you guys have.


Thank you!

Danielle



Re: [R] Lasso with Categorical Variables

2011-05-03 Thread Nick Sabbe
For performance reasons, I advise on using the following function instead of
model.matrix:

factorsToDummyVariables<-function(dfr, betweenColAndLevel="")
{
    nc<-dim(dfr)[2]
    firstRow<-dfr[1,]
    coln<-colnames(dfr)
    retval<-do.call(cbind, lapply(seq(nc), function(ci){
        if(is.factor(firstRow[,ci]))
        {
            #one 0/1 column per level, leaving out the first (reference) level
            lvls<-levels(firstRow[,ci])[-1]
            stretchedcols<-sapply(lvls, function(lvl){
                rv<-dfr[,ci]==lvl
                mode(rv)<-"integer"
                return(rv)
            })
            if(!is.matrix(stretchedcols))
                stretchedcols<-matrix(stretchedcols, nrow=1)
            colnames(stretchedcols)<-paste(coln[ci], lvls,
                sep=betweenColAndLevel)
            return(stretchedcols)
        }
        else
        {
            curcol<-matrix(dfr[,ci], ncol=1)
            colnames(curcol)<-coln[ci]
            return(curcol)
        }
    }))
    rownames(retval)<-rownames(dfr)
    return(retval)
}


Just for comparison: here is my old version of the same function, using
model.matrix:

factorsToDummyVariables.old<-function(dfrPredictors,
    form=paste("~",paste(colnames(dfrPredictors), collapse="+"), sep=""))
{
    #note: this function seems to operate quite slowly!
    #Because it is used often, it may be worth improving its speed
    dfrTmp<-model.frame(dfrPredictors, na.action=na.pass)
    frm<-as.formula(form)
    mm<-model.matrix(frm, data=dfrTmp)
    retval<-as.matrix(mm)[,-1]

    return(retval)
}

In a testcase with a reasonably big dataset, I compared the speeds:

#system.time(tmp.fd.convds.full.man<-manualFactorsToDummyVariables(ds))
##   user  system elapsed
##   9.44    0.00    9.48
#system.time(tmp.fd.convds.full<-factorsToDummyVariables.old(ds))
##   user  system elapsed
##  15.49    0.00   15.64
#system.time(invisible(factorsToDummyVariables (ds[10,])))
##   user  system elapsed
##   0.36    0.00    0.36
#system.time(invisible(factorsToDummyVariables.old (ds[10,])))
##   user  system elapsed
##   2.18    0.00    2.20
#system.time(invisible(factorsToDummyVariables (ds[20:30,])))
##   user  system elapsed
##   0.34    0.00    0.38
#system.time(invisible(factorsToDummyVariables.old (ds[20:30,])))
##   user  system elapsed
##   2.11    0.00    2.15

If you have to do this quite often, the difference surely adds up...
More improvements may be possible.
This function only works if you don't include interactions, though.
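
For reference, this is the kind of matrix either version is meant to produce, shown on a tiny made-up data frame with plain model.matrix (default treatment contrasts, so the first level of each factor is the baseline and gets no column). A numeric matrix like this is what you would then hand to lars or glmnet as x.

```r
X <- data.frame(X1 = factor(c("A", "B", "C", "A")),
                X3 = c(51, 46, 50, 44))

# expand factors into 0/1 dummy columns, then drop the intercept column
mm <- model.matrix(~ ., data = X)[, -1, drop = FALSE]
print(mm)
```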


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of David Winsemius
Sent: Monday 2 May 2011 20:48
To: Steve Lianoglou
Cc: r-help@r-project.org
Subject: Re: [R] Lasso with Categorical Variables


On May 2, 2011, at 10:51 AM, Steve Lianoglou wrote:

 Hi,

 On Mon, May 2, 2011 at 12:45 PM, Clemontina Alexander ckale...@ncsu.edu 
  wrote:
 Hi! This is my first time posting. I've read the general rules and
 guidelines, but please bear with me if I make some fatal error in
 posting. Anyway, I have a continuous response and 29 predictors made
 up of continuous variables and nominal and ordinal categorical
 variables. I'd like to do lasso on these, but I get an error. The way
 I am using lars doesn't allow for the factors. Is there a special
 option or some other method in order to do lasso with cat. variables?

 Here is and example (considering ordinal variables as just nominal):

 set.seed(1)
 Y <- rnorm(10,0,1)
 X1 <- factor(sample(x=LETTERS[1:4], size=10, replace = TRUE))
 X2 <- factor(sample(x=LETTERS[5:10], size=10, replace = TRUE))
 X3 <- sample(x=30:55, size=10, replace=TRUE)  # think age
 X4 <- rchisq(10, df=4, ncp=0)
 X <- data.frame(X1,X2,X3,X4)

 str(X)
 'data.frame':   10 obs. of  4 variables:
  $ X1: Factor w/ 4 levels "A","B","C","D": 4 1 3 1 2 2 1 2 4 2
  $ X2: Factor w/ 5 levels "E","F","G","H",..: 3 4 3 2 5 5 5 1 5 3
  $ X3: int  51 46 50 44 43 50 30 42 49 48
  $ X4: num  2.86 1.55 1.94 2.45 2.75 ...


 I'd like to do:
 obj <- lars(x=X, y=Y, type = "lasso")

 Instead, what I have been doing is converting all data to continuous
 but I think this is really bad!

 Yeah, it is.

 Check out the Categorical Predictor Variables section here for a way
 to handle such predictor vars:
 http://www.psychstat.missouristate.edu/multibook/mlt08m.html

Steve's citation is somewhat helpful, but not sufficient to take the  
next steps. You can find details regarding the mechanics of typical  
linear regression in R on the ?lm page where you find

Re: [R] Reference variables by string in for loop

2011-04-29 Thread Nick Sabbe
Hi Michael.
This is a classic :-)

ObjectsOfInterest- list(one_df, two_df, three_df)
for(namedf in ObjectsOfInterest){...}

or probably even better
sapply(ObjectsOfInterest, function(namedf){...})
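
Spelled out as a sketch (the three small data frames are made up, and counting rows just stands in for whatever you do in the loop body). Naming the list elements keeps the labels available, and mget() covers the case where the objects really have to be looked up from their names.

```r
one_df   <- data.frame(x = 1:3)
two_df   <- data.frame(x = 1:2)
three_df <- data.frame(x = 1:4)

# keep the objects together in a named list instead of building names as text
ObjectsOfInterest <- list(one = one_df, two = two_df, three = three_df)

for (name in names(ObjectsOfInterest)) {
  namedf <- ObjectsOfInterest[[name]]
  cat(name, "has", nrow(namedf), "rows\n")
}

# the same without a loop; the result is a named vector
n_rows <- sapply(ObjectsOfInterest, nrow)

# if the objects already exist and must be fetched by name: no eval(parse())
namevec <- c("one", "two", "three")
byName <- mget(paste(namevec, "_df", sep = ""), envir = environment())
```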

hth.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Michael Bach
Sent: Friday 29 April 2011 12:03
To: r-help@r-project.org
Subject: [R] Reference variables by string in for loop

Dear R Users,

I am trying to get the following to work better:

namevec <- c("one", "two", "three")
for (name in namevec) {
namedf <- eval(parse(text=paste(name, "_df", sep="")))
...
...
}

The rationale behind it being that I created variables with names
one_df, two_df and three_df earlier in the same script which I want to
reference inside the for loop.  Is there a more elegant way to do this?

Best Regards,
Michael Bach



[R] abline outside of plot region

2011-04-29 Thread Nick Sabbe
Hi R people.

 

I ran into this problem: I created a plot with errbars, like this:

 errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5),
yplus=c(2.5,1.5,3.5,3.5))

Next, I wanted to accentuate some x value with an abline, like this:

 abline(v=2)

 

In one of my R sessions (which admittedly I have had open for quite a while
now), the abline draws outside of the plotting region of errbars (till the
edge of my plotting window at least).

I tested for the cause by opening another session (clean) of the same
version of R (2.13), and running the same set of commands. In this session,
I do not have this behavior. Conclusion: I must have changed some graphical
parameter in my original session, but I don't know which one. Do you?

 

As an addendum: I also want to add a few specific axis ticks besides the
standard ones in my graph. I used axis for this, and it works. I set
col.ticks to match the color of my abline (in the nonsimplified code), and
this works too, but unfortunately, the label below the tick is not in this
color, and a parameter for this is not present in axis.

 

Suggestions for either? Note: I'm on windows 7 with R 2.13.

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link: http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 




Re: [R] Assignments inside lapply

2011-04-27 Thread Nick Sabbe
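No, that does not work.
An assignment with <- inside the function you pass to (l)apply only changes a
local copy of Powermap; the object outside the function is left untouched.
The same goes for any other function, for that matter.
Let the function return the value instead, and assign the result of the
whole (l)apply call.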
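
A minimal sketch of both points (Pr below is a made-up stand-in that only takes a position and a reference point, and the dimensions are arbitrary). The first part shows that the assignment inside the function leaves the outer object alone; the second part builds the matrix from the returned values.

```r
# 1) assignment inside the function only touches a local copy
m <- matrix(0, 2, 2)
invisible(lapply(1:2, function(k) {
  m[k, k] <- 1   # modifies a copy that is local to this function call
}))
print(m)         # still all zeros

# 2) return the values and assign once, outside
dimx <- 3
dimy <- 4
Pr <- function(pos, ref) sqrt(sum((pos - ref)^2))  # hypothetical stand-in

ij <- expand.grid(i = seq_len(dimx), j = seq_len(dimy))
vals <- sapply(seq_len(nrow(ij)), function(rowId) {
  Pr(c(ij$i[rowId], ij$j[rowId]), c(1, 1))
})
# expand.grid varies i fastest, which matches R's column-major matrix filling
Powermap <- matrix(vals, nrow = dimx, ncol = dimy)
print(Powermap)
```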


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Alaios
Sent: Wednesday 27 April 2011 11:37
To: R-help@r-project.org
Subject: [R] Assignments inside lapply

Dear all, I would like to ask you if an assignment can be done inside a
lapply statement.

For example

I would like to convert a double nested for loop

for (i in c(1:dimx)){
  for (j in c(1:dimy)){
  Powermap[i,j] <- Pr(c(i,j),c(PRX,PRY),f)
   }
}

to something like that:


ij<-expand.grid(i=seq(1:dimx),j=(1:dimy))

unlist(lapply(1:nrow(ij),function(rowId) { return
(Powermap[i,j]<-Pr(c(ij$i[rowId],ij$j[rowId]),c(PRX,PRY),f))   }))


As you can see, lapply does not return anything, as the assignment is done
inside the function. Would that work correctly? In which cases will such a
statement malfunction?

I would like to thank you in advance for your help.

Best Regards
Alex



Re: [R] programming: telling a function where to look for the entered variables

2011-04-01 Thread Nick Sabbe
See the warning in ?subset.
Passing the column name of lvar is not the same as passing the 'contextual
column' (as I coin it in these circumstances).
You can solve it by indeed using [] instead.

For my own comfort, here is the relevant line from your original function:
Data.tmp <- subset(Fulldf, lvar==subgroup, select=c(xvar,yvar))
Which should become something like (untested but should be close):
Data.tmp <- Fulldf[Fulldf[,"lvar"]==subgroup, c("xvar","yvar")]

This should be a lot easier to translate based on column names, as the
column names are now used as such.
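
A sketch of how the whole function might then look, with the column names passed as strings (not run against your real data; the argument xvarname and the four-row data frame are only there for illustration):

```r
myfunct.better <- function(subgroup, lvarname, yvarname, dataframe, xvarname = "xvar") {
  # rows belonging to the requested subgroup (rows with a missing lvar are dropped)
  keep <- !is.na(dataframe[[lvarname]]) & dataframe[[lvarname]] == subgroup
  # restrict to those rows and the two columns of interest, by name
  Data.tmp <- na.omit(dataframe[keep, c(xvarname, yvarname)])
  # contingency table of the two selected columns
  table(Data.tmp[[xvarname]], Data.tmp[[yvarname]])
}

Fulldf <- data.frame(xvar = factor(c("blue", "green", "blue", "red")),
                     yvar = factor(c("area1", "area2", "area1", "area1")),
                     lvar = factor(c("yes", "no", "yes", "yes")))

tab <- myfunct.better("yes", lvarname = "lvar", yvarname = "yvar", dataframe = Fulldf)
print(tab)
```

Because the names arrive as ordinary character strings, there is no need for deparse(substitute()) or attach(); the same function works on any data frame that has columns with those names.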

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of E Hofstadler
Sent: Friday 1 April 2011 13:09
To: r-help@r-project.org
Subject: [R] programming: telling a function where to look for the entered
variables

Hi there,

Could someone help me with the following programming problem..?

I have written a function that works for my intended purpose, but it
is quite closely tied to a particular dataframe and the names of the
variables in this dataframe. However, I'd like to use the same
function for different dataframes and variables. My problem is that
I'm not quite sure how to tell my function in which dataframe the
entered variables are located.

Here's some reproducible data and the function:

# create reproducible data
set.seed(124)
xvar <- sample(0:3, 1000, replace = T)
yvar <- sample(0:1, 1000, replace=T)
zvar <- rnorm(100)
lvar <- sample(0:1, 1000, replace=T)
Fulldf <- as.data.frame(cbind(xvar,yvar,zvar,lvar))
Fulldf$xvar <- factor(xvar, labels=c("blue","green","red","yellow"))
Fulldf$yvar <- factor(yvar, labels=c("area1","area2"))
Fulldf$lvar <- factor(lvar, labels=c("yes","no"))

and here's the function in the form that it currently works: from a
subset of the dataframe Fulldf, a contingency table is created (in my
actual data, several other operations are then performed on that
contingency table, but these are not relevant for the problem in
question, therefore I've deleted it) .

# function as it currently works: tailored to a particular dataframe (Fulldf)

myfunct <- function(subgroup){
# enter a particular subgroup for which the contingency table should be
# calculated (i.e. a particular value of the factor lvar)
Data.tmp <- subset(Fulldf, lvar==subgroup, select=c(xvar,yvar))
# restrict dataframe to given subgroup and two columns of the original dataframe
Data.tmp <- na.omit(Data.tmp) # exclude missing values
indextable <- table(Data.tmp$xvar, Data.tmp$yvar) # make contingency table
return(indextable)
}

#Since I need to use the function with different dataframes and
variable names, I'd like to be able to tell my function the name of
the dataframe and variables it should use for calculating the index.
This is how I tried to modify the first part of the #function, but it
didn't work:

# function as I would like it to work: independent of any particular
# dataframe or variable names (doesn't work)

myfunct.better <- function(subgroup, lvarname, yvarname, dataframe){
# enter the subgroup, the variable names to be used and the dataframe
# in which they are found
Data.tmp <- subset(dataframe, lvarname==subgroup, select=c(xvar,
deparse(substitute(yvarname)))) # trying to subset the given dataframe
# for the given subgroup of the given variable. (The variable xvar
# happens to have the same name in all dataframes) but the variable
# yvarname has different names in the different dataframes
Data.tmp <- na.omit(Data.tmp)
indextable <- table(Data.tmp$xvar, Data.tmp$yvarname) # create the
# contingency table on the basis of the entered variables
return(indextable)
}

calling

myfunct.better("yes", lvarname=lvar, yvarname=yvar, dataframe=Fulldf)

results in the following error:

Error in `[.data.frame`(x, r, vars, drop = drop) :
  undefined columns selected

My feeling is that R doesn't know where to look for the entered
variables (lvar, yvar), but I'm not sure how to solve this problem. I
tried using with() and even attach() within the function, but that
didn't work.

Any help is greatly appreciated.

Best,
Esther

P.S.:
Are there books that elaborate programming in R for beginners -- and I
mean things like how to best use vectorization instead of loops and
general best practice tips for programming. Most of the books I've
been looking at focus on applying R for particular statistical
analyses, and only comparably briefly deal with more general
programming aspects. I was wondering if there's any books or tutorials
out there that cover the latter aspects in a more elaborate and
systematic way...?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code

Re: [R] programming: telling a function where to look for the entered variables

2011-04-01 Thread Nick Sabbe
This should be a version that does what you want.
Because you named the variable lvarname, I assumed you were already passing
"lvar" instead of trying to pass lvar (without the quotes), which is in no
way a 'name'.

myfunct.better <- function(subgroup, lvarname, xvarname, yvarname,
	dataframe)
{
	#enter the subgroup, the variable names to be used and the dataframe
	#in which they are found
	#(note: subset the dataframe that was passed in, not the global Fulldf)
	Data.tmp <- dataframe[dataframe[,lvarname]==subgroup,
		c(xvarname,yvarname)]
	Data.tmp <- na.omit(Data.tmp)
	indextable <- table(Data.tmp[,xvarname], Data.tmp[,yvarname])
	#create the contingency table on the basis of the entered variables
	#actually, if I remember well, you could simply use
	#indextable <- table(Data.tmp) here
	#that would allow for some more simplifications (replace xvarname
	#and yvarname by columnsOfInterest or similar, and pass that instead
	#of c(xvarname, yvarname) )
	return(indextable)
}

myfunct.better("yes", lvarname="lvar", xvarname="xvar", yvarname="yvar",
	dataframe=Fulldf)
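
The simplification hinted at in the comments could look roughly like the sketch below. The name columnsOfInterest is taken from the comment above; the function name myfunct.cols and the toy data are made up for illustration, so treat this as one possible shape rather than the definitive version.

```r
# Toy data in the spirit of the original post (values are arbitrary)
set.seed(124)
Fulldf <- data.frame(
  xvar = factor(sample(0:3, 1000, replace = TRUE),
                labels = c("blue", "green", "red", "yellow")),
  yvar = factor(sample(0:1, 1000, replace = TRUE),
                labels = c("area1", "area2")),
  lvar = factor(sample(0:1, 1000, replace = TRUE),
                labels = c("yes", "no"))
)

# Hypothetical variant: all column names are passed as character strings
myfunct.cols <- function(subgroup, lvarname, columnsOfInterest, dataframe)
{
  keep <- dataframe[[lvarname]] == subgroup
  # drop rows where the grouping variable itself is missing
  Data.tmp <- dataframe[keep & !is.na(keep), columnsOfInterest, drop = FALSE]
  Data.tmp <- na.omit(Data.tmp)
  table(Data.tmp) # a data frame of factors is cross-tabulated directly
}

tab <- myfunct.cols("yes", lvarname = "lvar",
                    columnsOfInterest = c("xvar", "yvar"),
                    dataframe = Fulldf)
print(tab)
```

Because nothing in the function refers to Fulldf or to fixed column names, the same call should work for any data frame that has the named columns.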


HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: irene.p...@googlemail.com [mailto:irene.p...@googlemail.com] On Behalf
Of E Hofstadler
Sent: vrijdag 1 april 2011 14:28
To: Nick Sabbe
Cc: r-help@r-project.org
Subject: Re: [R] programming: telling a function where to look for the
entered variables

Thanks Nick and Juan for your replies.

Nick, thanks for pointing out the warning in subset(). I'm not sure
though I understand the example you provided -- because despite using
subset() rather than bracket notation, the original function (myfunct)
does what is expected of it. The problem I have is with the second
function (myfunct.better), where variable names + dataframe are not
fixed within the function but passed to the function when calling it
-- and even with bracket notation I don't quite manage to tell R where
to look for the columns that related to the entered column names.
(but then perhaps I misunderstood you)

This is what I tried (using bracket notation):

myfunct.better(dataframe, subgroup, lvarname,yvarname){
Data.tmp <- dataframe[dataframe[,deparse(substitute(lvarname))]==subgroup,
c("xvar",deparse(substitute(yvarname)))]
}

but this creates an empty contingency table only -- perhaps because my
use of deparse() is flawed (I think what is converted into a string is
lvarname and yvarname, rather than the column names that these two
function-variables represent in the dataframe)?


2011/4/1 Nick Sabbe nick.sa...@ugent.be:
 See the warning in ?subset.
 Passing the column name of lvar is not the same as passing the 'contextual
 column' (as I coin it in these circumstances).
 You can solve it by indeed using [] instead.


Re: [R] Graph many points without hiding some

2011-03-31 Thread Nick Sabbe
Hi.
You could also turn it into a 3D plot with some variation on the function
below:
plot4d <- function(x, y, z, u, main="", xlab="", ylab="", zlab="", ulab="")
{
	require(rgl) #may need to install this package first

	#standard trick to get some intensity colors
	#(assumes u holds integer values, as it is used as an index)
	uLim <- range(u)
	uLen <- uLim[2] - uLim[1] + 1
	colorlut <- terrain.colors(uLen)
	col <- colorlut[u - uLim[1] + 1]

	open3d() #Open new device
	points3d(x=x, y=y, z=z, col=col)
	aspect3d(x=1, y=1, z=1) #ensure bounding box is in cube-form
	#(scaling variables)
	#note: if you want to flip an axis, use -1 in the statement above

	axes3d() #Show axes
	title3d(main = main, sub=paste("Green is low", ulab, ", red is high"),
		xlab = xlab, ylab = ylab, zlab = zlab)
}

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Peter Langfelder
Sent: donderdag 31 maart 2011 9:26
To: Samuel Dennis
Cc: R-help@r-project.org
Subject: Re: [R] Graph many points without hiding some

On Wed, Mar 30, 2011 at 10:04 PM, Samuel Dennis sjdenn...@gmail.com wrote:
 I have a very large dataset with three variables that I need to graph
using
 a scatterplot. However I find that the first variable gets masked by the
 other two, so the graph looks entirely different depending on the order of
 variables. Does anyone have any suggestions how to manage this?

 This code is an illustration of what I am dealing with:

 x <- 1
 plot(rnorm(x,mean=20),rnorm(x),col=1,xlim=c(16,24))
 points(rnorm(x,mean=21),rnorm(x),col=2)
 points(rnorm(x,mean=19),rnorm(x),col=3)

 gives an entirely different looking graph to:

 x <- 1
 plot(rnorm(x,mean=19),rnorm(x),col=3,xlim=c(16,24))
 points(rnorm(x,mean=20),rnorm(x),col=1)
 points(rnorm(x,mean=21),rnorm(x),col=2)

 despite being identical in all respects except for the order in which the
 variables are plotted.

 I have tried using pch=".", however the colours are very difficult to
 discern. I have experimented with a number of other symbols with no real
 solution.

 The only way that appears to work is to iterate the plot with a for loop,
 and progressively add a few numbers from each variable, as below. However
 although I can do this simply with random numbers as I have done here,
this
 is an extremely cumbersome method to use with real datasets.

 plot(1,1,xlim=c(16,24),ylim=c(-4,4),col="white")
 x <- 100
 for (i in 1:100) {
 points(rnorm(x,mean=19),rnorm(x),col=3)
 points(rnorm(x,mean=20),rnorm(x),col=1)
 points(rnorm(x,mean=21),rnorm(x),col=2)
 }

 Is there some function in R that could solve this through automatically
 iterating my data as above, using transparent symbols, or something else?
Is
 there some other way of solving this issue that I haven't thought of?

Assume you are plotting variables y1, y2, y3 of the same length
against a common x, and you would like to assign colors say c(1,2,3).
You can automate the randomization of order as follows:

n = length(y1);
y = c(y1, y2, y3);
xx = rep(x, 3);
colors = rep(c(1,2,3), c(n, n, n));

order = sample(c(1:(3*n)));

plot(xx[order], y[order], col= colors[order])

I basically turn the y's into a single vector y with the corresponding
values of x stored in xx and the plotting colors, then randomize the
order using the sample function.
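
For anyone who wants to try this on simulated data first, here is a self-contained sketch of the same idea. The data, the colour assignment and the name ord are made up for illustration (ord is used instead of order only to avoid masking the base function).

```r
set.seed(1)
n  <- 1000
x  <- rnorm(n)
y1 <- rnorm(n, mean = 19)
y2 <- rnorm(n, mean = 20)
y3 <- rnorm(n, mean = 21)

# Stack the three series, repeat x to match, and record one colour per point
y      <- c(y1, y2, y3)
xx     <- rep(x, 3)
colors <- rep(c(3, 1, 2), each = n)

# Random permutation of all 3n points: no series is systematically drawn last
ord <- sample(3 * n)

plot(xx[ord], y[ord], col = colors[ord], pch = 20)
```

Since the same permutation is applied to xx, y and colors, every point keeps its own coordinates and colour; only the drawing order changes.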

HTH,

Peter


Re: [R] choosing best 'match' for given factor

2011-03-31 Thread Nick Sabbe
Hi Murali.
I haven't compared, but this is what I would do:

bestMatch <- function(searchVector, matchMat)
{
	searchRow <- unique(sort(match(searchVector, colnames(matchMat)))) #if
	#you're sure, you could drop unique
	cat("Original row indices:")
	print(searchRow)
	matchMat <- matchMat[, -searchRow, drop=FALSE] #avoid duplicates
	#altogether
	cat("Corrected Matrix:\n")
	print(matchMat)
	correctedRows <- searchRow - seq_along(searchRow) + 1 #works because
	#of the sort above
	cat("Corrected row indices:")
	print(correctedRows)
	sapply(seq_along(searchRow), function(i){
		#only columns were dropped, so rows keep their original index
		lookWhere <- matchMat[searchRow[i], seq_len(correctedRows[i]-1)]
		cat("Will now look into:\n")
		print(lookWhere)
		cc <- which.max(lookWhere)
		cat("Max at position", cc, "\n")
		colnames(matchMat)[cc]
	})
}
I don't think there's that much difference. Depending on specific sizes, it
may be more or less costly to first shrink the search matrix like I do. And
similarly depending, it may be better still if you remove the rows that
you're not interested in as well (some more but similar index trickery
required then).
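
For comparison, here is a version without the debugging output that does not shrink the matrix at all. It is only a sketch: the name bestMatchQuiet is made up, the matrix is the one from the original post, and it has not been timed against either of the versions above.

```r
a <- structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58,
                 0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L),
               .Dimnames = list(c("A", "X", "L", "O"),
                                c("A", "X", "L", "O")))

bestMatchQuiet <- function(searchVector, matchMat)
{
  pos <- match(searchVector, colnames(matchMat))
  sapply(pos, function(p) {
    # candidates: columns to the left of p that are not searched for themselves
    candidates <- setdiff(seq_len(p - 1), pos)
    if (length(candidates) == 0) return(NA_character_)
    best <- candidates[which.max(matchMat[p, candidates])]
    colnames(matchMat)[best]
  })
}

res <- bestMatchQuiet(c("X", "O"), a)
print(res)
```

Unlike the version above, this returns the matches in the order of the search vector, and gives NA when nothing to the left qualifies.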

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of murali.me...@avivainvestors.com
Sent: donderdag 31 maart 2011 16:46
To: r-help@r-project.org
Subject: [R] choosing best 'match' for given factor

Folks,

I have a 'matching' matrix between variables A, X, L, O:

 a <- structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58,
0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
c("A", "X", "L", "O"), c("A", "X", "L", "O")))

 a
  A X L O
A  1.00  0.41  0.58  0.75
X  0.41  1.00  0.60  0.86
L  0.58  0.75  1.00  0.83
O  0.60  0.86  0.83  1.00

And I have a search vector of variables

 v <- c("X", "O")

I want to write a function bestMatch(searchvector, matchMat) such that for
each variable in searchvector, I get the variable that it has the highest
match to - but searching only among variables to the left of it in the
'matching' matrix, and not matching with any variable in searchvector
itself.

So in the above example, although X has the highest match (0.86) with O,
I can't choose O as it's to the right of X (and also because O is in the
searchvector v already); I'll have to choose A.

For O, I will choose L, the variable it's best matched with - as it
can't match X already in the search vector.

My function bestMatch(v, a) will then return c("A", "L")

My matrix a is quite large, and I have a long list of search vectors v, so I
need an efficient method.

I wrote this:

bestMatch <- function(searchvector,  matchMat) {
	sapply(searchvector, function(cc) {
	 y <- matchMat[!(rownames(matchMat) %in%
searchvector) & (index(rownames(matchMat)) < match(cc, rownames(matchMat))),
cc, drop = FALSE];
 rownames(y)[which.max(y)]
})   
}

Any advice?

Thanks,

Murali


Re: [R] a for loop to lapply

2011-03-30 Thread Nick Sabbe
Hello Alex.
A few issues:
* you want seq(dimx) instead of seq(1:dimx)  (d'oh)
* I think you have problems with your dimensions in the original code as
well: you use i, which runs up to dimx, as an indexer for your third
dimension, of size dimmaps. If dimx > dimmaps, you're in for unexpected
results.
* basic idea of the apply-style functions (nicknamed *apply below):
	- first argument = a collection of items to run over. Could be a
list or a vector
	- second argument = a function that can take any of the items in
the collection as its first argument
	- other arguments: either tuning parameters (like simplify) for
*apply or passed on as more arguments to the function
	- each item from the collection is sequentially fed as the first
argument; the extra arguments (always the same) are passed on to the
function as well.
	- normally, the results of each call are collected into a list,
where the names of the list items refer to your original collection. In
more elaborate versions (sapply) and under some circumstances, this list is
transformed into a simpler structure.
* your test case is rather complicated: I don't think there is a way to make
lapply or one of its cousins return a three-dimensional array just like
that. With sapply (and simplify=TRUE, the default), if the result for each
item of your collection has the same length, the result is coerced into a
two-dimensional array with one column for each item in your collection.
* on the other hand, for your example, you probably don't want to use *apply
functions nor loops: it can be done with some clever use of seq and rep and
dim, for sure.
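
The last point could be sketched as follows. This assumes the intent was to fill slice i of the third dimension with the value i, so the loop here runs over dimmaps rather than dimx; the sizes and the name Shadowloop are arbitrary choices for illustration.

```r
dimx <- 3
dimy <- 2
dimmaps <- 4

# Loop version, running over the third dimension
Shadowloop <- array(data = NA, dim = c(dimx, dimy, dimmaps))
for (i in seq_len(dimmaps)) {
  Shadowloop[, , i] <- i
}

# Vectorised version: arrays are filled first index fastest, so each slice
# takes dimx * dimy consecutive values, all equal to the slice number
Shadowlist <- array(rep(seq_len(dimmaps), each = dimx * dimy),
                    dim = c(dimx, dimy, dimmaps))

print(all(Shadowloop == Shadowlist))
```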

All in all, it seems you may need to get your basics up to speed first, then
shift to *apply (and use a simpler example to get started, like: given a
matrix with two columns, create a vector holding the differences and the
sums of the columns - I know this can be done without *apply as well, but
apart from that it is a more attainable exercise).
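
One way the suggested exercise might turn out, shown next to the version without *apply. The matrix values and the names viaSapply and direct are made up for illustration.

```r
m <- cbind(c(5, 7, 9), c(1, 2, 3))

# sapply feeds each row number to the function; every call returns two
# values, so the list of results is simplified to a 2 x nrow(m) matrix
viaSapply <- sapply(seq_len(nrow(m)), function(i) {
  c(diff = m[i, 1] - m[i, 2], sum = m[i, 1] + m[i, 2])
})

# The same result using plain vectorised arithmetic
direct <- rbind(diff = m[, 1] - m[, 2], sum = m[, 1] + m[, 2])

print(viaSapply)
```

Note how the per-item results end up as columns, which is the sapply behaviour described above.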

Good luck to you on that!

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Alaios
Sent: woensdag 30 maart 2011 8:31
To: R-help@r-project.org
Subject: [R] a for loop to lapply

Dear all,
I am trying to learn lapply.
I would like, as a test case, to try the lapply alternative for the 


Shadowlist<-array(data=NA,dim=c(dimx,dimy,dimmaps))
for (i in c(1:dimx)){
Shadowlist[,,i]<-i
}


---so I wrote the following---


returni <-function(i,ShadowMatrix) {ShadowMatrix<-i}
lapply(seq(1:dimx),Shadowlist[,,seq(1:dimx)],returni)

So far I do not get same results with both ways.
Could you please help me understand what might be wrong?


Regards
Alex


Re: [R] A question on glmnet analysis

2011-03-25 Thread Nick Sabbe
I haven't read all of your code, but at first read, it seems right.

With regard to your questions:
1. Am I doing it correctly or not?
Seems OK, as I said. You could use some more standard code to convert your
data to a matrix, but essentially the results should be the same.
Also, lambda.min may be a tad too optimistic: to correct for the reuse of
data in crossvalidation, one normally uses the 'one standard error' trick (I
think this is described in the helpfile for cv.glmnet, and it is also present
in the cv.glmnet return value (lambda.1se if I'm not mistaken))

2. Which model, I mean lasso or elastic net, should be selected? and
why? Both models chose the same variables but different coefficient values.
You may want to read 'the elements of statistical learning' to find some
info on the advantages of ridge/lasso/elnet compared. Lasso should work fine
in this relatively low-dimensional setting, although it depends on the
correlation structure of your covariates.
Depending on your goals, you may want to refit a standard logistic
regression with only the variables selected by the lasso: this avoids the
downward bias that is in (just about) every penalized regression.

3. Is it O.K. to calculate odds ratio by exp(coefficients)? And how can
you calculate 95% confidence interval of odds ratio?
Or 95%CI is meaningless in this kind of analysis?
At this time, confidence intervals for lasso/elnet in GLM settings are an
open problem (the reason being that the L1 penalty is not differentiable).
Some 'solutions' exist (bootstrap, for one), but they have all been shown to
have (statistical) properties that make them - at the least - doubtful. I
know, because I'm working on this. Short answer: there is no way to do this
(at this time).
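
The refit mentioned under question 2, together with the exp(coefficients) step from question 3, might look like the sketch below. Everything here is simulated and hypothetical: the column names, the choice of "selected" variables and the effect sizes are invented, and only base R is used.

```r
set.seed(42)
n <- 104
x <- matrix(rnorm(n * 3), n, 3,
            dimnames = list(NULL, c("x4", "x8", "x9")))
y <- rbinom(n, 1, plogis(-1 + 0.7 * x[, "x8"]))

# Hypothetical: pretend the lasso kept only these two columns
selected <- c("x4", "x8")

# Ordinary (unpenalized) logistic regression on the selected columns
refit <- glm(y ~ ., family = binomial,
             data = data.frame(y = y, x[, selected, drop = FALSE]))

# exp(coefficient) is the odds ratio per unit increase of the covariate
oddsRatios <- exp(coef(refit))
print(oddsRatios)
```

Any interval computed from such a refit ignores that the variables were chosen using the same data, so it does not get around the problem described above.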

HTH (and hang on there in Japan),


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of 
Sent: vrijdag 25 maart 2011 14:04
To: r-h...@stat.math.ethz.ch
Subject: [R] A question on glmnet analysis

Hi,
I am trying to do logistic regression for data of 104 patients, which
have one outcome (yes or no) and 15 variables (9 categorical factors
[yes or no] and 6 continuous variables). Number of yes outcome is 25.
Twenty-five events and 15 variables mean events per variable is much
less than 10. Therefore, I tried to analyze the data with penalized
regression method. I would like please some of the experts here to help me.

First of all, I standardized all 6 continuous variables by scale() with
center=TRUE and scale=TRUE option. Nine categorical variables and one
outcome variable were re-coded as 0 or 1. Then, I used glmnet with
standardize=FALSE option because of presence of categorical variables.

x15std <- matrix(c(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15),
104, 15)
y <- outcome
library(glmnet)
fit.1 <- glmnet(x15std, y, family="binomial", standardize=FALSE)
fit.1cv <- cv.glmnet(x15std, y, family="binomial", standardize=FALSE)

default alpha=1, so this should be lasso penalty.

Coefficients.fit1 <- coef(fit1, s=fit1.cv$lambda.min)
Active.Index.fit1 <- which(Coefficients.fit1 !=0)
Active.Coefficients.fit1 <- Coefficients.fit1[Active.Index.fit1]
Active.Index.fit1
[1]  1  5  9 10 16
Active.Coefficients.fit1
[1] -1.28774827  0.01420395  0.70444865 -0.27726625  0.18455926

My optimal model chose 5 active covariates including intercept as first one.

Second, I did the same things with alpha=0.5 option to do elastic net
analysis.

fit.2 <- glmnet(x15std, y, family="binomial", standardize=FALSE, alpha=0.5)
fit.2cv <- cv.glmnet(x15std, y, family="binomial", standardize=FALSE,
alpha=0.5)
Coefficients.fit2 <- coef(fit2, s=fit2.cv$lambda.min)
Active.Index.fit2 <- which(Coefficients.fit2 !=0)
Active.Coefficients.fit2 <- Coefficients.fit2[Active.Index.fit2]
Active.Index.fit2
[1]  1  5  9 10 16
Active.Coefficients.fit2
[1] -1.3286190  0.1410739  0.6315108 -0.2668022  0.2292459

This model chose the same 5 active covariates as first one with lasso
penalty.

My questions are followings;
1. Am I doing it correctly or not?
2. Which model, I mean lasso or elastic net, should be selected? and
why? Both models chose the same variables but different coefficient values.
3. Is it O.K. to calculate odds ratio by exp(coefficients)? And how can
you calculate 95% confidence interval of odds ratio?
Or 95%CI is meaningless in this kind of analysis?

I would appreciate your help in advance.
KH


Re: [R] One to One Matching multiple vectors

2011-03-16 Thread Nick Sabbe
Hello Vincy.
You probably want
y[match(z,x)]
Or, more instructional:
whereAreZInX <- match(z, x)
y[whereAreZInX]
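
Put together with the data from the question, as a runnable snippet (the name res is only there to hold the result):

```r
x <- c(0, 1, 2, 3)
y <- c("A", "B", "C", "D")
z <- c(1, 3)

# match() gives, for each value of z, its position within x
whereAreZInX <- match(z, x)
res <- y[whereAreZInX]
print(res)
```

The attempt y[x][z] goes wrong because x is used as a set of positions there: index 0 is silently dropped, so y[x] is "A" "B" "C", and taking positions 1 and 3 of that yields "A" and "C".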

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Vincy Pyne
Sent: woensdag 16 maart 2011 10:42
To: r-help@r-project.org
Subject: [R] One to One Matching multiple vectors

Dear R helpers

Suppose,

x = c(0,  1,  2,  3)

y = c("A", "B", "C", "D")

z = c(1, 3)

For given values of z, I need the values of y. So I should get "B" and
"D".

I tried doing 

y[x][z] but it gives 

> y[x][z]
[1] "A" "C"

Kindly guide.

Regards

Vincy




[R] Generic mixup?

2011-03-04 Thread Nick Sabbe
Hello list.

 

This is from an R session (admittedly, I'm still using R 2.11.1):

> print
function (x, ...)
UseMethod("print")
<environment: namespace:base>

> showMethods("print")

Function "print":
 <not a generic function>

Don't the two results contradict each other? Or do I have a terrible
misunderstanding of what comprises a generic function?

Thx,

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link:  http://biomath.ugent.be/ http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 



Re: [R] speed up process

2011-02-25 Thread Nick Sabbe
Simply avoiding the for loops by using lapply (I may have missed a bracket
here or there cause I did this without opening R)...
Haven't checked the speed up, though.

lapply(seq.yvar, function(k){
   plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
   lapply(seq_along(mydata_list), function(j){
 foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
pos=mypos[j], name.dat=names(mydata_list)[j])
 return(NULL)
   })
   invisible(NULL)
})
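
If the inner loop is to be written with mapply instead, the dataset, its name, its colour and its legend position can be fed in as parallel arguments. Below is a sketch with a stand-in function: describe is made up and merely reports what it received, so that the example runs without the WRS package or an open plot, and the data frames are dummies.

```r
# Stand-in for foo_reg(): same argument names, but no plotting
describe <- function(dat, mycol, pos, name.dat) {
  paste(name.dat, nrow(dat), mycol, pos)
}

mydata_list <- list(mydata1 = data.frame(v = 1:8),
                    mydata2 = data.frame(v = 1:6),
                    mydata3 = data.frame(v = 1:5))
mypos <- c("topleft", "topright", "bottomleft")

# Call j receives the j-th element of every argument given here
res <- mapply(describe,
              dat      = mydata_list,
              mycol    = seq_along(mydata_list),
              pos      = mypos,
              name.dat = names(mydata_list))
print(res)
```

Arguments that stay the same in every call (such as xvar and yvar for foo_reg) would go into MoreArgs=list(...) instead.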

HTH,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Ivan Calandra
Sent: vrijdag 25 februari 2011 11:20
To: r-help
Subject: [R] speed up process

Dear users,

I have a double for loop that does exactly what I want, but is quite 
slow. It is not so much with this simplified example, but IRL it is slow.
Can anyone help me improve it?

The data and code for foo_reg() are available at the end of the email; I 
preferred going directly into the problematic part.
Here is the code (I tried to simplify it but I cannot do it too much or 
else it wouldn't represent my problem). It might also look too complex 
for what it is intended to do, but my colleagues who are also supposed 
to use it don't know much about R. So I wrote it so that they don't have 
to modify the critical parts to run the script for their needs.

#column indexes for function
ind.xvar <- 2
seq.yvar <- 3:4
#position vector for legend(), stupid positioning but it doesn't matter here
mypos <- c("topleft", "topright","bottomleft")

#run the function for columns 3 & 4 as y (seq.yvar) with column 2 as x
#(ind.xvar) for all 3 datasets (mydata_list)
par(mfrow=c(2,1))
for (i in seq_along(seq.yvar)){
   k <- seq.yvar[i]
   plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
   for (j in seq_along(mydata_list)){
 foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j, 
pos=mypos[j], name.dat=names(mydata_list)[j])
   }
}

I tried with lapply() or mapply() but couldn't manage to pass the 
arguments for names() and col= correctly, e.g. for the 2nd loop:
lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar, 
yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])}, 
mydata_list, col1=1:3, pos=mypos, MoreArgs=list(xvar=ind.xvar, yvar=k))

Thanks in advance for any hints.
Ivan




#create data (it looks horrible with these datasets but it doesn't
#matter here)
mydata1 <- structure(list(species = structure(1:8, .Label = c("alsen",
"gogor", "loalb", "mafas", "pacyn", "patro", "poabe", "thgel"), class =
"factor"), fruit = c(0.52, 0.45, 0.43, 0.82, 0.35, 0.9, 0.68, 0), Asfc =
c(207.463765, 138.5533755, 70.4391735, 160.9742745, 41.455809,
119.155109, 26.241441, 148.337377), Tfv = c(47068.1437773483,
43743.8087431582, 40323.5209129239, 23420.9455581495, 29382.6947428651,
50460.2202192311, 21810.1456510625, 41747.6053810881)), .Names =
c("species", "fruit", "Asfc", "Tfv"), row.names = c(NA, 8L), class =
"data.frame")

mydata2 <- mydata1[!(mydata1$species %in% c("thgel","alsen")),]
mydata3 <- mydata1[!(mydata1$species %in% c("thgel","alsen","poabe")),]
mydata_list <- list(mydata1=mydata1, mydata2=mydata2, mydata3=mydata3)

#function for regression
library(WRS)
foo_reg <- function(dat, xvar, yvar, mycol, pos, name.dat){
  tsts <- tstsreg(dat[[xvar]], dat[[yvar]])
  tsts_inter <- signif(tsts$coef[1], digits=3)
  tsts_slope <- signif(tsts$coef[2], digits=3)
  abline(tsts$coef, lty=1, col=mycol)
  legend(x=pos, legend=c(paste("TSTS ",name.dat,":
Y=",tsts_inter,"+",tsts_slope,"X",sep="")), lty=1, col=mycol)
}

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php


Re: [R] glmnet with binary predictors

2011-02-03 Thread Nick Sabbe
Hello Sambit.

Step1:
Create a matrix out of your predictor data, having columns for every
predictor, coding 1 for yes and 0 for no. The matrix should have a row for
each observation (called pred.mat below).
Besides that, you need a vector with the outcome variable for each
observation (best if this is a factor with 2 levels) (called out.v below).
Step2:
Because you are working with categorical variables, don't forget to always
use standardize = FALSE in any call to the glmnet functions (see the
docs).
Step3:
To see how the predictor coefficients move over different values of your
penalization parameter, simply do something like
myLognet <- glmnet(x=pred.mat, y=out.v, standardize = FALSE,
family="binomial")
and then
plot(myLognet, xvar="lambda", label = TRUE)
Note: the labels in the plot indicate column numbers in pred.mat
Step4:
To find the 'best' value of the penalization parameter, use cv.glmnet with
the same parameters plus a type (see ?cv.glmnet). Note: if the criterion you
want is not provided 'out of the box', it will take you quite a bit of
coding, so if you can, take one of the provided ones.
Visually, you can select the 'best' value for the penalization parameter
from the plot (see ?plot.cv.glmnet), or you can use some numerical argument
to find the reasonable extreme value for the criterion.
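
Step 1 could be done along these lines in base R. The data frame raw and its column names are invented for illustration; only pred.mat and out.v correspond to the names used in the steps above.

```r
# Hypothetical raw data: yes/no answers stored as character strings
raw <- data.frame(history   = c("good", "bad", "good", "bad"),
                  owns_home = c("yes", "no", "no", "yes"),
                  has_card  = c("no", "no", "yes", "yes"),
                  stringsAsFactors = FALSE)

predictors <- c("owns_home", "has_card")

# One column per predictor, one row per observation, coded 1 = yes, 0 = no
pred.mat <- sapply(raw[predictors], function(col) as.numeric(col == "yes"))

# Outcome as a factor with two levels
out.v <- factor(raw$history, levels = c("bad", "good"))

print(pred.mat)
print(out.v)
```

These two objects are what the glmnet call in Step 3 expects as x and y.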

Really boilerplate, I guess.
Good luck.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of sambit rath
Sent: donderdag 3 februari 2011 10:58
To: r-help@r-project.org
Subject: [R] glmnet with binary predictors

Hi Everybody!

I must start with a declaration that I am a sparse user of R. I am
creating a credit scorecard using a dataset which has a variable
depicting actual credit history (good/bad) and 41 other variables of
yes/no type. The procedure I am asked to follow is to use a penalized
logistic procedure for variable selection. I have located the package
glmnet which gives the complete elasticnet regularization path for
logistic models. I want some help in setting up the process.

Can someone point out the basic steps?

Thanks

Sambit

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] Preparing dataset for glmnet: factors to dummies

2011-02-01 Thread Nick Sabbe
Hello list.

For some reason, the makers of glmnet do not accept a dataframe as input.
They expect the input to be a matrix, where the dummies are already
precoded.
Now I have created a sample dataset with
* 11 factor columns with two levels
* 4 factor columns with three levels
* 135 continuous columns (from a standard normal)
* 100 observations (rows)
Say this dataframe is in dfrPredictors.

What I do now, is use the following code:

form<-paste("~",paste(colnames(dfrPredictors), collapse="+"), sep="")
dfrTmp<-model.frame(dfrPredictors, na.action=na.pass)
result<- as.matrix(model.matrix(as.formula(form), data=dfrTmp))[,-1]

This works (although admittedly, I don't understand everything of it).
However, I notice that for this rather limited dataset, this conversion
takes around 0.1 seconds user/elapsed time (on a relatively speedy laptop).

For my current work, I need to do this a lot of times on very similar
dataframes (in fact, they are multiply imputed from the same 'original'
dataframe), so I need all the speed I can get.
Does anybody know of a way that is quicker than the above? Note: because of
other uses of the dataframe, I don't have the option to do this conversion
before the imputation, so I really need the conversion itself to work
quickly.
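
For reference, a more compact spelling of the same conversion on a made-up toy data frame (the name `dfrToy` and its columns are invented here). This only shows what the conversion produces; no claim is made that it is faster than the code above:

```r
# Toy stand-in for dfrPredictors: a 2-level factor, a 3-level factor, a numeric
dfrToy <- data.frame(
  f2 = factor(c("a", "b", "a", "b")),
  f3 = factor(c("x", "y", "z", "x")),
  z  = c(0.1, NA, 0.3, 0.4)
)

# Keep rows with missing values (na.pass), then expand factors to dummies
mf <- model.frame(~ ., data = dfrToy, na.action = na.pass)
mm <- model.matrix(attr(mf, "terms"), data = mf)[, -1, drop = FALSE]

colnames(mm)  # one column per non-reference factor level, plus z
```

If the factor levels are identical across all imputed data frames, the column layout of `mm` is identical too, which may leave room for reusing work between conversions.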

Thanks,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



[R] plot not generic

2011-01-28 Thread Nick Sabbe
Hello list.

 

I was trying to see some of the code for plot.glmnet in package glmnet (this
function name is in the documentation).

After loading the library, I tried the obvious typing in the name, but I
received a message telling me it could not be found.

 

So I fiddled around a little, and noticed that R does not recognize 'plot'
as a generic function, and as such, showMethods does not work.

This seems to conflict with the documentation for plot.

 

So 2 questions:

* How can I find the code of plot.glmnet?

* Why is plot not seen as generic?

 

Thx.

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link:  http://biomath.ugent.be/ http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 




[R] get type functions (was: RE: plot not generic)

2011-01-28 Thread Nick Sabbe
Thanks for that, Vito.

Somehow, I often get lost in the whole slew of similar methods:
* get
* getMethod
* showMethods
* getAnywhere
* methods

Does anybody have a simple list of when to use which one?
And maybe I'm even missing some variants?
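
A small orientation sketch, using `plot.lm` from the stats package as a stand-in for `plot.glmnet` (so that nothing beyond a standard R installation is needed). Roughly: `methods`, `getS3method` and `getAnywhere` deal with S3 methods, while `showMethods`, `getMethod` and `isGeneric` belong to the S4 system in package methods, which is why they do not treat the S3 generic `plot` as generic:

```r
# S3 side: plot() dispatches on class via UseMethod()
s3.methods <- methods(plot)           # list the plot.* methods R knows about
f1 <- getS3method("plot", "lm")       # fetch one S3 method, exported or not
f2 <- getAnywhere("plot.lm")          # search attached packages and namespaces

# S4 side: asks whether an S4 generic called "plot" has been defined;
# typically FALSE unless some loaded package created one
is.s4.generic <- methods::isGeneric("plot")

# get() only looks along the search path, so it misses non-exported objects
found.by.get <- exists("plot.lm")
```

Printing `f1` or `f2` shows the source of the method; `stats:::plot.lm` (three colons) is another way to reach a non-exported object when the package is known.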

Thx.

Nick.

-----Original Message-----
From: Vito Muggeo (UniPa) [mailto:vito.mug...@unipa.it] 
Sent: Friday 28 January 2011 14:42
To: Nick Sabbe
Cc: r-help@r-project.org
Subject: Re: [R] plot not generic

dear Nick,

getAnywhere(plot.glmnet)


Note the message you get when you type

methods(plot)
...
Non-visible functions are asterisked






-- 

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo




[R] Injecting code in a package?

2011-01-28 Thread Nick Sabbe
Dear list,

 

I've had this a few times now, and wonder if this is possible:

I'm using a package, often for plotting something, but I want to tune the
way the plotting goes, in a way that was not foreseen by the maker of the
package.

Now, most of the time, these kinds of R functions (say pkg::plot.something)
call into other R functions (say pkg::plot.something.internal), and it is
these that I want to tinker with.

 

So, my question is: can I replace an R function in a package with  a version
of my own, without having to somehow rebuild the package? I don't just want
a non-package bound copy of the function, I want to make sure that when I
call pkg::plot.something, this works as before, but when, from within this
function, pkg:: plot.something.internal is called, I want it to call _my_
version of it.
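
One possible direction, sketched with base R only. Here `stats::sd` plays the role of pkg::plot.something and `stats::var` (which `sd` calls) plays the role of pkg::plot.something.internal; `my.var`, `override.env` and `my.sd` are names invented for the illustration. Note that this builds a modified copy of the outer function rather than changing the package itself:

```r
# Hypothetical replacement for the internal function: returns a constant
my.var <- function(x, ...) 4

# An environment that sits in front of the package namespace and holds
# the override; everything else is still found in the namespace
override.env <- new.env(parent = asNamespace("stats"))
assign("var", my.var, envir = override.env)

# Copy of the outer function that resolves names through the override first
my.sd <- stats::sd
environment(my.sd) <- override.env

my.sd(c(1, 2, 3))      # goes through my.var
stats::sd(c(1, 2, 3))  # the original is untouched
```

To replace the binding inside the loaded namespace itself, so that the package's own function picks up the new version, `utils::assignInNamespace()` exists; its help page warns that it is not meant for production code.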

 

Any takes?

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link:  http://biomath.ugent.be/ http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 




[R] Using a list as multidimensional indexer

2011-01-20 Thread Nick Sabbe
Hello list.

Another 'puzzle' for which I don't have a clean solution.
Say I have a multidimensional object, e.g.:
Mm<-matrix(1:6, nrow=2, dimnames=list(c("a","b"), c("g","h","i")))
And on the other hand I have a list
Ind<-list("b","g")
This holds, for each dimension, an indexer for that dimension.
Now I would like to get the element pointed at by the list.
The obvious solutions don't seem to work, and I can't seem to get do.call to
call the indexer ('[') on my multidimensional object.

Any suggestions?

Thanks in advance,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



Re: [R] Using a list as multidimensional indexer

2011-01-20 Thread Nick Sabbe
Hm. I got somewhat further:

Ind2<-list(Mm,"b","g")
do.call("[",Ind2)
Seems to work.

However, now I need it one step beyond: in fact, my actual multidimensional
object holds one dimension more than my list holds indexes.
i.e.: I want the equivalent of Mm["a",].

I tried some variants of
Ind3<-list(Mm,"b",NULL)
do.call("[", Ind3)
But all of these return integer(0).

So the actual new question is: how do I pass a 'missing' argument through a
do.call?
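
Two ways this could be done, shown on the example matrix from above. The first sidesteps the missing argument by passing `TRUE`, which selects everything along that dimension; the second puts the empty symbol, obtained with `quote(expr=)`, into the argument list:

```r
Mm <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), c("g", "h", "i")))
Ind <- list("b")   # indexes for all but the last dimension

# Variant 1: TRUE is recycled, so it means "all of this dimension"
res1 <- do.call("[", c(list(Mm), Ind, list(TRUE)))

# Variant 2: pass a genuinely missing argument via the empty symbol
res2 <- do.call("[", c(list(Mm), Ind, list(quote(expr = ))))

res1  # same as Mm["b", ]
```

Adding `drop = FALSE` to the argument list keeps the result a matrix, as it would in a direct call.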

Thanks for any pointers,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove






[R] expand.grid

2011-01-19 Thread Nick Sabbe
Hello list.

 

I feel like an idiot. 

 

There exists a method called expand.grid which, from the documentation,
appears to do just what I want, but then it doesn't, and I can't get it to
behave.

 

Given a dataframe

dfr<-data.frame(c1=c("a", "b", NA, "a", "a"), c2=c("d", NA, "d", "e", "e"),
c3=c("g", "h", "i", "j", "k"))

I would like to have a dataframe with all (unique) combinations of all the
factors present.

 

In fact, I would like a simple solution for these two cases: given the three
factor columns above, I would like both all _possible_ combinations of the
factor levels, and all _present_ combinations of the factor levels (e.g. if
I would do this for the first 4 rows of dfr, it would contain no
combinations with c3="k"). It would also be nice to be able to choose
whether or not NA's are included.

 

I'm convinced that some package holds a readymade solution, and I'm trying
to switch from always writing my own stuff (get the number of levels per
column, then use some apply magic) to using what is there, so thanks for any
hints,

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link:  http://biomath.ugent.be/ http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 




Re: [R] expand.grid

2011-01-19 Thread Nick Sabbe
<slaps self in forehead/>

I appear to have misinterpreted the help: considering that it explicitly
makes note of factors, I wrongly assumed that it would use the levels of a
factor automatically. My bad.

For completeness' sake, my final solution:

getLevels<-function(vec, includeNA=FALSE, onlyOccurring=FALSE)
{
	if(onlyOccurring)
	{
		# re-factoring drops levels that do not occur in vec
		rv<-levels(factor(vec))
	}
	else
	{
		rv<-levels(vec)
	}
	#cat("levels so far: ", rv, "\n")
	if(includeNA && any(is.na(vec)))
	{
		rv<-c(rv,NA)
	}
	#cat("levels with na: ", rv, "\n")
	return(rv)
}

expand.combs<-function(dfr, includeNA=FALSE, onlyOccurring=FALSE)
{
	expand.grid(lapply(dfr, getLevels, includeNA, onlyOccurring))
}
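
A usage sketch for the two functions above, repeated here so the block runs on its own. `stringsAsFactors=TRUE` is spelled out because recent R versions no longer turn strings into factors by default, and `levels()` of a plain character vector is NULL:

```r
getLevels <- function(vec, includeNA = FALSE, onlyOccurring = FALSE) {
  rv <- if (onlyOccurring) levels(factor(vec)) else levels(vec)
  if (includeNA && any(is.na(vec))) rv <- c(rv, NA)
  rv
}
expand.combs <- function(dfr, includeNA = FALSE, onlyOccurring = FALSE) {
  expand.grid(lapply(dfr, getLevels, includeNA, onlyOccurring))
}

dfr <- data.frame(c1 = c("a", "b", NA, "a", "a"),
                  c2 = c("d", NA, "d", "e", "e"),
                  c3 = c("g", "h", "i", "j", "k"),
                  stringsAsFactors = TRUE)

all.possible <- expand.combs(dfr)                              # 2 * 2 * 5 rows
first.four   <- expand.combs(dfr[1:4, ], onlyOccurring = TRUE) # no c3 = "k"
with.na      <- expand.combs(dfr, includeNA = TRUE)            # 3 * 3 * 5 rows
```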

Thx.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-----Original Message-----
From: Berwin A Turlach [mailto:ber...@maths.uwa.edu.au] 
Sent: Wednesday 19 January 2011 11:04
To: Nick Sabbe
Cc: r-help@r-project.org
Subject: Re: [R] expand.grid

G'day Nick,

On Wed, 19 Jan 2011 09:43:56 +0100
Nick Sabbe <nick.sa...@ugent.be> wrote:

> Given a dataframe
> 
> dfr<-data.frame(c1=c("a", "b", NA, "a", "a"), c2=c("d", NA, "d", "e",
> "e"), c3=c("g", "h", "i", "j", "k"))
> 
> I would like to have a dataframe with all (unique) combinations of
> all the factors present.

Easy:

R> expand.grid(lapply(dfr, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j
17  a  d  k
18  b  d  k
19  a  e  k
20  b  e  k


> In fact, I would like a simple solution for these two cases: given
> the three factor columns above, I would like both all _possible_
> combinations of the factor levels, and all _present_ combinations of
> the factor levels (e.g. if I would do this for the first 4 rows of
> dfr, it would contain no combinations with c3="k"). 

R> dfrpart <- lapply(dfr[1:4,], factor)
R> expand.grid(lapply(dfrpart, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j

> It would also be nice to be able to choose whether or not NA's are
> included. 

R> expand.grid(lapply(dfrpart, function(x) c(levels(x),
+   if(any(is.na(x))) NA else NULL)))
     c1   c2 c3
1     a    d  g
2     b    d  g
3  <NA>    d  g
4     a    e  g
5     b    e  g
6  <NA>    e  g
7     a <NA>  g
8     b <NA>  g
9  <NA> <NA>  g
10    a    d  h
11    b    d  h


HTH.

Cheers,

Berwin

== Full address 
Berwin A Turlach  Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)+61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway   
Crawley WA 6009e-mail: ber...@maths.uwa.edu.au
Australiahttp://www.maths.uwa.edu.au/~berwin



Re: [R] Repeating value occurence

2011-01-13 Thread Nick Sabbe
It is not exactly clear from your message what you want.
If you want n random values holding either -1, 0 or 1, use sample(c(-1,0,1),
10, replace=TRUE) or also sample(3, 10, replace=TRUE)-2
If you want n values following the pattern -1, 0, 1, 0 as your example seems
to follow, use
n<-10
pattern<- c(-1,0,1,0)
rep(pattern, ceiling(n/length(pattern)))[1:n]
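
As a side note, `rep` can also do the truncation itself through its `length.out` argument; a minimal sketch:

```r
n <- 10
pattern <- c(-1, 0, 1, 0)

# Repeat the pattern as often as needed and cut off after n values
res <- rep(pattern, length.out = n)
res
```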

If you want a sequence of random real numbers between -1 and 1, use
runif(10, min=-1, max=1)

Here's hoping I haven't just solved your homework...


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Rustamali Manesiya
Sent: Thursday 13 January 2011 5:12
To: r-help@r-project.org
Subject: [R] Repeating value occurence

How can achieve this in R using seq, or rep function

c(-1,0,1,0,-1,0,1,0,-1,0)

The range value is between-1 and 1,  and I want it such that there could be
n number of points between -1 and 1

Anyone? Please help Thanks
Rusty



Re: [R] Numbers in a string

2010-12-15 Thread Nick Sabbe
Hi Felipe,

gsub("[^0123456789]", "", "AB15E9SDF654VKBN?dvb.65")
results in "15965465".
Would that be what you are looking for?
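
The same idea applied to a whole vector, as a small sketch (the extra example strings are made up). `gsub` is vectorised, and `[^0-9]` is an equivalent, shorter character class:

```r
x <- c("AB15E9SDF654VKBN?dvb.65", "no digits here", "tel: 09/264.59.36")

# Delete everything that is not a digit
digits <- gsub("[^0-9]", "", x)

# Optional: convert to numbers; an empty string becomes NA and
# leading zeros are lost
nums <- as.numeric(digits)
```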


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Rainer Schuermann
Sent: Wednesday 15 December 2010 11:19
To: r-help@r-project.org
Subject: Re: [R] Numbers in a string

If your OS is Linux, you might want to look at sed or gawk. They are very
good and efficient for such tasks.
You need it once or as a part of program? 
Some samples would be helpful...
Rgds,
Rainer


 Original Message 
 Date: Wed, 15 Dec 2010 16:55:26 +0800
 From: Luis Felipe Parra felipe.pa...@quantil.com.co
 To: r-help r-help@r-project.org
 Subject: [R] Numbers in a string

 Hello, I have stings which have all sort of characters (numbers, letters,
 punctuation marks, etc) I would like to stay only with the numbers in
 them,
 does somebody know how to do this?
 
 Thank you
 
 Felipe Parra
 

-- 

---

Windows: Just say No.



Re: [R] problem accessing complex list data frames

2010-12-08 Thread Nick Sabbe
Hello Germán.

You probably want something like:
sapply(vmat, function(curMat){
curMat[,999] != 0
})
Or if you want the indices, just surround this with a which.
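
If the goal is the positions in the list of the matrices that have at least one non-zero entry in that column, one option is to reduce each matrix to a single TRUE/FALSE first. The sketch below uses small ordinary matrices and column 3 instead of sparse matrices and column 999, so that it runs without the Matrix package; it is assumed, not tested here, that the same indexing works on the sparse matrices:

```r
# Toy stand-in for the list of sparse matrices
vmat <- list(
  matrix(c(0, 0, 0, 0, 0, 0), nrow = 2),
  matrix(c(0, 0, 0, 0, 5, 0), nrow = 2),
  matrix(c(1, 0, 0, 0, 0, 7), nrow = 2)
)
col <- 3

# One logical per list element: does this matrix have a non-zero in col?
has.nonzero <- vapply(vmat, function(m) any(m[, col] != 0), logical(1))
idx <- which(has.nonzero)   # list indices of the matching matrices

# Row numbers of the non-zero entries, per matching matrix
rows <- lapply(vmat[idx], function(m) which(m[, col] != 0))
```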

HTH.

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Germán Sanchis
Sent: Wednesday 8 December 2010 11:50
To: r-help@r-project.org
Subject: [R] problem accessing complex list data frames

Hi all.

I am currently attempting to build a list of sparse matrixes. That I have
already achieved, by

> vmat <- list()
> for (i in 1:n) {
>   vmat <- c(vmat, sparseMatrix(i,j,x=data)) }

Now I am trying to select those elements from the list where the column e.g.
999 is not null. I can do this for one of the sparse matrices with

> which(vmat[[1]][,999] != 0)

which returns the rows where such column is non-zero.

However, my purpose is to obtain the list indices of the sparse matrices
with such non-zero elements. I tried things like

which(vmat[[]][,999] != 0)
which(vmat[,,999] != 0)
sapply(vmat, which, [,999] != 0)

but none worked... any help will be appreciated!!

Cheers,

German



[R] Dataframe from list of similar lists: not _a_ way, but _the best_ way

2010-12-07 Thread Nick Sabbe
Hi All.

 

I often find myself in this situation:

* Based on some vector (or list) of values, I need to calculate a
few new values for each of them, where some of the new values are numbers,
but some are more of descriptive nature (so: character strings)

* So I use e.g. sapply, passing a custom function that returns a
list with all the calculated values

* The result of this is: a list (=the return value of sapply) of
lists, that all have the same kind of named values

A silly example:

list.of.lists<-sapply(1:10, function(nr){list(org=nr,
chr=as.character(nr))})

 

It seems rather obvious that the result would be better structured as a
dataframe.

Now I know a few ways to do this (using do.call), but I fear most of these
are rather bad in performance: I suspect all the data is being repetitively
copied which may be slow.

 

So, my question to the specialists:

* Is the above way of working reasonable for this kind of problem?
Or would you suggest otherwise?

* What would be the best (as in: quickest) way of transforming this
list of lists to a dataframe? The answer to this is probably based upon
knowledge of the inner workings of R? Or is there any way in which this
depends on the specifics of my function (for nontrivial functions and list
sizes)?
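
One candidate, sketched on the silly example: collect the results with `lapply`, so that the result really is a list of lists, and then build each column in a single pass instead of binding row by row. Whether this is the quickest option has not been measured here:

```r
# A genuine list of lists, one inner list per input value
rows <- lapply(1:10, function(nr) list(org = nr, chr = as.character(nr)))

# Build each column once; vapply also checks the type of every element
dfr <- data.frame(
  org = vapply(rows, function(r) r$org, integer(1)),
  chr = vapply(rows, function(r) r$chr, character(1)),
  stringsAsFactors = FALSE
)

# For comparison: do.call(rbind, lapply(rows, as.data.frame)) gives a similar
# result, but creates and copies one small data frame per element
```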

 

Thanks!

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link:  http://biomath.ugent.be/ http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 




Re: [R] small problem in coding

2010-11-26 Thread Nick Sabbe
Hello Mike.

I'm not clear on why you would want to state your g parameter (In fact, I
don't even know for sure what you mean by that) with
> lamda<-c(g=0.2)

If you want a variable (vector) g containing 0.2, why don't you simply do:
> g<-0.2
If you need that lamda thing for some reason later on, you can always do:
> lamda<-c(g=g)
afterwards to get the same effect.

If you have some reason not to do this:
With your statement, you create a vector lamda, with one item in it, and
that first and only item is named g.
So from your statement, you can access g by: 
lamda["g"]
as in:
> Q<-exp(lamda["g"])
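
A short sketch of the difference between the two ways of indexing a named vector, in case the name attached to the result matters later on:

```r
lamda <- c(g = 0.2)

Q1 <- exp(lamda["g"])    # single brackets keep the name "g" on the result
Q2 <- exp(lamda[["g"]])  # double brackets give a plain, unnamed number

# A bare g still does not exist at this point; only lamda does
g.exists <- exists("g", inherits = FALSE)
```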

It looks like you've got a misunderstanding of how R variables work, but
maybe I just misunderstood your question...

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Mike Gibson
Sent: Friday 26 November 2010 9:13
To: r-help@r-project.org
Subject: [R] small problem in coding


I must be missing something.  
 
I first state my g parameter with:  
 
> lamda<-c(g=0.2)

However, when I do the next step R is telling me "object g not found". Here
is my next step:  
 
> Q<-exp(g)
 
 
???  
 
Any help would be greatly appreciated.  
 
Mike  


Re: [R] Simple Function

2010-11-10 Thread Nick Sabbe
Hi Nikos.

There is quite a bit going on here, both in the code and in your
terminology.
You should really consider reading An introduction to R that comes with
your R installation.

A few pointers though:
* in R speak, you have nowhere declared 2 global matrices: it is not
completely clear why you use code like y<-c(NA) to try to achieve such a
thing, but if I'm not mistaken, this creates a logical vector of length 1.
Surely not a matrix.
* operator <- only works on variables in the environment in which it is
evaluated, as does = (note: I would advise you to use <- in R as an
assignment operator instead of =). If you want to change variables in other
environments, particularly the global environment, you need to use <<-
(?<<- does not seem to work to get you to its help page, but open R help,
then find the search page and search for <<-, for more information).
* apart from that: you may want to avoid the for loop here altogether:
y[i:10]<-(i:10)+1
f[i:10]<-y[(i-1):9]/2
gives you the same result, but more in the R fashion (in general, you want
to avoid explicit for loops in R)
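
An alternative to changing global variables from inside the function is to let the function return what it computed. A sketch of that, where the argument `n` and the returned list are choices made for this illustration:

```r
# Returns y and f instead of modifying variables outside the function.
# Assumes i >= 2, as in the original call mat(2).
mat <- function(i, n = 10) {
  y <- rep(NA_real_, n)
  f <- rep(NA_real_, n)
  y[i:n] <- (i:n) + 1
  f[i:n] <- y[(i - 1):(n - 1)] / 2
  list(y = y, f = f)
}

res <- mat(2)
res$y[4]
res$f[5]
```

The caller then decides where the results go, e.g. `y <- res$y`.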

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of rnick
Sent: Wednesday 10 November 2010 7:51
To: r-help@r-project.org
Subject: [R] Simple Function


Hi guys,

Very new to R and your help would be highly appreciated for the following
problem. I am trying to create a simple function which registers values
within an array through a for loop. These are the steps I have followed:

1) Declared 2 global matrices
2) Create function mat() with i as an input
3) constructed the for loop
4) called mat(2)

The problem is that when i try to get y[4] and f[5] the output is: [1] NA
 
my concern is that i am not addressing any of the following topics:
1) definition of global variable
2) the argument does not go through the for loop
3) the matrices definition is not correct
4) other

Please check my code below:

y=c(NA)
f=c(NA)
mat<-function(i)
{
for (k in i:10)
{
y[k]=k+1
f[k]=y[k-1]/2   
}
}
mat(2)

Any thoughts or recommendations would be highly appreciated.

Thanks in advance,

N 
-- 
View this message in context:
http://r.789695.n4.nabble.com/Simple-Function-tp3035572p3035572.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Help with Iterator

2010-11-09 Thread Nick Sabbe
I guess what you want is:
- change the line :
xnew<-f(xold,data)
into
xnew<-f(xold,data, itel)
- change your mat function to take itel as an extra parameter:
mat<-function (x, data=NULL, itel) {return (1+x^itel)}

That should do the trick (though I haven't checked whether the rest of your
code is OK)
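
Put together, the two changes could look like the sketch below. The verbose output is simplified compared to the original, and nothing else about the iterator has been reviewed:

```r
supDist <- function(x, y) max(abs(x - y))

myIterator <- function(xinit, f, data = NULL, eps = 1e-6, itmax = 5,
                       verbose = FALSE) {
  xold <- xinit
  itel <- 0
  repeat {
    xnew <- f(xold, data, itel)   # the iteration counter is passed explicitly
    if (verbose) cat("Iteration:", itel, "xold:", xold, "xnew:", xnew, "\n")
    if (supDist(xold, xnew) < eps || itel == itmax) return(xnew)
    xold <- xnew
    itel <- itel + 1
  }
}

# The update function now receives itel as its third argument
mat <- function(x, data = NULL, itel) 1 + x^itel
res <- myIterator(3, f = mat)
```

With this particular `mat` the values grow very quickly, so the loop stops because `itmax` is reached rather than because of convergence.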


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of zhiji19
Sent: Tuesday 9 November 2010 9:02
To: r-help@r-project.org
Subject: [R] Help with Iterator


Dear Experts,

The following is my Iterator. When I try to write a new function with
itel, I got error. 

This is what I have: 
> supDist<-function(x,y) return(max(abs(x-y)))
> 
> myIterator <- function(xinit,f,data=NULL,eps=1e-6,itmax=5,verbose=FALSE) {
+ xold<-xinit
+ itel<-0
+ repeat {
+   xnew<-f(xold,data)
+   if (verbose) {
+ cat(
+ "Iteration: ",formatC(itel,width=3, format="d"),
+ "xold: ",formatC(xold,digits=8,width=12,format="f"),
+ "xnew: ",formatC(xnew,digits=8,width=12,format="f"),
+ "\n"
+ )
+ }  
+ if ((supDist(xold,xnew) < eps) || (itel == itmax)) {
+   return(xnew)
+ }
+   xold<-xnew; itel<-itel+1
+   }
+ }
> 
> mat<-function (x, data=NULL) {return (1+x^itel)}
> myIterator(3, f=mat, verbose=TRUE)
Error in f(xold, data) : object 'itel' not found


Can anyone please help me to fix the error?
-- 
View this message in context:
http://r.789695.n4.nabble.com/Help-with-Iterator-tp3033254p3033254.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to eliminate this for loop ?

2010-11-09 Thread Nick Sabbe
I doubt this to be true.
Try this in R:
> dmy<-rep(1,5)
> dmy[2:5]<-dmy[1:4]+1
This is equivalent to what you propose (even simpler), but it does not, as
OP seems to have wanted, fill dmy with 1,2,3,4,5, but, as I had expected,
with 1,2,2,2,2.

I would be interested in knowing what exactly the difference between my
example above and the one you suggest is.

As others have suggested: another way is to use actual recursive calls, but
I seriously doubt these to be more efficient. You should probably only use
it if you really hate to type the word 'for' (-: Though I would also like to
see an example where they prove to be the better way to go (by any criteria,
but preferably speed or perhaps other resource usage)
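
For what it is worth, base R's `Reduce` with `accumulate=TRUE` can express such a recurrence without a visible `for`; it still walks through the elements one by one internally, so no speed gain is claimed. A sketch with made-up values for b and the second vector (called `cc` here, following the advice not to name a vector c):

```r
b  <- 0.5
cc <- c(1, 2, 3, 4)     # plays the role of OP's vector c
N  <- length(cc) + 1

# Reference: the explicit loop from the original question
a.loop <- numeric(N)
a.loop[1] <- 1
for (i in 2:N) a.loop[i] <- a.loop[i - 1] * b - cc[i - 1]

# Same recurrence: each step receives the previous result and the next cc
a.reduce <- Reduce(function(prev, ci) prev * b - ci, cc,
                   init = 1, accumulate = TRUE)
```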


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Monday 8 November 2010 15:04
To: Nick Sabbe
Cc: 'PLucas'; r-help@r-project.org
Subject: Re: [R] How to eliminate this for loop ?


On Nov 8, 2010, at 4:30 AM, Nick Sabbe wrote:

> Whenever you use a recursion (that cannot be expressed otherwise), you
> always need a (for) loop.

Not necessarily true ... assuming a is of length n:

a[2:n] <- a[1:(n-1)]*b + cc[1:(n-1)]
# might work if b and n were numeric vectors of length 1 and cc had  
length = n. (Never use c as a vector name.)
# it won't work if there are no values for the nth element at the  
beginning and you are building up a element by element.

And you always need to use operations that are appropriate to the object  
type. So if a really is a list, this will always fail since  
arithmetic does not work on list elements. If on the other hand, the  
OP were incorrect in calling this a list and a were a numeric  
vector, there might be a chance of success if the rules of indexing  
were adhered to. The devil is in the details and the OP has not  
supplied enough code to tell what might happen.

-- 
David.

 Apply and the like do not allow to use the intermediary results  
 (i.e. a[i-1]
 to calculate a[i]).

 So: no, it cannot be avoided in your case, I guess.


 Nick Sabbe
 --
 ping: nick.sa...@ugent.be
 link: http://biomath.ugent.be
 wink: A1.056, Coupure Links 653, 9000 Gent
 ring: 09/264.59.36

 -- Do Not Disapprove



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
 ] On
 Behalf Of PLucas
 Sent: maandag 8 november 2010 10:26
 To: r-help@r-project.org
 Subject: [R] How to eliminate this for loop ?


 Hi, I would like to create a list recursively and eliminate my for  
 loop :

 a-c()
 a[1] - 1; # initial value
 for(i in 2:N) {
   a[i]-a[i-1]*b - c[i-1] # b is a value, c is another vector
 }


 Is it possible ?

 Thanks
 -- 
 View this message in context:

http://r.789695.n4.nabble.com/How-to-eliminate-this-for-loop-tp3031667p30316
 67.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT



Re: [R] How to eliminate this for loop ?

2010-11-08 Thread Nick Sabbe
Whenever you use a recursion (that cannot be expressed otherwise), you
always need a (for) loop.
Apply and the like do not let you use intermediate results (i.e. a[i-1]
to calculate a[i]).

So: no, it cannot be avoided in your case, I guess.
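
That said, the loop can at least be hidden. A minimal sketch, assuming made-up values for N, b and the vector (called cc here, since c is also the name of a base function): base R's Reduce() with accumulate = TRUE passes each intermediate result on to the next step, which is what this recurrence needs. It still iterates internally, so it is a different way of writing the loop rather than a speed-up.

```r
# Hypothetical inputs; the original post did not give any
N  <- 6
b  <- 0.5
cc <- c(1, 2, 3, 4, 5, 6)

# Reference result: the explicit loop, with a preallocated vector
a_loop <- numeric(N)
a_loop[1] <- 1
for (i in 2:N) {
    a_loop[i] <- a_loop[i - 1] * b - cc[i - 1]
}

# Same recurrence: 'prev' is a[i-1], 'ci' is cc[i-1], init is a[1]
a_red <- Reduce(function(prev, ci) prev * b - ci,
                cc[seq_len(N - 1)], init = 1, accumulate = TRUE)

print(all.equal(a_loop, a_red))
```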


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of PLucas
Sent: maandag 8 november 2010 10:26
To: r-help@r-project.org
Subject: [R] How to eliminate this for loop ?


Hi, I would like to create a list recursively and eliminate my for loop :

a <- c()
a[1] <- 1; # initial value
for(i in 2:N) {
a[i] <- a[i-1]*b - c[i-1] # b is a value, c is another vector
}


Is it possible ?

Thanks
-- 
View this message in context:
http://r.789695.n4.nabble.com/How-to-eliminate-this-for-loop-tp3031667p3031667.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Random Integer Number in Uniform Distribution

2010-10-25 Thread Nick Sabbe
Check ?sample.
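
A short sketch of what that looks like for the range in the question (integers from 1 to 100). The seed is only there to make the sketch reproducible; the floor(runif()) line is an alternative sometimes used and is included as an assumption about what might suit, not as the recommended way.

```r
set.seed(123)  # only for reproducibility of this sketch

# Ten integers drawn uniformly from 1..100, repeats allowed
x1 <- sample(1:100, 10, replace = TRUE)

# sample.int() does the same without building the vector 1:100 first
x2 <- sample.int(100, 10, replace = TRUE)

# Alternative: discretise a continuous uniform on [1, 101)
x3 <- floor(runif(10, min = 1, max = 101))

print(x1)
```

Without replace = TRUE, sample() draws without replacement, so no value can occur twice.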


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Gundala Viswanath
Sent: maandag 25 oktober 2010 8:38
To: r-h...@stat.math.ethz.ch
Subject: [R] Random Integer Number in Uniform Distribution

Is there a way to do it? At best what I can achieve
is non integer:

 runif(10, min=1, max=100)
 [1] 51.959151 56.654146 63.630251  3.172794  4.073018 11.977437 86.601869
 [8] 75.788618 11.734361  6.770962


-G.V.



Re: [R] if statement and truncated distribution

2010-10-25 Thread Nick Sabbe
What I guess you want is something like (this is for zero-truncation):

rZeroTruncNormal1d <- function(mu, sig, invalidSign) # sig holds the standard deviation!
{
    val <- rnorm(1, mu, sig)
    # invalidSign is +1 or -1: redraw while val has the sign that is not allowed
    while(val * invalidSign > 0)
    {
        val <- rnorm(1, mu, sig)
    }
    return(val)
}
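
If many values are needed, a vectorised alternative to the rejection loop is the inverse-CDF method: draw a uniform number on the part of (0, 1) that corresponds to the allowed side of zero, then map it back with qnorm(). This is a different technique from the function above, shown as a sketch; the name rTruncAtZero and the example vector y are made up for illustration.

```r
# y[i] == 1 requests a draw from (0, Inf), anything else from (-Inf, 0)
rTruncAtZero <- function(y, mu = 0, sig = 1) {
    p0 <- pnorm(0, mu, sig)            # probability mass below zero
    lower <- ifelse(y == 1, p0, 0)     # lower end of the allowed range of u
    upper <- ifelse(y == 1, 1, p0)     # upper end of the allowed range of u
    u <- runif(length(y), lower, upper)
    qnorm(u, mu, sig)
}

set.seed(1)
y <- c(0, 1, 1, 0, 1)
v <- rTruncAtZero(y)
print(cbind(y, v))
```

This matches the intent of the question (v[i] negative when y[i] is 0, positive when y[i] is 1) without any if statement inside a loop.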


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Sally Luo
Sent: maandag 25 oktober 2010 2:01
To: r-help@r-project.org
Subject: [R] if statement and truncated distribution

Hi R helpers,

I am trying to use the if statement to generate a truncated random variable
as follows:

if (y[i]==0)  { v[i] ~ rnorm(1,0,1) | (-inf ,0) }
if (y[i]==1) { v[i] ~ rnorm(1,0,1) | (0, inf) }

I guess I cannot use  | (  , )  to restrict the range of a variable in R.
Could you let me know how to write the code correctly in R?

Many thanks for your help.

Maomao



Re: [R] Find index of a string inside a string?

2010-10-25 Thread Nick Sabbe
For simple searches, use grep with fixed=TRUE.
Check ?grep.
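
Note that grep itself reports which elements of a vector match, not where in the string the match starts. For the position, regexpr (documented on the same help page) is the closer counterpart to SAS's index. A small sketch using the strings from the question:

```r
# Position of the first occurrence of "bcd" inside "abcd"
pos <- regexpr("bcd", "abcd", fixed = TRUE)
start <- as.integer(pos)               # 2
len <- attr(pos, "match.length")       # 3

# Using the position in substr, as in the question
found <- substr("abcd", start, start + len - 1)   # "bcd"

# No match gives -1 (not 0 as in SAS)
not_found <- as.integer(regexpr("xyz", "abcd", fixed = TRUE))

print(c(start = start, not_found = not_found))
```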


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of yoav baranan
Sent: maandag 25 oktober 2010 13:27
To: r-help@r-project.org
Subject: [R] Find index of a string inside a string?


Hi, 
I am searching for the equivalent of the function Index from SAS. 

In SAS: index("abcd", "bcd") will return 2 because "bcd" is located in the 2nd
cell of the "abcd" string. 
The equivalent in R should do this:
 myIndex <- foo("abcd", "bcd") # returns 2. 
What is the function that I am looking for?

I want to use the return value in substr, like I do in SAS.

thanks, y. baranan.
  


Re: [R] printing a variable during a loop

2010-10-22 Thread Nick Sabbe
At least in the Windows version, there is an option in the menu that might 
resolve your issue:
In Rgui, under Misc, there is the option "Buffered Output", which is checked by 
default.
Unchecking it seems to make sure that messages, print statements and cat output 
are rendered immediately.
A likely consequence is that your code will run somewhat slower.

For using some output as 'progress control' you definitely want to turn the 
option off.
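
If changing the Rgui option is not convenient, the flush.console() approach suggested further down the thread does the same job from within the code. Below is a self-contained sketch of the OP's loop; dat.bat, BL, block and dat.stat are small made-up stand-ins, since the real data were never posted.

```r
# Hypothetical stand-in data
set.seed(42)
block <- 1:5
BL <- rep(block, each = 40)
dat.bat <- data.frame(id = seq_along(BL), value = rnorm(length(BL)))

probs <- c(0, 0.025, 0.25, 0.5, 0.75, 0.975, 1)
dat.stat <- matrix(NA_real_, nrow = length(block), ncol = 8)
dat.stat[, 1] <- block

for (i in seq_along(block)) {
    dat.stat[i, 2:8] <- quantile(dat.bat[BL == block[i], 2], probs = probs)
    message("Finished block ", i, " of ", length(block))
    flush.console()   # push pending output to the console right away
}
```

With tens of thousands of iterations it is probably enough to report every 100th or 1000th value of i, e.g. by wrapping the two output lines in if (i %% 1000 == 0).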

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of j.delashe...@ed.ac.uk
Sent: vrijdag 22 oktober 2010 13:00
To: Joshua Wiley
Cc: R-help
Subject: Re: [R] printing a variable during a loop


Thank you for this!

I had also wanted in the past to do this, and ended up writing dummy  
files with informative names to a folder I set to collect these  
messages, so I'd check the folder to see the new files being  
generated... It did the job, and at the same time I could see how long  
it took for my program to reach certain points (file creation time)  
but not the most elegant! I didn't know about flush.console()

However I used that approach to generate some diagnostic files, so  
that if a complex process broke (sometimes it involved several system  
calls to external programs) I had good information about what the  
program was doing and at what stage it failed. I created a vector to  
store the names of all the files being generated and they could be  
removed automatically afterwards.

Not what the OP wanted, but this strategy may be useful for certain tasks.

Jose


Quoting Joshua Wiley jwiley.ps...@gmail.com:

 On Thu, Oct 21, 2010 at 12:03 PM, David Winsemius
 dwinsem...@comcast.net wrote:

 On Oct 21, 2010, at 8:58 PM, Antonio Olinto wrote:

 Thanks Adrienne, but I am still in doubt. The behavior of print and message
 looks the same.

 Nothing is displayed on the screen after minutes of routine processing .
 All values of i are displayed only when I press the stop button   
 (I'm under
 Windows) or when i reaches the maximum value.


 In the past people have needed to use flush.console() to get output to the
 screen. Unable to test since A) I'm not running your OS, and B) no
 reproducible example offered.

 I am running your OS (though it would also be nice if you reported the
 results of sessionInfo() ).  In any case, this worked for me on R
 2.12.0 (i386-pc-mingw32):

 for(i in 1:6) {Sys.sleep(3); print(i); flush.console()}

 For your problem, I imagine something like (though untested because no data):

 for (i in 1:23194) {
 dat.stat[i,c(2:8)] <- quantile(dat.bat[BL==block[i],2],prob=c(0,0.025,0.25,0.5,0.75,0.975,1))
 print(i)
 flush.console()
 }


 Thanks again,

 Antônio Olitno


 Citando Adrienne Wootten amwoo...@ncsu.edu:

 instead of print use this

 message(i)

 the message command is used for things like this and it will print the
 value
 of i as you are looping through, but you can also do this:

 message("Counter value is: ", i)

 which returns for i = 20 for example

 Counter value is: 20

 for more check out the message help section in the html

 ? message


 Adrienne Wootten
 NCSU

 On Thu, Oct 21, 2010 at 2:05 PM, Antonio Olinto
 aolint...@bignet.com.br wrote:

 Hello,

 About looping, consider the example:

 for (i in 1:23194) {


 dat.stat[i,c(2:8)] <- quantile(dat.bat[BL==block[i],2],prob=c(0,0.025,0.25,0.5,0.75,0.975,1))
 print(i)
 }

 I'd like to have the value of i printed for each loop (step). As I
 could
 see the values of i are shown on screen only after all the work is
 done.

 Thanks in advance for any suggestion.

 Best regards,

 Antonio

 --

 David Winsemius, MD
 West Hartford, CT





 --
 Joshua Wiley
 Ph.D. Student, Health Psychology
 University of California, Los Angeles
 http://www.joshuawiley.com/





-- 
Dr. Jose I. de las Heras                  Email: j.delashe...@ed.ac.uk
The Wellcome Trust Centre for Cell Biology  Phone: +44 (0)131 6507095
Institute for Cell & Molecular Biology    Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

__
R-help@r-project.org