Re: [R] enableJIT(2) causes major slow-up in rpart

2012-04-13 Thread Tal Galili
Thank you very much Luke,

With regards,
Tal


Contact Details:
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)




On Fri, Apr 13, 2012 at 11:58 PM,  wrote:

> The level 2 is a heuristic meant to help with certain kinds of
> programming idioms. It isn't always going to work.  In this case
> trace(cmpfun) will show three functions being compiled each time
> through. Not sure why -- I'll try to find out and see if it can be
> avoided.
>
> luke
>
>
> On Thu, 12 Apr 2012, Tal Galili wrote:
>
>> Hello,
>>
>> While exploring the JIT capabilities offered through the {compiler}
>> package, I found that using enableJIT(2) can *slow* the rpart
>> function (from the {rpart} package) by a factor of about 10.
>>
>> Here is an example code to run:
>>
>> library(rpart)
>> require(compiler)
>>
>> enableJIT(0) # just making sure that JIT is off
>> # We could also use enableJIT(1) and it would be fine
>> fo <- function() {rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)}
>> system.time(fo())
>> #   user  system elapsed
>> #      0       0       0   # this can also be 0.01 sometimes.
>>
>> enableJIT(2)  # also happens for enableJIT(3)
>> system.time(fo())
>> #   user  system elapsed
>> #   0.12    0.00    0.12
>>
>>
>> Which brings me to my *questions*:
>>
>> 1) Is this a bug or a feature?
>> 2) If this is a feature, what is causing it? (or put another way, can one
>> predict ahead of time the implications of using enableJIT(2) or
>> enableJIT(3) on his code?)
>>
>>
>> *Links*:
>>
>> A post I recently wrote about my exploration of JIT -
>> www.r-statistics.com/2012/04/speed-up-your-r-code-using-a-just-in-time-jit-compiler/
>> The question asked on SO regarding the limitations of JIT:
>> http://stackoverflow.com/questions/10106736/possible-shortcomings-for-using-jit-with-r
>>
>> Thanks,
>> Tal
>>
>>
>>
>> Contact Details:
>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> --
> Luke Tierney
> Chair, Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:  319-335-3386
> Department of Statistics and        Fax:    319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
>
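
For anyone who wants to check the slowdown on their own machine, here is a minimal timing sketch. It assumes the rpart package is installed; time_at_jit_level is a hypothetical helper name, not something provided by {compiler}. The numbers depend heavily on the R version, so no timings are shown, and the factor of 10 reported above may not reproduce.

```r
library(compiler)
library(rpart)

# The function from the original post
fo <- function() rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Hypothetical helper: time `reps` calls of f() at a given JIT level,
# then restore whatever level was active before.
time_at_jit_level <- function(f, level, reps = 20) {
  old <- enableJIT(level)              # enableJIT() returns the previous level
  on.exit(enableJIT(old), add = TRUE)  # always restore the old level
  unname(system.time(for (i in seq_len(reps)) f())["elapsed"])
}

timings <- sapply(c(level0 = 0, level2 = 2),
                  function(lv) time_at_jit_level(fo, lv))
print(timings)
```

Timing several repetitions rather than one call makes the comparison less sensitive to the 0.01 s resolution seen in the original output.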



Re: [R] Tobit Fixed Effects

2012-04-13 Thread Arne Henningsen
Hi Felipe

On 22 March 2012 10:13, Felipe Nunes  wrote:
> I'm using censReg() to run a random effects model, but R is having an error.
> Can you help me understanding why?
>
> When I run this model, everything works fine:
>
> tob1 <- censReg(transfers.cap ~ factor(year) + district2 + gdp.cap - 1,
> left=0, right=Inf, method="BFGS", nGHQ=15, iterlim=1, data = d3)
>
> But this one, does not:
>
> tob2 <- censReg(transfers.cap ~ factor(year) + I(constituency.coa.v/100) +
> district2 - 1, left=0, right=Inf, method="BFGS", nGHQ=15, iterlim=1,
> data = d3)
>
> The error is:
>
> Error in solve.default(OM) :
>   system is computationally singular: reciprocal condition number =
> 4.41531e-17

Did you solve this problem in the meantime?

If not, I could take a look at it if you send me a reproducible example.

Best wishes,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name



Re: [R] A little exercise in R!

2012-04-13 Thread Justin Haynes
Since I thought this was a cool question, I posted it to StackOverflow.
Vincent Zoonekynd's answer is amazing and really exercises the power of R.


http://stackoverflow.com/questions/10150161/ordering-117-by-perfect-square-pairs/10150797#10150797



On Fri, Apr 13, 2012 at 10:06 PM, Bert Gunter wrote:

> ... and a moment's more consideration immediately shows it cannot be
> done for n = 18, since 16,17, and 18 cannot all be at an end.
>
> -- Bert
>

Re: [R] how to divide data by week

2012-04-13 Thread Özgür Asar
Dear Stefano,

A practical way might be the following:

R> acc <- read.table("acc.txt", header=TRUE) # reading your data into R
R> acc.may <- acc[acc$month == 5, ] # subsetting data with respect to May
R> mean(acc.may$Freq[acc.may$day %in% 1:7]) # mean over days 1 to 7, all years
R> mean(acc.may$Freq[acc.may$day %in% 8:14]) # mean over days 8 to 14, all years

This script handles only May, and each weekly mean has to be requested by
hand. A simple for loop over months and weeks would automate it.
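
As a rough sketch of how the manual steps could be avoided, one can compute a week index within each month and let aggregate() do the averaging. The data below are made up for illustration (only the column names year, month, day, Freq follow the posted data frame), and "week" here simply means days 1-7, 8-14, 15-21, 22-28 and 29-31 of a month, which is my reading of the question.

```r
# Hypothetical data in the layout of the posted data frame
acc <- data.frame(
  year  = rep(2004:2005, each = 31),
  month = 5,
  day   = rep(1:31, times = 2),
  Freq  = rep(c(1, 3), each = 31)
)

# If 'day' was read as a factor or character, convert it to a number first
acc$day <- as.integer(as.character(acc$day))

# Week within the month: days 1-7 -> 1, 8-14 -> 2, ..., 29-31 -> 5
acc$week <- (acc$day - 1) %/% 7 + 1

# Mean daily frequency per month and week, pooled over all years
weekly <- aggregate(Freq ~ month + week, data = acc, FUN = mean)
weekly
```

With real data covering May to September, the same call returns one row per month and week without any per-month subsetting.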

Best
Ozgur


-

Ozgur ASAR

Research Assistant
Middle East Technical University
Department of Statistics
06531, Ankara Turkey
Ph: 90-312-2105309
http://www.stat.metu.edu.tr/people/assistants/ozgur/


Re: [R] how to divide data by week

2012-04-13 Thread Hasan Diwan
Stefano,

On 13 April 2012 20:51, Stefano Sofia wrote:

> I have a data frame as below specified.
> From the 1st of May to the 30th of September of several years (e.g. from
> 2004 to 2011) I have a frequency of accidents.
> I need the mean of accidents divided by weeks (i.e. the mean of accidents
> from the 1st to the 7th of May of all the years, from the 8th to the 14th
> of May,
> ..., from the 29th to the 31st of May, from the 1st to the 7th of July and
> so on).
> Is there an easy way to do that?
>

Take a look at
https://bitbucket.org/hd1/financeocr/raw/13c4990bd21b/visualizations/dayAmountsHist.R
to get the subsets, and then apply sapply(list, mean, simplify=TRUE). -- H
-- 
Sent from my mobile device


Re: [R] A little exercise in R!

2012-04-13 Thread Bert Gunter
... and a moment's more consideration immediately shows it cannot be
done for n = 18, since 16,17, and 18 cannot all be at an end.
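
The counting argument can be checked with a short sketch: a number that can pair with only one other number has to sit at an end of the line, and a line has only two ends. partner_count is a hypothetical helper name used for illustration.

```r
# For each number in 1..n, count how many *other* numbers in 1..n
# give a perfect square when added to it.
partner_count <- function(n) {
  nbrs <- seq_len(n)
  sums <- outer(nbrs, nbrs, "+")
  is_square <- sqrt(sums) %% 1 == 0
  diag(is_square) <- FALSE   # a number cannot be paired with itself
  rowSums(is_square)
}

which(partner_count(17) == 1)  # 16 17
which(partner_count(18) == 1)  # 16 17 18
```

For n = 17 exactly two numbers are forced to the ends; for n = 18 there are three, so no arrangement exists.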

-- Bert


Re: [R] A little exercise in R!

2012-04-13 Thread Bert Gunter
Folks:

IMHO this is exactly the **wrong** way to go about this. These are
mathematical exercises that should employ mathematical thinking, not
brute force checking of cases.

Consider, for example, the 1 to 17 sequence given by Ted. Then 17
**must** be one end of the sequence and 16 the other. (Why?) Hence,
starting from the 17 end, the values **must** be 17 8 1 ...
Proceeding in this way, it takes only a couple of minutes to solve.

The more interesting point which I think the question was really
about, is can this always be done? I haven't given this any thought,
but there may be an easy proof or counterexample. If the answer to
this latter is no, then perhaps even more interesting is to
characterize the set of numbers where it can/cannot be done.

But this is all way off topic, no?

Cheers,
Bert



On Fri, Apr 13, 2012 at 6:26 PM, Philippe Grosjean
 wrote:
> Hi all,
>
> I got another solution, and it would apply probably for the ugliest one :-(
> I made it general enough so that it works for any series from 1 to n (n not
> too large, please... tested up to 30).
>
> Hint for a better algorithm: inspect the object 'friends' in my code: there
> is a nice pattern appearing there!!!
>
> Best,
>
> Philippe
>
> ..<¡}))><
>  ) ) ) ) )
> ( ( ( ( (    Prof. Philippe Grosjean
>  ) ) ) ) )
> ( ( ( ( (    Numerical Ecology of Aquatic Systems
>  ) ) ) ) )   Mons University, Belgium
> ( ( ( ( (
> ..
>
> findSerie <- function (n, tmax = 500) {
>  ## Check arguments
>  n <- as.integer(n)
>  if (length(n) != 1 || is.na(n) || n < 1)
>    stop("'n' must be a single positive integer")
>
>  tmax <- as.integer(tmax)
>  if (length(tmax) != 1 || is.na(tmax) || tmax < 1)
>    stop("'tmax' must be a single positive integer")
>
>  ## Suite of our numbers to be sorted
>  nbrs <- 1:n
>
>  ## Trivial cases: only one or two numbers
>  if (n == 1) return(1)
>  if (n == 2) stop("The pair does not sum to a square number")
>
>  ## Compute all possible pairs
>  omat <- outer(rep(1, n), nbrs)
>  ## Which pairs sum to a square number?
>  friends <- sqrt(omat + nbrs) %% 1 < .Machine$double.eps
>  diag(friends) <- FALSE # Eliminate pairs of same numbers
>
>  ## Get a list of possible neighbours
>  neigb <- apply(friends, 1, function(x) nbrs[x])
>
>  ## Nbr of neighbours for each number
>  nf <- sapply(neigb, length)
>
>  ## Are there numbers without neighbours?
>  ## then, problem impossible to solve..
>  if (any(!nf))
>    stop("Impossible to solve:\n    ",
>      paste(nbrs[!nf], collapse = ", "),
>      " sum to square with nobody else!")
>
>  ## Are there numbers that can have only one neighbour?
>  ## Must be placed at one extreme
>  toEnds <- nbrs[nf == 1]
>  ## I must have two of them maximum!
>  l <- length(toEnds)
>  if (l > 2)
>    stop("Impossible to solve:\n    ",
>      "More than two numbers form only one pair:\n    ",
>      paste(toEnds, collapse = ", "))
>
>  ## The other numbers can appear in the middle of the suite
>  inMiddle <- nbrs[!nbrs %in% toEnds]
>
>  generateSerie <- function (neigb, toEnds, inMiddle) {
>    ## Allow to generate serie by picking candidates randomly
>    if (length(toEnds) > 1) toEnds <- sample(toEnds)
>    if (length(inMiddle) > 1) inMiddle <- sample(inMiddle)
>
>    ## Choose a number to start with
>    res <- rep(NA, n)
>
>    ## Three cases: 0, 1, or 2 numbers that must be at an extreme
>    ## Following code works in all cases
>    res[1] <- toEnds[1]
>    res[n] <- toEnds[2]
>
>    ## List of already taken numbers
>    taken <- toEnds
>
>    ## Is there one number in res[1]? Otherwise, fill it now...
>    if (is.na(res[1])) {
>        taken <- inMiddle[1]
>        res[1] <- taken
>    }
>
>    ## For each number in the middle, choose one acceptable neighbour
>    for (ii in 2:(n-1)) {
>      prev <- res[ii - 1]
>      allpossible <- neigb[[prev]]
>      candidate <- allpossible[!(allpossible %in% taken)]
>      if (!length(candidate)) break # We fail to construct the serie
>      ## Take randomly one possible candidate
>      if (length(candidate) > 1) take <- sample(candidate, 1) else
>        take <- candidate
>      res[ii] <- take
>      taken <- c(taken, take)
>    }
>
>    ## If we manage to go to the end, check last pair...
>    if (length(taken) == (n - 1)) {
>      take <- nbrs[!(nbrs %in% taken)]
>      res[n] <- take
>      taken <- c(take, taken)
>    }
>    if (length(taken) == n && !(res[n] %in% neigb[[res[n - 1]]]))
>    res[n] <- NA # Last one pair not allowed
>
>    ## Return the series
>    return(res)
>  }
>
>  for (trial in 1:tmax) {
>    cat("Trial", trial, ":")
>    serie <- generateSerie(neigb = neigb, toEnds = toEnds,
>      inMiddle = inMiddle)
>    cat(paste(serie, collapse = ", "), "\n")
>    flush.console() # Print text now
>    if (!any(is.na(serie))) break
>  }
>  if (any(is.na(serie))) {
>    cat("\nSorry, I did not find a solution\n\n")
>  } else cat("

Re: [R] A little exercise in R!

2012-04-13 Thread Petr Savicky
On Fri, Apr 13, 2012 at 10:34:49PM +0100, Ted Harding wrote:
> Greetings all!
> A recent news item got me thinking that a problem stated
> therein could provide a teasing little exercise in R
> programming.
> 
> http://www.bbc.co.uk/news/uk-england-cambridgeshire-17680326
> 
>   Cambridge University hosts first European 'maths Olympiad'
>   for girls
> 
>   The first European girls-only "mathematical Olympiad"
>   competition is being hosted by Cambridge University.
>   [...]
>   Olympiad co-director, Dr Ceri Fiddes, said competition questions
>   encouraged "clever thinking rather than regurgitating a taught
>   syllabus".
>   [...]
>   "A lot of Olympiad questions in the competition are about
>   proving things," Dr Fiddes said.
> 
>   "If you have a puzzle, it's not good enough to give one answer.
>   You have to prove that it's the only possible answer."
>   [...]
>   "In the Olympiad it's about starting with a problem that anybody
>   could understand, then coming up with that clever idea that
>   enables you to solve it," she said.
> 
>   "For example, take the numbers one up to 17.
> 
>   "Can you write them out in a line so that every pair of numbers
>   that are next to each other, adds up to give a square number?"
> 
> Well, that's the challenge: Write (from scratch) an R program
> that solves this problem. And make it neat.

Hi.

Is recursion acceptable? Using recursion, I obtained
two solutions.

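  extend <- function(x)
  {
      # numbers not yet placed in the sequence
      y <- setdiff((1:17), x)
      if (length(y) == 0) {
          # all 17 numbers are placed: print the solution and stop
          cat(x, "\n")
          return(invisible(NULL))
      }
      # keep only candidates whose sum with the last number is a square
      y <- y[(y + x[length(x)]) %in% (1:5)^2]
      for (z in y) {
          extend(c(x, z))
      }
  }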

  for (i in 1:17) extend(i)

  16 9 7 2 14 11 5 4 12 13 3 6 10 15 1 8 17 
  17 8 1 15 10 6 3 13 12 4 5 11 14 2 7 9 16 
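
The same recursion can be written for a general n and made to return the solutions instead of printing them. This is only a sketch of one way to do it: square_chains and its internals are hypothetical names, and the exhaustive search is meant for small n only.

```r
# Return every ordering of 1..n in which adjacent numbers sum to a square.
square_chains <- function(n) {
  # the largest possible sum of two distinct numbers is n + (n - 1)
  squares <- (1:floor(sqrt(2 * n - 1)))^2
  found <- list()

  extend <- function(x) {
    rest <- setdiff(seq_len(n), x)
    if (length(rest) == 0) {
      found[[length(found) + 1]] <<- x   # complete chain: store it
      return(invisible(NULL))
    }
    # candidates whose sum with the last placed number is a square
    nxt <- rest[(rest + x[length(x)]) %in% squares]
    for (z in nxt) extend(c(x, z))
  }

  for (i in seq_len(n)) extend(i)
  found
}

sols <- square_chains(17)
length(sols)                 # 2, the same two chains as above
length(square_chains(18))    # 0
```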

Petr.



[R] how to divide data by week

2012-04-13 Thread Stefano Sofia
Dear list users,
I have a data frame as specified below.
From the 1st of May to the 30th of September of several years (e.g. from 2004 
to 2011) I have a frequency of accidents.
I need the mean of accidents divided by weeks (i.e. the mean of accidents from 
the 1st to the 7th of May of all the years, from the 8th to the 14th of May,
..., from the 29th to the 31st of May, from the 1st to the 7th of July and so 
on).
Is there an easy way to do that?

Thank you for your help
Stefano Sofia


   year month day Freq
1  2004     5   1    3
2  2004     5  10    2
3  2004     5  11    2
4  2004     5  12    2
5  2004     5  13    3
6  2004     5  14    0
7  2004     5  15    2
8  2004     5  16    1
9  2004     5  17    6
10 2004     5  18    1
11 2004     5  19    2
12 2004     5   2    4
13 2004     5  20    0
14 2004     5  21    0
15 2004     5  22    3
16 2004     5  23    4
17 2004     5  24    3
18 2004     5  25    2
19 2004     5  26    2
20 2004     5  27    0
21 2004     5  28    2
22 2004     5  29    3
23 2004     5   3    2
24 2004     5  30    3
25 2004     5  31    7
26 2004     5   4    1
27 2004     5   5    2
28 2004     5   6    3
29 2004     5   7    3
30 2004     5   8    1
31 2004     5   9    1
32 2004     6   1    3
33 2004     6  10    1
34 2004     6  11    3
35 2004     6  12    1
36 2004     6  13    3
37 2004     6  14    1
38 2004     6  15    1



Re: [R] Help - Importing data from txt and xlsx files

2012-04-13 Thread steven mosher
Read your file with readLines() and copy the first few lines here for me
to read:

test <- readLines("your_filename.txt")  # replace with your file name
test[1:5]

Post the result; we can figure it out from there.
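
In addition to looking at the raw lines, count.fields() reports how many fields each line has, which usually pinpoints the line behind a "did not have 15 elements" error. The sketch below uses a small temporary file made up for illustration; with the real file you would pass "Lv2.8.txt" and the separator it actually uses.

```r
# Hypothetical three-line file whose last line has one field too many
tmp <- tempfile(fileext = ".txt")
writeLines(c("a b c", "1 2 3", "4 5 6 7"), tmp)

first_lines <- readLines(tmp, n = 5)   # raw text, as suggested above
n_fields <- count.fields(tmp)          # fields per line, whitespace separated
table(n_fields)

# Lines whose field count differs from the header line
bad_lines <- which(n_fields != n_fields[1])
bad_lines
```

If the file is tab separated and some column names contain spaces, calling count.fields(tmp, sep = "\t") gives the relevant counts instead.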
On Apr 13, 2012 11:46 AM, "AMFTom"  wrote:

> Dear Thierry,
>
> Thanks for your help. Now though, I try to import data from a txt file, and
> it says either
>
> > mydataframe <- read.table("Lv2.8.txt")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
> :
>  line 3 did not have 15 elements
>
> or
>
> > mydataframe <- read.table("Lv2.8.txt", header = TRUE)
> Error in read.table("Lv2.8.txt", header = TRUE) :
>  more columns than column names
>
> even though I seem to have put a column name into the txt file for each
> column. Any ideas?
>
> Thanks again!
>
> Tom
>


Re: [R] Curve fitting, probably splines

2012-04-13 Thread Greg Snow
It sounds as though logsplines may be what you want.  See
the 'oldlogspline' function in the 'logspline' package.
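
A different route from the logspline suggestion, offered only as a sketch: interpolate the cumulative integral of y with a natural cubic spline and take its derivative. The resulting curve reproduces the given interval averages exactly, although it is not guaranteed to be the minimum-curvature solution asked for. The breakpoints and averages below are made up for illustration.

```r
# Hypothetical interval limits 0, x1, x2, ... and the average y per interval
breaks <- c(0, 1, 2.5, 4, 6)
ybar   <- c(2, 3, 1.5, 4)

# Cumulative integral of y at the breakpoints: average times interval width
cumint <- c(0, cumsum(ybar * diff(breaks)))

# Smooth interpolant of the integral; its first derivative is the curve
Fhat <- splinefun(breaks, cumint, method = "natural")
fhat <- function(x) Fhat(x, deriv = 1)

# Check: the average of fhat over each interval matches ybar
avg <- diff(Fhat(breaks)) / diff(breaks)
all.equal(avg, ybar)

fhat(c(0.5, 3))   # curve values at two arbitrary points
```

Because the spline passes through the cumulative integrals, the averages are preserved by construction; the smoothness of fhat comes from the spline.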

On Thu, Apr 12, 2012 at 7:45 AM, Michael Haenlein
 wrote:
> Dear all,
>
> This is probably more related to statistics than to [R] but I hope someone
> can give me an idea how to solve it nevertheless:
>
> Assume I have a variable y that is a function of x: y=f(x). I know the
> average value of y for different intervals of x. For example, I know that
> in the interval[0;x1] the average y is y1, in the interval [x1;x2] the
> average y is y2 and so forth.
>
> I would like to find a line of minimum curvature so that the average values
> of y in each interval correspond to y1, y2, ...
>
> My idea was to use (cubic) splines. But the problem I have seems somewhat
> different to what is usually done with splines. As far as I understand it,
> splines help to find a curve that passes a set of given points. But I don't
> have any points, I only have average values of y per interval.
>
> If you have any suggestions on how to solve this, I'd love to hear them.
>
> Thanks very much in advance,
>
> Michael
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] some questions about sympy (that is, rSymPy)

2012-04-13 Thread Kjetil Halvorsen
see below.


Ok, I am trying now with

> sympyq
function(...) {
   arg <- as.character(match.call(expand.dots=TRUE)[-1])
   arg <- gsub('^', x=arg, replacement='**', fixed=TRUE)
   thiscall   <- as.call(list(as.name("sympy"), arg))
   print( thiscall )
   eval(thiscall, parent.frame() )
}

which gives:

> sympyq( (x+y)**2 )
sympy("(x + y)**2")
[1] "(x + y)**2"

> sympyq ( sin(pi) )
sympy("sin(pi)")
[1] "0"

> sympyq( limit(sin(x)/x, x, 0) )
sympy("limit(sin(x)/x, x, 0)")
[1] "1"

> sympyq( diff(sin(2*x), x, 2) )
sympy("diff(sin(2 * x), x, 2)")
[1] "-4*sin(2*x)"

So this seems to work, but the following I do not understand:

> sympyq( ((x+y)**2).expand() )
Error: unexpected symbol in "sympyq( ((x+y)**2).expand"

> sympyq( sin(x+y).expand(trig=True) )
Error: unexpected symbol in "sympyq( sin(x+y).expand"

???
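
A possible explanation, shown with base R only: the arguments are parsed by R before sympyq ever sees them. R's parser rewrites ** as ^, and a Python method call such as .expand() after a closing parenthesis is not valid R syntax, so the error is raised at parse time. The helper names parses_as_r and as_sympy_text below are hypothetical, and the fallback of accepting a quoted string is just one way around the limitation.

```r
# `**` is turned into `^` by the parser, before any function is called
deparse(quote((x + y)**2))               # "(x + y)^2"

# Does a piece of text parse as R code at all?
parses_as_r <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}
parses_as_r("(x + y)**2")                # TRUE
parses_as_r("((x + y)**2).expand()")     # FALSE: Python method syntax

# Build the text to hand to sympy(): unquoted R-parsable input is deparsed
# and `^` mapped back to `**`; anything else must be given as a string.
as_sympy_text <- function(expr) {
  e <- substitute(expr)
  if (is.character(e)) return(e)
  txt <- paste(deparse(e, width.cutoff = 500L), collapse = " ")
  gsub("^", "**", txt, fixed = TRUE)
}

as_sympy_text((x + y)**2)                     # "(x + y)**2"
as_sympy_text("sin(x+y).expand(trig=True)")   # passed through unchanged
```

So expressions that are also valid R can stay unquoted, while anything using Python-only syntax still needs quotes.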

Kjetil



[R] some questions about sympy (that is, rSymPy)

2012-04-13 Thread Kjetil Halvorsen
I am experimenting with rSymPy, and it seems to work nice.


However, I dislike the need to wrap all sympy expressions within
quotes, it leads to ugly calls like
library(rSymPy)
Var("x,y,z")
sympy("(x+y)**2")
and so on.

Inspired by the function cq from the mvbutils package:
library(mvbutils)
> cq
function (...)
{
as.character(sapply(as.list(match.call(expand.dots = TRUE))[-1],
as.character))
}



I tried to write
> sympyq
function(...) {
   arg <- as.character(match.call(expand.dots=TRUE)[-1])
   thiscall   <- as.call(list(as.name("sympy"), arg))
   print( thiscall ) # for debugging
   eval(thiscall, parent.frame() )
}

Some examples:
(After doing
Var("x,y,z") )
> sympyq(4+4)
sympy("4 + 4")
[1] "8"

> sympyq(3*x+4*y+89*z-6*x)
sympy("3 * x + 4 * y + 89 * z - 6 * x")
[1] "-3*x + 4*y + 89*z"
>

But then:

> sympyq( (x+y)**2 )
sympy("(x + y)^2")
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  :
  Traceback (most recent call last):
  File "", line 1, in 
TypeError: unsupported operand type(s) for ^: 'Add' and 'int'

Note that R has changed the  syntax **2 to ^2, which sympy does not
seem to like!
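
For what it is worth, the change from ** to ^ happens in R's parser, before sympyq() ever sees the expression: ** is accepted as an alternative spelling of ^ and is stored as ^. A minimal illustration in base R (no rSymPy needed):

```r
# The parser stores ** as ^, so both spellings give the same call object.
e <- quote((x + y)**2)
same_call <- identical(e, quote((x + y)^2))

# Deparsing back to text therefore cannot recover the ** spelling.
txt <- deparse(e)
txt
```

Any text handed on to Python is built from the deparsed call, which is why the ^ shows up there.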

Any ideas for avoiding this, or more generally, better ideas for
achieving what I am trying to do?

Kjetil



Re: [R] A little exercise in R!

2012-04-13 Thread Philippe Grosjean

Hi all,

I got another solution, and it probably qualifies as the ugliest one :-(
I made it general enough so that it works for any series from 1 to n (n 
not too large, please... tested up to 30).


Hint for a better algorithm: inspect the object 'friends' in my code: 
there is a nice pattern appearing there!!!


Best,

Philippe

..<¡}))><
 ) ) ) ) )
( ( ( ( (Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons University, Belgium
( ( ( ( (
..

findSerie <- function (n, tmax = 500) {
  ## Check arguments
  n <- as.integer(n)
  if (length(n) != 1 || is.na(n) || n < 1)
    stop("'n' must be a single positive integer")

  tmax <- as.integer(tmax)
  if (length(tmax) != 1 || is.na(tmax) || tmax < 1)
    stop("'tmax' must be a single positive integer")

  ## Suite of our numbers to be sorted
  nbrs <- 1:n

  ## Trivial cases: only one or two numbers
  if (n == 1) return(1)
  if (n == 2) stop("The pair does not sum to a square number")

  ## Compute all possible pairs
  omat <- outer(rep(1, n), nbrs)
  ## Which pairs sum to a square number?
  friends <- sqrt(omat + nbrs) %% 1 < .Machine$double.eps
  diag(friends) <- FALSE # Eliminate pairs of same numbers

  ## Get a list of possible neighbours
  neigb <- apply(friends, 1, function(x) nbrs[x])

  ## Nbr of neighbours for each number
  nf <- sapply(neigb, length)

  ## Are there numbers without neighbours?
  ## then, problem impossible to solve..
  if (any(!nf))
    stop("Impossible to solve:\n",
      paste(nbrs[!nf], collapse = ", "),
      " sum to square with nobody else!")

  ## Are there numbers that can have only one neighbour?
  ## Must be placed at one extreme
  toEnds <- nbrs[nf == 1]
  ## I must have two of them maximum!
  l <- length(toEnds)
  if (l > 2)
    stop("Impossible to solve:\n",
      "More than two numbers form only one pair:\n",
      paste(toEnds, collapse = ", "))

  ## The other numbers can appear in the middle of the suite
  inMiddle <- nbrs[!nbrs %in% toEnds]

  generateSerie <- function (neigb, toEnds, inMiddle) {
    ## Allow to generate serie by picking candidates randomly
    if (length(toEnds) > 1) toEnds <- sample(toEnds)
    if (length(inMiddle) > 1) inMiddle <- sample(inMiddle)

    ## Choose a number to start with
    res <- rep(NA, n)

    ## Three cases: 0, 1, or 2 numbers that must be at an extreme
    ## Following code works in all cases
    res[1] <- toEnds[1]
    res[n] <- toEnds[2]

    ## List of already taken numbers
    taken <- toEnds

    ## Is there one number in res[1]? Otherwise, fill it now...
    if (is.na(res[1])) {
      taken <- inMiddle[1]
      res[1] <- taken
    }

    ## For each number in the middle, choose one acceptable neighbour
    for (ii in 2:(n-1)) {
      prev <- res[ii - 1]
      allpossible <- neigb[[prev]]
      candidate <- allpossible[!(allpossible %in% taken)]
      if (!length(candidate)) break # We fail to construct the serie
      ## Take randomly one possible candidate
      if (length(candidate) > 1) take <- sample(candidate, 1) else
        take <- candidate
      res[ii] <- take
      taken <- c(taken, take)
    }

    ## If we manage to go to the end, check last pair...
    if (length(taken) == (n - 1)) {
      take <- nbrs[!(nbrs %in% taken)]
      res[n] <- take
      taken <- c(take, taken)
    }
    if (length(taken) == n && !(res[n] %in% neigb[[res[n - 1]]]))
      res[n] <- NA # Last one pair not allowed

    ## Return the series
    return(res)
  }

  for (trial in 1:tmax) {
    cat("Trial", trial, ":")
    serie <- generateSerie(neigb = neigb, toEnds = toEnds,
      inMiddle = inMiddle)
    cat(paste(serie, collapse = ", "), "\n")
    flush.console() # Print text now
    if (!any(is.na(serie))) break
  }
  if (any(is.na(serie))) {
    cat("\nSorry, I did not find a solution\n\n")
  } else cat("\n** I got it! **\n\n")
  return(serie)
}

findSerie(17)
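
As a complement to the random trials above, here is a minimal sketch of a deterministic variant: a depth-first search with backtracking over the same "friends" relation (two numbers are friends when their sum is a perfect square). The function name square_path is made up for this sketch, and it makes no claim to be the better algorithm hinted at above.

```r
square_path <- function(n) {
  nbrs <- seq_len(n)
  # TRUE where s is a perfect square (integer arithmetic, no tolerance needed)
  is_sq <- function(s) { r <- floor(sqrt(s) + 0.5); r * r == s }
  # adj[[i]]: all j != i such that i + j is a perfect square
  adj <- lapply(nbrs, function(i) nbrs[nbrs != i & is_sq(i + nbrs)])

  # Try to extend a partial path; returns the full path or NULL on a dead end
  extend <- function(path) {
    if (length(path) == n) return(path)
    last <- path[length(path)]
    for (cand in setdiff(adj[[last]], path)) {
      res <- extend(c(path, cand))
      if (!is.null(res)) return(res)
    }
    NULL
  }

  # Try every possible starting number in turn
  for (start in nbrs) {
    res <- extend(start)
    if (!is.null(res)) return(res)
  }
  NULL  # no arrangement exists
}

p <- square_path(17)
p
```

The search returns NULL when no arrangement exists, so the same function can be used to test other values of n (small ones, since the worst case is exponential).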


On 13/04/12 23:34, (Ted Harding) wrote:

Greetings all!
A recent news item got me thinking that a problem stated
therein could provide a teasing little exercise in R
programming.

http://www.bbc.co.uk/news/uk-england-cambridgeshire-17680326

   Cambridge University hosts first European 'maths Olympiad'
   for girls

   The first European girls-only "mathematical Olympiad"
   competition is being hosted by Cambridge University.
   [...]
   Olympiad co-director, Dr Ceri Fiddes, said competition questions
   encouraged "clever thinking rather than regurgitating a taught
   syllabus".
   [...]
   "A lot of Olympiad questions in the competition are about
   proving things," Dr Fiddes said.

   "If you have a puzzle, it's not good enough to give one answer.
   You

Re: [R] A little exercise in R!

2012-04-13 Thread Justin Haynes
I thought this was kinda cool!  Here's my solution; it's not robust, and
probably not efficient.

I'd love to hear improvements or other solutions!

Justin


sq.test <- function(a, b) {
  ## test for number pairs that sum to squares.
  sqrt(sum(a, b)) == floor(sqrt(sum(a, b)))
}

ok.pairs <- function(n, vec) {
  ## given n as a member of vec,
  ## which other members of vec satisfy sq.test
  vec <- vec[vec!=n]
  vec[sapply(vec, sq.test, b=n)]
}

grow.seq <- function(y) {
  ## given a starting point (y) and a pairs list (pl)
  ## grow the squaring sequence.
  ly <- length(y)
  if(ly == y[1]) return(y)

  ## this line is the one that breaks down on other number sets...
  y <- c(y, max(pl[[y[ly]]][!pl[[y[ly]]] %in% y]))
  y <- grow.seq(y)

  return(y)
}


## start vector
x <- 1:17

## get list of possible pairs
pl <- lapply(x, ok.pairs, vec=x)

## pick start at max since few combinations there.
y <- max(x)
grow.seq(y)



On Fri, Apr 13, 2012 at 2:34 PM, Ted Harding wrote:

> Greetings all!
> A recent news item got me thinking that a problem stated
> therein could provide a teasing little exercise in R
> programming.
>
> http://www.bbc.co.uk/news/uk-england-cambridgeshire-17680326
>
>  Cambridge University hosts first European 'maths Olympiad'
>  for girls
>
>  The first European girls-only "mathematical Olympiad"
>  competition is being hosted by Cambridge University.
>  [...]
>  Olympiad co-director, Dr Ceri Fiddes, said competition questions
>  encouraged "clever thinking rather than regurgitating a taught
>  syllabus".
>  [...]
>  "A lot of Olympiad questions in the competition are about
>  proving things," Dr Fiddes said.
>
>  "If you have a puzzle, it's not good enough to give one answer.
>  You have to prove that it's the only possible answer."
>  [...]
>  "In the Olympiad it's about starting with a problem that anybody
>  could understand, then coming up with that clever idea that
>  enables you to solve it," she said.
>
>  "For example, take the numbers one up to 17.
>
>  "Can you write them out in a line so that every pair of numbers
>  that are next to each other, adds up to give a square number?"
>
> Well, that's the challenge: Write (from scratch) an R program
> that solves this problem. And make it neat.
>
> NOTE: If there should happen to be some R package that can solve
> this kind of problem already, without you having to think much,
> then its use is illegitimate! (I.e. will be deemed "regurgitation").
>
> Over to you.
>
> With best wishes,
> Ted.
>
> -
> E-Mail: (Ted Harding) 
> Date: 13-Apr-2012  Time: 22:33:43
> This message was sent by XFMail
>



[R] Geostatitics 3D Variogram Map

2012-04-13 Thread Murphy, Mark P (AU)
Dear R Helpers

I'm investigating the geostatistics tools in R and have found the package
'gstat', which looks to be useful for two-dimensional data.  However, I
usually deal with three-dimensional information.  I would like to compute a 3D
variogram map using the variogram tool but cannot seem to get it to recognise
the third (z) dimension.

The data file reads ok and the spatial coordinates seem to be correctly applied 
(see result below)

Dt<-read.table(DtFl,header = TRUE,sep = ",",na.strings = "-" )
>
>
> #set spatial coordinates
> coordinates(Dt)= c("X","Y","Z")
> summary(Dt)
Object of class SpatialPointsDataFrame
Coordinates:
 min   max
X  5070.7241  6427.053
Y 12324.2617 13226.178
Z   602.1896   782.880

However when I attempt to compute the variogram map I get an error:

> Vrgrm.Mp.StdNi1 <- variogram(Dt$StdREC1~1, Dt, cutoff = 1500, width = 50, map = TRUE)
Error: dimensions do not match: locations 14397 and data 3780

While the variogram tool is said to handle three-dimensional data, perhaps
the map option does not?



Mark Murphy
Technical Director Mining and Geology
AMEC - Perth





[R] Merging two data frames with different columns names

2012-04-13 Thread Johnny Liseth
I am trying to merge two data frames, but one of the column headings is
different in the two frames. How can I join or rbind the two frames?

Johnny

# Generate 2 blocks by confounding on abc
d1 <- conf.design(c(1,1,1), p=2, block.name="blk", treatment.names =
c("A","B","C"))
d2 <- conf.design(c(1,1,1), p=2, block.name="blk", treatment.names =
c("A","B","C"))

rep1 <- c(550,669,633,642,1037,749,1075,729)
rep2 <- c(604,650,601,635,1052,868,1063,860)

part1 <- data.frame(d1,rep1)
part2 <- data.frame(d2,rep2)

d12 <- rbind(part1,part2)
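
One possible approach, sketched under assumptions: rbind() on data frames requires matching column names, so the response column can be given a common name first, with the replicate recorded in a separate column. The small data frames below are stand-ins for data.frame(d1, rep1) and data.frame(d2, rep2); the column names y and rep are made up for illustration.

```r
# Stand-ins for the two parts; values are illustrative only.
part1 <- data.frame(blk = c(1, 2), A = c(0, 1), rep1 = c(550, 669))
part2 <- data.frame(blk = c(1, 2), A = c(0, 1), rep2 = c(604, 650))

# Give the response the same name in both frames and record the replicate.
names(part1)[names(part1) == "rep1"] <- "y"
names(part2)[names(part2) == "rep2"] <- "y"
part1$rep <- 1
part2$rep <- 2

d12 <- rbind(part1, part2)
d12
```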



[R] predict GLM with offset MASS

2012-04-13 Thread smfa
Hi, 

I know this is probably a stupid question... But I don't seem to find the
answer. 

I'm fitting a GLM with a Poisson family, using MASS, and then tried to get a
look at the predictions, however the offset does seem to be taken into
consideration:

model_glm=glm(cases~rhs(data$year,2003)+lhs(data$year,2003),
offset=(log(population)), data=data, subset=28:36, family=poisson())

predict (model_glm, type="response")

I get cases not rates...

I've tried also

model_glm=glm(cases~rhs(data$year,2003)+lhs(data$year,2003)+
offset(log(population)), data=data, subset=28:36, family=poisson())

with the same results. However when I predict from GAM, using mgcv, the
predictions consider the offset (I get rates).
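
A minimal sketch of how counts and rates relate for a Poisson glm() with an offset in the formula. The data are simulated stand-ins (only the column names mirror the post), and the plain year term replaces the rhs()/lhs() terms, which are not defined in the message.

```r
set.seed(1)
# Simulated stand-in data; values are made up.
d <- data.frame(year = 2000:2008,
                population = round(runif(9, 1e4, 2e4)))
d$cases <- rpois(9, lambda = 0.001 * d$population)

fit <- glm(cases ~ year + offset(log(population)),
           family = poisson(), data = d)

# type = "response" returns expected counts, offset included ...
counts <- predict(fit, type = "response")
# ... so a rate is the count divided by the exposure,
rates <- counts / d$population
# or equivalently a prediction for an exposure of 1 (log(1) = 0).
rates_unit <- predict(fit, newdata = transform(d, population = 1),
                      type = "response")
```

Both routes should give the same rates; the second one relies on the offset being part of the formula so that it is re-evaluated from newdata.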

Am I missing something?

I would appreciate any comment,

thanks

Sandra





Re: [R] list.dirs() full.names broken?

2012-04-13 Thread J Toll
OK, well list.dirs() seems broken to me.  In case someone else needs a
working version, I wrote a new function called lsdir().  It adds the
ability to choose whether to include hidden directories.  It should
work on Mac and probably Linux/Unix.

lsdir <- function(path, format = "relative", recursive = FALSE, all = FALSE) {
  # list directories
  # format is any part of "fullpath", "relative", or "basename"

  # set a path if necessary
  if (missing(path)) {
    path <- "."
  }

  # recursion
  if (recursive == FALSE) {
    argRecursive <- "-maxdepth 1"
  } else if (recursive) {
    argRecursive <- ""
  }

  # piece together system command
  execFind <- paste("find", path, "-type d", argRecursive,
                    "-mindepth 1 -print", sep = " ")

  # execute system command
  tmp <- system(execFind, intern = TRUE)

  # remove .hidden files if all == FALSE
  if (all == FALSE) {
    tmp <- tmp[grep("^\\..*", basename(tmp), invert = TRUE)]
  }

  # match format argument
  format <- match.arg(tolower(format), c("fullpath", "relative", "basename"))

  # format output based upon format argument
  if (format == "basename") {
    out <- basename(tmp)
  } else if (format == "fullpath") {
    out <- normalizePath(tmp)
  } else {
    out <- tmp
  }

  # clean up any duplicate "/" and return
  return(gsub("/+", "/", out))

}
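
For platforms without a find command, a non-recursive variant can be sketched in base R alone. This is an illustration, not a replacement for the function above: ls_dirs is a made-up name, and it assumes an R version whose list.files() has the no.. argument.

```r
# List the immediate subdirectories of `path`, optionally including hidden ones.
ls_dirs <- function(path = ".", all = FALSE) {
  entries <- list.files(path, full.names = TRUE, all.files = all, no.. = TRUE)
  entries[file.info(entries)$isdir]
}

# Small self-contained demonstration in a temporary directory
root <- tempfile("lsdir_demo")
dir.create(root)
dir.create(file.path(root, "visible"))
dir.create(file.path(root, ".hidden"))
invisible(file.create(file.path(root, "a_file.txt")))

basename(ls_dirs(root))
basename(ls_dirs(root, all = TRUE))
```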


James



[R] deep copy?

2012-04-13 Thread Whit Armstrong
Is putting a variable into a list a deep copy (and is tracemem the
correct way to confirm)?

warmstrong@krypton:~/dvl/R.packages$ R
> x <- rnorm(1000)
> tracemem(x)
[1] "<0x3214c90>"
> x.list <- list(x.in.list=x)
tracemem[0x3214c90 -> 0x2af0a20]:
>

Is it possible to put a variable into a list without causing a deep
copy (i.e. if you _really_ want the objects to share the same
underlying memory)?
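
Independently of whether list() copies, one way to get genuinely shared storage is to keep the vector in an environment, since environments have reference semantics. A minimal sketch (the name x_env is made up for illustration):

```r
x_env <- new.env()
x_env$x <- rnorm(1000)

# The list stores a reference to the environment, not a copy of the vector.
x.list <- list(x.in.list = x_env)

# Modify the vector through the list ...
x.list$x.in.list$x[1] <- 0

# ... and the change is visible through the original handle.
x_env$x[1]
```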

-Whit



Re: [R] How to define format of number

2012-04-13 Thread Bert Gunter
Well...

?format

(strangely enough).
Try searching in R at least a wee bit before posting.

-- Bert

On Fri, Apr 13, 2012 at 12:35 PM, mrzung  wrote:
> hi all,
>
> What I want to do is show a number with thousand expression.
>
> I dont know exactly the expression name but here is example.
>
>  1,000
>  10,000,000
>
> is there a way to express a number like that?
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Coding columns for survival analysis

2012-04-13 Thread jim holtman
try this:

> x <- read.table(text = "   tree live1 live2 live3 live4 live5
+1 tree1 0 0 0 1 1
+2 tree2 0 0 1 1 0
+3 tree3 0 1 1 0 0
+4 tree4 1 1 0 0 0
+6 tree4 1 1 1 1 0  # another test condition
+5 tree5 1 0 0 0 0", header = TRUE)
>
> # get matrix of data columns
> z <- as.matrix(x[, -1])
> # process each row
> a <- apply(z, 1, function(.row){
+ # determine where found (will be a 2)
+ found <- pmin(cumsum(.row) + 1, 3) # cannot be greater than 3
+ # determined where it died
+ die <- cumsum(diff(c(0, .row)) != 0)
+ # replace value at die == 2 with 4
+ found[die == 2] <- 4
+ c(NA, "found", "alive", "mort")[found]
+ })
> t(a)  # result
  [,1][,2][,3][,4][,5]
1 NA  NA  NA  "found" "alive"
2 NA  NA  "found" "alive" "mort"
3 NA  "found" "alive" "mort"  "mort"
4 "found" "alive" "mort"  "mort"  "mort"
6 "found" "alive" "alive" "alive" "mort"
5 "found" "mort"  "mort"  "mort"  "mort"
>


On Fri, Apr 13, 2012 at 4:53 PM, Alexander Shenkin  wrote:
> Hello Folks,
>
> I have 5 columns for thousands of tree records that record whether that
> tree was alive or dead.  I want to recode the columns such that the cell
> reads "found" when a live tree is first observed, "alive" for when a
> tree is found alive and is not just found, and "mort" when it was
> previously alive but is now dead.
>
> Given the following:
>
>    > tree_live = data.frame(tree =
> c("tree1","tree2","tree3","tree4","tree5"), live1 = c(0,0,0,1,1), live2
> = c(0,0,1,1,0), live3 = c(0,1,1,0,0), live4 = c(1,1,0,0,0), live5 = c(1,
> 0, 0, 0, 0))
>
>       tree live1 live2 live3 live4 live5
>    1 tree1     0     0     0     1     1
>    2 tree2     0     0     1     1     0
>    3 tree3     0     1     1     0     0
>    4 tree4     1     1     0     0     0
>    5 tree5     1     0     0     0     0
>
> I would like to end up with the following:
>
>    > tree_live_recode
>
>      live1 live2 live3 live4 live5
>    1    NA    NA    NA found alive
>    2    NA    NA found alive  mort
>    3    NA found alive  mort     0
>    4 found alive  mort     0     0
>    5 found  mort     0     0     0
>
> I've accomplished the recode in the past, but only by going over the
> dataset multiple times in messy and inefficient fashion.  I'm wondering
> if there are concise and efficient ways of going about it?
>
> (I haven't been using the Survival package for my analyses, but I'm
> starting to look into it.)
>
> Thanks,
> Allie
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] How to define format of number

2012-04-13 Thread jim holtman
try this:

> prettyNum(1, big.mark = ",")
[1] "100,000,000"
>
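
A short sketch of two base R alternatives for the numbers in the question. Large round numbers may otherwise be printed in scientific notation before the separator is applied, so scientific = FALSE (or an integer format) is set explicitly:

```r
# format() with a thousands separator; scientific = FALSE keeps fixed notation
a <- format(1000, big.mark = ",")
b <- format(10000000, big.mark = ",", scientific = FALSE)

# formatC() with an integer format gives the same result
d <- formatC(10000000, format = "d", big.mark = ",")

c(a, b, d)
```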




On Fri, Apr 13, 2012 at 3:35 PM, mrzung  wrote:

> hi all,
>
> What I want to do is show a number with thousand expression.
>
> I dont know exactly the expression name but here is example.
>
>  1,000
>  10,000,000
>
> is there a way to express a number like that?
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] R Large Dataset Problem

2012-04-13 Thread efulas
I am using the code below:


options(max.print=5.5E5)
x=rep(1,1052)
b=read.fwf(file="efetez.binary", widths=c(6,x),header=FALSE)

and I get the error "C stack usage is too close to the limit". I want to
get my data like this:

molecul id v1  v2   v3 .

19029  1,1,0,1,0,1,0,...
29837  0,1,1,1,1,0,1
.
.
.

However, I can't get it like the above because there are no commas between
the digits of "1000110010", so R reads it as Inf.
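
One possible way around read.fwf() with more than a thousand one-character fields is to read each record as text and split the digit string afterwards. The sketch below assumes the layout implied by the widths above (a 6-character id followed by one character per bit); the two sample records are made up, and with the real file the vector would come from readLines("efetez.binary").

```r
# Made-up records: 6-character id field, then a run of 0/1 characters.
lines <- c(" 190291101010",
           " 298370111101")

# The id is the first 6 characters (leading blanks are ignored by as.integer)
ids <- as.integer(substr(lines, 1, 6))

# Split the remainder into single characters and stack the rows into a matrix
bits <- do.call(rbind,
                lapply(strsplit(substring(lines, 7), ""), as.integer))

res <- data.frame(id = ids, bits)
res
```

Because the bit string is never converted to a single number, it cannot overflow to Inf.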


Many Thanks



[R] Seemingly simple "lm" giving unexpected results

2012-04-13 Thread Gene Leynes
I can't figure out why this is returning an NA for the slope in one case,
but not in the other.

I can tell that R thinks the first case is singular, but why isn't the
second?

## Define X and Y
## There are two versions of x
## 1) "as is"
## 2) shifted to start at 0
y  = c(58, 57, 57, 279, 252, 851, 45, 87, 47)
x1 = c(1334009411.437, 1334009411.437, 1334009411.437, 1334009469.297,
1334009469.297, 1334009469.297, 1334009485.697, 1334009485.697,
1334009485.697)
x2 = x1 - min(x1)

## Why doesn't the LM model work for the "as is" x?
lm(y~x1)
lm(y~x2)
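
A likely explanation, offered as an assumption rather than a confirmed diagnosis: lm.fit() declares a column linearly dependent when what remains of it after removing the earlier columns is smaller than tol (default 1e-7) times its original size. For x1, the variation around the mean is tiny compared with the values themselves, so the column looks like a multiple of the intercept. A sketch with the data from the post:

```r
y  <- c(58, 57, 57, 279, 252, 851, 45, 87, 47)
x1 <- c(1334009411.437, 1334009411.437, 1334009411.437, 1334009469.297,
        1334009469.297, 1334009469.297, 1334009485.697, 1334009485.697,
        1334009485.697)
x2 <- x1 - min(x1)

# Size of the centred column relative to the raw column
rel <- sqrt(sum((x1 - mean(x1))^2)) / sqrt(sum(x1^2))
rel

coef(lm(y ~ x1))               # slope reported as NA in the post
coef(lm(y ~ x2))               # shifted predictor: both coefficients estimated
coef(lm(y ~ x1, tol = 1e-10))  # tighter tolerance, passed on to lm.fit()
```

Shifting or centring the predictor is usually the safer choice, since it also improves the numerical conditioning of the fit.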


My environment:
Windows XP,
R 2.14.1



Re: [R] problem with svyby and NAs (survey package)

2012-04-13 Thread A.F.Fenton
> I'm trying to get the proportion "true" for dichotomous variable for
various
> subgroups in a survey.

Sorry, I'm new to the list, and just saw the advice about minimally
reproducible code. Here goes:


library(survey)
foo <- data.frame(id   = 1:25,
  weight   = runif(25),
  year = rep(2002:2006, 5),
  problem  = rnorm(25) > 0)
foo.dsn = svydesign(id=~id, weight=~weight, data=foo)
svyby(~problem, ~year, foo.dsn, svymean, na.rm=TRUE) # Fine

# One year is missing 
foo[foo$year == 2004, "problem"] = NA
foo.dsn = svydesign(id=~id, weight=~weight, data=foo)
svyby(~problem, ~year, foo.dsn, svymean, na.rm=TRUE) # Error


thanks
alex



[R] Odd characters at beginning of file

2012-04-13 Thread Lee Hachadoorian
I use RPostgreSQL to access data on a Postgres server. I would like to
keep my SQL statements in external files, as they're easier to write and
debug in pgAdmin, then I use readLines to bring them into R and feed to
dbGetQuery.

Here's the problem. When I create a SQL script with pgAdmin, then load it
in R, the ensuing script fails when I feed it to dbGetQuery. When I inspect
the string in R on Linux, it *looks* OK, but fails. When I inspect the
string in R on Windows, each file begins with "". (It took me a while to
figure this out since I usually run R on Linux and the characters weren't
displaying there.)

I haven't tested a large number of applications but it appears to be a
pgAdmin problem. Characters appear in SQL scripts created by pgAdmin on
both Linux and Windows, do not appear in scripts created in Notepad on
Windows or gedit on Linux. On Windows I think I can clean the string because

sql = readLines("SELECT 1.sql")
sql

[1] "SELECT 1;"

But on Linux

sql = readLines("SELECT 1.sql")
sql

[1] "SELECT 1;"

So how do I remove something that isn't even there? And yet the query fails.

So I would like to know if someone knows why this is happening and how to
avoid it, or how to clean these characters when working in R on Linux where
they aren't visible in the string. Also, when I open the pgAdmin-created
files in Notepad or gedit, I don't see anything unusual. But they are still
there, because if I edit a pgAdmin-created file in Notepad or gedit, then
save and read into R with readLines, they are there.
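
The characters themselves did not survive in this archive, so the following is a guess: the symptoms are consistent with a UTF-8 byte-order mark (bytes EF BB BF) at the start of the file. Under that assumption, a minimal sketch that drops the mark at the byte level before the text reaches dbGetQuery (strip_bom is a made-up helper name):

```r
# Read a text file, dropping a leading UTF-8 byte-order mark if present.
strip_bom <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  bom <- as.raw(c(0xEF, 0xBB, 0xBF))
  if (length(bytes) >= 3 && identical(bytes[1:3], bom)) {
    bytes <- bytes[-(1:3)]
  }
  # Split on Unix or Windows line endings
  strsplit(rawToChar(bytes), "\r?\n")[[1]]
}

# Demonstration with a temporary file that starts with a BOM
tmp <- tempfile(fileext = ".sql")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("SELECT 1;\n")), tmp)
sql <- strip_bom(tmp)
sql
```

Recent versions of R also document an encoding named "UTF-8-BOM" for file() connections, which may be a simpler route if it is available.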

Best,
--Lee

-- 
Lee Hachadoorian
PhD, Earth & Environmental Sciences (Geography)
Research Associate, CUNY Center for Urban Research
http://freecity.commons.gc.cuny.edu/



[R] R: Colouring phylogenetic tip labels and/or edges

2012-04-13 Thread Monroe, Melanie
Hi,

I have reconstructed ancestral character states on a phylogeny using MuSSE in 
the diversitree package and plotted the character state probabilities as pie 
charts on the nodes. I would, however, like to colour the character states of 
my extant species, i.e. the tip labels, the same colours as my pie charts, such 
that all species in  state 1 are e.g. blue, species in state 2 red and  species 
in state 3 yellow, and have not been successful with my attempts. I am only 
able to colour them in repeating sets of 3, e.g. sp1=blue, sp2=red, sp3=yellow,
sp4=blue, sp5=red, sp6=yellow etc. I am also wondering how to colour the
branches or edges as the states transition from one to another over time (i.e. 
as in the "Analyzing diversification with diversitree" manual by Rich FitzJohn 
on page 23).

Code I've been working with is below:

library(diversitree) #loads library
tree<-read.tree("tree")#loads tree
tree<-chronopl(tree, lambda=1,CV=TRUE) #converts to ultrametric
states<-read.delim("states", header=TRUE)#load states
head(states)

#match states to tree
states<-structure(states$PC, names=states$Species)
names(states)<-tree$tip.label

#MuSSE
diversitree:::argnames.musse(NULL, 3) #number of states
lik<-make.musse(tree, states, 3)
argnames(lik)
#constrain lambda
lik.base<-constrain(lik, lambda2~lambda1, lambda3~lambda1, 
mu2~mu1,mu3~mu1,q13~q12,q21~q12,q23~q12,q31~q12,q32~q12)
#find ML point for this model
p<-starting.point.musse(tree, 3)
fit.base<-find.mle(lik.base, p[argnames(lik.base)])

#unconstrained
lik.lambda<-constrain(lik,mu2~mu1,mu3~mu1,q13~q12,q21~q12,q23~q12,q31~q12,q32~q12)
fit.lambda<-find.mle(lik.lambda, p[argnames(lik.lambda)])
anova(fit.base, free.lambda=fit.lambda)

#find ancestral state probabilities
state.probs<-asr.marginal(lik.base, coef(fit.base))#ancestral state probabilities
state.probs
pie.probs<-t(state.probs)
pr<-apply(t(state.probs), 1, which.max)#max probability
tree$node.label<-pr #labels the nodes with the character states
write.tree(tree, file="T")#exports tree with nodes character states to directory
#configuration in figtree - node label display label

#tree pie charts
pdf("TREE_PLOT.pdf", height=11, width=8.5)
plot(tree ,cex=.8)
nodelabels(pie=pie.probs,piecol=c("blue","red","yellow"), cex=.5)
dev.off()
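
A repeating blue/red/yellow pattern is what one gets when a 3-colour vector is recycled over the tips. A sketch of building one colour per tip instead, in tip order, by indexing the palette with each tip's state. The labels and states below are made up; with the real data, tip_labels would be tree$tip.label and states would be named by species (looked up by name rather than overwritten with the tip labels).

```r
cols <- c("blue", "red", "yellow")               # colour for states 1, 2, 3

tip_labels <- c("sp3", "sp1", "sp4", "sp2")      # stand-in for tree$tip.label
states <- c(sp1 = 1, sp2 = 3, sp3 = 2, sp4 = 1)  # state of each species, by name

# Look up each tip's state by name, then map the state to its colour
tip_cols <- cols[states[tip_labels]]
tip_cols

# With ape loaded, the vector could then be passed to the plot call, e.g.
# plot(tree, cex = .8, tip.color = tip_cols)
```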



[R] How to define format of number

2012-04-13 Thread mrzung
hi all,

What I want to do is show a number with thousand expression.

I dont know exactly the expression name but here is example.

 1,000
 10,000,000
 
is there a way to express a number like that?



[R] Question with R CMD SHLIB in 64 bit R

2012-04-13 Thread Katharine Miller
Hi,

I have some C++ code that I compiled into a dll for use in 32 bit R and
would like to recompile for use in 64bit R.  I thought it would be as easy
as going to R-2.15.0\bin\x64 and running R CMD SHLIB mfregRF.c

 but that doesn't do anything.  It doesn't give me any error messages, but
it also doesn't create a shared (so) file. I just get the command prompt
back.

 I also tried \bin\x64 R CMD SHLIB - help, and again got nothing but the
command prompt back.

Just to check, I ran R CMD SHLIB from \bin\i386 on the same C++ files and
it produced a working dll just like usual.

Is there something that I need to do to the C++ files to make them
compatible with 64 bit?  They are not files that I made, so I do not know
how they were originally compiled.

Thanks for any help.

- Katharine



[R] A little exercise in R!

2012-04-13 Thread Ted Harding
Greetings all!
A recent news item got me thinking that a problem stated
therein could provide a teasing little exercise in R
programming.

http://www.bbc.co.uk/news/uk-england-cambridgeshire-17680326

  Cambridge University hosts first European 'maths Olympiad'
  for girls

  The first European girls-only "mathematical Olympiad"
  competition is being hosted by Cambridge University.
  [...]
  Olympiad co-director, Dr Ceri Fiddes, said competition questions
  encouraged "clever thinking rather than regurgitating a taught
  syllabus".
  [...]
  "A lot of Olympiad questions in the competition are about
  proving things," Dr Fiddes said.

  "If you have a puzzle, it's not good enough to give one answer.
  You have to prove that it's the only possible answer."
  [...]
  "In the Olympiad it's about starting with a problem that anybody
  could understand, then coming up with that clever idea that
  enables you to solve it," she said.

  "For example, take the numbers one up to 17.

  "Can you write them out in a line so that every pair of numbers
  that are next to each other, adds up to give a square number?"

Well, that's the challenge: Write (from scratch) an R program
that solves this problem. And make it neat.

NOTE: If there should happen to be some R package that can solve
this kind of problem already, without you having to think much,
then its use is illegitimate! (I.e. will be deemed "regurgitation").

Over to you.

With best wishes,
Ted.

-
E-Mail: (Ted Harding) 
Date: 13-Apr-2012  Time: 22:33:43
This message was sent by XFMail



Re: [R] loess function take

2012-04-13 Thread Duncan Mackay
Since lowess() has been mentioned, another function in a similar vein is
locfit() from the locfit package.


Duncan


Duncan Mackay
Department of Agronomy and Soil Science
University of New England
ARMIDALE NSW 2351
Email home: mac...@northnet.com.au

At 00:07 14/04/2012, you wrote:

Since you have only one dependent variable, try using lowess()
instead. It is less flexible -- only does local linear robust fitting
-- but has arguments built in that allow you to sample and interpolate
and limit the number of robustness iterations. It runs considerably
faster as a result.

-- Bert
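
A rough sketch of the two suggestions in this thread (lowess() with its
speed-related arguments, and loess() on a random subsample). The data
are simulated purely for illustration, and the values chosen for f,
iter, delta, span and the subsample size are arbitrary assumptions, not
recommendations.

```r
set.seed(42)                         # made-up data, for illustration only
n <- 1e5
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, sd = 0.3)

# lowess(): a smaller span f, fewer robustness iterations, and a
# non-zero delta (points within delta of the last fitted point are
# interpolated instead of refitted) all reduce the run time.
fit_lowess <- lowess(x, y, f = 0.1, iter = 1,
                     delta = 0.01 * diff(range(x)))

# loess() on a random subsample, then prediction on a regular grid
idx <- sample(n, 2000)
sub <- data.frame(x = x[idx], y = y[idx])
fit_loess <- loess(y ~ x, data = sub, span = 0.3)
grid <- seq(min(sub$x), max(sub$x), length.out = 200)
pred <- predict(fit_loess, newdata = data.frame(x = grid))
```

lowess() returns the sorted x values with the smoothed y values, so
lines(fit_lowess) can be added directly to a scatterplot.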

On Fri, Apr 13, 2012 at 6:32 AM, Liaw, Andy  wrote:
> Alternatively, use only a subset to run loess(), either a random
> sample or something like every k-th (sorted) data value, or
> the quantiles.  It's hard for me to imagine that that many data
> points are going to improve your model much at all (unless you use
> a tiny span).
>
> Andy
>
>
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Uwe Ligges
>
> On 12.04.2012 05:49, arunkumar wrote:
>> Hi
>>
>> The function loess takes very long time if the dataset is very huge
>> I have around 100 records
>> and used only one independent variable. still it takes very long time
>>
>> Any suggestion to reduce the time
>
>
> Use another method that is computationally less expensive for that many
> observations.
>
> Uwe Ligges
>
>
>> -
>> Thanks in Advance
>>  Arun
>> --
>> View this message in context: 
http://r.789695.n4.nabble.com/loess-function-take-tp4550896p4550896.html

>> Sent from the R help mailing list archive at Nabble.com.
>>



--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] enableJIT(2) causes major slow-up in rpart

2012-04-13 Thread luke-tierney

The level 2 is a heuristic meant to help with certain kinds of
programming idioms. It isn't always going to work.  In this case
trace(cmpfun) will show three functions being compiled each time
through. Not sure why -- I'll try to find out and see if it can be
avoided.

luke
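
For anyone who wants to compare JIT levels on their own code, here is a
small sketch. It relies on enableJIT() returning the level that was
previously in force, so the setting can be restored afterwards. The
helper time_at_level and the toy function work are hypothetical names
made up for this example; a single system.time() call is noisy, so
treat the numbers as a rough indication only.

```r
library(compiler)

# Time one call of f under the given JIT level, then restore the
# level that was active before.
time_at_level <- function(f, level) {
  old <- enableJIT(level)        # returns the previous JIT level
  on.exit(enableJIT(old))
  system.time(f())[["elapsed"]]
}

# Toy workload standing in for the real function of interest
work <- function() {
  s <- 0
  for (i in 1:1e5) s <- s + i
  s
}

timings <- sapply(0:3, function(l) time_at_level(work, l))
names(timings) <- paste0("JIT", 0:3)
timings
```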

On Thu, 12 Apr 2012, Tal Galili wrote:


Hello,

While exploring the JIT capabilities offered through the {compiler}
package, I found that using enableJIT(2) can *slow down* the rpart
function (from the {rpart} package) by a factor of about 10.

Here is an example code to run:

library(rpart)
require(compiler)

enableJIT(0) # just making sure that JIT is off
# We could also use enableJIT(1) and it would be fine
fo <- function() {rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)}
system.time(fo())
#   user  system elapsed
#      0       0       0   # this can also be 0.01 sometimes.

enableJIT(2)  # also happens for enableJIT(3)
system.time(fo())
#   user  system elapsed
#   0.12    0.00    0.12


Which brings me to my *questions*:
1) Is this a bug or a feature?
2) If this is a feature, what is causing it? (or put another way, can one
predict ahead of time the implications of using enableJIT(2) or
enableJIT(3) on his code?)


*Links*:
A post I recently wrote about my exploration of JIT -
www.r-statistics.com/2012/04/speed-up-your-r-code-using-a-just-in-time-jit-compiler/
The question asked on SO regarding the limitations of JIT:
http://stackoverflow.com/questions/10106736/possible-shortcomings-for-using-jit-with-r

Thanks,
Tal



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone: 319-335-3386
Department of Statistics and        Fax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu



Re: [R] Can't read a binary file

2012-04-13 Thread Roy Mendelssohn
Hi Scott:
On Apr 13, 2012, at 1:45 PM, Waichler, Scott R wrote:

> Hi, I've read up on readBin() and chapter 6 in the R Data Import/Export 
> manual, but I still can't read a binary file.  Here is how the creator of the 
> file described the code that would be needed in Fortran:
> 
> "Every record has a return in fortran.  The length of each record is nx*ny*4. 
>  To read you would use the following:
> 
> nlayx = nx*ny*4
> do iz=1,nz,4
> read(binary file) var(1:nlayx)
> enddo
> nrest=mod(nx*ny*nz,nlayx)
> read(binary file) var(1:nrest)"
> 
> The first value in the file should be 0.05, and all of the data values are 
> real.  Here is what I get (with similar answers using double):
> 
>> v<-readBin("plotb.251", numeric(), size=4, n=1)
>> v
> [1] 1.614296e-39
> 
>> v<-readBin("plotb.251", numeric(), size=4, n=1, endian="swap")
>> v
> [1] 1.359775e-38
> 
> Platform is Intel Linux.  How can I read the file described above?


The creator of the file left out the "open" statement and the defaults for 
binary files in the compiler s/he used.  This is just a guess, but some 
Fortrans when writing "binary" files also include the record length in each 
record.  That would throw off a strict binary read.

Again, just a guess, but I have seen this before.  I don't know enough about
readBin to say whether you can get the first record back in hex or
binary, but that might tell you what is going on.

-Roy M.  
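
The guess above can be checked with a sketch like the following. It
assumes the common layout for Fortran unformatted sequential files, in
which each record is wrapped in a 4-byte length marker before and after
the data; whether plotb.251 actually follows that layout is not known
from the thread. The helper read_record is a hypothetical name, and the
temporary file is a stand-in built here so the example runs on its own.
The value 1.614296e-39 in the original post is the kind of tiny number
one gets when a modest integer is reinterpreted as a 4-byte real, which
would be consistent with a leading length marker, though it proves
nothing by itself.

```r
# Build a small stand-in file: [4-byte length][payload][4-byte length]
tmp <- tempfile()
con <- file(tmp, "wb")
vals <- c(0.05, 1.5, 2.5)
writeBin(12L, con, size = 4)      # record length in bytes (3 reals * 4)
writeBin(vals, con, size = 4)     # payload written as 4-byte reals
writeBin(12L, con, size = 4)      # trailing copy of the length
close(con)

# Raw view of the first 8 bytes; a leading length marker shows up here
first_bytes <- readBin(tmp, what = "raw", n = 8)

# Hypothetical helper: read one record and check that both markers agree
read_record <- function(con, endian = .Platform$endian) {
  len <- readBin(con, integer(), n = 1, size = 4, endian = endian)
  if (length(len) == 0) return(NULL)          # end of file
  x <- readBin(con, numeric(), n = len %/% 4, size = 4, endian = endian)
  tail_len <- readBin(con, integer(), n = 1, size = 4, endian = endian)
  stopifnot(identical(len, tail_len))
  x
}

con <- file(tmp, "rb")
rec <- read_record(con)
close(con)
rec
```

If the first four bytes of the real file decode to nx*ny*4 when read as
an integer, that would support the record-marker explanation; calling
the helper in a loop until it returns NULL would then read the file
record by record.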

**
"The contents of this message do not reflect any position of the U.S. 
Government or NOAA."
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: roy.mendelss...@noaa.gov (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected" 
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.



[R] Coding columns for survival analysis

2012-04-13 Thread Alexander Shenkin
Hello Folks,

I have 5 columns for thousands of tree records that record whether that
tree was alive or dead.  I want to recode the columns such that the cell
reads "found" when a live tree is first observed, "alive" for when a
tree is found alive and is not just found, and "mort" when it was
previously alive but is now dead.

Given the following:

> tree_live = data.frame(tree = c("tree1","tree2","tree3","tree4","tree5"),
+     live1 = c(0,0,0,1,1), live2 = c(0,0,1,1,0), live3 = c(0,1,1,0,0),
+     live4 = c(1,1,0,0,0), live5 = c(1,0,0,0,0))

   tree live1 live2 live3 live4 live5
1 tree1     0     0     0     1     1
2 tree2     0     0     1     1     0
3 tree3     0     1     1     0     0
4 tree4     1     1     0     0     0
5 tree5     1     0     0     0     0

I would like to end up with the following:

> tree_live_recode

  live1 live2 live3 live4 live5
1    NA    NA    NA found alive
2    NA    NA found alive  mort
3    NA found alive  mort     0
4 found alive  mort     0     0
5 found  mort     0     0     0

I've accomplished the recode in the past, but only by going over the
dataset multiple times in messy and inefficient fashion.  I'm wondering
if there are concise and efficient ways of going about it?

(I haven't been using the Survival package for my analyses, but I'm
starting to look into it.)

Thanks,
Allie
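
One way the recode described above might be done in a single pass per
tree, offered as a sketch rather than a definitive answer. The helper
recode_row is a hypothetical name. It assumes the rules implied by the
desired output: NA before the first live observation, "found" at the
first one, "alive" for later live observations, "mort" at the first
dead observation after a live one, and "0" afterwards. A tree recorded
alive again after "mort" would simply be labelled "alive", which may or
may not be what is wanted.

```r
tree_live <- data.frame(
  tree  = c("tree1", "tree2", "tree3", "tree4", "tree5"),
  live1 = c(0, 0, 0, 1, 1), live2 = c(0, 0, 1, 1, 0),
  live3 = c(0, 1, 1, 0, 0), live4 = c(1, 1, 0, 0, 0),
  live5 = c(1, 0, 0, 0, 0))

# Recode one tree's 0/1 history into NA / found / alive / mort / 0
recode_row <- function(x) {
  n <- length(x)
  out <- rep(NA_character_, n)
  first <- match(1, x)                 # first census with a live tree
  if (is.na(first)) return(out)        # never seen alive: all NA
  out[first] <- "found"
  if (first < n) {
    later <- (first + 1):n
    prev  <- x[later - 1]              # status at the previous census
    out[later] <- ifelse(x[later] == 1, "alive",
                         ifelse(prev == 1, "mort", "0"))
  }
  out
}

live_cols <- grep("^live", names(tree_live), value = TRUE)
recoded <- t(apply(as.matrix(tree_live[live_cols]), 1, recode_row))
colnames(recoded) <- live_cols
tree_live_recode <- as.data.frame(recoded, stringsAsFactors = FALSE)
tree_live_recode
```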



[R] Can't read a binary file

2012-04-13 Thread Waichler, Scott R
Hi, I've read up on readBin() and chapter 6 in the R Data Import/Export manual, 
but I still can't read a binary file.  Here is how the creator of the file 
described the code that would be needed in Fortran:

"Every record has a return in fortran.  The length of each record is nx*ny*4.  
To read you would use the following:

nlayx = nx*ny*4
do iz=1,nz,4
 read(binary file) var(1:nlayx)
enddo
nrest=mod(nx*ny*nz,nlayx)
read(binary file) var(1:nrest)"

The first value in the file should be 0.05, and all of the data values are 
real.  Here is what I get (with similar answers using double):

> v<-readBin("plotb.251", numeric(), size=4, n=1)
> v
[1] 1.614296e-39

> v<-readBin("plotb.251", numeric(), size=4, n=1, endian="swap")
> v
[1] 1.359775e-38

Platform is Intel Linux.  How can I read the file described above?

Thanks,
Scott Waichler, PhD
Hydrology Group, Energy & Environment Directorate
Pacific Northwest National Laboratory
scott.waich...@pnnl.gov



Re: [R] Can't get R to recognize Java for rJava installation

2012-04-13 Thread Waichler, Scott R
Milan,

Merci.  I did find the javah file and put it in /usr/bin, where R can now find 
it.  
However, I still get a similar error message when trying to install rJava, i.e. 

configure: error: One or more Java configuration variables are not set.

The only field that doesn't have a value now is the cpp flags:

cpp flags   : ''

Could this be the problem now?  How can I set those, and what value should I 
give?

Scott Waichler

> So I guess you need to find out what package provides this file on your
> distribution (which you did not mention). First check the file is
> currently not present.


Re: [R] writing spdiags function for R

2012-04-13 Thread Ben Bolker
Ben Bolker  gmail.com> writes:

> 
>   I'm not quite sure how to do it, but I think you should look
> at the ?band function in Matrix.  In combination with diag() of a 
> suitably truncated matrix, you should be able to extract bands
> of sparse matrices efficiently ...
> 


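library(Matrix)  # provides band()

## extract the k-th off-diagonal of a square matrix A
## (k > 0: above the main diagonal, k < 0: below it)
getband <- function(A,k) {
    n <- nrow(A)
    if (abs(k)>(n-1)) stop("bad band requested")
    if (k>0) {
        v <- seq(n-k)    ## rows 1..(n-k)
        w <- seq(k+1,n)  ## columns (k+1)..n
    } else if (k<0) {
        v <- seq(-k+1,n) ## rows (|k|+1)..n
        w <- seq(n+k)    ## columns 1..(n-|k|)
    } else return(diag(A))
    ## keep only band k, then read it off as the diagonal of the
    ## (n-|k|) x (n-|k|) submatrix
    diag(band(A,k,k)[v,w,drop=FALSE])
}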

PS: I think this should extract the k^th off-diagonal
band in a way that should (?) work reasonably efficiently
with sparse matrices.  I have not tested it carefully,
nor benchmarked it.

  Ben Bolker



[R] is there a way to call python like source(file, echo = TRUE) in R?

2012-04-13 Thread Yihui Xie
Sorry this is more like a Python question, but I believe many R users
also know well about Python, so here is my question: I want to run
python code like source(file, echo = TRUE) in R, i.e. echo both the
source code and the output.

This only shows the output:

python -c 'print "hello"'

Thanks!

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



Re: [R] rgl_0.92.879 package broke with R 2.15

2012-04-13 Thread Grimes Mark

Duncan

Brilliant.  This solved the problem.  library(rgl) is now
accessible, and the plot3d function works fine in X11 (which I think
is how it worked before anyway).  Whatever I may be missing, I don't
think I'll notice.


Best,

Mark


On Apr 13, 2012, at 11:08 AM, Duncan Murdoch wrote:


On 13/04/2012 11:44 AM, Duncan Murdoch wrote:
There were no changes to that code (the OSXGUIFactory) between 0.92.861
and 0.92.879, so I'd have to assume it's a difference between the
systems used to build it.  All I can suggest is that you go back to the
older one if you can't do a build yourself.


I've had some discussions offline with another user who had this  
problem.  For him, the workaround was to delete the aglrgl.so  
library (full path given in the error message).  This will cause rgl  
to fall back to using X11 for graphics.  It doesn't look as nice as  
the native window, but as long as you have X11 working, it should  
work.


Duncan Murdoch


Duncan Murdoch

On 12/04/2012 5:27 PM, Grimes Mark wrote:
>  Dear All
>
>  I am unhappy to report that rgl_0.92.879 gives the following error:
>
>  Error : .onLoad failed in loadNamespace() for 'rgl', details:
>    call: dyn.load(file, DLLpath = DLLpath, ...)
>    error: unable to load shared object
>  '/Library/Frameworks/R.framework/Versions/2.15/Resources/library/rgl/libs/x86_64/aglrgl.so':
>  dlopen(/Library/Frameworks/R.framework/Versions/2.15/Resources/library/rgl/libs/x86_64/aglrgl.so,
>  6): Symbol not found: __ZN3gui13OSXGUIFactory12hasEventLoopEv
>    Referenced from:
>  /Library/Frameworks/R.framework/Versions/2.15/Resources/library/rgl/libs/x86_64/aglrgl.so
>    Expected in: dynamic lookup
>
>  Error: package/namespace load failed for ‘rgl’
>  >sessionInfo()
>  R version 2.15.0 (2012-03-30)
>  Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
>  locale:
>  [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
>  attached base packages:
>  [1] splines   stats graphics  grDevices utils datasets
>  methods   base
>
>  other attached packages:
>   [1] RUnit_0.4.26 bio3d_1.1-3  RCytoscape_1.6.2
>  XMLRPC_0.2-4 graph_1.34.0 org.Hs.eg.db_2.7.1
>  RSQLite_0.11.1   DBI_0.2-5AnnotationDbi_1.18.0
>  [10] Biobase_2.16.0   BiocGenerics_0.2.0   adegenet_1.3-4
>  ade4_1.4-17  MASS_7.3-17  cluster_1.14.2
>  Hmisc_3.9-3  survival_2.36-12 plyr_1.7.1
>
>  loaded via a namespace (and not attached):
>  [1] grid_2.15.0IRanges_1.14.2 lattice_0.20-6 RCurl_1.91-1
>  stats4_2.15.0  tools_2.15.0   XML_3.9-4
>  >
>  I don't understand how this happens or how to fix it.  Doesn't matter
>  if I use CRAN or R CMD INSTALL from terminal.  Terminal gave me no
>  errors when I installed in this way:
>  ~¤  R CMD INSTALL /Users/Mark/R_/Packages/rgl_0.92.879.tar
>  * installing to library
>  ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
>  * installing *binary* package ‘rgl’ ...
>
>  * DONE (rgl)
>  ~ ¤
>  Mark
>
>  On Apr 5, 2012, at 5:06 PM, Grimes Mark wrote:
>
>>  Dear David, Duncan, and Jochen, and everyone
>>
>>  I am happy to report that with R version 2.15.0, rgl_0.92.861 now
>>  loads properly.  To whoever fixed this, thank you!
>>
>>  Mark
>>
>>
>>>  Below some more observations that might help you locate the problem.
>>>
>>>  Also sorry for ignoring the posting rules in my last post. I was
>>>  still on 2.14.1, and installing from source from the GUI version of
>>>  R64. Installing from source from R64 GUI still works after the
>>>  upgrade to R 2.14.2.
>>>
>>>  The 32 bit versions of R don't work for me anymore, probably because
>>>  I have only installed the 64 bit versions of some libraries under
>>>  Lion. I have never noticed before, since after the problem discussed
>>>  in this thread
>>>  https://stat.ethz.ch/pipermail/r-sig-mac/2010-July/007609.html I
>>>  have been using R64 exclusively.
>>>
>>>  Also note that installing from the command line (Terminal) version
>>>  of either R32 or R64 gives the error
>>>
>>>    checking for glEnd in -lGL... no
>>>    configure: error: missing required library GL
>>>
>>>  Presumably, something is wrong with my bash environment and the
>>>  configure script is picking up the wrong gl libraries.
>>>  Installing from source does work after downloading and unzipping the
>>>  rgl tarball with
>>>
>>>  ./configure --with-gl-libs=/usr/X11/lib
>>>  --with-gl-includes=/usr/X11/include
>>>
>>>
>>>  Jochen
>>>
>>>  On Mar 28, 2012, at 10:15 AM, Duncan Murdoch wrote:
>>>
  On 12-03-27 6:31 PM, Grimes Mark wrote:
>  Dear People
>
>   I can't figure out how to fix this problem: rgl won't run  
under R
>  2.14.2 (it was working for me before under 2.14.0). The  
error message

>  is:

  rgl is currently changing fairly rapidly.  I'd suggest trying  
to
  install again (the current version, as of yesterday, is  
0.92.861).
   If th

Re: [R] #!/usr/bin/env Rscript --vanilla ??

2012-04-13 Thread Berend Hasselman

On 13-04-2012, at 10:32, Martin Maechler wrote:

> I think that's my first true question (rather than answer)
> to R-help.
> 
> As R has, for a long time, become my primary scripting and
> programming language, I'm preferring at times to write  Rscript
> files instead of shell scripts, notably when R has nice ways to
> do some of the things.
> On a standard standalone platform with standard R, 
> I would start such a script with
> ---
> #! /usr/bin/Rscript --vanilla
> ---
> (yes, the "--vanilla" is important to me, in this case)
> 
> However; as, at work, my scripts have to work correctly on quite a
> few different (unixy : several flavors of Linux, Solaris, MacOS X) platforms,
> *and* as an R developer, I have many different versions of R
> installed simultaneously, using /usr/bin/Rscript  is not an
> option.
> Rather, I'd use the /usr/bin/env trick :
> 
> ---
> #! /usr/bin/env Rscript
> ---
> 
> which finds Rscript in "the correct" place, according to the
> current PATH.  All fine till now.
> 
> PROBLEM:  It does not work with '--vanilla' or any other argument:
> If I start my script with   
> #! /usr/bin/env Rscript --vanilla
> the error message simply is
> /usr/bin/env: Rscript --vanilla: No such file or directory
> 
> I have tried a few variations on the theme, using quotes in
> different places, but have not succeeded till now.
> Any suggestions?

I had similar problems running R scripts from BBEdit on Mac OS X.
The problem could only be solved by making a shell script that uses the file 
extension to determine what to run.

Searching internet yielded these pages which may be helpful:

http://www.in-ulm.de/~mascheck/various/shebang/

http://stackoverflow.com/questions/4303128/how-to-use-multiple-arguments-with-a-shebang-i-e

Berend



Re: [R] writing spdiags function for R

2012-04-13 Thread Ben Bolker
Moreno I. Coco  sms.ed.ac.uk> writes:


  [snip snip snip]
> 
> So, I have written my own spdiags function (below); following
> also a suggestion in an old, and perhaps unique post, about
> this issue.
> 
> It works only for square matrices (that's my need), however I
> have a couple of issues, mainly related to computational
> efficiency:
> 
> 1) if I use it with a sparseMatrix, it throws a tedious warning
> "[  ] : .M.sub.i.logical() maybe inefficient";
> can I suppress this warning somehow, this is slowing the computation
> very radically;

   quick answer: use suppressMessages()
> 
> 2) I can go around this problem by translating a sparseMatrix back
> into a logical matrix before I run spdiags on it. However, the loop
> gets very slow for large matrices (e.g., 2000x2000), which is the
> kind of matrices I have to handle. If you look in the code,
> I have placed a system.time() where the code is slowing down, and
> it takes about:
> 

  I'm not quite sure how to do it, but I think you should look
at the ?band function in Matrix.  In combination with diag() of a 
suitably truncated matrix, you should be able to extract bands
of sparse matrices efficiently ...



Re: [R] Displayed Date Format in Plot Title.

2012-04-13 Thread R. Michael Weylandt
as.Date() attempts to coerce a character string to a date where you
specify the input format -- if you want to specify an output format,
you need ?strftime [str + f + time == string format time]

E.g.,

titleDate <- as.Date("2011-05-03", format = "%Y-%m-%d")

plot(1:10, main = strftime(titleDate, "%b-%d-%Y"))

Michael
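
To make the input-versus-output distinction concrete, here is a short
sketch using only base R. The variable names are illustrative. The
format given to as.Date() describes how to read the string, so "%b"
(abbreviated month name) cannot match the numeric month "05" and the
result is NA; the format given to format() or strftime() describes how
to print a date that has already been parsed.

```r
# Parsing: the format must describe the input string
d   <- as.Date("2011-05-03", format = "%Y-%m-%d")
bad <- as.Date("2011-05-03", format = "%Y-%b-%d")  # NA: "05" is not a month name

# Printing: the format describes the desired output
numeric_title <- format(d, "%m-%d-%Y")   # "05-03-2011"
named_title   <- format(d, "%b-%d-%Y")   # month abbreviation depends on the locale

plot(1:10, main = named_title)
```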

On Fri, Apr 13, 2012 at 2:43 PM, Sam Albers  wrote:
> Hello all,
>
> I can't seem to figure out how to format a date as a title. I have
> something like this:
>
> plot(x=1:10, y=runif(10,1,18), main=paste(as.Date("2011-05-03",
> format="%Y-%m-%d")))
>
> ## When I would really like this
> plot(x=1:10, y=runif(10,1,18), main=paste("May-03-2011"))
>
> ## I thought to try this but that produces an NA.
> plot(x=1:10, y=runif(10,1,18), main=paste(as.Date("2011-05-03",
> format="%Y-%b-%d")))
>
> How do folks usually accomplish something like this?
>
> Thanks so much in advance!
>
> Sam
>


Re: [R] Help - Importing data from txt and xlsx files

2012-04-13 Thread MacQueen, Don
Have you correctly set the value of the 'sep' argument to read.table?

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 4/13/12 7:28 AM, "AMFTom"  wrote:

>Dear Thierry, 
>
>Thanks for your help. Now though, I try to import data from a txt file,
>and
>it says either 
>
>> mydataframe <- read.table("Lv2.8.txt")
>Error in scan(file, what, nmax, sep, dec, quote, skip, nlines,
>na.strings, 
>: 
>  line 3 did not have 15 elements
>
>or
>
>> mydataframe <- read.table("Lv2.8.txt", header = TRUE)
>Error in read.table("Lv2.8.txt", header = TRUE) :
>  more columns than column names
>
>even though I seem to have put a column name into the txt file for each
>column. Any ideas?
>
>Thanks again!
>
>Tom
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Help-Importing-data-from-txt-and-xlsx-files-
>tp4554622p4555001.html
>Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Help - Importing data from txt and xlsx files

2012-04-13 Thread jim holtman
This is a case where you need to provide a sample of your data.  Most
likely it is not in a format that read.table can read with the parameters
you have given it.  It may have different field separators, it might have
"#" in data fields, you might have unbalanced quotes, etc.  So it is a
problem in your data and the way you are trying to read it.  You have to
include  commented, minimal, self-contained, reproducible code.
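
A sketch of how such a file might be diagnosed before calling
read.table(). The file written here is invented for illustration (it is
not the poster's Lv2.8.txt) and assumes tab-separated fields, some of
which contain spaces, which is one common way to get "line 3 did not
have 15 elements" style errors with the default whitespace separator.
count.fields() reports how many fields each line has under a given
separator.

```r
# Invented example file: tab-separated, with spaces inside one field
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tsite name\tvalue",
             "1\tNorth Pond\t3.2",
             "2\tOld Mill Creek\t4.1"), tmp)

n_ws  <- count.fields(tmp)               # default: split on any whitespace
n_tab <- count.fields(tmp, sep = "\t")   # split on tabs only

n_ws    # unequal counts: the default separator does not fit this file
n_tab   # equal counts: tab is a consistent separator

dat <- read.table(tmp, header = TRUE, sep = "\t")
str(dat)
```

Unequal counts from count.fields() point at the lines that break the
assumed layout, which is usually quicker than reading the error message
from scan().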

On Fri, Apr 13, 2012 at 10:28 AM, AMFTom  wrote:

> Dear Thierry,
>
> Thanks for your help. Now though, I try to import data from a txt file, and
> it says either
>
> > mydataframe <- read.table("Lv2.8.txt")
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
> :
>  line 3 did not have 15 elements
>
> or
>
> > mydataframe <- read.table("Lv2.8.txt", header = TRUE)
> Error in read.table("Lv2.8.txt", header = TRUE) :
>  more columns than column names
>
> even though I seem to have put a column name into the txt file for each
> column. Any ideas?
>
> Thanks again!
>
> Tom
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Help-Importing-data-from-txt-and-xlsx-files-tp4554622p4555001.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] vif calculation with car and HH packages

2012-04-13 Thread Özgür Asar
Dear Prof. Fox,

I got the point, things are clear now.

Thank you very much,
Best wishes
Ozgur

-


Ozgur ASAR

Research Assistant
Middle East Technical University
Department of Statistics
06531, Ankara Turkey
Ph: 90-312-2105309
--
View this message in context: 
http://r.789695.n4.nabble.com/vif-calculation-with-car-and-HH-packages-tp4555402p4555653.html
Sent from the R help mailing list archive at Nabble.com.



[R] Displayed Date Format in Plot Title.

2012-04-13 Thread Sam Albers
Hello all,

I can't seem to figure out how to format a date as a title. I have
something like this:

plot(x=1:10, y=runif(10,1,18), main=paste(as.Date("2011-05-03",
format="%Y-%m-%d")))

## When I would really like this
plot(x=1:10, y=runif(10,1,18), main=paste("May-03-2011"))

## I thought to try this but that produces an NA.
plot(x=1:10, y=runif(10,1,18), main=paste(as.Date("2011-05-03",
format="%Y-%b-%d")))

How do folks usually accomplish something like this?

Thanks so much in advance!

Sam



[R] problem with svyby and NAs (survey package)

2012-04-13 Thread A.F.Fenton
Hello

I'm trying to get the proportion "true" for a dichotomous variable for
various subgroups in a survey.

This works fine, but obviously doesn't give proportions directly:
svytable(~SurvYear+problem.vandal, seh.dsn, round=TRUE)
problem.vandal
SurvYear FALSE  TRUE
1995  8906   786
1997 17164  2494
1998 17890  1921
1999 18322  1669
2001 17623  2122
...

Note some years are missing - they are part of the dataset, but all
responses are NA (the question wasn't asked).

However, this gives an error, and I'd like to understand why - it works
for variables without missing years:

svyby(~problem.vandal, ~SurvYear, seh.dsn, svymean, na.rm=TRUE)
Error in tapply(1:NROW(x), list(factor(strata)), function(index) { : 
  arguments must have same length

The error only occurs when na.rm=TRUE and there are no observations in
one year.

Thanks
alex

Please access the attached hyperlink for an important electronic communications 
disclaimer: http://lse.ac.uk/emailDisclaimer



[R] could not find function when compiling PDF

2012-04-13 Thread damiloveu
Hi all,

I can use the function when I am in the console. After I finished my
assignment and compiled it to PDF, there was an error saying: could not
find the function XXX.

How can this happen? I am a new user of R. Thanks for everyone's help!

--
View this message in context: 
http://r.789695.n4.nabble.com/could-not-find-function-when-compiling-PDF-tp4555489p4555489.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Remove superscripts from HTML objects

2012-04-13 Thread Chris Stubben
Sorry if I was not clear.  I wanted to remove the superscripts using xpath
queries if possible.  For example this will get p nodes with superscripts,
but how do I remove the superscripts if there are many matching nodes and
different superscripts?

xpathSApply(doc, "//p[sup]", xmlValue) 
[1] "Cata"


Chris

--
View this message in context: 
http://r.789695.n4.nabble.com/Remove-superscripts-from-HTML-objects-tp4550738p4555370.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] correlation matrix between data from different files

2012-04-13 Thread Rui Barradas
Hello,


jeff6868 wrote
> 
> Dear users,
> 
> I'm a fairly new French R user, and I have a problem building a
> correlation matrix.
> I have temperature data for each weather station in my study area and for
> each year (for example, a data file for weather station N°1 for the
> year 2009, a data file for N°2 for the year 2010, etc.). So I have 70
> weather stations with one data file per year since 2005. Each station has
> 4 temperature sensors.
> Each data file has exactly the same structure: date and hour, sensor1,
> sensor2, sensor3, sensor4. Here's an example:
> 
> time              sensor1  sensor2  sensor3  sensor4
> 01/01/2008 00:00  -0.25    -2.43    -3.25    -2.37
> 01/01/2008 00:15  -0.18    -2.37    -3.18    -2.25
> 01/01/2008 00:30  -0.25    -2.5     -3.37    -2.56
> 01/01/2008 00:45  -0.25    -2.37    -3.31    -2.37
> 
> I need to compute a correlation matrix between the same sensors of the
> different stations (one correlation matrix between all the sensor 1
> series of the 70 stations, another one for sensor 2, ...).
> For each year and each station, I have to find the best correlation. For
> example, which of the 70 weather stations is best correlated with
> station 1 for sensor 1? And with station 2? ... and so on for
> each sensor and each station.
> 
> Example:
> 
> Sensor 1 for the year 2009
> 
>            Station 1  Station 2  Station 3  [...]
> Station 1  1          0.910      0.748
> Station 2  0.910      1          0.6
> Station 3  0.748      0.6        1
> [...]
> 
> And the same for year 2005,2006,2007,2008,2009,2010,2011 for each of the 4
> sensors.
> 
> Do you have any idea how I can do this in R?
> Should I first merge all the sensors into one file, or can I do it with
> the data in separate files (as I have at the moment)?
> Thank you very much for all your answers!
> 


You don't need to merge all the files, but you must do some preprocessing.
If you put all the data for one year in a 3-d array, then you can simply
use 'cor'.

I've made up some fake data, in files named "station1_2009.dat", etc. (only
6 stations), each with the same number of observations. If you have 70
stations per year, you'll need an automated process to access them.
Something like the function below would solve part of that problem.
What follows assumes that the number of observations is the same in all
files.

# This function gives file names with the pattern above
filenames <- function(y, n=70){
    tmp <- paste("station", seq_len(n), sep="")
    tmp <- paste(tmp, y, sep="_")
    paste(tmp, "dat", sep=".")
}


Sensors <- paste("sensor", 1:4, sep="")
Stations <- paste("station", 1:6, sep="")

nsensors <- length(Sensors)
nstations <- length(Stations)

year <- 2009
fnames <- filenames(year, nstations)

# If nobs is the same in all files, any one will do.
nobs <- nrow(read.table(fnames[1], header=TRUE))

yr2009 <- array(NA, dim=c(nobs, nsensors, nstations))
for(i in seq_len(nstations)){
    tmp <- read.table(fnames[i], header=TRUE)
    yr2009[ , , i] <- as.matrix(tmp[, Sensors])
}

dimnames(yr2009) <- list(seq.int(nobs), Sensors, Stations)

# correlations for sensor 1
cor(yr2009[ , 1, ])

# a list of correlations for the 4 sensors
cor2009 <- lapply(Sensors, function(s) cor(yr2009[ , s, ]))
names(cor2009) <- Sensors
cor2009$sensor1


Don't pay much attention to the file-handling part; what's relevant is
creating and filling the array.
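
To get from those matrices to the original question (which station is best
correlated with each station), something along these lines might work. It is
only a sketch on simulated data: the array 'yr', the helper 'best_match' and
the planted link between stations 1 and 2 are made up for illustration.

```r
set.seed(42)
nobs     <- 50
Sensors  <- paste("sensor", 1:4, sep="")
Stations <- paste("station", 1:6, sep="")

# Simulated stand-in for the yearly array (obs x sensor x station)
yr <- array(rnorm(nobs * 4 * 6), dim=c(nobs, 4, 6),
            dimnames=list(NULL, Sensors, Stations))
# Plant a known relationship so the result can be checked by eye
yr[, "sensor1", "station2"] <- yr[, "sensor1", "station1"] + rnorm(nobs, sd=0.1)

# For each station, find the other station with the highest correlation
best_match <- function(cm){
    diag(cm) <- NA                      # ignore the self-correlation
    idx <- apply(cm, 1, which.max)      # which.max skips the NA
    data.frame(station=rownames(cm),
               best=colnames(cm)[idx],
               r=cm[cbind(seq_len(nrow(cm)), idx)],
               stringsAsFactors=FALSE)
}

best <- lapply(Sensors, function(s) best_match(cor(yr[ , s, ])))
names(best) <- Sensors
best$sensor1
```

With real data there may be missing values, in which case the 'use'
argument of 'cor' would need some thought.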

Hope this helps,

Rui Barradas


--
View this message in context: 
http://r.789695.n4.nabble.com/correlation-matrix-between-data-from-different-files-tp4552226p4555317.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] could not find function when compiling PDF

2012-04-13 Thread damiloveu
It works in the console, like this:
> mult.corr(X,Y)
$mult.corr
[1] 0.8382398

$p.mult
[1] 3.570699e-12

$partial.corr
[1]  0.18447499 -0.09837888  0.12007457

$p.partial
[1] 0.2094076 0.5058976 0.4162641

But it doesn't work when compiling:
Error:  chunk 3 (label = ques3) 
Error in eval(expr, envir, enclos) : could not find function "mult.corr"
Execution halted

--
View this message in context: 
http://r.789695.n4.nabble.com/could-not-find-function-when-compiling-PDF-tp4555489p418.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] [BioC] Read .idat Illumina files in R

2012-04-13 Thread grimbough
I posted this to BioC yesterday, but I'll include it here for completeness:  

The expression array idats are indeed encrypted.  However you can read them
using the package available here:

http://www.compbio.group.cam.ac.uk/Resources/IDATreader/

You can get back a data.frame containing the summarized intensity values for
all bead types, along with values such as the number of beads of each type,
the standard deviation with and without outliers etc.

Note that this isn't the same as bead-level data.  If that wasn't generated
at the time of the scan there's nothing that you can do to get it from the
idats and jpegs.

Mike


Tim Triche, Jr. wrote
> 
> Unfortunately, this won't help for expression arrays.  Last time I
> checked,
> those IDATs appeared to be encrypted.
> 
> crlmm, methylumi, and minfi can all read IDAT files... *if* they are
> "version 3" or later (e.g. genotyping, methylation, etc).
> 
> Otherwise you are probably stuck with GenomeStudio if it is expression
> data
> you're dealing with.
> 


--
View this message in context: 
http://r.789695.n4.nabble.com/Read-idat-Illumina-files-in-R-tp4548360p4555147.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Partial Dependence and RandomForest

2012-04-13 Thread jmc
Thank you Andy.  I obviously neglected to read into the help file and,
frustratingly, could have known this all along.  However, I am still
interested in knowing the relative maximum value in the partial plots via
query instead of visual interpretation (and possibly getting at other
statistical measures like standard deviation).  Is it possible to do this? 
I will keep investigating, but would appreciate a hint in the right
direction if you have time.

--
View this message in context: 
http://r.789695.n4.nabble.com/Partial-Dependence-and-RandomForest-tp4549705p4555146.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Applying a function to categorized data?

2012-04-13 Thread Robert Latest
Hello Steve,

thank you for your reply. You're right, just before I read your post
I'd found aggregate() and indeed it brought me a long way towards my
goal.
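
For the archive, here is roughly the kind of call that gets month, mean and
standard deviation into one data frame. It is a sketch on made-up data: the
column names follow the original post, but the values and the name
'monthly' are invented.

```r
# Toy stand-in for v1: one measurement per row, with a timestamp
v1 <- data.frame(
  TIMESTAMP = as.POSIXct(c("2012-01-05", "2012-01-20",
                           "2012-02-03", "2012-02-17"), tz = "UTC"),
  ARI_MIT   = c(1, 3, 10, 14)
)
v1$MONTH <- strftime(v1$TIMESTAMP, "%y%m", tz = "UTC")

# aggregate() applies FUN to each month's values; FUN returns two numbers,
# so the result holds a two-column matrix that do.call() flattens out
monthly <- aggregate(ARI_MIT ~ MONTH, data = v1,
                     FUN = function(x) c(mean = mean(x), sd = sd(x)))
monthly <- do.call(data.frame, monthly)
names(monthly) <- c("MONTH", "mean", "sd")
monthly
```

The point for a C or SQL mindset is that there is no explicit loop: the
grouping column plays the role of GROUP BY.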

I've been a C programmer for 20+ years, and I'm fairly proficient in SQL,
so to understand R I need to lose my scalar and row (record) oriented
thinking and get my head into vectors and columns.

I'm still nowhere near where I think I need to be in order to work with
my data. I'll get back to the list when I have pinpointed my problem a
bit better, and I'll also supply some sample data.

Have a nice weekend,
robert

On Thu, Apr 12, 2012 at 8:52 PM, steven mosher  wrote:
>  Welcome to R and the list.
>
>  Others may suggest books ( Nutshell was my first ) but first there are some
> things that will help you
>  both in programming and getting help on the list.
>
>  You should post executable code in your question. So, build a toy example
> of the data.frame you have
> and show what you tried. Folks here should be able to run your toy example
> and  show you how to get the answer you want.
>
> For your problem I'm guessing that aggregate() would be one path
>
> ?aggregate
>
>  you will need to specify   "by"  to aggregate by month
>
> Steve
>
> On Thu, Apr 12, 2012 at 7:10 AM, Robert Latest  wrote:
>>
>> Hi all,
>>
>> I'm just getting started in R. My problem is the following:
>>
>> I have a data frame (v1) with lots of production data measurements.
>> Each row contains a single measurement ('ARI_MIT') with a timestamp. I
>> want to "lump" the data by months with their mean and standard
>> deviation.
>>
>> I have already successfully managed to do the lumping by adding
>> another column to my data frame:
>>
>> v1$MONTH = strftime(v1$TIMESTAMP, "%y%m")
>>
>> This makes a nice month-wise boxplot of my data, although I don't have
>> an idea why:
>> boxplot(v1$ARI_MIT ~ v1$MONTH)
>>
>> I don't need this plotted, though, but in the form of a new data frame
>> with three columns: the month, the mean, and the standard deviation of
>> all values from that month.
>>
>> I tried un-stacking v1 into a list of vectors and then looping over
>> its elements, calculating the mean of each group:
>>
>> for (i in unstack(v1, v1$ARI_MIT ~ v1$MONTH)) { write(mean(i), "") }
>>
>> This works, but how do I get the data into a data frame? With the
>> month labels in a column? They are not available inside the loop body.
>>
>> I know I need to get a book on R.
>>
>> Thanks,
>> robert
>>


Re: [R] Help - Importing data from txt and xlsx files

2012-04-13 Thread AMFTom
Dear Thierry, 

Thanks for your help. Now though, I try to import data from a txt file, and
it says either 

> mydataframe <- read.table("Lv2.8.txt")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 3 did not have 15 elements

or

> mydataframe <- read.table("Lv2.8.txt", header = TRUE)
Error in read.table("Lv2.8.txt", header = TRUE) : 
  more columns than column names

even though I seem to have put a column name into the txt file for each
column. Any ideas?
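
One way to narrow down errors like these is to count the fields R sees on
each line. The sketch below uses a small made-up file (not Lv2.8.txt), and it
assumes the cause is a tab-delimited file with a space inside one field; that
is only one of several possible causes.

```r
# Made-up tab-delimited file in which one field contains a space
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tname\tvalue",
             "1\tred fox\t3.5",
             "2\tcat\t4.1"), tmp)

# With the default separator (any whitespace) the lines disagree,
# which is what makes read.table() complain
count.fields(tmp)

# Splitting on tabs only, every line has the same number of fields
count.fields(tmp, sep = "\t")

dat <- read.table(tmp, header = TRUE, sep = "\t")
str(dat)
```

If the counts still disagree with the right separator, looking at the lines
that count.fields() flags (stray quotes, '#' characters, missing trailing
values) is usually the next step.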

Thanks again!

Tom

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-Importing-data-from-txt-and-xlsx-files-tp4554622p4555001.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Gradients in bar charts XXXX

2012-04-13 Thread Greg Snow
Here is one approach:

tmp <- rbinom(10, 100, 0.78)

mp <- barplot(tmp, space=0, ylim=c(0,100))

tmpfun <- colorRamp( c('green','yellow',rep('red',8)) )

mat <- 1-row(matrix( nrow=100, ncol=10 ))/100
tmp2 <- tmpfun(mat)

mat2 <- as.raster( matrix( rgb(tmp2, maxColorValue=255), ncol=10) )

for(i in 1:10) mat2[ mat[,i] >= tmp[i]/100, i] <- NA


rasterImage(mat2, mp[1] - (mp[2]-mp[1])/2, 0, mp[10] + (mp[2]-mp[1])/2, 100,
interpolate=FALSE)

barplot(tmp, col=NA, add=TRUE, space=0)


You can tweak it to your desire.  It might look a little better if
each bar were drawn independently with interpolate=TRUE (this would
also be needed if you had space between the bars).
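
A rough sketch of that per-bar variant, assuming proportions on a 0-1 scale
with green at 0, yellow at 0.10 and red from 0.20 upwards as in the original
question. The values in 'vals' and the helper 'bar_raster' are made up for
illustration.

```r
vals <- c(0.05, 0.15, 0.30, 0.22)          # made-up bar heights
ramp <- colorRamp(c("green", "yellow", "red"))

# Build a one-column raster for a bar of the given height
bar_raster <- function(height, red_at = 0.20, n = 100) {
  y <- seq(height, 0, length.out = n)      # first raster row is the bar top
  cols <- ramp(pmin(y / red_at, 1))        # 0 green, red_at/2 yellow, >= red_at red
  as.raster(matrix(rgb(cols, maxColorValue = 255), ncol = 1))
}

# Empty bars set up the coordinates; space > 0 now works
mp <- barplot(vals, space = 0.2, ylim = c(0, 0.35), col = NA, border = NA)
for (i in seq_along(vals)) {
  rasterImage(bar_raster(vals[i]), mp[i] - 0.5, 0, mp[i] + 0.5, vals[i],
              interpolate = TRUE)
}
barplot(vals, space = 0.2, col = NA, add = TRUE)   # outlines on top
```

Each raster only covers its own bar, so nothing has to be blanked out with
NA as in the single-image version.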


On Mon, Apr 9, 2012 at 12:40 PM, Jason Rodriguez
 wrote:
> Hello, I have a graphics-related question:
>
> I was wondering if anyone knows of a way to create a bar chart that is 
> colored with a three-part gradient that changes at fixed y-values. Each bar 
> needs to fade green-to-yellow at Y=.10 and from yellow-to-red at Y=.20. Is 
> there an option in a package somewhere that offers an easy way to do this?
>
> Attached is a chart I macgyvered together in Excel using a combination of a 
> simple bar chart, fit line, and some drawing tools. I want to avoid doing it 
> this way in the future by finding a way to replicate it in R.
>
> Any ideas?
>
> Thanks,
>
> Jason Michael Rodriguez
> Data Analyst
> State Housing Trust Fund for the Homeless
> Georgia Department of Community Affairs
> Email:  jason.rodrig...@dca.ga.gov
>
>
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] vif calculation with car and HH packages

2012-04-13 Thread John Fox
Dear Özgür,

car::vif() produces a warning, not an error. It will proceed to compute VIFs 
based on the correlation matrix of the coefficients (take a look at 
car:::vif.lm) even if there is no intercept, and even though this would not 
normally correspond to variance inflation due to correlation of the predictors. 
If you think that makes sense, then by all means use the VIFs.
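
To make the "correlation matrix of the coefficients" remark concrete, here is
a small base-R sketch of the idea. It is not the car source code, it only
covers one-degree-of-freedom terms, and the data and the helper name
'vif_from_coef_cor' are invented for illustration.

```r
set.seed(123)
n  <- 100
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n)
x3 <- rnorm(n)
y  <- 1 + x1 + x2 + x3 + rnorm(n)

# VIF-like values from the correlation matrix of the coefficient estimates
vif_from_coef_cor <- function(model) {
  v    <- vcov(model)
  keep <- rownames(v) != "(Intercept)"     # drop the intercept if present
  R    <- cov2cor(v[keep, keep, drop = FALSE])
  diag(solve(R))
}

# Textbook definition: 1 / (1 - R^2) from regressing each x on the others
preds <- c("x1", "x2", "x3")
vif_classic <- sapply(preds, function(v) {
  r2 <- summary(lm(reformulate(setdiff(preds, v), response = v)))$r.squared
  1 / (1 - r2)
})

with_int <- lm(y ~ x1 + x2 + x3)
no_int   <- lm(y ~ -1 + x1 + x2 + x3)

rbind(classic        = vif_classic,
      with_intercept = vif_from_coef_cor(with_int),
      no_intercept   = vif_from_coef_cor(no_int))
```

With an intercept the two definitions agree. Without one the computation
still runs, but it is based on uncentered predictors, so it no longer matches
the textbook values; that is what the warning is about.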

Best,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Fri, 13 Apr 2012 09:49:54 -0700 (PDT)
 Özgür Asar  wrote:
> Dear all,
> 
> I have faced a problem while calculating VIF values via the packages car
> and HH for models without intercepts. Below is an illustrative example:
> 
> 1) via the car package
> 
> > y<-rnorm(100,0,1)
> > x1<-rnorm(100,0,1)
> > x2<-rnorm(100,0,1)
> > x3<-rnorm(100,0,1)
> > model1<-lm(y~-1+x1+x2+x3)
> > model2<-lm(y~-1+x1+x2)
> > library(car)
> > vif(model1)
>   x1   x2   x3 
> 1.000279 1.019231 1.019376 
> Warning message:
> In vif.lm(model1) : No intercept: vifs may not be sensible.
> > vif(model2)
>   x1   x2 
> 1.85 1.85 
> Warning message:
> In vif.lm(model2) : No intercept: vifs may not be sensible.
> 
> 2) via the HH package
> > library(HH)
> > vif(model1)
>   x2   x3 
> 1.000557 1.000557 
> > vif(model2)
> Error in vif.default(xx, na.action = na.action) : 
>   vif requires two or more X-variables.
> 
> I could not understand why this occurred. Does anyone have any idea about it?
> 
> Best
> Ozgur
> 
> 
> 
> 
> -
> 
> 
> Ozgur ASAR
> 
> Research Assistant
> Middle East Technical University
> Department of Statistics
> 06531, Ankara Turkey
> Ph: 90-312-2105309
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/vif-calculation-with-car-and-HH-packages-tp4555402p4555402.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] using wildcards in download.file?

2012-04-13 Thread steven mosher
One way to solve your problem is to fetch the directory listing using RCurl,
then use mapply() with that listing as a parameter passed to download.file().
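
A sketch of what that might look like. It assumes the RCurl package is
installed and that the URL ends in "/"; the function names 'parse_listing'
and 'download_ftp_dir' are made up here, and nothing below has been run
against a real server.

```r
# Split a raw FTP directory listing into file names
parse_listing <- function(txt) {
  files <- strsplit(txt, "\r*\n")[[1]]
  files[nzchar(files)]
}

# Download every entry of an FTP directory (assumes url ends with "/")
download_ftp_dir <- function(url, destdir = ".") {
  listing <- RCurl::getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
  files   <- parse_listing(listing)
  mapply(download.file,
         url      = paste0(url, files),
         destfile = file.path(destdir, files),
         MoreArgs = list(mode = "wb"))
}

# Usage (not run):
# download_ftp_dir("ftp://abc.com/", destdir = tempdir())
```

Note that the listing may also contain subdirectories, which
download.file() will not be able to fetch as files.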
On Apr 13, 2012 9:24 AM, "MacQueen, Don"  wrote:

> If you take a thorough look at the help page for download.file, and follow
> its advice, you may find a solution.
>
> Hint:
> The help page for download.file says,
>
> -- quote --
> The function 'download.file' can be used to download a single file
> as described by 'url' from the internet and store it in
> 'destfile'.
> -- end quote --
>
> I'd say that makes it pretty clear the answer is no.
>
> Further on in the help page in the "See Also" section it refers to another
> package.
>
> Hint:
> A help page for one of the functions in that package says,
> -- quote --
> # FTP
>   # Download the files within a directory.
>
> -- end quote --
>
>
> which looks promising.
>
>
> --
> Don MacQueen
>
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
>
>
>
>
>
> On 4/12/12 2:07 PM, "jt...@mappi.helsinki.fi" 
> wrote:
>
> >Hi,
> >Do you know whether it is possible to use wildcards in download.file()?
> >For example:
> >url = "ftp://abc.com/*.*"; # to download all the files in the ftp folder
> >download.file(url,destfile=...) # does not work, any solutions?
> >
> >Thanks!
> >
> >JIng
> >


Re: [R] Kaplan Meier analysis: 95% CI wider in R than in SAS

2012-04-13 Thread Frank Harrell
Make sure you use the log S(t) basis on both systems (and avoid log-log S(t)
basis as this results in instability in the front part of the survival
curve).
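
For reference, a small sketch of how the basis is chosen in R, using the aml
data that ships with the survival package rather than the data in this
thread. survfit() defaults to conf.type = "log"; as far as I know SAS PROC
LIFETEST defaults to the log-log transform (its CONFTYPE= option), which
would explain differing intervals, but check the SAS documentation for your
version.

```r
library(survival)

# Same Kaplan-Meier estimate, two different bases for the confidence limits
fit_log    <- survfit(Surv(time, status) ~ x, data = aml, conf.type = "log")
fit_loglog <- survfit(Surv(time, status) ~ x, data = aml, conf.type = "log-log")

print(fit_log)      # median and its limits on the log S(t) basis
print(fit_loglog)   # same medians, different limits

# The survival estimates are identical; only the limits move
head(cbind(surv         = fit_log$surv,
           lower_log    = fit_log$lower,
           lower_loglog = fit_loglog$lower))
```

Setting the same option explicitly on both systems is the simplest way to
confirm that the remaining differences are not due to something else.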
Frank

Paul Miller wrote
> 
> Hi Enrico,
> 
> Not sure how SAS builds the CI but I can look into it. The SAS
> documentation does have a section on computational formulas at:
> 
> http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_lifetest_a000259.htm
> 
> Although I can't provide my dataset, I can provide the data and code
> below. This is the R-equivalent of an analysis from "Common Statistical
> Methods for Clinical Research with SAS Examples."
> 
> R produces the following output:
> 
>> print(surv.by.vac)
> Call: survfit(formula = Surv(WKS, CENS == 0) ~ VAC, data = hsv)
> 
>         records n.max n.start events median 0.95LCL 0.95UCL
> VAC=GD2      25    25      25     14     35      15      NA
> VAC=PBO      23    23      23     17     15      12      35
> 
> SAS has the same 95% CI for VAC=GD2 but has a 95% CI of [10, 27] for
> VAC=PBO. This is just like in the analysis I'm doing currently.
> 
> Thanks,
> 
> Paul
> 
>  
> ###
> ### Chapter 21: The Log-Rank Test
> ###
>  
> #
> # Example 21.1: HSV2 Vaccine with gD2 Vaccine
> #
>  
> connection <- textConnection("
> GD2  1   8 12  GD2  3 -12 10  GD2  6 -52  7
> GD2  7  28 10  GD2  8  44  6  GD2 10  14  8
> GD2 12   3  8  GD2 14 -52  9  GD2 15  35 11
> GD2 18   6 13  GD2 20  12  7  GD2 23  -7 13
> GD2 24 -52  9  GD2 26 -52 12  GD2 28  36 13
> GD2 31 -52  8  GD2 33   9 10  GD2 34 -11 16
> GD2 36 -52  6  GD2 39  15 14  GD2 40  13 13
> GD2 42  21 13  GD2 44 -24 16  GD2 46 -52 13
> GD2 48  28  9  PBO  2  15  9  PBO  4 -44 10
> PBO  5  -2 12  PBO  9   8  7  PBO 11  12  7
> PBO 13 -52  7  PBO 16  21  7  PBO 17  19 11
> PBO 19   6 16  PBO 21  10 16  PBO 22 -15  6
> PBO 25   4 15  PBO 27  -9  9  PBO 29  27 10
> PBO 30   1 17  PBO 32  12  8  PBO 35  20  8
> PBO 37 -32  8  PBO 38  15  8  PBO 41   5 14
> PBO 43  35 13  PBO 45  28  9  PBO 47   6 15
> ")
> 
> hsv <- data.frame(scan(connection, list(VAC="", PAT=0, WKS=0, X=0)))
> hsv <- transform(hsv,
>   CENS = ifelse(WKS < 1, 1, 0),
>   WKS  = abs(WKS),
>   TRT  = ifelse(VAC=="GD2", 1, 0))
> 
> library("survival")
> surv.by.vac <- survfit(Surv(WKS,CENS==0)~VAC, data=hsv)
> 
> plot(surv.by.vac, 
>  main = "The Log-Rank Test \n Example 21.1: HSV-Episodes with gD2 Vaccine",
>  ylab = "Survival Distribution Function",
>  xlab = "Survival Time in Weeks",
>  lty = c(1,2))
> 
> legend(0.75,0.19, 
>  legend = c("gD2","PBO"), 
>  lty = c(1,2), title = "Treatment")
> 
> summary(surv.by.vac)
> print(surv.by.vac)
>  
> 
> 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Kaplan-Meier-analysis-95-CI-wider-in-R-than-in-SAS-tp4554559p4555447.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] caret package: custom summary function in trainControl doesn't work with oob?

2012-04-13 Thread Max Kuhn
Matt,

> I've been using a custom summary function to optimise regression model
> methods using the caret package. This has worked smoothly. I've been using
> the default bootstrapping resampling method. For bagging models
> (specifically randomForest in this case) caret can, in theory, use the
> out-of-bag (oob) error estimate from the model instead of resampling, which
> (in theory) is largely redundant for such models. Since they take a while
> to build in the first place, it really slows things down when estimating
> performance using bootstrap.
>
> I can successfully run either using the oob 'resampling method' with the
> default RMSE optimisation, or run using bootstrap and my custom
> summaryFunction as the thing to optimise, but they don't work together. If
> I try and use oob and supply a summaryFunction caret throws an error saying
> it can't find the relevant metric.
>
> Now, if caret is simply polling the randomForest object for the stored oob
> error I can understand this limitation

That is exactly what it does. See caret:::rfStats (not a public function)

train() was written to be fairly general and this level of control
would be very difficult to implement, especially since each model that
does some type of bagging uses different internal structures etc.

> but in the case of randomForest
> (and probably other bagging methods?) the training function can be asked to
> return information about the individual tree predictions and whether data
> points were oob in each case. With this information you can reconstruct an
> oob 'error' using whatever function you choose to target for optimisation.
> As far as I can tell, caret is not doing this and I can't see anywhere that
> it can be coerced to do so.

It will not be able to do this. I'm not sure that you can either.
randomForest() will return the individual trees and
predict.randomForest() can return the per-tree results, but I don't
know if it saves the indices that tell you which bootstrap samples
contained which training set points. Perhaps Andy would know.

> Have I missed something? Can anyone suggest how this could be achieved? It
> wouldn't be *that* hard to code up something that essentially operates in
> the same way as caret.train but can handle this feature for bagging models,
> but if it is already there and I've missed something please let me know.

Well, everything is easy for the person not doing it =]

If you save the proximity measures, you might gain the sampling
indices. With these, you would use predict.randomForest(...,
predict.all=TRUE) to get the individual predictions.
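
A rough sketch of that idea, under the assumption that the keep.inbag
argument of randomForest() is what records the sampling indices (rather than
the proximity measures). It uses mtcars as stand-in data, a made-up metric
'mae', and regression only; treat it as untested outside this toy case.

```r
library(randomForest)

set.seed(1)
x <- mtcars[, -1]
y <- mtcars$mpg

# keep.inbag = TRUE stores an n x ntree matrix of in-bag counts
rf <- randomForest(x, y, ntree = 500, keep.inbag = TRUE)

# Per-tree predictions for the training points (n x ntree)
per_tree <- predict(rf, x, predict.all = TRUE)$individual

# Keep only predictions from trees where the point was out of bag
per_tree[rf$inbag > 0] <- NA
oob_pred <- rowMeans(per_tree, na.rm = TRUE)

# Any summary function can now be applied to the OOB predictions
mae <- function(obs, pred) mean(abs(obs - pred))
mae(y, oob_pred)
```

In this toy case oob_pred should reproduce rf$predicted, which is a useful
check before swapping in a custom metric.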

Max



[R] vif calculation with car and HH packages

2012-04-13 Thread Özgür Asar
Dear all,

I have faced a problem while calculating VIF values via the packages car
and HH for models without intercepts. Below is an illustrative example:

1) via the car package

> y<-rnorm(100,0,1)
> x1<-rnorm(100,0,1)
> x2<-rnorm(100,0,1)
> x3<-rnorm(100,0,1)
> model1<-lm(y~-1+x1+x2+x3)
> model2<-lm(y~-1+x1+x2)
> library(car)
> vif(model1)
  x1   x2   x3 
1.000279 1.019231 1.019376 
Warning message:
In vif.lm(model1) : No intercept: vifs may not be sensible.
> vif(model2)
  x1   x2 
1.85 1.85 
Warning message:
In vif.lm(model2) : No intercept: vifs may not be sensible.

2) via the HH package
> library(HH)
> vif(model1)
  x2   x3 
1.000557 1.000557 
> vif(model2)
Error in vif.default(xx, na.action = na.action) : 
  vif requires two or more X-variables.

I could not understand why this occurred. Does anyone have any idea about it?

Best
Ozgur




-


Ozgur ASAR

Research Assistant
Middle East Technical University
Department of Statistics
06531, Ankara Turkey
Ph: 90-312-2105309
--
View this message in context: 
http://r.789695.n4.nabble.com/vif-calculation-with-car-and-HH-packages-tp4555402p4555402.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R Large Dataset Problem

2012-04-13 Thread jim holtman
When using 'scan' I had no problem reading a string that had 1000 'columns'

> x <- scan('/temp/tempxx.txt', what = list(0L, ''))
Read 14 records
>
> str(x)
List of 2
 $ : int [1:14] 129876 129876 129876 129876 129876 129876 129876 129876 129876 129876 ...
 $ : chr [1:14]
   "101010111000000111000011010101011100000011100001101010101110000001110000110101010111000"| __truncated__
   "101010111000000111000011010101011100000011100001101010101110000001110000110101010111000"| __truncated__
   "101010111000000111000011010101011100000011100001101010101110000001110000110101010111000"| __truncated__
   "101010111000000111000011010101011100000011100001101010101110000001110000110101010111000"| __truncated__ ...
> nchar(x[[2]])
 [1] 1184 1184 1184 1184 1184 1184 1184 1184 1184 1184 1184 1184 1184 1184
>


Notice I specified the second field as character.

On Fri, Apr 13, 2012 at 9:21 AM, Milan Bouchet-Valat wrote:

> Le vendredi 13 avril 2012 à 05:44 -0700, efulas a écrit :
> > Thank you very much for your helps guys. Both message help me to run the
> data
> > in R. However, R is omitting many columns from my data. Am i missing
> > something?
> Please read the posting guide. If you don't provide the code you ran and
> the resulting objects and messages, we cannot possibly help you.
>
>
> Regards
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] using wildcards in download.file?

2012-04-13 Thread MacQueen, Don
If you take a thorough look at the help page for download.file, and follow
its advice, you may find a solution.

Hint:
The help page for download.file says,

-- quote --
The function 'download.file' can be used to download a single file
 as described by 'url' from the internet and store it in
 'destfile'.
-- end quote --

I'd say that makes it pretty clear the answer is no.

Further on in the help page in the "See Also" section it refers to another
package.

Hint:
A help page for one of the functions in that package says,
-- quote --
# FTP
   # Download the files within a directory.
 
-- end quote --


which looks promising.


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 4/12/12 2:07 PM, "jt...@mappi.helsinki.fi" 
wrote:

>Hi,
>Do you know whether it is possible to use wildcards in download.file()?
>For example:
>url = "ftp://abc.com/*.*"; # to download all the files in the ftp folder
>download.file(url,destfile=...) # does not work, any solutions?
>
>Thanks!
>
>JIng
>


Re: [R] Long command in Sweave

2012-04-13 Thread Gavin Simpson
I use Emacs and ESS, with the coding standards in one of the R manuals.
I have to insert the carriage returns where I want them, but Emacs/ESS
indents the code correctly.

G

On Fri, 2012-04-13 at 22:17 +0800, Wincent wrote:
> Thanks, Gavin and Duncan.
> 
> In that case, what I need is a suitable editor which can break the
> command properly.
> 
> All the best
> 
> On 13 April 2012 19:33, Gavin Simpson  wrote:
> > On Fri, 2012-04-13 at 17:46 +0800, Wincent wrote:
> >> Dear useRs,
> >>
> >> I am writing a vignette for a package, which contains long command like 
> >> this,
> >> >reduce(Lipset_cs,"SURVIVAL",c("GNPCAP", "URBANIZA", "LITERACY", "INDLAB", 
> >> >"GOVSTAB"),explain="positive",remainder="exclude",case="CASEID")
> >> It is longer than the width a page and part of it will become "missing".
> >> Currently, I have to manually break the command into multiple lines.
> >> Is there a better way to handle such issue?
> >
> > Not that I am aware of.
> >
> >> It seems that others have raised similar question which seems to
> >> remain unsolved in a satisfactory fashion.
> >>
> >> Thanks for your kind attention in advance.
> >>
> >
> > 1) use some spacing and format the code over multiple lines
> >
> > reduce(Lipset_cs, "SURVIVAL",
> >   c("GNPCAP", "URBANIZA", "LITERACY", "INDLAB", "GOVSTAB"),
> >   explain="positive", remainder="exclude", case="CASEID")
> >
> > Isn't that more readable?! Any good R-aware editor should be able to
> > handle appropriate formatting of the code. I *never* write long lines in
> > my editor; I always break the code down to fit roughly into a 72 column
> > editor window.
> >
> > 2) if you want to force Sweave to respect your new formatting, use
> > argument `keep.source=TRUE` for the code chunk. Or set it document-wide
> > using \SweaveOpts{option1=value1, option2=value2} etc. in the preamble
> > (where optionX is one of the arguments and valueX is what you want to
> > set that argument to).
> >
> > Though IIRC, `keep.source=TRUE` is the default now, and as such Sweave
> > will respect your formatting by default; before, it broke lines
> > where it could.
> >
> > In short get out of the habit of writing long lines of R code; you'll be
> > better in the long run laying your code out logically.
> >
> > HTH
> >
> > G
> >
> > --
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> >  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
> >  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
> >  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
> >  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >
> >
> >
> 
> 
> 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Long command in Sweave

2012-04-13 Thread Yihui Xie
You can probably try knitr; see the manual for example:
https://github.com/downloads/yihui/knitr/knitr-manual.pdf

Code reformatting is based on the formatR package
(https://github.com/yihui/formatR/wiki), which tries to preserve your
comments while breaking your long lines into shorter ones.

However, the absolutely reliable answer is that you should do it by
yourself (either manually or with a decent code editor).
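
As a rough illustration of automatic line breaking, here is a minimal sketch that uses base R's deparse() rather than formatR (whose interface is not shown in this thread). The 60-character cutoff is an arbitrary choice, and deparse() does not keep comments, so this is only a crude aid and not a substitute for laying the code out by hand.

```r
# A long call, stored unevaluated so that reduce() does not need to exist.
long_call <- quote(
  reduce(Lipset_cs, "SURVIVAL", c("GNPCAP", "URBANIZA", "LITERACY", "INDLAB", "GOVSTAB"), explain = "positive", remainder = "exclude", case = "CASEID")
)

# deparse() tries to break lines once they exceed width.cutoff bytes.
broken <- deparse(long_call, width.cutoff = 60L)
cat(broken, sep = "\n")

# Check that re-parsing the broken lines reproduces the original call.
reparsed <- parse(text = paste(broken, collapse = "\n"))[[1]]
identical(reparsed, long_call)
```

The broken lines can then be pasted into the vignette chunk and adjusted by hand where the automatic breaks look awkward.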

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Fri, Apr 13, 2012 at 9:17 AM, Wincent  wrote:
> Thanks, Gavin and Duncan.
>
> In that case, what I need is a suitable editor which can break the
> command properly.
>
> All the best
>
> On 13 April 2012 19:33, Gavin Simpson  wrote:
>> On Fri, 2012-04-13 at 17:46 +0800, Wincent wrote:
>>> Dear useRs,
>>>
>>> I am writing a vignette for a package, which contains a long command like
>>> this:
>>> >reduce(Lipset_cs,"SURVIVAL",c("GNPCAP", "URBANIZA", "LITERACY", "INDLAB", 
>>> >"GOVSTAB"),explain="positive",remainder="exclude",case="CASEID")
>>> It is longer than the width of a page, and part of it goes "missing".
>>> Currently, I have to manually break the command into multiple lines.
>>> Is there a better way to handle such an issue?
>>
>> Not that I am aware of.
>>
>>> It seems that others have raised similar question which seems to
>>> remain unsolved in a satisfactory fashion.
>>>
>>> Thanks for your kind attention in advance.
>>>
>>
>> 1) use some spacing and format the code over multiple lines
>>
>> reduce(Lipset_cs, "SURVIVAL",
>>       c("GNPCAP", "URBANIZA", "LITERACY", "INDLAB", "GOVSTAB"),
>>       explain="positive", remainder="exclude", case="CASEID")
>>
>> Isn't that more readable?! Any good R-aware editor should be able to
>> handle appropriate formatting of the code. I *never* write long lines in
>> my editor; I always break the code down to fit roughly into a 72 column
>> editor window.
>>
>> 2) if you want to force Sweave to respect your new formatting, use
>> argument `keep.source=TRUE` for the code chunk. Or set it document wide
>> using \SweaveOpts{option1=value1, option2=value2} etc in the preamble
>> (where optionX is one of the arguments and valueX what you want to set
>> that argument to).
>>
>> Though IIRC, `keep.source=TRUE` is the default now and as such Sweave
>> will respect your formatting by default now - before it broke lines
>> where it could.
>>
>> In short get out of the habit of writing long lines of R code; you'll be
>> better in the long run laying your code out logically.
>>
>> HTH
>>
>> G
>>
>> --
>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>>  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>>  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>>  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>>  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>>  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>>
>>
>>
>
>
>
> --
> Wincent Ronggui HUANG
> Sociology Department of Fudan University
> PhD of City University of Hong Kong
> http://homepage.fudan.edu.cn/rghuang/cv/
>



Re: [R] how to read netcdf file in R

2012-04-13 Thread David William Pierce
On Fri, Apr 13, 2012 at 12:09 AM, Yogesh Tiwari
 wrote:
> Dear David,
>
> Thanks,
>
> I could read and open .nc file in R, but now how to plot a simple filled
> color. [...]

Hi Yogesh,

glad to hear that the ncdf package is doing its job correctly. I'm
sure you understand that I don't have the resources to answer
miscellaneous general questions about how to use R, especially
considering that there are good instruction manuals freely available
on the web. You can also buy a textbook on R if you want to learn it
in a more structured fashion. Either way, R is a capable system that
rewards a modest effort devoted to learning how to use it.

Regards,

--Dave

-- 
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography, La Jolla, California, USA
(858) 534-8276 (voice)  /  (858) 534-8561 (fax)    dpie...@ucsd.edu



Re: [R] Kaplan Meier analysis: 95% CI wider in R than in SAS

2012-04-13 Thread Paul Miller
Hi Enrico,

Not sure how SAS builds the CI but I can look into it. The SAS documentation 
does have a section on computational formulas at:

http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_lifetest_a000259.htm

Although I can't provide my dataset, I can provide the data and code below. 
This is the R-equivalent of an analysis from "Common Statistical Methods for 
Clinical Research with SAS Examples."

R produces the following output:

> print(surv.by.vac)
Call: survfit(formula = Surv(WKS, CENS == 0) ~ VAC, data = hsv)

        records n.max n.start events median 0.95LCL 0.95UCL
VAC=GD2      25    25      25     14     35      15      NA
VAC=PBO      23    23      23     17     15      12      35

SAS has the same 95% CI for VAC=GD2 but has a 95% CI of [10, 27] for VAC=PBO. 
This is just like in the analysis I'm doing currently.

Thanks,

Paul

 
###
### Chapter 21: The Log-Rank Test
###

#
# Example 21.1: HSV2 Vaccine with gD2 Vaccine
#
 
connection <- textConnection("
GD2  1   8 12  GD2  3 -12 10  GD2  6 -52  7
GD2  7  28 10  GD2  8  44  6  GD2 10  14  8
GD2 12   3  8  GD2 14 -52  9  GD2 15  35 11
GD2 18   6 13  GD2 20  12  7  GD2 23  -7 13
GD2 24 -52  9  GD2 26 -52 12  GD2 28  36 13
GD2 31 -52  8  GD2 33   9 10  GD2 34 -11 16
GD2 36 -52  6  GD2 39  15 14  GD2 40  13 13
GD2 42  21 13  GD2 44 -24 16  GD2 46 -52 13
GD2 48  28  9  PBO  2  15  9  PBO  4 -44 10
PBO  5  -2 12  PBO  9   8  7  PBO 11  12  7
PBO 13 -52  7  PBO 16  21  7  PBO 17  19 11
PBO 19   6 16  PBO 21  10 16  PBO 22 -15  6
PBO 25   4 15  PBO 27  -9  9  PBO 29  27 10
PBO 30   1 17  PBO 32  12  8  PBO 35  20  8
PBO 37 -32  8  PBO 38  15  8  PBO 41   5 14
PBO 43  35 13  PBO 45  28  9  PBO 47   6 15
")

hsv <- data.frame(scan(connection, list(VAC="", PAT=0, WKS=0, X=0)))
hsv <- transform(hsv,
  CENS = ifelse(WKS < 1, 1, 0),
  WKS  = abs(WKS),
  TRT  = ifelse(VAC=="GD2", 1, 0))

library("survival")
surv.by.vac <- survfit(Surv(WKS,CENS==0)~VAC, data=hsv)

plot(surv.by.vac, 
 main = "The Log-Rank Test \n Example 21.1: HSV-Episodes with gD2 Vaccine",
 ylab = "Survival Distribution Function",
 xlab = "Survival Time in Weeks",
 lty = c(1,2))

legend(0.75,0.19, 
 legend = c("gD2","PBO"), 
 lty = c(1,2), title = "Treatment")

summary(surv.by.vac)
print(surv.by.vac)
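
One thing that may be worth checking, as an assumption and not something confirmed in this thread, is the transformation used for the confidence limits: survfit() defaults to conf.type = "log", and other software may default to a different transformation such as log-log. The minimal sketch below uses the aml data shipped with the survival package (not the HSV data above) to show how to switch the transformation and compare the limits for the median.

```r
library(survival)

# Same model, two different transformations for the confidence limits.
fit_log    <- survfit(Surv(time, status) ~ x, data = aml)  # default: "log"
fit_loglog <- survfit(Surv(time, status) ~ x, data = aml,
                      conf.type = "log-log")

# The median estimates agree; only the confidence limits can differ.
print(fit_log)
print(fit_loglog)
```

Applied to the example above, this would be survfit(Surv(WKS, CENS == 0) ~ VAC, data = hsv, conf.type = "log-log"); whether that reproduces the SAS interval has not been verified here.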
 



Re: [R] Execution speed in randomForest

2012-04-13 Thread Liaw, Andy
Without seeing your code, it's hard to say much more, but do avoid using 
formula when you have large data.
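
A minimal sketch of the two interfaces, on simulated data of a hypothetical size (not the poster's data). The assumption here is that the formula interface is slower mainly because it first builds a model frame, which means an extra copy of the data; the timing comparison only runs if the randomForest package is installed.

```r
set.seed(1)
n <- 2000
x <- data.frame(matrix(rnorm(n * 9), ncol = 9))   # 9 predictors
y <- factor(sample(c("a", "b"), n, replace = TRUE))

if (requireNamespace("randomForest", quietly = TRUE)) {
  # Formula interface: a model frame is built before the trees are grown.
  t_formula <- system.time(
    randomForest::randomForest(y ~ ., data = cbind(x, y = y), ntree = 50)
  )
  # x/y interface: predictors and response are passed directly.
  t_xy <- system.time(
    randomForest::randomForest(x = x, y = y, ntree = 50)
  )
  print(rbind(formula = t_formula, xy = t_xy)[, 1:3])
}
```

On data this small the difference may be negligible; the point is only the calling convention.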

Andy 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Jason & Caroline Shaw
Sent: Friday, April 06, 2012 1:20 PM
To: jim holtman
Cc: r-help@r-project.org
Subject: Re: [R] Execution speed in randomForest

The CPU time and elapsed time are essentially identical. (That is, the
system time is negligible.)

Using Rprof, I just ran the code twice.  The first time, while
randomForest is doing its thing, there are 850 consecutive lines which
read:
".C" "randomForest.default" "randomForest" "randomForest.formula" "randomForest"
Upon running it a second time, this time taking 285 seconds to
complete, there are 14201 such lines, with nothing intervening

There shouldn't be interference from elsewhere on the machine.  This
is the only memory- and CPU-intensive process.  I don't know how to
check what kind of paging is going on, but since the machine has 16GB
of memory and I am using maybe 3 or 4 at most, I hope paging is not an
issue.

I'm on a CentOS 5 box running R 2.15.0.

On Fri, Apr 6, 2012 at 12:45 PM, jim holtman  wrote:
> Are you looking at the CPU or the elapsed time?  If it is the elapsed
> time, then also capture the CPU time to see if it is different.  Also
> consider the use of the Rprof function to see where time is being
> spent.  What else is running on the machine?  Are you doing any
> paging?  What type of system are you running on?  Use some of the
> system level profiling tools.  If on Windows, then use perfmon.
>
> On Fri, Apr 6, 2012 at 11:28 AM, Jason & Caroline Shaw
>  wrote:
>> I am using the randomForest package.  I have found that multiple runs
>> of precisely the same command can generate drastically different run
>> times.  Can anyone with knowledge of this package provide some insight
>> as to why this would happen and whether there's anything I can do
>> about it?  Here are some details of what I'm doing:
>>
>> - Data: ~80,000 rows, with 10 columns (one of which is the class label)
>> - I randomly select 90% of the data to use to build 500 trees.
>>
>> And this is what I find:
>>
>> - Execution times of randomForest() using the entire dataset (in
>> seconds): 20.65, 20.93, 20.79, 21.05, 21.00, 21.52, 21.22, 21.22
>> - Execution times of randomForest() using the 90% selection: 17.78,
>> 17.74, 126.52, 241.87, 17.56, 17.97, 182.05, 17.82 <-- Note the 3rd,
>> 4th, and 7th.
>> - When the speed is slow, it often stutters, with one or a few trees
>> being produced very quickly, followed by a slow build taking 10 or 20
>> seconds
>> - The oob results are indistinguishable between the fast and slow runs.
>>
>> I select the 90% of my data by using sample() to generate indices and
>> then subsetting, like: selection <- data[sample,].  I thought perhaps
>> this subsetting was getting repeated, rather than storing in memory a
>> new copy of all that data, so I tried circumventing this with
>> eval(data[sample,]).  Probably barking up the wrong tree -- it had no
>> effect, and doesn't explain the run-to-run variation (really, I'm just
>> not clear on what eval() is for).  I have also tried garbage
>> collecting with gc() between each run, and adding a Sys.sleep() for 5
>> seconds, but neither of these has helped either.
>>
>> Any ideas?
>>
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.




Re: [R] Long command in Sweave

2012-04-13 Thread Wincent
Thanks, Gavin and Duncan.

In that case, what I need is a suitable editor which can break the
command properly.

All the best

On 13 April 2012 19:33, Gavin Simpson  wrote:
> On Fri, 2012-04-13 at 17:46 +0800, Wincent wrote:
>> Dear useRs,
>>
>> I am writing a vignette for a package, which contains a long command like this:
>> >reduce(Lipset_cs,"SURVIVAL",c("GNPCAP", "URBANIZA", "LITERACY", "INDLAB", 
>> >"GOVSTAB"),explain="positive",remainder="exclude",case="CASEID")
>> It is longer than the width of a page, and part of it goes "missing".
>> Currently, I have to manually break the command into multiple lines.
>> Is there a better way to handle such an issue?
>
> Not that I am aware of.
>
>> It seems that others have raised similar question which seems to
>> remain unsolved in a satisfactory fashion.
>>
>> Thanks for your kind attention in advance.
>>
>
> 1) use some spacing and format the code over multiple lines
>
> reduce(Lipset_cs, "SURVIVAL",
>       c("GNPCAP", "URBANIZA", "LITERACY", "INDLAB", "GOVSTAB"),
>       explain="positive", remainder="exclude", case="CASEID")
>
> Isn't that more readable?! Any good R-aware editor should be able to
> handle appropriate formatting of the code. I *never* write long lines in
> my editor; I always break the code down to fit roughly into a 72 column
> editor window.
>
> 2) if you want to force Sweave to respect your new formatting, use
> argument `keep.source=TRUE` for the code chunk. Or set it document wide
> using \SweaveOpts{option1=value1, option2=value2} etc in the preamble
> (where optionX is one of the arguments and valueX what you want to set
> that argument to).
>
> Though IIRC, `keep.source=TRUE` is the default now and as such Sweave
> will respect your formatting by default now - before it broke lines
> where it could.
>
> In short get out of the habit of writing long lines of R code; you'll be
> better in the long run laying your code out logically.
>
> HTH
>
> G
>
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
>
>



-- 
Wincent Ronggui HUANG
Sociology Department of Fudan University
PhD of City University of Hong Kong
http://homepage.fudan.edu.cn/rghuang/cv/



Re: [R] Simple Problem: Plotting mathematical functions

2012-04-13 Thread R. A. Bilonick

On 04/12/2012 09:11 PM, David Winsemius wrote:


On Apr 12, 2012, at 3:49 PM, Aye wrote:


Okay, i got this far:
f <- function(x) 0.25*x^2 + 6.47*x -32.6
g <- function(x) 0.99*x^2 -6*x -195
h <- function(x) 0.77*x^2 +14*x -495
j <- function(x) 0.001*x^2 + 65*x -785
k <- function(x) 0.9*x^2 -2*x -636
plot(x, f(x), xlab="Elements in the array",
     ylab="Index value of the sorting effort", type="l")
# lines(x, g(x), lty=3)  # Not sure if it works, but this is irrelevant atm.

As soon as R runs the plot command, it states that x is an object it can't
find.
So I probably have to do something like x <- XX. What do I insert
for the X to make it have the values - say - from 15 to 1?


x <- seq(15, 1000, by=5)




Thanks again.

--
View this message in context: 
http://r.789695.n4.nabble.com/Simple-Problem-Plotting-mathematical-functions-tp4552668p4552894.html

Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



You can certainly use "plot" but you could also use "curve":

> curve(0.25*x^2 + 6.47*x -32.6,0,10)
> f <- function(x) 0.25*x^2 + 6.47*x -32.6
> curve(f)
> curve(f,0,100)
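
Putting the two suggestions from this thread side by side, here is a minimal sketch using the functions f and g and the x range posted earlier; the axis labels and line types are arbitrary choices.

```r
f <- function(x) 0.25*x^2 + 6.47*x - 32.6
g <- function(x) 0.99*x^2 - 6*x - 195

# With plot(), x must exist before the call.
x <- seq(15, 1000, by = 5)

# g spans the wider range of values here, so it is drawn first
# and sets the y axis; f is then added with lines().
plot(x, g(x), type = "l", lty = 3, xlab = "x", ylab = "value")
lines(x, f(x), lty = 1)

# curve() evaluates the function itself, so no x vector is needed;
# add = TRUE draws onto the existing plot.
curve(f, from = 15, to = 1000)
curve(g, from = 15, to = 1000, add = TRUE, lty = 3)
```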

Rick

--
---
Richard A. Bilonick, PhD
Assistant Professor
412 647 5756
Dept. of Ophthalmology, School of Medicine
Dept. of Biostatistics, Graduate School of Public Health
Principal Investigator: Pittsburgh Aerosol Research and Inhalation Epidemiology 
Study (PARIES)
University of Pittsburgh



[R] list.dirs() full.names broken?

2012-04-13 Thread J Toll
Hi,

I am trying to list all the sub-directories in a particular directory
and having a few issues.  list.dirs seems to be slightly broken and/or
poorly labelled.  My issue appears to be the same as this one, from
the archives:

http://tolstoy.newcastle.edu.au/R/e16/help/11/11/1156.html

Here is some of my current platform info:
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
RStudio Version 0.95.263

Setting the full.names argument to FALSE simply doesn't appear to
work.  To get the desired behavior, I have to wrap the entire command
in basename(), as in the final example.  BTW, I've truncated the output
in width and length for the sake of brevity.

> getwd()
[1] "/Users/name/Documents/R"
> list.dirs(full.names = FALSE, recursive = FALSE)
 [1] "./abc"   "./bcd"   "./cde"   "./data1"
> list.dirs(".", full.names = FALSE, recursive = FALSE)
 [1] "./abc"   "./bcd"   "./cde"   "./data1"
> list.dirs("./", full.names = FALSE, recursive = FALSE)
 [1] ".//abc"   ".//bcd"   ".//cde"   ".//data1"
> list.dirs("data1/SP", full.names = FALSE, recursive = FALSE)
  [1] "data1/SP/2010-01-29" "data1/SP/2010-12-13" "data1/SP/2010-12-31"
  [7] "data1/SP/2011-01-06" "data1/SP/2011-01-07" "data1/SP/2011-01-10"
 [13] "data1/SP/2011-01-14" "data1/SP/2011-01-18" "data1/SP/2011-01-19"
 [19] "data1/SP/2011-01-25" "data1/SP/2011-01-26" "data1/SP/2011-01-27"
> list.dirs("data1/SP", full.names = TRUE, recursive = FALSE)
  [1] "data1/SP/2010-01-29" "data1/SP/2010-12-13" "data1/SP/2010-12-31"
  [7] "data1/SP/2011-01-06" "data1/SP/2011-01-07" "data1/SP/2011-01-10"
 [13] "data1/SP/2011-01-14" "data1/SP/2011-01-18" "data1/SP/2011-01-19"
 [19] "data1/SP/2011-01-25" "data1/SP/2011-01-26" "data1/SP/2011-01-27"
> basename(list.dirs("data1/SP", full.names = FALSE, recursive = FALSE))
  [1] "2010-01-29" "2010-12-13" "2010-12-31" "2011-01-03" "2011-01-04"
 [11] "2011-01-12" "2011-01-13" "2011-01-14" "2011-01-18" "2011-01-19"
 [21] "2011-01-27" "2011-01-28" "2011-01-31" "2011-02-01" "2011-02-02"
 [31] "2011-02-10" "2011-02-11" "2011-02-14" "2011-02-15" "2011-02-16"

Is the full.names argument broken, and what is it supposed to do?
Even as it currently is, I wouldn't exactly describe the outputted
pathnames as "full", more accurately as "relative pathnames" from the
current working directory.

Does anyone know if list.dirs is really broken or if this is the
desired behavior?
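
For anyone wanting to reproduce this without my directory layout, here is a minimal self-contained sketch of the basename() workaround. The directory names are made up for the example; the workaround itself does not depend on how full.names behaves.

```r
# Build a small throwaway directory tree to experiment with.
root <- file.path(tempdir(), "listdirs_demo")
dir.create(file.path(root, "abc"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "bcd"), showWarnings = FALSE)

# Paths as returned with full.names = TRUE (prefixed with `root`).
full <- list.dirs(root, full.names = TRUE, recursive = FALSE)

# basename() strips the leading path from each entry.
short <- sort(basename(full))
print(short)
```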

Thanks,


James



[R] group comparison for ordinal variable

2012-04-13 Thread Weiwei Shi
Hi there,

I need to compare two groups on an ordinal variable. The possible values
range from 0 to 3, so there are many ties among the roughly 60 samples in
total. I am wondering whether a Wilcoxon test is appropriate and, if so,
whether the regular wilcox.test in R or the wilcox_test version in package
"coin" can do this job.

Thanks,

Weiwei

-- 
Weiwei Shi, Ph.D
Research Scientist


"Did you always know?"
"No, I did not. But I believed..."
---Matrix III





Re: [R] loess function take

2012-04-13 Thread Bert Gunter
Since you have only one dependent variable, try using lowess()
instead. It is less flexible -- only does local linear robust fitting
-- but has arguments built in that allow you to sample and interpolate
and limit the number of robustness iterations. It runs considerably
faster as a result.
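
A minimal sketch of those arguments on simulated data; the sample size, span and delta below are arbitrary choices for illustration, not recommendations.

```r
set.seed(42)
n <- 1e5
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, sd = 0.3)

# f     : smoother span (fraction of points used for each local fit)
# iter  : number of robustifying iterations (fewer is faster)
# delta : points within delta of the last fitted point are
#         interpolated instead of being fitted again
fit <- lowess(x, y, f = 0.1, iter = 1, delta = 0.01 * diff(range(x)))

# The result is a list with the sorted x values and the smoothed y values.
str(fit)
```

A larger delta means fewer local fits and more interpolation, which is where most of the speed-up comes from.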

-- Bert

On Fri, Apr 13, 2012 at 6:32 AM, Liaw, Andy  wrote:
> Alternatively, use only a subset to run loess(), either a random sample or 
> something like every other k-th (sorted) data value, or the quantiles.  It's 
> hard for me to imagine that that many data points are going to improve your 
> model much at all (unless you use tiny span).
>
> Andy
>
>
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Uwe Ligges
>
> On 12.04.2012 05:49, arunkumar wrote:
>> Hi
>>
>> The function loess takes very long time if the dataset is very huge
>> I have around 100 records
>> and used only one independent variable. still it takes very long time
>>
>> Any suggestion to reduce the time
>
>
> Use another method that is computationally less expensive for that many
> observations.
>
> Uwe Ligges
>
>
>> -
>> Thanks in Advance
>>          Arun
>> --
>> View this message in context: 
>> http://r.789695.n4.nabble.com/loess-function-take-tp4550896p4550896.html
>> Sent from the R help mailing list archive at Nabble.com.
>>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] getting the value from previous row

2012-04-13 Thread Milan Bouchet-Valat
On Friday, 13 April 2012 at 06:09 -0700, arunkumar wrote:
> Hi
> 
> I have a dataset with record A = 100,200,300,400...
> 
> There will be a parameter n. Say n=10 means I have to add 10% of the
> previous values to the current row:
> 
> current_Val   New_value
> 100           100
> 200           210 (200+10)
> 300           330 (300+20+10)
> 400           460 (400+30+20+10)
> 
> I'm using a loop, but it takes a long time. Please help
One solution is:
A <- 1:4 * 100
A + cumsum(c(0, A[-length(A)]) * 0.1)
[1] 100 210 330 460
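
The same idea wrapped in a small function with the percentage n as a parameter; the function name is made up for this sketch.

```r
# n is the percentage of each previous value that is carried forward.
add_previous_pct <- function(A, n) {
  # Shift A down by one position (the first row has no previous value) ...
  previous <- c(0, A[-length(A)])
  # ... and accumulate n percent of every earlier value.
  A + cumsum(previous * n / 100)
}

A <- 1:4 * 100
add_previous_pct(A, 10)
```

Because cumsum() is vectorised, this avoids the explicit loop entirely.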


Hope this helps



Re: [R] Partial Dependence and RandomForest

2012-04-13 Thread Liaw, Andy
Please read the help page for the partialPlot() function and make sure you 
learn about all its arguments (in particular, "which.class").
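
A minimal sketch of what "which.class" does, using the iris data instead of the poster's data (which was not provided). The choice of Petal.Width and of 100 trees is arbitrary, and the plotting part only runs if the randomForest package is installed.

```r
# Classes for which a partial dependence curve could be drawn.
classes <- levels(iris$Species)

if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  rf <- randomForest::randomForest(Species ~ ., data = iris, ntree = 100)

  # One partial dependence plot of Petal.Width per class, side by side.
  op <- par(mfrow = c(1, length(classes)))
  for (cl in classes) {
    randomForest::partialPlot(rf, iris, "Petal.Width",
                              which.class = cl, main = cl)
  }
  par(op)
}
```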

Andy 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of jmc
Sent: Wednesday, April 11, 2012 2:44 PM
To: r-help@r-project.org
Subject: [R] Partial Dependence and RandomForest

Hello all~

I am interested in clarifying something more conceptual, so I won't be
providing any data or code here.  

From what I understand, partial dependence plots can help you understand the
relative dependence on a variable, and the subsequent values of that
variable, after "averaging out the effects" of the other input variables. 
This is great, but what I am interested in knowing is how that relates to
each predictor class, not just the overall prediction.

Is it possible to plot partial dependence per class?  Specifically, I'd like
to know the important threshold values of my most important variables.

Thank you for your time,


--
View this message in context: 
http://r.789695.n4.nabble.com/Partial-Dependence-and-RandomForest-tp4549705p4549705.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] loess function take

2012-04-13 Thread Liaw, Andy
Alternatively, use only a subset to run loess(), either a random sample or 
something like every other k-th (sorted) data value, or the quantiles.  It's 
hard for me to imagine that that many data points are going to improve your 
model much at all (unless you use tiny span).
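
A minimal sketch of both subsetting ideas on simulated data; the sample sizes, k and span are arbitrary choices for illustration.

```r
set.seed(42)
n <- 1e5
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- sin(dat$x) + rnorm(n, sd = 0.3)

# Option 1: a random sample of rows.
idx_random <- sample(n, 2000)

# Option 2: every k-th row after sorting by x.
k <- 50
idx_sorted <- order(dat$x)[seq(1, n, by = k)]

# Either index vector can be used here; option 2 is shown.
fit <- loess(y ~ x, data = dat[idx_sorted, ], span = 0.3)

# The fit from the subset can still be evaluated at any x in its range.
grid <- data.frame(x = seq(1, 9, by = 0.5))
pred <- predict(fit, newdata = grid)
```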

Andy


From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Uwe Ligges

On 12.04.2012 05:49, arunkumar wrote:
> Hi
>
> The function loess takes very long time if the dataset is very huge
> I have around 100 records
> and used only one independent variable. still it takes very long time
>
> Any suggestion to reduce the time


Use another method that is computationally less expensive for that many 
observations.

Uwe Ligges


> -
> Thanks in Advance
>  Arun
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/loess-function-take-tp4550896p4550896.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] How do I convert factors to numeric? It's an FAQ but...

2012-04-13 Thread David Winsemius


On Apr 13, 2012, at 9:08 AM, John Coulthard wrote:



Dear R list people

I loaded a file of numbers into R and got a dataframe of factors.   
So I tried to convert it to numeric as per the FAQ using as.numeric().


Actually you used as.numeric(as.character()) which should have been  
successful under ordinary circumstances. However you applied it to an  
entire dataframe, when you should have applied it to each column  
separately. The last error message told you that you were sending the  
function the wrong datatype (list).



 But I'm getting errors (please see example), so what am I getting  
wrong?


Thanks for your time.
John

Example...

#my data object

f
   GSM187153 GSM187154 GSM187155 GSM187156 GSM187157 GSM187158 GSM187159
13  7.199346  7.394519  7.466155  8.035864  7.438536  7.308401  7.707994
14  6.910426  6.360291  6.228221   7.42918  7.120322  6.108129  7.201477
15   8.85921  9.152096  9.125067    6.4458  8.600319   8.97577  9.691167
16  5.851665  5.621529  5.673689  6.331274  6.160159   5.65945  5.595156
17  9.905257  8.596643   9.11741  9.872789  8.909299  9.104171  9.158998
18  6.176691  6.429807  6.418132  6.849236  6.162308  6.432743  6.444664
19  7.599871  8.795133  8.382509  5.887119  7.941895      7.92  8.170374
20  9.458262   8.39701  8.402015    9.0859  8.995632  8.427601  8.265105
21  8.179803  9.868286 10.570601  4.905013  9.488779  9.148336  9.654022
22  7.456822  8.037138  7.953766  6.666418  7.674927  7.995109  7.635158



That is not a reproducible example. You should provide the unedited  
output from dput(f)


Try:

numf <- lapply(f, function(x) as.numeric(as.character(x)) ) # returns a list

numf <- as.data.frame(numf)
str(numf)

'data.frame':   10 obs. of  7 variables:
 $ GSM187153: num  7.2 6.91 8.86 5.85 9.91 ...
 $ GSM187154: num  7.39 6.36 9.15 5.62 8.6 ...
 $ GSM187155: num  7.47 6.23 9.13 5.67 9.12 ...
 $ GSM187156: num  8.04 7.43 6.45 6.33 9.87 ...
 $ GSM187157: num  7.44 7.12 8.6 6.16 8.91 ...
 $ GSM187158: num  7.31 6.11 8.98 5.66 9.1 ...
 $ GSM187159: num  7.71 7.2 9.69 5.6 9.16 ...
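
A smaller self-contained sketch of the same conversion, on a made-up two-column data frame. It also shows the usual pitfall, and a variant that assigns into numf[] so that the result stays a data frame without a separate as.data.frame() step.

```r
# A small data frame of factors, similar in spirit to the one above.
f <- data.frame(a = factor(c("7.199346", "6.910426", "8.85921")),
                b = factor(c("7.394519", "6.360291", "9.152096")))

# Pitfall: as.numeric() on a factor returns the internal level codes.
as.numeric(f$a)

# Convert column by column; assigning into numf[] keeps the data frame
# structure (names and row names) instead of returning a plain list.
numf <- f
numf[] <- lapply(f, function(x) as.numeric(as.character(x)))
str(numf)
```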


Tested on
> dput(f)
structure(list(GSM187153 = structure(c(4L, 3L, 8L, 1L, 10L, 2L,
6L, 9L, 7L, 5L), .Label = c("5.851665", "6.176691", "6.910426",
"7.199346", "7.456822", "7.599871", "8.179803", "8.85921", "9.458262",
"9.905257"), class = "factor"), GSM187154 = structure(c(4L, 2L,
9L, 1L, 7L, 3L, 8L, 6L, 10L, 5L), .Label = c("5.621529", "6.360291",
"6.429807", "7.394519", "8.037138", "8.39701", "8.596643", "8.795133",
"9.152096", "9.868286"), class = "factor"), GSM187155 = structure(c(5L,
3L, 10L, 2L, 9L, 4L, 7L, 8L, 1L, 6L), .Label = c("10.570601",
"5.673689", "6.228221", "6.418132", "7.466155", "7.953766", "8.382509",
"8.402015", "9.11741", "9.125067"), class = "factor"), GSM187156 =  
structure(c(8L,

7L, 4L, 3L, 10L, 6L, 2L, 9L, 1L, 5L), .Label = c("4.905013",
"5.887119", "6.331274", "6.4458", "6.666418", "6.849236", "7.42918",
"8.035864", "9.0859", "9.872789"), class = "factor"), GSM187157 =  
structure(c(4L,

3L, 7L, 1L, 8L, 2L, 6L, 9L, 10L, 5L), .Label = c("6.160159",
"6.162308", "7.120322", "7.438536", "7.674927", "7.941895", "8.600319",
"8.909299", "8.995632", "9.488779"), class = "factor"), GSM187158 =  
structure(c(4L,

2L, 8L, 1L, 9L, 3L, 5L, 7L, 10L, 6L), .Label = c("5.65945", "6.108129",
"6.432743", "7.308401", "7.92", "7.995109", "8.427601", "8.97577",
"9.104171", "9.148336"), class = "factor"), GSM187159 = structure(c(5L,
3L, 10L, 1L, 8L, 2L, 6L, 7L, 9L, 4L), .Label = c("5.595156",
"6.444664", "7.201477", "7.635158", "7.707994", "8.170374", "8.265105",
"9.158998", "9.654022", "9.691167"), class = "factor")), .Names =  
c("GSM187153",

"GSM187154", "GSM187155", "GSM187156", "GSM187157", "GSM187158",
"GSM187159"), class = "data.frame", row.names = c("13", "14",
"15", "16", "17", "18", "19", "20", "21", "22"))




class(f)

[1] "data.frame"

#all the columns in the dataframe are of class 'factor'

for(i in 1:ncol(f)){if(class(f[,i])!="factor"){print(class(f[,i]))}}


#but it won't convert to numeric

g<-as.numeric(as.character(f))

Warning message:
NAs introduced by coercion

g

[1] NA NA NA NA NA NA NA NA NA NA

g<-as.numeric(levels(f))[as.integer(f)]

Error: (list) object cannot be coerced to type 'integer'





R version 2.14.1 (2011-12-22)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-redhat-linux-gnu (32-bit)



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT


Re: [R] #!/usr/bin/env Rscript --vanilla ??

2012-04-13 Thread Dirk Eddelbuettel

On 13 April 2012 at 10:32, Martin Maechler wrote:
| I think that's my first true question (rather than answer)
| to R-help.
| 
| As R has, for a long time, become my primary scripting and
| programming language, I'm preferring at times to write  Rscript
| files instead of shell scripts, notably when R has nice ways to
| do some of the things.
| On a standard standalone platform with standard R, 
| I would start such a script with
| ---
| #! /usr/bin/Rscript --vanilla
| ---
| (yes, the "--vanilla" is important to me, in this case)
| 
| However; as, at work, my scripts have to work correctly on quite a
| few different (unixy : several flavors of Linux, Solaris, MacOS X) platforms,
| *and* as an R developer, I have many different versions of R
| installed simultaneously, using /usr/bin/Rscript  is not an
| option.
| Rather, I'd use the /usr/bin/env trick :
| 
| ---
| #! /usr/bin/env Rscript
| ---
| 
| which finds Rscript in "the correct" place, according to the
| current PATH.  All fine till now.
| 
| PROBLEM:  It does not work with '--vanilla' or any other argument:
|  If I start my script with  
| #! /usr/bin/env Rscript --vanilla
|  the error message simply is
| /usr/bin/env: Rscript --vanilla: No such file or directory
| 
| I have tried a few variations on the theme, using quotes in
| different places, but have not succeeded till now.
| Any suggestions?

If moving away from Rscript to littler is an option:

   #!/usr/bin/r 

There is a well-known limitation of #! lines: everything after the interpreter
path is apparently passed as precisely one command-line argument.  Because r
uses getopt semantics, I *think* you can combine options, e.g. as in

   #!/usr/bin/r -vti

which would use --vanilla, --interactive and --rtemp toggle making sure we
use temp files/dirs the way R does. 

However, it seems that --vanilla is now implicit as I can't get littler to
read ~/.Rprofile.  Somewhere between a feature and a bug :)

Dirk

-- 
R/Finance 2012 Conference on May 11 and 12, 2012 at UIC in Chicago, IL
See agenda, registration details and more at http://www.RinFinance.com



Re: [R] How do I convert factors to numeric? It's an FAQ but...

2012-04-13 Thread ONKELINX, Thierry
f is a data frame of factors, not a factor.

Use either

as.numeric(levels(f$your.factor))[f$your.factor]

or, if f contains only factors,

sapply(f, function(x){as.numeric(levels(x))[x]})
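
As a minimal sketch of the second approach (the toy data frame f_demo and its values are made up for illustration, they are not the poster's data), each factor column can be converted through its levels and written back in place, so that the row and column names are kept:

```r
# Hypothetical stand-in for f: every column is a factor holding numbers as text
f_demo <- data.frame(a = factor(c("7.199346", "6.910426", "8.85921")),
                     b = factor(c("7.394519", "6.360291", "9.152096")),
                     row.names = c("13", "14", "15"))

# Convert each column via its levels; assigning into g_demo[] keeps the
# data frame shape, row names and column names
g_demo <- f_demo
g_demo[] <- lapply(f_demo, function(x) as.numeric(levels(x))[x])

str(g_demo)   # both columns are now numeric
```

Going through lapply() rather than apply() matters here: apply() first turns the data frame into a character matrix, at which point the factor levels are gone.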

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey


-Original message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of 
John Coulthard
Sent: Friday 13 April 2012 15:08
To: r-help@r-project.org
Subject: [R] How do I convert factors to numeric? It's an FAQ but...


Dear R list people

I loaded a file of numbers into R and got a dataframe of factors.  So I tried 
to convert it to numeric as per the FAQ using as.numeric().  But I'm getting 
errors (please see example), so what am I getting wrong?

Thanks for your time.
John

Example...

#my data object
> f
   GSM187153 GSM187154 GSM187155 GSM187156 GSM187157 GSM187158 GSM187159
13  7.199346  7.394519  7.466155  8.035864  7.438536  7.308401  7.707994
14  6.910426  6.360291  6.228221   7.42918  7.120322  6.108129  7.201477
15   8.85921  9.152096  9.125067    6.4458  8.600319   8.97577  9.691167
16  5.851665  5.621529  5.673689  6.331274  6.160159   5.65945  5.595156
17  9.905257  8.596643   9.11741  9.872789  8.909299  9.104171  9.158998
18  6.176691  6.429807  6.418132  6.849236  6.162308  6.432743  6.444664
19  7.599871  8.795133  8.382509  5.887119  7.941895      7.92  8.170374
20  9.458262   8.39701  8.402015    9.0859  8.995632  8.427601  8.265105
21  8.179803  9.868286 10.570601  4.905013  9.488779  9.148336  9.654022
22  7.456822  8.037138  7.953766  6.666418  7.674927  7.995109  7.635158
   GSM187160 GSM187161 GSM187162
13  7.269558  7.537711  7.099806
14   6.61534  7.125821  6.413295
15   8.64715  8.252031  9.445682
16  5.639816    5.9257  5.752994
17  8.856829  9.043991  8.839183
18    6.4307   6.71052    6.5269
19  7.674577  7.390617  8.638025
20  8.132649  8.755642  8.137992
21  9.897561  7.619129 10.242096
22  7.836658  7.297986  8.679438
> class(f)
[1] "data.frame"

#all the columns in the dataframe are of class 'factor'
> for(i in 1:ncol(f)){if(class(f[,i])!="factor"){print(class(f[,i]))}}
>
#but it won't convert to numeric
> g<-as.numeric(as.character(f))
Warning message:
NAs introduced by coercion
> g
 [1] NA NA NA NA NA NA NA NA NA NA
> g<-as.numeric(levels(f))[as.integer(f)]
Error: (list) object cannot be coerced to type 'integer'
>


R version 2.14.1 (2011-12-22)
Copyright (C) 2011 The R Foundation for Statistical Computing ISBN 3-900051-07-0
Platform: i386-redhat-linux-gnu (32-bit)



* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en 
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is 
door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the 
writer and may not be regarded as stating an official position of INBO, as long 
as the message is not confirmed by a duly signed document.



Re: [R] How do I convert factors to numeric? It's an FAQ but...

2012-04-13 Thread Milan Bouchet-Valat
On Friday 13 April 2012 at 13:08 +, John Coulthard wrote:
> Dear R list people
> 
> I loaded a file of numbers into R and got a dataframe of factors.  So I tried 
> to convert it to numeric as per the FAQ using as.numeric().  But I'm getting 
> errors (please see example), so what am I getting wrong?
> 
> Thanks for your time.
> John
> 
> Example...
> 
> #my data object
> > f
>    GSM187153 GSM187154 GSM187155 GSM187156 GSM187157 GSM187158 GSM187159
> 13  7.199346  7.394519  7.466155  8.035864  7.438536  7.308401  7.707994
> 14  6.910426  6.360291  6.228221   7.42918  7.120322  6.108129  7.201477
> 15   8.85921  9.152096  9.125067    6.4458  8.600319   8.97577  9.691167
> 16  5.851665  5.621529  5.673689  6.331274  6.160159   5.65945  5.595156
> 17  9.905257  8.596643   9.11741  9.872789  8.909299  9.104171  9.158998
> 18  6.176691  6.429807  6.418132  6.849236  6.162308  6.432743  6.444664
> 19  7.599871  8.795133  8.382509  5.887119  7.941895      7.92  8.170374
> 20  9.458262   8.39701  8.402015    9.0859  8.995632  8.427601  8.265105
> 21  8.179803  9.868286 10.570601  4.905013  9.488779  9.148336  9.654022
> 22  7.456822  8.037138  7.953766  6.666418  7.674927  7.995109  7.635158
>    GSM187160 GSM187161 GSM187162
> 13  7.269558  7.537711  7.099806
> 14   6.61534  7.125821  6.413295
> 15   8.64715  8.252031  9.445682
> 16  5.639816    5.9257  5.752994
> 17  8.856829  9.043991  8.839183
> 18    6.4307   6.71052    6.5269
> 19  7.674577  7.390617  8.638025
> 20  8.132649  8.755642  8.137992
> 21  9.897561  7.619129 10.242096
> 22  7.836658  7.297986  8.679438
> > class(f)
> [1] "data.frame"
> 
> #all the columns in the dataframe are of class 'factor'
> > for(i in 1:ncol(f)){if(class(f[,i])!="factor"){print(class(f[,i]))}}
> >
> #but it won't convert to numeric
> > g<-as.numeric(as.character(f))
> Warning message:
> NAs introduced by coercion 
> > g
>  [1] NA NA NA NA NA NA NA NA NA NA
> > g<-as.numeric(levels(f))[as.integer(f)]
> Error: (list) object cannot be coerced to type 'integer'
That's because you're trying to convert the whole data frame, which is a
list of vectors, instead of converting the vectors individually. You can
use:
g <- sapply(f, function(x) as.numeric(as.character(x)))

But it would probably be better to fix the import step so that you get
numeric vectors in the first place. ;-)
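
One way to fix the import step, sketched under the assumption that the factors come from a non-numeric placeholder somewhere in the file (the string "null", the column names and the inline text below are invented for illustration), is to declare that placeholder in na.strings so that read.table() returns numeric columns directly:

```r
# Hypothetical file contents; "null" stands in for whatever non-numeric
# entry made read.table() give up on a numeric column
txt <- "GSM_A GSM_B
7.199346 7.394519
6.910426 null
8.85921 9.152096"

# Without na.strings the second column is not numeric
# (stringsAsFactors = TRUE reproduces the pre-4.0 default behaviour)
bad <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
class(bad$GSM_B)

# Declaring the placeholder as missing gives numeric columns
good <- read.table(text = txt, header = TRUE, na.strings = c("NA", "null"))
sapply(good, class)
```

If every column is known to be numeric, passing colClasses = "numeric" instead makes read.table() stop with an error at the offending entry, which can help locate it.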


Cheers



Re: [R] R Large Dataset Problem

2012-04-13 Thread Milan Bouchet-Valat
On Friday 13 April 2012 at 05:44 -0700, efulas wrote:
> Thank you very much for your help, guys. Both messages helped me to read the data
> into R. However, R is omitting many columns from my data. Am I missing
> something?
Please read the posting guide. If you don't provide the code you ran and
the resulting objects and messages, we cannot possibly help you.


Regards



[R] getting the value from previous row

2012-04-13 Thread arunkumar1111
Hi

I have a dataset with a record A = 100, 200, 300, 400, ...

There will be a parameter n. Say n = 10; that means I have to add 10% of every
previous value to the current row:

current_Val  New_value
100          100
200          210 (200+10)
300          330 (300+20+10)
400          460 (400+30+20+10)

I'm using a loop, but it takes a long time. Please help.
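
The rule in the table above can be sketched without a loop, assuming New_value is the current value plus n% of the sum of all earlier values (which is what the worked rows suggest). The names A, n and new_value are illustrative only:

```r
A <- c(100, 200, 300, 400)   # current values
n <- 10                      # percentage of each earlier value to add

# cumsum(A) - A is the sum of all values strictly before each row
new_value <- A + (n / 100) * (cumsum(A) - A)
new_value
# [1] 100 210 330 460
```

cumsum() does the running total in a single vectorised pass, so no explicit loop over rows is needed.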



-
Thanks in Advance
Arun
--
View this message in context: 
http://r.789695.n4.nabble.com/getting-the-value-from-previous-row-tp4554761p4554761.html
Sent from the R help mailing list archive at Nabble.com.



[R] How do I convert factors to numeric? It's an FAQ but...

2012-04-13 Thread John Coulthard

Dear R list people

I loaded a file of numbers into R and got a dataframe of factors.  So I tried 
to convert it to numeric as per the FAQ using as.numeric().  But I'm getting 
errors (please see example), so what am I getting wrong?

Thanks for your time.
John

Example...

#my data object
> f
   GSM187153 GSM187154 GSM187155 GSM187156 GSM187157 GSM187158 GSM187159
13  7.199346  7.394519  7.466155  8.035864  7.438536  7.308401  7.707994
14  6.910426  6.360291  6.228221   7.42918  7.120322  6.108129  7.201477
15   8.85921  9.152096  9.125067    6.4458  8.600319   8.97577  9.691167
16  5.851665  5.621529  5.673689  6.331274  6.160159   5.65945  5.595156
17  9.905257  8.596643   9.11741  9.872789  8.909299  9.104171  9.158998
18  6.176691  6.429807  6.418132  6.849236  6.162308  6.432743  6.444664
19  7.599871  8.795133  8.382509  5.887119  7.941895      7.92  8.170374
20  9.458262   8.39701  8.402015    9.0859  8.995632  8.427601  8.265105
21  8.179803  9.868286 10.570601  4.905013  9.488779  9.148336  9.654022
22  7.456822  8.037138  7.953766  6.666418  7.674927  7.995109  7.635158
   GSM187160 GSM187161 GSM187162
13  7.269558  7.537711  7.099806
14   6.61534  7.125821  6.413295
15   8.64715  8.252031  9.445682
16  5.639816    5.9257  5.752994
17  8.856829  9.043991  8.839183
18    6.4307   6.71052    6.5269
19  7.674577  7.390617  8.638025
20  8.132649  8.755642  8.137992
21  9.897561  7.619129 10.242096
22  7.836658  7.297986  8.679438
> class(f)
[1] "data.frame"

#all the columns in the dataframe are of class 'factor'
> for(i in 1:ncol(f)){if(class(f[,i])!="factor"){print(class(f[,i]))}}
>
#but it won't convert to numeric
> g<-as.numeric(as.character(f))
Warning message:
NAs introduced by coercion 
> g
 [1] NA NA NA NA NA NA NA NA NA NA
> g<-as.numeric(levels(f))[as.integer(f)]
Error: (list) object cannot be coerced to type 'integer'
> 


R version 2.14.1 (2011-12-22)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-redhat-linux-gnu (32-bit)


  


[R] Boxcox transformation

2012-04-13 Thread ruchi
I have time series data in EViews and want to see how well the logistic and
Gompertz models fit my data.
I used NLS and then the Box-Cox transformation.

I need to plot my original data against the predicted/fitted
values on a single graph. I can do this when dealing only with NLS, but I am
not able to plot it
with Box-Cox. I read about "lines.boxcox.fit", but it is not working. Please
let me know if anyone knows how to do this.
The code is below.

library(nlstools)
library(nlme)
library(nlrwr)

Use <-c(12.6,
15.1,
18.3,
21.99,
26.15,
29.205,
33.6,
37.4,
41.07,
44.94,
50.86,
58.5,
69.2,
78.49,
91.01,
105.4,
120.47,
135.79,
153.99,
172.23,
192.7,
212.51,
233.68,
258.23,
297.26,
328.23,
370.59,
421.58,
478.68,
527.62,
578.49,
641.73,
698.37,
737.33,
761.2 )
time  <-
c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34)
df <- data.frame(Use, time )
df
plot(df$time , df$Use, col = "blue", xlab="time((Time t=0 at Year 2003
Q1 (march))",ylab= "Users")
win.graph()
para0.st <- c(K=1062,
a=3.998, b=0.129 )
 fitGG <- nls(
  Use~ K*exp(-a*exp(-b*time )), df, start= para0.st,trace=T,
control=nls.control(maxiter=200))
 summary(fitGG)
# Plotting original vs NLS fit data on the same graph
plot(df$time , df$Use, xlim=c(0,40), ylim=c(0,1000), col = "blue",
xlab="Time (Year 2003)",ylab= "Use")
x <- seq(0, 40, length=100)

y2 <- predict(fitGG,data.frame(time =x))
lines(x,y2, lty="dotted", col="red")
nlsResiduals(fitGG)
resGG<-nlsResiduals(fitGG)

plot(resGG,which=0)
# now using the Box-Cox transformation
bcfitGG2<- boxcox.nls(fitGG)
 bcSummary(bcfitGG2)
coef(summary(bcfitGG2))
summary(bcfitGG2)
resGG1<-nlsResiduals(bcfitGG2)
 plot(resGG1,which=0)

# trying to plot the original vs fitted Box-Cox values
plot(df$time , df$Use, xlim=c(0,40), ylim=c(0,1000), col = "blue",
xlab="Time (Year)",ylab= "Use")
x <- seq(0, 40, length=100)
y3 <- predict(bcfitGG2,data.frame(time =x))
lines(x,y3, lty="dotted", col="red")





--
View this message in context: 
http://r.789695.n4.nabble.com/Boxcox-transformation-tp4554769p4554769.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Help - Importing data from txt and xlsx files

2012-04-13 Thread ONKELINX, Thierry
Dear Tom,

R does not search your entire file system for the file. It only looks in the 
working directory. Have a look at ?setwd and ?getwd.

So you will need to set the working directory, use a relative path to the file, 
or use an absolute path to the file.
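
The options can be sketched as follows. The directory here is a temporary one and the small table written to it is made up, purely so that the example runs anywhere; in practice data_dir would be the folder that really holds Lv2.8.txt:

```r
# Where R currently looks for relative file names
getwd()

# Hypothetical data directory and file, created only for this sketch
data_dir <- tempdir()
path <- file.path(data_dir, "Lv2.8.txt")
write.table(data.frame(x = 1:3, y = c(2.5, 3.5, 4.5)), path)

# Option 1: give read.table() the absolute path
file.exists(path)
mydata <- read.table(path, header = TRUE)

# Option 2: change the working directory, then use the bare file name
old_wd <- setwd(data_dir)
mydata2 <- read.table("Lv2.8.txt", header = TRUE)
setwd(old_wd)   # restore the previous working directory
```

Checking file.exists() before reading is a quick way to tell a wrong path apart from a problem with the file contents.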

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey


-Original message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of 
AMFTom
Sent: Friday 13 April 2012 14:29
To: r-help@r-project.org
Subject: [R] Help - Importing data from txt and xlsx files

Hi all,

I have just started to use R for my PhD project and have no previous
experience in programming. I am having trouble importing data to R.   This
is the output:

> mydata <- read.table("Lv2.8.txt")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'Lv2.8.txt': No such file or directory

I know there is a file on my computer with this name though... What am I doing 
wrong?

Thanks in advance!

Tom

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-Importing-data-from-txt-and-xlsx-files-tp4554622p4554622.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R Large Dataset Problem

2012-04-13 Thread efulas
Thank you very much for your help, guys. Both messages helped me to read the data
into R. However, R is omitting many columns from my data. Am I missing
something?



Many Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/R-Large-Dataset-Problem-tp4554469p4554698.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Organizations where IT has approved the use of R software

2012-04-13 Thread lynnland
Hi All,

Thanks for all the responses and interest. 

I was aware of Revolution Analytics and had seen some of the other links but
some were new.  The list received should be of great help.  Given the R
courses that used to be offered through the USGS site, I would assume they
are using it but haven't received a response from them yet to the email I
sent inquiring.

The organization in question is one of the provincial Natural Resources
Ministries in Canada.


Cheers

Lynn

--
View this message in context: 
http://r.789695.n4.nabble.com/Organizations-where-IT-has-approved-the-use-of-R-software-tp4552609p4554590.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Remove superscripts from HTML objects

2012-04-13 Thread S Ellison
> h <- "CataDog"
> sub("","",h)

Probably safer to do  

gsub("","",h)

to avoid replacing multiple superscripts.

eg 
h2 <- 
"CataDogMouseaRaccoon"
sub("","",h2) #drops everything between first 
gsub("","",h2)#Drops each xxx
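
The archive has stripped the markup from the strings and patterns above, so the following is only a sketch of the sub() versus gsub() point under the assumption that the superscripts were wrapped in <sup>...</sup> tags; the example string and patterns are invented for illustration:

```r
# Hypothetical string with two superscripts
h2 <- "Cat<sup>a</sup>Dog<sup>b</sup>Mouse"

# A greedy pattern runs from the first <sup> to the last </sup>,
# so the text in between ("Dog") is lost as well
greedy <- sub("<sup>.*</sup>", "", h2)

# A non-greedy pattern with gsub() removes each superscript separately
lazy <- gsub("<sup>.*?</sup>", "", h2, perl = TRUE)

# Same idea without a lazy quantifier: forbid "<" inside the match
negated <- gsub("<sup>[^<]*</sup>", "", h2)

greedy    # "CatMouse"
lazy      # "CatDogMouse"
negated   # "CatDogMouse"
```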




Re: [R] Help with vectorization

2012-04-13 Thread Filoche
Thank you all.

Problem solved.

Regards,
Phil

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-vectorization-tp4552638p4554565.html
Sent from the R help mailing list archive at Nabble.com.



[R] Help - Importing data from txt and xlsx files

2012-04-13 Thread AMFTom
Hi all, 

I have just started to use R for my PhD project and have no previous
experience in programming. I am having trouble importing data to R.   This
is the output:

> mydata <- read.table("Lv2.8.txt")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'Lv2.8.txt': No such file or directory

I know there is a file on my computer with this name though... What am I
doing wrong?

Thanks in advance!

Tom

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-Importing-data-from-txt-and-xlsx-files-tp4554622p4554622.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R Large Dataset Problem

2012-04-13 Thread Rainer M Krug
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 13/04/12 14:20, Milan Bouchet-Valat wrote:
> On Friday 13 April 2012 at 04:32 -0700, efulas wrote:
>> Dear All,
>> 
>> I have a problem with my data. First problem is that my data is really large 
>> and R is
>> omitting some columns from my data. Is there any way to read the whole data 
>> without
>> omitting.
> How did you import it? Please be precise.
> 
>> Another problem is that my data have 102k columns and each column have 
>> active or inactive
>> molecules. The data is like below
>> 
>> Molecul id
>> 
>> 1298761010101110000001110000110.. 234532
>> 1010101110000001110000110.. 123678
>> 1010101110000001110000110.. . . . . (102k values)
>> 
>> 
>> When i read the data in R. R define my rows as a "Inf" because R read it as 
>> a one number. I
>> want them to be seperated like "1  0   1   0" . Is there anyway to do this 
>> in R?
> See ?read.fwf. If you still have problems loading your data, feel free to ask 
> again on specific
> issues.

You could also read them in so that the 1 and 0 are in one field and specify 
that this column is a
character, and then use strsplit() to split them up:

> x <- "1010101110000001110000110"
> strsplit(x, split="")
[[1]]
 [1] "1" "0" "1" "0" "1" "0" "1" "1" "1" "0" "0" "1" "1" "1" "1" "0" "0" "1" "1"
[20] "1" "1" "0" "0" "1" "1" "1" "0" "0" "1" "1" "1" "1" "0" "0" "1" "1" "0"
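
Extending this to several records, here is a minimal sketch that turns the split characters into an integer matrix with one row per molecule. It assumes the id and the 0/1 string are separated by a space and that every string has the same length; the three records are shortened, made-up examples:

```r
# Hypothetical records: id, a space, then a string of 0/1 flags
recs <- c("129876 10101011", "234532 01101001", "123678 11100010")

parts <- strsplit(recs, " ", fixed = TRUE)
ids   <- vapply(parts, `[`, character(1), 1)

# Split each flag string into single characters and convert to integers;
# vapply() returns one column per record, so transpose to one row per record
bits <- t(vapply(parts,
                 function(p) as.integer(strsplit(p[2], "")[[1]]),
                 integer(8)))
rownames(bits) <- ids
bits
```

Reading the flag column as character first is what prevents R from treating the long digit string as one huge number.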



Cheers,

Rainer

> 
> 
> Regards
> 


- -- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
UCT), Dipl. Phys.
(Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :   +33 - (0)9 53 10 27 44
Cell:   +33 - (0)6 85 62 59 98
Fax :   +33 - (0)9 58 10 27 44

Fax (D):+49 - (0)3 21 21 25 22 44

email:  rai...@krugs.de

Skype:  RMkrug
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk+IIHQACgkQoYgNqgF2egoC5gCfb86H8KCMryM3zhvWPm3ejeIr
qDcAni5hTezs9rfJGKq0c6fE8pnltpYS
=wd+P
-END PGP SIGNATURE-



Re: [R] R Large Dataset Problem

2012-04-13 Thread Alekseiy Beloshitskiy
I would perform data pre-processing before loading in R.


Best,
-Alex

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
efulas [ef_u...@hotmail.com]
Sent: 13 April 2012 14:32
To: r-help@r-project.org
Subject: [R] R Large Dataset Problem

Dear All,

 I have a problem with my data. First problem is that my data is really
large and R is omitting some columns from my data. Is there any way to read
the whole data without omitting. Another problem is that my data have 102k
columns and each column have active or inactive molecules. The data is like
below

Molecul id

1298761010101110000001110000110..
2345321010101110000001110000110..
1236781010101110000001110000110..
.
.
.
.
(102k values)


When i read the data in R. R define my rows as a "Inf" because R read it as
a one number. I want them to be seperated like "1  0   1   0" . Is there
anyway to do this in R?

Many Thanks,


Efe

--
View this message in context: 
http://r.789695.n4.nabble.com/R-Large-Dataset-Problem-tp4554469p4554469.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] scatterplot3d(); customise axes

2012-04-13 Thread Uwe Ligges



On 13.04.2012 09:52, Tonja Krueger wrote:


Hi all!
I’m using scatterplot3d() to show the distribution of data for different
locations. As I would like to show distances between the locations and also
label the locations, I was wondering whether there is a function similar to
axis() for a 2D plot that works with scatterplot3d()?


Not really, but you can try:

 library(scatterplot3d)
 scatterplot3d(data[,1], data[,2], data[,3],
   xlab="Location", zlab="var 2", ylab="var 1",
   lab = c(10, 5, 2),
   x.ticklabs = c(1, rep("", 5), 7, "", "", 10))


Uwe Ligges


2D:
a<- runif(50)
a2<- qnorm(a)
b<- runif(50)
b2<- qnorm(b)
c<- runif(50)
c2<- qnorm(c)
data<-
rbind(cbind(rep(1,50),a,a2),cbind(rep(7,50),b,b2),cbind(rep(10,50),c,c2))
plot(data[,1],data[,2],xaxt="n",xlab="Location",ylab="var 1")
axis(1, at= c(1,7,10),labels = c("Loc 1","Loc 2", "Loc 3"))
3D:
library(scatterplot3d)
scatterplot3d(data[,1], data[,2], data[,3],box=T, col.axis="black", angle=
45,grid=T, xlab="Location",zlab="var 2",ylab="var 1")
Thank you for your suggestions,
Tonja




Re: [R] R Large Dataset Problem

2012-04-13 Thread Milan Bouchet-Valat
On Friday 13 April 2012 at 04:32 -0700, efulas wrote:
> Dear All,
> 
>  I have a problem with my data. First problem is that my data is really
> large and R is omitting some columns from my data. Is there any way to read
> the whole data without omitting.
How did you import it? Please be precise.

> Another problem is that my data have 102k
> columns and each column have active or inactive molecules. The data is like
> below
> 
> Molecul id 
> 
> 1298761010101110000001110000110..
> 2345321010101110000001110000110..
> 1236781010101110000001110000110..
> .
> .
> .
> .
> (102k values)
> 
> 
> When i read the data in R. R define my rows as a "Inf" because R read it as
> a one number. I want them to be seperated like "1  0   1   0" . Is there
> anyway to do this in R?
See ?read.fwf. If you still have problems loading your data, feel free
to ask again on specific issues.
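
A minimal read.fwf() sketch, assuming each line is a 6-character id followed directly by single-digit flags; the three lines below are shortened, made-up examples, and the real widths would have to match the actual file:

```r
# Hypothetical lines: a 6-character id followed directly by 8 one-digit flags
txt <- c("12987610101011", "23453201101001", "12367811100010")
tmp <- tempfile(fileext = ".txt")
writeLines(txt, tmp)

# One 6-wide field for the id, then one 1-wide field per flag
d <- read.fwf(tmp, widths = c(6, rep(1, 8)),
              colClasses = c("character", rep("integer", 8)))
unlink(tmp)

dim(d)   # 3 rows, 9 columns (id plus 8 flags)
```

Giving colClasses keeps the id as text and each flag as its own integer column, so nothing is read as one long number.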


Regards



[R] Kaplan Meier analysis: 95% CI wider in R than in SAS

2012-04-13 Thread Paul Miller
Hello All,
 
Am replicating in R an analysis I did earlier using SAS. See this as a test of 
whether I'm ready to start using R in my day-to-day work.
 
Just finished replicating a Kaplan Meier analysis. Everything seems to work out 
fine except for one thing. The 95% CI around my estimate for the median is 
substantially larger in R than in SAS. For example, in SAS I have a median of 
3.29 with a 95% CI of [1.15, 5.29]. In R, I get a median of 3.29 with a 95% CI 
of [1.35, 13.35].
 
Can anyone tell me why I get this difference?
 
My R code looks like:
 
survfrm <- Surv(progression_months_landmark_14,progression==1) ~ 
pr_rg_landmark_14 
survobj <- survfit(survfrm, data=Survival)
survlrk <- survdiff(survfrm, data=Survival)
summary(survobj)
print(survobj)
print(survlrk)
 
My SAS code looks like:
 
proc lifetest data=survival;
strata pr_rg_landmark_14;
time progression_months_landmark_14 * progression(0);
run;

Thought maybe the difference could have something to do with the strata 
statement in the SAS code not being translated properly into R. Tried changing 
my R code to make pr_rg_landmark_14 a strata but this didn't seem to change 
anything. Except that I no longer got a log rank test. 

Thanks,

Paul
 
 
 
 

 
 
 



Re: [R] Plotting leapfrog in R

2012-04-13 Thread Jim Lemon

On 04/13/2012 12:25 PM, Colstat wrote:

Dear List

Is there a package for leapfrog plotting (Hamiltonian Monte Carlo
estimation) in R?  I tried the actual "LEAPFrOG" package which doesn't
actually give the plot like this one?
http://xianblog.files.wordpress.com/2010/09/hamilton.jpg

How does one plot this in R?  There is a semi-circle, with dots on that
semi-circle.
I don't think curve() or plot() would produce such a plot.  Thanks in advance!


Hi Colstat,
If the discontinuities in the function can be characterized, perhaps by 
the apparently equal q values of two points, you can specify the line 
type as dotted for those lines and solid for all of the rest. Given 
that, it just becomes a point/line plot.


Jim
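
Following the point/line suggestion above, here is a minimal sketch in base R. It assumes a toy Hamiltonian H(q, p) = q^2/2 + p^2/2 (a standard normal target), whose level sets are circles; the function name leapfrog, the step size and the number of steps are illustrative choices and are not taken from the linked figure.

```r
# Leapfrog integrator for H(q, p) = U(q) + p^2/2.
# grad_U defaults to the gradient of the assumed toy potential U(q) = q^2/2.
leapfrog <- function(q, p, eps, n_steps, grad_U = function(q) q) {
  path <- matrix(NA_real_, nrow = n_steps + 1, ncol = 2,
                 dimnames = list(NULL, c("q", "p")))
  path[1, ] <- c(q, p)
  for (i in seq_len(n_steps)) {
    p <- p - eps / 2 * grad_U(q)  # half step for the momentum
    q <- q + eps * p              # full step for the position
    p <- p - eps / 2 * grad_U(q)  # second half step for the momentum
    path[i + 1, ] <- c(q, p)
  }
  path
}

path <- leapfrog(q = 0, p = 1, eps = 0.3, n_steps = 10)

# Exact level set of H through the starting point: a circle of radius 1
theta <- seq(0, 2 * pi, length.out = 200)
plot(cos(theta), sin(theta), type = "l", lty = 2, asp = 1,
     xlab = "q", ylab = "p")

# Leapfrog positions as dots joined by solid line segments
lines(path[, "q"], path[, "p"], type = "b", pch = 19)
```

The dashed curve is the exact contour and the dots are the discrete leapfrog positions, so plot() plus lines() is enough once the path is stored as a matrix.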



[R] R Large Dataset Problem

2012-04-13 Thread efulas
Dear All,

I have two problems with my data. The first is that the data set is very
large and R omits some columns when reading it. Is there any way to read
the whole data set without omitting anything? The second is that my data has 102k
columns, and each column records whether molecules are active or inactive. The
data looks like this:

Molecule id 

1298761010101110000001110000110..
2345321010101110000001110000110..
1236781010101110000001110000110..
.
.
.
.
(102k values)


When I read the data into R, R reports my rows as "Inf" because it reads each
row as a single number. I want the values to be separated, like "1  0   1   0". Is
there any way to do this in R?

Many Thanks,


Efe 
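
A minimal sketch of one way to get separate 0/1 columns. It assumes each line holds an id followed by a string of 0/1 digits and that the two can be told apart (here, hypothetically, by a space; the sample above shows them run together, so the real delimiter or id width would have to be checked). Reading the field as character avoids the conversion to a single number, which overflows to Inf for very long digit strings.

```r
# Hypothetical two-field layout: id, then a string of 0/1 digits
txt <- c("129876 1010101110000001110000110",
         "234532 1010101110000001110000110")

# colClasses = "character" stops R from parsing the digits as one number
dat <- read.table(text = txt, colClasses = "character",
                  col.names = c("id", "bits"))

# Split every string into single characters and stack the rows
bits <- do.call(rbind, strsplit(dat$bits, ""))
storage.mode(bits) <- "integer"   # "1"/"0" become 1L/0L
rownames(bits) <- dat$id

dim(bits)       # one row per molecule, one column per digit
bits[, 1:4]
```

For a real file, text = txt would be replaced by the file name; with 102k digits per row the resulting integer matrix can be large, so memory use is worth checking on a few rows first.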

--
View this message in context: 
http://r.789695.n4.nabble.com/R-Large-Dataset-Problem-tp4554469p4554469.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Help with stemDocument

2012-04-13 Thread Alekseiy Beloshitskiy
Check this
slideshare.net/whitish/textmining-with-r

Best,
-Alex

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
Deborah H. Deng [deborah.d...@alumni.utexas.net]
Sent: 13 April 2012 10:27
To: r-help@r-project.org
Subject: [R] Help with stemDocument

Hi, All:

I am new to R and tm package. I'm trying to do the stemming using tm_map()
and it doesn't seem to work:

*I used:*
> stemDocument(t_cmts[[100]])

*Where t_cmts is the corpus object, the result is:*
 bottle loose box abt airpak sections top plastic bottle squashed nearly
flush neck previous shipments bottle wrapped securely bubble wrap wno
bottle damage packaging poor surprisingly bottle leaking remove contents
bottle reusable packaging cancel automatic shipments
>

This doesn't seem to have had any stemming done at all. *What did I do wrong*?

I have RWeka, tm, rJava, and Snowball installed (I used "install package" from
the top menu and it didn't say it failed).

Thanks,
Deborah



[R] Week number of a date in a month

2012-04-13 Thread arunkumar1111
Hi

I have a requirement that a week should end on Sunday; Monday
is the week start date.

If a month starts on a Sunday, then the 2nd day of the month, the Monday, will
be in week 2.

I need help creating the week number.

-
Thanks in Advance
Arun
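
A minimal base R sketch of the rule described above, assuming week 1 is the (possibly partial) Monday-to-Sunday week that contains the 1st of the month. The function name week_of_month is made up for illustration.

```r
# Week of the month for Monday-to-Sunday weeks.
# Week 1 is the week containing the 1st, even if it is only partly in the month.
week_of_month <- function(d) {
  d <- as.Date(d)
  first <- as.Date(format(d, "%Y-%m-01"))
  # POSIXlt wday is 0 for Sunday ... 6 for Saturday;
  # shift it so that Monday = 0 ... Sunday = 6
  offset <- (as.POSIXlt(first)$wday + 6L) %% 7L
  day <- as.integer(format(d, "%d"))
  (day - 1L + offset) %/% 7L + 1L
}

week_of_month(c("2012-04-01", "2012-04-02", "2012-04-09"))
```

April 2012 starts on a Sunday, so 2012-04-01 falls in week 1 and Monday 2012-04-02 already falls in week 2, as in the example in the question.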
--
View this message in context: 
http://r.789695.n4.nabble.com/Week-number-of-a-date-in-a-month-tp4554282p4554282.html
Sent from the R help mailing list archive at Nabble.com.


