Re: [R] Conditional statement

2009-11-16 Thread jim holtman
Generate the numbers, count the negatives, and then set the negatives to zero:

> set.seed(1)
> x <- rnorm(100,5,3)
> sum(x<0)
[1] 3
> x[x<0] <- 0
> sum(x<0)
[1] 0
>
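
Applied to the posted function, the same clamping could be done on each
draw before it enters the recursion. A minimal sketch, assuming pmax() is
an acceptable way to truncate at zero; 'draw_nonneg' is a hypothetical
helper name, not part of the original function:

```r
# Hypothetical helper: draw from a normal and replace negative draws with 0.
draw_nonneg <- function(n, mean, sd) pmax(rnorm(n, mean, sd), 0)

set.seed(1)
x <- draw_nonneg(100, 5, 3)
sum(x < 0)  # counts the negative values left after clamping
```

Inside the loop, rnorm(1, Fmean, Fsd) would then become
draw_nonneg(1, Fmean, Fsd), and likewise for the Smean/Ssd draw.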


On Mon, Nov 16, 2009 at 7:43 AM, Rafael Moral
 wrote:
> Dear useRs,
>
> I wrote a function that simulates a stochastic model in discrete time.
> The problem is that the stochastic parameters should not be negative and 
> sometimes they happen to be.
> How can I conditionate it to when it draws a negative number, it transforms 
> into zero in that time step?
>
> Here is the function:
>
> stochastic_prost <- function(Fmean, Fsd, Smean, Ssd, f, s, n, time, 
> out=FALSE, plot=TRUE) {
> nt <- rep(0, time)
> nt[1] <- n
> for(n in 2:time) {
> nt[n] <- 0.5*rnorm(1, Fmean, Fsd)*rnorm(1, Smean, 
> Ssd)*exp(1)^(-(f+s)*nt[n-1])*nt[n-1]}
> if(out==TRUE) {print(data.frame(nt))}
> if(plot==TRUE) {plot(1:time, nt, type='l', main='Simulation', 
> ylab='Population', xlab='Generations')}
> }
>
> The 2 rnorm()'s should not be negative; when negative they should turn into 
> zero.
>
> Thanks in advance,
> Rafael
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] extracting values from correlation matrix

2009-11-16 Thread jim holtman
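Assuming that your data is in a data frame 'cordata', the following
should work:

cordata$cor2_value <- sapply(seq_len(nrow(cordata)), function(.row){
    # as.character() guards against factor columns being used as integer codes
    cor2[as.character(cordata$rowname[.row]), as.character(cordata$colname[.row])]
})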
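
For a 4000x4000 matrix, a vectorised lookup may be preferable to calling
sapply() row by row. A sketch, assuming 'cor2' has dimnames and that the
small 'cor2' and 'cordata' below are made-up stand-ins for the real
objects; R accepts a two-column character matrix as an index into a
matrix with dimnames:

```r
# Made-up 3x3 stand-in for the real correlation matrix.
cor2 <- matrix(c(1, 0.7, 0.3,
                 0.7, 1, 0.5,
                 0.3, 0.5, 1), nrow = 3,
               dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Made-up stand-in for the pairs extracted from cor1.
cordata <- data.frame(rowname = c("a", "b", "c"),
                      colname = c("b", "a", "a"),
                      cor1_value = c(0.8, 0.8, 0.62),
                      stringsAsFactors = FALSE)

# One (row name, column name) pair per row of the index matrix.
idx <- cbind(as.character(cordata$rowname), as.character(cordata$colname))
cordata$cor2_value <- cor2[idx]
cordata
```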

On Mon, Nov 16, 2009 at 11:44 AM, Lee William  wrote:
> Hi! All,
>
> I have 2 correlation matrices of 4000x4000 both with same row names and
> column names say cor1 and cor2. I have extracted some information from 1st
> matrix cor1 which is something like this:
>
> rowname  colname  cor1_value
>  a              b            0.8
>  b              a            0.8
>  c              f             0.62
>  d              k            0.59
>  -              -              --
>  -              -              --
>
> Now I wish to extract values from matrix cor2 for the same rowname and
> colname as above so that it looks similar to something like this with values
> in cor2_value:
>
> rowname  colname  cor1_value  cor2_value
>  a              b            0.8             ---
>  b              a            0.8             ---
>  c              f             0.62           ---
>  d              k            0.59           ---
>  -              -              --              ---
>  -              -              --              ---
>
> I am running out of ideas. So I decided to post this on mailing list. Please
> Help!
>
> Best
> Lee
>


Re: [R] replacing the dates format in R for exporting the data set...

2009-11-17 Thread jim holtman
First of all, the unquoted expression 2009-08-06 is arithmetic and
evaluates to 1995 (2009 - 8 - 6); this is probably not what you were
expecting.  What do you want your expression to do?  Is 'toms_dat' a
data frame?  If so, your expression 'toms_dat ==2009-08-06' seems
strange.  So tell us what you want to do, not how you want to do it.
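
To illustrate the point, here is a small sketch. It assumes the goal is
to recode two particular dates as 1 and 2, which is only a guess at the
intent; 'd' and 'code' are made-up names:

```r
# Unquoted, this is plain subtraction, not a date.
val <- 2009-08-06
val

# Made-up date column; comparisons are done against Date objects.
d <- as.Date(c("2009-08-04", "2009-08-06", "2009-08-05"))
code <- rep(NA_integer_, length(d))
code[d == as.Date("2009-08-04")] <- 1L
code[d == as.Date("2009-08-06")] <- 2L
code

# Convert to text before exporting so the dates are not written as numbers.
format(d, "%Y-%m-%d")
```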

On Tue, Nov 17, 2009 at 4:54 PM, ychu066  wrote:
>
> hi everyone, i am having difficulties with replacing the dates format in R
> for exporting the data set...
>
> eg: the code that i used was
> toms_dat<- replace(toms_dat, toms_dat ==2009-08-06, 2)
> toms_dat<- replace(toms_dat, toms_dat ==2009-08-04, 1)
>
> but when i export the data as into txt file or excel file the dates come up
> with very large numbers .:drunk:
>
> please help me ...=)
> --
> View this message in context: 
> http://old.nabble.com/replacing-the-dates-format-in-R-for-exporting-the-data-set...-tp26396492p26396492.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Plotting a dataframe with date format

2009-11-17 Thread jim holtman
It plots fine for me.  I see 2007 and 2008 on the x-axis.

On Tue, Nov 17, 2009 at 4:56 PM, separent  wrote:
>
> I tried to plot the attached dataframe with the following command.
>
> plot(inclino.06.1.r00.time.select.transpose[,1],inclino.06.1.r00.time.select.transpose[,2])
>
> The first column is in date format, second is numeric. The plot does not
> correspond to my values. Why?
>
> Regards,
>
> Serge-Étienne Parent
> Golder Associés
> Montréal
>
> http://old.nabble.com/file/p26396493/inclino.06.1.r00.time.select.transpose.rda
> inclino.06.1.r00.time.select.transpose.rda
> --
> View this message in context: 
> http://old.nabble.com/Plotting-a-dataframe-with-date-format-tp26396493p26396493.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] replacing the dates format in R for exporting the data set...

2009-11-18 Thread jim holtman
?write.table

If you read the help file, and do a little experimenting, you will see
that there is a parameter 'row.names=FALSE' that may answer your
question.

Also since you did not have column names on your input, you get V1,
V2,...  You can put your own column names.  It helps again to read the
help file on 'read.table' and look at the parameter 'col.names'.
There is also the colnames function.  It also might help to (re)read
the Intro to R.
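
A short sketch of both suggestions together. The data frame and the new
column names are invented for illustration:

```r
# Made-up data with the default V1, V2 names that read.table() assigns.
dat <- data.frame(V1 = 1:3, V2 = c("a", "b", "c"))

# Rename the columns.
colnames(dat) <- c("id", "label")

# Write without the row-name column.
out <- tempfile(fileext = ".txt")
write.table(dat, file = out, row.names = FALSE, quote = FALSE, sep = "\t")
readLines(out)
```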

On Tue, Nov 17, 2009 at 8:27 PM, ychu066  wrote:
>
> Moreover,  I want to rename the column name V1,V2,V3,V4.V146.  how do i
> write the code in R ???
>
> thanks everyone that look at the thread/
>
>
>
> ychu066 wrote:
>>
>> hi everyone, i am having difficulties with replacing the dates format in R
>> for exporting the data set...
>>
>> eg: the code that i used was
>> toms_dat<- replace(toms_dat, toms_dat ==2009-08-06, 2)
>> toms_dat<- replace(toms_dat, toms_dat ==2009-08-04, 1)
>>
>> but when i export the data as into txt file or excel file the dates come
>> up with very large numbers .:drunk:
>>
>> please help me ...=)
>>
> http://old.nabble.com/file/p26400792/what.csv what.csv
> --
> View this message in context: 
> http://old.nabble.com/replacing-the-dates-format-in-R-for-exporting-the-data-set...-tp26396492p26400792.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] error message; ylim + log="y"

2009-11-18 Thread jim holtman
like this?

> plot(c(),c(), xlim=c(1,10), ylim=c(0,1), log="y")
Error in axis(side = side, at = at, labels = labels, ...) :
  CreateAtVector [log-axis()]: axp[0] = 0 < 0!
In addition: Warning messages:
1: In is.na(y) : is.na() applied to non-(list or vector) of type 'NULL'
2: In plot.window(...) :
  nonfinite axis limits [GScale(-inf,4,2, .); log=1]
3: In axis(side = side, at = at, labels = labels, ...) :
  CreateAtVector "log"(from axis()): axp[0] = 0 !


You have no data to plot.  What were you expecting it to do?  When you
say "lot of error messages", please include them and also follow the
posting guide.
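
Note also that the warnings point at the axis limits: a lower limit of 0
has no finite position on a log axis. A sketch of an empty plot that does
set up a log y-axis, assuming a positive lower limit such as 0.01 is
acceptable (that value is an arbitrary choice):

```r
pdf(NULL)  # throwaway graphics device, no file is written

# type = "n" sets up the axes without drawing any points.
plot(1, 1, type = "n", xlim = c(1, 10), ylim = c(0.01, 1), log = "y")
ylog <- par("ylog")  # TRUE when the y-axis is logarithmic

dev.off()
```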

On Wed, Nov 18, 2009 at 4:52 PM, Martin Batholdy
 wrote:
> Hi,
>
>
> I get a lot of error messages with this command, but I don't understand why;
>
> plot(c(),c(), xlim=c(1,10), ylim=c(0,1), log="y")
>
>
> thanks for any help!


Re: [R] Is there an variant of apply() that does not return anything?

2009-11-19 Thread jim holtman
invisible(apply(...))
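
For a list, the same idea works with lapply(); a plain for loop is another
way to run a function purely for its side effect. A sketch, with cat()
standing in for the plotting call and 'show_total' as a made-up name:

```r
alist <- list(a = 1:3, b = 4:6)

# Stand-in for a function called only for its side effect.
show_total <- function(x) cat(sum(x), "\n")

# The returned list is not printed.
invisible(lapply(alist, show_total))

# Equivalent loop; nothing is collected at all.
for (x in alist) show_total(x)
```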

On Thu, Nov 19, 2009 at 5:21 PM, Peng Yu  wrote:
> There are a few version of apply() (e.g., lapply(), sapply()). I'm
> wondering if there is one that does not return anything but just
> silently apply a function to the list argument.
>
> For example, the plot function is applied to each element in 'alist'.
> It is redundant to return anything from apply.
>
> apply(alist,function(x){ plot each element of alist})
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove leading and trailing white spaces

2009-11-20 Thread jim holtman
try this:

> x <- ' middle of the string  '
> sub("^[[:space:]]*(.*?)[[:space:]]*$", "\\1", x, perl=TRUE)
[1] "middle of the string"
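
On the notation asked about: '^' anchors the match at the start of the
string, '$' anchors it at the end, and '+' means "one or more of the
preceding item", so ' +$' is "one or more spaces at the end". An
alternative sketch that strips both ends with a single gsub() call:

```r
x <- "    Now is the time      "

# Either leading whitespace (^...) or trailing whitespace (...$) is removed.
trimmed <- gsub("^[[:space:]]+|[[:space:]]+$", "", x)
trimmed
```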



On Fri, Nov 20, 2009 at 10:51 AM, Bos, Roger  wrote:
> I have a character string and I would like to remove the leading and
> tailing white spaces.  The example for 'sub' shows how to remove the
> trailing white spaces, but I still can't figure out how to remove both
> trailing and leading white spaces because I can't find any documentation
> for what "+$" means or what "\\s+$" means.  Maybe its because I don't
> have a Unix background.  Thanks in advance for any help with this.
>
> str <- '    Now is the time      '
> sub(' +$', '', str)  ## spaces only
> sub('[[:space:]]+$', '', str) ## white space, POSIX-style
> sub('\\s+$', '', str, perl = TRUE) ## Perl-style white space
>
> Thanks,
>
> Roger


Re: [R] read a file into a matrix

2009-11-20 Thread jim holtman
What is wrong with the "extra step"?  Is it taking too much time (you
did not specify that), is it taking too much memory?  How many times
are you going to be doing it?  If not many, then maybe it is OK.  You
have to quantify what you are asking for.  It may take longer to send
a message to R-Help and get a response than to just read the file in
and process it.
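
If the file is purely numeric with no header, scan() plus matrix() does
avoid the intermediate data frame. A sketch under those assumptions; the
temporary file and its three columns are invented for the example:

```r
# Made-up tab-delimited numeric file.
f <- tempfile()
writeLines(c("1\t2\t3", "4\t5\t6"), f)

# scan() returns one long vector; byrow = TRUE restores the file layout.
m <- matrix(scan(f, sep = "\t", quiet = TRUE), ncol = 3, byrow = TRUE)
m
```

The number of columns has to be known (or counted first), which is the
price for skipping read.delim().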

On Fri, Nov 20, 2009 at 1:01 PM, Peng Yu  wrote:
> On Sat, Nov 21, 2009 at 11:55 AM, Steve Lianoglou
>  wrote:
>>> read.delim gives me a data.frame. Is there a function that can return
>>> the result in a matrix rather than data.frame?
>>
>> m <- as.matrix(read.delim(..))
>
> I knew this approach. But this takes an extra step. Is there a command
> that read a file directly into a matrix?
>
> Regards,
> Peng
>


Re: [R] How to concatenate a vector of strings to a string?

2009-11-20 Thread jim holtman
?paste

Read the help file, esp the collapse parameter.  Might help to reread
Intro to R.
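
For reference, 'sep' goes between the arguments of paste(), while
'collapse' joins the elements of the result into one string:

```r
# sep has nothing to separate here, so the vector comes back unchanged.
paste(c("a", "b"), sep = "")

# collapse joins the elements into a single string.
joined <- paste(c("a", "b"), collapse = "")
joined
```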

On Fri, Nov 20, 2009 at 8:03 PM, Peng Yu  wrote:
>> paste(c('a','b'),sep='')
> [1] "a" "b"
>
> The above command doesn't concatenate the strings in a single string.
> I'm wondering what is the correct way to do so.
>


Re: [R] Help with indexing

2009-11-21 Thread jim holtman
try this:

> # create a factor and then convert back to numeric
> x$nb <- as.integer(factor(x$name, levels=unique(x$name))) + 99
> x
  name freq  nb
1 Mary    1 100
2 Mary    2 100
3 Mary    3 100
4  Sam    1 101
5  Sam    2 101
6 John    1 102
7 John    2 102
8 John    3 102
9 John    4 102
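
An alternative sketch that gives the same numbering without going through
a factor, using match() against the names in order of first appearance:

```r
dat <- data.frame(name = rep(c("Mary", "Sam", "John"), c(3, 2, 4)))

# Position of each name among the unique names, shifted to start at 100.
dat$nb <- match(dat$name, unique(dat$name)) + 99L
dat
```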


On Sat, Nov 21, 2009 at 7:00 PM, Dana Sevak  wrote:
> Dear R Helpers,
>
> I am missing something very elementary here, and I don't seem to get it from 
> the help pages of the ave, seq and seq_along functions, so I wonder if you 
> could offer a quick help.
>
> To use an example from an earlier post on this list, I have a dataframe of 
> this kind:
>
> dat = data.frame(name = rep(c("Mary", "Sam", "John"), c(3,2,4)))
> dat$freq = ave(seq_along(dat$name), dat$name, FUN = seq_along)
>
> dat
>  name freq
> 1 Mary    1
> 2 Mary    2
> 3 Mary    3
> 4  Sam    1
> 5  Sam    2
> 6 John    1
> 7 John    2
> 8 John    3
> 9 John    4
>
> What I need is another column assigning a number to each name starting from 
> index 100, that is:
>
>  name freq  nb
> 1 Mary    1 100
> 2 Mary    2 100
> 3 Mary    3 100
> 4  Sam    1 101
> 5  Sam    2 101
> 6 John    1 102
> 7 John    2 102
> 8 John    3 102
> 9 John    4 102
>
> What is the easiest way to do this?
>
> Thanks a lot for your kind help.
>
> Dana
>


Re: [R] how to read BRFSS file

2009-11-21 Thread jim holtman
Exactly what have you tried and what did not work?  I downloaded the
'.asc' (text) version of the data and it appears to be fixed format
with 1294 characters per line; there are about 414K lines of data in
the file.  How much of the data do you need to extract?  You can read
in a portion of the file at a time and then extract just the fields
that you need for processing.  If it is not too many fields, this
should be a reasonable sized object.
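
A sketch of reading selected fields from a fixed-format file with
read.fwf(). The tiny file, the widths, and the column names below are
invented; the real positions have to come from the BRFSS codebook:

```r
# Made-up fixed-width file: 2-char state, 1-char group, 3-char score.
f <- tempfile()
writeLines(c("01A123", "02B456"), f)

# A negative width skips that many characters, so only two fields are kept.
dat <- read.fwf(f, widths = c(2, -1, 3),
                col.names = c("state", "score"),
                colClasses = c("integer", "integer"))
dat
```

read.fwf() also takes 'n' and 'skip' arguments, which is one way to work
through a large file a portion at a time.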

On Sat, Nov 21, 2009 at 7:58 PM, chloe yoon  wrote:
> hello,
> I am trying to do exploratory factor analysis with BRFSS dataset (
> http://www.cdc.gov/brfss/technical_infodata/surveydata/2008.htm) for a
> couple of days, but I was not able to do that and got frustrated. Can
> anybody help me with step by step guide? BRFSS dataset provides ASCII or SAS
> format.
> Thank you.
>
> chloe
>


Re: [R] Removing "+" and "?" signs

2009-11-22 Thread jim holtman
'?' is a metacharacter in a regular expression.  You have to escape it:

> x <- "asdf+,jkl?"
>
> gsub("?", " ", x)
Error in gsub("?", " ", x) : invalid regular expression '?'
In addition: Warning message:
In gsub("?", " ", x) :
  regcomp error:  'Invalid preceding regular expression'
> # escape it
> gsub("\\?", " ", x)
[1] "asdf+,jkl "
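
'+' is a metacharacter as well. Two ways to handle both signs, sketched
here: a bracket expression, inside which '+' and '?' are literal, or
fixed = TRUE, which switches off regular expression matching:

```r
x <- "asdf+,jkl?"

# Inside [...] both characters lose their special meaning.
both <- gsub("[+?]", " ", x)
both

# fixed = TRUE treats the pattern as a literal string.
only_q <- gsub("?", " ", x, fixed = TRUE)
only_q
```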


On Sun, Nov 22, 2009 at 6:01 PM, Steven Kang  wrote:
> Hi all,
>
>
> I get an error message when trying to replace *+* or *?* signs (with empty
> space) from a string.
>
> x <- "asdf+,jkl?"
>
> gsub("?", " ", x)
>
>
> Error message:
>
> Error in
> gsub("?", " ", x) :
>  invalid regular expression '?'
> In addition: Warning message:
> In gsub("?", " ", x) :
>  regcomp error:  'Invalid preceding regular expression'
>
> Your expertise in resolving this issue would be appreciated.
>
> Thanks.
>
>
>
> Steven
>


Re: [R] Check if string has all alphabets or numbers

2009-11-23 Thread jim holtman
try this:

> mywords<- c("harry","met","sally","subway10","1800Movies","12345")
> grep("^[[:alpha:]]*$", mywords)  # letters
[1] 1 2 3
> grep("^[[:digit:]]*$", mywords)  # numbers
[1] 6
>

On Mon, Nov 23, 2009 at 8:28 AM, Harsh  wrote:
> Hi R users,
> I'd like to know if anyone has come across problems wherein it was necessary
> to check if strings contained all alphabets, some numbers or all numbers?
>
> In my attempt to test if a string is numeric, alpha-numeric (also includes
> if string is only alphabets) :
>
> # Reproducible R code below
> mywords<- c("harry","met","sally","subway10","1800Movies","12345")
>
> mywords.alphanum
> <-lapply(sapply(mywords,function(x)strsplit(x,NULL)),function(y)
> ifelse(sum(is.na(sapply(y,as.numeric))) == 0 & length(y) >
> 0,"numeric","alpha-numeric"))
>
> names(mywords.alphanum)[(which(mywords.alphanum == "numeric"))]
>
>
> I understand that such "one-liners"  (the second line of code above) that
> make multiple calls are discouraged, but I seem to find then fascinating.
>
> Looking forward to alternate solutions/packages  for the above problem.
>
> Thanks
> Harsh Singhal
>


Re: [R] Check if string has all alphabets or numbers

2009-11-23 Thread jim holtman
Added a little more:

> mywords<- c("harry","met","sally","subway10","1800Movies","12345",
+ "not correct 123")
> all.letters <- grep("^[[:alpha:]]*$", mywords)
> all.numbers <- grep("^[[:digit:]]*$", mywords)  # numbers
> mixed <- grep("^[[:digit:][:alpha:]]*$", mywords)
> all.letters
[1] 1 2 3
> all.numbers
[1] 6
> # mixed
> setdiff(mixed, c(all.numbers, all.letters))
[1] 4 5
> # not any of the above
> setdiff(seq(length(mywords)), c(mixed, all.numbers, all.letters))
[1] 7
>


On Mon, Nov 23, 2009 at 8:28 AM, Harsh  wrote:
> Hi R users,
> I'd like to know if anyone has come across problems wherein it was necessary
> to check if strings contained all alphabets, some numbers or all numbers?
>
> In my attempt to test if a string is numeric, alpha-numeric (also includes
> if string is only alphabets) :
>
> # Reproducible R code below
> mywords<- c("harry","met","sally","subway10","1800Movies","12345")
>
> mywords.alphanum
> <-lapply(sapply(mywords,function(x)strsplit(x,NULL)),function(y)
> ifelse(sum(is.na(sapply(y,as.numeric))) == 0 & length(y) >
> 0,"numeric","alpha-numeric"))
>
> names(mywords.alphanum)[(which(mywords.alphanum == "numeric"))]
>
>
> I understand that such "one-liners"  (the second line of code above) that
> make multiple calls are discouraged, but I seem to find then fascinating.
>
> Looking forward to alternate solutions/packages  for the above problem.
>
> Thanks
> Harsh Singhal
>


Re: [R] Check if string has all alphabets or numbers

2009-11-23 Thread jim holtman
Here is the way you can use grepl to get the various combinations:

> mywords<- c("harry","met","sally","subway10","1800Movies","12345",
+ "not correct 123", "")
>
> numbers <- grepl("^[[:digit:]]+$", mywords)
> letters <- grepl("^[[:alpha:]]+$", mywords)
> both <- grepl("^[[:digit:][:alpha:]]+$", mywords)
>
> mywords[letters]
[1] "harry" "met"   "sally"
> mywords[numbers]
[1] "12345"
> mywords[xor((letters | numbers), both)] # letters & numbers mixed
[1] "subway10"   "1800Movies"
>
>
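
The three tests can also be folded into one labelling function. A sketch;
'classify' and its labels are made-up names, and the patterns use '+' so
that the empty string falls into "other":

```r
# Hypothetical helper: label each string by the kind of characters it holds.
classify <- function(w) {
  ifelse(grepl("^[[:digit:]]+$", w), "numeric",
  ifelse(grepl("^[[:alpha:]]+$", w), "alpha",
  ifelse(grepl("^[[:alnum:]]+$", w), "alphanumeric", "other")))
}

labels <- classify(c("harry", "subway10", "12345", "not correct 123", ""))
labels
```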


On Mon, Nov 23, 2009 at 9:17 AM, hadley wickham  wrote:
>>> mywords<- c("harry","met","sally","subway10","1800Movies","12345", "not 
>>> correct 123")
>>> all.letters <- grep("^[[:alpha:]]*$", mywords)
>>> all.numbers <- grep("^[[:digit:]]*$", mywords)  # numbers
>>> mixed <- grep("^[[:digit:][:alpha:]]*$", mywords)
>
> mywords<- c("harry","met","sally","subway10","1800Movies","12345",
> "not correct 123", "")
> mywords[grepl("^[[:digit:][:alpha:]]*$", mywords)]
>
> So maybe you should use
>
> mywords[grepl("^[[:digit:][:alpha:]]+$", mywords)]
>
>
> And grepl is highly recommended over grep.
>
> Hadley
>
> --
> http://had.co.nz/
>





Re: [R] How to have R automatically print traceback upon errors

2009-11-23 Thread jim holtman
I use this:

options(error=utils::recover)

and anytime an error occurs in the interactive mode, it will print out
the traceback and then allow you to explore the variables at each
level of the stack; just like putting 'browser()' in the code at the
error point.

Here is what I get in running under Windows:

> x <- function() xyz()  # non-existent function
>
>
> x()  # call the function
Error in x() : could not find function "xyz"  <== error message

Enter a frame number, or 0 to exit

1: x() <== traceback

Selection: 0
>
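
For non-interactive runs, where recover() only dumps the frames, one
possible sketch is to record the call stack with a calling handler at the
moment the error is signalled. This is an assumption about what is
wanted, not part of the reply above; 'f', 'calls' and 'res' are
illustrative names:

```r
f <- function() stop("boom")

calls <- NULL
res <- tryCatch(
  # The calling handler runs while the failing frames are still on the stack.
  withCallingHandlers(f(), error = function(e) calls <<- sys.calls()),
  # The exiting handler then turns the error into an ordinary value.
  error = function(e) conditionMessage(e)
)

res
print(calls)  # the recorded stack, outermost call first
```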


On Mon, Nov 23, 2009 at 7:52 PM, Hao Cen  wrote:
> Hi,
>
> I wonder how to have R automatically print stack trace produced by
> traceback upon errors during interactive uses. I tried the suggestions on
> http://old.nabble.com/Automatically-execute-traceback-when-execution-of-script-causes-error--td22368483.html#a22368775
>
> and used options(error = recover)
> options(showErrorCalls = T)
>
> It just produces an extra message like "recover called non-interactively;
> frames dumped, use debugger() to view"
>
> Thanks
>
> Jeff
>


Re: [R] write to file append by column

2009-11-24 Thread jim holtman
You cannot append a column to an existing file.  Best bet: read the old
file in, do a 'cbind', and write the object back out.
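
If both vectors are available in the same session, the file can simply be
written once from the combined object. A sketch that yields the requested
layout with no header line; the temporary file name is arbitrary:

```r
x <- 1:3
y <- 4:6

out <- tempfile(fileext = ".csv")

# cbind() puts the vectors side by side; headers and row names are dropped.
write.table(cbind(x, y), file = out, sep = ",",
            row.names = FALSE, col.names = FALSE)
readLines(out)
```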

On Tue, Nov 24, 2009 at 5:59 AM, e-letter  wrote:
> Readers,
>
> Scenario: data x consists of one column;
> 1
> 2
> 3
>
> data y;
> 4
> 5
> 6
>
> Is it possible to write to file such that the file is:
> 1,4
> 2,5
> 3,6
>
> using the write.file function? I have tried the command:
>
> write(x,file="file.csv",ncolumns=1,append=TRUE,sep=",")
> write(y,file="file.csv",ncolumns=1,append=TRUE,sep=",")
>
> but the result is:
>
> 1
> 2
> 3
> 4
> 5
> 6
>
> yours,
>
> rhelpatconference.jabber.org
> r 251
> mandriva 2008
>


Re: [R] write to file append by column

2009-11-24 Thread jim holtman
Here is the way to get the required output:

> x <- data.frame(a=1:3)
> write.csv(x, file='tempxx.csv', row.names=FALSE)
> # new data
> newData <- data.frame(b=4:6)
> # read old data back in
> oldData <- read.csv('tempxx.csv')
> # cbind the new data
> write.csv(cbind(oldData, newData), file='tempxx.csv', row.names=FALSE)
>
> file.show('tempxx.csv')
"a","b"
1,4
2,5
3,6


On Tue, Nov 24, 2009 at 8:22 AM, e-letter  wrote:
> On 24/11/2009, jim holtman  wrote:
>> You can not append a column.  Best bet, read the old file in, do a
>> 'cbind', write the object back out.
>>
>> On Tue, Nov 24, 2009 at 5:59 AM, e-letter  wrote:
>>> Readers,
>>>
>>> Scenario: data x consists of one column;
>>> 1
>>> 2
>>> 3
>>>
>>> data y;
>>> 4
>>> 5
>>> 6
>>>
>>> Is it possible to write to file such that the file is:
>>> 1,4
>>> 2,5
>>> 3,6
>>>
>>> using the write.file function? I have tried the command:
>>>
>>> write(x,file="file.csv",ncolumns=1,append=TRUE,sep=",")
>>> write(y,file="file.csv",ncolumns=1,append=TRUE,sep=",")
>>>
>>> but the result is:
>>>
>>> 1
>>> 2
>>> 3
>>> 4
>>> 5
>>> 6
>>>
>>> yours,
>>>
>>> rhelpatconference.jabber.org
>>> r 251
>>> mandriva 2008
>>>
> This is the requested format:
> 1,4
> 2,5
> 3,6
>
> The write functions described previously produce the following format:
> 1
> 2
> 3
> 4
> 5
> 6
>





Re: [R] Removing white space help

2009-11-24 Thread jim holtman
Is this what you want:

> x <- "   and fgh-"
> gsub(" +", "", x)
[1] "andfgh-"
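
The pattern " +" matches literal spaces only. If tabs or other whitespace
may be present, which is an assumption about the data, the POSIX class
covers those too:

```r
x <- "   and\tfgh-"

# [[:space:]] matches spaces, tabs and newlines.
squeezed <- gsub("[[:space:]]+", "", x)
squeezed
```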


On Tue, Nov 24, 2009 at 4:18 PM, Ramyathulasingam
 wrote:
>
> Hi  there
>
> I am trying to remove the white space and replace it with nothing but didnt
> have any luck with that
>
> x <- and fgh-
>
> i can replace the comma using gsub
> gsub("\\-","",x)
> but i cant replace the white space with nothing.
>
> Ramya
> --
> View this message in context: 
> http://old.nabble.com/Removing-white-space-help-tp26503431p26503431.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Test Binary File

2009-11-25 Thread jim holtman
Do you know how it is structured?  Is it 64-bit floating point, 32-bit
floating point, 64 bit integer, 32 bit integer, byte values, etc.?  If
we know the structure, then we can determine how to decode the
information.
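
In the meantime, one way to get a binary file with a known structure is to write one yourself and read it back.  A minimal sketch, assuming 64-bit doubles and a temporary file (the names 'vals', 'f' and 'back' are made up):

```r
# Write three doubles as 8-byte values to a temporary binary file
vals <- c(1.5, 2.25, -3)
f <- tempfile(fileext = ".bin")
con <- file(f, "wb")
writeBin(vals, con, size = 8)
close(con)

# The file should hold 3 * 8 = 24 bytes
print(file.info(f)$size)

# Read the values back, telling readBin the type, count and size
con <- file(f, "rb")
back <- readBin(con, what = "double", n = 3, size = 8)
close(con)
print(back)
```

Once this round trip works, the 'what', 'n' and 'size' arguments can be varied against the unknown file to see which combination gives sensible values.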

On Wed, Nov 25, 2009 at 7:34 AM, Jason Rupert  wrote:
> I've got an error with the way I'm using readBin on a binary file of unknown 
> internal structure.  I know the structure consists of rows and columns, but 
> I'm not sure how many of each.
>
> So, does anyone know of a valid test set of binary data that I could 
> reference while trying to figure out the technique of using readBin?
>
> It would be really helpful to try out readBin on a readily available and 
> understood binary file instead of starting with one of dubious internal 
> structure.
>
> Thank you again for your help and feedback.
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] tick marks on fold change versus fold change plot

2009-11-25 Thread jim holtman
It sounds like you want a log scale on both axes:

plot(..., log='xy')
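
A fuller sketch with the tick marks placed explicitly; the data values here are made up, and 'ticks' is just an illustrative name:

```r
# Made-up fold-change values between 0.01 and 100
x <- c(0.02, 0.3, 1.5, 12, 80)
y <- c(0.05, 0.2, 3, 8, 60)
ticks <- c(0.01, 0.1, 1, 10, 100)

# On a log scale the decades 0.01, 0.1, 1, 10, 100 are equally spaced
plot(x, y, log = "xy", xlim = range(ticks), ylim = range(ticks), axes = FALSE)
axis(1, at = ticks, labels = as.character(ticks))
axis(2, at = ticks, labels = as.character(ticks))
box()
```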

On Wed, Nov 25, 2009 at 12:24 PM, Alla Bulashevska
 wrote:
>
> Dear R users,
> i try to produce the fold change versus fold change plot
> where i have the values for x and y ranging from 0.01 to
> 100. So i start with
> plot(x,y,xlim=c(0.01,100),ylim=c(0.01,100), axes=F).
> Then i would like both axes to have tick marks as
> c(0.01,0.1,1,10,100) but they should appear equidistant.
> How should i manage this?
> Thank you for your help,
> Alla.
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Importing many files from a single code

2009-11-25 Thread jim holtman
Exactly what do you mean by "import"?  What commands are you using?
You can get a list of the files in a directory and then iterate
through reading each one in.  If you use 'lapply', you can
'read.table' in some data frames and then 'rbind' them into a single
data frame.  You need to be more specific on the problem you are
trying to solve.
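
As a rough sketch of the list/lapply/rbind approach, assuming whitespace-delimited text files that all have the same columns (the directory, the file names and the demo data are made up so that the example runs on its own):

```r
# Create a few small demo files in a temporary directory
demo_dir <- file.path(tempdir(), "import_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.table(data.frame(id = i, value = i * 10),
              file.path(demo_dir, paste("file", i, ".txt", sep = "")),
              row.names = FALSE)
}

# List the files, read each one, then stack them into a single data frame
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
pieces <- lapply(files, read.table, header = TRUE)
all_data <- do.call(rbind, pieces)
print(all_data)
```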

On Wed, Nov 25, 2009 at 9:35 AM, ram basnet  wrote:
> Dear R users,
>
> Does somebody know the way to import many files by a single command in R ? I 
> have 50 files in a directory and now, i am importing the files repeatedly 
> (one by one). If there is a way to import all files at a time, it makes much 
> more easy and save times too.
> Thanks in advance.
>
>
> Sincerely,
> Ram Kumar Basent
> Wageningen University,
> the Netherlands
>
>
>
>
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Unique observations

2009-11-25 Thread jim holtman
Shouldn't the first observation for Tree1 be " Tree1  leaves  01-01-2009"?

> x <- read.table(textConnection("Tree  disease  date
+  Tree1  leaves  01-01-2009
+  Tree2  roots  13-09-2009
+  Tree1  roots  24-10-2009"), header=TRUE)
> closeAllConnections()
> # split by "Tree" and take first observation
> do.call('rbind', lapply(split(x, x$Tree), function(.tr) .tr[1,]))
       Tree disease       date
Tree1 Tree1  leaves 01-01-2009
Tree2 Tree2   roots 13-09-2009
>
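
An alternative sketch uses duplicated().  It assumes "first" means earliest date, so the rows are sorted by date before duplicates are dropped; the dd-mm-yyyy format is inferred from the example, and the name 'first_obs' is made up:

```r
x <- data.frame(Tree    = c("Tree1", "Tree2", "Tree1"),
                disease = c("leaves", "roots", "roots"),
                date    = c("01-01-2009", "13-09-2009", "24-10-2009"))

# Sort by date so that the earliest observation of each tree comes first
x <- x[order(as.Date(as.character(x$date), format = "%d-%m-%Y")), ]

# Keep only the first row seen for each tree
first_obs <- x[!duplicated(x$Tree), ]
print(first_obs)
```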


On Wed, Nov 25, 2009 at 9:44 AM, John Lipkins
 wrote:
> Hey R list,
>
> A beginners question. How can I do the following:
>
> In my research population it is possible that several items can appear
> several times, measured on different moments in time. This is being supplied
> in a total list with all observations identified by a number (per item) and
> a moment of observation (date). Now I want to make a unique list of this
> observation preserving the characteristics of the first observation. As
> example:
>
>  Tree  disease  date
>  Tree1  leaves  01-01-2009
>  Tree2  roots  13-09-2009
>  Tree1  roots  24-10-2009
>
> Now I want to create a list of unique elements (in the example only once
> Tree1 and Tree2) with the first observed disease and date. For the example
> the result would look like:
>
>  Tree  disease  date
>  Tree1  roots  24-10-2008
>  Tree2  roots  13-09-2009
>
> Can someone help me with this question?
>
> Thanks in advance.
> Kind regards,
>
> John
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] difference of two rows

2009-11-25 Thread jim holtman
Try this:

> x <- read.table(textConnection("ID YEAR
+ 13 2007
+ 15 2003
+ 15 2006
+ 15 2008
+ 21 2006
+ 21 2007"), header=TRUE)
> x$diff <- ave(x$YEAR, x$ID, FUN=function(a) c(diff(a), NA))
>
> x
  ID YEAR diff
1 13 2007   NA
2 15 2003    3
3 15 2006    2
4 15 2008   NA
5 21 2006    1
6 21 2007   NA


On Wed, Nov 25, 2009 at 10:55 AM, clion  wrote:
>
> Dear R user,
> I'd like to calculate the difference of two rows, where "ID" is the same.
> eg.: I've got the following dataframe:
> ID YEAR
> 13 2007
> 15 2003
> 15 2006
> 15 2008
> 21 2006
> 21 2007
>
> and I'd like to get the difference, like this:
> ID YEAR     diff
> 13 2007      NA
> 15 2003       3
> 15 2006       2
> 15 2008      NA
> 21 2006       1
> 21 2007      NA
>
> that should be fairly easy...I hope
> Thanks for any helpful comments
> B.
>
>
>
> --
> View this message in context: 
> http://old.nabble.com/difference-of-two-rows-tp26515212p26515212.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Feature request for as.Date() function

2009-11-25 Thread jim holtman
Seems to work fine in my testing:

> x <- read.csv(textConnection("date,value
+ 2009-01-01,10
+ 2009-02-01,1
+ 'NA', 3"), colClasses=c("Date", 'integer'))
>
> str(x)
'data.frame':   3 obs. of  2 variables:
 $ date :Class 'Date'  num [1:3] 14245 14276 NA
 $ value: int  10 1 3
> x <- read.csv(textConnection("date,value
+ 2009-01-01,10
+ 2009-02-01,1
+ NA, 3"), colClasses=c("Date", 'integer'))
>
> str(x)
'data.frame':   3 obs. of  2 variables:
 $ date :Class 'Date'  num [1:3] 14245 14276 NA
 $ value: int  10 1 3
>

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

On Wed, Nov 25, 2009 at 12:38 PM,
 wrote:
> Hello -
>
> I have a csv file with a few date columns. Some of the records have an
> "NA" character string instead of the date. When I attempt to use
> read.csv() and typecast the columns using colClasses, I receive the
> following error:
>    Error in charToDate(x) :
>      character string is not in a standard unambiguous format
>
> Similarly, the following command produces the same error:
>    as.Date("NA")
>
> However, as.Date(NA) performs as documented.
>
> Can we enhance the as.Date() function to convert "NA" strings into NA
> value prior to type conversion?
>
> Thanks!
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] 2-character plotting characters?

2009-11-28 Thread jim holtman
?text

You will have to plot your own text at each point that you want.
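
A minimal sketch of that idea; the coordinates and country codes are made up:

```r
# Made-up data: one two-letter code per point
x <- c(1, 2, 3)
y <- c(2, 5, 3)
codes <- c("US", "DE", "JP")

# type = "n" sets up the axes without drawing any symbols,
# then text() draws each label at its (x, y) position
plot(x, y, type = "n")
text(x, y, labels = codes)
```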

On Sat, Nov 28, 2009 at 6:38 PM, Ben Seligman  wrote:
> I am trying to make a plot using the plot command in which I would like the
> plotting characters to be two-character strings (they're two-letter
> abbreviations of country names).  I've tried the pch argument and this, of
> course, only produces 1-character strings.  Looking through Intro to R and
> the reference manual, I can't find any obvious way around this.  Would
> anyone have any suggestions?
>
> Thanks so much!
>
> -Ben
>
> --
> Benjamin Seligman
> Stanford University, School of Medicine
> MD Candidate, SMS II
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] How to find where the source code of an R function or package is installed?

2009-11-28 Thread jim holtman
Check out:

Uwe Ligges. R Help Desk: Accessing the sources. R News, 6(4):43-45, October 2006

On Sat, Nov 28, 2009 at 11:00 PM, Peng Yu  wrote:
> I'm wondering where the source of an R function or a package is.
> For example, where is 'attributes'?
>
>> attributes
> function (obj)  .Primitive("attributes")
>
> I also do not understand what .Primitive means. Could somebody let me know
> how to locate source file in an R installation? Why typing
> 'attributes' does not give its definition?
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Removing objects from a list based on nrow

2009-11-29 Thread jim holtman
One thing to be careful of is the case where no data frame has fewer than 3 rows:

> df1<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
> df2<-data.frame(letter=c("A","B"),number=c(1,2))
> df3<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
> df4<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>
> lst<-list(df1,df3,df4)
> lst
[[1]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[2]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[3]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

> lst[-which(sapply(lst, nrow) < 3)]
list()
>

Notice the list is now empty.  Instead use:

> lst[sapply(lst, nrow) >=3]
[[1]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[2]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5

[[3]]
  letter number
1      A      1
2      B      2
3      C      3
4      D      4
5      E      5
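
The reason the negative-index form fails can be seen in a smaller sketch, using plain vectors in place of data frames (names made up): which() returns integer(0) when nothing matches, and indexing with -integer(0) selects nothing rather than everything.

```r
# Two elements, both with 3 or more entries, so nothing should be dropped
lst <- list(a = 1:5, b = 1:4)

idx <- which(sapply(lst, length) < 3)   # integer(0): no element is too short
dropped <- lst[-idx]                    # -integer(0) is integer(0): selects nothing
kept <- lst[sapply(lst, length) >= 3]   # logical index: keeps both elements

print(length(dropped))
print(length(kept))
```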


On Sun, Nov 29, 2009 at 3:43 AM, Linlin Yan  wrote:
> Try these:
> sapply(lst, nrow) # get the number of rows of each data frame
> which(sapply(lst, nrow) < 3) # get the indices of data frames with fewer than 3 rows
> lst <- lst[-which(sapply(lst, nrow) < 3)] # remove those data frames from the list
>
> On Sun, Nov 29, 2009 at 4:36 PM, Tim Clark  wrote:
>> Dear List,
>>
>> I have a list containing data frames of various numbers of rows.  I need to 
>> remove any data frame that has less than 3 rows.  For example:
>>
>> df1<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>> df2<-data.frame(letter=c("A","B"),number=c(1,2))
>> df3<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>> df4<-data.frame(letter=c("A","B","C","D","E"),number=c(1,2,3,4,5))
>>
>> lst<-list(df1,df2,df3,df4)
>>
>> How can I determine that the second object (df2) has less than 3 rows and 
>> remove it from the list?
>>
>> Thanks!
>>
>> Tim
>>
>>
>>
>>
>> Tim Clark
>> Department of Zoology
>> University of Hawaii
>>
>>
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] RSQLite does not read very large values correctly

2009-11-30 Thread jim holtman
It appears that you were reading the number in as an integer and not
as numeric.  The value that you are seeing (-596864072) is the numeric
value truncated to 32 bits.  In hex the number is 6DC6C93B8; dropping
the leading '6' leaves DC6C93B8, which as a signed 32-bit integer is
-596864072.  Check your database definition and how you are
reading in your data.
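
The arithmetic can be checked in R itself.  This sketch only reproduces the wrap-around; it assumes nothing about how RSQLite fetches the column, and the variable names are made up:

```r
# The value stored in the database (too large for a 32-bit integer)
x <- 29467907000

# Keep only the low 32 bits
low32 <- x %% 2^32

# Interpret those bits as a signed 32-bit integer
signed32 <- if (low32 >= 2^31) low32 - 2^32 else low32
print(signed32)
```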

On Mon, Nov 30, 2009 at 11:14 AM, Ruecker, Sebastian
 wrote:
> Hello,
>
> I am trying to import data from an SQLite database to R.
> Unfortunately, I seem to get wrong data when I try to import very large
> numbers.
>
> For example:
> I look at the database via SQLiteStudio(v.1.1.3) and I see the following
> values:
>
> OrderID Day             TimeToclose
> 1               2009-11-25      29467907000
> 2               2009-11-25      29467907000
> 3               2009-11-25      29467907000
>
>
> Now I run this R Code:
>
>> library("DBI")
>> library("RSQLite")
>>
>> # DB Connection
>> con <- dbConnect(dbDriver("SQLite"), "C:/Temp/TickDB01.db")
>> raw_Data <- dbGetQuery(con, "SELECT OrderID, Day, TimeToClose FROM
> Tr_TickData WHERE OrderID in (1,2,3)")
>> raw_Data
>  OrderID        Day TimeToClose
> 1       1 2009-11-25  -596864072
> 2       2 2009-11-25  -596864072
> 3       3 2009-11-25  -596864072
>
>
> The values are totally wrong... Is it because RSQLite has a problem with
> big numbers?
> TimeToClose is microseconds till 17:00.
>
> When I make the numbers smaller, it works again:
>
>> raw_Data <- dbGetQuery(con, "SELECT TimeToClose/1000 as TTC FROM
> Tr_TickData WHERE OrderID in (1,2,3)")
>> raw_Data
>       TTC
> 1 29467907
> 2 29467907
> 3 29467907
>
>
> I would appreciate any help with this problem!
>
> Thanks and regards,
>
> Sebastian
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] paste name in for loop?

2009-11-30 Thread jim holtman
Here is what you want:

xout <- c(1,5,10,25,50,100)
for(i in xout) { print(paste("Areal_Ppt_",i,"sqmi.txt", sep="")) }

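Notice that 'i' is assigned each value in xout in turn, so you do not
have to index into the vector.  Your second line of output is 50 because
'i' was 5 on that pass and xout[5] is 50; for i = 10, 25, 50 and 100 the
index runs past the end of xout, which is why you get NA.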
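
Applied to the real code, the change is just to use 'i' in the file name.  A sketch with a few rows of GROUP.txt typed in directly and the output written to a temporary directory (the 'out_dir' variable is an addition for the example):

```r
# A few rows of the posted data, entered directly instead of read from GROUP.txt
data2 <- data.frame(hr      = c(21, 21, 21, 22, 22, 22),
                    area    = c(1, 5, 10, 1, 5, 10),
                    avg_ppt = c(0, 0.001, 0.001, 0.003, 0.005, 0.00824))
xout <- c(1, 5, 10)
out_dir <- tempdir()

for (i in xout) {
  # 'i' already holds the area value, so use it directly in the name
  name <- file.path(out_dir, paste("Areal_Ppt_", i, "sqmi.txt", sep = ""))
  b.1 <- subset(data2, area == i)
  write.table(b.1, file = name, quote = FALSE, row.names = FALSE, sep = ",")
}
print(list.files(out_dir, pattern = "^Areal_Ppt_"))
```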

On Mon, Nov 30, 2009 at 7:49 PM, Douglas M. Hultstrand
 wrote:
> Hello,
>
> I am trying to create subsets of grouped data (by area size), and use the
> area size as part of the output name.  The code below works for area (xout)
> 1 and 50, the other files are given NA for an area.
>
> A simple example:
> xout <- c(1,5,10,25,50,100)
> for(i in xout) { print(paste("Areal_Ppt_",xout[i],"sqmi.txt", sep="")) }
> [1] "Areal_Ppt_1sqmi.txt"
> [1] "Areal_Ppt_50sqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
> [1] "Areal_Ppt_NAsqmi.txt"
>
> The actual code and partial dataset are below.
>
> Thanks for your help,
> Doug
>
> ###
> ### Real Code ###
> ###
> data2 <- read.table("GROUP.txt", header=T, sep=",")
> xout <- c(1,5,10,25,50,100)
> for(i in xout) {
>   name <- paste("Areal_Ppt_",xout[i],"sqmi.txt", sep="")
>   b.1 <- subset(data2, area == i)
>   write.table(b.1, file=name,quote=FALSE,row.names=FALSE, sep=",")
> }
>
> ##
> ### Dataset GROUP.txt ###
> ###
> hr,area,avg_ppt
> 21,1,0
> 21,5,0.001
> 21,10,0.001
> 21,25,0.005
> 21,50,0.01
> 21,100,0.011
> 22,1,0.003
> 22,5,0.005
> 22,10,0.00824
> 22,25,0.04258
> 22,50,0.057
> 22,100,0.101
> 23,1,2.10328
> 23,5,2.02755
> 23,10,1.93808
> 23,25,1.78408
> 23,50,1.67407
> 23,100,1.568
> 24,1,3.20842
> 24,5,3.09228
> 24,10,2.95452
> 24,25,2.71661
> 24,50,2.54607
> 24,100,2.38108
>
> --
> -
> Douglas M. Hultstrand, MS
> Senior Hydrometeorologist
> Metstat, Inc. Windsor, Colorado
> voice: 970.686.1253
> email: dmhul...@metstat.com
> web: http://www.metstat.com
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Error message when logical indexing vecor is all FALSE

2009-12-01 Thread jim holtman
?try
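
A minimal sketch of how try() lets a loop carry on after an error.  The original error message was not posted, so the failing call here (sqrt of a character string) is only a hypothetical stand-in:

```r
results <- numeric(0)
for (v in list(1, "a", 4)) {
  # try() catches the error instead of aborting the loop
  r <- try(sqrt(v), silent = TRUE)
  if (inherits(r, "try-error")) next   # skip this item and continue
  results <- c(results, r)
}
print(results)
```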

On Tue, Dec 1, 2009 at 12:22 PM, Jannis  wrote:
> Dears,
>
>
> is there any way to "switch off" or work around the error message that
> pops up when I do something like:
>
>
> A<-B['logical vector']
>
>
> and when 'logical vector' only consists of FALSE values? My problem is
> that this message always kicks me out of my loops and always testing via
> an if clause whether 'logical vector' contains any TRUE values is much
> too complex due to many different conditions and several of the above
> statements (and actually it seems to make my code really slow).
>
>
> Cheers
> Jannis
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] sort a data frame by a vector

2009-12-01 Thread jim holtman
Is this what you want:

> dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3))
> dataDF
  A1 A2
1  B  1
2  A  2
3  C  3
> dataDF[order(dataDF$A1),]
  A1 A2
2  A  2
1  B  1
3  C  3
>

If you want the sequence "CAB" then you will have to change the
factor levels in column A1:

> dataDF$A1 <- factor(dataDF$A1, levels=c("C", "A", "B"))
> dataDF[order(dataDF$A1),]
  A1 A2
3  C  3
2  A  2
1  B  1
>


On Tue, Dec 1, 2009 at 10:36 PM, Hao Cen  wrote:
> Hi,
>
>
>
> I have a a vector  and a data frame with two columns
>
> vec = c("C", "A", "B")
>
> dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3))
>
>
>
> I would like to sort the data frame by column A1 such that the order of
> elements in A1 is as the same as in vec.
>
>
>
> After the ordering, the data frame would be
>
> A1           A2
>
> C             3
>
> A             2
>
> B             1
>
>
>
> Any suggestions would be appreciated.
>
>
>
> Thanks in advance
>
>
>
> Jeff
>
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] sort a data frame by a vector

2009-12-01 Thread jim holtman
The factor statement should have been:  (missed the 'vec' on the first reading)

dataDF$A1 <- factor(dataDF$A1, levels=vec)
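
If you would rather not change the column itself, another possible sketch orders the rows by each value's position in 'vec' using match(); the name 'sorted' is made up:

```r
vec <- c("C", "A", "B")
dataDF <- data.frame(A1 = c("B", "A", "C"), A2 = c(1, 2, 3))

# match() gives the position of each A1 value within vec;
# ordering by that position puts the rows in vec's order
sorted <- dataDF[order(match(dataDF$A1, vec)), ]
print(sorted)
```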

On Tue, Dec 1, 2009 at 10:57 PM, jim holtman  wrote:
> Is this what you want:
>
>> dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3))
>> dataDF
>  A1 A2
> 1  B  1
> 2  A  2
> 3  C  3
>> dataDF[order(dataDF$A1),]
>  A1 A2
> 2  A  2
> 1  B  1
> 3  C  3
>>
>
> If you want the sequence "CAB" then you will have to change the
> factors in column 1:
>
>> dataDF$A1 <- factor(dataDF$A1, levels=c("C", "A", "B"))
>> dataDF[order(dataDF$A1),]
>  A1 A2
> 3  C  3
> 2  A  2
> 1  B  1
>>
>
>
> On Tue, Dec 1, 2009 at 10:36 PM, Hao Cen  wrote:
>> Hi,
>>
>>
>>
>> I have a a vector  and a data frame with two columns
>>
>> vec = c("C", "A", "B")
>>
>> dataDF = data.frame(A1 = c("B", "A", "C"), A2 = c(1,2,3))
>>
>>
>>
>> I would like to sort the data frame by column A1 such that the order of
>> elements in A1 is as the same as in vec.
>>
>>
>>
>> After the ordering, the data frame would be
>>
>> A1           A2
>>
>> C             3
>>
>> A             2
>>
>> B             1
>>
>>
>>
>> Any suggestions would be appreciated.
>>
>>
>>
>> Thanks in advance
>>
>>
>>
>> Jeff
>>
>>
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] find the index of the next largest element in a sorted vector

2009-12-02 Thread jim holtman
Is this what you want:

> x <- c(0,3,4)
> ?findInterval
> findInterval(2, x)
[1] 1
>
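
findInterval is vectorized, and it returns 0 when the value is below the first element, which may need separate handling.  A short sketch with made-up search values:

```r
x <- c(0, 3, 4)
# -1 is below x[1]; 2 lies between x[1] and x[2]; 3 equals x[2]; 10 is past the end
idx <- findInterval(c(-1, 2, 3, 10), x)
print(idx)
```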


On Wed, Dec 2, 2009 at 9:34 AM, Hao Cen  wrote:
> Hi,
>
> How can I find the index of the next largest element in a sorted vector if
> an element is not found.
>
> for example, searching 2 in c(0,3,4) would return 1 since 2 is not in the
> vector and 0 is the next largest element to 2.
>
> I tried which and match and neither returns such information.
>
>> which(c(0,3,4) == 2)
> integer(0)
>> match(2, c(0,3,4))
> [1] NA
>
>
> thanks
>
> Jeff
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Reordering the results from table(cut()) by break argument

2009-12-02 Thread jim holtman
Try this:

> dat <- rnorm(100)
> breaks <- -3:3
> table((cut(dat, breaks)))

(-3,-2] (-2,-1]  (-1,0]   (0,1]   (1,2]   (2,3]
      1      10      35      39      13       2
> x <- table((cut(dat, breaks)))
> rev(x)

  (2,3]   (1,2]   (0,1]  (-1,0] (-2,-1] (-3,-2]
      2      13      39      35      10       1
>


On Wed, Dec 2, 2009 at 12:18 PM, Mark Heckmann  wrote:
> I have a vector and need to count how many data points fall inside each bin:
>
> dat <- rnorm(100)
> breaks <- -3:3
> table((cut(dat, breaks)))
>
> (-3,-2] (-2,-1]  (-1,0]   (0,1]   (1,2]   (2,3]
>      3      13      42      30      12       0
>
> if I reverse the breaks vector, the results remains the same:
> breaks <- rev(breaks)
> table((cut(dat, breaks)))
>
> (-3,-2] (-2,-1]  (-1,0]   (0,1]   (1,2]   (2,3]
>      3      13      42      30      12       0
>
> What I would like is break to also determine the order of the table output,
> in this case it should also be reversed, like:
> ( 3, 2] ( 2, 1]  ( 1,0]   (0,-1]   (-1,-2]   (-2,-3]
>      0      12      30      42      13       3
>
> Thus I would like to reorder the vector using break, but I do not know how.
>
> TIA
> Mark
> –––
> Mark Heckmann
> Dipl. Wirt.-Ing. cand. Psych.
> Vorstraße 93 B01
> 28359 Bremen
> Blog: www.markheckmann.de
> R-Blog: http://ryouready.wordpress.com
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Reading comments in text file from R

2009-12-02 Thread jim holtman
comment.char=''
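
That is, pass it as an argument to read.table so that '#' is no longer treated as the start of a comment.  A sketch with made-up inline data, since the original file is not shown in this message:

```r
# Made-up comma-separated data in which a field starts with '#'
con <- textConnection(c("id,note", "1,#first", "2,plain"))
x <- read.table(con, header = TRUE, sep = ",",
                comment.char = "", stringsAsFactors = FALSE)
close(con)
print(x)
```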

On Wed, Dec 2, 2009 at 2:05 PM, Graham Smith  wrote:
> Thanks all.
>
> I assumed it would be easy, but searching yielded nothing useful.
>
> Graham
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Extracting vectors from a matrix (err, I think) in RMySQL

2009-12-02 Thread jim holtman
Try this:

> salaries
   yearID POS pct
12009  RF 203
22009  DH 200
32009  1B 198
42009  3B 180
52009  LF 169
62009  SS 156
72009  CF 148
82009  2B  97
92009   C  86
10   2008  DH 234
11   2008  1B 199
12   2008  RF 197
13   2008  3B 191
14   2008  SS 180
15   2008  CF 164
16   2008  LF 156
17   2008  2B 104
18   2008   C  98
> x <- split(salaries[c('yearID','pct')], salaries$POS)
> x
$`1B`
   yearID pct
3    2009 198
11   2008 199

$`2B`
   yearID pct
8    2009  97
17   2008 104

$`3B`
   yearID pct
4    2009 180
13   2008 191

$C
   yearID pct
9    2009  86
18   2008  98

$CF
   yearID pct
7    2009 148
15   2008 164

$DH
   yearID pct
2    2009 200
10   2008 234

$LF
   yearID pct
5    2009 169
16   2008 156

$RF
   yearID pct
1    2009 203
12   2008 197

$SS
   yearID pct
6    2009 156
14   2008 180

>
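
To pull out a single position afterwards, index the list by name, or use subset() on the original data frame if only one position is needed.  A sketch on a cut-down copy of the data (the names 'by_pos', 'rf' and 'rf2' are made up):

```r
# Cut-down copy of the posted data
salaries <- data.frame(yearID = c(2009, 2009, 2008, 2008),
                       POS    = c("RF", "DH", "DH", "RF"),
                       pct    = c(203, 200, 234, 197))

# Split once, then look up a position by name
by_pos <- split(salaries[c("yearID", "pct")], salaries$POS)
rf <- by_pos[["RF"]]
print(rf)

# Or select the rows for one position directly
rf2 <- subset(salaries, POS == "RF", select = c(yearID, pct))
print(rf2)
```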


On Wed, Dec 2, 2009 at 4:01 PM, Wells Oliver  wrote:
> I have a query which returns a data set like so:
>
>> salaries
>   yearID POS pct
> 1    2009  RF 203
> 2    2009  DH 200
> 3    2009  1B 198
> 4    2009  3B 180
> 5    2009  LF 169
> 6    2009  SS 156
> 7    2009  CF 148
> 8    2009  2B  97
> 9    2009   C  86
> 10   2008  DH 234
> 11   2008  1B 199
> 12   2008  RF 197
> 13   2008  3B 191
> 14   2008  SS 180
> 15   2008  CF 164
> 16   2008  LF 156
> 17   2008  2B 104
> 18   2008   C  98
>
> I'd like to make a vector for all data for a given position, so for example
> here I'd like all yearID and pct for POS 'RF which should look like:
>
>   yearID pct
> 1 2009 203
> 2 2008 197
>
> Apologies if I'm mangling terminology here.
>
> --
> Wells Oliver
> we...@submute.net
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] calculation problem when export and import data

2009-12-02 Thread jim holtman
Exactly what errors are you getting?  What does 'str(a)' show, so we have
an idea of the data you are processing?  Why don't you use save/load
so that the data is saved in its original format?  Have you checked
the structure of the data before/after the write.table/read.table?

Also take a look at what is being returned with your 'mean(a[rep,])';
this would appear to be multivalued depending on what your dataframe
is: e.g.,

> x <- data.frame(a=1:10, b=1:10, c=letters[1:10])
> mean(x)
  a   b   c
5.5 5.5  NA
Warning message:
In mean.default(X[[3L]], ...) :
  argument is not numeric or logical: returning NA

So there is more information that you have to provide; also try to
look at the structure of all your objects to see if they are what you
think they should be.
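
As a rough sketch of what is probably going on (assuming 'a' comes back
from read.table as a data frame, which the original post does not show):
a single row a[rep,] is still a one-row data frame, not a numeric vector,
so mean() and sd() do not treat it as one.  Converting the row first is one
way around that.  The small matrix and the temporary file below are made up
for illustration only.

```r
# Illustrative data only: a small matrix written out and read back in
simul <- matrix(as.numeric(1:12), nrow = 3)
tmp <- tempfile(fileext = ".txt")
write.table(simul, file = tmp, row.names = FALSE, col.names = FALSE)
a <- read.table(tmp)

str(a)                 # a data.frame with columns V1..V4, not a matrix

# One row as a plain numeric vector
y <- unlist(a[1, ])
row_mean <- mean(y)
row_sd   <- sd(y)

# Or convert once and work on the whole matrix
m <- as.matrix(a)
all_means <- rowMeans(m)
all_sds   <- apply(m, 1, sd)
```

Using save()/load() instead of write.table()/read.table() would keep
'simul' as a matrix and avoid the conversion altogether.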

On Wed, Dec 2, 2009 at 6:36 PM, aegea  wrote:
>
> Hello,
>
> I have a question on export and import data. Thank you for any suggestions.
>
> data 'simul' is generated as follows:
> N     <- 20
> n     <- N/2
> nsets <- 10
> simul <- matrix(0,nsets,N)
> th    <- c(0,1, 1)
> for(i in 1:nsets){
>    simul[i,] <- rnorm(N,mean= rep(th[1:2],N/2),sd=th[3])
> }
>
> I exported data as follows:
> write.table(simul, file="D:\\test.txt", row.names=F, col.names=F)
>
> When I want to use this data, I imported as follows:
> a=read.table("D:\\test.txt")
>
> So far, it works well. When I deal with data, I need use each row to do
> calculations:
>
> for(rep in 1:nsets){
> y   <- a[rep,]
> b<-c(mean(y)+3, mean(y)-4) # cannot calculate mean(y), the mean of this row
> m<-sd(y)   # also cannot calculate sd(y)
> }
>
> I need a lot of calculation based on y, but after I imported data, R comes
> error on it.
>
> Could you please give me some suggestions?
>
>
> --
> View this message in context: 
> http://n4.nabble.com/calculation-problem-when-export-and-import-data-tp947250p947250.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] data manipulation

2009-12-03 Thread jim holtman
try this:

> x <- c('v2FfaPre15','v2FfaPre10','v2FfaPre5','v2Ffa2',
+        'v2Ffa3','v2Ffa4')
> sub("^.*?([0-9]+)$", "\\1", x, perl=TRUE)
[1] "15" "10" "5"  "2"  "3"  "4"
>
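
Since the question was about the header of a data.frame, the same sub()
call can be applied to names().  This is only a sketch: the data frame
'df' and its values are made up, only the column names come from the
question.

```r
# Hypothetical data frame with the column names from the question
df <- data.frame(v2FfaPre15 = 1, v2FfaPre10 = 2, v2FfaPre5 = 3,
                 v2Ffa2 = 4, v2Ffa3 = 5, v2Ffa4 = 6)

# Keep only the trailing run of digits in each name
names(df) <- sub("^.*?([0-9]+)$", "\\1", names(df), perl = TRUE)
names(df)

# Names that start with a digit are not syntactic, so use [[ ]] or backticks
first_col <- df[["15"]]
```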


On Thu, Dec 3, 2009 at 9:00 AM, oscar linares  wrote:
> Dear Wiza[R]ds,
>
> I have a data.frame header that looks like this:
>
> v2FfaPre15    v2FfaPre10    v2FfaPre5    v2Ffa2    v2Ffa3    v2Ffa4
>
> I need it to look like this,
>
> 15    10    5    2    3     4
>
> i.e., with v2FfaPre and  v2Ffa stripped off
>
> Any suggestions,
>
> Thanks in advance!
>
> --
> Oscar
> Oscar A. Linares, MD
> Translational Medicine Unit
> LaPlaisance Bay, Bolles Harbor
> Monroe, Michigan 48161
>
> Department of Medicine,
> University of Toledo College of Medicine
> Toledo, OH 43606-3390
>
> Department of Internal Medicine,
> The Detroit Medical Center (DMC)
> Harper University Hospital
> Wayne State University School of Medicine
> Detroit, Michigan 48201
>
>        [[alternative HTML version deleted]]
>
> __________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Two-way/Three-way sum.

2009-12-03 Thread jim holtman
try this:

> x
   State Month Year Value
1     NC   Jan 1996     1
2     NC   Jan 1996     2
3     NC   Feb 1997     2
4     NC   Feb 1997     3
5     NC   Mar 1998     3
6     NC   Mar 1998     4
7     NY   Jan 1996     4
8     NY   Jan 1996     5
9     NY   Feb 1997     5
10    NY   Feb 1997     6
11    NY   Mar 1998     6
12    NY   Mar 1998     7
> tapply(x$Value, list(x$State, x$Year), sum)
   1996 1997 1998
NC    3    5    7
NY    9   11   13
>
> tapply(x$Value, list(x$State, x$Year, x$Month), sum)
, , Feb

   1996 1997 1998
NC   NA    5   NA
NY   NA   11   NA

, , Jan

   1996 1997 1998
NC    3   NA   NA
NY    9   NA   NA

, , Mar

   1996 1997 1998
NC   NA   NA    7
NY   NA   NA   13

>
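
The question also asked for State*Month totals and for means.  A sketch of
both, rebuilding the sample data from the post so it runs on its own; the
aggregate() call is an alternative I am adding here, not part of the reply
above, and it returns a long data frame without the NA cells:

```r
# Sample data as given in the original question
x <- data.frame(
  State = rep(c("NC", "NY"), each = 6),
  Month = rep(rep(c("Jan", "Feb", "Mar"), each = 2), times = 2),
  Year  = rep(rep(1996:1998, each = 2), times = 2),
  Value = c(1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7)
)

# Two-way sum and mean by State and Month
sum_state_month  <- tapply(x$Value, list(x$State, x$Month), sum)
mean_state_month <- tapply(x$Value, list(x$State, x$Month), mean)

# Three-way sum in long format (one row per combination that occurs)
sum_long <- aggregate(Value ~ State + Month + Year, data = x, FUN = sum)
```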



On Thu, Dec 3, 2009 at 1:50 PM, Peng Cai  wrote:

> Hi R Users,
>
> I'm wondering how can I calculate two (or three) way sum of a variable. A
> sample data is:
>
> State Month Year Value
> NC Jan 1996 1
> NC Jan 1996 2
> NC Feb 1997 2
> NC Feb 1997 3
> NC Mar 1998 3
> NC Mar 1998 4
> NY Jan 1996 4
> NY Jan 1996 5
> NY Feb 1997 5
> NY Feb 1997 6
> NY Mar 1998 6
> NY Mar 1998 7
>
> I'm trying to sum up "value" column by State*Month and by State*Month*Year.
> Also, I may need to calculate mean value along with "sum".
>
> Any help would be greatly appreciated,
>
> Thanks,
> Peng
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] dataset index

2009-12-03 Thread jim holtman
Does this do what you want:

> x <- matrix(c(
+ 0, 0, 0,
+ 0, 0, 0,
+ 0, 1, 0,
+ 0, 1, 0,
+ 0, 1, 0,
+ 1, 2, 1,
+ 1, 2, 1,
+ 1, 3, 1,
+ 1, 3, 1,
+ 1, 3, 1),
+ ncol = 3, byrow = T,
+ dimnames = list(1:10, c("gender", "race", "disease")))
> key <- apply(x, 1, paste, collapse=":")
> m.flags <- lapply(unique(key), function(.indx){
+ key == .indx
+ })
> # combine the flags into one matrix, one row per unique key
> do.call(rbind, m.flags)
         1     2     3     4     5     6     7     8     9    10
[1,]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
[3,] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE
[4,] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
>
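
A slightly more compact variant of the same idea, as a sketch: sapply()
over the unique keys gives the flag matrix directly, with the key itself
as the row name.  The name 'flags' is mine, not from the original post.

```r
x <- matrix(c(0, 0, 0,  0, 0, 0,  0, 1, 0,  0, 1, 0,  0, 1, 0,
              1, 2, 1,  1, 2, 1,  1, 3, 1,  1, 3, 1,  1, 3, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(1:10, c("gender", "race", "disease")))

# One key per row, e.g. "0:1:0"
key <- apply(x, 1, paste, collapse = ":")

# One logical row per distinct key, one column per observation
flags <- t(sapply(unique(key), function(k) key == k))
flags

# Pull out a single index vector, e.g. what the question calls m2
m2 <- flags["0:1:0", ]
```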


On Thu, Dec 3, 2009 at 5:07 PM, Lisa  wrote:
>
> Hello, All,
>
> I have a dataset that looks like this:
>
> x <- matrix(c(
> 0, 0, 0,
> 0, 0, 0,
> 0, 1, 0,
> 0, 1, 0,
> 0, 1, 0,
> 1, 2, 1,
> 1, 2, 1,
> 1, 3, 1,
> 1, 3, 1,
> 1, 3, 1),
> ncol = 3, byrow = T,
> dimnames = list(1:10, c("gender", "race", "disease")))
>
> I want to write a function to produce several matrices including only “TRUE”
> and “FALSE” for the different levels of the variables (these matrices may be
> thought as index matrices), like
>
>> m1
> TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
>> m2
> FALSE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
>
>> m3
> FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
>
>> m4
> FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE
>
> Can anyone please help how to get this done? Your help would be greatly
> appreciated.
>
> Lisa
>
> --
> View this message in context: 
> http://n4.nabble.com/dataset-index-tp948049p948049.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] corrupted matrix data.. sporadic result appears to be pairs of decimal numbers glommed together

2009-12-03 Thread jim holtman
Those appear to be complex numbers; someplace in your script you must
be computing something that returns a complex number.  Run str() on the
matrix to see what it says; see if it says this:

> x.1
                 [,1]             [,2]
[1,] 0.1820848-0.032i 0.1820848-0.032i
[2,] 0.1820848-0.032i 0.1820848-0.032i
> str(x.1)
 cplx [1:2, 1:2] 0.182-0i 0.182-0i 0.182-0i ...
>

If it does, look closely at your script.
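
A small sketch of how one might track this down.  The 'res' matrix is
made up to resemble the values in the post, and eigen() on a non-symmetric
matrix is only one common source of complex results, not necessarily the
one in your script:

```r
# A non-symmetric matrix can have complex eigenvalues
rot <- matrix(c(0, 1, -1, 0), nrow = 2)
ev <- eigen(rot)$values
is.complex(ev)

# Made-up result matrix with tiny imaginary parts, like the one in the post
res <- matrix(c(0.1820848 - 3.2090e-06i, 0.2178046 - 4.8140e-06i,
                0.1144594 + 0i,          -0.1293271 + 4.3889e-06i), ncol = 1)

if (is.complex(res)) {
  bad   <- which(Im(res) != 0)    # entries with a non-zero imaginary part
  worst <- max(abs(Im(res)))      # how large the imaginary parts get
}
```

Putting a check like is.complex() after each major step of the script
should show where the values first turn complex.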



On Thu, Dec 3, 2009 at 2:54 PM, Stephen Grubb  wrote:
> Hello,
>
> We are occasionally getting matrix results that appear to be corrupted... 
> here are the last several rows of an example.  These are supposed to be 
> floating point numbers.
>
> [25015,]  1.820848e-01-3.2090e-06i
> [25016,]  2.178046e-01-4.8140e-06i
> [25017,]  1.820848e-01-3.2090e-06i
> [25018,]  1.820848e-01-3.2090e-06i
> [25019,]  1.144594e-01-1.6657e-06i
> [25020,]  1.820848e-01-3.2090e-06i
> [25021,] -1.293271e-01+4.3889e-06i
> [25022,]  1.144594e-01-1.6657e-06i
> [25023,]  1.820848e-01-3.2090e-06i
> [25024,]  1.820848e-01-3.2090e-06i
> [25025,]  1.173487e-01-4.4415e-07i
> [25026,]  1.820848e-01-3.2090e-06i
> [25027,]  1.375304e-01-3.6167e-06i
> [25028,]  1.820848e-01-3.2090e-06i
> [25029,] -1.293271e-01+4.3889e-06i
> [25030,]  1.820848e-01-3.2090e-06i
> [25031,]  1.820848e-01-3.2090e-06i
> [25032,]  1.820848e-01-3.2090e-06i
> [25033,]  1.820848e-01-3.2090e-06i
>
> Any general idea what may be going on here?
>
> It is a sporadic problem... it occurs maybe 2% or 3% of the time when running 
> this particular script on various data.
>
> I apologize for not including a pared-down example that reproduces the 
> problem we are using an R script written elsewhere on large data sets.  
> If someone wants more specifics please follow up.
>
> Steve Grubb
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] writing 'output.csv' file

2009-12-04 Thread jim holtman
The csv file is exactly as you describe it.  " ;35" represents two columns of
data, the first one blank.  If you read it in, you will probably get NA as the
value in the first column.  So what problem are you having in reading in the data?

> x <- read.csv(textConnection("8;32
+ 9;33
+ 10;34
+  ;35
+  ;36
+  ;37
+  ;38"), header=FALSE, sep=';')
> closeAllConnections()
> x
  V1 V2
1  8 32
2  9 33
3 10 34
4 NA 35
5 NA 36
6 NA 37
7 NA 38
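
If the goal is a file with real M and N columns, one option is to pad the
shorter vector with NA and let write.csv() write the NAs as empty fields.
This is a sketch of that idea, not what was posted earlier in the thread;
the temporary file stands in for 'output.csv'.

```r
M <- 1:10
N <- 25:50

# Indexing past the end of a vector gives NA, which pads the shorter one
n <- max(length(M), length(N))
out <- data.frame(M = M[seq_len(n)], N = N[seq_len(n)])

csv_file <- tempfile(fileext = ".csv")   # stand-in for 'output.csv'
write.csv(out, csv_file, row.names = FALSE, na = "")

# Reading it back gives NA again in the padded rows
back <- read.csv(csv_file)
```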


On Fri, Dec 4, 2009 at 5:49 AM, Maithili Shiva wrote:

>
> Dear Mr Signer and Mr Cleland,
>
> Thanks a lot for your great help. However, the output which I am getting is
> as given below -
>
> x
> 1;25
> 2;26
> 3;27
> 4;28
> 5;29
> 6;30
> 7;31
> 8;32
> 9;33
> 10;34
>  ;35
>  ;36
>  ;37
>  ;38
>  ;39
>  ;40
>  ;41
>  ;42
>  ;43
>  ;44
>  ;45
>  ;46
>  ;47
>  ;48
>  ;49
>  ;50
>
> However, my requirement is I should get the csv file as
>
> M    N
> 1    25
> 2    26
> 3    27
>
> 10   34
>      35
>      36
>      37
> ..
> ..
>      50
>
> So that I can carry out further calculations on this output file. Please
> guide.
>
> Regards
>
> Maithili
>
> --- On Fri, 4/12/09, Johannes Signer  wrote:
>
>
> From: Johannes Signer 
> Subject: Re: [R] writing 'output.csv' file
> To: "Maithili Shiva" 
> Date: Friday, 4 December, 2009, 10:29 AM
>
>
> Hello,
>
> maybe that helps:
>
> write.csv(paste(c(M, rep(" ", length(N)-length(M))), N, sep=";"),
> "output.csv", row.names=F)
>
> Johannes
>
>
> On Fri, Dec 4, 2009 at 11:12 AM, Maithili Shiva 
> wrote:
>
> Dear R helpers
>
> Suppose
>
> M <- c(1:10)  #  length(M) = 10
> N <- c(25:50) #  length(N) = 26
>
> I wish to have an output file giving M and N. So I have tried
>
> write.csv(data.frame(M, N), 'output.csv', row.names = FALSE)
>
> but I get the following error message
>
> Error in data.frame(M, N) :
>   arguments imply differing number of rows: 10, 26
>
> How do I modify my write.csv command to get my output in a single (csv)
> file irrespective of lengths.
>
> Please guide
>
> Thanks in advance
>
> Maithili
>
>
>
>  The INTERNET now has a personality. YOURS! See your Yahoo! Homepage.
>[[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
>
> [[elided Yahoo spam]]
>
>[[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] selective subsetting of a correlation matrix

2009-12-04 Thread jim holtman
Will something like this work for you:

> x <- matrix(1:100,10)
> dimnames(x) <- list(letters[1:10], LETTERS[1:10])
> x
   A  B  C  D  E  F  G  H  I   J
a  1 11 21 31 41 51 61 71 81  91
b  2 12 22 32 42 52 62 72 82  92
c  3 13 23 33 43 53 63 73 83  93
d  4 14 24 34 44 54 64 74 84  94
e  5 15 25 35 45 55 65 75 85  95
f  6 16 26 36 46 56 66 76 86  96
g  7 17 27 37 47 57 67 77 87  97
h  8 18 28 38 48 58 68 78 88  98
i  9 19 29 39 49 59 69 79 89  99
j 10 20 30 40 50 60 70 80 90 100
> x[c('c','g','j'), c("B","E","I")]
   B  E  I
c 13 43 83
g 17 47 87
j 20 50 90
>
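
Applied to the question, the same indexing works with a character vector
of gene names on both dimensions.  A minimal sketch with made-up data; I
use 'genes' in place of the variable 't' from the question (t is also the
name of the transpose function), and it assumes the gene names are the
dimnames of the correlation matrix:

```r
set.seed(1)
# Made-up expression data: 10 samples, 5 genes
dat <- matrix(rnorm(50), nrow = 10,
              dimnames = list(NULL, paste("gene", 1:5, sep = "")))
M <- cor(dat)                            # 5 x 5 correlation matrix

genes <- c("gene2", "gene4", "gene5")    # the genes to keep

# Names missing from M would give a 'subscript out of bounds' error
missing_genes <- setdiff(genes, rownames(M))

N <- M[genes, genes]
N
```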


On Fri, Dec 4, 2009 at 8:18 AM, Lee William  wrote:

> Dear All,
> I have a correlation matrix say 'M' (4000x4000) for 4000 genes and I want to
> subset it to 'N' (190x190) for 190 genes.
> The list of those 190 genes are in variable 't'. So the idea is to read the
> names of genes from variable 't' and subset the matrix M accordingly.
> Any thoughts are welcome!
>
> Best
> Lee
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Class attributes

2009-12-04 Thread jim holtman
Here a way of doing it:

for (i in 5:12){
    # convert to character so you can substitute 'x'
    a <- as.character(dd[,i])
    a[a == 'x'] <- '0'  # replace with zero
    dd[,i] <- as.numeric(a)
}
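
The same conversion can be written without an explicit loop.  A sketch on
a small made-up data frame (the real 'dd' was not posted, so the column
names 'a' and 'b' here are placeholders for columns 5:12):

```r
# Made-up stand-in for 'dd': factor columns holding numbers and "x"
dd <- data.frame(id = 1:3,
                 a = factor(c("1.5", "x", "2")),
                 b = factor(c("x", "x", "4")))
cols <- c("a", "b")

dd[cols] <- lapply(dd[cols], function(col) {
  col <- as.character(col)   # go through character, not factor codes
  col[col == "x"] <- "0"
  as.numeric(col)
})

mean.species.biomass <- colMeans(dd[cols])
```

Note that calling as.numeric() directly on a factor returns the internal
level codes, which is why the as.character() step comes first.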

On Fri, Dec 4, 2009 at 11:55 AM, Allen L  wrote:

>
> Dear R forum,
> I want to replace all the elements in a data frame (dd) which match the
> character "x" with "0".
> What's the most elegant way of doing this (there must be an easy way which
> I've missed)? I settled on the following loop:
>
> >for(i in 5:12){    # These are the columns of dd I am interested in
> >dd[which(dd[,i]=="x"),i]<-0
> >}
>
> The problem with this is that the columns which used to contain "x" are
> still considered factors and I am unable to coerce them into numeric:
>
> > mean.species.biomass<-colMeans(as.numeric(dd.p[,5:12]))
> >Error in inherits(x, "data.frame") :
>  (list) object cannot be coerced to type 'double'
>
> I've tried unclassing & reclassing, other functions etc. but nothing seems
> to work. What is wrong?
> Thanks in advance,
> Allen
> --
> View this message in context:
> http://n4.nabble.com/Class-attributes-tp948693p948693.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] grep() exclude certain patterns?

2009-12-04 Thread jim holtman
use !grepl
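
For example (the vector 'x' is made up for illustration):

```r
x <- c("apple", "banana", "cherry", "avocado")

# grepl() returns TRUE/FALSE per element, so negate it to drop matches
no_a <- x[!grepl("^a", x)]

# grep() also has an 'invert' argument, the closest analogue of 'grep -v'
no_a2 <- grep("^a", x, invert = TRUE, value = TRUE)
```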

On Fri, Dec 4, 2009 at 2:43 PM, Peng Yu  wrote:

> On Fri, Dec 4, 2009 at 11:54 AM, Duncan Murdoch 
> wrote:
> > On 04/12/2009 12:52 PM, Peng Yu wrote:
> >>
> >> The external grep program has an option -v to select non-matching
> >> lines. I'm wondering if how to exclude certain patterns in grep() in
> >> R?
> >>
> >
> > ?grep
>
> I don't see which argument to use.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] plot data from tapply

2009-12-06 Thread jim holtman
Here is one way of doing it:

x=c(1,2,3,1)
y=c(1,2,3,1)
ss=c(55,NA,55,88)
ss_byxy_test=tapply(  ss, list( x, y), mean, na.rm=TRUE)
# use the 'reshape' package
ss_byxy_test
# now 'melt' the data to get it into a format for plotting
(ss_melt <- melt(ss_byxy_test))
# create the plot area so you can add the 'ss' as text
plot(0, type='n', xlim=range(ss_melt$X1), ylim=range(ss_melt$X2),
xlab="X", ylab="Y")
text(ss_melt$X1, ss_melt$X2, ss_melt$value, font=2, col='red')

On Sat, Dec 5, 2009 at 4:49 PM, dwwc  wrote:

>
> i have three data, x coordinate, y coordinate and  signal strength
>
> i use tapply() function to get the average ss in the give x,y location
>  x=c(1,2,3,1)
>  y=c(1,2,3,1)
>  ss=c(55,NA,55,88)
>  ss_byxy_test=tapply(  ss, list( x, y), mean)
> and I get this table
> 1  2  3
> 1 71.5 NA NA
> 2   NA NA NA
> 3   NA NA 55
> but i don't know how to plot different the ss with the xy location,
> can anyone help me
> --
> View this message in context:
> http://n4.nabble.com/plot-data-from-tapply-tp949436p949436.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] plot data from tapply

2009-12-06 Thread jim holtman
I left off the statement to load the reshape package.  If you don't have it,
install it from CRAN:

x=c(1,2,3,1)
y=c(1,2,3,1)
ss=c(55,NA,55,88)
ss_byxy_test=tapply(  ss, list( x, y), mean, na.rm=TRUE)
ss_byxy_test
# use the 'reshape' package
library(reshape)
# now 'melt' the data to get it into a format for plotting
(ss_melt <- melt(ss_byxy_test))
# create the plot area so you can add the 'ss' as text
plot(0, type='n', xlim=range(ss_melt$X1), ylim=range(ss_melt$X2),
xlab="X", ylab="Y")
text(ss_melt$X1, ss_melt$X2, ss_melt$value, font=2, col='red')
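
If you would rather not install a package, roughly the same thing can be
done in base R.  This is a sketch of an alternative, not the reshape
approach above: as.data.frame(as.table(...)) turns the tapply() result
into one row per cell.  The names 'ss_byxy' and 'ss_long' are mine.

```r
x  <- c(1, 2, 3, 1)
y  <- c(1, 2, 3, 1)
ss <- c(55, NA, 55, 88)
ss_byxy <- tapply(ss, list(x = x, y = y), mean, na.rm = TRUE)

# Long format: one row per (x, y) cell, then drop the empty cells
ss_long <- as.data.frame(as.table(ss_byxy), responseName = "ss")
ss_long <- ss_long[!is.na(ss_long$ss), ]

# The coordinates come back as factors; convert them to numbers
ss_long$x <- as.numeric(as.character(ss_long$x))
ss_long$y <- as.numeric(as.character(ss_long$y))

# Empty plot area, then the averaged 'ss' values as text
plot(ss_long$x, ss_long$y, type = "n", xlab = "X", ylab = "Y")
text(ss_long$x, ss_long$y, ss_long$ss, font = 2, col = "red")
```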


On Sat, Dec 5, 2009 at 4:49 PM, dwwc  wrote:

>
> i have three data, x coordinate, y coordinate and  signal strength
>
> i use tapply() function to get the average ss in the give x,y location
>  x=c(1,2,3,1)
>  y=c(1,2,3,1)
>  ss=c(55,NA,55,88)
>  ss_byxy_test=tapply(  ss, list( x, y), mean)
> and I get this table
> 1  2  3
> 1 71.5 NA NA
> 2   NA NA NA
> 3   NA NA 55
> but i don't know how to plot different the ss with the xy location,
> can anyone help me
> --
> View this message in context:
> http://n4.nabble.com/plot-data-from-tapply-tp949436p949436.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] data manipulation/subsetting and relation matrix

2009-12-08 Thread jim holtman
try this:

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()
corr_mat <-as.matrix(read.table(textConnection("1 1  .5 0  0  0  0  0  0  0
2 .5 1  0  0  0  0  0  0  0
3 0  0  1  0  0  0  0  0  0
4 0  0  0  1  .5 .5 0  0  0
5 0  0  0  .5 1  .5 0  0  0
6 0  0  0  .5 .5 1  0  0  0
7 0  0  0  0  0  0  1  0  0
8 0  0  0  0  0  0  0  1  .5
9 0  0  0  0  0  0  0  .5 1"),header=FALSE))
closeAllConnections()
corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id
# split out the groups
groups <- split(as.character(myDat$id), myDat$group)
# process each subgroup
result <- lapply(groups, function(.grp){
subgroup <- corr_mat[.grp, .grp]
output <- NULL
# zero the diag
diag(subgroup) <- 0
same <- apply(subgroup, 1, function(x) any(x != 0))
if (any(same)){  # some match, choose one
output <- sample(same[same], 1)
}
if (any(!same)){  # get all that don't correlate
output <- c(output, same[!same])
}
output
})
# output as matrix
do.call(rbind, lapply(names(result), function(x) cbind(x,
names(result[[x]]))))
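
If a group can contain more than one cluster of related ids, picking a
single related member may drop too much.  A deterministic alternative, as
a sketch only: walk through each group in order and keep an id when it is
unrelated to everything kept so far.  The helper 'pick_unrelated' and the
name 'rel' are mine, and this greedy pass is not guaranteed to return the
largest possible set.

```r
# Rebuild the example data from the question
ids <- as.character(seq(101, 901, by = 100))
myDat <- data.frame(group = rep(1:3, each = 3), id = ids)

rel <- diag(9)
dimnames(rel) <- list(ids, ids)
rel["101", "201"] <- rel["201", "101"] <- 0.5   # group 1: 101 and 201 related
g2 <- c("401", "501", "601")
rel[g2, g2] <- 0.5                              # group 2: all related
rel["801", "901"] <- rel["901", "801"] <- 0.5   # group 3: 801 and 901 related
diag(rel) <- 1

# Keep an id only if it has zero relation to every id already kept
pick_unrelated <- function(grp_ids, rel) {
  keep <- character(0)
  for (id in grp_ids) {
    if (all(rel[id, keep] == 0)) keep <- c(keep, id)
  }
  keep
}

groups <- split(as.character(myDat$id), myDat$group)
picked <- lapply(groups, pick_unrelated, rel = rel)

result <- data.frame(group = rep(names(picked), sapply(picked, length)),
                     id = unlist(picked, use.names = FALSE))
result
```

On the example data this keeps 101 and 301, 401, and 701 and 801, which
matches the "possible final data set" in the question.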



On Mon, Dec 7, 2009 at 7:38 PM, Juliet Hannah wrote:

> Hi List,
>
> Here is some example data.
>
> myDat <- read.table(textConnection("group id
> 1 101
> 1 201
> 1 301
> 2 401
> 2 501
> 2 601
> 3 701
> 3 801
> 3 901"),header=TRUE)
> closeAllConnections()
>
> corr_mat <-read.table(textConnection("1 1  .5 0  0  0  0  0  0  0
> 2 .5 1  0  0  0  0  0  0  0
> 3 0  0  1  0  0  0  0  0  0
> 4 0  0  0  1  .5 .5 0  0  0
> 5 0  0  0  .5 1  .5 0  0  0
> 6 0  0  0  .5 .5 1  0  0  0
> 7 0  0  0  0  0  0  1  0  0
> 8 0  0  0  0  0  0  0  1  .5
> 9 0  0  0  0  0  0  0  .5 1"),header=FALSE)
> closeAllConnections()
>
> corr_mat <- corr_mat[,-1]
> colnames(corr_mat) <- myDat$id
> rownames(corr_mat) <- myDat$id
>
> I need to subset this data such that observations within a group are not
> related, which is indicated by a 0 in corr_mat.
>
> For example, within group 1, 101 and 201 are related, so one of these
> has to be selected, say
> 101. 301 is not related to 101 or 201, so the final set for group 1
> consists of 101 and 301. There will always be at least 2 members in
> each group. I need to carry this task on all groups.
>
> One possible final data set looks like:
>
>  group  id
> 1 1 101
> 3 1 301
> 4 2 401
> 7 3 701
> 8 3 801
>
> Any suggestions? Thanks!
>
> Juliet
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] problem with split eating giga-bytes of memory

2009-12-08 Thread jim holtman
size of the attributes that get copied, I guess.
> > >>
> > >>
> > >>
> > >>
> > >>> myDataFrame <- data.frame(matrix(LETTERS, ncol = 7, nrow = 399000))
> > >>> mySplitVar <- factor(as.character(1:1400))
> > >>> myDataFrame <- cbind(myDataFrame, mySplitVar)
> > >>> object.size(myDataFrame)
> > >>> ## 12860880 bytes # ~ 13MB
> > >>> myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar)
> > >>> object.size(myDataFrame.split)
> > >>> ## 144524992 bytes # ~ 144MB
> > >>>
> > >>
> > >> Note:
> > >>
> > >>  only.attr <- lapply(myDataFrame.split,function(x)
> sapply(x,attributes))
> > >>>
> > >>>
> >
> (object.size(myDataFrame.split)-object.size(myDataFrame))/object.size(only.attr)
> > >>>
> > >> 1.03726179240978 bytes
> > >>
> > >>
> > >>>
> > >>
> > >>  object.size(selectSubAct.df)
> > >>> ## 52,348,272 bytes # ~ 52MB
> > >>>
> > >>
> > >> What was this??
> > >>
> > >>
> > >> Chuck
> > >>
> > >>
> > >>>  sessionInfo()
> > >>>>
> > >>> R version 2.10.0 Patched (2009-10-27 r50222)
> > >>> x86_64-unknown-linux-gnu
> > >>>
> > >>> locale:
> > >>> [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
> > >>> [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
> > >>> [5] LC_MONETARY=C  LC_MESSAGES=en_US.UTF-8
> > >>> [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
> > >>> [9] LC_ADDRESS=C   LC_TELEPHONE=C
> > >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> > >>>
> > >>> attached base packages:
> > >>> [1] stats graphics  grDevices datasets  utils methods   base
> > >>>
> > >>> loaded via a namespace (and not attached):
> > >>> [1] tools_2.10.0
> > >>>
> > >>> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
> > >>> Indiana University School of Medicine
> > >>>
> > >>> 15032 Hunter Court, Westfield, IN  46074
> > >>>
> > >>> (317) 490-5129 Work, & Mobile & VoiceMail
> > >>> (317) 399-1219 Skype No Voicemail please
> > >>>
> > >>>[[alternative HTML version deleted]]
> > >>>
> > >>>
> > >>> __
> > >>> R-help@r-project.org mailing list
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >>>
> > >>>
> > >> Charles C. Berry(858) 534-2098
> > >>Dept of Family/Preventive
> > >> Medicine
> > >> E mailto:cbe...@tajo.ucsd.edu   UC San Diego
> > >> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego
> > 92093-0901
> > >>
> > >>
> > >>
> > >
> > >[[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
> >
> > --
> > http://had.co.nz/
> >
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] problem with split eating giga-bytes of memory

2009-12-09 Thread jim holtman
Here is an example:


> # create test data
> N <- 1000000
> x <- data.frame(a=sample(LETTERS, N, TRUE), b=sample(letters, N, TRUE),
+ c=as.numeric(1:N), d=runif(N))
> system.time({
+ x.df <- split(x, x$a)  # split
+ print(sapply(x.df, function(a) sum(a$c)))
+ })
          A           B           C           D           E           F
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622
          G           H           I           J           K           L
19278423676 19362576931 19405443596 19295695044 19052377988 19236047192
          M           N           O           P           Q           R
19143226220 19197703946 19297192525 19129252399 19272964991 19315856972
          S           T           U           V           W           X
19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   1.27    0.02    1.28
> # now use indices
> system.time({
+ x.indx <- split(seq(nrow(x)), x$a)  # create list of indices
+ print(sapply(x.indx, function(a) sum(x$c[a])))
+ })
          A           B           C           D           E           F
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622
          G           H           I           J           K           L
19278423676 19362576931 19405443596 19295695044 19052377988 19236047192
          M           N           O           P           Q           R
19143226220 19197703946 19297192525 19129252399 19272964991 19315856972
          S           T           U           V           W           X
19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   0.23    0.00    0.23
>
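
A smaller, self-contained sketch of the same comparison that also checks
the two approaches agree and compares the size of what split() returns.
The object names and the smaller N are mine; the exact sizes and timings
will differ from machine to machine.

```r
set.seed(42)
N <- 10000
x <- data.frame(a = sample(LETTERS, N, TRUE), c = as.numeric(1:N))

# Approach 1: split the data frame itself (copies every column per group)
x.df <- split(x, x$a)
by_copy <- sapply(x.df, function(d) sum(d$c))

# Approach 2: split only the row indices and index into the original
x.indx <- split(seq_len(nrow(x)), x$a)
by_index <- sapply(x.indx, function(i) sum(x$c[i]))

same_answer <- isTRUE(all.equal(by_copy, by_index))

# Memory held by each split result
size_copy  <- object.size(x.df)
size_index <- object.size(x.indx)
```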


On Tue, Dec 8, 2009 at 10:26 PM, Mark Kimpel  wrote:

> Jim, could you provide a code snippit to illustrate what you mean?
>
> Hadley, good point, I did not know that.
>
> Mark
>
> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN  46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 399-1219 Skype No Voicemail please
>
>
>   On Tue, Dec 8, 2009 at 11:00 PM, jim holtman  wrote:
>
>> Also instead of 'splitting' the data frame, I split the indices and then
>> use those to access the information in the original dataframe.
>>
>>
>> On Tue, Dec 8, 2009 at 9:54 PM, Mark Kimpel  wrote:
>>
>>> Hadley, Just as you were apparently writing I had the same thought and
>>> did
>>> exactly what you suggested, converting all columns except the one that I
>>> want split to character. Executed almost instantaneously without problem.
>>> Thanks! Mark
>>>
>>> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
>>> Indiana University School of Medicine
>>>
>>> 15032 Hunter Court, Westfield, IN  46074
>>>
>>> (317) 490-5129 Work, & Mobile & VoiceMail
>>> (317) 399-1219 Skype No Voicemail please
>>>
>>>
>>>  On Tue, Dec 8, 2009 at 10:48 PM, hadley wickham 
>>> wrote:
>>>
>>> > Hi Mark,
>>> >
>>> > Why are you using factors?  I think for this case you might find
>>> > characters are faster and more space efficient.
>>> >
>>> > Alternatively, you can have a look at the plyr package which uses some
>>> > tricks to keep memory usage down.
>>> >
>>> > Hadley
>>> >
>>> > On Tue, Dec 8, 2009 at 9:46 PM, Mark Kimpel wrote:
>>> > > Charles, I suspect you are correct regarding copying of the attributes.
>>> > > First off, selectSubAct.df is my "real" data, which turns out to be of
>>> > > the same dim() as myDataFrame below, but each column is made up of
>>> > > strings, not simple letters, and there are many levels in each column,
>>> > > which I did not properly duplicate in my first example. I have amended
>>> > > that below, and with the split the new object size is now not 10X the
>>> > > size of the original, but 100X. My "real" data is even more complex than
>>> > > this, so I suspect that is where the problem lies. I need to search for
>>> > > a better solution to my problem than split, for which I will start a
>>> > > separate thread if I can't figure something out.
>>> > >
>>

Re: [R] What is the function to test if a vector is ordered or not?

2009-12-09 Thread jim holtman
Try

all(diff(order(yourVector)) == 1)
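
A minimal sketch comparing the expression above with base R's is.unsorted(),
which tests the same non-decreasing condition without computing a full
ordering. The helper name is_ordered and the sample vectors are illustrative
assumptions, not part of the original reply:

```r
# Hypothetical helper: TRUE when the vector is in non-decreasing order
is_ordered <- function(v) !is.unsorted(v)

sorted_vec   <- c(1, 2, 2, 5)
unsorted_vec <- c(3, 1, 2)

is_ordered(sorted_vec)                   # base R check
all(diff(order(sorted_vec)) == 1)        # expression from the reply
is_ordered(unsorted_vec)
all(diff(order(unsorted_vec)) == 1)
```

Both forms treat ties as ordered; vectors containing NA would need separate
handling.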

On Wed, Dec 9, 2009 at 10:10 PM, Peng Yu  wrote:

> I did a search on www.rseek.org to look for the function to test if a
> vector is ordered or not. But I don't find it. Could somebody let me
> know what function I should use?
>





Re: [R] incorrect multiple outputs

2009-12-10 Thread jim holtman
If I read your code right, file.rows is equal to 1 and your 'for' loop will
only iterate once.  Is that what you were expecting?

No reproducible code provided, so that is my best guess.

>file.rows<- c(nrow(file)/288)  # "input_file.txt" contains 288 reformatted lines for each original data file
...
>for (k in 1:file.rows){  # iterates code for each 288 line block of "input_file.txt"
...
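
Since the posted code is incomplete, the following is only a sketch of the
usual pattern: compute one value per 288-line block inside the loop and call
write.table() once, after the loop has finished. The data frame dat, the
column name value and the temporary output file are hypothetical stand-ins
for the poster's objects:

```r
# Hypothetical stand-in for the input: 3 blocks of 288 rows each
set.seed(1)
dat <- data.frame(value = rnorm(3 * 288, mean = 10))

block.size <- 288
n.blocks   <- nrow(dat) %/% block.size   # whole number of blocks
cv         <- numeric(n.blocks)          # preallocate one slot per block

for (k in seq_len(n.blocks)) {
    rows  <- ((k - 1) * block.size + 1):(k * block.size)
    block <- dat$value[rows]
    cv[k] <- 100 * sd(block) / mean(block)
}

# Write once, outside the loop, so the header appears a single time
all.data <- data.frame(block = seq_len(n.blocks), cv = format(cv, digits = 3))
out.file <- tempfile(fileext = ".log")
write.table(all.data, file = out.file, sep = "\t", row.names = FALSE)
```

With append=TRUE, every call adds a header and all rows collected so far,
which would match the repeated output shown in the question if the call is
being reached on each iteration.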

On Thu, Dec 10, 2009 at 11:39 AM, biscuit  wrote:

>
> HI,
> I'm having trouble with a piece of Rscript which keeps outputting
> incorrectly. it's something like this: the code reads in from a file which
> contains (reformated) input
>
> >file<-read.table(file="input_file.txt",sep="\t")[,c(1,3:5)]
> >
> >file.rows<- c(nrow(file)/288)  # "input_file.txt" contains 288 reformatted
> lines for each original data file
> ...
> >for (k in 1:file.rows){  # iterates code for each 288 line block of
> "input_file.txt"
> ...
> >cv[k] <- 100*(sd(x.blank)/mean(x.blank))
> >t[k] <-
> (mean(x.note)-mean(x.blank))/sqrt(((sd(x.note)^2)/8)+((sd(x.blank)^2)/16))
> >t11[k] <-
> (sqrt(8)*(mean(x.note11)-mean(x.blank)))/sqrt(sd(x.note11)^2+sd(x.blank)^2)
> >}
> >
>
> >all.data<-data.frame(barcodes,t=format(as.numeric(t),digits=3),t11=format(as.numeric(t11),digits=3),cv=format(as.numeric(cv),digits=3))
> >write.table(all.data, file=
> "R_drug_plot.log",append=TRUE,sep="\t",row.names=FALSE)
>
> this all works correctly except that I believed it would output to file
> after completing the loop, instead it's writing to file every iteration. so
> the output file looks like:
>
> headers
> a1
> headers
> a1
> a2
> headers
> a1
> a2
> a3
> ...
>
> I have checked the missing sections of code and can confirm there are no
> missing/additional brackets. Has anyone any idea why this is happening and
> what I can do about it?





Re: [R] About R memory management?

2009-12-10 Thread jim holtman
If you really want to code like a C++ coder in R, then create your own
object and extend it when necessary:

# take a variation of this; preallocate and then extend when you reach a limit
x <- numeric(2)
for (i in 1:100){
    if (i > length(x)){
        # double the length (or whatever you want)
        length(x) <- length(x) * 2
    }
    x[i] <- i
}
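
A sketch of the same doubling idea wrapped in a function that trims the
unused tail at the end, for the case where the final length is not known in
advance. The function name grow_vector and the counter used are illustrative
assumptions:

```r
# Hypothetical helper: fill a vector of unknown final size by doubling
grow_vector <- function(n) {
    x    <- numeric(2)   # small initial allocation
    used <- 0            # number of slots actually filled
    for (i in seq_len(n)) {
        if (i > length(x)) {
            length(x) <- length(x) * 2   # new slots are padded with NA
        }
        x[i] <- i
        used <- i
    }
    x[seq_len(used)]     # drop the unused, NA-padded tail
}

result <- grow_vector(100)
length(result)
```

Doubling means the vector is reallocated only about log2(n) times, rather
than once per element as with c(x, i).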

On Thu, Dec 10, 2009 at 11:30 AM, Peng Yu  wrote:

> I have a situation where I cannot predict the final result's dimension.
>
> In C++, I believe that the class valarray could preallocate more
> memory than is actually needed (maybe 2 times more). The runtime
> for a C++ equivalent (using append) to the R code would still be C*n,
> where C is a constant and n is the length of the vector. However, if
> it allocates just enough memory, the run time will be C*n^2.
>
> Based on your reply, I suspect that R doesn't allocate more memory
> than is currently needed, right?
>
> On Fri, Dec 11, 2009 at 11:22 AM, Henrik Bengtsson 
> wrote:
> > Related...
> >
> > Rule of thumb:
> > Pre-allocate your object of the *correct* data type, if you know the
> > final dimensions.
> >
> > /Henrik
> >
> > On Thu, Dec 10, 2009 at 8:26 AM, Peng Yu  wrote:
> >> I'm wondering where I can find the detailed descriptions on R memory
> >> management. Understanding this could help me understand the runtime of
> >> R program. For example, depending on how memory is allocated (either
> >> allocate a chunk of memory that is more than necessary for the current
> >> use, or allocate the memory that is just enough for the current use),
> >> the performance of the following program could be very different.
> >> Could somebody let me know some good references?
> >>
> >> unsorted_index=NULL
> >> for(i in 1:100) {
> >>  unsorted_index=c(unsorted_index, i)
> >> }
> >> unsorted_index
> >>
> >
>





Re: [R] How to draw three line on the same picture ?

2009-12-11 Thread jim holtman
try this:

x <- read.table(textConnection("No   V1  V2 V3
1 0.23 0.12 0.89
2 0.11 0.56 0.12"), header=TRUE)
matplot(x[,1], x[,-1], type='l')


On Fri, Dec 11, 2009 at 3:39 AM, z_axis  wrote:

>
> thanks for your answer ! Would you mind giving me an example using my data
> ?
>
> Sincerely!
>
>
> Patrick Connolly-4 wrote:
> >
> > On Thu, 10-Dec-2009 at 10:14PM -0800, z_axis wrote:
> >
> > |>
> > |> The following is  sampling data:
> > |> No   V1  V2 V3
> > |> 1 0.23 0.12 0.89
> > |> 2 0.11 0;56 0.12
> > |> ...
> > |>
> > |> I just want to draw three lines on same picture according to value of
> > V1, V2
> > |> and V3.
> >
> > ?lines
> >
> >
> > --
> > ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
> >___Patrick Connolly
> >  {~._.~}   Great minds discuss ideas
> >  _( Y )_   Average minds discuss events
> > (:_~*~_:)  Small minds discuss people
> >  (_)-(_). Eleanor Roosevelt
> >
> > ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
> >
> >
>





Re: [R] Recoding factor labels that are lists into first element of list

2009-12-11 Thread jim holtman
try this:

> x <- data.frame(a=c('cat', 'cat,dog', 'dog', 'dog,cat'))
> x
a
1 cat
2 cat,dog
3 dog
4 dog,cat
> levels(x$a)
[1] "cat" "cat,dog" "dog" "dog,cat"
> # change the factors
> x$a <- factor(sapply(strsplit(as.character(x$a), ','), '[[', 1))
> x
a
1 cat
2 cat
3 dog
4 dog
> levels(x$a)
[1] "cat" "dog"
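
The second question (levels with no observations that keep showing up with a
count of 0) is not covered by the transcript above. A minimal sketch of one
common remedy, re-creating the factor from the values that are actually
present; the vector names are illustrative assumptions:

```r
animals <- factor(c("cat", "dog", "bird", "cat"))

# Subsetting a factor keeps all of its original levels
kept <- animals[animals != "bird"]
table(kept)            # 'bird' is still listed, with a count of 0

# Rebuilding the factor drops levels that no longer occur
kept <- factor(kept)
levels(kept)
```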


On Thu, Dec 10, 2009 at 10:53 PM, Jennifer Walsh  wrote:

> Hi all,
>
> I've Googled far and wide but don't think I know the correct terms to
> search for to find an answer.
>
> I have a massive dataset where one of the factors is made up of both
> individual items and lists of items (for example, "cat" and "cat, dog,
> bird"). I would like to recode this factor somehow into only the first
> element of the list (so every list starting with "cat," plus the
> observations that were already just "cat" would all be set equal to "cat").
> I would ideally like to do this in some simple way that does not require me
> to write hundreds of different sets of code (since the lists probably start
> with 300+ different items). Is this possible? Extremely complicated?
>
> Also, I am sure this is much simpler, but I cannot seem to get rid of
> levels of a factor that have no observations. I have tried setting the
> levels of the factor to only the ones with observations that I am interested
> in, but every time I summarize the variable there are still 100+ labels all
> with "0" as their count. This hasn't happened to me before; is there an
> explanation for it?
>
> Thanks very much,
> Jen
>
> ---
> Jennifer Walsh
> Graduate Student, Developmental Psychology
> University of Michigan
> 2020 East Hall, 530 Church St.
> Ann Arbor, MI 48109-1043
>





Re: [R] match problem

2009-12-15 Thread Jim Holtman

?merge
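
A minimal sketch of merge() on the data from the question. The question gives
no column names for df2, so 'b' and 'class' are assumed here so that the key
column matches df1:

```r
df1 <- data.frame(a = 1:5, b = c(2, 3, 2, 2, 1))
df2 <- data.frame(b = 1:3, class = c("class1", "class2", "class3"))

# Join on the shared key column 'b' (like a simple SQL inner join)
newdf <- merge(df1, df2, by = "b")

# merge() sorts by the key, so restore the original order and column layout
newdf <- newdf[order(newdf$a), c("a", "b", "class")]
newdf

# The same lookup with match(), which keeps df1's row order
df1$class <- df2$class[match(df1$b, df2$b)]
```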

What is the problem you are trying to solve?

Sent from my iPhone.

On Dec 15, 2009, at 4:50, "Bunny, lautloscrew.com" > wrote:



Hi all,

I don't know if match is the right approach here. I'd like to match two
data.frames: one big data frame and one small data frame. In SQL, what I
want to do would only be a simple relation.


The first consists of two columns: a) some value, b) some key that is
explained in the other data frame. What I want to do is create
(cbind) both into one data frame like:


df1:

a  b
1  2
2  3
3  2
4  2
5  1

df2:
1 class1
2 class2
3 class3


to newdf:

a b class
1 2 class2
2 3 class3
3 2 class2
...
and so forth

I have connected R to several relational databases, but I don't think
that is necessary here; there should be a simpler solution for
my problem. Maybe some combination of do.call, mapply, lapply or match
can do the job.
Unfortunately I am not really familiar with these and am still trying.


thx in advance for any help.

Best regards

matt



Re: [R] comparison of these types is not implemented

2009-12-15 Thread Jim Holtman

you need:

r_squared[[i]]
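
A short sketch of why the double brackets matter: r_squared[i] is a
one-element list, while r_squared[[i]] is the number stored inside it. Only
the first three values from the question are used here, and the vectorised
alternative at the end is an addition rather than part of the original reply:

```r
r_squared <- list(0.9083936, 0.8871647, 0.8193883)

biggest <- c(0, 0)
for (i in seq_along(r_squared)) {
    # [[i]] extracts the numeric value, so the comparison is defined
    if (r_squared[[i]] > biggest[1]) biggest <- c(r_squared[[i]], i)
}
biggest

# Without a loop: flatten the list and ask for the position of the maximum
r.vec <- unlist(r_squared)
which.max(r.vec)
```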

What is the problem you are trying to solve?

Sent from my iPhone.

On Dec 15, 2009, at 2:29, Tom Pitt  wrote:



Hi All,

Can you tell me why I get the error message below?  It's driving me  
nuts.


Thanks,
Tom


r_squared

[[1]]
[1] 0.9083936

[[2]]
[1] 0.8871647

[[3]]
[1] 0.8193883

[[4]]
[1] 0.728157

[[5]]
[1] 0.8849525

[[6]]
[1] 0.8459416

[[7]]
[1] 0.6702318

[[8]]
[1] 0.02997816

[[9]]
[1] 0.8974268

[[10]]
[1] 0.881217

[[11]]
[1] 0.8006688

[[12]]
[1] 0.7207697

[[13]]
[1] 0.8703734

[[14]]
[1] 0.8384346

[[15]]
[1] 0.6237472



biggest=c(0,0)

for (i in 1:15) {

+
+ if (r_squared[i]>biggest[1]) biggest=c(r_squared[i],i)}
Error in r_squared[i] > biggest[1] :
 comparison of these types is not implemented





Re: [R] Problem with spliting a dataframe values

2009-12-17 Thread jim holtman
Does this do what you want?

> x <- "a,b,c|1,2,3|4,5,6|7,8,8"
> x.1 <- strsplit(x, "[|]")
> x.1
[[1]]
[1] "a,b,c" "1,2,3" "4,5,6" "7,8,8"
> x.2 <- lapply(x.1, strsplit, ',')
> x.2
[[1]]
[[1]][[1]]
[1] "a" "b" "c"
[[1]][[2]]
[1] "1" "2" "3"
[[1]][[3]]
[1] "4" "5" "6"
[[1]][[4]]
[1] "7" "8" "8"

> do.call(rbind, x.2[[1]])
 [,1] [,2] [,3]
[1,] "a"  "b"  "c"
[2,] "1"  "2"  "3"
[3,] "4"  "5"  "6"
[4,] "7"  "8"  "8"
>
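
If a data frame with a, b, c as column names is wanted rather than a
character matrix, a sketch along the same lines is to split on "|" and hand
the pieces to read.table(). Treating the first piece as a header row is an
assumption about the intended result:

```r
x <- "a,b,c|1,2,3|4,5,6|7,8,8"

# Split on the literal "|" (fixed = TRUE avoids regular-expression escaping)
rows <- strsplit(x, "|", fixed = TRUE)[[1]]

# Each element of 'rows' is read as one line of comma-separated text
df <- read.table(text = rows, sep = ",", header = TRUE)
df
```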


On Thu, Dec 17, 2009 at 9:11 AM, venkata kirankumar
wrote:

> Hi all,
> Hi this is kiran
> I am facing a problem to split a dataframe
>
> that is..
>  i have a string like:"a,b,c|1,2,3|4,5,6|7,8,8"
> first I have to split  with respect to   "|"
> I did it with  command
>
> unlist(strsplit("a,b,c|1,2,3|4,5,6|7,8,8", "\\,"))
>
>
> after getting that set i made it as a dataframe and it comes like
>
> a,b,c
> 1,2,3
> 4,5,6
> 7,8,8
>
> now i have to split this dataframe with respect to ","  and i have to get
> it
> like
>
>
> a b c
> 1 2 3
> 4 5 6
> 7 8 8
>
>
> this one i am not able to findout
> can any one help me to get it done
>
> thanks in advance
> kiran
>





Re: [R] to remove an error with log(zero)

2009-12-17 Thread jim holtman
Does this do what you want:

> x
 [1]  1  2  4  0  7  5  0  0  0  9 11 12
> # create a matrix with the first column being a sequence number
> x.mat <- cbind(seq(length(x)), x)
> # remove zeros in second column
> x.mat <- x.mat[x.mat[,2] != 0,]
> x.mat
 x
[1,]  1  1
[2,]  2  2
[3,]  3  4
[4,]  5  7
[5,]  6  5
[6,] 10  9
[7,] 11 11
[8,] 12 12
> # now create an approxfun to interpolate the missing values
> x.fun <- approxfun(x.mat[,1], x.mat[,2])
> # now fill out a new matrix with interpolated values
> x.new <- cbind(seq(length(x)), x.fun(seq(length(x))))
> x.new
      [,1] [,2]
 [1,]    1  1.0
 [2,]    2  2.0
 [3,]    3  4.0
 [4,]    4  5.5
 [5,]    5  7.0
 [6,]    6  5.0
 [7,]    7  6.0
 [8,]    8  7.0
 [9,]    9  8.0
[10,]   10  9.0
[11,]   11 11.0
[12,]   12 12.0
>
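
The same interpolation can be wrapped in a small helper using approx(). This
is a sketch only; the function name fill_zeros is an assumption, and leading
or trailing zeros would come back as NA because approx() does not extrapolate
by default:

```r
# Hypothetical helper: replace zeros by linear interpolation between the
# nearest non-zero neighbours
fill_zeros <- function(x) {
    idx  <- seq_along(x)
    keep <- x != 0
    approx(idx[keep], x[keep], xout = idx)$y
}

x <- c(1, 2, 4, 0, 7, 5, 0, 0, 0, 9, 11, 12)
filled <- fill_zeros(x)
filled
log(filled)   # no zeros are left, so every logarithm is finite
```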


On Thu, Dec 17, 2009 at 8:16 PM, Moohwan Kim  wrote:

> Dear R family
>
> I have an arbitrary column vector.
> 1
> 2
> 4
> 0
> 7
> 5
> 0
> 0
> 0
> 9
> 11
> 12
> When I attempt to take natural logarithm of the series, as you guess
> there is an error message. To overcome this problem, my idea is to
> replace a zero or zeros in a row with appropriate numbers.
> In order to implement it, I need to detect where zeros are.
> Then I am going to take the average of two adjacent neighbors. In the
> case of zeros in a row, I guess I might apply the above idea
> sequentially.
>
> Would you help me out to escape from this jungle?
> Thanks in advance.
>
> Best
> Moohwan
>





Re: [R] some help regarding combining columns from different files

2009-12-18 Thread jim holtman
In your function, you have

 temp <- read.table(fnames,header=T,sep="\t",stringsAsFactors=F,quote="\"")

I think you mean:

 temp <- read.table(i,header=T,sep="\t",stringsAsFactors=F,quote="\"")

Also 'files' is a parameter, but you are using 'fnames' in the 'for' loop;
shouldn't that be 'files'?
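
Putting both corrections together, a sketch of the function might look like
the following. It assumes every file has the same number of rows (otherwise
cbind() will recycle or fail), and it returns the combined matrix instead of
updating global variables:

```r
readGeneSymbol <- function(files, genesymbol.column = 5) {
    # Read one file per iteration and keep only the requested column
    cols <- lapply(files, function(f) {
        temp <- read.table(f, header = TRUE, sep = "\t",
                           stringsAsFactors = FALSE, quote = "\"")
        temp[, genesymbol.column]
    })
    # One column per input file
    do.call(cbind, cols)
}
```

It would then be called as test <- readGeneSymbol(fnames, genesymbol.column=5),
where fnames is the character vector of file names from the question.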

On Thu, Dec 17, 2009 at 3:51 PM, Harikrishnadhar wrote:

> Dear all,
>
> Here is my code which am using to combine 5th column from different data
> sets.
>
> Here is the function  to do my job
>
>
> genesymbol.append.file <-NULL
> gene.column <- NULL
> readGeneSymbol <- function(files,genesymbol.column=5){
> for(i in fnames){
>  temp <- read.table(fnames,header=T,sep="\t",stringsAsFactors=F,quote="\"")
>  gene.column<-cbind(gene.column,temp[,genesymbol.column])
>  genesymbol.append.file$genecolumns <- gene.column
>  genesymbol.append.file
>  }
> }
>
>
>
>
> test <- readGeneSymbol(fnames,genesymbol.column=5)
>
> Here is the warning message  am getting only the 5th column from the first
> column is taken
>
>
> Warning messages:
> 1: In file(file, "r") : only first element of 'description' argument used
> 2: In file(file, "r") : only first element of 'description' argument used
> >
>
> Please help me to solve this
>
>
>
>
>
>
>
> --
> Thanks
> Hari
> 215-385-4122
>
> "If there is anyone out there who still doubts that America is a place
> where
> all things are possible"
>





Re: [R] write.csv and col.names=F

2009-12-18 Thread jim holtman
In R 2.9.2 I get the following warning message when setting col.names=FALSE:

> write.csv(x, '', col.names=FALSE)
"","a","b"
"1",1,1
"2",2,2
"3",3,3
"4",4,4
"5",5,5
"6",6,6
"7",7,7
"8",8,8
"9",9,9
"10",10,10
Warning message:
In write.csv(x, "", col.names = FALSE) : attempt to set 'col.names' ignored

You have to use write.table if you don't want the column names; it is on the
help page:

> write.table(x,sep=',', col.names=FALSE)
"1",1,1
"2",2,2
"3",3,3
"4",4,4
"5",5,5
"6",6,6
"7",7,7
"8",8,8
"9",9,9
"10",10,10
>
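
A minimal sketch writing to a file with neither a header line nor row names.
The small data frame and the temporary file are illustrative assumptions:

```r
x <- data.frame(a = 1:3, b = 4:6)
out <- tempfile(fileext = ".csv")

# sep = "," gives CSV-style output; col.names and row.names are both dropped
write.table(x, file = out, sep = ",", col.names = FALSE, row.names = FALSE)
readLines(out)
```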


On Fri, Dec 18, 2009 at 8:37 AM, Reeyarn_李智洋_10928113 
wrote:

> On Fri, Dec 18, 2009 at 7:52 AM, kayj  wrote:
> >
> > Hi All,
> >
> > I always have a problem with write.csv when I want the column names to be
> > ignored, when I specify col.names=F, I get a header of V1 V2 V3 V4 etc.
> >
>
> I tried that and found the same problem, however, I found
>  write.table(mydata, file="data.csv",col.names=F)
> works.
>
> write.csv calls write.table to save data, is there something wrong with it?
>
> --
> Best Regards,
> Reeyarn T. Lee
>





Re: [R] integer(0) and NA do not equal FALSE

2009-12-19 Thread jim holtman
try using 'grepl'

> if( grepl("hi", "hop", fixed = TRUE) ){
+  print('yes, your substring is in your string')
+ } else print('no, your substring is not in your string')
[1] "no, your substring is not in your string"
>


On Sat, Dec 19, 2009 at 3:47 PM, Jonathan  wrote:

> Hi,
>   A noobie question:  I'm simply trying to run a conditional statement that
> evaluates if a substring is found within a larger string.  I find that if
> it
> IS found, my function returns TRUE (great!), but if not, the condition does
> not evaluate to FALSE.
>
> ex):
>
> if( grep("hi", "hop", fixed = TRUE) )
>  print('yes, your substring is in your string')
> else print('no, your substring is not in your string')
>
> alternatively, I could replace grep with pmatch:
>
> if (pmatch('hi','hop'))
>  print('yes, your substring is in your string')
> else print('no, your substring is not in your string')
>
>
> The first example, using grep, returns logical(0).  The second, using
> pmatch, returns NA.  Any idea how to convert either of those to FALSE, or
> else a different function that would do the trick?
>
> Thanks,
> Jon
>





Re: [R] read.table: mysterious line omissions

2009-12-20 Thread jim holtman
Most likely an unbalanced quote.  Put the following options in the
read.table call:

quote='', comment.char=''
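
A sketch of how this could be checked, using count.fields() to confirm that
every line has the expected number of fields once quoting is switched off.
The small tab-separated file below, with stray apostrophes, is a made-up
stand-in for the poster's file:

```r
# Hypothetical sample: apostrophes would act as quote characters by default
path <- tempfile(fileext = ".txt")
writeLines(c("alpha\tfirst", "it's\tsecond", "gamma\tthird", "dog's\tfourth"),
           path)

# With quoting and comments disabled, each line should have exactly 2 fields
n.fields <- count.fields(path, sep = "\t", quote = "", comment.char = "")
table(n.fields)

temp <- as.matrix(read.table(path, header = FALSE, sep = "\t",
                             quote = "", comment.char = ""))
dim(temp)
```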

On Sat, Dec 19, 2009 at 11:42 PM, Jonathan  wrote:

> Hello again,
> I am simply trying to import a rectangular table of strings.  The
> table's dimensions are 1990 x 2, yet my read.table() command can only find
> 362 of the rows (and they're not the first 362).  I would've taken the time
> to figure out how to use scan, readLines, or some other tool that can read
> in character strings, and then parse and input to a table, but that seems
> like overkill, and probably it would be good to understand what's wrong
> with
> my text file.
>
> The file is here.
>
> https://regtransfers-sth-se.diino.com/download/jonsleepy/_mydropbox_/finalInput.xls
>
> The code is here:
> temp <- as.matrix(read.table('finalInput.xls', header=FALSE, sep = "\t"))
> dim(temp) #expect 1990 x 2; but find 362 x 2
>
> Sorry to require a download (this probably won't make people happy), but
> since my problem is file-specific, the file is needed for troubleshooting.
>
> I generated it with some grep, gawk commands using Cygwin in a Windows
> environment (though subsequently converted it to Windows format - R loads
> it
> exactly the same way, regardless of whether it's in linux or windows
> format)
>
> Regards,
> Jonathan
>





Re: [R] "Object is not a matrix" Error

2009-12-20 Thread jim holtman
Where is the object 'write'?  Shouldn't you be using:

 lm(visits ~ (day.f))
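
A sketch of the corrected call on made-up data. Since the CSV has no column
called 'write', the formula presumably picked up base R's write() function
instead, which would explain the error. Passing the data frame through the
data= argument, rather than attach(), makes such lookups explicit. The
simulated values below are assumptions for illustration only:

```r
# Hypothetical data: four weeks of daily visit counts
set.seed(42)
hits <- data.frame(day    = rep(1:7, times = 4),
                   visits = rpois(28, lambda = 100))

hits$day.f <- factor(hits$day)          # treat day as categorical
fit <- lm(visits ~ day.f, data = hits)  # variables are looked up in 'hits'
coef(fit)
```

The fit has an intercept for day 1 plus one coefficient for each of the other
six days.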


On Sun, Dec 20, 2009 at 5:59 PM, John Paul Telthorst
wrote:

> I'm trying to follow this guide here:
> http://www.ats.ucla.edu/stat/r/modules/dummy_vars.htm
>
> In which I'm creating categorical variables using the factor function.
>
> I am able to go through the example listed above and have everything work,
> however, when I try to input my own numbers, I get an error.  I input the
> following:
>
>
> > hits = read.csv(file.choose())
>
> > attach(hits)
>
> > day.f <- factor(day)
>
> > lm(write ~ (day.f))
>
> lm(write ~ (day.f))
>
> Error in model.frame.default(formula = write ~ (day.f), drop.unused.levels
> =
> > TRUE) :
> >   object is not a matrix
> >
>
> So I import "hits = read.csv(file.choose())" a .csv file, which has the
> columns "visits" and "day" where "visits" is the number of hits to a
> website, and "day" is a number 1-7, for example 1 corresponds to Sunday and
> 7 corresponds to Saturday.  I understand that the day variable needs to be
> a
> categorical variable, and I'm trying to use the factor function to do this.
>  I would like to be able to run a regression that will correlate the day
> with the number of hits.
>
> Any help would be much appreciated.
>





Re: [R] Column naming issues using read.table

2009-12-23 Thread jim holtman
This reads in your posted data:

> x <- read.table(textConnection("Samplerate = 2 samps/sec
+   Nr   Cnt1X   Cnt1Y   Cnt2X   Cnt2Y  sec100  hour
+  1  53  84  43  2   22  12
+  2  90  155 74  0   72  12
+  3  90  155 74  0   121 12"), skip=1, header=TRUE)
> closeAllConnections()
>
> x
  Nr Cnt1X Cnt1Y Cnt2X Cnt2Y sec100 hour
1  1    53    84    43     2     22   12
2  2    90   155    74     0     72   12
3  3    90   155    74     0    121   12
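
To rename the columns while reading from a file, a sketch is to skip both the
sample-rate line and the header line and supply col.names as a character
vector (one element per column, not a single comma-separated string). The
sample file contents below are an assumed whitespace-separated layout; a
comma-separated file would additionally need sep=",":

```r
# Hypothetical sample file mimicking the posted layout
path <- tempfile(fileext = ".csv")
writeLines(c("Samplerate = 2 samps/sec",
             "Nr Cnt1X Cnt1Y Cnt2X Cnt2Y sec100 hour",
             "1 53 84 43 2 22 12",
             "2 90 155 74 0 72 12",
             "3 90 155 74 0 121 12"), path)

new.names <- c("Nr2sec", "Cnt1X", "Cnt1Y", "Cnt2X", "Cnt2Y", "sec100", "hour")

# skip = 2 drops the sample-rate line and the original header line;
# nrows = 381 caps the number of data rows that are read
framename <- read.table(path, skip = 2, header = FALSE,
                        col.names = new.names, nrows = 381)

# row.names = FALSE keeps the extra row-number column out of the output
out <- tempfile(fileext = ".csv")
write.csv(framename, file = out, row.names = FALSE)
```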


On Wed, Dec 23, 2009 at 8:31 PM, arthurbeer01 wrote:

>
> Hi, this is my first post so please be gentle.
> I quite new to R and using it for my biology degree.
>
> My problem is. Im trying to import data from a .csv file using the
> read.table command. The .csv file header starts on row 2 but is contained
> in
> column 1, i have 600 data files and for future ease would rather not edit
> each file seperatly. The data starts on row three and I only need the first
> 381 data points.
>
> The R error message using the code iv got so far is
>
> Error in read.table(file("s1-2c83.csv"), header = FALSE, sep = ",", quote =
> "",  :
>  more columns than column names
>
> The code I have so far is
>
> framename<-read.table(file ("s1-2c83.csv"),
> header = FALSE, # FLASE indicates headers are not included in input file
> sep = ",",# must have "," otherwise errors in table
> quote = "",
> dec = ".",
> row.names = 1, # must = 1 or extra column of row numbering is entered
> col.names = ("Nr2sec,Cnt1X,Cnt1Y,Cnt2X,Cnt2Y,sec100,hour"),
> as.is = FALSE,
> na.strings = "NA",
> colClasses = NULL,
> nrows = 381, # rows to stop data.table recording (not input file row
> number!)
> skip = 2,# number of rows to skp before reading data from input file
> strip.white = FALSE,
> comment.char = "")
>
> write.csv(framename, file = "s1-2c83-ok.csv")
>
> If I delete the line col.names, Iv manged to get the data read and saved to
> a new .csv file but cannot work out how to get the column headers renamed.
> The read.table (framename) displays the headers as v1,v2,v3 etc, this is
> what i cant change. Also it has the first column without a header (i think
> its the row number) which I dont want in the output file
>
> The read data file example s1-2c83.csv
>
> 1:Samplerate = 2 samps/sec
> 2:  Nr   Cnt1X   Cnt1Y   Cnt2X   Cnt2Y  sec100  hour
> 3: 153  84  43  2   22  12
> 4: 290  155 74  0   72  12
> 5: 390  155 74  0   121 12
>
> Any help will be greatly appreciated after the 5hrs Iv spent already on
> this
> problem.
>
> Many thanks in advance
>
>  Adam
>
>
>





Re: [R] by-group processing

2009-05-06 Thread jim holtman
This should do it:

> do.call(rbind, lapply(split(x, x$ID), tail, 1))
         ID Type N
45900 45900    I 7
46550 46550    I 7
49270 49270    E 3
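
To get at the "index" part of the question directly, one possible sketch uses ave(), which hands every row the maximum N of its own ID group; comparing against N then gives a logical index. The data frame below is retyped from the question, and the names 'is.max' and 'result' are just illustrative.

```r
# Data retyped from the question (row names are not preserved here)
data <- data.frame(
  ID   = c(rep(45900, 7), rep(49270, 3), rep(46550, 7)),
  Type = c("A", "B", "C", "D", "E", "F", "I",
           "A", "B", "E",
           "A", "B", "C", "D", "E", "F", "I"),
  N    = c(1:7, 1:3, 1:7)
)

# ave() returns, for each row, the maximum N within that row's ID
is.max <- data$N == ave(data$N, data$ID, FUN = max)

# logical index used to subscript the original data frame
result <- data[is.max, ]
print(result)
```

Unlike the split/tail approach, this does not rely on the rows being sorted by N within each ID, and which(is.max) gives the numeric row positions if those are needed.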


On Wed, May 6, 2009 at 6:09 PM, Max Webber  wrote:

> Given a dataframe like
>
>  > data
>        ID Type N
>  1  45900    A 1
>  2  45900    B 2
>  3  45900    C 3
>  4  45900    D 4
>  5  45900    E 5
>  6  45900    F 6
>  7  45900    I 7
>  8  49270    A 1
>  9  49270    B 2
>  10 49270    E 3
>  18 46550    A 1
>  19 46550    B 2
>  20 46550    C 3
>  21 46550    D 4
>  22 46550    E 5
>  23 46550    F 6
>  24 46550    I 7
>  >
>
> containing an identifier (ID), a variable type code (Type), and
> a running count of the number of records per ID (N), how can I
> return a dataframe of only those records with the maximum value
> of N for each ID? For instance,
>
>  > data
>        ID Type N
>  7  45900    I 7
>  10 49270    E 3
>  24 46550    I 7
>
> I know that I can use
>
>   > tapply ( data $ N , data $ ID , max )
>   45900 46550 49270
>   7 7 3
>   >
>
> to get the values of the maximum N for each ID, but how is it
> that I can find the index of these values to subsequently use to
> subscript data?
>
>
> --
> maxine-webber
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] convert large integers to hex

2009-05-06 Thread jim holtman
You can use the 'bc' command (use Cygwin if on Windows):

/cygdrive/c: bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
x=6595137340052185552
obase=16
x
5B86A277DEB9A1D0

You can call this from R.
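
If shelling out to bc is not convenient, the same conversion can be sketched in plain R by doing long division by 16 on the decimal digits. 'dec2hex' is a hypothetical helper name (not an existing R function), and it assumes the input is a non-negative integer held as a character string.

```r
# Convert an arbitrarily long non-negative decimal string to hexadecimal
# by repeated long division by 16 on the vector of decimal digits.
dec2hex <- function(s) {
  digits <- as.integer(strsplit(s, "")[[1]])
  hexchars <- c(0:9, LETTERS[1:6])   # "0".."9", "A".."F"
  hex <- character(0)
  while (length(digits) > 0) {
    rem <- 0L
    quot <- integer(length(digits))
    for (i in seq_along(digits)) {
      cur <- rem * 10L + digits[i]   # never exceeds 159, so no overflow
      quot[i] <- cur %/% 16L
      rem <- cur %% 16L
    }
    hex <- c(hexchars[rem + 1L], hex)  # remainders arrive low digit first
    nz <- which(quot != 0L)
    # drop leading zeros of the quotient; stop once it is zero
    digits <- if (length(nz) > 0) quot[nz[1]:length(quot)] else integer(0)
  }
  paste(hex, collapse = "")
}

dec2hex("6595137340052185552")   # "5B86A277DEB9A1D0", same as the bc output
```

If bc is installed and on the PATH, something along the lines of system("bc", input = c("obase=16", x), intern = TRUE) should return the same string, but that depends on the local setup.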

On Wed, May 6, 2009 at 3:26 PM, Sundar Dorai-Raj wrote:

> Hi,
>
> I'm wondering if someone has solved the problem of converting very
> large integers to hex. I know about format.hexmode and as.hexmode, but
> these rely on integers. The numbers I'm working with are overflowing
> and losing precision. Here's an example:
>
> x <- "6595137340052185552" # stored as character
> as.integer(x) # warning about inaccurate conversion
> format.hexmode(as.numeric(x)) # warnings about loss of precision
> as.hexmode(x) # more warnings and does not do what I expected
>
> I'm planning on writing a function that will do this, but would like
> to know if anybody already has a solution. Basically, I would like the
> functionality of format.hexmode on arbitrarily large integers.
>
> Thanks,
>
> --sundar
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Matching multiple columns in a data frame

2009-05-07 Thread jim holtman
?merge

> merge(A,B)
  C1  C2
1  A 200
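
To see why the two separate %in% tests are too loose, here is a small sketch. The rows are taken from A and B in the question, except the last row of A (A, 1000), which is a made-up row added only to show the false match. The 'key' helper is hypothetical and assumes "|" never occurs in the data.

```r
A <- data.frame(C1 = c("F", "I", "A", "A", "A"),
                C2 = c(1500, 200, 500, 200, 1000))   # last row is hypothetical
B <- data.frame(C1 = c("A", "A", "A", "B", "B", "C"),
                C2 = c(200, 600, 1500, 100, 1000, 5000))

# Each column is tested on its own, so (A, 1000) passes:
# "A" occurs somewhere in B$C1 and 1000 occurs somewhere in B$C2
loose <- A$C1 %in% B$C1 & A$C2 %in% B$C2

# Pasting the columns into one key tests the pair as a unit
key <- function(d) paste(d$C1, d$C2, sep = "|")
strict <- key(A) %in% key(B)

A[loose, ]    # two rows, one of them wrong
A[strict, ]   # only the true match (A, 200)
merge(A, B)   # same single row as the strict index
```

The paste approach extends to any number of columns and keeps A's row names, which merge() does not.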


On Thu, May 7, 2009 at 2:19 AM, Raghavan, Nandini [PRDUS] <
nragh...@its.jnj.com> wrote:

> Hello,
>
>
>
> I am trying to extract a subset of a dataframe A (2 columns) by
> extracting all entries in A (several repeated entries) that match
> dataframe B in both columns.  For example, part of A and B are shown
> below.
>
> The following does not seem to work correctly. This only seems to select
> on the first component and all instances of the second.
>
> ind <- A$C1 %in% B[,1] & A$C2 %in% B[,2]
>
> Any suggestions as to how to do this in general (even for matches in
> multiple columns) would be appreciated.
>
>
>
> Regards,
>
> Nandini
>
>
>
>
>
> A:
>
>    C1   C2
> 1   F 1500
> 2   P  120
> 4   F  250
> 5   I  200
> 6   D 2010
> 7   F 1000
> 8   V    0
> 9   F 2100
> 10  F  500
> 11  E 1800
> 12  A  500
> 13  V    0
> 14  I  125
> 15  I   30
> 16  M  300
> 17  D   75
> 18  V  500
> 19  A  200
> 20  M 1000
> 21  P  225
>
> B:
>
>    C1   C2
> 1   A  200
> 2   A  600
> 3   A 1500
> 4   B  100
> 5   B 1000
> 6   C 5000
> 7   C  225
> 8   C  150
> 9   C  150
> 10  C  200
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] creation of a matrix

2009-05-08 Thread jim holtman
Is this what you want:

> n <- 1000  # assumed sample size; 'n' was not defined in the original post
> x <- data.frame(n=sample(10, n, TRUE), text=sample(LETTERS, n, TRUE))
> table(x$text, x$n)

     1  2  3  4  5  6  7  8  9 10
  A  6  5  2  0  8  1  5  3  6  4
  B  2  2  5  2  2  7  5  4  4  5
  C  7  4  6  4  3  6  3  6  5  4
  D  9  5  1  6  3  1  3  2  6  3
  E  2  6  4  3  3  5  2  7  6  3
  F  6  5  3  5  3  5  1  2  2 10
  G  4  4  2  5  5  3  2  7  3  3
  H  4  4  4  5  3  3  3  6  3  4
  I  9  3  6  1  4  4  3  4  3  4
  J  4  7  3  4  3  3  4  1  2  5
  K  2  5  5  3  3  6  9  6  5  3
  L  3  3  5  4  3  3  3  3  5  5
  M  3  9  2  3  2  0  2  3  5  6
  N  4  1  0  5  8  4  4  3  6  2
  O  3  4  3  4  8  4  2  5  5  4
  P  3  6  2  6  4  4  3  4  3  6
  Q  5  2  2  5  3  3  0  2  5  4
  R  1  5  6  4  5  4  2  2  4  4
  S  6  2  4  2  1  7  0  1  1  2
  T  4  3  1  7  2  3  4  1  8  1
  U  4  5 11  8  3  2  5  3  4  5
  V  6  3  1  1  1  0  2  5  5  3
  W  3  5  1  4  4  5  6  3  4  2
  X  5  4  3  5  5  6  3  3  3  6
  Y  6  6  6  3  2  1  3  4  4  1
  Z  3  6  1  5  6  1  8  1  3  4
>
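
A smaller, deterministic sketch of the same idea, with objects as rows and text strings as columns (the orientation asked for in the question). The vectors 'obj' and 'text' are made-up illustration data.

```r
obj  <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3)
text <- c("a", "b", "a", "c", "c", "a", "b", "b", "c", "a")

# rows = objects, columns = unique text strings, cells = counts
counts <- table(obj, text)
print(counts)

# drop the "table" class if a plain integer matrix is wanted
m <- unclass(counts)
```

Swapping the two arguments of table() transposes the result, which is the only difference from the call shown above.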


On Fri, May 8, 2009 at 4:48 AM, Erika Ahl  wrote:

> Hi all,
>
> I have a relative large amount (several thousand rows, but a small
> amount of unique objects) of data in a format like this:
>
> 1   text_string
> 1   text_string
> 1   text_string
> 2   text_string
> 2   text_string
> 3   text_string
> 3   text_string
> 3   text_string
> 3   text_string
> 3   text_string
> .
> .
> .
> n   text_string
>
> I want to create an n x p matrix, n objects (=40) and p unique text
> strings. Nij is number of occurrences of a text string j in object i.
>
>
> What is the most efficient way of creating this matrix?
>
> Best regards,
>
> Erika Ahl
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] extending strsplit to handle missing text that doesn't have the target on which to split

2009-05-08 Thread jim holtman
Find the values that are missing a comma and add it:

> dat <- c("Tue, 15 Nov 2005 09:44:50 EST",
+ "15 Nov 2005 09:10:00 +0100",
+ "Tue, 15 Nov 2005 09:44:50 EST",
+ "Tue, 15 Nov 2005 16:29:57 +",
+ "Wed, 16 Nov 2005 07:00:45 EST",
+ "Wed, 16 Nov 2005 05:28:00 -0800",
+ "Wed, 16 Nov 2005 14:06:21 +",
+ "15 Nov 2005 09:10:00 +0100")
> # add comma if missing
> missing <- !grepl(',', dat)
> dat[missing] <- paste('', dat[missing], sep=',')
> tmp.dat.data <- matrix(unlist(strsplit(dat,",")),ncol = 2, byrow = TRUE)
>
> tmp.dat.data
     [,1]  [,2]
[1,] "Tue" " 15 Nov 2005 09:44:50 EST"
[2,] ""    "15 Nov 2005 09:10:00 +0100"
[3,] "Tue" " 15 Nov 2005 09:44:50 EST"
[4,] "Tue" " 15 Nov 2005 16:29:57 +"
[5,] "Wed" " 16 Nov 2005 07:00:45 EST"
[6,] "Wed" " 16 Nov 2005 05:28:00 -0800"
[7,] "Wed" " 16 Nov 2005 14:06:21 +"
[8,] ""    "15 Nov 2005 09:10:00 +0100"
>
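
Another way to sketch it is to split first and then pad any short pieces, which is closer to the "padding variant of strsplit" asked about. The two strings are taken from the example data; 'padded' and 'm' are illustrative names.

```r
dat <- c("Tue, 15 Nov 2005 09:44:50 EST",
         "15 Nov 2005 09:10:00 +0100")

parts <- strsplit(dat, ",")

# entries without a comma come back with length 1; put "" in front of them
padded <- lapply(parts, function(p) if (length(p) < 2) c("", p) else p)

# one row per input string, so rows can no longer shift
m <- do.call(rbind, padded)
print(m)
```

Because each element is padded before the matrix is built, the number of rows always equals length(dat).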


On Thu, May 7, 2009 at 9:30 AM, Chris Evans  wrote:

> I am sure there is an obvious answer to this that I'm missing but I
> can't find it.  I'm parsing headers of Emails and most have a date like
> this:
>   "Wed, 16 Nov 2005 05:28:00 -0800"
> and I can parse that using:
>
> tmp.dat.data <- matrix(unlist(strsplit(headers$Date.line,",")),
>ncol = 2, byrow = TRUE)
> before going on to look at the day and date/time data.
>
> However, a very few headers I want to parse are missing the initial day
> of the week and look like this:
>   "15 Nov 2005 09:10:00 +0100"
>
> That means that my use of strsplit() results in that date/time part
> being all of the item in the list for those entries so the effect of
> matrix(unlist()) is to pull the next list entry "up" in the matrix.
> Because I happened to have only two errant entries I didn't see what was
> happening for a moment. (An odd number gives a warning message about
> dimensions not fitting but an even number has silently moved things
> up/left so doesn't: no quarrel with that from me, my stupidity that I
> was slow to see what was happening!)
>
> I'm sure I should be able to find a simple way to get around this but at
> the moment I can't.
>
> Here's a simple, reproducible example:
>
> dat <- c("Tue, 15 Nov 2005 09:44:50 EST",
> "15 Nov 2005 09:10:00 +0100",
> "Tue, 15 Nov 2005 09:44:50 EST",
> "Tue, 15 Nov 2005 16:29:57 +",
> "Wed, 16 Nov 2005 07:00:45 EST",
> "Wed, 16 Nov 2005 05:28:00 -0800",
> "Wed, 16 Nov 2005 14:06:21 +",
> "15 Nov 2005 09:10:00 +0100")
> tmp.dat.data <- matrix(unlist(strsplit(dat,",")),ncol = 2, byrow = TRUE)
>
>
> tmp.dat.data comes out as a 7x2 matrix contents:
>
> [,1]  [,2]
> [1,] "Tue" " 15 Nov 2005 09:44:50 EST"
> [2,] "15 Nov 2005 09:10:00 +0100"  "Tue"
> [3,] " 15 Nov 2005 09:44:50 EST"   "Tue"
> [4,] " 15 Nov 2005 16:29:57 +" "Wed"
> [5,] " 16 Nov 2005 07:00:45 EST"   "Wed"
> [6,] " 16 Nov 2005 05:28:00 -0800" "Wed"
> [7,] " 16 Nov 2005 14:06:21 +" "15 Nov 2005 09:10:00 +0100"
>
> I'd like an 8x2 matrix with tmp.dat.data[2,1] == "" and
> tmp.dat.data[8,1] == ""
>
> I'm sure there must be a simple way to achieve this by rolling a
> slightly different variant of strsplit that pads things and then
> applying that to the input vector but I'm failing to see how to do this
> at the moment.
>
> TIA,
>
> Chris
>
> --
> Applied researcher, neither statistician nor programmer!
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] sscanf

2009-05-08 Thread jim holtman
You can always use regular expressions:

> x <- "Condition: 311"
> as.integer(sub(".*?(\\d+).*", "\\1", x, perl=TRUE))
[1] 311
>
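
A variation on the same theme, as a sketch: regexpr() locates the first run of digits and regmatches() extracts it, so nothing has to be said about the text around the number.

```r
x <- "Condition: 311"

# first run of digits in the string, converted to a number
num <- as.numeric(regmatches(x, regexpr("[0-9]+", x)))
num   # 311
```

Strings with no digits at all are silently dropped by regmatches(), so the result can be shorter than the input when this is applied to a vector.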


On Fri, May 8, 2009 at 10:16 AM, Matthias Gondan wrote:

> Dear list,
>
> Apparently, there is no function like sscanf in R.
>
> I have a string, "Condition: 311", and I would like
> to read out the number and store it to a numeric
> variable. Is there an easy way to do this?
>
> Best wishes,
>
> Matthias
> --
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Reading large files quickly

2009-05-09 Thread jim holtman
First, 'wc' and 'readLines' are doing vastly different things.  'wc' is just
reading through the file without having to allocate memory to it;
'readLines' is actually storing the data in memory.

I have a 150MB file I was trying it on, and here is what 'wc' did on my
Windows system:

/cygdrive/c: time wc tempxx.txt
  1055808  13718468 151012320 tempxx.txt
real    0m2.343s
user    0m1.702s
sys     0m0.436s
/cygdrive/c:

If I multiply that by 25 to extrapolate to a 3.5GB file, it should take
a little less than one minute to process on my relatively slow laptop.

'readLines' on the same file takes:

> system.time(x <- readLines('/tempxx.txt'))
   user  system elapsed
  37.82    0.47   39.23

If I extrapolate that to 3.5GB, it would take about 16 minutes.  Now
considering that I only have 2GB on my system, I would not be able to read
the whole file in at once.

You never did specify what type of system you were running on and how much
memory you had.  Were you 'paging' due to lack of memory?

> system.time(x <- readLines('/tempxx.txt'))
   user  system elapsed
  37.82    0.47   39.23
> object.size(x)
84814016 bytes
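
When the file does not fit in memory, one way to sketch chunked reading is to keep a connection open and call readLines() with 'n' repeatedly. The temporary file and the chunk size of 300 lines are made up for the illustration; a real run would use a much larger chunk.

```r
# build a small stand-in file
tmp <- tempfile()
writeLines(sprintf("line %d", 1:1000), tmp)

con <- file(tmp, open = "r")
total <- 0
repeat {
  chunk <- readLines(con, n = 300)   # next 300 lines, fewer at the end
  if (length(chunk) == 0) break      # nothing left to read
  total <- total + length(chunk)     # process the chunk here
}
close(con)
unlink(tmp)

total   # 1000
```

Keeping the connection open matters: calling readLines() on the file name each time would start from the top of the file again.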



On Sat, May 9, 2009 at 12:25 PM, Rob Steele wrote:

> I'm finding that readLines() and read.fwf() take nearly two hours to
> work through a 3.5 GB file, even when reading in large (100 MB) chunks.
>  The unix command wc by contrast processes the same file in three
> minutes.  Is there a faster way to read files in R?
>
> Thanks!
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Generating a "conditional time" variable

2009-05-09 Thread jim holtman
Here is yet another way of doing it (always the case in R):

# Simulated data frame: year from 1990 to 2003, for 5 different ids, each
# having one or two eif "events"
test <- data.frame(year=rep(1990:2003,5), id=gl(5,length(1990:2003)),
    eif=as.vector(sapply(1:5, function(z){
        a <- rep(0, length(1990:2003))
        a[sample(1:length(1990:2003), sample(1:2,1))] <- 1
        a
    })))

# partition by 'id' and then by 'eif' changes
test.new <- do.call(rbind, lapply(split(test, test$id), function(.id){
    # now by 'eif' changes
    do.call(rbind, lapply(split(.id, cumsum(.id$eif)), function(.eif){
        # create new dataframe with column
        cbind(.eif, conditional_time=seq(nrow(.eif)))
    }))
}))
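
For the second request in the question (drop every id-year after the first event), a possible sketch counts the events seen in earlier rows of the same id and keeps only the rows where that count is still zero. The small data frame is made up, and it is assumed to be sorted by year within id.

```r
test <- data.frame(
  year = rep(1990:1994, 2),
  id   = rep(c(1010, 2010), each = 5),
  eif  = c(0, 1, 0, 1, 0,
           0, 0, 0, 1, 0)
)

# number of events in earlier rows of the same id (the event row itself
# still counts as "before", so it is kept)
events.before <- ave(test$eif, test$id,
                     FUN = function(e) cumsum(c(0, head(e, -1))))

first.spell <- test[events.before == 0, ]
print(first.spell)
```

This is vectorised within each id, so it should scale much better than a row-by-row loop, though I have not timed it on 1.9 million rows.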



On Sat, May 9, 2009 at 1:40 PM, Vincent Arel-Bundock  wrote:

>  Hi everyone,
>
> Please forgive me if my question is simple and my code terrible, I'm new to
> R. I am not looking for a ready-made answer, but I would really appreciate
> it if someone could share conceptual hints for programming, or point me
> toward an R function/package that could speed up my processing time.
>
> Thanks a lot for your help!
>
> ##
>
> My dataframe includes the variables 'year', 'id', and 'eif' and has +/- 1.9
> million id-year observations
>
> I would like to do 2 things:
>
> -1- I want to create a 'conditional_time' variable, which increases in
> increments of 1 every year, but which resets during year(t) if event 'eif'
> occurred for this 'id' at year(t-1). It should also reset when we switch to
> a new 'id'. For example:
>
> dataframe = test
> year   id    eif  conditional_time
>
> 1990   1010  0    1
> 1991   1010  0    2
> 1992   1010  1    3
> 1993   1010  0    1
> 1994   1010  0    2
> 1995   1010  0    3
> 1996   1010  0    4
> 1997   1010  1    5
> 1998   1010  0    1
> 1999   1010  0    2
> 2000   1010  0    3
> 2001   1010  0    4
> 2002   1010  0    5
> 2003   1010  0    6
> 1990   2010  0    1
> 1991   2010  0    2
> 1992   2010  0    3
> 1993   2010  0    4
> 1994   2010  0    5
> 1995   2010  0    6
> 1996   2010  0    7
> 1997   2010  0    8
> 1998   2010  0    9
> 1999   2010  0    10
> 2000   2010  0    11
> 2001   2010  1    12
> 2002   2010  0    1
> 2003   2010  0    2
>
> -2- In a copy of the original dataframe, drop all id-year rows that
> correspond to years after a given id has experienced his first 'eif' event.
>
> I have written the code below to take care of -1-, but it is incredibly
> inefficient. Given the size of my database, and considering how slow my
> computer is, I don't think it's practical to use it. Also, it depends on
> correct sorting of the dataframe, which might generate errors.
>
> ##
>
> for (i in 1:nrow(test)) {
>if (i == 1) {# If first id-year
>cond_time <- 1
>test[i, 4] <- cond_time
>
>} else if ((test[i-1, 1]) != (test[i, 4])) { # If new id
>cond_time <- 1
>test[i, 4] <- cond_time
> } else {# Same id as previous row
>if (test[i, 3] == 0) {
>test[i, 4] <- sum(cond_time, 1)
>cond_time <- test[i, 6]
>} else {
>test[i, 4] <- sum(cond_time, 1)
>cond_time <- 0
>}
>}
> }
>
> --
> Vincent Arel
> M.A. Student, McGill University
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Generating a "conditional time" variable

2009-05-09 Thread jim holtman
Corrected version.  I forgot that the count had to change 'after' eif==1:

# Simulated data frame: year from 1990 to 2003, for 5 different ids, each
# having one or two eif "events"
test <- data.frame(year=rep(1990:2003,5), id=gl(5,length(1990:2003)),
    eif=as.vector(sapply(1:5, function(z){
        a <- rep(0, length(1990:2003))
        a[sample(1:length(1990:2003), sample(1:2,1))] <- 1
        a
    })))
# partition by 'id' and then by 'eif' changes
test.new <- do.call(rbind, lapply(split(test, test$id), function(.id){
    # now by 'eif' changes
    do.call(rbind, lapply(split(.id, cumsum(c(0, diff(.id$eif) == -1))),
        function(.eif){
            cbind(.eif, conditional_time=seq(nrow(.eif)))
        }))
}))
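
With 1.9 million rows the nested split/rbind may still be slow, so here is a sketch of a vectorised alternative using ave(). A "spell" number is the count of events in earlier rows of the same id, and the counter is simply the position within each id/spell group. The data are made up and assumed sorted by year within id. Note that this also resets after two events in consecutive years, which the diff() == -1 test above would not catch.

```r
test <- data.frame(
  year = rep(1990:1996, 2),
  id   = rep(c(1010, 2010), each = 7),
  eif  = c(0, 0, 1, 0, 0, 1, 0,
           0, 0, 0, 0, 1, 0, 0)
)

# spell number: how many events happened in earlier rows of this id
spell <- ave(test$eif, test$id,
             FUN = function(e) cumsum(c(0, head(e, -1))))

# position within each (id, spell) group is the conditional time
test$conditional_time <- ave(test$eif, test$id, spell, FUN = seq_along)
print(test)
```

The counter restarts at 1 in the row after each event and at the first row of every id, as in the table from the question.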






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Reading large files quickly

2009-05-09 Thread jim holtman
Since you are reading it in chunks, I assume that you are writing out each
segment as you read it in.  How are you writing it out to save it?  Is the
time you are quoting both the reading and the writing?  If so, can you break
down the differences in what these operations are taking?

How do you plan to use the data?  Is it all numeric?  Are you keeping it in
a dataframe?  Have you considered using 'scan' to read in the data and to
specify what the columns are?  If you would like some more help, the answers
to these questions will be useful.
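
As a sketch of what is meant by using 'scan' with the column types spelled out: the 'what' argument takes a list giving one type per column, so no type guessing is needed. The three-column file below is made up for the illustration.

```r
# small whitespace-separated stand-in file
tmp <- tempfile()
writeLines(c("1 2.5 a", "2 3.5 b", "3 4.5 c"), tmp)

# one list element per column: integer, numeric, character
cols <- scan(tmp,
             what = list(id = integer(), value = numeric(),
                         label = character()),
             quiet = TRUE)
unlink(tmp)

str(cols)                                             # list of three vectors
df <- as.data.frame(cols, stringsAsFactors = FALSE)   # if a data frame is wanted
```

scan() also accepts 'nlines' and 'skip', or an open connection, so the same call can be used chunk by chunk.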

On Sat, May 9, 2009 at 10:09 PM, Rob Steele wrote:

> Thanks guys, good suggestions.  To clarify, I'm running on a fast
> multi-core server with 16 GB RAM under 64 bit CentOS 5 and R 2.8.1.
> Paging shouldn't be an issue since I'm reading in chunks and not trying
> to store the whole file in memory at once.  Thanks again.
>
> Rob Steele wrote:
> > I'm finding that readLines() and read.fwf() take nearly two hours to
> > work through a 3.5 GB file, even when reading in large (100 MB) chunks.
> >  The unix command wc by contrast processes the same file in three
> > minutes.  Is there a faster way to read files in R?
> >
> > Thanks!
>  >
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] aggregate over x cases

2009-05-11 Thread jim holtman
Here is a way of doing it:

> x
   block trial   x   y
1      1     1 605 150
2      1     1 603 148
3      1     1 604 140
4      1     1 600 140
5      1     1 590 135
6      1     1 580 135
7      1     2 607 148
8      1     2 605 152
10     1     2 600 158
> do.call(rbind, lapply(split(x, list(x$block, x$trial), drop=TRUE), head, 2))
      block trial   x   y
1.1.1     1     1 605 150
1.1.2     1     1 603 148
1.2.7     1     2 607 148
1.2.8     1     2 605 152
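
An alternative sketch that avoids splitting the data frame: number the rows within each block/trial combination with ave() and keep the ones numbered 1 or 2. The data frame is retyped from the question (default row names, so the last row is 9 rather than 10); 'rank.in.trial' is an illustrative name.

```r
x <- data.frame(
  block = 1,
  trial = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  x     = c(605, 603, 604, 600, 590, 580, 607, 605, 600),
  y     = c(150, 148, 140, 140, 135, 135, 148, 152, 158)
)

# running row number within each block/trial combination
rank.in.trial <- ave(seq_len(nrow(x)), x$block, x$trial, FUN = seq_along)

first2 <- x[rank.in.trial <= 2, ]
print(first2)
```

This keeps the original row order and row names, and changing the 2 gives the first x rows per trial.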


On Mon, May 11, 2009 at 7:49 AM, Jens Bölte wrote:

> Hello,
>
> I have been struggling for quite some time to find a solution for the
> following problem. I have a data frame which is organized by block and
> trial. Each trial is represented across several rows in this data frame. I'd
> like to extract the first x rows per trial and block.
>
> For example
>block   trial   x   y
> 1   1   1   605 150
> 2   1   1   603 148
> 3   1   1   604 140
> 4   1   1   600 140
> 5   1   1   590 135
> 6   1   1   580 135
> 7   1   2   607 148
> 8   1   2   605 152
> 10  1   2   600 158
> .
>
> Selecting only the first two rows per trial should result in
> block trial x y
> 1   1   605 150
> 1   1   603 148
> 1   2   607 148
> 1   2   605 152
>
> The data I am dealing with are x-y coordinates (samples) from an eye-tracking
> experiment. I receive the data in this format and need to eliminate unwanted
> samples.
>
> Thanks Jens Bölte
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] readBin: read from defined offset TO defined offset?

2009-05-11 Thread jim holtman
Can you be more specific about how you want to "define the endpoint of that
read"?  What criteria do you want to use?  Can you read in a block and
then search for the pattern?
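
If the endpoint is simply a known byte offset, one way to sketch it is to turn the two offsets into a byte count and pass that as 'n' to readBin. The file contents and the offsets 'from' and 'to' are made up; offsets are taken as zero-based and inclusive.

```r
# stand-in binary file holding the bytes 0x00 .. 0xff
tmp <- tempfile()
writeBin(as.raw(0:255), tmp)

from <- 10   # first byte wanted (zero-based offset)
to   <- 19   # last byte wanted (inclusive)

con <- file(tmp, open = "rb")
invisible(seek(con, where = from, origin = "start"))
bytes <- readBin(con, what = "raw", n = to - from + 1)
close(con)
unlink(tmp)

bytes   # 0a 0b 0c ... 13
```

If the endpoint is a pattern rather than an offset, the same read can fetch a generous block of raw bytes and the block can then be searched before being trimmed.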

On Mon, May 11, 2009 at 7:05 AM, Johannes Graumann  wrote:

> Hello,
>
> With the help of "seek" I can start "readBin" from any byte offset within
> my
> file that I deem appropriate.
> What I would like to do is to be able to define the endpoint of that read
> as
> well. Is there any solution to that already out there?
>
> Thanks for any hints, Joh
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Removing any text beginning with...

2009-05-11 Thread jim holtman
Is this what you want (using regular expressions):

> x <- "ENSG /// ENSGy /// ENSG"
> sub("^([[:alpha:]]+).*", "\\1", x)
[1] "ENSG"
>
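
Real identifiers will contain digits as well as letters, so a sketch that keys on the " /// " separator instead of on the characters of the ID may be more robust. The string is the one from the example above, used as-is.

```r
x <- "ENSG /// ENSGy /// ENSG"

# drop everything from the first " ///" onwards
first <- sub(" ///.*$", "", x)

# or split on the separator and take the first piece
first2 <- strsplit(x, " /// ", fixed = TRUE)[[1]][1]

first    # "ENSG"
first2   # "ENSG"
```

In a regular expression ".*" plays the role of the Excel "*" wildcard mentioned in the question.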



On Mon, May 11, 2009 at 9:01 AM, Amélie Baud  wrote:

> Hi !
>
> From an Ensembl annotation like ENSG /// ENSGy /// ENSG, I am
> trying to keep only the first part: ENSG. I wasn't able to find any
> helpful information about how to do it. Could you help me with that please ?
> Is the use of the equivalent to the Excel * (any text) a good way of doing
> it and how ?
> Your help will be very much appreciated.
>
> Amelie
>
>
>
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Looking for a quick way to combine rows in a matrix

2009-05-11 Thread jim holtman
Try this:

> key <- rownames(a)
> key[key == "AT"] <- "TA"
> do.call(rbind, by(a, key, colSums))
   V2 V3 V4 V5
AA  1  5  9 13
TA  5 13 21 29
TT  4  8 12 16
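
Since 'a' is a numeric matrix, rowsum() is another option worth sketching: it adds up the rows that share a group label and returns a matrix. Here "TA" is relabelled as "AT" so the combined row carries the name used in the question.

```r
a <- matrix(1:16, nrow = 4)
rownames(a) <- c("AA", "AT", "TA", "TT")

key <- rownames(a)
key[key == "TA"] <- "AT"   # treat TA as AT

# sum the rows of 'a' within each key
combined <- rowsum(a, group = key)
print(combined)
```

rowsum() is implemented in compiled code, so it should be quick even for large tables.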


On Mon, May 11, 2009 at 4:53 PM, Crosby, Jacy R
wrote:

> I'm working with genotype data in a frequency table:
>
> > a=matrix(1:16, nrow=4)
> > rownames(a)=c("AA","AT","TA","TT")
> > a
>    [,1] [,2] [,3] [,4]
> AA    1    5    9   13
> AT    2    6   10   14
> TA    3    7   11   15
> TT    4    8   12   16
>
> 'AT' and 'TA' are essentially the same, and I'd like to combine (add) the
> rows to reflect this. The final matrix should be:
>
>    [,1] [,2] [,3] [,4]
> AA    1    5    9   13
> AT    5   13   21   29
> TT    4    8   12   16
>
> Is there a fast way to do this?
>
> Thanks in advance!
>
> Jacy Crosby
> jacy.r.cro...@uth.tmc.edu
>
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] how to take away the same varible when I use "merge"

2009-05-12 Thread jim holtman
Can you provide commented, minimal, self-contained, reproducible code?
You can check out 'duplicated' to remove duplicates.
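
Without an example this is a guess at what is wanted, but if the aim is to keep only one copy of the columns that appear in both tables, one sketch is to drop them from the second table before merging. The data frames 'left' and 'right' and the key column 'id' are made up.

```r
left  <- data.frame(id = 1:3, age = c(30, 40, 50), score  = c(1.5, 2.5, 3.5))
right <- data.frame(id = 1:3, age = c(30, 40, 50), height = c(170, 180, 190))

# columns present in both tables, other than the merge key
shared <- setdiff(intersect(names(left), names(right)), "id")

# drop them from 'right', then merge on the key only
merged <- merge(left, right[, !(names(right) %in% shared), drop = FALSE],
                by = "id")
print(merged)
```

If both copies are merged by "id" alone without dropping anything, merge() keeps both and renames them with the suffixes ".x" and ".y".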

On Tue, May 12, 2009 at 7:06 AM, Xin Shi  wrote:

> Dear:
>
>
>
> I am trying to merge two tables by a common variable. However, there are a
> few same variables which are in both of two tables. How can I take them
> away
> when I merge the two tables?
>
>
>
> Thanks!
>
>
>
> Xin
>
>
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] (no subject)

2009-05-12 Thread jim holtman
?curve

Just create an R expression for the equation and then plot it.  I am not
sure exactly what your expression is supposed to be.
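
As a rough sketch, and assuming the garbled symbol in the quoted formula is the gamma function (that reading is a guess, not something the original message confirms), the expression could be wrapped in a function and handed to curve(). The name f and the plotting range are arbitrary choices:

```r
# Assumption: the unreadable symbol in the question is the gamma function.
f <- function(n) {
  # lgamma() is used so that large n does not overflow gamma()
  sqrt(2 / (n - 1)) * exp(lgamma(n / 2) - lgamma((n - 1) / 2))
}

# curve() accepts a function name and evaluates it over the given range
curve(f, from = 2, to = 50, xlab = "n", ylab = "f(n)")
```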

On Tue, May 12, 2009 at 10:22 PM, Debbie Zhang wrote:

>
>
> Dear R users,
>
> Does anyone know how to graph the function below?
>
> sqrt(2)Γ(n/2)/[sqrt(n - 1)Γ((n - 1)/2)]
>
> Please help.
>
> Debbie
>


Re: [R] Dates and arrays

2009-05-13 Thread jim holtman
On Wed, May 13, 2009 at 4:23 PM, myshare  wrote:

> hi,
>
> I have a data frame with a date column and some other columns.
> My first question is: what is the fastest way to get the index of an
> array element if I know its value? E.g.
>
> > x = c(4,5,6,7,8)
>
> so I know the value is 6, i.e. the index is 3. What I currently do is
> loop over the array; I was wondering if there
> is a faster, more direct way.


which(x == 6)

will give you the index.
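
For comparison, a small sketch of which() next to match(), which may be handier when only the first position is wanted or when several values are looked up at once:

```r
x <- c(4, 5, 6, 7, 8)

which(x == 6)          # every position where x equals 6
match(6, x)            # first position only; NA when the value is absent
match(c(8, 4, 10), x)  # several lookups in one call
```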

>
> The next one: I have a data frame in which one of the columns is date-based
> (stored as string). As you may have guessed,
> I have the date and I want to find the index ;), but here is one more
> complication.
> The dates are not sequential; only dates when the day is Mon-Fri are stored,
> i.e. for Sat and Sun I don't store information.
>
> So I first have to convert the date I have into the closest Monday.
> Let me give you one example. Let's say I have the date 2000/01/01 (Sat);
> to be able to find any information I have to find the nearest
> Monday, in this case 2000/01/03 (Mon).
> So now that I have this new date I can find the index of the element
> in the array where it is stored, and from this I can get the real data
> I need.
> In short, the conversion is Date ==> nearest Monday ==> index of the
> element in the array where it is stored.


Here is a way of adjusting a date to the nearest Monday if it is a weekend:

> x <- seq(as.Date('2009-05-01'), by='1 day', length=30)
> x
 [1] "2009-05-01" "2009-05-02" "2009-05-03" "2009-05-04" "2009-05-05" "2009-05-06" "2009-05-07"
 [8] "2009-05-08" "2009-05-09" "2009-05-10" "2009-05-11" "2009-05-12" "2009-05-13" "2009-05-14"
[15] "2009-05-15" "2009-05-16" "2009-05-17" "2009-05-18" "2009-05-19" "2009-05-20" "2009-05-21"
[22] "2009-05-22" "2009-05-23" "2009-05-24" "2009-05-25" "2009-05-26" "2009-05-27" "2009-05-28"
[29] "2009-05-29" "2009-05-30"
> x.new <- x + ifelse(weekdays(x) == "Saturday", 2,
+                     ifelse(weekdays(x) == "Sunday", 1, 0))
> x.new
 [1] "2009-05-01" "2009-05-04" "2009-05-04" "2009-05-04" "2009-05-05" "2009-05-06" "2009-05-07"
 [8] "2009-05-08" "2009-05-11" "2009-05-11" "2009-05-11" "2009-05-12" "2009-05-13" "2009-05-14"
[15] "2009-05-15" "2009-05-18" "2009-05-18" "2009-05-18" "2009-05-19" "2009-05-20" "2009-05-21"
[22] "2009-05-22" "2009-05-25" "2009-05-25" "2009-05-25" "2009-05-26" "2009-05-27" "2009-05-28"
[29] "2009-05-29" "2009-06-01"
>
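
To cover the whole chain asked about (date, then following Monday, then row index), here is a sketch under some assumptions: the data frame df, its columns and the helper to_monday are hypothetical, and the weekday is taken from format(d, "%u") rather than weekdays() so the result should not depend on the locale's day names.

```r
# Hypothetical data frame holding weekday-only dates stored as strings
df <- data.frame(date = c("2000/01/03", "2000/01/04", "2000/01/05"),
                 value = c(10.5, 11.2, 9.8),
                 stringsAsFactors = FALSE)

to_monday <- function(d) {
  # "%u" is the ISO weekday number: 1 = Monday ... 6 = Saturday, 7 = Sunday
  wd <- as.integer(format(d, "%u"))
  # Saturday moves forward 2 days, Sunday 1 day, weekdays stay put
  d + c(0, 0, 0, 0, 0, 2, 1)[wd]
}

target <- to_monday(as.Date("2000/01/01", format = "%Y/%m/%d"))
idx <- match(target, as.Date(df$date, format = "%Y/%m/%d"))
df[idx, ]
```

match() returns NA when the adjusted date is not in the table (a holiday, say), which is worth checking before indexing.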

>
>
> thank you very much
>


Re: [R] specify the number of decimal numbers

2009-05-14 Thread jim holtman
Depending on what you want to do, use 'sprintf':

> x <- 1.23456789
> x
[1] 1.234568
> as.character(x)
[1] "1.23456789"
> sprintf("%.1f  %.3f  %.5f", x,x,x)
[1] "1.2  1.235  1.23457"
>
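
A short sketch of the distinction that usually matters here: round() and signif() return numbers that can still be used in computation, while sprintf() returns text meant for display.

```r
x <- 1.23456789

round(x, 3)         # numeric, rounded to 3 decimal places
signif(x, 3)        # numeric, rounded to 3 significant digits
sprintf("%.3f", x)  # character string, for printing only

# The rounded value is still a number, the formatted one is not
is.numeric(round(x, 3))
is.character(sprintf("%.3f", x))
```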


On Thu, May 14, 2009 at 7:40 AM, lehe  wrote:

>
> Hi,
> I was wondering how to specify the number of decimal numbers in my
> computation using R? I have too many decimal numbers for my result, when I
> convert them to string with as.character, the string will be too long.
> Thanks and regards!


Re: [R] specify the number of decimal numbers

2009-05-14 Thread jim holtman
It all depends on what you want to do with the result.  Here are some
variations:

> x <- matrix(runif(16), 4)
> x
          [,1]      [,2]       [,3]      [,4]
[1,] 0.2655087 0.2016819 0.62911404 0.6870228
[2,] 0.3721239 0.8983897 0.06178627 0.3841037
[3,] 0.5728534 0.9446753 0.20597457 0.7698414
[4,] 0.9082078 0.6607978 0.17655675 0.4976992
> x[] <- sprintf("%.3f", x)
> x
     [,1]    [,2]    [,3]    [,4]
[1,] "0.266" "0.202" "0.629" "0.687"
[2,] "0.372" "0.898" "0.062" "0.384"
[3,] "0.573" "0.945" "0.206" "0.770"
[4,] "0.908" "0.661" "0.177" "0.498"
> print(x, quote=FALSE)
     [,1]  [,2]  [,3]  [,4]
[1,] 0.266 0.202 0.629 0.687
[2,] 0.372 0.898 0.062 0.384
[3,] 0.573 0.945 0.206 0.770
[4,] 0.908 0.661 0.177 0.498

> x <- matrix(runif(16), 4)
> signif(x,3)
      [,1]  [,2]   [,3]  [,4]
[1,] 0.718 0.935 0.2670 0.870
[2,] 0.992 0.212 0.3860 0.340
[3,] 0.380 0.652 0.0134 0.482
[4,] 0.777 0.126 0.3820 0.600
>
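
Another variation, sketched here with made-up names (m, rounded, shown): round() keeps the matrix numeric for further computation, and format() turns it into a character matrix for display while keeping its dimensions, so no element-by-element loop is needed.

```r
set.seed(1)
m <- matrix(runif(16), 4)

rounded <- round(m, 3)                # still a numeric 4 x 4 matrix
shown <- format(rounded, nsmall = 3)  # character matrix, dimensions kept
print(shown, quote = FALSE)
```

The numeric version is the one to carry forward in calculations; the character version is only for reports.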

Can you specify what you want and how you are going to use it?  Is it for
generating a report?

On Thu, May 14, 2009 at 8:03 AM, lehe  wrote:

>
> Thanks!
> In my case, I need to deal with a lot of such results, e.g. elements in a
> matrix. If using sprintf, does it mean I have to apply to each result
> individually? Is it possible to do it in a single command?
>
>


Re: [R] Duplicates and duplicated

2009-05-14 Thread jim holtman
Don't think I have seen this one come across:

> x <- c(1,2,3,2,4,4,6,1)
> duplicated(x) | duplicated(x, fromLast=TRUE)
[1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE
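
Applied to the vector from the original question, the same idea can be wrapped in a small helper (the name all_dups is just for illustration):

```r
# TRUE for every element that occurs more than once, including the first copy
all_dups <- function(x) duplicated(x) | duplicated(x, fromLast = TRUE)

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
rbind(x, dup = all_dups(x))
which(all_dups(x))   # positions of all members of duplicated groups
```

No sorting is needed, so the result lines up with the original order of x.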


On Thu, May 14, 2009 at 12:09 PM, Bert Gunter wrote:

> ... or, similar in character to Gabor's solution:
>
> tbl <- table(x)
> (tbl[as.character(sort(x))]>1)+0
>
>
> Bert Gunter
> Nonclinical Biostatistics
> 467-7374
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On
> Behalf Of Gabor Grothendieck
> Sent: Thursday, May 14, 2009 7:34 AM
> To: christiaan pauw
> Cc: r-help@r-project.org
> Subject: Re: [R] Duplicates and duplicated
>
> Noting that:
>
> > ave(x, x, FUN = length) > 1
>  [1] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
>
> try this:
>
> > rbind(x, dup = ave(x, x, FUN = length) > 1)
>     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> x      1    2    3    4    4    5    6    7    8     9
> dup    0    0    0    1    1    0    0    0    0     0
>
>
> On Thu, May 14, 2009 at 2:16 AM, christiaan pauw  wrote:
> > Hi everybody.
> > I want to identify not only duplicate number but also the original number
> > that has been duplicated.
> > Example:
> > x=c(1,2,3,4,4,5,6,7,8,9)
> > y=duplicated(x)
> > rbind(x,y)
> >
> > gives:
> >   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> > x    1    2    3    4    4    5    6    7    8     9
> > y    0    0    0    0    1    0    0    0    0     0
> >
> > i.e. the second 4 [,5] is a duplicate.
> >
> > What I want is the first and second 4. i.e [,4] and [,5] to be TRUE
> >
> >   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> > x    1    2    3    4    4    5    6    7    8     9
> > y    0    0    0    1    1    0    0    0    0     0
> >
> > I assume it can be done by sorting the vector and then checking if the
> > next or the previous entry matches using
> > identical(). I am just unsure how to write such a loop, the logic of
> > which (I think) is as follows:
> >
> > sort x
> > for every value of x check if the next value is identical and return TRUE
> > (or 1) if it is and FALSE (or 0) if it is not
> > AND
> > check if the previous value is identical and return TRUE (or 1) if it is
> > and FALSE (or 0) if it is not
> >
> > Am I thinking correctly, and can someone help me write such a function?
> >
> > regards
> > Christiaan
> >


Re: [R] Importing data into R and combining 2 files

2009-05-14 Thread jim holtman
What have you tried?  Check the Intro manual for hints.

?read.table   probably using sep='\t'
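
A minimal sketch of what that could look like. Everything about the files is assumed here: the temporary files stand in for the real ones, the description file is taken to have 6 introductory lines followed by name/type/label rows, and the column names are invented.

```r
# Hypothetical stand-ins for the two tab-separated files
data_file <- tempfile(fileext = ".txt")
meta_file <- tempfile(fileext = ".txt")
writeLines(c("1\t2\t3-5\t02/04/06",
             "4\t5\t1-2\t03/05/06"), data_file)
writeLines(c(paste("intro line", 1:6),
             "id\tinteger\tSubject id",
             "score\tinteger\tTest score",
             "range\tcharacter\tAge range",
             "visit\tcharacter\tVisit date"), meta_file)

# sep = "\t" splits on tabs, so each field becomes its own column
dat <- read.table(data_file, sep = "\t", stringsAsFactors = FALSE)

# skip = 6 jumps over the introductory lines of the description file
meta <- read.table(meta_file, sep = "\t", skip = 6, stringsAsFactors = FALSE,
                   col.names = c("name", "type", "label"))

# Use the names from the description file as column names of the data
names(dat) <- meta$name
str(dat)
```

Mixed fields such as 3-5 or 02/04/06 come in as character and can be converted afterwards (for example with as.Date() and an explicit format).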

On Thu, May 14, 2009 at 1:30 PM, Sunita22  wrote:

>
> Hello
>
> I have to import 2 txt files into R. 1 file contains the data and the other
> contains the header, column headings, datatypes and labels for the data.
>
> I have 2 problems:
>
> 1) My data file has mixed types of data, e.g. 1 2 3 4 5 3-5 02/04/06 3 4 5
> and so on; the data file is tab separated. When I import it, the data is
> stored in one single variable, say V1. I need to separate it into rows and
> columns. How do I do this? Which commands in R would be useful for this?
>
> 2) The other file is also tab separated. The first 6 lines contain the header
> and introduction, i.e. the name of the dataset, year, etc., and then come the
> column names, their datatypes and labels. After importing, the data in this
> file also gets stored in one single variable. I need to separate it into rows
> and columns. How do I do this? Which commands in R would be useful for this?
>
> Thank you in advance
>
> Regards
> Sunita


Re: [R] Output of binary representation

2009-05-17 Thread jim holtman
Are you looking for how the floating point is represented in the IEEE-754
format?  If so, you can use writeBin:

> writeBin(pi,raw(),endian='big')
[1] 40 09 21 fb 54 44 2d 18
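
To go from those bytes to the individual bits, one option is rawToBits(), which is available in current versions of R (it may not exist in older ones, so treat this as a sketch). The helper name double_bits is made up, and it is written for a single double only:

```r
double_bits <- function(x) {
  # 8 bytes of the IEEE-754 double, most significant byte first
  bytes <- writeBin(x, raw(), endian = "big")
  # rawToBits() lists each byte's bits least significant first, so put one
  # byte per column and reverse the rows to get the usual reading order
  bits <- matrix(as.integer(rawToBits(bytes)), nrow = 8)[8:1, , drop = FALSE]
  paste(bits, collapse = "")
}

b <- double_bits(pi)
# Split into the three IEEE-754 fields: 1 sign, 11 exponent, 52 mantissa bits
c(sign = substr(b, 1, 1),
  exponent = substr(b, 2, 12),
  mantissa = substr(b, 13, 64))
```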


On Sun, May 17, 2009 at 1:23 PM, Ted Harding
wrote:

> I am interested in studying the binary representation of numerics
> (doubles) in R, so am looking for possibilities of output of the
> internal binary representations. sprintf() with format "a" or "A"
> is halfway there:
>
>  sprintf("%A",pi)
> # [1] "0X1.921FB54442D18P+1"
>
> but it is in hex.
>
> The following illustrate the sort of thing I want:
>
> 1.1001 0010 0001 1111 1011 0101 0100 0100 0100 0010 1101 0001 1000
> times 2
>
> 11.0010 0100 0011 1111 0110 1010 1000 1000 1000 0101 1010 0011 000
>
> 0.1100 1001 0000 1111 1101 1010 1010 0010 0010 0001 0110 1000 1100 0
> times 4
>
> (without the spaces -- only put in above for clarity).
>
> While I could take the original output "0X1.921FB54442D18P+1" from
> sprintf() and parse it out into binary using gsub() or the like,
> or submit it to, say, an 'awk' script via an external file, this would
> be a tedious business!
>
> Is there some function already in R which outputs the bits in the
> binary representation directly?
>
> I see that David Hinds asked a similar question on 17 Aug 2005:
> "Raw data type transformations"
>
>  http://finzi.psych.upenn.edu/R/Rhelp02/archive/59900.html
>
> (without, apparently, getting any response -- at any rate within
> the following 3 months).
>
> With thanks for any suggestions,
> Ted.
>
> 
> E-Mail: (Ted Harding) 
> Fax-to-email: +44 (0)870 094 0861
> Date: 17-May-09   Time: 18:23:49
> -- XFMail --
>


Re: [R] Simple plotting errors

2009-05-18 Thread jim holtman
One way is to create a list of the dataframes and then use 'sapply' to
extract the values:

df.list <- list(FeketeJAN, ..., FeketeDEC)
plot(sapply(df.list, function(a) a["AMAZON", "SUM_"]))
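
A runnable version of the same idea, as a sketch: the twelve data frames are faked here with random numbers (only their names and the AMAZON/SUM_ layout follow the question), and mget() is used to collect them by name instead of typing all twelve.

```r
# Hypothetical stand-ins for FeketeJAN ... FeketeDEC
set.seed(42)
months <- toupper(month.abb)
for (m in months) {
  assign(paste0("Fekete", m),
         data.frame(MEAN = runif(2), SUM_ = runif(2, 100, 70000),
                    row.names = c("AMAZON", "NILE")))
}

# Collect the data frames into a named list, in calendar order
df.list <- mget(paste0("Fekete", months))

# One value per month: the SUM_ entry of the AMAZON row
amazon <- sapply(df.list, function(a) a["AMAZON", "SUM_"])

plot(seq_along(amazon), amazon, type = "l", xaxt = "n",
     xlab = "Month", ylab = "Amazon SUM_")
axis(1, at = seq_along(amazon), labels = months)
```

The original calls failed because plot() takes one x vector and one y vector; passing twelve separate numbers makes it read the later ones as other arguments (such as log).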



On Mon, May 18, 2009 at 7:17 AM, Steve Murray wrote:

>
> Dear R Users,
>
> I have 12 data frames, each of 12 rows and 2 columns.
>
> e.g. FeketeJAN
>                    MEAN    SUM_
> AMAZON      144.4997874 68348.4
> NILE          5.4701955  1394.9
> CONGO        71.3670036 21196.0
> MISSISSIPPI  18.9273250  6511.0
> AMUR          1.8426874   466.2
> PARANA       58.3835497 13486.6
> YENISEI       1.4668313   592.6
> OB            1.4239179   559.6
> LENA          0.9342164   387.7
> NIGER         4.7245709   826.8
> ZAMBEZI      76.6893794  8665.9
> YANGTZE      10.6759257  1729.5
>
>
> I want to do a line plot of the value of Amazon 'Sum' (in this case,
> 68348.4) for each of the 12 data frames. I've tried doing this as follows:
>
> plot(FeketeJAN[1,2], FeketeFEB[1,2], FeketeMAR[1,2], *through to December*
> type="l")
>
> but receive: Error in strsplit(log, NULL) : non-character argument
>
>
> I've also tried:
>
> plot(FeketeJAN$AMAZON[,2], FeketeFEB$AMAZON[,2], *through to December*
> type="l")
>
> but receive:
>
> Error in plot.window(...) : need finite 'xlim' values
> In addition: Warning messages:
> 1: In min(x) : no non-missing arguments to min; returning Inf
> 2: In max(x) : no non-missing arguments to max; returning -Inf
> 3: In min(x) : no non-missing arguments to min; returning Inf
> 4: In max(x) : no non-missing arguments to max; returning -Inf
>
>
> What is it that I'm doing wrong?!
>
> Many thanks for any advice,
>
> Steve
>
>
>


Re: [R] Split data frame based on Class

2009-05-18 Thread jim holtman
?split

new.df <- split(old.df, old.df$Class)

will create a list of dataframes split by Class
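
A small sketch with invented data, showing how the pieces can be pulled out of the list afterwards:

```r
# Hypothetical data frame with a Class column
old.df <- data.frame(Class = c("US", "UK", "US", "FR"),
                     value = 1:4,
                     stringsAsFactors = FALSE)

new.df <- split(old.df, old.df$Class)
names(new.df)             # one list element per class

dataUS <- new.df[["US"]]  # the rows where Class is "US"
dataUS
```

Keeping the pieces in the list is often more convenient than creating one variable per class, since lapply() can then process all of them.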

On Mon, May 18, 2009 at 7:23 AM, Chris Arthur wrote:

> Each row of my data frame is assigned to a class (eg country). Can you
> suggest how I break apart the data frame so that I create new data frames
> for each class
>
> eg
>
> If Class = "US" put in new dataframe dataUS
>
> Thanks in advance for your help
>
> Chris
>


Re: [R] Parsing configuration files

2009-05-18 Thread jim holtman
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
and provide commented, minimal, self-contained, reproducible code.

There are regular expressions that can be used.  It is very dependent upon
the format of a configuration file; an example would help to show the way.
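
Since no example file was given, the following is only a sketch for one common layout: "key = value" lines with # comments. The sample lines and the function name parse_config are invented. For files laid out as "key: value", the base function read.dcf() may be worth a look instead.

```r
# Hypothetical config contents; in practice these would come from readLines()
cfg_lines <- c("# hypothetical config", "name = demo", "",
               "threads=4", "path = /tmp/out")

parse_config <- function(lines) {
  lines <- trimws(lines)
  # Drop blank lines, comments, and anything without an '=' sign
  lines <- lines[nzchar(lines) & !grepl("^#", lines) & grepl("=", lines)]
  key   <- trimws(sub("=.*$", "", lines))     # text before the first '='
  value <- trimws(sub("^[^=]*=", "", lines))  # text after the first '='
  setNames(as.list(value), key)
}

cfg <- parse_config(cfg_lines)
str(cfg)
```

All values come back as strings, so numeric settings need as.numeric() or similar.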

On Mon, May 18, 2009 at 6:10 AM, Marie Sivertsen wrote:

> Dear list,
>
> Is there any functionality in R that would allow me to parse config files?
> I have tried ??config and apropos('config') without success, and also searched
> the R package site.
>
> Mvh.
> Marie
>


Re: [R] error in importing text files

2009-05-18 Thread jim holtman
>> each question, and three additional data points of no interest.  The data
>> are arranged in an unstacked (long) text file such that each line contains
>> all of the above information and there are 34 (32 responses plus 2 extra
>> lines of meaningless data) lines per measurement occasion (up to 850 lines
>> of data if all 34 lines are present
>> for all 25 measurement occasions). Below is an example of how the data are
>> arranged.
>>
>> 20080204131646 23256063  6 0
>> ""
>> 20080204131646 233152-1  7 0
>> ""
>> 20080204150043 2-32767  0    0 65535
>> ""
>> 20080204182117 2 1283-1  7 0
>> ""
>> 20080204182117 2 283834  6 0
>> ""
>> 20080204182117 2 326636  6 0
>> ""
>> Year/Month/Day/Time  Palm ID  Response/Q#  Latency  Response  3
>> meaningless columns
>>
>> The dataset presented above begins with question 32
>> of one measurement occasion on February 4, 2008 taken at 13:16:46.  The
>> next line (33) is in the datafile because participants had to click a
>> button to exit the measurement occasion.  You then see the beginning of
>> another measurement occasion (20080204192117) in which the participant did
>> not respond (-32767).  The next measurement occasion begins on the next
>> line which actually starts with response 2 because participants were
>> required to read a screen and click through prior to answering any
>> questions.  Thus, anytime participants simply read an instruction page
>> responses are coded as a -1.  What I would like to do is write code to
>> automatically import these 107 files into R and structure them
>> appropriately while importing them.  Furthermore, I would like the
>> code to use conditional statements so that whenever it encounters a -32767
>> it inserts 32 variables (columns) with missing data and whenever it
>> encounters a -1 it deletes that column altogether.  I would also like
>> the code to separate the combined year/month/day/time column into 4 separate
>> columns (year, month, day, time).  Finally, I would like the code to stack
>> the 32 responses during each measurement occasion so that I have 32
>> columns of responses plus columns for year, month, day, and latency, but
>> leave each measurement occasion unstacked.
>>
>> Thanks!
>>
>> Eric S McKibben
>> Industrial-Organizational Psychology Graduate Student
>> Clemson University
>> Clemson, SC


Re: [R] how to calculate means of matrix elements

2009-05-18 Thread jim holtman
You can convert it to an array and then use apply:

> mat1
     [,1] [,2] [,3] [,4]
[1,]    3    2   12    4
[2,]   14   13   13    2
[3,]   15    9    6    9
[4,]    2   15   13   19
> mat2
     [,1] [,2] [,3] [,4]
[1,]    0   11   10    7
[2,]   12    9    3   13
[3,]   -4   13    0   14
[4,]   -2    0   -4   -1
> mat3
     [,1] [,2] [,3] [,4]
[1,]   20    6   16   23
[2,]   24    8   11   12
[3,]   15   13    6   16
[4,]    5   22   20   25
>
> x <- array(c(mat1,mat2,mat3), dim=c(4,4,3))
> apply(x,c(1,2),mean)
      [,1]  [,2]  [,3] [,4]
[1,]  7.67  6.33 12.67 11.3
[2,] 16.67 10.00  9.00  9.0
[3,]  8.67 11.67  4.00 13.0
[4,]  1.67 12.33  9.67 14.3
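
For the part of the question about matrices stored in separate text files, here is a sketch. The files are created on the fly with made-up contents, and a plain whitespace-separated layout without headers is assumed. Reduce() adds the matrices element by element, so dividing by the number of matrices gives the per-position mean.

```r
# Hypothetical matrices written to temporary text files
mats <- list(matrix(1:16, 4), matrix(16:1, 4), matrix(2, 4, 4))
files <- vapply(mats, function(m) {
  f <- tempfile(fileext = ".txt")
  write.table(m, f, row.names = FALSE, col.names = FALSE)
  f
}, character(1))

# Read each file back as a plain numeric matrix
loaded <- lapply(files, function(f) unname(as.matrix(read.table(f))))

# Element-wise sum of all matrices, divided by how many there are
avg <- Reduce(`+`, loaded) / length(loaded)
avg
```

This gives the same result as the array/apply approach and works for any number of files.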


On Mon, May 18, 2009 at 8:40 PM, dxc13  wrote:

>
> useR's,
> I have several matrices of size 4x4 that I want to calculate means of their
> respective positions with.  For example, consider I have 3 matrices given
> by
> the code:
> mat1 <- matrix(sample(1:20,16,replace=T),4,4)
> mat2 <- matrix(sample(-5:15,16,replace=T),4,4)
> mat3 <- matrix(sample(5:25,16,replace=T),4,4)
>
> The result I want is one matrix of size 4x4 in which position [1,1] is the
> mean of position [1,1] of the given three matrices.  The same goes for all
> other positions of the matrix.  If these three matrices are given in
> separate text files, how can I write code that will get this result I need?
>
> Thanks in advance,
> dxc13


Re: [R] how to copy files from one direction to another?

2009-05-19 Thread jim holtman
?file.copy
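
A sketch using temporary directories in place of c:\\ and d:\\ (the file names are invented). When the destination is an existing directory, file.copy() copies each listed file into it and returns one TRUE/FALSE per file.

```r
# Hypothetical source and destination directories
src <- tempfile("src_")
dst <- tempfile("dst_")
dir.create(src)
dir.create(dst)
file.create(file.path(src, paste0("file", 1:10, ".txt")))

# Pick the three files to copy and send them to the destination directory
to_copy <- file.path(src, c("file1.txt", "file2.txt", "file3.txt"))
ok <- file.copy(to_copy, dst)

ok               # TRUE for each file that was copied
list.files(dst)
```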

On Tue, May 19, 2009 at 9:51 PM, XinMeng  wrote:

> There's 10 files in c:\\
> I wanna copy 3 of them to d:\\
>
> How to do it via R?
>
>
> Thanks!
>


Re: [R] Replace / swap values of subset of a data.frame

2009-05-19 Thread jim holtman
Exactly what are you trying to do?  Are you trying to just change a subset
of the values?  'subset' does not have an 'assignment' operator.  Maybe you
want something like this (but it is not clear from your description).  Also,
it is not clear if you have exactly the same set of matching values in the
two data frames for the subset conditions.  If you do, then this might work:

data1[(data1$Subject==25) & (data1$Session==1), 22] <-
data2[(data2$Subject==25)&(data2$Session==1), 23]
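
If the aim is a true swap of the two columns for the selected rows, a single assignment with a logical row index might do it without needing the copy in data2. This is a sketch on a small invented data frame, where columns "a" and "b" stand in for columns 22 and 23:

```r
# Hypothetical data; 'a' and 'b' play the role of columns 22 and 23
data1 <- data.frame(Subject = c(24, 25, 25),
                    Session = c(1, 1, 2),
                    a = c(1, 2, 3),
                    b = c(10, 20, 30))

rows <- data1$Subject == 25 & data1$Session == 1
cols <- c("a", "b")

# The right-hand side is evaluated first, so both columns are swapped at once
data1[rows, cols] <- data1[rows, rev(cols)]
data1
```

With the real data, cols would be c(22, 23) or the corresponding column names.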

On Tue, May 19, 2009 at 3:50 PM, tsunhin wong  wrote:

> Dear R users,
>
> I have 1 data.frame of 1500x80 - data1. I found out that there are a
> few cells of data that I have misplaced, and I need to fix their
> ordering.
> In an attempt to swap columns 22 & 23 of the Subject with
> misplaced data, I did the following:
> > data2 <- data1
> > subset(data1,(Subject==25 & Session==1))[,22] <-
> subset(data2,(Subject==25 & Session==1))[,23]
> > (error messages... "Could not find function "subset<-")
> > subset(data1,(Subject==25 & Session==1))[,23] <-
> subset(data2,(Subject==25 & Session==1))[,22]
> > (error messages... "Could not find function "subset<-")
>
> Please, please point me to some ways to achieve the swapping.
> Thanks a lot!
>
> Cheers,
>
> John
>


Re: [R] Too large a data set to be handled by R?

2009-05-20 Thread jim holtman
If your 1500 X 2 matrix is all numeric, it should take up about 240MB of
memory.  That should easily fit within the 2GB of your laptop and still
leave room for several copies that might arise during the processing.
Exactly what are you going to be doing with the data?  A lot will depend on
the functions/procedures that you will be calling, or the type of
transformations you might be doing.
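
The estimate comes from 8 bytes per numeric (double) value. As a sketch of the arithmetic: the dimensions below are hypothetical, chosen only so that the total matches the 240MB figure, since the sizes in this archived copy of the message appear truncated.

```r
# Hypothetical dimensions giving 30 million cells
nr <- 15000
nc <- 2000

bytes <- nr * nc * 8   # a double takes 8 bytes
bytes / 1e6            # size in megabytes

# Checking the 8-bytes-per-cell rule on a small matrix
small <- matrix(0, 150, 200)
as.numeric(object.size(small)) / length(small)  # about 8, plus a small header
```

Many operations make temporary copies, so the working memory needed can be a few times the size of the object itself.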

On Tue, May 19, 2009 at 11:59 PM, tsunhin wong  wrote:

> Dear R users,
>
> I have been using a dynamic data extraction from raw files strategy at
> the moment, but it takes a long, long time.
> In order to save time, I am planning to generate a data set of size
> 1500 x 2 with each data point a 9-digit decimal number.
> I know R is limited to 2^31-1 and that my data set is not going to
> exceed this limit. But my laptop only has 2 GB and is running 32-bit
> Windows / XP or Vista.
>
> I ran into R memory problems before. Please let me know your
> opinion according to your experience.
> Thanks a lot!
>
> - John
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] turning off specific types of warnings

2009-05-20 Thread jim holtman
?suppressWarnings
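
suppressWarnings() silences every warning raised inside the call. To drop only one specific warning and let the rest through, one possible approach is withCallingHandlers() with the "muffleWarning" restart. The sketch below is not tied to the Design package: noisy() and muffle_matching() are hypothetical names, and noisy() merely stands in for a call that raises the survest warning plus another one.

```r
# Hypothetical stand-in for code that raises an unwanted warning
# as well as one we still want to see
noisy <- function() {
  warning("S.E. and confidence intervals are approximate except at predictor means.")
  warning("something else went wrong")
  42
}

# Evaluate 'expr', muffling only warnings whose message matches 'pattern';
# non-matching warnings are passed on as usual
muffle_matching <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w))) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Only the second warning should be reported here
result <- muffle_matching(noisy(), "confidence intervals are approximate")
```

Matching on the message text is fragile if the package changes its wording, so keep the pattern short and distinctive.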

On Wed, May 20, 2009 at 8:10 AM, Eleni Rapsomaniki
wrote:

> Dear R users,
>
> I have a long function that among other things uses the "survest" function
> from the Design package. This function generates the warning:
>
> In survest.cph (...)
>  S.E. and confidence intervals are approximate except at predictor means.
> Use cph(...,x=T,y=T) (and don't use linear.predictors=) for better
> estimates.
>
> I would like to turn this specific warning off, as it makes it difficult to
> detect other (potentially more crucial) warnings generated by other parts of
> my code.
>
> Is there a way to do this?
>
> Eleni Rapsomaniki
>
> Research Associate
> Strangeways Research Laboratory
> Department of Public Health and Primary Care
>
> University of Cambridge
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] efficiency when processing ordered data frames

2009-05-20 Thread jim holtman
How much time is the selection process currently costing you?
Is it having a large impact on your program? Is it the part that is really
consuming the overall time?  What is your concern in this area? Here is the
time it takes to select, from 10M values, those that are less than a
specific value.  This takes about 0.2 seconds or less:

> x <- runif(1e7)
> system.time(y <- x < .5)
   user  system elapsed
   0.15    0.05    0.20
> x <- sort(x)
> system.time(y <- x < .5)
   user  system elapsed
   0.11    0.03    0.14
>
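
If you do want to exploit the sort order, one option is findInterval(), which locates a value in a non-decreasing vector by binary search. This is a sketch under assumptions: the data frame A and the cutoff below are made up for illustration, and as far as I know findInterval() first checks that the vector is sorted, so the call as a whole still touches every element.

```r
set.seed(1)

# Hypothetical data frame standing in for 'A'; col1 plays the role of an
# ascending time stamp
A <- data.frame(col1 = sort(runif(1e5)), col2 = rnorm(1e5))
cutoff <- 0.5

# Number of leading rows with col1 <= cutoff
k <- findInterval(cutoff, A$col1)

# Because col1 is ascending, those rows are simply the first k
selected <- A[seq_len(k), ]
```

Given the timings above, the plain vectorized comparison is likely fast enough that this only matters for very large tables or many repeated lookups.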


On Wed, May 20, 2009 at 8:54 AM, Brigid Mooney  wrote:

> Hoping for a little insight into how to make sure I have R running as
> efficiently as possible.
>
> Suppose I have a data frame, A, with n rows and m columns, where col1
> is a date time stamp.  Also suppose that when this data is imported
> (from a csv or SQL), that the data is already sorted such that the
> time stamp in col1 is in ascending (or descending) order.
>
> If I then wanted to select only the rows of A where col1 <= a certain
> time, I am wondering if R has to read through the entirety of col1 to
> select those rows (all n of them).  Is it possible for R to recognize
> (or somehow be told) that these rows are already in order, thus
> allowing the computation to be completed in ~log(n) row reads
> instead?
>
> Thanks!
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Class for time of day?

2009-05-20 Thread jim holtman
If you want the hours from a POSIXct, here is one way of doing it; you can
create a function for doing it:

> x <- Sys.time()
> x
[1] "2009-05-20 12:17:13 EDT"
> y <- difftime(x, trunc(x, units='days'), units='hours')
> y
Time difference of 12.28697 hours
> as.numeric(y)
[1] 12.28697
>
It depends on what type of computations you want to do with it.  You can
leave it as POSIXct and carry out a lot of them.  Can you specify what you
want?
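
The steps above can be wrapped in a small helper. This is a minimal sketch: hours_since_midnight() is a hypothetical name, not a function from base R or any package, and "midnight" is taken in the time zone attached to the POSIXct value.

```r
# Time of day as fractional hours since midnight, for a POSIXct value
hours_since_midnight <- function(x) {
  as.numeric(difftime(x, trunc(x, units = "days"), units = "hours"))
}

# Example with a fixed time stamp so the result is reproducible
x <- as.POSIXct("2009-05-20 12:30:00", tz = "UTC")
h <- hours_since_midnight(x)
h
```

The result is a plain numeric, so it can be compared, binned or averaged without a dedicated class.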


On Wed, May 20, 2009 at 10:57 AM, Stavros Macrakis wrote:

> What is the recommended class for time of day (independent of calendar
> date)?
>
> And what is the recommended way to get the time of day from a POSIXct
> object? (Not a string representation, but a computable representation.)
>
> I have looked in the man page for DateTimeClasses, in the Time Series
> Analysis Task View and in Spector's Data Manipulation book but haven't
> found these. Clearly I can create my own Time class and hack around with
> the internal representation of POSIXct, e.g.
>
>     days <- unclass(d)/(24*3600)
>     days-floor(days)
>
> and write print.Time, `-.Time`, etc. etc. but I expect there is already a
> standard class or CRAN package.
>
>   -s
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


