Re: [R] create new variable with ifelse? (reproducible example)

2012-09-16 Thread Niklas Fischer
Thank you very much for very valuable comments. They are very informative. Bests, Niklas 2012/9/16 Ted Harding ted.hard...@wlandres.net [See at end] On 15-Sep-2012 20:36:49 Niklas Fischer wrote: Dear R users, I have a reproducible data and try to create new variable clo is 1 if know

Re: [R] parallel version of tapply() or table()?

2012-09-16 Thread Patrick Connolly
On Fri, 14-Sep-2012 at 02:03PM -0400, Earl Brown wrote: | Hello R-helpers. | | I've tried to recreate a parallel version of tapply() and table() | using a combination of the parallel functions mclapply() and pvec() | and papply(), but haven't been successful. In the end, I'm trying | to get a

Re: [R] Post by non-member to a members-only list

2012-09-16 Thread Ted Harding
On 16-Sep-2012 05:22:47 David Winsemius wrote: On Sep 15, 2012, at 7:17 PM, mcg wrote: Dear moderator; I'm on the R-Mailing list with the same (giepe...@gmail.com) email address, still I get the Post by non-member... message. Am I not a member then? It appears you are currently

[R] How to plot two lines, and only one line with errorbar by qqplots of R

2012-09-16 Thread Dennis
Here is my code, which plots three lines with errorbar. How could I add an extra line without errorbar to the plot? Thank you very much. beta.data <- data.frame ( method = rep(c("Wrong", "Correct", "Full Bayes"), each = T_obs), mean.beta = c(mean.beta1, mean.beta2, mean.beta3), t = rep(points, 3),

[R] Count based on 2 conditions [Beginner Question]

2012-09-16 Thread SirRon
Hello, I'm working with a dataset that has 2 columns and 1000 entries. Column 1 has either value 0 or 1, column 2 has values between 0 and 10. I would like to count how often Column 1 has the value 1, while Column 2 has a value greater than 5. This is my attempt, which works but doesn't seem to be

Re: [R] Sub- or superscript in factorial variable - possible?

2012-09-16 Thread peter dalgaard
On Sep 16, 2012, at 07:48 , David Winsemius wrote: On Sep 15, 2012, at 7:15 PM, mcg wrote: Hello R-users, I would like to use subscript in chemical formulas for the different treatments in a boxplot. For title, xlab and ylab sub- and superscript is no problem, but for the different

Re: [R] Count based on 2 conditions [Beginner Question]

2012-09-16 Thread Uwe Ligges
On 16.09.2012 12:41, SirRon wrote: Hello, I'm working with a dataset that has 2 columns and 1000 entries. Column 1 has either value 0 or 1, column 2 has values between 0 and 10. I would like to count how often Column 1 has the value 1, while Column 2 has a value greater than 5. This is my attempt,

Re: [R] Count based on 2 conditions [Beginner Question]

2012-09-16 Thread Rui Barradas
Hello, Since logical values F/T are coded as integers 0/1, you can use this: set.seed(5712) # make it reproducible n <- 1e3 x <- data.frame(A = sample(0:1, n, TRUE), B = sample(0:10, n, TRUE)) count <- sum(x$A == 1 & x$B > 5) # 207 Hope this helps, Rui Barradas On 16-09-2012 11:41, SirRon
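
The counting idea in the reply above can be sketched as follows. This is only an illustration on made-up data: the data frame dat and its columns A and B are assumptions, not the original poster's dataset, so no particular count is claimed.

```r
# Hypothetical data shaped like the question: a 0/1 column and a 0..10 column.
set.seed(1)
dat <- data.frame(A = sample(0:1, 1000, replace = TRUE),
                  B = sample(0:10, 1000, replace = TRUE))

# The comparison yields a logical vector that is TRUE where both conditions hold.
both <- dat$A == 1 & dat$B > 5

# sum() treats TRUE as 1 and FALSE as 0, so it counts the matching rows.
n_both <- sum(both)

# Cross-check: the same number of rows survives when subsetting by the condition.
n_rows <- nrow(dat[both, ])
```

Because the condition is computed once as a logical vector, the same vector can be reused for subsetting, which avoids writing the comparison twice.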

Re: [R] qplot: plotting precipitation data

2012-09-16 Thread Rui Barradas
Hello, Relative to the op's request for rectangles, I'm not understanding them. In your plot using geom_bar, the levels of as.factor(start) are sorted ascending. If both as.factor(mydata$start) [1] 5291000 10988025 11767950 11840900 12267450 12276675 Levels: 5291000 10988025 11767950

Re: [R] create new variable with ifelse? (reproducible example)

2012-09-16 Thread Stephen Politzer-Ahles
Hi Niklas, I like A.K.'s method. Here's another way to do what I think is the same thing you're asking for (this is how I did it before I knew ifelse() existed!) rep_data$clo <- 0 rep_data[ rep_data$know %in% c("very well", "fairly well") & rep_data$getalong %in% c(4,5),]$clo <- 1 Best, Steve

Re: [R] How to plot two lines, and only one line with errorbar by qqplots of R

2012-09-16 Thread R. Michael Weylandt
On Sun, Sep 16, 2012 at 9:06 AM, Dennis dcfl...@gmail.com wrote: Here is my code, which plots three lines with errorbar. How could I add an extra line without errorbar to the plot? Thank you very much. beta.data <- data.frame ( method = rep(c("Wrong", "Correct", "Full Bayes"), each = T_obs),

Re: [R] create new variable with ifelse? (reproducible example)

2012-09-16 Thread Rui Barradas
Hello, Here's another one. logic.result <- with(rep_data, know %in% c("very well", "fairly well") & getalong %in% c(4,5)) rep_data$clo <- 1*logic.result # coerce to numeric Rui Barradas On 16-09-2012 13:29, Stephen Politzer-Ahles wrote: Hi Niklas, I like A.K.'s method. Here's another way to
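
A small sketch of the indicator-variable approaches discussed in this thread, side by side. The rows of rep_data below are invented for illustration (the real data is not shown in the snippets); only the column names know, getalong and clo come from the messages.

```r
# Hypothetical stand-in for rep_data; the values are assumptions.
rep_data <- data.frame(
  know     = c("very well", "fairly well", "not well", "very well"),
  getalong = c(4, 2, 5, 5),
  stringsAsFactors = FALSE
)

# One logical vector that is TRUE where both conditions hold.
cond <- rep_data$know %in% c("very well", "fairly well") &
  rep_data$getalong %in% c(4, 5)

# Variant 1: ifelse() maps TRUE/FALSE to 1/0.
rep_data$clo <- ifelse(cond, 1, 0)

# Variant 2: coerce the logical vector directly.
clo_alt <- as.integer(cond)
```

Both variants give the same 0/1 coding here. Note that %in% never returns NA, whereas == would propagate missing values into the result.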

Re: [R] Adding annotations to qplot from a data frame

2012-09-16 Thread Euan Reavie
Thank you for the many replies on this issue. It turns out qplot is not suited to multiple annotations, so the best suggestions were to use ggplot. The following worked for making an annotated stacked bar plot: ggplot(algaedata) + geom_bar(aes(x = year, y = cellsperml, colour = DIV, group =

Re: [R] Sub- or superscript in factorial variable - possible?

2012-09-16 Thread David Winsemius
On Sep 16, 2012, at 4:40 AM, peter dalgaard wrote: On Sep 16, 2012, at 07:48 , David Winsemius wrote: On Sep 15, 2012, at 7:15 PM, mcg wrote: Hello R-users, I would like to use subscript in chemical formulas for the different treatments in a boxplot. For title, xlab and ylab

Re: [R] Count based on 2 conditions [Beginner Question]

2012-09-16 Thread arun
Hi, Try this: set.seed(1) dat1<-data.frame(col1=sample(0:1,1000,replace=TRUE),col2=sample(0:10,1000,replace=TRUE)) count(dat1$col1==1 & dat1$col2>5)[2,2] #[1] 209 A.K. - Original Message - From: SirRon thechrist...@gmx.at To: r-help@r-project.org Cc: Sent: Sunday, September 16, 2012

[R] two questions about character manipulation

2012-09-16 Thread Özgür Asar
Dear all, I want to manipulate a character string such as ex<-"cbind(data$response1,data$response2)" in R in two ways: 1) extracting the response1 portion of ex 2) replacing $ with . I am wondering whether it is possible to do these efficiently in R. Best Ozgur -- View this message in

Re: [R] Usage of trim in mean()

2012-09-16 Thread Özgür Asar
trim is for calculating a trimmed mean; it is "the fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed" (from ?mean). Ozgur -- View this message in context: http://r.789695.n4.nabble.com/Usage-of-trim-in-mean-tp4643281p4643293.html Sent from
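
A short worked example of the trim argument described above. The vector x is made up for illustration; with five values and trim = 0.2, one observation is dropped from each end before averaging.

```r
# Hypothetical data with one large outlier.
x <- c(1, 2, 3, 4, 100)

# Ordinary mean: (1 + 2 + 3 + 4 + 100) / 5
plain <- mean(x)

# Trimmed mean: floor(0.2 * 5) = 1 value removed from each end,
# leaving 2, 3, 4, whose mean is 3.
trimmed <- mean(x, trim = 0.2)
```

The trimmed mean is much less sensitive to the single extreme value than the ordinary mean.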

Re: [R] two questions about character manipulation

2012-09-16 Thread R. Michael Weylandt
On Sun, Sep 16, 2012 at 3:35 PM, Özgür Asar oa...@metu.edu.tr wrote: Dear all, I want to manipulate a character string such as ex<-"cbind(data$response1,data$response2)" in R in two ways: 1) extracting the response1 portion of ex I'm not sure what you mean by "portion" -- if you just want

Re: [R] two questions about character manipulation

2012-09-16 Thread Rui Barradas
Hello, Try the following. 1) pattern <- "response." m <- regexpr(pattern, ex) #gregexpr to get all response regmatches(ex, m) 2) gsub("\\$", "\\.", ex) Hope this helps, Rui Barradas On 16-09-2012 15:35, Özgür Asar wrote: Dear all, I want to manipulate a character string such as
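
For the second part of the question (replacing $ with .), a minimal sketch of two equivalent spellings. The string ex is taken from the question; the variable names dotted_regex and dotted_fixed are only illustrative.

```r
ex <- "cbind(data$response1,data$response2)"

# "$" is a regex metacharacter (end of string), so it must be escaped
# when the pattern is interpreted as a regular expression.
dotted_regex <- gsub("\\$", ".", ex)

# With fixed = TRUE the pattern is matched literally and needs no escaping.
dotted_fixed <- gsub("$", ".", ex, fixed = TRUE)
```

The fixed = TRUE form is often easier to read when the pattern contains no real regular-expression logic.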

Re: [R] qplot: plotting precipitation data

2012-09-16 Thread John Kane
-Original Message- From: ruipbarra...@sapo.pt Sent: Sun, 16 Sep 2012 13:13:47 +0100 To: jrkrid...@inbox.com Subject: Re: [R] qplot: plotting precipitation data Hello, Relative to the op's request for rectangles, I'm not understanding them. Neither am I really, I just googled a

Re: [R] qplot: plotting precipitation data

2012-09-16 Thread Rui Barradas
Maybe a bug in ggplot2::geom_rect? I'm CCing this to Hadley Wickham, maybe he has an answer. Rui Barradas On 16-09-2012 17:04, John Kane wrote: -Original Message- From: ruipbarra...@sapo.pt Sent: Sun, 16 Sep 2012 13:13:47 +0100 To: jrkrid...@inbox.com Subject: Re: [R] qplot:

[R] multi-column factor

2012-09-16 Thread Sam Steingold
I have a data frame with columns which draw on the same underlying universe, so I want them to be factors with the same level set: --8<---cut here---start->8--- z <- data.frame(a=c("a","b","c"),b=c("b","c","d"),stringsAsFactors=FALSE) str(z) 'data.frame': 3 obs. of 2

Re: [R] two questions about character manipulation

2012-09-16 Thread Özgür Asar
Dear Rui Barradas and Michael Weylandt, Many thanks for your replies. My second question is solved now. But I think I did not express my first wish in a clear way. Indeed, in ex<-"cbind(data$response1,data$response2)", I want to extract the variable name between $ and , (corresponds to

Re: [R] multi-column factor

2012-09-16 Thread Rui Barradas
Hello, The obvious simplification is to call union() only once. With 10M rows it should save time. Then I've asked myself whether unique() wouldn't be faster. f1 <- function(x){ x[[1]] <- factor(x[[1]], levels = union(x[[1]], x[[2]])) x[[2]] <- factor(x[[2]], levels = union(x[[1]],
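
A sketch of the "compute the level set once" idea, using the small data frame z from the original question. This is one possible way to write it and not necessarily the function proposed in the truncated reply; the name lev is an assumption.

```r
z <- data.frame(a = c("a", "b", "c"), b = c("b", "c", "d"),
                stringsAsFactors = FALSE)

# Compute the shared level set a single time, while the columns are
# still character vectors.
lev <- union(z$a, z$b)

# Convert every column to a factor over that common level set.
z[] <- lapply(z, factor, levels = lev)
```

Computing lev before any column is converted avoids taking a union of a factor and a character vector, and means union() runs only once regardless of the number of columns.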

Re: [R] create new variable with ifelse? (reproducible example)

2012-09-16 Thread Niklas Fischer
Thanks Rui and Stephen, They look very interesting. I am glad there are many ways to do so. All the bests, 2012/9/16 Rui Barradas ruipbarra...@sapo.pt Hello, Here's another one. logic.result <- with(rep_data, know %in% c("very well", "fairly well") & getalong %in% c(4,5)) rep_data$clo <-

Re: [R] two questions about character manipulation

2012-09-16 Thread Rui Barradas
Hello, This should do it. You can collapse the first two instructions, but I've left it like this for clarity. s <- unlist(strsplit(ex, "[,)[:blank:]]")) s <- gsub("^.*\\$", "", s) s[nchar(s) > 0] Rui Barradas On 16-09-2012 17:26, Özgür Asar wrote: Dear Rui Barradas and Michael Weylandt, Many
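
An alternative sketch for pulling out the names that follow each $, shown only as an illustration. It assumes the variable names consist of letters, digits, dots and underscores; that character class is an assumption, not something stated in the thread.

```r
ex <- "cbind(data$response1,data$response2)"

# Find every "$" followed by a run of name characters.
m <- gregexpr("\\$[A-Za-z0-9._]+", ex)

# regmatches() returns a list with one element per input string;
# strip the leading "$" from each match.
vars <- sub("^\\$", "", regmatches(ex, m)[[1]])
```

This matches the names directly instead of splitting the string on separators, so it does not depend on how many arguments cbind() has.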

[R] possible TZ bug in parseISO8601 - Error in if (length(c(year, month, day, hour, min, sec)) == 6 c(year, : [...]

2012-09-16 Thread Bit Rocker
Hey all, Virgin post to this list - hope I've got it right ;o) I've been learning R intensively the last two weeks and gone from newbie status to *reasonably* comfortable with it. Here's an issue I just cannot solve however as it appears to be some kind of bug in R itself. But I won't claim

Re: [R] possible TZ bug in parseISO8601 - Error in if (length(c(year, month, day, hour, min, sec)) == 6 c(year, : [...]

2012-09-16 Thread Bit Rocker
Just found a typo elsewhere in the code which looks like it's the culprit. I'm not sure if the report below is still relevant. Will advise if so. On Sun, Sep 16, 2012 at 6:59 PM, Bit Rocker bitracket...@gmail.com wrote: Hey all, Virgin post to this list - hope I've got it right ;o) I've

Re: [R] two questions about character manipulation

2012-09-16 Thread arun
Hi, Try this: ex<-"cbind(data$response1,data$response2)" gsub(".*\\(.*\\$(.*)\\,.*\\$.*\\)","\\1",ex) #[1] "response1" unlist(strsplit(gsub(".*\\(.*\\$(.*)\\,.*\\$(.*)\\)","\\1 \\2",ex)," ")) #[1] "response1" "response2" A.K. - Original Message - From: Özgür Asar oa...@metu.edu.tr To:

Re: [R] [newbie] aggregating table() results and simplifying code with loop

2012-09-16 Thread John Kane
Hi Davide, I had some time this afternoon and I wonder if this approach is likely to get the results you want? As before it is not complete but I think it holds promise. On the other hand Rui is a much better programmer than I am so he may have a much cleaner solution. My way still looks

[R] Question about R performance on UNIX/LINUX with different memory/swap configurations

2012-09-16 Thread Eberle, Anthony
Does anyone have any guidance on swap and memory configuration when running R v2.15.1 on UNIX/LINUX? Through some benchmarking across multiple hardware (UNIX, LINUX, SPARC, x86, Windows, physical, virtual) it seems that the smaller memory machines have an advantage. Typically my organization

[R] sum(table(v)) == length(v)

2012-09-16 Thread Sam Steingold
Is it possible to violate the identity sum(table(v)) == length(v) ?? I see no way to do that and it holds in my small examples, but it is violated in the huge set I have: system.time(z <- unique(data.frame(u=U,s=S))) tab1 <- table(z$u) tab1 <- tab1[tab1>0] # S is factor so some counts were 0 tab2 <-

Re: [R] sum(table(v)) == length(v)

2012-09-16 Thread R. Michael Weylandt
On Sun, Sep 16, 2012 at 7:34 PM, Sam Steingold s...@gnu.org wrote: Is it possible to violate the identity sum(table(v)) == length(v) ?? Quite easily: x <- c(1:5, NA) sum(table(x)) # 5 length(x) # 6 Perhaps look at the exclude= argument. Cheers, Michael I see no way to do that and it holds
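
The NA example in the reply above can be extended into a small sketch showing how the identity is restored. The vector x is the one from the reply; the variable names are only illustrative.

```r
x <- c(1:5, NA)

# By default table() drops NA, so the counts add up to 5 rather than 6.
default_total <- sum(table(x))

# Asking table() to keep missing values restores sum(table(x)) == length(x).
with_na_total <- sum(table(x, useNA = "ifany"))
```

The useNA argument is usually the most direct way to keep missing values in the tabulation when the totals need to match the vector length.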

[R] Where is the R configuration file or how to override R compilers

2012-09-16 Thread Eberle, Anthony
I have a question about how one can modify or override the compilers that R uses for package installations? Or if perhaps this configuration is in some editable file somewhere. Initially I built the version of R 2.15.1 on Solaris SPARC (virtual T4), but found out the build was done as 32 bit.

Re: [R] Where is the R configuration file or how to override R compilers

2012-09-16 Thread Berend Hasselman
On 16-09-2012, at 20:47, Eberle, Anthony wrote: I have a question about how one can modify or override the compilers that R uses for package installations? Or if perhaps this configuration is in some editable file somewhere. Initially I built the version of R 2.15.1 on Solaris SPARC

Re: [R] boot() with glm/gnm on a contingency table

2012-09-16 Thread Milan Bouchet-Valat
On Wednesday 12 September 2012 at 07:08 -0700, Tim Hesterberg wrote: One approach is to bootstrap the vector 1:n, where n is the number of individuals, with a function that does: f <- function(vectorOfIndices, theTable) { (1) create a new table with the same dimensions, but with the counts

Re: [R] Where is the R configuration file or how to override R compilers

2012-09-16 Thread Dirk Eddelbuettel
On 16 September 2012 at 13:47, Eberle, Anthony wrote: | I have a question about how one can modify or override the compilers | that R uses for package installations? Or if perhaps this configuration | is in some editable file somewhere. You have several choices: a) system-wide:
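
Since the reply above is cut off, the following is only a sketch of where compiler settings commonly live on a Unix-like R installation, stated as an assumption and not as a summary of the reply: a system-wide Makeconf file under the R home directory, and an optional per-user ~/.R/Makevars file. The exact location can differ by platform and build.

```r
# System-wide build configuration written when R itself was configured
# (assumed location for a typical Unix-like installation).
makeconf <- file.path(R.home("etc"), "Makeconf")

# Optional per-user overrides; this file does not exist unless created.
user_makevars <- path.expand(file.path("~", ".R", "Makevars"))

# Report which of the two files are present on this machine.
present <- file.exists(c(makeconf, user_makevars))
```

Settings such as CC or CFLAGS placed in the per-user file are generally picked up by package installation without editing the system-wide file.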

Re: [R] Question about R performance on UNIX/LINUX with different memory/swap configurations

2012-09-16 Thread Dirk Eddelbuettel
On 16 September 2012 at 13:30, Eberle, Anthony wrote: | Does anyone have any guidance on swap and memory configuration when | running R v2.15.1 on UNIX/LINUX? Through some benchmarking across | multiple hardware (UNIX, LINUX, SPARC, x86, Windows, physical, virtual) | it seems that the smaller

Re: [R] [newbie] aggregating table() results and simplifying code with loop

2012-09-16 Thread Davide Rizzo
Thank you John, you are giving me two precious tips (in addition, well explained!): 1. to use the package plyr (I didn't know it before, but it seems to make the deal!) 2. a smart and promising way to use it I can finally plot the partial results, to have a first glance and compare to them

Re: [R] Count based on 2 conditions [Beginner Question]

2012-09-16 Thread Peter Ehlers
On 2012-09-16 05:04, Rui Barradas wrote: Hello, Since logical values F/T are coded as integers 0/1, you can use this: set.seed(5712) # make it reproducible n <- 1e3 x <- data.frame(A = sample(0:1, n, TRUE), B = sample(0:10, n, TRUE)) count <- sum(x$A == 1 & x$B > 5) # 207 Another way:

Re: [R] Count based on 2 conditions [Beginner Question]

2012-09-16 Thread David Winsemius
On Sep 16, 2012, at 3:41 AM, SirRon wrote: Hello, I'm working with a dataset that has 2 columns and 1000 entries. Column 1 has either value 0 or 1, column 2 has values between 0 and 10. I would like to count how often Column 1 has the value 1, while Column 2 has a value greater than 5. This

Re: [R] Sub- or superscript in factorial variable - possible?

2012-09-16 Thread Peter Ehlers
On 2012-09-16 08:32, David Winsemius wrote: On Sep 16, 2012, at 4:40 AM, peter dalgaard wrote: On Sep 16, 2012, at 07:48 , David Winsemius wrote: On Sep 15, 2012, at 7:15 PM, mcg wrote: Hello R-users, I would like to use subscript in chemical formulas for the different treatments in a

[R] trying to obtain same nls parameters as in example

2012-09-16 Thread Pedro Mardones
Dear R-users; I'm working with a dataset that was previously used to fit a nonlinear model of the form: Y ~ a * (1 + b * log(1 - c * X^d)) The parameters published elsewhere are: a = 1.758863, b = .217217, c = .99031, and d = .054589 However, there is no way I can replicate this result.

[R] Possible Improvement of the R code

2012-09-16 Thread li li
Dear all, In the following code, I was trying to compute each row of the param iteratively based on the first row. This likely is not the best way. Can anyone suggest a simpler way to improve the code? Thanks a lot! Hannah param <- matrix(0, 11, 5) colnames(param) <- c("p", "q", "r",

Re: [R] Sub- or superscript in factorial variable - possible?

2012-09-16 Thread David Winsemius
On Sep 16, 2012, at 2:58 PM, Peter Ehlers wrote: On 2012-09-16 08:32, David Winsemius wrote: On Sep 16, 2012, at 4:40 AM, peter dalgaard wrote: On Sep 16, 2012, at 07:48 , David Winsemius wrote: On Sep 15, 2012, at 7:15 PM, mcg wrote: Hello R-users, I would like to use

[R] self defined distance matrix in NbClust

2012-09-16 Thread eliza botto
I am using the package NbClust for cluster analysis. In the following call, NbClust(m, diss=NULL, distance = "euclidean", min.nc=2, max.nc=15, method = "ward", index = "all", alphaBeale = 0.1), I want to define my own dissimilarity matrix of dimension 38*38. My original data m is a matrix of

Re: [R] Question about R performance on UNIX/LINUX with different memory/swap configurations

2012-09-16 Thread jim holtman
My first criterion is to make sure my application never swaps/pages due to memory issues -- have enough physical memory so it never happens and control what else is running on the machine. Once you start paging, performance takes a real hit. On Sun, Sep 16, 2012 at 2:30 PM, Eberle, Anthony

[R] Server R

2012-09-16 Thread Bazman76
Hi there, I used the command sudo apt-get install r-base to install R on an EC2 server as shown below: http://www.r-bloggers.com/ec2-micro-instance-of-rstudio/ It works but the version of R installed is: R.version.string [1] "R version 2.12.1 (2010-12-16)" I want to the latest version with

Re: [R] trying to obtain same nls parameters as in example

2012-09-16 Thread Ben Bolker
Pedro Mardones mardones.p at gmail.com writes: Dear R-users; I'm working with a dataset that was previously used to fit a nonlinear model of the form: Y ~ a * (1 + b * log(1 - c * X^d)) The parameters published elsewhere are: a = 1.758863, b = .217217, c = .99031, and d =

Re: [R] self defined distance matrix in NbClust

2012-09-16 Thread s.s.m. fauzi
Hi, I have a big .csv file. I would like to filter that file into a new table. For example, I have a .csv file as below: f1 f2 f3 f4 f5 f6 f7 f9 f10 f11 t1 1 0 1 0 1 0 0 0 0 1 t2 1 0 0 0 0 1 1 1 1 1 t3 0 0 0 0 0 0 0 0 0 0 t4 1 0 0

[R] How to filter information from a big .csv table into a new table

2012-09-16 Thread s.s.m. fauzi
Hi, I have a big .csv file. I would like to filter that file into a new table. For example, I have a .csv file as below: f1 f2 f3 f4 f5 f6 f7 f9 f10 f11 t1 1 0 1 0 1 0 0 0 0 1 t2 1 0 0 0 0 1 1 1 1 1 t3 0 0 0 0 0 0 0 0 0 0 t4 1 0 0

[R] memory leak using XML readHTMLTable

2012-09-16 Thread J Toll
Hi, I'm using the XML package to scrape data and I'm trying to figure out how to eliminate the memory leak I'm currently experiencing. In the searches I've done, it sounds like the existence of the leak is fairly well known. What isn't as clear is exactly how to solve it. The general process

[R] variance of yBar.. for unbalanced a random effects model

2012-09-16 Thread Han-Lin Lai
Hi All, I am analyzing a set of data collected by two-stage cluster sampling. My model is y_ij = mu + T_i + e_ij where T_i is the ith treatment and e_ij is random error for the ijth individual. I have MSE_within and MSE_between, which lead to MSE_T for the model. Suppose I have balanced data

Re: [R] Possible Improvement of the R code

2012-09-16 Thread Berend Hasselman
On 17-09-2012, at 00:51, li li wrote: Dear all, In the following code, I was trying to compute each row of the param iteratively based on the first row. This likely is not the best way. Can anyone suggest a simpler way to improve the code? Thanks a lot! Hannah param <-

[R] How to divide each column with its own value

2012-09-16 Thread s.s.m. fauzi
Hi, I have a matrix as below: mat= [,1] [,2] [,3] [1,] 1 4 7 [2,] 2 5 8 [3,] 3 6 9 What I want to do is, I would like to divide each column with its own value, in order to get value 1. Is there any simple script for that? [[alternative HTML version

Re: [R] How to divide each column with its own value

2012-09-16 Thread Berend Hasselman
On 17-09-2012, at 06:50, s.s.m. fauzi wrote: Hi, I have a matrix as below: mat= [,1] [,2] [,3] [1,] 1 4 7 [2,] 2 5 8 [3,] 3 6 9 What I want to do is, I would like to divide each column with its own value, in order to get value 1. Is there any