[R] Help If
Hi everyone, and sorry for my question.

I would like to use if in an example like this:

    If((condition1 and condition2) Or (condition 3 and condition4)) {print uhvef}

but I don't know how to write it correctly.

Thanks in advance

[[alternative HTML version deleted]]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help If
Hello,

"and" is &&; "or" is ||; and print() needs parentheses around its argument:

    if ((condition1 && condition2) || (condition3 && condition4)) { print(uhvef) }

Hope this helps,

Rui Barradas
Re: [R] Help If
Hey,

    if (((1 == 1) && (2 == 2)) || (3 == 3)) {
        print("hello world")
    }

László-András Zsurzsa
MSc Informatics, Technical University Munich, Germany
Scientific Employee, TUM
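The two replies above can be combined into a small runnable sketch. The function name same_sign and its arguments are made up for illustration and are not from the thread. It also shows the vectorised operators & and |, which are the ones to use with ifelse() when the conditions are vectors rather than single values.

```r
# Hypothetical helper: TRUE when x and y are both positive or both negative.
# && and || work on single logical values, which is what if() needs.
same_sign <- function(x, y) {
  if ((x > 0 && y > 0) || (x < 0 && y < 0)) {
    TRUE
  } else {
    FALSE
  }
}

print(same_sign(5, 10))
print(same_sign(-1, 2))

# For whole vectors, use the element-wise operators & and | instead.
x <- c(5, -1, -3)
y <- c(10, 2, -4)
labels <- ifelse((x > 0 & y > 0) | (x < 0 & y < 0), "same", "different")
print(labels)
```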
[R] XLSX package + Excel creation question
Dear R users,

I have a question about the xlsx package. It can create Excel files, color cells, and so on. My question is whether it is possible to color only part of the data held in a cell. Let's assume I have the following data: 167,153,120,100, and I want to color red every value greater than 120. How can I achieve this using R?

An example file with a few lines is attached (the SEL_MASS column can be used, for example).

Thank you in advance,

László-András Zsurzsa
MSc Informatics, Technical University Munich, Germany
Scientific Employee, TUM
Re: [R] Plotting time vs number
On 08/29/2013 02:19 PM, mohan.radhakrish...@polarisft.com wrote:
> Hi,
> ...
> The plots are all there but the x-axis labels are not there. The graph
> labels are only '12:30', '13:30' and '14:30'. I think I need to use your
> code to get all the values.

Hi Mohan,

Try this:

    plot(strptime(data$Time, "%H:%M:%S"), data$Kbytes, pch=0,
         type="b", col="red", col.axis="red",
         ylab="", xlab="", las=2, lwd=2.5, xaxt="n")
    library(plotrix)
    staxlab(at=as.numeric(strptime(data$Time, "%H:%M:%S")),
            labels=as.character(data$Time), nlines=3)

Jim
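For readers without plotrix, the same idea (suppress the default x axis, then draw one label per observation) can be sketched in base R. The times and kbytes vectors below are made-up sample data, not Mohan's, and pdf(NULL) is used only so the sketch runs without opening a plot window.

```r
# Made-up sample data standing in for data$Time and data$Kbytes.
times  <- c("12:30:00", "13:00:00", "13:30:00", "14:00:00", "14:30:00")
kbytes <- c(120, 340, 280, 410, 390)

# Parse the clock times; strptime() fills in today's date.
tt <- as.POSIXct(strptime(times, "%H:%M:%S"))

pdf(NULL)                       # throw-away graphics device
plot(tt, kbytes, type = "b", col = "red", lwd = 2.5,
     xaxt = "n",                # suppress the default x axis
     xlab = "", ylab = "")
# Draw one tick and one label for every observation, rotated with las = 2.
axis(1, at = as.numeric(tt), labels = times, las = 2)
invisible(dev.off())
```

With many closely spaced times the labels may still overlap, which is the case staxlab() handles by staggering them over several lines.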
[R] Unsuccessful beginner's struggle with lm
I have two data frames, train and response. Here is my attempt to do a linear regression. All entries of both data frames are numeric. I am expecting the intercept value to lie between 2 and 3 (in particular, non-zero).

Here is a record of my interaction with R:

    > class(response)
    [1] "data.frame"
    > c(nrow(response),ncol(response))
    [1] 1389    1
    > class(train)
    [1] "data.frame"
    > c(nrow(train),ncol(train))
    [1] 1389  256
    > beta.lm <- lm(response ~ train)
    Error in model.frame.default(formula = response ~ train, drop.unused.levels = TRUE) :
      invalid type (list) for variable 'response'

What elementary syntax error am I making in my call to lm? And why does R think at first that the class of response is data.frame, but that its class is list when I call lm?

Thanks
David
Re: [R] Sensitivy / Specificity and nulls
At 15:18 28/08/2013, Donald Catanzaro wrote:
> Good Day All,
>
> I am working with a diagnostic test and comparing the new test to an old
> test. Normally I would be able to calculate sensitivity and specificity
> quite easily. However, the 'gold standard' that I am comparing my new
> diagnostic with is really 'gold-plated' in that sometimes the 'gold
> standard' fails completely and I have no data from the 'gold standard' but
> I might have data from the diagnostic test. Of course sometimes my new
> diagnostic fails but I have data from my 'gold standard'.

I am not sure I completely understand the situation, my crystal ball is becoming rather opaque, but it sounds as though you are looking for some form of meta-analysis of diagnostic tests when there is no reference standard. HSROC, available from CRAN, claims to provide this although I have never used it myself.

> To me this really starts moving towards classification but I cannot seem to
> find the appropriate calculations. Can someone point me to some web
> resources to determine the appropriate method to be able to deal with the
> NULLs? Resources within the medical realm would be better (because the
> rest of the folks would understand them better) but not required.
>
> - Don
> Donald Catanzaro PhD

Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html
Re: [R] Unsuccessful beginner's struggle with lm
On 13-08-29 8:23 AM, David Epstein wrote:
> I have two data frames, train and response. Here is my attempt to do a
> linear regression. All entries of both data frames are numeric. I am
> expecting the intercept value to lie between 2 and 3 (in particular,
> non-zero).

lm expects the variables in the formula to be numeric vectors (or factors). They are often columns of a dataframe, but they won't be dataframes themselves.

> Here is a record of my interaction with R:
> [...]
> > beta.lm <- lm(response ~ train)
> Error in model.frame.default(formula = response ~ train, drop.unused.levels = TRUE) :
>   invalid type (list) for variable 'response'
>
> What elementary syntax error am I making in my call to lm?
> And why does R think at first that the class of response is data.frame, but
> that its class is list when I call lm?

Dataframes are lists with some extra rules added. lm() is just reporting the low level type, rather than the high level one.

The way to do what you want is to include the response as a column in the same dataframe that includes the predictor variables. If you call the dataframe df and the response column name response, then the lm call would look like

    lm(response ~ ., data=df)

The . here means all the other columns. You could also list them explicitly, but 256 of them sounds like a lot...

Duncan Murdoch
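A minimal sketch of the suggestion above, using simulated data because David's data frames are not available. The column names x1, x2 and y, the true coefficients, and the sample size are all invented for illustration; only the shape of the call lm(response ~ ., data = df) comes from the thread.

```r
set.seed(1)

# Simulated stand-ins: 'train' holds the predictors, 'response' is a
# one-column data frame, as in the original question.
train    <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
response <- data.frame(y = 2.5 + 1.5 * train$x1 - 0.5 * train$x2 +
                           rnorm(50, sd = 0.1))

# Pull the single column out as a numeric vector and bind it to the
# predictors so that everything lives in one data frame.
df <- cbind(response = response[[1]], train)

# '.' means "all other columns of df".
fit <- lm(response ~ ., data = df)
print(coef(fit))
```

The key step is response[[1]], which extracts the numeric vector from the one-column data frame; passing the data frame itself is what triggered the "invalid type (list)" error.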
Re: [R] Narrowing values collected from .txt file
Here is how I would do it, since you are reading in the entire file. This breaks on each Flow Budget section, extracts the RECHARGE values and puts them in a list with the name of the Flow Budget:

    # read entire file
    input <- readLines("C:\\Users\\jh52822\\Downloads\\MCR_Budgets.txt")
    # determine the lines of interest
    indx <- grep("Flow Budget for Zone|RECHARGE =", input)
    # remove everything else
    input <- input[indx]
    # split by Flow Budget
    sep <- split(input, cumsum(grepl("Flow Budget", input)))
    # process the list extracting data
    result <- lapply(sep, function(.lines){
        as.numeric(sub(".*=(.*)", "\\1", .lines[-1]))
    })
    # extract the names for each Flow
    fNames <- sapply(sep, '[', 1)
    # add to the list
    names(result) <- fNames
    result

    $` Flow Budget for Zone 1 at Time Step 1 of Stress Period 2`
    [1] 128980 0 0 0
    $` Flow Budget for Zone 2 at Time Step 1 of Stress Period 2`
    [1] 274160 0 0 0
    $` Flow Budget for Zone 3 at Time Step 1 of Stress Period 2`
    [1] 81084 0 0 0
    $` Flow Budget for Zone 1 at Time Step 1 of Stress Period 3`
    [1] 128980 0 0 0
    $` Flow Budget for Zone 2 at Time Step 1 of Stress Period 3`
    [1] 274160 0 0 0
    $` Flow Budget for Zone 3 at Time Step 1 of Stress Period 3`
    [1] 81084 0 0 0
    $` Flow Budget for Zone 1 at Time Step 1 of Stress Period 4`
    [1] 128980 0 0 0
    $` Flow Budget for Zone 2 at Time Step 1 of Stress Period 4`
    [1] 274160 0 0 0
    $` Flow Budget for Zone 3 at Time Step 1 of Stress Period 4`
    [1] 81084 0 0 0
    $` Flow Budget for Zone 1 at Time Step 1 of Stress Period 5`
    [1] 128980 0 0 0
    $` Flow Budget for Zone 2 at Time Step 1 of Stress Period 5`
    [1] 274160 0 0 0
    $` Flow Budget for Zone 3 at Time Step 1 of Stress Period 5`
    [1] 81084 0 0 0
    $` Flow Budget for Zone 1 at Time Step 1 of Stress Period 6`
    [1] 128980 0 0 0
    $` Flow Budget for Zone 2 at Time Step 1 of Stress Period 6`
    [1] 274160 0 0 0
    $` Flow Budget for Zone 3 at Time Step 1 of Stress Period 6`
    [1] 81084 0 0 0
    $` Flow Budget for Zone 1 at Time Step 1 of Stress Period 7`
    [1] 128980 0 0 0

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Wed, Aug 28, 2013 at 7:45 PM, Morway, Eric emor...@usgs.gov wrote:
> It looks as though the attachment to my last post didn't make the cut (or
> at least it's not appearing on the Nabble forum), for one reason or
> another. I'm reattaching a smaller version so folks can run the code
> (won't work without a text file to operate on). So, while the attached
> file is only a small sample of the larger file and will therefore run
> quickly, it would still be helpful if someone knows a more efficient
> approach to the code in the previous post.
>
> On Wed, Aug 28, 2013 at 11:28 AM,
> > A relatively concise, commented, working solution to the problem
> > originally motivating this thread was found (below). I suspect the
> > approach I've taken has a major inefficiency through the use of the scan
> > statement appearing inside the function g. The way the code works right
> > now, it has to re-open and read the file length(matched) times rather
> > than sequentially reading through to the next pertinent section of the
> > txt file. Does anyone have a more efficient approach in mind so I don't
> > have to wait 1/2 hour to get the results? (The only adjustment to the
> > code that follows is to point txt to wherever the attached file is
> > placed.)
> >
> >     # where is the file?
> >     txt <- "c:/temp/MCR_Budgets.txt"
> >     # Demarcation header
> >     hdr_str <- "Flow Budget for Zone 2"
> >     # string to identify lines with desired values
> >     srch_str <- "RECHARGE ="
> >     # retrieves desired values
> >     g <- function(txt_con, hdr_str, srch_str, from, to, ...) {
> >       L <- readLines(txt_con)
> >       # matched contains the line #s w/ hdr_str
> >       matched <- grep(hdr_str, L, value = FALSE, ...)
> >       # initialize output list
> >       fetched_list <- numeric()
> >       # for each instance of hdr_str, loop
> >       for (i in 1:(length(matched))) {
> >         # retrieve a section of text following each hdr_str
> >         snippet <- scan(txt_con, what=character(), skip=matched[i]-1, n=42, sep='\n')
> >         # get data within the short section of retrieved text
> >         fetched <- grep(srch_str, snippet, value=TRUE)
> >         # append output vector for plotting time series
> >         fetched_list <- c(fetched_list, as.numeric(substring(fetched, from, to)))
> >         # monitor
> >         print(i)
> >       }
> >       # return desired values
> >       as.numeric(fetched_list)
> >     }
> >     # The results of system.time reflect full 147 MB file,
> >     # only half of which is attached.
> >     system.time( rech_z2 <- g(txt, hdr_str, srch_str, 37, 51) )
> >     #    user  system elapsed
> >     # 1740.48   36.08 1825.77
[R] Fwd: Narrowing values collected from .txt file
-- Forwarded message --
From: jim holtman jholt...@gmail.com
Date: Thu, Aug 29, 2013 at 8:43 AM
Subject: Re: [R] Narrowing values collected from .txt file
To: Morway, Eric emor...@usgs.gov

FYI, I duped your data to a 100 MB file and it took less than 10 seconds to process.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
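The single-pass approach from this thread (read all lines once, keep only header and RECHARGE lines, then split on a running count of headers) can be tried without the original file. The lines below are invented to resemble the budget file; the real file's exact layout, spacing and number of RECHARGE lines per section are assumptions.

```r
# Invented sample lines standing in for readLines("MCR_Budgets.txt").
input <- c(
  " Flow Budget for Zone 1 at Time Step 1 of Stress Period 2",
  "   some unrelated line",
  "          RECHARGE =       128980.0",
  "          RECHARGE =            0.0",
  " Flow Budget for Zone 2 at Time Step 1 of Stress Period 2",
  "          RECHARGE =       274160.0",
  "   another unrelated line",
  "          RECHARGE =            0.0"
)

# Keep only section headers and RECHARGE lines.
keep <- input[grepl("Flow Budget for Zone|RECHARGE =", input)]

# cumsum() over the header indicator gives each line its section number.
sep <- split(keep, cumsum(grepl("Flow Budget", keep)))

# Drop the header (first element) and parse the number after '='.
result <- lapply(sep, function(lines) {
  as.numeric(sub(".*=(.*)", "\\1", lines[-1]))
})

# Name each element after its (trimmed) header line.
names(result) <- trimws(vapply(sep, `[`, character(1), 1))
print(result)
```

Because the file is read once and everything afterwards is vectorised, the cost no longer grows with the number of matched headers, which is the inefficiency Eric suspected in the scan() loop.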
Re: [R] XLSX package + Excel creation question
On 29.08.2013 12:08 (UTC+1), Zsurzsa Laszlo wrote:
> Dear R users,
>
> I have a question about the xlsx package. It's possible to create excel
> files and color cells and etc.

Yes, with package xlsx you can colourize your data sheets, even the fonts. See for example ?CellStyle.

A good demonstration of the capabilities is at
http://tradeblotter.wordpress.com/2013/05/02/writing-from-r-to-excel-with-xlsx/

> My question would be that is it possible to color only some part of the
> data hold in a cell. Let's assume I've got the following data:
> 167,153,120,100 and I want to color to red everything that is bigger than
> 120. How can I achieve this using R?
>
> Example file setup with a few lines in attachment. (SEL_MASS column can be
> used for example)

Attachment missing ...

HTH,
Rainer
Re: [R] XLSX package + Excel creation question
First of all, thank you for the quick response. I know I can color and set up every cell. I will take another look at CellStyle, but is it possible, for example, to write an array to a single cell with different colors for some of the data? Basically the color depends on the data.

László-András Zsurzsa
MSc Informatics, Technical University Munich, Germany
Scientific Employee, TUM
[R] Few doubts about ANOVA
Hi,

Can you please give a brief explanation of ANOVA? What is the purpose of the null hypothesis in ANOVA? How can we find future predicted values from existing data?
[R] Few doubts about R
Hi,

Can you please give a brief explanation of ANOVA? What is the purpose of the null hypothesis in ANOVA? How can we find future predicted values from existing data?
Re: [R] Calculation with Times Series
Hi,

Maybe this helps:

    ts1 <- ts(1:20)
    ts2 <- ts(1:25)
    ts1[-(1:3)] <- ts1[-(1:3)] + ts2[1:17]
    as.numeric(ts1)
    # [1]  1  2  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37

A.K.

> Hey everyone,
>
> I'm an absolute beginner in R and need some help with an exercise:
>
> I want to do ordinary calculations with 2 time series. The issue is that I
> want to use different elements of the time series. Let me give you an
> example: I want to sum, let's say, the 10th element of time series 1 with
> the 7th element of time series 2, the 9th element of TS 1 with the 6th
> element of TS 2, the 8th element of TS 1 with the 5th element of TS 2, ...
> This pattern of summation should run over the whole time series.
>
> Is there a function which allows me to do this, if possible one in which I
> can change the difference in position with a variable?
>
> Thanks a lot for your support. I am thankful for any advice!
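The question also asks for the position difference to be a variable. One way to sketch that is a small helper; the name lagged_sum and its arguments are hypothetical, not an existing R function. Unlike the reply above, which leaves the first three elements of ts1 unchanged, this sketch returns NA wherever no partner element exists.

```r
# Hypothetical helper: out[i] = a[i] + b[i - offset], NA where b has no
# element at position i - offset.
lagged_sum <- function(a, b, offset) {
  a <- as.numeric(a)
  b <- as.numeric(b)
  i <- seq_along(a)
  j <- i - offset                      # partner position in b
  ok <- j >= 1 & j <= length(b)        # positions where the partner exists
  out <- rep(NA_real_, length(a))
  out[ok] <- a[i[ok]] + b[j[ok]]
  out
}

ts1 <- ts(1:20)
ts2 <- ts(1:25)

# 10th element of ts1 with 7th of ts2, 9th with 6th, ... means offset = 3.
res <- lagged_sum(ts1, ts2, offset = 3)
print(res)
```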
Re: [R] XLSX package + Excel creation question
On 29.08.2013 15:03 (UTC+1), Zsurzsa Laszlo wrote:
> First of all thank you for the quick response. I know I can color and set
> up every cell. I will take a look again at CellStyle, but is it possible,
> for example, to write an array to a single cell that has different colors
> for some data? Basically the color depends on the data.

As far as I know there is no ready-to-use functionality to mask groups of selected cells. You have to write your own function, which selects the right cells and changes their style with setCellStyle(cell, cellStyle). Some hints are given in the examples section of ?CellStyle.
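A rough sketch of the "write your own function" route described above, under the assumption that each value goes into its own cell. It does not colour part of a single cell, which is what the original question asks for and which this thread leaves unresolved. The sheet name, the output file name and the threshold variable are illustrative; the workbook is only written if the xlsx package (and its Java dependency) can be loaded.

```r
# Data from the question: colour every value greater than 120 red.
values    <- c(167, 153, 120, 100)
threshold <- 120
is_red    <- values > threshold

# Only touch the workbook if the xlsx package is actually available.
if (requireNamespace("xlsx", quietly = TRUE)) {
  wb    <- xlsx::createWorkbook(type = "xlsx")
  sheet <- xlsx::createSheet(wb, sheetName = "SEL_MASS")
  rows  <- xlsx::createRow(sheet, rowIndex = 1)
  cells <- xlsx::createCell(rows, colIndex = seq_along(values))

  # A cell style whose only non-default property is a red font.
  red <- xlsx::CellStyle(wb, font = xlsx::Font(wb, color = "red"))

  for (j in seq_along(values)) {
    xlsx::setCellValue(cells[[1, j]], values[j])
    if (is_red[j]) xlsx::setCellStyle(cells[[1, j]], red)
  }

  # Written to a temporary directory; the file name is arbitrary.
  xlsx::saveWorkbook(wb, file.path(tempdir(), "sel_mass.xlsx"))
}
```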
Re: [R] Sensitivy / Specificity and nulls
Hi All,

I apologize for the opaqueness and I will try to make it clearer.

I am comparing two diagnostic tests, G (gold standard) and N (new). Both are real tests, real experiments. G is currently the gold standard because it is the best test available, not because it is a perfect test. G is a growth-based test and sometimes the test fails (the sample is contaminated with multiple species of bacteria and gives no results). The new test is molecular based, and if DNA is present you get a result; however, sometimes this test fails as well (the quality control parameters have been exceeded). Both tests are one-shots, so there is no opportunity for retesting.

Thus what happens is that sometimes G fails while N detects DNA in the sample, and sometimes the reverse is true: G has growth and N fails. So I guess the simplest way to think of this is that both G and N have some level of measurement error that is unknown but that I would like to account for in my calculations.

So rather than having data for a 'traditional' 2x2 matrix for sensitivity/specificity, my data (where 1 = positive, 0 = negative and F = failed test result) look like this:

    G  N
    1  0
    1  1
    F  1
    0  F
    1  1
    1  1
    1  0
    F  0

On Thu, Aug 29, 2013 at 7:32 AM, Michael Dewey i...@aghmed.fsnet.co.uk wrote:
> I am not sure I completely understand the situation, my crystal ball is
> becoming rather opaque, but it sounds as though you are looking for some
> form of meta-analysis of diagnostic tests when there is no reference
> standard. HSROC, available from CRAN, claims to provide this although I
> have never used it myself.

--
- Don

Donald Catanzaro PhD dgcatanz...@gmail.com 16144 Sigmond Lane Lowell, AR 72745 479-751-3616
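As a starting point only, the eight G/N pairs from this message can be tabulated with failed tests coded as NA. The sketch below is a plain complete-case calculation that still treats G as the truth; it does not model the measurement error in G that the post describes, and it is not the HSROC approach mentioned in the reply. Variable names are illustrative.

```r
# The eight pairs from the post; "F" marks a failed test.
G <- c("1", "1", "F", "0", "1", "1", "1", "F")
N <- c("0", "1", "1", "F", "1", "1", "0", "0")

# Map "0"/"1" to integers; anything else (here "F") becomes NA.
lookup <- c("0" = 0L, "1" = 1L)
g <- unname(lookup[G])
n <- unname(lookup[N])

# Cross-tabulation including the failures.
print(table(G = g, N = n, useNA = "ifany"))

# Complete-case sensitivity and specificity (both tests gave a result).
ok   <- !is.na(g) & !is.na(n)
sens <- sum(g[ok] == 1 & n[ok] == 1) / sum(g[ok] == 1)
spec <- sum(g[ok] == 0 & n[ok] == 0) / sum(g[ok] == 0)  # 0/0 here: no usable G-negatives

# Failure rate of each test, reported separately.
fail_g <- mean(is.na(g))
fail_n <- mean(is.na(n))
print(c(sensitivity = sens, specificity = spec,
        fail_G = fail_g, fail_N = fail_n))
```

With these eight rows the only G-negative sample has a failed N result, so specificity cannot be estimated at all from the complete cases, which illustrates how much information the failures remove.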
Re: [R] XLSX package + Excel creation question
I understand your response, but it does not solve the problem. I am aware that one can simply color every cell in an Excel file using one's own algorithm. The question was whether I can write my data to a *single* cell and use different formatting for every piece of data.

László-András Zsurzsa
MSc Informatics, Technical University Munich, Germany
Scientific Employee, TUM
[R] Help R
Hi everyone,

I would like to replace some values in a data.frame (D):

    > str(D)
    'data.frame': 116 obs. of 10 variables:
     $ X.     : int 1108 1591 3408 3872 5823 8099 10640 12600 14680 14698 ...
     $ media  : num 22 86.6 807 103.2 73 ...
     $ IE.2003: num 32 92 166 237 161 ...
     $ IE.2004: num 63 122.8 290 117.8 73.6 ...
     $ IE.2005: num 60 277 302 154 134 ...
     $ IE.2006: num 39 87 322 113 70 ...
     $ IE.2007: num 4 95 621 116 80 ...
     $ IE.2008: num 8 94 1071 90 74 ...
     $ IE.2009: num 16 81 1301 94 69 ...
     $ IE.2010: num 5 76 1225 1911 72 ...

    > D
          X.      media IE.2003 IE.2004 IE.2005 IE.2006 IE.2007 IE.2008 IE.2009 IE.2010
    1   1108   22.00000    32.0    63.0    60.0      39     4.0       8      16     5.0
    2   1591   86.60000    92.0   122.8   276.6      87    95.0      94      81    76.0
    3   3408  807.00000   166.0   290.0   302.0     322   621.0    1071    1301  1225.0
    4   3872  103.25000   237.2   117.8   154.4     113   116.0      90      94  1911.2
    5   5823   73.00000   160.6    73.6   133.6      70    80.0      74      69    72.0
    6   8099  125.16667   169.0   206.0   196.0     161   150.0      94      72    78.0
    7  10640   67.30000   494.8   168.2   424.8     476   670.6      74      77    51.0
    8  12600 2417.00000  1958.0  1871.0  1960.0    2383  2453.0    2506    2758  2442.0
    9  14680   38.00000   142.2    46.0    30.0      61   404.0      42      19   243.8
    10 14698  698.16667   505.0   482.0   553.0     664   847.0     800     679   646.0

What I really want to do is:

    for( i in 1: nrow(D)) {
      for( j in 5:ncol(D)) {
        D[((D[i,j]/D[i,2])1.5)]15999)]-100
      }
    }

    Error in `[<-.data.frame`(`*tmp*`, (D[i, j] 15999), value = 1e+06) :
      missing values are not allowed in subscripted assignments of data frames
Re: [R] Help R
Do you have NA/NaN values in your data set? If so, either check for them with an if() or substitute them with a value that fits your needs. I hope I understood your problem correctly.

László-András Zsurzsa
MSc. Informatics, Technical University Munich, Germany
Scientific Employee, TUM

On Thu, Aug 29, 2013 at 3:49 PM, Mª Teresa Martinez Soriano teresama...@hotmail.com wrote:

Hi to everyone, I would like to replace some values in a data.frame (D)

str(D)
'data.frame': 116 obs. of 10 variables:
 $ X.     : int 1108 1591 3408 3872 5823 8099 10640 12600 14680 14698 ...
 $ media  : num 22 86.6 807 103.2 73 ...
 $ IE.2003: num 32 92 166 237 161 ...
 $ IE.2004: num 63 122.8 290 117.8 73.6 ...
 $ IE.2005: num 60 277 302 154 134 ...
 $ IE.2006: num 39 87 322 113 70 ...
 $ IE.2007: num 4 95 621 116 80 ...
 $ IE.2008: num 8 94 1071 90 74 ...
 $ IE.2009: num 16 81 1301 94 69 ...
 $ IE.2010: num 5 76 1225 1911 72 ...

D
      X.      media IE.2003 IE.2004 IE.2005 IE.2006 IE.2007 IE.2008 IE.2009 IE.2010
1   1108   22.0        32.0    63.0    60.0      39     4.0       8      16     5.0
2   1591   86.6        92.0   122.8   276.6      87    95.0      94      81    76.0
3   3408  807.0       166.0   290.0   302.0     322   621.0    1071    1301  1225.0
4   3872  103.25000   237.2   117.8   154.4     113   116.0      90      94  1911.2
5   5823   73.0       160.6    73.6   133.6      70    80.0      74      69    72.0
6   8099  125.16667   169.0   206.0   196.0     161   150.0      94      72    78.0
7  10640   67.3       494.8   168.2   424.8     476   670.6      74      77    51.0
8  12600 2417.0      1958.0  1871.0  1960.0    2383  2453.0    2506    2758  2442.0
9  14680   38.0       142.2    46.0    30.0      61   404.0      42      19   243.8
10 14698  698.16667   505.0   482.0   553.0     664   847.0     800     679   646.0

What I really want to do is:

for( i in 1: nrow(D)) { for( j in 5:ncol(D)) { D[((D[i,j]/D[i,2])1.5)]15999)]-100 } }

Error in `[<-.data.frame`(`*tmp*`, (D[i, j] 15999), value = 1e+06) : missing values are not allowed in subscripted assignments of data frames
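The error quoted above is what R raises when a subscript containing NA is used to assign into a data frame. A minimal sketch of that situation and one way around it follows. The toy data frame, the 1.5 threshold and the replacement value 100 are assumptions made for illustration; the poster's actual rule is not recoverable from the message.

```r
# Toy data frame with one missing value (hypothetical data)
d <- data.frame(media = c(10, 20, 30), y = c(40, NA, 33))

ratio <- d$y / d$media      # 4.0, NA, 1.1
flag  <- ratio > 1.5        # TRUE, NA, FALSE

# Using 'flag' directly as a row subscript in a data-frame assignment fails,
# because it contains NA. Work on a copy so 'd' is left untouched.
res <- tryCatch({
  tmp <- d
  tmp[flag, "y"] <- 100
  "ok"
}, error = function(e) "error")

# Treating NA as "condition not met" avoids the problem
flag[is.na(flag)] <- FALSE
d$y[flag] <- 100
print(d)
```

Whether NA should count as "not met" or be replaced beforehand depends on the data, which is the choice the reply above is pointing at.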
Re: [R] Help R
Hi,

Your code is not clear:

for( i in 1: nrow(D)) { for( j in 5:ncol(D)) { D[((D[i,j]/D[i,2])1.5)]15999)]-100 } }
## unclear part: 1.5)]15999)]

D <- structure(list(X. = c(1108L, 1591L, 3408L, 3872L, 5823L, 8099L, 10640L, 12600L, 14680L, 14698L),
  media = c(22, 86.6, 807, 103.25, 73, 125.16667, 67.3, 2417, 38, 698.16667),
  IE.2003 = c(32, 92, 166, 237.2, 160.6, 169, 494.8, 1958, 142.2, 505),
  IE.2004 = c(63, 122.8, 290, 117.8, 73.6, 206, 168.2, 1871, 46, 482),
  IE.2005 = c(60, 276.6, 302, 154.4, 133.6, 196, 424.8, 1960, 30, 553),
  IE.2006 = c(39L, 87L, 322L, 113L, 70L, 161L, 476L, 2383L, 61L, 664L),
  IE.2007 = c(4, 95, 621, 116, 80, 150, 670.6, 2453, 404, 847),
  IE.2008 = c(8L, 94L, 1071L, 90L, 74L, 94L, 74L, 2506L, 42L, 800L),
  IE.2009 = c(16L, 81L, 1301L, 94L, 69L, 72L, 77L, 2758L, 19L, 679L),
  IE.2010 = c(5, 76, 1225, 1911.2, 72, 78, 51, 2442, 243.8, 646)),
  .Names = c("X.", "media", "IE.2003", "IE.2004", "IE.2005", "IE.2006", "IE.2007", "IE.2008", "IE.2009", "IE.2010"),
  class = "data.frame",
  row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))

D[,-c(1:4)][D[,-c(1:4)]/D[,2] > 1.5]
# [1]   60.0  276.6  133.6  196.0  424.8   39.0  476.0   61.0  670.6  404.0
#[11] 1301.0 1225.0 1911.2  243.8

A.K.
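The subsetting idea in the reply above can also be turned into a replacement without any loop. This is only a sketch on made-up data: the column names `v1` and `v2`, the 1.5 threshold and the replacement value 100 are assumptions, since the original loop is too garbled to tell what was intended.

```r
# Hypothetical toy data: 'media' is the reference column
d <- data.frame(id = 1:3, media = c(10, 20, 30),
                v1 = c(40, 21, 90), v2 = c(12, 50, 31))

cols <- c("v1", "v2")

# Dividing a data frame by a vector of length nrow(d) divides each row
# by its own 'media' value; the comparison yields a logical matrix
big <- d[cols] / d$media > 1.5

vals <- d[cols]
vals[big] <- 100        # replace the flagged cells
d[cols] <- vals
print(d)
```

If the selected columns can contain NA, the logical matrix needs its NA entries set to FALSE (or handled otherwise) before the assignment.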
Re: [R] Few doubts about ANOVA
Looks like school is starting up again. We don't usually help with homework, especially at this level. Read a textbook.

John Kane
Kingston ON Canada

-----Original Message-----
From: bal.chan...@gmail.com
Sent: Thu, 29 Aug 2013 15:57:29 +0530
To: r-help@r-project.org
Subject: [R] Few doubts about ANOVA

Hi, can you please give a brief explanation of ANOVA? What is the purpose of the null hypothesis in ANOVA? How can we find future predicted values from existing data?
Re: [R] Help R
As said by arun, the code is not clear. Ma Teresa, what is it that you actually want to do?

Regards,
José
Re: [R] Plotting time vs number
Please use dput() to supply data. It's a lot easier for readers to just copy and paste into R. I have no idea what variables are associated with the columns below.

John Kane
Kingston ON Canada

-----Original Message-----
From: mohan.radhakrish...@polarisft.com
Sent: Thu, 29 Aug 2013 09:49:36 +0530
To: jholt...@gmail.com
Subject: Re: [R] Plotting time vs number

Hi,

plot(strptime(data$Time,"%H:%M:%S"),data$Kbytes,pch=0,type="b",col = "red", col.axis="red", ylab="", xlab="",las=2,lwd=2.5,cex.axis=1.5)
title("",cex.main=3,xlab="Seconds", line=5.2,ylab="Kbytes", cex.lab=2,1)

Hope I am not simplifying this in a bad way. These lines plot everything properly except the number of labels on the x-axis. The points are all there but the x-axis labels are not. The only labels on the graph are '12:30', '13:30' and '14:30'. I think I need to use your code to get all the values.

13:18:45 2691296 1601996 1584936
13:20:25 2691296 1603548 1586488
13:22:05 2691296 1603556 1586496
13:23:45 2691296 1606760 1589700
13:25:25 2691296 1611020 1593960
13:27:05 2691296 1614348 1597288
13:28:45 2691296 1614356 1597296
13:30:25 2691296 1614380 1597320
13:32:05 2691296 1614388 1597328
13:33:45 2691296 1614392 1597332
13:35:25 2691296 1614408 1597352
13:37:05 2691296 1614416 1597356
13:38:45 2691296 161 1597384
13:40:26 2691296 1614624 1597564
13:42:06 2691296 1614716 1597660
13:43:46 2691296 1614740 1597680
13:45:26 2691296 1614744 1597684
13:47:06 2756832 1631728 1614668
13:48:46 2756832 1631768 1614708
13:50:26 2756832 1631892 1614832

  Tell me what you want to do, not how you want to do it.

If I don't show working code I don't get any response from the forum. So I need basic code to show how it works :-)

Thanks,
Mohan

From: jim holtman jholt...@gmail.com
To: mohan.radhakrish...@polarisft.com
Cc: Jannis bt_jan...@yahoo.de, R mailing list r-help@r-project.org, r-help-boun...@r-project.org
Date: 08/29/2013 01:32 AM
Subject: Re: [R] Plotting time vs number

What you need to do is to create the plot without an x-axis (xaxt = 'n') and then add your own values on the axis with 'axis':

x <- read.table(text = "  Time Kbytes RSS Dirty_Mode
1 11:42:02 2691296 1599796 1582736
2 11:43:42 2691396 1599804 1582744
3 11:45:22 2691496 1599804 1582744
4 11:47:02 2691596 1599812 1582752
5 11:48:42 2691696 1599816 1582756
6 11:50:22 2691796 1599820 1582760", as.is = TRUE, header = TRUE)
x$tod <- as.POSIXct(paste('2013-08-28', x$Time))
plot(x$tod,x$Kbytes,type="b",col = "blue", ylab="", xaxt = 'n', xlab="",las=2,lwd=2.5, lty=1,cex.axis=1.5)
# now plot your times
axis(1, at = x$tod, labels = x$Time, las = 2)

Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Wed, Aug 28, 2013 at 8:35 AM, mohan.radhakrish...@polarisft.com wrote:

Hi,

plot(strptime(data$Time,"%H:%M:%S"),data$Kbytes,type="l",col = "blue", ylab="", xlab="",las=2,lwd=2.5, lty=1,cex.axis=1.5)

The strptime function draws a proper graph, but now not all the time values are on the x-axis.

1 11:42:02 2691296
2 11:43:42 2691396
3 11:45:22 2691496
4 11:47:02 2691596
5 11:48:42 2691696

I mean that each time value is not shown. It shows only a few values. Each individual pair is not plotted.

Thanks.

From: mohan.radhakrish...@polarisft.com
To: Jannis bt_jan...@yahoo.de
Cc: r-help@r-project.org, r-help-boun...@r-project.org
Date: 08/28/2013 05:39 PM
Subject: Re: [R] Plotting time vs number
Sent by: r-help-boun...@r-project.org

Hi Jannis,

I have tried that. It doesn't work. The jumps are not there in my other graphs using numbers. Does this have anything to do with time series? Can I just convert this time representation into milliseconds and plot the graph? The x-axis should still show this time format though (names.arg?).

this.dir <- dirname(parent.frame(2)$ofile)
setwd(this.dir)
data = read.table("D:\\Log analysis\\pmapdata-node1.1",header=F)
colnames(data) <- c("Time","Kbytes","RSS","Dirty Mode")
png("pmapanalysis4705.png", width = 2224, height = 768)
par(mar=c(5, 6, 5, 8) + 0.1)
plot(data$Time,names.org="Test",data$Kbytes,type="b",col = "blue", ylab="", xlab="",las=2,lwd=2.5, lty=1,cex.axis=1.5)
box()
dev.off()

From: Jannis bt_jan...@yahoo.de
To: r-help@r-project.org
Date: 08/28/2013 05:32 PM
Subject: Re: [R] Plotting time vs number
Sent by: r-help-boun...@r-project.org

Hi Mohan,

I am not sure whether I understand your question correctly. Without being able to easily reproduce your plot, I would guess that the breaks come from the type='b' option you chose. When you use type='l', the line would be continuous (though the jumps
Re: [R] Narrowing values collected from .txt file
On Thu, Aug 29, 2013 at 5:40 AM, jim holtman jholt...@gmail.com wrote:

  Here is how I would do it since you are reading in the entire file. This breaks on each Flow Budget section, extracts the RECHARGE values and puts them in a list with the name of the Flow Budget:

I learned more R in studying your solution than I could have in a week devoted to googling R. Thank you for making short work of the problem.

  What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.

To answer your question, I'm simply trying to plot various components of the flow budget (e.g., recharge, lake seepage) through time for any zone that I name. For example, I tried altering your solution to restrict the retrieved output to zone 2 only:

indx <- grep("Flow Budget for Zone 2| RECHARGE =", input)

But this was wholly unsatisfactory because I got recharge for all the other zones as well:

 [1] RECHARGE = 0.12898E+06
 [2] RECHARGE =0.
 [3] Flow Budget for Zone 2 at Time Step 1 of Stress Period 2
 [4] RECHARGE = 0.27416E+06
 [5] RECHARGE =0.
 [6] RECHARGE =81084.
 [7] RECHARGE =0.
 [8] RECHARGE =45295.
 [9] RECHARGE =0.
[10] RECHARGE =71834.
[11] RECHARGE =0.
[12] RECHARGE =97739.
[13] RECHARGE =0.
[14] RECHARGE = 0.12100E-01
[15] RECHARGE =0.
[16] RECHARGE =25350.
[17] RECHARGE =0.
[18] RECHARGE =6167.2
[19] RECHARGE =0.
[20] RECHARGE =28608.

My thinking at this point is to amend your original solution to account for composite zones:

indx <- grep("Flow Budget for Zone|Flow Budget for Composite Zone|RECHARGE =", input)

and from this extract zone 2's RECHARGE, or composite zone 10's LAKE SEEPAGE, etc. Leaving the rest of the R code as is, I get processed results not unlike what you showed, only with composite zones taken into account (shown below). So from the output as I now have it, how does one search this form of output for zone 2, or composite zone 10?

The final step of what I would like to do is then to plot a time series of RECHARGE (not including UZF RECHARGE) for Zone 2, or plot LAKE SEEPAGE for Composite Zone 10, over all 574 stress periods.

result[1:100]
$` Flow Budget for Zone 1 at Time Step 1 of Stress Period 2`
[1] 128980 0
$` Flow Budget for Zone 2 at Time Step 1 of Stress Period 2`
[1] 274160 0
$` Flow Budget for Zone 3 at Time Step 1 of Stress Period 2`
[1] 81084 0
$` Flow Budget for Zone 4 at Time Step 1 of Stress Period 2`
[1] 45295 0
$` Flow Budget for Zone 5 at Time Step 1 of Stress Period 2`
[1] 71834 0
$` Flow Budget for Zone 6 at Time Step 1 of Stress Period 2`
[1] 97739 0
$` Flow Budget for Zone 7 at Time Step 1 of Stress Period 2`
[1] 0.0121 0.
...
$` Flow Budget for Zone 94 at Time Step 1 of Stress Period 2`
[1] 0 0
$` Flow Budget for Zone 95 at Time Step 1 of Stress Period 2`
[1] 0 0
$` Flow Budget for Zone 96 at Time Step 1 of Stress Period 2`
[1] 0 0
$` Flow Budget for *Composite Zone* CZ001 at Time Step 1 of Stress Period 2`
[1] 587810 0
$` Flow Budget for *Composite Zone* CZ002 at Time Step 1 of Stress Period 2`
[1] 725030 0
$` Flow Budget for *Composite Zone* CZ003 at Time Step 1 of Stress Period 2`
[1] 1312800 0
...
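One possible way to pull a single zone out of a named list like the `result` shown above is to match on the list names. The sketch below uses a small made-up stand-in for `result`. The helper name `pick_zone`, the sample values, and the choice of the first number in each element as the recharge of interest are all assumptions for illustration.

```r
# Hypothetical stand-in for 'result': names are section headers,
# values are the two RECHARGE numbers found in each section
result <- setNames(
  list(c(274160, 0), c(5, 0), c(281000, 0), c(1, 0)),
  c(" Flow Budget for Zone 2 at Time Step 1 of Stress Period 2",
    " Flow Budget for Zone 20 at Time Step 1 of Stress Period 2",
    " Flow Budget for Zone 2 at Time Step 1 of Stress Period 3",
    " Flow Budget for Composite Zone CZ010 at Time Step 1 of Stress Period 2"))

pick_zone <- function(res, zone) {
  # Including the trailing " at " keeps "Zone 2" from matching "Zone 20"
  pat  <- paste0("Flow Budget for ", zone, " at ")
  keep <- grepl(pat, names(res), fixed = TRUE)
  # Stress period number is whatever follows "Stress Period" in the name
  period <- as.integer(sub(".*Stress Period\\s+", "", names(res)[keep]))
  data.frame(period   = period,
             recharge = unname(vapply(res[keep], `[`, numeric(1), 1)))
}

z2 <- pick_zone(result, "Zone 2")
print(z2)
```

With the real list, something like `plot(z2$period, z2$recharge, type = "b")` would then give the time series; a composite zone would be requested as, for example, `pick_zone(result, "Composite Zone CZ010")`, provided the names match that wording exactly.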
Re: [R] scale breaks
Hello all,

I have decided to go ahead with gap.boxplot. I am trying to suppress the axis labels, both x and y. I tried using axis.labels=NULL but it did not work.

gap.boxplot(DATA$Conductivity~factor(DATA$UnitName_1),ylim=c(LOWER_Y_Conductivity,UPPER_Y_Conductivity_int),gap=gap_Conductivity, axes=FALSE,col=colours,outwex=one,whisklty = "solid",whisklwd=lwth,outcol= "black", outpch=dtsym, outcex=dtsize, axis.labels=NULL,range=1.5)

I would also like to display a y-axis value in the upper box, but I am unable to do that and am wondering whether it is possible with this package. Is it also possible to remove the horizontal lines of the upper and lower boxes and replace the gap symbol with axis.break on the y-axis instead?

Any advice would be greatly appreciated!

Thanks

On Thu, Aug 29, 2013 at 9:38 AM, Shane Carey careys...@gmail.com wrote:

  Ok, thanks all :-)

On Thu, Aug 29, 2013 at 2:39 AM, Jim Lemon j...@bitwrit.com.au wrote:

  On 08/29/2013 02:52 AM, Shane Carey wrote:

    Hi, Has anyone ever created scale breaks in R, something like what is shown here in the section "Use a Scale Break"?
    http://www.r-bloggers.com/graphing-highly-skewed-data/
    Thanks

  Hi Shane,
  As Sarah answered, axis.break in the plotrix package is a start. gap.barplot (also in plotrix) does the whole thing. If they won't give you lunch until you do it that way, like Sarah I say, "Go for it".
  Jim

--
Shane
Re: [R] scale breaks
Regarding "I would also like to display a y-axis value in the upper box": I got this part working now.

--
Shane
Re: [R] Help for a function
Hello,

You should post your questions to r-help@r-project.org; the odds of getting more and better answers are greater. As for the question, try the following. Note that the functions now have an extra argument.

incub <- function(x, n = 2){
    x$Incubation <- 0
    x$Incubation[1] <- x$Symptomes[1]
    if(nrow(x) >= n)
        x$Incubation[2] <- sum(x$Symptomes[seq_len(n)])
    for(i in seq_len(nrow(x))[-seq_len(n)])
        x$Incubation[i] <- sum(x$Symptomes[i - (seq_len(n) - 1)])
    x
}

contag <- function(x, n = 7){
    x$CONTAGIEUX <- 0
    for(i in 1:min(nrow(x), n))
        x$CONTAGIEUX[i] <- sum(x$Symptomes[1:i], na.rm = TRUE)
    for (i in seq_len(nrow(x))[-seq_len(n)]) {
        x$CONTAGIEUX[i] <- x$Symptomes[i] + x$CONTAGIEUX[i-1] - x$Symptomes[i-n]
    }
    x
}

incub_ARGENTINA <- incub(ARGENTINA, 2)
incub_ARGENTINA
contag_ARGENTINA <- contag(ARGENTINA, 7)
contag_ARGENTINA
derdata_ARGENTINA <- merge(contag_ARGENTINA, incub_ARGENTINA)
derdata_ARGENTINA

Hope this helps,

Rui Barradas

On 29-08-2013 08:31, teko maurice wrote:

Dear Rui, Long time! I came to ask for advice and help if you have time. I am working on my PhD, developing tools to model a pandemic. I posted on R-help but nobody answered me; maybe it is too specific. So I am coming back to you in case you can help me.

Hello all, I have a dataset like this for a pandemic virus.

         DATE Algeria Antigua.and.Barbuda ARGENTINA AUSTRALIA AUSTRIA Bahamas
1  2009-04-24       0                   0         0         0       0       0
2  2009-04-26       0                   0         0         0       0       0
3  2009-04-27       0                   0         0         0       0       0
4  2009-04-28       0                   0         0         0       0       0
5  2009-04-29       0                   0         0         1       0       0
6  2009-04-30       0                   0         0         1       0       0
7  2009-05-01       0                   0         0         1       0       0
8  2009-05-02       0                   0         0         1       0       0
9  2009-05-03       0                   0         0         1       0       0
10 2009-05-04       0                   0         0         1       0       0
11 2009-05-05       0                   0         0         1       0       0
12 2009-05-06       0                   0         0         1       0       0
13 2009-05-07       0                   0         0         1       0       0
14 2009-05-08       0                   0         0         1       0       0
15 2009-05-09       0                   0         1         2       0       0
16 2009-05-10       0                   0         1         2       0       0
17 2009-05-11       0                   0         1         1       1       0
18 2009-05-12       0                   0         1         1       1       0
19 2009-05-13       0                   0         1         1       1       0
20 2009-05-14       0                   0         1         1       1       0
21 2009-05-15       0                   0         1         1       1       0
22 2009-05-16       0                   0         1         1       1       0
23 2009-05-17       0                   0         1         1       1       0
24 2009-05-18       0                   0         1         1       1       0
25 2009-05-19       0                   0         1         1       1       0
26 2009-05-20       0                   0         1         1       1       0
27 2009-05-21       0                   0         1         3       1       0
28 2009-05-22       0                   0         1         7       1       0
29 2009-05-23       0                   0         1        12       1       0
30 2009-05-25       0                   0         2        16       1       0
31 2009-05-26       0                   0         5        19       1       0
32 2009-05-27       0                   0        19        39       1       0
33 2009-05-29       0                   0        37       147       1       0
34 2009-06-01       0                   0       100       297       1       1
35 2009-06-03       0                   0       131       501       1       1
36 2009-06-05       0                   0       147       876       2       1
37 2009-06-08       0                   0       202      1051       5       1
38 2009-06-10       0                   0       235      1224       5       2
39 2009-06-11       0                   0       256      1307       7       1
40 2009-06-12       0                   0       343      1307       7       1
41 2009-06-15       0                   0       343      1823       7       1
42 2009-06-17       0                   0       733      2112       7       2
43 2009-06-19       0                   0       918      2199       8       2
44 2009-06-22       1                   0      1010      2436       9       2
45 2009-06-24       3                   2      1213      2857      12       6
46 2009-06-26       2                   2      1391      3280      12       4
47 2009-06-29       2
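The trailing-window sums computed by the loops in the reply above (the last n values of Symptomes, with shorter windows at the start) can also be written without a loop using cumsum(). This is a sketch, not the code from the thread: the helper name `roll_sum` and the sample counts are made up, and unlike contag() it makes no attempt to handle NA values.

```r
# Trailing-window sum: element i is the sum of x[max(1, i - n + 1):i]
roll_sum <- function(x, n) {
  cs <- cumsum(x)
  # cumulative sum shifted down by n positions, padded with zeros
  lagged <- c(rep(0, n), cs)[seq_along(cs)]
  cs - lagged
}

cases <- c(0, 1, 1, 2, 5, 19)   # made-up daily counts
print(roll_sum(cases, 2))       # window of 2, as in incub(x, 2)
print(roll_sum(cases, 7))       # window longer than the data: plain cumsum
```

Under these assumptions, `x$Incubation <- roll_sum(x$Symptomes, 2)` and `x$CONTAGIEUX <- roll_sum(x$Symptomes, 7)` would give the same columns for NA-free input.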
[R] A question about multivariate normal distribution with a diagonal covariance matrix
Hi all R users:

I am a little bit confused about the following results. See as follows:

library(mvtnorm)
xMean <- c(24.12,66.92,77.65,131.97,158.8)
xVar <- c(0.01,0.06,0.32,0.18,0.95)
xFloor <- floor(xMean)

# use "mvtnorm" package
p1 <- dmvnorm(xFloor,mean=xMean,sigma=diag(xVar))
p2 <- dmvnorm(xFloor[1],mean=xMean[1],sigma=matrix(xVar[1]))*dmvnorm(xFloor[2],mean=xMean[2],sigma=matrix(xVar[2]))*dmvnorm(xFloor[3],mean=xMean[3],sigma=matrix(xVar[3]))

# use the basic package "stats"
p3 <- dnorm(xFloor[1],mean=xMean[1],sd=sqrt(xVar[1]))*dnorm(xFloor[2],mean=xMean[2],sd=sqrt(xVar[2]))*dnorm(xFloor[3],mean=xMean[3],sd=sqrt(xVar[3]))

The result is: p1 = 2.006403e-05, p2 = p3 = 0.00099646.

My question is why p1 does not equal p2 when the covariance matrix is diagonal, meaning no correlation among the variates. From p2 = p3, it seems that the mvtnorm package agrees well with the basic R package. Any explanation will be greatly appreciated.

Thanks in advance!

David
[R] Add new calculated column to data frame
Hi,

I have the following data set:

id  event   time (in sec)
1   add     1373502892
2   add     1373502972
3   delete  1373502995
4   view    1373503896
5   add     1373503996
...

I'd like to add a new column "time on task", which is the time elapsed between two events (id2 - id1...). What would be the best approach to do that?

Thanks,
Srecko
Re: [R] A question about multivariate normal distribution with a diagonal covariance matrix
On 29/08/2013 1:37 PM, Marino David wrote:

  My question is why p1 does not equal p2 when the covariance matrix is diagonal, meaning no correlation among the variates. [...] Any explanation will be greatly appreciated.

Why would you expect p1 = p2? p1 is the density in 5 dimensions; p2 is only the first 3 components.

Duncan Murdoch
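The point made in the reply can be checked with base R alone, because with a diagonal covariance matrix the joint normal density is the product of the univariate densities. The sketch below reuses the numbers from the question; only the variable names `p_all5` and `p_first3` are new.

```r
xMean  <- c(24.12, 66.92, 77.65, 131.97, 158.8)
xVar   <- c(0.01, 0.06, 0.32, 0.18, 0.95)
xFloor <- floor(xMean)

# Five univariate normal densities, one per component
d <- dnorm(xFloor, mean = xMean, sd = sqrt(xVar))

p_all5   <- prod(d)        # all 5 components: comparable to p1 in the question
p_first3 <- prod(d[1:3])   # first 3 components only: what p2 and p3 compute

print(c(p_all5 = p_all5, p_first3 = p_first3))
```

The two products should come out close to the 2.006403e-05 and 0.00099646 reported in the question, so the discrepancy is explained by the two missing factors rather than by any disagreement between mvtnorm and stats.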
Re: [R] A question about multivariate normal distribution with a diagonal covariance matrix
You got the point. Thank you for pointing out the problem. Thanks again.

David
Re: [R] Add new calculated column to data frame
Hi,

Try:

dat1 <- read.table(text="
id event time
1 add 1373502892
2 add 1373502972
3 delete 1373502995
4 view 1373503896
5 add 1373503996
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat1$time_on_task <- c(NA,diff(dat1$time))
dat1
#  id  event       time time_on_task
#1  1    add 1373502892           NA
#2  2    add 1373502972           80
#3  3 delete 1373502995           23
#4  4   view 1373503896          901
#5  5    add 1373503996          100

# Not sure whether this depends on the values of event or not.

A.K.
Re: [R] Add new calculated column to data frame
Thanks Arun, this is great. However, it should be just a little bit different:

#  id  event       time time_on_task
#1  1    add 1373502892           80
#2  2    add 1373502972           23
#3  3 delete 1373502995          901
#4  4   view 1373503896          100
#5  5    add 1373503996           NA

When I calculate the difference, I need to know how long each activity lasted. It is id2 - id1 for the first activity...
Re: [R] Add new calculated column to data frame
Hi Arun,

There is one more question... In one of my previous questions you explained how to use split(dat1, cumsum(dat1$action == "login")), and that is great. Now, if I have something like this:

id  module  event   time        time_on_task
1   sys     login   1373502892  80
2   task    add     1373502892  80
3   task    add     1373502972  23
4   sys     login   1373502892  80
5   list    delete  1373502995  901
6   list    view    1373503896  100
7   task    add     1373503996  NA

I know how to split at each login occurrence, and I know how to add a new column with time differences. But how do I add a new column "category" that is calculated from the columns module and event? For example, if module = task and event = add, then category = A...

Srecko

On Thu, Aug 29, 2013 at 11:22 AM, arun smartpink...@yahoo.com wrote:

  Hi Srecko, No problem. Regards, Arun

  From: srecko joksimovic sreckojoksimo...@gmail.com
  Sent: Thursday, August 29, 2013 2:22 PM

  Sorry... I should have figured it out... thanks so much! Srecko

On Thu, Aug 29, 2013 at 11:21 AM, arun smartpink...@yahoo.com wrote:

  Hi, The one you showed is:

  dat1$time_on_task <- c(diff(dat1$time),NA)
  dat1
  #  id  event       time time_on_task
  #1  1    add 1373502892           80
  #2  2    add 1373502972           23
  #3  3 delete 1373502995          901
  #4  4   view 1373503896          100
  #5  5    add 1373503996           NA
Re: [R] Add new calculated column to data frame
On 29-08-2013, at 20:15, srecko joksimovic sreckojoksimo...@gmail.com wrote:

Thanks Arun, this is great. However, it should be just a little bit different:

#  id  event       time time_on_task
#1  1    add 1373502892           80
#2  2    add 1373502972           23
#3  3 delete 1373502995          901
#4  4   view 1373503896          100
#5  5    add 1373503996           NA

When I calculate difference, I need to know how long each activity was. It is id2-id1 for the first activity...

Then why don't you try

dat1$time_on_task <- c(diff(dat1$time), NA)

Berend
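The two variants discussed in this thread differ only in where the NA goes. A minimal sketch, assuming a hand-built copy of the example data (the column name since_previous is made up for illustration):

```r
# Hypothetical copy of the example data from the thread
dat1 <- data.frame(
  id    = 1:5,
  event = c("add", "add", "delete", "view", "add"),
  time  = c(1373502892, 1373502972, 1373502995, 1373503896, 1373503996),
  stringsAsFactors = FALSE
)

# Time elapsed since the previous event: the first row has no predecessor
dat1$since_previous <- c(NA, diff(dat1$time))

# Duration of the current event (time until the next one): the last row is open-ended
dat1$time_on_task <- c(diff(dat1$time), NA)

dat1
```

diff() returns one value fewer than its input, so the NA only pads the result back to the number of rows; putting it last attributes each difference to the event that started the interval.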
Re: [R] Add new calculated column to data frame
Thanks Berend, I don't know why I didn't try that before posting the question... but... anyways, thanks for your help

Srecko
Re: [R] Add new calculated column to data frame
Hi,
You could try this:

dat1 <- read.table(text="
id module event time time_on_task
1 sys login 1373502892 80
2 task add 1373502892 80
3 task add 1373502972 23
4 sys login 1373502892 80
5 list delete 1373502995 901
6 list view 1373503896 100
7 task add 1373503996 NA
", sep="", header=TRUE, stringsAsFactors=FALSE)
dat1$Categ <- as.character(factor(with(dat1, paste(module, event, sep="_")), levels=c("task_add","sys_login","list_delete","list_view"), labels=LETTERS[1:4]))
dat1
#  id module  event       time time_on_task Categ
#1  1    sys  login 1373502892           80     B
#2  2   task    add 1373502892           80     A
#3  3   task    add 1373502972           23     A
#4  4    sys  login 1373502892           80     B
#5  5   list delete 1373502995          901     C
#6  6   list   view 1373503896          100     D
#7  7   task    add 1373503996           NA     A
A.K.
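If the module/event/category combinations already live in a table of their own, the same mapping can be done with a lookup instead of hand-written levels and labels. This is only a sketch: the categories data frame and the key variables are hypothetical names, not part of the thread.

```r
dat1 <- data.frame(
  id     = 1:7,
  module = c("sys", "task", "task", "sys", "list", "list", "task"),
  event  = c("login", "add", "add", "login", "delete", "view", "add"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one row per module/event combination
categories <- data.frame(
  module = c("task", "sys", "list", "list"),
  event  = c("add", "login", "delete", "view"),
  Categ  = c("A", "B", "C", "D"),
  stringsAsFactors = FALSE
)

# Build the same composite key on both sides and look each row up
key_dat <- paste(dat1$module, dat1$event, sep = "_")
key_cat <- paste(categories$module, categories$event, sep = "_")
dat1$Categ <- categories$Categ[match(key_dat, key_cat)]

dat1
```

Combinations missing from the lookup table come back as NA, which makes unmapped events easy to spot.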
Re: [R] calculate with different columns from different datasets
Hi,
Try:

dat1 <- read.table(text="
V1 V2 V3
2 6 8
4 3 4
1 9 8
", sep="", header=TRUE)
dat2 <- read.table(text="
V1 V2 V3
6 8 4
2 0 7
8 1 3
", sep="", header=TRUE)
res1 <- as.matrix(dat1-dat2)
res1
#     V1 V2 V3
#[1,] -4 -2  4
#[2,]  2  3 -3
#[3,] -7  8  5
res2 <- t(t(dat1)-colMeans(dat2))
res2
#        V1 V2     V3
#[1,] -3.33  3  3.333
#[2,] -1.33  0 -0.667
#[3,] -4.33  6  3.333
A.K.

Hi there,

I've got two datasets of the following form (just an example, the real dataset has a lot more columns):

dataset1
V1 V2 V3
2 6 8
4 3 4
1 9 8

and dataset2
V1 V2 V3
6 8 4
2 0 7
8 1 3

First, I'd like to calculate the following: V1 from dataset1 minus V1 from dataset2, then V2 from dataset1 minus V2 from dataset2, and so on (always Vn-Vn, where n=1,2,...,n), and save the solution vectors in a new matrix.

Second, I'd like to run other functions over the two matching columns (for example: V1 from dataset1 minus mean(V1) from dataset2, V2 from dataset1 minus mean(V2) from dataset2, ...).

So I'm looking for a simple solution that always takes the matching columns from the different datasets, so that I can then just change the function applied to the two.

Thank you for your help! Kind regards
Re: [R] calculate with different columns from different datasets
Hi,
Try:

res <- sapply(seq_len(ncol(dat1)), function(i) setNames(((1-coef(lm(dat1[,i]~dat2[,i]))[2])^2)*var(dat2[,i]), NULL))
res
#[1] 21.0 16.11842 18.69231
A.K.

Thank you for your answer. But further calculations will be much more difficult, like (1-b)^2 * Var(V1) for all matching columns, where b is the slope from a regression of V1 (from dataset 1) on V1 (from dataset 2) and Var(V1) is the variance of V1 (from dataset 2). So what I'm looking for is something like a loop function...
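For the "something like a loop function" request, mapply() can walk the matching columns of both data frames in parallel, so only the helper needs to change from one calculation to the next. A sketch under the assumption that both data frames have the same columns in the same order; slope_stat is a made-up helper name:

```r
dat1 <- data.frame(V1 = c(2, 4, 1), V2 = c(6, 3, 9), V3 = c(8, 4, 8))
dat2 <- data.frame(V1 = c(6, 2, 8), V2 = c(8, 0, 1), V3 = c(4, 7, 3))

# Hypothetical helper: (1 - b)^2 * Var(x), where b is the slope of y regressed on x
slope_stat <- function(y, x) {
  b <- unname(coef(lm(y ~ x))[2])
  (1 - b)^2 * var(x)
}

# A data frame is a list of columns, so mapply() pairs V1 with V1, V2 with V2, ...
res <- mapply(slope_stat, dat1, dat2)
res
```

Swapping in another two-argument function, for example function(y, x) y - mean(x), reuses the same call; when the function returns a vector per column pair, mapply() binds the results into a matrix.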
[R] spacing problem in main title using car package scatterplot
Hi All,

I'm using R 3.0.0. I'm trying to add the sample size of the paired data (calculated by a function n(), which returns a value of 70, correctly) to the main title. My main title works fine except that the '70' appears far to the right on the line, as in:

at Month 18 (N=          70)

Is there a way of left-justifying the result of .(ss), or some other way of removing the whitespace between N= and 70? Thanks for any suggestions.

Gerard

library(car)
data <- read.csv("//users//smits//r_work//data.csv", header = TRUE)
attach(data)
ss <- n(m18_das28*b_score)
scatterplot(m18_das28~b_score, jitter=list(x=1, y=1), grid=F, smooth=F, las=1, pch=c(1), col='blue',
  main=bquote(paste("Hypothesis 9.4.1\nBaseline XYZ with Disease Activity (DAS28)\nat Month 18 (N=", .(ss), ")")),
  xlab="Baseline XYZ", ylab="Month 18 DAS28", legend.plot=F)
Re: [R] Help If
In addition to the other suggestions, try typing help('')

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
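For reference, a runnable sketch of the condition asked about in this thread. The four condition values are placeholders chosen for illustration; help("if") and help("&&") open the base R pages on control flow and logical operators.

```r
# Placeholder conditions, chosen only to make the example runnable
condition1 <- TRUE
condition2 <- FALSE
condition3 <- TRUE
condition4 <- TRUE

# && is "and", || is "or"; both expect a single TRUE/FALSE on each side
msg <- if ((condition1 && condition2) || (condition3 && condition4)) {
  "uhvef"
} else {
  "no match"
}
print(msg)
```

The single-character forms & and | work element by element on whole vectors and are meant for subsetting rather than for if().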
Re: [R] spacing problem in main title using car package scatterplot
Dear Gerard,

Without your data, it's not possible to reproduce your problem exactly, but it's clear that it isn't specific to the scatterplot() function in the car package. For example, try

plot(1:10)
title(main=bquote(paste("Hypothesis 9.4.1\nBaseline XYZ with Disease Activity (DAS28)\nat Month 18 (N=", 100, ")")), adj=0)

You should be able to adapt the following solution:

plot(1:10)
mtext("Hypothesis 9.4.1\nBaseline XYZ with Disease Activity (DAS28)", side=3, line=2)
mtext(paste("at Month 18 (N=", 100, ")", sep=""), side=3, line=1)

I hope this helps,
John

---
John Fox
McMaster University
Hamilton, Ontario, Canada
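Another option, offered as a sketch rather than as what either poster used, is to build the title as an ordinary character string. Base graphics break plain strings at "\n", whereas expressions built with bquote() go through plotmath, which is documented not to interpret control characters such as "\n". The value 70 stands in for the sample size computed in the question.

```r
ss <- 70  # stand-in for the sample size computed in the original question

# sprintf() drops the number into the string with no padding around it
main_title <- sprintf(
  "Hypothesis 9.4.1\nBaseline XYZ with Disease Activity (DAS28)\nat Month 18 (N=%d)",
  ss
)

plot(1:10, main = main_title)
```

The same string can be passed as main= to other plotting functions that forward it to title(), at the cost of giving up plotmath features such as superscripts.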
Re: [R] Add new calculated column to data frame
Hi,
It's not really clear, but you can try this:

dat1 <- read.table(text="
id module event time time_on_task Categ url
1 sys login 1373502892 80 B http://post/add?id=42&idp=45
2 task add 1373502892 80 A http://post/add?id=33&idp=45
3 task add 1373502972 23 A http://post/add?id=34&idp=45
4 sys login 1373502892 80 B http://post/add?id=39&idp=42
5 list delete 1373502995 901 C http://post/add?id=37&idp=41
6 list view 1373503896 100 D http://post/add?id=36&idp=46
7 task add 1373503996 NA A http://post/add?id=31&idp=45
", sep="", header=TRUE, stringsAsFactors=FALSE)
vec1 <- as.numeric(gsub(".*\\?.*=(\\d+)\\&.*", "\\1", dat1$url[dat1$Categ=="A"]))
vec1
#[1] 33 34 31
dat2 <- read.table(text="
id idpost idtopic iduser
1 45 33 101
2 46 34 102
3 47 33 103
4 48 33 101
5 49 35 104
", sep="", header=TRUE)
dat1$Categ[dat1$Categ=="A"][!vec1 %in% dat2$idtopic] <- "F"
dat1
#  id module  event       time time_on_task Categ                          url
#1  1    sys  login 1373502892           80     B http://post/add?id=42&idp=45
#2  2   task    add 1373502892           80     A http://post/add?id=33&idp=45
#3  3   task    add 1373502972           23     A http://post/add?id=34&idp=45
#4  4    sys  login 1373502892           80     B http://post/add?id=39&idp=42
#5  5   list delete 1373502995          901     C http://post/add?id=37&idp=41
#6  6   list   view 1373503896          100     D http://post/add?id=36&idp=46
#7  7   task    add 1373503996           NA     F http://post/add?id=31&idp=45
A.K.

From: srecko joksimovic sreckojoksimo...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Thursday, August 29, 2013 5:38 PM
Subject: Re: [R] Add new calculated column to data frame

Hi Arun,

I really appreciate your help, and we did a great job :) but now I think that R can do anything, so I'd like to try one more thing, if you don't mind... From the table with categories,

#  id module event time time_on_task Categ url
#1 1 sys login 1373502892 80 B http:
#2 2 task add 1373502892 80 A http:
#3 3 task add 1373502972 23 A http:
#4 4 sys login 1373502892 80 B http:
#5 5 list delete 1373502995 901 C
#6 6 list view 1373503896 100 D
#7 7 task add 1373503996 NA A

I'd like to use only a certain category (for example A). Each of these rows has a URL whose format is something like http://post/add?id=33&idp=45. The first step would be to extract this id (33 in this case). Based on that value, I want to find all iduser values from the following table:

id idpost idtopic iduser
1 45 33 101
2 46 34 102
3 47 33 103
4 48 33 101
5 49 35 104

The next step would be to check whether at least one of these values (iduser) is not in the vector users (ids only). If that is the case, I want to change the category to F; if not, I want to keep the same category.

If this is too much for one question, I'll implement this in Java, but I'd really like to try this with R. Maybe the id extraction from the URL is the most important problem... I tried most of these steps, but I am still not able to put them all together...

Thank you so much for your time.
Srecko

On Thu, Aug 29, 2013 at 12:22 PM, arun smartpink...@yahoo.com wrote:

Hi Srecko, No problem. Arun

From: srecko joksimovic sreckojoksimo...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Thursday, August 29, 2013 3:19 PM
Subject: Re: [R] Add new calculated column to data frame

This is great Arun, thank you again. I was thinking of using sqldf and issuing a query for each module-action combination, but this is much better. Since I have a table with categories (module, action, category), I could create the vector levels based on the first two columns and the vector labels based on the category column, and that should do the work...

Best,
Srecko

On Thu, Aug 29, 2013 at 12:16 PM, arun smartpink...@yahoo.com wrote:

Hi Srecko, You didn't mention the order in which the letters are assigned. If you need a different order, just change the order in the levels=c() argument. Arun
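The id extraction can also be written with a pattern that names the parameter explicitly, which may be easier to adapt when the URL carries several parameters. This is a sketch that assumes the URLs look like http://post/add?id=33&idp=45; the urls vector below is made up for illustration.

```r
# Hypothetical URLs in the format described in the thread
urls <- c(
  "http://post/add?id=33&idp=45",
  "http://post/add?id=34&idp=45",
  "http://post/add?id=31&idp=45"
)

# Capture the digits that follow "?id=" or "&id="; "idp=" cannot match
# because the pattern requires "=" directly after "id"
ids <- as.numeric(sub(".*[?&]id=([0-9]+).*", "\\1", urls))
ids
```

sub() returns its input unchanged when the pattern does not match, so URLs without an id parameter would turn into NA (with a warning) in as.numeric() rather than into a wrong number.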
Re: [R] Add new calculated column to data frame
Hi Arun,

this could do the work... Thanks so much!
Re: [R] Add new calculated column to data frame
Hi Srecko,
Try this:

dat1 <- read.table(text="
id module event time time_on_task Categ url
1 sys login 1373502892 80 B http://
2 task add 1373502892 80 A http://post/add?id=33&idp=67
3 task add 1373502972 23 A http://post/add?id=34&idp=67
4 sys login 1373502892 80 B http://
5 list delete 1373502995 901 C http://
6 list view 1373503896 100 D http://
7 task add 1373503996 NA A http://post/add?id=35&idp=99
", sep="", header=TRUE, stringsAsFactors=FALSE)
vec1 <- as.numeric(gsub(".*\\?.*=(\\d+)\\&.*", "\\1", dat1$url[dat1$Categ=="A"]))
dat2 <- read.table(text="
id idpost idtopic iduser
1 45 33 101
2 46 34 102
3 47 33 103
4 48 33 101
5 49 35 104
", sep="", header=TRUE)
student_list <- c(101:102, 104:107)
vec2 <- with(dat2, tapply(iduser, list(idtopic), FUN=function(x) all(x %in% student_list)))
dat1$Categ[dat1$Categ=="A"][match(vec1, as.numeric(names(vec2)))[!vec2]] <- "F"
dat1
#  id module  event       time time_on_task Categ                          url
#1  1    sys  login 1373502892           80     B                      http://
#2  2   task    add 1373502892           80     F http://post/add?id=33&idp=67
#3  3   task    add 1373502972           23     A http://post/add?id=34&idp=67
#4  4    sys  login 1373502892           80     B                      http://
#5  5   list delete 1373502995          901     C                      http://
#6  6   list   view 1373503896          100     D                      http://
#7  7   task    add 1373503996           NA     A http://post/add?id=35&idp=99
A.K.

From: srecko joksimovic sreckojoksimo...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Thursday, August 29, 2013 6:04 PM
Subject: Re: [R] Add new calculated column to data frame

"Did you mean to separate the number 33 from the link?" Yes, that is correct. It should be something like this:

#  id module event time time_on_task Categ url
#1 1 sys login 1373502892 80 B http://
#2 2 task add 1373502892 80 A http://post/add?id=33&idp=67
#3 3 task add 1373502972 23 A http://post/add?id=34&idp=67
#4 4 sys login 1373502892 80 B http://
#5 5 list delete 1373502995 901 C http://
#6 6 list view 1373503896 100 D http://
#7 7 task add 1373503996 NA A http://post/add?id=35&idp=99

From this table I should get 3 rows with 3 URLs: http://post/add?id=33&idp=67, http://post/add?id=34&idp=67, and http://post/add?id=35&idp=99. For each of them, I need to extract the id (33, 34, and 35). Once I do that, I need to obtain the users from this table:

id idpost idtopic iduser
1 45 33 101
2 46 34 102
3 47 33 103
4 48 33 101
5 49 35 104

again, for each id. This means:

id = 33 => 101, 103
id = 34 => 102
id = 35 => 104

Next, for each vector I need to check whether or not all its values are in the students list (101, 102, 104, 105, 106, 107):

id = 33 => FALSE (since 103 is not in the list)
id = 34 => TRUE
id = 35 => TRUE

This means that the category for row 2 in the first table is not A any more, but F...

Thanks,
Srecko

On Thu, Aug 29, 2013 at 2:56 PM, arun smartpink...@yahoo.com wrote:

Hi Srecko, Did you mean to separate the number 33 from the link? Could you provide a reproducible example with the output you expected? Tx. Arun
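The membership check described in this exchange can be sketched on its own, looking topics up by name so that the order of the extracted ids does not have to match the order of the groups. The variables topic_ids, all_students and new_categ are hypothetical names; the data are copied from the example.

```r
dat2 <- data.frame(
  idtopic = c(33, 34, 33, 33, 35),
  iduser  = c(101, 102, 103, 101, 104)
)
student_list <- c(101:102, 104:107)
topic_ids <- c(33, 34, 35)  # ids extracted from the category "A" URLs

# One TRUE/FALSE per topic: are all of its users in student_list?
all_students <- tapply(dat2$iduser, dat2$idtopic,
                       function(x) all(x %in% student_list))

# Look each extracted id up by name, then recode
ok <- all_students[as.character(topic_ids)]
new_categ <- ifelse(ok, "A", "F")
new_categ
```

A topic id that does not occur in dat2 yields NA here, so such rows need a separate decision about which category they should get.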
Re: [R] Add new calculated column to data frame
Thanks, I'll try this as well.

Srecko
[R] Vectorized version of colMeans/rowMeans for higher dimension arrays?
For matrices, colMeans/rowMeans are quick, vectorized functions. But say I have a higher-dimensional array:

moo <- array(runif(400*9*3), dim=c(400,9,3))

And I want to get the mean along the 2nd dimension. I can, of course, use apply:

moo1 <- apply(moo, c(1,3), mean)

But this is not a vectorized operation (so it doesn't execute as quickly). How would one vectorize this operation (if possible)? Is there an array equivalent of colMeans/rowMeans?

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007
Re: [R] why is this a factor?
On 29/08/13 12:10, Ista Zahn wrote:

On Wed, Aug 28, 2013 at 7:44 PM, Steve Lianoglou lianoglou.st...@gene.com wrote:

Hi,

On Wed, Aug 28, 2013 at 3:58 PM, Ista Zahn istaz...@gmail.com wrote:

Or go all the way and put options(stringsAsFactors = FALSE) at the top of your script or in your .Rprofile. This will prevent this kind of annoyance in the future without having to say stringsAsFactors = FALSE all the time.

I go back and forth about doing this too (setting a global hammer to stringsAsFactors), but then other things might mess up -- imagine a scenario where a package is written with the assumption that the default `stringsAsFactors=TRUE` setting hasn't been changed, which could then break when you go the nuclear-global-override route.

Yes, possibly, but I've yet to have that problem, whereas before I started changing it globally things used to break fairly regularly.

Like Ista I have never had a problem arising from a package's assuming that `stringsAsFactors=TRUE` --- and I would opine that any package making such an assumption is badly written. (Of course there is a lot of bad code out there.)

I have once or twice stumbled over a conundrum in respect of questions posed on r-help where the poster assumed `stringsAsFactors=TRUE`. But I eventually figured out what was going on. (And anyway that's the poster's problem, as far as I'm concerned.)

cheers,
Rolf
Re: [R] Vectorized version of colMeans/rowMeans for higher dimension arrays?
Hi,
You could try:

    res <- colMeans(aperm(moo, c(2,1,3)))
    resOld <- apply(moo, c(1,3), mean)
    identical(res, resOld)
    #[1] TRUE

    #Speed:
    set.seed(285)
    moo1 <- array(runif(1400*9*15), dim=c(1400,9,15))
    system.time({res1 <- colMeans(aperm(moo1, c(2,1,3)))})
    #   user  system elapsed
    #  0.004   0.000   0.002
    system.time({res2 <- apply(moo1, c(1,3), mean)})
    #   user  system elapsed
    #  0.180   0.000   0.178
    identical(res1, res2)
    #[1] TRUE

A.K.

- Original Message - From: Jonathan Greenberg j...@illinois.edu To: r-help r-help@r-project.org Sent: Thursday, August 29, 2013 6:36 PM Subject: [R] Vectorized version of colMeans/rowMeans for higher dimension arrays?
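Why the aperm() trick works can be seen in a small sketch. colMeans() averages over the leading dimension(s) of an array and rowMeans() over the trailing one(s), so moving dimension 2 to the front or to the back lets either function do the job. The object names (ref, viaCol, viaRow) are made up for this illustration, and the rowMeans() variant is an addition that was not part of the reply above.

```r
set.seed(1)
moo <- array(runif(400 * 9 * 3), dim = c(400, 9, 3))

# Reference result: mean over dimension 2 for every (i, k) pair
ref <- apply(moo, c(1, 3), mean)

# colMeans() averages over the leading dimension, so move dim 2 to the front
viaCol <- colMeans(aperm(moo, c(2, 1, 3)))

# rowMeans() with dims = 2 averages over everything after the first two
# dimensions, so move dim 2 to the back instead
viaRow <- rowMeans(aperm(moo, c(1, 3, 2)), dims = 2)

dim(viaCol)
all.equal(ref, viaCol)
all.equal(ref, viaRow)
```

all.equal() is used here rather than identical() because the summation order may differ between the approaches, so bit-for-bit equality is not something to rely on in general.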
Re: [R] why is this a factor?
Hi,

On Thu, Aug 29, 2013 at 3:03 PM, Rolf Turner rolf.tur...@xtra.co.nz wrote:
> Like Ista I have never had a problem arising from a package's assuming that `stringsAsFactors=TRUE` --- and I would opine that any package making such an assumption is badly written. (Of course there is a lot of bad code out there.)

It never happened to me either, except when code that *I* wrote depended on the global option being set to stringsAsFactors=FALSE.

I had to hand over a codebase to a colleague in my lab when I left. Her options(stringsAsFactors) was at the default (TRUE), and things mysteriously broke until we (eventually) sorted out what was the what -- it took a while to find because I *totally* forgot I had set `options(stringsAsFactors=FALSE)` in my ~/.Rprofile several years prior (a testament to how little it breaks things, I guess).

Of course, I can't argue with your premise that code that depends on the defaults (or changed defaults) is, in the end, poorly written code ... sometimes we have to own up to being the ones who write poorly written code ;-)

I only posted my original warning here to serve, more or less, as the sentiment put forth in this poster, since a decent amount of time was lost chasing our tails: http://www.despair.com/mistakes.html ;-)

-steve

-- Steve Lianoglou Computational Biologist Bioinformatics and Computational Biology Genentech
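One way to avoid the situation described in this thread is to pass stringsAsFactors explicitly wherever the code cares about the result, so that neither a global option nor the R version's default can change the behaviour (the default itself became FALSE in R 4.0.0). The following is only a minimal sketch; make_table is a hypothetical helper name, not something from the thread.

```r
make_table <- function(ids) {
  # Explicit argument: the column type no longer depends on
  # options(stringsAsFactors) or on the default of the running R version.
  data.frame(id = ids, n = nchar(ids), stringsAsFactors = FALSE)
}

tab <- make_table(c("a", "bb"))
class(tab$id)
```

Code written this way behaves the same in a colleague's session regardless of what is in their .Rprofile.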
Re: [R] scale breaks
On 08/30/2013 01:28 AM, Shane Carey wrote:
> Hello all, I have decided to go ahead with gap.boxplot. I am trying to suppress the axis labels, both x and y labels. I tried using axis.labels=NULL but it would not work.

Hi Shane,
To suppress the axis labels, pass an empty string:

    gap.barplot(..., xlab="", ylab="", ...)

Many default values of NULL tell the function to work out labels from the data, usually names.

Jim
Re: [R] new.env() and attach for write?
Or you can use with:

    a <- new.env()
    with(a, b <- function(x) x)
    a$b
    # function(x) x
    # <environment: 0x06e9e0b8>

On Wed, Aug 28, 2013 at 3:45 PM, ivo welch ivo.we...@gmail.com wrote:
> duh!
>
> Ivo Welch (ivo.we...@gmail.com) http://www.ivo-welch.info/ J. Fred Weston Professor of Finance Anderson School at UCLA, C519 Director, UCLA Anderson Fink Center for Finance and Investments Free Finance Textbook, http://book.ivo-welch.info/ Editor, Critical Finance Review, http://www.critical-finance-review.org/
>
> On Wed, Aug 28, 2013 at 2:42 PM, Hadley Wickham h.wick...@gmail.com wrote:
>> On Wed, Aug 28, 2013 at 4:32 PM, ivo welch ivo.we...@anderson.ucla.edu wrote:
>>> is it possible to temporarily change the destination environment where objects are written to? I am thinking
>>>
>>>     a <- new.env()
>>>     attach(a)
>>>     ### run some code, such as...
>>>     b <- function(x) x
>>>     detach(a)
>>>     a$b
>>>
>>> obviously, this is wrong. attach() only attaches for read access. I could copy the globalenv, run my code, see what objects have been changed (how?), move the changed and new functions into my a environment, and then restore globalenv. or is this already done somewhere else?
>>
>> within? Or just:
>>
>>     evalq({ b <- function(x) x }, a)
>>
>> Hadley
>> -- Chief Scientist, RStudio http://had.co.nz/

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
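The suggestions in this thread can be put side by side in a short sketch. evalq() and with() come from the replies above; local() and assign() are two further base R options added here for comparison, and the object names b, d, e and f are arbitrary.

```r
a <- new.env()

# Hadley's suggestion: evaluate the (unevaluated) block inside 'a'
evalq({ b <- function(x) x }, a)

# Greg's suggestion: with() also evaluates its expression in 'a'
with(a, d <- 2)

# Two more base R options, not mentioned in the thread
local({ e <- "three" }, envir = a)
assign("f", 4, envir = a)

sort(ls(a))
a$b(10)
```

In all four cases the assignment lands in a rather than in the calling environment, which is what attach() could not provide.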
Re: [R] calculate with different columns from different datasets
Thank you for your answer. But further calculations will be much more difficult, like (1-b)^2 * Var(V1) for all matching columns, where b is the slope from a regression of V1 (from dataset 1) on V1 (from dataset 2) and Var(V1) is the variance of V1 (from dataset 2). So what I'm looking for is something like a loop function...
[R] calculate with different columns from different datasets
Hi there,

I've got two datasets of the following form (just an example, the real dataset has a lot more columns):

dataset1
V1 V2 V3
 2  6  8
 4  3  4
 1  9  8

and dataset2
V1 V2 V3
 6  8  4
 2  0  7
 8  1  3

First, I'd like to calculate the following: V1 from dataset1 minus V1 from dataset2, then V2 from dataset1 minus V2 from dataset2, and so on (always Vn - Vn, where n=1,2,...,n) and save the solution vectors in a new matrix.

Second, I'd like to run other functions over the two matching columns (for example: V1 from dataset1 minus mean(V1) from dataset2, V2 from dataset1 minus mean(V2) from dataset2, ...).

So I'm looking for a simple solution that always takes the matching columns from the different datasets, so that I can just change the function applied to the two.

Thank you for your help!
Kind regards
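One possible approach, sketched with the example data from the question: since data frames are lists of columns, mapply() can walk over the matching columns of both datasets and apply any two-argument function to each pair. The names d1, d2, common and pairwise are invented for this sketch, and the regression direction (dataset 1 column regressed on the dataset 2 column) is one reading of the follow-up message, not something the poster confirmed.

```r
# Example data copied from the question
d1 <- data.frame(V1 = c(2, 4, 1), V2 = c(6, 3, 9), V3 = c(8, 4, 8))
d2 <- data.frame(V1 = c(6, 2, 8), V2 = c(8, 0, 1), V3 = c(4, 7, 3))

# Columns present in both datasets
common <- intersect(names(d1), names(d2))

# 1. Plain differences: data frames of equal shape subtract element-wise
diffs <- as.matrix(d1[common] - d2[common])

# 2. Generic helper: apply f(x, y) to each pair of matching columns
pairwise <- function(f) mapply(f, d1[common], d2[common])

# Column from dataset 1 minus the mean of the matching column in dataset 2
centered <- pairwise(function(x, y) x - mean(y))

# (1 - b)^2 * Var(y), with b the slope of lm(x ~ y) (assumed direction)
stat <- pairwise(function(x, y) {
  b <- unname(coef(lm(x ~ y))[2])
  (1 - b)^2 * var(y)
})

diffs
centered
stat
```

Only the function passed to pairwise() needs to change for a different calculation; results of equal length are simplified to a matrix with one column per variable, and scalar results to a named vector.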
[R] Running pre R.14 version of R with R3.0.0
I upgraded R from 2.12.1 to 3.0.0 (on Windows XP), and as soon as I saved the 3.0.0 workspace, I was unable to access .Rdata from 2.12.1. The message in the R console is

    Error in loadNamespace(name): there is no package called parallel

and a popup window says

    Fatal error: unable to restore saved data in .Rdata

Is there anything that can be done to access the old .Rdata without destroying the new?

Thanks
[R] Missing value handling for felm function in lfe package
Dear All,

I am trying to use the felm function in the lfe package. However, it does not seem to deal with missing values the way the lm function does. I wish to tell it na.omit or na.action = na.omit but it does not recognize this. I need to allow for missing values as I have different specifications and don't want to remove observations for all. Help on this will be greatly appreciated! Thanks in advance and hope this is clear.

Megha

Megha Patnaik PhD candidate Dept of Economics Stanford University 650-868-6084
[R] Problem with Peaks package - followup…
Hi,

I apologize for not following the posting rules… Here is the text from my previous post:

I started evaluating the 'Peaks' package a couple of months ago and found it to be quite useful. Getting back to it last week I had to set up my R environment due to hardware changes again. The Peaks package loads with no problem. After successfully reinstalling all packages (RedHat 4.4.7-3 and OS X 10.8.4) I am getting the following error message:

    Error in .Call(R_SpectrumSearchHighRes, as.vector(y), as.numeric(sigma), :
      R_SpectrumSearchHighRes not available for .Call() for package Peaks

I was not able to identify a problem with my installation. The script calling this function is the same, the actual call is the same as it was when I stopped working with this script. Any suggestion for how to fix this issue will be greatly appreciated.

The following script (OS X 10.8.4) fails in a reproducible way:

    ###
    ## CUW 08/2013
    ## Demo: Peaksearch issue (package 'Peaks')
    ##
    library(Peaks)
    ## Signal with well defined peaks
    x <- seq(0, 50, len=1024)
    y <- 1/x * sin(x)
    ## Plot signal...
    plot(x, y, type='s')
    ## Call SpectrumSearch with default parameters
    res <- SpectrumSearch(y, sigma=3.0, threshold=1.0, background=FALSE, iterations=13, markov=FALSE, window=3)
    ## Error message:
    Error in .Call(R_SpectrumSearchHighRes, as.vector(y), as.numeric(sigma), :
      R_SpectrumSearchHighRes not available for .Call() for package Peaks

Any suggestion for fixing this issue is very much appreciated!

Thanks,
Uli
[R] Omitted/blank variables in R function
Hi All,

I'm a very green user and have little programming background, but appreciate any and all help/direction.

I have a spreadsheet that successfully sends values from Excel cells to R as variables for a function, which then runs and generates a plot. I cannot figure out how to make R recognize those variables as NA if one of the cells in Excel is blank - or, for that matter, I don't know how to get an R function to recognize variables as NA if no value is assigned to that variable.

I have unsuccessfully tried using:

    if(is.na(four)) return(NA)

My function is very simple:

    mtmatches <- c(one,two,three,four)

Everything runs smoothly if the four variables have values assigned to them. Any advice on how to get it to run when one of the variables has no value? No worries about the Excel element...figure I can decipher that puzzle later!

Thanks!
Re: [R] Problem with Peaks package - followup…
Hi,
I am getting the same error with R 3.0.1:

    SpectrumSearch(y, sigma=3.0, threshold=1.0, background=TRUE, iterations=13, markov=FALSE, window=3)
    #Error in .Call(R_SpectrumSearchHighRes, as.vector(y), as.numeric(sigma), :
    #  R_SpectrumSearchHighRes not available for .Call() for package Peaks

But it worked with R 2.15.2.

It would be better to contact the package maintainer:

    maintainer("Peaks")
    #[1] M.Kondrin mkond...@hppi.troitsk.ru

    sessionInfo()
    R version 3.0.1 (2013-05-16)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
     [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] Peaks_0.2 stringr_0.6.2 reshape2_1.2.2

    loaded via a namespace (and not attached):
    [1] plyr_1.8 tcltk_3.0.1 tools_3.0.1

A.K.

- Original Message - From: Wildgruber, Christoph U. wildgrube...@ornl.gov To: R-help@r-project.org R-help@r-project.org Cc: Wildgruber, Christoph U. wildgrube...@ornl.gov Sent: Thursday, August 29, 2013 11:16 AM Subject: [R] Problem with Peaks package - followup…
Re: [R] Omitted/blank variables in R function
Look at the missing() function. Or set the default value of the arguments to NA.

On Thu, Aug 29, 2013 at 3:23 PM, newruser12345 smetc...@gelbergroup.com wrote:
> [...] I don't know how to get an R function to recognize variables as NA if no value is assigned to that variable. I have unsuccessfully tried using: if(is.na(four)) return(NA) [...]

-- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
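Both suggestions can be sketched as follows. The argument names one to four come from the question; the function names collect_matches and collect_matches2 are hypothetical and only serve the illustration.

```r
# Option 1: give every argument a default of NA
collect_matches <- function(one = NA, two = NA, three = NA, four = NA) {
  c(one, two, three, four)
}

# Option 2: test with missing() whether the caller supplied the argument
collect_matches2 <- function(one, two, three, four) {
  if (missing(four)) four <- NA
  c(one, two, three, four)
}

collect_matches(1, 2, 3)    # 'four' falls back to its default
collect_matches2(1, 2, 3)   # 'four' is detected as missing and set to NA
```

The is.na(four) test from the question fails because an argument that was never supplied has no value to test at all, whereas missing(four) only asks whether it was supplied.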
[R] Validating data type
I'm very new to R. I have a data file that I have read in via read.csv. I expect one of the columns to be of type date, for example. However, at least one value in that column is not of date type. I know this because another program I am trying to process the file with is erroring, yet it doesn't tell me which row/value is causing the error.

Does R have a way to treat column x as date type, and print out all values/row numbers that do not conform to that type for that specified column?

Many thanks!
Jeff
Re: [R] Validating data type
The answer to your question is yes. You can convert a column of values to Date using the as.Date function with the appropriate format, and then test if any values are NA using the is.na function, and find them with the which function.

If you want something less vague then you should read the Posting Guide mentioned at the bottom of this message and follow the advice about using plain text and providing a sample of data that exhibits the issue and your attempts to solve the problem (code). Sample data is almost always needed... if you don't make it, then we have to do so in order to illustrate the solution, but we would be guessing and that is just a waste of time.

You may find the following link helpful also: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

---
Jeff Newmiller  DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

jeffj...@worldvision.org wrote:
> I'm very new to R. I have a data file that I have read in via read.csv. I expect one of the columns to be of type date for example. However at least one value in that column is not of date type. [...] Does R have a way to: treat column x as date type, and print out all values/row numbers do not conform to that type for that specified column?
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
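The as.Date / is.na / which recipe described in the reply can be sketched like this. The data frame dat, its column names and the "%Y-%m-%d" format are assumptions made up for the illustration, since the original poster did not share any data.

```r
# Hypothetical stand-in for the column read in with read.csv
dat <- data.frame(
  id   = 1:4,
  when = c("2013-08-29", "29/08/2013", "not a date", "2013-08-30"),
  stringsAsFactors = FALSE
)

# Values that do not match the format become NA instead of raising an error
parsed <- as.Date(dat$when, format = "%Y-%m-%d")

# Rows that failed to parse (ignoring values that were already NA)
bad <- which(is.na(parsed) & !is.na(dat$when))

bad
dat[bad, ]
```

Here bad holds the row numbers of the offending values, and dat[bad, ] prints them next to the rest of each record.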