[R] Automated Start for new Rgui within existing R code?
Is there a way to start multiple instances of R in an automated manner?

Since I'm not sure that question makes tons of sense, here's my scenario: I have a number of data updates that need to be completed on an ongoing basis, with the data pulled from, and then stored to, another location. The updates are manually triggered, so the volume that needs to be updated can build up. Running everything from within a single Rgui isn't a big deal when there aren't many accumulated updates. But if I have a week's or a month's worth of accumulated updates to run, this takes a LOT of time in a single Rgui.

If I split it up, I can run it on 6 Rguis (I'm on Windows 7 with an 8-core machine) and increase my overall efficiency. But this requires manual intervention, as there are steps before (that figure out how big the updates that must be run are) and steps after (that summarize and give me some metadata).

What I would love is something that will execute within an existing Rgui and allow me to send a command (such as source("myfile.R")) to a new Rgui. Does such a command even exist, or is this just wishful thinking?

I'm trying to avoid writing a wrapper file in another language to source the .R file, as most of this (before and after the ideal split point) is written in R already.

(FWIW: Windows 7, 8-core machine running 64-bit R)

Thanks,
Brigid

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
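One way this might be approached from inside R, sketched under assumptions: instead of opening new Rgui windows, start separate background R processes with system2() and the Rscript executable. The helper name launch_r_script and the throwaway demo script are hypothetical, and Rscript is assumed to live in R.home("bin").

```r
# Hypothetical helper: run an R script in a separate R process.
launch_r_script <- function(script, args = character(), wait = FALSE) {
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript, c(shQuote(script), shQuote(args)), wait = wait)
}

# Demo: a throwaway script that writes a marker file named by its first argument.
script <- tempfile(fileext = ".R")
out_file <- tempfile(fileext = ".txt")
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "writeLines('done', args[1])"
), script)

# wait = TRUE is used here only so the demo can inspect the result;
# wait = FALSE returns immediately, so several scripts can run side by side.
status <- launch_r_script(script, out_file, wait = TRUE)
```

With wait = FALSE the calling session would need its own way to tell when the workers have finished, for example by polling for output files.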
[R] Numeric "Label" of Factor value?
Sorry, I'm sure I'm not using the appropriate vocab here, which is undoubtedly why I can't seem to find a fix to this (hopefully very easy) problem.

Suppose you have a factor

    abc <- factor(c(2,2,3,4,7,7))

and you want to know what the number in the nth spot would be.

    abc[1]
    [1] 2
    Levels: 2 3 4 7

shows the correct label of the first element. But if I want to pull out the numeric value of that label, I thought...

    as.numeric(abc[1])

but that gives

    [1] 1

which is the position of the label in the levels vector of the factor.

Ideas?

Thanks!
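A minimal sketch of the usual workaround: convert the labels to character before converting to numeric. The variable names codes, values and values2 are only for illustration.

```r
abc <- factor(c(2, 2, 3, 4, 7, 7))

# as.numeric() on a factor returns the internal level codes.
codes <- as.numeric(abc)

# Convert the labels to character first, then to numeric.
values <- as.numeric(as.character(abc))

# Equivalent: convert the levels once and index them by the factor.
values2 <- as.numeric(levels(abc))[abc]

values[1]
```

The second form only converts each distinct level once, which may matter for long factors.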
[R] Add rank column to data frame as in SQL...
Hopefully this is an easy problem...

I'm trying to add a partitioned rank column to a data frame, where the rank is calculated separately within each category, the way you could easily do in SQL.

I found this solution in the archives that looked like it might work:
http://tolstoy.newcastle.edu.au/R/e11/help/10/09/8675.html

The example has a data frame with several car companies and employee salaries within them. A column is then added to the data.frame which should give the descending rank for each employee, partitioned by company. But when I implemented it, the results weren't the expected rankings. What am I doing wrong?

    set.seed(1)
    DF <- data.frame(Company=sample(c("Ford","Toyota","GM"),size=18,replace=TRUE),
                     Person=LETTERS[1:18],Salary=runif(18)*1e5)
    DF <- within(DF, rank <- ave(Salary, Company, FUN=function(x)rev(order(x))))

    # Then checking each category manually
    DF[DF$Company == "Ford",]
    DF[DF$Company == "GM",]
    DF[DF$Company == "Toyota",]

My results show that it works for Ford and GM, but not Toyota:

    > DF[DF$Company == "Ford",]
       Company Person   Salary rank
    1     Ford      A 38003.52    4
    5     Ford      E 65167.38    2
    10    Ford      J 38238.80    3
    11    Ford      K 86969.08    1
    12    Ford      L 34034.90    5

    > DF[DF$Company == "GM",]
       Company Person   Salary rank
    4       GM      D 21214.25    6
    6       GM      F 12555.51    7
    7       GM      G 26722.07    5
    13      GM      M 48208.01    4
    15      GM      O 49354.13    3
    17      GM      Q 82737.33    1
    18      GM      R 66846.67    2

    > DF[DF$Company == "Toyota",]
       Company Person    Salary rank
    2   Toyota      B 77744.522    2
    3   Toyota      C 93470.523    1
    8   Toyota      H 38611.409    5
    9   Toyota      I  1339.033    3
    14  Toyota      N 59956.583    6
    16  Toyota      P 18621.760    4

For reference, I'm using R 2.11.1 on a Windows 7 machine.

Can anyone provide insight into how I am implementing this incorrectly, or give an alternate way to add such a partitioned rank column to a data frame?

Thanks in advance,
Brigid
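A likely explanation, offered as a sketch rather than a definitive diagnosis: order() returns the positions that would sort a vector, which is not the same thing as each element's rank, so rev(order(x)) only matches the descending rank for some inputs. Using rank() on the negated salaries avoids this. The small data frame below is hypothetical, not the one from the post.

```r
# Small hypothetical data set.
DF <- data.frame(
  Company = c("Ford", "Ford", "Ford", "GM", "GM"),
  Person  = c("A", "B", "C", "D", "E"),
  Salary  = c(300, 100, 200, 50, 75)
)

# rank(-x) assigns 1 to the largest salary within each company.
DF$rank <- ave(DF$Salary, DF$Company,
               FUN = function(x) rank(-x, ties.method = "min"))
DF
```

ties.method = "min" gives tied salaries the same (lowest) rank; another tie rule could be chosen if that is not what is wanted.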
Re: [R] Decimal Accuracy Loss?
Thanks, Bert. That's a big help.

-Brigid

On Wed, Apr 6, 2011 at 11:45 AM, Bert Gunter wrote:
> Confirmed. "Casting" just adds/removes the dim attribute to the
> numeric vector/matrix.
>
> -- Bert
>
> On Wed, Apr 6, 2011 at 8:33 AM, Brigid Mooney wrote:
>> This is hopefully a quick question on decimal accuracy. Is any
>> decimal accuracy lost when casting a numeric vector as a matrix? And
>> then again casting the result back to a numeric?
>>
>> I'm finding that my calculation values are different when I run for
>> loops that manually calculate matrix multiplication as compared to
>> when I cast the vectors as matrices and multiply them using "%*%".
>> (The errors are very small, but the process is run iteratively
>> thousands of times, at which point the error between the two
>> differences becomes noticeable.)
>>
>> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?",
>> but just want to confirm that the differences in values are due to
>> differences in the matrix multiplication operator and manual
>> calculation via for loops, rather than information that is lost when
>> casting a numeric as a matrix and back again.
>>
>> Thanks in advance for the help,
>> Brigid
>
> --
> "Men by nature long to get on to the ultimate truths, and will often
> be impatient with elementary studies or fight shy of them. If it were
> possible to reach the ultimate truths without the elementary studies
> usually prefixed to them, these would not be preparatory studies but
> superfluous diversions."
>
> -- Maimonides (1135-1204)
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
[R] Decimal Accuracy Loss?
This is hopefully a quick question on decimal accuracy. Is any decimal accuracy lost when casting a numeric vector as a matrix? And then again when casting the result back to a numeric?

I'm finding that my calculated values are different when I run for loops that manually calculate matrix multiplication, as compared to when I cast the vectors as matrices and multiply them using "%*%". (The errors are very small, but the process is run iteratively thousands of times, at which point the difference between the two results becomes noticeable.)

I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?", but just want to confirm that the differences in values are due to differences between the matrix multiplication operator and manual calculation via for loops, rather than information that is lost when casting a numeric as a matrix and back again.

Thanks in advance for the help,
Brigid
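A small check along these lines, as a sketch with made-up data: the round trip through matrix() can be tested with identical(), and the loop and "%*%" results can be compared with all.equal(), which allows for rounding differences. The variable names are only for illustration.

```r
set.seed(1)
x <- runif(5)
A <- matrix(runif(15), nrow = 3)

# Round trip: vector -> matrix -> vector should leave every value unchanged.
round_trip <- as.vector(matrix(x, ncol = 1))
identical(round_trip, x)

# Manual loop versus %*%: same mathematics, possibly a different summation order.
manual <- numeric(nrow(A))
for (i in seq_len(nrow(A))) {
  for (j in seq_len(ncol(A))) {
    manual[i] <- manual[i] + A[i, j] * x[j]
  }
}
builtin <- as.vector(A %*% x)

all.equal(manual, builtin)   # comparison within a small tolerance
max(abs(manual - builtin))   # size of any remaining gap
```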
[R] Performance Difference? Windows vs. Linux
I'm not trying to start a Windows vs. Linux debate, but I've been using R on a Windows machine for a while, and was recently wondering whether R would run faster on a Linux machine. And similarly, whether any incremental increase in processing speed would be worth the time it would take me to migrate my entire system to Linux (including a database that I access via an R package).

I don't know how much it matters what R is doing, but I've got R pulling a large amount of data from a database, performing many complex computations on that data, and then writing output data to a database.

Thanks so much for the input,
Brigid
[R] randomness using runif
I'm working on a problem where I'm introducing random error and have been using the built-in function runif to provide that random error. However, I realized that I seem to be getting some unexpected behavior out of the function and was hoping someone could share some insight.

I don't know the runif algorithm at all, but from the behavior I'm seeing, it seems that whenever I open a new R console, the function runif gets "reset" to some initial value. For example...

In a NEW R console, enter the following:

    x1 <- runif(1000, -1, 1)
    x2 <- runif(1000, -1, 1)
    x1[1:5]
    x2[1:5]
    objectsToSave <- c("x1", "x2")
    filename <- "C:\\Documents\\x1x2file.Rdata"
    save(list=objectsToSave, file=filename, envir = parent.frame())

Then in a different NEW R console, enter this:

    x3 <- runif(1000, -1, 1)
    x4 <- runif(1000, -1, 1)
    x3[1:5]
    x4[1:5]
    # For me, the values look identical to x1 and x2, but let's check by
    # loading the x1x2 file and comparing them directly...
    filename <- "C:\\Documents\\x1x2file.Rdata"
    load(filename)
    sum(x1==x3)
    sum(x2==x4)

For my results, I get that x1=x3 for all 1000 elements in the vector, and x2=x4 for all 1000 elements in that vector.

Does anyone have insight into what's going on here? Am I doing something wrong here, or is this a quirk of the runif algorithm? Is there a better function out there for seeding truly random error?

For what it's worth, here's my R version info:

    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          8.1
    year           2008
    month          12
    day            22
    svn rev        47281
    language       R
    version.string R version 2.8.1 (2008-12-22)

Thanks for the help,
Brigid
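One possible cause, stated as a guess: if a saved workspace is restored at startup, it can carry a stored .Random.seed, so every new console resumes the same random stream. The sketch below shows that the stream is fully determined by the seed, and that removing a stored seed lets R derive a fresh one (from the current time and process ID) on the next draw.

```r
# Same seed, same stream.
set.seed(123)
a <- runif(5, -1, 1)
set.seed(123)
b <- runif(5, -1, 1)
identical(a, b)

# Remove any stored seed so the next draw starts from a freshly generated one.
if (exists(".Random.seed", envir = globalenv())) {
  rm(list = ".Random.seed", envir = globalenv())
}
fresh <- runif(5, -1, 1)
```

Starting R with --no-restore, or not saving the workspace on exit, would be another way to test whether a restored seed is the cause.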
[R] Problem Matching Exact Values
Sorry for the basic question, but I ran into something I haven't noticed before and would appreciate a little more perspective on my problem.

I am using R to determine if various thresholds are hit (or surpassed) in a data set. If a threshold is surpassed, I have had no problems identifying it. However, when the threshold is matched *exactly*, not all cases are being identified.

Please consider the following example, with a base value of x = 59000 and a threshold of 10%, so the target to hit is 59000*1.1 = 64900.

    > x <- 59000
    > thresh <- 0.10
    >
    > target <- x*(1+thresh)
    > target
    [1] 64900
    >
    > target == 64900
    [1] FALSE
    >
    > target-64900
    [1] 7.275958e-12

Why is there this (very) small difference between the value of target and the numeric 64900? Is this a floating point issue, or something else that I'm not understanding?

Is using round() the best work-around in cases such as these, or is there a better (perhaps more accurate) way to classify data that avoids whatever floating point arithmetic is taking place in the background?

I'm using an older version of R, if that matters at all...

    > R.version
                   _
    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          8.1
    year           2008
    month          12
    day            22
    svn rev        47281
    language       R
    version.string R version 2.8.1 (2008-12-22)

Thanks in advance,
Brigid
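A sketch of a tolerance-based comparison for this situation: all.equal() tests equality up to a small relative tolerance, and a "reached or surpassed" test can subtract an explicit tolerance. The value of tol below is an assumption and would need to suit the scale of the data.

```r
x <- 59000
thresh <- 0.10
target <- x * (1 + thresh)

# Tolerance-based equality instead of ==.
hit_exact <- isTRUE(all.equal(target, 64900))

# "Reached or surpassed" test with a small, explicitly chosen tolerance.
tol <- 1e-8          # assumption: anything closer than this counts as equal
value <- 64900
hit_threshold <- value >= target - tol
```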
[R] Accessing outside data sources for stock data?
Hi All,

I apologize if this is not the right forum for this question. I know there used to be a package for R that connected to opentick's (open source) stock market data. However, opentick closed up about a year ago. Does anyone know of a similar package that connects to IQfeed's stock market data (or another source of historical stock market data)?

Thanks,
Brigid
[R] Multiply List by a Numeric
I apologize for what seems like it should be a straightforward query. I am trying to multiply a list by a numeric and thought there would be a straightforward way to do this, but the best solution I have found so far uses a for loop. Everything else I try seems to throw the error "non-numeric argument to binary operator".

Consider the example:

    a <- 1
    b <- 1:2
    c <- 1:3
    abc <- list(a,b,c)

To multiply every element of abc by a numeric, say 3, I wrote a for loop:

    for (i in 1:length(abc)) {
      abc[[i]] <- 3*abc[[i]]
    }

Is this really the simplest way, or am I missing something?

Thanks!
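For reference, a minimal sketch using lapply(), which applies a function to each list element and returns a new list. The names tripled and tripled2 are only for illustration.

```r
abc <- list(1, 1:2, 1:3)

# Multiply each element by 3 with an anonymous function.
tripled <- lapply(abc, function(v) 3 * v)

# Same idea, passing the multiplication operator and its second argument.
tripled2 <- lapply(abc, "*", 3)
```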
[R] Activation Functions in Package Neural
Hi,

I am trying to build a VERY basic neural network as practice before hopefully increasing my scope. To do so, I have been using package "neural" and the MLP-related functions (mlp and mlptrain) within that package.

So far, I have created a basic network, but I have been unable to change the default activation function. If someone has a suggestion, please advise.

The goal of the network is to properly classify a number as positive or negative. Simple 1-layer network with a single neuron in each layer.

Rcode:

    trainInput <- matrix(rnorm(10))
    trainAnswers <- ifelse(trainInput <0, -1, 1)
    trainNeurons <- 1
    trainingData <- mlptrain(inp=trainInput, neurons=trainNeurons,
                             out=trainAnswers, it=1000)

    ## To call this network, we can see how it works on a set of known
    ## positive and negative values
    testInput <- matrix(-2:2)
    mlp(testInput, trainingData$weight, trainingData$dist,
        trainingData$neurons, trainingData$actfns)

Will vary, but the output on my computer was:

                [,1]
    [1,] 0.001043291
    [2,] 0.001045842
    [3,] 0.072451270
    [4,] 0.950744548
    [5,] 0.950931168

So it's instead classifying the negatives as 0 and the positives as 1 (getting close to, anyhow; increasing the number of iterations, i.e. it=5000, makes that more clear).

This results in a neural net with activation function 1/(1+exp(-x)), which will never produce the -1 value that the answers contain.

The documentation for package neural specifies the parameter "actfns", which should be a list containing the numeric code for the activation functions of each layer. However, any time I try to put in a value for "actfns" (such as actfns=2 for hyperbolic tangent), I get the error:

    "Different activation function and active layer number"

If anyone can shed light on what I'm doing wrong here with the activation functions, or on how to change the activation functions, I'd really appreciate it.

Thanks!
[R] Neural Networks
Hi,

I am starting to play around with neural networks and noticed that there are several packages on the CRAN website for neural networks (AMORE, grnnR, neural, neuralnet, maybe more if I missed them).

Are any of these packages better suited for newbies to neural networks? Are there any relative strengths / weaknesses to the different implementations?

If anyone has any advice before I dive into this project, I'd appreciate it.

Thanks!
[R] Error Catching?
Hi,

Is there an easy way to "catch" errors, in order to arrange for R scripts to exit gracefully? I'm thinking of something along the lines of using is.na with an if/else statement, but for errors.

Thanks!
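A minimal sketch of what this can look like with tryCatch(), which runs an expression and hands any error to a handler function. The wrapper name safe_log and its fallback value are hypothetical choices for the example.

```r
# Hypothetical wrapper: return NA instead of stopping when log() fails.
safe_log <- function(x) {
  tryCatch(
    log(x),
    error = function(e) {
      message("Could not compute log: ", conditionMessage(e))
      NA_real_
    }
  )
}

ok  <- safe_log(10)      # normal result
bad <- safe_log("ten")   # error is caught; NA is returned instead
```

The calling code can then branch on is.na(bad), much like the if/else idea described above.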
Re: [R] Example for parsing XML file?
Thanks! That helps a lot!

A quick follow-up question: I can't really tell what part of the commands tell it to only look at the child nodes of . Is there any way to also access the fields that are in the hierarchy? (i.e. the S, D, C, and F)

I wouldn't necessarily want those repeated thousands of times in the data frame, but C and F are useful reference points, as they are actually row numbers where specific events occurred.

Thanks again for all the help!
-Brigid

On Wed, May 20, 2009 at 5:16 PM, Duncan Temple Lang wrote:
> Hi Brigid.
>
> Here are a few commands that should do what you want:
>
> bri = xmlParse("myDataFile.xml")
>
> tmp = t(xmlSApply(xmlRoot(bri), xmlAttrs))[, -1]
> dd = as.data.frame(tmp, stringsAsFactors = FALSE,
>                    row.names = 1:nrow(tmp))
>
> And then you can convert the columns to whatever types you want
> using regular R commands.
>
> The basic idea is that for each of the child nodes of C,
> i.e. the 's, we want the character vector of attributes
> which we can get with xmlAttrs().
>
> Then we stack them together into a matrix, drop the "N"
> and then convert the result to a data frame, avoiding
> duplicate row names which are all "T".
>
> (BTW, make certain the '-' on the second line is not in the XML content.
> I assume that came from bringing the text into mail.)
>
> HTH
> D.
>
> Brigid Mooney wrote:
>>
>> Hi,
>>
>> I am trying to parse XML files and read them into R as a data frame,
>> but have been unable to find examples which I could apply
>> successfully.
>>
>> I'm afraid I don't know much about XML, which makes this all the more
>> difficult. If someone could point me in the right direction to a
>> resource (preferably with an example or two), it would be greatly
>> appreciated.
>>
>> Here is a snippet from one of the XML files that I am looking to read,
>> and I am aiming to be able to get it into a data frame with columns N,
>> T, A, B, C as in the 2nd level of the heirarchy.
>>
>> -
>>
>> Thanks for any help or direction anyone can provide.
>>
>> As a point of reference, I am using R 2.8.1 and have loaded the XML
>> package.
[R] Example for parsing XML file?
Hi,

I am trying to parse XML files and read them into R as a data frame, but have been unable to find examples which I could apply successfully.

I'm afraid I don't know much about XML, which makes this all the more difficult. If someone could point me in the right direction to a resource (preferably with an example or two), it would be greatly appreciated.

Here is a snippet from one of the XML files that I am looking to read, and I am aiming to be able to get it into a data frame with columns N, T, A, B, C as in the 2nd level of the hierarchy.

-

Thanks for any help or direction anyone can provide.

As a point of reference, I am using R 2.8.1 and have loaded the XML package.
[R] efficiency when processing ordered data frames
Hoping for a little insight into how to make sure I have R running as efficiently as possible.

Suppose I have a data frame, A, with n rows and m columns, where col1 is a date-time stamp. Also suppose that when this data is imported (from a csv or SQL), the data is already sorted such that the time stamp in col1 is in ascending (or descending) order.

If I then wanted to select only the rows of A where col1 <= a certain time, I am wondering if R has to read through the entirety of col1 to select those rows (all n of them). Is it possible for R to recognize (or somehow be told) that these rows are already in order, thus allowing the computation to be completed in ~log(n) row reads instead?

Thanks!
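One option that may help, sketched with made-up data: findInterval() expects a sorted vector and locates a cutoff by searching it, so the matching rows can then be taken as a contiguous block. As far as I know it still verifies that the vector is sorted, so the whole call is not strictly log(n). The column name col1 follows the post; the rest is hypothetical.

```r
# Sorted (ascending) time stamps, stored as POSIXct.
stamps <- as.POSIXct("2009-01-01 00:00:00", tz = "UTC") + 3600 * (0:9)
A <- data.frame(col1 = stamps, value = 1:10)

cutoff <- as.POSIXct("2009-01-01 04:30:00", tz = "UTC")

# Number of elements of the sorted column that are <= cutoff.
n_keep <- findInterval(as.numeric(cutoff), as.numeric(A$col1))

# Rows at or before the cutoff form the first n_keep rows.
subset_A <- A[seq_len(n_keep), ]
```

For data sorted in descending order the same idea would apply to the reversed column, or to its negated numeric values.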
[R] Print to File Formatting
Hello,

I am writing out to a file and have two quick questions that I can't seem to track down the correct answers for. Luckily, I *think* they are both simple enough that someone might be able to point me in the right direction on them without too much trouble.

Both questions relate to the process below, where CompleteFrame is a data frame containing what I want printed to a file.

    filename <- "C:\\MyDocuments\\TestOut_050609.txt"
    output <- file(filename, open="wt")
    write.csv(CompleteFrame, output, row.names = FALSE, col.names=FALSE)
    close(output)

Question #1: Every time I run this process, I get the warning:

    Warning message:
    In write.csv(CompleteFrame, output, row.names = FALSE, col.names = FALSE) :
      attempt to set 'col.names' ignored

And it still prints the column names as the first row in my file, which I do not want...

Question #2: This process puts quotes around all data of class character. I can't have these quotes in my file. Is it possible to get R to omit them even if my data frame contains character strings?

Any help or hints on this are greatly appreciated!

Thanks,
Brigid
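A sketch of one way to get both behaviours: write.csv() deliberately fixes some arguments, while write.table() accepts col.names = FALSE and quote = FALSE directly. The example data frame and the temporary file are stand-ins for CompleteFrame and the real path.

```r
# Stand-in for CompleteFrame.
CompleteFrame <- data.frame(id = 1:2, name = c("abc", "def"),
                            stringsAsFactors = FALSE)
filename <- tempfile(fileext = ".txt")

# Comma-separated output with no header row and no quotes around strings.
write.table(CompleteFrame, file = filename, sep = ",",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

written <- readLines(filename)
```

Note that with quote = FALSE, any character value that itself contains a comma would no longer be distinguishable from a field separator.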
Re: [R] error when setting up Rcmd BATCH on new computer
My apologies, the version 4-3 files I downloaded came from the link:
http://cran.r-project.org/contrib/extra/batchfiles

2009/3/27 Uwe Ligges :
>
> Brigid Mooney wrote:
>>
>> Hello,
>>
>> I got a new computer, and am trying to reinstall R and have run into a
>> bit of a problem when running the BATCH command.
>> For reference, the OS is Windows Vista, 64 bit.
>>
>> I installed R 2.8.1 and have the 4-3 files from the following link
>> extracted with the containing folder in my system PATH variable.
>
> Can you explain the sentence above, please?
> What is 4-3? Which link? containing folder?
>
> Uwe Ligges
>
>> However, when I try to run the following command from the dos prompt:
>> Rcmd BATCH TestBatch.R testoutput.txt
>>
>> Note: TestBatch.R is simply a file containing the statement:
>> print("hello world")
>>
>> I get the error: \Common was unexpected at this time.
>>
>> If anyone can provide any insight into this problem, I would really
>> appreciate it as I thought I remembered all the steps from when I set
>> this all up on my old computer...
>>
>> Thanks,
>> Brigid
[R] error when setting up Rcmd BATCH on new computer
Hello,

I got a new computer, and am trying to reinstall R and have run into a bit of a problem when running the BATCH command. For reference, the OS is Windows Vista, 64 bit.

I installed R 2.8.1 and have the 4-3 files from the following link extracted with the containing folder in my system PATH variable.

However, when I try to run the following command from the dos prompt:

    Rcmd BATCH TestBatch.R testoutput.txt

Note: TestBatch.R is simply a file containing the statement:

    print("hello world")

I get the error:

    \Common was unexpected at this time.

If anyone can provide any insight into this problem, I would really appreciate it, as I thought I remembered all the steps from when I set this all up on my old computer...

Thanks,
Brigid
[R] formula formatting/grammar for regression
Hi all,

I am doing some basic regression analysis, and am getting a bit confused about how to enter non-polynomial formulas.

For example, consider that I want to find A and r such that the formula y = A*exp(r*x) provides the best fit to the line y=x on the interval [0,50].

I can set:

    xpts <- seq(0, 50, by=0.1)
    ypts <- seq(0, 50, by=0.1)

I know I can find a fitted polynomial of a given degree using

    lm(ypts ~ poly(xpts, degree=5, raw=TRUE))

but am confused about what the formula should be for trying to find a fit to y = A*exp(r*x).

If anyone knows of a resource that describes the "grammar" behind assembling these formulas, I would really appreciate being pointed in that direction, as I can't seem to find much beyond basic polynomials.

Thanks for the help!
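One common route, sketched here on synthetic data rather than the y=x example: taking logs turns y = A*exp(r*x) into log(y) = log(A) + r*x, which lm() can fit directly. This requires y > 0 (the y=x data above includes y = 0) and minimises error on the log scale; nls() with a formula such as ypts ~ A * exp(r * xpts) and user-supplied starting values would fit on the original scale instead. The values A = 2 and r = 0.3 are made up for the illustration.

```r
# Synthetic data generated from known values (A = 2, r = 0.3).
xpts <- seq(0, 5, by = 0.1)
ypts <- 2 * exp(0.3 * xpts)

# log(y) = log(A) + r * x is linear in x, so lm() can estimate it.
fit <- lm(log(ypts) ~ xpts)

A_hat <- exp(coef(fit)[[1]])   # back-transform the intercept
r_hat <- coef(fit)[[2]]        # slope is the rate r
```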
Re: [R] Percentiles/Quantiles with Weighting
Thanks for pointing me to the quantreg package as a resource. I was hoping to ask one quick follow-up question...

I get slightly different results using the rq function with formula = mydata ~ 1 than I do if I run the same data through the quantile function.

Example:

    mydata <- (1:10)^2/2
    pctile <- seq(.59, .99, .1)

    quantile(mydata, pctile)
       59%    69%    79%    89%    99%
    20.015 26.075 32.935 40.595 49.145

    rq(mydata~1, tau=pctile)
    Call:
    rq(formula = mydata ~ 1, tau = pctile)

    Coefficients:
                tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99
    (Intercept)      18.0      24.5      32.0      40.5      50.0

    Degrees of freedom: 10 total; 9 residual

Is it correct to assume this is due to the different accepted methods of calculating quantiles? If so, do you know where I would be able to see the algorithms used in these functions? I'm not finding it in the documentation for function rq, and am new enough to R that I don't know where those references would generally be.

On Tue, Feb 17, 2009 at 12:29 PM, roger koenker wrote:
> http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869
>
> gives one possibility...
>
> url:    www.econ.uiuc.edu/~roger    Roger Koenker
> email   rkoen...@uiuc.edu           Department of Economics
> vox:    217-333-4558                University of Illinois
> fax:    217-244-6678                Champaign, IL 61820
>
> On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote:
>
>> Hi All,
>>
>> I am looking at applications of percentiles to time sequenced data. I had
>> just been using the quantile function to get percentiles over various
>> periods, but am more interested in if there is an accepted (and/or
>> R-implemented) method to apply weighting to the data so as to weigh recent
>> data more heavily.
>>
>> I wrote the following function, but it seems quite inefficient, and not
>> really very flexible in its applications - so if anyone has any
>> suggestions on how to look at quantiles/percentiles within R while also
>> using a weighting schema, I would be very interested.
>>
>> Note - this function supposes the data in X is time-sequenced, with the
>> most recent (and thus heaviest weighted) data at the end of the vector
>>
>> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
>> {
>>   Xprime <- NA
>>
>>   for(i in 1:length(X))
>>   {
>>     Xprime <- c(Xprime, rep(X[i], times=i))
>>   }
>>
>>   print("Percentiles:")
>>   print(quantile(X, pctile))
>>   print("Weighted:")
>>   print(Xprime)
>>   print("Weighted Percentiles:")
>>   print(quantile(Xprime, pctile, na.rm=TRUE))
>> }
>>
>> WtPercentile(1:10)
>> WtPercentile(rnorm(10))
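As a sketch of where the difference may come from: quantile() supports several definitions through its type argument (type = 7, with interpolation, is the default). On this particular data, type = 1, the inverse of the empirical distribution function, happens to reproduce the values shown for rq above. This is only an observation about the example, not a statement about how rq is implemented.

```r
mydata <- (1:10)^2 / 2
pctile <- seq(.59, .99, .1)

q7 <- quantile(mydata, pctile)             # default definition, type = 7
q1 <- quantile(mydata, pctile, type = 1)   # inverse empirical CDF, no interpolation

q7
q1
```

The definitions behind each type are listed in the help page for quantile.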
[R] Percentiles/Quantiles with Weighting
Hi All,

I am looking at applications of percentiles to time-sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in whether there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily.

I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications. So if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested.

Note: this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector.

    WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
    {
      Xprime <- NA

      for(i in 1:length(X))
      {
        Xprime <- c(Xprime, rep(X[i], times=i))
      }

      print("Percentiles:")
      print(quantile(X, pctile))
      print("Weighted:")
      print(Xprime)
      print("Weighted Percentiles:")
      print(quantile(Xprime, pctile, na.rm=TRUE))
    }

    WtPercentile(1:10)
    WtPercentile(rnorm(10))
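A possible alternative that avoids building the replicated vector, offered as a sketch: sort the data, accumulate the normalised weights, and take the smallest value whose cumulative weight reaches each probability. This corresponds to an inverse-CDF definition with no interpolation, so it will not match the default quantile() output exactly. The function name weighted_quantile and the weights used are assumptions for the example.

```r
# Hypothetical helper: weighted quantile by inverse cumulative weights.
weighted_quantile <- function(x, w, probs) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  cum_w <- cumsum(w) / sum(w)
  # For each probability, take the smallest x whose cumulative weight reaches it
  # (the tiny offset guards against rounding in the cumulative sums).
  sapply(probs, function(p) x[which(cum_w >= p - 1e-12)[1]])
}

X <- 1:10
res <- weighted_quantile(X, w = seq_along(X), probs = c(0.5, 0.9))

# Cross-check against the replication approach using the matching definition.
check <- quantile(rep(X, times = seq_along(X)), c(0.5, 0.9), type = 1)
```

Because the weights are passed in separately, they need not be integers, which the rep()-based approach requires.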
[R] Appending objects created using filehash package
Hi,

I am working with a very large dataset, and am using the 'filehash' package to manage such a large file. While I have no problem accessing objects that I load into a database, I was hoping there is a better way to append to objects already in the database. The only way I know now to append to an object basically requires rewriting the entire object.

Sample code:

    # Setting up the database
    library(filehash)
    A <- data.frame(a=c("abcde", "fghij", "klmno"), stringsAsFactors=FALSE)
    dumpDF( A, dbName="myTestDB")
    envTest <- db2env(db="myTestDB")
    ls(envTest)
    with(envTest, a)

    # Appending to object a, but basically rewriting it...
    envTest$a <- c(envTest$a, "HELLO", "GOODBYE")
    with(envTest, a)

If anyone has a suggestion on how to append to an object without completely rewriting it, I would really appreciate it. In my actual implementation, a is a vector of class character with ~3.5 million elements, so writing it (and rewriting it) takes quite a bit of time.

For reference, I am using a Windows Vista machine with:

    R.version.string
    [1] "R version 2.8.0 (2008-10-20)"

Thanks,
Brigid
[R] Error as.Date on Invalid Dates
Hi All,

I have a script in R which accepts user inputs for certain parameters, particularly dates, which the user inputs as character strings, e.g.:

> date1 <- "2009-01-21"

The script later parses the input via the as.Date function:

> as.Date(date1)

However, as.Date encounters an error when the string does not represent an actual date, e.g.:

> date1 <- "2009-02-29" # Note: 2009 not a leap year
> as.Date(date1)
Error in fromchar(x) : character string is not in a standard unambiguous format

As I have many date entries like this (date1, date2, date3, etc.), I'd like the script to error out gracefully and point the user to the date they need to correct, rather than showing "Error in fromchar(x)...", which doesn't make it obvious what they need to fix. Ideally I'd love to send the user a message like:

print(paste(date1, "is an invalid date. Refer to calendar.", sep=" "))

If anyone has any suggestions on catching this type of error and giving feedback that directs the user, it would be much appreciated.

For reference, I am using a Windows Vista machine with:

> R.version.string
[1] "R version 2.8.0 (2008-10-20)"

Thanks,
Brigid
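One possible way to get a friendlier message, sketched under the assumption that all inputs are single strings in YYYY-MM-DD form. The helper name parse_user_date and its name argument are made up for illustration. With an explicit format, as.Date() returns NA for an impossible date instead of raising an error; the tryCatch() is kept as a safety net in case an error is raised anyway.

```r
# Hypothetical helper: returns a Date, or stops with a message naming the input
parse_user_date <- function(x, name = "date") {
  d <- tryCatch(as.Date(x, format = "%Y-%m-%d"),
                error = function(e) as.Date(NA))
  if (is.na(d)) {
    stop(paste(name, "=", x, "is an invalid date. Refer to calendar."),
         call. = FALSE)
  }
  d
}

ok <- parse_user_date("2009-01-21", "date1")

# Capture the message instead of halting, to show what the user would see
msg <- tryCatch(parse_user_date("2009-02-29", "date1"),
                error = function(e) conditionMessage(e))
print(msg)
```

Passing the variable name along with the value is what lets the message say which of date1, date2, date3 needs correcting.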
[R] Missing file to run Rcmd batch on Windows
Hi,

I'm trying to run an R script using Rcmd Batch from the command line on a Windows Vista machine. I am using R version 2.8.1.

I installed the batch files 4-3 found at http://cran.r-project.org/contrib/extra/batchfiles/ and added them to my path. I also had to install the latest version of perl (it's Strawberry perl, if that makes a difference) and have added this to my path.

Now when I run the command:

Rcmd batch TestBatch.R TestOutput.txt

from the command line, I get the error:

Can't open perl script "C:\Progra~1\R\R-28~1.0\bin\batch": No such file or directory

Just for reference, TestBatch.R contains only one line:

print("hello world")

Does anyone have any idea what this file is that I might be missing? Or is there some other mistake I'm making in trying to run a script from the command line?

Thanks,
-Brigid
[R] Memory Size & Allocation in R
My apologies if this is a bit of a 'newbie' question.

I am using R v 2.8.0 in Windows and am a bit confused about the memory size/allocation. A script I wrote faulted out with the error:

"Error: cannot allocate vector of size 5.6 Mb"

After this error, I still have:

> memory.size()
[1] 669.3517
> memory.limit()
[1] 1535.875

Since the memory in use is more than 5.6 Mb below the memory limit, I assume there is some limit on object size within R. Is this correct? If so, is there a way to determine which one of my objects is too large, so that I can remove it or update my script?

Also, is there a way to temporarily increase the limit on vector memory allocation above 5.6 Mb? I'm hoping to still retrieve the data that was calculated in the run prior to the error.

To get at this data, I tried write.csv and got a similar error:

> write.csv(Results, "TempOutput_011309.csv", row.names=FALSE, col.names=TRUE)
Error: cannot allocate vector of size 5.5 Mb

For reference, 'Results' is a data frame with about 800K rows and 18 columns (1 col contains character strings of class factor, 1 col contains Date/Time stamps of class factor, the remaining cols are all numeric).

Any help here is greatly appreciated - Thanks!
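For the "which object is large" part of the question, a small sketch using object.size() from the utils package. The function name obj_sizes is illustrative. Note that the error message reports the size of the additional block R failed to obtain, not a cap on object size, so the largest existing objects are usually the place to look.

```r
# Hypothetical helper: sizes in bytes of all objects in an environment,
# largest first
obj_sizes <- function(env = globalenv()) {
  nms <- ls(envir = env)
  sizes <- vapply(nms,
                  function(n) as.numeric(utils::object.size(get(n, envir = env))),
                  numeric(1))
  sort(sizes, decreasing = TRUE)
}

# Demonstration on a scratch environment rather than the real workspace
e <- new.env()
assign("big", numeric(1e5), envir = e)
assign("small", 1, envir = e)
s <- obj_sizes(e)
print(s)
```

Removing a large object with rm() and then calling gc() is one way to free space before retrying the write.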
Re: [R] How can I avoid nested 'for' loops or quicken the process?
loss, stoploss, opClProfit)

  ProfitTable <- data.frame(SymbolID=compareMarket$SymbolID, investbySymbol, Gains,
                            percentGains=Gains/investbySymbol,
                            LessComm=rep(comission, times=length(Gains)),
                            NetGains=Gains/investbySymbol-2*comission)

  AggregatesTable <- data.frame(
    OutTotInvestment = sum(ProfitTable$investbySymbol, na.rm=TRUE),
    OutNumInvestments = sum(ProfitTable$investbySymbol, na.rm=TRUE)/investment,
    OutDolProf = sum(ProfitTable$Gains, na.rm=TRUE),
    OutPerProf = sum(ProfitTable$Gains, na.rm=TRUE)/sum(ProfitTable$investbySymbol, na.rm=TRUE),
    OutNetGains = sum(ProfitTable$Gains, na.rm=TRUE)/sum(ProfitTable$investbySymbol, na.rm=TRUE)-2*comission,
    OutLong = long,
    OutShort = short,
    OutInvestment = investment,
    OutStoploss = stoploss,
    OutComission = comission,
    OutPenny = penny,
    OutVolume = volume,
    OutNumU = numU,
    OutAccDefn = accDefn )

  return(AggregatesTable)
}

# Sample iteration parameters (these can be vectors of arbitrary length)
# Need to iterate through all possible combinations of these parameters
Param <- list(long=c(.75, 1.5), short=c(-.5, -1), investment=1, stoploss=c(-.015),
              comission=.0002, penny=3, volume=c(.02, .01), numU=2, accDefn=0:1 )
CombParam <- expand.grid(Param)

# Create sample X and Y data frames for function call
Y <- data.frame(SymbolID=10:14,
                OpeningPrice = c(1,3,10,20,60),
                ClosingPrice = c(2,2.5,11,18,61.5),
                YesterdayClose= c(1,3,10,20,60),
                MinTrVol = rep(1000, times=5))
X <- data.frame(SymbolID=10:14,
                weight = c(1, .5, -3, -.75, 2),
                CPweight=c(1.5, .25, -1.75, 2, -1),
                noU = c(2,3,4,2,10))

for (i in 1:length(CombParam$long))
{
  if(i==1)
  { Results <- calcProfit(CombParam[i,], X, Y)
  } else {
    Results <- rbind(Results, calcProfit(CombParam[i,], X, Y))
  }
}

Results2 <- apply(CombParam, 1, calcProfit, X, Y)

--
On Tue, Dec 23, 2008 at 11:15 AM, David Winsemius wrote:

> On Dec 23, 2008, at 10:56 AM, Brigid Mooney wrote:
>
>> Thank you again for your help.
>>
>> snip
>>
>> With the 'apply' call, Results2 is of class list.
>>
>> Results2 <- apply(CombParam, 1, calcProfit, X, Y)
>>
>> How can I convert Results2 from a list to a data frame like Results?
>
> Have you tried as.data.frame() on Results2? Each of its elements should
> have the proper structure.
>
> You no longer have a reproducible example, but see this session clip:
>
> > lairq <- apply(airquality,1, function(x) x )
> > str(lairq)
>  num [1:6, 1:153] 41 190 7.4 67 5 1 36 118 8 72 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : chr [1:6] "Ozone" "Solar.R" "Wind" "Temp" ...
>   ..$ : NULL
> > is.data.frame(lairq)
> [1] FALSE
> > is.data.frame(rbind(lairq))
> [1] FALSE
> > is.data.frame( as.data.frame(lairq) )
> --
> David Winsemius
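Another option for the list-to-data-frame step, as a sketch only: when every list element is a one-row data frame with the same columns, do.call(rbind, ...) stacks them. The function calc_one below is a made-up stand-in for calcProfit (whose full body is not shown in this thread), and the two-parameter grid is a cut-down version of Param.

```r
# Stand-in for calcProfit: returns a one-row data frame per parameter set.
# apply() hands each row over as a named numeric vector, hence p[["long"]].
calc_one <- function(p) {
  data.frame(long = p[["long"]],
             short = p[["short"]],
             spread = p[["long"]] - p[["short"]])
}

CombParam <- expand.grid(list(long = c(0.75, 1.5), short = c(-0.5, -1)))

Results2 <- apply(CombParam, 1, calc_one)   # a list of data frames
Results  <- do.call(rbind, Results2)        # one data frame, one row per element
print(Results)
```

This also avoids growing Results with rbind() inside the 'for' loop, since all rows are combined in a single call.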
Re: [R] How can I avoid nested 'for' loops or quicken the process?
Thank you again for your help.

I updated the parsing at the beginning of the calcProfit function with:

if (class(IterParam) == "numeric")
{
  long <- IterParam["long"]
  short <- IterParam["short"]
  investment <- IterParam["investment"]
  stoploss <- IterParam["stoploss"]
  comission <- IterParam["comission"]
  penny <- IterParam["penny"]
  volume <- IterParam["volume"]
  numU <- IterParam["numU"]
  accDefn <- IterParam["accDefn"]
} else {
  long <- IterParam$long
  short <- IterParam$short
  investment <- IterParam$investment
  stoploss <- IterParam$stoploss
  comission <- IterParam$comission
  penny <- IterParam$penny
  volume <- IterParam$volume
  numU <- IterParam$numU
  accDefn <- IterParam$accDefn
}

This allows everything to process as expected when calling it both in the 'for' loop I showed before and as part of 'apply'.

However, I have one other question. With the 'for' loop, Results is of class data frame.

for (i in 1:length(CombParam$long))
{
  if(i==1)
  { Results <- calcProfit(CombParam[i,], X, Y)
  } else {
    Results <- rbind(Results, calcProfit(CombParam[i,], X, Y))
  }
}

With the 'apply' call, Results2 is of class list.

Results2 <- apply(CombParam, 1, calcProfit, X, Y)

How can I convert Results2 from a list to a data frame like Results?
Re: [R] How can I avoid nested 'for' loops or quicken the process?
<- list("a","b","c")
> > blist <- list("ab","ac","ad")
>
> > expand.grid(alist, blist)
>   Var1 Var2
> 1    a   ab
> 2    b   ab
> 3    c   ab
> 4    a   ac
> 5    b   ac
> 6    c   ac
> 7    a   ad
> 8    b   ad
> 9    c   ad
>
> > apply( expand.grid(alist, blist), 1, function(x) paste(x[1], x[2], sep=""))
> [1] "aab" "bab" "cab" "aac" "bac" "cac" "aad" "bad" "cad"
>
> > clist <- list("AA","BB")
>
> > apply(expand.grid(alist, blist, clist),1,function(x) paste(x[1], x[2], x[3], sep=""))
>  [1] "aabAA" "babAA" "cabAA" "aacAA" "bacAA" "cacAA" "aadAA" "badAA" "cadAA" "aabBB"
> [11] "babBB" "cabBB" "aacBB" "bacBB" "cacBB" "aadBB" "badBB" "cadBB"
>
> > dlist <- list(TRUE,FALSE)
>
> > apply(expand.grid(alist, blist, clist, dlist),1,function(x) paste(x[1], x[2], x[3], (x[4]), sep=""))[8:12]
> [1] "badAATRUE" "cadAATRUE" "aabBBTRUE" "babBBTRUE" "cabBBTRUE"
>
> This could get unwieldy if the lengths of the lists are appreciable, since
> the number of rows will be the product of all the lengths. On the other hand
> you could create a dataframe indexed by the variables in expand.grid's
> output:
>
> > master.df <- data.frame( expand.grid(alist, blist, clist, dlist),
>       results = apply(expand.grid(alist, blist, clist, dlist), 1,
>                       function(x) paste(x[1], x[2], x[3], (x[4]), sep="")))
>
> --
> David Winsemius
>
> On Dec 22, 2008, at 3:33 PM, Charles C. Berry wrote:
>
>> On Mon, 22 Dec 2008, Brigid Mooney wrote:
>>
>>> Hi All,
>>>
>>> I'm still pretty new to using R - and I was hoping I might be able to get
>>> some advice as to how to use 'apply' or a similar function instead of using
>>> nested for loops.
>>
>> Unfortunately, you have given nothing that is reproducible.
>>
>> The details of MyFunction and the exact structure of the list objects are
>> crucial.
>>
>> Check out the _Posting Guide_ for hints on how to formulate a question
>> that will elicit an answer that helps you.
>>
>> HTH,
>>
>> Chuck
>>
>>> Right now I have a script which uses nested for loops similar to this:
>>>
>>> i <- 1
>>> for(a in Alpha) { for (b in Beta) { for (c in Gamma) { for (d in Delta) {
>>> for (e in Epsilon)
>>> {
>>> Output[i] <- MyFunction(X, Y, a, b, c, d, e)
>>> i <- i+1
>>> }
>>>
>>> Where Output[i] is a data frame, X and Y are data frames, and Alpha, Beta,
>>> Gamma, Delta, and Epsilon are all lists, some of which are numeric, some
>>> logical (TRUE/FALSE).
>>>
>>> Any advice on how to implement some sort of solution that might be quicker
>>> than these nested 'for' loops would be greatly appreciated.
>>>
>>> Thanks!
>>
>> Charles C. Berry                            (858) 534-2098
>>                                             Dept of Family/Preventive Medicine
>> E mailto:cbe...@tajo.ucsd.edu              UC San Diego
>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
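The expand.grid idea from this thread, sketched end to end with a function that returns a data frame. MyFunction was never posted, so my_function below is a made-up stand-in, and the three parameters (a, b, flag) merely imitate a mix of numeric and logical lists.

```r
# Stand-in for MyFunction (hypothetical): one-row data frame per combination
my_function <- function(a, b, flag) {
  data.frame(a = a, b = b, flag = flag,
             value = if (flag) a + b else a - b)
}

# Every combination of the parameter values, one row each
grid <- expand.grid(a = c(1, 2), b = c(10, 20), flag = c(TRUE, FALSE))

# One call per grid row; seq_len(nrow(grid)) replaces the manual counter i
out <- lapply(seq_len(nrow(grid)),
              function(i) my_function(grid$a[i], grid$b[i], grid$flag[i]))

result <- do.call(rbind, out)
print(result)
```

Whether this is actually quicker than the nested loops depends mostly on how expensive the function itself is; the main gain is that the bookkeeping for the combinations disappears.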
[R] How can I avoid nested 'for' loops or quicken the process?
Hi All,

I'm still pretty new to using R - and I was hoping I might be able to get some advice as to how to use 'apply' or a similar function instead of using nested for loops.

Right now I have a script which uses nested for loops similar to this:

i <- 1
for(a in Alpha) { for (b in Beta) { for (c in Gamma) { for (d in Delta) {
for (e in Epsilon)
{
  Output[i] <- MyFunction(X, Y, a, b, c, d, e)
  i <- i+1
}
} } } }

Where Output[i] is a data frame, X and Y are data frames, and Alpha, Beta, Gamma, Delta, and Epsilon are all lists, some of which are numeric, some logical (TRUE/FALSE).

Any advice on how to implement some sort of solution that might be quicker than these nested 'for' loops would be greatly appreciated.

Thanks!
[R] formatting print statements with multiple lines
When executing the command:

print(cat(paste("Input criteria does not meet specifications.
Check input against the following requirements:
a >= 0
b <= 0
c >= 0
", sep=""), ""))

I get:

Input criteria does not meet specifications.
Check input against the following requirements:
a >= 0
b <= 0
c >= 0
NULL

(with that extra NULL at the end).

If I omit the 'cat' command, the NULL goes away, but I no longer get the next-line formatting that I want. I think I'm missing something about one of these functions returning a NULL value, but I can't seem to get the exact result I want (no NULL value, and keep the next-line formatting).

Any ideas or suggestions are much appreciated. Thanks!
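A short sketch of one way to keep the line breaks and lose the NULL. cat() writes its arguments and returns NULL invisibly, so wrapping it in print() is what displays the extra NULL; calling cat() on its own avoids that. The variable name msg is just for illustration.

```r
# Build the message with explicit newlines between the pieces
msg <- paste("Input criteria does not meet specifications.",
             "Check input against the following requirements:",
             "a >= 0",
             "b <= 0",
             "c >= 0",
             sep = "\n")

# cat() interprets "\n" as a line break; no print() around it
cat(msg, "\n", sep = "")
```

print() on the string alone shows the escape sequences literally, which is why the formatting disappeared when 'cat' was omitted.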
[R] Avoiding multiple outputs using RODBC package
I am using R as a data manipulation tool for a SQL database. In some of my R scripts I use the RODBC package to retrieve data, then run analysis, and use the sqlSave function in the RODBC package to store the results in a database.

There are two problems I want to avoid, and they are highly related: (1) having R rerun analysis which has already been done and saved into the output database table, and (2) ending up with more than one identical row in my output database table.

The analysis I am running allows the user to input a large number of variables, for example:

date, version, a, b, c, d, e, f, g, ...

After R completes its analysis, I write the results to a database table in the format:

Value, date, version, a, b, c, d, e, f, g, ...

where Value is the result of the R analysis, and the rest of the columns are the criteria that were used to get that value.

Can anyone think of a way to address these problems? The only thing I can think of so far is to run an sqlQuery at the start to get a table of all the variable combinations that are already saved, and then simply avoid computing and re-outputting those results. However, my results database table currently has over 200K rows (and will grow very quickly as I keep going with this project), so I think that would not be the most expeditious answer, as just the SQL query to download 200K rows x 10+ columns is going to be time consuming in and of itself.

I know this is kind of a weird problem, and am open to all sorts of ideas... Thanks!
[R] Applying min to numeric vectors
I was surprised this morning that the min() function does not seem to work as *I* anticipated when given vector arguments. For example:

a <- 1:10
b <- c(rep(1, times=5), rep(10, times=5))

Result:

> min(a,b)
1

What I actually wanted was a term-by-term minimum, i.e.:

ifelse(a<=b, a, b)
1 1 1 1 1 6 7 8 9 10

Am I losing much in terms of computation power if I use the ifelse? I'm a little worried, because in implementation my vectors are quite long, and I will be computing the min of many of them, min(a,b,c,d,e,f), where a through f are all vectors of the same length.

Any insight that can be provided is much appreciated.
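Base R has pmin() for exactly this term-by-term ("parallel") minimum, and it accepts any number of vectors. A small sketch using the vectors from the post, plus a third vector cc that is made up here for illustration:

```r
a <- 1:10
b <- c(rep(1, times = 5), rep(10, times = 5))

# Element-wise minimum of two vectors
ab <- pmin(a, b)
print(ab)

# Works the same way for more than two vectors (cc is a made-up example)
cc <- rep(3, times = 10)
abc <- pmin(a, b, cc)
print(abc)
```

By contrast, min(a, b) pools all the values and returns a single number, which is the behaviour observed above.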
[R] Adding a time difference to a datetime stamp
I am trying to figure out a way to add a certain number of hours to a date/time stamp.

Specifically, I have a string of date/time stamps that all have the time at midnight. I would like to be able to keep the date the same, but add a certain number of hours to create a new timestamp that is a few hours later.

Below is the procedure I have tried so far with no luck:

startDate <- "2008-11-01"
endDate <- "2008-11-05"
OutDates <- seq(as.Date(startDate), as.Date(endDate), by="day")

> OutDates
"2008-11-01" "2008-11-02" "2008-11-03" "2008-11-04" "2008-11-05"

FourOclock <- as.difftime("16:00:00")

> FourOclock
Time difference of 16 hours

Afternoons <- OutDates + FourOclock

> Afternoons
"2008-11-17" "2008-11-18" "2008-11-19" "2008-11-20" "2008-11-21"

This gives the wrong answer, adding 16 days instead of 16 hours, and throws the following warning:

Warning message:
Incompatible methods ("+.Date", "Ops.difftime") for "+"

I am stumped on this one and would appreciate any/all recommendations. Thanks!
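One possible workaround, sketched under the assumption that the time zone does not matter (UTC is used below purely to keep the example deterministic). A Date counts whole days, so it cannot hold a time of day; converting to POSIXct, which counts seconds, lets the 16 hours be added as 16 * 3600 seconds.

```r
startDate <- "2008-11-01"
endDate   <- "2008-11-05"
OutDates  <- seq(as.Date(startDate), as.Date(endDate), by = "day")

# Midnight timestamps; tz = "UTC" is an assumption made for this sketch
midnight <- as.POSIXct(format(OutDates), tz = "UTC")

# POSIXct arithmetic is in seconds
Afternoons <- midnight + 16 * 3600
print(format(Afternoons, "%Y-%m-%d %H:%M"))
```

With a local time zone that observes daylight saving, adding a fixed number of seconds across a changeover day may not land on 16:00 wall-clock time, so that case would need separate care.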
[R] Error in sqlCopy in RODBC
Hi All,

I am trying to copy portions of tables from one SQL database to another, using sqlCopy in the RODBC package.

RemoteChannel = connection to remote database
LocalChannel = connection to local database
LocalTable = table in my local database to receive data from the remote database
query <- select query in SQL

sqlCopy(RemoteChannel, query, "LocalTable", destchannel=LocalChannel, safer=TRUE)

I am currently getting an error:

Error in sqlSave(destchannel, dataset, destination, verbose = verbose, :
  table 'LocalTable' already exists

I need to append the data retrieved through the query to the table LocalTable. It was my understanding that when safer=TRUE, it would append the new data to an existing table, or create a new table otherwise. This error seems to suggest otherwise.

Any ideas? All your help is greatly appreciated!
[R] run time function for R scripts?
Hi All,

I was wondering if there is a function in R that would output the total run time for various scripts.

For now I have the following workaround:

begTime <- Sys.time()
... the rest of the R script...
runTime <- Sys.time()-begTime

Is there another function that I don't know about that would return this information in a more elegant manner?

Thanks!
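Base R's system.time() wraps an expression and reports user, system, and elapsed time in seconds. A small sketch, with a throwaway computation standing in for "the rest of the R script":

```r
# The braces can hold any amount of code, including source("myfile.R")
timing <- system.time({
  x <- sapply(1:1000, function(i) sqrt(i))
})

print(timing)              # user, system, elapsed
print(timing["elapsed"])   # wall-clock seconds only
```

Objects created inside the braces (here x) remain available afterwards, because the expression is evaluated in the calling environment.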
[R] Exclude holidays in a subset of dates?
Hi All,

I am iterating through dated materials, with variable start and end dates, and would like to skip procedures every time I encounter a weekend or holiday. To do this, I thought the easiest way would be to create a TRUE/FALSE vector corresponding to each day, where it is TRUE if a workday and FALSE if a weekend or holiday.

So far I have been able to do this for weekdays:

startDate <- as.Date("2008-08-15")
endDate <- as.Date("2008-09-15")
AllDays <- seq(startDate, endDate, by="day")
WorkDays <- ifelse(as.numeric(format(AllDays, "%w"))%%6==0, FALSE, TRUE)

But I'm a bit lost as to what to do for the holidays; for example, "2008-09-01" is Labor Day in the above range.

Is there some procedure to say if an object is "in" a given list or set? Mathematically, I would want to test:

day \in Holidays

where day is a given day, and Holidays is a set of all holidays.

Is there a way to do this without iteration, since my start/end dates are variable? Or maybe there's a very elegant solution that I don't know about, as I am still new to R.

Thanks for all your help!
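The set-membership test asked about here exists in base R as the %in% operator, which is vectorized, so no iteration is needed. A sketch, assuming a user-supplied Holidays vector (the single Labor Day entry below comes from the post; a real list would be longer):

```r
startDate <- as.Date("2008-08-15")
endDate   <- as.Date("2008-09-15")
AllDays   <- seq(startDate, endDate, by = "day")

# Assumed holiday list; only the date mentioned in the post is included
Holidays <- as.Date(c("2008-09-01"))

# "%w" gives 0 for Sunday through 6 for Saturday, so %% 6 == 0 marks weekends
is_weekend <- as.numeric(format(AllDays, "%w")) %% 6 == 0

# TRUE for workdays: not a weekend and not in the holiday set
WorkDays <- !is_weekend & !(AllDays %in% Holidays)
print(AllDays[WorkDays])
```

Because the whole calculation is driven by AllDays, changing startDate or endDate needs no other edits.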
[R] Create vector of data frames
Hi All,

I'm sorry I haven't been able to find anything that will help me with this problem, and I'm still pretty new to R - so any help here is greatly appreciated!

I am looking to create a vector in a sequential process where each entry in the vector is a data frame, for example:

days <- 3
for (i in 1:days)
{
  FOO[i] <- data.frame(x=c(i, i+1, i+2), y=c(i, i*i, i*i*i))
}

but when I try this, I get the error "object "FOO" not found".

I tried to avoid this by concatenating blank data frames to create a shell for FOO via:

FOO <- rep(data.frame(), times=days)

before the other lines, but then I get lots of errors relating to replacing 0 rows with 3 rows.

Needless to say, the data frames I am actually dealing with are quite a bit larger than listed here - so I wanted to get the process working on a toy example first.

Thanks in advance for your help!
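A sketch of the usual container for this: a list, created up front with vector("list", n) and filled with double brackets. Single brackets (FOO[i]) address a sub-list rather than one element, which is part of why the assignment in the post misbehaves.

```r
days <- 3

# Pre-allocate an empty list with one slot per day
FOO <- vector("list", days)

for (i in seq_len(days)) {
  # [[i]] stores the whole data frame as element i
  FOO[[i]] <- data.frame(x = c(i, i + 1, i + 2),
                         y = c(i, i * i, i * i * i))
}

print(FOO[[2]])
```

Each element can then be used like any data frame, for example FOO[[2]]$y.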
Re: [R] Command line arguments with source() - Windows OS
Is there a better command to use rather than source which would take command arguments?

I ask because I currently have 6 parameters, will likely have additional parameters later, and would like to be able to have default values for each if I do not specify new values.

Thanks so much!

On Mon, Nov 17, 2008 at 9:06 AM, Duncan Murdoch <[EMAIL PROTECTED]> wrote:

> Brigid Mooney wrote:
>
>> Hi Everyone,
>>
>> I am pretty new to R and so far have mostly been using R interactively
>> through the Windows console.
>>
>> I'm starting to write some scripts, and have been executing them using the
>> source() command, i.e. source(myRfile.R).
>>
>> My question is how can I pass command line arguments to R. My file
>> "myRfile.R" has some global variables which I would like to be able to set
>> at run-time, without having to go in and edit the text of the file each
>> time I want to run it.
>>
>> I can't seem to find much information on this topic for Windows and using
>> it with the console - so any help would be greatly appreciated.
>
> This is the same on all platforms, Windows isn't special.
>
> When you use source(), any variables currently defined in the R session
> will be visible to your script. There is no "command line" needed, because
> you're executing the R code in the same session.
>
> So you could do this:
>
> paramValue <- 10
> source("myRfile.R")
>
> paramValue <- 15
> source("myRfile.R")
>
> The quotes are necessary, because source(myRfile.R) would go looking for a
> variable named myRfile.R, rather than using "myRfile.R" as the filename.
>
> Duncan Murdoch
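One way to get default values, as a sketch: put the body of the script inside a function whose arguments carry the defaults, source() the file once to define the function, and then call it with only the values that differ. The function name run_analysis and its three parameters are invented for illustration; the real script has six.

```r
# Hypothetical contents of myRfile.R: the script body wrapped in a function.
# Each argument has a default, so callers override only what they need.
run_analysis <- function(alpha = 0.05, n = 100, label = "default") {
  # ... the real analysis would go here; this just reports the settings
  paste0(label, ": n=", n, ", alpha=", alpha)
}

# After source("myRfile.R") has defined the function:
r1 <- run_analysis()           # all defaults
r2 <- run_analysis(n = 500)    # override one parameter
print(r1)
print(r2)
```

For scripts launched from a shell rather than from the console, commandArgs(trailingOnly = TRUE) returns the arguments given on the command line as a character vector, which could be parsed and passed to such a function.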
[R] Command line arguments with source() - Windows OS
Hi Everyone,

I am pretty new to R and so far have mostly been using R interactively through the Windows console.

I'm starting to write some scripts, and have been executing them using the source() command, i.e. source(myRfile.R).

My question is how can I pass command line arguments to R. My file "myRfile.R" has some global variables which I would like to be able to set at run-time, without having to go in and edit the text of the file each time I want to run it.

I can't seem to find much information on this topic for Windows and using it with the console - so any help would be greatly appreciated.

Thanks so much!
[R] Embed SQL queries in R?
Hi All,

Most of the work I am doing with R uses data which I am pulling from various SQL queries. To streamline the process even more, I was wondering if it is possible to embed SQL queries in R, thereby avoiding the need to first get the data and then move to R to process it.

I haven't found anything out there on this yet, so if you know of a good resource that includes this topic, I would really appreciate it.

Thanks again for all your help!
[R] Unflatten a table in R
Hi All,

I'm pretty new to R, so would really appreciate it if someone could point me in the right direction on this problem. I am trying to "unflatten" a table in R, and can't seem to find a function or method to complete this task (hopefully efficiently).

My data table is full of historical stock-market data. Daily, each ticker symbol has four data points: open, close, high, and low. Right now the table looks like the following. (Note, each ticker symbol has a unique numeric 'SymbolID'.)

   SymbolID MarketDate   Open  Close   High    Low
1         4  11/3/2008 1.5790 1.5788 1.5790 1.5788
2         4  11/4/2008 1.5891 1.5892 1.5892 1.5891
3         4  11/5/2008 1.5937 1.5931 1.5937 1.5931
4         4  11/6/2008 1.5727 1.5727 1.5727 1.5727
5         4  11/7/2008 1.5673 1.5669 1.5673 1.5669
6         5  11/3/2008 0.8433 0.8435 0.8435 0.8433
7         5  11/4/2008 0.8672 0.8672 0.8672 0.8672
8         5  11/5/2008 0.8597 0.8594 0.8597 0.8594
9         5  11/6/2008 0.8412 0.8410 0.8412 0.8410
10        5  11/7/2008 0.8407 0.8409 0.8411 0.8407
...

I'm envisioning a solution with a two-way lookup, something like:

SymbolID  11/3/2008                 11/4/2008  11/5/2008  ...
4         (open, close, high, low)
5
...

where each entry in the table is actually a vector of the four points for that symbol and date,

or even something like:

SymbolID  11/3/2008-open  11/3/2008-close  11/3/2008-high  11/3/2008-low  11/4/2008-open  ...
4
...

where in this case, each entry would just be the appropriate numeric entry from above.

Again, any help or suggestions on this one is greatly appreciated... Thanks!
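The second layout (one column per date and price field) can be produced with base R's reshape() in "wide" direction. A minimal sketch on a cut-down copy of the table, keeping only two dates and two price columns to stay short; the data frame name quotes is made up here.

```r
# Small subset of the data shown in the post
quotes <- data.frame(
  SymbolID   = c(4, 4, 5, 5),
  MarketDate = c("11/3/2008", "11/4/2008", "11/3/2008", "11/4/2008"),
  Open       = c(1.5790, 1.5891, 0.8433, 0.8672),
  Close      = c(1.5788, 1.5892, 0.8435, 0.8672)
)

# One row per SymbolID; the remaining columns are spread out by MarketDate
wide <- reshape(quotes,
                idvar     = "SymbolID",
                timevar   = "MarketDate",
                direction = "wide")
print(wide)
```

The new columns are named by joining the field and the date with a dot, for example Open.11/3/2008, so they need backticks or wide[["Open.11/3/2008"]] when referenced by name.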